Azure Cosmos DB– Gaining the Azure Data Engineer Associate Certification

Azure Cosmos DB is a database administration feature. It will automatically administer DBMS updates and patching, storage capacity management, and autoscaling capabilities. From a storage perspective, the scale happens automatically and elastically. Elastically means that when the database needs more storage space, it gets allocated, and when it is no longer needed, it gets deallocated. That is a very cost‐effective approach where only what is needed is provided and paid for. In this scenario, capacity scaling has more to do than increasing the size or number of compute instances the database runs on. It means that the instances can be scaled into different Azure regions, globally. Then, not only do you have built‐in isolated data failover capabilities spread worldwide, you can also get the data as close as possible to those who produce or consume it. The closer the data is to those who require or produce it, the faster the data can be retrieved and manipulated.

Another great feature is the ability to have write capabilities in different regions. In most multi‐instance data service scenarios, there is a single‐source database, often called the single version of the truth. The other instances are copies of that source database, are read only, and get updated on a scheduled basis. However, Azure Cosmos DB will support multiregion writes and will manage the synchronization of data among all the instances.

If Azure Cosmos DB is a data administration tool, where is the data stored? When you provision an Azure Cosmos DB account, the first step is to select which data storage service API you require. The list of possible APIs is shown in Figure 1.23.

FIGURE 1.23 Azure Cosmos DB and supported APIs

See Table 1.7 for a quick overview of the APIs.

TABLE 1.7 Azure Cosmos DB APIs

APIData modelContainer item
Core (SQL)Document databaseDocument
Azure Cosmos DB API for MongoDBDocument databaseDocument
CassandraRelations (Apache/CQL)Row
TableKey/value pairItem
GremlinGraph databaseNode or edge

Core (SQL)

Choose this native API if you need to work with documents, especially when the documents are JSON. Using the familiar SQL query language supports searching the content within the document itself. Additionally, there is a very feature‐rich Azure Cosmos DB SDK that lets you interface and query the data source using .NET, Python, JavaScript, and Java.

Azure Cosmos DB API for MongoDB

If you already have an existing on‐premises version of MongoDB, then choose this API when moving the workload to Azure. It uses the SQL query language and the Azure Cosmos DB SDK, which supports several programming languages, such as Rust, Golang, Xamarin, .NET, and Python.

Azure Cosmos DB is a database administration feature. It will automatically administer DBMS updates and patching, storage capacity management, and autoscaling capabilities. From a storage perspective, the scale happens automatically and elastically. Elastically means that when the database needs more storage space, it gets allocated, and when it is no longer needed, it gets deallocated.…

Leave a Reply

Your email address will not be published. Required fields are marked *