Scaling a database is no small feat. It might seem to be under control after hours of fighting, only to fall apart again the next moment. Besides the infrastructure challenges, there are never-ending data inconsistency sagas. In a nutshell, you are better off not doing it yourself. But then who is going to deliver performance, scale, and high throughput? I might not have had an answer two years ago, but I have one now: Cosmos DB. It's the best thing ever to happen in the world of distributed computing.
This is how Microsoft defines it - Azure Cosmos DB was built from the ground up with global distribution and horizontal scale at its core. It offers turnkey global distribution across any number of Azure regions by transparently scaling and replicating your data wherever your users are. Elastically scale throughput and storage worldwide, and pay only for what you need. Azure Cosmos DB provides native support for NoSQL choices, offers multiple well-defined consistency models, guarantees single-digit-millisecond latencies at the 99th percentile, and guarantees high availability with multi-homing capabilities and low latencies anywhere in the world—all backed by industry-leading, comprehensive service level agreements (SLAs), something no other database service can offer.
It all started in 2010 as "Project Florence" and, later in 2014, got merged with DocumentDB. It's designed to allow customers to elastically and independently scale throughput and storage across any number of geographical regions. It's the first and only globally distributed database service to offer guaranteed low latency at the 99th percentile and 99.99% high availability. Dr. Leslie Lamport, Turing Award winner and world-renowned computer scientist, has profoundly influenced many large-scale distributed systems, and Azure Cosmos DB is no exception. You can listen to his thoughts here.
The system was designed with the following goals:
1. Enable customers to elastically scale throughput and storage on demand, globally
2. Enable customers to build highly responsive and mission-critical applications
3. Ensure that the system is "always available"
4. Provide developers with well-defined consistency models
5. Offer comprehensive SLAs
6. Provide a schema-free, auto-indexed and versioned database
7. Natively support multiple data models and the APIs for accessing them
8. Provide all of this at a very low cost
Expanding on multi-model support:
First, getting the terminology right:
Cosmos DB uses containers to store data, and each API has a different name for them. The SQL and MongoDB APIs call a container a Collection, the Gremlin API calls it a Graph, and the Cassandra and Table APIs call it a Table.
- The SQL API is meant for storing and querying JSON documents.
- The Table API is a port of Azure Table storage to Cosmos DB: the same features, plus new premium platform capabilities.
- The Cassandra and MongoDB APIs are Cosmos DB's SaaS offerings for those two databases.
- The Gremlin API provides a fully managed, horizontally scalable graph database service.
Irrespective of which API you use to store your data, under the hood Cosmos DB stores it in ARS (atom-record-sequence) format.
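To keep the naming straight, here is a minimal sketch that encodes the terminology above as a lookup table. The names `CONTAINER_TERM` and `container_term` are made up for illustration and are not part of any Cosmos DB SDK.

```python
# Container terminology per API, exactly as listed in the text above.
# CONTAINER_TERM and container_term are hypothetical helper names.
CONTAINER_TERM = {
    "sql": "Collection",
    "mongodb": "Collection",
    "gremlin": "Graph",
    "cassandra": "Table",
    "table": "Table",
}


def container_term(api: str) -> str:
    """Return the name a given API uses for a Cosmos DB container."""
    key = api.strip().lower()
    if key not in CONTAINER_TERM:
        raise ValueError(f"unknown API: {api!r}")
    return CONTAINER_TERM[key]


if __name__ == "__main__":
    for api in ("SQL", "Gremlin", "Cassandra"):
        print(f"{api}: {container_term(api)}")
```

Whatever the term, it is the same underlying container; only the vocabulary changes with the API.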
Throughput and Operations:
- A read of a 1 KB item costs 1 RU, and a write of an indexed item of the same size costs around 5-6 RU.
- For a typical 1 KB item, Cosmos DB guarantees end-to-end latency under 10 ms for reads and under 15 ms for indexed writes at the 99th percentile, within the same Azure region. The median latencies are significantly lower (under 5 ms).
- Currently, the maximum document size is 2 MB.
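The numbers above are enough for some back-of-the-envelope capacity math. The sketch below is only a rough estimate for 1 KB items. It assumes 5.5 RU per write (the midpoint of the 5-6 RU range) and treats 2 MB as 2 * 1024 * 1024 bytes of UTF-8 JSON; both are my assumptions, and the function names are hypothetical.

```python
import json

READ_RU_PER_1KB_ITEM = 1.0    # from the text: a 1 KB read costs 1 RU
WRITE_RU_PER_1KB_ITEM = 5.5   # assumption: midpoint of the 5-6 RU range
MAX_DOCUMENT_BYTES = 2 * 1024 * 1024  # assumption: 2 MB counted as 2 MiB


def estimate_ru_per_second(reads_per_sec: float, writes_per_sec: float) -> float:
    """Rough RU/s needed for a workload made of 1 KB reads and indexed writes."""
    return (reads_per_sec * READ_RU_PER_1KB_ITEM
            + writes_per_sec * WRITE_RU_PER_1KB_ITEM)


def fits_document_limit(document: dict) -> bool:
    """Check the JSON-encoded size of a document against the 2 MB limit."""
    encoded = json.dumps(document).encode("utf-8")
    return len(encoded) <= MAX_DOCUMENT_BYTES


if __name__ == "__main__":
    # 500 reads/s and 100 writes/s of 1 KB items
    print(estimate_ru_per_second(500, 100))   # 1050.0
    print(fits_document_limit({"id": "1"}))   # True
```

Real charges depend on item size, indexing, and query shape, so treat the result as a starting point rather than a bill.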
What is an RU (Request Unit)?
An RU is a unit for measuring throughput; in other words, it's the currency of Cosmos DB. When working with Cosmos DB, you are not required to reserve CPU, memory, or IOPS, so how can the service charge you? Also, remember that not all requests are the same, so a uniform measure is required. This is where the RU comes in. You can use this estimator for your capacity planning and this RU calculator to calculate your bill. Each response has a charge field that tells you how many RUs the request cost. Any requests going beyond your reserved capacity will be throttled.
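To make the idea of reserved capacity and throttling concrete, here is a toy model of a single second of throughput. It is a deliberately simplified sketch, not how the service actually does its accounting; `RuBudget` and `try_request` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class RuBudget:
    """Toy model of one second of reserved throughput (hypothetical class)."""
    provisioned_ru: float
    consumed_ru: float = 0.0

    def try_request(self, charge: float) -> bool:
        """Accept the request if its charge fits the remaining budget."""
        if self.consumed_ru + charge > self.provisioned_ru:
            return False  # over budget: this request would be throttled
        self.consumed_ru += charge
        return True


if __name__ == "__main__":
    budget = RuBudget(provisioned_ru=10)
    charges = [1, 5, 5, 1]  # e.g. read, write, write, read
    print([budget.try_request(c) for c in charges])  # [True, True, False, True]
    print(budget.consumed_ru)                        # 7.0
```

The second write is rejected because it would push consumption to 11 RU against a budget of 10, while the cheaper read that follows still fits.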
The Cosmos DB SDK is a sophisticated piece of software craftsmanship. It has built-in retry logic, region failure detection and redirection, query optimization, and more.
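The SDK handles retries for you, but it helps to see the general shape of the idea. Below is a minimal sketch of retry-on-throttle, not the SDK's actual implementation. `ThrottledError`, `call_with_retry`, and `flaky_read` are hypothetical names, and the wait time is assumed to come with the throttled response.

```python
import time


class ThrottledError(Exception):
    """Hypothetical error standing in for a throttled (rate-limited) response."""

    def __init__(self, retry_after_s: float):
        super().__init__(f"throttled, retry after {retry_after_s}s")
        self.retry_after_s = retry_after_s


def call_with_retry(operation, max_attempts=5, sleep=time.sleep):
    """Run operation(); on ThrottledError wait as instructed and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ThrottledError as err:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            sleep(err.retry_after_s)


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky_read():
        # Fails twice with a throttle, then succeeds.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise ThrottledError(retry_after_s=0.01)
        return {"id": "1"}

    print(call_with_retry(flaky_read))  # {'id': '1'}
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.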
All the data within an Azure Cosmos DB container (e.g. a collection, table, or graph) is horizontally partitioned and transparently managed by resource partitions, as shown in the graphic below.
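The text does not spell out how items are assigned to partitions, so the following is only a generic sketch of horizontal partitioning: hash a key and take it modulo the partition count. The SHA-256 hash, the fixed partition count, and the names `partition_for` and `distribute` are assumptions for illustration, not Cosmos DB's actual algorithm.

```python
import hashlib


def partition_for(partition_key: str, partition_count: int) -> int:
    """Map a key to a partition index using a stable hash (illustrative only)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def distribute(keys, partition_count):
    """Group keys by the partition they hash to."""
    partitions = {i: [] for i in range(partition_count)}
    for key in keys:
        partitions[partition_for(key, partition_count)].append(key)
    return partitions


if __name__ == "__main__":
    placement = distribute([f"user-{n}" for n in range(10)], partition_count=4)
    for index, keys in placement.items():
        print(index, keys)
```

The point is that the same key always lands on the same partition, so the caller never has to track where the data lives.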