What does "scalable database" mean?

Mar 30, 2010

All over the interwebs, people are equating NoSQL with scalability. Yesterday, I wrote a blog post explaining that most NoSQL dbs aren't, in fact, scalable. The general consensus in the comments seemed to be that my article was missing substantiation — that without explaining why the dbs weren't scalable, I wasn't adding anything to the discussion. So, what does ”scalable database“ mean?

There are two kinds of scalability: vertical and horizontal. Vertical scaling is just adding more capacity to a single machine. Virtually every database product is vertically scalable to the extent that they can make good use of more CPU cores[1], RAM, and disk space. With a horizontally scalable system, it's possible to add capacity by adding more machines. By far, most database products are not horizontally scalable.

But, people have been scaling products like MySQL for years, so how'd they do it?

Vertically is one way. 37signals has written about the mammoth (128GB of RAM and 8 x 15,000 RPM SAS drives) machines they run their dbs on — and that was over a year ago. I bet they've at least doubled the capacity in those machines by now.

It's also common to see RDBMS-backed applications run one or more read-slaves. With master-slave replication, it's possible to scale reads horizontally. But, there's a trade-off there. Since most (all?) replication systems are asynchronous, reads from slaves may be somewhat stale. It's not uncommon for replication lag times to be measured in minutes or hours in the event of network partitioning (for example). So, read-sharding trades some consistency (the ”C“ in ACID) for aditional read capacity[2].

When an application needs more write capacity than they can get out of a single machine, they're forced to partition (shard) their data across multiple database servers. This is how companies like facebook and twitter have scaled their MySQL installations to massive proportions. This is the closest you can get to horizontal scalability with most database products.

Sharding is a client-side affair — that is, the database server doesn't do it for you. In this kind of environment, when you access data, the data access layer uses consistent hashing to determine which machine in the cluster a specific piece of data should be written to (or read from). Adding capacity to (or alleviating “hot spots” from) a sharded system is a process of manually rebalancing the data across the cluster. So, while it's possible to add capacity to a mysql-backed application by adding machines, mysql itself is not horizontally scalable.

On the other hand, products like cassandra, riak, and voldemort automatically partition your data. It's possible to add capacity to these systems by simply turning on a new machine and starting the service (actually, I only know this for certain about cassandra, but I'm reasonably certain it's true for the others). The database system itself takes care of rebalancing the data and ensuring that it is sufficiently replicated across the cluster. This is what it means for a database to be horizontally scalable[3].

  • [1] Many databases are not capable of making use of additional CPU cores. Many NoSQL dbs, such as redis, are event-driven and run in a single thread.
  • [2] For the sake of completeness, I am aware of some tools (such as mysql-mmm) that monitor replication lag, and remove overly lagged shards from read rotation to help maintain consistency.
  • [3] Truly distributed dbs like cassandra also provide other goodies like per-query consistency and availability settings. But, that's beyond the scope of this article. For more on that (and many other things on this topic), the dynamo paper is a good place to start.