My Year of Riak
Startups often ask my opinion on databases for their new application. In the past year we've launched a big CouchDB-based application, and we've helped build stylesclub.com, a Riak-based Facebook app (not yet launched). We've also helped launch ming.ly, an Amazon SimpleDB-based application. Each of these has been an opportunity to further develop my philosophy about what makes a good database for a website or mobile app, and when to choose each one. I have a separate series coming on database tradeoffs, but since there isn't much information on Riak out in the wild yet, I will start by sharing my thoughts on it. These are my opinions after a bit more than a year of using it at Inaka.
Riak is a key/value store, and an afternoon spent reading about the behind-the-scenes architecture of Amazon S3 is a helpful primer on how it works. Basho's website (Basho is the company that built Riak) is remarkably opaque about what's so great about it. However, if you've ever had to deal with the hassles of scaling MySQL or other data stores across multiple servers, you'll want to familiarize yourself with Riak. Those who know they need Riak have suffered the pain of scaling other databases, so they are a self-selecting group. Riak hasn't exactly gone mainstream yet, but it's the database all the cool kids are talking about, so it's good to know at least what makes it special.
- Data is stored in buckets of keys, just like Amazon S3
- Keys hold values and values can be any type of data.
- Keys have a content-type which is set when the data is PUT or POSTed into Riak.
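As a rough sketch of what storing and fetching a value over HTTP can look like, the snippet below builds (but does not send) the two requests. The host, port, bucket and key names are made up for illustration, and the `/riak/<bucket>/<key>` URL layout is an assumption about your Riak version; check the HTTP API docs for the release you run.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: one Riak node listening for HTTP on localhost:8098.
RIAK_BASE = "http://127.0.0.1:8098"


def key_url(bucket, key, base=RIAK_BASE):
    """URL of a single key; bucket and key are percent-encoded."""
    return f"{base}/riak/{quote(bucket, safe='')}/{quote(key, safe='')}"


def build_put(bucket, key, body, content_type):
    """Request that stores `body` (bytes) and records its content-type."""
    return Request(
        key_url(bucket, key),
        data=body,
        method="PUT",
        headers={"Content-Type": content_type},
    )


def build_get(bucket, key):
    """Request that fetches the value stored under bucket/key."""
    return Request(key_url(bucket, key), method="GET")
```

Passing either request to `urllib.request.urlopen` would send it to the node. The content-type given on the PUT is what comes back with the value on a later GET.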
Servers crash, hard drives fail
Servers will fail; recovering from failure should be automatic; data should not be lost during a failure. Riak does this by making copies of the data across different nodes. Replicas are made automatically when items are stored in Riak. When a node goes down, the cluster detects it and rebalances the data across the remaining nodes. The brilliance of Riak is that all the hassle of recovering from failure, and of adding new nodes as you need more storage, is absolutely painless. It Just Works. There are tools for adding and removing nodes, and they couldn't be simpler to use.
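To make the replica idea concrete, here is a toy hash ring. It is not Riak's implementation (Riak's ring is divided into partitions, and the details differ); `ToyRing`, the node names and the default of three replicas are all illustrative assumptions. It only shows how a key can map to several nodes, and how the surviving copies stay put when one node disappears.

```python
import hashlib
from bisect import bisect_right


def _hash(value):
    """Map a string to a point on the ring (a big integer)."""
    return int(hashlib.sha1(value.encode("utf-8")).hexdigest(), 16)


class ToyRing:
    """Toy consistent-hash ring: each key lives on `replicas` nodes."""

    def __init__(self, nodes, replicas=3):
        self.replicas = replicas
        self._ring = sorted((_hash(n), n) for n in nodes)

    def nodes_for(self, bucket, key):
        """Walk clockwise from the key's hash and take the next nodes."""
        if not self._ring:
            return []
        points = [point for point, _ in self._ring]
        start = bisect_right(points, _hash(f"{bucket}/{key}"))
        count = min(self.replicas, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]

    def without(self, node):
        """Ring as it looks after `node` has failed or been removed."""
        return ToyRing([n for _, n in self._ring if n != node], self.replicas)
```

If the first owner of a key fails, `ring.without(owner).nodes_for(...)` still starts with the two nodes that already held copies, followed by one new node that has to receive a fresh copy.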
A weed-eater vs a V-8
The minimum recommended configuration is three nodes, and you can add as many as you would like. I've heard of clusters of up to 60 nodes, and I'm sure at this point there are larger ones. Running a single Riak "node" is possible, but it's like running a 1-cylinder four-stroke engine, which would run rough, if at all. The whole design of the cylinder and its valves is to work in concert with 3 or 5 or 7 others, each one at a different point in the power cycle, all together bringing the syncopated low rhythmic rumbling sound that indicates the power underneath the hood. Riak is like a V-8: it's really designed to run as a cluster of nodes. Riak throughput goes up with additional nodes, and we have anecdotal evidence that response time also improves, up to a point, as you add nodes.
Accessing data and client libraries
Riak provides two interfaces for accessing data: HTTP, and Protocol Buffers, a Google-created binary format for structuring data that is more efficient on the wire than HTTP.
Because each node communicates with all of the others, you can ask any one node for data. Some clients don't yet support round-robin requests to multiple nodes, so it's trivial to place an HTTP load balancer in front of the nodes to take care of that for you. Depending on your load balancer, though, this adds a potential single point of failure.
When you want to get data out, you need to query for it. If you know the key, it's easy - just make an HTTP request for the data and you're done. But what about real queries - aggregate data, or a selection of records? For those, there is currently only one way to get data out of Riak, and that's to use Map/Reduce.
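If your client library doesn't rotate across nodes, the idea is simple enough to sketch yourself. The class below is a hypothetical helper, not part of any Riak client: it hands each request to the next node in turn and moves on when a node raises a network-level error (`OSError`, which includes `urllib.error.URLError`).

```python
from itertools import cycle


class RoundRobinNodes:
    """Hypothetical helper: rotate requests across nodes, skipping dead ones."""

    def __init__(self, nodes):
        self._nodes = list(nodes)
        if not self._nodes:
            raise ValueError("at least one node is required")
        self._cycle = cycle(self._nodes)

    def request(self, send):
        """Call send(node) on successive nodes until one succeeds."""
        last_error = None
        for _ in range(len(self._nodes)):
            node = next(self._cycle)
            try:
                return send(node)
            except OSError as exc:  # this node is unreachable, try the next
                last_error = exc
        raise ConnectionError("all nodes failed") from last_error
```

Here `send` is whatever function actually talks to a node, for example one that opens `http://<node>:8098/...`. Doing this in the client avoids the load balancer as a single point of failure, at the cost of every client needing the node list.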
The easiest way to explain Map/Reduce is to say that it's like writing a simple piece of code to query a database, then running that query on all the rows of data, on all the servers where that data lives, and then collating the results. Think of Google's giant search index. It wouldn't be possible to build that index by bringing all the data of the billions of web pages to the server that builds it - the actual work of building the index must be completed close to where the data actually exists.
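Here is the same idea as a tiny, purely local simulation. It does not talk to Riak; the `partitions` dictionary stands in for data living on three different nodes, and the record fields are invented. The point is that the map step runs against each node's own data and only small partial results travel to wherever the reduce step runs.

```python
from collections import Counter
from functools import reduce

# Stand-in for data spread across three nodes (invented sample records).
partitions = {
    "node1": [{"status": "paid"}, {"status": "open"}],
    "node2": [{"status": "paid"}],
    "node3": [{"status": "open"}, {"status": "paid"}, {"status": "void"}],
}


def map_phase(records):
    """Runs where the data lives; emits a small per-node summary."""
    return Counter(record["status"] for record in records)


def reduce_phase(partials):
    """Runs once; merges the per-node summaries into the final answer."""
    return reduce(lambda left, right: left + right, partials, Counter())


partials = [map_phase(records) for records in partitions.values()]
totals = reduce_phase(partials)
```

Each node ships back a handful of counts rather than its full set of records, which is what makes the approach workable when the data is large.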
So what do people actually use Riak for?
See Who is Using Riak for specific companies, but here's my short list:
- For storing web session data that could grow indefinitely. Shopping carts that are always available.
- Storing log data that could grow very large.
- Write-heavy projects.
- Documents where the schema between documents could be different.
- When you absolutely can't have a database with a single point of failure.
- Example: streaming video data to disk for later processing.
- Example: storing streams of sensor data.
When do I want to use something other than Riak?
- If you will be performing SQL-style set operations or your data is relational.
- If you have budget constraints: because data is replicated across nodes for redundancy, disk storage requirements will be high.
- If you don't like running your own servers, as there are not any hosted-Riak services that I would recommend (currently).
- If absolute latency for response times of individual requests is a priority.
- If you need to guarantee that any read of a key will immediately see the latest value, as it can take a while for a write to reach every node that holds the key. There are no transactions in Riak.
If I use Riak, I need to be comfortable with...
- A potential tradeoff -- possible increased development complexity for massively decreased deployment complexity.
- No practical ability to list keys, and therefore no equivalent to "select * from customers". (You can request the keys in a bucket, but it's currently an expensive operation that can block all other activities on the nodes; meaning, don't do it.)
How do I deal with things that need to be atomic like queues and counters?
For every application Inaka has built to date, we use Riak with Redis. For caching data, counters, quick set operations, and anything we would use Memcache for, we use Redis. For all the actual permanent data storage, we use Riak. This often creates a single point of failure at the database level, but we're almost always dealing with other single points of failure, and you can use read-slaves with Redis to eliminate this to some extent. Particularly if you're not using Redis for permanent storage, you can go a long way with two Redis servers and a Riak cluster. They're a great complement to each other.
Basho has demonstrated secondary indices which allow for querying across the database without having to write a Map/Reduce. I believe this will be a significant improvement to the product, though I'm not super convinced they have the right format for the query language yet - it feels a bit clumsy with type definitions in the HTTP query syntax.
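To show what "type definitions in the query syntax" means in practice, here is a small sketch that builds index query paths. It assumes the URL layout `/buckets/<bucket>/index/<field>_bin|_int/<value>[/<end>]`, which may differ from what was demonstrated at the time; the helper names and sample fields are made up.

```python
from urllib.parse import quote


def index_field(name, value):
    """Append the type suffix: _int for integers, _bin for everything else."""
    return f"{name}_int" if isinstance(value, int) else f"{name}_bin"


def index_query_path(bucket, name, value, end=None):
    """Path for an exact-match query, or a range query when `end` is given."""
    parts = ["buckets", bucket, "index", index_field(name, value), str(value)]
    if end is not None:
        parts.append(str(end))
    return "/" + "/".join(quote(part, safe="") for part in parts)
```

For example, `index_query_path("users", "age", 21, 30)` gives `/buckets/users/index/age_int/21/30`. The `_int` and `_bin` suffixes are the type information that ends up embedded in the field name itself.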
Additionally, Basho has promised, down the road, a SQL-like syntax which would make interacting with the database much more powerful for the average developer. The roadmap looks bright and Basho is very responsive to community feature requests.
I talked with Mark Phillips with Basho, and he gave me the short list of 1.0 features coming this fall:
- Secondary Indexing - as referenced above.
- Lager - More traditional, unix-friendly logging
- LevelDB Integration - A Google-created backend that allows for different performance characteristics than the default backend, called Bitcask. One thing I intentionally didn't discuss is the pluggable datastores in Riak, as it's not that important for understanding the basics of the database, but since a new one is a roadmap item I'll just say that it's a great feature - you can use the default or the MySQL backend, or any of a number of stores, even Redis. LevelDB seems to have some important characteristics such as built-in compression, instant snapshotting, and more.
- Riak Pipe - Ability to set up phases of map/reduce jobs in a 'pipeline'. It's in beta and I've played around with it. The easiest way to explain it is that the Basho guys are thinking through how to make Map/Reduce less complex and more powerful, and this is the first step.
- Search Integration - Search was a separate install with a Java dependency. That has been removed and search is a 'first class feature' of Riak now.
- I only recommend Riak to clients that really understand their needs and can confidently walk through the list above. I have mixed feelings about recommending Riak today, because most people don't have the context to make the right decision. It's easy to pick Riak or another NoSQL store because you're worried about scale. But it's way, way more important to worry about those first 100 customers than it is to worry about your first million. Riak is catnip to sufferers of "Premature Database Optimization Syndrome" because it works - it does actually allow near-linear scale, but that's not usually the problem.
- Hosting can be expensive. We've seen poor performance running Amazon small nodes, so we generally recommend running on AWS large boxes. Running three m1.large AWS boxes will cost around $730 USD/month, which means starting out with Riak is a more expensive proposition than with some other databases in a cloud environment, and it is often worth considering dedicated hosting.
- There are some commercial features available as well that I didn't mention. Probably the most important is site-to-site replication. This is not available in the open source version, but I'm assuming it is the major draw for the big enterprise paying customers so far.
- Net/net: I REALLY like Riak when I'm lying in bed at night not worried about a few node failures.
- Overall, if you know you need Riak, it's a joy to use; easy to scale, and it's a powerful tool in your database arsenal.