(Not) failing with Node.js
Some time ago, I was given a rather interesting job: to mix old and new, PostgreSQL and Node.js, into a backend for an app, providing a RESTful API. Being a 100% Rails developer, I felt lost without a beloved ORM between the hostile SQL wasteland and my nice, tidy code. Since I had never used an ORM in Node, I did a quick Google search and found a promising one named Sequelize. I quickly installed it, tried it, and found that it worked as expected. Not as cute, not as easy as ActiveRecord, but it did its job, namely isolating me from the burden of SQL generation.
Fast-forward a few weeks and we were production-ready. It was load testing time. The client's requirement was 1,250 concurrent requests. I was using Node, so of course I felt confident. Node can handle thousands of connections per second, as that first demo taught us all, so surely it would handle a puny 1,250 concurrent requests.
In our first load test we managed to handle the incredible amount of 150 concurrent requests per core. Not 1,000, not 500, but only 150, with the CPU begging for mercy at a constant 100% load and the database server handling its share perfectly. Something was wrong. We were allowed to use more than one core, but even with that trick we would need more than eight cores to handle the required number of users. Not acceptable.
We then did some profiling, with the help of Node's --prof option, perf, and Node.js flame graphs for the profiling part, and locust.io and ab for the load testing part. The profiling results gave us some clues. The problem was clearly in our code, as the app server's CPU was constantly at 100% while the database server sat idle. We also found that the culprit was Sequelize: high-level, but also quite unoptimized.
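For reference, the workflow looked roughly like this (app.js, the port, and the endpoint are placeholders for our actual entry point and routes):

```shell
# Run the app with V8's sampling profiler enabled; this writes an
# isolate-*-v8.log file in the working directory.
node --prof app.js

# In another terminal, generate load against the running server.
ab -n 10000 -c 150 http://localhost:3000/api/users

# Post-process the V8 log into a human-readable profile summary,
# showing where CPU time was actually spent.
node --prof-process isolate-0x*-v8.log > profile.txt
```

The ticks summary in profile.txt is what pointed us at the ORM layer rather than our own handlers.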
Looking for alternatives, we found an ORM known for its speed, although not as high-level, named node-orm2. We started load testing by disabling all endpoints, converting them to node-orm2, and re-enabling them one by one. The picture looked promising: we got a 100% speed-up, and we were happy as clams. Then, with our confidence at unprecedented levels, we decided to switch fully to node-orm2. And we were smacked to the ground when we found that, with all endpoints converted, we got a speed-up of 5% at best.
Time to switch again. Tired of ORMs that promised everything and delivered nothing, we moved to bare node-postgres, a package that provides only database connectivity. On top of it we used the repository pattern, a layer designed to keep querying out of the application code. We moved all the data-querying code into a group of prototypes, one per table, each containing the functions needed to fetch its data. Those functions return only plain data, as hashes.
Returning only plain data was a great improvement in itself, as we discovered that objects carrying many functions were quite expensive for Node to handle. This is an inherent advantage of the repository pattern.
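A minimal sketch of what one of those repositories looked like. The table and column names here are hypothetical, and the pool is injected so that in production it can be a node-postgres pool while tests can pass a stub:

```javascript
// Repository for a hypothetical "users" table. The injected pool only
// needs a query(sql, params, callback) method, which matches the
// callback-style API node-postgres exposed at the time.
function UserRepository(pool) {
  this.pool = pool;
}

// Fetch one user by id. Returns a plain object (or null), never a
// model instance loaded with methods.
UserRepository.prototype.findById = function (id, callback) {
  this.pool.query(
    'SELECT id, name, email FROM users WHERE id = $1', [id],
    function (err, result) {
      if (err) return callback(err);
      callback(null, result.rows[0] || null);
    }
  );
};

// Fetch all users as an array of plain objects.
UserRepository.prototype.findAll = function (callback) {
  this.pool.query(
    'SELECT id, name, email FROM users', [],
    function (err, result) {
      if (err) return callback(err);
      callback(null, result.rows);
    }
  );
};
```

Since the repository only ever hands back rows as plain hashes, nothing downstream pays the cost of heavyweight model objects.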
After that, we found we could handle the impressive amount of 350 concurrent requests per core. It was not awesome, but at least it was a 2.3× improvement (133%) over the original 150. It also meant we could reach the required 1,250 concurrent requests with a four-core machine.
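The capacity arithmetic behind that claim is simple:

```javascript
// Back-of-the-envelope capacity check: what one core handled in our
// load tests, versus the client's requirement.
const perCore = 350;
const required = 1250;

// Cores needed, rounding up since you can't buy a fraction of a core.
const coresNeeded = Math.ceil(required / perCore);
console.log(coresNeeded); // 4
console.log(perCore * coresNeeded); // 1400, comfortably above 1250
```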
All of this taught us three lessons. The first is an old one that people keep rediscovering: Node.js is not a silver bullet. It's fast in some scenarios, but it's not always fast. The second: don't get over-excited by a partial result. When we partially switched to node-orm2 we saw a giant improvement, only to be hugely disappointed once we finished the switch. And the last, which I think is the most important: never, ever use an ORM that isn't well tested, well optimized, and backed by plenty of benchmarks and real-world use. ORMs are nice to have, but they are also a resource hog when they are not truly optimized. It's better to have no ORM at all than to have a bad one.