How We Shaved 200+ Milliseconds Off Our Response Times

By Spicer Matthews

At Cloudmanic Labs we have built a variety of software applications: some for customers, others for internal use; some brand-new and bleeding-edge, others more mature. Over the past decade, scaling our applications was never much of an issue. Our growth was fairly linear, and we always ran everything on the same collection of servers. To handle growth we simply added another server or upgraded the specs of the servers we already had.

But over time, colocating newer and older applications on the same servers created a major problem. Before rolling out new runtimes or libraries (such as upgrading Node.js or PHP to the latest version), we had to make sure our older software was still compatible. That meant we could not ship new software as quickly as we wanted, for fear of breaking our older products.

About two months ago we set out to solve this problem. Before making any changes, we profiled our applications. In the past they were all similar in terms of memory, CPU, disk I/O, and network usage: serve up some HTML, make some calls to the database, support some Ajax requests. No more. Our new world is real-time and mobile, and those workloads stress our systems in very different ways. Profiling made it clear that each application has unique requirements and belongs on its own collection of servers. Managing multiple server configurations is a small price to pay as we mature as a company.
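
To give a sense of what that profiling looked like, here is a minimal sketch of a per-application resource snapshot. It is an illustration rather than the exact tooling we used: it assumes Linux, Python's psutil library, and placeholder process names for each application, and it leaves out per-process network usage, which requires a different tool.

    #!/usr/bin/env python3
    # Rough per-application resource snapshot: CPU, resident memory, and
    # cumulative disk I/O, grouped by process name. Assumes Linux and the
    # psutil library (pip install psutil); the process names below are
    # placeholders, not our real application names.
    import time
    from collections import defaultdict

    import psutil

    APP_NAMES = {"node", "php-fpm", "nginx"}  # hypothetical process names
    SAMPLE_SECONDS = 5

    def snapshot():
        procs = [p for p in psutil.process_iter(["name"])
                 if p.info["name"] in APP_NAMES]
        # Prime the CPU counters, wait, then read them so cpu_percent()
        # reflects usage over the sampling window.
        for p in procs:
            try:
                p.cpu_percent(None)
            except psutil.NoSuchProcess:
                pass
        time.sleep(SAMPLE_SECONDS)

        totals = defaultdict(lambda: {"cpu": 0.0, "rss_mb": 0.0,
                                      "read_mb": 0.0, "write_mb": 0.0})
        for p in procs:
            try:
                name = p.info["name"]
                totals[name]["cpu"] += p.cpu_percent(None)
                totals[name]["rss_mb"] += p.memory_info().rss / 1e6
                # io_counters() is cumulative since process start, not
                # per-interval; still useful for spotting I/O-heavy apps.
                io = p.io_counters()
                totals[name]["read_mb"] += io.read_bytes / 1e6
                totals[name]["write_mb"] += io.write_bytes / 1e6
            except (psutil.NoSuchProcess, psutil.AccessDenied):
                continue
        return totals

    if __name__ == "__main__":
        for name, stats in snapshot().items():
            print(f"{name:10s} cpu={stats['cpu']:6.1f}%  "
                  f"rss={stats['rss_mb']:8.1f}MB  "
                  f"read={stats['read_mb']:8.1f}MB  "
                  f"write={stats['write_mb']:8.1f}MB")

Even a crude snapshot like this, taken during peak traffic, is enough to show that a real-time, mobile-heavy application and an older HTML-and-database application have very different resource profiles.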