Previously, Science and wayneseguin published a study looking at the performance of nginx fair proxy. To take that a little further, Science conducted an examination of how Thin and Mongrel compare head-to-head on performance. For kicks, we also took a look at Rails' page template caching facility to see if it significantly impacts performance (it does). Full details follow.
For an idea of what the hardware testing setup looked like, read the previous study (cited above). For these tests, we used three instances of Thin or Mongrel (with lots of free RAM). nginx fair proxy was turned on. Rails caching was fully enabled (including template caching). The tests all ran for 300 seconds. We pulled HEAD requests only, to minimize over-the-wire throughput variance (since the tests were initiated from outside the data center). The Thins were wired up to Unix sockets and the Mongrels were going over IP.
Overall, I’d say that under these testing conditions Thin is about 4% faster than Mongrel. That’s not much, and it’s within the standard deviation of each test result, but it was pretty consistent throughout the testing, so I’m inclined to believe it. Your results may vary. [1]
Here is the summary of results:
| Server type | Avg response (s) | Total pages | 90th percentile response (s) |
|---|---|---|---|
| Thins, 10 threads | 1.72 | 1734 | 3.92 |
| Mongrels, 10 threads | 1.78 | 1677 | 3.31 |
| Thins, 30 threads | 5.08 | 1738 | 10.14 |
| Mongrels, 30 threads | 5.20 | 1709 | 10.86 |
| Thins, 40 threads | 6.69 | 1753 | 10.97 |
| Mongrels, 40 threads | 6.98 | 1685 | 13.39 |
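
As a rough cross-check of the 4% figure, the relative gaps can be recomputed from the summary table alone. The sketch below is not part of the original test harness; the constant and method names are hypothetical, and it uses only the averages and page counts listed above (not the full results or their standard deviations).

```ruby
# Summary figures copied from the table above.
# Each entry: [average response in seconds, total pages served in 300 s].
RESULTS = {
  10 => { thin: [1.72, 1734], mongrel: [1.78, 1677] },
  30 => { thin: [5.08, 1738], mongrel: [5.20, 1709] },
  40 => { thin: [6.69, 1753], mongrel: [6.98, 1685] },
}

# Percentage by which Thin's average response time is lower than Mongrel's.
def response_gain(row)
  thin, mongrel = row[:thin][0], row[:mongrel][0]
  ((mongrel - thin) / mongrel * 100).round(1)
end

# Percentage by which Thin served more pages than Mongrel.
def pages_gain(row)
  thin, mongrel = row[:thin][1], row[:mongrel][1]
  ((thin - mongrel).to_f / mongrel * 100).round(1)
end

RESULTS.each do |threads, row|
  puts "#{threads} threads: response #{response_gain(row)}% lower, " \
       "pages #{pages_gain(row)}% higher"
end
```

With these numbers the per-row gaps land between roughly 2% and 4%, with the 40-thread row closest to 4%.
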
The full test results (including standard deviations) can be found here. We hope the provided measurements meet your requirements. Post a comment here if you’d like more information or background.
End Notes
- While I thought that much of the performance improvement could be attributed to the Unix sockets themselves, many knowledgeable folks, including Zed (author of Mongrel) and Marc (author of Thin), assert that the performance difference between IP and Unix sockets is marginal in this day and age. Both Zed and Marc have indicated that any performance differences are probably due to code and architecture differences in the app servers themselves.
- You may also find this study of the performance of various Ruby app servers interesting: http://wiki.codemongers.com/Main
- Science also has a writeup on optimizing Nginx and Rails page caching which may be of interest to readers.
Methodology Addendum
- Zed Shaw requested a methodology review. The following bullets outline how I conducted all the tests.
- I used Jakarta JMeter 2.3.1 to run the test.
- I pulled HEAD requests to minimize measurement errors that might be caused by over-the-wire variations in bandwidth.
- Rails code path:
  - I ran against two fairly distinct code paths and had several hundred URL variations that hit all over the database within those two code paths (one code path searched for a set of records within a US state, the other searched for a specific record and displayed details about it).
  - Rails code used was production scale and quality. By this I mean it is code that runs a full-fledged webserver and it does a lot of work. It may not be the smartest or fastest code, but it provides a lot of user functionality that is probably pretty typical for “read mostly” Rails sites.
  - All activity was read-only: no inserts or updates were performed. Simulated traffic was typical for this website.
- I had a warmup period before each test, to make sure that all core code was cached before running the actual test. Warmup was generally around 100 seconds. I was not precise on this though.
- Tests all ran for 300 seconds. There was a ramp-up period (to go from 0 threads running to all threads running) on each test that was equal in seconds to half the number of threads, so a 10-thread test had a 5-second ramp-up and a 30-thread test had a 15-second ramp-up.
- Testing was run from a single dev workstation located on a 6 Mbps ADSL line (uplink is usually around 600 kbps effective).
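
The ramp-up rule and the 300-second duration above can be turned into a quick sanity check on the summary table. This is a minimal sketch, assuming the JMeter threads issued requests back to back with no think time (the write-up does not say either way); the constant and method names are hypothetical. It applies Little's Law (throughput = concurrency / average response time) to estimate how many pages a run should have served.

```ruby
TEST_DURATION = 300.0 # seconds, as stated in the methodology

# Ramp-up rule used in the tests: half the thread count, in seconds.
def ramp_up_seconds(threads)
  threads / 2.0
end

# Little's Law estimate of pages served over the whole test.
# Ignores ramp-up and client-side overhead, so it is only an approximation.
def expected_pages(threads, avg_response)
  (threads / avg_response * TEST_DURATION).round
end

puts ramp_up_seconds(10)      # ramp-up for the 10-thread test
puts expected_pages(10, 1.72) # estimate for Thins, 10 threads
puts expected_pages(40, 6.98) # estimate for Mongrels, 40 threads
```

For every row in the table the estimate comes out a little above the measured total (about 1744 against 1734 for Thins at 10 threads, for example), which is the direction you would expect when the ramp-up is ignored. It also suggests that throughput stayed nearly flat while response times grew with the thread count, though that reading depends on the no-think-time assumption.
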
4% is good, but it’s also not much. I guess we’ll stick with mongrel_cluster and look for more fundamental optimizations in our code, queries, and database setup rather than undergoing the configuration pain of switching over…
It would be nice to have numbers about RAM usage.
I realized recently that the 4% can be entirely explained by the difference between Unix and IP sockets. There is another study proving this out with nginx + mongrel vs. fastcgi. Basically, Unix sockets are faster, but at the expense of potential horizontal scaling, which Mongrels allow because they can be on different machines than nginx. I looked at the Mongrel source, and it is pretty nicely optimized. It starts with an HTTP parser and then drops control into a Ruby thread pool. Using a Mongrel::HttpHandler subclass and setting the thread pool to 1000, you can get really high concurrency. Rails locks each Mongrel, however, so for that configuration the number of concurrent users is 1:1 to the number of Mongrels.
RE: Mongrel optimization: I meant to say the HTTP parser is written in tight C.
m++: I have heard some credible people say that it’s unwise to let nginx proxy across multiple machines. It’s better if possible to have nginx serve only mongrels on the OS where it’s running and to put a set of load balancers in front of your cluster of web boxes running nginx/mongrel/rails to smooth things out. The downside of this is that you can get somewhat uneven load across your servers in some circumstances.
I’d be interested to know the reasoning behind that. Ezra Zygmuntowicz of Engine Yard has been advocating separated nginx/mongrel machines for the benefit of identifying individual performance problems (I believe that’s the idea, see http://www.slideshare.net/vishnu/xen-and-the-art-of-rails-deployment/ page 32 and 35). However, I can also imagine that something like ultramonkey would be very good at load balancing requests, which nginx isn’t specifically made for, so you could put that in front of a bunch of nginx/mongrel/rails boxes and the relative computing overhead of nginx to a cluster of local mongrels would be negligible, no? I have always thought this, but only heard that it was a better practice to separate them. Could someone elaborate?
Woody: Ezra is one of my main sources on the idea that nginx should not proxy across multiple machines. Their model is to put a set of Linux-based load balancers in front of a bunch of machines all running identical stacks of nginx/mongrel/rails.
The problem with this model, as best I can see it, is that you have to have either an event-driven queue on the front-end LBs or identical performance/power on the nginx boxes and a very consistent response time on your various web pages. If one nginx box falls behind, it will be loaded up with additional requests that it will have trouble clearing in a timely manner, while other nginx boxes may actually have free cycles.
I hope this helps a bit. - S
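
To make the uneven-load concern concrete, here is a toy simulation. It is only a sketch under stated assumptions: the capacities, arrival rate, and method name are all hypothetical, and it models neither nginx nor any real load balancer. It compares a front end that dispatches blindly in round-robin order with one that knows each box's outstanding backlog.

```ruby
# Simulate `ticks` time steps. On each tick, `arrivals` requests are
# dispatched one at a time to the box index returned by the block, then
# every box completes up to its capacity. Returns the final backlog per box.
def simulate(capacities, arrivals:, ticks:)
  backlog = Array.new(capacities.size, 0)
  sent = 0
  ticks.times do
    arrivals.times do
      target = yield(backlog, sent)
      backlog[target] += 1
      sent += 1
    end
    capacities.each_with_index do |cap, i|
      backlog[i] = [backlog[i] - cap, 0].max
    end
  end
  backlog
end

# Requests each box can finish per tick; the third box is the slow one.
# These values are made up for illustration.
CAPACITIES = [4, 4, 2]

# Blind round-robin: every box gets the same share regardless of speed.
round_robin = simulate(CAPACITIES, arrivals: 9, ticks: 100) do |backlog, sent|
  sent % backlog.size
end

# Backlog-aware dispatch: each request goes to the least-loaded box.
least_backlog = simulate(CAPACITIES, arrivals: 9, ticks: 100) do |backlog, _sent|
  backlog.index(backlog.min)
end

puts "round robin:   #{round_robin.inspect}"
puts "least backlog: #{least_backlog.inspect}"
```

In this toy model the cluster has enough total capacity (10 per tick against 9 arrivals), yet under round-robin the slow box keeps accumulating requests while the fast boxes sit partly idle. The backlog-aware dispatcher keeps every queue short, which is the role the event-driven queue would play in the setup described above.
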