The Ruby on Rails story is usually presented to the new developer as a wonderful break from tradition that makes a developer’s life so much better than the frameworks of the past. The clattering of skeletons in the closet you’re hearing? Well, that’s because it makes the sysadmin’s life much worse than PHP or Java do. That just improved on Friday, with the release of mod_rails. If you’re looking for a way to do shared (or low-traffic) hosting of Rails applications, this is for you.
With Java there’s this alien environment of CLASSPATHs and WARs and JARs and heap size limits, but once you get it up and running, developers can bundle libraries with their application or drop them into the lib/ directory of the J2EE server, and the sysadmin doesn’t have to care. A Java developer is unlikely to ask you to build and install a pile of custom libraries.
With PHP it’s just another Apache module, but you might need to build a few extra libraries and maybe custom-compile Apache. Once you get it up and running, though, you don’t even need to restart the server when you deploy new code. It’s automatically updated.
With Ruby on Rails, it has been far uglier, especially as you go further back. The standard “Matz Ruby Interpreter” (MRI) doesn’t thread well and is quite remarkably slow, and Ruby + Rails in an MRI process use a lot lot lot of memory. So you don’t really want RoR running inside each Apache process. Folks used to use FastCGI (which should have died over a decade ago, but lingers on like a bad cold) but now use Mongrel, which is conceptually kind of like FastCGI, except that it actually works. Mongrel presents the application via HTTP, which is much easier to understand and integrate with other parts of your architecture (such as a load balancer) than FastCGI.
Whereas in J2EE you’d run one big honkin’ JVM that used lots of memory to load up your code and data structures but then ran many threads inside that one process, the limitations of MRI (green threads, plus many, many trips into non-thread-safe C code that requires a “giant lock” which essentially makes it single-threaded) mean that with Rails you run one process per concurrent request. That’s like Apache+PHP or OpenSSH or many other Unix programs that fork, right? Well, sort of. The problem is that the kernel doesn’t see your Ruby code as something all those forked processes can share; it sees the parsed Ruby code as data, and when MRI’s garbage collector marks all those objects during a collection, it dirties those pages, differently in each forked process, so copy-on-write sharing is lost. So not only do you need 30-70MB or more per process, but very little of that is shared between processes. Ouch!
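If you’re curious to see this on a Linux box of your own, /proc can show how little of a running Rails process is actually shared. A rough sketch; substitute a real PID, and note that the smaps file needs a reasonably recent 2.6 kernel:
PID=12345   # the PID of one of your long-running Rails (or Mongrel) processes
grep -E '^(Shared|Private)_' /proc/$PID/smaps \
  | awk '{sum[$1] += $2} END { for (k in sum) printf "%s %d kB\n", k, sum[k] }'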
A second problem is that these processes take a while to start up and load the code, so it’s not reasonable to embed the Ruby interpreter in Apache when using Rails; the overhead is just too high. So the Mongrel solution is to pre-launch a bunch of interpreters, and have them just sit there until requests arrive. That’s pretty inefficient from a memory standpoint, but the latency when a request comes in is quite low since there is no initialization needed.
There have been a few interesting alternatives under development: JRuby is very promising, because it reuses all of the investment in VM development that Sun made over the last 10+ years for Java. At this point the JVM is pretty darn good at running many threads across multiple CPU cores, and at garbage collecting efficiently, among other things. These are key weaknesses of MRI, so running Rails on JRuby seems like a huge benefit. I haven’t tried it yet but I suspect that this will become one of the 2 or 3 most common ways to run Rails applications in the near future.
Another interesting alternative was some experimental hacking on MRI’s garbage collector by Hongli Lai: store the collector’s working data separately from the objects being examined, so that preloaded Ruby code remains shared by many forked interpreter processes over long periods of time. In other words, this is a potentially major memory savings for Mongrel cluster users, which would in turn allow the sysadmin to run more Mongrels to service more simultaneous requests, or to bump up the database cache, or to increase the size of the running memcached instance. So this would indirectly be a performance booster, and Ruby could really use that.
This experimentation apparently became Ruby Enterprise Edition, which as of this writing is not available yet. But the other development coming from Hongli Lai’s new company, Phusion, is Passenger, a.k.a. mod_rails.
What’s interesting about mod_rails for the beginning Rails developer is that it is intended to make Rails hosting easier, particularly for shared hosting environments, which have been struggling to offer Rails hosting in a uniform and cost-effective fashion. That means that in a short while (weeks?), shared hosting plans for fiddling around with Rails will become much cheaper and more widely available than they are now.
What’s interesting about mod_rails for the experienced sysadmin is that it mimics the min/max process pooling behavior of Apache, and addresses startup overhead in a clever way. It also serves static images via Apache automatically, eliminating the need for a separate block of mod_rewrite rules that must be crafted carefully so as to avoid conflicts with mod_proxy.
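To give a feel for how little Apache configuration is involved, here’s a rough sketch, assuming mod_rails is already installed and loaded, with the hostname and paths made up for illustration; as I read the docs, pointing DocumentRoot at the app’s public/ directory is all it needs to find config/environment.rb, and anything under public/ is served by Apache directly:
# a bare-bones vhost for the app -- no mod_proxy balancer, no mod_rewrite rules
cat > /etc/httpd/conf.d/myapp.conf <<'EOF'
<VirtualHost *:80>
  ServerName myapp.example.com
  DocumentRoot /var/www/myapp/public
</VirtualHost>
EOF
apachectl configtest && apachectl graceful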
The architectural overview is comprehensive and well written, but here’s a summary: the Spawn Server builds a tree of child processes that preload Ruby, Rails, and your application code, and those preloaded processes are fork()ed to satisfy incoming requests. So the first request after startup incurs startup overhead (in my case, 5 seconds to load the Redmine login page), but subsequent requests get much better response times (0.6s to reload that login page).
That seems like a lot of overhead in terms of big Ruby processes. Here’s what I measured just now: 97MB free with just Apache running (no spawn server yet). After the first page view, there was 36MB free, and four new processes: the Spawn Server taking a little over 6MB (rsize), the FrameworkSpawner taking 20MB (rsize), the ApplicationSpawner taking 34MB (rsize), and one Rails process taking 34MB (rsize).
The new “free” value is 36MB. The Buffers and used Swap values remained constant, with only 48KB of swap used. So that means that all four processes, which would seem to need 94MB to run (34+34+20+6), are actually overlapping enough that they are using only 61MB (97-36). And the ApplicationSpawner eventually terminates, leaving 36MB still free, which makes sense – it’s the process that fork()ed the Rails process, so they should ideally be overlapping nearly 100%. I’m surprised that this is so high; based on the GC experimentation that Hongli Lai did, I would have expected them not to overlap as much.
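For anyone who wants to poke at the same numbers on their own box, this is roughly all it takes; the grep pattern is just a guess at what the process titles will look like in your process list:
free -m                                           # the "-/+ buffers/cache" line is the number that matters
ps axo pid,rss,command | egrep -i 'spawn|rails'   # resident size of the spawn server, spawners, and Rails processes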
The idle Rails process exits eventually also, controlled by the RailsPoolIdleTime setting. That saves memory but re-introduces the startup overhead. That leaves the FrameworkSpawner and the SpawnServer running, taking about 25MB of memory (quite close to the 20+6 shown by their rsize values).
Let’s compare this memory footprint to a Mongrel cluster. In a Mongrel cluster the processes start up and stay running forever, so the users are unlikely to incur much startup overhead at all, since it’s done long before they visit the application. Some amount of application-specific internal overhead is still an issue, though; that might include gradually filling an initially empty memcached, template compilation and/or caching, etc. As for memory, each Mongrel would need the same 34MB of memory, but there’s no Spawn Server, FrameworkSpawner, or ApplicationSpawner, so the extra 25MB of overhead would not be present with a Mongrel cluster.
That means that for a shared hosting setup where many low-traffic Rails sites may be used, or a multifunction server where serving one or more low-traffic Rails applications is just part of the job, mod_rails is a benefit. When the Rails app isn’t being used, it will exit and free up that memory for other processes. The starting and stopping of Rails with mod_rails is automatic and demand-based, so the sysadmin can tune it and forget about it.
On the other hand, a single dedicated server or VPS with a fixed amount of memory serving a single application would be better off with Mongrel, because of the lower memory overhead (25MB less), and the fact that the Mongrel processes start up before users need them and stay running indefinitely. Mongrel clusters could still potentially benefit from the Ruby Enterprise Edition’s garbage collector tweak if forking were used after preloading all of the code.
A single-purpose dedicated server running mod_rails could attain similar performance to a Mongrel cluster by simply setting the RailsPoolIdleTime value to a very high number. Then the Rails processes would hang around, and although you’d pay the price of a 25MB memory overhead, the startup overhead would only be paid by the very first visitor. However, you’d lose the main benefit of mod_rails, which is demand-based pool resizing, particularly if you’re running more than one application, Rails version, or Ruby interpreter version.
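Something along these lines would do it; the conf.d path is whatever your distro uses, and RailsMaxPoolSize is the pool-size knob as I read the 1.0 documentation:
# keep idle Rails processes around for a day so they behave more like an always-on Mongrel cluster
cat > /etc/httpd/conf.d/passenger-pool.conf <<'EOF'
RailsMaxPoolSize 4
RailsPoolIdleTime 86400
EOF
apachectl graceful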
In short, I think mod_rails is very nice, and having actually used it I’m impressed with how polished it is for a 1.0 product. But if you’re already running a single application as a Mongrel cluster on a dedicated server, there’s no point in switching.
Hi Jamie.
Your article is well-written. Kudos. But there are some inaccuracies. The FrameworkSpawner and ApplicationSpawner servers also have idle timeouts, of 1 hour and 2 minutes, respectively. So if you set RailsPoolIdleTime to a high value, then eventually the spawner servers will go away, and memory usage becomes the same as a Mongrel cluster’s.
“I’m surprised that this is so high; based on the GC experimentation that Hongli Lai did, I would have expected them not to overlap as much.”
I might be asking the obvious, but were you using my GC patch when measuring memory usage? If you use it with standard Ruby then the garbage collector will destroy whatever memory savings have been achieved by fork(). We’ve tested Passenger + Ruby Enterprise Edition vs a Mongrel cluster, and we notice a significant decrease in memory consumption, even with the spawner server overheads.
I mentioned that I had seen the ApplicationSpawner go away. Does the Spawn Server exit? It’s only 6MB so that’s not a big deal anyway. But I didn’t know that the FrameworkSpawner also exited after a while; that’s good news.
I’m not using a patched Ruby, unless the Passenger gem installer does that on its own. I suppose that if the MRI garbage collector doesn’t run periodically like it does in the JVM, I might not have used the Rails app enough to trigger a demand-based GC.
I eagerly await the release of Ruby Enterprise Edition. :)
The spawn server does not exit. Though you could kill it manually if you want to, it’ll just be restarted next time it’s needed. If your RailsPoolIdleTime is large enough then the spawn server will eventually become unused.
The Passenger installer doesn’t install Ruby Enterprise Edition for you. It’s a separate install, but we’ve made sure that installation is as easy and risk-free as possible.
And Ruby’s GC doesn’t run periodically. I’m not sure how the JVM GC works, but Ruby’s GC is triggered based on several conditions, one of which is the number of currently allocated objects.
By the way, the name “Ruby Enterprise Edition” is actually a joke, but we’ve already received messages from concerned people. ;) It’ll be explained on RailsConf.
I’m thinking of the “asyncgc” option to the JVM, which appears to be obsolete now. It would wait for the JVM to be idle for an extended period of time, after which it would trigger a GC all by itself.
I agree, a 6MB process that doesn’t do much will probably be swapped out, so it probably won’t impact performance over time.
The latest development version (git repository) has a tool for gathering memory statistics. It’s located in misc/memory_stats.rb. This tool only works in Linux though.
I think mod_rails could save a lot of RAM because its spawned processes die somewhat quickly. Mongrel is subject to something of a RAM creep over time, sucking up more and more, so mod_rails avoids that. Right on.
Roger, as I understand things, that’s not correct. If the process grows and grows, that’s a memory leak, and isn’t likely to be Mongrel’s fault.
In the case where you have a memory leak and a very low-traffic server, I guess it would help to have idle processes exit and restart. But if there’s enough traffic to keep that Rails app running (i.e. the interval between hits is shorter than the idle timeout) then the process will not restart and the memory leak will still be a problem.
As a stopgap measure, if you have a heinous memory leak and no time to fix it properly, I suppose a hack would be to lower that idle timeout to a teeny value, so that even the slightest lull in traffic would cause a reaping of processes, which would then be re-forked later. But that’s not really a feature.
I’m having trouble understanding why the FrameworkSpawner has these timeouts. Currently, when my site has not been accessed for over an hour, it takes about 25 seconds for it to come alive again. Why not always keep the last one alive–or have the option to do so?
I understand waiting 3-5 seconds for another application spawn to occur, because 3-5 seconds is very reasonable.
But the 25 seconds for the FrameworkSpawner will make people assume that the site is down.
If you bump up the idle timeout to a large value (a few hours) it’ll already be running.
But 25 seconds seems very high for a web application to start up. Is that server starved for memory (so app startup incurs swapping overhead to make room) or is there something in the app that has to preload a lot of data?
No not at all…it’s a very small application, and is the only thing running on a VMWare slice with 512MB RAM.
Hmm, does it also take 25 seconds to start under ./script/server or a Mongrel cluster?
Just for reference, on a VPS that has a single core (2.2GHz) and 336MB RAM, I can stop and start Apache (thus killing all running mod_rails related processes) and then load the login page of a Redmine instance in my browser in just over 3 seconds. The VPS is in Dallas and I’m in SF, and this includes all the round-trips for the images, stylesheets, etc. to return a 304 status code. So my experience is that it’s really quite fast to start.
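If you want a number that’s easy to compare across machines, timing just the HTML without the browser in the mix is enough; the URL here is made up:
time curl -s -o /dev/null http://myvps.example.com/login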
So, if I were you I’d run ./script/server on that server in development mode or production mode with extra logging, and see what’s taking 25 seconds. It’s not mod_rails. Maybe it’s something in your app that is absolutely correct and can’t be removed, in which case you could bump up the idle timeout to a very high number to reduce the frequency of a cold start of your app. But I suspect there is a network timeout (trying to connect to a host that isn’t up? a DNS timeout?) or some other configuration error hiding in there.
What is the CPU utilization of the Rails and mod_rails processes during that 25-second interval? That (and ‘top’, and ‘iostat 1’) would give you a clue as to whether the server was idle waiting for a timer to expire (a config error), or working like mad the whole time trying to load something from disk or calculate something.
Site loads instantly with ./script/server, but also loads in 3 seconds when I kill Apache and start it up again. The issue is the timeouts…if I let it go over an hour or so, it takes the full 25 seconds to come back to life.
It’s not a DNS timeout, I’m hitting the IP directly on a local network.
And it’s really not a complex app at all, just an Intranet that we use to post updates, share files, etc. So there’s nothing legit that would take 25 seconds to load.
Really appreciate you taking the time to help with troubleshooting.
Looking for a timeout:
What is the CPU doing during that time (looking at top, is it mostly in idle or user or iowait or what)? Is it working like mad for 25 seconds, or doing nothing for most of the time and then suddenly starting up the Rails app?
Looking for a lack of sufficient physical memory:
If you run ‘iostat 1’, are the disks busy during that time? How about the swap partition? In ‘top’ during that time, does the Swap: line’s ‘used’ value grow?
When you’ve just restarted Apache and no mod_rails or Rails processes are running, run ‘free’ to find out how much memory is available. Is there enough under ‘free’ in the ‘-/+ buffers/cache’ line to accommodate the processes that need to launch? In my case, I need about 60-65MB of memory to fit them all without swapping.
I don’t think these are the issue based on what you’ve said, but it’s worth saying for the benefit of other folks.
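For the record, here are those checks as a block to run in another terminal while reproducing the stall:
top          # watch the %id / %wa CPU fields and the "Swap: ... used" value during the 25-second window
iostat 1     # per-second disk activity; sustained busy disks suggest cold-cache reads or swapping
free         # the "-/+ buffers/cache" free value is what's actually available to new processes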
If you restart Apache, load up the Rails app and get the 3 second response time, then immediately kill all the mod_rails processes (spawn server, framework and app spawners, and the Rails processes themselves) and then hit reload, does it take 25 seconds or 3 seconds?
This is pretty weird – you might want to contact the mod_rails folks for support. I’m just an unaffiliated mod_rails fanboy. :)
There was just a pure ruby process running for all 25 seconds, using 60% CPU the whole time when looking at top. Then it went away and httpd, mysql, and a few others took over the top few lines once the app was up and running.
Can’t seem to mimic the state of the timeout by killing processes, it only seems to do it if I actually wait out the hour. I’m sure there’s a way, but I haven’t been able to come up with the combo.
“But 25 seconds seems very high for a web application to start up. Is that server starved for memory (so app startup incurs swapping overhead to make room) or is there something in the app that has to preload a lot of data?”
It’s the data that has to be loaded. The Ruby on Rails framework itself consists of tens of files. A cold start is quite slow because a lot of disk seeking has to be performed. Plus, Ruby has to interpret all those instructions.
On my laptop, a cold start takes 10 seconds while a warm start takes 3 seconds.
Hmm. If I reboot my VPS (thereby clearing the disk cache) and then load a page served by mod_rails, it takes 7.122s to load the whole page. That includes all the startup time of the mod_rails spawners as well.
Maybe the dom0 disk cache is involved, but it has a very limited amount of memory so I doubt it. Or maybe my VPS is on much faster hardware than bensie’s VPS, and it’s just gonna take that long any time the app really starts without the benefit of any disk caching.
I guess the way to test that theory is to reboot the VMWare virtual server and try a cold start, and see how long that takes.
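Or, skipping the reboot: on Linux you can drop the page cache and time a cold load of the app directly. A rough sketch, assuming a 2.6.16+ kernel, root access, and that it’s run from the application root:
sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'   # throw away the page cache
time ruby -e 'require "config/environment"'            # cold load of Ruby + Rails + the app
time ruby -e 'require "config/environment"'            # the same load again, warm this time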
If disk performance is really the bottleneck, a quick and dirty Rails startup accelerator could be added that would work more or less like the ‘readahead’ program that comes with CentOS 5.1:
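# warm the kernel's page cache with the Rails framework sources so a later cold
# start does less disk seeking; note this only covers the framework gems, not the
# application's own code, vendored plugins, or other gems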
cd /usr/lib/ruby/gems/1.8/gems ; \
find . -type f -iname \*.rb -and \( \
    -iwholename ./actionpack\* -or \
    -iwholename ./actionmailer\* -or \
    -iwholename ./activerecord\* -or \
    -iwholename ./activesupport\* -or \
    -iwholename ./activeresource\* \) \
    -exec cat {} > /dev/null \;
What types of things within the Rails application itself might take a long time to load? This particular app is a Rails 1.2.6 app with just a few plugins…
I’m going to try it with different apps now to see if that’s the issue.
For me, the big win is not having to manage the Mongrel cluster port range, not having to worry about a couple of slow file uploads/downloads locking up all the Mongrels, not having to worry about slow S3 or Salesforce calls locking up all the Mongrels, etc., etc. Count me onboard the mod_rails bandwagon… it’s making my cluster management skills obsolete and I still love it :-)