Multicore Hardware and the Future of Ruby

by David Fayram

Consider this fact: multi-core CPUs are not only the future, they're the only way CPUs can keep improving at their current pace. They're also a hotly debated subject in the software world. Multi-threaded programming is different from, and far less common than, ordinary procedural programming, and therefore it's not yet as well understood. So the question is: how can programming languages (and Ruby in particular) make it easier to harness these systems?

As Ruby struggles to graduate from its current implementation into something more powerful, we've already seen several projects attempt to update Ruby to help developers cope. Those who've been working with Ruby for a while may remember YARV, which promises to provide better threading support. JRuby offers Ruby all the power of Java's native threads, if Ruby code can harness them. And Evan Phoenix's small but rapidly growing project Rubinius is attempting to be the next big contender.

No matter which implementation becomes the next de facto Ruby platform, one thing is clear: people are interested in taking advantage of their newer, more powerful multi-core systems (as the surge of interest in Erlang at recent RailsConfs and RubyConfs has shown). As Ruby becomes an increasingly common part of solutions that deal in high volumes of data processing, this demand can only increase.

That's why it's so very surprising to see David Heinemeier Hansson dismiss the whole notion out of hand where Rails is concerned. His argument seems to be that Rails already scales to multiple cores the same way it scales to multiple machines: via UNIX process distribution. After all, isn't this the very crux of "Shared Nothing"?

32 Comments

Neil Wilson
2007-06-07 08:43:35
I'm afraid I'm with DHH. Memory is cheap, and working set size is a matter for the operating system - as should be the issue of multi-core and anything to do with concurrency.


Fundamentally the most limited resource is human brainpower. Introducing threading into any program creates problems - because humans just don't think like that.


You get the biggest pool of programmers available to you if you can just assume that you aren't sharing anything with anybody for any reason. That is why the process was invented in the first place - to help eliminate screw-ups.


Now if the Unix process abstraction isn't what we need in the 21st Century, then go revisit that. But whatever you do, don't make it something an application programmer has to worry about.

Gregory
2007-06-07 08:55:16

Now if the Unix process abstraction isn't what we need in the 21st Century, then go revisit that. But whatever you do, don't make it something an application programmer has to worry about.


Full ACK. Concurrency is an extremely hard problem to get right, and I much prefer to let the programming language, operating system, etc handle it for me. Much less likely to end up with awful dilemmas that way. :)

Joe
2007-06-07 08:56:19
Hm, my mongrels are just over 60MB or so.


Never understood why a Rails process had to be so huge. Getting it down to, say, 20MB or so would be a huge improvement.

Gregory
2007-06-07 08:56:40
By the way, nice post David, hope to see more like these...
James Herdman
2007-06-07 09:02:02
Excellent article. It makes me happy to see sane, levelheaded discussion about the matter.
Daniel Berger
2007-06-07 09:35:47
I'm afraid I disagree with DHH on this issue, although I have to admit I'm confused about what he's talking about when he uses the word "processes". Is he talking about green threads? Fork? I don't know.


In any case, after reading even a portion of "Programming Erlang" I've realized a couple of things. First, we're thinking about COP (concurrency-oriented programming) all wrong - it's possible (and better, and easier) without native threads.


Second, we should try to make a language that scales up as well as out. I mean, why toss more hardware at a problem, when you're not taking advantage of the hardware you have? That, or avoid having to jump through lots of extra hoops to do so.


Consider this benchmark of YAWS (an Erlang web server) vs. Apache. In a world where PMs and programmers constantly question the scalability of various web frameworks, we need to look at Erlang's implementation and seriously consider whether we've all been approaching the issue from the wrong angle all this time.


Anyway, my general feeling is that we're rationalizing the lack of SMP support in Ruby (and other languages) instead of trying to solve it.

Dave Fayram
2007-06-07 10:38:46
Neil Wilson said:
Memory is cheap, and working set size is a matter for the operating system - as should be the issue of multi-core and anything to do with concurrency.


First of all, the kind of memory we're talking about here (and the boards to handle it) is not cheap (for example, look at the prices of 4GB of RAM in 2-module configurations); it can easily make up a sizable chunk of the cost of a server. And even if memory is cheap, system bus bandwidth is growing at a much slower rate than the other aspects of modern server hardware.


Second of all, I'm not saying we should engage in some kind of radical rewrite of Rails or other Ruby libraries. I'm saying we should start making our apps thread-safe so that when a good Ruby implementation that does threading right comes along, it will result in a "free" upgrade. In the specific case of Rails, let's make ActiveRecord thread-safe, and people writing new libraries like Mongrel should think about concurrency as well.
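As a minimal sketch of what "thread-safe" means here (a hypothetical Counter class, nothing assumed beyond the standard library): guard shared mutable state so the code stays correct no matter how a future Ruby schedules threads.

    require 'thread'  # Mutex lives here on Ruby 1.8

    class Counter
      def initialize
        @count = 0
        @lock  = Mutex.new
      end

      def increment
        # synchronize makes the read-modify-write atomic, so two threads
        # can never interleave and lose an update.
        @lock.synchronize { @count += 1 }
      end

      attr_reader :count
    end

    counter = Counter.new
    threads = (1..2).map { Thread.new { 10_000.times { counter.increment } } }
    threads.each { |t| t.join }
    puts counter.count  # always 20000, green threads or native ones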


I'm surprised my opinion is at all controversial. To me, it seems like just reading the writing on the front of a swiftly approaching train. :)

Neil Wilson
2007-06-07 11:13:41
Memory arrays might not be cheap at the moment, but they'll get cheaper once you have 64 core chips demanding service.


The arguments you put forward are the same ones that were made for threads in the first place - they are machine-optimal. But as we have seen, machines get faster - very quickly.


What doesn't get faster or smarter is the brainpower of your average human grunt programmer. That is what we need to optimise towards. Hardware will become fast enough and cheap enough as the need arises, but it will require a different programming technique. That technique needs to be simple if programmers are to master it effectively.


For me, threads were always a backward step. That Rails is not thread-safe is actually one of its benefits - it stops the premature optimisation within a single program that seems so popular amongst those who have grown up with easily available threading.


Modern programmers need to reacquaint themselves with the process, and the delights of not having to worry about whether there is some other execution context twiddling with your object attributes.


I'd write a 'threads considered harmful' piece, but others have done it much more eloquently than I ever could.

Rick
2007-06-07 11:45:36
I'm not sure that YAWS benchmark is relevant to anything in the real world. On what hardware could processing 80,000 requests *simultaneously* be useful?


You don't even have to look far past the network: Divide a gigabit connection up 80,000 ways and you're serving content to each user at a blistering 1.5KB/s. Hello, 1990. Oh, how I've missed you and your bad hair.
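(The arithmetic behind that figure, assuming a fully utilized 1Gbit/s pipe:)

    bits_per_second = 1_000_000_000   # one gigabit per second
    clients         = 80_000          # simultaneous requests
    bytes_each      = bits_per_second / 8.0 / clients
    puts "%.2f KB/s per client" % (bytes_each / 1024)   # => 1.53 KB/s per client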

chromatic
2007-06-07 13:03:05
What kind of operating system are you using that fork() and COW memory pages can't get you processes that can run on multiple cores without using hundreds of megabytes apiece?
Dave Fayram
2007-06-07 13:09:30
Neil Wilson:
Memory arrays might not be cheap at the moment, but they'll get cheaper once you have 64 core chips demanding service.


As I said, the problem exists today. It's not some "memory will be cheap enough soon" situation. When I was working at Mog, we deliberately chose our servers with both the amount of RAM they had and the amount of RAM they could use in mind. The former need not be higher than the latter, and so we chose systems with fewer cores than was financially possible.


The arguments you put forward are the same ones for threads in the first place - it is machine optimal. But as we have seen machines get faster - very quickly.


I am not arguing for any specific concurrency paradigm. I am arguing for concurrency-safe programming. These practices dovetail nicely with the list of best practices for object oriented coding anyways (e.g., "Minimize use of global state").


What doesn't get faster or smarter is the brain power of your average human grunt programmer. It is that we need to optimise towards. Hardware will become fast enough and cheap enough as the need arises, but will require a different programming technique. That technique needs to be simple if programmers are to master it effectively.


I don't like talking in terms like this; I find it extremely patronizing. If a programmer cannot perform, then they will lose their job to someone who can. Again, I reiterate: I am not saying "write a threaded Rails." I'm saying "write a thread-safe Rails." As it stands, I can't even cleanly fork() a Rails process, or use multithreading within it.
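A hypothetical sketch of that fork() problem, assuming it runs inside a booted Rails app: the child inherits the parent's database socket, so both processes would otherwise talk over the same connection.

    pid = fork do
      # Give the child its own database socket; without this, parent and
      # child share one connection and corrupt each other's traffic.
      ActiveRecord::Base.establish_connection
      # ... do isolated work here ...
    end
    Process.wait(pid)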


Modern programmers need to reacquaint themselves with the process, and the delights of not having to worry about whether there is some other execution context twiddling with your object attributes.


And if Rubinius suddenly offers Erlang-style concurrency, Rails will need a ton of work to take advantage of it. You're caught up on the word "thread," and assuming I am too. Please don't be.

Dave Fayram
2007-06-07 13:12:01
What kind of operating system are you using that fork() and COW memory pages can't get you processes that can run on multiple cores without using hundreds of megabytes apiece?


It's not the operating system. It's Ruby's garbage collector. Its mark phase walks the entire object space setting a mark bit in every live object, which means it writes to nearly every heap page - defeating any kind of CoW scheme you might expect to work.
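A minimal sketch of the effect, assuming a Unix Ruby with fork available (watch the two processes' resident/shared memory in top while the child sleeps):

    # Fill the parent's heap. Immediately after the fork below, the child
    # shares all of these pages with the parent via copy-on-write.
    big = Array.new(500_000) { "x" * 64 }

    pid = fork do
      # MRI keeps each object's mark bit inside the object itself, so a
      # full mark phase writes into every live object - and the kernel
      # must then copy every touched page, un-sharing the heap.
      GC.start
      sleep 60
    end
    Process.wait(pid)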

chromatic
2007-06-07 14:30:47
@Dave, that's a good point--but it's not terribly difficult to keep object GC flags in their own memory pages (assuming a well-factored GC system).


Even so, I remain unconvinced that the largest scalability problem of most server applications is CPU time.

evan
2007-06-07 14:37:00
Rubinius's separate, 2-space GC will handle CoW in a sane way. Watch for it in the fall at a Ruby dealer near you.
Tim Olsen
2007-06-07 15:25:25
The fundamental problem is functional vs. imperative programming languages.


Imperative programming languages make it easy to write sequential programs and hard to write parallel programs.


Purely functional programming languages make it hard to write sequential programs (monads), but easy to write parallel programs.


Ruby has some functional programming features, but it still relies heavily on state which makes automatic parallelization difficult.
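A tiny illustration of that point in plain Ruby:

    # A side-effect-free map: each iteration depends only on its own input,
    # so in principle the iterations could run in parallel.
    squares = [1, 2, 3, 4].map { |n| n * n }

    # The imperative version mutates shared state on every step, so no
    # compiler or VM can safely split this loop across cores.
    total = 0
    [1, 2, 3, 4].each { |n| total += n * n }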


Multicore processors will drive the adoption of functional programming languages like Erlang and Haskell.

Michael Koziarski
2007-06-07 15:31:12
Rails is, by and large, threadsafe in production mode. The parts of the framework which aren't are isolated to actionpack, mostly in the caching code. Far from needing a 'ton of work', rails is already mostly there.


However, at present there's essentially no point spending time making those changes. Green threads and blocking IO make for shitty concurrent performance, and while merb may well be faster than rails, I've yet to see any convincing real world profiling reports which prove that that's just because of 'thread safety'. Threads aren't magic beans that give you a magic performance boost.
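A hedged sketch of that green-threads point - the same CPU-bound work done sequentially, then split across four Ruby threads; on an interpreter with userspace threads both timings come out about the same:

    require 'benchmark'

    n = 4_000_000
    puts Benchmark.measure { n.times { |i| i * i } }
    puts Benchmark.measure {
      threads = (1..4).map { Thread.new { (n / 4).times { |i| i * i } } }
      threads.each { |t| t.join }
    }
    # With real native threads on a multi-core box, the second measurement
    # should approach a 4x speedup instead.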


Until we have a ruby VM with 'real' threads, DB drivers which do async IO, and a GC which is compatible with all of the above, making rails threadsafe is a wasted effort. Once we have those things, or they're clearly just around the corner, I'll be happy to work with concerned individuals to get their patches merged.

Mike G
2007-06-07 18:15:57
I think what we need to do is step back and take Dave Hansson's pronouncements about databases and hardware architecture with a grain of salt. I would recommend a cowlick, but that would be plain rude. Dave Hansson is a marketing guy. Stop treating him like you would treat Joe Armstrong or Rob Pike or something.


Dave doesn't understand hardware or the Unix process model; he calls the Ruby interpreter "the compiler," refers to a relational database system as a "big hash," and so on and so forth.


Dave H created Rails using some really crufty, PHP-like code, and it took off. That doesn't mean he knows something about concurrency that people like Joe Armstrong don't. Stop treating him like he is an expert on everything. Outside of Rails, he's mostly talking out of his umm .. you know.. A$$?


Sometimes we give people too much credit in unrelated areas of expertise. It's like George Clooney talking out of his ass on geopolitical issues, or a basketball player pontificating about morality just cuz he got picked up by the NBA. You know, asking Ja Rule about the significance of the events of 9/11... Just not gonna happen!


Fame does not an expert make. (groupies notwithstanding)

Gregory
2007-06-07 20:33:08
@Mike G


Hahahaha. You may be overstating, but nice counterbalance :)

steve
2007-06-08 00:21:17
The biggest problem with this discussion is that it rests on an assumption that wasn't tested in reality. The statement that 1.6GB of RAM would be used is based on the extremely unlikely mathematics of 8 cores * 200MB.


Until this is actually tried out (rather than argued over in terms of memory and CPU), we have no extraordinary proof to back up the extraordinary assertion... which, if true, does indeed prove DHH wrong.


If the extraordinary assertion is wrong, and the concrete measurement of the actual value is closer to a much more reasonable number, then no time would have been wasted by the discussion you can see by scrolling up.


My ordinary assertion, based on running multiple copies of other programs, is that the OS is quite capable of sharing memory. For example, glibc is loaded by nearly every program that runs on Unix, yet the hundred or so processes on a typical box do not incur a 100 * 20MB overhead (2GB of RAM!).
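A Linux-only way to check that claim, assuming /proc is available - sum what the kernel reports as shared versus private for the current process:

    shared = priv = 0
    File.foreach("/proc/self/smaps") do |line|
      shared += $1.to_i if line =~ /^Shared.*?(\d+) kB/
      priv   += $1.to_i if line =~ /^Private.*?(\d+) kB/
    end
    puts "shared: #{shared} kB, private: #{priv} kB"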

Dave Fayram
2007-06-08 07:44:55
Steve, I was the sole developer for Ma.gnolia.com up until its launch, and I worked with Lucas Carlson to launch Mog.com and maintain it. I've also consulted for several people launching or maintaining Rails apps. I have deployed and profiled apps running on multi-core hardware, so I've seen these effects in action.


Michael Koziarski knows more about the Rails codebase than I do, but I stand by my numbers in terms of where apps often end up. I've also gone and asked several prominent Rails developers what ranges their Rails responders fall into. I did a fair amount of homework before settling on 200MB, which is close to my personal experiences and falls into the 150-250MB range that most people with nontrivial apps quote.


It depends on a variety of factors, of course, but 200MB for a complex app with RMagick isn't unreasonable at all. You can save some memory by forking, but that savings gradually erodes, and forking freaks ActiveRecord out.


Finally, Steve, the OS is sharing memory for libraries as intelligently as it can. It will give every process the same copy of the read-only pages associated with the ImageMagick libraries, for example. However, this is a small fraction of where the memory goes. One of the problems with interpreters is that they can't automatically share memory as efficiently as compiled code can, since the system is so heavily optimized for the compiled-code case.

Gregory
2007-06-08 07:55:02

It depends on a variety of factors, of course, but 200MB for a complex app with RMagick isn't unreasonable at all.


Is that taking into account the memory leaks or no? :)

John Pywtorak
2007-06-08 09:13:07
All I can say is prove it. Stop spewing FUD and show me some real evidence. So a bunch of people are creating some hysteria over multi-core; well, that just makes them exactly that - a bunch of people creating hysteria. That is all, end of story.


I ask you: what benchmarks did you use to make sure your single-core processor was being used efficiently? How do you know that a single core was your performance bottleneck? Did a big company with a big marketing budget tell you that you needed more cores? So what makes you think you can all of a sudden be critical of raw computing power just because you have two or more cores?


Please people, don't fall for this FUD. Do some real investigation and benchmarking.

Wilson Bilkovich
2007-06-08 10:18:57
I have done a lot of client work, and I routinely see Mongrels in the 200 to 300MB range. The 'record' so far was on a consultation for a California company that had 13 Mongrel instances, each taking 305 to 320MB of RAM.
This is not Mongrel's fault, of course; it's the fault of the Rails programmers involved. However, Ruby's GC behavior is generally not well understood by Rails developers, and it is easy for inexperienced coders to get into deep water.


BackgrounDRb is a common source of trouble. While that is a solid and excellent library, it must be used with care to avoid preventing the GC from collecting your objects.
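An illustrative sketch of that retention problem (hypothetical Worker class): a long-lived worker that stashes results in an instance variable keeps every one of them reachable, so the GC can never reclaim them.

    class Worker
      def initialize
        @results = []   # lives as long as the worker process does
      end

      def process(job)
        @results << job.to_s * 1_000   # stand-in for a big computed report
      end
    end

    w = Worker.new
    1_000.times { |i| w.process(i) }   # the heap grows and can never shrink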

Lucas Carlson
2007-06-08 10:19:29
Here is a snapshot of typical RAM usage for one of the Mongrels running http://mog.com/ on a 4-core machine.


RAM used: 219184 kB

Tom Preston-Werner
2007-06-08 10:40:01
Mongrels for http://gravatar.com/ consistently stabilize between 200MB and 250MB each. These are on a two core machine.



VIRT RES SHR DATA COMMAND
257m 251m 3176 249m mongrel_rails
238m 232m 3040 230m mongrel_rails


It's foolish not to use whatever threading capabilities are available to us in the framework. Having to start and maintain multiple Rails instances where one (thread safe version) could suffice represents additional complexity that I would rather do without. Not to mention that the memory overhead is real.

Vlad D
2007-06-08 11:08:36
Oh well, if stateful connections take off, there's no way to stay at the process level. Here is a case in point: http://blog.fastmail.fm/?p=592
Jeremy McAnally
2007-06-08 11:19:17
In general, my apps run slightly under 200MB when they've been up for some amount of time.

2007-06-10 13:59:19
"The first people to really innovate technically with them will have an enormous advantage over their competitors."


Never trust people who try to scare you into behaving the way they want. This is clearly FUD, and somebody here wants very eagerly to be seen as an expert. Very bad style.

Dave Fayram
2007-06-10 14:41:26
Anonymous (if that is your real name?)...


My argument is that conceptually, nothing stops rails from taking advantage of N-core systems. In fact, conceptually it's in a very good position to do that. However, currently its implementation precludes that notion.


I'm not sure how I'm manipulating people to my benefit by suggesting we should be improving Rails. I'm also unsure how I ended up defending the position that we should be writing parallel threading code by hand. That's certainly not how I feel. I'm advocating making Rails thread-agnostic, in preparation for Ruby implementations that offer true threading (and/or better concurrency models).

teki321
2007-06-11 00:52:37
I think 1GB of RAM is always going to be cheaper than one core plus 2MB of cache.


It's technically really challenging just to feed 64 cores with data, so the first step is to release those 64-core machines and make them cost-effective. By that time there will be plenty of Ruby implementations to choose from.

tom
2007-06-22 17:19:47
ditto, for me on mongrel mem usage...


VIRT RES SHR S %CPU %MEM TIME+ COMMAND
217m 165m 8568 S 0 1.0 24:43.48 mongrel_rails
230m 178m 8568 S 0 1.1 26:32.17 mongrel_rails
289m 237m 8568 S 0 1.5 26:50.59 mongrel_rails


I am exploring deploying under GlassFish with JRuby, and this looks like it could be an attractive alternative down the road. The rails-integration (Goldspike?) code maintains an object pool of Rails instances, so scaling means creating more of these - which I suppose is more memory-efficient than N Mongrel processes, since large parts of the JVM can be shared?


Anyway, I think Rails deployment has a ways to go before it can be considered stupid simple to do. It reminds me of the early days of Java app servers, where you were always worrying about your server locking up and needing a restart.



Grzegorz Daniluk
2007-06-24 03:44:49
Don't forget about the people who use cheap hosting services. I'd guess that a single server can handle thousands of accounts, with each account accessed maybe ten times per day. Note that PHP won the web development market by taking it from the bottom. So from this point of view, the ability to use memory economically might be very important if RoR is to become a mainstream web development tool.


To use Comet-like features, there is no need to put that stuff into RoR. The web server can handle that dirty work very well. Check out my fdajax module for the lighttpd web server.