Tuesday, June 19, 2007


Monkeys down a fire pole

How many monkeys can slide down a fire pole in 5 minutes?

OK, so this question doesn't seem to have a lot to do with technology or engineering when you first consider it. But give me a little patience and let's see where we go.

I have recently begun a personal quest to educate myself, and others, about threading and how to do it well and efficiently. Now that multi-core processors have arrived and the processor companies are no longer producing faster and faster single-core CPUs through clocking, over-clocking and over-over-clocking, it has become more important to actually understand threads.

Even as recently as when Java first came out, the only places you really ran into threads in open systems were Solaris (or other Unix) systems with big multi-CPU backplanes. These were usually reserved for big ol' database boxes, though, so you could optimize your system for what really amounted to one big fire pole. If we imagine each processor as a fire pole and each unit of work to run as a monkey, we can start to use my fun little analogy.

Everything you did was about queuing up the monkeys efficiently, then sending each individual monkey down the pole as quickly as possible, thereby maximizing your monkey throughput. Great schemes for keeping the monkeys in the right order, keeping the pole clean during non-monkey-sliding moments (garbage collection in Java-land) and other similar optimizations were the order of the day.

Then the fire pole manufacturers (Intel and AMD) discovered that they just couldn't build a faster fire pole without violating the laws of physics (and it's not that they wouldn't break those laws, more that they couldn't figure out how), so they decided the best way forward was to build a multi-fire-pole firehouse in the same amount of space. It was a great innovation... but the monkeys were still optimized for a single fire pole. So in many cases (not all, but many) fire pole (CPU) utilization was actually very low even though other areas were slammed. Monkey lines (memory) and even monkey storage (I/O) became bottlenecks, with pole usage at a low 25%.

The time has come to build education on efficient fire pole usage. We need more effective ways to slide those little monkeys down. There are some basic practices that seem to make sense. Sun recently explained how they are building better threading into Java and things are rolling out from there. That's a great start... but that is just a start. To really use the power we are now putting into machines we need to figure out better ways to feed the cores.
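To make the analogy a little more concrete, here is a minimal Java sketch of one way to feed the cores: a fixed pool of worker threads, one per pole, with the monkeys handed to whichever pole is free. It uses the standard java.util.concurrent classes. The names MonkeyFirehouse, slide and slideAll are made up for illustration, and the loop inside slide is just a stand-in for real CPU-bound work, so treat this as a sketch under those assumptions rather than a recipe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class: poles are worker threads, monkeys are tasks.
class MonkeyFirehouse {

    // One monkey: a small CPU-bound unit of work (a placeholder for real work).
    static long slide(int monkeyId) {
        long sum = 0;
        for (int i = 1; i <= 1000000; i++) {
            sum += ((long) i * monkeyId) % 7;
        }
        return sum;
    }

    // Queues every monkey and lets the pool hand each one to a free pole.
    // Returns the number of monkeys that reached the bottom.
    static int slideAll(int monkeys, int poles) {
        ExecutorService firehouse = Executors.newFixedThreadPool(poles);
        try {
            List<Callable<Long>> line = new ArrayList<>();
            for (int id = 1; id <= monkeys; id++) {
                final int monkeyId = id;
                line.add(() -> slide(monkeyId));
            }
            int finished = 0;
            // invokeAll blocks until every task has completed.
            for (Future<Long> result : firehouse.invokeAll(line)) {
                result.get(); // rethrows if this monkey failed on the way down
                finished++;
            }
            return finished;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while monkeys were sliding", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a monkey failed on the way down", e);
        } finally {
            firehouse.shutdown();
        }
    }

    public static void main(String[] args) {
        // One pole per core the JVM reports.
        int poles = Runtime.getRuntime().availableProcessors();
        System.out.println(slideAll(100, poles) + " monkeys down " + poles + " poles");
    }
}
```

Sizing the pool to the number of cores is a reasonable starting point when the monkeys are purely CPU-bound. If they spend time waiting on monkey storage (I/O), a different pool size may keep the poles busier, and that is worth measuring rather than guessing.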

Help me save the monkeys... do you have any good tips, observations or sites, or have you seen any bad monkey practices that should be avoided? This is, of course, not limited to Java; it applies to C, C++, Ruby and others too. I would like to pull these together to be shared with everyone. If you have a good monkey tail, please drop a comment. I would love to hear from you.
