All hail the speed demons

by Jono Bacon

You know, the Open Source community is filled with heroes. This is evident from the helpful souls who walk you through your mailserver configuration again at 3am to those who feverishly hack on your favourite software and continue to refine and develop it into the world-class tool you brag about to your pals. There are of course many unrewarded heroes, and this entry is dedicated to one particular breed - the optimisers.

In recent years the focus on performance has flooded the Open Source community. From old chestnuts such as KDE and X being slow, to newer complaints about GNOME and OpenOffice.org, it seems that 'just buy a faster computer' is still as useless an excuse as it was five years back. Although various developers have tried to speed up and refine the system, it seems that in recent months a new breed of developers is really tearing things open and squashing down the padding for a smaller footprint and better performance.

Back in May 2001, Waldo Bastian, an esteemed and well-respected KDE developer, wrote "Making C++ ready for the desktop". In those days, KDE was taking quite some flak for its performance issues, and although the developers were using better coding practices and techniques, there seemed to be a general feeling that the underlying tools needed optimising and polishing. Waldo pinpointed the source of some of these problems to the way applications linked with shared libraries. The objprelink tool was developed to solve these issues and KDE got faster.

In recent months, some brave souls have stepped up to the plate to speed things up too. In the GNOME camp, Federico Mena-Quintero, contender for one of the coolest names in Open Source (closely rivalled by the equally cool Torsten Rahn from KDE), has been working extensively on improving the performance of the GNOME file picker dialog. Back on the 20th July 2005, Federico expressed concerns about the file picker: "Starting up a GtkFileChooser is slow. On my 1.7 GHz laptop, a file chooser never seems to come up in under a second. That's way too long, given that even the old Midnight Commander could display a large directory in under 0.3 sec". What followed was a series of detailed blog entries that track the experiments, problems and efforts that Federico made to optimise this small but important part of the GNOME desktop. Currently at part eight, each blog entry is both fascinating and perplexing at the same time. A great example of this is in Part 6, where Federico utters a nugget of profiling wisdom: "At first we thought of splitting this table into two: a tiny one for code points below 8192 (0x2000), and another one for the high code points. But then we figured out that we can instead use the upper bit of each entry in our lookup table for the gunichar->scripts (the one I described above). Since the numbers that we store in that table are small — 0 through 62, which is the number of scripts in Pango — the upper bit is free for us to use. So we modified the program that computes the lookup table, and put the new table into the code". The entire post consists of heavy technical discussion about the problems and tell-tale performance issues, and then at the end it summarises with "In total, we have managed to kill around 24% of the original running time. Not bad for two days of work on a seemingly-intractable flat profile!" Amazing.
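For the curious, the spare-bit trick Federico describes can be sketched in a few lines. This is only an illustration of the general idea, not Pango's actual code: I am assuming one-byte table entries, and the names (pack_entry, unpack_entry, build_table) and the meaning of the extra flag are made up for the example. The only facts taken from his post are that the stored script numbers run from 0 through 62 and that the upper bit of each entry is therefore free.

```python
# Illustrative sketch only: entry width and all names are assumptions,
# not taken from Pango's source.

SCRIPT_MASK = 0x7F  # low seven bits hold the script id (0..62 fits easily)
FLAG_BIT = 0x80     # upper bit of a one-byte entry, unused by the script id


def pack_entry(script_id, flag):
    """Combine a script id and a boolean flag into a single byte value."""
    if not 0 <= script_id <= 62:
        raise ValueError("script id must be in the range 0..62")
    return script_id | (FLAG_BIT if flag else 0)


def unpack_entry(entry):
    """Split a packed byte back into (script_id, flag)."""
    return entry & SCRIPT_MASK, bool(entry & FLAG_BIT)


def build_table(script_ids, flags):
    """Build one packed lookup table instead of two separate ones."""
    return bytes(pack_entry(s, f) for s, f in zip(script_ids, flags))


if __name__ == "__main__":
    # Three hypothetical code points with made-up script ids and flags.
    table = build_table([0, 5, 62], [False, True, True])
    for index, entry in enumerate(table):
        print(index, unpack_entry(entry))
```

The appeal is that one lookup answers both questions, so the second table, and the memory traffic that comes with it, simply goes away.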

Not content with just hacking the file chooser performance, Federico went on to have a fiddle with OpenOffice.org Calc performance. Although he confesses that the speed improvements need to be factored in by an established developer, it seems from his post that his knowledge of identifying performance bottlenecks in GNOME helped in identifying similar problems in OpenOffice.org.

Recently, Federico has not been the only person blogging about optimisation. Michael Meeks, insanely clever gent and long-time OpenOffice.org poker, has also been buffing the free software office suite. Michael also has the ability to both baffle and excite in his short but accurate blog entries. Gems such as "Need to work out why we loose 260ms without using any of the -Bdirect stuff; an unusually large amount of time. Interestingly 75% is about the upper limit I'd expect by analysis of the relocations, many of which are vague-linkage type info related." and "Saved 400k over all OO.o libraries by stripping the .comment sections (full of "GCC: (GNU) 4.0.2 20050901 (prerelease) (SUSE Linux)" over and over again)." demonstrate more fascinating experiments in profiling.

What I find so interesting about Waldo's, Federico's and Michael's work is that they are playing with something of a black art. Performance optimisation requires not only an expansive knowledge of how software is built and represented in memory, but also an understanding of how to optimise code and how that code is interpreted. I find it fascinating in the same way that kernel hacking is so intriguing. The difference is that my coding is entirely exposed within a software platform, and every action, effect and result is displayed and interpreted with the usual screen, keyboard and mouse. Writing code that makes motors turn, lights flash, network cards transmit, LCD panels glow and remote controls respond is incredible. This kind of profiling does just that. It takes the rather safe and transient world of application development, turns it on its head and draws conclusions.

The work that these chaps and others like them are doing is really important, and I can't stress it enough. Years back, performance was always a card we could play to move people over to an Open Source OS, but recently the system has become rather bloated and sluggish. Federico's efforts on the file picker are a great example of a devotion to really refining something to the nth degree. A lesser soul may have performed some initial optimisations and then given up. To write eight parts that chronicle the full gamut of the file picker performance issues is a real achievement. Just imagine if Federico did that to every base component in the GNOME desktop.

What do you think? Have you done any profiling work? Can you suggest a place for interesting hackers to start profiling and optimising applications?


2005-11-03 11:53:12
Heroes, not hero's...
2005-11-04 12:59:23
Heroes, not hero's...
Damn, missed that typo. Sorry. :)