OSCON: Inside LiveJournal's Backend


Robert Kaye
Jul. 29, 2004 01:28 PM


My favorite OSCON presentation on Wednesday was Brad Fitzpatrick's "Inside LiveJournal's Backend." A number of presentations at OSCON this year focus on high availability, scalability, and growing web sites, and Brad's was the most down-to-earth of them. Comparing the story of LiveJournal to that of TicketMaster, it's clear that LiveJournal has more of an open source bootstrapping feel to it. The TicketMaster presentation showed that TicketMaster started off with a huge budget to buy all the right hardware, whereas LiveJournal started by cobbling together hardware from whatever sources it could find.

LiveJournal started in 1999 on a shared server, which was promptly killed by too much traffic. The first dedicated server didn't fare much better, and from there Brad started collecting money from users to afford the hardware to host the service. Today LiveJournal is a colorful collection of 90+ machines that employ a lot of failover techniques and custom software bits that complement the off-the-shelf open source software powering the rest of LiveJournal.

Rather than going into all the details of Brad's talk here, go check out the PDF of his slides -- there is a lot of valuable information in there. If you find yourself hosting a web site that is starting to push the boundaries of your current hardware, read the slides.

The scary thing to me is that MusicBrainz is currently in step 2 of the LiveJournal history. We have two servers which are starting to strain under the load -- we're planning on adding more servers soon, but will we end up operating 90+ servers five years from now? Egads -- I hope not.

Regardless of where MusicBrainz is going in the future, I feel less dread about scaling our service than I used to. It seems that the open source community has started working on high availability and scaling tools (e.g. memcached, wackamole) that will make scaling web sites easier and cheaper. Not requiring expensive specialized hardware and relying instead on software solutions running on commodity hardware makes a lot of sense to me.
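To make the memcached idea concrete, here is a minimal sketch of the cache-aside pattern it enables: check the cache first, and only fall through to the database on a miss. A plain dict stands in for a real memcached client, and `get_user_from_db` is a hypothetical stand-in for a slow database query; this is just an illustration of the pattern, not how LiveJournal actually wires it up.

```python
# Cache-aside pattern sketch. In production the dict would be a
# memcached client shared across web servers.
cache = {}

def get_user_from_db(user_id):
    # Hypothetical expensive database query.
    return {"id": user_id, "name": "user%d" % user_id}

def get_user(user_id):
    key = "user:%d" % user_id
    if key in cache:                      # cache hit: skip the database
        return cache[key]
    user = get_user_from_db(user_id)      # cache miss: query the database
    cache[key] = user                     # fill the cache for next time
    return user

get_user(42)  # first call misses and populates the cache
get_user(42)  # second call is served from the cache
```

The point of the pattern is that the hot data lives in cheap RAM spread across commodity boxes, so most reads never reach the database at all.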

While taking in all this valuable information about creating scalable web systems, it becomes clear to me that the organization of a scalable web site heavily depends on the data used in the site and what users are doing with it. No doubt the LiveJournal and TicketMaster models would not work for MusicBrainz. I guess it's time to put on the thinking cap and start planning how MusicBrainz might be scaled up in the future.

Robert Kaye is the Mayhem & Chaos Coordinator and creator of MusicBrainz, the music metadata commons.