

Planning for Disaster Recovery on LAMP Systems

by Robert Jones

I make my living building custom databases with web interfaces for biotechnology companies. These are MySQL and Perl CGI applications, running under Linux, and every one of them is different. Disaster recovery planning for these applications has consisted of routine tape backups of all the software and data, a bunch of ReadMe files, and having me around to put the pieces back together if something breaks—and things do break... power supplies, disk drives, RAID controllers, you name it. Recovery means we fix or replace the hardware, reinstall Linux, restore the apps from tape, and then stitch everything back together. Some recoveries have been easy. Others have involved pacing back and forth and swearing for hours on end while figuring out how the heck I had all this working in the first place. Not pretty, but that's just what you do, right?

That's fine in some situations but in larger companies, especially those with a formal "Corporate IT" group, this approach just doesn't cut it. There is a clash of cultures here that many of us have to face as our startup companies reach a certain size. All of a sudden we find ourselves spending way too much time in meetings, drawing up formal specifications and policies for this, that, and the other. Disaster recovery planning is one of the first of these efforts that most of us have had to deal with. Don't get me wrong, disaster recovery is a critical issue. It's just that it can be a very painful process for those of us who come from an informal development background.

I went through this last year when the CIO at one of my clients brought in outside consultants to formulate their disaster recovery plan. Their first step was to ask what the name of my executable was and where the installation script was located... OK... bit of a problem there. I have 54 Perl CGI scripts in one application alone, nine applications, and no installation scripts for any of them. Eventually, I understood what they really wanted to know, as opposed to what they asked in the questionnaire that they gave me to fill out. I viewed the process as an opportunity, rather than a hassle, and reviewed how I had my applications set up from their perspective. I had to make some changes to the software but, more importantly, I came up with an approach that I now use with all my projects to design in disaster recovery right from the start. I know I'm not the only one dealing with this issue so here are some ideas that you might want to build into your apps.

The (Configuration) Problem

The problem with our sort of database applications is that they weave themselves into the Linux system configuration. We add definition blocks to the configuration files for Apache, MySQL, Samba, et al. We create system-wide environment variables in user shells and insert symbolic links into the filesystem. Every time we rebuild a system we have to make the configuration changes anew. The potential for error is large, even assuming we remember all the steps.

On top of that, we need to deal with software dependencies. Perl modules are a godsend, but most of the sophisticated modules make extensive use of other modules. When we use these, we inherit a hierarchy of dependencies that can make installation and recovery even more challenging and error prone. We don't want to give up the things that make our mode of development so productive, but we do need to understand and manage these dependencies. Our goals in designing for disaster recovery should be to keep things simple, to understand where dependencies exist, and to limit those where appropriate.

Separate the Application from the System

I place all my application software and data on its own partition, preferably on a separate disk. You want to be able to restore the application onto any suitable Linux system without regard for how that system is configured. Don't use /var, /opt, or /usr/local. You don't want your software anywhere near anything that the system or any other package might install. My preference is to create a partition called /proj on its own disk with each application in its own directory under there.
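As a sketch of that layout, the commands below build the per-application directory tree. A demo path stands in for /proj, and the disk device and application names are hypothetical; adapt them to your own site.

```shell
# Sketch: a dedicated application area, one directory per application.
# A demo path stands in for /proj; device and app names are hypothetical.

PROJ=${PROJ:-/tmp/proj-demo}    # in production: /proj, mounted from its own disk

# One-time steps on the real system (shown as comments, run as root):
#   mkfs.ext3 /dev/sdb1
#   echo '/dev/sdb1  /proj  ext3  defaults  1 2' >> /etc/fstab
#   mount /proj

for app in genedb sampletracker; do     # hypothetical application names
    mkdir -p "$PROJ/$app"
done

ls "$PROJ"
```

Because nothing under this tree belongs to the operating system, a stock Linux reinstall never touches it.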

With this layout I can take any suitable hardware and perform a standard install of whatever version of Linux is current. This step is really simple: anyone can do it, and it lets us verify general system operation before any of my applications come into the picture. That is exactly what the disaster recovery people want to hear.

Only then do I create my partition and restore my software and data from tape. It is totally separate from any of the system software. This way the request for whoever looks after your backups is simply "Give me the latest copy of /proj", as opposed to "Give me this from /usr/lib, this from /usr/local/bin, and this from /etc," and so on. Again, the disaster recovery people really like that. Complexity means room for error; simplicity means success.
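To make the single-request idea concrete, here is a hedged sketch using tar over a demo directory standing in for /proj. The paths are illustrative; at a real site the archive would come off tape via your backup system.

```shell
# Sketch: the entire recovery request reduces to one archive of /proj.
# Demo paths stand in for the real partition and tape device.

SRC=${SRC:-/tmp/proj-demo}              # stand-in for /proj
ARCHIVE=${ARCHIVE:-/tmp/proj-backup.tar.gz}

mkdir -p "$SRC/genedb"                  # hypothetical application directory
echo "data" > "$SRC/genedb/readme.txt"

# Back up: one archive covers every application under the partition
tar czf "$ARCHIVE" -C "$(dirname "$SRC")" "$(basename "$SRC")"

# Recovery on a freshly installed system is the mirror image:
RESTORE=${RESTORE:-/tmp/proj-restored}
mkdir -p "$RESTORE"
tar xzf "$ARCHIVE" -C "$RESTORE"
```

One archive in, one archive out; that is the whole story the recovery plan has to tell.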

Use Perl Modules Wisely

The CPAN collection of Perl modules has incredible value, saving us from reinventing many wheels. But with every one that we use, we add to the dependencies we need to manage. The more dependencies, the less robust the application. If a given module is not part of the standard distribution, then ask yourself if you really need it. If the only reason to include a particular module is, for example, to use a simple date conversion function, then think about writing your own version. I know that goes against the grain, but in some cases it can have a big impact on the complexity of your installation.
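The trade-off is easiest to see with a toy example. This is purely illustrative, and sketched in shell rather than Perl: when all you need is one simple conversion, a handful of lines of your own code can stand in for an entire date-handling module and its dependency tree.

```shell
# Sketch: a hand-rolled date conversion (ISO "2004-03-15" -> "15 Mar 2004"),
# as a stand-in for pulling in a full date-handling module.

month_name () {
    case "$1" in
        01) echo Jan ;; 02) echo Feb ;; 03) echo Mar ;; 04) echo Apr ;;
        05) echo May ;; 06) echo Jun ;; 07) echo Jul ;; 08) echo Aug ;;
        09) echo Sep ;; 10) echo Oct ;; 11) echo Nov ;; 12) echo Dec ;;
    esac
}

iso_to_readable () {
    y=${1%%-*}; rest=${1#*-}       # split YYYY-MM-DD on the hyphens
    m=${rest%%-*}; d=${rest#*-}
    echo "$d $(month_name "$m") $y"
}

iso_to_readable 2004-03-15    # -> 15 Mar 2004
```

Twenty lines you fully understand can be easier to restore than a module tree you don't.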

Archive Local Copies of Helper Applications

If you use third-party software within your application, make sure to archive copies of the distribution kits for each package. For instance, I use ImageMagick for image manipulation and gnuplot to generate scatter plots in a gene expression application. In a recovery situation I don't want to hunt around the Net looking for the right tar files when I should be getting the database back up. Archive tar or RPM files for each package in your application directory, list them in the appropriate ReadMe file, and describe what they do and where they are used. You can then include that list directly in the recovery plan.
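A hedged sketch of that archive, with a checksum manifest that can go straight into the recovery plan. The directory, file names, and version numbers are hypothetical stand-ins for your real distribution kits.

```shell
# Sketch: keep the distribution kits next to the application, with a
# checksum manifest. Names and versions here are hypothetical stand-ins.

KITS=${KITS:-/tmp/proj-demo/kits}
mkdir -p "$KITS"

# Stand-ins for archived kits (the real ones are downloaded once and kept):
touch "$KITS/ImageMagick-x.y.z.tar.gz" "$KITS/gnuplot-x.y.tar.gz"

# A manifest lets you verify a restored archive before trusting it:
( cd "$KITS" && md5sum *.tar.gz > MANIFEST.md5 )

cat "$KITS/MANIFEST.md5"
```

After a restore, `md5sum -c MANIFEST.md5` confirms the kits came back intact before you build anything from them.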

Archive Local Copies of Perl Modules

The same advice goes for Perl modules, but here things can get a bit messy. The problem is that many of the really useful modules require other modules in order to function. So you need to archive the whole tree of dependent modules. Figuring out which modules you need, and whether they are part of the standard Perl distribution, is not a simple task. The Module::CoreList module can be useful in this regard.
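As a rough first pass at discovering those dependencies, a grep over your scripts will list the modules they pull in directly; Module::CoreList can then tell you which of them ship with core Perl. This sketch only catches literal "use Module" lines, and the script it scans is a hypothetical stand-in for your real CGI code.

```shell
# Sketch: list the Perl modules a set of CGI scripts uses directly.
# Catches only literal "use Module" lines, but that covers most scripts.

APPDIR=${APPDIR:-/tmp/proj-demo/genedb}
mkdir -p "$APPDIR"

# A hypothetical script standing in for your real CGI code:
cat > "$APPDIR/report.cgi" <<'EOF'
#!/usr/bin/perl
use strict;
use CGI;
use DBI;
EOF

# Extract and de-duplicate the module names:
grep -h '^use ' "$APPDIR"/*.cgi |
    sed 's/^use *//; s/[ ;].*//' |
    sort -u
```

It won't follow the indirect dependencies each module drags in, but it gives you the top of the tree to start from.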

Most of us deal with this by using the CPAN module to download and install our modules. When it works, it is the best thing since sliced bread, but having it fail halfway through an install is not uncommon. Debugging the problem requires you to sift through the reams of output it generates, and even then the fix is often not apparent. The most common advice found on the web is to run the program again and see if it works the second time around. Sorry, but that just isn't a good answer. In that case you are stuck with fetching the distribution kits for each module and building them by hand. See the .cpan/build/ directory, where the CPAN module stores and builds the modules it downloads.

Ideally, I would walk through the installation of everything I need for a project, fixing problems as I go. Then I would flush all of that out of the system and reinstall everything from scratch using the knowledge gained the first time around. In reality, I don't have time for that. So the best advice I can give is to record every step you take and then edit that log to produce the preferred set of steps that you will follow next time around. Be warned that this does not sit well with disaster recovery professionals. Explaining this in the plan will test your creative writing skills.
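One low-tech way to capture that record is to wrap each manual step in a tiny logging function; the log then becomes the raw material for the edited recipe. The log path and the sample step below are hypothetical.

```shell
# Sketch: record each manual step as you take it, so the log can later be
# edited into a repeatable recipe. Paths and the sample step are hypothetical.

LOG=${LOG:-/tmp/install-log.txt}

log_step () {
    echo "$(date '+%Y-%m-%d %H:%M') $*" >> "$LOG"   # record the command...
    "$@"                                            # ...then actually run it
}

log_step mkdir -p /tmp/proj-demo/build/Date-Calc    # a stand-in install step
tail -1 "$LOG"
```

Editing that log down to the steps that actually worked gives you the "preferred set of steps" the plan asks for, even if the path there was messy.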

Edit the System Configuration Files

Note: My company is called Craic Computing and so you will see the word craic dotted throughout the following examples. It simply serves to distinguish my modifications from any system code.

The next step is to modify the system configuration files to suit your application. One major target is likely the Apache httpd.conf file. This is where you set up virtual hosts, link directories to web trees, take care of URL rewriting, and set any special options. The default file is already a beast, with around 1,500 lines, so we would prefer to keep all our application-specific definitions in one place, preferably at the very end of the file.

The Include directive is the answer to our concerns. We can place all the application-specific definitions into a separate file and have that text included verbatim through the use of this reference.

Include /proj/linux_config/httpd/craic.conf

Better still, I can create a separate configuration file for each application, place all of these in the same directory, and refer to that directory with a single Include directive. (In this example, craic_config is that directory.)

Include /proj/linux_config/httpd/craic_config
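Each file in that directory then carries just one application's definitions. A minimal sketch of such a file follows; the host name, paths, and application name are all hypothetical.

```apache
# /proj/linux_config/httpd/craic_config/genedb.conf -- hypothetical application
<VirtualHost *:80>
    ServerName genedb.example.com
    DocumentRoot /proj/genedb/htdocs
    ScriptAlias /cgi/ /proj/genedb/cgi-bin/
</VirtualHost>
```

Restoring an application's web configuration then means restoring one file, not re-editing httpd.conf by hand.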

In a similar fashion, we can define system-wide environment variables in application-specific shell script files and have /etc/profile source each of them with this block of code:


# CRAIC_DIR points at the directory holding the per-application .sh files
for j in $CRAIC_DIR/*.sh; do
        if [ -r $j ]; then
                . $j
        fi
done
unset j
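Each application then contributes its own file to that directory. A hypothetical example, with the directory, application, and variable names all being assumptions for illustration:

```shell
# Sketch: a per-application environment file, sourced at login by the
# /etc/profile loop. Directory and variable names are hypothetical.

CRAIC_DIR=${CRAIC_DIR:-/tmp/proj-demo/env}
mkdir -p "$CRAIC_DIR"

cat > "$CRAIC_DIR/genedb.sh" <<'EOF'
export GENEDB_HOME=/proj/genedb
export PATH="$GENEDB_HOME/bin:$PATH"
EOF

# The profile loop picks the file up at login; here we source it directly:
. "$CRAIC_DIR/genedb.sh"
echo "$GENEDB_HOME"
```

Removing an application's environment is then just deleting its one file, with no edits to /etc/profile itself.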

We can even use the same mechanism to set up Samba shares by including an external file in the main smb.conf file:

include = /proj/linux_config/samba/smb.conf

By limiting the changes made to system files to these simple statements we maintain our separation of system and application as much as possible.
