

Pre-Patched Kickstart Installs

by Q Ethan McCallum

Editor's note: Ethan has collected this series and other information into Managing RPM-Based Systems with Kickstart and Yum.

My two previous articles explained how to use Kickstart to automate OS installs and upgrades. This article demonstrates some techniques for the third piece of the system maintenance cycle: keeping your machines up to date. That includes how to:

  • Create your own yum repository, such that your machines don't have to update from the public servers. This saves you bandwidth, shortens update times, and gives you more control over what updates you install.
  • Have Kickstart install the newest RPMs from the start (and I don't mean use a postinstall script). Why do the "install, then update" tango when you don't have to?
  • Add a layer of change control to these processes so that you can safely let yum run unattended.

What's interesting is that the last two techniques are half technology, half architecture: a naming convention and a few symbolic links go a long way. Some custom code doesn't hurt, either.

I tested the steps outlined in this article under Fedora Core versions 2 and 3, but they should also work under Red Hat 9 and FC1. This article assumes you have a modest familiarity with RPM, Kickstart, and yum. Refer to the Resources section for links to documentation and other articles on these topics.

Setting Up Your Own yum Repository

Red Hat 9 and the Fedora Core series include yum (Yellow Dog Updater, Modified) to simplify system updates. Point yum to a collection of RPMs (a repository, or repo for short) and it will find the latest packages to install on your system. Fedora's default yum install includes public repo definitions, so you can keep your system up to date by running yum's cron jobs.

Running your own internal repo saves bandwidth if you have multiple machines, because only one machine fetches new RPMs from the outside world. This makes internal updates faster, because few Internet hookups can match LAN speeds. You can also fold your own RPMs into the repo and manage all software updates from the same centralized resource.

Most of all, pointing machines to a private repo gives you control: you can limit what yum sees and, in turn, what it installs. yum downloads a repo's newest RPMs, which aren't always the best for you. For example, a new version of a shared library may require you to recompile some homegrown code. That makes for an ugly surprise.

A yum repo is just a collection of RPMs and some metadata extracted from them. yum clients use the metadata to determine what RPMs are in the repo. Setting up a repo, then, requires:

  • A means to download the RPM updates from a public repo. I prefer scheduled wget jobs, but there are certainly other ways.
  • Enough disk space to store the RPMs. Plan to store a couple of gigabytes of data for each OS release.
  • A way to serve the RPMs to clients. This article uses a web server, but there are other ways (such as NFS or FTP). The examples assume the repo and Kickstart install media are hosted on the same machine.
  • Tools to extract the RPM metadata. While the web server host doesn't have to run Linux, running Linux there may simplify the job for FC2 and earlier. I'll explain why shortly.

To set up your wget job, first select a download site from the Fedora team's list of update mirrors.

Next, wrap your wget command in a shell script, Perl tool, or whatever else suits your fancy. I use the following wget switches:

  • --progress=dot:mega: each dot represents 64K, instead of the default 1K. This lets you track download progress without flooding your screen or log file on large files.
  • --accept=rpm: download only files with a *.rpm extension. There's no need to grab miscellaneous HTML or text files.
  • --recursive: even though you're fetching only one level of files, this causes wget to descend into the specified directory to fetch the RPMs.
  • --no-parent: don't follow links that point above this directory, or you risk downloading the entire site.
  • --relative: follow only relative links. Absolute links usually point to off-site resources, or at least resources that aren't under the current directory.
  • --no-directories: don't create a directory structure that matches the remote site. You just want the files.
  • --exclude-directories='*/SRPMS/*': sometimes the source RPMs are in a directory beneath the binary RPMs. You probably don't want them for your private yum repo.
  • --no-clobber: don't overwrite existing files. You don't want to download the entire set of RPMs each time, just the updates.
  • --wait {n}: pause {n} seconds between downloads, so your job doesn't hammer the remote server. (This is more useful for HTTP download mirrors than for FTP mirrors.)
  • --directory-prefix={dir}: where to put the downloaded files.

You can call your wget script manually or via cron. Please show courtesy to your download site's maintainer and schedule jobs for off-hours, and set the --wait flag to 60 or 120 seconds (1 or 2 minutes) or more between downloads. If the job runs overnight, the extra download time won't make a difference.
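As a rough sketch, the switches above could be wrapped in a small shell script like the one below. The mirror URL, destination directory, and the DRY_RUN switch are my own placeholders for illustration, not part of wget or of any particular mirror; substitute a real mirror from the Fedora list before turning off the dry run.

```shell
#!/bin/sh
# Sketch of a mirror-fetch wrapper. MIRROR_URL and DEST_DIR are
# hypothetical placeholders; DRY_RUN is an illustrative safety switch.
MIRROR_URL="${MIRROR_URL:-http://mirror.example.com/fedora/updates/2/i386/}"
DEST_DIR="${DEST_DIR:-/var/www/html/FC2-i386/updates/Fedora/RPMS}"
WAIT_SECS="${WAIT_SECS:-60}"
DRY_RUN="${DRY_RUN:-1}"

fetch_updates() {
    # Build the wget command line from the switches described above.
    set -- wget \
        --progress=dot:mega \
        --accept=rpm \
        --recursive \
        --no-parent \
        --relative \
        --no-directories \
        --exclude-directories='*/SRPMS/*' \
        --no-clobber \
        --wait="$WAIT_SECS" \
        --directory-prefix="$DEST_DIR" \
        "$MIRROR_URL"
    if [ "$DRY_RUN" = "1" ]; then
        # Dry run: print the command instead of contacting the mirror.
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

fetch_updates
```

Saved under a name of your choosing (say, /usr/local/sbin/fetch-updates.sh, a hypothetical path), the script could run from a crontab entry such as `30 2 * * * DRY_RUN=0 /usr/local/sbin/fetch-updates.sh` to fetch updates at 2:30 each morning.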

Setting up the web server is even easier: point the document root to the directory where you downloaded the RPM updates. For flexibility and growth, you may want to standardize on a directory structure, such as that shown in Figure 1.

Figure 1. Sample directory structure for a yum repo, hosted on a Kickstart server

This directory structure accommodates several OS releases and architectures. In this example, the updated RPMs for Fedora Core 2, i386 architecture go under the web server's document root in FC2-i386/updates/Fedora/RPMS.
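If you want to create that layout by hand, a minimal sketch follows. The make_repo_tree function and the /tmp/repo-demo default are hypothetical names I chose for illustration, and I am assuming the FC3 tree mirrors the FC2 one down to Fedora/RPMS; point DOCROOT at your real document root once you are happy with the result.

```shell
#!/bin/sh
# Sketch: create one updates tree per OS release and architecture.
# DOCROOT defaults to a scratch directory (an assumption for safe testing).
DOCROOT="${DOCROOT:-/tmp/repo-demo}"

make_repo_tree() {
    # $1 = document root, $2 = Fedora Core release number, $3 = architecture
    mkdir -p "$1/FC$2-$3/updates/Fedora/RPMS"
}

# The two release/architecture pairs used in this article's examples.
make_repo_tree "$DOCROOT" 2 i386
make_repo_tree "$DOCROOT" 3 ia64

# List the RPM directories that now exist.
find "$DOCROOT" -type d -name RPMS | sort
```

Because the release and architecture are baked into the top-level directory name, adding another platform later is a single extra call.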

Run yum-arch to extract RPM metadata if the repo is for FC2 or older. Using the directory structure from above:

$ yum-arch {document root}/FC2-i386/updates

This command scans the RPMs in the tree and dumps header information into the FC2-i386/updates/headers directory. There is one .hdr file for each RPM in the tree.

FC3 stores its header info in a different format, generated by the createrepo command:

$ createrepo {document root}/FC3-ia64/updates

createrepo stores the RPM metadata in a set of XML files. In the above example, these files exist under the web server document root in FC3-ia64/updates/repodata.

yum-arch still exists under FC3, so you can create the older header format for FC2 clients. It may be possible to run createrepo under older Red Hat releases in order to serve FC3 clients. Because both tools are written in Python, they might work under other operating systems. Admittedly, I haven't tried this.
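One way to keep both metadata formats fresh is to let a script pick the tool from the directory name. The sketch below assumes the FC{release}-{arch}/updates naming used in this article; the function names, the DOCROOT default, and the DRY_RUN switch are hypothetical, and by default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: choose yum-arch or createrepo per repo, based on the release
# number embedded in the directory name (e.g. FC2-i386, FC3-ia64).

metadata_tool() {
    # $1 = Fedora Core release number
    if [ "$1" -le 2 ]; then
        echo yum-arch      # FC2 and older: .hdr files under headers/
    else
        echo createrepo    # FC3: XML files under repodata/
    fi
}

refresh_metadata() {
    # $1 = web server document root
    for dir in "$1"/FC*-*/updates; do
        [ -d "$dir" ] || continue
        name=$(basename "$(dirname "$dir")")   # e.g. FC2-i386
        release=${name#FC}                     # strip the leading "FC"
        release=${release%%-*}                 # strip "-arch"
        tool=$(metadata_tool "$release")
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "$tool $dir"                  # dry run: show the command
        else
            "$tool" "$dir"
        fi
    done
}

refresh_metadata "${DOCROOT:-/var/www/html}"
```

Running this after each download job, with DRY_RUN=0, would keep the metadata in step with the RPMs on disk.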

Configuring yum Clients to Use the New Repo

Configuring a client is as simple as editing a few text files.

For FC2 and earlier, the repo definitions live in /etc/yum.conf. You don't want the client machines downloading from the public repos anymore, so comment out those preexisting definitions with # characters.

Next, define an entry for your shiny new local repo:

[internal-updates]
name = internal update server
baseurl = http://{update-server}/FC$releasever-$basearch/updates

This repo definition breaks down as follows:

  • [internal-updates] marks the beginning of a new repo definition. This name should be unique within the file.
  • name is a descriptive name for the repo.
  • baseurl points to the update web server. The path portion of the URL is the directory containing the headers directory created by yum-arch.
  • yum expands $releasever and $basearch into the current host's OS revision and hardware architecture, respectively. A mix of these variables and a predictable repo directory layout allows you to maintain a single repo def for your entire shop.
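To see how one definition covers several platforms, the following sketch imitates the substitution in plain shell. It is only an illustration of the idea, not how yum itself is implemented; expand_baseurl and the updates.example.com host name are hypothetical.

```shell
#!/bin/sh
# Sketch: mimic yum's $releasever/$basearch substitution to preview
# which URL a given client would request.

expand_baseurl() {
    # $1 = baseurl template, $2 = release version, $3 = base architecture
    printf '%s\n' "$1" |
        sed -e 's/\$releasever/'"$2"'/g' -e 's/\$basearch/'"$3"'/g'
}

# Single quotes keep the shell from expanding the yum variables itself.
template='http://updates.example.com/FC$releasever-$basearch/updates'

expand_baseurl "$template" 2 i386
expand_baseurl "$template" 3 ia64
```

The two calls print http://updates.example.com/FC2-i386/updates and http://updates.example.com/FC3-ia64/updates, which match the repo directories described earlier.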

FC3 separates repo definitions from the main yum.conf. To disable the existing repos, add:

enabled=0

to each repo stanza in the .repo files in /etc/yum.repos.d. Create your own internal-updates.repo file that contains just a stanza, similar to the example FC2 entry. Next, test the repo configuration in a nondestructive manner:

# yum check-update

This will contact the repo web server, fetch RPM header info (either from headers or repodata, depending on the target machine's OS version), and list the RPMs for which updates are available. If you're satisfied with those results, tell yum to update the machine based on the repo's contents:

# yum update
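For FC3 clients, the internal-updates.repo file mentioned above could be generated with a few lines of shell, which is handy if you push the same file to many machines. This is a sketch under my own assumptions: write_internal_repo, the scratch output directory, and the updates.example.com host name are hypothetical, and the stanza simply reuses the FC2 example entry.

```shell
#!/bin/sh
# Sketch: write an internal-updates.repo file into a scratch directory
# so you can inspect it before copying it to /etc/yum.repos.d.

write_internal_repo() {
    # $1 = directory to write into, $2 = update server host name
    mkdir -p "$1"
    # \$ keeps $releasever and $basearch literal so that yum, not the
    # shell, expands them on each client.
    cat > "$1/internal-updates.repo" <<EOF
[internal-updates]
name = internal update server
baseurl = http://$2/FC\$releasever-\$basearch/updates
EOF
}

REPO_DIR="${REPO_DIR:-./yum.repos.d}"
UPDATE_SERVER="${UPDATE_SERVER:-updates.example.com}"

write_internal_repo "$REPO_DIR" "$UPDATE_SERVER"
cat "$REPO_DIR/internal-updates.repo"
```

Once the file looks right, copy it into /etc/yum.repos.d on the client and rerun yum check-update to confirm that the client sees your repo.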

You certainly don't have to call yum by hand on all of your machines every time you want to update them. Enable the yum daemon to take advantage of automatic (cron'd) updates:

    (set the daemon to start on every system boot)
# chkconfig --add yum
# chkconfig yum on

    (start the daemon now, so you don't have to reboot)
# service yum start

There's a trade-off between the risk of unattended, automated updates and the cost of manual labor. Manual updates tend to win out in more formal shops. Later in the article, I'll demonstrate a method that provides a layer of change control while allowing machines to update themselves.
