ONLamp.com

Introducing openMosix

by Kris Buytaert
02/19/2004

Most of the time, your computer is bored. If you have two or more computers, chances are that at any given time, at least one of them is doing nothing. One goal of clustering is to spread resource-hungry loads among all available computers, using the resources that are free on other machines.

Cluster Types

There are three basic types of clusters. The most widely deployed are the fail-over cluster and the load-balancing cluster; high-performance computing clusters are currently the least used type in the commercial market.

Fail-over clusters consist of two or more network-connected computers with a separate heartbeat connection between the hosts. The heartbeat connection lets each machine monitor whether the services on the other are still running. As soon as a service on one machine breaks down, another machine tries to take over.
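The heartbeat idea can be sketched in a few lines of Python. This is only an illustration of the principle, not how any particular fail-over product works: the class name, the node names, and the timeout value are all assumptions made for the example.

```python
import time


class HeartbeatMonitor:
    """Tracks when the peer node was last heard from (hypothetical helper)."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock
        # Treat the peer as alive at start-up.
        self.last_seen = clock()

    def record_heartbeat(self):
        # Call this whenever a heartbeat arrives over the dedicated link.
        self.last_seen = self.clock()

    def peer_is_alive(self):
        # The peer counts as dead once no heartbeat arrived within the timeout.
        return (self.clock() - self.last_seen) <= self.timeout_seconds


def choose_active_node(monitor, primary="node1", standby="node2"):
    """Return the node that should be serving requests right now."""
    return primary if monitor.peer_is_alive() else standby
```

In a real setup the standby would call `record_heartbeat()` for every message received from the primary and take over the services as soon as `choose_active_node()` stops returning the primary.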

In a load-balancing cluster, when a request comes in (for a web page, for example), the cluster checks which machine is the least busy and then sends the request to that machine. Most of the time, a load-balancing cluster is also a fail-over cluster but with extra load-balancing functionality and, often, more nodes.
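As a rough sketch of the "least busy machine" rule, assume the balancer keeps a count of active requests per node. The function names and the dictionary layout are assumptions made for illustration; real load balancers use a variety of metrics and policies.

```python
def pick_least_busy(active_requests):
    """Return the name of the node with the fewest active requests."""
    if not active_requests:
        raise ValueError("no nodes available")
    return min(active_requests, key=active_requests.get)


def dispatch(active_requests):
    """Send one incoming request to the least busy node and record it."""
    node = pick_least_busy(active_requests)
    active_requests[node] += 1
    return node
```

For example, with `{"web1": 3, "web2": 1, "web3": 2}` the next request goes to `web2`, whose count then rises to 2.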

The last clustering variation is the high-performance computing cluster. This design exists to give data centers the extreme performance they need. For example, the Beowulf cluster was developed especially for research facilities. These kinds of clusters also have some load-balancing features, so they try to spread different processes to other machines in order to improve performance. The underlying work-distributing mechanism executes different routines of a program on different systems. Special programming libraries synchronize the systems and collect the results.

Enter openMosix

The openMosix software package turns networked computers running GNU/Linux into a cluster. It automatically balances the load between the nodes of the cluster. Nodes can join or leave the running cluster without disruption. The cluster spreads the workload between nodes according to their connection and CPU speeds.

openMosix is a Linux-kernel patch that provides full compatibility with standard Linux for IA32-compatible platforms. The internal load-balancing algorithm transparently migrates processes to other cluster members, aiming for better load-sharing between the nodes. The cluster itself tries to optimize utilization at all times. Besides relying on this automagic resource-use optimization, a system administrator can influence the load through manual configuration at runtime, specifying where applications have to run and directing openMosix to send loads to certain nodes.

Test-driving openMosix is easy: just download a copy of ClusterKnoppix. There's no need to (re)install anything; just boot it on any of your machines. You'll be up and running in a couple of minutes. This is the ideal way to test-run a couple of applications and convince yourself that openMosix is the solution for your clustering needs.

A Little Cluster History

The openMosix project, born in early 2002, is not the first occurrence of this technology. Looking back in time, Mosix technology started in the early 1980s on a PDP-11. Later it appeared in BSD/OS environments. The first Linux implementations came about around 1997. Amnon Barak and Moshe Bar were two of the most active people in the old Mosix project. After a difference of opinion on the commercial future of Mosix, Moshe Bar started a new clustering company, Qlusters, Inc. Amnon Barak decided not to participate in this venture for the moment (although he did seriously consider joining) and held long-running negotiations with investors. It appears that Mosix is no longer supported openly as a GPL project. Because of the significant user base (about 1,000 installations worldwide), Moshe Bar decided to continue the development and support of the Mosix project under a new name, openMosix, under the full GPL2 license.

openMosix Features

One of the differences between openMosix and other clustering environments, such as Beowulf-style clusters, is that for an application to run on an openMosix cluster there is no need for recompilation or integration of other libraries. Programs such as Flac, Bladeenc, Povray, and mjpeg tools work without any modifications, as does MPI.

Indeed, MPI applications benefit from running in an openMosix environment. Although a process starts on node 1, the cluster determines whether it would be better to run a certain process on another, less loaded node. openMosix uses an advanced algorithm based on market economics to determine which node best suits the application. This way, even already parallelized applications will gain from openMosix. However, there are limitations. For example, applications that use pthreads won't migrate, but that's a Linux limitation, not an openMosix limitation.

openMosix also features autodiscovery, a tool that detects other openMosix nodes on your network and modifies the openMosix configuration on these nodes to reflect the changes in your cluster. This makes the initial configuration of an openMosix cluster very easy.

openMosix has a rather exhaustive set of configurable parameters. A cluster administrator can, for instance, define the maximum load a node should tolerate before processes are migrated away, or manually migrate processes to other nodes. The administrator can also influence the attraction of a node by modifying its virtual speed, and enable many other features.
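To make the interplay of a load threshold and a virtual speed concrete, here is a deliberately simplified Python sketch. It is not the openMosix algorithm, which lives in the kernel and is far more sophisticated; the `Node` class, the `effective_load` formula (load divided by speed), and the `max_load` parameter are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    load: float   # runnable processes on the node (illustrative unit)
    speed: float  # relative "virtual speed"; a higher value attracts more work

    @property
    def effective_load(self):
        # Assumption: a faster node feels the same load less.
        return self.load / self.speed


def migration_target(local, others, max_load):
    """Return the node a process should move to, or None to stay put."""
    if local.load <= max_load:
        return None  # below the configured threshold: nothing migrates
    candidates = [n for n in others if n.effective_load < local.effective_load]
    if not candidates:
        return None  # no other node is better off than the local one
    return min(candidates, key=lambda n: n.effective_load)
```

In this toy model, raising a node's `speed` lowers its effective load and so makes it a more attractive target, which mirrors the effect the administrator aims for when adjusting the virtual speed.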

openMosix is probably one of the easiest and fastest ways to set up your own high-performance cluster, with no hassle in installing extra libraries or fiddling with configuration problems. Add to that an excellent cluster-management interface such as openMosixview and you are on the road to your own supercomputer.

Running Cluster Jobs as a User

Let's look at a real-life example. Consider statistics. People use statistics to predict the winning lottery ticket, to predict stock prices, and to calculate the chance that an 18-year-old will crash his brand-new car within the first three months he has his license. Some companies do zillions of calculations per year. Some of these calculations take weeks or months to finish, occupying CPUs that other people cannot use in the meantime. They include simulations of space flights, analysis of sequence databases, or just calculations of leasing or rental tariffs.

A simulation user logs onto the cluster, starts the application, and leaves. The user does not have to find a suitable node to run his application; openMosix looks for the best node to run the application and migrates it there. Two minutes later, when another user starts another application, the same process happens. As long as there are nodes available in the cluster, openMosix will migrate applications to them, giving your whole cluster some work.
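From the user's side, such a workload is nothing more than a set of ordinary processes. The following Python sketch starts several independent CPU-bound workers as separate processes (not threads, in line with the pthreads limitation mentioned earlier). Nothing in it is openMosix-specific: the worker code and the `run_batch` helper are made up for this example, and whether any given process actually migrates is up to the cluster.

```python
import subprocess
import sys

# Stand-in for a real simulation: sum the squares below n.
WORKER_CODE = "import sys; n = int(sys.argv[1]); print(sum(i * i for i in range(n)))"


def run_batch(sizes):
    """Start one independent process per job size and collect the results."""
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", WORKER_CODE, str(n)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for n in sizes
    ]
    # All workers run concurrently; wait for each and parse its output.
    return [int(p.communicate()[0]) for p in procs]
```

Calling `run_batch([10, 100])` starts two workers at once and returns their results in the order the jobs were given.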

For users, the cluster resembles one giant SMP machine. In yesterday's environment, each job had to wait till the other was finished. Now, applications can run together. Nodes can join and leave the cluster, so you might want to set up an environment where your employees do their normal work on their workstations during the day, and at night their workstations join the openMosix cluster and start crunching numbers.

openMosix is particularly strong in environments where people need to run multiple instances of a program or different programs. Traditional Beowulf-style computing is more suitable where one long-running job has been parallelized. However, openMosix and Beowulf mix perfectly. As research has shown, openMosix often improves the performance of an MPI- or PVM-based environment.

Conclusion

In the last couple of months, openMosix has garnered a lot of attention in the clustering area. With the initial releases of the MigSHM patch, checkpointing support, the General openMosix Daemon, and the birth of ClusterKnoppix, openMosix has entered a new era. openMosix is bringing clustering to the masses.

Even though the openMosix project isn't that old, it has grown to be one of the most active open source projects around. It's the second-most active project in the clustering foundry at SourceForge. If you need an HPC solution for your calculation problem, there's a good chance openMosix is the right fit. Its openness is one extra reason to choose it. Many people have migrated from their old Mosix environment to the newer openMosix environment.

As we say within the openMosix project, Live free() or die().

Editor's note: Kris will be at the European openMosix Summit in Brussels during FOSDEM 2004.

Kris Buytaert is a Linux and open source consultant operating in the Benelux. He currently maintains the openMosix HOWTO.

