ONLamp.com

Ten Myths about Open Source Software

by Tim O'Reilly
11/01/1999

Editor's Note: The following is the transcript of a talk given by Tim O'Reilly to a group of Fortune 500 executives in October.

I'm very pleased to be here. I'd like to learn from you as well as give you my ideas, and so I'm hopeful that you'll feel free to interrupt me with questions or comments so we can get quickly to issues of common interest. I've tried to break my talk into a kind of "top ten list format" so that there will be easy break points for conversation along the way, as well as at the end.

My main topic of discussion is going to be open source software. After the great success of the Red Hat IPO, Linux has to be on everyone's mind: is this another thing like the Internet that has come out of left field and is about to change the rules for everyone? I believe it is. (In fact, I believe that it's a natural extension of the Internet story.) In that vein, after I talk about Linux and Open Source, I'd also like to talk about some ways that the web is changing the whole computing paradigm.

Let me start by addressing some myths about open source software.

Myth #1. It's all about Linux versus Windows, with Red Hat as yet another challenger to Microsoft.

In the course of my comments today, I hope to convince you that the Linux vs. Windows face-off is the wrong way to think about open source. I'm going to explain how open source has been instrumental in the rise of the Internet, how it's behind many of the most exciting and innovative technology companies out there, and why the open source methodology is something you can apply in your businesses whether or not your company ever uses Linux.

They say that if all you have is a hammer, everything looks like a nail. One of the biggest problems everyone has in coming to grips with the open source phenomenon is that they insist on interpreting it in terms of the previous generation of computing, and so miss its real, pervasive effect. In my remarks today, I hope to outline some parts of that effect.

Myth #2. Open Source Software Isn't Reliable or Supported.

If open source software isn't reliable enough to use, then the Internet isn't reliable enough, because the Internet infrastructure relies heavily on Open Source software. Every single internet address--both web and email--depends on the Domain Name System, or DNS. At the heart of the DNS is an open-source program called BIND, originally developed at UC Berkeley, but maintained ever since by its principal developer as an independent free software project under the auspices of a group called the Internet Software Consortium. Given the importance of the Internet today, BIND is arguably one of the most mission-critical programs in the world.

And that's not all. Virtually any email message sent over the net relies on sendmail, the mail transport server that serves approximately 75% of all internet sites (including many at large companies that don't even know they are using it). Because email messages are always handled by at least two mail servers in going from one site to another, chances are very good that almost every message relies on sendmail. Like BIND, sendmail was originally developed at UC Berkeley, and it has been maintained over the years by a free software developer community headed by Eric Allman, the original developer.

It's also well known that the open source Apache web server hosts more than 60% of the world's web sites, including many of the most heavily trafficked, such as Yahoo!, which runs on a network of more than 2000 FreeBSD-based machines running a modified version of the Apache web server.

What's more, in an attempt to increase its scalability and performance, Yahoo! has recently outsourced parts of its network to caching service provider Akamai, which, not incidentally, is one of VA Linux Systems' largest customers. (While your briefing paper focused on the lack of 32-way SMP in Linux, the web provides some alternate approaches to scalability, distributing services to bring them closer to the customer. The lack of scalability is also an argument that was used against the PC. While it is true that the PC never replaced the mainframe, it did something more important: it brought new kinds of applications to the user. And I believe that's also the key role of open source software.)

(An interesting side note is the way that the Internet has redefined the concept of reliability, not as a lack of failure, but an indifference to it: multiple paths to reach a destination, the ability to retry failed transactions, and so forth.)
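As a minimal illustration of "indifference to failure", the following Python sketch retries a failed delivery and then falls back to an alternate path. It is not taken from the talk: the names `fetch_with_failover` and `make_flaky`, and the simulated routes, are hypothetical and exist only to show the idea.

```python
from typing import Callable, Sequence


def fetch_with_failover(routes: Sequence[Callable[[], str]],
                        attempts_per_route: int = 2) -> str:
    """Try each route in order, retrying a failed route before moving on."""
    last_error = None
    for route in routes:
        for _ in range(attempts_per_route):
            try:
                return route()
            except ConnectionError as err:
                # A single failure is expected and simply absorbed.
                last_error = err
    raise ConnectionError("all routes failed") from last_error


def make_flaky(name: str, failures_before_success: int) -> Callable[[], str]:
    """Build a simulated route that fails a fixed number of times (hypothetical)."""
    state = {"remaining": failures_before_success}

    def route() -> str:
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise ConnectionError(f"{name} unavailable")
        return f"delivered via {name}"

    return route


if __name__ == "__main__":
    routes = [make_flaky("primary", 5), make_flaky("backup", 1)]
    print(fetch_with_failover(routes))
```

The caller never learns that three attempts failed along the way; it only sees that the request eventually succeeded by another path.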

But that's not all! (I feel like the salesman in a Ginsu knife commercial!) The TCP/IP protocol stacks used in most commercial internet software are based on the code originally written as part of the Berkeley networking package. What's more, the Internet Engineering Task Force, or IETF, the body that creates and governs the Internet standards, operates by a process that has a great deal in common with open source. Anyone can join its mailing lists and meetings and shape the direction of its standards, and it is technical contribution, not political maneuvering by large companies, that ultimately guides what is in and what is out.

Nor is most of this Internet software fly-by-night. Many of these programs have continued to be developed, refined, and supported over many years--years in which supposedly supported commercial software packages have ended up in the dustbin after their parent companies have been acquired, or simply defeated in the marketplace.

One of the most exciting things about open source is that it represents a huge shift of power from vendors to end users, who are not left without recourse if the original developer abandons the marketplace. The Apache web server is the clearest case of this benefit: after the university-developed NCSA web server was abandoned because the entire technical staff was hired away by Netscape, it was a group of users of the NCSA server who put together an informal group (held together by an Internet mailing list) to coordinate their updates and patches to the NCSA code. That was the origin of the name--it was "a patchy server".

With the growing interest in open source software in the computer industry, large end users will soon have the best of both worlds, combining the informal support mechanisms of the networked developer community with the big-company support of players like IBM, HP, Sun, and Silicon Graphics, not to mention aggressive startups like Red Hat.

Myth #3. Big companies don't use open source software.

Despite official edicts to the contrary, there is an enormous amount of open source work going on at your own companies. To give just one example, my company's books on Perl, the leading open source development language, have been sold in large quantities into virtually every Wall Street investment bank. A large number of attendees at my company's Perl conferences come from Boeing and other aerospace companies. When Amazon recently unveiled their Purchase Circles feature, which shows the books most frequently bought at various companies, books on Perl dominated the top ten at every large semiconductor company, and showed up in more purchase circles than any other technology topic. In fact, I'd bet that we've sold books on Perl to virtually every company on the Fortune 500. The users aren't using Perl to do things that they could do with commercial tools; they're using Perl because it allows them to solve a new class of problems.

I found it amusing that some of the Wall Street banks I talked with didn't want to talk publicly about their use of Perl, arguing that it gave them a strategic advantage over the competition. I had to tell them that all their competitors were already using it as well.

Stories like this abound in the Open Source community. Greg Olson, the CEO of Sendmail Inc., recounts a conversation with the IS department at a large bank. He noticed from characteristics of the mail sent out from the bank that they used sendmail, and asked if they'd go on record. They denied that they used sendmail, until he showed them the tracks their email server was leaving in cyberspace.

(That brings up a related, but very significant point: the connectedness of the internet is changing the rules about what is visible to market researchers and the competition. Gideon Gartner, the founder of market research firm The Gartner Group, once told me that the secret of his business was that he got in between the computer vendors, the Wall Street investment bankers, and the large end users, and sold to each of them what he learned from the others. But the Internet has changed that equation, because a lot of technology decisions are now visible to everyone. While a Gartner study based on conversations with IT managers might have profiled intent to use the Netscape web servers versus Microsoft's IIS, the unbiased "robot-based" Netcraft survey, which talked to the web servers themselves rather than to the companies that run them, demonstrated Apache's dominant market share. Much of the talk about the tracks we leave in cyberspace focuses on personal privacy, but I think that we're going to see enormous impacts on the IT market research business as well.)
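To show what "talking to the web servers themselves" can look like, here is a small Python sketch that tallies the `Server` header from HTTP responses. It is an illustration under stated assumptions, not Netcraft's actual method: the host names and header values in `SAMPLE_RESPONSES` are invented, and the function names are hypothetical. A real survey would collect the headers over the network.

```python
from collections import Counter
from typing import Dict


def server_product(raw_headers: str) -> str:
    """Return the product name from the Server header, or 'unknown'."""
    for line in raw_headers.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "server":
            # "Apache/1.3.9 (Unix)" -> "Apache"
            tokens = value.replace("/", " ").split()
            return tokens[0] if tokens else "unknown"
    return "unknown"


def survey(responses: Dict[str, str]) -> Counter:
    """Count how many hosts report each server product."""
    return Counter(server_product(raw) for raw in responses.values())


# Invented sample data standing in for responses gathered by a robot.
SAMPLE_RESPONSES = {
    "www.one.example": "HTTP/1.0 200 OK\r\nServer: Apache/1.3.9 (Unix)\r\n",
    "www.two.example": "HTTP/1.0 200 OK\r\nServer: Microsoft-IIS/4.0\r\n",
    "www.three.example": "HTTP/1.0 200 OK\r\nServer: Apache/1.3.6\r\n",
}

if __name__ == "__main__":
    for product, count in survey(SAMPLE_RESPONSES).most_common():
        print(product, count)
```

The point of the sketch is that the answer comes from what the server reports, not from what anyone at the company says in an interview.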

Myth #4. Open Source is hostile to intellectual property.

I'm sure that if you've looked into this topic at all, you've heard scare stories about the Free Software Foundation's General Public License, or GPL. You hear that use of any software under the GPL will require your company to give away all its software.

It's true that software that incorporates other GPL-based software must be provided under the same terms--this is its so-called "viral" characteristic. However, you can use even GPL'd software freely. What's more, the Free Software Foundation represents only one of the traditions that make up the open source movement. Many of the very important programs I mentioned earlier come out of the university software development tradition, of which Berkeley UNIX was the mother lode. These programs are published under licenses that allow for proprietary extensions, and, in fact, many of the key developers have sought to find a balance between what they give away and what they keep proprietary.

What's more, companies including Netscape, IBM, Apple, and many smaller developers have created licenses that reserve various intellectual property rights to the creator of the software, while opening up the source code to spur involvement by outside developers.

We're still trying to understand what kinds of licenses are best for encouraging open source development.

Myth #5. Open Source is all about licenses.

In fact, open source is about internet-enabled collaboration. Licenses play a role only to the extent that they set out rules designed to make sure that companies don't undermine the playing field.

At bottom, Open Source is a software development methodology, not a political belief. Some of the principles underlying that methodology are outlined in Eric Raymond's new book, The Cathedral and the Bazaar, which I've brought for each of you. However, though I believe Eric has put his finger on a lot of the important principles, we're still learning the rules of this new game.

Efforts like Sun's Community Source License represent attempts to find the boundaries of the phenomenon. What "rules" are required to ignite an open source community, and what are the beliefs that get in the way?

The real lesson to be learned from open source communities lies in the techniques of networked collaboration that they've pioneered. Open source software projects have developed techniques--the use of mailing lists, distributed access to version control software, techniques of peer review, discussion and voting on features, rapid response to user feedback and opportunities for user participation--that can be applied fruitfully to the development of any kind of software.

In this regard, I want to mention a company that we recently started with Apache Group co-founder Brian Behlendorf. It's called Collab.Net, and it specifically attempts to create various kinds of infrastructure services for collaborative projects. Its first product is something called SourceXchange, which is a marketplace for open source development, in which companies can put up RFPs for software they want written, and independent developers can bid on the projects. Long term, Collab.Net will be creating outsourced services that allow any kind of software development project to use the techniques used so fruitfully by the Apache project.

Myth #6. If I give away my software to the open source community, thousands of developers will suddenly start working for me for nothing.

As Netscape found out with its Mozilla project, there isn't a magical community waiting to jump on any new open source software project. Linux claims thousands of active developers because it is really an aggregate of hundreds of independent projects.

Most open source projects have a core of a few dozen dedicated developers, a larger ring of a few hundred interested collaborators who provide problem reports, bug fixes, and occasional enhancements, and thousands or tens of thousands of users. Note however that some users eventually migrate from the outer to the inner rings. It's not dissimilar to the story I once heard from a fundraiser at the Nature Conservancy, which has something like 750,000 members. They don't make money on their individual members--but those members represent the pool from which their large donors eventually come.

The more important lesson here is that, if you want to engage a collaborative community, you need to engage the community of your users, not some vaguely-defined "open source community."

Myth #7. Open source only matters to programmers, since most users never look under the hood anyway.

The benefits of open source are exactly the same as the benefits of any other free market: competition between multiple suppliers results in lower prices, more innovation, and specialization to meet the needs of new niches.

This is in fact the chief benefit of open source as far as I'm concerned: you are no longer locked into a single-source supplier. If worst comes to worst, you can solve your own problems; more likely, there will be a variety of third-party vendors who will be able to support you better than a single-source vendor ever will.

In short, open source isn't just something that matters to computer software vendors. It's a way for you to provide better services to your internal users and to your customers, by applying techniques of networked collaboration first pioneered by leading edge software developers.

Myth #8. There's No Money to be Made on Free Software.

Despite the excitement about Red Hat, their success at putting software in a box and selling it only helps to perpetuate the myth that most software is written for sale. In fact, as most of you know, you write software for use in your businesses. It's a tool, like any other, with a set of build vs. buy tradeoffs.

It is true that open source software will reduce the amount of money that is spent on existing commercial software (which is why hardware vendors like IBM are so eager to embrace it--it cuts out the "Microsoft tax"). Disruptive technologies like open source software development reduce the margins of existing players, lower the barriers to innovation, and end up expanding the market--for players who are able to quickly understand and play by the new rules.

For a historical parallel, you have only to look at the history of the personal computer industry. Essentially, IBM changed the rules with the release of the specification for the personal computer as an open standard. For some years, there was an obvious battle over proprietary extensions to the open standard, but eventually, at the systems level, it became clear that the strategic advantage was not in gaining proprietary advantage, but in supply-chain management. That's why Dell, a company started by a college student, is now a multi-billion dollar company, and why Bob Young of Red Hat claims that Red Hat is not a software company per se, but a brand marketing company, and insists that his goal is "to shrink the size of the operating system market." If certain types of software become more of a commodity, the skills needed to prosper change accordingly.

The Internet, a disruptive technology based on open standards and open source software, has created huge new markets away from the software. I mentioned Yahoo! previously. You can make a good argument that the explosion of the web is a direct outgrowth of the open source movement. But I'm not just talking about the role of a free operating system like FreeBSD and a free Web server like Apache. The fact that there are millions of sites for Apache and the search engines to index relies on the fact that HTML itself (the language in which web pages are written) is open. Most people who build Web pages do so by imitation--the Web browser includes a "view source" menu item that lets you see how any feature is implemented, and makes it easy to copy it. That's open source in action at a grassroots level. So I can make a strong argument, which the founders of Yahoo! will echo, that Yahoo! and many of the other high-profile Internet opportunities are actually a direct outgrowth of open source. As one of the people who sparked the web revolution, I can personally attest to the importance of open source in our early efforts.

The lessons of how to "think outside the box" about how to make money on open source software are even clearer in the case of the Internet Service Provider industry. Few people realize that Rick Adams, who founded UUNET, the first commercial ISP, was a noted free software author. He wrote both B News, which at the time was the dominant Usenet news server, and the widely used SLIP implementation that first allowed internet access over dial-up modems. Rick didn't try to sell his software: he realized that the money was in building the services that were needed if this technology was to spread beyond the hacker elite into the mainstream.

Myth #9. The Open Source movement isn't sustainable, since people will stop developing free software once they see others making lots of money from their efforts.

If you look at the development communities around most open source software projects, you see a very large contingent of people who fund open source software development because they use it in their work, or who have found some other way to monetize that development.

I mentioned earlier that the Apache project was founded by a group of end users of the NCSA server. That's not quite true. Some of those "users" were web design and hosting firms, who resold their services to others. Having access to the server code was instrumental for their business, and it made complete sense for them to fund further development. By collaborating on improvements, they were able to gain immediate competitive advantage and provide new features for their customers. And because their services were generally provided in a specific geographic area, collaboration even with companies in the same business made perfect sense.

The key contributors to most open source projects today are a mix of university researchers, developers internal to companies who use that particular open source package in their work, independent consultants who profit from the increased visibility their participation brings them, and developers sponsored by companies who have identified a clear revenue stream associated with that project.

Myth #10. Open Source is playing catch up to Microsoft and the commercial world.

While it is true that there are major Linux projects to recreate the equivalent of the Windows desktop and the common office applications, these are not in fact the most important parts of the open source phenomenon.

Consider for a moment the most exciting new computer applications for the consumer. These are no longer desktop computer applications. (You can argue that the web browser was the last significant desktop application, and it was introduced over six years ago.) The exciting consumer applications of today are all web based -- Amazon, eBay, E*Trade, maps.yahoo.com. New functionality is being delivered via the web. I would argue that even in the back office, the web is changing everything.

Now, not all of these applications are running on open source operating systems, but that misses the point. It's a technology developed by an open source process operating outside the mainstream of the computer industry that brought us the disruptive changes that made all of this possible.

Returning to my earlier story about the early days of the IBM PC, I'd argue that the greatest contribution IBM made was in lowering the barriers to entry to the computer market. Once they published the specifications for the PC, anyone could build a PC. And two important things happened as a result: first, we had the development of a commodity hardware business, with multiple sources competing to provide computers at the lowest cost. The barriers to entry were so low that Michael Dell could start what became a multi-billion dollar company from his college dorm room.

But perhaps more significant was the shot in the arm that the open hardware platform gave to the software industry. Suddenly, the barriers to entry there were lowered as well. Where formerly software companies were satellites to hardware companies, they now became a power in their own right. IBM's great miscalculation was in thinking that hardware mattered more than software, which gave Microsoft the opportunity to take pole position in the computer industry.

In a similar way, I argue that the open, commodity software platform of the Internet is giving rise to a new class of applications on top of software--a layer I call "infoware".

If you look at these "applications" you'll see that they are processes more than they are products. Microsoft comes out with a new version of its products every 12 to 18 months. Yahoo, Amazon, and E*Trade rev their products continuously. And if you look under the hood, you'll see that open source scripting languages such as Perl, Tcl, and Python (or commercial products imitating their functionality) are a key part of the development mix. The reason, one missed by both Microsoft with ActiveX and Sun with client-side Java, is that at many of these sites, a key part of the "application" isn't built by programmers, but by writers, editors, catalogers and other content specialists. And some part of the programming that is done is "on the fly" mapping of dynamic text-based content sources such as newsfeeds. Perl's ability to parse text using powerful regular expressions turned out to be more important for the applications of the future than object-oriented code re-use.
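The kind of "on the fly" mapping of a text newsfeed described above might be sketched as follows. Python's `re` module stands in here for Perl's regular expressions, and the feed layout (date, time, section, headline separated by `|`) is invented for the example, as are the function names.

```python
import re
from html import escape
from typing import Dict, List

# One headline per line: "1999-10-28 14:05 | TECH | Headline text"
HEADLINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})\s+(?P<time>\d{2}:\d{2})"
    r"\s*\|\s*(?P<section>[A-Z]+)\s*\|\s*(?P<title>.+?)\s*$"
)


def parse_feed(text: str) -> List[Dict[str, str]]:
    """Extract the fields of every line that looks like a headline."""
    items = []
    for line in text.splitlines():
        match = HEADLINE.match(line)
        if match:  # lines that do not fit the pattern are skipped
            items.append(match.groupdict())
    return items


def to_html(items: List[Dict[str, str]]) -> str:
    """Render parsed headlines as HTML list items, escaping the titles."""
    rows = []
    for item in items:
        css_class = item["section"].lower()
        rows.append(f'<li class="{css_class}">{escape(item["title"])}</li>')
    return "\n".join(rows)


# Invented sample feed.
SAMPLE_FEED = """\
1999-10-28 14:05 | TECH | Red Hat & friends rally
-- wire service notice, ignore --
1999-10-28 14:20 | BIZ | Bookseller adds purchase circles
"""

if __name__ == "__main__":
    print(to_html(parse_feed(SAMPLE_FEED)))
```

A few lines of pattern matching turn a stream written by editors into markup for a page, with no object model in between.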

This to me is the real significance of open source. If you create low barriers to entry, you increase the opportunities for surprise. As Alan Kay once said, "It's easier to invent the future than it is to predict it." Open source gives us a better tool for innovation, not because of any magic in its development methodology (although there is great power in distributed peer review), but because it is part and parcel of an environment in which multiple players can take us in unexpected directions. Software companies didn't invent the web because they had too much at stake, and tried to fit the world of networked multimedia into narrow product visions that were compatible with their existing revenue streams. It was the availability of free software and open standards that let people outside the industry create what turned out to be the next paradigm.

The real secret of open source is that it's the latest disruptive technology, one that disenfranchises existing players and lets in fresh ideas. The last time round, the "barbarians" (to use Philippe Kahn's term) were small software companies. Now that Microsoft has conquered the software market and choked off innovation, the commoditization of software through the open source process has opened up new avenues and an entirely new class of applications.

Does this mean that the software industry as we knew it has become irrelevant? Not at all. It will continue to flourish, just as the hardware industry has flourished in an age dominated by software companies. In one sense, whether or not the web keeps to its open roots is irrelevant; its mission has already been accomplished. In fact, I expect that many applications that were originally developed in the open source community will be taken proprietary over the next dozen years, as web-application vendors who built their fortune on an open foundation create walls to protect themselves. Even Microsoft was once an outsider, a small company trying to change the world.

In fact, I believe that any successful industry provides a balance of open and proprietary. At the heart of the open PC hardware platform is a proprietary CPU, and a variety of proprietary devices. At the heart of the open internet are proprietary Cisco routers, and for every open source program, there are proprietary ones as well. It's not a matter of either-or.

That being said, I believe that we can learn from our mistakes. It's not necessary for us to go through a cycle of openness and heightened competition followed by stagnation as a few vendors dominate the industry and limit us to their centrally managed master plan. History teaches us that as far as innovation is concerned, open beats proprietary every time. You have only to look at the history of the UNIX operating system to see this effect. Many of the innovations that were incorporated into commercial UNIX systems (as well as many of the foundational technologies for the Internet) were developed in universities as extensions to the original work at Bell Labs. Once AT&T took UNIX commercial, under a restrictive license, that work stopped, and didn't burst into flower again until Linux, a free implementation, took over leadership of UNIX operating system development.

My concluding argument to you, therefore, is that if you value competition and innovation, it's in your best interest to support the open source software development community. Not only should you be experimenting with open source products, you should be learning from its processes. My dream is that we can have the best of both worlds: a vibrant commercial industry based on openness and cooperation where it makes sense, and competition and proprietary advantage where it makes sense.

Some Final Thoughts

Ten years or so ago, Sun Microsystems coined a slogan that is only now becoming true: "The Network is the computer." I'd like to outline a few ways I see that coming true:

  • Every device is becoming a network peripheral.

  • Applications are living on the web rather than on the desktop. We see this in Sun's and Microsoft's plans to offer office applications as hosted services, but even more in the new class of applications I talked of earlier.

  • Web sites are no longer running on single servers, or controlled by a single company. I already mentioned the role of caching services such as Akamai. When you look at beta chapters of O'Reilly books on Amazon, you're actually talking to an O'Reilly site, melded invisibly into Amazon. When you use the search engine on many a site, you're actually using a remote search service. Mapping services like maps.yahoo.com are outsourced from other companies. When you look up a phone number on PageNet's two-way pager, you're talking to a proxy server on the web that pulls down a page from Infospace, throws away everything but the phone number, and downloads it to the pager. With the Palm VII and WAP-enabled phones, the "clients" of web sites will often not be browsers being run by humans, but other programs. The list goes on and on.

  • Another exciting idea here is that you can think of any web site as having an implicit API, and in fact, you can use it as if it were a subroutine in a program. We're already doing this in a small way throughout the web. For example, Jon Udell, one of our authors, built a script he calls the "mindshare script", which takes any particular point of the Yahoo directory hierarchy as a starting point, "unrolls it" to get a complete list of all the sites Yahoo has in that category, and then feeds them one by one to AltaVista, using the links keyword. The result is a sorted list of all the sites for a given topic, in descending order according to the number of other sites that link to them. The utility of this application may be limited, but the implications are astonishing: I can write a program that uses Yahoo and AltaVista as components. With the advent of XML, this kind of thing is going to become easier and less fragile. One of the most exciting areas to watch is something called XML-RPC, which has been incorporated into a protocol called SOAP (Simple Object Access Protocol), which has the backing of Microsoft as well as the open source community. This is the first intimation of truly global change in the ways software will be written.
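The shape of a program that treats web sites as subroutines can be sketched in Python. This is not Jon Udell's script: the directory and the link counts are simulated with invented in-memory data, and the three callables passed to `mindshare` are hypothetical stand-ins for requests to a directory site and a search engine.

```python
from typing import Callable, Iterable, List, Tuple


def unroll(category: str,
           subcategories: Callable[[str], Iterable[str]],
           sites: Callable[[str], Iterable[str]]) -> List[str]:
    """Collect every site listed at or below a directory category."""
    found = list(sites(category))
    for child in subcategories(category):
        found.extend(unroll(child, subcategories, sites))
    return found


def mindshare(category: str,
              subcategories: Callable[[str], Iterable[str]],
              sites: Callable[[str], Iterable[str]],
              inbound_links: Callable[[str], int]) -> List[Tuple[str, int]]:
    """Rank a category's sites by how many other sites link to them."""
    ranked = [(url, inbound_links(url))
              for url in set(unroll(category, subcategories, sites))]
    # Most-linked first; ties broken alphabetically for a stable order.
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked


# Invented data standing in for the remote services.
DIRECTORY = {"Computers": ["Computers/Languages"], "Computers/Languages": []}
SITES = {
    "Computers": ["http://a.example"],
    "Computers/Languages": ["http://b.example", "http://c.example"],
}
LINKS = {"http://a.example": 12, "http://b.example": 340, "http://c.example": 57}

if __name__ == "__main__":
    result = mindshare(
        "Computers",
        lambda c: DIRECTORY.get(c, []),
        lambda c: SITES.get(c, []),
        lambda u: LINKS.get(u, 0),
    )
    for url, count in result:
        print(count, url)
```

Because the remote services are passed in as ordinary functions, swapping the simulated data for real network calls would not change the ranking logic, which is the sense in which the sites act as components.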

Open source is like a stone thrown into a pond. The ripples spread outwards, even if you can no longer see the stone that caused them.

A final point about open source itself that bears thought, in terms of the disruptive power of open source with regard to current companies: open source projects are managed by individual developers, not by companies. As Larry Augustin of VA Linux Systems (who probably has more of an open-source all-star team than any other company) remarks, this is an enormous shift of power. When a leading open source developer moves on, his project (and the status that goes with it) moves with him. A comparison with the film industry, where the studio-dominated system of the thirties and forties was replaced by the star-dominated system of today, bears some thought.

If this power shift meant that key projects were controlled by single individuals rather than by distributed teams, there might be reason to fear this change. But the unique characteristics of open source, in which key developers lead but do not unilaterally control their projects, provide insurance both against the loss of a key developer and against that developer taking the software in a bad direction.

In short, open source is here to stay. It's already had major impact, but there's more to come. Keep your eyes open, and prepare for more positive surprises!
