Open Source Convention Report

Open Documentation Summit

Monday, July 17

The Open Source Conference began slowly on Sunday, with lots of people unpacking, constructing booths, connecting wires, registering for tutorials, and meeting their friends from last year. The lobbies of the two conference hotels were full of people lugging the day's freebies off to their hotel rooms. In the Monterey Conference Center's Dana Room, however, one group of attendees was already hard at work at the all-day, O'Reilly-sponsored Open Documentation Summit.

The goal of this meeting was to gather those people who are interested in finding a way to improve the documentation accompanying open source software. Some of the issues are controversial and have generated friction among various camps in the open source development community and publishers. Representatives from all the camps were invited to the summit. Lots of open source documentation groups participated: the Linux Documentation Project, GNOME, KDE, FreeBSD, BSDI, SourceForge, Samba, OASIS, Los Alamos National Labs, Python, and Open Content. Naturally, all the O'Reilly open source and XML editors filled the remaining chairs. Controversy at the summit, however, was constructive and minimal. Participants showed a spirit of cooperation and a desire to get beyond philosophical differences to solve common problems. As Eric Raymond, Internet anthropologist, author of The Cathedral and the Bazaar, and attendee put it, our mission is to recommend "small steps, each a reasonable extension to established practice."

The major conclusion reached by attendees was to standardize on DocBook/XML as the canonical format for open source documentation. The role of DocBook, however, is to be the storage and exchange format. Different documentation groups will use it in different ways, but all will provide some form of their documents in a standard DocBook format.
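To make the "storage and exchange format" idea concrete, here is a minimal sketch in Python. It assumes a tiny, made-up DocBook article and two hypothetical helper functions, to_text and to_html, that each render the same stored source in a different way. Real projects would rely on proper style sheets rather than hand-written code like this; the point is only that one canonical source can feed several outputs.

```python
import html
import xml.etree.ElementTree as ET

# A tiny sample document, invented for illustration. The element names
# (article, title, sect1, para) are ordinary DocBook elements.
DOCBOOK = """<article>
  <title>Printing HOWTO</title>
  <sect1>
    <title>Setup</title>
    <para>Install the spooler.</para>
  </sect1>
</article>"""


def to_text(root):
    """Render titles in upper case and paragraphs as plain lines."""
    lines = []
    for el in root.iter():  # iter() walks the tree in document order
        text = (el.text or "").strip()
        if el.tag == "title":
            lines.append(text.upper())
        elif el.tag == "para":
            lines.append(text)
    return "\n".join(lines)


def to_html(root):
    """Render the same elements as very simple HTML."""
    parts = []
    for el in root.iter():
        text = html.escape((el.text or "").strip())
        if el.tag == "title":
            parts.append(f"<h2>{text}</h2>")
        elif el.tag == "para":
            parts.append(f"<p>{text}</p>")
    return "".join(parts)


root = ET.fromstring(DOCBOOK)
plain = to_text(root)
markup = to_html(root)
```

Both outputs come from the same stored document, which is the role the group assigned to DocBook: each project can present its documentation however it likes while still exchanging a common source.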

The participants did not believe that there was any other practical solution. The Free Software Foundation intends to continue to use Texinfo, but FSF is working on a way to convert SGML documents into a Texinfo format that they can use. Most commercial documentation projects use proprietary products like Word or Frame that are inappropriate for open source projects. Furthermore, Word, Frame, and other formatting alternatives lack the expressive richness of DocBook. One of the key characteristics of any open source solution is that it has to be able to express every important distinction across many environments, technologies, formats, and languages. Nik Clayton, who heads the FreeBSD documentation efforts, noted the many languages into which FreeBSD documentation is translated. Only DocBook could store sufficient information to satisfy all those needs.

Nobody ignored the problems inherent in the use of DocBook, however. Many people complained that its richness also meant that it was highly complex. Several attendees expressed a desire for a subset of DocBook that contained only the 25 or so most important tags. Norm Walsh, a board member of OASIS and the maintainer of the XML version of DocBook (and, I might add, the co-author of DocBook: The Definitive Guide, from O'Reilly & Associates), has tried to create such a subset, but he believes that even a minimalist set requires about 100 tags. (Tables are the biggest problem, he said.)

Everyone agreed that a key to good open documentation is to make the creation and maintenance of it easy, so the complexity of DocBook is a hurdle. If a potential writer knows lots of good information about an open source technology, we don't want that person discouraged by having to learn DocBook. Nik Clayton said that DocBook ought to supply better style sheet support. Mark Galassi from Los Alamos National Laboratory disagreed: "People should not have to think about such things," he said. For the short term, then, many projects may use DocBook behind the scenes, through conversions, allowing interested authors to submit their information in whatever format is comfortable. In the long run, however, as Norm reminded us, "What we need is a good, open source XML editor."

(There was considerable interest among participants in an XML editor named Conglomerate. Aside from the fact that this editor has its own form of style sheets, it seems to provide a lot of the features that writers need in an XML environment. We're going to keep our eye on this project.)

XML is replacing SGML. Norm recommended that groups begin using XML instead of SGML, if they haven't already, and start thinking about converting existing SGML documents. Most new development, Norm told us, is taking place with XML. According to Norm, conversion from SGML to XML ought to be easy for most documents. The one area of concern involves the use of "marked sections," which XML does not support.
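For groups sizing up a conversion, a rough first pass is simply to find the marked sections. The sketch below, in Python, assumes that the only marked-section form an XML document body can keep is a CDATA section written exactly as `<![CDATA[`, and it flags every other opening it finds. The function name and the sample text are hypothetical, and a regular expression is no substitute for a real SGML parser; treat this as a way to estimate the work, not to do it.

```python
import re

# Matches the opening of a marked section: "<![", a keyword or
# parameter entity reference, then "[".
MARKED_SECTION = re.compile(r"<!\[\s*([^\[\]]*?)\s*\[")


def find_unsupported_marked_sections(text):
    """Return (line_number, keyword) for each marked section opening
    that is not written exactly as an XML CDATA section."""
    hits = []
    for match in MARKED_SECTION.finditer(text):
        if match.group(0) != "<![CDATA[":
            line = text.count("\n", 0, match.start()) + 1
            hits.append((line, match.group(1)))
    return hits


# Sample SGML fragment, invented for illustration.
sample = """<para>Always shown.</para>
<![ %draft; [
<para>Draft-only note.</para>
]]>
<![CDATA[ literal <markup> ]]>
<![ IGNORE [ <para>Hidden.</para> ]]>
"""

hits = find_unsupported_marked_sections(sample)
for line, keyword in hits:
    print(f"line {line}: marked section '{keyword}' needs attention")
```

Each reported line is a place where someone has to decide whether the conditional text stays, goes, or moves into some other mechanism.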

The group also settled on the use of other standard formats: PNG for bitmapped graphics and SVG for vector graphics.

More interestingly, the group decided to support a set of metadata tags called the Dublin Core. These tags provide the background information that lets projects track the status of documentation quickly. This initiative tracks information such as the creator of a file; its title; its most recent revision; its current maintainer; its subject and related keywords; and so on. It is much easier for a person or a browser to search this material than the whole document. Paul Jones, director of MetaLab at the University of North Carolina, and Dan Mueth, project coordinator of the free software GNOME Documentation Project, proposed an Open Source Metadata Framework based on the Dublin Core. Norm Walsh agreed to create the necessary templates to make it easy to use these metadata tags in DocBook documents.
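As a rough illustration of why a small metadata record is easier to search than a whole document, here is a minimal sketch in Python. It uses the Dublin Core element namespace with a handful of its elements (title, creator, date, subject). The wrapper element named metadata, the helper functions, and the sample values are all assumptions made for this example; this is not the layout of the proposed Open Source Metadata Framework or of Norm's templates.

```python
import xml.etree.ElementTree as ET

# The Dublin Core element set namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_record(title, creator, date, subjects):
    """Build a small metadata record. The <metadata> wrapper is a
    hypothetical container chosen for this example."""
    record = ET.Element("metadata")
    for name, value in (("title", title), ("creator", creator), ("date", date)):
        ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value
    for subject in subjects:
        ET.SubElement(record, f"{{{DC_NS}}}subject").text = subject
    return record


def matches_subject(record, keyword):
    """Case-insensitive exact match against the record's subject keywords."""
    wanted = keyword.lower()
    return any(
        (el.text or "").lower() == wanted
        for el in record.findall(f"{{{DC_NS}}}subject")
    )


# Sample values, invented for illustration.
record = build_record(
    "Example HOWTO", "A. Writer", "2000-07-17", ["printing", "networking"]
)
xml_text = ET.tostring(record, encoding="unicode")
```

A help browser could scan many records like this one with matches_subject and open only the documents that match, without reading the documents themselves.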

Dan Mueth, of GNOME, hopes that such a set of tags will make it easier to implement a robust help system on open platforms, one that will be able to find relevant documentation (help and man pages, FAQs, HOWTOs, related documentation, and even information sources out on the Internet). One way to make such searches easier is to have a mechanism that would let a project "register" its information during installation. Dan and Eric Bischoff of the KDE documentation team have agreed to investigate this problem for the group.

There is much more to say about this eight-hour-long summit meeting, and I may add some information in my reports later in the week. What I've noted above are the highlights. The most important conclusion to draw from this summit, however, is that it is just a beginning. Many of these individuals met each other for the first time on Sunday. They have established some goals, started some investigations, and gained some focus: just the kind of "small steps" that Eric Raymond suggested. This project is off the ground; I hope that the open source groups that did not send a representative will get involved after reading this report.

Frank Willison
Editor-in-Chief, Technical Publishing
O'Reilly & Associates
