Seth Maislin, O'Reilly Indexing Guru

by Lori Houston

Turn to the index in the back of any O'Reilly book published in the last five years and chances are you're looking at the handiwork of O'Reilly's resident indexing guru, Seth Maislin. Though indexes are the most frequently fingered section of any computer book, they remain the one element most taken for granted. Those ostensibly logical, orderly columns of subject-page references belie the complexity of indexing.

The craft of indexing involves much more than the mere alphabetization of a book's key words. It requires something that is at once science and art form, the product of someone painstakingly fleshing out a book's information design while copiously accounting for nuances of language and word associations. You might say an index is like a fingerprint: intricate, revealing, utterly unique.

The Index is the Key

Taking into consideration the specific needs of the audience is a particularly critical aspect of any well-designed book index, according to Seth. So is user-friendly, intuitive organization. A good book index features a kind of layered treatment of all subjects addressed within the covers. One of the indexer's most significant challenges comes in discerning how "deep" to go in interlacing those layers without crossing a fine line into annoying triviality.

In technical books, the indexing variables are particularly complex. "I would have to say the most difficult O'Reilly book I've indexed was also one of the most popular: Perl in a Nutshell," Seth recounts. "That index begins with page after page of non-alphanumeric entries, like $^. The challenge in indexing symbols is two-fold. First, there's the simple matter of which comes first, the # or the ^? There are many options, and none of them is intuitive. Even the sort order that I choose, 'English spelling order' -- 'ampersand,' 'asterisk,' 'at sign' -- fails when you consider the many different English words for the # character: hash mark, octothorpe, pound sign, sharp, and so on. The second problem is communicating all of this to the book's readers. What becomes intuitive to me may still look like a three-page nightmare of comic book curses to most people."

In the end, Seth only learns whether he has succeeded through reader feedback, although this rarely filters in specifically regarding a book's index. He happily notes, however, "There has been significant positive feedback regarding the index to Perl in a Nutshell, specifically in its level of detail. But no one has mentioned the symbols pages yet. I'm still waiting."
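
The "English spelling order" Seth describes for symbol entries, and the way it breaks down for the # character, can be illustrated with a small sketch. This is not the tooling used for the Perl in a Nutshell index: the name table, the function names, and the sample entries below are all assumptions made up for illustration.

```python
# Hypothetical table of English names for symbols. These spellings are
# illustrative choices, not the ones used in any actual O'Reilly index.
SYMBOL_NAMES = {
    "&": "ampersand",
    "*": "asterisk",
    "@": "at sign",
    "^": "caret",
    "$": "dollar sign",
    "%": "percent sign",
    "#": "pound sign",  # could equally be "hash mark", "octothorpe", "sharp"
}


def spelling_key(entry, names=SYMBOL_NAMES):
    """Build a sort key by spelling out each character of a symbol entry."""
    return " ".join(names.get(ch, ch) for ch in entry).lower()


def sort_entries(entries, names=SYMBOL_NAMES):
    """Sort symbol entries alphabetically by their spelled-out names."""
    return sorted(entries, key=lambda entry: spelling_key(entry, names))


entries = ["#", "^", "@", "$^", "%", "&", "*"]

# Same entries, two equally defensible names for the # character.
by_pound = sort_entries(entries)
by_hash = sort_entries(entries, {**SYMBOL_NAMES, "#": "hash mark"})

print(by_pound)
print(by_hash)
```

Calling the # character "pound sign" sorts it after %, while calling it "hash mark" sorts it before %. A reader who thinks of the character by a different name than the indexer did will look in the wrong place, which is the problem the quote points to.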

Understandably, accuracy is another highly critical element of book indexing. A full spectrum of software products now facilitates greater indexing accuracy while also expediting tedious clerical tasks through such techniques as embedding electronic index "tags" within files or documents. Automated indexing tools, add-on utilities, and dedicated programs certainly grease the wheels, but "anyone who thinks a computer can write an index has failed to appreciate the subjectivity of language," Seth says. "The indexer's style has much to do with grasp of language and the skills of communication. Anyone can highlight the important ideas in a text, but few can structure that information to meet the needs of all audiences. In fact, indexing is like playing piano in a restaurant: people never notice you're there until you make a mistake. Or are missing altogether."

His big-picture approach also applies to how he defines his profession. "I define indexing simply as the development and creation of navigational tools for users who want to find existing information. An index is a map of locations, much like the indexes to street maps. It doesn't matter if those locations are Web pages or street addresses, since the goal of 'getting the user where she wants to go' is still tantamount."

Indexing Flowers with Online Communications

The explosion in online communications is calling into question all the indexing rules developed around books over the last 200 years. "In a nutshell," he says, "indexes are the tools we use to find information. If we can't find it, it might as well not exist. The classic back-of-the-book index is one example; library cataloguing systems -- the theory of which is the basis for book indexing -- are another. I would also argue that any navigationally oriented web page is an index. The value of information is in how it can be found and used, and the index is the key."

Online communication opens up whole new avenues for more intuitive indexing. Consider this: Have you recently used a help menu, queried a search engine to locate resources, or clicked on hypertext to navigate a Web page or site? Every time you do any of these things, you are calling upon an online index.

"The rules of back-of-the-book indexing don't quite apply to online indexes," Seth says. "The nonlinear nature of online material makes it difficult for a reader to grasp the greater context; readers don't know how big a Web site is by looking at just one page. Online indexes need to be written with the goal of recreating that lost context.

"The beauty of the Web's graphical nature," Seth continues, "is that there are many more ways to organize index entries than the standard alphabetical list. Links can be organized chronologically, geographically, metaphorically, by task, by importance, or by some other custom scheme."

While savvy Web developers and designers may understand and apply these concepts intuitively, Seth says that it's hard for most people, especially traditional indexers, to think of a clickable map as an index, in part because the Web is not yet 10 years old.

"An image map serves the same purpose as an index, only without words. Layout and design begin to substitute for language, yet the navigational goals of an index remain."

The fact that many O'Reilly books are also offered on CD-ROM has led Seth to experience firsthand the added dimensions of what he calls computer-based indexing.

"When an index moves from the back of a book to an online format, such as HTML source code, a number of basic indexing elements disappear. That's because the very concept of a page disappears, not to mention page numbers and page ranges. Online, the page references become URLs, and HTML documents exist in a very flat environment. That is, there is no sense that one URL comes first, but rather they all reside in the same or in different directories."
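
The contrast Seth draws can be sketched in a few lines: print locators are ordered page numbers that collapse into ranges, while online locators are URLs with no inherent order. The function name, the sample entry, and the file paths below are hypothetical and only illustrate the idea.

```python
def collapse_pages(pages):
    """Collapse page numbers into print-style ranges, e.g. 12-14."""
    ranges = []
    for page in sorted(set(pages)):
        if ranges and page == ranges[-1][1] + 1:
            # Consecutive page: extend the current range.
            ranges[-1][1] = page
        else:
            # Gap in the sequence: start a new range.
            ranges.append([page, page])
    return [str(a) if a == b else f"{a}-{b}" for a, b in ranges]


# Print index: page order lets adjacent references merge into a range.
print_entry = {"sort order": collapse_pages([12, 13, 14, 40])}

# Online index: the locators are URLs (hypothetical paths here). No URL
# "comes first", so there is nothing to merge into a range; the entry is
# simply a flat list of targets.
online_entry = {"sort order": ["ch02/sorting.html", "appendix/symbols.html"]}

print(print_entry)
print(online_entry)
```

The print entry comes out as the locators 12-14 and 40, while the online entry stays a list of separate link targets, whatever directories those files happen to live in.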

In the global Internet environment, meaning also becomes harder to manage. "The major flaw of search engines, for example, is the inability to filter out less meaningful text. Whereas a search engine considers every hit equally, an indexer must judge the importance of a concept before including it in the index. Indexing is as much art and science as is writing."

Seth finds himself traveling and conducting indexing workshops more and more frequently, communicating his new vision to his venerable trade's unsung practitioners. He's generated a lot of nuts-and-bolts copy on computer-based indexing and its associated technologies on the Web site of the American Society of Indexers, where he was formerly Webmaster and now serves on the board of directors.

In response to numerous requests from colleagues, he wrote and published several guides and tutorials. Those documents have taken on a life of their own within the industry, he says. He also wrote a chapter in an upcoming textbook, Beyond Book Indexing (being published by ITI for the American Society of Indexers).

Learning the Trade

Seth stumbled into indexing by accident, having earned undergraduate and graduate degrees in optical engineering from the University of Rochester in New York. Hired straight out of college, he went to work for a company that marketed optical, scientific, and educational equipment. While he found the business mildly interesting and earned a decent living, "I didn't find the job challenging. On good days I was a consultant, most days I was a salesperson, and on bad days I was a customer service representative."

Eventually he decided to quit and make a move to Boston for a new life, but while still with the marketing company ("bored almost to the point of photocopying my butt"), he walked into the middle of a catalog indexing project and took it on. Only later, through an ad for an indexing class, did he learn that indexing is a profession in its own right, although he admits, "My first impression put indexing on par with 'Read books for money!'"

In Boston, he got into the publishing industry, freelancing initially as a copy editor before moving into indexing. "Thus, I navigated rather spontaneously from 'Optics? What's that?' to 'Indexing? What's that?'" Seth muses. Seeking work led to contact with O'Reilly's Cambridge offices. By 1994, he had grown into his current indexer-in-residence role. The O'Reilly relationship has been fortuitous for Seth in several ways. For one, freelancing and consulting have allowed him to pursue one of his other passions: acting. He has frequented community theater productions in Massachusetts and New York for several years either as a performer or behind-the-scenes technician.

Professionally, Seth's long-term O'Reilly affiliation has paid huge dividends. "Five years later, my O'Reilly-learned tools background has placed me on the cutting edge of indexing. Fewer than 10% of indexers have used these tools even once, yet the technology of the Web is making them more prevalent in the industry. In addition, indexing O'Reilly book subjects has helped me develop an understanding of Internet technologies. The combination of tools and technology know-how has helped me develop a solid, moonlighting consulting business: I speak at conferences and present corporate-based workshops around the country. Who would have thought indexing could be more rewarding than engineering?"