Advanced Data Engineering Helps Publishers Bridge the Digital Gap

by Bonnie Allen

The shift from ink and paper to digital publishing has caught many in the publishing industry off guard. One challenge every publishing company faces is how to make both new and existing information available in the many electronic formats that customers expect.

Advanced Data Engineering, Inc., based in Petaluma, California, has stepped in with custom-made solutions. "We're the people who can turn a book into a CD-ROM or convert raw data into useable information for presentation in a wide variety of media, including the Internet, company intranets, and e-books," said Allan Syiek, Advanced Data Engineering's vice president of sales and marketing.

Advanced Data Engineering gathers a client's data from many sources, including text, graphics, and multimedia, and translates it all to XML or SGML, flexible formats that convert easily to any other print or electronic format. The company also consults with clients to help them define and meet their data-conversion needs, and designs systems that let clients do their own translating with the least disruption to their established procedures.

With so much riding on getting the information distributed, companies don't just want their publication in the right format--they want it now. This means turnaround time must be fast, and the conversion software must be easily adaptable.

This is where Perl comes in. Kevin Behr, Data Analyst for Advanced Data Engineering's client CCH, put it this way: "Development and production of our electronic products is time critical. Prior to Advanced Data Engineering's conversion to Perl processing, our cycle of product delivery was impacted by processing time. Once Perl programs were introduced into the process, the time and effort involved was considerably streamlined."

CCH (formerly Commerce Clearing House) is a company that specializes in providing customers in the legal profession with up-to-date monthly summaries of legal changes, notably tax code updates. A number of Advanced Data Engineering's 13 employees, including Syiek, a self-described "lawyer who programs," worked together on their data-conversion applications as employees of CCH.

Originally, CCH mailed monthly printed updates that the customer used to replace outdated pages in a looseleaf binder. When CCH began offering an online version of its product, with customers issued a monthly CD-ROM, initial development was slow and costly. It took six to nine months to develop conversion programs for each series of publications and their monthly updates.

"At the time we were using a popular language that was very expensive to license," recalled Syiek. "The language was difficult to learn, read, and implement. Conversion projects were falling behind, and monthly production was also impacted because the code was difficult to maintain.

"Enter Perl. The fact that Perl is open-source software was very appealing, but the determining factors for us were that it was freely available, easy to learn, and so efficient at manipulating very large sets of data." Compared to the proprietary software, Syiek found Perl easier to write and more modular; in short: simpler, faster, and cleaner.

When CCH closed its San Rafael office in 1996 to consolidate its operations in Chicago, the group that had developed the conversion software proposed to spin off as Advanced Data Engineering in order to offer a much-needed service to additional clients.

CCH is still Advanced Data Engineering's biggest client. Other recently acquired clients include WebMD, for whom Advanced Data Engineering is translating thousands of pages of Reader's Digest books from QuarkXPress to XML. Another recent client, IDG Books Worldwide, is using Advanced Data Engineering's services to translate its popular For Dummies® books to XML.

Perl's open-source status has allowed access to ready-made tools such as a freely available SGML parser. "Since moving to Perl, we have been able to develop a set of tools that has standardized most of our processes, and we have avoided reinventing code that others have already written," said Syiek. "We can bring new programmers up to speed relatively quickly."

As a result, tools and programming are quickly adaptable to individual client needs. "We have taken full advantage of Perl's object-oriented nature by developing a module which has simplified most of our efforts to convert one client's print-formatted data into SGML, and from there into the proprietary Standard Information Format (hence the module's name)," said Syiek. "Its routines standardize aspects of the conversion process, as well as handle typecode parsing, charts, and graphics."

How much code and time went into the module's development? "The module is a mere 680 lines," said Syiek. "Of course, it incorporates at least 10 other packages. It was developed and used in a matter of days, and has been modified since its inception six months ago.

"Bottom line: with Perl, we have cut our development time from months to weeks. Our ability to turn jobs around quickly and accurately has not only kept us competitive, it is the key to our profitability."
