DITA, DocBook and the Art of the Document

by Kurt Cagle

While a remarkable amount of both ink and electronic bandwidth has been expended upon the use of XML in the data realm, there are times when it is necessary to step back for a bit and look at how and where XML is actually being used today. One thing that becomes obvious when studying the XML landscape is that a significant amount of XML is still being used to describe narrative: telling a story, advising people in the use of a product, structuring reports, and doing other things that focus more on documents than on data.

In some respects, this is not all that surprising. In data-centric applications, XML isn't always the best choice for working with structured content, and indeed there are times when XML is perhaps the worst, most hideously inefficient mechanism for dealing with data. However, XML as a means of writing and marking up narrative has become the standard way of encoding structured content in most organizations. That doesn't mean that XML is dominant in most organizations for "unstructured" content - that distinction still belongs very much to Microsoft Word, with XML occupying a considerably inferior position there - but for organizations that recognize the benefit of structured content, XML languages such as DITA and DocBook are very quickly becoming the standard for storing information.

I had a chance to see that principle at work this week at the DocTrain conference in Vancouver, British Columbia. Conference chairman Scott Abel (CEO of The Content Wrangler) graciously invited me to the conference, and I had the chance to talk with a number of people working with technical documentation, online content creation and related material. Overall, it opened my eyes fairly dramatically to the hyper-accelerated world of content management a decade after the introduction of XML.


3 Comments

D. Hoskins
2008-05-28 17:21:11
I am working with DITA on a large corporation's Help and Quick Reference Guide. We can interchange content between the two deliverables, build a library of reused items, and generate PDF and HTML. Customizing the HTML is not too hard, but making PDF output changes throws one headlong into XSL-FO (and it's gnarly). We are going to shift to a CMS + PDF rendering engine (TopLeaf) combination that will provide a visual design interface for the PDF output. As part of the migration, our reused items will be managed as discrete XML content objects in the CMS, rather than as conrefs. This permits reorganization of content without relinking between the DITA files with hard-coded hrefs. The CMS and XML content are hosted (along with an XML editor application) and we can send content to translation workflows within the CMS application.


If you have the budget and the reason to spend it, the CMS + rendering engine combination makes DITA more robust for an enterprise. We hope to recoup the cost in a year or so through better translation memory management. If you are cash-poor, invest in the dita-users.org membership and use their online tool.


BTW, as a developer, I have never regretted working with XML and XSLT. It's a good move for people who are analytical by nature and have a strong visual orientation for creating outputs that match client requirements for look and feel. (I started as an artist and graphic designer years ago.) There seems to be a shortage of XSLT developers who understand print outputs in particular, and people who have these skills may have a strong market as DITA grows.

D. Hoskins
2008-06-29 20:15:10
DITA:tech content as HTML:general content
Evolved as a subspecies of markup language and poised to revolutionize how tech writers collaborate. Will succeed if it becomes as easy to create and therefore as ubiquitous as HTML. Otherwise, it will die off as an unsuccessful adaptation. Open Toolkit is too onerous to set up for general consumption, so better applications are required to establish DITA. These applications should be cheap or free, browser based (maybe Flash or AJAX-coded), Unicode/multilingual, and as easy to use as a wiki or blog site with WYSIWYG editing. (Gosh, a lot of that sounds like DITA Storm, which is a good starting point.) Adobe, why don't you build a better DITA to PDF + HTML engine and give it away with the Tech Suite? You're still too slow to make FrameMaker into a true DITA toolkit.

D. Hoskins
2008-06-29 20:28:43
IMHO, no one should even think about DITA as a true solution for globalized tech content unless they can make the case for online version control, collaboration, translation, workflow and publishing to a variety of output formats (HTML, PDF and RTF at a minimum). I know of one solution that seems to offer this package of features (DocZone) as a hosted solution. Does anyone know of any open-source application suite that is addressing the same feature set? Seems like a Moodle-like approach might be feasible. The commercial solutions seem to like about $40,000 as a price for a CMS/server + DITA editor/publisher + collaboration suite. That's because the dotNetNuke gang hasn't gotten serious on DITA, perhaps?