Castoff hints? Rethinking interoperability and fidelity
by Rick Jelliffe
Deciding on breaks
I am ancient enough to have used galley proofs, the long pages of text of a book before it had finally been made up into the final pages and run off on a printer (or rather, by a printery). The idea still exists in the draft modes of some modern word processors, I suppose. There has always been a chicken-and-egg problem in documents which contain dynamic forward references that expand to section or page numbers (e.g. See page 99): how do you know how much space to reserve for the page number? A reference on a tightly-set line or full page may cause different page breaks depending on whether it is a two- or three-digit number, for example. A traditional way to deal with this was to allow a lot of space around page references (to reduce the impact) and to take two passes over the document, the first to estimate the pages and the space required for each reference, and the second to actually compose the document, using the calculated space as fixed and squeezing the generated text if necessary.
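The two-pass idea can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `{ref:label}` syntax, the fixed line and page sizes, and the function names are all invented for illustration, and real composition engines are far more elaborate. Every reference is rendered in a fixed reserved width, so the second pass cannot disturb the breaks found by the first.

```python
import re
import textwrap

LINE_WIDTH = 30      # characters per line (assumed for the sketch)
LINES_PER_PAGE = 4   # lines per page (assumed for the sketch)
RESERVED = 3         # characters reserved for every page number (assumed)

REF = re.compile(r"\{ref:(\w+)\}")  # hypothetical reference syntax


def compose(blocks, pages_by_label):
    """Break (label, text) blocks into pages.

    Returns (pages, label -> page on which the block starts). Each reference
    is rendered in exactly RESERVED characters, known or not.
    """
    def render(match):
        page = pages_by_label.get(match.group(1))
        # Unknown pages get a placeholder of the same width as a real number.
        return "?" * RESERVED if page is None else str(page).zfill(RESERVED)

    lines, found = [], {}
    for label, text in blocks:
        found[label] = len(lines) // LINES_PER_PAGE + 1
        lines.extend(textwrap.wrap(REF.sub(render, text), LINE_WIDTH))
    pages = [lines[i:i + LINES_PER_PAGE]
             for i in range(0, len(lines), LINES_PER_PAGE)]
    return pages, found


def two_pass(blocks):
    _, estimate = compose(blocks, {})    # pass 1: estimate pages
    return compose(blocks, estimate)     # pass 2: compose in the reserved space


blocks = [
    ("intro", "See page {ref:end} for the conclusion. " * 3),
    ("end", "The end."),
]
pages, final = two_pass(blocks)
```

Because the placeholder and the real number occupy the same width, the page numbers found in pass 2 match the estimates from pass 1. That is exactly the property the generous reserved space was meant to buy.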
The idea that you could divide the same text into different length pages is obvious, and quite early on even the electronic typesetting programs allowed draft modes (or provided alternative macros) for producing proofs. Given the requirement of some publishers for double-spaced manuscripts, the idea of separating structure and presentation, ascribed to Charles Goldfarb and (independently) Brian Reid, does not seem a big leap to us nowadays. Multi-publishing and retargeting became commonplace in the SGML arena, with the advent of declarative stylesheets looming for a long while, but the next really big step came with the advent of the WWW and the impact of resizable windows on formatting.
One of the most important ideas following from the separation of presentation (into stylesheets) and content has been the formalization of the page-flow model (frames), which was championed by Frame Technology Corporation's FrameMaker, though the simpler concept of regions was of course older. The idea is that you "pour" the text into the frames and it flows, breaks and causes new pages where it will.
In my blog yesterday, I mentioned that the transformational approach of stylesheets in XML (the DSSSL, XSL-FO streams) is only loosely coupled with the typesetting engine. (Or formatting engine: some people think that word processors don't do typesetting, and I don't want to get hung up on terminology.) So there are some kinds of page design rules that are impossible, if only because the developers cannot be aware of every design rule anyone might want to make.
The separation also impacts another area: document interoperability. I have written several blogs referring to Markup's Dirty Little Secret, which is that because everyone's system is different, and each system's algorithms, resources and capabilities are different, you cannot expect perfect fidelity to the extent of the same line and page breaks when exchanging XML+stylesheet documents (such as OOXML, ODF, DocBook, you name them). This goes quite against the expectations of some users (though I think people are much more realistic about this now than two years ago) and quite against the hard requirements of others (for example, people who need fixed page numbering for legal requirements).
In yesterday's blog, "Standardization as a collective loss of imagination?", I suggested that users may need to assert themselves to keep the standardization of the current round of office application formats out of a particular pitfall: losing sight of the centrality of page (and document and information) design, which is about how to help people communicate rather than how to add the latest pet feature from some vendor. Not that pets are not fun and valuable.
Hinting at our priorities
The tie-in between that suggestion and the page-fidelity problem (which is really an interoperability issue) is that I think we need some more imagination about whether our current re-pour-each-time model of formatting is actually good enough if we genuinely want substitutability of office applications. People don't want to be sold a turkey.
Now SGML did provide processing instructions, a kind of markup that still exists in XML, for applications to add extra information that belongs to formatting, for example. The ArborText Publisher program used them very successfully, with processing instructions that let you force page and line breaks in certain places, for example. That is one way of integrating page markup, but it is not what I am suggesting (for various reasons).
At the moment, I think that a much better approach would be to add a kind of castoff hint as an attribute to each block-level object (paragraph, list item, table cell, etc.). This would be added to the XML markup by the formatting engine as a hint, to enable a subsequent formatter to try to get the same results.
The first time data came into a document, the normal composition mechanisms would apply. But the document's block structures would also be decorated with these hints at save time, and subsequent opens of the document would use these hints as well when composing the pages.
For example, the castoff hint might be as simple as an attribute giving the bounding box of the block on the page. The composing system would use the differences between these bounding boxes and the bounding boxes it wanted to use as penalties to adjust line feathering (or even margins, padding, breakpoints, spacing, text size).
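To make the idea concrete, here is a minimal sketch in Python. Everything specific in it is an assumption for illustration only: the attribute name `castoff`, its "page x y width height" value syntax, the units, and the penalty weighting are hypothetical and are not taken from any existing format or product.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple, Optional


class Box(NamedTuple):
    """Bounding box of a block on a page (units are whatever the format fixes)."""
    page: int
    x: float
    y: float
    width: float
    height: float


def read_hint(element: ET.Element) -> Optional[Box]:
    """Parse the hypothetical castoff attribute; None if the block has no hint."""
    raw = element.get("castoff")
    if raw is None:
        return None
    page, x, y, width, height = raw.split()
    return Box(int(page), float(x), float(y), float(width), float(height))


def write_hint(element: ET.Element, box: Box) -> None:
    """Decorate a block with its composed bounding box at save time."""
    element.set(
        "castoff",
        f"{box.page} {box.x:g} {box.y:g} {box.width:g} {box.height:g}",
    )


def penalty(hint: Box, proposed: Box, page_weight: float = 1000.0) -> float:
    """Score how far a proposed placement strays from the saved hint.

    Landing on a different page is weighted heavily (an arbitrary choice);
    geometric drift is the sum of absolute differences.
    """
    drift = sum(abs(a - b) for a, b in zip(hint[1:], proposed[1:]))
    return page_weight * abs(hint.page - proposed.page) + drift


doc = ET.fromstring(
    '<doc><p castoff="1 72 100 451 36">Some text.</p><p>No hint yet.</p></doc>'
)
first, second = doc.findall("p")
hint = read_hint(first)

# Two candidate placements the composing system might be weighing.
candidates = [Box(1, 72, 100, 451, 48), Box(1, 72, 100, 451, 36)]
best = min(candidates, key=lambda box: penalty(hint, box))
```

A formatter could feed a penalty like this into whatever cost function it already uses for line and page breaking. A block without a hint, like the second paragraph here, simply falls back to normal composition.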
Auto-sizing is not completely unknown: WordPerfect had a patent on automatically adjusting various page parameters to make sure some range of text fitted on a single page. And many people are aware of the behaviour of some page-oriented systems, such as presentation programs, which automatically resize text (including nested text lists) to fit into the available space.
It could be user selectable whether to freeze the page according to the block hints or just use them as hints, or ignore them. As a hint, it wouldn't interfere with minimal implementations.
I don't completely agree with the notion of embedding presentation metadata along with the content markup (which in my understanding is what you're suggesting) - after all separation of content and presentation is in place for many valid reasons.