Breaking the models

by Simon St. Laurent

A week ago, C. Michael Sperberg-McQueen closed the Extreme Markup Languages conference by questioning models. I wasn't sure what I thought about it at the time, but this time, his closing sermon is reverberating.

Sperberg-McQueen's question was:

Does XML have a model? A supermodel? Does it matter?

After talking about several senses of "model", including model trains, which I especially appreciated, Sperberg-McQueen talked about people who ask for "the model" for XML, or charge that the entire project is fatally flawed because it doesn't have one. While the word "model" is extremely difficult to define and seems to change from situation to situation, it gets used as a bludgeon on a regular basis.

Model questions have caused plenty of grief in the XML world, with the Infoset standing in as a model for some people, the DOM for other people, RDF for others, and application-specific models for others. There seems to be a deep and conflicted emotional need for models. Even within an application domain, there may be questions about the value of particular models, demonstrated by the continuing battles over various kinds of models for Topic Maps discussed at the conference.

Sperberg-McQueen's question seemed to fit the overall conference very well. A presentation by the W3C's Eric Miller (co-authored with Sperberg-McQueen) on integrating RDF with XML through XML Schema had faced some fairly hostile questions from people asking what value the RDF model added to the project. In a series of presentations on overlapping markup, it was clear that there are cases where the same document can and indeed should offer applications the choice of multiple ways to model it.

After years of disagreeing with Sperberg-McQueen about various aspects of the XML universe, I'm happy to say that he's inspired some new lines of thought for me, and reinforced others. (I doubt he'd agree with all of these, of course!)

I first saw XML as an effort to trade a bit of discipline on the part of markup creators for improved processability, but the balance has lurched entirely too far toward processing and away from people, driven by an obsession with machine-comprehensible models. By re-examining the question of models and rejecting the notion that there must be one true model (even for a given project), I'm freed to accept the human reality that even simple documents have multiple interpretations.

I can embrace Walter Perry's long-standing argument that understanding happens in the processing applied to a document, not the external constraints used to describe it.

I can take a fresh look at the Layered Markup and Annotation Language (LMNL) and enjoy that it provides a syntax for what I want to do without concern for the issues that bedevil the model.

I can't say that I expect this viewpoint to catch fire too rapidly, as it goes severely against the grain of so much of computing culture, but it does a nice job of explaining how XML syntax caught on so wildly even if the model wasn't clear. Letting go of ambitions for complete communication or universal exchange makes it much easier to focus on particular projects that solve immediate problems, using only as much standardization as a particular problem needs.

After all, we're just exchanging data, right?

Do models get in your way sometimes?