More on XML class warfare

by Uche Ogbuji

In this xmlhack article I summarize the XML-DEV discussion of my ADT Magazine article, XML class warfare:

Discussion of the role of data typing and other artefacts from programming languages and classical DBMS never seems to end on XML-DEV. Simon St.Laurent fired up the debate again with a twist by linking Uche Ogbuji's article XML class warfare. The article made whimsical reference to the debate as a "class war" between "bohemians" and "gentry". A long and very interesting sequence of threads ensued.

I also received some very interesting personal messages related to the article and the XML-DEV thread, which I don't include in the xmlhack article.

Norman Samuelson wrote:

I have some thoughts on your article on XML Class Warfare.  I have been 
working for the last few years on a project named Cyclops. It is centered
around a GUI that can be used to set up input files for large physics
simulation programs.

I have an XML file for each of the physics codes that Cyclops supports,
called a Parameter Description File. In this file I define every parameter
that the code allows in input. I specify whether it is an integer, real,
boolean, text, etc. The Cyclops GUI uses this to create an interface with
a field for each possible input, arranged in multiple windows related to
various parts of the problem definition (such as geometry, materials,
control, and various physics concepts).

For real numbers (this is where it ties back to your article), I don't have
to worry much about different number formats, but I do have to support
multiple systems of units. If I have a time value, I may be asked to
display it in seconds, or microseconds, or some other units. If it is
pressure, it may be in Pascals, or in MegaBars. If it is temperature it
may be in Kelvin, or it might be in electron-volts (or MeV). And so on for
many classes of units.

It seems to me that keeping those things straight is the responsibility not
of XML, but of my code, with direction from the XML in the form of
attributes that tell me that one number is pressure and another is
temperature. I agree with you that XML should not be called on to do
absolutely everything for everybody. Its job is to carry text from one
application to another in such a way that the application can make sense of
that text.

Very well put, Norm. My mantra in these discussions has been that the most important parameters of the data we process almost never fall along the lines drawn by the designers of "standard" data types. Issues such as units (which, I think, are an even bigger safety consideration than integer/float distinctions) are not amenable to universal data typing. In the end, safety and interoperability are a matter of careful axioms constructed by competent programmers and subject-matter experts. Strong data typing is mostly a false crutch that collapses very readily.
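
To make Norm's division of labor concrete, here is a minimal sketch in Python. The element and attribute names (param, quantity, unit) and the conversion table are my own illustrative assumptions, not anything taken from Cyclops or its Parameter Description Files. The XML only labels each number; the application code owns the unit handling.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the "quantity" and "unit" attributes are illustrative
# assumptions, not the actual Cyclops format.
DOC = """
<parameters>
  <param name="burn_time" quantity="time" unit="microsecond">250</param>
  <param name="ambient" quantity="pressure" unit="megabar">0.5</param>
</parameters>
"""

# Factors that convert each unit to a base unit (second, pascal).
TO_BASE = {
    "time": {"second": 1.0, "microsecond": 1e-6},
    "pressure": {"pascal": 1.0, "megabar": 1e11},
}


def read_in_base_units(xml_text):
    """Return {name: (quantity, value converted to the base unit)}."""
    result = {}
    for param in ET.fromstring(xml_text).iter("param"):
        quantity = param.get("quantity")
        factor = TO_BASE[quantity][param.get("unit")]
        result[param.get("name")] = (quantity, float(param.text) * factor)
    return result


def display(value_in_base, quantity, unit):
    """Convert a base-unit value to the unit the user asked to see."""
    return value_in_base / TO_BASE[quantity][unit]
```

Calling read_in_base_units(DOC) gives the application values in one consistent system, and display() converts them back to whatever unit a user requests. Nothing here depends on a schema language knowing what a pressure is.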

In one of my XML-DEV messages I wrote:

Here is what I'm asking for, more specifically: separate 
static typing and other schema-annotation-specific specifications to a separate
document which is an optional augmentation to XPath 2.0. Also, provide
XPath 2.0 with generic extensibility that allows people to plug in
alternative augmentations according to their preference.

(I guess "specific" was my favorite word just then)

Mark Seaborne quoted this to respond:

Some while ago a tentative suggestion was made that it might be quite useful if XForms was not tied so closely to W3C XML Schema and data types, so that one could, in theory, associate a form with other schema languages capable of expressing a different set of constraints to WXS.

Interestingly, XForms itself defines a set of constraints (based on XPath expressions) to augment those supplied by WXS (e.g. element x is only relevant if the sum of the values of elements p and q = y). That such expressions are equally useful however an XML instance is generated, and that there are already alternative (non-W3C controlled) ways of expressing such constraints must be obvious to those on the working group. I once commented that XForms has invented its own, mini schema language, but was emphatically told that this is not so. However, we are seriously looking at augmenting the WXS schemas we already produce, with XForms defined constraints. To my way of seeing things this makes XForms constraints at least a component of a schema language.

I thought part of the point of XML was to allow people to combine technologies in ways not necessarily envisaged (to use a fast disappearing verb) by their originators. The W3C (apparent) approach of creating strong interdependencies between everything and WXS + WXS data types, I think does no one any favours in the long run, and may shorten the lifespan of some of the otherwise useful work it is currently doing.

Yours, somewhat frustrated

And in an interesting follow-up a few days later, Seaborne added:

Interestingly, since I emailed you, my employer (a W3C member) has put me forward to join the new XForms working group that will start on version 2 next year. Having said that, my employer is heavily committed to promoting WXS, so officially I don't think we mind the tight binding between it and XForms. Not that I have any problem with using WXS schemas (and data types) with XForms, I just think it would be nice, if technically practical (and I, perhaps naively, don't really see the problem), to have a choice.

The fact that the current CR for version 1 has already had to build on the foundation of WXS to come up with something usable suggests that WXS is not, by itself, adequately expressive for applications that need to generate XML instances (as opposed to just validate existing ones?). I would be keen to try Schematron (with or without WXS), which already has the concept of differing states of validity at different points in an XML document's life cycle, and is already built over XPath (like XForms binding constraints). To me it looks as though it might be a good fit for rules driven, XML generating applications, forms based or not. Not that I have ever tried ....

Oh well, back to writing WXS schemas!

I asked my colleague Micah Dubinko about this because he's on the XForms WG, and he admitted (speaking personally and not in any official capacity, mind you) that the strong tie between XForms and WXS is a problem. I do hope the XForms WG finds a way to open up to other schema and constraint expression systems. I think they already have the needed building blocks in their model item properties and bindings constructs.
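
As a rough illustration of why such constraints need not be tied to any one schema language, here is a small Python sketch of Mark's example rule (x is only relevant if p + q = y). This is not XForms or Schematron syntax. The rule list, the function names, and my reading of "not relevant" as "should be absent from the instance" are all assumptions made for the sake of the example.

```python
import xml.etree.ElementTree as ET


def number(instance, path):
    """Read the text of a child element as a float."""
    return float(instance.findtext(path))


def x_only_when_sum_matches(instance):
    # Assumption: an x that is not relevant should simply be absent.
    if instance.find("x") is None:
        return True
    total = number(instance, "p") + number(instance, "q")
    return total == number(instance, "y")


# Each rule is (message, predicate). The predicate returns True when the
# constraint holds, regardless of how the instance was produced or which
# schema language (if any) also validates it.
RULES = [
    ("x is present although p + q does not equal y", x_only_when_sum_matches),
]


def violations(xml_text, rules=RULES):
    """Return the messages of all rules the instance fails."""
    instance = ET.fromstring(xml_text)
    return [message for message, holds in rules if not holds(instance)]
```

The same rule list could be run over an instance produced by a form, a script, or a hand edit, alongside WXS, another schema language, or none at all.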

I also received a handful of messages with no new insights, but just expressing support for the "bohemian" point of view and dismay at the layers of complexity the W3C appears to be molding over everything it produces lately.

This debate clearly has plenty of legs left in it.