Where's XML Going?

by Kurt Cagle

Recently I passed both my 44th birthday and my 15th wedding anniversary, just signed my daughter up for high school and was told by my doctor that my HDL was soundly thrashing my LDL. My beard, which I've worn since my early twenties, is now streaked with gray (a curse of red hair, I fear), and I notice that lately the stairs seem to have mysteriously begun to grow from one trip to the next. T.S. Eliot is beginning to become ... relevant ... to me. All signs, perhaps, that I am no longer the young spring chicken I once was.

As I was thinking about things to write for this particular column, this realization about age began to sink in about the standard that I've spent the last decade writing about. A decade is a long time in computer circles, especially when you figure that there have only been five or six of them in the whole history of computing. XML has gone from being a "standard" that perhaps a couple dozen people worldwide knew about to a pervasive technology that is so well entrenched that many people don't really even think much about it any more. We argue about the XMLification of word processing and spreadsheet programs, we debate whether Atom or RSS 2.0 will predominate, we shake our heads at the whole notion of web services and how the dominant web services protocol was designed largely by bloggers to let people know about their websites.

In short, while XML is not exactly doddering off to the rest home, its angle-bracket knees are no longer as flexible as they used to be. If it were a person you'd expect it to be muttering about those damn JSON punks and how property taxes and inflation are eating up its standard of living. It no longer is as flashy a technology as it used to be (even as Flash has been migrating to an XML format), and more than once I've run into twenty-something AJAX hot-shots who declare XML so yesterday (even as they write applications that bind AJAX objects to XML structures). It's become the establishment, though in many respects I suspect that while its glory days are behind it, XML is becoming more integrated into the fabric of computing.

To that end, I wanted to offer up an assessment of where XML itself is going. As always, this is written by a guy in a coffee-shop, so take it with the usual assortment of saline condiments:


Sylvain Hellegouarch
2007-07-20 06:12:47
> (...) those damn JSON punks (...)

Priceless Kurt! :D

2007-07-20 07:12:11
Everything we told the HTML punks would happen did happen, and everything the JSON punks tell us will happen will happen.

My son turned 18 yesterday. I can describe almost every minute of the day 18 years ago but only snippets of what has happened since. On the other hand, it was all *inevitable*. So JSON supporters take heart, because we float on a sea of relentlessly churning technology which is short-cycle unpredictable and long-cycle boringly predictable. Your days to whine, shine, and then recline will come too.

By the time you finally kick back, you won't be asking yourself if the technology wins matter. You will be looking quite personally and locally at what you did with it and how that affects your 18 year old kid. You will measure your success in the ambition in his eyes to best you, and you will be satisfied or dismayed to the degree you made that possible. Then it will be his day or her day.

Don't weep for the stuff. Cheer for the mammals.

Kurt Cagle
2007-07-20 08:25:29

True, all too true.

I think that AJAX and JSON will have a huge impact in the long term, but I suspect that their wins will likely be in areas far outside of their current focus. Over time, technologies that work together tend to reach a balance. For JSON, I see its role being that of RNA to XML's DNA: both are necessary, and RNA does not have quite enough structure and integrity to serve by itself as a foundation for stateful storage of content, but it makes a compelling envelope. Get a lightweight parser like E4X in place, and you have a powerful package.
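
One way to read the "envelope" idea, sketched here purely as an illustration: a lightweight JSON message carries routing metadata, while the structured content it delivers stays in XML. The sketch below uses only the Python standard library. The helper names `wrap` and `unwrap`, the `note` document, and the metadata fields are all invented for this example and do not come from any real protocol.

```python
import json
import xml.etree.ElementTree as ET


def wrap(xml_text, **meta):
    """Return a JSON envelope (as a string) carrying an XML payload plus metadata."""
    # Parse once so a malformed payload fails here rather than at the receiver.
    ET.fromstring(xml_text)
    return json.dumps({"meta": meta, "payload": xml_text})


def unwrap(envelope):
    """Return (metadata dict, parsed XML root element) from a JSON envelope."""
    message = json.loads(envelope)
    return message["meta"], ET.fromstring(message["payload"])


# Hypothetical document and metadata, for illustration only.
doc = '<note id="1"><to>Kurt</to><body>Priceless!</body></note>'
envelope = wrap(doc, sender="sylvain", kind="comment")

meta, root = unwrap(envelope)
print(meta["kind"], root.find("to").text)  # prints: comment Kurt
```

The JSON layer stays trivial to produce and consume in a browser, while the payload keeps its attributes and element structure, and could be validated or transformed with XML tooling on either end.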

As to your comments about kids besting their old man, I could not agree more. Raising girls I think the dynamics are a little different, but not dramatically so (my oldest has nearly no interest in computing, but she's going to be a hell of an artist, and the competitive spirit with her dad is alive and well there, thankfully ... of course, my youngest has figured out the game editor on one of my Linux games at the age of seven completely by herself, so yeah, the punks are alive and well ...).

Thanks for the perspective.

2007-07-20 11:12:22
Love the Insight...
I am a young punk (28) who really sees the value of XML. As more 4G languages are speaking directly to web services (which are spitting out MBs of XML), I have the urge to get my new 'XML' tattoo right on my neck. Underneath all of this Web 2.0 buzz is the power of XML (portable DBs) flying through cyberspace... I wish more of my generation would look a little deeper into the data, rather than CFing everything and swearing they know the way.
Thanks for the Insight.
2007-07-20 11:54:26
So, I'll admit to being something of an XML newbie—being that I've been an unemployed stay-at-home dad since January 2001, right around the time when XML seemed to be getting hot.

I am, however, a fifty year old former IT professional with almost 20 years experience who is currently enrolled in an XML Programming class—something that gets mildly griped about a tad on my blog—and I must say that your post is exceptionally full of perspective that intrigues and excites.

Indeed, much in your post is beyond what I know about the technology at the moment, but that won't stop me from digging deeper into any of that. My Advisor at Saint Paul College told me XML was one of the topics I should embrace and it seems he's quite correct.

Great post. I'll be back . . .

2007-07-20 12:37:10
My daughter scares me worse. My son is level headed and wants a career in computer science and game design. My daughter wants to rule the world and says she has to knock off the old man first because he is smart and will foil her. She's right on item 2. The daughter will break the heart worse because we have absolutely no defense.

XML is here to enable a system to interoperate with the most liberal contract possible. Ever since its release, most of the churn has been in tightening up the contracts and that leads to some insanely complex and inanely simple ideas. I still consider it a minimal amount of control where a minimal amount is the best thing possible across the domains, and local control strength is left to the local voters.

No size fits all. Some sizes fit most uncomfortably. What I'm seeing done to Rick Jelliffe (arguably one of the most level headed guys in the markup business) demonstrates the costs of being civil when the 'wisdom of crowds' turns into a lynch mob. That twig was bent a long time ago. There are times when the crowd should get what it wants and see if it can live with it.

The sad thing is, if he were a girl and those posts were being put up, a lot of folks you and I know would be out there blogging to defend him instead of benefitting by it. I weep for the web when I see this.

2007-07-20 15:52:10
Or an Apple user... It's sad that the web should and can enable us to communicate better with one another, and yet people still take an iota of information about a person and extrapolate the rest from there, whether the topic is ISO standards or the iPod vs. Zune debate. It puts me in mind of The Secret Life of Walter Mitty: "Your small minds are musclebound with suspicion. That's because the only exercise you ever get is jumping to conclusions."
2007-07-21 04:29:54
XML has failed in many places. JSON is a children's toy. AJAX is for 'inventors of wheels'.

Nothing here shows that XML has managed to integrate and interoperate X with Y in some great engineering fashion. To do those you have to do that kind of work every day, which sadly means it failed to do it. Simple.

Kurt Cagle
2007-07-21 09:54:48

I would not agree with that assertion at all. XML is the worst of all technologies, except for everything else. It's a generalist solution, and like many generalist solutions it is often not as efficient at accomplishing a task as a specialist solution, but nonetheless because it provides a tool for abstraction it makes solving a broader range of similar problems possible in a given domain.

Consider HTML for a second. Many people considered HTML too simple when it first debuted, that it was "a mere toy", yet HTML is now more widely used than any other computer language by a wide margin. There is more JavaScript (another toy language) in use now than there is C++, Java, COBOL, Perl, Python and Ruby combined. Yet because neither one of them came with the mind-numbing complexity that is so common with strongly typed imperative languages, they've long been dismissed as being too primitive to be useful. If, as I do, you see HTML as an early form of XML, then what this tells me is that, far from being a failure, XML has in fact largely replaced a fairly significant swath of the tools written in those other languages, and is well on its way to replacing the rest within a couple of decades.

Simplicity seems to go hand in hand with elegance, I've noticed, and that holds as true for computer languages as it does most other endeavors.

2007-07-21 14:20:15
I like JSON and I like XML. Both have their places. JSON is simple and compact and works beautifully with JavaScript (not surprisingly, as it is JavaScript ...). XML has the advantage of Schema and specifications and hence is preferable for interoperability. The biggest threat to XML is not from JSON but from DSLs which offer to be both more compact and descriptive than XML and still easy to validate for correctness.
Whether I prefer a set of different DSLs to the verbose but always similar beast called XML is however a different question ...
2007-07-22 11:06:38
it is nice article
2007-07-22 11:06:58
it is nice article. I like it ! Thank you !
2007-07-23 06:08:19
Today I use XML when I am designing a document page that is also a web page that is also an application page that is also transformed into a printed form... yadda. The change most completely wrought by the HTML/XML/XSLT trinity is that when we design an application today, we are designing a document. It is so obvious we seldom mention it but the triumph of the hypertext community over almost all of the rest of computer interface designs would be what I would consider most notable were I to have slept from 1985 to the present day.

Straight out, I am using ASP 2.0 with those seductive data binding providers. So I can create some fast XML instances given I can type elements and atts faster than I can design a table, bind to them, get a notional screen up and running and move on to the next one. Is it a good way to work? Possibly not given I will eventually bind to a final table design, but it more nearly resembles what I want the customer to work with than when I start from the tables. Then I throw away the XML or keep it for the on-the-wire designs.

How many of you craft views/XML instances and then come back to the table design later?

2007-07-23 06:45:52
I liked today's topic, lately I've been tempted to think of XML as "old" but that's not the case. xschema seems like it's still in infancy, judging from the tutorials, and people (myself included) are only just now being made aware of schematron.
Kurt Cagle
2007-07-23 08:17:17

I bind to XML in various ways; the ASP.NET bindings are certainly one of the features to like about the ASP.NET 2.0 framework. I use XForms some for this task - a table in XForms is a handful of elements for explicitly mapped bindings and a couple of elements for a generic mapping, though of course this isn't assuming CSS maps. I find I'm using XSLT less and less for this particular task, though I'll typically resort to that route if I can wrap the XSLT in an XBL or similar structure.


Yup. My take is that a lot of the W3C technologies in particular are finally JUST beginning to percolate into public consciousness - XForms, XQuery, RDF, SVG, etc, all of which are likely to cause some very interesting changes if they get widely adopted.

Dan Sickles
2007-07-24 22:09:45
A nod to XML as a native data type seems appropriate. E4X (as already mentioned), Scala and XLinq come to mind. There was a proposal to add this to Java 7 but I haven't seen any activity on that lately (no surprise, it was controversial and with everything else queued up for v7 it may not stand a chance anyway). The native XML plus pattern matching in Scala has been eye-opening for me. It's almost a guilty pleasure; like I'm eating candy when I should be "coding". Nothing a little Java/DOM can't cure;-)

Native XML certainly isn't the "right thing" for every language but E4X and Scala/XML hit a sweet spot for me. On the other hand, flying pigs will be ancient history before Python has native XML and I agree with the Python community's thinking on this.

Do you see native XML becoming mainstream? What about the rise of functional languages and the builder pattern implementations in ruby and Groovy etc?

2007-07-25 06:03:30
It depends on what you mean by 'native XML'. Isn't a DOM 'native XML'? If an XML file is supported by an XML data provider, isn't that 'native XML'? X3D, XAML, XUL, SVG, aren't these 'native XML'?

XML is a syntax, not an application. That distinction makes a world of difference in how or why 'native XML' is supported. One can create a 'native XML' database but effectively, it is just a hierarchical database with special provisions for XML syntax and the oddity of mixed structures.

Dan Sickles
2007-07-25 07:33:15

I'm using native XML to mean that literal XML is recognized as a core data type by the language compiler without explicitly importing libraries or invoking another compiler/processor. My intent was not to make a hard distinction but to recognize what's different in the context of Where is XML Going. It's still XML but a native data type does change how developers interface with it.

2007-07-26 06:11:05
Ah. No disagreement with that. Back in the 1980s when discussing SGML at a design meeting, Charlie Sorgi from what was then Mentor Context made the statement that one day SGML would simply be a checkmark in a list of product features. A generation later that is pretty much the case. Maybe the answer to 'where is XML going' is 'nowhere'. It's just there.
2007-08-01 11:50:52

What leads you to say that,

"HTML 5.0 will be XML based, it's just a question of how much core technology will separate it from XHTML 2. There is no valid reason for HTML not to close its tags, quote its attributes, and respect namespaces..."

It seems that some members of the HTML 5 working group are approaching the issue from the opposite angle. That is, they are saying there is no valid reason for HTML to close its tags, quote its attributes, and respect namespaces.


Kurt Cagle
2007-08-01 15:22:16

I think the key to that statement is "Some members of the HTML 5 working group ...". There are two or three individuals that seem wedded to an extraordinarily conservative approach to web development working with the W3C, usually invoking either some mythical great aunt who works with HTML or some pre-existing customer base that would face incredible hardship if XHTML was used. However, while this argument is in fact fairly compelling in the face of the HTML 4.0 specification, it's an asinine argument for HTML 5.0; the changes involved will necessitate that a new DTD be established and will necessitate changes in both HTML interpreters and renderers, and the argument for the most part works against three significant changes since HTML4.

First, the idiot HTML coder theorem, to which I would reply that a number of alternative markup schemes have appeared in the wild since HTML4, such as BBCode or WIKI code, that assume some form of preprocessor interpretation. Most such schemes include the notion of explicit closure and require a sufficiently sophisticated understanding of attributes that the notion of quoting such attributes pales into absurdity in comparison. Given that these are actually used quite successfully, especially in client facing text entry mechanisms, the notion that sloppy validation is a core aspect of any HTML specification for adoption is absurd on the face of it.
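
To make "explicit closure" and quoted attributes concrete, here is a minimal sketch of how an off-the-shelf XML parser treats the two styles of markup. It uses Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` and the sample strings are invented for illustration, and the check covers well-formedness only, not validity against any HTML or XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the string parses as well-formed XML, else False."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# XHTML style: every tag closed, every attribute value quoted.
xhtml_style = '<p class="intro">Hello<br/>world</p>'

# HTML 4 style: unquoted attribute value, unclosed <br> and <p>.
html4_style = '<p class=intro>Hello<br>world'

print(is_well_formed(xhtml_style))  # True
print(is_well_formed(html4_style))  # False

# Namespaces survive parsing, so generic XML tools can tell vocabularies apart.
ns_doc = '<html xmlns="http://www.w3.org/1999/xhtml"><body/></html>'
root = ET.fromstring(ns_doc)
print(root.tag)  # {http://www.w3.org/1999/xhtml}html
```

The second string is perfectly acceptable to an HTML 4 browser, but no generic XML tool (XPath, XSLT, XQuery and so on) can consume it without a separate tag-soup parser in front.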

The second facet is that increasingly rich text content generators on the client often hide the production of that HTML code (the number of sites that incorporate some form of JavaScript-based Rich Editor is growing dramatically) so that those people who do not know/cannot understand HTML are not in fact in a position where they are explicitly generating that code. This does place a requirement on the developers of such functionality to support XHTML based encodings, but this is happening anyway for other reasons (largely feed syndication).

The notion of dealing with code for an existing customer base also does not apply here. HTML 5 is a new standard. In order to support it, vendors will need to change their code anyway in order to work with new HTML declarations. In most cases, when the 4.0 specification rolled around, most vendors were only just beginning to become aware of XML and were still assessing the role that the specification would have in their operations, whereas today, XML has become an integral part of the web in more ways than even the original designers anticipated. Those vendors' customers want XML based solutions to work with their increasingly XML based document workflow systems, and they are much more aware of the hazards of splitting core technologies down two competing paths.

Finally, I would hazard (from my own experience with those individuals) that many people within the W3C have become leery of working with them and view their efforts as being frankly counterproductive. The WHATWG effort was a direct effort to wrench control of HTML from the W3C standards process, in part because of the belief that the W3C solutions were too esoteric and not sufficiently responsive to the needs of web developers. While I think there is some validity to that argument, those same individuals were involved in both the W3C and WHATWG, and the point can be made that the efforts made by those individuals often amounted to arguing points in order to keep the W3C from coming to resolutions that would have actually pushed those technology standards out the door.

I think it can be argued that much of the WHATWG "standard" has far more to do with what has increasingly been seen as AJAX related work - data stores and persistence, graphics in 2D and (theoretically) 3D, programmatic bindings and behaviors, enhanced validators in HTML fields and so forth, things that have traditionally sat above the HTML stack. I see HTML 5.0 as recognizing the need to embrace that higher stack, but to do so in a way that is consistent with related W3C specifications. HTML 4.0 is not compatible with most of these, the exceptions being CSS and (to a certain degree) DOM. However, if HTML 5.0 is predicated on a minimal XML basis, it becomes far easier for the W3C to assert compatibility with its other technologies, from XPath to RDFa.

It's for these reasons that I just do not see the looser validation requirements of HTML 4.0 making their way into 5.0. The adoption of these may be in the interests of a small number of individuals who do not wish to see XML succeed, but given that XML has long since established itself this action can only be seen as grandstanding - it works against the interests of the W3C, against organizations that use XML for content management and against web developers who increasingly see XHTML as a critical piece of pipeline architectures for moving data through their systems. Great Aunt Bernice frankly doesn't care - she uses a WIKI - and the vendors should realistically see an XML-aware HTML 5 as being an attractive bullet point for new products.

I'd personally like to see a legitimate argument FOR the non-adoption of XML notation ... I haven't to date, and frankly I think it would have to be an incredibly compelling argument to make up for all the negatives that it introduces.