Just say no to XML?

by Steve Anglin

TheServerSide.com is pointing to this SDTimes column by IT expert and author Allen Holub: Just Say No to XML.

31 Comments

Sancho Neves-Graca
2006-09-29 13:04:05
The paragraph you quoted is just about all I want to read from Mr. Holub. It said it all about his inadequacies with XML technologies. The problem is with Mr. Holub, not with XML technologies.
Trevor Harmon
2006-09-29 14:00:20
I think Holub's absolutely right. XML is great for describing data, but it falls down when people try to use it to describe control flow. The only reason people use XML for control flow is simply because they can use existing XML tools for validating and parsing the input. In other words, XML tools have become "the poor man's parser". It saves work for the architects but makes life difficult for people who are then forced to use XML for control flow.


Even the creator of Ant says it was a mistake to use XML. He admits he was simply too lazy to write a proper parser. See:


http://www.theserverside.com/news/thread.tss?thread_id=24864
http://blogs.tedneward.com/2005/08/22/When+Do+You+Use+XML+Again.aspx

grrrr
2006-09-29 14:03:13
@Sancho Neves-Graca
Only there is no such thing as XML technologies. What did XML bring in the way of new algorithms or concepts? None, to be exact. It is just a standard. You may need a standard to cooperate on the net, but it is far worse than the tools it replaces, like yacc or lex, for creating languages.
Kurt Cagle
2006-09-29 20:09:58
XML is simply an abstraction mechanism. It provides a way to describe structured information in an internally consistent manner. There are occasional bad XML schemas out there, just as there are many bad Java programs and programming APIs, but I would agree with Mr. Neves-Graca on this - the article points out far more the fact that Mr. Holub really doesn't understand what XML is and why it is in general a very good thing (or believes that it threatens his favored language, Java, and so will attempt to deprecate it).
Al Potson
2006-09-29 20:39:49
Yes XML
---------


-Easy to understand. Just give me the XSD/DTD and I will figure out
the syntax. No need to learn proprietary syntax.
-Parsers available
-Can share XML across languages (.NET, Java, PHP)
-Editors already available for XML
-No need to learn jsf, jdo, hibernate, spring, ant specific syntax.





Jack Meoff
2006-09-29 20:49:41
Holub does not elaborate on the "enormous cost" that XML brings upon others.


You don't have to write a parser
You don't have to write an editor
You don't have to validate your script


Just what is that enormous cost, Mr Holub?



JCN
2006-09-29 22:51:47
But where can I learn to write a compiler and be a "real" programmer? Oh, Mr. Holub just so happens to have written a book on exactly that topic. He was even kind enough to put a plug in for it. How kind of him.


So if I have a great new idea for a really useful tool, I'll go learn how to write a compiler. Then I'll write the tools I need to build a compiler. Then I'll create a new scripting language. Then I'll write the compiler/parser for that language, and then I can actually write the tool. That's how "real" programmers do it.


And can someone explain to me how object/relational configuration files *aren't* modeling data? Because last time I looked, that's exactly what they were doing.


BTW, anybody remember back in, oh say '96 or '97 when everyone was complaining about the ridiculous number of specialized scripting languages there were, and what a pain it was because you had to learn syntax for half a dozen different tools just to do the simplest things? Yeah, apparently those were the good old days.

grrrr
2006-09-30 12:01:31

"-Easy to understand. Just give me the XSD/DTD and I will figure out
the syntax. No need to learn proprietary syntax."
but you are studying a proprietary syntax described in an XSD/DTD, and a very awkward description at that
"-Parsers available"
you still need to construct a parser, same as with yacc
"-Can share XML across languages (.NET, Java, PHP)"
You can share unicode or ascii files
"-Editors already available for XML"
There are editors for unicode or ascii files
"-No need to learn jsf, jdo, hibernate, spring, ant specific syntax."
Yes, right. If I give you hibernate and no documentation on the xml, can you use it???
those kinds of XML claims are all nonsense
also writing a recursive descent parser from scratch is probably still easier than using sax or dom
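
For readers who have not used SAX, here is a minimal sketch of the event-driven style being referred to. It uses Python's standard xml.sax module rather than the Java API discussed in this thread, and the element names (mapping, class, property) are made up for illustration; they are not Hibernate's actual schema. The point is only that the handler has to keep track of "where am I" by hand.

```python
import xml.sax


class MappingHandler(xml.sax.ContentHandler):
    """Collects {class name: [property names]} from a hypothetical mapping file."""

    def __init__(self):
        super().__init__()
        self.mappings = {}
        self._current = None  # name of the <class> element we are inside, if any

    def startElement(self, name, attrs):
        # SAX only reports events; nesting must be tracked manually.
        if name == "class":
            self._current = attrs["name"]
            self.mappings[self._current] = []
        elif name == "property" and self._current is not None:
            self.mappings[self._current].append(attrs["name"])

    def endElement(self, name):
        if name == "class":
            self._current = None


def parse_mapping(xml_text):
    handler = MappingHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.mappings


SAMPLE = """<mapping>
  <class name="Employee">
    <property name="id"/>
    <property name="salary"/>
  </class>
</mapping>"""

print(parse_mapping(SAMPLE))
```

Whether this is harder or easier than a hand-written parser is exactly what the thread is arguing about; the sketch only shows what the SAX side looks like.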


JCN
2006-09-30 13:00:04
grrrr: No, you do have to learn the specific XML implementations for each use. You can't write a Hibernate file without documentation, so you are right. What I'm saying is that we already have a way to describe complex relationships. Why should we reinvent the wheel for every application? If I am writing Hibernate, why invent a new Hibernate language to configure it when we already have XML, everyone already knows XML, and XML does exactly what I want?


Can it be done another way? Absolutely, there are a million ways to do anything. But I don't see why XML is so hard for some people. Nine times out of ten you copy/paste from an example anyway. I've never, ever, in over 6 years of using XML extensively for lots of uses (XSLT, Hibernate, JSF, Tapestry, Tacos, OAS, Weblogic, Ant, as well as data representation) written an XML file from scratch. There's just too much existing code out there, and the cases we run into are rarely unique. And in the rare instance of a unique case, we still take a working example and tailor it to our needs.


And a well thought out XML schema reads almost like prose. You can tweak/tailor an existing XML file without documentation because it makes sense when you read it.


A lot of times the difficulty comes in because of the openness of the tool. For instance, Ant isn't hard because of XML. Ant is hard because it will do anything you can imagine. Any language, I don't care if it's X=Y style properties files, is going to be cumbersome with that much freedom.

grrrr
2006-09-30 13:36:45
JCN: But we have a good example of a non-XML system: UNIX. There are a lot of special languages there (sed, awk, sh, make) and a lot of non-XML configuration files. None of the concerns of the people who use XML is a problem there. Sharing is also just as big a problem for XML as for other solutions: I do not believe I can use a Hibernate file with, for example, the Cayenne ORM mapper. People also seem to like annotations a lot better than XML configuration.
JCN
2006-09-30 14:47:39
I agree that annotations are nice, and I would like to see more of them in the future. I just don't think XML is that bad or that big of a problem.
grrrr
2006-09-30 15:14:31
JCN: I think you're right, XML is not really a problem, and maybe if I use it more I will get used to it. It is probably going to stay for a long time even if Mr. Holub or I do not like it ;-)
Tim O'Brien
2006-10-01 09:16:34
Steve thanks for the link, the article is right on target. Using XML as a substitute for a good parser/compiler is a huge mistake.


Java has relied on the XML parser for "scripting" for some time. JSTL's c:if and c:forEach, Ant's procedural XML scripting language: these are both examples of the Java community's inability to grasp the importance of writing domain specific languages.


Even more telling is the response of the OnJava readership, which proves itself reactionary and unable to hear anything that even comes close to criticizing the use of XML.


XML? XML is great, don't get me wrong, but only when it is used to represent *data*. Implementing a grammar in XML is a PITA, and there are tools much more appropriate to the task.


grrrr
2006-10-01 13:02:24
Tim: XML is for documents only. For data storage there has been some actual progress; that kind of hierarchical model is from a long time ago in the data world ;-)
Patric Fornasier
2006-10-01 14:57:26
I'm not really sure if I'd call Ant, Hibernate, JSF etc. programming languages, but I get the point.


However, maybe it makes more sense, if we try to understand, why developers of tools such as those mentioned have chosen XML in the first place. One could without doubt come up with a proprietary format for his application that is more compact, more efficient etc. However this would come at a price (I already mentioned one...): Either you assume that the user of your application will learn some new syntax that might seem oh so logical to you but maybe not necessarily to others. Or you provide good tool support for that language which means (potentially much) more effort for you.


If you use XML, on the other hand, you know that users are familiar with the syntax, there are a lot of good editors out there that can help anybody edit their files, and you can validate the files for errors using schemas in a very powerful way.


So clearly, I don't see XML as the best of all solutions, but it is currently the best one that everybody has agreed upon. It's all about compromises or like the magic triangle: You can't pull at one end without influencing the other ends...


Cheers,
patric

Dmitry Mamonov
2006-10-02 01:06:14
In my opinion, XML is a good solution for small-use languages.
Have you ever seen an IDE for a "strange-name-make" tool?
XML syntax and a good DTD permit users to use STANDARD,
and well known, XML editors. With syntax highlighting and so on. Autocomplete can be done too!!!


Jay Wilt
2006-10-02 10:57:03
XML is an abomination! As a developer I'm forced to use it every day. I have written many routines that require much less code and are much more efficient at reading, writing and transporting data. Sure, you just need your favorite text editor to edit a file. This begs the question, why? Why do I need to look at a file at all! Why can't software vendors build a cool user interface the way they did in the days of old?
In the early days XML was going to solve all the problems of inter-enterprise communications. EDI worked for decades. Nothing has changed! Companies must agree on a standard message and data still needs to be mapped from/to our internal representation. The only difference, with XML we require significantly more storage space and network bandwidth!

2006-10-02 11:02:37
Totally agree, XML is crap, use YAML or JSON instead...
In fact XML is ok, but only if used for what it was invented for...
Henrik
2006-10-03 03:50:53
That article is just a plug for a book about building compilers.


I think it is fair to question the use of XML by actual developers to describe turing machinery, but to argue that you should make a compiler to parse a special purpose descriptive file strikes me as being very uninformed.


On XML itself, parsing performance is a common criticism, as is readability. Parsing performance isn't a real issue in the context mentioned, as the files can be parsed once, leaving an object graph in memory. File size is completely irrelevant. In terms of readability I agree that a plain text file might be better, but then you can make a viewer that presents it as plain text with color/icon hints. IDEs like Eclipse can actually help you with syntax validation and code completion while editing if there is a doctype. That isn't possible for specialised formats.
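
The "parse once, keep an object graph" point can be sketched as follows. This is only an illustration under assumptions: it uses Python's standard xml.etree.ElementTree instead of a Java parser, the build file is loosely modeled on Ant's target/depends attributes, and the Target class and load_targets function are hypothetical names.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Target:
    """Plain in-memory object; nothing XML-specific survives past loading."""
    name: str
    depends: list = field(default_factory=list)


def load_targets(xml_text):
    """Parse the XML a single time and return {target name: Target}."""
    root = ET.fromstring(xml_text)
    targets = {}
    for node in root.findall("target"):
        raw = node.get("depends", "")
        deps = [d.strip() for d in raw.split(",") if d.strip()]
        targets[node.get("name")] = Target(node.get("name"), deps)
    return targets


BUILD_XML = """<project>
  <target name="compile"/>
  <target name="jar" depends="compile"/>
</project>"""

# Parsed once at startup; every later lookup is an ordinary dict access.
TARGETS = load_targets(BUILD_XML)
print(TARGETS["jar"].depends)
```

After loading, the cost of XML parsing is paid, and the rest of the program works with ordinary objects.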


The examples given are very different. I can see the reason in questioning Ant logic, as Ant files can indeed become quite bloated, but I would take it any day over make. Maven has more of a rules perspective and turns out to be easier to maintain it seems. With TestNG the author might be on the money, I haven't used it, but I thought TestNG used annotations.


The point at which I lose respect for the author is the remaining two examples.


Hibernate schemas? Uhm. Personally I'm moving to annotation based mapping definition because I need a specialised object with some extra control flow. The schema is purely data mapping/object model. Why would a specialised format be any better? That claim is just stupid.


JSF. Well, JSF is a mixture of too many technologies, but that's a different discussion, so I take it as a criticism of templates with control flow tags. Let me see.... You have an html/wml design and you want to fill in some blanks and repeat some parts and make others conditional; where would you define the flow if not in the template? Ahhh... you don't like templates and want to go back to servlets. Ok, so let's forget the lessons learned over the past 10 years, and expect everyone to be programmers.


It sounds very much like the real schism isn't XML but GUI/IDE vs SlickEdit+shell scripts. Now that I have realised that my response is much easier.


A Guide: How to make buggy programs and look like an ubernerd


1) Generalise to the extreme. Solve the riddle of like rather than the problem at hand. Since the solution can do everything it is impossible to test it, so let the user do the testing.


2) Keep the code lean, avoid meaningful logging statements.


3) Provide a multitude of options with random defaults forcing the user to precisely specify the task at hand.


4) Make half the options compile time options to reduce startup time.


5) Write a book about how to use the options in clever ways.


6) Test manually and only the configuration you use yourself.


7) Do everything your own way. If people are desperate enough they will buy the book and hire you to make sense of things.


8) When documenting explain how cleverly the engine works, and leave the rest as an exercise for the reader.

Henrik
2006-10-03 04:04:25
that would be "riddle of life".


And on the topic of unix config files they don't seem preferable at all. If they contain plentiful documentation they are not too bad, but that makes them no less bloated than a well documented xml config file.
When it comes to config files the real issue is that we need to keep the number down and have a strict but powerful policy on locating them, like HiveMind's hivemodule.xml combined with good ole META-INF.

yaph
2006-10-03 15:11:42
XML can be useful but is often only used because there was or still is such a big hype about it. I have seen many applications where the use of XML is completely inappropriate and adds lots of additional development and maintenance tasks.
JCN
2006-10-04 12:22:11
Henrik: Thank you, that is exactly what I was trying to say. Perfect response.
Raphael Valyi
2006-10-05 08:46:27
Hi,


sorry but I disagree, Here are my objections:
"these sorts of XML "programs" are unreadable"
-> XML was never meant to be hand-written and read without tools. XML's true potential only comes with dedicated XML editors. With such editors, you can offer different presentation layers of the same information to adapt your data to your target audience. The advantage of XML over other languages is that it has a very simple and clean grammar, making it very easy to write or adapt those editors compared to the tools it replaces.


In my company, I always advocated a JSF-like flow engine with an XML description, because at least we could better encapsulate the concerns (not use a Turing-complete language where we don't need to, the same reason as using CSS for HTML, you know) and better integrate the code and the specs (specs and code get integrated thanks to XML tools).


"unmaintainable"
-> Again I disagree: writing a validator is much easier with XML than for any other language. I'm sorry, but compare the complexity of code checkers such as PMD versus simple custom XML schemas. Writing some schemas is far easier than rewriting PMD from scratch for each domain specific language.


"an order of magnitude larger than necessary"
That's the cost of this extra abstraction layer.


"and audaciously inefficient at runtime."
Depends: lots of tools using XML actually build their own AST of the XML file, so once the file is loaded in memory, it's just as fast as any language (example: you unmarshal a javabean, then your bean is a normal bean). Still, the parsing is slow. But I believe optimized XML parsers may outperform buggy custom parsers (in my company, where we are using Edifact instead of XML, this is just what happens: our language should be faster in theory, but it's not, and in particular all our Edifact middleware is much more buggy than standard XML tools).


Finally, I think you should define an XML schema each time you have a fairly static, strong specification. Conversely, XML should be married with very dynamic languages such as Java, or even more dynamic ones like Ruby. Then you get ease of writing for things that are more likely to be programmed than read, and you also get very strict enforcement of strong interfaces thanks to XML.


I'm not saying everything should be XML, but at least for JSF and Ant, I find them much better than make or programmatic flow systems. At the very least, it's good to have them, even if we could imagine having a handy scripting language to build them too.


Regards,


Raphael Valyi.

devdan
2006-10-06 04:24:53
I confess I'm a programmer who doesn't know how to write a compiler/interpreter.


Ant was great when it came out and is still very helpful. Nevertheless, larger build files can get complex and hard to understand, partly due to the XML. Maybe now is the time for something better to emerge.


I'd love to see ideas to replace Ant functionality with a domain specific language. Any compiler/interpreter guys out there who wanna take a crack at this? If so, keep me posted.


Thanks
devdanke [AT] gmail [DOT] com

Benjamin Leo
2006-10-08 20:08:32
No matter who Allen Holub is, and no matter if he is right or wrong on this particular issue, his seemingly intentional arrogance is such a put-off that I have no interest in considering his ideas or associating with him any further.

2006-10-09 20:18:01
XML rocks! ... But only when used correctly and for the right purposes. It should not be seen as a "language". It should be seen as a way to store data. Like any "tool" it can be used the wrong way. Learn to use it right and you will like it. Abuse it and you will hate it. So I completely agree with Allen's quote.
Bill Lin
2006-10-10 13:54:39
Benjamin Leo writes: "No matter who Allen Holub is, and no matter if he is right or wrong on this particular issue, his seemingly intentional arrogance is such a put-off that I have no interest in considering his ideas or associating with him any further."


I heartily second this notion.


2007-01-16 03:37:20
select *
from employees
xml-rocks
2007-03-08 05:53:50
XML is the best innovation since the transistor.
Say yes to XML and throw away those ugly unnecessary tools or ways of doing things; you need only XML, Ant and Java to do things.


Linux and Unix have far too many ugly scripts which don't achieve anything. But because there are so many of those ugly scripts and formats, you just need to create more and more of that stuff, and the junk just grows and grows.

Allen Holub
2007-08-05 20:48:36
Though I don't usually comment on forums such as these, I do get annoyed when my name is besmirched by people who haven't read the article in question. My original article was not a "plug" for my book, "Compiler Design in C," which has been out of print for years, so there's not much point in plugging it. The article did list a few in-print titles on compiler design, however, and these are worth checking out.


I personally think that learning how to build a parser is an essential skill for a programmer. Not only are small special-purpose languages wonderful things, but a lot of programming has to do with making sense of user input (i.e. parsing). Building a parser for a small language is not rocket science --- it's way easier to do than programming a Struts application, for example. Shouldn't take you longer than a day or so to learn enough to put together a parser for a small language using any of the readily available parser generators (CUP, ANTLR, YACC, etc.) There are lots of books on this subject. Of course, building a full-blown compiler for a significant language is way more work than that, but we're talking about replacing XML with small languages, here, not implementing C++.
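
To give a sense of scale for "a parser for a small language", here is a minimal sketch in Python. It is a hand-written recursive descent parser, not output from one of the parser generators named above, and the little build language it accepts (a target name, an optional colon followed by dependencies, then quoted commands in braces) is invented for this illustration and is not taken from the article.

```python
import re

# One token per match: a name, a double-quoted string, or one of : { }
TOKEN = re.compile(
    r'\s*(?:(?P<name>[A-Za-z_][\w.-]*)|(?P<string>"[^"]*")|(?P<punct>[:{}]))'
)


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected input at or after offset {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens


class Parser:
    """Grammar:  file := target*
                 target := NAME (':' NAME*)? '{' STRING* '}'"""

    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else (None, None)

    def expect(self, kind, value=None):
        tok = self.peek()
        if tok[0] != kind or (value is not None and tok[1] != value):
            raise SyntaxError(f"expected {value or kind}, got {tok[1]!r}")
        self.i += 1
        return tok[1]

    def parse_file(self):
        targets = {}
        while self.peek()[0] is not None:
            name, deps, cmds = self.parse_target()
            targets[name] = {"depends": deps, "commands": cmds}
        return targets

    def parse_target(self):
        name = self.expect("name")
        deps = []
        if self.peek() == ("punct", ":"):
            self.expect("punct", ":")
            while self.peek()[0] == "name":
                deps.append(self.expect("name"))
        self.expect("punct", "{")
        cmds = []
        while self.peek()[0] == "string":
            cmds.append(self.expect("string")[1:-1])  # strip the quotes
        self.expect("punct", "}")
        return name, deps, cmds


def parse(text):
    return Parser(tokenize(text)).parse_file()


SCRIPT = '''
compile { "javac -d classes src" }
jar : compile { "jar cf app.jar -C classes ." }
'''
print(parse(SCRIPT))
```

A real tool would also need error messages with line numbers, comments, escaping and so on, so this says nothing about the total cost either side of the argument is weighing; it only shows the core technique.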


In any event, I'd recommend spending some time learning about how to build small languages before deciding to use XML. You can then make an informed decision about whether XML or a small language is better for a particular application. XML is fine for a lot of things, but it's lousy for others. It pays to consider all the possibilities. Going with XML simply because you happen to have an XML parser lying around is a classic example of "When all you have is a hammer, everything looks like a nail."


2007-10-10 20:57:57
How to compare xml value with input value from a jsp page?