What's Wrong With RDF?
by Timothy Appnel
"What puzzles and confuses me is why there is so much animosity towards RDF" writes Shelley Powers, author of O'Reilly's upcoming book on RDF.
Shelley's post was made in response to Tim Bray's attempt to implement an RDF model in the RDDL specification, which ultimately led to his recommendation to use XLink instead. Bray's comments were picked up throughout the community, unleashing a torrent of criticism and "animosity" directed at RDF. Jonathan Borden summarized the significance of Bray's comments when he wrote, "this is the crux of the problem. If Tim Bray can't do RDDL/RDF using his little toe, with his hand tied behind his back and the rest of him hog tied and upside down, then what prayer do we have trying to foist this upon the rest of the world, i.e. people who just want to create and document XML namespaces?"
Shelley Powers' post to the xml-dev mailing list touched off a heated discussion late last week that continued across mailing lists and weblogs through the weekend. In this weblog post I will attempt to highlight and summarize that conversation. I have ordered the comments so that they make sense together; the order is roughly, though not strictly, chronological, to compensate for the distributed and parallel nature of the conversation while keeping some semblance of its flow.
"I am particularly unhappy because of Tim Bray's involvement in all of this," wrote Shelley Powers on her weblog. "There's an implication and an assumption made that because Tim Bray 'invented' XML, he's qualified to be a definitive judge of RDF and RDF/XML. However, the two efforts are not the same: XML deals with meta-language, RDF with meta-data. Tim has a right to his opinion, and I don't fault him for it though I don't have a tremendous amount of respect for his half-hearted and rather dubious effort to use RDF/XML to model RDDL."
(Jonathan Borden subsequently posted a human-readable, RDF-compliant RDDL format to demonstrate that one could be created with an RDF model.)
Shelley offered some advice to anyone put off by RDF: "If you don't understand it, and don't want to take the time to understand it, or don't feel it will buy you anything, or hate the acronym, or you're in a general bitchy mood that's easily triggered if someone uses "Semantic" in the same sentence that contains "Web", the solution is simple: don't use it. Don't use it. Don't study it, look at it, listen about it, work with it, sleep with it, or generally go out and dance late at night with it."
She also notes "However you may feel about RDF, the spec, or RDF/XML, the serialization, I would hope that you all remember one thing: in the last few days, the RDF Working Group has released not one, not two, but six new working drafts. Six. That's a hell of a lot of work."
(See this post for more on these latest RDF drafts from the W3C's Semantic Web Activity.)
Simon St. Laurent writes "I have a lot of respect for certain RDF applications that appear to be working, a general lack of interest in describing the world as graphs, and a serious distaste for RDF syntax. I genuinely resent what I see as the unfortunate influence of RDF on XML's post-1.0 development and the URI-centric viewpoint it has foisted on XML."
Simon later went on to say "RDF is powerful stuff, great for those who want to use it. Just keep it off _my_ dance floor, please."
Tim Bray responded "I'd go further. I think the current RDF/XML syntax is so B.A.D. (broken as designed) that it has seriously got in the way of people being open-minded about RDF. I'm baffled why the RDF working group has been forbidden to work on replacing that syntax."
In response Shelley Powers posted "because, Tim, there are implementations of RDF/XML as described, including Mozilla and RSS 1.0. I know you don't approve of them, but they are real, they are production, they are in use. Bitch about them as much as you want, but people use them."
On the comment board to Shelley's weblog Mark Pilgrim offered his take. "The fundamental flaw of the overzealous RDF advocates is the implicit assumption that "because I want to work with this data as RDF, it must be produced and stored natively as RDF." This is demonstrably false, and is what people are objecting to when they talk about "the RDF tax"."
Joe Gregorio published similar sentiments, "...my animosity comes from a push by possibly overzealous RDF proponents to change every format they come in contact with into valid RDF serialized as XML. I can point to RSS 1.0 and now the abortive RDDL as RDF attempt as failures of that strategy. On the other hand I can point to the use of RDF in Mozilla as a successful strategy of *leaving the native format alone* but still getting the benefits of RDF, as I pointed out Wednesday."
(Gregorio later reported that Mozilla's use of RDF is smaller than first believed, according to an OSAF mailing list thread.)
Gregorio continues, "I think a healthy dose of skepticism and a critical eye turned on it by people outside of the usual circle would be helpful to RDF, and it certainly couldn't hurt the XML serialization."
Elsewhere Tim Bray offered: "<famous-anecdote>Stuart Feldman, the Bell Labs guy who invented "make", woke up one morning a few weeks after he'd released it, and realized that the syntax basically sucked - all those tabs and colons and weird continuation rules. He started working on something better and was shot down because someone said "Stuart, there are *dozens* of people using this, it's too late to change it."</famous-anecdote> I think the number of people who are now using RDF is comparable, in relation to the number of people who need something like RDF, to the couple of dozen make users in 1970-something. It is *not* too late to fix the RDF syntax, it just takes some courage and initiative."
Shelley Powers responded:
"Yeah, but who is to say that [Stuart Feldman's] new approach would have been better? We can work and work and work a spec until we're blue in the face and not find a perfect solution. People learn to work the situation, or they learn to automate it -- i.e. autoconf, automake, and libtool.
Tim, we need the [workgroup] to finish. We have been waiting over a year for them to finish. We need something stable that we can work with. We do NOT need to start all over again. I would pack it in at that point. I really would."
Responding to a comparison of RDF now to HTML in the early days of the Web, Bray wrote, "HTML was by no means "bad". It was exactly what the world needed, and millions of people started using it because they liked it and because they could do "view source" and figure it out. My gripe with RDF/XML is precisely that it's failing to learn this lesson from HTML's success. Thus not enough people are using it, even though it's arguably what they need."
Shelley Powers notes, "the RDF Working Group was given a charter not to rewrite RDF/XML but to answer issues and provide as much cleanup and clarification as they could but to still remain within that support for previous implementations. It's sad that one can't just throw things out and start over again, but that's the way of the real world."
To that Bray responds "No it's not and yes you can, and you should."
Elsewhere Mark Pilgrim wrote in response to similar comments by Shelley, "you're hurting yourself more than anyone else by defending the status quo. You have a lot invested in RDF (the theory), and it'll all go to hell. The rest of the world will remain blissfully unaware that there was this great idea here, buried under mounds and mounds of incomprehensible angle brackets."
Tim Bray also wrote "The proponents of RDF (including myself) say that RDF's value add is that it allows the efficient interchange and manipulation of [Resource, Property, Value] triples. I happen to believe this propaganda and I also believe that one of the obstacles is the human-incomprehensible syntax. If you believe that RDF/XML's current syntax is not a problem please continue with your project of trying to sell it to the world, but it feels to me you're trying to accomplish a good thing with one hand tied behind your back."
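For readers who have not met the [Resource, Property, Value] model Bray refers to, here is a minimal sketch in Python. It is only an illustration of the idea, not anything from the RDF specifications: the `Triple` type, the `match` helper, and the example.org URIs and property names are all made up for this post.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    """One statement: a resource, a property of it, and that property's value."""
    resource: str
    prop: str
    value: str

# Hypothetical statements; the URI and property names are invented for illustration.
triples = [
    Triple("http://example.org/doc", "title", "An Example Document"),
    Triple("http://example.org/doc", "creator", "Alice"),
    Triple("http://example.org/doc", "creator", "Bob"),
]

def match(store, resource=None, prop=None, value=None):
    """Return the triples that agree with every field that is not None."""
    return [
        t for t in store
        if (resource is None or t.resource == resource)
        and (prop is None or t.prop == prop)
        and (value is None or t.value == value)
    ]

# Ask a question of the data: who created the document?
creators = [t.value for t in match(triples, prop="creator")]
```

The point of the sketch is that the model itself is small: a list of three-part statements and a pattern match over them. The argument in this thread is about how those statements get written down in XML, not about the model.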
(Mark Pilgrim offers a personal account of his attempts and frustrations with RDF here. I don't quote it, since the entire post is worth reading as a first-hand account of the issues raised throughout this discussion.)
In addressing the XML serialization of RDF, Danny Ayers offers, "probably the primary cause of the ugliness of RDF/XML is the mismatch between the tree model of XML and the graph model of RDF. To explicitly represent a graph in XML the syntax will start getting ugly whatever you do. This is a weakness of XML, not RDF."
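The tree-versus-graph mismatch can be seen in a few lines of Python. This is a rough sketch under my own assumptions, not RDF/XML: the element names (`graph`, `node`), the `id` and `ref` attributes, and the sample edges are all hypothetical. It uses only the standard library's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# A tiny graph as (subject, property, object) edges. All names are hypothetical.
edges = [
    ("doc1", "author", "alice"),
    ("doc2", "author", "alice"),   # alice is shared by two documents
    ("alice", "knows", "bob"),
    ("bob", "knows", "alice"),     # a cycle
]

def to_xml(edges):
    """Flatten the graph into an XML tree, pointing at objects by id."""
    root = ET.Element("graph")
    nodes = {}
    for subj, prop, obj in edges:
        if subj not in nodes:
            nodes[subj] = ET.SubElement(root, "node", {"id": subj})
        # A tree element has exactly one parent, so a shared or cyclic
        # object cannot simply be nested; we refer to it by id instead.
        ET.SubElement(nodes[subj], prop, {"ref": obj})
    return root

xml_text = ET.tostring(to_xml(edges), encoding="unicode")
print(xml_text)
```

Because "alice" has more than one incoming edge and sits on a cycle, she cannot be nested under a single parent element. The sketch falls back on id references layered over the tree, and some indirection of that kind is what any XML encoding of a graph ends up needing.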
In a post to the xml-dev mailing list Shelley Powers wrote "I'll be honest, I don't care about the human readable/writeable aspects of RDF/XML as much as I do care that there are tools and APIs that manage it all for me. Sorry -- but I just don't think that is the most important aspect of either XML or RDF/XML. Again, IMHO."
To which Sean McGrath replied:
"I'm afraid, I take a diametrically opposite view. Things should be as complex as necessary but not more complex.
Punting to tools and APIs to salvage mankind from complexity of its own making is one of the main reason this industry is constantly battling the alligators rather than clearing out the swamps."
Elsewhere Jack William Bell echoed the same sentiment. "I have a problem with [an (easily) human readable format not being necessary]. If you don't care about being able to read it easily, why not use a binary format of some kind in the first place and reduce the bandwidth footprint?"
Tim Bray writes "I guess where Shelley and I would agree to disagree is that she doesn't think that easy human-readability is very important in the data formats she uses, and I think it's terribly, terribly important; I think one of the central lessons of the Web is that enabling people to do a "View Source" and roll their own based on what they see is, well... there's nothing more important."
Shelley Powers explained "RDF/XML is a mapping of that model to XML -- a mapping that's not necessarily easy or uncomplicated. XML was picked because XML is the prime metalanguage format used in many intra-mechanical transitions, such as forming the messages and providing the framework for something such as SOAP. It wasn't necessarily picked because XML is human readable, though we hope that would be a side benefit."
Tim Bray writes:
"for the record, I did *not* invent XML, I was a member of a [workgroup] of 11 people supported by an interest group of another hundred or so who subsetted an existing standard called SGML whose position was spookily similar to where RDF is today: it's important, some smart people are using it to do some big things, but it has no grass-roots uptake.
Turns out that some of the things you could do with SGML you can't do with XML, and some of them are awfully handy, but in the end it turned out that the complexity cost for doing them pushed the cost/benefit ratio into really lousy territory. Hmm, there's an echo in here."
In response to a post on why RDF is hard, Simon St. Laurent wrote "I don't think the RDF community has ever really understood that what they do is genuinely difficult for most people. The RDF community seems very self-selecting to me - those who can cope with RDF like it, and the rest of us keep our distance. I'm not sure it's ever been clear to people who find RDF intuitive why so many people bounce off of it completely, and I'm not convinced that it's possible to explain that to someone who genuinely likes RDF."
Shelley Powers replied "No one is forcing anyone to use RDF. This isn't a dismissal -- this was meant to be a reassurance."
What's wrong with RDF and its XML serialization?
Great summary! Broken links!
Thanks for putting together this summary - great work! However, there are a couple of broken links in the story.
Imperfect Tools.
Readability - I think they are both right! (and wrong)
Regarding readability, I think they are both right and wrong (but this is the guy who thinks PostScript is/should be readable). The great thing about HTML was that in the beginning it was readable - this was important when there were no tools to do it for you. People (such as myself) *still* handwrite HTML, but most of it is machine generated, and because of this and other standards (e.g. CSS) it has started to become opaque. Likewise the Web standards - I'm working through some BPEL4WS stuff and yeah, it's human readable but a headache! This stuff has reached a complexity where human readability is there as a backup, but one really should use a tool.
XML is as bad as RDF
I think XML is very hard to edit and type -- it's not very good for humans, because it's so easy to get the files wrong. If you look at the RDDL/RDF thread and Tim Bray's "RDF" error, it looked to me more like he had trouble making his example valid XML.