The difference between XML and RDF

by Dan Zambonini

Although regular readers of O'Reilly Weblogs probably already grasp the difference between XML and RDF, I still occasionally talk to a developer who is unsure of the difference.

I recently co-presented a discussion at the British Computer Society on the Semantic Web and Social Technologies. This is how I described the difference between the technologies.

I've created a new reality TV show called The XML Factor, set in the future (I'm not sure if you get The X-Factor in the U.S., but it's just like Pop Idol). In this show, Simon Cowell (whose body has been replaced with robotics, for some reason - probably just because it's the future) judges mark-up languages and technologies on their relative merits.

Simon Cowell and XML

With XML, Simon is impressed by his technical ability - very flexible, can perform any song - but just isn't feeling any emotion. This is because XML is basically just a data format. A pretty good one, but there is no built-in meaning to an XML file; unless your computer has prior knowledge of a particular type of XML (a particular schema, like XHTML or SVG), it won't be able to do much with it.
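To make the "no built-in meaning" point concrete, here is a rough sketch in Python using the standard library's ElementTree parser. The two documents and the describe helper are made up for illustration. Both documents record the same fact (Peter's parent is John), but a generic parser only sees tags, attributes and text, so it has no way of knowing they say the same thing.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents recording the same fact in different shapes.
doc_a = "<person name='Peter'><parent>John</parent></person>"
doc_b = "<family><father id='John'><child>Peter</child></father></family>"


def describe(xml_text):
    """Return (tag, attributes, text) for every element.

    This is everything a generic XML parser knows without a schema.
    """
    root = ET.fromstring(xml_text)
    return [(el.tag, dict(el.attrib), (el.text or "").strip()) for el in root.iter()]


view_a = describe(doc_a)
view_b = describe(doc_b)
print(view_a)
print(view_b)
```

The parser handles both files happily, which is the flexibility Simon admires. Working out that "parent" in one and "father"/"child" in the other describe the same relationship takes prior knowledge that lives outside the XML.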

Simon Cowell and RDF

With RDF, it's a different kettle of fish. Simon's really feeling it. RDF may not have the technical flexibility of XML, but she's got real emotion in her voice. And that's because RDF is all about conveying semantics (OK, maybe that's a bit of a twist of the truth, it conveys statements about things). RDF isn't a data format, it's... a model (hence the picture above, and the following particularly bad pun).

Are you sure you're ready for this pun? OK... Well:

  • like all good models,

  • RDF is very simple,

  • and hence is easy to take advantage of.

See? Told you it was bad... Anyway, in conclusion, RDF is not a data format, but a simple model (which goes a bit like: something has a something of something). Which means that whenever a computer gets hold of an RDF file, no matter how complex, or what it's about, it can always break it down to a set of these statements - i.e. information that follows this model.
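As a rough illustration of "something has a something of something", here is a small Python sketch. The Triple type, the flatten function and the sample data are all invented for this example, and real RDF uses URIs rather than short strings as identifiers. It shows how a nested description breaks down into flat statements that all follow the same three-part model.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def flatten(subject, description):
    """Break a nested description into flat (subject, predicate, object) statements."""
    triples = []
    for predicate, value in description.items():
        if isinstance(value, dict):
            # A nested thing gets its own identifier, then is described in turn.
            child = value["id"]
            triples.append(Triple(subject, predicate, child))
            rest = {k: v for k, v in value.items() if k != "id"}
            triples.extend(flatten(child, rest))
        else:
            triples.append(Triple(subject, predicate, str(value)))
    return triples


# Hypothetical data about the show described in this post.
show = {
    "title": "The XML Factor",
    "judge": {"id": "simon", "name": "Simon Cowell", "body": "robotic"},
}

statements = flatten("show", show)
for t in statements:
    print(f"{t.subject} has a {t.predicate} of {t.obj}")
```

However deep the nesting goes, the result is always a flat list of three-part statements, which is what lets a program process RDF it has never seen before.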

Any more bad XML or RDF puns out there?


2005-09-20 12:11:59
not as flashy of a metaphor, but...
whenever I read about a communication document standard that is in XML, I think of a "standard stack" like a networking stack. The standard like RDF sits on top of the XML layer. The XML layer is a low-level layer that provides a way of formatting documents in general. At the XML layer, elements are parsed and namespaces resolved. It is only at a higher level, provided by other standards, that the parsed XML structure takes on meaning. In the same way, the packets that are sent between routers at a low level in the networking stack may contain all kinds of data, but it is up to the higher levels to synthesize the packets into something meaningful on the receiving end.
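The layering can be sketched in Python with the standard library. This is only an illustration: the xml_layer and rdf_layer functions and the example.org names are made up, and the upper layer understands just a tiny subset of RDF/XML (rdf:Description elements with rdf:about, and properties that are either plain text or rdf:resource references). The lower layer parses elements and resolves namespaces; only the upper layer turns the result into statements.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# A small hypothetical RDF/XML document.
rdf_xml = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#">
  <rdf:Description rdf:about="http://example.org/people/peter">
    <ex:name>Peter</ex:name>
    <ex:childOf rdf:resource="http://example.org/people/john"/>
  </rdf:Description>
</rdf:RDF>
"""


def xml_layer(text):
    """Lower layer: parse elements and resolve namespaces, nothing more."""
    return ET.fromstring(text.strip())


def rdf_layer(root):
    """Upper layer: read the parsed tree as (subject, predicate, object) statements."""
    triples = []
    for node in root.findall(f"{{{RDF_NS}}}Description"):
        subject = node.get(f"{{{RDF_NS}}}about")
        for prop in node:
            target = prop.get(f"{{{RDF_NS}}}resource")
            obj = target if target is not None else (prop.text or "").strip()
            # ElementTree writes tags as "{namespace}local"; join them into one URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            triples.append((subject, predicate, obj))
    return triples


tree = xml_layer(rdf_xml)
triples = rdf_layer(tree)
for triple in triples:
    print(triple)
```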
2005-09-20 12:25:50
Re: not as flashy of a metaphor, but...
Actually, that's an excellent metaphor, and one that Tim Berners-Lee shares with you!

2005-09-21 01:14:17
Meaningless meaning
Both RDF and XML fulfill the task of defining the structure (relationships) of data, e.g. "Peter is the child of John". There is no semantics in this; but, beyond that, RDF is supposed to be able to define what the relationships between these data (i.e. "Is child of") "mean", by again relating them to other resources, until some meaning is conveyed from that for the computer. Here, "meaning" refers to the computer being able to process that relationship in some useful way for the user (e.g. infer that the last name of Peter is the same one as John's).

So far, this ability of RDF has proven useless in the real world of computing, and RDF and all the stack above it (RDFS, RDQL, OWL, and so on) is ignored by the market, being used only by the academia. With only a few exceptions worth noting, the industry embeds meaning into computers not by writing declarative relationships or rules, but mainly by programming procedural software. Since end users are not able to create semantic definitions either, the distinction is not really important to them; they need techies anyway. But so far procedural software has proven much more successful in creating useful IT systems than declarative academic approaches.

The industry has actually built a second stack on top of XML, paralleling that of RDF: XML Schema, XSLT, XQuery, UDDI, WS-*, etc. Nobody knows where this second stack will lead, but IMHO, when some of this "computer meaning" is needed in the industry, they will not use RDF - they will build new, less academic but more practical standards on top of it, probably without any inference or logic computing involved.


2005-09-21 01:16:01
not as flashy of a metaphor, but...
Maybe that was true a few years ago, but nowadays, with N3, SPARQL and RDF databases, RDF does not sit on top of XML anymore. XML is only one of the forms in which RDF can be represented. There are many more and, from a certain point of view, better forms of RDF representation.
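As a small sketch of a non-XML form, here is some Python that writes statements out as N-Triples, a line-based plain-text syntax that is a simpler relative of N3. The to_ntriples function, the ("uri", ...) / ("literal", ...) tagging and the example.org names are assumptions made for this example, and it only handles plain string literals.

```python
def to_ntriples(triples):
    """Serialize statements as N-Triples: one '<s> <p> object .' line each."""
    lines = []
    for subject, predicate, (kind, value) in triples:
        if kind == "uri":
            obj_term = f"<{value}>"
        else:
            # Plain literal: escape backslashes and double quotes.
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj_term = f'"{escaped}"'
        lines.append(f"<{subject}> <{predicate}> {obj_term} .")
    return "\n".join(lines)


# Hypothetical statements; objects are tagged as either a URI or a literal.
triples = [
    ("http://example.org/people/peter", "http://example.org/terms#name",
     ("literal", "Peter")),
    ("http://example.org/people/peter", "http://example.org/terms#childOf",
     ("uri", "http://example.org/people/john")),
]

ntriples_text = to_ntriples(triples)
print(ntriples_text)
```

The statements are the same whichever syntax carries them; no angle-bracket elements are needed.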
2005-09-21 02:11:42
not as flashy of a metaphor, but...
yeah, oh, and would this be why it's always RDF in XML serialization everyone talks about annotating web pages with?

If RDF didn't have xml to sit on top of we wouldn't have to talk about RDF at all.

2005-09-21 02:14:10
like all models everyone is always either accusing it of being ugly, or making excuses why it isn't that bad looking as one might think when first looking at it...
hmm, this metaphor sure runs out of steam quick.
2005-09-21 06:17:45
Awful, awful
Dan, those are the worst graphics I have seen in a very long time, and your pun is excruciating. Brilliant!

Ok, while I'm here I have to take issue with JCamara's comments, firstly:
"So far, this ability of RDF has proven useless in the real world of computing, and RDF and all the stack above it (RDFS, RDQL, OWL, and so on) is ignored by the market, being used only by the academia.".
That's factually incorrect on both counts. Ask anyone that's using this stack, and they'll tell you it is extremely useful, and there are a wide range of problems (mostly associated with the Web) for which it is considerably easier than other approaches. There is a lot of work going on in the industry, with players like Nokia, Adobe, IBM, Oracle and HP investing in the technology.

Secondly (s)he makes a contrast between declarative approaches and procedural ones, accurately associating RDF with the former but saying the industry mainly uses the latter. This is to a great extent still true, but there has been a widespread shift towards encoding things like business rules declaratively rather than hard-wiring them into code. Good examples are the adoption of XML Schema, XSLT, UDDI and a significant chunk of WS-*, which are essentially declarative approaches, not procedural. I'd also point out that the analysis overlooks relational databases (which are hardly limited to academia), which are essentially declarative logic systems. All RDF is doing in that context is making the same techniques more Web-friendly.

The prediction that industry will continue to build on the non-RDF XML stack is fairly safe. But it is a mistake to view the XML stack and the RDF stack as mutually exclusive. Although not essential, the RDF toolset does include syntax-oriented tools like XSLT etc. What RDF brings to the table, as Dan (painfully) points out, is the model. There are a *lot* of things that are considerably easier to implement when one has a common model designed for the Web environment, *in addition* to the XML tools.

Finally, I'm curious to know what JCamara has in mind when talking of "computer meaning" without the use of logic. Something like Web-wide neural nets could be very interesting...
