[Jesper Tverskov:XSL-List] More On The Incompatibilities of XML and JSON and Why You Should Care

by M. David Peterson

In a follow-up conversation to the post made by Dimitre Novatchev, Jesper Tverskov provides an excellent summary of why XML and JSON are incompatible. He goes on to describe several XSLT functions he feels would help alleviate at least some of the pain, but to keep things focused on understanding the incompatibilities between XML and JSON, I've left them out. When XSL-List archives today's conversations, I'll update this post with a link to his complete post.

However, before I provide his summary, I want to quickly share some of my own thoughts on the XML -> JSON <- XML discussion, which I posted in follow-up to a comment made by Robert Koberg:

On Sun, 18 May 2008 08:04:26 -0600, Robert Koberg wrote:


> Bottom line - there is no standardization. If you want to do xml2json
> and json2xml you pick your library and write for it.

I agree. Furthermore, I believe the notion of converting XML and JSON to and from each other is the wrong approach altogether. Instead, I believe the emphasis should be on creating a standard format for JSON that can be referenced and queried by XPath, such that regardless of whether the incoming format is XML or JSON, the same XPath applied to both will produce the same generalized result set.

In fact, if I'm not mistaken, this is what Mike Champion and friends have been discussing over in the MSFT camp for several years now, which makes sense when you look at what they're doing with LINQ-to-*.


Jesper's summary follows in-line below,

5 Comments

John
2008-05-20 13:28:46
I've been watching some threads on JSON vs. XML. I think a little bit of history helps put the debate in context. XML is called "SGML for the web" and it was meant for document creation. The originators of the WWW/XML had their hypertext vision, but the masses took it a different direction. The masses instead wanted interfaces, not necessarily content. So XML and HTML have been stretched to do things never imagined (especially with the help of JavaScript).


One way XML has been used beyond its original purpose is for serialization. It works; however, it isn't necessarily succinct. Unfortunately, I find most people think this is the main purpose of XML.


So should we have XPath for JSON? I personally think this is silly. It might be a fun exercise, but it tries to generalize XPath querying the world of objects. JSON's purpose is to serialize JavaScript objects. XML Nodes != JavaScript Objects, they only equal each other using obscure abstraction. Why stop at JSON, why not have XPath for any object of any language?


So to my final point. Converting JSON to XML or vice versa implies converting one serialization format to another serialization format. Doing so will either only contain the intersection of each's aspects or will include assumptions made by the converter. JSON's original purpose wasn't for creating documents, and XML's original purpose wasn't for serialization. It seems to me that bridging the two will always be a hack.

M. David Peterson
2008-05-21 18:41:38
@John,


>> So should we have XPath for JSON?


Absolutely!


>> I personally think this is silly. It might be a fun exercise, but it tries to generalize XPath querying the world of objects.


XPath is not a query language. As per the abstract from the official W3C spec @http://www.w3.org/TR/xpath


"Abstract


XPath is a language for addressing parts of an XML document, designed to be used by both XSLT and XPointer."


The primary reason people suggest JSON is easier to work with than XML has everything to do with the fact that they associate XML with DOM manipulation due to the fact their primary introduction to programming came via Javascript. And working directly with the DOM via Javascript *SUCKS*! So when someone said,


>> "Hey, instead of walking the DOM via Javascript, you can just do Foo.Bar and gain access to the value of Bar, or you can iterate over Foo.Bar if it's an array using the more compact,


var x = Foo.Bar.length;
for (var i = 0; i < x; i++) {
  // ... do something meaningful.
}


... of course people jumped at the opportunity. This seems much cleaner and simpler than walking the DOM via its Javascript bindings, and -- of course -- it is. But that's because most people who have come into the world of programming via Javascript and DOM programming don't know any better. These same people treat XML and walking the DOM as one and the same, which quite obviously they are not. Furthermore, given that XPath is nothing but a way to address certain parts of an XML document, and given that JSON represents nothing more than a collection of properties and values, then what's the difference between writing,


>> /Foo/Bar


... and ...


>> Foo.Bar


?


The difference? A "." instead of a "/" or vice versa. That's it!
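
As a rough illustration of that comparison, here is a minimal sketch that resolves a slash-separated path against a parsed JSON object. The helper name resolvePath and the sample data are hypothetical, and only plain child steps are handled (no predicates, axes, or functions), so this is the addressing idea only, not an XPath implementation.

```javascript
// Hypothetical helper: resolve a simple "/Foo/Bar" style path against
// a plain JavaScript object. Only child steps are supported.
function resolvePath(root, path) {
  // Drop the empty segment produced by the leading "/".
  var steps = path.split('/').filter(function (s) { return s.length > 0; });
  var node = root;
  for (var i = 0; i < steps.length; i++) {
    var isContainer = node !== null && typeof node === 'object';
    if (!isContainer || !Object.prototype.hasOwnProperty.call(node, steps[i])) {
      return undefined; // the path does not address anything
    }
    node = node[steps[i]];
  }
  return node;
}

// Sample data, made up for illustration.
var data = JSON.parse('{"Foo": {"Bar": [1, 2, 3]}}');

var viaPath = resolvePath(data, '/Foo/Bar'); // slash-style addressing
var viaDots = data.Foo.Bar;                  // dot-style addressing
var missing = resolvePath(data, '/Foo/Baz'); // undefined: no such property
```

Both viaPath and viaDots end up referring to the same array, which is the whole point of the "." versus "/" comparison.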


Of course, XPath is a *MUCH* more powerful language when it comes to addressing as much or as little of any given document as you might want, providing ways to address a much larger swath of any given document with a much more compact syntax than what you gain with Javascript.


That said, projects like jQuery have helped things out quite a bit, but even jQuery provides a limited subset of what XPath provides as far as being able to address any collection of properties and values, and it does so using a slightly more verbose format.


On the other hand, jQuery provides an update facility, whereas XPath is strictly a way of extracting the information you want, which is where XSLT and friends come into play. So while Javascript (and therefore jQuery) facilitates direct operation on the in-memory DOM object, XSLT is always working against a copy of the DOM, therefore making it a much safer language than Javascript for the same reasons a functional language is safer than an imperative language.


That's not to suggest I feel that Javascript has no value. I *LOVE* Javascript in fact. And I think jQuery is one of the finest examples of taking something very complex in DOM manipulation and making it drop-dead simple. And when you combine jQuery with XSLT and XPath, I honestly can't think of a more powerful combination of tools at your disposal, which is exactly why I believe XPath over JSON would be a wonderful addition to the developer's toolbag.


>> JSON's purpose is to serialize JavaScript objects. XML Nodes != JavaScript Objects,


Of course they don't. But XML can represent the properties and values of a Javascript object in the same way JSON can represent the properties and values of a Javascript object. Accepting the fact that JSON can represent an array of unnamed values, (which means you have to translate [1,2,3] into something like <array>1,2,3</array> or <a><i>1</i><i>2</i><i>3</i></a>) there is absolutely nothing that JSON offers that XML doesn't offer as far as being able to represent a collection of properties and values, which is exactly what a Javascript object represents (Javascript is referred to as a "Property Bag" language for a reason!)
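
To make the array translation concrete, here is a minimal sketch of one possible mapping from a JavaScript value to XML text. The function names toXml and escapeXml, and the choice of "i" as the item element name, are assumptions for illustration; the sketch also assumes every property name is already a valid XML element name and does nothing special for null.

```javascript
// Escape the characters that are significant in XML text content.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical converter: arrays become a run of <i> elements,
// objects become one child element per property, and everything
// else becomes escaped text.
function toXml(name, value) {
  var body;
  if (Array.isArray(value)) {
    body = value.map(function (v) { return toXml('i', v); }).join('');
  } else if (value !== null && typeof value === 'object') {
    body = Object.keys(value).map(function (k) {
      return toXml(k, value[k]);
    }).join('');
  } else {
    body = escapeXml(value);
  }
  return '<' + name + '>' + body + '</' + name + '>';
}

var arrayXml = toXml('a', [1, 2, 3]);
// arrayXml is '<a><i>1</i><i>2</i><i>3</i></a>'

var objectXml = toXml('Foo', { Bar: [1, 2] });
// objectXml is '<Foo><Bar><i>1</i><i>2</i></Bar></Foo>'
```

Note that the item name is a decision made by the converter, not something present in the JSON itself.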


>> they only equal each other using obscure abstraction.


Obscure abstraction? You mean "." and "/"? Sorry, but apparently you need to spend more time reading and less time writing about what you deem obscure abstraction. There's nothing obscure about addressing the internal properties and values of any given serialized representation of a Javascript object when the primary difference is replacing a . with /.


>> Why stop at JSON, why not have XPath for any object of any language?


Absolutely agree! Let's do it! ;-)


M. David Peterson
2008-05-21 20:50:47
@John,


One additional point: When what we are referring to is the representation of a data structure, both JSON and XML offer exactly the same value. The way I see it, however, is that JSON is really poorly named because an object, in pretty much any language, represents not only properties and values, but functions/methods as well. JSON doesn't allow me the ability to pass methods/functions encapsulated in my data set. It's just data. That's it. To gain the true benefits of an object I would need to add back in methods/functions and by doing that we'd be right back where we started: Javascript.
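
A small sketch of that point, using the standard JSON.stringify and JSON.parse functions. The account object and its property names are made up for illustration.

```javascript
// An object with data and behaviour (hypothetical example).
var account = {
  owner: 'Ada',
  balance: 10,
  deposit: function (amount) { this.balance += amount; }
};

// Serializing keeps the data properties and silently drops the function.
var text = JSON.stringify(account);

// Parsing gives back a property bag with no methods attached.
var copy = JSON.parse(text);
var hasMethod = typeof copy.deposit === 'function'; // false
```

After the round trip only owner and balance survive; the method has to be supplied again by Javascript code on the receiving side.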


So then how can JSON be a better way to represent an object than XML when the only thing either format is designed for is to represent a structured set of properties and values? And given that both formats do a really good job of representing structure, properties, and values, then why not use XPath as a way to address the internals of both?


No abstraction layer is needed. It's pretty straightforward, so please don't attempt to overcomplicate something that isn't complicated in the slightest.

John
2008-05-22 09:38:14
>>There's nothing obscure about addressing the internal properties and values of any given serialized representation of a Javascript object when the primary difference is replacing a . with /.


Addressing is only a portion of XPath. XPath has a significant set of functions specific to XML. You make good points regarding working with the DOM; XPath is definitely meant for the DOM. But I see JSON being used just to encode data in general, or should I say property bags ;).


>>XPath is not a query language. As per the abstract from the official W3C spec


The abstract has been updated for 2.0, I would imagine since the original abstract didn't quite capture the full extent of the language:
"XPath 2.0 is an expression language that allows the processing of values conforming to the data model defined in [XQuery/XPath Data Model (XDM)]..."
@http://www.w3.org/TR/xpath20/


So I agree with you with regard to basic addressing, and this is what I meant by saying "Doing so will ... only contain the intersection of each's aspects". Addressing is a good example of part of this intersection. However, I'm not convinced yet that XPath in its entirety is a good match for JSON. JSON representing the DOM, however, is a different story.


>>The way I see it, however, is that JSON is really poorly named because an object, in pretty much any language, represents not only properties and values, but functions/methods as well.


Excellent point.

M. David Peterson
2008-05-28 20:42:54
@John,


Yikes! Just noticing your response now. Sorry for the delay!


Regarding the extended functions of XPath, and in particular XPath 2.0, I can now see your point. You're right, I'm speaking directly to the addressing portions of XPath. In this regard, I think it could be a good fit. Past that, and especially as it relates to XPath 2.0, there's a lot of mismatch.


Thanks for helping to refine the focus!