Bridging XML, E4X and JSON

by Kurt Cagle

Efforts have been underway recently to develop a schema language for JSON, analogous to the XML Schema Definition Language (XSD) or RelaxNG languages in the XML arena. Similarly, a JSON transformation language is being proposed and bandied about in various AJAX circles as web2 developers attempt to take the best of what XML has to offer and recast it from the angle-bracket modality to the braced modality.

These efforts are intriguing, and for the most part people within the XML community are now wearing the same rather confused expression that I remember seeing on the SGML generation as they watched the young turks of the XML movement push their view out to the world - "Didn't we already DO that?"


14 Comments

joshuadf
2007-10-13 00:14:29
"DOM, while necessary at a very low level, does not necessarily lend itself to ... AARGGGGHHHH" ... What? To what?
q
2007-10-13 07:47:19
First, JSON's aba "problem" is resolved as:


[{a:"foo"},{b:"bar"},{a:"bat"}]


This is nice because it explicitly indicates order. In XML, this has to be mandated using a schema. I personally prefer to see context within the document itself when it can remain concise without duplication. It does mean you can't create the document sloppily, but the same is true if your document must adhere to a schema.



XPath is great ... for talking about XML. It shouldn't be the primary means for accessing data within a programming language, which, by the way, is 99.9999% of the way structured documents are processed. This has been one of my biggest issues for the past few years. Languages should be finding ways to represent and interrogate documents using native syntax.


For example, take Java: an XML document should be converted to a language-specific semantic so that accessing "/gameCharacter/spells[@level=1]/text()" becomes something like


foreach (gameCharacter.spells, {level==1}) println text();


Of course, XPaths can be far more complicated than a typical language syntax can reasonably represent but the point is that the document will ultimately be processed with a particular language so document processing should be tightly coupled with the language syntax and semantics.
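As a rough sketch of what such native access could look like in plain JavaScript, assume the document has already been converted into an object whose spells property is an array of {level, text} records. That shape and the sample data are made up for illustration; they are not taken from the article.

```javascript
// Hypothetical in-memory form of the gameCharacter document.
var gameCharacter = {
  spells: [
    { level: 1, text: "first spell" },
    { level: 2, text: "second spell" },
    { level: 1, text: "third spell" }
  ]
};

// Rough counterpart of /gameCharacter/spells[@level=1]/text():
// build the selected set first, then process it as a separate step.
var levelOneTexts = gameCharacter.spells
  .filter(function (spell) { return spell.level === 1; })
  .map(function (spell) { return spell.text; });

levelOneTexts.forEach(function (text) { console.log(text); });
```

Keeping the selection and the printing apart means the selected set can be reused for something other than output.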

Theo
2007-10-13 08:02:34
Hm, so they want a way to write schemas for a language that is a stripped-down version of a language which has schemas (more or less). It sounds to me like they want.... JavaScript?
M. David Peterson
2007-10-13 11:24:03
@q,


This is nice because it explicitly indicates order. In XML, this has to be mandated using a schema.


Are you on crack? WTF? What does,


[{a:"foo"},{b:"bar"},{a:"bat"}]


... suggest implicitly about order that the XML fragment,


<root>
<a>foo</a>
<b>bar</b>
<a>bat</a>
</root>


... does not? I do not need an XML schema to tell me that elements a, b, a are ordered.


The assumed "problem" with XML is not XML itself. The problem is w/ people who make silly assumptions about XML w/o first verifying those assumptions are correct.


M. David Peterson
2007-10-13 11:25:14
s/implicitly/explicitly
M. David Peterson
2007-10-13 11:29:59
For example, take Java: an XML document should be converted to a language-specific semantic so that accessing "/gameCharacter/spells[@level=1]/text()" becomes something like


foreach (gameCharacter.spells, {level==1}) println text();


No it doesn't. /gameCharacter/spells[@level=1]/text() returns the text nodes that are children of the /gameCharacter/spells elements whose @level is == 1. foreach (gameCharacter.spells, {level==1}) println text(); will iterate through those text nodes and print them out to the screen or other specified device. In other words, you're dealing with the difference between a data set and a function that processes that data set.



Ric
2007-10-13 12:42:07
M. David,
Methinks that "q" is solving the _JSON_ order problem, not suggesting XML has one. And, yes, the results in XML look very similar to JSON.
XML has HISTORY. Lots of tools; well understood.
JSON is great for client side AND I get to stand on the shoulders of giants: I can make JSON schema better, faster. We have the technology. (Now if I only had $6 million, man!)
For example, I can do cross selectors in XPath2, but it's a pain using old processors. Also, XSLT is still NOT a functional language: in jsPath, I can define my OWN cross selectors and do WHATEVER I want with MY data.

2007-10-13 18:42:39
q,


Good point on the ABA resolution, and one that basically confirms my initial thesis. As I was writing this, I had to ask what the drawbacks were to JSON in terms of encoding XML; the internal answer was not much, and the ABA issue was really the only one that came to mind.


XPath emerged for two reasons - given the hierarchical nature of a given XML document, a folder paradigm was a fairly obvious metaphor for encapsulating that information, and the predicate model also assumed navigation in directions beyond descendants. That you can argue that the "." notation is superior in some (many) respects should indicate that there are in fact other valid syntactical mechanisms for accessing data. That e4x can encapsulate much the same information using dot notation is proof that such a paradigm is quite effective for working with XML.


There is one point that is worth noting here, however. XPath is in many respects more akin to SQL than it is to class manipulation - it works on sets, not individual contexts. This isn't necessarily a superior approach, but it is a different one than JavaScript takes.


Note that neither XML nor JS explicitly REQUIRE schemas (unlike languages such as C++ or Java).



Kurt Cagle
2007-10-13 18:44:23
Ooops, forgot to indicate it was me writing the previous comment.
q
2007-10-15 09:18:04
@M. David,


In other words, you're dealing with the difference between a data set and a function that processes that data set.


I'll just ask a question, within any XPath implementation to process "/gameCharacter/spells[@level=1]/text()" what will they have to do prior to returning the set?


Ok, I can't resist, I'll answer that. They'll traverse and iterate. I was only making a point that processing XPaths abstracts away processing details at the expense of code uniformity. All I'm saying is that XPath is great, I just wish languages implemented such capabilities natively.


And no, I'm not on crack. I was saying that using "[]" in JSON explicitly mandates an ordered list. There is nothing in your XML that does the same. It's partially implicit in XML but if I just have

<root>
<a>foo</a>
<b>bar</b>
</root>
How would I know that is supposed to become
[{a:"foo"},{b:"bar"}]
as opposed to
{a:"foo",b:"bar"}
?
I'd actually prefer the latter, but if the schema indicates it must support "aba", my JSON creation would fail to reproduce the XML in certain circumstances. But when seeing
[{a:"foo"},{b:"bar"},{a:"bat"}]
you immediately understand the ordering requirement without referring to a schema.
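To make the difference concrete, here is a minimal sketch of the two conversions. It assumes the child elements of root have already been read into [name, text] pairs in document order; the function names are made up for illustration.

```javascript
// Child elements of <root>, in document order, as [name, text] pairs.
var children = [["a", "foo"], ["b", "bar"], ["a", "bat"]];

// Ordered encoding: one single-key object per child element.
function toOrderedList(pairs) {
  return pairs.map(function (pair) {
    var entry = {};
    entry[pair[0]] = pair[1];
    return entry;
  });
}

// Plain-object encoding: a repeated name overwrites the earlier value.
function toObject(pairs) {
  var result = {};
  pairs.forEach(function (pair) { result[pair[0]] = pair[1]; });
  return result;
}

console.log(JSON.stringify(toOrderedList(children)));
// prints [{"a":"foo"},{"b":"bar"},{"a":"bat"}]
console.log(JSON.stringify(toObject(children)));
// prints {"a":"bat","b":"bar"}
```

The plain-object form silently drops "foo", which is the aba failure described above.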
Kurt Cagle
2007-10-15 11:44:26
As I see it, the notation

{root:[{a:"foo"},{b:"bar"},{a:"bat"}]}


produces a JS tracking of:

root[0].a, root[1].b, root[2].a


This is where e4x has the advantage:


<root>
<a>foo</a>
<b>bar</b>
<a>bat</a>
</root>
encodes as:


root.a[0],root.b,root.a[1]


and you can work with all a nodes with the expression:


for each (var a in root.a){process(a);}


Using the full encoding layer described above, the same usage with object notation is a little more hairy:


for (var index=0;index != root.length;index++){
var node = root[index];
// each node is a one-key object such as {a:"foo"}
for (var objname in node){
if (objname=="a"){process(node[objname]);}
}
}


That highlights the fundamental difference between the two notations - e4x is implicitly designed to work with node-sets, while js isn't. This is more noticeable when you start talking about descendants:


/root/*/*/bin[starts-with(., 'fin')]


requires a lot of JavaScript object code to duplicate, and the generic:


//bin[starts-with(., 'fin')]


is worse.
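To give a sense of the amount of code involved, here is a minimal sketch of the generic descendant search in plain JavaScript: find every bin, at any depth, whose text starts with "fin". It assumes the ordered encoding discussed above, where an element is either {name: "text"} or {name: [child, child, ...]}. The function name and the sample data are made up for illustration.

```javascript
// Collects the text of every `name` entry, at any depth, whose text
// starts with `prefix`. Only entries holding text are matched; entries
// holding child arrays are descended into.
function descendantsStartingWith(children, name, prefix, found) {
  found = found || [];
  for (var i = 0; i < children.length; i++) {
    var node = children[i];
    for (var key in node) {
      var value = node[key];
      if (typeof value === "string") {
        if (key === name && value.indexOf(prefix) === 0) { found.push(value); }
      } else {
        descendantsStartingWith(value, name, prefix, found);
      }
    }
  }
  return found;
}

// Hypothetical sample data in the ordered encoding.
var root = [
  { group: [{ bin: "final" }, { bin: "start" }] },
  { group: [{ shelf: [{ bin: "finish" }] }] },
  { bin: "fin" }
];

console.log(descendantsStartingWith(root, "bin", "fin"));
// prints [ 'final', 'finish', 'fin' ]
```

This handles only one element name and one predicate; each additional axis or predicate an XPath expression can state in a few characters needs more hand-written traversal code here.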

Lars
2007-10-16 07:04:16
var spells = gameCharacter.spell;
var spell_arr=[];
for ([index,spell] in spells){
if (spell.level==1){spell_arr.push(spell.name);}
}

The notation is short, sweet, easy to code

Maybe, but it's 5 lines versus 8 lines for the equivalent DOM code. The XPath code is 7 lines. Yes, the JSON version is simpler and more elegant, but is it really a quantum leap?
Larry Edelstein
2007-10-16 09:50:04
E4X is very attractive to me; seeing it in the title of this article was why I read it. It looks to me like the most readable way of handling XML in JS. If MSIE supported it, I'd probably have used it in my last project.