Ron Bourret on XML and databases

by Edd Dumbill

A while back, I wrote about the changing face of XML, and said that the integration of XML and relational databases appeared to be a slow starter. XML database guru Ron Bourret wrote back with a different perspective.



Here's what he has to say:




I've been off wandering about the Web tonight and ran across the following
statement on your blog:



"Databases. Though there's a reasonable amount of interest in the W3C XML Query language, there's not much to say about XML and
databases. It doesn't seem to me that the integration of XML with
relational databases has taken off in the way we once thought it
might."



I will admit to just a wee bit of bias here, but I'm not sure I'd read a
lack of conference presentations as a sign that the integration of XML and relational
databases hasn't taken off. The use of Web services has certainly taken off, and
I'd venture to say that behind almost every Web service is a relational database
generating or receiving data in XML form.


On the other hand, the lack of conference papers seems indicative of
what's going on in the XML database industry:


1) The basic mappings between XML and relational databases are around five years
old and haven't changed much in that time. Hence, there's not a lot to talk
about at conferences, at least in terms of cutting edge stuff.


2) SQL/XML is somewhat new and Jim Melton seems to be a fixture at most
conferences, but he's just one guy and doesn't translate into more than one
talk. Although it is supported by DB2 and Oracle, SQL/XML suffers from a less
than wonderful surface syntax. I have no idea how widely it is used -- the Web
is almost devoid of information about it -- but if I were using DB2 or Oracle and
needed to publish my data as XML, I'd use SQL/XML. There aren't really any other
competitors with the same degree of flexibility.


3) XQuery over relational databases hasn't taken off like I thought it would.
IBM came out with an alphaWorks product a year or two ago, but it doesn't appear
to have gone anywhere since. There are also a number of integration products
that do this and new ones seem to be showing up slowly but steadily, but the
authors don't seem to do conference presentations. From a conference point of
view, XQuery-over-relational seems to be one slide in an XQuery presentation.


4) The big relational databases do seem to have some exciting things in the
works, but what they're doing is mostly implementation of existing ideas (e.g.
XQuery, SQL/XML) rather than pushing any new boundaries.


Yukon (SQL Server) is due out in 2005 and claims to have native XML
support -- that is, the XML data type is supported by what appears to be a
native XML database built into the relational database. If this is true, it's
significant. They also have an XQuery implementation and can query relational
data from XQuery as well as embedding XQuery queries in SQL. Good stuff.


Oracle 9i release 2 was the first out with "native" XML support, but their
"native" XML support means storing the XML in a CLOB column or mapping it to
relational columns with an object-relational mapping, with XPath supported over
both types of storage. The CLOB support is cute, but it won't scale without
heavy indexing. Oracle has an XQuery prototype with extensions for SQL queries
inside XQuery. It appears that this works over both the aforementioned storage
types, but I'm not really sure. If they add true native storage, this will be a
good product.


DB2 currently has SQL/XML support and they had the first XQuery-over-relational
implementation, although only on alphaWorks. They haven't said much publicly
about where they are going with XML support, but given the amount of work going
on in their labs, one assumes some of it will eventually make it into DB2.


Sybase, Informix, Access, and FoxPro have XML support as well, but I haven't
tracked these closely enough to know where they're going in the future. (Oddly
enough, Sybase provides native XML storage, but not much in the way of XML
support for existing data.) One of the great mysteries to me is why PostgreSQL
and MySQL essentially don't have XML support. They've got some toy stuff, but
nothing serious. Maybe Open Source middleware is filling the need?


My guess is that everything will pick up on this front in a year or two, with
companies moving towards what I consider the holy grail of XML support in
relational databases: native storage behind a first-class XML data type, XQuery
support with extensions for (a) including relational data or SQL queries and (b)
updates, SQL/XML support with extensions for embedded XQuery queries, and
support for JSR 225 (see below).


Of course, this still won't translate into lots of conference talks, as only a
few companies are involved.


5) IBM and Oracle are leading work on a Java API for XQuery (JSR 225), but this
isn't strictly for use with relational databases. Still, it should start showing
up at conferences at some point in the near future.


6) Native XML databases commonly include the ability to integrate data from
relational databases. (There are a number of XQuery-based integration engines
that work with relational and other types of data as well.) You can find a
number of these companies on the trade-show floor at conferences but, except for
Mike Champion from Software AG, few of them seem to do presentations.


I beat up the native XML database companies on this every chance I get, but I
haven't seemed to make much headway. Not sure if this is lack of budget --
except for Software AG and Sonic (eXcelon), these companies tend to be small --
or just that they're making the mistake a lot of technology firms make: assuming that
technology trumps marketing. (Which, of course, explains the failure of
Microsoft in the marketplace ;) That said, the number of presentations by these
companies at conferences has gone from zero a few years ago to a few now.
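
For readers who haven't seen the basic mappings Ron mentions in point 1, here is a minimal sketch of the simplest one, a table-based mapping: each row becomes an element and each column becomes a child element. It is an illustration only and not something from Ron's note. It uses Python's standard sqlite3 and xml.etree.ElementTree modules, and the orders table, the rows_to_xml helper, and the choice to omit elements for NULL values are all assumptions made for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml(conn, table, row_tag):
    """Table-based mapping: one element per row, one child element per column."""
    # The table name is interpolated into the SQL text, so it must come from
    # trusted code, never from user input.
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cursor.description]
    root = ET.Element(table)
    for row in cursor:
        row_el = ET.SubElement(root, row_tag)
        for name, value in zip(columns, row):
            if value is None:
                continue  # assumed convention: a NULL column produces no element
            ET.SubElement(row_el, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


def demo():
    # Hypothetical sample data, held in an in-memory database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, note TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "Acme", None), (2, "Globex", "rush")],
    )
    return rows_to_xml(conn, "orders", "order")


if __name__ == "__main__":
    print(demo())
```

Real products layer much more on top of this (attributes, nesting via foreign keys, schema-driven object-relational mappings), but the row-to-element idea is the core of it.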




Do you use XML support in databases? What's your opinion on the state of the technology?


8 Comments

jwenting
2004-06-02 03:30:30
hmm
I'd say it's more likely that those web services each use a custom-written mechanism to turn the results of SQL queries into XML for transmission over the network, rather than having the database engine generate the XML on the fly (a few special cases where the appserver handles that translation by using ITS custom functions excluded).


I've been involved in several such projects (though I admit none very recently, so maybe technology has moved on) and usually the messages are either generated on the fly every time or are generated on the fly once and then stored in a database table as flat text for later retrieval should the same data be requested again.

uche
2004-06-02 07:17:09
Nice observations. Some notes.
Those interested in this topic might want to see http://www.infoworld.com/article/04/04/23/17FExml_1.html?XML%20DATABASES.


Oracle 10g does have true native XML data type (i.e. neither CLOB nor mapping to tables). I'm pretty sure 9i did as well. IBM DB2 does as well. Microsoft is the sluggard in this area. From my own experience I'd give IBM DB2 a bit more of an edge than the article does, but it is right that Oracle has worked a fairly impressive business in XML capabilities.


I do agree with you that overall the state of things is not where all the hype would have pointed. I think that's largely a problem with the hype. XML and RDBMS do represent different ways of thinking about data, regardless of what relational purists argue, and the relationship between the two technologies will always be like the uneasy detente between XML and objects. I've touched on this likelihood at http://www.adtmag.com/article.asp?id=8596.


Finally, most Web services out there are still toys, and I'd say a pitiful few have RDBMS backing of any sort.


Speaking from what I see, Ron is right that RDBMS/XML integration is relatively sluggish right now.


--Uche

uche
2004-06-02 07:28:32
Clarification
By "the article" I meant the Infoworld article, not Edd's blog.

BR
2004-06-02 14:22:16
XML in DBMS is a bad idea...
Read pretty much any of these articles as to why:
http://www.google.com/search?q=site:www.dbdebunk.com+XML+data+management


Or this one:
http://searchdatabase.techtarget.com/tip/1,289483,sid13_gci557439,00.html

eddodds
2004-06-03 10:12:59
IMIC and healthcare.xml.org
Just wanted to announce that OASIS is planning on launching the international medical interoperability committee pretty soon and that news will be posted at healthcare.xml.org.

mchampion1
2004-06-03 11:11:26
XML in DBMS is a bad idea...
You can also read all sorts of articles there saying why SQL is a bad idea :-)

rpbourret
2004-06-03 12:42:23
Nice observations. Some notes.
Uche: "Oracle 10g does have true native XML data type (i.e. neither CLOB nor mapping to tables). I'm pretty sure 9i did as well."


Not that I can see. See figure 5-2 in Chapter 5 (registration required) of the Oracle® XML DB Developer's Guide 10g Release 1 (10.1). This explicitly shows that structured storage maps XML Schemas to relational tables with an object-relational mapping. Or read the description of XMLType in 9i XML Storage Models: One Size Does Not Fit All.


(One significant difference between Oracle's O-R mapping and most others is that Oracle can use hidden columns to store things like sibling order, comments, and PIs.)


Oracle throws the word "native" about when describing XMLType, but this is marketing lingo, not a technical description of the implementation of XMLType.


Uche: "IBM DB2 does as well."


Really? I've played with 8.1 and all the XML types I read about mapped to character storage of one sort or another (file, varchar, CLOB, etc.)


-- Ron

bry
2004-06-07 00:17:43
XML in DBMS is a bad idea...
everything if taken far enough is a really bad idea.