Reevaluating XSLT 2.0

by Kurt Cagle

I recently wrote a blog about the directions that I saw with XML, and while it has proved to be fairly popular, it has also generated a fair number of comments that really need their own more detailed examination. One of these, and one that I've been planning to write for a while anyway, has to do with my comments about XSLT 2.0 increasingly being used as a "router" language, replacing such applications as Microsoft's BizTalk Server.

This is not a disparagement of BizTalk - it's one of the Microsoft technologies that I have endorsed on a regular basis, because it solves one of the thornier issues involved in creating complex data systems: how do you handle the intermediation of data coming from different data sources? While I have some quibbles about the interface, I think BizTalk does its job admirably. It also served as a bridge technology for quite some time between the SQL and XML worlds, and it will continue to serve in that role for quite some time to come.

However, I also think that in the long run it is a bridge technology, and that as the world moves increasingly to the use of XML as the preferred data transport story, other technologies, such as XSLT 2.0, will likely end up making most of its functionality redundant and useful at best on the edge cases. Such statements presuppose a lot, of course, most notably that XML is in fact becoming the dominant form for expression of data content. Before digging into XSLT2, I'd like to address this issue head-on, as it ties into a number of other things I've been looking at of late.


17 Comments

Dimitre Novatchev
2007-03-09 12:22:15
Hi Kurt,


Thank you for the very interesting observations on the role of XSLT 2.0.


In this context, would you, please, let us know how XSLT 2.0 + FXSL 2.0 fits in the picture?


A good reference to FXSL 2.0 is the Extreme Markup Languages conference presentation I gave last August in Montreal:


http://www.idealliance.org/papers/extreme/proceedings/xslfo-pdf/2006/Novatchev01/EML2006Novatchev01.pdf


Cheers,
Dimitre Novatchev

edward kepler
2007-03-09 14:00:12
Great work. I would love to see an example of the eXist/XQuery/Transform web service you described. Sounds very interesting.

Hunter Thomas
2007-03-09 15:47:08
As interesting as all these capabilities are, they come with the necessity to treat XSLT as a real programming language if it wants to behave like one. There are two major things that I have yet to see emerge on the landscape to address the 'real language' issue.


The first is a development concern, a debugger. If you can't debug it, it's not going to be maintainable. It'll end up needing to be debugged by side effect, which is a horrible thing to do to a maintenance programmer.


The second is security. As more and more features become available, so do the doors to exploitation. If XSLT2 does in fact reach wide adoption within browsers, it will undoubtedly require a consistent sandbox technique to be applied. And we've seen how well that really works in practice. By which I mean, even if done well, with the release schedules that browsers end up with, there will end up being defects that allow exploitation.

Peter
2007-03-09 17:10:51
Sorry...this turned out rather long...


From my personal experience (recently having switched from developing XQuery to trying to use it in real world scenarios) there is certainly a need for better dev tools (although some are available).


More fundamentally though, there is a need to develop a vision and associated technologies to make these new things work seamlessly with the old things.


I honestly do not believe that XSLT or XQuery will replace the OO wirings (be it .net or java) that are so massively deployed and developed, anytime soon.


To give just two examples of the problems I bumped into:


When one calls an external XQuery function that has side effects (and a lot of useful "functions" have side effects) - the spec states you are on your own. Depending on the XQuery optimizer and the location of your function call in the query your function might be invoked before or after a second function call or it might not be invoked at all. Even if your app works today it might blow up when e.g. your XQuery optimizer for whatever reason decides to do something in a different order (which it is allowed to do). Not the most comfortable seat to sit in.
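

The hazard Peter describes can be illustrated outside XQuery. The following is only an analogy in Python, not XQuery itself: a lazily evaluated sequence stands in for an optimizer that is free to skip work whose result is never consumed. The function name `log_and_return` and the `calls` list are made up for the illustration.

```python
calls = []

def log_and_return(value):
    """A 'function with side effects': records the call, then returns value."""
    calls.append(value)
    return value

# Eager evaluation: every call happens, in order.
eager = [log_and_return(n) for n in (1, 2, 3)]

# Lazy evaluation: nothing is called until a result is consumed.
calls_before_lazy = len(calls)
lazy = (log_and_return(n) for n in (10, 20, 30))
first = next(lazy)  # only the first item is ever requested

# Side effects that actually happened during the lazy part.
lazy_calls = calls[calls_before_lazy:]
print(eager, first, lazy_calls)
```

The calls for 20 and 30 never run, so any logging or updates they were supposed to perform silently do not happen. An XQuery processor has comparable freedom with external functions, which is why relying on their side effects is fragile.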


A second example:


While I am pretty new to it, the EJB3 spec, unlike its predecessors, is really a nice beast. E.g. the caching you get there often gives you interesting performance benefits that would be hard to achieve otherwise. That, however, is as far as I know, and at least today, out of the scope of XQuery (or XSLT).


It might be doable somewhere in the future to write an XSLT stylesheet or XQuery application that enforces security restrictions, reads XML from a SOAP message, pulls in some information through some next version of an ORM spec, executes some business logic, writes some log info to a file, and then returns some feedback to some other service, but today, with XSLT2 or XQuery1 we are not there just yet.


As long as the indeed interesting features of XQuery/XSLT do not *seamlessly* integrate with all other parts of a typical business (web) app/service environment, I at least would not choose to build the main driver of the app/service in XSLT/XQuery. In such scenarios XQuery and XSLT are interesting tools to have around (albeit difficult ones to learn and use properly) but they will stay sitting somewhat around the main part of most apps....not that that is a bad thing, but I got the impression from the article that a somewhat different position is taken.


The little I know of LINQ seems very promising in this context and I hope someday a similar alternative will be available in the Java space. Just like in the 80's when the 4GLs with very good (R)DB(MS) integration were having their successes, an integrated data / programming language environment that not only supports R(DB)MS but the whole range of data sources we want to use in today's applications seems a promising way forward.


Peter

Kurt Cagle
2007-03-09 18:52:51
A lot of good comments here, so rather than trying to interleave, let me address them all here.


Dimitre,


I don't know yet. I'm going to spend some time playing with FXSL2 (you'd be surprised by how little time I have to play with XSLT2 right now), but I hope to address the issue shortly.


Ed,


Thanks! I've not been avoiding you on XForms.org, btw ... I've just been trying to balance a lot of things locally. I'd also like to work with some of the ideas you suggested and tie in XSLT2 as a web service infrastructure.


Hunter,


Oxygen 8.1 has a superb XSLT debugger, which has saved my @#!&$ more than once in the midst of a tight deadline.


Security is a concern, though to be honest I think less of one with XSLT than it is with most application frameworks. XSLT operates on XML at a purely syntactic level; there is no specific need to instantiate a scripting interface or otherwise instantiate any object beyond an XML DOM or SAX entity, and those are locked down fairly tight. Client-side XSLT is subject to the same sandboxing that any other application is (and in general doesn't create hacks that compromise the integrity of that sandbox), so if anything I'd say it's probably a lot more secure than most application frameworks.


Peter,


You raise a number of excellent points. My sense with XQuery is that its scope is, and should remain, relatively low level; you don't want to be building extensive recursive calls in XQuery if you can help it, and you reach a point fairly quickly where it is just easier to make an XSLT call (though if you can do it from XQuery - that's powerful).


The side-effect issue is one I've wrestled with more than once, and I think it is a valid issue. Like all languages, XQuery needs to be worked with by enough people to make a best practices methodology feasible, and I don't think we're there yet by a ways.


One of the reasons I like eXist is that it makes it possible to create such a seamless application environment, where you don't need to escape outside of the context of XML development. Admittedly, this isn't a perfect solution in all cases, but when you CAN do it, magic happens.


I'm still playing with LINQ. I think it's an exciting technology and expect it to play a large role in Microsoft development in the future. However, it still embodies a principle I've come to realize is crucial - we are seeing a move towards XML becoming increasingly the default representation for at least a certain class of data structures, and this in turn is driving the adoption of XML-oriented tools and techniques. I don't necessarily see XSLT 2.0 becoming the be all and end all of server technology, but what I do see is that the technology will have much less resemblance to the ASP/PHP/JSP modality and will increasingly be oriented toward the manipulation of XML entities as their primary focus. I do think your assessment in the main will likely remain true for a lot of reasons (XSLT, even XSLT 2.0, requires that you take your brain and twist it sidewise in order to really get it, so I see it as being less heavily adopted than the easier to understand imperative code); on the flipside, I see XSLT 2.0 actually becoming the core for many larger scale XML-centric applications.


piers
2007-03-09 20:17:53
While eXist/XQuery/XSLT2 may not be ideally suited to an enterprise level server project, when combined with input/output from/to an XML envelope like RSS or Atom, you have a very powerful tool for building lightweight services quickly, seamlessly, and yes, magically.

Peter
2007-03-10 03:29:44
Thanks for taking the time to reply Kurt. I can certainly agree with your comments. Allow me to add an observation that strikes me as odd. With respect to


"However, it still embodies a principle I've come to realize is crucial - we are seeing a move towards XML becoming increasingly the default representation for at least a certain class of data structures, and this in turn is driving the adoption of XML-oriented tools and techniques."


Is this not strange though? We have seen the "main stream" programming environments migrate from those that just had a bunch of data structures and associated routines that worked on those (e.g. Fortran and Cobol), to the abstract data types from Pascal, Modula, etc. to the current oo environments, including the useful generic extensions. In general there seemed to be a trend to isolate behavior from data, or to generalize algorithms through generic functions. Currently you observe a shift that seems to go back to something that more resembles "data driven programming". I guess an important difference with the Fortran/Cobol era is that today we want to work on (XML) syntax as opposed to in memory data structures. The advantages this brings for interoperability are obvious, but I have the feeling that some of the issues the Cobol et al environment struggle(d) with will rear their ugly head in this XML syntax centric world as well.


Personally I would be reluctant to build any core business logic based upon XML data structures. As such the "xml programming" tools are very nice indeed to sit on the input and output ends of the app, and would benefit substantially from seamless integration with the "oo" world, but I cannot see xml data structures play a viable role anywhere in between. That certainly does not diminish the value of the XQuery, XSLT, LINQ, E4X etc technologies - liberal on your input and strict on your output is important and requires good tools. It looks like these tools are getting there. The integration is lacking though....at least from what I have experienced.


And apologies for being so long winded,


Peter


Kurt Cagle
2007-03-10 09:35:42
I see (or at least interpret) the trends a bit differently. I was pondering this last night as I was reading a book on UML2. That isolation of data and functionality still very much exists, but what I see with XML is that it is, fundamentally, a mechanism for abstraction. XSLT2 aside, XML is a lousy language for writing imperative programs. Issues of verbosity aside (though I think they are important) XML fundamentally is a finite representation of state or intent, but is generally useless for doing anything with that state unless you have some means of "interpreting and acting upon" that state.


Perhaps the most telling question I can see here is "what is a DOM?" A DOM doesn't fit neatly into an OOP class, largely because it is best modeled as a very flexible collection of imperative classes that provide the semantics (the operability) to the representation of each entity within that DOM. I think that's what a lot of people miss when they look at XML - they see data structure, but a data structure is in fact only one use case of a large DOM model.


This is one of the reasons that I'm such a proponent of bindings and behaviors, and an insight as to why I like XSLT2 for what it does. An elemental binding establishes a semantic association with a given element, such that the associated operations and attributes (in UML speak, not XML) effectively provide the inputs into the instanced object (and provide a queryable interface for the dynamic state of that object). The XML by itself has no intrinsic semantic - it is only when it is bound that the semantics manifest.


XSLT2 works at a pre-semantic level; in many respects it is a not so distant cousin to regular expressions. Both are purely functional languages, both are pattern oriented (the one with respect to lexical patterns, the other with respect to relational patterns), both are largely transformative in nature, and both are a real pain in the butt to program in. You wouldn't build operating systems from them (the router example I gave above is perhaps the limit of where I see XSLT going in that regard), but they are nonetheless both crucial and powerful for what they do.


What XSLT2 doesn't do (usually, unless you really force the envelope on it) is provide semantics - it is, in fact, a fairly bad idea to do it, because if it did do so, the effect would be very much analogous to destroying and recreating the entire DOM during every step of the program. There are more efficient means to do that (bindings), though one interesting point to consider is that many bindings in fact do such stepped transformations, largely because it is more efficient at that level of granularity, (though these are typically leaf transformations).


Thus, XSLT2 handles DOM to DOM transformations, with the understanding that while the XSLT2 may establish WHICH bindings will be manifest for a given node in a DOM, in general it is not the XSLT2's responsibility to instantiate the bindings directly.


Thus I'm not sure that I would concede the argument that XML represents a step "backwards" in contravention of the OOP paradigm - rather, I see it as an abstraction beyond that OOP, one which is more emergent at this point than fully articulated.


Now, given that, it's still worth addressing where I see the imperative languages moving as well. I think that there is concurrent evolution taking place against the backdrop of XML. One of the biggest factors that I see right now is that most of the imperative languages that were developed either prior to or in the early stages of the Internet tend to all have fairly strong similarities - the classical C++ OOP model. Data flows are largely reflective of disk serialization and limited pipe communication, and most tend to be fairly strongly typed. The languages that came afterwards are increasingly weakly typed, have fairly shallow inheritance structures, are built with a far more simplified data access/parsing/serialization model, and are generally at the forefront of working with ad hoc object definitions, one subset of which being E4X, JSON, LINQ, etc. - micro-languages. I don't see this as being at all accidental. These languages are tending towards a more declarative format, one that actually suits the abstraction model inherent in the XML DOM quite well. It's another reason why I see the DOM/Binding model as being the likely outcome. It will still take some time to manifest completely, but my gut feel is that the use of XML as the "bones" or framework, the use of imperative languages to act increasingly as bindings (the muscles) and the use of a core messaging architecture (perhaps built around XML, perhaps not) for messaging and event and exception handling management is likely to be the end result. MVC is a powerful design pattern, and one that I see being the "REST" state of the Internet; as more applications become web applications (or at least are centered around the web) this will only become more prevalent.

Sylvain Hellegouarch
2007-03-10 13:02:24
Excellent article Kurt as usual.


Although I can definitely not comment on the technology itself what I wonder is whether or not there will be enough momentum that the chain you suggest (NXD+XQuery+XSLT2+XForms+CSS) will ever take off. The big problem is that this chain forces a fairly different mind set from the developer point of view (just the XForms model is a mind set on its own really and the learning curve can be tough) and I don't see how companies will cope with it.


AFAIK companies look for stability and maintenance. At an age where development is outsourced or handed over to contractors, who have a high level of turnover, the technology needs to be the most common one out there. Although XML is now part of any business as a unit of message, I'm not sure business can be built atop the XML family without forcing companies to spend more money on experts and people who know what they're doing. This means losing flexibility.


So my point is that even if the technology is there and brilliant I wonder about how the business will integrate it.


2007-03-10 14:37:40
Kurt,


Interesting meanderings indeed, although we are fast approaching the level of abstraction of the typical xml-dev permathread :)


I don't think there is any real disagreement. There might be a difference where on the application stack and abstraction level we are putting the focus, but other than that I agree with what you are saying.


I originally reacted to the article, something which I almost never do, because I got the impression it is implying that XSLT and XQuery are getting ready to go beyond their intended use case of working on XML and I do not think that is the case, nor do I think it will happen anytime soon (despite initiatives like XQuery scripting). They are 'just' becoming better at what they were designed to do - working with XML, or information that presents itself as XML (including tabular database data).


I am not saying that to a certain extent it can't be done today, but one has to be pretty persistent to get there and while entertaining and intellectually challenging, it is not the most efficient approach to get the application finished. I have been guilty of that sin myself the last year.


My experience is though that even if one sticks with using XQuery/XSLT for what they were meant there is an important chasm to cross. A chasm that I hope can be bridged to a certain extent by better tooling, perhaps along the lines of what LINQ offers. It should make the job of converting from the external 'syntax' to your business object model smoother. This is really, on a different level, the same problem as the one of integrating rdbms information in the application. Trying to strike the balance between ease of use and flexibility, we currently seem to have landed at ORM tools. If there is anything to learn from that journey, it is that one should not try to tightly couple the external representation, the syntax or database model if you want, with the application model (remember the rise and fall of oodbms technology).


So, I guess what I am saying is that what I lack today for my day to day job, is a set of tools, or even patterns, that make it easy to feed information from my app into XQuery/XSLT and vice versa without having to "manually" convert back and forth between the app's object model and DOM or XDM or whatever.


Once that is available I can focus on finding the right balance for the problems that can't be solved with tooling, like how loosely coupled can you make different systems but still make sure they 'understand' each other.



Kurt Cagle
2007-03-10 14:42:59
Sylvain,


You raise some good points. It's always difficult to get an idea about the number of users of any given technology, especially one like XML that often appears in a supporting role in so many, so getting an idea about the pervasiveness of "XML developers" can be a little hard. However, as at least one sample, I offer up a few searches on Monster.com, for job listings in the US in the last two months:


Technology      Number of Jobs
ASP.NET         > 5000
HTML            > 5000
Java            > 5000
JavaScript      > 5000
Perl            > 5000
XML             > 5000
Flash           3754
JSP             3291
PHP             2309
AJAX            1985
XSLT            1381
Python          1128
XHTML           862
ActionScript    573
Ruby/Rails      400
XQuery          65
XAML/WPF        60
SVG             21
XForms          6


Now, this may be apropos of nothing - I'd have to sample over a much broader period of time for this to be anything other than a snapshot, but what it does indicate is telling. There are more than 5000 positions open for XML, which indicates to me that it has reached such a level that not knowing XML at this stage can be considered a major liability in your programming portfolio. That some XSLT work may be subsumed into this total should also be considered.


JavaScript is also up in this category, which is actually quite interesting; I've broken down JavaScript and AJAX as two distinct categories, but I suspect that the actual AJAX programming levels are considerably higher than the 1985 jobs given here.


The real eye-openers, however, are in the second tier.


First, Flash. Flash is, not surprisingly, still very popular, but it's hard to tell how much of this is programming - breaking out ActionScript as a distinct language only indicates 573 listings. My gut feel is that Flash programming is likewise between 1250 and 2000 jobs.


However, PHP, which largely powers most contemporary LAMP applications, is not all that much higher in usage than XSLT - not quite twice the latter. There are more job openings requiring XSLT than there are in Python or Ruby (even taking into account all of the Ruby on Rails permutations).


XHTML exists at a lower level than languages such as Ruby, but I think this is partially due to the fact that most hiring managers don't differentiate between XHTML and HTML, even where one does exist. The fact that there is such a fairly significant chunk of XHTML, however, shows to me that more than a few companies are beginning to realign along the XML rather than the HTML arc, which also has a tendency to favor XSLT.


XQuery and XForms are nascent technologies - I expect once XForms gets fully incorporated into Mozilla, the XForms numbers will shoot up pretty dramatically, and XQuery development should climb as companies and OSS projects begin introducing XQuery support now that the W3C rec is final.


The point I'm trying to make on all of this is that in terms of open positions (which are, admittedly, not the same as hires) the demand for XSLT skills is comparable to that of PHP and other OSS technologies. I see XSLT as being an enabling technology for the others - if you are using XHTML + CSS + XSLT, then XForms becomes an easier proposition to accept, XQuery becomes easier to accept, and the above pipeline tends to flow naturally. The next major stage, as I indicated earlier, is bindings, but that's going to take some consolidation in the AJAX space (and perhaps some W3C actions) first before it becomes realistic.


Kurt Cagle
2007-03-10 15:23:49
Anonymous (Peter?),


I don't see that much disagreement myself. What I was describing in the original article (and the commentary) to a certain extent are reflections I've had with people that are using XSLT in live, commercial grade systems. XSLT as router IS happening ... I've had three or four different IT managers tell me that they've migrated to XSLT as their primary routing system over using BizTalk (though I also know of course that BizTalk incorporates XSLT routing itself). XSLT as a transformational grammar to develop other languages is also very much there (this from a PM at Microsoft, among others, albeit from a couple of years back).
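

As a rough illustration of what "XSLT as router" means, here is a minimal sketch of content-based routing. It is written in Python with the standard library rather than XSLT, purely to show the idea; in XSLT the same decision would typically be a set of templates matching on the incoming document. The message names (`order`, `invoice`) and queue names are hypothetical and not taken from any of the systems mentioned above.

```python
import xml.etree.ElementTree as ET

# Hypothetical routing table: root element name -> destination queue.
ROUTES = {
    "order": "fulfilment-queue",
    "invoice": "billing-queue",
}

def route(message_xml, default="dead-letter-queue"):
    """Pick a destination by inspecting the document's root element."""
    root = ET.fromstring(message_xml)
    return ROUTES.get(root.tag, default)

destinations = [
    route("<order id='17'><item>widget</item></order>"),
    route("<invoice id='9'/>"),
    route("<memo/>"),  # no matching route, falls through to the default
]
print(destinations)
```

The router never needs to understand what an order or an invoice means; it only inspects the shape of the message, which is the syntactic level at which XSLT operates.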


XSLT has its place, and I've certainly seen it abused to do things it probably should not be doing. Ditto XQuery, XForms, and most of the other W3C technologies. Nonetheless, I see with XSLT2 that XSLT will become easier to write and to manage, which will in turn make it more accessible and consequently likely to be more heavily adopted.


A number of IT managers that I've talked to already have an XSLT2 migration strategy formulated, though they are waiting primarily for either Microsoft (System.xml.xslt), Sun (Xalan) or Red Hat (LibXSLT) to upgrade their core transformers rather than trying to integrate outliers like Saxon. In other words, more than a few IT managers are basically waiting for canonical adoption of XSLT2 in their platform so that they can integrate it without having to add to their development costs.


With the approval of those specs, I expect that to happen by early 2008. With the considerable increase in power between XSLT1 and XSLT2, I also expect that this process will result in a number of novel uses of these technologies that we haven't really even begun to touch yet by late 2008 or early 2009.


I'm going to address LINQ under a separate posting, but I see LINQ as being an instance of the "new" form of XML processing - more AJAX like, with XML treated as native, inline datatypes. LINQ is complementary to XSLT in that LINQ focuses on the ease of use of existing DOM entities and handles what I'd see as localized transformations (E4X does the same thing), but there will still be times where the best course of action when working with an XML bundle is to use it as the basis for a transformation.


I find with E4X (and XQuery, for that matter) that there's a point of complexity (usually at about a dozen lines or so of XML) where discrete manipulation of individual properties has become cumbersome and it's easier and more flexible to go with an XSLT transformation. My suspicion is that LINQ is the same way. If the result can be in turn expressed in the same objective XML style (the one weak point in this whole train - not insurmountable but nonetheless awkward) then you can go with a work flow that looks something like:


  1. Set up original XML as objectiveXML entity (ob1), using inline manipulation in JavaScript (E4X), VB.NET or C# (LINQ) or XQuery (XPath) to build the core "parametric bundle".

  2. Pass the XML into an XSLT transformation T() to generate a resultant DOM reflecting either new state or subordinate child processes: T(ob1) => (ob2,ob3,..,obN)

  3. Pass these secondary processes off to either persistence or services entities, possibly with additional transformations for handling presentation layer output.
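

The three steps above can be sketched as follows. This is a minimal illustration in Python using only the standard library; a plain function stands in for the XSLT transformation T(), and the element names (`order`, `item`, `shipment`) and the helper names are assumptions made for the example, not part of any system described here.

```python
import xml.etree.ElementTree as ET

def build_bundle(customer, items):
    """Step 1: assemble the original XML entity (ob1) inline."""
    ob1 = ET.Element("order", {"customer": customer})
    for name in items:
        ET.SubElement(ob1, "item").text = name
    return ob1

def transform(ob1):
    """Step 2: T(ob1) => (ob2, ..., obN); here, one document per item."""
    results = []
    for item in ob1.findall("item"):
        task = ET.Element("shipment", {"customer": ob1.get("customer")})
        task.text = item.text
        results.append(task)
    return results

def dispatch(documents):
    """Step 3: hand each result to a service; serialising stands in for that."""
    return [ET.tostring(doc, encoding="unicode") for doc in documents]

outbox = dispatch(transform(build_bundle("acme", ["widget", "gear"])))
print(outbox)
```

Each stage consumes and produces XML, so the only places the data leaves the XML world are the two ends of the pipeline.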


This isn't theoretical; I've been working with E4X, XQuery and other objective XML languages for a while, and I find that this pattern appears all the time. The significant aspect here is that the pattern has minimal impedance - what you are doing is mapping out an orchestration pipeline, either synchronously or asynchronously, with the endpoints being access points into non-XML portions of the system.


I don't think that XSLT is itself magical - only that it serves a necessary "magical" purpose as a transformative engine in such an orchestrated system. You can readily replace it and the query engine with something else, but those replacements will likely perform the same roles. In homogeneous synchronous systems, you can readily control the impedance due to XML to non-XML interfaces, but in heterogeneous asynchronous systems that impedance can become a significant deterrent to the scalability of the system. As you abstract the XML/non-XML gateways, this generally serves to make the system increasingly XML at the core. That's all that I'm trying to imply from the above article - syntactics at the center, semantics at the edge.


Gods, one of these days I need to write a thesis on this. To me it seems inevitable, but that may only be because I tend to view the world through XML tinted glasses.

Sylvain Hellegouarch
2007-03-10 16:08:42
Kurt,


I can appreciate the way you argue my point but I wonder if it holds since you start from its conclusion: the raw numbers themselves.


First of all, I consider that lots (if not most, I would dare say) of job postings are filled with the broadest set of technology buzzwords one can find. I do wonder how meaningful they are and how close to reality they truly are. The fact that the word XSLT appears in a job description says very little about how the technology is being used. I think we can agree that writing an XSLT stylesheet as a template for a web page is much, much different from using XSLT for more advanced XML transformations, as in the pipeline described above.


Secondly, although I have much, much less experience than you in that respect, I would venture to say that large companies are usually careful when it comes to trying a different way of thinking. Moreover, when we see that technologies like SOAP are now described as the root of all evil by most columnists when companies are only in second gear of actually using them, I do wonder if they will actually jump the gap to a "whole XML".


Moreover the tools aren't there yet. If RDBMS are so mainstream when OODBMS or even XMLDBMS have proved they could be more relevant in some cases, it's also because there are more tools, bigger names and a better understanding of SQL. One of the projects I have worked on was heavily using different aspects of object-oriented programming with the .NET framework and we were pushing to use db4o, but it was never approved because there was a strong requirement to be able to plug in OLAP products or be able to run SQL queries outside of the program. Whether or not we like it, RDBMS and SQL are the default storage/querying system companies seem to look at.


So the problem I see here is that the chain NXD+XQuery+XSLT2+XForms+CSS needs to be complete. If you take away one of its components you break the model implied by XML throughout and therefore the point is moot.


I wish for instance that XForms would get more momentum, but as long as Microsoft doesn't support it natively there is little chance companies will invest in it, I think.


Overall what I'm trying to say is that even though the technology is there and will certainly take off in some specific cases I do wonder if this will be as big as you'd like it to be.

Kurt Cagle
2007-03-10 18:30:47
Sylvain,


Maybe. The tech sector is unique in that you are selling to the future. In the commodities sector you have to predict the level of usage of a given product in order to maximize profit, but you know that the need for a barrel of oil is not, in general, going to change appreciably in six months or six years. In the health care sector, your primary service, providing care to patients, does not change, although the tools (again the technology) can change both the level of information you are able to work with and the degree to which that information can affect your ability to provide your core services.


For this reason information technologies generally evolve far faster than the industries that they service, while at the same time the tail of those services can stretch back pretty dramatically. Note that Perl is still well above the 5000 person mark, largely because of ALL of those legacy systems built around Perl that need to be maintained.


I'm a systems architect - I basically look at the needs of my clients not just today, but five to ten years down the road ... and in fact I have to work carefully to ensure that my focus remains on building systems that are realistically operational and able to meet the business needs of those companies 5-10 years down the road, because these companies are going to spend a significant amount of money today to ensure that this happens.


In many ways this means that an architect and a programmer are often at odds. The purpose of the architect is to create the system that best fulfills the requirements of the customer in the future. The challenge to the programmer is that they have to build that system with today's tools. The software vendor, of course, wants to encourage this, because they are trying to ensure that the tools they are developing today will limit the alternatives to the customer choosing their tools tomorrow.


Trends that I see now indicate that XML is becoming the de facto mechanism for messaging (I'm going to be a little liberal in calling JSON an XML standard here; the differences are syntactic, but the philosophies are not so radically different that I see any benefit in seeing them as distinct technologies). The messaging architecture has a profound impact upon the shape of a technology, because the messaging format needs to be serializable and parseable at each point of the message, which in turn implies gateways and information impedance. One solution to such impedance was SOAP/WSDL based web services. Another is REST (call it the W3C stack if you will, even though I know SOAP is currently a W3C standard), and the current directions in this area imply that the ultimate solution will reside somewhere in between those two arenas.


SQL is not going to go away anytime soon, but SQL is also a pre-XML technology - it's not really good at messaging architectures, and relies upon fairly non-uniform binary implementations to retrieve information from the database. In other words, SQL makes for a lousy gateway.


XQuery (once some agreement is made on the update mechanism) is an abstraction layer to the database, one that makes the database appear as an XML messaging source or sink. With extensibility in the XQuery layer, that may include sending SQL to the database through an appropriate pipeline, but the resulting data will still appear to this abstraction layer as if it was XML.
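A minimal sketch of that abstraction-layer idea, written in Python rather than XQuery purely for illustration: the caller sends SQL through a small wrapper and only ever sees XML coming back. The function name `query_as_xml`, the `orders` table and its columns are all hypothetical, and the sketch assumes column names are valid XML element names.

```python
import sqlite3
import xml.etree.ElementTree as ET


def query_as_xml(conn, sql, root_name="results", row_name="row"):
    """Run a SQL query and return the result set serialized as XML text.

    Hypothetical helper: assumes every column name is a valid XML name.
    """
    cursor = conn.execute(sql)
    columns = [description[0] for description in cursor.description]
    root = ET.Element(root_name)
    for record in cursor:
        row = ET.SubElement(root, row_name)
        for name, value in zip(columns, record):
            # NULL columns become empty elements in this sketch.
            ET.SubElement(row, name).text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")


# Illustrative data only; the table and values are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT)")
conn.execute("INSERT INTO orders VALUES (1, 'Acme')")
xml_text = query_as_xml(conn, "SELECT id, customer FROM orders")
print(xml_text)
```

The storage engine stays relational; only the results are encoded as XML, which is the division of labour the paragraph above describes.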


The idea of an XML database in this context thus appears to be somewhat quaint, and largely unnecessary - XML is not efficient as a structural architecture for OLAP or similar large scale database invocations ... but it doesn't need to be. XML only needs to be efficient at encoding the results of these queries.


Now, back to the points you raised - every computing era has had a specific domain that drove both the products and the developer skills (who are, after all, only another kind of vendor), and these in turn were driven largely by the dominant messaging systems that set the scope of that domain.


The first generation web involved the development of low level packet communication - the Ethernet era - and the messaging architectures were at a very low level. This permitted very tight granularization of commands, but at the cost of complexity - you needed to be a very skilled developer indeed to get those packets to do specialized actions, until it became more advantageous to build an abstraction layer (and the efficiencies made it feasible).


The second generation web involved the development of high level HTTP message communication, and for the most part it devolved into a somewhat extended client-server based system that had a largely read focus. A lot of the tools that were developed during the 1990s involved ways of trying to circumvent the limitations of what was, for most OS developers, largely a step backwards. This was because the dominant domain that they were used to working with consisted primarily of the PC itself, or at worst the abstraction to the immediate LAN.


The third generation web is basically large scale distributed systems primarily using the Internet as the messaging infrastructure. The standardization of XML as the messaging infrastructure dramatically drops the level of information impedance between systems, and while the shape of that is still to a certain extent being determined, the broad outlines at least are now becoming clear.


What this means from a developer perspective is also becoming clear - technologies that facilitate the integration of the XML messaging architecture will tend over time to replace those that impede it. Please note that if I am wrong on this one point - the belief that XML will be the primary architecture for messaging - then I am wrong on everything, but I think it reasonably safe to say that I see little evidence that I am wrong about this.


This is the backdrop against which everything else plays - not the number of developers that are skilled in technology X, not the amount of money put into the marketing budget of Microsoft or IBM, not the degree of aggrandizement in the popular media against this or that technology. The mechanism of messaging ultimately drives what will prove marketable, in terms of developer skills, vendor tools, and ultimately customer adoption. The only thing to be determined is the degree of time that this process will take.


I may be wrong in the particulars - some tool may come along that works with XML better than XSLT (possible) or XQuery (likely) or XForms (again possibly) and that satisfies the broader political boundaries of the solution space (cost, both extrinsic and intrinsic, political and social sentiment, efficiency and extensibility), but I don't see the shape of the future being significantly different, only the names of the tools. Of course, that's only my interpretation, and as I have said, if I am wrong about the acceptance of XML as the messaging architecture then I am wrong about most of it. The only other serious contender in this space is JSON, and JSON is able only to provide a subset of what XML does ... the crucial question is whether that subset is sufficient (I don't believe it is, for reasons I've elucidated elsewhere in this thread, but others may prove me wrong here).


Okay, sorry for the length of this response, but I wanted to show you the basis for my reasoning.

Peter
2007-03-11 04:12:20
Kurt,


Anonymous was me yes...sorry for that. The result of typing outside of the comment box, which is sometimes a bit tight. The site could also use threaded comments, so I am not interfering with other comments.


But, thanks for sharing your experiences. I certainly find them interesting, and I promise to stop after this entry.


So...


Just because one can build XSLT/XQuery routers, and people are doing it, does not mean it is necessarily the right tool for the job. As long as these routers only manipulate the XML syntax and use the XML syntax to implement the application logic, using XML tooling to get the job done is probably fine, but one crosses an unseen border when other information or services are needed to implement the routing algorithms. At first I am sure that will work out fine: call an external function here or there, or map some external information in an XML wrapper and on it goes. At the end of the day though that becomes as messy as trying to write the same router straight on the DOM (just another form of XML syntax) in Java, C# or whatever. The first approach gives you flexible and maintainable XML operations; the other gives you easy integration with other parts of the application. But the bottom line is (imo) that XML syntax is not the right level of abstraction to build most applications from.
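To make that border concrete, here is a rough sketch in Python (not XSLT or XQuery) of the two situations. The first function routes purely on XML syntax; the second needs an outside service to decide. The route names, the `customer` attribute and the `is_priority_customer` callback are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical routing table: root element name -> destination queue name.
ROUTES = {"order": "fulfilment", "invoice": "billing"}


def route(message, routes=ROUTES, default="dead-letter"):
    """Route on XML syntax alone: the root element name picks the destination."""
    root = ET.fromstring(message)
    return routes.get(root.tag, default)


def route_with_lookup(message, is_priority_customer):
    """Route using information that lives outside the message.

    `is_priority_customer` stands in for an external service call; this is
    the point where pure XML tooling starts needing extension functions.
    """
    root = ET.fromstring(message)
    destination = route(message)
    customer = root.get("customer", "")
    if destination == "fulfilment" and is_priority_customer(customer):
        return "fulfilment-express"
    return destination
```

The first function maps naturally onto template matching in a stylesheet; the second is where the integration question starts.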


For lack of an easy integration, I have reluctantly used both approaches (not with a router, but other applications that work with XML input and create XML output). The obvious third alternative I have used is to 'manually' implement two-way conversions from the XML syntax environment to the application domain objects. In the end I think that is the best approach, although I'd like to get rid of the development and maintenance impact that 'manual' part has on the duration of the development effort. XQuery (or XSLT) certainly helps, but it is especially here that I could see "LINQ like" technologies play a valuable part.
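A small sketch of what such a 'manual' two-way conversion can look like, here in Python with a made-up `Order` domain object and a made-up `<order>` message shape. Every field has to be mapped by hand in both directions, which is exactly the maintenance cost mentioned above.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class Order:
    """Hypothetical application domain object."""
    order_id: int
    customer: str


def order_from_xml(text):
    """XML syntax -> domain object (hand-written mapping)."""
    root = ET.fromstring(text)
    return Order(
        order_id=int(root.get("id")),
        customer=root.findtext("customer", default=""),
    )


def order_to_xml(order):
    """Domain object -> XML syntax (the mirror image, also hand-written)."""
    root = ET.Element("order", {"id": str(order.order_id)})
    ET.SubElement(root, "customer").text = order.customer
    return ET.tostring(root, encoding="unicode")


# Round trip: the object survives a trip through the XML form unchanged.
original = Order(order_id=42, customer="Acme")
round_tripped = order_from_xml(order_to_xml(original))
```

Application logic then works on `Order` instances and never touches the XML tree directly; adding a field means touching both functions.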


Assuming I understand what you mean with ObjectXML, that is perhaps an interesting "pattern" to be applied to this conversion task.


For a lot of smaller stuff it is today, unfortunately, still very tempting to do it the 'wrong' way, and either suffer through the DOM or have fun with XQuery (or XSLT) and try to invoke the services from there, but I'll try an 'ObjectXML' approach next time I need something like this.


Oh...and perhaps tell the IT managers waiting for Xalan to level up with Saxon not to hold their breath ( ;) ). Not that I have any predictive talents, but I'd be surprised if Xalan is able to match the quality and performance of Saxon by the beginning of next year.


Peter


mperestrello
2007-03-15 17:04:45
Hi All,
If you'll excuse me, I would like to use the topic of this article to announce the beta release of an OPEN SOURCE project which brings XSLT 2.0 to BizTalk, called XChain. This uses the Saxon Processor to allow using the constructs mentioned above in the BizTalk environment. I hope you don't consider this comment inappropriate, as I find that what is now happening in terms of the open source community and cross compatibility can lead to breaking vendor lock-in, especially on the Microsoft platform.


The Microsoft decision to not do XSLT 2.0 was a major stumbling block to more widespread use of the XSLT 2.0 standard. With the release of Saxon for .NET, there are suddenly many new ways to do what you want to do in the MS world, without being locked down to Microsoft tools or decisions.

Kurt Cagle
2007-03-15 18:05:44
Cool! I can only see this as a good thing. XSLT2 in BizTalk opens up a great number of possibilities, and I think will help promote the use of XSLT2 throughout the Windows sphere!