Streaming XPath

by Uche Ogbuji

There has been a lot of discussion of a subset of XPath that can be evaluated on the fly while a document is being parsed. A good way to think about it is as the subset of XPath that could be implemented on top of SAX with relatively little fuss. In the XPath NG project, we have been discussing such a subset as a possible goal of XPath NG. This is a summary of the streaming XPath proposals that have been brought up so far.
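To make "implemented in SAX with relatively little fuss" concrete, here is a minimal sketch in Python of what such a subset might look like. It is not the grammar of any of the proposals summarized here; it assumes a deliberately tiny subset (absolute paths made only of child steps, each an element name or "*"), and the names PathMatcher and stream_select are made up for illustration. The only state it keeps is the stack of open element names, plus the text of the element currently being matched.

```python
import xml.sax


class PathMatcher(xml.sax.ContentHandler):
    """Match an absolute, child-axis-only path against SAX events.

    Hypothetical illustration: supports paths such as /doc/section/title
    or /doc/*/title, and nothing else (no predicates, no other axes).
    """

    def __init__(self, path):
        super().__init__()
        self.steps = [s for s in path.split("/") if s]
        self.stack = []      # names of the currently open elements
        self.matches = []    # string value of each matched element
        self._buffer = None  # text pieces collected while inside a match

    def _matches_current(self):
        # A fixed-depth path can only match at exactly that depth.
        if len(self.stack) != len(self.steps):
            return False
        return all(step in ("*", name)
                   for step, name in zip(self.steps, self.stack))

    def startElement(self, name, attrs):
        self.stack.append(name)
        if self._buffer is None and self._matches_current():
            self._buffer = []

    def characters(self, content):
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        # Only the end tag of the matched element itself sits at path depth.
        if self._buffer is not None and len(self.stack) == len(self.steps):
            self.matches.append("".join(self._buffer))
            self._buffer = None
        self.stack.pop()


def stream_select(xml_text, path):
    """Parse xml_text once and return the text of every element matching path."""
    handler = PathMatcher(path)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.matches
```

Calling stream_select(document, "/doc/section/title") returns the text of each matching title in document order, after a single pass over the input. Anything that needs to look backwards or sideways in the tree (reverse axes, or predicates that depend on content not yet seen) does not fit this model without buffering, which is roughly where the proposals draw their lines.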


Arpan Desai presented "Introduction to Sequential XPath" at XML 2001. His introduction says it all:




This paper will provide an explanation of and the subset of XPath which we will tentatively dub: Sequential XPath, or SXPath for ease of use. SXPath allows a event-based XML parser, such as a typical SAX-compliant XML parser, to execute XPath-like expressions without the need of more memory consumption than is normally used within a sequential parser.



Robin Berjon then mentioned another project:




There is another stab at [streaming XPath] (with an implementation):

http://search.cpan.org/author/RBS/XML-Filter-Dispatcher-0.31/lib/XML/Filter/Dispatcher.pm

The beginning describes the module more, but the end focusses on the subset of
XPath which Barrie calls "EventPath".

PS: don't worry if you don't quite understand the code in some of the examples,
it often uses XML::SAX::Machines which is a level of abstraction above SAX filters.


Berend de Boer also brought up "Rules for Efficient XPath Evaluation", another such proposal, constructed with mathematical fastidiousness.




Do you know of any other streamable XPath proposals? Do you have any particular preference?


1 Comment

evlist
2002-12-11 16:28:48
ISO DSDL part 6 - path-based integrity constraints
Uche,


ISO DSDL part 6 is about "Path-based integrity constraints".


James Clark, who has taken the lead on this part, has made it a requirement that the subset of XPath used there be streamable, so there might be some synergy to find here too.


(see http://dsdl.org/ )


Eric