XSLT and Binary File Formats
by Philip Fennell
With all the recent talk of angle bracket taxes and what XML is and isn't good for, I thought it would be fun to take XSLT somewhere it is not normally associated with: the generation of binary file formats.
The sequence in XSLT 2.0 is more useful than the humble node-set. It is not restricted to nodes: the tokenize() function, for example, creates a sequence of strings, and you can concatenate sequences using the comma operator, which works on items of any data type.
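As a minimal sketch of these two features, the stylesheet below builds a sequence of strings with tokenize() and then uses the comma operator to append items of other types. The initial template name main and the sample values are assumptions made for illustration; they are not from the original article.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xsl:output method="text"/>

  <!-- Hypothetical entry point: run with "main" as the initial named template -->
  <xsl:template name="main">
    <!-- tokenize() splits on whitespace and returns a sequence of three strings -->
    <xsl:variable name="words" as="xs:string*"
        select="tokenize('one two three', '\s+')"/>

    <!-- The comma operator concatenates sequences; items may be of any type -->
    <xsl:variable name="mixed" as="item()*"
        select="($words, 4, 5.5, true())"/>

    <!-- Writes the two sequence lengths: 3 6 -->
    <xsl:value-of select="count($words), count($mixed)" separator=" "/>
  </xsl:template>

</xsl:stylesheet>
```

With Saxon, for example, the named template can be selected with the -it:main command-line option, so no source document is needed.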
However, there is nothing here that lifts us out of the ordinary until you create a sequence of xs:unsignedByte numbers. Such a sequence can be treated as a byte sequence, and if you can create a byte sequence you can create just about any binary file format you like. A good example is an image file such as a Tagged Image File Format (TIFF) image. If you don't get involved in image compression, it is relatively easy to create a TIFF image; after all, it is only a series of byte sequences.
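To make the idea of a byte sequence concrete, here is a hedged sketch that assembles the 8-byte header of a little-endian TIFF file: the byte-order mark "II", the number 42 as a 16-bit value, and the offset of the first image file directory as a 32-bit value (8 here, meaning directly after the header). The function f:le-bytes, its namespace urn:example:bytes, and the template name main are hypothetical names introduced for this example. The sketch uses xs:integer rather than xs:unsignedByte so that it also runs on a processor that is not schema aware.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:bytes">

  <xsl:output method="text"/>

  <!-- Hypothetical helper: split a non-negative integer into $width bytes,
       least significant byte first (little-endian) -->
  <xsl:function name="f:le-bytes" as="xs:integer*">
    <xsl:param name="value" as="xs:integer"/>
    <xsl:param name="width" as="xs:integer"/>
    <xsl:sequence select="
      if ($width le 0) then ()
      else ($value mod 256, f:le-bytes($value idiv 256, $width - 1))"/>
  </xsl:function>

  <xsl:template name="main">
    <!-- 'II' = little-endian marker, 42 = TIFF magic number,
         8 = offset of the first image file directory -->
    <xsl:variable name="header" as="xs:integer*" select="
      (string-to-codepoints('II'),
       f:le-bytes(42, 2),
       f:le-bytes(8, 4))"/>

    <!-- Writes the byte values as text: 73 73 42 0 8 0 0 0 -->
    <xsl:value-of select="$header" separator=" "/>
  </xsl:template>

</xsl:stylesheet>
```

The output is only a textual listing of the byte values; the sketch does not write binary data.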
Mind you, there are two problems to deal with. The first is that a basic XSLT 2.0 processor does not support the xs:unsignedByte data type; only a schema-aware processor is required to support it. In the absence of the latter, you have to make do with xs:integer and put up with the extra memory needed. The second, and more important, problem is how to get a byte sequence out of the other end of an XSLT processor!
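On the first problem, one possible workaround on a basic processor (an assumption for illustration, not necessarily the approach taken here) is to declare the sequence as xs:integer* and check explicitly that every value fits in a byte. The function name f:is-byte-sequence and its namespace are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:bytes">

  <xsl:output method="text"/>

  <!-- Hypothetical helper: true if every integer lies in the range 0 to 255,
       standing in for the check that xs:unsignedByte would otherwise enforce -->
  <xsl:function name="f:is-byte-sequence" as="xs:boolean">
    <xsl:param name="values" as="xs:integer*"/>
    <xsl:sequence select="
      every $v in $values satisfies ($v ge 0 and $v le 255)"/>
  </xsl:function>

  <xsl:template name="main">
    <!-- Writes: true false (256 does not fit in one byte) -->
    <xsl:value-of separator=" " select="
      f:is-byte-sequence((73, 73, 42, 0)),
      f:is-byte-sequence((73, 256))"/>
  </xsl:template>

</xsl:stylesheet>
```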
MPEG, a working group of ISO/IEC, has standardized (within its MPEG-21 Digital Item Adaptation standard) a means of generating binary file formats based on so-called Bitstream Syntax Descriptions (BSDs). The main application is the adaptation of (scalable) multimedia content (JPEG2000, MPEG-4 SVC, etc.).
Thanks Christian, I'll take a look at that.