ISO Schematron gets more standards uptake

by Rick Jelliffe

W3C's Service Modeling Language group has two new drafts out: Service Modeling Language 1.1 (latest version) and Service Modeling Language Interchange Format Version 1.1 (latest version). From the abstract:

This specification defines the Service Modeling Language, Version 1.1 (SML) used to model complex services and systems, including their structure, constraints, policies, and best practices. SML uses XML Schema and is based on a profile of Schematron.

SML comes out of the XML activity at W3C, not the WS-* activity, so it seems more aimed at working on top of POX (plain ole' XML) systems. It has representation from IBM, Sun, BEA, CA, Intel, HP and Microsoft. WS has a bad rep at the moment for over-engineering, but that is partly because many people have problems that they want solved by the almost-simplest possible technology. They would prefer erring on the side of modesty rather than grandiosity.

SML has nothing directly to do with services despite the name, and nothing to do with modeling for that matter either. That just seems to be the use case that has driven the development of a more general technology, one that takes seriously the problem "How do we validate systems of documents, including documents held in multiple files and documents that transclude other documents?" That seems an entirely practical question to me: this is the kind of use case that should be driving XSD and DSDL development, IMHO.

As I understand it, the recipe for SML is roughly

  • Systems or services are modeled using XML documents which are either definition documents or instance documents

  • Definition documents are either schema documents that use W3C XML Schemas (with a completely reworked version of XSD's key/keyref mechanism allowed under appinfo that handles multi-file references), or rules documents that use ISO Schematron (vanilla XSLT query language with a slightly extended XPath). A whole Schematron schema is plonked into the appinfo element rather than using Eddie Robertsson's minimal form for embedded Schematron; however, they use a rule context of "." which works out the same. A nifty attribute is added to allow better localization.

  • The instance documents are validated against the definition documents

  • A little error report container, to hand back bad data.

  • A kind of transclusion link to allow documents to reference other parts: yet another replacement for entity references! The interesting idea is that the referred-to fragments are not substituted in the document, so we have two PSVIs: the PSVI of the document transcluded and the PSVI of the document without the transclusion. A deref() extension is provided for XPaths: I suppose this is something to add to the list for the Schematron skeleton implementation. XPointers can be used for references: I see that, of course, it is the restricted XPath that doesn't include the range-to functions that killed XPointer. The link allows the element name at the other end to be specified.

  • The Interchange Format (SML-IF) provides containers and accoutrements for bundling everything up into a single file for interchange
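
To make the second bullet concrete, here is a rough sketch of what a whole Schematron schema sitting inside an XSD appinfo element might look like, and how a tool could pull the rules back out. The Server element, its cpus attribute and the embedded_rules helper are invented for illustration and are not taken from the SML drafts; only the XSD and ISO Schematron namespaces are real.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
SCH = "http://purl.oclc.org/dsdl/schematron"

# Hypothetical definition document: a complete sch:schema embedded in
# xs:appinfo, with the rule context "." standing for the annotated element.
XSD_TEXT = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <xs:element name="Server">
    <xs:annotation>
      <xs:appinfo>
        <sch:schema queryBinding="xslt">
          <sch:pattern>
            <sch:rule context=".">
              <sch:assert test="@cpus &gt;= 1">A server needs at least one CPU.</sch:assert>
            </sch:rule>
          </sch:pattern>
        </sch:schema>
      </xs:appinfo>
    </xs:annotation>
    <xs:complexType>
      <xs:attribute name="cpus" type="xs:int"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def embedded_rules(xsd_text):
    """Return (element name, rule context, assertion test) triples."""
    ns = {"xs": XS, "sch": SCH}
    root = ET.fromstring(xsd_text)
    found = []
    # Visit every element declaration and look for Schematron under appinfo.
    for element in root.iter(f"{{{XS}}}element"):
        path = "xs:annotation/xs:appinfo/sch:schema/sch:pattern/sch:rule"
        for rule in element.findall(path, ns):
            for assertion in rule.findall("sch:assert", ns):
                found.append(
                    (element.get("name"), rule.get("context"), assertion.get("test"))
                )
    return found


RULES = embedded_rules(XSD_TEXT)
for name, context, test in RULES:
    print(f"{name}: context={context!r} test={test!r}")
```

A real validator would then hand each extracted rule to a Schematron implementation with the annotated element as the context node; this sketch only shows the extraction step.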

I'll write to the SML group, because they have the use of sch:schema/@queryBinding slightly wrong. It is intended to clearly label what query language is used. The SML draft says that it must be "xslt"; however, they actually use an extended XSLT. What they need is a little Query Language Binding document (which only needs to be a paragraph) to define a query language binding name like "xslt+sml" or whatever. If users don't use deref() they won't need to do anything, but it is better to catch schema errors early rather than having obscure XPath messages.
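
As a sketch of the kind of early check I have in mind: a tool could scan a Schematron schema for deref() calls and complain if the declared query binding is not the extended one. The binding name "xslt+sml" is just the placeholder suggested above, not anything the SML draft defines, and the binding_problems helper and the smlfn prefix in the sample are likewise assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical name for a binding that permits deref(); not defined by SML.
EXTENDED_BINDING = "xslt+sml"


def binding_problems(schema_text, extended_binding=EXTENDED_BINDING):
    """List expressions that call deref() under a binding that lacks it."""
    root = ET.fromstring(schema_text)
    binding = root.get("queryBinding")  # None when the attribute is absent
    problems = []
    if binding == extended_binding:
        return problems
    for node in root.iter():
        # Attributes that hold query expressions in Schematron elements.
        for attr in ("context", "test", "select"):
            expr = node.get(attr, "")
            if "deref(" in expr:
                problems.append(
                    f"{attr}={expr!r} uses deref() but queryBinding is {binding!r}"
                )
    return problems


SAMPLE = """\
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="{binding}">
  <sch:pattern>
    <sch:rule context=".">
      <sch:assert test="smlfn:deref(hostRef)/@os">Host must declare an OS.</sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
"""

PLAIN = binding_problems(SAMPLE.format(binding="xslt"))
EXTENDED = binding_problems(SAMPLE.format(binding="xslt+sml"))
print(PLAIN)
print(EXTENDED)
```

The substring match is deliberately crude; the point is only that a mislabelled binding can be reported when the schema is loaded, before any XPath engine sees an unknown function.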

The downside of SML is that it again (as did WSDL's extensions) shows that XSD, despite being so large, is still simply not capable enough: a non-trivial language should be able to handle non-trivial problems, otherwise what is the point? Schematron's approach of explicitly allowing different query languages (and providing guidance on profiles and embedded vocabularies) is much more flexible and practical, IMHO.

In other Schematron news, I see that it is being used by the RELAXED online HTML validator (SourceForge). This project is a good demonstration of using the ISO DSDL little schema languages together: NVDL, RELAX NG, and Schematron. NVDL and RELAX NG are also used in Open XML, and ODF was defined using RELAX NG. For comments on making standards from Schematron schemas, see this blog item.