ACORD, XSD and Schematron
by Rick Jelliffe
This is not to point out deficiencies in XSD (the facts can speak for themselves) but to look for the relative strengths of Schematron. This kind of data, of course, is very prone to having several layers of rules for each user: business rules, occurrence rules that come from the forms used, system limits, and so on. Each of these can usually be represented well by a Schematron schema (or combined as different phases of the same schema).
But let's just look at three additional constraints beyond the ones that XSD schemas can represent, all from s4 Implementation Conventions of the ACORD Life, Annuity and Health Standard v2.17.
The first interesting thing is the use of typecodes, explained in s4.4. ACORD documents are interesting because they want to use the same XSD schema and elements for each stage of processing. So when a form comes in, before it has been assessed, some process may treat all the data as just strings. Then, when a datum has been assessed and possibly fixed up, it can be marked with a typecode as having a certain data type, for example being a date (in ISO 8601 form).
This is pretty much unfeasible in XSD: I don't think we can use xsi:type for this, because IIRC the type nominated by xsi:type has to be derivable from the actual type specified in the XSD schema, and a date is not derived from string. (Maybe XSD 1.1 fixes this; it doesn't matter.) In Schematron, it is easy: something like
<sch:assert test=".='true' or .='false' or .='1' or .='0'">
A <sch:name/> element should be a boolean</sch:assert>
...other rules for other typecodes...
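To make the fragment above concrete, here is a minimal sketch of how typecode-driven rules might sit together in one pattern. The dataType attribute and its values are hypothetical stand-ins for however the typecode is actually carried in the document, and the date check assumes the extended ISO 8601 form YYYY-MM-DD.

```xml
<sch:pattern xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <!-- Hypothetical: @dataType stands in for the real typecode marker -->
  <sch:rule context="*[@dataType='boolean']">
    <sch:assert test=".='true' or .='false' or .='1' or .='0'">
      A <sch:name/> element should be a boolean</sch:assert>
  </sch:rule>
  <sch:rule context="*[@dataType='date']">
    <!-- Removing every digit from YYYY-MM-DD must leave exactly two hyphens,
         and they must sit at positions 5 and 8 -->
    <sch:assert test="string-length(.) = 10
                      and translate(., '0123456789', '') = '--'
                      and substring(., 5, 1) = '-'
                      and substring(., 8, 1) = '-'">
      A <sch:name/> element should be a date in the form YYYY-MM-DD</sch:assert>
  </sch:rule>
</sch:pattern>
```

The date rule only checks the lexical shape, so it would still accept a month of 13; it is meant to show the pattern, not to be a complete date validator.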
There is an interesting constraint in s4.13 that says that aggregate elements with no optional subelements should be omitted. This is not something that can be specified using grammars, since it makes the occurrence of a parent dependent on the value of a child. The Schematron assertion might be as simple as something like this:
<sch:rule context="*[string-length(.) = 0]">
<sch:assert test="not(*)">Aggregate elements should contain elements with content</sch:assert>
</sch:rule>
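One practical wrinkle is that indented documents put whitespace-only text between child elements, so the string length of an otherwise empty aggregate is not zero. A variant that ignores whitespace might look like the sketch below; it assumes that whitespace-only content should count as empty, which the standard may or may not intend.

```xml
<sch:pattern xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <!-- Matches any element whose text content, ignoring whitespace, is empty -->
  <sch:rule context="*[normalize-space(.) = '']">
    <!-- Such an element passes only if it is a leaf, not an aggregate -->
    <sch:assert test="not(*)">
      The aggregate <sch:name/> has no content and should be omitted</sch:assert>
  </sch:rule>
</sch:pattern>
```

Note that this would also report an aggregate whose descendants carry information only in attributes, so a real schema may need a narrower context.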
In s4.14 it speaks of nested date ranges: the example they give is
it would not be valid for a PolicyProductInfo to specify an expiration date of 3/1/2005, while one of its child JurisdictionApprovals specifies an expiration date of 4/1/2005.
This is obviously quite trivial for Schematron, especially when you make life easier for yourself by using sch:let to parse the dates into fragments that make comparison easier.
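As a rough sketch of the nested date check, assuming the dates are stored in ISO 8601 form (YYYY-MM-DD) and that the expiration date lives in a child element called ExpDate (an assumed name here), the hyphens can be stripped so that each date compares as a plain number. Namespace declarations for the ACORD vocabulary are left out; a real schema would bind a prefix with sch:ns.

```xml
<sch:pattern xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:rule context="PolicyProductInfo/JurisdictionApproval[ExpDate]">
    <!-- Strip hyphens so that 2005-04-01 becomes the number 20050401 -->
    <sch:let name="child" value="number(translate(ExpDate, '-', ''))"/>
    <sch:let name="parent" value="number(translate(../ExpDate, '-', ''))"/>
    <!-- Pass if the parent sets no expiration date, or the child expires no later -->
    <sch:assert test="not(../ExpDate) or $child &lt;= $parent">
      A JurisdictionApproval should not expire after its PolicyProductInfo</sch:assert>
  </sch:rule>
</sch:pattern>
```

With the example from the standard, a parent date of 2005-03-01 and a child date of 2005-04-01 give 20050401 &lt;= 20050301, which is false, so the assertion fails as intended.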
TriSystems Infobahn have a brochure (PDF) on their approach for using Schematron with ACORD, for people who want more information. The ACORD schemas were developed with respected industry figure Daniel Vint as the senior architect: I see he is potentially nabbable for contract work now.
Thanks Rick. Very interesting. Many years ago, I prototyped a simple web form that would, upon "submit", create a Schematron rule XML instance as well as autogenerate the XSLT that would validate it. The web form was dirt simple, only allowing changes to cardinality or optionality.
I even went as far as to auto-generate an XML Schema which reflected the changed cardinality! It was all in an attempt to add constraints onto a base schema.
Whether it is adding Schematron constraints or reflecting additional ones on a subset schema, this kind of functionality is important.
As an update, TriSystems Software (formerly called TriSystems InfoBahn) have done and continue to do a lot of Schematron work around the ACORD standards - in particular we built ACORD's own test harness (named the TCF - "Testing and Certification Facility") which uses Schematron intensively to carry out tests on members' XML messages as part of the process of certifying them as compliant with the ACORD standards.