UBL Methodology for Code-list and Value Validation
by Rick Jelliffe
Imagine you are a trading company: you have documents with various fields for countries: countries you can send from, countries you can send to, countries the US won't allow you to export to, countries you can use as hubs, countries with regional offices, etc. And you also have lots of other documents with similar or different sets of countries. And countries are only the start: you also have product codes, where different fields can take different sets of codes, and so on. And all this may vary according to where the document came from (the Libyan branch office may have different rules from the Alaskan branch office). And, of course, the values of codes may have interdependencies, such as "the source must be different from the destination."
So there are lots of uses of a standard vocabulary, but also lots of local and changing subsets that are much closer to "business rules" than to "datatypes".
If you used XML Schemas, you could theoretically derive by restriction all the different subsets of codes, then use "redefine" on the type of every top-level element that used the subsets. (You'd have to do this redefine on base types where possible, so that subsequent derived types would inherit the restriction. But then you'd have to check that any subsequent derived types that define restrictions of their own are indeed subsets. Have a breakdown and a good cup of tea.)
With the Schematron approach, you select the items you want from the code list, and some magic tool provided by the methodology generates the Schematron code, which just uses simple XPaths (i.e. what processing software probably uses anyway). You could still use an XML Schema, just to constrain the lexical space very broadly, but the Schematron constraints would check the values against the list.
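To make the idea concrete, here is a minimal sketch in Python of what such a generator might do. It is not the methodology's actual tool: the function names, the element names (`Shipment`, `SourceCountry`, `DestinationCountry`) and the code subsets are all hypothetical, and it assumes the ISO Schematron namespace. It turns each selected subset into one rule with a plain XPath membership test, and adds the "source must differ from destination" interdependency as an ordinary extra rule.

```python
import xml.etree.ElementTree as ET

# Assumption: ISO Schematron namespace; the real tool may target another version.
SCH_NS = "http://purl.oclc.org/dsdl/schematron"
ET.register_namespace("sch", SCH_NS)


def xpath_literal(value):
    """Quote a code value as an XPath 1.0 string literal (no escapes exist)."""
    if "'" not in value:
        return f"'{value}'"
    if '"' not in value:
        return f'"{value}"'
    raise ValueError(f"cannot quote {value!r} as an XPath 1.0 literal")


def membership_test(codes):
    """XPath test that is true when the context node equals one of the codes."""
    return " or ".join(f". = {xpath_literal(code)}" for code in codes)


def build_schema(subsets, extra_rules=()):
    """Build a Schematron schema element.

    subsets:     {context XPath: list of allowed codes}   (hypothetical input shape)
    extra_rules: (context, test, message) triples for cross-field constraints
    """
    schema = ET.Element(f"{{{SCH_NS}}}schema")
    pattern = ET.SubElement(schema, f"{{{SCH_NS}}}pattern")
    rules = [
        (context, membership_test(codes), "Value must be one of: " + ", ".join(codes))
        for context, codes in subsets.items()
    ]
    rules.extend(extra_rules)
    for context, test, message in rules:
        rule = ET.SubElement(pattern, f"{{{SCH_NS}}}rule", {"context": context})
        assertion = ET.SubElement(rule, f"{{{SCH_NS}}}assert", {"test": test})
        assertion.text = message
    return schema


# Hypothetical subsets chosen by one branch office.
schema = build_schema(
    {
        "Shipment/SourceCountry": ["AU", "US"],
        "Shipment/DestinationCountry": ["AU", "JP", "NZ"],
    },
    extra_rules=[
        ("Shipment",
         "SourceCountry != DestinationCountry",
         "The source must be different from the destination."),
    ],
)
print(ET.tostring(schema, encoding="unicode"))
```

Changing a subset then means regenerating the rules from a new selection, with no type hierarchy to re-derive or re-check.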