Standardize the jellybeans not the jars
by Rick Jelliffe
The result: vocabularies that impose unnecessary ordering and structuring constraints. You can tell when a standard schema is over-specified because people using it just snip out the low-level elements they need and plonk them into their own home-made container elements.
I have noticed this in a few schemas I have been working with recently. In fact, the trend I notice is that people start off with their own home-made schema, then "adopt" the standard by finding any elements whose semantics are close to those of their home-made elements and renaming the home-made elements to the standard names. SVG in ODF looks like an example of this, and another standard I have been working with recently has the same issue. When you adopt arbitrary portions of a cohesive standard, are you really using that standard or abusing it?
I suppose there is a case to be made that transitional schemas should be treated seriously.
One software engineering idea that has stuck with me over the years (which I wrote about in The XML & SGML Cookbook) is the twinning of cohesion and coupling. Basically, when some information is highly cohesive (think of Eve Maler's Information Units), i.e., it belongs together semantically and would not make much sense in isolation, it deserves an official container.
Conversely, you should try to reduce coupling of information that is not cohesive.
A rule of thumb for many situations is that industry standard groups (and, indeed, in-house schema developers) may be well advised to standardize data elements eagerly but container elements suspiciously: standardize the jellybeans, not the jars. The next bloke may like your jellybeans but have his own jars.
Various approaches to doing this come to mind: think in terms of creating a vocabulary rather than a language; split your industry standard in two, with the tightly coupled elements in one normative section and the loosely coupled elements in another, non-normative section, perhaps even with different namespaces; use open content models and order-independence for loosely coupled elements.
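To make the last of those approaches concrete, here is a minimal sketch in Python of what "open content plus order-independence" can mean for a consumer. Everything named in it is hypothetical and invented for illustration: the namespace http://example.org/beans, the price and currency elements, and the check_beans function. It checks only the standardized jellybeans, wherever they occur and in whatever order, and ignores the jars around them.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal, InvalidOperation

# Hypothetical namespace for the standardized data elements (the jellybeans).
BEANS_NS = "http://example.org/beans"


def is_decimal(text):
    """True if text parses as a finite decimal number."""
    try:
        return Decimal(text).is_finite()
    except InvalidOperation:
        return False


def is_currency_code(text):
    """True if text looks like a three-letter upper-case code."""
    return len(text) == 3 and text.isalpha() and text.isupper()


# Hypothetical vocabulary: local name -> value check. No containers listed.
BEAN_CHECKS = {
    "price": is_decimal,
    "currency": is_currency_code,
}


def check_beans(root):
    """Check every element in BEANS_NS; ignore all other elements.

    Position and order are never inspected, so any container (jar) works.
    """
    prefix = "{" + BEANS_NS + "}"
    problems = []
    for el in root.iter():
        if not el.tag.startswith(prefix):
            continue  # open content: foreign elements are nobody's business
        local = el.tag[len(prefix):]
        text = (el.text or "").strip()
        check = BEAN_CHECKS.get(local)
        if check is None:
            problems.append(f"unknown bean: {local}")
        elif not check(text):
            problems.append(f"bad value for {local}: {text!r}")
    return problems


# Two home-made jars holding the same beans in different order and depth.
DOC_A = (
    '<invoice xmlns:b="http://example.org/beans">'
    "<b:price>9.50</b:price><b:currency>AUD</b:currency>"
    "</invoice>"
)
DOC_B = (
    '<order xmlns:b="http://example.org/beans"><line>'
    "<b:currency>EUR</b:currency><note>rush</note><b:price>12</b:price>"
    "</line></order>"
)

if __name__ == "__main__":
    for doc in (DOC_A, DOC_B):
        print(check_beans(ET.fromstring(doc)))
```

The design choice worth noticing is that the vocabulary table says nothing about parents, siblings or sequence. A real standard would more likely express this in a schema language than in a script, but the division of labor is the same: constrain the values, leave the containers alone.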
Another upside of this approach is that it reduces the number of trivial issues for committee members to get excited about.
The unsurprising part of this is that many SGMLers came to these conclusions over a decade ago (lots of little schemas/DTDs), although that led to some of the wrapper approaches where entities were not well supported (the Navy Work Package and the European cousins come to mind). I liked the frame approach, which was later replaced with divs, oddly enough by the same people who disliked frames.