On error rates in drafts of standards.

by Rick Jelliffe

The early tally I have of the number of comments from national bodies on DIS 29500 is about 3,550; I expect there are an awful lot of duplications, though. Is that a lot?

A question on this came up on XML-DEV this weekend.

Tim Bray said
The task of addressing all ten thousand or so ISO-member comments, even after removing dupes, and dealing with the callouts to unspecified product behavior, and so on, with no assurance that doing so would result in ISO blessing, seems just insanely expensive and difficult to me. If those guys take it on, they have my respect and sympathy.


Michael Kay responded
Actually, 10,000 comments on a 6,000 page spec doesn't sound like a large number to me. If I had less than two comments per page on a book or spec I had submitted for technical review, I would be concerned that the review wasn't thorough enough. Perhaps people were holding back because they don't want to provide MS with a free QA service.


And Jim "ISO SQL" Melton added
Or perhaps most people were somewhat intimidated by the prospect of (thoroughly) reviewing a 6,000 page document. To put this in perspective for those who know SQL's size and complexity, the sum of all nine parts of SQL is about 3950 pages. A ballot on SQL frequently receives several thousand comments, and we've been balloting versions of SQL for 20 years!

In fact, virtually every large spec I've ever had the "pleasure" to review leads to "thread-pulling", in which every page yields at least "one more" bug, and following up on that one leads to more, and following up on those leads to still more, etc. I would personally be stunned if 30 dedicated, knowledgeable reviewers of a 6,000 page spec on its first public review were unable to find at least 3,000 unique significant problems and at least 40,000 minor and editorial problems. But that's just me...


And here is a comment from my blog a few weeks ago:
A big standard will have a lot of changes. If my 30 page standard had 10 changes in its final stages of national review, then DIS 29500 will have about 1000 changes at the same rate (assuming it has 3000 normative pages, which is probably too much). That is just the slog in getting a standard out the door, tedious work not a cause of panic.
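
The arithmetic behind these estimates is easy to check. Here is a minimal Python sketch, assuming only the figures quoted above; the function names `per_page` and `scale_changes` are mine, made up for illustration, and the numbers are rough estimates rather than official counts.

```python
def per_page(count, pages):
    """Average number of comments (or changes) per page."""
    return count / pages


def scale_changes(small_changes, small_pages, big_pages):
    """Project the changes for a larger document at the same per-page rate."""
    return small_changes * big_pages / small_pages


# Figures as quoted in this post (estimates, not official tallies).
early_tally = per_page(3550, 6000)                 # early tally of NB comments
bray_figure = per_page(10000, 6000)                # "ten thousand or so" comments
melton_floor = per_page(3000 + 40000, 6000)        # Melton's expected minimum
projected = scale_changes(10, 30, 3000)            # 10 changes in 30 pages, scaled up

print(f"Early tally:      {early_tally:.2f} comments per page")
print(f"Bray's figure:    {bray_figure:.2f} comments per page")
print(f"Melton's minimum: {melton_floor:.2f} problems per page")
print(f"Projected changes for 3000 normative pages: {projected:.0f}")
```

On these numbers, both the early tally and the ten-thousand figure come in under Michael Kay's threshold of two comments per page.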


So my bold prediction is that the extreme anti-OOXML squad will alternate incoherently between "It's too many! We have to draw the line somewhere!" and "It's not enough! It is beyond the powers of mankind to read this thing!" while MS PR will alternate reactively "It's wonderful and thorough! Long live openness" and "We can do it in our sleep!" And the ISO process will continue calmly on, disappointing the bullies and the racists and the cartel-izers and the sour-grapers and the parrots, and deliver a good initial version of the standard. I think many reasonable people who had reasonable concerns about DIS 29500 will see that the process actually has allowed their concerns to be addressed, and will see through the hysteria for what it is.

4 Comments

Asbjørn Ulsberg
2007-09-10 03:48:44
You still seem to agree with Microsoft in that it is perfectly normal for a specification to receive this number of comments in a Fast Track procedure and that the six months provided is enough to thoroughly review six thousand pages. A Fast Track procedure is supposed to rubber-stamp a specification with "ISO Approved" in the end; it is not intended to be a thorough review process, as Microsoft and you seem to imply.


If Microsoft wanted a thorough review, they shouldn't have posted the specification for Fast Track approval in the first place. The fact that they got 3,550, 10,000 or whatever number of comments on a specification that should have been complete and ready for approval after ECMA was done with it, is humiliating and a disgrace to both ECMA and Microsoft.


If it wasn't for the fact that these are the results after a Fast Track procedure, then I'd say "kudos" and "well done" to all participants, including Microsoft. But since it indeed is, I say "shame on you"!

Rick Jelliffe
2007-09-10 04:27:44
Asbjørn: Err, you are complaining that ISO isn't rubberstamping?


That is original...I thought the problem was "MS IS RAMMING THIS THROUGH AND THERE WILL BE NO OPPORTUNITIES TO CHANGE ANYTHING AND THE ISO PROCESS IS CORRUPT AND EVERYONE WHO VOTED YES IS CORRUPT AND THE DRAFT STINKS ANYWAY!!!!!!!"


"ONLY 30 DAYS OOPS ONLY SIX MONTHS OOPS ONLY THIRTEEN MONTHS!!!!"


You say "Six months" but the total time will be about 13 months. There is no ban on new issues being discovered and pursued through the BRM, though the chairs of all the sessions will certainly have some discretion in preventing any hijacking of the meeting.


Almost all the fast-track procedure does is let the Committee Draft be developed externally. When transposing a standard from an external organization, there are bound to be differences in approaches, some of which may be major. Every different organization has a different approach to "shall" and "must" for example. Where the editor of the external standard is also a member of ISO SC34 (as was the case with Patrick Durusau for ODF) then by happy accident the external standard may be fairly ISO-ish and fairly SC34-ish, but this is not something that people should expect.


Read the comments of Jim Melton above: do you see any hint of "shame on us" for comments being found? On the contrary, a lack of comments is usually a sign of poor review.


A fast-track standard can have as thorough a review as National Bodies want to give it: little or very full, depending on their interest and expertise. National Bodies decide on whether they are happy to rubber-stamp or not. The thing that people don't get is that at the DIS stage of the ballot process (this last year), the initial round of comments takes six months to prepare, then the resolution process takes about another six months. But the thing is that when a National Body takes an interest, as everyone knew they would, and makes comments, it sets in train this next process.


Now I am sorry if the realization is dawning that the headlines were wrong, and that the whole ISO process is skewed to win/win not win/lose.


As for "perfectly normal", my answer is threefold. First, the number does not ring alarm bells, based on my estimates and certainly from Jim Melton's comments. Second, a large standard of course will have a large number of errors; big deal. Third, in just the same way as "6,000 pages" is misleading (see my earlier blog "That Diagram"), so with these comments you really need to look at them in detail to see how challenging and favourable/unfavourable they really are.


Furthermore, a nation might have 20 comments, but only be particularly attached to a couple of them. Or they may reconsider the comment in the light of subsequent discussion.

Asbjørn Ulsberg
2007-09-10 05:03:59

Asbjørn: Err, you are complaining that ISO isn't rubberstamping?


Ehm, no. I'm complaining that Microsoft and ECMA posted a specification of such poor quality that ISO couldn't rubber-stamp it. My problem with OOXML isn't that it's developed by Microsoft or that I think Microsoft doesn't somehow "deserve" to get a specification ISO approved. My problem is that Microsoft did such a mind-blowingly bad job at writing the specification, that the internal bugs of their bug-ridden Office suite are inherited in the spec and that ECMA did an even lousier job at reviewing the specification before submitting it to ISO. Before submitting it to ISO through Fast Track, the spec should have been virtually bug free. It wasn't. It isn't. Is this fact so impossible for you to agree on?


Almost all the fast-track procedure does is let the Committee Draft be developed externally. When transposing a standard from an external organization, there are bound to be differences in approaches, some of which may be major.


Exactly. What this last year has shown us is that transposing a standard developed by Microsoft through ECMA will guarantee a very poor specification with lots of ambiguity, large unspecified portions and extreme amounts of cruft. What you get is something that can't be run through a Fast Track procedure because the quality of the specification you get is so poor that you need at least two years of thorough review to get it in shape.


Third, in just the same way as "6,000 pages" is misleading (see my earlier blog "That Diagram"), so with these comments you really need to look at them in detail to see how challenging and favourable/unfavourable they really are.


As was commented on That Diagram:

you are absolutely incorrect when you say that because a document could be reduced to 800 pages, that makes it as easy to review as a document that is actually 800 pages.


Saying that the specification could be reduced to 800 pages doesn't make it 800 pages. It's still 6000 pages, and that's the number of pages each and every NB needed to go through. Separating the wheat from the chaff was up to each NB although it should have been up to ECMA and Microsoft before they handed the specification over to ISO.


Of course you need to look at the comments in detail, and it's obvious that every comment isn't at the "comply or die" level. However, considering that Microsoft Office's internal document representation is likely to be fairly 1:1 with the OOXML specification, some of the comments are going to require such an enormous refactoring and reprogramming in Office (and what are they to do with the OOXML documents already being produced?) that it can't be easy for Microsoft to swallow. For example:

Simplify the information model and document structure, in order to ease implementation, interoperability and the processing of the OOXML documents. Where possible use notations in conformance with ODF.


And:

All references to platform specific and/or binary notations, such as DEVMODE for printer settings and bitmasks for boolean values, should be removed and, where possible, replaced by open, XML-based standards, more explicit XML vocabulary, or base64 encoding.


And:

Ensure that 29500 does not conflict with the above-mentioned standards and use only ISO standard date formats, not ambiguous numeric dates.


These comments are highly technical (i.e. not just editorial) and will require enormous changes in Microsoft Office if they want to comply. They are also of such high importance that they can't be ignored; the NB that proposed these changes will at least not ignore them.

Rick Jelliffe
2007-09-11 06:48:54
Asbjørn: If you are asking what my expectations for changes from the BRM are, here is my current list, not based on any particular information from MS.


In with no contest: Editorial changes. Typo correction. Example correction. More information on existing topics. Scope and conformance clarification. Schema correction. Rejigging of the schema and text so that lists that are currently closed are open. Reformatting. Removal of non-normative text.


Perhaps in, after discussion: Re-arrangement of Part 4 especially into multiple parts. Provision of alternative embedded notations in addition to existing optimized or home-made notations. Sometimes Open XML uses element names where attribute values would be better, and it might be tidy to repair this too.


Out by deflection: Comments that change the scope of the standard. MS can say "These would be good to have, and we will start participation in forum XXX, but they don't belong in IS 29500." I would expect considerable pushback from some NBs against other NBs who want to make the standard even larger, actually.


Out by conflict with the purpose of the standard: Changes that invalidate existing documents.


I would expect that MS will want to address some of the concerns of each National Body, not just the easy ones on the total list, because for marketing to the particular nation they will want to be able to say "We took your requirements seriously". Not quite "take each NB response and accept the most doable 50%" but you get the idea.