Previously uncooperative multinational does the right thing: market-dominating, proprietary document format accepted as ISO standard

by Rick Jelliffe

I am glad to see that Adobe's PDF 1.7 has been accepted as an ISO standard, IS 32000:2008. It still needs to have a few hundred comments resolved and folded back into the final text, but the initial ballot was a success and I suppose early next year the spec will go online at ISO's free site. It has gone through very fast, and I congratulate all concerned.

For my opinion on why an ISO standard for PDF is a good thing, see yesterday's blog All interfaces by market dominators should be QA-ed, ZRAND standards!

There have already been smaller subsets of PDF available: PDF/A for archiving and PDF/X for exchange, both subsets of PDF 1.4. (The links are to pages that are really good examples of what governments and guidance organizations need to provide, to help people select between multiple standards.)

I am sure ISO PDF will help reduce the apoplexy that some people are being encouraged to have concerning OOXML, because it shows that there can be multiple standards (even for the same thing: three ISO standards for PDF alone, and counting!) as long as they don't contradict (which has a very strict meaning in ISO usage: standard A cannot say X is a Z while standard B says X is not a Z). And it shows that proprietary technologies can be standardized. And it shows that there is a difference between (good) openness in getting good documentation and (counter-productive) openness in arbitrarily changing a standard on ideological/aesthetic lines so that it no longer reflects the existing, deployed technology. And it shows that standardization is a positive step forward for the community to manage market-dominating technologies. (I mean standardization in the sense of being published as an ISO standard, which does not imply being adopted by any nation as a required format by regulation.)

They have 205 comments. It would be interesting to see how this compares to the size of the spec, and to compare it to OOXML. (I was pleased to see that some ISO PDF people measure the size of their document in total surface area of printed page frames rather than just raw page count: this is a little bit more sophisticated than dumb page count, but still only an unsound indicator for serious comparisons of standard size or complexity.) I couldn't find a draft fast, but I read that in ISO format it takes fewer pages than the Adobe format. Taking the Adobe 1.7 spec of 1310 pages as a rough guide, that gives an issue rate of 1 issue per 6.4 pages, compared to the OOXML rate of about 1 issue per 8 pages (assuming about 750 unique issues for OOXML). The numbers are not precise, but they are about the same! The only difference is that the OOXML changes tend to be broader (conformance, organization) and more disruptive (since people expect XML to be readable in the most general sense, while they don't expect this of PDF.)
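For anyone who wants to check the back-of-envelope arithmetic above, here is a minimal sketch. It uses only the figures quoted in this post (1310 pages, 205 comments, about 750 unique OOXML issues, about 1 issue per 8 pages); the function and variable names are made up for illustration, and the OOXML page count is merely what those rough figures imply, not a measured number.

```python
def pages_per_issue(pages: int, issues: int) -> float:
    """Average number of pages per ballot issue (a crude size-normalised rate)."""
    return pages / issues


# Figures quoted in the post; both are rough guides, not exact counts.
PDF_PAGES = 1310          # Adobe PDF 1.7 reference, used as a stand-in for the ISO draft
PDF_COMMENTS = 205        # ballot comments on ISO 32000

OOXML_UNIQUE_ISSUES = 750     # the post's assumed number of unique issues
OOXML_PAGES_PER_ISSUE = 8     # the post's quoted rate

pdf_rate = pages_per_issue(PDF_PAGES, PDF_COMMENTS)

# Page count implied by the post's OOXML figures (an inference, not a measurement).
implied_ooxml_pages = OOXML_UNIQUE_ISSUES * OOXML_PAGES_PER_ISSUE

print(f"PDF: about 1 issue per {pdf_rate:.1f} pages")
print(f"OOXML: about 1 issue per {OOXML_PAGES_PER_ISSUE} pages "
      f"(implies roughly {implied_ooxml_pages} pages)")
```

Since raw page count is itself an unsound measure, the two rates should be read as being in the same ballpark and nothing more precise than that.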

One of the most interesting documents about how Adobe/AIIM created the draft ahead of standardization is here. It is strikingly similar to how the OOXML draft was created, but note that the national body complaints about OOXML include several concerning the use of "shall" and "should" (I raised this issue with my national body, and it was included in the Australian comments.) Conformance language is important: a standard is not really a specification suitable for a programmer to implement directly, but it is something that may be used in contracts (or called up by regulations), so it needs to be clear about what it requires and what it doesn't require (clarity is more essential than completeness, if you know what I mean.)

ISO 32000 is based on the PDF 1.7 spec, available here. The document ISO 32000 - Summary of Changes describes how the format was made.

The 205 ballot comments and their resolutions will not be publicly available, I expect, according to the usual ISO requirements. The mechanism for participation in standards development is to seriously join in, not criticize from armchairs: openness does not mean a free-for-all. People who suggest that somehow we can have Slashdotters directing standards are not realistic.

It will be interesting to see which other market dominators sniff the wind. Standardization through ISO of market-dominating technologies is good for everyone. The technology is already entrenched, so it does not entrench things further, but it provides a better basis for substitution (good for user choice and competitors) and interoperability (good for user choice and the dominator company and peripheral developers): everyone wins. They need to do this voluntarily before regulators use closed standards as evidence in anti-trust proceedings.

I don't see the people complaining on OOXML about proprietary technologies being standardized, the ISO fast-tracking procedure, the use of vendor consortia to largely rubber-stamp a pre-existing text, the kinds of error-rates, and the presence of actual users, vendors and stakeholders' representatives on committees, complaining about ISO PDF. But all the things are present there. What is the difference? (Flamers: don't sidestep by mentioning other supposed flaws in DIS 29500, that is not what I was asking, thanks.)

17 Comments

Tim Bray
2007-12-06 00:18:27
You say "(assuming about 750 unique issues for OOXML)". That's new to me, and interesting. Where can I go to see the 750 unique issues?
dave
2007-12-06 04:43:04
I assume your title is an allusion to OOXML? Surely .doc et al. are the market-dominating, proprietary document formats that everyone would welcome additional documentation and standardisation of? In fact didn't a country or competitor suggest this should be part of OOXML and you discounted it as out of scope grandstanding?


OOXML itself on the other hand doesn't dominate any market, not yet and possibly never, and since its sole purpose since conception is to (at least appear to) be "open", surely it being proprietary would be a very bad thing for all concerned?

Gray Knowlton
2007-12-06 07:15:46
Rick you are my hero.
Bruce D'Arcus
2007-12-06 11:56:38
In general, I'm not the biggest fan of Adobe's standards credibility. Like Microsoft, they tend to publish internal specs, and sometimes move to standardize them as is. They have not been good on standards collaboration.


But I'm really surprised you think it a contradiction that I'm not as worried about PDF as an ISO standard as I am about OOXML. The big difference that distinguishes PDF from OOXML is that it is a) widely and independently implemented, and b) there are no other similar ISO standards that it contradicts. I know you contest the claim that OOXML contradicts ODF, but you have to grant it's a reasonable claim to make (e.g. that reasonable people may differ).

Andre
2007-12-06 17:50:42
Everybody has cancer. Get it too...


"I don’t see the people complaining on OOXML about proprietary technologies being standardized, the ISO fast-tracking procedure, the use of vendor consortia to largely rubber-stamp a pre-existing text, the kinds of error-rates, and the presence of actual users, vendors and stakeholders’ representatives on committees, complaining about ISO PDF. But all the things are present there. What is the difference?"


What you describe is a perversion of what standards are meant to be.


What is the difference between corruption in Nigeria and Sweden?

Asbjørn Ulsberg
2007-12-07 02:03:30
Why does everyone who speaks in favour of OOXML conflate the new XML-based ECMA-specified format with the old, proprietary and binary .doc, .xls and .ppt documents? The latter are the ones that are widely deployed, the former are not compatible and not widespread at all. Standardizing OOXML in ISO is not the same as standardizing the old binary documents, so you can't really compare this to standardizing PDF.


There are billions of PDF files in existence and each and every one of them will benefit from being defined as an ISO standard. There are billions of binary Office document files in existence and none of them will benefit from OOXML becoming an ISO standard. There are almost no OOXML files in existence that will benefit from being defined as an ISO standard.

Rick Jelliffe
2007-12-07 05:50:56
Tim: I will try to pin down my source.


Dave: The grandstanding is claiming that DIS29500 should not be accepted *unless* it adds documentation of .DOC formats (and presumably RTF, why not?) together with the mappings. That would only be a standard of, say, 20,000 pages or more. It lacks proportionality and modest scope, which is already hardly what one associates with DIS29500.


Bruce: DIS29500 does not contradict any known ISO standard in the technical sense. ISO was 100% clear on this by their response to all the bogus contradiction claims. The people who lowered the bar on what a contradiction was could only do so by ignoring precedent. People who claim that it does contradict in the technical ISO sense are living in their own little world.


Gray: Good? Am I also your hero for calling on MS to deliver Office with ODF support built-in on the main menu, and for calling on governments to require that MS (and others) put all their interface formats/APIs/protocols through a standardization process? I hope so :-)


Andre: Thanks for wishing that I get cancer. That is the kind of proportional response that I have come to expect.


Asbjørn Ulsberg: What you are arguing is that it would be more beneficial to have ISO standards for .DOC and .RTF. That is plausible, but not through SC34. But it says nothing about DIS29500.

Gazpacho
2007-12-07 13:06:16
Asbjørn, maybe Microsoft's goal isn't just standardization for its own sake, but actual cross-vendor interop. Interop based on binary formats doesn't work very well. If one bit flips the wrong way, the whole file is broken.


Why should Microsoft go through standardization for the binary formats, if they wouldn't be good standards?

Rick Jelliffe
2007-12-07 17:56:35
Gazpacho: They would be bad standards, but that does not mean they would not be good to have. Bad because they would be product-specific (like I said, over 50 different versions of .DOC) and obsolescent-when-started. But good to have for archiving and level-playing-field reasons. But putting them out as IS may be inappropriate. It may be better to make them as PAS numbered specs (ISO has a range of types of standards, but unfortunately the JTC1 joint committee which handles SC34 only uses a couple of them: the fast-tracked standards would be better as PAS usually, and it would confuse people less.)
Gazpacho
2007-12-07 19:47:03
Nitpick: "Don't work well" was an exaggeration. Yes, there are many widely-used binary interop standards, but they have the problem I described.


As I see it, OOXML is a standardization of the legacy Office formats, and one that goes all the way instead of making a token effort.

Rick Jelliffe
2007-12-09 05:06:06
Tim: The count from the horse's mouth is 1061 non-duplicate errors by location; however, the same horse says hundreds are the same kind of error (e.g. doubled punctuation marks). Hal has estimated, from his own look, between 500 and 1000 really unique errors, and my sense from looking at all the comments when they were public is also around 750.


The duplicate errors take only a tiny but tedious amount of extra BRM time, as do the repeated instances of the same error: it just goes to the editor's instructions.


I think it is quite unlikely that there will not be a successful BRM (i.e. a week at the end of which the delegates say: the spec with these changes is an improvement on the draft), which is all the BRM needs to vote on. Which is not to say they won't be really busy to get through enough of the ballot comments to make a difference. But even so far the progress clearly makes a mockery of the people who were saying that there was no chance of any change (of course, these people were complaining that there were no changes during the stages in the ISO process where changes were not allowed, so it is hard to think they were not a teensy bit disingenuous.) [...] that were made are enough to alter their negative votes.

Rick Jelliffe
2007-12-09 05:12:41
Tim: I missed your question "where".


You can participate in SC34 or your national body and get access through the key. (Each NB decides on the particular distribution mechanism for private documents.)


Or you can ask the SC34 member who is contracted to Sun to do some standards work to show you his count or give you some info.


Or you can look at that website that has pirated the comments.

Lars
2007-12-11 11:44:50
contradict (which has a very strict meaning in ISO usage: standard A cannot say X is a Z while standard B says X is a Z).

Can you explain this definition? Did you mean "while standard B says X is not a Z"?
Jesper Lund Stocholm
2007-12-12 05:42:04
Rick,


Do you know if the spec is available from somewhere else than ISO? I can't even seem to find them on http://www.iso.org/iso/store.htm .

Jesper Lund Stocholm
2007-12-12 05:50:09
Never mind ... I found it at


http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=45873

Rick Jelliffe
2007-12-13 06:32:00
Lars: Doh, you are correct.


Jesper: Great to meet you in Kyoto. I hope we can get more time next time!

Bruce D'Arcus
2007-12-17 10:36:08
Rick: just came back to this "Bruce: DIS29500 does not contradict any known ISO standard in the technical sense. ISO was 100% clear on this by their response to all the bogus contradiction claims."


You're not even going to give an inch here?


My point was really simple: you suggest OOXML and PDF are more-or-less the same, and so express surprise that one has sailed to ISO approval, while the other has been caught up in the most intense of opposition. I suppose you suggest, then, that this opposition is therefore entirely unreasonable.


But a lot of people who oppose ISO standardization for DIS29500 have an entirely reasonable and principled position that having two competing standards for the same thing (in particular here for office documents, vector drawing, and math) is a bad thing. PDF does not raise similar concerns.


Do you really not see this as a reasonable position to take, even if you disagree with it? Because if you don't, then I think you're boxing yourself into the same kind of tight corner you suggest the "extreme anti-OOXML" camp is in.