ODF vs. EOOXML From An Insider Out

by M. David Peterson

Update: Simon Phipps has stated that my assertion that Novell is the largest contributor of source code to OpenOffice.org is incorrect, and in fact it is Sun Microsystems who is the largest contributor. My data says otherwise, but I also have a lot of trust in as well as respect for Simon, so I have to assume that my data is incorrect.

I don't think this changes my assertion that Miguel is someone who can be seen as an authority figure when it comes to the technical debate between ODF and EOOXML. Regardless of whether Novell is, in fact, the largest contributor, they are a significant contributor, and therefore his understanding of the technical issues involved is significant. Nonetheless, if my assertion is incorrect (and, again, I can only assume that it is) then please take this into consideration in your overall analysis of both this post as well as the follow-up comments.

Thanks for helping to clarify the facts, Simon!

Update: bryan presents a more refreshing perspective than what seems to be the standard "I hate Microsoft!" attitude when it comes to why people feel EOOXML is a bad thing.

Basically if someone asked me to work with OOXML for extracting data I would say, sure but it would be cheaper and easier to use Microsoft's APIs to work with office data, or to convert the document to ODF and extract the data that way.

This is really my only dislike of ooxml. I don't think it qualifies as FUD, it is just my experience of how it is to work with these technologies.


In follow-up, Miguel de Icaza provides a solution to the stated problem at hand,

This is an article about LINQ and how to extract Word ML document using it:

http://blogs.msdn.com/ericwhite/pages/Retrieving-the-Paragraphs.aspx

Folks have showcased LINQ-like technologies for Ruby and Python, so it cant be that hard to parse.


What I truly admire and appreciate about this exchange is pretty straightforward,

A real-world problem in regards to the usage of EOOXML, and a real-world solution provided in follow-up. In other words, just like what tends to take place on a daily basis in both the open and closed source camps (though the community aspect and overall openness, as should be obvious, is generally much more prevalent in the OSS camps), two hackers find ways to present real-world problems and real-world solutions to these problems.

This is the way it *NEEDS* to be, folks. The FUD and the anti-EOOXML smear campaigns accomplish *NOTHING*, whereas the exchange that took place below accomplished exactly what needed to be accomplished... Find where the problems exist and then fix them.

Thanks to both of you for providing a picture perfect example of how things both could and should be working as we move forward into the next generation of open xml document formats!

[Original Post]

So I just finished up reading an interesting post from Miguel de Icaza regarding ODF vs. EOOXML, and felt that it was really quite important to share with the rest of you all,

The EU Prosecutors are Wrong. - Miguel de Icaza

... I think that the group is not only shooting themselves in the foot, they are shooting all of our collective open source feet.


Interesting lead-in, and something that I can assure you lives up to the promise of showcasing why the ODF vs. EOOXML battlefield is doing more harm than it could ever do good, but before I move on, there's something I've been wanting to get off my chest...

45 Comments

bryan
2007-02-01 03:38:35
"Unlike the XML Schema vs Relax NG discussion where the advantages of one system over the other are very clear, the quality differences between the OOXML and ODF markup are hard to articulate."


OOXML has a lot in common with WordML. Have you tried to extract meaningful data from WordML and ODF with any of the standard XML processing technologies? Which did you find to be the format better suited to extracting that data? Which differences in OOXML do you think will make it easier to extract data using the standard XML stack of technologies?


Basically if someone asked me to work with OOXML for extracting data I would say, sure but it would be cheaper and easier to use Microsoft's APIs to work with office data, or to convert the document to ODF and extract the data that way.


This is really my only dislike of ooxml. I don't think it qualifies as FUD, it is just my experience of how it is to work with these technologies.


Miguel de Icaza
2007-02-01 06:57:49
Bryan,


This is an article about LINQ and how to extract Word ML document using it:


http://blogs.msdn.com/ericwhite/pages/Retrieving-the-Paragraphs.aspx


Folks have showcased LINQ-like technologies for Ruby and Python, so it cant be that hard to parse.


Miguel.

M. David Peterson
2007-02-01 07:19:30
@bryan,


While it seems that Miguel has followed up with a link to help bring greater understanding to the problem at hand, I want to commend you for your approach to this... I find it a rare occasion in which people provide solid, well founded concerns with the various aspects of the EOOXML spec (or better stated: real-world usage of implementations of the spec), using this as the basis of why they find a particular method of resolving this concern more preferable over other known alternatives.


It seems to me that if more people would place their focus on finding the problems that actually do exist (from a real-world use case such as what you have provided) as opposed to searching for more ammunition to use in the EOOXML smear campaign, then a lot more would be getting accomplished than is currently the case.


It's refreshing to see something other than "EOOXML sucks cuz' I said it does!". So, for what it's worth, thanks for this! It is much appreciated and respected.

M. David Peterson
2007-02-01 07:23:25
@Miguel,


Thanks for your follow-up comment and the link! Will bring this to the top of the post to ensure this information is more widely propagated.


And while this thread is open, thanks for the refreshing perspective you have provided. I can't stress enough how important I believe your overall attitude and straightforward approach to seeing and promoting things for how they truly are has been to the success of Linux and OSS software in general.


Thanks for providing such a great example for the rest of us to learn from to then follow your lead!

William
2007-02-01 07:38:18
In case, once you read Mr. de Icaza's blog post, you think that some of it does not sit well, you may be interested to read Rob Weir's response. There is more to this issue than Mr. Peterson's "so there!" post would suggest. I think part of the problem here is that Microsoft is interested in total dominance of all computing, and the siloing of all documents is a key piece of that strategy, so people are leery of any standard that they propose.

M. David Peterson
2007-02-01 07:49:15
@William,


What is truly sad to me is that any follow-up by Rob Weir is viewed as anything other than yet another attempt at an Anti-EOOXML smear campaign. When you reach the point where you are using arguments such as "It's just so hard to say the name 'Office Open XML' and this is yet another reason why EOOXML is a bad thing" you are proving only one thing...


You are willing to try ANYTHING in an attempt to spread the FUD. What would his argument have been if instead of using "Office Open XML" Microsoft chose to use "Open Office XML" like he was suggesting would be the better choice of the two?


"See, they're even copying the term 'Open Office' in an attempt to confuse the end user into believing this is the official standard for OpenOffice."


Rob Weir will do and say *ANYTHING* -- He couldn't care less whether it's actual fact, FUD, or complaints about meaningless, non-technical matters such as the chosen name.


Why?

M. David Peterson
2007-02-01 08:29:29
@William,


So I just finished reading through Rob's post as well as each of the follow-up responses. What amazes me about all of this is how many flat-out lies exist in his follow-up -- to the point where anyone who even remotely understands the issues at hand immediately recognizes it for what it truly represents (and has no fear of stating this in follow-up comments),


*NOTHING* but useless FUD.


For example, he actually attempted to defend the lack of any sort of specified spreadsheet formulas (and no, I'm not kidding) by stating that, in essence, "look at how many applications have gotten by just fine without it, and yet still have spreadsheet capabilities."


So, in essence, the argument is founded upon the notion that there's no need to have the specification, as implementations seem to be getting by just fine without it. But based on what? Their own ideas of how it should be implemented? I assume yes, which would then lead to the following problem...


*Lack of interoperability between implementations of the ODF specification.*


AND THIS IS A GOOD THING???!!!


WTF???!!!


*WOW*!!! Why not just write a one sentence spec and be done with it.


"You just do what you think is best, K' there Sparky."

William
2007-02-01 08:52:08
Mr. Peterson:


That's a pretty shrill response. I understand (even applaud) your passion on this issue, I just think everyone would benefit from some deep breaths and a few fewer punctuation marks.


I cannot make educated comment on either document spec, but I did note a sharply contrary response to Mr. de Icaza's post, and I linked to it to present a different view.


All *I* want is a single, vendor-neutral document standard that is transparent enough for anyone to use. This is a hard thing, and I am sure it won't be right in the first pass, but the extensibility of an XML-based spec appeals to me as a way to eventually incorporate all needed functionality.


I have never seen Microsoft act in a transparent, vendor-neutral way, and so a single-vendor spec proposal from them worries me, and sets off alarm bells. I sincerely hope that they are false-alarm bells.


My primary concern with OOXML is that it is trying to be backwards-compatible with all features of previous software. This is probably an undue burden for any spec.

M. David Peterson
2007-02-01 09:17:48
@William,


I hope I didn't come across as attacking you in particular. I most certainly didn't intend that, so my apologies if that is how it came across.


> That's a pretty shrill response.


Again, it wasn't intended for you, and was instead directed towards Rob, who has a long history of this kind of crap. I've been watching, reading, commenting, and in other ways have been a participant in this debate for quite some time, and to be honest, the camp of folks running the EOOXML smear campaign is pretty small, but at the forefront of them all is Rob Weir. Maybe it's just a reflex reaction, but anytime I hear/see mention of Rob and a link to one of his posts, my immediate reaction has become "Oh Dear God, what has he come up with this time."


I do understand and respect your concerns. I honestly and truly do. ODF has accomplished some pretty amazing things, one of which was forcing MSFT into pushing forward with opening up and standardizing their previously closed/proprietary format. This is a *HUGE* deal, and I know for a fact that there are a *TON* of folks in the ODF camp who see this as a pretty major accomplishment and are quite proud of this very fact.


Of course, none of these same folks are the ones spreading the FUD, and it's the FUD that gets the most attention as it immediately draws discussion and debate.


It seems there are two camps (though the first is quite a bit larger than the second),


- Those who want open standards/specifications, and are proud of the fact that through their efforts they have been able to not only produce a fantastic doc format in ODF, but have also cracked the nut that was once the closed/proprietary format of the Office document formats.
- Those who have seen ODF as a marketing tool, and are pissed off that they now have to "share the openness" with EOOXML.


Anyway, I want to make sure that it's well known, widely spread knowledge that I believe ODF in and of itself is a great accomplishment and a fantastic, well designed/thought through document format/specification. I know several of the folks whose names are associated with the development of this spec, some of them quite personally, and I have nothing but respect for each and every one of them. My problems are not with ODF. They're with the anti-EOOXML smear campaigners who are doing more damage than they could ever possibly do good.


And with all of this, thanks for your respectful follow-up. It truly is appreciated. I know I can come across pretty harsh, much of which is just my in-person personality only partially reflecting itself into text. In other words, what probably reads as being pretty harsh, is actually just my inability to properly reflect the much less aggressive/playful/laughing while I work/play/communicate all the day long self.


Of course, it's not your responsibility to better interpret this, and instead mine to better communicate this. So, if it helps any... I'll try just that much harder to come across just that much less harsh the next go round. ;) :D


Thanks for your follow-up, William!

stelt
2007-02-01 16:12:52
That picture asks for it:
After http://svghearts.com you're up for a http://svgheartsonhatonyourhead.com ? :-)
M. David Peterson
2007-02-01 21:47:29
@stelt,


Never knew about http://svghearts.com/ < SWEET! I *LOVE* new toys! :D

Kurt Cagle
2007-02-02 02:09:01
I think one of the key points of consideration wrt EOOXML (can we please come up with a better acronym - EOX, for now) and its consideration as an ISO standard is this - can a document be created that is conformant to EOX that isn't produced by a Microsoft product? Regardless of any other issues of FUD on either side, this one has to be determined. If there are dependencies upon non-open technologies, then while EOX may be a wonderful specification for describing the use of Microsoft technology, it is not, and should not be, ratified by ISO as a global standard. This of course also applies to ODF, but ODF doesn't have any such issues because it utilizes open standards technologies exclusively.


Of all of the standards bodies, ISO is probably the most important - it is the one that businesses actively seek to conform to most closely, and as such its standards are higher than those of ECMA, which has been seen as being very close to Microsoft on a number of issues. Thus, conformance is not just a matter of a number of Microsoft "detractors" sitting around throwing stones, but rather is an attempt to ensure that any standards that are adopted are more than simply rubber stamps for this or that company.


-- Kurt

M. David Peterson
2007-02-02 03:31:26
@Kurt,


> can a document be created that is conformant to EOX that isn't produced by a Microsoft product


When you consider the fact that there is no requirement for an "all or nothing" implementation, then the answer to this is quite easy: Yes.


Is it possible to provide an implementation that is 100% compliant and not be a MSFT product? Well, when you consider that MSFT has an expectation from their customer base to provide backwards compatibility with previous versions of Office, whereas other non-MSFT products do not have that requirement, then it seems to me that the argument as to whether or not it is possible is of little to no value if the focus is placed on the portions of the spec that provide for backwards compatibility.


Should those portions of the spec then not be specified? Don't know, but it seems to me that if they weren't specified, this would become yet another point of argument from those opposed to EOX (< like that, btw... :D), so I don't know if it's one of those points that's even worth attempting to make any sort of determination on as to the logic involved. Besides which, we are all keenly aware that MSFT has never sued a single office tool manufacturer for reverse-engineering any of the Office document formats. So given that the specification for how to hook the previous formats into the new format exists, I have a hard time believing that any of these same office tool companies will not be using the specification as a reference for the hooks into the older formats.


Of course, you could argue that MSFT should standardize the previous formats, but,


1) That will take time.
2) People are not going to wait around for just such a specification before they implement support.


which leads to,


3) What purpose would standardization serve beyond that of a somewhat shallow victory?


In other words, the previous formats have nothing to do with the future; they're legacy formats, and if either ODF or EOX lives up to the task it was intended to serve, then where's the incentive to save documents back into the legacy formats? Of course, there are compatibility reasons with previous versions of Office that will require using the older formats, but even that is a somewhat moot point, as there are already add-ins that provide support of the .*x formats for as far back as Office 2000.


I must admit that I was intrigued by the idea of standardizing the legacy formats when Micah (Dubinko) made the suggestion a week or so back in a follow-up comment to a post from Rick Jelliffe. But then I thought about it some more and found myself wondering,


If I were MSFT, and had just been through a battle in which I was pressured by the ODF camp to standardize my Office doc format; gave in to that pressure and went through the standardization process, only to find that the same folks who pressured me to standardize are now attempting to derail the ISO standardization process train, would I honestly be compelled to then go through the same process all over again for little to no benefit to anyone beyond that of a shallow victory?


In other words, the pressure is being applied from folks such as Rob Weir to standardize their legacy formats. Rob Weir is attempting to derail the ISO standardization train. Rob Weir was one of those behind the pressure being applied to standardize the new document formats for Office 2007. If they were to give in to this same pressure like they did in the past, would folks like Rob Weir make every attempt in the world to derail the same train they pushed to have brought into commission in the first place?


History tends to repeat itself, so my guess would be "Yep... absolutely."


So then where's the value? What benefit is there to *ANYONE* to go through all of that effort? Maybe it's there, but at the moment, I'm not seeing it.

bryan
2007-02-02 04:53:50
thanks for the compliments; where Miguel's example is concerned: I don't say that it is impossible to work with WordML, I say it is a pain, and it is significantly more painful to work with than it is to work with ODF (which I have to say is also a pain.)


The problem is that they are both relatively flat formats, but consider the definition of a paragraph in ODF and its style name is
<text:p text:style-name="HereIsTheStyleName">here is the paragraph</text:p>


this seems easier than the example linked to, which it should be pointed out was a 'simplified' WordML.
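
To make the ODF side of that comparison concrete, here is a minimal sketch (not taken from the thread) of pulling the style name and text out of a paragraph like the one above with Python's standard library. The namespace URI is the one ODF 1.x uses for the text: prefix; the function name odf_paragraphs and the SAMPLE string are purely illustrative.

```python
import xml.etree.ElementTree as ET

# Namespace URI bound to the "text:" prefix in ODF 1.x content.xml files.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# The paragraph from the comment above, with the namespace declared so that
# it parses on its own (an assumption; the comment omits the declaration).
SAMPLE = (
    f'<text:p xmlns:text="{TEXT_NS}" '
    'text:style-name="HereIsTheStyleName">here is the paragraph</text:p>'
)


def odf_paragraphs(xml_text):
    """Return a (style_name, text) pair for every text:p element."""
    root = ET.fromstring(xml_text)
    pairs = []
    for p in root.iter(f"{{{TEXT_NS}}}p"):
        # The style is an attribute on the same element that holds the text.
        style = p.get(f"{{{TEXT_NS}}}style-name")
        pairs.append((style, "".join(p.itertext())))
    return pairs


print(odf_paragraphs(SAMPLE))
```

The point of the sketch is only that the style and the text live on the same element, so a single lookup per paragraph is enough.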


Basically, aside from the flatness of WordML, I have two major problems with it. The first is a somewhat arcane tendency, partly reflected in the example Miguel linked to, to place the 'style' of a piece of text (which in a word processing document is often used to assign the meaning of what something is) as the child of a preceding sibling of the element that holds the text itself. The second, which I saw in the one major document processing project I have worked on (meaning processing WordML into a semantically meaningful structure, as opposed to smaller data extraction projects), is a tendency for contiguous text to be split across the tree in a fashion that is difficult to handle.


As an example, a bit of simplified WordML from this project, with identifying information removed:
<w:body>
  <wx:sect>
    <wx:sub-section>
      <w:p>
        <w:pPr>
          <w:pStyle w:val="Chapternummer"/>
        </w:pPr>
        <w:r>
          <w:t>Chapter 2</w:t>
        </w:r>
      </w:p>
      <w:p>
        <w:pPr>
          <w:pStyle w:val="Chapterheading"/>
        </w:pPr>
        <w:r>
          <w:t>some text that is complete</w:t>
        </w:r>
      </w:p>
      <w:p>
        <w:pPr>
          <w:pStyle w:val="Paragraphtext"/>
        </w:pPr>
        <w:r>
          <w:rPr>
            <w:b/>
          </w:rPr>
          <w:t>some text here that was meant to assign this element as being meaningful in a hierarchical manner relevant to other elements</w:t>
        </w:r>
        <w:r>
          <w:t> </w:t>
        </w:r>
        <w:r>
          <w:rPr>
            <w:rStyle w:val="character"/>
          </w:rPr>
          <w:t>some text here that was not comp</w:t>
        </w:r>
        <w:r>
          <w:rPr>
            <w:rStyle w:val="character"/>
            <w:hyphen w:rule="normal"/>
          </w:rPr>
          <w:t>l</w:t>
        </w:r>
        <w:r>
          <w:rPr>
            <w:rStyle w:val="character"/>
          </w:rPr>
          <w:t>etely finished in the prior two w:t elements</w:t>
        </w:r>
      </w:p>
    </wx:sub-section>
  </wx:sect>
</w:body>



Now, I have to admit that I was not working in the part of the project that had to do with getting this output. I just had to handle this output. So it could well be that the other part of the project did something wrong but I'm going to guess they did things reasonably well. In the above example please notice the last three w:r elements.


This is dealable with, sure, but it is not especially pleasing to deal with XML structured like this, and it's not especially cheap in developer time, processing time or any other sorts of time that in the normal cycle of business translates into being money.
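
As a rough illustration of the extra work described above (a sketch only, not the code from that project), the following Python stitches the fragmented w:t runs of a paragraph back together and looks up the paragraph style from the preceding w:pPr sibling. The namespace URI is the Word 2003 WordprocessingML one, which is an assumption since the excerpt omits its namespace declarations; SAMPLE is a cut-down copy of the last paragraph, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for the "w:" prefix (Word 2003 WordprocessingML).
W = "http://schemas.microsoft.com/office/word/2003/wordml"

# Cut-down version of the last paragraph in the excerpt above.
SAMPLE = f"""<w:body xmlns:w="{W}">
  <w:p>
    <w:pPr><w:pStyle w:val="Paragraphtext"/></w:pPr>
    <w:r><w:rPr><w:rStyle w:val="character"/></w:rPr><w:t>some text here that was not comp</w:t></w:r>
    <w:r><w:rPr><w:rStyle w:val="character"/><w:hyphen w:rule="normal"/></w:rPr><w:t>l</w:t></w:r>
    <w:r><w:rPr><w:rStyle w:val="character"/></w:rPr><w:t>etely finished in the prior two w:t elements</w:t></w:r>
  </w:p>
</w:body>"""


def q(name):
    """Qualify a local name with the w: namespace for ElementTree lookups."""
    return f"{{{W}}}{name}"


def wordml_paragraphs(xml_text):
    """Return a (paragraph_style, joined_text) pair for every w:p element."""
    root = ET.fromstring(xml_text)
    pairs = []
    for p in root.iter(q("p")):
        # The style sits in w:pPr/w:pStyle, a sibling of the runs, not on
        # the element that holds the text.
        style_el = p.find(f"{q('pPr')}/{q('pStyle')}")
        style = style_el.get(q("val")) if style_el is not None else None
        # A single word may be split over several w:r/w:t runs; join them.
        text = "".join(t.text or "" for t in p.iter(q("t")))
        pairs.append((style, text))
    return pairs


print(wordml_paragraphs(SAMPLE))
```

Joining the runs recovers the text, but it throws away the per-run formatting (such as the w:hyphen rule), so a real converter would have to decide what to do with that information rather than simply concatenating.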


I am quite willing to complain about the problems that ODF has, but I think in some ways those are a much higher level (not so much markup related) and solvable range of problems than those that are found in WordML (and thus generally also in OOXML).


P.S.: I seem to remember there was something somewhat difficult about images in WordML vis a vis those in ODF, but don't have any examples here.

bryan
2007-02-02 04:59:03
"In other words, the pressure is being applied from folks such as Rob Wier to standardize their legacy formats"


I think you're wrong. The pressure is being applied from large governmental and international organizations. Maybe because they have listened to Rob Weir about the benefits of standardization.


This does not mean that they will listen to Rob Weir about the quality of the standard put out.


M. David Peterson
2007-02-02 07:23:37
@bryan,


This is what I would deem a reasonable concern with the format, though I do know that there are a lot of these issues which are a pain to work with from an XML standpoint, but which for various reasons (some performance, some compatibility, etc.) needed to be this way, or that way, or some other way altogether. I think a lot of it has to do with attempting to map binary formats to their XML "equivalent", something that can be a real pain-in-the-a$$, though I don't have any real "insider" understanding -- I simply know of enough similar circumstances that have had similar results in regards to working with the XML.


>> P.S.: I seem to remember there was something somewhat difficult about images in WordML vis a vis those in ODF, but don't have any examples here.


If I'm not mistaken, this had to do with labels having no hierarchal relationship to the image itself, and I agree, this can be a real pain when dealing with XML, as you can't use standard hierarchal relationship models to determine relationships as you would expect. I don't know if this is directly related to the translation from the binary to the XML format, but going by what I would consider fairly basic knowledge of creating efficient (in both size and speed) binary data files, the more you can externalize strings (placing them into more efficient lookup tables and what have you), the better off you will be: you use a simple index id to point to the label or heading, or whatever else, instead of placing this information directly inline with the associated graphic file, as you would if you were developing an XML format that used hierarchal relationships to determine this same information.


So what this really all comes down to is the pains that we must endure as developers when dealing with XML formats which are an obvious compromise between performance, size, and correctness, pretty much in that order. And I agree, it can be a real pain-in-the-a$$, but I do believe that it's a pain with a reason that goes beyond the inability to develop a proper hierarchal XML syntax.


Of course, ODF, as you point out, has its share of issues that are similar to this, which I think lends credibility to the notion that when it comes to designing XML document formats, if you hope to maintain any level of efficiency in the rendering of these documents, you have to make some hard decisions that take away some of the luxuries we tend to take for granted as XML developers when it comes to hierarchal document models. A real pain for us, but when it comes down to it, it's the customer using the office document tool who never has to see or deal with the underlying data structure, and yet is continually dealing with the performance side of the equation, and that is what ultimately determines what the document format will be.


In other words, regardless of how much of a pain it might be to deal with from a developer's perspective, if the level of slow-down is noticeably different, compromises are going to have to be made. And as you have showcased, it's pretty obvious what these compromises are...


=> flat data models.

M. David Peterson
2007-02-02 07:45:02
@bryan,


>> I think you're wrong. The pressure is being applied from large governmental and international organizations. Maybe because they have listened to Rob Weir about the benefits of standardization.


Oh, I do recognize that it goes beyond Rob and directly into the government bodies such as the EU. What I am referring to specifically in this case is the recent and specific demands that Rob has made in regards to this specific topic. Obviously he isn't the only one, but he is certainly one of the "loudest" ones, which, again, is why I chose to pull his name out as an example.


> This does not mean that they will listen to Rob Weir about the quality of the standard put out.


True. Hopefully they will listen to a lot more folks other than those like Rob who have proven over, and over, and over again that their primary interest is keeping EOOXML from becoming a standard, which contradicts everything that ODF has ever been about... Open Document Formats.

W^L+
2007-02-02 12:08:38
I use ODF-aware applications all the time on both Windows and Linux. For my word processing documents, only AbiWord is substantially different. For spreadsheets, I have to admit that I don't do many formulae, so my experience may not be typical. Having said that, my spreadsheets in Windows OOo work fine with Linux KSpread. As far as Gnumeric goes, my distribution's version is not yet ODF-aware, so I cannot check its quality.


I have read about 25% of the ODF specification so far, and none of the OOXML spec. One thing that is clear from reading ODF is that it is designed to enable multiple implementors. The reports from those who *have* read the OOXML spec are that it is definitely not designed for this purpose.


There are a couple of purposes that are clear for any "open standard":
(1) Give customers the ability to mix-and-match readily. When you hear comparisons with electric power sockets, this is what they mean. If a standard is open, I as an IT person in an enterprise, ought to be able to get the same results with multiple applications. In fact, I would say that from a technical support standpoint, if there are not multiple implementations, then it is not an open standard.


This is one of the reasons the LAMP platform is so attractive: if PHP isn't meeting your needs, one can switch to Perl, Python, Ruby, or any of the other scripting engines. This makes the enterprise less vulnerable to licensing issues, for example, or even performance issues.


(2) Make it easier for various competitors to produce fully-compatible products which can be swapped in and out in customers' environments as desired. In other words, the software vendors produce their applications and merely have to map their functionality onto the input/output formats specified by the standard. This means that vendors no longer have to roll their own formats.


If you look at mobile phone chargers, you'll see that the cost of designing a new charger for each new series of phones means that it is costly for the companies, so when you have to buy a replacement charger, you pay more because that design cost is spread over fewer chargers. This extra cost and hassle, by the way, is why some Asian countries are now mandating that new phone designs use one standard charger design.


However you look at it, the only way that is good for users is to have one open format standard for exchanging files that can be implemented by all vendors. If there was no other reason, that would be enough to oppose pushing a second standard into government (which is really what ISO-standardization is about in this area) usage.


The reason for all of the so-called "anti-EOOXML FUD" is Microsoft's anti-competition stance. We already know how difficult it is for competitors to fully-implement the existing binary formats. In my employment, I find that Microsoft's own products have difficulty with their older binary formats, so I am not surprised that other vendors also have problems with those formats.


Like William, I find that above all, Microsoft fears and detests competition, using file formats to block out competitors time after time. Since their marketing proposition is "we are integrated--everything we make works together", it is clear that they do not wish to give that mix-and-match ability to customers (or end-users, if you will).


Read the things on their advocacy blogs. Read their legal documents. Read the things their chairman spouted at the Microsoft-Novell announcement. Competitive markets, with several compatible implementations of equivalent functionality, are *NOT* what Microsoft seeks, but it is *EXACTLY* what customers need. http://lnxwalt.wordpress.com/2007/01/20/whats-wrong-with-choice/


In any case, no one is against Microsoft using OOXML as a format, but if they *really* cared about "choice", they would have implemented *both formats*. Until they do, claiming that they want people to have a choice is a straight-up lie.


In any case, while I have never met Rob Weir in person to know what he is like, he has impressed me as being a basically honest person. On the other hand, the pretense that IBM is the one that has its finances in danger over the ISO-ification of OOXML has impressed upon me that Brian Jones would say *ANYTHING* to make it happen. http://lnxwalt.wordpress.com/2007/01/21/whose-finances-are-on-the-line/ tells us who has the greater financial risk here. HINT: It isn't IBM.


Over the years, I have watched (and used) the products that Miguel has initiated. I respect him and his viewpoint, but I think that he is too close to Microsoft to be objective. One should keep Novell's new relationship in mind when he writes these things. This is not to put him down; one *should* weigh the relationships and incentives of anyone who expresses an opinion on this issue.


My relationships? I do tech support in an almost all-Microsoft environment. I do not work for or have any relationships with IBM, Novell, Sun, Microsoft (other than their software being used in our environment), or anyone else who stands to make money or lose it. My desire is to be able to utilize fully-open standard file formats, so that *almost any* office application from *any vendor* can be used to produce the same (from a user and support perspective) results. That's it. That's what ODF offers, but OOXML does not.

mike
2007-02-06 09:24:23
i poked around in this - as a complete outsider - so bear with me


some of the objections in http://www.groklaw.net/article.php?story=20070123071154671 would seem worthy of concern - are they?

M. David Peterson
2007-02-06 12:47:50
@mike,


some of the objections in http://www.groklaw.net/article.php?story=20070123071154671 would seem worthy of concern - are they?


I believe so, yes. In most cases, however, I believe that concern is more of a "we need greater understanding as to why this route was chosen" instead of "this is wrong!" which is the current approach of the folks at GrokLaw.net. I believe the key word in GrokLaw is "Law" -- and it's true, the folks at GrokLaw (at least the core group of lawyers) truly do understand Law. Does this mean they are then qualified to analyze a technical specification to determine if the chosen technical approach is, in fact, valid?


Well, let me answer that question with a question of my own: As an experienced software developer, am I also qualified to walk into a court room and argue a court case? Setting aside the fact that I can legally act as my own attorney, I couldn't based on the simple fact that I am not licensed to practice law. But beyond this, even if I could, my lack of experience and overall understanding of how the law works disqualify me on the grounds that I have no clue what I am talking about.


Computer Science is *HARD*. The fact that I am not legally required to maintain a license to program a computer doesn't change this fact. For a lawyer, determining the legal requirements as to what a piece of software must be enabled to do is one thing. Determining how it must do it, quite another.


Are the concerns pointed out by the folks at GrokLaw.net legitimate concerns? If extracting more information from MSFT is required for someone qualified to interpret and implement the spec to do just that, yes. But when you are using the argument that "6000 pages? That's too much information!", you can't exactly then use the argument "we need more information", so instead they have been using the "because we don't understand this, it must therefore be a bad thing... EVIL!" argument.


Has anyone who isn't a lawyer ever tried to read a legal "brief"? What's your first reaction?


"Wow! This is just SOOO MUCH INFORMATION! It's SOOO CONFUSING!"


EXACTLY!


Law is *HARD*! Computer Science is *HARD*! They both require specialists to understand, and both require LOTS and LOTS of information to understand well.


Lawyer: "I'm sorry, Judge, but there's simply too much information here! You actually expect me to read all of this? I've got things to sue, (rich) people to free!"


Judge: "Will someone please get this phreak out of my court room?!"


Frank Daley
2007-02-06 15:28:20
You said:
This is the way it *NEEDS* to be, folks. The FUD, anti-EOOXML smear campaigns accomplish *NOTHING*, whereas the exchange that took place below accomplished exactly what needed to be accomplished... Find where the problems exist and then fix them.


But you are totally ignoring the most basic problem - that it is effectively impossible for any organization except Microsoft to fully implement EOOXML.

Yes, technology experts can design solutions to extract data from EOOXML documents. So what, if the only effective tool remains Microsoft Office. That just perpetuates Microsoft's current customer lock-in.

M. David Peterson
2007-02-06 18:23:22
@Frank Daley,


But you are totally ignoring the most basic problem - that it is effectively impossible for any organization except Microsoft to fully implement EOOXML.


I'm not ignoring it at all. In fact, I've written even more about this very topic than I did in my original post. Please see http://www.oreillynet.com/xml/blog/2007/01/odf_vs_ooxml_from_an_insider_o.html#comment-454196 for more detail.


Yes, technology experts can design solutions to extract data from EOOXML documents. So what, if the only effective tool remains Microsoft Office. That just perpetuates Microsoft's current customer lock-in.


Only effective tool? How so? If I am able to extract the data from the .*x formats, and then use this data in other tools, then the "lock-in" isn't the data itself (which is where the "lock-in" supposedly exists, though given that *EVERY* major office tool implements support for the same formats, I still fail to see where exactly the presumed lock-in exists), but instead the desire to use the best tool for the job at hand, whatever the consumer *chooses* that to be.


In other words, the game has now shifted from "we must use MSFT Office if we want compatible document formats" (which, again, has never actually been the case, but whatever) to a "we can use any tool we feel is best" feature war.


Maybe it's just me, but isn't that exactly what we want as consumers? More choices, better tools due to feature competition instead of (ironically, given that the argument seems to be about moving from one "locked-in" format to another, and this supposedly being a good thing) format lock-in, etc...


If the "only effective tool remains Microsoft Office" it won't be because of lock-in -- It will be because the competition has chosen not to make any attempt at competing on features, and instead has put its resources into attempting to lock us each into *one format*.


Which is more important: Features or Formats?


Let me ask this another way... When the formats are openly specified, and can be used royalty free, which is more important: Features or Formats?


Let me ask this one more way: If the "only effective tool remains Microsoft Office", will that be because the competition chose formats whereas MSFT chose features?


Let me answer this for you (from my own perspective): Yes!

M. David Peterson
2007-02-06 18:43:48
@W^L+,


My apologies for the late approval of your comment! I just went into the comments section to fix a spelling error on my last comment and noticed this was sitting there unapproved. I'm away from my office at the moment, but as soon as I return, I will reply in full.

M. David Peterson
2007-02-06 23:28:02
@W^L+,


Sorry for the hold-up. Have limited time at the moment, but will try to get through as much of your response as time allows,


Firstly,


My relationships? I do tech support in an almost all-Microsoft environment. I do not work for or have any relationships with IBM, Novell, Sun, Microsoft (other than their software being used in our environment), or anyone else who stands to make money or lose it. My desire is to be able to utilize fully-open standard file formats, so that *almost any* office application from *any vendor* can be used to produce the same (from a user and support perspective) results. That's it.


The same is true about me (update: well, I don't do tech support, though I feel like that's what I do sometimes ;), so it seems we are working from a similar foundation.


That's what ODF offers, but OOXML does not.


And this is where we immediately differ in opinion.


In fact, I would venture to state that the exact opposite is actually the case. Why? Because ODF is working from the standpoint in which there is no requirement that they provide support for the billions and billions of documents that exist on this planet that are encoded in MSFT Office formats. It would be nice to ignore these documents, but for what I hope are obvious reasons, we can't. From this standpoint, I would conclude that the only true inter-operable format is EOOXML, as it is the only format that takes into consideration all of the billions and billions of existing documents on this planet, providing direct hooks into bringing these same documents into the "modern age" so to speak.


Having said that, my spreadsheets in Windows OOo work fine with Linux KSpread. As far as Gnumeric goes, my distribution's version is not yet ODF-aware, so I cannot check its quality.


I'm not sure I understand your point. Are you suggesting that because of your experience of working with OO.o > KSpread, you feel that the need to specify the formulas is of no great concern? If yes, I'm sorry, but I completely disagree. Specifying the formulas is crucial to ensure interoperability > I'm not talking about application-to-application interoperability, btw... We all know it's just as easy to open a MSFT Excel spreadsheet in OO.o as it is in MSFT Excel. What I am referring to specifically is the current situation in which each application that reports as being "ODF Compliant" must, in essence, guess the formula formats for *THE SAME* document type.


In other words, there is no guarantee, at the moment, that if I encounter any given .ods document (the extension of an Open Document spreadsheet file) I can expect to find a formula format that I can understand. The "well it isn't a problem with the tools that I use" is all fine and dandy, but when the entire premise behind the reasoning for providing a specification is then set aside with a "works for me" response, then what's the purpose of having a specification in the first place? I mean, I can open a .doc, .xls, .mdb, and .ppt file inside of OO.o, WordPerfect (and/or equivalent spreadsheet, database, and presentation software), so why doesn't the "works for me" argument qualify as good enough, when in fact that's exactly the situation we're faced with?
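To make the formula concern concrete, here is a minimal sketch (not taken from either spec) of how one might check which formula syntax a given .ods file actually uses. It assumes the usual ODF packaging, a zip archive containing content.xml, with formulas stored in the table:formula attribute of table:table-cell elements. The function names, the sample document, and the "oooc"/"of" prefixes in it are illustrative assumptions, not something the thread or the ODF 1.0 text guarantees.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# ODF namespace URIs for the office and table vocabularies.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"


def formula_prefixes(ods_bytes: bytes) -> Counter:
    """Count the syntax prefixes found on table:formula attributes."""
    with zipfile.ZipFile(io.BytesIO(ods_bytes)) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    counts = Counter()
    for cell in root.iter(f"{{{TABLE_NS}}}table-cell"):
        formula = cell.get(f"{{{TABLE_NS}}}formula")
        if formula is None:
            continue  # plain value cell, no formula to inspect
        prefix, sep, _ = formula.partition(":=")
        # A formula such as "=A1+A2" carries no prefix at all.
        counts[prefix if sep and prefix.isalnum() else "(no prefix)"] += 1
    return counts


def make_sample_ods() -> bytes:
    """Build a tiny, hypothetical .ods-like archive for demonstration only."""
    content = (
        f'<office:document-content xmlns:office="{OFFICE_NS}" '
        f'xmlns:table="{TABLE_NS}">'
        "<office:body><office:spreadsheet><table:table><table:table-row>"
        '<table:table-cell table:formula="oooc:=SUM([.A1:.A2])"/>'
        '<table:table-cell table:formula="of:=SUM([.A1:.A2])"/>'
        '<table:table-cell table:formula="=A1+A2"/>'
        "<table:table-cell/>"
        "</table:table-row></table:table></office:spreadsheet></office:body>"
        "</office:document-content>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("content.xml", content)
    return buffer.getvalue()


if __name__ == "__main__":
    print(formula_prefixes(make_sample_ods()))
```

A consumer that only understands one of these syntaxes has to skip or guess at the rest, which is the gap a formula specification is meant to close.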


Sorry, but you seem to be arguing from both sides of your mouth.


Left Side: "We need specifications!"


Right Side: "Works just fine for me, regardless of whether it's specified or not."


So which is it? Is the specification important, or is the "works for me" argument just as valid? Please pick one (and only one!) and run with it, because at the moment you're not making any sense.


That said, need proof that documents claiming to be of ODF "descent" can become a problem if that's not what actually happens?


http://www.oreillynet.com/xml/blog/2006/11/google_docsopeninterpretation.html


NOTE: The above link is actually a weak argument, as what actually took place is Google claimed it was saving a document in ODF, when in fact it was really the pre-ODF OO.o format. That said, it does showcase the obvious confusion that can and has taken place as to what can qualify as ODF, and the troubles that can be encountered when it turns out not to be in an understood format.


One thing that is clear from reading ODF is that it is designed to enable multiple implementors. The reports from those who *have* read the OOXML spec are that it is definitely not designed for this purpose.


I've read both specs. They both have problems. Not insurmountable problems, but problems nonetheless. ODF provides for a non-backwards-compatible document format. EOOXML provides for backwards compatibility with the billions of documents on this planet encoded in previous Office formats, while at the same time providing a new document format that is just as capable as ODF, and in many cases more so. The "multiple-implementors" argument in favor of ODF is false. But don't take my word for it... Listen to what Tim Bray, an obvious proponent of ODF, has to state on the matter,


via http://www.oreillynet.com/xml/blog/2006/07/open_office_to_support_open_xm.html#comment-50276


OpenOffice.org has had import/export filters for every MS Office format back to the dawn of time, and can already open lots of vintage Office files that Office can't, any more. Do you have some inside information that would suggest they'll suddenly change their behavior and refuse to do it this time?


I most certainly don't. Anybody else?


I need to rush through the rest of your comments, but as soon as I have a chance, I will try and come back to answer those that I skipped over...


Since their marketing proposition is "we are integrated--everything we make works together", it is clear that they do not wish to give that mix-and-match ability to customers (or end-users, if you will).


False. COM has *ALWAYS* been about interop between applications -- 99+% of the applications developed for Windows are developed by companies other than MSFT, and yet a significant majority of these applications can inter-operate with one another (drag/drop, copy/paste, plug/play, etc...) From the hardware side, the primary reason MSFT has been as successful as they have been is due to the fact that they have made every effort to get as many hardware vendors involved as they possibly could. How do they get them involved -- Mix-and-Match interoperability.


You're claiming MSFT has no interest in mix-and-match interoperability? I would argue that's *ALL* they have interest in.


Competitive markets, with several compatible implementations of equivalent functionality, are *NOT* what Microsoft seeks,


False. Please see my same counter argument above.


but it is *EXACTLY* what customers need.


True. In fact, that's exactly what they get.


* Application interop: drag/drop and copy/paste (beyond just text, for objects as well). VBA (Visual Basic for Applications) has been available for non-MSFT applications for over ten years, and VBA is *ALL* about cross-application interop as well as utilizing a common cross-application scripting environment.


* Hardware compatibility. If I go into my local Office Depot, how many choices will I have of printers that will work with Windows? What about Mac? Linux? Consumer choice is what has made MSFT successful.


To put this another way, "My printer-of-choice may not work" or "my application-of-choice may not be compatible" are not reasons you hear for choosing to purchase Windows, whereas "I sure wish my application-of-choice ran on Mac or Linux" and/or "I sure wish my printer-of-choice would work on Mac or Linux" are most definitely reasons you hear for people not choosing Mac or Linux-based machines.


In any case, no one is against Microsoft using OOXML as a format, but if they *really* cared about "choice", they would have implemented *both formats*. Until they do, claiming that they want people to have a choice is a straight-up lie.


Hmmm... Weird. I would have thought the news that MSFT paid for the development of the open source EOOXML-to-ODF-to-EOOXML document converter, and that Novell, in fact, plans to use this converter in OpenOffice to convert between the two formats, would have spread more widely by now.


Guess not.


Here's where you can find it. http://odf-converter.sourceforge.net/


Here's a snippet from a recent news post covering the release,


A Microsoft-sponsored open-source project is expected on Friday to release a translator that will convert file formats between Microsoft Office and rival standard OpenDocument, or ODF.


Microsoft started the project at SourceForge last year, relying on three partners to develop the code that lets a user open and save word processor documents in two different formats.


<snip/>


Novell last year said it will use the Word translator to allow users of OpenOffice, which supports ODF, to work with OOXML files from Microsoft Office.


<snip/>


More @ http://news.zdnet.com/2100-3513_22-6155585.html


Will attempt to come back to this later and provide extended commentary, but for now I need to get some other work done.

mike
2007-02-07 02:43:41
i wasn't thinking of the "6000 pages" comment - it was the objections listed in the contents that concerned me - and of which i didn't have to go further than the 1900 leap year to smell a rat:


http://www.groklaw.net/article.php?story=20070123071154671#The_Gregorian_Calendar


i stumbled on all of this while waiting for some tests to run - so really i'm just looking for some informed technical opinion


your "not a lawyer stay out of the courtroom" answer doesn't help me assess whether it is reasonable for Ecma 376 to break ISO 8601 so that it can accommodate a known Excel bug


my fault for not asking a specific question in the first place

alex
2007-02-07 07:28:23
While I am an admitted MS hater, there are problems with the ODF specification. Of course, it seems that a lot of people inside of ISO have problems with OOXML as well, since a number of them have objected to the standard as it is written.


I wouldn't suggest that ODF be used as an excuse to abandon all the existing Word-formatted documents, but I will suggest those documents be converted from Word format to ODF for future preservation, due to 3rd party software being more capable of reading older Word documents than Word is.

M. David Peterson
2007-02-07 08:13:54
@Mike,


i wasn't thinking of the "6000 pages" comment


Oh, I knew you were referring to more than just the size of the document, but had assumed it was more of a generalized "are any of these concerns legit?" question.


I am preparing for a meeting at the moment, and won't be back until about 3:00pm MST, but once I am back and have a chance to get caught up on any pertinent email, I will respond again in full.

M. David Peterson
2007-02-07 08:14:48
@alex,


Please see my last comment...

Segedunum
2007-02-07 09:11:48
I never cease to be amused by people whose first answer to any of the clear shortcomings and problems raised with OOXML is the ubiquitous 'FUD'.


Can you people really not answer any of the problems and objections that have been listed on Groklaw and by myself elsewhere? The fact is, if a proposed ISO standard does not use correct country codes for some unapparent reason, specifies its own vector drawing technology rather than sensibly using SVG as everyone else has been able to do, and even has a stated list of its own banners then it simply cannot be used for reliable data exchange of any kind.


You can sing a song, do a dance, jump up and down and do a play on Microsoft's own 'Get the Facts' campaign (which I find very amusing, by the way), but those ARE FACTS. Unless they are addressed, point by point, by one of those people who seems to think that every shortcoming pointed out in OOXML is simply FUD then you are all taking a very long toilet break in the wind. Yes, ODF needs to be continually improved (one question is then, why isn't Microsoft contributing to it?), but you can't then point the finger at that as a way of deflecting attention from OOXML.


I also particularly liked Miguel's answer to Microsoft not using SVG. His argument boils down to the fact that it would somehow be too hard for Microsoft (boo hoo), it would pull in other W3C standards (those things Microsoft doesn't support properly, but people would benefit a lot from) and some vague reference to Adobe hijacking it which I don't understand the relevance of. I really don't understand that considering that OOXML is nigh on impossible to implement fully away from Microsoft Office (and SVG is being implemented today), and that Microsoft controls the direction of OOXML.


Sorry, but many open source projects are forging ahead with SVG, which shows that it's possible, and Microsoft could easily do the same even if they only used a subset of it. He then goes on to dismiss all of that and then talk about how much easier it is to implement XAML and then tells us that it isn't tied to Windows. Go figure. He seems to not be able to get that XAML is still a Microsoft controlled technology, and should it take off, whatever Microsoft does with it will be the benchmark that everyone looks at. That of course, is the very point.


Unfortunately, Miguel's and a lot of other people's attitudes can be summed up in one word - capitulation - rather than looking at the wider picture. "Oh, Microsoft has made some kind of open format that we have deluded ourselves into thinking will allow us to finally open Microsoft Word documents reliably, so let's use that instead!" Alas, that's the way people are seeing this. There is no discussion whatsoever of how much more Microsoft could be doing, because everyone sees them as an immovable rock everyone has to revolve around.


It's pretty sad to see.

Segedunum
2007-02-07 09:14:06
and of which i didn't have to go further than the 1900 leap year to smell a rat...i stumbled on all of this while waiting for some tests to run - so really i'm just looking for some informed technical opinion


The simple answer, Mike, is that this shouldn't be in a standard at all. It should be worked around within the application in question, leaving the format, everyone else and future applications to handle things correctly, without trouble.
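As a rough illustration of what "worked around within the application" could look like, here is a minimal sketch. It assumes the commonly described spreadsheet convention in which serial 1 is 1900-01-01 and serial 60 is the nonexistent 1900-02-29; that convention and the function names are assumptions made for illustration, not quotations from the Ecma 376 text.

```python
from datetime import date, timedelta


def is_gregorian_leap_year(year: int) -> bool:
    """Gregorian rule: every 4th year, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def serial_to_date(serial: int) -> date:
    """Convert a legacy spreadsheet day serial to a real calendar date.

    Assumes the legacy numbering counts a phantom 1900-02-29 as serial 60,
    so every serial after it is one higher than the true day count.
    """
    if serial < 1:
        raise ValueError("serial must be 1 or greater")
    if serial == 60:
        raise ValueError("serial 60 denotes 1900-02-29, which never existed")
    # Before the phantom day the count is exact; after it, shift back one day.
    epoch = date(1899, 12, 31) if serial < 60 else date(1899, 12, 30)
    return epoch + timedelta(days=serial)


if __name__ == "__main__":
    print(is_gregorian_leap_year(1900))  # False
    print(serial_to_date(59))            # 1900-02-28
    print(serial_to_date(61))            # 1900-03-01
```

With the correction kept in the reading application like this, the stored format could use plain calendar dates and only legacy importers would need to know about the phantom day.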


There is nothing wrong with OOXML at a high level. The devil is in the details though. They're going to have to rip out and replace an awful lot of stuff to ensure that it can be implemented in a reliable manner away from Windows and Office and so that it can coexist with other standards and provide true data exchange.

Simon Phipps
2007-02-07 10:51:18
You assert:
> Again, as many of you will know, Novell is the top contributor
> of source code to the OpenOffice.org project.


This is incorrect. Sun Microsystems is the largest contributor to OpenOffice.org.

M. David Peterson
2007-02-07 14:26:40
@Segedunum,


Firstly, your continual suggestion that no one is answering the specific questions is a lie. They're answering. You disagree with the reasoning. That doesn't mean they are doing a song and dance around the issues; it means you don't like the answer, and therefore want them to provide a different one. When they don't, and instead stick to their original reasoning, you claim song and dance. It's a bullshit marketing tactic that you've been attempting to use time and time again.


Why?


Your belief regarding SVG is *FALSE*. There is one complete implementation, by Adobe. And guess what? They end-of-lifed support as of January, and will discontinue its distribution on their web site in a year. You cannot redistribute it now or ever. So for all intents and purposes, the single SVG run-time with full support for the spec is no longer something that can be used as a reliable source for rendering SVG on the client: unless the viewer is downloaded and installed by the time they pull the plug, there is no way to guarantee that the client has support, and no way to give it to them if they don't.


Opera has support for SVG Tiny. But Opera, keep in mind, is a browser. SVG is a format designed by the World Wide Web Consortium. The W3C isn't in the business of creating document standards for anything outside of the scope of the web. So it makes sense for a browser vendor to consider providing support for SVG. It doesn't make sense for an office tool vendor *UNLESS*, of course, like HTML/XHTML, the format is ubiquitous, or, in other words, there are LOTS of pre-existing SVG documents in existence that require they support it.


How many pre-existing SVG documents are there in the wild? HARDLY ANY!!! SVG is a nice idea, and it's even a decent spec. But no one other than a handful of people have cared enough to implement rendering support, and no one other than a handful of people have cared enough to create development tools that render SVG output.


The notion that "it's a standard, and so it MUST be used" is a complete crock. It might have been standardized by the W3C, but until there are a significant number of documents encoded with SVG, it DOES NOT MATTER. On the other hand, there are a significant number of documents with VML embedded into them, so providing support for VML makes sense from a *REAL WORLD* usage perspective, instead of "we think SVG is going to be a *BIG HIT*, and because it has been standardized, you must therefore use it as your vector graphics format."


Your "boo hoo" comment that SVG is difficult to implement is also a complete crock. No one that I am aware of has stated "SVG is too hard"; what they have said instead is "why did they do it like this? It doesn't make any sense. So we'll implement partial support for the portions that do." That is why SVG is in such a fragmented state in regards to implementations (both run-time and tools), and why *VERY FEW* documents have any level of SVG embedded into them.


Even after well over 3 years of development effort, Mozilla/Firefox *STILL* does not provide a full implementation of SVG. Why? Or better said, if SVG was the *WAVE OF THE FUTURE* don't you think that the folks at Mozilla would have put greater emphasis on its development? That said, SVG is a good XML vector graphics format. Does it have problems? Yes. Are there more attractive alternatives? Yes. In fact, if you want to place your bets on which vector graphic rendering engine Mozilla ultimately decides to move forward with, Flex/Flash would be the safest bet to make. Why? Well, why did Adobe dump their SVG player, built specifically to render a format they helped develop, and instead place their focus on Flex/Flash? Of course, they are now developing the MARS project which uses SVG, but MARS isn't going to make Adobe any money, so the focus is obviously going to be where they *WILL* make money > Flex/Flash, of which Mozilla is now the proud recipient and maintainer of the rendering engine.


So we have the one complete SVG engine that has been end-of-lifed, a bunch of half-finished implementations, the one with the greatest potential (Mozilla) has a much brighter/shinier lure in Flex/Flash, and SVG Tiny support from Opera (which makes sense given Opera seems to have placed their focus on the mobile market) as well as at least partial support for SVG proper (need to check again to find out for sure how complete it is -- it may be near to complete, if not complete, but again, need to verify). But less than a 1% browser market share, coupled with the fact that they are a browser and not an office document tool, doesn't exactly provide compelling justification when you state "well Opera provides support."


Why then did ODF go with SVG? Because it already existed, and they could simply push it into the spec and call it done. Will people actually implement that portion of the spec? Don't know, but OpenOffice.org doesn't even use it, and OO.o is supposedly the poster child for ODF.


With this point in mind -- Tell me again why EOOXML should have to use SVG when there isn't a single implementation of the ODF specification (that I am aware of) that uses it. Will they? Don't know, but they haven't as of yet...


Why? If SVG is so great and wonderful, and we all should have to use it as the standard for marking up vector graphics files, why then has OO.o not led the charge, and instead still uses the same implementation they've been using all along?


SVG is part of ODF because it could be cut and pasted into the ODF spec. That is the *ONLY* reason, and *IS NOT* a reason why the rest of the world should have to do the same thing.

M. David Peterson
2007-02-07 14:28:46
@Simon,


My data says otherwise. Can you please point me to the correct source that showcases this such that I can correct this using data that I can point to?


Thanks!

W^L+
2007-02-07 17:17:13
In fact, I would venture to state that the exact opposite is actually the case. Why? Because ODF is working from the standpoint in which there is no requirement that they provide support for the billions and billions of documents that exist on this planet that are encoded in MSFT Office formats. It would be nice to ignore these documents, but for what I hope are obvious reasons, we can't. From this standpoint, I would conclude that the only true inter-operable format is EOOXML, as it is the only format that takes into consideration all of the billions and billions of existing documents on this planet, providing direct hooks into bringing these same documents into the "modern age" so to speak.


That is what import filters are for.  You don't chain your future formats to the past formats.  Instead, you provide suitable import and export filters, so that documents in those legacy formats can be edited and if need be, re-saved in the original formats.  And seeing that there are documents right in the agency where I work that we cannot open any more [let me qualify that—OOo can open some of them, but Microsoft Office 2003 cannot], I already know that OOXML can not provide reliable ability to work with those documents either.  This is merely being used to prevent governments and other software buyers from insisting on real choice in applications.


For documents that are legally-required to be accessible, yet are not available with current versions of the original vendor's software, you suggest that we should somehow expect OOXML to make them available again?


I know about the plug-in.  I know it has had mixed reviews, and has a long list of unsupported portions of each format.  I also know that it makes it so obnoxious to try to use ODF that no user will use it regularly, no matter what an organization's policy says.  I also know that Sun announced its plug-in today.  It remains to be seen how well it works and whether it enables users and organizations to use ODF as their default save format, including registry and Group Policy methods.  With Novell planning to submit code based on the CleverAge plugin, to OOo, I expect that once again, only Microsoft will be fully-compatible with its format, this time OOXML as well as the "secret sauce" binary formats.  But then again, that is the status quo, so what's the difference?


I'm not sure I understand your point. Are you suggesting that because of your experience of working with OO.o > KSpread, you feel that the need to specify the formulas is of no great concern? If yes, I'm sorry, but I completely disagree. Specifying the formulas is crucial to ensure interoperability > I'm not talking about application-to-application interoperability, btw... We all know it's just as easy to open a MSFT Excel spreadsheet in OO.o as it is in MSFT Excel. What I am referring to specifically is the current situation in which each application that reports as being "ODF Compliant" must, in essence, guess the formula formats for *THE SAME* document type.


No, I am not suggesting that my personal experience makes it okay.  I agree that this is a hole that needs to be closed, and quickly.  OpenFormula is in the process of getting approved right now, so I am hopeful about that.  I even agree that they should have at least come up with a basic version before approval for ODF.  But I don't think anyone is saying ODF is perfect or complete, merely that it is much closer to what a standard should be than OOXML is and that Microsoft has had plenty of time to say, "let's work together to make this work for us".  They didn't do this, and we all know why.  As far as "backwards compatibility", if it can't be opened with the older versions of the software, it isn't really backwards compatible, is it?


I was already aware of the Google Docs *.sxw named as *.odt problem, which appears to have been corrected already.  I have not checked recently to be sure, other than round-tripping a couple of documents with KWord and AbiWord last month.


False. COM has *ALWAYS* been about interop between applications -- 99+% of the applications developed for Windows are developed by companies other than MSFT, and yet a significant majority of these applications can inter-operate with one another (drag/drop, copy/paste, plug/play, etc...) From the hardware side, the primary reason MSFT has been as successful as they have been is due to the fact that they have made every effort to get as many hardware vendors involved as they possibly could. How do they get them involved -- Mix-and-Match interoperability.


We aren't only talking about COM or OLE, or whatever this month's version is called.  How about the protocol used between Exchange and Outlook?  How about SMB/CIFS?  How about the extensions to Kerberos?  In each case, there is some "secret sauce" that is used to tie their applications together, but which is not fully available to outside vendors.  That is the integration that is talked about in the magazines that the PHBs read.


They do not seek competitive markets.  Court transcripts in both the federal and some state cases, including the current case in Iowa, show exactly how much trouble they go to trying to prevent any market from being competitive.  Reading their 10-Qs makes it pretty clear that either they completely dominate a market, or they remain very far behind the leaders.  In a normal company this size, there would be at least one market where they were one of a number of leading competitors. (XBox doesn't count, because it is still losing money and being subsidized by Office.)


As a side note, has O'Reilly ever considered using TinyMCE as the editor for these posts? It would make things a whole lot faster and easier.  And by the way, even though I hadn't heard of either you or Rick Jelliffe before this controversy burst out on the scene, I have to say that even when you disagree with your readers, you have been civil.  Much respect added there.

M. David Peterson
2007-02-07 17:44:57
@W^L+,


Thanks for the follow-up and extended information. To be honest, you have me stumped on a few things > meaning, I don't know enough about what you are referring to *yet* to be able to comment beyond "hmmm... I need to learn more about this before providing comment" ;-) So if you don't mind, I am going to spend a bit of time tonight doing some research to ensure that I can at least sound somewhat educated on the matters you have brought up.


Will follow-up with extended comment just as soon as I can.


re: TinyMCE -- I must admit, the tools we use are pretty horrid, huh?! If not mistaken, there is some serious effort being put forth at the moment behind the scenes into bringing together a stronger community-based front, which I believe includes providing better overall tools. Let me see what I can find out, and get back to you once I have some extended detail.


Oh, and regarding the Sun/ODF/Office plug-in. Agreed > http://www.oreillynet.com/xml/blog/2007/02/sun_to_ship_highly_optimised_b.html < pretty cool stuff, for sure!


Will follow-up again when I feel a bit more educated and able to do so.


Thanks for your follow-up, W^L+! It seems to me that communication, even when in the form of debate, is exactly what we need to be doing to best understand the various views and opinions we all have. Obviously we each have our opinions for a reason, and it's usually not because we are each complete idiots, but because we come from different backgrounds. Taking the time to provide further understanding and insight is something I *VERY* much appreciate, so for what it's worth, thanks! I learn a *TON* from folks such as yourself who are willing to share their knowledge and experience, and while I may not always agree (of course, that goes both ways, obviously) that certainly doesn't mean your time and effort is not appreciated. It most certainly is!

Simon Phipps
2007-02-08 00:24:31
To be honest I had never even considered the need to prove Sun to be the largest contributor to OO.o since most of the time it seems to be a commonplace to community members over there. But I'll point out that:
* Sun purchased StarDivision in 1999 and created OpenOffice.org by open sourcing its StarOffice product
* Ever since then Sun's staff in Hamburg have been by far the majority participants
* Novell are the number 2 contributor and do valuable work, with Michael Meeks being the most visible community participant
Most of the time I face criticism for the Hamburg folks not leaving enough room for others to work, so I am bemused by the request for proof. I'll dig around and see what I can find (beyond the findings of the European Commission report[1] which actually makes this pretty clear).


[1] http://ec.europa.eu/enterprise/ict/policy/doc/2006-11-20-flossimpact.pdf

mike
2007-02-08 03:28:38
@Segedunum


that's exactly how it seemed to me

M. David Peterson
2007-02-08 13:35:09
@Simon,


I wrote the comment before I wrote the follow-up and realized after writing the follow-up that the link really wasn't necessary, as who you are and what you represent is MORE than enough for folks to understand which is the correct data source to look to. In short, please don't stress over finding anything to "back-up" your claim. I think we have all the back-up that is necessary ;)

Segedunum
2007-02-08 13:53:11
Firstly, your attempt to continually suggest that no one is answering the specific questions is a lie.


I've never seen anyone tell me why the OOXML spec uses its own country codes, works around problems in Microsoft Office rather than coming up with a format that can be universally used, doesn't make any attempt to use things like SVG that others are implementing or specifies a set list of its own banners in what they're putting forward as an international standard.


When they don't, and instead stick to their original reasoning, you claim song and dance. It's a bullshit marketing tactic that you've been attempting to use time and time again.


The fact that you're upset over this is not my problem - you're not answering anything. You also betray your Microsoft oriented thinking, because I'm not marketing anything ;-).


Your belief regarding SVG is *FALSE*. There is one complete implementation by Adobe. And guess what? They end of lifed support as of January


Again, you're missing the point. It is something that is being implemented successfully by many vendors and open source projects TODAY - if not all of SVG. It's a big area to implement, but that doesn't mean that XAML and Microsoft's own answer to SVG isn't complicated either. Your reasoning, as Miguel's is, is a confused bunch of fumbling as to why Microsoft should have invented their own format.


Miguel even resorts to some strange argument at the end of his article over the line spacing that the ODF and OOXML use, and the fact that ODF references other standards documents rather than defining its own stuff in many areas as an explanation as to why OOXML is around 6000 pages long. The question is, why is OOXML defining its own stuff? That's the problem I have with a lot of the counter arguments people come up with.


How many pre-existing SVG documents are there in the wild? HARDLY ANY!!!


Again, another silly argument. How many XAML and OOXML using documents and applications are there? Very, very few, but according to you and Miguel we should all be paying attention to them. Go figure.


But no one other than a handful of people have cared enough to implement rendering support, and no one other than a handful of people have cared enough to create development tools that render SVG output.


Why shouldn't Microsoft have used SVG then? Errrrr, ummmmmm, errrrrrr, ummmmmmm, errrrrrrr, no one's using it. I hate to break it to you, but no one is using XAML either. It doesn't answer the question of why Microsoft felt they should go their own way rather than make an effort to fit into a standards based world they say they're in favour of.


The notion that "its a standard, and so it MUST be used" is a complete crock.


Bye, bye OOXML then ;-). Why bother with standards at all?


It might have been standardized by the W3C, but until there are a significant number of documents encoded with SVG, it DOES NOT MATTER.


No one is currently using XAML or even OOXML in any numbers, so what does that mean? The fact is though that SVG is being successfully implemented, albeit not all of it, in the open source world and elsewhere - as PDF has been over the years. There's no reason why that won't happen.


I believe you have tried desperately to dredge up PDF in the past and that's the difference - PDF has dozens of implementations in the open source world, printers and elsewhere by more than one organisation. Microsoft's XPS isn't used ;-). OOXML has nothing of the sort either, and the objections raise some serious questions as to whether that can ever be the case.


It's a very feeble reason for Microsoft to ignore it and fall back on NIH syndrome when they're supposed to be contributing an open format.


Your "boo hoo" comment that SVG is difficult to implement is also a complete crock. No one that I am aware of has stated "SVG is too hard"


You misunderstand. I was pointing out something rather feeble Miguel had come up with. So why did Microsoft feel the need to come up with their own equally complicated version then if it could have been done?


You're merely making my point for me.


which is why SVG is in such a fragmented state in regards to implementations (both run-time and tools)


It takes time and work to implement something like SVG, but it is being done. I also don't know what you mean by fragmentation in terms of tools - the acid test of a standard is if it can have independent but compatible implementations. You make it sound as if there should be one company or one application that should have the definitive SVG implementation ;-).


You've devoted a couple of paragraphs that can be summed up thus: "Microsoft doesn't use SVG not because it isn't promising or because we couldn't do it but because, errrrrrr, no one uses it". Given that few people currently use XAML and OOXML, we should all be ignoring those as well?


Don't know, but OpenOffice.org doesn't even use it, and OO.o is supposedly the poster child for ODF.


The fact is that it is in the ODF spec, and given that there are multiple implementations of SVG ongoing in the world in many environments and on different systems (such a standard takes time to really get rolling in terms of usage and quality) it was a sensible choice for ODF. The fact that Open Office hasn't got around to it yet means nothing because Open Office != ODF. In the open standards world one format doesn't have a one to one relationship with one application (MS Office and doc for instance).


I'm not saying that ODF doesn't need to be improved and added to or that any application like OOo has implemented all of it in the way that Microsoft Office has implemented all of OOXML (after all, they invented the format and it's based around Microsoft Office features ;-)).


The bogus argument many are coming up with is pointing fingers and picking at ODF and saying "Oh, it's incomplete and hasn't been fully implemented" simply because Microsoft have taken their existing binary Office format, dumped it into XML and a 6000 page document and said "Look, that's complete!". It's still no reason at all as to why Microsoft isn't or couldn't use ODF or get involved with contributing to it.


If SVG is so great and wonderful, and we all should have to use it as the standard for marking up vector graphics files, why then has OO.o not led the charge, and instead still use the same implementation they've been using all along?


Because it takes time and effort. Given that Microsoft are all about interoperability, apparently, then there's no reason why they couldn't have implemented it substantially in an effort to do just that, as a real commitment to show they were serious. They failed.


You're just answering questions with questions and not admitting what we can all see to be true.

Segedunum
2007-02-08 14:36:25
In fact, I would venture to state that the exact opposite is actually the case. Why? Because ODF is working from the standpoint in which there is no requirement that they provide support for the billions and billions of documents that exist on this planet that are encoded in MSFT Office formats.


Wow. So I can have a new and open format with OOXML and continue to open these documents in Office 97 without any additional plugins or filters because of Microsoft's generous commitment to backwards compatibility? Errrrrrrrrr, no. I can't. In my book, that isn't backwards compatibility.


This argument for OOXML is so stupid and incorrect it isn't even funny. It is a completely new and incompatible format that has specific Microsoft Office features imported into it in a new and incompatible way. It's a break with the past. There's no reason why Microsoft couldn't have used ODF and added any Office or Windows specific parts in an application specific way. There is scope to do this with ODF, but again, no one has got to the heart of the matter and explained why this couldn't be done. It's all smoke, mirrors, dancing and skirting.


We all know it's just as easy to open a MSFT Excel spreadsheet in OO.o as it is in MSFT Excel.


Not exactly, no. The answer, as always, is "it depends" because XLS is not an open format. One question I would ask though is, if your Excel spreadsheet with your calcs opens in OOo, is that thanks to Microsoft's commitment to interoperability or someone's ability to reverse engineer what's going on? (If you detect some sarcasm there, you're right).


EOOXML provides for backwards compatibility with the billions of documents on this planet encoded in previous Office formats


It provides no backwards compatibility in any way shape or form in any meaningful way. Providing backwards compatibility with previous Office documents is such a daft statement because it's thoroughly meaningless. A new format does not have backwards compatibility with an older format.


This argument for OOXML really is so untrue and bogus it's unreal. It's one of the worst forms of FUD thrown around (and I do find that funny), because it's black and white. I cannot open an OOXML document in previous versions of Office without some form of plugin or update - it's that simple. Ergo, it isn't backwards compatible, ergo the notion that OOXML had to be created for backwards compatibility is simply a lie.


As mentioned above, you achieve compatibility through adequate filters for your applications in order to convert older documents to the new format from then on. Building backwards compatibility into a new format is a silly idea, because you can't (errr, because it's new!), and for OOXML the hilarious part is that it isn't, and can't be, true anyway!

M. David Peterson
2007-02-08 21:25:45
@Segedunum,


You've just done the same thing over again. Ignored the reasoning, acting as if the question wasn't, in fact, answered, and then going on your own little mini-rant about how "we feel this way" as if you are a representative of the rest of the world.


So instead of me spending my time writing something you are going to ignore, to then claim a song and dance answer, instead I am going to send you on a little crusade of your very own (oh, how exciting!)


You say: "It is something that is being implemented successfully by many vendors and open source projects TODAY"


If you're so confident this is the case...


Show me.

Segedunum
2007-02-09 04:50:25
You've just done the same thing over again. Ignored the reasoning, acting as if the question wasn't, in fact, answered, and then going on your own little mini-rant about how "we feel this way" as if you are a representative of the rest of the world.


Yadda, yadda, yadda I'm afraid. I haven't ignored any reasoning, simply because you haven't made any.


I've quoted other comments that have been made by various people and argued my points and my point of view, and you can't get more reasoned than that. You, on the other hand, haven't. You haven't answered any of the legitimate queries raised in what I've been saying (and what others have said) at all - because you simply can't:



  • Why does OOXML use its own country codes?
  • Why does it have a set list of its own banners in what is supposed to be an international format?
  • Why does it use Windows Metafiles in what is a format, supposedly, for interoperability?
  • Why does it take 6000 pages to specify its own poor internal implementations of things like hashing algorithms, when other standards are ready and available?
  • Why does Microsoft make the false claim that OOXML was created for backwards compatibility when it isn't backwards compatible with any previous Microsoft Office versions other than 2007 and can't be compatible with the previous formats? Previous formats are a problem for the application, not the standard.


That's just a selection. Unless you can answer those in a comprehensive way, as people like Miguel and others around here haven't, then you're simply tapping away at your keyboard arguing - nothing. Which of course, is the whole point of OOXML anyway.


So instead of me spending my time writing something you are going to ignore


You know, in my comments above I do believe that those are quotations from your comments, and others, which I have then given answers and arguments to. You can't claim that I've ignored your comments or anyone else's.


to then claim a song and dance answer, instead I am going to send you on a little crusade of your very own (oh, how exciting!)


I refer you to the points I made above. The above is not worthy of any more time.


You say: "It is something that is being implemented successfully by many vendors and open source projects TODAY"


If you're so confident this is the case...


Is that it? The fact is that SVG and various other ISO and W3C standards are being implemented by a great many open source projects and others today. I'm not going to go off and list absolutely every one of them because that isn't the point of my argument and I'll get off my beaten track.


Your argument was that Microsoft didn't want to use SVG, and presumably other reasonable standards that exist and are used today such as MathML and XForms, simply because you argue that no one uses them. Apparently, it doesn't matter that SVG and others could have been used without any trouble by Microsoft, and there isn't anything technically wrong with them. The fact that they can be dismissed as not being used, as can many of Microsoft's NIH (Not Invented Here) inventions, is apparently enough.


Again, if that is the argument against using SVG and other various existing standards that Microsoft could implement with little trouble, why did Microsoft come up with XAML when very few people are currently using it or XPS, which absolutely no one uses? Your argument, such as it is, goes out with not so much as a whimper there.


Microsoft are trying to get OOXML ratified as an ISO standard, and they're trying to plant their flag in the world of interoperability. They can't fake it. If a proposed ISO standard doesn't have respect and nod towards other existing ISO, W3C or IETF standards then it simply isn't of use to anyone.

M. David Peterson
2007-02-09 19:54:16
@Segedunum,


I have answered these questions in various forms throughout my follow-up responses. Of the list of questions you have presented, only one of them I don't have an answer for, but let me quickly run through this list,


* Why does OOXML use its own country codes?


Don't know. Does using their own country codes stand in the way of anyone being able to implement the spec? Doubtful. But regardless, this is not for me or for you to decide. It's for the ISO voting members to decide. So let them.


* Why does it have a set list of its own banners in what is supposed to be an international format?


Because it does. Who cares? Why does it matter so much to you? Of course, one could easily argue that because of the vast number of documents in the wild encoded in previous Office formats, to ensure 1-to-1 fidelity, placing them in the spec is required. As I have already stated, it would be nice if we could just ignore all of the preexisting documents, but we can't. Of course, there are those who suggest that what should be done is that an extension to ODF should be developed to accommodate any potential loss of fidelity. But why, when it already exists? In essence, what is being suggested is that MSFT should re-invent what already exists. Of course, the same argument AGAINST re-inventing what already exists is being used in regards to why MSFT should use SVG instead of VML. Oh, and by the way, you keep pounding on the "yeah, but XAML and XPS don't have any documents either" which has nothing to do with EOOXML. EOOXML uses VML, which as I have already pointed out, does exist in the wild in great capacity. XAML and XPS are completely separate -- there is no direct connection. They are not listed in the specification. So stop attempting to suggest that they are. They're not!


* Why does it use Windows Metafiles in what is a format, supposedly, for interoperability?


Miguel answered this question. The Windows Metafile format is publicly documented. He argued it should be added to the spec so that the t's are crossed and the i's dotted, but the fact of the matter is that it is information that is publicly available to anyone. His argument also included the point that the suggested alternative was to use a specification from the late 90's, ignoring all of the benefits gained by using the publicly documented Windows Metafile format, and that it makes little sense to require that an inferior file format be used just because it exists, rather than because it's the better of the two choices.


* Why does it take 6000 pages to specify its own poor internal implementations of things like hashing algorithms, when other standards are ready and available?


Firstly, stop making claims that MSFT has developed technically inferior technologies. Statements such as this have absolutely no connection to reality. They are your opinions, and you provide no facts to back up why you feel this way. You seem to think that because you say they are "poor" that this is all that is needed to prove your point.


Secondly, there *IS NO REQUIREMENT* that an ISO standard *MUST* use existing standards. Rick Jelliffe has brought this out on several occasions, and given the fact that Rick has extensive experience with the ISO standardization process, and is himself an ISO insider, suggests that he does know a thing or two about how this all works.


What experience do you have with the ISO standardization process?


* Why does Microsoft make the false claim that OOXML was created for backwards compatibility when it isn't backwards compatible with any previous Microsoft Office versions other than 2007 and can't be compatible with the previous formats? Previous formats are a problem for the application, not the standard.


Once again, you make a statement such as "false claim" without any evidence that it is in fact, a false claim.


You have already attempted to gloss over the request to show me the proof, rather than just tell me this is the way it is, without *ANY* evidence to support your claims.


Is that it? The fact is that SVG and various other ISO and W3C standards are being implemented by a great many open source projects and others today. I'm not going to go off and list absolutely every one of them because that isn't the point of my argument and I'll get off my beaten track.


Segedunum > It *IS* the point of the argument. You claim that they exist, and I claim that they don't. If they did exist, and the SVG specification was as flourishing as you suggest, then your point would have at least some claim of validity. At the moment, you seem to be of the belief that "it is because I say it is" without any willingness to prove your claims, suggesting that your statement in and of itself is all that is needed.


Why are you unwilling to provide proof? Because it doesn't exist? My point, and the reason for the request, is that I believe they don't exist, and your claim that they do is a complete fabrication. Segedunum, it is the point, regardless of whether you are willing to accept it or not.


M. David Peterson
2007-02-09 20:30:57
@Segedunum,


One thing I should point out. I have no doubt you are one of the nicest people on the planet, and that your intentions are legit. I hate being in these types of arguments, because it forces out of me the "punk a$$ hacker with an attitude" side of me that can be okay at times, but at other times it's not. Ultimately this isn't for you, me, or anyone else who isn't a voting member of ISO to decide. I do believe these issues are important to discuss, but you also reach the point in a discussion where you have far surpassed the benefit of point/counterpoint arguments, and have moved well into the area where there is simply nothing of value coming out of what's being discussed. Ultimately, this isn't our decision to make. It's in the hands of the ISO committee members at this stage. No doubt, if they have concerns, they will be raised. In fact, I believe that's exactly what has already taken place, and MSFT is now in a state where they must respond to these concerns. You and I can hash this out until we both pass out from exhaustion, and it isn't going to change anything at this point.


I've answered your questions to the best of my ability, and without significant effort on my part, that ability isn't going to suddenly increase. With this in mind as well as what I've outlined above, why don't we let MSFT and the ISO committee members do their jobs and then see what the result is when the votes are in.


Again, I do believe these discussions are important, but so is the time you and I both need to work on other things of importance. So why don't we do that, and when the results are in, we can take things from there.

M. David Peterson
2007-02-11 21:22:11
@mike,


My apologies for the late follow-up response! Have a few extra items on my plate at the moment that need priority attention which is the reason for the delayed response.


re: "your "not a lawyer stay out of the courtroom" answer doesn't help me assess whether it is reasonable for Ecma 376 to break ISO 8601 so that it can accommodate a known Excel bug"


This is actually (ironically, given that the Lotus brand is owned by IBM, and it's IBM who is actively attempting to keep EOOXML from becoming an ISO standard) a Lotus 1-2-3 bug that was purposely perpetuated for compatibility reasons. From the MSFT Knowledge Base article @ http://support.microsoft.com/kb/214326


When Lotus 1-2-3 was first released, the program assumed that the year 1900 was a leap year, even though it actually was not a leap year. This made it easier for the program to handle leap years and caused no harm to almost all date calculations in Lotus 1-2-3.


When Microsoft Multiplan and Microsoft Excel were released, they also assumed that 1900 was a leap year. This assumption allowed Microsoft Multiplan and Microsoft Excel to use the same serial date system used by Lotus 1-2-3 and provide greater compatibility with Lotus 1-2-3. Treating 1900 as a leap year also made it easier for users to move worksheets from one program to the other.


It then continues with,


Although it is technically possible to correct this behavior so that current versions of Microsoft Excel do not assume that 1900 is a leap year, the disadvantages of doing so outweigh the advantages.


If this behavior were to be corrected, many problems would arise, including the following:
• Almost all dates in current Microsoft Excel worksheets and other documents would be decreased by one day. Correcting this shift would take considerable time and effort, especially in formulas that use dates.
• Some functions, such as the WEEKDAY function, would return different values; this might cause formulas in worksheets to work incorrectly.
• Correcting this behavior would break serial date compatibility between Microsoft Excel and other programs that use dates.
If the behavior remains uncorrected, only one problem occurs:
• The WEEKDAY function returns incorrect values for dates before March 1, 1900. Because most users do not use dates before March 1, 1900, this problem is rare.
NOTE: Microsoft Excel correctly handles all other leap years, including century years that are not leap years (for example, 2100). Only the year 1900 is incorrectly handled.


Of the above, the one that is most significant/relevant to the argument as to why this is important to EOOXML from a cross-application perspective is,


• Correcting this behavior would break serial date compatibility between Microsoft Excel and other programs that use dates.


So what this ultimately comes down to is that because there is a need to ensure 1-to-1 fidelity between documents (both MSFT and non-MSFT formats), the bug (unfortunately) must be propagated. As outlined above, more problems would be incurred by fixing the bug than would be if they just leave it as is.
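
For anyone who wants to see what this means in practice for an implementer, here is a minimal sketch in Python of reading and writing serial dates in a way that stays compatible with the behavior the KB article describes. The function names are hypothetical (they are not from any spec or library), and the sketch assumes the 1900 date system, where serial 1 is January 1, 1900 and serial 60 is the nonexistent February 29, 1900.

```python
from datetime import date, timedelta
from typing import Optional

# Assumption: 1900 date system, serial 1 == 1900-01-01.
# Serial 60 is the "phantom" 1900-02-29 inherited from Lotus 1-2-3.
PHANTOM_SERIAL = 60
FIRST_REAL_DATE_AFTER_PHANTOM = date(1900, 3, 1)


def serial_to_date(serial: int) -> Optional[date]:
    """Convert a 1900-system serial number to a real calendar date.

    Returns None for serial 60, which has no real calendar equivalent.
    (Hypothetical helper, for illustration only.)
    """
    if serial < 1:
        raise ValueError("serial numbers start at 1 in the 1900 date system")
    if serial == PHANTOM_SERIAL:
        return None
    # Serials before the phantom day are off by one relative to later ones,
    # so the two ranges need different epochs.
    epoch = date(1899, 12, 31) if serial < PHANTOM_SERIAL else date(1899, 12, 30)
    return epoch + timedelta(days=serial)


def date_to_serial(d: date) -> int:
    """Convert a real calendar date to a 1900-system serial number."""
    if d < date(1900, 1, 1):
        raise ValueError("dates before 1900-01-01 have no serial number")
    epoch = date(1899, 12, 31) if d < FIRST_REAL_DATE_AFTER_PHANTOM else date(1899, 12, 30)
    return (d - epoch).days


if __name__ == "__main__":
    for s in (1, 59, 60, 61):
        print(s, serial_to_date(s))
```

The point of the two epochs is that every serial from 61 onward is one higher than it would be had 1900 been treated correctly, which is why "fixing" the bug would shift almost every stored date by a day.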


Hope this helps clarify things a bit!