Barriers to a protocol indicating what file formats are supported

by Andy Oram

A correspondent named Tim Almond just wrote me to point out a modern
barrier to communications:

How can person A know that person B uses OpenDocument? Everyone
assumes that people can read .doc and .pdf, but no-one assumes that
people can support .odt.

This is an interesting challenge. Computer systems don't know how to
ask other systems what utilities they use or what formats they
support. That's why, for instance, when you visit a site that offers
audio or video files, it makes you choose the format you want. Your
browser can tell the server what browser and operating system you're
using, and the server can tell your browser what encoding it's using
to send data, but the browser and server don't figure out between the
two of them what format to use, when multiple formats are available.

Now, programming libraries such as the X Window System and CORBA
contain mechanisms whereby one side in an exchange can inquire what
extensions are used by the other side. And many networking protocols,
such as SSH, negotiate a lot of their parameters at the start, such as
what kind of encryption to use. But no such system is in use for
negotiating file formats.
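
To make the SSH comparison concrete: when an SSH connection starts, each
side sends lists of algorithm names in order of preference, and for
encryption the first name on the client's list that also appears on the
server's list is chosen. The Python sketch below applies that same rule to
file formats. It is only an illustration of the idea; the function name
negotiate_format and the format lists are hypothetical, and no existing
mail or office software is assumed to work this way.

```python
from typing import Optional, Sequence


def negotiate_format(sender_prefs: Sequence[str],
                     receiver_supported: Sequence[str]) -> Optional[str]:
    """Return the first format in the sender's preference list that the
    receiver also supports, or None if the two lists share nothing.

    This mirrors the "first match on the initiator's list" rule that SSH
    uses for encryption algorithms; applying it to file formats is an
    assumption made for illustration.
    """
    supported = {fmt.lower() for fmt in receiver_supported}
    for fmt in sender_prefs:
        if fmt.lower() in supported:
            return fmt.lower()
    return None


if __name__ == "__main__":
    # The sender would rather use OpenDocument; the receiver reads only
    # PDF and Word files, so the sender's second choice is selected.
    print(negotiate_format(["odt", "doc", "pdf"], ["pdf", "doc"]))
```

With these lists the sender's second choice, doc, is selected, because the
receiver never advertised odt. If the two lists have nothing in common the
function returns None, and the humans have to sort it out.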


The Content Negotiation section of the HTTP/1.1 protocol specification
contains some interesting reasoning as to why it would be problematic to
let computer
systems negotiate content. If driven by the server, any such system
would have to make assumptions about user preferences, or force the
user to describe her system in great detail (a potential privacy
violation). A browser-driven system would avoid some of these problems
but would add complexity and inefficiency.
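
For readers who want to see what the server-driven approach looks like in
practice, here is a rough Python sketch. In HTTP/1.1 a client may send an
Accept header listing media types, each with an optional q value between 0
and 1 that ranks its preference, and the server picks among the
representations it has. The function names parse_accept and choose_type
are hypothetical, and the parsing is deliberately simplified: it ignores
media type parameters other than q and does not rank specific types above
wildcards when their q values tie.

```python
from typing import Dict, Optional, Sequence


def parse_accept(header: str) -> Dict[str, float]:
    """Parse an Accept header into {media_type: q}; q defaults to 1.0."""
    prefs: Dict[str, float] = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        prefs[media_type] = q
    return prefs


def choose_type(header: str, available: Sequence[str]) -> Optional[str]:
    """Pick the available media type with the highest q, or None."""
    prefs = parse_accept(header)
    best, best_q = None, 0.0
    for media_type in available:
        key = media_type.lower()
        q = prefs.get(key)
        if q is None:
            # Fall back to a "type/*" range, then to "*/*".
            q = prefs.get(key.split("/")[0] + "/*")
        if q is None:
            q = prefs.get("*/*", 0.0)
        if q > best_q:
            best, best_q = media_type, q
    return best


if __name__ == "__main__":
    accept = ("application/vnd.oasis.opendocument.text, "
              "application/pdf;q=0.8, application/msword;q=0.5")
    # The server has no OpenDocument copy, so PDF (q=0.8) wins over Word.
    print(choose_type(accept, ["application/msword", "application/pdf"]))
```

Note that the sketch also shows the drawback the specification raises: the
server can choose well only if the client is willing to describe its
preferences in each request.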

Two more problems get in the way of creating a system that
automatically delivers documents in the format people want:

  • The choice of format tends to take place when a document is created.
    Unless all utilities support all formats (something that Microsoft
    shows no interest in, and which would be unfeasible for any utility
    when taken to an extreme) the person sitting down to create a
    spreadsheet or presentation has to make assumptions about what all the
    correspondents on the other end want.

  • Most documents are still delivered through the clunky, non-interactive
    medium of email. There is no way to negotiate parameters for

So we're going to have to continue doing what I do with my authors
every time I start working on a new book--ask what formats each side
can accept and do the negotiation on a human level. Still, it would be
interesting to speculate about systems that would be more flexible
than email and where you could set preferences before retrieving a
document from a colleague.


2005-12-29 17:24:57
Some possible solutions

I've had some thoughts about this. My current favourite idea would be to have Writer put something in the document data of a .doc that would mark the document as being created by Writer.

Then, if someone opened that .doc in Writer, the program could check this data, tell the user that the document was probably created in Writer, and suggest asking whether the sender uses Writer too.