Google's Genius

Sam Ruby

Apr. 30, 2002 04:47 PM


If you listen to the tale Paul Prescod tells, Google had in its possession a vastly superior API which it chose to forsake, relegating it to a lifetime locked away in an ivory tower, waiting for a prince to rescue it.  Meanwhile, a much hyped evil and ugly stepsister has been paraded around town.  At this point, we have all the essential elements of a fairy tale.

A fairy tale indeed.

I have seen no evidence that Google has behaved like an evil stepmother.  To the contrary, the one thing that has consistently shown through is that Google has taken a low key, pragmatic, and essentially hype-free approach to all things technical.  This instance has been no exception.

What was the seminal event that sparked countless people's imagination, inspired more than two dozen implementations, and led 10,000 developers to sign up in the first week alone?  No, this isn't the result of coercion on the part of Google, or substantial financial incentives to the participants, or even a sustained marketing campaign.  Instead, it is the result of a relatively quiet post to a mailing list for an excellent but otherwise relatively obscure (in the US at least) programming language.

Google's Genius?  To pick a wire format for which there are dozens of toolkits poised to directly translate the protocol into readily consumable bits.  To directly test interop against a small but diverse set of platforms.  To provide early access to an undisclosed number of other interested parties.  To provide a sample that runs on a wide range of operating systems and instruction set architectures.  To document the wire protocol adequately, including all the optional type annotations.  And to provide sufficient metadata, in the form of WSDL, so that a large number of developers can be instantly up and running.

In short, they did their homework.

In return, Google was likened to the wizard Saruman, a benign and powerful force inexplicably turned from the path of virtue.

HTTP GET 

At about the same time as Paul's article, Simon St. Laurent posted a series of articles that suggest that SOAP is unclean and unRESTful.  The key difference between the Google approach and the Amazon approach which he apparently likes better?  The use of HTTP GET.  This is also a central theme of Paul's writings.

I've taken some time the last few days to read up on the topic of REST.  This term was coined by Apache Software Foundation's Chairman Roy T. Fielding in his Ph.D. dissertation.  Suffice it to say that this paper has been both very influential and deeply misunderstood and misrepresented.  Here are three related quotes from Roy Fielding himself:

In fact, there are a number of tradeoffs between GET and POST.  Roy mentions size.  Safety is another consideration.  I'll add a third: security.  I understand and appreciate that Amazon and Google have only chosen to employ a rather lightweight security mechanism at the present time for their free services.  Placing an associate ID or key in the payload is like placing a key under the doormat.  Given the way URLs are tracked and cached, placing it in the URL is like taping it to the door.  As I have stated before, if and when Google ever decides to commercialize this particular service, I'd like to suggest that they consider alternatives such as X.509 certificates, Kerberos tickets, or security tokens from mobile devices.
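To make the doormat-versus-door contrast concrete, here is a minimal sketch in Python.  The endpoint, the parameter names, and the key value are all hypothetical, and nothing is sent over the network; the code only builds the two requests so that one can see where the key ends up.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint, parameter names, and key; for illustration only.
ENDPOINT = "http://api.example.com/search"
PARAMS = {"key": "EXAMPLE-KEY", "q": "fairy tale"}


def build_get(endpoint, params):
    """GET: every parameter, including the key, becomes part of the URL."""
    return Request(endpoint + "?" + urlencode(params), method="GET")


def build_post(endpoint, params):
    """POST: the parameters travel in the request body; the URL stays bare."""
    body = urlencode(params).encode("ascii")
    return Request(endpoint, data=body, method="POST")


get_req = build_get(ENDPOINT, PARAMS)
post_req = build_post(ENDPOINT, PARAMS)

# The URL is what tends to be written to access logs, proxy caches,
# and browser histories.
print("GET  URL :", get_req.full_url)
print("POST URL :", post_req.full_url)
print("POST body:", post_req.data)
```

Neither form protects the key from anyone who can read the traffic itself; the difference is only in how widely the URL, as opposed to the body, tends to be recorded along the way.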

There are other tradeoffs as well.  Paul points out that using HTTP GET enables one to participate in XInclude.  This is a valid consideration for static queries.  For more ad hoc queries, a facility like the IO JSP Taglib or the SOAP Cocoon Taglib may be more appropriate.

One thing I like about Roy is that he is rather direct.  His opinion on most CGI programs is rather clear and succinct: Most CGI scripts, in fact, provide interfaces to applications that suck.  Take a CGI implemented using HTTP POST and convert it to using HTTP GET and Roy's opinion will not change.  Take this same design and convert it to SOAP, and Roy's reaction is again very predictable.

So the question as to whether or not a given interface meets the criteria of REST does not rely on the protocol syntax.  It relies on the nature of the interaction, and in particular how state is represented and transferred.  As a general rule, pure query interfaces with no side effects meet these criteria.  Even if they use HTTP POST.

GoogleML 

Moving beyond HTTP GET, we look for other areas of disagreement.  As Paul has made it clear, he is *PRO-WSDL*.  It apparently is also not the SOAP encoding that Paul disagrees with, as Paul states "My opposition is to the SOAP-RPC protocol, not the SOAP encoding."  He mentioned SOAPAction in passing, something that I have verified is not significant in this service.  Furthermore, SOAPAction promises to become optional in upcoming versions of the SOAP specification.  Paul mentions optional arguments, something that has long been a part of the SOAP specification.  Both Apache Axis and Microsoft ASP.Net support optional parameters.

Much of the simplification in Paul's examples comes from omitting type specifications.  I agree that including types in this message is entirely unnecessary.  I have verified that the Google API in no way requires such annotation.  It is hard to say whether "most" SOAP toolkits will inline the types into the message - the Apache ones currently default to sending such information on the theory that it is readily available and might be useful.  In Axis, this can be easily overridden.  The default for Microsoft's ASP.Net is to not send such information.  And the comment that "I could just as easily have left them in" leads me to believe that this isn't a crucial issue either.
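As a rough illustration of what omitting the type annotations amounts to, the Python sketch below builds the same request twice, once with xsi:type attributes and once without, and reads the parameters back with a stock XML parser.  The namespace URIs, the parameter names (key, phrase), and the helper names are assumptions made for this example; they are not copied from Google's WSDL, and the receiver here is only a stand-in for a service that already knows the operation's signature.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

# Assumed message shape for illustration; not taken from Google's WSDL.
TEMPLATE = (
    '<SOAP-ENV:Envelope xmlns:SOAP-ENV="' + SOAP_ENV + '"'
    ' xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"'
    ' xmlns:xsd="http://www.w3.org/2001/XMLSchema">'
    "<SOAP-ENV:Body>"
    '<ns1:doSpellingSuggestion xmlns:ns1="urn:GoogleSearch">'
    "<key{t}>{key}</key>"
    "<phrase{t}>{phrase}</phrase>"
    "</ns1:doSpellingSuggestion>"
    "</SOAP-ENV:Body>"
    "</SOAP-ENV:Envelope>"
)


def build_message(key, phrase, with_types):
    """Build the request, optionally annotating each parameter's type."""
    annotation = ' xsi:type="xsd:string"' if with_types else ""
    return TEMPLATE.format(t=annotation, key=escape(key), phrase=escape(phrase))


def extract_params(message):
    """Read parameters by name, as a receiver that knows the signature could."""
    body = ET.fromstring(message).find("{%s}Body" % SOAP_ENV)
    call = body[0]  # the single operation element inside the Body
    return {child.tag: child.text for child in call}


typed = build_message("EXAMPLE-KEY", "britney speers", with_types=True)
untyped = build_message("EXAMPLE-KEY", "britney speers", with_types=False)

print(len(typed), len(untyped))
print(extract_params(typed) == extract_params(untyped))
```

The annotations add bytes but carry nothing that a receiver with the service description in hand does not already know, which is why dropping them changes the size of the message and not its meaning.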

What's left?  I guess there is the envelope.  We certainly could discuss this, but somehow this issue does not quite seem to rise to the level of Paul's call to arms for "like-minded Hobbits, Dwarves, Elves and men and go on a quest to educate the world about the limitations of SOAP-RPC interfaces".

Perhaps the most illuminating part of Paul's essay is when he describes his optimized doSpellingSuggestion API.  In this case, he declares that XML is overkill for the job.  Unquestionably, omitting XML in some cases creates a tighter data stream.  It can also require custom marshallers and parsers to be written.  More tradeoffs to consider.
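A small Python sketch of that tradeoff follows.  Both reply formats are invented for this example (the suggestion element name and the text/plain conventions are assumptions, not Paul's or Google's actual formats).  The XML reply is handed to a stock parser, while the plain-text reply needs a hand-written reader that decides for itself how to find the character set and how to represent "no suggestion".

```python
import xml.etree.ElementTree as ET


def parse_xml_reply(payload):
    """XML reply: a stock parser deals with encoding and escaping."""
    root = ET.fromstring(payload)
    return root.text or None


def parse_text_reply(payload, content_type):
    """Plain-text reply: charset and empty-reply rules are our own conventions."""
    charset = "utf-8"  # assumed default when the header names none
    for part in content_type.split(";")[1:]:
        name, _, value = part.strip().partition("=")
        if name.lower() == "charset":
            charset = value.strip('"')
    text = payload.decode(charset).strip()
    return text or None  # convention chosen here: empty body means no suggestion


xml_payload = b"<suggestion>britney spears</suggestion>"
text_payload = b"britney spears\n"

print(parse_xml_reply(xml_payload), len(xml_payload))
print(parse_text_reply(text_payload, "text/plain; charset=utf-8"), len(text_payload))
```

The text payload is indeed smaller, and the reader for it is not long, but each convention in it is one more thing both ends have to agree on out of band.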

An Analogy 

It is impossible to escape the fact that there is much active hostility directed at the SOAP protocol from within the REST community.  I've been giving this some deep thought lately, and I finally came up with an analogy that might explain this situation.

Few Object Oriented Programming advocates would list Perl among their top choices in a programming language.  No one will deny that it is possible to write OO code in Perl.  In fact, there clearly are features in the language designed to support objects.  But does Perl require you to write in an OO style?  Well, no.  Does it even guide you in that direction?  Again, no, not particularly.  In fact, the Perl motto is TMTOWTDI.

But it goes deeper than that.  Few, if any, of the beginner's samples on how to use Perl start from an object orientation.  This leads many towards programming practices which some find inappropriate.  One might also note that a significant fraction of the CGI programs that Roy declares as sucky are, in fact, written in Perl.

One could make a similar case against SOAP.

Conclusion 

I will readily agree that the architecture, analysis and design of any complex distributed system need to focus on the concept of state in general, and on its representation and transfer in particular.  Once that work is complete, there remain a large number of implementation tradeoffs that need to be made.  Some of these deal with ease of use and the rate of adoption.  Adopting a canonical means to represent such information may have a positive influence on such important secondary characteristics of one's implementation.

Sam Ruby is a prominent software designer who has made significant contributions to many of the Apache Software Foundation's open source software projects and to the standardization of web feeds via his involvement with the Atom web feed standard and the popular feedvalidator.org web service. He currently holds a Senior Technical Staff Member position in the Emerging Technologies Group of IBM.