Google's Genius

by Sam Ruby

If you listen to the tale Paul Prescod tells, Google had in its possession
a vastly superior API which it chose to forsake, relegating it to a lifetime
locked away in an ivory tower, waiting for a prince to rescue it.  Meanwhile,
a much-hyped, evil and ugly stepsister has been paraded around town.  At this
point, we have all the essential elements of a fairy tale.


A fairy tale indeed.


I have seen no evidence that Google has behaved like an evil
stepmother.  To the contrary, the one thing that has consistently
shone through is that Google has taken a low-key, pragmatic, and
essentially hype-free approach to all things technical.  This instance has been no
exception.


What was the seminal event that sparked countless people's imagination,
inspired more than two dozen implementations, and led 10,000 developers to
sign up in the first week alone?  No, this isn't the result of coercion on the
part of Google, or substantial financial incentives to the participants, or even a sustained marketing campaign.
Instead, it is the result of a relatively quiet post
to a mailing list for an excellent but otherwise relatively obscure (in the
US at least) programming language.


Google's Genius?  To pick a wire format for which there are dozens
of toolkits poised to directly translate the protocol into readily
consumable bits.  To directly test
interop against a small but diverse set of platforms.  To
provide early access to an undisclosed number of other interested
parties.  To provide a sample that runs on a wide range of operating
systems and instruction set architectures. To document the wire protocol
adequately, including all the optional type annotations.  And to
provide sufficient metadata, in the form of WSDL, so that a large number of
developers can be instantly up and running.


In short, they did their homework.

In return, Google was likened to the wizard Saruman, a benign and
powerful force inexplicably turned from the path of virtue.


HTTP GET 

At about the same time as Paul's article, Simon St. Laurent posted a
series of articles that suggest that SOAP is unclean and unRESTful.
The key difference between the Google approach and the Amazon
approach, which he apparently likes better?  The use of HTTP GET.
This is also a central theme of Paul's writings.


I've taken some time the last few days to read up on the topic of
REST.  This term was coined by Apache Software Foundation's Chairman
Roy T. Fielding in his Ph.D. dissertation.  Suffice it to say that this
paper has been both
very influential and deeply misunderstood and misrepresented.  Here
are three related quotes from Roy Fielding himself:



In fact, there are a number of tradeoffs between GET and POST.  Roy
mentions size.  Safety is another consideration.  I'll add a third: security.  I
understand and appreciate that Amazon and Google have only chosen to employ
a rather lightweight security mechanism at the present time for their free
services.  Placing an associate ID or key in the payload is like placing a key under the
doormat.  Given the way URLs are tracked and cached, placing it in the URL is like taping it to the door.  As I have stated
before, if and when Google ever decides to commercialize this particular service, I'd like to
suggest that they consider alternatives such as X.509 certificates,
Kerberos tickets, or security tokens from mobile devices.
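
To make the doormat-versus-door distinction concrete, here is a minimal Python sketch.  The endpoint, the parameter names, and the key value are all hypothetical and are not taken from the Google or Amazon services, and no request is actually sent.  It only shows where the key ends up in each style of request.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical values for illustration only; not a real service or key.
ENDPOINT = "http://api.example.com/search"
API_KEY = "not-a-real-key"
params = {"key": API_KEY, "q": "rest"}

# GET: the key travels in the URL, which proxies, server logs, and
# browser histories routinely record.
get_request = Request(ENDPOINT + "?" + urlencode(params))

# POST: the key travels in the request body, which is recorded less
# often but is still sent in the clear over plain HTTP.
post_request = Request(ENDPOINT, data=urlencode(params).encode("utf-8"))


def key_in_url(request):
    """Return True if the key is visible in the request's URL."""
    return API_KEY in request.full_url


if __name__ == "__main__":
    for request in (get_request, post_request):
        print(request.get_method(), "key in URL:", key_in_url(request))
```

Neither placement is real protection; moving the key from the URL to the body only changes how many places happen to record it.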


There are other tradeoffs as well.  Paul points out that using HTTP
GET enables one to participate in XInclude.  This is a valid
consideration for static queries.  For more ad hoc queries, a facility
like the IO JSP Taglib or the SOAP Cocoon Taglib may be more appropriate.


One thing I like about Roy is that he is rather direct.  His
opinion on most CGI programs is rather clear and succinct: "Most
CGI scripts, in fact, provide interfaces to applications that suck."
Take a CGI implemented using HTTP POST and convert it to using HTTP GET and
Roy's opinion will not change.  Take this same design and convert it to SOAP, and
Roy's reaction is again very predictable.


So the question as to whether or not a given interface meets the
criteria of REST does not rely on the protocol syntax.  It relies on
the nature of the interaction, and in particular how state is represented
and transferred.  As a general rule, pure query interfaces with no
side effects meet these criteria, even if they use HTTP POST.
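
As a rough illustration of that last point, consider the following Python sketch.  The index contents, the `search` function, and the `handle` dispatcher are all invented for this example and do not describe any real service.  What it shows is that the property that matters lives in the operation itself: the query reads state and never changes it, whichever HTTP method carried the request.

```python
# Hypothetical in-memory index, standing in for whatever the service searches.
INDEX = {
    "rest": ["doc-1", "doc-2"],
    "soap": ["doc-3"],
}


def search(term):
    """Pure query: reads the index and returns a fresh list, changing nothing."""
    return list(INDEX.get(term.lower(), []))


def handle(method, term):
    """Dispatch a request; GET and POST reach the same side-effect-free query."""
    if method not in ("GET", "POST"):
        raise ValueError("unsupported method: " + method)
    return search(term)


if __name__ == "__main__":
    print(handle("GET", "REST"))
    print(handle("POST", "REST"))
```

Repeating either call any number of times leaves the index untouched and returns the same answer, which is the behaviour the paragraph above is describing.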


GoogleML 

Moving beyond HTTP GET, we look for other areas of disagreement.
As Paul has made clear, he is *PRO-WSDL*.
It apparently is also not the SOAP encoding that Paul disagrees with, as he
states: "My opposition is to the SOAP-RPC protocol, not the SOAP encoding."
He mentioned SOAPAction in passing, something that I have verified is not
significant in this service.  Furthermore, SOAPAction promises to become optional
in upcoming versions of the SOAP specification.  Paul mentions optional
arguments, something that has long been a part of the SOAP
specification.  Both Apache Axis and Microsoft ASP.Net support optional parameters.


Much of the simplification in Paul's examples comes from omitting type
specifications.  I agree that including types in this message is entirely
unnecessary.  I have verified
that the Google API in no way requires such annotation.  It is hard to
say whether "most" SOAP toolkits will inline the types into
the message - the Apache ones currently default to sending such
information on the theory that it is readily available and might be useful.
In Axis, this can be easily overridden.  The default for Microsoft's
ASP.Net is to not send such information.  And the comment that
"I could just as easily have left them in" leads me to
believe that this isn't a crucial issue either.
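
For readers who have not looked at the wire format, the following Python sketch shows what an inline type annotation amounts to.  The element name `q` and its value are hypothetical and this is not the actual Google message; only the `xsi:type` attribute and its namespace come from XML Schema.  The point is simply that a receiver which already knows the expected type reads the same value either way.

```python
import xml.etree.ElementTree as ET

# Namespace of the XML Schema instance attributes such as xsi:type.
XSI = "http://www.w3.org/2001/XMLSchema-instance"
ET.register_namespace("xsi", XSI)


def build_param(name, value, xsd_type=None):
    """Build one parameter element, optionally annotated with xsi:type."""
    elem = ET.Element(name)
    elem.text = str(value)
    if xsd_type is not None:
        # A real SOAP message would also declare the xsd prefix on an
        # enclosing element; that is omitted here for brevity.
        elem.set("{%s}type" % XSI, xsd_type)
    return elem


def read_param(xml_bytes):
    """A receiver that knows the type in advance just reads the text."""
    return ET.fromstring(xml_bytes).text


annotated = ET.tostring(build_param("q", "rest", "xsd:string"))
bare = ET.tostring(build_param("q", "rest"))

if __name__ == "__main__":
    print(annotated.decode("utf-8"))
    print(bare.decode("utf-8"))
```

The annotated form is longer on the wire but carries no information that a WSDL description would not already supply.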


What's left?  I guess there is the envelope.  We certainly
could discuss this, but somehow this issue does not quite seem to rise to
the level of Paul's call to arms for "like-minded Hobbits, Dwarves,
Elves and men and go on a quest to educate the world about the limitations
of SOAP-RPC interfaces".


Perhaps the most illuminating part of Paul's essay is when he describes
his optimized doSpellingSuggestion API.  In this case, he declares
that XML is overkill for the job.  Unquestionably, omitting XML in some cases creates a tighter data
stream.  It can also require custom marshallers
and parsers to be written.  More tradeoffs to consider.
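
A small Python sketch of that tradeoff follows.  The element name `suggestion`, the UTF-8 encoding, and the convention that an empty body means "no suggestion" are all assumptions made for illustration; they are not the format of Paul's proposal or of Google's service.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def parse_plain(body: bytes) -> Optional[str]:
    """Custom parser: client and server must agree on these rules out of band.

    Assumed rules: UTF-8 text, surrounding whitespace ignored,
    empty body means no suggestion.
    """
    text = body.decode("utf-8").strip()
    return text or None


def parse_xml(body: bytes) -> Optional[str]:
    """Generic parser: encoding and emptiness are handled by the XML layer."""
    root = ET.fromstring(body)
    return root.text or None


if __name__ == "__main__":
    print(parse_plain(b"britney spears\n"))
    print(parse_xml(b"<suggestion>britney spears</suggestion>"))
```

The plain-text stream is tighter, but every rule in the `parse_plain` docstring is something each client author has to learn and reimplement.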


An Analogy 

It is impossible to escape the fact that there is
much active hostility directed at the SOAP protocol from within the
REST community.  I've been giving this
some deep thought lately, and I finally came up with an analogy that might
explain this situation.


Few Object Oriented Programming advocates would list Perl among their
top choices in a programming language.  No one will deny that it is
possible to write OO code in Perl.  In fact, there clearly are features
in the language designed to support objects.  But does Perl require
you to write in an OO style?  Well, no.  Does it even guide you
in that direction?  Again, no, not particularly.  In fact, the
Perl motto is TMTOWTDI.


But it goes deeper than that.  Few, if any, of the beginner's samples
on how to use Perl start from an object orientation.  This leads many
towards programming practices which some find inappropriate.  One
might also note that a significant fraction of the CGI programs that Roy
declares as sucky are, in fact, written in Perl.


One could make a similar case against SOAP.


Conclusion 

I will readily agree that the architecture, analysis and design of any
complex distributed system need to focus on the concept of state in
general, and on its representation and transfer in particular.  Once
that work is complete, there remain a large number of implementation
tradeoffs that need to be made.  Some of these deal with ease of use
and the rate of adoption.  Adopting a canonical means to represent
such information may have a positive influence on such important secondary
characteristics of one's implementation.


5 Comments

paulprescod
2002-04-30 18:57:08
REST, URIs and everything
[mailed and posted]


I'm not convinced that it is productive to do a point-by-point argument against the weblog. If you want to discuss REST then I think a mailing list is a more productive way to do it. We could make a whole new one if you want an "even ground" or "decentralization" might be an interesting place. So I won't go through point by point and just try to re-emphasize the central point of my article.


Yes, I exaggerated the seriousness of Google's crime because I'm trying to motivate action rather than just dryly analyze. But Google's "crime" is that they deprived the Web of approximately a billion useful XML URIs. We could have used those URIs in XInclude, XSLT, XPath, XPointer, RDF. And of course we could ALSO have used them in Java, Perl, Python, C#, Ruby, ...


URIs are declarative, like SQL. You type them in and declarative data comes back. Just as you can type SQL in a little one-line window and get back records, you can type a URI in a little one-line window and get back representations of resources. Just as you can integrate tables using joins, you integrate XML/URI-based web services using XInclude and RDF.


Before the relational model (and query language) was widely circulated, the whole concept would have seemed ridiculous to database programmers. You could just use the "API" (surely not the terminology they used back then, but...). You want to integrate two tables...you write some code that iterates over one and the other and correlates them. As long as you like coding glue, and don't care about scalability, this is a great system! It took years for people to understand the subtle benefits of a system that is consistent but also very restrictive.


That's where we are today, with Web Services and REST. Precisely because the issues are subtle, we have to yell loud to be heard.


In particular, the fundamental benefit of publishing through URIs is interoperability:


* http://lists.w3.org/Archives/Public/www-tag/2002Apr/0286.html
* http://www.xml.com/pub/a/2002/02/06/rest.html


Until there is more than one useful web service on the Internet, it's really hard to demonstrate interoperability advantages...


Paul Prescod

rubys
2002-04-30 19:31:26
F2F ( BOF?) at ETCON?
I'm not convinced that it is productive to do a point-by-point argument against the weblog. If you want to discuss REST then I think a mailing
list is a more productive way to do it.

Perhaps we could have a F2F discussion at the ETCON? Want to schedule a BOF? I hear that Nelson Minar is going to be there...


I'd love to find some common ground and jointly author some recommendations on the subject...

timoreilly
2002-05-03 12:45:40
Bravo!
An excellent article, Sam!


While I like the REST debate, because it highlights that it is possible to build web services in a variety of ways (as I argued in my article Inventing the Future, hackers have been writing HTTP GET calls and screen scraping the resulting HTML to build crude web services long before the term became possible), saying that one style is good and another evil is a bit ridiculous.


I'm a big fan of TMTOWTDI. What is most exciting to me about the internet architecture is that it is based on that principle. Let's not get religious here. Let's just go build great services and over time, no one will even remember who was on which side of the debate, because we'll all have learned what works best, and will just do it.


Anyway, your rebuttal was terrific, Sam.


-- Tim O'Reilly

timoreilly
2002-05-03 12:47:58
F2F ( BOF?) at ETCON?
This would be fantastic. If you agree to do it, we can publicize it widely.


This is a core argument. And Paul, you have a lot of right on your side. In my previous post, I was responding to the black and white tone. This is meat for a substantial technical debate, not mud slinging.

nasseam
2002-05-05 23:18:50
SOAP vs. REST Resource page
I've started a SOAP vs. REST Resource page:
http://www.myspotter.net/links.html