Towards truly document oriented Web services

by Mark Baker

In the beginning



For the first year or two of their existence, Web services were about remote procedure calls (RPC): exposing remote APIs across the Internet in order to facilitate machine-to-machine communication and, ultimately, business-to-business integration.

It didn't take very long, however, for Web services proponents to realize that they needed to distance themselves from RPC and its well-deserved reputation as a poor architectural style for large-scale integration, a reputation earned by the failure of systems such as CORBA, DCOM, and RMI to see any widespread use on the Internet. So, sometime in 2000 or 2001, collective wisdom in the space shifted towards a preference for "document oriented" services. Vendors quickly jumped on board with upgraded toolkits, and that was that: documents were the New Big Thing.

Unfortunately, the basic architectural assumptions underlying Web services at the time didn't change nearly enough to distance Web services from the problems of RPC.

What is "Document oriented"?



Respected Web services guru Anne Thomas Manes succinctly (and unknowingly, it appears) describes the differences between RPC and document orientation:


Document style:

<env:Body>
<m:purchaseOrder xmlns:m="someURI">
...
</m:purchaseOrder>
</env:Body>

RPC style:

<env:Body>
<m:placeOrder xmlns:m="someURI">
<m:purchaseOrder>
...
</m:purchaseOrder>
</m:placeOrder>
</env:Body>

The bigger difference is how you encode the message. [...]



While the encodings used were certainly different, each with its own not-insignificant pros and cons, what Anne failed to point out is that the RPC example included an operation name ("placeOrder") while the document oriented example did not. This constitutes an extremely significant architectural difference, as it tells us that Anne's document example uses a state transfer style, while the RPC example does not.
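The structural difference can be made concrete with a small sketch. It parses simplified copies of the two bodies and reports the name of the outermost element inside the body along with the names of its children. The envelope namespace URI (SOAP 1.1) and the helper names `local_name` and `body_shape` are assumptions made for illustration; the elided content ("...") is left empty.

```python
import xml.etree.ElementTree as ET

# Assumption: the SOAP 1.1 envelope namespace; the original snippets omit it.
ENV = "http://schemas.xmlsoap.org/soap/envelope/"

DOCUMENT_STYLE = f"""<env:Body xmlns:env="{ENV}">
  <m:purchaseOrder xmlns:m="someURI"/>
</env:Body>"""

RPC_STYLE = f"""<env:Body xmlns:env="{ENV}">
  <m:placeOrder xmlns:m="someURI">
    <m:purchaseOrder/>
  </m:placeOrder>
</env:Body>"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def body_shape(xml_text: str) -> tuple:
    """Return (outermost element name, names of its child elements)."""
    body = ET.fromstring(xml_text)
    outer = body[0]
    return local_name(outer.tag), [local_name(child.tag) for child in outer]


print(body_shape(DOCUMENT_STYLE))  # the document itself is outermost
print(body_shape(RPC_STYLE))       # an operation name wraps the document
```

In the document style the outermost element is the purchase order; in the RPC style it is the operation, with the purchase order nested inside it as an argument.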

State Transfer



State transfer styles, which include MOM, EDI, pipe-and-filter, and others, are characterized primarily by one architectural constraint: all the components expose the same application interface. In most cases, including those three, the application interface is constrained to providing a single operation that one might call "processData" (one pipe-and-filter description actually calls it "putData"). Each server component exposes this operation, enabling any client to submit data to it for processing. In addition, because there's only one operation, its use is implicit and therefore needn't be included in the message.
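As a rough sketch of that constraint, the following models two unrelated components that expose the same single operation. The class names (`OrderDesk`, `AuditLog`), the `submit` helper, and the return strings are hypothetical; only the "processData" name comes from the text above.

```python
from typing import Protocol


class Component(Protocol):
    """The one application interface shared by every component."""

    def process_data(self, document: str) -> str: ...


class OrderDesk:
    """Hypothetical component that accepts documents for fulfilment."""

    def __init__(self) -> None:
        self.received: list[str] = []

    def process_data(self, document: str) -> str:
        self.received.append(document)
        return "accepted"


class AuditLog:
    """Hypothetical component that records every document it is given."""

    def __init__(self) -> None:
        self.entries: list[str] = []

    def process_data(self, document: str) -> str:
        self.entries.append(document)
        return "logged"


def submit(component: Component, document: str) -> str:
    # The message is just the document. No operation name travels with it,
    # because process_data is the only operation a component can offer.
    return component.process_data(document)
```

A client written against `submit` can send the same purchase order to either component without knowing anything component-specific, which is the practical payoff of sharing one interface.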

Allow me to reiterate my main point: Anne's document oriented example above includes an implicit ("processData") operation.

REST



REST - REpresentational State Transfer - is, as the name suggests, also a state transfer style. One of the interesting ways in which REST differs from the others is that, rather than constraining the interface to the single "processData" operation, it allows any operation that is meaningful to all components (referred to as the "uniform interface"). An interesting side-effect of allowing more than one operation is that messages must be explicit about the operation in use, since there obviously needs to be a way to disambiguate messages that carry the same document but different operations.
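A minimal sketch of why the operation has to become explicit: once a component accepts more than one operation, the same document can mean different things, so the message must carry the operation alongside the document. The `Message` and `Resource` types and the return strings are hypothetical, and the three operations are borrowed from HTTP purely as familiar names.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    operation: str  # explicit, because more than one operation is allowed
    document: str


@dataclass
class Resource:
    state: str = ""
    inbox: list = field(default_factory=list)

    def handle(self, message: Message) -> str:
        if message.operation == "GET":
            # Return the current state; the document is ignored.
            return self.state
        if message.operation == "PUT":
            # Replace the resource's state with the document.
            self.state = message.document
            return "stored"
        if message.operation == "POST":
            # Hand the document over for processing ("processData").
            self.inbox.append(message.document)
            return "accepted"
        raise ValueError(f"unsupported operation: {message.operation}")
```

Sending `Message("POST", doc)` and `Message("PUT", doc)` to the same `Resource` leaves it in two different conditions, even though the document is byte-for-byte identical.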

HTTP is the application protocol most closely associated with REST, largely because it was developed to respect many of REST's constraints. As it relates to the uniform interface and explicit operations, HTTP provides a "POST" operation, which is an alias for the aforementioned "processData" operation. So, back to Anne's example again, this HTTP message is semantically identical to her document oriented example:



POST some-uri HTTP/1.1
Host: some-host.example.org
Content-Type: application/x-purchase-order+xml

<env:Body>
<m:purchaseOrder xmlns:m="someURI">
...
</m:purchaseOrder>
</env:Body>



Moreover, note that if the HTTP operation were different - say, if it were "PUT" instead of "POST" - then the message would no longer have semantics identical to Anne's original document oriented example. Yes, this means that the semantics of the message are a function of the application protocol being used, contrary to the conventional wisdom in Web services, which holds that message semantics should be "protocol independent".
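From the client side, the point can be illustrated by constructing (without sending) two requests that differ only in their method. This is a sketch using Python's standard `urllib.request.Request`; the full URL is a hypothetical stand-in for "some-uri" on the example host, and the body is a simplified, empty purchase order.

```python
from urllib.request import Request

# Hypothetical stand-ins for the "some-uri" and elided body in the example.
URL = "http://some-host.example.org/some-uri"
BODY = b'<m:purchaseOrder xmlns:m="someURI"/>'
HEADERS = {"Content-Type": "application/x-purchase-order+xml"}


def build(method: str) -> Request:
    # Only the method varies; URL, headers and body are identical.
    return Request(URL, data=BODY, headers=HEADERS, method=method)


post_request = build("POST")  # "process this document"
put_request = build("PUT")    # "store this document as the state at this URI"

print(post_request.get_method(), put_request.get_method())
print(post_request.data == put_request.data)
```

Nothing in the document distinguishes the two requests; the difference in meaning lives entirely in the protocol's operation.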

Conclusion



Hopefully this little note helps put in context the architectural relationship between the Web and document oriented Web services. The relationship is closer than it appears in some important ways, yet more distant in others, likely as a result of the fact that Web services began with RPC, rather than with a truly document oriented architectural style. Perhaps spelling this out explicitly, as I hope I've done here, will help more Web services proponents realize the importance of the Web to their objectives of integrating systems across the Internet.

Note: this article was originally published at the Coactus weblog.