Building the Web with Web services

by Mark Baker

Introduction



As a proponent of using the REST architectural style for Web services development, I've often been frustrated when I hear REST dismissed out of hand as a solution for some problems while, at the same time, "document style Web services" are deemed amply suitable for them. This frustration stems from my view of REST as a generalization - an "uber architecture" of sorts - of coarse grained, loosely coupled, document oriented approaches to application integration. That suggests to me that anything that can be accomplished with document style Web services can be accomplished within the constraints of REST, and in a not too dissimilar way either.



In this essay, I'll describe how core features of the Web relate to document style services, and in doing so, will describe how to build the Web with Web services.



Documents and state



"Document style" Web services are commonly characterized, not surprisingly, by messages which consist of a "document". But what is a document? How do we know when we're using a document or not? What isn't a document?



What a document is, at least by the (IMO, accurate) use of the term in a Web services context, is a bag-o-bits which isn't asking anything of anyone: the current time, a purchase order, a signup sheet for pickup floor hockey. They aren't asking anything of anyone because their job is simply to capture state: a signup sheet captures the state of those who desire to play, a purchase order captures the state of the desire of a purchaser to acquire some goods or service, and the time, well, communicates the state of some clock.



Documents are state.



Identification and state



When dealing in documents/state, a designer often finds it useful to be able to know which documents are about the same thing. For example, given two documents representing the state of some business process, it's quite useful - and often necessary - to be able to determine whether both describe the same business process even though they differ due to, perhaps, being snapshots taken at different times.



There is more than one way to address this problem, of course. One way would be to include information in the document which could be used to uniquely identify the business process: the parties involved, the time it began, etc. While this has its advantages, it also has two large disadvantages: it cannot support alternate formats (e.g. images) which can be used to represent the state of the same business process, and those identifying characteristics are known only to software which also knows the format. That latter problem would prevent, for example, a generic caching mechanism.



Another way, which has proven very successful on the Web, is to assign a unique URI to each business process. The URI would live outside the document (it could also live within it, but that's optional), and could be used as a key for a cache, as it is on the Web.
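
As a rough sketch of why an external identifier helps, here is a small Python illustration. The `UriCache` and `Document` names and the `http://example.org/processes/42` URI are made up for this example. The point is only that the cache never parses the documents it holds, so it works the same whether the snapshot is XML or an image.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Document:
    """An opaque snapshot of state; the cache never looks inside `body`."""
    media_type: str
    body: bytes


class UriCache:
    """A generic cache keyed only by the URI of the thing a document describes."""

    def __init__(self) -> None:
        self._entries: Dict[str, Document] = {}

    def store(self, uri: str, document: Document) -> None:
        # A newer snapshot of the same business process replaces the older one.
        self._entries[uri] = document

    def lookup(self, uri: str) -> Optional[Document]:
        return self._entries.get(uri)


# Hypothetical identifier for one business process.
PROCESS_URI = "http://example.org/processes/42"

cache = UriCache()
cache.store(PROCESS_URI, Document("application/xml", b"<order status='open'/>"))
# A later snapshot in a different format is still about the same process,
# and the cache knows that without understanding either format.
cache.store(PROCESS_URI, Document("image/png", b"\x89PNG..."))

latest = cache.lookup(PROCESS_URI)
```

Had the identity of the business process lived only inside the XML, the cache would have needed an XML parser, and the image snapshot could not have been matched to the process at all.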



Transfer and transport



This is a much more subtle point. Despite a document not asking anything of anyone, sometimes during the use of that document - particularly on its journey across a trust boundary to some other party - there comes a time when you do want to use it to ask something of someone.



Consider an electronic document containing the current time, "11:34AM". If I transport that to some party, all I know is that it's sitting in a buffer of some network stack; I don't know that any application code has received it, processed it, or otherwise. This is where transfer comes in; if I successfully transfer a document, then I know that application code received it. Pictorially, it looks like this. Practically, what it means is that there is an operation being performed, just one that is uniform to all services/components (and therefore appears hidden) and means, roughly, "process this data". The advantage of making this operation explicit rather than implicit is extensibility, as it allows us to define new operations, such as "store this data", or "monitor this thing for state changes".



How this would look would be that instead of sending just the bits "11:34AM", we could send "PROCESS-THIS 11:34AM", or "STORE-THIS 11:34AM".



And of course, what one could achieve with the "PROCESS-THIS" and "STORE-THIS" operations would require that the protocol that defined these operations be used. They wouldn't be "protocol independent".
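
To make the explicit operation concrete, here is a minimal Python sketch of a receiver for such messages. The wire format ("OPERATION, a space, then the document") and the handler names are assumptions invented for illustration; they are not part of any real protocol.

```python
from typing import Callable, Dict, List

processed: List[str] = []
stored: List[str] = []


def process_this(document: str) -> str:
    # Stand-in for handing the document to application code.
    processed.append(document)
    return "processed"


def store_this(document: str) -> str:
    # Stand-in for persisting the document.
    stored.append(document)
    return "stored"


# The operations belong to the (hypothetical) protocol, not to the document.
# Adding a new operation means adding an entry here.
OPERATIONS: Dict[str, Callable[[str], str]] = {
    "PROCESS-THIS": process_this,
    "STORE-THIS": store_this,
}


def handle(message: str) -> str:
    """Split 'OPERATION document' and dispatch on the explicit operation."""
    operation, _, document = message.partition(" ")
    handler = OPERATIONS.get(operation)
    if handler is None:
        return "unknown operation"
    return handler(document)
```

The same document, "11:34AM", ends up in a different place depending on the operation it travels with, while a bare "11:34AM" with no operation is rejected because the receiver has no way to know what is being asked of it.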



Requesting documents



Up to now, our notion of "document exchange" has been limited to "sending documents", even if we've talked about sending them using different operation semantics (e.g. process vs. store). And while document submission is both necessary and powerful, there exists another interaction style which has demonstrated its utility: requesting documents.



We've already discussed the value of having an identifier for a business process in order to relate two documents representing the state of the same business process. Now, wouldn't it be useful if we could "request a document" using that identifier to get the most recent state of the business process? I believe it would, and not coincidentally, this is the value of GET on the Web.



Building the Web



I deliberately avoided adding a section like this, because I was hoping people would be able to draw these conclusions for themselves. But that doesn't seem to be happening, so here it is, by popular request.



So what does this have to do with building the Web? Well, what I just described above is how one would go about turning a world of document oriented Web services into the Web.



The first step is to start identifying those things which your documents are representations of - your resources - and to identify them with URIs. The second is to realize that you can do more than just submit documents for processing (POST); you can also request documents, asking for the state of a resource by invoking GET on its URI. That's it; that's the Web. It's not just for humans, it's for any agent which can submit and request documents.
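
The two steps can be sketched in a few lines of Python, using the signup sheet from earlier as the resource. Everything here is an in-memory stand-in: the `SignupSheet` class, the `request` function, and the `http://example.org/hockey/signup` URI are hypothetical, and no real HTTP traffic is involved. Only the method names and status codes are borrowed from HTTP.

```python
from typing import Dict, List, Tuple


class SignupSheet:
    """A resource whose state is the list of people who want to play."""

    def __init__(self) -> None:
        self._players: List[str] = []

    def post(self, document: str) -> None:
        # Process a submitted document: here, add the name it carries.
        name = document.strip()
        if name and name not in self._players:
            self._players.append(name)

    def get(self) -> str:
        # Return a document representing the resource's current state.
        return "\n".join(self._players)


# Step one: identify the resource with a (hypothetical) URI.
SHEET_URI = "http://example.org/hockey/signup"
RESOURCES: Dict[str, SignupSheet] = {SHEET_URI: SignupSheet()}


def request(method: str, uri: str, document: str = "") -> Tuple[int, str]:
    """Step two: the same two operations apply to every identified resource."""
    resource = RESOURCES.get(uri)
    if resource is None:
        return 404, ""
    if method == "GET":
        return 200, resource.get()
    if method == "POST":
        resource.post(document)
        return 204, ""
    return 405, ""


request("POST", SHEET_URI, "Mark")
request("POST", SHEET_URI, "Roy")
status, sheet = request("GET", SHEET_URI)
```

Any agent that knows the URI can submit a document or ask for the current state; it does not need a service-specific interface to do either.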



Conclusions



The relatively recent shift away from RPC and towards "document exchange" (aka state transfer) is extremely welcome progress from this POV, but IMO it is just the first step of many towards fully appreciating the enormity of the World Wide Web project.



Get it? Or am I still full of it? Let me know.


6 Comments

anonymous2
2003-11-24 23:00:30
what a rubbish article
I am not a web services fan, but this article is awful.
anonymous2
2003-11-25 02:37:55
what a rubbish article
Your comment is entirely unhelpful. In what way is it awful? Can you be more specific?


sh.

distobj
2003-11-25 06:10:21
what a rubbish article
It's specifically targeted at Web services proponents, perhaps even just those that I've worked with in the XML Protocol and Web Services Architecture WGs at the W3C who are familiar with the terminology (e.g. state/transfer/transport).


I'm sorry you didn't learn anything from it.

anonymous2
2003-11-26 10:44:22
what a rubbish article
maybe the title is a bit misleading, I was a little disappointed by the content of the article after reading the title myself.
distobj
2003-11-26 13:03:49
what a rubbish article
Fair enough. You have to squint a little to see how the title relates to the content (though not too much).


I'm basically saying that if Web services recognized that the documents they're dealing in were state, and that they represented the state of something which could be identified, then you'd basically have the Web. See my last Weblog entry on SDO for another description. It's amazing how close Web services are getting to the Web in terms of power, without people recognizing it.

anonymous2
2003-11-28 02:28:17
(Hopefully) Constructive Feedback on Mark Baker's Article
I didn't think it was harsh, it is just a badly written article.