The SimpleDB Epiphany: I Finally GET It... Why RFC 2616 Is To Blame

by M. David Peterson

Update: Subbu Allamaraju has followed up my post with "Idempotency Explained" which is worth a read. I'm not sure if I agree 100% with his comments due to the fact that -- as far as I know -- the same request to create/edit/update an entry/attribute on SimpleDB will always yield the same result no matter how many times the request was made. Then again, I could very well be completely off base here. /me is reading through the docs again to ensure I haven't missed something.

Anyone in the know care to clarify one way or another?

Either way, thanks for the extended overview, Subbu!

[Original Post]
So for various reasons I've had the opportunity to get to know a lot of the folks who design, develop, deploy, market, and support the various offerings of Amazon Web Services, and it's because of this I found it funny to hear people criticize Amazon for "setting back web architecture 10 years" with the release of SimpleDB. For example, Dare Obasanjo provided the following commentary,

I’ve talked about APIs that claim to be RESTful but aren’t in the past but Amazon’s takes the cake when it comes to egregious behavior. Again, from the documentation for the PutAttributes method we learn,


<snip/>

Wow. A GET request with a parameter called Action which modifies data? What is this, 2005? I thought we already went through the realization that GET requests that modify data are bad after the Google Web Accelerator scare of 2005?


I'll admit that at first I was right in line with Dare's point, or in other words, WTF?

But as I mentioned, I know a lot of these guys personally, and I can assure you not a single one of them could qualify as anything other than the best and brightest this world has to offer as it relates to the field of computer science. So I've always held off from criticizing, assuming that eventually it would all make sense.

Apparently eventually =~ February 19th, 2008,

23 Comments

Thomas Broyer
2008-02-20 02:19:15
How about SimpleDB returning "202 Accepted" statuses in response to POSTs and PUTs, each with a "status URI" to poll for the "execution status"?


...or how RFC2616 gives you what you need and thus is not to blame...


Or am I misunderstanding something?
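The "202 Accepted" flow suggested above can be sketched as a toy in-memory model. This is only an illustration under stated assumptions: the class `AsyncWriteService`, its method names, and the `/status/N` paths are all hypothetical, and nothing here reflects how SimpleDB behaves.

```python
import itertools


class AsyncWriteService:
    """Toy in-memory stand-in for a service that accepts writes asynchronously."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._jobs = {}  # status URI -> [state, (key, value)]
        self.data = {}

    def submit(self, key, value):
        """Queue a write and return (status code, status URI to poll)."""
        status_uri = f"/status/{next(self._ids)}"  # hypothetical URI layout
        self._jobs[status_uri] = ["pending", (key, value)]
        return 202, status_uri

    def process_pending(self):
        """Apply every queued write (stands in for background work)."""
        for job in self._jobs.values():
            if job[0] == "pending":
                key, value = job[1]
                self.data[key] = value
                job[0] = "done"

    def poll(self, status_uri):
        """Return the execution status for a previously issued status URI."""
        return self._jobs[status_uri][0]


service = AsyncWriteService()
code, uri = service.submit("color", "red")
before = service.poll(uri)   # the write has been accepted but not yet applied
service.process_pending()
after = service.poll(uri)    # the write has now been applied
```

The client gets an immediate answer for every write and checks back later, so the write itself can stay on POST or PUT.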

M. David Peterson
2008-02-20 04:59:46
@Thomas Broyer,


Not sure I follow. Guess maybe I need to dig a little deeper, but how would this provide me the same benefits provided by HTTP pipelining?

Ric
2008-02-20 05:47:57
I won the ACSL award for 4 years, so I am no chump when it comes to coding either, but I do not get why Amazon allows GET to modify data. I would LOVE to use get, because it is native to JSON requests, but I have been burned too many times and have learned my lesson, especially for security.
My only guess is that they may use a key to authenticate that the request can change the state, but that is just hiding the issue.
BAD Amazon! Bad!
If you would care to elucidate us with a more descriptive rationale of why we are breaking REST, please do so.

M. David Peterson
2008-02-20 06:12:10
@Ric,


>> I won the ACSL award for 4 years, so I am no chump when it comes to coding either,


FWIW, I won my seventh grade spelling bee (and then quickly made it a point to become the worst speller known to man. What can I say: I try. ;-)


>> My only guess is that they may use a key to authenticate that the request can change the state,


Yeah, every request MUST be signed by a pub/priv keypair, so there is no way for a simple GET request to do anything it wasn't intended to do. You could argue that Amazon reinvented a few things to get the capabilities they provide, but they also invented quite a bit in the process that betters the overall state of the industry, so all in all I believe they've done more good than anything else.


>> but that is just hiding the issue.


Not really. They needed to provide the most efficient solution that would work out of the box, and in the case of highly parallel application requirements over HTTP(S) there is really only one option: HTTP Pipelines.


You *COULD* create multiple threads and a new connection for each request, but this costs you the HTTP overhead for every request which is what HTTP Pipelines were designed to help avoid. And as per above, you can only get the benefit of HTTP Pipelines using GET and HEAD (which, of course, was purposely put into place to avoid undesired side effects) so the only option is to hack GET to provide the same functionality as PUT or POST, ensuring that in doing so they put in place measures to counteract the side effects that PUT and POST were designed to help avoid.
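For readers unfamiliar with pipelining, here is a minimal sketch of what it looks like on the wire: several requests are written to one connection back to back, without waiting for each response, and the server answers them in order. The host name and query strings are made up for illustration, and the function only builds the bytes rather than opening a socket.

```python
def build_pipeline(host, paths):
    """Return the bytes of several GET requests written back to back."""
    requests = []
    for path in paths:
        requests.append(
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "\r\n"
        )
    return "".join(requests).encode("ascii")


# Hypothetical host and paths; one write to the socket carries all three requests.
payload = build_pipeline("db.example.com", ["/?item=1", "/?item=2", "/?item=3"])
```

Compared with opening a connection per request, the per-connection setup cost is paid once, which is the saving being argued for above.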


>> If you would care to elucidate us with a more descriptive rationale of why we are breaking REST, please do so.


Well, the thing to keep in mind is -- quite obviously -- RFC 2616 != REST. REST doesn't have a spec, only a thesis (and one of the better books of our modern age in "RESTful Web Services" > http://www.oreilly.com/catalog/9780596529260/ < for those unaware), so while Amazon might be breaking the traditional REST architectural style, they're not breaking RFC 2616, just hacking it to provide the desired functionality. And quite honestly, when you pull all of the pieces into view, I really don't see any other way to gain the benefits provided by HTTP Pipelines while still conforming to proper REST architecture guidelines.


Keeping in mind that I did win my seventh grade spelling bee, maybe I'm missing something?


Sylvain Hellegouarch
2008-02-20 09:38:35
AFAIK, idempotency doesn't mean you can't modify data, it means that for n identical requests you observe the same effect on the system.


The problem doesn't actually come from the RFC but more from the way GET has been abused over the years.


Technically speaking, the put action performed by Amazon through a GET request is valid because two identical requests will not modify the system as a whole. It works because put actions simply modify the attributes of an existing item. It would break the idempotent rule if it triggered an action that does modify the system. It's not the case and I think Dare was a bit quick on his judgment here.

M. David Peterson
2008-02-20 10:33:32
@Sylvain,


Thanks for this! I was vaguely aware of when and how GET could modify data and stay within conformance to RFC 2616, but wasn't fully in sync with the specifics. I need to study this a bit more (still not 100% on it), but after reading through this I now know the areas I need to focus on. Thanks for spelling things out!

Alexander Klimetschek
2008-02-21 11:27:11
Ever heard of the idempotent PUT method?


REST means not to invent new operations (like the action PutAttributes), but instead use GET, PUT, DELETE and co and model everything via resources. The correct solution here would be to add a new resource for the attributes as part of the URL (something/attributes) and applying a PUT there.


This has nothing to do with increased performance and pipelining.
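The resource-oriented alternative described in this comment could look roughly like the sketch below. It builds a PUT request against an attributes resource without sending it. The base URL, the `/{item}/attributes` layout, and the JSON body are assumptions made for illustration; SimpleDB exposes no such endpoint.

```python
import json
import urllib.request


def build_put_attributes(base_url, item, attributes):
    """Build (but do not send) a PUT request for an item's attributes resource."""
    body = json.dumps(attributes).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{item}/attributes",  # hypothetical resource layout
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = build_put_attributes(
    "https://db.example.com/mydomain", "item123", {"color": "red"}
)
```

The operation name moves out of the query string: the URL identifies what is being changed and the HTTP method says how.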

M. David Peterson
2008-02-21 11:31:17
>> This has nothing to do with increased performance and pipelining.


This has *EVERYTHING* to do with pipelining. We're not talking about updating a blog entry. We're talking about reading and writing to a database in the cloud over HTTP(S) in which many hundreds if not thousands of updates are being sent to the system at a time.

Subbu Allamaraju
2008-02-21 12:30:50

My quick response is here.


I hope this is your interpretation and not Amazon's.

M. David Peterson
2008-02-21 12:38:42
>> I hope this is your interpretation and not Amazon's.


It is, yes.


So answer me this: How would you increase the throughput of writes to the DB w/o the use of pipelines and yet still gain at least the same level of performance benefit? Finding humor w/o providing substance doesn't provide much reason to find the point of your post all that compelling.

Subbu Allamaraju
2008-02-21 12:39:59
By the way, the idempotency requirement on pipelining is not a limitation, it just allows safe retry after a connection failure. In fact, by pipelining GETs that do writes (as in the case of SimpleDB), you are taking away that protection. Two consecutive GETs with each containing the same write could produce different results, and Amazon's SimpleDB does not protect against this.

M. David Peterson
2008-02-21 12:42:24
>> Two consecutive GETs with each containing the same write could produce different results,


For what purpose would you pipeline two of the same writes?

M. David Peterson
2008-02-21 12:45:25
>> Amazon's SimpleDB does not protect against this.


That depends on whether or not you use the "replace" or "in addition to" capabilities.
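The difference between the two modes can be illustrated with a toy model. This is a sketch only: the function `put_attribute` and the dict-of-lists store are hypothetical and do not model SimpleDB's actual storage. Whether SimpleDB's "in addition to" mode behaves like the append branch here is an assumption the thread does not settle.

```python
def put_attribute(store, item, name, value, replace):
    """Apply one write to a dict of item -> attribute name -> list of values."""
    values = store.setdefault(item, {}).setdefault(name, [])
    if replace:
        values[:] = [value]   # overwrite whatever was there
    else:
        values.append(value)  # add alongside the existing values


replaced = {}
appended = {}
for _ in range(2):  # the same request delivered twice, e.g. after a retry
    put_attribute(replaced, "item1", "color", "red", replace=True)
    put_attribute(appended, "item1", "color", "red", replace=False)

# replaced holds one "red"; appended holds two.
```

In this model a replace-style write is idempotent because repeating it leaves the same state, while an append-style write is not, which is the case a retried pipeline would have to worry about.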

Subbu Allamaraju
2008-02-21 18:37:36
Hi David,


Sorry for the late response. My full response is here.


Idempotency Explained.


Subbu

M. David Peterson
2008-02-21 19:00:15
@Subbu,


Thanks for this! I've left a comment on your entry and updated this post with a link and quick follow-up.

Brian
2008-02-21 21:24:27
PUT is idempotent

Erik Hetzner
2008-02-21 21:27:08
This is not an issue of idempotency & pipelining. Bringing this up muddies the issue.


The issue is the /safeness/ of a GET request. Read section 9.1, Safe and Idempotent Methods, of RFC 2616.


Subbu's weblog post gets it right, except for the statement about idempotent messages not having side effects. This gets us back into "safe" territory. Both idempotency and safeness have to do with constraining the possible side-effects of a request, but they mean separate things. Again, section 9.1 explains it better than I can here.

Tim Olsen
2008-02-22 05:59:44
So now there are two SHOULDs that Amazon is breaking:


1. GET SHOULD NOT have side effects (13.9)
2. Clients SHOULD NOT pipeline requests using non-idempotent methods.


By breaking rule #1, they are also breaking rule #2 because GET in this situation is now no longer idempotent!!


SHOULDs are not MUSTs. If you have to break them, just break rule #2 and continue to use PUT, POST. There is no need to break both rules. Calling it a GET when it's not idempotent does not make it idempotent.

Erik Hetzner
2008-02-22 07:28:23
Tim, having side-effects does not (necessarily) make a method non-idempotent.


Methods can have side-effects & yet still be idempotent. See PUT & DELETE.
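The safe/idempotent distinction can be written down as a small lookup, using the methods RFC 2616 names in sections 9.1.1 and 9.1.2. The helper names `describe` and `ok_to_pipeline` are made up for this sketch.

```python
# Methods named as "safe" in RFC 2616 section 9.1.1.
SAFE = {"GET", "HEAD"}
# Methods named as idempotent in RFC 2616 section 9.1.2.
IDEMPOTENT = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"}


def describe(method):
    """Return (is_safe, is_idempotent) for an HTTP method name."""
    name = method.upper()
    return name in SAFE, name in IDEMPOTENT


def ok_to_pipeline(method):
    """RFC 2616 section 8.1.2.2: clients SHOULD NOT pipeline non-idempotent methods."""
    return describe(method)[1]


put_properties = describe("PUT")    # (False, True): has side effects, still idempotent
post_properties = describe("POST")  # (False, False)
```

By this reading PUT and DELETE are not safe but are idempotent, so the pipelining guidance in section 8.1.2.2 does not by itself rule them out.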


M. David Peterson
2008-02-22 07:32:40
@All,


This is all great conversation. I'm learning *TONS*. I personally need to think through a lot of this to better understand all that is being presented/debated, but please don't let that stop any of you from hashing out the details. The end result, I believe, will be an extremely beneficial guide to all things related to this conversation, topics which quite obviously are either misunderstood or completely misinterpreted by the masses.

Tim Olsen
2008-02-22 08:09:46
Erik,


You're right. I should have been clearer that they were breaking GET's safeness by making it non-idempotent, thereby also breaking rule #2.

Tim Olsen
2008-02-22 08:19:53
> the same request to create/edit/update an entry/attribute on SimpleDB will
> always yield the same result no matter how many times the request was made.


Now I'm confused. Are all of Amazon's requests idempotent or not? If they are, then they can be pipelined without having to be a GET (they can be PUT or DELETE).

Sylvain Hellegouarch
2008-03-03 00:27:52
@All,


There is an interesting thread going on at the HTTPBIS charter mailing list about pipelining:


http://lists.w3.org/Archives/Public/ietf-http-wg/2008JanMar/0348.html