Amazon Web Services (AWS) allow anyone with some coding skills to create applications using Amazon's data. It's fairly easy to transform an AWS response into HTML and show a list of products and images on a remote site. Many ISPs put checks in place to stop "image leeching" — referencing an image URL from a remote server directly in a
When you request information about a product through Amazon's Web Services, you get an impressive amount of data about that item. Each XML response also includes three URLs that point to small, medium, and large images of the product. (The XML tags that hold the image URLs are self-describing:
To Host or Not to Host the Images?
If you want to display Amazon product images on your web site, you can save the image to your server and use a local path in the source attribute of your
At first glance, it seems that letting Amazon serve the product image is the best way to go: there's no extra coding to cache the image, and you save bandwidth. In practice, though, Amazon's image server can be unreachable for a variety of reasons. Keeping product images local means that you will have an image to display, whether Amazon's servers are responding or not. In my experience working with Amazon, the image server is rarely unreachable, but when it's not responding—even for just five or ten minutes at a time—a page full of broken images looks pretty bad. Another reason to consider caching product images locally is that some products don't have images. That sounds counterintuitive at first, too, but the process of caching the image gives you a chance to see if the product actually has an image.
When There Are No Images
Amazon carries an enormous number of products, and not every one of them has an image. The problem is that Amazon's API doesn't tell you which products have images and which don't: it always returns the image URLs, whether or not the product actually has an image. If the product has no image associated with it, every image URL returned by the API points to a single-pixel GIF. The GIFs are transparent, and with some designs, that works fine. But if your product images have a border, or your layout relies on the image for spacing, the single-pixel GIF can wreak havoc. There are a few ways to detect which products have images, and which method to use depends on whether you or Amazon is hosting them.
A Server-Side Solution
If you've decided to cache images locally, it makes sense to do the "image detection" on your server. In Amazon Hacks, Hack #84 provides code for this that checks the byte size of the resulting image file. This method works well, but it requires your script to download the entire file, which can cause delays if you're working with the large image size. There's a quicker way to determine whether the image is there for you to display.
The transparent GIF that's returned still has the JPG extension. By all appearances, it's a valid file. But the HTTP headers don't lie. By examining the headers for a given image URL, you can find out whether it's really a GIF or JPEG you're about to download. The headers give all sorts of information about the response, but the only values we're really interested in are the
For example, Amazon Hacks has an image and the image URL returned by the Amazon API for the medium image is:
We can use a script to see what the relevant HTTP headers for the request are:
By contrast, the book Using Email Effectively doesn't have an image. The URL returned by the Amazon API for the medium image is:
And its relevant headers:
As you can see, the image type is completely different and the content length is much smaller. Using this difference as the criterion for whether or not the image "exists," you can write a routine in any scripting language to do this check for you.
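The decision itself can be separated from the network request. The following is a minimal Python sketch of that decision, not code from the article: it assumes the two headers of interest are `Content-Type` and `Content-Length`, and the function name `looks_like_real_image` and the `min_bytes` threshold of 500 are hypothetical choices made for illustration. Tune the threshold against real responses before relying on it.

```python
def looks_like_real_image(headers, min_bytes=500):
    """Return True when the headers describe a real product image.

    `headers` is any mapping of header names to string values.
    `min_bytes` is an assumed cutoff, not a value documented by Amazon.
    """
    # Drop any parameters such as "; charset=..." and normalize case.
    content_type = headers.get("Content-Type", "").split(";")[0].strip().lower()

    # Content-Length may be missing or malformed; treat that as unknown.
    try:
        content_length = int(headers.get("Content-Length", ""))
    except ValueError:
        content_length = None

    # The placeholder is a GIF even though its URL ends in .jpg.
    if content_type == "image/gif":
        return False
    # A suspiciously small body is treated as the placeholder too.
    if content_length is not None and content_length < min_bytes:
        return False
    return content_type.startswith("image/")
```

Checking both values is a deliberately cautious choice: the content type alone is usually enough, and the length test only catches the case where a tiny file is served with an unexpected type.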
This ASP function uses the Microsoft XML parser to request the image headers. If the image's
Here's a variation on the theme for PHP. You'll need a package that supports fetching HTTP headers, like the PEAR
Here's the same routine in Perl. You'll need a module that supports HTTP requests, and this example uses the convenient LWP::Simple.
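For readers working in Python, here is a rough equivalent using only the standard library. It is a sketch rather than a port of the listings above: the names `fetch_image_headers` and `image_exists` are hypothetical, the `opener` parameter exists only so the network call can be replaced in tests, and the rule that a real image is served as `image/jpeg` is an assumption based on the behavior described earlier.

```python
import urllib.request


def fetch_image_headers(url, opener=urllib.request.urlopen, timeout=10):
    """Send a HEAD request; return (content_type, content_length) or None.

    A HEAD request asks for the headers only, so the image body is
    never downloaded. None means the server could not be reached.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return (
                response.headers.get("Content-Type"),
                response.headers.get("Content-Length"),
            )
    except OSError:  # URLError and HTTPError are subclasses of OSError
        return None


def image_exists(url, opener=urllib.request.urlopen):
    """Return True only when the URL appears to serve a JPEG."""
    result = fetch_image_headers(url, opener)
    if result is None:
        return False
    content_type, _content_length = result
    return content_type is not None and content_type.lower().startswith("image/jpeg")
```

An unreachable server is reported the same way as a missing image, so a caller can fall back to its generic graphic in both cases.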
With these functions, you can check to see if an image exists and take the appropriate action: display it if it's there, replace it with a generic "no image available" graphic if not. Using these methods makes the most sense if you're caching images locally. You wouldn't want to examine the HTTP headers for every image you want to display every time someone requests a page on your server; all of this "pre-processing" would slow down your application. Also keep in mind that you can't cache images indefinitely — Amazon's terms of service require you to refresh any cached images every 24 hours. To get a jumpstart on caching Amazon images, refer to Hack #93 in Amazon Hacks, "Cache Amazon Images Locally."
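The 24-hour refresh rule is easy to enforce by looking at the age of each cached file. The Python sketch below is one possible approach and is not taken from Hack #93: the function name `needs_refresh` is hypothetical, and it assumes the cache stores each image as a file whose modification time records when it was last fetched.

```python
import os
import time

# 24 hours, per the refresh requirement described above.
MAX_AGE_SECONDS = 24 * 60 * 60


def needs_refresh(path, now=None, max_age=MAX_AGE_SECONDS):
    """Return True if the cached file is missing or at least max_age old."""
    if now is None:
        now = time.time()
    try:
        modified = os.path.getmtime(path)
    except OSError:
        # No cached copy yet, so it has to be fetched.
        return True
    return now - modified >= max_age
```

A caching script could call `needs_refresh` before serving a local copy, and run the header check and download only when it returns `True`.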
And this still doesn't solve the problem of image availability. If the Amazon image server isn't up, there aren't any HTTP headers to look at, anyway. But there is another way to work around products without images — even if you're letting Amazon do the work of serving them.
A Client-Side Solution
Take a look at the last
Make sure this function runs after the entire page has loaded by calling it from the
Don't forget to create a local graphic called book_noimage.gif, so that missing or unresponsive images have something to be replaced with.
Wherever you decide to apply these checks, they should help ensure that your dependent application looks good no matter what the conditions are.
Paul Bausch is a co-creator of the weblog software Blogger, maintains a directory of Oregon-based weblogs at ORblogs.com, and is the author of the forthcoming Yahoo! Hacks.
O'Reilly & Associates recently released (August 2003) Amazon Hacks.