SOAP? Bah! What's wrong with /bin/sh?

by Edd Dumbill

The frenzy over Google's new SOAP API is just plain silly. Today I was sent details by a proud PR representative (I'll not mention them, but you'll likely hear from them yourselves) of his company's Google-over-email service, using the SOAP interface. What a waste of space for something that can be done in one line of shell script! Here's how...

Grab your local Linux/BSD box of preference, put this bit of script into a file, say /usr/local/bin/, and make it executable.


/bin/cat >/tmp/msg$$ && (/usr/bin/formail -b -t -I 'Content-Type: text/html; charset=ISO-8859-1' -r </tmp/msg$$ && QUERY=`/bin/grep Subject /tmp/msg$$ | /bin/sed -e 's/Subject: *//;'` && /bin/rm /tmp/msg$$ && /usr/bin/lynx -source "$QUERY") | /usr/sbin/sendmail -t -F 'Google mail server' -f

Then, edit your /etc/aliases file, and add in something like:

google: "|/usr/local/bin/"

Run newaliases and you're all set. Send email to the google user on your host, and put your query string in the subject field. You'll get back a nice HTML email with up to 100 results from Google.

It's more or less a one-liner. (You'll need lynx and formail, which comes with procmail, installed.)
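
For anyone who would rather see the moving parts one at a time, the subject-to-query step can be sketched on its own. This is an illustration, not the script above: the function name `subject_to_query` and the sample message are made up, and anchoring the pattern to the start of the line and keeping only the first match are small tightenings that the one-liner does not have.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original one-liner): read a mail
# message on stdin and print the text of its first Subject: header.
subject_to_query() {
    # -n suppresses sed's default output; the p flag prints only the lines
    # where the substitution matched. head keeps the first match, so a
    # body line that happens to start with "Subject:" is not picked up.
    sed -n 's/^Subject: *//p' | head -n 1
}

# Made-up sample message, for illustration only.
subject_to_query <<'EOF'
From: someone@example.com
To: google@example.com
Subject: shell scripting

Subject: this body line is not used
EOF
```

Run as written, this prints `shell scripting`, which is the string the one-liner would hand to lynx as the query.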

Please don't come running to me with SOAP demos until they do something useful.

Think this curmudgeon's got it wrong? Feel free to teach me a trick or two.


2002-04-18 11:10:37
If everyone knew that you can do 95% of what ordinary people would do with SOAP + WSDL + VS.NET using HTTP and a few lines of script, they wouldn't have to replace their world wide web infrastructure with "SOAP Web" infrastructure! That would be bad for the economy! Bad for the Web Services bubble!

The Google Web API is an interesting example. I mainly want to get some results back from Google that I can read with a program or stick into a web page; why must I go through all that pain (even if it is automated via WSDL) to do that? They should just do what Meerkat does and give me an option on the query to return data in various "flavors".

2002-04-18 11:39:21
What's wrong with /bin/sh?
2002-04-18 11:42:27
Why must we always be consumers?
A search over email is just a bad idea; it has nothing to do with SOAP. Where SOAP (or XMLRPC or whatever) should be used is in small, frequent transactions for which lynx would be too much overhead.

Hmmm ... small and frequent, invisibly innocuous. That sure doesn't describe how I consume Google output, but does it describe how I'd want to input to Google? What if there were a bookmarklet that did a javascript soapish thing under the heading of "Index this page"?

2002-04-18 17:17:54
There are more possibilities...
True, google-by-email isn't a very exciting development, but that just shows a lack of imagination on the part of the inventors. There are plenty of things that the SOAP API makes a lot easier -- Rael Dornfest's auto-ego-surfer (see "Googling for Rael" on ) for instance.

Now SOAP may be overkill -- I'd personally prefer a RESTlike interface -- but it's a darn sight simpler and more reliable than having to write a screen-scraper.

2002-04-19 02:33:05
An example only using bash and sed
The use of all those external programs isn't really necessary. If using bash as shell, 'sed' is the only program you need apart from the shell:

(QUERY=OReilly ; \
echo "GET /search?q=$QUERY" 1>&3 & cat 0<&3) \
3 sed -e 's/^

]*\)>.*/\1/p' -e 'd'

2002-04-19 02:37:36
An example only using bash and sed (proper markup this time)

(QUERY=OReilly ; \
echo "GET /search?q=$QUERY" 1>&3 & cat 0<&3) \
3 sed -e 's/^

]*\)>.*/\1/p' -e 'd'

2002-04-19 04:19:55
An example only using bash and sed (ok, one more time)
QUERY=OReilly ; \
echo "GET /search?q=$QUERY" 1>&3 & cat 0<&3) \
3</dev/tcp/ \
| sed -e 's/^<p><a href=\([^>]*\)>.*/\1/p' -e 'd'
2002-04-19 04:22:59
An example only using bash and sed (This is getting ridiculous)

(QUERY=OReilly ; \
echo "GET /search?q=$QUERY" 1>&3 & cat 0<&3) \
3</dev/tcp/ \
| sed -e 's/^<p><a href=\([^>]*\)>.*/\1/p' -e 'd'
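
The sed half of that pipeline can be tried without any network connection. The following is a sketch under assumptions: the helper name `extract_links` and the sample markup are invented for illustration, and the sample only imitates the line shape the pattern expects (a line starting with `<p><a href=` and an unquoted URL).

```shell
#!/bin/sh
# Hypothetical helper wrapping the sed expression from the comment above.
extract_links() {
    # The s command keeps only the href value and its p flag prints the
    # line; the trailing d then deletes every line, matched or not, so
    # nothing is printed a second time and non-matching lines vanish.
    sed -e 's/^<p><a href=\([^>]*\)>.*/\1/p' -e 'd'
}

# Made-up sample input standing in for a results page.
extract_links <<'EOF'
<html><body>
<p><a href=http://www.example.com/one>First result</a>
<b>not a result line</b>
<p><a href=http://www.example.com/two>Second result</a>
</body></html>
EOF
```

With this input, only the two example.com URLs come out, one per line.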
2002-04-19 06:37:42
The original weblog entry misses the point magnificently.

1: A SOAP interface allows Google to completely change their appearance without breaking screen scrapers. Please don't think me a zealot for SOAP, though; I'd be just as happy, if not happier, with XML-RPC or something REST-like.

2: Anytime you see the phrase "Grab your local Linux/BSD box of preference" alarm bells should be going off in your head. Mr. Dumbill makes the common mistake, "It's what I use, therefore it's what everybody should use."

Fortunately no one is out there dictating what we all have to use for computers or for development languages. With SOAP and other _protocols_, I've got choices, real OS and language choices. That's why we have protocols, so we aren't locked into one solution if it doesn't fit our needs.

3: You'll get no argument that sending out a press release about being able to query Google via email is sad, but the response to it was equally goofy and sad. jenglish's response pointing out the lack of imagination was a much better one.
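
Point 1 is easy to demonstrate with the sed pattern from the bash-and-sed comments earlier in the thread. This is a sketch with invented data: the helper name `extract_links` and both snippets of markup are hypothetical, and the "redesign" is imagined purely to show how a layout change affects a scraper.

```shell
#!/bin/sh
# The scraper pattern from the bash-and-sed comments, wrapped in a
# hypothetical helper.
extract_links() {
    sed -e 's/^<p><a href=\([^>]*\)>.*/\1/p' -e 'd'
}

# Made-up markup in the shape the pattern expects.
old_page='<p><a href=http://www.example.com/one>First result</a>'
# The same result after an imagined redesign: a different wrapper tag.
new_page='<div class=r><a href=http://www.example.com/one>First result</a></div>'

# Count the links the scraper finds in each layout. tr strips the padding
# that some wc implementations add around the number.
old_count=$(printf '%s\n' "$old_page" | extract_links | wc -l | tr -d ' ')
new_count=$(printf '%s\n' "$new_page" | extract_links | wc -l | tr -d ' ')

echo "old layout: $old_count link(s)"
echo "new layout: $new_count link(s)"
```

The link is still on the redesigned page, but the scraper finds nothing, and it fails silently rather than with an error. A structured interface, whether SOAP, XML-RPC or something REST-like, does not change when the page layout does.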