O'Reilly Hacks



HACK
#87
Google Whacking
With over 2 billion pages in its index, is it possible to get only one result for a search?

[03/13/03]

With an index of over 2 billion pages, Google attracts lots of interest from searchers. New methods of searching are tested, new ways of classifying information are explored, new games are invented.

New games are invented? Well, yes, actually. This is the Internet, after all.

The term "Google whacking" was coined by Gary Stock. The idea is to find a two-word query that returns exactly one result. The two words may not be enclosed in quotes (that's too easy), and both must be found in Google's own dictionary (no proper names, made-up words, etc.). If the one result comes from a word list, such as a glossary or dictionary, the whack is disqualified.
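
The shape of a valid attempt can be sketched as a quick pre-check in Perl. This is only an illustration: the sub name `looks_like_whack_candidate` is hypothetical, and it tests just the form of the query (exactly two unquoted words). Whether both words are in Google's dictionary, and whether the single result is a word list, still has to be checked by hand.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the query has the right shape for a
# Google whack attempt (exactly two words, no quotation marks), else 0.
sub looks_like_whack_candidate {
    my ($query) = @_;
    return 0 if $query =~ /"/;        # quoted phrases are not allowed
    my @words = split ' ', $query;    # split on runs of whitespace
    return @words == 2 ? 1 : 0;
}

for my $q ('nebbish orthodontia', '"peccable oink"', 'endoscopy') {
    printf "%-22s %s\n", $q,
        looks_like_whack_candidate($q) ? 'candidate' : 'not a candidate';
}
```

Only the first of the three sample queries passes; the second is quoted and the third is a single word.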

If you manage a Google whack (and it's harder than it sounds), be sure to list your find on the official Whack Stack (http://www.googlewhack.com/). Perusing the most recent 2,000 whacks is highly recommended if your brain is stuck and you need a little inspiration. Examples include "endoscopy cudgels," "nebbish orthodontia," and "peccable oink."

Are you stuck for a Google whack query? This hack should help. It takes a random word from each of two "word of the day" sites and queries Google in hopes of a Google whack (or as experienced players would say, "To see if they make a whack").

#!/usr/local/bin/perl
# google_whack.pl
# An automated Google whacker.
# Usage: perl google_whack.pl

# Your Google API developer's key
my $google_key='insert key here';

# Location of the GoogleSearch WSDL file
my $google_wdsl = "./GoogleSearch.wsdl";

use strict;

# Use the SOAP::Lite and LWP::Simple Perl modules
use SOAP::Lite;
use LWP::Simple;

# Generate some random numbers to be used as dates for choosing 
# random word one.
srand(  );
my $year  = int( rand(2) ) + 2000;
my $month = int( rand(12) ) + 1; 
$month < 10 and $month = "0$month";
my $day = int( rand(28) ) +1;
$day < 10 and $day = "0$day";

# Pulling our first random word from Dictionary.com.
# LWP::Simple's get() returns undef on failure and does not set $!,
# so the error message leaves it out.
my $whackone = 
  get("http://www.dictionary.com/wordoftheday/archive/$year/$month/$day.html") 
  or die "Couldn't get whack word 1";
($whackone) = 
  ($whackone =~ /<TITLE>Dictionary\.com\/Word of the Day: (.*)<\/TITLE>/i);

# Generate a new year between 1997 and 2001 for choosing
# random word two (the generator is already seeded above)
$year  = int( rand(5) ) + 1997;

# Pulling our second random word from the now-defunct Maven's 
# Word of the Day (thank goodness for archives)
my $whacktwo = 
  get("http://www.randomhouse.com/wotd/index.pperl?date=$year$month$day") 
  or die "Couldn't get whack word 2";
($whacktwo) = ($whacktwo =~ m!<h2><B>(.*)</b></h2>!i);

# Build our query out of the two random words
my $query = "$whackone $whacktwo"; 

# Create a new SOAP::Lite instance, feeding it GoogleSearch.wsdl
my $google_search = SOAP::Lite->service("file:$google_wdsl");

# Query Google
my $results = $google_search -> 
    doGoogleSearch(
      $google_key, $query, 0, 10, "false", "",  "false",
      "", "latin1", "latin1"
    );

# A single result means a possible Google whack
if ($results->{'estimatedTotalResultsCount'} == 1) {
  my $result = $results->{'resultElements'}->[0];
  # Parenthesize the fallbacks: "." binds tighter than "||"
  print 
    join "\n",
      "Probable Google whack for $query",
      "Title: " . ($result->{title} || 'no title'),
      "URL: $result->{URL}",
      "Snippet: " . ($result->{snippet} || 'no snippet'),
      "\n";
}

# Anything else is Google jack 
else {
  print "Google jack for $query, with " . 
    $results->{'estimatedTotalResultsCount'}  . " results\n";
}
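
Two small pieces of the script are easy to try offline, without a Google key or network access: picking a random, zero-padded archive date, and falling back to a default when a result field is empty. The sketch below is one possible way to factor them out. The sub names `random_date_parts` and `field_or` are hypothetical and not part of the original hack.

```perl
use strict;
use warnings;

# Hypothetical helper: pick a random date within a range of years and
# return year, month, and day, with month and day zero-padded.
sub random_date_parts {
    my ($first_year, $num_years) = @_;
    my $year  = $first_year + int( rand($num_years) );
    my $month = sprintf '%02d', 1 + int( rand(12) );
    my $day   = sprintf '%02d', 1 + int( rand(28) );  # 28 is valid in every month
    return ($year, $month, $day);
}

# Hypothetical helper: return the value, or a default if it is undef or empty.
sub field_or {
    my ($value, $default) = @_;
    return ( defined($value) && length($value) ) ? $value : $default;
}

my ($year, $month, $day) = random_date_parts(2000, 2);
print "Archive path: $year/$month/$day.html\n";
print "Title: " . field_or(undef, 'no title') . "\n";
```

Wrapping the fallback in a sub sidesteps a common Perl precedence trap: `.` binds more tightly than `||`, so an expression of the form `"Title: " . $value || 'default'` tests the whole concatenated string, which is never empty.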


© 2007 O'Reilly Media, Inc.