O'Reilly Hacks

Dig Deeper into Sites
Dig deeper into the hierarchies of web sites matching your search criteria

The Code

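Save this code as deep_blue_g.cgi, a CGI script ["How to Run the Hacks" in the Preface], on your web server. As you type it in, replace the placeholder text insert key here with your Google API key. The script looks for the GoogleSearch.wsdl file at the relative path ./GoogleSearch.wsdl.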

#!/usr/local/bin/perl
# (The interpreter path above is an assumption; adjust it for your server.)
# deep_blue_g.cgi
# Limiting search results to a particular depth in a web 
# site's hierarchy.
# deep_blue_g.cgi is called as a CGI with form input.
# Your Google API developer's key.
my $google_key='insert key here';
# Location of the GoogleSearch WSDL file.
my $google_wdsl = "./GoogleSearch.wsdl";
# Number of times to loop, retrieving 10 results at a time.
my $loops = 10;
use SOAP::Lite;
use CGI qw/:standard *table/;
# Print the page header and the search form.
print
  header( ),
  start_html("Fishing in the Deep Blue G"),
  h1("Fishing in the Deep Blue G"),
  start_form(-method=>'GET'),
  'Query: ', textfield(-name=>'query'),
  br( ),
  'Depth: ', textfield(-name=>'depth', -default=>4),
  br( ),
  submit(-name=>'submit', -value=>'Search'),
  end_form( ), p( );
# Make sure a query and a numeric depth are provided.
# The pattern is anchored so that input such as "4abc" is rejected.
if (param('query') and param('depth') =~ /^\d+$/) {
  # Create a new SOAP object.
  my $google_search  = SOAP::Lite->service("file:$google_wdsl");
  # Loop exactly $loops times (offsets 0, 10, ..., ($loops-1)*10).
  for (my $offset = 0; $offset < $loops*10; $offset += 10) {
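    # Request the next 10 results, starting at $offset.
    my $results = $google_search -> 
      doGoogleSearch(
        $google_key, param('query'), $offset, 10, "false", "",  "false",
        "", "latin1", "latin1"
      );
    # Stop as soon as a request returns no results.
    last unless @{$results->{resultElements}};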
    foreach my $result (@{$results->{'resultElements'}}) {
      # Determine depth.
      my $url = $result->{URL};
      $url =~ s!^\w+://|/$!!g;
      # Output only those deep enough.
      ( split(/\//, $url) - 1) >= param('depth') and 
        print
          p(
            b(a({href=>$result->{URL}},$result->{title}||'no title')), br( ),
            $result->{URL}, br( ),
            i($result->{snippet}||'no snippet')
          );
    }
  }
}
# Close the HTML page whether or not a search was run.
print end_html;
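
The depth test at the heart of the script can be tried on its own, without a Google API key or a web server. What follows is a minimal sketch, not part of the original hack: url_depth is a hypothetical helper name, and the example.com URLs are made up for illustration. It applies the same substitution and split as the script, so the depth is the number of path segments that follow the host name.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): returns how many
# path segments follow the host name in a URL.
sub url_depth {
    my ($url) = @_;
    # Same substitution as the script: strip the leading scheme
    # (such as "http://") and a single trailing slash.
    $url =~ s!^\w+://|/$!!g;
    # The first field is the host name; every further field is one
    # level deeper in the site's hierarchy.
    my @parts = split /\//, $url;
    return @parts - 1;
}

# Made-up sample URLs, used only to illustrate the calculation.
for my $url (
    'http://www.example.com/',
    'http://www.example.com/a/b/',
    'http://www.example.com/a/b/c/page.html',
) {
    printf "%d  %s\n", url_depth($url), $url;
}
```

Run from the command line, this prints depths of 0, 2, and 4 for the three URLs. With the form's default Depth of 4, only the last of them would be shown.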

© 2007 O'Reilly Media, Inc.