ONJava.com -- The Independent Source for Enterprise Java
  Google Your Desktop
Subject:   Google search from a remote machine
Date:   2004-10-22 09:12:09
From:   arnaud.sahuguet
Here is a simple way to access the search tool from another machine.

To be able to send a query and get back the results, you simply need to configure Apache as a reverse proxy (see below).

Google search runs as a local web server on port 4664.
It returns the search results, but it DOES NOT serve the local files. When you click on a file link on the Google search screen in your browser, a request is sent to the local web server, but nothing is returned. Instead, the Google search program spawns a new process to display the file, launching Word, PowerPoint, etc.

To be able to click on the results and view them (they are local files, remember), you need a way to serve the files from the local machine to the remote machine. There are many solutions for this: mount the local filesystem remotely, have Apache serve those files, etc.

Here is my setting:
- google search running on host1 where the files are
- apache running on host1 as a reverse proxy
- apache running on host2 where the host1 file system is mounted
- client accessing the google search service through host1:80

In the Apache config, you need to:
- enable the various proxy modules
- enable mod_rewrite
- tell apache to reverse proxy requests to localhost on port 4664
- tell apache to rewrite google search redir requests (prefixed by /redir?url= in the link) to a web server that can display the document.

Note that local documents use the Windows path separator, so the path needs to be rewritten. To do that, I am using a simple Perl script that rewrites the path and returns an HTTP Location redirect.

Apache config:

ProxyPass /

<Location /redir>
RewriteEngine On
RewriteRule url=([^&]+) http://another-web-server/cgi-bin/show_file.cgi?$1 [R,L,NE]
</Location>
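
To see what the rewrite rule captures, the pattern can be tried outside Apache. The following Python sketch applies the same regular expression to a sample link. The function name, the sample path, and the extra src parameter are made up for illustration. It only exercises the regular expression and does not model how Apache decides which part of the request the pattern is matched against.

```python
import re

# Same pattern as in the RewriteRule.
URL_PARAM = re.compile(r"url=([^&]+)")

# Target prefix taken from the RewriteRule substitution.
TARGET = "http://another-web-server/cgi-bin/show_file.cgi?"


def redirect_target(link):
    """Return the redirect URL for a /redir link, or None if there is no url= parameter."""
    match = URL_PARAM.search(link)
    if match is None:
        return None
    # $1 in the rule is the captured value; it is appended as-is, without re-escaping.
    return TARGET + match.group(1)


if __name__ == "__main__":
    # Hypothetical link of the kind found on the search results page.
    print(redirect_target("/redir?url=C%3A%5Cdocs%5Cmy+report.doc&src=1"))
```

The captured value stops at the first &, so any parameters after url= are dropped before the request reaches the script.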

The show_file script is given below:

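#!/usr/bin/perl
use strict;
use warnings;
use URI::Escape;

my $file = $ENV{QUERY_STRING};
$file =~ s/\+/ /g;            # replace + by space (before unescaping, so an escaped %2B stays a +)
$file = uri_unescape($file);  # decode the %XX escapes
$file =~ s/\\/\//g;           # replace \ by /

print "Location: http://another-web-server/$file\n\n";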
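
For checking the path rewriting without a web server, here is a rough Python equivalent of the script's transformation. The function name and the sample file names are invented for the example. It decodes + before the percent escapes, so that an escaped plus sign (%2B) in a file name survives; that ordering is a choice made in this sketch.

```python
from urllib.parse import unquote

# Same host as in the Location header printed by the script.
BASE = "http://another-web-server/"


def location_for(query_string):
    """Turn the raw query string (an escaped Windows path) into the redirect URL."""
    path = query_string.replace("+", " ")  # '+' stands for a space
    path = unquote(path)                   # decode %XX escapes
    path = path.replace("\\", "/")         # Windows separator to URL separator
    return BASE + path


if __name__ == "__main__":
    # Hypothetical escaped path, as it would arrive in QUERY_STRING.
    print(location_for("C%3A%5Cdocs%5Cmy+report.doc"))
```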

One last thing. When accessing google search, you need to pass some special parameters in the initial query: e.g. &s=.

And it works. The only thing that does not work is access to "cached results". This requires rewriting the HTML page returned by google search, which can be done using mod_proxy_html (see this article for more info).
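
To give an idea of the kind of rewriting involved, here is a small Python sketch. It assumes, without having checked, that the returned page contains absolute links to http://localhost:4664/ and that the proxy is reachable as http://host1/. The function name and the sample link are invented. It is not a replacement for mod_proxy_html, only an illustration of the idea.

```python
import re

# Assumption: links in the page point at the local search server.
LOCAL_PREFIX = "http://localhost:4664/"

# Matches href="..." or src="..." attributes that start with the local prefix.
LINK_ATTR = re.compile(r'((?:href|src)=")' + re.escape(LOCAL_PREFIX))


def rewrite_links(html, public_prefix):
    """Point links at the reverse proxy instead of the local search server."""
    return LINK_ATTR.sub(lambda m: m.group(1) + public_prefix, html)


if __name__ == "__main__":
    # Hypothetical cached-result link.
    page = '<a href="http://localhost:4664/cache?id=1">cached</a>'
    print(rewrite_links(page, "http://host1/"))
```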
