Weakening Google

by David Sims

Some interesting news this morning, via Dave Winer, that of all the mentions of Disney CEO Michael Eisner online - not just material published by Disney or its subsidiary ABC, but every article online from the New York Times, Wall Street Journal, Business Week, Forbes, The Economist and so on - the top result on Google is Tim O'Reilly's recent weblog on Eisner's disingenuous comments before a Senate subcommittee on the Internet and copyright. (A post by Dave Winer is number two.)

First reaction: That's amazing and fantastic! Google's the most respected search engine on the web these days (previous title holders, though we don't like to admit it, included Hotbot, Alta Vista, and Yahoo) and it's found a way to punch through the grip of big-name corporate media, the East Coast gatekeepers of sanctioned news. It's a happy day when a dissenting opinion gets so prominent a voice.

Counter reaction: Google's being weakened by its reliance on webloggers and their crosslinks. Tim's blog is interesting, but even though he signs my paycheck, I can't convince myself that his comments on Eisner's testimony are the most important source on Eisner on the Web. The prominence of Tim's blog is a great chuckle among those of us who have our noses stuck so far up the weblog/Google/RSS information chain that we can't see daylight unless someone blogs it. But it could make Google a less valuable tool for mainstream (whatever that is) users. You may not like Disney or its behavior, but Eisner has been one of the most important figures in the business of media over the past 20 years. Disney's such a powerful force these days that we forget that it was on the ropes when Eisner took the helm in the mid-1980s. Someone researching Eisner online might want to know that.

I raised a similar point a few weeks ago with a prominent blogger who declined to politely agree and nod gravely at my concern. Bloggers, he pointed out, are a highly intelligent lot. And if they decide to vote with their links that something is important, it is. True, but bloggers are hardly any more representative than the folks at the Washington Press Club. Like journalists, they have their biases: they lean towards the left, they lean towards the techy, and they lean towards open source.

If Google wants to evolve into a functional resource for all users, it will have to work itself off this current path, or it will open up an opportunity for The Next Great Search Engine.


2002-05-06 17:12:23
Googles of a different colour
This Eisner example is not the only place where the very foundation of Google's index model betrays its skew. Take any common term that has a GNU/Linux connotation and look at what comes up: try 'gnu' or 'gnome' or any one of a thousand common words whose history and significance long predate the infamous Torvalds posting.

The problem is, of course, that the Google model is self-selecting: because it rates based on links, it reflects the opinions not of subject experts or even just fans (as it claims) but of people able to code the links. That's clearly a techie subculture.

Most non-technical emails I receive have no .signature (and it would take some explaining to tell them why I put a dot in front of that word) so their postings on mailing list archives do not feed Google. Even in non-techie 'blog-like things' you will find a distinct lack of links, or links in plain text (not wrapped by A tags); all of this conspires to skew Google to the geek.

The solution? I wouldn't want a "correction factor" applied to Google (although maybe there's some merit in this) but instead what is needed is a new form of "blog-like thing" that would appeal to my mom and all the others out there who have no pressing need to be published, but who have relevance metadata to share. Maybe it will come when all our hard drives are online in one giant Gnutella index so that, by owning any bookmark, you've voted for that site.

2002-05-06 17:21:38
Googles of a different colour
say ... could this be a means for Mozilla to fund itself? My guess is there are a lot of people, now not just Google, who would pay per-hit for a Preferences/Privacy+Security/Bookmarks/Share+with+Mozilla+Partners option ...

2002-05-06 17:57:34
Different kinds of documents in the same list
The problem is that criticism against Eisner as well as his biography and interviews with him end up in the same list, sorted as one big unhappy category.
If Google knew what Tim was writing about, it could label the text "Criticism" and it could label Eisner's own page "Personal page", and so on. That's what they should be working on, in my opinion.


2002-05-07 13:57:32
Teoma has already solved these problems
Teoma has a fundamentally better search algorithm than Google because:
1. It categorizes web pages according to categories that are created on-the-fly so that a search for Disney returns results under Walt Disney, Disney World, Animation Arts, Disney Comics, Theme Parks, etc.
2. Rather than looking at all links on the web to establish PageRank, Teoma looks at a subset of pages that are deemed to be relevant to the topic at hand. In other words, bloggers' links would carry more weight in a search for pages about blogging than in a search for pages about Disney. This also prevents bloggers from sticking links on all of their pages in an effort to boost the rank of a particular site, like what happened with xenu.net. Bloggers could spam their pages with as many links as they like, but such references would have little weight in the search, because links from other Scientology-related pages are deemed more important for ranking xenu.net than links from pages in general.
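The difference described in point 2 can be sketched roughly as follows. This is only an illustration of the idea as stated above, not Teoma's actual algorithm: the helper names (`topic_voters`, `count_inbound`), the topic labels, and every page name other than xenu.net are hypothetical, and a plain inbound-link count stands in for a real ranking score.

```python
def topic_voters(page_topics, topic):
    """Return the set of pages that are labelled with the given topic."""
    return {page for page, topics in page_topics.items() if topic in topics}


def count_inbound(links, target, voters=None):
    """Count links pointing at target, optionally only from pages in voters."""
    return sum(
        1
        for source, targets in links.items()
        if target in targets and (voters is None or source in voters)
    )


# Hypothetical link graph: three bloggers all link to xenu.net, while most of
# the pages that are actually about Scientology link to a made-up "other_site".
links = {
    "blog_1": ["xenu.net"],
    "blog_2": ["xenu.net"],
    "blog_3": ["xenu.net"],
    "scientology_page_1": ["other_site"],
    "scientology_page_2": ["other_site"],
    "scientology_page_3": ["xenu.net"],
}

# Hypothetical topic labels for each linking page.
page_topics = {
    "blog_1": {"blogging"},
    "blog_2": {"blogging"},
    "blog_3": {"blogging"},
    "scientology_page_1": {"scientology"},
    "scientology_page_2": {"scientology"},
    "scientology_page_3": {"scientology"},
}

# Web-wide count: every link is a vote, so the bloggers dominate.
global_votes = {site: count_inbound(links, site) for site in ("xenu.net", "other_site")}

# Topic-restricted count: only pages about the query topic get a vote.
voters = topic_voters(page_topics, "scientology")
topic_votes = {site: count_inbound(links, site, voters) for site in ("xenu.net", "other_site")}

print("all links:        ", global_votes)
print("topic links only: ", topic_votes)
```

With every link counted, the bloggers' links put xenu.net on top; restricted to pages on the topic, their links drop out and the order reverses.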

In sum, Google returns what the web considers to be a "hot" page - meaning that if lots of bloggers are linking to it, there must be something special about the site. Teoma, on the other hand, returns more technical and detailed pages that are more highly regarded within a specific community.

At least in theory. It doesn't work that way in reality for a number of reasons, one of which is that Google's index is ten times as big.

2002-05-08 11:19:53
I disagree
I think you incorrectly hypothesize what Google is meant to represent. Google represents what people on the web think is important. Tim O'Reilly is highly respected among people on the web, and one shouldn't dismiss the idea that perhaps, on the web, what Tim thinks is of more importance than any other site.

Your comments on Eisner's importance refer to things about him which people on the web don't think are relevant to his importance. If people on the web really thought that highly of Eisner, there would likely be a bunch of pages which would provide, if you wish, more 'informative' information. Strangely, you didn't mention a site which you think should have gotten a higher ranking than Tim's weblog.

It's a basic rule of science: don't necessarily change the methodology just because you disagree with the results. It may be cause for suspicion, but without a doubt, Google's methodology has the track record of producing the best results.