The Fuss About Gmail and Privacy: Nine Reasons Why It's Bogus
Apr. 16, 2004 10:05 AM
There has been a rash of recent editorials about privacy concerns with Google's gmail service. A number of organizations have asked Google
to voluntarily suspend the service. One California legislator has gone so far as to say she plans to introduce a bill
to ban it. This is nuts! A number of things to consider:
There are already hundreds of millions of users of hosted mail services at AOL, Hotmail, MSN, and Yahoo! These services routinely scan all mail for viruses and spam. Despite the claims of critics, I don't see that the kind of automated text scanning that Google would need to do to insert context-sensitive ads is all that different from the kind of automated text scanning that is used to detect spam. (And in fact, those oppressed by spam should look forward to having Google's brilliant search experts tackle spam detection as part of their problem set!) Google doesn't have humans reading this mail; it has programs reading it. Yes, Google could instruct a program to mine the stored email for confidential information. But so could Yahoo! or AOL or MSN today. (Perhaps people feel Google is to be feared because they seem to be so good at what they do. But that seems rather an odd point of view.)
For that matter, the very act of sending an email message consists of having a number of programs on different machines read and store your mail. Every time you send an email message, it is typically routed through a number of computers to get to its destination. Run the traceroute command at a command prompt on any Linux or UNIX system (including Mac OS X) or tracert on a Windows system to see the hops that your internet packets go through from your machine to any destination site. Anyone equipped with a packet sniffer at any of those sites can snoop any mail that they want. In fact, the NSA recently proved the effectiveness of this approach by tracking down terrorists by way of their mail traffic.
The amount of personal data already collected by credit agencies and direct marketers dwarfs what might be gleaned from email. There are folks right now who know everything you've ever bought. Heck, just recently, I was shopping in Bath, England, and made a large purchase in an antiquarian bookshop. Fifteen minutes later, I was four buildings down the street in a second bookshop, tried to make another purchase, and had my card rejected. Meanwhile, back in California, my wife was receiving a call, wondering if the card had been stolen. "Why would someone halfway around the world be spending so much on books?" they wanted to know. That's real-time monitoring! Privacy advocates (and as a former board member of the EFF I count myself among them) argue that privacy is a slippery slope. But we're already a long way down that slope, and I have a lot more trust in Google to do the right thing to protect my privacy than I have in credit card and direct marketing companies! I certainly don't see why Google is being singled out. There are so many bigger issues to worry about, from RFID tagging to surveillance cameras on London street corners, that programmed scanning of email for targeted ad insertion doesn't seem like too big a deal to me, especially when it's disclosed up front to participants in the service.
Gmail's offer of extended storage means that hosted email accounts might appeal to more than the casual home user, resulting in the storage of more mission-critical messages, but considering that many businesses are already hosting critical business data at outside service providers like salesforce.com, I hardly think that is a show stopper.
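To make the point about mail passing through many machines concrete: traceroute shows the network-level hops, while the mail servers that actually handled a message each stamp it with a Received header. The sketch below is only an illustration, using Python's standard email module on a made-up message; the host names and addresses are hypothetical, and real Received headers vary a good deal in format, so the simple "by <host>" parsing here is an assumption rather than a robust parser.

```python
from email import message_from_string

# Hypothetical message: every host name and address below is invented.
RAW_MESSAGE = """\
Received: from mx.example.net (mx.example.net [192.0.2.25])
\tby inbox.example.org with ESMTP; Fri, 16 Apr 2004 10:05:03 -0700
Received: from relay.example.com (relay.example.com [192.0.2.10])
\tby mx.example.net with ESMTP; Fri, 16 Apr 2004 10:05:02 -0700
Received: from laptop.example.com (laptop.example.com [192.0.2.1])
\tby relay.example.com with ESMTP; Fri, 16 Apr 2004 10:05:01 -0700
From: alice@example.com
To: bob@example.org
Subject: hello

Hi Bob.
"""


def mail_hops(raw):
    """Return the servers that accepted the message, oldest hop first."""
    msg = message_from_string(raw)
    # Each server prepends its own Received header, so the list is newest first.
    received = msg.get_all("Received") or []
    hops = []
    for header in reversed(received):
        words = header.split()
        # Assumption: the accepting server follows the word "by".
        if "by" in words:
            position = words.index("by") + 1
            if position < len(words):
                hops.append(words[position])
    return hops


if __name__ == "__main__":
    for number, host in enumerate(mail_hops(RAW_MESSAGE), start=1):
        print(number, host)
```

Every host in that list ran a program that read and stored the message on its way through, which is all the argument above needs.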
People are also expressing concerns about Google's plan to insert targeted advertising into email sent with the service. Once again, I find myself baffled by the uproar. Some reasons:
No one is going to be forced to use gmail. If you don't like ads in your mail, don't use the service. Let the market decide. (Note: as far as I can tell, ads do not appear in outgoing mail, so there's no spamming of non-subscribers. Ads appear only in the mailbox of the gmail user. And as with Google adwords on search results, the ads appear in text boxes off to the side of the message, where they can easily be ignored if the information they provide is not useful.)
Google has a history of providing tasteful, unobtrusive, useful advertising. When all the other online services rushed to plaster their sites with bigger and more obnoxious banner ads, skyscrapers, popups, pop-unders, and screaming animations, Google held the line, and defined a new paradigm for advertising that no one seems to mind.
Meanwhile, I am entranced with the benefits that gmail will hopefully provide!
The ability to search through my email with the effectiveness that has made Google the benchmark for search. How many times have people asked, "When can I have Google to search my hard disk?" That's a hard problem, as long as it's just your disk, on your isolated machine. But it's solvable once Google has lots and lots of structured data to work with, and can build algorithms to determine patterns in that data. Gmail is Google's brilliant solution to that problem: don't search the desktop, move the desktop application to a larger, searchable space where the metadata can be collected and made explicit, as it is on the web.
The second-order search through "six degrees of separation" promised (but not yet delivered) by all of the social networking services such as Friendster, LinkedIn, Plaxo, and Google's own Orkut. These services are essentially a hack, designed to get around the fact that no one has yet re-invented the address book for the era of the internet. Why should I have to spam all my friends, asking them to "join my network", if my email client is smart enough to know who I know, how often I communicate with them, as well as who they know, and how well?
(Microsoft Research's Wallop project shows some interesting steps in this direction. Until I saw gmail, I was convinced that Microsoft would eventually own the social networking space by adding Wallop features to Outlook, since having access to the actual email traffic data and address book is so much more powerful than the workarounds that the social network services have to endure. I've been excited about the Chandler project because I saw its developers asking themselves how to reinvent the address book for the age of the internet -- thinking of contacts as private, public, or somewhere in between. But Chandler seemed like too little, too late to keep Microsoft from owning another promising new application category.)
Eventually, I imagine I'll be able to ask gmail, who do I know who can help me to reach someone I'm looking to meet...and get a reasonable answer, without any invasion of privacy. After all, I have lots of friends who know me well enough to make a recommendation to their friends, and pass on contact info if appropriate. Gmail wouldn't break any new social ground here -- it would just make it easier to find out who to ask, without revealing any confidential information. (Meanwhile, the existing social network services DO lead people to reveal lots of private information that could be misused by spammers and other electronic harvesters. Gmail could provide this information more securely.)
Storage of my critical data on one of the largest, most reliable data storage banks in the world. As Rich Skrenta made so clear in his recent weblog posting, Google is the shape of the future. Forget Moore's Law and Metcalfe's Law. Storage is getting cheaper faster than any other part of the technology infrastructure. I remember Bob Morris, head of IBM's Storage Division and the Almaden Research Labs, telling me a couple of years ago that before too long, storage would be cheap enough and small enough that someone who wanted to do so could film every moment of his life, and carry the record around in a pocket. Scary? Maybe. But the future is always scary to those who cling to the past. It is enormously exciting if you focus on the possibilities. Just think how much value Google and other online information providers have already brought to all of our lives -- the ability to find facts, in moments, from a library larger than any of us could have imagined a decade ago.
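The "who do I know who can help me reach someone" question raised above is, at bottom, a shortest-path search over a graph of contacts. The sketch below is purely illustrative: the names and the CONTACTS table are hypothetical, and nothing here reflects how gmail or any social networking service actually works. It assumes the graph has already been built, say from who has exchanged mail with whom, and uses a plain breadth-first search.

```python
from collections import deque

# Hypothetical contact graph: each person maps to the people they correspond with.
CONTACTS = {
    "me": ["ann", "raj"],
    "ann": ["me", "lee"],
    "raj": ["me"],
    "lee": ["ann", "kim"],
    "kim": ["lee"],
}


def introduction_path(contacts, start, target):
    """Return the shortest chain of acquaintances from start to target, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        person = path[-1]
        if person == target:
            return path
        for friend in contacts.get(person, []):
            if friend not in seen:
                seen.add(friend)
                queue.append(path + [friend])
    # No chain of introductions connects the two people.
    return None


if __name__ == "__main__":
    print(introduction_path(CONTACTS, "me", "kim"))
```

Only the person asking sees the chain, and the first name on it is simply the friend to ask for an introduction, which is how these things already work socially.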
Gmail is fascinating to me as a watershed event in the evolution of the internet. In a brilliant Copernican stroke, gmail turns everything on its head, rejecting the personal computer as the center of the computing universe, instead recognizing that applications revolve around the network as the planets revolve around the Sun. But Google and gmail go even further, showing that once internet apps truly get to scale, they'll make the network itself disappear into the universal virtual computer, the internet as operating system.
I've been dreaming this dream for years. At my conference on peer-to-peer networking, web services, and distributed computation back in 2001, Clay Shirky, reflecting on "Lessons from Napster", retold the old story about Thomas J. Watson, founder of the modern IBM. "I see no reason for more than five of these machines in the world," Watson is reputed to have said. "We now know that he was wrong," Clay went on. The audience laughed knowingly, thinking of the hundreds of millions, if not billions, of computers deployed worldwide. But then Clay delivered his punch line: "We now know that he overstated the number by four."
Pioneers like Google are remaking the computing industry before our eyes. Google of course isn't one computer -- it's a hundred thousand computers, by report -- but to the user, it appears as one. Our personal computers, our phones, and even our cars, increasingly need to be thought of as access and local storage devices. The services that matter are all going to run on the global virtual computer that the internet is becoming.
Until I heard about gmail, I was convinced that the future "internet operating system" would have the same characteristics as Linux and the Internet. That is, it would be a network-oriented operating system, consisting of what David Weinberger calls "small pieces loosely joined" (or more recently and more cogently, a "world of ends"). I saw this as an alternative to operating systems that work on the "one ring to rule them all" principle -- a monolithic architecture where the application space is inextricably linked with the operating system control layers. But gmail, in some sense, shows us that once storage and bandwidth become cheap enough, a more tightly coupled, centralized architecture is a real alternative, even on the internet. (I have to confess that was one of the wake-up calls to me in Rich Skrenta's piece, linked to above.)
But in the end, I believe that the world we're building is too complex for tight coupling to be the dominant paradigm. It will be a long time, if ever, before any one company is in control of enough programs and enough devices and enough data to start dictating to consumers and competitors what innovations will be allowed. We're entering a period of renewed competition and innovation in the computer industry, a period that will utterly transform the technology world we know today.
I love Dave Stutz's phrase, "software above the level of a single device." We're used to thinking of software as something that runs on the machine in front of us, its complex dance hidden by the blank metal and plastic of the hardware that houses it. But now, computers are everywhere, and each dance has many partners, a whirling exchange of data that will be made visible when and where we want it. It's not the machine or even the software that matters, it's the information and services that travel over the hardware and software "wires." Gmail's introduction of large amounts of free online storage for application data is an important next step in freeing us from the shackles of the desktop.
This isn't to say that there aren't important issues raised by the internet paradigm shift. The big question to me isn't privacy, or control over software APIs; it's who will own the data. What's critical is that gmail makes a commitment to data migration capabilities, so the service isn't a one-way door to the future. I want to be able to switch to alternate providers if the competition makes a better offer. The critical enabler is going to be the ability to extract my data and connections so that I can work with them on multiple devices, for example, syncing my laptop or phone with my gmail account rather than having to work only in a tethered fashion. I understand why gmail doesn't offer this feature now, but it's going to be essential in the long term.
Tim O'Reilly is the founder and CEO of O'Reilly Media, Inc., thought by many to be the best computer book publisher in the world. In addition to Foo Camps ("Friends of O'Reilly" Camps, which gave rise to the "un-conference" movement), O'Reilly Media also hosts conferences on technology topics, including the Web 2.0 Summit, the Web 2.0 Expo, the O'Reilly Open Source Convention, the Gov 2.0 Summit, and the Gov 2.0 Expo. Tim's blog, the O'Reilly Radar, "watches the alpha geeks" to determine emerging technology trends, and serves as a platform for advocacy about issues of importance to the technical community. Tim's long-term vision for his company is to change the world by spreading the knowledge of innovators. In addition to O'Reilly Media, Tim is a founder of Safari Books Online, a pioneering subscription service for accessing books online, and O'Reilly AlphaTech Ventures, an early-stage venture firm.
Weblog authors are solely responsible for the content
and accuracy of their weblogs, including opinions they
express, and O'Reilly Media, Inc., disclaims any and
all liability for that content, its accuracy, and
opinions it may contain.
This work is licensed under a
Creative Commons License.