What's surprising about the AOL username theft...

by Eric M. Burke

Related link: http://www.usatoday.com/tech/news/2004-06-24-aol-tick-tock_x.htm

An AOL insider recently sold a bunch of AOL usernames to a spammer. All the big media outlets picked up on this story. As a consultant who has worked for numerous companies, I'm not at all surprised this happened.

Here is a dirty little secret our industry does not talk about. Many companies do not protect sensitive information from so-called "insiders". In fact, just about every company I have consulted for over the years has given me unfettered access to tens of thousands of SSNs, names, addresses, even employee salaries.

None of these companies did a background check on me, nor did they ever make me sign any confidentiality agreement.

In most cases, lack of privacy is directly attributable to laziness. In every case I can think of, I had access to this sensitive information because that's how many companies create so-called "test" databases. They just do a raw dump of live data and give the entire programming staff complete access to the data.

This must stop. We must be more careful with private information! As a consumer who is also a programmer, I know that my own personal information (like address, SSN, account numbers) is freely available to thousands of programmers worldwide within company walls. Yep, I'm scared. And I also know that when my identity is stolen, the burden is on ME to clean up the mess.

When a "big leak" like the AOL leak occurs, companies are exposed to tremendous legal and financial risk. Is it worth the cost just because you are too lazy to scrub your test data?


2004-06-25 12:50:57
I come across this all the time
At the very least a company should randomize fields so that no record describes a real person (i.e., the SSN is valid, but the name, address, ZIP code, etc. do not belong to that SSN).
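One way to picture that kind of randomizing is a minimal Python sketch that shuffles each chosen column independently across all rows, so the values stay realistic but no longer line up with the right SSN. The function name scramble_records, the field names, and the sample rows are all made up for illustration; nothing here comes from a real system.

```python
import random


def scramble_records(records, fields, seed=None):
    """Return copies of the records with each listed field shuffled across rows.

    Hypothetical helper: every value still appears exactly once, but it is
    (most likely) attached to a different record than before.
    """
    rng = random.Random(seed)
    scrambled = [dict(r) for r in records]  # copy, so the input is untouched
    for field in fields:
        values = [r[field] for r in scrambled]
        rng.shuffle(values)  # shuffle this one column on its own
        for record, value in zip(scrambled, values):
            record[field] = value
    return scrambled


# Made-up sample rows; the SSN-shaped strings are placeholders, not real numbers.
sample = [
    {"ssn": "000-00-0001", "name": "Alice Example", "address": "1 First St"},
    {"ssn": "000-00-0002", "name": "Bob Sample", "address": "2 Second Ave"},
    {"ssn": "000-00-0003", "name": "Carol Test", "address": "3 Third Rd"},
    {"ssn": "000-00-0004", "name": "Dave Dummy", "address": "4 Fourth Ln"},
]
scrambled = scramble_records(sample, ["name", "address"], seed=42)
```

A plain shuffle can leave a value in its original row by chance, and every real value is still present somewhere in the dump, so this is only the "at the very least" option, not real anonymization.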
2004-06-26 05:32:46
I come across this all the time
This will not work: often the data must be internally consistent, and programs that depend on that consistency will fail on invalid records.
2004-06-26 08:00:10
I come across this all the time
It depends on what you are doing and how much validation your application performs on record data. Often you validate ZIP codes and addresses (maybe phone numbers), and usually that is about all you can do to test record integrity. Unless you are doing full-blown external-party validation (e.g., credit checks) on a system (remember, this is a testing system we are talking about), you don't need every field to belong to the same person.
Having said all of this, generating completely fictitious data is better still. It is a lot harder, of course.
2004-06-28 19:54:17
I come across this all the time
Is it? Anyone equipped with a very high level language (Perl, Python, Ruby, pick your poison) and a few rules should be able to generate large dumps of valid-looking data in very little time. And it's work that can be recycled easily, once it's done the first time.
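As a rough illustration of the "few rules" idea, here is a short Python sketch that builds fictitious records from small word lists and a seeded random generator. Every name in it (the word lists, fake_ssn, fake_record, generate) is invented for this example, and the SSN rule assumes that the 000 area prefix is never assigned to a real person.

```python
import random

# Made-up building blocks; swap in larger lists for more variety.
FIRST_NAMES = ["Alex", "Jamie", "Morgan", "Riley", "Taylor"]
LAST_NAMES = ["Example", "Sample", "Tester", "Placeholder", "Dummy"]
STREETS = ["Main St", "Oak Ave", "Elm Rd", "Maple Ln", "Cedar Ct"]


def fake_ssn(rng):
    # SSN-shaped string; the 000 prefix is assumed never to be issued.
    return f"000-{rng.randint(1, 99):02d}-{rng.randint(1, 9999):04d}"


def fake_record(rng):
    return {
        "name": f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}",
        "ssn": fake_ssn(rng),
        "address": f"{rng.randint(1, 9999)} {rng.choice(STREETS)}",
        "zip": f"{rng.randint(0, 99999):05d}",  # five random digits, not a checked ZIP
        "salary": rng.randrange(30000, 150001, 500),
    }


def generate(count, seed=0):
    """Return `count` fictitious records; the same seed gives the same dump."""
    rng = random.Random(seed)
    return [fake_record(rng) for _ in range(count)]


records = generate(1000, seed=2004)
```

Because the generator is seeded, the same dump can be rebuilt on any machine, which is what makes the work easy to recycle. If the application validates ZIP codes against addresses, the random five-digit rule would need to be replaced with a lookup table of matching pairs.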