Wacky Ideas from ETech 2004, day two

by chromatic

Related link: http://conferences.oreillynet.com/etech/

Tim's keynote this morning brought up several ideas he's been promoting for the past year. There's a lot of data in cyberspace that, if you measure it the right way, can teach you interesting things.

Take Netcraft, for example. It's brilliant to map web servers with a simple HEAD request (or whatever they use). Tens of millions of domain names map to millions of IP addresses. What kind of information can you learn if you trawl through DNS and check out what the normal browsing software doesn't tell you?
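Here's a rough sketch of what that kind of survey might look like in Python. It is not Netcraft's actual method (I don't know what they use); it only assumes that a server answers `HEAD /` on port 80 and reports itself in the `Server` header. The names `server_banner`, `product_name`, and `tally` are made up for illustration.

```python
import http.client
from collections import Counter


def server_banner(host, port=80, timeout=5.0):
    """Send HEAD / to host and return its Server header, or None on failure."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", "/")
        return conn.getresponse().getheader("Server")
    except (OSError, http.client.HTTPException):
        # DNS failure, refused connection, timeout, or a malformed reply.
        return None
    finally:
        conn.close()


def product_name(banner):
    """Reduce a banner such as 'Apache/1.3.29 (Unix)' to 'Apache'."""
    if not banner:
        return "unknown"
    # Drop the version after '/', then any trailing comment after a space.
    return banner.split("/", 1)[0].split(" ", 1)[0]


def tally(banners):
    """Count how many banners belong to each server product."""
    return Counter(product_name(b) for b in banners)
```

Feed `tally()` the results of `server_banner()` for whatever list of host names you have, and you get a crude market-share table. A real survey would need rate limiting and a lot more patience.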

Consider also programs that scrape Amazon and eBay for book ranks, prices, and so on. It's easier (though still not easy) to judge the potential audience for a book by looking at the performance of similar books in a subject area.

Microsoft Research's Netscan performs a similar service. Want to gauge the popularity of programming languages against each other? Compare the number of newsgroup posts about each language in the previous month.

Of course, not all discussion on the Internet takes place in archived fora such as web pages and Usenet. Here's my wacky idea, a refinement on the above theme: scan IRC channel names, topics, and the number of participants.

It's a little tricky because there are several different networks and, admittedly, quite a bit of seedy content in an ephemeral forum. Still, there's an awful lot of conversation there. It'd be a shame to continue to miss out on the trends you could follow.
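To make the idea a bit more concrete, here's a minimal sketch of the parsing half. It assumes you've already connected to one network, sent the `LIST` command, and captured the raw reply lines; a server answers with one numeric `322` (RPL_LIST) line per channel, carrying the channel name, the visible user count, and the topic. The function names `parse_list_reply` and `busiest` are hypothetical, and you'd have to repeat the whole exercise for each network.

```python
def parse_list_reply(line):
    """Parse one raw IRC line; return (channel, users, topic) for a 322 reply.

    Expected shape: ':server 322 mynick #channel 42 :topic text'
    Returns None for any other kind of line.
    """
    line = line.rstrip("\r\n")
    if line.startswith(":"):
        # Discard the ':server' prefix.
        _, _, line = line.partition(" ")
    parts = line.split(" ", 4)
    if len(parts) < 4 or parts[0] != "322":
        return None
    try:
        users = int(parts[3])
    except ValueError:
        return None
    topic = parts[4] if len(parts) == 5 else ""
    if topic.startswith(":"):
        topic = topic[1:]
    return parts[2], users, topic


def busiest(lines, n=10):
    """Return the n channels with the most participants, largest first."""
    rows = [r for r in map(parse_list_reply, lines) if r is not None]
    return sorted(rows, key=lambda r: r[1], reverse=True)[:n]
```

Snapshot that every hour or so, store the counts, and you'd have the IRC equivalent of the newsgroup numbers above. Filtering out the seedy channels is left as an exercise.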

Is anyone doing this? Are there other odd corners of the Internet that need aggregation and summarization?