Building a SPAM Honeypot with Jakarta James Open Source Mail Server
by Kevin Bedell
James is really cool. If you're a Java hacker like me, the whole idea of having a powerful, open-source email server that you can endlessly modify and extend is awesome. James has been around for a while and is already past release 2.0, so it's pretty solid as well.
Recently, some discussion broke out on the james-user list about building some extensions to James to allow it to perform as a 'spam honeypot'. The idea is to write a James-based application that waits for incoming spam and captures real-time statistics on the spam it receives.
One of the ideas generated was to use the application to dynamically create new spam filters as you discover new sources for spam.
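To make the idea concrete, here is a minimal sketch of the bookkeeping such a honeypot might do: count the messages arriving from each sending host and treat a host as a filter candidate once it crosses a threshold. This is plain Java and deliberately does not touch the James API. The class name SpamSourceTracker, its methods, and the threshold rule are all assumptions made up for illustration, not anything proposed on the list.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of James: counts honeypot hits per sending
// host and remembers which hosts have reached the block threshold.
class SpamSourceTracker {
    private final Map<String, AtomicInteger> hitsBySource = new ConcurrentHashMap<>();
    private final Set<String> blocked = ConcurrentHashMap.newKeySet();
    private final int threshold;

    SpamSourceTracker(int threshold) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be >= 1");
        }
        this.threshold = threshold;
    }

    // Records one message from sourceIp; returns true if that source is now blocked.
    boolean recordHit(String sourceIp) {
        int hits = hitsBySource
                .computeIfAbsent(sourceIp, k -> new AtomicInteger())
                .incrementAndGet();
        if (hits >= threshold) {
            blocked.add(sourceIp);
        }
        return blocked.contains(sourceIp);
    }

    // Number of messages seen so far from sourceIp (0 if never seen).
    int hitsFor(String sourceIp) {
        AtomicInteger counter = hitsBySource.get(sourceIp);
        return counter == null ? 0 : counter.get();
    }

    boolean isBlocked(String sourceIp) {
        return blocked.contains(sourceIp);
    }
}
```

A mail-processing hook would call recordHit with the connecting host's address for every message delivered to a honeypot address, and consult isBlocked before accepting mail for real users. A real filter would probably also want time windows and expiry, which this sketch leaves out.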
Another subscriber recommended just working with the folks at Spamhaus instead. These people have tracking spam sources from around the globe down to a science.
For example, Spamhaus discovered that a spammer has been executing a 'dictionary'-type attack on Hotmail for over 5 months. According to them, this person has been testing email address combinations at a rate of 3-4 per second, 24 hours a day, 7 days a week. They were able to track the offenders down to a series of email servers based in Beijing, China that they believe are owned by American spammers.
So how would developing a spam honeypot with Apache James help? Well, to begin with, it would allow spam filters to be updated dynamically on that installation of James.
And if a group of people were to collaborate on a James mail-app that could capture spam and update a central, shared database, then it might be possible for *all* servers running the mail-app to notify each other (through the shared database) whenever a new source of spam was found.
Of course, there are already DNS-based Real-time Black Lists (RBLs) of spam senders, but using this approach you could filter spam using much more than just reverse-DNS info on the sender. You could perform all kinds of analysis on the content of the spam as well.
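For comparison, this is roughly what a check against an existing DNS-based list looks like. By the usual convention, the sender's IPv4 octets are reversed and prepended to the list's zone, and the resulting name is looked up in DNS; if it resolves, the address is listed. The sketch below uses only the standard library. The class name DnsblLookup is made up for illustration, and the zone dnsbl.example.org is a placeholder rather than a real list, so substitute the zone of whichever list you actually use.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of James: queries a DNS-based blacklist.
class DnsblLookup {

    // Builds the name to query: IPv4 octets in reverse order, followed by the zone.
    static String queryName(String ipv4, String zone) {
        String[] octets = ipv4.trim().split("\\.");
        if (octets.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + ipv4);
        }
        for (String octet : octets) {
            int value = Integer.parseInt(octet); // NumberFormatException on junk
            if (value < 0 || value > 255) {
                throw new IllegalArgumentException("octet out of range: " + octet);
            }
        }
        return octets[3] + "." + octets[2] + "." + octets[1] + "." + octets[0] + "." + zone;
    }

    // Returns true if the query name resolves, which by convention means "listed".
    // A failed lookup is treated as "not listed" here; that is a design choice.
    static boolean isListed(String ipv4, String zone) {
        try {
            InetAddress.getByName(queryName(ipv4, zone));
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder zone; replace with a real list's zone before relying on the result.
        System.out.println(queryName("192.0.2.99", "dnsbl.example.org"));
    }
}
```

The lookup answers only one question, namely whether an address is on the list. A honeypot application sees the whole message, so it can key its filters on headers and body content too.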
Sort of grid-computing's answer to spam trapping...
spam needs more than analysis
Unfortunately, you can know a whole lot about spammers and still not be able to do anything about them other than filter their mail on criteria that can easily change. I use an RBL, and that helps a lot, but it clearly isn't enough. I hear there is some progress on the legal front, but that will still probably be limited and inefficient. I have been entertaining another idea which I haven't had time to toy with...
re: spam needs more than analysis
I don't think this is a good idea, because spammers don't leave their original e-mail addresses.