Bayesian classification in the wilds

by Uche Ogbuji

The hype about all the futuristic applications that Bayesian classifiers will usher in is getting rather aged. But bump the hype. There's nothing like running code. Skip Montanaro shows how easy it was to write some Python code to apply the SpamBayes project to a real-life problem: he had a large database (of concert and event information) that he needed to cull for bad data. I think this is very promising stuff.
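
To give a flavor of the idea, here is a minimal sketch of my own. It is not Skip's script and it does not use the SpamBayes code or its scoring method; it is a plain naive Bayes classifier with add-one smoothing, and the class name, the "good"/"bad" labels, and the sample event records are all made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Crude whitespace tokenizer; a real project would do much more here.
    return text.lower().split()


class RecordClassifier:
    """Toy two-class naive Bayes (hypothetical, for illustration only)."""

    LABELS = ("good", "bad")

    def __init__(self):
        self.token_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()

    def train(self, text, label):
        self.doc_counts[label] += 1
        self.token_counts[label].update(tokenize(text))

    def prob_bad(self, text):
        """Return the estimated probability that a record is bad."""
        vocab = set()
        for counts in self.token_counts.values():
            vocab.update(counts)
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label in self.LABELS:
            counts = self.token_counts[label]
            total_tokens = sum(counts.values())
            # Smoothed class prior, so an untrained class never yields log(0).
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            for token in tokenize(text):
                # Add-one (Laplace) smoothing for unseen tokens.
                score += math.log(
                    (counts[token] + 1) / (total_tokens + len(vocab) + 1)
                )
            log_scores[label] = score
        # Normalize in a numerically safe way.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["bad"] / sum(weights.values())


# Made-up training records standing in for an event database.
clf = RecordClassifier()
clf.train("jazz trio at blue note 8pm", "good")
clf.train("symphony orchestra city hall 7pm", "good")
clf.train("asdf test test entry", "bad")
clf.train("test xxxx delete me", "bad")

suspect = clf.prob_bad("test entry asdf")
plausible = clf.prob_bad("jazz orchestra 8pm")
```

With training data this tiny the numbers mean little, but the shape is the point: hand-label a few records, train, then review whatever scores high on the "bad" side.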


2003-04-17 09:49:51
Hi Uche,

I caught Skip's messages to this effect, too, and it made me go looking for Orange, which is a big data mining, machine learning, and statistical analysis package in C with generous Python bindings. It seems right up the alley of your post here. I've been trying to use it to do some autocategorization of Usenet postings, and it's pretty easy to compile and use from Python.