Calculating Entropy for Data Mining
Eww, statistics. Right? Not necessarily--for example, calculating the entropy of your web statistics can help you analyze trends and correlations. Paul Meagher demonstrates statistical programming in PHP while explaining single-variable entropy.
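As a rough illustration of single-variable entropy (a minimal Python sketch, not the PHP code from Paul Meagher's article), the Shannon entropy of one column of observations can be computed from its value frequencies. The function name `entropy` and the sample page names are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy, in bits, of the observed values of one variable."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("entropy is undefined for an empty sample")
    # H = sum over outcomes of p * log2(1 / p), where p is the observed frequency.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical sample: which page each of four visits landed on.
visits = ["home", "about", "home", "about"]
bits = entropy(visits)
```

A variable that always takes the same value has zero entropy, while one spread evenly over many values has high entropy, which is what makes the measure useful for spotting how predictable a web statistic is.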
Using Bloom Filters
Perl hashes make set-membership tests easy, at the cost of memory. A lesser-known technique, the Bloom filter, accepts a tunable false-positive rate in exchange for compactness, and has interesting applications where privacy is a concern. Maciej Ceglowski explains the theory and practice of Bloom filters.
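To make the trade-off concrete, here is a minimal Python sketch of a Bloom filter (not the Perl implementation from Maciej Ceglowski's article). The class name `BloomFilter`, the parameters `num_bits` and `num_hashes`, and the choice of seeded SHA-256 digests as hash functions are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Compact set-membership test that may report false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One bit per slot, packed eight to a byte.
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by hashing the item with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True if every position is set: "possibly present".
        # False if any position is clear: "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


seen = BloomFilter()
seen.add("apple")
```

An item that was added is always reported as present. An item that was never added is usually reported as absent, but can occasionally collide with bits set by other items. Raising `num_bits` relative to the number of stored items lowers that false-positive rate at the cost of memory. Because the filter stores only bits and not the items themselves, it can also be shared without directly revealing its contents.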
Analyzing Baseball Stats with R
An introduction to one way of examining the abundance of raw data available on the web: using R to analyze baseball stats.
ANOVA Statistical Programming with PHP
Data miners and researchers often have to check whether the variation in their results is statistically meaningful. The Analysis of Variance (ANOVA) technique is a popular and effective way to gauge the effects of an experiment. Paul Meagher demonstrates how to use PHP, MySQL, and JpGraph for productive data-mining work.