Screenscraping the Senate (7 tags)
In Paul Ford's first Hacking Congress column, he shows us how to turn information on the U.S. Senate site into RDF.
Screen-scraping with WWW::Mechanize (4 tags)
Screen-scraping is the job of programmatically navigating something that is normally used visually, such as a web site, and then processing the result. WWW::Mechanize is the best screen scraper out there for Perl! Chris Ball puts the two together to ensure that he never misses his favourite TV shows again...
Wrestling HTML (2 tags)
Uche Ogbuji's Python and XML column returns with a look at techniques for converting arbitrary and invalid HTML into XHTML.