Social Bookmarking - Creating a reusable copy of a 'tagged' RSS file

by M. David Peterson


As mentioned in the intro, the following code will create a local copy of the latest RSS feed for any particular tag on del.icio.us. The general idea (at least what I'm using it for) is to take a bit of the load off the del.icio.us servers, make the feeds available from a site that is focused more toward that particular tag's content, and as a result be able to embrace-and-extend from there in whatever direction you feel is appropriate.

See this post for an idea of what I'm currently using this for, along with some ideas I plan to implement as time allows.

The XML looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<delicious>
  <tags>
    <tag id="blinqx" saveResult="/srv/www/htdocs/">All items tagged on del.icio.us as blinqx</tag>
    <tag id="linq" saveResult="/srv/www/htdocs/">All items tagged on del.icio.us as linq</tag>
    <tag id="xlinq" saveResult="/srv/www/htdocs/">All items tagged on del.icio.us as xlinq</tag>
  </tags>
</delicious>

The XSLT 2.0 based source looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output name="xml" method="xml"/>
  <xsl:template match="/">
    <xsl:apply-templates select="delicious/tags/tag"/>
  </xsl:template>
  <xsl:template match="tag">
    <xsl:result-document format="xml" href="file://{@saveResult}{concat(@id, '.xml')}">
      <xsl:copy-of select="document(concat('http://del.icio.us/rss/tag/', @id))"/>
    </xsl:result-document>
  </xsl:template>
</xsl:stylesheet>

Which, when processed by a Saxon 8.x-series processor, will result in the files you find below:
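Each tag element drives one fetch-and-save. As a rough sketch of how the stylesheet's attribute value templates expand for the 'linq' tag (the del.icio.us feed base URL is my assumption here, and any redundant slashes in the file URI collapse):

```shell
# Rough expansion of the stylesheet's {...} templates for a single tag.
# The del.icio.us tag-feed base URL is an assumption; substitute your own.
id="linq"
saveResult="/srv/www/htdocs/"

href="file://${saveResult}${id}.xml"        # target of xsl:result-document
feed="http://del.icio.us/rss/tag/${id}"     # URL handed to document()

echo "$href"   # prints file:///srv/www/htdocs/linq.xml
echo "$feed"
```

So each run overwrites one RSS file per tag in the directory named by that tag's saveResult attribute.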

To get this transformation to run automagically, I added the following line to my crontab file (via 'crontab -e'), which runs the transformation at :45 minutes after each hour. [NOTE: Thanks to Uche Ogbuji, who helped me figure out that I needed to specify the directory location of the Java runtime rather than just using 'java -jar ...'.]

45 * * * * /usr/lib/java/bin/java -jar /etc/saxon/latest/saxon8.jar /srv/www/transform/ /srv/www/transform/
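For clarity, here is the same entry broken into pieces. The source and stylesheet filenames are truncated in the entry above, so the names below are my own placeholders; Saxon 8.x's basic invocation is 'java -jar saxon8.jar <source> <stylesheet>':

```shell
# Placeholder filenames -- substitute whatever you actually saved them as.
JAVA=/usr/lib/java/bin/java           # full path, since cron's PATH is minimal
SAXON=/etc/saxon/latest/saxon8.jar
SRC=/srv/www/transform/tags.xml       # assumed name of the tag-list XML
XSL=/srv/www/transform/tags.xsl       # assumed name of the XSLT 2.0 stylesheet

# Fields: minute hour day-of-month month day-of-week command
CRON_LINE="45 * * * * $JAVA -jar $SAXON $SRC $XSL"
echo "$CRON_LINE"
```

The first five fields say "at minute 45 of every hour, every day"; the rest is the command cron runs.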

That's it. That's all that needs to be done. Now, obviously, you may run a site whose topic has nothing to do with Microsoft's new Linq and XLinq technologies, so changing the 'id' attribute value to the tag name you desire is an obvious and necessary modification. You will also need to change the 'saveResult' attribute value to the location on your server where you want each feed saved, such that the results are accessible to your site visitors.

What may not be immediately obvious is that you probably don't have the latest Saxon 8.x processor in the '/etc/saxon/latest' directory. Wherever you unzip and store the latest Saxon processor is where that piece of the above crontab entry should point.

Please don't use this to create a local cache at :45 after each hour (or whatever time you decide works best for you) on a personal dev box that you and you alone will access. The idea is to help take some of the load off the del.icio.us servers by creating one or more copies of the RSS files and letting your site visitors add the copied feeds to their feed readers, as opposed to accessing the tagged feeds from the servers directly.

Also, as specified on the del.icio.us 'about' page:

Please do not poll any single RSS feed more often than every 30 minutes. RSS feeds are not updated more than twice an hour, and you will receive an error if you try to crawl more frequently.

With these two things in mind, the above code is licensed under the following terms (clicking 'Yes' will grant access to the zipped file):

I promise that I won't make the author of this code regret he ever made this post by using it for my own personal and selfish desires (thus increasing instead of decreasing the load on the servers), and will instead use it for the greater good of a particular community I take part in.

[Yes, I agree.] [No, I don't agree.]

Whichever option you choose, enjoy!

Do you have any other ideas for how something like this could be used? If so, please post the idea, with links to any sample code, as a comment so that others can benefit as well.