Hpricot for JRuby

by Robert Cooper

Ola notices a change to the Hpricot gem:

This is just so cool, I cannot contain it. For those of you who haven't heard about Hpricot, it is one of why the lucky stiff's incredibly cool tools (which he probably will use to take over the world any day now...). It's HTML parsing goodness, very flexible, with the goal of being able to parse (and fix) everything that Firefox handles.

"So what?" you're probably asking... Well, Hpricot uses Ragel and some C code to achieve blinding speed. This means JRuby can't run it. Or I should say couldn't run it:


orpheus:~/workspace/jruby> jruby bin/gem install hpricot --source http://code.whytheluckystiff.net
Bulk updating Gem source index for: http://code.whytheluckystiff.net
Select which gem to install for your platform (java)
1. hpricot 0.5.110 (jruby)
2. hpricot 0.5.110 (mswin32)
3. hpricot 0.5.110 (ruby)
4. hpricot 0.5 (ruby)
5. hpricot 0.5 (mswin32)
6. hpricot 0.5.0 (ruby)

...

That's right, Hpricot is now more promiscuous than any other gem with native parts.
What can you do with it? Well, I'm just going to point you to _why's own description of it. All he says at http://code.whytheluckystiff.net/hpricot/ will work fine in JRuby!

How did this come to be? Well, me and _why did some joint hacking, which was helped along by the fact that Adrian Thurston (the genius behind Ragel) recently added Java support to it. So, basically, most of the Ragel definition is exactly the same for both the C and the Java versions. The native code has been factored out, and both versions are buildable with rake from _why's code repository.

This is important. Don't think anything else. This strategy will, and can, be used for other gems with native parts. It's just a question of time.


Yeah, I can't help but wonder how long it will be before this becomes SOP.
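
The strategy Ola describes is one shared parser definition with the native code factored out per platform. Here is a rough sketch of how a library might choose between two native backends at load time. The backend names (`:java_ext`, `:c_ext`) and the `backend_for` helper are hypothetical, made up for illustration. They are not taken from Hpricot's actual source layout.

```ruby
# Hypothetical sketch: choose a native backend based on the running platform.
# The names :java_ext and :c_ext are illustrative assumptions, not Hpricot's real files.
def backend_for(platform = RUBY_PLATFORM)
  # JRuby reports a RUBY_PLATFORM string containing "java";
  # every other platform falls through to the C extension here.
  platform =~ /java/ ? :java_ext : :c_ext
end

puts "Would load the #{backend_for} backend on #{RUBY_PLATFORM}"
```

The gem listing in Ola's post suggests that platform-specific gems let RubyGems make this choice at install time, so a check like this is only needed when one package ships more than one backend.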

5 Comments

alpha3000
2007-02-10 05:53:18
Robert, I don't want to sound rude, but why are you always posting about Ruby and JRuby as if it were the only cool scripting language, and on OnJava rather than on a Ruby website? This is a Java website. As a Java programmer, I don't care about Ruby. There are also lots of other cool scripting languages that a Java programmer could use, not only Ruby. When I want to read a Java article I come to OnJava, but I always find your articles about Ruby. I think it would be better to post them on a Ruby website, where readers will be very interested in them. Thanks.
Mike
2007-02-10 06:33:41
Well, maybe because the most interesting stuff right now is going on in the Ruby community? Why not give an overview of it? Don't be a Java snob. Keep it coming, Robert, as long as it's interesting.
cooper
2007-02-10 07:42:08
It is a fair question. Mostly I just post on things I trip over that I find interesting or have something to say about. Lately it has been a lot of scripting-related stuff, but given that scripting on the JVM seems to be the big thrust for JDK 7, it is kind of hard to miss it.


It really hasn't been my intent to have 4 or 5 posts in a row all be in the same theme. It just worked out that way.

George Jempty
2007-02-12 04:11:55
I too object to this getting posted on OnJava, but for a different reason: the "author" adds one measly line of his own to several paragraphs' worth of somebody else's content. That should be a comment on somebody else's blog, not a blog entry on OnJava.
Rob.
2007-02-14 05:08:45
Hey Java people!
Afraid of the new languages? That's what bothers you.