Link typing: Who cares?

by Bob DuCharme

Link typing is the assignment of a link to a particular category in order to give a human or automated reader a clue about the implications of traversing that particular link. It would be more accurately termed "link categorization," because it has very little to do with the computer science notion of data typing: the assignment of a type to data to identify the set of operations that can be performed on that data (integers can be added, subtracted, multiplied, divided; strings can't, but can be concatenated, have substrings extracted, etc.).



Link typing clearly adds value to a link, and anyone discussing it agrees that it's a Good Thing. Several link type taxonomies have been proposed, but no one I know of actually uses them for anything. In fact, none of the taxonomies I know of have improved on the one described twenty years ago by Randall Trigg in chapter four of his University of Maryland Ph.D. dissertation.



I have my own ideas about problems with various link taxonomies out there, such as the values proposed for the XHTML 2.0 a element's rel attribute, but for now I'd like to know if anyone else has done anything besides proposing new taxonomies since Trigg's thesis. Do you know of any collection of links that actually have link types assigned to them? Even HTML with rel attributes on the a elements? (My limited research into that was not encouraging.) Has a generalized link taxonomy ever been proven useful, or are more application-specific ones such as the History and Treatment labels used in court case citations the only practical applications? Do you know of any specialized ones besides court case citations?



Please post a comment here or e-mail me at bob@snee.com to let me know. And meanwhile, check out Trigg's taxonomy.




What kind of link typing applications do you know of?


7 Comments

anonymous2
2003-05-02 06:22:56
What are the requirements?
This may be too much like systems engineering, but I wonder if anyone is doing requirements analysis to discover what people want/need to use links for, and thus design link types to support some set of functional requirements (use-cases).
anonymous2
2003-05-02 06:30:09
Xanadu cares
Ted Nelson's Literary Machines, which documents his vision of the (never-fielded?) Xanadu system, describes a set of link types. As I recall, they were based on his personal ideas of the kinds of links that would be useful to users of a full-fledged Xanadu system. They might still be a useful standard for comparison with other hypertext/hypermedia systems.
BobDuCharme
2003-05-02 08:11:40
What are the requirements?

Not too much at all, it's a good point. I think that generalized link typing systems have been too generalized to be useful, and only application-specific ones such as those used for court case citations end up adding real value. So, requirements analysis for a particular application would be a very realistic step toward genuinely useful link types for that application.


UML has the idea of "association classes," which seem to correspond to link types, so perhaps my search for good link typing should look through some UML models out there for examples of association classes from system engineers who weren't necessarily thinking in terms of link types.


BobDuCharme
2003-05-02 08:33:04
Xanadu cares

Ahh, "never-fielded," (or only partially, and inaccessible to most people) that's the problem. Another unimplemented link taxonomy.


My copy of "Literary Machines" just arrived about two weeks ago, and I'm nearly finished reading some other books, so I hadn't had much chance to look at it. I just pulled it out and found the section on Link Types on page 4/43 (the printing that says 93.1 on the cover and 91.1 on the copyright page) and it's very interesting. If we treat the web as the docuverse, much as that would annoy Nelson, his description of link typing looks a lot like an RDF implementation: using his terminology, a link's from-set and to-set can point anywhere in the docuverse, and a "link's type is specified by yet another end-set pointing anywhere in the docuverse." If we read "web URLs" for "pointers anywhere in the docuverse" (again, thereby making his blood boil) his idea of the from-set, to-set, and link-type-end-set, which he calls a three-set, is remarkably like an RDF triple. The analogy works best if all three sets are singletons, but if not, it would still be easy to represent the set relationships with RDF. I'm working on an RDF link typing application which I will describe on the weblog in a few weeks when it's done.
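
To make the three-set analogy concrete, here is a minimal Python sketch (not Nelson's design, and not the RDF application mentioned above). It assumes that a three-set whose sets are not singletons can be read as one triple per combination of members, which is only one possible reading. The Triple class, the three_set_to_triples function, the baz.html page, and the link type URL are all made up for illustration.

```python
from itertools import product
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # a member of the from-set
    predicate: str  # a member of the link-type end-set
    obj: str        # a member of the to-set


def three_set_to_triples(from_set, to_set, type_set):
    """Expand one three-set into one triple per combination of its members.

    Assumption: a non-singleton three-set means "every from-member is linked
    to every to-member under every type". Other readings are possible.
    """
    return [
        Triple(s, p, o)
        for s, p, o in product(sorted(from_set), sorted(type_set), sorted(to_set))
    ]


# Hypothetical URLs; the link type URL is invented for this example.
triples = three_set_to_triples(
    from_set={"http://www.snee.com/foo.html"},
    to_set={"http://www.example.com/bar.html", "http://www.example.com/baz.html"},
    type_set={"http://www.example.com/linktypes#comment"},
)
for t in triples:
    print(t.subject, t.predicate, t.obj)
```

With singleton sets the function returns exactly one triple, which is the case where the analogy is tightest.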


I like how Nelson lays out the architecture of links, emphasizes that anyone can assign any link types they want, and then (pp. 4/52 - 4/55) describes a possible taxonomy. He does make it clear that "[t]his listing is provisional, to give the flavor of current thinking." I'd love to see if anyone has applied his typing to a large collection of links somewhere.

anonymous2
2003-05-07 23:01:39
What about abuse?

In the case of WWW links, there is a problem with people "gaming the system" to get more visitors or higher page rankings on search engines. It seems to me that we can't let the web page author decide on the link categorization; if we do, porn sites and online casinos will spam the web with bogus taxonomy information.


- Guy Macon


http://www.guymacon.com



BobDuCharme
2003-05-09 08:21:03
What about abuse?

That would be a problem if the use of link types were based on searches across the entire web, in which case you would have to deal with all the types assigned to all the links. A scenario where you're dealing with a much smaller set is perfectly realistic. Take, for example, the links on just one page: if that page had seven links on it, and some visual indication (text coloring, mouseover text, whatever) showed the type of each, you would have a better idea of which you'd like to follow. You probably already trust the author of that web page, or you wouldn't be reading it and interested in following its links.


There are larger sets of links whose types you might want to look at without looking at the entire web, such as the links from a particular site, or from a particular weblogger, or from a particular set of webloggers. You read these pages because you're interested in what they have to say, and the way that they typed their own links is just more of what they have to say. If sites exist that rate their links to anti-spam software, hair replacement tonics, or Nigerian business deals as "AMAZING," well, I don't care about those sites or the types assigned to their links.


The trustworthiness of link type assignment is a bigger problem with out-of-line links. Start with the in-line case: let's say I create a link from http://www.snee.com/foo.html to a page http://www.example.com/bar.html that expresses an opinion I consider dubious. I put an A element in http://www.snee.com/foo.html and "http://www.example.com/bar.html" in its HREF attribute. I could then put Trigg's link type of "Pt-dubious" in a REL attribute of that A element. You have a reason to trust me, as the author of the link, as to why that link type was assigned. (If you didn't care at all about my opinion, you wouldn't be reading that web page in the first place.) However, if you see the same link type assigned to the (http://www.snee.com/foo.html, http://www.example.com/bar.html) link elsewhere, which is easy enough with RDF, and you don't know who assigned it, then you don't know how seriously to take it. If it's from a site you trust, that's a start. Also, part of the semantic web dream (as I understand it) going back to the PICS prototypes is that ratings of anything, whether movies, web pages, or links, can be attributed so that you can judge how seriously to take those ratings. This too is pretty simple with RDF; stay tuned...
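
As a rough illustration of the in-line case, the following Python sketch pulls typed links out of a page's A elements and records who asserted each type. It uses only the standard library's html.parser. The TypedLinkCollector class and the asserted_by field are hypothetical names invented for this example, and treating the page's own site as the asserter is an assumption that only makes sense for in-line links.

```python
from html.parser import HTMLParser


class TypedLinkCollector(HTMLParser):
    """Collect (source, link type, target, asserted_by) for each a element with a rel."""

    def __init__(self, source_url, asserted_by):
        super().__init__()
        self.source_url = source_url
        self.asserted_by = asserted_by  # hypothetical attribution field
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so A/HREF/REL match here.
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        rel = attributes.get("rel")
        if href and rel:
            # A rel attribute may hold several space-separated values.
            for link_type in rel.split():
                self.links.append((self.source_url, link_type, href, self.asserted_by))


page = '<p>See <A HREF="http://www.example.com/bar.html" REL="Pt-dubious">this claim</A>.</p>'
collector = TypedLinkCollector(
    "http://www.snee.com/foo.html", asserted_by="http://www.snee.com/"
)
collector.feed(page)
for link in collector.links:
    print(link)
```

For an out-of-line link, the fourth field is the part you would not get for free: someone has to state, separately and believably, who made the assertion.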

anonymous2
2003-05-30 08:22:44
When I've done this...
... not for the Web, mind you... I had the flexibility to use element names rather than attributes to distinguish link types. (Generally this was in scholarly-journal DTDs.)


And I did. Links to figures were figrefs. Links to tables were tabrefs. Links to bibliography citations were (typically) citerefs. And so on for footnotes, endnotes, indexes, etc.


Possibly the problem here is not so much that link typology is useless... it's that doing it via attributes is cumbersome.
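
The element-name approach described in this comment can be sketched in a few lines of Python. This is an assumed reconstruction, not the commenter's actual DTD: the figref, tabref, and citeref names come from the comment, while the rid attribute, the sample article, and the LINK_ELEMENTS mapping are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Element names taken from the comment; the mapping to a readable type
# and the "rid" target attribute are hypothetical.
LINK_ELEMENTS = {"figref": "figure", "tabref": "table", "citeref": "citation"}

article = """<article>
  <p>Results appear in <figref rid="fig1">Figure 1</figref> and
  <tabref rid="tab2">Table 2</tabref>, following <citeref rid="bib7">[7]</citeref>.</p>
</article>"""

root = ET.fromstring(article)

# The element name alone tells us the link type; no attribute lookup is needed.
typed_links = [
    (LINK_ELEMENTS[el.tag], el.get("rid"))
    for el in root.iter()
    if el.tag in LINK_ELEMENTS
]
print(typed_links)
```

Because the type is the element name, a stylesheet or script can select each kind of link directly, which may be part of why this felt less cumbersome than attributes.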