The 7 (f)laws of the Semantic Web

by Dan Zambonini

When it comes to the Semantic Web, you might call me a disillusioned advocate. I've been dipping in and out of the technologies for the last 5 years or so, but am increasingly frustrated by the lack of any visible progress.

This entry should be regarded as constructive criticism of the Semantic Web -- I still believe in it, but need to bring the flaws (as I see them) into the open, in the hope that discussion and communication are the first steps towards resolving problems.

42 Comments

Mark Woodman
2006-06-09 09:53:43
Dan,


I long suffered from SemWeb apathy until recently stumbling across OntoWorld's Semantic Wiki project. They're working on a semantic extension to MediaWiki (which drives Wikipedia) that makes it really easy to add semantic attributes to wiki text. The result is a wiki page that has an accompanying RDF document with triples and URIs that point to other wiki pages. Very cool stuff.


Imagine if Wikipedia enabled OntoWorld's plugin... every Wikipedia page could now become semantically meaningful. Considering how enthusiastic WP's volunteer army is at keeping their pages up to date, it might not take long for a Semantic Wikipedia to become the de facto backbone of the Semantic Web.


- Mark

Dan Zambonini
2006-06-09 09:57:06
That sounds perfect! I've wanted this kind of functionality in Wikipedia for ages (see the 4th bullet point in this post: http://www.oreillynet.com/xml/blog/2005/09/7_opposing_choices_in_the_futu.html ), so it would be amazing if we could add semantic data to Wikipedia...


Thanks so much for pointing it out!

Mark Woodman
2006-06-09 10:16:31
I managed to forget the link. Here 'tis:
http://wiki.ontoworld.org/

Giovanni Tummarello
2006-06-10 03:55:12
While things like the Semantic MediaWiki are great for letting people add annotations in a simple way, the issue then becomes how the end user makes use of such structured data.


This is in fact what we're trying to do on the DBin project: create a nice, integrated, local environment where the end user collects information coming from, say, Semantic Wikipedia, but also from specific semantic P2P channels, RSS (1.0 or other), etc., and enjoys it as he/she wants with rich browsing capabilities, possibly merging it with local document metadata, etc.


So, sorry to the SW detractors, but plenty of cool stuff is appearing :-) I'd say that probably within a year or 2 the SW will finally enter user space.

Kurt Cagle
2006-06-10 15:31:15
The Semantic Web is a hard concept for many to get, even serious developers. I'm not a big fan of RDF in many of its more egregious uses, but I recognize its value in the areas where it IS useful. My suspicion, however, is that as the web's abstraction metaphors become increasingly complex, the value of RDF will be seen more and more. I'm more inclined to suspect at this stage that its time has simply not yet come for widespread adoption, and it will only be through the backdoor apps that it hits that critical tipping point.

Karl Dubost
2006-06-10 18:08:17
RDF is really simple, but at the same level of simplicity as any hardcore technology. Comparing HTML/CSS with RDF makes no sense, IMHO. Comparing it with JavaScript/DOM makes a bit more sense, but tell me how many people are able to write JavaScript code to manipulate the DOM. Not a lot.


On the AJAX buzz: AJAX is not a technology but a design pattern. It relies on the DOM, JavaScript, etc. If we look at the history of the DOM and JavaScript, it has been more than painful. People tend to forget that, and interoperability is not here yet; look, for example, at the post about Web caching and XMLHttpRequest by Mark Nottingham.


There is an experiment with Wikipedia, with an export of the full database in RDF. When LiveJournal exported all the user profiles in FOAF, it took just a few lines of code and a template. If Amazon were doing the same, if IMDB (Amazon) were doing the same, if Google Maps were doing the same, we would have data out there. :) People don't do it in the first place, not because it's difficult, but because they don't believe in it.
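To make the "few lines of code and a template" point concrete, here is a minimal sketch of such an export in Python. It is not LiveJournal's actual code: the profile dictionary, its field names, and the example.org URLs are assumptions made up for illustration. Only the RDF and FOAF namespaces and the foaf:Person, foaf:nick, foaf:name, foaf:weblog and foaf:knows terms come from the published vocabularies.

```python
from string import Template
from xml.sax.saxutils import escape, quoteattr

# RDF/XML skeleton for one profile. $friends expands to zero or more
# foaf:knows blocks, each already terminated by a newline.
FOAF_TEMPLATE = Template("""<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:nick>$nick</foaf:nick>
    <foaf:name>$name</foaf:name>
    <foaf:weblog rdf:resource=$weblog/>
$friends  </foaf:Person>
</rdf:RDF>
""")

FRIEND_TEMPLATE = Template(
    "    <foaf:knows><foaf:Person><foaf:nick>$nick</foaf:nick></foaf:Person></foaf:knows>\n"
)


def profile_to_foaf(profile):
    """Render a (hypothetical) profile dict as a FOAF document in RDF/XML."""
    friends = "".join(
        FRIEND_TEMPLATE.substitute(nick=escape(nick)) for nick in profile["friends"]
    )
    return FOAF_TEMPLATE.substitute(
        nick=escape(profile["nick"]),
        name=escape(profile["name"]),
        weblog=quoteattr(profile["weblog"]),  # quoteattr adds the quotes itself
        friends=friends,
    )


# Made-up sample data, not a real account.
sample_profile = {
    "nick": "alice",
    "name": "Alice & Co",
    "weblog": "http://example.org/users/alice/",
    "friends": ["bob", "carol"],
}
foaf_xml = profile_to_foaf(sample_profile)
print(foaf_xml)
```

The only real work is escaping the values so the output stays well-formed XML; everything else is string substitution, which is roughly the scale of effort the comment describes.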

A web of data. It's funny how people want ONE answer for everything. It's not a problem to not have one big ontology; it's a feature, not a bug. Ontologies are a social process: they are built on top of the usage and agreement of a community of practice. A social group decides at some point to use one particular ontology. This one will be useful for this purpose and this other one will be useful for that. When they don't find what they want, they create their own, driven only by motivation or ROI (personal or business). I'm happy to have some ontologies which seem general, like, let's say, a geographic one for latitude/longitude, etc., but I also want local ones.


BigOnto: Latitude/Longitude of my village
RegionalOnto: Name of my village in my own language
LocalOnto: Name I share with my friends and parents for this small field out there.
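The three levels above can coexist in one set of statements without any master ontology. The following Python sketch shows the idea with plain tuples standing in for RDF triples. All the namespace URIs, property names, and values are hypothetical placeholders, not real vocabularies.

```python
# Hypothetical vocabularies at three levels of generality.
BIG = "http://example.org/big-onto#"
REGIONAL = "http://example.org/regional-onto#"
LOCAL = "http://example.org/local-onto#"

# Hypothetical resources being described.
VILLAGE = "http://example.org/place/my-village"
FIELD = "http://example.org/place/small-field"

# (subject, predicate, object) statements mixing all three vocabularies.
TRIPLES = [
    (VILLAGE, BIG + "latitude", "48.10"),
    (VILLAGE, BIG + "longitude", "-1.68"),
    (VILLAGE, REGIONAL + "nameInLocalLanguage", "name-in-my-language"),
    (FIELD, LOCAL + "nickname", "the back field"),
]


def describe(subject, triples, understood):
    """Return the (predicate, object) pairs about `subject` whose predicate
    belongs to one of the vocabularies the consumer understands."""
    prefixes = tuple(understood)
    return sorted(
        (predicate, obj)
        for s, predicate, obj in triples
        if s == subject and predicate.startswith(prefixes)
    )


# A generic mapping tool only needs the general vocabulary...
print(describe(VILLAGE, TRIPLES, [BIG]))
# ...while a regional application can use both without any conflict.
print(describe(VILLAGE, TRIPLES, [BIG, REGIONAL]))
```

Each consumer simply ignores the predicates it does not understand, which is why several ontologies describing the same thing do not get in each other's way.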


Libraries for software developers: JavaScript and PHP are successful not so much because of the languages, which are complex, but because of the libraries and the frameworks. Right now, people in the RDF world are creating libraries and frameworks to handle data. Optimization, development, code, etc. It takes time. People forget that it takes time to implement things. First version of CSS... Dec 1996... 10 years ago, and there are still bugs in implementations of CSS. People forget. When CSS came out, people just said: it will never work. It's not possible to create anything visually appealing with it. It's stillborn, the (fluid) box model is too complex, give us tables. (The funny thing is that one of the features not very well implemented is... display: table in CSS.)
It takes time.


It's an iterative process, built on top of other things. If you really want to be fair, compare RDF with the world of databases, and how painfully they moved from flat systems to relational databases. Look at how many people are able to design an RDB 30 or 40 years after the first experiments.


:) The technology is not complicated. The Web is NOT one big database, but chunks of information with different views, cultures, with sometimes overlaps, etc. It takes time, things are exciting these days.


Dave Holden
2006-06-11 06:32:18
I suspect that one of the main problems is that the stepping stone or bridge that most people will have into the world of RDF, i.e., RDF/XML, is so rocky that it's almost a disconnect.


If the default serialization was something like this http://www.hpl.hp.com/techreports/2003/HPL-2003-268.html
such that you can confidently process your data using normal XML tools (using RDF tools when appropriate) I suspect it would have taken off much more quickly.



Karl Dubost
2006-06-11 21:20:10
Stating the obvious, but in case it's not clear: the abstract model of XML is different from the abstract model of RDF. XML is a tree, RDF is a graph. Just because one of the serializations is in XML does not mean that all XML tools are usable.


XML tools don't parse pointy brackets, they parse a model with a specific syntax. We can't use XML tools to parse... a CSS file, for example. RDF/XML is one of the serializations. I usually prefer to use RDF/n3 :)
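A small Python sketch of the tree-versus-graph point: the two RDF/XML documents below state the same single triple, once as a property element and once as a property attribute (both forms are allowed by RDF/XML). A generic XML query sees two different trees, while a triple extractor sees the same graph. The `toy_triples` function is a deliberately tiny illustration that handles only these two forms; it is an assumption-laden stand-in, not a real RDF parser, and the example.org subject URI is made up.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Same statement, written with a property element...
DOC_A = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/post">
    <dc:title>The 7 (f)laws of the Semantic Web</dc:title>
  </rdf:Description>
</rdf:RDF>"""

# ...and with a property attribute.
DOC_B = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/post"
                   dc:title="The 7 (f)laws of the Semantic Web"/>
</rdf:RDF>"""


def expand(qname):
    """Turn ElementTree's '{namespace}local' form into a plain URI."""
    namespace, local = qname[1:].split("}", 1)
    return namespace + local


def toy_triples(rdf_xml):
    """Extract triples from the two simple forms above (toy subset only)."""
    triples = set()
    for node in ET.fromstring(rdf_xml):
        subject = node.get("{%s}about" % RDF_NS)
        # Property attributes: any namespaced attribute outside the RDF namespace.
        for name, value in node.attrib.items():
            if name.startswith("{") and not name.startswith("{%s}" % RDF_NS):
                triples.add((subject, expand(name), value))
        # Property elements with literal text content.
        for child in node:
            triples.add((subject, expand(child.tag), (child.text or "").strip()))
    return triples


def titles_found(rdf_xml):
    """What a generic XML tool sees when it looks for dc:title elements."""
    return len(ET.fromstring(rdf_xml).findall(".//{%s}title" % DC_NS))


print(titles_found(DOC_A), titles_found(DOC_B))    # the trees differ
print(toy_triples(DOC_A) == toy_triples(DOC_B))    # the graphs do not
```

An XPath or XSLT rule written against the first document silently misses the second, which is the practical cost of treating RDF/XML as ordinary XML.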

Dave Holden
2006-06-12 03:22:30
No, this was clear to me before I made the point.

Dan Zambonini
2006-06-12 03:48:51
Hi Karl,


I think you raise some very interesting points (in your first post), some of which I agree with... some I don't!


First, I really enjoyed the irony, given the nature of the post, that it's a W3C member who replies with "RDF is really simple"!


I'm not comparing RDF to HTML, CSS, JavaScript, AJAX etc. in terms of their use, syntax or model, but just in terms of how easy they are to grasp for most developers.


As for the web of data - I don't particularly need "one answer for everything", but sort of wish that RDF would produce "one answer for something", which I haven't found yet. The web of data, I think, would certainly help to uncover what that "something" is.


It's funny how you mention the success of PHP. I sort of see PHP as the antithesis of RDF. The reason that PHP has become the most popular language on the web (possibly? correct me if I'm wrong, someone...) is maybe not 'libraries' or 'frameworks' (PHP has historically had a distinct lack of either of these, until relatively recently, with PEAR, etc), but rather that it is extremely forgiving, has a syntax that is very easy to make sense of, and has absolutely excellent online help (in php.net). RDF (or RDF/XML, rather) is the exact opposite - very, very unforgiving, hard to get to grips with, and with a lack of high-level, introductory help.


I also don't really buy into the argument (that's been given for at least 3 or 4 years) that people aren't (or the world isn't) ready for RDF yet - that somehow we've done everything we can do with RDF, and we're now just waiting another couple of years for it to gain momentum. If we look at SVG, SOAP or XSLT (and a host of other technologies) - all came after RDF (in terms of published specs), but are more popular amongst developers. My personal opinion is that we need to stop thinking that we're waiting for the world to catch up with this cool RDF stuff (and it is cool, I think), and actually think about how we can address WHY the world hasn't caught up yet.


I think Tim Berners-Lee originally thought that the Semantic Web would be 10 years away, back when RDF first came around. I suppose we're not there yet, so I could be wrong with my impatience. But (I think?) it was Bill Gates who suggested that technologies that we think are 2 years away come around a lot sooner, and those that we think are 10 years away take a lot longer.


Finally, I agree that the technology is not complicated! I think those of us who push past that horrible initial barrier actually get to see its 'inner beauty', but the majority (who may read a book by its cover) unfortunately do not. Let's hope that we can find some new ways to get other people excited by it! (I suppose these bitching blogs by me don't help... maybe I'll write a positive one next!)

Karl Dubost
2006-06-12 06:18:22
(fixing) I'm not a "W3C Member", I'm a W3C employee. ;)


There is no irony, because I'm not a Semantic Web specialist, just a lurker most of the time :) and enjoying it for small hacks. So really I have given my user point of view.


Again, for developers: how many people are able to implement CSS, i.e. its rendering model? How many implementations of the rendering part of CSS are there? The irony is that there are more implementations of RDF than of CSS. :))))


For OWL - http://www.w3.org/2001/sw/WebOnt/impls
For RDF - http://www.w3.org/2001/sw/RDFCore/20030331-advance.html
For SPARQL - Implementation report is in progress (SPEC at CR)


"One answer for something"? No, not really. And I'm pretty sure you don't want that. Let's take a very simple example. :) Is a "cow" something you find on your plate, something which is a god (India), or just an animal? Etc. You could answer that all of these could be represented, but the way you represent the world needs variety and different models. It's normal and necessary. Diversity in answers is a social feature, and that is completely outside the nature of RDF, which is just a tool to model knowledge, not a way to define how the whole of knowledge should be organized. Calendaring is another very good example of the diversity in the world.


But, as in the example I gave, there will be general ontologies which are shared between people.



About PHP: I used the first version of PHP (PHP/FI), which was a set of Perl scripts (a ***library***) to help people design cgi-bin scripts without having access to the cgi-bin directory. PHP leveraged the use of scripts and then the programmatic Web. PHP is forgiving and RDF is unforgiving? You mean that if you make a typo in a PHP function, the program still works? That if you make a typo in the name of an element in an XML file, the XML is still parsed? Then you have magical tools that I don't have. ;)


"hard to get to grips with, and with a lack of high-level, introductory help." Can you show me a good introductory guide to relational database design?


I didn't say the world is not ready for RDF, quite the opposite. SPARQL is in the process of being implemented; in fact there are implementations in things like SPARQL. I said that implementations take time, a lot of time. Implementations, but also design patterns, experimentation, optimizations. It's true for all technologies.


You want a real demo?
http://thefigtrees.net/lee/sw/demos/calendar/
http://esw.w3.org/topic/SparqlCalendarDemoUsage
http://www.thefigtrees.net/lee/blog/2006/03/sparql_calendar_demo_stepbyste.html


RDF has its own history. You are talking of the first version of the specification. Not the one which is being used now.
1st version (1999): http://www.w3.org/TR/1999/REC-rdf-syntax-19990222/


Since then there have been (2004):
http://www.w3.org/RDF/#specs
http://www.w3.org/2001/sw/DataAccess/#current


Just as a reminder CSS 2.0 is from 1998 and is still not fully implemented.


You said, for TBL: 1999 (if 1st version) + 10 years = 2009, so it seems we are still in the window. Or did I misunderstand you? Bill Gates also said that the Web would not work. :)



So, in your last paragraph, there is something I agree with. A GOOD book about RDF is missing, not about the technology but about "RDF Hacks". There is a lot of content available on the Web, but most of the time it is poorly documented, because hackers are not technical writers ;)



I don't necessarily disagree with everything you said; I just have a different point of view on it ;)




Dan Zambonini
2006-06-12 06:46:05
Oops, apologies for the member/employee thing - wrong choice of word! Sorry.


I guess I'm a 'Semantic Web lurker' like you, too. I wonder how many of us there are? It's like we're all sleeper cells, waiting for TBL to give a secret message in a TV interview, so we can all spring into action and publish our RDF all at once!


I think there's a misunderstanding around the 'one answer for something' bit; I just meant that I wish RDF would solve at least one good 'real' problem, to give people a good reason to investigate it a bit more (you know, like Gmail and Google Maps caused a lot of people to look into AJAX; it's a pity there isn't a great app that uses RDF in a 'semantic web' kind of way).


Yeah, I do think PHP is forgiving - more so than XML (and definitely RDF/XML) anyway! I guess 'forgiving' might be the wrong word, maybe flexible is what I mean. You can use fully blown OOP or a simple 2 line script. You can switch off (or suppress) certain warnings. It's not statically typed, which (to me) makes any language more 'forgiving'. I just reckon that if you ask an average (non-PHP, non-RDF) developer to perform a simple PHP task, they'd be able to do it in next to no time (due to the low barrier to entry of the language syntax, and the online resources). Ask them to perform a simple RDF task, and they'll be hunting online for hours before they can find a webpage that they can read without an encyclopedia.


I suppose the comparison of PHP vs RDF is ridiculous anyway...


Sorry, I didn't mean to direct the "world isn't ready for RDF" at you; it's just a general response to lots of the pro-RDF arguments I see on the web.


Yeah, I realise that RDF has had some revisions; there's the one that people don't use from 1999 and the one that people don't use from 2004 :)


Yup, 1999 + 10 = 2009. I'm just impatient! I'm like a child at Xmas, and need to open the presents before Xmas day... Maybe Santa will be bringing me a good book on RDF this year, that would be great! I totally agree with you - an 'RDF Hacks' book is sorely needed (O'Reilly?).

Mike Lovett
2006-06-12 09:56:50
The corporate world will want to utilise semantic technologies as a solution to specific information based pain... and this will motivate vendors to produce tools. This will eventually help realise the Semantic Web.


The so-called enterprise semantic web is already producing new tools and platforms. You doubtless know about it but... Altova's SemanticWorks is an example.


Then there are the Cerebras and Unicorns (the latter has been acquired by IBM). Expect more acquisitions like this.


As corporations struggle with ever more complex data integration problems - and info glut, you can expect them to slowly but surely turn to semantic technologies. What else is there?


There is also a strong business motivation for agility which most are attempting to deliver via SOA. Again... this desperately needs data decoupling which again is something that semantic tech can deliver.


The work is being done but it is early days. The important thing (from my humble perspective) is that the corporate use of semantic tech will ultimately spill into the wider internet and hopefully, deliver Tim's vision of a Semantic Web.


Time will tell!

Steve Ryder
2006-06-13 02:57:28
A great post and a valuable debate.

Alex Alishevskikh
2006-07-31 06:21:52
Thanks for the impressive article and discussion.


I'd add one more (f)law of the SW: to be successful with end users, it lacks an intuitive real-world metaphor behind it. After I imagined a Semantic Wiki-enabled Wikipedia, I got an idea of what this metaphor could be: it is an encyclopedia!

James Michael DuPont mdupont777@yahoo.com
2006-09-18 12:03:06
reposted from a Submission to http://www.oreillynet.com/xml/blog/2006/06/the_7_flaws_of_the_semantic_we.html


This is a reposting about my thoughts on this thread before, which have not been published on the oreillynet yet, that is fine, but I would like to get a copy of my post please? Basically I said that Ajax was Sexy and that the semantic web is not viable to sell sex, that is why the Web 2.0 will produce Porn 2.0, but not the Porn:Ontology#FreePorn.


Let me restate my point about advertising without going into the name of the #1 consumer of internet advertising: the Semantic Web seen as a pure web of logic is not viable because it cannot be used for advertising. Otherwise it will be forced to contain opaque data designed to stop logic and appeal to the more primitive forces. Thus you will always have chunks of data that are opaque. If they were only small chunks, they could be filtered out. Therefore the chunks of advertising have to look the same as the rest of the Semantic Web. But in a closed, secure semantic web of trust there will be no way for such information to be hidden, thus it is excluded.
This is not a problem for Web 2.0. It can be the advertisement and the logical content at the same time. The user can be led to something that they don't even want, and then the search engines will get money for that.


This fuels the industry and that industry is powerful.


see a quote of my previous post here :


The Content Wrangler, Inc. (presumably Scott Abel) writes :


"Nowadays, adult entertainment companies are not just leaders in earning revenue from the Net, they're also leaders in the technology arena. In many areas, they are the dominate force. The leaders, not the followers. And, they're doing as much as possible to protect their turf. They file patents to protect their content matching algorithms and online content management and manipulation functionalities. "


Thanks for listening,


Mike

laki
2007-12-16 16:52:08
I don't have extensive knowledge of the Semantic Web, but it seems to me that the category 'People aren’t perfect', which includes malformed markup (already common in HTML and probably to be even more common in RDF) and uncertainties in definitions, might be somewhat alleviated in a manner similar to the way folk taxonomy works well in the 20 questions AI game with the help of neural networks.


In that game, different people have different definitions for the same words, and yet the game often enough guesses what the user is thinking about.

Craig Hubley
2008-02-10 00:00:46
You should read
http://let.sysops.be/wiki/the_real_web_3.0
which assumes exactly what you do, but takes it a step further: to generate a semantic web that works in public, start from the unique problems of public discourse. For instance the ontology of everything begins with "an authoritarian, top-down view", yes, academically, but the opposite of that socially. In other words, ontologies for the public web must begin by careful study of the process of trolling:
http://let.sysops.be/wiki/category:trolling


Read that again: "ontologies for the public web must begin by careful study of the process of trolling".


Understand? Now you must learn to see ALL forms of challenge to ALL standing assumptions as akin to political dissent and its Internet manifestation, trolling. That will take you years but you have a head start in the lsb studies. Open edit. Have fun.

Craig Hubley
2008-02-10 00:11:13
W.r.t. Semantic MediaWiki: it's great and will get much wider use, especially if similar features are included in MediaWiki clones like jamwiki (http://jamwiki.org) or if commercial ontology gurus like ontoprise.de get involved. I haven't used OntoStudio but they are advertising it as specifically useful with Semantic MediaWiki.


That said, Wikipedia needs to do much more with this technology. Its contributors have been effective at other difficult tasks like proper citations and adding tags to represent the state of a page; they could add Semantic MediaWiki tags too. They patrol the pages well enough that this would make the database useful.


However, at present the RDF export of their inter-page linking is almost useless because it doesn't take into account:


1. The persistence of a particular link across versions nor how long those versions persisted. I want to know whether a link that is reported in the RDF is USUALLY on that page or not, not whether it is on the CURRENT version. I need an indication of how controversial it is and that's the only metric available.


2. Frequency of transit of those links. Even if they're usually there and commonly accepted as belonging, it's not clear that anyone clicks on them and checks. Heavily-transited links are more reliable by far than rarely-transited links.


3. Actual click paths. Even statistical data on this would be useful, even if every click path was not known for privacy reasons. But knowing that a particular IP number went to page X then Y then Z tells me an extraordinary amount about their intent and what else to suggest to them.


So annotated RDF that includes this log data, not just semantic tags, is required. Obviously all these considerations apply to semantic tags as well as to links to other Wikipedia articles, but those are semantic tags too: "article Y is relevant to article X". Given enough data, the semantics can get deeper.
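As a rough illustration of the first metric (persistence of a link across versions, weighted by how long each version was live), here is a minimal Python sketch. The input format, a list of (seconds live, set of link targets) pairs, is an assumption made for the example and is not how Wikipedia or its RDF export actually represents revision history.

```python
def link_persistence(revisions):
    """Return, for each link, the fraction of total page lifetime during
    which that link was present.

    `revisions` is a list of (seconds_live, links) pairs, one per page
    version, where `links` is the set of link targets in that version.
    This input shape is hypothetical, chosen only for the sketch.
    """
    total = sum(seconds for seconds, _ in revisions)
    if total == 0:
        return {}
    present = {}
    for seconds, links in revisions:
        for link in links:
            present[link] = present.get(link, 0) + seconds
    return {link: live / total for link, live in present.items()}


# Made-up history: three versions of one page.
history = [
    (100, {"Article A", "Article B"}),   # B appears only briefly
    (300, {"Article A"}),
    (600, {"Article A", "Article C"}),   # C is recent but long-lived
]
scores = link_persistence(history)
for link in sorted(scores):
    print(link, scores[link])
```

A score near 1.0 means the link is usually on the page; a low score flags a link that was short-lived or contested, which is the signal that a snapshot of the CURRENT version cannot provide.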


By all means do push Wikipedia to implement OntoWorld's plugin. But push them at the same time to include the transit data above.

Craig Hubley
2008-02-10 07:16:50
(this is the second time I've posted this - you think it was a joke? you are failing to achieve your goal so completely and yet you think this is a joke? mere "trolling" perhaps? read this again, and think about it - by censoring it you provide evidence that this view is correct, and your blinders cause the failures you note above)


You should read
http://let.sysops.be/wiki/the_real_web_3.0
which assumes exactly what you do, but takes it a step further: to generate a semantic web that works in public, start from the unique problems of public discourse. For instance the ontology of everything begins with "an authoritarian, top-down view", yes, academically, but the opposite of that socially. In other words, ontologies for the public web must begin by careful study of the process of trolling:
http://let.sysops.be/wiki/category:trolling


Read that again: "ontologies for the public web must begin by careful study of the process of trolling".


Understand? Now you must learn to see ALL forms of challenge to ALL standing assumptions as akin to political dissent and its Internet manifestation, trolling. That will take you years but you have a head start in the lsb studies. Open edit. Have fun.


(as for the central ontology it already exists - at Wikipedia - for instance see
http://en.wikipedia.org/wiki/World_War_One
if you don't like that definition all you have to do is troll a bit and you will attract other trolls and the fighting results in a new compromise between factions - to understand this review the process of trolling as noted above)

Craig Hubley
2008-02-13 06:26:08
Semantic MediaWiki is now easy to install. Start with the WOS installer, which puts the whole stack together on one USB stick if you want.


http://www.chsoftware.net/en/useware/wosmixer/wosmixer.htm?step=2


And Semantic MediaWiki 1.0.1:


http://sourceforge.net/project/downloading.php?group_id=147937&use_mirror=internap&filename=semediawiki-1.0.1.tar.gz&59664296


Hardcores may want the Halo extensions from ontoprise:


http://wiki.ontoprise.de/ontoprisewiki/index.php/Features_of_ontoprises_Semantic_MediaWiki_distribution


I would appreciate an evaluation of that last one, especially from any semantic web gurus.

Kumar Sansar
2008-05-29 11:43:04
There may be life after all...


We use an OWL schema for data profiling. It takes all the custom enterprise data dictionaries and built-from-scratch conversion spreadsheets and puts them in a machine-processable and easily reusable language.


They say necessity is the mother of invention... well having suffered through enough painful integration projects we were willing to try anything and found a new way of using semantic web technology.
