Your RDF Query Language?

by Kendall Clark

Among the several hats I wear these days -- including Managing Editor of O'Reilly's -- I'm also a member of the Data Access Working Group (DAWG), which is working on standardizing a query language and access protocol for RDF, the lingua franca of the Semantic Web.

The DAWG has recently released the second draft of its Use Cases and Requirements document, which we're encouraging people to read and comment on. The document contains an odd baker's dozen of use cases -- little stories where we think a standard RDF query language would help you get the job done. It also includes some requirements and some design objectives. The former are things that we've put into the critical path: we won't be done until they are met. The latter are things which we think would be good to do, but which we aren't (yet) willing to put into the critical path.

Where do things stand now? The WG is trying to figure out what kind of interest there is from Semantic Web, RDF, XML, and web service developers in a few of its proposed design objectives, including:

  1. 4.2 Aggregation Graphs

    RDF can be used for data integration and aggregation. RDF repositories are built by merging RDF triples from several other RDF repositories or from non-RDF sources converted to RDF. Such aggregations can be real or virtual.

    It must be possible for the query language and protocol to allow an RDF repository to expose the source from which a query server collected a triple or subgraph.

    This objective is related to the provenance issue -- consider an assertion A. It's often very useful to know which RDF graph A came from, even after you've aggregated A with lots of other assertions.

  2. 4.5 Aggregate Query

    It should be possible to specify two or more RDF graphs against which a query shall be executed; that is, the result of an aggregate query is the merge of the results of executing the query on each of two or more graphs.

  3. 4.5.1 Querying Multiple Sources

    It should be possible for a query to specify which of the available RDF graphs it is to be executed against. If more than one RDF graph is specified, the result is as if the query had been executed against the merge of the specified RDF graphs. Query processors with a single available RDF graph trivially satisfy this objective.

    How feature-rich should the DAWG protocol and query language be? One of the promises of the Semantic Web is the ability for Bob to aggregate data that Carol, Alice, and Ted have published on the Web -- without any further explicit coordination among them.

    Objectives 4.5 and 4.5.1 are two variants of RDF aggregation built into the query language and data access protocol. If this is the sort of thing you want to be able to do, let us know!

  4. 4.6 Additional Semantic Information

    It should be possible for knowledge encoded in other semantic languages—for example: RDFS, OWL, and SWRL—to affect the results of queries executed against RDF graphs.

    There are two standardized ways to structure RDF data: RDFS and OWL. Should we build syntactic sugar into the RDF query language for asking queries about RDFS and OWL constructs? For example, inverse functional properties are an important OWL concept used in FOAF data. It might be a good thing to be able to ask about IFPs in an RDF query over a set of FOAF instances.
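
To make objectives 4.2, 4.5, and 4.5.1 a little more concrete, here is a rough Python sketch. It is not DAWG syntax or any real RDF library: triples are plain tuples, and the source names, the `knows` predicate, and every function name are invented for illustration. It suggests why "merge the results of querying each graph" (4.5) and "query the merge of the graphs" (4.5.1) can give different answers once a query has to join facts that live in different sources, and how a store might report where a triple came from (4.2).

```python
from typing import Callable, Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]
Graph = Set[Triple]
Query = Callable[[Graph], Set[str]]

# Hypothetical sources, keyed by an invented name for where each graph came from.
SOURCES: Dict[str, Graph] = {
    "carol": {("carol", "knows", "ted")},
    "ted": {("ted", "knows", "alice"), ("alice", "knows", "carol")},
    "alice": {("alice", "knows", "carol")},
}


def merge(names: Iterable[str]) -> Graph:
    """Union the named graphs (there are no blank nodes here, so a plain set union will do)."""
    merged: Graph = set()
    for name in names:
        merged |= SOURCES[name]
    return merged


def friends_of_friends(graph: Graph, person: str) -> Set[str]:
    """A two-pattern query: person knows ?x, and ?x knows ?y; return every ?y."""
    return {
        y
        for (s1, p1, x) in graph
        for (s2, p2, y) in graph
        if s1 == person and p1 == "knows" and s2 == x and p2 == "knows"
    }


def query_each_then_merge(names: Iterable[str], query: Query) -> Set[str]:
    """Objective 4.5 as read here: run the query per graph, then merge the results."""
    results: Set[str] = set()
    for name in names:
        results |= query(SOURCES[name])
    return results


def query_merged(names: Iterable[str], query: Query) -> Set[str]:
    """Objective 4.5.1 as read here: run the query once against the merged graph."""
    return query(merge(names))


def sources_of(triple: Triple, names: Iterable[str]) -> Set[str]:
    """Objective 4.2 as read here: report which sources contributed a triple."""
    return {name for name in names if triple in SOURCES[name]}


if __name__ == "__main__":
    everyone = ["carol", "ted", "alice"]
    ask = lambda graph: friends_of_friends(graph, "carol")
    print(query_each_then_merge(everyone, ask))  # set(): no single graph holds both hops
    print(query_merged(everyone, ask))           # {'alice'}
    print(sorted(sources_of(("alice", "knows", "carol"), everyone)))  # ['alice', 'ted']
```

In this toy setup the join only succeeds against the merged graph, which is one reason the choice between the two variants matters.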
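
The inverse functional property idea in objective 4.6 can be sketched the same way. The snippet below is only an illustration under stated assumptions: triples are plain tuples, the node labels and data are made up, and the set of IFPs is hard-coded (FOAF does declare `foaf:mbox` as inverse functional, but a real system would presumably read that from the ontology). Subjects that share a value for an IFP are treated as the same individual, which lets a query combine facts that were originally stated about two different nodes.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]

# Assumption: hard-coded for the sketch rather than read from an OWL ontology.
INVERSE_FUNCTIONAL = {"foaf:mbox"}


def find_aliases(graph: Set[Triple]) -> Dict[str, str]:
    """Map each subject to a canonical subject that shares an IFP value with it."""
    parent: Dict[str, str] = {}

    def find(node: str) -> str:
        # Simple union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    first_seen: Dict[Tuple[str, str], str] = {}
    for s, p, o in sorted(graph):  # sorted only to keep the result deterministic
        find(s)
        if p in INVERSE_FUNCTIONAL:
            owner = first_seen.setdefault((p, o), s)
            a, b = find(owner), find(s)
            if a != b:
                parent[max(a, b)] = min(a, b)
    return {node: find(node) for node in parent}


def smush(graph: Set[Triple]) -> Set[Triple]:
    """Rewrite the graph so that nodes identified by an IFP share one name."""
    alias = find_aliases(graph)
    return {(alias.get(s, s), p, alias.get(o, o)) for s, p, o in graph}


def names_known_by(graph: Set[Triple], name: str) -> Set[str]:
    """Names of everyone known by the person with the given foaf:name."""
    return {
        friend_name
        for (person, p1, n) in graph if p1 == "foaf:name" and n == name
        for (s, p2, friend) in graph if s == person and p2 == "foaf:knows"
        for (f, p3, friend_name) in graph if f == friend and p3 == "foaf:name"
    }


# Made-up FOAF-style data: "_:a" and "_:b" describe the same mailbox owner.
FOAF_DATA: Set[Triple] = {
    ("_:a", "foaf:mbox", "mailto:carol@example.org"),
    ("_:a", "foaf:name", "Carol"),
    ("_:b", "foaf:mbox", "mailto:carol@example.org"),
    ("_:b", "foaf:knows", "_:c"),
    ("_:c", "foaf:name", "Ted"),
}

if __name__ == "__main__":
    print(names_known_by(FOAF_DATA, "Carol"))         # set(): name and knows sit on different nodes
    print(names_known_by(smush(FOAF_DATA), "Carol"))  # {'Ted'}
```

Whether this kind of reasoning belongs inside the query language, or in a layer beneath it, is exactly the sort of question the objective leaves open.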

In other words, the Working Group is asking for community feedback. That's part of our job. You can help us do it!

I'll make sure comments left here on this weblog entry come to the attention of the DAWG. Thanks!
