
How to add RDF information to a page using RDFa?

The Semantic Web Activity home page contains a good deal of information that might be of interest for the Semantic Web (e.g., for data integration): references to existing recommendations, talks on the subject given by working group members or W3C staff, references to active groups, and so on. It therefore sounds like a good idea to make this information available in RDF, too. Of course, one could achieve that by publishing two files: http://www.w3.org/2001/sw/Overview.html for the HTML version and a separate http://www.w3.org/2001/sw/Overview.rdf for RDF (remember that W3C’s Apache setup is such that the default index file is called “Overview”). But the two files would then have to be kept in sync by hand, which leads to versioning problems; not a good idea!

This is where RDFa comes into the picture: don’t duplicate information if you can avoid it. Instead, add the RDFa attributes to the HTML file and let the machines do the rest. And this is what has been done. If you look under the hood, http://www.w3.org/2001/sw/Overview.html is, in fact, in RDFa, i.e., the core (X)HTML content is enriched with some attributes that allow the automatic generation of the corresponding RDF data.

The best way to see the details is, of course, to look at the source; here is just one example. This is, essentially, what an entry on a recommendation looks like in XHTML+RDFa:

<li resource="http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/">
  <a rel="doc:versionOf" href="http://www.w3.org/TR/rdf-syntax-grammar" property="dc:title">RDF/XML Syntax Specification (Revised)</a>,
  <span rel="rdf:type" resource="[tr:REC]">W3C Recommendation,</span>
  <span property="dc:date" content="2004-02-10">February 10, 2004,</span>
  <span rel="tr:editor"><span typeof="contact:Person" property="contact:fullName">Dave Beckett</span></span>, ed.
</li>

yielding, in RDF:

<http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/> a tr:REC;
     dc:date "2004-02-10";
     dc:title "RDF/XML Syntax Specification (Revised)";
     doc:versionOf <http://www.w3.org/TR/rdf-syntax-grammar>;
     tr:editor
         [ a contact:Person;
             contact:fullName "Dave Beckett"
         ].

using a number of existing vocabularies (e.g., Dublin Core or the “TR” vocabulary that W3C has been using for years to describe its documents).
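To make the mapping from attributes to triples a little more concrete, here is a deliberately tiny Python sketch that pulls only the literal-valued statements out of the entry above. It is not a conforming RDFa processor and it is not the distiller used on the W3C site: it ignores the rel/href links (so doc:versionOf, rdf:type and tr:editor are not produced), does not expand prefixes, and assumes well-formed markup in which every element is closed. The names LiteralSketch and extract_literals are made up for this illustration.

```python
from html.parser import HTMLParser

SNIPPET = """<li resource="http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/">
  <a rel="doc:versionOf" href="http://www.w3.org/TR/rdf-syntax-grammar" property="dc:title">RDF/XML Syntax Specification (Revised)</a>,
  <span rel="rdf:type" resource="[tr:REC]">W3C Recommendation,</span>
  <span property="dc:date" content="2004-02-10">February 10, 2004,</span>
  <span rel="tr:editor"><span typeof="contact:Person" property="contact:fullName">Dave Beckett</span></span>, ed.
</li>"""


class LiteralSketch(HTMLParser):
    """Collects (subject, predicate, literal) for @property attributes only."""

    def __init__(self):
        super().__init__()
        self.subjects = []  # current subject, one entry per open element
        self.pending = []   # per open element: None or [subject, predicate, text parts]
        self.triples = []
        self.bnodes = 0

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        subject = self.subjects[-1] if self.subjects else ""
        if "typeof" in a:
            # Simplification: @typeof without @about starts a new blank node.
            subject = "_:b%d" % self.bnodes
            self.bnodes += 1
        elif "resource" in a and "rel" not in a:
            # Simplification: a bare @resource sets the subject for its content.
            subject = a["resource"]
        self.subjects.append(subject)

        entry = None
        if "property" in a:
            if "content" in a:
                # @content overrides the visible text of the element.
                self.triples.append((subject, a["property"], a["content"]))
            else:
                entry = [subject, a["property"], []]
        self.pending.append(entry)

    def handle_data(self, data):
        # Text belongs to every open element still waiting for its literal.
        for entry in self.pending:
            if entry is not None:
                entry[2].append(data)

    def handle_endtag(self, tag):
        if not self.subjects:
            return
        self.subjects.pop()
        entry = self.pending.pop()
        if entry is not None:
            self.triples.append((entry[0], entry[1], "".join(entry[2])))


def extract_literals(markup):
    parser = LiteralSketch()
    parser.feed(markup)
    parser.close()
    return parser.triples


for triple in extract_literals(SNIPPET):
    print(triple)
```

The three tuples printed correspond to the dc:title, dc:date and contact:fullName statements in the Turtle above; a real RDFa processor additionally generates the resource-valued statements and links the blank node to the recommendation through tr:editor.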

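So how would one set up the server to get the right version of the documents for the right request? Although, at some point in time, one could expect that (RDF) browsers will just pick up the RDF information automatically, what to do in the meantime? What one would like is for the server to return HTML, RDF/XML, or Turtle depending on what the client asks for, with the RDF versions generated from the one XHTML+RDFa file rather than maintained separately.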

A bit of Apache wizardry works here. First, a special Apache file (a type map) is created to control content negotiation. The usual setup is to associate this with the “var” extension, i.e., Overview.var in this case. The file itself looks fairly simple:

URI: Overview

URI: Overview.html
Content-Type: text/html

URI: Overview.rdf
Content-Type: application/rdf+xml; qs=0.4

URI: Overview.ttl
Content-Type: text/turtle; qs=0.5

which instructs the Apache server to choose the right file depending on the Accept header of the request. HTML is returned if both HTML and RDF/XML are accepted, and Turtle is preferred if both RDF/XML and Turtle are accepted by the client. That is the role of those “qs” values: a variant without an explicit qs (here, the HTML one) defaults to 1.0, so it outranks both RDF serializations.
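The effect of the “qs” values can be illustrated with a small Python sketch. It covers only the media-type step of the negotiation, scoring each variant as the client's q value multiplied by the server's qs value and picking the highest; Apache's real algorithm has further dimensions and special cases (language, encoding, wildcard handling) that are left out here. The function names are invented for this illustration, and the 1.0 for HTML is the default for a variant with no explicit qs.

```python
# Variants from the type map: file name -> (media type, qs).
VARIANTS = {
    "Overview.html": ("text/html", 1.0),            # no qs given: defaults to 1.0
    "Overview.rdf": ("application/rdf+xml", 0.4),
    "Overview.ttl": ("text/turtle", 0.5),
}


def parse_accept(header):
    """Return {media range: q} from an Accept header; q defaults to 1.0."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        prefs[parts[0].lower()] = q
    return prefs


def client_q(prefs, media_type):
    """Most specific match wins: exact type, then type/*, then */*."""
    major = media_type.split("/")[0]
    for key in (media_type, major + "/*", "*/*"):
        if key in prefs:
            return prefs[key]
    return 0.0


def choose_variant(accept_header):
    """Pick the variant with the highest q * qs, or None if nothing is acceptable."""
    prefs = parse_accept(accept_header)
    best_name, best_score = None, 0.0
    for name, (media_type, qs) in VARIANTS.items():
        score = client_q(prefs, media_type) * qs
        if score > best_score:
            best_name, best_score = name, score
    return best_name


print(choose_variant("text/html, application/rdf+xml"))    # HTML beats RDF/XML
print(choose_variant("application/rdf+xml, text/turtle"))  # Turtle beats RDF/XML
```

A client that really wants RDF/XML even though it can also read Turtle can say so with its own q values, for example "application/rdf+xml, text/turtle;q=0.5", which scores 0.4 against 0.25 in this model.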

That takes care of the content negotiation, but we are not yet done because, remember, the goal is to generate the RDF/XML and Turtle versions on the fly. This is achieved by adding the following lines to the .htaccess file in the directory:

RewriteEngine On
RewriteBase /2001/sw/
RewriteRule Overview.rdf /2007/08/pyRdfa/extract?uri=http://www.w3.org/2001/sw/Overview.html [L]
RewriteRule Overview.ttl /2007/08/pyRdfa/extract?format=turtle&uri=http://www.w3.org/2001/sw/Overview.html [L]

which instructs the server to run a script (i.e., the RDFa distiller) on the (X)HTML file whenever the RDF/XML or Turtle version is requested.
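What the two rewrite rules do can be mimicked in a few lines of Python. This is only a model of the mapping, not of mod_rewrite itself: it assumes the path has already been made relative to /2001/sw/ (as happens in a per-directory .htaccess context), treats each pattern as an unanchored regular expression, and stops at the first match as the [L] flag requests. The names RULES and rewrite are made up for the sketch.

```python
import re

PAGE = "http://www.w3.org/2001/sw/Overview.html"
DISTILLER = "/2007/08/pyRdfa/extract"

# (pattern, target) pairs mirroring the two RewriteRule lines, in order.
RULES = [
    (r"Overview.rdf", DISTILLER + "?uri=" + PAGE),
    (r"Overview.ttl", DISTILLER + "?format=turtle&uri=" + PAGE),
]


def rewrite(path):
    """Return the distiller URL for a matching path, or the path unchanged."""
    for pattern, target in RULES:
        if re.search(pattern, path):
            return target  # [L]: no further rules are tried
    return path


for name in ("Overview.html", "Overview.rdf", "Overview.ttl"):
    print(name, "->", rewrite(name))
```

Overview.html passes through untouched and is served from disk, while the two RDF names never need to exist as files: each request is answered by the distiller run on the HTML page.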

That is it… (And thanks to Ralph Swick and Tim Berners-Lee who gave me the right push and information to handle Apache.)

Filed by Ivan Herman on May 1, 2008 10:00 AM in Semantic Web, Technology 101, Tutorials
This blog is written by W3C staff and working group participants,
 and maintained by Karl Dubost and olivier Thereaux.