Ivan’s private site

July 15, 2007

PURL to be renewed

Filed under: Semantic Web,Work Related — Ivan Herman @ 15:55

Unless you are a reporter, you rarely read press releases… However, the latest press release from OCLC is worth noting for the Semantic Web community: “OCLC to work with Zepheira to redesign OCLC’s PURL service”.

Two aspects of the announcement really caught my eye:

  1. The way I understand it, the service will provide an implementation of what became known as HttpRange-14, i.e., how to define URIs for information and non-information resources. And this is really great: the theory of HttpRange-14 is one thing, its practical deployment is another. Unless one has access to the controls of his/her server (e.g., to an .htaccess file for an Apache server), it is not that easy to adopt it in practice. With the renewed PURL service this should become a breeze…
  2. The code of PURL will be released as open source. That is, other parties can set up similar services using the same software. I could see a number of, say, specialized communities making use of that feature in the future. I think this will play a very important role. (For example, look at the way UniProt defines its URIs these days: the URI scheme used in the announcement, i.e., http://purl.uniprot.org/{db}/{id}, suggests that this community could very well make use of the renewed PURL software if they wish.)
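For readers unfamiliar with the mechanics: the W3C TAG's HttpRange-14 resolution ties what can be said about a resource to the HTTP status returned for a GET on its URI. A minimal Python sketch of that rule follows; the function name `classify_resource` and the wording of its return strings are illustrative only and are not taken from the PURL software.

```python
def classify_resource(status: int) -> str:
    """Interpret a GET response status per the TAG's HttpRange-14 resolution."""
    if 200 <= status < 300:
        # 2xx: the URI identifies an information resource (a document).
        return "information resource"
    if status == 303:
        # 303 See Other: the URI may identify anything (a person, a protein...);
        # the Location header points to a document describing it.
        return "any resource; follow Location for a description"
    if 400 <= status < 500:
        # 4xx: the nature of the resource is unknown.
        return "unknown"
    # Other status codes are outside the scope of the resolution.
    return "not covered by the resolution"
```

A redesigned PURL server would presumably issue the 303 on the publisher's behalf, which is what removes the need to touch the Web server's own configuration.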

By the way, if you wonder who “Zepheira” is, look at their team page. Some familiar faces and names there…



  1. Unless one has access to the controls of his/her server (e.g., to an .htaccess file for an Apache server), it is not that easy to adopt it in practice

    Actually, all that is needed is some decent support for producing dynamic content, e.g. with PHP. In my case, http://ssfak.org/stelios/ responds with a 303 redirect that takes the Accept header into account, using the following PHP script stored in an index.php file in the ../stelios/ directory:

    Comment by Stelios Sfakianakis — July 16, 2007 @ 0:00

  2. The PHP script:

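    <?php
    // Send a 303 redirect, choosing the target from the Accept header.
    $ac = isset($_SERVER["HTTP_ACCEPT"]) ? $_SERVER["HTTP_ACCEPT"] : "";
    header("HTTP/1.1 303 See Other");
    if (strpos($ac, "application/rdf+xml") !== false) {
        header("Location: http://ssfak.org/foaf.rdf");  // RDF-aware clients
    } else {
        header("Location: http://ssfak.org/");          // everyone else
    }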

    Comment by Stelios Sfakianakis — July 16, 2007 @ 0:02

  3. Well… it may not be that simple 😦

    I used Recipe #3 in the Best Practices document to set up (via .htaccess) my URI http://www.ivan-herman.net/Ivan_Herman but, as Richard Cyganiak found out, there are problems with that recipe if multiple formats are to be accepted. This is also the case with your example; indeed

    Accept: text/html, application/rdf+xml;q=0.5

    should return HTML but it will return RDF/XML… This is of course solvable with a slightly more complicated PHP script that takes all possible combinations into account, but this does not invalidate my point: it may be too complex for an average user…

    Comment by Ivan Herman — July 16, 2007 @ 18:45

  4. Sure, there are problems, because content negotiation is not simple. Nevertheless, as in the case of the .htaccess recipe, it is assumed that a client interested in “semantic content” will give higher priority to application/rdf+xml than to anything else (e.g. I think Tabulator sends something like this: Accept: application/rdf+xml, application/xhtml+xml;q=0.3, text/xml;q=0.2, application/xml;q=0.2, text/html;q=0.3, text/plain;q=0.1), while a common client (e.g. all current browsers) will not specify this media type at all in the Accept header. This of course may not be the case in the future (or even today, in some cases), so one can use e.g. http://jystewart.net/process/2005/06/managing-content-negotiation-with-php/

    In my case I didn’t pay too much attention to that…

    Comment by Stelios Sfakianakis — July 18, 2007 @ 9:04

  5. True for Tabulator but, for example, the recipes fail on that account 😦 One has to reverse the order of the rewrite rules to get the desired effect.
    I do not think we really disagree. Nobody said it cannot be done; what I said was that it is not very simple, and certainly not simple enough for potential Semantic Web users who, frankly, do not want to dive into the details of Accept headers… and a new PURL would be a great help. That is all.

    Comment by Ivan Herman — July 18, 2007 @ 10:03
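
The negotiation problem discussed in the comments above comes down to honouring q-values instead of testing the Accept header for a substring. Here is a rough Python sketch of that idea, assuming only the two media types from the thread are on offer. `parse_accept`, `choose_location`, and the `OFFERS` table are hypothetical names, and wildcards such as `*/*` are deliberately ignored to keep it short:

```python
# Hypothetical table: offered media type -> 303 redirect target (URLs from the thread).
OFFERS = {
    "text/html": "http://ssfak.org/",
    "application/rdf+xml": "http://ssfak.org/foaf.rdf",
}


def parse_accept(header: str) -> dict:
    """Return {media_type: q} for each comma-separated entry; q defaults to 1.0."""
    prefs = {}
    for entry in header.split(","):
        parts = [p.strip() for p in entry.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue  # skip empty entries (e.g. an empty header)
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        prefs[media_type] = q
    return prefs


def choose_location(header: str, default: str = "text/html") -> str:
    """Pick the redirect target for the offered type with the highest q-value."""
    prefs = parse_accept(header)
    # On a tie, prefer the default (HTML) representation.
    best = max(OFFERS, key=lambda t: (prefs.get(t, 0.0), t == default))
    if prefs.get(best, 0.0) <= 0.0:
        best = default  # nothing on offer was requested explicitly
    return OFFERS[best]
```

With the header `text/html, application/rdf+xml;q=0.5` this picks the HTML target, whereas a plain substring test for `application/rdf+xml` would send the client to the RDF/XML file. A production script would also need to handle wildcards and media-type parameters, which is the kind of detail most users would rather leave to a service.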
