Ivan’s private site

April 27, 2008

Setting up an RDFa file with Apache (second)

Filed under: Code, Semantic Web, Work Related — Ivan Herman @ 0:58

A few weeks ago I wrote a short post on how I set up an RDFa file with Apache. As masaka commented there (and I also received some private comments), that setup had the disadvantage that it went wrong if a client sent an Accept header that referred to both HTML and RDF. Essentially, the server picked whichever came first in the .htaccess file.

So I had to revise it (and ask advice from those who understand how Apache works). Here is how SW-FAQ is now set up. It is a little bit more complicated and requires the “.var” facilities to be switched on in Apache. In the same directory where I store the SW-FAQ.html file, I also store a SW-FAQ.var file (called a “type map file”). It looks as follows:

URI: SW-FAQ

URI: SW-FAQ.html
Content-Type: text/html

URI: SW-FAQ.rdf
Content-Type: application/rdf+xml; qs=0.5

The .htaccess file in the same directory is now simpler; it just says:

RewriteEngine On
RewriteBase /2001/sw/
RewriteRule SW-FAQ.rdf /2007/08/pyRdfa/extract?uri=http://www.w3.org/2001/sw/SW-FAQ.html [L]

The .var file switches on Apache’s content negotiation mechanism. The media type determines which version is returned, and this takes into account the “quality” parameter of the HTTP Accept header, too. The .htaccess file then just directs the server to run the RDFa distiller and return the result when RDF is required. This seems to work better…
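A rough way to picture what the negotiation does with this type map: as far as I understand Apache’s algorithm, its first step multiplies the q value the client gives for a media type by the qs value of the variant and keeps the highest product. The Python sketch below mimics only that first step. The function names are made up for illustration, and the real algorithm has further tie-breaking steps and special handling of wildcards that are not modelled here.

```python
# Variants as listed in the type map: (URI, media type, qs).
# text/html has no explicit qs in the type map, so 1.0 is assumed.
VARIANTS = [
    ("SW-FAQ.html", "text/html", 1.0),
    ("SW-FAQ.rdf", "application/rdf+xml", 0.5),
]


def parse_accept(header):
    """Parse an Accept header into a list of (media_range, q) pairs."""
    ranges = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        ranges.append((parts[0], q))
    # Assumption: a missing or empty header means "anything is fine".
    return ranges or [("*/*", 1.0)]


def client_q(media_type, ranges):
    """Return the q of the most specific range matching media_type (0.0 if none)."""
    main = media_type.split("/")[0]
    best_specificity, best_q = -1, 0.0
    for media_range, q in ranges:
        if media_range == media_type:
            specificity = 2
        elif media_range == main + "/*":
            specificity = 1
        elif media_range == "*/*":
            specificity = 0
        else:
            continue
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q


def negotiate(accept, variants=VARIANTS):
    """Pick the variant with the highest q * qs, or None if nothing is acceptable."""
    ranges = parse_accept(accept)
    best_uri, best_score = None, 0.0
    for uri, media_type, qs in variants:
        score = client_q(media_type, ranges) * qs
        if score > best_score:
            best_uri, best_score = uri, score
    return best_uri


if __name__ == "__main__":
    for header in ("text/html",
                   "application/rdf+xml",
                   "text/html, application/rdf+xml",
                   "application/rdf+xml, text/html;q=0.4"):
        print(header, "->", negotiate(header))
```

With this model, a client that lists both types without q values gets the HTML version, because the RDF variant is weighted down by its qs of 0.5; a client that asks for RDF only, or that clearly ranks HTML lower, gets the RDF version.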

(It was TimBL who pushed me to do the changes this time, and the basic structure comes from him, actually…)

April 24, 2008

Semantic Web W3C Track at WWW2008

Filed under: Semantic Web, Work Related — Ivan Herman @ 3:51

Yesterday I chaired a Semantic Web session at the W3C Track at WWW2008. Nice turnout (about 100 people), and I had to cut the discussions to keep within schedule, which is always a good sign…

Three presentations, fairly different from one another. Tom Heath and Chris Bizer made a presentation (co-authored with Tim Berners-Lee) on the Linking Open Data project. Real good stuff. Maybe the most impressive part was when Chris flipped through the figures on the “current” status of the linked dataset, starting from a year ago at WWW2007 up to April 2008. And the fact that, actually, we have essentially lost track of how many triples are out there; there are simply too many of those! I also did not know that Tom worked on Revyu by automatically adding information coming from DBpedia to an entry. I really hope that the coming year will see lots of user applications that rely on this huge amount of public RDF data out there…

Raphaël Troncy made a presentation on managing multimedia content on the Semantic Web. The situation today is really a maze with all kinds of standards, semi-standards, etc., on how to describe, annotate, and reason about, say, video. Lots of work ahead, both in the Semantic Web area and in others. Think of the fact that we still do not have a generally accepted URI to describe something like an area in an image, or a specific point in time in a video. (There was, actually, a short discussion after the presentation on how some of the current URI schemes fit, or do not fit, the general Web Architecture…)

Huajun Chen gave an overview on what is happening in the Semantic Web area in China. In two words: a lot. Some of the technologies developed in China are now well-known all around, some of them less. We should realize that there are more Semantic Web related blogs and subscribers to local mailing lists than anywhere else… I think one of the challenges is to bind the various SW communities beyond the boundaries of languages, where Chinese is probably the largest “local” community. I do not have any magic bullet here, but presentations like Huajun’s are important to have…

April 1, 2008

Data Portability Rock ‘n Roll

Filed under: Semantic Web, Work Related — Ivan Herman @ 11:20

John has already praised this video in his blog but, well, if you have missed his blog: look at Danny (Ringo) Ayers’ video on data portability, foaf, rdf, etc. It is worth it!

March 23, 2008

IR and SW communities (Baeza-Yates et al.’s comments)

Filed under: Semantic Web, Work Related — Ivan Herman @ 10:37

It is quite unfortunate that IEEE still has not recognized the power of the Web, and their publications are still accessible through subscriptions only. This is also true for sections such as the “Trends & Controversies” of the IEEE Intelligent Systems. Anyway, I hope I do not break any copyright by quoting a few sentences…

The latest (i.e., January/February 2008) issue of IEEE IS includes a T&C section entitled “Near-Term Prospects for Semantic Technologies”. It includes a number of short papers, all worth reading. Among those there is also a short contribution by Ricardo Baeza-Yates, Péter Mika and Hugo Zaragoza (all three from Yahoo! Research in Barcelona, Spain) on the relationships between search and SW. They ask the question: “Why has the Semantic Web had so little effect on search services?” They put forward several reasons, but I was most struck by the “cultural” divide that they claim exists between the communities. Here is what they write:

[...] a clear cultural divide exists between the IR and Semantic Web communities. IR conferences prove difficult for researchers from other fields to attend, owing largely to a strong emphasis on methodology and, in particular, evaluation. Consequently, existing work on ontology-based IR focuses on smaller subtasks such as query expansion using ontologies or improving search result presentation, but there’s little work on reshaping the IR core. [...] Also, as Semantic Web research continues to experience considerable growth and attract significant funding for basic research, members of the community feel less compelled to even attempt breaking these barriers, real or imagined, despite the significant economic motivators. IR research is strongly driven by a problem, whereas Semantic Web research is driven by a solution. Metaphorically speaking, Semantic Web researchers are like the hobbyist toolsmith who has the idea for the perfect tool and resents compromises on the design. However, IR, as a customer, is interested in buying a hammer that might not be perfect but can drive nails quickly and precisely.

Let us not forget: these are friendly critiques coming from a group that has just stirred a significant interest in being at the forefront in binding these two areas! Worth remembering, in my view.

Add-on to the original post on 2008-04-05: I have received a mail from IEEE saying that the T&C section is now available online. Great…

March 3, 2008

SemTech conference 2008

Filed under: Semantic Web, Work Related — Ivan Herman @ 10:57

This year’s version of the Semantic Technology conference has its program online. Lots of good stuff in perspective, it will be an interesting week…

A new feature of the conference is on its web site: there is a cool personal scheduler based on Exhibit which you can use to plan your week, export that into RDF or ICS, add it to your calendar…  nice stuff!

March 2, 2008

Book worth reading: on Paul Erdős

Filed under: General, Hungary, Private, Work Related — Ivan Herman @ 10:54

If you are interested in the personalities behind mathematics, or simply in the peculiar mind of a genius, it is worth reading Bruce Schechter’s book on Paul Erdős [1] (well, with my Hungarian background I should really write Erdős Pál). Erdős Pál was undeniably one of the greatest scientific minds of the 20th century and certainly one of the greatest mathematicians ever. But also a very peculiar personality. He lived a completely “monastic” life; he never had a fixed job or a place he could really call “home”, all his worldly possessions would fit into a suitcase, and he spent most of his life traveling around the globe from one conference to the other, from one city to the other, wherever he had friends he could do mathematics with. He was author or co-author of around 1,500(!) articles; the number of collaborators was so big that the community came up with the humorous notion of “Erdős number”. He was also incredibly generous in helping young, talented mathematicians to start their career.

I did not have the pleasure to meet Erdős personally, although I had the privilege of having some of his closest collaborators as my teachers at the University of Budapest in the early 70’s (Turán, Sós, Simonovits, Hajnal,…). But he regularly came back to Hungary. We never knew when (nobody did, in fact); the news suddenly spread among us that Erdős was in Budapest and that he would make a presentation, well, tomorrow afternoon. And we went, forgetting our regular, scheduled courses, and listened to his talk. His lectures were always great, witty, and full of interesting and unsolved problems. He would usually come with a problem saying “this seems to be an open issue, I have the feeling that it could be solved this and this way; I give a prize of $100 to whoever solves this”. Or $10 or $1,000, depending on the problem (although the monetary side was not the most important in trying to solve those problems; the perspective of gaining an Erdős number 1, i.e., becoming one of Erdős’ co-authors, was much more of an incentive). Even if our field of interest did not coincide with Erdős’, these lectures were always among the highlights of the year. And it did not occur often; I think in those 5 years that I spent at the University, I saw him twice, or maybe three times… certainly not more.

It may be an unusual analogy, but his personality and the style of his appearances remind me of another genius in a totally different area, namely Sviatoslav Richter. Much like Erdős, he was one of the greatest personalities of the century in a particular field (as a classical pianist) who also led a kind of reclusive, monastic life without real possessions and ignoring all traditional signs of success. And, much like with Erdős, nobody knew when he would appear in Budapest for a concert nor what he would play; the news spread among those interested and we all ran to listen to his performances (played in a darkened concert hall, with barely a small lamp on the piano illuminating the music sheet only). And his performance of Bach’s Well Tempered Clavier remains among the most cherished memories I have from my youth. Much like Erdős’ occasional visits.

A book worth reading.

[1] Bruce Schechter: “My Brain is Open, the Mathematical Journeys of Paul Erdős”. The book is not new, but it has just appeared in some airports; that is where I found it…

February 22, 2008

Setting up an RDFa file with Apache

Filed under: Semantic Web, Work Related — Ivan Herman @ 17:56

As I said yesterday, the SW FAQ file is now in XHTML/RDFa. However, I was wondering how to set up the environment so that the right URIs would lead to the right format, i.e., either HTML or RDF. Of course, one could generate the SW-FAQ.rdf file offline and put that on the server, but that sounded a little bit like cheating (although, I must admit, that is what I did first). What one would like is

  • http://www.w3.org/2001/sw/SW-FAQ should return
    • XHTML by default
    • RDF/XML if so requested, but generated from the XHTML file on-the-fly via an RDFa processor; and do that with an HTTP 303 round trip (to make it really neat)
  • http://www.w3.org/2001/sw/SW-FAQ.rdf should return RDF/XML, again generated on-the-fly
  • http://www.w3.org/2001/sw/SW-FAQ.html should return, well, XHTML

It so happens that, on Apache, a little bit of .htaccess wizardry works. The problem is that you have to be the wizard, which I am not. Luckily, my colleague and friend Ralph Swick is :-). So here is the .htaccess file:

RewriteEngine On
RewriteBase /2001/sw/

#This is where the RDFa distiller is called on-the-fly:
RewriteRule SW-FAQ.rdf /2007/08/pyRdfa/extract?uri=http://www.w3.org/2001/sw/SW-FAQ.html [L]

# Take care of the RDF case when so requested
RewriteCond %{HTTP_ACCEPT} application/rdf\+xml
RewriteRule ^SW-FAQ$ SW-FAQ.rdf [R=303,L]

RewriteRule ^SW-FAQ$ SW-FAQ.html [L]

And voilà! Thanks Ralph…
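To make the order of evaluation explicit, here is a small Python imitation of the three rules above. It is only a sketch: the function name `old_rules` and the returned labels are invented for illustration, the path is assumed to be already relative to the RewriteBase, and mod_rewrite does much more than this. Note that the RewriteCond only checks whether the string application/rdf+xml occurs in the Accept header; any q values are ignored.

```python
import re

# Target of the first rule, copied from the .htaccess file above.
DISTILLER = "/2007/08/pyRdfa/extract?uri=http://www.w3.org/2001/sw/SW-FAQ.html"


def old_rules(path, accept):
    """Apply the three rules in order and return (action, target).

    'path' is the request path relative to /2001/sw/ (an assumption
    about how the per-directory rules see the URL).
    """
    # Rule 1: any request for the .rdf URI goes to the RDFa distiller.
    if re.search(r"SW-FAQ.rdf", path):
        return ("internal", DISTILLER)
    # Rule 2: the condition is a plain substring test on the Accept header.
    if re.search(r"application/rdf\+xml", accept) and re.search(r"^SW-FAQ$", path):
        return ("redirect 303", "SW-FAQ.rdf")
    # Rule 3: everything else asking for SW-FAQ gets the XHTML file.
    if re.search(r"^SW-FAQ$", path):
        return ("internal", "SW-FAQ.html")
    # No rule matched: the request is served as usual.
    return ("pass", path)


if __name__ == "__main__":
    print(old_rules("SW-FAQ", "text/html"))
    print(old_rules("SW-FAQ", "application/rdf+xml"))
    # A client that strongly prefers HTML is still redirected to RDF:
    print(old_rules("SW-FAQ", "text/html, application/rdf+xml;q=0.1"))
```

The last call shows the limit of this approach: a client that mentions RDF at all, even with a very low preference, is sent to the RDF version.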

February 21, 2008

RDFa Syntax LC is out

Filed under: Python, Semantic Web, Work Related — Ivan Herman @ 19:41

The RDFa Syntax Last Call document has just been published; yey!

I have also made an update of the RDFa processor that I coded last summer; it is now available for download and is also used through the “RDFa Distiller” service page. I have played with RDFa in practical terms, too; my foaf file in HTML, the W3C SW Activity Home page, and the Semantic Web FAQ page are all annotated with RDFa now. Once one is used to it, it is fairly straightforward to add even complex RDF statements to HTML pages with an arbitrarily large number of different vocabularies mixed in. Of course, authoring tools would be good, but let us take things one step at a time… Having the Last Call published (i.e., the Working Groups believe they have taken care of all technical issues) is a major step ahead!

B.t.w., Benjamin Nowack jumped on the SW-FAQ RDF file to make a nice little hack; here is the mail he sent on the SW SWEO list the other day:

Heh, silly stuff, just FYI: On the #foaf channel is foafbot (a SPARQLy reincarnation
of an earlier bot we had there years ago). It understands RDFa, and allows the
specification of custom commands at [1]. I made it load Ivan’s FAQ, and
created an “faq” command, so that you can now pass a keyword or phrase
to the bot and it will respond with a pointer to the FAQ (if something
matched the RDFa-encoded question), e.g.:

<bengee> foafbot, faq giant ontology

<foafbot> bengee, see http://www.w3.org/2001/sw/SW-FAQ#whgiantont ;)

Benji

[1] http://semsol.org/semcamp/sparqlbot

Isn’t that cool? As far as I could see, it took him about 10 minutes to add this hack, thanks to the SW-FAQ being in RDF…

SW for Health Care and Life Sciences Workshop, W3C Track

Filed under: Semantic Web, Work Related — Ivan Herman @ 11:10

The program for WWW2008 is really shaping up. I already blogged a while ago on the SW-related stuff at the conference, and on the LOD workshop program yesterday. Well, the program of the Health Care and Life Sciences Workshop is also public now. Again, lots of great stuff there. Last but not least: the program of the W3C Track is also public with, as usual, a SW session (and others!).

It will be an interesting week (and an interesting place).

February 20, 2008

Linked Data on the Web Workshop in Beijing

Filed under: Semantic Web, Work Related — Ivan Herman @ 17:06

The preliminary programme for the “Linked Data on the Web” Workshop (one of the workshops at the WWW2008 conference) is now online. It looks really good… worth checking out!
