June 1, 2009

PWC report on Semantic Web

There have already been a number of blog posts and tweets on PricewaterhouseCoopers’ Spring ’09 Technology Forecast on the Semantic Web, but it may still be worth writing about. The document can be downloaded from the Web free of charge in return for registration. It includes PWC’s own overview of the technology, plus interviews with Tom Scott (BBC), Uche Ogbuji (Zepheira), Lynn Vogel (University of Texas M.D. Anderson Cancer Center), and Frank Chum (Chevron).

The document is clearly not aimed at Semantic Web technologists. But there are a number of well-chosen wordings and quotes that might help us talk to people around us who have to be convinced of the value of Linked Data/Semantic Web. Just a few of those:

PricewaterhouseCoopers believes a Web of data will develop that fully augments the document Web of today. You’ll be able to find and take pieces of data sets from different places, aggregate them without warehousing, and analyze them in a more straightforward, powerful way than you can now.


Let’s say your agency represents musicians, and you want to develop your own ontology […]. You might create your own ontology to keep better tabs on what’s current in the music world […]. You also can link your ontology to someone else’s and take advantage of their data in conjunction with yours. Contrast this scenario with how data rationalization occurs in the relational data world. Each time, for each point of data integration, humans must figure out the semantics for the data element and verify through time consuming activities that a field with a specific label […] is actually useful, maintained, and defined to mean what the label implies. Although an ontology-based approach requires more front-end effort than a traditional data integration program, ultimately the ontological approach to data classification is more scalable […]. It’s more scalable precisely because the semantics of any data being integrated is being managed in a collaborative, standard, reusable way.


With the Semantic Web, you don’t have to reinvent the wheel with your own ontology, because others […] have already created ontologies and made them available on the Web. As long as they’re public and useful, you can use those. Where your context differs from theirs, you make yours specific, but where there’s commonality, you use what they have created and leave it in place. Ideally, you make public the non-sensitive elements of your business-specific ontology that are consistent with your business model, so others can make use of them. All of these are linked over the Web, so you have both the benefits and the risks of these interdependencies. Once you link, you can browse and query across all the domains you’re linked to.


Traditional data integration methods have fallen short because enterprises have been left to their own devices to develop and maintain all the metadata needed to integrate silos of unconnected data. As a result, most data remain beyond the reach of enterprises, because they run out of integration time and money after accomplishing a fraction of the integration they need.[…] The most basic lesson is that data integration must be rethought as data linking—a decentralized, federated approach that uses ontology-mediated links to leave the data at their sources. The philosophy behind this approach embraces different information contexts, rather than insisting on one version of the truth, to get around the old-style data integration obstacles.

Yeah, we all know that, right? But can we really put it in succinct terms for outsiders? That is not that easy… I.e., the report is worth reading (and thanks to PWC!).


