Wednesday, November 15, 2006

Wikipedia to serve as a global ontology...



Today, I met Lars Zapf for a quick coffee, enjoying the rare late afternoon November sun. We were exchanging news about ISWC, WebMonday, recent projects, and stuff like that. While talking about semantic annotation, Lars pointed out that instead of using (or developing) your own ontologies for annotating (and authoring) documents, you could also use a Wikipedia reference to indicate the semantic concept that you are writing about. Thus, as he already wrote in a comment, you could, e.g., use the link http://en.wikipedia.org/wiki/Rome to indicate that you are referring to the city of Rome, the capital of Italy.
Of course you might object that there are several language versions of Wikipedia and thus several (different) articles that refer to the city of Rome. To use Wikipedia as a 'commonly agreed and shared conceptualization' (fulfilling at least some points of Tom Gruber's ontology definition, as long as Wikipedia lacks the 'formal' aspect of machine understandability), we can make use of the fact that an article in one language version can be identified with its counterparts in other language versions with the help of the language indicators at the lower left side of Wikipedia's user interface. To serve as a real ontology, each Wikipedia article should (at least) be connected to a formalized concept (maybe encoded in RDF or OWL). This concept does not necessarily have to reflect all the aspects that are reported in the natural language Wikipedia article. E.g., Semantic MediaWiki is working on a wiki extension to capture simple conceptualizations (such as classes or relationships).
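Just to make the cross-language idea a bit more concrete, here is a minimal Python sketch. It assumes a small hand-written table of language links (the LANGUAGE_LINKS dictionary and both function names are made up for illustration; a real tool would have to obtain the language links from Wikipedia itself), and it simply picks the English article URL as the shared concept identifier, which is only one possible convention.

```python
from urllib.parse import urlparse, unquote

# Hypothetical, hand-written table of language links for a single concept.
# Key: English article title; value: article title per language version.
LANGUAGE_LINKS = {
    "Rome": {"en": "Rome", "de": "Rom", "it": "Roma", "fr": "Rome"},
}


def parse_article_url(url):
    """Split a Wikipedia article URL into (language code, article title)."""
    parts = urlparse(url)
    if not parts.path.startswith("/wiki/"):
        raise ValueError("not a Wikipedia article URL: " + url)
    lang = parts.netloc.split(".")[0]
    title = unquote(parts.path[len("/wiki/"):]).replace("_", " ")
    return lang, title


def canonical_concept(url, links=LANGUAGE_LINKS):
    """Map any language version of an article to one shared identifier.

    Returns the English article URL (an assumed convention), or None if
    the article is not in the table.
    """
    lang, title = parse_article_url(url)
    for concept, titles in links.items():
        if titles.get(lang) == title:
            return "http://en.wikipedia.org/wiki/" + concept.replace(" ", "_")
    return None
```

With this table, the German and the Italian article both resolve to the same identifier, so annotations written against different language versions can be compared.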
An application for authoring documents could easily be upgraded to offer links to related Wikipedia articles. If the author enters the string 'Rome', the application could offer the related Wikipedia link to Rome [or a selection of related candidates], and once the author has picked one, this link can be automatically encoded as a semantic annotation (link).
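A rough Python sketch of such an authoring helper might look like the following. Everything here is an assumption for illustration: the function names suggest_links and annotate are invented, the list of known article titles is passed in by hand instead of being looked up in Wikipedia, candidates are matched by a naive prefix comparison, and the annotation is encoded as a plain HTML link.

```python
from html import escape
from urllib.parse import quote

WIKIPEDIA_BASE = "http://en.wikipedia.org/wiki/"


def suggest_links(term, known_titles):
    """Return candidate article URLs whose title starts with the entered term."""
    needle = term.strip().lower()
    if not needle:
        return []
    return [
        WIKIPEDIA_BASE + quote(title.replace(" ", "_"))
        for title in known_titles
        if title.lower().startswith(needle)
    ]


def annotate(term, url):
    """Encode the link chosen by the author as an HTML anchor around the term."""
    return '<a href="{}">{}</a>'.format(escape(url, quote=True), escape(term))


# Hypothetical title list; a real application would query Wikipedia instead.
titles = ["Rome", "Roman Empire", "Romeo and Juliet", "Paris"]
candidates = suggest_links("Rome", titles)
annotation = annotate("Rome", candidates[0])
```

The author stays in control: the application only proposes candidates, and nothing is annotated until one of them has been selected.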
O.k., that sounds pretty simple. Are there any students out there who want to implement it (anybody in need of credit points??)? I would highly appreciate that...