Friday, August 17, 2012

Who Knows Movies - The 2nd Round

Play it
We are happy to announce that our paper
Andreas Thalhammer, Magnus Knuth and Harald Sack: "Evaluating Entity Summarizations Using a Game-Based Ground Truth" 
has been accepted for the Evaluations and Experiments Track of ISWC 2012. We want to thank all of you who played our WhoKnowsMovies? game very much; this allowed us to collect the necessary data for this publication. And since we also want to provide updated statistics for the final version of our paper, you are very welcome to play a little bit more... :)

The game has received a small upgrade that includes more questions, and we need more data to provide more reliable evidence for our assumption that our proposed fact ranking for entity summarization really is better than a random choice.

Therefore, please help us and play the game. Test your knowledge about movies! Can you beat the high score?

P.S. The already gathered (anonymized) data is available at http://www.yovisto.com/labs/iswc2012/.

Saturday, June 23, 2012

Who knows Movies? Another Game with a Purpose for the Semantic Web

Play it
Of course you will remember our last quiz game 'Who Knows' (cf. the post below), where we used DBpedia facts to generate questionnaires for a game with a purpose. The very first application of this game was to evaluate some newly developed entity/property heuristics via crowdsourcing. Then we realized that the gathered data also had some collateral benefits, such as the detection of inconsistencies and flaws within the Linked Data resources that we used for generating the questions.

Now we are looking into another task: entity summarization. Entity summarization means that we try to select only the most important facts that characterize a distinct entity. Just think of the Google Knowledge Graph, which displays entity summaries from Freebase. Entity summarization is thus rather similar to relevance ranking of facts.

To generate a ground truth for the evaluation of entity summarization heuristics, we adapted our quiz game WhoKnows? to become WhoKnows?Movies! We are cooperating with Andreas Thalhammer from the University of Innsbruck on this task; we have restricted the domain of questions to popular movies and adapted the generation of questionnaires to utilize the Freebase knowledge base. Now we have to gather data....

Therefore, please help us and play the game. Test your knowledge about movies! Can you beat the high score?


P.S. Of course all the data gathered will be anonymized and made publicly available.

Saturday, June 25, 2011

New Challenge! Playing WhoKnows? to develop a new Teflon Pan


Some of you already may know our --serious-- fun game WhoKnows? (N. Ludwig, J. Waitelonis, M. Knuth, H. Sack: WhoKnows? - Evaluating Linked Data Heuristics with a Quiz that Cleans Up DBpedia) that has been presented inter alia at this year's ESWC 2011 (cf. picture from poster session).

WhoKnows? is a quiz game based on the DBpedia dataset; while answering the quick questions, the player produces data that can be used for ranking facts from the underlying knowledge base. Furthermore, the player can mark strange questions, which often originate from inconsistencies that we want to identify this way.

Now, as a next step, we want to apply the collected data to the development of an expert finder and a user interest profile recommendation system. For this we would appreciate a larger data set that allows us to rate the expertise of several users in various domains. If you would like to contribute to this research, you can do so easily by playing WhoKnows? on Facebook. In order to make a sound statement about your expertise, we need at least about thirty answered questions, and of course the more the better.

Of course all gathered data will be anonymized before analysis and evaluation.

Don't forget: it's really fun and educational at the same time!
Your help really is appreciated. Thank you for playing!

P.S. We would be pleased to inform you about the final results if you are interested. Just send us an e-mail.

Friday, February 25, 2011

Mediaglobe - the Digital Archive

The new teaser trailer for our project 'Mediaglobe - The Digital Archive' is online. More information about the semantic video search engine project is available on the Mediaglobe project website or on our Semantic Technologies website at HPI.


Wednesday, November 24, 2010

'Who knows?' - A Semantic Web Game

Please support our research by playing our Semantic Web Game 'Who Knows'!

What is 'Who Knows?'
'Who Knows?' is a simple Q&A Game in the style of 'Who wants to be a Millionaire'. The questions are automatically generated from DBpedia content.

What is the purpose of 'Who Knows?'
The purpose is the evaluation of some heuristics that are used to determine a ranking of facts within a knowledge base such as DBpedia.

These are the simple assumptions 'Who Knows?' is based on:
  1. If a user knows the correct answer, the fact seems to be 'important'.
  2. If a user doesn't know the correct answer, the fact seems to be not so 'important'.
  3. If a user votes the question to be wrong, odd, or strange, the fact seems to be 'irrelevant'.
There are different variants to play the game:
  1. One-on-One questions -- only one choice is correct.
  2. N-to-One questions -- there are multiple correct answers.
  3. Hangman -- find the answer by playing the popular game of hangman.
  4. Maths -- find the answer and compute a simple arithmetic formula.
Meanwhile, you will receive points for correct answers. The faster you provide the answer, the more points you will get. If you provide the wrong answer, you'll lose a life and some points will be taken from your score.
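
To illustrate how the three assumptions above could be turned into a fact ranking, here is a minimal sketch in Python. The outcome labels, the input format, and the scoring rule (share of correct answers, with mostly-flagged facts set aside) are assumptions made for this example; the post does not describe the heuristics actually evaluated with the game.

```python
from collections import defaultdict

# Hypothetical outcome labels; the game's real data format is not given in the post.
CORRECT, WRONG, FLAGGED = "correct", "wrong", "flagged"


def rank_facts(answers):
    """Rank facts by their share of correct answers.

    answers: iterable of (fact, outcome) pairs.
    Returns (ranking, irrelevant): ranking is a list of (fact, score) pairs
    sorted by descending score; irrelevant is the set of facts that were
    flagged more often than they were answered.
    """
    counts = defaultdict(lambda: {CORRECT: 0, WRONG: 0, FLAGGED: 0})
    for fact, outcome in answers:
        counts[fact][outcome] += 1

    ranking, irrelevant = [], set()
    for fact, c in counts.items():
        answered = c[CORRECT] + c[WRONG]
        if c[FLAGGED] > answered:
            # Assumption 3: questions voted wrong/odd mark the fact as irrelevant.
            irrelevant.add(fact)
            continue
        # Assumptions 1 and 2: known facts score high, unknown facts score low.
        ranking.append((fact, c[CORRECT] / answered))
    ranking.sort(key=lambda item: item[1], reverse=True)
    return ranking, irrelevant


answers = [
    ("f1", CORRECT), ("f1", CORRECT), ("f1", WRONG),
    ("f2", WRONG),
    ("f3", FLAGGED),
]
ranking, irrelevant = rank_facts(answers)
```

A real evaluation would probably also weight answers by response time or player expertise; this sketch deliberately ignores both.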

Try to score as many points as possible and don't forget to tell your friends!!!!

Sunday, September 05, 2010

Cobbler, Stick to Your Last - Stendhal and Cryptography

What do the French author Henri Beyle aka Stendhal and cryptography have to do with each other, the gentle reader may ask. The answer lies in Stendhal's great novel 'The Charterhouse of Parma', which I had chosen as my holiday reading this year (and reviewed here in the biblionomicon). It tells the story of the young hero Fabrizio, who, after an unintended duel with a fatal outcome, is thrown into the notorious citadel of Parma, from which nobody has ever managed to escape. It rises as a massive tower, modeled on the Castel Sant'Angelo in Rome, and Fabrizio's prison cell sits on its platform. Fabrizio loves Clelia, the daughter of the fortress's commanding general, and communicates with her via 'optical telegraphy'. This is done in the simplest way imaginable: pages from a book are each marked with one large letter of the alphabet and shown one after another at a window.

So far, so good... that still has nothing to do with encryption. Fabrizio's friends outside the fortress hatch an escape plan and therefore have to communicate with him undetected. They manage this at night using light signals - again 'optical telegraphy', which however has to be encrypted so that nobody uncovers their plans. In the first, still unencrypted version, each letter corresponds to a sequence of light flashes according to its position in the alphabet, i.e. one flash for 'a', two flashes for 'b', and so on.
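
The unencrypted signalling scheme can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the 26-letter basic Latin alphabet is an assumption, since the novel does not spell out which alphabet the friends use.

```python
import string

ALPHABET = string.ascii_lowercase  # assumption: basic 26-letter Latin alphabet


def letter_to_flashes(letter):
    """Number of flashes for one letter: 'a' -> 1, 'b' -> 2, ..., 'z' -> 26."""
    if len(letter) != 1:
        raise ValueError("expected a single character")
    index = ALPHABET.find(letter.lower())
    if index < 0:
        raise ValueError(f"not a letter of the basic alphabet: {letter!r}")
    return index + 1


def encode(message):
    """Turn a message into a list of flash counts, skipping non-letters."""
    return [letter_to_flashes(ch) for ch in message if ch.isalpha()]


def decode(flashes):
    """Turn a list of flash counts back into lowercase letters."""
    return "".join(ALPHABET[n - 1] for n in flashes)


signal = encode("Fuga")
```

Since the mapping is fixed and publicly guessable, anyone watching the lights can run `decode` as well, which is exactly why the friends need a key.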

For the actual encryption, a letter is smuggled into Fabrizio's cell. However, it contains not only the escape plan in plaintext but also the complete key (for a simple substitution cipher) for the future light-signal communication. The key alone would have been dangerous enough. I wonder: if the escape plan is already communicated in detail, why is a key needed at all? If the letter were compromised, everything would be lost...

However, the appendix of the book explains that Stendhal, in his capacity as French consul in Civitavecchia, is said to have made a similar lapse in the course of his official duties. In 1835 he added the complete key in plaintext to an encrypted letter to the French foreign minister and sent both messages together in one letter. This is an absolute beginner's mistake, and Stendhal was quite rightly officially reprimanded for it by the foreign minister.

Still, the episode makes a nice literary example of cryptography, which I will gladly pick up in one of my lectures. Alongside Edgar Allan Poe's 'The Gold-Bug' and Arthur Conan Doyle's Sherlock Holmes story 'The Musgrave Ritual', it is another example of using cryptography to heighten the suspense in a novel.

Tuesday, July 27, 2010

There are more Things in Heaven and Earth... - DBPedia Link Graph Analysis Revisited

In the course of our ongoing work with Linked Open Data, we recently analyzed the graph structure of DBPedia data. For this we considered only the original link graph (aka 'wikilinks'), on which we first did some cleanup, e.g., resolving redirects.

As a side effect, we had to compute in-degree and out-degree of all DBPedia entities according to wikilinks, ... and we discovered some more or less surprising facts (thanks to Nadine):

The entity with the highest out-degree (i.e. number of outgoing links) currently is:
http://dbpedia.org/resource/List_of_places_in_Afghanistan
with 7,147 outlinks (after cleanup, and remember these are wikilinks, not the typed links of DBPedia)

The entity with the highest in-degree (i.e. number of incoming links) currently is:
http://dbpedia.org/resource/Living_people
with 440,151 inlinks (after cleanup)

While the 2nd one (living people) seemed pretty clear to me, the first (Afghanistan places...) was a bit of a surprise (as are the trilobites...). For all the explorers among us, I have included the Top Ten lists of incoming and outgoing wikilinks, each with in-degree and associated out-degree...


Top Ten Incoming in out
http://dbpedia.org/resource/Living_people 440151 0
http://dbpedia.org/resource/United_States 385407 963
http://dbpedia.org/resource/France 124206 759
http://dbpedia.org/resource/England 123223 1320
http://dbpedia.org/resource/United_Kingdom 121203 1152
http://dbpedia.org/resource/List_of_sovereign_states 114086 465
http://dbpedia.org/resource/Canada 105849 523
http://dbpedia.org/resource/Germany 103382 889
http://dbpedia.org/resource/Animal 98680 236
http://dbpedia.org/resource/World_War_II 93555 771
http://dbpedia.org/resource/Association_football 90673 196



Top Ten Outgoing in out
http://dbpedia.org/resource/List_of_places_in_Afghanistan 9 7147
http://dbpedia.org/resource/Flora_of_New_South_Wales 917 6819
http://dbpedia.org/resource/List_of_municipalities_of_Brazil 1 5503
http://dbpedia.org/resource/Index_of_India-related_articles 4 5369
http://dbpedia.org/resource/Area_codes_in_Germany 6 5360
http://dbpedia.org/resource/IUCN_Red_List_vulnerable_species_%28Plantae%29 0 5172
http://dbpedia.org/resource/List_of_trilobites 6 5102
http://dbpedia.org/resource/List_of_Social_Democratic_Party_of_Germany_members 24 5078
http://dbpedia.org/resource/List_of_French_words_of_Germanic_origin 9 5010
http://dbpedia.org/resource/Index_of_Thailand-related_articles 4 4831
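
For readers who want to reproduce this kind of analysis, the degree computation can be sketched roughly as follows in Python. This is a simplified illustration, not the code we actually ran: the input format (pairs of source and target, plus a redirect mapping) and the decision to drop duplicate links and self-links during cleanup are assumptions made for this example.

```python
from collections import Counter


def resolve(entity, redirects):
    """Follow redirect chains to the final target, guarding against cycles."""
    seen = set()
    while entity in redirects and entity not in seen:
        seen.add(entity)
        entity = redirects[entity]
    return entity


def degrees(links, redirects):
    """Compute in- and out-degree per entity from (source, target) link pairs.

    Redirects are resolved first; duplicate links and self-links are dropped
    (an assumption about what 'cleanup' includes).
    """
    cleaned = set()
    for source, target in links:
        s, t = resolve(source, redirects), resolve(target, redirects)
        if s != t:
            cleaned.add((s, t))
    in_degree = Counter(t for _, t in cleaned)
    out_degree = Counter(s for s, _ in cleaned)
    return in_degree, out_degree


# Toy data with made-up entity names.
links = [("A", "B"), ("A", "B_old"), ("C", "B"), ("B", "B")]
redirects = {"B_old": "B"}
in_degree, out_degree = degrees(links, redirects)
top_incoming = in_degree.most_common(1)
```

`Counter.most_common(10)` on the two counters then yields Top Ten lists like the ones above. For the full wikilinks dataset the set of cleaned pairs may not fit comfortably in memory, so a streaming or database-backed variant would be needed.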

But there are more interesting things to discover ... stay tuned!

Tuesday, July 20, 2010

Visualizing video archive content -- arte.tv

Okay, first of all, it's been a while since I last wrote a blog post here. I guess that's a tribute to the ever faster spinning world of digital media, as I had concentrated more on shorter and therefore faster means of communication such as twitter and (shame on me...) facebook. Nevertheless, while skimming through the pages of moresemantic I decided to revive the blog and to keep on posting about current research work....

This morning, I had to look up a documentary, 'The Digital Bomb', which had been broadcast yesterday evening on the German/French television channel arte. Arte is one of the public service television channels focusing on culture and the arts. Like many other television broadcasters, arte of course maintains a website, which also includes some sort of media archive. Because arte is a public service broadcaster, some strange regulations keep the archive from holding more than 7 days of TV program -- but this is something completely different (to speak with Monty Python). This morning, I made some discoveries in arte's TV archive, especially about their way of visualizing content.

Although it was a little difficult to find the right mode of access -- especially if you are looking for a specific date of broadcast -- I succeeded in finding this nice portal page showing the most featured videos of yesterday's TV program, including a timeline (at the top) for going back (and forth) in time. I really like the 2-D tile pattern that relates the size of the videos (each represented by a significant key frame) to their popularity (or any other ranking). When you place the mouse pointer over a frame you get more detailed information about the video, and clicking on it opens the video for viewing.

All in all, it seems to be inspired by the TED video archive, and there's still room for improvement. I would also like to see timelines for the content shown (not only broadcast or production date) as well as geographical information about production/content shown in interactive maps.

Now I have become curious about what else is out there? Any new innovative, interactive visualizations for displaying video archive content aside from the youtube mainstream??

Monday, October 26, 2009

Open PhD Positions in Semantic Multimedia Retrieval Project

OPEN Ph.D. POSITIONS at Hasso-Plattner-Institute (HPI), Potsdam (Germany) starting in the fourth quarter of 2009

Hasso-Plattner-Institute (HPI) is a privately financed institute affiliated with the University of Potsdam, Germany. The Institute's founder and benefactor Professor Hasso Plattner, who is also co-founder and chairman of the supervisory board of SAP AG, has created an opportunity for students to experience a unique education in IT systems engineering in a professional research environment with a strong practice orientation.
(for more information on HPI, cf. http://www.hpi.uni-potsdam.de/ )

Project Description:
MEDIAGLOBE is part of the THESEUS research program initiated by the German Federal Ministry of Economy and Technology (BMWi), with the goal of developing a new Internet-based infrastructure in order to make better use of the knowledge available on the Internet. The focus of the research program is on semantic technologies, which determine contents (words, images, sounds, and videos) not through conventional methods (e.g., combinations of letters) but are able to recognize the meaning of a piece of content and place it in its proper context. MEDIAGLOBE deals with the digitization, analysis, and semantic retrieval of historical, documentary audiovisual content. (for more information on MEDIAGLOBE, cf. http://theseus-programm.de/theseus-mittelstand-2009/ )

The ideal candidate holds an MS degree in Computer Science or a related field and is able to consider both theoretical and practical/implementation aspects in her/his work. Fluent English communication and programming skills are fundamental requirements. Since we are working on a multimedia repository with resources in German, German language skills are welcome! Preferably the candidate has a background in one of the following fields:
• semantic web technologies
• knowledge representations and ontology engineering
• audiovisual retrieval and analysis
• semantic search
• innovative web development
• user interface design for audiovisual content

The position starts as soon as possible and is full-time (40h/week) for the duration of the project until Oct 2011. Review of applications will begin immediately and will continue until the position is filled. The successful candidate will work closely with international partners and has the possibility to pursue PhD work within the scope of the project.

How to apply:
Excellent candidates are invited to apply with:
• Curriculum vitae and copies of degree certificates/transcripts,
• Writing samples/copies of relevant scientific papers (e.g. thesis, etc.),
• Letters of recommendation.

Please send your application in PDF format, with the subject 'Application for PhD position', via email or traditional mail to the following contact.

Contact and application:
Harald Sack
Hasso-Plattner-Institut für Softwaresystemtechnik GmbH
Universität Potsdam
Prof.-Dr.-Helmert-Str. 2-3
D-14482 Potsdam, Germany
phone: +49 (0)331-5509-527
fax: +49 (0)331-5509-325
email: harald.sack@hpi.uni-potsdam.de
web: http://www.hpi.uni-potsdam.de/meinel/persons/sack.html

Thursday, October 15, 2009

Opening of the German/Austrian W3C-Office / Teaching the Web at FH Potsdam

Today, I'm visiting the Opening of the German/Austrian W3C-Office at FH Potsdam (only 20 minutes away from HPI with public transportation), which is entitled "Teaching the Web".

Short Opening address by Prof. Johannes Vielhaber, rector of FH Potsdam and by Felix Sasaki, followed by the first speaker.
Klaus Birkenbihl on "W3C and W3C Offices - an Overview": he gives some general information about W3C and an overview of the worldwide W3C offices and their duties. Today, there are 18 W3C offices, and one of their main tasks is recruiting local stakeholders to become (paying) W3C members. Also worth mentioning are better synergies with local hosts for community building, the acquisition of local projects, and fostering new cooperations.

Andrew Vande Moere from the University of Sydney, author of the information aesthetics blog, on "Visualization for the Web". He starts by giving an overview of visualization technology, ranging from simple data graphs to information art (e.g. the Web2DNA website for visualizing your website as a DNA sequence...). The main message....ok, open up your data and make it publicly available (that's my point! Go one step further and make it Linked Open Data!). Then there are lots of new possibilities for intelligent data mashups (e.g. 'the city is the future web'), making use of the data in a completely new way. Two interesting examples of online web visualizations are Google's visualization API and IBM's Many Eyes ('the youtube of visualization'), not to forget the ultimate data mining & visualization application for your personal data (only for 'the data addicted'), http://your.flowingdata.com/.

Malgorzata Machol from FU Berlin (instead of the announced Prof. Tolksdorf...) on "Why Semantic Web" and the "Semantic Technology Institute". Sorry, but I can't stand those web2.0/3.0/xxx timelines any more. I've seen better motivations for the Semantic Web (at least with better/newer examples). Unfortunately, in this talk the Semantic Web is motivated with a rather coarse conception of 'semantics' and (human/machine) 'understanding'. There is too much human perception (and complex human understanding) mixed up with formal semantics and the technology working on formal semantics (what can be achieved with it and where its limits are). Please try to differentiate more and better. The second part of the talk is about the STI in Germany and its activities around the Semantic Web.

After the lunch break, the event continues with Lambert Heller from TIB/UB Hannover on 'Library 2.0 – how the web has (and is) changing education of librarians?'. Today, the library catalogue is nothing but highly structured data, but the problem of uniquely identifying strings (symbols) with subjects (catalogue entries) is only solved on the surface and in a superficial way. There are commonly used (and elaborated) data sources such as the 'Schlagwortnormdatei' or the 'Personennormdatei' that contain reliable structured information about persons (authors) or keywords. But, today these datasets are not publicly available (while it is also not certain, who really owns the copyright, and what kind of copyright at all...). Let's get all these data, triplify it and make it Linked Open Data!!!
Finally, come and visit the 3rd BibCamp in Hannover in May 2010.


Patrick H. Lauke (web evangelist at Opera Software) on 'Standards education - what students need to know about web standards and accessibility'. It's about conveying the big picture to the students, not the complete design specification or code implementation ('Standards are Code....and Designers don't care about code'). Make clear why...not necessarily how. On the other hand, there's accessibility. Web accessibility is more than blind users and screen readers. Accessibility doesn't have to be hard. In fact, a lot of accessibility is just usability! Opera accordingly is involved in outreach and educational activities on 'Teaching the Web' (Opera: We want to make the Web a better place!).

Petra Rauschenbach from the Bundesarchiv on 'Conversion, Digitisation and Internet Gateways. Strategies and their implementation at the Federal Archives of Germany'. She starts with an overview of their holdings and a lot of German archival vocabulary, which I cannot translate into English. Unfortunately, Mrs. Rauschenbach is 'reading' the talk and not talking freely... Also, the term 'Findbuch' (i.e. concordance, index) sounds somehow very 'retro' to a computer scientist like me. But anyhow, the content of the Bundesarchiv (Federal Archive) is available on the web....why not make it Linked Open Data??

Henry S. Thompson from University of Edinburgh on 'Teaching Web Architecture'.
students:WWW::fish:water i.e. the relationship between students and the WWW is similar to the relationship between fish and water. So, why teach a fish about the water? Learn to think about something which usually is invisible (i.e. the technology we are using every day). So how should we teach web standards? Basic didactic principles like 'Analysis' (decomposing concepts), Contradiction (to something taken for granted), Analogies (being offered for confounding expectations) are presented. I like the last one: E.g. 'Standards come from official standards bodies', but consider the following: IEEE (semi-private bodies) / IETF (bunch of volunteers, left-overs of a hippie community) / W3C (2 private Universities, one semi-private body and companies paying some fee)...all of them have NO legal authority. More on this can also be found here: 'Identity, URIs and Semantic Web'.
Conclusion: Teaching needs to draw on theatre as well as educational theory, 'Keeping people engaged is the core of education.'

Jens Meiert from Google Inc. on 'Modern Web development – a view on the future of HTML, CSS and development practices'. As expected, we are starting in the past, 1990 HTML 1.0 by Tim Berners-Lee. What about development practices? Problems in the past ranged from technological limitations, support limitations over low output quality up to bad user experience. In the Present, more and more we are facing a separation between behavior, structure, and presentation.

Erik Wilde from the School of Information, UC Berkeley on 'Information Engineering' (BTW, my very first lecture on web technology back at FSU Jena was based on Erik Wilde's book, this was before I had written my own ;-). He's using Google sidewiki for his lecture handouts and presentations, which students can annotate simply by installing the Google toolbar in their browser.
Information Engineering is bigger than the web. It includes high-level skills such as information and service modelling, or knowledge about complementary architectures. The Web of Things is nothing but applied Information Engineering. Engineering can be defined as constraint-based design and implementation.

[to be continued]

Tuesday, September 15, 2009

Corporate Semantic Web Workshop in Berlin, 15.09.2009


Today, as part of XInnovations 2009, the Corporate Semantic Web Workshop takes place at HU Berlin; it will conclude this evening with the 3rd Semantic Web Meetup.

9 a.m., everything is still quiet. Coffee, however, seems to be a foreign word in the morning. At least 30 minutes before the event starts, there is only a buffet that was apparently "plundered" the evening before, with two empty thermos jugs and a few stray and partly used coffee cups. That makes me appreciate Graz and I-Semantics, where a good week ago we were supplied around the clock with delicious illy coffee (yes, I would be very happy about a product placement fee... :) and espresso....

Prof. Adrian Paschke of FU Berlin opens the Corporate Semantic Web Workshop with a short presentation of the 'vision' of the Corporate Semantic Web project, which has been funded by the BMBF since 2008. Of particular interest to me is the sub-topic and research area 'Corporate Semantic Search', so I am looking forward to the talk on this topic scheduled for later.

Gökhan Coskun from Adrian Paschke's research group follows with a talk on 'Efficient Management of Corporate Knowledge - Corporate Ontology Management'. Why does a company need ontologies? Quite simply, to increase productivity and the efficiency of information processing (a typical business informatics answer...). Coskun identifies the 'academic orientation' of existing tools as an obstacle to their use in companies. He regards ontologies as normative and universally valid, which he sees as a further obstacle to introducing them in a company. I would like to disagree, since ontologies always rest on a 'shared agreement' (explicit, formal specification of a shared conceptualization) that reflects the respective viewpoint of those involved. Universal validity is a property that is not meant to be achieved at all. After all, we are doing computer science and not metaphysics, i.e. our goal is not a normative and universally valid description of the entire world, since an ontology is always subject to the interpretation of its user, the user's context, and pragmatics.

After a coffee break that was far too short, we continue with Olga Streibel of FU Berlin and the topic 'Semantic Search: Tagging and Knowledge Acquisition'. 'Extreme Tagging' introduces tags that can themselves be used as objects for tagging, i.e. the actual tag relation can be tagged. What is gained by this? Streibel's explanation that 'tags form associations with which ontologies are created' does not help me much here. Unfortunately, my question on this at the end was brushed off rather ungraciously with the remark that we should discuss it offline. Yet in my opinion it would be central to the talk to bring out precisely this advantage of the approach and to distinguish it from a simple (individual) concept mapping.

Ralf Heese of FU Berlin speaks next about 'Easy Linking in Wikis / Finding Experts via Wikis'. The first topic is about supporting users in setting links by offering suggestions based on background knowledge. The second topic addresses the question of how experts on particular topics can be identified from the history log data of a wiki.

Lunch break in a moment....! Unfortunately only with goulash soup/carrot soup and a stroll across the 'book market' in the courtyard in front of the HU main building.

Once again a few minutes late for the short talk by Richard Huber (FIZ Chemie GmbH) on 'ChemgaPedia - virtual research environment'. And yet another new buzzword: 'Wissenscloud' (knowledge cloud) - without being clear about what exactly is meant by it.

Change in the program: 'Semantic Profiles in Universal Plug and Play AV', a talk by a diploma student on the semantic enrichment of UPnP AV data.

Next up is Johannes Krug of x:hibit, who presents the Berlin museum portal (financed through corresponding e-commerce shares, e.g. e-ticketing, among others), followed by Radoslav Oldakowski of FU Berlin with the topic 'Semantic Data Integration and Search in the Museumsportal Berlin'. Semantically supported search that extends search terms with related terms... let's see whether it does so (1) intelligently and (2) in a visually appealing way.

Sebastian Hellmann, with 'DBPedia Live Extraction', first gives a short introduction to the central hub of the Linked Open Data cloud, DBPedia, followed by various applications around DBPedia. By the way, yovisto.com (including semantic features) also uses DBPedia data to implement a true exploratory (semantic) search.

[...to be continued @ Semantic Web Meetup ]

Monday, September 14, 2009

W3C Day at HU Berlin, 14.09.2009

It's that time again: W3C Day at HU Berlin as part of XInnovations 2009, this time with Prof. Felix Sasaki representing the local W3C office at the University of Potsdam.... and it's off to a good start. Prof. Robert Tolksdorf, who was supposed to open the event, is stuck in Berlin traffic, so we first wait 20 minutes for anything to happen here this morning.

Mr. Tolksdorf arrives breathless, 20 minutes late, apologizes briefly (not without a short remark on the situation of Berlin's public transport and on his solidarity with the Berlin S-Bahn drivers) and introduces the STI (Semantic Technology Institute).

Things get more interesting with the first talk dedicated to the 'Semantic Web'; more precisely, as the title says, it is about the 'Rule Interchange Format' (RIF) and the new OWL 2, given by Prof. Adrian Paschke of FU Berlin. OWL 2 abandons the usual three-way split into OWL Lite, OWL DL, and OWL Full and defines several OWL DL language profiles with respect to their worst-case computational complexity. Reasoning in OWL EL is even possible in polynomial time (alongside it there are OWL QL and OWL RL). RIF, as an interchange format for rule dialects, again brings a vast number of new syntax variants. In this context a handbook is also mentioned (Handbook of Research on Emerging Rule-Based Languages and Technologies, IGI Global), which, however, does not exactly recommend itself by its price (> 300 Euro)...

Prof. Felix Sasaki of the German-Austrian W3C office next presents the current developments around the new HTML5 standard (here is a link to the examples shown in the talk). Why HTML 5 now, after XHTML 1.0 was published back in 2000 and the development of XHTML 2.0 was in full swing? Well, the development of XHTML 2.0 was discontinued; the ambition of introducing 'pure' XML in the browser (a 'revolution') has failed. According to a study conducted by Opera in 2008, only 4.13% of all web code is actually valid. HTML 5 therefore attempts a (fault-tolerant) 'evolution' of the old standard.
Strictly speaking, HTML5 is actually even a step backwards, since it forgoes decentralized extension via included namespaces in favor of an unambiguously interpretable DOM tree. Problems with semantic extensions such as RDFa or Microformats are thus bound to arise!

Lunch break at Café Chagall with blini and sour cream, followed by a stroll to Dussmann's temple of culture...

Thomas Caspers first talks about accessibility (the German translation of WCAG 2.0, a.k.a. Web Content Accessibility Guidelines). 'POUR' stands for the four basic principles of the guidelines for accessible web applications: Perceivable, Operable, Understandable, and Robust. However, I could not follow the translation problems described ('programmatically determined' etc.)... perhaps they should have asked a computer scientist....

With a delay of a good 20 minutes (thanks to the stamina of the previous speaker), Joachim Neuberth's talk on SKOS (Simple Knowledge Organization System) begins. Why do some speakers always have to speak so quietly? Following the talk is not really easy (and not because of the complexity of the topic). Admittedly, the development of SKOS and its underlying data model are not particularly exciting either. It only gets better when different resources are shown, such as the Library of Congress Headings or the French National Library, and Linked Open Data is finally addressed at the end.

Prof. Felix Sasaki next turns to metadata for multimedia and reports from the W3C Metadata Annotations Group. The goal of the group can be paraphrased as "DublinCore + X", i.e. a minimal multimedia metadata schema with mappings to already existing formats. The second part of the talk deals with XProc (XML Pipeline Language) for modeling and describing XML processing chains. It allows pipeline-style processing of heterogeneous XML data through alternating validation and transformation steps, using rudimentary program logic (conditional processing, iteration, selective processing, exception handling, etc.).

[After the discussion still to come, on to "pretzels and wine" in the foyer..... :)]

Friday, September 04, 2009

i-Semantics 2009 in Graz (Day 03)

The second day of i-Semantics ended with a party and dancing to live music performed by 'Egon 7', and of course with a lot of interesting conversations with interesting and nice people :) This morning at breakfast, Jörg really looked as if he had not got enough sleep (he continued to party somewhere downtown after the official ending :)

Keynote
Peter Kropsch (Austrian Press Agency): When technologies are drivers, integrated concepts are needed for success,
talking about scenarios of future media convergence, the development of information technologies and the possibilities they open up, and especially about what APA expects to get out of semantic technologies. In general, keynotes without slides (which would help to grasp the structure of a 60-minute talk) are rather difficult to follow if the speaker is not able to arouse sufficient enthusiasm in the audience...

The Role of Semantic Technologies in Future Internet Track,
Klaus Tochtermann (Know-Center, TU Graz): The Role of Semantic Technologies in the Future Internet
explained why the vision of the Future Internet is not only 'old wine in new bottles'. Although all the components that constitute the Future Internet (Content, Services, Security, People, etc.) are already around, the main contribution of FI is the integrated view and interaction of these components.

Marco Pistore (FBK): Highlights of the Future Internet Conference Berlin
In parallel to i-Know/i-Semantics, the 2nd Future Internet Symposium took place in Berlin...

Jan Reichelt (Mendeley): Mendeley - A Last.fm for Research?
Mendeley wants to help researchers manage their resources. Research is inherently social. Mendeley offers a desktop tool, similar to the last.fm audioscrobbler, that (a) analyzes your research papers and enables full-text search and (b) provides smart metadata extraction and generation. The Mendeley web service offers an online backup of your own research library and enables one-click import from Google Scholar and other citation services.
Interesting stuff...I have just registered and downloaded the desktop client for Mac OS X...

Paolo Rosso (Universidad Politecnica de Valencia): Geographical Information Retrieval and Toponym Disambiguation
Geographical filtering of information retrieval results by expanding query strings with semantically related geographical information (...like that stuff!).

Thursday, September 03, 2009

i-Semantics 2009 in Graz (Day 02)

The first day of i-Semantics ended with a guided tour through the old section of the town (that we already had enjoyed the day before), and a welcoming party located in the Kunsthaus Graz. The Kunsthaus Graz has some extraordinary architecture and reminds me more of some strange submarine lifeform than of a building. Unfortunately we had to leave earlier to prepare our talk for today....

The second day of i-Semantics 2009 started in a very pleasant way - again great espresso, fresh fruit, and nice conversations.

Linked Data Track
Harald Sack (HPI): How can Software Developers benefit from Linked Data Vocabulary,
Well, the title is somewhat misleading. Of course software developers do benefit from the use of linked data, but.....the problem is the feasibility of APIs and automated tools for mapping knowledge representations to object-oriented data structures. I had to deliver the talk (instead of Matthias, who actually was the main contributor, but is STILL on vacation) and I'm afraid that I overstrained my audience with lots of software engineering details (bad for an introductory first talk of a session on linked data...).

Jörg Waitelonis (HPI): How to augment Video Search with Linked Data,
Jörg introduced yovisto.com and its connection to the Linked Open Data cloud, featuring our 'exploratory search' widget, which suggests additional search results to the user and enables serendipitous findings.

Atif Latif (Know-Center, TU Graz): The Linked Data Value Chain: A Lightweight Model for Business Engineers
First, Atif pointed out significant differences between the aims of the scientific community and enterprise-level business. He introduced the concept of the Linked Data Value Chain, where the most valuable output consists of an increase in (code/data) readability for the human engineer, thus serving the same purpose as Matthias' work in making semantic web software engineering very, very simple. So.....let's make the Semantic Web a No-Brainer (finally) ;-)

I had to skip parts of the afternoon sessions (Poster/Short-Paper and Student Sessions) due to interesting conversations and reviewing work (deadline today...unfortunately). But I'm looking forward to more interesting contacts, lots of delicious espresso, maybe again some cake and pastries, and then of course the Gala Dinner in the evening...

The Gala Dinner started with an interesting performance of Austrian cultural heritage. In particular, we listened to a musical performance on 4 original "Wurzelhorns" (the Wurzelhorn is a variation of the well-known Alphorn).

Wednesday, September 02, 2009

i-Semantics 2009 in Graz (Day 01)

Although I already served on the PC of i-Semantics some years ago, this year is the very first time for me in Graz. We arrived yesterday and spent a beautiful day sightseeing in the baroque historic section of the town (including the Schlossberg with 260 steps in the midday sun).

The conference is starting right now with greetings by Hermann Maurer (TU Graz) and Klaus Tochtermann (Know-Center, TU Graz). I will report on all the upcoming conference highlights...

Keynote 01,
Paolo Traverso: Towards a future Internet of Services and Content
Pointing out differences between real-world services and software services (duration, accessibility, time constraints, etc.), leading to the proposition of a 'Future Internet', where research is focused on modeling, composing, and monitoring real-world services...

Social Semantic Web Track,
Liana Razmerita: Towards a New Generation of Social Networks: Merging Social Web with Semantic Web
Conclusion: Semantic Web Technologies will enable the development of a new generation of social networking applications...nothing new up to now..

Web 2.0 und Neue Medien Track,
Watraud Wiedermann (APA Defactor): Semantische Suche in Medienarchiven
Demonstration of search application on newspapers, deploying synonym search and search for significant co-occurrences.

Knowledge Visualization Track,
Sven Havemann (TU Graz): Patterns of shape Design
Raising the question 'What's the point of semantic enrichment of images?' One of the problems in computer graphics is that different communities use slightly different vocabularies with inherent ambiguities. Furthermore, there is also a 'semantic gap' in the sense that (correct) CG algorithms often do not deliver the expected result, due to the lack of a proper vocabulary. This leads Havemann to the development of a conceptual reference model (like CIDOC-CRM) for shapes.


Innovative Funktionen für E-Learning Track,
Viktoria Pammer (TU Graz): Intelligente Ad-hoc Erstellung von Lerninhalten mit semantischen Technologien
Presentation of the APOSDLE system that proposes learning material to the learner according to his current skills and needs. Seems to be an Intranet (non web based) system...

...and now on for some espresso and some 'real Austrian' cake and pastries ;-)

Corporate Semantic Web Track,
Fan Bai (Uni Duisburg/Essen): Exchanging Knowledge in Concise Bounded Descriptions. An Approach to Support Collaborative Ontology Development in an Distributed Environment
Interesting variant of versioning for ontologies (for distributed ontology development). Using plain CVS is not suitable, simply because most ontologies consist of one huge file...
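To make the idea of exchanging small pieces of an ontology instead of one huge file a bit more tangible, here is a minimal sketch of how a Concise Bounded Description of a single resource could be computed. It is not the system presented in the talk: triples are modelled as plain Python tuples, blank nodes are assumed to be strings starting with "_:", the reification step of the full CBD definition is left out, and all names (`concise_bounded_description`, the `ex:` resources) are made up for illustration.

```python
from collections import deque


def is_blank(node):
    # Assumption of this sketch: blank nodes are strings starting with "_:".
    return isinstance(node, str) and node.startswith("_:")


def concise_bounded_description(graph, start):
    """Collect all triples whose subject is `start` and, recursively,
    all triples describing blank nodes that appear as objects."""
    description = set()
    visited = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node in visited:
            continue
        visited.add(node)
        for triple in graph:
            subject, _predicate, obj = triple
            if subject == node:
                description.add(triple)
                if is_blank(obj):
                    # Blank nodes have no global name, so their
                    # description has to travel along with the resource.
                    queue.append(obj)
    return description


# Hypothetical toy ontology, for illustration only.
graph = {
    ("ex:Movie", "rdfs:subClassOf", "ex:Work"),
    ("ex:Movie", "rdfs:subClassOf", "_:r1"),
    ("_:r1", "owl:onProperty", "ex:director"),
    ("_:r1", "owl:someValuesFrom", "ex:Person"),
    ("ex:Person", "rdfs:label", "Person"),
}

cbd = concise_bounded_description(graph, "ex:Movie")
for triple in sorted(cbd):
    print(triple)
```

The description of `ex:Movie` contains its own statements plus those of the blank node `_:r1`, but not the statements about `ex:Person`, which has its own name and can be versioned separately.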

Usage and Case-Studies Track,
Sanja Vranes: Maturity and Applicability Assessment of Semantic Web Technologies


ok...ready for 'Guided Tour from Conference Venue to Welcome Reception' at Kunsthaus Graz

[see you again tomorrow...]

Sunday, June 21, 2009

Digitale Kommunikation

On May 21, 2009, our new book 'Ch. Meinel, H. Sack: Digitale Kommunikation' was published by Springer, and I would like to introduce it here today. The book grew out of the intention to follow up our 2003 book 'WWW - Kommunikation, Internetworking, Web-Technologien' with a second edition. This, however, proved difficult: 1200 pages in a discipline that is developing so rapidly that, in the more than 5 years that have passed since then, a wealth of material has accumulated that simply can no longer be covered adequately in a single volume.

Therefore, in consultation with the publisher, we took the risk of splitting the volume into its three basic components and publishing them separately as the individual volumes of a trilogy. Its first volume, 'Digitale Kommunikation', is now available. The two follow-up volumes, 'Internetworking' and 'Web-Technologien', are in preparation and will appear soon.

What is the first volume of the WWW trilogy about? As in the first part of the WWW book published in 2003, this volume deals with the fundamentals of computer communication, which are introduced by a detailed historical review and cover in particular the fields of coding theory, multimedia coding and compression, and the fundamentals of cryptography.

Here is the table of contents:

DIGITALE KOMMUNIKATION

(1) Prologue
(2) Historical Review
(3) Fundamentals of Communication in Computer Networks
(4) Multimedia Data and Its Encoding
(5) Digital Security
(6) Epilogue

The appendices contain a detailed index of persons, ranging from the Egyptian pharaoh Ramses II and his 'first' library to Vincent Rijmen, born in 1970, co-inventor of the AES encryption algorithm. On a good 430 pages with numerous illustrations, the book presents the fundamentals of digital communication. 17 individual excursuses go deeper into important topics that may not be of equal interest to every reader. Each chapter concludes with a detailed glossary, and more than 250 bibliographic references invite further reading.


Sunday, May 31, 2009

ESWC 2009 at Heraklion, Greece, Day #01

Perfect location for a summer holiday....as well as for the European Semantic Web Conference 2009. We arrived yesterday afternoon at a beautiful sea resort, blue skies, blue sea, white beaches...the kind of environment that makes it hard for speakers and presenters at a scientific conference to keep their audience interested ;-)

The first day starts with several tutorials and a workshop. I have chosen the "Evaluation of Semantic Technologies" workshop for the morning. I came a little bit late, thus the first presentation I fully attended was Ulrich Küster's "Evaluation of Semantic Web Service Technology", which is not exactly my research topic. Interesting was the discussion whether a binary measurement for ranking results of service matching is appropriate or not (it's not...).

In the following presentation, Jerome Euzenat talks about 'Ontology Matching Evaluation'. For me, this topic is much more interesting (although it is not my research topic either).

Unfortunately, the internet connection at ESWC 2009 was so bad that 'Live Blogging' was almost impossible. Sorry for that! But in the end I switched to Twitter to give some 'live' updates. Just search for the hashtag #eswc2009 on Twitter to get a timeline of tweets related to the conference program of ESWC 2009.

Chris Bizer from FU Berlin is the last presenter in the tutorial with a talk about 'Evaluation of Semantic Storage Systems'.

Monday, March 23, 2009

Artifacts of modern information society

Artifact, i.e. an error or misrepresentation introduced by a technique and/or technology. Most obvious are artifacts in lossy image compression techniques such as the jpeg compression algorithm. For jpeg, the entire image is divided into 8x8 pixel squares (of course for all three color channels... but not RGB. For jpeg the RGB picture is first transformed into the YCbCr color space, i.e. Y for luminance and Cb, Cr for chrominance, subdivided into blue and red. BTW, there is also subsampling, i.e. luminance is sampled with higher accuracy than chrominance, in line with human sensory perception). Then these 8x8 squares of intensity values are transformed into the frequency domain via a discrete cosine transform. Up to now (apart from the subsampling), no artifacts, no data loss.....
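For the curious, the color space conversion can be sketched in a few lines. This is only an illustration, assuming the full-range conversion coefficients commonly used with JFIF files; the function name `rgb_to_ycbcr` is my own and the subsampling step is not shown.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to 8-bit YCbCr (JFIF-style coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b                # luminance
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b     # blue chrominance
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b     # red chrominance
    # Round to integers and clip to the valid 8-bit range.
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))


for name, pixel in [("white", (255, 255, 255)),
                    ("black", (0, 0, 0)),
                    ("red", (255, 0, 0))]:
    print(name, rgb_to_ycbcr(*pixel))
```

Gray pixels end up with both chrominance values at the neutral midpoint 128, which is why all the brightness detail sits in the Y channel.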

But, next comes a quantization step that rounds the frequency values within the 8x8 squares. If this is done with high accuracy, often one doesn't even notice it in the picture. But coarser quantization also results in higher data compression, which also results in more data loss, which creates ... artifacts. You can see the typical jpeg raster effect whenever high jpeg compression is used. Here's a cool animation of an image being jpeg compressed 600 times (in a 20-second short movie). The simple algorithm goes:
Open the last saved jpeg image
Save it as a new jpeg image with slightly more compression
Repeat 600 times.
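The same experiment can be sketched in miniature on a single 8x8 block. This is a toy model and not a real jpeg codec: it uses one uniform quantization step for all frequencies (real jpeg uses a table of 64 steps), skips color conversion and entropy coding, and 'slightly more compression' is modelled as a growing step size. All function names are my own.

```python
import math

N = 8  # jpeg works on 8x8 blocks


def alpha(k):
    # Normalization factors of the orthonormal DCT-II.
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def basis(i, k):
    return math.cos((2 * i + 1) * k * math.pi / (2 * N))


def dct2(block):
    """Forward 2-D discrete cosine transform of an 8x8 block."""
    return [[alpha(u) * alpha(v) * sum(block[x][y] * basis(x, u) * basis(y, v)
                                       for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct2(coeffs):
    """Inverse 2-D discrete cosine transform."""
    return [[sum(alpha(u) * alpha(v) * coeffs[u][v] * basis(x, u) * basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def recompress(block, step):
    """One 'save as jpeg': transform, round the frequencies, transform back."""
    coeffs = dct2(block)
    quantized = [[round(c / step) * step for c in row] for row in coeffs]
    pixels = idct2(quantized)
    return [[min(255, max(0, round(p))) for p in row] for row in pixels]


def mean_abs_error(a, b):
    return sum(abs(a[x][y] - b[x][y])
               for x in range(N) for y in range(N)) / (N * N)


def generation_loss(block, generations, first_step=1, increment=10):
    """Re-save the previous result again and again with a coarser step."""
    errors, current, step = [], block, first_step
    for _ in range(generations):
        current = recompress(current, step)
        errors.append(mean_abs_error(block, current))
        step += increment
    return errors


# A smooth gradient as test block (values stay well inside 0..255).
original = [[60 + 16 * x + 2 * y for y in range(N)] for x in range(N)]
errors = generation_loss(original, generations=20)
print("error after first save:", errors[0])
print("error after last save: ", errors[-1])
```

The printed error is the mean absolute pixel difference to the original block; it starts out tiny and ends up clearly visible, which is the block-level version of what the video shows.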



Generation Loss from hadto on Vimeo.

Monday, March 09, 2009

CeBIT 2009 - Aftermath

So, how was this year's CeBIT computer fair? The media had been rather pessimistic beforehand: 25% fewer exhibitors, 20% fewer visitors. But despite the worldwide economic crisis you'll find confidence everywhere ... at least now.

This year, I visited CeBIT for only a single day, as an exhibitor as well as a visitor. On Friday, March 6th 2009, there was 'startup day' at the HPI booth in hall 9, B.10., and we presented our video search engine yovisto (well, it was almost only Jörg who did the presentation, while I was playing 'visitor'...). As every year, hall 9, the 'future parc', is for me the most interesting of all 26 (!) halls. BTW, did you know that the Hannover fair area is the largest in the world? In hall 9 you could find the latest research of universities and research institutions (as well as some government and public administration...).

One thing that was pretty obvious was that hall 9 was not as crowded with exhibitors as in the years before (this was particularly noticeable in the rear areas). I also was a little bit disappointed by our former booth at the joint exhibition stand of the universities of Thüringen, Sachsen, and Sachsen-Anhalt (=Mitteldeutschland), because there was nothing really interesting to see, and most exhibitors were busily sitting in front of their computer screens.....presenting their 'backside' to the visitors and avoiding contact...not really inviting at all. No wonder if they don't make it in the race for excellence...


But, on the other hand, I really liked the exhibition stands of the Fraunhofer-Gesellschaft and the THESEUS research programme. Esp. at THESEUS (user scenario CONTENTUS) they demonstrated the workflow for digitization, restoration, and indexing of audiovisual media (...just my subject). While digitization and restoration work pretty well, the indexing capabilities are still rather limited. You might search for 'similar' motifs, but identifying individual persons is still subject to ongoing research.



After all, I was really happy to stay just for one day. Some of our co-workers had to stay for the entire week (my sincere condolences!). Was it worth while? For one single day, of course yes it was!

P.S. Alas, there really was a Latvian company that showed 'brains' in various colors.... ;-)

Wednesday, February 18, 2009

Radio Fritz in the Semantic Web Lecture

A few weeks ago, an actual radio reporter sat in my "Semantic Web" lecture here at HPI in Potsdam. However, I'm afraid he did not necessarily have much fun with the topic of that session, since the "Semantics of OWL" was on the agenda, which, if consumed without any prior knowledge, can lead to considerable side effects (e.g. doubts about one's own understanding of oneself and the world, attention deficit, or sleep disorders) ;-)

The reason for the visit was (unfortunately) not the topic of the lecture but rather its recording and the current range of HPI courses on iTunesU (which has already been reported on here). Radio Fritz has produced a short "special" reporting on this offering, in which some of (our) students -- and at the end also I -- have their say. Nice thing ... but I doubt that the reporter really, as he claimed, stopped understanding anything only "after 10 minutes" ;-)

-> Link to the podcast

 

Wednesday, January 14, 2009

HPI Lectures on iTunesU

Since yesterday (Tuesday, January 13, 2009), the Hasso-Plattner-Institut of the University of Potsdam, along with the Albert-Ludwigs-Universität Freiburg, RWTH Aachen and LMU München, has been offering lecture recordings and course material via iTunesU as video/audio podcasts. The whole thing started with considerable press fuss (see links), and I'm already curious about the usage figures, which I will of course report on here again....

Of course, the topic of lecture recordings and their distribution via podcasts is nothing new. Until now, the "exclusive" distribution channel via Apple iTunes' separate "education channels" was reserved for English-speaking -- i.e. predominantly US -- universities. Nevertheless, you can also find the odd educational morsel in the regular iTunes offering, perhaps even from a not quite so exclusive university.

And if you don't necessarily want to go to the proprietary "market leader" Apple, there is of course also the universal academic video portal yovisto with its unbeatable range of more than 6000 lectures and scientific talks, which moreover can be searched by content...

Links:

Thursday, November 13, 2008

3rd tele-TASK Symposium at HPI in Potsdam

Today and tomorrow (November 13/14, 2008), the 3rd tele-TASK Symposium takes place at the Hasso-Plattner-Institut in Potsdam. I'm looking forward to an exciting program as well as interesting guests (among others from ETH Zürich with the REPLAY project, Andreas Nürnberger from Uni Magdeburg, the Fraunhofer IDMT from Ilmenau and many more...).

Of course, I will also be on the program myself with the topic "Semantisch unterstützte Suche und Navigation in audiovisuellen Datenbeständen" (semantically supported search and navigation in audiovisual data collections; slides will be available here later via slideshare).

Tuesday, November 04, 2008

Trend Seminar "Semantische Technologien" in Stuttgart on Nov. 5, 2008

Tomorrow in Stuttgart I will be moderating the seminar "Semantische Technologien - Wissen intelligent und gezielt nutzen", initiated by MFG Fazit Forschung - Informations- und Medientechnologien in Baden-Württemberg. It focuses in particular on the application of Semantic Web technologies in companies.

The guest list includes, among others, Prof. Rudi Studer from AIFB Karlsruhe and Tassilo Pellegrini from the Semantic Web Company in Vienna, and I'm already looking forward to the concluding panel discussion, in which I will ask the guests for their views on the current market potential of semantic technologies, the opportunities for medium-sized companies, and the future of the Semantic Web....

Wednesday, September 24, 2008

W3C-Day, Berlin, 2008 (at XInnovations 2008)

Today, I'm visiting the annual W3C-Day, which is connected with XInnovations 2008 in Berlin. According to the program, the sessions were supposed to start at 9 am (for this reason I got up rather early this morning....), but the first speaker also seemed to think that 9 am is a little bit early ;-) So we started with a 20-minute delay. In the opening talk, Klaus Birkenbihl from the W3C Semantic Web Activity Group gives an introduction to the Semantic Web, addressing the question of how to explain the Semantic Web to a (more or less) ordinary web user. Not a simple task, but the best you can do is to explain it via examples, showing that integrating information on the web today is rather tedious and extensive manual work. Of course, with semantic web technologies, automated integration of heterogeneous data might soon be possible...

Unfortunately, the speaker for the second talk did not show up. It would have been a presentation about the semantic search engine ConWeaver, which I know very well, and I was rather curious about its progress. Therefore, the session continues with some kind of RDF tutorial presented by Lars Bröker from Fraunhofer IAIS. Next, a short introduction to SPARQL is given by Thomas Tikwinski from Fraunhofer IAIS and W3C DE/AT.

[to be continued]

Tuesday, September 23, 2008

XInnovations 2008, Berlin, Day 02 - Sept. 23, 2008

Today, I'm visiting the 'Corporate Semantic Web' workshop at XInnovations 2008 in Berlin. At least it seems that semantic technology has reached industry and corporations. "There is no market for semantic technology", as Christoph Tempich from Detecon Int. quotes a former Oracle statement in his talk "Analytics drive the Corporate Semantic Web". Therefore, you just have to provide another label, namely 'Enterprise Information Management', with semantic web technology as the underlying technology.

During the coffee break I followed a discussion on the planning of an 'Asocial Semantic Web Workshop' for the next WWW conference or ESWC conference. The goal of the workshop should be to show in which ways the semantic web is vulnerable to SPAM or other offensive techniques, e.g. denial of service by providing a deadly RDF sequence that causes temporary data to grow exponentially.....sounds rather intriguing. I'm looking forward to contributing ;-)

The 2nd session this morning starts with a presentation from Markus Luczak-Rösch from FU Berlin on 'Corporate Ontology Engineering'. Next, Holger Seubert from IBM is presenting 'Enriched Content Browsing', i.e. during page load in the traditional web, the web page is enriched with additional content. The text of the web page is analysed and terms of interest (info spots) are selected and linked with additional contextual information (from the web, from corporate data bases, etc. in another frame of the same window) without leaving the current context.

We had lunch in a small cafe with Russian dishes underneath the nearby railway (Cafe Chagall, Georgenstr. 4, 10117 Berlin). The pelmeni were really delicious!

The afternoon session starts with Thomas Hoppe from Ontonym with a presentation on 'Corporate Semantic Web'. According to his interpretation, the general 'Semantic Web' concept of Tim Berners-Lee cannot simply be transferred into the corporation as it is. Inside the corporation, it's a different world compared to the outside. All users are employees, vocabulary is (most of the time) strictly controlled, there are strict access restrictions, services have to be integrated in portals, and they have to support corporate processes. The session continues with a presentation by Ralf Heese from FU Berlin on 'Corporate Semantic Collaboration'. He introduces the simple text annotation tool loomp, which is intended to enable non-expert users to provide semantic annotations.

[to be continued tomorrow, W3C-Day, XInnovations 2008, Berlin, Day 03...]

Monday, September 22, 2008

XInnovations 2008, Berlin, Day 01, Sept. 22, 2008

For the third time, I'm attending XInnovations (formerly known as XML-Days), this year's edition being XInnovations 2008 in Berlin. Although I was often rather disappointed by the quality of the conference program (you might refer to my previous posts about XML-Tage Berlin here), I decided to give it another try (simply because I can reach Humboldt University in Berlin by local public transportation in about 45 minutes). This time, too, there seems to be some emphasis on semantic web technology (at least judging by the program: there are Semantic Wikis in the Corporate Wiki track and another Corporate Semantic Web Workshop, not to forget the Semantic Web topic in the PhD-Forum).

I started the first day of the conference by attending the Corporate Wiki Infotag. Denny Vrandecic from AIFB Karlsruhe is talking about "Semantic MediaWiki". Denny starts his talk with some historical facts about Wikipedia. An interesting thing to mention: according to a study by Aaron Swartz, only 2% of all Wikipedia users (which comes to about 1,400 people) are primarily responsible for all article changes. This contradicts the common assumption that Wikipedia is written by millions of users. Then he introduces the semantic web in general, stating that the semantic web is nothing but things (nodes, concepts) connected by certain relationships, forming graph-like structures that can again be related to each other. According to his definition, a semantic wiki is nothing but graphs created from wiki data.

The next talk I'm attending is in the Ph.D. workshop. Olaf Hartig is talking about 'Trustworthiness of Data on the Web'. With the Semantic Web, more and more software agents are making decisions based on (RDF-based) data on the web. But how can we trust those data? Olaf is developing an RDF trust model as a basis for trust assessment and trust-aware data access. He suggests a scale of [-1, 1], where -1 represents 'absolute distrust' and +1 'absolute trust' in a statement. Now, all relationships in an RDF graph can be weighted with corresponding trust values ranging from -1 to +1. Trust in a set of statements can be expressed with aggregated trust functions ranging from a cautious (conservative) minimum to a slightly optimistic median. The formal trust vocabulary can be found here. Next, criteria for trust assessment are collected and three different trust assessment strategies are defined: user-based (ask the user for his/her opinion about the trustworthiness), provenance-based (taking into account the trustworthiness of the referring users), and opinion-based (recommendations by other users according to their own trustworthiness).
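As a small illustration of the two aggregation functions mentioned, here is a sketch of how trust in a set of statements could be computed. It is not Olaf's actual model or vocabulary: the statements, their trust values and the function name `aggregate_trust` are invented for this example.

```python
from statistics import median


def aggregate_trust(trust_values, strategy="min"):
    """Aggregate per-statement trust values from [-1, 1] into one value.

    strategy="min"    -> cautious: a set is only as trustworthy as
                         its least trusted statement
    strategy="median" -> slightly more optimistic
    """
    values = list(trust_values)
    if not values:
        raise ValueError("no statements to aggregate")
    for value in values:
        if not -1.0 <= value <= 1.0:
            raise ValueError(f"trust value {value} is outside [-1, 1]")
    if strategy == "min":
        return min(values)
    if strategy == "median":
        return median(values)
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical RDF statements with made-up trust values.
statements = {
    ("ex:Berlin", "ex:capitalOf", "ex:Germany"): 0.9,
    ("ex:Berlin", "ex:population", "3400000"): 0.5,
    ("ex:Berlin", "ex:locatedIn", "ex:France"): -0.2,
}

print("cautious:  ", aggregate_trust(statements.values(), "min"))
print("optimistic:", aggregate_trust(statements.values(), "median"))
```

With these example values the cautious strategy is dominated by the single distrusted statement, while the median largely ignores it.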

The afternoon session started with Nils Barnickel from Fraunhofer IOCS with a talk on 'Semantic Mediation between Loosely-Coupled Information Models in Service Oriented Architectures'. Semantic descriptions of Web Services are supposed to enable data and service interoperability. One problem being addressed is the lack of efficient ontology mapping options in existing ontology languages (although OWL does have differentFrom and sameAs operators, complex mappings between concepts with totally different subgraphs, or 1:n and n:m mappings, are missing).

Ok, now it's definitely time for a coffee break. After that, I will be joining the 'World Cafe' session, where I will participate in the discussions instead of blogging.

[to be continued tomorrow, XInnovations Day 02, Corporate Semantic Web Workshop]

Tuesday, July 29, 2008

Happy 50th Birthday NASA

Well, NASA, the National Aeronautics and Space Administration, celebrates its 50th birthday. On July 29th, 1958, US President Eisenhower signed the „National Aeronautics and Space Act“ and NASA started work on October 1st, 1958. Just a few months earlier, in autumn 1957, the Soviet Union had launched the very first artificial satellite, SPUTNIK 1, resulting in the so-called "Sputnik shock": the western world was shocked that the USSR was really able to do this...and by doing so the entire western world (esp. the USA) was exposed to a nuclear threat. We all know the story of the Cold War.... (BTW, for this reason Eisenhower also founded ARPA, the "mother" of the Internet).
The race began and now, 50 years later, Russian and US-American astronauts are working together on the International Space Station ISS.

Although I'm quite a bit younger than NASA, NASA made a big impression on my childhood days. Remember the Apollo program and the first man on the moon. Like almost every child, I wanted to become an astronaut - or at least a scientist (got it!). Believe it or not, my very first memories of television are pre-launch countdowns of the Apollo program (I really don't know which mission, but obviously one of the later ones). I remember the countdown was stopped several times and I was very angry, because I had to go to bed and could not watch the lift-off. I even dimly remember the Skylab program (as well as its early "re-entry" in 1979) and of course the first launch of the Space Shuttle in 1981 (and again with a delayed countdown..., at Google Video you may watch the lift-off video of STS-1 Columbia).

But the most interesting thing for me has always been NASA's planetary missions, giving us wonderful pictures of Jupiter or Saturn (Pioneer and Voyager) and the other planets.

NASA has opened up its Picture Archive with tons of pictures for free use. Nicely organized, you may find pictures from Hubble, planetary missions, the space program, and many more....(But beware, today their servers have to keep up with an intense workload because of their birthday event).