Tuesday, July 27, 2010

There are more Things in Heaven and Earth... - DBPedia Link Graph Analysis Revisited

In the course of our ongoing work with Linked Open Data, we recently analyzed the graph structure of DBPedia data. For this we considered only the original link graph (aka 'wikilinks'), after doing some cleanup first, such as resolving redirects.

As a side effect, we had to compute in-degree and out-degree of all DBPedia entities according to wikilinks, ... and we discovered some more or less surprising facts (thanks to Nadine):

The entity with the highest out-degree (i.e. number of outgoing links) currently is:
http://dbpedia.org/resource/List_of_places_in_Afghanistan
with 7,147 outlinks (after cleanup, and remember these are wikilinks and not the typed links of DBPedia)

The entity with the highest in-degree (i.e. number of incoming links) currently is:
http://dbpedia.org/resource/Living_people
with 440,151 inlinks (after cleanup)

While the 2nd one (living people) seemed pretty clear to me, the first (Afghanistan places...) was a bit of a surprise (as are the trilobites...). For all the explorers among us, I have included the Top Ten lists of incoming and outgoing wikilinks, each with in-degree and associated out-degree...


Top Ten Incoming in out
http://dbpedia.org/resource/Living_people 440151 0
http://dbpedia.org/resource/United_States 385407 963
http://dbpedia.org/resource/France 124206 759
http://dbpedia.org/resource/England 123223 1320
http://dbpedia.org/resource/United_Kingdom 121203 1152
http://dbpedia.org/resource/List_of_sovereign_states 114086 465
http://dbpedia.org/resource/Canada 105849 523
http://dbpedia.org/resource/Germany 103382 889
http://dbpedia.org/resource/Animal 98680 236
http://dbpedia.org/resource/World_War_II 93555 771
http://dbpedia.org/resource/Association_football 90673 196



Top Ten Outgoing in out
http://dbpedia.org/resource/List_of_places_in_Afghanistan 9 7147
http://dbpedia.org/resource/Flora_of_New_South_Wales 917 6819
http://dbpedia.org/resource/List_of_municipalities_of_Brazil 1 5503
http://dbpedia.org/resource/Index_of_India-related_articles 4 5369
http://dbpedia.org/resource/Area_codes_in_Germany 6 5360
http://dbpedia.org/resource/IUCN_Red_List_vulnerable_species_%28Plantae%29 0 5172
http://dbpedia.org/resource/List_of_trilobites 6 5102
http://dbpedia.org/resource/List_of_Social_Democratic_Party_of_Germany_members 24 5078
http://dbpedia.org/resource/List_of_French_words_of_Germanic_origin 9 5010
http://dbpedia.org/resource/Index_of_Thailand-related_articles 4 4831
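
For those who want to try this at home: the degree counts and top-ten rankings above can be sketched roughly as follows. This is only a minimal illustration and not the pipeline we actually used. It assumes the wikilinks are available as (source, target) pairs and the redirects as a simple dictionary; the function names `resolve`, `degree_counts` and `top_k` are made up for this example, and counting duplicate links only once and dropping self-links are assumptions about what "cleanup" could mean.

```python
from collections import Counter


def resolve(entity, redirects):
    """Follow a redirect chain to its final target, stopping on cycles."""
    seen = set()
    while entity in redirects and entity not in seen:
        seen.add(entity)
        entity = redirects[entity]
    return entity


def degree_counts(links, redirects):
    """Return (indegree, outdegree) Counters for the cleaned link graph."""
    edges = set()  # a set counts each distinct link only once (assumption)
    for source, target in links:
        s, t = resolve(source, redirects), resolve(target, redirects)
        if s != t:  # drop self-links, e.g. those created by redirects (assumption)
            edges.add((s, t))
    indegree, outdegree = Counter(), Counter()
    for s, t in edges:
        outdegree[s] += 1
        indegree[t] += 1
    return indegree, outdegree


def top_k(primary, secondary, k=10):
    """Rank entities by `primary` degree and attach their `secondary` degree."""
    return [(e, n, secondary[e]) for e, n in primary.most_common(k)]


if __name__ == "__main__":
    # Toy data for illustration only, not real DBPedia links.
    links = [("A", "B"), ("A", "C"), ("A", "B2"), ("B", "C"), ("C", "C")]
    redirects = {"B2": "B"}
    indegree, outdegree = degree_counts(links, redirects)
    print(top_k(indegree, outdegree, k=3))   # ranked by incoming links
    print(top_k(outdegree, indegree, k=3))   # ranked by outgoing links
```

Keeping both counters around makes it cheap to print the "other" degree next to each ranked entity, which is how the two lists above are laid out.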

But there are more interesting things to discover ... stay tuned!

Tuesday, July 20, 2010

Visualizing video archive content -- arte.tv

Okay, first of all, it's been a while since I last wrote a blog post here. I guess that's a tribute to the ever faster spinning world of digital media, as I had concentrated more on shorter and therefore faster means of communication such as twitter and (shame on me...) facebook. Nevertheless, while skimming through the pages of moresemantic, I decided to revive the blog and to keep on posting about current research work....

This morning, I had to look up a documentary, 'The Digital Bomb', which was broadcast yesterday evening on the German/French television channel arte. Arte is one of the public service television channels focusing on culture and the arts. Like many other television broadcasters, arte of course maintains a website, and, being a television broadcaster, it also offers some sort of media archive. Because arte is a public service broadcaster, some strange regulations keep the archive from holding more than 7 days of TV programming -- but that is something completely different (to borrow from Monty Python). This morning, I made some discoveries in arte's TV archive, especially about their way of visualizing content.

Although it was a little difficult to find the right mode of access -- especially if you are looking for a specific broadcast date -- I succeeded in finding this nice portal page showing the most featured videos of yesterday's TV program, including a timeline (at the top) for going back (and forth) in time. I really like the 2-D tile pattern, which relates the size of each video (represented by a significant key frame) to its popularity (or any other ranking). When you place the mouse pointer over a frame, you get more detailed information about the video, and clicking on it opens the video for viewing.

All in all, it seems to be inspired by the TED video archive, and there's still room for improvement. I would also like to see timelines for the content shown (not only the broadcast or production date), as well as geographical information about the production and the content, shown in interactive maps.

Now I have become curious about what else is out there. Are there any innovative, interactive visualizations for displaying video archive content aside from the youtube mainstream?