....more semantic!: An Integrated View on Document Annotation.... (part 3)

Friday, September 01, 2006

An Integrated View on Document Annotation.... (part 3)

As already mentioned in [1], we identify three semantically interrelated structures within a document or a collection of documents:

logical structure (chapters, paragraphs, pages, ...)
conceptual structure (index entries, concepts, definitions, ...)
referential structure (references, associations, links, ...)

All three in concert form the so-called dependency graph (or reading graph).

Now, what is the purpose of this dependency graph? Imagine that you want to understand a certain topic. Usually, you would enter one or more descriptive keywords related to that topic into a search engine. More traditionally, you would look up those keywords in the index of a textbook to identify the relevant sections to read. You look up a certain index entry, and the section it points to probably contains other terms that you don't understand. So you look up those terms in the index as well and read the sections they refer to, where you will probably again find terms that you don't understand. You have thus entered an iterative process that stops once you have read all the sections covering all the terms that you didn't understand before you looked up the first index entry.

In this way, we recursively define a rather simple concept of understanding: to understand a term, we must have understood all terms that are mentioned in a descriptive definition of that term.

This iterative process is closely related to self-containment: to understand a topic means to calculate the self-containment closure of that topic. The size of the closure is determined by where the iterative process ends, and the process ends when no terms remain that are not yet understood (i.e. read). Thus, the size of the closure depends on how much the user already knows (i.e. has read), differs from user to user, and changes (adapts) while the user is reading (understanding).
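
The closure computation described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function name `understanding_closure`, the `definitions` mapping (term to the terms used in its definition), and the sample vocabulary are all made up for this example and are not part of any existing system.

```python
from collections import deque


def understanding_closure(topic, definitions, known=frozenset()):
    """Return the set of terms a reader still has to read to understand `topic`.

    `definitions` maps each term to the terms mentioned in its definition
    (a hypothetical stand-in for index entries and the sections they point to).
    `known` holds the terms the reader already understands.
    """
    to_read = set()
    queue = deque([topic])
    while queue:
        term = queue.popleft()
        # Skip terms the reader already knows or that are already scheduled.
        if term in known or term in to_read:
            continue
        to_read.add(term)
        # Every term used in the definition has to be understood as well.
        queue.extend(definitions.get(term, ()))
    return to_read


# Hypothetical example vocabulary.
definitions = {
    "ontology": ["concept", "relation"],
    "concept": ["term"],
    "relation": ["concept"],
    "term": [],
}

novice = understanding_closure("ontology", definitions)
expert = understanding_closure("ontology", definitions, known={"concept"})
```

Here `novice` contains all four terms, while `expert` contains only "ontology" and "relation": the closure shrinks as the set of known terms grows, which is the user-dependent behaviour described above.
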
The dependency (reading) graph is then ordered sequentially (with the help of the logical document structure) to form a roadmap that leads to a better understanding of the given topic.
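
One possible reading of this ordering step, again only as a sketch: assume that every term is covered by a section and that the logical structure gives each section a position, here a made-up `(chapter, section)` pair. The roadmap is then simply the set of required sections sorted by that position. The name `reading_roadmap` and the `section_of` mapping are hypothetical.

```python
def reading_roadmap(terms, section_of):
    """Order the sections that cover `terms` by their position in the document.

    `section_of` maps a term to a (chapter, section) pair; this pair is an
    assumed encoding of the logical document structure.
    """
    # Several terms may be covered by the same section, so collect a set first.
    sections = {section_of[term] for term in terms if term in section_of}
    # Tuples sort by chapter first, then by section within the chapter.
    return sorted(sections)


# Hypothetical mapping from terms to the sections that define them.
section_of = {
    "term": (1, 1),
    "concept": (1, 2),
    "relation": (2, 1),
    "ontology": (3, 1),
}

roadmap = reading_roadmap({"ontology", "relation", "concept", "term"}, section_of)
short_roadmap = reading_roadmap({"ontology", "relation"}, section_of)
```

A reader who already knows some of the terms gets a shorter roadmap, since only the sections for the remaining terms are kept.
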
(to be continued....)

References:
[1] An Integrated View on Document Annotation (part 1) (part 2)