Tuesday, August 08, 2006

An Integrated View on Document Annotation.... (part 2)

It is important to state the dependencies between the three identified characteristics (logical, conceptual, and referential) that are common to any kind of document, despite their heterogeneity.

For example, consider a large text document such as a textbook. The document's index can be considered part of the conceptual document structure, while the linear text flow with its hierarchical organization defines part of the logical document structure. A textbook index can be considered to be of good quality if each index entry refers first to the location where the concept it denotes is defined, and subsequently to the other locations where that concept is used. Thus, for the document index (conceptual structure) we distinguish between concept definition and concept usage, both of which refer to the logical structure. Clearly, the two structures depend on each other. The same holds for the bibliographic references (referential structure) in relation to the table of contents (logical structure) and the document index.
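The distinction between concept definition and concept usage can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class IndexEntry, its fields defined_in and used_in, and the section numbers are hypothetical names chosen for this example, not part of any existing annotation format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class IndexEntry:
    """One entry of the document index (conceptual structure).

    Locations are section numbers of the logical structure, e.g. "2.1".
    All names here are hypothetical and serve only as illustration.
    """
    concept: str
    defined_in: Optional[str] = None                   # where the concept is defined
    used_in: List[str] = field(default_factory=list)   # where the concept is used

    def locators(self) -> List[str]:
        """Return the locations with the definition first, then the usages."""
        first = [self.defined_in] if self.defined_in is not None else []
        return first + [s for s in self.used_in if s != self.defined_in]


def has_definition_first(printed: List[str], entry: IndexEntry) -> bool:
    """Check whether a printed locator list leads with the defining section."""
    return (
        bool(printed)
        and entry.defined_in is not None
        and printed[0] == entry.defined_in
    )


# Hypothetical entry: the concept is defined in 2.1 and also used in 1.3 and 4.2.
entry = IndexEntry("annotation", defined_in="2.1", used_in=["1.3", "2.1", "4.2"])
```

Under this sketch, an index that simply lists locations in page order (1.3, 2.1, 4.2) would fail the quality check, while the list produced by entry.locators() would pass it.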

We claim that these dependencies within a document have to be made explicit. As an additional form of document annotation, they can help to represent document semantics without the need to really "understand" the content of the document. Acting in concert with external (user- and/or author-defined) annotations, the document dependency structure can facilitate cross-annotation reasoning, which is a prerequisite for the Semantic Web.
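As a rough sketch of what "explicit" dependencies might look like, the following Python fragment stores them as simple (subject, relation, object) triples and combines them with an external annotation. The identifiers, relation names, and the function annotations_for_concept are all assumptions made for this illustration; no particular annotation format is implied.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Explicit dependencies between the three structures (hypothetical identifiers).
dependencies: List[Triple] = [
    ("concept:annotation", "defined_in", "section:2.1"),  # conceptual -> logical
    ("concept:annotation", "used_in", "section:4.2"),     # conceptual -> logical
    ("reference:[3]", "cited_in", "section:2.1"),         # referential -> logical
]

# External annotations attached by a user or author to parts of the document.
external_annotations: Dict[str, List[str]] = {
    "section:2.1": ["topic:metadata"],
}


def annotations_for_concept(
    concept: str,
    deps: List[Triple],
    annotations: Dict[str, List[str]],
) -> List[str]:
    """Collect external annotations attached to the section that defines `concept`."""
    result: List[str] = []
    for subject, relation, obj in deps:
        if subject == concept and relation == "defined_in":
            result.extend(annotations.get(obj, []))
    return result
```

Here an annotation that a user attached to a section becomes available for the concept defined in that section, purely by following the dependency structure and without interpreting the text itself.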