This document contains a conceptual overview of topicmaps and
the k42 architecture.
What is a topicmap?
A topicmap is an organisation of knowledge using a
representation understandable by a computer. A topicmap in its most basic form
is simply a collection of "topics" and "associations" where associations
connect topics together.
What is a topic?
A topic is defined as "something that I am interested in". This
could be a person, a place, a concept, anything that makes sense
to you. Topics are identified by names. A topic may have one or more
names, which is useful for multi-lingual applications. For example, the topic
used to express the concept of a dog may have the names Dog, Hund and Perro, so that a person speaking German or Spanish can talk about the same
concept just as well as an English-speaking user.
What is an association?
An association is the link between any number of
topics. For example, given the topics Andy and Spectrum, I could create an association named owns and connect Andy and Spectrum using this association to express the concept that "Andy owns a
Spectrum".
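As a rough illustration of these two constructs, here is a minimal Python sketch. It is not the k42 API: the Topic and Association classes and their fields are assumptions made purely for this example.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity-based equality: a topic is only equal to itself
class Topic:
    names: list[str] = field(default_factory=list)


@dataclass(eq=False)
class Association:
    name: str
    topics: list[Topic] = field(default_factory=list)


# One topic may carry several names, e.g. for multi-lingual use.
dog = Topic(names=["Dog", "Hund", "Perro"])

andy = Topic(names=["Andy"])
spectrum = Topic(names=["Spectrum"])

# "Andy owns a Spectrum": an association named "owns" linking the two topics.
owns = Association(name="owns", topics=[andy, spectrum])
```

At this stage the association is just a named link; nothing yet says what kind of thing Andy or Spectrum is.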
Topic instances
To make this more generic, we can use OO techniques in topicmaps to
provide a schema for our topics. Finding out who owns a Spectrum is useful, but
all a bit random: I don't know what an "Andy" is or what a "Spectrum" is. I
want to be able to ask questions such as "Who owns a computer?". The first step
towards this is to create a topic called Computer and make Spectrum an instance of Computer. If I wanted to add other types of computer, such as a Commodore
64, I could create a topic Commodore 64 and make it also an instance of Computer. The same should be done for Andy, so a topic Person is created and Andy becomes an instance of Person. We are not quite ready to ask "Who owns a computer?" yet; we
need an association template first.
What is an association template?
At the moment we have a single non-reusable association, owns, that associates two arbitrary topics, Andy and Spectrum. To make this more powerful, associations can be defined in terms
of templates. In OO terms, an association template is a class (or
"type") of association, and the associations themselves are the instances of
that class. So, if I create the association template owns, I can then create instances of the owns template for each person who owns a Spectrum.
What makes up an association template?
Each association template defines one or more ends. Each association
instance must have a topic at each end of the association. Each "end" of an
association template has two properties:
What makes up an association instance?
Instances of this template can now be created. An association
instance consists of the same number of ends as the template, but at each end
is now the topic that plays the role for that end in the association. Thus, an
instance of the owns association template could contain Andy playing the role of owner, and Spectrum playing the role of owned. Below is a table showing some new Person instances, new Computer instances and owns association instances between them.
I can now make queries against the association to ask the
following questions:
While it would be possible to make these queries against individually defined
association instances, the templating mechanism provides consistency and
semantic rigour across all associations of this type. This begins to ensure
the consistency and usefulness of the topicmap.
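To show why the template makes such queries straightforward, here is a small Python sketch of asking "Who owns a computer?". The class names, the role names held as strings, and the who_owns helper are all assumptions for illustration; k42 itself exposes a Java API that is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Topic:
    name: str
    types: list["Topic"] = field(default_factory=list)  # topics this one is an instance of

    def is_instance_of(self, cls: "Topic") -> bool:
        return cls in self.types


@dataclass(eq=False)
class AssociationTemplate:
    name: str
    roles: tuple[str, ...]  # the named ends every instance must fill


@dataclass(eq=False)
class AssociationInstance:
    template: AssociationTemplate
    players: dict[str, Topic]  # role name -> topic playing that role

    def __post_init__(self) -> None:
        # Every end of the template must be filled, and no extra ends are allowed.
        if set(self.players) != set(self.template.roles):
            raise ValueError("instance ends do not match the template")


def who_owns(instances, thing_class):
    """Names of owners whose owned topic is an instance of thing_class."""
    return sorted(
        a.players["owner"].name
        for a in instances
        if a.template.name == "owns" and a.players["owned"].is_instance_of(thing_class)
    )


person, computer = Topic("Person"), Topic("Computer")
andy = Topic("Andy", [person])
spectrum = Topic("Spectrum", [computer])
owns = AssociationTemplate("owns", ("owner", "owned"))
instances = [AssociationInstance(owns, {"owner": andy, "owned": spectrum})]
```

Because every instance of owns is guaranteed to have an owner end and an owned end, the query can rely on those roles being present rather than inspecting each association individually.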
Topic subclasses
There are lots of different types of computer in the world, and
grouping them all simply as types of computer may be confusing. To refine the
instances of Computer, I can create subclasses of the Computer topic, 8-bit computer and 16-bit computer, and make the topics that represent specific computers instances of
these refinements. So we have:
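In code terms, the refinement might be pictured with the following Python sketch. The field and method names are hypothetical, and the rule that an instance of a subclass also counts as an instance of the superclass is an assumption of this sketch rather than a statement about k42.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Topic:
    name: str
    types: list["Topic"] = field(default_factory=list)         # instance-of links
    superclasses: list["Topic"] = field(default_factory=list)  # subclass-of links

    def is_subclass_of(self, other: "Topic") -> bool:
        # A topic counts as a subclass of itself; otherwise walk up the
        # hierarchy (assumed to contain no cycles).
        return self is other or any(s.is_subclass_of(other) for s in self.superclasses)

    def is_instance_of(self, cls: "Topic") -> bool:
        return any(t.is_subclass_of(cls) for t in self.types)


computer = Topic("Computer")
eight_bit = Topic("8-bit computer", superclasses=[computer])
sixteen_bit = Topic("16-bit computer", superclasses=[computer])
spectrum = Topic("Spectrum", types=[eight_bit])
```

With this rule, a question about Computer still finds the Spectrum even though it is now directly an instance of 8-bit computer only.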
Occurrences, Subject references and Resources
Creating these knowledge structures is only really useful if they
represent something real. A topic called Spectrum does not actually tell the user what a Spectrum is. As far
as the computer is concerned, it is just 8 characters, and it probably won't
mean very much to someone under the age of 10. We need to make a link between
the conceptual notion of a Spectrum and what a Spectrum really
is.
There are three ways to do this; which one you use depends on
the following criteria:
What is scope, and what are scopesets?
Scope is probably the most confusing concept in topicmaps. It is
advised that you avoid using scopesets until you are competent with the other
concepts in topicmaps. Scopesets may be applied to either names or
associations. Names are the identities given to topics. A scopeset consists of a
set of scoping topics. A scoping topic is simply a topic that is used
in a scopeset. A simple example of using a scopeset would be for multi-lingual
topicmaps.
The name of the topic
Computer that we have created is an English word. In French, the word
is ordinateur and the Spanish word is ordenador. If French or
Spanish users wanted to browse the topicmap to find people who owned computers
they would not be able to, unless they knew the English
word computer.
To prevent this problem occurring, we can create several names
for the topic we know as
Computer. By default all these names are added to what is called
the unconstrained scope, that is, the scope within which everything
that has no scope is automatically placed. This will work with no problems
except that when browsing the topicmap, all users will now see all 3 names for
the topic: Computer, Ordinateur and Ordenador. This is messy and presents useless information to the user. To
fix this, we can use scope. By creating two new topics, French and Spanish, we can scope the topic name Ordinateur to the French topic (making the French topic a scoping topic, part of a scopeset) and
scope the topic name Ordenador to the Spanish topic. Now a user can specify that they are only interested in
seeing the French names for topics by masking all results based on that
scopeset.
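The masking step can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: scoping topics are stood in for by plain strings, and the Name class and names_in_scope function are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    value: str
    scope: frozenset = frozenset()  # an empty scopeset means the unconstrained scope


def names_in_scope(names, scoping_topic):
    """Keep only the names whose scopeset contains the given scoping topic."""
    return [n.value for n in names if scoping_topic in n.scope]


names = [
    Name("Computer"),                            # unconstrained scope
    Name("Ordinateur", frozenset({"French"})),   # scoped to the French topic
    Name("Ordenador", frozenset({"Spanish"})),   # scoped to the Spanish topic
]

french_names = names_in_scope(names, "French")
```

A real application would also need a policy for topics that have no name in the requested scope, for example falling back to the unconstrained names; that choice is left open here.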
Summary
In this section we have seen all of the main topicmap constructs:
topics, associations, occurrences, resource references, subject indicator
references and scopesets. The rest of this document describes the data model
that powers k42 enabling it to represent a topicmap.
TopicMaps are not relational databases
A common misconception about topicmaps is that they are no
different from a relational database. This is not the case. A database consists
of a collection of tables containing, effectively, lists of data. These lists
may be combined using relationships or queries to refine the data and obtain
useful information. It is possible to model any database schema in a topicmap;
however, the reverse is not true. Relational databases were created
from a mathematical model, which was then transposed on to real-life scenarios
and is not a natural way to model data.
A topicmap allows the user to model the data in a natural way,
by defining areas of interest (topics), and then defining links between those
areas (associations). Neither the topics nor the associations are simple pointers;
they may have semantics defined for them which may be used for querying and
merging of information. Also, as the user is modelling knowledge from a natural
perspective, topicmaps are very easy to create and
query.
The salient point here is that databases process data, whereas
topicmaps process knowledge.
The TopicMap Model
How does k42 model topicmaps?
The XML topicmaps (henceforth XTM) version 1.0 syntax provides
a standard way to interchange topicmap information between compliant
applications. k42 understands the XTM v1.0 syntax and processes it into an
internal representation, which we call the topicmap
model. The core topicmap data model is a subset of the complete topicmap model.
This core topicmap data model is the simplest useful view of the topicmap
model. In this document we express the core topicmap data model using an entity
relationship diagram. The final part of this document provides further insight
into the details of the object model and advanced conceptual features that are
implemented in k42.
The k42 TopicMap Core Data Model
Overview
This overview presents an entity relationship diagram showing
all the entities of the core topicmap data model along with the relationships
between them. These entities and their relationships are fully explained in the
subsequent sections.
The Entities
TopicMap
The topicmap is the organising principle of the model. A
single topicmap instance will manage many topics and topic associations. It is
the topicmap that acts as the factory for the creation of new topics and
associations. The topicmap has two properties: topics and
associations.
In k42 these properties are accessed through the methods
provided by the
com.empolis.topicmaps.ik42.ITopicMap interface. These methods,
getTopics() and getAssociationInstances(), both return typed iterators that extend the java.util.Iterator interface. These iterators provide access to the sets of topics and
associations respectively.
The topicmap provides a point of focus for the model. The
querying mechanisms for the topicmap are also located on the
ITopicMap interface.
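The factory-plus-iterators shape described above can be sketched as follows. This is a Python analogy, not the Java ITopicMap interface: get_topics and get_association_instances echo the method names mentioned in the text, while create_topic and create_association are hypothetical names chosen for this example.

```python
from typing import Iterator


class Topic:
    def __init__(self, name: str) -> None:
        self.name = name


class Association:
    def __init__(self, topics: list[Topic]) -> None:
        self.topics = topics


class TopicMap:
    """Owns topics and associations and is the only place they are created."""

    def __init__(self) -> None:
        self._topics: list[Topic] = []
        self._associations: list[Association] = []

    def create_topic(self, name: str) -> Topic:  # hypothetical factory method
        topic = Topic(name)
        self._topics.append(topic)
        return topic

    def create_association(self, *topics: Topic) -> Association:  # hypothetical
        association = Association(list(topics))
        self._associations.append(association)
        return association

    def get_topics(self) -> Iterator[Topic]:
        # Hand out an iterator rather than the underlying list, so callers
        # never depend on how (or whether) the full set is held in memory.
        return iter(self._topics)

    def get_association_instances(self) -> Iterator[Association]:
        return iter(self._associations)


tm = TopicMap()
andy = tm.create_topic("Andy")
spectrum = tm.create_topic("Spectrum")
tm.create_association(andy, spectrum)
```

Routing all creation through the map means the map always knows every topic and association it manages, which is what lets it serve as the focus for queries.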
Syntax Reference
A topicmap instance is created when processing an XTM document and
finding a <topicMap> element.
Topic
There are only a few concepts in the model but at the heart of
all of them is the topic, represented by the
com.empolis.topicmaps.ik42.ITopic interface. As defined in XTM, the topic has names, occurrences,
identity and plays roles in associations. All of these aspects are represented
in the model.
The topic has the following properties:
In k42 the following methods on the
com.empolis.topicmaps.ik42.ITopic interface can be used to access the properties described
above. Note that where a set is mentioned above, a typed java.util.Iterator instance is returned in k42. This is an implementation detail to
enable scalable topicmap access.
TopicAssociation
The TopicAssociation is the entity that, in conjunction with
the AssociationEnd entity, binds the map together. The TopicAssociation has the
following properties:
The following methods on
com.empolis.topicmaps.ik42.ITopicAssociation provide access to the properties described.
AssociationEnd
The AssociationEnd is contained by an association and links
together a role defining topic and a role playing topic. The AssociationEnd can
be accessed from both associations and the topic that is the role playing
topic. The AssociationEnd has the following properties:
The properties described above can be accessed on instances of
com.empolis.topicmaps.ik42.IAssociationEnd using the following methods:
ScopeSet
Several aspects of the model rely on the concept of scope to
define a context in which they are valid. Names and associations are examples
of this. In the model, scope is represented by an aggregated entity: the
ScopeSet. The ScopeSet consists of the set of topics that together are said to
be the scope. A ScopeSet has just one property in the core data model,
topics, which returns the set of topics that comprise it. In k42 this
property is accessed through the method getTopics() on the com.empolis.topicmaps.ik42.IScopeSet interface.
TopicOccurrence
The TopicOccurrence is the structure that associates resources
with topics. The properties of the occurrence are:
The occurrence properties can be accessed by the following
com.empolis.topicmaps.ik42.ITopicOccurrence methods:
SubjectIdentityReference
The SubjectIdentityReference entity has just a single property,
value, which is a string that helps to identify the topic to which the
subject indicator reference belongs. To access this property k42 provides a
method on com.empolis.topicmaps.ik42.ISubjectIndicatorReference called getValue().
ResourceReference
The resource reference entity has just a single property,
URI, which is a string that locates the resource. To access this
property k42 provides a method on com.empolis.topicmaps.ik42.IResourceReference called getURI().
k42 Core Data Model Summary
This section has shown how k42 models topicmaps. We have
illustrated the key model concepts and their relationships. For a fuller
understanding of how to access all aspects of the presented entities, refer to
the
API documentation.
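To make the relationship between TopicAssociation, AssociationEnd and Topic more concrete, here is a minimal Python sketch of an association end that can be reached from both its association and its role playing topic. The attribute names (ends, roles_played and so on) are assumptions for illustration and are not taken from the k42 API.

```python
class Topic:
    def __init__(self, name):
        self.name = name
        self.roles_played = []  # association ends in which this topic is the role player


class AssociationEnd:
    def __init__(self, association, role_defining_topic, role_playing_topic):
        self.association = association
        self.role_defining_topic = role_defining_topic
        self.role_playing_topic = role_playing_topic
        # Back-link so the end can also be reached from the role playing topic.
        role_playing_topic.roles_played.append(self)


class TopicAssociation:
    def __init__(self):
        self.ends = []

    def add_end(self, role_defining_topic, role_playing_topic):
        end = AssociationEnd(self, role_defining_topic, role_playing_topic)
        self.ends.append(end)
        return end


owner, owned = Topic("owner"), Topic("owned")
andy, spectrum = Topic("Andy"), Topic("Spectrum")

owns = TopicAssociation()
owns.add_end(owner, andy)
owns.add_end(owned, spectrum)

# Navigate from a topic to everything else in the association it takes part in.
others = [
    end.role_playing_topic.name
    for end in andy.roles_played[0].association.ends
    if end.role_playing_topic is not andy
]
```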
k42 Advanced TopicMap Model
TopicMap Model Advanced Concepts - Introduction
The core data model presented all the main topicmap model
aspects for reading a topicmap. This section adds to those basic concepts by
presenting the advanced parts of the data model. Templates, properties and the
extended occurrence model are unique to k42, while reification is an advanced
feature of XTM (also supported by k42).
TopicMap Model Advanced Concepts
Reification
Reification is the act of creating a topic to represent a 'thing'.
Topics achieve this by having names and identity. This is how the reified
object is identified. However, we might also want to reify things within the
topicmap model. For example, we might put a name onto an association, or
perhaps put an occurrence onto a topicmap name entity. This is supported in
k42 such that all entities know what their reifying topic is and, equally, a topic knows what topicmap model
entity it reifies. Associations and occurrences are automatically reified
within k42; other entities can also be reified.
The diagram above shows the entity hierarchy in the data
model. The key thing to notice is that all entities extend the IK42 (see
com.empolis.topicmaps.ik42.IK42 ) entity. This is the base entity in the topicmap model. Any
topicmap construct can be reified through the fact that IK42 entities can be
reified. The diagram below shows the data model relationship between an IK42
entity and its reifying topic. Thus all derived entities of k42 also have this
property.
The methods used to access a reified object in k42 are
getReifyingTopic() on com.empolis.topicmaps.ik42.IK42 and getReifiedObject() on com.empolis.topicmaps.ik42.ITopic.
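The two-way link behind these methods can be illustrated with a short Python sketch. K42Entity here is only a stand-in for the base entity, and the reify helper and attribute names are hypothetical; the sketch shows the shape of the relationship, not how k42 implements it.

```python
class K42Entity:
    """Stand-in for the base entity: anything in the model can be reified."""

    def __init__(self):
        self.reifying_topic = None


class Topic(K42Entity):
    def __init__(self, name):
        super().__init__()
        self.name = name
        self.reified_object = None  # the model entity this topic stands for, if any


class TopicAssociation(K42Entity):
    pass


def reify(entity, topic):
    """Link an entity and its reifying topic in both directions."""
    entity.reifying_topic = topic
    topic.reified_object = entity


owns = TopicAssociation()
owns_topic = Topic("owns")
reify(owns, owns_topic)
```

Once the association has a reifying topic, anything that can be said about a topic (names, occurrences, further associations) can be said about the association.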
Properties
Topics and associations are a rich paradigm for representing typed
associations between meaningful entities. However, it is also necessary to
store simple named properties as metadata on the topics and associations
themselves.
The diagram below shows how all IK42 entities have
a properties property. Each property entity in the set of properties
has two properties, key and value. These return entities of
type
java.lang.String and type general entity respectively, where a general entity can be any kind of
object that exists in the system.
The methods in k42 for accessing the properties part of the
data model are as follows: on
com.empolis.topicmaps.ik42.IK42 use getProperties() to get access to a set of property entities, and on the property
entity use getValue() and getKey().
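A minimal Python sketch of this key/value arrangement is given below. The add_property helper is hypothetical (the text only describes read access), and the classes are stand-ins rather than the k42 interfaces.

```python
class Property:
    """A simple named value attached to an entity."""

    def __init__(self, key: str, value: object) -> None:
        self.key = key      # always a string
        self.value = value  # any kind of object


class K42Entity:
    """Stand-in for the base entity that topics and associations extend."""

    def __init__(self) -> None:
        self._properties = []

    def add_property(self, key: str, value: object) -> Property:  # hypothetical helper
        prop = Property(key, value)
        self._properties.append(prop)
        return prop

    def get_properties(self):
        # Returns an iterator, in keeping with the iterator-based access described for k42.
        return iter(self._properties)


class Topic(K42Entity):
    pass


spectrum = Topic()
spectrum.add_property("released", 1982)
spectrum.add_property("manufacturer", "Sinclair Research")
as_dict = {p.key: p.value for p in spectrum.get_properties()}
```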
Advanced Occurrence Model
An occurrence is in fact a special kind of association. It is an
association between a topic of any nature and a 'resource topic'. What we mean
by a 'resource topic' is one whose identity is defined using a resource
reference. See the core model for more information on identity and resource
references. When accessing the resource references property of an occurrence,
as shown in the core model, we are actually fetching the resource reference of
the topic that is playing the role of the resource topic in the occurrence
association. This approach means that there are fewer special cases in terms of
the data structure within the topicmap model. What we have is a refinement of
the semantics for a particular kind of topic association. The model below shows
an occurrence instance diagram as a topic association.
Notice above that the 'topic' is the topic entity to which the
occurrence relates and the 'resource topic' is the topic that has its identity
defined by a resource reference. The instance diagram below shows the internal
representation of the following XTM fragment.
    <topic id="t-gdm">
      <baseName><baseNameString>Graham Moore</baseNameString></baseName>
      <occurs xlink:href="http://www.empolis.co.uk/bios.html#gdm" />
    </topic>
Templates
The last part of the k42 advanced model is the idea of templates.
Templates are used to provide a schema for association instances. In k42,
templates use the normal topic association data structure; the difference is
that the semantics assigned to that data structure are different. This is
similar to the k42 view of occurrences as topic associations
with refined semantics. Note that a topic
association template read as a standard association will seem strange, but once
the semantics are understood its meaning becomes
clear.
The key aspect of a template is to define the nature of the
roles that are present in a particular class of association. In addition, the
template must specify the class of topics that can play those roles in an
association instance. A template in k42 is a topic association where the 'role
defining topics' define the nature of roles in that association template, and
the role playing topics of those particular ends define the class of topic that
can play that role in an association instance.
An example will best illustrate these ideas. We want to create
an association template that defines a relationship between companies and
employees called 'company employs person'. In this template, we create a topic
association that has two association ends. One end has a role defining topic
called
employee, and the role playing topic of that association template end is
the topic called person. Here the topic person is being used to state that topics in instances of this
association that play the role of employee must be instances of the topic person. In a similar way, we define the other template association end to
have the topic employer as its role defining topic and the topic company as its role playing topic, such that all topics playing the role of employer must be instances of the topic company.
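The way a template constrains its instances can be sketched in Python as follows. The classes and the conforms function are assumptions made for this example: the text does not say how, or whether, k42 enforces the constraint, so this only illustrates how the template's ends are meant to be read.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Topic:
    name: str
    types: list["Topic"] = field(default_factory=list)  # instance-of links


@dataclass(eq=False)
class AssociationEnd:
    role_defining_topic: Topic
    role_playing_topic: Topic


@dataclass(eq=False)
class TopicAssociation:
    ends: list[AssociationEnd]


def conforms(instance: TopicAssociation, template: TopicAssociation) -> bool:
    """Check an association instance against a template.

    In the template, each end's role playing topic is read as the class that
    the instance's role player must be an instance of. Roles in the template
    are assumed to be distinct.
    """
    constraints = {e.role_defining_topic: e.role_playing_topic for e in template.ends}
    roles = [e.role_defining_topic for e in instance.ends]
    if len(roles) != len(constraints) or set(roles) != set(constraints):
        return False
    return all(
        constraints[e.role_defining_topic] in e.role_playing_topic.types
        for e in instance.ends
    )


person, company = Topic("person"), Topic("company")
employee, employer = Topic("employee"), Topic("employer")

# 'company employs person': the same data structure as an ordinary association.
template = TopicAssociation([AssociationEnd(employee, person), AssociationEnd(employer, company)])

a_person = Topic("a person", [person])
a_company = Topic("a company", [company])
good = TopicAssociation([AssociationEnd(employee, a_person), AssociationEnd(employer, a_company)])
bad = TopicAssociation([AssociationEnd(employee, a_company), AssociationEnd(employer, a_person)])
```

Note that template, good and bad are all plain TopicAssociation objects; only the interpretation of the role playing topics differs, which is the point made above about refined semantics.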
The diagram below shows the basic data model that connects
an association instance to its template. For information on how topic
associations and their templates are serialised in XTM, see the section
on XML TopicMap (XTM) Support below.
In k42 the method
getTemplate() exists on the com.empolis.topicmaps.ik42.ITopicAssociation interface.
k42 Advanced Model and Concepts - Summary
This section has presented the advanced features of k42. These
additional features and models add more value to the topicmap systems without
compromising the interoperability of topicmaps. These features are here to
enable more sophisticated processing of the topic model.
XML TopicMap (XTM) Support
This section details the XTM support that is available in k42,
the mapping between the XTM syntax and the k42 topicmap model and the
additional use of the XTM syntax to express topic association templates. The
main point of note is that k42 is 100% XTM compliant and
supports many optional features of the standard, such as the processing of
referenced maps as a single processing operation.
XTM Compliance and Core TopicMap Model Mapping
k42 can process a compliant XTM instance into an internal
model that can be manipulated programmatically by k42 client applications such
as WebAuthor. As there are no unsupported XTM constructs, this section shows
how the XTM syntax maps to the k42 topicmap model. We will illustrate this
mapping descriptively, one element at a time.
<!ELEMENT topic ... > creates a new topic entity in the topicmap model. The XML 'id'
attribute is not maintained within the model.
<!ELEMENT instanceOf ... > on a topic or association creates a reference between this topic
/ association and another topic, where that topic is the class of which this
thing is an instance. When defined on an occurrence, it references a topic
which defines the nature of the topic occurrence. Note that this and
other elements contain either topic references or subject indicator references.
In the k42 model both of these result in a reference to a topic and thus
are not discussed in detail, except to say that when a subject indicator reference is
found in the XML in a position other than within subjectIdentity, a new
topic is created, if one does not already exist, that has that subjectIndicatorReference
as its identity.
<!ELEMENT scope ... > defines a set of topics that together are considered as a unique
scope set in the topicmap model. Scope sets are used by names, associations and
occurrences to define the context in which they are valid.
<!ELEMENT baseName ... > is used to define a name construct by grouping together a scope
set and a baseNameString.
<!ELEMENT baseNameString ... > creates a name entity where the value is the text contained within
the element.
<!ELEMENT occurrence ... > creates an occurrence entity connected to the topic which has most
recently been created due to a topic element being processed. The occurrence
can contain a scope element which defines the context of this occurrence. The xlink:href value or the <!ELEMENT resourceData ...> element defines the value of the resourceReference that is
associated with the occurrence. Note that the prefix data: is prepended to any values of resourceReference that are created as
a result of processing a <!ELEMENT resourceData ...> element.
<!ELEMENT association ... > creates a topic association entity which, using the member element,
connects topics together in typed associations. A scope set within this element
defines the context in which this association is valid.
<!ELEMENT member ... > is used to group together a role defining topic and a role playing
topic. Processing this element creates an associationEnd entity on the current
association in the topicmap model. The topicRef or the subjectIndicatorRef
within this element defines the rolePlayingTopic property.
<!ELEMENT roleSpec ... > creates a reference to a topic from the associationEnd; this reference
is the roleDefiningTopic property.
<!ELEMENT subjectIdentity ... > does not itself create any structures within the
k42 model (but the resourceReference or subjectIndicatorReference it may
contain does). If either of these sub-elements is processed in this context,
the appropriate subjectIndicatorReference or resourceReference is attached to
the current topic being processed. A subjectIndicatorReference is overloaded in
one way: if the subjectIndicatorReference references any topicmap construct
within the XTM file being processed, then this topic reifies that topicmap
construct. This is mirrored in the model through the reifiedObject property on
the topic entity.
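As a rough picture of the element-by-element mapping, the following Python sketch reads a tiny XTM document with the standard library's ElementTree and builds topics and association ends from it. This is not how k42 imports XTM. The sample document and the load function are invented for the example, and only topic, baseNameString, association, member, roleSpec and topicRef are handled; scope, occurrences and subject indicator references are ignored.

```python
import xml.etree.ElementTree as ET

XTM = "http://www.topicmaps.org/xtm/1.0/"
XLINK = "http://www.w3.org/1999/xlink"
NS = {"xtm": XTM}
HREF = f"{{{XLINK}}}href"  # ElementTree's name for the xlink:href attribute

DOCUMENT = f"""
<topicMap xmlns="{XTM}" xmlns:xlink="{XLINK}">
  <topic id="andy"><baseName><baseNameString>Andy</baseNameString></baseName></topic>
  <topic id="spectrum"><baseName><baseNameString>Spectrum</baseNameString></baseName></topic>
  <topic id="owner"/>
  <topic id="owned"/>
  <association>
    <member><roleSpec><topicRef xlink:href="#owner"/></roleSpec><topicRef xlink:href="#andy"/></member>
    <member><roleSpec><topicRef xlink:href="#owned"/></roleSpec><topicRef xlink:href="#spectrum"/></member>
  </association>
</topicMap>
"""


def load(xml_text):
    """Return (topics, associations) built from an XTM string.

    topics maps each topic id to its list of names; each association is a
    list of (role defining topic id, role playing topic id) pairs.
    """
    root = ET.fromstring(xml_text)
    topics = {}
    for t in root.findall("xtm:topic", NS):
        names = [n.text for n in t.findall("xtm:baseName/xtm:baseNameString", NS)]
        topics[t.get("id")] = names
    associations = []
    for a in root.findall("xtm:association", NS):
        ends = []
        for m in a.findall("xtm:member", NS):
            # roleSpec gives the role defining topic; the member's own
            # topicRef child gives the role playing topic.
            role = m.find("xtm:roleSpec/xtm:topicRef", NS).get(HREF).lstrip("#")
            player = m.find("xtm:topicRef", NS).get(HREF).lstrip("#")
            ends.append((role, player))
        associations.append(ends)
    return topics, associations


topics, associations = load(DOCUMENT)
```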
Serializing TopicMap Templates using XTM
The section above has described how all the XTM constructs are
processed and which resulting topicmap model entities are constructed. From the
descriptions above, it should be clear how the export operation serializes to
XTM given a topicmap model. In this section, we show how k42 supports templates
defined using the standard topic association constructs.
An association instance needs to be connected with its
template. The template itself is represented as an association. We relate the
two associations together using a third topic association. However, we cannot
connect associations directly using the third topic association, so we need to
reify the template and the instance with topics and then connect the reifying
topics.
Note that k42 will not support this XTM extension
unless the associations are defined in the order:
The association template tm-18:0 is reified by topic t-18:0 and the topic association instance tm-23:0 is reified by topic t-23:0. Then the two reifying topics are connected through the final topic
association.
The example below clearly shows how the XTM syntax is used to
define association instances and their templates.
    <!-- Association Template -->
    <association id="tm-18:0">
      <!-- t-16:0 is both the role defining topic and role playing topic class constraint -->
      <member>
        <roleSpec> <topicRef xlink:href="#t-16:0" /> </roleSpec>
        <topicRef xlink:href="#t-16:0" />
      </member>
      <!-- t-15:0 is both the role defining topic and role playing topic class constraint -->
      <member>
        <roleSpec> <topicRef xlink:href="#t-15:0" /> </roleSpec>
        <topicRef xlink:href="#t-15:0" />
      </member>
    </association>

    <!-- t-18:0 is the reifying topic of the tm-18:0 association template -->
    <topic id="t-18:0">
      <subjectIdentity>
        <subjectIndicatorRef xlink:href="#tm-18:0" />
      </subjectIdentity>
      <baseName>
        <baseNameString>comp emp person</baseNameString>
      </baseName>
    </topic>

    <!-- tm-23:0 is the association instance of the tm-18:0 template -->
    <association id="tm-23:0">
      <!-- t-16:0 is the role defining topic
           t-17:0 is the role playing topic for this end -->
      <member>
        <roleSpec> <topicRef xlink:href="#t-16:0" /> </roleSpec>
        <topicRef xlink:href="#t-17:0" />
      </member>
      <!-- t-15:0 is the role defining topic
           t-12:0 is the role playing topic for this end -->
      <member>
        <roleSpec> <topicRef xlink:href="#t-15:0" /> </roleSpec>
        <topicRef xlink:href="#t-12:0" />
      </member>
    </association>

    <!-- t-23:0 is the reification of the association instance tm-23:0 -->
    <topic id="t-23:0">
      <subjectIdentity>
        <subjectIndicatorRef xlink:href="#tm-23:0" />
      </subjectIdentity>
    </topic>

    <!-- This is the association that connects the reifying topic of the
         association instance with the reifying topic of the association template -->
    <association>
      <instanceOf><subjectIndicatorRef xlink:href="http://www.empolis.com/xtm/1.0/index.html#psi-assoctemplateassoc" /></instanceOf>
      <member>
        <roleSpec><subjectIndicatorRef xlink:href="http://www.empolis.com/xtm/1.0/index.html#psi-assoctemplate" /></roleSpec>
        <topicRef xlink:href="#t-18:0" />
      </member>
      <member>
        <roleSpec><subjectIndicatorRef xlink:href="http://www.TopicMaps.org/xtm/1.0/index.html#psi-associnstance" /></roleSpec>
        <topicRef xlink:href="#t-23:0" />
      </member>
    </association>
This section has shown how to use the XTM syntax to represent
k42 topic association templates. This does not extend the syntax in any way and
other compliant XTM applications are still able to process this
topicmap.
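The chain of references in the example (instance, then its reifying topic, then the connecting association, then the template's reifying topic, then the template) can be followed with a few lines of Python. The dictionaries below simply restate the ids from the example above; the template_of function and the data layout are assumptions for illustration, not part of k42.

```python
# association id -> id of the topic that reifies it
# (taken from each reifying topic's subjectIndicatorRef in the example)
reifying_topic = {"tm-18:0": "t-18:0", "tm-23:0": "t-23:0"}

# The connecting association, reduced to the two roles it relates.
template_links = [{"template": "t-18:0", "instance": "t-23:0"}]


def template_of(instance_id, reifying_topic, template_links):
    """Follow instance -> reifying topic -> connecting association -> template."""
    instance_topic = reifying_topic.get(instance_id)
    # Invert the map so a reifying topic leads back to the association it reifies.
    reified = {topic: assoc for assoc, topic in reifying_topic.items()}
    for link in template_links:
        if link["instance"] == instance_topic:
            return reified[link["template"]]
    return None  # no template is recorded for this association


found = template_of("tm-23:0", reifying_topic, template_links)
```

An application that knows nothing about templates still sees three ordinary associations and two ordinary topics, which is why the approach stays within standard XTM.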
Summary
We have shown here how the XTM syntax relates to the k42
topicmap model and how to define and use association templates in XTM. For more
information on the XTM syntax, see the XTM v1.0 DTD and for more information on
the core data model, see the developer guide and API documentation for
k42.
Copyright (c) empolis UK LTD. All rights reserved.