k42 Concepts
This document contains a conceptual overview of topicmaps and the k42 architecture.
What is a topicmap?
A topicmap is an organisation of knowledge using a representation understandable by a computer. A topicmap in its most basic form is simply a collection of "topics" and "associations" where associations connect topics together.
What is a topic?
A topic is defined as "something that I am interested in". This could be a person, a place, a concept, or anything else that makes sense to you. Topics are identified by names, and a topic may have one or more names, which is useful for multi-lingual applications. For example, the topic used to express the concept of a dog may have the names Dog, Hund and Perro, so that a person speaking German or Spanish can talk about the same concept just as easily as an English-speaking user.
What is an association?
An association is a link between any number of topics [1]. For example, given the topics Andy and Spectrum, I could create an association named owns that connects Andy and Spectrum, using this association to express the concept that "Andy owns a Spectrum".
Topic instances
To make this more generic, we can use OO techniques in topicmaps to provide a schema for our topics. Finding out who owns a Spectrum is useful, but it is all a bit random: I don't know what an "Andy" is or what a "Spectrum" is. I want to be able to ask questions such as "Who owns a computer?". The first step towards this is to create a topic called Computer and make Spectrum an instance of Computer. If I wanted to add other types of computer, such as a Commodore 64, I could create a topic Commodore 64 and make it an instance of Computer as well. The same should be done for Andy, so a topic Person is created and Andy becomes an instance of Person. We are not yet able to ask "Who owns a computer?"; we need an association template first.
What is an association template?
At the moment we have a single non-reusable association, owns, that associates two arbitrary topics, Andy and Spectrum. To make this more powerful, associations can be defined in terms of templates [2]. In OO terms, an association template is a class (or "type") of association, and the associations themselves are the instances of that class. So, if I create the association template owns, I can then create an instance of the owns template for each person who owns a Spectrum.
What makes up an association template?
Each association template defines one or more ends. Each association instance must have a topic at each end of the association. Each end of an association template has two properties:
  1. a role defining topic, which describes the nature of the relationship played by a particular topic in the template. For example: "owner", as in "Andy plays the role of owner in the association owns"
  2. a class of role playing topic, which describes what types of topic may be used at this association end. For example: the owner role can only be played by instances of Person. So the class of role playing topic in the example is Person.
To complete this template, we need to create an end to represent an owned computer. The whole template is summarised in the table below.
Association End   Role defining topic   Class of role playing topic
1                 owner                 Person
2                 owned                 Computer
What makes up an association instance?
Instances of this template can now be created. An association instance has the same number of ends as the template, but at each end is now the topic that plays the role for that end in the association. Thus, an instance of the owns association template could contain Andy playing the role of owner, and Spectrum playing the role of owned. The tables below show some new Person instances, new Computer instances and the owns association instances between them.
Topic Name     Instance of
Person         -
Computer       -
Spectrum       Computer
Commodore 64   Computer
Andy           Person
Graham         Person
Martyn         Person

Association   Person   Computer
Owns          Andy     Spectrum
Owns          Andy     Commodore 64
Owns          Graham   Spectrum
Owns          Martyn   Commodore 64
I can now make queries against the association to ask the following questions:
  1. Who owns a computer?
  2. Who owns a Spectrum?
  3. What computers does Andy own?
While these queries would be possible against individually defined association instances, the templating mechanism provides consistency and semantic rigour across all associations of this type. This begins to ensure the consistency and usefulness of the topicmap.
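The three questions above can be sketched in a few lines of Java. This is only an illustration built on a hypothetical in-memory model: the class OwnsQueries, its maps and its method names are assumptions made for this example and are not part of the k42 API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model for illustration; none of this is k42 API.
class OwnsQueries {
    // Topic name -> name of the class topic it is an instance of.
    static final Map<String, String> INSTANCE_OF = new LinkedHashMap<>();
    // One entry per "owns" association instance: {owner, owned}.
    static final List<String[]> OWNS = new ArrayList<>();

    static {
        INSTANCE_OF.put("Spectrum", "Computer");
        INSTANCE_OF.put("Commodore 64", "Computer");
        INSTANCE_OF.put("Andy", "Person");
        INSTANCE_OF.put("Graham", "Person");
        INSTANCE_OF.put("Martyn", "Person");
        OWNS.add(new String[] {"Andy", "Spectrum"});
        OWNS.add(new String[] {"Andy", "Commodore 64"});
        OWNS.add(new String[] {"Graham", "Spectrum"});
        OWNS.add(new String[] {"Martyn", "Commodore 64"});
    }

    // "Who owns a computer?": owners of any topic that is an instance of className.
    static Set<String> ownersOfAny(String className) {
        Set<String> owners = new LinkedHashSet<>();
        for (String[] ends : OWNS) {
            if (className.equals(INSTANCE_OF.get(ends[1]))) {
                owners.add(ends[0]);
            }
        }
        return owners;
    }

    // "Who owns a Spectrum?": owners of one specific topic.
    static Set<String> ownersOf(String owned) {
        Set<String> owners = new LinkedHashSet<>();
        for (String[] ends : OWNS) {
            if (ends[1].equals(owned)) {
                owners.add(ends[0]);
            }
        }
        return owners;
    }

    // "What computers does Andy own?": topics owned by one specific owner.
    static Set<String> ownedBy(String owner) {
        Set<String> owned = new LinkedHashSet<>();
        for (String[] ends : OWNS) {
            if (ends[0].equals(owner)) {
                owned.add(ends[1]);
            }
        }
        return owned;
    }

    public static void main(String[] args) {
        System.out.println(ownersOfAny("Computer"));
        System.out.println(ownersOf("Spectrum"));
        System.out.println(ownedBy("Andy"));
    }
}
```

Note that the first query only works because every owned topic is typed: it is the instance-of link to Computer, not the topic name, that answers "Who owns a computer?".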
Topic subclasses
There are lots of different types of computer in the world, and grouping them all simply as types of computer may be confusing. To refine the instances of Computer, I can create subclasses of the Computer topic, 8-bit computer and 16-bit computer, and make the topics that represent specific computers instances of these refinements. So we have:
Topic             Subclass of   Instance of
Computer          -             -
8-bit computer    Computer      -
16-bit computer   Computer      -
Spectrum          -             8-bit computer
Commodore 64      -             8-bit computer
Amiga             -             16-bit computer
Atari ST          -             16-bit computer
Note very carefully when to use "subclass" and when to use "instance of". An 8-bit computer is not an instance of a computer; it is a refinement of computer. Amiga is an instance of both a 16-bit computer and a computer.
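The difference between the two relationships can be sketched as a lookup that follows one instance-of link and then any number of subclass links. The class ComputerTypes and its maps below are hypothetical and only mirror the table above; they are not k42 classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the table above; not part of the k42 API.
class ComputerTypes {
    // Class topic -> the class topic it refines (its superclass).
    static final Map<String, String> SUBCLASS_OF = new HashMap<>();
    // Topic -> the class topic it is a direct instance of.
    static final Map<String, String> INSTANCE_OF = new HashMap<>();

    static {
        SUBCLASS_OF.put("8-bit computer", "Computer");
        SUBCLASS_OF.put("16-bit computer", "Computer");
        INSTANCE_OF.put("Spectrum", "8-bit computer");
        INSTANCE_OF.put("Commodore 64", "8-bit computer");
        INSTANCE_OF.put("Amiga", "16-bit computer");
        INSTANCE_OF.put("Atari ST", "16-bit computer");
    }

    // True if topic is an instance of cls, either directly or through
    // a chain of subclass links from its direct class up to cls.
    static boolean isInstanceOf(String topic, String cls) {
        String current = INSTANCE_OF.get(topic);
        while (current != null) {
            if (current.equals(cls)) {
                return true;
            }
            current = SUBCLASS_OF.get(current);
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isInstanceOf("Amiga", "16-bit computer"));
        System.out.println(isInstanceOf("Amiga", "Computer"));
        System.out.println(isInstanceOf("8-bit computer", "Computer"));
    }
}
```

In this sketch 8-bit computer has a subclass link but no instance-of link, so it is never reported as an instance of Computer, which is exactly the distinction drawn above.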
Occurrences, Subject references and Resources
Creating these knowledge structures is only really useful if they represent something real. A topic called Spectrum does not actually tell the user what a Spectrum is. As far as the computer is concerned, it is just 8 characters, and it probably won't mean very much to someone under the age of 10. We need to make a link between the conceptual notion of a Spectrum and what a Spectrum really is.
There are three ways to do this; which one you use depends on the following criteria:
  1. Is the real-world thing represented by the topic available to the computer? If so, I can create a resource reference. For example: if a topic W3C website was created, then I can create a resource reference to http://www.w3.org. This URL resolves to data that is the W3C website. Another example might be an introduction to the Jakarta-Apache product Log4J. Here, the document itself is available at this URL: http://jakarta.apache.org/log4j/docs/manual.html, therefore that URL would be associated with that topic as a resource reference.
  2. If the thing represented by the topic is not available to the computer (for example, the person called Andy cannot be obtained by the computer directly), then a subject indicator reference may be used. This is a unique [3] identifier for the concept of the person called Andy. A subject indicator reference could be anything, such as a URN or a Dewey reference, i.e. it is a well-known code. You can use other people's subject indicator references or make up and publish your own.
  3. The final way to link conceptual knowledge and real-world entities is used if you have an annotation for a particular topic. For example, if I have a picture of Andy, I can create an occurrence pointing to the picture (usually via a URL) and associate it with the topic. Occurrences differ from resource references because occurrences are resources about the topic, not what the topic itself represents.
What is scope, and what are scopesets?
Scope is probably the most confusing concept in topicmaps. It is advised that you avoid using scopesets until you are competent with the other concepts in topicmaps. Scopesets may be applied to either names or associations. Names are the identities given to topics. A scopeset consists of a set of scoping topics, where a scoping topic is simply a topic that is used in a scopeset. A simple example of using a scopeset would be for multi-lingual topicmaps.
The topic Computer that we have created is an English word. In French, the word is ordinateur and the Spanish word is ordenador. If French or Spanish users wanted to browse the topicmap to find people who owned computers they would not be able to, unless they knew the English word computer.
To prevent this problem occurring, we can create several names for the topic we know as Computer. By default all these names are added to what is called the unconstrained scope, that is, the scope within which everything that has no scope is automatically placed. This will work with no problems except that when browsing the topicmap, all users will now see all 3 names for the topic: Computer, Ordinateur and Ordenador. This is messy and presents useless information to the user. To fix this, we can use scope. By creating two new topics, French and Spanish, we then scope the topic name Ordinateur to the French topic (making the French topic a scoping topic, part of a scopeset) and scope the topic name Ordenador to the Spanish topic. Now a user can specify that they are only interested in seeing the French names for topics by masking all results based on that scopeset.
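A minimal sketch of this kind of masking is shown below. It assumes a hypothetical ScopedNames class in which each name of the Computer topic carries its own scopeset, with an empty set standing for the unconstrained scope; how k42 itself applies a scope mask is not described here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of one topic's names; not part of the k42 API.
class ScopedNames {
    // Name -> its scopeset (an empty set means the unconstrained scope).
    static final Map<String, Set<String>> NAMES = new LinkedHashMap<>();

    static {
        NAMES.put("Computer", Collections.<String>emptySet());
        NAMES.put("Ordinateur", Collections.singleton("French"));
        NAMES.put("Ordenador", Collections.singleton("Spanish"));
    }

    // Without a mask, every name of the topic is returned.
    static List<String> allNames() {
        return new ArrayList<>(NAMES.keySet());
    }

    // With a mask, only names whose scopeset contains the scoping topic are returned.
    static List<String> namesInScope(String scopingTopic) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : NAMES.entrySet()) {
            if (entry.getValue().contains(scopingTopic)) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(allNames());
        System.out.println(namesInScope("French"));
    }
}
```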
Summary
In this section we have seen all of the main topicmap constructs: topics, associations, occurrences, resource references, subject indicator references and scopesets. The rest of this document describes the data model that powers k42 enabling it to represent a topicmap.
TopicMaps are not relational databases
A common misconception about topicmaps is that they are no different from a relational database. This is not the case. A database consists of a collection of tables containing, effectively, lists of data. These lists may be combined using relationships or queries to refine the data and obtain useful information. It is possible to model any database schema in a topicmap; however, the reverse is not true. Relational databases were created from a mathematical model that was then transposed on to real-life scenarios, and they are not a natural way to model data.
A topicmap allows the user to model the data in a natural way, by defining areas of interest (topics), and then defining links between those areas (associations). Neither the topics nor the associations are simple pointers; they may have semantics defined for them which may be used for querying and merging of information. Also, as the user is modelling knowledge from a natural perspective, topicmaps are very easy to create and query.
The salient point here is that databases process data, whereas topicmaps process knowledge.
The TopicMap Model
How does k42 model topicmaps?
The XML topicmaps (henceforth XTM) version 1.0 syntax provides a standard way to interchange topicmap information between compliant applications. k42 understands the XTM v1.0 syntax and processes it into an internal representation, which we call the topicmap model. The core topicmap data model is a subset of the complete topicmap model and is the simplest useful view of it. In this document we express the core topicmap data model using an entity relationship diagram. The final part of this document provides further insight into the details of the object model and the advanced conceptual features that are implemented in k42.
The k42 TopicMap Core Data Model
Overview
This overview presents an entity relationship diagram showing all the entities of the core topicmap data model along with the relationships between them. These entities and their relationships are fully explained in the subsequent sections.
The Entities
TopicMap
The topicmap is the organising principle of the model. A single topicmap instance will manage many topics and topic associations. It is the topicmap that acts as the factory for the creation of new topics and associations. The topicmap has two properties: topics and associations.
In k42 these properties are accessed through methods provided by the com.empolis.topicmaps.ik42.ITopicMap interface. These methods are
getTopics()
getAssociationInstances()
Both return typed iterators that extend the java.util.Iterator interface. These iterators provide access to the sets of topics and associations respectively.
The topicmap provides a point of focus for the model. The querying mechanisms for the topicmap are also located on the ITopicMap interface.
Syntax Reference
A topicmap instance is created when processing an XTM document and finding a <topicMap> element.
Topic
There are only a few concepts in the model but at the heart of all of them is the topic, represented by the com.empolis.topicmaps.ik42.ITopic interface. As defined in XTM, the topic has names, occurrences, identity and plays roles in associations. All of these aspects are represented in the model.
The topic has the following properties:
names - the name constructs that help form the identity of a topic.
instanceOfClasses - the set of topics that are classes of which this topic is an instance.
occurrences - the occurrence constructs that reference resources related to this topic.
SubjectIndicatorRefs - the unique identity constructs that help to define the identity of this topic.
ResourceRefs - if this topic is itself a resource then its identity consists of the resource reference that identifies the resource.
AssociationEnds - the set of association ends that bind this topic into an association.
In k42 the following methods on the com.empolis.topicmaps.ik42.ITopic interface can be used to access the properties described above. Note that where a set is mentioned above, a typed java.util.Iterator instance is returned in k42. This is an implementation detail to enable scalable topicmap access.
Property               Method on ITopic
names                  getNames()
instanceOfClasses      getInstancesOf()
occurrences            getTopicOccurrences()
SubjectIndicatorRefs   getSubjectIndicatorReferences()
ResourceRefs           getResourceReferences()
AssociationEnds        getAssociationEnds()
TopicAssociation
The TopicAssociation is the entity that, in conjunction with the AssociationEnd entity, binds the map together. The TopicAssociation has the following properties:
Ends - the set of association ends that bind topics into this association.
inScope - the scopeset that constrains the context in which this association holds.
instanceOf - the topic entity that is the class of which this association is an instance.
The following methods on com.empolis.topicmaps.ik42.ITopicAssociation provide access to the properties described.
Property     Method on ITopicAssociation
Ends         getEnds()
inScope      getScopeSet()
instanceOf   getInstancesOf()
AssociationEnd
The AssociationEnd is contained by an association and links together a role defining topic and a role playing topic. The AssociationEnd can be accessed from both associations and the topic that is the role playing topic. The AssociationEnd has the following properties:
association - the association that this end is a member of.
roleDefiningTopic - the topic that defines the role in the association.
rolePlayingTopic - the topic that plays the role in the association.
The properties described above can be accessed on instances of com.empolis.topicmaps.ik42.IAssociationEnd using the following methods:
Property            Method on IAssociationEnd
association         getAssociation()
roleDefiningTopic   getRole()
rolePlayingTopic    getTopic()
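To show how these three entities bind the map together, the sketch below walks from a topic to its association ends, from each end to its association, and from there to the topics at the other ends. The classes Topic, Association and AssociationEnd are simplified stand-ins written for this example, not the k42 interfaces; only the accessor names (getAssociationEnds, getEnds, getAssociation, getRole, getTopic) follow the tables above, and plain lists are used where k42 returns typed iterators.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for illustration; these are NOT the k42 interfaces.
class Topic {
    final String name;
    private final List<AssociationEnd> ends = new ArrayList<>();
    Topic(String name) { this.name = name; }
    List<AssociationEnd> getAssociationEnds() { return ends; }
}

class AssociationEnd {
    private final Association association;
    private final Topic role;
    private final Topic player;
    AssociationEnd(Association association, Topic role, Topic player) {
        this.association = association;
        this.role = role;
        this.player = player;
    }
    Association getAssociation() { return association; }
    Topic getRole() { return role; }    // role defining topic
    Topic getTopic() { return player; } // role playing topic
}

class Association {
    private final List<AssociationEnd> ends = new ArrayList<>();
    List<AssociationEnd> getEnds() { return ends; }
    // Creates an end and registers it with the association and the role player.
    void addEnd(Topic role, Topic player) {
        AssociationEnd end = new AssociationEnd(this, role, player);
        ends.add(end);
        player.getAssociationEnds().add(end);
    }
}

class Navigation {
    // Names of the topics playing roleName in any association that start takes part in.
    static List<String> related(Topic start, String roleName) {
        List<String> result = new ArrayList<>();
        for (AssociationEnd mine : start.getAssociationEnds()) {
            for (AssociationEnd other : mine.getAssociation().getEnds()) {
                if (other != mine && other.getRole().name.equals(roleName)) {
                    result.add(other.getTopic().name);
                }
            }
        }
        return result;
    }

    // Builds "Andy owns Spectrum" and "Andy owns Commodore 64", then asks what Andy owns.
    static List<String> demo() {
        Topic owner = new Topic("owner");
        Topic owned = new Topic("owned");
        Topic andy = new Topic("Andy");
        Association first = new Association();
        first.addEnd(owner, andy);
        first.addEnd(owned, new Topic("Spectrum"));
        Association second = new Association();
        second.addEnd(owner, andy);
        second.addEnd(owned, new Topic("Commodore 64"));
        return related(andy, "owned");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the sketch is that the AssociationEnd is reachable from both sides, which is what allows navigation to start from either a topic or an association.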
ScopeSet
Several aspects of the model rely on the concept of scope to define a context in which they are valid. Names and associations are examples of this. In the model, scope is represented by an aggregated entity: the ScopeSet. The ScopeSet consists of the set of topics that together are said to be the scope. A ScopeSet has just one property in the core data model and that is topics, which returns the set of topics that comprise it. In k42 this property is accessed through the method getTopics() on the com.empolis.topicmaps.ik42.IScopeSet interface.
TopicOccurrence
The TopicOccurrence is the structure that associates resources with topics. The properties of the occurrence are:
topic - the topic that this occurrence applies to.
occurrenceInstanceOf - the topic that defines the class of which this occurrence is an instance.
resourceReference - the resource reference that can be used to locate the occurrence resource.
inScope - the scope that defines the context in which the occurrence is valid.
The occurrence properties can be accessed by the following com.empolis.topicmaps.ik42.ITopicOccurrence methods:
Property               Method on ITopicOccurrence
topic                  getTopic()
occurrenceInstanceOf   getInstancesOf()
resourceReference      getResourceReference()
inScope                getScopeSet()
SubjectIdentityReference
The SubjectIdentityReference entity has just a single property, value, that is a string that helps to identify the topic to which the subject indicator reference belongs. To access this property k42 provides a method on com.empolis.topicmaps.ik42.ISubjectIndicatorReference called getValue().
ResourceReference
The resource reference entity has just a single property, URI, that is a string that locates the resource. To access this property k42 provides a method on com.empolis.topicmaps.ik42.IResourceReference called getURI().
k42 Core Data Model Summary
This section has shown how k42 models topicmaps. We have illustrated the key model concepts and their relationships. For a fuller understanding of how to access all aspects of the presented entities, refer to the API documentation.
k42 Advanced TopicMap Model
TopicMap Model Advanced Concepts - Introduction
The core data model presented all the main topicmap model aspects for reading a topicmap. This section adds to those basic concepts by presenting the advanced parts of the data model. Templates, properties and the extended occurrence model are unique to k42, while reification is an advanced feature of XTM (also supported by k42).
TopicMap Model Advanced Concepts
Reification
Reification is the act of creating a topic to represent a 'thing'. Topics achieve this by having names and identity. This is how the reified object is identified. However, we might also want to reify things within the topicmap model. Examples of this are putting a name onto an association, or perhaps putting an occurrence onto a topicmap name entity. This is supported in k42 such that all entities know what their reifying topic is [4], and equally a topic knows which topicmap model entity it reifies. Associations and occurrences are automatically reified within k42; other entities can also be reified.
The diagram above shows the entity hierarchy in the data model. The key thing to notice is that all entities extend the IK42 (see com.empolis.topicmaps.ik42.IK42) entity. This is the base entity in the topicmap model. Any topicmap construct can be reified through the fact that IK42 entities can be reified. The diagram below shows the data model relationship between an IK42 entity and its reifying topic. Thus all derived entities of k42 also have this property.
The methods used to access a reified object in k42 are getReifyingTopic() on com.empolis.topicmaps.ik42.IK42 and getReifiedObject() on com.empolis.topicmaps.ik42.ITopic.
Properties
Topics and associations are a rich paradigm for representing typed associations between meaningful entities. However, it is also necessary to store simple named properties as meta data on the topics and associations themselves.
The diagram below shows how all IK42 entities have a properties property. Each property entity in the set of properties has two properties, key and value. These return entities of type java.lang.String and of type general entity respectively, where a general entity can be any kind of object that exists in the system.
The methods in k42 for accessing the properties part of the data model are as follows: on com.empolis.topicmaps.ik42.IK42 use getProperties() to get access to a set of property entities, and on the property entity use getKey() and getValue().
Advanced Occurrence Model
An occurrence is in fact a special kind of association. It is an association between a topic of any nature and a 'resource topic'. What we mean by a 'resource topic' is one whose identity is defined using a resource reference. See the core model for more information on identity and resource references. When accessing the resource reference property of an occurrence, as shown in the core model, we are actually fetching the resource reference of the topic that is playing the role of the resource topic in the occurrence association. This approach means that there are fewer special cases in terms of the data structure within the topicmap model. What we have is a refinement of the semantics for a particular kind of topic association. The model below shows an occurrence instance diagram as a topic association.
Notice above that the 'topic' is the topic entity to which the occurrence relates and the 'resource topic' is the topic that has its identity defined by a resource reference. The instance diagram below shows the internal representation of the following XTM fragment.
<topic id="t-gdm">
	<baseName><baseNameString>Graham Moore</baseNameString></baseName>
	<occurrence>
		<resourceRef xlink:href="http://www.empolis.co.uk/bios.html#gdm" />
	</occurrence>
</topic>
Templates
The last part of the k42 advanced model is the idea of templates. Templates are used to provide a schema for association instances. In k42, templates use the normal topic association data structure; the difference is in the semantics assigned to that data structure. This is similar to the k42 view of occurrences as topic associations with refined semantics. It should be noted, though, that reading a topic association template as a standard association will seem strange; coupled with an understanding of the semantics, however, it will appear obvious.
The key aspect of a template is to define the nature of the roles that are present in a particular class of association. In addition, the template must specify the class of topics that can play those roles in an association instance. A template in k42 is a topic association where the 'role defining topics' define the nature of roles in that association template, and the role playing topics of those particular ends define the class of topic that can play that role in an association instance.
An example will best illustrate these ideas. We want to create an association template called 'company employs person' that defines a relationship between companies and employees. In this template, we create a topic association that has two association ends. One end has a role defining topic called employee, and the role playing topic of that template end is the topic called person. Here the topic person is being used to state that topics that play the role of employee in instances of this association must be instances of the topic person. In a similar way, we define the other template association end to have the topic employer as its role defining topic and the topic company as its role playing topic, such that all topics playing the role of employer must be instances of the topic company.
The diagram below shows the basic data model that connects an association instance to its template. For information on how topic associations and their templates are serialised in XTM, see the section on XML TopicMap (XTM) Support below.
In k42 the method getTemplate() exists on the com.empolis.topicmaps.ik42.ITopicAssociation interface.
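The constraint expressed by this template can be sketched as a simple conformance check. The TemplateCheck class below is a hypothetical illustration, not k42 code: the template is held as a map from role defining topic to the required class of role playing topic, and the sample topics Graham and Empolis are invented for the example. Using a map assumes that each role appears at only one end of the template.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of template semantics; not part of the k42 API.
class TemplateCheck {
    // Template ends: role defining topic -> class of role playing topic.
    static final Map<String, String> EMPLOYS_TEMPLATE = new HashMap<>();
    // Invented sample topics: topic -> class topic it is an instance of.
    static final Map<String, String> INSTANCE_OF = new HashMap<>();

    static {
        EMPLOYS_TEMPLATE.put("employee", "person");
        EMPLOYS_TEMPLATE.put("employer", "company");
        INSTANCE_OF.put("Graham", "person");
        INSTANCE_OF.put("Empolis", "company");
    }

    // Builds an association instance as a map: role defining topic -> role playing topic.
    static Map<String, String> instance(String employee, String employer) {
        Map<String, String> ends = new HashMap<>();
        ends.put("employee", employee);
        ends.put("employer", employer);
        return ends;
    }

    // An instance conforms if it has exactly the template's roles and every
    // role player is an instance of the class the template requires for that role.
    static boolean conforms(Map<String, String> instance) {
        if (!instance.keySet().equals(EMPLOYS_TEMPLATE.keySet())) {
            return false;
        }
        for (Map.Entry<String, String> end : instance.entrySet()) {
            String requiredClass = EMPLOYS_TEMPLATE.get(end.getKey());
            if (!requiredClass.equals(INSTANCE_OF.get(end.getValue()))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(conforms(instance("Graham", "Empolis")));
        System.out.println(conforms(instance("Empolis", "Graham")));
    }
}
```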
k42 Advanced Model and Concepts - Summary
This section has presented the advanced features of k42. These additional features and models add more value to the topicmap systems without compromising the interoperability of topicmaps. These features are here to enable more sophisticated processing of the topic model.
XML TopicMap (XTM) Support
This section details the XTM support that is available in k42, the mapping between the XTM syntax and the k42 topicmap model and the additional use of the XTM syntax to express topic association templates. The main point of note is that k42 is 100% XTM compliant and supports many optional features of the standard, such as the processing of referenced maps as a single processing operation.
XTM Compliance and Core TopicMap Model Mapping
k42 can process a compliant XTM instance into an internal model that can be manipulated programmatically by k42 client applications such as WebAuthor. Because every XTM construct is supported, this section shows how the XTM syntax maps to the k42 topicmap model. We will describe this mapping one element at a time.
<!ELEMENT topic ... > creates a new topic entity in the topicmap model. The XML 'id' attribute is not maintained within the model.
<!ELEMENT instanceOf ... > on a topic or association creates a reference from that topic or association to another topic, which is the class of which it is an instance. When defined on an occurrence, it references a topic that defines the nature of the topic occurrence. Note that this and other elements contain either topic references or subject indicator references. In the k42 model both result in a reference to a topic, so they are not discussed in detail, except to say that when a subject indicator reference is found in the XML in a position other than within subjectIdentity, a new topic that has that subjectIndicatorReference as its identity is created if one does not already exist.
<!ELEMENT scope ... > defines a set of topics that together are considered as a unique scope set in the topicmap model. Scope sets are used by names, associations and occurrences to define the context in which they are valid.
<!ELEMENT baseName ... > is used to define a name construct by grouping together a scope set and a baseNameString.
<!ELEMENT baseNameString ... > creates a name entity where the value is the text contained within the element.
<!ELEMENT occurrence ... > creates an occurrence entity connected to the topic most recently created by processing a topic element. The occurrence can contain a scope element, which defines the context of this occurrence. The xlink:href value or the <!ELEMENT resourceData ...> element defines the value of the resourceReference that is associated with the occurrence. Note that the prefix data: is prepended to any resourceReference value created as a result of processing a <!ELEMENT resourceData ...> element.
<!ELEMENT association ... > creates a topic association entity which, using the member element, connects topics together in typed associations. A scope set within this element defines the context in which this association is valid.
<!ELEMENT member ... > is used to group together a role defining topic and a role playing topic. Processing this element creates an associationEnd entity on the current association in the topicmap model. The topicRef or the subjectIndicatorRef within this element define the rolePlayingTopic property.
<!ELEMENT roleSpec ... > creates a reference from the associationEnd to a topic; this reference is the roleDefiningTopic property.
<!ELEMENT subjectIdentity ... > does not itself create any structures within the k42 model (although the resourceReference or subjectIndicatorReference it may contain does). If either of these sub-elements is processed in this context, the appropriate subjectIndicatorReference or resourceReference is attached to the topic currently being processed. A subjectIndicatorReference is overloaded in one way: if the subjectIndicatorReference references any topicmap construct within the XTM file being processed, then this topic reifies that topicmap construct. This is mirrored in the model through the reifiedObject property on the topic entity.
Serializing TopicMap Templates using XTM
The section above has described how all the XTM constructs are processed and which resulting topicmap model entities are constructed. From the descriptions above, it should be clear how the export operation serializes to XTM given a topicmap model. In this section, we show how k42 supports templates defined using the standard topic association constructs.
An association instance needs to be connected with its template. The template itself is represented as an association. We relate the two associations using a third topic association. However, we cannot connect associations directly using the third topic association, so we need to reify the template and the instance with topics and then connect the reifying topics.
Note that k42 will not support this XTM extension, unless the associations are defined in the order:
  1. Association Template, and its reifying topic;
  2. Association Instance, and its reifying topic;
  3. The Association that connects the reifying topics.
In the example shown below, topic association template tm-18:0 is reified by topic t-18:0 and the topic association instance tm-23:0 is reified by topic t-23:0. The two reifying topics are then connected through the final topic association.
The example below clearly shows how the XTM syntax is used to define association instances and their templates.

<!-- Association Template -->
<association id="tm-18:0">

	<!--  t-16:0 is both the role defining topic and 
         role playing topic class constraint -->
	<member>
		<roleSpec>
			<topicRef xlink:href="#t-16:0" />
		</roleSpec>
		<topicRef xlink:href="#t-16:0" />
	</member>
	
	<!--  t-15:0 is both the role defining topic and 
         role playing topic class constraint -->
	<member>
		<roleSpec>
			<topicRef xlink:href="#t-15:0" />
		</roleSpec>
		<topicRef xlink:href="#t-15:0" />
	</member>
</association>

<!-- t-18:0 is the reifying topic of the tm-18:0 association template -->
<topic id="t-18:0">
	<subjectIdentity>
		<subjectIndicatorRef xlink:href="#tm-18:0" />
	</subjectIdentity>
	<baseName>
		<baseNameString>comp emp person</baseNameString>
	</baseName>
</topic>

<!-- tm-23:0 is the association instance of the tm-18:0 template -->
<association id="tm-23:0">

	<!-- t-16:0 is the role defining topic 
        t-17:0 is the role playing topic for this end -->
	<member>
		<roleSpec>
			<topicRef xlink:href="#t-16:0" />
		</roleSpec>
		<topicRef xlink:href="#t-17:0" />
	</member>
	
	<!-- t-15:0 is the role defining topic 
        t-12:0 is the role playing topic for this end -->
	<member>
		<roleSpec>
			<topicRef xlink:href="#t-15:0" />
		</roleSpec>
		<topicRef xlink:href="#t-12:0" />
	</member>
</association>

<!-- t-23:0 is the reification of the association instance tm-23:0 -->
<topic id="t-23:0">
	<subjectIdentity>
		<subjectIndicatorRef xlink:href="#tm-23:0" />
	</subjectIdentity>
</topic>

<!-- This is the association that connects the reifying topic of the association
     instance with the reifying topic of the association template -->
<association>
	<instanceOf><subjectIndicatorRef xlink:href="http://www.empolis.com/xtm/1.0/index.html#psi-assoctemplateassoc" /></instanceOf>
	<member>
		<roleSpec><subjectIndicatorRef xlink:href="http://www.empolis.com/xtm/1.0/index.html#psi-assoctemplate" /></roleSpec>
		<topicRef xlink:href="#t-18:0" />
	</member>
	<member>
		<roleSpec><subjectIndicatorRef xlink:href="http://www.TopicMaps.org/xtm/1.0/index.html#psi-associnstance" /></roleSpec>
		<topicRef xlink:href="#t-23:0" />
	</member>
</association>
This section has shown how to use the XTM syntax to represent k42 topic association templates. This does not extend the syntax in any way and other compliant XTM applications are still able to process this topicmap.
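As a rough illustration of the reification convention used in the listing, the sketch below parses a cut-down XTM string with the JDK's DOM parser and finds the topic whose subjectIdentity points at a given association id. It is not how k42 processes XTM; the class ReifierLookup, its SAMPLE string and the method reifierOf are assumptions made for this example, and the sample omits the member elements that a real association would contain.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of the k42 API.
class ReifierLookup {
    // Cut-down fragment modelled on the listing; members are omitted for brevity.
    static final String SAMPLE =
          "<topicMap>"
        + "<association id=\"tm-18:0\"/>"
        + "<topic id=\"t-18:0\"><subjectIdentity>"
        + "<subjectIndicatorRef xlink:href=\"#tm-18:0\"/>"
        + "</subjectIdentity></topic>"
        + "</topicMap>";

    // Returns the id of the topic that reifies constructId, or null if there is none.
    static String reifierOf(String xtm, String constructId) {
        try {
            // The default factory is not namespace aware, so xlink:href is
            // read as a plain attribute name and needs no xmlns declaration here.
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xtm)));
            NodeList topics = doc.getElementsByTagName("topic");
            for (int i = 0; i < topics.getLength(); i++) {
                Element topic = (Element) topics.item(i);
                NodeList identities = topic.getElementsByTagName("subjectIdentity");
                for (int j = 0; j < identities.getLength(); j++) {
                    Element identity = (Element) identities.item(j);
                    NodeList refs = identity.getElementsByTagName("subjectIndicatorRef");
                    for (int k = 0; k < refs.getLength(); k++) {
                        Element ref = (Element) refs.item(k);
                        if (("#" + constructId).equals(ref.getAttribute("xlink:href"))) {
                            return topic.getAttribute("id");
                        }
                    }
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XTM fragment", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(reifierOf(SAMPLE, "tm-18:0"));
    }
}
```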
Summary
We have shown here how the XTM syntax relates to the k42 topicmap model and how to define and use association templates in XTM. For more information on the XTM syntax, see the XTM v1.0 DTD, and for more information on the core data model, see the developer guide and API documentation for k42.

Footnotes
1. Virtually all associations will only ever connect two topics. It is possible to create associations between more than two topics, but that is usually not useful and should only be considered by advanced users.
2. Templates are not part of the topicmap specification as yet, but they are a logical extension that provides many useful semantics to a topicmap.
3. There is no world-wide repository as yet for all types of subject indicator reference, but one day there may be.
4. Not all things are reified.