Structure

In the structural part our ontology controls how topics can be used and what information they must or may contain.

Basenames

A natural constraint is that all publications must contain a basename. In a first attempt we would write that anything which is (directly or indirectly via subclassing) a document must have some base name:

forall $t [ * (document) ]            # directly or indirectly being a document
   => exists $t [ bn : * ]            # there is some base name

Unfortunately, this does not exactly mirror our intentions: it simply expects documents to have some basename, irrespective of the scope. What we really need is to explicitly enforce a basename in the unconstrained scope:

forall $t [ * (document) ]
   => exists $t [ bn @ uc : * ] is-reified-by documents-must-contain-basename

uc reifies urn:x-topicmaps:unconstrained-scope
We also reify this constraint immediately. This means that a topic is implicitly created which can be used in some other context. Issue: For some reason there is no PSI for the unconstrained scope. It is an open question why all core concepts should not be identified by URNs anyway.
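The difference between "some basename" and "a basename in the unconstrained scope" can be illustrated with a minimal Python sketch. The dictionary layout, the topic ids (paper-a, paper-b, alice) and the function name are assumptions made for illustration only; they are not part of AsTMa* or of any topic map API. The sketch also checks direct types only and ignores subclassing.

```python
# Toy model (hypothetical): each topic has a set of types and a list of
# (scope, text) base names. "uc" stands for the unconstrained scope.
UNCONSTRAINED = "uc"

topics = {
    "paper-a": {"types": {"document"}, "basenames": [("uc", "Paper A")]},
    "paper-b": {"types": {"document"}, "basenames": [("latex", "Paper B")]},
    "alice": {"types": {"person"}, "basenames": [("uc", "Alice")]},
}


def documents_without_basename(topics, scope=None):
    """Return ids of documents that lack a base name (in `scope`, if given)."""
    missing = []
    for topic_id, topic in topics.items():
        if "document" not in topic["types"]:
            continue  # the forall clause only selects documents
        scopes = [s for s, _ in topic["basenames"] if scope is None or s == scope]
        if not scopes:
            missing.append(topic_id)
    return missing


print(documents_without_basename(topics))                 # [] (any scope will do)
print(documents_without_basename(topics, UNCONSTRAINED))  # ['paper-b']
```

Under the first reading paper-b passes because its LaTeX-scoped name counts; under the second it is reported, which is the behaviour the scoped constraint asks for.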

In an earlier section we also introduced optional basenames in the latex scope:

forall $t [ * (document) ]
   => suggested exists $t [ bn @ latex : * ] is-reified-by documents-may-contain-latex-basename
The keyword suggested simply indicates to the author (or the authoring environment) that potential map users may profit from this information.

Comments and Abstract

In the same way we might encourage map authors to include comments and/or abstracts in document topics:

forall $t [ * (document) ]
   => suggested exists $t [ in (comment) : * ] is-reified-by documents-may-contain-comment

forall $t [ * (document) ]
   => suggested exists $t [ in (abstract) : * ] is-reified-by documents-may-contain-abstract

We used a wildcard (*) to signal that we do not care about the contents or the length of this abstract. For applications which format and render this information, say, as part of a web page or in a brochure, a length restriction may significantly improve the professional look. As the * is only a convenient abbreviation of the regular expression .* we can also use a more demanding expression there:

forall $t [ * (document) ]
   => suggested exists $t [ in (abstract) : /^.{200,400}$/ ] is-reified-by documents-may-contain-abstract
In this case the abstract should contain at least 200 and at most 400 characters. Note that this does not forbid the author from writing abstracts outside this range. For one, the keyword suggested ranks this constraint as not strictly necessary for a map to conform. Secondly, the topic itself can have other characteristics that are of no relevance to the above.
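As a rough illustration of the length rule, the following Python sketch reports a warning rather than an error, mirroring the non-binding nature of suggested. The function name and the wording of the warning are assumptions, and so is the decision to count newlines as characters.

```python
import re

# Assumed reading: between 200 and 400 characters, newlines included.
ABSTRACT_LENGTH = re.compile(r".{200,400}", re.DOTALL)


def check_abstract(text):
    """Return a list of warnings; an empty list means the suggestion is met."""
    if ABSTRACT_LENGTH.fullmatch(text):
        return []
    return [f"abstract has {len(text)} characters, 200 to 400 suggested"]
```

Using fullmatch makes the whole string subject to the bounds; a plain search would already succeed on any text of 200 characters or more.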

Unfortunately, at the time of writing, Topic Map notations (and the underlying paradigm) do not have an official mechanism to include typed data. If, for instance, we would like to add the abstract as XML, then there is no way to let an application know that fact. Also the problem of constraining this is not yet solved.

Cite Codes

To specify that all documents must have a cite code is straightforward:

forall $t [ * (document) ]
   => exists $t [ oc (cite-code) @ latex : /^urn:x-bibtex:/ ] is-reified-by documents-must-contain-bibtex-cite-code
We can even constrain the pattern the cite code must follow with a regular expression.
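To make the three parts of this constraint explicit (occurrence type, scope and value pattern), here is a small Python sketch. The tuple layout and the sample values are hypothetical; only the pattern itself is taken from the constraint.

```python
import re

# Pattern copied from the constraint: the value must start with this prefix.
CITE_CODE = re.compile(r"^urn:x-bibtex:")


def has_bibtex_cite_code(occurrences):
    """occurrences: iterable of (type, scope, value) tuples of one topic."""
    return any(
        occ_type == "cite-code" and scope == "latex" and CITE_CODE.search(value)
        for occ_type, scope, value in occurrences
    )
```

A topic satisfies the rule as soon as one of its occurrences meets all three conditions; occurrences of other types or scopes are simply ignored.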

Online references

To allow additional occurrences of any kind we do not actually need a constraint. If we make no statement, additional components are allowed unless we have explicitly forbidden them.

Still, if we would like to encourage authors to add these, we might say

forall $t [ * (document) ]
  => suggested exists $t [ oc : m/is_uri()/e ]
which expresses our expectation that at least one occurrence, regardless of its type, exists in a document topic. The value of this occurrence must be a URI, though.

For selected document classes we can (and should) detail our expectations, as might be the case for RFCs:

forall $rfc [ * (internet-rfc) ]
  => exists $rfc [ oc : m|^http://www.faqs.org/rfcs| ]
This guarantees that all such documents contain a URL pointing to the given server.
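The two value checks used in this section can be approximated in Python as follows. How is_uri() is actually implemented is not stated here, so looks_like_uri is only a rough stand-in that accepts anything with a URI scheme; both function names and the sample URL in the usage lines are assumptions.

```python
import re
from urllib.parse import urlparse

# Pattern copied from the RFC constraint. Note that the dots are unescaped
# there as well, so each of them matches any single character.
RFC_URL = re.compile(r"^http://www.faqs.org/rfcs")


def looks_like_uri(value):
    """Rough stand-in for is_uri(): accept any value that has a URI scheme."""
    return bool(urlparse(value).scheme)


def is_rfc_reference(value):
    """True if the value starts with the RFC server prefix."""
    return bool(RFC_URL.search(value))


sample = "http://www.faqs.org/rfcs/rfc2396.html"  # illustrative value only
print(looks_like_uri(sample), is_rfc_reference(sample))  # True True
```

The second check is strictly narrower than the first, which is why it is reserved for the internet-rfc class while ordinary documents only receive the generic suggestion.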

Authors

We bind authors to documents via an is-author-of association, whose form we control as follows (editors are handled in the same way):

forall $a [ (is-author-of) ]
  => exists $a ] (is-author-of)
                 opus   : *
                 author : * [ is-reified-by structure-of-is-author-of

forall $a [ (is-editor-of) ]
  => exists $a ] (is-editor-of)
                 opus   : *
                 editor : * [ is-reified-by structure-of-is-editor-of

For the authors themselves we mandate only a basename (again in the unconstrained scope) and an optional LaTeX name. Also the affiliation is optional:

forall $a [ * (person) ]
   => exists $a [ bn @ uc : * ]
      and
      suggested exists $a [ bn @ latex : * ] is-reified-by person-must-have-basename-and-may-have-latex-name

forall $a [ (is-affiliated-with) ]
   => exists $a ] (is-affiliated-with)
                  person      : *
                  affiliation : * [ is-reified-by structure-of-is-affiliated-with

Again, we are not interested in particular constraints on the nature of these affiliations.

Other meta information

In a similar way we create the structure for the is-published-by association. One variation is that associations of this type may optionally include a year or a country:

forall [ $d (document) ]
  => suggested exists [ (is-published-by)
                        document : $d ] is-reified-by document-may-have-a-publisher

forall $a [ (is-published-by) ]
  => exists $a ] (is-published-by)
                 document : $d
                 publisher: $p
                 ?year    : *
                 ?country : * [
     and
     exists [ $d (document) ]
     and
     exists [ $p (company) ]

We added constraints on the types of the players in the association: every topic playing the document role must be a document in the map, and every publisher topic must be a (direct or indirect) instance of a company.
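The combination of mandatory roles, optional roles and player types might be checked along the following lines. This is a sketch under two assumptions: that the reversed brackets mean "exactly these roles", so that any further role is reported, and that only direct types are consulted. All names and sample ids are hypothetical.

```python
# role -> type the player must have (mandatory roles)
REQUIRED = {"document": "document", "publisher": "company"}
# roles marked with ? in the constraint
OPTIONAL = {"year", "country"}


def check_is_published_by(roles, topic_types):
    """roles: role -> player id; topic_types: topic id -> set of type ids."""
    errors = []
    for role, expected in REQUIRED.items():
        player = roles.get(role)
        if player is None:
            errors.append(f"missing role {role}")
        elif expected not in topic_types.get(player, set()):
            errors.append(f"{player} is not a {expected}")
    for role in roles:
        if role not in REQUIRED and role not in OPTIONAL:
            errors.append(f"unexpected role {role}")
    return errors
```

An empty result means the association has the expected shape; optional roles may be present or absent without affecting the outcome.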

If we want to specify the type of the players for year or country then we have to add

forall [ (is-published-by)
         year : $y ]
  => exists [ $y =~ /year-\d{4}/ ] is-reified-by every-year-must-have-a-specific-id
and
forall [ (is-published-by)
         country : $c ]
  => exists [ $c (country) ] is-reified-by every-publishing-country-must-exist

In the former constraint we used a regular expression to limit the form of the topic id. It is debatable whether this is the best way to encode a year value.
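One detail worth keeping in mind with such id patterns is anchoring. The Python sketch below uses the same pattern; whether =~ in AsTMa* matches anywhere in the id or against the whole id is not stated here, so the sketch shows both readings side by side. The function name is an assumption.

```python
import re

YEAR_ID = re.compile(r"year-\d{4}")


def is_year_id(topic_id):
    """Strict reading: the whole id must have the form year-NNNN."""
    return YEAR_ID.fullmatch(topic_id) is not None


print(is_year_id("year-2003"))                     # True
print(is_year_id("year-20031"))                    # False
print(YEAR_ID.search("year-20031") is not None)    # True (unanchored reading)
```

If the unanchored reading applies, writing the pattern as ^year-\d{4}$ would rule out ids with extra leading or trailing characters.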

In the last constraint we simply required the country to be declared as such in the map. A more disciplined approach would be to state specifically which countries we allow. To define a closed set of values we can add to our ontology:

forall [ $c (country) ]
  => exists [ 'albania' ]
     or
     exists [ 'algeria' ]
     or
     ...
     exists [ 'zimbabwe' ] is-reified-by limited-list-of-countries

Of course these topics should be aligned with the PSIs for countries.
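In procedural terms a closed value set amounts to a membership test. The Python sketch below lists only the three ids that are spelled out in the constraint, since the rest of the list is elided there; the function name and the sample topics in the usage lines are hypothetical.

```python
# Only the ids named explicitly in the constraint; the full list is elided there.
ALLOWED_COUNTRIES = {"albania", "algeria", "zimbabwe"}


def unknown_countries(topic_types):
    """topic_types: topic id -> set of type ids. Return offending country ids."""
    return sorted(
        topic_id
        for topic_id, types in topic_types.items()
        if "country" in types and topic_id not in ALLOWED_COUNTRIES
    )


sample = {"albania": {"country"}, "atlantis": {"country"}, "paper-a": {"document"}}
print(unknown_countries(sample))  # ['atlantis']
```

Topics that are not typed as country are ignored, just as the forall clause only selects instances of country.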

Documents may also be published within a larger document; for this purpose we use the is-published-in association type:

forall $a [ (is-published-in) ]
   => exists $a ] (is-published-in)
                  whole : $d1
                  part  : $d2 [
      and
      exists [ $d1 (journal | book | proceeding) ]
      and
      exists [ $d2 (document) ] is-reified-by structure-of-is-published-in

This also says that both players must be documents, with the whole specifically being a journal, a book or a proceeding.
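The alternative types for the whole and the indirect typing via subclassing can be sketched in Python as follows. The subclass table is an assumption (it takes journal, book and proceeding to be subclasses of document), and the function names are hypothetical.

```python
# Hypothetical subclass table: class -> direct superclass.
SUPERCLASS = {"journal": "document", "book": "document", "proceeding": "document"}


def is_instance_of(types, wanted):
    """True if any direct type equals `wanted` or is a subclass of it."""
    for current in types:
        while current is not None:
            if current == wanted:
                return True
            current = SUPERCLASS.get(current)  # climb one level, None at the top
    return False


def check_is_published_in(whole_types, part_types):
    """whole must be a journal, book or proceeding; part must be a document."""
    whole_ok = any(
        is_instance_of(whole_types, cls) for cls in ("journal", "book", "proceeding")
    )
    return whole_ok and is_instance_of(part_types, "document")
```

With this table a book chapter published in a book passes as long as the chapter is typed as a document or as one of its subclasses, while a plain document in the whole role is rejected.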
