Topic Map Web Server

The scope of the web site has shifted quite dramatically since its inception. We first started off with a pure HTML-based web site to document our Topic Map related languages and software. This was soon upgraded to a site serving DocBook documents which are automatically rendered into HTML.

Map Development Area

As we started to host third-party articles on issues in TM research and development, we factored out the reference documentation about the AsTMa languages into a separate server. The same happened to various online demonstrations of our constraint and query language. For community-building purposes we enriched the articles with an RSS [RSS] stream which we syndicate from other TM-related blogs and RSS streams.

The other, more dramatic change was the introduction of a map development area (MDA). In this area users can now browse generically through topic maps. Navigating through an inherently unsorted knowledge store like a topic map requires an interactive, hyperlinked display of whole maps and individual topics. The viewer starts with any existing topic and then decides on an outgoing path leading to another topic.

Our generic map display layout was inspired by early versions of the TM4J project (which provides a TM framework for the Java language). Every web page corresponds to a particular topic; its components (names, occurrences) are the focus of the page. All associations in which this topic is involved are displayed at the perimeter, and users can follow any such association via a hyperlink that leads to a different topic.

To give the interface a slightly more appealing look, we also experimented with displaying topological information with SVG. Only a few people use it, whether because of the still suboptimal arrangement on the SVG pane, the lack of installed SVG plugins, or the overly complex impression our current interface presents.
Generic, unsequenced map browsing is only one of the display mechanisms implemented. For more tailored teaching materials, as needed in lectures or talks, topic map views are used. Here we use a format familiar to anyone who knows conventional slide shows: a 'Next' and a 'Previous' button allow users to follow the prepared sequence through the map.

On the management side, authorized users can manage maps by uploading, checking and downloading them. We have not added any editing functionality for maps, as we consider editing inside a real text editor to be superior to editing in any web interface.

With the introduction of map views we created (several versions of) a dedicated user interface to edit all aspects of map views via a web browser. Unlike the authoring of topic maps themselves, the generation of views associated with a map is a sufficiently visual task that we decided to implement this editing functionality within the server environment. New views can be generated, the order in which topics are displayed can be changed, and for every topic the author can customize what information should be shown and how it is to be rendered (Fig. 2).
Once these views are created, they can be downloaded as a whole in various formats, such as styled HTML, PDF and various slide show formats. These documents can be used directly as slide shows, either for online or offline viewing. Views can also be downloaded as XML documents for further processing.

The MDA has become our main tool for knowledge engineering in support of our teaching. It allows us to centrally manage and store knowledge in topic maps, which can then be extracted and applied easily when producing course material.

Architecture

Lacking experience with this kind of service, we let ourselves be guided mainly by the needs of the user frontend.

URL Space

One issue to which we paid considerable attention was the form of the URLs for our server. While this is a trivial question for XML documents (and fragments thereof), we had little experience with an addressing scheme for topic maps themselves, for topics within these maps, and for topic map views or other high-level objects.

Eventually we adopted the approach that maps are addressed via URLs in a rather hierarchical manner. A map about the Internet would be addressed as /internet/; a map about the web would be subordinate to it, with the address /internet/web/. Topics within a map are referenced first by the address of the map and then by the internal topic identifier, so a topic about Google would be accessible at /internet/web/google. Using the convention that map addresses always have a trailing slash, we can also elegantly distinguish between the topic 'Google' and another map about Google: /internet/web/google/.

It might strike one as a bit odd to use a hierarchical addressing scheme for an inherently graph-like structure, especially since Topic Maps have a concept of controlled merging which allows maps to be combined into one global map. Another valid argument against hierarchical addressing is that any chosen hierarchy is completely arbitrary and highly subjective.
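The trailing-slash convention for maps and topics can be sketched in a few lines. The following Python is an illustration only, not the server's Perl code; the names `Address`, `parse_address` and `parent_map` are hypothetical, and treating `/` as the root map is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Address:
    """Hypothetical result type: a map address plus an optional topic identifier."""
    map_path: str            # always ends with "/"
    topic_id: Optional[str]  # None when the address names a map itself


def parse_address(path: str) -> Address:
    """Split a server path into its map address and optional topic identifier."""
    if not path.startswith("/"):
        raise ValueError(f"address must start with '/': {path!r}")
    if path.endswith("/"):
        # Trailing slash: the whole path names a map.
        return Address(map_path=path, topic_id=None)
    # No trailing slash: the last segment is a topic inside the enclosing map.
    map_path, _, topic_id = path.rpartition("/")
    return Address(map_path=map_path + "/", topic_id=topic_id)


def parent_map(map_path: str) -> Optional[str]:
    """Return the address of the superordinate map, or None for the root (assumed '/')."""
    if map_path == "/":
        return None
    head, _, _ = map_path.rstrip("/").rpartition("/")
    return head + "/"


topic = parse_address("/internet/web/google")    # topic 'google' in map /internet/web/
other = parse_address("/internet/web/google/")   # a separate map about Google
```

Here `topic.map_path` is `/internet/web/` with `topic.topic_id` set to `google`, while `other` names a map and carries no topic identifier.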
Still, in the experimental phase the hierarchical scheme proved to be a convenient way to manage our private topic maps.

If a map has a view associated with it, then this view also has a URL, such as /users/me/room-with-view/. This object, though, is never addressed by itself but always in conjunction with its map: /internet/web/@/users/me/room-with-view/. Clearly, a map can thus have different views associated with it. The same mechanism is used for topics when their display should be controlled by a view: /internet/web/google@/users/me/room-with-view/.

The benefits of this URL scheme are not only that all relevant objects can be addressed unambiguously and concisely; we also avoid all forms of URLs which obstruct caching by intermediate proxy chains. The scheme is also consistent and simple and does not introduce artificial distinctions between maps and individual topics.

Backend Storage

When we only served static XML documents and sections thereof, all content was simply stored in the file system. With the introduction of the MDA, though, we needed a persistent TM store. We had to rule out the simplistic approach of loading the whole map from file for every single request: even with smaller maps the parsing process could take several seconds, and for bigger maps up to a minute.

As a fully-fledged topic map engine was out of reach at that time, we resorted to a simpler solution. Every map was parsed once (whenever it had changed) and the whole map data structure was stored in serialized form (in Perl sometimes referred to as freezing) as a BLOB in our MySQL database backend. At the beginning this topic map fridge seemed more like a typical 'Perl Hack', but it has worked quite well over the years, and even larger maps can be reinstated ('thawed') quite efficiently at request time.

In one of the earlier versions we relied on the upload and download functionality we developed for the web server frontend.
To change maps we would simply do a manual HTTP upload, and the application server would parse and store the map under the given URL. Over time that proved to be quite cumbersome and unappealing, especially as we moved towards sharing our knowledge pool more actively: managing multiple versions of maps, maintained by several authors, became a major source of inconsistencies. We also suffered from the lack of a visible hierarchy, so different authors used slightly conflicting organization criteria.

In reaction we introduced a central CVS repository for the text versions of our maps. This changed our storage paradigm, with CVS becoming the primary reference point holding human-readable versions of maps organized in a clearly visible file system tree. CVS also provided us with an audit trail of revisions and the differences between them. The topic map fridge was reduced to serving only as a fast-lookup cache for the binary representations of maps.

Using on-board means of CVS, the injection of maps could be completely automated: whenever a new version of a map is checked into the CVS repository, a small Perl script run by the CVS server software deposits the map in frozen form in the fridge. All other functionality, such as access control and caching, remained in the web server infrastructure.
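The fridge idea (parse once, store the serialized structure as a BLOB, thaw it at request time) can be illustrated with a minimal sketch. This is not the original Perl implementation: Python's `pickle` stands in for Perl-style freezing and thawing, an in-memory SQLite database stands in for MySQL, and `parse_map`, `Fridge`, `deposit` and `thaw` are hypothetical names. The toy parser and the use of a content digest to detect changes are assumptions made only for this example.

```python
import hashlib
import pickle
import sqlite3


def parse_map(source: str) -> dict:
    """Toy stand-in for the expensive map parser: one 'topic-id: name' pair per line."""
    topics = {}
    for line in source.splitlines():
        if ":" in line:
            topic_id, name = line.split(":", 1)
            topics[topic_id.strip()] = {"name": name.strip()}
    return topics


class Fridge:
    """Cache of serialized ('frozen') maps, keyed by map address."""

    def __init__(self, db: sqlite3.Connection) -> None:
        self.db = db
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS fridge ("
            "address TEXT PRIMARY KEY, digest TEXT NOT NULL, frozen BLOB NOT NULL)"
        )

    def deposit(self, address: str, source: str) -> bool:
        """Parse and freeze a map unless its text is unchanged; True if stored."""
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        row = self.db.execute(
            "SELECT digest FROM fridge WHERE address = ?", (address,)
        ).fetchone()
        if row is not None and row[0] == digest:
            return False  # same text as before, so skip the costly parse
        frozen = pickle.dumps(parse_map(source))
        self.db.execute(
            "INSERT OR REPLACE INTO fridge (address, digest, frozen) VALUES (?, ?, ?)",
            (address, digest, frozen),
        )
        self.db.commit()
        return True

    def thaw(self, address: str) -> dict:
        """Reinstate a frozen map; only safe for data this process family wrote itself."""
        row = self.db.execute(
            "SELECT frozen FROM fridge WHERE address = ?", (address,)
        ).fetchone()
        if row is None:
            raise KeyError(address)
        return pickle.loads(row[0])


fridge = Fridge(sqlite3.connect(":memory:"))
fridge.deposit("/internet/web/", "google: Google\nw3c: W3C")
web_map = fridge.thaw("/internet/web/")
```

In this sketch `deposit` plays the role of the script triggered on check-in, while `thaw` is what a request handler would call, so the parser never runs on the request path.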