Lessons Learned and Future Directions

Developing, maintaining and operating the server over an extended period involves at least two axes. On the technology axis, we have detected some performance bottlenecks, but have also gained experience in hosting relatively complex functionality, such as that provided by TM processing software, inside the Apache/mod_perl environment. On the content axis, we had to learn that covering sizeable and volatile areas like the "Internet" or "XML" with topic maps is a rather time-consuming process.

Technology Experiences

To improve the overall performance of the system, we used a variety of techniques:
The TM software packages proved to be fairly reliable but - by themselves - were rather CPU intensive. A more severe problem, though, was the size of the data structures involved. A single map, especially when combined with all rendering information inside a topic map view, easily consumed several MB of main memory. This, and the fact that we used a fair number of third-party Perl packages, resulted in Apache children using up more than 40 MB each. To avoid a shortage of main memory we had to limit the number of concurrent requests. To mitigate memory leakage we also put an upper limit on the size of an Apache child.

Content Experiences

The initial vision of inviting external authors to write for free clearly failed. Some authors understandably would only consider writing as a commercial venture, while more academic candidates preferred to submit the content as a conference paper. Others who might have been inclined eventually built their own site or blog. As a consequence we had to feed the server with content on our own, which - also with the help of research students - proved to be fairly workable.

Map Content

The more we came to share maps produced by multiple authors, the more apparent it became that every author has their own idiosyncrasies with regard to knowledge representation. This has led to very different authoring styles: some authors very systematically use rather generic types for topics and associations, such as machine or process, even though they were not using one of the available top-level ontologies. Not surprisingly, others took a much more ad hoc approach. While it is interesting to use this in a learning process, higher quality of map content can only be achieved by using baselining ontologies. These would force authors to use a specific vocabulary and a consistent set of association structures. We also realised that categorising and arranging our maps into some globally valid hierarchical structure is bound to fail.
First, there is no single all-encompassing map of The Universe. Second, when we tried to factor this vast domain into smaller elements, vastly different organisations of maps resulted. Paradoxically enough, this resulting "patchwork" structure is exactly what is necessary for efficient joint authoring.

Map Views

Our experience with presenting content via views has been positive throughout, apart from minor annoyances such as the inability to split larger topics into several consecutive slides or to embed images. The big promise, though, free sharing of information, has not yet been entirely fulfilled. It is mainly the differences between authors in abstraction level and in the amount of textual content that have made maps less reusable than they could be. This is definitely an axis on which further development and self-discipline have to be exercised.

Future Plans

One of the bigger annoyances on the current site is the user interface. On the one hand it is complex enough to regularly confuse casual users; on the other hand it is not powerful enough for serious ontology applications and customization. Accordingly, we will completely redesign it, adding more functionality such as fulltext querying, stylesheet selection and facilities for applying map constraints and queries. To simplify the interface, though, all these features will be organized into function bars which are mostly hidden by default.

The most important call for action, though, is for upgrading the TM software itself. In this process we will deploy a new dedicated TM datastore. Not only can it serve complete topic maps to clients, but it is also capable of processing Topic Map query language expressions. This is expected to significantly cut down the complexity of our Mason components. The server is also designed in such a way that topic map data can be distributed over several machines, so that - theoretically - query expressions can be factorized and spread over several nodes in a topic map cluster environment.
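The idea of spreading a query over several nodes can be illustrated with a minimal scatter-gather sketch. It is written in Python purely for illustration and makes no claim about how the TM datastore actually works: the `Node` class, the `scatter_gather` function and the sample topics are all hypothetical. Each node holds one partition of the topics, the same filter is evaluated on every node, and the partial results are merged.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """Hypothetical cluster node holding one partition of the topics."""
    name: str
    topics: Dict[str, str] = field(default_factory=dict)  # topic id -> topic type

    def query(self, predicate: Callable[[str, str], bool]) -> List[str]:
        # Evaluate the sub-query locally, on this node's partition only.
        return [tid for tid, ttype in self.topics.items() if predicate(tid, ttype)]


def scatter_gather(nodes: List[Node], predicate: Callable[[str, str], bool]) -> List[str]:
    """Send the same predicate to every node and merge the partial results."""
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        partials = list(pool.map(lambda node: node.query(predicate), nodes))
    # Sorting makes the merged result independent of node order.
    return sorted(tid for part in partials for tid in part)


# Invented sample data: two nodes, each holding part of the map.
nodes = [
    Node("node-a", {"apache": "server", "perl": "language"}),
    Node("node-b", {"mason": "framework", "bind": "server"}),
]
servers = scatter_gather(nodes, lambda tid, ttype: ttype == "server")
print(servers)
```

A filter on a single topic property partitions trivially. Queries that follow associations between topics stored on different nodes would need additional coordination, which this sketch does not attempt.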
The TM server's capability to process more complex queries will not only simplify the components related to the user interface, but will also allow us to separate topic maps from their views more cleanly. Views can actually be formulated completely as a single (alas, longish) query statement which returns an XML structure containing all the information necessary to render every topic in the view. This new server version will also be able to host not only topic maps, but also ontologies and queries in its database. Together with a new driver infrastructure this will allow us to implement virtual topic maps [BaVirtualMaps]. With these we can wrap a topicmappish access layer around existing resources. A DNS server or, say, a relational database can thus be treated as if it were a topic map.
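To make the notion of a virtual topic map more concrete, the following sketch wraps a relational table so that its rows can be iterated as if they were topics. It is only an illustration in Python under stated assumptions, not the driver infrastructure described above: the `Topic` and `VirtualTopicMap` classes, the `hosts` table and its columns are hypothetical, and a real topic map would also carry associations and occurrences.

```python
import sqlite3
from typing import Iterator, NamedTuple


class Topic(NamedTuple):
    """Simplified stand-in for a topic: an id, a type and a base name."""
    topic_id: str
    topic_type: str
    name: str


class VirtualTopicMap:
    """Hypothetical driver presenting rows of a relational table as topics."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def topics(self) -> Iterator[Topic]:
        # Nothing is converted up front; each row becomes a topic on demand.
        rows = self.conn.execute(
            "SELECT hostname, address FROM hosts ORDER BY hostname"
        )
        for hostname, address in rows:
            yield Topic(f"host-{hostname}", "host", f"{hostname} ({address})")


# Invented sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hosts (hostname TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO hosts VALUES (?, ?)",
    [("www", "192.0.2.10"), ("mail", "192.0.2.25")],
)

tm = VirtualTopicMap(conn)
names = [topic.name for topic in tm.topics()]
print(names)
```

The client code only sees `Topic` objects, so the same interface could in principle sit in front of a DNS zone or any other resource for which a driver exists.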