ArchTracker, Copyright © Malgosia Askanas, 2000-2005

 

ArchTracker is a set of CGI scripts for performing on-demand Web display of email archives. It can be used on any email archive consisting of mbox files - zipped or non-zipped - organized into a directory hierarchy rooted in a single top directory (the Archive Root). Whether the archive grows dynamically or undergoes modification or restructuring, ArchTracker always lets its users view the archive directories, files and individual messages as they are at the moment.

Most systems for displaying email archives (usually, archives of email lists) over the Web require the creation of a static "shadow archive" - with its own hierarchy, dictated a priori by web-page design considerations - whose bottom nodes are HTML versions of the individual messages. As each message arrives, it is automatically stored in an mbox file inside some "raw archive" hierarchy; and it is also translated, either "on the fly" or at some later time, into HTML. This "HTML shadow" of the message is then deposited in the appropriate place within the static HTML "shadow archive", and it is this that is on display on the given Web site. The "raw archive" is treated as an auxiliary structure which may or may not be kept, and may or may not be made available for download.

In contrast, ArchTracker makes the "raw archive" the direct object of display. Its underlying philosophy assumes that there is a unity of purpose between intelligent "raw archive" design, conceived as the domain of the archivist, for the internal use of those with direct access to the "raw archives", and intelligent web display, conceived as the domain of the web-designer for the use of "external" visitors to the website. Consequently, there is no need for a dual form of the archive - one for the "raw messages", and another for their HTML shadows. When Web visitors access the archive, they should be accessing precisely the same archive into which the "raw" messages are deposited and which is the domain of the archivist. Web access to the archive should be treated as a particular way of viewing the objects in the "raw archive" - just as using an email client on an mbox provides a particular "view" - whose exact "look and feel" is specific to the client itself - of the mbox and the messages it contains. In fact, one might say that ArchTracker is precisely an email client for direct access to (hierarchical) mbox archives.

The way the software works is very simple. It can be "trained" on - i.e. given as an input parameter - a directory path, a filename, or a segment of an mbox file representing an individual message. If it is trained on a directory path, it displays a listing of the given directory, in which each element is a link that, in turn, invokes ArchTracker with the path of that element. If it is trained on a (zipped or non-zipped) non-mbox file, it attempts to display the contents of the file as text. If it is trained on a (zipped or non-zipped) mbox file, it displays a Table of Contents for the file, with each item being a link that, in turn, invokes ArchTracker on the particular message that the item represents. When trained on a particular message, it displays its contents, using (minimal) special formatting to distinguish the header from the body.
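The dispatch described above can be illustrated with a short Python sketch. This is not ArchTracker's own code. The function names (`view`, `read_bytes`, `split_mbox`, `is_mbox`) are hypothetical, "zipped" is assumed to mean gzip, an mbox is recognized simply by a leading "From " line, and a message is addressed by its position in the file rather than by whatever segment notation ArchTracker actually uses. The sketch returns plain data instead of HTML, so that the four cases stay visible.

```python
import gzip
import os
from email import message_from_bytes


def read_bytes(path):
    """Return a file's contents, decompressing it if it is gzip (assumed format)."""
    with open(path, "rb") as f:
        magic = f.read(2)
    opener = gzip.open if magic == b"\x1f\x8b" else open
    with opener(path, "rb") as f:
        return f.read()


def is_mbox(data):
    """Assumption: a file is an mbox if it begins with a 'From ' separator line."""
    return data.startswith(b"From ")


def split_mbox(data):
    """Split mbox bytes into messages at lines starting with 'From '.

    Simplification: assumes body lines that begin with 'From ' were
    escaped (for example as '>From ') when the mbox was written.
    """
    messages, current = [], None
    for line in data.splitlines(keepends=True):
        if line.startswith(b"From "):
            if current is not None:
                messages.append(b"".join(current))
            current = []  # the separator line itself is dropped
        elif current is not None:
            current.append(line)
    if current is not None:
        messages.append(b"".join(current))
    return messages


def view(root, rel_path="", msg_index=None):
    """Return (kind, payload) for a directory, a file, or a single message."""
    root_real = os.path.realpath(root)
    full = os.path.realpath(os.path.join(root_real, rel_path))
    # Refuse paths that resolve outside the Archive Root.
    if os.path.commonpath([root_real, full]) != root_real:
        raise ValueError("path escapes the archive root")

    if os.path.isdir(full):
        # Directory: one entry per element; a CGI front end would link each.
        return "listing", sorted(os.listdir(full))

    data = read_bytes(full)
    if not is_mbox(data):
        # Non-mbox file: try to show it as text.
        return "text", data.decode("utf-8", errors="replace")

    messages = split_mbox(data)
    if msg_index is None:
        # mbox file: Table of Contents, one item per message.
        toc = [
            (i, message_from_bytes(m).get("Subject", "(no subject)"))
            for i, m in enumerate(messages)
        ]
        return "toc", toc

    # Single message: header and body are kept apart for separate formatting.
    msg = message_from_bytes(messages[msg_index])
    return "message", (dict(msg.items()), msg.get_payload())
```

Because every call reads the archive as it is at that moment, nothing is cached or pre-generated: `view(root, "2001/jan")` on a hypothetical monthly file gives its Table of Contents, and `view(root, "2001/jan", 1)` gives the second message in it.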

The concepts governing ArchTracker evolved to match the realities of the "living" archives of the philosophy and theory lists run by the Spoon Collective and hosted at the Institute for Advanced Technology in the Humanities at the University of Virginia. These archives developed over a period of 10 years. In the beginning, they were maintained by hand, irregularly, with somewhat haphazard file names, and with gaps and duplicates. Later on, with the arrival of majordomo's capability for automatic division of its archive into monthly files, they took the form of uniformly-named monthly files, but the organization into yearly directories remained haphazard. Also, many segments of the archive underwent occasional zipping and unzipping, depending on the shortage or abundance of disk space on the listserver machine. Not only did the always-impending disk-space shortage make it infeasible to maintain a "shadow archive" in HTML, but the inappropriateness of a historically fixed a-priori vision of the organization of the archive made the "static HTML" philosophy a bad match for the living realities of the maintenance of these lists. ArchTracker was conceived and created by Malgosia Askanas to fit those realities. It has since turned out to be a useful display tool in a number of other environments.

 
