The SuperJournal Application

SuperJournal Conference, Birkbeck College, London, 21 April 1999

Ross MacIntyre, Manchester Computing, University of Manchester
Member of SuperJournal Planning Committee


The initial approach within the application centred on flexibility rather than scalability. Early in the project it was not known how to balance the two, and judgements were based on incomplete knowledge. Every design had to assume that multiple formats might have to be handled. An early survey of the publishers found them dealing with SGML, PDF, HTML, PostScript, RealPage, LaTeX, GIF, TIFF and JPEG, plus some experimentation with MPEG, audio CD and a variety of chemical formats. Rationalisation therefore occurred alongside the project rather than in advance.

What would I do the same and what differently? I'll focus on three areas: one we got right from the outset, one we got wrong, and one we got right by accident.

1. Data Design

The use of object-oriented design with SGML was correct. The focus should be on the document content, i.e. what objects it contains and the relationships between them, e.g. authors and affiliations, bibliographic references, figures and their captions.
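As a rough illustration of what "objects and the relationships between them" can mean for an article, the sketch below models authors, affiliations, references and figures as plain Python dataclasses. All class and field names here are hypothetical, chosen for the example; they are not taken from the SuperJournal design or from the SJ-DTD.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Affiliation:
    name: str


@dataclass
class Author:
    surname: str
    initials: str
    # An author may hold several affiliations, and several authors may share one.
    affiliations: List[Affiliation] = field(default_factory=list)


@dataclass
class Figure:
    label: str
    caption: str  # the caption travels with its figure


@dataclass
class Reference:
    authors: List[str]
    title: str
    journal: str
    year: int


@dataclass
class Article:
    title: str
    authors: List[Author] = field(default_factory=list)
    references: List[Reference] = field(default_factory=list)
    figures: List[Figure] = field(default_factory=list)

    def affiliations(self) -> List[str]:
        """Return distinct affiliation names in first-seen order."""
        seen: List[str] = []
        for author in self.authors:
            for aff in author.affiliations:
                if aff.name not in seen:
                    seen.append(aff.name)
        return seen
```

The point of the sketch is only that the relationships (author to affiliation, figure to caption) are explicit in the model rather than implied by page layout.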

Keeping things in SGML also brought rewards. The ability to parse for correctness was retained throughout, and the data itself remained "readable", both in its content and in its structure, which could be communicated in the same language as had been used to create the document.

Since the SGML was converted to a common, intermediate format (SJ-DTD), scripts could be written and applied to existing files en bloc; on-the-fly conversion to HTML was also easier to modify.
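A minimal sketch of why table-driven, on-the-fly conversion is easy to modify is given below. It makes two loud assumptions: the input is a well-formed XML fragment standing in for SGML (so that Python's standard `xml.etree.ElementTree` can parse it), and the element names (`article`, `atl`, `au`, `it`) are invented for the example and are not the real SJ-DTD vocabulary.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical mapping from intermediate-format elements to HTML elements.
# Changing how articles are displayed means editing this table only.
TAG_MAP = {
    "article": "div",
    "atl": "h1",   # article title
    "au": "p",     # author line
    "p": "p",
    "it": "em",    # italic
}


def to_html(elem):
    """Recursively render one element and its children as an HTML string."""
    tag = TAG_MAP.get(elem.tag, "span")  # unknown elements fall back to <span>
    parts = [f"<{tag}>", escape(elem.text or "")]
    for child in elem:
        parts.append(to_html(child))
        parts.append(escape(child.tail or ""))  # text that follows the child
    parts.append(f"</{tag}>")
    return "".join(parts)


SAMPLE = (
    "<article><atl>On Links</atl><au>A. Writer</au>"
    "<p>Text with <it>emphasis</it> here.</p></article>"
)

html_out = to_html(ET.fromstring(SAMPLE))
print(html_out)
```

Because every stored file shares one intermediate structure, the same function (or a batch script built on it) can be applied to the whole collection at once.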

2. Data Storage

The project reaffirmed an old maxim: when considering whether or how a particular product will be used, also consider how it would subsequently be removed. For each component within a development strategy, there should be a corresponding exit strategy. In the final system, adequate flexibility and performance were achieved by working with the source data and using the database software for session management, logging, and overall application control.

An objectbase was overkill for storing the data. Little practical benefit was realised from having SGML components separated out and stored in objects at such a level of detail. In fact, once the user base became established, the reassembly and display functions carried a significant processing overhead. That is not to say that the database design itself was wrong; it just bears out something Yourdon, the system design methodologist, said: "Real world constraints always bastardise the most elegant designs."

3. Linking

The bibliographic section of an article could be viewed as an additional type of metadata which could be made available as "header" data.

A link from a bibliographic reference could, in turn, lead to a page that was produced dynamically, or which offered a choice of external A&I databases, etc.

People did use the function to download references in bibliographic format, and I would offer further bibliographic formats, e.g. Papyrus, ProCite and EndNote, as well as the plain tags.
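To make the idea of a downloadable tagged reference concrete, here is a small sketch that writes one reference in RIS, a tagged text format that reference managers commonly import. RIS is my choice for illustration only; the source does not say which tag scheme SuperJournal produced. The function name, the dictionary keys and the sample reference are all invented placeholders.

```python
def to_ris(ref):
    """Serialise a journal-article reference (a dict) as an RIS record."""
    lines = ["TY  - JOUR"]               # record type: journal article
    for author in ref["authors"]:
        lines.append(f"AU  - {author}")  # one AU line per author
    lines.append(f"TI  - {ref['title']}")
    lines.append(f"JO  - {ref['journal']}")
    lines.append(f"PY  - {ref['year']}")
    lines.append("ER  - ")               # end-of-record marker
    return "\n".join(lines)


# Placeholder data, not a real citation.
sample_ref = {
    "authors": ["Example, A.", "Sample, B."],
    "title": "A placeholder title",
    "journal": "Journal of Examples",
    "year": 1998,
}

ris_text = to_ris(sample_ref)
print(ris_text)
```

Supporting a further format would then mean adding another small serialiser alongside `to_ris`, with the reference data itself left untouched.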

Last modified: May 14, 1999