Handling: Recommendations for Publishers
SuperJournal Conference, Birkbeck College, London, 21 April 1999
Ann Apps, Manchester Computing, University of Manchester
Recommendations on Journal Data
SuperJournal journal data production covered the whole process from receipt of the data to
data load. It involved writing the data conversion and data display software and setting
up the production process, as well as the actual handling of the data. The journal data
was generally supplied as full articles with accompanying article metadata.
Article metadata was supplied to SuperJournal in SGML. Using SGML-aware software, the
metadata was processed not just for its display but also to derive tables of contents at
all levels in the journal/issue hierarchy, again held in SGML, which effectively
catalogued the journal data held within SuperJournal. The choice of DTD was left to each
publisher. From analysis of those used in SuperJournal we would recommend:
- A DTD should be simple, with structuring kept to a minimum.
- The metadata should be self-identifying, including: journal; publisher; ISSN;
copyright; volume; issue; cover date; article title; article start and end pages.
- The metadata should include an abstract of the article.
- The metadata should have sufficient granularity to support the required functionality
of an electronic journals application, e.g. author names split into constituent parts.
- SGML should describe the logical structure of the metadata or document.
- Data handling processes should be scalable, with minimum manual intervention.
- Quality Control. To end users the significant part of an electronic journal
application is the data itself: they want to read the articles. In many cases
problems can be identified by data conversion software, the most obvious quality check
being to parse SGML data against the DTD.
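
The kind of automated check that data conversion software can make might be sketched as
follows. This is a minimal illustration only, not the SuperJournal software: it assumes
the metadata is XML-compliant so that Python's standard library can read it, and the
element names (coverdate, firstpage and so on) are hypothetical, since each publisher's
DTD would use its own. It checks completeness of the self-identifying fields; it does not
replace parsing against the DTD with a validating parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names for the self-identifying fields listed above;
# a real publisher DTD would define its own names.
REQUIRED_FIELDS = [
    "journal", "publisher", "issn", "copyright", "volume", "issue",
    "coverdate", "title", "firstpage", "lastpage", "abstract",
]


def missing_fields(record_xml: str) -> list[str]:
    """Return the required fields that are absent or empty in one record."""
    root = ET.fromstring(record_xml)
    missing = []
    for name in REQUIRED_FIELDS:
        element = root.find(name)  # direct children of the record only
        if element is None or not (element.text or "").strip():
            missing.append(name)
    return missing


# Illustrative record: the end page is empty and there is no abstract.
SAMPLE = """<article>
  <journal>Example Journal</journal>
  <publisher>Example Press</publisher>
  <issn>0000-0000</issn>
  <copyright>1999 Example Press</copyright>
  <volume>12</volume>
  <issue>3</issue>
  <coverdate>1999-03</coverdate>
  <title>An Example Article</title>
  <author><forename>Ann</forename><surname>Example</surname></author>
  <firstpage>101</firstpage>
  <lastpage></lastpage>
</article>"""

if __name__ == "__main__":
    print(missing_fields(SAMPLE))
```

Run over every record at receipt, a check of this sort flags incomplete metadata without
manual inspection, which is what keeps the process scalable.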
Recommendations on Data Display
Data display included enhanced data functionality within the SuperJournal application.
Most journals were supplied to SuperJournal with full articles in PDF format. Some also
included SGML full articles, which were converted to HTML.
- Holding articles as both PDF and SGML seems to be the right strategy.
- PDF is the best format for printing, whereas HTML provides an easier format
- SGML provides the flexibility to offer enhanced functionality such as linking
from citation references. It is quite difficult to isolate elements of a PDF article.
- Where possible it is a good idea to generate HTML for end-user display dynamically,
to obviate later re-generation caused by changes, e.g. to embedded URLs.
- The SGML should be tagged to support the required functionality, e.g.
tagging internal links and tagging bibliographic references to a fine grain level.
- DTDs should be XML compliant to allow direct display by future Web browsers.
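
As an illustration of dynamic generation from finely tagged data, the sketch below
renders one bibliographic reference to HTML at display time. It is an assumption-laden
example rather than the SuperJournal implementation: the element names, the flat record
layout, the function reference_to_html and the resolver address are all hypothetical, and
the input is assumed to be XML-compliant. Because the link is built when the page is
requested, a change to the resolver URL needs no re-generation of stored HTML.

```python
import xml.etree.ElementTree as ET
from html import escape
from urllib.parse import urlencode

# Hypothetical link resolver; held in one place so that it can change
# without touching the stored articles.
RESOLVER_BASE = "https://resolver.example.org/lookup"


def reference_to_html(ref_xml: str, resolver_base: str = RESOLVER_BASE) -> str:
    """Render one tagged bibliographic reference as an HTML list item."""
    ref = ET.fromstring(ref_xml)

    def field(tag: str) -> str:
        # Missing elements are rendered as empty strings.
        return (ref.findtext(tag) or "").strip()

    # Fine-grained tagging lets the link be built from separate fields.
    query = urlencode({
        "issn": field("issn"),
        "volume": field("volume"),
        "spage": field("firstpage"),
    })
    href = escape(f"{resolver_base}?{query}", quote=True)
    return (
        f'<li id="{escape(ref.get("id", ""), quote=True)}">'
        f'{escape(field("surname"))}, {escape(field("forename"))} '
        f'({escape(field("year"))}). '
        f'<a href="{href}">{escape(field("title"))}</a>. '
        f'<i>{escape(field("journal"))}</i> {escape(field("volume"))}, '
        f'{escape(field("firstpage"))}.</li>'
    )


# Illustrative reference record with hypothetical element names.
SAMPLE_REF = """<ref id="b1">
  <surname>Example</surname><forename>A.</forename>
  <year>1998</year>
  <title>Tagging &amp; linking</title>
  <journal>Example Journal</journal>
  <issn>0000-0000</issn>
  <volume>12</volume>
  <firstpage>101</firstpage>
</ref>"""

if __name__ == "__main__":
    print(reference_to_html(SAMPLE_REF))
```

If the author name or the page numbers were held only as part of an untagged string, the
link could not be constructed reliably, which is the argument for fine-grained tagging.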
Recommendations on Usage Data
Usage data in the log files was processed on a monthly basis before the resulting
extensive usage statistics were passed on to HUSAT and included on the SuperJournal Web
site. The SuperJournal logging requirements were far more extensive than for a usual
electronic journal application because of the nature of the project, but the processing
required should not be underestimated.
- Specify what statistics and logging are required. Usage cannot be logged
- Search engines use their own proprietary formats for logging and may not provide the
required logging functionality.
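
To give a feel for the monthly processing involved, the following sketch aggregates log
records into counts per month, journal and action. The log layout is an assumption made
for illustration (tab-separated timestamp, user identifier, journal and action); it is
not the SuperJournal log format, and the function and action names are hypothetical.
Real logs, particularly those written by search engines in their own formats, would need
a conversion step before this point.

```python
from collections import Counter
from datetime import datetime


def monthly_counts(lines):
    """Count log events keyed by (month, journal, action).

    Each line is assumed to hold four tab-separated fields:
    ISO-style timestamp, user identifier, journal name, action.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        timestamp, _user, journal, action = line.split("\t")
        month = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S").strftime("%Y-%m")
        counts[(month, journal, action)] += 1
    return counts


# Illustrative records only; the action names are hypothetical.
SAMPLE_LOG = [
    "1999-03-01T09:15:00\tu001\tExample Journal\tview_abstract",
    "1999-03-02T10:00:00\tu002\tExample Journal\tview_pdf",
    "1999-03-15T14:30:00\tu001\tExample Journal\tview_pdf",
    "1999-04-01T08:45:00\tu001\tAnother Journal\tview_abstract",
]

if __name__ == "__main__":
    for key, total in sorted(monthly_counts(SAMPLE_LOG).items()):
        print(key, total)
```

Only events that were written to the log can be counted in this way, so the statistics
required have to be decided before logging begins.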
User feedback data was sent to the SuperJournal email address.
- A user support and feedback facility requires someone to answer the
questions. If passwords are used, a reminder service is needed.
- Users expect electronic journal issues to be timely, concurrent with, or
before, print versions.
- Users would like backfiles. Publishers should give this serious consideration
as journals become increasingly electronic.
- Data handling staff requirements should not be underestimated. In particular, quality
control is very important. Ideally it should be performed by someone with this
specific role within the publisher's organisation rather than relying on typesetters.
- Data handling and production processes must be capable of evolution.
Last modified: May 14, 1999