An Arabic cataloguing system
The cataloguing project builds on the experience gained by Nikolai Serikoff at the Wellcome Library and Gerhard Brey at the Centre for Computing in the Humanities, King’s College London, in creating an online catalogue for a limited collection of Arabic manuscripts (the Haddad collection) based on the TEI P4 and MASTER manuscript description standards. The new cataloguing system will use this intellectual framework, and the technology previously developed for the Haddad manuscripts, as the basis for new software development and delivery.
There are three elements to this project:
1) creation of the cataloguing tool and the workflows that enable its use
2) creation of catalogue records, based on the TEI P5 / ENRICH schema, for the ca. 500 manuscripts identified in this project. Part of this will involve converting the Haddad metadata to the new schema and exporting MARC21 records for the Wellcome Library catalogue.
3) delivery of metadata and images to users
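As a rough illustration of the export step in element 2, the sketch below reads a minimal TEI P5 manuscript description and collects a few values under MARC21 field tags. The sample record, the helper names and the choice of fields (100 for author, 245 for title, 852 for location) are assumptions made for this example; the project's actual crosswalk is not specified here.

```python
import xml.etree.ElementTree as ET

# TEI P5 elements live in this namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented sample record, for illustration only.
SAMPLE = """<msDesc xmlns="http://www.tei-c.org/ns/1.0">
  <msIdentifier>
    <settlement>London</settlement>
    <repository>Wellcome Library</repository>
    <idno>MS Example 1</idno>
  </msIdentifier>
  <msContents>
    <msItem>
      <author>Example Author</author>
      <title>Example Title</title>
    </msItem>
  </msContents>
</msDesc>"""


def text_of(root, path):
    """Return the stripped text of the first match, or '' if absent."""
    node = root.find(path, TEI_NS)
    return (node.text or "").strip() if node is not None else ""


def msdesc_to_marc_fields(xml_text):
    """Map a few msDesc values to MARC21 tags (hypothetical crosswalk)."""
    root = ET.fromstring(xml_text)
    repository = text_of(root, "tei:msIdentifier/tei:repository")
    idno = text_of(root, "tei:msIdentifier/tei:idno")
    return {
        "100": text_of(root, "tei:msContents/tei:msItem/tei:author"),
        "245": text_of(root, "tei:msContents/tei:msItem/tei:title"),
        "852": ", ".join(part for part in (repository, idno) if part),
    }


fields = msdesc_to_marc_fields(SAMPLE)
```

A real export would also need MARC indicators, subfield codes and a proper record serialisation, which this sketch leaves out.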
The Wellcome Library will work closely with King’s College London and the Bibliotheca Alexandrina in developing a cataloguing tool and workflow processes that will enable these manuscripts to be catalogued for efficient resource discovery. The key requirements for the technical development are as follows:
Data repository - this concerns the ingest of existing metadata, and issues surrounding the use of the TEI P5 / ENRICH schema, Unicode and the export of data to other standards. The repository will be hosted initially by the Bibliotheca Alexandrina, but the Wellcome Library plans to integrate the database into its own digital library in future.
- creation of TEI P5 / ENRICH based XML schema (1)
- creation of a conversion programme from TEI P4 MASTER and ArabTeX to the new schema
- store text as Unicode
- identify non-Unicode characters specific to medieval Arabic manuscripts and manage their input and display
- investigate problems/issues related to the storage, input and display of bi-directional Unicode text
- create export facilities to the Wellcome Library’s Encore system (XML) and OPAC (MARC21).
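The Unicode storage and bi-directional text requirements above can be explored with a few checks from Python's standard library. This is a minimal sketch, not the project's method: the function names are invented for the example, normalising to NFC before storage is one possible policy, and treating Private Use Area code points as stand-ins for characters missing from Unicode is purely an assumption.

```python
import unicodedata

# Bidirectional classes for strong right-to-left and left-to-right characters.
RTL_CLASSES = {"R", "AL"}  # "AL" is the class of Arabic letters
LTR_CLASSES = {"L"}


def normalize_for_storage(text):
    """Return the NFC form so equivalent sequences are stored identically."""
    return unicodedata.normalize("NFC", text)


def directions(text):
    """Return the set of strong directions ('rtl', 'ltr') found in text."""
    found = set()
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in RTL_CLASSES:
            found.add("rtl")
        elif bidi in LTR_CLASSES:
            found.add("ltr")
    return found


def is_mixed_direction(text):
    """True when a string mixes strong RTL and strong LTR characters."""
    return directions(text) == {"ltr", "rtl"}


def private_use_chars(text):
    """List Private Use Area characters (assumed placeholders, see above)."""
    return [ch for ch in text if unicodedata.category(ch) == "Co"]
```

Fields flagged by `is_mixed_direction`, such as an Arabic title followed by a Latin shelfmark, are the ones most likely to need explicit direction handling when displayed.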
Input system - this will be the web-based user interface for administration, data input and QA.
- design and build a web-based interface with template for data entry, and a facility to handle the non-Unicode characters
- develop a workflow for data input
- create a facility to input Arabic characters (including non-standard characters) via a virtual keyboard and/or a transliteration scheme.
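Input via a transliteration scheme could work along the following lines. The sketch converts Latin keystrokes to Arabic letters with a greedy longest-match lookup, so that a digraph such as "sh" is tried before the single letters "s" and "h". The mapping table is a small hypothetical subset invented for this example; it is not the scheme used by the project or by ArabTeX.

```python
# Hypothetical transliteration table (illustrative subset only).
TRANSLIT = {
    "sh": "\u0634",  # sheen
    "th": "\u062B",  # theh
    "kh": "\u062E",  # khah
    "a": "\u0627",   # alef
    "b": "\u0628",   # beh
    "t": "\u062A",   # teh
    "k": "\u0643",   # kaf
    "s": "\u0633",   # seen
    "h": "\u0647",   # heh
}
MAX_KEY = max(len(key) for key in TRANSLIT)


def transliterate(latin):
    """Convert Latin input to Arabic, trying the longest keys first."""
    out = []
    i = 0
    while i < len(latin):
        for size in range(MAX_KEY, 0, -1):
            chunk = latin[i:i + size]
            if chunk in TRANSLIT:
                out.append(TRANSLIT[chunk])
                i += len(chunk)
                break
        else:
            # No mapping found: pass the character through unchanged.
            out.append(latin[i])
            i += 1
    return "".join(out)
```

Passing unmapped characters through unchanged keeps digits and punctuation intact, and it leaves room for a separate mechanism to handle the non-standard characters.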
(1) Released on November 1, 2007; see TEI P5 Guidelines; “From MASTER to TEI P5”