Frequently asked questions

What is the Wellcome Arabic Cataloguing and Digitisation Project?

The Wellcome Library, Bibliotheca Alexandrina and King’s College London have formed a partnership to create a free, searchable online catalogue of 500 Islamic manuscripts in the Wellcome Library.

The partners will

  • design a cataloguing system to create and manage descriptive metadata for Asian manuscripts, including adapting the ENRICH TEI metadata schema
  • catalogue the manuscripts using this tool
  • create cover-to-cover, high quality digital photographs of the manuscripts
  • produce a website to enable sophisticated access to the metadata and its associated digital images.

The website will be hosted by the Bibliotheca Alexandrina, and digitised content will also be available via the Wellcome Library catalogue, pending inclusion of the complete catalogue on the Wellcome site when circumstances permit.

The project is partly funded by a grant from the JISC’s Islamic Studies Programme.


What are the project aims?

The aim of this project is to provide remote access to Islamic manuscripts via rich metadata and associated digital images.

The objectives are:

  • to design and implement an open source cataloguing tool storing metadata as TEI P5
  • to create rich descriptive metadata for each manuscript
  • to create a web front-end to be hosted by the BA to deliver metadata and cover-to-cover images
  • to create fully formed MARC21 records for each intellectual work to facilitate resource discovery via the Wellcome Library catalogue
  • to create around 75 000 full-colour images of the manuscripts for online access.

What content will you catalogue and digitise?

The content consists of around 500 Arabic manuscripts dating from the 14th to the 20th century, sourced from the Wellcome Library's holdings. The core of this collection relates to the great heritage of classical medicine, preserved, enlarged and commentated on throughout the Islamic world, stretching from Southern Spain to South and South-east Asia.

Part of the Wellcome collection of Arabic manuscripts comprises most of the Arabic medical manuscript collection of Dr Sami Ibrahim Haddād (1890-1957), the well-known Lebanese physician and historian of medicine (the "Haddad manuscript collection"). The collection includes works by well-known Islamic authors, such as al-Majusi and Ibn Sina, and by Jewish authors who wrote in Arabic (al-Isra'ili).


Who are the target audiences for this collection?

The core audience for the digital resource is Islamic studies specialists, in particular scholars of Arabic medicine, science and history. Conservators, calligraphers, museum workers and others interested in ancient manuscripts as objects will also benefit from greater access to these texts.

As key deliverables of this project, the cataloguing standards and tool will be of great interest to cataloguing professionals working with manuscripts written in Arabic script.


How will the collection be made accessible?

A website will be available to search the catalogue data and view images. All content digitised under this project will be made freely accessible online. Moreover, so that these digital objects can be embedded in teaching materials, presentations and publications, images can be reused and repurposed under a Creative Commons licence. This will enable users anywhere in the world to copy, distribute, display and make derivative works based on these images, provided such uses are fully attributed and non-commercial.

The cataloguing tool will be open source, and therefore freely available to any who want to develop the software for their own use.


Who are the project partners?

The Wellcome Library will be supplying the digital images, providing directorship and project management, cataloguing, ensuring quality of cataloguing work, ensuring long-term preservation of metadata and digital images and making images and descriptive metadata available on the Wellcome Library website. The project director is Dr Richard Aspin, Head of Research and Scholarship. Dr Nikolai Serikoff, Asian Collections Librarian, provides the intellectual and technical expertise.

The Bibliotheca Alexandrina will lead on software and web development, carry out cataloguing, manage and deliver descriptive metadata and digital images, and host the website. Leading the project at the Bibliotheca Alexandrina are Dr Magdy Nagi and Dr Noha Adly, Director and Deputy Director of the ICT Sector, in conjunction with Dr Youssef Ziedan, Head of the Manuscripts Museum.

King’s College London will provide expert advice on the creation of technical specifications, design and implement a suitable XML schema for metadata management, and provide expert input on web delivery. Simon Tanner leads King's involvement, with Gerhard Brey and Elena Pierazzo providing the technical expertise in software development and the TEI schema.

The JISC has provided a large proportion of funding to this project through a grant under the Islamic Studies Programme.


What cataloguing standards will be used?

The project aims to develop an appropriate TEI P5 schema based on the ENRICH metadata standard. This schema will be able to cater for the specific requirements of Arabic manuscript description, and will include all the main elements and attributes required for effective interoperability with ENRICH. TEI documents will be created and managed by the cataloguing tool and will facilitate search and discovery via a dedicated website to be hosted by the Bibliotheca Alexandrina.
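To give a sense of what a TEI P5 manuscript description looks like, the following is a minimal sketch that builds a bare `msDesc` element with Python's standard library. The element names (`msDesc`, `msIdentifier`, `msContents`, `msItem`, `textLang`) come from the TEI P5 Guidelines; the shelfmark, the sample work and the function name `build_ms_desc` are illustrative assumptions, and none of the ENRICH-specific constraints or the project's own Arabic-script extensions are represented.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Serialise TEI elements in the default namespace rather than with a prefix.
ET.register_namespace("", TEI_NS)


def tei(tag: str) -> str:
    """Return the tag name qualified with the TEI namespace."""
    return f"{{{TEI_NS}}}{tag}"


def build_ms_desc(shelfmark: str, author: str, title: str) -> ET.Element:
    """Build a bare-bones TEI P5 <msDesc> for one manuscript (illustrative only)."""
    ms_desc = ET.Element(tei("msDesc"))

    # Where the manuscript is held and how it is identified.
    identifier = ET.SubElement(ms_desc, tei("msIdentifier"))
    ET.SubElement(identifier, tei("settlement")).text = "London"
    ET.SubElement(identifier, tei("repository")).text = "Wellcome Library"
    ET.SubElement(identifier, tei("idno")).text = shelfmark

    # One intellectual work contained in the manuscript.
    contents = ET.SubElement(ms_desc, tei("msContents"))
    item = ET.SubElement(contents, tei("msItem"))
    ET.SubElement(item, tei("author")).text = author
    ET.SubElement(item, tei("title")).text = title
    ET.SubElement(item, tei("textLang"), {"mainLang": "ar"}).text = "Arabic"
    return ms_desc


# Hypothetical shelfmark; the author and title are sample values, not catalogue data.
record = build_ms_desc("MS Arabic 000 (hypothetical)", "Ibn Sina", "al-Qanun fi al-tibb")
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

A real record would sit inside a full TEI document with a `teiHeader`, and would carry far richer physical description and history than this sketch shows.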

The Wellcome Library catalogue, which uses the MARC 21 standard (AACR2 rules), will be enhanced with a subset of the data created during cataloguing.
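As a rough illustration of how a subset of the catalogue data might be carried over to MARC 21, the sketch below renders a few fields as human-readable lines. The field tags used (041 language code, 100 personal name main entry, 245 title statement, 500 general note) are standard MARC 21 bibliographic fields, but the `ManuscriptWork` class, the choice of fields, the indicator values and the sample data are all assumptions made for illustration; the project's actual mapping is not described here.

```python
from dataclasses import dataclass


@dataclass
class ManuscriptWork:
    """A subset of catalogue data for one intellectual work (hypothetical shape)."""
    author: str
    title: str
    language_code: str  # MARC language code, e.g. "ara" for Arabic
    shelfmark: str


def to_marc_lines(work: ManuscriptWork) -> list[str]:
    """Render a work as readable MARC 21 field lines (not the ISO 2709 exchange format).

    "#" stands for a blank indicator; indicator values are illustrative.
    """
    return [
        f"041 0# $a{work.language_code}",
        f"100 0# $a{work.author}",
        f"245 10 $a{work.title}",
        f"500 ## $aShelfmark: {work.shelfmark}",
    ]


# Sample values only; the shelfmark is hypothetical.
work = ManuscriptWork("Ibn Sina", "al-Qanun fi al-tibb", "ara", "MS Arabic 000")
for line in to_marc_lines(work):
    print(line)
```

Because one manuscript can contain several intellectual works, a function like this would be called once per work rather than once per physical volume.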


What are the technical specifications of the cataloguing tool?

The cataloguing project builds on the experience gained by Nikolai Serikoff at the Wellcome Library and Gerhard Brey at the Centre for Computing in the Humanities, King's College London, in creating an online catalogue for a limited collection of Arabic manuscripts (the Haddad collection) based on the TEI P4 and MASTER manuscript description standards. The new cataloguing system will use this intellectual framework and previously developed technology for the Haddad manuscripts as the basis for the new software development and delivery.

There are three elements to this project:

1) the creation of the cataloguing tool and workflows that enable its use

2) creating catalogue records - based on the TEI P5 / ENRICH schema - for the ca. 500 manuscripts identified in this project. Part of this will involve converting the Haddad metadata to this new schema, and the export of MARC21 records for the Wellcome Library catalogue.

3) delivery of metadata and images to users

The Wellcome Library will work closely with King's and the Bibliotheca Alexandrina in developing a cataloguing tool and workflow processes that will enable these manuscripts to be catalogued for efficient resource discovery. The key requirements for the technical development are as follows:

Data repository - this concerns the ingest of existing metadata, and issues surrounding the use of the TEI P5 / ENRICH schema, UNICODE and export of data to other standards. This will be hosted initially by the Bibliotheca Alexandrina, but the Wellcome Library plans to integrate the database into its own digital library in future.

  • creation of TEI P5 / ENRICH based XML schema (1)
  • creation of a conversion programme from TEI P4 MASTER and ArabTeX to the new schema
  • store text as UNICODE
  • identify non-UNICODE characters specific to medieval Arabic manuscripts and manage their input and display
  • investigate problems/issues related to the storage, input and display of bi-directional UNICODE text
  • create export facilities to the Wellcome Library's Encore system (XML) and OPAC (MARC21).
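The Unicode-related requirements above can be explored with Python's standard `unicodedata` module. This is a minimal sketch, not the project's implementation: it normalises text before storage, reports whether a string mixes left-to-right and right-to-left letters, and flags private-use or unassigned code points. Treating private-use code points as a marker for characters that have no standard Unicode encoding is an assumption; the text does not say how such characters will be represented. All function names are hypothetical.

```python
import unicodedata


def bidi_profile(text: str) -> list[tuple[str, str]]:
    """Pair each non-space character with its Unicode bidirectional class."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text if not ch.isspace()]


def is_mixed_direction(text: str) -> bool:
    """True if the text holds both left-to-right and right-to-left letters."""
    classes = {unicodedata.bidirectional(ch) for ch in text}
    has_rtl = bool(classes & {"R", "AL"})  # R = right-to-left, AL = Arabic letter
    has_ltr = "L" in classes
    return has_rtl and has_ltr


def unassigned_or_private(text: str) -> list[str]:
    """List code points that are private-use (Co) or unassigned (Cn)."""
    return [
        f"U+{ord(ch):04X}"
        for ch in text
        if unicodedata.category(ch) in {"Co", "Cn"}
    ]


# "MS 12: " followed by the Arabic letters kaf, ta, alif, ba.
sample = "MS 12: \u0643\u062A\u0627\u0628"
stored = unicodedata.normalize("NFC", sample)  # one canonical form for storage

print(bidi_profile(stored))
print(is_mixed_direction(stored))
# U+E000 is the first code point of the Private Use Area.
print(unassigned_or_private("\uE000" + stored))
```

A check like `is_mixed_direction` only identifies strings that need care; the actual display order is decided by the Unicode Bidirectional Algorithm in the rendering software.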

Input system - this will be the web-based user interface for administration, data input and QA.

  • design and build a web-based interface with template for data entry, and a facility to handle the non-UNICODE characters
  • develop a workflow for data input
  • create a facility to input Arabic characters (including non-standard characters) via a virtual keyboard and/or a transliteration scheme.
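Input via a transliteration scheme can be sketched as a lookup from Latin tokens to Arabic letters. The table below is a small hypothetical scheme invented for this example; it is not ArabTeX or any scheme adopted by the project. The function uses greedy longest-match lookup so that a two-letter token such as "sh" is preferred over the single letter "s".

```python
# Hypothetical transliteration table: Latin tokens -> Arabic letters.
TRANSLIT = {
    "sh": "\u0634",  # shin
    "th": "\u062B",  # tha
    "k": "\u0643",   # kaf
    "t": "\u062A",   # ta
    "b": "\u0628",   # ba
    "s": "\u0633",   # sin
    "m": "\u0645",   # mim
    "A": "\u0627",   # alif
}

MAX_TOKEN = max(len(token) for token in TRANSLIT)


def transliterate(latin: str) -> str:
    """Convert Latin input to Arabic script using greedy longest-match lookup."""
    out = []
    i = 0
    while i < len(latin):
        # Try the longest possible token first, then shorter ones.
        for size in range(MAX_TOKEN, 0, -1):
            token = latin[i:i + size]
            if token in TRANSLIT:
                out.append(TRANSLIT[token])
                i += len(token)
                break
        else:
            # Unknown input is kept unchanged so the cataloguer can spot it.
            out.append(latin[i])
            i += 1
    return "".join(out)


print(transliterate("ktAb"))  # kaf, ta, alif, ba
```

Leaving unknown characters untouched, rather than raising an error, is a design choice suited to data entry: the cataloguer sees immediately which part of the input the scheme did not cover.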

(1) Released on November 1, 2007; see TEI P5 Guidelines; "From MASTER to TEI P5"


What digitisation standards will be used?

The manuscripts are photographed using medium-format cameras to provide high-quality, full-colour images. Cover-to-cover photography includes all the pages, the covers inside and out, spine, fore edge, top and tail. Images will be archived at both the Wellcome Library and the Bibliotheca Alexandrina as JPEG 2000 images.


How were conservation issues addressed?

A condition survey was carried out on the entire Arabic manuscript collection before digitisation began. This survey categorised items by the scale of their conservation requirements, their opening angles (to determine what photography equipment was needed), and other key information required in order to proceed with digitisation.

Ligatus Research Unit was instrumental in developing the condition survey, creating a system on which to record the data, and providing the staff required to check every manuscript. After the survey was completed, a member of the Library's conservation team carried out the necessary repairs, although items needing extensive bench work were deemed outside the scope of the project.

The Library purchased a conservation book cradle system to enable digitisation of manuscripts that are tightly bound and/or have writing that runs into the gutters of the book.


What are the key benefits of the project?

Arabic manuscript cataloguers urgently require a system that allows them to capture the full range of character variation that typifies Asian manuscripts, thus enabling researchers to interrogate the resulting metadata in a comprehensive and flexible manner. What is truly lacking for this activity is a system that adequately reflects the specific features of vernacular scripts which are non-standard. This project will address these issues, in collaboration with experts in the field, and will incorporate the requirements into the metadata schema to be developed.

Once a critical mass of Arabic manuscripts is available online, the use of the resource will facilitate research and collaboration. This has been identified by HEFCE as a key strategic initiative in UK higher education, and complements HEFCE's other initiatives such as the formation of a UK Islamic studies network, and the digitisation of Islamic studies Ph.D. theses.


How will this project be sustained, and will it be further developed in future?

All three partners are committed to providing long-term access and support to their digital content, including digital image preservation, website accessibility and availability, and technical support for software systems. This project will be incorporated into the operational sustainability plans of the Wellcome Library and the Bibliotheca Alexandrina. Images and metadata will be made available under a Creative Commons Attribution Non-commercial licence.

Once developed, the cataloguing tool will be freely available for anyone to obtain and use, together with the documentation required to install and implement it.

Other content providers will be encouraged to re-use this digital content to make this information accessible to non-specialist audiences - in particular HE/FE students in the areas of Islamic history, Arabic language, history of science and medicine, history of the book, and manuscript conservation.

In future, further cataloguing will be carried out to increase the level of chapter heading recording (only a sample of around 120 manuscripts will include this metadata during the project), to include manuscript fragments, and to share data with the Ligatus Research Unit's conservation database as well as other databases to come.


Where can I find further information?

Wellcome Library, 183 Euston Road, London NW1 2BE, UK  tel: +44 (0)20 7611 8722  email: library@wellcome.ac.uk