Project overview

The need for this project

Ingesting materials into a managed repository environment is a pre-requisite for archival curation in the digital era. While managed repository environments exist, there is a shortfall in the ingest tools available to submit materials to them. The quality of the OAIS-based Archival Information Packages (AIPs) submitted to a preservation repository will have a significant impact on the 'preservability' of a collection and its future research potential. Both the Personal Archives Accessible in Digital Media (Paradigm) and Digital Curation in Action (DCIA) projects have found the immaturity of the existing tools needed for the ingest workflow to be the principal barrier to populating repositories. Research libraries such as those at Oxford and Manchester, and the Wellcome Library, require practical tools for ingesting digital materials into repositories to fulfil their missions of curating and providing access to archival and manuscript collections, which are increasingly created and offered in digital forms.

Problems with existing ingest mechanisms

Current technology for ingesting born-digital materials is fragmented: it consists of a series of stand-alone tools, many of which are overly technical, poorly documented and not integrated into practical workflows. This makes it difficult for institutions and individuals that have archival or library skills, but lack skilled software engineers, to engage adequately with, and collect, born-digital materials.

For adequate digital curation, institutions require tools that automatically create a METS Archival Information Package, which wraps together all the metadata needed to preserve each object. At present, knowledge and understanding of the Metadata Encoding and Transmission Standard (METS) amongst curatorial staff is still developing, and hand-crafting XML-based METS files is onerous. Archivists need tools that simplify the creation of appropriate METS files.

Existing ingest tools do not satisfy the requirements of a repository aiming to acquire and preserve personal digital archives of enduring historical value because:

  • Many tools have been developed for repositories whose primary function is 'access now', and they collect descriptive rather than preservation metadata; digital archives must be ingested into a restricted preservation repository until relevant legal, privacy and intellectual property rights requirements are satisfied
  • Born-digital archival materials consist of a variety of objects (email, web sites, documents, etc.) that must be ingested together, maintaining contextual information in the form of complex inter-object relationships while respecting the different preservation requirements of each object; existing solutions are geared towards collections composed of many of the same, or similar, kinds of object, which do not have complex relationship requirements

Ingest for complex digital collections is therefore prohibitively expensive at present.

An integrated solution

The project partners believe that the solution to the ingest problem is to bring together existing tools into a documented, automated, integrated workflow that produces repository-independent metadata packages, in the form of METS files, which will provide the basis for long-term life-cycle management. This will address the problems outlined above by:

  • Combining the functionality of existing tools in one user-friendly interface and workflow, which is well documented from the user and technical perspectives
  • Developing the tool to automatically generate METS files for archival storage, so that curators need not be familiar with the finer details of this complex standard
  • Developing preservation metadata profiles for the tool, which will ensure that important research materials can be preserved and used in the long term
  • Respecting the importance of context and the preservation needs of different kinds of content
  • Driving down costs by producing a more efficient ingest process.
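
As an illustration only of what automatic METS generation might involve, the sketch below builds a skeletal METS file for a handful of ingested objects using Python's standard library. It is not the project's tool: the function name build_mets, the FILE-0001 style identifiers and the example paths are all hypothetical, and a real Archival Information Package would also need descriptive and preservation metadata sections, which are omitted here.

```python
import hashlib
import xml.etree.ElementTree as ET

# Namespace URIs used by METS and by the XLink attributes on FLocat.
METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"

ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def build_mets(objects):
    """Return a skeletal METS document as a string.

    `objects` is an iterable of (relative_path, content_bytes) pairs.
    Hypothetical helper: only a file inventory and a flat structural
    map are produced; no dmdSec or amdSec is generated.
    """
    mets = ET.Element(f"{{{METS_NS}}}mets")

    # File inventory: one <file> per object, with fixity information.
    file_sec = ET.SubElement(mets, f"{{{METS_NS}}}fileSec")
    file_grp = ET.SubElement(file_sec, f"{{{METS_NS}}}fileGrp", USE="original")

    # Structural map: records how the objects belong together.
    struct_map = ET.SubElement(mets, f"{{{METS_NS}}}structMap", TYPE="physical")
    root_div = ET.SubElement(struct_map, f"{{{METS_NS}}}div", TYPE="collection")

    for index, (path, content) in enumerate(objects, start=1):
        file_id = f"FILE-{index:04d}"  # assumed identifier scheme
        file_el = ET.SubElement(
            file_grp,
            f"{{{METS_NS}}}file",
            ID=file_id,
            SIZE=str(len(content)),
            CHECKSUM=hashlib.sha256(content).hexdigest(),
            CHECKSUMTYPE="SHA-256",
        )
        ET.SubElement(
            file_el,
            f"{{{METS_NS}}}FLocat",
            {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": path},
        )
        # Link the structural division back to the file entry.
        div = ET.SubElement(root_div, f"{{{METS_NS}}}div", TYPE="item", LABEL=path)
        ET.SubElement(div, f"{{{METS_NS}}}fptr", FILEID=file_id)

    return ET.tostring(mets, encoding="unicode")


if __name__ == "__main__":
    # Example objects of different kinds ingested together (made-up data).
    print(build_mets([
        ("mail/inbox.mbox", b"example mailbox bytes"),
        ("docs/diary.txt", b"example document bytes"),
    ]))
```

The point of the sketch is that the curator supplies only the objects; identifiers, checksums and the links between the file inventory and the structural map are filled in mechanically, which is the part that is onerous to hand-craft.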
 

Contributing institutions

  • The Joint Information Systems Committee
  • University of Oxford
  • The University of Manchester
  • Wellcome Library

Last updated: 9 October 2006