Using METS for the preservation and dissemination of digital archives
Introduction
METS and the Paradigm Project
Paradigm identified METS as the most appropriate means of storing all the metadata required for long-term preservation. Each digital object in a personal archive should have an associated METS document which wraps up, or points to, all the metadata needed to preserve that object, thus forming an Information Package. Relationships with other digital objects will be expressed through the METS structural map, which details the hierarchy for the entire accession; internally, each object can also record metadata about its parent and child objects.
METS documents will also be needed for:
- Intellectual constructs, such as accessions, collections, and the folders used by creators to arrange their archives.
- Metadata about agents, events and rights (as defined by the PREMIS standard).
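The idea of a per-object METS document that both wraps and points to metadata can be illustrated with a minimal sketch. It uses Python's standard `xml.etree.ElementTree` module and genuine METS element and attribute names, but it is not Paradigm's content model: the function name `build_object_mets`, the identifiers, the file paths, and the choice to wrap a Dublin Core title while pointing to a separate METS file of events are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("mets", METS)
ET.register_namespace("xlink", XLINK)
ET.register_namespace("dc", DC)


def m(tag):
    """Return a METS tag name in ElementTree's {namespace}tag form."""
    return "{%s}%s" % (METS, tag)


def build_object_mets(object_id, title, content_href, events_href):
    """Build a skeleton METS document for one digital object (hypothetical helper)."""
    href = "{%s}href" % XLINK
    root = ET.Element(m("mets"), {"OBJID": object_id})

    # Descriptive metadata wrapped inside the METS document (mdWrap).
    dmd = ET.SubElement(root, m("dmdSec"), {"ID": "dmd1"})
    wrap = ET.SubElement(dmd, m("mdWrap"), {"MDTYPE": "DC"})
    xml_data = ET.SubElement(wrap, m("xmlData"))
    ET.SubElement(xml_data, "{%s}title" % DC).text = title

    # Provenance metadata held in a separate file and pointed to (mdRef).
    # Assumption: the target is another METS file describing events.
    amd = ET.SubElement(root, m("amdSec"), {"ID": "amd1"})
    prov = ET.SubElement(amd, m("digiprovMD"), {"ID": "prov1"})
    ET.SubElement(prov, m("mdRef"), {
        "LOCTYPE": "URL", "MDTYPE": "OTHER", "OTHERMDTYPE": "METS",
        href: events_href,
    })

    # The content file, linked to its administrative metadata via ADMID.
    file_sec = ET.SubElement(root, m("fileSec"))
    group = ET.SubElement(file_sec, m("fileGrp"))
    file_el = ET.SubElement(group, m("file"), {"ID": "file1", "ADMID": "prov1"})
    ET.SubElement(file_el, m("FLocat"), {"LOCTYPE": "URL", href: content_href})

    # Structural map: a single division pointing at the content file.
    struct_map = ET.SubElement(root, m("structMap"), {"TYPE": "logical"})
    div = ET.SubElement(struct_map, m("div"), {"LABEL": title, "DMDID": "dmd1"})
    ET.SubElement(div, m("fptr"), {"FILEID": "file1"})
    return root
```

Calling `ET.tostring(build_object_mets(...), encoding="unicode")` serialises the result. An accession-level document would follow the same pattern, except that its structural map divisions would carry `mptr` elements pointing to the child objects' METS files rather than `fptr` elements pointing to content files.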
This Workbook chapter will concentrate on the use of METS as an Information Package for digital objects as defined by the OAIS model. OAIS defines three types of Information Package, and the METS document for a digital object will differ slightly according to which role it is fulfilling at any one stage of the lifecycle:
- As Submission Information Packages (SIPs): in some contexts, data creators supply data preservers with structured metadata. In these instances METS can be used to wrap the objects and the metadata together, and the preserving service could impose a standard METS profile for this purpose. Structured submissions of this kind are highly unlikely in the context of a collecting archive, but might occur where the data creators are library staff undertaking a digital project of some kind.
- As Archival Information Packages (AIPs): this is the key stage in the long-term preservation of digital objects. Each digital object will have its own dedicated METS file containing comprehensive administrative metadata, and will link to related METS files detailing the relevant events, agents and rights associated with the digital object. This group of linked METS files therefore constitutes the AIP for a single digital object. There will also be collection and accession AIPs which provide comprehensive structural maps pointing to the AIPs of their child digital objects. There is likely to be less descriptive metadata at this stage than at the next.
- As Dissemination Information Packages (DIPs): METS can act as a delivery package for researchers, who will usually identify the object they are interested in via an EAD catalogue. On calling up an item, they will receive both the digital content itself and some relevant metadata from the METS file; there is likely to be a higher proportion of descriptive information at this stage and rather less administrative metadata. See Chapter 06 Arranging and cataloguing digital and hybrid archives for further information about DIPs for individual objects. The DIP should be relatively straightforward to build by extracting the relevant metadata from the current version of the AIP for any single digital object.
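Deriving a DIP from an AIP, as described in the last item above, might be sketched as follows. This is an illustration under stated assumptions, not a documented Paradigm procedure: the function name `make_dip` is hypothetical, and the policy of keeping descriptive metadata and rights information while dropping the other administrative sections is an assumed example of "rather less administrative metadata".

```python
import copy
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS)


def make_dip(aip_root, keep=("rightsMD",)):
    """Return a copy of an AIP METS tree with most administrative metadata removed.

    `keep` names the amdSec child elements to retain (an assumed policy).
    The AIP tree passed in is left unchanged.
    """
    dip = copy.deepcopy(aip_root)
    keep_tags = {"{%s}%s" % (METS, name) for name in keep}
    removed_ids = set()

    for amd in dip.findall("{%s}amdSec" % METS):
        for child in list(amd):
            if child.tag not in keep_tags:
                removed_ids.add(child.get("ID"))
                amd.remove(child)
        # Drop an amdSec that no longer holds anything.
        if len(amd) == 0:
            removed_ids.add(amd.get("ID"))
            dip.remove(amd)

    # Remove ADMID references to sections that no longer exist.
    for element in dip.iter():
        if "ADMID" in element.attrib:
            remaining = [i for i in element.get("ADMID").split()
                         if i not in removed_ids]
            if remaining:
                element.set("ADMID", " ".join(remaining))
            else:
                del element.attrib["ADMID"]
    return dip
```

Working on a deep copy keeps the AIP authoritative: the DIP can be regenerated at any time from the current version of the AIP, and the filtering policy can change without touching preserved metadata.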
Paradigm’s progress with METS was stalled by the fact that personal archives contain a huge variety of digital objects with complex relationships. Handling this variety requires a detailed content model and a means of automating the generation of METS documents which conform to that model. At present, such METS documents can only be crafted by hand, with the help of various (non-uniform) metadata extraction tools and registries. This is clearly not feasible as a long-term solution. The CAIRO Project has therefore been set up to address the issue; it aims to develop an integrated, automated workflow which will produce repository-independent metadata packages in the form of METS documents, providing the basis for long-term lifecycle management.