Using METS for the preservation and dissemination of digital archives
Structure of a METS file
Alternative uses of the structural map
The repeatable <structMap> element allows archivists to exploit the digital environment to the full by creating multiple arrangements of the material in a personal digital archive, and therefore multiple means of navigating and accessing it. The hierarchical nature of the <structMap> is ideal for maintaining the original order of an archive (i.e. the creator's divisions into different directories, folders, subfolders and files), but it also allows the repository to cater for different categories of user by presenting numerous alternative arrangements of the material. METS does not prescribe any specific types of arrangement, so these might be physical (e.g. <div>s representing page sequence in a digitised book), logical (e.g. <div>s representing poems in a poetry collection, which might span pages in the volume), or a mixture of both. This facility also has great potential for personal digital archives.
The <structMap> in one METS document is not limited to organising the content represented by the <fileSec> of the same METS document. It can also organise content represented by linked, external METS documents. This is achieved by using the METS pointer (<mptr>) element instead of the file pointer (<fptr>) within each <div>: the <mptr> points to content represented by an external METS document, by means of an xlink:href attribute containing a URL marking the location of the relevant METS document.
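As a minimal sketch of the difference, the two fragments below show a <div> that uses <fptr> to reference a file in the same document's <fileSec>, and a <div> that uses <mptr> to reference an external METS document. The FILEID, TYPE and LABEL values and the URL are hypothetical placeholders, and the xlink prefix is assumed to be bound to http://www.w3.org/1999/xlink on the root <mets> element.

```xml
<!-- Internal reference: FILEID must match the ID of a <file>
     in this document's own <fileSec> -->
<div TYPE="message" LABEL="Email message">
  <fptr FILEID="FILE001"/>
</div>

<!-- External reference: the content is represented by another
     METS document, located by the URL in xlink:href -->
<div TYPE="message" LABEL="Email message">
  <mptr LOCTYPE="URL"
        xlink:href="http://repository.example.org/mets/email-0001.xml"/>
</div>
```

LOCTYPE is a required attribute of <mptr>; "URL" is one of its permitted values.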
This means that a digital archive can be arranged by using numerous 'parent' METS documents, which do not contain a <fileSec> themselves, but by means of their <structMap> organise further METS documents at a lower level in the hierarchy. Ultimately, the lower level parent METS documents organise and point to METS documents for individual digital objects like the email example explored above.
The following examples take the single email and its METS document as a starting point, and work upwards through this hierarchy of METS documents.
Example 1: Email and attachment (AIP)
This example is based on an email and its attachment, a Microsoft Word file containing a draft of the election press release referred to in the subject line of the email. As an attachment is an integral part of an email, the two digital objects must be unambiguously linked together. This can be achieved by means of a parent METS document containing some basic descriptive metadata and a <structMap> pointing to the two separate METS documents representing the email message and the attachment. There is no need to include a <fileSec> in this parent document because it does not hold any digital content itself; the administrative metadata specific to each component object will be recorded in its own METS document and does not necessarily need to be represented at this higher level, although a basic <amdSec> could be included. This is a reversal of the rules for descriptive archival cataloguing, where as much common information as possible is given at a higher level: because each digital object needs to be self-documenting, the detail is placed at the lowest possible level.
This example is a rough approximation of what such a METS document might look like at the AIP stage. At the DIP stage, Dublin Core metadata (for metadata harvesting), a reference to an EAD finding aid and some basic administrative metadata might be added.
In the MODS descriptive metadata section the subject line of the email is given in the <title> field and the title of the attachment is supplied in the <abstract>. The date of creation is given as a span date - from the date when the Microsoft Word attachment was created, to the date when the email was sent. Additional <name> elements should be added if the sender of the email is not the person who authored the press release in the attachment.
There are two <div> elements in the <structMap>, which both point to external METS documents using the <mptr> element: one for the email message itself and the other for the attached Microsoft Word document. The TYPE attribute values are hypothetical; these should be established and recorded in a METS Profile.
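The parent document described above might be sketched roughly as follows. This is an illustration under stated assumptions, not the workbook's own listing: every ID, bracketed value, date, TYPE value and URL is a hypothetical placeholder, and the TYPE values in particular would need to be defined in a METS Profile.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical parent METS document for an email and its attachment
     (AIP stage). All IDs, dates, bracketed values and URLs are placeholders. -->
<mets xmlns="http://www.loc.gov/METS/"
      xmlns:mods="http://www.loc.gov/mods/v3"
      xmlns:xlink="http://www.w3.org/1999/xlink"
      OBJID="email-0001-parent">

  <!-- Basic descriptive metadata as a MODS record -->
  <dmdSec ID="DMD001">
    <mdWrap MDTYPE="MODS">
      <xmlData>
        <mods:mods>
          <mods:titleInfo>
            <!-- Subject line of the email -->
            <mods:title>[Email subject line]</mods:title>
          </mods:titleInfo>
          <!-- Add further name elements if sender and author differ -->
          <mods:name type="personal">
            <mods:namePart>[Sender of the email]</mods:namePart>
          </mods:name>
          <mods:originInfo>
            <!-- Span date: creation of the attachment to sending of the email -->
            <mods:dateCreated encoding="w3cdtf" point="start">2000-01-01</mods:dateCreated>
            <mods:dateCreated encoding="w3cdtf" point="end">2000-01-02</mods:dateCreated>
          </mods:originInfo>
          <!-- Title of the attachment -->
          <mods:abstract>[Title of the attached Word document]</mods:abstract>
        </mods:mods>
      </xmlData>
    </mdWrap>
  </dmdSec>

  <!-- No fileSec: this document holds no digital content itself -->
  <structMap>
    <div TYPE="email" DMDID="DMD001">
      <!-- Each child div points to a separate METS document -->
      <div TYPE="message" ORDER="1">
        <mptr LOCTYPE="URL"
              xlink:href="http://repository.example.org/mets/email-0001-message.xml"/>
      </div>
      <div TYPE="attachment" ORDER="2">
        <mptr LOCTYPE="URL"
              xlink:href="http://repository.example.org/mets/email-0001-attachment.xml"/>
      </div>
    </div>
  </structMap>
</mets>
```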
Example 2: email folder (DIP)
This example moves up the archival hierarchy one step to an email folder within an email directory; the example represents an email folder called 'Press releases', in which the email and attachment from Example 1 might be stored.
This example contains descriptive metadata in the form of a MODS record as well as some basic administrative metadata; it therefore contains all the metadata needed in a DIP, apart from an additional Dublin Core record and a reference to an EAD finding aid. Most importantly, however, it contains three different structural maps providing users with different ways of accessing the folder contents. The first <structMap> represents the original order of the folder's contents (which is a single sequence of messages arranged chronologically); this is the primary order and the one which would be represented in the EAD catalogue. The second <structMap> is arranged alphabetically by named correspondent; and the last is grouped according to email subject line (allowing users to navigate 'threads' in a particular email correspondence).
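A folder-level document with three structural maps might be sketched roughly as follows. This is an illustrative reconstruction rather than the workbook's own listing: the IDs, bracketed labels, TYPE values and URLs are hypothetical placeholders, the "physical" type on the first map is an assumption, and the administrative metadata section is left as a stub.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical METS document for the email folder 'Press releases' (DIP).
     All IDs, bracketed values, TYPE values and URLs are placeholders. -->
<mets xmlns="http://www.loc.gov/METS/"
      xmlns:mods="http://www.loc.gov/mods/v3"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <dmdSec ID="DMD001">
    <mdWrap MDTYPE="MODS">
      <xmlData>
        <mods:mods>
          <mods:titleInfo><mods:title>Press releases</mods:title></mods:titleInfo>
        </mods:mods>
      </xmlData>
    </mdWrap>
  </dmdSec>
  <amdSec ID="AMD001">
    <!-- Basic administrative metadata for the folder would go here -->
  </amdSec>

  <!-- Map 1: original order, a single chronological sequence -->
  <structMap ID="SM1" TYPE="physical" LABEL="Original order">
    <div TYPE="folder" LABEL="Press releases" DMDID="DMD001" ADMID="AMD001">
      <div TYPE="email" ORDER="1" LABEL="[Date/time sent 1], [Correspondent A]">
        <!-- Points to a lower-level parent document (email plus attachment) -->
        <mptr LOCTYPE="URL" xlink:href="http://repository.example.org/mets/email-0001-parent.xml"/>
      </div>
      <div TYPE="email" ORDER="2" LABEL="[Date/time sent 2], [Correspondent B]">
        <mptr LOCTYPE="URL" xlink:href="http://repository.example.org/mets/email-0002.xml"/>
      </div>
      <div TYPE="email" ORDER="3" LABEL="[Date/time sent 3], [Correspondent A]">
        <mptr LOCTYPE="URL" xlink:href="http://repository.example.org/mets/email-0003.xml"/>
      </div>
    </div>
  </structMap>

  <!-- Map 2: grouped alphabetically by named correspondent -->
  <structMap ID="SM2" TYPE="logical" LABEL="By correspondent">
    <div TYPE="folder" LABEL="Press releases" DMDID="DMD001" ADMID="AMD001">
      <div TYPE="correspondent" LABEL="[Correspondent A]">
        <div TYPE="email" LABEL="[Date/time sent 1]">
          <mptr LOCTYPE="URL" xlink:href="http://repository.example.org/mets/email-0001-parent.xml"/>
        </div>
        <div TYPE="email" LABEL="[Date/time sent 3]">
          <mptr LOCTYPE="URL" xlink:href="http://repository.example.org/mets/email-0003.xml"/>
        </div>
      </div>
      <div TYPE="correspondent" LABEL="[Correspondent B]">
        <div TYPE="email" LABEL="[Date/time sent 2]">
          <mptr LOCTYPE="URL" xlink:href="http://repository.example.org/mets/email-0002.xml"/>
        </div>
      </div>
    </div>
  </structMap>

  <!-- Map 3: grouped by thread (email subject line), threads in alphabetical order -->
  <structMap ID="SM3" TYPE="logical" LABEL="By thread">
    <div TYPE="folder" LABEL="Press releases" DMDID="DMD001" ADMID="AMD001">
      <div TYPE="thread" LABEL="[Subject line A]">
        <div TYPE="email" LABEL="[Date/time sent 1], [Correspondent A]">
          <mptr LOCTYPE="URL" xlink:href="http://repository.example.org/mets/email-0001-parent.xml"/>
        </div>
        <div TYPE="email" LABEL="[Date/time sent 2], [Correspondent B]">
          <mptr LOCTYPE="URL" xlink:href="http://repository.example.org/mets/email-0002.xml"/>
        </div>
      </div>
      <div TYPE="thread" LABEL="[Subject line B]">
        <div TYPE="email" LABEL="[Date/time sent 3], [Correspondent A]">
          <mptr LOCTYPE="URL" xlink:href="http://repository.example.org/mets/email-0003.xml"/>
        </div>
      </div>
    </div>
  </structMap>
</mets>
```

The same external METS document is referenced from all three maps; only the arrangement of the <div> elements differs.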

Each of the three structural maps has been given a unique ID to distinguish it from the others. The root <div> element in each case refers (by means of DMDID and ADMID attributes) to the descriptive and administrative metadata relating to the email folder, which is stored in this METS document. The lower level <div> elements point to other METS documents which will contain the same kind of metadata specific to the digital object represented in each. While some of the <div> elements point directly to METS documents for single objects which would include the location of the object content, others point to lower-level 'parent' METS documents containing data about emails and associated attachments.
The first <structMap> above reflects the creator's original order, which means a single chronological sequence of divisions, each representing one email, with a label indicating the date/time of sending and the sender's name. Another <structMap> could be provided in order to nest the divisions into chronological periods based on week or month to make the content more manageable.
The second and third structural maps have been classified as "logical" types because they do not reflect the creator's original order; an artificial arrangement has been imposed on the material with the purpose of facilitating access for users. In the second map the <div> elements are nested into groups based on the name of the correspondent, and in the third map nesting is based on named 'threads' (drawn from the email subject line), which have been arranged in alphabetical order.
These possibilities for creating different arrangements of folder content also apply at the higher level of the email directory. While the primary <structMap> would reflect the creator's arrangement of folders within their directory, additional maps might cut across these folder divisions (which are usually subject-based and ordered alphabetically) to arrange material in different ways, e.g. by correspondent.
Example 3: collection level (DIP)
At the highest level, an overall parent METS document can be used to structure a digital archive in its entirety. Whereas the EAD collection-level description will pull together all the elements (both paper and digital) of a hybrid archive in their final archival arrangement, the collection-level METS document can organise the digital elements of the archive into a hierarchical structure which reflects the different components (website, email directory, office files) of the digital archive and their structure into directories, folders, sub-folders and files.
This collection-level file will point, via its <structMap>, to a web of lower level METS documents, most of which will be parent files representing folders (e.g. named email folders or subject folders); these will in turn organise their own contents and point to the lowest level METS documents which each represent a single digital object.
The collection-level METS document would contain some basic descriptive metadata and its administrative metadata would consist of brief information on IPRs which affect the digital part of the archive as a whole. The focus of the document would be the structural map, which is all that is reproduced below. Each <div> would contain a <mptr> element (not included here) which would point to the location of the METS document representing that <div>.
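A collection-level structural map of this kind might be sketched roughly as follows. The ID, the "physical" type, the TYPE values and the bracketed labels are hypothetical placeholders, and the <mptr> elements are omitted as noted above.

```xml
<!-- Hypothetical collection-level structural map.
     Each div would also contain an mptr element (omitted here)
     pointing to the METS document that represents that div. -->
<structMap ID="SM1" TYPE="physical" LABEL="Digital archive: original order">
  <div TYPE="collection" LABEL="[Name of archive]: digital material">
    <div TYPE="website" LABEL="Website"/>
    <div TYPE="email-directory" LABEL="Email directory">
      <div TYPE="folder" LABEL="Press releases"/>
      <div TYPE="folder" LABEL="[Other email folder]"/>
    </div>
    <div TYPE="office-files" LABEL="Office files">
      <div TYPE="directory" LABEL="[Directory]">
        <div TYPE="folder" LABEL="[Folder]">
          <div TYPE="folder" LABEL="[Sub-folder]"/>
        </div>
      </div>
    </div>
  </div>
</structMap>
```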