Using METS for the preservation and dissemination of digital archives

Structure of a METS file

File section <fileSec>

The file section provides an inventory of, and locations for, the data files that make up the digital object described by the METS document. It can contain one or more file group (<fileGrp>) elements, which organise the individual files (each recorded in a <file> element) into sets. For digitised images, for example, there might be separate groups for thumbnails, reference copies and archival masters.
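
The grouping described above might look something like the following sketch for a single digitised page. It is illustrative only: the IDs, USE values, MIME types and example.org URLs are all hypothetical and would be decided locally.

```xml
<!-- Illustrative sketch: all IDs, USE values and URLs are hypothetical -->
<fileSec xmlns="http://www.loc.gov/METS/"
         xmlns:xlink="http://www.w3.org/1999/xlink">
  <!-- One fileGrp per set of files with the same intended use -->
  <fileGrp ID="FG-MASTER" USE="master">
    <file ID="F-MASTER-001" MIMETYPE="image/tiff">
      <FLocat LOCTYPE="URL" xlink:href="http://example.org/masters/page001.tif"/>
    </file>
  </fileGrp>
  <fileGrp ID="FG-REFERENCE" USE="reference">
    <file ID="F-REFERENCE-001" MIMETYPE="image/jpeg">
      <FLocat LOCTYPE="URL" xlink:href="http://example.org/reference/page001.jpg"/>
    </file>
  </fileGrp>
  <fileGrp ID="FG-THUMBNAILS" USE="thumbnails">
    <file ID="F-THUMBNAIL-001" MIMETYPE="image/jpeg">
      <FLocat LOCTYPE="URL" xlink:href="http://example.org/thumbnails/page001.jpg"/>
    </file>
  </fileGrp>
</fileSec>
```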

In contrast, the file section of a typical METS document for a born-digital object in a personal archive is likely to be very simple. A METS document representing a single email, for example, will only contain one <fileGrp> and one <file>.

Example:

This example is based on a single email at AIP stage, when it is in archival storage and is linked to its own individual METS document.

Much of the information in the <fileSec> is conveyed by means of attributes. Here, the <fileSec> as a whole has been given a unique identifier (its ID attribute), which allows it to be referenced from elsewhere in the METS document. The <fileGrp> and <file> elements likewise have unique IDs. The <fileGrp> element can carry an ADMID attribute, which links to the relevant administrative metadata sections by means of their IDs. It does not allow a similar link to descriptive metadata; that link is made at <file> level by means of the DMDID attribute.

The USE attribute indicates the intended use of the files within a <fileGrp>. Frequently used values for image files include master, reference and thumbnails. METS does not prescribe values for this attribute, so they should be determined at local level. Here it has been used to indicate that this version of the file is an AIP rather than a SIP or DIP.

At <file> level, the MIMETYPE attribute is also available to record the MIME type of the file.

METS offers two methods of dealing with content files within the <file> element. They can either be embedded within the METS document using the <FContent> element, or held externally and pointed to by means of the <FLocat> element. The latter is the more usual approach, and here the location of the email is given as a URL. The link itself is described with XLink attributes: href supplies the URL for the location of the file (although technically optional, this is essential when using the <mptr> element in the structural map); title describes the meaning of the link in a human-readable fashion; the "new" value for the show attribute indicates that the digital object (the email) would be shown in a new window; and the "onRequest" value for the actuate attribute indicates that it should only be shown at the request of the user.

Code sample
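
The following is a minimal sketch of a <fileSec> for a single email at AIP stage, assembled from the description above rather than taken from any real repository. Every ID, the MIME type, the link title and the example.org URL are assumptions made for illustration. The values given in ADMID and DMDID would have to match the IDs of administrative and descriptive metadata sections elsewhere in the same METS document.

```xml
<!-- Illustrative sketch: all IDs, the URL and the link title are hypothetical -->
<fileSec xmlns="http://www.loc.gov/METS/"
         xmlns:xlink="http://www.w3.org/1999/xlink"
         ID="FSEC-0001">
  <!-- ADMID points to administrative metadata; USE is a locally chosen value -->
  <fileGrp ID="FGRP-0001" ADMID="AMD-0001" USE="AIP">
    <!-- DMDID points to descriptive metadata; MIMETYPE assumes an RFC 822 message -->
    <file ID="FILE-0001" DMDID="DMD-0001" MIMETYPE="message/rfc822">
      <!-- The content file is held externally and located by URL -->
      <FLocat LOCTYPE="URL"
              xlink:href="http://example.org/archive/aip/email-0001.eml"
              xlink:title="Email held in archival storage"
              xlink:show="new"
              xlink:actuate="onRequest"/>
    </file>
  </fileGrp>
</fileSec>
```

Embedding the email with <FContent> instead would replace the <FLocat> element with the encoded content itself, at the cost of a much larger METS document.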

The <fileSec> can also handle much more complex digital objects. A component byte stream (<stream>) element can record the existence of separate data streams within a particular file (e.g. separate audio and video streams in an MPEG-4 file). A transform file (<transformFile>) element provides a means of accessing any subsidiary files listed below a <file> element by indicating the steps required to unpack or transform them.
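
As a rough sketch of these two elements, the fragment below records two streams inside a hypothetical MPEG-4 file, and a hypothetical gzipped tar archive that must be decompressed in two ordered steps before the email nested inside it can be reached. The IDs, URLs, stream types and algorithm names are illustrative assumptions, not values prescribed by METS.

```xml
<!-- Illustrative sketch: all IDs, URLs and attribute values are hypothetical -->
<fileSec xmlns="http://www.loc.gov/METS/"
         xmlns:xlink="http://www.w3.org/1999/xlink">
  <fileGrp ID="FG-COMPLEX" USE="AIP">

    <!-- A single file containing separate video and audio byte streams -->
    <file ID="F-VIDEO-01" MIMETYPE="video/mp4">
      <FLocat LOCTYPE="URL" xlink:href="http://example.org/av/interview.mp4"/>
      <stream ID="S-VIDEO-01" streamType="video/mp4"/>
      <stream ID="S-AUDIO-01" streamType="audio/mp4"/>
    </file>

    <!-- A container file: TRANSFORMORDER gives the sequence of unpacking steps -->
    <file ID="F-ARCHIVE-01" MIMETYPE="application/gzip">
      <FLocat LOCTYPE="URL" xlink:href="http://example.org/archive/letters.tar.gz"/>
      <transformFile TRANSFORMTYPE="decompression"
                     TRANSFORMALGORITHM="gzip" TRANSFORMORDER="1"/>
      <transformFile TRANSFORMTYPE="decompression"
                     TRANSFORMALGORITHM="tar" TRANSFORMORDER="2"/>
      <!-- Subsidiary file that becomes available once both steps are applied -->
      <file ID="F-ARCHIVE-01-A" MIMETYPE="message/rfc822"/>
    </file>

  </fileGrp>
</fileSec>
```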