
Preservation metadata

Choosing tools for metadata generation

At present, the preservation metadata needed for personal digital archives cannot be generated fully automatically, though the DROID, Jhove and National Library of New Zealand (NLNZ) Metadata Extract tools do produce some of the values repositories might like to record. A key problem is that no tool currently generates metadata marked up in the PREMIS XML schemas. PREMIS does not require that its XML schemas be used, but it seems sensible to adopt a single mark-up standard that will interoperate with future tools and repositories, rather than add many kinds of tool-specific metadata to METS files. Working with fewer schemas will also make it easier to train staff to use the repository.

Tools developed by digital preservation specialists

DROID

Developed by The National Archives in the United Kingdom, DROID (Digital Record Object Identification) uses a signature file to identify the formats it knows, and returns metadata that could be used to populate the PREMIS <formatDesignation> and <formatRegistry> semantic units. This information can be exported from the tool as CSV or in DROID’s own XML format.
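
As a rough illustration of how DROID output could feed these semantic units, the Python sketch below turns one row of a CSV export into a PREMIS-style format element. The column names (NAME, PUID, FORMAT_NAME, FORMAT_VERSION) and the sample row are assumptions made for illustration and may not match the export of a given DROID version. The element names follow the PREMIS data dictionary, the registry name is set to PRONOM (the registry behind DROID's signature file), and XML namespaces are omitted for brevity.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample of a DROID CSV export; real column names may differ by version.
SAMPLE_CSV = """NAME,PUID,FORMAT_NAME,FORMAT_VERSION
letter.pdf,fmt/18,Acrobat PDF 1.4 - Portable Document Format,1.4
"""


def premis_format(row):
    """Build a PREMIS-style <format> element from one CSV row (a dict)."""
    fmt = ET.Element("format")

    # <formatDesignation> carries the human-readable name and version.
    designation = ET.SubElement(fmt, "formatDesignation")
    ET.SubElement(designation, "formatName").text = row["FORMAT_NAME"]
    if row.get("FORMAT_VERSION"):
        ET.SubElement(designation, "formatVersion").text = row["FORMAT_VERSION"]

    # <formatRegistry> points at the registry entry for the format.
    registry = ET.SubElement(fmt, "formatRegistry")
    ET.SubElement(registry, "formatRegistryName").text = "PRONOM"
    ET.SubElement(registry, "formatRegistryKey").text = row["PUID"]
    return fmt


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
format_xml = ET.tostring(premis_format(rows[0]), encoding="unicode")
print(format_xml)
```

A repository would still need to wrap such a fragment in a complete, namespaced PREMIS object and validate it against the schema; the sketch covers only the mapping step.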

Jhove

Developed by JSTOR and Harvard, Jhove (JSTOR/Harvard Object Validation Environment) supports mainly open formats. It includes profiles for versions of the following: AIFF, ASCII, Bytestream, GIF, JPEG, JPEG2000, PDF, TIFF, UTF8, WAVE and XML.

Jhove identifies an object by parsing it with its modules. The file is run through each module in turn until one of them identifies it, or until every module has failed to do so. Alternatively, Jhove can be configured to look at magic numbers instead. For files that do not conform to any of these types, Jhove can produce a small set of metadata according to its default profile.
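
The try-each-module-in-turn approach, in its magic-number variant, can be sketched in Python as follows. This is not Jhove's code: the MODULES table and the identify function are hypothetical names, each "module" is reduced to a test on the leading bytes of the file, and real modules parse and validate the whole object rather than just its first few bytes.

```python
# Each hypothetical "module" pairs a format name with a test on the leading bytes.
MODULES = [
    ("GIF", lambda h: h[:6] in (b"GIF87a", b"GIF89a")),
    ("PDF", lambda h: h.startswith(b"%PDF-")),
    ("TIFF", lambda h: h[:4] in (b"II*\x00", b"MM\x00*")),
    ("JPEG", lambda h: h.startswith(b"\xff\xd8\xff")),
    ("WAVE", lambda h: h[:4] == b"RIFF" and h[8:12] == b"WAVE"),
    ("AIFF", lambda h: h[:4] == b"FORM" and h[8:12] == b"AIFF"),
]


def identify(data):
    """Return the name of the first module whose signature matches the data.

    Falls back to "Bytestream" when no module recognises the content,
    mirroring the idea of a default profile for unidentified files.
    """
    head = data[:12]  # the signatures above need at most 12 bytes
    for name, matches in MODULES:
        if matches(head):
            return name
    return "Bytestream"


print(identify(b"GIF89a\x01\x00\x01\x00"))
print(identify(b"%PDF-1.4\n"))
print(identify(b"Dear Sir, thank you for your letter"))
```

The order of the table matters in a design like this: the first match wins, so more specific signatures should be tried before more general ones.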

Jhove produces its metadata according to its own XML schema, but images are described using MIX.

NLNZ tool

The NLNZ tool supports the following proprietary formats:

It also has a default profile for other formats. The NLNZ tool determines format from the file extension, which is not a particularly reliable method, and provides a MIME type.
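
To show why identification by file extension is fragile, the Python sketch below maps an extension to a MIME type and applies it to a file whose name does not match its content. The EXTENSION_TO_MIME table and both function names are hypothetical and are not taken from the NLNZ tool.

```python
import os

# Hypothetical extension table for illustration; not the NLNZ tool's own mapping.
EXTENSION_TO_MIME = {
    ".doc": "application/msword",
    ".pdf": "application/pdf",
    ".gif": "image/gif",
}


def mime_from_extension(filename):
    """Guess a MIME type from the file name alone, ignoring the content."""
    ext = os.path.splitext(filename)[1].lower()
    return EXTENSION_TO_MIME.get(ext, "application/octet-stream")


def looks_like_gif(data):
    """Check the content itself for a GIF signature."""
    return data[:6] in (b"GIF87a", b"GIF89a")


# A GIF image that a donor has saved or renamed with a .doc extension.
filename = "portrait.doc"
content = b"GIF89a\x01\x00\x01\x00"

print(mime_from_extension(filename))  # application/msword
print(looks_like_gif(content))        # True
```

The extension-based guess reports a word-processing document even though the bytes are those of a GIF image, which is the kind of mismatch a signature-based check can catch.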

The NLNZ tool produces its metadata according to its own schema.

Other tools

Software developers and the public are becoming increasingly interested in creating and exploiting metadata, and a number of tools for working with metadata are available, both commercially and from open source software repositories such as SourceForge. The Cairo project, which is developing a tool for ingesting digital archives and metadata into a repository, is conducting a survey of such tools and assessing their utility in supplying metadata for preserving digital objects. Initial impressions include: