Preservation metadata

A note on the Object entity

The Object entity is more complex than the other entities and deserves a little more explanation. PREMIS defines three subtypes of Object:

File

'a named and ordered sequence of bytes that is known by an operating system. A file can be zero or more bytes and has one file format, access permissions, and file system statistics such as size and last modification date'.

Examples:
Portable Document Format 1.4 file
WordPerfect for Windows 5.1 file
WordPerfect for Macintosh file
Graphics Interchange Format 1989a file

Bitstream

'contiguous or non-contiguous data within a file that has meaningful common properties for preservation purposes. A bitstream cannot be transformed into a standalone file without the addition of file structure (headers, etc.) and/or reformatting the bitstream to comply with some particular file format'.

Bitstreams that are true files embedded within larger files are known as filestreams. These have sufficient structural information to stand alone as files when removed from the context in which they were found.

Examples:
An image embedded in a Tagged Image File Format file (bitstream)
A Portable Document Format 1.4 file embedded in a zip file (filestream)
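The filestream case can be illustrated with a minimal Python sketch. It assumes a placeholder payload standing in for a PDF 1.4 file and a hypothetical member name, report.pdf, neither of which comes from PREMIS. The embedded bytes are written into a zip container and read back unchanged, so they can stand alone as a file without any added structure. An ordinary bitstream, such as the image data inside a TIFF file, would still need headers or reformatting before it could do the same.

```python
import io
import zipfile

# Hypothetical payload standing in for a PDF 1.4 file; only the header matters here.
pdf_bytes = b"%PDF-1.4\n% placeholder content for illustration\n"

# Build a zip container in memory that holds the embedded file.
container = io.BytesIO()
with zipfile.ZipFile(container, "w") as zf:
    zf.writestr("report.pdf", pdf_bytes)  # "report.pdf" is an assumed name

# Read the embedded file back out of the container.
with zipfile.ZipFile(container) as zf:
    extracted = zf.read("report.pdf")

# The extracted bytes are identical to the original file and still carry
# their own format header, which is what makes them a filestream.
is_filestream = extracted == pdf_bytes and extracted.startswith(b"%PDF-1.4")
print(is_filestream)
```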

Representation

'the set of files, including structural metadata, needed for a complete and reasonable rendition of a particular Intellectual Entity'.

A Representation object could be a simple object, consisting of a single File to represent an Intellectual Entity, or a complex object, consisting of multiple Files and the structural metadata required to reassemble them into an Intellectual Entity. The diagram below shows two Representations of the same Intellectual Entity, one complex and one simple:

Figure 13: Two Representations of the same Intellectual Entity

Dividing Representation from File in this way lets a repository record metadata that applies to the Representation as a whole separately from metadata about each of its constituent Files. Many of the detailed semantic units in the PREMIS dictionary do not apply to Representation objects, because a Representation is a wrapper that unites the Files composing an Intellectual Entity. That detail is held instead in the File objects related to the Representation.

Further examples to illustrate the relationships between Intellectual Entities, Representations and Files:

Intellectual Entity | Sub-Intellectual Entity | Representation (one or more Files)
Digital correspondence of politician | Messages with or without attachments | Microsoft Outlook for XP personal store file (.pst)
A message about a meeting with an agenda | - | An email in XML format and attachment in Open Document Format 1.0
Draft article for a magazine | - | A Microsoft Word 2000 file; Tagged Image File Format file
Personal website | Pages from personal website | 1 css file; 20 html files; 1 gif1989a file
Personal website | - | 1 PDF 1.4 file
Page from personal website | - | 1 css file; 1 html file
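As a rough illustration of the personal website examples above, the following Python sketch models one Intellectual Entity with two Representations, one complex and one simple. The class names, field names and file names are assumptions made for this example. They are not PREMIS semantic units, and a real repository would record far more at the File level.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class File:
    """File-level object: where detailed technical metadata would be held."""
    name: str          # hypothetical file name
    format_name: str   # e.g. "PDF 1.4"


@dataclass
class Representation:
    """Wrapper uniting the Files needed to render an Intellectual Entity."""
    label: str
    files: List[File] = field(default_factory=list)

    @property
    def is_complex(self) -> bool:
        # More than one File implies structural metadata is needed to reassemble them.
        return len(self.files) > 1


@dataclass
class IntellectualEntity:
    title: str
    representations: List[Representation] = field(default_factory=list)


# Complex Representation: 1 css file, 20 html files and 1 gif file.
site_files = (
    [File("style.css", "CSS")]
    + [File(f"page{i:02d}.html", "HTML") for i in range(1, 21)]
    + [File("logo.gif", "GIF 1989a")]
)

website = IntellectualEntity(
    "Personal website",
    [
        Representation("Original site files", site_files),
        # Simple Representation: a single PDF 1.4 file.
        Representation("PDF rendition", [File("website.pdf", "PDF 1.4")]),
    ],
)

for rep in website.representations:
    kind = "complex" if rep.is_complex else "simple"
    print(f"{rep.label}: {len(rep.files)} file(s), {kind}")
```

Keeping format information on File rather than on Representation mirrors the division described earlier: the Representation only groups, while each File carries its own detail.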

All elements described in PREMIS are as applicable to objects created by the repository itself, through migration events or other preservation actions, as they are to the original objects deposited with the repository as a personal digital archive.
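One way to picture this is the sketch below, in which an original object and a repository-created object share exactly the same record structure, with the derived object simply pointing back to its source and to the event that produced it. The identifiers, field names and the record_migration helper are hypothetical and only record metadata; no format conversion is performed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileObject:
    identifier: str
    format_name: str
    derived_from: Optional[str] = None      # identifier of the source object, if any
    created_by_event: Optional[str] = None  # identifier of the event that produced it


def record_migration(source: FileObject, new_id: str,
                     target_format: str, event_id: str) -> FileObject:
    """Describe the output of a migration using the same structure as its source."""
    return FileObject(new_id, target_format,
                      derived_from=source.identifier,
                      created_by_event=event_id)


# Original object as deposited (identifiers are assumed for illustration).
original = FileObject("obj-001", "WordPerfect for Windows 5.1")

# Object created by the repository through a migration event.
migrated = record_migration(original, "obj-002",
                            "Portable Document Format 1.4", "evt-001")

print(migrated)
```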

PREMIS does not cover all preservation metadata requirements. As well as specifying a local PREMIS application profile, a repository may need to decide what to use to describe Intellectual Entities, agents, file formats, rights, media, hardware and the repository's own business, and how PREMIS records will be created.