
File formats

Significant properties

Significant properties are those aspects of the digital object which must be preserved in order to ensure that it remains accessible and meaningful over time as it is moved to new technologies. Crucially in an archival context, preserving a digital object's significant properties can also help safeguard its continuing authenticity and integrity.

Digital preservation research projects and testbeds (most recently the Investigating the Significant Properties of Electronic Content Over Time (InSPECT) Project) have categorised the significant properties of digital objects into five areas: content, context, appearance, structure and behaviour.

The inclusion of content and context here is important. It is possible to identify some significant properties which apply to a particular format as a whole, e.g. binary word-processed documents, but this does not take into account the wide range of purposes for which a word-processed document might have been produced; in other words, the record type or genre. Records as diverse as committee minutes or reports, an author's manuscript draft of a literary text, the master copy of a letter, or a list of contact details might all be saved in word-processed format. Other significant properties will therefore be object- or document-specific.

In some cases it may be decided that the textual content is the most important element of a record, in which case properties like font type and size, italicisation, pagination, layout and so on may not be essential to the document's meaning. In other cases the record creator may have used font size, layout, bulleting, italicisation, or colour to convey or emphasise meaning, in which case these elements should be preserved. Which elements of a record are important depends on the intention of the record creator and the requirements of the research community for which it is being preserved.

The identity of the record creator is also an important factor when determining significant properties. For instance, a politician might draft some notes for a speech using a word processor; in this case the content of the text itself is likely to be considered most important for preservation, because the document was created as a reference tool for use whilst speaking. Alternatively, a writer might draft a poem using a word processor; in this case the precise layout of the text on the page will be essential to its meaning, and any reformatting may destroy that meaning.
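The contrast between these two examples can be sketched as a simple check. The Python below is a minimal illustration only and is not drawn from InSPECT or any other tool: the property names, the two profiles and the function `lost_properties` are all hypothetical, chosen to show how the same reformatting might be acceptable for speech notes but not for a poem.

```python
# Hypothetical sketch: property names and profiles are illustrative assumptions.

# Properties judged significant for each kind of record (an archivist's decision).
SPEECH_NOTES = {"text"}
POEM_DRAFT = {"text", "line_breaks", "indentation"}


def lost_properties(original, migrated, significant):
    """Return the significant properties whose values changed during migration."""
    return sorted(
        name for name in significant
        if original.get(name) != migrated.get(name)
    )


# Property values recorded before and after a migration that reflows the layout.
before = {"text": "placeholder text", "line_breaks": 3, "indentation": "hanging"}
after = {"text": "placeholder text", "line_breaks": 0, "indentation": None}

print(lost_properties(before, after, SPEECH_NOTES))  # []
print(lost_properties(before, after, POEM_DRAFT))    # ['indentation', 'line_breaks']
```

The migration is identical in both calls; only the set of properties judged significant differs, which is the point the two examples make.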

Similarly, a low-resolution digital snapshot intended as an informal record of a holiday will gain nothing from being saved at a higher resolution, while an image taken by a professional photographer should be saved at a high resolution to maintain the quality of the photograph. Taking this approach, the Library of Congress have developed five categories of still images which are likely to be added to the Library's collections. These range from pictorial expressions of high value, such as works by graphic artists, photographers and advertisers, for which the designated community (the users of the resource) has a high interest in the artist's intent, to images incidental to web harvesting. For each category proposed, a list of preferred and acceptable formats is compiled.

Although it may not be practical or cost-effective to identify significant properties on an object-by-object basis, in some cases this will be necessary, especially for collecting institutions which take in the personal digital archives of a wide range of record creators. It is hoped that projects like InSPECT will be able to establish certain significant properties which are common to a particular format or type of digital object, and archivists may then be able to draw out some generalisations about particular types of record creator (e.g. politician, writer, scientist). However, there will always be specific and local considerations, and decisions will always involve an element of subjectivity. The advantage of working with donors and depositors at an earlier stage in the records lifecycle is that decisions can often be made in consultation with the record creator (see Chapter 03 Working with record creators).
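One way to picture this layering of general and local decisions is as a set of defaults with object-specific adjustments. The following Python is a hedged sketch under stated assumptions: the format name, creator types, property names and the function `significant_properties` are all hypothetical and do not represent findings of InSPECT or the practice of any institution.

```python
# Hypothetical sketch: every name and value below is an illustrative assumption.

# Properties assumed to be significant for a whole format.
FORMAT_DEFAULTS = {
    "word_processed_document": {"text", "headings"},
}

# Generalisations an archivist might draw about types of record creator.
CREATOR_DEFAULTS = {
    "politician": set(),
    "writer": {"layout", "pagination"},
}


def significant_properties(fmt, creator_type, add=(), remove=()):
    """Combine format and creator defaults, then apply object-specific decisions."""
    properties = set(FORMAT_DEFAULTS.get(fmt, set()))
    properties |= CREATOR_DEFAULTS.get(creator_type, set())
    properties |= set(add)      # e.g. agreed in consultation with the record creator
    properties -= set(remove)   # e.g. judged irrelevant for this particular object
    return sorted(properties)


print(significant_properties("word_processed_document", "politician"))
print(significant_properties("word_processed_document", "writer", add={"colour"}))
```

The `add` and `remove` arguments stand in for the specific, local and partly subjective decisions that no general profile can capture.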