Bitstream preservation

Bitstream preservation is the most basic technical layer of any digital preservation strategy. It could be adopted as the sole layer if the repository were to leave responsibility for achieving file- and representation-level access to future users of the material. This might be acceptable in some circumstances, but it is not a reasonable approach to the preservation of personal digital archives, which are likely to be used by readers whose technical knowledge ranges from negligible to expert. Nevertheless, digital archivists must address bitstream preservation because it forms the foundation for all other preservation strategies. Whatever the overall approach to preservation taken by a digital repository, it should be a matter of policy that the bitstream of every archival digital object is preserved in its original form indefinitely. The advantages of doing this are largely obvious and include the following:

Even this most basic level of preservation requires a degree of preservation activity, and the preservation strategy of a digital archive should include provisions for ensuring that unaltered bitstreams are preserved intact over time so that the authenticity of digital objects is not compromised.

Bitstreams must be stored on some kind of physical medium. Ultimately no medium is 'archival' and all types will degrade over time; media life expectancy claims are statistical averages based on accelerated ageing tests, which can only provide a rough estimate of longevity. Media technology also evolves quickly. New media types supersede older ones, and the devices needed to read a particular kind of media are often discontinued long before the media themselves physically degrade. It is therefore important to have a comprehensive strategy in place for ensuring that the bitstreams of digital objects are stored on suitable media at all times and are checked on a regular basis. The following elements should all form part of a physical preservation strategy for digital media:

Ongoing procedures for regularly refreshing storage media. Refreshing (moving to a newer version of the same storage media, or to different storage media, but with no change to the bitstream) should be carried out at specified times; these should fall within the minimum lifespan of the chosen medium as specified by the manufacturer or by independent sources. Refreshment may also be undertaken in response to an increase in error-rates reported by the storage system. After refreshing (and before the earlier media are securely disposed of), fixity checks should be carried out to ensure that no changes have occurred to the bitstream during the transfer process; all such preservation actions should be well documented, using metadata schemas which provide support for this, such as PREMIS.
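
The fixity check that follows a refresh can be sketched in a few lines of Python. This is a minimal illustration only: it assumes SHA-256 as the digest and uses a plain file copy to stand in for the transfer to new media. The function names `sha256_of` and `refresh_and_verify` are hypothetical, and the returned record is a simple log entry, not a PREMIS-conformant document.

```python
import hashlib
import shutil
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh_and_verify(source, destination):
    """Copy a bitstream to new storage and confirm it is unchanged.

    Returns a small dictionary describing the action so that it can be
    logged; the keys are illustrative, not a formal metadata schema.
    """
    source, destination = Path(source), Path(destination)
    before = sha256_of(source)             # digest on the old media
    shutil.copyfile(source, destination)   # stand-in for the real transfer
    after = sha256_of(destination)         # digest on the new media
    return {
        "event": "refreshment",
        "when": datetime.now(timezone.utc).isoformat(),
        "source": str(source),
        "destination": str(destination),
        "digest_algorithm": "SHA-256",
        "digest": before,
        "outcome": "pass" if before == after else "fail",
    }
```

Under this approach the earlier media would only be disposed of once every record for the batch reports `pass`.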

There are also some general precautions in relation to storage and handling which can be observed to mitigate the risk of physical degradation, whatever the media employed:

It is important for digital curators to understand the properties of the various types of storage media available, because they require different hardware and software equipment for access, and have different storage conditions and preservation requirements. Choosing the most appropriate media can maximise the period between refreshment cycles.

Selecting appropriate preservation media

When selecting appropriate storage media for preservation, The National Archives in the UK recommends taking into consideration:

  1. The longevity of the media, which should be at least 10 years.
  2. The capacity of the media, which should be appropriate for the quantity of data to be stored, and the physical size of the archival store.
  3. The viability of the media, which should have robust error-detection methods for reading and writing data. Media should ideally be write-once.
  4. The possible obsolescence of the media and its supporting hardware and software; ideally it should be based on mature technology which is widely available. As with file formats, open standards are preferable to proprietary ones.
  5. The cost of the media: comparisons should be made on a price to volume ratio.
  6. The susceptibility of the media to physical damage.
  7. The media's ability to tolerate a wide range of environmental conditions.

There are three main types of storage media to consider: optical, magnetic and solid state media.

Optical media

This media type includes:

  • The Compact Disc (CD): originally just an audio format, but since the development of the CD-ROM there are also CD types specifically designed to store data accessible by a computer.
  • The Digital Video (or Versatile) Disc (DVD).
  • Numerous variants on both CD and DVD, e.g. CD-R (Compact Disc-Recordable), DVD-R (Digital Video Disc-Recordable).
  • New, high-definition optical disc formats, principally the Blu-ray Disc and High Density or High-Definition DVD (HD DVD).

CDs and DVDs store data in the form of pits within a flat surface; the data is accessed when a special material on the disc is illuminated with a laser diode, and the pits distort the reflected laser light. The discs are composed of various layers (including a dye layer for recordable media and a reflective layer), and the combination of materials used for these layers can affect stability and longevity.

The National Archives produced a scorecard to measure formats against the seven selection factors listed above. Their conclusion indicates that both CD-R and DVD-R can be considered for long-term preservation; CDs have a slightly higher score - rating particularly highly for longevity, obsolescence, cost and susceptibility. Recent research suggests that CD-Rs which combine a more chemically stable dye layer (e.g. metal-stabilised cyanine) with a reflective layer of gold may have a life span suited to archiving.

Definitive lifespans cannot be determined, but under optimal environmental conditions and with infrequent use, the life expectancy of both a CD and a DVD is predicted to range from approximately two years (at a temperature of 28ºC and relative humidity of 50%) to 75 years (at a temperature of 10ºC and relative humidity of 25%). Media integrity should be monitored by reading a sample of discs periodically, and media should be scheduled for refreshment at regular intervals shorter than the lifespan suggested by the manufacturer. Buying the highest-quality optical storage media can extend the period between refreshment cycles.
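
The scheduling rule described here amounts to simple date arithmetic. The sketch below is one way it might be expressed; the `safety_factor` of 0.5 (refreshing at half the stated lifespan) is an assumed policy choice made for illustration, not a figure drawn from any standard, and both function names are hypothetical.

```python
from datetime import date, timedelta


def refresh_due(written_on, stated_lifespan_years, safety_factor=0.5):
    """Return the date by which media should be refreshed.

    safety_factor is an assumed policy choice: 0.5 means refreshing at
    half the lifespan stated by the manufacturer or an independent source.
    """
    if not 0 < safety_factor <= 1:
        raise ValueError("safety_factor must be in the range (0, 1]")
    days = int(stated_lifespan_years * 365.25 * safety_factor)
    return written_on + timedelta(days=days)


def is_overdue(written_on, stated_lifespan_years, today, safety_factor=0.5):
    """Return True if the refresh date has already passed."""
    return today > refresh_due(written_on, stated_lifespan_years, safety_factor)
```

A repository could run `is_overdue` across its media register to produce a list of items awaiting refreshment.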

The new Blu-ray Disc and HD DVD both differ from other optical media in that a blue-violet laser is used for reading and writing data. This has a shorter wavelength than the red laser used by CDs and DVDs, so substantially more data can be stored on a single disc. Currently, Blu-ray provides a single layer storage capacity of 25 GB and HD DVD provides 15 GB (in contrast to the 4.7 GB of conventional single layer DVDs). Technical differences make the two high-definition formats incompatible with each other; this has resulted in a format war comparable to that between Betamax and VHS in the 1980s, with major companies like Sony, Philips and Apple backing Blu-ray, and Toshiba and Microsoft backing HD DVD. In the light of this ongoing competition, and the current lack of knowledge about how well-suited these formats are to long-term preservation, Blu-ray and HD DVD should not yet be used as archival storage media, although they may have great potential for the future and players capable of reading both formats are in development.

Magnetic media

The term magnetic media is used to describe any media format where information is recorded and retrieved in the form of a magnetic signal; the magnetic properties come from metallic materials suspended in a non-magnetic mixture on a substrate or backing material.

The most common types of magnetic media are:

  • Magnetic tape: including computer tape stored in cassettes (the open-reel format is now obsolete); and tapes used in digital recording processes, primarily Digital Linear Tape (DLT) and Digital Audio Tape (DAT), which was originally designed for audio use but has now been adopted for general computer data storage; Linear Tape Open (LTO) is a non-proprietary alternative to DLT. The tape consists of a carrier of plastic film coated with a matrix containing magnetisable particles, and a plastic or resin binder, as well as other ingredients.
  • Magnetic hard disks, which can be held within a computer, or exist as independent external devices: they consist of a spindle holding one or more rapidly rotating platters which have a metallic (usually aluminium) base coated on both sides with a matrix similar to that of magnetic tape.
  • Magnetic floppy disks/diskettes: these consist of a plastic base with a magnetic matrix on one or both sides; this is enclosed in a protective plastic jacket.

Of the magnetic tape varieties, DLT and LTO are high capacity formats. The National Archives scoring system ranks these alongside CD-R, and they are the most stable and long-lasting formats; they are therefore suitable for long-term preservation. So too is DAT, although this has a low storage capacity and scores slightly lower overall than DLT/LTO. Magnetic tape is also the least expensive backup medium per unit. In contrast to optical media, less expensive, standard magnetic tape products can be used for preservation; while this requires more regular and rigorous monitoring and refreshment, it may still be more affordable in the long term than using the highest quality optical media.

Hard disks have high storage density and are reasonably robust. The kind of hard disk found in personal computers currently holds between approximately 20 GB and 750 GB of data. Larger scale storage employing hard disk technology uses disk arrays, which organise multiple disks into a logical volume of storage. Archival storage should use RAID (Redundant Array of Independent Disks) technology, which, depending on the level of RAID used, can protect against some level of data loss in the event of one or more disks in the array failing.
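
The way parity-based RAID levels tolerate the loss of a single disk can be illustrated with XOR parity. The sketch below is a toy model of that principle, with short byte strings standing in for disk blocks; it does not represent how a real RAID controller is implemented, and the function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Reconstruct the one missing data block from the survivors and parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Three data "disks" and one parity block computed across them.
disks = [b"ARCH", b"IVAL", b"DATA"]
parity = xor_blocks(disks)

# Simulate losing the second disk and rebuilding it from the rest.
recovered = rebuild_missing([disks[0], disks[2]], parity)
```

A single parity block can only recover one missing block, which is why the degree of protection depends on the RAID level chosen.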

Hard disk drives have a life expectancy of five years at the most, which means that they need regular refreshment. Their advantage is that, as spinning disks, they are permanently online, so the archive can automate regular fixity checks. The National Archives recommends server-based hard disk storage as the most effective and secure storage regime for electronic records.
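
Because server-based disks are continuously available to software, a scheduled job can recompute checksums and compare them with a stored manifest. The following sketch assumes the manifest is a plain dictionary mapping relative file paths to SHA-256 digests; that format, and the names `file_digest`, `build_manifest` and `audit`, are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file under root (by relative path) to its digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): file_digest(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def audit(root, manifest):
    """Compare the files under root with a previously stored manifest.

    Returns three sorted lists: files whose content has changed, files
    that are missing, and files present on disk but not in the manifest.
    """
    current = build_manifest(root)
    changed = sorted(p for p in manifest if p in current and current[p] != manifest[p])
    missing = sorted(p for p in manifest if p not in current)
    unexpected = sorted(p for p in current if p not in manifest)
    return changed, missing, unexpected
```

In practice the manifest would be created at ingest and kept apart from the storage it describes, so that a fault affecting the disks does not also affect the reference digests.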

Floppy disks have very limited capacity; they are susceptible to accidental erasure and have a very short lifespan. Most modern computers do not include a floppy disk drive, so developments in hardware are rapidly rendering them obsolete. For all of these reasons, floppy disks are not suitable for long-term preservation purposes and any data residing on these should be transferred to more appropriate media as soon as possible.

All electromagnetic devices are susceptible to electromagnetic radiation. Electromagnetic Pulse (EMP), which can be generated by nuclear detonations or electromagnetic bombs, is the most damaging form of this. Protection against electromagnetic effects can be provided by Faraday cages, or repositories could opt to store copies of their data on optical media, although these could not be read until damage to electrical and computing infrastructure had been remedied. More common electromagnetic interference is caused by active wireless devices, such as mobile phones.

Solid state media

The term solid state denotes removable storage devices which use flash memory, including USB and memory sticks, and cards which are used in digital cameras or laptops, like CompactFlash, xD Picture Cards, SD Memory cards and MultiMedia Cards. Solid state media hold data in smaller packages than hard drives, which makes them more efficient to store; their capacity does not yet equal that of hard disk drives, but is increasing all the time. They are very small and have no moving parts, which makes them particularly robust and portable. While these media are useful for short-term portable storage, their archival properties are not yet well understood, and they are therefore not appropriate for long-term preservation use.

Storage and handling of removable digital media

In addition to the general storage and handling recommendations for all types of media, other recommendations can be made for specific media types selected for long-term storage.

Optical:

  • For long-term storage, CDs and DVDs should ideally be stored at a temperature of 18-22ºC and at a relative humidity of 35-45%.
  • Perhaps surprisingly, the top surface (or label area) of a CD is more vulnerable than the underside and requires extra care.
  • Optical discs should only be handled by the extreme edges or the centre hole.
  • Cleaning of CDs or DVDs should be done from the outer to the inner edge, rather than along the tracks.
  • Optical discs should never be flexed. DVDs are most vulnerable because their tracks are more closely spaced; special DVD carriers can minimise flexing when discs have to be moved.
  • The surfaces of optical discs should not be marked, unless according to the manufacturer's recommendation; if marking a disc, use a soft-tipped pen with water-soluble, permanent ink and only mark the upper surface.

Magnetic tape cartridges:

  • Store DLT at a temperature of 18-26ºC and relative humidity of 40-60%.
  • Store LTO at a temperature of 16-32ºC and relative humidity of 20-80%.
  • Store DAT at a temperature of 5-32ºC and relative humidity of 20-60%.
  • Avoid exposure to magnetic fields: these can alter the media and lead to data loss.
  • Minimise handling and use, and return tapes to their containers directly after use.
  • Tape cartridges should never be opened.
  • The tape surface should never be touched.
  • Label in ink rather than pencil: graphite dust can interfere with the reading of the tape.
  • Before use, tapes should be forwarded and rewound fully to equalise tape tension.
  • After writing, the tape should be fully rewound; tapes should never be left in a partly wound state for any length of time.
  • When transporting tapes, use enclosures or packaging with a space clearance of 50 mm around the media.
  • Tape cartridges should be retensioned at yearly intervals.
  • The write-protect switch should be set after writing.
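
The temperature and humidity ranges recommended above for optical discs and tape cartridges lend themselves to an automated check of store-room readings. The sketch below encodes only the figures given in this section; the dictionary layout and the functions `check_conditions` and `shared_range` are hypothetical.

```python
# Recommended storage ranges taken from the lists above:
# (min temperature C, max temperature C, min % RH, max % RH)
RECOMMENDED = {
    "CD/DVD": (18, 22, 35, 45),
    "DLT": (18, 26, 40, 60),
    "LTO": (16, 32, 20, 80),
    "DAT": (5, 32, 20, 60),
}


def check_conditions(media_type, temperature_c, relative_humidity):
    """Return a list of problems for a reading; empty means within range."""
    t_min, t_max, rh_min, rh_max = RECOMMENDED[media_type]
    problems = []
    if not t_min <= temperature_c <= t_max:
        problems.append(
            f"temperature {temperature_c} C outside {t_min}-{t_max} C"
        )
    if not rh_min <= relative_humidity <= rh_max:
        problems.append(
            f"humidity {relative_humidity}% outside {rh_min}-{rh_max}%"
        )
    return problems


def shared_range(media_types):
    """Return the range that satisfies every listed media type at once."""
    ranges = [RECOMMENDED[m] for m in media_types]
    t_min = max(r[0] for r in ranges)
    t_max = min(r[1] for r in ranges)
    rh_min = max(r[2] for r in ranges)
    rh_max = min(r[3] for r in ranges)
    if t_min > t_max or rh_min > rh_max:
        raise ValueError("no conditions satisfy all of the given media types")
    return t_min, t_max, rh_min, rh_max
```

For a single store holding all four media types, `shared_range(RECOMMENDED)` narrows the acceptable conditions to 18-22ºC and 40-45% relative humidity.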

Solid state:

  • Media should only be held by the extreme edges.
  • Labels should only be applied within the approved label area.

Institutions should follow the recommendations established in British Standard 4783, Storage, Transportation and Maintenance of Media for use in Data Processing and Information Storage.