Introduction
Repository software aims to provide a managed environment for digital objects, such as documents and images, and their metadata. It generally includes tools that allow curators and users to exploit the stored objects and their metadata. Paradigm evaluated DSpace and Fedora, two of many repository systems, to assess their potential as preservation repositories for personal digital archives and to select the software that the project would use in its prototype system. See the how-to guides on installing DSpace and Fedora and on ingesting file directories into Fedora.
Paradigm distinguished between access and preservation repositories very early in the life of the project. Repositories catering for less sensitive materials with shorter embargoes, or materials where rightsholders' permissions to publish can feasibly be sought, often combine preservation and access functions in a single repository. Paradigm's work suggests that this is not the optimal solution for personal digital archives because:
- Preservation repositories have different functional requirements to access/presentation repositories.
- Preservation and access have different metadata requirements.
- Preservation repositories do not have the same infrastructure/performance requirements as presentation repositories.
- Access repositories must be networked, but closed materials are afforded better security when placed in a preservation repository that is isolated from the network.
- Networked preservation repositories require careful attention to the security of materials on the server, on the client and in transit.
- Preservation repositories may have different backup and disaster recovery requirements.
- The management of preservation repositories calls for different skills and experience.
- Preservation repositories have different users (archivists) than access repositories (researchers, general public).
- Some items held in preservation repositories may never be published to online access repositories (e.g. software maintained by the repository for data extraction purposes; an image of a depositor's hard disk; an old version of a file that is no longer accessible using current computing environments).
Paradigm therefore proposes that:
- Born-digital archives be held in preservation-only repositories while they are closed to researchers.
- Institutions secure online preservation repositories appropriately, or opt for offline repositories that are simpler to secure (but require a local backup routine).
- When an archive, or parts of it, is opened to research, readers may order a dissemination copy of a born-digital archive to read in a controlled environment.
- When all restrictions relating to privacy, rights and other content liability expire, 'access copies' of born-digital archives may be published to an online 'access repository', but master copies should remain in the preservation repository. The preservation service should be responsible for supplying new versions of 'access copies' in accessible formats as appropriate.
This solution means that functions needed only by a preservation repository need not be imposed on an access repository, and vice versa.
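The proposal above amounts to a simple decision rule, which can be sketched in code. The following Python is an illustrative sketch only, not part of Paradigm's prototype: the names `ArchiveStatus`, `AccessRoute` and `access_route` are hypothetical, and the two boolean flags are an assumed simplification of whatever rights and embargo metadata a real repository would hold.

```python
from dataclasses import dataclass
from enum import Enum


class AccessRoute(Enum):
    """The widest form of access the proposal allows (hypothetical names)."""
    PRESERVATION_ONLY = "held in the preservation repository only"
    CONTROLLED_COPY = "dissemination copy read in a controlled environment"
    ONLINE_ACCESS_COPY = "access copy published to an online access repository"


@dataclass(frozen=True)
class ArchiveStatus:
    # Assumed flags; a real system would derive them from rights metadata.
    open_to_research: bool
    restrictions_expired: bool  # privacy, rights and other content liability


def access_route(status: ArchiveStatus) -> AccessRoute:
    """Map an archive's status to the access route described in the proposal.

    In every case the master copy stays in the preservation repository;
    only the form in which copies are supplied changes.
    """
    if not status.open_to_research:
        # Closed archives are held in a preservation-only repository.
        return AccessRoute.PRESERVATION_ONLY
    if not status.restrictions_expired:
        # Open, but still restricted: readers order a dissemination copy.
        return AccessRoute.CONTROLLED_COPY
    # Open and unrestricted: access copies may be published online.
    return AccessRoute.ONLINE_ACCESS_COPY


if __name__ == "__main__":
    status = ArchiveStatus(open_to_research=True, restrictions_expired=False)
    print(access_route(status).value)
```

The sketch treats an archive as a single unit. The proposal also allows parts of an archive to be opened separately, so a fuller model would apply the same rule to each part.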