Accessioning digital and hybrid personal archives
Introduction
In the digital environment, the term 'accessioning' denotes the activities which take place after the records survey and prior to ingest into the digital repository, i.e. 'the process of transferring the selected records and the records survey metadata (the Submission Information Package (SIP)) from the creator's computing environment to the accessions archive where the Archival Information Package (AIP) is generated'. This represents phase 2 in the lifecycle of a digital archive. The process must take into account the concerns of all parties with a stake in the archive:
- The record creator will be concerned about the security of their data during transit. In the case of a busy politician's office, creators will also be keen that the transfer process can be carried out quickly and with a minimum of disruption.
- The digital archivist is concerned to ensure the authenticity and integrity of the records throughout the process, i.e. that they have not been unintentionally altered in any way and that their original directory structures are maintained intact; this also needs to be documented fully in an audit trail.
- Authenticity will also be a prime concern of future researchers, who will want to be sure that the records they access are identical in all essential respects to those which left the creator's computer.
Paradigm developed a transfer protocol for the purposes of the project which attempts to take all these issues into account. However, this represents only one possible approach to the transfer process (transfer via removable media). Ultimately the process must accommodate a range of methods, which will vary according to the selected approach to collection development (see Chapter 02 Collection development), e.g. electronic transfers from the donor/depositor via a secure upload mechanism, or transfer via retired hardware and media when records have reached the end of their active life.
This section outlines the Paradigm transfer protocol and documentation, and points to practical 'how-to' guides for accessioning two commonly-encountered types of digital record. The paper-based component of hybrid archives is not covered because most collecting institutions will have well-established procedures in place for transferring traditional archives to the repository.
Transfer protocol
The Paradigm transfer protocol was designed to balance the needs of archivists with those of record creators, and to be both effective and relatively straightforward to carry out.
The goal of the protocol is to enable authentic records to be securely transferred from the premises where they are currently accessed or stored to the premises of the Library's Digital Archive. It includes measures designed to:
- Preserve as much of the records' original order as possible.
- Protect the integrity of the records.
- Secure the records from unauthorised access.
The transfer protocol developed for the project is based on copying the records selected for accession to removable media.
Pre-accession assessment
The pre-accession assessment of the archive will normally include a site visit to conduct the records survey and some discussion with the depositor (or representative) via phone or email. The records survey has been developed to enable the digital archivist to gather the information required to undertake an accession which includes digital materials; it assists the archivist in:
- Discovering which hardware and software are being used to create records.
- Learning about any usernames and passwords which might be required to access or copy materials.
- Discovering which hardware and software might be used in the transfer process.
The transfer list
A transfer list was produced (see Appendix D: Transfer list) to document the transfer process; this forms part of the audit trail for each accession and supports the authenticity of its component digital objects. A separate list can be used for each directory structure, e.g. an email directory or a 'my documents' folder. It allows the archivist to record:
- The name and contact details of the owner of the archive material and the archivist carrying out the transfer. Both parties can sign and date the list.
- A reference to the deposit agreement, which sets out the terms and conditions of the transfer.
- A reference number for the piece of removable media being used, e.g. USB-1.
- The checksum value generated for the material. The checksum calculation is repeated when the records arrive at the accessions archive to ensure that they have not been altered in any way during the transfer process. The checksumming algorithm defines how the checksum is generated, and several algorithms exist; Paradigm used the MD5 algorithm for checksumming at accession.
- The extent of the material, ideally in bytes (the size measurement required for preservation metadata purposes).
- A technical description of the material, including information on file formats, any passwords or encryption applied by the creator, and details of the PC from which the material was taken.
- A description of the material's content, including information on: directory and folder names; the identity of the principal creator; record types included; broad subject areas covered; and approximate covering dates.
- Restrictions which apply to the material, including an indication of: records which might contain confidential or personal data; records on which the creator wishes to impose access restrictions; and records which both parties have agreed will not form part of the accession (these should be securely deleted if they have been picked up during the copying process).
It is useful to supplement the transfer list with printed screenshots of the system information and directory structures as a visual aid. A copy of the list is offered to the creator and a copy remains with the archivist to form part of the audit trail.
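As an illustration of how the extent in bytes might be gathered for the transfer list, the following is a minimal Python sketch, not part of the Paradigm protocol itself. The function name `summarise_extent` is hypothetical. The tally of file extensions is only a rough hint towards the technical description, since an extension does not reliably identify a file format.

```python
import os
from collections import Counter


def summarise_extent(root):
    """Return (total_bytes, file_count, extension_counts) for a directory tree.

    Hypothetical helper: sums file sizes in bytes and tallies lower-cased
    file extensions as a rough indication of the formats present.
    """
    total_bytes = 0
    file_count = 0
    extension_counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symbolic links rather than follow them
            total_bytes += os.path.getsize(path)
            file_count += 1
            extension = os.path.splitext(name)[1].lower() or "(none)"
            extension_counts[extension] += 1
    return total_bytes, file_count, extension_counts
```

The three values could then be copied onto the transfer list by hand, alongside the screenshots of the directory structure.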
Transferring records to removable media
The first site visit and survey should have helped the archivist to determine which records are to be captured, how they will be captured and how long the transfer process will take. This should be agreed with the creator and the archivist should now be prepared to make the first accession; for this they will need:
- A laptop with checksum software, encryption software, anti-virus software, CD-ROM reader/USB port.
- Removable media on which to store the records. The type of media should be agreed in advance with the depositor and will depend on the hardware available and the quantity of records to be captured. It might be a CD-ROM or USB stick, although in light of concerns about the security of data expressed by project participants, the Paradigm archivists ultimately opted to use a biometrically protected, USB-powered external hard disk. It can be useful to take a selection of different media.
- A copy of the deposit agreement.
- A supply of transfer lists.
At the premises where the records are to be copied, the archivist should follow these steps:
- Where possible, records should be packaged together with their existing directory structures in a single archive file that uses lossless compression, such as a gzip-compressed tar file or a zip file. This helps to maintain original directory structures, speeds up transfer and requires less storage space.
- These compressed files should be transferred to the removable medium.
- The removable medium should be inserted into the digital archivist's laptop and, using the MD5 checksum software pre-installed on the laptop, a checksum of each compressed file should be generated and recorded on the transfer list. An authenticity check is carried out when the records reach the accessions archive as part of a general 'health check' on the material, by repeating the checksum calculation and comparing the result with the checksum(s) recorded at the transfer stage.
- The transfer list should be completed and signed by the digital archivist and the depositor or their representative.
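The packaging and checksum steps above can be sketched in Python using only the standard library. This is an illustrative sketch under stated assumptions rather than the software Paradigm used: the function names `package_directory`, `md5_checksum` and `verify_transfer` are hypothetical, and a gzip-compressed tar file is assumed as the archive format.

```python
import hashlib
import os
import tarfile


def package_directory(source_dir, archive_path):
    """Write source_dir, with its subdirectories, into a gzip-compressed tar file."""
    # Keep the top-level folder name so the original structure is visible on extraction.
    top_level = os.path.basename(os.path.normpath(source_dir))
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(source_dir, arcname=top_level)
    return archive_path


def md5_checksum(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(path, recorded_checksum):
    """Repeat the checksum calculation and compare it with the recorded value."""
    return md5_checksum(path) == recorded_checksum.strip().lower()
```

In this sketch, `md5_checksum` would be run on the compressed file once it is on the removable medium and the result written on the transfer list; `verify_transfer` would then be run at the accessions archive against the recorded value. A result of `False` indicates that the file differs from the one checksummed at transfer.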
If the archivist is able to use a USB mass storage device, then it is possible to add a toolkit of compression, checksumming, screen-capture and encryption software to this device and to run these tools while the device is connected to the creator's computer, rather than creating checksums and encrypting files using the archivist's laptop. Archivists should also explore digital forensics tools, which are designed for investigators to capture digital material onsite that can subsequently be used as evidence, as these have interfaces for the extraction of materials and can automatically create checksums for each file.
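To illustrate what a per-file checksum listing involves, the following Python sketch builds a manifest for a directory tree and compares two such manifests. It is a simplified illustration only and does not describe how any particular forensics tool works; the function names `file_checksums` and `changed_files` are hypothetical, and MD5 is assumed for consistency with the rest of the protocol.

```python
import hashlib
import os


def file_checksums(root):
    """Map each file's path, relative to root, to its MD5 hex digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            digest = hashlib.md5()
            with open(path, "rb") as handle:
                # Read in 1 MiB chunks so large files do not exhaust memory.
                for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                    digest.update(chunk)
            manifest[os.path.relpath(path, root)] = digest.hexdigest()
    return manifest


def changed_files(before, after):
    """Return relative paths that are missing, added or altered between two manifests."""
    return sorted(
        path for path in set(before) | set(after)
        if before.get(path) != after.get(path)
    )
```

A manifest taken on the creator's computer and another taken after extraction at the accessions archive could be compared with `changed_files`; an empty list means every file matched.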