EAD templates for a personal archive

Suggested EAD elements required at Fonds level

The importance of the collection-level description is indicated elsewhere in this chapter. Here, we focus on those elements Paradigm considers essential at fonds level when cataloguing a digital or hybrid archive. The <eadheader> and <frontmatter> elements are omitted here because they relate to metadata about the finding aid itself and the encoding of prefatory text and title page, so will be repository-specific. The focus is therefore on the information included in the <archdesc>.

Elements in the <did>

ID of the Unit <unitid> A unique reference code forming the basis of the shelfmark for each component level description. This should consist of: a country code based on ISO 3166 Codes for the Representation of Names of Countries; a repository code in accordance with the national repository code standard or other unique location identifier; and a specific local reference code or shelfmark.
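The three-part reference code can be sketched in code. The following is a minimal Python sketch, not Paradigm's own tooling. It assumes EAD 2002, where the country and repository codes are carried in the countrycode and repositorycode attributes of <unitid>. The function name build_unitid, the repository code '9999' and the shelfmark 'MS. Example 1' are all invented for illustration.

```python
import xml.etree.ElementTree as ET


def build_unitid(country_code: str, repository_code: str, local_ref: str) -> ET.Element:
    """Return a <unitid> element combining the three parts of the reference code."""
    unitid = ET.Element(
        "unitid",
        {
            "countrycode": country_code,        # ISO 3166 two-letter country code
            "repositorycode": repository_code,  # national repository code
        },
    )
    unitid.text = local_ref  # local reference code or shelfmark
    return unitid


# Hypothetical values, for illustration only
unitid = build_unitid("GB", "9999", "MS. Example 1")
unitid_xml = ET.tostring(unitid, encoding="unicode")
print(unitid_xml)
```

Keeping the country and repository codes in attributes leaves the element content free for the shelfmark that researchers will actually quote.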

Title of the Unit <unittitle> A title for the archive. Traditionally for an individual, the term 'Papers of [name]' is used; in the digital environment this should perhaps be replaced by 'Archive of [name]'.

Date of the Unit <unitdate> Covering dates for the whole archive. Where the information is obtainable (via the preservation metadata stored as part of the AIP), these should be span dates running from the initial creation date of the earliest item in the archive to the last-modified date of the latest item. At collection level this will be a span of years rather than anything more specific. The element content will give the dates to researchers in traditional form, while the 'normal' attribute will record them in accordance with ISO 8601 Representation of Dates and Times. Embedding this information as an attribute facilitates information retrieval queries based on dates, and both EAD and PREMIS recommend using ISO 8601, e.g.:

Code sample
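As an illustration of the span-date encoding described above, the following minimal Python sketch derives a year span from two timestamps of the kind held in preservation metadata. It assumes EAD 2002, which names the normalising attribute 'normal' and allows type="inclusive" on <unitdate>. The function name build_unitdate and the sample dates are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime


def build_unitdate(earliest_created: datetime, latest_modified: datetime) -> ET.Element:
    """Return a <unitdate> with a readable year span and an ISO 8601 'normal' value."""
    start, end = earliest_created.year, latest_modified.year
    if start > end:
        raise ValueError("earliest creation date falls after latest modification date")
    if start == end:
        display, normal = f"{start:04d}", f"{start:04d}"
    else:
        display = f"{start:04d}-{end:04d}"  # traditional form for researchers
        normal = f"{start:04d}/{end:04d}"   # ISO 8601 interval, years only
    unitdate = ET.Element("unitdate", {"type": "inclusive", "normal": normal})
    unitdate.text = display
    return unitdate


# Hypothetical dates, for illustration only
unitdate = build_unitdate(datetime(1997, 5, 2), datetime(2005, 11, 30))
print(ET.tostring(unitdate, encoding="unicode"))
```

Only the years are kept because the text calls for a span of years at collection level. Lower-level descriptions could carry fuller ISO 8601 dates.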

Origination <origination> The name(s) of the individual(s) responsible for the creation, accumulation, or assembly of the archive prior to its accession. Traditionally this is limited to the principal creator. However, in the case of politicians, a number of office staff might also be involved in generating a substantial proportion of the archive, so multiple names may need to be recorded here. Names should be supplied in accordance with the NCA Rules (the National Council on Archives' Rules for the Construction of Personal, Place and Corporate Names) to facilitate information retrieval; a register of authority files should be maintained and the subelements <persname> and <corpname> should be used to encode the names.
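A minimal Python sketch of an <origination> element holding more than one name follows. It assumes EAD 2002, where <persname> and <corpname> accept a 'rules' attribute. The value 'ncarules', the function name build_origination and both names are illustrative assumptions rather than values taken from Paradigm.

```python
import xml.etree.ElementTree as ET


def build_origination(persons, corporate_bodies, rules="ncarules"):
    """Return an <origination> element with one subelement per creator name."""
    origination = ET.Element("origination")
    for name in persons:
        # Personal names, already formulated according to the chosen rules
        ET.SubElement(origination, "persname", {"rules": rules}).text = name
    for name in corporate_bodies:
        # Corporate names, e.g. an office that generated part of the archive
        ET.SubElement(origination, "corpname", {"rules": rules}).text = name
    return origination


# Hypothetical names, for illustration only
origination = build_origination(
    persons=["Example, Jane, b 1950, politician"],
    corporate_bodies=["Office of Jane Example MP"],
)
print(ET.tostring(origination, encoding="unicode"))
```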

Physical Description <physdesc><extent> For digital materials, Paradigm's Academic Advisory Board suggested that researchers would be interested in both the 'intellectual' extent of the archive (i.e. the number of series, folders and files) and its size in megabytes, with the former taking priority over the latter. Note that the preservation metadata record is likely to record file sizes in bytes; the EAD catalogue should use megabytes, as a more understandable indication of size for the researcher. For a hybrid archive, information should also be supplied here about the extent of the hard copy material, with an indication of the relative proportions of digital to hard copy in the archive as a whole.
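The conversion from bytes to megabytes can be sketched as follows. This is a minimal Python sketch under stated assumptions: one megabyte is taken to be 1,000,000 bytes (a repository may prefer 1,048,576), the intellectual extent is listed first to reflect its priority, and the function name build_physdesc and the sample figures are hypothetical.

```python
import xml.etree.ElementTree as ET

BYTES_PER_MEGABYTE = 1_000_000  # assumption: decimal megabytes


def build_physdesc(series: int, folders: int, files: int, total_bytes: int) -> ET.Element:
    """Return a <physdesc> with intellectual extent first, then size in megabytes."""
    physdesc = ET.Element("physdesc")
    intellectual = ET.SubElement(physdesc, "extent")
    intellectual.text = f"{series} series, {folders} folders, {files} files"
    size = ET.SubElement(physdesc, "extent")
    # Preservation metadata holds bytes; the catalogue shows megabytes
    size.text = f"{total_bytes / BYTES_PER_MEGABYTE:.1f} MB"
    return physdesc


# Hypothetical figures, for illustration only
physdesc = build_physdesc(series=4, folders=120, files=3500, total_bytes=52_428_800)
print(ET.tostring(physdesc, encoding="unicode"))
```

For a hybrid archive, a further <extent> for the hard copy material could be appended in the same way.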

<materialspec> This element is used to record data unique to a particular class or form of material which is not assigned to any other element of description. This is probably the most appropriate element in which to record information about the creator's original file formats. At fonds level this should comprise a broad overview, e.g.

Code sample
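As an illustration of a fonds-level overview, the following minimal Python sketch summarises a list of format names into a single <materialspec> statement, most frequent format first. The function name build_materialspec, the wording of the statement and the sample format names are hypothetical; in practice the format names would come from the preservation metadata.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def build_materialspec(format_names) -> ET.Element:
    """Return a <materialspec> listing distinct formats, most frequent first."""
    counts = Counter(format_names)
    ordered = [name for name, _ in counts.most_common()]
    materialspec = ET.Element("materialspec")
    materialspec.text = "File formats include: " + ", ".join(ordered) + "."
    return materialspec


# Hypothetical format names, one entry per file, for illustration only
formats = ["Microsoft Word", "Microsoft Word", "Microsoft Word", "PDF", "PDF", "JPEG"]
materialspec = build_materialspec(formats)
print(ET.tostring(materialspec, encoding="unicode"))
```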

It is suggested that <materialspec> is used in preference to <genreform> for supplying free-text information on file formats; if <genreform> is used at all, it should only be used in the controlled access section.

Language of the Material <langmaterial> Record the language(s) of the material in the archive. If only one language is represented in the material, this information need only be supplied at collection level. If an archive is predominantly in one language but with a proportion of material in another, this should be noted at collection level; <langmaterial> can then be used at lower levels to indicate where the second language is represented.

Digital Archival Object <dao> If desired, a link can be made to the METS document which represents the 'collection' in the digital repository and points to the child folders and files of the archive.
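A minimal Python sketch of such a link follows. It assumes the EAD 2002 DTD, in which <dao> carries plain 'href' and 'title' attributes (the schema version of EAD 2002 places these in the XLink namespace instead). The function name build_dao and the URL are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_dao(mets_url: str, title: str) -> ET.Element:
    """Return a <dao> element pointing at a collection-level METS document."""
    return ET.Element("dao", {"href": mets_url, "title": title})


# Hypothetical repository URL, for illustration only
dao = build_dao(
    "https://repository.example.org/mets/collection-1.xml",
    "METS document for the collection",
)
print(ET.tostring(dao, encoding="unicode"))
```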

Other elements

Physical Characteristics and Technical Requirements <phystech> Used to describe physical conditions and characteristics which affect the storage, preservation or use of the archive, including the physical composition of the material and the hardware and software required to preserve and access records held in electronic formats. The archivist may need to record information here which relates to the hard copy component of a hybrid archive, but for the digital component these issues are managed for readers by the digital repository and fully documented as part of a digital object's preservation metadata. Detailed information relating to the form of the digital material therefore need not be included in the EAD catalogue, though a chronology of the creator's hardware and software environments, which could be referred to from lower level descriptions, may be useful. Information about the repository's preservation policy (which will evolve) should not be included, but researchers might be referred to such information via a URL held in this element.

Biography or History <bioghist> Paradigm's Academic Advisory Board emphasised the usefulness of biographical information about an archive's creator(s), and this contextual information has always been considered important by archivists. Where the principal creator (i.e. the politician) has an entry in a standard published source (e.g. the Dictionary of National Biography, Who's Who or Who Was Who), researchers can be referred to this. The archivist should, however, provide additional information gleaned from the archive itself, or emphasise particular activities or achievements where these are well-represented in the archive. If creating a biographical account for a politician from scratch, it might include references to: dates; education and early life; political career, including positions held, any ministerial roles, groups or committees involved with, constituency represented etc. The archivist should also supply information about the administrative structures, staffing, and functions of the politician's Westminster and constituency offices where relevant.

Custodial History <custodhist> Depending on which approach to collection development is taken (see Chapter 02 Collection development), the chain of custody of a digital or hybrid archive may be a complex one, especially in the case of politicians, whose various staff members also generate records. Digital provenance will be recorded fully in the preservation metadata associated with the digital objects making up an archive. At collection level this should be summarised, with an emphasis on the record creators and pre-accession provenance rather than on the detail of authenticity checks or migrations. Custodial information about the paper and digital components of a hybrid archive should also be pulled together here.

Acquisition Information <acquinfo> This element is used to record the immediate source of acquisition of the archive (the politician) and the terms under which it is held by the repository. It may be helpful to provide more detailed information here about the number of accessions; the Paradigm Academic Advisory Board felt it was important for users to have information about accession dates.

Scope and Content <scopecontent> An overview of the content of the archive, including reference to: significant individuals, organisations, events and activities represented; range of the material, in terms of geography, subject, timespan; record types and their creators; record keeping practices at politicians' offices; and an indication of research potential. This may include comments on what isn't included; e.g. in one of Paradigm's exemplar archives there were no presentation slides or texts of speeches because the politician concerned tends to speak off the cuff, from brief factual bullet points.

Appraisal Information <appraisal> This element should record all appraisal decisions and actions, and the rationale on which they are based. See Chapter 04 Appraisal and disposal for a detailed discussion of appraisal in relation to digital and hybrid personal archives.

Arrangement <arrangement> This element is used to record information on how the archivist has arranged the material. See the section on templates for arrangement for examples of how a digital or hybrid archive might be arranged.

Preferred Citation <prefercite> In the context of a hybrid or digital archive, it is important that researchers know how to cite the material in their work. Citation should be based on shelfmark rather than on any digital identifier generated by the repository software.

Conditions Governing Access <accessrestrict> This element is used to record conditions affecting access to the archive material by users. This is very important in a digital environment, where a range of legislation can affect whether or not records may be opened to the public. See Chapter 09 Legal issues for a detailed discussion of relevant legislation.

In the case of the Paradigm testbed material, exemplar archives were closed to readers under agreement with the depositors. Where a collection is closed, the closure period and reasons should be stated here. However, in the case of digital and hybrid personal archives more generally, it may be decided over time to open parts of a collection before the full copyright duration has expired under certain conditions; paper elements of the archive may also be made available. Feasibly, then, a single hybrid archive may contain material subject to a range of different restrictions, e.g.:

This has the potential to confuse researchers, and any access restrictions should be clearly outlined at collection level.

The METS documents for an individual digital object may also hold IPR metadata specific to conditions governing access.

Conditions Governing Use <userestrict> This element relates to the material in an archive which is listed as open to researchers (whether online or in a searchroom) in <accessrestrict>. It describes restrictions on a researcher's reuse of the information for the purposes of quotation, publication or reproduction. These may be imposed by the repository, by the donor or depositor, or by national and international statutes.

It should identify the principal copyright holder in the material (usually the donor or depositor). Given that there will be multiple copyright holders in any one archive, a general statement should also be provided advising researchers that it is their responsibility to seek the copyright holder's permission before the material can be reproduced or published. If the digital repository has an institutional 'take down policy' it should also be referred to here.

A statement should be included about the need for users to sign a copyright declaration form before viewing or reusing certain categories of material. See Chapter 09 Legal issues for an explanation of fair and lawful use and related issues.

The METS documents for an individual digital object may also hold IPR metadata specific to conditions governing use.

Other Finding Aid <otherfindaid> This may be a useful element in which to explain that each digital object in the archive has an associated METS document in the digital repository which provides a limited quantity of descriptive metadata along with other relevant metadata.

Controlled Access Headings <controlaccess> Researchers may undertake full-text searches of catalogue records quite easily (e.g. by using the Find command in their web browser). However, searches carried out like this are indiscriminate and operate at all levels of description, meaning a searcher may be overwhelmed with hits, many of which are irrelevant.

<controlaccess> facilitates searching by acting as a wrapper for key access points. Entries should be authority-controlled to ensure that standardised and authoritative versions of the terms are used.

There are ten possible <controlaccess> subelements. Paradigm's Academic Advisory Board suggested that researchers would want to browse on terms like subject, place and creator, and Paradigm therefore recommends using only the following:

Each element used should include a 'source' attribute to indicate the source of a controlled vocabulary term or the rules that were used to formulate it. See above on Indexing and Authority Files for information about index terms, rules and thesauri.
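The requirement that every access point carries a 'source' attribute can be sketched as follows. This is a minimal Python sketch, not a definitive implementation. The set of ten subelement names is taken from EAD 2002 and is assumed to be the ten referred to above. The function name build_controlaccess, the sample terms and the source code 'lcsh' are illustrative assumptions, and the sample does not reproduce Paradigm's recommended subset.

```python
import xml.etree.ElementTree as ET

# Access-point subelements of <controlaccess> in EAD 2002 (assumed list)
CONTROLACCESS_SUBELEMENTS = {
    "corpname", "famname", "function", "genreform", "geogname",
    "name", "occupation", "persname", "subject", "title",
}


def build_controlaccess(entries) -> ET.Element:
    """Return a <controlaccess> wrapper from (tag, term, source) tuples."""
    controlaccess = ET.Element("controlaccess")
    for tag, term, source in entries:
        if tag not in CONTROLACCESS_SUBELEMENTS:
            raise ValueError(f"not a controlaccess subelement: {tag}")
        if not source:
            raise ValueError(f"missing source for term: {term}")
        ET.SubElement(controlaccess, tag, {"source": source}).text = term
    return controlaccess


# Hypothetical index terms, for illustration only
controlaccess = build_controlaccess([
    ("subject", "Elections", "lcsh"),
    ("geogname", "Oxford (England)", "lcsh"),
])
print(ET.tostring(controlaccess, encoding="unicode"))
```

Rejecting entries without a source at the point of encoding is one way to keep uncontrolled terms out of the access points.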