Arranging and cataloguing websites

Suggested elements for use at c02/3 (series) level

Elements in the <did>

<unitid> Shelfmark for the series (individual snapshots will be demarcated by splitters)
<unittitle> Title supplied by the archivist, e.g. 'Series of website snapshots'
<unitdate> Span date: from the date the website was created to the date of the last snapshot. Record the normalised form of the date in the normal attribute, in accordance with ISO 8601.
<physdesc><extent> Supply the number of items, i.e. individual snapshots. Supply the overall size of the series in MB.
<materialspec> Overview of file formats represented, e.g. static html, css, javascript, JPEG images, etc.
<dao> If desired, a link can be made to the METS document which represents the series of website snapshots in the digital repository.
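
As an illustration of how the <did> elements above fit together, here is a minimal Python sketch that assembles a series-level <did> from a list of snapshots. The function name build_series_did, the shelfmark 'MS. Example 1', the dates and the file sizes are all hypothetical, and the sketch assumes 1 MB = 1,000,000 bytes; adjust to local practice.

```python
from datetime import date
import xml.etree.ElementTree as ET


def build_series_did(shelfmark, created, snapshots):
    """Build a series-level <did> (hypothetical helper).

    snapshots is a list of (snapshot_date, size_in_bytes) pairs.
    """
    last = max(d for d, _ in snapshots)
    # Assumption: 1 MB = 1,000,000 bytes.
    total_mb = sum(size for _, size in snapshots) / 1_000_000

    did = ET.Element("did")
    ET.SubElement(did, "unitid").text = shelfmark
    ET.SubElement(did, "unittitle").text = "Series of website snapshots"

    unitdate = ET.SubElement(did, "unitdate")
    # ISO 8601 interval: date the site was created / date of the last snapshot.
    unitdate.set("normal", f"{created.isoformat()}/{last.isoformat()}")
    if created.year == last.year:
        unitdate.text = str(created.year)
    else:
        unitdate.text = f"{created.year}-{last.year}"

    physdesc = ET.SubElement(did, "physdesc")
    ET.SubElement(physdesc, "extent").text = f"{len(snapshots)} snapshots"
    ET.SubElement(physdesc, "extent").text = f"{total_mb:.1f} MB"
    return did


# Invented sample data, for illustration only.
snapshots = [
    (date(2006, 3, 1), 4_200_000),
    (date(2006, 9, 1), 4_650_000),
    (date(2007, 3, 1), 5_150_000),
]
did = build_series_did("MS. Example 1", date(2004, 5, 17), snapshots)
print(ET.tostring(did, encoding="unicode"))
```

The <materialspec> and <dao> elements are left out of the sketch because their content depends on the formats present and on the repository's METS arrangements.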

Other elements

<phystech> Where known, indicate the software used by the site's author(s). This should also explain any loss of functionality or broken links to external sites in the archived version of the site, as well as any differences in look and feel from the original.
<scopecontent> General overview, including: history of the site; its author and any other individuals involved in supplying content, where the information is available; when it was first created; its general structure and content (e.g. the site includes homepage, biography of MP, latest news, constituency reports, etc.); an indication of its research potential; frequency of snapshots. Also record snapshot depth: whether the entire website is captured in every snapshot, or whether snapshots are limited to the homepage.
<appraisal> This might be used to record the rationale behind the frequency of snapshots, e.g. to record that where identical snapshots have occurred (i.e. snapshots made on different dates with no new material added during the intervening period), the duplicates have been omitted from the web archive.
<arrangement> A note on the arrangement: generally a chronological series of dated snapshots.
<userestrict> Use to outline the copyright status of the material. This is particularly important in relation to websites which are classed as published material. The copyright holder should be identified and fair dealing provisions outlined. Also state that the repository has sought permission from the copyright holder to make the site available to researchers. It may be a good idea to have a copyright disclaimer too (in case anyone represented in the site objects to their inclusion in the archive).
<controlaccess> Use subject indexing at a very general level here and identify significant individuals associated with the website. Much more detailed indexing should take place at the item level.
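
The <appraisal> note above mentions omitting identical snapshots. One possible way to identify them, sketched below in Python, is to compare a checksum of each snapshot with that of the last snapshot kept. This is an assumption rather than a described procedure: the function names, the in-memory representation of a snapshot as a mapping of file paths to contents, and the sample data are all hypothetical. Byte-for-byte comparison may also be too strict in practice, since captured pages can differ trivially (e.g. embedded timestamps) without any new material having been added.

```python
import hashlib


def snapshot_digest(files):
    """Return one SHA-256 digest for a snapshot given as {path: content_bytes}."""
    h = hashlib.sha256()
    for path in sorted(files):  # sort so file order does not affect the result
        h.update(path.encode("utf-8"))
        h.update(b"\0")
        h.update(hashlib.sha256(files[path]).digest())
    return h.hexdigest()


def drop_identical_snapshots(snapshots):
    """Split chronological (iso_date, files) pairs into kept and omitted.

    A snapshot is omitted when it is identical to the last snapshot kept.
    Returns (kept_snapshots, omitted_dates).
    """
    kept, omitted = [], []
    previous = None
    for day, files in snapshots:
        digest = snapshot_digest(files)
        if digest == previous:
            omitted.append(day)
        else:
            kept.append((day, files))
            previous = digest
    return kept, omitted


# Invented sample data, for illustration only.
snapshots = [
    ("2006-03-01", {"index.html": b"<h1>News</h1>"}),
    ("2006-06-01", {"index.html": b"<h1>News</h1>"}),
    ("2006-09-01", {"index.html": b"<h1>News, updated</h1>"}),
]
kept, omitted = drop_identical_snapshots(snapshots)
print([day for day, _ in kept])
print(omitted)
```

The list of omitted dates could then inform the wording of the <appraisal> note.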