Arranging and cataloguing emails
Suggested elements for use at c04 (item) level
Only the most significant individual emails are likely to be catalogued at item level. However, brief catalogue descriptions can often be produced from metadata that an extraction tool captures automatically: most email header information is easy to extract and is likely to be held in the DIPs for individual emails. If repositories intend to provide EAD item-level descriptions for email, the following elements are recommended:
Elements in the <did>
<unitid> Shelfmark
<unittitle> Subject line of the email. If the sender did not supply a subject line, the archivist may (depending on the bulk of material concerned) supply an appropriate title in square brackets, and this convention should be made clear to users; in practice the field may simply be left blank, reflecting the original. Subject lines can also be misleading: rather than searching for and retyping an email address, many correspondents find the first available message from the desired correspondent and click reply without changing the subject, so the subject line may bear no relation to the content of the message. Archivists are unlikely to have time to record such discrepancies, and they will be left for the researcher to discover.
Two <unitdate> elements will be required:
- <unitdate> Date and time sent; the date format should be granular - to the minute level - to reflect the pace of email exchange.
- <unitdate> Date and time received; this should also be recorded to the minute.
<origination> The sender of the email, with their email address.
<physdesc><extent> This is likely to be the size in bytes, as extracted automatically. If cataloguing manually, the archivist should also include the ‘intellectual’ extent, e.g. 3 pieces: email message and two attachments.
<materialspec> Use to indicate the file format of any attachments. It is probably unnecessary to include any information about the email message itself here.
<dao> If desired, a link can be made to the METS document which represents the individual email in the digital repository.
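As an illustration of how the <did> elements above might be filled from header metadata, the following is a minimal sketch using Python's standard `email` and `xml.etree.ElementTree` modules. It is not a Paradigm tool. The sample message, the shelfmark, the function names and the use of the `datechar` attribute to tell the two <unitdate> elements apart are all assumptions made for the example; it also assumes that the topmost `Received` header carries the delivery time and that a `Date` header is present.

```python
from email import message_from_string
from email.policy import default
from email.utils import parsedate_to_datetime
import xml.etree.ElementTree as ET

# Hypothetical sample message; all names and addresses are invented.
RAW = (
    "Received: from mail.example.org by mx.example.org; "
    "Tue, 03 Apr 2007 14:06:10 +0100\n"
    "From: Alice Example <alice@example.org>\n"
    "To: Bob Example <bob@example.org>\n"
    "Subject: Draft chapter\n"
    "Date: Tue, 03 Apr 2007 14:05:33 +0100\n"
    "Content-Type: text/plain\n"
    "\n"
    "Please find my comments below.\n"
)


def to_minute(dt):
    """Format a datetime to minute granularity."""
    return dt.strftime("%Y-%m-%dT%H:%M")


def build_did(raw, shelfmark):
    """Build an EAD <did> element from a raw email (illustrative mapping only)."""
    msg = message_from_string(raw, policy=default)
    did = ET.Element("did")
    ET.SubElement(did, "unitid").text = shelfmark

    # Supply a bracketed title when the sender left the subject empty.
    subject = str(msg["Subject"] or "").strip()
    ET.SubElement(did, "unittitle").text = subject or "[No subject]"

    # Date sent comes from the Date header (assumed present).
    sent = parsedate_to_datetime(str(msg["Date"]))
    ET.SubElement(did, "unitdate", datechar="sent").text = to_minute(sent)

    # Date received: assumed to be the timestamp after the last ';'
    # in the topmost Received header.
    hops = msg.get_all("Received", [])
    if hops:
        stamp = str(hops[0]).rsplit(";", 1)[-1].strip()
        received = parsedate_to_datetime(stamp)
        ET.SubElement(did, "unitdate", datechar="received").text = to_minute(received)

    ET.SubElement(did, "origination").text = str(msg["From"])

    # Extent as size in bytes of the raw message.
    physdesc = ET.SubElement(did, "physdesc")
    ET.SubElement(physdesc, "extent").text = f"{len(raw.encode('utf-8'))} bytes"
    return did


did = build_did(RAW, "MS. Example 1")
print(ET.tostring(did, encoding="unicode"))
```

A repository would need to check any such output against its own EAD profile before use; in particular, the choice of attribute for distinguishing sent and received dates is a local decision.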
Other elements
<scopecontent>
Much of the information that may be automatically extracted will not map easily to a specific EAD element. Therefore the <scopecontent> tag may need to be used to record such metadata as:
- Recipient, with email address.
- Other primary addresses listed in the ‘To’ field of the email, with email addresses.
- If a listserv posting, name and address of listserv.
- Names and email addresses of those copied into the message (recorded in the cc and bcc fields of the email).
- Message priority: record only when a message is flagged in some way (e.g. urgent).
- Whether an automatic signature was attached.
- Any encryption information.
- List of attachments.
- Link to the email and any attachments: use <dao> for a single message, or <daogrp> with <daoloc> for an email with attachments. If possible, use <daodesc> in each case to indicate the name and format of the attachment.
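The metadata listed above can largely be read from the message headers and MIME structure. The sketch below, again using only Python's standard `email` package, gathers it into a dictionary that could then be written into <scopecontent>. The function name, the dictionary keys and the sample message are hypothetical, and the checks for priority (`Importance`), listserv postings (`List-Id`) and encryption (the top-level content type) are simplifying assumptions: real mail may signal these in other ways.

```python
from email.message import EmailMessage
from email.utils import getaddresses


def scope_metadata(msg):
    """Collect header metadata that has no obvious home in the <did>."""

    def addresses(field):
        # Return each address in a field as 'Name <address>' or a bare address.
        values = [str(v) for v in msg.get_all(field, [])]
        return [f"{name} <{addr}>" if name else addr
                for name, addr in getaddresses(values)]

    to = addresses("To")
    importance = str(msg.get("Importance", "")).lower()
    return {
        "recipient": to[0] if to else None,
        "other_to": to[1:],
        "cc": addresses("Cc"),
        "bcc": addresses("Bcc"),
        # Assumption: list postings carry a List-Id header.
        "list": str(msg["List-Id"]) if msg["List-Id"] else None,
        # Record priority only when the message is flagged.
        "priority": importance if importance not in ("", "normal") else None,
        # Assumption: encryption is visible in the top-level content type.
        "encrypted": msg.get_content_type()
        in ("multipart/encrypted", "application/pkcs7-mime"),
        "attachments": [(part.get_filename(), part.get_content_type())
                        for part in msg.iter_attachments()],
    }


# Hypothetical message with one attachment; names and addresses are invented.
sample = EmailMessage()
sample["From"] = "Alice Example <alice@example.org>"
sample["To"] = "Bob Example <bob@example.org>, carol@example.org"
sample["Cc"] = "Dan Example <dan@example.org>"
sample["Subject"] = "Draft chapter"
sample["Importance"] = "high"
sample.set_content("Comments attached.")
sample.add_attachment(b"%PDF-1.4 sample", maintype="application",
                      subtype="pdf", filename="chapter.pdf")

for key, value in scope_metadata(sample).items():
    print(f"{key}: {value}")
```

Whether an automatic signature was attached is left out of the sketch, since detecting one reliably requires inspecting the message body rather than the headers.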
Paradigm recommends that repeated text (in the form of message strings, that is, earlier messages quoted within a reply) should be retained as part of the record, since this text can contain important contextual links and relationships. However, the catalogue entry for the email should reflect only the latest transaction.
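One way to describe only the latest transaction while keeping the full message intact is to separate new text from quoted text when preparing the catalogue entry. The following is a rough sketch, assuming that quoted lines begin with the conventional ">" marker; many mail clients quote differently, so this is a heuristic and not a reliable rule. The function name is hypothetical, and the stored record itself is not altered.

```python
def split_quoted(body: str) -> tuple[list[str], list[str]]:
    """Split a message body into new lines and quoted lines.

    Assumption: quoted text is marked with a leading '>' on each line.
    """
    new, quoted = [], []
    for line in body.splitlines():
        if line.lstrip().startswith(">"):
            quoted.append(line)
        else:
            new.append(line)
    return new, quoted


# Hypothetical reply containing a quoted earlier message.
body = "Thanks, agreed.\n\n> Shall we meet on Friday?\n> Bob\n"
new_text, quoted_text = split_quoted(body)
print("Latest transaction:", new_text)
print("Retained context:", quoted_text)
```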