A starter kit providing sample DACS-compliant EAD3 files along with detailed description of element use.

kerstarno/ead3-toolkit

EAD3 Starter Kit

Prepared by the EAD Roundtable Steering Committee (now EAS Section Steering Committee), Society of American Archivists

2016-07-19

This work is licensed under a Creative Commons Attribution 4.0 International License.

Context and Usage

The EAS Section of the Society of American Archivists promotes the implementation and use of encoding standards for dissemination of archival information. With the release of the Encoded Archival Description schema, Version EAD3 in 2015, the Section aims to provide tools and information to promote and facilitate its use among the archival community. This EAD3 Starter Kit was created to facilitate awareness of and support implementation of the encoding standard.

The EAD3 Starter Kit comprises this document, along with a set of sample EAD3 files that comply with Describing Archives: A Content Standard (DACS) recommendations. DACS is a content standard developed and published by the Society of American Archivists. It is the US implementation of a content standard for the General International Standard Archival Description, ISAD(G) (and for ISAAR, which is not as relevant to EAD3). The starter kit uses DACS for recommendations of element use. For example, DACS recommends providing information about the origination of materials. DACS classifies content elements as Required, Required (If Known), Optimum, and Added Value. The "minimum" files meet the Required recommendations (including "if known"), and the "optimum" files add the Optimum content recommendations. Those not using DACS may still find this starter kit helpful, and the group would appreciate any additional mappings to international content standards.

The starter toolkit is intended to be used as a quick-start study guide by anyone interested in learning or implementing EAD3. From a quick review of the files, you can get a sense of what bare-minimum DACS-compliant standardized EAD3 outputs look like. The files also showcase a number of new EAD3 elements that supersede EAD Version 2002 elements.

The files can also be used and adapted as templates within local encoding contexts.

The EAD3 Starter Kit is intended to supplement other extant resources, which provide an overview of the EAD3 schema and a detailed enumeration of each element. We recommend consulting the following resources for more information:

The Sample Files

As each archival collection is unique in its nature and has its own unique encoding requirements, we created three different sample EAD3 files, available on GitHub at https://github.com/saa-ead-roundtable/ead3-toolkit.

In creating these files, we kept the EAD3 encoding to a baseline minimum. In determining which elements to include in the examples, we used the following general parameters:

  • Include elements required for a valid EAD3 XML instance.
  • Include elements that are required by DACS for single-level through multilevel descriptions.
  • Include elements that have been broadly established as mandatory or required by statewide and regional EAD aggregators, as reflected in their specifications and "best practices & guidelines."

We did not include encodings that are highly specific to local implementations (e.g., those that support particular display needs and are tied to specific stylesheets), nor encodings that facilitate transformations of EAD3 files for particular systems. Examples of what we left out include the @label and @encodinganalog attributes and the <head> element.

Validating your EAD File

The EAD3 standard was published in three forms: as a Document Type Definition (DTD), as an XML Schema Definition (XSD), and in RELAX NG (RNG). The Preface to the EAD Tag Library explains the Technical Subcommittee’s choices as well as the limitations of the DTD (in brief, that it cannot support the additional namespaces one might choose to embed in <objectxmlwrap>). Other than this one limit on the DTD, your file should validate with all of them. We recommend the XSD or RELAX NG, but your system may require a DTD.

We also recommend that you store a local or hosted copy of the standard in the schema language of your choice. This allows you to continue validating even if a remote copy goes offline, and it keeps your instance self-contained. The following examples assume that ead3.rng, ead3.xsd, or ead3.dtd is stored in the same directory as the XML file you are validating, to avoid faux URLs such as http://yourlocalsite.org/wherever/you/store/it or ../wherever/you/store/it. When using these examples, note each instance of the filename and make sure you have written the correct absolute or relative path to your copy of the file. The XML declaration (version and encoding) is shown in each example to indicate where the validation line belongs.

Validating with RNG

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="ead3.rng" type="application/xml" schematypens="http://relaxng.org/ns/structure/1.0"?>

If using Oxygen, one may also add the line:

<?oxygen RNGSchema="ead3.rng" type="xml"?>

Validating with XSD

<?xml version="1.0" encoding="UTF-8"?>
<ead xmlns="http://ead3.archivists.org/schema/"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://ead3.archivists.org/schema/ ead3.xsd">

Validating with the DTD

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ead PUBLIC "+// http://ead3.archivists.org/schema/ //DTD ead3 (Encoded Archival Description (EAD) Version 3)//EN" "ead3.dtd">

An Encoding Walkthrough

Single-level Minimum

Sample file: ead3_single_level_minimum.xml

The Single-level Minimum sample file illustrates materials described at only one level — whether at the collection, series or any other single level. It consists of the following baseline EAD3 elements and includes elements that comply with DACS' requirements for single-level descriptions but are technically optional within the EAD3 schema. Linked DACS references are used only for elements required for compliance with DACS.

EAD3 element Starting line number in sample file Comments DACS reference
<ead> 3 This is the required root element of an EAD file and must contain <control> followed by <archdesc>. We recommend keeping a local copy of your preferred schema format (DTD, XSD, or RNG) in case the server that hosts the schema goes offline, as has happened with the EAD 2002 schema. Because validation methods vary by the schema format chosen, none is included here.
<control> 4 A new element in EAD3 and the first of <ead>’s two required child elements. It replaces <eadheader> and aligns more closely with EAC-CPF, with a focus on data and on the sources of information and standards used. <control> is used for recording bibliographic and administrative information about an EAD file. It must contain the following elements: <recordid>, <filedesc>, <maintenanceagency>, <maintenancestatus> and <maintenancehistory>.
<recordid> 5 This is a new element in EAD3 that designates a unique identifier for the EAD file. The template uses an optional @instanceurl attribute to record the URL of the EAD XML file.
<filedesc> 6 A child element of <control> that records bibliographic information about an EAD file such as description of the finding aid, author, title, subtitle, sponsor, edition, publisher, publishing series, and related notes. It must include a <titlestmt>.
<titlestmt> 7 A child element of <filedesc> that groups together information about the name of an encoded finding aid and those responsible for its content. The template uses a required child element <titleproper> to record the title of a finding aid or finding aid series.
<publicationstmt> 10 A child element of <filedesc> that provides information concerning the publication or distribution of the EAD instance.
<maintenancestatus> 20 A child element of <control> that records the current version status of the EAD file. The current status must always be recorded in the required attribute @value, which is limited to revised, deleted, new, deletedsplit, deletedmerged, deletedreplaced, cancelled, or derived.
<maintenanceagency> 21 A child element of <control> that records information about the institution or service responsible for the creation, maintenance, and/or dissemination of the EAD file. This template uses an optional attribute @countrycode that specifies a unique code for the country in which the materials being described are held. The recommended source for country codes is the ISO 3166-1 Codes for the Representation of Names of Countries, column Alpha-2. <maintenanceagency> must also include a child <agencyname> that records the name of the institution or service. This template also uses an optional element <agencycode> that provides a code for the institution or service responsible for the creation, maintenance, or dissemination of the EAD file. As a best practice, use the format of the International Standard Identifier for Libraries and Related Organizations (ISO 15511). The code is composed of a prefix, a dash, and an identifier. For specific codes, see the Library of Congress’s code search.
<languagedeclaration> 25 A new element in EAD3 and a child of <control> that indicates the language and script in which an EAD instance is written.
<maintenancehistory> 29 A new element in EAD3 and a child of <control>. It is used to record the history of the creation, revision, updates, and other modifications to the EAD file. Each event is recorded in a separate <maintenanceevent> element.
<maintenanceevent> 30 A new element in EAD3 and a required and repeatable child element of <maintenancehistory>. It is used to record maintenance activities of the EAD file. It must contain <eventtype>, <eventdatetime>, <agenttype>, <agent> in that order.
<eventtype> 31 A new element in EAD3 and a required child of <maintenanceevent>. It includes required attribute @value to record the type of maintenance activity performed. The possible values are cancelled, created, deleted, revised, unknown and updated.
<eventdatetime> 32 A new element in EAD3 that must follow <eventtype>. It is a required child of <maintenanceevent>. It records the date and time of a specific maintenance action for an EAD file. Though the attribute is optional, we strongly suggest including @standarddatetime on this element to record an ISO 8601-compliant form of the date and time of the maintenance event.
<agenttype> 33 A new element of EAD3 and must follow <eventdatetime>. It is a required child of <maintenanceevent>. It includes required attribute @value to indicate the type of agent responsible for the creation, modification, or deletion of an EAD file and must be set to human, machine or unknown.
<agent> 34 A new element of EAD3 and must follow <agenttype>. It is a required child of <maintenanceevent>. It provides the name of a person, institution, or system responsible for the creation, modification, or deletion of an EAD file.
<archdesc> 44 One of only two child elements of <ead>. <archdesc> follows the <control> element and wraps together all of the archival descriptive information in an EAD file. It includes elements describing the content, context, extent, and administrative and other supplemental information that facilitates the use of the materials, which are organized in hierarchical levels. It includes a required attribute @level, which indicates the type of aggregation being described in the EAD file and must be set to one of the following values: class, collection, file, fonds, item, otherlevel, recordgrp, series, subfonds, subgrp, or subseries. <archdesc> must contain a required child <did>.
<did> 45 A required child element of <archdesc> that binds together other elements and records information about the material to be described in the EAD file. It may also occur within the component elements <c> and <c01>, <c02> ... <c12>.
<repository> 46 An optional child element of <did> that records the name of the institution, person, or family responsible for providing intellectual access to the materials being described. At least one of four name elements (<persname>, <famname>, <corpname>, or <name>) is required if this is used. This template uses <corpname> and the optional <address>. <corpname> is used to identify the organization responsible for the materials. The name of the organization is encoded within its required and repeatable child element <part>. <address> is used to bind together multiple required <addressline> child elements that provide information about the place where the repository is located and how it may be contacted. 2.2
<origination> 56 An optional child element of <did> that specifies the name of an individual, organization, or family responsible for the described materials. At least one of four name elements (<persname>, <famname>, <corpname>, or <name>) is required if this is used. This template uses the child element <persname> to identify the personal name of the collector, which is encoded within its required and repeatable child element <part>. When handling personal names, we chose to use <part> to its fullest logical extent per the Study Group on Discovery's recommendations. We then provided an example of how @localtype could be used to define each <part> element in a way that will aid in display, searching, and sorting. While @localtype's values are not standardized, developing standards for similar types of information across consortia could aid in aggregate parsing of EAD finding aids. The experimental <relations> element may also satisfy DACS 2.6, but as it is designated as experimental, we chose to recommend use of <origination>. 2.6
<unittitle> 61 An optional child element of <did> that records the specified title for the described materials. It can be used at <archdesc> level and at the subordinate <c> levels. 2.3
<unitid> 62 An optional child element of <did> that provides an identifier for the materials being described. 2.1
<unitdatestructured> 63 A new element in EAD3 and an optional child element of <did> that records machine-processable dates of the materials being described. We opted to highlight the use of this new element, in lieu of using <unitdate> with a @normal attribute (EAD3 supports the same general type of <unitdate> encoding as EAD Version 2002). Our sample file uses an optional attribute @unitdatetype that identifies the type of date expressed with possible values of bulk or inclusive. When <unitdatestructured> is used, it must contain one and only one of the following: <daterange>, <dateset>, <datesingle>. The sample file uses an optional child element <daterange> to bind together dates encoded with optional (but recommended) <fromdate> and <todate> elements. 2.4
<physdescset> 75 A new element in EAD3 and an optional child element of <did> that can be used to group two or more <physdescstructured> elements. The <physdescstructured> element quantifies the physical or logical extent of the materials being described. It must include two required attributes, @coverage and @physdescstructuredtype. @coverage, with two possible values, whole or part, specifies whether the description refers to the entire unit or only a part of the materials. @physdescstructuredtype defines the type of amount being described with the following possible values:
  • carrier: Refers to the number of containers.
  • materialtype: Indicates the type and/or number of items.
  • spaceoccupied: Describes the linear, cubic, or other space occupied by the materials.
  • otherphysdescstructuredtype: May be chosen if none of the other values are appropriate.
<physdescstructured> must include two required children: <quantity>, to indicate the number of units of the kind named in <unittype>, and <unittype>, to indicate the type of unit being quantified, such as boxes, linear feet, or cubic feet.
2.5
<langmaterial> 85 An optional child element of <did> that identifies the languages and scripts represented in the materials. This template uses a required child element <language> to record the language(s) of the materials being described. <language> also includes a strongly recommended attribute @langcode to record the code of the language, which should be taken from ISO 639-1, ISO 639-2b, ISO 639-3, or another controlled list. 4.5
<accessrestrict> 89 An optional element that records information about the conditions that affect the availability of the materials being described. 4.1
<userestrict> 92 An optional element that indicates any conditions that govern the use of the described materials. 4.4
<scopecontent> 95 An optional element that may contain information about the arrangement of the materials, the dates they cover, and the significant organizations, individuals, events, places, and subjects they represent. 3.1
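
The elements in the table above can be read together as one skeleton. The following is a minimal sketch and not a copy of ead3_single_level_minimum.xml: every name, identifier, date, URL, and code in it (including the agency code and the @localtype values) is a hypothetical placeholder, and line positions will not match the sample file. It has not been run through a validator here, so validate any adaptation against your own copy of the schema.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch of a single-level description; all values are placeholders. -->
<ead xmlns="http://ead3.archivists.org/schema/">
  <control>
    <recordid instanceurl="http://example.org/findingaids/ms-0001.xml">ms-0001</recordid>
    <filedesc>
      <titlestmt>
        <titleproper>Guide to the Jane Example papers</titleproper>
      </titlestmt>
      <publicationstmt>
        <publisher>Example University Archives</publisher>
      </publicationstmt>
    </filedesc>
    <maintenancestatus value="new"/>
    <maintenanceagency countrycode="US">
      <!-- Placeholder in prefix-dash-identifier form, not a real ISIL code -->
      <agencycode>US-ExampleU</agencycode>
      <agencyname>Example University Archives</agencyname>
    </maintenanceagency>
    <languagedeclaration>
      <language langcode="eng">English</language>
      <script scriptcode="Latn">Latin</script>
    </languagedeclaration>
    <maintenancehistory>
      <maintenanceevent>
        <!-- Required order: eventtype, eventdatetime, agenttype, agent -->
        <eventtype value="created"/>
        <eventdatetime standarddatetime="2016-07-19">July 19, 2016</eventdatetime>
        <agenttype value="human"/>
        <agent>Example Archivist</agent>
      </maintenanceevent>
    </maintenancehistory>
  </control>
  <archdesc level="collection">
    <did>
      <repository>
        <corpname>
          <part>Example University Archives</part>
        </corpname>
        <address>
          <addressline>123 Example Street</addressline>
          <addressline>Anytown, ST 00000</addressline>
        </address>
      </repository>
      <origination>
        <persname>
          <!-- localtype values are local conventions, not standardized -->
          <part localtype="surname">Example</part>
          <part localtype="forename">Jane</part>
        </persname>
      </origination>
      <unittitle>Jane Example papers</unittitle>
      <unitid>MS-0001</unitid>
      <unitdatestructured unitdatetype="inclusive">
        <daterange>
          <fromdate standarddate="1900">1900</fromdate>
          <todate standarddate="1950">1950</todate>
        </daterange>
      </unitdatestructured>
      <physdescset>
        <physdescstructured coverage="whole" physdescstructuredtype="spaceoccupied">
          <quantity>2</quantity>
          <unittype>linear feet</unittype>
        </physdescstructured>
        <physdescstructured coverage="whole" physdescstructuredtype="carrier">
          <quantity>4</quantity>
          <unittype>boxes</unittype>
        </physdescstructured>
      </physdescset>
      <langmaterial>
        <language langcode="eng">English</language>
      </langmaterial>
    </did>
    <accessrestrict>
      <p>The collection is open for research.</p>
    </accessrestrict>
    <userestrict>
      <p>Contact the repository about permission to publish.</p>
    </userestrict>
    <scopecontent>
      <p>Correspondence and diaries documenting the placeholder creator's career.</p>
    </scopecontent>
  </archdesc>
</ead>
```

The sketch carries no validation line on purpose; add the RNG, XSD, or DTD association shown under "Validating your EAD File" according to the schema format you use.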

Single-level Optimum

Sample file: ead3_single_level_optimum.xml

The Single-level Optimum sample file illustrates the elements included in the Single-level Minimum, along with the DACS-recommended Optimum element <bioghist> and the access points recommended in the DACS "Overview of Archival Description."

EAD3 element Starting line number in sample file Comments DACS reference
<bioghist> 98 An optional element that can be used to provide a concise essay or chronology highlighting the historical context of the materials. Information can be added using a series of <p> elements and/or a <chronlist>. 2.7
<controlaccess> 101 An optional element that binds together key access points (names, topics, places, functions, occupations, titles, and genre terms) that represent the context and contents of the materials being described. <controlaccess> may be used at multiple levels within a finding aid: either within <archdesc>, to provide access terms for the entirety of the materials, or within a component (<c> or <c01> through <c12>), to provide terms specific to that component. This template uses the following two optional child elements to record the access points:
  • <persname> to identify the name of a person who is related to the materials being described as a source, creator, or subject. It includes an optional attribute @source to record the source of the controlled vocabulary (e.g. "lcnaf" for the Library of Congress Name Authority File) and an optional attribute @relator to highlight the contextual relationship that the identified person has with the materials being described.
  • <subject> to identify topics associated with or covered by the described materials. It includes an optional attribute @source to record the source of the controlled vocabulary, e.g. "lcsh" for Library of Congress Subject Headings.
See DACS "Overview of Archival Description"
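
As an illustration of the two rows above, the following fragment sketches how <bioghist> and <controlaccess> might sit inside <archdesc>, after <did>. It is a sketch under assumptions, not an excerpt from ead3_single_level_optimum.xml: the person, the heading strings, and the @relator value are hypothetical placeholders, and real headings should be copied from the vocabulary named in @source.

```xml
<!-- Fragment: both elements are children of <archdesc>, placed after <did>. -->
<bioghist>
  <p>Jane Example taught school in Anytown and kept diaries throughout her career.</p>
</bioghist>
<controlaccess>
  <!-- Placeholder heading; take real name headings from the file named in @source -->
  <persname source="lcnaf" relator="creator">
    <part>Example, Jane</part>
  </persname>
  <!-- Placeholder heading; take real topical headings from the list named in @source -->
  <subject source="lcsh">
    <part>Teachers</part>
  </subject>
</controlaccess>
```

Note that in EAD3 the text of both <persname> and <subject> is wrapped in one or more <part> children rather than entered directly in the parent element.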

Multi-level Optimum

Sample file: ead3_multi_level_optimum.xml

The Multi-level Optimum sample file illustrates materials described at multiple levels beginning with large accumulations (e.g. collection level, series level). It includes the elements in the Single-level Optimum, along with the following elements to highlight the level of arrangement of archival materials. Because the finding aid being used for this example included several digital archival objects, the Steering Committee chose to include these as an example of the <dao> element, revised in EAD3. A <dao> is not a necessary part of the Multi-level Optimum description, but if a digital archival object exists, local practice may support including it.

EAD3 element Starting line number in sample file Comments DACS reference
<dsc> 119 An element which bundles information about the hierarchical groupings of the materials being described.
<c01> through <c12> 120, 134, 150, 164, 176 An optional child element of <dsc> that designates a subordinate part of the materials being described. Each <c##> identifies a logical section, or level, of the described materials. A <c##> may be further subdivided into smaller, numbered components. Note that unnumbered <c> components may be used in lieu of numbered <c##> components. Because this file is meant for human eyes rather than machine processing, we opted to use numbered <c##> components in the sample file, making it easier for viewers to immediately discern the hierarchy. The best-practice recommendation is to choose either <c> or <c##> and apply it consistently across all EAD files. This template uses an optional attribute @level (required on <archdesc> but not at lower levels) that identifies the logical type of component and takes one of these values: class, collection, file, fonds, item, otherlevel, recordgrp, series, subfonds, subseries, subgrp. See DACS "Multilevel Optimum" notes for identifying the whole-part relationship between levels of description
<container> 144, 160, 171, 186 An optional child element of <did> that indicates the container in which the materials being described are housed. This template uses an optional attribute @localtype to record the type of container, e.g., box, folder, or file.
<dao> 161, 172 An optional child element of <did> that is used for linking to born-digital records or to a digital representation of the materials being described. It requires an attribute @daotype that specifies whether the digital archival object is born digital or derived, i.e., digitized from physical holdings. This template uses an optional attribute @href that records the Uniform Resource Identifier (URI) of the digital file.
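
The component elements described above nest as in the following fragment. This is a minimal sketch and not an excerpt from ead3_multi_level_optimum.xml: the titles, container numbers, and URL are hypothetical placeholders, and the @localtype values follow a local convention rather than a controlled list.

```xml
<!-- Fragment: <dsc> is a child of <archdesc> and holds the component hierarchy. -->
<dsc>
  <c01 level="series">
    <did>
      <unittitle>Correspondence</unittitle>
    </did>
    <c02 level="file">
      <did>
        <container localtype="box">1</container>
        <container localtype="folder">3</container>
        <unittitle>Letters received</unittitle>
        <!-- daotype is required; "derived" marks a digitized copy of physical holdings -->
        <dao daotype="derived" href="http://example.org/digital/ms-0001/box1-folder3"/>
      </did>
    </c02>
  </c01>
</dsc>
```

If you prefer unnumbered components, replace <c01> and <c02> with nested <c> elements, and use that form consistently across your files.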
