Archival Technologies: Archivist’s Toolkit, XTF and CONTENTdm

By Courtney Chartier, Assistant Head, Archives Research Center, RWWL Atlanta University Center

“Working for Freedom: Documenting Civil Rights Organizations” is a collaborative project among Emory University's Manuscript, Archives and Rare Book Library, The Auburn Avenue Research Library on African American Culture and History, The Amistad Research Center at Tulane University and the Robert W. Woodruff Library of Atlanta University Center to uncover and make available previously hidden collections documenting the Civil Rights Movement in Atlanta and New Orleans. The project is administered by the Council on Library and Information Resources with funds from the Andrew W. Mellon Foundation. Each organization contributes regular blog posts about its progress.

The processing team for the Voter Education Project (VEP) Organizational Records is using Archivist’s Toolkit (AT) to create the VEP finding aid, XTF to provide search functionality, and CONTENTdm (Cdm) to store, manage and display digitized images from the collection. All of these programs were in the early stages of implementation at the start of the VEP project.

Archivist’s Toolkit is an open source archival data management system. The Archives Research Center (ARC) of the Robert W. Woodruff Library began using AT in 2007, primarily to create new finding aids in Encoded Archival Description (EAD). Currently, ARC does not have a legacy data program associated with AT. Records created using AT are the first ARC finding aids to be converted to EAD and presented on the website with full-text search capabilities.

The eXtensible Text Framework (XTF) is an open source search platform developed by the California Digital Library. XTF was designed specifically for searching hierarchically faceted documents, particularly EAD. ARC’s instance of XTF has been specifically configured for AT’s EAD output and was first used to drive the search of the Morehouse College Martin Luther King, Jr. Collection.

CONTENTdm is proprietary digital collection management software produced by OCLC. ARC has a long history of using Cdm as a member of the HBCU Library Alliance and a participant in the HBCU Digital Collections project. Cdm not only stores digital content and metadata, but also has a display side that users can browse or search, and it can produce links to images for embedding in the item list of an EAD finding aid.

AT has been especially useful for the workflow of the VEP project. As each archivist completes a series, they enter its descriptive information along with the subseries and file list. A limitation of AT is that only one user can be logged in to the same record at once, but in practice this has simply encouraged the two archivists to schedule other tasks around data entry. The student assistants on the project are primarily responsible for refoldering and labeling folders, and they follow the file list in AT to find the exact title for each folder. This process allows the students to work independently of an archivist. It also lets them act as editors, reading over each section of the finding aid and pointing out misspellings and other inconsistencies to the archivist.

When the processing and editing of the finding aid are complete, an EAD version will be exported for publication to the web. A benefit of XTF is that the EAD does not have to be converted to HTML in order for the search and display to work; no additional HTML stylesheets are required. The search platform simply reads the raw EAD and displays it using its own customizable stylesheet, which can be “branded” for the individual collection or institution.
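
To give a sense of what "reading the raw EAD" involves, here is a minimal Python sketch that pulls a box and folder list out of a small EAD fragment. It is an illustration only and is not how XTF itself works internally. The sample finding aid, its titles and the function name `list_files` are invented for this example, and the sketch assumes an EAD 2002 export that uses the `urn:isbn:1-931666-22-9` namespace.

```python
import xml.etree.ElementTree as ET

# Namespace used by EAD 2002 schema-based files (assumed for this sketch).
EAD_NS = {"ead": "urn:isbn:1-931666-22-9"}

# Invented sample; not taken from the actual VEP finding aid.
SAMPLE_EAD = """
<ead xmlns="urn:isbn:1-931666-22-9">
  <archdesc level="collection">
    <did><unittitle>Example organizational records</unittitle></did>
    <dsc>
      <c01 level="series">
        <did><unittitle>Series 1: Correspondence</unittitle></did>
        <c02 level="file">
          <did>
            <container type="box">1</container>
            <container type="folder">1</container>
            <unittitle>General correspondence</unittitle>
          </did>
        </c02>
        <c02 level="file">
          <did>
            <container type="box">1</container>
            <container type="folder">2</container>
            <unittitle>Field reports</unittitle>
          </did>
        </c02>
      </c01>
    </dsc>
  </archdesc>
</ead>
"""


def list_files(ead_xml):
    """Return (box, folder, title) tuples for every file-level component."""
    root = ET.fromstring(ead_xml)
    rows = []
    for component in root.iter():
        # Only file-level components describe individual folders.
        if component.get("level") != "file":
            continue
        did = component.find("ead:did", EAD_NS)
        if did is None:
            continue
        title = did.findtext("ead:unittitle", default="", namespaces=EAD_NS).strip()
        containers = {
            c.get("type"): (c.text or "").strip()
            for c in did.findall("ead:container", EAD_NS)
        }
        rows.append((containers.get("box"), containers.get("folder"), title))
    return rows


for box, folder, title in list_files(SAMPLE_EAD):
    print(f"Box {box}, Folder {folder}: {title}")
```

A listing like this is similar in spirit to the file list the student assistants consult when labeling folders, though in the VEP workflow that list is read directly in AT.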

Concurrent with processing, the VEP archivists are marking materials for digitization, with the goal of launching a digital exhibition on the history of VEP at the end of the grant project. As archivists find exhibition-appropriate material, they deliver it, with a full metadata description using Dublin Core (DC) fields, to a scanning technician. The technician creates a TIFF and a JPEG for each item (or a compound of TIFFs/JPEGs for multipage items), and, once scanning is completed, will import the images and the metadata into Cdm in a single sitting. The collection will be available to the public for viewing in Cdm, while the digital exhibition will “skim” images from the Cdm instance for display within the exhibition space, eliminating the need for separate versions of each image to exist in both the exhibit and the management system. The browse feature on Cdm is also customizable, and will be branded for the VEP project.
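
As a rough illustration of the metadata step, the Python sketch below gathers Dublin Core descriptions into a single tab-delimited text file, the kind of file that could accompany a batch of scans. The choice of fields, the sample record, the file name in `identifier` and the function name `records_to_tsv` are all hypothetical; the project's actual field mapping and the exact import format Cdm expects are not described here and should be checked against the local configuration.

```python
import csv
import io

# A subset of the fifteen Dublin Core elements, chosen for illustration.
DC_FIELDS = ["title", "creator", "date", "description", "subject", "format", "identifier"]


def records_to_tsv(records):
    """Serialize a list of Dublin Core dicts as tab-delimited text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=DC_FIELDS,
        delimiter="\t",
        lineterminator="\n",
        restval="",  # leave missing fields blank rather than failing
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return buffer.getvalue()


# Invented placeholder record; not an actual item from the collection.
sample_records = [
    {
        "title": "Sample flyer",
        "creator": "Example Organization",
        "date": "1970",
        "subject": "Voter registration",
        "format": "image/tiff",
        "identifier": "sample_0001.tif",
    }
]

print(records_to_tsv(sample_records))
```

Keeping the descriptions in one file means the images and their metadata can be loaded together once scanning is finished, rather than item by item.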