Digital Preservation Initiatives in Ontario: Trusted Digital Repositories and Research Data Repositories
Johnston, Wayne, Partnership: The Canadian Journal of Library and Information Practice and Research
The first in a series of two articles dealing with digital preservation, this article discusses repositories, more specifically Trusted Digital Repositories (TDR) and Research Data Repositories. The focus will be on the TDRs at Scholars Portal and Library and Archives Canada (LAC), and the data repository at the University of Guelph.
trusted digital repository; digital preservation; research data repository
Trusted Digital Repositories
On one level, a Trusted Digital Repository (TDR) is a set of metrics that are used to certify that a given repository is an appropriate custodian of a collection of digital assets. More than an array of abstract measures, however, a TDR represents a stable and sustainable organization, a set of policies and procedures for sound management of the digital objects, and a robust and secure technical platform.
To be certified as a TDR, an organization must undergo a meticulous audit that confirms the proposed TDR meets all criteria of the ISO 16363 standard. The criteria fall into three categories. The first is organizational infrastructure, which covers governance and organizational stability, procedural accountability and policy framework, and financial sustainability. The second is digital object management, which includes processes for ingest, preservation planning, information management, and access. The third is technical infrastructure and security risk management, which includes appropriate technologies and security systems.
A key aspect of the TDR is the development of preservation metadata. Preservation Metadata Implementation Strategies (PREMIS) was a working group that developed a data dictionary for digital preservation. The name "PREMIS" is now the de facto name for that data dictionary. It includes concepts such as provenance (Who has had custody/ownership of the digital object?), authenticity (Is the digital object what it purports to be?), preservation activity (What has been done to preserve the digital object?), technical environment (What is needed to render and use the digital object?), and rights management (What intellectual property rights must be observed?).
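To make the five concepts above concrete, the following is a minimal sketch of how a repository might group them in a single record. It is an illustration only: the class and field names (`PreservationRecord`, `custody_history`, and so on) are hypothetical and are not the semantic units defined in the PREMIS data dictionary itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PreservationEvent:
    """One preservation activity, such as a fixity check or a format migration."""
    event_type: str
    date: str      # ISO 8601 date string
    agent: str     # who or what performed the activity
    outcome: str


@dataclass
class PreservationRecord:
    """Hypothetical record grouping the five concepts named in the text."""
    object_id: str
    custody_history: List[str] = field(default_factory=list)         # provenance
    checksum_sha256: str = ""                                         # authenticity
    events: List[PreservationEvent] = field(default_factory=list)    # preservation activity
    rendering_requirements: List[str] = field(default_factory=list)  # technical environment
    rights_statement: str = ""                                        # rights management

    def unanswered_questions(self) -> List[str]:
        """Return the concepts for which this record holds no information."""
        checks = {
            "provenance": bool(self.custody_history),
            "authenticity": bool(self.checksum_sha256),
            "preservation activity": bool(self.events),
            "technical environment": bool(self.rendering_requirements),
            "rights management": bool(self.rights_statement),
        }
        return [name for name, present in checks.items() if not present]
```

A record that documents only custody would report the other four concepts as unanswered, which is the kind of gap an audit of preservation metadata is meant to surface.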
Scholars Portal's TDR
Scholars Portal (SP), an initiative of the Ontario Council of University Libraries, began the certification process late in 2010. After many months of documenting procedures and policies, their audit was initiated by the Center for Research Libraries (CRL). Steve Marks, the Digital Preservation Policy Librarian at SP, reports that in April 2012 the CRL conducted a two-day site visit for an in-person review of the SP TDR. CRL met with all members of the SP team and saw demonstrations of the various systems involved. Following that site visit, the SP TDR team received a report from CRL that included items requiring follow-up. The SP TDR team members are working on a response, which should conclude this very rigorous certification process.
The SP TDR platform essentially builds preservation capabilities onto the MarkLogic environment already in use for hosting journal content. Among other things, this meant adding preservation metadata, including checksums to monitor bit rot. The system now includes, where possible, the full text of the journal articles in XML format, descriptive and discovery metadata, and preservation metadata using both PREMIS and a structural metadata format similar to METS. (METS is the Metadata Encoding and Transmission Standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library.) The system has been designed to be robust while also remaining streamlined enough to accommodate the vast amount of data to be processed. The strategy also incorporates a storage array, shared with the University of Toronto, that is used for the non-XML content, which is generally in PDF form. While MarkLogic has proven to be a very effective platform for SP both in terms of content delivery and long-term preservation, their strategy is not tied to MarkLogic should future needs necessitate a platform change. …
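The use of checksums to monitor bit rot, mentioned above, can be illustrated with a short sketch. It assumes SHA-256 as the checksum algorithm and uses hypothetical function names (`sha256_of`, `fixity_ok`). The article does not say which algorithm or tooling Scholars Portal uses, so this shows the general technique rather than their implementation.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path: Path, recorded_checksum: str) -> bool:
    """Return True if the file still matches the checksum recorded at ingest."""
    return sha256_of(path) == recorded_checksum
```

In this sketch, the checksum would be computed once at ingest and stored with the preservation metadata. Rerunning `fixity_ok` on a schedule then flags any file whose bits have changed since ingest.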