Extracting and Ingesting DDI Metadata and Digital Objects from a Data Archive into the iRODS extension of the NARA TPAP Using the OAI-PMH
J. Ward, A. de Torcy, M. Chua & J. Crabtree
IASSIST 2010, Ithaca, N.Y.
Oldest Institute or Center at UNC-CH; founded 1924; Mission: teaching, research, & service for social sciences; cross-disciplinary focus
Rules-based policy enforcement; iRODS grid-based technology; OAI-PMH harvesting from the Odum Dataverse Network
Ingest Odum collections into iRODS; break apart Odum preservation policies; code these policies into a series of iRODS rules
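A minimal Python sketch of the harvesting side, for orientation only. It builds an OAI-PMH `ListRecords` request and pulls record identifiers and the resumption token out of a response. The base URL, the `ddi` metadata prefix, and the sample response are assumptions made for illustration; they are not taken from the Odum setup.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix, resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH repository."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it replaces metadataPrefix on follow-up requests.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (list of record identifiers, resumption token or None)."""
    root = ET.fromstring(xml_text)
    ids = [
        e.text
        for e in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return ids, token


# Hypothetical response body, trimmed to the elements used above.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>example-study-1</identifier></header></record>
    <record><header><identifier>example-study-2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

ids, token = parse_list_records(SAMPLE)
first_url = list_records_url("https://archive.example.org/oai", "ddi")
next_url = list_records_url("https://archive.example.org/oai", "ddi", token)
print(ids, token)
print(first_url)
print(next_url)
```

A real harvester would fetch each URL, repeat while a resumption token is returned, and hand each record's DDI payload to the ingest step.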
Example metadata fields. Global Identifier: hdl: /H … (handle). Study Title: Harris 1986 Disabled Americans - Employment Survey, Study no …
Level 1 Dataflow of extraction and ingest process
iRODS Rule
==== parseDDI.ir ====
Format DDI and extract metadata||
msiXsltApply(*xsltObjPath, *ddiObjPath, *BUF)## (XSLT transformation)
msiDataObjCreate(*xmlObjPath,null,*DEST_FD)## (Create XML file)
msiDataObjWrite(*DEST_FD,*BUF,*Written)## (Write XML file)
msiDataObjClose(*DEST_FD,*junk)## (Close XML file)
msiLoadMetadataFromXml(*ddiObjPath, *xmlObjPath)|nop (Load into iCAT)
Input parameters
*ddiObjPath=$1% Example: /odum/home/rods/ /H-339/ddi.xml
*xmlObjPath=$2% Example: /odum/home/rods/ /H-339/AVUs.xml
*xsltObjPath=/odum/home/rods/prototype/formatDDI.xsl
Output parameters
ruleExecOut
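The transformation step in this rule can be approximated outside iRODS. The Python sketch below is not the formatDDI.xsl stylesheet; it only illustrates the idea of turning DDI elements into attribute-value-unit (AVU) triples and writing them as an XML file for loading. The DDI fragment is simplified (no namespaces) with invented values, the target path is hypothetical, and the `metadata`/`AVU`/`Target`/`Attribute`/`Value`/`Unit` layout is an assumption about the input that msiLoadMetadataFromXml expects.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free DDI fragment with invented values.
SAMPLE_DDI = """<codeBook>
  <stdyDscr><citation><titlStmt>
    <titl>Example Study Title</titl>
    <IDNo>example-id-001</IDNo>
  </titlStmt></citation></stdyDscr>
</codeBook>"""

# AVU attribute name -> element path inside the DDI document.
FIELDS = {
    "Study Title": "stdyDscr/citation/titlStmt/titl",
    "Global Identifier": "stdyDscr/citation/titlStmt/IDNo",
}


def ddi_to_avus(ddi_text):
    """Extract (attribute, value, unit) triples from a DDI document."""
    root = ET.fromstring(ddi_text)
    avus = []
    for attribute, path in FIELDS.items():
        el = root.find(path)
        if el is not None and el.text:
            avus.append((attribute, el.text.strip(), ""))
    return avus


def avus_to_xml(target, avus):
    """Serialize AVU triples in the assumed <metadata>/<AVU> layout."""
    metadata = ET.Element("metadata")
    for attribute, value, unit in avus:
        avu = ET.SubElement(metadata, "AVU")
        ET.SubElement(avu, "Target").text = target
        ET.SubElement(avu, "Attribute").text = attribute
        ET.SubElement(avu, "Value").text = value
        ET.SubElement(avu, "Unit").text = unit
    return ET.tostring(metadata, encoding="unicode")


avus = ddi_to_avus(SAMPLE_DDI)
# Hypothetical iRODS object path used as the AVU target.
avu_xml = avus_to_xml("/exampleZone/home/user/study/ddi.xml", avus)
print(avu_xml)
```

In the rule itself the same work is split across microservices: the XSLT produces the AVU file, which is written into iRODS and then loaded into the iCAT.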
Step 1 = Define policy areas
Step 2 = Create policy declaration statements for each policy area; state the requirements for operation, not technical specifics
Step 3 = Each entity in a policy statement is defined in language descriptions: human- and machine-readable references
Step 4 = Deontic statements: logical statements that define the actors, actions, and constraints that enforce a policy statement
Step 5 = Write iRODS rules for each statement
Wolfe, Robert. PLEDGE policy list. MIT Libraries.
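One way to picture Step 4 is as a small record that names the actor, action, and constraint before any rule is written. The Python sketch below is purely illustrative: the class, its field names, and the allowed modalities are assumptions, not part of the PLEDGE work or of iRODS.

```python
from dataclasses import dataclass

ALLOWED_MODALITIES = ("must", "must not", "may")


@dataclass(frozen=True)
class DeonticStatement:
    """Hypothetical record for one deontic statement (Step 4)."""
    policy_area: str   # Step 1: the policy area the statement belongs to
    actor: str         # who is bound by the statement
    modality: str      # obligation, prohibition, or permission
    action: str        # what the actor does
    constraint: str    # the condition that limits the action

    def __post_init__(self):
        if self.modality not in ALLOWED_MODALITIES:
            raise ValueError(f"unknown modality: {self.modality!r}")

    def render(self) -> str:
        """Return the statement as a plain sentence."""
        return f"{self.actor} {self.modality} {self.action} {self.constraint}."


stmt = DeonticStatement(
    policy_area="Technology and Infrastructure",
    actor="The archive",
    modality="must",
    action="keep replicas of each dataset",
    constraint="at the defined number of copies",
)
print(stmt.render())
```

Each such statement would then be translated by hand into one or more iRODS rules (Step 5).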
Organization, Environment, and Legal Policies
Defined dataset succession plan
Defined access policies
Log access for accountability
Reference TRAC criteria
Community and Usability Policies
Require a deposit agreement
Process and Procedure Policies
Defined iCAT to DDI discovery crosswalk
Store dataset’s DDI metadata as object
Defined persistent identifiers
Defined UNFs and checksums
Provide reporting of preservation network
Technology and Infrastructure Policies
Defined number of replication copies
Defined geographic location for the copies
Provide authentication policy
Provide versioning
Provide control for deletion/replacement
Defined replica validation frequency via UNFs and checksums
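The replication and validation policies lend themselves to a small check. The Python sketch below compares each replica against a recorded checksum and reports whether enough intact copies remain. It uses SHA-256 as a stand-in (it does not compute UNFs), and the site names, data, and minimum copy count are invented for illustration.

```python
import hashlib


def checksum(data: bytes) -> str:
    """SHA-256 hex digest; a stand-in for the archive's own checksum or UNF."""
    return hashlib.sha256(data).hexdigest()


def validate_replicas(expected: str, replicas: dict, min_copies: int) -> dict:
    """Compare each replica with the recorded checksum.

    replicas maps a location name to the replica's bytes.
    """
    corrupt = sorted(
        loc for loc, data in replicas.items() if checksum(data) != expected
    )
    valid = len(replicas) - len(corrupt)
    return {
        "valid": valid,
        "corrupt": corrupt,
        "meets_policy": valid >= min_copies,
    }


# Invented example: three copies, one of which has been altered.
original = b"study data"
recorded = checksum(original)
replicas = {"site-a": original, "site-b": original, "site-c": b"corrupted"}

report = validate_replicas(recorded, replicas, min_copies=2)
print(report)
```

In a rules-based setup the same check would run on the defined validation schedule, with failures feeding the preservation network report.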
Video Demo
Acknowledgements
This work is funded by the NSF grant OCI and is a collaboration with NARA on the development of the "NARA Transcontinental Persistent Archive Prototype". The initial work on this project was funded by the NARA supplement to NSF SCI, “Cyberinfrastructure; from Vision to Reality” – Transcontinental Persistent Archive Prototype (TPAP).
Questions?