Extracting and Ingesting DDI Metadata and Digital Objects from a Data Archive into the iRODS extension of the NARA TPAP Using the OAI-PMH J. Ward, A. de Torcy, M. Chua & J. Crabtree

Presentation transcript:

Extracting and Ingesting DDI Metadata and Digital Objects from a Data Archive into the iRODS extension of the NARA TPAP Using the OAI-PMH J. Ward, A. de Torcy, M. Chua & J. Crabtree IASSIST 2010 Ithaca, N.Y.

Oldest Institute or Center at UNC-CH
Founded 1924
Mission: Teaching, research, & service for social sciences
Cross-disciplinary focus

• Rules-based policy enforcement
• iRODS grid-based technology
• OAI-PMH harvesting from Odum Dataverse Network
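The OAI-PMH harvesting step named above can be sketched roughly as follows. This is a minimal illustration, not the project's harvester: the base URL, the `ddi` metadata prefix, and the sample response are assumptions made for the example. Only the `ListRecords` verb, the `metadataPrefix` and `resumptionToken` arguments, and the OAI-PMH 2.0 namespace come from the protocol itself.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="ddi", resumption_token=None):
    """Build a ListRecords request URL (the 'ddi' prefix is an assumption)."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it replaces metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None) from a response."""
    root = ET.fromstring(xml_text)
    ids = [
        el.text
        for el in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return ids, token


# Hypothetical response, trimmed to the parts the parser reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>example:study-1</identifier></header></record>
    <record><header><identifier>example:study-2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("http://example.org/oai"))
    print(parse_list_records(SAMPLE_RESPONSE))
```

A real harvester would fetch each URL over HTTP and repeat the request with the returned token until the response carries no further token.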

• Ingest Odum collections into iRODS
• Break apart Odum preservation policies
• Code these policies into a series of iRODS rules


Global Identifier: hdl: /H (handle)
Study Title: Harris 1986 Disabled Americans - Employment Survey, Study no …
Harris 1986 Disabled Americans - Employment Survey, Study no hdl: /H …

Level 1 Dataflow of extraction and ingest process

iRODS Rule ==== parseDDI.ir ====
Format DDI and extract metadata||
msiXsltApply(*xsltObjPath, *ddiObjPath, *BUF)## (XSLT transformation)
msiDataObjCreate(*xmlObjPath,null,*DEST_FD)## (Create XML file)
msiDataObjWrite(*DEST_FD,*BUF,*Written)## (Write XML file)
msiDataObjClose(*DEST_FD,*junk)## (Close XML file)
msiLoadMetadataFromXml(*ddiObjPath, *xmlObjPath)|nop (Load into iCAT)
Input parameters
*ddiObjPath=$1% Example: /odum/home/rods/ /H-339/ddi.xml
*xmlObjPath=$2% Example: /odum/home/rods/ /H-339/AVUs.xml
*xsltObjPath=/odum/home/rods/prototype/formatDDI.xsl
Output parameters
ruleExecOut
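To make the rule's data flow concrete, the sketch below mimics the transformation outside iRODS: it pulls a few fields from a DDI document and writes them as attribute-value-unit (AVU) XML. It is an illustration under stated assumptions, not the project's formatDDI.xsl. It swaps XSLT for Python's ElementTree, uses a simplified DDI fragment without namespaces, a hypothetical two-field crosswalk, and a hypothetical target path. The `<metadata>/<AVU>` layout is assumed to be the shape msiLoadMetadataFromXml expects.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free DDI Codebook fragment (illustrative values only).
SAMPLE_DDI = """<codeBook>
  <stdyDscr>
    <citation>
      <titlStmt>
        <titl>Example Study Title</titl>
        <IDNo>example-id-1</IDNo>
      </titlStmt>
    </citation>
  </stdyDscr>
</codeBook>"""

# Hypothetical crosswalk: AVU attribute name -> element path in the DDI.
CROSSWALK = {
    "Study Title": "stdyDscr/citation/titlStmt/titl",
    "Global Identifier": "stdyDscr/citation/titlStmt/IDNo",
}


def ddi_to_avus(ddi_text, target_path):
    """Extract (target, attribute, value) triples from a DDI document."""
    root = ET.fromstring(ddi_text)
    avus = []
    for attribute, path in CROSSWALK.items():
        el = root.find(path)
        if el is not None and el.text:
            avus.append((target_path, attribute, el.text.strip()))
    return avus


def avus_to_xml(avus):
    """Serialize triples as AVU XML (assumed layout, with an empty Unit)."""
    metadata = ET.Element("metadata")
    for target, attribute, value in avus:
        avu = ET.SubElement(metadata, "AVU")
        ET.SubElement(avu, "Target").text = target
        ET.SubElement(avu, "Attribute").text = attribute
        ET.SubElement(avu, "Value").text = value
        ET.SubElement(avu, "Unit")
    return ET.tostring(metadata, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical iRODS logical path for the DDI object.
    triples = ddi_to_avus(SAMPLE_DDI, "/tempZone/home/rods/example/ddi.xml")
    print(avus_to_xml(triples))
```

In the rule itself these two stages correspond to msiXsltApply, which produces the AVU XML, and msiLoadMetadataFromXml, which loads it into the iCAT.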

• Step 1 = Define policy areas
• Step 2 = Create policy declaration statements for each policy area; state the requirements for operation, not technical specifics
• Step 3 = Define each entity in a policy statement with human-readable language descriptions and machine-readable references
• Step 4 = Write deontic statements: logical statements that define the actors, actions, and constraints that enforce a policy statement
• Step 5 = Write iRODS rules for each statement
Wolfe, Robert. PLEDGE policy list. MIT Libraries.

• Organization, Environment, and Legal Policies
  • Defined dataset succession plan
  • Defined access policies
  • Log access for accountability
  • Reference TRAC criteria
• Community and Usability Policies
  • Require a deposit agreement
• Process and Procedure Policies
  • Defined iCAT to DDI discovery crosswalk
  • Store dataset’s DDI metadata as object
  • Defined persistent identifiers
  • Defined UNFs and checksums
  • Provide reporting of preservation network
• Technology and Infrastructure Policies
  • Defined number of replication copies
  • Defined geographic location for the copies
  • Provide authentication policy
  • Provide versioning
  • Provide control for deletion/replacement
  • Defined replica validation frequency via UNFs and checksums
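Two of the technology policies above, a defined number of replica copies and replica validation by checksum, can be expressed as a small check. The sketch below is an assumption-laden illustration rather than an iRODS rule: SHA-256 stands in for whatever checksum the archive uses, UNFs are not computed, and the replica locations and function names are hypothetical.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Hex digest of the content (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(data).hexdigest()


def validate_replicas(recorded_checksum, replicas, required_copies):
    """Compare each replica to the recorded checksum and count valid copies.

    replicas maps a location name to the replica's bytes.
    """
    failed = sorted(
        location
        for location, data in replicas.items()
        if checksum(data) != recorded_checksum
    )
    valid_copies = len(replicas) - len(failed)
    return {
        "failed": failed,
        "valid_copies": valid_copies,
        "meets_policy": valid_copies >= required_copies,
    }


if __name__ == "__main__":
    original = b"study data"
    report = validate_replicas(
        checksum(original),
        {"site-a": original, "site-b": original, "site-c": b"corrupted"},
        required_copies=3,
    )
    print(report)
```

A scheduled run of such a check at the policy's validation frequency would flag replicas to repair and report when the valid-copy count falls below the required number.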

Video Demo

Acknowledgements
This work is funded by the NSF grant OCI and is a collaboration with NARA on the development of the "NARA Transcontinental Persistent Archive Prototype". The initial work on this project was funded by the NARA supplement to NSF SCI, “Cyberinfrastructure; from Vision to Reality” – Transcontinental Persistent Archive Prototype (TPAP).

Questions?