Data-PASS Shared Catalog
Micah Altman & Jonathan Crabtree

Presentation transcript:

1. Data-PASS Shared Catalog
Micah Altman & Jonathan Crabtree
NDIIPP Meeting 07/08
- Micah Altman, Harvard University
  - Archival Director, Henry A. Murray Research Archive
  - Associate Director, Harvard-MIT Data Center
  - Senior Research Scientist, Institute for Quantitative Social Science
- Jonathan Crabtree, University of North Carolina
  - Assistant Director for Archives and Information Technology, HW Odum Institute for Research in Social Science

2. Collaboration for Preservation
- Strategic Partnership Agreements
- Coordinated Operations
- Joint “not-bad” practices
- Shared catalog
- Shared tools & technologies

3. Technical Collaboration
- Shared Catalog
  - Unified Discovery
  - Content exchange
  - Layered Services
- Shared Technologies & tools
  - Schemas and crosswalks
  - Fingerprint and persistent identifiers
  - Digital libraries and ingest tools
  - Storage and replication
- “Not-bad” practices and Standards
  - Identification & selection
  - Metadata Cataloging Exchange
  - Security
  - Confidentiality
  - Citation

4. Data-PASS Shared Catalog
- A unified catalog of the partners’ entire holdings
- Completes the unification of social science data that was the dream of the first Council of Social Science Data Archives in 1969
- Discovery Services
  - Simple & fielded search
  - Virtual collection browsing
- Metadata delivery
  - Descriptive study, file, & variable information
  - Provenance metadata
  - Human and OAI interfaces
- Enhanced Delivery
  - Proxy delivery
  - Replication
  - Layered analysis services

5. Finding Data
- Search across the partners’ entire catalogs
- Find studies collected for Data-PASS
- Simple and fielded search
- Browse by subject, date, source
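
The three discovery modes on this slide (simple search, fielded search, browsing) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the `Study` record, its fields, the sample entries, and the function names are hypothetical and do not describe how the shared catalog is actually implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    """Hypothetical catalog record; the real catalog stores DDI metadata."""
    title: str
    partner: str   # archive that holds the study
    subject: str
    year: int


# Made-up sample entries standing in for the merged partner catalogs.
CATALOG = [
    Study("Example Election Survey", "Partner A", "Politics", 1972),
    Study("Example Household Panel", "Partner B", "Economics", 1985),
    Study("Example Opinion Poll on Elections", "Partner B", "Politics", 1990),
]


def simple_search(catalog, term):
    """Case-insensitive substring match across all text fields."""
    needle = term.lower()
    return [s for s in catalog
            if needle in f"{s.title} {s.partner} {s.subject}".lower()]


def fielded_search(catalog, **criteria):
    """Exact match on named fields, e.g. subject='Politics'."""
    return [s for s in catalog
            if all(getattr(s, name) == value
                   for name, value in criteria.items())]


def browse_by(catalog, field):
    """Group study titles by one field (subject, year, or partner)."""
    groups = {}
    for s in catalog:
        groups.setdefault(getattr(s, field), []).append(s.title)
    return groups
```

For example, `fielded_search(CATALOG, subject="Politics", partner="Partner B")` narrows the list to studies from a single source, while `browse_by(CATALOG, "subject")` gives the grouping a browse page would display.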

6. Delivering Data
- Through Partners’ Sites
  - Shared catalog results always link to the data at the partner’s site
  - If no file information is supplied to the catalog, this is the only option
- Through Shared Catalog
  - Catalog server may cache a copy of data for performance
  - Catalog can bundle requests for multiple files
- Through Analysis Services
  - If the partner site runs DVN (or a data access proxy), analysis and extraction are available
  - Download data in multiple formats
  - Extract subsets, in multiple formats, with citations and UNFs
  - Run descriptive stats, crosstabs
  - Advanced analysis: dozens of statistical models
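
One way to read the delivery rules on this slide is as a small decision function. The sketch below is an interpretation, not the catalog's code: `CatalogEntry`, its field names, the option labels, and the example URL are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    """Hypothetical summary of what the catalog knows about one study."""
    partner_url: str        # link to the study at the partner's site
    has_file_info: bool     # partner supplied file-level metadata
    partner_runs_dvn: bool  # partner runs DVN or a data access proxy


def delivery_options(entry):
    """Return the delivery paths available for a catalog entry.

    The partner link is always offered. Without file information it is
    the only option; with it, the catalog can serve (and cache) files,
    and analysis is added when the partner runs DVN or a proxy.
    """
    options = ["partner_link"]
    if entry.has_file_info:
        options.append("catalog_download")
        if entry.partner_runs_dvn:
            options.append("analysis_and_extraction")
    return options
```

A call such as `delivery_options(CatalogEntry("https://partner.example/study/1", True, True))` would list all three paths, whereas an entry without file information yields only the partner link.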

7. Enabling Technologies
- Metadata harvesting: OAI-PMH
- Metadata standards and tools: DDI, XSL
- Citation, validation: Handles, UNF
- Federated search, virtual archives: Dataverse Network, OAI servers
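
The idea behind a data fingerprint used for validation is that the same values should produce the same fingerprint regardless of how they were stored or formatted. The sketch below illustrates that idea only. It is a simplified stand-in and does not follow the UNF specification: the normalization rule, separator, and truncation length are assumptions chosen for the example.

```python
import base64
import hashlib


def normalize(value, digits=7):
    """Render numbers in a fixed exponential form so that 2.5, 2.50 and
    2.5000000001 all normalize to the same string. Other values are
    converted with str()."""
    if isinstance(value, (int, float)):
        return format(float(value), f".{digits - 1}e")
    return str(value)


def fingerprint(values, digits=7):
    """Hash the normalized values in order and return a short
    base64 digest (illustrative only, NOT a real UNF)."""
    h = hashlib.sha256()
    for v in values:
        h.update(normalize(v, digits).encode("utf-8") + b"\n")
    return base64.b64encode(h.digest()[:16]).decode("ascii")
```

Under these assumptions, `fingerprint([1, 2.5, 3])` and `fingerprint([1.0, 2.50, 3.0000000001])` agree, while changing any value beyond the rounding precision changes the result. A citation that carries such a fingerprint lets a reader check that the data retrieved later are the data that were cited.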

8. Catalog Distributed Architecture
[Architecture diagram. Labels: Search Shared Catalog Data Mirror Metadata Catalog Harvester Online Catalog Online Analysis Crosswalk proxy OAI]
- View Information on Data
  - Through Catalog
  - Link to Data at Partner Site
- Access Data
  - With Extraction and Analysis, Through Catalog
  - Direct to Partner Sites
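
The crosswalk component in this architecture translates each partner's native metadata into the shared exchange schema. The slides say this is done with XSL; the sketch below shows the same mapping idea in plain Python instead. The native field names are hypothetical, and the target paths are only loosely modeled on DDI Codebook element names, so they should be checked against the schema before any real use.

```python
# Hypothetical crosswalk: partner-native field -> exchange-schema path.
# Target paths are illustrative, loosely modeled on DDI element names.
CROSSWALK = {
    "study_title": "stdyDscr/citation/titlStmt/titl",
    "study_id": "stdyDscr/citation/titlStmt/IDNo",
    "investigator": "stdyDscr/citation/rspStmt/AuthEnty",
    "summary": "stdyDscr/stdyInfo/abstract",
}


def apply_crosswalk(record, crosswalk=CROSSWALK):
    """Map a native record onto exchange-schema paths.

    Returns (mapped, unmapped): the translated fields, plus a sorted
    list of native field names that the crosswalk does not cover, so
    that gaps are visible rather than silently dropped.
    """
    mapped, unmapped = {}, []
    for field, value in record.items():
        target = crosswalk.get(field)
        if target is None:
            unmapped.append(field)
        else:
            mapped[target] = value
    return mapped, sorted(unmapped)
```

Reporting unmapped fields is a design choice for the sketch: it makes it easy to see which parts of a partner's catalog would be lost in exchange.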

9. Metadata Harvesting
- Each partner catalog is exposed via one of:
  - Dataverse Network via OAI
  - Other OAI server, running on-site
  - Proxy OAI server, running at HMDC
- Harvested ad hoc
- XSL crosswalk applied to the metadata
- Made available through OAI
- DDI-lite schema subset used for exchange
  - Data Documentation Initiative (DDI): an international effort to establish a specification schema for the content, presentation, transport, and preservation of documentation for datasets in the social and behavioral sciences
  - Provenance and structural metadata, including: document description (meta-metadata), study description, file description, variable description
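
A harvest over OAI-PMH comes down to issuing `ListRecords` requests and following resumption tokens until the server stops returning one. The sketch below builds the request URLs and parses a response using only the Python standard library. The base URL, the `ddi` metadata prefix, and the sample response are assumptions for illustration; no network call is made, and the actual Data-PASS harvester may work differently.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix=None, resumption_token=None):
    """Build a ListRecords request URL. When resuming, the protocol
    requires the token to be the only argument besides the verb."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_list_records(xml_text):
    """Return (record identifiers, next resumption token or None)."""
    root = ET.fromstring(xml_text)
    identifiers = [
        header.findtext("oai:identifier", namespaces=OAI_NS)
        for header in root.iterfind(".//oai:record/oai:header", OAI_NS)
    ]
    token = root.findtext(".//oai:resumptionToken", default="",
                          namespaces=OAI_NS)
    return identifiers, (token.strip() or None)


# Made-up response fragment standing in for a partner's OAI server.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header>
      <identifier>oai:archive.example:study-1</identifier>
    </header></record>
    <record><header>
      <identifier>oai:archive.example:study-2</identifier>
    </header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""
```

A harvester would fetch `list_records_url(base, metadata_prefix="ddi")`, pass the body to `parse_list_records`, and repeat with the returned token until it is `None`.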

10. The Dataverse Network
- Includes integrated developments in web application software, networking, data citation standards, and statistical methods designed to put some of the universe of data and data sharing practices on firmer ground.
- It facilitates the public preservation and distribution of persistent, authorized, and verifiable research data.
- Virtually-Hosted Archiving
- The importance of being virtual …
  - Nothing to install
  - Dynamic collections: local and federated
- Institutionally supported
  - Persistent identifiers and citations
  - No worries about file formats changing, backups, etc.
  - All the initial setup work is done for the depositor
- Depositors retain total control over
  - Content
  - Access
  - Presentation

11. Benefits to collaboration
- Combine and blend strengths
- Bring different perspectives to the table
- Coordinate on key issues, e.g., syndicated storage
- Share knowledge and experience to develop tools and future standards

12. Archivists & Catalogers
- Benefit from shared workflows
- Participate in software design to enhance ingest
- Potential for increased submissions

13. IT Administration Perspective
- Standards-based collaborations are less risky
  - More recovery paths
  - More resources to solve problems
- Collaboration provides a larger test audience for software development
- Lowers development cost

14. What do data consumers say?
- Enjoy the simplicity of a “common catalog”
- Variable-level searches are powerful
- Browsing the data with descriptive statistics is helpful
- Excited about the advanced online statistics

15. Benefits of Virtual Archiving
- Promotes self-archiving
- Potential to reach investigators early in the data lifecycle
- Allows for professional, subject-area-based curation
- Customized branding for producers
- Lowers the barriers to submission, in turn increasing data deposit rates

16. Collaboration for Preservation
- Objects protected against single institutional failure
- Standards-based metadata
- Collaborations offer potential for replicated and geographically diverse distributed storage
- Collaborations may offer small archives the only way to become a “trusted archive”
- Collectively dedicated to the long-term survival of the resource

17. Collaboration Strengths
- Over 200 years combined experience in social science data preservation
- Innovative archival software developed uniquely for the ingest, presentation, location, analysis, and preservation of social science data
- Institutional dedication to the distribution and preservation of social science data

18. For More Information
- Data-PASS Project:
- Shared Catalog:
- Dataverse Network Software: