The NCAR Community Data Portal (CDP): Experiences with OAI metadata record federation, presented by Michael Burek (NCAR/SCD/VETS)


The NCAR Community Data Portal (CDP)
Experiences with OAI metadata record federation
Presented by Michael Burek (NCAR/SCD/VETS)
Acknowledgments:
- CDP staff: Dave Brown, Luca Cinquini, Don Middleton (PI), Markus Stobbs, James Humphrey
- Funding: NCAR's directorate, NSF

Introduction: What is OAI?
- OAI: Open Archives Initiative
- Goal: provide a lightweight infrastructure for sharing metadata records among participating institutions
- OAI began in 199x to serve the library and e-print communities
- The OAI model consists of six verbs: Identify, ListMetadataFormats, ListSets, ListIdentifiers, ListRecords, GetRecord
- The base OAI model specifies Dublin Core (DC) as the default schema for shared records but makes provision for other schemas to be used
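As an illustration of how the six verbs are used, the sketch below builds OAI-PMH request URLs. In the protocol, each request is an HTTP call to a repository's base URL with a `verb` parameter plus verb-specific arguments. The base URL `https://example.org/oai` and the helper name `build_oai_request` are assumptions made for this example; they are not part of the CDP.

```python
from urllib.parse import urlencode

# The six OAI-PMH verbs, spelled as they appear in requests.
OAI_VERBS = (
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
)


def build_oai_request(base_url, verb, **arguments):
    """Return the GET URL for one OAI-PMH request (hypothetical helper)."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://example.org/oai"

# Ask the repository to describe itself.
print(build_oai_request(BASE_URL, "Identify"))

# Harvest all records in the Dublin Core format that every repository supports.
print(build_oai_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc"))
```

A harvester would fetch these URLs and parse the XML responses; that step is left out to keep the sketch short.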

OAI record sharing effort
- NCAR, BADC, GCMD
- GCMD DIF records were shared
INCOMING records:
- DIF records harvested by NCAR were transformed into the THREDDS schema using XSLT
- The transformed THREDDS records were ingested into the CDP search and browse functions
- Links to BADC data that were included in the generated DIF records enabled linking back to BADC data from the CDP search and browse
OUTGOING records:
- CDP hierarchical THREDDS catalogs were "flattened" into transformable THREDDS catalogs, then transformed into DIF and DC
- NCAR DIF records were harvested by BADC and GCMD
- NCAR DC records were harvested by the University of Michigan Digital Library OAIster project
- DC records were sent to the Yahoo search engine via UM
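The slides do not say how the hierarchical catalogs were flattened, so the following is only a sketch of one plausible approach: walk the nested datasets and emit one flat record per leaf, keeping the parent names in the title. The catalog below is a simplified, namespace-free stand-in for a THREDDS catalog, and the record fields (`id`, `title`, `urlPath`) are assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for a hierarchical catalog (real THREDDS catalogs
# use an XML namespace and carry more metadata than shown here).
HIERARCHICAL = """
<catalog name="Example">
  <dataset name="Model output" ID="model">
    <dataset name="Run 1" ID="model.run1" urlPath="model/run1.nc"/>
    <dataset name="Run 2" ID="model.run2" urlPath="model/run2.nc"/>
  </dataset>
  <dataset name="Observations" ID="obs" urlPath="obs/all.nc"/>
</catalog>
""".strip()


def flatten(element, parents=()):
    """Yield one flat record per leaf dataset, keeping the parent names."""
    for child in element.findall("dataset"):
        path = parents + (child.get("name"),)
        if child.find("dataset") is None:
            # Leaf dataset: emit a self-contained record.
            yield {
                "id": child.get("ID"),
                "title": " / ".join(path),
                "urlPath": child.get("urlPath"),
            }
        else:
            # Container dataset: descend into its children.
            yield from flatten(child, path)


records = list(flatten(ET.fromstring(HIERARCHICAL)))
for record in records:
    print(record)
```

Each flat record no longer depends on its position in the hierarchy, which is what makes a one-record-at-a-time transform into DIF or DC straightforward.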

Current CDP OAI server architecture
[Diagram. Labels: Remote OAI Client; OAI Server; DC; DIF; WMO/ISO; XSLT Transform; Flattened THREDDS catalogs; THREDDS hierarchical catalog; Indexer]

OAI metadata harvest
[Diagram. Labels: OAI Remote Servers; BADC; GCMD; DIF; OAI Client; NCAR; Java Based Filter; XSLT Transform; THREDDS catalog; Indexer; Index; HTTP Server; Search, Browse, Deliver Data]

Demo
- BADC records on CDP
- Search on the OAIster site
- Demo finding NCAR records on the BADC site

OAI harvesting technical issues
Many extra items in the records cause problems:
- Records are marked as UTF-8 but are in fact ISO 8859-x
  o UTF-8 and ISO 8859-x are incompatible!
  o UTF-8 and 7-bit ASCII *are* compatible, which leads to confusion
- Embedded HTML in text fields (using escaped <> symbols, or not)
  o Usually a byproduct of automatic creation of legacy records
- Text fields contain unescaped special characters ( &,, /)
  o Even if allowed in one schema, they may not transform well to another, e.g. a text field that transforms to an element attribute in another schema
- Namespace issues
  o OAI requires a schema
  o GCMD is still in the process of creating a schema for DIF
  o NCAR and BADC created working DIF schemas in the meantime, and these are not quite compatible
CONCLUSION: Filtering code is needed before the XSLT transform
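A minimal sketch of what such a pre-transform filter might do is shown below. It is not the Java-based filter used in the CDP; the function names are hypothetical, and the fallback to ISO 8859-1 is an assumption, since the slides only say "ISO 8859-x".

```python
import html
import re

# Matches anything that looks like an HTML tag.
TAG_RE = re.compile(r"<[^>]+>")


def decode_record(raw: bytes) -> str:
    """Decode bytes as UTF-8, falling back to ISO 8859-1 if that fails.

    Assumption: mislabeled records are ISO 8859-1. Some ISO 8859 byte
    sequences are also valid UTF-8, so this check is a heuristic.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("iso-8859-1")


def clean_text_field(value: str) -> str:
    """Drop embedded HTML (escaped or not) and collapse whitespace.

    Caveat: text such as "a < b > c" would also lose the "< b >" part.
    """
    unescaped = html.unescape(value)          # "&lt;b&gt;" becomes "<b>"
    without_tags = TAG_RE.sub(" ", unescaped)
    return " ".join(without_tags.split())


def escape_for_xml(value: str) -> str:
    """Escape &, <, > and quotes so the text is safe as an XML attribute value."""
    return html.escape(value, quote=True)


# A record labeled UTF-8 whose bytes are really ISO 8859-1.
print(decode_record("Zürich".encode("iso-8859-1")))

# A text field with both escaped and literal HTML.
print(clean_text_field("Sea &lt;b&gt;ice&lt;/b&gt; <i>extent</i>"))

# A text field about to become an attribute in the target schema.
print(escape_for_xml('Temp & "wind"'))
```

Running the cleaning steps before the XSLT stage keeps the stylesheets themselves simple, since they can then assume well-formed, correctly decoded input.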

OAI schema / XSLT transformation issues
- Schemas do not always agree in level of detail
- Schemas have different required elements
- Schemas can have different controlled vocabularies
- Adoption of dataset identifiers that don't overlap is crucial
  o All centers currently have their own standards
- OAI does not address duplication of records; republication "rules of the road" need to be established to avoid problems
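To make the identifier and duplication points concrete, here is a small sketch. It qualifies each center's local dataset ID with a namespace, following the `oai:namespace:local-id` pattern from the OAI identifier guidelines, and then merges harvests by keeping the first record seen for each identifier. The namespaces, the record layout, and the "first one wins" rule are assumptions for illustration, not an agreed community policy.

```python
def qualified_id(namespace: str, local_id: str) -> str:
    """Build a globally unique identifier from a center namespace and a local ID."""
    return f"oai:{namespace}:{local_id}"


def merge_harvests(*harvests):
    """Merge record lists, keeping the first record seen for each identifier."""
    merged = {}
    for records in harvests:
        for record in records:
            merged.setdefault(record["id"], record)
    return list(merged.values())


# Two hypothetical centers that both use the local ID "ds001".
a_id = qualified_id("centre-a.example.org", "ds001")
b_id = qualified_id("centre-b.example.org", "ds001")

direct_from_a = [{"id": a_id, "title": "Dataset from centre A"}]
direct_from_b = [{"id": b_id, "title": "Dataset from centre B"}]

# A third party republishes centre A's record under the same identifier.
republished = [{"id": a_id, "title": "Dataset from centre A (republished)"}]

merged = merge_harvests(direct_from_a, direct_from_b, republished)
for record in merged:
    print(record["id"], "|", record["title"])
```

Without the namespace, the two "ds001" records would collide; without a stable identifier on the republished copy, the duplicate could not be detected at all.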

OAI future directions
- GO-ESSP could consider creating or adopting a community-wide OAI metadata interchange schema that addresses the issues on the earlier slide (similar to DC, or possibly an existing variant of DC)
- RECOMMENDATION: GO-ESSP should coordinate a community-wide record ID schema (adopt DC's methods?)
- Collaborations with other GO-ESSP institutions
- GO-ESSP could provide some tutorial support:
  o XML do's and don'ts
  o Identifier guidance
  o Rules of the road for republication

Questions?