The OAI PMH (Open Archives Initiative Protocol for Metadata Harvesting) MetaScholar Initiative All-Project Meeting Atlanta, GA 6/18/2002 Edward A. Fox.

1 The OAI PMH (Open Archives Initiative Protocol for Metadata Harvesting) MetaScholar Initiative All-Project Meeting Atlanta, GA 6/18/2002 Edward A. Fox fox@vt.edu CS DLRL Virginia Tech, Blacksburg, VA, USA

2 Acknowledgements Sponsors: Mellon Foundation, SOLINET, NSF, DLF, CNI, UK’s JISC, Virginia’s CIT, … OAI Team: Steering Committee, Technical Committee, Developers, Data Providers, Service Providers Emory Team, Partners around Southeast VT Colleagues: Hussein Suleman, Rohit Kelapure, Ming Luo, Ryan Richardson, Marcos Goncalves, Priya Shivakumar, Baoping Zhang, students working on term projects, …

3 Contents Early history Key concepts Examples ODL, XOAI OAI Tools Technical Plan Conclusion

4 Open Archives Initiative OAI www.openarchives.org openarchives@openarchives.org

5 Open Archives Initiative (OAI) xxx@LANL, high-energy physics (Ginsparg, 1991) CSTR + WATERS = NCSTRL (Lagoze, 1994) xxx + NCSTRL = CoRR collaboration (1998) Universal Preprint Service protoproto, Oct. 21-22, 1999, Santa Fe, led by LANL, CNI, DLF, Mellon --> OAi Santa Fe Convention (see Feb 2000 D-Lib Magazine article) Archives -> Open Archives Support unique archive identifiers Implement metadata set(s) (DC, using XML) Implement OA harvesting protocol Register the archive Build tools, layer other services: linking, searching, …

6 OAi Philosophy Self-archiving = submission mechanism Long-term storage system = archive Open interface = harvesting mechanism Data provider + service provider Start with “gray literature” e-prints/pre-prints, reports, dissertations, …

7 Began as “archives of the world unite!” OAI

8 Open Archives (protoproto) ArXiv & Los Alamos National Lab CogPrints & U. Southampton NACA & NASA (reports) NCSTRL & Cornell U. NDLTD & Virginia Tech RePEc & U. Surrey Total of around 200K records

9 Original Open Archives Members American Physical Society California Digital Library Caltech Coalition for Networked Info. Cornell University Harvard University Library of Congress Los Alamos Nat’l Lab Mellon Foundation NASA Langley Research Cntr Old Dominion University Stanford University U. of Ghent U. of Surrey U. of Southampton Vanderbilt University Virginia Tech Washington University

10 Contents Early history Key concepts Examples ODL, XOAI OAI Tools Technical Plan Conclusion

11 Now is a Technical Umbrella for Practical Interoperability… Reference Libraries Publishers E-Print Archives …that can be exploited by different communities Museums

12 Discovery Current Awareness Preservation Service Providers Data Providers Metadata harvesting The World According to OAI

13 Aggregation through OAI Harvesting – Black Box Perspective [Diagram: OA 1, OA 2, OA 4, OA 3, OA 5, OA 6, OA 7]

14 Aggregation through OAI Harvesting – By Organization [Diagram: Theology, Emory, GA, UGA, U FL, UTK, AmSo, Library]

15 Aggregation through OAI Harvesting – By Topic [Diagram: Confederate Constitution, Civil War, History, Oral, Sports, Culture, AmSo, Diaries]

16 Approaches to Aggregation Build By Discipline Build By Institution

17 Types of Access Possible Build By Discipline Build By Institution Year Category Personage Author Genre Query …

18 OAI Repository Required: Protocol DO MDO

19 Metadata vs. Data Data refers to digital objects or digital representations of objects Metadata is information about the objects (e.g. title, author, etc.) OAI focuses on metadata, with the implicit understanding that metadata usually contains useful links to the source digital objects
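The distinction on slide 19 can be illustrated with a toy record. This is only a sketch: the record values and the helper name `object_links` are hypothetical, and the field names follow unqualified Dublin Core elements (`title`, `creator`, `date`, `identifier`). The metadata describes the object, and one of its fields carries the link back to the digital object itself.

```python
from typing import Dict, List

# Hypothetical metadata record; every value below is a placeholder.
record: Dict[str, List[str]] = {
    "title": ["An Example Technical Report"],
    "creator": ["Doe, Jane"],
    "date": ["2002-06-18"],
    "identifier": ["http://example.org/reports/tr-0001.pdf"],
}


def object_links(metadata: Dict[str, List[str]]) -> List[str]:
    """Return identifier values that look like URLs of the source object."""
    return [
        value
        for value in metadata.get("identifier", [])
        if value.startswith(("http://", "https://"))
    ]


print(object_links(record))
```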

20 Metadata: Complex to Simple MARC (>$50) Dublin Core (DC)

21 [Diagram: several repositories holding data items, connected through the OAI protocol to harvesters] Support data harvesting

22 identifiers oai-identifier = oai:archive-identifier:record-identifier Registered URI Scheme Archive Identifier: Registered within OAI Unique ID within archive: (syntax is archive-specific) example = oai:ncstrl:ncstrl.cornellcs/TR94-1418 locally unique key for extracting a record from a repository
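The three-part identifier syntax on slide 22 can be sketched as a small parser. This is a minimal illustration, assuming exactly the form `oai:archive-identifier:record-identifier` shown on the slide; the names `OaiIdentifier` and `parse_oai_identifier` are hypothetical and not part of any OAI tool.

```python
from typing import NamedTuple


class OaiIdentifier(NamedTuple):
    scheme: str      # always "oai"
    archive_id: str  # archive identifier registered within OAI
    record_id: str   # archive-specific key, unique within the archive


def parse_oai_identifier(identifier: str) -> OaiIdentifier:
    """Split 'oai:archive-identifier:record-identifier' into its parts."""
    # Split on the first two colons only: the record part is
    # archive-specific and may itself contain colons.
    parts = identifier.split(":", 2)
    if len(parts) != 3 or parts[0] != "oai" or not parts[1] or not parts[2]:
        raise ValueError(f"not an oai-identifier: {identifier!r}")
    return OaiIdentifier(*parts)


example = parse_oai_identifier("oai:ncstrl:ncstrl.cornellcs/TR94-1418")
print(example.archive_id)  # ncstrl
print(example.record_id)   # ncstrl.cornellcs/TR94-1418
```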

23 selective harvesting - datestamps [Diagram: repository and its records] harvest within date range

24 selective harvesting - sets [Diagram: repository with records in sets S1 and S2] harvest within set
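Selective harvesting by datestamp and by set (slides 23 and 24) comes down to optional arguments on a harvesting request. The sketch below builds a `ListRecords` request URL using the protocol's `from`, `until`, `set`, and `metadataPrefix` arguments. The function name `list_records_url`, the base URL `http://example.org/oai`, and the set name `physics` are assumptions for illustration only.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str,
                     metadata_prefix: str = "oai_dc",
                     from_date: Optional[str] = None,
                     until_date: Optional[str] = None,
                     set_spec: Optional[str] = None) -> str:
    """Build a ListRecords URL, restricted by date range and/or set."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date    # lower bound on record datestamp
    if until_date:
        params["until"] = until_date  # upper bound on record datestamp
    if set_spec:
        params["set"] = set_spec      # harvest only records in this set
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository and set name.
print(list_records_url("http://example.org/oai",
                       from_date="2002-01-01",
                       until_date="2002-06-18",
                       set_spec="physics"))
# http://example.org/oai?verb=ListRecords&metadataPrefix=oai_dc&from=2002-01-01&until=2002-06-18&set=physics
```

Omitting all three optional arguments yields an unrestricted harvest of the whole repository.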

25 Summary: Protocol for Metadata Harvesting Service Requests Identify ListMetadataFormats ListSets GetRecord ListIdentifiers ListRecords Metadata Multiplicity Date (and Time) Ranges Resumption Tokens
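Resumption tokens, listed on slide 25, let a repository return a long list in pages: the harvester repeats the request with the token until the repository returns an empty one. The following is a rough sketch of that loop for `ListIdentifiers`, not a complete harvester. The function names are hypothetical, the network call is replaced by a caller-supplied `fetch` function so the example runs offline, and the sample XML is a simplified stand-in for a real response (no namespaces, no error handling).

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Iterator


def local_name(tag: str) -> str:
    """Drop any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def harvest_identifiers(fetch: Callable[[Dict[str, str]], str],
                        metadata_prefix: str = "oai_dc") -> Iterator[str]:
    """Yield identifiers page by page, following resumption tokens."""
    params = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(params))
        token = ""
        for element in root.iter():
            name = local_name(element.tag)
            if name == "identifier":
                yield (element.text or "").strip()
            elif name == "resumptionToken":
                token = (element.text or "").strip()
        if not token:
            return  # empty or missing token: the list is complete
        # Follow-up requests carry only the verb and the token.
        params = {"verb": "ListIdentifiers", "resumptionToken": token}


# Simplified, made-up responses standing in for a real repository.
PAGES = {
    None: "<OAI-PMH><ListIdentifiers><header><identifier>oai:demo:1"
          "</identifier></header><resumptionToken>page2</resumptionToken>"
          "</ListIdentifiers></OAI-PMH>",
    "page2": "<OAI-PMH><ListIdentifiers><header><identifier>oai:demo:2"
             "</identifier></header><resumptionToken/>"
             "</ListIdentifiers></OAI-PMH>",
}


def fake_fetch(params: Dict[str, str]) -> str:
    return PAGES[params.get("resumptionToken")]


print(list(harvest_identifiers(fake_fetch)))  # ['oai:demo:1', 'oai:demo:2']
```

Passing `fetch` in as a parameter is a design choice for this sketch: a real harvester would supply a function that issues the HTTP request, while tests can supply canned pages as above.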

26 Harvesting vs. Federation Competing approaches to interoperability Federation is when services are run remotely on remote data (e.g., federated searching) Harvesting is when data/metadata is transferred from the remote source to the destination where the services are located (e.g., union catalogues) Federation requires more effort at each remote source but is easier for the local system and vice versa for harvesting OAI (currently) focuses on harvesting
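The contrast on slide 26 can be made concrete with a toy model. In this sketch, federation sends each query to a search service at every remote source, while harvesting first copies the metadata into a local union catalogue and then searches it locally. All names and the sample titles are hypothetical, and the in-memory dictionary stands in for remote systems.

```python
from typing import Dict, List

# Stand-in for metadata held at two remote sources (made-up titles).
REMOTE_SOURCES: Dict[str, List[str]] = {
    "archive_a": ["Civil War diaries", "Oral history of sports"],
    "archive_b": ["Confederate Constitution", "Civil War letters"],
}


def remote_search(source: str, term: str) -> List[str]:
    """A search service that each remote source must run (federation)."""
    return [t for t in REMOTE_SOURCES[source] if term.lower() in t.lower()]


def federated_search(term: str) -> List[str]:
    """Federation: every query is forwarded to every remote source."""
    results: List[str] = []
    for source in REMOTE_SOURCES:
        results.extend(remote_search(source, term))
    return results


def harvest() -> List[str]:
    """Harvesting: copy metadata once into a local union catalogue."""
    return [title for titles in REMOTE_SOURCES.values() for title in titles]


def local_search(union_catalogue: List[str], term: str) -> List[str]:
    """The service runs locally, on the harvested copy."""
    return [t for t in union_catalogue if term.lower() in t.lower()]


catalogue = harvest()
print(federated_search("civil war"))
print(local_search(catalogue, "civil war"))
```

Both calls return the same titles here; the difference is where the work happens. Federation needs `remote_search` at every source, whereas harvesting only needs each source to expose its metadata.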

27 Contents Early history Key concepts Examples ODL, XOAI OAI Tools Technical Plan Conclusion

28 Example 1: Union Collection of ETDs (Electronic Theses and Dissertations, for Networked Digital Library of Theses and Dissertations, NDLTD)

29 Example 1: Details

30 Example 2: NSDL Information Architecture (essentially as developed by the Technical Infrastructure Workgroup) [Diagram labels: User Interfaces; Usage Enhancement; Collection Building; Portals & Clients; NSDL Services; Other NSDL Services; CI Services (annotation, discussion, personalization, authentication, browsing); Core Services (information retrieval, metadata gathering); Core Collection-Building Services (harvesting, protocols); NSDL Collections; Special Databases; referenced items & collections; Core NSDL “Bus”]

31 Example 2: CITIDEL -> NSDL Computing and Information Technology Interactive Digital Education Library A collection project in the National STEM (science, technology, engineering, and mathematics) education Digital Library – NSDL www.nsdl.nsf.gov www.nsdl.org

32 Example 2: CITIDEL Distributed repository structure

33 Example 2: NSDL Collections (themes relevant to our projects) Discovery of content Classification and cataloguing Acquisition and/or linking; referencing Disciplinary-based themes define a natural body of content, but other possibilities are also encouraged Software tool suites for analysis, modeling, simulation, or visualization Reviewed commentary on pedagogy

34 Contents Early history Key concepts Examples ODL, XOAI OAI Tools Technical Plan Conclusion

35 Open Digital Libraries XOAI-PMH Dissertation work of Hussein Suleman (member of OAI technical committee) Extending the OAI protocol Supporting rapid development of DLs using networks of components Demonstrated with NDLTD, CSTC Described in Dec. 2001 D-Lib Magazine article, and article scheduled for publication

36 Open Digital Libraries Components Running now XML-File (data provider from file system) Union, search, browse, recent, filter E-journal support system Class projects High performance multilingual search Recommender User rating Others discussed Classification/categorization and browsing

37 Component System Approach (Open) DL = Network of Extended OAs Local Archive Data Input Remote Archive Browse Metadata Repository Search Recommend Resource Discovery User Interface OAI/ODL archive OAI/ODL protocol legend

38 Example Architecture (NDLTD) Humboldt Duisburg MIT Filter MIT Browse Union Catalog Search Recent User Interface OAI/ODL archive OAI/ODL protocol legend Virginia Tech PhysNet CalTech Dresden

39 Contents Early history Key concepts Examples ODL, XOAI OAI Tools Technical Plan Conclusion

40 OAI Tools Related resources, e.g., XML, Unicode Submission / author support XML Schema Validator Servers and utilities, e.g., ARC, Kepler, EPrints Repository Explorer Interactive Browsing Testing of parameters Multiple views of data Multilingual support Automatic test suite

41 Author's tools www.physik.uni-oldenburg.de/EPS/mmm

42 XSV Schema Validator

43 ARC (arc.cs.odu.edu)

46 VT Tool: Repository Explorer The Repository Explorer is a tool for browsing and testing Open Archives, by Hussein Suleman You issue commands and see the results You also can perform a sequence of automatic tests http://purl.org/net/oai_explorer

47 VT Tool: RE 1.3

48 VT Tool: Request, Response

49 Contents Early history Key concepts Examples ODL, XOAI OAI Tools Technical Plan Conclusion

50 What will central service look like? (1 of 2) Harvesting from local sites Rich content, drawn from all participating sites Data management Logging and reporting Repository/preservation/mirroring Adding/updating/deleting User interface and support for digital librarians and data providers

51 What will central service look like? (2 of 2) Adding value De-duping Categorization/classification -> browsing Normalization/standardization -> authority control Tools for communication/collaboration/annotation -> security/privacy User interface for both general users and scholars

52 What are needs at local sites? Increasing OAI expertise Connecting OAI with local systems Supporting standards, normalization Supporting continual updating Passing enhancements upstream

53 How can VT help? (1 of 2) Usability studies for central site Help develop consensus Help plan system architecture & services Education/training Provide and support tools/systems Help sites engage, become OAI compliant

54 How can VT help? (2 of 2) Standards MARC-XML ODL Suite Download and configure Use in packaged forms, or re-architected Support Connecting your system into OAI Help with OAI Tools

55 MARC XML-DTD XML Transport format for US-MARC records Standardized metadata exchange format for traditional library services joining OAI

56 Contents Early history Key concepts Examples ODL, XOAI OAI Tools Technical Plan Conclusion

57 Rethink your efforts in terms of providers of Data, Services Reduced work for data providers Tools available Don’t need to offer services Reduced work for service providers Others provide the data Can use tools and systems for OAI, XOAI Results More data becoming available To more people Supported by improved services MetaScholar can be a win-win-win project!

58 Links Open Archives Initiative http://www.openarchives.org OAI Metadata Harvesting Protocol http://www.openarchives.org/OAI/openarchivesprotocol.htm Virginia Tech DLRL OAI Projects http://www.dlib.vt.edu/projects/OAI/ http://oai.dlib.vt.edu/odl Repository Explorer http://purl.org/net/oai_explorer NDLTD http://www.ndltd.org

59 More Links ARC Cross-Archive Search Service http://arc.cs.odu.edu/ XML Schema Validator http://www.w3.org/2001/03/webdata/xsv Dublin Core Metadata Initiative http://www.dublincore.org E-Prints DL-in-a-box http://www.eprints.org XML Tools at W3C http://www.w3.org/XML/#software

