CS 430: Information Discovery, Lecture 13. Case Study: the NSDL

Presentation transcript:

1 CS 430: Information Discovery Lecture 13 Case Study: the NSDL

2 Course Administration

3 The NSDL
The National SMETE Digital Library
The National Science Digital Library
Funded by the National Science Foundation, Directorate for Education and Human Resources, Division of Undergraduate Education

4 The NSDL Library Project
1996  Vision articulated by NSF's Division of Undergraduate Education
1997  National Research Council workshop
1998  Preliminary grants through Digital Libraries Initiative; SMETE-Lib workshop
1999  NSDL Solicitation
Core Integration System projects + 23 others funded
very large Core Integration System project

5 Collections and Services
Scientific and technical information
Materials used in education
Materials tailored to education

6 Core Partners

7 All Partners

8 NSDL Components
Funded by the NSF:
  Core Integration System
  Collection Projects
  Service Projects
Other:
  Any digital collection or service that is relevant to science education, very broadly defined.
Official start date is December 2002.

9 How Big Might the NSDL Be?
The NSDL aims to be comprehensive: all branches of science, all levels of education, very broadly defined.
Five-year targets:
  1,000,000 different users
  10,000,000 digital objects
  100,000 independent sites
Requires:
  Low-cost, scalable technology
  Automated collection-building and maintenance

10 A User's Wish List
To discover materials and services:
  Good science
  Comprehensible to students -- effective for teaching
  Stable -- will not change or disappear
Through services that are appropriate to the user's needs.
No uniform catalog or index to everything
Mixture of for-profit and open access information

11 The Dilemma
Collections vary:
  Format: text, images, datasets, etc.
  Metadata: extensive, minimal, or none; Dublin Core, other standard, or local scheme
  Protocols: HTTP, SQL, Z 39.50, etc.
  Access: open access or restricted
Methods studied in this course have been for homogeneous sets of documents.

12 The Challenge of Interoperability
Technical agreements cover formats, protocols, security systems, etc., so that messages can be exchanged.
Content agreements cover the data and metadata, and include semantic agreements on the interpretation of the messages.
Organizational agreements cover the ground rules for access, for changing collections and services, payment, authentication, etc.
The challenge is to create incentives for independent digital libraries to adopt agreements.

13 Levels of Interoperability
Level        Agreements                                        Example
Federation   Strict use of standards                           AACR, MARC, Z 39.50
             (syntax, semantic, and business)
Harvesting   Digital libraries supply basic metadata;          Open Archives
             simple protocol and registry
Gathering    Digital libraries do not cooperate;               Web crawlers and
             services must seek out information                search engines

14 The General Catalog (Metadata Repository)
[Diagram labels: User portals; Metadata Repository; Distributed collections]

15 Metadata Harvesting (Open Archives Initiative)
[Diagram labels: Distributed collections; Metadata to harvest; Central services, metadata collections, etc.; Central data]

16 Metadata Harvesting
Collections must support:
  Unqualified Dublin Core
Collections may support:
  IMS, FGDC, or one of seven recognized metadata sets
Simple XML tagged format -- protocol derived from Dienst
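
To make the harvesting idea concrete, here is a minimal sketch of how a central service might read unqualified Dublin Core records out of a harvested XML response. It assumes the OAI-PMH 2.0 response layout and namespaces, which the slides do not specify; the sample response, its identifier, and the function name `parse_records` are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespaces assumed from OAI-PMH 2.0 with unqualified Dublin Core (oai_dc).
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical harvested response; the record and its values are made up.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:item-1</identifier>
        <datestamp>2002-10-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Introduction to Plate Tectonics</dc:title>
          <dc:subject>geology</dc:subject>
          <dc:subject>earth science</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def parse_records(xml_text):
    """Return one dict per harvested record: identifier plus Dublin Core fields."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        fields = {}
        dc = rec.find("oai:metadata/oai_dc:dc", NS)
        if dc is not None:
            for element in dc:
                # Drop the "{namespace}" prefix to get the bare element name.
                name = element.tag.split("}", 1)[-1]
                fields.setdefault(name, []).append((element.text or "").strip())
        records.append({"identifier": identifier, "fields": fields})
    return records


if __name__ == "__main__":
    for record in parse_records(SAMPLE_RESPONSE):
        print(record["identifier"], record["fields"])
```

Every Dublin Core element is stored as a list because the elements are repeatable, which matters when collections supply minimal but uneven metadata.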

17 The Information Discovery System
Items are stored in (usually) independent repositories.
Surrogates for items and resources are stored in a central metadata repository.
Items and surrogates become part of the library by way of gathering, harvesting and federated services.
A search service allows items in the library to be discovered.
The metadata repository and search service may be distributed.
The big question: How can we have effective information discovery with such minimal and diverse metadata?

18 The InQuery Retrieval Engine
Developed by Bruce Croft and colleagues at the University of Massachusetts, Amherst
Used in:
  Infoseek
  Library of Congress -- Thomas, American Memory
  White House
  and many more
Highly rated in TREC experiments

19 InQuery: Advanced Features
Ranked output: Combines evidence in the text of the document and the corpus as a whole.
Passage-based retrieval: The probability of relevance is based both on the entire content of a document and the best matching passage in the document.
Simple and complex queries: e.g., simple word-based queries, Boolean queries, phrase-based queries or a combination.
Field-based retrieval: e.g., bill number and type.
Flexible and efficient indexing: Incorporates a variety of document structures (e.g., HTML, MARC, etc.)
Tools for query processing and query expansion

20 How Search Services Fit into the NSDL
[Diagram labels: Portal; Search and Discovery Services; Metadata Repository; Content; connections marked SDLIP?, OAI, http?]
Note: Services use both metadata and automatic indexing of (textual) content

21 Goals of Information Retrieval Service for First Year
Basic metadata search
  e.g., card catalogue
Basic content search
  Provided content is textual
  If content is publicly readable
Combining metadata and content
  e.g., content search restricted by metadata
What service is not provided
  SQL-like access to metadata repository
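
As an illustration of "content search restricted by metadata", the sketch below builds a small inverted index over textual content and then filters the hits by exact metadata values. It is not the NSDL or InQuery implementation; the records, field names (`subject`, `level`), and function names are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical records: a metadata surrogate plus indexable textual content.
RECORDS = {
    "r1": {"metadata": {"subject": "physics", "level": "undergraduate"},
           "content": "Newton's laws of motion with worked examples"},
    "r2": {"metadata": {"subject": "biology", "level": "high school"},
           "content": "Cell motion and the cytoskeleton"},
    "r3": {"metadata": {"subject": "physics", "level": "high school"},
           "content": "Projectile motion lab for students"},
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(records):
    """Map each content term to the set of record ids containing it."""
    index = defaultdict(set)
    for record_id, record in records.items():
        for term in tokenize(record["content"]):
            index[term].add(record_id)
    return index


def search(index, records, query, **restrictions):
    """Return ids whose content has every query term and whose metadata
    matches every restriction (exact match on field value)."""
    terms = tokenize(query)
    if not terms:
        return []
    hits = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(
        record_id
        for record_id in hits
        if all(records[record_id]["metadata"].get(field) == value
               for field, value in restrictions.items())
    )


if __name__ == "__main__":
    index = build_index(RECORDS)
    print(search(index, RECORDS, "motion"))
    print(search(index, RECORDS, "motion", subject="physics"))
```

Keeping the metadata filter separate from the content index means records with no metadata can still be found by content alone, which suits collections whose metadata is minimal or absent.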

22 Future Possible Directions for Information Retrieval Services
Integration of hierarchies
  content-based search for entries in hierarchies
Browsing capabilities
  by metadata values
  by "concepts" automatically extracted from the content
  by hierarchies
Feedback capabilities
  "more like this" while browsing retrieval results
Use of thesaurus
  allowing user to add vocabulary terms
Clustering/grouping
  show/find strongly related items across the repository
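
One simple way a "more like this" feedback feature could work is to rank the other items by cosine similarity of their term-frequency vectors. The slides do not say which method the NSDL would use, so this is only a sketch of that swapped-in technique; the documents and the function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical textual items in the repository.
DOCS = {
    "d1": "plate tectonics and earthquakes",
    "d2": "earthquakes and volcanoes along plate boundaries",
    "d3": "photosynthesis in green plants",
}


def term_vector(text):
    """Term-frequency vector of a text, as a Counter."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity of two term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def more_like_this(doc_id, docs, k=2):
    """Return up to k other document ids, most similar first,
    skipping documents that share no terms with the chosen one."""
    target = term_vector(docs[doc_id])
    scored = [
        (cosine(target, term_vector(text)), other_id)
        for other_id, text in docs.items()
        if other_id != doc_id
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other_id for score, other_id in scored[:k] if score > 0]


if __name__ == "__main__":
    print(more_like_this("d1", DOCS))
```

A production service would normally weight terms (for example by inverse document frequency) and remove stop words such as "and"; both are omitted here to keep the sketch short.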