National Science Digital Library (NSDL) Core Infrastructure Metadata Repository (“union catalog”) Naomi Dushay Cornell University.


1 National Science Digital Library (NSDL) Core Infrastructure Metadata Repository (“union catalog”) Naomi Dushay Cornell University

2 Aggregator Issues: Deleted Records
- indicated but transient
  - reharvested soon enough: no problem, mark our copy “deleted”
  - reharvested as “disappeared”
- not indicated
  - reharvested as “disappeared”
Solution?
- “Full reharvest”
  - Mark all the site’s records in our repository “deleted”
  - Do a full harvest
  - Ingest each newly retrieved record into our repository, “un-deleting” if we over-write an old record
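The three “full reharvest” steps above can be sketched in a few lines. This is only an illustration, not the NSDL implementation: the in-memory `repository` dict, its `site`/`metadata`/`deleted` fields, and the function name `full_reharvest` are all assumptions made for the example.

```python
def full_reharvest(repository, site, harvested_records):
    """Full reharvest for one site (hypothetical data model).

    repository: dict mapping identifier -> {"site", "metadata", "deleted"}
    harvested_records: iterable of (identifier, metadata) from a fresh full harvest
    """
    # Step 1: mark all of the site's records in our repository as deleted.
    for record in repository.values():
        if record["site"] == site:
            record["deleted"] = True

    # Steps 2 and 3: ingest each newly retrieved record. Over-writing an
    # old record "un-deletes" it because the deleted flag is reset.
    for identifier, metadata in harvested_records:
        repository[identifier] = {
            "site": site,
            "metadata": metadata,
            "deleted": False,
        }
    return repository


# Made-up identifiers: record 2 has silently disappeared from the site.
repo = {
    "oai:example.org:1": {"site": "example.org", "metadata": "old A", "deleted": False},
    "oai:example.org:2": {"site": "example.org", "metadata": "old B", "deleted": False},
    "oai:other.org:9": {"site": "other.org", "metadata": "C", "deleted": False},
}
full_reharvest(repo, "example.org", [("oai:example.org:1", "new A")])
```

After the call, record 1 is live with fresh metadata, record 2 stays marked deleted because the site no longer returned it, and the other site's record is untouched.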

3 Aggregator Issues: Poor Quality Harvested Metadata
What is poor quality?
- OAI protocol problems
- XML problems
- metadata “content” problems
- … it’s a knowledge gap
Solutions?
- Clearer documentation
  - “OAI for Dummies” - details coming up
  - “XML for OAI Dummies” - details coming up
  - “Metadata for Dummies” - details coming up
- More, better self-test tools for sites …
  - error messages for “dummies”
  - stricter, more thorough OAI validation checking
  - more XML schema validation of metadata
- user friendly, extremely low entry
  - OAI static repository
- Normalize metadata locally
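As a rough idea of what a low-entry self-test with friendlier error messages might look like, here is a minimal sketch that checks only XML well-formedness using Python's standard library. It does not perform OAI or XML schema validation, and the function name `check_well_formed` and the message wording are assumptions for the example.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None if xml_text parses, else a plain-language error message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # ParseError.position is a (line, column) tuple.
        line, column = err.position
        return (
            f"Not well-formed XML at line {line}, column {column}: {err}. "
            "Common causes: an unescaped '&' or '<', or a tag that is never closed."
        )
    return None


good = check_well_formed("<title>R&amp;D report</title>")
bad = check_well_formed("<title>R&D report</title>")  # raw '&' is not allowed
```

Here `good` is `None`, while `bad` holds a message pointing at line 1 of the input.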

4 “OAI for Dummies”
- identifiers (OAI vs. DC; the need for persistence)
- datestamps ( vs. header vs. dc:date; format confusion)
- resumptionTokens (exclusive argument, stateless vs. stateful)
  - chunk size recommendation or rule of thumb
  - “stateless resumption token” general scheme for User Guidelines doc? (To be indicated via Identify response description?)
- about containers and their use (additional examples)
  - distinction between “about the metadata” and “about the resource” concepts (dc:rights vs. rights described in about)
- sets
- multiple metadata formats are allowed (many sites believe OAI means simple DC only)
  - MUST have valid XML schema
- Web service vs. flat file
- HTTP vs. HTML
We offer:
- Donna Bergmark’s OAI validation tool (email me to get more info)
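One possible shape for a “stateless resumption token” is sketched below: the token itself carries everything the server needs to continue the list, so no per-harvester session is stored. This is an assumed scheme for illustration, not one defined by the OAI-PMH specification or proposed in the slides; the function names, the JSON-in-base64 token format, and the chunk size of 2 are all made up. It also shows the “exclusive argument” rule: a request carries either a resumptionToken or the other arguments, never both.

```python
import base64
import json


def encode_token(state):
    """Pack the list-request state into an opaque, URL-safe string."""
    raw = json.dumps(state, sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_token(token):
    """Recover the list-request state from a token."""
    raw = base64.urlsafe_b64decode(token.encode("ascii"))
    return json.loads(raw.decode("utf-8"))


def list_records_page(records, metadata_prefix=None, token=None, chunk_size=2):
    """Return (page, next_token); next_token is None on the last page."""
    # resumptionToken is an exclusive argument: exactly one of the two
    # must be supplied.
    if (token is None) == (metadata_prefix is None):
        raise ValueError("supply either metadata_prefix or token, not both")
    if token is not None:
        state = decode_token(token)
    else:
        state = {"metadataPrefix": metadata_prefix, "cursor": 0}

    start = state["cursor"]
    page = records[start:start + chunk_size]
    next_cursor = start + len(page)
    if next_cursor < len(records):
        next_token = encode_token({**state, "cursor": next_cursor})
    else:
        next_token = None
    return page, next_token


records = ["r1", "r2", "r3", "r4", "r5"]
page1, token1 = list_records_page(records, metadata_prefix="oai_dc")
page2, token2 = list_records_page(records, token=token1)
page3, token3 = list_records_page(records, token=token2)
```

A plain cursor like this can skip or repeat records if the repository changes between requests; a scheme meant for real use would need to account for that, for example by carrying datestamp bounds in the token.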

5 “XML for OAI Dummies”
- encoding
  - XML encoding
  - character encoding (UTF-8, UTF-16, etc.)
  - URL encoding
  - XML vs. URL vs. character
- Namespaces
  - what are they for? how are they used?
  - full syntax explanation
    - declaration, prefix, URI, scope, default, missing …
- XML schemas
  - what are they for? how are they used?
  - xsi:schemaLocation
  - validation: what it will and won’t find
  - validators: what’s there, what’s best for “my” site?
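The “XML vs. URL vs. character” distinction is easier to see on one concrete string. The sketch below applies each kind of encoding to the same made-up value using only Python's standard library; the sample string is an assumption chosen because it contains both an ampersand and a non-ASCII character.

```python
from urllib.parse import quote
from xml.sax.saxutils import escape

text = "R&D caf\u00e9"  # "R&D café": 8 characters

# XML encoding (escaping): protects markup characters inside XML content.
xml_escaped = escape(text)          # 'R&amp;D café'

# URL encoding: protects reserved characters inside a URL; non-ASCII
# characters are first turned into UTF-8 bytes, then percent-encoded.
url_encoded = quote(text, safe="")  # 'R%26D%20caf%C3%A9'

# Character encoding: how the characters are stored as bytes.
utf8_bytes = text.encode("utf-8")       # 9 bytes (é takes two)
utf16_bytes = text.encode("utf-16-le")  # 16 bytes (two per character)
```

The three are independent layers: a value placed in an OAI response needs XML escaping and a declared character encoding, while the same value placed in a request URL needs URL encoding instead.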

6 “Metadata for Dummies”
- simple DC vs. qualified DC
- What refers to metadata, what refers to resource?
  - Think identifiers
  - Think rights
  - other …
We offer:
- Metadata Primer (currently being revised)
  - email me to get URL

7 Normalize Metadata Locally
- Aim to improve services (e.g. search results)
- Improve quality when possible
  - Supply missing information, if known
    - site is about Math; add “Mathematics”
  - Correct wrong information, when possible
    - “text/pdf” → “application/pdf” in
- for further details, read our paper Analyzing Metadata for Effective Use and Re-use, submitted to DC 2003
  - email me to get URL for draft
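The two normalization examples above (supplying a known subject, correcting a wrong MIME type) might look like this in code. This is a minimal sketch, not the approach from the paper: the record layout (a dict of field name to list of values), the field names `format` and `subject`, and the `FORMAT_CORRECTIONS` table are assumptions for illustration, since the slide does not say where the values live.

```python
# Hypothetical correction table; only the pair from the slide is included.
FORMAT_CORRECTIONS = {"text/pdf": "application/pdf"}


def normalize(record, site_subject=None):
    """Return a normalized copy of record; the harvested original is untouched."""
    fixed = {field: list(values) for field, values in record.items()}

    # Correct wrong information, when possible.
    if "format" in fixed:
        fixed["format"] = [
            FORMAT_CORRECTIONS.get(value.strip().lower(), value)
            for value in fixed["format"]
        ]

    # Supply missing information, if known (e.g. the whole site is about Math).
    if site_subject is not None:
        subjects = fixed.setdefault("subject", [])
        if site_subject not in subjects:
            subjects.append(site_subject)

    return fixed


harvested = {"title": ["Intro to Algebra"], "format": ["text/pdf"]}
normalized = normalize(harvested, site_subject="Mathematics")
```

Working on a copy keeps the record as harvested available alongside the normalized version, so a later change to the correction rules can be re-applied from the original.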

