6. Applying metadata standards: Controlled vocabularies and quality issues Metadata Standards and Applications Workshop.

Presentation transcript:

6. Applying metadata standards: Controlled vocabularies and quality issues Metadata Standards and Applications Workshop

Goals of Session
  Understand how different controlled vocabularies are used in metadata
  Investigate metadata quality issues

Applying metadata: Controlled vocabularies
  Values that occur in metadata
  Often documented and published
  Goal to reduce ambiguity
  Control of synonyms
  Establishment of formal relationships among terms (where appropriate)
  Testing and validation of terms
  Metadata registries may be established

Why bother?
  To improve retrieval, i.e., to get an optimum balance of precision and recall
  Precision: how many of the retrieved records are relevant?
  Recall: how many of the relevant records did you retrieve?
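The two measures above can be written as ratios over sets of record identifiers. The following is a minimal sketch, not part of the original slides; the function name and the sample record identifiers are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two collections of record identifiers."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # records that are both retrieved and relevant
    # Precision: share of the retrieved records that are relevant.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Recall: share of the relevant records that were retrieved.
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search result: four records retrieved, three relevant overall.
retrieved = {"rec1", "rec2", "rec3", "rec4"}
relevant = {"rec2", "rec4", "rec5"}
precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.50 recall=0.67
```

Here two of the four retrieved records are relevant (precision 0.50), and two of the three relevant records were found (recall 0.67).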

Types of Controlled Vocabularies
  Lists of enumerated values
  Taxonomy
  Thesaurus
  Classification Schemes
  Ontology

Lists
  A list is a simple group of terms
  Example: Alabama, Alaska, Arkansas, California, Colorado, ...
  Frequently used in Web site pick lists and pull-down menus

Taxonomies
  A taxonomy is a set of preferred terms, all connected by a hierarchy or polyhierarchy
  Example:
    Chemistry
      Organic chemistry
        Polymer chemistry
          Nylon
  Frequently used in web navigation systems

Thesauri
  A thesaurus is a controlled vocabulary with multiple types of relationships
  Example:
    Rice
      UF paddy
      BT Cereals
      BT Plant products
      NT Brown rice
      RT Rice straw

Ontology
  One definition: “An arrangement of concepts and relations based on an underlying model of reality.”
  Example: organs, symptoms, and diseases in medicine
  No real agreement on definition: every community uses the term in a slightly different way

Thesaurus Relationships
  Relationship types:
    Use/Used For: indicates the preferred term
    Hierarchy: indicates broader and narrower terms
    Associative: almost unlimited types of relationships may be used
  The thesaurus is the most complex format for controlled vocabularies and is widely used.

Equivalence Relationships
  Term A and Term B overlap completely: A = B

Hierarchical Relationships
  Term A is included in Term B
  [Diagram: A nested inside B]

Associative Relationships
  Semantics of terms A and B overlap
  [Diagram: A and B partially overlapping]

Expressing Relationship

  Relationship            Rel. Indicator   Abbreviation
  Equivalence (synonymy)  Use              None or U
                          Used for         UF
  Hierarchy               Broader term     BT
                          Narrower term    NT
  Association             Related term     RT
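As an illustration of how these relationship indicators might be held in software, the sketch below models the "Rice" entry from the Thesauri slide. The `Term` class, its field names, and the helper functions are hypothetical and do not come from any thesaurus standard or library.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One preferred term with its thesaurus relationships (illustrative only)."""
    label: str
    used_for: list = field(default_factory=list)  # UF: non-preferred synonyms
    broader: list = field(default_factory=list)   # BT: broader terms
    narrower: list = field(default_factory=list)  # NT: narrower terms
    related: list = field(default_factory=list)   # RT: associated terms


rice = Term(
    "Rice",
    used_for=["paddy"],
    broader=["Cereals", "Plant products"],
    narrower=["Brown rice"],
    related=["Rice straw"],
)


def build_use_index(terms):
    """Map every entry term (lowercased) to the preferred term to USE."""
    index = {}
    for term in terms:
        index[term.label.lower()] = term.label
        for synonym in term.used_for:
            index[synonym.lower()] = term.label
    return index


def format_entry(term):
    """Render a term in the conventional UF/BT/NT/RT display."""
    lines = [term.label]
    for code, values in (
        ("UF", term.used_for),
        ("BT", term.broader),
        ("NT", term.narrower),
        ("RT", term.related),
    ):
        lines.extend(f"  {code} {value}" for value in values)
    return "\n".join(lines)


use_index = build_use_index([rice])
print(use_index["paddy"])  # Rice
print(format_entry(rice))
```

The lookup index is the equivalence relationship in action: a searcher or cataloger who enters "paddy" is directed to the preferred term "Rice".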

Vocabulary Management
  The degree of control over a vocabulary is (mostly) independent of its type
  Uncontrolled: anybody can add anything at any time, and no effort is made to keep things consistent
  Managed: software makes sure there is a list that is consistent (no duplicates, no orphan nodes) at any one time. Almost anybody can add anything, subject to consistency rules
  Controlled: a documented process is followed for the update of the vocabulary. Few people have authority to change the list. Software may help, but the emphasis is on human processes and custodianship
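The consistency rules of a managed vocabulary (no duplicates, no orphan nodes) could be enforced with a check along the following lines. This is a sketch under stated assumptions: duplicates are detected case-insensitively, and an "orphan" is taken to mean a term that is neither a declared top term nor attached to a parent that exists in the vocabulary. The function and parameter names are invented for the example.

```python
def check_consistency(terms, parent_of, roots):
    """Return a list of problems found in a hierarchical vocabulary.

    terms     -- list of term labels
    parent_of -- dict mapping a term to its parent term
    roots     -- set of terms allowed to have no parent
    """
    problems = []

    # Rule 1: no duplicates (compared case-insensitively, an assumption).
    seen = set()
    for term in terms:
        key = term.casefold()
        if key in seen:
            problems.append(f"duplicate: {term}")
        seen.add(key)

    # Rule 2: no orphan nodes.
    known = set(terms)
    for term in sorted(known):
        if term in roots:
            continue
        parent = parent_of.get(term)
        if parent is None or parent not in known:
            problems.append(f"orphan: {term}")

    return problems


terms = ["Chemistry", "Organic chemistry", "Polymer chemistry",
         "Nylon", "nylon", "Rice straw"]
parent_of = {
    "Organic chemistry": "Chemistry",
    "Polymer chemistry": "Organic chemistry",
    "Nylon": "Polymer chemistry",
    "nylon": "Polymer chemistry",
}
problems = check_consistency(terms, parent_of, roots={"Chemistry"})
print(problems)  # ['duplicate: nylon', 'orphan: Rice straw']
```

A managed system would run such a check on every addition; a controlled vocabulary would additionally route the change through a documented human review.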

Encoding controlled vocabularies
  MARC 21
    Authority Format used for names, subjects, series
    Classification Format used for formal classification schemes
  MADS (a derivative of MARC)
  Simple Knowledge Organization System (SKOS)
    Intended primarily for concept schemes (e.g., not names)

SKOS
  “SKOS Core provides a model for expressing the basic structure and content of concept schemes such as thesauri, classification schemes, subject heading lists, taxonomies, 'folksonomies', other types of controlled vocabulary, and also concept schemes embedded in glossaries and terminologies.”
  --SKOS Core Guide

The skos:Concept class allows you to assert that a resource is a conceptual resource. That is, the resource is itself a concept.

Preferred and Alternative Lexical Labels
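The distinction between preferred and alternative lexical labels can be sketched by writing one concept out as Turtle with `skos:prefLabel` and `skos:altLabel`. The example below uses only the Python standard library; the concept URI under `example.org` is hypothetical, and treating the thesaurus entry term "paddy" as an alternative label for "Rice" is an assumption made for illustration. Labels are not escaped, so the sketch is only suitable for simple strings without quotation marks.

```python
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"


def concept_to_turtle(uri, pref_label, alt_labels=(), lang="en"):
    """Serialize one concept as Turtle with one preferred label
    and any number of alternative labels."""
    statements = ["a skos:Concept", f'skos:prefLabel "{pref_label}"@{lang}']
    statements += [f'skos:altLabel "{label}"@{lang}' for label in alt_labels]
    body = " ;\n    ".join(statements)
    return f"@prefix skos: <{SKOS_NS}> .\n\n<{uri}>\n    {body} .\n"


# Hypothetical concept URI; "Rice" is preferred, "paddy" is an alternative.
turtle = concept_to_turtle("http://example.org/concepts/rice", "Rice", ["paddy"])
print(turtle)
```

The function accepts exactly one preferred label per language and any number of alternatives, mirroring the usual practice of giving a concept a single preferred label in a given language.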

Registries: the Big Picture (Adapted from Wagner & Weibel, “The Dublin Core Metadata Registry: Requirements, Implementation, and Experience” JoDI, 2005)

Why Registries?
  Support interoperability
  Discovery of available schemes and schemas for description of resources
  Promote reuse of extant schemes and schemas
  Access to machine-readable and human-readable services
  Support for crosswalking and translation
  Coping with different metadata schemes

Declaration, documentation, publication
  To identify the source of a vocabulary, e.g., a term comes from LCSH, as identified in my metadata by a URI: info:lcsh
  To clarify a term and its definition
  To publish controlled vocabularies and have access to information about each term

Some uses for registries
  Metadata Schemas
  Crosswalks between metadata schemas
  Controlled Vocabularies
  Mappings between vocabularies
  Application Profiles
  Schema and vocabulary information in combination

Metadata registries
  Some are formal, others are informal lists
  Some formal registries:
    Dublin Core registry of DC terms
    NSDL registry of vocabularies used
    LC is establishing registries
      MARC and ISO code lists
      Enumerated value lists
      LCSH in SKOS

Applying metadata standards: quality issues
  Defining quality
  Criteria for assessing quality
  Levels of quality
  Quality indicators

Determining and Ensuring Quality
  What constitutes quality?
  Techniques for evaluating and enforcing consistency and predictability
  Automated metadata creation: advantages and disadvantages
  Metadata maintenance strategies

Quality Measurement: Criteria
  Completeness
  Accuracy
  Provenance
  Conformance to expectations
  Logical consistency and coherence
  Timeliness (Currency and Lag)
  Accessibility
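Of these criteria, completeness is the easiest to measure automatically. One possible measure, sketched below, is the fraction of expected elements that carry a non-empty value in each record. The element names and sample records are hypothetical, and the slides do not prescribe this particular formula.

```python
def completeness(records, required_elements):
    """Per-record fraction of required elements that have a non-empty value."""
    scores = []
    for record in records:
        filled = 0
        for element in required_elements:
            value = record.get(element)
            # Count the element only if it is present and not blank.
            if value is not None and str(value).strip():
                filled += 1
        scores.append(filled / len(required_elements))
    return scores


# Hypothetical element set and records.
required = ["title", "creator", "date", "subject"]
records = [
    {"title": "Rice cultivation", "creator": "A. Author",
     "date": "2006", "subject": "Rice"},
    {"title": "Untitled", "creator": "", "date": None},
]
scores = completeness(records, required)
print(scores)  # [1.0, 0.25]
```

A score like this says nothing about accuracy: a record can be fully populated and still wrong, which is why the other criteria need separate checks.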

Basic Quality Levels
  Semantic structure (“format,” “schema” or “element set”)
  Syntactic structure (administrative wrapper and technical encoding)
  Data values or content

Quality Indicators: Tier 1
  Technically valid
    Defined technical schema; automatic validation
  Appropriate namespace declarations
    Each element defined within a namespace; not necessarily machine-resolvable
  Administrative wrapper present
    Basic provenance (unique identifier, source, date)
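The "basic provenance" indicator lends itself to an automatic check. The sketch below assumes the administrative wrapper has already been parsed into a dictionary; the key names `identifier`, `source`, and `date` and the sample values are illustrative choices, not names taken from a specific schema.

```python
REQUIRED_PROVENANCE = ("identifier", "source", "date")


def missing_provenance(wrapper):
    """Return the basic provenance fields that are absent or blank."""
    return [
        name
        for name in REQUIRED_PROVENANCE
        if not str(wrapper.get(name) or "").strip()
    ]


# Hypothetical wrapper with an empty date.
wrapper = {
    "identifier": "oai:example.org:123",
    "source": "Example Library",
    "date": "",
}
missing = missing_provenance(wrapper)
print(missing)  # ['date']
```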

Quality Indicators: Tier 2
  Controlled vocabularies
    Linked to publicly available sources of terms by unique tokens
  Elements defined and documented by a specific community
    Preferably an available application profile
  Full complement of general elements relevant to discovery
  Provenance at a more detailed level
    Methodology used in creation of metadata?
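Linking values to a vocabulary by a unique token makes them checkable. In the sketch below, each metadata value is a pair of a scheme token and a term, and the term is validated against the term list registered for that token. The token `example:states` and the in-memory registry are hypothetical stand-ins for a published vocabulary source.

```python
# Hypothetical registry: scheme token -> set of allowed terms.
VOCABULARIES = {
    "example:states": {"Alabama", "Alaska", "Arkansas", "California", "Colorado"},
}


def validate_value(scheme, term, vocabularies=VOCABULARIES):
    """Check a (scheme token, term) pair against the registered vocabulary."""
    if scheme not in vocabularies:
        return "unknown scheme"
    if term in vocabularies[scheme]:
        return "ok"
    return "term not in vocabulary"


print(validate_value("example:states", "Alaska"))  # ok
print(validate_value("example:states", "Alasca"))  # term not in vocabulary
print(validate_value("example:other", "Alaska"))   # unknown scheme
```

Without the token, the misspelled "Alasca" would pass silently as free text; with it, the error is detectable by software.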

Quality Indicators: Tier 3
  Expression of metadata intentions based on documented AP endorsed by a specialized community and registered in conformance to a general metadata standard
  Source of data with known history of updating, including updated controlled vocabularies
  Full provenance information (including full source info), referencing practical documentation

Improving Metadata Quality …
  Documentation
    Basic standards, best practice guidelines, examples
  Exposure and maintenance of local and community vocabularies
  Application Profiles
  Training materials, tools, methodologies