Vocabularies Joseph T. Tennis The University of British Columbia Vocabularies Tutorial Manzanillo, Mexico October 5, 2006.


Vocabularies in the Networked Environment, DC 2006, Manzanillo, Mexico (c) Joseph T. Tennis

Outline: Vocabularies
1. Semantics: Defining, Developing, and Reusing
2. Posting to the web: Identifying, Declaring, Publishing, and Registering
3. Reuse on the web: Repurposing and Describing

Defining Vocabularies

Vocabularies: a prescribed set of consistently used and carefully defined terms (DCMI Glossary).
Types covered by ANSI/NISO Z39.19: Lists, Synonym Rings, Taxonomies, Thesauri.

Examples:
Art and Architecture Thesaurus
NASA Thesaurus
Medical Subject Headings (MeSH)
DCMI Type Vocabulary

Vocabularies are made up of:
Terms
Definitions (by notes, by relationships, or both)

A term from MeSH: Respiratory Therapy Department, Hospital
Definition: Hospital department which is responsible for the administration of diagnostic pulmonary function tests and of procedures to restore optimum pulmonary ventilation.

Under the Dublin Core glossary definition, folksonomies do not count as vocabularies.
Examples: del.icio.us, flickr, connotea.
This is because, on the whole, they are neither consistently used nor carefully defined.

Networked environment: the Internet, where humans and machines can link to other humans and machines.
We want to design this linking so that it is meaningful to all parties involved.
This leads to recommending best practice for vocabulary specification and reuse in this environment.

Developing Vocabularies

Observe the use of terms and control the meaning of concepts:
Gather terms, concepts, and uses of those terms and concepts.
Document and make explicit the relationships between these terms and concepts.
Make decisions about what to include and exclude based on use.
The value here lies in the decision to name some things and exclude others.

For more on developing vocabularies see:
Aitchison, Gilchrist, and Bawden’s 2000 “Thesaurus Construction and Use: A Practical Manual”
ANSI/NISO Z39.19, Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies

Reusing Vocabularies

Vocabularies are the result of a huge amount of effort, and if they are owned by an institution, they are updated and maintained.
If vocabularies are available at addressable and machine-processable parts of the networked environment, we can facilitate reuse.

Vocabularies can be reused if you have permission to reuse them.
Machines can reuse vocabularies if they are:
identified (given a URI reference),
declared (machine processable),
published (web accessible),
registered (contextualized and maintained).

Recap 1

Defined: vocabularies and the networked environment
Developed
Reused
Reuse is the key to utilizing vocabularies to their full potential in the networked environment.

Identifying Vocabularies

“URIs identify resources and so are central to the Semantic Web enterprise. Using a global naming convention … provides the global network effects that drive the Web’s benefits. URIs have global scope and are interpreted … across contexts. Associating a URI with a resource means that anyone can link to it, refer to it, or retrieve a representation of it.”
Nigel Shadbolt, Wendy Hall, & Tim Berners-Lee, “The Semantic Web Revisited”

URIs are required by the DC Abstract Model:
“The Dublin Core Abstract Model requires that all terms (elements, element refinements, encoding schemes and controlled vocabulary terms) … that are compliant with the model must be assigned a URI reference that identifies the term.”
Andy Powell, “Guidelines for assigning identifiers to metadata terms”

Vocabularies contain terms.
Terms are resources.
They need to sit at a single place in the network.
They need a URI.
To that end, terms within a vocabulary need to be declared using a URI.
The URI should be persistent.

Strategies for identifying vocabularies and terms:
Use a project-specific URL: questionable persistence.
Use a PURL: a reliable intermediary (resolution service) for persistence.
Use an “info” URI, e.g., info:ddc/22/eng//: persistent identification, but info URIs cannot be “resolved” using current Web browsers.
See the DCMI Working Draft “Guidelines for assigning identifiers to metadata terms”.
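The three identification strategies can be told apart mechanically from the identifier itself. The following is a minimal sketch, assuming that a PURL is recognized by the host purl.org and that any other http(s) address counts as project-specific; the function name and the example.org address are hypothetical.

```python
from urllib.parse import urlparse


def identifier_strategy(uri: str) -> str:
    """Guess which identification strategy a term URI follows."""
    parts = urlparse(uri)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    if scheme == "info":
        # Persistent, but not resolvable in a Web browser.
        return "info URI"
    if scheme in ("http", "https"):
        # Assumption: PURLs are served from purl.org.
        if host == "purl.org" or host.endswith(".purl.org"):
            return "PURL"
        return "project-specific URL"
    return "other"


if __name__ == "__main__":
    for example in (
        "http://example.org/myproject/terms/Collection",  # hypothetical
        "http://purl.org/dc/dcmitype/Collection",
        "info:ddc/22/eng//",
    ):
        print(example, "->", identifier_strategy(example))
```

A check like this says nothing about whether an identifier is actually persistent; that depends on the commitment of whoever maintains it.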

Declaring Vocabularies

Declaring a vocabulary in the networked environment means we create a machine-processable representation of the vocabulary and its terms by means of a schema language: XML and RDF/XML.

Excerpt from the RDF/XML declaration of the DCMI Type term Collection (most markup not preserved in this transcript):
Collection
An aggregation of resources.
<dcterms:issued> </dcterms:issued>
…</rdf:RDF>

Excerpt from the RDF/XML description of the DCMI Types namespace (most markup not preserved in this transcript):
The DCMI Types namespace providing access to its content by means of an RDF Schema
The Dublin Core Metadata Initiative
The Dublin Core Types namespace provides URIs for the entries of the DCMI Type Vocabulary. Entries are declared using RDF Schema language to support RDF applications. The Schema will be updated according to dc-usage decisions.
English
<dcterms:issued> </dcterms:issued><dcterms:modified> </dcterms:modified></rdf:Description>
The DCMI Type Vocabulary provides a general, cross-domain list of approved terms that may be used as values for the Resource Type element to identify the genre of a resource.
<dcterms:issued> </dcterms:issued></dcterms:TypeScheme></rdf:RDF>
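As a rough illustration of what declaring a term in RDF/XML involves, the sketch below builds a small declaration for the Collection term using only the Python standard library. It is not the DCMI schema itself: the choice of rdfs:Class, rdfs:label, and rdfs:comment is an assumption made for the example, and the function name declare_term is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def declare_term(uri: str, label: str, comment: str) -> str:
    """Return an RDF/XML string declaring one vocabulary term."""
    # Register prefixes so the output uses rdf: and rdfs: rather than ns0:, ns1:.
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("rdfs", RDFS)
    root = ET.Element(f"{{{RDF}}}RDF")
    # Assumption: the term is modelled as an rdfs:Class identified by its URI.
    term = ET.SubElement(root, f"{{{RDFS}}}Class", {f"{{{RDF}}}about": uri})
    ET.SubElement(term, f"{{{RDFS}}}label").text = label
    ET.SubElement(term, f"{{{RDFS}}}comment").text = comment
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(declare_term(
        "http://purl.org/dc/dcmitype/Collection",
        "Collection",
        "An aggregation of resources.",
    ))
```

The point of the declaration is that the term's URI, label, and definition travel together in a form a machine can parse, which is what the later publishing and registering steps rely on.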

Versioning is an open issue at this point. How do you make reference to both outdated vocabularies and the most current vocabularies?

Publishing Vocabularies

Vocabularies, once given a URI and bound in a machine-readable schema, should be web-accessible.
The published vocabulary should be maintained by the owner(s) of that vocabulary.
It should be given a URL.
The URL should be resolvable and persistent.

For many, publishing will mean HTML pages that narrate the structure of the vocabulary.
However, publishing is not just .html, but also .xml, .rdf, or .owl files offered through content negotiation, so that RDF/XML can be served to a machine.
See: Fix for RDF/XML; Content Negotiation.
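Content negotiation means the server picks a representation based on the Accept header the client sends. Below is a simplified sketch of that selection, assuming two hypothetical files (dcmitype.html and dcmitype.rdf); it handles q-values and the */* wildcard but ignores partial wildcards such as text/*, so it is an illustration rather than a complete implementation of HTTP negotiation.

```python
from typing import Dict, Optional

# Hypothetical representations of one vocabulary, keyed by media type.
# The first entry is what a client asking for "*/*" receives.
AVAILABLE: Dict[str, str] = {
    "text/html": "dcmitype.html",
    "application/rdf+xml": "dcmitype.rdf",
}


def choose_representation(accept_header: str, available: Dict[str, str]) -> Optional[str]:
    """Return the file matching the client's most preferred media type, or None."""
    candidates = []
    for position, item in enumerate(accept_header.split(",")):
        fields = [field.strip() for field in item.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0  # default when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        # Sort by descending quality, then by order of appearance.
        candidates.append((-quality, position, media_type))
    for negative_quality, _, media_type in sorted(candidates):
        if negative_quality == 0:
            continue  # q=0 marks a type as not acceptable
        if media_type in available:
            return available[media_type]
        if media_type == "*/*" and available:
            return next(iter(available.values()))
    return None


if __name__ == "__main__":
    print(choose_representation("application/rdf+xml", AVAILABLE))
    print(choose_representation("text/html,application/xhtml+xml;q=0.9,*/*;q=0.8", AVAILABLE))
```

With this arrangement a browser and an RDF-aware harvester can request the same vocabulary URL and each receive a form it can use.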

Registering Vocabularies

Third party registries:
Have a mandate to maintain published vocabularies from multiple parties.
Require an explication of context (definitions, relationships, documentation, pointers to these, identification of owner(s) and editor(s)).
Must commit to the requirements that come before registration: identification, declaration, and publication.

The value of registering vocabularies lies in the registry’s ability to serve up versions of your vocabulary, contextualize your vocabulary, and maintain persistence. A registry could also help you identify, declare, and publish your vocabulary, offering you complete networked vocabulary services.
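To make the idea of a registry serving up versions concrete, here is a toy in-memory sketch. It does not model any real registry service: the class names, fields, and example.org addresses are all hypothetical, and a real registry would also hold the contextual documentation described above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class VocabularyVersion:
    version: str   # version label chosen by the owner
    location: str  # where the declared schema for this version is published


@dataclass
class RegistryEntry:
    uri: str       # the vocabulary's identifier
    owner: str
    versions: List[VocabularyVersion] = field(default_factory=list)


class Registry:
    """Hypothetical third-party registry keeping every registered version."""

    def __init__(self) -> None:
        self._entries: Dict[str, RegistryEntry] = {}

    def register(self, uri: str, owner: str, version: str, location: str) -> RegistryEntry:
        entry = self._entries.setdefault(uri, RegistryEntry(uri, owner))
        entry.versions.append(VocabularyVersion(version, location))
        return entry

    def resolve(self, uri: str, version: Optional[str] = None) -> Optional[str]:
        """Return the location of a given version, or of the latest if none is named."""
        entry = self._entries.get(uri)
        if entry is None or not entry.versions:
            return None
        if version is None:
            return entry.versions[-1].location
        for candidate in entry.versions:
            if candidate.version == version:
                return candidate.location
        return None


if __name__ == "__main__":
    registry = Registry()
    vocab = "http://example.org/vocab/"  # hypothetical vocabulary URI
    registry.register(vocab, "Example Library", "1.0", "http://example.org/vocab/1.0.rdf")
    registry.register(vocab, "Example Library", "1.1", "http://example.org/vocab/1.1.rdf")
    print(registry.resolve(vocab))         # latest version
    print(registry.resolve(vocab, "1.0"))  # an outdated version, still reachable
```

Keeping older versions addressable is one possible answer to the versioning question raised under declaring: references to outdated vocabularies stay valid while the unqualified identifier follows the current one.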

Recap 2

The networked environment is designed to help link humans and machines to humans and machines.
We can link humans and machines to vocabularies for human and machine use if we identify, declare, publish, and register vocabularies and their constituent terms.

Identify (through URI)
Declare (through machine-processable representation)
Publish (through web-accessible serving)
Register (through submission to and contextualization in a third-party server and services, i.e., a registry)

Repurposing Vocabularies

Once vocabularies have been registered, you can create repurposed vocabularies.
For example, you can repurpose a subset of DC Terms for your work.
You can also extend DC Terms to satisfy your needs.
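Repurposing by subset and by extension can be pictured as assembling a list of term URIs from more than one namespace. The sketch below assumes a hypothetical local namespace at example.org and a hypothetical local term shelfLocation; the four DC Terms names are used only as examples of a subset.

```python
from typing import Dict, Iterable

DCTERMS = "http://purl.org/dc/terms/"
LOCAL = "http://example.org/terms/"  # hypothetical local namespace


def build_profile(reused: Iterable[str], local: Iterable[str]) -> Dict[str, str]:
    """Map each term URI in the repurposed vocabulary to the namespace it comes from."""
    profile: Dict[str, str] = {}
    for name in reused:
        profile[DCTERMS + name] = "DC Terms (reused)"
    for name in local:
        profile[LOCAL + name] = "local extension"
    return profile


if __name__ == "__main__":
    profile = build_profile(
        reused=["title", "creator", "subject", "issued"],  # a subset of DC Terms
        local=["shelfLocation"],                            # hypothetical extension
    )
    for uri, source in profile.items():
        print(uri, "-", source)
```

Because reused terms keep their original URIs, anyone processing the repurposed vocabulary can still recognize them as DC Terms.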

Describing Vocabularies

Once we have vocabularies identified, declared, published, and registered, we want to move them around the networked environment.

In order to do this we need to wrap metadata around the vocabulary, describing it so we can:
make use of it in a different context,
make relationships and definitions explicitly machine processable,
map from one vocabulary to another,
identify differences between versions.

SKOS: Simple Knowledge Organisation Systems
A W3C initiative
A lightweight specification for metadata about vocabularies
The purpose is to make meaningful assertions about vocabularies on the web.

“SKOS is an area of work developing specifications and standards to support the use of knowledge organisation systems (KOS) such as thesauri, classification schemes, subject heading lists, taxonomies, other types of controlled vocabulary, and perhaps also terminologies and glossaries, within the framework of the Semantic Web.”

SKOS identifies:
Concepts (through URIs)
Labels
Relationships between concepts
Change Notes & Scope Notes
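The pieces SKOS identifies (a concept URI, a label, a relationship, and notes) can be seen together in a small RDF/XML description. The sketch below uses the SKOS Core properties skos:prefLabel, skos:broader, and skos:scopeNote with the MeSH term quoted earlier; the concept URIs under example.org are hypothetical and are not MeSH's own identifiers.

```python
import xml.etree.ElementTree as ET
from typing import Optional

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def skos_concept(uri: str, pref_label: str, broader: Optional[str] = None,
                 scope_note: Optional[str] = None, lang: str = "en") -> str:
    """Return an RDF/XML string describing one concept with SKOS."""
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("skos", SKOS)
    root = ET.Element(f"{{{RDF}}}RDF")
    concept = ET.SubElement(root, f"{{{SKOS}}}Concept", {f"{{{RDF}}}about": uri})
    label = ET.SubElement(concept, f"{{{SKOS}}}prefLabel", {XML_LANG: lang})
    label.text = pref_label
    if broader is not None:
        # A relationship between concepts is expressed as a link to another URI.
        ET.SubElement(concept, f"{{{SKOS}}}broader", {f"{{{RDF}}}resource": broader})
    if scope_note is not None:
        note = ET.SubElement(concept, f"{{{SKOS}}}scopeNote", {XML_LANG: lang})
        note.text = scope_note
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(skos_concept(
        "http://example.org/concepts/respiratory-therapy-department",  # hypothetical
        "Respiratory Therapy Department, Hospital",
        broader="http://example.org/concepts/hospital-departments",    # hypothetical
        scope_note="Hospital department which is responsible for the administration "
                   "of diagnostic pulmonary function tests and of procedures to "
                   "restore optimum pulmonary ventilation.",
    ))
```

Expressed this way, the definition and the relationship are no longer prose for a human reader only; another system can follow the broader link or read the scope note directly.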

SKOS is becoming less lightweight through community-driven development. The community is wrestling with mapping, versioning, and expressiveness, and these and other factors contribute to the expansion of SKOS. Anyone can contribute to this discussion.

Recap 3

Repurposing: The networked environment allows us to repurpose extant vocabularies in whole or in part.
Describing: In order to ship vocabularies around the networked environment with relationships and definitions intact, we must describe them in a standard way.

Documents

Defining, Developing and Reusing:
ANSI/NISO Z39.19, Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies
Aitchison, Gilchrist, and Bawden, “Thesaurus Construction and Use: A Practical Manual”, 4th ed.
Willpower Information Management Consultants

Identifying:
RFC 3986, URI Generic Syntax, 2005
Naming and Addressing: URIs, URLs…

Declaring and Publishing:
Expressing Simple Dublin Core in RDF/XML
Expressing Simple Dublin Core in XML

Registering:
ISO/IEC Metadata Registries
Hillmann et al., “A Metadata Registry from Vocabularies Up: the NSDL Registry Project”

Repurposing:
CWA Guidance information for naming, versioning, evolution, and maintenance of element declarations and application profiles
DC Application Profiles Guidelines

Describing:
SKOS

Thank you
jtennis [at] interchange.ubc.ca
Acknowledgements: Stuart A. Sutton, University of Washington; Diane Hillmann, Cornell University