ISO 25964-1: a new standard for development of thesauri and exchange of thesaurus data
Stella G Dextre Clarke and Johan De Smedt



What is ISO 25964?
- ISO 25964: Thesauri and interoperability with other vocabularies
  - Part 1: Thesauri for information retrieval
  - Part 2: Interoperability with other vocabularies
- It updates ISO 2788 and ISO 5964, based on BS 8723, with much reworking
- Part 1, published in August 2011, covers monolingual and multilingual thesauri
- Part 2, to be published in 2012, covers mapping between thesauri and other types of vocabulary
- Information retrieval is seen as the main application; mapping applies to index terms or to search terms

[Diagram: ISO 2788 (1986) and ISO 5964 (1985), plus new content adapted from BS 8723, pass through extensive revision to become ISO 25964 Part 1 (expected 2011) and Part 2 (expected 2012).]

What’s in Part 1?
All that was in ISO 2788 and ISO 5964, revised and extended to include:
- thesaurus content and construction, monolingual or multilingual
- guidance on applying facet analysis to thesauri
- guidance on managing thesaurus development and maintenance
- functional requirements for software to manage thesauri
- a data model and derived XML schema

What’s in Part 2?
- Models for mapping
- Guidelines for mapping
- Recommendations on mapping types
- How to handle pre-coordination
- Mapping to vocabularies other than thesauri: classification schemes, file plans, taxonomies, subject heading schemes, ontologies, synonym rings, terminologies and name authority lists
- Brief guidance on handling mappings data

Want a copy of ISO 25964-1?
- Download it from ISO at talogue_detail.htm?csnumber=53657
- Order it from your national standards body (e.g. BSI, DIN, ANSI, AFNOR)
- Some public or academic reference libraries may stock it
- It is not cheap to purchase
- However, the XML schema for exchange of thesaurus data is in an Annex, which is available online without charge or password control. Go to

Want a copy of ISO 25964-2?
- A draft, “ISO DIS 25964-2”, will be issued later in 2011, in the hope of attracting comments from potential users
- The official way to get it is through your national standards body (e.g. BSI, DIN)
- Distribution policies vary from one country to another, but for a couple of months the draft should be available online on the BSI site, free of charge and without a password
- Send me an email and I’ll alert you when the DIS is released

The XML-Schema
- Based on the UML model, so as to capture as much as possible of the standard's specifications
- Element names follow the names in the UML diagram and the descriptions in the ISO specification
- Uses Dublin Core elements

Terms and lexicalValue
Example: clothing
- Exactly one lexical value, with an optional language (required in a multilingual thesaurus)
- A required identifier
- Optional dates, source reference, notes
- Custom attributes (name-type, value)
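The term constraints listed above can be sketched as a small Python record with a validation helper. This is only an illustration: the class name, attribute names and the `validate_term` function are assumptions made for this example and are not taken from the ISO 25964 XML schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class ThesaurusTerm:
    """Illustrative term record; attribute names are assumptions, not schema names."""
    identifier: str                     # required
    lexical_value: str                  # exactly one lexical value per term
    lang: Optional[str] = None          # required only in a multilingual thesaurus
    created: Optional[date] = None      # optional date
    source: Optional[str] = None        # optional source reference
    notes: list[str] = field(default_factory=list)
    custom_attributes: dict[str, str] = field(default_factory=dict)  # name-type -> value


def validate_term(term: ThesaurusTerm, multilingual: bool) -> list[str]:
    """Return a list of human-readable problems (empty when the term is valid)."""
    problems = []
    if not term.identifier:
        problems.append("missing identifier")
    if not term.lexical_value:
        problems.append("missing lexical value")
    if multilingual and not term.lang:
        problems.append("language is required in a multilingual thesaurus")
    return problems


clothing = ThesaurusTerm(identifier="L1", lexical_value="clothing")
```

In a monolingual thesaurus `validate_term(clothing, multilingual=False)` reports no problems; with `multilingual=True` it flags the missing language.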

Preferred Term and Simple non-Preferred Term
- Preferred term: abattoirs (L11)
- Simple non-preferred term: abatoirs (L12; true; MS)

Split non-preferred Term and Compound equivalence
Split non-preferred term:
- one (term) identifier
- one lexical value
- may engage in one or more compound equivalence relationships
Compound equivalence:
- UFPlus: identifier of the split non-preferred term
- USEPlus: identifier of a preferred term (at least 2)
Example: coal mining (SL1); UFPlus SL1, USEPlus L9 and L10; coal (L9), mining (L10)
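As a hedged sketch of the compound equivalence rule, the following Python models one UFPlus reference and at least two USEPlus references, using the coal mining example. The class and the `describe` helper are hypothetical names chosen for illustration; the schema itself expresses this in XML.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompoundEquivalence:
    uf_plus: str                # identifier of the split non-preferred term
    use_plus: tuple[str, ...]   # identifiers of the preferred terms to combine

    def __post_init__(self):
        # The slide requires at least 2 USEPlus references.
        if len(self.use_plus) < 2:
            raise ValueError("a compound equivalence needs at least 2 USEPlus terms")


# Term identifiers and lexical values from the example above.
terms = {"SL1": "coal mining", "L9": "coal", "L10": "mining"}
equivalence = CompoundEquivalence(uf_plus="SL1", use_plus=("L9", "L10"))


def describe(eq: CompoundEquivalence, labels: dict[str, str]) -> str:
    """Render the relationship in a conventional 'USE+' notation."""
    targets = " + ".join(labels[i] for i in eq.use_plus)
    return f"{labels[eq.uf_plus]} USE+ {targets}"
```

Constructing a `CompoundEquivalence` with a single USEPlus identifier raises `ValueError`, which mirrors the "at least 2" constraint.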

Concept and Equivalent Terms (1/2)
- Equivalent preferred and non-preferred terms: concept C11 with abattoirs (L11) and abatoirs (L12; true; MS)
- Equivalent preferred terms in a different language: concept C10 with mining (L10) and exploitation minière (L10.fr)

Concept and Equivalent Terms (2/2)
- Top Concept: true if the concept is a top concept of the thesaurus
- One preferred term per concept and per language
- Any number of simple non-preferred terms per concept, each with a language
- Equivalence relationship between the preferred term and any non-preferred term with the same language under the same concept
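The "one preferred term per concept and per language" rule and the derived equivalence pairs can be checked with a few lines of Python. This is a sketch under assumptions: the tuple layout and function names are invented for the example, and the language code "en" for the English terms is assumed (only the ".fr" suffix appears in the slides).

```python
from collections import Counter

# (concept id, term id, lexical value, language, is_preferred)
terms = [
    ("C11", "L11", "abattoirs", "en", True),
    ("C11", "L12", "abatoirs", "en", False),   # misspelling kept as a non-preferred term
    ("C10", "L10", "mining", "en", True),
    ("C10", "L10.fr", "exploitation minière", "fr", True),
]


def preferred_term_violations(rows):
    """Return (concept, language) pairs that have more than one preferred term."""
    counts = Counter((c, lang) for c, _tid, _val, lang, pref in rows if pref)
    return sorted(key for key, n in counts.items() if n > 1)


def equivalents(rows, concept, lang):
    """Pair the preferred term with each non-preferred term of the same concept and language."""
    pref = [v for c, _t, v, l, p in rows if c == concept and l == lang and p]
    non_pref = [v for c, _t, v, l, p in rows if c == concept and l == lang and not p]
    return {(p, n) for p in pref for n in non_pref}
```

For the sample data, `preferred_term_violations(terms)` is empty and `equivalents(terms, "C11", "en")` pairs "abattoirs" with "abatoirs".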

Concept relations: Hierarchy (1/4)
- Concepts: C1 clothing (L1), C2 outerwear (L2), C3 overcoats (L3)
- Relations: C3 BT C2; C2 BT C1

Concept relations: Hierarchy (2/4)
Example: “Cow milk” BT “Milk”, coded as:
- role: BT
- isHierRelConcept (the subject): references a concept identifier, here the identifier of “Cow milk”
- hasHierRelConcept (the object): references a concept identifier, here the identifier of “Milk”
Note: For any given concept, there can be more than one top concept
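The subject/role/object coding above can be illustrated in Python with plain triples, using the clothing hierarchy from the previous slide. The helper names (`broader`, `narrower`, `ancestors`) are assumptions for this sketch; only the BT role and the concept identifiers come from the slides.

```python
# Each relationship is (isHierRelConcept, role, hasHierRelConcept).
relations = [
    ("C3", "BT", "C2"),  # overcoats BT outerwear
    ("C2", "BT", "C1"),  # outerwear BT clothing
]


def broader(concept, rels):
    """Direct broader concepts: objects of BT relations whose subject is `concept`."""
    return [obj for subj, role, obj in rels if subj == concept and role == "BT"]


def narrower(concept, rels):
    """Direct narrower concepts, derived by reading the BT relations in reverse."""
    return [subj for subj, role, obj in rels if obj == concept and role == "BT"]


def ancestors(concept, rels):
    """All broader concepts reachable by following BT links, nearest first."""
    seen = []
    stack = broader(concept, rels)
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(broader(current, rels))
    return seen
```

Storing only the BT direction and deriving the narrower view on demand is one possible design; an exchange file may well carry both directions explicitly.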

Concept relations: Top-Level (3/4)
- isTopConceptOf (the subject): any concept; references a concept identifier
- hasTopConcept (the object): the related top concept; references a concept identifier
Note: For any given concept, there can be more than one top concept
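One way to see why a concept can have more than one top concept is to derive the top-level pairs from the BT links. The Python below is a sketch only: the function name is an assumption, and the second parent "X1" is a hypothetical identifier added to create a polyhierarchy.

```python
def top_level_relations(bt_pairs):
    """Return sorted (concept, top concept) pairs derived from (child, parent) BT links."""
    broader = {}
    for child, parent in bt_pairs:
        broader.setdefault(child, set()).add(parent)
        broader.setdefault(parent, set())

    def tops(concept, seen=frozenset()):
        if concept in seen:          # guard against accidental cycles
            return set()
        parents = broader[concept]
        if not parents:              # no broader concept: this is a top concept
            return {concept}
        result = set()
        for parent in parents:
            result |= tops(parent, seen | {concept})
        return result

    return sorted((c, t) for c in broader for t in tops(c) if t != c)


# C3 has two broader concepts; "X1" is a hypothetical second hierarchy root.
pairs = [("C3", "C2"), ("C2", "C1"), ("C3", "X1")]
top_pairs = top_level_relations(pairs)
```

Here C3 ends up with two top concepts, C1 and X1, each of which would be recorded as its own isTopConceptOf/hasTopConcept pair.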

Concept relations: Associative (4/4)
Example: “sport event” RT “sport manifestation”, coded as:
- role: RT
- isRelatedConcept (the subject): references a concept identifier, here the identifier of “sport event”
- hasRelatedConcept (the object): references a concept identifier, here the identifier of “sport manifestation”
Note: For any given concept, there can be more than one related concept

Thesaurus Array (1/2)
- Array A1 (true), labelled “age group”, referencing C5, C6, C7, C8
- Concepts: C5 people (L5), C6 children (L6), C7 youths (L7), C8 adults (L8)

Thesaurus Array (2/2)
- Has a unique identifier
- Can be ordered or not
- The unique parent, referenced by identifier, is either a Concept or a (super-)Array
- The members are a combination of one or more Concepts and (sub-)Arrays
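The array rules above lend themselves to a small consistency check. The following Python is a minimal sketch: the class, its attribute names and `check_array` are assumptions for illustration, and treating C5 (people) as the parent with C6 to C8 as members is only one possible reading of the age group example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ThesaurusArray:
    identifier: str
    ordered: bool = False
    parent: Optional[str] = None      # identifier of a concept or of a super-array
    members: list[str] = field(default_factory=list)  # concepts and/or sub-arrays


def check_array(array, concept_ids, array_ids):
    """Return a list of problems; references must resolve to known concepts or arrays."""
    known = set(concept_ids) | set(array_ids)
    problems = []
    if array.parent is not None and array.parent not in known:
        problems.append(f"unknown parent {array.parent}")
    if not array.members:
        problems.append("an array needs at least one member")
    for member in array.members:
        if member not in known:
            problems.append(f"unknown member {member}")
    return problems


concepts = {"C5", "C6", "C7", "C8"}
# Assumed reading: C5 (people) is the parent, C6..C8 are the ordered members.
a1 = ThesaurusArray("A1", ordered=True, parent="C5", members=["C6", "C7", "C8"])
```

An array without members, or one that references an identifier missing from the thesaurus, is reported rather than silently accepted.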

Concept Group Definition (1/2)
- Each group has a unique identifier
- A group has one type: micro-thesaurus, theme, subject category, ...
- Member concepts are referenced by identifier
- There is a group label per language

Concept Group sub-groups (2/2)
- Typically, groups do not form a hierarchy
- Groups can be nested
- Relationship
  - hasSubGroup: the identifier of the sub-group
  - hasSuperGroup: the identifier of the super-group
- A member of a sub-group is also a member of the super-group
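The rule that a member of a sub-group is also a member of the super-group can be sketched as a recursive union in Python. The group identifiers G1 and G2, their membership and the function name are hypothetical, chosen only to show the propagation.

```python
def effective_members(group, direct_members, sub_groups):
    """Direct members of `group` plus the members of all nested sub-groups."""
    members = set(direct_members.get(group, ()))
    for sub in sub_groups.get(group, ()):
        members |= effective_members(sub, direct_members, sub_groups)
    return members


# Hypothetical groups: G1 hasSubGroup G2 (so G2 hasSuperGroup G1).
direct_members = {"G1": {"C1"}, "G2": {"C2", "C3"}}
sub_groups = {"G1": ["G2"]}
```

With this data, G1 effectively contains C1, C2 and C3, while G2 contains only C2 and C3. The sketch assumes the nesting itself contains no cycles.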

Thesaurus element
The metadata sheet:
- identifier
- dc:language: list of all languages in which terms are made available
- dc:coverage
- dc:title
- dc:relation
- ....
Basic building blocks of the thesaurus:
- ThesaurusConcept
- ThesaurusArray
- ConceptGroup
- Version

Thesaurus Version History
- Each record details a version, described in the versionNote
- Each version has a date
- currentVersion
  - True: the version record details the latest version
  - False: the version record details an older version
- thisVersion
  - True: this version record details the linked thesaurus
  - False: the version record pertains to a thesaurus version other than the linked one

Root element
- Thesaurus (details on next slide): metadata sheet; concepts, groups, arrays
- Relationships: of Concept Groups, of Concepts, of split (compound) non-preferred terms

Envisioned extensions
- Distribution of thesaurus updates
- Complex compound relationships

XML Schema versus SKOS (1/2)
XML and XML Schema
Pro:
- Reasonably well known
- Stable and generally available toolset with IDE support (XML, XSLT, XQuery)
- Strong typing
- Integrity constraints
- Covers all standardized features
Con:
- XML structure limits flexibility to order elements
- Limited flexibility in constraints (e.g. non-cyclic graphs)
- Limited extensibility
- No standard internet access protocol
Usage:
- Exchange between partners with an established SLA
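The "non-cyclic graphs" limitation means that a check such as "the BT hierarchy contains no cycle" has to be done outside schema validation, for example after parsing. The Python below is a sketch of such a check using a depth-first search; the function name and the pair-based input format are assumptions for this example.

```python
def find_cycle(bt_pairs):
    """Return one cycle as a list of concept ids, or None if the BT graph is acyclic."""
    graph = {}
    for child, parent in bt_pairs:
        graph.setdefault(child, []).append(parent)
        graph.setdefault(parent, [])

    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, finished
    state = dict.fromkeys(graph, WHITE)

    def visit(node, path):
        state[node] = GREY
        path.append(node)
        for nxt in graph[node]:
            if state[nxt] == GREY:        # back edge: nxt is already on the path
                return path[path.index(nxt):] + [nxt]
            if state[nxt] == WHITE:
                found = visit(nxt, path)
                if found:
                    return found
        path.pop()
        state[node] = BLACK
        return None

    for node in graph:
        if state[node] == WHITE:
            cycle = visit(node, [])
            if cycle:
                return cycle
    return None
```

For the clothing hierarchy (C3 BT C2, C2 BT C1) the function returns `None`; adding an erroneous C1 BT C3 link makes it report the loop. The recursive form is fine for modest hierarchies; very deep ones would need an iterative traversal.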

XML Schema versus SKOS (2/2)
SKOS
Pro:
- Reasonably well known
- Strong typing
- Extensibility
- Powerful constraint languages (SPARQL, OWL, SWRL, RIF, ...)
- Compact specification by using inference
- Graph model provides flexible specification without ordering
- Limited flexibility in constraints (e.g. non-cyclic graphs)
- Standard internet access protocol (SPARQL, content negotiation)
Con:
- Limited toolset (validation, transformation, IDE)
- Needs extensions to cover all standardized features, e.g. specific label relationships
Usage:
- Internet publishing, L(O)D publishing, L(O)D applications
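To make the SKOS option concrete, here is a hedged Python sketch that serialises the clothing hierarchy as Turtle, using the usual correspondence of preferred terms to skos:prefLabel, non-preferred terms to skos:altLabel and BT to skos:broader. The base URI, the "en" language tag and the function name are assumptions, and the sketch does no string escaping, so it is not a general-purpose serialiser.

```python
def to_turtle(concepts, broader, base="http://example.org/thesaurus/"):
    """Serialise concepts as SKOS Turtle; `base` is a placeholder namespace."""
    lines = [
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .",
        f"@prefix ex: <{base}> .",
        "",
    ]
    for cid, labels in concepts.items():
        props = ["a skos:Concept"]
        for lang, value in labels.get("pref", {}).items():
            props.append(f'skos:prefLabel "{value}"@{lang}')
        for lang, value in labels.get("alt", []):
            props.append(f'skos:altLabel "{value}"@{lang}')
        for parent in broader.get(cid, []):
            props.append(f"skos:broader ex:{parent}")
        lines.append(f"ex:{cid} " + " ;\n    ".join(props) + " .")
    return "\n".join(lines)


# Concepts from the hierarchy example; the language tag "en" is assumed.
concepts = {
    "C1": {"pref": {"en": "clothing"}},
    "C2": {"pref": {"en": "outerwear"}},
    "C3": {"pref": {"en": "overcoats"}},
}
broader = {"C3": ["C2"], "C2": ["C1"]}
turtle = to_turtle(concepts, broader)
```

Features without a direct SKOS counterpart, such as compound equivalence or thesaurus arrays, are where the extensions mentioned under "Con" would come in.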

Q&A, References Web site Info