Thesauri, interoperability and the role of ISO 25964 Stella G Dextre Clarke Project Leader, ISO NP 25964 Chair, ISKO UK 1.


1 Thesauri, interoperability and the role of ISO 25964
Stella G Dextre Clarke, Project Leader, ISO NP 25964; Chair, ISKO UK

2 Summary
- Brief thesaurus chronology
- What role does the thesaurus have now?
- The demand for interoperability
- Highlights from ISO 25964

3 Thesauri – a brief chronology
- Once upon a time, thesauri were at the cutting edge of Information Retrieval (IR) technology
- Heyday in the 1960s and 1970s; after the mid-1980s popularity declined
- ISO 2788 and ISO 5964 (for monolingual and multilingual thesauri respectively) came out
- Internet/intranets in the 1990s brought resurgence and diversification (into other forms of controlled vocabulary, such as taxonomies)
- TREC (1992 onwards) has shown the dominance of statistical methods in IR. But stats alone are not enough!
- At the turn of the century, thesauri were back in fashion and work began on refurbishing the British and International standards
- Semantic Web and SKOS developments provide more incentive
- Today, even Google employs some taxonomists.

4 Slide unearthed from TR01(2001): The thesaurus coming back into fashion!

8 The role of controlled vocabularies today
- Needed where full text is not available, e.g. image libraries and audio resources
- Invaluable for crossing language barriers
- Especially useful in-house, where page rank algorithms are less effective
- Essential to access vast databases and catalogues of bibliographic data from decades past
- Provide added value in combination with other methods, often hidden behind the scenes
In all these contexts, interoperability is key.

9 Introducing ISO 25964
ISO 25964: Thesauri and interoperability with other vocabularies
- Part 1: Thesauri for information retrieval
- Part 2: Interoperability with other vocabularies
- It updates ISO 2788 and ISO 5964, based on BS 8723, with much reworking
- Part 1, published in August 2011, covers monolingual and multilingual thesauri
- Part 2, to be published in January 2013, covers mapping between thesauri and other types of vocabulary
- Information retrieval is seen as the main application, including indexing as well as searching

10 What does interoperability mean?
Definition: ability of two or more systems or components to exchange information and to use the information that has been exchanged.
In the case of thesauri and other KOS (knowledge organization systems), broadly speaking, interoperability applies at more than one level:
- presenting data in a standard way to enable import and use in other systems (ISO 25964 Part 1)
- providing mappings between the terms/concepts of one KOS and those of another (ISO 25964 Part 2)
- plus any other type of exchange between one KOS and another (ISO 25964 Part 2)

11 Linked Data Cloud in Richard Cyganiak and Anja Jentzsch see

12 A simplified view of interoperability
[Diagram label: My thesaurus]

13 Interoperability between vocabularies (see ISO 25964 Part 2)
[Diagram labels: My thesaurus, Your thesaurus, GEMET, AGROVOC, LCSH, Dewey, WordNet]

14 Interoperability between applications (see ISO 25964 Part 1)
[Diagram labels: vocabulary management software, indexing/tagging software, search/browsing software]

15 Content of ISO 25964 Part 1, supporting interoperability between applications
- thesaurus content and construction, mono- or multilingual (i.e. a complete update of ISO 2788 and ISO 5964)
- guidance on applying facet analysis to thesauri
- guidance on managing thesaurus development and maintenance
- functional requirements for software to manage thesauri
- a data model and derived XML schema

17 Content of ISO 25964 Part 2, supporting interoperability between vocabularies
- Models for mapping
- Guidelines for mapping
- Recommendations on mapping types
- How to handle pre-coordination
- Mapping to vocabularies other than thesauri: classification schemes, file plans (classification schemes used for records management), taxonomies, subject heading schemes, ontologies, terminologies, name authority lists, synonym rings
- Brief guidance on handling mappings data

18 Recommended models for mapping
[Diagram; node labels: A, B, C, D, E, F, G, H, P, Q, R, S]

19 What does mapping mean?
Definition: process of establishing relationships between the concepts of one vocabulary and those of another.
- Recommended types of mapping are based on the standard internal relationship types, basically: equivalence, hierarchical and associative
- Greater differentiation of mapping types is allowed, but is optional, to avoid complexity in simple applications

20 Full range of ISO 25964 mapping types
Basic mapping types:
- Equivalence
  - Simple
  - Compound
    - Intersecting compound equivalence
    - Cumulative compound equivalence
- Hierarchical
  - Broader
  - Narrower
- Associative
Simple equivalence can be marked as Exact or Inexact.

21 Full range of ISO 25964 mapping types, with examples
Basic mapping types:
- Equivalence
  - Simple: Laptop computers EQ Notebook computers
  - Compound
    - Intersecting compound equivalence: Women executives EQ Women + Executives
    - Cumulative compound equivalence: Inland waterways EQ Rivers | Canals
- Hierarchical
  - Broader: Streets BM Roads
  - Narrower: Roads NM Streets
- Associative: e-Learning RM Distance education
- Exact equivalence: Aubergines =EQ Egg-plants
- Inexact equivalence: Horticulture ~EQ Gardening
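The notation used in the examples on slide 21 is regular enough to be read by a small parser. The following is a minimal sketch, assuming only the tags and the "+" and "|" symbols shown on the slide; the `Mapping` class, the `parse_mapping` function and the descriptive labels are hypothetical names introduced here for illustration and are not taken from the standard or its XML schema.

```python
import re
from dataclasses import dataclass
from typing import Tuple

# Tags exactly as they appear on the slide; the labels are descriptive
# names chosen for this sketch, not wording from ISO 25964.
MAPPING_TAGS = {
    "=EQ": "exact equivalence",
    "~EQ": "inexact equivalence",
    "EQ": "equivalence",
    "BM": "broader mapping",
    "NM": "narrower mapping",
    "RM": "associative (related) mapping",
}


@dataclass(frozen=True)
class Mapping:
    source: str
    tag: str
    targets: Tuple[str, ...]
    combinator: str  # "" = simple, "+" = intersecting, "|" = cumulative


# Longer tags come first so "=EQ" and "~EQ" are not read as a bare "EQ".
_STATEMENT = re.compile(
    r"^(?P<source>.+?)\s+(?P<tag>=EQ|~EQ|EQ|BM|NM|RM)\s+(?P<target>.+)$"
)


def parse_mapping(statement: str) -> Mapping:
    """Parse one statement such as 'Streets BM Roads'."""
    match = _STATEMENT.match(statement.strip())
    if match is None:
        raise ValueError(f"not a recognised mapping statement: {statement!r}")
    target = match["target"]
    # A compound target uses either "+" or "|"; mixed or bracketed
    # expressions are outside the scope of this sketch.
    combinator = next((s for s in ("+", "|") if s in target), "")
    if combinator:
        targets = tuple(part.strip() for part in target.split(combinator))
    else:
        targets = (target.strip(),)
    return Mapping(match["source"].strip(), match["tag"], targets, combinator)


example = parse_mapping("Women executives EQ Women + Executives")
print(example.targets, MAPPING_TAGS[example.tag])
```

Keeping the combinator separate from the tag mirrors the slide, where intersecting and cumulative equivalence are both written with EQ and differ only in how the target concepts are combined.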

22 The joys of pre-coordination
Examples:
- (084.12) photographs of lions (from UDC)
- Automobiles--Air conditioning--Maintenance and repair (from LCSH)
Pre-coordination occurs characteristically in subject heading schemes, classification schemes, taxonomies and file plans.
Mapping obliges use of the more complicated mapping types, especially compound equivalence.
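To illustrate why pre-coordination pushes mapping towards compound equivalence, the sketch below splits the LCSH-style heading from this slide at its "--" separators and assembles a candidate intersecting compound statement in the slide 21 notation. It rests on assumptions: the function names are hypothetical, and the output is only a starting point, since a person would still need to check each component against the concepts actually present in the target thesaurus.

```python
from typing import List


def split_subdivisions(heading: str) -> List[str]:
    """Split a pre-coordinated heading at its '--' separators."""
    return [part.strip() for part in heading.split("--") if part.strip()]


def candidate_compound_mapping(heading: str) -> str:
    """Build a candidate intersecting compound equivalence statement.

    The components are joined with '+', the symbol the slides use for
    intersecting compound equivalence. This is a suggestion for review,
    not a finished mapping.
    """
    components = split_subdivisions(heading)
    return f"{heading} EQ {' + '.join(components)}"


heading = "Automobiles--Air conditioning--Maintenance and repair"
print(split_subdivisions(heading))
print(candidate_compound_mapping(heading))
```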

23 Vocabularies other than thesauri
ISO 25964 is a standard for thesauri; it does not attempt to standardize other types of KOS. It guides only on interoperability between thesauri and other types of KOS.
The clause on each KOS type presents:
- Key characteristics of the KOS (non-normative)
- Semantic components/relationships (non-normative)
- Recommendations for interoperability between the KOS and a thesaurus, especially mapping (normative)

24 Vocabularies other than thesauri
The following are dealt with in ISO 25964:
- classification schemes
- file plans (classification schemes used for records management)
- taxonomies
- subject heading schemes
- name authority lists
- synonym rings
- terminologies
- ontologies

25 General prospects for mapping
- thesauri: mapping relatively straightforward
- classification schemes, file plans, taxonomies, subject heading schemes: concept mapping useful in IR, pre-coordination common
- name authority lists: mapping usually straightforward but common concepts few
- synonym rings, terminologies, ontologies: concept mapping rarely useful; complementary uses are a more likely prospect

26 Ontologies are special…
- The definition of ontology excludes lightweight examples such as thesauri and classification schemes
- The Gruber/Studer definition is adopted, and interpreted broadly enough to admit OWL-based examples such as ORE and FOAF
- Mapping between ontologies and thesauri is not recommended
- Interoperability recommendations focus on use cases such as reengineering a thesaurus as an ontology, and complementary use of a thesaurus with an ontology

27 Simple ontology illustration (credit: Jutta Lindenthal; see http://www.jlindenthal.de/IID/2012/Kurs_2012.htm)

28 Structural comparison
The illustration is used in ISO 25964 to draw out key similarities and differences between ontologies and thesauri. The aim is to encourage emerging applications in which thesauri and ontologies can usefully interoperate.

29 Interoperability at the level of standards
[Diagram labels: SKOS, ISO 25964, OWL, RDF, XML, SRU, Z39.19, MARC 21, REST, HTTP, BS 8723, ZThes, SPARQL, JSON, ISO 2709, Z39.50]
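As one illustration of interoperability between standards, the mapping tags from slide 21 can be expressed with the mapping properties that SKOS defines (skos:exactMatch, skos:closeMatch, skos:broadMatch, skos:narrowMatch, skos:relatedMatch). The pairing below is a commonly cited correspondence and should be treated as an assumption of this sketch rather than a statement made in the slides; the `ex:` identifiers and the function name are hypothetical. Compound equivalence is left out because core SKOS has no direct counterpart for it.

```python
# Assumed pairing of slide 21 tags with SKOS mapping properties.
# The SKOS property names are real; the pairing is this sketch's assumption.
SKOS_PROPERTY = {
    "=EQ": "skos:exactMatch",
    "~EQ": "skos:closeMatch",
    "BM": "skos:broadMatch",
    "NM": "skos:narrowMatch",
    "RM": "skos:relatedMatch",
}


def to_skos_triple(source: str, tag: str, target: str) -> tuple:
    """Return a (subject, predicate, object) triple for a one-to-one mapping.

    A bare 'EQ' is rejected on purpose: the choice between exactMatch and
    closeMatch depends on whether the equivalence is marked exact or inexact.
    """
    if tag not in SKOS_PROPERTY:
        raise ValueError(f"no SKOS property assumed for tag {tag!r}")
    return (source, SKOS_PROPERTY[tag], target)


# 'Streets BM Roads' from slide 21, with hypothetical ex: identifiers.
print(to_skos_triple("ex:Streets", "BM", "ex:Roads"))
```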

30 Dextre Clarke and Zeng,

31 The thesaurus coming back into fashion…

32 …although often hidden behind the scenes

33 And interoperability makes new tricks easier…

34 Want a copy of the standards?
- Download Part 1 from ISO at talogue_detail.htm?csnumber=53657
- Part 2 will be in the ISO catalogue next year
- Order from your national standards body (e.g. BSI, DIN, ANSI, AFNOR)
- Some public/academic reference libraries stock them
- ISO standards are not cheap to purchase
- However, the data model and XML schema for exchange of thesaurus data are available online without charge or password control. Go to /

35 APPENDIX: Some extra slides with more detail

36 Who is involved in developing the standard?
- A Working Group (WG8), under the ISO subcommittee known as ISO TC46/SC9, has drafted the standard
- WG8 has members from 15 countries
- The WG8 Secretariat is provided by NISO in the USA
- Currently active members of WG8 include: Johan De Smedt, Marianne Lykke, Stella Dextre Clarke (Leader), Esther Scheven, Michèle Hudon, Douglas Tudhope, Daniel Kless, Leonard Will, Jutta Lindenthal, Marcia Lei Zeng

37 Intersecting versus cumulative equivalence

38 Mapping example from a pre-coordinated concept: inland waterway transport
Inland waterway transport EQ transport + (rivers | canals)
[Image: The Rialto Bridge, Venice, by Michele Marieschi. © Bridgeman Education]
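One way to read the compound mapping on this slide is as a search expression: "+" asks for items indexed with both concepts (set intersection) and "|" accepts items indexed with either (set union). The sketch below evaluates the expression over a toy inverted index; the document numbers and function names are invented for illustration and do not come from the talk.

```python
# Hypothetical inverted index: concept -> ids of documents indexed with it.
index = {
    "transport": {1, 2, 3, 5},
    "rivers": {2, 4},
    "canals": {3, 4, 6},
}


def intersecting(*postings: set) -> set:
    """'+' on the slide: documents indexed with every one of the concepts."""
    return set.intersection(*postings)


def cumulative(*postings: set) -> set:
    """'|' on the slide: documents indexed with at least one of the concepts."""
    return set.union(*postings)


# Inland waterway transport EQ transport + (rivers | canals)
inland_waterway_transport = intersecting(
    index["transport"],
    cumulative(index["rivers"], index["canals"]),
)
print(sorted(inland_waterway_transport))  # [2, 3]
```

Document 4 is indexed with both rivers and canals but not with transport, so it is excluded; this is the narrowing effect of the intersecting part of the mapping.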

