Presentation on theme: "Thesauri, interoperability and the role of ISO 25964"— Presentation transcript:
1 Thesauri, interoperability and the role of ISO 25964
Stella G Dextre Clarke
Project Leader, ISO NP 25964
Chair, ISKO UK
2 Summary
- Brief thesaurus chronology
- What role does the thesaurus have now?
- The demand for interoperability
- Highlights from ISO 25964
3 Thesauri – a brief chronology
- Once upon a time, thesauri were at the cutting edge of Information Retrieval (IR) technology
- Hey-day in 1960s and 1970s; after mid-1980s popularity declined
- ISO 2788 and ISO 5964 (for monolingual and multilingual thesauri respectively) came out
- Internet/intranets in 1990s brought resurgence and diversification (into other forms of controlled vocabulary, such as “taxonomies”)
- TREC (1992 onwards) has shown dominance of statistical methods in IR. But stats alone are not enough!
- At the turn of the century, thesauri back in fashion and work began on refurbishing the British and International standards
- Semantic Web and SKOS developments provide more incentive
- Today, even Google employs some “taxonomists”.
4 Slide unearthed from TR’01 (2001): The thesaurus coming back into fashion!
8 The role of controlled vocabularies today
- Needed where full text is not available, e.g. image libraries and audio resources
- Invaluable for crossing language barriers
- Especially useful in-house, where the page rank algorithms are less effective
- Essential to access vast databases and catalogues of bibliographic data from decades past
- Provide added value in combination with other methods, often hidden behind the scenes
- In all these contexts, interoperability is key.
9 Introducing ISO 25964
- ISO 25964: Thesauri and interoperability with other vocabularies
  - Part 1: Thesauri for information retrieval
  - Part 2: Interoperability with other vocabularies
- It updates ISO 2788 and ISO 5964
- Based on BS 8723, with much reworking
- Part 1, published in August 2011, covers monolingual and multilingual thesauri
- Part 2, to be published in January 2013, covers mapping between thesauri and other types of vocabulary
- Information retrieval seen as main application, including indexing as well as searching
10 What does “interoperability” mean?
- Definition: ability of two or more systems or components to exchange information and to use the information that has been exchanged.
- In the case of thesauri and other KOS, broadly speaking interoperability applies at more than one level:
  - presenting data in a standard way to enable import and use in other systems (ISO 25964 Part 1)
  - providing mappings between the terms/concepts of one KOS and those of another (ISO 25964 Part 2)
  - plus any other type of exchange between one KOS and another (ISO 25964 Part 2)
11 Linked Data Cloud in 2011, by Richard Cyganiak and Anja Jentzsch; see http://lod-cloud.net/
12 A simplified view of interoperability
[Diagram label: My thesaurus]
13 Interoperability between vocabularies (see ISO 25964-2)
[Diagram labels: Wordnet, GEMET, LCSH, My thesaurus, Your thesaurus, Dewey, AGROVOC]
14 Interoperability between applications (see ISO 25964-1)
[Diagram labels: indexing/tagging software, vocabulary management software, search/browsing software]
15 Content of ISO 25964-1, supporting interoperability between applications
- thesaurus content and construction, mono- or multi-lingual (i.e. a complete update of ISO 2788 and ISO 5964)
- guidance on applying facet analysis to thesauri
- guidance on managing thesaurus development and maintenance
- functional requirements for software to manage thesauri
- a data model and derived XML schema
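To give a feel for what exchanging thesaurus data between applications involves, here is a minimal Python sketch. It is not the ISO 25964-1 data model or its XML schema: the `Concept` class, its fields, and the element names `thesaurus`, `concept`, `preferredTerm`, `nonPreferredTerm` and `broader` are all hypothetical, chosen only to show one program writing concept records and another reading them back. The sample terms are illustrative.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List


@dataclass
class Concept:
    """A simplified, hypothetical concept record (not the ISO 25964-1 model)."""
    identifier: str
    preferred_term: str
    non_preferred_terms: List[str] = field(default_factory=list)
    broader: List[str] = field(default_factory=list)  # identifiers of broader concepts


def export_concepts(concepts: List[Concept]) -> str:
    """Serialise concepts to XML using made-up element names."""
    root = ET.Element("thesaurus")
    for concept in concepts:
        node = ET.SubElement(root, "concept", id=concept.identifier)
        ET.SubElement(node, "preferredTerm").text = concept.preferred_term
        for term in concept.non_preferred_terms:
            ET.SubElement(node, "nonPreferredTerm").text = term
        for broader_id in concept.broader:
            ET.SubElement(node, "broader", ref=broader_id)
    return ET.tostring(root, encoding="unicode")


def import_concepts(xml_text: str) -> List[Concept]:
    """Read back the XML produced by export_concepts."""
    root = ET.fromstring(xml_text)
    return [
        Concept(
            identifier=node.get("id"),
            preferred_term=node.findtext("preferredTerm"),
            non_preferred_terms=[n.text for n in node.findall("nonPreferredTerm")],
            broader=[n.get("ref") for n in node.findall("broader")],
        )
        for node in root.findall("concept")
    ]


if __name__ == "__main__":
    sample = [
        Concept("c1", "Vegetables"),
        Concept("c2", "Aubergines", ["Egg-plants"], ["c1"]),
    ]
    exchanged = export_concepts(sample)
    print(exchanged)
    print(import_concepts(exchanged) == sample)
```

The point of the sketch is that both programs agree on one shared structure, which is the role the standard's data model and schema play for real vocabulary management, indexing and search software.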
17 Content of ISO 25964-2, supporting interoperability between vocabularies
- Models for mapping
- Guidelines for mapping
- Recommendations on mapping types
- How to handle pre-coordination
- Mapping to vocabularies other than thesauri:
  - classification schemes
  - file plans (classification schemes used for records management)
  - taxonomies
  - subject heading schemes
  - ontologies
  - terminologies
  - name authority lists
  - synonym rings
- Brief guidance on handling mappings data
19 What does “mapping” mean?
- Definition: process of establishing relationships between the concepts of one vocabulary and those of another
- Recommended types of mapping are based on the standard internal relationship types, basically: equivalence, hierarchical and associative
- Greater differentiation of mapping types is allowed, but is optional, to avoid complexity in simple applications
20 Full range of ISO 25964-2 mapping types
- Basic mapping types:
  - Equivalence
    - Simple
    - Compound
      - Intersecting compound equivalence
      - Cumulative compound equivalence
  - Hierarchical
    - Broader
    - Narrower
  - Associative
- Simple equivalence can be marked as “Exact” or “Inexact”
21 Full range of ISO 25964-2 mapping types with examples
- Basic mapping types:
  - Equivalence
    - Simple: Laptop computers EQ Notebook computers
    - Compound
      - Intersecting compound equivalence: Women executives EQ Women + Executives
      - Cumulative compound equivalence: Inland waterways EQ Rivers | Canals
  - Hierarchical
    - Broader: Streets BM Roads
    - Narrower: Roads NM Streets
  - Associative: e-Learning RM Distance education
- Exact equivalence: Aubergines =EQ Egg-plants
- Inexact equivalence: Horticulture ~EQ Gardening
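The example statements on this slide follow a regular pattern (source term, mapping tag, target term or terms), so they can be read by a small parser. The Python sketch below is only an illustration: the `Mapping` class and `parse_mapping` function are hypothetical, and the parser assumes the tags and the ` + ` and ` | ` separators are written exactly as in the examples above. The tag descriptions are this sketch's wording, not quotations from the standard.

```python
import re
from dataclasses import dataclass
from typing import Tuple

# Tags as they appear in the slide's examples; descriptions are informal.
MAPPING_TAGS = {
    "EQ": "equivalence",
    "=EQ": "exact equivalence",
    "~EQ": "inexact equivalence",
    "BM": "broader mapping",
    "NM": "narrower mapping",
    "RM": "associative (related) mapping",
}


@dataclass(frozen=True)
class Mapping:
    source: str
    tag: str
    targets: Tuple[str, ...]
    combination: str  # "simple", "intersecting" (+) or "cumulative" (|)


# The source is matched lazily so the first tag found splits the statement.
_STATEMENT = re.compile(
    r"^(?P<source>.+?)\s+(?P<tag>=EQ|~EQ|EQ|BM|NM|RM)\s+(?P<target>.+)$"
)


def parse_mapping(statement: str) -> Mapping:
    """Parse one statement such as 'Streets BM Roads'."""
    match = _STATEMENT.match(statement.strip())
    if match is None:
        raise ValueError(f"not a recognised mapping statement: {statement!r}")
    target = match.group("target")
    if " + " in target:
        combination, parts = "intersecting", target.split(" + ")
    elif " | " in target:
        combination, parts = "cumulative", target.split(" | ")
    else:
        combination, parts = "simple", [target]
    return Mapping(
        source=match.group("source"),
        tag=match.group("tag"),
        targets=tuple(part.strip() for part in parts),
        combination=combination,
    )


if __name__ == "__main__":
    for line in [
        "Laptop computers EQ Notebook computers",
        "Women executives EQ Women + Executives",
        "Inland waterways EQ Rivers | Canals",
        "Streets BM Roads",
        "Aubergines =EQ Egg-plants",
    ]:
        mapping = parse_mapping(line)
        print(mapping, "->", MAPPING_TAGS[mapping.tag])
```

Keeping the combination type separate from the tag mirrors the slide's structure, where "intersecting" and "cumulative" are two kinds of compound equivalence rather than separate tags.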
22 The joys of pre-coordination
- Examples:
  - (084.12) photographs of lions (from UDC)
  - Automobiles--Air conditioning--Maintenance and repair (from LCSH)
- Occurs characteristically in subject heading schemes, classification schemes, taxonomies and file plans
- Mapping obliges use of the more complicated mapping types, especially compound equivalence
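As a rough illustration of why pre-coordination leads to compound equivalence, the Python sketch below splits the LCSH example into its components and writes a candidate intersecting compound equivalence in the `EQ ... + ...` notation used in this presentation. The function names are hypothetical, and the sketch assumes the target thesaurus happens to contain a concept matching each component exactly; in practice an editor would have to choose the target concepts.

```python
from typing import List


def split_precoordinated(heading: str, separator: str = "--") -> List[str]:
    """Split a pre-coordinated heading into its component strings."""
    return [part.strip() for part in heading.split(separator) if part.strip()]


def as_intersecting_equivalence(heading: str, separator: str = "--") -> str:
    """Build a candidate intersecting compound equivalence statement.

    Assumes each component corresponds to one concept in the target thesaurus.
    """
    components = split_precoordinated(heading, separator)
    if len(components) < 2:
        raise ValueError(f"heading is not pre-coordinated: {heading!r}")
    return f"{heading} EQ " + " + ".join(components)


if __name__ == "__main__":
    lcsh_heading = "Automobiles--Air conditioning--Maintenance and repair"
    print(split_precoordinated(lcsh_heading))
    print(as_intersecting_equivalence(lcsh_heading))
```

A single heading therefore maps to a combination of three target concepts rather than to one, which is what pushes the mapping beyond simple equivalence.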
23 Vocabularies other than thesauri
- ISO 25964 is a standard for thesauri; it does not attempt to standardize other types of KOS. It guides only on interoperability between thesauri and other types of KOS.
- The clause on each KOS type presents:
  - Key characteristics of the KOS (non-normative)
  - Semantic components/relationships (non-normative)
  - Recommendations for interoperability between the KOS and a thesaurus, especially mapping (normative)
24 Vocabularies other than thesauri
- The following are dealt with in ISO 25964:
  - classification schemes
  - file plans (classification schemes used for records management)
  - taxonomies
  - subject heading schemes
  - name authority lists
  - synonym rings
  - terminologies
  - ontologies
25 General prospects for mapping
- thesauri: mapping relatively straightforward
- classification schemes, file plans, taxonomies, subject heading schemes: concept mapping useful in IR, pre-coordination common
- name authority lists: mapping usually straightforward but common concepts few
- synonym rings, terminologies, ontologies: concept mapping rarely useful; complementary uses are a more likely prospect
26 Ontologies are special…
- Definition of ontology excludes “lightweight” examples such as thesauri and classification schemes
- The Gruber/Studer definition is adopted, and interpreted broadly enough to admit OWL-based examples such as ORE and FOAF.
- Mapping between ontologies and thesauri is not recommended.
- Interoperability recommendations focus on use cases such as reengineering a thesaurus as an ontology, and complementary use of thesaurus with ontology.
27 Simple ontology illustration (credit: Jutta Lindenthal; see http://www
28 Structural comparison
- The illustration is used in ISO 25964 to draw out key similarities and differences between ontologies and thesauri.
- The aim is to encourage emerging applications in which thesauri and ontologies can usefully interoperate.
29 Interoperability at the level of standards
[Diagram labels: ISO 2709, Z39.50, MARC 21, SPARQL, Z39.19, OWL, SKOS, ZThes, JSON, REST, ISO 25964, RDF, BS 8723, HTTP, SRU, XML]
30 From ISO 2788 to ISO 25964: the evolution of thesaurus standards towards interoperability and data modeling. Information Standards Quarterly (Winter 2012, v.24, no. 1), by Stella G. Dextre Clarke and Marcia Lei Zeng. Available at:
34 Want a copy of the standards?
- Download Part 1 from ISO at
- Part 2 will be in the ISO catalogue next year
- Order from your national standards body (e.g. BSI, DIN, ANSI, AFNOR)
- Some public/academic reference libraries stock them
- ISO standards are not cheap to purchase
- However, the data model and XML schema for exchange of thesaurus data are available online without charge or password control. Go to
35 APPENDIX: Some extra slides with more detail
36 Who is involved in developing the standard?
- A Working Group (WG8), under the ISO subcommittee known as ISO TC46/SC9, has drafted the standard.
- WG8 has members from 15 countries.
- The WG8 Secretariat is provided by NISO in the USA.
- Currently active members of WG8 include:
  - Johan De Smedt
  - Marianne Lykke
  - Stella Dextre Clarke (Leader)
  - Esther Scheven
  - Michèle Hudon
  - Douglas Tudhope
  - Daniel Kless
  - Leonard Will
  - Jutta Lindenthal
  - Marcia Lei Zeng