1 ICS-FORTH June 30, 2014
Knowledge Organisation Systems - Form and Utility
Center for Cultural Informatics, Institute of Computer Science, Foundation for Research and Technology - Hellas
Martin Doerr
Rethymnon, June 30, 2014

2 Interoperability
We distinguish:
Syntactic interoperability:
- Information systems can exchange data objects and all their elements without loss of data (EXCEL, HTML, XML, RDF…).
Semantic interoperability:
- Information systems can communicate and combine data objects and all their parts consistent with the meaning intended by the data creators or maintainers.
- Systems seem to understand and help people talk to each other: “Dream: … input in Chinese & English, but: query and answers all in Russian”.

3 Semantic Interoperability
Semantic interoperability can be divided into:
1) Interoperable data structures/schemata
   - understand types of relationships: context
   - either use a standard schema
   - or map & transform data between schemata
   - “formal ontologies” describe concepts as a common reference for schema equivalence.
2) Identity of items referred to by data
   - local people, places, objects, events never appearing elsewhere, or with a publicly clear local ownership (collection items!).
   - people, places, objects, events appearing elsewhere and with no local ownership.
   - concepts, categories, typologies characterizing items or subjects.
   - “Knowledge Organisation Systems” describe the identity of common references.

4 Universals and Particulars
Distinguish particulars from universals as a perceived truth.
o Particulars do not have specializations.
o Universals have instances, which can be either particulars or universals.
- particulars: me, “hello”, 2, WW II, the Mona Lisa, the text on the Rosetta Stone, 2-10-2006, 34N 26E, City of London
- universals: patient, word, number, war, painting, text, car model, species
- “strange” universals: colors, materials, mythological beasts
- “strange” particulars: literary characters
- Dualisms: texts as “equivalence classes” of documents containing the “same text”; concepts as objects of discourse, e.g. “this is a ‘chaffinch’” versus “Linné defined ‘Fringilla coelebs Linnaeus, 1758’ in 1758”.

5 Function of KOS
The term KOS comes from library/information science: “indexing languages”, i.e., authoritative lists of items and concepts frequently referred to in information systems, in order to avoid using different names or identifiers for the same thing,
- describing properties and definitions for identification (“matching”), and names and identifiers for reference,
- often extending into useful relations to inform people and to allow systems to make automated inferences for search and retrieval.
Such inferences are:
- identity (get all cats by “cat”)
- generalization (get “cats” by “felines”)
- related terms (get Heraklion by “Candia”, get “bridge construction” by “bridges”, get Heraklion by “Crete”)
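
The three kinds of inference listed on this slide can be sketched as a small query-expansion routine. This is only an illustration under stated assumptions: the class `TinyKOS`, its methods and the sample links are hypothetical and are not taken from any real KOS or library.

```python
from collections import defaultdict


class TinyKOS:
    """Hypothetical in-memory KOS holding broader/narrower and related links."""

    def __init__(self):
        self.narrower = defaultdict(set)  # broader concept -> narrower concepts
        self.related = defaultdict(set)   # symmetric "related term" links

    def add_broader(self, narrow, broad):
        self.narrower[broad].add(narrow)

    def add_related(self, a, b):
        self.related[a].add(b)
        self.related[b].add(a)

    def expand(self, concept, use_related=False):
        """Return the concept itself (identity), everything transitively
        narrower (generalization) and, optionally, directly related concepts."""
        found = set()
        stack = [concept]
        while stack:
            current = stack.pop()
            if current in found:
                continue
            found.add(current)
            stack.extend(self.narrower.get(current, ()))
        if use_related:
            for current in list(found):
                found |= self.related.get(current, set())
        return found


# Sample links, using the examples named on the slide.
kos = TinyKOS()
kos.add_broader("cats", "felines")
kos.add_broader("lions", "felines")
kos.add_related("bridges", "bridge construction")

print(sorted(kos.expand("felines")))
print(sorted(kos.expand("bridges", use_related=True)))
```

A retrieval system would then search its index for every concept in the expanded set rather than only for the literal query term.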

6 Kinds of KOS
KOS can be divided into:
1) Terminology of universals
   - describe things by their nature and behavior (form, function, structure…)
   - generalize over universals for searching
   - provide typical/general relations of universals for searching
   - divide a domain for administration and searching (“classification”)
2) Identification of particulars: persons, things, places, events
   - describe items by unique combinations of properties
   - understand that we talk about the same item/instance (regardless of classification!)
   - provide important relations between particulars for searching
A database schema contains ~20-500 concepts.
A terminology contains ~100 to 10 million concepts.
Modern information systems may contain more than 10 billion particulars.

7 Kinds of KOS of particulars
A controlled vocabulary is a limited list of terms to be used in a database field.
- only an authority may add terms.
- highly ambiguous for particulars (typically place names)
Authority files with identifying properties and recommended names
- only an authority may add terms.
Distinguishing:
- lists of persons (authors!) with life dates, names, titles, roles, family and business relations
- “gazetteers”: lists and hierarchies of places together with recommended names (controlled) and geographic area.
- “thesauri of events or periods”: virtually non-existent as yet!
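
The ambiguity of bare place names, and the way an authority file resolves it through identifying properties, can be sketched as follows. This is a minimal illustration: the record structure, the field names and the identifiers (`place:1` and so on) are hypothetical and are not real TGN or geonames identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaceRecord:
    record_id: str        # hypothetical identifier, not a real TGN/geonames id
    preferred_name: str   # recommended (controlled) name
    variant_names: tuple  # other names under which the place is known
    falls_within: str     # identifying property: the containing place


# Hypothetical sample gazetteer.
GAZETTEER = [
    PlaceRecord("place:1", "London", ("Londinium",), "England"),
    PlaceRecord("place:2", "London", (), "Ontario"),
    PlaceRecord("place:3", "Heraklion", ("Candia",), "Crete"),
]


def candidates(gazetteer, name):
    """All records known under this name; a bare name may match several."""
    return [r for r in gazetteer
            if name == r.preferred_name or name in r.variant_names]


def resolve(gazetteer, name, within):
    """Use an identifying property to pick exactly one record, else None."""
    hits = [r for r in candidates(gazetteer, name) if r.falls_within == within]
    return hits[0] if len(hits) == 1 else None


print(len(candidates(GAZETTEER, "London")))
print(resolve(GAZETTEER, "London", "Ontario"))
print(resolve(GAZETTEER, "Candia", "Crete"))
```

The name alone is not an identifier; only the combination of a name with further properties singles out one particular.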

8 Schema of a KOS (particular): TGN
[Diagram. Classes: E13 Attribute Assignment, Place Naming, E74 Group, Community, E39 Actor, E53 Place, E44 Place Appellation, E52 Time-Span, E4 Period. Properties: P89 falls within, P87 is identified by (identifies), P14 carried out by, P4 has time-span, P7 took place at, assigns name, to place, to community, identified by.]

9 KOS: Describing TGN
[Diagram of the Nineveh example. Nodes: Nineveh naming (twice), People of Iraq, City of Nineveh, Nineveh, Kuyunjik, TGN7017998, TGN1001441, 1st mill. BC, 20th century. Properties: P89 falls within, P87 is identified by (identifies), P14 carried out by, P4 has time-span, P7 took place at, assigns name, to place, to community, identified by.]

10 Nineveh (TGN)

11 London (TGN)

12 London (geonames)

13 Master of the Paradise Garden (ULAN)

14 Kinds of KOS of Universals
A dictionary is a listing of words and phrases giving information such as
- spelling, morphology and part of speech,
- senses, definitions, usage, equivalents in other languages (bi- or multilingual dictionary),
- etymology.
A controlled vocabulary is a limited list of terms to be used in a database field. Only an authority may add terms.
A classification system is a structure that organizes concepts into a (mono)hierarchy in order to partition some material following a sequence of decision criteria.
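
What “(mono)hierarchy” means for a classification system can be made concrete with a small check: every class has at most one broader class, so each item ends up in exactly one place. The sketch below is illustrative only; the notations `A`, `A1`, `A1.1` are made up and do not come from Dewey or the Library of Congress Classification.

```python
from collections import defaultdict


def parents_by_concept(broader_links):
    """Map each concept to the set of its directly broader concepts."""
    parents = defaultdict(set)
    for narrower, broader in broader_links:
        parents[narrower].add(broader)
    return parents


def is_monohierarchy(broader_links):
    """True if no concept has more than one broader concept."""
    return all(len(p) <= 1 for p in parents_by_concept(broader_links).values())


def path_to_top(concept, broader_links):
    """Follow the single broader link upwards; raises ValueError if a
    concept has several parents or if the links form a cycle."""
    parents = parents_by_concept(broader_links)
    path = [concept]
    while parents.get(path[-1]):
        (parent,) = parents[path[-1]]  # fails unless exactly one parent
        if parent in path:
            raise ValueError("cycle in hierarchy")
        path.append(parent)
    return path


# Made-up notations: (narrower, broader) pairs.
CLASSIFICATION = [("A1", "A"), ("A2", "A"), ("A1.1", "A1")]
POLYHIERARCHY = CLASSIFICATION + [("A1.1", "A2")]

print(is_monohierarchy(CLASSIFICATION))
print(is_monohierarchy(POLYHIERARCHY))
print(path_to_top("A1.1", CLASSIFICATION))
```

The path from a class to the top corresponds to the sequence of decisions that led to that class.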

15 Dewey (3)

16 Dewey (4)

17 Dewey (5)

18 Dewey (1)

19 Dewey (6)

20 Dewey

21 Library of Congress Classification Outline

22 Library of Congress Classification

23 Kinds of KOS
A thesaurus is a controlled vocabulary of categorical terms related to concepts, with scope notes and semantic relationships between concepts.
- semantic relationships are: IsA, related terms
- subject catalogues may use thesaurus relationships but interpret IsA as a generalization of “talking about”.
A monolingual thesaurus has terms from one expert group or community.
A multilingual thesaurus relates terms and concepts from two or more expert groups or communities (see next slide).

24 Multilingual thesauri
Translated thesauri:
- Each concept is optimally interpreted in words of one or more other languages, to allow speakers of those languages to understand it better.
Correlated thesauri:
- Multiple thesauri with terms and concepts from their respective groups, and a set of concept-based mappings between the different thesauri of that aggregate, in order to process queries across different terminologies.
Interlingua:
- Concepts are created by fusing each cluster of similar concepts from different social groups into a new concept. One term from each user group is attached to the new concept as the identifier to be used by this group. The interlingua provides the sharing of concepts between social groups, e.g. as a legal basis used by the European Commission, like the EBTI. Note that the interlingua may not contain any of the original concepts of any user group; it contains a set of compromises to remove interpretational differences. Its concepts may again be translated and correlated to other thesauri.
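
For correlated thesauri, processing a query across terminologies amounts to following concept-based mappings from one thesaurus to another. The sketch below assumes a simple list of typed mappings; the concept identifiers (`en:churches`, `fr:eglises`, ...) and the mapping kinds are hypothetical and are not taken from AAT, Merimee or any other real thesaurus.

```python
# Hypothetical mappings between concepts of two thesauri:
# (source concept, kind of mapping, target concept).
MAPPINGS = [
    ("en:churches", "exact", "fr:eglises"),
    ("en:chapels", "broader", "fr:edifices-religieux"),
]


def translate(concept, mappings, allowed=("exact",)):
    """Return target concepts reachable from `concept` through mappings
    whose kind is in `allowed`, as a sorted list."""
    return sorted(
        target
        for source, kind, target in mappings
        if source == concept and kind in allowed
    )


print(translate("en:churches", MAPPINGS))
print(translate("en:chapels", MAPPINGS))
print(translate("en:chapels", MAPPINGS, allowed=("exact", "broader")))
```

Restricting a query to exact mappings favors precision; also admitting broader mappings favors recall. Which to choose is a design decision of the retrieval system, not something stated on the slide.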

25 SIS - Thesaurus Management System: Multilingual Relations
[Diagram. Labels: Art & Architecture Thesaurus, Merimee Thesaurus, English Vocabulary, French Vocabulary, interthesaurus relations, linguistic translation (twice).]

26 LCSH

27 Dolls

28 Dolls (continued)

29 “Cross-Walks” (Mediation) Using Terms
[Diagram. Labels by schema:
Dublin Core: DC.Identifier: Louvre INV.779; DC.Type: Image
FRBR: Expression.Id: DOI:10.9876/MonaLisa.jpg; Manifestation.Id: Louvre INV.779
CIDOC CRM: E19 Physical Object (Louvre INV.779); E19 Image (DOI:10.9876/MonaLisa.jpg); has type (twice)
AAT terms, linked by BT: oil paintings, paintings, visual works, Physical Object, digital images, electronic images, Conceptual Object
Notes: material objects can only be at one place at a time! Immaterial objects reside on carriers!]

30 Mapping AAT to CRM
[Diagram mapping AAT facets to CIDOC CRM classes; one mapping is annotated “different sense!”.]
CRM classes: E1 CRM Entity, E2 Temporal Entity, E4 Period, E5 Event, E7 Activity, E77 Persistent Item, E70 Thing, E18 Physical Thing, E22 Man-Made Object, E28 Conc. Object, E73 Information Object, E39 Actor, E55 Type, E57 Material
AAT facets:
ACTIVITIES: Disciplines, Events, Functions, …
AGENTS: Organizations, People
MATERIALS: Materials
OBJECTS: Components, Containers, Costume, …
PHYSICAL ATTR.: Attr. & Properties, Color, …
STYLES & PERIODS: Styles & Periods
ASSOC. CONCEPTS: Assoc. Concepts

31 About the Objectivity of IsA
IsA generalization is based on strict inheritance of properties:
- All narrower concepts must have all the properties, or the potential for those properties, of the more general ones (plus their own).
Robust criteria for IsA regard:
- an a priori fixed scope of use
- compatibility of substance (“paper” or “letter”?)
- ways or reasons for coming into existence (“tree” or “rosacea”?)
- compatibility of behavior or function (“flying” or “disc shape”?)
- ability of recognizing/knowing (“professor” or “intelligent”?)
Applying such criteria increases the chance of
- individuals coming to the same generalizations/decisions
- larger groups agreeing on the same general concepts
- individuals learning the “indexing language”
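
Strict inheritance can be checked mechanically once properties are written down: for every IsA link, the narrower concept must carry every property of the broader one. The following is a sketch under assumptions; the concepts, the property names and the function `inheritance_violations` are hypothetical examples, not part of any thesaurus standard.

```python
def inheritance_violations(properties, isa_links):
    """Return (narrower, broader, missing properties) for every IsA link
    where the narrower concept lacks a property of the broader concept."""
    violations = []
    for narrower, broader in isa_links:
        missing = properties[broader] - properties[narrower]
        if missing:
            violations.append((narrower, broader, sorted(missing)))
    return violations


# Hypothetical concepts with hypothetical property sets.
PROPERTIES = {
    "birds": {"has feathers", "lays eggs"},
    "chaffinches": {"has feathers", "lays eggs", "sings"},
    "penguins": {"has feathers", "lays eggs"},
    "flying things": {"flies"},
}

# (narrower, broader) pairs proposed by an indexer.
ISA_LINKS = [
    ("chaffinches", "birds"),
    ("penguins", "birds"),
    ("penguins", "flying things"),  # violates strict inheritance
]

for violation in inheritance_violations(PROPERTIES, ISA_LINKS):
    print(violation)
```

A link flagged this way is a candidate for a different relation (for example “related term”) rather than IsA.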

32 About the Objectivity of IsA
This leads to a kind of “relative objectivity”,
- relative to a kind of use, a human way of thinking (??) and a theoretical total of concepts (“ontology”),
in contrast to sets of concepts
- known to a group, used in a book
- used in a discipline
- used to describe a context (Athens, 5th century BC)
These are all subsets of the one ontology!
- As units of work/knowing, we can only manage our subsets.
- We can only compare and integrate them if we agree on the common ontology and its hierarchical structure.
- Since the ontology is only known in parts, we need to agree on the method and then on the upper levels (facets).
- The more special a term is, the more useless is the agreement on it.

33 Filling out the Ontology
[Diagram. Labels: Core Ontology; Facet A, Facet B, Facet C; Field of Study “Athens 5th cent. BC”; Field of Study “Drama”; terms of the field; merging terms from fields; creating the upper level; X marks.]

34 Conclusions
Never mix particulars and universals!
- Completely different fields and reasoning mechanisms
- Abusing “broader term” for spatial inclusion inhibits integration.
Classification systems:
- Cultural bias is becoming more and more of an issue
- Not robust against evolution of domains
- Shelving is no longer an issue in the computer age
- “Activity facet” is an interesting concept!
Thesauri:
- Application of ontological criteria for generalization by IsA and selection of generic facets provides a much higher degree of objectivity (ontological commitment)
- Highly robust against evolution of domains
- Easier to learn the “indexing language”!

