Presentation on theme: "From General Ontology to Specific Ontology: Study of Shu-Shi Poems Ru-Yng Chang, Sue-ming Chang, Feng-ju Luo*, Chu-Ren Huang Academia Sinica, Yuan-Ze University*"— Presentation transcript:
Knowledge and Knowledge Structure Variation Knowledge is Structured Information Most salient factors dictating variations in knowledge structures are time, space, and domain Language is both the product and conduit of the conceptual structure of its speakers
Accessing Knowledge Structure In order to become sharable and reusable knowledge, all extracted information must first be correctly situated in a knowledge structure The situated information must be allowed to transfer from knowledge structure to knowledge structure without losing its meaningful content
Research Goal Knowledge Structure Discovery –Knowledge as situated information –Language endows information with structure Text-based and Lexicon-driven Knowledge Structure Discovery –General Ontology: the upper ontology shared by all domains (such as SUMO) –Specific Ontology: an ontology specific to a domain, a historical period, an author, etc.
Research Methodology The Mental Lexicon Approach The Shakespearean-garden Approach The Ontology-merging as Ontology-discovery Approach
The Mental Lexicon Approach: Each Word is a Conceptual Atom Concepts are stored in the mental lexicon The basic unit of mental lexicon organization and access is the lexical entry A complete list of lexical entries covers the complete list of conceptual atoms Lexical semantic relations mirror conceptual relations
The Shakespearean-garden Approach: Lexicon as a Structured Inventory of Conceptual Atoms A Shakespearean garden collects all the plants referred to in Shakespearean texts. –The garden is used to illustrate the flora of Shakespearean England and gives scholars a context in which to interpret his work. There is a knowledge structure behind each corpus (i.e. a collection of texts with design criteria) –For instance, the complete set of texts by an author, from a certain period, or in a certain domain
The Ontology-merging as Ontology-discovery Approach I An ontology provides a structure in which knowledge can be situated However, there is a dilemma in constructing a new ontology –If no existing ontology is referred to: we reinvent the wheel, and it is difficult to start a structure from scratch without rules –If an existing ontology is referred to: we may be misled by the existing structure, which can be mismatched or erroneous
The Ontology-merging as Ontology-discovery Approach II The Solution Map conceptual atoms to two (or more) reference ontologies Merge the two resultant ontologies –Matched Mapping: Confirmation of knowledge structure –Mismatched Mapping: Only one or neither is correct. May lead to the discovery of new knowledge structure –Complementary Mapping: Increases coverage
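The three mapping outcomes on this slide can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the two toy mappings, and the concept labels assigned to each word are all assumptions chosen for illustration. Each word is looked up in two reference mappings and the pair of results is classified as matched, mismatched, or complementary.

```python
from typing import Dict, Optional


def classify(concept_a: Optional[str], concept_b: Optional[str]) -> str:
    """Classify one word's mappings into two reference ontologies."""
    if concept_a is None and concept_b is None:
        return "unmapped"        # neither reference covers the word
    if concept_a is None or concept_b is None:
        return "complementary"   # only one reference covers the word
    if concept_a == concept_b:
        return "matched"         # both references agree
    return "mismatched"          # both cover the word but disagree


def merge_mappings(map_a: Dict[str, str], map_b: Dict[str, str]) -> Dict[str, str]:
    """Merge two word-to-concept mappings and label every word."""
    words = sorted(set(map_a) | set(map_b))
    return {w: classify(map_a.get(w), map_b.get(w)) for w in words}


# Hypothetical toy mappings (illustrative only, not taken from the slides).
direct_sumo = {"whale": "AquaticMammal", "kudzu": "FloweringPlant", "crab": "Crustacean"}
via_wordnet = {"whale": "AquaticMammal", "kudzu": "Fabric", "frog": "Amphibian"}

merged = merge_mappings(direct_sumo, via_wordnet)
for word, status in merged.items():
    print(word, status)
```

In this sketch the mismatched entries are the ones a human editor would inspect, since the slide notes that they may point to new knowledge structure.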
Resources used
WordNet: http://www.cogsci.princeton.edu/~wn/
SUMO Ontology: http://www.ontologyportal.org
Academia Sinica Bilingual Ontological Wordnet (Sinica BOW), SUMO + WordNet: http://bow.sinica.edu.tw
Segmentation Program etc.: http://LingAnchor.sinica.edu.tw/
Domain Lexicon Management System: Segmentation, New Word Detection, Lexical Database
Information in Sinica BOW (example: fish): Sense, Domain, POS, Definition, Translation, Semantic relation, SUMO, Example
SUMO: Suggested Upper Merged Ontology SUMO Atoms Concepts: around 1000 (note that concepts are not necessarily linguistically realized) Relations (ISA): see SUMO Graph Axioms: for inference Open resource created under an initiative of the IEEE Standard Upper Ontology Working Group
Methodology From lexicon to ontology (from items to structure) Ontology discovery through ontology merging
WHY? We do not have the knowledge structure (ontology) of a new domain (historical period, field etc.) But typical ontology discovery needs a framework to be mapped to To solve the dilemma we map the conceptual atoms to both SUMO and WN (as a linguistic ontology)
How to build a domain ontology: 1. Word segmentation 2. Match WordNet synsets and SUMO concepts automatically 3. Use WordNet information to check results and extend concepts 4. Transform into ontology browser format
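The first two steps of this pipeline can be sketched in Python under loud assumptions. The slides do not say which segmentation algorithm is used, so forward maximum matching stands in here as a simple substitute; the lexicon, the word-to-synset table, and the synset-to-SUMO table are hypothetical toy data. The sample line 白葛烏紗曳履行 and its flowering-plant reading of 白葛 come from the deck's example table.

```python
from typing import Dict, List, Set, Tuple


def segment(text: str, lexicon: Set[str], max_len: int = 4) -> List[str]:
    """Forward maximum matching: take the longest lexicon word at each position."""
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # Fall back to a single character when no lexicon word matches.
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


def match_concepts(
    tokens: List[str],
    word_to_synset: Dict[str, str],
    synset_to_sumo: Dict[str, str],
) -> Tuple[List[Tuple[str, str, str]], List[str]]:
    """Return (matched triples, words with no synset or concept)."""
    matched, missing = [], []
    for token in tokens:
        synset = word_to_synset.get(token)
        concept = synset_to_sumo.get(synset) if synset else None
        if concept:
            matched.append((token, synset, concept))
        else:
            missing.append(token)
    return matched, missing


# Hypothetical toy resources (assumptions, not the real Sinica BOW data).
lexicon = {"白葛", "烏紗"}
word_to_synset = {"白葛": "kudzu vine"}
synset_to_sumo = {"kudzu vine": "FloweringPlant"}

tokens = segment("白葛烏紗曳履行", lexicon)
matched, missing = match_concepts(tokens, word_to_synset, synset_to_sumo)
print(tokens)
print(matched)
print(missing)
```

Words that end up in `missing` correspond roughly to what a human editor would have to resolve by hand in step 3.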
Distribution of the Shu Shi lexicon: 98,430 words in volumes 1-45
The distribution of animal, plant, and artifact concepts in Shu Shi's poems
Concepts found in Shu Shi's poems but not in Tang 300: aquatic mammal (whale 鯨*); amphibian (frog 蛙*, toad 蟾蜍*, salamander 鯢); mollusk (clam 蛤*, gastropod 螺*, oyster 蠔, snail 蝸牛*, earthworm 蚯蚓*); crustacean (crab 蟹*, shrimp 蝦). Guangdong and Hainan Island
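A comparison of this kind reduces to a set difference over the concepts attested in each corpus. The sketch below is illustrative only: the four corpus-specific concepts come from this slide, while the shared concept "bird" is a placeholder assumption added so that the difference is not trivial, and the function name is hypothetical.

```python
from typing import List, Set


def concepts_only_in(target: Set[str], reference: Set[str]) -> List[str]:
    """Concepts attested in the target corpus but absent from the reference corpus."""
    return sorted(target - reference)


# Four concepts from the slide, plus "bird" as an assumed shared placeholder.
shu_shi_concepts = {"aquatic mammal", "amphibian", "mollusk", "crustacean", "bird"}
tang_300_concepts = {"bird"}  # placeholder assumption, not real corpus data

only_in_shu_shi = concepts_only_in(shu_shi_concepts, tang_300_concepts)
print(only_in_shu_shi)
```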
Words that stand for multiple concepts in the Shu Shi poems
Source | Word | SUMO | English | Example sentence (word form＞poem title＞line)
Shu Shi | 杜鵑 | bird | cuckoo | 子規＞和孔郎中荊林馬上見寄＞春山聞子規。
Shu Shi | 杜鵑 | bird | cuckoo | 啼鴃＞次韻劉景文登介亭＞朝先啼鴃起，
Shu Shi | 杜鵑 | flowering plant | azalea | 杜鵑＞菩提寺南漪堂杜鵑花＞南漪杜鵑天下無，
Shu Shi | 葛 | flowering plant | kudzu vine | 白葛＞和文與可洋川園池三十首：湖橋＞白葛烏紗曳履行。
Shu Shi | 葛 | fabric | kudzu vine | 白葛＞病中遊祖塔院＞烏紗白葛道衣涼。
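One way to surface such words automatically is to index the word-to-concept pairs and keep the words that map to more than one SUMO concept. The following Python sketch uses only the word, concept, and English columns from this slide; the function and variable names are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# (word, SUMO concept, English gloss) rows taken from the slide.
rows: List[Tuple[str, str, str]] = [
    ("杜鵑", "bird", "cuckoo"),
    ("杜鵑", "flowering plant", "azalea"),
    ("葛", "flowering plant", "kudzu vine"),
    ("葛", "fabric", "kudzu vine"),
]


def concepts_by_word(rows: List[Tuple[str, str, str]]) -> Dict[str, Set[str]]:
    """Group the SUMO concepts attested for each word."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for word, concept, _gloss in rows:
        index[word].add(concept)
    return dict(index)


index = concepts_by_word(rows)
# Keep only words that map to more than one concept.
ambiguous = {word: sorted(concepts) for word, concepts in index.items() if len(concepts) > 1}
print(ambiguous)
```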
What We Learned about Specific Ontology Constructing an ontology from a larger corpus and comparing two specific ontologies Local information can be effectively mapped Global information offers deeper insights into the knowledge structure –Human conceptualization of animals and plants has been relatively stable, but that of artifacts has not. Regardless of the criteria for classification, genetically determined features (behaviors, appearances, etc.) do not vary greatly. However, human technology is highly fluid: our conceptualization of artifacts depends heavily on the development of engineering and on our varying societal needs.
Axioms in SUMO
(instance GeorgeBush Human) – GeorgeBush is an instance of the class of humans
(exists (?X) (parent ?X GeorgeBush)) – there exists something of which George Bush is the parent
(instance parent BinaryPredicate) – the relation of parent is a binary relation
(domain parent 1 Organism) – the first argument to the parent relation must be an instance of the class Organism
(domain parent 2 Organism) – similarly for the second argument
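To show how domain axioms like these constrain a relation, here is a minimal Python sketch that checks the arguments of `parent` against the two `domain` statements. It is not a SUMO reasoner. The dictionaries, the helper names, the extra individuals `SomeChild` and `WhiteHouse`, and the one-step `Human` to `Organism` subclass link (which collapses SUMO's longer class chain) are all simplifying assumptions.

```python
from typing import Dict, Optional, Tuple

# Toy knowledge base; only GeorgeBush comes from the slide, the rest is assumed.
instance_of: Dict[str, str] = {
    "GeorgeBush": "Human",
    "SomeChild": "Human",       # hypothetical individual
    "WhiteHouse": "Building",   # hypothetical individual
}
# Simplified subclass links (assumption: intermediate classes are collapsed).
subclass_of: Dict[str, str] = {"Human": "Organism", "Building": "Artifact"}
# (domain parent 1 Organism) and (domain parent 2 Organism)
domains: Dict[Tuple[str, int], str] = {("parent", 1): "Organism", ("parent", 2): "Organism"}


def is_instance(entity: str, target_class: str) -> bool:
    """True if the entity's class equals target_class or is a subclass of it."""
    cls: Optional[str] = instance_of.get(entity)
    while cls is not None:
        if cls == target_class:
            return True
        cls = subclass_of.get(cls)
    return False


def satisfies_domains(relation: str, *args: str) -> bool:
    """Check every argument against the relation's declared domain classes."""
    return all(
        is_instance(arg, domains[(relation, position)])
        for position, arg in enumerate(args, start=1)
    )


print(satisfies_domains("parent", "SomeChild", "GeorgeBush"))
print(satisfies_domains("parent", "WhiteHouse", "GeorgeBush"))
```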
Summary and Future Work Ontologies represent the knowledge structure of a domain or historical period We have provided an online interface to browse ontologies and lexica In the future, we will complete the online ontology editor and browser, which will –Map lexicon, WordNet and SUMO. –Integrate ontologies based on different texts. –Facilitate comparative studies of various domain ontologies.
Towards a Workbench for Specific Ontology: Browser and Editor
User login
Function menu (Personal ontologies list)
Browse an ontology
Edit an ontology
Add an ontology
1. SUMO 2. SUMO + WordNet + concept map with lexicon
Logout
1. Update lexical concepts 2. Update mapping between WordNet synset and lexicon 3. Edit other information in lexicon
Import text
Import lexicon
Word segmentation
Match concept and synset automatically
1. Suggestion list 2. Missing list
Constructing a Specific Ontology Import text, or domain lexicon –Select style of writing –Select category of word list for word segmentation –Select reference ontologies to match SUMO and lexicon Information of suggestion list –Candidate synset –Candidate synset synonyms –Explanation of candidate synset –Concept of candidate synset