Multilingual Ontologies for Cross-Language Information Extraction and Semantic Search Stephen W. Liddle, Information Systems Department David W. Embley,

Slides:



Advertisements
Similar presentations
JULY JULIO AUGUST AGOSTO SEPTEMBER SEPTIEMBRE OCTOBER OCTUBRE NOVEMBER NOVIEMBRE DECEMBER DICIEMBRE LOS ANGELES UNIFIED SCHOOL DISTRICT DISTRITO ESCOLAR.
Advertisements

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
You have been given a mission and a code. Use the code to complete the mission and you will save the world from obliteration…
PDAs Accept Context-Free Languages
Advanced Piloting Cruise Plot.
EuroCondens SGB E.
1 Vorlesung Informatik 2 Algorithmen und Datenstrukturen (Parallel Algorithms) Robin Pomplun.
Copyright © 2003 Pearson Education, Inc. Slide 1 Computer Systems Organization & Architecture Chapters 8-12 John D. Carpinelli.
Chapter 1 The Study of Body Function Image PowerPoint
Copyright © 2011, Elsevier Inc. All rights reserved. Chapter 5 Author: Julia Richards and R. Scott Hawley.
Copyright © 2011, Elsevier Inc. All rights reserved. Chapter 6 Author: Julia Richards and R. Scott Hawley.
Author: Julia Richards and R. Scott Hawley
1 Copyright © 2010, Elsevier Inc. All rights Reserved Fig 2.1 Chapter 2.
Addition and Subtraction Equations
Multilinguality & Semantic Search Eelco Mossel (University of Hamburg) Review Meeting, January 2008, Zürich.
Jeopardy Q 1 Q 6 Q 11 Q 16 Q 21 Q 2 Q 7 Q 12 Q 17 Q 22 Q 3 Q 8 Q 13
Jeopardy Q 1 Q 6 Q 11 Q 16 Q 21 Q 2 Q 7 Q 12 Q 17 Q 22 Q 3 Q 8 Q 13
Title Subtitle.
Exit a Customer Chapter 8. Exit a Customer 8-2 Objectives Perform exit summary process consisting of the following steps: Review service records Close.
CALENDAR.
My Alphabet Book abcdefghijklm nopqrstuvwxyz.
DIVIDING INTEGERS 1. IF THE SIGNS ARE THE SAME THE ANSWER IS POSITIVE 2. IF THE SIGNS ARE DIFFERENT THE ANSWER IS NEGATIVE.
FACTORING ax2 + bx + c Think “unfoil” Work down, Show all steps.
Year 6 mental test 5 second questions
Copyright 2006 Digital Enterprise Research Institute. All rights reserved. MarcOnt Initiative Tools for collaborative ontology development.
WARM UP What is the value of 4 5 x 4? Explain. WARM UP What is the value of 4 5 x 4 3 ? Explain.
Around the World AdditionSubtraction MultiplicationDivision AdditionSubtraction MultiplicationDivision.
Learning to show the remainder
ZMQS ZMQS
Part 1 Marketing Dynamics
Knowledge Extraction from Technical Documents Knowledge Extraction from Technical Documents *With first class-support for Feature Modeling Rehan Rauf,
A Fractional Order (Proportional and Derivative) Motion Controller Design for A Class of Second-order Systems Center for Self-Organizing Intelligent.
The basics for simulations
ABC Technology Project
1 Undirected Breadth First Search F A BCG DE H 2 F A BCG DE H Queue: A get Undiscovered Fringe Finished Active 0 distance from A visit(A)
TCCI Barometer March “Establishing a reliable tool for monitoring the financial, business and social activity in the Prefecture of Thessaloniki”
VOORBLAD.
Dynamic Access Control the file server, reimagined Presented by Mark on twitter 1 contents copyright 2013 Mark Minasi.
TCCI Barometer March “Establishing a reliable tool for monitoring the financial, business and social activity in the Prefecture of Thessaloniki”
© 2010 Cisco and/or its affiliates. All rights reserved.Presentation_IDCisco Confidential CISCO LEARNING CREDITS MANAGEMENT TOOL CLP ADMINISTRATOR – USER.
Factor P 16 8(8-5ab) 4(d² + 4) 3rs(2r – s) 15cd(1 + 2cd) 8(4a² + 3b²)
1..
© 2012 National Heart Foundation of Australia. Slide 2.
Understanding Generalist Practice, 5e, Kirst-Ashman/Hull
Chapter 5 Test Review Sections 5-1 through 5-4.
Before Between After.
Benjamin Banneker Charter Academy of Technology Making AYP Benjamin Banneker Charter Academy of Technology Making AYP.
Addition 1’s to 20.
25 seconds left…...
Subtraction: Adding UP
1 Non Deterministic Automata. 2 Alphabet = Nondeterministic Finite Accepter (NFA)
Januar MDMDFSSMDMDFSSS
Week 1.
We will resume in: 25 Minutes.
Static Equilibrium; Elasticity and Fracture
©Brooks/Cole, 2001 Chapter 12 Derived Types-- Enumerated, Structure and Union.
Essential Cell Biology
A SMALL TRUTH TO MAKE LIFE 100%
A SMALL TRUTH TO MAKE LIFE 100%
PSSA Preparation.
Lial/Hungerford/Holcomb/Mullins: Mathematics with Applications 11e Finite Mathematics with Applications 11e Copyright ©2015 Pearson Education, Inc. All.
Profile. 1.Open an Internet web browser and type into the web browser address bar. 2.You will see a web page similar to the one on.
Introduction Embedded Universal Tools and Online Features 2.
Traktor- og motorlære Kapitel 1 1 Kopiering forbudt.
Schutzvermerk nach DIN 34 beachten 05/04/15 Seite 1 Training EPAM and CANopen Basic Solution: Password * * Level 1 Level 2 * Level 3 Password2 IP-Adr.
Cross-Language Hybrid Keyword and Semantic Search David W. Embley, Stephen W. Liddle, Deryle W. Lonsdale, Joseph S. Park, Andrew Zitzelberger Brigham Young.
Cross-language Information Retrieval
Presentation transcript:

Multilingual Ontologies for Cross-Language Information Extraction and Semantic Search Stephen W. Liddle, Information Systems Department David W. Embley, Computer Science Department Deryle W. Lonsdale, Department of Linguistics and English Language Brigham Young University, Provo, Utah, USA Yuri Tijerino, Department of Applied Informatics Kwansei Gakuin University, Kobe-Sanda, Japan 1 November th International Conference on Conceptual Modeling Brussels, Belgium

Local Knowledge Is Valuable What is the best way to get from A to B? Optimize time, cost, convenience Where can I find good food at a good price? Optimize quality, price, schedule constraints Are there high-value deals (bargains) available for consumer products you may want? ER

An Example Im visiting Osaka, Japan and Im getting hungry Query: Find a BBQ restaurant near the Umeda station, with typical prices less than $40 Potential knowledge sources Hotel concierge Someone at the company or university Im visiting Search engines ER

Google Says… Were getting some local knowledge, but I want more Find a BBQ restaurant near the Umeda station, with typical prices under $40 ER

Compare with Local Search Query: Sandwich shops near Provo, Utah Reviews Location & Contact Info Map ER

Local Knowledge Is Encoded in the Local Language If I could query in Japanese over a structured database, I might find this: ER

A Better Scenario ER

Building Blocks Past work: Ontology-based data extraction system (OntoES) Free-form query answering system (AskOntos/HyKSS) New proposal: MultiLingual Ontology Extraction System (ML-OntoES) ER

ML-OntoES Process Find a BBQ restaurant near the Umeda station, with typical prices under $40 ER

Multilingual Ontology Multilingual ontology localized to N contexts Language-agnostic ontology ( A ) N localizations, each L i having: Context label Extraction ontology ( O i ) Set of mappings from L i to A Set of mappings from A to L i A L1L1 L2L2 L3L3 L5L5 LnLn ER

Language-Agnostic Restaurant Ontology ( A ) Star Architecture U.S. Localization ( L en_US )Japanese Localization ( L jp_JP ) A L1L1 L2L2 L3L3 L5L5 LnLn ER

Restaurant Ontology U.S. Localization ER

Restaurant Ontology Japanese Localization ER

Ontology Construction Strategy Start with extraction ontology in one language Language-agnostic ontology is one-to-one with first language For each localization: Adjust central language-agnostic ontology as needed to accommodate concepts in the new localization Each concept in localization must map to a single concept in the central ontology (injective from L i to A ) Not all concepts in central ontology need map to each localization (surjective from A to L i ) ER

Types of Mappings Structural (schema) mappings Most are direct We also support 1: n, n :1, m : n, split, merge, select, etc. Data instance mappings Scalar units conversions Lexicon matching Transliteration Currency conversions Commentary mappings ER

Scalar Units Conversions ER

Lexicon Matching EnglishFrench mealrepas meal (flour)farine suppersouper Term base systems Lexical databases Statistical machine translation ER

Transliteration Muhammar Ghadaffi At least 39 variant spellings in English Bill Clinton At least 6 variants of Clinton in Arabic Single language: machine learning techniques, language resources Cross-linguistic: lexicon possible, but on-the-fly tools may be more helpful 45 Created via automatic Kanji-to-English transliteration tool ER

Currency Conversions 1 Euro = US Dollar 1 British Pound = Euro 1 Swiss Franc = USD But exchange rates change over time Solution: web services Look up exchange rate at date/time ER

Commentary Mappings Some ideas require commentary to describe well Tipping protocol How meals are structured Dress codes Automatic translation can help Gives starting point Can give something for virtually any commentary Likely requires human refinement n 2 mappings, but pay-as-you-go tuning, base level free ER

Automatic Translation Reservations highly recommended, especially on Friday, Saturday, and holiday evenings translate.google.com : Réservations fortement recommandé, surtout le vendredi, samedi, et les soirées de vacances babelfish.yahoo.com : Réservations fortement - recommandé, particulièrement vendredi, samedi, et soirées de vacances ER

Automatic Translation Reservations highly recommended, especially on Friday, Saturday, and holiday evenings translate.google.com : Reservierungen dringend empfohlen, vor allem am Freitag, Samstag und Feiertag abends Reservas muy recomendable, sobre todo los viernes, sábados y las fiestas en Reservasjoner anbefales, spesielt på fredag, lørdag og ferie kvelder, Tempahan amat disyorkan, terutamanya pada hari Jumaat, Sabtu, dan malam hari bercuti ER

Pijin Prototype Accept free-form natural-language queries English, Japanese, Chinese Map to restaurant extraction ontology Reformulate as form queries suitable for submitting to restaurant- recommendation web services Hotpepper, LivedoorGourmet, Tabelog, Gournavi Process Japanese results, mash up with Google maps Correctly interpreted, translated queries Japanese: 79% Chinese: 69% English: 72% Loosely: find me a BBQ restaurant that offers coupons near Tokyo station and my budget is under 5000 yen ER

ML-OntoES´ Prototype Added UTF-8 abilities to OntoES Car-ads application domain English and Japanese extraction ontologies Simple structure, 1-1 schema mappings But immediately converted, stored Japanese values in English ER

ML-OntoES´ Evaluation ExEExJQiEQiJQrEEQrJJQrEJQrJE Precision Recall Harmonic-Mean F-Measures Extraction Free-form Query Interpretation Query Results Interesting issues: years in Japanese car ads Model year ( and ) Shaken year ( ) Heisei years (2008 written as ) Lesson: there are many subtleties ER

Contributions ML-OntoES Central language-agnostic ontology, pay-as-you-go incremental design Proof-of-concept prototypes Cross-language information extraction and semantic search can be successful ER

Future Work Broaden our ML-OntoES prototype implementation Experiment with more domains Experiment with more languages ER