Exploiting a Thesaurus-Based Semantic Net for Knowledge-Based Search Peter Clark John Thompson Lisbeth Duncan Heather Holmback Knowledge Systems Boeing,

Presentation transcript:

Exploiting a Thesaurus-Based Semantic Net for Knowledge-Based Search
Peter Clark, John Thompson, Lisbeth Duncan, Heather Holmback
Knowledge Systems Boeing, Mathematics and Computing Technology

Overview
Problem: searching for information
–in particular, for human experts
Approach:
–Search using concepts, not words
–Use a thesaurus as the initial ontology
–Enhance it using simple AI techniques
The Application:
–Two deployed “Expert Locator” applications

Overall Picture
[Figure: query words (“tube placement”), a search engine, and the resources to be searched: databases, human experts, web pages, document repositories, ...]

Problems with word searches
Words have many senses (polysemy)
–e.g. “plane” finds both airplanes and geometry
Many words mean the same thing (synonymy)
–e.g. “tail fin” misses “vertical stabilizer”
Lack of world knowledge
–e.g. “jet engine” misses “propulsion systems”
Goal: organize search around concepts, not words
→ Need a conceptual vocabulary (“ontology”)

The Ontology Bottleneck
Massive up-front cost to build an ontology
The Approach
Use a technical thesaurus, enhanced with AI techniques
Boeing’s Thesaurus:
–Highly customized to aerospace and Boeing
–Massive knowledge repository
  37,000 concepts, 18,000 synonyms
  100,000 relationships (3 types)
–Investment of many person-years of effort

A (tiny) fragment of the ontology...
[Figure: graph of linked thesaurus concepts: Jet engines, flameout, combustion, Burning rate, afterburning, Ramjet engines, Hydrogen fuels, engines, Propulsion systems, thrust, lift, Turbojet engines, Engine starters, Flame stability, Combustion stability, Flame propagation, Pneumatic equipment, starting, ignition, spray, Jet spray]

Converting Words to Concepts
Search word: “jet”
[Figure: the ontology fragment (Jet engines, Ramjet engines, Turbojet engines, Jet spray, ...) annotated with question marks: which concepts does the search word denote?]
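One simple way to read this slide is as a lookup from a search word to every concept whose name contains that word. The sketch below is only an illustration under that assumption; the function name `candidate_concepts` and the whole-token matching rule are hypothetical and are not taken from the presentation.

```python
def candidate_concepts(word, concept_names):
    """Return the concept names that contain `word` as a whole token.

    Matching is case-insensitive. Whole-token matching is an assumption
    made for this sketch, not the presentation's stated method.
    """
    target = word.lower()
    return [name for name in concept_names if target in name.lower().split()]


# A few concept names copied from the ontology fragment on the slide.
CONCEPTS = [
    "Jet engines",
    "Jet spray",
    "Ramjet engines",
    "Turbojet engines",
    "Propulsion systems",
]

matches = candidate_concepts("jet", CONCEPTS)
print(matches)
```

With whole-token matching, “jet” maps to both “Jet engines” and “Jet spray”, which is the ambiguity the question marks on the slide point at; a real system would still need some way to choose among the candidates.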

Matching Query and Target Concepts
Semantic distance between “ignition” and “jet engines”?
[Figure: the ontology fragment (Jet engines, combustion, ignition, Propulsion systems, ...) used to trace a path between the two concepts]
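The slide does not say how semantic distance is computed. A common baseline, assumed here purely for illustration, is the number of links on the shortest path between two concepts. The edges below are invented for the example (the transcript lists the concepts but not which ones are linked), and `semantic_distance` and `undirected` are hypothetical names.

```python
from collections import deque


def undirected(edges):
    """Build an adjacency map from a list of (a, b) links."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def semantic_distance(graph, start, goal):
    """Shortest-path length in links between two concepts, or None if unconnected."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


# Illustrative links only: these edges are assumptions, not the real thesaurus.
EDGES = [
    ("ignition", "combustion"),
    ("combustion", "Jet engines"),
    ("Jet engines", "Propulsion systems"),
    ("Jet engines", "Turbojet engines"),
]
GRAPH = undirected(EDGES)

distance = semantic_distance(GRAPH, "ignition", "Jet engines")
print(distance)
```

Breadth-first search treats every link as equally costly. The presentation later suggests that relation types can help compute semantic distance, which would correspond to weighting the links rather than counting them.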

Expert Locator Demo (see end of this presentation for the demo in powerpoint form)

Enhancing the Thesaurus: 1. Increase connectivity using subsumption
100,000 links are not enough!
–40% of concepts are “orphans”
But: Many concept names are phrases
–Can add links by analyzing these phrases
[Figure: “Space Shuttle Main Engine” linked to “Engine” (generalization) and to “Space Shuttle” (related-to)]

Subsumption Computation Algorithm
1. Compute all possible generalizations by “word chopping” and “word generalization”...
[Figure: lattice of candidate generalizations of “Space Shuttle Main Engine”, including Engine, Space Shuttle Engine, Space Engine, Space Shuttle Main, Space Shuttle, Space Vehicle, Vehicle Main Engine, Vehicle Main]

Subsumption Computation Algorithm
2. Identify existing Thesaurus concepts and links within these
[Figure: the lattice of candidate generalizations of “Space Shuttle Main Engine”]

Subsumption Computation Algorithm
3. Add missing connections to nearest existing concepts
[Figure: the lattice of candidate generalizations of “Space Shuttle Main Engine”]
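The first two steps of the algorithm can be sketched as follows. This is a minimal reading of the slides, not the authors' implementation: “word chopping” is assumed to mean dropping words while keeping their order, and “word generalization” is assumed to mean replacing one word with a broader term. The `BROADER` map and the tiny `THESAURUS` set are hypothetical stand-ins for the real thesaurus, and the sketch does not classify new links as generalization or related-to.

```python
from itertools import combinations


def chop(words):
    """All non-empty, proper, order-preserving sub-sequences of `words`."""
    n = len(words)
    phrases = set()
    for size in range(1, n):
        for indices in combinations(range(n), size):
            phrases.add(tuple(words[i] for i in indices))
    return phrases


def generalize(phrases, broader):
    """Add variants in which a single word is replaced by its broader term."""
    result = set(phrases)
    for phrase in phrases:
        for i, word in enumerate(phrase):
            if word in broader:
                result.add(phrase[:i] + (broader[word],) + phrase[i + 1:])
    return result


def candidate_generalizations(name, broader):
    """Step 1: every candidate generalization of a phrasal concept name."""
    words = tuple(name.split())
    candidates = generalize(chop(words) | {words}, broader)
    candidates.discard(words)  # a concept is not its own generalization
    return {" ".join(phrase) for phrase in candidates}


BROADER = {"Shuttle": "Vehicle"}        # hypothetical broader-term map
THESAURUS = {"Engine", "Space Shuttle"}  # hypothetical existing concepts

candidates = candidate_generalizations("Space Shuttle Main Engine", BROADER)
existing = candidates & THESAURUS        # step 2: keep known concepts
print(sorted(existing))
```

Step 3 would then link “Space Shuttle Main Engine” to the nearest of the surviving concepts. How “nearest” is measured is not spelled out in the transcript, so it is left out of the sketch.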

Some Example Inferred Links
[Figure: two groups of linked concepts. (1) Measuring Instruments, Equipment, Optical Measuring Instruments, Distance Measuring Equipment, Range Finders, Optical Range Finders. (2) Halogen Compounds, Fluorine Compounds, Nitrogen Fluorine Compounds, Fluorides, Nitrogen Fluorides.]
21,000 generalization/specialization and 37,000 related-to links added
Number of “orphans” down from 40% to 13%

Enhancing the Thesaurus: 2. Use NLP to refine the “related-to” links
Current: “Metal Tube” related-to “Metal”
New: “Metal Tube” made-of “Metal”
27 relationship types chosen (causes, location, …)
Heuristic noun-noun rules select the relationship, e.g.
For compound “X Y” (e.g. “metal tube”):
IF X is a Material AND Y is a Physical-Object THEN Y made-of X
Can use relation type to help compute semantic distance
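The noun-noun rule on this slide can be written as a small rule table. The sketch below encodes only the one rule the slide gives; the `SEMANTIC_TYPE` lexicon and the function name `refine_relation` are hypothetical, and the other relationship types are not reproduced because the transcript does not list their rules.

```python
# Hypothetical lexicon assigning a semantic type to each noun.
SEMANTIC_TYPE = {
    "metal": "Material",
    "tube": "Physical-Object",
}

# (type of X, type of Y, relation): for compound "X Y", Y <relation> X.
# Only the rule stated on the slide is included.
RULES = [
    ("Material", "Physical-Object", "made-of"),
]


def refine_relation(x, y):
    """Return the relation linking Y to X for the compound "X Y".

    Falls back to the generic "related-to" link when no rule fires.
    """
    type_x = SEMANTIC_TYPE.get(x.lower())
    type_y = SEMANTIC_TYPE.get(y.lower())
    for rule_x, rule_y, relation in RULES:
        if type_x == rule_x and type_y == rule_y:
            return relation
    return "related-to"


print(refine_relation("metal", "tube"))
```

Keeping the rules in a table rather than in nested conditionals makes it straightforward to add more of the relationship types as further rows.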

Enhancing the Thesaurus: 3. Knowledge from Text
Definition: “Flap: A movable airfoil attached to an airplane’s wing, and used to increase lift or drag.”
NLP output:
Flap
  isa: Airfoil
  attribute: Movable
  attached-to: Wing
  part-of: Airplane
  purpose: Increase
    object: Lift, Drag
[Figure: Flap linked to Airfoil, Movable, Wing, Airplane, Increase, Lift and Drag by isa, attribute, attached-to, part-of, purpose and object links, alongside the thesaurus rt and bt links]
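The frame extracted from the “Flap” definition can be held as plain slot-filler data and flattened into labelled links for the semantic net. The sketch below is one possible representation, assumed for illustration; the slot names and fillers come from the slide, while `frame_to_links` and the dictionary layout are hypothetical.

```python
# Slots and fillers as shown on the slide for the concept "Flap".
FLAP_FRAME = {
    "isa": ["Airfoil"],
    "attribute": ["Movable"],
    "attached-to": ["Wing"],
    "part-of": ["Airplane"],
    "purpose": ["Increase"],
}

# The slide nests "object: Lift, Drag" under the purpose "Increase".
INCREASE_FRAME = {
    "object": ["Lift", "Drag"],
}


def frame_to_links(concept, frame):
    """Flatten a frame into (concept, relation, filler) triples."""
    return [
        (concept, slot, filler)
        for slot, fillers in frame.items()
        for filler in fillers
    ]


links = frame_to_links("Flap", FLAP_FRAME) + frame_to_links("Increase", INCREASE_FRAME)
for link in links:
    print(link)
```

Each triple has the same shape as a typed thesaurus link, so links extracted from text could be stored alongside the existing ones.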

Status and Evaluation
The Applications
–Two “Expert Locators” deployed and in use
–Sustained usage (~20 searches / day)
–Plans to quickly expand them further
  more experts
  also cover projects and work groups
  add in attribute filters (years at Boeing, location, …)
How do the Thesaurus Enhancements Affect Search?
–Study: Expert assessed relevance of “hit” concepts
–Recall increased (44% → 75%) with only minimal effect on precision (58% → 57%)
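For readers unfamiliar with the two measures, the sketch below shows the standard definitions of precision and recall over sets of “hit” concepts. The concept sets are toy data invented for the example; they are not the study's data and do not reproduce its figures.

```python
def precision_recall(retrieved, relevant):
    """Standard set-based precision and recall.

    precision = |retrieved AND relevant| / |retrieved|
    recall    = |retrieved AND relevant| / |relevant|
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Toy example only: these sets are assumptions for illustration.
retrieved = {"Jet engines", "Turbojet engines", "Jet spray", "Ramjet engines"}
relevant = {"Jet engines", "Turbojet engines", "Propulsion systems"}

precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)
```

Adding links to the thesaurus lets a search reach more relevant concepts (higher recall), at the risk of also reaching irrelevant ones (lower precision); the study's result is that the second effect was small.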

Discussion
“Number N of links” → “relevance”?
–only for very small N!
The useful bias of a domain-specific Thesaurus:
–only contains relevant concepts
  massively reduces errors in Thesaurus enhancement
–only contains relevant links
  provides very domain-specific search
Limitations:
–ignored “quality” of expert, social issues, etc.
–what if the concept you want isn’t there?
Generality: Applies to any resource, not just experts

Summary
Search using concepts, not words
Use of a thesaurus as an initial ontology:
–Can leverage many years of work by librarians
–Made viable using simple AI techniques of
  search
  subsumption computation
  language processing
Domain-specific thesauri provide valuable bias

End - demo in PPT follows