Exploiting a Thesaurus-Based Semantic Net for Knowledge-Based Search Peter Clark John Thompson Lisbeth Duncan Heather Holmback Knowledge Systems Boeing,


1 Exploiting a Thesaurus-Based Semantic Net for Knowledge-Based Search
Peter Clark, John Thompson, Lisbeth Duncan, Heather Holmback
Knowledge Systems Boeing, Mathematics and Computing Technology

2 Overview
Problem: searching for information
–in particular, for human experts
Approach:
–Search using concepts, not words
–Use a thesaurus as the initial ontology
–Enhance it using simple AI techniques
The Application:
–Two deployed “Expert Locator” applications

3 Overall Picture
[Diagram: query words (“tube placement”) are passed to a Search Engine, which searches Databases, Human Experts, Web pages, Document repositories, ...]

4 Problems with word searches
Words have many senses (polysemy)
–e.g. “plane” finds both airplanes and geometry
Many words mean the same thing (synonymy)
–e.g. “tail fin” misses “vertical stabilizer”
Lack of world knowledge
–e.g. “jet engine” misses “propulsion systems”
Goal: organize search around concepts, not words
→ Need a conceptual vocabulary (“ontology”)

5 The Ontology Bottleneck
Massive up-front cost to build an ontology
The Approach: use a technical thesaurus, enhanced with AI techniques
Boeing’s Thesaurus:
–Highly customized to aerospace and Boeing
–Massive knowledge repository: 37,000 concepts, 18,000 synonyms, 100,000 relationships (3 types)
–Many person-years of effort invested

6 A (tiny) fragment of the ontology...
[Diagram of linked concepts: Jet engines, flameout, combustion, Burning rate, afterburning, Ramjet engines, Hydrogen fuels, engines, Propulsion systems, thrust, lift, Turbojet engines, Engine starters, Flame stability, Combustion stability, Flame propagation, Pneumatic equipment, starting, ignition, spray, Jet spray]

7 Converting Words to Concepts
[Diagram: the ontology fragment from slide 6. Search word: “jet”, with question marks on the concepts it might map to]
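A minimal sketch of the word-to-concept step, assuming that candidate concepts are found by whole-word matching against concept names. The slide does not say how the real system matches words, so `build_word_index` and `candidate_concepts` are hypothetical names, and only a few concept names from the slide 6 fragment are used.

```python
from collections import defaultdict

# A few concept names copied from the slide 6 ontology fragment.
CONCEPTS = [
    "Jet engines", "Jet spray", "Ramjet engines",
    "Turbojet engines", "Propulsion systems", "ignition",
]


def build_word_index(concepts):
    """Map each lower-cased word to the concept names that contain it."""
    index = defaultdict(set)
    for name in concepts:
        for word in name.lower().split():
            index[word].add(name)
    return index


def candidate_concepts(word, index):
    """Return the concepts a search word might refer to (assumed: whole-word match)."""
    return sorted(index.get(word.lower(), set()))


INDEX = build_word_index(CONCEPTS)
print(candidate_concepts("jet", INDEX))  # ['Jet engines', 'Jet spray']
```

Under this assumption “jet” is ambiguous between two concepts, which is the situation the question marks on the slide illustrate; a substring match would also pull in “Ramjet engines” and “Turbojet engines”.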

8 Matching Query and Target Concepts
[Diagram: the ontology fragment from slide 6]
Semantic distance between “ignition” and “jet engines”?
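One simple way to read “semantic distance” is the number of links on the shortest path between two concepts. The sketch below assumes that reading and uses breadth-first search; the slides do not give the actual distance measure, and the edges in `EDGES` are illustrative guesses, since the links in the diagram are not recoverable from the transcript.

```python
from collections import deque

# Illustrative links only; the real thesaurus links are not in the transcript.
EDGES = [
    ("ignition", "combustion"),
    ("combustion", "Jet engines"),
    ("Jet engines", "engines"),
    ("engines", "Propulsion systems"),
    ("Jet spray", "spray"),
]


def build_graph(edges):
    """Build an undirected adjacency map from a list of concept pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def semantic_distance(graph, start, goal):
    """Shortest path length in links, or None if the concepts are not connected."""
    if start not in graph or goal not in graph:
        return None
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


GRAPH = build_graph(EDGES)
print(semantic_distance(GRAPH, "ignition", "Jet engines"))  # 2
```

A weighted variant, in which each link type carries its own cost, would fit the later remark on slide 15 that relation types can help compute semantic distance.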

9 Expert Locator Demo (see end of this presentation for the demo in powerpoint form)

10 Enhancing the Thesaurus: 1. Increase connectivity using subsumption
100,000 links are not enough!
–40% of concepts are “orphans”
But: Many concept names are phrases
–Can add links by analyzing these phrases
[Diagram: “Space Shuttle Main Engine” gets a generalization link to “Engine” and a related-to link to “Space Shuttle”]

11 Subsumption Computation Algorithm
1. Compute all possible generalizations by “word chopping” and “word generalization”...
[Diagram: lattice of candidate phrases derived from “Space Shuttle Main Engine”, including Engine, Space Shuttle Engine, Space Engine, Space Shuttle Main, Space Shuttle, Space Vehicle, Vehicle Engine, Vehicle Main Engine and Vehicle Main]

12 Subsumption Computation Algorithm
2. Identify existing Thesaurus concepts and links within these
[Diagram: the candidate lattice for “Space Shuttle Main Engine” from slide 11]

13 Subsumption Computation Algorithm
3. Add missing connections to nearest existing concepts
[Diagram: the candidate lattice for “Space Shuttle Main Engine” from slide 11]
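The three steps on slides 11 to 13 can be sketched roughly as follows. This is an interpretation, not the authors' implementation: “word chopping” is taken to mean dropping words while keeping their order, “word generalization” to mean replacing a word with a more general one, and “nearest” to mean an existing concept that is not itself a generalization of another existing candidate. The `WORD_PARENTS` table, the `THESAURUS` contents and the head-word test used to choose between the two link types are all assumptions, the last one chosen because it reproduces the slide 10 example.

```python
from itertools import combinations, product

# Hypothetical word-level generalizations ("word generalization" on the slide).
WORD_PARENTS = {"shuttle": "vehicle"}

# Hypothetical set of concept names already present in the thesaurus.
THESAURUS = {"engine", "space shuttle", "vehicle"}


def generalizations(phrase):
    """Step 1: candidates made by dropping words and/or generalizing words."""
    words = phrase.lower().split()
    results = set()
    for size in range(1, len(words) + 1):
        for kept in combinations(words, size):  # word chopping, order preserved
            options = [(w, WORD_PARENTS[w]) if w in WORD_PARENTS else (w,)
                       for w in kept]
            for variant in product(*options):  # word generalization
                results.add(" ".join(variant))
    results.discard(" ".join(words))  # a phrase is not its own generalization
    return results


def infer_links(phrase, thesaurus):
    """Steps 2 and 3: keep existing concepts, then link to the nearest ones."""
    found = generalizations(phrase) & thesaurus
    nearest = {c for c in found
               if not any(c in generalizations(other)
                          for other in found if other != c)}
    head = phrase.lower().split()[-1]
    heads = {head, WORD_PARENTS.get(head, head)}
    # Assumed rule: a candidate that keeps the head word is a generalization.
    return {c: "generalization" if c.split()[-1] in heads else "related-to"
            for c in nearest}


print(infer_links("Space Shuttle Main Engine", THESAURUS))
```

With these toy inputs the phrase is linked to “engine” as a generalization and to “space shuttle” as related-to, while “vehicle” is skipped because “space shuttle” already sits between it and the phrase.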

14 Some Example Inferred Links
[Diagram, concepts shown: Measuring Instruments; Equipment; Optical Measuring Instruments; Distance Measuring Equipment; Range Finders; Optical Range Finders; Halogen Compounds; Fluorine Compounds; Nitrogen Fluorine Compounds; Fluorides; Nitrogen Fluorides]
21,000 generalization/specialization and 37,000 related-to links added
Number of “orphans” down from 40% to 13%

15 Enhancing the Thesaurus: 2. Use NLP to refine the “related-to” links
Current: Metal Tube related-to Metal
New: Metal Tube made-of Metal
27 relationship types chosen (causes, location, …)
Heuristic noun-noun rules select the relationship, e.g. for compound “X Y” (e.g. “metal tube”):
IF X is a Material AND Y is a Physical-Object THEN Y made-of X
Can use relation type to help compute semantic distance
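The noun-noun rule on this slide could be encoded along the following lines. This is a sketch under assumptions: the `WORD_TYPES` lexicon is invented for illustration, and only the one rule quoted on the slide is included out of the 27 relationship types mentioned.

```python
# Hypothetical type lexicon; the real system's word classes are not given.
WORD_TYPES = {
    "metal": "Material",
    "plastic": "Material",
    "tube": "Physical-Object",
    "panel": "Physical-Object",
}

# Each rule: (type of X, type of Y, relation such that "Y relation X").
RULES = [
    ("Material", "Physical-Object", "made-of"),  # the rule quoted on the slide
]


def refine_relation(compound):
    """For a two-word compound "X Y", return (Y, relation, X) or None."""
    words = compound.lower().split()
    if len(words) != 2:
        return None
    x, y = words
    for x_type, y_type, relation in RULES:
        if WORD_TYPES.get(x) == x_type and WORD_TYPES.get(y) == y_type:
            return (y, relation, x)
    return None  # no rule fired, so the link stays a plain "related-to"


print(refine_relation("Metal Tube"))  # ('tube', 'made-of', 'metal')
```

Keeping the rules in a table rather than in code makes it straightforward to add further relation types without touching `refine_relation`.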

16 Enhancing the Thesaurus: 3. Knowledge from Text
Definition: “Flap: A movable airfoil attached to an airplane’s wing, and used to increase lift or drag.”
NLP output for Flap:
isa: Airfoil
attribute: Movable
attached-to: Wing
part-of: Airplane
purpose: Increase
object: Lift, Drag
[Diagram: the same structure drawn as a graph over Flap, Airfoil, Airplane, Wing, Lift, Drag, Increase and Movable, with rt and bt links shown alongside the isa, attribute, purpose, object, attached-to and part-of links]
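As an illustration of how such extracted knowledge might be stored, the frame on this slide can be written as nested dictionaries and flattened into typed links. The representation is an assumption, as is the reading that “object: Lift, Drag” belongs to the “Increase” purpose rather than to Flap directly; `frames_to_triples` is a hypothetical helper.

```python
# Slot values copied from the slide; the dictionary layout is an assumption.
FRAMES = {
    "Flap": {
        "isa": ["Airfoil"],
        "attribute": ["Movable"],
        "attached-to": ["Wing"],
        "part-of": ["Airplane"],
        "purpose": ["Increase"],
    },
    # Assumed reading: Lift and Drag are the objects of the Increase purpose.
    "Increase": {
        "object": ["Lift", "Drag"],
    },
}


def frames_to_triples(frames):
    """Flatten frames into (subject, relation, object) links."""
    return [
        (subject, relation, value)
        for subject, slots in frames.items()
        for relation, values in slots.items()
        for value in values
    ]


for triple in frames_to_triples(FRAMES):
    print(triple)
```

Each triple corresponds to one typed link that could be added to the semantic net in place of the untyped rt and bt links.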

17 Status and Evaluation
The Applications
–Two “Expert Locators” deployed and in use
–Sustained usage (~20 searches / day)
–Plans to quickly expand them further: more experts; also cover projects and work groups; add in attribute filters (years at Boeing, location, …)
How do the Thesaurus Enhancements Affect Search?
–Study: Expert assessed relevance of “hit” concepts
–Recall increased (44% → 75%) with only minimal effect on precision (58% → 57%)
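For reference, recall and precision are computed from the set of concepts a search returns and the set an expert judges relevant. The sketch below shows the standard definitions on made-up toy data; it does not reproduce the study's data or its exact protocol.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two sets of concept names."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy data, invented purely to exercise the function.
retrieved = {"Jet engines", "Jet spray", "Turbojet engines", "spray"}
relevant = {"Jet engines", "Turbojet engines", "Propulsion systems"}

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.50 recall=0.67
```

Adding links to the thesaurus tends to enlarge the retrieved set, which can raise recall; precision stays stable only if most of the newly reached concepts are also relevant, which is the pattern the slide reports.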

18 Discussion
“Number N of links” → “relevance”?
–only for very small N!
The useful bias of a domain-specific Thesaurus:
–only contains relevant concepts: massively reduces errors in Thesaurus enhancement
–only contains relevant links: provides very domain-specific search
Limitations:
–ignored “quality” of expert, social issues, etc.
–what if the concept you want isn’t there?
Generality: Applies to any resource, not just experts

19 Summary
Search using concepts, not words
Use of a thesaurus as an initial ontology:
–Can leverage many years of work by librarians
–Made viable using simple AI techniques: search, subsumption computation, language processing
Domain-specific thesauri provide valuable bias

20 End - demo in PPT follows
