Query Expansion.

1 Query Expansion

2 Outline
Motivation
Definition
Issues Involved
Existing Techniques
  Global
    Thesauri / WordNet
    Automatically Generated Thesauri
  Local
    Relevance Feedback
    Pseudo Relevance Feedback
Conclusion

3 Problem with Keywords
May not retrieve relevant documents that include synonymous terms
  “restaurant” vs. “cafe”
  “India” vs. “Bharat”
May retrieve irrelevant documents that include ambiguous terms
  “bat” (baseball vs. mammal)
  “Apple” (company vs. fruit)
  “bit” (unit of data vs. act of eating)

4 Why Search Engines Fail to Find Relevant Documents
Users do not give enough keywords
Users do not give good keywords
Vocabulary gaps; diversity and vastness of the web
Lack of domain knowledge
Solution: automatically generate a new query that is better than the initial query

5 Query Expansion Definition
Adding more terms (keyword spices) to a user’s basic query
Goal: to improve precision and/or recall
Example
  User query: car
  Expanded query: car, cars, automobile, automobiles, auto, etc.

6 Naïve Methods
Finding synonyms of query terms and searching for the synonyms as well
Finding various morphological forms of words by stemming each word in the query
Fixing spelling errors and automatically searching for the corrected form
Re-weighting the terms in the original query
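The first three naïve methods can be sketched in a few lines. This is only an illustration: the vocabulary, the synonym table, and the function names are hypothetical, the "stemmer" just strips a plural "s", and spelling correction uses the standard library's difflib rather than anything the slides prescribe.

```python
import difflib

# Hypothetical vocabulary and synonym table; a real system would use its index.
VOCABULARY = ["restaurant", "cafe", "car", "cars", "automobile"]
SYNONYMS = {"car": ["automobile", "auto"], "restaurant": ["cafe"]}


def correct_spelling(term, vocabulary=VOCABULARY):
    """Return the closest vocabulary word, or the term itself if none is close."""
    matches = difflib.get_close_matches(term, vocabulary, n=1, cutoff=0.8)
    return matches[0] if matches else term


def crude_stem(term):
    """Strip a plural 's' (a toy stand-in for a real stemmer)."""
    return term[:-1] if term.endswith("s") and len(term) > 3 else term


def naive_expand(query):
    """Spell-correct each term, then add its stem and the stem's synonyms."""
    expanded = []
    for raw in query.lower().split():
        term = correct_spelling(raw)
        stem = crude_stem(term)
        for candidate in [term, stem, *SYNONYMS.get(stem, [])]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded
```

For example, naive_expand("cars") adds the stem "car" and its synonyms, and naive_expand("restaurnt") first corrects the misspelling before looking up synonyms.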

7 Query Expansion Issues
Two major issues
  Which terms to include?
  Which terms to weight more?
Concept-based versus term-based QE
  Is it better to expand based on the individual terms in the query, or on the overall concept of the query?

8 Objective
To get a proper set of words that, when added to the basic search query, improves precision without losing a considerable amount of recall
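As a reminder of what is being traded off, precision is the fraction of retrieved documents that are relevant, and recall is the fraction of relevant documents that are retrieved. A minimal sketch, with a hypothetical function name and made-up document ids:

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: ids of the documents the system returned
    relevant:  ids of the documents that are actually relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Comparing these two numbers before and after expansion shows whether the added terms helped precision and how much recall was given up.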

9 Existing QE techniques
Global methods (static; analysis of all documents in the collection)
  Query expansion with thesauri (or WordNet)
  Automatic thesaurus generation
Local methods (dynamic; analysis of documents in the result set)
  Relevance feedback
  Pseudo relevance feedback

10 Global Analysis

11 Thesaurus based QE
For each term, t, in a query, expand the query with synonyms and related words of t from the thesaurus
  feline → feline cat
May weight added terms less than original query terms.
Generally increases recall.
May significantly decrease precision, particularly with ambiguous terms.
  “interest rate” → “interest rate fascinate evaluate”
There is a high cost of manually producing a thesaurus, and of updating it to reflect scientific changes.
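Thesaurus-based expansion with down-weighted added terms might look like the sketch below. The thesaurus entries, the function name, and the 0.5 weight for added terms are assumptions chosen for illustration.

```python
# Hypothetical hand-built thesaurus: term -> synonyms and related words.
THESAURUS = {"feline": ["cat"], "car": ["automobile", "auto"]}


def expand_with_thesaurus(query_terms, thesaurus=THESAURUS, added_weight=0.5):
    """Return {term: weight}; original terms get 1.0, added terms get less."""
    weights = {term: 1.0 for term in query_terms}
    for term in query_terms:
        for related in thesaurus.get(term, []):
            # setdefault keeps the full weight of terms already in the query.
            weights.setdefault(related, added_weight)
    return weights
```

Here expand_with_thesaurus(["feline"]) yields the original term at weight 1.0 and "cat" at the reduced weight. Nothing in this sketch resolves ambiguity, which is exactly how an expansion like “interest rate fascinate evaluate” can arise.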

12 Automatically Generated Thesauri
Attempt to generate a thesaurus automatically by analyzing the collection of documents
Two main approaches
  Co-occurrence based (co-occurring words are more likely to be similar)
  Shallow analysis of grammatical relations
    Entities that are grown, cooked, eaten, and digested are more likely to be food items.
Co-occurrence based is more robust; grammatical relations are more accurate.
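A minimal sketch of the co-occurrence approach: represent each term by the documents it appears in, and treat terms with similar document vectors as related. Cosine similarity over raw term frequencies is an assumption made here for simplicity; the function names and the toy collection in the usage note are hypothetical.

```python
import math
from collections import Counter, defaultdict


def build_term_vectors(docs):
    """Map each term to a Counter of {doc_id: term frequency}."""
    vectors = defaultdict(Counter)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            vectors[term][doc_id] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {doc_id: weight} vectors."""
    dot = sum(weight * v[doc_id] for doc_id, weight in u.items() if doc_id in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar(term, vectors, k=3):
    """Return the k terms whose document vectors are closest to `term`'s."""
    target = vectors.get(term, Counter())
    scores = [(other, cosine(target, vec))
              for other, vec in vectors.items() if other != term]
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:k]
```

With the collection ["apple pie recipe", "apple tart recipe", "car engine repair"], the nearest neighbour of "apple" is "recipe", since both occur in exactly the same documents, while "car" has similarity zero.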

13 Example

14 Semantic Network / WordNet
To expand a query, find the word in the semantic network and follow the various arcs to other related words.
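Following arcs in a semantic network amounts to a bounded graph traversal. The sketch below uses a tiny hand-made network rather than WordNet itself; the arcs, the relation labels, and the max_hops parameter are all hypothetical.

```python
from collections import deque

# Hypothetical toy network: each word maps to (relation, neighbour) arcs.
NETWORK = {
    "car": [("synonym", "automobile"), ("hypernym", "vehicle")],
    "automobile": [("synonym", "car")],
    "vehicle": [("hyponym", "truck")],
}


def related_words(word, network=NETWORK, max_hops=1):
    """Breadth-first walk over the arcs, up to max_hops away from `word`."""
    seen = {word}
    queue = deque([(word, 0)])
    found = []
    while queue:
        current, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for _relation, neighbour in network.get(current, []):
            if neighbour not in seen:
                seen.add(neighbour)
                found.append(neighbour)
                queue.append((neighbour, hops + 1))
    return found
```

Raising max_hops reaches more distant words ("truck" at two hops from "car"), which tends to help recall at the cost of precision.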

15 Global Methods: Summary
Pros
  Thesauri and semantic networks (WordNet) can be used to find good words for users (“more like this”)
Cons
  Little improvement has been found with automatic techniques that expand the query without user intervention
  Overall, not as useful as relevance feedback; may be as good as pseudo relevance feedback

16 Local Analysis

17 Relevance Feedback
Relevance feedback: user feedback on the relevance of docs in the initial set of results
  The user issues a (short, simple) query.
  The user marks returned documents as relevant or non-relevant.
  The system computes a better representation of the information need based on the feedback.
Relevance feedback can go through one or more iterations.
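The slides do not say how the "better representation" is computed. One widely used option is the Rocchio formula, which moves the query vector toward the centroid of the relevant documents and away from the centroid of the non-relevant ones. The sketch below assumes that method; the function name, the sparse-dictionary representation, and the alpha, beta, gamma values are illustrative choices.

```python
from collections import Counter


def rocchio(query_vec, relevant_docs, nonrelevant_docs,
            alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio update on sparse {term: weight} vectors.

    new_q = alpha * q + beta * mean(relevant) - gamma * mean(nonrelevant)
    Terms whose weight ends up non-positive are dropped.
    """
    new_query = Counter()
    for term, weight in query_vec.items():
        new_query[term] += alpha * weight
    for doc in relevant_docs:
        for term, weight in doc.items():
            new_query[term] += beta * weight / len(relevant_docs)
    for doc in nonrelevant_docs:
        for term, weight in doc.items():
            new_query[term] -= gamma * weight / len(nonrelevant_docs)
    return {term: weight for term, weight in new_query.items() if weight > 0}
```

Terms that occur only in documents marked relevant enter the query with a positive weight, which is how feedback ends up expanding the query as well as re-weighting it.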

18 Relevance Feedback Example: Initial Query and Top 8 Results
Query: New space satellite applications
08/13/91, NASA Hasn't Scrapped Imaging Spectrometer
07/09/91, NASA Scratches Environment Gear From Satellite Plan
04/04/90, Science Panel Backs NASA Satellite Plan, But Urges Launches of Smaller Probes
09/09/91, A NASA Satellite Project Accomplishes Incredible Feat: Staying Within Budget
07/24/90, Scientist Who Exposed Global Warming Proposes Satellites for Climate Research
08/22/90, Report Provides Support for the Critics Of Using Big Satellites to Study Climate
04/13/87, Arianespace Receives Satellite Launch Pact From Telesat Canada
12/02/87, Telecommunications Tale of Two Companies

19 Relevance Feedback Example: Expanded Query
2.074 new space satellite application
5.991 nasa eos
4.196 launch aster
3.516 instrument arianespace
3.004 bundespost ss
2.790 rocket scientist
2.003 broadcast earth
0.836 oil measure

20 Top 8 Results After Relevance Feedback
07/09/91, NASA Scratches Environment Gear From Satellite Plan
08/13/91, NASA Hasn't Scrapped Imaging Spectrometer
08/07/89, When the Pentagon Launches a Secret Satellite, Space Sleuths Do Some Spy Work of Their Own
07/31/89, NASA Uses 'Warm' Superconductors For Fast Circuit
12/02/87, Telecommunications Tale of Two Companies
07/09/91, Soviets May Adapt Parts of SS-20 Missile For Commercial Use
07/12/88, Gaping Gap: Pentagon Lags in Race To Match the Soviets In Rocket Launchers
06/14/90, Rescue of Satellite By Space Agency To Cost $90 Million

21 Relevance Feedback: Problems
Why do most search engines not use relevance feedback?
  Users are often reluctant to provide explicit feedback.
  It is often harder to understand why a particular document was retrieved after relevance feedback has been applied.

22 Pseudo Relevance Feedback
Automatic local analysis
Pseudo relevance feedback attempts to automate the manual part of relevance feedback.
  Retrieve an initial ranked set of documents for the query.
  Assume that the top m ranked documents are relevant.
  Do relevance feedback using those documents.
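The three steps can be sketched end to end as follows. To stay self-contained, the sketch ranks documents by raw term overlap and implements the feedback step as "add the k most frequent new terms from the top m documents"; both are simplifying assumptions, and the function names and parameters are hypothetical.

```python
from collections import Counter


def score(query_terms, doc_terms):
    """Toy ranking function: how often the query terms occur in the document."""
    counts = Counter(doc_terms)
    return sum(counts[term] for term in query_terms)


def pseudo_relevance_feedback(query_terms, docs, m=2, k=2):
    """Expand the query with the k most frequent new terms in the top m docs."""
    tokenized = [doc.lower().split() for doc in docs]
    # Step 1: retrieve an initial ranking (ties keep collection order).
    ranked = sorted(range(len(docs)),
                    key=lambda i: -score(query_terms, tokenized[i]))
    # Step 2: assume the top m documents are relevant.
    candidate_counts = Counter()
    for i in ranked[:m]:
        candidate_counts.update(t for t in tokenized[i] if t not in query_terms)
    # Step 3: feed their most frequent terms back into the query.
    ordered = sorted(candidate_counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return list(query_terms) + [term for term, _ in ordered[:k]]
```

Because no user checks the top m documents, a poor initial ranking feeds off-topic terms back into the query, so m and k are usually kept small.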

23 Future Work
Not much work has been done that exhaustively uses measures of semantic relatedness and similarity.
I propose to approach the problem of query expansion through semantic relatedness.

