Generating Query Substitutions Alicia Wood. What is the problem to be solved?

1 Generating Query Substitutions
Alicia Wood

2 What is the problem to be solved?

3 Problem
Imperfect description of need
Search engine not able to retrieve documents matching query
Need accurate and related query substitutions

4 Problem (cont.)
Given a query
Want to generate a related, modified query
– Improvements (specification)
– Neutral (spelling change, synonym)
– Loss of original meaning (generalization)

5 Who cares about this problem and why?

6 Who cares?
User typing the query
– Wants correct results from an imperfect query

7 What have others done to solve this problem and why is this inadequate?

8 Previous Work
Relevance/Pseudo relevance feedback
Query term deletion
Substituting query terms with related terms
Latent Semantic Indexing (LSI)

9 Relevance/Pseudo relevance feedback
Submit query for initial retrieval
Process resulting documents
Modify the query by expanding with additional terms from documents
Perform second retrieval with modified query
Can cause query drift
Computationally expensive
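
The feedback loop on slide 9 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method of any particular system: `DOCS`, `STOPWORDS`, `toy_search`, and the parameters `k` and `n_terms` are hypothetical names introduced here, and the term-overlap ranking is only a stand-in for a real retrieval engine.

```python
from collections import Counter

# Hypothetical toy collection and stopword list, for illustration only.
DOCS = [
    "jaguar xj6 car parts and repair",
    "jaguar car dealers and car prices",
    "jaguar habitat in the rainforest",
]
STOPWORDS = {"and", "in", "the"}


def toy_search(query_terms):
    """Rank DOCS by how many query terms each contains (stand-in retrieval)."""
    scored = [(sum(t in doc.split() for t in query_terms), doc) for doc in DOCS]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [doc for score, doc in ranked if score > 0]


def pseudo_relevance_feedback(query, search, k=2, n_terms=2):
    """Expand `query` with the most frequent new terms in the top-k results."""
    query_terms = query.lower().split()
    top_docs = search(query_terms)[:k]  # initial retrieval
    counts = Counter(
        term
        for doc in top_docs
        for term in doc.lower().split()
        if term not in query_terms and term not in STOPWORDS
    )
    expansion = [term for term, _ in counts.most_common(n_terms)]
    # The caller would run the second retrieval with this expanded query.
    return query_terms + expansion


expanded = pseudo_relevance_feedback("jaguar", toy_search, k=2, n_terms=1)
```

In this toy collection the top-ranked documents happen to be about cars, so the expanded query moves toward the car sense of "jaguar" whatever the user meant. That is a small-scale picture of the query drift noted on the slide, and the extra retrieval pass is where the added cost comes from.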

10 Query term deletion
Loss of specificity from original query

11 Substituting query terms
Relies on an initial retrieval

12 Latent Semantic Indexing (LSI)
Identify patterns in relationships between terms and concepts in unstructured collection of text
Computationally expensive

13 What is the proposed solution to the problem?

14 Solution
Query modification based on precomputed query and phrase similarity
– Ranking proposed queries
– Similar queries/phrases derived from user query sessions
– Learned models used to re-rank, based on similarity of new query to original query

15 Contributions
1. Identification of new source of data to identify similar queries and phrases
2. The definition of a scheme for scoring query suggestions
3. An algorithm to combine query and phrase suggestions
   – Finds highly and broadly relevant phrases
4. Identification of features that are predictive of highly relevant query suggestions

16 Classes of Suggestion Relevance
1. Precise rewriting
   – Matches user’s intent, preserves core meaning
   – automobile insurance -> automotive insurance
2. Approximate rewriting
   – Direct close relationship to topic, scope narrowed or broadened
   – apple music player -> ipod shuffle
3. Possible rewriting
   – Categorical relationship to initial query, complementary product but distinct
   – eye glasses -> contact lenses
4. Clear mismatch
   – No clear relationship
   – jaguar xj6 -> os x jaguar

17 Classes of Rewriting
Specific Rewriting (classes 1+2: precise and approximate)
– closely related query
– highly relevant
Broad Rewriting (classes 1+2+3: precise, approximate, and possible)
– query expansion
– relevant to user interests
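
The two groupings on slide 17 amount to thresholds on the four-point relevance scale from slide 16. A minimal sketch, assuming the classes are numbered 1 to 4 in the order the slide lists them (the constant and function names are hypothetical, introduced here for illustration):

```python
# Assumed numbering: the order in which slide 16 lists the classes.
PRECISE, APPROXIMATE, POSSIBLE, MISMATCH = 1, 2, 3, 4


def is_specific_rewriting(label):
    """Specific rewriting: classes 1+2 (precise or approximate)."""
    return label in (PRECISE, APPROXIMATE)


def is_broad_rewriting(label):
    """Broad rewriting: classes 1+2+3 (anything except a clear mismatch)."""
    return label in (PRECISE, APPROXIMATE, POSSIBLE)


# Example pairs taken from slide 16, with their class labels.
labeled = [
    ("automobile insurance", "automotive insurance", PRECISE),
    ("apple music player", "ipod shuffle", APPROXIMATE),
    ("eye glasses", "contact lenses", POSSIBLE),
    ("jaguar xj6", "os x jaguar", MISMATCH),
]
specific = [(q, s) for q, s, label in labeled if is_specific_rewriting(label)]
broad = [(q, s) for q, s, label in labeled if is_broad_rewriting(label)]
```

Every specific rewriting is also a broad rewriting, so the broad task is the more permissive of the two.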

18 Substitutables
Initial query -> generate relevant queries
– Replace query as whole or phrases
– Segment query into phrases
– Find query pairs where one segment has changed
  (britney spears) (mp3s) -> (britney spears) (lyrics)
Pair independence hypothesis tested with a likelihood ratio
– High value = strong dependence between two terms
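
The two ideas on slide 18 can be sketched as below. The slide does not give the exact statistic, so this assumes the standard log-likelihood ratio for a 2x2 contingency table (the G statistic, -2 log lambda); the function names and the count layout `k11..k22` are hypothetical choices made for illustration.

```python
import math


def changed_segment(segments_a, segments_b):
    """Return (old, new) if exactly one aligned segment differs, else None."""
    if len(segments_a) != len(segments_b):
        return None
    diffs = [(a, b) for a, b in zip(segments_a, segments_b) if a != b]
    return diffs[0] if len(diffs) == 1 else None


def log_likelihood_ratio(k11, k12, k21, k22):
    """-2 log(lambda) for a 2x2 table of session counts.

    k11: first term seen, followed by second term
    k12: first term seen, not followed by second term
    k21: first term absent, second term seen
    k22: neither term seen
    Near 0 when the terms look independent; large under strong dependence.
    """
    total = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = rows[i] * cols[j] / total
                g += o * math.log(o / expected)
    return 2.0 * g


pair = changed_segment(("britney spears", "mp3s"), ("britney spears", "lyrics"))
independent = log_likelihood_ratio(10, 90, 100, 900)  # same rate in both rows
dependent = log_likelihood_ratio(50, 50, 10, 990)     # very different rates
```

Here `pair` is the candidate substitution `("mp3s", "lyrics")`, and `dependent` scores far higher than `independent`, matching the slide's reading that a high value signals strong dependence.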

19 Validation
1000 initial queries
– Generate single suggestion (q_j) for each
Evaluate accuracy of approaches
Train machine-learned classifier
Evaluate ability to produce higher quality suggestions
– Word distance, normalized edit distance, number of substitutions
Suggestion criteria:
– Some words from initial query
– Modifications shouldn’t be made at start of query
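
Two of the features named on slide 19 can be computed as in the sketch below. The slide does not define them precisely, so this assumes Levenshtein distance throughout: "normalized edit distance" is taken as character edits divided by the longer query's length, and "word distance" as the same distance computed over word tokens. The function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (x != y),  # substitution, or match at no cost
            ))
        prev = curr
    return prev[-1]


def normalized_edit_distance(q1, q2):
    """Character edit distance divided by the longer query's length."""
    longest = max(len(q1), len(q2))
    return edit_distance(q1, q2) / longest if longest else 0.0


def word_distance(q1, q2):
    """Edit distance computed over whitespace-separated words."""
    return edit_distance(q1.split(), q2.split())


char_feature = normalized_edit_distance("automobile insurance", "automotive insurance")
word_feature = word_distance("britney spears mp3s", "britney spears lyrics")
```

Features like these would be computed for each (initial query, suggestion) pair and passed to the classifier; small values indicate a suggestion that stays close to the original wording.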

20 Future Work
Build semantic classifier
– Predict semantic class of rewriting
Take inspiration from machine translation techniques
Introduce language model
– Avoid producing nonsensical queries

