
1 GENERATING RELEVANT AND DIVERSE QUERY PHRASE SUGGESTIONS USING TOPICAL N-GRAMS ELENA HIRST

2 WHAT IS THE PROBLEM TO BE SOLVED?
- Query logs are not always available, and are not always the best source for query suggestions
- Most user queries do not provide enough information on their own

3 WHO CARES ABOUT THE PROBLEM?
- Builders of search systems that have no query logs
- Users of those systems

4 WHAT HAVE OTHERS DONE?
- Most query suggestion work relies on query logs
- Non-query-log approaches to suggesting alternate queries include:
  - Adding frequent terms that occur in close proximity to the query terms
  - Auto-completion of the last term
  - N-gram suggestions, which differ from this paper's approach in that they rank possible completions by raw n-gram occurrence frequency

5 WHAT IS THE PROPOSED SOLUTION?
- Rank possible phrase completions by semantic relatedness rather than raw frequency
- Use the Topical N-gram (TNG) model to capture hidden topics

6 RANKING OF PHRASES
- P is the set of candidate phrases extracted as n-grams from the corpus
- Q_u is the user query, Q_c is the portion already completed, and Q_t is the uncompleted portion, so Q_u = Q_c + Q_t
- Candidate phrases are ranked by the probability that Q_t occurs given Q_c
- Hidden topics are used when estimating this probability (see the sketch below)
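
Below is a minimal sketch of this ranking idea, assuming we already have per-topic term probabilities p(w | z) and a topic distribution p(z | Q_c) inferred from the completed portion of the query; the names topic_term_prob, completion_score, and rank_completions, as well as the toy numbers, are illustrative and not taken from the paper.

```python
# Minimal sketch: rank candidate completions Q_t by their probability given the
# completed portion Q_c, marginalizing over hidden topics:
#   score(Q_t) = sum_z p(z | Q_c) * prod_{w in Q_t} p(w | z)
# topic_term_prob and the toy numbers below are illustrative assumptions.

from math import prod

# p(w | z): probability of a term under each hidden topic (toy values)
topic_term_prob = {
    0: {"retrieval": 0.05, "engine": 0.04, "oil": 0.001},
    1: {"retrieval": 0.001, "engine": 0.03, "oil": 0.06},
}

def completion_score(q_t_terms, topic_given_context):
    """Sum over hidden topics of p(z | Q_c) times the product of p(w | z)."""
    return sum(
        p_z * prod(topic_term_prob[z].get(w, 1e-9) for w in q_t_terms)
        for z, p_z in topic_given_context.items()
    )

def rank_completions(candidates, topic_given_context):
    """Rank candidate completions (lists of terms) by score, highest first."""
    return sorted(candidates,
                  key=lambda q_t: completion_score(q_t, topic_given_context),
                  reverse=True)

# Example: completed portion Q_c = "search", with inferred p(z | Q_c) = {0: 0.8, 1: 0.2}
candidates = [["engine", "retrieval"], ["engine", "oil"]]
print(rank_completions(candidates, {0: 0.8, 1: 0.2}))
```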

7 N-GRAM MODELING
- Find bigrams in the document corpus
- Concatenate overlapping bigrams to form longer n-gram phrases (sketched below)
- This produces a cleaner phrase list
- The resulting phrases are more applicable to search-engine use
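
A rough sketch of the bigram-concatenation step follows; the frequency threshold and the chaining rule are assumptions made for illustration, not the paper's exact procedure.

```python
# Rough sketch: build longer phrases by chaining overlapping frequent bigrams.
# The min_count threshold and chaining rule are illustrative assumptions.

from collections import Counter

def extract_bigrams(sentences, min_count=2):
    """Count bigrams over tokenized sentences and keep the frequent ones."""
    counts = Counter(
        (toks[i], toks[i + 1]) for toks in sentences for i in range(len(toks) - 1)
    )
    return {bg for bg, c in counts.items() if c >= min_count}

def chain_bigrams(tokens, bigrams):
    """Walk a token sequence and concatenate runs of adjacent frequent bigrams
    into longer n-gram phrases."""
    phrases, current = [], []
    for i in range(len(tokens) - 1):
        if (tokens[i], tokens[i + 1]) in bigrams:
            current = current or [tokens[i]]
            current.append(tokens[i + 1])
        else:
            if len(current) > 1:
                phrases.append(" ".join(current))
            current = []
    if len(current) > 1:
        phrases.append(" ".join(current))
    return phrases

sentences = [
    ["query", "phrase", "suggestion", "with", "topical", "ngrams"],
    ["query", "phrase", "suggestion", "using", "topical", "ngrams"],
]
bigrams = extract_bigrams(sentences)
print(chain_bigrams(sentences[0], bigrams))  # ['query phrase suggestion', 'topical ngrams']
```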

8 EXPERIMENT DESIGN
- AP News and Labour news datasets
- Standard n-gram generation extracted only 1-, 2-, and 3-grams
- The TNG n-gram model found phrases of up to 10 words
- Relevance and diversity were used to evaluate efficacy
- 20 test queries were generated from the titles of articles

9 RESULTS
- TNG ranked by probability performs better than standard n-grams
- TNG with hidden topics provides "topically diverse" and "semantically related" results

10 RESULTS
- Relevance is highest for TNGSim
- Diversity is also highest for TNGSim

11 RESULTS
- Clarity scores were used to measure retrieval effectiveness
- Clarity measures the difference between the query language model and the corpus language model; higher scores are better (a sketch of the computation follows the table)
- The TNG models did not perform well on clarity
- The authors claim that clarity is less important than retrieving semantically related results

Clarity scores:
            AP News dataset  Labour dataset
NgramsProb  4.9              3.5
TNGProb     4.2              2.7
TNGSim      4.23             2.8
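
As a rough sketch, a clarity score can be computed as the KL divergence between a query language model and the corpus language model; the toy distributions and smoothing constant below are assumptions for illustration, not the paper's data.

```python
# Rough sketch of a clarity score: the KL divergence between the query
# language model p(w | query) and the corpus language model p(w | corpus).
# The toy distributions and the eps smoothing constant are illustrative.

from math import log2

def clarity(query_lm, corpus_lm, eps=1e-9):
    """KL(query_lm || corpus_lm) in bits; higher means the query language
    model is more distinct from the background corpus."""
    return sum(
        p * log2(p / corpus_lm.get(w, eps))
        for w, p in query_lm.items()
        if p > 0
    )

# Toy example: a focused query model vs. a broad corpus model
corpus_lm = {"the": 0.5, "engine": 0.2, "oil": 0.2, "retrieval": 0.1}
query_lm = {"engine": 0.6, "retrieval": 0.4}
print(round(clarity(query_lm, corpus_lm), 3))
```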

12 CONCLUSION
- The TNG model can be used effectively in systems without query logs
- It is a good fit for domain-specific search engines

