Published by Sam Bruckman. Modified over 3 years ago.
1
Learning Joint Query Interpretation and Response Ranking
Uma Sawant, Soumen Chakrabarti
IIT Bombay
2
Searching the “Web of things”
At least 14% of Web search queries mention a target type or category (Lin et al., WWW 2012).
3
Telegraphic entity search queries
Telegraphic queries with target type:
  woodrow wilson president university
  dolly clone institute
  hermitage museum bank river
  lead singer led zeppelin band
  losing team baseball world series 1998
No reliable syntax clues for the search engine:
  Free word order
  No or rare capitalization
  Quoted phrases are rare
  Few function or relational words
4
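How to answer entity queries? (simplified view of related work)
2-stage process: query interpretation, then ranking
[Diagram: a telegraphic query or NLQ is interpreted, via a template, into an execution-ready query, which is run against a knowledge base to rank entities e1, e2, e3]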
5
Our proposal: joint query interpretation and ranking
Generative and discriminative models
Multiple interpretations
[Diagram: a telegraphic query yields several (interpretation, response) pairs, scored against an annotated corpus to return entities e1, e2, e3]
6
The annotated Web
Annotated document: “… By comparison, the Padres have been to two World Series, losing in 1984 and 1998. …”
Type hierarchy:
  “the Padres” mentionOf Entity: San_Diego_Padres
  San_Diego_Padres instanceOf Type: Major_league_baseball_teams
  Major_league_baseball_teams subTypeOf chain up to Type: All
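The three relations on this slide can be held in plain lookup tables. The sketch below is a minimal illustration, not the authors' data model: the table names and the direct `Major_league_baseball_teams` to `All` edge are assumptions made for brevity, and only the Padres example comes from the slide.

```python
# Hypothetical miniature catalog. Only the Padres example is from the slide;
# the direct subTypeOf edge to "All" skips any intermediate types.
SUBTYPE_OF = {"Major_league_baseball_teams": "All"}
INSTANCE_OF = {"San_Diego_Padres": ["Major_league_baseball_teams"]}
MENTION_OF = {"the Padres": "San_Diego_Padres"}


def types_of(entity):
    """Return every type reachable via instanceOf, then subTypeOf, in order."""
    seen = []
    for t in INSTANCE_OF.get(entity, []):
        # Walk up the hierarchy until the root or an already-visited type.
        while t is not None and t not in seen:
            seen.append(t)
            t = SUBTYPE_OF.get(t)
    return seen


def types_of_mention(mention):
    """Resolve an annotated mention to its entity and all of its types."""
    entity = MENTION_OF[mention]
    return entity, types_of(entity)
```

Calling `types_of_mention("the Padres")` resolves the mention to `San_Diego_Padres` and then walks up the hierarchy, which is the lookup a type-restricted search needs for every annotated snippet.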
7
Query = type hints + word matchers
Query: losing team baseball world series 1998
Large type catalog: most query words match some type
Incorrect type: World_Series_Hockey_teams
Padres rarely co-occurs with hockey; this can be known only from corpus statistics
8
Need joint type inference and snippet scoring
Query = type hints + word matchers
Query: losing team baseball world series 1998
Correct type: Major_league_baseball_teams
Entity: San Diego Padres (instanceOf the type)
Evidence snippet: “By comparison, the Padres have been to two World Series, losing in 1984 and 1998.” (mentionOf the entity; word matches with the query)
Large type catalog: most query words match some type
Padres rarely co-occurs with hockey; this can be known only from corpus statistics
9
Generative model: generate query from entity
Entity E: San Diego Padres
Type T: Major league baseball team
Context: “Padres have been to two World Series, losing in 1984 and 1998”
Switch Z: selects, per query word, between the type model and the context model
Type hints: baseball, team
Context matchers: lost, 1998, world series
Query q: losing team baseball world series 1998
10
Generative approach: plate diagram
For each query:
  Choose entity E
  Choose type T to describe the entity
  For each query word:
    “Switch” variable Z: does the word hint at the type, or is it a matcher?
    Generate query word W: hints from the type description language model, matchers from the entity context language model
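The generative story above can be turned into a query likelihood by summing over types and marginalizing each word's switch. The following is a rough sketch under stated assumptions, not the authors' estimator: independent switches with a fixed hint probability `p_hint`, add-alpha smoothed unigram language models, and all toy counts and names (`TYPE_PRIOR`, `TYPE_LM`, `ENTITY_LM`, `VOCAB_SIZE`) are invented for illustration.

```python
def lm_prob(word, counts, vocab_size, alpha=1.0):
    """Add-alpha smoothed unigram probability of `word` under a bag of counts."""
    total = sum(counts.values())
    return (counts.get(word, 0) + alpha) / (total + alpha * vocab_size)


def query_likelihood(query, entity, type_prior, type_lm, entity_lm,
                     vocab_size, p_hint=0.5):
    """Pr(q | e): sum over types t, marginalizing the switch of every word."""
    likelihood = 0.0
    for t, p_t in type_prior[entity].items():
        prod = 1.0
        for w in query:
            as_hint = lm_prob(w, type_lm[t], vocab_size)          # type description LM
            as_matcher = lm_prob(w, entity_lm[entity], vocab_size)  # entity context LM
            prod *= p_hint * as_hint + (1.0 - p_hint) * as_matcher
        likelihood += p_t * prod
    return likelihood


# Toy statistics (assumed, not from the paper).
VOCAB_SIZE = 20
TYPE_PRIOR = {
    "San_Diego_Padres": {"baseball_team": 1.0},
    "1998_World_Series": {"series": 1.0},
}
TYPE_LM = {
    "baseball_team": {"baseball": 1, "team": 1},
    "series": {"series": 1},
}
ENTITY_LM = {
    "San_Diego_Padres": {"padres": 1, "world": 1, "series": 1,
                         "losing": 1, "1984": 1, "1998": 1},
    "1998_World_Series": {"world": 1, "series": 1, "1998": 1},
}
```

With these toy counts, the query "losing team baseball world series 1998" is more likely under `San_Diego_Padres` than under `1998_World_Series`, because "team" and "baseball" are explained by the type model while "losing" is explained by the entity context.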
11
Discriminative model: separate correct and incorrect entities
Query q: losing team baseball world series 1998
Correct entity San_Diego_Padres, with interpretation:
  losing team baseball world series 1998 (t = baseball team)
Incorrect entity 1998_World_Series, with interpretation:
  losing team baseball world series 1998 (t = series)
12
Feature vector design inspired by the generative model
Feature vector given query, entity, type, switches, with components that:
  Model the type prior Pr(t|e)
  Model the entity prior
  Hints: compatibility between hint words and type
  Matchers: compatibility between matchers and snippets that mention e
[Equations for the generative and discriminative scores are not captured in this transcript]
13
Discriminative framework
Non-convex formulation
Annealing algorithms
Constraints are formulated using the best-scoring interpretation
14
Testbed
YAGO entity and type catalog: ~0.2 million types and 1.9 million entities
Annotated corpus: Web corpus of 500 million pages, ~16 annotations per page
~700 entity search queries from TREC + INEX, converted to telegraphic form, with most probable type and answer entities
15
Experiment 1: Entity ranking using joint inference
To reach (upper bound): human-recommended type
To surpass (lower bound): most generic type in catalog (no type inference)
Measure: entity-level NDCG (MAP and MRR follow the same trend; details in paper)
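For reference, NDCG at rank k can be computed as below. This is the common textbook form with a log2 discount; the exact gain and discount used in the paper are not stated on the slide, and computing the ideal ranking from the returned list alone (rather than from all known relevant entities) is a simplifying assumption.

```python
import math


def dcg(relevances, k):
    """Discounted cumulative gain of the first k results (rank 1 has discount 1)."""
    return sum(r / math.log2(i + 2) for i, r in enumerate(relevances[:k]))


def ndcg(relevances, k):
    """DCG normalized by the DCG of the same results in ideal order."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0
```

Here `relevances` lists the relevance of each returned entity in ranked order, for example `[0, 1]` when the only correct entity appears at rank 2.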
16
Human > Discriminative > Generative > Generic
Generative is significantly better than generic (lower bound) and fills 28% of the gap to human (upper bound)
Discriminative is significantly better than generic (lower bound) and fills 43% of the gap to human (upper bound)
Discriminative is significantly better than generative: it is easier to balance diverse scales of probabilities
[Plot: NDCG (0.1 to 0.8) against rank (1 to 10) for human, discriminative, generative, generic]
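One natural reading of "fills X% of the gap" is the fraction of the distance between the lower and upper bounds that a method recovers. The formula below is that reading written out; it is an assumption, since the slide does not define the quantity or say at which rank it is measured.

```latex
\[
\text{gap filled}(m) \;=\;
\frac{\mathrm{NDCG}_{m} - \mathrm{NDCG}_{\text{generic}}}
     {\mathrm{NDCG}_{\text{human}} - \mathrm{NDCG}_{\text{generic}}},
\qquad m \in \{\text{generative}, \text{discriminative}\}
\]
```

Under this reading, a value of 0 means no better than the generic type, and a value of 1 means matching the human-recommended type.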
17
Generic vs. discriminative
Correct hint match and type choice: cathedral claude monet painting
Incorrect hint match and type choice: amazing grace hymn writer
18
Discriminative better than human
Correct entity unreachable from the human-recommended type; discriminative recovers using corpus feedback
Example query: patsy cline producer
[Diagram labels: producer, manufacturer; Discriminative: Owen Bradley]
19
Experiment 2: Target type inference
Aggregate ranks of top-k interpretations to rank types
Compare type-level NDCG with B&N 2012
Top-k interpretations, for example:
  hermitage museum bank river (museum)
  hermitage museum bank river (river)
  hermitage museum bank river (building)
Possible target types: river, museum, building, ...
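Aggregating the ranks of the top-k interpretations into a type ranking could look like the sketch below. The slide does not say which aggregation is used, so the reciprocal-rank vote here is an assumption chosen for simplicity, and `rank_types` is a hypothetical helper name.

```python
from collections import defaultdict


def rank_types(interpretations, k):
    """Rank types by summed reciprocal rank over the top-k interpretations.

    `interpretations` is a best-first list of (type, score) pairs; the scores
    are only used for the ordering the caller has already applied.
    """
    votes = defaultdict(float)
    for rank, (t, _score) in enumerate(interpretations[:k], start=1):
        votes[t] += 1.0 / rank
    # Highest vote first; ties broken alphabetically for determinism.
    return sorted(votes, key=lambda t: (-votes[t], t))
```

A type that appears in several highly ranked interpretations accumulates more weight than one that appears once near the bottom of the list.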
20
Joint prediction improves type inference too!
Data: [B&N 2012], DBpedia catalog
21
Experiment 3: joint vs. two-stage
Two-stage:
  1. Best type prediction from experiment (2)
  2. Launch type-restricted query on annotated corpus
Top m types to improve recall
Measure entity-level NDCG
[Diagram: Stage 1, type inference (river, museum, building); form query, e.g. (river) + matchers or (river OR museum) + matchers; Stage 2, ranking]
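The query-forming step of the two-stage baseline can be sketched as follows. The output string mirrors the slide's own "(river OR museum) + matchers" notation, not the syntax of any real search engine, and the function name and the example matcher words are assumptions.

```python
def form_type_restricted_query(ranked_types, matchers, m):
    """Restrict to the top-m predicted types, then append the matcher words."""
    if m < 1:
        raise ValueError("m must be at least 1")
    type_clause = " OR ".join(ranked_types[:m])
    return "(" + type_clause + ") + " + " ".join(matchers)
```

Raising `m` widens the type clause, which is how the baseline trades precision of the predicted type for recall.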
22
Joint entity ranking better than two-stage
Joint type prediction and ranking is significantly better than 2-stage
Using more types in 2-stage makes little difference
[Plot: NDCG (0.2 to 0.6) against rank (1 to 10) for joint, 2stage(m=1), 2stage(m=5), 2stage(m=10)]
23
Conclusion
A large percentage of Web search queries mention the target type
Identifying the target type hint words and the type itself is rewarding, but non-trivial
Joint query interpretation and ranking is significantly better than two-stage
Joint prediction improves type inference
Datasets available at bit.ly/WSpxvr
24
Questions?
25
References
1) P. Pantel, T. Lin, M. Gamon: Mining Entity Types from Query Logs via User Intent Modeling. ACL (1) 2012: 563-571
2) K. Balog, R. Neumayer: Hierarchical Target Type Identification for Entity-oriented Queries. CIKM 2012
3) T. Lin, P. Pantel, M. Gamon, A. Kannan, A. Fuxman: Active Objects: Actions for Entity-Centric Search. WWW 2012
26
Extra slides
27
Components of the model
Entity prior: (weighted) fraction of snippets attached to an entity in the corpus
Type: generality or specificity of types
Hint-type compatibility:
  Probability of generating hint words from a language model built using the type description
  Hint sub-sequence matches some type name exactly
Matcher-entity compatibility:
  Weighted fraction of snippets attached to an entity, retrieved using matchers
  Rarity of matchers + number of supporting snippets
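As one concrete reading of the first component, the entity prior can be computed as each entity's share of the total snippet weight. This is a minimal sketch: how snippets are weighted is not given on the slide, so the weights are taken as an input, and `entity_prior` is a hypothetical name.

```python
def entity_prior(snippet_weights):
    """Map each entity to its weighted fraction of all snippets in the corpus.

    `snippet_weights` maps an entity to the list of weights of the snippets
    that mention it (use 1.0 per snippet for an unweighted fraction).
    """
    total = sum(sum(ws) for ws in snippet_weights.values())
    if total == 0:
        return {e: 0.0 for e in snippet_weights}
    return {e: sum(ws) / total for e, ws in snippet_weights.items()}
```

The matcher-entity compatibility feature has the same shape, except that only the snippets retrieved by the matcher words are counted.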
28
Implementation details
Additive features
One generic query executed on index, rest in memory
Pruned large search space using easy heuristics
Continuous hint words
29
Not entity disambiguation in query
Entity disambiguation in query: does ymca refer to the song or the organization?
  Similar to entity disambiguation in documents
  Uses accompanying words
  Query: ymca lyrics, Entity: YMCA_(song), instanceOf Type: Music
  Query: ymca address, Entity: YMCA_(org), instanceOf Type: Organization
  Learn topic model
Misinterpreting target type: usually disastrous
Avoid early or hard commitment
30
Future work
Better type description model
More generic query than “hint+matchers”
Entities as literals
Different models
Explore non-linear models (boosting)
List-wise loss
Use click data
31
Generative framework
For each query:
  Choose entity E to describe
  Choose type T to describe the entity
  For each query word:
    “Switch” variable Z: decide if the word hints at the type or is a matcher
    Generate query word W from the type description language model (hint) or the entity context language model (matcher)
32
Discriminative framework
Feature vector given query, entity, type, switches, with components that:
  Model the type prior Pr(t|e)
  Model the entity prior
  Hints: compatibility between hint words and type
  Matchers: compatibility between matchers and snippets that mention e
Given q, score of response e is: [equation not captured in this transcript]
Ranking model trained by distant supervision
33
Joint entity ranking better than two-stage
Two-stage baseline:
  State-of-the-art target type predictor; does not use corpus information
  Pick top k types to improve type recall
  Launch type-restricted query on annotated corpus
Significantly worse than joint type prediction and ranking
34
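How to answer entity queries? (simplified view of related work)
2-stage process: query interpretation, then ranking
[Diagram: a telegraphic query or NLQ is interpreted, via a template, into an execution-ready query, which is run against knowledge sources (RDF tuples, annotated corpus, tables) to rank entities e1, e2, e3]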