Language Modeling Frameworks for Information Retrieval. John Lafferty, School of Computer Science, Carnegie Mellon University.


September 11, 2002, Language Modeling and Information Retrieval Workshop

Retrieval As Decision Making

Given a query,
- Which documents should be selected? (D)
- How should these docs be presented to the user? ( )

[Figure: Query … Ranked list? (1 2 3 4) Clustering? Excerpt?]

Decision Theory Framework

[Figure: diagram over Source (S), Document (d), User (U), Query (q), and R; legend: observed, partially observed, inferred]

A unified framework can be built on Bayesian decision theory: models, loss function, risk minimization (Zhai, 2002).
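As a rough illustration of the risk-minimization idea named on this slide, the sketch below picks the action with the lowest expected loss under a posterior over states. The states, actions, probabilities, and loss values are all hypothetical; none of them come from the talk or from (Zhai, 2002).

```python
def expected_loss(action, posterior, loss):
    """Expected loss of `action` under a posterior over states."""
    return sum(p * loss[(action, state)] for state, p in posterior.items())


def bayes_action(actions, posterior, loss):
    """Return the action that minimizes expected loss (the Bayes risk)."""
    return min(actions, key=lambda a: expected_loss(a, posterior, loss))


# Hypothetical posterior over the relevance of one document to a query.
posterior = {"relevant": 0.7, "nonrelevant": 0.3}

# Hypothetical loss table: missing a relevant document costs more
# than showing a non-relevant one.
loss = {
    ("show", "relevant"): 0.0,
    ("show", "nonrelevant"): 1.0,
    ("skip", "relevant"): 2.0,
    ("skip", "nonrelevant"): 0.0,
}

best = bayes_action(["show", "skip"], posterior, loss)
```

Changing the loss table changes the decision without touching the model, which is the separation of models and loss function that the slide lists.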

Example: Aspect Retrieval

Query: What are current applications of robotics? Find as many different applications as possible.

Example Aspects
- A1: spot-welding robotics
- A2: controlling inventory
- A3: pipe-laying robots
- A4: talking robot
- A5: robots for loading & unloading memory tapes
- A6: robot telephone operators
- A7: robot cranes
- …

Aspect judgments

      A1  A2  A3      ...      Ak
d1    1   1   0   0   ...  0   0
d2    0   1   1   1   ...  0   0
d3    0   0   0   0   ...  1   0
...
dk    1   0   1   0   ...  0   1
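One simple way to read a judgment matrix like the one above is to count how many distinct aspects a ranked list has covered at each rank. The sketch below does that; it is an illustration only, since the slides do not define their measures. The toy data uses just the four leading columns visible for d1 to d3, read here as A1 to A4, which is an assumption because the slide elides the rest of the table.

```python
# Toy judgments: the four leading columns visible in the slide's table,
# read as aspects A1..A4 (an assumption; the slide elides the rest).
judgments = {
    "d1": [1, 1, 0, 0],
    "d2": [0, 1, 1, 1],
    "d3": [0, 0, 0, 0],
}


def cumulative_aspect_coverage(ranking, judgments):
    """For each rank, return the number of distinct aspects covered so far."""
    covered = set()
    counts = []
    for doc_id in ranking:
        row = judgments[doc_id]
        # Add the index of every aspect this document is judged to contain.
        covered.update(i for i, flag in enumerate(row) if flag)
        counts.append(len(covered))
    return counts


coverage = cumulative_aspect_coverage(["d1", "d2", "d3"], judgments)
# coverage == [2, 4, 4]
```

In this toy ranking, d2 adds two new aspects after d1, while d3 adds none within the four columns considered.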

Aspect Models (Hofmann, 1999; Blei, Ng and Jordan, 2001)

Generative: [Figure: Aspect 1 and Aspect 2, combined with weights 1 and 2; Dirichlet (for example)]

Inference: Given aspects and document, what is the posterior for the Dirichlet-distributed weights?

Learning: Given documents, what are the (ML) aspects?

Studied recently in (Minka and Lafferty, 2002).
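The generative story on this slide can be sketched roughly as follows, assuming each aspect is a unigram word distribution and each word picks its aspect according to document-level weights drawn from a Dirichlet. The aspect vocabularies, probabilities, and the `alpha` parameter are hypothetical, and the per-word sampling scheme is one common reading of such models rather than the exact formulation in the cited papers.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(aspects, alpha, length, rng):
    """Sample aspect weights, then sample each word from a chosen aspect."""
    weights = sample_dirichlet(alpha, rng)
    words = []
    for _ in range(length):
        # Pick an aspect index according to the document's weights.
        k = rng.choices(range(len(aspects)), weights=weights)[0]
        vocab, probs = zip(*aspects[k].items())
        # Pick a word from that aspect's unigram distribution.
        words.append(rng.choices(vocab, weights=probs)[0])
    return weights, words


# Hypothetical two-aspect model; the words and probabilities are made up.
aspects = [
    {"welding": 0.5, "robot": 0.5},
    {"inventory": 0.6, "robot": 0.4},
]

rng = random.Random(0)
weights, words = generate_document(aspects, alpha=[1.0, 1.0], length=10, rng=rng)
```

Inference, in the sense of the slide, would run this process backwards: given `aspects` and `words`, estimate a posterior over `weights`.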

Evaluation Measures

- What is the best measure? Requires concrete specification of the task.
- Several natural measures are computationally intractable, even assuming the aspects are known (e.g., aspect coverage, aspect uniqueness).
- Defining aspects is difficult.
- Maximum likelihood cannot be expected to capture "true" semantic relationships in aspects.

Aspect Retrieval Baselines

[Figure: Aspect Precision, Aspect Recall]

Challenges for IR Models

Probabilistic language models have proven to be an effective way to reason about IR systems. We now need:

- Better task specification and data
  e.g., TREC interactive data inadequate
- More advanced models
  Fewer independence assumptions, greater structure
- Improved inference and learning algorithms
  Accuracy and efficiency
- To handle user preferences, background knowledge
  Loss function and priors/constraints
