Information retrieval, language and ‘language models’. Stephen Robertson, Microsoft Research Cambridge and City University London. IR/LM workshop, Amherst, 11 September 2002.


Information retrieval, language and ‘language models’
Stephen Robertson
Microsoft Research Cambridge and City University London
IR/LM workshop, Amherst, 11 September 2002

Language and IR
IR deals mainly in text objects
Text = language
Therefore, models or theories about language must be relevant to IR
Many suggestions/attempts
– Transformational methods
– Shallow or deep NLP
– Anaphora etc. etc.

Language and IR
But IR went its own sweet way
– Term weighting, scoring functions, vector spaces, probabilistic models…
– … with a strong emphasis on statistics
Eventually, the language people became interested in statistics
– Statistical NLP, collocation linguistics…

Language and IR
But ‘language models’ (as in this workshop title) seem to come from outside
… and to share with IR a cavalier view of language
So, can language models succeed where other language approaches have failed?

Some modelling issues
Relevance
Topicality
Learning
Sources of evidence

Relevance
Central question: what is good system behaviour? (What does the user want to see? What would satisfy him/her?)
Not necessarily a binary Relevance variable, though that has proved very useful
Early language models seemed to hide this
– but this is changing

Topicality
How do we understand ‘topics’?
Documents are multi-topic
Topics are not predefined…
… potentially, any query defines a new topic (or perhaps more than one?)
Models of topicality have eluded the IR community…
… thus providing a significant opportunity for language modelling approaches

Learning and Sources of evidence
The major question: how to learn…
… and from what?
E.g. classical relevance feedback
– Text of query…
– … + relevance judgements
So how do we combine this evidence?
Again, opportunities for language models

Final remarks
Information retrieval is a slippery domain for modelling
Language modelling has the potential to add significantly to the modelling tools available
There are many connections between modelling approaches that need exploring