1 Information Retrieval Acknowledgements: Dr Mounia Lalmas (QMW) Dr Joemon Jose (Glasgow)

Presentation transcript:

1 Information Retrieval Acknowledgements: Dr Mounia Lalmas (QMW) Dr Joemon Jose (Glasgow)

2 Course Text
Modern Information Retrieval, R. Baeza-Yates and B. Ribeiro-Neto, Addison-Wesley and ACM Press, 1999, ISBN: X

3 Introduction
Example of an information need in the context of the World Wide Web:
“Find all documents containing information on computer courses which: (1) are offered by universities in South England, and (2) are accredited by the BCS/IEE bodies. To be relevant, the document must include information on admission requirements, and phone number for contact purposes.”
→ Information Retrieval

4 Information Retrieval
Diagram labels: Information Need, Query, Retrieval System (Search Engine), Documents, Set of retrieved documents, Useful or relevant information to the user
Primary goal of an IR system: “Retrieve all the documents which are relevant to a user query, while retrieving as few non-relevant documents as possible.”
– Representation, storage, organisation, and access to information items
– (Usually) keyword-based representation

5 User tasks
Pull technology
User requests information in an interactive manner
3 retrieval tasks
– Browsing (hypertext)
– Retrieval (classical IR systems)
– Browsing and retrieval (modern digital libraries and web systems)
Push technology
– automatic and permanent pushing of information to user
– software agents
– example: news service
– filtering (retrieval task) relevant information for later inspection by user

6 Documents
Unit of retrieval
A passage of free text
– composed of text, strings of characters from an alphabet
– composed of natural language: newspaper article, journal paper, dictionary definition, messages
– size of documents arbitrary: newspaper article vs. journal paper vs.

7 What is a document?

8 Representation of documents
Set of index terms or keywords
– extracted directly from text
– specified by human subjects (information science) → metadata
– Most concise representation
– Poor quality of retrieval
Full text representation
– Most complete representation
– High computational cost
Large collections
– Reduce set of representative keywords: elimination of stop words, stemming, identification of noun phrases
– Further compression
Structure representation
– Chapter, section, sub-section, etc.
Document term descriptors to access texts
Generation of descriptors for text
– By hand
– By analysing the text
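The keyword-reduction steps named on this slide (stop word elimination and stemming) can be sketched as follows. This is a minimal illustration only: the stop-word list and the `crude_stem` suffix rule are hypothetical stand-ins, not the method used in the course or in any particular IR system.

```python
import re

# Hypothetical, deliberately tiny stop-word list; real systems use much longer ones.
STOP_WORDS = {"a", "an", "and", "are", "by", "in", "is", "of", "on", "the", "to"}

def crude_stem(word):
    """Strip one common English suffix (a toy stand-in for a real stemmer)."""
    for suffix in ("ing", "s"):
        # Only strip when at least three characters of the word remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def index_terms(text):
    """Return the set of index terms extracted directly from free text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {crude_stem(t) for t in tokens if t not in STOP_WORDS}

terms = index_terms("Indexing the documents of courses")
print(sorted(terms))  # ['course', 'document', 'index']
```

Note how the five-word text shrinks to three index terms: a more concise representation, at the cost of losing word order and the exact word forms.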

9 The retrieval process
Diagram labels: Information need, Query, Formulation, Documents, Document representation, Indexing, Retrieved documents, Retrieval functions, Relevance feedback

10 Queries
Information Need
Simple queries
– composed of two or three, perhaps even dozens, of keywords
– e.g., as in web retrieval
Boolean queries
– “neural networks AND speech recognition”
Context queries
– Proximity search, phrase queries
User term descriptors characterising the user need
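A Boolean query such as the one quoted on this slide can be read as set operations over the documents that contain each descriptor. The sketch below assumes a hypothetical `postings` table with made-up document ids; it illustrates the idea and is not a query-language implementation.

```python
# Hypothetical postings: each descriptor maps to the set of ids of documents containing it.
postings = {
    "neural networks": {1, 2, 4},
    "speech recognition": {2, 3, 4},
    "hypertext": {3, 5},
}

def boolean_and(index, left, right):
    """Documents containing both descriptors (set intersection)."""
    return index.get(left, set()) & index.get(right, set())

def boolean_or(index, left, right):
    """Documents containing at least one descriptor (set union)."""
    return index.get(left, set()) | index.get(right, set())

print(sorted(boolean_and(postings, "neural networks", "speech recognition")))  # [2, 4]
```

A document either satisfies a Boolean query or it does not, so the result is an unordered set rather than a ranked list.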

11 Best-Match Retrieval
Compare the terms in a document and query
Compute similarity between each document in the collection and the query, based on the terms that they have in common
Sort the documents in order of decreasing similarity with the query
The output is a ranked list displayed to the user; the top documents are the most relevant as judged by the system
Document term descriptors to access texts
User term descriptors characterising the user need
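The best-match steps above can be sketched in a few lines. The slide does not say which similarity measure is used, so this example assumes the simplest one, the number of shared terms; the collection and its document ids are made up for illustration.

```python
def similarity(doc_terms, query_terms):
    """Assumed measure: number of terms the document and query have in common."""
    return len(set(doc_terms) & set(query_terms))

def rank(collection, query_terms):
    """Return document ids in order of decreasing similarity with the query."""
    scored = [(similarity(terms, query_terms), doc_id)
              for doc_id, terms in collection.items()]
    # Highest score first; ties are broken by document id so the order is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    # Documents sharing no term with the query are left out of the ranked list.
    return [doc_id for score, doc_id in scored if score > 0]

# Hypothetical collection: document id -> term descriptors.
collection = {
    "d1": ["computer", "course", "university"],
    "d2": ["computer", "hardware"],
    "d3": ["cooking", "course"],
    "d4": ["gardening"],
}
print(rank(collection, ["computer", "course"]))  # ['d1', 'd2', 'd3']
```

Unlike a Boolean match, d2 and d3 are still returned even though each contains only one of the two query terms; they simply rank below d1.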

12 Conceptual View of Text Retrieval
Diagram labels: Queries, Documents, Similarity Computation, Retrieved Documents

13 Expanded view of text retrieval system
Diagram labels: Queries, Documents, Indexing, Indexed Documents, Similarity Computation, Retrieved Documents, Ranked Documents

14 Process of retrieving info
Diagram labels: User Interface, Text Operations, Query Operations, Indexing, Similarity Computation, Ranking, Document Repository Manager, Index, User need, Logical view, Inverted file, Query, Retrieved docs, Text, User feedback, Ranked docs, Text repository
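The "Inverted file" in this diagram is the index structure that maps each term to the documents containing it. A minimal sketch, assuming whitespace tokenisation and a made-up three-document repository:

```python
from collections import defaultdict

def build_inverted_file(docs):
    """Map each term to the set of ids of the documents in which it occurs."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumption: lowercase and split on whitespace; real text operations do more.
        for term in text.lower().split():
            index[term].add(doc_id)
    return dict(index)

# Hypothetical text repository: document id -> text.
docs = {
    1: "information retrieval systems",
    2: "retrieval of text",
    3: "text indexing",
}
inverted = build_inverted_file(docs)
print(sorted(inverted["retrieval"]))  # [1, 2]
```

Looking up a query term is then a single dictionary access, instead of a scan over every text in the repository.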

15 Key Topics
Indexing text documents
Retrieving text documents
Evaluation
Query reformulations
Search Engines = IR + Link Structure + Name Interpretation

16 Information Retrieval vs Information Extraction
Information Retrieval
– Given a set of query terms and a set of document terms, select only the most relevant documents [precision], and preferably all the relevant ones [recall].
Information Extraction
– Extract from the text what the document means.
IR systems can FIND documents but need not “understand” them.
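The two bracketed measures can be made concrete with the standard definitions: precision is the fraction of retrieved documents that are relevant, and recall is the fraction of relevant documents that are retrieved. The document ids below are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved and a set of relevant ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were actually retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical run: 4 documents retrieved, 3 relevant overall, 2 in common.
p, r = precision_recall({"d1", "d2", "d3", "d4"}, {"d1", "d3", "d5"})
print(p)            # 0.5
print(round(r, 2))  # 0.67
```

Here half of what was retrieved is relevant (precision 0.5), and two of the three relevant documents were found (recall about 0.67).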