Introduction to Information Retrieval


1 Introduction to Information Retrieval
Example of an information need in the context of the World Wide Web:
“Find all documents containing information on computer courses which: (1) are offered by universities in South England, and (2) are accredited by the BCS/IEE bodies. To be relevant, the document must include information on admission requirements, and an e-mail address and phone number for contact purposes.”
Information Retrieval

2 Information Retrieval
Representation, storage, organisation of, and access to information items
Information need => query
Retrieval (search) engine
Useful or relevant information to the user
(Usually) keyword-based representation
Primary goal of an IR system
“Retrieve all the documents which are relevant to a user query, while retrieving as few non-relevant documents as possible.”
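The primary goal quoted above is commonly measured with two figures: precision (the share of retrieved documents that are relevant) and recall (the share of relevant documents that were retrieved). The slide does not name these measures, so the sketch below is an illustration under that assumption; the function name and document identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: document ids returned by the system
    relevant:  document ids a user would judge relevant (assumed known here)
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents that were actually retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 documents retrieved, 3 relevant overall, 2 in common.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d3", "d5"]
p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")
```

Retrieving "all the relevant documents" pushes recall up, while retrieving "as few non-relevant documents as possible" pushes precision up; the two usually trade off against each other.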

3 Information retrieval vs. data retrieval

4 Basic concepts
Effective retrieval of useful information
User tasks
Document representation

5 User tasks
Pull technology
User requests information in an interactive manner
3 retrieval tasks
–Browsing (hypertext)
–Retrieval (classical IR systems)
–Browsing and retrieval (modern digital libraries and web systems)
Push technology
–automatic and permanent pushing of information to the user
–software agents
–example: news service
–filtering (a retrieval task): selecting relevant information for later inspection by the user

6 Representation of documents
Set of index terms or keywords
–extracted directly from text
–specified by human subjects (information science)
Most concise representation
Poor quality of retrieval
Full text representation
–Most complete representation
–High computational cost
Large collections
–Reduce set of representative keywords
Elimination of stop words
Stemming
Identification of noun phrases
Further compression
Structure representation
–Chapter, section, sub-section, etc.
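The keyword-reduction steps listed above (stop-word elimination and stemming) can be sketched as follows. This is a minimal illustration, not the method used in the lecture: the stop-word list is a tiny hypothetical sample, and `crude_stem` is a naive suffix stripper standing in for a real stemming algorithm. Noun-phrase identification is not attempted.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "and", "are", "by", "for", "in", "is", "of", "on", "the", "to"}

# Suffixes tried in order; a real stemmer applies far more careful rules.
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def index_terms(text):
    """Reduce full text to a sorted list of distinct, stemmed index terms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sorted({crude_stem(t) for t in tokens if t not in STOP_WORDS})


terms = index_terms("Computer courses are offered by universities in the South of England")
print(terms)
```

The output shows both effects: function words such as "are" and "the" disappear, and word forms are conflated ("offered" becomes "offer"). It also shows the weakness of naive stripping, since "courses" is cut to "cours".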

7 The retrieval process
[Diagram: Information need, Query, Formulation, Documents, Document representation, Indexing, Retrieved documents, Retrieval functions, Relevance feedback]
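One possible reading of the process named on this slide is sketched below: documents are indexed into a representation, the information need is formulated as a keyword query, and a retrieval function matches the two to produce a ranked list. The inverted-index structure, the term-overlap scoring, and all names and sample documents are assumptions made for illustration; relevance feedback is not modelled.

```python
from collections import defaultdict


def tokenize(text):
    """Lower-case and split on whitespace (no stop-word removal or stemming)."""
    return text.lower().split()


def build_index(docs):
    """Indexing: map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def retrieve(index, query):
    """Retrieval function: rank documents by how many distinct query terms they contain."""
    scores = defaultdict(int)
    for term in set(tokenize(query)):
        for doc_id in index.get(term, ()):
            scores[doc_id] += 1
    # Highest score first; ties broken by document id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical three-document collection.
docs = {
    "d1": "computer courses offered by universities",
    "d2": "admission requirements for computer courses",
    "d3": "history of card catalogues in libraries",
}
index = build_index(docs)
results = retrieve(index, "computer courses admission")
print(results)
```

Here "d2" ranks first because it matches all three query terms, "d1" matches two, and "d3" matches none and is not retrieved.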

8 History (1)
Early development
–Table of contents for books
–Index: collection of selected words with pointers to corresponding information
Created manually
Categorisation hierarchy
–Still used in most libraries
–Yahoo!
Automatic creation of indexes => Information retrieval

9 History (2)
IR in the libraries
–First to adopt IR to retrieve information
–Initially developed by academic institutions
Automation of previous technologies (card catalogues)
Search on author name, title
–Later developed by commercial vendors
Subject heading, keywords
More complex query facilities
–Now
Improved graphical interface
Electronic forms
Hypertext features
Open system architecture

10 History (3)
Web and digital libraries
–Why
Low-cost technology
Advances in digital communication, and greater access to networks
Publishing freedom
–Then
Web + digital libraries => highly interactive medium
–Issues
High-quality retrieval (finding the relevant information)
User interaction (understanding user behaviour in order to design and develop new IR strategies)
Fast indexes and quick response time

11 Practical issues
Security (electronic commerce)
Privacy
Copyright and patent rights (digital libraries)
Scanning and optical character recognition (OCR)
Cross-lingual retrieval
Non-text media
Application to the web

