CSM06 Information Retrieval Lecture 1a – Introduction Dr Andrew Salway

1 CSM06 Information Retrieval Lecture 1a – Introduction Dr Andrew Salway a.salway@surrey.ac.uk

2 Lecture 1a: INTRODUCTION What is information retrieval?

3 Why?, Who?, What? Why do we need information retrieval? Who are the users of information retrieval systems? What kinds of information do they want to retrieve? Why study information retrieval? For example, why is it important to understand how search engines work?

4 Applications of Information Retrieval
For the World Wide Web
For organisations’ intranets
For our personal media collections

5 INSERT GOOGLE screenshot

6 INSERT AltaVista screenshot

7 INSERT Yahoo screenshot - Query

8

9 INSERT Autonomy screenshot

10 INSERT IBM Webfountain screenshot

11 INSERT my email screenshot

12 INSERT my photos screenshot

13 A very brief history…
Libraries for thousands of years
1950s – computer-based IR
Early 1990s – web search
Late 1990s – multimedia search

14 Some traditional ways of organizing information
Table of contents of a book
Index of a book
Library classification schemes: hierarchies (e.g. Dewey Decimal), controlled vocabularies
Collections of abstracts

15 From the dictionary… Library. 1 A large organised collection of books for reading or reference. b A mass of learning or knowledge; a source providing knowledge and learning. c A collection of films, gramophone records, etc. when organised or sorted for some specific purpose… The New Shorter Oxford English Dictionary, 1993

16 Information Retrieval “the representation, storage, organisation of, and access to information items” (Baeza-Yates and Ribeiro-Neto 1999, page 1)

17 How is computer-based IR different to traditional libraries?
Remote, multiple access
May have multiple indexes
Interactivity
Scale
Automatic indexing and ranking
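To make "automatic indexing and ranking" concrete, here is a minimal sketch of one common approach: an inverted index that maps each term to the documents containing it, with results ranked by summed term frequency. The slide names no specific method, so the technique, the function names `build_index` and `search`, and the three sample documents are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def build_index(docs):
    """Map each term to a Counter of {doc_id: term frequency}."""
    index = defaultdict(Counter)
    for doc_id, text in docs.items():
        # Naive tokenisation: lowercase and split on whitespace.
        for term in text.lower().split():
            index[term][doc_id] += 1
    return index

def search(index, query):
    """Rank documents by the summed frequency of the query terms."""
    scores = Counter()
    for term in query.lower().split():
        # .get avoids creating empty entries for unseen terms.
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] += tf
    # Highest score first; ties broken by document id.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [doc_id for doc_id, _ in ranked]

# Hypothetical mini-collection, not taken from the lecture.
docs = {
    "d1": "information retrieval on the web",
    "d2": "library classification schemes",
    "d3": "web search engines index the web",
}
index = build_index(docs)
print(search(index, "web search"))  # ['d3', 'd1']
```

Unlike a manually compiled book index or library catalogue, both steps here run without human intervention, which is what lets the approach scale to large collections. Practical systems use more refined tokenisation and weighting than this raw term count.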

18 What are the particular challenges for IR on the Web?
Volume of text data – Google claims to index more than 8,000,000,000 webpages, and that’s not everything
Multimedia information – traditional IR focussed on texts
More and more multilingual information
Cannot access original text when processing a query
Distributed data – different platforms, bandwidths
Large amount of volatile data and redundant data
Diverse users (hence diverse information needs) and many inexperienced users
Some good news though! The links between webpages can be useful for web search engines (more on this in Lecture 4)

19 Who are they?

20 “Analysts estimate that Google is worth between $15 billion and $20 billion” The Times, 29/01/2004

