Presentation transcript:

G. Marchionini, Univ. of Maryland
Electronic Environments
Cost Trends: Hardware cost < Software cost < Information cost < People time
Virtuality (transcend space)
Timeliness (minimize time)
Interactivity
Multimedia
Trends: Resource Sharing, Collaboration, Dynamic Representation, The WWW
Critical Need for Text and Multimedia Management Systems!

Information Seeking Perspective
Information seeking is a human-centered process
Analytical-to-browse continuum of strategies and tactics
Close coupling of queries, results, and usage
Interactive, iterative process
Information retrieval has focused on documents (not concepts or answers)

Electronic Text Retrieval
1. Text retrieval is more complex than data retrieval from a DBMS.
2. Distinguish searching for word matches from searching for concept matches.
3. Distinguish subject search from keyword search:
   Subject: Search on a controlled vocabulary (e.g., LC subject headings). The results point to documents.
   Keyword: Search all words in particular fields or text fragments. The results point to documents.
4. Distinguish exact-match retrieval from partial-match retrieval.

Approaches to Text Retrieval
1. Surrogate Search: Search a set of predefined words that point to related documents. Requires indexing via some controlled vocabulary.
   pros: natural transition from paper systems; computationally cheap
   cons: limited access; human indexing required
2. Full-Text Search: Search every word in every document.
   pros: broadens access; indexing can be automated
   cons: computationally expensive; matches words rather than concepts
3. Knowledge-Based Search: Search a set of concepts that are related to concepts in documents.
   pros: improved retrieval
   cons: computationally expensive; theoretical at present

Full-Text Search
Full-Text Search: Search every word (or variant) in the document except stop words.
Methods:
–Text Scanning
–Indexes (inverted files)
–Vectors
–Signatures
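The simplest of the methods listed, text scanning, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stop-word list, the sample documents, and the function names `tokenize` and `scan` are made up for the example, and word variants (stemming) are not handled.

```python
import re

# Hypothetical stop-word list; real systems use much longer lists.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "is", "to"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def scan(documents, query):
    """Return ids of documents that contain every non-stop query word.

    Every document is read in full for every query, which is why
    scanning becomes expensive as the collection grows.
    """
    terms = [t for t in tokenize(query) if t not in STOP_WORDS]
    hits = []
    for doc_id, text in documents.items():
        words = set(tokenize(text)) - STOP_WORDS
        if terms and all(t in words for t in terms):
            hits.append(doc_id)
    return hits


# Made-up sample collection.
docs = {
    "Doc1": "The aardvark is a nocturnal mammal.",
    "Doc2": "An abacus is a counting tool.",
    "Doc3": "The zygote of the aardvark.",
}
result = scan(docs, "the aardvark")  # "the" is ignored as a stop word
```

A query made up only of stop words matches nothing here; that is a design choice of this sketch, not something the slides specify.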

Inverted File
Words point to word number, offset, surrogate, or document:
aardvark: *Doc3, Doc7, Doc45, Doc...
abacus: Doc2, Doc16, Doc33, Doc45, Doc67, ...
zygote: Doc7, Doc33, Doc67, Doc123, ...
Find all documents and then apply logical operators to combine
Query either matches or does not match
* actually Doc3, Para5, Word45
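A minimal Python sketch of an inverted file with Boolean combination follows. The document ids echo the slide, but the document texts and the helper names `build_index`, `boolean_and`, and `boolean_or` are assumptions made for illustration. The index stores document ids only, not the paragraph and word offsets mentioned in the footnote.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word].add(doc_id)
    return index


def boolean_and(index, *words):
    """Documents containing all of the given words."""
    postings = [index.get(w, set()) for w in words]
    return set.intersection(*postings) if postings else set()


def boolean_or(index, *words):
    """Documents containing at least one of the given words."""
    return set().union(*(index.get(w, set()) for w in words))


# Made-up document texts; only the ids follow the slide.
docs = {
    "Doc3": "aardvark",
    "Doc7": "aardvark zygote",
    "Doc33": "abacus zygote",
    "Doc45": "aardvark abacus",
}
index = build_index(docs)
both = boolean_and(index, "aardvark", "zygote")
either = boolean_or(index, "aardvark", "abacus")
```

Each query is answered from the posting sets alone, without rereading the documents, and a document either is in the result set or is not.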

Vectors
Each document (or surrogate) is represented by a vector defined by every word in the collection.
Doc
Doc
Doc (has aardvark and zygote).
Doc (has abacus and zygote).
Doc (has aardvark, abacus and zygote).
Doc N
Queries are expressed as vectors and matched to document vectors.
Degrees of matching are possible.
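The vector representation can be sketched in Python as follows. The three-word vocabulary, the document labels A, B, and C, and the use of binary weights with cosine similarity are all assumptions for illustration; the slides do not say which weighting or similarity measure is intended.

```python
import math

# Hypothetical three-word collection vocabulary; a real one has one
# dimension per distinct word in the collection.
VOCABULARY = ["aardvark", "abacus", "zygote"]


def to_vector(words):
    """Binary vector: 1 where the vocabulary word is present, else 0."""
    present = set(words)
    return [1 if term in present else 0 for term in VOCABULARY]


def cosine(u, v):
    """Cosine similarity; one possible way to grade partial matches."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Made-up labels; the word sets mirror the three described documents.
docs = {
    "A": to_vector(["aardvark", "zygote"]),
    "B": to_vector(["abacus", "zygote"]),
    "C": to_vector(["aardvark", "abacus", "zygote"]),
}
query = to_vector(["aardvark", "zygote"])

# Rank documents from best to worst match against the query vector.
ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
```

Unlike the Boolean case, every document receives a score, so a document that shares only some of the query words still appears in the ranking, just lower down.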

Document Alternatives
Paragraphs, passages
SGML codes
Related problems:
–text summarization/auto abstracting
–auto categorization

Multimedia
Linguistic surrogates
Images
–color, texture, luminosity, shape
Video
–same as stills but add motion
Sound
–speaker attributes, pitch, duration

Retrieval Trends
1. More full text databases (e.g., The Web!)
2. More statistical engines for ranking results (e.g., PLS, Inquiry, RetrievalWare, Topic)
3. Evolution in traditional markets (e.g., Dialog's Target, West's WIN, Mead's Freestyle)
4. WWW engines and services (Yahoo, Alta Vista, etc.)
5. Relevance feedback added
6. Multimedia developments