SIMS 202: Information Organization and Retrieval
Prof. Marti Hearst and Prof. Ray Larson
UC Berkeley SIMS
Tues/Thurs 9:30-11:00am, Fall 2000

Today
- Modern IR textbook topics
- The information seeking process

Textbook Topics

More Detailed View

What We'll Cover
(Diagram: course topics grouped into "A Lot" vs. "A Little")

Search and Retrieval: Outline of Part I of SIMS 202
- The search process
- Information retrieval models
- Content analysis / Zipf distributions
- Evaluation of IR systems
  - Precision/recall
  - Relevance
  - User studies
- System and implementation issues
- Web-specific issues
- User interface issues
- Special kinds of search

What is an Information Need?

The Standard Retrieval Interaction Model

Standard Model
- Assumptions:
  - Precision and recall are to be maximized simultaneously
  - The information need remains static
  - The value is in the resulting document set
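
Since the standard model is judged on precision and recall, it helps to make the two measures concrete. A minimal sketch (illustrative only; not part of the original slides):

```python
def precision_recall(retrieved, relevant):
    """Precision: fraction of retrieved docs that are relevant.
    Recall: fraction of relevant docs that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Example: 3 of 5 retrieved documents are relevant; 6 relevant exist in all.
print(precision_recall({1, 2, 3, 4, 5}, {1, 2, 3, 8, 9, 10}))  # (0.6, 0.5)
```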

Problems with the Standard Model
- Users learn during the search process by:
  - Scanning titles of retrieved documents
  - Reading retrieved documents
  - Viewing lists of related topics/thesaurus terms
  - Navigating hyperlinks
- Some users don't like long, disorganized lists of documents

Search is an Iterative Process
(Diagram: iteration among Goals, the Workspace, and Repositories)

"Berry-Picking" as an Information Seeking Strategy (Bates 90)
- Standard IR model
  - Assumes the information need remains the same throughout the search process
- Berry-picking model
  - Interesting information is scattered like berries among bushes
  - The query is continually shifting

A sketch of a searcher "moving through many actions towards a general goal of satisfactory completion of research related to an information need." (after Bates 89)
(Diagram: a wandering path through an evolving sequence of queries Q0, Q1, Q2, Q3, Q4, Q5)

Berry-Picking Model (cont.)
- The query is continually shifting
- New information may yield new ideas and new directions
- The information need
  - Is not satisfied by a single, final retrieved set
  - Is satisfied by a series of selections and bits of information found along the way
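
To make the contrast with the standard model concrete, here is a toy sketch of a berry-picking session loop; `run_query` and `revise_query` are hypothetical callables standing in for a search engine and the user's shifting need, not part of any cited system:

```python
def berry_picking_session(initial_query, run_query, revise_query, max_steps=10):
    """Illustrative berry-picking loop: value accumulates across steps,
    and the query shifts as the user learns; there is no one final set."""
    query, basket = initial_query, []          # "basket" of berries picked so far
    for _ in range(max_steps):
        results = run_query(query)             # hypothetical search call
        picked = [doc for doc in results if doc.get("useful")]
        basket.extend(picked)                  # keep bits found along the way
        query = revise_query(query, picked)    # the need shifts with what was learned
        if query is None:                      # the searcher decides to stop
            break
    return basket                              # a series of selections, not one set
```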

Information Seeking Behavior
- Two parts of a process:
  - Search and retrieval
  - Analysis and synthesis of search results
- This is a fuzzy area; we will look at several different working theories

Search Tactics and Strategies
- Search tactics
  - Bates 79
- Search strategies
  - Bates 89
  - O'Day and Jeffries 93

Tactics vs. Strategies
- Tactic: short-term goals and maneuvers
  - Operators, actions
- Strategy: overall planning
  - Links a sequence of operators together to achieve some end (see the sketch below)
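
Purely as an illustration of this operator view (an assumption of this example, not a formalization from Bates), a strategy can be modeled as a composition of tactic operators over a search state:

```python
from functools import reduce

# Tactics as operators: each maps a search state to a new search state.
def broaden(state):
    return {**state, "query": state["query"] + " OR synonyms"}

def cut(state):
    return {**state, "sources": state["sources"][:1]}  # drop most sources

def strategy(*tactics):
    """A strategy links a sequence of operators together toward some end."""
    return lambda state: reduce(lambda s, t: t(s), tactics, state)

plan = strategy(cut, broaden)  # apply Cut, then broaden the query
print(plan({"query": "berry picking", "sources": ["web", "lexis", "patents"]}))
```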

Information Search Tactics (after Bates 79)
- Monitoring tactics
  - Keep the search on track
- Source-level tactics
  - Navigate to and within sources
- Term and search formulation tactics
  - Designing the search formulation
  - Selecting and revising specific terms within the search formulation

Term Tactics
- Move around the thesaurus (see the sketch below)
  - Superordinate, subordinate, and coordinate terms
  - Neighbor (semantic or alphabetic)
  - Trace: pull out terms from information already seen during the search (titles, etc.)
  - Morphological and other spelling variants
  - Antonyms (contrary)
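
As one way to make the thesaurus moves concrete, the sketch below uses NLTK's WordNet interface (our choice of thesaurus for illustration; the slides do not prescribe one) to find superordinate, subordinate, and coordinate terms:

```python
# Requires: pip install nltk, then nltk.download("wordnet") once.
from nltk.corpus import wordnet as wn

synset = wn.synsets("query")[0]                      # pick one sense of "query"

superordinate = synset.hypernyms()                   # broader terms
subordinate = synset.hyponyms()                      # narrower terms
coordinate = [s for h in synset.hypernyms()          # siblings: other children
              for s in h.hyponyms() if s != synset]  # of the same broader term

print("broader: ", [s.lemma_names() for s in superordinate])
print("narrower:", [s.lemma_names() for s in subordinate])
print("siblings:", [s.lemma_names() for s in coordinate])
```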

Source-Level Tactics
- "Bibble":
  - Look for a pre-defined result set
  - E.g., a good link page on the web
- Survey:
  - Look ahead and review the available options
  - E.g., don't simply use the first term or first source that comes to mind
- Cut:
  - Eliminate a large proportion of the search domain
  - E.g., search on the rarest term first (see the sketch below)

Source-Level Tactics (cont.)
- Stretch:
  - Use a source in an unintended way
  - E.g., use patents to find addresses
- Scaffold:
  - Take an indirect route to the goal
  - E.g., when looking for references to an obscure poet, look up the poet's contemporaries
- Cleave:
  - Binary search in an ordered file
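
A minimal sketch of how Cut and Cleave look inside a retrieval system (the tiny inverted index is a toy stand-in, not a real collection): intersect postings starting with the rarest term, and use binary search within a sorted postings list:

```python
from bisect import bisect_left  # Cleave: binary search in an ordered file

# Toy inverted index: term -> sorted list of document IDs.
index = {
    "retrieval": [1, 2, 3, 5, 8, 9],
    "berry":     [2, 8],               # rarest term
    "model":     [1, 2, 4, 8, 9],
}

def conjunctive_search(terms):
    """Cut: intersect postings starting with the rarest term, so most of
    the search domain is eliminated in the very first step."""
    postings = sorted((index.get(t, []) for t in terms), key=len)
    result = set(postings[0])
    for p in postings[1:]:
        result &= set(p)
    return sorted(result)

def contains(postings, doc_id):
    """Cleave: locate doc_id in a sorted postings list by binary search."""
    i = bisect_left(postings, doc_id)
    return i < len(postings) and postings[i] == doc_id

print(conjunctive_search(["retrieval", "berry", "model"]))  # [2, 8]
print(contains(index["model"], 4))                          # True
```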

Monitoring Tactics (strategy-level)
- Check
  - Compare the original goal with the current state
- Weigh
  - Make a cost/benefit analysis of current or anticipated actions
- Pattern
  - Recognize common strategies
- Correct errors
- Record
  - Keep track of (incomplete) paths

Additional Considerations (Bates 79)
- Add a Sort tactic!
- More detail is needed about short-term cost/benefit decision-rule strategies
- When to stop?
  - How to judge when enough information has been gathered?
  - How to decide when to give up an unsuccessful search?
  - When to stop searching in one source and move to another?

Lexis-Nexis Interface
- What tactics did you use?
- What strategies did you use?

Implications
- Interfaces should make it easy to store intermediate results (see the sketch below)
- Interfaces should make it easy to follow trails with unanticipated results
- This makes evaluation more difficult
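
As a sketch of the first implication (entirely illustrative; not an interface from the lecture), a workspace that records each query and its results lets a searcher revisit intermediate points on a trail:

```python
from dataclasses import dataclass, field

@dataclass
class Workspace:
    """Illustrative workspace: stores intermediate result sets so trails
    with unanticipated turns can be revisited later."""
    trail: list = field(default_factory=list)   # (query, result_ids) steps

    def record(self, query, result_ids):
        self.trail.append((query, list(result_ids)))

    def backtrack(self, step):
        """Return to an earlier point on the trail."""
        return self.trail[step]

ws = Workspace()
ws.record("berry picking model", [2, 8])
ws.record("orienteering O'Day", [4, 9, 11])
print(ws.backtrack(0))  # ('berry picking model', [2, 8])
```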

Orienteering (O'Day & Jeffries 93)
- Interconnected but diverse searches on a single, problem-based theme
- Focus on information delivery rather than search performance
- Classifications resulting from an extended observational study:
  - 15 clients of professional intermediaries
  - Financial analyst, venture capitalist, product marketing engineer, statistician, etc.

Orienteering (O'Day & Jeffries 93)
- Identified three main search types:
  - Monitoring
    - A well-known topic over time
    - E.g., research four competitors every quarter
  - Following a plan
    - A typical approach to the task at hand
    - E.g., improve business process X
  - Exploratory
    - Explore a topic in an undirected fashion
    - E.g., get to know an unfamiliar industry

Orienteering (O'Day & Jeffries 93)
- Trends:
  - A series of interconnected but diverse searches on one problem-based theme
  - This happened in all three search modes
  - Each analyst did at least two search types
- Each stage was followed by reading, assimilation, and analysis of the resulting material

Orienteering (O'Day & Jeffries 93)
- Searches tended to trigger new directions
  - Overview, then detail, repeat
  - The information need shifted between search requests
  - The context of the problem and of previous searches was carried to the next stage of the search
- The value was contained in the accumulation of search results, not in the final result set
- These observations verified Bates' predictions

Orienteering (O'Day & Jeffries 93)
- Triggers: motivations to switch from one strategy to another
  - The next logical step in a plan
  - Encountering something interesting
  - Explaining change
  - Finding missing pieces

Stop Conditions (O'Day & Jeffries 93)
- Stopping conditions were not as clear-cut as the triggers
- People stopped searching when:
  - There were no more compelling triggers
  - They had finished an appropriate amount of searching for the task
  - A specific factor inhibited further search
    - E.g., learning that the market was too small
  - Returns stopped increasing (the 80/20 rule; see the sketch below)
- Missing information and inferences were acceptable
  - The business world differs from scholarship in this respect
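
A toy illustration of the diminishing-returns condition (our own example of an 80/20-style rule, not a formula from the study): stop when recent search steps yield too few new useful items relative to the best earlier step:

```python
def should_stop(new_items_per_step, window=3, threshold=0.2):
    """Illustrative diminishing-returns rule: stop once the average yield
    of the last `window` steps falls below `threshold` of the best step."""
    if len(new_items_per_step) < window:
        return False
    recent = new_items_per_step[-window:]
    best = max(new_items_per_step)
    return best > 0 and (sum(recent) / window) < threshold * best

# Yields of new useful documents per search step:
print(should_stop([12, 8, 5, 2, 1, 0]))  # True: recent avg 1.0 < 0.2 * 12
print(should_stop([12, 10, 9]))          # False: still productive
```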

After the Search: Analyzing and Synthesizing Search Results
- Orienteering post-search behaviors:
  - Read and annotate
  - Analyze: 80% of analyses fell into six main types

Post-Search Analysis Types (O'Day & Jeffries 93)
- Trends
- Comparisons
- Aggregation and scaling
- Identifying a critical subset
- Assessing
- Interpreting
- The rest:
  - Cross-reference
  - Summarize
  - Find evocative visualizations
  - Miscellaneous

Sensemaking (Russell et al. 93)
- The process of encoding retrieved information to answer task-specific questions
- Combines:
  - Internal cognitive resources
  - External retrieved resources
- Creates a good representation
  - An iterative process
  - Must contend with a cost/benefit tradeoff

Sensemaking (Russell et al. 93)
- Most of the effort goes into synthesizing a good representation, one that:
  - Covers the data
  - Increases usability
  - Decreases cost-of-use

Summary
- The information access process
  - Berry-picking and orienteering offer alternatives to the standard IR model
  - Their results are more difficult to assess
  - Interactive search behavior can be analyzed in terms of tactics and strategies
- Sensemaking
  - Combines searching with the use of the results of search

Next Time
- IR systems overview
- Query languages
  - Boolean model
  - Boolean queries