
1 Slides
Please download the slides from

2 Interactions
LBSC 796/CMSC 838o
Daqing He, Douglas W. Oard
Session 5, March 8, 2004

3 Slides
Please download the slides from

4 Agenda
- Interactions in retrieval systems
- Query formulation
- Selection
- Examination
- Document delivery

5 System Oriented Retrieval Model
(Flow diagram: documents are acquired into the Collection and processed by Indexing into an Index; a Query drives the Search, which produces a Ranked List.)

6 Whose Process Is It?
- Who initiates a search process?
- Who controls the progress?
- Who ends a search process?

7 User Oriented Retrieval Model
(Flow diagram: the user moves through Source Selection → Query Formulation → Query → Search → Ranked List → Document Selection → Document Examination → Document → Document Delivery; on the system side, Acquisition builds the Document Collection, which Indexing turns into the Index used by the IR system's Search.)

8 Taylor's Conceptual Framework
Four levels of "information needs" [Taylor 68]:
- Visceral: what you really want to know
- Conscious: what you recognize that you want to know
- Formalized: how you articulate what you want to know (e.g., TREC topics)
- Compromised: how you express what you want to know to a system (e.g., TREC queries)

9 Belkin's ASK model
- Users are concerned with a problem
- But they do not clearly understand
  - the problem itself
  - the information need to solve the problem
  => Anomalous State of Knowledge
- A clarification process is needed to form a query
[Belkin 80; Belkin, Oddy, Brooks 82]

10 What are humans good at?
- Sense low-level stimuli
- Recognize patterns
- Reason inductively
- Communicate through multiple channels
- Apply multiple strategies
- Adapt to changes or unexpected events
"Fuzzy and hard things"
From Ben Shneiderman's "Designing the User Interface"

11 What are computers good at?
- Sense stimuli outside the human range
- Calculate quickly and mechanically
- Store large quantities of data and recall them accurately
- Respond rapidly and consistently
- Perform repetitive actions reliably
- Maintain performance under heavy load and over extended time
"Simple and sharply defined things" (again paraphrasing George Miller)
From Ben Shneiderman's "Designing the User Interface"

12 What should Interaction be?
Synergistic:
- Humans do the things that humans are good at
- Computers do the things that computers are good at
- The strength of one covers the weakness of the other

13 Source Selection
- People have their own preferences
- Different tasks require different sources
- Possible choices:
  - ask people or machines for help
  - browsing, search, or a combination of both
  - a general-purpose vs. a domain-specific IR system
  - different collections

14 Query Formulation
(Diagram: the User performs Query Formulation, producing a query for Search; the Collection is prepared by Indexing.)

15 User's Goals
- Identify the right query for the current need
  - conscious/formalized need => compromised need
How can the user achieve this goal?
- Infer the right query terms
- Infer the right composition of terms

16 System's Goals
- Help the user build links between needs and queries
- Help the user know more about the system and the collection

17 How does the System Achieve Its Goals?
- Ask more from the user
  - Encourage long/complex queries
  - Provide a large text entry area
  - Use form filling or direct manipulation
- Initiate interactions
  - Ask questions related to the needs
  - Engage in a dialogue with the user
- Infer from relevant items
  - Infer from previous queries
  - Infer from previously retrieved documents

18 Query Formulation Interaction Styles
[Shneiderman 97]:
- Command Language
- Form Fill-in
- Menu Selection
- Direct Manipulation
- Natural Language
Credit: Marti Hearst

19 Form-Based Query Specification (Melvyl)
Credit: Marti Hearst

20 Form-based Query Specification (Infoseek)
Credit: Marti Hearst

21 Direct Manipulation Spec. VQUERY (Jones 98)
Credit: Marti Hearst

22 High-Accuracy Retrieval of Documents
(Diagram: a Topic Statement goes to the Search Engine, which produces Baseline Results and Clarification Questions; the user's Answers to the Clarification Questions go back to the engine, which produces the HARD Results.)
- New track in TREC
  - Studies the interaction between a user and a system
  - Only one chance to interact with the user
  - Query formulation is still the system's task
- Extended batch IR model
  - Acknowledges that queries are not equal to needs
  - Allows asking the user a set of clarification questions
  - Designed for controlled evaluation
    - Clarification questions are generated in batch mode, and only generated once
    - Uses a ranked list as the outcome of the search
- Reasons for participation
  - The human factor in the IR process
  - Controlled evaluation is hard in a full interactive IR experiment

23 UMD HARD 2003 retrieval model
(Diagram of the HARD retrieval process: answers to the Clarification Questions — preference among subtopic areas, recently viewed relevant documents, preference for sub-collections or genres, and desired result formats — drive Query Expansion, Document Reranking, Passage Retrieval, and Ranked List Merging, producing a Refined Ranked List.) [He & Demner, 2003]

24 Dialogues in Need Negotiation
(Diagram: Information Need → 1. Formulate a Query → 2. Need negotiation with the Search Engine/intermediary → 3. Find Documents Matching the Query in the Document Collection → Search Results)
Let's see what I mean. Our interest is in the retrieval process, which in a library setting would be: (1) a person has an information need; (2) he formulates a query and consults an experienced human intermediary who knows a lot about the library's document collection; (3) she looks through the collection and finds the documents that match the user's query.

25 Personalization through User's Search Contexts
(Diagram: an Incremental Learner maintains the user's search contexts — a Casablanca context, an African Queen context, a Romantic Films context — which are fed into the Information Retrieval System.) [Goker & He, 2000]
One way of achieving personalization is to include the user's search contexts in the retrieval process. To illustrate this, let's look at a scenario.

26 Things That Hurt
- Obscure ranking methods
  - Unpredictable effects of adding or deleting terms
  - Only single-term queries avoid this problem
- Counterintuitive statistics (see the sketch after this list)
  - "clis": AltaVista says 3,882 docs match the query
  - "clis library": ,025 docs match the query!
    - Every document with either term was counted
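
A minimal sketch of why the two-term count can exceed the one-term count (the document sets here are invented for illustration, not the actual AltaVista data): if the engine counts documents containing any query term (OR semantics), adding a term can only grow the count, which is not what most users expect.

docs = [
    {"clis", "library", "umd"},
    {"library", "books"},
    {"clis", "research"},
    {"weather"},
]

def count_or(query_terms, docs):
    # A document "matches" if it contains ANY query term (OR semantics).
    return sum(1 for d in docs if any(t in d for t in query_terms))

def count_and(query_terms, docs):
    # A document matches only if it contains ALL query terms (AND semantics).
    return sum(1 for d in docs if all(t in d for t in query_terms))

print(count_or({"clis"}, docs))              # 2
print(count_or({"clis", "library"}, docs))   # 3 -- larger: counterintuitive
print(count_and({"clis", "library"}, docs))  # 1 -- what users often expect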

27 Browsing Retrieved Set
(Diagram: the User's Query Formulation produces a Query for Search, which returns a Ranked List; the user then cycles through Document Selection, Document Examination, Query Reformulation, and Document Reselection.)

28 Indicative vs. Informative
- Terms often applied to document abstracts
  - Indicative abstracts support selection: they describe the contents of a document
  - Informative abstracts support understanding: they summarize the contents of a document
- Applies to any information presentation
  - Presented for indicative or informative purposes

29 User's Browsing Goals
- Identify documents for some form of delivery
  - An indicative purpose
- Query enrichment (see the sketch after this list)
  - Relevance feedback (indicative)
    - The user designates "more like this" documents
    - The system adds terms from those documents to the query
  - Manual reformulation (informative)
    - A better approximation of the visceral information need
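
A minimal sketch of the relevance-feedback step above, in the spirit of Rocchio-style expansion (the helper name expand_query and the simple term-frequency weighting are assumptions for illustration, not the slides' method): terms from the user's "more like this" documents are appended to the query.

from collections import Counter

def expand_query(query_terms, liked_docs, top_k=3):
    """Add the most frequent new terms from user-designated relevant
    documents to the query (a toy, Rocchio-flavored expansion)."""
    counts = Counter()
    for doc in liked_docs:
        counts.update(t for t in doc.lower().split() if t not in query_terms)
    new_terms = [t for t, _ in counts.most_common(top_k)]
    return list(query_terms) + new_terms

liked = ["classic romantic films set in Casablanca",
         "romantic films of the 1940s and their stars"]
print(expand_query(["romance"], liked))
# e.g. ['romance', 'romantic', 'films', ...]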

30 System's Goals
Assist the user to:
- identify relevant documents
- identify potentially useful terms
  - for clarifying the right information need
  - for generating better queries

31 Browsing Retrieved Set
(Same diagram as slide 27: Query Formulation → Query → Search → Ranked List, with Document Selection, Document Examination, Query Reformulation, and Document Reselection forming the browsing loop.)

32 A Selection Interface Taxonomy
- One-dimensional lists
  - Content: title, source, date, summary, ratings, ...
  - Order: retrieval status value, date, alphabetic, ...
  - Size: scrolling, specified number, RSV threshold
- Two-dimensional displays
  - Construction: clustering, starfields, projection
  - Navigation: jump, pan, zoom
- Three-dimensional displays
  - Contour maps, fishtank VR, immersive VR

33 Extraction-Based Summarization
- Robust technique for making (disfluent) summaries
- Four broad types:
  - Single-document vs. multi-document
  - Term-oriented vs. sentence-oriented
- Combination of evidence for selection (see the sketch after this list):
  - Salience: similarity to the query
  - Selectivity: IDF or chi-squared
  - Emphasis: title, first sentence
- For multi-document, suppress duplication
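
A minimal sketch of the evidence combination described above (the weights and helper names such as score_sentence are invented for illustration): each sentence is scored on query similarity, term selectivity, and positional emphasis, and the top scorers are extracted in their original order.

def score_sentence(sentence, position, query_terms, idf):
    words = sentence.lower().split()
    salience = sum(1 for w in words if w in query_terms)                     # similarity to the query
    selectivity = sum(idf.get(w, 0.0) for w in words) / max(len(words), 1)  # favor rare terms
    emphasis = 1.0 if position == 0 else 0.0                                # first-sentence bonus
    return 2.0 * salience + 1.0 * selectivity + 1.5 * emphasis

def extractive_summary(sentences, query_terms, idf, k=2):
    order = sorted(range(len(sentences)),
                   key=lambda i: score_sentence(sentences[i], i, query_terms, idf),
                   reverse=True)
    return [sentences[i] for i in sorted(order[:k])]  # keep original order

sents = ["Interactive IR supports query formulation.",
         "The weather was pleasant in College Park.",
         "Users reformulate queries after examining documents."]
idf = {"query": 1.5, "queries": 1.5, "formulation": 1.2, "weather": 2.0}
print(extractive_summary(sents, {"query", "formulation"}, idf, k=1))
# ['Interactive IR supports query formulation.']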

34 Generated Summaries
- Fluent summaries for a specific domain
- Define a knowledge structure for the domain
  - Frames are commonly used
- Analysis: process documents to fill the structure
  - Studied separately as "information extraction"
- Compression: select which facts to retain
- Generation: create fluent summaries
  - Templates for initial candidates
  - Use a language model to select an alternative

35 Google’s KWIC Summary For Query “University of Maryland College Park”

36 Teoma’s Query Refine Suggestions
url:

37 Vivisimo’s Clustering Results
url: vivisimo.com

38 Kartoo’s Cluster Visualization
url: kartoo.com

39 Cluster Formation
- Based on inter-document similarity
  - Computed using the cosine measure, for example
- Heuristic methods can be fairly efficient (see the sketch after this list)
  - Pick any document as the first cluster "seed"
  - Add the most similar document to each cluster
    - Adding the same document will join two clusters
  - Check to see whether each cluster should be split
    - Does it contain two or more fairly coherent groups?
- Lots of variations on this have been tried
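
A minimal sketch of one such heuristic — a simplified single-pass variant, not the exact seed/join/split procedure on the slide — using cosine similarity over term-count vectors (all names and the 0.3 threshold are assumptions for illustration).

import math
from collections import Counter

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def single_pass_cluster(docs, threshold=0.3):
    """Assign each document to the most similar existing cluster seed,
    or start a new cluster if nothing is similar enough."""
    vectors = [Counter(d.lower().split()) for d in docs]
    clusters = []  # each cluster: {"seed": vector, "members": [doc indices]}
    for i, v in enumerate(vectors):
        best, best_sim = None, 0.0
        for c in clusters:
            sim = cosine(v, c["seed"])
            if sim > best_sim:
                best, best_sim = c, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(i)
        else:
            clusters.append({"seed": v, "members": [i]})
    return [c["members"] for c in clusters]

docs = ["star wars space opera",
        "space opera with star wars themes",
        "cooking pasta at home"]
print(single_pass_cluster(docs))  # [[0, 1], [2]]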

40 Starfield

41 Dynamic Queries: IVEE/Spotfire/FilmFinder (Ahlberg & Shneiderman 93)

42 Constructing Starfield Displays
- Two attributes determine the position (see the sketch after this list)
  - Can be dynamically selected from a list
  - Numeric position attributes work best: date, length, rating, ...
- Other attributes can affect the display
  - Displayed as color, size, shape, orientation, ...
- Each point can represent a cluster
- Interactively specified using "dynamic queries"
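
A minimal sketch of this mapping (matplotlib is used for illustration; the records and field choices are invented): two numeric attributes give x/y position, a third is mapped to color, and a fourth to point size.

import matplotlib.pyplot as plt

# Hypothetical result records: (year, length, rating, cluster_size)
records = [(1993, 12, 4.5, 3), (1998, 30, 3.0, 1), (2001, 8, 5.0, 7),
           (2004, 22, 2.5, 2), (1996, 15, 4.0, 5)]

years   = [r[0] for r in records]        # x position
lengths = [r[1] for r in records]        # y position
ratings = [r[2] for r in records]        # mapped to color
sizes   = [20 * r[3] for r in records]   # mapped to point size (cluster size)

plt.scatter(years, lengths, c=ratings, s=sizes, cmap="viridis")
plt.xlabel("Date")
plt.ylabel("Length")
plt.colorbar(label="Rating")
plt.title("Toy starfield: position from two attributes, color/size from others")
plt.show()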

43 Projection
- Depict many numeric attributes in 2 dimensions (see the sketch after this list)
  - While preserving important spatial relationships
- Typically based on the vector space model
  - Which has about 100,000 numeric attributes!
- Approximates multidimensional scaling
  - Heuristic approaches are reasonably fast
- Often visualized as a starfield
  - But the dimensions lack any particular meaning
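
A minimal sketch of projecting vector-space document representations down to two dimensions (scikit-learn's TF-IDF and truncated SVD stand in for the heuristic MDS approximations the slide mentions; the documents are invented).

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

docs = [
    "query formulation and relevance feedback",
    "document clustering with cosine similarity",
    "starfield displays and dynamic queries",
    "relevance feedback improves query formulation",
]

# Vector space model: one numeric attribute per vocabulary term.
tfidf = TfidfVectorizer().fit_transform(docs)

# Project the high-dimensional vectors to 2-D for a starfield-style plot.
coords = TruncatedSVD(n_components=2).fit_transform(tfidf)
for doc, (x, y) in zip(docs, coords):
    print(f"({x:+.2f}, {y:+.2f})  {doc}")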

44 Contour Map Displays
- Display cluster density as terrain elevation (see the sketch after this list)
  - Fit a smooth opaque surface to the data
- Visualize in three dimensions
  - Project onto 2-D and allow manipulation
  - Use stereo glasses to create a virtual "fishtank"
  - Create an immersive virtual reality experience
    - Head-mounted stereo monitors and head tracking
    - "Cave" with wall projection and body tracking
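
A minimal sketch of the density-as-elevation idea (the 2-D document coordinates are invented; numpy/matplotlib are used for illustration): bin the projected points into a grid and draw the per-cell counts as filled contours.

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Invented 2-D coordinates for documents in three loose clusters.
points = np.vstack([rng.normal(loc, 0.3, size=(100, 2))
                    for loc in ((0, 0), (2, 1), (1, 3))])

# Cluster density on a grid becomes the "terrain elevation".
density, xedges, yedges = np.histogram2d(points[:, 0], points[:, 1], bins=30)
xc = (xedges[:-1] + xedges[1:]) / 2
yc = (yedges[:-1] + yedges[1:]) / 2

plt.contourf(xc, yc, density.T, levels=10, cmap="terrain")
plt.colorbar(label="Documents per cell")
plt.title("Toy contour map: document density as elevation")
plt.show()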

45 ThemeView Credit to: Pacific Northwest National Laboratory

46 Browsing Retrieved Set
(Same diagram as slides 27 and 31: Query Formulation → Query → Search → Ranked List, with Document Selection, Document Examination, Query Reformulation, and Document Reselection forming the browsing loop.)

47 Full-Text Examination Interfaces
- Most use scroll and/or jump navigation
  - Some experiments with zooming
- Long documents need special features
  - A "best passage" function helps users get started (see the sketch after this list)
    - Overlapping 300-word passages work well
  - A "next search term" function facilitates browsing
- Integrated functions for relevance feedback
  - Passage selection, query term weighting, ...
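
A minimal sketch of a "best passage" function as described above (the 300-word window and overlap follow the slide; scoring passages by query-term hits is an assumption for illustration).

def best_passage(text, query_terms, window=300, stride=150):
    """Return the overlapping window-sized passage with the most query-term hits."""
    words = text.split()
    query_terms = {t.lower() for t in query_terms}
    best_start, best_score = 0, -1
    for start in range(0, max(len(words) - window, 0) + 1, stride):
        passage = words[start:start + window]
        score = sum(1 for w in passage if w.lower().strip(".,;:!?") in query_terms)
        if score > best_score:
            best_start, best_score = start, score
    return " ".join(words[best_start:best_start + window])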

48 A Long Document

49 Document Lens (Robertson & Mackinlay, UIST '93, Atlanta, 1993)

50 TileBars [Hearst et al. 95]

51 SeeSoft [Eick 94]

52 Things That Help
- Show the query in the selection interface
  - It provides context for the display
- Explain what the system has done
  - It is hard to control a tool you don't understand
  - Highlight search terms, for example
- Complement what the system has done
  - Users add value by doing things the system can't
  - Expose the information users need to judge utility

53 Document Delivery
(Diagram: the User's Document Examination yields a Document, which passes to Document Delivery.)

54 Delivery Modalities
- On-screen viewing
  - Good for hypertext, multimedia, cut-and-paste, ...
- Printing
  - Better resolution, portability, annotations, ...
- Fax-on-demand
  - Really just another way to get to a printer
- Synthesized speech
  - Useful for telephone and hands-free applications

55 Take-Away Messages
- The IR process belongs to users
  - Matching documents to a query is only part of the whole IR process
- But IR systems can help users
- IR systems need to support
  - Query formulation/reformulation
  - Document selection/examination

56 Two Minute Paper
When examining documents in the selection and examination interfaces, which type of information need (visceral, conscious, formalized, or compromised) guides the user's decisions? Please justify your answer.
What was the muddiest point in today's lecture?

57 Alternate Query Modalities
- Spoken queries
  - Used for telephone and hands-free applications
  - Reasonable performance with limited vocabularies
  - But some error-correction method must be included
- Handwritten queries
  - Palm Pilot Graffiti, touch screens, ...
  - Fairly effective if some form of shorthand is used
  - Ordinary handwriting often has too much ambiguity

