2003.12.02 - SLIDE 1IS 202 – FALL 2003 Lecture 23: Interfaces for Information Retrieval II Prof. Ray Larson & Prof. Marc Davis UC Berkeley SIMS Tuesday.


1 2003.12.02 - SLIDE 1IS 202 – FALL 2003 Lecture 23: Interfaces for Information Retrieval II Prof. Ray Larson & Prof. Marc Davis UC Berkeley SIMS Tuesday and Thursday 10:30 am - 12:00 pm Fall 2003 http://www.sims.berkeley.edu/academics/courses/is202/f03/ SIMS 202: Information Organization and Retrieval

2 2003.12.02 - SLIDE 2IS 202 – FALL 2003 Lecture Overview Review of Last Time –Introduction to HCI –Why Interfaces Don’t Work –Early Visions: Memex Interfaces for Information Retrieval II Discussion Questions Action Items for Next Time Credit for some of the slides in this lecture goes to Marti Hearst and Warren Sack

3 2003.12.02 - SLIDE 3IS 202 – FALL 2003 Lecture Overview Review of Last Time –Introduction to HCI –Why Interfaces Don’t Work –Early Visions: Memex Interfaces for Information Retrieval II Discussion Questions Action Items for Next Time Credit for some of the slides in this lecture goes to Marti Hearst and Warren Sack

4 2003.12.02 - SLIDE 4IS 202 – FALL 2003 “Drawing the Circles”

5 2003.12.02 - SLIDE 5 Human-Computer Interaction (HCI) Human –The end-users of a program –The others in the organization –The designers of the program Computer –The machines the programs run on Interaction –The users tell the computers what they want –The computers communicate results –The computer may also tell users what the computer wants them to do

6 2003.12.02 - SLIDE 6IS 202 – FALL 2003 Shneiderman’s Design Principles Provide informative feedback Permit easy reversal of actions Support an internal locus of control Reduce working memory load Provide alternative interfaces for expert and novice users

7 2003.12.02 - SLIDE 7IS 202 – FALL 2003 HCI for IR Information seeking is an imprecise process UI should aid users in understanding and expressing their information needs –Help formulate queries –Select among available information sources –Understand search results –Keep track of the progress of their search

8 2003.12.02 - SLIDE 8 How to Design and Build UIs Task analysis Rapid prototyping Evaluation Implementation Design Prototype Evaluate Iterate at every stage!

9 2003.12.02 - SLIDE 9IS 202 – FALL 2003 Evaluation Techniques Qualitative vs. quantitative methods Qualitative (non-numeric, discursive, ethnographic) –Focus groups –Interviews –Surveys –User observation –Participatory design sessions Quantitative (numeric, statistical, empirical) –User testing –System testing

10 2003.12.02 - SLIDE 10IS 202 – FALL 2003 Why Interfaces Don’t Work Because… –We still think of using the interface –We still talk of designing the interface –We still talk of improving the interface “We need to aid the task, not the interface to the task.” “The computer of the future should be invisible.”

11 2003.12.02 - SLIDE 11IS 202 – FALL 2003 “What Dr. Bush Foresees”
Cyclops Camera: Worn on forehead, it would photograph anything you see and want to record. Film would be developed at once by dry photography.
Microfilm: It could reduce Encyclopaedia Britannica to volume of a matchbox. Material cost: 5¢. Thus a whole library could be kept in a desk.
Vocoder: A machine which could type when talked to. But you might have to talk a special phonetic language to this mechanical supersecretary.
Thinking machine: A development of the mathematical calculator. Give it premises and it would pass out conclusions, all in accordance with logic.
Memex: An aid to memory. Like the brain, Memex would file material by association. Press a key and it would run through a “trail” of facts.

12 2003.12.02 - SLIDE 12IS 202 – FALL 2003 Interaction Paradigms for IR Direct manipulation –Query specification –Query refinement –Result selection Delegation –Agents –Recommender systems –Filtering

13 2003.12.02 - SLIDE 13IS 202 – FALL 2003 Lecture Overview Review of Last Time –Introduction to HCI –Why Interfaces Don’t Work –Early Visions: Memex Interfaces for Information Retrieval II Discussion Questions Action Items for Next Time Credit for some of the slides in this lecture goes to Marti Hearst and Warren Sack

14 2003.12.02 - SLIDE 14IS 202 – FALL 2003 HCI For IR Browsing –Visualizing collections and documents –Navigating collections and documents Searching –Formulating queries –Visualizing results –Navigating results –Refining queries –Selecting results

15 2003.12.02 - SLIDE 15IS 202 – FALL 2003 Information Visualization Utility –Inherently visual data –Making the abstract concrete –Making the invisible visible Techniques –Icons –Color highlighting –Brushing and linking –Panning and zooming –Focus-plus-context –Magic lenses –Animation

16 2003.12.02 - SLIDE 16IS 202 – FALL 2003 Mapping Logical structure of the information –Hierarchy –Rank –Proximity –Similarity distance –Term frequency –History of changes –Etc. Perceptual representation of the information –Outlines, trees, graphs –Color, size, shape, distance –Symbolic icons –Animation, interaction –Etc.

17 2003.12.02 - SLIDE 17IS 202 – FALL 2003 Task = Information Access
The standard interaction model for information access
1) Start with an information need
2) Select a system and collections to search on
3) Formulate a query
4) Send the query to the system
5) Receive the results
6) Scan, evaluate, and interpret the results
7) Stop, or
8) Reformulate the query and go to Step 4
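The standard interaction model can be read as a simple loop. The sketch below is only an illustration of that loop, not anything from the lecture: the toy collection, `toy_search`, `drop_leading_terms`, and `run_session` are all hypothetical names, and "reformulate" is reduced to dropping the first query term on each round.

```python
# Toy collection standing in for "a system and collections to search on" (step 2).
COLLECTION = {
    "d1": "database reliability and recovery",
    "d2": "user interface design principles",
    "d3": "interfaces for information retrieval",
}


def toy_search(query):
    """Steps 4-5: return ids of documents containing every query term."""
    terms = query.lower().split()
    return [
        doc_id
        for doc_id, text in COLLECTION.items()
        if all(term in text.split() for term in terms)
    ]


def drop_leading_terms(need, attempt):
    """Steps 3 and 8 (hypothetical strategy): drop one more leading term per round."""
    words = need.split()
    return " ".join(words[min(attempt, len(words) - 1):])


def run_session(need, formulate, search, is_satisfied, max_rounds=5):
    """Loop over formulate / send / receive / evaluate until the user stops."""
    history = []
    for attempt in range(max_rounds):
        query = formulate(need, attempt)   # step 3, or step 8 on later rounds
        results = search(query)            # steps 4 and 5
        history.append((query, results))
        if is_satisfied(results):          # steps 6 and 7
            break
    return history


if __name__ == "__main__":
    for query, results in run_session(
        "search interface retrieval", drop_leading_terms, toy_search, bool
    ):
        print(repr(query), results)
```

Here the "user" is satisfied as soon as any document comes back; a real session would replace `is_satisfied` with a human judgement of the result list.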

18 2003.12.02 - SLIDE 18IS 202 – FALL 2003 HCI Questions for IR Where does a user start? –Faced with a large set of collections, how can a user choose one to begin with? How will a user formulate a query? How will a user scan, evaluate, and interpret the results? How can a user reformulate a query?

19 2003.12.02 - SLIDE 19IS 202 – FALL 2003 HCI for IR: Collection Selection Question 1: Where does the user start?

20 2003.12.02 - SLIDE 20IS 202 – FALL 2003 Starting Points for Search Faced with a prompt or an empty entry form … how to start? –Lists of sources –Overviews Clusters Category Hierarchies/Subject Codes Co-citation links –Examples, Wizards, and Guided Tours –Automatic source selection

21 2003.12.02 - SLIDE 21IS 202 – FALL 2003 List of Sources Have to guess based on the name Requires prior exposure/experience

22 2003.12.02 - SLIDE 22IS 202 – FALL 2003 Old Lexis-Nexis Interface

23 2003.12.02 - SLIDE 23IS 202 – FALL 2003 Overviews Supervised (manual) category overviews –Yahoo! –HiBrowse –MeSHBrowse Unsupervised (automated) groupings –Clustering –Kohonen feature maps

24 2003.12.02 - SLIDE 24IS 202 – FALL 2003 Yahoo! Interface

25 2003.12.02 - SLIDE 25IS 202 – FALL 2003 Summary: Category Labels Advantages –Interpretable –Capture summary information –Describe multiple facets of content –Domain dependent, and so descriptive Disadvantages –Do not scale well (for organizing documents) –Domain dependent, so costly to acquire –May mismatch users’ interests

26 2003.12.02 - SLIDE 26IS 202 – FALL 2003 Text Clustering
What clustering does
–Finds overall similarities among groups of documents
–Finds overall similarities among groups of tokens
–Picks out some themes, ignores others
How clustering works
–Cluster entire collection
–Find cluster centroid that best matches the query
–Problems with clustering: it is expensive, and it doesn’t work well
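A minimal sketch of the "find cluster centroid that best matches the query" step follows. It assumes the collection has already been clustered (the cluster assignments below are made up), represents documents as raw term-count vectors, and uses cosine similarity as the matching function; the slide does not specify any of these choices, and all function names are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts (no stemming or stopword removal)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(docs):
    """Average term-count vector of the documents in one cluster."""
    total = Counter()
    for doc in docs:
        total.update(vectorize(doc))
    return {term: count / len(docs) for term, count in total.items()}


def best_cluster(query, clusters):
    """Return the name of the cluster whose centroid is closest to the query."""
    q = vectorize(query)
    return max(clusters, key=lambda name: cosine(q, centroid(clusters[name])))


if __name__ == "__main__":
    # Assumed, hand-made clusters; a real system would compute these.
    clusters = {
        "databases": ["database systems reliability", "database recovery logs"],
        "interfaces": ["user interface design", "search interface evaluation"],
    }
    print(best_cluster("interface design", clusters))
```

Recomputing every centroid per query, as done here for brevity, is one reason clustering is expensive; a real system would precompute and cache them.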

27 2003.12.02 - SLIDE 27IS 202 – FALL 2003 Scatter/Gather Interface

28 2003.12.02 - SLIDE 28IS 202 – FALL 2003 “ThemeScapes” Clustering

29 2003.12.02 - SLIDE 29IS 202 – FALL 2003 Kohonen Feature Maps on Text

30 2003.12.02 - SLIDE 30IS 202 – FALL 2003 Summary: Clustering Advantages –Get an overview of main themes –Domain independent Disadvantages –Many of the ways documents could group together are not shown –Not always easy to understand what they mean –Can’t see what documents are about –Documents may be forced into one position in semantic space –Hard to view titles Perhaps more suited for pattern discovery –Problem: often only one view on the space

31 2003.12.02 - SLIDE 31IS 202 – FALL 2003 HCI for IR: Query Formulation Question 2: How will a user formulate a query?

32 2003.12.02 - SLIDE 32IS 202 – FALL 2003 Query Specification Interaction styles (Shneiderman 97) –Command language –Form fill –Menu selection –Direct manipulation –Natural language What about gesture, eye-tracking, or implicit inputs like reading habits?

33 2003.12.02 - SLIDE 33IS 202 – FALL 2003 Command-Based Query Specification COMMAND ATTRIBUTE value CONNECTOR … –FIND PA shneiderman AND TW interface What are the ATTRIBUTE names? What are the COMMAND names? What are allowable values?
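To make the COMMAND ATTRIBUTE value CONNECTOR pattern concrete, here is a small parser sketch. It is an assumption-laden illustration, not the grammar of any real catalog system: the connector set (AND, OR, NOT) is assumed, values are limited to a single word, and the attribute codes PA and TW are simply taken from the slide's example without interpreting them.

```python
# Assumed connector vocabulary; the slide only shows AND.
CONNECTORS = {"AND", "OR", "NOT"}


def parse_command_query(text):
    """Parse 'COMMAND ATTR value [CONNECTOR ATTR value]...' into a structure.

    Returns (command, clauses), where each clause is
    (connector_or_None, attribute, value).
    """
    tokens = text.split()
    if not tokens:
        raise ValueError("empty query")
    command, rest = tokens[0].upper(), tokens[1:]

    clauses = []
    connector = None
    i = 0
    while i < len(rest):
        if i + 1 >= len(rest):
            raise ValueError("expected an ATTRIBUTE value pair")
        attribute, value = rest[i].upper(), rest[i + 1]
        clauses.append((connector, attribute, value))
        i += 2
        if i < len(rest):
            connector = rest[i].upper()
            if connector not in CONNECTORS:
                raise ValueError(f"unknown connector: {rest[i]}")
            i += 1
            if i >= len(rest):
                raise ValueError("query ends with a dangling connector")
    return command, clauses


if __name__ == "__main__":
    print(parse_command_query("FIND PA shneiderman AND TW interface"))
```

Even this tiny parser shows the usability problem the slide raises: the user must already know the command names, the attribute codes, and the allowable values before typing anything.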

34 2003.12.02 - SLIDE 34IS 202 – FALL 2003 Form-Based Query Specification

35 2003.12.02 - SLIDE 35IS 202 – FALL 2003 Form-Based Query Specification

36 2003.12.02 - SLIDE 36IS 202 – FALL 2003 Direct Manipulation Query Specification

37 2003.12.02 - SLIDE 37IS 202 – FALL 2003 Menu-Based Query Specification

38 2003.12.02 - SLIDE 38IS 202 – FALL 2003 Natural Language Query AskJeeves –http://www.ask.com/

39 2003.12.02 - SLIDE 39IS 202 – FALL 2003 HCI for IR: Viewing Results Question 3: How will a user scan, evaluate, and interpret the results?

40 2003.12.02 - SLIDE 40IS 202 – FALL 2003 Display of Retrieval Results Goal –Minimize time/effort for deciding which documents to examine in detail Idea –Show the roles of the query terms in the retrieved documents, making use of document structure

41 2003.12.02 - SLIDE 41IS 202 – FALL 2003 Putting Results in Context Interfaces should –Give hints about the roles terms play in the collection –Give hints about what will happen if various terms are combined –Show explicitly why documents are retrieved in response to the query –Summarize compactly the subset of interest

42 2003.12.02 - SLIDE 42IS 202 – FALL 2003 Putting Results in Context Visualizations of query term distribution –KWIC, TileBars, SeeSoft, Virtual Shakespeare Visualizing shared subsets of query terms –InfoCrystal, VIBE Table of contents as context –SuperBook, Cha-Cha

43 2003.12.02 - SLIDE 43IS 202 – FALL 2003 KWIC (Keyword in Context)
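The slide itself was a screenshot, so as a rough illustration of the idea: a KWIC display shows each occurrence of a query term together with a few words on either side. The sketch below is a minimal version under simple assumptions (whitespace tokenization, case-insensitive exact match, a fixed window of `width` words); the function name `kwic` is hypothetical.

```python
def kwic(text, keyword, width=3):
    """Return (left context, keyword as written, right context) for each hit."""
    words = text.split()
    target = keyword.lower()
    hits = []
    for i, word in enumerate(words):
        # Strip surrounding punctuation before comparing.
        if word.strip('.,;:!?"\'').lower() == target:
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            hits.append((left, word, right))
    return hits


if __name__ == "__main__":
    sentence = "We need to aid the task, not the interface to the task."
    for left, word, right in kwic(sentence, "task", width=2):
        # Right-align the left context so the keywords line up in a column.
        print(f"{left:>20} [{word}] {right}")
```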

44 2003.12.02 - SLIDE 44IS 202 – FALL 2003 TileBars Graphical representation of term distribution and overlap Simultaneously indicate –Relative document length –Query term frequencies –Query term distributions –Query term overlap
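The data behind a TileBar-style display can be sketched as one row per query term and one column per document segment, with each cell holding the term's frequency in that segment. The version below is a simplification made for illustration: it cuts the document into fixed-size word windows (`segment_size` is an assumed parameter) instead of topical segments, and renders cells as text shades instead of gray tiles. The names `tilebar` and `render` are hypothetical.

```python
def tilebar(text, query_terms, segment_size=5):
    """Map each query term to its per-segment frequency counts."""
    words = [w.strip('.,;:!?"\'').lower() for w in text.split()]
    segments = [
        words[i:i + segment_size] for i in range(0, len(words), segment_size)
    ]
    return {
        term: [segment.count(term.lower()) for segment in segments]
        for term in query_terms
    }


def render(grid):
    """Draw the grid as text: darker characters mean higher frequency."""
    shades = " .:#"
    lines = []
    for term, row in grid.items():
        cells = "".join(shades[min(count, len(shades) - 1)] for count in row)
        lines.append(f"{term:<12}|{cells}|")
    return "\n".join(lines)


if __name__ == "__main__":
    doc = (
        "DBMS reliability depends on logging. A DBMS must recover after a "
        "crash. Banking systems care about reliability more than most."
    )
    print(render(tilebar(doc, ["dbms", "reliability"])))
```

The four properties listed on the slide map onto this grid directly: the number of columns reflects relative document length, cell values give term frequency, the pattern along a row gives distribution, and columns where several rows are non-zero show overlap.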

45 2003.12.02 - SLIDE 45IS 202 – FALL 2003 TileBars Example
Query terms: DBMS (Database Systems), Reliability. What roles do they play in retrieved documents?
–Mainly about both DBMS & reliability
–Mainly about DBMS, discusses reliability
–Mainly about, say, banking, with a subtopic discussion on DBMS/Reliability
–Mainly about high-tech layoffs

46 2003.12.02 - SLIDE 46IS 202 – FALL 2003 TileBars Example

47 2003.12.02 - SLIDE 47IS 202 – FALL 2003 SeeSoft (Eick & Wills 95)

48 2003.12.02 - SLIDE 48IS 202 – FALL 2003 David Small: Virtual Shakespeare

49 2003.12.02 - SLIDE 49IS 202 – FALL 2003 Other Approaches Show how often each query term occurs in sets of retrieved documents –VIBE (Korfhage ‘91) –InfoCrystal (Spoerri ‘94)
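One way to picture the data these displays work from is to group the retrieved documents by which subset of the query terms each one contains, then count each group. The sketch below does only that counting, under simple assumptions (whitespace tokenization, exact term match); it is not the layout algorithm of VIBE or InfoCrystal, and `term_subset_counts` is a hypothetical name.

```python
from collections import Counter


def term_subset_counts(docs, query_terms):
    """Count retrieved documents per exact subset of query terms present."""
    counts = Counter()
    for text in docs:
        words = set(text.lower().split())
        present = frozenset(t for t in query_terms if t.lower() in words)
        counts[present] += 1
    return counts


if __name__ == "__main__":
    retrieved = [
        "dbms reliability report",
        "dbms logging overview",
        "banking layoffs",
    ]
    for subset, n in term_subset_counts(retrieved, ["dbms", "reliability"]).items():
        label = " + ".join(sorted(subset)) or "(no query terms)"
        print(f"{label}: {n}")
```

With n query terms there are 2^n possible subsets, so the number of groups to display grows quickly as terms are added.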

50 2003.12.02 - SLIDE 50IS 202 – FALL 2003 VIBE (Olson et al. 93, Korfhage 93)

51 2003.12.02 - SLIDE 51IS 202 – FALL 2003 InfoCrystal (Spoerri 94)

52 2003.12.02 - SLIDE 52IS 202 – FALL 2003 Problems with InfoCrystal Can’t see proximity or frequency of terms within documents Quantities not represented graphically More than 4 terms hard to handle No help in selecting terms to begin with

53 2003.12.02 - SLIDE 53IS 202 – FALL 2003 Cha-Cha (Chen & Hearst 98) Shows a “Table-Of-Contents”-like view, like SuperBook Focus+Context using hyperlinks to create the TOC Integrates Web Site structure navigation with search

54 2003.12.02 - SLIDE 54IS 202 – FALL 2003 HCI for IR: Query Reformulation Question 4: How can a user reformulate a query?

55 2003.12.02 - SLIDE 55IS 202 – FALL 2003 Query Reformulation Thesaurus expansion –Suggest terms similar to query terms Relevance feedback –Suggest terms (and documents) similar to retrieved documents that have been judged to be relevant –“More like this” interaction

56 2003.12.02 - SLIDE 56IS 202 – FALL 2003 Relevance Feedback
Modify existing query based on relevance judgements
–Extract terms from relevant documents and add them to the query
–And/or re-weight the terms already in the query
Two main approaches
–Automatic (pseudo-relevance feedback)
–Users select relevant documents
  Users/system select terms from an automatically generated list
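Both operations on this slide (adding terms from relevant documents and re-weighting existing query terms) can be sketched with a Rocchio-style update. Note that the slide does not name Rocchio; it is used here as one common formulation, with only the positive (relevant-document) part. The weights `alpha` and `beta`, the `max_new_terms` cutoff, and the raw term-count weighting are all assumptions for illustration.

```python
from collections import Counter


def relevance_feedback(query_weights, relevant_docs,
                       alpha=1.0, beta=0.75, max_new_terms=3):
    """Rocchio-style update using only documents judged relevant.

    query_weights: dict mapping query term -> weight.
    relevant_docs: list of document texts the user (or system) judged relevant.
    """
    if not relevant_docs:
        return dict(query_weights)

    # Average term-count vector of the relevant documents.
    totals = Counter()
    for doc in relevant_docs:
        totals.update(doc.lower().split())
    centroid = {t: c / len(relevant_docs) for t, c in totals.items()}

    # Re-weight the terms already in the query.
    updated = {
        term: alpha * weight + beta * centroid.get(term, 0.0)
        for term, weight in query_weights.items()
    }

    # Add the strongest terms that are not yet in the query.
    candidates = sorted(
        (t for t in centroid if t not in query_weights),
        key=lambda t: (-centroid[t], t),
    )
    for term in candidates[:max_new_terms]:
        updated[term] = beta * centroid[term]
    return updated


if __name__ == "__main__":
    judged_relevant = ["search interface design", "interface design evaluation"]
    print(relevance_feedback({"interface": 1.0}, judged_relevant, max_new_terms=1))
```

In the pseudo-relevance variant, `relevant_docs` would simply be the top-ranked results; in the user-driven variants, the candidate list would be shown to the user for selection before any term is added.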

57 2003.12.02 - SLIDE 57IS 202 – FALL 2003 Revealing Internals Opaque (black box) –(Like web search engines) Transparent –(See used terms after Relevance Feedback ) Penetrable –(Choose suggested terms before Relevance Feedback ) Which do you think worked best?

58 2003.12.02 - SLIDE 58IS 202 – FALL 2003 Effectiveness Results Subjects using Relevance Feedback showed 17% - 34% better performance than without Relevance Feedback Subjects with penetration case did 15% better as a group than those in opaque and transparent cases

59 2003.12.02 - SLIDE 59IS 202 – FALL 2003 Summary: Relevance Feedback Iterative query modification can improve precision and recall for a standing query In at least one study, users were able to make good choices by seeing which terms were suggested for Relevance Feedback and selecting among them So … “more like this” can be useful!

60 2003.12.02 - SLIDE 60IS 202 – FALL 2003 Summary: HCI for IR Focus on the task, not the tool Be aware of –User abilities and differences –Prior work and innovations –Design guidelines and rules-of-thumb Iterate, iterate, iterate It is very difficult to design good UIs It is very difficult to evaluate search UIs Better interfaces in future should produce better IR experiences

61 2003.12.02 - SLIDE 61IS 202 – FALL 2003 Lecture Overview Review of Last Time –Introduction to HCI –Why Interfaces Don’t Work –Early Visions: Memex Interfaces for Information Retrieval II Discussion Questions Action Items for Next Time Credit for some of the slides in this lecture goes to Marti Hearst and Warren Sack

62 2003.12.02 - SLIDE 62IS 202 – FALL 2003 Discussion Questions Arthur Law on Interfaces for IR –Using visualization in web information retrieval revealed poor results for navigation. However, this study was conducted in 1998. Are people more accustomed to these tools now with websites such as "http://www.smartmoney.com/marketmap/"? Perhaps this method of navigation will be better for the computer generation and their higher comfort level for using the web.

63 2003.12.02 - SLIDE 63IS 202 – FALL 2003 Discussion Questions Arthur Law on Interfaces for IR –There are various examples of command line approaches and visual approaches. Individuals perform differently with each method so will the next step involve combining these methods to optimize each person's task of information retrieval? Or will a dominant company, i.e., LexisNexis or Google enforce one method of doing queries?

64 2003.12.02 - SLIDE 64IS 202 – FALL 2003 Discussion Questions Paul Laskowski on Interfaces for IR –MIR describes at least six sources of contextual information for the documents returned by a query: metadata, term scores, location of terms in each document, combinations of terms present in each document, tables of contents, and hyperlink structure. Which of these sources provides the most help for selecting relevant documents (or does it depend on the task)? Which types of context can help with reformulating a query? In the case of the location of terms, several tools are listed that graphically show where terms are placed in each document. I imagine using this to select documents where the terms appear in the same paragraph. Should this process be automated so that documents score higher when the search terms are near to each other? In what other ways might I use this information?

65 2003.12.02 - SLIDE 65IS 202 – FALL 2003 Discussion Questions Brooke Maury on Interfaces for IR –In chapter 10.7, Hearst discusses an application developed by Kozierok and Maes that keeps track of a user’s activities and makes recommendations based on previous action or situations. What impact does this “assistant/agent” application have on privacy? Is this too heavy a price to pay for achieving a positive human computer exchange or a more successful retrieval? If a system is charged with “looking over the shoulder” of a user, is there an ethical imperative to encrypt that information or otherwise provide safeguards against the misuse or abuse of that information?

66 2003.12.02 - SLIDE 66IS 202 – FALL 2003 Discussion Questions Brooke Maury on Interfaces for IR –The study by Koenemann and Belkin suggests that the most effective systems will allow users total control and access to what information is used for decision-making (They call such applications ‘penetrable.’). The system developed by Kozierok & Maes makes a number of important decisions without input from the user. Should K & M’s application be more ‘penetrable’?

67 2003.12.02 - SLIDE 67IS 202 – FALL 2003 Discussion Questions Dan Perkel on Interfaces for IR –While the web "has suddenly made vast quantities of information available globally" (MIR, 322) some would argue that it also comes at the price of a giant step backwards in terms of interfaces (As one example, compare the functionality of and types of interaction allowed by an email web app such as YahooMail/HotMail with an email client such as Eudora/Outlook/AppleMail). What does this say about the future of visualization techniques for IR? What needs to happen (technically, business-wise, other) for a top search engine to add an interactive visualization component to its search results?

68 2003.12.02 - SLIDE 68IS 202 – FALL 2003 Discussion Questions Joseph Hall on Interfaces for IR –In section 10.9 of MIR: "The field of information visualization needs some new ideas about how to display large, abstract information spaces intuitively." This seems to be the "holy grail" of HCI. Something that can intuitively deal with large information spaces... with feeble human brains providing imperfect queries. For example, a nowhere-near feeble brain and pretty direct query is evidenced by danah boyd's most recent blog entry: turtles all the way down http://www.zephoria.org/thoughts/archives/000889.html#000889 In this blog entry, danah has already queried the state-of-the-art search tool, Google, and unfortunately came across conflicting results. –While Google can handle large information spaces, sometimes the PageRank algorithm is just not enough. Seeing as humans tend to think in terms of "concentration"[1], what are some of the "penetrable" ways that IR tools could more effectively facilitate the human thought process instead of simply retrieving information? –[1] An old card game that requires remembering exactly where you saw a certain card for retrieval later.

69 2003.12.02 - SLIDE 69IS 202 – FALL 2003 Lecture Overview Review of Last Time –Introduction to HCI –Why Interfaces Don’t Work –Early Visions: Memex Interfaces for Information Retrieval II Discussion Questions Action Items for Next Time Credit for some of the slides in this lecture goes to Marti Hearst and Warren Sack

70 2003.12.02 - SLIDE 70IS 202 – FALL 2003 Next Time Wishter DEMO! Final Exam Review

