Interaction LBSC 796/INFM 718R Douglas W. Oard Week 4, February 23, 2011

Moore’s Law [charts: transistors, speed, and storage all growing exponentially → computer performance keeps rising]

Human Cognition [chart: human performance stays essentially flat over time]

Where is the bottleneck? System performance vs. human performance (slide idea by Bill Buxton)

Interaction Points
Process: Source Selection → Query Formulation → Search → Selection (ranked list) → Examination (documents) → Delivery
Feedback loops: source reselection, system discovery, vocabulary discovery, concept discovery, document discovery
Goals: help users decide where to start, help users formulate queries, help users make sense of results and navigate the information space

Information Needs
Real information needs (RIN) = visceral need
Perceived information needs (PIN) = conscious need
Request = formalized need
Query = compromised need
Stefano Mizzaro. (1999) How Many Relevances in Information Retrieval? Interacting With Computers, 10(3).

Anomalous State of Knowledge Belkin: Searchers do not clearly understand –The problem itself –What information is needed to solve the problem The query results from a clarification process Dervin’s “sense making”: Need → Gap → Bridge

Bates’ “Berry Picking” Model A sketch of a searcher (Q0 → Q1 → Q2 → Q3 → Q4 → Q5) “moving through many actions towards a general goal of satisfactory completion of research related to an information need.”

Broder’s Web Query Taxonomy Navigational (~20%) –Reach a particular site (“known item”) Informational (~50%) –Acquire static information (“topical”) Transactional (~30%) –Perform a Web-mediated activity (“service”) Andrei Broder, SIGIR Forum, Fall 2002

Some Desirable Features Make exploration easy Relate documents to the reasons they were retrieved Highlight relationships between documents

Agenda
→ Query formulation
Selection
Examination
Source selection

Query Formulation: Command Language, Form Fill-in, Menu Selection, Direct Manipulation, Natural Language (Ben Shneiderman, 1997)

WESTLAW® Query Examples
What is the statute of limitations in cases involving the federal tort claims act?
–LIMIT! /3 STATUTE ACTION /S FEDERAL /2 TORT /3 CLAIM
What factors are important in determining what constitutes a vessel for purposes of determining liability of a vessel owner for injuries to a seaman under the “Jones Act” (46 USC 688)?
–( ) FACTOR ELEMENT STATUS FACT /P VESSEL SHIP BOAT /P ( ) “JONES ACT” /P INJUR! /S SEAMAN CREWMAN WORKER
Are there any cases which discuss negligent maintenance or failure to maintain aids to navigation such as lights, buoys, or channel markers?
–NOT NEGLECT! FAIL! NEGLIG! /5 MAINT! REPAIR! /P NAVIGAT! /5 AID EQUIP! LIGHT BUOY “CHANNEL MARKER”
What cases have discussed the concept of excusable delay in the application of statutes of limitations or the doctrine of laches involving actions in admiralty or under the “Jones Act” or the “Death on the High Seas Act”?
–EXCUS! /3 DELAY /P (LIMIT! /3 STATUTE ACTION) LACHES /P “JONES ACT” “DEATH ON THE HIGH SEAS ACT” ( )
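
Not from the original slides: a minimal Python sketch of what a proximity operator like /3 (“within 3 words”) might do. The helper is hypothetical; real WESTLAW also handles truncation (!), phrases, and the sentence/paragraph scoping of /S and /P, which this toy version omits.

    def within_n(tokens, term_a, term_b, n):
        """True if term_a and term_b ever occur within n words of each other."""
        positions_a = [i for i, t in enumerate(tokens) if t == term_a]
        positions_b = [i for i, t in enumerate(tokens) if t == term_b]
        return any(abs(i - j) <= n for i in positions_a for j in positions_b)

    doc = "the statute of limitations governs when a tort claim may be filed".split()
    print(within_n(doc, "statute", "limitations", 3))  # True (within 3 words)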

Form-Based Query Specification Credit: Marti Hearst

Direct Manipulation Spec. VQUERY (Jones 98) Credit: Marti Hearst

Alternate Query Modalities Spoken queries –Used for telephone and hands-free applications –Reasonable performance with limited vocabularies But some error correction method must be included Handwritten queries –Palm pilot graffiti, touch-screens, … –Fairly effective if some form of shorthand is used Ordinary handwriting often has too much ambiguity

Agenda
Query formulation
→ Selection
Examination
Source selection

A Selection Interface Taxonomy One dimensional lists –Content: title, source, date, summary, ratings,... –Order: retrieval status value, date, alphabetic,... –Size: scrolling, specified number, score threshold Two dimensional displays –Construction: clustering, starfield, projection –Navigation: jump, pan, zoom Three dimensional displays –Contour maps, fishtank VR, immersive VR

Google: KeyWord In Context (KWIC) Query: University of Maryland College Park
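
A query-biased KWIC display can be approximated by excerpting a small window of words around each query-term hit. A minimal sketch, assuming whitespace tokenization (real snippet generators also pick among hits and merge overlapping windows):

    def kwic_snippet(text, query_terms, window=4):
        """Return keyword-in-context fragments: each query-term hit shown
        with a few words of surrounding context."""
        tokens = text.split()
        fragments = []
        for i, tok in enumerate(tokens):
            if tok.lower().strip(".,") in query_terms:
                lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
                fragments.append("..." + " ".join(tokens[lo:hi]) + "...")
        return fragments

    text = ("The University of Maryland is located in College Park, "
            "a few miles from Washington.")
    print(kwic_snippet(text, {"maryland", "college"}))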

Summarization

Indicative vs. Informative Terms often applied to document abstracts –Indicative abstracts support selection They describe the contents of a document –Informative abstracts support understanding They summarize the contents of a document Applies to any information presentation –Presented for indicative or informative purposes

Selection/Examination Tasks “Indicative” tasks –Recognizing what you are looking for –Determining that no answer exists in a source –Probing to refine mental models of system operation “Informative” tasks –Vocabulary acquisition –Concept learning –Information use

Generated Summaries Fluent summaries for a specific domain Define a knowledge structure for the domain –Frames are commonly used Analysis: process documents to fill the structure –Studied separately as “information extraction” Compression: select which facts to retain Generation: create fluent summaries –Templates for initial candidates –Use language model to select an alternative
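
To make the pipeline concrete, here is a toy sketch of the generation step only, with a hypothetical earnings frame and template; a real system would fill the frame by information extraction and choose among alternative templates with a language model:

    # Hypothetical domain frame and template, for illustration only.
    EARNINGS_TEMPLATE = "{company} reported {direction} earnings of {amount} for {period}."

    def generate_summary(frame):
        """Generation: fill a canned template from frame slots to produce
        a fluent, domain-specific summary."""
        return EARNINGS_TEMPLATE.format(**frame)

    frame = {"company": "Acme Corp", "direction": "higher",
             "amount": "$1.2 million", "period": "the third quarter"}
    print(generate_summary(frame))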

Extraction-Based Summarization Robust technique for making disfluent summaries Four broad types: –Query-biased vs. generic –Term-oriented vs. sentence-oriented Combine evidence for selection: –Salience: similarity to the query –Specificity: IDF or chi-squared –Emphasis: title, first sentence
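
A sketch of how those three evidence sources might be combined to score and extract sentences. The 0.1 and 2.0 weights are illustrative, and df/n_docs stand in for collection document-frequency statistics:

    import math

    def summarize(sentences, query, df, n_docs, top_k=2):
        """Score sentences by query overlap (salience), summed IDF of their
        terms (specificity), and a first-sentence bonus (emphasis); return
        the top-k sentences in document order."""
        query = {t.lower() for t in query}
        scored = []
        for pos, sent in enumerate(sentences):
            terms = [t.lower().strip(".,") for t in sent.split()]
            salience = sum(t in query for t in terms)
            specificity = sum(math.log(n_docs / df.get(t, 1)) for t in set(terms))
            emphasis = 2.0 if pos == 0 else 0.0
            scored.append((salience + 0.1 * specificity + emphasis, pos, sent))
        keep = sorted(sorted(scored, reverse=True)[:top_k], key=lambda s: s[1])
        return [sent for _, _, sent in keep]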

Goldilocks and the Three Summaries… The entire document: too much! The exact answer: too little! The surrounding paragraph: just right… Example: “It occurred on July 4, …” – what does this pronoun refer to? Jimmy Lin, Dennis Quan, Vineet Sinha, Karun Bakshi, David Huynh, Boris Katz, and David R. Karger. (2003) What Makes a Good Answer? The Role of Context in Question Answering. Proceedings of INTERACT 2003.

Open Directory Project

SWISH: List Interface vs. Category Interface
Query: jaguar
Hao Chen and Susan Dumais. (2000) Bringing Order to the Web: Automatically Categorizing Search Results. Proceedings of CHI 2000.

Text Classification Problem: automatically sort items into bins Machine learning approach –Obtain a training set with ground truth labels –Use a machine learning algorithm to “train” a classifier kNN, Bayesian classifier, SVMs, decision trees, etc. –Apply classifier to new documents System assigns labels according to patterns learned in the training set

Machine Learning
Training: training examples (documents with labels) → representation function → supervised machine learning algorithm → text classifier
Testing: unlabeled document → representation function → text classifier → predicted labels (label 1? label 2? label 3? label 4?)

k Nearest Neighbor (kNN) Classifier

kNN Algorithm Select k most similar labeled documents Have them “vote” on the best label: –Each document gets one vote, or –More similar documents get a larger vote How can similarity be defined?
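
Cosine similarity between term-frequency vectors is one common answer to that question. A self-contained sketch of a similarity-weighted kNN vote:

    from collections import Counter
    import math

    def cosine(u, v):
        """Cosine similarity between two sparse term-frequency vectors."""
        dot = sum(u[t] * v[t] for t in u if t in v)
        norm = (math.sqrt(sum(x * x for x in u.values()))
                * math.sqrt(sum(x * x for x in v.values())))
        return dot / norm if norm else 0.0

    def knn_classify(doc, labeled_docs, k=3):
        """Label a document by a similarity-weighted vote of its k most
        similar labeled neighbors."""
        vec = Counter(doc.lower().split())
        neighbors = sorted(labeled_docs, reverse=True,
                           key=lambda d: cosine(vec, Counter(d[0].lower().split())))[:k]
        votes = Counter()
        for text, label in neighbors:
            votes[label] += cosine(vec, Counter(text.lower().split()))
        return votes.most_common(1)[0][0]

    train = [("the dbms crashed", "databases"),
             ("query optimization in a dbms", "databases"),
             ("the team won the game", "sports")]
    print(knn_classify("dbms query crashed", train, k=2))  # databases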

The Cluster Hypothesis “Closely associated documents tend to be relevant to the same requests.” van Rijsbergen 1979

Vivisimo: Clustered Results

Kartoo’s Cluster Visualization

Clustering Result Sets Advantages: –Topically coherent document sets are presented together –User gets a sense for the themes in the result set –Supports browsing retrieved hits Disadvantages: –May be difficult to understand the theme of a cluster based on summary terms –Clusters themselves might not “make sense” –Computational cost

Visualizing Clusters (showing cluster centroids)

Hierarchical Agglomerative Clustering

Another Way to Look at H.A.C. [dendrogram over documents A–H]

The H.A.C. Algorithm Start with each document in its own cluster Until there is only one cluster: –Determine the two most similar clusters ci and cj –Replace ci and cj with a single cluster ci ∪ cj The history of merging forms the hierarchy

Cluster Similarity Assume a similarity function that determines the similarity of two instances: sim(x,y) –What’s appropriate for documents? What’s the similarity between two clusters? –Single Link: similarity of two most similar members –Complete Link: similarity of two least similar members –Group Average: average similarity between members
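
A compact sketch of the H.A.C. loop with a pluggable linkage criterion: passing min over the pairwise similarities gives complete link, max gives single link (group average would need the scores as a list to average). Here sim is any document-pair similarity function, such as the cosine measure above:

    import itertools

    def hac(docs, sim, linkage=min):
        """Repeatedly merge the two most similar clusters; the recorded
        merge history is the hierarchy."""
        clusters = [[d] for d in docs]
        merges = []
        while len(clusters) > 1:
            i, j = max(itertools.combinations(range(len(clusters)), 2),
                       key=lambda p: linkage(sim(a, b)
                                             for a in clusters[p[0]]
                                             for b in clusters[p[1]]))
            merges.append((clusters[i], clusters[j]))
            clusters[i] = clusters[i] + clusters[j]
            del clusters[j]
        return merges

    docs = ["a b", "a b c", "x y", "x z"]
    overlap = lambda d1, d2: len(set(d1.split()) & set(d2.split()))
    for left, right in hac(docs, overlap):
        print(left, "+", right)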

Scatter/Gather
Query = “star” on encyclopedic text
Initial clusters: symbols (8 docs), film/tv (68 docs), astrophysics (97 docs), astronomy (67 docs), flora/fauna (10 docs)
After re-clustering a selected subset: sports (14 docs), film/tv (47 docs), music (7 docs), stellar phenomena (12 docs), galaxies/stars (49 docs), constellations (29 docs), miscellaneous (7 docs)
Clustering and re-clustering is entirely automated

System clusters documents into “themes”
–Displays each cluster by showing topical terms and typical titles
User chooses a subset of the clusters
System re-clusters documents in the selected clusters
–New clusters have different, more refined, “themes”
Marti A. Hearst and Jan O. Pedersen. (1996) Reexamining the Cluster Hypothesis: Scatter/Gather on Retrieval Results. Proceedings of SIGIR 1996.

Summary: Clustering Advantages: –Provides an overview of main themes in search results –Helps overcome polysemy Disadvantages: –Documents can be clustered in many ways –Not always easy to understand the theme of a cluster –What is the correct level of granularity? –More information to present

Recap Clustering –Automatically group documents into clusters Classification –Automatically assign labels to documents

Agenda
Query formulation
Selection
→ Examination
Source selection

Distorting Reality: Bifocal, Perspective Wall, Fisheye

1-D Fisheye Menu
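
A fisheye menu can be approximated with a degree-of-interest function that shrinks items by their distance from the focus. A sketch with illustrative constants, not taken from any real implementation:

    def fisheye_sizes(n_items, focus, max_pt=24, min_pt=6, falloff=3):
        """Font size for each menu entry: largest at the focus, shrinking
        with distance, floored at min_pt."""
        return [max(min_pt, max_pt - falloff * abs(i - focus))
                for i in range(n_items)]

    print(fisheye_sizes(10, focus=4))
    # [12, 15, 18, 21, 24, 21, 18, 15, 12, 9]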

1-D Fisheye Document Viewer

TileBars
Topic: reliability of DBMS (database systems)
Query terms: DBMS, reliability
Example bar patterns:
–Mainly about both DBMS and reliability
–Mainly about DBMS, discusses reliability
–Mainly about, say, banking, with a subtopic discussion on DBMS/reliability
–Mainly about high-tech layoffs
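
Under the hood, a TileBar is just a term-by-segment hit matrix, with darker tiles for higher counts. A minimal sketch over whitespace-tokenized segments:

    def tilebar(doc_segments, query_terms):
        """One row per query term, one column per document segment; each
        cell counts that term's hits in that segment."""
        return {term: [seg.lower().split().count(term) for seg in doc_segments]
                for term in query_terms}

    segments = ["the dbms crashed again",
                "reliability of the dbms was studied",
                "unrelated banking discussion"]
    print(tilebar(segments, ["dbms", "reliability"]))
    # {'dbms': [1, 1, 0], 'reliability': [0, 1, 0]}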

UMass: Scrollbar-TileBar

Agenda
Query formulation
Selection
Examination
→ Source selection

ThemeView Pacific Northwest National Laboratory

WebTheme

Ben S’ ‘Seamless Interface’ Principles
Informative feedback
Easy reversal
User in control –Anticipatable outcomes –Explainable results –Browsable content
Limited working memory load –Query context –Path suspension
Alternatives for novices and experts
Scaffolding

My ‘Synergistic Interaction’ Principles
–Interdependence with process (“interaction models”): co-design with search strategy, speed
–System initiative: guided process, exposing the structure of knowledge
–Support for reasoning: representation of uncertainty, meaningful dimensions
–Synergy with features used for search: weakness of similarity, strength of language
–Easily learned: familiar metaphors (timelines, ranked lists, maps)

Some Good Ideas
Show the query in the selection interface –It provides context for the display
Suggest options to the user –Query refinements, for example
Explain what the system has done –Highlight query terms in the results, for example
Complement what the system has done –Users add value by doing things the system can’t –Expose the information users need to judge utility

One Minute Paper When examining documents in the selection and examination interfaces, which type of information need (visceral, conscious, formalized, or compromised) guides the user’s decisions? Please justify your answer.