1/26Remco Chang – PNNL 14 Analyzing User Interactions for Data and User Modeling Remco Chang Assistant Professor Tufts University.

Presentation transcript:

1/26Remco Chang – PNNL 14 Analyzing User Interactions for Data and User Modeling Remco Chang Assistant Professor Tufts University

2/26Remco Chang – PNNL 14 (Modified) Van Wijk’s Model of Visualization Data Visualization Vis Params User Perceive Explore Discovery Image Interaction

3/26Remco Chang – PNNL 14 When the Analyst is Successful…. Data Visualization Vis Params User Perceive Explore Discovery Image Interaction Data + Vis + Interaction + User = Discovery

4/26Remco Chang – PNNL 14 Remco’s Research Goal “Reverse engineer” the human cognitive black box (by analyzing user interactions) A. Data Modeling 1. Interactive Metric Learning B. User Modeling 2. Predict Analysis Behavior C. Cognitive States and Traits D. Mixed-Initiative Visual Analytics R. Chang et al., Science of Interaction, Information Visualization, 2009.

5/26Remco Chang – PNNL 14 Data Modeling 1.Interactive Metric Learning Quantifying a User’s Knowledge about Data

6/26Remco Chang – PNNL 14 Metric Learning Finding the weights to a linear distance function. Instead of having the user specify the weights manually, can we learn them implicitly through their interactions?
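As a minimal sketch of what "the weights to a linear distance function" could look like, the snippet below computes a distance that is a weighted sum of per-dimension differences. The function name `weighted_distance` and the choice of absolute difference per dimension are assumptions made for illustration; the slides do not specify the exact form.

```python
def weighted_distance(x, y, weights):
    """Distance between two points as a weighted sum of per-dimension gaps.

    Assumption: each dimension contributes |x_d - y_d|, scaled by a
    non-negative weight that expresses how important that dimension is.
    """
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y, and weights must have the same length")
    return sum(w * abs(a - b) for w, a, b in zip(weights, x, y))


# Two points that differ by 3 in the first dimension and 4 in the second.
a = [0.0, 0.0]
b = [3.0, 4.0]

only_first = weighted_distance(a, b, [1.0, 0.0])   # second dimension ignored
both_equal = weighted_distance(a, b, [0.5, 0.5])   # both dimensions count equally
print(only_first, both_equal)
```

Changing the weights changes which points count as "close", which is why learning them from interaction amounts to learning what the user considers important.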

7/26Remco Chang – PNNL 14 Metric Learning In a projection space (e.g., MDS), the user directly moves points on the 2D plane that don’t “look right”, until the expert is happy (or the visualization cannot be improved further). The system learns the weights (importance) of each of the original k dimensions.
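One way to picture how weights might be learned from such moves is sketched below. It is not the optimization used in Dis-Function; it swaps in a simple exponentiated-gradient update as a stand-in. The assumption is that when a user drags two points together, dimensions in which those points disagree should lose weight. The names `learn_weights` and `moved_together`, and the parameters `steps` and `lr`, are hypothetical.

```python
import math


def learn_weights(points, moved_together, steps=20, lr=0.5):
    """Down-weight dimensions in which user-grouped points disagree.

    points:         list of k-dimensional data points
    moved_together: list of (i, j) index pairs the user dragged close together
    Returns k non-negative weights that sum to 1.
    """
    k = len(points[0])
    weights = [1.0 / k] * k

    # Gradient of the summed weighted distance with respect to each weight:
    # the total per-dimension gap over all pairs the user grouped.
    grad = [0.0] * k
    for i, j in moved_together:
        for d in range(k):
            grad[d] += abs(points[i][d] - points[j][d])

    for _ in range(steps):
        # Multiplicative update, then renormalize so the weights sum to 1.
        scaled = [w * math.exp(-lr * g) for w, g in zip(weights, grad)]
        total = sum(scaled)
        weights = [s / total for s in scaled]
    return weights


# Points 0 and 1 differ strongly in dimension 0 but barely in dimension 1.
data = [[0.0, 0.0], [1.0, 0.1], [5.0, 5.0]]
w = learn_weights(data, moved_together=[(0, 1)])
print(w)
```

In this toy run the second dimension ends up with nearly all of the weight, because it is the one in which the grouped points already agree.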

8/26Remco Chang – PNNL 14 Dis-Function Brown et al., Find Distance Function, Hide Model Inference. IEEE VAST Poster 2011 Brown et al., Dis-function: Learning Distance Functions Interactively. IEEE VAST Optimization:

9/26Remco Chang – PNNL 14 User Modeling 2. Learning about a User in Real-Time Who is the user, and what is she doing?

10/26Remco Chang – PNNL 14 One Question at a Time Data Visualization Vis Params User Perceive Explore Discovery Image Interaction Data + Vis + Interaction + User = Discovery Novice or Expert? Introvert or Extrovert? Fast or Slow?

11/26Remco Chang – PNNL 14 Experiment: Finding Waldo Google-Maps style interface – Left, Right, Up, Down, Zoom In, Zoom Out, Found

12/26Remco Chang – PNNL 14 Pilot Visualization – Completion Time [Figure panels: fast completion time; slow completion time] Helen Zhao et al., Modeling user interactions for complex visual search tasks. Poster, IEEE VAST, Eli Brown et al., Where’s Waldo. IEEE VAST 2014, Conditionally Accepted.

13/26Remco Chang – PNNL 14 Predicting Fast and Slow Performers State-Based (data exploration statistics): Linear SVM, Accuracy: ~70%. Interaction pattern (high-level button clicks): N-Gram + Decision Tree, Accuracy: ~80%.
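To illustrate the n-gram step only, the sketch below turns a sequence of button clicks into n-gram counts, which could then serve as features for a classifier such as a decision tree. The helper name `ngram_counts` and the sample log are hypothetical; the button names come from the Finding Waldo interface, and the classifier itself is not shown.

```python
from collections import Counter


def ngram_counts(events, n=2):
    """Count every run of n consecutive interaction events."""
    return Counter(
        tuple(events[i:i + n]) for i in range(len(events) - n + 1)
    )


# Hypothetical click log from one participant.
log = ["Zoom In", "Left", "Left", "Zoom In", "Left", "Left", "Found"]

bigrams = ngram_counts(log, n=2)
for gram, count in bigrams.most_common(3):
    print(gram, count)
```

Each participant's counts form one feature vector, so participants with similar click habits end up with similar vectors.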

14/26Remco Chang – PNNL 14 Predicting a User’s Personality External Locus of Control Internal Locus of Control Ottley et al., How locus of control influences compatibility with visualization style. IEEE VAST, Ottley et al., Understanding visualization by understanding individual users. IEEE CG&A, 2012.

15/26Remco Chang – PNNL 14 Predicting Users’ Personality Traits Noisy data, but can detect the users’ individual traits “Extraversion”, “Neuroticism”, and “Locus of Control” at ~60% accuracy by analyzing the user’s interactions alone. Predicting user’s “Extraversion” Accuracy: ~60%

16/26Remco Chang – PNNL 14 Cognitive States and Traits 3. What are the Cognitive Factors that Correlate with a User’s Performance?

17/26Remco Chang – PNNL 14 Emotion and Visual Judgment Harrison et al., Influencing Visual Judgment Through Affective Priming, CHI 2013

18/26Remco Chang – PNNL 14 Cognitive Load Functional Near-Infrared Spectroscopy a lightweight brain sensing technique measures mental demand (working memory) Evan Peck et al., Using fNIRS Brain Sensing to Evaluate Information Visualization Interfaces. CHI 2013.

19/26Remco Chang – PNNL 14 Spatial Ability: Bayes Reasoning The probability that a woman over age 40 has breast cancer is 1%. However, the probability that mammography accurately detects the disease is 80%, with a false positive rate of 9.6%. If a 40-year-old woman tests positive in a mammography exam, what is the probability that she indeed has breast cancer? Answer: Bayes’ theorem states that P(A|B) = P(B|A) * P(A) / P(B). In this case, A is having breast cancer and B is testing positive with mammography. P(A|B) is the probability of a person having breast cancer given that the person has tested positive with mammography. P(B|A) is given as 80%, or 0.8, and P(A) is given as 1%, or 0.01. P(B) is not explicitly stated, but can be computed as P(B,A) + P(B,~A), or the probability of testing positive and the patient having cancer plus the probability of testing positive and the patient not having cancer. Since P(B,A) is equal to 0.8 * 0.01 = 0.008, and P(B,~A) is 0.096 * (1 - 0.01) = 0.09504, P(B) can be computed as 0.008 + 0.09504 = 0.10304. Finally, P(A|B) is therefore 0.8 * 0.01 / 0.10304, which is approximately equal to 0.078.
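The arithmetic in the mammography problem can be checked with a few lines of code. This is a small sketch; the function name `posterior` and its parameter names are chosen here for illustration, and the three inputs are the values given in the problem statement.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Return P(A|B) using Bayes' theorem.

    prior:               P(A), probability of having the disease
    sensitivity:         P(B|A), probability of a positive test given disease
    false_positive_rate: P(B|~A), probability of a positive test without disease
    """
    p_b_and_a = sensitivity * prior                        # P(B, A)
    p_b_and_not_a = false_positive_rate * (1.0 - prior)    # P(B, ~A)
    p_b = p_b_and_a + p_b_and_not_a                        # P(B)
    return p_b_and_a / p_b


p = posterior(prior=0.01, sensitivity=0.80, false_positive_rate=0.096)
print(f"P(cancer | positive test) = {p:.3f}")
```

The result is a little under 8%, far lower than the 80% detection rate might suggest, because the disease is rare and false positives among healthy women outnumber true positives.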

20/26Remco Chang – PNNL 14 Visualization Aids Ottley et al., Visually Communicating Bayesian Statistics to Laypersons. Tufts CS Tech Report, 2012.

21/26Remco Chang – PNNL 14 Spatial Ability

22/26Remco Chang – PNNL 14 Mixed Initiative Systems 4. What Can a Visualization System Do If It Knows Everything About Its User?

23/26Remco Chang – PNNL 14 “The computer is incredibly fast, accurate, and stupid. Man is unbelievably slow, inaccurate, and brilliant. The marriage of the two is a force beyond calculation.” -Leo Cherne, 1977 (often attributed to Albert Einstein)

24/26Remco Chang – PNNL 14 Which Marriage?

25/26Remco Chang – PNNL 14 Which Marriage?

26/26Remco Chang – PNNL 14 Remco’s Prediction The future of visual analytics lies in better human-computer collaboration That future starts by enabling the computer to better understand the user

27/26Remco Chang – PNNL 14 Questions?

28/26Remco Chang – PNNL 14 Putting Theory into Practice: Big Data Visualization on Commodity Hardware Large Data in a Data Warehouse

29/26Remco Chang – PNNL 14 Problem Statement Constraint: Data is too big to fit into the memory or hard drive of the personal computer – Note: Ignoring various database technologies (OLAP, Column-Store, No-SQL, Array-Based, etc.). Classic Computer Science Problem…

30/26Remco Chang – PNNL 14 Work in Progress… * However, exploring a large DB (usually) means high degrees of freedom. Goal: Predictive Pre-Fetching from a large DB. Collaboration with MIT Big Data Center. Teams: – MIT: Based on data characteristics – Brown: Based on past SQL queries – Tufts: Based on user’s analysis profile. Current progress: developed middleware (ScalaR). Battle et al., Dynamic Reduction of Result Sets for Interactive Visualization. IEEE BigData, 2013.