Multidimensional Data Analysis IS 247 Information Visualization and Presentation 22 February 2002 James Reffell Moryma Aydelott Jean-Anne Fitzpatrick.


Problem Statement
How to effectively present more than 3 dimensions of information in a visual display with 2 (to 3) dimensions?
How to effectively visualize “inherently abstract” data?
How to effectively visualize very large, often complex data sets?
How to effectively display results – when you don’t know what those results will be?

Key Goals
More than 3 dimensions of data simultaneously
Support “fuzziness” (similarity queries, vector space, tolerance ranges)
Support exploratory, opportunistic, “what-if” queries
Allow identification of interesting data properties through pattern recognition
Explore various dimensions without losing overview

Another Statement of Goals
Visualization of multidimensional data
Without loss of information
With:
  – Minimal complexity
  – Any number of dimensions
  – Variables treated uniformly
  – Objects remain recognizable across transformations
  – Easy / intuitive conveyance of information
  – Mathematically / algorithmically rigorous
(Adapted from Inselberg)

Purposes / Uses
Find clusters of similar data
Find “hot spots” (exceptional items in otherwise homogeneous regions)
Show relationships between multiple variables
Similarity retrieval rather than boolean matching; show near misses
“Searching for patterns in the big picture and fluidly investigating interesting details without losing framing context” (Rao & Card)

Characteristics
“Data-dense displays” (large number of dimensions and/or values)
  – Often combine color with position / proximity representing relevance “distance”
  – Often provide multiple views
Build on concepts from previous weeks:
  – Retinal properties of marks
  – Gestalt concepts, e.g., grouping
  – Direct manipulation / interactive queries
  – Incremental construction of queries
  – Dynamic feedback
Some require specialized input devices or unique gesture vocabulary

Examples
Warning: These visualizations are not easy to grasp at “first glance”!
DON’T PANIC

Influence Explorer / Prosection Matrix (Tweedie et al.)
We saw the video!
Abstract one-way mathematical models: multiple parameters, multiple variables
Data for visualization comes from sampling
Visualization of non-obvious underlying structures in models
Color coding, attention to near misses

Influence Explorer / Prosection Matrix (Tweedie et al.)
Use the sliders to set performance limits.
Color coding gives immediate feedback on the effects of changes, both for ‘perfect’ scores and for near-misses.
Can also highlight individual values across histograms, show parallel coordinates.
Interactive querying!

Influence Explorer / Prosection Matrix (Tweedie et al.)
In this view we can shift parameter ranges in addition to performance limits.
Red is still a perfect score; blacks miss one parameter limit, blues one or two performance limits.
Does this color scheme make sense? Would another work better?
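The color rule on this slide can be read as a count of violated limits. What follows is a minimal Python sketch of that reading, not the Influence Explorer's own code: the function names `count_misses` and `color_for`, the dict-based data layout, and the handling of cases the slide does not mention (returned as "none") are all assumptions.

```python
from typing import Dict, Tuple

Range = Tuple[float, float]


def count_misses(values: Dict[str, float], limits: Dict[str, Range]) -> int:
    """Count how many named values fall outside their (low, high) limits."""
    return sum(1 for name, (lo, hi) in limits.items()
               if not lo <= values[name] <= hi)


def color_for(params: Dict[str, float], perfs: Dict[str, float],
              param_ranges: Dict[str, Range],
              perf_limits: Dict[str, Range]) -> str:
    """Assumed reading of the slide's color scheme (hypothetical helper)."""
    param_misses = count_misses(params, param_ranges)
    perf_misses = count_misses(perfs, perf_limits)
    if param_misses == 0 and perf_misses == 0:
        return "red"      # perfect score: every limit is met
    if param_misses == 1 and perf_misses == 0:
        return "black"    # misses exactly one parameter limit
    if param_misses == 0 and perf_misses <= 2:
        return "blue"     # misses one or two performance limits
    return "none"         # cases the slide does not describe
```

Moving a slider then amounts to changing one entry in `param_ranges` or `perf_limits` and recoloring every sample.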

Influence Explorer / Prosection Matrix (Tweedie et al.)
Prosection matrix (on right) = scatter plots for pairs of parameters.
Color coding matches histograms.
Fitting tolerance region (yellow box) to acceptability (red region) gives high yield for minimum cost
Or: Make the red bit as big as possible!
This aspect closely tuned to task at hand: manufacturing and similar.
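One way to make "high yield" concrete is the fraction of sampled designs inside the tolerance box that also meet every performance limit. The Python sketch below works under that assumption; `in_box`, `yield_in_box`, and the `acceptable` predicate are hypothetical names, and the slides do not give a formula.

```python
from typing import Callable, Dict, List, Tuple

Sample = Dict[str, float]
Box = Dict[str, Tuple[float, float]]


def in_box(sample: Sample, box: Box) -> bool:
    """True when every boxed parameter lies within its (low, high) range."""
    return all(lo <= sample[name] <= hi for name, (lo, hi) in box.items())


def yield_in_box(samples: List[Sample], box: Box,
                 acceptable: Callable[[Sample], bool]) -> float:
    """Fraction of samples inside the tolerance box that are acceptable."""
    inside = [s for s in samples if in_box(s, box)]
    if not inside:
        return 0.0  # an empty box has no yield to report
    return sum(1 for s in inside if acceptable(s)) / len(inside)
```

Fitting the yellow box to the red region then corresponds to choosing the `box` that keeps this fraction high while staying as wide as the cost allows.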

The Table Lens (Rao and Card)
Tools: zoom, adjust, slide
Works best for case / variable data
Cell contents coded by color (nominal) or bar length (interval)
Special mouse gesture vocabulary
Search / browse (spotlighting)
Create groups by dragging columns

The Table Lens (Rao and Card)

The Table Lens (Rao and Card)
Focus + context for large datasets while retaining access to all data
Flexible, suitable for many domains
Good example of direct interaction
Inxight = silly name
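Focus + context and bar-length coding can both be sketched in a few lines of Python. This is an illustration under assumptions, not the Table Lens implementation: `row_heights` gives focused rows a fixed height and splits the remaining pixels evenly among context rows, and `bar_length` scales an interval value linearly into a cell. Both function names and the even split are hypothetical.

```python
from typing import Iterable, List


def row_heights(n_rows: int, focused: Iterable[int],
                total_px: float, focus_px: float) -> List[float]:
    """Give focused rows focus_px each; share the rest among context rows."""
    focus_set = set(focused)
    n_context = n_rows - len(focus_set)
    remaining = total_px - len(focus_set) * focus_px
    if remaining < 0:
        raise ValueError("focused rows do not fit in the available height")
    context_px = remaining / n_context if n_context else 0.0
    return [focus_px if i in focus_set else context_px for i in range(n_rows)]


def bar_length(value: float, lo: float, hi: float, cell_width: float) -> float:
    """Linear bar length for an interval value within a column's range."""
    if hi == lo:
        return 0.0  # a constant column carries no length information
    return (value - lo) / (hi - lo) * cell_width
```

Every row keeps some height, so all data stays on screen while the focused rows are large enough to show text.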

Parallel Coordinates (Inselberg)
Translation of multiple graphs by using parallel axes.
Useful for recognizing patterns between the axes: add or remove parts of the data to see general patterns or to examine particular interactions more closely.
Articles offer suggestions on how to use this system most effectively.

Parallel Coordinates (Inselberg)
Dataset in a Cartesian graph
Same dataset in parallel coordinates
Parallel Coordinates applet - Like a normal graph, but different…

Parallel Coordinates (Inselberg)
Strengths –
Works for any N
Clearly displays characteristics of the data (without needing beaucoup explanations)
Easy to adjust or focus displays / queries
Testing showed that it revealed problems missed by other forms of process control
Can be used in decision support as a visual modeling tool (to see how adjusting one parameter affects others)
Weaknesses –
Formation of complex queries can be tricky (if you want to get results that are useful and easy to interpret)
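The mapping from a Cartesian point to a parallel-coordinates polyline is short enough to sketch. In the Python below, each of the N dimensions becomes one vertical axis at horizontal position 0, 1, ..., N-1, and each record becomes a polyline through its scaled value on every axis. The function name `to_polylines` and the per-axis min-max scaling are assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple


def to_polylines(rows: Sequence[Sequence[float]]
                 ) -> List[List[Tuple[int, float]]]:
    """Turn N-dimensional rows into polylines of (axis_index, height 0..1)."""
    n_dims = len(rows[0])
    lows = [min(row[d] for row in rows) for d in range(n_dims)]
    highs = [max(row[d] for row in rows) for d in range(n_dims)]

    def scale(value: float, d: int) -> float:
        span = highs[d] - lows[d]
        # A constant dimension is drawn at mid-height on its axis.
        return 0.5 if span == 0 else (value - lows[d]) / span

    return [[(d, scale(row[d], d)) for d in range(n_dims)] for row in rows]
```

Because adding a dimension only adds one more axis, the same code works for any N, which is the first strength listed above.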

Polaris (Stolte and Hanrahan)
Extends pivot tables to generate graphic (not just table) displays
Multiple graphs on one screen
Designed to “combine statistical analysis and visualization”

Polaris (Stolte and Hanrahan)
Four-step process: from selection to partitioning to grouping / aggregation to composing / rendering / displaying
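The four steps can be sketched as a small pipeline over a list of records. This Python is an illustration only and does not reproduce the Polaris table algebra: the function names, the use of a sum as the aggregate, and the text "rendering" are assumptions.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def aggregate_panes(records: List[dict], keep: Callable[[dict], bool],
                    pane_key: str, group_key: str,
                    measure: str) -> Dict[str, Dict[str, float]]:
    """Steps 1-3: select records, partition into panes, group and sum."""
    selected = [r for r in records if keep(r)]            # 1. selection
    panes: Dict[str, List[dict]] = defaultdict(list)
    for r in selected:                                    # 2. partitioning
        panes[r[pane_key]].append(r)
    result: Dict[str, Dict[str, float]] = {}
    for pane, rows in panes.items():                      # 3. grouping / aggregation
        totals: Dict[str, float] = defaultdict(float)
        for r in rows:
            totals[r[group_key]] += r[measure]
        result[pane] = dict(totals)
    return result


def render_text(panes: Dict[str, Dict[str, float]], width: int = 20) -> List[str]:
    """Step 4: compose one text bar chart per pane (stand-in for real marks)."""
    peak = max((v for g in panes.values() for v in g.values()), default=0)
    lines = []
    for pane in sorted(panes):
        lines.append(f"[{pane}]")
        for group in sorted(panes[pane]):
            value = panes[pane][group]
            bar = "#" * (round(width * value / peak) if peak > 0 else 0)
            lines.append(f"  {group}: {bar} {value:g}")
    return lines
```

Note that `render_text` only ever sees the summed values, which mirrors the weakness noted later: the user sees aggregated, not original, data.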

Polaris (Stolte and Hanrahan)
Table algebra automatically generated via drag and drop.
Graphics generated using this algebra.
Suitable graphic types are selected by the system based on query / result data types and their combinations. (These include tables, bar charts, dot plots, Gantt charts, matrices of scatterplots, maps.)
Users can select marks (marks differ by shape, size, orientation and color).

Polaris (Stolte and Hanrahan)
Thought went into display types and graph choices (shapes as recommended by Cleveland, size and orientation as recommended by Kosslyn, color as recommended by Travis)
No mention of user testing, though.

Polaris (Stolte and Hanrahan)
Data mapped into “layers”
Linking and brushing capabilities, combined with automatic determination of the “best” graph type, allow easy drilling down.

Polaris (Stolte and Hanrahan)
Strengths –
Can be used with existing DB systems
Data transformations can be converted to SQL
Direct manipulation: linking and brushing, drag and drop supported
Users can play with appearance of display
Does maps, charts, images – not limited to one display type
Weaknesses –
User only sees aggregated (not original) data
System performs a number of functions automatically (conversion of variables, aggregation); the user may not know, or be able to control, how their data is changed

Worlds Within Worlds (Feiner and Beshers)
Basic approach: graph 3 dimensions, while holding “extra” dimensions constant
Visually represent “extra” dimensions as space within which graph(s) are placed
  – Position of “inner world” graph axis zero point equals set of constant values in “outer world”
Tools:
  – Dipstick
  – Waterline
  – Magnifying box
The following images from:

Worlds Within Worlds
Constraints:
  – Uses special input device (“Data Glove”) and output device (liquid crystal stereo glasses); use without these special devices less than optimal
Technical details:
  – Suspend calculation of “child” details during movement
  – Algorithm for prioritizing overlapping objects
  – Need to “turn off” gesture recognition to allow normal use of hand
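The basic Worlds Within Worlds idea, holding the “extra” dimensions constant while graphing the rest, can be sketched as partial evaluation of a function. In the Python below, the position of the inner world in the outer world supplies the constant values, and the inner world samples the function over the two remaining variables to give a height surface (the third graphed dimension). The name `inner_world` and the argument order are assumptions made for this illustration.

```python
from typing import Callable, List, Sequence


def inner_world(f: Callable[..., float], outer_point: Sequence[float],
                xs: Sequence[float], ys: Sequence[float]) -> List[List[float]]:
    """Sample f(x, y, *outer_point) on a grid; outer values stay constant."""
    return [[f(x, y, *outer_point) for x in xs] for y in ys]


# Hypothetical four-variable function: two inner and two outer dimensions.
def example(x: float, y: float, a: float, b: float) -> float:
    return x + y + a * b


# Placing the inner world's origin at (a, b) = (2, 3) in the outer world:
surface = inner_world(example, (2, 3), xs=[0, 1], ys=[0, 10])
```

Dragging the inner world to a new outer position corresponds to calling `inner_world` again with a different `outer_point`, which is why suspending that recalculation during movement matters for performance.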

Worlds Within Worlds I/O Devices

Techniques for plotting multivariate functions (Mihalisin et al.)
Multiples showing component dimensions, color codes for dimensions applied across multiples
Or, for categorical data, select mth category from nth dimension
Or, plot nested boxes, stepping values of independent variables and color-coding the dependent variable

Techniques for plotting multivariate functions (Mihalisin et al.)
Tools:
  – General zoom: look at smaller range of data in same amount of space
  – Subspace zoom: select view of particular dimension’s input to function
  – Decimate tool: sample fewer values within range
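Two of these tools reduce to simple operations on a list of sampled values. The Python below is a sketch under assumptions: `general_zoom` keeps the values inside a sub-range and rescales them to fill the same 0..1 display span, and `decimate` keeps every k-th sample. Both names and the linear rescaling are hypothetical; the subspace zoom is not sketched because the slide does not say enough about it.

```python
from typing import List, Sequence


def general_zoom(values: Sequence[float], lo: float, hi: float) -> List[float]:
    """Keep values in [lo, hi] and stretch them across the 0..1 display span."""
    if hi <= lo:
        raise ValueError("zoom range must have positive width")
    return [(v - lo) / (hi - lo) for v in values if lo <= v <= hi]


def decimate(values: Sequence[float], keep_every: int) -> List[float]:
    """Sample fewer values by keeping every keep_every-th one."""
    if keep_every < 1:
        raise ValueError("keep_every must be at least 1")
    return list(values[::keep_every])
```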


VisDB (Keim & Kriegel)
Mapping entries from relational database to pixels on the screen
Include “approximate” answers, with placement and color-coding based on relevance
Data points laid out in:
  – Rectangular spiral
  – Or, with axes representing positive/negative values for two selected dimensions
  – Or, group dimensions together (easier to interpret than very large number of dimensions)
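The rectangular spiral layout can be sketched by generating pixel offsets that wind outward from a center and assigning data items to them in relevance order. In the Python below, the placement of the most relevant item at the center, the counter-clockwise winding, and the names `rectangular_spiral` and `layout_by_distance` are assumptions for illustration.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Pixel = Tuple[int, int]


def rectangular_spiral(n: int) -> List[Pixel]:
    """First n (x, y) offsets of a square spiral starting at (0, 0)."""
    coords: List[Pixel] = []
    x = y = 0
    dx, dy = 1, 0
    leg = 1
    while len(coords) < n:
        for _ in range(2):            # two legs share each leg length
            for _ in range(leg):
                if len(coords) == n:
                    return coords
                coords.append((x, y))
                x, y = x + dx, y + dy
            dx, dy = -dy, dx          # turn 90 degrees
        leg += 1
    return coords


def layout_by_distance(items: Sequence, distance_of: Callable) -> Dict[Pixel, object]:
    """Place items on the spiral, smallest distance (most relevant) first."""
    ranked = sorted(items, key=distance_of)
    return dict(zip(rectangular_spiral(len(ranked)), ranked))
```

Each pixel would then be colored by the item's distance, so exact answers cluster in the middle and approximate answers fade outward.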


VisDB - Relevance
Relevance calculation based on “distance” of each variable from query specification
Distance calculation depends on data type
  – Numeric: mathematical
  – String: character/substring matching, lexical, phonetic?, syntactic?
  – Nominal: predefined distance matrix
  – Possibly other “domain-specific” distance metrics
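The per-type distances above can be sketched with one small function per data type. The Python below is an illustration under assumptions, not VisDB's actual metrics: the numeric distance is an absolute difference, the string distance uses the standard library's `difflib.SequenceMatcher` as a stand-in for character/substring matching, the nominal distance looks up an unordered pair in a predefined table, and the weighted sum used to combine them is a guess at one reasonable choice.

```python
from difflib import SequenceMatcher
from typing import Dict, FrozenSet


def numeric_distance(value: float, target: float) -> float:
    """Mathematical distance between a numeric value and the query value."""
    return abs(value - target)


def string_distance(value: str, target: str) -> float:
    """0.0 for identical strings, approaching 1.0 as they share less."""
    return 1.0 - SequenceMatcher(None, value, target).ratio()


def nominal_distance(value: str, target: str,
                     matrix: Dict[FrozenSet[str], float]) -> float:
    """Look up a predefined distance for an unordered pair of categories."""
    return 0.0 if value == target else matrix[frozenset((value, target))]


def combined_distance(distances: Dict[str, float],
                      weights: Dict[str, float]) -> float:
    """Weighted sum of per-variable distances (assumed combination rule)."""
    return sum(weights[name] * d for name, d in distances.items())
```

Sorting records by `combined_distance` gives the relevance ordering, with a distance of zero meaning an exact match to the query.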

VisDB – Screen Resolution
Stated screen resolution seems reasonable by today’s standards: 19 inch display, 1024x1280 pixels = 1.3 million data points
However, controls take up a lot of space!


VisDB – Implementation
Requires features not available in commercial databases:
  – Partial query results
  – Incremental changes to queries
  – Speed? (1994 vs today)

Limitations and Issues
Complexity
Abstract data
  – These visualizations are oriented toward abstract data
  – For “naturally” two- or three-dimensional data (things that vary over time or space, e.g., geographic data), visualizations which exploit those properties may exist and be more effective

User Testing?
Many of these systems seem only appropriate for expert use
Minimal evidence of user testing in most cases

Future Work
Save query parameters for reference / sharing results
Automated query generation or filtering – Intelligent agents?

Words of wisdom from Tweedie et al.
Trade-off between amount of information, simplicity, and accuracy
“It is often hard to judge what users will find intuitive and how [a visualization] will support a particular task”