Visualization Techniques for Multivariate Discrete and Continuous Data March 4, 2005 Rachael Brady.



Multivariate Data Types In general, each point has many attributes and/or measurements –Type 1: measurements are continuous in nature, and combining dimensions might make sense Weather data - for each x, y, z location we have water density (scalar), temperature (scalar), wind velocity (vector), air pressure (scalar) –Type 2: data is discrete, more like an attribute list, and cannot in general be combined Baseball statistics - for each player we have at bats, walks, hits, doubles, home runs, RBIs. Populations - eye color of residents in NC, income level, voting record

Approaches Dimensional Reduction –Principal Component Analysis –Independent Component Analysis –Kohonen Self Organizing Map Subsetting Dimensional Subsetting Dimensional Organization Dimensional Embedding Source: Matt Ward, Multivariate Vis talk Sept 2000
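As a rough illustration of the dimensional reduction approach, the sketch below finds the first principal component of a small table by power iteration on the sample covariance matrix. The function names `first_principal_component` and `project` are hypothetical, the data is made up, and a real analysis would normally use a linear algebra library and keep more than one component.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (unit direction, variance along it, column means)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeated multiplication by cov converges to the
    # eigenvector with the largest eigenvalue, provided the start vector
    # is not orthogonal to it (an assumption of this sketch).
    v = [1.0 / (j + 1) for j in range(d)]
    start_norm = math.sqrt(sum(x * x for x in v))
    v = [x / start_norm for x in v]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    variance = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
    return v, variance, means

def project(rows, direction, means):
    """Reduce each row to one coordinate along the given direction."""
    return [sum((r[j] - means[j]) * direction[j] for j in range(len(direction)))
            for r in rows]

# Made-up data lying exactly on the line y = 2x.
rows = [(1, 2), (2, 4), (3, 6), (4, 8)]
direction, variance, means = first_principal_component(rows)
reduced = project(rows, direction, means)
```

Here two perfectly correlated dimensions collapse onto one axis without losing any variance, which is the situation where combining dimensions "might make sense".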

Dimensional Subsetting - Scatter Plots Invoke the concept of small multiples Show all pairwise dimensions in a matrix Easily see clusters, trends and correlations Problem: How do you see a trend that requires 2 or more dependent variables? Source: Matt Ward, Multivariate Vis talk Sept 2000
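A scatter plot matrix has one panel per pair of dimensions. The sketch below only enumerates those panels and attaches a Pearson correlation to each, as a numeric stand-in for what the eye picks out in the plot. The helper names and the baseball-style numbers are made up for illustration.

```python
import math
from itertools import combinations

def pearson(xs, ys):
    """Pearson correlation, or None when either variable is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)

def scatter_matrix_panels(columns):
    """Map each unordered pair of column names to its correlation."""
    return {(a, b): pearson(columns[a], columns[b])
            for a, b in combinations(columns, 2)}

# Made-up values, one entry per player.
columns = {
    "hits": [1, 2, 3, 4],
    "at_bats": [2, 4, 6, 8],
    "walks": [4, 3, 2, 1],
}
panels = scatter_matrix_panels(columns)
```

With d dimensions there are d(d-1)/2 distinct panels, and each panel shows only two variables at a time, which is why a trend that depends on several variables jointly is hard to see.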

Dimensional Organization Show each variable with an explicit visual representation Spatial Shape Color Size Orientation Texture The combination of these visual variables can produce information that “pops out”, but it is not additive Images: Chris Healey

Dimensional Organization - Glyphs (show star glyph demo) Image: Matt Ward, Multivariate Vis talk Sept 2000
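For readers without the demo, a star glyph can be sketched as follows: each variable gets a ray at an evenly spaced angle, and the ray length is the variable's value scaled to its range. The function name `star_glyph` and its parameters are hypothetical, and the demo shown in the talk may have been constructed differently.

```python
import math

def star_glyph(values, mins, maxs, radius=1.0):
    """Return the ray end points (x, y) of a star glyph for one data point."""
    k = len(values)
    points = []
    for i, (v, lo, hi) in enumerate(zip(values, mins, maxs)):
        # Scale the value into [0, 1]; a constant variable gets length 0.
        t = (v - lo) / (hi - lo) if hi > lo else 0.0
        angle = 2 * math.pi * i / k
        points.append((radius * t * math.cos(angle),
                       radius * t * math.sin(angle)))
    return points

# One data point with four variables, each ranging over 0..10.
glyph = star_glyph([10, 5, 0, 10], [0] * 4, [10] * 4)
```

Joining the end points in order gives the glyph outline, so points with similar values produce similar shapes.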

Dimensional Organization - Parallel Coords Parallel Coordinates creates parallel, rather than orthogonal, dimensions. Data point corresponds to polyline across axes Clusters, trends, and anomalies discernible as groupings or outliers, based on intercepts and slopes Source: Matt Ward, Multivariate Vis talk Sept 2000 Show Parallel Coords Demo
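The mapping from a data point to a polyline can be sketched in a few lines: axis j sits at horizontal position j, and the vertex height on that axis is the value scaled to the axis range. The function name `parallel_coordinates` is hypothetical, and min-max scaling per axis is one common choice rather than the only one.

```python
def parallel_coordinates(rows):
    """Turn each row into a polyline of (axis index, scaled value) vertices."""
    d = len(rows[0])
    lows = [min(r[j] for r in rows) for j in range(d)]
    highs = [max(r[j] for r in rows) for j in range(d)]
    polylines = []
    for r in rows:
        line = []
        for j in range(d):
            span = highs[j] - lows[j]
            # A constant dimension is drawn at mid-height.
            y = (r[j] - lows[j]) / span if span else 0.5
            line.append((j, y))
        polylines.append(line)
    return polylines

# Three made-up points in three dimensions.
rows = [(0, 10, 5), (10, 0, 5), (5, 5, 5)]
polylines = parallel_coordinates(rows)
```

The first two polylines cross between axes 0 and 1, which is how a negative relationship between adjacent dimensions shows up in the display.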

Parallel Coords - Useful? Source:

Parallel Coords - Useful?

Parallel Coords - Extended Visualizing Hierarchical clusters, Fua et al. 1999

Approaches Dimensional Reduction –Principal Component Analysis –Independent Component Analysis –Kohonen Self Organizing Map Subsetting Dimensional Subsetting Dimensional Organization Dimensional Embedding Source: Matt Ward, Multivariate Vis talk Sept 2000

Dimensional Embedding Dimensional stacking divides data space into bins Each N-D bin has a unique 2-D screen bin Screen space recursively divided based on bin count for each dimension Clusters and trends manifested as repeated patterns Source: Matt Ward, Multivariate Vis talk Sept 2000
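The bin-to-screen mapping can be sketched as below. The assumption here is that dimensions are listed from outermost to innermost and alternate between the horizontal and vertical screen axes; the name `stack_bin` is hypothetical, and actual tools let the user choose the order and orientation of each dimension.

```python
from itertools import product

def stack_bin(bin_index, bin_counts):
    """Map an N-D bin index to a unique (x, y) screen bin.

    Even-numbered dimensions subdivide the horizontal axis and
    odd-numbered dimensions subdivide the vertical axis, so each
    earlier dimension forms the outer grid for the later ones.
    """
    x = y = 0
    for dim, (b, c) in enumerate(zip(bin_index, bin_counts)):
        if not 0 <= b < c:
            raise ValueError(f"bin {b} out of range for dimension {dim}")
        if dim % 2 == 0:
            x = x * c + b
        else:
            y = y * c + b
    return x, y

# Four dimensions with 2, 3, 4 and 5 bins give an 8 x 15 screen grid.
counts = (2, 3, 4, 5)
cells = {idx: stack_bin(idx, counts)
         for idx in product(*(range(c) for c in counts))}
```

Because the inner dimensions repeat inside every outer cell, a cluster or trend in the inner dimensions appears as the same pattern repeated across the outer grid.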

Dimensional Embedding - not so easy What Dimensions do you choose at what hierarchy? How do you keep coordinates consistent? How do you lay out tiles on page with consistency? Can we do this automatically? Producing a good plot is hard Trellis - an attempt by Rick Becker and Bill Cleveland Incorporated into the S/S-PLUS statistical package

A Digression into Plot design…

Effective use of space Which graph is better? Government payrolls in 1931 [How to Lie with Statistics, Huff 93] Slide Source: Maneesh Agrawala, Lecture Notes Fall 2005

Aspect Ratio - fill space with data Yearly CO2 concentrations [Cleveland 85] Don’t worry about showing zero Slide Source: Maneesh Agrawala, Lecture Notes Fall 2005

Banking to 45 Degrees Slide Source: Maneesh Agrawala, Lecture Notes Fall 2005
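Banking to 45 degrees means choosing the plot's aspect ratio so that the line segments are oriented around 45 degrees on screen. The sketch below uses one simple variant, median absolute slope banking, which may not be the exact method the slide illustrated; the function name `bank_to_45` is hypothetical.

```python
from statistics import median

def bank_to_45(xs, ys):
    """Return the height/width ratio at which the median absolute
    slope of the plotted line segments equals 1 (45 degrees)."""
    slopes = [abs((y1 - y0) / (x1 - x0))
              for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:]))
              if x1 != x0]
    if not slopes:
        raise ValueError("need at least one non-vertical segment")
    m = median(slopes)
    x_range = max(xs) - min(xs)
    y_range = max(ys) - min(ys)
    if m == 0 or x_range == 0:
        raise ValueError("cannot bank a flat or degenerate series")
    # On screen a data slope s becomes s * (height / y_range) / (width / x_range),
    # so setting the median of that to 1 gives the ratio below.
    return y_range / (x_range * m)

# A made-up zigzag whose segments all have data slope 10.
ratio = bank_to_45([0, 1, 2, 3], [0, 10, 0, 10])
```

For this series the result is 1/3, so the plot should be three times as wide as it is tall.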

Clearly mark scale breaks Slide Source: Maneesh Agrawala, Lecture Notes Fall 2005

Scale break vs. Log scale Both Increase Visual Resolution Log scale allows easy comparisons of all data Scale break is more difficult to compare across the break Slide Source: Maneesh Agrawala, Lecture Notes Fall 2005

Transforming Data for Graphing How well does the curve fit the data? Plot vertical distance from best fit curve Residual graph shows accuracy of fit Slide Source: Maneesh Agrawala, Lecture Notes Fall 2005
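The residual transformation can be sketched as follows, assuming for simplicity that the best fit curve is a least-squares straight line (the slide does not say which curve was fitted). The function names `fit_line` and `residuals` are hypothetical and the data is made up.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def residuals(xs, ys):
    """Vertical distance of each point from the fitted line."""
    slope, intercept = fit_line(xs, ys)
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

xs = [0, 1, 2, 3]
ys = [0, 1, 0, 1]
slope, intercept = fit_line(xs, ys)
res = residuals(xs, ys)
```

Plotting `res` against `xs` instead of the raw values uses the whole vertical range for the deviations, so structure left over after the fit is much easier to see.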

A Trellis Example Lead Concentration vs. Setback Distance Given Day-of-the-Week, Week, and Height On the next slide is a trellis display of lead concentration against setback distance given day-of-the-week (thu-wed), week (1-3), and height (3 values). There are 63 panels arranged into 21 columns and 3 rows. Each row conditions on a different value of height; as we go from bottom to top, the heights increase. The panels in each row are in time order because the panels first cycle through the days of the week and then through the weeks. The display reveals much about the structure of the data. There is a strong interaction between height and setback distance. For the lowest height, lead decreases with setback. But for the middle value of height, lead typically first increases with setback and then decreases. For the highest height, lead occasionally has the increase-decrease pattern for about 1/3 of the days, most of them days with large concentrations, and is relatively stable for the remaining days. This behavior is consistent with air transport mechanisms. Lead is emitted at ground level from automobile tail pipes. The closest of the 9 monitors, the one with the lowest height and the closest setback, has the largest concentrations because it is close to the pollution source. From the source, the lead is carried laterally by the wind, spreading upward as it moves. This plume-like behavior can cause the concentrations to be relatively small at the higher monitors at the closest setback. Source:
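The panel arrangement described above can be sketched as a small layout function. With 7 days and 3 weeks, each row has 7 × 3 = 21 columns, and 3 heights give 63 panels in total. The name `trellis_panels` and the height labels are placeholders, since the transcript only says there are three height values.

```python
from itertools import product

DAYS = ["thu", "fri", "sat", "sun", "mon", "tue", "wed"]
WEEKS = [1, 2, 3]
HEIGHTS = ["low", "middle", "high"]  # placeholder labels, lowest first

def trellis_panels(days, weeks, heights):
    """Map (row, column) to the conditioning values of that panel.

    Row 0 is the bottom row, so heights increase from bottom to top.
    Within a row the columns cycle through the days first and then
    the weeks, which keeps the panels in time order.
    """
    panels = {}
    for row, height in enumerate(heights):
        for col, (week, day) in enumerate(product(weeks, days)):
            panels[(row, col)] = {"height": height, "week": week, "day": day}
    return panels

panels = trellis_panels(DAYS, WEEKS, HEIGHTS)
```

Each panel would then hold its own plot of lead concentration against setback distance for the matching subset of the data.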

A Trellis Example Source:

Tensor Visualization High Dimensional Scientific Data Visualization Not Today

Some Interesting Web Sites The best and worst of statistical graphs – Chris Healey’s Preattentive Vision Applet – OpenDX Gallery – IVTK: An Information Visualization Toolkit –Ivtk.sourceforge.net Information Visualization Repository –

Resources Great sources for theory behind multivariate display and perception are –Bertin 1983 –Cleveland 1993 –Tufte 1983, 1990 –Colin Ware, 2000 A couple of good papers are –Shneiderman, “The Eyes Have It: A Task by Data Type Taxonomy for Information Visualizations” –Marc Green, “Toward a Perceptual Science of Multidimensional Data Visualization: Bertin and Beyond”