Visualization of Multidimensional Multivariate Large Dataset Presented by: Zhijian Pan University of Maryland.

Presentation transcript:

Visualization of Multidimensional Multivariate Large Dataset Presented by: Zhijian Pan University of Maryland

Description  Covered papers: –Alfred Inselberg, Multidimensional Detective –Ted Mihalisin, Visualizing Multivariate Functions, Data, and Distributions  The problem: –Visualization and analysis of large datasets with multiple parameters or factors, and the key relationships among them –The MDMV problem

Key words explanation  Multidimensional: –The dimensionality of independent variables  Multivariate: –The dimensionality of dependent variables  Example: –3-D volume space + temperature + pressure produces 3D2V data  The data set could be larger than the number of pixels

Four Stages of Development  1st: Graphical representation of one- or two-variate data, e.g. scatterplot, scatterplot matrix  2nd: Two-dimensional graphics that encode multiple parameters, e.g. color, size, and shape coding  3rd: High-dimensional graphics, high-speed computation, single display, such as Parallel Coords  4th: Elaboration and assessment of various visualization techniques

MDMV Visualization Category  Broadly categorized into five groups: –Brushing –Panel Matrix –Iconography –Hierarchical Displays –Non-Cartesian Displays

Group 1  Brushing –Direct manipulation of the MDMV visualization display: labeling, enhanced linking –E.g. brushing a scatterplot matrix

Group 2  Panel Matrix (pairwise 2-D plot, n-D box) –E.g. Hyperbox: n*n lines, n*(n-1)/2 faces –Elaboration of scatterplot matrix –Adding interactive data navigation (hyperbox cutting)
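The n*(n-1)/2 count above is the number of unordered pairs of variables, which is also the number of distinct panels in a scatterplot matrix. A minimal sketch of that arithmetic follows; the function name `pairwise_panels` and the variable names are illustrative assumptions, not taken from the covered papers.

```python
from itertools import combinations

def pairwise_panels(variables):
    """Return every unordered pair of variables (one 2-D panel per pair)."""
    return list(combinations(variables, 2))

# Hypothetical 4-variable data set: 4 * 3 / 2 pairwise panels.
names = ["x1", "x2", "x3", "x4"]
panels = pairwise_panels(names)
print(len(panels), panels)
```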

Group 3  Iconography: Glyphs: graphical entities which encode MDMV with shape, size, color, and position. –E.g. faceglyph: size and position of eyes, nose, mouth; curvature of mouth; angle of eyebrows

Group 4  Hierarchical Displays: –Map a subset of variates into different hierarchical displays –Dynamic interactive analysis –The Ted Mihalisin paper; more details follow

Group 4 (cont’d)  New term: speed = a variable's rank among the hierarchical axes  E.g. three variables x, y, and z, each taking values in {0, 1, 2}  x is the fastest axis, z the slowest axis

Group 4 (Cont’d)  Visualizing 3 variables: –2 independent variables: x, y: x = -2, -1, 0, 1, 2; y = -2, -1, 0, 1, 2 –1 dependent variable: z = x**2 + y**2 –so, a 2D1V problem –x fastest, y slowest
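The speed ordering above can be sketched as a nested enumeration: the fastest variable cycles through all its values before the next-slower variable advances, and a point's horizontal position is its index in that enumeration. This is a minimal sketch under that reading of the slide; `hierarchical_layout` and its parameters are hypothetical names, not the paper's implementation.

```python
from itertools import product

def hierarchical_layout(values_by_axis, speed_order):
    """Return the sample points in hierarchical-axis order.

    values_by_axis: dict mapping variable name -> list of sampled values.
    speed_order: variable names listed from fastest to slowest.
    The horizontal position of a point is its index in the returned list.
    """
    # itertools.product varies its LAST argument fastest, so pass the
    # axes slowest-first and the fastest axis last.
    slow_to_fast = list(reversed(speed_order))
    grids = [values_by_axis[name] for name in slow_to_fast]
    return [dict(zip(slow_to_fast, combo)) for combo in product(*grids)]

# The 2D1V example: z = x**2 + y**2 with x fastest, y slowest.
samples = [-2, -1, 0, 1, 2]
points = hierarchical_layout({"x": samples, "y": samples}, ["x", "y"])
z = [p["x"] ** 2 + p["y"] ** 2 for p in points]
print(z[:5])  # the y = -2 block, with x sweeping -2..2
```

Reversing `speed_order` reorders the same 25 points so that y sweeps first, which changes the shape of the plotted curve without changing the data.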

Group 4 (Cont’d)  3D1V: W = (x**2) * (e**-y) + z –Top panel speed order: x, y, z –Bottom panel speed order: z, y, x

Group 4 (cont’d)  What if the number of data points greatly exceeds the number of horizontal pixels assigned to the panel?  Example: 7 independent variables, each with 10 values: 10**7 = 10,000,000 points  Need: –Hierarchical subspace zooming to reduce the dimension

Group 4 (cont’d)  From 7D to 2D:
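The arithmetic behind the 7D example can be checked with a short sketch. It assumes one plausible reading of subspace zooming, namely that pinning a variable to a single value removes its factor from the point count; the names `point_count`, `levels`, and `pinned` are hypothetical.

```python
from math import prod

def point_count(levels, pinned=()):
    """Number of grid points still displayed.

    levels: dict mapping variable name -> number of sampled values.
    pinned: names of variables held at a single value (assumed zoom model).
    """
    return prod(n for name, n in levels.items() if name not in pinned)

# 7 independent variables with 10 values each.
levels = {f"v{i}": 10 for i in range(1, 8)}
full = point_count(levels)
# Pin the five slowest variables, leaving a 2D subspace.
zoomed = point_count(levels, pinned={"v3", "v4", "v5", "v6", "v7"})
print(full, zoomed)
```

Under this assumption the zoomed 2D subspace holds 100 points, which fits comfortably within a panel's horizontal pixels.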

Group 4 (cont’d)  Example: experimental data visualization: –Dependent: specific heat –Independent: fastest: temperature (white): Gaussian peak; then alloy concentration (blue): linear increase; then magnetic field (red): nonlinear decrease

Group 5  Parallel Coordinates –So many class presentations have already been done! –Everybody is already an expert at using it –What are some basic ideas behind it? –Cartesian vs. Parallel Coords

Group 5 (cont’d)  A Cartesian line: –L: x2 = m*x1 + b –A set of points sampled on this line  On Parallel Coords: –Each point becomes a line –The set of points becomes a set of intersecting lines

Group 5 (cont’d)  The intersection point:  The location of the intersection point is important! –Between the two axes: inversely proportional (x1 ∝ 1/x2) –Outside the two axes: directly proportional (x1 ∝ x2)
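The point-line duality above can be worked out directly. Assume the X1 axis is drawn at horizontal position 0 and the X2 axis at horizontal position d (this placement is an assumption for illustration). A point (x1, x2) becomes the segment from (0, x1) to (d, x2), and for points on x2 = m*x1 + b with m != 1 all such segments meet at (d/(1-m), b/(1-m)). That horizontal position lies between the axes exactly when m < 0 and outside them when m > 0. The function names below are hypothetical.

```python
def dual_point(m, b, d=1.0):
    """Common intersection of the segments for points on x2 = m*x1 + b."""
    if m == 1:
        # Segments are parallel: no finite intersection point.
        raise ValueError("m == 1 has no finite dual point")
    return d / (1 - m), b / (1 - m)

def height_at(x1, x2, h, d=1.0):
    """Height of the (extended) segment for point (x1, x2) at horizontal h."""
    return x1 + (h / d) * (x2 - x1)

# Negative slope: the dual point falls between the axes (0 < h < d).
h_neg, v_neg = dual_point(m=-1.0, b=4.0)
# Positive slope: the dual point falls outside the axes.
h_pos, v_pos = dual_point(m=2.0, b=1.0)

for x1 in [-3.0, 0.0, 1.5, 7.0]:
    # Every sampled point's segment passes through the dual point.
    assert abs(height_at(x1, -1.0 * x1 + 4.0, h_neg) - v_neg) < 1e-9
    assert abs(height_at(x1, 2.0 * x1 + 1.0, h_pos) - v_pos) < 1e-9

print((h_neg, v_neg), (h_pos, v_pos))
```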

Group 5 (cont’d)  Application example –Aircraft collision checking –Converting the problem into detecting a four-dimensional geometric intersection –Collision at (2,2,2,1)

Group 5 (cont’d)  Application example: –Economic model of a real country –8 variables: Agriculture, Fishing, Mining, Manufacturing, Construction, Government, Miscellaneous, GNP

Group 5 (cont’d)  A Least Squares function defines the boundary region in 8-dimensional space  Any point (shown as a polygonal line) inside the boundary represents a feasible economic policy for the country

Group 5 (cont’d)  Discoveries: –No policy would favor Agriculture without also favoring Fishing: (x1 ∝ x2) –Inverse relationship between Fishing and Mining: resource competition: (x1 ∝ 1/x2)

Notes on the References  Inselberg’s paper: –11 citations found on ResearchIndex –Applications in knowledge discovery, user interface, aircraft design, etc.  Mihalisin’s paper: –Only one citation found

Contribution  Inselberg’s paper: –Transforms MDMV hyperspace relations into a 2-D geometric pattern problem –Empirical studies demonstrated its strength in trade-off analysis, sensitivity discovery, and optimization  Mihalisin’s paper: –Hierarchical technique for visualizing data points that greatly exceed the number of pixels

Critique  Inselberg’s paper: –No comparison with other MDMV techniques –No examples supporting the claim that displayed objects can be recognized under projective transformations  Mihalisin’s paper: –Limited number of values for each variable visualized in one display –No discussion of potential information loss with a coarse-grained grid

Favorite Sentence  “You can’t be unlucky all the time!” –Multiple techniques exist for the MDMV visualization problem –Each has strengths and weaknesses –Whichever you start with, you can’t be unlucky all the time! –Integration and collaboration of existing tools remain active research topics.