Visualizing quantitative information martin krzywinski.

Slides:



Advertisements
Similar presentations
Rubric Unit Plan Univariate Bivariate Examples Resources Curriculum Statistically Thinking A study of univariate and bivariate statistics.
Advertisements

Lecture 06: Design II February 5, 2013 COMP Visualization.
Isoparametric Elements Element Stiffness Matrices
Mkweb.bcgsc.ca/circos Martin Krzywinski
Graphical Examination of Data Jaakko Leppänen
Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 3.1 Chapter Three Art and Science of Graphical Presentations.
Describing the Relation Between Two Variables
x – independent variable (input)
Graphical Data Displays and Interpretation 2009 October 9.
Ka-fu Wong © 2007 ECON1003: Analysis of Economic Data Lesson2-1 Lesson 2: Descriptive Statistics.
Graphical Data Displays and Interpretation Wednesday, October 9.
Scientific Communication and Technological Failure presentation for ILTM, July 9, 1998 Dan Little.
Visualization and Data Mining. 2 Outline  Graphical excellence and lie factor  Representing data in 1,2, and 3-D  Representing data in 4+ dimensions.
Correlation and Regression. Correlation What type of relationship exists between the two variables and is the correlation significant? x y Cigarettes.
Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 3.1 Chapter Three Art and Science of Graphical Presentations.
2007 會計資訊系統計學 ( 一 ) 上課投影片 3.1 Chapter Three Art and Science of Graphical Presentations.
James Tam Information Visualization Concepts covered What is Information Visualization? Tufte's Principles for Information Visualization. Visual Variables.
Quantitative Business Analysis for Decision Making Simple Linear Regression.
Illustration & Photography- Ch 9 Creating Graphics Illustration- using images that represent or express to make a visual statement Can show something that.
Presenting information
1 Cornerstones of Managerial Accounting, 2e Copyright © 2008 Thomson South-Western, a part of the Thomson Corporation. Thomson, the Star logo, and South-Western.
Guilford County SciVis V105.01
1 Determining Effective Data Display with Charts.
Vector vs. Bitmap SciVis V
Chapter 3 Cost Behaviour
V Obtained from a summer workshop in Guildford County July, 2014
Information Visualization in Data Mining S.T. Balke Department of Chemical Engineering and Applied Chemistry University of Toronto.
Regression and Correlation
Minard Saladino By:. Introduction: Illustrator is a vector-based imaging program. Unlike PhotoShop, which deals in pixels (raster images), this one deals.
Jeffrey Nichols Displaying Quantitative Information May 2, 2003 Slide 0 Displaying Quantitative Information An exploration of Edward R. Tufte’s The Visual.
Information Graphics Joyeeta Dutta-Moscato July 9, 2013.
Descriptive Methods in Regression and Correlation
Charts and Graphs V
How to Analyze Data? Aravinda Guntupalli. SPSS windows process Data window Variable view window Output window Chart editor window.
Tabular Display of Data Prepared by: Gary Klass
Census A survey to collect data on the entire population.   Data The facts and figures collected, analyzed, and summarized for presentation and.
Writing Business Reports. Introduction Gives background of problem or assignment. Introduces the subject and shows why it is significant or important.
Probabilistic and Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010.
Vector vs. Bitmap
Chapter 1 Review MDM 4U Mr. Lieff. 1.1 Displaying Data Visually Types of data Quantitative Discrete – only whole numbers are possible Continuous – decimals/fractions.
Chapter 12 Examining Relationships in Quantitative Research Copyright © 2013 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin.
Are You Smarter Than a 5 th Grader?. 1,000,000 5th Grade Topic 15th Grade Topic 24th Grade Topic 34th Grade Topic 43rd Grade Topic 53rd Grade Topic 62nd.
Digital Images Can show something that cannot be photographed Illustration- using images that represent or express to make a visual statement.
Microsoft Office Illustrated Fundamentals Unit I: Working with Charts.
Examining Relationships in Quantitative Research
GRAPHING IN CHEMISTRY YEAR 11 DP CHEMISTRYR. SLIDER.
Basic Concepts of Correlation. Definition A correlation exists between two variables when the values of one are somehow associated with the values of.
1 CSE 2337 Chapter 3 Data Visualization With Excel.
Chapter 3 Response Charts.
CHAPTER 2 SECTION 2-1 PATTERNS AND ITERATIONS SEQUENCE An arrangement of numbers in a particular order. The numbers are called terms and the pattern.
CS 235: User Interface Design November 19 Class Meeting Department of Computer Science San Jose State University Fall 2014 Instructor: Ron Mak
CONFIDENTIAL Data Visualization Katelina Boykova 15 October 2015.
CS 235: User Interface Design April 30 Class Meeting Department of Computer Science San Jose State University Spring 2015 Instructor: Ron Mak
Technical Communication A Practical Approach Chapter 13: Graphics William Sanborn Pfeiffer Kaye Adkins.
Recap Iterative and Combination of Data Visualization Unique Requirements of Project Avoid to take much Data Audience of Problem.
Visual Presentation of Quantitative Data Cliff Shaffer Virginia Tech Fall 2015.
Cartography: Communicating Spatial Information Scott Bell GIS Institute.
Data Visualization.
Infographic (informational graphic) Edward TufteEdward Tufte in The Visual Display of Quantitative Information defines 'graphical displays' in the following.
1 Statistical Analysis - Graphical Techniques Dr. Jerrell T. Stracener, SAE Fellow Leadership in Engineering EMIS 7370/5370 STAT 5340 : PROBABILITY AND.
Guilford County SciVis V104.03
DATA VISUALIZATION BOB MARSHALL, MD MPH MISM FAAFP FACULTY, DOD CLINICAL INFORMATICS FELLOWSHIP.
1 INTRODUCTION TO COMPUTER GRAPHICS. Computer Graphics The computer is an information processing machine. It is a tool for storing, manipulating and correlating.
Inserting and Working with Images
Information Visualization 1
Visual Presentation of Quantitative Data
Chapter 10 Correlation and Regression
Graphical Data Displays and Interpretation
Keller: Stats for Mgmt & Econ, 7th Ed
Keller: Stats for Mgmt & Econ, 7th Ed
Presentation transcript:

visualizing quantitative information martin krzywinski

outline best practices of graphical data design data-to-ink ratio cartjunk circos the visual display of quantitative information edward r tufte, 2001, 2nd ed

graphical displays essentials show the data induce viewer to think about substance rather than methodology encourage eye to compare different pieces of data avoid distorting what the data represents present many numbers in a small space make large data sets coherent reveal data at several levels of detail – broad overview and fine structure the visual display of quantitative information edward r tufte, 2001, 2nd ed

graphics reveal data and patterns each of these sets are described by the same linear model anscombe’s quartet each of the values below is the same for each set number of points average x average y regression line standard error of slope sum of squares residual sum of squares correlation coefficient r 2 the visual display of quantitative information edward r tufte, 2001, 2nd ed

graphics organize complex information some data sets are naturally better represented visually each of these data maps portrays ~21,000 numbers although very dense, the images draw attention to hot spots death rate from various cancers femalesmales the visual display of quantitative information edward r tufte, 2001, 2nd ed

graphics organize dense information locations and boundaries of 30,000 communes in France 240,000 numbers the visual display of quantitative information edward r tufte, 2001, 2nd ed

graphics organize dense information 1,024 x 2,222 sky divisions 10 grey tones pixel grey value denotes number of galaxies in corresponding sky region density of data commensurate with a photograph, but quantitative the visual display of quantitative information edward r tufte, 2001, 2nd ed

graphics simplify complex information TGV the visual display of quantitative information edward r tufte, 2001, 2nd ed

when the image is the data the visual medium is ideal for depicting multivariate data arguably univariate and bivariate data should be tabularized, within reason this example shows a plot for a case where data cannot be easily parametrized the visual display of quantitative information edward r tufte, 2001, 2nd ed

parametrization of multivariate data the 2D plane can depict high-dimension data chernoff faces are data encodings designed for easy identification of outliers parameters are mapped to head shape, eye distance, nose and lip size smoothly varying data corresponds to smoothly varying chernoff population the visual display of quantitative information edward r tufte, 2001, 2nd ed

data-to-ink ratio proportion of graphic’s ink devoted to the non-redundant display of data information 1.0 – proportion of a graphic that can be erased without loss of data information data-to-ink ratio should always be maximized, within reason the visual display of quantitative information edward r tufte, 2001, 2nd ed

data-to-ink ratio highshockingly low the visual display of quantitative information edward r tufte, 2001, 2nd ed

data-to-ink ratio originaldeleted components modified to increase data-to-ink ratio the visual display of quantitative information edward r tufte, 2001, 2nd ed

shrink your graphics dense data can be depicted within a small area without loss of clarity as long as data-to-ink ratio is high good graphics are informative dense multivariate strive to give your viewer the greatest number of ideas in the shortest time with the least ink in the smallest space the visual display of quantitative information edward r tufte, 2001, 2nd ed

cartjunk excessive use of grids and patterns cause perceived vibrations avoid hatched patterns to limit moire avoid excessive use of decorative forms the visual display of quantitative information edward r tufte, 2001, 2nd ed

the shimmering statistic natural eye tremor and dense fill patterns produce a shimmering effect this is annoying and tiring the visual display of quantitative information edward r tufte, 2001, 2nd ed

circos there are many genome browsers and visualizers already available – do we really need another one? communicating data visually critical for large data sets there certain types of data that obfuscate common diagram formats standard 2D plots (2 perpendicular axes) are inadequate

scalar mappings scalar valued mappings are common and easily handled input genomic position is a scalar input when the output is real-valued (GC content, conservation, etc) use a histogram, line plot, scatter plot genome position on x-axis function value on y-axis

genome-to-genome mappings output scalar is often a genome position (G2G) range may be the same genome, or a different genome G2G is also common, but less easily handled genome position genome position

drawing G2G mappings

Genome Res Jan;13(1):37-45

drawing G2G mappings Genome Res Jan;13(1):37-45

drawing G2G mappings Genome Res May;15(5):629-40

drawing G2G mappings

Genome Res Jan;13(1):37-45

drawing G2G mappings

drawing G2G mappings

drawing G2G mappings

dealing with G2G mappings reduce information content in figures plot/colourmap target chromosome, not position

dealing with G2G mappings Genome Res Apr;14(4):685-92

reduce sampling Genome Res Jan;15(1):98-110

rearrange axes

partition data

recompose axis layout – circos

circos written in Perl Apache-style configuration file plain text data input PNG output

G2G in circos display characteristics of most elements are customizable data-driven formatting rules support for data layers

2D data in circos

box scatter line

tiles heatmaps histogram chr2 2D data in circos

non-linear scaling global scaling – scale of each ideogram can be adjusted e.g. chr 1 drawn at 8x local scaling – any region can be locally expanded or contracted e.g Mb on chr1 expanded 5x

non-linear scaling

circos in comparative genomics human chr1 mouse chr1 mouse chr3

circos in comparative genomics chlamydia D fingerprint map vs chlamydia D sequence

circos in comparative genomics chlamydia L fingerprint map vs chlamydia D sequence

alignments drawn as ribbons blast of regions of chr14 vs chr22 single alignment

circos is flexible

mkweb.bcgsc.ca/circos download documentation tutorials circos art