Visualization UW CSE 190p Summer 2012. BARE BONES VISUALIZATION IN PYTHON WITH MATPLOTLIB.

Slides:



Advertisements
Similar presentations
Multiple Regression in Practice The value of outcome variable depends on several explanatory variables. The value of outcome variable depends on several.
Advertisements

Machine Learning in Practice Lecture 7 Carolyn Penstein Rosé Language Technologies Institute/ Human-Computer Interaction Institute.
 Coefficient of Determination Section 4.3 Alan Craig
In Documents. Using Graphics to Think Preparing the graphics first helps you get started and sets out the framework of your written product.
Regression Regression: Mathematical method for determining the best equation that reproduces a data set Linear Regression: Regression method applied with.
LINEAR REGRESSION: Evaluating Regression Models. Overview Standard Error of the Estimate Goodness of Fit Coefficient of Determination Regression Coefficients.
Statistics for the Social Sciences
Bivariate Regression CJ 526 Statistical Analysis in Criminal Justice.
Curve Fitting and Interpolation: Lecture (IV)
Jamie Starke.  Sizing the Horizon: The Effects of Chart Size and Layering on the Graphical Perception of Time Series Visualizations ◦ J. Heer, N. Kong,
Statistical Analysis SC504/HS927 Spring Term 2008 Session 5: Week 20: 15 th February OLS (2): assessing goodness of fit, extension to multiple regression.
PSY 1950 Graph Design December 8, Why graph? Exploratory data analysis –Usually raw data –Tukey: a good graph “forces us to notice what we never.
Topics: Regression Simple Linear Regression: one dependent variable and one independent variable Multiple Regression: one dependent variable and two or.
Linear Regression and Correlation Topic 18. Linear Regression  Is the link between two factors i.e. one value depends on the other.  E.g. Drivers age.
Introduction to Linear Regression.  You have seen how to find the equation of a line that connects two points.
Relationships Among Variables
Guide to Using Excel For Basic Statistical Applications To Accompany Business Statistics: A Decision Making Approach, 6th Ed. Chapter 14: Multiple Regression.
Information Visualization in Data Mining S.T. Balke Department of Chemical Engineering and Applied Chemistry University of Toronto.
Linear Regression.
Introduction to Linear Regression and Correlation Analysis
Elements of Multiple Regression Analysis: Two Independent Variables Yong Sept
Simple Linear Regression
CMPT 880/890 Writing labs. Outline Presenting quantitative data in visual form Tables, charts, maps, graphs, and diagrams Information visualization.
Relationships between Variables. Two variables are related if they move together in some way Relationship between two variables can be strong, weak or.
September In Chapter 14: 14.1 Data 14.2 Scatterplots 14.3 Correlation 14.4 Regression.
Chapter 8: Regression Analysis PowerPoint Slides Prepared By: Alan Olinsky Bryant University Management Science: The Art of Modeling with Spreadsheets,
Multiple Regression. What Techniques Can Tell Us Chi Square- Do groups differ (nominal data)? T Test Do Groups/Variables differ? Gamma/Lambda/Kendall’s.
METHODS IN BEHAVIORAL RESEARCH NINTH EDITION PAUL C. COZBY Copyright © 2007 The McGraw-Hill Companies, Inc.
Linear correlation and linear regression + summary of tests
Sullivan – Fundamentals of Statistics – 2 nd Edition – Chapter 4 Section 2 – Slide 1 of 20 Chapter 4 Section 2 Least-Squares Regression.
Regression Regression relationship = trend + scatter
Visualization UW CSE 140 Winter matplotlib Strives to emulate MATLAB – Pro: familiar to MATLAB users – Pro: powerful – Con: not the best design.
© Copyright McGraw-Hill Correlation and Regression CHAPTER 10.
STA 286 week 131 Inference for the Regression Coefficient Recall, b 0 and b 1 are the estimates of the slope β 1 and intercept β 0 of population regression.
GoldSim Technology Group LLC, 2006 Slide 1 Sensitivity and Uncertainty Analysis and Optimization in GoldSim.
Information Design Trends Unit Three: Information Visualization Lecture 1: Escaping Flatland.
Chapter 2 Ordinary Least Squares Copyright © 2011 Pearson Addison-Wesley. All rights reserved. Slides by Niels-Hugo Blunch Washington and Lee University.
Linear correlation and linear regression + summary of tests Dr. Omar Al Jadaan Assistant Professor – Computer Science & Mathematics.
MAPS. Dr. John Snow’s Cholera Map of London
LEAST-SQUARES REGRESSION 3.2 Role of s and r 2 in Regression.
Data Analysis, Presentation, and Statistics
Visual Presentation of Quantitative Data Cliff Shaffer Virginia Tech Fall 2015.
Curve Fitting Introduction Least-Squares Regression Linear Regression Polynomial Regression Multiple Linear Regression Today’s class Numerical Methods.
Simple Linear Regression The Coefficients of Correlation and Determination Two Quantitative Variables x variable – independent variable or explanatory.
Section Copyright © 2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series.
Intro to Statistics for the Behavioral Sciences PSYC 1900 Lecture 7: Regression.
Chapter 8 Relationships Among Variables. Outline What correlational research investigates Understanding the nature of correlation What the coefficient.
Biostatistics Regression and Correlation Methods Class #10 April 4, 2000.
Topics, Summer 2008 Day 1. Introduction Day 2. Samples and populations Day 3. Evaluating relationships Scatterplots and correlation Day 4. Regression and.
Correlation and Regression Ch 4. Why Regression and Correlation We need to be able to analyze the relationship between two variables (up to now we have.
CHAPTER 5: Regression ESSENTIAL STATISTICS Second Edition David S. Moore, William I. Notz, and Michael A. Fligner Lecture Presentation.
1 Objective Given two linearly correlated variables (x and y), find the linear function (equation) that best describes the trend. Section 10.3 Regression.
Correlation & Simple Linear Regression Chung-Yi Li, PhD Dept. of Public Health, College of Med. NCKU 1.
Quiz.
Chapter 12 Understanding Research Results: Description and Correlation
Modeling in R Sanna Härkönen.
Linear Regression CSC 600: Data Mining Class 13.
Multiple Regression.
Linear Regression Prof. Andy Field.
Multiple Regression.
Visualization UW CSE 160 Spring 2018.
Theme 7 Correlation.
CHAPTER 29: Multiple Regression*
What is Information Visualization?
Visualization UW CSE 160 Winter 2016.
Visualization UW CSE 160 Winter 2017.
IS-171 Computing With Spreadsheets
11C Line of Best Fit By Eye, 11D Linear Regression
Include your personal presentation if necessary.
Visualization UW CSE 160 Spring 2015.
Presentation transcript:

Visualization UW CSE 190p Summer 2012

BARE BONES VISUALIZATION IN PYTHON WITH MATPLOTLIB

matplotlib A major design limitation is that it stives to emulate MATLAB – More on this in the next lecture One important function for HW6: plot(xvalues, yvalues)

Plot import matplotlib.pyplot as plt xs = [1,2,3,4,5] ys = [x**2 for x in xs] plt.plot(xs, ys) no return value? We are operating on a “hidden” variable representing the figure. This is a terrible, terrible trick. Its only purpose is to pander to MATLAB users. I’ll show you how this works in the next lecture

import matplotlib.pyplot as plt xs = range(-100,100,10) x2 = [x**2 for x in xs] negx2 = [-x**2 for x in xs] plt.plot(xs, x2) plt.plot(xs, negx2) plt.xlabel("x”) plt.ylabel("y”) plt.ylim(-2000, 2000) plt.axhline(0) # horiz line plt.axvline(0) # vert line plt.savefig(“quad.png”) plt.show() Incrementally modify the figure. Show it on the screen Save your figure to a file

We can group these options into functions as usual, but remember that they are operating on a global, hidden variable

WHY VISUALIZE DATA? Review Bill Howe, eScience Institute

John Snow Location of deaths in the 1854 London Cholera Epidemic. X marks the locations of the water pumps Dr. John Snow

Anscombe ’ s Quartet

Anscombe ’ s Quartet (2) mean of the x values = 9.0 mean of the y values = 7.5 equation of the least-squared regression line: y = x sums of squared errors (about the mean) = regression sums of squared errors (variance accounted for by x) = 27.5 residual sums of squared errors (about the regression line) = correlation coefficient = 0.82 coefficient of determination = 0.67

Anscombe ’ s Quartet (3)

Another example: Pearson Correlation

Other reasons? Visualization is the highest bandwidth channel into the human brain [Palmer 99] The visual cortex is the largest system in the human brain; it’s wasteful not to make use of it. As data volumes grow, visualization becomes a necessity rather than a luxury. – “A picture is worth a thousand words”

What is the rate-limiting step in data understanding? Processing power: Moore’s Law Amount of data in the world slide src: Cecilia Aragon, UW HCDE

What is the rate-limiting step in data understanding? Processing power: Moore’s Law Human cognitive capacity Idea adapted from “Less is More” by Bill Buxton (2001) Amount of data in the world slide src: Cecilia Aragon, UW HCDE

What makes a good visualization? Edward Tufte: Minimize the Lie Factor Lie Factor = Size of effect in the visualization Size of effect in the data

Example Tufte 1997

What makes a good visualization? Edward Tufte: Maximize the data-ink ratio

Example: High or Low Data Ink ratio?

Bateman et al: The Effects of Visual Embellishment on Comprehension and Memorability of Charts There was no significant difference between plain and image charts for interactive interpretation accuracy (i.e., when the charts were visible). There was also no significant difference in recall accuracy after a five-minute gap. After a long-term gap (2-3 weeks), recall of both the chart topic and the details (categories and trend) was significantly better for Holmes charts. Participants saw value messages in the Holmes charts significantly more often than in the plain charts. Participants found the Holmes charts more attractive, most enjoyed them, and found that they were easiest and fastest to remember.

What makes a good visualization? Edward Tufte: Small multiples

What makes a good visualization? Jock Mackinlay: Use the appropriate visual element for the relationship and data being analyzed Conjectured rank effectiveness of each visualization method by data type

What makes a good visualization? Tufte again: Small multiples

Lloyd Treinish, IBM Research, What makes a good visualization? Lloyd Treinish: Color Matters

Color Matters (2) Lloyd Treinish, IBM Research, ibm.com/people/l/llo ydt/

A Nice Example Bergstrom, Rosvall, 2011