Statistics Lecture 4: Relationships Between Measurement Variables


Thought Question 1: There is a positive correlation between SAT score and GPA. For used cars, there is a negative correlation between age of the car and selling price. What does that mean?

Thought Question 2: If you had a scatter plot comparing the heights of a number of fathers and their adult sons, how could you use it to predict the adult height of a child?

Thought Question 3: Would these pairs of variables have a positive correlation, a negative correlation, or no correlation? Calories eaten per day and weight; calories eaten per day and IQ; wine consumed and driving ability; number of priests and amount of liquor sold in Portuguese cities; heights of husbands and heights of wives.

Goals for this lecture: Get the idea of a statistical relationship and statistical significance. Understand the meaning of correlation between two measurement variables. Learn how to use the linear relationship between two variables to predict one value, given the other.

Relationships. Deterministic: You can predict one variable exactly given another (example: distance traveled at a constant speed, given time). Statistical: You can describe a relationship between variables, but it isn’t precise because of natural variability (example: the average relationship between height and weight).

Remember How to Build a Scatter Plot? Doig

Relationship between Height and Weight

Statistical Significance: Often we must use a sample to tell us about a population. We want to know if any relationships observed in the sample are “real” and not just chance.

Rule of Thumb: A statistical relationship is considered significant if it is stronger than 95% of the relationships we’d expect to see by chance.

Be aware of sample size. Statistical significance is affected by sample size: It’s easy to rule out chance if you have lots of observations (but the relationship still may not be strong or useful). On the other hand, even a strong relationship may not achieve statistical significance if the sample is small.
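One way to make the 95% rule of thumb concrete is a shuffle (permutation) test: scramble one variable many times, and see how often chance alone produces a relationship as strong as the observed one. The sketch below is only an illustration; the height and weight values are made up, and the names `pearson_r` and `chance_fraction` are hypothetical helpers, not something from the lecture.

```python
import random
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def chance_fraction(xs, ys, trials=2000, seed=0):
    """Fraction of random re-pairings whose |r| is at least the observed |r|."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)          # break any real link between x and y
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    return hits / trials

# Hypothetical sample: heights in inches, weights in pounds.
heights = [62, 64, 66, 67, 68, 70, 71, 72, 74, 75]
weights = [120, 138, 135, 150, 148, 160, 155, 170, 168, 180]

p = chance_fraction(heights, weights)
significant = p < 0.05                 # stronger than 95% of chance relationships
print("fraction of shuffles at least as strong:", p)
print("significant by the rule of thumb:", significant)
```

Running the same function on only three or four of these pairs usually gives a much larger fraction, which is the sample-size caveat in action.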

Relationship between Height and Weight

Strength of Relationship? Correlation (also called the correlation coefficient or Pearson’s r) is the measure of strength of the linear relationship between two variables. Think of strength as how closely the data points come to falling on a line drawn through the data.

Features of Correlation: Correlation can range from +1 to -1. Positive correlation: as one variable increases, the other increases. Negative correlation: as one variable increases, the other decreases. Zero correlation means the best line through the data is horizontal. Correlation isn’t affected by the units of measurement.
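These features can be checked numerically. The following sketch computes Pearson's r by hand for a small made-up sample, then recomputes it after converting inches to centimeters and pounds to kilograms. The data and the helper name `pearson_r` are assumptions for illustration only.

```python
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical sample: heights in inches, weights in pounds.
heights_in = [62, 64, 66, 67, 68, 70, 71, 72, 74, 75]
weights_lb = [120, 138, 135, 150, 148, 160, 155, 170, 168, 180]

# Same people, different units.
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.4536 for w in weights_lb]

r_imperial = pearson_r(heights_in, weights_lb)
r_metric = pearson_r(heights_cm, weights_kg)

# Flipping the sign of one variable flips the sign of r.
r_negative = pearson_r(heights_in, [-w for w in weights_lb])

print("r in inches/pounds:", round(r_imperial, 4))
print("r in cm/kg:        ", round(r_metric, 4))
print("r with y negated:  ", round(r_negative, 4))
```

The first two values agree because rescaling a variable changes the numerator and denominator of r by the same factor.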

Positive Correlations: r = +.1, r = +.4, r = +.8, r = +1

Negative Correlations: r = -.1, r = -.4, r = -.8, r = -1

Zero correlation r = 0

Zero correlation

Number of Points Doesn’t Matter: r = .8

Important! Correlation does not imply causation.

Linear Regression: In addition to figuring the strength of the relationship, we can create a simple equation that describes the best-fit line (also called the “least-squares” line) through the data. This equation will help us predict one variable, given the other.

Best-fit (“least-squares”) Line

Best-fit Line??? (much variance)

Best-fit Line? (less variance)

Best-fit Line! (least variance)

Remember 9th Grade Algebra? x = horizontal axis; y = vertical axis. Equation for a line: y = slope*x + intercept or, as it is often stated, y = mx + b.

Don’t panic! You won’t have to calculate the least-squares line equation yourself. Instead, you can use functions built into common computer programs like Microsoft Excel or even many pocket calculators. (But you do need to know how to use the regression line equation.)
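For the curious, the calculation a spreadsheet performs is short. A minimal sketch, assuming the standard least-squares formulas (slope = sum of cross-deviations divided by sum of squared x-deviations; intercept = mean of y minus slope times mean of x). The function name `least_squares_line` and the tiny data set are hypothetical.

```python
from statistics import mean

def least_squares_line(xs, ys):
    """Return (m, b) for the least-squares line y = m*x + b."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    m = sxy / sxx          # slope
    b = my - m * mx        # intercept: the line passes through (mean x, mean y)
    return m, b

# Points that lie exactly on y = 2x + 1, so the fit should recover that line.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

m, b = least_squares_line(xs, ys)
print("slope:", m, "intercept:", b)
```

With real, scattered data the same function returns the line that minimizes the sum of squared vertical distances from the points.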

Excel Regression Output of Height vs. Weight
SUMMARY OUTPUT
Regression Statistics
Multiple R: 0.569
R Square: 0.324
Adjusted R Square: 0.320
Standard Error: [value not shown]
Observations: 174
Coefficients
Intercept: [value not shown]
height: 4.01

Plotting the regression line

Using the Regression Equation to Predict Y for a Given X. b: intercept = -123. m: coefficient of height (x) = 4. y = mx + b, so weight = (4 * height) - 123. “Predicted” weight for 68 inches: weight = (4 * 68) - 123 = 149 pounds.

Predict Weight for a Given Height: weight = (4 * height) - 123. 62 inches: (4 * 62) - 123 = 125 lbs. 75 inches: (4 * 75) - 123 = 177 lbs. 70 inches: (4 * 70) - 123 = 157 lbs.
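The prediction step is a single line of arithmetic, sketched here with the rounded coefficients from the slides (slope 4, intercept -123). The function name `predict_weight` is hypothetical; 62 inches is included because it is the height at which this equation gives 125 lbs.

```python
def predict_weight(height_in, slope=4, intercept=-123):
    """Predicted weight in pounds for a height in inches: y = m*x + b."""
    return slope * height_in + intercept

for height in (62, 68, 70, 75):
    print(height, "inches ->", predict_weight(height), "lbs")
```

These are predictions of the average weight at each height, not a claim about any one person.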

What’s the point? Regression shows what a dependent (y) variable is “predicted” to be, given a value for the independent (x) variable. Definition: The residual is the amount by which an actual dependent (y) value differs from the “predicted” value. Definition: R-squared is the percentage of the variance of the dependent (y) variable around its mean that is explained by the independent (x) variable.
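Both definitions can be turned into a few lines of code. The sketch below fits a least-squares line to a small made-up sample, computes each residual (actual minus predicted), and computes R-squared as 1 minus the ratio of residual variation to total variation around the mean. The data and function names are assumptions for illustration. For a single x variable, R-squared equals the square of the correlation, which matches the Excel output in the lecture: 0.569 squared is about 0.324.

```python
from statistics import mean

def least_squares_line(xs, ys):
    """Return (m, b) for the least-squares line y = m*x + b."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    m = sxy / sxx
    return m, my - m * mx

def residuals_and_r_squared(xs, ys):
    """Residuals (actual - predicted) and R-squared for the fitted line."""
    m, b = least_squares_line(xs, ys)
    predicted = [m * x + b for x in xs]
    residuals = [y - p for y, p in zip(ys, predicted)]
    ss_res = sum(e ** 2 for e in residuals)           # unexplained variation
    my = mean(ys)
    ss_tot = sum((y - my) ** 2 for y in ys)           # total variation from the mean
    return residuals, 1 - ss_res / ss_tot

# Hypothetical sample: heights in inches, weights in pounds.
heights = [62, 64, 66, 67, 68, 70, 71, 72, 74, 75]
weights = [120, 138, 135, 150, 148, 160, 155, 170, 168, 180]

residuals, r2 = residuals_and_r_squared(heights, weights)
print("residuals:", [round(e, 1) for e in residuals])
print("R-squared:", round(r2, 3))
```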

Excel Regression Output of Height vs. Weight
SUMMARY OUTPUT
Regression Statistics
Multiple R: 0.569
R Square: 0.324
Adjusted R Square: 0.320
Standard Error: [value not shown]
Observations: 174
Coefficients
Intercept: [value not shown]
height: 4.01

Demo

Regression in CAR: school test scores; cheating in school test scores; tenure of white vs. black coaches in the NBA; racial profiling in traffic stops; Miami criminal justice.

Extrapolation? Beware! Don’t use your regression equation very far outside the boundaries of your data, because the relationship may not hold. Words vs. age (r = .993 for ages 2-6): Words = (562 * Age) - 764. Age 1: (562 * 1) - 764 = -202 words???
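A simple guard against this mistake is to have the prediction report whether the input lies inside the range the line was fitted on. The sketch below uses the slope of 562 and the age range 2-6 from the slide. The intercept of -764 is an assumption, inferred from the slide's own result of -202 words at age 1, and the function name `predict_words` is hypothetical.

```python
def predict_words(age, slope=562, intercept=-764, data_range=(2, 6)):
    """Return (predicted vocabulary size, True if age is inside the fitted range)."""
    low, high = data_range
    inside = low <= age <= high
    return slope * age + intercept, inside

for age in (1, 2, 4, 6, 10):
    words, inside = predict_words(age)
    note = "" if inside else "  <- outside ages 2-6, do not trust"
    print(f"age {age}: {words} words{note}")
```

The arithmetic works at any age, so only the flag tells you that the age 1 and age 10 predictions fall outside the data.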

Negative Weight? Data area

Mark Twain and the length of the Mississippi River. From “Life on the Mississippi” (1883): In 176 years, the river was shortened by 403 kilometers, or about 2.3 kilometers per year. At that rate, a million years ago the Mississippi must have been 2.2 million kilometers long, and in 742 years it will be 2.9 kilometers long, joining Cairo, Illinois, and New Orleans. Twain: “There is something fascinating about science. One gets such wholesale returns of conjecture out of such a trifling investment of fact.”

Questions?