Lecture 11 Chapter 6. Correlation and Linear Regression.


6.1 Introduction
This chapter is concerned with relationships between continuous variables. Example (see Handout 11): During the 1950s radioactive water leaked into the Columbia River in Washington State. Data were collected on an exposure index (X) and the cancer mortality rate (Y, in deaths per 100,000 per year) for the years ..., for each of nine counties downstream. Exposure (x): ... Mortality (y): ...

Both variables X and Y are measurements on a continuous scale. We are interested in how these two variables are related, or associated. As usual, the sensible thing to do first is to have a look at the data. The best thing to do here is to plot the mortality rate against the exposure index....

The plot suggests that there is a clear relationship (association) between the mortality rate and the exposure index. The relationship looks approximately linear (like a straight line). In this chapter we do two things: 1. Use a measure called correlation to describe the strength of the association between two variables. 2. Use a method called linear regression to model the relationship between two variables which are associated in a way which is approximately linear.

6.2 Correlation
There are several different measures of association in use, but we will consider only the most common, which is called Pearson’s product moment correlation coefficient, or more briefly the sample linear correlation coefficient, or just the Pearson correlation. It is usually denoted by the letter r.

Additional Notes (Slide 1 of 2) The value of r always lies between -1 and +1; Values of r near to +1 indicate a strong positive linear relationship; Values of r near to -1 indicate a strong negative linear relationship; Values of r near to 0 indicate there is very little linear relationship.
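
As a rough illustration of how r can be computed, the sketch below applies the standard product-moment formula, r = Sxy / sqrt(Sxx × Syy), in plain Python. The function name `pearson_r` and the data values are made up for illustration; they are not the exposure-mortality data from the handout.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross-products and squares of deviations from the means
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical toy data (NOT the handout data)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
print(f"r = {r:.3f}")
```

For these toy values r comes out close to +1, because the points lie almost exactly on an increasing straight line. Reversing the sign of every y value would give the same magnitude with a negative sign.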

Additional Notes (Slide 2 of 2) Let’s see what Minitab tells us about the Pearson correlation for our example above. We use: Stat>Basic Statistics>Correlation... Minitab tells us two things: the Pearson correlation is r = ...; the P-value is 0.000.

Note that this correlation is close to +1, indicating a strong positive linear relationship. What about the P-value? It is the result of a hypothesis test of the null hypothesis H0: the linear correlation in the population is zero. Our value of p = 0.000 (as reported to three decimal places) indicates that we reject the null hypothesis. There does appear to be a strong positive linear relationship between exposure and mortality.
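
A test of zero population correlation is commonly based on the statistic t = r × sqrt(n − 2) / sqrt(1 − r²), which is compared with a t distribution on n − 2 degrees of freedom. The sketch below computes that statistic. It assumes this is the test behind the reported P-value, and the values of r and n used here are hypothetical, not the ones from the example.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for testing H0: population correlation is zero."""
    if n < 3:
        raise ValueError("need at least 3 observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and +1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical values for illustration only
r = 0.9
n = 9

t = correlation_t_statistic(r, n)
print(f"t = {t:.2f} on {n - 2} degrees of freedom")
```

The larger |t| is relative to the t distribution on n − 2 degrees of freedom, the smaller the P-value, and the stronger the evidence against H0.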

The correlation coefficient r is a very useful summary measure, but it is often misused. Some points to remember are as follows: 1. A high correlation does not necessarily imply a cause-and-effect relationship. 2. Although a value of r close to 1 does indicate a strong positive linear association, a linear relationship is not always the most appropriate description. Always produce a plot of y against x. 3. A value close to zero indicates little or no linear relationship. That does not necessarily mean there is no relationship!

For the data plotted below, r = 0.020 (p-value: ...). This correctly identifies that there is no linear relationship, but there clearly is a relationship!
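
The same point can be seen with a small constructed example. In the sketch below, y is determined exactly by x through y = x², yet the Pearson correlation is zero because the relationship is symmetric rather than linear. The data are invented for illustration and are not the data behind the plot; `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length samples."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Invented data: a perfect quadratic relationship, symmetric about x = 0
x = [-3, -2, -1, 0, 1, 2, 3]
y = [xi ** 2 for xi in x]

r = pearson_r(x, y)
print(f"r = {r:.3f}")
```

A scatter plot of these points would show a clear U shape, which is why plotting y against x is recommended before relying on r.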

6.3 Simple Linear Regression
The correlation coefficient tells us about the strength of a linear relationship, but it doesn’t allow us to do things like make predictions about new data. For this we need a model for the data. If we think there is an approximately linear relationship, we use the equation of a straight line, which relates X and Y: Y = α + βX. Here the values of α (alpha) and β (beta) are the intercept and the slope of the straight line respectively. The slope, β, is usually of much more interest, because it tells us how Y changes with X.

Since we don’t expect the data to lie exactly on a straight line, we always add a random error component, ε (epsilon), so the equation becomes: Y = α + βX + ε (Equation 1). Equation 1 is the equation of a simple linear regression. In order to use it to model our data, we need to choose the values of α and β which work best. E.g. for the exposure-mortality data, we might obtain....

Notice that in the plot above, α has been chosen as 118.4, and β as 9.03. This indicates that in our model, the mortality rate increases by 9.03 for every unit increase in the exposure index, and the mortality rate when the exposure index is zero is 118.4. But how were these values chosen? The usual criterion, and the one used above, is to use the least squares estimates for α and β...
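
The least squares estimates are the values of α and β that minimise the sum of squared vertical distances between the observed y values and the fitted line. For a single predictor they have the closed form slope = Sxy / Sxx and intercept = ȳ − slope × x̄. The sketch below is a minimal illustration of that calculation. The function names and the toy data are hypothetical, so the output will not reproduce the 118.4 and 9.03 quoted above.

```python
def least_squares_line(x, y):
    """Return (alpha_hat, beta_hat) for the line y = alpha + beta * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    beta_hat = s_xy / s_xx                  # slope
    alpha_hat = y_bar - beta_hat * x_bar    # intercept
    return alpha_hat, beta_hat


def residual_sum_of_squares(x, y, alpha, beta):
    """Sum of squared vertical distances from the points to the line."""
    return sum((yi - (alpha + beta * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical toy data (NOT the handout data)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

alpha_hat, beta_hat = least_squares_line(x, y)
rss = residual_sum_of_squares(x, y, alpha_hat, beta_hat)
prediction = alpha_hat + beta_hat * 6.0     # predicted y at a new x = 6

print(f"intercept = {alpha_hat:.2f}, slope = {beta_hat:.2f}")
print(f"residual sum of squares = {rss:.3f}")
print(f"predicted y at x = 6: {prediction:.2f}")
```

Any other choice of intercept and slope gives a residual sum of squares at least as large as the one at the least squares estimates, which is the sense in which these values "work best".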

We obtain these in Minitab using: Stat>Regression>Regression... if we want the equation etc., and Stat>Regression>Fitted Line Plot... if we want the graph with the fitted line superimposed.