The Practice of Statistics Third Edition Chapter 15: Inference for Regression Copyright © 2008 by W. H. Freeman & Company.

Presentation transcript:


Chapter Objectives Compute a confidence interval for the slope of the regression line. Conduct a test of the hypothesis that the slope of the regression line is 0 (or that the correlation is 0) in the population. We are doing inference on the LSRL, the least-squares regression line. We will be using statistics from the sample LSRL to estimate the parameters of the true regression line.

Example: Crying and IQ. Child development researchers explored the relationship between the crying of infants four to ten days old and their later IQ test scores. A snap of a rubber band on the sole of the foot caused the infants to cry. The researchers recorded the crying and measured its intensity by the number of peaks in the most active 20 seconds. They later measured the children’s IQ at the age of three years using the Stanford-Binet IQ test.

Infants’ crying and IQ scores: [table of Crying and IQ values, listed in four Crying/IQ column pairs, not reproduced here]. Discuss the five W’s and How (Who, What, When, Where, Why, and How) for the data.

Create a scatterplot of the data and calculate the correlation. The correlation between crying and IQ is r = ____. LSRL: ŷ = ____. Recall: to get the LSRL on the scatterplot, enter LinReg(a+bx) L1, L2, Y1. Interpret the LSRL.

Conditions for the Regression Model

Inference for Regression The heart of this model is that there is an “on the average” straight-line relationship between y and x. The true regression line says that the mean response moves along a straight line as the explanatory variable x changes. We can’t observe the true regression line. The values of y that we do observe vary about their means according to the Normal distribution. If we hold x fixed and take many observations of y, the Normal pattern will eventually appear in a stemplot or a histogram.
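In symbols, the model being described (standard notation, matching the parameters α, β, and σ used on the following slides):

\mu_y = \alpha + \beta x \qquad \text{(the true regression line: the mean response at each } x\text{)}
y = \alpha + \beta x + \varepsilon, \qquad \varepsilon \sim N(0,\ \sigma) \qquad \text{(individual responses vary Normally about the line)}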

Inference for Regression In practice, we observe y for many different values of x, so we see an overall linear pattern formed by points scattered about the true line. The standard deviation σ determines whether the points fall close to the true regression line (small σ) or are widely scattered (large σ).

The line in the figure is the true regression line. The mean of the response y moves along this line as the explanatory variable x takes different values. The Normal curves show how y will vary when x is held fixed at different values. All of the curves have the same σ, so the variability of y is the same for all values of x.

Checking Conditions Before we do inference, we must check these conditions one by one.
1. The observations are independent. Repeated observations on the same individual are not allowed.
2. The true relationship is linear. Always check your residual plot and the original scatterplot.
3. The standard deviation of the response about the true line is the same everywhere. Check the residuals for fanning (this is bad); the standard deviation needs to remain fixed, not change with x as the mean response changes with x.
4. The response varies Normally about the true regression line. Make a histogram of the residuals and check for clear skewness or other major departures from Normality.

Getting the residuals After you perform linear regression on your graphing calculator, the residuals are automatically stored as a list. To get them into L3 so you can use them, highlight L3, press 2nd STAT (LIST), scroll down until you see RESID, and press ENTER. Or you can enter L2 − Y1(L1).
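Outside the calculator, the same residuals and diagnostic plots can be produced with a short script. A minimal sketch, assuming NumPy and Matplotlib are available; the arrays x and y are hypothetical stand-ins for the crying counts and IQ scores:

import numpy as np
import matplotlib.pyplot as plt

# hypothetical data; replace with the actual crying counts (x) and IQ scores (y)
x = np.array([10, 12, 9, 16, 18, 15, 23, 13], dtype=float)
y = np.array([87, 97, 103, 106, 109, 114, 120, 123], dtype=float)

# least-squares fit, the same line LinReg(a+bx) produces on the calculator
b, a = np.polyfit(x, y, 1)       # polyfit returns coefficients highest power first
resid = y - (a + b * x)          # residual = observed y - predicted y

# residual plot: look for curvature (nonlinearity) or fanning (changing spread)
plt.scatter(x, resid)
plt.axhline(0, color="gray")
plt.xlabel("crying peaks")
plt.ylabel("residual")
plt.show()

# histogram of residuals: look for strong skewness or other departures from Normality
plt.hist(resid)
plt.show()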

Checking the residuals There does not seem to be a pattern, so a linear model is appropriate. The spread about the line does not seem to change as x increases, which tells us the standard deviation is constant.

Checking the residuals A stemplot or a histogram can be used to demonstrate that the residuals are approximately Normal. There is a slight right-skew, but we see no gross violations of the condition. HW: pg. 894 #15.1, 15.2, 15.4

Estimating the Parameters The first step in inference is to estimate the unknown parameters α, β, and σ. The slope b of the LSRL is an unbiased estimator of β. The intercept a of the LSRL is an unbiased estimator of α. We will use s (the sample standard deviation of the residuals) to estimate σ.
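Written out, the estimate of σ is the standard deviation of the residuals (the standard formula; its degrees of freedom are discussed on the next slide):

s = \sqrt{\dfrac{\sum (y_i - \hat{y}_i)^2}{n - 2}} = \sqrt{\dfrac{\sum \text{residual}^2}{n - 2}}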

The degrees of freedom of s are n − 2. To get Σ(residual²), perform 1-Var Stats on L3 (where you stored your residuals) and read off Σx². This is a rather tedious method; you can get s more directly from the LinRegTTest output (covered later).

Confidence Intervals for the Regression Slope The interval has the form b ± t*·SE_b; this is how the formula appears on the formula sheet.
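Written out, the level C confidence interval for the slope β (standard form; t* is the critical value with n − 2 degrees of freedom):

b \pm t^{*}\, SE_b, \qquad SE_b = \dfrac{s}{\sqrt{\sum (x_i - \bar{x})^2}}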

Confidence Intervals for the Regression Slope You should rarely have to calculate the standard error by hand; regression software will give the standard error along with b itself. On the calculator, press STAT, then TESTS, then LinRegTTest, using the crying and IQ data. In the output, s = ____ is the estimate for σ. The calculator does not report SE_b directly, but it can be recovered as SE_b = b/t.
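For reference, a software equivalent of LinRegTTest. A sketch assuming SciPy is available; the lists x and y are hypothetical stand-ins for the crying counts and IQ scores:

from scipy import stats

# hypothetical data; replace with the actual crying counts (x) and IQ scores (y)
x = [10, 12, 9, 16, 18, 15, 23, 13]
y = [87, 97, 103, 106, 109, 114, 120, 123]

res = stats.linregress(x, y)
print(res.slope, res.intercept)   # b and a of the LSRL
print(res.stderr)                 # SE_b, the standard error of the slope
print(res.pvalue)                 # two-sided P-value for H0: beta = 0
print(res.slope / res.stderr)     # the t statistic, t = b / SE_b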

Construct and interpret a 95% confidence interval for the mean IQ increase for each additional peak in crying. From the output, b = 1.4929 and SE_b = b/t ≈ 0.4870. t*(36) = invT(.975, 36) ≈ 2.028, or from the table with df = 30, t* = 2.042. b ± t*SE_b = 1.4929 ± (2.042)(0.4870) = 1.4929 ± 0.9944 = (0.4985, 2.4873). We are 95% confident that the true mean IQ increases by between 0.5 and 2.5 points for each additional peak in crying.
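A quick numerical check of this interval (a sketch assuming SciPy; the slope and standard error are the values quoted above):

from scipy import stats

b = 1.4929      # slope estimate from the regression output
se_b = 0.4870   # standard error of the slope
df = 36         # n - 2 = 38 - 2

t_star = stats.t.ppf(0.975, df)             # about 2.028
lower, upper = b - t_star * se_b, b + t_star * se_b
print(t_star, lower, upper)                 # about (0.51, 2.48); slightly narrower than the
                                            # table-based answer because it uses df = 36, not 30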

Minitab output You are going to have to be able to read different computer outputs for the exam.

Testing the Hypothesis of No Linear Relationship The most common hypothesis about the slope is H0: β = 0. The regression line with slope 0 is horizontal: the mean of y does not change at all when x changes. So this H0 says that there is no true linear relationship between x and y. We can use the test for zero slope to test the hypothesis of zero correlation between any two quantitative variables. Note: testing correlation only makes sense if the observations are a random sample.
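The test statistic for this hypothesis has the standard form (with n − 2 degrees of freedom):

t = \dfrac{b - 0}{SE_b} = \dfrac{b}{SE_b}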

Note: most software gives t and its P-value for a two-sided test.

Crying and IQ H0: β = 0, Ha: β ≠ 0. Our test statistic is t = b/SE_b = 1.4929/0.4870 ≈ 3.07, and our P-value is about 0.004. We have very strong evidence that IQ is correlated with crying.

Beer and BAC A previous example looked at how well the number of beers a student drinks predicts his or her blood alcohol content (BAC). Sixteen student volunteers at Ohio State University drank a randomly assigned number of cans of beer. Thirty minutes later, their BAC was measured by a police officer. [Table of Student, Beers, and BAC values, not reproduced here.]

1) Perform a hypothesis test to determine whether the number of beers has a positive effect on BAC.
2) Construct and interpret a 95% confidence interval for the true mean increase in blood alcohol content for each additional beer.
First check that the 4 conditions are satisfied.
1) Independence: we can treat the sample of students as an independent random sample, since the numbers of beers were assigned randomly.
2) Linear relationship: check the scatterplot and residual plot.

2) Checking linearity: the scatterplot shows a strong linear relationship and the residual plot shows no pattern, so a linear model is appropriate. 3) Checking standard deviation: overall, the residuals seem to stay about the same distance from y = 0. [Scatterplot and residual plot shown on slide.]

4) Checking Normality: a histogram of the residuals shows that the data are slightly skewed right, but there don’t seem to be any gross violations. We can now proceed with our calculations.

Here is the output from Minitab. What is the t test statistic? (Show your work rather than just reading it from the chart.) t = b/SE_b ≈ 7.48, dividing the slope estimate by its standard error from the output.

The P-value for a two-sided test is 0.000 (to three decimal places). The one-sided P-value is half of that, so it is also close to 0. We can reject H0 and conclude that there is strong evidence that an increased number of beers does increase BAC. The number of beers predicts blood alcohol level quite well: five beers produce a predicted average BAC of ŷ = −0.0127 + (0.0180)(5) ≈ 0.077, which is close to the legal driving limit of 0.08 in many states.
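As a check on that P-value (a sketch assuming SciPy; the test statistic and degrees of freedom are the values above):

from scipy import stats

t_stat = 7.48
df = 14                                    # n - 2 = 16 - 2

p_two_sided = 2 * stats.t.sf(t_stat, df)   # two-sided P-value
p_one_sided = stats.t.sf(t_stat, df)       # one-sided P-value for a positive slope
print(p_two_sided, p_one_sided)            # both are on the order of 10^-6, i.e. 0.000 to three decimals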

The 95% confidence interval is b ± t*SE_b. t*(14) = invT(.975, 14) = 2.145. SE_b = b/t ≈ 0.0180/7.48 ≈ 0.0024. b ± t*SE_b = 0.0180 ± (2.145)(0.0024) = 0.0180 ± 0.0051 = (0.0129, 0.0231). We are 95% confident that the true mean increase in BAC for each additional beer consumed is between about 0.013 and 0.023. HW: pg. 900 #15.6, 15.8 / pg. 908 #15.11