The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers CHAPTER 12 More About Regression 12.1 Inference for.


The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers CHAPTER 12 More About Regression 12.1 Inference for Linear Regression

Learning Objectives
After this section, you should be able to:
CHECK the conditions for performing inference about the slope β of the population (true) regression line.
INTERPRET the values of a, b, s, SE_b, and r² in context, and DETERMINE these values from computer output.
CONSTRUCT and INTERPRET a confidence interval for the slope β of the population (true) regression line.
PERFORM a significance test about the slope β of the population (true) regression line.

Introduction
When a scatterplot shows a linear relationship between a quantitative explanatory variable x and a quantitative response variable y, we can use the least-squares line fitted to the data to predict y for a given value of x. If the data are a random sample from a larger population, we need statistical inference to answer questions like these:
Is there really a linear relationship between x and y in the population, or could the pattern we see in the scatterplot plausibly happen just by chance?
In the population, how much will the predicted value of y change for each increase of 1 unit in x? What's the margin of error for this estimate?

Inference for Linear Regression
Below is a scatterplot of the duration of each eruption and the interval of time until the next eruption of the Old Faithful geyser for all 222 recorded eruptions in a single month. The least-squares regression line for this population of data has been added to the graph. We call this the population regression line (or true regression line) because it uses all the observations that month.

Sampling Distribution of b
The figures below show the results of taking three different SRSs of 20 Old Faithful eruptions in this month. Each graph displays the selected points and the LSRL for that sample. Notice that the slopes of the sample regression lines (10.2, 7.7, and 9.5) vary quite a bit from the slope of the population regression line.

Sampling Distribution of b
Confidence intervals and significance tests about the slope of the population regression line are based on the sampling distribution of b, the slope of the sample regression line.
Shape: We can see that the distribution of b-values is roughly symmetric and unimodal.
Center: The mean of the 1000 b-values is quite close to the slope of the population (true) regression line.
Spread: The standard deviation of the 1000 b-values estimates the standard deviation of the sampling distribution of b. Later, we will see that the standard deviation of the sampling distribution of b is actually 1.27.

Sampling Distribution of a Slope
Choose an SRS of n observations (x, y) from a population of size N with least-squares regression line predicted y = α + βx. Let b be the slope of the sample regression line. Then:
The mean of the sampling distribution of b is µ_b = β.
The standard deviation of the sampling distribution of b is σ_b = σ / (σ_x √n), as long as the 10% condition is satisfied. Here σ is the standard deviation of the responses about the population regression line and σ_x is the standard deviation of the x-values.
The sampling distribution of b will be approximately Normal if the values of the response variable y follow a Normal distribution for each value of the explanatory variable x (the Normal condition).
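The facts about the sampling distribution of b can be checked with a small simulation. The sketch below is only an illustration: the population is simulated, and the values of ALPHA, BETA, SIGMA, the range of x, and the population size are made-up assumptions, not the Old Faithful data. It draws 1000 SRSs of size 20, computes the slope b of each sample line, and compares the mean and standard deviation of those slopes with the population slope and with σ / (σ_x √n).

```python
import math
import random
import statistics


def slope(points):
    """Least-squares slope for a list of (x, y) pairs."""
    x_bar = statistics.fmean(p[0] for p in points)
    y_bar = statistics.fmean(p[1] for p in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    return sxy / sxx


rng = random.Random(12)  # fixed seed so the run is repeatable

# Hypothetical population: these parameter values are assumptions for illustration.
ALPHA, BETA, SIGMA = 30.0, 10.0, 6.0
population = []
for _ in range(2000):
    x = rng.uniform(1.5, 5.0)
    y = ALPHA + BETA * x + rng.gauss(0, SIGMA)  # Normal scatter about the line
    population.append((x, y))

pop_slope = slope(population)                         # slope of the population line
sigma_x = statistics.pstdev([p[0] for p in population])

n = 20  # 10% condition holds: 20 <= 0.10 * 2000
slopes = [slope(rng.sample(population, n)) for _ in range(1000)]

mean_b = statistics.fmean(slopes)
sd_b = statistics.stdev(slopes)
theory_sd = SIGMA / (sigma_x * math.sqrt(n))          # sigma / (sigma_x * sqrt(n))

print(f"population slope: {pop_slope:.2f}, mean of sample slopes: {mean_b:.2f}")
print(f"SD of sample slopes: {sd_b:.2f}, formula value: {theory_sd:.2f}")
```

The two printed pairs should be close but not identical, since the simulation uses only 1000 samples and the formula is itself an approximation for samples of this size.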

Conditions for Regression Inference
The regression model requires that for each possible value of the explanatory variable x:
1. The mean value of the response variable µ_y falls on the population (true) regression line µ_y = α + βx.
2. The values of the response variable y follow a Normal distribution with common standard deviation σ.

Conditions for Regression Inference
Suppose we have n observations on an explanatory variable x and a response variable y. Our goal is to study or predict the behavior of y for given values of x.
Linear: The actual relationship between x and y is linear. For any fixed value of x, the mean response µ_y falls on the population (true) regression line µ_y = α + βx.
Independent: Individual observations are independent of each other. When sampling without replacement, check the 10% condition.
Normal: For any fixed value of x, the response y varies according to a Normal distribution.
Equal SD: The standard deviation of y (call it σ) is the same for all values of x.
Random: The data come from a well-designed random sample or randomized experiment.

How to Check Conditions for Inference
Start by making a histogram or Normal probability plot of the residuals and also a residual plot.
Linear: Examine the scatterplot to check that the overall pattern is roughly linear. Look for curved patterns in the residual plot. Check to see that the residuals center on the "residual = 0" line at each x-value in the residual plot.
Independent: Look at how the data were produced. Random sampling and random assignment help ensure the independence of individual observations. If sampling is done without replacement, check the 10% condition.
Normal: Make a stemplot, histogram, or Normal probability plot of the residuals and check for clear skewness or other major departures from Normality.
Equal SD: Look at the scatter of the residuals above and below the "residual = 0" line in the residual plot. The vertical spread of the residuals should be roughly the same from the smallest to the largest x-value.
Random: See if the data were produced by random sampling or a randomized experiment.

Estimating the Parameters
When the conditions are met, we can do inference about the regression model µ_y = α + βx. The first step is to estimate the unknown parameters. If we calculate the least-squares regression line, the slope b is an unbiased estimator of the population slope β, and the y-intercept a is an unbiased estimator of the population y-intercept α. The remaining parameter is the standard deviation σ, which describes the variability of the response y about the population regression line.
The LSRL computed from the sample data estimates the population regression line. So the residuals estimate how much y varies about the population line. Because σ is the standard deviation of responses about the population regression line, we estimate it by the standard deviation of the residuals:
s = √( Σ residuals² / (n - 2) ) = √( Σ (y_i - ŷ_i)² / (n - 2) )
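A minimal sketch of these estimates, using only the Python standard library. The five (x, y) values are made up for illustration, and the function name fit_line is hypothetical. The code computes the least-squares estimates a and b, the residuals, and s as the square root of the sum of squared residuals divided by n - 2.

```python
import math


def fit_line(x, y):
    """Return (a, b, s, residuals) for the least-squares line y-hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                 # sample slope, estimates beta
    a = y_bar - b * x_bar         # sample intercept, estimates alpha
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Standard deviation of the residuals, with n - 2 degrees of freedom
    s = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))
    return a, b, s, residuals


# Made-up data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b, s, residuals = fit_line(x, y)
print(f"a = {a:.3f}, b = {b:.3f}, s = {s:.4f}")
```

The divisor is n - 2 rather than n - 1 because two parameters (α and β) are estimated before the residuals are computed.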

Estimating the Parameters
In practice, we don't know σ for the population regression line. So we estimate it with the standard deviation of the residuals, s. Then we estimate the spread of the sampling distribution of b with the standard error of the slope:
SE_b = s / (s_x √(n - 1))
where s_x is the sample standard deviation of the x-values.
What happens if we transform the values of b by standardizing? Since the sampling distribution of b is Normal, the statistic
z = (b - β) / σ_b
has the standard Normal distribution.

Estimating the Parameters
Replacing the standard deviation σ_b of the sampling distribution with its standard error gives the statistic
t = (b - β) / SE_b
which has a t distribution with n - 2 degrees of freedom.

Constructing a Confidence Interval
The confidence interval for β has the familiar form
statistic ± (critical value) · (standard deviation of statistic)
Because we use the statistic b as our estimate, the confidence interval is
b ± t* SE_b
We call this a t interval for the slope.
t Interval for the Slope
When the conditions for regression inference are met, a level C confidence interval for the slope β of the population (true) regression line is
b ± t* SE_b
In this formula, the standard error of the slope is SE_b = s / (s_x √(n - 1)), and t* is the critical value for the t distribution with df = n - 2 having area C between -t* and t*.
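The interval b ± t* SE_b can be sketched as follows, assuming the standard error SE_b = s / (s_x √(n - 1)), which equals s divided by the square root of Σ(x_i - x̄)². The data are made up for illustration, the function name slope_t_interval is hypothetical, and the critical value is supplied by hand from a t table (3.182 is the 95% value for df = 3) rather than computed.

```python
import math


def slope_t_interval(x, y, t_star):
    """Return (low, high) for the t interval b +/- t_star * SE_b."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    # s_x * sqrt(n - 1) is the same quantity as sqrt(sxx)
    se_b = s / math.sqrt(sxx)
    margin = t_star * se_b
    return b - margin, b + margin


# Made-up data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

T_STAR_95_DF3 = 3.182  # from a t table: df = n - 2 = 3, 95% confidence
low, high = slope_t_interval(x, y, T_STAR_95_DF3)
print(f"95% CI for the slope: ({low:.3f}, {high:.3f})")
```

Passing t* in as an argument keeps the sketch free of third-party libraries; with a statistics package installed, the critical value would normally be computed instead of looked up.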

Performing a Significance Test for the Slope
When the conditions for inference are met, we can use the slope b of the sample regression line to construct a confidence interval for the slope β of the population (true) regression line. We can also perform a significance test to determine whether a specified value of β is plausible. The null hypothesis has the general form H_0: β = β_0, where β_0 is the hypothesized value. To do a test, standardize b to get the test statistic:
t = (b - β_0) / SE_b
To find the P-value, use a t distribution with n - 2 degrees of freedom.

t Test for the Slope
Suppose the conditions for inference are met. To test the hypothesis H_0: β = β_0, where β_0 is the hypothesized value, compute the test statistic
t = (b - β_0) / SE_b
Find the P-value by calculating the probability of getting a t statistic this large or larger in the direction specified by the alternative hypothesis H_a. Use the t distribution with df = n - 2.
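A rough sketch of the t test for the slope with a two-sided alternative, again with made-up data and hypothetical function names. The statistic is t = (b - β_0) / SE_b with df = n - 2. To stay within the standard library, the P-value is approximated by integrating the t density numerically with Simpson's rule; this is a stand-in for the table or software lookup, not a recommended production method.

```python
import math


def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=2000):
    """Approximate P(|T| >= |t_stat|); steps must be even (Simpson's rule)."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3          # area between 0 and |t_stat|
    return max(0.0, 1 - 2 * central_area)


def slope_t_test(x, y, beta0=0.0):
    """Return (t, df, two-sided P-value) for H0: beta = beta0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)
    t_stat = (b - beta0) / se_b
    return t_stat, n - 2, two_sided_p_value(t_stat, n - 2)


# Made-up data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

t_stat, df, p_value = slope_t_test(x, y, beta0=0.0)
print(f"t = {t_stat:.2f}, df = {df}, two-sided P-value = {p_value:.6f}")
```

For a one-sided alternative, the P-value is half of the two-sided value when the sample slope falls on the side named by H_a.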

Section Summary
In this section, we learned how to:
CHECK the conditions for performing inference about the slope β of the population (true) regression line.
INTERPRET the values of a, b, s, SE_b, and r² in context, and DETERMINE these values from computer output.
CONSTRUCT and INTERPRET a confidence interval for the slope β of the population (true) regression line.
PERFORM a significance test about the slope β of the population (true) regression line.
