Published by Alana Tanner. Modified about 1 year ago.

1
Chapter 12: Inference for Linear Regression
12.1a
H.W.: pg. 759: 1 – 11 odd
Target Goals:
I can make predictions using regression for normal distributions.
I can check conditions for performing inference about the slope β of the population (true) regression line.

2
Inference about the Model
We can use the LSRL fitted to data to predict y for a given value of x for two quantitative variables.
Now we will do tests and construct confidence intervals in this setting.

3
Pg. 752

4
Ex. Crying and IQ
Infants who cry easily may be more easily stimulated than others, and this may be a sign of higher IQ.
The researchers snapped a rubber band on the sole of the foot of infants and caused the infants to cry.
At age 3 years, they measured IQ.

5
Step 1: Make a scatterplot of the data.
Explanatory variable: Crying
Response variable: IQ
Enter “crying” data into L1 and “IQ” data into L2.
Plot and interpret. STAT:CALC:LinReg(a+bx) L1,L2,Y1
Y1:(VARS:Y-VARS:FUNCT:Y1)
The scatterplot shows a roughly linear pattern.
The correlation r describes the direction and strength of the relationship.

6
Step 2: Calculate the LSRL
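What the calculator's LinReg(a+bx) command computes can be sketched in a few lines of Python. This is a minimal illustration, not the slide's calculation: the five (x, y) pairs are made up (the crying and IQ data are not reproduced in this transcript), and the function name `lsrl` is hypothetical.

```python
# Hypothetical (x, y) pairs for illustration only; NOT the crying/IQ data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

def lsrl(x, y):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # intercept: the line passes through (x_bar, y_bar)
    return a, b

a, b = lsrl(x, y)
print(f"y-hat = {a:.3f} + {b:.3f} x")
```

With real data, x would hold the crying counts (L1) and y the IQ scores (L2).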

7
Step 3: Identify outliers and influential points
No extreme outliers or potentially influential observations.

8
Step 4: Calculate the Correlation (r value)
The correlation between crying and IQ is r =

9
Interpret r²
r² = 0.207: only about 21% of the variation in IQ scores (response variable) is explained by crying intensity.
r² is called the coefficient of determination.
Is prediction of IQ accurate with this model? No.
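As a rough sketch of how r and r² relate, the Python below computes both for a small made-up data set (not the crying/IQ data, which this transcript does not include). It then takes the square root of the slide's r² = 0.207, using the positive root on the assumption that the association is positive.

```python
import math

# Hypothetical (x, y) pairs for illustration only; NOT the crying/IQ data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)

r = sxy / math.sqrt(sxx * syy)   # correlation
r_squared = r ** 2               # coefficient of determination
print(f"toy data: r = {r:.3f}, r^2 = {r_squared:.3f}")

# The slide reports r^2 = 0.207; the matching |r| is its square root.
r_slide = math.sqrt(0.207)
print(f"slide: r^2 = 0.207 gives |r| = {r_slide:.3f}")
```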

10
It is interesting, though, that behavior shortly after birth can partly predict IQ.

11
Conditions for Regression Inference
Figure: 3 SRSs of 20 Old Faithful eruptions.
Figure: The values of the slope b for the 1000 sample regression lines are plotted.
The regression predicts how long it will take before Old Faithful erupts again based on the duration of the previous eruption.
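The idea behind the plot of 1000 sample slopes can be imitated with a small simulation. The sketch below is only an illustration under stated assumptions: the population of 500 eruptions is simulated, and its intercept, slope, and spread are invented rather than taken from the Old Faithful data.

```python
import random

random.seed(12)  # fixed seed so the run is repeatable

# Hypothetical population of 500 eruptions: duration (min) and wait (min).
# The intercept 33, slope 10.5, and sigma 6 are invented for this sketch.
population = []
for _ in range(500):
    duration = random.uniform(1.5, 5.0)
    wait = 33 + 10.5 * duration + random.gauss(0, 6)
    population.append((duration, wait))

def slope(points):
    """Least-squares slope b for a list of (x, y) pairs."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    return sxy / sxx

true_slope = slope(population)  # slope of the population regression line

# 1000 SRSs of size 20, one sample slope b from each.
slopes = [slope(random.sample(population, 20)) for _ in range(1000)]

mean_b = sum(slopes) / len(slopes)
sd_b = (sum((b - mean_b) ** 2 for b in slopes) / (len(slopes) - 1)) ** 0.5
print(f"population slope: {true_slope:.2f}")
print(f"mean of sample slopes: {mean_b:.2f}, SD: {sd_b:.2f}")
```

The sample slopes should center near the population slope, and their spread is what SE_b estimates from a single sample.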

12
Pg. 742

13
Conditions for Regression Inference
Our goal is to predict the behavior of y for a given value of x.
1) Linear: The y responses for various samples vary according to a normal distribution. The mean response μ_y has a straight-line relationship with x.
The true regression line is written in the form: μ_y = α + βx

14
where μ_y is the mean response, α is the true y-intercept, and β is the true slope.

15
2) Independent: The y responses are independent of each other.
3) Normal: For any fixed value of x, the observed response value y varies according to a normal distribution having mean μ_y.

16
4) Equal Variance: The standard deviation σ about the true regression line is the same (constant) for all values of x. It is usually an unknown parameter.
5) Random: The data come from a well-designed random sample or randomized experiment.

17
Linear
Independent
Normal
Equal Variance
Random

18
The LSRL: ŷ = a + bx, where b is an unbiased estimator of the true slope β and a is an unbiased estimator of the true intercept α.

19
The line μ_y = α + βx is the true regression line, which shows how the mean response μ_y changes as the explanatory variable x changes.

20
Standard Deviation σ
σ determines whether the points fall close to the true regression line (small σ) or are widely scattered (large σ).
This is also the size of a typical prediction error if we use the least-squares regression line to predict “how long it will take before Old Faithful erupts again” based on the duration of the previous eruption.

21
Ex: Slope and Intercept
The LSRL has the form ŷ = a + bx.
The slope measures rate of change: how much higher average IQ is for children with one more peak in their crying measurements.
b estimates the unknown β; we estimate that, on average, IQ is about 1.5 points higher for each additional crying peak.
The slope's units are IQ points per crying peak.

22
Standard Deviation σ
σ describes the variability of the response y about the true regression line.
Recall that residuals estimate how much y varies about the true line and are the vertical deviations of the data points from the least-squares line:
Residual = observed y – predicted y

23
Standard Error about the LSRL
Since σ is unknown, we estimate it with s, the standard deviation of the residuals, which is also called the standard error about the line (this is the key to inference about the regression).
s = √( ∑ resid² / (n – 2) )
Note: (n – 2) is the degrees of freedom for the regression model.

24
Ex. Calculating Residuals and Standard Error
The quickest way to do this (use ex 14.1 data):
Enter “crying” data into L1 and “IQ” data into L2. (We already did this.)
Recall: LinReg(a+bx) automatically calculates the residuals and stores them in “Resid.”
Store “Resid” in L3.
STAT:CALC:1-Var Stats L3
The ∑x² in the output is ∑ resid².

25
To find s, first find s²:
Enter the value of ∑x² by hand or (VARS:5: : ∑x²) and divide by (n – 2).
Take the square root to find s.
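The calculator steps above (residuals, sum of squared residuals, divide by n – 2, square root) can be sketched in Python as follows. The data are a small made-up set, not the crying/IQ values, so the numbers illustrate the procedure only.

```python
import math

# Hypothetical (x, y) pairs for illustration only; NOT the crying/IQ data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

# Fit the least-squares line y-hat = a + b*x.
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b = sxy / sxx
a = y_bar - b * x_bar

# Residual = observed y - predicted y (the calculator's "Resid" list).
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

sum_sq_resid = sum(e ** 2 for e in residuals)  # sum of squared residuals
s_squared = sum_sq_resid / (n - 2)             # divide by degrees of freedom
s = math.sqrt(s_squared)                       # standard error about the line
print(f"sum resid^2 = {sum_sq_resid:.3f}, s^2 = {s_squared:.3f}, s = {s:.3f}")
```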

26
A level C confidence interval for the slope β of the true regression line is b ± t* SE_b, where t* is the critical value for the t distribution with n – 2 degrees of freedom.

27
You will rarely have to calculate this by hand.
Regression software gives you the standard error SE_b and b itself.
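For readers curious about what regression software reports as SE_b, here is a minimal sketch using the standard formula SE_b = s / √∑(x – x̄)². That formula is not stated on the slides and is supplied here as background; the data are again a made-up set, not the crying/IQ values.

```python
import math

# Hypothetical (x, y) pairs for illustration only; NOT the crying/IQ data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b = sxy / sxx            # sample slope
a = y_bar - b * x_bar    # sample intercept

# Standard error about the line, with n - 2 degrees of freedom.
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

# Standard error of the slope.
se_b = s / math.sqrt(sxx)
print(f"b = {b:.3f}, SE_b = {se_b:.4f}")
```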

28
Ex. Regression Output: Crying and IQ

29
There are 38 data points, so df = n – 2 = 36.
Find the critical value t*.
Table C has no row for df = 36, so for a 95% C.I. for the true slope β, use the critical value t* = 2.042 with df = 30 from Table C.

30
Conclude
We are 95% confident that mean IQ increases by between 0.5 and 2.5 points for each additional peak in crying.
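The interval in this conclusion can be roughly reproduced from numbers quoted on the slides. The sketch below assumes b ≈ 1.5 (the slides' rounded slope; the exact value from the regression output is not in this transcript), SE_b = 0.4870, and t* = 2.042, the Table C value for 95% confidence with df = 30.

```python
# Values taken from the slides; b is the rounded "about 1.5" figure,
# so the endpoints are approximate.
b = 1.5          # estimated slope (rounded)
se_b = 0.4870    # standard error of the slope
t_star = 2.042   # Table C critical value, 95% confidence, df = 30

margin = t_star * se_b
lower = b - margin
upper = b + margin
print(f"95% CI for the true slope: ({lower:.2f}, {upper:.2f})")
```

Rounded to one decimal place, the endpoints agree with the 0.5 and 2.5 stated in the conclusion.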

31
Interpret SE_b
SE_b estimates how much the slope of the sample regression line typically varies from the slope of the population (true) regression line if we repeat the data production process many times.
If we repeated the experiment many times, the slope of the sample regression line would typically vary by about 0.4870 from the slope of the true regression line for predicting IQ from cry count of infants.
