Objectives 10.1 Simple linear regression


Objectives 10.1 Simple linear regression
Statistical model for linear regression
Estimating the regression parameters
Confidence interval for regression parameters
Significance test for the slope
Confidence interval for μy
Prediction intervals

Statistical model for linear regression In the population, the linear regression equation is y = β0 + β1x + ε, where ε is the random deviation (or error) of the response variable from the prediction formula. Usually, we assume that ε has a Normal(0, σ) distribution. β0 (the y-intercept) and β1 (the slope) are the parameters. Statistical inference is conducted to draw conclusions about the parameters. Confidence interval and hypothesis test for β1. We especially want to test whether the slope equals zero. Confidence interval for β0 + β1x, given a value for x. Prediction interval for a random y, given a value for x.
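The model can be illustrated by simulating data from it. A minimal sketch in Python; the parameter values β0 = 2, β1 = 0.5, σ = 1 and the choice of x values are illustrative assumptions, not from the slides:

```python
import random

# Simulate n observations from the population model y = beta0 + beta1*x + eps,
# with eps ~ Normal(0, sigma). All parameter values here are made up.
random.seed(42)
beta0, beta1, sigma, n = 2.0, 0.5, 1.0, 30

x = [i / 2 for i in range(n)]  # fixed x values 0, 0.5, ..., 14.5
y = [beta0 + beta1 * xi + random.gauss(0, sigma) for xi in x]

print(len(y))  # 30 simulated responses scattered around the true line
```

Each simulated y equals the point on the true line plus an independent Normal deviation, which is exactly what the model statement says.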

Estimating the parameters The population linear regression equation is y = β0 + β1x + ε. The sample fitted regression line is ŷ = b0 + b1x. b0 is the estimate for the intercept β0, and b1 is the estimate for the slope β1. We also estimate σ (the standard deviation of ε) with se = √( Σ(y − ŷ)² / (n − 2) ). se is a measure of the typical size of a residual y − ŷ. We will use se to compute the standard errors we need.
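The least-squares estimates b0 and b1, and the estimate se, can be computed directly. A minimal sketch, using a small made-up data set:

```python
import math

# Illustrative data, made up for this sketch
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.0, 9.9]
n = len(x)

xbar = sum(x) / n
ybar = sum(y) / n

# Least-squares estimates: b1 = Sxy / Sxx, b0 = ybar - b1 * xbar
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = ybar - b1 * xbar

# se estimates sigma: square root of SSE divided by its df, n - 2
fitted = [b0 + b1 * xi for xi in x]
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
s_e = math.sqrt(sse / (n - 2))

print(round(b1, 3), round(b0, 3), round(s_e, 3))
```

For these data the fitted line is close to ŷ = 0.11 + 1.97x, with a small se, since the points were chosen to lie near a line.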

Confidence interval for the slope parameter Before we do inference for the slope parameter β1, we need the standard error for the estimate b1: SE(b1) = se / √( Σ(x − x̄)² ). We use the t distribution, now with n − 2 degrees of freedom. A level C confidence interval for the slope β1 is b1 ± t* SE(b1), where t* is the table value for the t(n − 2) distribution with area C between −t* and t*. “Confidence” has the same interpretation as always.
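Continuing the sketch with the same made-up data, the standard error of the slope and a 95% confidence interval can be computed as follows (t* = 3.182 is the two-sided t critical value for df = n − 2 = 3):

```python
import math

# Same illustrative (made-up) data as before
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.0, 9.9]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar
s_e = math.sqrt(sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2))

# Standard error of the slope: SE(b1) = se / sqrt(Sxx)
se_b1 = s_e / math.sqrt(sxx)

# 95% CI: b1 +/- t* SE(b1), with t* from the t(n - 2) = t(3) table
t_star = 3.182
ci = (b1 - t_star * se_b1, b1 + t_star * se_b1)
print(ci)
```

The interval is narrow here because se is small and the x values are well spread out (large Sxx).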

Significance test for the slope parameter We can test the hypothesis H0: β1 = m versus either a 1-sided or a 2-sided alternative, using a t statistic. (The primary case is m = 0.) We calculate t = (b1 − m) / SE(b1) and use the t(n − 2) distribution to find the P-value of the test. Note: software typically provides two-sided P-values.
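For the same made-up data, a sketch of the test of H0: β1 = 0, comparing |t| with the two-sided t(3) critical value instead of computing an exact P-value (which would need a t-distribution table or library):

```python
import math

# Same illustrative (made-up) data; test H0: beta1 = m with m = 0
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.0, 9.9]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar
s_e = math.sqrt(sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2))
se_b1 = s_e / math.sqrt(sxx)

# t statistic: t = (b1 - m) / SE(b1)
m = 0
t = (b1 - m) / se_b1

# Reject H0 at alpha = 0.05 (two-sided) if |t| exceeds the t(3) critical value
t_crit = 3.182
print(t, abs(t) > t_crit)
```

Here |t| is far above the critical value, so H0: β1 = 0 would be rejected, matching the ozone example on the next slide in spirit.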

Relationship between ozone and carbon pollutants In StatCrunch: Stat > Regression > Simple Linear; choose Hypothesis Test. (The output reports the estimates, their standard errors, se, and df = n − 2.) To test H0: β1 = 0 with α = 0.05, we compute the t statistic t = b1 / SE(b1). From the t-table, using df = 28 − 2 = 26, we can see that the P-value is less than 0.0005. Since it is very small, we reject H0 and conclude that the slope is not zero.

Relationship between ozone and carbon pollutants In StatCrunch: Stat > Regression > Simple Linear; choose Confidence Interval. Having decided that the slope is not zero, we next estimate it with a 95% confidence interval, b1 ± t* SE(b1).

Confidence interval for 0 + 1x We can also calculate a confidence interval for the regression line itself, at any choice x. Generally this is sensible as long as x is within the range of data observed (interpolation). Extrapolation should only be done with a great deal of caution. The interval is centered on ŷ = b0 + b1x, but we need a standard error for this particular estimate. The confidence interval is then calculated in the usual fashion: This is an estimate of the point on the line (the expected value of y) for the given value of x.

Prediction interval for a new obs. y It is often of greater interest to predict what the actual y value might be (not just what it is expected to be). Such a prediction interval for an actual (new) observation y must account for both the estimation of the line and the random deviation ε away from that line. The interval is again centered on ŷ = b0 + b1x, but now we also account for the random deviation: SE(ŷ) = se √( 1 + 1/n + (x − x̄)² / Σ(x − x̄)² ). The prediction interval for the actual y, with a given value for x, is ŷ ± t* SE(ŷ). The distinction between a confidence interval and a prediction interval is whether you want to capture the expected value of y or the actual value of y.
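A sketch of the prediction interval for the same made-up data at a hypothetical x = 2.5; the only change from the mean-response standard error is the extra 1 under the square root:

```python
import math

# Same illustrative (made-up) data
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.0, 9.9]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar
s_e = math.sqrt(sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2))

x0 = 2.5
y_hat = b0 + b1 * x0
t_star = 3.182                # t(3) critical value, 95%

# Mean-response SE vs. prediction SE: the latter adds 1 under the root
se_mean = s_e * math.sqrt(1 / n + (x0 - xbar) ** 2 / sxx)
se_pred = s_e * math.sqrt(1 + 1 / n + (x0 - xbar) ** 2 / sxx)

pi = (y_hat - t_star * se_pred, y_hat + t_star * se_pred)
print(pi, se_pred > se_mean)
```

The prediction interval is always wider than the confidence interval at the same x, because it must also cover the new observation's own random deviation.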

Prediction intervals Unlike a confidence interval, the prediction interval does not get narrower as you increase the sample size. This is because: The confidence interval estimates a parameter, such as a mean or a slope. For example, if I am interested in the mean grade of all people taking midterm 3 who scored 10 on midterm 2, the CI will get narrower as the sample size grows (because the estimators tend to get better for large sample sizes). The prediction interval is completely different. Here we are trying to predict the grade of a randomly selected person who scored 10 on midterm 2. There will be a lot of variability, and it does not improve as we increase the sample size: every individual is different. (It is like predicting the weight of someone who is 6 feet tall: even if we know the average weight of 6-footers, there is huge variation within this group, so the prediction interval must be wide for us to be able to capture the weight.) This is the fundamental difference between predicting the measurement of an individual and estimating a mean. The mean estimator will get better with sample size; the individual prediction won't.
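This difference is visible in the interval formulas themselves. Taking x at x̄ for simplicity, the CI width is proportional to √(1/n), which shrinks to 0 as n grows, while the PI width is proportional to √(1 + 1/n), which only approaches 1:

```python
import math

# Compare the SE multipliers at x0 = xbar as the sample size n grows:
# CI factor sqrt(1/n) -> 0, while PI factor sqrt(1 + 1/n) -> 1.
for n in [10, 100, 10000]:
    ci_factor = math.sqrt(1 / n)
    pi_factor = math.sqrt(1 + 1 / n)
    print(n, round(ci_factor, 4), round(pi_factor, 4))
```

So with more data we pin down the mean ever more precisely, but the spread of individuals around that mean never goes away.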

Efficiency of a biofilter, by temperature In StatCrunch: Stat > Regression > Simple Linear; choose Predict Y for X. For a 95% confidence interval of the expected ozone level, with temperature = 16, we compute ŷ ± t* SE(μ̂). For a 95% prediction interval of the actual ozone level, with temperature = 16, we compute ŷ ± t* SE(ŷ).