Topic 10 - Linear Regression
Least squares principle - pages 301 – 309
Hypothesis tests/confidence intervals/prediction intervals for regression



Regression
How much should you pay for a house? Would you consider the median or mean sales price in your area over the past year a reasonable price? What factors are important in determining a reasonable price?
–Amenities
–Location
–Square footage
To determine a price, you might consider a model of the form: Price = f(square footage) + $\varepsilon$

Scatter plots
To determine the proper functional relationship between two variables, construct a scatter plot. For the home sales data, what sort of functional relationship exists between Price and SQFT (square footage)?

Simple linear regression
The simplest model form to consider is $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$.
$Y_i$ is called the dependent variable or response.
$X_i$ is called the independent variable or predictor.
$\varepsilon_i$ is the random error term, which is typically assumed to have a Normal distribution with mean 0 and variance $\sigma^2$. We also assume that the error terms are independent of each other.

Least squares criterion
If the simple linear model is appropriate, then we need to estimate the values $\beta_0$ and $\beta_1$. To determine the line that best fits our data, we choose the line that minimizes the sum of squared vertical deviations from our observed points to the line. In other words, we minimize $\sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i)^2$ over $\beta_0$ and $\beta_1$.

Least squares estimators
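For reference, the standard closed-form solution of the least squares problem for a straight line is sketched here. The hat notation $\hat\beta_0$, $\hat\beta_1$ for the estimates is an assumption of this sketch and may differ from the notation on the original slide. Setting the partial derivatives of $\sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i)^2$ with respect to $\beta_0$ and $\beta_1$ to zero and solving gives

$$
\hat\beta_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2},
\qquad
\hat\beta_0 = \bar{Y} - \hat\beta_1 \bar{X},
$$

where $\bar{X}$ and $\bar{Y}$ are the sample means of the predictor and the response.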

Home sales example
For the home sales data, what are the least squares estimates for the line of best fit for Price as a function of SQFT?
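The course computes these estimates in StatCrunch. As a rough illustration of the same calculation, the sketch below fits a line by least squares in plain Python. The data are made up for illustration (they are not the home sales data), and the names `sqft`, `price` and `least_squares` are hypothetical.

```python
from statistics import mean

# Hypothetical data for illustration only (not the course's home sales data).
sqft = [1000, 1500, 2000, 2500, 3000]
price = [80, 105, 118, 140, 152]  # in $1000s


def least_squares(x, y):
    """Return (intercept, slope) minimizing the sum of squared vertical deviations."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


b0, b1 = least_squares(sqft, price)
print(f"Price = {b0:.1f} + {b1:.4f} * SQFT")
```

With these made-up numbers the fitted line is Price = 47.4 + 0.0358 * SQFT, so each additional square foot is associated with an expected increase of about 0.0358 thousand dollars.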

Inference
Often, inference for the slope parameter, $\beta_1$, is most important. $\beta_1$ tells us the expected change in Y per unit change in X. If we conclude that $\beta_1$ equals 0, then we are concluding that there is no linear relationship between Y and X, and it makes no sense to use our linear model with X to predict Y.
The least squares estimator $\hat\beta_1$ has a Normal distribution with a mean of $\beta_1$ and a variance of $\sigma^2 / \sum_{i=1}^{n} (X_i - \bar{X})^2$.

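Hypothesis test for $\beta_1$
To test $H_0: \beta_1 = \beta_{1,0}$, where $\beta_{1,0}$ is the hypothesized value of the slope, use the test statistic
$$T = \frac{\hat\beta_1 - \beta_{1,0}}{\hat\sigma / \sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}}, \qquad \hat\sigma^2 = \frac{1}{n-2} \sum_{i=1}^{n} (Y_i - \hat\beta_0 - \hat\beta_1 X_i)^2.$$
$H_A$                            Reject $H_0$ if
$\beta_1 < \beta_{1,0}$          $T < -t_{\alpha, n-2}$
$\beta_1 > \beta_{1,0}$          $T > t_{\alpha, n-2}$
$\beta_1 \ne \beta_{1,0}$        $|T| > t_{\alpha/2, n-2}$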

Home sales example
For the home sales data, is the linear relationship between Price and SQFT significant?
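A minimal sketch of the two-sided slope test with a hypothesized slope of 0, assuming made-up data rather than the home sales data. The variable names are hypothetical, and the critical value 3.182 is the t-table value for an upper-tail area of 0.025 with n - 2 = 3 degrees of freedom.

```python
import math

# Hypothetical data for illustration only (not the course's home sales data).
sqft = [1000, 1500, 2000, 2500, 3000]
price = [80, 105, 118, 140, 152]  # in $1000s

n = len(sqft)
x_bar = sum(sqft) / n
y_bar = sum(price) / n
s_xx = sum((x - x_bar) ** 2 for x in sqft)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, price))

# Least squares estimates of the slope and intercept.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residual standard error: sqrt(SSE / (n - 2)).
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(sqft, price))
sigma_hat = math.sqrt(sse / (n - 2))

# Test statistic for H0: slope = 0 against a two-sided alternative.
se_b1 = sigma_hat / math.sqrt(s_xx)
t_stat = (b1 - 0.0) / se_b1

# Two-sided critical value t_{0.025, 3}, taken from a t table (assumption: alpha = 0.05).
t_crit = 3.182
reject_h0 = abs(t_stat) > t_crit
print(f"T = {t_stat:.2f}, reject H0: {reject_h0}")
```

Here the test statistic is far larger than the critical value, so for this toy data the linear relationship would be declared significant at the 5% level.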

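Confidence interval for $\beta_1$
A $(1-\alpha)100\%$ confidence interval for $\beta_1$ is
$$\hat\beta_1 \pm t_{\alpha/2, n-2} \, \frac{\hat\sigma}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}},$$
where $\hat\sigma$ is the residual standard error, $\hat\sigma^2 = \frac{1}{n-2} \sum_{i=1}^{n} (Y_i - \hat\beta_0 - \hat\beta_1 X_i)^2$.
For the home sales data, what is a 95% confidence interval for the expected increase in price for each additional square foot?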
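The interval can be computed by hand once the critical value is looked up. The sketch below assumes made-up data (not the home sales data); the function name `slope_confidence_interval` is hypothetical, and `t_crit=3.182` is the t-table value for 95% confidence with 3 degrees of freedom.

```python
import math


def slope_confidence_interval(x, y, t_crit):
    """Return (lower, upper) for the slope: b1 +/- t_crit * sigma_hat / sqrt(Sxx)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    # Residual standard error with n - 2 degrees of freedom.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sigma_hat = math.sqrt(sse / (n - 2))
    margin = t_crit * sigma_hat / math.sqrt(s_xx)
    return b1 - margin, b1 + margin


# Hypothetical data for illustration only.
sqft = [1000, 1500, 2000, 2500, 3000]
price = [80, 105, 118, 140, 152]  # in $1000s

lower, upper = slope_confidence_interval(sqft, price, t_crit=3.182)
print(f"95% CI for the slope: ({lower:.4f}, {upper:.4f})")
```

Because the interval for this toy data excludes 0, it agrees with the conclusion of a two-sided test of a zero slope at the same level.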

Confidence interval for mean response
Sometimes we want a confidence interval for the average (expected) value of Y at a given value of $X = x^*$. With the home sales data, suppose a realtor says the average sales price of a 2000 square foot home is \$120,000. Do you believe her?
The estimator $\hat\beta_0 + \hat\beta_1 x^*$ has a Normal distribution with a mean of $\beta_0 + \beta_1 x^*$ and a variance of $\sigma^2 \left( \frac{1}{n} + \frac{(x^* - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right)$.

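Confidence interval for mean response
A $(1-\alpha)100\%$ confidence interval for $\beta_0 + \beta_1 x^*$ is
$$\hat\beta_0 + \hat\beta_1 x^* \pm t_{\alpha/2, n-2} \, \hat\sigma \sqrt{\frac{1}{n} + \frac{(x^* - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}},$$
where $\hat\sigma$ is the residual standard error (the square root of the sum of squared residuals divided by $n-2$).
With the home sales data, do you believe the realtor’s claim?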

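Prediction interval for a new response
Sometimes we want a prediction interval for a new value of Y at a given value of $X = x^*$. A $(1-\alpha)100\%$ prediction interval for Y when $X = x^*$ is
$$\hat\beta_0 + \hat\beta_1 x^* \pm t_{\alpha/2, n-2} \, \hat\sigma \sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}},$$
where $\hat\sigma$ is the residual standard error. The extra 1 under the square root accounts for the random error of the new observation, so this interval is always wider than the confidence interval for the mean response.
With the home sales data, what is a 95% prediction interval for the amount you will pay for a 2000 square foot home?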
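To compare the two kinds of interval side by side, the sketch below computes both at the same predictor value. It assumes made-up data (not the home sales data); the function `interval_at` and its `new_observation` flag are hypothetical names, and 3.182 is the t-table value for 95% confidence with 3 degrees of freedom.

```python
import math


def interval_at(x, y, x_star, t_crit, new_observation=False):
    """Interval at x_star: mean-response CI, or prediction interval if new_observation."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sigma_hat = math.sqrt(sse / (n - 2))
    # Variance factor: 1/n + (x* - x_bar)^2 / Sxx, plus 1 for a new observation.
    factor = 1 / n + (x_star - x_bar) ** 2 / s_xx
    if new_observation:
        factor += 1
    center = b0 + b1 * x_star
    margin = t_crit * sigma_hat * math.sqrt(factor)
    return center - margin, center + margin


# Hypothetical data for illustration only.
sqft = [1000, 1500, 2000, 2500, 3000]
price = [80, 105, 118, 140, 152]  # in $1000s

mean_ci = interval_at(sqft, price, x_star=2000, t_crit=3.182)
pred_int = interval_at(sqft, price, x_star=2000, t_crit=3.182, new_observation=True)
print("CI for mean price:", mean_ci)
print("Prediction interval:", pred_int)
```

Both intervals are centered at the fitted value, but the prediction interval is wider because it also covers the scatter of an individual sale around the mean.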

Extrapolation
Prediction outside the range of the data is called extrapolation. It is risky and not appropriate, as these predictions can be grossly inaccurate. For our home sales example, the prediction formula was developed for homes that were less than 3750 square feet. Is it appropriate to use the regression model to predict the price of a home that is 5000 square feet?

Correlation
The correlation coefficient, r, describes the direction and strength of the straight-line association between two variables. We will use StatCrunch to calculate r and focus on interpretation.
If r is negative, then the association is negative. (A car’s value vs. its age)
If r is positive, then the association is positive. (Height vs. weight)
r is always between –1 and 1 ($-1 \le r \le 1$).
–At –1 or 1, there is a perfect straight-line relationship.
–The closer to –1 or 1, the stronger the relationship.
–The closer to 0, the weaker the relationship.

Home sales example
For the home sales data, consider the correlation between the variables.

Correlation and regression
The square of the correlation, $r^2$, is the proportion of variation in the value of Y that is explained by the regression model with X. $0 \le r^2 \le 1$ always. The closer $r^2$ is to 1, the better our model fits the data and the more confident we are in our prediction from the regression model. For the home sales example, the value of $r^2$ between price and square footage indicates that about 71% of the variation in price is explained by square footage. Other factors are responsible for the remaining variation.
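The link between r and the regression fit can be checked numerically. The sketch below assumes made-up data (not the home sales data, so the result is not the 71% quoted above) and hypothetical names; it computes r directly and confirms that its square equals one minus the ratio of unexplained to total variation.

```python
import math

# Hypothetical data for illustration only.
sqft = [1000, 1500, 2000, 2500, 3000]
price = [80, 105, 118, 140, 152]  # in $1000s

n = len(sqft)
x_bar = sum(sqft) / n
y_bar = sum(price) / n
s_xx = sum((x - x_bar) ** 2 for x in sqft)
s_yy = sum((y - y_bar) ** 2 for y in price)  # total variation in Y
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, price))

# Correlation coefficient: direction and strength of the linear association.
r = s_xy / math.sqrt(s_xx * s_yy)
r_squared = r ** 2

# Proportion of variation explained by the least squares line: 1 - SSE / Syy.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(sqft, price))
explained = 1 - sse / s_yy

print(f"r = {r:.4f}, r^2 = {r_squared:.4f}, explained = {explained:.4f}")
```

The two quantities agree, which is why $r^2$ can be read as the share of variation in Y accounted for by the straight-line model.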

Association and causation
A strong relationship between two variables does not always mean a change in one variable causes changes in the other. The relationship between two variables is often due to both variables being influenced by other variables lurking in the background. The best evidence for causation comes from properly designed randomized comparative experiments.

Does smoking cause lung cancer?
It is unethical to investigate this relationship with a randomized comparative experiment.
–Observational studies show a strong association between smoking and lung cancer.
–The evidence from several studies shows a consistent association between smoking and lung cancer.
–The more and the longer cigarettes are smoked, the more often lung cancer occurs.
–Smokers with lung cancer usually began smoking before they developed lung cancer.
–It is plausible that smoking causes lung cancer.
Together, this serves as evidence that smoking causes lung cancer, but it is not as strong as evidence from an experiment.