Example 1 To predict the asking price of a used Chevrolet Camaro, the following data were collected on the car’s age and mileage. Data is stored in CAMARO1.

Example 1 To predict the asking price of a used Chevrolet Camaro, the following data were collected on the car’s age and mileage. The data are stored in CAMARO1. Determine the regression equation and answer the additional questions stated later. Solution Run the regression tool from Excel > Data analysis.

The regression equation The regression equation (CAMARO1), with Mileage measured in thousands of miles: Price = 17499.1 - 1131.64 Age - 72.31 Mileage. Be careful about the interpretation of the intercept (17499). Do not argue that this is the price of a used car with no mileage whose age is “zero”. Such cars may exist (a car purchased and returned within a week, with almost no mileage, might need to be resold as a used car), yet such values of Age and Mileage were not covered by the sample range.

The model usefulness (CAMARO1) Does the overall model contribute significantly to predicting the asking price of a used Chevrolet Camaro? Use .01 for the significance level. Answer: Observe the Significance F. This is the p value for the F test of the hypotheses H0: β1 = β2 = 0 vs. H1: at least one βi ≠ 0. Since the p value is practically zero, it is smaller than alpha = .01. The null hypothesis is rejected, and therefore at least one βi ≠ 0. The variable associated with this βi is linearly related to the price, so the model is useful and contributes to predicting the asking price.

Model’s fit How well does the model fit the data? Would you expect the predictions to be accurate with this model? Solution According to the coefficient of determination (R²), 81% of the variation in car prices is explained by this model. This is quite high, and we can expect accurate predictions.

Predicting ‘y’ Predict the asking price of a 5-year-old car with 70,000 miles on the odometer, with 95% confidence. Solution To obtain an interval estimate for the asking price of a single car when Age=5 and Mileage=70 (thousands of miles), we look for the prediction interval. From Data Analysis Plus we have {$2622.222, $10936.38}. The general form of the interval is ŷ ± D, where ŷ is the point prediction and D is determined from the data. Specifically: ŷ = 17499.1 - 1131.64(5) - 72.31(70) = 6779.303. (The slide's value reflects the unrounded coefficients; the rounded coefficients shown here give about 6779.2.) So the interval is 6779.303 ± D. For the Data Analysis Plus procedure go to the worksheet “Prediction Interval” in “CAMARO1”.
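The point prediction and the half-width D can be checked with a few lines of arithmetic. The sketch below is only illustrative: the function name predict_price is hypothetical, the coefficients are the rounded values printed on the slide, and the interval endpoints are the ones reported by Data Analysis Plus.

```python
# Coefficients as printed on the slide (rounded for display).
B0, B_AGE, B_MILEAGE = 17499.1, -1131.64, -72.31


def predict_price(age_years, mileage_thousands):
    """Point prediction from the fitted equation (mileage in thousands of miles)."""
    return B0 + B_AGE * age_years + B_MILEAGE * mileage_thousands


point = predict_price(5, 70)

# 95% prediction interval reported by Data Analysis Plus.
lower, upper = 2622.222, 10936.38
center = (lower + upper) / 2      # should sit at the point prediction
half_width = (upper - lower) / 2  # this is D in "y-hat +/- D"

print(f"point prediction (rounded coefficients): {point:.1f}")
print(f"interval center: {center:.3f}, D: {half_width:.3f}")
```

The interval center (about 6779.30) matches the slide's 6779.303, while the rounded coefficients give about 6779.2, which suggests the small gap is a rounding effect.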

Estimating the mean ‘y’ Estimate the mean asking price for all 5-year-old cars with 70,000 miles on the odometer, with 95% confidence. Solution To obtain an interval estimate for the mean asking price of all cars for which Age=5 and Mileage=70, we look for the confidence interval. From Data Analysis Plus we have {$5756.028, $7802.577}. For details go to the worksheet “Prediction Interval” in “CAMARO1”.
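Comparing the two reported intervals shows why the distinction matters: both are centered on the same point prediction, but the interval for a single car is much wider than the interval for the mean. A minimal sketch using only the endpoints quoted above (the helper name center_and_half_width is hypothetical):

```python
def center_and_half_width(lower, upper):
    """Return the midpoint and half-width of an interval."""
    return (lower + upper) / 2, (upper - lower) / 2


# Prediction interval for one car (Age=5, Mileage=70).
pi_center, pi_half = center_and_half_width(2622.222, 10936.38)
# Confidence interval for the mean price of all such cars.
ci_center, ci_half = center_and_half_width(5756.028, 7802.577)

print(f"prediction interval: {pi_center:.2f} +/- {pi_half:.2f}")
print(f"confidence interval: {ci_center:.2f} +/- {ci_half:.2f}")
print(f"width ratio: {pi_half / ci_half:.2f}")
```

The prediction interval is roughly four times as wide because it must also cover the variation of an individual car around the mean.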

Testing linear relationship Do both variables (Age and Mileage, each in the presence of the other) serve as good predictors of the asking price? Test at alpha=.025. Solution Perform a t-test for the β coefficient of each variable. The hypotheses tested are: H0: βAge = 0 vs. H1: βAge ≠ 0, for which the p value is .002; H0: βMileage = 0 vs. H1: βMileage ≠ 0, for which the p value is .0104. In both cases the p value is below .025, so the null hypothesis is rejected; both variables have a linear relationship to the asking price at the 2.5% significance level.
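The decision rule applied here is simply "reject H0 when the p value is below alpha". As a small sketch, using the two p values quoted on the slide (the dictionary layout is just one convenient way to hold them):

```python
ALPHA = 0.025  # significance level stated in the question

# Two-sided p values from the regression output, as quoted on the slide.
p_values = {"Age": 0.002, "Mileage": 0.0104}

decisions = {
    name: "reject H0" if p < ALPHA else "fail to reject H0"
    for name, p in p_values.items()
}

for name, decision in decisions.items():
    print(f"{name}: p = {p_values[name]} -> {decision}")
```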

Problem 2 The previous model for predicting the asking price of a used Chevrolet Camaro is now extended by adding two new independent variables: car condition (Excellent, Average, Poor) and the type of seller (Dealer, Individual). The data for this case are stored in CAMARO2. Develop the linear regression model for this case and answer the questions formulated next. Solution The two new variables describe qualitative data (the state of the car and the type of seller). Thus, they are represented by dummy variables, which take on the values ‘0’ and ‘1’.

Using dummy variables Solution - continued (CAMARO2): There are three possible car condition values, so we need two dummy variables. Let us select the variables ‘Average’ and ‘Poor’. These variables describe the three values of the car condition as follows:

                                 Average   Poor
An “Excellent condition” car        0        0
An “Average condition” car          1        0
A “Poor condition” car              0        1

In a similar manner we use one dummy variable to describe who sold the car. Let us define Dealer = 1 if the car was sold by a dealer, and Dealer = 0 if it was sold by an individual.
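The coding scheme above can be written as a small function. This is a sketch only; the function name encode_car and the dictionary output are illustrative choices, not part of the CAMARO2 workbook.

```python
CONDITIONS = ("Excellent", "Average", "Poor")
SELLERS = ("Dealer", "Individual")


def encode_car(condition, seller):
    """Map qualitative values to the 0/1 dummies used in the model.

    "Excellent" and "Individual" are the baseline categories, so they are
    represented by all of their dummies being 0.
    """
    if condition not in CONDITIONS:
        raise ValueError(f"unknown condition: {condition!r}")
    if seller not in SELLERS:
        raise ValueError(f"unknown seller: {seller!r}")
    return {
        "Average": int(condition == "Average"),
        "Poor": int(condition == "Poor"),
        "Dealer": int(seller == "Dealer"),
    }


print(encode_car("Excellent", "Individual"))
print(encode_car("Average", "Dealer"))
print(encode_car("Poor", "Individual"))
```

Using one dummy fewer than the number of categories avoids redundancy with the intercept: the omitted category becomes the baseline against which the others are compared.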

The linear regression equation The linear regression equation: Price = 17357.38 - 1131.93 Age - 33.242 Mileage - 2556.44 Avg - 3275.3 Poor + 775.64 Dealer

Interpreting the coefficients bi Interpret the coefficient estimate bi of each variable and test the strength of its predictive power. Solution
bAge = -1131.93. In this model, for each additional year the asking price drops by about $1132, keeping the rest of the variables unchanged.
bMile = -33.24. In this model, for each additional 1000 miles the asking price drops by $33.24, keeping the rest of the variables unchanged.
bAvg = -2556.44. In this model, the asking price for a car whose condition is average is $2556.44 lower than the asking price for a car whose condition is excellent, keeping the rest of the variables unchanged.
bPoor = -3275.3. In this model, the asking price for a car whose condition is poor is $3275.30 lower than the asking price for a car whose condition is excellent, keeping the rest of the variables unchanged.
bDeal = 775.64. In this model, the asking price for a car sold by a dealer is $775.64 higher than that of a car sold by an individual, keeping the rest of the variables unchanged.

The role of the dummy variable coefficients Let us compare the asking price equations of two cars with the same age, mileage, and condition, one sold by a dealer and the other by an individual:
Price(Dealer) = b0 + b1 Age + b2 Mileage + b3 Avg + b4 Poor + b5 (Dealer=1) = b0 + b1 Age + b2 Mileage + b3 Avg + b4 Poor + b5
Price(Individual) = b0 + b1 Age + b2 Mileage + b3 Avg + b4 Poor + b5 (Dealer=0) = b0 + b1 Age + b2 Mileage + b3 Avg + b4 Poor
Conclusion: When the only difference between the cars is the type of seller, the Price(Individual) equation serves as the baseline, and b5 is the average difference in asking price between the two.

The role of the dummy variable coefficients Let us compare the asking price equations of three cars that differ in their overall condition but have the same age and mileage and are sold by the same type of seller:
Price(Excellent) = b0 + b1 Age + b2 Mileage + b3 (Avg=0) + b4 (Poor=0) + b5 Dealer = b0 + b1 Age + b2 Mileage + b5 Dealer
Price(Avg) = b0 + b1 Age + b2 Mileage + b3 (Avg=1) + b4 (Poor=0) + b5 Dealer = b0 + b1 Age + b2 Mileage + b5 Dealer + b3
Price(Poor) = b0 + b1 Age + b2 Mileage + b3 (Avg=0) + b4 (Poor=1) + b5 Dealer = b0 + b1 Age + b2 Mileage + b5 Dealer + b4
Conclusion: When the only difference between the cars is their condition, the Price(Excellent) equation serves as the baseline, and b3 and b4 are the average differences in asking price between an “excellent condition” car and the other two cars.
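The baseline argument can be checked numerically: holding age, mileage, and the other dummies fixed, switching one dummy from 0 to 1 shifts the predicted price by exactly that dummy's coefficient. A minimal sketch using the coefficient estimates of the fitted equation (the function name predict_price and the example car, 4 years old with 45 thousand miles, are illustrative assumptions):

```python
# Coefficient estimates from the fitted CAMARO2 equation.
COEF = {
    "Intercept": 17357.38,
    "Age": -1131.93,
    "Mileage": -33.242,   # per thousand miles
    "Average": -2556.44,
    "Poor": -3275.3,
    "Dealer": 775.64,
}


def predict_price(age, mileage_thousands, condition, seller):
    """Predicted asking price; Excellent and Individual are the baselines."""
    average = 1 if condition == "Average" else 0
    poor = 1 if condition == "Poor" else 0
    dealer = 1 if seller == "Dealer" else 0
    return (
        COEF["Intercept"]
        + COEF["Age"] * age
        + COEF["Mileage"] * mileage_thousands
        + COEF["Average"] * average
        + COEF["Poor"] * poor
        + COEF["Dealer"] * dealer
    )


base = predict_price(4, 45, "Excellent", "Individual")
gap_average = predict_price(4, 45, "Average", "Individual") - base
gap_poor = predict_price(4, 45, "Poor", "Individual") - base
gap_dealer = predict_price(4, 45, "Excellent", "Dealer") - base

print(f"Average vs Excellent: {gap_average:.2f}")
print(f"Poor vs Excellent:    {gap_poor:.2f}")
print(f"Dealer vs Individual: {gap_dealer:.2f}")
```

The gaps do not depend on the age and mileage chosen, because the model has no interaction terms.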

Prediction power of independent variable (are there linear relationships?) Testing the prediction power. Formulate the t-test for each β. Observing the p values we have: For bAge the p value = .00036. Age is a strong predictor. For bMileage the p value = .17. Mileage is not a good predictor in this model; there is insufficient evidence of a linear relationship with price. For bAverage the p value = .0098. There is sufficient evidence to infer at the 1% significance level that the asking price of a car whose condition is average is different from the asking price of a car whose condition is excellent. In fact, the argument is even stronger. Since the t-statistic is negative (-2.79), it falls in the left-hand tail of the distribution, so we have sufficient evidence to claim that βAverage < 0. This means the asking price of an “Average condition” car is on average $2556 lower than the asking price of an “Excellent condition” car.
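The "even stronger" one-sided claim follows from a standard relationship: when the t-statistic falls on the side named by the alternative hypothesis, the one-tailed p value is half the two-tailed p value. A short sketch with the figures quoted above (the 1% level is the one used on the slide; the variable names are illustrative):

```python
ALPHA = 0.01

t_stat = -2.79        # t-statistic for the Average dummy
p_two_sided = 0.0098  # two-tailed p value from the regression output

# Left-tailed test, H1: beta_Average < 0.
# If t is negative the left tail holds half of the two-sided p value;
# otherwise it holds the remainder.
if t_stat < 0:
    p_left_tail = p_two_sided / 2
else:
    p_left_tail = 1 - p_two_sided / 2

print(f"one-tailed p value: {p_left_tail:.4f}")
print("reject H0" if p_left_tail < ALPHA else "fail to reject H0")
```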

Prediction power of independent variable (are there linear relationships?) Testing the prediction power - continued. For bPoor the p value = .006. There is very strong evidence that the asking price for a “Poor condition” car is different from the asking price for an “Excellent condition” car. Specifically, the asking price of a “Poor condition” car is on average $3275.30 lower than that of an “Excellent condition” car. For bDealer the p value = .40. There is insufficient evidence to infer at the 2.5% significance level that, on average, the asking price for a car sold by a dealer is different from the asking price for a car sold by an individual.

Prediction power of independent variable (are there linear relationships?) Predict the asking price of the following car: 4 years old, 45,000 miles, Average condition, sold by an individual. Price = 17357 - 1131.9(4) - 33.242(45) - 2556.4(1) - 3275.3(0) + 775.64(0) ≈ $8,777. The variable “Average” is equal to 1 and the variable “Poor” is equal to 0 when the car is in average condition. The variable “Dealer” is equal to 0 when the car is sold by an individual.
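The arithmetic of this prediction can be laid out term by term. The sketch below uses the rounded coefficients exactly as they appear in the expression above; the dictionary of named terms is only an illustrative way to organize the calculation.

```python
# Each term of the prediction for a 4-year-old car with 45 thousand miles,
# Average condition (Average=1, Poor=0), sold by an individual (Dealer=0).
terms = {
    "intercept": 17357,
    "age": -1131.9 * 4,
    "mileage": -33.242 * 45,
    "average": -2556.4 * 1,
    "poor": -3275.3 * 0,
    "dealer": 775.64 * 0,
}

price = sum(terms.values())

for name, value in terms.items():
    print(f"{name:>9}: {value:10.2f}")
print(f"predicted asking price: {price:.2f}")
```

The dummy terms set to 0 drop out, so only the intercept, age, mileage, and the Average adjustment contribute, giving a predicted asking price of about $8,777.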