Copyright © 2009 Cengage Learning
Chapter 20: Model Building

Regression Analysis

Regression analysis is one of the most powerful and commonly used techniques in statistics; it allows us to create mathematical models that realistically describe the relationship between the dependent variable and the independent variables. We have seen it used for linear models with interval data, but regression analysis can also be used for:
- non-linear (polynomial) models, and
- models that include nominal independent variables.

Polynomial Models

Previously we looked at this multiple regression model:

y = β₀ + β₁x₁ + β₂x₂ + … + βₚxₚ + ε

(It is considered linear, or first-order, since the exponent on each of the xᵢ's is 1.) The independent variables may be functions of a smaller number of predictor variables; polynomial models fall into this category. If there is one predictor variable (x), we have:

y = β₀ + β₁x + β₂x² + … + βₚxᵖ + ε

Polynomial Models

Technically, the polynomial model above is a multiple regression model with p independent variables (x₁, x₂, …, xₚ). Since x₁ = x, x₂ = x², x₃ = x³, …, xₚ = xᵖ, it is based on a single predictor variable (x). Here p is the order of the equation; we will focus on equations of order p = 1, 2, and 3.

First-Order Model

When p = 1, we have the simple linear regression model:

y = β₀ + β₁x + ε

That is, we believe there is a straight-line relationship between the dependent and independent variables over the range of the values of x.
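The least-squares estimates for this first-order model can be computed directly from the sample means and deviations; a minimal sketch in pure Python, using made-up toy data:

```python
# Least-squares estimates for the first-order model y = b0 + b1*x,
# computed from the usual closed-form formulas. Data are illustrative.

def fit_simple_linear(x, y):
    """Return (b0, b1) minimizing the sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept
    return b0, b1

# Exactly linear toy data: y = 3 + 2x, so the fit should recover (3, 2).
x = [1, 2, 3, 4, 5]
y = [5, 7, 9, 11, 13]
b0, b1 = fit_simple_linear(x, y)
```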

Second-Order Model

When p = 2, the polynomial model describes a parabola:

y = β₀ + β₁x + β₂x² + ε
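Because the second-order model is still linear in the coefficients, standard least-squares tools fit it directly; a small sketch using NumPy's `polyfit` on synthetic data generated from a known parabola:

```python
import numpy as np

# Fitting the second-order model y = b0 + b1*x + b2*x^2.
# The data come from a known parabola so we can check the recovered fit.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = 1.0 + 0.5 * x - 2.0 * x**2   # true coefficients: b0=1, b1=0.5, b2=-2

# np.polyfit returns the highest power first: [b2, b1, b0]
b2, b1, b0 = np.polyfit(x, y, deg=2)
```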

Third-Order Model

When p = 3, the third-order model is:

y = β₀ + β₁x + β₂x² + β₃x³ + ε

Polynomial Models: Two Predictor Variables

Perhaps we suspect that two predictor variables (x₁ and x₂) influence the dependent variable.

First-order model (no interaction):
y = β₀ + β₁x₁ + β₂x₂ + ε

First-order model (with interaction):
y = β₀ + β₁x₁ + β₂x₂ + β₃x₁x₂ + ε

Polynomial Models: Two Predictor Variables

[Figure: first-order models with two predictors, without and with interaction]

Polynomial Models: Two Predictor Variables

If we believe that a quadratic relationship exists between y and each of x₁ and x₂, and that the predictor variables interact in their effect on y, we can use the second-order model (in two variables) with interaction:

y = β₀ + β₁x₁ + β₂x₂ + β₃x₁² + β₄x₂² + β₅x₁x₂ + ε
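Even with squared and interaction terms, the model remains linear in the β's, so it can be estimated by ordinary least squares on an expanded design matrix; a sketch with synthetic, noise-free data (all names and values are illustrative):

```python
import numpy as np

# The second-order model with interaction,
#   y = b0 + b1*x1 + b2*x2 + b3*x1^2 + b4*x2^2 + b5*x1*x2,
# is linear in the coefficients: build the expanded design matrix and solve.
rng = np.random.default_rng(0)
x1 = rng.uniform(0, 10, 50)
x2 = rng.uniform(0, 10, 50)
true = np.array([2.0, 1.0, -0.5, 0.3, 0.1, -0.2])   # b0..b5, made up

X = np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])
y = X @ true                                         # noise-free response

coef, *_ = np.linalg.lstsq(X, y, rcond=None)         # recovers `true`
```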

Polynomial Models: Two Predictor Variables

[Figure: second-order models with two predictors, without and with interaction]

Selecting a Model

One predictor variable, or two (or more)? First order? Second order? Higher order? With or without interaction? How do we choose the right model?
1. Use our knowledge of the variables involved to build an initial model.
2. Test that model using statistical techniques.
3. If required, modify the model and re-test…

Example 18.1

We have been asked to come up with a regression model for a fast-food restaurant chain. We know the primary market is middle-income adults and their children, particularly those between the ages of 5 and 12.

Dependent variable: restaurant revenue (gross or net)
Predictor variables: family income, age of children

Is the relationship first order? Quadratic?

Example 18.1

The relationship between the dependent variable (revenue) and each predictor variable is probably quadratic. Members of low- or high-income households are less likely to eat at this chain's restaurants, since the restaurants attract mostly middle-income customers. Likewise, families in neighborhoods where the mean age of children is either quite low or quite high are less likely to eat there than families with children in the 5-to-12-year range. Does this seem reasonable?

Example 18.1

Should we include the interaction term in our model? When in doubt, it is probably best to include it. Our model, then, is:

y = β₀ + β₁x₁ + β₂x₂ + β₃x₁² + β₄x₂² + β₅x₁x₂ + ε

where
y = annual gross sales
x₁ = median annual household income in the neighborhood
x₂ = mean age of children in the neighborhood

Example 18.2

The restaurant chain's research department selected 25 locations at random and gathered data on revenues, household income, and ages of neighborhood children (data file Xm18-02). The squared and interaction terms were then calculated from the collected data.

Example 18.2

You can take the original data collected (revenues, household income, and age) and plot y vs. x₁ and y vs. x₂ to get a feel for the data; trend lines can be added for clarity.

Example 18.2

Checking the regression tool's output, the model fits the data well and is valid. However, the output also reveals a problem: multicollinearity among the predictor variables.
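One common way to quantify the multicollinearity flagged here is the variance inflation factor (VIF): regress each predictor on the others and compute 1/(1 − R²). A sketch with synthetic data, where the squared-income term is, by construction, nearly collinear with income itself (names and values are illustrative):

```python
import numpy as np

# Variance inflation factor: regress one predictor on the others;
# VIF = 1 / (1 - R^2). Values well above ~10 signal serious collinearity.
rng = np.random.default_rng(1)
x1 = rng.uniform(20, 60, 100)   # e.g. household income (made-up data)
x1sq = x1**2                    # its square: highly collinear with x1

def vif(target, others):
    """VIF of `target` given a list of other predictor columns."""
    X = np.column_stack([np.ones_like(target)] + others)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    sst = (target - target.mean()) @ (target - target.mean())
    r2 = 1 - (resid @ resid) / sst
    return 1 / (1 - r2)

v = vif(x1sq, [x1])             # large: x1 and x1^2 carry overlapping info
```

Centering the predictors before squaring (using x₁ − x̄₁ instead of x₁) is a standard remedy that reduces this kind of collinearity.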

Nominal Independent Variables

Thus far in our regression analysis we have considered only variables that are interval. Often, however, we need to consider nominal data in our analysis. For example, our earlier example regarding the market for used cars focused only on mileage. Perhaps color is an important factor as well. How can we model this new variable?

Indicator Variables

An indicator variable (also called a dummy variable) is a variable that can assume only one of two values (usually 0 and 1). A value of one usually indicates the existence of a certain condition, while a value of zero usually indicates that the condition does not hold:

I₁ = 1 if the color is white, 0 if not
I₂ = 1 if the color is silver, 0 if not

Car color    I₁   I₂
white         1    0
silver        0    1
other         0    0
two-tone!     1    1

To represent m categories, we need only m − 1 indicator variables.
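The coding scheme above can be sketched in a few lines of Python; "other" is the baseline category, receiving zeros on both indicators:

```python
# Encode a nominal variable with m = 3 categories as m - 1 = 2 indicator
# (dummy) variables. "Other" is the baseline: all indicators are zero.

def color_dummies(color):
    """Map a car color to (I1, I2): I1 flags white, I2 flags silver."""
    i1 = 1 if color == "white" else 0
    i2 = 1 if color == "silver" else 0
    return i1, i2

rows = {c: color_dummies(c) for c in ["white", "silver", "other"]}
```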

Interpreting Indicator Variable Coefficients

After performing our regression analysis, we obtain a regression equation. From it:
- the price diminishes with additional mileage (x),
- a white car sells for $91.10 more than other colors (I₁), and
- a silver car fetches more than other colors (I₂).
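To see what an indicator coefficient means, compare predictions that differ only in color. The white-car premium ($91.10) is the figure quoted above; the intercept, mileage slope, and silver premium below are made-up placeholders, not the textbook's estimates:

```python
# Hypothetical regression equation: only the $91.10 white-car premium comes
# from the text; b0, b1, and b_silver are placeholder values for illustration.
b0, b1 = 16000.0, -0.05           # made-up intercept and mileage slope
b_white, b_silver = 91.10, 250.0  # white premium from text; silver made up

def predicted_price(mileage, i1, i2):
    """Predicted auction price for a car with the given mileage and color."""
    return b0 + b1 * mileage + b_white * i1 + b_silver * i2

miles = 40_000
# Holding mileage fixed, the dummy coefficient IS the premium over "other":
diff_white = predicted_price(miles, 1, 0) - predicted_price(miles, 0, 0)
```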

Graphically

[Figure: the regression lines for each color category, shown graphically]

Testing the Coefficients

To test the coefficient of I₁, we use these hypotheses:

H₀: β₂ = 0
H₁: β₂ ≠ 0

(where β₂ is the coefficient of I₁)

There is insufficient evidence to infer that, in the population, 3-year-old white Tauruses with the same odometer reading have a different selling price than Tauruses in the "other" color category.

Testing the Coefficients

To test the coefficient of I₂, we use these hypotheses:

H₀: β₃ = 0
H₁: β₃ ≠ 0

(where β₃ is the coefficient of I₂)

We can conclude that there are differences in auction selling prices between 3-year-old silver-colored Tauruses and Tauruses in the "other" color category with the same odometer readings.
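The two coefficient tests above share the same mechanics: compute t = b / s_b and compare the two-sided p-value to the significance level. A sketch using a standard-normal approximation for the p-value (reasonable for large samples); the standard errors and the second coefficient below are made up purely to mirror the two opposite conclusions:

```python
from statistics import NormalDist

# Two-sided test of H0: beta = 0 vs H1: beta != 0 for one coefficient.
# Given an estimate b and its standard error se, compute t = b / se and
# approximate the p-value with the standard normal distribution.
def coefficient_test(b, se):
    t = b / se
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p

# Made-up inputs chosen to echo the slides' conclusions:
t1, p1 = coefficient_test(91.10, 110.0)   # small t: fail to reject H0
t2, p2 = coefficient_test(300.0, 120.0)   # large t: reject H0 at 5%
```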

Model Building

Here is a procedure for building a mathematical model:
1. Identify the dependent variable: what is it we wish to predict? Don't forget the variable's unit of measure.
2. List potential predictors: how would changes in each predictor change the dependent variable? Be selective; go with the fewest independent variables required, and be aware of the effects of multicollinearity.
3. Gather the data: at least six observations for each independent variable used in the equation.

Model Building

4. Identify several possible models: formulate first- and second-order models, with and without interaction, and draw scatter diagrams.
5. Use statistical software to estimate the models.
6. Determine whether the required conditions are satisfied; if not, attempt to correct the problem.
7. Use your judgment and the statistical output to select the best model.
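The final steps of the procedure can be sketched by estimating two candidate models on the same data and comparing adjusted R², which penalizes extra terms; here the data are synthetic and genuinely curved, so the second-order model should come out ahead (data and seed are illustrative):

```python
import numpy as np

# Compare a first-order and a second-order model on the same (synthetic,
# curved) data using adjusted R^2, which charges a price for extra terms.
rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 60)
y = 5 + 2 * x - 0.4 * x**2 + rng.normal(0, 1, 60)   # true model is quadratic

def adj_r2(X, y):
    """Adjusted R^2 of the least-squares fit of y on design matrix X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sse = resid @ resid
    sst = ((y - y.mean()) ** 2).sum()
    n, k = X.shape
    return 1 - (sse / (n - k)) / (sst / (n - 1))

ones = np.ones_like(x)
first = adj_r2(np.column_stack([ones, x]), y)         # first-order model
second = adj_r2(np.column_stack([ones, x, x**2]), y)  # second-order model
```

Since the data really are curved, the quadratic term earns its keep and the second-order model's adjusted R² exceeds the first-order model's.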