Lecture 27: Polynomial Terms for Curvature; Categorical Variables



Polynomial Terms for Curvature To model a curved relationship between Y and X, we can add squared (and cubic or higher-order) terms as explanatory variables. The quadratic model $E(Y \mid X) = \beta_0 + \beta_1 X + \beta_2 X^2$ is fit as a multiple regression with two explanatory variables, $X$ and $X^2$. The coefficients are not directly interpretable: the change in the mean of Y associated with a one-unit increase in X depends on X. To test whether the multiple regression model with $X$ and $X^2$ as predictors provides better predictions than the model with just $X$, use the p-value of the t-test on the $X^2$ coefficient (the null hypothesis is that $X^2$ has a zero coefficient). Plot the residuals vs. X to determine whether the quadratic model is appropriate. If there is still a pattern in the mean of the residuals, try a cubic model with $X$, $X^2$ and $X^3$.
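As a rough illustration of the quadratic fit and the t-test on the squared term, here is a minimal Python sketch using only NumPy. The data are synthetic, and the helper names `fit_ols` and `change_per_unit` are made up for this example; the lecture itself does the fitting in JMP.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit: returns coefficients, standard errors, t-statistics."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - p)           # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)      # covariance of the estimates
    se = np.sqrt(np.diag(cov))
    return beta, se, beta / se

# Synthetic data with known curvature (an assumption, not course data).
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 60)
y = 2.0 + 1.5 * x - 0.2 * x**2 + rng.normal(0, 0.5, size=x.size)

# Multiple regression with two explanatory variables: X and X^2.
X_quad = np.column_stack([np.ones_like(x), x, x**2])
beta, se, t = fit_ols(X_quad, y)

def change_per_unit(x0):
    """Change in the fitted mean of Y when X goes from x0 to x0 + 1."""
    return beta[1] + beta[2] * (2 * x0 + 1)

print("t-statistic on the X^2 coefficient:", t[2])
print("change per unit at X=1:", change_per_unit(1.0))
print("change per unit at X=5:", change_per_unit(5.0))
```

A large absolute t-statistic on the squared term corresponds to a small p-value in the t-test described above. The two printed changes differ, which is why the coefficients cannot be read as a single slope.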

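Regression Model for Fast Food Chain Data Interactions and polynomial terms can be combined in a multiple regression model. For the fast food chain data, we consider the model $E(\text{revenue} \mid \text{income}, \text{age}) = \beta_0 + \beta_1\,\text{income} + \beta_2\,\text{age} + \beta_3\,\text{income}^2 + \beta_4\,\text{age}^2 + \beta_5\,\text{income} \times \text{age}$. This is called a second-order model because it includes all squares and interactions of the original explanatory variables.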
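One way to picture a second-order model is through its design matrix: an intercept column, each original variable, each square, and the cross-product. The sketch below builds that matrix in Python. The function name `second_order_design` and the income and age values are hypothetical placeholders, not the contents of fastfoodchain.jmp.

```python
import numpy as np

def second_order_design(income, age):
    """Columns: intercept, income, age, income^2, age^2, income*age."""
    income = np.asarray(income, dtype=float)
    age = np.asarray(age, dtype=float)
    return np.column_stack([
        np.ones_like(income),  # intercept
        income,                # first-order terms
        age,
        income**2,             # squared terms (curvature)
        age**2,
        income * age,          # interaction term
    ])

# Hypothetical values for illustration only.
income = np.array([15.0, 20.0, 25.0, 30.0])
age = np.array([5.0, 8.0, 11.0, 14.0])

X = second_order_design(income, age)
print(X.shape)
```

With two original explanatory variables the second-order model has six coefficients, so the matrix has six columns; it can then be passed to any least-squares routine.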

fastfoodchain.jmp results Strong evidence of a quadratic relationship between revenue and age and between revenue and income. Moderate evidence of an interaction between age and income.

Categorical variables Categorical (nominal) variables: Variables that define group membership, e.g., sex (male/female), color (blue/green/red), county (Bucks County, Chester County, Delaware County, Philadelphia County). Categorical variables can be incorporated into regression through dummy variables. We will look at categorical variables that have two categories.

Sex discrimination revisited At the beginning of the class, in case study 1.2, we examined data from a sex discrimination case. There was strong evidence that male clerks are paid more than female clerks. But the bank’s defense lawyers say that this is because the males have more education and experience, i.e., there are omitted confounding variables.

Multiple regression model for sex discrimination Let’s look at controlling for education level first. To examine the bank’s claim, we want to look at $E(\text{salary} \mid \text{educ}, \text{male})$ and compare it to $E(\text{salary} \mid \text{educ}, \text{female})$. How do we incorporate a categorical explanatory variable into multiple regression? Dummy variables.

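Dummy variables Define sexdummy = 1 if the employee is male and sexdummy = 0 if the employee is female. Multiple regression model: $E(\text{salary} \mid \text{educ}, \text{sexdummy}) = \beta_0 + \beta_1\,\text{educ} + \beta_2\,\text{sexdummy}$. $\beta_2$, the coefficient on the dummy variable for sex, is the difference in mean earnings between the populations of men and women with the same education level.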

Categorical variables in JMP To color and mark the points by a categorical variable such as Sex, click the red triangle to the left of the first column and select Color or Mark by Column. Select Set Marker by Value to use a different marker for each value of the column.

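Parallel Regression Lines The model $E(\text{salary} \mid \text{educ}, \text{sexdummy}) = \beta_0 + \beta_1\,\text{educ} + \beta_2\,\text{sexdummy}$, with sexdummy = 1 for males and 0 for females, implies that $E(\text{salary} \mid \text{educ}, \text{male}) = (\beta_0 + \beta_2) + \beta_1\,\text{educ}$ and $E(\text{salary} \mid \text{educ}, \text{female}) = \beta_0 + \beta_1\,\text{educ}$. The regression lines for males and females as education varies are parallel: they share the slope $\beta_1$ and differ only in intercept. There is no interaction between sex and education.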
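The parallel-lines property can be checked numerically. The sketch below fits a model with an education term and a sex dummy to synthetic data and compares the fitted male-female gap at two education levels. The coding sexdummy = 1 for males, the salary figures, and the helper `mean_salary` are assumptions made for illustration, not the case study data.

```python
import numpy as np

# Synthetic data (an assumption): sexdummy = 1 for male, 0 for female.
rng = np.random.default_rng(1)
n = 80
educ = rng.integers(8, 17, size=n).astype(float)     # years of education
sexdummy = rng.integers(0, 2, size=n).astype(float)
salary = 4000 + 100 * educ + 700 * sexdummy + rng.normal(0, 150, size=n)

# Design matrix: intercept, educ, sexdummy (no interaction term).
X = np.column_stack([np.ones(n), educ, sexdummy])
b, *_ = np.linalg.lstsq(X, salary, rcond=None)

def mean_salary(e, s):
    """Fitted mean salary at education e and dummy value s."""
    return b[0] + b[1] * e + b[2] * s

gap_at_8 = mean_salary(8, 1) - mean_salary(8, 0)
gap_at_16 = mean_salary(16, 1) - mean_salary(16, 0)
print(gap_at_8, gap_at_16, b[2])
```

The gap between the two fitted lines is the same at every education level and equals the dummy coefficient, which is what "parallel" means here.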

Plot produced by JMP version 5 in Fit Model output that shows the parallel regression lines and the actual observations.

Interactions with Dummy Variables The parallel regression lines model assumes that the difference between men’s and women’s mean salaries at a fixed level of education is the same for all levels of education. There might instead be an interaction between sex and education: the difference between men and women might depend on the level of education.

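Interaction Model Multiple regression model that allows for interaction between sex and education: $E(\text{salary} \mid \text{educ}, \text{sexdummy}) = \beta_0 + \beta_1\,\text{educ} + \beta_2\,\text{sexdummy} + \beta_3\,\text{sexdummy} \times \text{educ}$. To add the interaction in JMP, create a new column sexdummy*educ: right-click on the column, select Formula, and use the formula sexdummy*educ. The difference in mean salary between men and women of the same education level, $\beta_2 + \beta_3\,\text{educ}$, depends on the education level.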
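The same idea of creating a product column and refitting can be sketched in Python. The data below are synthetic, the coding sexdummy = 1 for males is an assumption, and `gap` is a hypothetical helper name; the lecture itself builds the column with the JMP formula editor.

```python
import numpy as np

# Synthetic data (an assumption): the male-female gap grows with education.
rng = np.random.default_rng(2)
n = 200
educ = rng.integers(8, 17, size=n).astype(float)
sexdummy = rng.integers(0, 2, size=n).astype(float)
salary = (4000 + 100 * educ + 200 * sexdummy
          + 50 * sexdummy * educ + rng.normal(0, 100, size=n))

# The interaction column is the product of the dummy and education.
interaction = sexdummy * educ
X = np.column_stack([np.ones(n), educ, sexdummy, interaction])
b, *_ = np.linalg.lstsq(X, salary, rcond=None)

def gap(e):
    """Fitted male-minus-female difference in mean salary at education e."""
    return b[2] + b[3] * e

print(gap(8), gap(16))
```

Unlike the parallel-lines model, the fitted gap changes with education: each additional year of education changes it by the interaction coefficient.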

The model with one continuous explanatory variable, one categorical variable and an interaction is called the separate regression lines model because regression lines of y on continuous explanatory variables for two levels of dummy variable are “separate,” neither coincident nor parallel.

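Multiple regression with education, experience and sex We can easily control for both education and experience in the sex discrimination case by adding them both to the multiple regression. A model without interactions is: $E(\text{salary} \mid \text{educ}, \text{exper}, \text{sexdummy}) = \beta_0 + \beta_1\,\text{educ} + \beta_2\,\text{exper} + \beta_3\,\text{sexdummy}$. Note that $\beta_3$ is the difference between the mean salaries of males and females with the same education and experience levels.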