Chapter 15, part D Qualitative Independent Variables.


1 Chapter 15, part D Qualitative Independent Variables

2 VI. Qualitative Independent Variables For most of our models we have restricted our independent variables to quantitative data, values that can take any value in a range. Past examples include: Salary, G.P.A., # of Customers, Repair Cost ($). Qualitative variables are those whose values are categories rather than numbers (Gender, Political Party, Region of Country); they enter a regression as dummy variables.

3 A. A Dummy Variable The simplest of dummy variables is one in which there are only two possibilities for a qualitative variable. You arbitrarily assign a value of 1 to one possibility and a value of 0 to the other. Examples: X=1 if Female; X=0 if Male X=1 if Union worker; X=0 if Nonunion X=1 if College Graduate; X=0 if not
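The 0/1 coding described on this slide can be sketched in a few lines of Python. The worker records, field names, and the `dummy` helper below are hypothetical and exist only to illustrate the idea.

```python
# Hypothetical records; names and values are made up for illustration.
workers = [
    {"name": "Worker A", "gender": "Female", "union": True, "college_grad": False},
    {"name": "Worker B", "gender": "Male", "union": False, "college_grad": True},
]


def dummy(condition):
    """Return 1 if the condition holds, 0 otherwise."""
    return 1 if condition else 0


# Build one 0/1 column per two-way qualitative variable.
coded = [
    {
        "name": w["name"],
        "female": dummy(w["gender"] == "Female"),  # X=1 if Female; X=0 if Male
        "union": dummy(w["union"]),                # X=1 if Union; X=0 if Nonunion
        "college_grad": dummy(w["college_grad"]),  # X=1 if College Graduate; X=0 if not
    }
    for w in workers
]

for row in coded:
    print(row)
```

Which category gets the 1 is arbitrary; swapping the assignment only flips the sign of the estimated coefficient on that dummy (and shifts the intercept accordingly).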

4 B. Inclusion in a Regression Problem #38 builds a model to relate Age (x1), Blood Pressure (x2) and Smoking (x3) to the Risk of Strokes (y). Smoking is a dummy variable: x3 = 1 if a smoker; x3 = 0 if a nonsmoker.

5 Output Overall, what do you make of these results?

6 C. Interpretation The estimated coefficient on the smoking dummy is 8.74. Since x3 = 1 for a smoker, this means that, holding age and blood pressure constant, the estimated probability that a patient has a stroke in the next 10 years is 8.74 percentage points higher for a smoker. You can’t do much about your age, but if you lower your blood pressure by 10 points, you lower the risk by 2.5 percentage points. Hmmm, what should a person do?
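A small worked comparison may help with the closing question. The sketch below uses only the two figures on this slide: the 8.74 coefficient on the smoking dummy, and a blood pressure coefficient of 0.25 that is inferred from "10 points lowers the risk by 2.5" rather than read from the regression output. The function and variable names are hypothetical.

```python
SMOKER_COEF = 8.74        # coefficient on the smoking dummy, from the slide
PRESSURE_COEF = 2.5 / 10  # inferred: 10 points of blood pressure <-> 2.5 of risk


def risk_change(delta_pressure=0.0, delta_smoker=0):
    """Change in estimated risk for a change in blood pressure and/or
    smoking status, holding age constant."""
    return PRESSURE_COEF * delta_pressure + SMOKER_COEF * delta_smoker


quit_smoking = risk_change(delta_smoker=-1)         # smoker -> nonsmoker
lower_pressure = risk_change(delta_pressure=-10.0)  # 10-point reduction

# Blood pressure reduction needed to match the effect of quitting smoking.
equivalent_points = SMOKER_COEF / PRESSURE_COEF

print(f"Quit smoking:        {quit_smoking:+.2f}")
print(f"Lower pressure by 10: {lower_pressure:+.2f}")
print(f"Points of pressure equal to quitting: {equivalent_points:.2f}")
```

Under these figures, quitting smoking lowers the estimated risk by about as much as a 35-point drop in blood pressure.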

7 D. Multi-level Dummy Variables Many wage/salary regression models examine differences in a wage variable by region of the country. For example, we could divide the country into 4 regions and create a dummy variable for each region that equals 1 for a worker from that region and 0 for workers from all other regions.

8 Example Suppose we have 3 workers in a set of data. Franklin is from the North, Elly May is from the South, and Chet is from the West. Our table of data might look like this:
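The slide's table did not survive in this transcript. As a sketch of what such a table could look like, the code below builds one dummy per region for the three workers. The four region names (North, South, East, West) are an assumption based on the regions named elsewhere in the slides, and the helper name `region_dummies` is hypothetical.

```python
REGIONS = ["North", "South", "East", "West"]  # assumed set of 4 regions

# Home region of each worker, as stated on the slide.
workers = {"Franklin": "North", "Elly May": "South", "Chet": "West"}


def region_dummies(region):
    """Return one 0/1 dummy per region: 1 for the worker's own region, 0 elsewhere."""
    return {r: int(r == region) for r in REGIONS}


table = {name: region_dummies(region) for name, region in workers.items()}

# Print a plain-text table: one row per worker, one column per region dummy.
print(f"{'Worker':<10}" + "".join(f"{r:>7}" for r in REGIONS))
for name, row in table.items():
    print(f"{name:<10}" + "".join(f"{row[r]:>7}" for r in REGIONS))
```

Each row contains exactly one 1, because every worker belongs to exactly one region.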

9 The Model If you have 4 levels for the qualitative variable “Region”, you can only include 3 dummies in the equation. Including all 4 alongside the intercept makes it impossible for least squares to find a unique set of coefficients, because the four dummies always sum to 1, which duplicates the constant term (perfect multicollinearity). The omission of one region creates a benchmark and allows you to compare all other regions to the one omitted.
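To see why all 4 dummies cannot be included together with an intercept, the sketch below compares two different coefficient sets that produce identical predictions for every region. If two distinct sets fit equally well, least squares has no single best answer. All numbers and names here are made up for illustration.

```python
REGIONS = ["North", "South", "East", "West"]  # assumed set of 4 regions


def predict(intercept, coefs, region):
    """Prediction from a model with an intercept and a dummy for EVERY region."""
    return intercept + sum(coefs[r] * int(r == region) for r in REGIONS)


# First (hypothetical) coefficient set.
intercept_a = 10
coefs_a = {"North": 1, "South": 2, "East": 3, "West": 4}

# Second set: raise the intercept by any constant and lower every dummy
# coefficient by the same constant. Exactly one dummy is 1 for each worker,
# so the two adjustments always cancel.
shift = 7
intercept_b = intercept_a + shift
coefs_b = {r: c - shift for r, c in coefs_a.items()}

same_predictions = all(
    predict(intercept_a, coefs_a, r) == predict(intercept_b, coefs_b, r)
    for r in REGIONS
)
print("Different coefficients, identical predictions:", same_predictions)
```

Dropping one region's dummy removes this freedom: the intercept then represents the omitted region, and each remaining coefficient is a difference from it.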

10 Hypothetical Regression Results Let’s say that we leave out “East” and we find the following: Wage(Y) = 100 + 50(North) - 25(South) - 10(West) Remember, “North”=1 only if a worker is from the North; for that worker, the other region dummies “South” and “West” are 0.

11 Interpretation Franklin is from the North, so “North”=1 and “South”=“West”=0. His estimated wage is then 100+50=$150. Thus we could say that a worker from the North, all else held constant, would earn an estimated $50 more than a worker from the East, the omitted benchmark region.

12 Continued... Elly May is from the South, so “South”=1 and “North”=“West”=0. Her estimated wage is then 100-25=$75.
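The estimated wages above can be reproduced directly from the hypothetical equation Wage(Y) = 100 + 50(North) - 25(South) - 10(West). The sketch below applies it to all three workers from the earlier example, including Chet from the West, whose estimate follows from the same equation. The function name `estimated_wage` is hypothetical.

```python
INTERCEPT = 100  # estimated wage for the omitted benchmark region, East
COEFS = {"North": 50, "South": -25, "West": -10}  # differences relative to East


def estimated_wage(region):
    """Estimated wage for a worker from the given region."""
    if region == "East":
        return INTERCEPT  # all three dummies are 0 for the benchmark region
    return INTERCEPT + COEFS[region]  # exactly one dummy equals 1


workers = [("Franklin", "North"), ("Elly May", "South"), ("Chet", "West")]
for name, region in workers:
    print(f"{name} ({region}): ${estimated_wage(region)}")
```

A worker from the East has every dummy equal to 0, so the intercept of $100 is that worker's estimated wage and each coefficient is read as a difference from it.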

