Chapter 13 Multiple Regression

2 Chapter 13 Multiple Regression
Section 13.6 Modeling a Categorical Response

3 Modeling a Categorical Response Variable
The regression models studied so far are designed for a quantitative response variable y. When y is categorical, a different regression model applies, called logistic regression.

4 Examples of Logistic Regression
A voter’s choice in an election (Democrat or Republican), with explanatory variables: annual income, political ideology, religious affiliation, and race. Whether a credit card holder pays their bill on time (yes or no), with explanatory variables: family income and the number of months in the past year that the customer paid the bill on time.

5 The Logistic Regression Model
Denote the possible outcomes for y as 0 and 1. Use the generic terms failure (for outcome = 0) and success (for outcome = 1). The population mean of the scores equals the population proportion of ‘1’ outcomes (successes). That is, $\mu_y = p$. The proportion, p, also represents the probability that a randomly selected subject has a successful outcome.
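A small sketch of why the mean of 0/1 scores equals the proportion of successes. The outcome list is hypothetical and is not data from the slides.

```python
# Hypothetical 0/1 outcomes (1 = success, 0 = failure); not data from the slides.
outcomes = [1, 0, 0, 1, 0, 0, 1, 0, 0, 1]

# Mean of the scores: add up the 0s and 1s, then divide by the sample size.
mean_score = sum(outcomes) / len(outcomes)

# Proportion of successes: count the 1s, then divide by the sample size.
proportion_success = outcomes.count(1) / len(outcomes)

print(mean_score, proportion_success)
```

Summing 0/1 scores is the same as counting the 1s, so the two quantities always agree.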

6 The Logistic Regression Model
The straight-line model is usually inadequate when there are multiple explanatory variables. A more realistic model has a curved S-shape instead of a straight-line trend. The regression equation that best models this S-shaped curve is known as the logistic regression equation.

7 The Logistic Regression Model
Figure Two Possible Regressions for a Probability p of a Binary Response Variable. A straight line is usually less appropriate than an S-shaped curve. Question: Why is the straight-line regression model for a binary response variable often poor?

8 The Logistic Regression Model
A regression equation for an S-shaped curve for the probability of success p is: $p = \dfrac{e^{\alpha + \beta x}}{1 + e^{\alpha + \beta x}}$. This equation for p is called the logistic regression equation. Logistic regression is used when the response variable has only two possible outcomes (it’s binary).
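The contrast between a straight line and the S-shaped logistic curve can be sketched as follows. All coefficient values here are hypothetical, chosen only to make the behavior visible; they are not estimates from the slides.

```python
import math

def straight_line(x, alpha, beta):
    """Straight-line model for a probability: alpha + beta * x."""
    return alpha + beta * x

def logistic_probability(x, alpha, beta):
    """Logistic model: e^(alpha + beta*x) / (1 + e^(alpha + beta*x))."""
    linear_term = alpha + beta * x
    return math.exp(linear_term) / (1.0 + math.exp(linear_term))

# Hypothetical coefficients (assumptions for illustration only).
ALPHA_LINE, BETA_LINE = -0.2, 0.02
ALPHA_LOGIT, BETA_LOGIT = -4.0, 0.1

x_values = [0, 20, 40, 60, 80]
line_values = [straight_line(x, ALPHA_LINE, BETA_LINE) for x in x_values]
logit_values = [logistic_probability(x, ALPHA_LOGIT, BETA_LOGIT) for x in x_values]

for x, line_p, logit_p in zip(x_values, line_values, logit_values):
    print(x, round(line_p, 2), round(logit_p, 3))
```

With these assumed values the straight line falls below 0 at x = 0 and above 1 at x = 80, which is not possible for a probability, while the logistic curve stays strictly between 0 and 1 for every x.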

9 Example: Travel Credit Cards
An Italian study with 100 randomly selected Italian adults considered factors that are associated with whether a person possesses at least one travel credit card. The table on the next slide shows results for the first 15 people on this response variable and on the person’s annual income (in thousands of euros).

10 Example: Travel Credit Cards
Table Annual Income (in thousands of euros) and Whether Possess a Travel Credit Card. The response y equals 1 if a person has a travel credit card and equals 0 otherwise.

11 Example: Travel Credit Cards
Let x = annual income and let y = whether the person possesses a travel credit card (1 = yes, 0 = no). The table "Results of Logistic Regression for Italian Credit Card Data" shows what software provides for conducting a logistic regression analysis.
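As a rough sketch of what software does when it fits a logistic regression with one explanatory variable, the code below maximizes the log-likelihood by Newton-Raphson. The income and card values are hypothetical (they are not the Italian study data), and the function names are illustrative; statistical packages use more careful numerical routines.

```python
import math

def sigmoid(t):
    """Logistic function 1 / (1 + e^(-t)), which equals e^t / (1 + e^t)."""
    return 1.0 / (1.0 + math.exp(-t))

def fit_logistic(xs, ys, max_iter=50, tol=1e-10):
    """Estimate alpha and beta in p = sigmoid(alpha + beta*x) by Newton-Raphson."""
    alpha, beta = 0.0, 0.0
    for _ in range(max_iter):
        g0 = g1 = 0.0           # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0   # entries of the 2x2 information matrix
        for x, y in zip(xs, ys):
            p = sigmoid(alpha + beta * x)
            w = p * (1.0 - p)
            g0 += y - p
            g1 += x * (y - p)
            h00 += w
            h01 += x * w
            h11 += x * x * w
        # Newton step: solve the 2x2 system (information matrix) * step = gradient.
        det = h00 * h11 - h01 * h01
        step_alpha = (h11 * g0 - h01 * g1) / det
        step_beta = (h00 * g1 - h01 * g0) / det
        alpha += step_alpha
        beta += step_beta
        if abs(step_alpha) < tol and abs(step_beta) < tol:
            break
    return alpha, beta

# Hypothetical data: income in thousands of euros, 1 = has a travel credit card.
incomes = [12, 15, 18, 22, 25, 28, 30, 34, 38, 42, 45, 50, 55, 60, 65]
has_card = [0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1]

alpha_hat, beta_hat = fit_logistic(incomes, has_card)
p_low = sigmoid(alpha_hat + beta_hat * 12)
p_high = sigmoid(alpha_hat + beta_hat * 65)
print(alpha_hat, beta_hat, p_low, p_high)
```

The two returned numbers play the role of the intercept and slope estimates that appear in software output for a logistic regression.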

12 Example: Travel Credit Cards
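Substituting the estimates of α and β into the logistic regression model formula yields the prediction equation $\hat{p} = \dfrac{e^{\hat{\alpha} + \hat{\beta} x}}{1 + e^{\hat{\alpha} + \hat{\beta} x}}$.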

13 Example: Travel Credit Cards
Find the estimated probability of possessing a travel credit card at the lowest and highest annual income levels in the sample, which were x = 12 and x = 65.

14 Example: Travel Credit Cards
For x = 12 thousand euros, the estimated probability of possessing a travel credit card is 0.09.

15 Example: Travel Credit Cards
For x = 65 thousand euros, the estimated probability of possessing a travel credit card is 0.97.

16 Example: Travel Credit Cards
Insight: Annual income has a strong positive effect on having a credit card. The estimated probability of having a travel credit card changes from 0.09 to 0.97 as annual income changes over its range.
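The arithmetic behind estimates like these can be sketched as follows. The fitted coefficients are not reproduced in this transcript, so the values of ASSUMED_ALPHA and ASSUMED_BETA below are assumptions chosen for illustration; the results land close to, but not exactly on, the probabilities quoted above.

```python
import math

# Assumed intercept and slope (illustrative; not taken from this transcript).
ASSUMED_ALPHA = -3.52
ASSUMED_BETA = 0.105

def estimated_probability(income):
    """Estimated probability of having a travel credit card at a given income."""
    linear_term = ASSUMED_ALPHA + ASSUMED_BETA * income
    return math.exp(linear_term) / (1.0 + math.exp(linear_term))

p_at_12 = estimated_probability(12)   # lowest income in the sample
p_at_65 = estimated_probability(65)   # highest income in the sample

print(round(p_at_12, 3), round(p_at_65, 3))
```

Under these assumed coefficients the estimated probability rises from roughly 0.09 at x = 12 to roughly 0.96 at x = 65, the same strong positive pattern the slide describes.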

17 Example: Estimating Proportion of Students Who’ve Used Marijuana
A three-variable contingency table from a survey of senior high-school students is shown on the next slide. The students were asked whether they had ever used: alcohol, cigarettes or marijuana. We’ll treat marijuana use as the response variable and cigarette use and alcohol use as explanatory variables.

18 Example: Estimating Proportion of Students Who’ve Used Marijuana
Table Alcohol, Cigarette, and Marijuana Use for High School Seniors

19 Example: Estimating Proportion of Students Who’ve Used Marijuana
Let y indicate marijuana use, coded (1 = yes, 0 = no). Let $x_1$ be an indicator variable for alcohol use, coded (1 = yes, 0 = no). Let $x_2$ be an indicator variable for cigarette use, coded (1 = yes, 0 = no).

20 Example: Estimating Proportion of Students Who’ve Used Marijuana
Table MINITAB Output for Estimating the Probability of Marijuana Use Based on Alcohol Use and Cigarette Use

21 Example: Estimating Proportion of Students Who’ve Used Marijuana
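The logistic regression prediction equation has the form $\hat{p} = \dfrac{e^{\hat{\alpha} + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2}}{1 + e^{\hat{\alpha} + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2}}$, where $x_1$ and $x_2$ are the indicator variables for alcohol use and cigarette use.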

22 Example: Estimating Proportion of Students Who’ve Used Marijuana
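For those who have not used alcohol or cigarettes, both indicator variables equal 0 ($x_1 = x_2 = 0$). For them, the estimated probability of marijuana use is found by substituting these values into the prediction equation.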

23 Example: Estimating Proportion of Students Who’ve Used Marijuana
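For those who have used alcohol and cigarettes, both indicator variables equal 1 ($x_1 = x_2 = 1$). For them, the estimated probability of marijuana use is found by substituting these values into the prediction equation.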

24 Example: Estimating Proportion of Students Who’ve Used Marijuana
SUMMARY: The probability that students have tried marijuana seems to depend greatly on whether they’ve used alcohol and/or cigarettes.
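A sketch of how a prediction equation with two indicator variables produces one estimated probability for each combination of alcohol and cigarette use. The MINITAB estimates are not reproduced in this transcript, so the three coefficients below are hypothetical values used only to show the mechanics.

```python
import math

# Hypothetical coefficients (assumptions; not the MINITAB output from the slides).
ALPHA = -5.3           # intercept
BETA_ALCOHOL = 3.0     # coefficient of the alcohol indicator
BETA_CIGARETTES = 2.8  # coefficient of the cigarette indicator

def marijuana_probability(alcohol, cigarettes):
    """Estimated probability of marijuana use; each argument is 0 (no) or 1 (yes)."""
    linear_term = ALPHA + BETA_ALCOHOL * alcohol + BETA_CIGARETTES * cigarettes
    return math.exp(linear_term) / (1.0 + math.exp(linear_term))

# One estimated probability per combination of the two indicators.
probabilities = {
    (alcohol, cigarettes): marijuana_probability(alcohol, cigarettes)
    for alcohol in (0, 1)
    for cigarettes in (0, 1)
}

for (alcohol, cigarettes), p in probabilities.items():
    print(f"alcohol={alcohol} cigarettes={cigarettes} p_hat={p:.3f}")
```

With positive coefficients on both indicators, the estimated probability is smallest when both are 0 and largest when both are 1, which is the pattern the summary describes.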

