Introduction to Logistic Regression


1 Introduction to Logistic Regression
Department of Epidemiology and Health Statistics, School of Public Health, Wuhan University. 2011-05-31 16:59

2 New Words
LPM (linear probability model): 线性概率模型
Odds Ratio: 优势比
Nominal Variables: 名义变量
Dummy Variable: 哑变量
Multiple Logistic Regression: 多重Logistic回归

3 CONTENTS
1. Review the Types of Variables
2. Variables in Logistic Regression
3. Why Can't We Use Linear Regression for a Categorical Response?
4. Logistic Regression Model
5. What Is an Odds Ratio?
6. Multiple Logistic Regression

4 1. Review the Types of Variables

5 Choosing the Scale of Measurement
Before analyzing, select the measurement scale for each variable.

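6 Categorical (qualitative) variables: nominal variables and ordinal variables. Numerical (quantitative) variables: discrete variables and continuous variables.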

7 Nominal Variables

8 Ordinal Variables

9 Binomial (Binary) Variables: Is the weather good or bad? Male or female?

10 Continuous Variables

11 2. Variables in Logistic Regression

12 Predicted, Outcome, Dependent Variable
Dependent (response) variable

13 Types of Logistic Regression
3. Ordinal logistic regression

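14 What Does Logistic Regression Do?
Logistic regression uses independent variables to predict the probability of specific outcomes of a binary dependent variable.
Names used for the independent variables: predictor variables, explanatory variables, covariables, independent variables.
Names used for the dependent variable: predicted variable, response variable, outcome variable, dependent variable.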

15 Independent Variables of Logistic Regression
Continuous variables; dummy variables for nominal variables.
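As a rough illustration of dummy coding, the sketch below turns one nominal variable into 0/1 columns, leaving out a reference level. The function name `dummy_code`, the `is_` column prefix, and the blood-type values are all hypothetical and do not come from the slides.

```python
def dummy_code(values, reference):
    """Return (column_names, rows) of 0/1 dummy variables.

    The reference level gets no column of its own: a case in the
    reference level is coded 0 in every dummy column.
    """
    levels = sorted(set(values) - {reference})
    names = [f"is_{level}" for level in levels]
    rows = [[1 if value == level else 0 for level in levels] for value in values]
    return names, rows


# Hypothetical nominal variable with four levels; "O" is the reference level.
blood_type = ["A", "B", "O", "AB", "O"]
names, rows = dummy_code(blood_type, reference="O")

for value, row in zip(blood_type, rows):
    print(value, dict(zip(names, row)))
```

A nominal variable with k levels yields k - 1 dummy variables here; which level serves as the reference is a modeling choice.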

16 3. Why Can't We Use Linear Regression for a Categorical Response?

17 Example: Failing or Passing an Exam
Define a variable 'Outcome': Outcome = 0 if the individual fails the exam, and Outcome = 1 if the individual passes the exam. Predictor variable: the number of hours spent studying. Linear Probability Model (LPM): Prob(Outcome = 1) = α + β × (hours of study)

18 Linear Probability Models (LPM)
Table columns: Student id, Outcome, Quantity of Study Hours. Cell values as extracted (row alignment lost): 1 3 2 34 17 4 6 5 12 15 7 26 8 29 9 14 10 58 11 31 13
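To see why a linear probability model is awkward for a 0/1 response, here is a minimal sketch that fits Prob(Outcome = 1) = α + β × hours by ordinary least squares. The data are hypothetical (they are not the values from the slide's table), and `fit_line` is an illustrative helper name. With these made-up numbers the fitted "probability" for the longest study time exceeds 1.

```python
def fit_line(xs, ys):
    """Ordinary least squares estimates for y = alpha + beta * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


# Hypothetical data: hours of study and exam outcome (1 = pass, 0 = fail).
hours = [1, 3, 5, 8, 12, 15, 20, 26, 30, 40, 58]
outcome = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

alpha, beta = fit_line(hours, outcome)
for h in (0, 20, 58):
    # A straight line is not bounded, so the fitted values can leave [0, 1].
    print(h, round(alpha + beta * h, 3))
```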

19 4. Logistic Regression Model

20 Logistic Regression Curve
[Figure: logistic regression curve. y-axis: Probability, from 0.0 to 1.0; x-axis: x, from 1 to 21]

21 Logit Transformation
Logistic regression models transform probabilities into values called logits:
$\operatorname{logit}(p_i) = \ln\left(\frac{p_i}{1 - p_i}\right)$
where i indexes all cases (observations), $p_i$ is the probability that the event (a sale, for example) occurs in the ith case, and ln is the natural log (to the base e).
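The logit transformation and its inverse can be sketched in a few lines. The function names `logit` and `inverse_logit` are illustrative choices, not taken from the slides.

```python
import math


def logit(p):
    """Natural log of the odds p / (1 - p); defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(z):
    """Map a logit back to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


for p in (0.1, 0.5, 0.9):
    z = logit(p)
    print(p, round(z, 3), round(inverse_logit(z), 3))
```

Probabilities below 0.5 give negative logits and probabilities above 0.5 give positive logits, so the logit scale is unbounded in both directions.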

22 Assumption 1

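23 Logistic Regression Model
$\operatorname{logit}(p_i) = b_0 + b_1 X_1$
where $\operatorname{logit}(p_i)$ is the logit transformation of the probability of the event, $b_0$ is the intercept of the regression line, and $b_1$ is the slope of the regression line. The logit has a linear relationship with the predictor.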

24 LOGISTIC Procedure (SAS) and Binary Logistic (SPSS)
SAS:
PROC LOGISTIC DATA=SAS-data-set <options>;
  CLASS variables </option>;
  MODEL response=predictors </options>;
  OUTPUT OUT=SAS-data-set keyword=name </option>;
RUN;
SPSS: Analyze > Regression > Binary Logistic…
Dependent: y
Covariates: x
Method: Forward: Wald
Save…: Predicted Values (Probabilities, Group membership)
Options…: CI for exp(B): 95%; Probability for Stepwise: Entry, Removal
Maximum likelihood estimation is a statistical method for estimating the coefficients of a model. The likelihood function is L = Prob(p1 * p2 * … * pn).
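The idea behind maximum likelihood estimation can be sketched in plain Python. This is not what SAS or SPSS run internally. It is a minimal Newton-Raphson sketch for a single predictor, using hypothetical data and illustrative function names, where each p_i is read as the model's probability of the outcome actually observed for case i.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(b0, b1, xs, ys):
    """Log of L = p1 * p2 * ... * pn for the observed outcomes."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        total += math.log(p if y == 1 else 1.0 - p)
    return total


def fit_logistic(xs, ys, iterations=25):
    """Newton-Raphson for logit(p) = b0 + b1 * x, starting from zero."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0              # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0      # entries of the information matrix X'WX
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system (X'WX) * step = gradient and update.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical data: hours of study and exam outcome (1 = pass, 0 = fail).
hours = [1, 3, 5, 8, 12, 15, 20, 26, 30, 40, 58]
outcome = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

b0, b1 = fit_logistic(hours, outcome)
print(round(b0, 3), round(b1, 3))
print(round(log_likelihood(b0, b1, hours, outcome), 3))
```

The sketch assumes the two outcome groups overlap on the predictor; with perfectly separated data the estimates would not settle on finite values.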

25 SPSS Output: Odds Ratio

26 LPM and Logistic Regression Models
Table columns: Student id, Outcome, Quantity of Study Hours. Cell values as extracted (row alignment lost): 1 3 2 34 17 4 6 5 12 15 7 26 8 29 9 14 10 58 11 31 13

27 Comparing LPM and the Logistic Curve

28 5. What Is an Odds Ratio? An odds ratio indicates how much more likely, with respect to odds, a certain event occurs in one group relative to its occurrence in another group.

29 Probabilities from Odds
The odds, calculated as
$\text{odds} = \frac{p}{1 - p}$
can be rearranged to express the probability of an event in terms of the odds:
$p = \frac{\text{odds}}{1 + \text{odds}}$
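A small sketch of the conversion in both directions, followed by an odds ratio between two groups. The group probabilities (0.8 and 0.5) are made up for illustration, and the function names are not from the slides.

```python
def probability_to_odds(p):
    """odds = p / (1 - p), for 0 <= p < 1."""
    return p / (1.0 - p)


def odds_to_probability(odds):
    """p = odds / (1 + odds), for odds >= 0."""
    return odds / (1.0 + odds)


# Hypothetical event probabilities in two groups.
p_group_a = 0.8
p_group_b = 0.5

odds_a = probability_to_odds(p_group_a)
odds_b = probability_to_odds(p_group_b)
odds_ratio = odds_a / odds_b

print(round(odds_a, 3), round(odds_b, 3), round(odds_ratio, 3))
print(round(odds_to_probability(odds_a), 3))
```

In this made-up case the probability in group A is 1.6 times that in group B, while the odds are about 4 times as large, which is why the slides say "with respect to odds".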

30 Probabilities and Odds

31 Probability of Outcome

32 Odds

33 Odds Ratio

34 Properties of the Odds Ratio
[Figure: two aligned scales. Odds ratio: Group B more likely (below 1), no association (at 1), Group A more likely (above 1). Regression coefficient b: from -∞ to ∞.]

35 Odds Ratio from a Logistic Regression Model
Estimated logistic regression model: logit(p) = a + b × (study hours), with estimated slope b = 0.495.
Estimated odds ratio (for each additional study hour):
$\text{odds ratio} = \frac{e^{a + b(x + 1)}}{e^{a + bx}} = e^{b} = e^{0.495} = 1.640$
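The arithmetic can be checked numerically. In the sketch below the slope 0.495 is the value quoted on the slide, while the intercept `a = -3.0` is a made-up placeholder, since the intercept does not appear in the transcript. The ratio of odds for one extra study hour does not depend on the intercept or on the starting number of hours.

```python
import math

b = 0.495   # slope quoted on the slide
a = -3.0    # hypothetical intercept, used only to make the example runnable


def odds(hours):
    """Odds of passing implied by logit(p) = a + b * hours."""
    return math.exp(a + b * hours)


for hours in (2, 10):
    # Odds at (hours + 1) divided by odds at hours: the same ratio each time.
    print(hours, round(odds(hours + 1) / odds(hours), 3))

print(round(math.exp(b), 3))
```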

36 6. Multiple Logistic Regression
$\operatorname{logit}(p_i) = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3$
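As an illustration of how the multiple-predictor equation is used, the sketch below turns a set of coefficients into a predicted probability and into one odds ratio per predictor. All coefficient and predictor values are hypothetical, as is the function name `predict_probability`.

```python
import math


def predict_probability(intercept, coefficients, values):
    """Predicted probability from logit(p) = b0 + b1*X1 + b2*X2 + ..."""
    logit = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical fitted values for b0 and (b1, b2, b3).
b0 = -2.0
coefficients = [0.495, -0.3, 1.1]

# Hypothetical case with X1 = 4, X2 = 2, X3 = 1.
print(round(predict_probability(b0, coefficients, [4, 2, 1]), 3))

# exp(b_j) is the odds ratio for a one-unit increase in X_j
# with the other predictors held fixed.
odds_ratios = [math.exp(b) for b in coefficients]
print([round(value, 3) for value in odds_ratios])
```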

37 Backward Elimination Method

38 Adjusted Odds Ratio

39 Interaction in Multiple Logistic Regression

40 Interaction Plot
[Figure: interaction plot. y-axis: Predicted Logit; x-axis: Income Level (Low, Medium, High); separate lines for Females and Males]
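One way to read an interaction plot like this is through a product term in the model. The sketch below assumes a model with a gender indicator, an income code, and their product. The coefficients are invented and the 0/1/2 coding of income is an assumption made only for illustration. With a nonzero interaction coefficient, the predicted logit rises at different rates for the two groups, so the lines in the plot are not parallel.

```python
def predicted_logit(female, income,
                    b0=-1.0, b_female=0.5, b_income=0.4, b_interaction=0.6):
    """Logit with an interaction term; all coefficients are hypothetical."""
    return (b0
            + b_female * female
            + b_income * income
            + b_interaction * female * income)


# Assumed numeric coding of the income levels shown on the plot's x-axis.
INCOME_LEVELS = {"Low": 0, "Medium": 1, "High": 2}

for label, female in (("Males", 0), ("Females", 1)):
    logits = [round(predicted_logit(female, code), 2)
              for code in INCOME_LEVELS.values()]
    print(label, logits)
```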

41 Backward Elimination Method

42 Multicollinearity in Multiple Logistic Regression
Multicollinearity does not bias the coefficients, but it inflates their standard errors. If a variable that you expect to be statistically significant is not, check the correlation coefficients among the predictors. If two variables have a correlation above roughly 0.6 to 0.8, try dropping the less theoretically important of the two.
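A simple screening step along these lines is to compute pairwise Pearson correlations among the predictors and flag the large ones. In the sketch below, the threshold of 0.7 is an arbitrary pick from the range mentioned above, and the predictor names and values are hypothetical.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def flag_correlated(predictors, threshold=0.7):
    """Return (name_a, name_b, r) for every pair with |r| above threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(predictors.items(), 2):
        r = pearson(a, b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical predictors measured on the same seven students.
predictors = {
    "study_hours": [1, 3, 5, 8, 12, 15, 20],
    "library_visits": [0, 1, 2, 4, 6, 7, 10],
    "age": [19, 22, 18, 21, 20, 23, 19],
}

flagged = flag_correlated(predictors)
for name_a, name_b, r in flagged:
    print(name_a, name_b, round(r, 3))
```

Pairwise correlations only catch collinearity between two variables at a time; a predictor that is nearly a combination of several others would not show up in this check.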

43 Sample Sizes
Sample size = 15 to 20 times the number of variables

44 Thanks for your attention!

