1 Statistical Consulting Unit, University of Haifa. Prof. Benjamin Reiser, Prof. David Faraggi, Ms. Efrat Yaskil.


2 Usual Regression Model
$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, $i = 1, \dots, n$, where the $\varepsilon_i \sim N(0, \sigma^2)$ are independent.
The model can be extended to many x's: $y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_p x_{pi} + \varepsilon_i$.
Some of the x's may be categorical (defined by dummy variables).

3 Often the y's are binary, for example: 1) yes/no, 2) alive/dead, 3) success/failure.

4 $E(Y_i) = p_i$, where $p_i$ is a function of the x's that approaches 1 from below and 0 from above.
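For the logistic model used in the remaining slides, a function with this behaviour can be written as follows. The notation ($\beta_0$, $\beta_1$, a single explanatory variable $x_i$) is carried over from the usual regression model and is an assumption about how the slide was written.

$$
p_i = \frac{e^{\beta_0 + \beta_1 x_i}}{1 + e^{\beta_0 + \beta_1 x_i}}
\qquad\Longleftrightarrow\qquad
\log\frac{p_i}{1 - p_i} = \beta_0 + \beta_1 x_i
$$

As $\beta_0 + \beta_1 x_i \to +\infty$ the probability approaches 1 from below, and as $\beta_0 + \beta_1 x_i \to -\infty$ it approaches 0 from above.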

5 There are other possible functions with this form (such as probit), which are not discussed here.

6 Passengers on the Titanic
Description: the data give the survival status of 679 passengers on the Titanic, together with their names, age, sex and passenger class.
Variables:
Name: recorded name of passenger
Passenger class: 1st, 2nd or 3rd
Age: age in years
Gender: 0 = male, 1 = female
Survived: 1 = yes, 0 = no

7 One binary explanatory variable. An example: the effect of GENDER on the survival of passengers on the Titanic. GENDER is defined by x: x=1 for female and x=0 for male. Survival is defined by: Yes = survived, No = did not survive.

8 Odds. The odds of an event with probability p are p/(1-p), and the odds ratio (OR) is the odds at x=1 divided by the odds at x=0. OR=1 indicates no effect of x on y.

9 Odds and the Logistic Model. $\beta_1 = \log(\mathrm{OR})$, the log odds ratio, measures the effect of x. $\beta_1 = 0$ (or equivalently OR=1) implies no effect of x on y.

10 Method of Estimation: Maximum Likelihood. Motivation for the MLE (maximum likelihood estimator): it is the value of the parameters that maximizes the probability of observing the data we in fact observed. For individual i we have a Bernoulli distribution, $P(Y_i = y_i) = p_i^{y_i}(1 - p_i)^{1 - y_i}$, where the $p_i$ are functions of x.

11 Numerical optimization gives the parameter estimates together with estimates of their variances and covariances. The x's can be continuous or categorical.
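The numerical optimization can be sketched with a few Newton-Raphson steps on the Bernoulli log-likelihood. This is only an illustration under stated assumptions, not the algorithm SAS uses internally: the function names and the small data set (40 observations with x=1, 50 with x=0) are hypothetical and are not the Titanic data.

```python
import math

def score_and_information(b0, b1, x, y):
    """Gradient of the log-likelihood and the 2x2 information matrix."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))  # logistic p_i
        w = p * (1.0 - p)                             # Bernoulli variance
        g0 += yi - p
        g1 += (yi - p) * xi
        h00 += w
        h01 += w * xi
        h11 += w * xi * xi
    return (g0, g1), (h00, h01, h11)

def invert_2x2(h00, h01, h11):
    """Inverse of the symmetric matrix [[h00, h01], [h01, h11]]."""
    det = h00 * h11 - h01 * h01
    return h11 / det, -h01 / det, h00 / det

def fit_logistic(x, y, iterations=25):
    """Newton-Raphson MLE for logit(p) = b0 + b1 * x."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        (g0, g1), info = score_and_information(b0, b1, x, y)
        c00, c01, c11 = invert_2x2(*info)
        b0 += c00 * g0 + c01 * g1
        b1 += c01 * g0 + c11 * g1
    # The inverse information at the final estimate approximates the
    # variances and covariance of the estimates.
    _, info = score_and_information(b0, b1, x, y)
    return (b0, b1), invert_2x2(*info)

# Hypothetical data: x=1 has 30 events out of 40, x=0 has 10 out of 50.
x = [1] * 40 + [0] * 50
y = [1] * 30 + [0] * 10 + [1] * 10 + [0] * 40

(b0_hat, b1_hat), (var_b0, cov_b0_b1, var_b1) = fit_logistic(x, y)
print(b0_hat, b1_hat, var_b0, var_b1)
```

With a single binary x the estimates have a closed form, which gives a convenient check: the intercept equals the log odds in the x=0 group and the slope equals the log odds ratio.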

12 Logistic Regression using SAS Software
The LOGISTIC Procedure: Response Variable: SURVIVED; Response Levels: 2; Number of Observations: 679; Link Function: Logit.
Response Profile (columns: Ordered Value, SURVIVED, Count) [values not preserved].

13 Logistic Regression Model: Survived = GENDER
Model Fitting Information and Testing Global Null Hypothesis BETA=0 (criteria AIC, SC and -2 LOG L, intercept only vs. intercept and covariates) [values not preserved]: -2 LOG L chi-square for covariates with 1 DF (p=0.0001) (*); Score chi-square with 1 DF (p=0.0001) (*).
Analysis of Maximum Likelihood Estimates (columns: Variable, DF, Parameter Estimate, Standard Error, Wald Chi-Square, Pr > Chi-Square, Standardized Estimate, Odds Ratio) [values not preserved]: INTERCPT ($\hat\beta_0$), GENDER ($\hat\beta_1$) (*).
(*) tests the GENDER effect.

14 Odds(Female) = 3.6970 and log(3.6970) = 1.3. GENDER=1: $\hat\beta_0 + \hat\beta_1 \cdot 1 = 1.3 = \log\{\mathrm{Odds(Female)}\}$.
Odds(Male) = 0.2721 and log(0.2721) = -1.3. GENDER=0: $\hat\beta_0 + \hat\beta_1 \cdot 0 = -1.3 = \log\{\mathrm{Odds(Male)}\}$.
Odds Ratio = Odds(Female)/Odds(Male) = 3.6970/0.2721 = 13.58, and $\log(\mathrm{OR}) = \log(13.58) \approx 2.61 = \hat\beta_1$.
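The arithmetic on this slide can be checked directly from the two odds values it reports. The sketch below uses only those two numbers; the variable names are illustrative.

```python
import math

# Odds reported on the slide.
odds_female = 3.6970
odds_male = 0.2721

# For a binary x, the intercept is the log odds at x=0 (male)
# and the slope is the log odds ratio.
beta0_hat = math.log(odds_male)
odds_ratio = odds_female / odds_male
beta1_hat = math.log(odds_ratio)

# Odds convert back to probabilities via p = odds / (1 + odds).
p_female = odds_female / (1.0 + odds_female)
p_male = odds_male / (1.0 + odds_male)

print(round(beta0_hat, 2), round(beta1_hat, 2), round(odds_ratio, 2))
print(round(p_female, 3), round(p_male, 3))
```

Adding the two estimates recovers the female log odds, since log(odds_male) + log(odds_female/odds_male) = log(odds_female).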

15 Logistic Regression Model: Survived = GENDER (continued)
Parameter Estimates and 95% Confidence Intervals: Profile Likelihood Confidence Limits and Wald Confidence Limits (columns: Variable, Parameter Estimate, Lower, Upper) for INTERCPT and GENDER [values not preserved].

16 Logistic Regression Model: Survived = AGE
Model Fitting Information and Testing Global Null Hypothesis BETA=0 (criteria AIC, SC and -2 LOG L, intercept only vs. intercept and covariates) [values not preserved]: -2 LOG L chi-square for covariates with 1 DF (p=0.0569) (*); Score chi-square with 1 DF (p=0.0575) (*).
Analysis of Maximum Likelihood Estimates (columns: Variable, DF, Parameter Estimate, Standard Error, Wald Chi-Square, Pr > Chi-Square, Standardized Estimate, Odds Ratio) [values not preserved]: INTERCPT, AGE (*).
(*) tests the AGE effect.


18 Odds Ratio for a continuous explanatory variable x. If $p_k$ is the probability of survival for x=k and $\mathrm{odds}(k) = p_k/(1 - p_k)$, then $\log \mathrm{odds}(k) = \beta_0 + \beta_1 k$, so $\log \mathrm{odds}(k+1) - \log \mathrm{odds}(k) = \beta_1$. This means that $e^{\beta_1} = \mathrm{odds}(k+1)/\mathrm{odds}(k)$ is the odds ratio for an increment of one unit in x.

19 Logistic Regression Model: Survived = AGE (continued)
Parameter Estimates and 95% Confidence Intervals: Profile Likelihood Confidence Limits and Wald Confidence Limits (columns: Variable, Parameter Estimate, Lower, Upper) for INTERCPT and AGE [values not preserved].

20 Logistic Regression Model: Survived = GENDER and AGE
Model Fitting Information and Testing Global Null Hypothesis BETA=0 (criteria AIC, SC and -2 LOG L, intercept only vs. intercept and covariates) [values not preserved]: -2 LOG L chi-square for covariates with 2 DF (p=0.0001) (*); Score chi-square with 2 DF (p=0.0001) (*).
Analysis of Maximum Likelihood Estimates (columns: Variable, DF, Parameter Estimate, Standard Error, Wald Chi-Square, Pr > Chi-Square, Standardized Estimate, Odds Ratio) [values not preserved]: INTERCPT, GENDER, AGE.
(*) tests the GENDER and AGE effects simultaneously.

21 Logistic Regression Model: Survived = GENDER and AGE (continued)
Parameter Estimates and 95% Confidence Intervals: Profile Likelihood Confidence Limits (columns: Variable, Parameter Estimate, Lower, Upper) for INTERCPT, GENDER and AGE [values not preserved].

22 Logistic Regression Model: Survived = GENDER and AGE (continued)
Likelihood Ratio Test for the AGE effect in addition to the GENDER effect: $-2\log L(\mathrm{GENDER}) - \{-2\log L(\mathrm{GENDER+AGE})\}$ [value not preserved] $< \chi^2(\mathrm{df}=1) = 3.84$ (the 5% critical value), meaning there is no significant additional effect of AGE over GENDER.
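The likelihood ratio comparison can be sketched as a small function. The -2 log L values in the example call are hypothetical placeholders, since the slide's own values were not preserved. The p-value uses the identity that a chi-square variable with 1 degree of freedom is a squared standard normal, which holds only for df=1.

```python
import math

def lr_test_1df(neg2logl_reduced, neg2logl_full):
    """Likelihood ratio test for one added parameter (df = 1).

    Returns the test statistic and its p-value, using
    P(chi2_1 > s) = erfc(sqrt(s / 2)).
    """
    stat = max(neg2logl_reduced - neg2logl_full, 0.0)
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical -2 log L values for a reduced and a full model.
stat, p_value = lr_test_1df(700.0, 698.5)
print(stat, round(p_value, 3))

# The 5% critical value 3.84 corresponds to a p-value of about 0.05.
_, p_at_critical = lr_test_1df(3.84, 0.0)
print(round(p_at_critical, 3))
```

A statistic below 3.84 gives a p-value above 0.05, which is the slide's criterion for concluding that the added variable has no significant effect.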

23 Logistic Regression Model: Survived = GENDER and AGE (continued)
$\hat p_1$ = female probability of survival (GENDER=1); $\hat p_0$ = male probability of survival (GENDER=0).

24 Logistic Regression Model: Survived = GENDER and AGE (continued) [Figure with separate curves for Female and Male.]

25 Logistic Regression Model: Survived = GENDER, AGE and Interaction (AGE*GEN)
Model Fitting Information and Testing Global Null Hypothesis BETA=0 (criteria AIC, SC and -2 LOG L, intercept only vs. intercept and covariates) [values not preserved]: -2 LOG L chi-square for covariates with 3 DF (p=0.0001); Score chi-square with 3 DF (p=0.0001).
Analysis of Maximum Likelihood Estimates (columns: Variable, DF, Parameter Estimate, Standard Error, Wald Chi-Square, Pr > Chi-Square, Standardized Estimate, Odds Ratio) [values not preserved]: INTERCPT, GENDER, AGE, AGE*GEN.

26 Survived = GENDER, AGE and Interaction (AGE*GEN) (continued)
Parameter Estimates and 95% Confidence Intervals: Profile Likelihood Confidence Limits (columns: Variable, Parameter Estimate, Lower, Upper) for INTERCPT, GENDER, AGE and AGE*GEN [values not preserved].

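27 Survived = GENDER, AGE and Interaction (continued)
Odds Ratio for GENDER: the fitted logit $\log\{\hat p/(1 - \hat p)\}$ is a linear function of GENDER, AGE and AGE*GENDER [coefficients not preserved], where $\hat p_1$ = female probability of survival (GENDER=1) and $\hat p_0$ = male probability of survival (GENDER=0). The female and male logits are each linear in AGE, and so is their difference, so the OR for GENDER is a function of AGE.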
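In symbols, the dependence of the GENDER odds ratio on AGE can be written as follows. The labelling of the coefficients ($\beta_1$ for GENDER, $\beta_2$ for AGE, $\beta_3$ for AGE*GENDER) is an assumption made for illustration, since the slide's fitted values were not preserved.

$$
\log\frac{\hat p_1}{1 - \hat p_1} = \hat\beta_0 + \hat\beta_1 + (\hat\beta_2 + \hat\beta_3)\,\mathrm{AGE},
\qquad
\log\frac{\hat p_0}{1 - \hat p_0} = \hat\beta_0 + \hat\beta_2\,\mathrm{AGE}
$$

$$
\log \mathrm{OR}_{\mathrm{GENDER}}(\mathrm{AGE}) = \hat\beta_1 + \hat\beta_3\,\mathrm{AGE}
\qquad\Longrightarrow\qquad
\mathrm{OR}_{\mathrm{GENDER}}(\mathrm{AGE}) = e^{\hat\beta_1 + \hat\beta_3\,\mathrm{AGE}}
$$

Without the interaction term ($\hat\beta_3 = 0$) the odds ratio reduces to the constant $e^{\hat\beta_1}$.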

28 Survived = GENDER, AGE and Interaction (continued) [Two figures, each with separate curves for Male and Female.]

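29 Survived = GENDER, AGE and Interaction (continued)
Prediction: of the 296 passengers who survived, 207 were predicted to survive and 89 were not; of the 383 who didn't survive, 56 were predicted to survive and 327 were not.
Correct Prediction = (207 + 327)/679 = 0.786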

30 Survived = GENDER, AGE and Interaction (continued)
Sensitivity = proportion of survivors who were correctly predicted to have survived = 207/296 = 0.699
Specificity = proportion of non-survivors who were correctly predicted not to have survived = 327/383 = 0.854
False Pos. = proportion of those predicted to survive who in fact did not survive = 56/263 = 0.213
False Neg. = proportion of those predicted not to survive who in fact survived = 89/416 = 0.214
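These classification measures follow directly from the four counts on the slide, as the short sketch below shows. The variable names are illustrative. Note that the slide's "False Pos." and "False Neg." are proportions of the predicted groups (as in the SAS classification table), not of the true groups.

```python
# Counts taken from the slide.
true_pos = 207   # survived, predicted to survive
false_neg = 89   # survived, predicted not to survive
false_pos = 56   # did not survive, predicted to survive
true_neg = 327   # did not survive, predicted not to survive

total = true_pos + false_neg + false_pos + true_neg

correct = (true_pos + true_neg) / total
sensitivity = true_pos / (true_pos + false_neg)
specificity = true_neg / (true_neg + false_pos)
# Conditioned on the prediction, following the slide's definitions.
false_pos_rate = false_pos / (true_pos + false_pos)
false_neg_rate = false_neg / (true_neg + false_neg)

for name, value in [("correct", correct),
                    ("sensitivity", sensitivity),
                    ("specificity", specificity),
                    ("false pos.", false_pos_rate),
                    ("false neg.", false_neg_rate)]:
    print(f"{name}: {value:.3f}")
```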

31 Logistic Regression Model: Survived = GENDER, AGE and Interaction (AGE*GEN) (continued)
Classification Table (columns: Prob Level; Correct: Event, Non-Event; Incorrect: Event, Non-Event; Percentages: Correct, Sensitivity, Specificity, False POS, False NEG) [values not preserved].

32 C-Statistic = 0.814

33 Dummy Variables
Passenger Class    CLASS1    CLASS2
1st Class          1         0
2nd Class          0         1
3rd Class          0         0
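The dummy coding can be sketched as a small helper in which 3rd class is the reference level, coded (0, 0). The function name is hypothetical, and the 1st- and 2nd-class rows follow the standard reference coding implied by the 3rd-class row.

```python
def class_dummies(pclass):
    """Return (CLASS1, CLASS2) for passenger class 1, 2 or 3.

    3rd class is the reference level and is coded (0, 0).
    """
    if pclass not in (1, 2, 3):
        raise ValueError(f"unknown passenger class: {pclass!r}")
    class1 = 1 if pclass == 1 else 0
    class2 = 1 if pclass == 2 else 0
    return class1, class2

for pclass in (1, 2, 3):
    print(pclass, class_dummies(pclass))
```

With this coding, each coefficient compares its class with 3rd class, so its exponential is the odds ratio of that class relative to 3rd class.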

34 Logistic Regression Model: Survived = CLASS1 and CLASS2, using the LOGISTIC Procedure
Model Fitting Information and Testing Global Null Hypothesis BETA=0 (criteria AIC, SC and -2 LOG L, intercept only vs. intercept and covariates) [values not preserved]: -2 LOG L chi-square for covariates with 2 DF (p=0.0001); Score chi-square with 2 DF (p=0.0001).
Analysis of Maximum Likelihood Estimates (columns: Variable, DF, Parameter Estimate, Standard Error, Wald Chi-Square, Pr > Chi-Square, Standardized Estimate, Odds Ratio) [values not preserved]: INTERCPT, CLASS1, CLASS2.

35 Logistic Regression Model: Survived = PCLASS, using the GENMOD Procedure
Analysis of Parameter Estimates (columns: Parameter, DF, Estimate, Std Err, ChiSquare, Pr>Chi) [values not preserved]: INTERCEPT, PCLASS 1st, PCLASS 2nd, PCLASS 3rd.
LR Statistics for Type 3 Analysis (columns: Source, DF, ChiSquare, Pr>Chi) [values not preserved]: PCLASS.

36 Survived = GENDER, AGE, PCLASS and interactions, using the GENMOD Procedure
LR Statistics for Type 3 Analysis (columns: Source, DF, ChiSquare, Pr>Chi) [values not preserved]: GENDER, AGE, GENDER*AGE, PCLASS, GENDER*PCLASS, AGE*PCLASS, GENDER*AGE*PCLASS.

37 Survived = GENDER, AGE, PCLASS and interactions, excluding the third-order interaction GENDER*AGE*PCLASS
LR Statistics for Type 3 Analysis (columns: Source, DF, ChiSquare, Pr>Chi) [values not preserved]: GENDER, AGE, GENDER*AGE, PCLASS, GENDER*PCLASS, AGE*PCLASS.

38 C-Statistic = 0.879

39 Survived = GENDER, AGE, PCLASS and 2nd-order interactions (continued)
Analysis of Parameter Estimates (columns: Parameter, DF, Estimate, Std Err, ChiSquare, Pr>Chi) [values not preserved]: INTERCEPT; GENDER; AGE; GENDER*AGE; PCLASS 1st, 2nd, 3rd; GENDER*PCLASS 1st, 2nd, 3rd; AGE*PCLASS 1st, 2nd, 3rd.

40 Survived = GENDER, AGE, PCLASS and 2nd-order interactions (continued)
The fitted logit is a linear combination of GENDER, AGE, GENDER*AGE, CLASS1, CLASS2, GENDER*CLASS1, GENDER*CLASS2, AGE*CLASS1 and AGE*CLASS2 [coefficients not preserved].
Hence, there are 6 models: one for each combination of GENDER and PCLASS, each linear in AGE. For example, for GENDER=1 (female) and PCLASS=1 the logit reduces to an intercept plus a slope times AGE [values not preserved].

41 Survived = GENDER, AGE, PCLASS and 2nd-order interactions (continued)

42 Survived = GENDER, AGE, PCLASS and 2nd-order interactions (continued)