Applied Epidemiologic Analysis - P8400 Fall 2002

Slides:



Advertisements
Similar presentations
A. The Basic Principle We consider the multivariate extension of multiple linear regression – modeling the relationship between m responses Y 1,…,Y m and.
Advertisements

Prof. Navneet Goyal CS & IS BITS, Pilani
The %LRpowerCorr10 SAS Macro Power Estimation for Logistic Regression Models with Several Predictors of Interest in the Presence of Covariates D. Keith.
Logistic Regression I Outline Introduction to maximum likelihood estimation (MLE) Introduction to Generalized Linear Models The simplest logistic regression.
Simple Logistic Regression
Introduction to Survival Analysis October 19, 2004 Brian F. Gage, MD, MSc with thanks to Bing Ho, MD, MPH Division of General Medical Sciences.
EPI 809/Spring Probability Distribution of Random Error.
1 Contingency Tables: Tests for independence and homogeneity (§10.5) How to test hypotheses of independence (association) and homogeneity (similarity)
Overview of Logistics Regression and its SAS implementation
Logistic Regression STA302 F 2014 See last slide for copyright information 1.
April 25 Exam April 27 (bring calculator with exp) Cox-Regression
1 Experimental design and analyses of experimental data Lesson 6 Logistic regression Generalized Linear Models (GENMOD)
Chapter 11 Survival Analysis Part 3. 2 Considering Interactions Adapted from "Anderson" leukemia data as presented in Survival Analysis: A Self-Learning.
PH6415 Review Questions. 2 Question 1 A journal article reports a 95%CI for the relative risk (RR) of an event (treatment versus control as (0.55, 0.97).
Maximum Likelihood We have studied the OLS estimator. It only applies under certain assumptions In particular,  ~ N(0, 2 ) But what if the sampling distribution.
BIOST 536 Lecture 9 1 Lecture 9 – Prediction and Association example Low birth weight dataset Consider a prediction model for low birth weight (< 2500.
An Introduction to Logistic Regression JohnWhitehead Department of Economics Appalachian State University.
Chapter 11 Survival Analysis Part 2. 2 Survival Analysis and Regression Combine lots of information Combine lots of information Look at several variables.
EPI 809/Spring Multiple Logistic Regression.
Logistic Regression Biostatistics 510 March 15, 2007 Vanessa Perez.
Notes on Logistic Regression STAT 4330/8330. Introduction Previously, you learned about odds ratios (OR’s). We now transition and begin discussion of.
An Introduction to Logistic Regression
WLS for Categorical Data
Adjusting for extraneous factors Topics for today More on logistic regression analysis for binary data and how it relates to the Wolf and Mantel- Haenszel.
Review for Exam 2 Some important themes from Chapters 6-9 Chap. 6. Significance Tests Chap. 7: Comparing Two Groups Chap. 8: Contingency Tables (Categorical.
Survival Analysis A Brief Introduction Survival Function, Hazard Function In many medical studies, the primary endpoint is time until an event.
Logistic Regression II Simple 2x2 Table (courtesy Hosmer and Lemeshow) Exposure=1Exposure=0 Disease = 1 Disease = 0.
SAS Lecture 5 – Some regression procedures Aidan McDermott, April 25, 2005.
Logistic Regression III: Advanced topics Conditional Logistic Regression for Matched Data Conditional Logistic Regression for Matched Data.
Simple Linear Regression
April 11 Logistic Regression –Modeling interactions –Analysis of case-control studies –Data presentation.
Design and Analysis of Clinical Study 11. Analysis of Cohort Study Dr. Tuan V. Nguyen Garvan Institute of Medical Research Sydney, Australia.
01/20151 EPI 5344: Survival Analysis in Epidemiology Interpretation of Models March 17, 2015 Dr. N. Birkett, School of Epidemiology, Public Health & Preventive.
Logistic Regression STA2101/442 F 2014 See last slide for copyright information.
April 6 Logistic Regression –Estimating probability based on logistic model –Testing differences among multiple groups –Assumptions for model.
Logistic Regression Analysis of Matched Case-Control Data- Part 2.
Applied Epidemiologic Analysis - P8400 Fall 2002 Lab 10 Missing Data Henian Chen, M.D., Ph.D.
1 היחידה לייעוץ סטטיסטי אוניברסיטת חיפה פרופ’ בנימין רייזר פרופ’ דוד פרג’י גב’ אפרת ישכיל.
LOGISTIC REGRESSION A statistical procedure to relate the probability of an event to explanatory variables Used in epidemiology to describe and evaluate.
Linear correlation and linear regression + summary of tests
HSRP 734: Advanced Statistical Methods July 17, 2008.
April 4 Logistic Regression –Lee Chapter 9 –Cody and Smith 9:F.
Epidemiologic design from a sampling perspective Epidemiology II Lecture April 14, 2005 David Jacobs.
03/20131 EPI 5344: Survival Analysis in Epidemiology Risk Set Analysis Approaches April 16, 2013 Dr. N. Birkett, Department of Epidemiology & Community.
MBP1010 – Lecture 8: March 1, Odds Ratio/Relative Risk Logistic Regression Survival Analysis Reading: papers on OR and survival analysis (Resources)
Applied Epidemiologic Analysis - P8400 Fall 2002 Lab 9 Survival Analysis Henian Chen, M.D., Ph.D.
Logistic Regression Applications Hu Lunchao. 2 Contents 1 1 What Is Logistic Regression? 2 2 Modeling Categorical Responses 3 3 Modeling Ordinal Variables.
1 STA 617 – Chp10 Models for matched pairs Summary  Describing categorical random variable – chapter 1  Poisson for count data  Binomial for binary.
© Department of Statistics 2012 STATS 330 Lecture 19: Slide 1 Stats 330: Lecture 19.
Log-linear Models HRP /03/04 Log-Linear Models for Multi-way Contingency Tables 1. GLM for Poisson-distributed data with log-link (see Agresti.
1 Topic 4 : Ordered Logit Analysis. 2 Often we deal with data where the responses are ordered – e.g. : (i) Eyesight tests – bad; average; good (ii) Voting.
01/20151 EPI 5344: Survival Analysis in Epidemiology Cox regression: Introduction March 17, 2015 Dr. N. Birkett, School of Epidemiology, Public Health.
Applied Epidemiologic Analysis - P8400 Fall 2002 Labs 6 & 7 Case-Control Analysis ----Logistic Regression Henian Chen, M.D., Ph.D.
Exact Logistic Regression
Applied Epidemiologic Analysis - P8400 Fall 2002 Labs 6 & 7 Case-Control Analysis ----Logistic Regression Henian Chen, M.D., Ph.D.
Applied Epidemiologic Analysis - P8400 Fall 2002 Lab 3 Type I, II Error, Sample Size, and Power Henian Chen, M.D., Ph.D.
Lecture 3 (Chapter 4). Linear Models for Longitudinal Data Linear Regression Model (Review) Ordinary Least Squares (OLS) Maximum Likelihood Estimation.
Logistic Regression For a binary response variable: 1=Yes, 0=No This slide show is a free open source document. See the last slide for copyright information.
Analysis of matched data Analysis of matched data.
Logistic Regression Logistic Regression - Binary Response variable and numeric and/or categorical explanatory variable(s) –Goal: Model the probability.
The simple linear regression model and parameter estimation
BINARY LOGISTIC REGRESSION
Advanced Quantitative Techniques
Advanced Quantitative Techniques
Multiple logistic regression
Applied Epidemiologic Analysis - P8400 Fall 2002
Statistics 262: Intermediate Biostatistics
Case-control studies: statistics
Presentation transcript:

Applied Epidemiologic Analysis - P8400 Fall 2002 Lab 8 Conditional Logistic Regression Henian Chen, M.D., Ph.D. Applied Epidemiologic Analysis - P8400 Fall 2002

Parameters Estimation for Linear Regression Ordinary Least Squares (OLS) Statistical methods based on the minimization of the squared differences between the observed and predicted values of Y. Applied Epidemiologic Analysis - P8400 Fall 2002

Parameters Estimation for Logistic Regression Likelihood The probability that a score or set of scores could occur, given the values of a set of parameters in a model. Maximum Likelihood (ML) A method for the estimation of parameters based on the principle of maximizing the likelihood of the sample. Applied Epidemiologic Analysis - P8400 Fall 2002

Parameters Estimation for Ordinary (unconditional) Logistic Regression Unconditional Likelihood (Lu) Lu = πP(Xi) π[1-P(Xi)] = πexp(α+βiXi) / π[1+exp(α+βiXi)] πP(Xi) = the probability for the event π[1-P(Xi)] = the probability for not having the event proc logistic data=case_control978 descending; model status=alcgrp; run; Standard Wald Parameter DF Estimate Error Chi-Square Pr > ChiSq Intercept 1 -2.5911 0.1925 181.1314 <.0001 alcgrp 1 1.7641 0.2132 68.4372 <.0001 OR=exp(1.7641)=5.836 Applied Epidemiologic Analysis - P8400 Fall 2002

1:1 Conditional Logistic Regression   Description of Data (CASECONTROL11.DBF) 1:1 matched case-control study of low birth weight (<2500 grams) babies. 56 matched pairs. Controls were age-matched mothers who gave birth to a baby above 2500 grams. ID identification variable for each matched pair Status 0= GE 2500 grams (controls), 1=<2500 grams (cases) AGE age of mother in years LWT pre-pregnant weight, weight in pounds at last menstrual period RACE race of mother (1=white, 2=black, 3= other) SMOKE smoking status during pregnancy (1=yes, 0=no) PTD history of premature labor (1=yes, 0=no) HT history of hypertension (1=yes, 0=no) UI presence of uterine irritability (1=yes, 0=no) The goal of the analysis is to determine whether the low birth weight is associated with any of above explanatory variables. Applied Epidemiologic Analysis - P8400 Fall 2002

Parameters Estimation for Conditional Logistic Regression (1) We cannot use unconditional likelihood (Lu) for matched case-control study. Using Lu for matched case-control data, the αi is the effect of the pair effect, the βi is the effect of Xi. Since there are only two observations in each pair, you can’t estimate the αi without bias. Applied Epidemiologic Analysis - P8400 Fall 2002

Parameters Estimation for Conditional Logistic Regression (2) We have to include 56-1=55 dummy variables for the strata if we want to use unconditional logistic regression for matched case-control study. This would leave us with possibly 64 parameters being estimated for a data set with only 112 observations. Furthermore, increasing the sample size will not help because an additional stratum parameter would have to be estimated for each additional matched set in the study sample. Applied Epidemiologic Analysis - P8400 Fall 2002

Parameters Estimation for Conditional Logistic Regression (3) Conditional Likelihood (Lc) Lc = πP(Xi) π[1-P(Xi)] / Σ{ΣP(Xi) π[1-P(Xi)] } = πexp(ΣβiXi) / Σ[πexp(ΣβiXi)] πP(Xi) =probability of pair’s case having the event π[1-P(Xi)] = probability of pair’s control not having the event Σ{ΣP(Xi) π[1-P(Xi)] } = the joint probability that either the case or control has the event. Applied Epidemiologic Analysis - P8400 Fall 2002

Parameters Estimation for Conditional Logistic Regression (4) Conditional Likelihood (Lc) Lc can use the information contained in the matches 2. Lc can’t estimate the intercept (α). With no α in conditional logistic regression model, we can’t estimate the P(x). 3. Lc can estimate the odds ratio by using the βi, so we can estimate the other effects (except intercept) in which we are interested. Applied Epidemiologic Analysis - P8400 Fall 2002

Applied Epidemiologic Analysis - P8400 Fall 2002 Lc vs. Lu We can use conditional likelihood (Lc) for unmatched case-control study: 1) Lc treats the unmatched case-control data as one stratum. 2) Conditional likelihood (Lc) always gives you the unbiased estimation. We can not use unconditional likelihood (Lu) for matched case-control study. Because Lu omits the information inherent in the matching process. It is incorrect to treat 56 strata as one stratum. Applied Epidemiologic Analysis - P8400 Fall 2002

1:1 Conditional Logistic Regression (1) SAS Program   proc import datafile= 'a:casecotrol11.dbf' out=casecontrol11 dbms=dbf replace; run; data casecontrol11; set casecontrol11; status1=1-status; race1=0; if race=2 then race1=1; race2=0; if race=3 then race2=1; proc phreg; model status1 = age lwt race1 race2 smoke ptd ht ui /selection=forward ties=discrete rl; strata=id; Applied Epidemiologic Analysis - P8400 Fall 2002

1:1 Conditional Logistic Regression (2) Status1 (case=0,control=1): Probability of being a case is modeled proc phreg: Procedure PHREG performs both Cox regression for survival data, and conditional logistic regression for matched case-control studies selection=forward: start with a model containing none of the independent variables and then considers variables one by one for inclusion ties= option specifies how to handle ties in the failure time = BRESOW: uses the approximate likelihood of Breslow. this is the default value = DISCRETE: replaces proportional hazards model by the discrete logistic model = EFRON: uses the approximate likelihood of Efron = EXACT: computes the exact conditional probability under the proportional hazards assumption The DISCRETE method is required :1) if the dependent variable is discrete 2) there is more than one case in a matched set of case-control study rl: estimate the 95% confidence limits Applied Epidemiologic Analysis - P8400 Fall 2002

1:3 Conditional Logistic Regression Description of Data (CASECONTROL13.TXT)   1:3 matched hospital based case-control study. Cases are women diagnosed with benign breast disease from two hospitals. Controls were selected from other patients at the same two hospitals.   ID stratum number SUBJECT observation within a matched set (1=case, 2-4= controls) AGEINTER age of the subject at the interview STATUS diagnosis (1=case, 0=control) MCHECK regular medical checkup history (1=yes, 0=no) AGEP age at first pregnancy AGEM age at menarche NONLIVEN number of stillbirths, miscarriages, and other non live births LIVEN number of live births WEIGHT weight of the subject AGELM age at last menstrual period Applied Epidemiologic Analysis - P8400 Fall 2002

1:3 Conditional Logistic Regression SAS Program   proc import datafile= 'a:casecotrol13.txt' out=casecontrol13 dbms=tab replace; getnames=yes; run; data casecontrol13; set casecontrol13; status1=1-status; proc phreg; model status1 = ageinter mcheck agep agem nonliven liven weight agelm /selection=forward ties=discrete rl; strata=id; Applied Epidemiologic Analysis - P8400 Fall 2002

Applied Epidemiologic Analysis - P8400 Fall 2002 We can not use unconditional logistic regression for matched case-control study, but we can use conditional logistic regression for unmatched case-control study. Unconditional model proc logistic data=case_control978 descending; model status=alcgrp; Parameter β SE OR 95% Confidence Limits alcgrp 1.7641 0.2132 5.836 3.843 8.864 Conditional model (ties=discrete) proc phreg; model status1*status(0)= alcgrp / ties=discrete; Parameter β SE OR 95% Confidence Limits alcgrp 1.76231 0.21315 5.826 3.836 8.847 Conditional model (default value for ties) proc phreg; model status1*status(0) = alcgrp; Parameter β SE OR 95% Confidence Limits alcgrp 1.47319 0.2008 4.363 2.944 6.467 Applied Epidemiologic Analysis - P8400 Fall 2002