STA 617 – Chp11 Models for repeated data: Analyzing Repeated Categorical Response Data

Presentation transcript:

1 Analyzing Repeated Categorical Response Data
• Repeated categorical responses may come from
  - repeated measurements over time on each individual, or
  - a set of measurements that are related because they belong to the same group or cluster (e.g., measurements made on siblings from the same family, or on a set of teeth from the same mouth).
• Observations within a cluster are usually not independent of each other: the response from one child of a family, say, may influence the response from another child, because the two grew up together.
• Matched pairs are the special case in which each cluster has two members.

2
• Using repeated measures within a cluster can be an efficient way to estimate the mean response at each measurement time without estimating between-cluster variability.
• Often one is interested in the marginal distribution of the response at each measurement time, and not substantively interested in the correlation between responses across times.
• Estimation methods for marginal modeling include maximum likelihood (ML) estimation and generalized estimating equations (GEE).
• ML estimation is difficult because the likelihood is written in terms of the $I^T$ multinomial joint probabilities for T responses with I categories each, whereas the model applies to the marginal probabilities.
• Lang and Agresti give a method for ML fitting of marginal models, covered later in this chapter.
• Modeling a repeated multinomial response or a repeated ordinal response is handled in the same way.

3 Topics
• In Section 11.1 we compare marginal distributions in T-way tables. The remaining sections extend the models to include explanatory variables.
• In Section 11.2 we use ML methods for fitting marginal models.
• In Section 11.3 we use generalized estimating equations (GEE), a multivariate version of quasi-likelihood that is computationally simpler than ML.
• Section 11.4 covers technical details about the GEE approach.
• In the final section we introduce a transitional approach that models observations in terms of previous outcomes.

4 11.1 COMPARING MARGINAL DISTRIBUTIONS: MULTIPLE RESPONSES
• Please review
• Example: in treating a chronic condition with some treatment, the primary goal might be to study whether the probability of success increases over the T weeks of a treatment period.
• The T success probabilities refer to the T first-order marginal distributions.
• We want to compare these marginal distributions.

5 Binary Marginal Models and Marginal Homogeneity
• T binary responses
• Marginal logit model with
• All possible outcomes where
• Let
• The joint distribution of the $2^T$ cell counts is Mult$(n, (\pi_1, \pi_2, \ldots, \pi_{2^T}))$.
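
A sketch of the marginal logit setup this slide refers to, written in the usual textbook notation. The symbols $Y_t$, $\alpha$, $\beta_t$ and the identifiability constraint $\beta_T = 0$ are assumptions chosen for illustration; they are not recovered from the slide itself.

```latex
\begin{aligned}
&\operatorname{logit}\,[P(Y_t = 1)] = \alpha + \beta_t, \qquad t = 1, \ldots, T, \quad \beta_T = 0, \\
&\text{possible outcomes: } (y_1, \ldots, y_T),\ y_t \in \{0, 1\}, \text{ giving } 2^T \text{ cells}, \\
&\text{marginal homogeneity: } \beta_1 = \beta_2 = \cdots = \beta_T .
\end{aligned}
```

Under this parameterization, marginal homogeneity removes T − 1 free parameters, which agrees with the T − 1 degrees of freedom quoted for the likelihood-ratio test.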

6 Marginal homogeneity
• Likelihood
• The likelihood-ratio test of marginal homogeneity compares the sample proportions with the maximum likelihood estimates obtained assuming marginal homogeneity.
• The test statistic has an asymptotic null chi-squared distribution with df = T − 1.
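
A sketch of the usual form of this likelihood-ratio statistic for multinomial counts. The notation ($n_i$ for cell counts, $p_i = n_i/n$ for sample proportions, $\hat{\pi}_i$ for the ML estimates under marginal homogeneity) is assumed here for illustration.

```latex
\begin{aligned}
L(\pi) &\propto \prod_{i=1}^{2^T} \pi_i^{\,n_i}, \\
G^2 &= 2 \sum_{i=1}^{2^T} n_i \log\!\left(\frac{p_i}{\hat{\pi}_i}\right), \qquad p_i = \frac{n_i}{n}, \\
G^2 &\ \overset{\text{approx.}}{\sim}\ \chi^2_{T-1} \quad \text{under marginal homogeneity.}
\end{aligned}
```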

7 Crossover Drug Comparison Example
• Each subject used each of three drugs for treatment of a chronic condition at three times.
• The response measured the reaction as favorable or unfavorable (binary).
• Assume that the drugs have no carryover effects and that the severity of the condition remained stable for each subject throughout the experiment.

8 Test marginal homogeneity
• Sample proportions favorable (n = 46): 28/46 = 0.61, 28/46 = 0.61, and 16/46 = 0.35 for drugs A, B, and C.
• From the sample proportions, A and B are similar, and both are better than C.
• The likelihood-ratio test statistic is 5.95 (df = 2), P-value = 0.05.
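
The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: it uses the counts and the statistic quoted above, and it relies on the fact that for df = 2 the chi-squared upper-tail probability has the closed form exp(−x/2). The variable names are illustrative.

```python
import math

# Favorable responses out of n = 46 subjects, as quoted on the slide.
n = 46
favorable = {"A": 28, "B": 28, "C": 16}
proportions = {drug: count / n for drug, count in favorable.items()}

# For df = 2 the chi-squared survival function reduces to exp(-x / 2).
g2 = 5.95
p_value = math.exp(-g2 / 2)

for drug, p in proportions.items():
    print(f"drug {drug}: proportion favorable = {p:.2f}")
print(f"P-value for G^2 = {g2} on 2 df: {p_value:.3f}")
```

The computed P-value is slightly above 0.05, so the evidence against marginal homogeneity is borderline at the 5% level.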

9 SAS

10 Simultaneous confidence intervals
• The confidence interval for the true difference between B and C is ( , 0.520).

11 CATMOD
Suppose the dependent variable A has three levels and is the only response effect in the MODEL statement.

12 Design Matrix
• p_A = alpha + beta1 + beta2
• p_B = alpha + beta1
• p_C = alpha
• alpha = intercept (= p_C)
• beta1 = p_B − p_C
• beta2 = p_A − p_B
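
The parameterization above can be inverted by hand. The sketch below plugs in the sample proportions from the crossover example (28/46, 28/46, 16/46) purely for illustration; it is not the CATMOD weighted least squares fit, and the variable names are assumptions.

```python
from fractions import Fraction

# Sample proportions favorable for drugs A, B, C (n = 46).
p_a, p_b, p_c = Fraction(28, 46), Fraction(28, 46), Fraction(16, 46)

# Invert the slide's parameterization:
#   p_C = alpha, p_B = alpha + beta1, p_A = alpha + beta1 + beta2
alpha = p_c
beta1 = p_b - p_c
beta2 = p_a - p_b

# Rebuild the proportions from the parameters as a consistency check.
rebuilt = (alpha + beta1 + beta2, alpha + beta1, alpha)
print("alpha =", alpha, " beta1 =", beta1, " beta2 =", beta2)
```

With these proportions beta2 is exactly zero, which reflects that A and B have the same sample proportion, while beta1 carries the B versus C difference.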

13 Design Matrix
• p_A = parameter 1
• p_B = parameter 2
• p_C = parameter 3
Analysis of Weighted Least Squares Estimates (columns: Effect, Parameter, Estimate, Standard Error, Chi-Square, Pr > ChiSq): for the Model effect, each of the three parameters has Pr > ChiSq < .0001.

14 Modeling Margins of a Multicategory Response
• Saturated model
• Marginal homogeneity
• Test

15 Ordinal response
• Marginal homogeneity
• Test
• Model fitting

16 Wald and Generalized CMH Score Tests of Marginal Homogeneity
• Similar to the tests for paired data in Chapter 10
• SAS

17 11.2 MARGINAL MODELING: MAXIMUM LIKELIHOOD APPROACH
• Compare marginal distributions as in Section 11.1, but now accounting for explanatory variables.

18 Longitudinal Mental Depression Example
• Compares a new drug with a standard drug.
• Outcome: mental depression (normal, abnormal).
• Stratified randomization by initial severity of depression (mild or severe). Four arms: n = 80, 70, 100, 90.
• Follow-up at 1 week, 2 weeks, and 4 weeks.

19
• Explanatory variables: treatment type and severity of initial diagnosis.
• T = 3
• 12 marginal distributions result from three repeated observations for each of the four groups.
• Let s denote the severity of the initial diagnosis, with s = 1 for severe and s = 0 for mild.
• Let d denote the drug, with d = 1 for new and d = 0 for standard.
• Let t denote the time of measurement. Use scores (0, 1, 2), the logs to base 2 of the weeks (1, 2, 4).
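
The coding scheme above can be written out in a short sketch. It derives the time scores from the follow-up weeks and enumerates the (s, d, t) settings to confirm the count of 12 marginal distributions; the variable names are illustrative.

```python
import math
from itertools import product

# Follow-up weeks and their base-2 log scores.
weeks = [1, 2, 4]
time_scores = [int(math.log2(w)) for w in weeks]

# Four groups from severity s (0 = mild, 1 = severe) and drug d (0 = standard, 1 = new).
groups = list(product([0, 1], [0, 1]))

# One marginal distribution per group and time point.
margins = [(s, d, t) for (s, d) in groups for t in time_scores]
print(time_scores, len(margins))
```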

20 Descriptive statistics (sample proportions)
• The sample proportion of normal responses after week 1 for subjects with mild initial diagnosis using the standard drug was

21
data depress;
  input case diagnose treat time outcome;
  * outcome=1 is normal;
  datalines;
;
* sort so that the groups appear in a fixed order;
proc sort; by diagnose treat time;
* the mean of the 0/1 outcome is the sample proportion of normal responses;
proc means n mean std;
  class diagnose treat time;
  var outcome;
run;

22
• The sample proportion of normal responses
  - increased over time for each group;
  - increased at a faster rate for the new drug than for the standard drug, for each fixed initial diagnosis;
  - was higher for the mild than for the severe initial diagnosis, for each treatment at each occasion.
• The company would hope to show that patients have a significantly higher rate of improvement with the new drug.

23 Modeling
• The marginal logit model 1 (main effects model)
• Time (t) is treated as continuous.
• The natural sampling assumption is multinomial for the eight cells in the $2^3$ cross-classification of the three responses, independently for each of the four groups.
• A check of model fit compares the 32 cell counts in Table 11.2 to their ML fitted values. Since the model describes 12 marginal logits using four parameters, the residual df = 8. The deviance is $G^2 = 34.6$.
• The model fits poorly because it assumes a common rate of improvement for both drugs, whereas the rate should be higher for the new drug.
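
The degrees of freedom and the strength of the lack of fit can be checked numerically. The sketch below uses a closed form for the chi-squared upper tail that holds only for even df; the helper name `chi2_sf_even_df` is hypothetical and not part of any library.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-squared variate (closed form, even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2
    # P(X > x) = exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
    return math.exp(-half) * sum(half**i / math.factorial(i) for i in range(df // 2))


# 4 groups x 3 occasions = 12 marginal logits, described by 4 parameters.
n_marginal_logits = 4 * 3
n_parameters = 4
residual_df = n_marginal_logits - n_parameters

p_value = chi2_sf_even_df(34.6, residual_df)
print(f"residual df = {residual_df}, P-value = {p_value:.1e}")
```

A P-value this small is consistent with the slide's conclusion that the main effects model fits poorly.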

24 Model 2
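
A sketch of how the two models are usually written for this example, using s, d, and t as defined earlier. The coefficient names $\alpha, \beta_1, \ldots, \beta_4$ are assumptions; Model 2 is taken to add a drug-by-time interaction, which is what allows the rate of improvement to differ between drugs.

```latex
\begin{aligned}
\text{Model 1:}\quad & \operatorname{logit}\,[P(Y_t = 1)] = \alpha + \beta_1 s + \beta_2 d + \beta_3 t, \\
\text{Model 2:}\quad & \operatorname{logit}\,[P(Y_t = 1)] = \alpha + \beta_1 s + \beta_2 d + \beta_3 t + \beta_4 (d \times t).
\end{aligned}
```

Under Model 2 the odds ratio comparing the new drug with the standard drug at time t is $\exp(\beta_2 + \beta_4 t)$, so it changes with t instead of being constant.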

25
• For each drug-time combination, the estimated odds of a normal response when the initial diagnosis was severe equal exp(−1.29) = 0.27 times the estimated odds when the initial diagnosis was mild.
• The estimate indicates no significant difference between the drugs after 1 week (t = 0).
• At time t, the estimated odds of a normal response with the new drug are exp( t) times the estimated odds for the standard drug, for each initial diagnosis level.
• Conclusion: severity of initial diagnosis, drug treatment, and time all have substantial effects on the probability of a normal response.
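
To make the severity effect concrete, the sketch below converts the estimate −1.29 into an odds multiplier and applies it to a hypothetical baseline. The baseline odds of 1.0 for a mild diagnosis is an assumed value chosen only for illustration, not a figure from the data.

```python
import math

# Estimated coefficient of severity (s = 1 for severe), as quoted on the slide.
beta_severity = -1.29
odds_ratio = math.exp(beta_severity)

# Hypothetical baseline: odds of a normal response for a mild diagnosis.
odds_mild = 1.0  # assumed for illustration (probability 0.5)
odds_severe = odds_mild * odds_ratio
prob_severe = odds_severe / (1 + odds_severe)

print(f"odds ratio (severe vs mild) = {odds_ratio:.3f}")
print(f"probability of normal response, severe = {prob_severe:.3f}")
```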

26 Modeling a Repeated Multinomial Response
• At observation t, the marginal response distribution has I − 1 logits.
• For nominal responses, baseline-category logit models describe the odds of each outcome relative to a baseline.
• For ordinal responses, one might use cumulative logit models.
• Checking for interaction is crucial.

27 Insomnia Example
• Randomized, double-blind clinical trial comparing an active hypnotic drug with a placebo in patients who have insomnia problems.
• The response is the patient's reported time (in minutes) to fall asleep after going to bed.

28 Proportional odds model
• Sample marginal distributions
proc sort; by treat time;
proc freq;
  tables treat*time*outcome / nocol nofreq nopercent;
run;

29 ML model fitting
• $G^2 = 8.0$ (df = 6)
• The fit shows evidence of interaction.
• At the initial observation, the estimated odds that time to falling asleep for the active treatment is below any fixed level equal exp(0.046) = 1.04 times the estimated odds for the placebo treatment;
• at the follow-up observation, the effect is exp( ) = 2.03.
• In other words, initially the two groups had similar distributions, but at the follow-up those with the active treatment tended to fall asleep more quickly.
• At follow-up, subjects in both the placebo and the active treatment groups tended to fall asleep more quickly than initially (exp(1.07) = 2.9).
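
A sketch of a cumulative logit (proportional odds) model with a treatment-by-occasion interaction that is consistent with the interpretations on this slide. The coding (t = 0 for the initial observation and t = 1 for follow-up, x = 1 for active treatment and x = 0 for placebo) and the coefficient names are assumptions.

```latex
\begin{aligned}
&\operatorname{logit}\,[P(Y_t \le j)] = \alpha_j + \beta_1 t + \beta_2 x + \beta_3 (t \times x), \\
&\text{treatment effect at the initial observation: } \exp(\beta_2), \\
&\text{treatment effect at follow-up: } \exp(\beta_2 + \beta_3), \\
&\text{occasion effect for placebo: } \exp(\beta_1).
\end{aligned}
```

Read this way, the quoted values correspond to $\hat{\beta}_2 = 0.046$ and $\hat{\beta}_1 = 1.07$, and the follow-up multiplier 2.03 is the exponential of the sum of the treatment and interaction estimates.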

30 Comparisons That Control for Initial Response
• Model assumption: the marginal distributions for the initial response are identical for the treatment groups.
• This holds when subjects are randomly assigned to the groups (randomization is one of the three principles of experimental design; the other two are replication and blocking).
• If the initial marginal distributions are not identical, however, the difference between follow-up and initial marginal distributions may differ between treatment groups, even though their conditional distributions for the follow-up response are identical.
• In such cases, although marginal models can be useful, they may not tell the entire story. It may be more informative to construct models that compare the follow-up responses while controlling for the initial response.

31 Transitional model
• Let $Y_2$ denote the follow-up response, for treatment x with initial response $y_1$.
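
One common way to write such a transitional model for an ordinal response is a cumulative logit model for the follow-up response that conditions on the initial response. This is a sketch; the coefficient names and the use of $y_1$ as a single scored predictor are assumptions.

```latex
\operatorname{logit}\,[P(Y_2 \le j \mid x, y_1)] = \alpha_j + \beta_1 x + \beta_2 y_1 .
```

Here $\beta_1$ compares the treatments at a fixed initial response, which is the comparison that marginal models do not provide.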

32 ML Fitting of Marginal Logit Models*
• For T observations on an I-category response, at each setting of predictors the likelihood refers to $I^T$ multinomial joint probabilities, but the model applies to T sets of marginal multinomial parameters.
• The marginal multinomial variates are not independent.
• Marginal logit models have the generalized loglinear model form, where $\pi$ denotes the complete set of multinomial joint probabilities for all settings of predictors.
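
A sketch of the generalized loglinear form referred to above, in standard notation. The matrix names A, C, and X are assumptions: A sums joint probabilities into the marginal probabilities, C forms the logit contrasts of their logs, and X is the model matrix.

```latex
C \log(A \pi) = X \beta .
```

For example, with T = 2 binary responses and joint probabilities $\pi = (\pi_{11}, \pi_{12}, \pi_{21}, \pi_{22})'$, the rows of A produce $\pi_{1+}, \pi_{2+}, \pi_{+1}, \pi_{+2}$, and C turns their logs into the two marginal logits $\log(\pi_{1+}/\pi_{2+})$ and $\log(\pi_{+1}/\pi_{+2})$.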

33 Example, model (11.1)
• The model of marginal homogeneity (T = 2)

34 Likelihood
• The likelihood function for a marginal logit model is the product of the multinomial mass functions from the various predictor settings.
• Usually, no continuous predictors are allowed.
• Let U denote a full column rank matrix such that the space spanned by the columns of U is the orthogonal complement of the space spanned by the columns of X.
• Maximizing the likelihood incorporates these model constraints as well as identifiability constraints.

35 ML
• Joseph Lang has R and S-Plus functions for ML fitting of marginal models through the generalized loglinear model (11.8), using the constraint approach with Lagrange multipliers.
• The program MAREG (Kastner et al. 1997) provides GEE fitting and ML fitting of marginal models with the Fitzmaurice and Laird (1993) approach, allowing multicategory responses.

36 Generalized Estimating Equation (GEE)