STT520-420: BIOSTATISTICS ANALYSIS Dr. Cuixian Chen Chapter 8: Fitting Parametric Regression Models STT520-420 1.


1 STT520-420: BIOSTATISTICS ANALYSIS Dr. Cuixian Chen Chapter 8: Fitting Parametric Regression Models STT520-420 1

2 Leukemia Remission Time with PHM and SAS
SAS has a procedure that easily estimates the β's in the proportional hazards model (PHM). Use SAS code to fit the PHM and estimate the β's for the remission-time data.
Q: Is the group effect significant?

3 Hypothesis Testing
We can test three types of hypotheses:
• H0: β = 0;
• H0: the hazard functions of the two groups are the same;
• H0: the survival functions of the two groups are the same.
(For more than two groups, we can use the CLASS option in the procedures.)

4 Review: Parametric Survival Models (chap 7)
• Hypothesis tests about β can be based on any of the three statistics below.
• To test a hypothesis about β, use one of:
  • Wald test
  • LR test
  • Score test
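The formulas for the three tests did not survive in this transcript. As a reminder, the standard textbook forms for a single parameter are sketched below. The notation is an assumption, not taken from the slides: β0 is the hypothesized value, ℓ is the log-likelihood, U = ∂ℓ/∂β is the score, and I is the information.

```latex
% Standard large-sample tests of H_0: \beta = \beta_0 (scalar case)
\begin{align*}
\text{Wald:}  \quad & Z = \frac{\hat\beta - \beta_0}{\operatorname{se}(\hat\beta)} \;\dot\sim\; N(0,1),
               \qquad Z^2 \;\dot\sim\; \chi^2_1 \\
\text{LR:}    \quad & 2\,\bigl[\ell(\hat\beta) - \ell(\beta_0)\bigr] \;\dot\sim\; \chi^2_1 \\
\text{Score:} \quad & \frac{U(\beta_0)^2}{I(\beta_0)} \;\dot\sim\; \chi^2_1
\end{align*}
```

All three are asymptotically equivalent under H0, which is why SAS can report them side by side for the same hypothesis.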

5 Chap 9.4: Cox Model Fitting
• For each model, three statistics are available for a significance test of H0: β = 0:
  • The Wald statistic is the estimator (beta-hat) divided by its standard error.
  • The likelihood ratio (LR) statistic, which is also used to compare models.
  • The Score statistic.

6 Chap 9.4: Cox Model Fitting: likelihood ratio test
• Each of the three printouts has a section giving the values of three test statistics for the so-called "Global Null Hypothesis: β = 0". Here β = 0 refers to the vector of all the betas.
• The likelihood ratio chi-square statistic is the difference of the two -2LOG(L) statistics (the one without covariates {no x's} minus the one with covariates). If the null hypothesis is true, this chi-square statistic has d.f. = number of covariates in the model.
• The same difference in -2LOG(L) can be used to compare any two nested models: the statistic is chi-square with d.f. equal to the difference in the number of covariates, assuming the null hypothesis that the "extra" betas = 0 is true.

7 Chap 9.4: Cox Model Comparison: likelihood ratio
• Let's use the LR test to compare models (see p. 179-180), using the notation there.
• Compute the -2 log-likelihood of the current model and the -2 log-likelihood of the full model.
• Their difference (the current model's value minus the full model's, i.e., twice the full log-likelihood minus twice the current log-likelihood) is asymptotically chi-square with q d.f. and may be used to test whether the additional q parameters in the full model are zero.
• This difference is called the deviance.
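Written out, the comparison described on this slide takes the following form. The subscripts "cur" and "full" are labels chosen here for the current and full models, not necessarily the textbook's notation.

```latex
% Deviance for comparing a current (reduced) model with a full model
% that has q additional parameters
\[
D \;=\; \bigl[-2\log L_{\text{cur}}\bigr] - \bigl[-2\log L_{\text{full}}\bigr]
  \;=\; 2\,\bigl[\log L_{\text{full}} - \log L_{\text{cur}}\bigr]
  \;\dot\sim\; \chi^2_q
  \quad \text{under } H_0:\ \text{the } q \text{ extra } \beta\text{'s are } 0 .
\]
```

Because the full model can only fit at least as well as the current one, D is never negative; a large D is evidence against H0.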

8 Leukemia Remission Time with PHM and SAS
SAS has a procedure that easily estimates the β's in the proportional hazards model (PHM). Use SAS code to fit the PHM and estimate the β's for the remission-time data.

9 Cox Model Fitting for remission data example
• Reconsider the remission data example in more detail and get the SAS output for three models:
  • grp only (Model 1)
  • grp and logWBC (Model 2)
  • grp, logWBC, and the interaction term grp*logWBC (Model 3)
• For each model, we'll do three things:
  • test the null hypothesis beta = 0
  • get an estimate of the hazard ratio for each beta
  • get a 95% confidence interval for each beta

10 Remission data example in SAS, EX4.5, page 68
http://people.uncw.edu/chenc/STT520_420/SAS_Codes/remission-phreg.sas

options ls=80;
dm log 'clear';
dm lis 'clear';

data remission;
  input group $ remtime censor logWBC;
  if group="6mp" then grp=0; else grp=1;
  datalines;
6mp 6 1 … (input data)
;
/* note the use of the numeric variable grp, defined as grp=1 if group="pl" and 0 otherwise */

/* Model 1: covariate = group */
proc phreg data=remission;
  model remtime*censor(0)=grp;
  baseline out=out1 survival=S1 LOGSURV=ls1 LOGLOGS=lls1 upper=UCL1 lower=LCL1;
  title "Model 1";
run; quit;

proc print data=out1;
run; quit;

proc gplot data=out1;
  plot S1*remtime=grp;   /* survival function: estimated curves for treatment and placebo groups */
  /*SYMBOL1 VALUE=none interpol=join;*/
  plot ls1*remtime=grp;  /* log of the survival function (negative of the cumulative hazard) */
  plot lls1*remtime=grp; /* log of the negative log of the survival function (log-log survivor) */
run; quit;

/* Model 2: covariates = group and logWBC */
proc phreg data=remission;
  model remtime*censor(0)=grp logWBC;
  baseline out=out2 survival=S2 upper=UCL2 lower=LCL2;
  title "Model 2";
run; quit;

proc print data=out2;
run; quit;

proc gplot data=out2;
  plot S2*remtime=grp; /* survival curves for treatment and placebo groups */
run; quit;

/*----------------------------------------*/
/* Model 3: covariates = group, logWBC and the interaction term grp_logWBC=grp*logWBC */
proc phreg data=remission;
  model remtime*censor(0)=grp logWBC grp_logWBC;
  grp_logWBC=grp*logWBC; /* creation of the interaction term */
  baseline out=out3 survival=S3 upper=UCL3 lower=LCL3;
  title "Model 3";
run; quit;

proc print data=out3;
run; quit;

proc gplot data=out3;
  plot S3*remtime=grp; /* survival curves for treatment and placebo groups */
run; quit;

11 SAS output for Model 1: EX4.5, page 68
• The ties-handling line of the output shows the default method for handling ties (Breslow in PROC PHREG); other methods: Exact, Discrete, Efron.
• Only look at Y, not Yx.
• Consider the regression model: Cox PHM on Yx.

12 SAS output for Model 1: EX4.5, page 68
H0: β_grp = 0 vs. Ha: β_grp ≠ 0
Review STT215: 5 steps of hypothesis testing:
1. State the hypotheses: H0: β = 0 vs. Ha: β ≠ 0.
2. Choose a significance level: α = 5% or 1%.
3. Calculate the test statistic, assuming H0 is true.
4. Find the P-value in the direction of Ha.
5. Draw conclusions (statistical and non-statistical): if P-value ≤ α, we reject H0 (enough evidence); if P-value > α, we do not reject H0 (not enough evidence).

13 SAS output for Model 1: EX4.5, page 68
H0: β_grp = 0 vs. Ha: β_grp ≠ 0
Deviance for Model 1 (with covariate grp) compared to Model 0 (without any covariate):
D = 187.970 - 172.759 = 15.211 (15.2109 in the SAS output)
P-value = 1-pchisq(15.2109, 1) = 9.614686e-05

14 SAS output for Model 2: EX4.5, page 68
H0: β_grp = β_logWBC = 0 vs. Ha: at least one of β_grp and β_logWBC is not 0
Deviance for Model 2 (with covariates grp and logWBC) compared to Model 0 (without any covariate):
D = 187.970 - 144.559 = 43.411
P-value = 1-pchisq(43.411, 2) = 3.744736e-10

15 SAS output for Model 3: EX4.5, page 68
H0: β_grp = 0 vs. Ha: β_grp ≠ 0
H0: β_grp = β_logWBC = β_grp*logWBC = 0 vs. Ha: at least one of β_grp, β_logWBC, and β_grp*logWBC is not 0
Deviance for Model 3 (with covariates grp, logWBC, grp*logWBC) compared to Model 0 (without any covariate):
D = 187.970 - 144.131 = 43.839
P-value = 1-pchisq(43.839, 3) = 1.632828e-09

16 Compare Model 1 and Model 2
H0: β_logWBC = 0 vs. Ha: β_logWBC ≠ 0
Deviance for Model 2 (with covariates grp and logWBC) compared to Model 1 (with covariate grp):
D = 172.759 - 144.559 = 28.2
P-value = 1-pchisq(28.2, 1) = 1.094046e-07
So we reject H0: adding logWBC significantly improves the fit, and Model 2 is better than Model 1.

17 Compare Model 2 and Model 3
H0: β_grp*logWBC = 0 vs. Ha: β_grp*logWBC ≠ 0
Deviance for Model 3 (with covariates grp, logWBC, and grp*logWBC) compared to Model 2 (with covariates grp and logWBC):
D = 144.559 - 144.131 = 0.428
P-value = 1-pchisq(0.428, 1) = 0.512972
So we do not reject H0 that the coefficient of the interaction term is 0; the simpler Model 2 is preferred over Model 3.
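The deviance calculations on slides 13 to 17 can be reproduced outside R. The following is a minimal Python sketch that uses only the standard library; the function names `chisq_upper_tail` and `lr_test` are made up for this illustration, and the closed-form tail probabilities cover only the 1, 2, and 3 degrees of freedom needed here. The -2 log L values are the ones quoted on the slides.

```python
import math

def chisq_upper_tail(x, df):
    """P(chi-square with df degrees of freedom > x); closed forms for df = 1, 2, 3 only."""
    if x < 0:
        raise ValueError("x must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    if df == 3:
        return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    raise ValueError("this sketch only covers df = 1, 2, 3")

# -2 log L values quoted on the slides; Model 0 has no covariates.
NEG2LOGL = {0: 187.970, 1: 172.759, 2: 144.559, 3: 144.131}
N_COVARIATES = {0: 0, 1: 1, 2: 2, 3: 3}

def lr_test(reduced, full):
    """Return (deviance, df, p_value) for comparing two nested models by number."""
    deviance = NEG2LOGL[reduced] - NEG2LOGL[full]
    df = N_COVARIATES[full] - N_COVARIATES[reduced]
    return deviance, df, chisq_upper_tail(deviance, df)

for reduced, full in [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3)]:
    d, df, p = lr_test(reduced, full)
    print(f"Model {full} vs Model {reduced}: D = {d:.3f}, df = {df}, p = {p:.6g}")
```

The printed p-values should agree with the `1-pchisq(...)` results on the slides up to rounding of the -2 log L inputs.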

18 Recall: Example 8.1, page 145
• PHM with a group membership covariate: there is only one covariate, namely "group" (usually control group: x=0 and treatment group: x=1).
• The proportional hazard (or hazard ratio) between the two groups is h(t | x=1) / h(t | x=0) = exp(β).
• So, if we could get an estimate of β (call it β-hat), we would have an estimate of the hazard ratio between two individuals in the two groups, i.e., exp(β-hat).

19 SAS output for Model 1: EX4.5, page 68
exp(beta) = 4.523, so beta = log(4.523) ≈ 1.50919. A hazard ratio of 4.523 means that the hazard for those on placebo is about 4.523 times (452.3% of) the hazard for those on 6-MP.
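The arithmetic linking the coefficient to the hazard ratio, and the usual Wald-type 95% confidence interval for the hazard ratio, can be sketched as follows. The coefficient 1.50919 is the one quoted on the slides; the standard error `se_assumed` is a HYPOTHETICAL placeholder (the transcript does not show the SAS standard error), so the printed interval is illustrative only.

```python
import math

def hazard_ratio(beta_hat):
    """Hazard ratio for a one-unit increase in the covariate under the PHM."""
    return math.exp(beta_hat)

def hazard_ratio_ci(beta_hat, se, z=1.959964):
    """Wald-type interval: exponentiate beta_hat -/+ z * se (z = 1.959964 gives 95%)."""
    return math.exp(beta_hat - z * se), math.exp(beta_hat + z * se)

beta_grp = 1.50919   # estimate quoted on the slides (Model 1)
se_assumed = 0.41    # HYPOTHETICAL standard error, for illustration only

hr = hazard_ratio(beta_grp)
lower, upper = hazard_ratio_ci(beta_grp, se_assumed)
print(f"HR = {hr:.3f}")                                  # HR = 4.523
print(f"illustrative 95% CI: ({lower:.3f}, {upper:.3f})")
```

The interval is built on the log scale and then exponentiated, so it is not symmetric around the hazard ratio.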

20 SAS output for Model 1: EX4.5, page 68
The BASELINE statement gives nonparametric estimates of the survival function based on the fitted PHM. By default the estimates are for a subject whose covariates all equal the mean of each variable (e.g., grp takes the values x=0 and x=1, and mean(x) = 0.5).

21 More details: Compare Model 2 and Model 3
• Cox model fitting: use the LR test to compare Model 3 with Model 2; i.e., is the interaction term significant?
• The LR statistic is the difference between the -2 log L values of the two models: -2 log L(Model 2) - (-2 log L(Model 3)) = 144.559 - 144.131 = 0.428.
• Under the null hypothesis that the interaction term has coefficient zero, this test statistic follows a chi-square distribution with df = 1 (one parameter difference between the two models).
• From R: 1-pchisq(0.428, 1) = 0.512972
• That is, P(chisq(1) > 0.428) = 0.513. Therefore, we do not reject the null hypothesis; Model 2 is already an appropriate model.

22 How to estimate beta, assuming g1(x) = exp(beta*x)
• Now let's look at the hazard ratio (HR) in each of the three models.
• In Model 1, the HR is estimated to be 4.523 (from SAS). Let's see how this is done: we've seen that the hazard ratio is h(t | x=1) / h(t | x=0) = exp(beta), so if x=1 is the placebo group and the maximum likelihood estimate of beta is 1.50919 (from SAS), then exp(1.50919) = 4.523066 is the estimated hazard ratio. This means that the hazard for an individual in the placebo group is about 4.5 times the hazard for an individual in the treatment group (at all times), ignoring logWBC.

23 Cox Model Fitting: Control covariates
• Consider the hazard ratios for Model 2 (placebo: x=1) and for Model 3. With a significant interaction term, the estimated HR for group would depend on the value of logWBC.
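The hazard-ratio expressions for this slide are missing from the transcript. Under the model forms fitted on slide 10, they follow directly from the PHM; the subscripts below are labels chosen here, and w denotes a fixed value of logWBC.

```latex
% Hazard ratio for placebo (grp = 1) versus treatment (grp = 0)
% at the same value w of logWBC
\begin{align*}
\text{Model 2:}\quad
  & h(t \mid \mathrm{grp}, w) = h_0(t)\exp\bigl(\beta_{\mathrm{grp}}\,\mathrm{grp} + \beta_{w}\, w\bigr)
  && \Rightarrow\quad \mathrm{HR} = \exp\bigl(\beta_{\mathrm{grp}}\bigr) \\
\text{Model 3:}\quad
  & h(t \mid \mathrm{grp}, w) = h_0(t)\exp\bigl(\beta_{\mathrm{grp}}\,\mathrm{grp} + \beta_{w}\, w
      + \beta_{\mathrm{int}}\,\mathrm{grp}\cdot w\bigr)
  && \Rightarrow\quad \mathrm{HR}(w) = \exp\bigl(\beta_{\mathrm{grp}} + \beta_{\mathrm{int}}\, w\bigr)
\end{align*}
```

In Model 2 the hazard ratio is adjusted for logWBC but constant in w; in Model 3 it changes with w, which is why a significant interaction term would complicate the interpretation.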

24 PROC PHREG: Baseline option
• The BASELINE statement in PROC PHREG gives an estimate of the baseline survival function together with a 95% confidence interval. The UPPER= and LOWER= options store the upper and lower 95% confidence limits in the variables UCL and LCL, respectively.

proc phreg data=remission;
  model remtime*censor(0)=grp logWBC;
  title "Model 2";
  baseline out=a survival=s upper=ucl lower=lcl;
run;
proc print data=a;
run; quit;

25 Chap 9.4: Cox Model Prediction
• To predict the adjusted survival curves for specific values of the covariates, first create a data set with the values you want to consider and then use the COVARIATES= option as follows:

…
data b;
  grp=1; logWBC=2.93;
run;
…
proc phreg data=remission;
  model remtime*censor(0)=grp logWBC;
  baseline out=a survival=s upper=ucl lower=lcl covariates=b / nomean;
run;
proc print data=a;
run; quit;
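The idea behind an adjusted survival curve can be illustrated numerically. Under the PHM, a survival curve estimated at reference covariate values x_ref is moved to covariate values x by raising it to the power exp(β·(x − x_ref)). The sketch below shows only that relationship; it is not what PROC PHREG runs internally, and the curve and coefficients in the example are HYPOTHETICAL numbers, not output from the remission data.

```python
import math

def adjust_survival(s_ref, beta, x, x_ref):
    """Shift a survival curve from covariates x_ref to covariates x under the PHM.

    s_ref : survival probabilities estimated at covariate values x_ref
    beta  : fitted coefficients, one per covariate
    """
    shift = sum(b * (xi - xr) for b, xi, xr in zip(beta, x, x_ref))
    return [s ** math.exp(shift) for s in s_ref]

# HYPOTHETICAL inputs, for illustration only.
s_at_mean = [0.95, 0.80, 0.60, 0.40]   # curve at the covariate means
beta_hat = [1.3, 1.6]                  # coefficients for (grp, logWBC)
x_mean = [0.5, 2.93]
x_new = [1.0, 2.93]                    # placebo subject with logWBC = 2.93

print([round(s, 3) for s in adjust_survival(s_at_mean, beta_hat, x_new, x_mean)])
```

Because the exponent exceeds 1 whenever the linear predictor increases, a higher-risk covariate pattern gives a uniformly lower survival curve.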

26 Testing whether quantitative covariates are associated with survival time
• Both procedures give likelihood-ratio, Score, and Wald test statistics.
• PROC LIFEREG (tests automatically):

proc lifereg data=recid;
  model week*arrest(0) = fin age race wexp mar paro prio / dist=exponential;
run;

• PROC PHREG (tests automatically, works better):

proc phreg data=recid;
  model week*arrest(0) = fin age race wexp mar paro prio;
run;

27 To check the Exponential/Weibull assumption
• Case 1: with NO covariate:
  • R programs in chapter 4;
  • PROC LIFETEST produces two useful plots, the log-survival (LS) plot and the log-log survival (LLS) plot, by using PLOTS=(S, LS, LLS), to check the Exponential/Weibull distribution.
  • If the Exponential model is appropriate, the LS plot (t, -log(S(t))) should be a straight line through the origin.
  • If the Weibull model is appropriate, the LLS plot (log t, log(-log(S(t)))) should be a straight line.
• However, these graphs do not adjust for the effects of covariates. With covariates, we can use PROC LIFEREG with the PROBPLOT statement.

28 Graphical methods for evaluating model fit
http://people.uncw.edu/chenc/STT520_420/dataset/Chap8-steel-model-check.sas
• Case 2: with covariates:
  • To check model fit with covariates, consider PROC LIFEREG with the PROBPLOT statement.
  • The PROBPLOT statement produces non-parametric estimates of the survivor function using a modified Kaplan-Meier method that adjusts for covariates.

proc lifereg data=recid;
  model week*arrest(0)=fin age race wexp mar paro prio / dist=weibull;
  probplot;
  title "Lifereg Weibull";
run; quit;

29 Graphical methods for evaluating model fit
The upward-sloping straight line represents the survival function predicted by the model. The shaded bands around that line are the 95% confidence bands. The circles are the non-parametric survival function estimates. Ideally, all the non-parametric estimates should lie within the confidence bands.

30 PROC LIFEREG in SAS
• The only differences between the AFT model and the usual linear regression model are that there is a σ before ε_i and that the dependent variable is logged.
• With exact (uncensored) data, take Y = log T and use the linear regression model with Y as the dependent variable. With censored data, use maximum likelihood estimation (MLE) with a distributional assumption on ε. For each distribution of ε, there is a corresponding distribution for T.
• Incidentally, all AFT models are named for the distribution of T rather than for the distribution of ε or log T.
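The model described in words on this slide can be written out as follows. The covariate indexing (k covariates x_{i1}, …, x_{ik}) is notation chosen here; the pairs of distributions are the standard correspondences between the error term and the survival time.

```latex
% Accelerated failure time (AFT) model: linear regression on log T with a scale parameter
\[
\log T_i \;=\; \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik} + \sigma\,\varepsilon_i
\]
% Correspondence between the distribution of epsilon and the distribution of T
\begin{tabular}{ll}
distribution of $\varepsilon$ & distribution of $T$ (name of the AFT model) \\ \hline
extreme value, $\sigma = 1$   & exponential \\
extreme value                 & Weibull \\
normal                        & log-normal \\
logistic                      & log-logistic \\
\end{tabular}
```

Setting σ = 1 in the Weibull model gives the exponential model, which is why the exponential fit can be tested as a special case of the Weibull fit.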

31 PROC LIFEREG in SAS
• Y_x can assume the following distributions: Weibull, exponential, gamma, log-logistic, and log-normal, selected with an option such as "/dist=weibull".
• Note: all AFT models are named for the distribution of Y_x, not log(Y_x) or epsilon.
• However, the choice of model can make a substantial difference.
• Graphical methods for evaluating model fit:
  • If Y_x ~ Exp, then (Y_x, -log S(Y_x)) should be a straight line through the origin.
  • If Y_x ~ Weibull, then (log(Y_x), log[-log S(Y_x)]) should be a straight line.
  • In PROC LIFETEST, plots=(ls, lls) gives both plots.

32 PROC LIFEREG/PHREG in SAS
• PROC LIFEREG allows all types of censoring: right (RC), left (LC), and interval (IC), while PROC PHREG only allows RC.
• PROC LIFEREG can test certain hypotheses about the shape of the hazard function. PROC PHREG only gives nonparametric estimates of the survivor function, which can be difficult to interpret.
• If the shape of the survival distribution is known, PROC LIFEREG produces more efficient estimates, with smaller standard errors, than PROC PHREG.
• PROC LIFEREG creates a set of dummy (indicator) variables to represent categorical variables with multiple values, but PROC PHREG requires you to create such variables in a DATA step.
• But PROC LIFEREG does NOT handle time-dependent variables, while PROC PHREG does.

33 PROC LIFEREG EX 7.7, pg138 (with residual plot)

options ls=80;
dm log 'clear';
dm lis 'clear';

data sinker;
  input dur censor;
  datalines;
10 1
12 1
15 0
17 1
18 1
18 1
20 0
20 1
21 1
21 0
23 0
25 1
27 1
29 1
29 1
30 0
35 1
;
proc print data=sinker;
run;

/* NOLOG: the response is not log-transformed, so an extreme-value distribution is fitted */
proc lifereg data=sinker;
  model dur*censor(0)= / nolog dist=weibull;
  probplot;
  title 'Modeling u=log(Y) w/ NOLOG option';
run; quit;

/* Weibull distribution for Y; ITPRINT shows how the iterative process works */
proc lifereg data=sinker;
  model dur*censor(0)= / dist=weibull ITPRINT;
  title 'Modeling non-transformed Y';
  probplot;
run; quit;

/* Exponential distribution for Y */
proc lifereg data=sinker;
  model dur*censor(0)= / dist=exponential;
  probplot;
  title 'Modeling Exponential Y';
run; quit;
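As a cross-check on what the no-covariate fits are doing, the sketch below computes, for the same 17 observations, the Kaplan-Meier estimate, the points of the LS and LLS plots used to check the Exponential/Weibull assumption, and the closed-form exponential MLE (events divided by total time at risk). It is a plain Python illustration, not a reproduction of PROC LIFEREG output; the function names are made up here, and censor=0 is treated as censored, matching `censor(0)` in the MODEL statement.

```python
import math

# (dur, censor) pairs from the sinker data step; censor = 0 marks a censored time.
sinker = [(10, 1), (12, 1), (15, 0), (17, 1), (18, 1), (18, 1), (20, 0), (20, 1),
          (21, 1), (21, 0), (23, 0), (25, 1), (27, 1), (29, 1), (29, 1), (30, 0), (35, 1)]

def kaplan_meier(data):
    """Return [(t, S(t))] at each distinct event time (product-limit estimate)."""
    surv, curve = 1.0, []
    for t in sorted({d for d, c in data if c == 1}):
        at_risk = sum(1 for d, _ in data if d >= t)         # still under observation at t
        events = sum(1 for d, c in data if d == t and c == 1)
        surv *= 1 - events / at_risk
        curve.append((t, surv))
    return curve

km = kaplan_meier(sinker)

# LS plot: (t, -log S(t)); roughly a line through the origin if exponential.
# LLS plot: (log t, log(-log S(t))); roughly a line if Weibull.
# Points with S(t) = 0 are skipped because the logs are undefined there.
ls_points = [(t, -math.log(s)) for t, s in km if s > 0]
lls_points = [(math.log(t), math.log(-math.log(s))) for t, s in km if 0 < s < 1]

# Exponential MLE with right censoring: rate = number of events / total time observed.
n_events = sum(c for _, c in sinker)
total_time = sum(d for d, _ in sinker)
rate = n_events / total_time
print(f"events = {n_events}, total time = {total_time}, rate = {rate:.5f}")
print(f"log(mean) = {math.log(1 / rate):.4f}")
```

For the exponential model, log(1/rate) is the log of the estimated mean duration, which is the quantity an intercept-only model for log T is estimating.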

34 Testing for difference in survivor functions, with covariates
• Q: Did the treatment make a difference in the survival experience of the two groups?
• Test: S1(t) = S2(t) for all t. PROC LIFETEST calculates the following tests:
  (1) Log-rank test (Mantel-Haenszel test);
  (2) Wilcoxon test;
  (3) Likelihood-ratio test, with the additional assumption that Yx follows an exponential distribution.
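To make the first of these tests concrete, here is a minimal Python sketch of the two-group log-rank statistic: at each event time it compares the observed number of events in group 1 with the number expected if the two survivor functions were equal, and refers the standardized difference to a chi-square distribution with 1 d.f. The function name and the two small data sets are HYPOTHETICAL illustrations, not the myelomatosis data, and the sketch is not meant to mirror PROC LIFETEST's output layout.

```python
import math

def logrank(group1, group2):
    """Two-group log-rank test. Each group is a list of (time, event), event = 1 if observed.

    Returns (chi_square, p_value) with 1 degree of freedom.
    """
    data = [(t, e, 1) for t, e in group1] + [(t, e, 2) for t, e in group2]
    observed1 = expected1 = variance = 0.0
    for t in sorted({u for u, e, _ in data if e == 1}):
        n1 = sum(1 for u, _, g in data if u >= t and g == 1)   # at risk in group 1
        n2 = sum(1 for u, _, g in data if u >= t and g == 2)   # at risk in group 2
        d1 = sum(1 for u, e, g in data if u == t and e == 1 and g == 1)
        d2 = sum(1 for u, e, g in data if u == t and e == 1 and g == 2)
        n, d = n1 + n2, d1 + d2
        observed1 += d1
        expected1 += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)
    chi_square = (observed1 - expected1) ** 2 / variance if variance > 0 else 0.0
    p_value = math.erfc(math.sqrt(chi_square / 2))   # upper tail of chi-square, 1 d.f.
    return chi_square, p_value

# HYPOTHETICAL data: group A fails early, group B fails late, no censoring.
group_a = [(t, 1) for t in (1, 2, 3, 4, 5)]
group_b = [(t, 1) for t in (6, 7, 8, 9, 10)]
chi_square, p_value = logrank(group_a, group_b)
print(f"chi-square = {chi_square:.3f}, p = {p_value:.4f}")
```

A small p-value leads to rejecting H0: S1(t) = S2(t) for all t; identical groups give a statistic of 0.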

35 PROC LIFETEST: Testing for differences in survivor functions between 2 groups
• Did the treatment make a difference in the survivor functions of the two groups? That is, we test whether the survivor functions are the same in the two groups: S1(t) = S2(t) for all t.

ODS GRAPHICS ON;
PROC LIFETEST DATA=myel PLOTS=S(TEST);
  TIME dur*status(0);
  STRATA treat;
RUN;
ODS GRAPHICS OFF;

• The STRATA statement has three consequences:
  1. First, instead of a single table with KM estimates, separate tables are produced for each of the two treatment groups.
  2. Second, corresponding to the two tables are two graphs of the survivor function, superimposed on the same axes for easy comparison.
  3. Third, PROC LIFETEST reports several statistics related to testing for differences between the two groups. Also, the TEST option (after PLOTS=S) includes the log-rank test in the survivor plot.

36 PROC LIFETEST: Testing whether quantitative covariates are associated with survival time
• To test whether quantitative covariates are associated with survival time, use PROC LIFETEST with the TEST statement:

proc lifetest data=recid;
  time week*arrest(0);
  test fin age race wexp mar paro prio;
run;

• It gives the log-rank and Wilcoxon test statistics (better in PROC LIFETEST).
• Or use the likelihood-ratio test.

37 PROC LIFETEST: myelomatosis
http://people.uncw.edu/chenc/STT520_420/dataset/Example-myel.sas

options ls=80;
dm log 'clear';
dm lis 'clear';

data myel;
  input dur censor treat renal;
  datalines;
8 1 1 1
… (input data)
;
proc print;
run; quit;

proc lifetest data=myel plots=(s,h,p) graphics
  /* only the survivor function is plotted with the KM method, even if you request s, h, p */
  outsurv=OUT /* write the intervals to an output data set OUT */;
  time dur*censor(0);
run; quit;

proc print data=OUT; /* print the output for intervals */
run; quit;

proc lifetest data=myel plots=(s) graphics;
  time dur*censor(0);
  strata treat;
  symbol1 v=none color=black line=1;
  symbol2 v=none color=red line=2;
run; quit;

proc lifetest data=myel plots=(s) graphics;
  time dur*censor(0);
  strata renal;
  symbol1 v=none color=black line=1;
  symbol2 v=none color=red line=2;
run;

proc lifetest data=myel plots=(s) graphics;
  time dur*censor(0);
  strata treat renal;
  symbol1 v=none color=black line=1;
  symbol2 v=none color=red line=2;
run;

38 PROC LIFETEST: myelomatosis

39 PROC LIFETEST: myelomatosis, with all data

40 PROC LIFETEST: myelomatosis, with STRATA

41 PROC LIFETEST: myelomatosis, with log-rank test
H0: The survival functions for the groups are the same;
Ha: The survival functions for the groups are NOT the same.

