# HSRP 734: Advanced Statistical Methods July 24, 2008.



Objectives
- Describe methods to evaluate the proportional hazards assumption
- Describe other model diagnostics
- Describe a stratified analysis
- Use SAS to implement these methods

Background
- The Cox PH model is a semi-parametric, regression-based approach to survival analysis
- Nonparametric estimation of the baseline hazard function
- Parametric estimation of the proportionality constant (the hazard ratio), whose logarithm is a linear function of the covariates

Checking the Proportional Hazards Assumption
- Graphical approach (subjective)
  - Compare estimated -ln(-ln) survivor curves across the categories of the variable being investigated. Parallel curves indicate that the PH assumption is satisfied.
  - Compare observed with predicted survival curves.
- Goodness-of-fit tests (global)
- Time-dependent variables (computationally cumbersome)

Graphical Approach Background
- Consider a Cox PH model in which we wish to model the gender effect (i.e., a single binary covariate):
  $$h(t; X) = h_0(t)\exp(\beta_1 x_{\text{female}})$$
- Now take the natural log of both sides of the equation:
  $$\log h(t; X) = \log h_0(t) + \beta_1 x_{\text{female}}$$

Graphical Approach Background
Assuming that a female is coded as 1 and a male is coded as 0, we have
- Female: $\log h(t; X) = \log h_0(t) + \beta_1 \times 1 = \log h_0(t) + \beta_1$
- Male: $\log h(t; X) = \log h_0(t) + \beta_1 \times 0 = \log h_0(t)$

Graphical Approach Background
- Thus, a plot of the log-hazard over time would yield two curves, one for females and one for males, and the distance between the curves would be fixed at $\beta_1$.
- A simple method for assessing the proportional hazards assumption is to examine the extent to which the two (or more) curves are equidistant over time.

Graphical Approach
- Advantages:
  - Simple
  - Often the eye is better at evaluating patterns than a formal analytical method
- Disadvantages:
  - Not formal
  - What do you do with continuous variables?
  - Univariate by nature; one has to think hard about how best to consider combinations of variables

Survival curve and hazard (under PH)
- Equivalently, one can use log-log survival curves
- Some math is required to see this
- Let's start with $-\dfrac{d \log S(t)}{dt} = h(t)$

Survival curve and hazard (under PH)
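The step from the hazard to the log-log survival curve can be sketched as follows, assuming the single binary covariate model from the earlier slides and $S(0) = 1$. The symbol $H_0(t)$ for the cumulative baseline hazard is notation introduced here, not taken from the slides.

```latex
\begin{align*}
-\frac{d}{dt}\log S(t; x) &= h(t; x) = h_0(t)\,e^{\beta_1 x} \\
-\log S(t; x) &= \int_0^t h_0(u)\,e^{\beta_1 x}\,du = H_0(t)\,e^{\beta_1 x} \\
\log\{-\log S(t; x)\} &= \log H_0(t) + \beta_1 x
\end{align*}
```

Under PH, the log-log survival curves for $x = 0$ and $x = 1$ therefore differ by the constant $\beta_1$ at every $t$, which is why parallel curves are taken as support for the assumption.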

Log-log Plots: see `eval ph example.sas`

Empirical Log-log Plots
- We can get the survival functions from Kaplan-Meier estimates, which do not assume an underlying Cox model.
- Empirical log-log plots:
  1. Calculate the K-M estimates.
  2. Create a new dataset, keeping only two variables: time and survival.
  3. In the new dataset, create the group variable (e.g., maintained).
  4. Apply the log(-log(Survival)) transformation.
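The empirical log-log steps can be illustrated with a minimal Python sketch. This is not the course's SAS program; the function names and the two toy groups below are hypothetical and exist only to show the K-M estimate followed by the log(-log(Survival)) transformation.

```python
import math
from collections import defaultdict

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (event=1, censored=0)."""
    deaths = defaultdict(int)
    removed = defaultdict(int)
    for t, e in zip(times, events):
        removed[t] += 1          # everyone leaves the risk set at their own time
        if e:
            deaths[t] += 1
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(removed):
        d = deaths.get(t, 0)
        if d > 0:
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= removed[t]
    return curve

def log_minus_log(curve):
    """Map (t, S) to (t, log(-log S)); points with S equal to 0 or 1 are undefined and skipped."""
    return [(t, math.log(-math.log(s))) for t, s in curve if 0.0 < s < 1.0]

# Hypothetical toy data for two groups (times, then event indicators).
curve_a = kaplan_meier([2, 3, 5, 7, 8], [1, 1, 0, 1, 1])
curve_b = kaplan_meier([1, 2, 4, 4, 6], [1, 1, 1, 0, 1])
lls_a = log_minus_log(curve_a)
lls_b = log_minus_log(curve_b)

for label, lls in (("group a", lls_a), ("group b", lls_b)):
    for t, y in lls:
        print(f"{label}  t={t}  log(-log S)={y:.4f}")
```

Plotting the two transformed series against time (or log time) on one graph gives the empirical log-log plot; roughly parallel series are consistent with PH.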

Empirical Log-log Plots using SAS
- You need to spend some time creating a dataset from which the plot can be made.
- See `eval ph example.sas`.

Another Alternative Approach
- Using Lehmann's alternative expression
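The formula on this slide is not preserved in the transcript. As a hedged reminder, Lehmann's alternative is usually written as one survival function raised to a constant power; the notation $S_0$, $S_1$, and $\theta$ below is assumed here rather than taken from the slide.

```latex
\begin{align*}
S_1(t) &= \left[S_0(t)\right]^{\theta}, \qquad \theta = e^{\beta_1} \\
\frac{\log S_1(t)}{\log S_0(t)} &= \theta \quad \text{for all } t \text{ with } 0 < S_0(t) < 1
\end{align*}
```

Under this form the ratio of the log survival functions is constant in time, which is another way of stating the PH assumption for a binary covariate.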

Observed vs. Predicted Plots
Idea:
- Use K-M curves to obtain the "observed" plots.
- Use the Cox PH model to obtain the "expected" plots.
- Put both sets of plots on the same graph.
- If they are close, the data are consistent with the PH assumption; if not, the assumption is violated.

Expected Plot from SAS

Time-Dependent Covariates
Add a time-dependent variable to assess the PH assumption for a time-independent variable. The Cox model is extended to contain an interaction term between the covariate and some function of time. If the test of that interaction term is significant, then the PH assumption is violated. See `eval ph example.sas`.
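A sketch of the extended model for a single covariate $x$ follows. The function $g(t)$ and the coefficient names are assumptions for illustration; common choices are $g(t) = t$ or $g(t) = \log t$.

```latex
\begin{align*}
h(t; x) &= h_0(t)\exp\{\beta_1 x + \beta_2\, x\, g(t)\} \\
\frac{h(t; x = 1)}{h(t; x = 0)} &= \exp\{\beta_1 + \beta_2\, g(t)\} \\
H_0 &: \beta_2 = 0 \quad \text{(proportional hazards holds for } x\text{)}
\end{align*}
```

If $\beta_2 \neq 0$, the hazard ratio changes with time, so rejecting $H_0$ is evidence against the PH assumption for $x$.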

Regression diagnostics
- Model checking in the Cox PH model uses measures analogous to those used for linear and logistic regression: residuals, leverage, and influence.
- Diagnostics can be plotted and examined in order to identify observations that are influential or that have high leverage in determining the fit.

Identification of Influential and Poorly Fit Observations
High leverage values in isolation are not necessarily a concern. The issue is how high leverage contributes to the influence that a covariate value has on the estimate of the coefficient of interest.

Identification of Influential and Poorly Fit Observations
- Specifically, we measure the change in the estimated coefficient when an observation is deleted:
  $$\Delta\hat{\beta}_i = \hat{\beta} - \hat{\beta}_{(-i)}$$
  where $\hat{\beta}$ denotes the MLE from the partial likelihood using the entire sample and $\hat{\beta}_{(-i)}$ is the MLE when the $i$th observation is deleted.
- Plots of $\Delta\hat{\beta}_i$ vs. the covariate values help identify observations that greatly influence parameter estimation and hypothesis testing.
- The $\Delta\hat{\beta}_i$ are called dfbetas.
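The delete-one idea can be made concrete with a small Python sketch, assuming a single covariate, no tied event times, and a Newton-Raphson fit of the partial likelihood. The function names and the toy data are hypothetical. It refits the model after each deletion, whereas statistical software typically reports a one-step approximation to the same quantity.

```python
import math

def cox_beta(times, events, x, iters=50, tol=1e-10):
    """Fit a one-covariate Cox model by Newton-Raphson on the partial likelihood."""
    beta = 0.0
    n = len(times)
    for _ in range(iters):
        score = 0.0
        info = 0.0
        for i in range(n):
            if not events[i]:
                continue  # censored subjects contribute only through risk sets
            # Risk set: subjects still under observation at the i-th event time.
            risk = [j for j in range(n) if times[j] >= times[i]]
            w = [math.exp(beta * x[j]) for j in risk]
            s0 = sum(w)
            s1 = sum(wj * x[j] for wj, j in zip(w, risk))
            s2 = sum(wj * x[j] ** 2 for wj, j in zip(w, risk))
            score += x[i] - s1 / s0
            info += s2 / s0 - (s1 / s0) ** 2
        step = score / info  # assumes the information is positive
        beta += step
        if abs(step) < tol:
            break
    return beta

def delete_one_dfbeta(times, events, x):
    """Return (full-sample estimate, list of full estimate minus delete-one estimate)."""
    full = cox_beta(times, events, x)
    changes = []
    for i in range(len(times)):
        keep = [j for j in range(len(times)) if j != i]
        b_i = cox_beta([times[j] for j in keep],
                       [events[j] for j in keep],
                       [x[j] for j in keep])
        changes.append(full - b_i)
    return full, changes

# Hypothetical toy data: eight subjects, one binary covariate, last subject censored.
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1, 1, 1, 1, 1, 1, 1, 0]
x = [1, 0, 1, 0, 0, 1, 1, 0]

beta_hat, dfbetas = delete_one_dfbeta(times, events, x)
print(f"beta_hat = {beta_hat:.4f}")
for i, d in enumerate(dfbetas, start=1):
    print(f"subject {i}: change in beta = {d:+.4f}")
```

Subjects whose deletion moves the estimate the most are the ones worth inspecting in a plot against the covariate values.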

Residuals
- Martingale residuals
- Deviance residuals
- Plot of martingale residuals vs. covariates or fitted values
- Plot of deviance residuals vs. covariates or fitted values
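The slide names the two residuals without defining them. The commonly used definitions for a Cox model without time-dependent covariates are sketched below; the notation is assumed here, with $\delta_i$ the event indicator, $t_i$ the observed time, and $\hat{H}_0$ the estimated cumulative baseline hazard.

```latex
\begin{align*}
\hat{M}_i &= \delta_i - \hat{H}_0(t_i)\exp(\hat{\beta}^{\top} x_i) \\
\hat{D}_i &= \operatorname{sign}(\hat{M}_i)\,
  \sqrt{-2\left[\hat{M}_i + \delta_i \log\!\left(\delta_i - \hat{M}_i\right)\right]}
\end{align*}
```

Martingale residuals lie in $(-\infty, 1]$ and are heavily skewed; the deviance residual is a transformation of the martingale residual that is more symmetric about zero, which makes outliers easier to spot in plots.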

Identification of Influential and Poorly Fit Observations
- Obtain dfbetas from a Cox PH model by requesting that they be included in the OUTPUT dataset.
- Obtain the linear predictor score and the martingale and deviance residuals from a Cox PH model by requesting that they be included in the OUTPUT dataset.
- See `eval ph example.sas`.

Non-Proportional Hazards - Stratification
- What if the proportional hazards assumption does not hold?
- If you find that the proportional hazards assumption does not hold for a specific set of groups, you can run a stratified analysis in which you stratify by group.

Stratification
Advantages:
- Flexibility, in that it allows for a different hazard function in each stratum
- Relatively simple idea and easy to implement
- Retains a single estimate for each regression parameter, assuming no strata-by-covariate interaction
- Crude form of adjustment

Stratification
Disadvantages:
- Loss of parsimony
- Requires a larger sample size to obtain estimators of similar quality; the number of individuals within each stratum is important
- Not valid if there is a strata-by-covariate interaction

Stratified Cox Model
- Using the UISSURV data, assume RACE doesn't satisfy the PH assumption.
- Further assume that the hazard functions for non-whites and whites differ only because they have different baseline hazard functions. The effect of TREAT is the same for both non-whites and whites.
- Because of the different baseline hazard functions, the fitted stratified Cox model will have different estimated survival curves for non-whites and whites.
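In symbols, the stratified model described here can be sketched as follows. The stratum index $g$ and the coefficient name $\beta_1$ are notation assumed for illustration.

```latex
\begin{align*}
h_g(t; X) &= h_{0g}(t)\exp(\beta_1\,\mathrm{TREAT}),
  \qquad g \in \{\text{white}, \text{non-white}\} \\
S_g(t; X) &= \left[S_{0g}(t)\right]^{\exp(\beta_1\,\mathrm{TREAT})}
\end{align*}
```

Each stratum has its own unspecified baseline hazard $h_{0g}(t)$, while the single coefficient $\beta_1$ is shared, so the treatment hazard ratio $e^{\beta_1}$ is the same in both strata even though the survival curves differ.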

Stratified Cox Model: see `eval ph example.sas`

Stratified Cox Model
Some comments:
- Note that RACE is not included in the model because it doesn't satisfy the PH assumption. Instead, the RACE variable is controlled by stratification.
- Now we can estimate the treatment effect adjusted for RACE.
- It is not possible to obtain a hazard ratio for the RACE effect adjusted for TREAT. This is the price to be paid for the stratification. Also, a single value for the hazard ratio for RACE is not appropriate because it must vary with time.

General Stratified Cox Model
- If more than one variable fails to satisfy the PH assumption, a general stratified Cox model can be used.
- Define a new single variable (e.g., Z) whose levels are the combinations of those variables (i.e., the covariate patterns), and then apply the same stratified Cox model.
- The trouble is that you might not have enough observations in each stratum.
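Building the combined stratum variable and checking stratum sizes can be sketched in a few lines of Python. The variable names `race` and `site`, the records, and the `min_size` threshold are all hypothetical; this only illustrates how Z is formed, not how the model is fitted.

```python
from collections import Counter

def make_strata(records, stratify_on):
    """Label each record with the tuple of its values on the stratifying variables."""
    return [tuple(rec[v] for v in stratify_on) for rec in records]

# Hypothetical records with two variables assumed to violate PH.
records = [
    {"race": 0, "site": 0},
    {"race": 0, "site": 1},
    {"race": 1, "site": 0},
    {"race": 1, "site": 1},
    {"race": 0, "site": 0},
    {"race": 1, "site": 0},
]

z = make_strata(records, ["race", "site"])   # the combined variable Z
counts = Counter(z)                          # number of subjects per stratum

min_size = 2  # hypothetical threshold for flagging sparse strata
sparse = sorted(s for s, n in counts.items() if n < min_size)

for stratum in sorted(counts):
    print(stratum, counts[stratum])
print("sparse strata:", sparse)
```

With k binary stratifying variables there are up to 2^k strata, so the per-stratum counts shrink quickly as variables are added.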

General Stratified Cox Model
In the statistical software, you don't need to create the new variable Z yourself. The software will do it for you automatically as long as you specify, as stratification variables, the variables that do not satisfy the PH assumption.

General Stratified Cox Model in SAS: see `eval ph example.sas`

No-Interaction Assumption
The previous stratified model contains regression coefficients that do not vary over the strata. This is the "no-interaction assumption".

Interaction
- If we allow for interaction between TREAT and RACE, we can fit two separate Cox models, one to non-whites and one to whites, with each model containing TREAT.
- An alternative is to fit a stratified model with an interaction term between TREAT and RACE. Note that although RACE can't be treated as a covariate, it can be included in an interaction term.
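The stratified model with an interaction term can be sketched as follows, assuming RACE is coded 0/1 and indexes the strata; the coefficient names are notation introduced here.

```latex
\begin{align*}
h_g(t; X) &= h_{0g}(t)\exp\{\beta_1\,\mathrm{TREAT}
  + \beta_2\,(\mathrm{RACE}\times\mathrm{TREAT})\} \\
\text{log hazard ratio for TREAT} &=
  \begin{cases}
    \beta_1 & \text{if } \mathrm{RACE} = 0 \\
    \beta_1 + \beta_2 & \text{if } \mathrm{RACE} = 1
  \end{cases}
\end{align*}
```

This gives the same treatment effects as fitting the two separate models, and a test of $\beta_2 = 0$ is a test of the no-interaction assumption.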

Interaction: see `eval ph example.sas`

Summary
- Non-proportional hazards and strata-by-covariate interactions greatly complicate our analyses and interpretations.
- Options:
  - Run completely separate analyses for each stratum: simple but often confusing
  - Attempt to explicitly model these interactions: more complicated and often confusing to describe