Multilevel survival models
A paper presented to celebrate Murray Aitkin’s 70th birthday
Harvey Goldstein (also 70), Centre for Multilevel Modelling, University of Bristol
Time to event (survival) models
Murray has contributed:
Aitkin, M. and Clayton, D. (1980). The fitting of exponential, Weibull and extreme value distributions to complex censored survival data using GLIM. Appl. Statist. 29.
Journal of the Royal Statistical Society, Series A 1986; 149: 1-43.
Aitkin, M. A., Francis, B. and Hinde, J. (2005). Statistical Modelling in GLIM4, Chapter 6: survival models.
Basic notions: consider employment duration (u). The proportion of the workforce employed for periods greater than t is the survivor function $S(t) = \Pr(u > t)$. The risk of unemployment in the next unit interval, given survival to t, is the hazard $h(t) = \Pr(t < u \le t + 1 \mid u > t)$.
The traditional grouped discrete time hazard model
Suppose time is grouped into pre-assigned categories. If the survivor function at the start of time interval t is $s_t$, then the probability of death in that interval and the hazard are $f_t = s_t - s_{t+1}$ and $h_t = f_t / s_t$.
Thus the basic data consist of one record for each time interval for each individual (within each higher level unit for a multilevel structure), with the response being a binary indicator of failure for each interval. The estimation follows that for the binary response model, e.g. with a logit or probit link function.
This formulation is very flexible: it can be extended to competing risks (multinomial response), allows time-varying covariates, automatically handles right-censored data and easily extends to incorporate random effects in multilevel data structures. It can be fitted with existing software.
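The expansion into one record per interval per individual can be sketched as follows. This is a minimal illustration, not the software referred to on the slide: the function names, the record layout and the toy data are all hypothetical, and a right-censored individual is assumed to be at risk in every interval up to and including the last one observed.

```python
from typing import Dict, List


def expand_to_person_period(
    person_id: int, last_interval: int, event: bool
) -> List[Dict[str, int]]:
    """Return one record per interval at risk for a single individual.

    last_interval: 1-based index of the last interval observed.
    event: True if failure occurred in that interval, False if right censored.
    """
    records = []
    for t in range(1, last_interval + 1):
        # The binary response is 1 only in the interval where failure occurs.
        failed = int(event and t == last_interval)
        records.append({"id": person_id, "interval": t, "y": failed})
    return records


def discrete_hazard(records: List[Dict[str, int]]) -> Dict[int, float]:
    """Estimate h_t as failures in interval t divided by the number at risk."""
    at_risk: Dict[int, int] = {}
    failures: Dict[int, int] = {}
    for r in records:
        at_risk[r["interval"]] = at_risk.get(r["interval"], 0) + 1
        failures[r["interval"]] = failures.get(r["interval"], 0) + r["y"]
    return {t: failures[t] / at_risk[t] for t in sorted(at_risk)}


# Hypothetical toy data: (id, last interval observed, event indicator).
toy = [(1, 3, True), (2, 2, False), (3, 1, True), (4, 3, False)]
expanded = [
    row
    for pid, last, ev in toy
    for row in expand_to_person_period(pid, last, ev)
]
hazards = discrete_hazard(expanded)
```

Four individuals become nine binary records here, which is the file growth that the expanded formulation implies. The right-censored individuals simply stop contributing records after their last observed interval.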
A repeated measures discrete time data structure
A GLM for the grouped discrete time model
The hazard is $h_{ijk}(t)$, where k indexes individual, j indexes episode (of a partnership) and i indexes the state (partnership, non-partnership), modelled by dummy variables. We can use a ‘standard’ model, e.g.
$\operatorname{logit}\, h_{ijk}(t) = \sum_{l=0}^{p} \alpha_l z_t^{l} + X_{ijk}\beta + v_k + u_{jk}$,
where z indexes the modelled interval at discrete time t, using a p-order polynomial (typically p < 5) to describe the baseline hazard. v is a between-individual random effect and u is a within-individual, between-episode random effect (extra-binomial frailty).
The downside is that this requires data expansion and can result in very large files. So, staying with grouped discrete time data, we consider another formulation.
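As a rough sketch of how a polynomial baseline on the logit scale turns into hazards and a survivor function, assuming the linear predictor is the polynomial in z plus a covariate term and the two random effects. The coefficient values and function names below are hypothetical and are not estimates from the talk.

```python
import math
from typing import List, Sequence


def logit_hazard(
    z: float,
    alpha: Sequence[float],
    xb: float = 0.0,
    v: float = 0.0,
    u: float = 0.0,
) -> float:
    """Hazard for interval index z with a polynomial baseline on the logit scale.

    alpha[l] multiplies z**l; xb is the covariate contribution; v and u are
    the individual-level and episode-level random effects.
    """
    eta = sum(a * z ** l for l, a in enumerate(alpha)) + xb + v + u
    return 1.0 / (1.0 + math.exp(-eta))


def survivor(hazards: Sequence[float]) -> List[float]:
    """Probability of surviving past each interval: product of (1 - h_s), s <= t."""
    out, s = [], 1.0
    for h in hazards:
        s *= 1.0 - h
        out.append(s)
    return out


# Hypothetical quadratic baseline (p = 2) over six intervals.
alpha = [-2.0, 0.3, -0.02]
hz = [logit_hazard(z, alpha) for z in range(1, 7)]
surv = survivor(hz)
```

A positive value of v raises the hazard in every interval for that individual, which is how the random effect acts as a frailty term.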
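An ordered categorical model
For time at death t, write the cumulative probability as the standard normal integral $F(t) = \int_{-\infty}^{t} \phi(z)\,dz$.
Discretise the time scale, as before, by defining cut points $\alpha_1 < \alpha_2 < \dots$ and consider the cumulative distribution $\gamma_t = \int_{-\infty}^{\alpha_t + X\beta} \phi(z)\,dz = \Phi(\alpha_t + X\beta)$.
This defines the ordered probit model, where $X\beta$ represents the effect of any covariates and $\pi_t = \gamma_t - \gamma_{t-1}$ is the probability that an event occurs in time interval t. We shall discuss how to model the threshold parameters $\alpha_t$.
Note that the hazard for time interval t is $h_t = \pi_t / (1 - \gamma_{t-1})$.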
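A small numerical sketch of the ordered probit quantities may help. It assumes the cumulative probability for interval t is the standard normal CDF evaluated at the threshold plus the covariate effect, with the cumulative probability before the first interval taken as zero. The threshold values and function names are hypothetical.

```python
from statistics import NormalDist
from typing import List, Sequence

PHI = NormalDist().cdf  # standard normal cumulative distribution function


def cumulative_probabilities(alpha: Sequence[float], xb: float = 0.0) -> List[float]:
    """Cumulative probability of an event by the end of each interval."""
    return [PHI(a + xb) for a in alpha]


def interval_probabilities(alpha: Sequence[float], xb: float = 0.0) -> List[float]:
    """Probability that the event falls in each interval (successive differences)."""
    probs, prev = [], 0.0
    for g in cumulative_probabilities(alpha, xb):
        probs.append(g - prev)
        prev = g
    return probs


def interval_hazards(alpha: Sequence[float], xb: float = 0.0) -> List[float]:
    """Interval probability divided by the probability of surviving to its start."""
    hazards, prev = [], 0.0
    for g in cumulative_probabilities(alpha, xb):
        hazards.append((g - prev) / (1.0 - prev))
        prev = g
    return hazards


# Hypothetical strictly increasing thresholds for three intervals.
alpha = [-1.0, 0.0, 1.0]
probs = interval_probabilities(alpha)
hazards = interval_hazards(alpha)
```

Note that no data expansion is needed: each individual contributes a single ordered response, and the hazard is recovered afterwards from the fitted thresholds.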
Advantages of the ordered probit model
First proposed by McCullagh (1980) and used by others, e.g. Hedeker et al. (2000).
Does not require data expansion.
Generalises easily to the multivariate and multilevel case (e.g. repeated episodes within individuals).
Handles any kind of censoring or missing data.
The threshold parameters $\alpha_t$ correspond to the cut points, and we require that they are strictly ordered. We can set $\alpha_1 = 0$ if we assume that the intercept is incorporated in $X\beta$. We generalise to the 2-level case by adding random effects to the linear predictor, with further levels or classifications similarly specified.
This is a ‘latent normal’ model and can be combined with other responses, normal and categorical, and levels in a general multivariate multilevel framework (Goldstein, H., Carpenter, J., Kenward, M. and Levin, K. (2009). Multilevel Models with multivariate mixed response types. Statistical Modelling 9(3)).
An MCMC algorithm has been developed. This involves sampling from the posterior distributions for the parameters and sampling from the latent normal given the observed category. If covariates are missing, the missing data can be handled by multiple imputation.
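Censoring
Right-censored data after time h simply involve a random draw from the standard normal in the interval $(\alpha_h + X\beta, \infty)$, where $\alpha_h$ is the threshold for interval h and $X\beta$ is the covariate effect.
Interval-censored data likewise involve a random draw from the corresponding normal interval.
Left-censored data at time h or earlier involve a random draw from $(-\infty, \alpha_h + X\beta)$.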
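The draws described above amount to sampling a standard normal restricted to an interval. A minimal sketch using the inverse-CDF method is given below. This is one common way to do it and is not necessarily the sampler used in the MCMC algorithm of the talk; the function name, seed and cut point values are hypothetical.

```python
import random
from statistics import NormalDist
from typing import Optional

STD_NORMAL = NormalDist()


def draw_truncated_std_normal(
    lower: Optional[float], upper: Optional[float], rng: random.Random
) -> float:
    """Inverse-CDF draw from N(0, 1) restricted to (lower, upper).

    None means the interval is unbounded on that side.
    """
    p_lo = 0.0 if lower is None else STD_NORMAL.cdf(lower)
    p_hi = 1.0 if upper is None else STD_NORMAL.cdf(upper)
    # Uniform on (p_lo, p_hi), kept away from 0 and 1 where inv_cdf is undefined.
    p = p_lo + (p_hi - p_lo) * rng.random()
    p = min(max(p, 1e-12), 1.0 - 1e-12)
    return STD_NORMAL.inv_cdf(p)


rng = random.Random(2010)  # fixed seed so the sketch is reproducible
cut_h = 0.4  # hypothetical cut point (threshold plus covariate effect) for interval h

right = draw_truncated_std_normal(cut_h, None, rng)    # right censored after h
left = draw_truncated_std_normal(None, cut_h, rng)     # left censored at h or earlier
between = draw_truncated_std_normal(-0.3, cut_h, rng)  # interval censored
```

An uncensored observation is handled in the same way as the interval-censored case, with the two bounds being the cut points on either side of the observed interval.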
Estimating threshold parameters
We require strict monotonicity, so each threshold is built from the previous one by adding a strictly positive increment, which can depend on q time-varying explanatory variables. This guarantees that the parameters are strictly increasing. A Metropolis-Hastings (MH) algorithm is used in the MCMC step.
This allows time-varying covariates to contribute cumulatively to the parameter value. Alternatively, they can contribute according to their current mean.
Other link functions for the cumulative probability, such as the logistic, are possible, and the baseline hazard can be a smooth function of time rather than a step function.
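One way to guarantee strictly increasing thresholds is to add an exponentiated term at each step, since the exponential is positive for any real argument. The sketch below assumes that parameterisation purely for illustration; it is not necessarily the exact form used in the talk, and the names, baseline values and coefficient are hypothetical.

```python
import math
from typing import List, Sequence


def monotone_thresholds(
    first_threshold: float, log_increments: Sequence[float]
) -> List[float]:
    """Build thresholds by adding exp(d) at each step, for unconstrained real d."""
    thresholds = [first_threshold]
    for d in log_increments:
        thresholds.append(thresholds[-1] + math.exp(d))
    return thresholds


def log_increments_with_covariate(
    base: Sequence[float], delta: float, w: Sequence[float]
) -> List[float]:
    """Log-increment for each step: baseline term plus delta times the covariate."""
    return [b + delta * x for b, x in zip(base, w)]


# Hypothetical values: the covariate is 0 in the first step and 1 afterwards.
base = [-1.0, -1.0, -1.0]
w = [0.0, 1.0, 1.0]
delta = -1.2  # exp(-1.2) is roughly 0.30, so these increments shrink by that factor

without_covariate = monotone_thresholds(0.0, base)
with_covariate = monotone_thresholds(
    0.0, log_increments_with_covariate(base, delta, w)
)
```

Because the unconstrained quantities are the log-increments, a sampler can propose any real value for them without ever violating the ordering. Replacing the covariate values with their running mean would give the "current mean" variant under the same assumption.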
Example: partnership durations
Data are based upon partnership histories of female respondents in the National Child Development Study, collected retrospectively at ages 33 and 42. A full description is given in Steele et al. (2005). The present analysis uses a subset of the data and explanatory variables. Durations are grouped into six-month intervals.
Partnership durations
Negative values are right-censored observations.
Fitted model (omitting threshold estimates)
Interpretation - Model B
The overall effect of having a young child at the start of a partnership is to increase the value on the latent normal scale (since it is a covariate belonging to X) and hence to increase the probability that the partnership will end in each given interval, i.e. to decrease the overall probability of remaining in a partnership for all time periods. This presumably reflects the characteristics of a partnership that starts with an existing younger child. It could also reflect unobserved characteristics of women who have children from a previous relationship.
Given the presence (or absence) of a young child at the start of the partnership, consider the effect at separation of the current average of the younger child variable, which is in effect the proportion of times over the period that a younger child is present. Its effect is to multiply that threshold parameter's (additive) contribution, compared to no younger children during the time period, by the mean multiplied by a factor of 0.30, so that 0.30 is the multiplier when a younger child is always present. This leads to a decrease on the latent normal scale and hence to an increase in the probability of remaining in a partnership.
This suggests that the arrival of a young child during a partnership tends to prolong the partnership, in contrast to the effect of starting the partnership with a young child.
Interpretation - Model A
In model A we see that the fixed part contribution for a younger child is greater and the time-dependent effect is larger, with a multiplying factor of 0.73. The effect is thus to multiply the cumulative base threshold by 0.73, which provides perhaps a more straightforward interpretation than model B.
References
Steele, F., Kallis, C., Goldstein, H. and Joshi, H. (2005). The Relationship between Childbearing and Transitions from Marriage and Cohabitation in Britain. Demography 42.
Goldstein, H. (2010). A general model for the analysis of multilevel discrete time survival data. (Submitted for publication.)
The models I have used can all be traced back to work carried out by Murray Aitkin over the course of a long and distinguished career. We owe him a great debt of gratitude.