Using Survival Analysis to Analyze Degree Completion. Janice Love, University of California, Los Angeles, Office of Academic Planning & Budget. CAIR 2014.


1 Using Survival Analysis to Analyze Degree Completion
  Janice Love
  University of California, Los Angeles
  Office of Academic Planning & Budget
  CAIR 2014

2 Agenda
  Survival Analysis History & Background
  Overview
  Survival Analysis example using SPSS
  Results of Survival Analysis

3 Survival Analysis Background
  Definition
    o A statistical method for studying the time to an event.
    o The term “survival” suggests that the event of interest is death, but the technique is useful for other types of events.
  Alternative terminology
    o Event analysis, time-to-event analysis
    o Survival analysis: studies involving time to death (biomedical sciences)
    o Reliability theory / reliability analysis (engineering)
    o Duration analysis / duration modeling (economics)
    o Event history analysis (sociology)
  Uses
    o Clinical trials
    o Cohort studies

4 Example of Survival Probability Graph

5 Example of Survival Probability Graph

6 Example of Survival Probability Graph

7 Survival Analysis History
  Origin unknown; has been around for a few hundred years
  Techniques developed in the medical / biological sciences
  World War II: military vehicles (reliability and failure time analysis)
  The Kaplan-Meier estimator was introduced with the publication of "Nonparametric Estimation from Incomplete Observations" (E. L. Kaplan and Paul Meier, 1958)
  Cited 34,000 times as of 2011

8 Survival Analysis - Overview
  A set of statistical methods where the outcome variable is the time until the occurrence of an event of interest
  Follows a cohort over a specified time period with a focus on an event
  Useful when the rate of occurrence of the event varies over time
  Differs from other statistical methods: handles censored data (the withdrawal of individuals from the study)
  Censored observations: individuals who have not experienced “the event” by the end of the study
  Right censoring
    o Study participant can’t be located
    o or lives beyond the end of the study
    o or drops out before the study is completed
    o or is still enrolled
    o An observation with incomplete information
    o Don’t have to handle these individuals as “missing”
    o Do have to follow rules with respect to censored data
    o # of censored should be small relative to non-censored
    o Censored and non-censored populations should be similar (Kaplan-Meier)

9 Survival Analysis - Censoring

10 Consequences of mishandling or ignoring censored data
  Ignoring censored records completely, or arbitrarily assigning event dates, introduces bias into the results
  Including the censored data produces less bias (Newell & Hyun, 2011)
  Example
    o Student cohort, N = 50, event of interest = graduation
    o Still enrolled at the end of the study, N = 6
    o No longer enrolled but did not graduate, N = 4
  Options
    o Code all 10 as missing
    o Code 4 as missing, 6 as graduated as of study end
  Consequences
    o Mean time to degree is over- or understated
    o Risk of selection bias

11 Survival Analysis – Handling Censored Data
  Two methods produce the cumulative probability of survival on which the survival graph is based:
  1. SPSS Life Table: in each time period, the effective size of the cohort is reduced by ½ of the censored group
  2. Kaplan-Meier Survival Table: the survival probability estimate for each time period, except the first, is a compound conditional probability

12 Survival Analysis - Overview
  Data required for analysis:
  Clearly defined event (the terminal event): death, onset of illness, recovery from illness, marriage, birth, mechanical failure, success, job loss, employment, graduation
  Event status: 1 = event occurred, 0 = event did not occur
  Time variable: time measured from the entry of a subject into the study until the defined event, in months, terms, days, years, or seconds
  Covariates: to determine whether different groups have different survival times
    o Gender, age, ethnicity, GPA, treatment, intervention
    o Regression models

13 Survival Analysis – SPSS Data Layout
  Basic student data
  Time variable: terms enrolled
  Event status: graduation status
  Group into categories
  Censored indicator
  Binary or dummy variables
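As a rough sketch of this layout outside SPSS, one row per student could be represented as below. The field names and the three sample rows are hypothetical, chosen only to mirror the time variable, event status, and a coded covariate; they are not the deck's actual data.

```python
from dataclasses import dataclass


@dataclass
class StudentRecord:
    student_id: int      # hypothetical identifier
    terms_enrolled: int  # time variable: terms from admit term to exit or study end
    graduated: int       # event status: 1 = graduated, 0 = censored
    gender: int          # covariate stored as a numeric code (1 or 2)


def is_censored(record: StudentRecord) -> bool:
    """A record is censored when the event (graduation) was not observed."""
    return record.graduated == 0


# Hypothetical rows for illustration only.
records = [
    StudentRecord(student_id=1, terms_enrolled=12, graduated=1, gender=1),
    StudentRecord(student_id=2, terms_enrolled=13, graduated=1, gender=2),
    StudentRecord(student_id=3, terms_enrolled=7, graduated=0, gender=2),
]

censored_count = sum(is_censored(r) for r in records)
print(f"{censored_count} of {len(records)} records are censored")
```

Keeping the event status as a 0/1 flag next to the time variable lets censored students stay in the data set instead of being coded as missing.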

14 Cohort Description
  Undergraduates, one division
  Fall 2006 and Fall 2007 entering freshmen, N = 884
  Respondents to the 2008 UCUES* survey
  Freshman admits (transfers excluded)
  1st term GPA >= 3.0
  Censored = 10, or 1.1%
  Explanatory variables available: gender, URM status, domestic-foreign status, Pell Grant recipient status, hours worked (survey), double/triple major
  * UCUES = University of California Undergraduate Experience Survey

15 Survival Analysis – SPSS
  SPSS menu: Analyze > Survival > Life Tables

16 Sample Data – Working in SPSS
  SPSS menu: Analyze > Survival > Life Tables

17 Survival Analysis – Life Table Produced by SPSS
  Primary output of the survival analysis procedure
  Intervals = terms; the count is from the admit term
  Count of students still enrolled at the start of the term

18 Survival Analysis – Life Table Produced by SPSS
  Primary output of the survival analysis procedure
  # withdrawing during interval = censored
  # exposed to risk = # entering interval minus ½ censored
  # terminal events = # graduated
  Proportion terminating = # terminal events ÷ # exposed to risk; example, Term 10: 38 ÷ (# exposed to risk) = .05
  Proportion surviving = 1 – proportion terminating
  Probability density = estimated probability of graduating in the interval
  Hazard rate = instantaneous failure rate: % chance of graduating given not having graduated at start of interval
  Cumul. surviving = cumulative % of those surviving at end of interval = ( ) ÷ 884 = 0.90
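The column definitions above can be sketched in a few lines of Python. This is an illustration under assumptions, not the SPSS implementation: the exposed-to-risk, proportion, and cumulative columns follow the slide, while the density and hazard lines use the textbook actuarial formulas, which are assumed here to correspond to the SPSS columns. The cohort counts are hypothetical.

```python
def life_table(entering, censored, events, width=1.0):
    """Build actuarial life-table rows from per-interval counts.

    entering: cohort size at the start of the first interval
    censored: number withdrawing (censored) in each interval
    events:   number of terminal events (graduations) in each interval
    width:    interval width (1 term by default)
    """
    rows = []
    n = entering
    cumulative = 1.0
    for c, d in zip(censored, events):
        exposed = n - c / 2.0                 # entering minus half the censored
        q = d / exposed if exposed > 0 else 0.0  # proportion terminating
        p = 1.0 - q                           # proportion surviving
        previous = cumulative
        cumulative *= p                       # cumulative proportion surviving
        rows.append({
            "entering": n,
            "exposed": exposed,
            "prop_terminating": q,
            "prop_surviving": p,
            "cumul_surviving": cumulative,
            # Assumed textbook actuarial formulas for the last two columns.
            "density": (previous - cumulative) / width,
            "hazard": 2.0 * q / (width * (1.0 + p)),
        })
        n -= c + d                            # survivors enter the next interval
    return rows


# Hypothetical cohort: 100 students, two intervals.
table = life_table(100, censored=[2, 0], events=[10, 20])
for i, row in enumerate(table, start=1):
    print(i, round(row["exposed"], 1), round(row["cumul_surviving"], 3))
```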

19 Survival Function Graph Produced by SPSS
  The proportion of the cohort that has survived (is still enrolled) at any term
  There is a 90% probability of surviving to the end of the 10th term. Surviving = remaining enrolled!
  Each step of the curve represents an event

20 Function & One Minus a Function
  y = x^2 and y = 1 - x^2
  y = x + 1 and y = 1 - (x + 1)

21 One Minus Survival Function
  There is a 10% probability of not surviving to the end of the 10th term. Not surviving = graduating!!

22 Survival Analysis: SPSS, with Covariate, Factor = Gender
  SPSS menu: Analyze > Survival > Life Tables
  SURVIVAL TABLE=Terms_enrolled BY Gender(1 2)
    /INTERVAL=THRU 15 BY 1
    /STATUS=graduated(1)
    /PRINT=TABLE
    /PLOTS (SURVIVAL OMS)=Terms_enrolled BY Gender.

23 Survival Analysis – SPSS, Life Table by Gender
  Median survival time = time at which 50% of the original cohort has not survived (graduated)
  Hazard rate = instantaneous failure rate: % chance of graduating given not having graduated at start of interval

24 Survival Analysis: Hazard Ratio
  Hazard ratio = ratio of the hazard rates
  At 12th term, hazard ratio = 1.63 / 1.41 = 1.16: females are 16% more likely to graduate in the 12th term than males
  At 13th term, hazard ratio = .41 / .62 = .66: females are 34% less likely to graduate in the 13th term than males
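The arithmetic behind the two hazard ratios can be checked with a short sketch. The hazard rates are the ones quoted on the slide; the function names are hypothetical helpers introduced for illustration.

```python
def hazard_ratio(rate_a: float, rate_b: float) -> float:
    """Ratio of two hazard rates (group A relative to group B)."""
    return rate_a / rate_b


def percent_difference(ratio: float) -> float:
    """Express a hazard ratio as a percent more (+) or less (-) likely."""
    return (ratio - 1.0) * 100.0


# Hazard rates quoted on the slide: female first, male second.
hr_term_12 = hazard_ratio(1.63, 1.41)
hr_term_13 = hazard_ratio(0.41, 0.62)

print(round(hr_term_12, 2), round(percent_difference(hr_term_12)))
print(round(hr_term_13, 2), round(percent_difference(hr_term_13)))
```

A ratio above 1 means the first group graduates at a higher rate in that term; a ratio below 1 means a lower rate.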

25 Survival Functions - SPSS, Factor = Gender
  Survival pattern: SPSS produces a different colored line for each of the factor’s values

26 Survival Analysis: Kaplan-Meier Method
  Assumptions
    o Censored individual: a student who has not experienced the event (graduated) by the end of the study, e.g. they are no longer enrolled
    o Check for differences between censored and non-censored groups
    o Cohorts should behave similarly: groups entering at different times should be similar
    o Avoid “selection bias” in the data

27 Survival Functions – SPSS, Kaplan-Meier, Factor = Gender
  KM Terms_enrolled BY Gender
    /STATUS=graduated(1)
    /PRINT TABLE MEAN
    /PLOT SURVIVAL
    /TEST LOGRANK BRESLOW TARONE
    /COMPARE OVERALL POOLED.

28 Kaplan-Meier Survival Table
  This is an example of the survival table produced by the Kaplan-Meier procedure.
  Kaplan-Meier survival probability estimate, calculation example:
  Interval 4: Cumulative Proportion Surviving = # remaining ÷ # at risk = [# at start of interval – (# censored + # of events)] ÷ [# at start of interval – # censored] = [46 – (2 + 1)] ÷ (46 – 2) = 43 ÷ 44 ≈ 0.977
  Interval 5: Cumulative Proportion Surviving = [43 – (2 + 2)] ÷ (43 – 2) = 39 ÷ 41 ≈ 0.951, multiplied by the cumulative proportion surviving through interval 4
  Kaplan-Meier Survival Table: the survival probability estimate for each time period, except the first, is a compound conditional probability
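The compounding of conditional probabilities can be sketched as follows. This is a minimal illustration that follows the convention of the slide's worked numbers (censored students are removed from the risk set before the interval's events are counted, as in 43 ÷ 44); it is not the SPSS KM procedure. The function name and the choice to start the running product at 1.0 at interval 4 are assumptions made for the example.

```python
def kaplan_meier_steps(intervals, start_cumulative=1.0):
    """Return (conditional, cumulative) survival for each interval.

    intervals: list of (at_start, censored, events) tuples
    start_cumulative: cumulative survival before the first listed interval
                      (assumed 1.0 here for illustration)
    """
    steps = []
    cumulative = start_cumulative
    for at_start, censored, events in intervals:
        at_risk = at_start - censored              # slide's risk-set convention
        conditional = (at_risk - events) / at_risk  # survive this interval
        cumulative *= conditional                  # compound conditional probability
        steps.append((conditional, cumulative))
    return steps


# Counts from the slide's intervals 4 and 5: (at start, censored, events).
steps = kaplan_meier_steps([(46, 2, 1), (43, 2, 2)])
for conditional, cumulative in steps:
    print(round(conditional, 3), round(cumulative, 3))
```

Each interval's estimate depends on every earlier interval through the running product, which is why all intervals except the first are compound conditional probabilities.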

29 In this way the fudging is kept conceptual, systematic, and automatic. Kaplan & Meier, 1958

30 Kaplan-Meier Results – Gender Null Hypothesis: Female Curve = Male Curve

31 Kaplan-Meier Output
  Log Rank weights all graduations equally
  Breslow gives more weight to earlier graduations
  Tarone-Ware is a mixture of the two
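How the three tests differ only in their weights can be sketched with a two-group weighted log-rank statistic. This is a textbook-style sketch, not the SPSS implementation: it assumes the usual weights of 1 (Log Rank), the number at risk (Breslow), and the square root of the number at risk (Tarone-Ware). The function name and the sample data are hypothetical.

```python
import math


def weighted_logrank(group1, group2, weight="logrank"):
    """Chi-square statistic (1 df) comparing two groups of (time, event) pairs.

    event: 1 = graduated, 0 = censored
    weight: "logrank", "breslow", or "tarone"
    """
    event_times = sorted({t for t, e in group1 + group2 if e == 1})
    numerator = 0.0
    variance = 0.0
    for t in event_times:
        n1 = sum(1 for u, _ in group1 if u >= t)   # at risk in group 1
        n2 = sum(1 for u, _ in group2 if u >= t)   # at risk in group 2
        n = n1 + n2
        d1 = sum(1 for u, e in group1 if u == t and e == 1)
        d = d1 + sum(1 for u, e in group2 if u == t and e == 1)
        if n < 2:
            continue                               # variance term undefined
        w = {"logrank": 1.0, "breslow": float(n), "tarone": math.sqrt(n)}[weight]
        expected = d * n1 / n                      # expected events in group 1
        v = d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)
        numerator += w * (d1 - expected)
        variance += w * w * v
    return numerator ** 2 / variance if variance > 0 else 0.0


# Hypothetical data: (terms enrolled, graduated flag).
faster = [(10, 1), (11, 1), (12, 1)]
slower = [(12, 1), (13, 1), (14, 0)]
for name in ("logrank", "breslow", "tarone"):
    print(name, round(weighted_logrank(faster, slower, name), 3))
```

Because the number at risk shrinks over time, weighting by it (Breslow) emphasizes early graduations, while the square root (Tarone-Ware) sits between that and the equal weighting of the Log Rank test.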

32 Kaplan-Meier Results – Gender
  Null Hypothesis: Female Curve = Male Curve
  Curves not significantly different at p < .05

33 Cox Regression (Proportional Hazards)
  Measures the influence of explanatory variables
  Most widely used survival analysis method
  Only time-independent variables are appropriate
  Assumption: hazards are proportional

34 Cox Regression, Checking Proportional Hazards Assumption
  SPSS menu: Analyze > Survival > Cox Regression
  Repeat for each factor!

35 Cox Regression: Use Log Minus Log Function to Check Proportional Hazards Assumption
  If the log-minus-log curves cross, the hazards are not proportional; do not use Cox Regression.
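Why the log-minus-log plot works as a check can be illustrated with a small sketch: when hazards are proportional, one group's survival curve equals the other's raised to the power of the hazard ratio, so the two log-minus-log curves differ by a constant and never cross. The baseline survival values and the hazard ratio below are hypothetical.

```python
import math


def log_minus_log(survival):
    """ln(-ln(S)) for each survival probability strictly between 0 and 1."""
    return [math.log(-math.log(s)) for s in survival]


# Hypothetical baseline survival curve and hazard ratio.
baseline = [0.95, 0.80, 0.50, 0.20]
ratio = 1.5

# Under proportional hazards, the comparison curve is baseline ** ratio.
comparison = [s ** ratio for s in baseline]

gaps = [
    b - a
    for a, b in zip(log_minus_log(baseline), log_minus_log(comparison))
]
print([round(g, 4) for g in gaps])  # constant vertical gap equal to ln(ratio)
```

A gap that changes sign between terms corresponds to curves that cross, which is the pattern the slide warns about.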

36 Cox Regression Model – Example, Gender
  SPSS menu: Analyze > Survival > Cox Regression (move gender to the Covariates box)

37 Cox Regression Model Results: Example, Gender

38 Interpretation of SPSS Cox Regression Results
  The reference category is female because I made that choice for this model
  The difference between the female and male survival curves is not statistically significant at p < 0.05
  Exp(B) = hazard ratio: female vs. male. The null hypothesis is that this ratio = 1.
  Hazard ratio = e^B = 0.961
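The link between the coefficient B, Exp(B), and the null hypothesis can be sketched as follows. Only the hazard ratio 0.961 comes from the slide; B is back-derived from it, and the standard error is a hypothetical value chosen purely to show how a confidence interval that contains 1 corresponds to a non-significant result.

```python
import math


def hazard_ratio_interval(b, standard_error, z=1.96):
    """Hazard ratio exp(b) with an approximate 95% confidence interval."""
    return (
        math.exp(b),
        math.exp(b - z * standard_error),
        math.exp(b + z * standard_error),
    )


b = math.log(0.961)            # back-derived from the slide's Exp(B) = 0.961
assumed_standard_error = 0.07  # hypothetical, not from the SPSS output

ratio, lower, upper = hazard_ratio_interval(b, assumed_standard_error)
contains_one = lower < 1.0 < upper

print(round(b, 4), round(ratio, 3), round(lower, 3), round(upper, 3), contains_one)
```

When the interval includes 1, the data are consistent with the null hypothesis that the two groups share the same hazard.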

39 Cox Regression Model Results: Pell Grant Recipients vs. Non-Pell Grant Recipients
  Tip: to edit the default chart, click on the chart until the “Chart Editor” opens
  Per Kaplan-Meier estimation, the Pell Grant recipients’ curve is not equal to the non-Pell Grant recipients’ curve; the difference is highly significant at p < .001

40 Cox Regression Model Results: Pell Grant Recipients vs. Non-Pell Grant Recipients
  Pell Grant recipients:
  1. Work more hours than non-Pell Grant recipients
  2. With similar GPAs to non-Pell Grant recipients, have attempted 10 more units

41 Summary
  Survival analysis provides the following:
  Handles both censored data and a time variable
  Life table
  Graphical representation of trends
  Kaplan-Meier survival function estimator
  Survival comparison between 2 or more groups
  Regression models: relationships between variables and survival times
  A p value that indicates whether the difference between curves is significant

42 Descriptive power of survival analysis: Terms Enrolled by 1st Term GPA
  Using a survival graph (K-M) to display data
  At end of 12th term:
    o ~34% probability of continued enrollment
    o ~9% probability of continued enrollment

43 Contact Info: Thank you!
  References
  Dunn, S. (2002). Kaplan-Meier Survival Probability Estimates.
  Harris, S. (2009). Additional Regression Techniques, October 2009.
  Newell, J., & Hyun, S. (2011). Survival Probabilities With and Without the Use of Censored Failure Times. Retrieved from https://www.uscupstate.edu/uploadedFiles/Academics/Undergraduate_Research/Reseach_Journal/2011_007_ARTICLE_NEWELL_HYUN.pdf
  Singh, R., & Mukhopadhyay, K. (2011). Survival analysis in clinical trials: Basics and must know areas.
  Wiorkowski, J., Moses, A., & Redlinger, L. (2014). The Use of Survival Analysis to Compare Student Cohort Data. Presented at the 2014 Conference of the Association of Institutional Research.

