
© Willett & Singer, Harvard University Graduate School of Education S077/Week #4– Slide 1 S077: Applied Longitudinal Data Analysis Week #4: What Are The.


1 © Willett & Singer, Harvard University Graduate School of Education S077/Week #4– Slide 1 S077: Applied Longitudinal Data Analysis Week #4: What Are The Topics Covered In Today’s Overview?

2 I.1 What Kinds Of Research Questions Require Longitudinal Methods?

Questions About Systematic Change in a Continuous Outcome Over Time
1. Within-Person: Descriptive: How does an infant’s neuro-function change over time? Summary: What is each infant’s rate of development?
2. Between-Person: How do these rates vary by child characteristics?
Individual Growth Modeling / Multilevel Model for Change (ALDA, Chapters 1 thru 8)
Espy et al. (2000) studied infant neurofunction: 40 infants observed daily for 2 weeks; 20 had been exposed to cocaine, 20 had not. Infants exposed to cocaine had lower rates of change in neurodevelopment.

Questions About Whether and When a Particular Event Occurs
1. Within-Person: Does each married couple eventually divorce or not?
2. Between-Person: If so, when are couples at greatest risk of divorce? How does risk of divorce vary by couple characteristics?
Discrete- and Continuous-Time Survival Analysis (ALDA, Chapters 9 thru 15)
South (2001) studied marriage duration: 3,523 couples followed for 23 years, until divorce or until the study ended. Couples in which the wife was employed tended to divorce earlier.

3 I.2 Illustrative Example: Grade At First Heterosexual Intercourse (ALDA, Section 11.1, pp 358-360)

Data Source: Deborah Capaldi & Colleagues (1996), Child Development.
Sample: 180 middle school boys (all considered “at risk”).
Research Design: Panel study, each boy tracked from 7th through 12th grade. By the end of data collection (end of 12th grade), 126 boys (70%) had reported having had heterosexual sex. The remaining 54 boys (30%) were still virgins. These censored observations pose a challenge for data analysis.
Question predictor: PT (“Parenting Transition”), a dichotomous variable indicating whether the boy had lived with both biological parents during his early formative years (before 7th grade):
  72 boys (40%) had lived with both biological parents (PT=0).
  108 boys (60%) experienced at least one parenting transition before 7th grade (PT=1).
Ultimately, we’ll also include a second, continuous predictor, PAS, which records the parents’ level of antisocial behavior during the child’s formative years (time-invariant: the behavior was recorded before the study began). Because the original PAS scale was totally arbitrary, we have standardized the scores to a mean of 0 and sd of 1.

4 I.2 The Life Table: Summarizing The Sample Distribution Of Event Occurrence Over Time (ALDA, Section 10.1, pp 326-329)

How Might We Summarize The Distribution Of Event Occurrence?
[Life table not reproduced in this transcript. Its rows are the J intervals (j = 7, 8, …, 12); its columns give the risk set, the n experiencing the target event in interval j, and the n censored in interval j.]
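The bookkeeping behind a life table can be sketched in a few lines of Python. This is only an illustration: the per-grade event counts below are assumptions (the transcript does not show the table), chosen so that they add up to the 126 events and 54 censored boys reported for this sample, and the function name `life_table` is hypothetical.

```python
# Illustrative counts (assumed, not taken from the slides): they sum to
# 126 events and 54 censored boys out of a starting sample of 180.
GRADES = [7, 8, 9, 10, 11, 12]
N_EVENTS = [15, 7, 24, 29, 25, 26]
N_CENSORED = [0, 0, 0, 0, 0, 54]


def life_table(n_start, grades, n_events, n_censored):
    """Return one row per interval: (grade, risk set, events, censored)."""
    rows = []
    at_risk = n_start
    for grade, events, censored in zip(grades, n_events, n_censored):
        rows.append((grade, at_risk, events, censored))
        # Anyone who experienced the event or was censored in this interval
        # leaves the risk set before the next interval begins.
        at_risk -= events + censored
    return rows


table = life_table(180, GRADES, N_EVENTS, N_CENSORED)
for grade, at_risk, events, censored in table:
    print(f"grade {grade:>2}: at risk={at_risk:>3}  events={events:>2}  censored={censored:>2}")
```

The key design point is that the risk set shrinks from interval to interval: a boy contributes to an interval's risk set only if he has neither experienced the event nor been censored earlier.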

5 I.2 Assessing The Conditional Risk Of Event Occurrence: Discrete-time Hazard Function (ALDA, Section 10.2.1, pp 330-339)

Discrete-Time Hazard Probability: Conditional probability that individual i will experience the target event in time period j ($T_i = j$) given that s/he didn’t experience it in any earlier time period ($T_i \ge j$):

$$h(t_{ij}) = \Pr\{T_i = j \mid T_i \ge j\}$$

As a probability, discrete-time hazard is bounded by 0 and 1. This is an issue for statistical modeling that we’ll need to address subsequently.
Estimation is easy: each estimate of hazard is simply the sample proportion of the risk set in that interval who experience the event of interest …
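The estimation rule described above (events divided by the size of the risk set, interval by interval) can be sketched as follows. The counts are illustrative assumptions, not values shown in the transcript, and `estimate_hazard` is a hypothetical helper name.

```python
def estimate_hazard(n_events, n_at_risk):
    """Sample hazard per interval: events divided by the size of the risk set."""
    return [events / at_risk for events, at_risk in zip(n_events, n_at_risk)]


# Assumed illustrative counts for grades 7 through 12 (not from the slides).
grades = [7, 8, 9, 10, 11, 12]
n_at_risk = [180, 165, 158, 134, 105, 80]
n_events = [15, 7, 24, 29, 25, 26]

hazard = estimate_hazard(n_events, n_at_risk)
for grade, h in zip(grades, hazard):
    print(f"grade {grade:>2}: estimated hazard = {h:.4f}")
```

Because each estimate is a proportion, every value necessarily falls between 0 and 1, which is the bound the slide flags as a modeling issue.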

6 I.2 Cumulating Risk Over Time: The Survivor Function (And Median Lifetime) (ALDA, Section 10.2, pp 330-339)

Discrete-Time Survival Probability: Probability that individual i will “survive” beyond time period j ($T_i > j$), i.e., will not experience the event until after time period j:

$$S(t_{ij}) = \Pr\{T_i > j\}$$

Is also a probability, and therefore bounded by 0 and 1. At the beginning of time, $S(t_{i0})$ must equal 1.
Strategy for estimation: Since $h(t_{ij})$ describes the probability of event occurrence in any period (given no event prior to that period), $1 - h(t_{ij})$ tells us about the probability of non-occurrence (i.e., about survival) in that period, and so …

[Figure: estimated survivor function S(t), from 0.00 to 1.00, plotted against Grade (6 to 12), with the estimated median lifetime marked at ML = 10.6.]
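A minimal sketch of the estimation strategy: survival through period j is survival through the previous period multiplied by one minus the hazard in period j, and the median lifetime is found by linear interpolation between the two periods whose survival probabilities bracket 0.5. The hazard values are built from assumed illustrative counts (not shown in the transcript), and the function names are hypothetical.

```python
def survivor_from_hazard(hazard):
    """Cumulate hazard into survival: S_j = S_(j-1) * (1 - h_j), with S_0 = 1."""
    surv = []
    s = 1.0
    for h in hazard:
        s *= 1.0 - h
        surv.append(s)
    return surv


def median_lifetime(periods, surv):
    """Linearly interpolate the time at which the survivor function hits 0.5."""
    prev_t, prev_s = periods[0] - 1, 1.0
    for t, s in zip(periods, surv):
        if s <= 0.5:
            return prev_t + (prev_s - 0.5) / (prev_s - s) * (t - prev_t)
        prev_t, prev_s = t, s
    return None  # the survivor function never drops to 0.5


# Assumed illustrative hazards for grades 7 through 12 (events / risk set).
grades = [7, 8, 9, 10, 11, 12]
hazard = [15 / 180, 7 / 165, 24 / 158, 29 / 134, 25 / 105, 26 / 80]

surv = survivor_from_hazard(hazard)
ml = median_lifetime(grades, surv)
print([round(s, 4) for s in surv], ml)
```

With these assumed counts the interpolated median agrees with the ML = 10.6 shown on the slide, which suggests the counts are at least consistent with the plotted survivor function.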

7 II.1 Introducing Discrete-Time Hazard Model: Thinking About The Impact of Predictors (ALDA, Section 11.1.1, pp 358-361)

Questions To Ask When Examining Sample Hazard Functions:
What is the shape of each hazard function? Here, their shape is similar: both begin low and climb steadily over time.
Does the relative level of hazard differ across groups? Here, the hazard profile for boys with a parenting transition is consistently higher than the profile for boys without.
This suggests we should partition variation in risk into:
  A “baseline” profile of risk.
  A shift in risk corresponding to variation in the predictor.

Questions To Ask When Examining Sample Survivor Functions:
These tend to be less useful because they assess the predictor’s cumulative effect. Here, they tell us that the sample median lifetime for boys with a PT is 10.0 vs. 11.7 when PT=0.
Note the reversal of relative rankings: the group with the higher hazard profile has the lower survivor function.

8 II.1 Introducing the Discrete-Time Survival Model: Thinking About Hazard’s Bounds (ALDA, Section 11.1.2, pp 362-365)

Facts about the Logit scale:
Ranges from $-\infty$ to $+\infty$ and is “centered” on 0, taking negative values when hazard is less than .50.
Tends to regularize separation of hazard functions: stretches distance between small values and compresses distance between large values.
Easy to convert back into raw hazard: $h = 1 / (1 + e^{-\operatorname{logit}(h)})$.
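The facts about the logit scale can be checked with a short sketch. The helper names `logit` and `inv_logit` are hypothetical, and the hazard values are arbitrary illustrations rather than estimates from the study.

```python
import math


def logit(h):
    """Log odds of a hazard probability h, for 0 < h < 1."""
    return math.log(h / (1.0 - h))


def inv_logit(x):
    """Convert a logit back into a raw hazard probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Negative below .50, zero at .50, positive above .50.
for h in (0.05, 0.50, 0.80):
    print(f"hazard {h:.2f} -> logit {logit(h):+.3f} -> back to {inv_logit(logit(h)):.2f}")

# The same .01 gap in hazard is much wider on the logit scale when hazard is small.
gap_small = logit(0.02) - logit(0.01)
gap_mid = logit(0.51) - logit(0.50)
print(gap_small > gap_mid)
```

Working on the logit scale removes the 0 and 1 bounds, so a linear model for logit hazard can never produce a fitted hazard outside the admissible range.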

9 II.1 What Population Statistical Model Could Have Generated These Sample Data? (ALDA, Section 11.1.1, pp 366-369)

[Figure: sample Logit(hazard) plotted against Grade (6 to 12) for PT=0 and PT=1, alongside three candidate population models: a flat, a linear, and a general population logit hazard, each shifted when PT switches from 0 to 1.]

Three Reasonable Features Of A Population Model For Discrete-time Hazard?
1. At each predictor value, there is a population logit-hazard function. When the predictor(s)=0, we refer to it as the “baseline” logit-hazard function.
2. Each population logit-hazard function is constrained to have the identical shape, at every predictor value. This is an assumption that will be relaxed later.
3. Vertical separation of the logit hazard functions, by predictor value, is the same in each time period. Differences in predictor value serve only to “shift” the logit-hazard function “vertically.” This is an assumption that will be relaxed later. Until then, the magnitude of the vertical shift is the magnitude of the predictor’s effect.

10 II.1 What Kind of Model Specification Possesses These Three Features? (ALDA, Section 11.2, pp 369-372)

Recode PERIOD into a set of TIME indicators.
Constant vertical shift in logit hazard is associated with differences in PT.
How Does This Model Map Onto The Previous Plot?
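The slide's own equation did not survive in this transcript. One specification with the features described, written with assumed notation ($D_7, \dots, D_{12}$ for the TIME indicators, $\alpha$'s for the baseline parameters, and $\beta_1$ for the effect of PT), would be:

```latex
\operatorname{logit} h(t_{ij})
  = \left[ \alpha_7 D_{7ij} + \alpha_8 D_{8ij} + \cdots + \alpha_{12} D_{12ij} \right]
  + \beta_1 \, PT_i
```

Because exactly one indicator equals 1 in any period, the bracketed term picks out a single $\alpha$ per grade (a completely general baseline), while $\beta_1$ adds the same amount in every period (the constant vertical shift).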

11 II.1 Unpacking The Proposed Discrete-Time Hazard Model Carefully (ALDA, Section 11.2.1, pp 372-376)

When PT=0, you get the baseline logit hazard function.
When PT=1, you shift this entire baseline vertically by $\beta_1$.
And we can add predictors just as in regular (logistic) regression.
How Does This Model Behave When Hazard Is Expressed On The Other Scales?

12 II.1 What Does The DT Hazard Model Look Like When Expressed On Other Scales? (ALDA, Section 11.2.2, pp 376-379)

On the Logit Scale, the vertical separation of population hazard functions, at different predictor values, is the same in each time period (this assumption is built into our model).
On the Odds Scale, population hazard functions at different values of the predictor are a constant magnification (or diminution) of each other. They are proportional.
On the Hazard Scale, population hazard functions at different predictor values have no constant separation and are approximately proportional only when the magnitude of hazard is small.
Consequently, the “standard” DTSA model is a Proportional Odds model!
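The behavior on the three scales can be illustrated numerically. The baseline logit hazards and the shift below are hypothetical values chosen for the sketch, not estimates from the study.

```python
import math


def inv_logit(x):
    """Convert a logit into a hazard probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical baseline logit hazards (from small to large hazard) and a
# hypothetical constant shift on the logit scale.
baseline_logits = [-4.0, -3.0, -2.0, -1.0, 0.0]
shift = 0.8

odds_ratios = []
hazard_ratios = []
for a in baseline_logits:
    h0 = inv_logit(a)           # hazard when the predictor is 0
    h1 = inv_logit(a + shift)   # hazard when the predictor is 1
    odds_ratios.append((h1 / (1.0 - h1)) / (h0 / (1.0 - h0)))
    hazard_ratios.append(h1 / h0)
    print(f"h0={h0:.3f} h1={h1:.3f} odds ratio={odds_ratios[-1]:.3f} "
          f"hazard ratio={hazard_ratios[-1]:.3f}")
```

The odds ratio equals exp(shift) in every period, while the hazard ratio drifts away from that value as the baseline hazard grows, which is why the model is proportional in odds rather than in hazard.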

13 II.2 Fitting The DTSA Model by Logistic Regression Analysis In The Person-Period Data Set (ALDA, Section 11.3, pp 378-386)

All parameter estimates, standard errors, t- and z-statistics, goodness-of-fit statistics, and tests will be correct for the discrete-time hazard model.
[Annotated model equation not reproduced; its labels mark the Outcome, the TIME Indicators, and the Substantive Predictors.]
α’s estimate the baseline logit hazard function.
β’s assess the effects of substantive predictors.
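A person-period data set has one row per person per period at risk. The sketch below shows one way to build such rows for a single boy; the function name, the field names (`id`, `period`, `event`, `pt`, `D7` to `D12`), and the two example boys are all hypothetical.

```python
GRADES = (7, 8, 9, 10, 11, 12)


def to_person_period(person_id, last_grade, censored, pt, grades=GRADES):
    """Expand one person-level record into one row per grade at risk.

    last_grade: grade of the event, or the last grade observed if censored.
    censored:   True if the event had not occurred by last_grade.
    """
    rows = []
    for grade in grades:
        if grade > last_grade:
            break  # no longer in the risk set after the event or censoring
        row = {"id": person_id, "period": grade, "pt": pt}
        for d in grades:
            row[f"D{d}"] = 1 if d == grade else 0  # TIME indicators
        # The outcome is 1 only in the period in which the event occurs.
        row["event"] = 1 if (grade == last_grade and not censored) else 0
        rows.append(row)
    return rows


rows_event = to_person_period(person_id=1, last_grade=9, censored=False, pt=1)
rows_censored = to_person_period(person_id=2, last_grade=12, censored=True, pt=0)
print(len(rows_event), [r["event"] for r in rows_event])
print(len(rows_censored), [r["event"] for r in rows_censored])
```

Rows like these can then be passed to any logistic regression routine with `event` as the outcome, the TIME indicators and substantive predictors as regressors, and no separate intercept.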

14 II.3 Strategies For Interpreting the α’s: Estimating the Baseline Hazard Function (ALDA, Section 11.4.1, pp 386-388)

Because there are no predictors in Model A, the fitted hazard profile is the “baseline” risk profile for the entire sample:
  If estimates are approximately equal across time periods, the baseline fitted hazard profile is flat.
  If estimates get smaller across time periods, later fitted hazard probabilities are lower.
  If estimates increase across time periods (as they do here), later fitted hazard probabilities are higher.
Simplifying interpretation by transforming back to odds and hazard.
Because there are no substantive predictors, Model A’s estimates are the same as the sample estimates.

15 II.3 Strategies For Interpreting the β’s: ML Estimates Of Substantive Predictors’ Effects (ALDA, Section 11.4.2 & 11.4.3, pp 388-390)

Dichotomous predictors: As in regular logistic regression, antilogging a parameter estimate yields the estimated odds ratio associated with a 1-unit difference in the predictor.
Estimated odds of first intercourse for boys who have experienced a parenting transition are 2.4 times the odds for boys who did not experience such a transition.

Continuous predictors: Antilogging still yields an estimated odds ratio associated with a 1-unit difference in the predictor.
Estimated odds of first intercourse for boys whose parents exhibited “1 unit more” of antisocial behavior are 1.56 times the odds for boys whose parental antisocial behavior was one unit lower.

Because odds ratios are symmetric about 1, you can also invert the odds ratios and change the reference group.
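The antilogging step can be sketched directly. The value 0.8736 is the PT estimate quoted elsewhere in this deck and 1.56 is the PAS odds ratio quoted above; the helper name `odds_ratio` and the 2-unit comparison are assumptions added for illustration.

```python
import math

BETA_PT = 0.8736  # PT estimate on the logit scale, as quoted in this deck
OR_PAS = 1.56     # odds ratio per 1-unit difference in PAS, as quoted above


def odds_ratio(beta, units=1.0):
    """Odds ratio associated with a `units`-sized difference in a predictor."""
    return math.exp(beta * units)


or_pt = odds_ratio(BETA_PT)
or_pt_inverted = 1.0 / or_pt     # same effect with PT=1 as the reference group
beta_pas = math.log(OR_PAS)      # back-transform the quoted odds ratio to a logit
or_pas_2units = odds_ratio(beta_pas, units=2.0)

print(f"PT odds ratio: {or_pt:.2f}; inverted: {or_pt_inverted:.2f}")
print(f"PAS odds ratio for a 2-unit difference: {or_pas_2units:.2f}")
```

Because the model is linear on the logit scale, the odds ratio for a difference of c units is the 1-unit odds ratio raised to the power c, not c times the 1-unit odds ratio.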

16 II.3 Displaying Fitted Hazard and Survivor Functions (Using Model B As An Example) (ALDA, Section 11.5.1, pp 392-394)

With a single dichotomous predictor, there are 2 possible prototypical participants:
  PT = 0 (for boys from stable homes with no parenting transitions before 7th grade)
  PT = 1 (for boys who experienced one or more early parenting transitions)

17 II.3 Displaying Fitted Hazard And Survivor Functions For Prototypical Participants (ALDA, Section 11.5.1, pp 392-394)

Constant vertical separation of 0.8736 (the parameter estimate for PT).
Easy to see the impact of PT, but beware the non-constant vertical separation of the fitted hazard functions (for which there is no simple interpretation because the model is proportional in odds, not hazard).
Effect of PT cumulates into a large difference in estimated median lifetimes (9.9 vs. 11.8, a difference of about 2 years).
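Fitted functions for the two prototypical boys can be sketched by adding the PT estimate to a baseline on the logit scale, converting to hazard, and cumulating into survival. Only 0.8736 comes from the slide; the baseline logit hazards below are assumed values supplied for the sketch (the transcript does not list the fitted baseline estimates), and all function names are hypothetical.

```python
import math

GRADES = [7, 8, 9, 10, 11, 12]
# Assumed baseline logit hazards for grades 7 through 12 (hypothetical values;
# the fitted baseline estimates are not shown in this transcript).
ALPHA = [-2.9943, -3.7001, -2.2811, -1.8226, -1.6542, -1.1791]
BETA_PT = 0.8736  # PT estimate quoted on the slide


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def fitted_hazard(pt):
    """Fitted hazard per grade for a prototypical boy with the given PT value."""
    return [inv_logit(a + BETA_PT * pt) for a in ALPHA]


def fitted_survivor(hazard):
    surv, s = [], 1.0
    for h in hazard:
        s *= 1.0 - h
        surv.append(s)
    return surv


def median_lifetime(grades, surv):
    """Linear interpolation between the grades that bracket S = 0.5."""
    prev_t, prev_s = grades[0] - 1, 1.0
    for t, s in zip(grades, surv):
        if s <= 0.5:
            return prev_t + (prev_s - 0.5) / (prev_s - s) * (t - prev_t)
        prev_t, prev_s = t, s
    return None


ML = {pt: median_lifetime(GRADES, fitted_survivor(fitted_hazard(pt))) for pt in (0, 1)}
for pt, ml in ML.items():
    print(f"PT={pt}: estimated median lifetime = {ml:.1f}")
```

With these assumed baseline values the sketch gives median lifetimes close to the 9.9 and 11.8 quoted above, and the gap between the two fitted hazard functions widens across grades even though the logit-scale gap is fixed.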

18 II.3 Displaying Fitted Hazard and Survivor Functions When Predictors Are Continuous (ALDA, Section 11.5.1, pp 392-394)

As in growth modeling, select substantively interesting prototypical values and proceed just as you did for dichotomous predictors. Here, we’ll choose +/- 1 sd of PAS (low = -1, medium = 0, and high = +1).

19 II.3 Comparing Model Goodness Of Fit Using Deviance Statistics (ALDA, Section 11.6, pp 397-402)

[Table of fitted models not reproduced; its labeled entries include the TIME dummies and the Deviance.]
Deviance: smaller value, better fit; $\chi^2$ distribution; compare nested models.
Model B vs. Model A provides an uncontrolled test of $H_0\colon \beta_{PT} = 0$: $\Delta$Deviance = 17.30 (1 df), p < .001.
Model C vs. Model A provides an uncontrolled test of $H_0\colon \beta_{PAS} = 0$: $\Delta$Deviance = 14.79 (1 df), p < .001.
Model D vs. Models B & C provide controlled tests [both rejected as well].
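The p-values for the two uncontrolled tests can be checked with the standard library alone, because a chi-square variate with 1 degree of freedom is the square of a standard normal. The deviance differences are the ones quoted above; the helper name `chi2_sf_1df` is hypothetical, and the sketch covers only the 1-df case.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom.

    Uses P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for standard normal Z.
    """
    return math.erfc(math.sqrt(x / 2.0))


# Differences in deviance between nested models, as quoted on the slide.
tests = {"Model B vs. Model A (PT)": 17.30, "Model C vs. Model A (PAS)": 14.79}

p_values = {label: chi2_sf_1df(delta) for label, delta in tests.items()}
for label, p in p_values.items():
    print(f"{label}: p = {p:.6f}")
```

Both p-values fall below .001, matching the slide. Comparing nested models that differ by more than one parameter would need the general chi-square distribution, which this sketch does not implement.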

