STATA WORKSHOP www.lss.stir.ac.uk www.longitudinal.stir.ac.uk.


1 STATA WORKSHOP www.lss.stir.ac.uk www.longitudinal.stir.ac.uk

2 STATA WORKSHOP
- Practical things
- Toilets
- Take a break – do the exercises
- Fire drill
- Management Centre

3 STATA WORKSHOP
- Culture of the workshop
- Practical
- Culture of sharing knowledge
- Ask questions
- Something will go wrong (computers eh?)
- You will tell us something we don’t know!

4 STATA WORKSHOP
Structure of the workshop
- Introductory stuff – Dr Vernon Gayle
- Data management – Dr Paul Lambert
- More stuff – Dr Vernon Gayle
- Some advanced stuff – Professor David Bell

5 Statistical Modelling Some general points – see handout.

6 STATA SOFTWARE – GOOD POINTS
- Does all the simple stuff (SPSS)
- Is specifically designed for survey analysis (all the weighting and design related issues are better catered for)
- Fits many more models than standard software
- You can get started easily (menus and help)
- There is a growing user community (lists etc)
- New features emerge almost daily
- There are good labour market opportunities (UK little known; USA well known)

7 STATA SOFTWARE – BAD POINTS
- Poor data handling (compared with SPSS etc)
- The weighting and design related issues can be complicated (some analysts ignore them)
- There are still some models that can’t be fitted (see GLIM4; SABRE; MLwiN etc)
- STATA syntax is a pain in the bum
- There is a growing user community, but they are generally GEEKBOYS (like myself!)
- New features emerge almost daily, and these are sometimes tricky to get to grips with

8 Recurrent Events Analysis

9 The structure of many large-scale studies results in survey data being collected on a number of discrete occasions. In this situation, time lends itself to being conceptualized as a sequence of discrete events rather than as continuous. Furthermore, social scientists are often substantively interested in whether a specific event has occurred. Taken together, these two points motivate the adoption of a discrete-time or event history approach.

10 Recurrent events are merely outcomes that can take place on a number of occasions. A simple example is unemployment measured month by month. In any given month an individual can either be employed or unemployed. If we had data for a calendar year we would have twelve discrete outcome measures (i.e. one for each month).

11 Social scientists now routinely employ statistical models for the analysis of discrete data, most notably logistic and log-linear models, in a wide variety of substantive areas. I believe that the adoption of a recurrent events approach is appealing because it is a logical extension of these models.

12 Willett and Singer (1995) conclude that discrete-time methods are generally considered to be simpler and more comprehensible; moreover, mastery of discrete-time methods facilitates a transition to continuous-time approaches should that be required.

Willett, J. and Singer, J. (1995) Investigating Onset, Cessation, Relapse, and Recovery: Using Discrete-Time Survival Analysis to Examine the Occurrence and Timing of Critical Events. In J. Gottman (ed) The Analysis of Change (Hove: Lawrence Erlbaum Associates).

13 Employment (BHPS Data)

Y      0     0     1     0
Time   t1    t2    t3    t4
Year   1991  1992  1993  1994
Wave   1     2     3     4

14 Consider a binary outcome or two-state event
0 = Event has not occurred
1 = Event has occurred
In the cross-sectional situation we are used to modelling this with logistic regression.

15 Constantly unemployed
Months  1 2 3 4 5 6
obs     0 0 0 0 0 0

16 Constantly employed
Months  1 2 3 4 5 6
obs     1 1 1 1 1 1

17 Employed in month 1 then unemployed
Months  1 2 3 4 5 6
obs     1 0 0 0 0 0

18 Unemployed but gets a job in month six
Months  1 2 3 4 5 6
obs     0 0 0 0 0 1

19 Mixed employment patterns
Months  1 2 3 4 5 6
obs     0 1 0 1 1 0
obs     0 0 1 0 1 1
obs     0 1 1 0 0 1
obs     1 0 0 0 1 0
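One way to see how these mixed patterns differ is to summarise each sequence. The sketch below is illustrative only: the labels obs_a to obs_d are hypothetical (the slide calls every row "obs"), and counting changes of state is a descriptive device chosen here, not a method from the workshop.

```python
# Monthly sequences copied from the slide (1 = employed, 0 = unemployed).
# The labels obs_a..obs_d are hypothetical; the slide just calls each row "obs".
patterns = {
    "obs_a": "010110",
    "obs_b": "001011",
    "obs_c": "011001",
    "obs_d": "100010",
}

def count_transitions(sequence):
    """Count month-to-month changes of state in a string of 0s and 1s."""
    return sum(1 for prev, curr in zip(sequence, sequence[1:]) if prev != curr)

def months_employed(sequence):
    """Count the months coded 1 (employed)."""
    return sequence.count("1")

summary = {name: (months_employed(s), count_transitions(s)) for name, s in patterns.items()}
for name, (employed, changes) in summary.items():
    print(name, "months employed:", employed, "changes of state:", changes)
```

Sequences with the same number of months in employment can still differ in how often the person moves between states, which is information a single cross-section cannot show.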

20 Here we have a binary outcome – so could we simply use logistic regression to model it?
Months  1 2 3 4 5 6
obs     0 0 0 0 0 0

21 Our studio audience says…. Yes and No!

22 POOLED CROSS-SECTIONAL LOGIT MODEL

$$L_{it}(\beta) = \frac{[\exp(\beta' x_{it})]^{y_{it}}}{1 + \exp(\beta' x_{it})}$$

In conventional logistic regression models, each observation is assumed to be independent and a logistic link function is used. The contribution to the likelihood by the i-th case and the t-th event is given by the equation above.

23 POOLED CROSS-SECTIONAL LOGIT MODEL $x_{it}$ is a vector of explanatory variables and $\beta$ is a vector of parameter estimates.
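As a rough illustration of what "pooled" means here, the following Python sketch adds up the log of each case-by-occasion likelihood contribution as though every row were an independent observation. The function name, the example rows, and the coefficient values are all hypothetical; in practice a statistical package would maximise this quantity rather than just evaluate it.

```python
import math

def pooled_logit_loglik(beta, rows):
    """Sum the log-likelihood contributions over all rows, treating rows as independent.

    rows: iterable of (x, y) pairs, where x is a list of explanatory variables
    for one case at one occasion and y is the 0/1 outcome.
    """
    total = 0.0
    for x, y in rows:
        eta = sum(b * xj for b, xj in zip(beta, x))  # linear predictor beta'x
        # log( exp(eta)^y / (1 + exp(eta)) ) = y * eta - log(1 + exp(eta))
        total += y * eta - math.log1p(math.exp(eta))
    return total

# Hypothetical person-month rows: x = [constant, covariate], y = outcome.
example_rows = [([1.0, 0.0], 0), ([1.0, 1.0], 1), ([1.0, 1.0], 1), ([1.0, 0.0], 0)]
loglik_at_zero = pooled_logit_loglik([0.0, 0.0], example_rows)
loglik_other = pooled_logit_loglik([-1.0, 2.0], example_rows)
print(loglik_at_zero, loglik_other)
```

Nothing in this calculation records which rows belong to the same person, which is exactly the weakness of the pooled approach.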

24 This approach can be regarded as a naïve solution to our data analysis problem. We need to consider a number of technical issues… Note: If an economist or an on-the-ball social scientist spots this, you will get your grant/paper rejected!

25
Months  Y1  Y2
obs     0   0

Pickles’ tip - In repeated measures analysis we would require something like a ‘paired’ t test rather than an ‘independent’ t test because we can assume that Y1 and Y2 are related.
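The point about paired versus independent tests can be checked with a small calculation. The sketch below uses invented continuous scores for five people measured twice; the data and the function names are hypothetical, and the "independent" version is a Welch-style statistic chosen for simplicity.

```python
import math
import statistics

# Hypothetical scores for the same five people at two occasions (invented data).
y1 = [10, 12, 14, 16, 18]
y2 = [11, 14, 15, 18, 19]

def paired_t(a, b):
    """t statistic based on within-person differences."""
    diffs = [bi - ai for ai, bi in zip(a, b)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se

def independent_t(a, b):
    """Welch-style t statistic that treats the two occasions as unrelated samples."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(b) - statistics.mean(a)) / se

t_paired = paired_t(y1, y2)
t_indep = independent_t(y1, y2)
print(round(t_paired, 2), round(t_indep, 2))
```

With these numbers the paired statistic is far larger, because differencing within each person removes the stable between-person variation that the independent version treats as noise.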

26 Repeated measures data violate an important assumption of conventional regression models. The responses of an individual at different points in time will not be independent of each other. This problem has been overcome by the inclusion of an additional, individual-specific error term.

27 The logical extension to the standard (vanilla) logit (i.e. the pooled analysis) is to use an appropriate longitudinal model.
- Random Effects Model Logit
- Logistic Mixture Model
- Called xtlogit in STATA

28 The random effects model extends the pooled cross-sectional model to include a case-specific random error term to better represent the effects of residual heterogeneity. For a sequence of outcomes for the i-th case, the basic random effects model has the integrated (or marginal) likelihood given by the equation.

29
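The equation on this slide did not survive in the transcript. One standard way of writing the integrated likelihood for a random effects logit model is given below. It is offered as an assumption about the general form, not as the slide's exact notation: $\varepsilon$ denotes the case-specific error term, $f(\varepsilon)$ its probability density (often taken to be normal), and $T_i$ the number of occasions observed for case $i$.

```latex
L_i(\beta) = \int_{-\infty}^{\infty} \prod_{t=1}^{T_i}
  \frac{[\exp(\beta' x_{it} + \varepsilon)]^{y_{it}}}{1 + \exp(\beta' x_{it} + \varepsilon)}
  \, f(\varepsilon) \, d\varepsilon
```

Conditional on $\varepsilon$, the outcomes for a case are treated as independent logit responses; integrating over $\varepsilon$ is what ties the outcomes of the same case together.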

30 Davies and Pickles (1985) have demonstrated that the failure to explicitly model the effects of residual heterogeneity may cause severe bias in parameter estimates. Using longitudinal data the effects of omitted explanatory variables can be overtly accounted for within the statistical model. This greatly improves the accuracy of the estimated effects of the explanatory variables.
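To make the role of the case-specific error term concrete, the sketch below evaluates an integrated likelihood for one case numerically. It is a simplified illustration under stated assumptions: the random effect is taken to be normal with mean zero and standard deviation sigma, the integral is approximated with a plain trapezoid rule rather than the quadrature a package such as STATA would use, and all names and example values are hypothetical.

```python
import math

def normal_pdf(e, sigma):
    """Density of a Normal(0, sigma^2) random effect (an assumed mixing distribution)."""
    return math.exp(-0.5 * (e / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def conditional_lik(beta, rows, e):
    """Product over occasions of the logit contributions for one case, given random effect e."""
    lik = 1.0
    for x, y in rows:
        eta = sum(b * xj for b, xj in zip(beta, x)) + e
        p = 1.0 / (1.0 + math.exp(-eta))
        lik *= p if y == 1 else 1.0 - p
    return lik

def marginal_lik(beta, rows, sigma, n_points=2001, width=8.0):
    """Integrate the conditional likelihood over the random effect with the trapezoid rule."""
    lo, hi = -width * sigma, width * sigma
    step = (hi - lo) / (n_points - 1)
    total = 0.0
    for k in range(n_points):
        e = lo + k * step
        weight = 0.5 if k in (0, n_points - 1) else 1.0
        total += weight * conditional_lik(beta, rows, e) * normal_pdf(e, sigma)
    return total * step

# Hypothetical case observed twice, constant-only model, both outcomes equal to 1.
two_ones = [([1.0], 1), ([1.0], 1)]
with_heterogeneity = marginal_lik([0.0], two_ones, sigma=2.0)
pooled_value = 0.5 * 0.5  # what the pooled model gives when beta = 0
print(with_heterogeneity, pooled_value)
```

With a zero coefficient the pooled model assigns probability 0.25 to two successive 1s, while the integrated likelihood assigns more, reflecting that outcomes from the same case tend to resemble each other once residual heterogeneity is allowed for.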

31

