Cohort studies: Statistical analysis Jan Wohlfahrt Department of Epidemiology Research Statens Serum Institut
Contents
1. A research question and a wrong answer
2. What kind of data is needed
3. How to analyse data
4. Confounder adjustment
5. Poisson regression
6. Cox regression
Danish Epidemiology Science Centre, Copenhagen, Denmark
1.1. A research question
Does MMR vaccination increase the risk of autistic disorder?
1.1. Material
All children born 1991 to 1998 ( children).
Register-based information on MMR vaccination ( ) and autistic disorder (412 cases).
Cohort: Danish Civil Registration System
Information on autism: Danish Psychiatric Central Register
Information on MMR: Danish National Board of Health
1.2. A wrong answer I
What is the proportion of children with autism among vaccinated and non-vaccinated children in the cohort before the end of 2000 (end of follow-up)?
              -autism   +autism
- MMR vacn.   …         … (0.064%)
+ MMR vacn.   …         … (0.079%)
Relative risk = 0.079/0.064 ≈ 1.2
1.2. A wrong answer II
The simple comparison of proportions is not correct, because:
1) autism may be diagnosed before MMR vaccination
2) there is no age adjustment, and time under risk is not taken into account
Conclusion: Compare person-time under risk, not the number of persons under risk.
2. What kind of data is needed
2.1. Information on time
Time of study entry (1yr birth date)
Time of status change (date of vaccination)
Time of outcome (date of autism)
Time of study exit (date of autism, death, emigration, disappearance, end of study)
2.2 Datalines
Who  1yr birth date  Vacn.      Autism  Death/emig.
1    11sep1995       04apr1997  17oct…  .
…
More than ( ) datalines.
Not before end of 2000
2.3 Data as lifelines
3. How to analyse data
3.1 Cox vs. Poisson regression
Poisson regression in large datasets with time-dependent variables
Cox regression in small datasets
3.2 Lifelines
3.3 Contribution of pyrs
Who  Vacn.?  Person years  Autism?
1    No      1.56 yr       No
1    Yes     0.54 yr       Yes
2    No      5.11 yr       No
…    …       … yr          No
4    Yes     2.16 yr       Yes
5    No      0.13 yr       No
…    …       … yr          No
6    Yes     2.03 yr       No
…    …       … yr          Yes
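The split of one lifeline into unvaccinated and vaccinated person-years can be sketched as follows. This is a minimal illustration in Python rather than the SAS used on these slides; the function name and the 365.25-day year are assumptions, and the dates are those of the first child in the Cox datalines (entry 11sep95, vaccination 04apr97, autism 17oct97).

```python
from datetime import date

DAYS_PER_YEAR = 365.25  # assumed conversion from days to years


def split_at_vaccination(entry, exit_, vaccination=None):
    """Return (unvaccinated_years, vaccinated_years) for one child.

    Follow-up runs from entry to exit_. Time before the vaccination date
    counts as unvaccinated, time from that date onwards as vaccinated.
    """
    if vaccination is None or vaccination >= exit_:
        cut = exit_  # never vaccinated during follow-up
    else:
        cut = max(vaccination, entry)  # vaccinated at or after entry
    unvaccinated = (cut - entry).days / DAYS_PER_YEAR
    vaccinated = (exit_ - cut).days / DAYS_PER_YEAR
    return unvaccinated, vaccinated


unvacc, vacc = split_at_vaccination(
    entry=date(1995, 9, 11),
    exit_=date(1997, 10, 17),
    vaccination=date(1997, 4, 4),
)
print(round(unvacc, 2), round(vacc, 2))  # 1.56 0.54
```

The autism case is counted in the vaccinated segment, because that is where this child's follow-up ends.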
3.5 Data reduction
Vacn.?   cases   person years (pyrs)
- vacn   …       … = …
+ vacn   …       0.54 + 2.16 + 2.03 = 4.73
3.6 Rate ratio calculation
Vacn.    cases   person years   Rate (per … )
- vacn   …       …              …
+ vacn   …       …              …
(Incidence) rate = number of new autistic cases per person-year = cases/pyrs
Rate ratio = RR(+vacn vs -vacn) = rate(+vacn) / rate(-vacn) = 1.40
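The rate and rate ratio definitions can be written out as a short Python sketch. The function names and the counts below are hypothetical and serve only to show the arithmetic; they are not the study's numbers.

```python
def rate(cases, pyrs):
    """Incidence rate: new cases per person-year under risk."""
    return cases / pyrs


def rate_ratio(cases_exposed, pyrs_exposed, cases_ref, pyrs_ref):
    """Rate in the exposed group divided by rate in the reference group."""
    return rate(cases_exposed, pyrs_exposed) / rate(cases_ref, pyrs_ref)


# Hypothetical counts: 12 cases in 800 pyrs (+vacn), 10 cases in 1000 pyrs (-vacn)
rr = rate_ratio(cases_exposed=12, pyrs_exposed=800.0, cases_ref=10, pyrs_ref=1000.0)
print(f"{rr:.2f}")  # 1.50
```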
4. Confounder Adjustment
4.1 The Lexis diagram
4.2 Person-years by age and period
Nr.  Vacn.  Age  Period  Pyrs  Autism
…
4.3 Person-years by age and period
(9 ages) x (9 periods) x (2 vacn.) = 162 groups
e.g.
Age  period  vacn  pyrs  cases
…    …       yes   …     …
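Distributing one child's follow-up over age-by-period cells can be sketched in Python as below. This is a deliberately simple day-by-day version, not the method used for the slides; the function names are hypothetical, and the birth date is an assumption (one year before the 11sep1995 entry date, since entry is at the first birthday). Vaccination status would add a third dimension to the cell key in the same way.

```python
from collections import defaultdict
from datetime import date, timedelta


def age_in_years(birth, day):
    """Completed years of age on a given day."""
    had_birthday = (day.month, day.day) >= (birth.month, birth.day)
    return day.year - birth.year - (0 if had_birthday else 1)


def lexis_cells(birth, entry, exit_):
    """Person-years per (age, calendar year) cell for follow-up [entry, exit_)."""
    pyrs = defaultdict(float)
    day = entry
    while day < exit_:
        # each day under risk adds 1/365.25 year to the cell it falls in
        pyrs[(age_in_years(birth, day), day.year)] += 1 / 365.25
        day += timedelta(days=1)
    return dict(pyrs)


cells = lexis_cells(
    birth=date(1994, 9, 11),  # assumed: entry at the first birthday
    entry=date(1995, 9, 11),
    exit_=date(1997, 10, 17),
)
for (age, year), years in sorted(cells.items()):
    print(age, year, round(years, 2))
```

Summing such cells over all children gives the pyrs column of the grouped table; the cases are counted in the cell where each child's follow-up ends with autism.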
4.4 Relative rates by age and period
age  period  vacn.  cases  pyrs (1)  Rate (2)  RR
…
(1) in thousands, (2) per … yr
5. Poisson Regression
5.1 Regression analysis of the rates
log(rate) = const + a I(vacn) + b I(5-9) + c I(96-00)
  I(vacn) = 1 if vaccinated, 0 otherwise
  I(5-9) = 1 if aged 5-9 years, 0 otherwise
  I(96-00) = 1 in period 1996-2000, 0 otherwise
For non-vaccinated children aged 6 in 1997, log(rate) is modelled by const + b + c.
5.2 Log-linear Poisson regression (I)
log(rate) = log((nr of cases)/pyrs) = log(nr of cases) - log(pyrs)
i.e. log(nr of cases) = log(pyrs) + log(rate)
With
log(rate) = const + a I(vacn) + b I(5-9) + c I(96-00)
this gives
log(nr of cases) = log(pyrs) + const + a I(vacn) + b I(5-9) + c I(96-00)
5.3 Log-linear Poisson regression (II)
log(nr of cases) = log(pyrs) + const + a I(vacn) + b I(5-9) + c I(96-00)
The number of cases is Poisson-distributed.
The log of the number of cases is modelled with a linear function.
log(pyrs) is considered known for every cell and is called an offset.
5.4 Parameters and rate ratios
log(rate) = k + a I(vacn) + b I(5-9) + c I(96-00)
rate = exp(k + a I(vacn) + b I(5-9) + c I(96-00))
     = exp(k) exp(a I(vacn)) exp(b I(5-9)) exp(c I(96-00))
For children aged 5-9 yr in the period 1996-2000:
RR(+vacn vs -vacn) = rate(+vacn) / rate(-vacn)
                   = (exp(k) exp(a) exp(b) exp(c)) / (exp(k) exp(b) exp(c))
                   = exp(a)
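The cancellation above can be checked numerically with a small Python sketch. The parameter values are hypothetical and chosen only for illustration, and the function name is an assumption.

```python
import math


def log_rate(k, a, b, c, vaccinated, age, year):
    """log(rate) = k + a I(vacn) + b I(5-9) + c I(96-00)."""
    i_vacn = 1 if vaccinated else 0
    i_age = 1 if 5 <= age <= 9 else 0
    i_period = 1 if 1996 <= year <= 2000 else 0
    return k + a * i_vacn + b * i_age + c * i_period


# Hypothetical parameter values, not estimates from the study
k, a, b, c = -9.0, -0.1, 0.5, 0.3

unvacc = log_rate(k, a, b, c, vaccinated=False, age=6, year=1997)
vacc = log_rate(k, a, b, c, vaccinated=True, age=6, year=1997)

print(math.isclose(unvacc, k + b + c))                  # True
print(math.isclose(math.exp(vacc - unvacc), math.exp(a)))  # True
```

The same ratio is obtained for any other age and period, because the model contains no interaction between vaccination and the other terms.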
5.5 A more complicated model
log(rate) = k + a I(vacn)
          + b1 I(1yr) + b2 I(2yr) + b3 I(3yr) + b4 I(4yr) + b5 I(5yr) + b6 I(6yr) + b7 I(7yr) + b8 I(8yr)
          + c1 I(92-93) + c2 I(94) + c3 I(95) + c4 I(96) + c5 I(97) + c6 I(98) + c7 I(99)
with non-vacn as the vaccination reference, age = 9 yr as the age reference, and period = 2000 as the period reference.
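5.6 SAS-dataset to Poisson regression
data mmrdata;
  /* one record per age x period x vaccination group */
  input age period vacn cases pyrs;
  /* log person-years, used as the offset */
  logpyrs = log(pyrs);
  datalines;
…
;
run;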
5.7 SAS-procedure to Poisson regression
proc genmod data=mmrdata;
  class age period;
  model cases = age period vacn / dist=poisson link=log offset=logpyrs;
run;
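To show what a log-linear Poisson fit with an offset does in the simplest case, here is a minimal Python sketch with only an intercept and the vaccination term, fitted by Newton-Raphson. It is not the algorithm or output of the SAS procedure; the function name and the counts are hypothetical. For this two-group model the fitted exp(a) equals the crude rate ratio.

```python
import math


def fit_two_group_poisson(cases, pyrs, vacn, iterations=25):
    """Fit log(E[cases]) = log(pyrs) + k + a*vacn; return (k, a, se_a)."""
    k = math.log(sum(cases) / sum(pyrs))  # start at the overall log rate
    a = 0.0

    def information(k, a):
        # expected counts and the 2x2 information matrix entries
        mu = [p * math.exp(k + a * x) for p, x in zip(pyrs, vacn)]
        i_kk = sum(mu)
        i_ka = sum(x * m for x, m in zip(vacn, mu))
        i_aa = sum(x * x * m for x, m in zip(vacn, mu))
        return mu, i_kk, i_ka, i_aa

    for _ in range(iterations):
        mu, i_kk, i_ka, i_aa = information(k, a)
        u_k = sum(y - m for y, m in zip(cases, mu))             # score for k
        u_a = sum(x * (y - m) for x, y, m in zip(vacn, cases, mu))  # score for a
        det = i_kk * i_aa - i_ka ** 2
        k += (i_aa * u_k - i_ka * u_a) / det                    # Newton step
        a += (i_kk * u_a - i_ka * u_k) / det

    _, i_kk, i_ka, i_aa = information(k, a)
    se_a = math.sqrt(i_kk / (i_kk * i_aa - i_ka ** 2))          # from inverse information
    return k, a, se_a


# Hypothetical cells: -vacn 10 cases / 1000 pyrs, +vacn 12 cases / 800 pyrs
k_hat, a_hat, se_a = fit_two_group_poisson(
    cases=[10, 12], pyrs=[1000.0, 800.0], vacn=[0, 1]
)
print(round(math.exp(a_hat), 3), round(se_a, 3))
```

With more covariates (age and period classes) the same iteration applies with a larger information matrix, which is where a statistics package becomes the practical choice.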
5.8 SAS-output
Parameter   DF  Estimate  Std Err  ChiSquare  Pr>Chi
INTERCEPT   …
AGE         … (9 rows, one per age level)
PERIOD      … (8 rows, one per period level)
VACN        … (2 rows, one per vaccination level)
5.9 Confidence-interval
RR(+vacn vs -vacn) = exp(estimate) = 0.89
95% confidence interval:
  RR lower = exp(estimate - 1.96 x StdErr)
  RR upper = exp(estimate + 1.96 x StdErr)
RR(+vacn vs -vacn) = 0.89 (… - …)
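The back-transformation from the log scale can be sketched in Python as follows. The estimate and standard error used here are hypothetical, not the values from the SAS output, and 1.96 is the usual normal quantile for a 95% interval.

```python
import math


def rate_ratio_ci(estimate, std_err, z=1.96):
    """Return (RR, lower, upper) from a log rate ratio and its standard error."""
    rr = math.exp(estimate)
    lower = math.exp(estimate - z * std_err)
    upper = math.exp(estimate + z * std_err)
    return rr, lower, upper


# Hypothetical log rate ratio and standard error
rr, lower, upper = rate_ratio_ci(estimate=0.25, std_err=0.10)
print(f"{rr:.2f} ({lower:.2f} - {upper:.2f})")
```

The interval is symmetric around the estimate on the log scale, so on the rate ratio scale the product of the limits equals RR squared.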
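X.X Time since vaccination
One lifeline: 0.8 years unvaccinated, then 5.9 years vaccinated.
Split by vaccination status:
  - vacn   0.8 yr
  + vacn   5.9 yr
Split by time since vaccination (V):
  - vacn     0.8 yr
  V: <1 yr   1 yr
  V: 1-2 yr  2 yr
  V: 3-4 yr  2 yr
  V: 5+ yr   0.9 yr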
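The split of vaccinated time into time-since-vaccination bands can be sketched in Python as below. The band limits are an assumption read off the slide labels (for example, "1-2 yr" is taken as one to just under three years since vaccination), and the function name is hypothetical.

```python
# (label, start, end) in years since vaccination; limits are assumed from the labels
BANDS = [
    ("<1 yr", 0.0, 1.0),
    ("1-2 yr", 1.0, 3.0),
    ("3-4 yr", 3.0, 5.0),
    ("5+ yr", 5.0, float("inf")),
]


def split_by_time_since_vaccination(vaccinated_years):
    """Distribute vaccinated follow-up time over the bands."""
    contribution = {}
    for label, start, end in BANDS:
        # part of [0, vaccinated_years) that overlaps [start, end)
        contribution[label] = max(0.0, min(vaccinated_years, end) - start)
    return contribution


split = split_by_time_since_vaccination(5.9)
print({label: round(years, 1) for label, years in split.items()})
# {'<1 yr': 1.0, '1-2 yr': 2.0, '3-4 yr': 2.0, '5+ yr': 0.9}
```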
6. Cox Regression
6.1 Cox regression
log(rate) = k + a I(vacn)
          + b1 I(1yr) + b2 I(2yr) + b3 I(3yr) + b4 I(4yr) + b5 I(5yr) + b6 I(6yr) + b7 I(7yr) + b8 I(8yr)
          + c1 I(92-93) + c2 I(94) + c3 I(95) + c4 I(96) + c5 I(97) + c6 I(98) + c7 I(99)
Cox regression (age)
6.3 Lifelines
6.4 Data to Cox-regression
data coxdata;
  /* entry, vaccination, autism, and death/emigration dates; "." = missing */
  input (intime vactime auttime othtime) (:date7.);
  datalines;
11sep95 04apr97 17oct97 .
…
;
run;
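6.5 Cox SAS-program
data coxdata2;
  set coxdata;
  /* exit at autism, death/emigration, or end of study, whichever comes first */
  outtime = min(auttime, othtime, "31dec2000"d);
  time = outtime - intime;
  if auttime = outtime then status = 1;
  else status = 0;
run;

proc phreg data=coxdata2;
  model time*status(0) = vacn;
  /* time-dependent covariate: unvaccinated until the vaccination date */
  if (vactime = . or time < (vactime - intime)) then vacn = 0;
  else vacn = 1;
run;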
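The derived variables in the SAS program can be traced by hand with a small Python sketch that mirrors the data step and the time-dependent vaccination indicator. It does not fit a Cox model; the function names are hypothetical, and the example dates are those of the first child in the datalines.

```python
from datetime import date

END_OF_STUDY = date(2000, 12, 31)


def follow_up(intime, auttime=None, othtime=None):
    """Days from entry to exit, and status (1 = autism, 0 = censored)."""
    # like SAS min(), ignore missing dates
    candidates = [d for d in (auttime, othtime, END_OF_STUDY) if d is not None]
    outtime = min(candidates)
    time = (outtime - intime).days
    status = 1 if auttime == outtime else 0
    return time, status


def vacn_at(t, intime, vactime=None):
    """Vaccination indicator at t days after entry."""
    if vactime is None or t < (vactime - intime).days:
        return 0
    return 1


intime, vactime, auttime = date(1995, 9, 11), date(1997, 4, 4), date(1997, 10, 17)
print(follow_up(intime, auttime))         # (767, 1)
print(vacn_at(570, intime, vactime))      # 0
print(vacn_at(571, intime, vactime))      # 1
```

Because time is measured from study entry at the first birthday, the underlying time scale of the Cox model is age.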