1. HSS4303B – Intro to Epidemiology
Mar 8, 2010 – Matched Studies
2. Summary from Last Time
Case control study design
- Sources of cases and controls
- Problems in selection of controls
- Practical and conceptual problems
- Matching
- Recall problems
  - Limitation in recall
  - Recall bias
- Multiple controls
- Types of case control studies
- Nested case control study
- Prevalence study
3. Summary of related studies
Table: Finding Your Way in the Terminology Jungle
- Case-control study = Retrospective study
- Cohort study = Longitudinal study = Prospective study
- Prospective cohort study = Concurrent cohort study = Concurrent prospective study
- Retrospective cohort study = Historical cohort study = Nonconcurrent prospective study
- Randomized trial = Experimental study
- Cross-sectional study = Prevalence survey
7. Prospective vs Retrospective Cohort
Prospective study
- Identify a population and follow them prospectively until events develop
- Also called: concurrent cohort study, longitudinal study
8. Cohort study: prospective design
Pitfalls of the study:
- Loss of subjects
- Loss of investigators
- Lifestyle changes in the subjects
9. Prospective vs Retrospective Cohort
Retrospective study
- Identify a population from the past, determine their exposure status retrospectively from historical records, and ascertain the events that have occurred since then
- Also called: nonconcurrent prospective study, historical cohort study
10. Cohort study: retrospective study
Pitfalls of the study:
- Availability of records
- Quality of records
- Recall bias
11. Prospective and retrospective studies
- The designs of prospective and retrospective studies are similar
- Exposed and unexposed populations are compared for the events
- The difference is in the time frame:
  - Prospective study: forward time frame
  - Retrospective study: historical records covering a period similar to that of a prospective study
13. Potential biases in cohort studies
- Bias in assessment of the outcome: knowledge of exposure status can bias the assessment of outcome status
- Information bias: difference in available information for the exposed and unexposed
- Biases from non-response and losses to follow-up: attrition creates study power problems
- Analytic biases: subjectivity at the time of analyses
14. Table 8–2. Comparison of the Attributes of Retrospective and Prospective Cohort Studies.

Attribute                 Retrospective Approach        Prospective Approach
Information               Less complete and accurate    More complete and accurate
Discontinued exposures    Useful                        Not useful
Emerging new exposures    Not useful                    Useful
Expense                   Less costly                   More costly
Completion time           Shorter                       Longer
15. Advantages and Disadvantages of Cohort Studies.
Advantages
- Direct calculation of risk ratio (relative risk)
- May yield information on the incidence of disease
- Clear temporal relationship between exposure and disease
- Particularly efficient for study of rare exposures
- Can yield information on multiple exposures
- Can yield information on multiple outcomes of a particular exposure
- Minimizes bias
- Strongest observational design for establishing cause and effect relationship
Disadvantages
- Time consuming
- Often requires a large sample size
- Expensive
- Not efficient for the study of rare diseases
- Losses to follow-up may diminish validity
- Changes over time in diagnostic methods may lead to biased results
16. Review of Odds Ratios (Case-Control Study)
Compute the odds ratio of this dataset:

              Cases   Controls
Exposed         6        3
Nonexposed      4        7
Total          10       10

Odds ratio = (6 x 7) / (3 x 4) = 3.5
17. Case control study of 10 unmatched subjects: summary
Figure 11-8. A case-control study of 10 cases and 10 unmatched controls.

              Cases   Controls
Exposed         6        3
Nonexposed      4        7
Total          10       10
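The cross-product calculation for the unmatched table can be sketched in a few lines of Python. This is only an illustration: the function name `odds_ratio` and the cell labels (a = exposed cases, b = exposed controls, c = nonexposed cases, d = nonexposed controls) are assumptions made for this sketch, not part of the slides.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for an unmatched 2x2 table.

    Assumed cell labels (hypothetical, chosen for this sketch):
    a = exposed cases,    b = exposed controls,
    c = nonexposed cases, d = nonexposed controls.
    """
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)


# Counts from the 10 cases and 10 unmatched controls in the table above
print(odds_ratio(6, 3, 4, 7))  # 3.5
```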
18. But What if the Data Are Matched?
Why do we match again?
19. Matched case control study
- Cases are matched with the controls on specific variables
- Cases and controls are analyzed in pairs rather than as individual subjects
Concordant pairs
1. Pairs in which both the case and the control were exposed
2. Pairs in which neither the case nor the control was exposed
Discordant pairs
3. Pairs in which the case was exposed but not the control
4. Pairs in which the control was exposed and not the case
20. Data
[Table: pairs 1 to 4, one row per pair; columns "Outcome – Yes (cases)" and "Outcome – No (controls)"; each cell marked Exposed or Not exposed]
Eg,
- outcome = getting the runs (the cases) vs not getting the runs (controls)
- exposure = did you attend the picnic and eat the egg salad?
- confounder = lactose intolerance (?)
21. Data
[Table: pairs 1 to 4, one row per pair; columns "Outcome – Yes (cases)" and "Outcome – No (controls)"; each cell marked Exposed or Not exposed]
Assume this is an unmatched study. How does the contingency table look?
26. Concordant and discordant pairs

                         Control
                    Exposed   Not Exposed
Case  Exposed          a           b
      Not Exposed      c           d

- a pairs: both the case and the control were exposed
- b pairs: the case was exposed but not the control
- c pairs: the case was not exposed but the control was exposed
- d pairs: neither the case nor the control was exposed
- a and d pairs are concordant pairs
- b and c pairs are discordant pairs
29. Individual matching (1:1)
Echovirus meningitis outbreak, Germany, 2001
- Was swimming in pond “A” a risk factor?
- Case control study with each case matched to one control
[Figure: concordant pairs and discordant pairs]
Source: A Hauri, RKI Berlin
30. Odds ratio for matched pairs
The odds ratio for matched pairs is:
- The ratio of the discordant pairs
- The ratio of the number of pairs in which the case was exposed and the control was not, to the number of pairs in which the control was exposed and the case was not: b / c
- The ratio of the number of pairs that support the hypothesis of an association to the number of pairs that negate the hypothesis of an association
31. Matched cases and controls 2 x 2 table

                         Control
                    Exposed   Not Exposed
Case  Exposed          2           4
      Not Exposed      1           3

- Concordant pairs: 2 pairs (case exposed, control exposed) and 3 pairs (case not exposed, control not exposed)
- Discordant pairs: 4 pairs (case exposed, control not exposed) and 1 pair (case not exposed, control exposed)
- Odds ratio = b / c = 4 / 1 = 4
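As a rough sketch of the matched-pairs analysis, the Python below sorts pairs into the a, b, c, d cells and takes the ratio of the discordant pairs. The names `classify_pairs` and `matched_odds_ratio`, and the representation of each pair as a `(case_exposed, control_exposed)` tuple, are assumptions made for illustration; the pair counts are the ones in the 2 x 2 table above.

```python
from collections import Counter


def classify_pairs(pairs):
    """Count matched pairs by exposure pattern.

    Each pair is a (case_exposed, control_exposed) tuple of booleans
    (a representation assumed for this sketch).
    """
    counts = Counter(pairs)
    return {
        "a": counts[(True, True)],    # concordant: both exposed
        "b": counts[(True, False)],   # discordant: only the case exposed
        "c": counts[(False, True)],   # discordant: only the control exposed
        "d": counts[(False, False)],  # concordant: neither exposed
    }


def matched_odds_ratio(b, c):
    """Matched-pairs odds ratio: ratio of the discordant pairs, b / c."""
    if c == 0:
        raise ValueError("matched odds ratio is undefined when c is zero")
    return b / c


# Ten pairs laid out to reproduce the counts in the table above
pairs = (
    [(True, True)] * 2
    + [(True, False)] * 4
    + [(False, True)] * 1
    + [(False, False)] * 3
)
cells = classify_pairs(pairs)
print(cells)                                       # {'a': 2, 'b': 4, 'c': 1, 'd': 3}
print(matched_odds_ratio(cells["b"], cells["c"]))  # 4.0
```

Note that the concordant pairs (a and d) never enter the calculation; only the discordant pairs carry information about the association.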
33. Remember this example?
[Figure: concordant pairs and discordant pairs]
Source: A Hauri, RKI Berlin
34. Individual matching (1:1): Analysis
Echovirus meningitis outbreak, Germany, 2001
- Was swimming in pond “A” a risk factor?
- Case control study with each case matched to one control
35. What Else Can We Do With These Data?
Remember the Chi-Square test?
36. Chi-square
- Chi square is a non-parametric test of statistical significance for bivariate tabular analysis
- It lets you know the degree of confidence you can have in rejecting a hypothesis
- It provides information on whether or not two different samples are different enough in some characteristic or aspect of their behaviour that the difference is unlikely to be due to chance alone
37. Chi Square
There are actually all sorts of chi-square tests out there:
- Pearson’s  <- We’ll be using this one
- Yates’
- Mantel-Haenszel
- Portmanteau
- Fisher’s Exact
38. Also need to compute something called “degrees of freedom”
39. Chi-square calculation

                          Variable 2
Variable 1      Data type 1   Data type 2   Totals
Category 1           a             b        a + b
Category 2           c             d        c + d
Total              a + c         b + d      a + b + c + d = N

Chi square: $\chi^2 = \frac{N(ad - bc)^2}{(a+b)(c+d)(b+d)(a+c)}$
The degrees of freedom = (number of columns minus one) x (number of rows minus one), not counting the totals for rows or columns.
For our data this gives (2-1) x (2-1) = 1.
40. Chi-square calculations
Number of animals that survived the treatment

               Dead   Alive   Total
Treated         36     14       50
Not treated     30     25       55
Total           66     39      105

Odds ratio = (36 x 25) / (14 x 30) = 2.14
Chi square: $\chi^2 = \frac{105\,[(36)(25) - (14)(30)]^2}{(50)(55)(39)(66)} = 3.418$
DOF = (2-1) x (2-1) = 1
Now what do we do with this?
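The arithmetic on this slide can be checked with a short Python sketch of the 2 x 2 formula N(ad-bc)^2 / (a+b)(c+d)(b+d)(a+c). The function names `chi_square_2x2` and `degrees_of_freedom` are hypothetical, chosen for this illustration only.

```python
def chi_square_2x2(a, b, c, d):
    """Chi square for a 2x2 table without continuity correction:
    N(ad - bc)^2 / [(a+b)(c+d)(b+d)(a+c)].
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (b + d) * (a + c)
    if denominator == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denominator


def degrees_of_freedom(n_rows, n_cols):
    """(rows - 1) x (columns - 1), not counting the totals."""
    return (n_rows - 1) * (n_cols - 1)


# Treated: 36 dead, 14 alive; not treated: 30 dead, 25 alive
chi2 = chi_square_2x2(36, 14, 30, 25)
print(round(chi2, 3))            # 3.418
print(degrees_of_freedom(2, 2))  # 1
```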
41. Degrees of freedom and chi square table

        Probability
Df    0.5      0.10     0.05     0.02     0.01     0.001
1     0.455    2.706    3.841    5.412    6.635    10.827
2     1.386    4.605    5.991    7.824    9.210    13.815
3     2.366    6.251    7.815    9.837    11.345   16.268
4     3.357    7.779    9.488    11.668   13.277   18.465
5     4.351    9.236    11.070   13.388   15.086   20.517

Using the chi square table
- With 1 degree of freedom, our statistic (3.418) falls between 2.706 and 3.841, so the corresponding probability is 0.05 < P < 0.10. This is above the conventionally accepted significance level of 0.05 or 5%, so the null hypothesis that the two distributions are the same cannot be rejected.
- In other words, only when the computed chi-square statistic exceeds the critical value in the table for a 0.05 probability level can we reject the null hypothesis of equal distributions.
- Since our chi-square statistic (3.418) did not exceed the critical value for the 0.05 probability level (3.841), we fail to reject the null hypothesis that the survival of the animals is independent of drug treatment.
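The table lookup can be cross-checked numerically. The sketch below relies on a standard identity that holds only for 1 degree of freedom: the upper-tail probability of a chi-square statistic x equals erfc(sqrt(x / 2)). The function name `p_value_df1` and the constant name are assumptions made for this illustration; for more than 1 degree of freedom a statistics library or the printed table would be needed instead.

```python
import math


def p_value_df1(chi2):
    """Upper-tail probability for a chi-square statistic with
    exactly 1 degree of freedom: P = erfc(sqrt(chi2 / 2)).
    """
    if chi2 < 0:
        raise ValueError("chi-square statistic cannot be negative")
    return math.erfc(math.sqrt(chi2 / 2))


# Critical value from the table row Df = 1, column 0.05
CRITICAL_0_05_DF1 = 3.841

chi2 = 3.418  # statistic from the animal survival example
p = p_value_df1(chi2)
reject = chi2 > CRITICAL_0_05_DF1

print(0.05 < p < 0.10)  # True: agrees with the table lookup
print(reject)           # False: we fail to reject the null hypothesis
```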
42. p-value
- The p-value is the probability of obtaining a result at least as extreme as the one in your sample, given the assumption that the null hypothesis is true.
- A p-value of .05, for example, indicates that you would have only a 5% chance of seeing a difference this large or larger if the null hypothesis was actually true.
- A p-value close to zero signals that the data are hard to reconcile with the null hypothesis, and typically that a difference is very likely to exist.
- Large p-values closer to 1 imply that there is no detectable difference for the sample size used.
- A p-value of 0.05 is a typical threshold used to evaluate the null hypothesis.
43. p-value
So what does a p-value of 0.10 mean?
At the 0.05 threshold, we “fail to reject the null hypothesis”
44. What is the null hyp that we are testing?
- In cohort studies, the chi-square test tells us whether to reject or fail to reject the null hypothesis that RR = 1
- In case-control studies, the chi-square test tells us whether to reject or fail to reject the null hypothesis that OR = 1
- Pearson’s chi-square is NOT appropriate for testing the null hypothesis in matched-pair data, because the members of a matched pair are related
- For that we use something called McNemar’s test, which we will not cover in this class
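45. Remember the Picnic Data
[Table: pairs 1 to 4, one row per pair; columns "Outcome – Yes (cases)" and "Outcome – No (controls)"; each cell marked Exposed or Not exposed]
Discordant pairs: b = 1 (case exposed, control not exposed), c = 2 (control exposed, case not exposed)
Matched odds ratio = b / c = 1 / 2 = 0.5
Pretend it’s unmatched and construct the contingency table…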
46. Data
[Table: pairs 1 to 4, one row per pair; columns "Outcome – Yes (cases)" and "Outcome – No (controls)"; each cell marked Exposed or Not exposed]

                           Outcome (cases)   No outcome (controls)   Totals
Exposed (picnic)                  2                    3                5
Not exposed (no picnic)           2                    1                3
Totals                            4                    4                8

Odds ratio? (2 x 1) / (3 x 2) = 0.33
Can you compute a chi-square for this?
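To see why the matched and unmatched analyses give different answers for the same four pairs, here is a minimal Python sketch. The pair-by-pair assignment in `pairs` is an assumption: it is one arrangement that reproduces the counts used in the calculations above (b = 1, c = 2, and an unmatched odds ratio of (2 x 1) / (3 x 2)), not the original slide's row order. The function name `unmatched_table` is likewise hypothetical.

```python
def unmatched_table(pairs):
    """Break matched pairs apart and tabulate exposure by outcome.

    Each pair is (case_exposed, control_exposed). Returns
    (exposed cases, exposed controls, nonexposed cases, nonexposed controls).
    """
    n = len(pairs)
    exposed_cases = sum(1 for case_exp, _ in pairs if case_exp)
    exposed_controls = sum(1 for _, ctrl_exp in pairs if ctrl_exp)
    return exposed_cases, exposed_controls, n - exposed_cases, n - exposed_controls


# Assumed arrangement of the four picnic pairs (illustrative only)
pairs = [(True, True), (True, False), (False, True), (False, True)]

# Unmatched analysis: ignore the pairing and use the cross-product ratio
table = unmatched_table(pairs)
a, b, c, d = table
unmatched_or = (a * d) / (b * c)

# Matched analysis: ratio of the discordant pairs
case_only = sum(1 for case_exp, ctrl_exp in pairs if case_exp and not ctrl_exp)
control_only = sum(1 for case_exp, ctrl_exp in pairs if ctrl_exp and not case_exp)
matched_or = case_only / control_only

print(table)                   # (2, 3, 2, 1)
print(round(unmatched_or, 2))  # 0.33
print(matched_or)              # 0.5
print(min(table) < 5)          # True: small cells, so chi square is unreliable here
```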
47. Caveat to Pearson’s Chi Square
- Typically, it does not work well if any cell has a count below 5
- In that case, you are better off using Fisher’s Exact Test or some other similar test
- We will not be doing that in this class
48. Summary
Chi square: $\chi^2 = \frac{(a+b+c+d)(ad - bc)^2}{(a+b)(c+d)(b+d)(a+c)}$
The degrees of freedom equal (number of columns minus one) x (number of rows minus one), not counting the totals for rows or columns.
49. If you’re lazy…
Lots of online OR, RR and chi-square calculators
Eg,
50. Homework
12 women with uterine cancer and 12 without were asked if they’d ever used supplemental estrogen. Each woman with cancer was matched by race, weight and parity to a woman without cancer:
[Table: pairs 1 to 12, one row per pair; columns "Women with cancer" and "Women without cancer"; each cell marked Estrogen user or Estrogen nonuser]
51. Homework
(The estimated RR is the same as the OR.)
1. What is the estimated relative risk of cancer when analyzing this study as a matched-pairs study?
   Answer: 3.00
2. What is the estimated relative risk of cancer when analyzing this study as an unmatched study?
   Answer: 4.00
3. What is the chi square statistic of the (unmatched) relationship between cancer and estrogen intake?
   Answer: 2.67
4. What is the null hypothesis being tested by the chi-square test?
   Answer: Null hypothesis = “the OR = 1”
5. What does the p-value of the statistic tell you about whether to reject or accept the null hypothesis?
   Answer: With 1 degree of freedom, 2.67 lies between 0.455 and 2.706 in the table, so 0.10 < p < 0.5; therefore we fail to reject the null hypothesis. However, some cell values are below 5, so Pearson’s chi-square is not reliable for this sample.