Presentation on theme: "LECTURE 3 – June Cohort Studies, Selection Bias Survival analysis"— Presentation transcript:
1 LECTURE 3 – June 9, 2006. Cohort Studies, Selection Bias, Survival Analysis. Dr. Dick Menzies
2 Cohort Studies – General. Prospective study: incidence of new disease in persons who start without disease. Follow-up period: weeks, months, or years. One or more diseases can be measured. Exposures are measured at the start or on an ongoing basis, and multiple exposures can be measured. Compare incidence in exposed vs unexposed groups within the population, per unit of time.
3 Advantages of cohort over case-control or cross-sectional designs. KEY: exposure measurement is made before disease occurs. Exposure measurement is more accurate: prospective, and often repeated. Eliminates bias in measurement of exposures: recall bias of patients, or observer bias in exposure assessment made with knowledge of disease status.
4 Experimental vs cohort studies. Experimental studies are a form of cohort study. Same: persons are free of disease at outset. But: exposure is RANDOMLY ASSIGNED to some and not to others. Same: outcomes are measured after exposure. Cohort study: exposures are NOT assigned, but occur naturally, or are chosen purposely by subjects, or by their MDs, etc.
5 Advantages of cohort studies over experimental. Ideal to study natural history, course of disease, prognostic factors. Etiologic research for exposures that cannot be given experimentally, for ethical reasons (smoking, asbestos, air pollution). Interventions not feasible for randomization (diagnostic tests, complex care management). Some outcomes are not well measured in trials (compliance by patients and MDs).
6 Advantages of cohort studies over experimental. Total population studied: children, elderly, pregnancy, mentally incompetent. Full spectrum of illness, from patients in ICU to minimal forms of disease; these are often excluded in RCTs, especially pharma trials. Findings are more likely to be applicable in the real world. Adverse events are often more accurately measured. Population-based estimates of exposure effects. BUT you MUST include as full a spectrum of patients as possible (no exclusions in observational studies).
7 Disadvantages. Selection bias: persons who get exposed are not the same as the unexposed (surgery: who is 'operable' vs 'inoperable'). Exposures that seem the same are not. Potential bias in measuring. Drop-outs reduce power and may bias results (a lot). Outcome assessment can be biased.
8 Cohort Designs. Prospective: subjects without disease are followed to determine incidence of diseases. Exposures are measured at baseline and/or concurrently. Disease is measured during follow-up. Retrospective: subjects are first identified based on past exposures (Hiroshima survivors, a work-force). Outcomes may then be ascertained directly, or may also have already occurred. Key: exposure is well defined AND occurred well before disease (useful for diseases like cancer).
9 Cohort Populations. General populations, with no special exposures. Framingham study: a true general population, with all persons in the community invited. Proxy general population: nurses, military, company. Exposures studied are those of the general population: diet, exercise, smoking, alcohol. Exposure-defined cohort: a work-force to study occupational exposures, or a group of patients who received a certain therapy.
10 Cohorts of patients. Clinical cohorts: patients with a given condition. A case series can be a form of cohort study, but there must be differences in 'exposure' (different types, severity, causes). Potential problems in cohort studies with patients: referral bias (only the sickest or rarest cases); lead-time bias (better facilities = earlier Dx); multi-serial cohorts (a cohort that starts with all diabetics in 2004 mixes new and old cases, who are very different patients).
11 Open versus Closed Cohorts. An open cohort, or dynamic cohort, is one where people can enter or leave. Examples: a workforce study that is ongoing; a city or other geographic location. A closed cohort is one where all persons in the cohort are defined at entry: no one enters, and members can only exit. E.g. McGill medical school class of 2004.
12 Selection Bias. Definition: selection bias occurs when there is a distortion in the estimate of effect (association) because the study or sample population is not truly representative of the underlying population in terms of the distribution of exposures and/or outcomes. Other terms: referral bias, volunteer bias, healthy worker effect, susceptibility bias, drop-out bias. How/where in a study can this occur?
13 GROUPS, LOSSES, AND REASONS FOR LOSSES
INTENDED POPULATION to AVAILABLE GROUP. Loss: NOT AVAILABLE (treated at other hospitals or by other doctors).
AVAILABLE GROUP to CANDIDATE GROUP. Loss: NOT CANDIDATES (not identified or accessible).
CANDIDATE GROUP to ELIGIBLE GROUP. Loss: NOT ELIGIBLE (did not fulfill diagnostic criteria).
ELIGIBLE GROUP to QUALIFIED GROUP. Loss: EXCLUDED (superimposed condition of severity, co-morbidity, co-medication, or non-compliance).
QUALIFIED GROUP to ADMITTED GROUP. Loss: NON-RECEPTIVE (refused participation or acceptance of assigned maneuver).
Figure: Diagram showing successive transfers from the intended population to the group admitted to a study of therapy.
14 Obtaining a representative sample. In a representative sample we hope for a sample that shows us the true underlying distribution of exposure and disease.
Truth: distribution of exposure and disease in the source population
               Exposed   Not Exposed
Diseased          A           B
Not Diseased      C           D
Odds Ratio = (A/B) / (C/D) = (A x D) / (B x C)
15 Un-biased Sampling
               Exposed   Not Exposed
Diseased       P1 x A      P2 x B
Not Diseased   P3 x C      P4 x D
Odds Ratio = [(P1 x P4) / (P2 x P3)] x [(A x D) / (B x C)]
IF (P1 x P4) / (P2 x P3) = 1 THEN OR = (A x D) / (B x C) x 1 = Truth!
16 Biased Sampling
               Exposed   Not Exposed
Diseased       P1 x A      P2 x B
Not Diseased   P3 x C      P4 x D
If we sample all of A (P1 = 1) but only half of B (P2 = 0.5), and 1/3 of C and D (P3 = 0.33, P4 = 0.33):
(P1 x P4) / (P2 x P3) = (1 x 0.33) / (0.5 x 0.33) = 2
Odds Ratio = 2 x (A x D) / (B x C)
IF (P1 x P4) / (P2 x P3) = 2 THEN OR estimated = 2 x OR true
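The distortion factor on this slide can be checked numerically. The sketch below is only an illustration: the cell counts A, B, C, D are hypothetical (chosen so the true odds ratio is 1), while the sampling fractions are the ones given on the slide.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio of a 2x2 table: (A x D) / (B x C)."""
    return (a * d) / (b * c)


def sampling_bias_factor(p1, p2, p3, p4):
    """Factor by which unequal sampling distorts the odds ratio: (P1 x P4) / (P2 x P3)."""
    return (p1 * p4) / (p2 * p3)


# Hypothetical source population with no true association (OR = 1).
A, B, C, D = 200, 800, 2000, 8000
true_or = odds_ratio(A, B, C, D)

# Sampling fractions from the slide: all of A, half of B, one third of C and D.
p1, p2, p3, p4 = 1.0, 0.5, 1 / 3, 1 / 3
sampled_or = odds_ratio(p1 * A, p2 * B, p3 * C, p4 * D)
factor = sampling_bias_factor(p1, p2, p3, p4)

print(f"true OR = {true_or:.2f}, sampled OR = {sampled_or:.2f}, factor = {factor:.2f}")
```

With these assumed counts the sampled odds ratio is twice the true one, which matches the factor of 2 on the slide.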
17 Example – Biased sampling. We are planning a case control study of spicy foods and peptic ulcer disease. Cases = endoscopy-proven peptic ulcer disease. Controls = elective inguinal hernia repair at the same hospital. The truth: no relationship, i.e. the odds ratio = 1. The problem: physicians at this hospital strongly believe spicy food is an important risk factor for peptic ulcer disease. Therefore they tend to refer patients for endoscopy more often if they have a diet of spicy foods.
19 Example: biased sampling. So, 100% of patients with peptic ulcer disease AND a history of spicy foods have endoscopy. But only half of those with peptic ulcer but WITHOUT a history of spicy food are in fact diagnosed (they do not have endoscopy, so they are missed). The estimated association will be twice what is correct.
20 To achieve Un-biased Sampling. The easiest way to achieve un-biased sampling is P1 = P2 = P3 = P4. This means the proportion sampled from each group is the same, e.g., 10% are sampled from each of the groups. However, P1 being higher than P2 can be okay as long as P3 is higher than P4 by the same factor, so that (P1 x P4) / (P2 x P3) still equals 1.
21 Volunteer Bias. Participants in a study are different from refuseniks. Example: mortality of non-participants in the Framingham study. Subjects with the exposure and the outcome are more (or less) likely to participate. E.g. HIV infection and homosexuality in Africa; disease and occupational exposures, particularly for self-reported exposures and compensable illnesses.
22 Susceptibility bias. Persons allocated to one form of treatment, or who self-select to certain exposures, are more or less susceptible to developing the health outcomes of interest. E.g. cancer patients who have surgery vs medical or radiotherapy only. Surgical patients often appear to do better.
23 Healthy worker effect. An important bias found in work-force studies. Reflects medical screening (military, mining) or the physical requirements of the job. Results in better health status initially than the general population or certain control populations. Strongly affects results in cross-sectional studies. Reduces risk or delays occurrence of health outcomes of interest. Also occurs in smokers ("healthy smoker effect"): lung function in adolescent smokers > non-smokers.
25 Selection Bias in Cohort Studies – Dropouts. Losses to follow-up occur in all cohort studies. They reduce power and dilute results. Problematic if there are more drop-outs in one exposure group. REALLY important if drop-out is due to development of disease.
26 Selection Bias in Cohort Studies – Dropouts. Example: study of incidence of diabetes in obese persons. Truth: IRR = 3.0. Losses: 33% in the diabetes/obesity group (death/other), 5% losses in all other groups. So (P1 x P4) / (P2 x P3) does not = 1.
27 Selection Bias from Dropouts - Example
             At onset   Dropped out   Dropped out   Detected at end
                        (no DM)       (diabetes)    with diabetes
Obese          227         10             9              18
Not Obese      773         35             3              30
Incidence (biased): in obese, 18/208 = 8.7%; in non-obese, 30/735 = 4.1%.
Biased incidence rate ratio = 8.7% / 4.1% = 2.1
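The arithmetic in this example can be reproduced in a few lines. This is a minimal sketch using the counts from the slide; the dictionary keys and function names are made up for illustration, and the "true" incidence assumes every person who dropped out with diabetes would have been counted as a case had they stayed in the cohort.

```python
# Counts taken from the slide table.
obese = {"onset": 227, "drop_no_dm": 10, "drop_dm": 9, "detected": 18}
non_obese = {"onset": 773, "drop_no_dm": 35, "drop_dm": 3, "detected": 30}


def biased_incidence(group):
    """Incidence among those still in the cohort at the end (drop-outs ignored)."""
    remaining = group["onset"] - group["drop_no_dm"] - group["drop_dm"]
    return group["detected"] / remaining


def true_incidence(group):
    """Incidence if diabetic drop-outs had been counted as cases (assumption)."""
    return (group["detected"] + group["drop_dm"]) / group["onset"]


biased_irr = biased_incidence(obese) / biased_incidence(non_obese)
true_irr = true_incidence(obese) / true_incidence(non_obese)

print(f"biased IRR = {biased_irr:.2f}, IRR without losses = {true_irr:.2f}")
```

Without losses the ratio comes out near 2.8, which the lecture rounds to 3.0; with the differential losses it falls to about 2.1.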
28 Drop-outs from a work-force - impact. An occupational exposure causes health effects quickly in a susceptible sub-group, who leave the work-force (quit) quickly. Examples: allergy to lab animals in researchers; asthma in grain workers. Cross-sectional studies: no susceptibles left. Cohorts: can miss them when setting up the cohort; outcomes occur in a small number of new workers (power problem).
29 Controlling Selection Bias. Control in design (most important is prevention): recruit a high % in all groups; same % recruitment in exposed and not exposed; close follow-up to prevent dropouts. Assess in analysis: compare participants to non-participants (or sub-groups of non-participants); compare dropouts with those who remained; sensitivity analysis (best case / worst case) to assess the impact of selection biases.
30 Cohort Studies – Exposure Assessments. Prospective: measure one or more exposures at start. Specific: cholesterol, obesity, smoking, blood pressure. Proxies: occupation, housing. Measure once, or repeatedly to account for changes in exposure over time (obesity, smoking, BP). Retrospective: exposure is based upon past events. These are rarely quantified. Proxies are used (job description, distance from blast). Sometimes records are available (transfusions, dust levels).
31 Pitfalls in exposure assessments. Observer bias, when disease is ascertained at the same time: blind observers to the study hypothesis; use standardized protocols. Are all exposures the same? Complications of pleural tap at MGH/RVH >> MCI. Did you forget something? It is hard to go back to the start of the cohort: measure everything, freeze the rest, and add measures as new things are reported.
32 Cohort Studies – Outcome Assessments. Baseline: ensure cohort members are free of disease (easy if prospective, harder if retrospective). Outcomes are measured periodically: through questionnaire, exam, labs (direct); through health service utilization (databases); through vital statistics (databases). Case definition is key for outcome assessments. Diagnosis of milder disease is a common problem.
33 Pitfalls in outcome assessments. Ascertainment bias: patients with Factor X are more likely to have testing to detect the outcome. Solution: standardized protocols, blinding to exposures. Observer bias: patients with Factor X are more likely to be diagnosed with the outcome of interest. Common with more subjective tests, e.g. CXR. Solution: independent reviewers, blinded to exposure status (Factor X). Lead time bias: earlier diagnosis makes survival look better.
35 Cohort Studies – Measures of Incidence
Incidence rate (simplest) = (number developing disease) / (total number who entered cohort), per unit of time
Cumulative incidence = (number developing disease) / (total number who entered cohort), over total follow-up period
36 Measuring Incidence in Cohort Studies. How to handle drop-outs etc.? Drop-outs from loss to follow-up, death from other causes, or withdrawal of consent are common: up to 50% in long-term cohorts. Include or exclude from analysis? Simple incidence measures exclude them. Need to allow for variable length of follow-up, and count people who enter after the first year.
37 Incidence Density (ID). Counts person-time (person-years/months). Starts the count when a person enters the cohort. Each year of follow-up is added up.
[Table: Patient, Exposed, Enter in year, Stop in year, Years of FU, Disease occurrence]
ID in Exposed = 1 event in 12 person-years
ID in Unexposed = 1 event in 18 person-years
38 Cohort studies – Measure of Association: Risk Ratios, or Incidence rate ratios. Summary measure of association in cohort studies.
Formula for incidence rate ratio (IRR) = (incidence of disease in persons with exposure) / (incidence of disease in persons without exposure)
or
IRR = (Ndisease/Nexposed per unit time) / (Ndisease/Nunexposed per unit time)
* Note: in the IRR there is no unit of time. This assumes the amount of time was similar for those with and without disease and for those exposed and unexposed.
39 Calculation of Risk Ratio - example. Cohort at inception: 1,000 people without diabetes. Prevalence of obesity at inception = 22.7%. Outcome: incidence of diabetes in a population. Exposure: obesity at inception of cohort. Follow-up: six years. Overall incidence of diabetes = 1% per year. Cumulative incidence = 6%. Risk = cumulative incidence.
40 Risk Ratio Calculation - Example
             Number with exposure   Developed diabetes   Cumulative incidence
Obese               227                    27                  27/227
Non Obese           773                    33                  33/773
Total             1,000                    60
Ratio of incidence = risk ratio = (27/227) / (33/773) = 12 / 4 = 3
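As a quick check of the table, the cumulative incidences and their ratio can be computed directly. This is a small sketch using only the counts on the slide; the variable names are illustrative.

```python
def cumulative_incidence(cases, cohort_size):
    """Proportion of the cohort developing disease over the whole follow-up."""
    return cases / cohort_size


risk_obese = cumulative_incidence(27, 227)
risk_non_obese = cumulative_incidence(33, 773)
risk_overall = cumulative_incidence(60, 1000)

risk_ratio = risk_obese / risk_non_obese

print(f"obese: {risk_obese:.1%}, non-obese: {risk_non_obese:.1%}, overall: {risk_overall:.1%}")
print(f"risk ratio = {risk_ratio:.2f}")
```

The unrounded ratio is close to 2.8; rounding the two risks to 12% and 4% first, as the slide does, gives 3.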
41 Incidence Density Ratio
Patient   Exposed   Follow up Years   Disease
1         YES       2                 NO
2         YES       10                YES
3         NO        8                 NO
4         NO        10                YES
Incidence rate ratio = (1/2) / (1/2) = 1
Density method = [(0/2 years) + (1/10 years)] / [(0/8 years) + (1/10 years)], pooling events over person-years
Incidence density ratio = (1/12) / (1/18) = 1.5
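The contrast between the simple ratio and the person-time ratio can be sketched as follows. The four patients are the ones implied by the density calculation on the slide (follow-up of 2, 10, 8, and 10 years, with one event in each exposure group); the record layout and function names are illustrative only.

```python
patients = [
    {"id": 1, "exposed": True, "years": 2, "disease": False},
    {"id": 2, "exposed": True, "years": 10, "disease": True},
    {"id": 3, "exposed": False, "years": 8, "disease": False},
    {"id": 4, "exposed": False, "years": 10, "disease": True},
]


def simple_risk(rows):
    """Events divided by number of persons, ignoring length of follow-up."""
    return sum(r["disease"] for r in rows) / len(rows)


def incidence_density(rows):
    """Events divided by total person-years of follow-up."""
    events = sum(r["disease"] for r in rows)
    person_years = sum(r["years"] for r in rows)
    return events / person_years


exposed = [r for r in patients if r["exposed"]]
unexposed = [r for r in patients if not r["exposed"]]

simple_ratio = simple_risk(exposed) / simple_risk(unexposed)
density_ratio = incidence_density(exposed) / incidence_density(unexposed)

print(f"simple ratio = {simple_ratio:.2f}, incidence density ratio = {density_ratio:.2f}")
```

Counting persons gives a ratio of 1, while counting person-years gives 1.5, because the exposed group contributed less follow-up time for the same number of events.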
42 Incidence Rate Difference. A patient asks “How much will my risk of heart attack go down if I take this new drug (B), instead of the old one (A)?”
Answer using incidence rate difference: Incidence with Drug A - Incidence with Drug B = 0.5%/year - 0.3%/year = 0.2%/year, or a 40% reduction
Same answer using incidence rate ratio: (Incidence with Drug B) / (Incidence with Drug A) = 0.3% / 0.5% = 0.6, or a 40% reduction
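The two ways of answering the patient's question can be laid side by side. This is a minimal sketch with the incidences from the slide; the variable names are illustrative.

```python
incidence_a = 0.5  # old drug A, % per year (from the slide)
incidence_b = 0.3  # new drug B, % per year (from the slide)

# Absolute effect: how many percentage points per year the risk drops.
rate_difference = incidence_a - incidence_b

# Relative effect: incidence on B as a fraction of incidence on A.
rate_ratio = incidence_b / incidence_a

# The "40% reduction" can be reached from either measure.
reduction_from_difference = rate_difference / incidence_a
reduction_from_ratio = 1 - rate_ratio

print(f"difference = {rate_difference:.1f}%/year, ratio = {rate_ratio:.1f}")
print(f"relative reduction = {reduction_from_ratio:.0%}")
```

The difference keeps the units (0.2% per year), which is what tells the patient how much absolute risk is removed; the ratio alone does not.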
43 Attributable risk. “How many lung cancers are due to air pollution in Montreal?” Same as “What is attributable risk?” Attributable risk = IRR x Prevalence of exposure. It increases with higher IRR, or if the exposure is more common. Diabetes vs Silicosis and TB: Diabetes: IRR = 3.5, Prevalence = 3%. Silicosis: IRR = 12, Prevalence = 0.1%. Attributable risk for Diabetes >> than for Silicosis.
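The slide summarizes attributable risk as IRR x prevalence. One commonly used formula for the population attributable fraction is Levin's, p(IRR - 1) / (1 + p(IRR - 1)); it does not appear on the slide and is used here only as an assumption to illustrate the comparison. The sketch below evaluates both for the two exposures.

```python
def slide_product(irr, prevalence):
    """Shorthand from the slide: IRR x prevalence of exposure."""
    return irr * prevalence


def levin_attributable_fraction(irr, prevalence):
    """Levin's population attributable fraction (an assumption, not from the slide)."""
    excess = prevalence * (irr - 1)
    return excess / (1 + excess)


exposures = {
    "diabetes": {"irr": 3.5, "prevalence": 0.03},
    "silicosis": {"irr": 12, "prevalence": 0.001},
}

results = {
    name: (
        slide_product(v["irr"], v["prevalence"]),
        levin_attributable_fraction(v["irr"], v["prevalence"]),
    )
    for name, v in exposures.items()
}

for name, (product, paf) in results.items():
    print(f"{name}: IRR x prevalence = {product:.3f}, Levin fraction = {paf:.1%}")
```

Under either calculation diabetes accounts for several times more TB than silicosis, because it is far more common even though its IRR is lower.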
44 Cohort Studies – Survival Analysis. Analysis of time to event. Accounts for variable length of follow-up. Advantage if time to event is affected by exposure. Can find important differences in treatments even when overall survival is the same: cancer treatment A increases survival at two years, but five-year mortality is the same as treatment B. Treatment A is preferred by most patients!
45 Important differences found using Survival analysis
46 Types of Survival Analysis. Simplest: direct. Kaplan-Meier: still pretty simple. Calculates the cumulative proportion free of the outcome (survived) at each point in time when that outcome occurs. People who drop out or die of other causes are 'censored'. At each point the numerator is the number who develop the outcome at that time, while the denominator is all those still under follow-up without the outcome just before. Cox regression analysis: multivariate analysis with the same basic principles.
47 Kaplan Meier survival analysis - example
Columns: Time | Number at start | During interval: Drop-outs, Deaths | Surviving at end | Proportion surviving: Interval, Cumulative
(first row): 100, 1.0
3 months: 10, 90
6 months: 70, 0.88
10 months: 60, 0.86, 0.75
12 months: 40, 0.8, 0.6
18 months: 30
Notes: Intervals are variable, defined by when subjects die. Proportion surviving the interval excludes drop-outs during the interval (censored).
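The Kaplan-Meier calculation can be sketched in a few lines. This is an illustration only: the six follow-up records are hypothetical and are not the data behind the slide's table, and the function name is made up. At each time where an outcome occurs, the running survival is multiplied by (1 - events / number still at risk); censored subjects leave the risk set without changing the survival estimate.

```python
def kaplan_meier(times, events):
    """Return [(time, cumulative proportion surviving)] at each event time.

    times  -- follow-up time for each subject
    events -- True if the outcome occurred, False if censored (drop-out)
    """
    records = sorted(zip(times, events))
    at_risk = len(records)
    survival = 1.0
    curve = []
    i = 0
    while i < len(records):
        t = records[i][0]
        n_events = 0
        n_leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(records) and records[i][0] == t:
            n_events += int(records[i][1])
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Events and censored subjects both leave the risk set.
        at_risk -= n_leaving
    return curve


# Hypothetical follow-up in months; False marks a censored subject.
months = [3, 6, 6, 10, 12, 18]
outcome = [False, True, False, True, True, False]

curve = kaplan_meier(months, outcome)
for t, s in curve:
    print(f"{t} months: cumulative survival {s:.2f}")
```

A subject censored at the same time as an event is treated as still at risk for that event, which is the usual convention.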
49 Example of Kaplan-Meier analysis: General Hospital Ventilation and time to TST conversion
50 Selection Bias – Berkson's. This is described in case-control studies in hospitalized patients. First described on a mathematical basis. Probability of hospitalization if Factor Z = 0.1. Probability of hospitalization if Factor Y = 0.05. Probability of hospitalization if both = higher. These two independent conditions will appear to be associated, but may not be. In practice it is common that patients with 2 or more conditions ARE more likely to be hospitalized (e.g. CHF and pneumonia), so in a hospital-based case-control study they appear to be strongly associated. The fundamental problem is the same: P1 does not equal P2 does not equal P3 does not equal P4.
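Berkson's bias can be illustrated with the same selection-probability arithmetic used for biased sampling earlier in the lecture. In the sketch below, the hospitalization probabilities 0.1 and 0.05 come from the slide; the population counts and the probabilities for "both" (0.5) and "neither" (0.02) are hypothetical values chosen for illustration. Other choices change the size of the distortion and can even push the hospital odds ratio below 1.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio of a 2x2 table: (A x D) / (B x C)."""
    return (a * d) / (b * c)


# Hypothetical population of 100,000 where Z and Y are independent (10% each).
population = {"both": 1_000, "y_only": 9_000, "z_only": 9_000, "neither": 81_000}

# Probability of hospitalization per group.
# 0.1 (Z) and 0.05 (Y) are from the slide; 0.5 and 0.02 are assumptions.
p_hospital = {"both": 0.5, "y_only": 0.05, "z_only": 0.1, "neither": 0.02}

hospital = {group: population[group] * p_hospital[group] for group in population}

population_or = odds_ratio(
    population["both"], population["y_only"], population["z_only"], population["neither"]
)
hospital_or = odds_ratio(
    hospital["both"], hospital["y_only"], hospital["z_only"], hospital["neither"]
)

print(f"population OR = {population_or:.2f}, hospital OR = {hospital_or:.2f}")
```

With these assumed probabilities, two conditions that are unrelated in the population show an odds ratio of about 2 among hospitalized patients.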