Biases and errors in Epidemiology

Anchita Khatri

Definitions
Error: a false or mistaken result obtained in a study or experiment.
Random error: the portion of variation in a measurement that has no apparent connection to any other measurement or variable, generally regarded as due to chance.
Systematic error: error that often has a recognizable source (e.g., a faulty measuring instrument) or pattern (e.g., it is consistently wrong in a particular direction). (Last)

Relationship b/w Bias and Chance
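[Figure: distributions of diastolic blood pressure (mm Hg) against no. of observations. True BP (intra-arterial cannula) is marked at 80; BP measurements (sphygmomanometer) centre on 90. Bias is the shift b/w the two; chance is the spread of the individual measurements.]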

Validity
The degree to which a measurement measures what it purports to measure (Last)
The degree to which the data measure what they were intended to measure, that is, the results of a measurement correspond to the true state of the phenomenon being measured (Fletcher)
Also known as ‘Accuracy’

Reliability
The degree of stability expected when a measurement is repeated under identical conditions; the degree to which the results obtained from a measurement procedure can be replicated (Last)
The extent to which repeated measurements of a stable phenomenon, by different people and instruments, at different times and places, get similar results (Fletcher)
Also known as ‘Reproducibility’ and ‘Precision’

Validity and Reliability
[Figure: the four combinations of high and low validity with high and low reliability.]

Bias Deviation of results or inferences from the truth, or processes leading to such deviation. Any trend in the collection, analysis, interpretation, publication, or review of data that can lead to conclusions that are systematically different from the truth (Last) A process at any stage of inference tending to produce results that depart systematically from true values (Fletcher)

Types of biases Selection bias Measurement / (mis)classification bias
Confounding bias

Selection bias Errors due to systematic differences in characteristics between those who are selected for study and those who are not. (Last; Beaglehole) When comparisons are made between groups of patients that differ in ways other than the main factors under study, that affect the outcome under study (Fletcher)

Examples of Selection bias
Subjects: hospital cases under the care of a physician Excluded: Die before admission – acute/severe disease. Not sick enough to require hospital care Do not have access due to cost, distance etc. Result: conclusions cannot be generalized Also known as ‘Ascertainment Bias’ (Last)

Ascertainment Bias Systematic failure to represent equally all classes of cases or persons supposed to be represented in a sample. This bias may arise because of the nature of the sources from which the persons come, e.g., a specialized clinic, or from a diagnostic process influenced by culture, custom, or idiosyncrasy. (Last)

Selection bias with ‘volunteers’
Also known as ‘response bias’ Systematic error due to differences in characteristics b/w those who choose or volunteer to take part in a study and those who do not

Examples …response bias
Volunteer either because they are unwell, or worried about an exposure Respondents to ‘effects of smoking’ usually not as heavy smokers as non-respondents. In a cohort study of newborn children, the proportion successfully followed up for 12 months varied according to the income level of the parents

Examples…. (Assembly bias)
Study: ? association b/w reserpine and breast cancer in women
Design: Case Control
Cases: Women with breast cancer
Controls: Women without breast cancer who were also not suffering from any cardio-vascular disease (frequently associated with hypertension, HT)
Result: women likely to be on reserpine were systematically excluded from the controls, so an association b/w reserpine and breast cancer was observed

Examples…. (Assembly bias)
Study: effectiveness of OCP1 vs. OCP2
Subjects: on OCP1, women who had given birth at least once (i.e. known to be able to conceive); on OCP2, women who had never become pregnant
Result: if OCP2 is found to be better, is the inference correct??

Susceptibility Bias Groups being compared are not equally susceptible to the outcome of interest, for reasons other than the factors under study Comparable to ‘Assembly Bias’ In prognosis studies; cohorts may differ in one or more ways – extent of disease, presence of other diseases, the point of time in the course of disease, prior treatment etc.

Examples…..(Susceptibility Bias)
Background: for colorectal cancer, CEA levels correlated with extent of disease (Dukes' classification); Dukes' classification and CEA levels both strongly predicted disease relapse
Question: Does CEA level predict relapse independently of Dukes' classification, or was susceptibility to relapse explained by Dukes' classification alone?

Example… CEA levels (contd.)
Answer: after stratification, an association of pre-op CEA levels with disease relapse was observed within each category of Dukes' classification

Disease-free survival according to CEA levels in colorectal cancer pts. with similar pathological staging (Dukes' B)
[Figure: % disease free (y-axis, 60 to 100) against months (x-axis, 3 to 24), with one curve per CEA level (ng): <2.5, 2.5 – 10.0 and >10.0.]

Selection bias with ‘Survival Cohorts’
Patients are included in study because they are available, and currently have the disease For lethal diseases patients in survival cohort are the ones who are fortunate to have survived, and so are available for observation For remitting diseases patients are those who are unfortunate enough to have persistent disease Also known as ‘Available patient cohorts’

Example… bias with ‘survival cohort’
TRUE COHORT
Assemble cohort (N=150) -> measure outcome: improved 75, not improved 75
Observed improvement 50%; true improvement 50%
SURVIVAL COHORT
Assemble patients -> begin follow-up (N=50) -> measure outcome: improved 40, not improved 10
Not observed (N=100, dropouts): improved 35, not improved 65
Observed improvement 80%; true improvement 50%
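The arithmetic behind this example can be checked with a short sketch. The "not improved" counts and group sizes come from the slide; the "improved" counts are back-calculated from them, and the variable names are illustrative only.

```python
# Counts from the survival-cohort slide; "improved" counts are back-calculated
# from each group's size and its "not improved" count.
true_cohort = {"improved": 75, "not_improved": 75}      # all 150 patients
survival_cohort = {"improved": 40, "not_improved": 10}  # the 50 available for follow-up
dropouts = {"improved": 35, "not_improved": 65}         # the 100 never observed


def percent_improved(group):
    """Percentage of a group whose outcome was 'improved'."""
    total = group["improved"] + group["not_improved"]
    return 100 * group["improved"] / total


observed = percent_improved(survival_cohort)  # what the survival cohort reports
truth = percent_improved(true_cohort)         # what the whole cohort experienced
missed = percent_improved(dropouts)           # outcome among those never observed

print(f"survival cohort: {observed:.0f}% improved")
print(f"true cohort:     {truth:.0f}% improved")
print(f"dropouts:        {missed:.0f}% improved")
```

The survival cohort overstates improvement because the patients who were never observed did much worse than those who remained available.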

Selection bias due to ‘Loss to Follow-up’
Also known as ‘Migration Bias’ In nearly all large studies some members of the original cohort drop out of the study If drop-outs occur randomly, such that characteristics of lost subjects in one group are on an average similar to those who remain in the group, no bias is introduced But ordinarily the characteristics of the lost subjects are not the same

Example of ‘lost to follow-up’
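Original cohort
EXPOSURE (irradiation)   DISEASE (cataract)    Total
+nt                              50            10000
-nt                             100            20000
Total                           150            30000
RR = (50/10000) / (100/20000) = 1
Subjects remaining after loss to follow-up
EXPOSURE (irradiation)   DISEASE (cataract)    Total
+nt                              30             4000
-nt                              30             8000
Total                            60            12000
RR = (30/4000) / (30/8000) = 2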
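A minimal sketch of the two relative-risk calculations, using the counts on the slide. The function and variable names are illustrative, and the retention percentages are derived here rather than stated in the original.

```python
def relative_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)


# Whole cohort, before any loss to follow-up.
rr_full = relative_risk(50, 10_000, 100, 20_000)

# Subjects still under observation after loss to follow-up.
rr_remaining = relative_risk(30, 4_000, 30, 8_000)

# Both exposure groups kept 40% of their members overall ...
kept_exposed = 4_000 / 10_000
kept_unexposed = 8_000 / 20_000
# ... but the cataract cases were retained unequally.
kept_cases_exposed = 30 / 50       # 60% of exposed cases stayed
kept_cases_unexposed = 30 / 100    # 30% of unexposed cases stayed

print(f"RR in full cohort:       {rr_full:.1f}")
print(f"RR after loss to f/u:    {rr_remaining:.1f}")
```

Loss to follow-up is equal in size in the two groups, yet the RR doubles because the loss is related to the outcome differently in the exposed and the unexposed.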

Migration bias A form of Selection Bias
Can occur when patients in one group leave their original group, dropping out of the study altogether or moving to one of the other groups under study (Fletcher)
If it occurs on a large scale, it can affect the validity of conclusions.
Bias due to crossover is more often a problem in risk studies than in prognosis studies, because risk studies go on for many years

Example of migration Question: relationship between lifestyle and mortality Subjects: 10,269 Harvard College alumni classified according to physical activity, smoking, weight, BP In 1966 and 1977 Mortality rates observed from 1977 to 1985

Example of migration (contd.)
Problem: original classification of ‘lifestyle’ might change (migration b/w groups) Solution: defined four categories Men who maintained high risk lifestyles Men who crossed over from low to high risk Men who crossed over from high to low risk Men who maintained low risk lifestyles

Example of migration (contd.)
Result: after controlling for other risk factors those who maintained or adopted high risk characteristics had highest mortality Those who changed from high to low had lesser mortality than above Those who never had any high risk behavior had least mortality

Healthy worker effect A phenomenon observed initially in studies of occupational diseases: workers usually exhibit lower overall death rates than the general population, because the severely ill and chronically disabled are ordinarily excluded from employment. Death rates in the general population may be inappropriate for comparison if this effect is not taken into account. (Last)

Example…. ‘healthy worker effect’
Question: association b/w formaldehyde exposure and eye irritation Subjects: factory workers exposed to formaldehyde Bias: those who suffer most from eye irritation are likely to leave the job at their own request or on medical advice Result: remaining workers are less affected; association effect is diluted

Measurement bias Systematic error arising from inaccurate measurements (or classification) of subjects or study variables (Last) Occurs when individual measurements or classifications of disease or exposure are inaccurate (i.e. they do not measure correctly what they are supposed to measure) (Beaglehole) If patients in one group stand a better chance of having their outcomes detected than those in another group (Fletcher)

Measurement / (Mis) classification
Exposure misclassification occurs when exposed subjects are incorrectly classified as unexposed, or vice versa Disease misclassification occurs when diseased subjects are incorrectly classified as non-diseased, or vice versa (Norell)

Causes of misclassification
1. Measurement gap: gap between the measured and the true value of a variable
   Observer / interviewer bias
   Recall bias
   Reporting bias
2. Gap b/w the theoretical and empirical definition of exposure / disease

Sources of misclassification
[Figure: chain from theoretical definition to empirical definition to measurement results. The gap b/w theoretical & empirical definitions lies at the first step; measurement errors lie at the second.]

Example… ‘gap b/w definitions’
Theoretical definition Exposure: passive smoking – inhalation of tobacco smoke from other people’s smoking Disease: Myocardial infarction – necrosis of the heart muscle tissue Empirical definition Exposure: passive smoking – time spent with smokers (having smokers as room-mates) Disease: Myocardial infarction – certain diagnostic criteria (chest pain, enzyme levels, signs on ECG)

Exposure misclassification – Non-differential
Misclassification does not differ between cases and non-cases Generally leads to dilution of effect, i.e. bias towards RR=1 (no association)

Example…Non-differential Exposure Misclassification
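True exposure status
EXPOSURE (X-ray)   DISEASE (breast cancer)   Total
+nt                         40               10000
-nt                         80               40000
Total                      120               50000
RR = (40/10000) / (80/40000) = 2
Exposure as classified, with non-differential misclassification
EXPOSURE (X-ray)   DISEASE (breast cancer)   Total
+nt                         60               20000
-nt                         60               30000
Total                      120               50000
RR = (60/20000) / (60/30000) = 1.5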
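The two tables can be linked with a short sketch. It assumes that 25% of truly unexposed women are recorded as exposed, whether or not they have breast cancer; that proportion is inferred from the slide's numbers and is not stated in the original.

```python
def relative_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)


# True exposure status (first table on the slide).
true_exposed = {"cases": 40, "total": 10_000}
true_unexposed = {"cases": 80, "total": 40_000}

# Assumption inferred from the numbers: 25% of the truly unexposed are
# recorded as exposed, and this happens equally in cases and non-cases.
p_wrong = 0.25
moved_cases = p_wrong * true_unexposed["cases"]
moved_total = p_wrong * true_unexposed["total"]

recorded_exposed = {
    "cases": true_exposed["cases"] + moved_cases,
    "total": true_exposed["total"] + moved_total,
}
recorded_unexposed = {
    "cases": true_unexposed["cases"] - moved_cases,
    "total": true_unexposed["total"] - moved_total,
}

rr_true = relative_risk(true_exposed["cases"], true_exposed["total"],
                        true_unexposed["cases"], true_unexposed["total"])
rr_recorded = relative_risk(recorded_exposed["cases"], recorded_exposed["total"],
                            recorded_unexposed["cases"], recorded_unexposed["total"])

print(f"true RR:     {rr_true:.2f}")
print(f"recorded RR: {rr_recorded:.2f}")
```

The recorded RR lies between the true RR and 1, which is the dilution of effect described for non-differential misclassification.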

Exposure misclassification - Differential
Misclassification differs between cases and non-cases
Can bias the result in either direction: towards RR = 0 (negative / protective association), or towards RR = ∞ (infinity) (strong positive association)

Example…Differential Exposure Misclassification
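True exposure status
EXPOSURE (X-ray)   Breast cancer   No breast cancer   Total
+nt                      40              9960         10000
-nt                      80             39920         40000
Total                   120             49880         50000
RR = (40/10000) / (80/40000) = 2
Exposure as classified, with misclassification among non-cases only
EXPOSURE (X-ray)   Breast cancer   No breast cancer   Total
+nt                      40             19940         19980
-nt                      80             29940         30020
Total                   120             49880         50000
RR = (40/19980) / (80/30020) = 0.75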
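A sketch of how misclassification limited to the non-cases produces the second table. The 25% misclassification proportion among unexposed non-cases is inferred from the slide's counts, not stated in the original, and the names are illustrative.

```python
# True counts from the slide, split by disease status.
cases = {"exposed": 40, "unexposed": 80}
noncases = {"exposed": 9_960, "unexposed": 39_920}

# Assumption inferred from the numbers: 25% of truly unexposed NON-CASES are
# recorded as exposed; exposure among the cases is recorded perfectly.
p_wrong_noncases = 0.25
moved = p_wrong_noncases * noncases["unexposed"]

recorded_noncases = {
    "exposed": noncases["exposed"] + moved,
    "unexposed": noncases["unexposed"] - moved,
}


def relative_risk(case_counts, noncase_counts):
    """RR from case and non-case counts in the exposed and unexposed groups."""
    risk_exposed = case_counts["exposed"] / (
        case_counts["exposed"] + noncase_counts["exposed"])
    risk_unexposed = case_counts["unexposed"] / (
        case_counts["unexposed"] + noncase_counts["unexposed"])
    return risk_exposed / risk_unexposed


rr_true = relative_risk(cases, noncases)
rr_recorded = relative_risk(cases, recorded_noncases)

print(f"true RR:     {rr_true:.2f}")
print(f"recorded RR: {rr_recorded:.2f}")
```

Here the bias does not stop at RR = 1: a true doubling of risk is turned into an apparent protective effect.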

Implications of Differential exposure misclassification
An improvement in accuracy of exposure information (i.e. no misclassification among those who had breast cancer), actually reduced accuracy of results Non-differential misclassification is ‘better’ than differential misclassification So, epidemiologists are more concerned with comparability of information than with improving accuracy of information

Causes of Differential Exposure Misclassification
Recall bias: systematic error due to differences in accuracy or completeness of recall to memory of past events or experience, e.g. patients suffering from MI are more likely than controls to recall and report ‘lack of exercise’ in the past

Measurement bias: e.g. analysis of Hb by different methods (cyanmethemoglobin and Sahli's) in cases and controls; e.g. biochemical analysis of the two groups in two different laboratories, which give consistently different results

Interviewer / observer bias: systematic error due to observer variation (failure of the observer to measure or identify a phenomenon correctly), e.g. looking for h/o OCP use more aggressively in patients with thrombo-embolism

Measurement bias in treatment effects
Hawthorne effect: effect (usually positive / beneficial) of being under study upon the persons being studied; their knowledge of being studied influences their behavior Placebo effect: (usually, but not necessarily beneficial) expectation that regimen will have effect, i.e. the effect is due to the power of suggestion.

Total effects of treatment are the sum of spontaneous improvement, non-specific responses, and the effects of specific treatments
[Figure: total IMPROVEMENT divided into EFFECTS: natural history, Hawthorne, placebo, and specific to treatment.]

Confounding A situation in which the effects of two processes are not separated. The distortion of the apparent effect of an exposure on risk brought about by the association with other factors that can influence the outcome A relationship b/w the effects of two or more causal factors as observed in a set of data such that it is not logically possible to separate the contribution that any single causal factor has made to an effect (Last)

Confounding When another exposure exists in the study population (besides the one being studied) and is associated both with disease and the exposure being studied. If this extraneous factor – itself a determinant of or risk factor for health outcome is unequally distributed b/w the exposure subgroups, it can lead to confounding (Beaglehole)

Confounder … must be Risk factor among the unexposed (itself a determinant of disease) Associated with the exposure under study Unequally distributed among the exposed and the unexposed groups

Examples … confounding
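Exposure: SMOKING; outcome: LUNG CANCER; confounder: AGE
Age is linked to the outcome (as age advances chances of lung cancer increase) and to the exposure (if the average ages of the smoking and non-smoking groups are very different)

Exposure: COFFEE DRINKING; outcome: HEART DISEASE; confounder: SMOKING
Smoking is linked to the outcome (smoking increases the risk of heart ds) and to the exposure (coffee drinkers are more likely to smoke)

Exposure: ALCOHOL INTAKE; outcome: MYOCARDIAL INFARCTION; confounder: SEX
Sex is linked to the outcome (men are more at risk for MI) and to the exposure (men are more likely to consume alcohol than women)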

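Crude (both sexes together)
Exposure (alcohol)   Disease (MI)   Total
+nt                      140        30000
-nt                      100        30000
RR = (140/30000) / (100/30000) = 1.4
Stratified by sex
Sex      Exposure (alcohol)   Disease (MI)   Total
Male     +nt                      120        20000
Male     -nt                       60        10000
Female   +nt                       20        10000
Female   -nt                       40        20000
Male: RR = (120/20000) / (60/10000) = 1
Female: RR = (20/10000) / (40/20000) = 1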
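The crude and sex-specific relative risks can be reproduced with a few lines. This is only a sketch of the slide's arithmetic; the data layout and names are illustrative.

```python
# (MI cases, people) for each sex and alcohol-exposure group, from the slide.
strata = {
    "male":   {"exposed": (120, 20_000), "unexposed": (60, 10_000)},
    "female": {"exposed": (20, 10_000),  "unexposed": (40, 20_000)},
}


def relative_risk(exposed, unexposed):
    """RR from two (cases, people) pairs."""
    (cases_e, n_e), (cases_u, n_u) = exposed, unexposed
    return (cases_e / n_e) / (cases_u / n_u)


def pooled(group):
    """Add up cases and people across the sexes, ignoring sex."""
    cases = sum(stratum[group][0] for stratum in strata.values())
    people = sum(stratum[group][1] for stratum in strata.values())
    return cases, people


crude_rr = relative_risk(pooled("exposed"), pooled("unexposed"))
stratum_rr = {sex: relative_risk(s["exposed"], s["unexposed"])
              for sex, s in strata.items()}

print(f"crude RR: {crude_rr:.1f}")
for sex, rr in stratum_rr.items():
    print(f"RR among {sex}s: {rr:.1f}")
```

The crude RR of 1.4 disappears within each sex, so the apparent alcohol effect is produced by the unequal distribution of men and women b/w the drinking and non-drinking groups.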

Example … multiple biases
Study: ?? Association b/w regular exercise and risk of CHD
Methodology: employees of a plant were offered an exercise program; some volunteered, others did not; coronary events were detected by regular voluntary check-ups, including a careful history, ECG, and checking of routine health records
Result: the group that exercised had lower CHD rates

Biases operating Selection: volunteers might have had initial lower risk (e.g. lower lipids etc.) Measurement: exercise group had a better chance of having a coronary event detected since more likely to be examined more frequently Confounding: if exercise group smoked cigarettes less, a known risk factor for CHD

Dealing with Selection Bias
Ideally, to judge the effect of an exposure / factor on the risk / prognosis of disease, we should compare groups with and without that factor, everything else being equal
But in real life ‘everything else’ is usually not equal

Methods for controlling Selection Bias
During study design
  Randomization
  Restriction
  Matching
During analysis
  Stratification
  Adjustment
    Simple / standardization
    Multiple / multivariate adjustment
  Best case / worst case analysis

Restriction Subjects chosen for study are restricted to only those possessing a narrow range of characteristics, to equalize important extraneous factors Limitation: generalisability is compromised; by excluding potential subjects, cohorts / groups selected may be unusual and not representative of most patients or people with condition

Example… restriction Study: effect of age on prognosis of MI
Restriction: Male / White / Uncomplicated anterior wall MI Important extraneous factors controlled for: sex / race / severity of disease Limitation: results not generalizable to females, people of non-white community, those with complicated MI

Example… restriction
OCP example: restrict study to women having at least one child
Colorectal cancer example: restrict patients to a particular staging of Dukes' classification

Matching - definition The process of making a study group and a comparison group comparable with respect to extraneous factors (Last) For each patient in one group there are one or more patients in the comparison group with same characteristics, except for the factor of interest (Fletcher)

Types of Matching Caliper matching: process of matching comparison group to study group within a specific distance for a continuous variable (e.g., matching age to within 2 years) Frequency matching: frequency distributions of the matched variable(s) be similar in study and comparison groups Category matching: matching the groups in broad classes such as relatively wide age ranges or occupational groups

Types of Matching … (contd.)
Individual matching: identifying individual subjects for comparison, each resembling a study subject on the matched variable(s) Pair matching: individual matching in which the study and comparison subjects are paired (Last)

Matching is often done for age, sex, race, place of residence, severity of disease, rate of progression of disease, previous treatment received etc. Limitations: controls for bias for only those factors involved in the match Usually not possible to match for more than a few factors because of the practical difficulties of finding patients that meet all matching criteria If categories for matching are relatively crude, there may be room for substantial differences b/w matched groups

Example… Matching Study: ? Association of Sickle cell trait (HbAS) with defects in physical growth and cognitive development Other potential biasing factors: race, sex, birth date, birth weight, gestational age, 5-min Apgar score, socio economic status Solution: matching – for each child with HbAS selected a child with HbAA who was similar with respect to the seven other factors (50+50=100) Result: no difference in growth and development

Overmatching A situation that may arise when groups are being matched. Several varieties: 1. The matching procedure partially or completely obscures evidence of a true causal association b/w the independent and dependent variables. Overmatching may occur if the matching variable is involved in, or is closely connected with, the mechanism whereby the independent variable affects the dependent variable. The matching variable may be an intermediate cause in the causal chain or it may be strongly affected by, or a consequence of, such an intermediate cause

2. The matching procedure uses one or more unnecessary matching variables, e.g., variables that have no causal effect or influence on the dependent variable, and hence cannot confound the relationship b/w the independent and dependent variables. 3. The matching process is unduly elaborate, involving the use of numerous matching variables and / or insisting on a very close similarity with respect to specific matching variables. This leads to difficulty in finding suitable controls (Last)

Stratification The process of or the result of separating a sample into several sub-samples according to specified criteria such as age groups, socio-economic status etc (Last) The effect of confounding variables may be controlled by stratifying the analysis of results After data are collected, they can be analyzed and results presented according to subgroups of patients, or strata, of similar characteristics (Fletcher)

Example…Stratification (Fletcher)
HOSPITAL ‘A’
Pre-op Risk    Pts    Deaths    %
High           500      30      6
Medium         400      16      4
Low            300       2      0.67
Total         1200      48      4
HOSPITAL ‘B’
Pre-op Risk    Pts    Deaths    %
High           400      24      6
Medium         800      32      4
Low           1200       8      0.67
Total         2400      64      2.7
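A short sketch of the stratified analysis for the two hospitals, using the patient and death counts on the slide. The dictionary layout and function names are illustrative.

```python
# (patients, deaths) by pre-operative risk stratum, from the slide.
hospital_a = {"high": (500, 30), "medium": (400, 16), "low": (300, 2)}
hospital_b = {"high": (400, 24), "medium": (800, 32), "low": (1200, 8)}


def crude_death_rate(hospital):
    """Overall deaths per 100 patients, ignoring pre-operative risk."""
    patients = sum(n for n, _ in hospital.values())
    deaths = sum(d for _, d in hospital.values())
    return 100 * deaths / patients


def stratum_death_rates(hospital):
    """Deaths per 100 patients within each pre-operative risk stratum."""
    return {risk: 100 * deaths / patients
            for risk, (patients, deaths) in hospital.items()}


crude_a = crude_death_rate(hospital_a)
crude_b = crude_death_rate(hospital_b)
rates_a = stratum_death_rates(hospital_a)
rates_b = stratum_death_rates(hospital_b)

print(f"crude death rate: A {crude_a:.1f}%  B {crude_b:.1f}%")
for risk in hospital_a:
    print(f"{risk:>6} risk: A {rates_a[risk]:.2f}%  B {rates_b[risk]:.2f}%")
```

Hospital A looks worse on the crude rate only because it operates on a larger share of high-risk patients; within each risk stratum the two hospitals have the same death rate.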

Example…Stratification
Death rates per 1,000 population
                           Pinellas county             Dade county              Relat.
                           Dead    Total     Rate      Dead    Total     Rate   rate
Overall                    5726   374,665    15.3      8332   935,047     8.9    1.7
Age-wise stratification
  Birth – 54 yrs            737   229,198     3.2      2463   748,035     3.3    1.0
  55 yrs and over          4989   145,147    34.4      5898   187,985    31.2    1.1

Standardization A set of techniques used to remove as far as possible the effects of differences in age or other confounding variables when comparing two or more populations The method uses weighted averaging of rates specific for age, sex, or some other potentially confounding variable(s), according to some specified distribution of these variables (Last)

Standard population A population in which the age and sex composition is known precisely, as a result of a census or by an arbitrary means – e.g. an imaginary population, the “standard million” in which the age and sex composition is arbitrary. A standard population is used as comparison group in the actuarial procedure of standardization of mortality rates. (e.g. Segi world population, European standard population) (Last)

Types of standardization
Direct: the specific rates in a study population are averaged using as weights the distribution of a specified standard population. The standardized rate so obtained represents what the rate would have been in the study population if that population had the same distribution as the standard population w.r.t. the variables for which the adjustment or standardization was carried out.

Indirect: used to compare study populations for which the specific rates are either statistically unstable or unknown. The specific rates of the standard population are averaged using as weights the distribution of the study population. The ratio of the crude rate for the study population to the weighted average so obtained is known as the standardized mortality (or morbidity) ratio, or SMR (Last) [the weighted average represents what the rate would have been in the study population if that population had the same specific rates as the standard population]

Standardized mortality ratio (SMR)
SMR = (no. of deaths observed in the study group or population / no. of deaths expected if the study population had the same specific rates as the standard population) x 100
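As an illustration of the ratio, the sketch below borrows the hospital figures from the stratification example and, purely as an assumption for the sake of the example, treats hospital A's stratum-specific death rates as the "standard" and hospital B as the study population. The original slides do not present this calculation.

```python
# Stratum-specific death rates of the assumed "standard" (hospital A).
standard_rates = {"high": 30 / 500, "medium": 16 / 400, "low": 2 / 300}

# Study population (hospital B): patients per stratum and total observed deaths.
study_patients = {"high": 400, "medium": 800, "low": 1200}
observed_deaths = 64

# Deaths expected if the study population had the standard's specific rates.
expected_deaths = sum(study_patients[risk] * standard_rates[risk]
                      for risk in study_patients)

smr = 100 * observed_deaths / expected_deaths

print(f"expected deaths: {expected_deaths:.1f}")
print(f"SMR: {smr:.0f}")
```

An SMR of 100 means the study population had exactly as many deaths as its risk mix would predict under the standard rates; values above or below 100 indicate more or fewer deaths than expected.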

Example … direct standardization
Study population (rates per 1,000) and standard population
Age       Pop      Deaths    Rate    Std. pop   Exp. deaths
0         4000       60      15.0      2400        36
1-4       4500       20       4.4      9600        42.24
5-14      4000       12       3.0     19000        57
15-19     5000       15       3.0      9000        27
20-24     4000       16       4.0      8000        32
25-34     8000       25       3.1     14000        43.4
35-44     9000       48       5.3     12000        63.6
45-54     8000      100      12.5     11000       137.5
55-64     7000      150      21.4      8000       171.2
Total    53,500     446       8.3     93000       609.94
Standardized death rate = 609.94 / 93000 = 6.56 per 1,000
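The expected deaths in this example can be recomputed as a check. The sketch uses the age-specific rates and standard population from the slide; two entries missing from the extracted table (the rate for ages 15-19 and the standard population for ages 55-64) are back-calculated from the other figures, so treat them as assumptions.

```python
# Age-specific death rates per 1,000 in the study population.
# The 15-19 rate (3.0) is back-calculated from 15 deaths in 5,000 people.
rates_per_1000 = [15.0, 4.4, 3.0, 3.0, 4.0, 3.1, 5.3, 12.5, 21.4]

# Standard population in the same age groups. The last entry (8,000) is
# back-calculated from the stated total of 93,000.
standard_pop = [2400, 9600, 19000, 9000, 8000, 14000, 12000, 11000, 8000]

# Deaths expected if the standard population experienced the study rates.
expected = [rate * people / 1000
            for rate, people in zip(rates_per_1000, standard_pop)]

total_expected = sum(expected)
standardized_rate = 1000 * total_expected / sum(standard_pop)

print(f"expected deaths in standard population: {total_expected:.2f}")
print(f"age-standardized death rate: {standardized_rate:.2f} per 1,000")
```

The standardized rate (6.56 per 1,000) is lower than the crude rate (8.3 per 1,000) because the standard population gives less weight to the oldest age groups, where the rates are highest.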

Example … direct standardization
HOSPITAL ‘A’
Pre-op Risk    Pts    Deaths    %
High           500      30      6
Medium         400      16      4
Low            300       2      0.67
Total         1200      48      4
HOSPITAL ‘Std’ (hospital A's rates applied to a standard distribution of pts)
Pre-op Risk    Pts    Rate (%)   Exp. deaths
High           400      6          24
Medium         400      4          16
Low            400      0.67        2.68
Total         1200                 42.68 (3.6%)

Stratification vs. Standardization
Standardization removes the effect of the factor
Stratification controls for the effect of the factor, but the effect can still be seen
E.g. in the ‘hospital example’, with standardization we found that patients had similar prognosis in both hospitals; with stratification we also learnt the mortality rates in the different risk strata
Similar to the difference b/w an age-standardized mortality rate and age-specific mortality rates

Multivariate adjustment
Simultaneously controlling the effects of many variables to determine the independent effects of one Can select from a large no. of variables a smaller subset that independently and significantly contributes to the overall variation in outcome, and can arrange variables in order of the strength of their contribution Only feasible way to deal with many variables at one time during the analysis phase

Examples… Multivariate adjustment
CHD is the joint result of lipid abnormalities, HT, smoking, family history, DM, exercise, personality type. Start with 2x2 tables using one variable at a time Contingency tables, i.e. stratified analyses, examining the effect of one variable changed in the presence/absence of one or more variables

Example…Multivariate adjustment
Multivariable modeling, i.e. developing a mathematical expression of the effects of many variables taken together
Basic structure of a multivariate model:
Outcome variable = constant + (β1 x variable1) + (β2 x variable2) + …
β1, β2, … are coefficients determined from the data; variable1, variable2, … are the predictor variables that might be related to outcome

Sensitivity analysis When data on important prognostic factors is not available, it is possible to estimate the potential effects on the study by assuming various degrees of mal-distribution of the factors b/w the groups being compared and seeing how that would affect the results Best case / worst case analysis is a special type of sensitivity analysis – assuming the best and worst type of mal-distribution

Example… best/worst case analysis
Study: effect of gastro-gastrostomy on morbid obesity Subjects: cohort of 123 morbidly obese patients who underwent gastro-gastrostomy, 19 to 47 months after surgery Success : losing >30% excess weight Follow-up: 103 (84%) patients 20 patients lost to follow up

Example…. (contd.) Success rate: 60/103 (58%)
Best case: all 20 lost to follow up had “success”; best success rate: (60+20)/123 (65%)
Worst case: all 20 lost to follow up had “failures”; worst success rate: 60/123 (49%)
Result: true success rate b/w 49% and 65%; probably closer to 58%! (because pts. lost to follow up are unlikely to be all successes or all failures)
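The three success rates follow directly from the counts in the example; a minimal sketch, with illustrative variable names:

```python
followed_up = 103   # patients with known outcome
successes = 60      # of those, lost >30% of excess weight
lost = 20           # patients lost to follow-up
enrolled = followed_up + lost

observed_rate = 100 * successes / followed_up    # ignores the lost patients
best_case = 100 * (successes + lost) / enrolled  # every lost patient a success
worst_case = 100 * successes / enrolled          # every lost patient a failure

print(f"observed:   {observed_rate:.0f}%")
print(f"best case:  {best_case:.0f}%")
print(f"worst case: {worst_case:.0f}%")
```

The width of the best case / worst case range grows with the share of patients lost, which is why a high follow-up rate keeps the bounds informative.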

Randomization The only way to equalize all extraneous factors, or ‘everything else’ is to assign patients to groups randomly so that each has an equal chance of falling into the exposed or unexposed group Equalizes even those factors which we might not know about! But it is not possible always

Overall strategy
Except for randomization, all ways of dealing with extraneous differences b/w groups are effective against only those factors that are singled out for consideration
Ordinarily one uses several methods layered one upon another

Example… Study: effect of presence of VPCs on survival of patients after acute MI Strategies: Restriction: not too young / old; no unusual causes (e.g.mycotic aneurysm) for infarction Matching: for age (as important prognostic factor, but not the factor under study) Stratification: examine results for different strata of clinical severity Multivariate analysis: adjust crude rates for the effects of all other variables except VPC, taken together.

Dealing with measurement bias
1. Blinding
   Subject
   Observer / interviewer
   Analyser
2. Strict definition / standard definition for exposure / disease / outcome
3. Equal efforts to discover events in all the groups

Controlling confounding
Similar to controlling for selection bias Use randomization, restriction, matching, stratification, standardization, multivariate analysis etc.

Lead time bias
Lead time is the period of time b/w the detection of a medical condition by screening and when it ordinarily would be diagnosed because a pt. experiences symptoms and seeks medical care
As a result of screening, pts. will on average survive longer from the time of diagnosis than those who are diagnosed otherwise, even if T/t is not effective
Not more ‘survival time’, but more ‘disease time’

How lead time affects survival time
[Figure: three timelines from onset of ds to death, each marking the point of diagnosis (Diag) and the survival after diagnosis: unscreened; screened, early T/t not effective; screened, early T/t is effective.]

Controlling lead time bias
Compare screened group of people, and control group, and compare age specific mortality rates, rather than survival times from time of diagnoses E.g. early diagnosis and T/t for colorectal cancer is effective because mortality rates of screened people are lower than those of a comparable group of unscreened people

Length time bias Can affect studies of screening
Because the proportion of slow growing tumors among those diagnosed during screening programs is greater than among those diagnosed during usual medical care
Because slow growing tumors are present for a longer period before they cause symptoms; fast growing tumors are likely to cause symptoms leading to interval diagnosis
Screening tends to find tumors with inherently better prognoses

Compliance bias Compliant patients tend to have better prognoses regardless of the screening If a study compares disease outcomes among volunteers for a screening program with outcomes in a group of people who did not volunteer, better results for the volunteers might not be due to T/t but due to factors related to compliance Compliance bias and length-time bias can both be avoided by relying on RCTs

Types of studies & related biases
Prevalence study
  Uncertainty about temporal sequences
  Bias studying ‘old’ / prevalent cases
Case control
  Selection bias in selecting cases / controls
  Measurement bias
Cohort study
  Susceptibility bias
  Survival cohort vs. true cohort
  Migration bias
Randomized control trials
  Consider natural h/o disease, Hawthorne effect, placebo effect etc.
  Compliance problems
  Effect of co-interventions

Random error
Divergence, on the basis of chance alone, of an observation on a sample from the true population value
‘Random’ because on an average it is as likely to result in observed values being on one side of the true value as on the other side
Inherent in all observations
Can be minimized, but never avoided altogether

Sources of random error
Individual biological variation Measurement error Sampling error ( the part of the total estimation of error of a parameter caused by the random nature of the sample)

Sampling variation Because research must ordinarily be conducted on a sample of patients and not on all the patients with the condition under study, there is always a possibility that the particular sample of patients in a study, even though selected in an unbiased way, might not be similar to the population of patients as a whole

Sampling variation - definition
Since inclusion of individuals in a sample is determined by chance, the results of analysis on two or more samples will differ purely by chance. (Last)

Assessing the role of chance
Hypothesis testing Estimation

Hypothesis testing Start off with the Null Hypothesis (H0)
the statistical hypothesis that one variable has no association with another variable or set of variables, or that two or more population distributions do not differ from one another. in simpler terms, the null hypothesis states that the results observed in a study, experiment or test are no different from what might have occurred as a result of operation of chance alone (Last)

Statistical tests – errors (Fletcher)
                                   TRUE DIFFERENCE
CONCLUSION OF STATISTICAL TEST     PRESENT (H0 false)    ABSENT (H0 true)
SIGNIFICANT (H0 rejected)          Power (1 - β)         Type I (α) error
NOT SIGNIFICANT (H0 accepted)      Type II (β) error     Correct

Statistical tests - errors
Type I (α) error: error of rejecting a true null hypothesis, i.e. declaring a difference exists when it does not
Type II (β) error: error of failing to reject a false null hypothesis, i.e. declaring that a difference does not exist when in fact it does
Power of a study: ability of a study to demonstrate an association if one exists
Power = 1 - β

p - value Probability of an α error.
Quantitative estimate of probability that observed difference in b/w the groups in the study could have happened by chance alone, assuming that there is no real difference b/w the groups OR If there were no difference b/w the groups, and the trial was repeated many times, what proportion of the trials would lead to conclusions that there is the same or a bigger difference b/w the groups than the results found in the study

p – value – Remember!!
Usually P < 0.05 is considered statistically significant (i.e. if there were no real difference, a difference at least as large as the one observed would arise by chance less than 1 time in 20)
0.05 is an arbitrary cut-off; can change according to requirements
A statistically significant result might not be clinically significant, and vice-versa

Statistical significance vs. clinical significance
Large RCT called GUSTO (41,021 pts of ac MI)
Study: Streptokinase vs. tPA
Result: death rate at 30 days: streptokinase 7.2%, tPA 6.3% (p < 0.001)
But, need to treat about 100 patients with tPA instead of streptokinase to prevent 1 death!
tPA costly: $250 thousand to prevent one death
Clinically significant ???
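The "about 100 patients" figure is the number needed to treat, i.e. the reciprocal of the absolute difference in death rates. A minimal sketch using the two death rates quoted on the slide (the variable names are illustrative):

```python
death_rate_streptokinase = 0.072   # 30-day death rate, from the slide
death_rate_tpa = 0.063             # 30-day death rate, from the slide

# Absolute risk reduction: deaths prevented per patient switched to tPA.
absolute_risk_reduction = death_rate_streptokinase - death_rate_tpa

# Number needed to treat with tPA instead of streptokinase to prevent 1 death.
number_needed_to_treat = 1 / absolute_risk_reduction

print(f"absolute risk reduction: {100 * absolute_risk_reduction:.1f} percentage points")
print(f"number needed to treat:  {number_needed_to_treat:.0f}")
```

The exact value is a little over 110, which the slide rounds to 100; either way a very small p-value sits alongside a small absolute benefit.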

Estimation Effect size observed in a particular study is called ‘Point estimate’ True effect is unlikely to be exactly that observed in study because of random variation Confidence interval (CI): usually 95% (Last) computed interval with a given probability e.g. 95%, that the true value such as a mean, proportion, or rate is contained within the interval

Confidence intervals (Fletcher)
If the study is unbiased, there is a 95% chance that the interval includes the true effect size.
The true value is likely to be close to the point estimate, less likely to be near the outer limits of that interval, and could (5 times out of 100) fall outside these limits altogether.
CI allows the reader to see the range of plausible values and so to decide whether the effect size they regard as clinically meaningful is consistent with or ruled out by the data
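As a sketch of how a point estimate and its 95% CI are obtained, the block below applies the normal-approximation (Wald) interval for a proportion to the 60 successes among 103 followed-up patients from the best / worst case example. The choice of the Wald method and of this data set is an assumption made for illustration; the original slides do not compute an interval.

```python
from math import sqrt


def wald_interval(successes, n, z=1.96):
    """Point estimate and normal-approximation CI for a proportion.

    z = 1.96 corresponds to a 95% confidence level.
    """
    p = successes / n
    standard_error = sqrt(p * (1 - p) / n)
    return p, p - z * standard_error, p + z * standard_error


point, lower, upper = wald_interval(60, 103)

print(f"point estimate: {100 * point:.0f}%")
print(f"95% CI: {100 * lower:.0f}% to {100 * upper:.0f}%")
```

The interval reflects random error only; it says nothing about bias, such as that from the 20 patients lost to follow-up.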

Multiple comparison problem
If a no. of comparisons are made (e.g. in a large study, the effect of treatment assessed separately for each subgroup, and for each outcome), about 1 in 20 of these comparisons is likely to be statistically significant at the 0.05 level by chance alone, even if there is no real effect
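The size of the problem can be sketched numerically. The calculation assumes that the comparisons are independent and that no real effect exists in any of them; the numbers of comparisons tried are hypothetical.

```python
def chance_of_any_false_positive(n_comparisons, alpha=0.05):
    """Probability that at least one of n independent null comparisons
    comes out 'significant' at level alpha."""
    return 1 - (1 - alpha) ** n_comparisons


def expected_false_positives(n_comparisons, alpha=0.05):
    """Average number of 'significant' results expected by chance alone."""
    return n_comparisons * alpha


for n in (1, 5, 20, 100):   # hypothetical numbers of comparisons
    print(f"{n:>3} comparisons: "
          f"P(at least one false positive) = {chance_of_any_false_positive(n):.2f}, "
          f"expected false positives = {expected_false_positives(n):.1f}")
```

With 20 comparisons, one chance finding is expected on average, and the probability of at least one is well over half.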

“If you dredge the data sufficiently deeply, and sufficiently often, you will find something odd. Many of these bizarre findings will be due to chance…….discoveries that were not initially postulated among the major objectives of the trial should be treated with extreme caution.”

Dealing with random error
Increasing the sample size: sample size depends upon
  Level of statistical significance (α error)
  Acceptable chance of missing a real effect (β error)
  Magnitude of effect under investigation
  Amount of disease in population
  Relative sizes of groups being compared
Sample size is usually a compromise b/w the ideal and logistic and financial considerations
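How sample size, α and the magnitude of effect combine to give power (1 - β) can be sketched with a standard two-sided z-test approximation for comparing two group means. The formula choice, the effect size of 5 units and the standard deviation of 10 are assumptions for illustration, not values from the slides.

```python
from statistics import NormalDist


def power_two_means(delta, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two group means.

    delta: true difference b/w the group means
    sd: standard deviation within each group (assumed known and equal)
    n_per_group: number of subjects in each of the two groups
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    standard_error = sd * (2 / n_per_group) ** 0.5
    shift = delta / standard_error
    # Probability that the test statistic falls in either rejection region.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)


# Hypothetical study: true difference of 5 units, SD of 10 units.
for n in (20, 63, 100):
    print(f"n = {n:>3} per group: power = {power_two_means(5, 10, n):.2f}")
```

Larger samples, larger effects and a less strict α all raise power; when the true difference is zero, the same formula returns α, the Type I error rate.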

References
Fletcher RH et al. Clinical Epidemiology: The Essentials, 3rd ed.
Beaglehole R et al. Basic Epidemiology. WHO.
Last JM. A Dictionary of Epidemiology, 3rd ed.
Maxcy-Rosenau-Last. Public Health & Preventive Medicine, 14th ed.
Norell SE. Workbook of Epidemiology.
Park K. Park's Textbook of Preventive and Social Medicine, 16th ed.
