Presentation and interpretation of epidemiological data: objectives
Raj Bhopal, Bruce and John Usher Professor of Public Health, Public Health Sciences Section, Division of Community Health Sciences, University of Edinburgh, Edinburgh EH8 9AG
Presentation and interpretation of epidemiological data: objectives
You should understand:
The aim of manipulating epidemiological data is to sharpen understanding of risk and burden of disease, but distortions occur.
Epidemiological studies measure, present and interpret risk, comparing one population with another.
The idea, definition, and calculation of: proportional mortality, proportional mortality ratio, actual overall (crude) rates, directly and indirectly standardised rates, the standardised mortality ratio, relative risk, odds ratio, attributable risk, population attributable risk and numbers needed to treat.
Presentation and interpretation of epidemiological data: objectives (2)
The principal relative measure is the relative risk, while the odds ratio can approximate it in particular circumstances.
Attributable and population attributable risk are measures that help assess the proportion of the burden of disease that is caused by a particular risk factor.
How epidemiological data contribute to assessing the health needs and health status of populations.
Different ways of presenting data have a major impact on the perception of risk, so epidemiological studies should provide measures of both relative and actual risk.
Proportional mortality ratio (PMR)
Sometimes the only data we have are cases, e.g. there are no accurate population denominators for outcomes by hospital.
The PMR is commonly used to study disease patterns by cause in settings where population denominators are not available.
Proportional mortality (PM) = number of deaths due to cause X ÷ total number of deaths.
Proportional mortality ratio (PMR) (2)
The proportional mortality can be calculated by sex, age group or any other appropriate sub-division of the population.
Figures can be compared between populations, places or time periods by calculating the proportional mortality ratio (PMR), which is simply the ratio of the PMs in the two comparison populations, i.e. PMR = PM in population A ÷ PM in population B.
Proportional mortality is a simple and potentially useful way of portraying the burden of a specific disease within a population, and the PMR provides a way to compare populations.
The PMR is one measure of the strength of the association.
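The two definitions above can be sketched in a few lines of Python. This is an illustration only: the function names and the death counts are hypothetical and do not come from the slides.

```python
def proportional_mortality(deaths_from_cause, total_deaths):
    """PM = deaths due to cause X divided by total deaths from all causes."""
    if total_deaths <= 0:
        raise ValueError("total_deaths must be positive")
    return deaths_from_cause / total_deaths


def proportional_mortality_ratio(pm_a, pm_b):
    """PMR = PM in population A divided by PM in population B."""
    return pm_a / pm_b


# Hypothetical counts, chosen only to illustrate the arithmetic.
pm_a = proportional_mortality(30, 200)   # population A: 30 of 200 deaths
pm_b = proportional_mortality(50, 500)   # population B: 50 of 500 deaths
pmr = proportional_mortality_ratio(pm_a, pm_b)

print(f"PM in A = {pm_a:.2f}, PM in B = {pm_b:.2f}, PMR = {pmr:.2f}")
```

Note that no population denominator appears anywhere: only counts of deaths are needed, which is why the measure is usable when denominators are unavailable.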
Adjusted overall rates: standardisation and the SMR
Age and sex specific rates can be compared between times, places and sub-populations.
Age and sex specific rates may be imprecise in small studies.
Age and sex specific tables are usually large and difficult to assimilate.
If so, you may calculate the summary, overall (crude) rate.
Actual overall (crude) rates may mislead.
The age and sex structures of the compared populations probably differ.
If so, age and sex are confounding variables.
Therefore, we need to adjust (or standardise) the rates for age, sex or both.
Adjusted overall rates: standardisation and the SMR (2)
If age and sex differences are potentially interesting or important explanatory factors for population disease patterns, rates should not be adjusted.
The age-adjusted figure loses information, particularly when differences are not consistent across age groups or sexes.
With major differences in age and sex structure between populations, when adjustment is most needed, the method is less effective.
Rates adjusted by the indirect method are weighted (or biased) in relation to the age and sex structure of the population under study.
The output from such adjustment is the SMR.
Only SMR comparisons between the study population and the chosen standard population are valid.
Class exercise: age-specific and actual overall (crude) rates
Consider the age-specific and actual overall rates in table 8.3.
Comment on the age structure, and the effect this has on the overall rate, which varies in populations A, B and C.
Why does this effect occur?
Class exercise: age-specific and actual overall (crude) rates (2)
Population B has high overall rates because it has a comparatively older population.
The larger number of older people is weighting (exerting influence upon) the summary figure.
In effect, the size of the population in each age group provides a set of weights that are applied to the age-specific rates to give the overall rate.
The overall rates are misleading us into thinking there are differences because the weights exerted by the population structure differ.
Exercise: effect of directly standardising on overall rates
Consider the age structures of the standard population, and the age-specific and overall rates in table 8.4.
Calculate the number of cases expected if the standard population had the same age-specific rates as population A.
What is the relationship between the overall rates in table 8.4 and those in table 8.3?
Why are the overall rates now the same in populations A, B and C?
What is the influence of a relatively young and a relatively old standard population?
Direct standardising: example
The age-specific rate in population A, age group 21-30, is 5%.
There are 3000 people in this age group in the standard population.
If the standard population had the same rate as population A, then 5 percent of them would be affected.
5% of 3000 is 150.
Effect of directly standardising on overall rates (2)
The identical age-specific rates obtained from table 8.3 lead to an identical overall (standardised) rate.
The standard population structure supplies the weights, and these are the same in all comparison groups.
The overall result of 7.5% in table 8.4 is not real: it depends on the standard population chosen.
The young standard leads to a low standardised rate (7.5%), and an old standard to a high rate (13.9%).
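A minimal sketch of direct standardisation follows, assuming a three-band age structure. Only the 21-30 figures (a 5% rate applied to 3000 people, giving 150 expected cases) come from the slides; all other rates and population sizes are hypothetical, so the resulting rates will not match tables 8.3 and 8.4.

```python
def directly_standardise(age_specific_rates, standard_population):
    """Apply a study population's age-specific rates to a standard structure.

    Returns the expected cases per age group and the standardised rate.
    """
    expected = {
        age: age_specific_rates[age] * standard_population[age]
        for age in standard_population
    }
    rate = sum(expected.values()) / sum(standard_population.values())
    return expected, rate


# Age-specific rates in population A. Only the 21-30 rate is from the slide.
rates_a = {"21-30": 0.05, "31-40": 0.10, "41-50": 0.15}

# Two hypothetical standards: one weighted to the young, one to the old.
young_standard = {"21-30": 3000, "31-40": 2000, "41-50": 1000}
old_standard = {"21-30": 1000, "31-40": 2000, "41-50": 3000}

expected_young, rate_young = directly_standardise(rates_a, young_standard)
expected_old, rate_old = directly_standardise(rates_a, old_standard)

print(f"Expected cases aged 21-30 (young standard): {expected_young['21-30']:.0f}")
print(f"Standardised rate, young standard: {rate_young:.1%}")
print(f"Standardised rate, old standard:   {rate_old:.1%}")
```

The same age-specific rates give a lower standardised rate with the young standard than with the old one, which is the effect the exercise asks about.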
Indirect standardisation
The standard population supplies disease rates, not population structure.
The question: how many cases would have occurred if the study population had the same age-specific rates as the standard population?
The observed number of cases is compared with the expected number.
The resulting figure is the standardised morbidity (or mortality) ratio (SMR), usually expressed as a percentage.
Exercise: indirect standardisation
Example of calculation: in the age group, the rate in the standard population (table 8.5 (a)) is 10 percent.
In population A there were 1000 people in this age group.
If population A had the same age-specific rate as the standard population, 10 percent would be affected, i.e. 100.
Summing across the age groups gives the total number of cases expected if population A had the same rates as the standard population, i.e. 450.
This number can be compared with the number actually seen, i.e. 300.
The overall rates and standardised rates in the three populations A, B and C differ. Why?
Exercise: indirect standardisation (2)
Because the standard rates are weighted differentially by the different population structures of A, B and C.
Here the population structures of A, B and C are weighting the national rates.
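The arithmetic of the indirect method can be sketched as below. The figures used (a 10% standard rate applied to 1000 people, 450 expected cases in total and 300 observed) are those quoted in the exercise; the function names and the age-group label are hypothetical.

```python
def expected_cases(standard_rates, study_population):
    """Cases expected if the study population had the standard's rates."""
    return sum(
        standard_rates[age] * study_population[age] for age in study_population
    )


def smr(observed, expected):
    """Standardised morbidity (or mortality) ratio, as a percentage."""
    return 100 * observed / expected


# One age group from the exercise: standard rate 10%, 1000 people in A.
# The label is a placeholder; the transcript does not name the age group.
one_group = expected_cases({"example group": 0.10}, {"example group": 1000})
print(f"Expected cases in this age group: {one_group:.0f}")

# Totals quoted in the exercise for population A.
smr_a = smr(observed=300, expected=450)
print(f"SMR for population A: {smr_a:.1f}%")
```

An SMR below 100% means fewer cases were observed than would be expected at the standard population's rates.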
Relative risk
The relative risk is the ratio of two incidence rates: the incidence rate in the population of interest divided by the rate in a comparison (or control or reference) population.
We are relating the incidence of disease in those with the risk factor to that in those without it.
This measures the size of the effect of the risk factor on disease rates and, hence, the strength of the association in epidemiology.
The relative risk can never be calculated from case-control studies, which do not give incidence data, though in some circumstances the odds ratio calculated from such a study provides an acceptable estimate of the relative risk.
Calculating and interpreting relative risk
Imagine that the incidence of lung cancer is compared in two cities, one with polluted air (A), the other not (B).
In the polluted city there were 20 cases in a population of 100,000; in the other city 10 cases in a population of 100,000.
Assume accuracy in the numerators and denominators.
What is the relative risk of lung cancer in the polluted city (A)?
What is the relative risk of lung cancer in the less polluted city (B)?
What explanations are there for the higher relative risk in the polluted city?
What questions will you consider before concluding that there is a real association between pollution and lung cancer?
The two by two table

| Risk factor | Outcome: disease | Outcome: no disease | Total |
| --- | --- | --- | --- |
| Present | a | b | a+b |
| Absent | c | d | c+d |
| Total | a+c | b+d | a+b+c+d |
Simple formulae for relative risk and odds ratios
(a) Incidence in those with the risk factor = a ÷ (a+b); incidence in those without the risk factor = c ÷ (c+d)
(b) Relative risk = [a ÷ (a+b)] ÷ [c ÷ (c+d)]
(c) OR = cross product ratio = (a x d) ÷ (b x c)
The two by two table: lung cancer as a rare outcome

| Risk factor | Outcome: lung cancer | Outcome: no lung cancer | Total |
| --- | --- | --- | --- |
| Living in city A | a = 20 | b = 99,980 | a+b = 100,000 |
| Living in city B | c = 10 | d = 99,990 | c+d = 100,000 |
| Total | a+c = 30 | b+d = 199,970 | a+b+c+d = 200,000 |
The two by two table: lung cancer as a common outcome

| Risk factor | Outcome: lung cancer | Outcome: no lung cancer | Total |
| --- | --- | --- | --- |
| Living in city A | a = 20 | b = 80 | a+b = 100 |
| Living in city B | c = 10 | d = 90 | c+d = 100 |
| Total | a+c = 30 | b+d = 170 | a+b+c+d = 200 |
Relative risk exercise: answers
Relative risk in city A = incidence rate in city A ÷ incidence rate in city B = 20 per 100,000 ÷ 10 per 100,000 = 2.
Relative risk in city B = incidence rate in city B ÷ incidence rate in city A = 10 per 100,000 ÷ 20 per 100,000 = 0.5.
If investigators can consider the relative risk a fair measure of the strength of the association, they can apply frameworks for causal thinking to judge whether pollution is the probable cause of the higher relative risk in city A.
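The same answers can be reproduced with a short sketch. The case counts and populations are those given in the exercise; the function names are hypothetical.

```python
def incidence(cases, population):
    """Incidence as a proportion of the population at risk."""
    return cases / population


def relative_risk(incidence_of_interest, incidence_reference):
    """Ratio of the incidence in the population of interest to the reference."""
    return incidence_of_interest / incidence_reference


incidence_a = incidence(20, 100_000)  # polluted city A
incidence_b = incidence(10, 100_000)  # less polluted city B

rr_a = relative_risk(incidence_a, incidence_b)  # A relative to B
rr_b = relative_risk(incidence_b, incidence_a)  # B relative to A

print(f"Relative risk in city A: {rr_a:.1f}")
print(f"Relative risk in city B: {rr_b:.1f}")
```

Swapping the reference population simply inverts the ratio, so the choice of reference group should always be stated.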
Odds ratios
Odds are the chances in favour of one side in relation to the second side.
Odds are the chances of being exposed (or diseased) as opposed to not being exposed (or diseased).
An odds ratio is simply one set of odds divided by another.
The odds of exposure, in the two by two table, are a÷c for the group with disease and b÷d for the group without disease.
The odds ratio for exposure is simply the odds a÷c divided by the odds b÷d.
Similarly, the odds of disease in those exposed to the risk factor are a÷b, and for those not exposed c÷d, and the odds ratio is a÷b divided by c÷d.
Both odds ratios simplify to the cross product ratio (a x d) ÷ (b x c).
Odds ratios (2)
The epidemiological idea is a simple one, i.e. if a disease is causally associated with an exposure, then the odds of exposure in the diseased group will be higher than the corresponding odds in the non-diseased group.
If there is no association, the odds ratio will be one.
If the exposure is protective against disease, the odds ratio will be less than one.
Odds ratios (3)
In what circumstances will the odds ratio for disease approximate the relative risk?
For both the odds ratio and the relative risk the numerators (a and c) for the fractions are identical.
The denominators are different, that is, b and d in the odds ratio, and a + b and c + d in the relative risk.
Odds ratios (4)
When b is similar to a + b, and d is similar to c + d, the odds ratio and relative risk will be similar.
This happens when the disease is rare, i.e. when a and c are small.
Odds ratios (5)
Odds ratios approximate well to the relative risk in some circumstances.
In case-control studies, where relative risk cannot be calculated, the odds ratio provides an estimate.
Odds have desirable mathematical properties permitting easy manipulation in mathematical models and statistical computations, as, for example, in multiple logistic regression.
Epidemiologists need to be aware that misinterpretation of the odds ratio is common.
Statistical packages may label the output of odds ratio analysis as relative risk, creating a trap for the unwary investigator.
Exercise on odds ratios
Calculate the odds ratio in the lung cancer exercise for the two instances: where the outcome is rare and where the outcome is common.
How do these values compare with the relative risk?
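One way to check the exercise is to apply the two by two formulae to both lung cancer tables. The cell counts below are taken from those tables; the function names are hypothetical.

```python
def relative_risk_2x2(a, b, c, d):
    """Relative risk = [a / (a + b)] / [c / (c + d)]."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio_2x2(a, b, c, d):
    """Odds ratio = cross product ratio = (a * d) / (b * c)."""
    return (a * d) / (b * c)


# Cells (a, b, c, d) from the two lung cancer tables.
rare_outcome = (20, 99_980, 10, 99_990)
common_outcome = (20, 80, 10, 90)

for label, cells in (("rare", rare_outcome), ("common", common_outcome)):
    rr = relative_risk_2x2(*cells)
    odds = odds_ratio_2x2(*cells)
    print(f"{label:>6} outcome: RR = {rr:.4f}, OR = {odds:.4f}")
```

With the rare outcome the odds ratio is almost identical to the relative risk of 2; with the common outcome it rises to 2.25 and overstates the relative risk.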
Epidemiological information to choose between priorities
In a few diseases there is a unique known causal factor, e.g. nutritional disorders such as scurvy.
All cases of such diseases are attributable, by definition, to one cause.
Often the causes are multiple and complex.
Choosing between alternative actions becomes necessary because time, money, energy and expertise are limited.
Attributable risk provides a way of developing the epidemiological base for such decisions.
Epidemiological information to choose between priorities (2)
Imagine that there are insufficient resources to tackle all six of these CHD risk factors. What epidemiological information would help choose between them to reduce coronary heart disease in a population?
High levels of some lipids in the blood, particularly low density lipoprotein (LDL) cholesterol
High blood pressure
Smoking
Low levels of physical activity
Obesity
Diabetes
Epidemiological information to choose between priorities (3)
Solid evidence that each of these risk factors is a component of the causal pathway
Knowledge of the frequency of each risk factor in the population
Knowledge of the additional risk that each risk factor imposes
Understanding of the actions that are (or might be) effective in reducing the prevalence of the risk factor, and their costs
The reduction in disease outcome (attributable risk)
Epidemiological information to choose between priorities (4)
The question being answered by attributable risk is: how many cases would not have occurred if a particular risk factor had not been present?
Or, what proportion of disease incidence in those exposed to the risk factor is attributable to that particular risk factor?
In short, what is the attributable risk associated with a risk factor?
To calculate it, subtract from the total number of cases the number that would have occurred anyway, even if the cases had not had the risk factor.
Attributable risk for lung cancer in city A

| Risk factor | Outcome: lung cancer | Outcome: no lung cancer | Total |
| --- | --- | --- | --- |
| Living in city A | a = 20 | b = 80 | a+b = 100 |
| Living in city B | c = 10 | d = 90 | c+d = 100 |
| Total | a+c = 30 | b+d = 170 | a+b+c+d = 200 |
Attributable risk for lung cancer in city A
Attributable risk = incidence in city A minus incidence in city B = 20/100 - 10/100 = 10/100.
Expressed as a fraction of the total risk in city A, this is (20 - 10) ÷ 20 = 0.5.
This is best expressed as a percentage, so we multiply by 100 = 50%.
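The calculation above can be sketched as follows, using the city A and city B counts from the table (lung cancer as a common outcome). The function names are hypothetical.

```python
def attributable_risk(incidence_exposed, incidence_unexposed):
    """Excess incidence in the exposed group over the unexposed group."""
    return incidence_exposed - incidence_unexposed


def attributable_fraction(incidence_exposed, incidence_unexposed):
    """Attributable risk as a fraction of the total risk in the exposed."""
    return (incidence_exposed - incidence_unexposed) / incidence_exposed


incidence_a = 20 / 100  # city A (exposed)
incidence_b = 10 / 100  # city B (unexposed)

ar = attributable_risk(incidence_a, incidence_b)
af = attributable_fraction(incidence_a, incidence_b)

print(f"Attributable risk: {ar:.2f} (10 per 100)")
print(f"As a percentage of the risk in city A: {af:.0%}")
```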
Population attributable risk
From a public health perspective we are interested in the benefits of an intervention both to the exposed group and to the whole community.
In this case the question is: what proportion of the disease in the whole population (not just the exposed population) is attributable to a particular exposure?
The answer depends on how common the exposure is.
If a community had no or very little exposure to smoking, as in Sikh women living in the Punjab, India, then cases of lung cancer in that population must be caused by other factors.
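The dependence on how common the exposure is can be illustrated with a short sketch. The slides give no formula for population attributable risk, so this assumes the widely used expression p(RR - 1) ÷ [1 + p(RR - 1)], where p is the proportion of the population exposed and RR is the relative risk. The relative risk of 2 is borrowed from the lung cancer example; the exposure proportions are hypothetical.

```python
def population_attributable_fraction(proportion_exposed, relative_risk):
    """Fraction of all disease in the population attributable to the exposure.

    Assumes the formula p(RR - 1) / (1 + p(RR - 1)); it is not from the slides.
    """
    excess = proportion_exposed * (relative_risk - 1)
    return excess / (1 + excess)


RELATIVE_RISK = 2.0  # from the lung cancer example

# Hypothetical exposure proportions, from none exposed to everyone exposed.
for p in (0.0, 0.01, 0.5, 1.0):
    paf = population_attributable_fraction(p, RELATIVE_RISK)
    print(f"Proportion exposed {p:4.0%}: population attributable fraction {paf:.1%}")
```

With no exposure the fraction is zero, as in the smoking example above. When everyone is exposed it equals the attributable fraction in the exposed group (50% for a relative risk of 2).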
Numbers needed to treat (NNT) or to prevent (NNP)
The NNT is a measure that combines directness with simplicity.
It is the number of people who need to be treated for one patient to benefit.
The same measure could be applied to preventative measures.
The NNT is the reciprocal of the absolute (or actual) risk reduction.
The reciprocal of 5 is 1/5.
So, if the incidence of the outcome in the untreated group = 30/1000 and the incidence of the outcome in the treated group = 25/1000, then the actual or absolute reduction in risk = (30 - 25)/1000 = 5/1000, and the NNT = 1000/5 = 200.
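A minimal sketch of the NNT calculation, using the incidences quoted above. The function name is hypothetical, and exact fractions are used here only to avoid floating-point rounding in the reciprocal.

```python
from fractions import Fraction


def number_needed_to_treat(risk_untreated, risk_treated):
    """NNT = reciprocal of the absolute risk reduction."""
    absolute_risk_reduction = risk_untreated - risk_treated
    if absolute_risk_reduction <= 0:
        raise ValueError("treatment must reduce risk for the NNT to be defined")
    return 1 / absolute_risk_reduction


risk_untreated = Fraction(30, 1000)
risk_treated = Fraction(25, 1000)

arr = risk_untreated - risk_treated
nnt = number_needed_to_treat(risk_untreated, risk_treated)

print(f"Absolute risk reduction: {arr} (i.e. 5 per 1000)")
print(f"NNT: {nnt}")
```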
Theory
Epidemiological purposes and theories underpin the measurement, presentation and interpretation of data.
The capacity to measure and analyse data also alters our theories, e.g. the case-control study and the odds ratio are now inextricably intertwined.
Interpretation of data is influenced by investigators' philosophy on the nature of knowledge (epistemology).
Epidemiologists practise positivism, the philosophic system that is based on facts, acquired by empirical observations, and logic.
Facts are extracted by analysis and interpretation from data that are invariably flawed.
Summary
Epidemiological data can be manipulated and presented in many ways.
Epidemiological summary measures estimate absolute risks (e.g. numbers, rates, life years lost, numbers needed to treat) or relative ones (e.g. adjusted rates, relative risk, odds ratios).
Relative and actual risks portray dramatically different perspectives on the health needs of populations.
Relative measures of risk are more useful in aetiologic inquiry.
Actual measures are better in health planning and policy.
Epidemiological data on diseases can be combined with other information on risk factors.
Combining data sets generates causal understanding of disease processes in populations and rational interventions to improve public health.