Measures of disease occurrence and frequency

Measures of disease occurrence and frequency Epidemiology matters: a new introduction to methodological foundations Chapter 5

Seven steps (Epidemiology Matters, Chapter 1)
1. Define the population of interest
2. Conceptualize and create measures of exposures and health indicators
3. Take a sample of the population
4. Estimate measures of association between exposures and health indicators of interest
5. Rigorously evaluate whether the association observed suggests a causal association
6. Assess the evidence for causes working together
7. Assess the extent to which the result matters, is externally valid, to other populations

Seven measures of disease occurrence and frequency
1. Counts
2. Prevalence
3. Incidence/risk
4. Mean/variance
5. Median
6. Mode
7. Rates

Tuberculosis in New York City
Tuberculosis is a reportable condition: all diagnosed cases must be reported to the department of health. In 2011, there were 689 new cases of tuberculosis in New York City. Is this information useful?

1. Counts
Counts provide an absolute number for the burden of disease. However, counts have limited utility for two reasons. First, the burden of disease in the population is very different if the population size is 100,000 versus 1,000,000. Second, some people were not at risk of developing new-onset tuberculosis in 2011 (because of pre-existing infection), so we need to know not only the size of the total population but also the size of the population at risk.
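
The first limitation can be illustrated with a minimal sketch. It uses only the 689 new cases and the two population sizes mentioned above; the helper name `cases_per_100k` is hypothetical, and the calculation divides by the total population rather than the population at risk, which the slide does not give.

```python
# Illustration only: the same count of cases in two hypothetical populations.
NEW_CASES = 689  # new tuberculosis cases reported in 2011 (from the slide)


def cases_per_100k(cases: int, population: int) -> float:
    """Express a raw count as cases per 100,000 people (hypothetical helper)."""
    return cases * 100_000 / population


small_population = cases_per_100k(NEW_CASES, 100_000)
large_population = cases_per_100k(NEW_CASES, 1_000_000)

print(f"Population 100,000:   {small_population:.1f} cases per 100,000")
print(f"Population 1,000,000: {large_population:.1f} cases per 100,000")
```

The count is identical in both lines of output, but the burden per person differs tenfold.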

Incidence and prevalence
Two measures overcome many of the limitations of a simple count of cases: incidence and prevalence. Prevalence tells us the proportion of cases among the total population at a given time. Incidence tells us the probability of a new onset of disease among those at risk of developing the illness.

2. Prevalence
The proportion of people who have the disease (existing cases plus new cases) out of the total population for a given time period.

Disease occurrence in a sample of Farrlandia over time
In Year 1, 5 individuals developed the outcome. In Year 2, an additional 7 people developed the outcome. In Year 3, an additional 4 people developed the outcome.

What is the prevalence of disease in Year 2?
What is the numerator? 5 cases in Year 1 + 7 cases in Year 2 = 12. What is the denominator? Total sample size = 30. Prevalence = 12/30 = 0.4. The prevalence of disease in Year 2 is 40%.

What is the prevalence of disease in Year 3?
What is the numerator? 5 cases in Year 1 + 7 cases in Year 2 + 4 cases in Year 3 = 16. What is the denominator? Total sample size = 30. Prevalence = 16/30 = 0.533. The prevalence of disease in Year 3 is 53.3%.

Summary: Prevalence
For prevalence, we need a numerator (number of existing cases), a denominator (total sample size), and a time period of interest. The time period should be specified as precisely as possible. For example, when we say “in Year 2” we mean the duration of time up to and including Year 2.
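
The Farrlandia prevalence calculations can be reproduced with a short sketch. The function name `prevalence` is hypothetical, and the code assumes, as the worked examples implicitly do, that cases accumulate: nobody recovers, dies, or leaves the sample.

```python
# Farrlandia sample: new cases that appeared in each year (from the slides).
NEW_CASES_BY_YEAR = {1: 5, 2: 7, 3: 4}
SAMPLE_SIZE = 30


def prevalence(through_year: int) -> float:
    """Proportion of the sample with the disease by the end of `through_year`.

    Assumption: cases accumulate, i.e. nobody recovers, dies, or leaves.
    """
    existing_cases = sum(
        count for year, count in NEW_CASES_BY_YEAR.items() if year <= through_year
    )
    return existing_cases / SAMPLE_SIZE


for year in (2, 3):
    print(f"Prevalence in Year {year}: {prevalence(year):.1%}")
```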

3. Incidence
Perhaps the most widely used tool in epidemiology. It goes by many names: the most common alternative name is “risk,” and less commonly, “incidence proportion.” Numerator = number of new cases. Denominator = population at risk of becoming a new case. Both are specified over a specific time period.

What is the incidence of disease in Year 2?
What is the numerator? 7 new cases in Year 2. What is the denominator? 25 people at risk (5 people already developed the disease in Year 1 and are thus not at risk). Incidence = 7/25 = 0.28. The incidence (risk) of disease in Year 2 is 28%.

What is the incidence of disease in Years 2 and 3?
What is the numerator? 7 new cases in Year 2 + 4 new cases in Year 3 = 11. What is the denominator? 25 people at risk (5 people already developed the disease in Year 1 and are thus not at risk). Incidence = 11/25 = 0.44. The incidence (risk) of disease in Years 2 and 3 is 44%.

Summary: Incidence
For incidence, we need a numerator (number of new cases), a denominator (total sample size at risk), and a time period of interest. The time period should again be specified as precisely as possible.
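
The two incidence examples differ from prevalence only in the denominator, which a small sketch can make explicit. The function name `incidence` and its arguments are hypothetical; the counts are the Farrlandia numbers from the slides.

```python
# Farrlandia sample: new cases that appeared in each year (from the slides).
NEW_CASES_BY_YEAR = {1: 5, 2: 7, 3: 4}
SAMPLE_SIZE = 30


def incidence(first_year: int, last_year: int) -> float:
    """Risk of becoming a new case between first_year and last_year, inclusive."""
    # People who became cases before the window opens are no longer at risk.
    prior_cases = sum(n for y, n in NEW_CASES_BY_YEAR.items() if y < first_year)
    at_risk = SAMPLE_SIZE - prior_cases
    new_cases = sum(
        n for y, n in NEW_CASES_BY_YEAR.items() if first_year <= y <= last_year
    )
    return new_cases / at_risk


print(f"Incidence in Year 2:        {incidence(2, 2):.0%}")
print(f"Incidence in Years 2 and 3: {incidence(2, 3):.0%}")
```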

The relation between incidence and prevalence

Understanding incidence and prevalence: the bathtub example

Examples of the relation between incidence and prevalence
High incidence, steady prevalence. Example: a highly contagious infectious disease with very short duration or a high case fatality.
Low incidence, high prevalence. Examples: diseases with long duration such as arthritis, diabetes, Crohn’s disease, and other chronic illnesses.

Examples of the relation between incidence and prevalence
Impact of a new treatment that prolongs life with the disease but does not cure it. Figure: new HIV infections and people living with HIV.

Summary: incidence and prevalence
Prevalence is affected by incidence and duration. If a disease has short duration, prevalence ≈ incidence*. If a disease has long duration, in general, prevalence > incidence.
* Assumes that incidence is constant over time.
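
The bathtub idea can be sketched as a toy simulation. This is not taken from the slides: it assumes a constant inflow of new cases each year and that a fraction 1/duration of existing cases leaves each year through recovery or death. The function name and all numbers are hypothetical.

```python
def steady_state_cases(new_cases_per_year: float, duration_years: float,
                       years: int = 500) -> float:
    """Simulate the pool of existing cases (the water level in the bathtub).

    Toy assumptions: inflow is constant, and each year a fraction
    1/duration_years of existing cases leaves through recovery or death.
    """
    existing = 0.0
    for _ in range(years):
        outflow = existing / duration_years            # the drain
        existing = existing - outflow + new_cases_per_year  # the tap
    return existing


short_duration = steady_state_cases(new_cases_per_year=10, duration_years=1)
long_duration = steady_state_cases(new_cases_per_year=10, duration_years=20)

print(f"Duration 1 year:   about {short_duration:.0f} existing cases")
print(f"Duration 20 years: about {long_duration:.0f} existing cases")
```

With the same inflow, the pool of existing cases stays close to the yearly number of new cases when duration is short and grows far beyond it when duration is long.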

Mean, variance, median, mode
Health outcomes are sometimes not measured by presence or absence, but rather as a continuous measure. Examples: body mass index, blood pressure, cholesterol, birth weight, lung function, number of depression or anxiety symptoms. In these cases, we need measures of centrality and spread to characterize occurrence and frequency.

4. Mean
The mean is estimated by summing the outcomes for each individual and dividing that sum by the number of individuals. For example, suppose we measured BMI in a sample of 31 individuals.

Mean
Table: Body mass index (BMI) in a random sample of 31 Farrlandians

Mean
The mean is estimated by summing the outcomes for each individual and dividing that sum by the number of individuals. Thus, the mean BMI in our sample is 31.1.

Variance
In addition to estimating the mean of a continuous variable, it is important to estimate how close all of the individual values are to that mean. For example, suppose we sampled two populations and obtained the following histograms of their risk of disease.

The values of BMI in Sample 2 are closer to the mean than in Sample 1. Therefore, Sample 2 has a lower variance than Sample 1.

Variance
The spread of individual values around the mean is a measure of the variance of the data. The size of the variance gives us important information about the distribution of the variable of interest within the sample. A large variance tells us that while the mean may be 31.1, there is a wide range of values across the whole sample (and, if the sample is representative, in the underlying population). A small variance tells us that there is little variability in the sample (and, if the sample is representative, in the underlying population) with respect to the variable of interest.
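
A minimal sketch of the contrast between a large and a small variance follows. The BMI values are invented for illustration, since the histograms are not reproduced in this transcript; both samples are built to share the same mean.

```python
import statistics

# Hypothetical BMI values, invented for illustration; not the Farrlandia data.
sample_1 = [21, 25, 31, 37, 41]   # widely spread around the mean
sample_2 = [29, 30, 31, 32, 33]   # tightly clustered around the mean

for name, values in (("Sample 1", sample_1), ("Sample 2", sample_2)):
    mean = statistics.mean(values)
    variance = statistics.variance(values)  # sample variance (n - 1 denominator)
    print(f"{name}: mean = {mean}, variance = {variance}")
```

Both samples report the same mean, so only the variance reveals how differently the individual values are distributed.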

Mean and variance: limitations
The mean can be influenced by extremes in the data. If our data had one record miscoded as a BMI of 550 instead of 55, the mean would be 47.1 rather than 31.1. In general, when the outcomes are not evenly distributed across a full range of potential values and instead are aggregated at the low end or the high end, the mean may not be the most informative measure of centrality. For example, suppose we would like to measure the mean number of cigarettes smoked per day among a sample of adolescents.

Mean and variance: limitations
Table: Number of cigarettes smoked per day among a random sample of 17 adolescents

Mean and variance: limitations
The mean would be 9.24. However, most of the values are between 1 and 3, so reporting an average of 9.24 cigarettes smoked in the sample is not very informative.

5. Median
The median of a variable is the numerical value that falls in the exact middle of the range of values; it is the value for which 50% of the remaining values are above and 50% are below.

Median
Example 1: 3, 5, 7. The median value is 5.
Example 2: 3, 3, 5, 7, 9, 9, 11. The median value of this variable is 7.
Example 3: 1, 1, 3, 4, 7, 9. There are six observations in this set, so no single value falls directly in the middle. In this case, we take the mean of the two most central values. Since 3 and 4 are the most central values (2 observations fall below, and 2 observations fall above), the median of this set is the mean of 3 and 4: (3+4)/2 = 3.5.
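
The three median examples can be checked with Python's standard `statistics` module, which averages the two central values when the number of observations is even. The variable names are chosen here for illustration.

```python
import statistics

# The three sets of values from the slide.
examples = [
    [3, 5, 7],
    [3, 3, 5, 7, 9, 9, 11],
    [1, 1, 3, 4, 7, 9],     # even count: median is the mean of 3 and 4
]

medians = [statistics.median(values) for values in examples]

for values, median in zip(examples, medians):
    print(f"{values}: median = {median}")
```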

Median
Considering our smoking variable, the median value would be 2. With the 17 values sorted in order, eight observations fall before the middle value and eight fall after it, and that middle value is 2. Thus, whereas the mean number of cigarettes smoked was 9.24, the median was 2. This signals that the distribution is quite skewed by a few heavy smokers.

6. Mode
One simple measure of centrality is the most frequently observed value, which is labeled the mode. Returning to our example of cigarette smoking, we can determine the following:
3 students reported smoking 1 cigarette per day
6 students reported 2 cigarettes per day
4 students reported 3 cigarettes per day
1 student reported 10 per day
1 student reported 20 per day
1 student reported 40 per day
1 student reported 60 per day
The modal value is the value that is most frequent; given that 6 students reported 2 cigarettes per day, the modal value would be 2.
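
The smoking example can be rebuilt from the reported frequencies to compare all three measures of centrality side by side. This is a sketch using the standard `statistics` module; the variable names are hypothetical.

```python
import statistics

# Cigarettes per day -> number of students reporting that value (from the slide).
reported = {1: 3, 2: 6, 3: 4, 10: 1, 20: 1, 40: 1, 60: 1}

# Expand the frequency table into one value per student, in sorted order.
cigarettes = sorted(value for value, n in reported.items() for _ in range(n))

mean = statistics.mean(cigarettes)
median = statistics.median(cigarettes)
mode = statistics.mode(cigarettes)

print(f"n = {len(cigarettes)}, mean = {mean:.2f}, median = {median}, mode = {mode}")
```

The mean is pulled upward by the four heaviest smokers, while the median and mode both stay at the value most students reported.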

7. Incidence rates
We have learned that “incidence” or “risk” is calculated as the number of new cases over the population at risk of becoming a new case. Incidence is an accurate representation of a sample's experience of health and disease when we have complete follow-up of the sample. That is, each individual is observed at every measurement time point from the beginning of the study to the end.

Example: alcohol consumption and liver cirrhosis
Suppose we conduct a study to estimate the association between heavy alcohol consumption and liver cirrhosis. We follow 20 people over time; 10 are heavy alcohol consumers. First, let us imagine that we had complete follow-up data on all people in the study.

Disease incidence over time by population exposure, four time points
Incidence = 13/20 = 0.65 or 65%. Let us imagine that this is the sample followed forward in time with complete follow-up. People in black are exposed and people in grey are unexposed; ignore this for now, as we will return to black and grey when we learn about measures of association.

Example: alcohol consumption and liver cirrhosis
Now, let us imagine that we lost some people over time. Thus, we do not know whether these individuals became diseased or not.

Loss to follow-up in a sample over time

Incidence when there is loss to follow-up
We know that the true incidence is 65%. If we only analyzed the data based on who was present at the end of the study, we would estimate incidence as 9/15 = 0.60 or 60%. If we assumed that individuals who dropped out did not become diseased, we would get 9/20 = 0.45 or 45%. If we assumed that individuals who dropped out did become diseased, we would get 14/20 = 0.70 or 70%. There is one more option: a rate.
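
The three estimates above follow from one set of counts, as this sketch shows. The number lost to follow-up (5) is inferred here from the fractions on the slide (20 enrolled, 15 present at the end); the variable names are hypothetical.

```python
ENROLLED = 20        # people at the start of the study
LOST = 5             # lost to follow-up (inferred: 20 enrolled, 15 remaining)
OBSERVED_CASES = 9   # cases seen among those who stayed in the study
TRUE_INCIDENCE = 13 / ENROLLED  # known only because this is a constructed example

# Option 1: analyze only those present at the end of the study.
completers_only = OBSERVED_CASES / (ENROLLED - LOST)

# Option 2: assume nobody who dropped out became diseased.
assume_lost_healthy = OBSERVED_CASES / ENROLLED

# Option 3: assume everybody who dropped out became diseased.
assume_lost_diseased = (OBSERVED_CASES + LOST) / ENROLLED

for label, estimate in [
    ("Completers only", completers_only),
    ("Dropouts assumed healthy", assume_lost_healthy),
    ("Dropouts assumed diseased", assume_lost_diseased),
    ("True incidence", TRUE_INCIDENCE),
]:
    print(f"{label}: {estimate:.0%}")
```

None of the three assumptions recovers the true value, which is the motivation for turning to a rate.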

Incidence rates
Incidence rates are commonly used in prospective studies in which some people are lost over time. To estimate a rate over the time frame of the study, we need to know how much total time each person contributed to the study follow-up before they either developed the outcome or dropped out. We term the total time that each person contributed person-time.

Understanding person years
Person 2 stayed in the study all 40 years and did not develop the outcome. Person 10 dropped out of the study at Year 30. Person 19 developed the outcome at Year 10.

Understanding person years
Table: Person-time and disease status among 20 subjects followed for forty years

Calculating the incidence rate
The numerator is the number of cases. The denominator is the total person-time. In our example: 8/440 = 0.018, or a rate of 18 cases per 1,000 person-years.

Calculating the incidence rate
The incidence rate can be interpreted as the number of expected cases in every set of 1,000 person-years. That is, if we were to observe 1,000 people for 1 year, we would expect 18 cases. If we were to observe 500 people for 2 years, we would still expect 18 cases. The assumption underlying this is that the incidence rate is constant over time, so for every year in which 1,000 person-years are observed, an additional 18 cases will be expected. Given this assumption, the incidence rate tells us the average number of cases per specified set of person-time.
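
The rate calculation and its interpretation can be sketched as follows, using the 8 cases and 440 person-years from the example. The helper `expected_cases` is hypothetical and relies on the same constant-rate assumption stated above.

```python
CASES = 8            # new cases observed during follow-up (from the slide)
PERSON_YEARS = 440   # total person-time contributed by the sample (from the slide)

rate_per_person_year = CASES / PERSON_YEARS
rate_per_1000 = rate_per_person_year * 1000


def expected_cases(people: int, years: float) -> float:
    """Expected cases for a given amount of person-time.

    Assumption: the incidence rate is constant over time.
    """
    return rate_per_person_year * people * years


print(f"Rate: {rate_per_person_year:.3f} per person-year")
print(f"Rate: {rate_per_1000:.0f} per 1,000 person-years")
print(f"1,000 people for 1 year: {expected_cases(1000, 1):.0f} expected cases")
print(f"500 people for 2 years:  {expected_cases(500, 2):.0f} expected cases")
```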

Rate versus proportion: what’s the difference?
A proportion can range from 0 to 1 (0% to 100%), and the numerator is contained in the denominator. A rate can range from 0 to infinity; the numerator is the number of cases, whereas the denominator is the person-time at risk. Incidence rates can be conceptualized as the speed at which disease is occurring, in cases per person-year. When we have complete follow-up of a sample or a population, the rate can approximate the proportion of disease, or the risk.

Risks and rates, an example, part 1
We have 10 people who are disease free at the start of follow-up, each followed for 1 year. Three of these individuals develop the disease. All individuals are followed for the entirety of the study period. The risk (incidence) of disease will be 3 out of 10, or 0.3. Assuming these individuals developed the disease just as the year was ending, the rate would be 3 per 10 person-years, or 0.3 (equivalent to the risk).

Rate versus proportion, an example, part 2
Now suppose that those who developed the disease did so halfway through the year. 7 people were followed and did not develop the disease, i.e., 1 person-year each, totaling 7 person-years. 3 people developed the disease, i.e., we assign each of them 0.5 person-years for the midpoint of the time interval, for a total of 1.5 person-years. Thus, the incidence rate would be 3 per 8.5 person-years, or 0.35.
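
Parts 1 and 2 of this example differ only in how much person-time the three cases contribute, which a short sketch makes visible. The variable names are hypothetical.

```python
PEOPLE = 10   # disease free at the start of follow-up
CASES = 3     # people who develop the disease during the year

# The risk ignores timing: new cases over people at risk.
risk = CASES / PEOPLE

# Part 1: cases occur just as the year ends, so everyone contributes a full year.
person_years_late = PEOPLE * 1.0
rate_late = CASES / person_years_late

# Part 2: cases occur halfway through the year and contribute 0.5 years each.
person_years_mid = (PEOPLE - CASES) * 1.0 + CASES * 0.5
rate_mid = CASES / person_years_mid

print(f"Risk: {risk:.2f}")
print(f"Rate, late onset:     {rate_late:.2f} per person-year")
print(f"Rate, mid-year onset: {rate_mid:.2f} per person-year")
```

The earlier the cases occur, the smaller the denominator and the higher the rate, even though the risk is unchanged.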

Incidence vs. incidence rate: what’s the difference?
Because measures of incidence are so central to epidemiological investigation, the term “incidence” is used in various contexts, and the concept that we refer to as “incidence” can go by different names.
Incidence: the number of new cases divided by the population at risk. It is also called the incidence proportion, or the risk. When we refer to “incidence”, we mean this measure.
Incidence rate: the number of new cases divided by the person-time at risk contributed by members of the study. When we refer to “incidence rate”, we specifically mean a measure in which the denominator is person-time at risk.

An extra: conditional risks
We can “condition” risk estimates on other factors to begin to examine whether certain factors are associated with increased or decreased risk. Let us return to our earlier example of alcohol consumption and liver cirrhosis. In order to estimate whether heavy drinkers have a different incidence of cirrhosis compared with non-heavy drinkers, we can use a measure of the conditional incidence.

Two by two table showing exposure in each row and disease status in each column
Conditional risk of cirrhosis among heavy drinkers = 8/10 = 80%. Conditional risk of cirrhosis among non-heavy drinkers = 5/10 = 50%.
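
The conditional risks can be computed from the counts quoted above. This sketch stores each exposure group as a (cases, total) pair rather than reproducing the original table layout, which is not in this transcript; the names are hypothetical.

```python
# Exposure group -> (cases of cirrhosis, people in the group), from the slide.
two_by_two = {
    "heavy drinkers": (8, 10),
    "non-heavy drinkers": (5, 10),
}

# Risk of cirrhosis conditional on exposure group.
conditional_risk = {
    group: cases / total for group, (cases, total) in two_by_two.items()
}

# Ignoring exposure gives back the overall incidence in the sample.
total_cases = sum(cases for cases, _ in two_by_two.values())
total_people = sum(total for _, total in two_by_two.values())
overall_risk = total_cases / total_people

for group, risk in conditional_risk.items():
    print(f"Risk among {group}: {risk:.0%}")
print(f"Overall risk: {overall_risk:.0%}")
```

The overall figure matches the 13/20 incidence from the complete follow-up example, while the conditional risks show how it splits by exposure.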

Conditional risks
It appears that heavy drinkers have a higher incidence of cirrhosis compared with non-heavy drinkers. (Next we will learn how to quantify this.) Building 2x2 tables that cross exposure with disease, and using these tables to estimate associations, will become a building block of epidemiology.

Summary
Measures of disease occurrence and frequency in epidemiology are the cornerstone of how we build the science of population health. Key measures are incidence/risk, prevalence, mean, median, mode, incidence rates, and conditional risks. Incidence rates are more appropriate than incidence when there are losses to follow-up.

epidemiologymatters.org