Presentation is loading. Please wait.

Presentation is loading. Please wait.

Interactive learning modules and videos of abridged lectures can be accessed by smart phone as well.

Similar presentations


Presentation on theme: "Interactive learning modules and videos of abridged lectures can be accessed by smart phone as well."— Presentation transcript:

1 Interactive learning modules and videos of abridged lectures can be accessed by smart phone as well. http://sph.bu.edu/otlt/lamorte/EP713/

2 Rounding errors When calculating ratios, don’t round off the numerator and denominator too much; it can cause the result to vary quite a bit. Keep 3-4 decimal places in the numerator & denominator. Round off the final result. 0.1199 rounded to 0.1449 0.1 = 1.0 0.1 0.1199 = 0.827 = 0.83 0.1449

3 Using a reference group for comparison: You can do this for both cohort studies and case-control studies.

4 # MIs (non-fatal) 41 57 56 67 85 Person-years of observation 177,356 194,243 155,717 148,541 99,573 Rate of MI per 100,000 P-Yrs (incidence) 23.1 29.3 36.0 45.1 85.4 Relative Risk 1.0 1.3 1.6 2.0 3.7 <21 21-23 23-25 25-29 >29 BMI: wgt kg hgt m 2 BMIEventsPerson-Years >298599,573 <2141177,356

5

6 1620 62163 Tooth Loss some none 2217 62163 Tooth Loss complete none Case Control Cancer Using a Reference Group in a Case-Control Study

7 Evaluating Random Error (Sampling Error)

8 Target Population Study population Collect data Make comparisons Is there an association? Are the results valid?  Chance (Random Error)  Bias  Confounding Inference Sample Testing Whether a Factor Is Associated With Disease

9 What is a “random sample”? In reality, we usually take convenience samples… we take the subjects that are available. Example: BWHS recruiting via subscribers to a magazine. When selection of one subject doesn’t affect the chance of other subjects being chosen, i.e. selection is be independent (uninfluenced by other factors). To achieve this we should select by a process such that selection is independent of other factors… an independent selection process.

10 Types of Data 1) Discrete Variables: Variables that assume only a finite number of values: Dichotomous variables (male/female, alive/died) Categorical variables (or nominal variables, e.g., race) Ordinal variables: ordered categories (Excellent/Good/Fair/Poor) 2) Continuous Variables: Can take on any value within a range of plausible values, e.g., serum cholesterol level, weight & systolic blood pressure. (Also called quantitative or measurement variables.)

11 Case fatality rate for H1N1? Is H1N1 vaccine effective? Suppose you were reading two studies: one that addressed the first question and another that addressed the second question… what would be the study design of each? What was being estimated in each?

12 One Group:  Estimating a prevalence, risk, or rate in one group (e.g., case-fatality rate with bird flu)  Estimating mean (e.g. average body weight) (How precise are the estimates?) Comparing 2 or more groups:  Comparing risks, rates  Comparing means (body weigh in smokers versus non- smokers) (Are differences just due to sampling error?) (If we calculate a measure of association, how precise is it?) Goals

13 Estimates in one group. Mean body weight of a population? [measurement] Case-fatality rate from bird flu (% of cases that die)? [proportion] Measurements Frequencies Are the groups different? How precise is measure of association? Are the groups different? How precise is measure of association? Increased risk of wound infection? [relative risk] Comparing two or more groups Do smokers weigh less than non-smokers? [measurement difference] 1 2+

14 Estimates in a Single Group: How Precise?

15 Are the groups different? How precise is measure of association? Are the groups different? How precise is measure of association? Estimates in one group. Mean body weight of a population? [measurement] Case-fatality rate from bird flu (% of cases that die)? [proportion] Increased risk of wound infection? [relative risk] Comparing two or more groups Do smokers weigh less than non-smokers? [measurement difference] Measurements Frequencies 1 2+

16 100 120 140 160 180 200 220 240 260 280 300 320 True mean Estimated Mean = 205

17 100 120 140 160 180 200 220 240 260 280 300 320 True mean Estimated Mean = 140

18 100 120 140 160 180 200 220 240 260 280 300 320 True mean Estimated Mean = 265

19 Body Weight Total population Number of people with each body weight (True Mean) Distribution of Body Weights in a Population 100 120 140 160 180 200 220 240 260 280 300 320 Estimates may vary from sample to sample. A sample provides an estimate of population mean weight. The estimates will be more precise with larger samples, i.e., more consistent, and they are more likely to be close to the true mean.

20 Confidence Interval for a Mean 184 202 220 The range within which the true value lies with a specified degree of confidence. The estimated mean was 202. With 95% confidence, the true mean lies in the range of 184-220.

21 Are the groups different? How precise is measure of association? Are the groups different? How precise is measure of association? Estimates in one group. Mean body weight of a population? [measurement] Case-fatality rate from bird flu (% of cases that die)? [proportion] Increased risk of wound infection? [relative risk] Comparing two or more groups Do smokers weigh less than non-smokers? [measurement difference] Measurements Frequencies 1 2+

22 Humans Who Got Bird Flu No comparisons; just a single estimate in a group of cases. What type of measure of disease frequency are we estimating? What is the case-fatality rate?

23 See Epi_Tools.XLS : Worksheet “CI – One Group” Interpretation: The estimated case-fatality rate was 50% (4/8). With 95% confidence the true CFR lies in the range 21.5-78.5%.

24 95% Confidence Interval 95% Confidence Interval For A Proportion It is very unlikely (less than 5% probability) that the true value is outside this range. 0.780.22 4/8 = 0.5 = point estimate 0.01.0 The range within which the true value lies, with 95% confidence.

25 Suppose it had been 32 deaths out of 64 cases, i.e., same proportion but larger sample? The confidence interval would be….? 1.Narrower 2.Wider 3.Same width 4.Can’t say

26 Precision Depends on Sample Size … the narrower the confidence interval (and the more precise the estimate). N N N N N The greater the sample size…

27 Weymouth, MA conducted an extensive health survey several years ago. One question asked, “Have you experienced violence because you or someone else was drinking alcohol?” Grade# who said yes# respondents%95% confidence interval 129227034.1(28-40%) What type of measure of disease frequency are we estimating?

28 Hypothesis Testing: Comparing Two or More Groups

29 Are the groups different? How precise is measure of association? Are the groups different? How precise is measure of association? Estimates in one group. Mean body weight of a population? [measurement] Case-fatality rate from bird flu (% of cases that die)? [proportion] Increased risk of wound infection? [relative risk] Comparing two or more groups Do smokers weigh less than non-smokers? [measurement difference] Measurements Frequencies 1 2+

30 Null Hypothesis (H 0 ): the groups are not different H 0 : Mean Wgt. smokers = Mean Wgt. Non-smokers Alternative Hypothesis (H A ): the groups are different H A : Mean Wgt. Smokers = Mean Wgt. Non-smokers

31 Null Hypothesis: Smokers & Non-Smokers Do Not Differ in Weight on Average Random sample of non-smokers Random sample of smokers Random Samples True population distribution Note that, here, smokers & non-smokers have the same distribution of body weights and the mean weights are the same. The sample means may differ just by chance (random error). 100 120 140 160 180 200 220 240 260 280 300 320

32 Population of smokers True means Alternative Hypothesis: They Differ Population of non=smokers sample of non-smokers sample of smokers 100 120 140 160 180 200 220 240 260 280 300 320 Even if they are really different, there may be some overlap.

33 Statistical Testing If the null hypothesis were true, what would be the probability of getting a difference this great (or greater) just by chance (random error)? How do we decide if they are really different or the differences were just due to chance? 190 lbs.205 lbs. Smokers Non-smokers

34 P-values are generated by performing a statistical test. The particular statistical test used depends on study type, type of measurement, etc. “p” is for probability. P– Values Definition of a p-value: The probability of seeing a difference this big or bigger, if the groups were not different, i.e. just by chance. 190 lbs.205 lbs. Smokers Non-smokers

35 p-value Probability of occurring by chance.55 55%.25 25%.12 12%.05 5%.01 1 in 100 (1% ) P-Values Chance is an unlikely explanation The differences may be due to chance. p < or =.05 Range between 0 and 1. Small p-values (< 0.05) mean a low probability that the observed difference occurred just by chance (so they’re likely to be different).

36 If “p” < 0.05 it is unlikely that the difference is due to chance. If so, we reject the null hypothesis in favor of the alternative hypothesis, i.e. the groups probably are different.  This criterion for significance (< 0.05) is arbitrary.  The p-value is a probability, not a certainty.  P-values are influenced by:  The magnitude of difference between groups.  The sample size Statistical Significance

37 If we wanted to compare two groups on measurement data (e.g. Is the mean weight of smokers less than that of non-smokers?), we might use a statistical test known as a “t-test” to determine whether there was a “statistically significant” difference. Epi_Tools.XLS Spreadsheet Describes T-Tests

38 Evaluating the Role of Chance With Categorical Data

39 Are the groups different? How precise is measure of association? Are the groups different? How precise is measure of association? Estimates in one group. Mean body weight of a population? [measurement] Case-fatality rate from bird flu (% of cases that die)? [proportion] Increased risk of wound infection? [relative risk] Comparing two or more groups Do smokers weigh less than non-smokers? [measurement] Measurements Frequencies 1 2+

40 Yes No Wound Infection 1 78 79 7 124 131Yes No 8 202 210 Subjects RR = 7/131 = 5.3 = 4.2 1/79 1.3 RR = 7/131 = 5.3 = 4.2 1/79 1.3 Cumulative Incidence 5.3% 1.3% (7/131) (1/79) Had Incidental Appendectomy Do incidental appendectomy? Is the risk of wound infection really different, or were the observed differences just due to “the luck of the draw”?

41 What would the null hypothesis be for the study on wound infections after incidental appendectomy?

42 0 1.0 10 Possible values of RR or OR Estimated RR = 4.2 The “null” value; i.e., no difference. How accurate is this estimate? How confident are we that the groups are different?

43 Null Hypothesis (H 0 ): the groups are not different H 0 : Incid. Infection exposed = Incid. infection not exposed or H 0 : RR = 1.0 (or OR = 1) or H 0 : RD = 0 or H 0 : Attrib. fraction = 0 Alternative Hypothesis (H A ): the groups are different H A : RR or OR = 1.0 or H A : RD = 0 or H A : Attrib. fraction = 0

44 Statistical Testing If the null hypothesis were true, what would be the probability of getting a difference this great (or greater) just by chance (random error)? How do we decide if they are really different or the differences were just due to chance? 1.3%5.3% No appy. Appendectomy

45 Observed Data: Infection No Inf. Appy No Appy 7124 178 131 (62%) 79 (38%) 8202210 total Expected if no association (Null Hypothesis) Infection No Inf. Appy No Appy 5 (62% of 8) 126 3 (38% of 8) 76 If the Null Hypothesis Were True, What Would You Have Expected?  How big a difference was there between what you expected under the null hypothesis and what was observed?  If the null hypothesis were true, what is the probability of seeming a difference this big just due to sampling variability (the luck of the draw)?

46 Observed Data: Infection No Inf. Appy No Appy 7124 178 131 (62%) 79 (38%) 8202210 total Expected if no association: Infection No Inf. Appy No Appy 5 (62% of 8) 126 3 (38% of 8) 76  2 =  (O-E) 2 E = 2.24 p = 0.13 If the null hypothesis were true, there would be a 13% probability of obtaining a difference this big (or bigger) just by chance (“luck of the draw”). Expected with null hypothesis If the Null Hypothesis Were True, What Would You Have Expected? (“p-value”)

47  2 =  (O-E) 2 E YesNo + - 30- 36- 699 1,399 662,098 total Death Person-Years of Observation Observed Data: Chi Squared Test with Person-Years of Observation

48 Observed Data: 66 699 (33%) 1,399 (67%) 2,098 total + - 30 36 - - YesNo Death  2 =  (O-E) 2 E Data Expected by Chance: + - YesNo Death - - 22 33% of 66 = 22 44 67% of 66 = 44  2 = 8 2 + 8 2 =2.91 + 1.45 = 4.36 22 44 (p < 0.05) Expected with null hypothesis Chi Squared Test with Person-Years of Observation

49 Expected with null hypothesis: Infection No Inf. Appy No Appy 5126 3 76  2 =  (O-E) 2 E = 2.24 p = 0.13 Fisher’s Exact Test (for small samples, i.e. if expected number in any cell is <5): Here, p = 0.26 with Fisher’s. Fisher’s requires an iterative calculation that is tough for a spreadsheet. See http://www.langsrud.com/fisher.htm for a 2x2 table that performs Fisher’s Exact Test.http://www.langsrud.com/fisher.htm Caution:  2 Test Exaggerates Significance with Small Samples

50 P = 0.26

51 The Limitations of p-Values 1.There is a tendency for p-values to devolve into a conclusion of "significant" or not based on the p-value. 2.If an effect is small and clinically unimportant, the p-value can be "significant" if the sample size is large. Conversely, an effect can be large, but fail to meet the p<0.05 criterion if the sample size is small. 3.When many possible associations are examined using a criterion of p< 0.05, the probability of finding at least one that meets the “critical point” increases in proportion to the number of associations that are tested. 4.The p-value does not represent the probability that the null hypothesis is true. The p-value is the probability that the data could deviate from the null hypothesis as much as they did or more. 5.Statistical significance does not take into account the evaluation of bias and confounding.

52 Yes No Wound Infection 1 78 79 7 124 131Yes No 8 202 210 Subjects RR = 7/131 = 5.3 = 4.2 1/79 1.3 RR = 7/131 = 5.3 = 4.2 1/79 1.3 Cumulative Incidence 5.3% 1.3% (7/131) (1/79) Had Incidental Appendectomy Do incidental appendectomy? How precise is my estimate of the risk ratio?

53 p + 1.96p(1 - p) n 95% CI = For A Proportion: Estimated proportion or frequency p.6.7.8 For RR or OR: RR exp(1 + 1.96 )  95% CI = RR or OR Estimated relative risk or odds ratio 0.6 4.2 28 95% CI for Risk Ratio or an Odds Ratio 95% CI for a Proportion (one group)

54 Protective effect No Difference Increased risk 0 1.0 10 Relative Risk The range in which the true RR or OR lies, with 95% confidence. Estimated RR The true RR probably isn’t 1. (probability <5%) 95% CI for a Risk Ratio or an Odds Ratio We’re 95% confident that the true value is in this range, so there is <5% probability that the true value is outside that range. Here, the null value is outside the interval, so there is <5% chance that the null is the true value. The null is excluded, so p<0.05.

55 Protective effect No Difference Increased risk 0 1.0 10 Relative Risk Estimated RR The true RR could be 1. 95% CI for RR or OR (continued) Here, the null value is inside the interval, so there is >5% chance that the null is the true value. The null could be true, so p>0.05.

56 (protective effect) No Difference (increased risk) 0 1.0 10 Relative Risk If the 95% CI includes the null value, then the findings are compatible with RR=1.0, so there is no significant difference). But, if the 95% CI does NOT include the null value, RR=1.0 is unlikely (p<0.05). The 95% confidence interval for relative risk gives the range of RR values that are compatible with your sample.

57 The 95% Confidence Interval for a RR or OR Tells You: Significant protective effect (p<.05) Significant increase in risk (p<.05) No significant difference (p>.05) 1)Whether the groups being compared differ significantly (p<.05) 2) How precise was the estimate of relative risk. RR= 1.0 means no difference, i.e. no association. RR= 1.0 means no difference, i.e. no association.

58 0 1.0 10 Relative Risk A 95% Confidence Interval Puts Non-significant Results in Perspective Both of these results are non- significant. Do you view them differently?

59

60 It does mean: The estimate of RR or OR will be more precise. If there really is a difference between groups, you will have a better chance of detecting it (better statistical power). Having a larger sample size does not ensure that you will get a “statistically significant” difference.

61 Yes No Wound Infection 1 78 79 7 124 131Yes No 8 202 210 Subjects RR = 7/131 = 5.3 = 4.2 1/79 1.3 RR = 7/131 = 5.3 = 4.2 1/79 1.3 Cumulative Incidence 5.3% 1.3% (7/131) (1/79) Had Incidental Appendectomy Do incidental appendectomy? How precise was the RR? Use the Cohort Study sheet in “EpiTools” to calculate  2, the p-value, and the 95% confidence interval.EpiTools

62 (protective effect) (increased risk) Relative Risk 95% Confidence Interval for Estimated Risk Ratio in the Appendectomy Study Estimated RR= 4.2 95% CI:.53 – 36.5 1.0 10

63 A tour of Epi_Tools.XLS

64 Sample Size Calculations It’s possible that incidental appendectomy really does carry an increased risk of wound infection, but the sample size was too small to demonstrate a significant difference. If the true difference were really 5.3% vs. 1.3% how many subjects would it have taken to demonstrate a significant difference (with 90% power)? What if the true difference were really 10% vs. 1.3% how many subjects would it have taken to demonstrate a significant difference (with 90% power)? You can use the Sample Size sheet (Part II) in “Epi_Tools.XLS” to calculate how many subjects would have been needed, assuming equal numbers in each group.

65 Among 210 people who developed bird flu to date 122 died (58% case-fatality rate). How frequently is bird flu fatal? How precise are these estimates? Among the first 10 people who developed bird flu 2 died ( case-fatality rate= 20%).

66 A cohort study examined the association between DES and breast cancer & found RR=1.4 (p=0.006). 1.40% of women on DES got breast cancer but this was not ‘statistically significant.’ 2. 40% of women on DES got breast cancer & this was ‘statistically significant.’ 3.Women on DES had 1.4 times the risk of getting breast cancer, but this was not statistically significant because the p-value was less than 0.05. 4.Women on DES had 1.4 times the risk of developing breast cancer, and this was statistically significant because the p-value was less than 0.05.

67 An association between DES exposure and breast cancer? Results: RR=1.4 p-value = 0.006  The RR suggests an increased risk with DES.  The small p-value indicates it is unlikely that the differences were due to chance.  Therefore, difference is statistically significant. If “p” < or = 0.05 it is unlikely that the difference is due to chance.

68 A RCT compared % pain relief from glucosame + chondroitin vs. placebo in osteoarthritis patients. RR=1.1 (p=0.10). 1.Those on G+C were 10% more likely to report pain relief, but it was not ‘statistically significant.’ 2.Those on G+C were 10% more likely to report pain relief, and it was ‘statistically significant.’ 3.Those on G+C had 1.1 times the risk of pain, but it was not statistically significant. 4.Those on G+C had 1.1 times the risk of pain, & it was statistically significant.

69 Are Glucosamine + Chondroitin Associated with Pain Relief? Results: RR=1.1 p-value = 0.10  The RR suggests a small increase in pain relief.  But the p-value indicates a 10% probability that the difference was just due to sampling error.  Therefore, NOT statistically significant. If “p” < or = 0.05 it is unlikely that the difference is due to chance.

70 How would you interpret these findings? Suppose you did a larger study and found a RR = 3.0 and a 95% confidence interval from 2.0- 4.0. How would you interpret these findings? A study compared 35 diabetics to a comparable group of 35 non-diabetics and found that the diabetics had a higher incidence of femoral arterial occlusion. The relative risk was 7.0, and the 95% CI was 1.2 to 14.3. A study compared 35 diabetics to a comparable group of 35 non-diabetics and found that the diabetics had a higher incidence of femoral arterial occlusion. The relative risk was 7.0, and the 95% CI was 1.2 to 14.3.

71 A study compared 35 diabetics to a comparable group of 35 non-diabetics and found that the diabetics had a higher incidence of femoral arterial occlusion. The relative risk was 7.0, and the 95% CI was 1.2 to 14.3. 1.Not statistically significant since the RR was inside the conf. int. 2.Not statistically significant since the null was outside the conf. int. 3.Was statistically significant since the RR was inside the conf. int. 4.Was statistically significant since the null was outside the conf. int.

72 A) The association is statistically significant because p <0.05. B) The association is statistically significant because the 95% confidence interval does not contain the “null value.” C) A 95% confidence interval for an OR is the range within which the true OR lies with 95% confidence. D) All of the above. For Lenny's bakery: OR=3.46 P= 0.0033; 95% CI = (1.49-8.06)

73 For Lenny's bakery: OR=3.46, and P= 0.0033; 95% CI = (1.49-8.06) A) The association is statistically significant because p <0.05. B) The association is not statistically significant because p <0.05 & the null is excluded from the 95% confidence interval. C) The association is not statistically significant because the 95% confidence interval contains the estimated OR.

74

75 Organizing Data

76 26. Please list the names and locations of the restaurants, delis, or cafeterias where you [your child] ate lettuce during those 7 days. Name: _________________ Location: _____________ Date: _______ 27. Did you [your child] eat sprouts, such as alfalfa or bean sprouts, in a restaurant, deli or cafeteria? These may have been eaten as part of a salad or as part of other food items such as sandwiches, tacos, etc. Yes NoDon't know If yes: Did you [your child] eat alfalfa sprouts? (These are the thread-like sprouts with green tops the size of the head of a pin.) Yes NoDon't know 28. Please list the names of the restaurants, delis, cafeterias, etc. where you [your child] ate alfalfa sprouts in those 7 days? Name: _________________ Location: _____________ Date: _______ Data Collection

77 Creating a Database Excel provides a simple, user-friendly platform that many people are familiar with.  Also facilitates analysis »Sorting »Counts, averages –Statistical tests (Chi square, t-tests, etc.) Example: The Excel file “Hepatitis Outbreak” illustrates the use of Excel to record & store data from an outbreak investigation.Hepatitis Outbreak

78 The observed estimate is true (unbiased & precise). The observed estimate was misleading due to:  Chance or sampling variability  Bias  Confounding Internal Validity Causality External Validity The findings can be generalized. A judgment based on a body of evidence.

79 Funnel Plot


Download ppt "Interactive learning modules and videos of abridged lectures can be accessed by smart phone as well."

Similar presentations


Ads by Google