The binomial applied: absolute and relative risks, chi-square.

Presentation on theme: "The binomial applied: absolute and relative risks, chi-square."— Presentation transcript:

1 The binomial applied: absolute and relative risks, chi-square

2 Probability speak (just shorthand!)…
P(X) = “the probability of event X”
P(D) = “the probability of disease”
P(E) = “the probability of exposure”
P(~D) = “the probability of not getting the disease”
P(~E) = “the probability of not being exposed”
P(D/E) = “the probability of disease given exposure” or “the probability of disease among the exposed”
P(D/~E) = “the probability of disease given unexposed” or “the probability of disease among the unexposed”

3 Things that follow a binomial distribution…
Cohort study (or cross-sectional):
- The number of exposed individuals in your sample that develop the disease
- The number of unexposed individuals in your sample that develop the disease
Case-control study:
- The number of cases that have had the exposure
- The number of controls that have had the exposure

4 Cohort study example: You sample 100 smokers and 100 non-smokers and follow them for 5 years to see who develops heart disease. Let’s say the “true” risk of developing heart disease in 5 years is 20% for smokers and 10% for non-smokers. In probability symbols: P(D/E)=.20, P(D/~E)=.10

5 Seeing it as a binomial… The number of smokers that develop heart disease in your study follows a binomial distribution with N=100, p=.20 The number of non-smokers that develop heart disease in your study follows a binomial distribution with N=100, p=.10
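To make the binomial view concrete, here is a minimal sketch that evaluates binomial probabilities with the Python standard library. The function name `binom_pmf` is our own and not from the slides; N=100 and p=.20 are the values given for smokers.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) when X follows a binomial distribution with n trials and risk p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Chance that exactly 21 of the 100 smokers develop heart disease if the true risk is .20
p_21_smokers = binom_pmf(21, 100, 0.20)

# Sanity check: the probabilities of all possible counts (0..100) add up to 1
total = sum(binom_pmf(k, 100, 0.20) for k in range(101))

print(p_21_smokers, total)
```

Swapping in p=.10 gives the corresponding distribution for the non-smokers.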

6 One possible outcome:
                     Smoker (E)   Non-smoker (~E)
Heart disease (D)        22              9
No Disease (~D)          78             91
Total                   100            100

7 Another possible outcome:
                     Smoker (E)   Non-smoker (~E)
Heart disease (D)        17             15
No Disease (~D)          83             85
Total                   100            100

8 Another possible outcome:
                     Smoker (E)   Non-smoker (~E)
Heart disease (D)        21             13
No Disease (~D)          79             87
Total                   100            100
Let’s say these are the data we found!

9 Statistics for these data
1. Risk ratio (relative risk)
2. Difference in proportions (absolute risk)
3. Chi-square test of independence (mathematically equivalent to the difference in proportions test for 2x2 tables)

10 1. Risk ratio (relative risk)
                     Exposure (E)   No Exposure (~E)
Disease (D)               a                b
No Disease (~D)           c                d
Total                    a+c              b+d
RR = (risk to the exposed) / (risk to the unexposed) = [a/(a+c)] / [b/(b+d)]

11 In probability terms…
                     Exposure (E)   No Exposure (~E)
Disease (D)               a                b
No Disease (~D)           c                d
Total                    a+c              b+d
RR = (risk of disease in the exposed) / (risk of disease in the unexposed) = P(D/E) / P(D/~E)

12 Risk ratio calculation:
                     Smoker (E)   Non-smoker (~E)
Heart disease (D)        21             13
No Disease (~D)          79             87
Total                   100            100
RR = (21/100) / (13/100) = 1.62

13 Inferences about risk ratio… Is our observed risk ratio statistically different from 1.0? What is the p-value? I’m going to present statistical inference for odds ratio; risk ratio is similar. So, for now, just get answer from SAS: Confidence interval: 0.86 to 3.04 P-value>.05
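The slide takes the interval straight from SAS. As a hedged sketch of one common way to get such an interval, the code below uses the log-based standard error of ln(RR); whether SAS used exactly this method is an assumption on our part, and the function name `risk_ratio_ci` is hypothetical.

```python
from math import exp, log, sqrt

def risk_ratio_ci(a, b, c, d, z=1.96):
    """Risk ratio with an approximate 95% CI for a 2x2 table where
    a, b = diseased (exposed, unexposed) and c, d = not diseased (exposed, unexposed)."""
    risk_exposed = a / (a + c)
    risk_unexposed = b / (b + d)
    rr = risk_exposed / risk_unexposed
    # Standard error of ln(RR); log-based method (assumed, not confirmed by the slides)
    se_log_rr = sqrt(1 / a - 1 / (a + c) + 1 / b - 1 / (b + d))
    lower = exp(log(rr) - z * se_log_rr)
    upper = exp(log(rr) + z * se_log_rr)
    return rr, lower, upper

# Smoking and heart disease data: 21/100 smokers vs. 13/100 non-smokers
rr, lower, upper = risk_ratio_ci(21, 13, 79, 87)
print(round(rr, 2), round(lower, 2), round(upper, 2))
```

For these data the result should land close to the interval quoted on the slide; because the interval includes 1.0, the risk ratio is not statistically different from 1.0 at the .05 level.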

14 2. Difference in proportions
                     Exposure (E)   No Exposure (~E)
Disease (D)               a                b
No Disease (~D)           c                d
Total                    a+c              b+d
Difference in proportions = a/(a+c) - b/(b+d)

15 2. Difference in proportions
                     Smoker (E)   Non-smoker (~E)
Heart disease (D)        21             13
No Disease (~D)          79             87
Total                   100            100
21/100 - 13/100 = 8%
Absolute, rather than relative risk difference!

16 Is this statistically significant? This 8% difference could reflect a true association or it could be a fluke in this particular sample. The question: is 8% bigger or smaller than the expected sampling variability?

17 Difference in proportions test
Null hypothesis: The difference in proportions is 0.
Formula for standard error follows directly from binomial. Recall, variance of a proportion is p(1-p)/n.
Use average proportion in standard error formula, because under the null hypothesis we assume groups have same proportion.
Follows a normal because…

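18 What is the standard error under the null hypothesis?
Average proportion = (21+13)/200 = .17
Standard error = sqrt(.17*.83/100 + .17*.83/100) = .053
Z = .08/.053 = 1.51
Corresponding two-sided p-value is .131.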
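The calculation on this slide can be sketched in a few lines of Python. This is a minimal illustration using the pooled (average) proportion described on slide 17; the function name `two_proportion_z_test` is our own.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Z-test for a difference in proportions, using the pooled proportion under the null."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # average proportion across both groups
    # Binomial variance of each group's proportion, evaluated at the pooled value
    se_null = sqrt(pooled * (1 - pooled) / n1 + pooled * (1 - pooled) / n2)
    z = (p1 - p2) / se_null
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided
    return p1 - p2, se_null, z, p_value

# 21 of 100 smokers vs. 13 of 100 non-smokers
diff, se_null, z, p_value = two_proportion_z_test(21, 100, 13, 100)
print(diff, se_null, z, p_value)
```

The p-value should come out close to the .131 quoted on the slide; small differences in the third decimal come from rounding in intermediate steps.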

19 Corresponding confidence interval…

20 OR, use computer simulation to get the standard error…
1. In SAS, assume infinite population of smokers and non-smokers with equal disease risk (UNDER THE NULL!)
2. Use the random binomial function to randomly select 100 smokers and 100 non-smokers
3. Calculate the observed difference in proportions.
4. Repeat this 1000 times.
5. Observe the distribution of differences under the null hypothesis.
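The slides run this simulation in SAS; the steps above can be sketched in Python instead. The shared null risk of .17 (the pooled proportion from the observed data), the seed, and the names `simulate_null`, `se_sim`, and `p_sim` are all our assumptions; the slides only say that both groups have equal risk.

```python
import random
from statistics import stdev

def simulate_null(n_per_group=100, null_risk=0.17, n_studies=1000, seed=1):
    """Simulate studies under the null: both groups share the same disease risk.
    Returns, for each study, (cases among smokers) - (cases among non-smokers)."""
    rng = random.Random(seed)

    def binomial_draw():
        # Number of people out of n_per_group who develop disease
        return sum(rng.random() < null_risk for _ in range(n_per_group))

    return [binomial_draw() - binomial_draw() for _ in range(n_studies)]

count_diffs = simulate_null()

# Standard error of the difference in proportions = SD of the simulated differences
se_sim = stdev(count_diffs) / 100

# Two-sided simulated p-value: share of studies at least as extreme as 8 percentage points
p_sim = sum(abs(d) >= 8 for d in count_diffs) / len(count_diffs)

print(se_sim, p_sim)
```

Because the draws are random, the exact figures vary with the seed, but the standard error should sit near the value from the formula. Comparing integer counts (`abs(d) >= 8`) rather than proportions avoids floating-point ties at exactly 8%.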

21 Computer Simulation Results Standard error is about 5.3%

22 Difference in proportion test We observed a difference of 8% between smokers and non-smokers.

23 Hypothesis Testing Step 4: Calculate a p-value

24 P-value from our simulation… When we ran this study 1000 times, we got 72 results as big or bigger than 8%. We also got 82 results as small or smaller than –8%.

25 P-value from our simulation…

26 P-value From our simulation, we estimate the p-value to be: 154/1000 or .154

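27 Reject the null? With a p-value of about .15, which is above .05, we fail to reject the null here. Alternative hypothesis (not supported at the .05 level): There is an association between smoking and heart disease.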

28 Finally, chi-square
                     Smoker (E)   Non-smoker (~E)
Heart disease (D)        21             13
No Disease (~D)          79             87
Total                   100            100
Null hypothesis: smoking and heart disease are independent

29 Finally, chi-square
                     Smoker (E)   Non-smoker (~E)
Heart disease (D)        21             13
No Disease (~D)          79             87
Total                   100            100
Under independence, P(A&B)=P(A)*P(B)
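The independence rule gives the expected count in each cell as (row total × column total) / N. The sketch below applies it to the smoking data and sums (observed − expected)² / expected over the four cells. The function name `chi_square_2x2` is hypothetical, and the p-value line relies on the 1-degree-of-freedom case only.

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Pearson chi-square for a 2x2 table given as [[a, b], [c, d]].
    Returns (expected counts, chi-square statistic, p-value with 1 df)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected, chi2 = [], 0.0
    for i, row in enumerate(table):
        expected_row = []
        for j, observed in enumerate(row):
            # Under independence: P(row & column) = P(row) * P(column)
            e = row_totals[i] * col_totals[j] / n
            expected_row.append(e)
            chi2 += (observed - e) ** 2 / e
        expected.append(expected_row)
    # With 1 df, chi-square is a squared standard normal: P(Z^2 > x) = erfc(sqrt(x / 2))
    p_value = erfc(sqrt(chi2 / 2))
    return expected, chi2, p_value

# Rows: heart disease / no disease; columns: smoker / non-smoker
expected, chi2, p_value = chi_square_2x2([[21, 13], [79, 87]])
print(expected, chi2, sqrt(chi2), p_value)
```

For a 2x2 table the square root of the chi-square statistic equals the Z statistic from the pooled difference in proportions test, which is the equivalence noted on slide 9.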

30 Case-control example… Sample

31 Statistics for these data 1. Odds ratio (relative risk) 2. Difference in proportions exposed (absolute risk) 3. Chi-square

32 Odds vs. Risk
If the risk is…      Then the odds are…
½ (50%)              1:1
¾ (75%)              3:1
1/10 (10%)           1:9
1/100 (1%)           1:99
Note: An odds is always higher than its corresponding probability, unless the probability is 0 (where both are 0). At a probability of 100% the odds are undefined.
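The conversion behind this table is odds = risk / (1 − risk). A small sketch with exact fractions reproduces the four rows; the helper name `risk_to_odds` is our own.

```python
from fractions import Fraction

def risk_to_odds(risk):
    """Convert a probability (0 <= risk < 1) to odds = risk / (1 - risk)."""
    return risk / (1 - risk)

for risk in (Fraction(1, 2), Fraction(3, 4), Fraction(1, 10), Fraction(1, 100)):
    odds = risk_to_odds(risk)
    # Fractions are kept in lowest terms, so numerator:denominator reads as "x:y" odds
    print(f"risk {risk} -> odds {odds.numerator}:{odds.denominator}")
```

When the risk is small, the odds and the risk are nearly equal (1:99 versus 1/100), which is why the odds ratio approximates the risk ratio for rare diseases.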

33 The Odds Ratio (OR)
The proportions of cases and controls are set by the investigator; therefore, they do not represent the risk (probability) of developing disease.
                     Exposure (E)   No Exposure (~E)
Disease (D)               a                b            a+b = cases
No Disease (~D)           c                d            c+d = controls
OR = (odds of exposure in the cases) / (odds of exposure in the controls) = (a/b) / (c/d) = ad/bc

34 Inferences about the odds ratio…

35 Simulation…

36 Properties of the OR (simulation) (50 cases/50 controls/20% exposed) If the Odds Ratio=1.0 then with 50 cases and 50 controls, of whom 20% are exposed, this is the expected variability of the sample OR. Note the right skew.

37 Properties of the lnOR Standard deviation = sqrt(1/a + 1/b + 1/c + 1/d)

38 Practice problem

39 Do observed and expected differ more than expected due to chance?

40 Chi-Square test Degrees of freedom = (rows-1)*(columns-1)=(2-1)*(5-1)=4

41 The Chi-Square distribution is the sum of squared standard normal deviates. The expected value and variance of a chi-square: E(x)=df, Var(x)=2(df)
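The mean and variance stated on this slide can be checked by simulation. The sketch below builds chi-square draws by summing squared standard normal draws; the function name, the seed, and the number of draws are arbitrary choices of ours.

```python
import random
from statistics import mean, variance

def simulate_chi_square(df, n_draws=20000, seed=1):
    """Each draw is the sum of df squared standard normal deviates."""
    rng = random.Random(seed)
    return [sum(rng.gauss(0, 1) ** 2 for _ in range(df)) for _ in range(n_draws)]

draws = simulate_chi_square(df=4)
sim_mean = mean(draws)      # should be near df = 4
sim_var = variance(draws)   # should be near 2 * df = 8
print(sim_mean, sim_var)
```

The simulated values only approximate df and 2(df); they tighten as `n_draws` grows.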

42 Chi-Square test Degrees of freedom = (rows-1)*(columns-1)=(2-1)*(5-1)=4 Rule of thumb: if the chi-square statistic is much greater than its degrees of freedom, this indicates statistical significance. Here 85>>4.

43 Chi-square example:
                          Brain tumor   No brain tumor   Total
Own a cell phone               5             347          352
Don’t own a cell phone         3              88           91
Total                          8             435          443
Cell size of 3 tells us we should opt for Fisher’s exact result in SAS. But doesn’t turn out very different in this case.

44 Same data, but use Chi-square test
                          Brain tumor   No brain tumor   Total
Own                            5             347          352
Don’t own                      3              88           91
Total                          8             435          443
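The slides get both results from SAS. As an illustrative sketch, the code below computes the Pearson chi-square with the 2x2 shortcut formula and a two-sided Fisher's exact p-value from the hypergeometric distribution. The function names are hypothetical, and the two-sided rule used here (sum the probabilities of all tables no more likely than the observed one) is one common convention that we assume, not something the slides specify.

```python
from math import comb

def pearson_chi_square_2x2(a, b, c, d):
    """Shortcut formula for a 2x2 table [[a, b], [c, d]]:
    n * (ad - bc)^2 / (product of the four margin totals)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for [[a, b], [c, d]] with margins held fixed."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    p_observed = prob(a)
    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties
    return sum(prob(k) for k in range(k_min, k_max + 1)
               if prob(k) <= p_observed * (1 + 1e-9))

# Rows: own / don't own a cell phone; columns: brain tumor / no brain tumor
chi2 = pearson_chi_square_2x2(5, 347, 3, 88)
fisher_p = fisher_exact_two_sided(5, 347, 3, 88)
print(chi2, fisher_p)
```

With 1 degree of freedom, a chi-square statistic this close to 1 is far from significant, and the exact test leads to the same conclusion.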

45 Same data, but use Odds Ratio
                          Brain tumor   No brain tumor   Total
Own a cell phone               5             347          352
Don’t own a cell phone         3              88           91
Total                          8             435          443
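A minimal sketch of the odds ratio for these data follows. The confidence interval uses the usual large-sample standard error of ln(OR), sqrt(1/a + 1/b + 1/c + 1/d); treating that as the method behind the slide is our assumption, and with a cell count of 3 the approximation is rough. The function name `odds_ratio_ci` is hypothetical.

```python
from math import exp, log, sqrt

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio (cross-product ad/bc) with an approximate 95% CI for [[a, b], [c, d]]."""
    odds_ratio = (a * d) / (b * c)
    # Large-sample standard error of ln(OR); assumed method, rough for small cells
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Rows: own / don't own a cell phone; columns: brain tumor / no brain tumor
odds_ratio, lower, upper = odds_ratio_ci(5, 347, 3, 88)
print(odds_ratio, lower, upper)
```

The interval is wide and includes 1.0, consistent with the non-significant chi-square and Fisher's exact results for the same table.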

