Basics of Biostatistics for Health Research Session 3 – February 21, 2013 Dr. Scott Patten, Professor of Epidemiology Department of Community Health Sciences.


1 Basics of Biostatistics for Health Research Session 3 – February 21, 2013 Dr. Scott Patten, Professor of Epidemiology Department of Community Health Sciences & Department of Psychiatry patten@ucalgary.ca

2 Some General Principles of Data Analysis
–Data cleaning and checking are always the first step after data entry.
–Start with “univariate” analysis (frequencies and their CIs).
–Progress to “bivariate” analysis: does a “dependent” variable differ depending on an “independent” variable?
–The next stage is “multivariate” analysis.

3 Statistical Errors

5 Go to www.ucalgary.ca/~patten and scroll to the bottom. Right click to download the files described as being “for PGME Students”:
–One is a dataset
–One is a data dictionary
Save them on your desktop.

6 Open the Datafile

7 Comparing Proportions We’ve looked at two procedures (e.g. for obesity in men vs. women):
    generate obese = bmi
    recode obese 0/30=0 30.001/1000=1
    prtest obese, by(sex)

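8 Generate Commands Using Logic
    generate obese2 = .
    recode obese2 .=0 if bmi <= 30
    recode obese2 .=1 if bmi > 30 & !missing(bmi)
    tab obese obese2
    prtest obese2, by(sex)
(Stata treats a missing bmi as larger than any number, so the second recode excludes missing values explicitly.)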

9 Generate as a Recode Subcommand
    recode bmi (0/30=0) (30.01/1000=1), gen(obese3)
    tab obese obese3

10 Alternative to prtest You can use tab with the subcommand “exact”:
    tab obese sex, exact

11 Epitab Commands

12 Risk Ratios RR = (“risk” in the “exposed”) / (“risk” in the “non-exposed”)

13 Odds Ratios OR = (odds in the “exposed”) / (odds in the “non-exposed”)

14 Measures of Association The most common ones are ratios:
–RR
–OR
–PR
–IR
You’ll sometimes see differences as well:
–Risk Difference
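The ratio and difference measures above can be worked by hand from a 2x2 table. The sketch below is only an illustration: the function name, the cell labels a, b, c, d, and the counts are assumptions made for this example and do not come from the course dataset.

```python
def association_measures(a, b, c, d):
    """Measures of association from a 2x2 table (hypothetical cell labels).

    a = exposed with the outcome,   b = exposed without the outcome
    c = unexposed with the outcome, d = unexposed without the outcome
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return {
        "risk_ratio": risk_exposed / risk_unexposed,
        "risk_difference": risk_exposed - risk_unexposed,
        "odds_ratio": (a / b) / (c / d),
    }


# Made-up counts for illustration only.
measures = association_measures(a=30, b=70, c=10, d=90)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the risks are 0.30 and 0.10, so the risk ratio is 3 and the risk difference is 0.20, while the odds ratio is larger (about 3.86) because odds exceed risks when the outcome is common.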

15 Another Alternative… The “cs” command will give you risk ratios or risk differences. Stata documents it for “cohort study” data, but it also works with cross-sectional data like ours. However, it requires 0 and 1 values.
    recode sex (1=0) (2=1), gen(female)
    cs obese female, exact

16 Odds and Proportions In our sample, there are…
–1,560 obese
–10,015 non-obese
–(and 52 missing)
The frequency of obesity (prevalence) is 1,560/(1,560 + 10,015).
The odds are 1,560/10,015.

17 Odds and Proportions In other words…
–If ‘a’ means “has the disease” and ‘b’ means “does not have the disease” then…
–Proportion = a / (a + b)
–Odds = a / b
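The arithmetic on these two slides can be checked directly. This short sketch uses the obesity counts quoted on slide 16; the variable names are chosen here for illustration. It also shows the standard conversions between a proportion and its odds.

```python
obese = 1560        # 'a': have the condition (count from slide 16)
non_obese = 10015   # 'b': do not have the condition (count from slide 16)

prevalence = obese / (obese + non_obese)   # a / (a + b)
odds = obese / non_obese                   # a / b

# Converting between the two: odds = p / (1 - p) and p = odds / (1 + odds)
odds_from_prevalence = prevalence / (1 - prevalence)
prevalence_from_odds = odds / (1 + odds)

print(f"prevalence = {prevalence:.4f}")
print(f"odds       = {odds:.4f}")
```

The prevalence comes out at roughly 0.135 and the odds at roughly 0.156. The two are close here because obesity is fairly uncommon in the sample, and they drift apart as the proportion grows.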

18 Another Alternative… The “cc” command is for “case-control” and will give you odds ratios. However, it requires 0 and 1 values.
    cc obese female, exact

19 A Task for You…
–What is the prevalence of diabetes? (Provide a 95% confidence interval for your estimate.)
–What is the prevalence of diabetes in men and in women? (Hint: use “by” in the dialogue box.)
–What is the odds ratio for the association of diabetes and obesity?
–What is the risk ratio for the association of diabetes and obesity?
–Is the association statistically significant?

20 A More Complex Problem… Obesity is said to be more prevalent among people with lower levels of education.

21 Two-way Tables

22 A Two-way Table

23 Bar Graphs The dialogue box is under the Graphics menu.

24 Select Categories…

26 Histograms, with “by” The pattern of obesity by education is different from that of mean BMI. Your Task: use the “by” subcommand with the histogram command to look at the distribution of BMI by education.

27 Does BMI Differ by Education? If we had two groups we’d use a t-test. Our null would be Mean(1) = Mean(2), or as Stata says: Mean(1) – Mean(2) = 0. But we have > 2 groups, so we could try to use ANOVA.
–Can think of this test as an extension of the two-group t-test
–Assumes normal distribution and equal variances (like the t-test, it is “parametric”)
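To make the “extension of the t-test” idea concrete, here is a minimal sketch of the one-way ANOVA F statistic computed by hand. The function name and the three small groups are made up for illustration; they are not BMI values from the course dataset, and the sketch does not compute a p-value.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA: between-group mean square
    divided by within-group mean square."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean
    ss_within = sum(
        (x - mean) ** 2
        for group, mean in zip(groups, group_means)
        for x in group
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical data: three groups of three observations each.
f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"F = {f_stat:.2f}")
```

When there are only two groups, this F statistic equals the square of the equal-variance t statistic, which is why ANOVA can be read as a generalisation of the t-test.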

28 One-Way ANOVA

29 STATA Warns of a Problem

30 The Kruskal-Wallis Test

31 Kruskal-Wallis Output

32 Non-Parametric Tests Kruskal-Wallis and its 2-sample version (Wilcoxon Rank Sum Test) require that…
–The variable can be meaningfully ordered, and
–Has a roughly/loosely bell-shaped frequency distribution (should have a central tendency)
Your task: Repeat our analysis from last week in which we compared BMI in men and women, but use Kruskal-Wallis and Wilcoxon’s Rank Sum test.
–Do you get equivalent results?

33 Comparing Proportions?
–Yes: Fisher’s Exact Test
–No: Parametric Assumptions?
    –Yes: Multiple Groups? Yes: ANOVA. No: t-test.
    –No: Multiple Groups? Yes: Kruskal-Wallis. No: Wilcoxon’s Rank Sum.
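The decision tree on this slide can be written as a small function. This is one reading of the flowchart, whose layout did not survive in the transcript; the function and argument names are hypothetical.

```python
def choose_test(comparing_proportions, parametric_ok=False, multiple_groups=False):
    """Pick a test following the slide's decision tree (as read here)."""
    if comparing_proportions:
        return "Fisher's exact test"
    if parametric_ok:
        # Parametric assumptions hold: normality and equal variances
        return "ANOVA" if multiple_groups else "t-test"
    # Parametric assumptions are doubtful: use the rank-based tests
    return "Kruskal-Wallis" if multiple_groups else "Wilcoxon rank-sum"


# BMI across several education groups, parametric assumptions in doubt:
print(choose_test(comparing_proportions=False, parametric_ok=False, multiple_groups=True))
```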

34 Prevalence of Diabetes Your Task: try this command:
    cii 11627 530, exact
(Does your estimate resemble what you get with ci diabetes, exact?)
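For readers curious about what an “exact” binomial interval involves, the sketch below computes a Clopper-Pearson interval for 530 events in 11,627 observations (the numbers in the command above) using only the Python standard library. It assumes that “exact” refers to the Clopper-Pearson method; the helper names are invented for this illustration, and the bisection approach is a simple stand-in rather than how Stata computes it.

```python
import math


def log_binom_pmf(k, n, p):
    """Log of P(X = k) for X ~ Binomial(n, p), with 0 < p < 1."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with 0 < p < 1."""
    if k < 0:
        return 0.0
    return min(1.0, sum(math.exp(log_binom_pmf(i, n, p)) for i in range(k + 1)))


def bisect_increasing(func, lo, hi, iterations=60):
    """Approximate root of an increasing function on (lo, hi)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if func(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k, n, level=0.95):
    """Clopper-Pearson interval for k events in n trials."""
    alpha = 1 - level
    eps = 1e-12
    # Lower limit: the p at which P(X >= k) equals alpha / 2
    lower = 0.0 if k == 0 else bisect_increasing(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2, eps, 1 - eps)
    # Upper limit: the p at which P(X <= k) equals alpha / 2
    upper = 1.0 if k == n else bisect_increasing(
        lambda p: alpha / 2 - binom_cdf(k, n, p), eps, 1 - eps)
    return lower, upper


lower, upper = exact_ci(530, 11627)
print(f"prevalence = {530 / 11627:.4f}, 95% CI {lower:.4f} to {upper:.4f}")
```

The point estimate is 530/11,627, about 4.6%, and the interval should sit close to the simpler normal-approximation interval because the sample is large.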

35 The CI Calculator

36 The “CC” Calculator

37 Your Task: Try the cci command to obtain the OR.
Your Task: Can you reproduce these CIs with an immediate command?

38 Diagnostic Test Metrics
–Sensitivity
–Specificity
–Positive Predictive Value
–Negative Predictive Value

39 Common Notation for Test Metrics

40 Formulas for Test Metrics… Let’s make formulas for Se, Sp, PPV and NPV using this terminology.
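One way to write these four formulas is sketched below. The notation slide did not survive in this transcript, so the labels TP, FP, FN and TN (true/false positives and negatives, with diabetes as the reference standard) are an assumption, and the counts are made up rather than taken from the course dataset.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Se, Sp, PPV and NPV from the four cells of a test-by-disease table."""
    return {
        "sensitivity": tp / (tp + fn),  # positives among those with the disease
        "specificity": tn / (tn + fp),  # negatives among those without the disease
        "ppv": tp / (tp + fp),          # diseased among those testing positive
        "npv": tn / (tn + fn),          # non-diseased among those testing negative
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=40, fp=60, fn=10, tn=890)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this made-up table the sensitivity is 0.80 but the PPV is only 0.40, a reminder that predictive values depend on how common the disease is as well as on the test itself.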

41 (In Class) Assignment for Today
–Our database has random blood glucose (they call it “casual”).
–In these units (mg/dL), about 140 may be used as a cut-point for an “elevated” level.
–Create a variable for “elevated” glucose and determine its Se, Sp, PPV and NPV as a diagnostic test for diabetes.
–Calculate a confidence interval for each parameter.

