Published by Bartholomew Robertson. Modified over 9 years ago.
1
Basics of Biostatistics for Health Research
Session 3 – February 21, 2013
Dr. Scott Patten, Professor of Epidemiology
Department of Community Health Sciences & Department of Psychiatry
patten@ucalgary.ca
2
Some General Principles of Data Analysis
Data cleaning and checking is always the first step after data entry.
Start with “univariate” analysis (frequencies and their CIs).
Progress to “bivariate” analysis: does a “dependent” variable differ depending on an “independent” variable?
The next stage is “multivariate” analysis.
3
Statistical Errors
5
Go to “www.ucalgary.ca/~patten”
Scroll to the bottom.
Right click to download the files described as being “for PGME Students”
–One is a dataset
–One is a data dictionary
Save them on your desktop
6
Open the Datafile
7
Comparing Proportions
We’ve looked at two procedures (e.g. for obesity in men vs. women):
generate obese = bmi
recode obese 0/30=0 30.001/1000=1
prtest obese, by(sex)
8
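Generate Commands Using Logic
generate obese2 = .
recode obese2 .=0 if bmi <= 30
recode obese2 .=1 if bmi > 30 & !missing(bmi)
tab obese obese2
prtest obese2, by(sex)
(Stata treats a missing value as larger than any number, so the !missing(bmi) condition keeps records with missing BMI out of the obese group.)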
9
Generate as a Recode Subcommand
recode bmi (0/30=0) (30.001/1000=1), gen(obese3)
tab obese obese3
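A further variant, offered here as a sketch rather than something shown in the slides, builds the indicator from a logical expression in a single step. The name obese4 is hypothetical; bmi and obese are the variables used above.

```stata
* One-step indicator: 1 if bmi > 30, 0 if bmi <= 30, missing if bmi is missing.
* obese4 is a hypothetical variable name chosen for this illustration.
generate obese4 = bmi > 30 if !missing(bmi)

* Cross-check against the recode-based version; the missing option
* adds a row and column for missing values so they can be compared too.
tabulate obese obese4, missing
```

The if !missing(bmi) qualifier matters because Stata sorts missing values above every number, so a bare bmi > 30 would be true for records with no BMI.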
10
Alternative to prtest
Can use tab with the subcommand “exact”:
tab obese sex, exact
11
Epitab Commands [screenshot: menu steps 1 to 3]
12
Risk Ratios
RR = (“risk” in the “exposed”) / (“risk” in the “non-exposed”)
13
Odds Ratios
OR = (odds in the “exposed”) / (odds in the “non-exposed”)
14
Measures of Association
The most common ones are ratios..
–RR
–OR
–PR
–IR
You’ll sometimes see differences as well..
–Risk Difference
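To make the ratio and difference measures concrete, here is a small worked sketch for a do-file. The 2x2 counts are invented for illustration and do not come from the course dataset. The last two lines use Stata's immediate commands csi and cci, which take the four cell counts directly.

```stata
* Hypothetical 2x2 counts (illustration only, not course data):
*                 exposed   non-exposed
*   outcome        a = 30      b = 20
*   no outcome     c = 70      d = 80
local a = 30
local b = 20
local c = 70
local d = 80

* Risk in each group, then the ratio and difference measures by hand.
local risk_exp   = `a'/(`a' + `c')
local risk_unexp = `b'/(`b' + `d')
display "Risk ratio      = " `risk_exp'/`risk_unexp'
display "Risk difference = " `risk_exp' - `risk_unexp'
display "Odds ratio      = " (`a'/`c')/(`b'/`d')

* The same table through the immediate commands, which add confidence intervals.
csi `a' `b' `c' `d'
cci `a' `b' `c' `d'
```

Locals are used instead of scalars because a scalar named d could be confused with an abbreviated variable name such as diabetes.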
15
Another Alternative…
The “cs” command is for “cross-sectional” and will give you risk ratios or risk differences.
However, it requires 0 and 1 values.
recode sex (1=0) (2=1), gen(female)
cs obese female, exact
16
Odds and Proportions
In our sample, there are…
–1,560 obese
–10,015 non-obese
–(and 52 missing)
The frequency of obesity (prevalence) is: 1,560/(1,560 + 10,015)
The odds are: 1,560/10,015
17
Odds and Proportions
In other words…
–If ‘a’ means “have the disease” and ‘b’ means “does not have the disease” then…
–Proportion = a / (a + b)
–Odds = a / b
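The arithmetic can be checked with Stata's display command. This is only a sketch using the counts quoted on the slide, with the 52 records with missing BMI left out of the denominator. The results should come out at roughly 0.135 for the prevalence and 0.156 for the odds.

```stata
* Counts from the slide: a = 1,560 obese, b = 10,015 non-obese.
display "Prevalence a/(a+b) = " 1560/(1560 + 10015)
display "Odds       a/b     = " 1560/10015

* Odds and proportion are linked by odds = p/(1 - p); this line
* should reproduce the odds printed above.
display "p/(1 - p)          = " (1560/11575)/(1 - 1560/11575)
```

When the outcome is uncommon, as here, the odds and the proportion are close; they drift apart as the proportion grows.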
18
Another Alternative…
The “cc” command is for “case-control” and will give you odds ratios.
However, it requires 0 and 1 values.
cc obese female, exact
19
A Task for You…
What is the prevalence of diabetes? (Provide a 95% confidence interval for your estimate.)
What is the prevalence of diabetes in men and women? (Hint: use “by” in the dialogue box.)
What is the odds ratio for the association of diabetes and obesity?
What is the risk ratio for the association of diabetes and obesity?
Is the association statistically significant?
20
A More Complex Problem.. The prevalence of obesity is said to be associated with lower levels of education
21
Two-way Tables [screenshot: menu steps 1 to 3]
22
A Two-way Table
23
Bar Graphs
It is under the graphics menu, the dialogue box… [screenshot: steps 1 to 3]
24
Select Categories.. [screenshot: steps 1 and 2]
26
Histograms, with “by”
The pattern of obesity by education is different from that of mean BMI.
Your Task: use the “by” subcommand with the histogram command to look at the distribution of BMI by education.
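One possible form of the command is sketched below. The variable name educ is an assumption; check the data dictionary for the actual name of the education variable.

```stata
* One histogram panel per education category.
* educ is an assumed variable name; bmi comes from the course dataset.
* The percent option scales bars to percentages within each panel, which
* makes panels comparable when the groups differ in size.
histogram bmi, by(educ) percent
```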
27
Does BMI Differ by Education?
If we had two groups we’d use a t-test. Our null would be Mean(1) = Mean(2), or as Stata says: Mean(1) – Mean(2) = 0
But we have > 2 groups, so could try to use ANOVA
–Can think of this test as an extension of the two group t-test
–Assumes normal distribution and equal variances (like the t-test it is “parametric”)
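In command form, the two-group and multi-group comparisons might look like the sketch below. The name educ is again an assumed variable name for education; bmi and sex are the variables used earlier in the session.

```stata
* Two groups: t-test of the null hypothesis Mean(1) - Mean(2) = 0.
ttest bmi, by(sex)

* More than two groups: one-way ANOVA of BMI across education levels.
* educ is an assumed variable name. The tabulate option adds a table of
* group means and standard deviations above the ANOVA table.
oneway bmi educ, tabulate
```

The oneway output also reports Bartlett's test for equal variances, which is one way to check the equal-variance assumption mentioned above.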
28
One-Way ANOVA [screenshot: menu steps 1 to 3]
29
Stata Warns of a Problem
30
The Kruskal-Wallis Test [screenshot: menu steps 1 to 3]
31
Kruskal-Wallis Output
32
Non-Parametric Tests
Kruskal-Wallis and its 2 sample version (Wilcoxon Rank Sum Test) require that…
–The variable can be meaningfully ordered, and
–Has a roughly/loosely bell shaped frequency distribution (should have a central tendency)
Your task: Repeat our analysis from last week in which we compared BMI in men and women, but use Kruskal-Wallis and Wilcoxon’s Rank Sum test.
–Do you get equivalent results?
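For those working from the command window rather than the menus, the two tests correspond to the kwallis and ranksum commands. This is a minimal sketch using the bmi and sex variables from the earlier slides.

```stata
* Kruskal-Wallis test of BMI across the groups defined by sex.
kwallis bmi, by(sex)

* Wilcoxon rank-sum (Mann-Whitney) test for the same two groups.
ranksum bmi, by(sex)
```

With only two groups, both commands test the same ranking of the data, so their p-values should be very close.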
33
Comparing Proportions?
–Yes: Fisher’s Exact Test
–No: Parametric Assumptions?
  –Yes: Multiple Groups? Yes: ANOVA. No: t-test
  –No: Multiple Groups? Yes: Kruskal-Wallis. No: Wilcoxon’s Rank Sum
34
Prevalence of Diabetes
Your Task: try this command:
cii 11627 530, exact
(Does your estimate resemble what you get with ci diabetes, exact?)
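The commands on this slide use the syntax that was current when the session was given in 2013. From Stata 14 onward, ci and cii expect the type of statistic to be named first, so on a recent installation the equivalent commands would look roughly like this sketch.

```stata
* Stata 14 and later: name the statistic (proportions) explicitly.
* 11627 is the number of observations, 530 the number with diabetes.
cii proportions 11627 530, exact
ci proportions diabetes, exact

* The older syntax from the slide can still be run under version control.
version 13: cii 11627 530, exact
```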
35
The CI Calculator [screenshot: steps 1 to 3]
36
The “CC” Calculator [screenshot: steps 1 to 3]
37
Your Task: Try the cci command to obtain the OR.
Your Task: Can you reproduce these CIs with an immediate command?
38
Diagnostic Test Metrics
–Sensitivity
–Specificity
–Positive Predictive Value
–Negative Predictive Value
39
Common Notation for Test Metrics
40
Formulas for Test Metrics… Let’s make formulas for Se, Sp, PPV and NPV using this terminology.
41
(In Class) Assignment for Today
Our database has random blood glucose (they call it “casual”).
In these units (mg/dl), about 140 may be used as a cut-point for an “elevated” level.
Create a variable for “elevated” glucose and determine its Se, Sp, PPV and NPV as a diagnostic test for diabetes.
Calculate a confidence interval for each parameter.
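One way to approach the assignment is sketched below, under several assumptions: the glucose variable is called glucose, diabetes is coded 0/1, “elevated” is taken to mean 140 mg/dl or higher, and the Stata 14 or later ci proportions syntax is available. All new variable names are hypothetical. Each metric is a proportion within a subgroup, so an exact binomial confidence interval can be requested for each.

```stata
* Assumed names and coding: glucose (mg/dl), diabetes (0 = no, 1 = yes).
* elevated, not_elevated and no_diabetes are hypothetical new variables.
generate elevated     = glucose >= 140 if !missing(glucose)
generate not_elevated = 1 - elevated
generate no_diabetes  = 1 - diabetes

* 2x2 table of test result against disease status.
tabulate elevated diabetes, row column

* Se  = proportion testing positive among those with diabetes.
ci proportions elevated if diabetes == 1, exact
* Sp  = proportion testing negative among those without diabetes.
ci proportions not_elevated if diabetes == 0, exact
* PPV = proportion with diabetes among those testing positive.
ci proportions diabetes if elevated == 1, exact
* NPV = proportion without diabetes among those testing negative.
ci proportions no_diabetes if elevated == 0, exact
```

On older versions of Stata, the ci form shown on the earlier slide (ci varname, exact) plays the same role as ci proportions here.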