Jul-15H.S.1 Short overview of statistical methods Hein Stigum Presentation, data and programs at: courses.


1 Jul-15H.S.1 Short overview of statistical methods Hein Stigum Presentation, data and programs at: http://folk.uio.no/heins/ courses

2 Agenda
Concepts
Bivariate analysis
–Continuous symmetrical
–Continuous skewed
–Categorical
Multivariable analysis
–Linear regression
–Logistic regression
Outcome variable decides analysis

3 CONCEPTS

4 Precision and bias
Measures of populations
–Precision: random error (statistics)
–Bias: systematic error (epidemiology)
[Figure: true value and estimate, with precision and bias marked]

5 Precision: Estimation
[Figure: population and sample; estimate with confidence interval]
95% confidence interval: 95% of repeated intervals will contain the true value
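The "95% of repeated intervals" statement can be checked by simulation. The sketch below is only an illustration: the true mean (3500), standard deviation (500), sample size (200) and number of repeats are made-up values, and the interval uses the large-sample normal multiplier 1.96.

```python
import random
import statistics

def ci_covers(true_mean, sd, n, rng):
    """Draw one sample and report whether its 95% CI contains the true mean."""
    sample = [rng.gauss(true_mean, sd) for _ in range(n)]
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / n ** 0.5  # standard error of the mean
    return mean - 1.96 * se <= true_mean <= mean + 1.96 * se

rng = random.Random(1)  # fixed seed so the run is reproducible
repeats = 2000          # hypothetical number of repeated samples
coverage = sum(ci_covers(3500, 500, 200, rng) for _ in range(repeats)) / repeats
print(f"Share of intervals containing the true value: {coverage:.3f}")
```

The printed share should land close to 0.95, which is what the confidence level promises over repeated sampling.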

6 Precision: Testing
[Figure: population and sample; group 1 versus group 2]
p-value = P(observing this difference or more, when the true difference is zero)
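One way to make the p-value definition concrete is a permutation test: shuffle the group labels (which forces the true difference to zero) and count how often the shuffled difference is at least as large as the observed one. This is a sketch, not the test used in the slides; the birth weights below are invented.

```python
import random
import statistics

def permutation_p_value(group1, group2, n_perm=5000, seed=1):
    """Two-sided permutation p-value for a difference in means."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(group1) - statistics.fmean(group2))
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # relabel at random: the null hypothesis holds
        diff = abs(statistics.fmean(pooled[:n1]) - statistics.fmean(pooled[n1:]))
        if diff >= observed:
            hits += 1
    return hits / n_perm

# Hypothetical birth weights in grams
boys = [3600, 3450, 3700, 3550, 3800, 3650, 3500, 3750]
girls = [3400, 3300, 3550, 3450, 3600, 3350, 3500, 3250]
p = permutation_p_value(boys, girls)
print(f"p-value: {p:.4f}")
```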

7 Precision: Significance level
Birth weight, 500 newborn, observe difference
H0: boys=girls
Ha: boys≠girls
Observed difference and p-value:
–10 g: p=0.90
–50 g: p=0.40
–100 g: p=0.10
–130 g: p=0.04
–150 g: p=0.02
Significance level: p<0.05

8 Precision: Test situations
–1 sample test: Weight = 10
–2 independent samples: Weight by sex
–K independent samples: Weight by age groups
–2 dependent samples: Weight last year = Weight today

9 Bias: DAGs
[DAG: E = gest age, D = birth weight, C1 = sex, C2 = parity]
Associations: Bivariate (unadjusted)
Causal effects: Multivariable (adjusted)
Draw your assumptions before your conclusions

10 WHY USE GRAPHS?

11 Problem example
Lunch meals per week
–Table of means (around 5 per week)
–Linear regression

12 Problem example 2
Iron level by sex
–Both linear and logistic regression
–Opposite results
[Figure: iron level in blood]

13 Datatypes
Categorical data
–Nominal: married/ single/ divorced
–Ordinal: small/ medium/ large
Numerical data
–Discrete: number of children
–Continuous: weight

14 Outcome data type dictates type of analysis

15 BIVARIATE ANALYSIS 1
Continuous symmetric outcome: Birth weight

16 Distribution
kdensity weight
drop if weight<2000
kdensity weight

17 Central tendency and dispersion
Mean and standard deviation: [output]
Mean with confidence interval: [output]
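The quantities named on this slide can be computed by hand as a check. The sketch below assumes a large sample, so it uses the normal multiplier 1.96 rather than a t quantile; the weights are made-up values.

```python
import statistics

def mean_sd_ci(values, z=1.96):
    """Return mean, standard deviation and an approximate 95% CI for the mean."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    se = sd / n ** 0.5             # standard error of the mean
    return mean, sd, (mean - z * se, mean + z * se)

# Hypothetical birth weights in grams
weights = [3200, 3550, 3400, 3900, 3100, 3650, 3500, 3300, 3750, 3450]
mean, sd, (low, high) = mean_sd_ci(weights)
print(f"mean={mean:.0f}, sd={sd:.0f}, 95% CI=({low:.0f}, {high:.0f})")
```

The standard deviation describes the spread of individual weights, while the confidence interval describes the uncertainty in the mean and shrinks as n grows.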

18 Compare groups, equal variance?
[Figure panels: Equal; Not equal]

19 2 independent samples
Are birth weights the same for boys and girls?
[Figures: scatterplot; density plot]

20 2 independent samples test
ttest weight, by(sex) unequal    (unequal variances)
ttest var1==var2    (paired test)
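For readers who want to see what a two-sample t-test with unequal variances computes, the following sketch calculates the Welch t statistic and its approximate degrees of freedom (Welch-Satterthwaite). It does not compute a p-value, since that needs the t distribution; the data are invented.

```python
import statistics

def welch_t(group1, group2):
    """Welch t statistic and approximate degrees of freedom (unequal variances)."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    se2 = v1 / n1 + v2 / n2  # squared standard error of the mean difference
    t = (statistics.fmean(group1) - statistics.fmean(group2)) / se2 ** 0.5
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df

# Hypothetical birth weights in grams
boys = [3600, 3450, 3700, 3550, 3800, 3650, 3500, 3750]
girls = [3400, 3300, 3550, 3450, 3600, 3350, 3500, 3250]
t, df = welch_t(boys, girls)
print(f"t={t:.2f}, df={df:.1f}")
```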

21 K independent samples
Is birth weight the same over parity?
[Figures: scatterplot; density plot]

22 K independent samples test
Equal means?
Equal variances?

23 Continuous by continuous
Does birth weight depend on gestational age?
[Figures: scatterplot; scatterplot, outlier dropped]

24 Continuous by continuous tests
Cut gestational age up in groups, then use T-test or ANOVA
or
Use linear regression with 1 covariate

25 Test situations
–1 sample test: ttest weight==10
–2 independent samples: ttest weight, by(sex)
–K independent samples: oneway weight parity
–2 dependent samples (paired): ttest weight_last_year==weight_today

26 BIVARIATE ANALYSIS 2
Continuous skewed outcome: Number of sexual partners

27 Distribution
kdensity partners if partners<=50

28 Central tendency and dispersion
Median and percentiles: [output]
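A short sketch of why the median and percentiles are preferred for a skewed outcome: a single large value pulls the mean far more than the median. The partner counts below are made up, and the "inclusive" quantile method is just one of several common percentile definitions.

```python
import statistics

# Hypothetical, right-skewed numbers of partners
partners = [0, 1, 1, 2, 2, 3, 4, 5, 8, 12, 40]

median = statistics.median(partners)
# Quartiles (25th, 50th, 75th percentiles)
q1, q2, q3 = statistics.quantiles(partners, n=4, method="inclusive")
mean = statistics.fmean(partners)

print(f"median={median}, quartiles=({q1}, {q2}, {q3}), mean={mean:.2f}")
```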

29 2 independent samples
Do males and females have the same number of partners?
[Figures: scatterplot; density plot]

30 2 independent samples test
Equal medians?
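The slide does not name the test used here. One common nonparametric choice for two independent samples is the Wilcoxon rank-sum test; the sketch below is an assumption in that spirit, using midranks for ties and a normal approximation without tie correction, so it is only reasonable for moderately large samples. The data are invented.

```python
from statistics import NormalDist

def rank_sum_test(group1, group2):
    """Wilcoxon rank-sum z statistic and two-sided p-value (normal approximation)."""
    pooled = sorted((v, g) for g, grp in enumerate((group1, group2)) for v in grp)
    rank_sum1 = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1                    # pooled[i:j] share the same value
        midrank = (i + 1 + j) / 2     # average of ranks i+1 .. j
        rank_sum1 += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(group1), len(group2)
    expected = n1 * (n1 + n2 + 1) / 2
    sd = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5  # no tie correction
    z = (rank_sum1 - expected) / sd
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical numbers of partners
males = [1, 2, 3, 5, 8, 12, 20, 40]
females = [0, 1, 1, 2, 2, 3, 4, 6]
z, p = rank_sum_test(males, females)
print(f"z={z:.2f}, p={p:.3f}")
```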

31 K independent samples
Do partners vary with age?
[Figures: scatterplot; density plot]

32 K independent samples test
Equal medians?

33 Table of descriptives

34 Table of tests
Categorical ordered: use nonparametric tests
If N is large: may use parametric tests
Remarks: If unequal variance in ANOVA: use linear regression with robust variance estimation

35 BIVARIATE ANALYSIS 3
Categorical outcome: Being bullied

36 Frequency and proportion
Frequency: [output]
Proportion with CI: [output]

37 Proportion, confidence interval
[Formulas: proportion; standard error; confidence interval]
x = "disease", n = total number
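The formulas on this slide did not survive extraction. The sketch below uses the standard large-sample (Wald) expressions, p = x/n, se = sqrt(p(1-p)/n) and p ± 1.96·se, which is an assumption about what the slide showed. The counts are invented.

```python
def proportion_ci(x, n, z=1.96):
    """Proportion, standard error and Wald 95% CI for x events out of n."""
    p = x / n
    se = (p * (1 - p) / n) ** 0.5
    return p, se, (p - z * se, p + z * se)

# Hypothetical data: 17 of 100 children report being bullied
p, se, (low, high) = proportion_ci(17, 100)
print(f"p={p:.3f}, se={se:.4f}, 95% CI=({low:.3f}, {high:.3f})")
```

The Wald interval is a rough approximation when p is close to 0 or 1 or n is small; other interval methods exist for those cases.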

38 Crosstables
Are boys bullied as much as girls?
Equal proportions?
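For a 2x2 crosstable, equal proportions are commonly tested with Pearson's chi-square test. The sketch below computes the statistic without continuity correction; with 1 degree of freedom the p-value can be obtained from the complementary error function. The cell counts are made up.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square and p-value (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 df, chi2 is a squared standard normal: P(Z^2 > chi2) = erfc(sqrt(chi2 / 2))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Hypothetical counts: rows = boys, girls; columns = bullied, not bullied
chi2, p = chi_square_2x2(20, 80, 10, 90)
print(f"chi2={chi2:.2f}, p={p:.3f}")
```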

39 Ordered categories, trend
Trend?
Equal proportions?

40 Table of tests
Categorical ordered: use nonparametric tests
If N is large: may use parametric tests
Remarks: If unequal variance in ANOVA: use linear regression with robust variance estimation

41 MULTIVARIABLE ANALYSIS 1
Continuous outcome: Linear regression, Birth weight

42 Regression idea

43 Model and assumptions
Model
Association measure
–β1 = increase in y for one unit increase in x1
Assumptions
–Independent errors
–Linear effects
–Constant error variance
Robustness
–Influence
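To illustrate the interpretation of β1, here is a minimal least-squares fit with a single covariate, y = β0 + β1·x. The one-covariate form and the data are assumptions for illustration; the model in the slides has several covariates.

```python
import statistics

def simple_ols(x, y):
    """Least-squares intercept b0 and slope b1 for y = b0 + b1 * x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx    # increase in y for one unit increase in x
    b0 = my - b1 * mx
    return b0, b1

# Hypothetical data: gestational age (days) and birth weight (grams)
gest = [260, 266, 273, 280, 287, 294]
weight = [3000, 3150, 3300, 3500, 3600, 3800]
b0, b1 = simple_ols(gest, weight)
print(f"b1={b1:.1f} grams per day of gestational age")
```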

44 Workflow
DAG
Scatterplots
Bivariate analysis
Regression
–Model estimation
–Test of assumptions: independent errors, linear effects, constant error variance
–Robustness: influence
[DAG: E = gest age, D = birth weight, C1 = sex, C2 = parity]

45 Categorical covariates
2 categories
–OK
3+ categories
–Use "dummies"
"Dummies" are 0/1 variables used to create contrasts
Want 3 categories for parity: 0, 1 and 2-7 children
–Choose 0 as reference
–Make dummies for the two other categories
generate Parity1 =(parity==1) if parity<.
generate Parity2_7 =(parity>=2) if parity<.
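The same dummy coding can be sketched outside Stata. The function below mirrors the two `generate` lines, with `None` standing in for a missing value (the role of `if parity<.`); the function name is hypothetical.

```python
def parity_dummies(parity):
    """Return (Parity1, Parity2_7) as 0/1 dummies; parity 0 is the reference."""
    if parity is None:  # missing parity stays missing in both dummies
        return None, None
    return int(parity == 1), int(parity >= 2)

for value in (0, 1, 2, 7, None):
    print(value, parity_dummies(value))
```

The reference category (parity 0) is the one where both dummies are 0, so each regression coefficient is a contrast against women with no previous children.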

46 Create meaningful constant
Expected birth weight at:
–gest=0, sex=0, parity=0: not meaningful
–gest=280, sex=1, parity=0

47 Model estimation

48 Test of assumptions
Plot residuals versus predicted y
–Independent residuals?
–Linear effects?
–Constant variance?

49 Violations of assumptions
–Dependent residuals: use mixed models or GEE
–Non-linear effects: add square term
–Non-constant variance: use robust variance estimation

50 Influence

51 Measures of influence
Measure change in:
–Predicted outcome
–Deviance
–Coefficients (beta): delta beta
Remove obs 1, see change
Remove obs 2, see change
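The "remove one observation, see change" idea can be sketched directly as a leave-one-out loop. This version reports the raw change in the slope of a one-covariate regression; statistical packages often report a standardized version instead, so the numbers are not directly comparable. The data are invented, with the last point as a deliberate outlier.

```python
import statistics

def slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

def delta_betas(x, y):
    """Change in slope when each observation is removed in turn."""
    full = slope(x, y)
    return [slope(x[:i] + x[i + 1:], y[:i] + y[i + 1:]) - full
            for i in range(len(x))]

# Hypothetical data: the last observation is an outlier
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 6, 8, 10, 40]
deltas = delta_betas(x, y)
most_influential = max(range(len(deltas)), key=lambda i: abs(deltas[i]))
print(f"full slope={slope(x, y):.1f}, most influential obs index={most_influential}")
```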

52 Delta beta for gestational age
If obs nr 539 is removed, beta will change from 6 to 16

53 Removing outlier
[Tables: full model; outlier removed; final model]
One outlier affected two estimates

54 MULTIVARIABLE ANALYSIS 2
Binary outcome: Logistic regression, Being bullied

55 Ordered categories and model
Categories: Regression model
–2: Logistic
–3-7: Ordinal logistic
–>7: Linear (treat as interval)
Interval versus ordered scale:
–Interval scale: 1, 2, 3
–Ordered scale: low, medium, high

56 Logistic model and assumptions
Association measure
–Odds ratio in y for 1 unit increase in x1
Assumptions
–Independent errors
–Linear effects on the log odds scale
Robustness
–Influence

57 Being bullied
We want the total effect of country on being bullied.
–The risk of being bullied depends on age and sex.
–The age and sex distribution may differ between countries.
Should we adjust for age and sex?
[DAG: E = country, D = bullied, C1 = age, C2 = sex]
No, age and sex are mediating variables

58 Logistic: being bullied
OR ≈ RR if the outcome is rare
OR > RR (further from 1) if the outcome is common
Prevalence of being bullied = 17%
Roughly:
–Same risk of being bullied in Iceland as in Sweden.
–2 times the risk in Norway as in Sweden.
–3 times the risk in Finland as in Sweden.
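The relation between the odds ratio and the risk ratio is easy to verify numerically. In the sketch below the risks are invented; both pairs have a risk ratio of 2, but the odds ratio only stays close to 2 when the outcome is rare.

```python
def risk_ratio(p_exposed, p_unexposed):
    """Ratio of two risks (probabilities)."""
    return p_exposed / p_unexposed

def odds_ratio(p_exposed, p_unexposed):
    """Ratio of two odds, where odds = p / (1 - p)."""
    return (p_exposed / (1 - p_exposed)) / (p_unexposed / (1 - p_unexposed))

# Hypothetical risks
rare = (0.02, 0.01)    # rare outcome
common = (0.40, 0.20)  # common outcome

for label, (p1, p0) in (("rare", rare), ("common", common)):
    print(f"{label}: RR={risk_ratio(p1, p0):.2f}, OR={odds_ratio(p1, p0):.2f}")
```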

59 Summing up
DAGs
–State prior knowledge. Guide analysis
Plots
–Linearity, variance, outliers
Bivariate analysis
–Continuous symmetrical: mean, T-test, ANOVA
–Continuous skewed: median, nonparametric
–Categorical: freq, cross, chi-square
Multivariable analysis
–Continuous: linear regression
–Binary: logistic regression

