
1 Statistical methods for health sciences Descriptive statistics Hypotheses tests Analysis of variance Regression analysis Live event analysis Multivariate analysis SPSS

2 Do we need statistics?

3 Ex. A new test to detect if a person is infected with a virus. The test is positive for 80% of those infected. The test is negative for 95% of those not infected. We test 1000 individuals and get 58 positive tests. How many of those 58 are really infected by the virus? ~A few? ~Half? ~Most?

4 Not infected (n=990), Infected (n=10)

5 ~5% of the 990 without the virus will have false positive tests, i.e. ~50 persons. ~80% of the 10 infected will have positive tests, i.e. ~8 persons. So only ~8 of the 58 positive tests (about 14%) come from infected persons. Not infected (n=990), Infected (n=10)
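The arithmetic on slides 3-5 can be checked with a short sketch. It assumes, as on slide 4, that 10 of the 1000 tested persons are infected; the variable names are illustrative.

```python
# Counts and test characteristics taken from slides 3-5.
n_tested = 1000
n_infected = 10                        # slide 4: Infected (n=10)
n_not_infected = n_tested - n_infected
sensitivity = 0.80                     # P(test positive | infected)
specificity = 0.95                     # P(test negative | not infected)

true_positives = sensitivity * n_infected
false_positives = (1 - specificity) * n_not_infected
all_positives = true_positives + false_positives

# Share of positive tests that come from truly infected persons
# (the positive predictive value).
ppv = true_positives / all_positives

print(f"true positives:  {true_positives:.1f}")
print(f"false positives: {false_positives:.1f}")
print(f"share of positives truly infected: {ppv:.1%}")
```

Under these assumptions only about 8 of the roughly 58 positive tests are true infections, so the answer to the question on slide 3 is "a few".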

6 Variable types. Qualitative (e.g. colors). Quantitative: continuous (e.g. systolic blood pressure) or discrete (e.g. number of children). Nominal: categorical (e.g. dead/alive). Ordinal: categorical with order (e.g. x-small, small, medium, large). Scale: quantity of “something” (e.g. BMI).

7 Which type of variable? Age. Age group (25-34, 35-44, 45-54, ...). Sex (male/female). Education (primary, secondary, univ). Smoker (yes/no). Syst BP, Diast BP (mmHg). BMI (23.45, 28.12, …). Obese (BMI ≥ 30).

8 Study designs in medical research. Observational studies: one or more groups of subjects are observed and data are recorded for analysis. Experimental studies: involve an intervention, and interest lies in the effect the intervention has on the study subjects. In medical research: subjects = people.

9 Observational studies. Prospective study (cohort study): a group of disease-free individuals is identified at one point and followed over a period of time until some develop the disease. Retrospective study (case-control study): two groups are identified, one with the disease (cases) and one without (controls). Cross-sectional study (prevalence study): the study population is asked about their disease status, and the prevalence of the disease at one time-point is estimated.

10 Descriptive statistics. Always start your analysis with a description of your data: detect outliers; detect mistakes in the data registration (e.g. length=888 cm); get an overview of the distribution of the variables; get an overview of differences and trends.

11 Descriptive statistics. Measures of location: mean, median. Measures of spread: range (min-max), variance, standard deviation (SD), standard error of the mean (SEM), percentiles/quantiles (p25, p75, q1, q3, ...). Frequency tables. Graphs: barchart/histogram, boxplot, scatterplot.

12 Measures of location. Ex. 20 birthweights (g): 2069, 2581, 2759, 2834, 2838, 2841, 3031, 3101, 3200, 3245, 3248, 3260, 3265, 3314, 3323, 3484, 3541, 3609, 3649, 4146. mean = “sum/(no. of measurements)” = 3166.9; median = “value in the middle” = 3246.5
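As a quick check, the mean and median of the 20 birthweights can be reproduced with Python's standard `statistics` module. This is a minimal sketch; the variable names are illustrative.

```python
import statistics

# The 20 birthweights (g) from slide 12.
birthweights_g = [
    2069, 2581, 2759, 2834, 2838, 2841, 3031, 3101, 3200, 3245,
    3248, 3260, 3265, 3314, 3323, 3484, 3541, 3609, 3649, 4146,
]

mean_g = statistics.mean(birthweights_g)      # sum / number of measurements
median_g = statistics.median(birthweights_g)  # average of the two middle values (n is even)

print(f"mean   = {mean_g:.1f} g")
print(f"median = {median_g:.1f} g")
```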

13 Measures of spread. Ex. cont. 20 birthweights (g): 2069, 2581, 2759, 2834, 2838, 2841, 3031, 3101, 3200, 3245, 3248, 3260, 3265, 3314, 3323, 3484, 3541, 3609, 3649, 4146. 10th percentile = “10% of obs. are less than” = 2670; 90th percentile = “90% of obs. are less than” = 3629. 1st quartile = 25th percentile; 3rd quartile = 75th percentile; median = 50th percentile.
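Percentiles can be computed in several ways, and software packages differ in their defaults. The sketch below uses one simple convention that reproduces the slide's values: when n·p/100 is a whole number k, average the k-th and (k+1)-th sorted values. The function name `percentile_avg` is hypothetical, and other conventions (for instance linear interpolation) give somewhat different numbers.

```python
def percentile_avg(values, pct):
    """Percentile by the 'average at exact positions' convention.

    pct is an integer percentage (0-100). If n*pct/100 is a whole
    number k, return the mean of the k-th and (k+1)-th sorted values;
    otherwise return the next sorted value above that position.
    """
    xs = sorted(values)
    n = len(xs)
    k, remainder = divmod(n * pct, 100)
    if remainder == 0:
        if k == 0:
            return xs[0]
        if k == n:
            return xs[-1]
        return (xs[k - 1] + xs[k]) / 2
    return xs[k]


# The 20 birthweights (g) from slide 13.
birthweights_g = [
    2069, 2581, 2759, 2834, 2838, 2841, 3031, 3101, 3200, 3245,
    3248, 3260, 3265, 3314, 3323, 3484, 3541, 3609, 3649, 4146,
]

for pct in (10, 25, 50, 75, 90):
    print(f"p{pct} = {percentile_avg(birthweights_g, pct)}")
```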

14 Measures of spread. Standard deviation (SD) ≈ “the average deviation from the mean” ≈ 445. Variance = SD² = 198323. Standard error of the mean (SEM) = SD/√n ≈ 99.6. If the variable you are measuring is normally distributed, most of the measurements (≈95%) lie within mean ± 2SD, in the example between about 2276 g and 4058 g.
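The spread measures for the birthweight example can be reproduced as follows. This sketch assumes the sample formulas (dividing by n-1), which match the slide's numbers; the variable names are illustrative.

```python
import math
import statistics

# The 20 birthweights (g) from the example.
birthweights_g = [
    2069, 2581, 2759, 2834, 2838, 2841, 3031, 3101, 3200, 3245,
    3248, 3260, 3265, 3314, 3323, 3484, 3541, 3609, 3649, 4146,
]

n = len(birthweights_g)
mean_g = statistics.mean(birthweights_g)
variance = statistics.variance(birthweights_g)  # sample variance (divides by n-1)
sd = statistics.stdev(birthweights_g)           # square root of the sample variance
sem = sd / math.sqrt(n)                         # standard error of the mean

# Range expected to hold roughly 95% of values if the data are normal.
low, high = mean_g - 2 * sd, mean_g + 2 * sd

print(f"variance = {variance:.0f}")
print(f"SD       = {sd:.1f}")
print(f"SEM      = {sem:.1f}")
print(f"mean ± 2SD = {low:.0f} g to {high:.0f} g")
```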

15 Graphs, histogram. Tip: in SPSS you can add the normal distribution curve.

16 Graphs, boxplot. Labels in the figure: max, p75, median, p25, min.

17 Graphs, scatterplot. Ex. Does the mother's weight influence the birthweight?

18 Estimates and confidence intervals. Ex. Estimate the mean height of the population in Umeå by measuring a sample of 10 individuals. Estimate = sample mean. 95% confidence interval = mean ± 1.96·SEM. A 95% confidence interval is an interval that with 95% probability will cover the true value (what we want to estimate).
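The slide's example has no data, so the sketch below applies the same formula (mean ± 1.96·SEM) to the 20 birthweights from slide 12 instead. The 1.96 multiplier is the normal approximation used on the slide; for small samples a t-distribution multiplier would be somewhat larger.

```python
import math
import statistics

# The 20 birthweights (g) from slide 12, reused here as example data.
birthweights_g = [
    2069, 2581, 2759, 2834, 2838, 2841, 3031, 3101, 3200, 3245,
    3248, 3260, 3265, 3314, 3323, 3484, 3541, 3609, 3649, 4146,
]

mean_g = statistics.mean(birthweights_g)
sem = statistics.stdev(birthweights_g) / math.sqrt(len(birthweights_g))

# 95% confidence interval with the normal-approximation multiplier 1.96.
ci_low = mean_g - 1.96 * sem
ci_high = mean_g + 1.96 * sem

print(f"estimate = {mean_g:.1f} g")
print(f"95% CI   = {ci_low:.0f} g to {ci_high:.0f} g")
```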

19 Hypothesis test. Ex. We suspect that patients who suffer a heart attack have higher blood pressure than healthy individuals. Null hypothesis H0: no difference (we want to reject). Alternative hypothesis H1: difference. p-value (p) = “the probability of seeing a difference at least this large by chance alone, if H0 is true”. Level of significance (α) = the limit below which we reject H0, most commonly 0.05 (5%).

20 Hypothesis test. H0 is rejected if p < α (0.05), and we conclude that there is a statistically significant difference in blood pressure between heart attack patients and healthy individuals. NB: p > 0.05 is not a proof of equality, just that a difference this large would arise by chance more than 5% of the time if there were no true difference!

21 Reasons for non-significant results. There is no difference. There is a difference, but we had too few patients to detect it.

22 Type 1 and type 2 error. Type 1 error: concluding there is a difference when there really is none (risk = α, usually 5%). Type 2 error: failing to detect a true difference.

23 Multiple testing. We want to compare patients with healthy individuals with respect to 20 variables (height, weight, smoking, cholesterol, ASAT, ALAT, CRP, etc.). We test each variable and get p=0.04 for CRP and p>0.05 for all other variables. How do we interpret this result?
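One way to approach the question is to ask how often at least one test would come out "significant" by chance alone. The sketch below assumes that all 20 null hypotheses are true and that the tests are independent; both are simplifying assumptions not stated on the slide.

```python
alpha = 0.05   # level of significance per test
n_tests = 20   # number of variables tested

# Expected number of false positives if every null hypothesis is true.
expected_false_positives = alpha * n_tests

# Probability of at least one false positive among independent tests.
prob_at_least_one = 1 - (1 - alpha) ** n_tests

print(f"expected false positives: {expected_false_positives:.1f}")
print(f"P(at least one p < 0.05): {prob_at_least_one:.2f}")
```

Under these assumptions a single p=0.04 among 20 tests is about what chance alone would produce.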

24 Multiple testing. What shall we do? 1. Don't make any strong conclusions from p-values close to 0.05. 2. Formally: Bonferroni correction, divide the ordinary level of significance (0.05) by the number of tests (or multiply the uncorrected p-values by the number of tests).

25 Multiple testing, Bonferroni
height: p=0.50 (corrected p=1.0)
weight: p=0.40 (1.0)
smoking: p=0.06 (1.0)
cholesterol: p=0.07 (1.0)
ASAT: p=0.20 (1.0)
ALAT: p=0.20 (1.0)
CRP: p=0.04 (0.8)
etc. (20 variables)
Compare with a level of significance of 0.05/20=0.0025, or multiply p by 20 (within parentheses).
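The Bonferroni correction on this slide can be sketched as below, using the seven p-values that the slide lists (the remaining 13 variables are not given, so they are left out). Corrected p-values are capped at 1.0, as on the slide.

```python
alpha = 0.05
n_tests = 20  # total number of variables tested, per the slide

# Uncorrected p-values listed on the slide.
p_values = {
    "height": 0.50,
    "weight": 0.40,
    "smoking": 0.06,
    "cholesterol": 0.07,
    "ASAT": 0.20,
    "ALAT": 0.20,
    "CRP": 0.04,
}

# Option 1: compare each p-value with the corrected level of significance.
corrected_alpha = alpha / n_tests
significant = [name for name, p in p_values.items() if p < corrected_alpha]

# Option 2: multiply each p-value by the number of tests (capped at 1.0)
# and compare with the ordinary level of significance.
corrected_p = {name: min(1.0, p * n_tests) for name, p in p_values.items()}

print(f"corrected level of significance: {corrected_alpha:.4f}")
for name, p in corrected_p.items():
    print(f"{name}: corrected p = {p:.1f}")
print("significant after correction:", significant)
```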

26 Normal probability distribution. Common assumption in several statistical tests (e.g. t-test). Does NOT mean that the observations are distributed as they normally would be. Notation: N(mean, variance).

27 Normal probability distribution. Mean=0, SD=1. ~68% of obs. within mean ±1SD, ~95% within ±2SD, ~99.7% within ±3SD.
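These three percentages follow from the normal distribution itself: the probability of falling within k standard deviations of the mean is erf(k/√2). A minimal check using only the standard library (the function name `coverage_within` is illustrative):

```python
import math


def coverage_within(k):
    """Probability that a normal variable lies within k SD of its mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within ±{k} SD: {coverage_within(k):.1%}")
```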

28 Normal probability distribution. How do I know if my variable is normally distributed? Continuous variable, no cut-off point. Not too few observations in each group (~20 or more). Draw a histogram: symmetric, bell-shaped, mean=median. Unsure? Use non-parametric tests if available.

29 Parametric/non-parametric tests. Parametric tests: use if data are normally distributed; describe your data with mean and SD. Non-parametric tests: primarily for data that are not normally distributed; can also be used if data are normally distributed, but are then less powerful; less sensitive to outliers; describe your data with median and percentiles.

30 t-test. When: continuous, normally distributed variable; compare two independent groups (independent samples t-test); compare two dependent groups (paired t-test). Ex: test if the BP level differs between healthy individuals and MI patients (independent samples t-test); test the within-patient change in BP, before and after a treatment (paired t-test).

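31 Independent samples t-test (SPSS output, variable: Cholesterol)
Levene's Test for Equality of Variances: F = 0.323, Sig. = 0.570
t-test for Equality of Means, equal variances assumed: t = 4.436, df = 1812, Sig. (2-tailed) = 0.000, Mean Difference = 0.3077, Std. Error Difference = 0.0694, 95% Confidence Interval of the Difference = 0.1717 to 0.4437
t-test for Equality of Means, equal variances not assumed: t = 4.537, df = 638.32, Sig. (2-tailed) = 0.000, Mean Difference = 0.3077, Std. Error Difference = 0.0678, 95% Confidence Interval of the Difference = 0.1745 to 0.4408
Levene's test tests equality of variances. If this p-value < 0.05, then use the test where equal variances are not assumed. Sig. (2-tailed) is the p-value for the difference in means: one value if equal variances are assumed, one if unequal variances are assumed.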
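The numbers in the "equal variances assumed" row of the SPSS output hang together in a simple way: t is the mean difference divided by its standard error, and the 95% confidence interval is the mean difference ± about 1.96 standard errors. The sketch below checks this using the rounded values from the output; using 1.96 instead of the exact t-quantile is an assumption that is reasonable here because df = 1812 is large.

```python
# Rounded values read from the "equal variances assumed" row.
mean_diff = 0.3077
se_diff = 0.0694
multiplier = 1.96  # normal approximation; assumption, fine for df = 1812

t_stat = mean_diff / se_diff
ci_low = mean_diff - multiplier * se_diff
ci_high = mean_diff + multiplier * se_diff

print(f"t      = {t_stat:.2f}")
print(f"95% CI = {ci_low:.4f} to {ci_high:.4f}")
```

Small deviations from the printed output come from rounding in the table.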

32 Wilcoxon/Mann-Whitney test. When: the variable is not normally distributed, or you do not know if it is normally distributed. “The non-parametric version of the independent samples t-test”. Ex: Compare level of consciousness (1-5) between two groups (Wilcoxon rank sum test/Mann-Whitney), “non-paired”. Compare level of consciousness (1-5) between arrival at hospital and two weeks after (Wilcoxon signed rank test), “paired”.
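To illustrate the idea behind the non-paired version, the sketch below computes the Mann-Whitney U statistic by comparing every observation in one group with every observation in the other, counting ties as one half. The consciousness scores are hypothetical and are not taken from the slides; the sketch stops at the statistic and does not compute a p-value.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b): pairwise 'wins' for each group, ties count 0.5."""
    u_a = 0.0
    for x in group_a:
        for y in group_b:
            if x > y:
                u_a += 1.0
            elif x == y:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b


# Hypothetical level-of-consciousness scores (1-5) for two groups.
scores_a = [1, 2, 2, 3, 3]
scores_b = [3, 4, 4, 5, 5]

u_a, u_b = mann_whitney_u(scores_a, scores_b)
print(f"U_a = {u_a}, U_b = {u_b}")
```

A U value close to 0 for one group means its scores are almost always lower than those of the other group; values near half the number of pairs mean the groups overlap heavily.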

33 χ²-test. When: 2 categorical variables. Ex: Compare if the proportion with diabetes differs between men and women.
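As a sketch of how such a comparison works, the code below computes the Pearson χ² statistic (without continuity correction) for a 2×2 table and the matching p-value for 1 degree of freedom. The counts are hypothetical and not taken from the slides.

```python
import math

# Hypothetical counts (NOT from the slides): [diabetes, no diabetes].
observed = [
    [30, 170],  # men
    [20, 180],  # women
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Pearson chi-square: sum of (observed - expected)^2 / expected,
# where expected counts assume the same proportion in both rows.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / total
        chi2 += (obs - expected) ** 2 / expected

# For a 2x2 table there is 1 degree of freedom, and the upper tail
# probability of chi-square(1) equals erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi-square = {chi2:.3f}")
print(f"p-value    = {p_value:.3f}")
```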

