Published by Emilia Iles. Modified over 9 years ago.
1
Appropriate techniques of statistical analysis Anil C Mathew PhD Professor of Biostatistics & General Secretary ISMS PSG Institute of Medical Sciences and Research Coimbatore 641 004
2
Types of studies: case study, case series, cross-sectional studies, case-control study, cohort study, randomized controlled trials, screening test evaluation
3
Data analysis: case series. Measures of average: mean, median, mode. Length of stay for 5 patients: 1, 3, 2, 4, 5 days. Mean length of stay: 3 days. Median length of stay: 3 days. Mode length of stay: no mode.
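The three averages on this slide can be checked with Python's standard `statistics` module. This is a minimal sketch; the variable names are illustrative and not part of the original slides.

```python
from statistics import mean, median, multimode

# Length of stay (days) for the 5 patients on the slide
length_of_stay = [1, 3, 2, 4, 5]

mean_stay = mean(length_of_stay)      # arithmetic mean
median_stay = median(length_of_stay)  # middle value after sorting

# multimode returns every most-frequent value; when all values occur
# once it returns the whole data set, which we treat as "no mode"
modes = multimode(length_of_stay)
has_mode = len(modes) < len(set(length_of_stay))

print("mean:", mean_stay, "median:", median_stay, "has a mode:", has_mode)
```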
4
Which is the best average? DBP: mean 81, median 79, mode 76. Height: 180. SAL: mean 7.5, median 7.6, mode 8.1.
5
Data analysis: case series. Frequency distribution

RBC            Frequency   Relative frequency
5.95-7.95      1           0.029
7.95-9.95      8           0.229
9.95-11.95     14          0.400
11.95-13.95    9           0.257
13.95-15.95    2           0.057
15.95-17.95    1           0.029
Total          35          1.000
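The relative frequencies in this table are each class frequency divided by the total of 35. A short sketch of that arithmetic, using the counts from the slide (the variable names are illustrative):

```python
# RBC class intervals and their frequencies, as given on the slide
class_intervals = ["5.95-7.95", "7.95-9.95", "9.95-11.95",
                   "11.95-13.95", "13.95-15.95", "15.95-17.95"]
frequencies = [1, 8, 14, 9, 2, 1]

total = sum(frequencies)

# Relative frequency = class frequency / total, rounded to 3 decimals
relative = [round(f / total, 3) for f in frequencies]

for interval, f, rf in zip(class_intervals, frequencies, relative):
    print(f"{interval:<12} {f:>3} {rf:.3f}")
print(f"{'Total':<12} {total:>3}")
```

Because each entry is rounded separately, the rounded column may not add up to exactly 1.000.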
6
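Design of cohort study (diagram). Time and the direction of inquiry run forward: from the population, people without the disease are classified as exposed or not exposed, and each group is followed to see who develops the disease and who does not.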
7
Is obesity associated with adverse pregnancy outcomes? Women with a body mass index > 30 delivering singletons. Ref: University of Udine, Italy, 2006.

          Preterm birth   No preterm birth   Total   % preterm
Obese     16              35                 51      31.4
Normal    46              487                533     8.6

RR = 31.4 / 8.6 = 3.65
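The relative risk here is the risk of preterm birth in obese women divided by the risk in women of normal weight. The sketch below recomputes it from the raw counts; the function name `relative_risk` is illustrative. From unrounded risks the ratio is about 3.64, while the slide's 3.65 comes from dividing the rounded percentages (31.4 / 8.6).

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in the exposed group divided by risk in the unexposed group."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed

# Counts from the slide: 16 of 51 obese women and 46 of 533
# normal-weight women had a preterm birth
rr = relative_risk(16, 51, 46, 533)
print(f"RR = {rr:.2f}")
```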
8
Design of case-control study (diagram). Subjects are selected by outcome (disease or no disease), and each group is then classified as exposed or not exposed.
9
Results of a case-control study

                   Lung cancer (D+)   No lung cancer (D-)   Totals
Exposed (E+)       80 (a)             30 (b)                a + b
Non-exposed (E-)   20 (c)             70 (d)                c + d
Totals             100 (a + c)        100 (b + d)
10
Analysis of case-control study: Odds ratio = (a × d) / (b × c) = (80 × 70) / (30 × 20) = 9.3
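The odds ratio is the cross-product ratio of the 2 x 2 table. A minimal sketch using the lung cancer counts from the slides, with cells labelled a, b, c, d as in the table (the function name is illustrative):

```python
def odds_ratio(a, b, c, d):
    """Cross-product ratio (a*d)/(b*c) for a 2x2 table.

    a: exposed cases      b: exposed controls
    c: unexposed cases    d: unexposed controls
    """
    return (a * d) / (b * c)

# Lung cancer example: a=80, b=30, c=20, d=70
or_value = odds_ratio(80, 30, 20, 70)
print(f"Odds ratio = {or_value:.1f}")
```

The parentheses matter: written as `a*d/b*c` without them, the expression would be evaluated left to right and give a different number.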
11
Data analysis: screening test evaluation. Can plasma levels of BCPF (breast carcinoma promoting factor) be used to diagnose breast cancer? Positive criterion: BCPF > 150 units, compared with breast biopsy (the gold standard).

              D+     D-      Total
BCPF test T+  570    150     720
BCPF test T-  30     850     880
Total         600    1000    1600

TP = 570, FN = 30, FP = 150, TN = 850
12
Sensitivity = P(T+ | D+) = 570/600 = 95%
Specificity = P(T- | D-) = 850/1000 = 85%
False negative rate = 1 - sensitivity
False positive rate = 1 - specificity
Prevalence = P(D+) = 600/1600 = 38%
Positive predictive value = P(D+ | T+) = 570/720 = 79%
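These measures all follow from the four cell counts of the screening table. A small sketch with the BCPF counts (TP = 570, FN = 30, FP = 150, TN = 850); the function and key names are illustrative:

```python
def screening_measures(tp, fn, fp, tn):
    """Basic screening-test measures from the four cells of a 2x2 table."""
    diseased = tp + fn
    healthy = fp + tn
    total = diseased + healthy
    sensitivity = tp / diseased
    specificity = tn / healthy
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_negative_rate": 1 - sensitivity,
        "false_positive_rate": 1 - specificity,
        "prevalence": diseased / total,
        "ppv": tp / (tp + fp),  # positive predictive value
    }

measures = screening_measures(tp=570, fn=30, fp=150, tn=850)
for name, value in measures.items():
    print(f"{name}: {value:.1%}")
```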
13
Tradeoffs between sensitivity and specificity. Favour high sensitivity when the consequences of missing a case are potentially grave. Favour high specificity when a false positive diagnosis may lead to risky treatment.
14
Data analysis: case series. Measures of variation: range, standard deviation

Group 1   Group 2
29        25
30        30
31        35
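The following sketch computes the range and standard deviation for the two groups. It assumes the flattened table on this slide reads Group 1 = 29, 30, 31 and Group 2 = 25, 30, 35 (the shared middle value 30 is an interpretation of the source), and it uses the sample standard deviation with an n - 1 denominator.

```python
from statistics import mean, stdev

# ASSUMPTION: values read from the flattened slide table
group_1 = [29, 30, 31]
group_2 = [25, 30, 35]

def value_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)

for name, values in (("Group 1", group_1), ("Group 2", group_2)):
    print(name,
          "mean:", mean(values),
          "range:", value_range(values),
          "SD:", round(stdev(values), 2))
```

Under that reading both groups have the same mean, but Group 2 is far more spread out, which is the point of reporting a measure of variation alongside an average.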
15
Data analysis: analytical studies. Tests of significance
16
Case Study 1: Drug A and Drug B
Aim: Efficacy of two drugs in lowering serum cholesterol levels
Method: Drug A, 50 patients; Drug B, 50 patients
Result: At the end of 6 months, the average serum cholesterol level is lower in those receiving Drug B than in those receiving Drug A
17
What is the Conclusion?
18
A) Drug B is superior to Drug A in lowering cholesterol levels: Possible/Not possible
19
B) Drug B is not superior to Drug A, instead the difference may be due to chance: Possible/Not possible
20
C) The difference is not due to the drug; uncontrolled differences other than treatment between the samples receiving Drug A and Drug B account for it: Possible/Not possible
21
D) Drug A may have been selectively administered to patients whose serum cholesterol levels were more refractory to drug therapy: Possible/Not possible
22
An observed difference in a study can be due to: 1) Random chance 2) Biased comparison 3) Uncontrolled confounding variables
23
Solutions: A and B. Test of significance (P value). P < 0.05 means that, if there were truly no difference, a difference at least as large as the one observed would arise by random chance less than 5% of the time. P < 0.01 means it would arise less than 1% of the time. The P value does not tell us about the magnitude of the difference.
24
Solutions: C and D. Allocate treatments at random and compare the baseline characteristics of the groups
25
Figure 1
26
Table 1: Baseline characteristics

Characteristic                          Vitamin group (n = 141)   Placebo group (n = 142)
Mean age ± SD, y                        28.9 ± 6.4                29.8 ± 5.6
Smokers, n (%)                          22 (15.6)                 14 (9.9)
Mean body mass index ± SD, kg/m²        25.3 ± 6.0                25.6 ± 5.6
Mean blood pressure ± SD, mm Hg
  Systolic                              112 ± 15                  110 ± 12
  Diastolic                             67 ± 11                   68 ± 10
Parity, n (%)
  0                                     91 (65)                   87 (61)
  1                                     39 (28)                   42 (30)
  2                                     9 (6)                     8 (6)
  >2                                    2 (1)                     5 (4)
Coexisting disease, n (%)
  Essential hypertension                10 (7)                    7 (5)
  Lupus/antiphospholipid syndrome       4 (3)                     1 (1)
  Diabetes                              2 (1)                     3 (2)
27
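“t” test
H0: There is no difference in mean birth weight of children from HSE and LSE in the population.

Critical ratio: $t = \dfrac{|\bar{X}_1 - \bar{X}_2|}{SD\,\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$

Pooled standard deviation: $SD = \sqrt{\dfrac{(n_1 - 1)\,SD_1^2 + (n_2 - 1)\,SD_2^2}{n_1 + n_2 - 2}}$

$SD = \sqrt{\dfrac{14 \times 0.27^2 + 9 \times 0.22^2}{23}} = 0.25$

$t = \dfrac{|2.91 - 2.26|}{0.25\,\sqrt{\dfrac{1}{15} + \dfrac{1}{10}}} = 6.36$

DF = n1 + n2 - 2 = 23. The calculated value exceeds the table value, so reject H0.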
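The pooled two-sample t calculation on this slide can be reproduced in a few lines. This is a sketch using only the summary statistics shown (means 2.91 and 2.26, SDs 0.27 and 0.22, n = 15 and 10); the function name `pooled_t` is illustrative. Carrying the unrounded pooled SD through gives a t of roughly 6.3, slightly below the slide's 6.36, which uses the SD rounded to 0.25.

```python
from math import sqrt

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with a pooled standard deviation."""
    df = n1 + n2 - 2
    pooled_sd = sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    t = abs(mean1 - mean2) / (pooled_sd * sqrt(1 / n1 + 1 / n2))
    return t, pooled_sd, df

# Summary statistics from the slide
t_stat, pooled_sd, df = pooled_t(2.91, 0.27, 15, 2.26, 0.22, 10)
print(f"pooled SD = {pooled_sd:.2f}, t = {t_stat:.2f}, DF = {df}")
```

The resulting t is then compared with the tabulated critical value for 23 degrees of freedom, as on the slide.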
28
GENERAL STEPS IN HYPOTHESIS TESTING
1) State the hypothesis to be tested
2) Select a sample and collect data
3) Calculate the test statistic
4) Evaluate the evidence against the null hypothesis
5) State the conclusion
29
Commonly used statistical tests
t test: compare two mean values
Analysis of variance: compare more than two mean values
Chi-square test: compare two proportions
Correlation coefficient: relationship between two continuous variables
30
Data entry format

Treatment   Age   Weight   Diabetes   Painscore-b   Painscore-a   Vomiting
1           21    50       1          9             6             0
1           24    53       0          10            9             0
1           25    55       1          9             9             1
1           28    50       0          10            6             1
1           29    60       0          10            5             0
1           20    65       0          10            8             0
0           26    60       0          9             9             0
0           25    90       1          9             9             1
0           24    80       1          9             9             1
0           28    89       0          10            8             1
0           22    86       1          10            9             1
0           22    45       0          10            9             0
31
Example: t test. Body temperature (°C)

        Simple febrile seizure (N = 25)   Febrile without seizure (N = 25)   P value
Mean    39.01                             38.64                              P < 0.001
SD      0.56                              0.45
32
Example: analysis of variance. Serum zinc level in simple febrile seizure patients by duration of seizure

Duration (min)   n    Mean    SD     P value
< 5              3    10.27   0.25   P < 0.001
5 to 10          18   9.02    0.81
> 10             4    6.90    0.98
33
Example: chi-square test. Characteristics of patients in the two groups

Duration of fever (hours)   Simple febrile seizure   Febrile without seizure   P value
< 24                        16                       6                         P < 0.05
More than 24                9                        19
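A chi-square statistic for this 2 x 2 table can be computed by hand from observed and expected counts. The sketch below uses the plain Pearson statistic without a continuity correction. The slides do not say which variant was used, so this is an assumption, and the function name is illustrative. With 1 degree of freedom the 5% critical value is 3.841.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table."""
    observed = ((a, b), (c, d))
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    n = a + b + c + d
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2

# Duration of fever < 24 h: 16 vs 6; more than 24 h: 9 vs 19
chi2 = chi_square_2x2(16, 6, 9, 19)
CRITICAL_5_PERCENT_DF1 = 3.841
print(f"chi-square = {chi2:.2f}, significant at 5%: {chi2 > CRITICAL_5_PERCENT_DF1}")
```

The statistic exceeds the critical value, which agrees with the P < 0.05 reported on the slide.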
34
Example: correlation. We found a negative correlation between serum zinc level and simple febrile seizure event (r = -0.86, P < 0.001)
35
Type 1 and Type 2 errors

             H0 true              H0 false / H1 true
Accept H0    Correct decision     Type 2 error
Reject H0    Type 1 error         Correct decision

α = P(Type 1 error)
β = P(Type 2 error)
Power = 1 - β
36
Multivariate problem. Main outcome is a continuous variable: linear regression. Main outcome is a dichotomous variable: logistic regression.
37
Bradford Hill's questions
Introduction: Why did you start?
Methods: What did you do?
Results: What did you find?
Discussion: What does it mean?
38
How to begin writing? Data → Tables → Methods, Results → Introduction, Discussion → Abstract → Title, Key words, References
39
Thank you