Biostatistics in Practice
Session 2: Quantitative and Inferential Issues II
Youngju Pak, Biostatistician


What did we learn in Session 1?
• Basic study design
• Parameters vs. statistics
• Inferential vs. descriptive statistics
• Categorical vs. quantitative data: why is the distinction important?
• Summarizing the data with tables and graphs: contingency tables, box plots, histograms, etc.
• How to run MYSTAT


Today’s topics
• Article: McCann, et al., Lancet 2007 Nov 3;370(9598):
  – Subject selection / randomization
  – Efficiency from study design
  – What statistics were used?
  – Experimental units / independence of measurements
• Normal distributions
• Confidence intervals & p-values

McCann, et al., Lancet 2007 Nov 3;370(9598):
• Title: Food additives and hyperactive behaviour in 3-year-old and 8/9-year-old children in the community: a randomised, double-blinded, placebo-controlled trial.
• Target population: children aged 3–4 and 8–9 years
• Study design: randomized, double-blinded, placebo-controlled crossover trial
• Sample size: 153 (3-year-olds) and 144 (8/9-year-olds) in Southampton, UK
• Objective: test whether intake of artificial food color and additives (AFCA) affects childhood behavior

McCann, et al., Lancet 2007 Nov 3;370(9598):
• Sampling: stratified sampling based on SES in Southampton, UK
• Baseline measure: 24-hour recall by the parent of the child’s pretrial diet
• Groups: three groups; for 3-year-olds:
  – Mix A: 20 mg of food colorings + 45 mg sodium benzoate (a widely used food preservative)
  – Mix B: 30 mg of food colorings + 45 mg sodium benzoate (current average daily consumption)
  – Placebo
  – For 8/9-year-olds: multiply these amounts by 1.25
• Crossover design: each participant receives one of 6 possible sequences, chosen at random.
• In a separate study with N=20, no significant difference in the look or taste of the drinks was found among the three groups, although the percentages differed when people were asked which diet type they had received: placebo (65%) > mix B (52%) > mix A (40%).
[Timeline figure: T0 (baseline), Weeks 1 to 6; labels: Randomize, Typical Diet, Washout]
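The "6 possible sequences" are simply the orderings of the three drinks. The following is a minimal sketch of how such sequences could be enumerated and assigned; the simple uniform draw, the seed, and the child IDs are illustrative assumptions, not the trial's actual randomization procedure.

```python
import itertools
import random

# Treatment labels taken from the slide.
TREATMENTS = ("mix A", "mix B", "placebo")


def all_sequences(treatments=TREATMENTS):
    """Return every possible order in which the treatments can be given."""
    return list(itertools.permutations(treatments))


def assign_sequences(child_ids, seed=2007):
    """Give each child one sequence drawn uniformly at random.

    Hypothetical scheme: the real trial may have used a different
    (e.g. blocked or stratified) randomization.
    """
    rng = random.Random(seed)
    sequences = all_sequences()
    return {child_id: rng.choice(sequences) for child_id in child_ids}


sequences = all_sequences()
assignment = assign_sequences(range(1, 7))  # six hypothetical child IDs

print(len(sequences))  # 3! = 6 orderings
for child_id, sequence in assignment.items():
    print(child_id, " -> ".join(sequence))
```

Fixing the seed makes the allocation reproducible, which is useful when an allocation list has to be audited later.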

McCann, et al., Lancet 2007 Nov 3;370(9598):
• Outcome: Global Hyperactivity Aggregate (GHA) score, aggregated from four measures:
  – Attention-Deficit Hyperactivity Disorder (ADHD) rating scale IV, completed by teachers; scaled 1–5, where a higher number means more hyperactive
  – Weiss-Werry-Peters (WWP) hyperactivity scale, completed by parents
  – Classroom observation code
  – Conners continuous performance test II (CPT II)

Why a standardized outcome measure?
GHA = Global Hyperactivity Aggregate, where a higher value ↔ more hyperactive.
For each child at each time:
Z1 = Z-score for ADHD from teachers
Z2 = Z-score for WWP from parents
Z3 = Z-score for ADHD in classroom
Z4 = Z-score for Conners on computer
where Z-score = (Score − Score at T0) / SD, so that each measure is on a similar scale.
GHA = mean of Z1, Z2, Z3, Z4
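The aggregation can be sketched in a few lines. All numbers below are made up for illustration, and the slide does not say which SD is used, so the per-measure SD values here are an assumption (for instance, the SD of that measure across children at baseline).

```python
from statistics import mean


def z_score(score, baseline_score, sd):
    """Z-score as defined on the slide: (Score - Score at T0) / SD."""
    return (score - baseline_score) / sd


def gha(scores, baseline, sds):
    """Mean of the per-measure Z-scores for one child at one time point."""
    return mean(z_score(scores[m], baseline[m], sds[m]) for m in scores)


# Hypothetical values for one child; keys name the four measures.
baseline = {"teacher": 2.5, "parent": 10.0, "classroom": 1.0, "computer": 55.0}
week_scores = {"teacher": 3.0, "parent": 12.0, "classroom": 1.5, "computer": 50.0}
sds = {"teacher": 1.0, "parent": 4.0, "classroom": 0.5, "computer": 10.0}

child_gha = gha(week_scores, baseline, sds)
print(child_gha)  # Z-scores 0.5, 0.5, 1.0, -0.5 average to 0.375
```

Because each measure is divided by its own SD, a measure recorded on a wide scale (such as the computer test here) does not dominate the average.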

Why the normal distribution?
Symmetric. One peak. Roughly bell-shaped. No outliers.
Many (parametric) statistical tests rely on the assumption that outcome measures follow the normal distribution.

A property of the normal distribution
For bell-shaped distributions of data (“normally” distributed):
~68% of values are within mean ± 1 SD
~95% of values are within mean ± 2 SD: the “(Normal) Reference Range”
~99.7% of values are within mean ± 3 SD
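These percentages can be checked directly from the normal distribution, for which the probability of falling within k SDs of the mean is erf(k/√2). A small sketch using only the Python standard library:

```python
import math


def coverage(k):
    """Probability that a normal value lies within k SDs of its mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within mean ± {k} SD: {coverage(k):.4f}")
```

The exact multiplier for 95% coverage is about 1.96; the slide's "2 SD" is the usual rounding.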

What if it is not normally distributed?
• Skewed: need to transform to another scale, e.g. intensity → log(intensity), or use nonparametric tests.
• Multi-peak: need to summarize with percentiles, not the mean; use nonparametric tests.
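To illustrate the skewed case, here is a sketch with made-up, strongly right-skewed "intensity" values. On the raw scale the mean is pulled far above the median (the 50th percentile); on the log scale the two agree.

```python
import math
from statistics import mean, median

# Hypothetical right-skewed measurements.
intensity = [1, 10, 100, 1000, 10000]
log_intensity = [math.log10(x) for x in intensity]

print(mean(intensity), median(intensity))          # mean far above median
print(mean(log_intensity), median(log_intensity))  # mean and median agree
```

When the mean and median differ this much on the raw scale, the mean is a poor summary of a typical value, which is why the slide suggests a transformation or percentile-based summaries.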

Representative or Random Samples
How were the children in the study selected (second column on the first page)? The authors purposely selected "representative" social classes. Is this better than a "randomly" chosen sample that ignores social class? We often hear: non-random = non-scientific.

Case Study: Participant Selection No mention of random samples.

Case Study: Participant Selection It may be that only a few schools are needed to get sufficient individuals. If, among all possible schools, there are few that are lower SES, none of these schools may be chosen. So, a random sample of schools is chosen from the lower SES schools, and another random sample from the higher SES schools.
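The idea of sampling separately within each SES stratum can be sketched as follows. The school names, stratum sizes, and number of schools drawn per stratum are hypothetical and are not taken from the article.

```python
import random


def stratified_sample(schools_by_stratum, n_per_stratum, seed=0):
    """Draw a simple random sample of schools within each stratum."""
    rng = random.Random(seed)
    return {
        stratum: rng.sample(schools, n_per_stratum)
        for stratum, schools in schools_by_stratum.items()
    }


# Hypothetical sampling frame: few lower-SES schools, many higher-SES ones.
schools_by_stratum = {
    "lower SES": ["L1", "L2", "L3"],
    "higher SES": [f"H{i}" for i in range(1, 21)],
}

chosen = stratified_sample(schools_by_stratum, n_per_stratum=2)
print(chosen)
```

With a single random draw of 4 schools from all 23, the sample could easily contain no lower-SES school; drawing within strata guarantees that both are represented.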

Non-Completing or Non-Adhering Subjects Is it really a random sample? If not, what are the problems?

Why Randomize?
So that groups will be similar except for the intervention. So that, when enrolling, we will not unconsciously choose an “appropriate” treatment for a particular subject. Randomizing minimizes the chance of introducing bias when attempting to systematically remove it, as in the plant yield example.

Case Study: Crossover Design Each child is studied on 3 occasions under different diets. Is this better than three separate groups of children? Why, intuitively? How could you scientifically prove your intuition?
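One way to make the intuition concrete is to compare the standard error of the diet effect when each child serves as their own control with the standard error obtained by treating the same scores as two separate groups. The scores below are invented for illustration: children differ a lot from one another, but each child's change between diets is fairly consistent.

```python
import math
from statistics import mean, stdev

# Hypothetical scores for 5 children, each measured under both diets.
placebo = [1.0, 2.1, 2.9, 4.0, 5.0]
active = [1.6, 2.5, 3.5, 4.4, 5.5]
n = len(placebo)

# Crossover (paired) analysis: work with within-child differences.
differences = [a - p for a, p in zip(active, placebo)]
effect = mean(differences)
paired_se = stdev(differences) / math.sqrt(n)

# Parallel-group analysis: treat the two sets of scores as unrelated groups.
unpaired_se = math.sqrt(stdev(active) ** 2 / n + stdev(placebo) ** 2 / n)

print(round(effect, 2), round(paired_se, 3), round(unpaired_se, 3))
```

The estimated effect is the same either way, but the paired standard error is much smaller because the large child-to-child variability cancels out of the differences.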

Estimated mean changes and their confidence intervals (line or profile plot)
What information is given by these confidence intervals?

Confidence Interval (CI)
How well does your sample mean (m) reflect the true (population) mean? How confident are you: 95%?
A confidence interval (CI) is an inferential statistic that estimates the true, unknown parameter with an interval rather than a single value.

Confidence Interval for Population Mean
95% reference range, or “normal range”: sample mean ± 2(SD)
95% confidence interval (CI) for the (true, but unknown) mean of the entire population: sample mean ± 2(SD/√N)
SD/√N is called the “standard error of the mean” (SEM).
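The two intervals differ only in whether the SD is divided by √N. A minimal sketch with hypothetical data, using the slide's approximate multiplier of 2:

```python
import math
from statistics import mean, stdev


def reference_range(values, k=2.0):
    """Approximate 95% range for individual values: mean ± k·SD."""
    m, sd = mean(values), stdev(values)
    return (m - k * sd, m + k * sd)


def confidence_interval(values, k=2.0):
    """Approximate 95% CI for the population mean: mean ± k·SD/√N."""
    m = mean(values)
    sem = stdev(values) / math.sqrt(len(values))
    return (m - k * sem, m + k * sem)


# Hypothetical measurements on 9 subjects.
values = [4.0, 5.0, 6.0, 5.0, 4.0, 6.0, 5.0, 5.0, 5.0]

print(reference_range(values))      # where individual values tend to fall
print(confidence_interval(values))  # where the true mean plausibly lies
```

The reference range stays roughly the same width as N grows, whereas the confidence interval shrinks in proportion to 1/√N.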

Confidence Interval: Case Study (Table 2)
Confidence interval: … ± 1.99(1.04/√73) = … ± 0.24 → … to 0.10
Normal range: … ± 1.99(1.04) = … ± 2.07 → … to …
Adjusted CI: close to …
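The half-widths quoted on this slide can be reproduced from the three numbers it gives (multiplier 1.99, SD 1.04, N = 73). The multiplier is presumably the t-distribution value for this sample size rather than the rounded 2, but that reading is an assumption; the sketch checks only the arithmetic.

```python
import math


def half_width(multiplier, sd, n=1):
    """Half-width of an interval: multiplier · SD / √n (n=1 gives the reference range)."""
    return multiplier * sd / math.sqrt(n)


# Values as quoted on the slide.
MULTIPLIER = 1.99
SD = 1.04
N = 73

ci_half_width = half_width(MULTIPLIER, SD, N)
range_half_width = half_width(MULTIPLIER, SD)

print(round(ci_half_width, 2))     # 0.24
print(round(range_half_width, 2))  # 2.07
```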


P-values
• Measure the strength of evidence against your null hypothesis (H0)
  – e.g., H0: no difference in mean GHA scores among the three diets
• Based on a statistical test
  – e.g., t-test statistic = signal / noise
  – If signal >> noise → statistically significant
• Usually, p < 0.05 is called “statistically significant”, in favor of Ha
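For a paired comparison, the "signal" is the mean within-child difference and the "noise" is its standard error. A sketch with invented differences (active diet minus placebo); converting the statistic to a p-value requires the t distribution with N − 1 degrees of freedom, which is left to statistical software here.

```python
import math
from statistics import mean, stdev

# Hypothetical within-child differences in GHA (active minus placebo).
differences = [0.5, -0.25, 0.75, 0.25, 0.0, 0.5, 0.25, 0.0]

signal = mean(differences)                                 # mean difference
noise = stdev(differences) / math.sqrt(len(differences))   # standard error
t_statistic = signal / noise

print(round(signal, 3), round(noise, 3), round(t_statistic, 2))
```

A larger mean difference, less variable differences, or more children all increase the statistic and so strengthen the evidence against H0.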

Experimental Units / Independence of Measurements

Units and Independence
Experiments may be designed such that each measurement does not give additional independent information. Many basic statistical methods require that measurements be “independent” for the analysis to be valid. In mathematics, two events are independent if and only if the occurrence of one event makes it neither more nor less probable that the other occurs.

Experimental Units in Case Study
What is the experimental unit in this study?
1. School
2. Child
3. Parent
4. GHA score (results from three diets)
Are all GHA scores (e.g., 153 × 3 diets = 459 GHA scores for the 3–4-year-old children) independent?
The analysis MUST account for this possible correlation (clustering) if it exists, e.g., with a mixed model allowing for clustering due to schools.

What have we learned today?

Announcements
Keys for HW 1 and HW 2 will be posted on the class website by Wednesday.