1 Statistics Achim Tresch UoC / MPIPZ Cologne treschgroup.de/OmicsModule1415.html



2 II. Testing: Induction from the sample to the population
Estimation, regression: Measure in the sample → measure in the population? Variance? Confidence intervals?
Significance testing: Difference in the sample → difference in the population? Probability of a false call?

3 What allows us to draw conclusions from the sample about the population?
The sample has to be representative (figures about drug abuse among students cannot be generalized to the whole population of Germany).
How is representativeness achieved?
Large sample sizes.
Random recruitment of samples from the population, e.g. dial a random phone number, or choose a random name from the register of births (advantages/disadvantages?).
Randomization: random allocation of the samples to the different experimental groups.

4 A non-sheep detector
Training: Measure the length of all sheep that cross your way.

5 A non-sheep detector
Training: Measure the length of all sheep that cross your way. Determine the distribution of the quantity of interest.

6 A non-sheep detector
Testing: For any unknown animal, test the hypothesis that it is a sheep. Measure its length and compare it to the learned length distribution of the sheep. If its length is "out of bounds", the animal is called a non-sheep (rejection of the hypothesis). Otherwise, we cannot say much (non-rejection).
(Figure label: Not a sheep)

7 A non-sheep detector
Advantage of the method: One does not need to know much about sheep.
Disadvantage: It produces errors…
(Figure labels: decision boundary; negative calls, split into true negatives and false negatives; positive calls, split into true positives and false positives)
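
The training and testing steps of the non-sheep detector can be sketched in a few lines of Python. This is only an illustration: the training lengths are made up, and the bounds are derived under the assumption (not stated on the slides) that sheep lengths are roughly Gaussian.

```python
from statistics import NormalDist, mean, stdev

def learn_bounds(train_lengths, alpha=0.05):
    """Training: learn an interval covering about (1 - alpha) of the sheep lengths.

    Assumption (not from the slides): lengths are roughly Gaussian.
    """
    m, s = mean(train_lengths), stdev(train_lengths)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return m - z * s, m + z * s

def is_non_sheep(length, bounds):
    """Testing: call 'non-sheep' only if the length is out of bounds."""
    lower, upper = bounds
    return length < lower or length > upper

# Hypothetical training data: lengths (cm) of sheep that crossed our way.
sheep_lengths_cm = [118, 122, 125, 127, 130, 131, 133, 135, 138, 141]
bounds = learn_bounds(sheep_lengths_cm)

print(is_non_sheep(60, bounds))   # far too short: hypothesis "sheep" is rejected
print(is_non_sheep(129, bounds))  # within bounds: no rejection, we cannot say much
```

Note that the detector never learns anything about non-sheep; an animal of sheep-like length passes unnoticed, which is the false negative case.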

8 Statistical Hypothesis Testing
State a null hypothesis H0 ("nothing happens, there is no difference…").
Choose an appropriate test statistic (the data-derived quantity that finally leads to the decision).
This implicitly determines the null distribution (the distribution of the test statistic under the null hypothesis).

9 Statistical Hypothesis Testing
State an alternative hypothesis (e.g. "the test statistic is higher than expected under the null hypothesis").
Determine a decision boundary d. This is equivalent to the choice of a significance level α, i.e. the fraction of false positive calls you are willing to accept when the null hypothesis is true.
(Figure labels: acceptance region, rejection region, decision boundary d, significance level α)

10 Statistical Hypothesis Testing
Calculate the actual value of the test statistic in the sample, and make your decision according to the pre-specified(!) decision boundary d:
Keep H0 (no rejection), or
reject H0 (assume the alternative hypothesis).

11 Good test statistics, bad test statistics
Good statistic (figure: distribution of the test statistic under the null hypothesis and under the alternative hypothesis, with decision boundary d)
 | Accept null hypothesis | Reject null hypothesis
Null hypothesis is true | right decision | Type I error (False Positive)
Alternative is true | Type II error (False Negative) | right decision

12 Good test statistics, bad test statistics
Bad statistic (figure: distribution of the test statistic under the null hypothesis and under the alternative hypothesis, with decision boundary d)
 | Accept null hypothesis | Reject null hypothesis
Null hypothesis is true | right decision | Type I error (False Positive)
Alternative is true | Type II error (False Negative) | right decision

13 The Offenbach Oracle
Throw a 20-sided die.
Score = 20: reject the null hypothesis.
Score ≠ 20: keep the null hypothesis.
This is (independent of the null hypothesis) a valid statistical test at a 5% type I error level!
Toni, 29, Offenbach, mechanic and moral philosopher

14 The Offenbach Oracle
But: The distribution of the test statistic under the null hypothesis and under the alternative hypothesis is identical (distribution under H0 = distribution under H1). This test cannot discriminate between the two alternatives!
95% of the Positives will be missed (the oracle keeps H0 in 95% of all cases, whether H0 or H1 is true).
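
A small simulation makes the point concrete. The sketch below is an illustration only (function names, trial counts and seeds are arbitrary choices): because the oracle never looks at the data, its rejection rate is about 5% both when H0 is true and when H1 is true.

```python
import random

def offenbach_oracle(rng):
    """Reject the null hypothesis iff a 20-sided die shows 20. The data are never used."""
    return rng.randint(1, 20) == 20

def rejection_rate(n_trials, seed):
    """Fraction of trials in which the oracle rejects the null hypothesis."""
    rng = random.Random(seed)
    return sum(offenbach_oracle(rng) for _ in range(n_trials)) / n_trials

# The oracle ignores the data, so the truth of H0 or H1 cannot influence the outcome.
type_one_error = rejection_rate(100_000, seed=1)  # trials in which H0 is true
power = rejection_rate(100_000, seed=2)           # trials in which H1 is true

print(f"type I error ~ {type_one_error:.3f}, power ~ {power:.3f}")
```

A useful test needs a power well above its type I error level; here both are equal.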

15 The p-value
Given a test statistic and its actual value t in a sample, a p-value can be calculated: Each test value t maps to a p-value, which is the probability of observing a value of the test statistic that is at least as extreme as the actual value t [under the assumption of the null hypothesis].
Example in the figure: t = 4.2, p = 0.08

16 The p-value
Given a test statistic and its actual value t in a sample, a p-value can be calculated: Each test value t maps to a p-value, which is the probability of observing a value of the test statistic that is at least as extreme as the actual value t [under the assumption of the null hypothesis].
Example in the figure: t = 0.7, p = 0.42

17 Test decisions according to the p-value
The decision boundary d corresponds to the significance level α; the observed test statistic t corresponds to the p-value.
t is more extreme than d ⇔ p is smaller than α.
With α = 0.05:
p ≥ α (e.g. p = 0.83): keep H0 (no rejection).
p < α (e.g. p = 0.02): reject H0 (assume the alternative hypothesis).
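
The equivalence between "t is more extreme than d" and "p < α" can be checked numerically. The sketch below assumes a standard normal null distribution and a one-sided (upper tail) alternative; both are assumptions made for illustration, so the p-values differ from those shown in the figures of the slides.

```python
from statistics import NormalDist

null = NormalDist(0.0, 1.0)  # assumed null distribution of the test statistic
alpha = 0.05

def p_value(t):
    """One-sided p-value: P(T >= t) under the assumed null distribution."""
    return 1.0 - null.cdf(t)

# Decision boundary d: the null distribution puts mass alpha above d.
d = null.inv_cdf(1.0 - alpha)

for t in (0.7, 1.0, 2.0, 4.2):
    p = p_value(t)
    decision = "reject H0" if p < alpha else "keep H0"
    print(f"t = {t}: p = {p:.4f}, t > d: {t > d}, {decision}")
```

Comparing t with d and comparing p with α always lead to the same decision; the p-value additionally tells how close the call was.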

18 One- and two-sided hypotheses
One-sided alternative
H0: The value of a quantity of interest in group A is not higher than in group B.
H1: The value of a quantity of interest in group A is higher than in group B.
(Figure: acceptance region, followed by a single rejection region)

19 One- and two-sided hypotheses
Two-sided alternative
H0: The quantity of interest has the same value in group A and group B.
H1: The quantity of interest is different in group A and group B.
(Figure: acceptance region in the middle, with a rejection region on each side)
Generally, two-sided alternatives are more conservative: Deviations in both directions are detected.

20 Example "Testing": Colon Carcinoma
What about this fact?
Variable: Vaccine (scale: binary)
Endpoint: 4-year survival (scale: binary)
32 × 94% ≈ 30 survivors with vaccine; (62 − 32) × 77% ≈ 23 survivors without vaccine.

21 Example "Testing": Colon Carcinoma
4-year survival | Yes | No
Vaccine yes (n=32) | 30 (94%) | 2 (6%)
Vaccine no (n=30) | 23 (77%) | 7 (23%)
Interesting questions: Does the vaccine have any effect? Is this effect "significant"?

22 Example "Testing": Colon Carcinoma
Null hypothesis H0: Vaccination has no impact (either positive or negative) on the patients. The survival rates in the vaccine and non-vaccine group are the same in the whole population.
Alternative hypothesis H1: In the whole population, the survival rates in the vaccine and non-vaccine group are different.
Choose the significance level α (usually α = 5%, 1% or 0.1%).
Interpretation of the significance level α: If there is no difference between the groups, one obtains a false positive result with a probability of α.

23 Example "Testing": Colon Carcinoma
Choice of test statistic: "Fisher's Exact Test"
Sir Ronald Aylmer Fisher: theoretical biology, evolutionary theory, statistics

24 Example "Testing": Colon Carcinoma
Compute the value of the test statistic t after the experiment has been carried out. This value can be converted into a p-value: p = 7.7%.
Since we have chosen a significance level α = 5%, and p > α, we cannot reject the null hypothesis, thus we keep it.
Formulation of the result: At a 5% significance level (and using Fisher's Exact Test), no significant effect of vaccination on survival could be detected.
Consequence: We are not (yet) sufficiently convinced of the utility of this therapy. But this does not mean that there is no difference at all!
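
The p-value of this example can be reproduced with a short, dependency-free sketch of a two-sided Fisher's exact test. The function name is hypothetical, and the two-sided rule used here (sum the probabilities of all tables that are at most as probable as the observed one) is one common convention, assumed for illustration.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    With all margins fixed, the top-left cell follows a hypergeometric
    distribution under H0. The p-value sums the probabilities of all
    tables that are at most as probable as the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # The small relative tolerance guards against floating point ties.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

# Table from the slides: vaccine yes (30 survived, 2 died), vaccine no (23 survived, 7 died).
p = fisher_exact_two_sided(30, 2, 23, 7)
print(f"p = {p:.3f}")  # about 0.077, in line with the 7.7% on the slide
```

In practice one would rather call an established implementation from a statistics library; the sketch only shows what the test computes.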

25 "No test based upon the theory of probability can by itself provide any valuable evidence of the truth or falsehood of a hypothesis."
Neyman J, Pearson E (1933) Phil Trans R Soc A
(Portraits: Egon Pearson, Jerzy Neyman)
Non-significance ≠ equivalence
Statistics can never prove a hypothesis, it can only provide evidence.

26 Confidence intervals
95% confidence interval: An estimated interval which contains the "true value" of a quantity with a probability of 95%.
(Figure: a point estimate X with its interval estimate, e.g. % votes for the SPD in the EU elections)
(1 − α) confidence interval: An estimated interval which contains the "true value" of a quantity with a probability of (1 − α).
1 − α = confidence level, α = error probability
Use confidence intervals with caution!
I. Description

27 Specific statistical tests: Comparison of two group means
(Figure: gene expression measurements of gene A and gene B in group 1 and group 2)
Which gene is expressed at a higher level?

28 group 1 group 2 Hypothesis: The expression of gene g in group 1 is lower than in group 2. Data: Expression of gene g in different samples Decision for “lower expression“, if Test statistic, e.g. Difference of group means Two group comparison

29 Two group comparison
Bad idea: using the difference of group means d as the test statistic.
Problem: d is not scale invariant.
Solution: Divide d by an estimate s(d) of its standard deviation, computed from the two groups. This is the t-statistic, giving rise to the (unpaired) t-test.
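
The scaling problem can be demonstrated directly. In the sketch below the expression values are made up, and the standard error uses Welch's form (unequal variances), which is an assumption; the slides do not say which estimate of s(d) is meant.

```python
from math import sqrt
from statistics import mean, variance

def mean_difference(x, y):
    """Difference of group means d."""
    return mean(x) - mean(y)

def t_statistic(x, y):
    """d divided by an estimate of its standard deviation.

    Assumption: Welch's form of the standard error (unequal variances).
    """
    standard_error = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return mean_difference(x, y) / standard_error

# Hypothetical expression values of one gene in two groups.
group1 = [2.1, 2.5, 1.9, 2.3, 2.2]
group2 = [2.9, 3.1, 2.6, 3.3, 3.0]

# Rescale the measurements by a factor of 1000 (e.g. a change of units).
scaled1 = [1000 * v for v in group1]
scaled2 = [1000 * v for v in group2]

# d changes with the scale of the data ...
print(mean_difference(group1, group2), mean_difference(scaled1, scaled2))
# ... whereas the t-statistic stays the same.
print(t_statistic(group1, group2), t_statistic(scaled1, scaled2))
```

A decision boundary for d would have to be re-chosen for every unit of measurement, whereas a boundary for t carries over.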

30 Wilcoxon (rank sum) test (equiv. to Mann-Whitney test)
Question: Given independent samples in group 1 and group 2, are the values in group 1 smaller than in group 2?
(Figure: the measurements of both groups on the raw scale and on the rank scale)
Rank sum group 1 = 22
Rank sum group 2 = 33

31 Wilcoxon (rank sum) test (equiv. to Mann-Whitney test)
Choose the rank sum of group 1 as test statistic W.
(Figure: rank sum distribution for group 1, |group 1| = 5, |group 2| = 5, with the observed value W = 22)
P(W ≤ 22, given the groups do not differ in their location) = 0.15
The p-value corresponding to W can be computed exactly for small sample sizes. For large sample sizes, there exist good approximations.
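
For two groups of five, the exact null distribution of W is small enough to enumerate. The following sketch (hypothetical function name, no ties assumed) lists every possible assignment of five ranks to group 1 and counts those with a rank sum of at most 22.

```python
from itertools import combinations

def rank_sum_cdf(w, n1, n2):
    """P(W <= w) under H0, where W is the rank sum of group 1.

    Under H0 (and without ties), every choice of n1 ranks out of
    1..(n1 + n2) for group 1 is equally likely, so we enumerate them all.
    """
    ranks = range(1, n1 + n2 + 1)
    sums = [sum(subset) for subset in combinations(ranks, n1)]
    return sum(s <= w for s in sums) / len(sums)

p = rank_sum_cdf(22, 5, 5)
print(f"P(W <= 22) = {p:.2f}")
```

Of the 252 possible rank assignments, 39 give a rank sum of 22 or less, which is the 0.15 quoted on the slide. Full enumeration becomes infeasible quickly as the groups grow, hence the approximations for large samples.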

32 Summary: Two-group comparison of a continuous variable
Question: Do the measurements in the two groups differ in their location?
Gaussian data? yes → Paired samples? yes: paired two sample t-test; no: unpaired two sample t-test
Gaussian data? no → Paired samples? yes: Wilcoxon signed rank test; no: Wilcoxon rank sum test

33 Comparison of two binary variables
Example: Clinical trial, unpaired design (each test person receives only one treatment)
Medication | effect | no effect
Verum | 65 | 7
Placebo | 44 | 13
Question: Do the distributions in group 1 and group 2 differ?
Unpaired data: Fisher's exact test

34 Odds and Odds Ratio
 | heads | tails
Fair coin | 54 | 46
Bent coin | 82 | 18
Odds (= chances):
Odds (fair coin) = 54 : 46 = 1.17
Odds (bent coin) = 82 : 18 = 4.56
Odds ratio: the ratio of the two odds.
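
The odds and the odds ratio for the two coins can be computed as follows. The direction of the ratio (bent coin over fair coin) is an assumption; the reciprocal would be equally valid as long as the direction is reported.

```python
def odds(successes, failures):
    """Odds = number of successes per failure."""
    return successes / failures

odds_fair = odds(54, 46)  # heads : tails for the fair coin
odds_bent = odds(82, 18)  # heads : tails for the bent coin

# Assumed direction: odds of the bent coin relative to the fair coin.
odds_ratio = odds_bent / odds_fair

print(f"{odds_fair:.2f} {odds_bent:.2f} {odds_ratio:.2f}")  # 1.17 4.56 3.88
```

An odds ratio of 1 would mean that both coins have the same odds of showing heads.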

35 Comparison of two categorial variables
Unpaired data: Chi-square test (χ²-test)
Null hypothesis: 5yr survival is independent of tumor size.
(Cross table: tumor size versus 5yr survival, No / Yes)
In this example, p <

36 Comparison of two categorial variables
Unpaired data: Chi-square test (χ²-test)
Requirements:
Sample size sufficiently large (n ≥ 60).
Expected number of observations is not too small (≥ 5) for every possible observation, i.e. for every cell of the cross table.
Note that for binary data and large n, the chi-square test and Fisher's exact test are practically equivalent.
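
The requirements can be checked from the expected cell counts under independence. The sketch below is an illustration with hypothetical function names; it uses the plain chi-square statistic without continuity correction and, as an example, the colon carcinoma table from the earlier slides.

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Return (statistic, p-value, expected counts) for a 2x2 cross table.

    Plain Pearson chi-square statistic, no continuity correction.
    """
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    # Expected counts if row and column variable are independent.
    expected = [[r * c / n for c in cols] for r in rows]
    stat = sum((table[i][j] - expected[i][j]) ** 2 / expected[i][j]
               for i in range(2) for j in range(2))
    p = erfc(sqrt(stat / 2))  # chi-square survival function for 1 degree of freedom
    return stat, p, expected

def requirements_met(table, expected, min_n=60, min_expected=5):
    """Check n >= 60 and all expected counts >= 5."""
    n = sum(sum(r) for r in table)
    return n >= min_n and min(min(r) for r in expected) >= min_expected

# Colon carcinoma example: rows = vaccine yes / no, columns = survived / died.
table = [[30, 2], [23, 7]]
stat, p, expected = chi_square_2x2(table)

print(f"chi2 = {stat:.2f}, p = {p:.3f}, requirements met: {requirements_met(table, expected)}")
```

Here n = 62 is large enough, but fewer than 5 deaths are expected in each group, so the chi-square approximation is questionable and Fisher's exact test is the safer choice.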

37 Summary: Comparison of two categorial variables
Question: Do there exist differences in the distribution of one variable if grouped by the second variable?
Binary data? yes → Paired data? yes: McNemar test; no: Fisher's exact test
Binary data? no → Paired data? yes: (Bowker symmetry test); no: Chi-square (χ²) test

38 Summary: Description and Testing
Variable | Design | Numerical description | Graphical description | Test
Continuous | two sample | Medians, quartiles | 2 boxplots | Wilcoxon rank sum test, t-test*
Continuous | paired | Medians, quartiles of differences | Boxplot of differences | Wilcoxon signed rank test, paired t-test*
binary | two sample | Cross table, odds ratio | Barplot | Fisher's exact test
binary | paired | Cross table | Barplot | McNemar test
categorial | two sample | Cross table | 3D barplot | χ²-test
* If differences follow a normal distribution

39 Remarks on Testing
Data description is the mandatory first step of every statistical analysis / test.
Test results should report the outcome (significant / not significant) together with the p-value that has been obtained.
Never report a p-value of exactly 0! (why?)

40 Statistical significance ≠ relevance
For large sample sizes, even tiny differences may produce significant findings.
For small sample sizes, an observed relevant difference can be statistically insignificant.

41 Multiple Testing
Examples of multiple tests:
Testing of several endpoints (systolic and diastolic blood pressure, pulse, …)
Comparison of several groups (e.g., 4 groups require 6 pairwise two-group comparisons)
Let us set a significance level of 5%, and suppose the null hypothesis holds in all cases. → If we perform 6 tests, the probability of reporting at least one false positive finding can increase to 30%! (6 × 5% = 30% is the upper bound; for 6 independent tests the probability is 1 − 0.95^6 ≈ 26.5%.)

42 Multiple Testing, Bonferroni Correction
Remedy: Bonferroni correction
For m tests and a target significance level α, perform each individual test at a significance level of α/m (local significance level).
The probability of producing a false positive finding in at least one of the m tests is then at most α (multiple / global significance level).
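
The effect of the correction can be illustrated numerically. The sketch below assumes independent tests, which is a simplification: the Bonferroni bound itself holds for any dependence between the tests, but the exact probabilities computed here do not.

```python
from math import comb

def fwer_independent(alpha_local, m):
    """Probability of at least one false positive among m independent tests,
    each performed at level alpha_local, when all null hypotheses are true."""
    return 1 - (1 - alpha_local) ** m

m = comb(4, 2)  # 4 groups require 6 pairwise two-group comparisons
alpha = 0.05

uncorrected = fwer_independent(alpha, m)     # each test at level alpha
bonferroni = fwer_independent(alpha / m, m)  # each test at level alpha / m

print(f"m = {m}, uncorrected: {uncorrected:.3f}, Bonferroni: {bonferroni:.3f}")
```

Without correction, the probability of at least one false positive is about 26.5%, below the 30% upper bound; with the Bonferroni correction it stays just under the target level of 5%.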