Basic statistics
European Molecular Biology Laboratory Predoc Bioinformatics Course, 17th Nov 2009
Tim Massingham


Introduction
Basic statistics
    What is a statistical test
    Count data and Simpson’s paradox
    A Lady drinking tea and statistical power
    Thailand HIV vaccine trial and missing data
    Correlation
Nonparametric statistics
    Robustness and efficiency
    Paired data
    Grouped data
Multiple testing adjustments
    Family Wise Error Rate and simple corrections
    More powerful corrections
    False Discovery Rate

What is a statistic
    Anything that can be measured
    Calculated from measurements
Branches of statistics
    Frequentist
    Neo-Fisherian
    Bayesian
    Lies
    Damn lies
    (and each of these can be split further)
Classical statistics
“If your experiment needs statistics, you ought to have done a better experiment.” Ernest Rutherford
“Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write.” H.G.Wells
“He uses statistics as a drunken man uses lamp-posts, for support rather than illumination.” Andrew Lang (+others)

Repeating the experiment
Imagine repeating the experiment: there is some variation in the statistic
[Figure: distribution of the statistic for the initial experiment and after 100, 1,000 and 10,000 repeats]

Frequentist inference
Repeated experiments are at the heart of frequentist thinking
    Have a “null hypothesis”
    Null distribution: what the distribution of the statistic would look like if we could repeatedly sample
    Compare the actual statistic to the null distribution: likely or unlikely?
Classical statistics: finding statistics for which the null distribution is known

Anatomy of a Statistical Test
    Value of statistic
    Density of null distribution
    P-value of test: probability of observing the statistic or something more extreme, equal to the area under the curve beyond the observed value
Density measures relative probability
Total area under curve equals exactly one
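
To make the "area under the curve" reading of a p-value concrete, here is a minimal sketch in R. It assumes a standard normal null distribution and an observed statistic of 1.8; both are illustrative choices, not values from the slides.

```r
# Assumption: the null distribution is standard normal and the
# observed statistic is 1.8 (illustrative values only).
observed <- 1.8

# Upper-tail area: probability of the statistic or something more extreme.
p_upper <- pnorm(observed, lower.tail = FALSE)

# The same area obtained by numerically integrating the density.
p_area <- integrate(dnorm, lower = observed, upper = Inf)$value

# The density integrates to one over the whole real line.
total_area <- integrate(dnorm, lower = -Inf, upper = Inf)$value

print(c(p_upper = p_upper, p_area = p_area, total_area = total_area))
```

The two ways of computing the tail area should agree up to the tolerance of the numerical integration.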

Some antiquated jargon
Critical values: dates back to when we used tables
Old way
    Calculate statistic
    Look up critical values for test
    Report “significant at 99% level” or “rejected null hypothesis at ….”
[Figure: size and critical value]
[Table: pre-calculated critical values by size (standard normal distribution)]

Power
Probability of correctly rejecting the null hypothesis (for a given size)
Power changes with size
[Figure: the null distribution and an alternative distribution, divided into “reject null” and “accept null” regions]
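
The dependence of power on size can be sketched numerically. The example below assumes a one-sided test with a normal(0, 1) null and a normal(2, 1) alternative; the shift of 2 and the function name `power_at_size` are hypothetical choices for illustration.

```r
# Assumptions (illustrative): null is normal(0, 1), the alternative is
# normal(2, 1), and we reject for large values of the statistic.
power_at_size <- function(size, shift = 2) {
  # Critical value that gives the requested size under the null.
  critical <- qnorm(size, lower.tail = FALSE)
  # Probability of exceeding the critical value under the alternative.
  pnorm(critical, mean = shift, lower.tail = FALSE)
}

sizes <- c(0.10, 0.05, 0.01)
power <- sapply(sizes, power_at_size)
print(data.frame(size = sizes, power = power))
```

A smaller size pushes the critical value further into the tail, so the power against the same alternative drops.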

Confidence intervals
Use the same construction to generate confidence intervals
Confidence interval = region which excludes unlikely values
For the null distribution, the “confidence interval” is the region in which we accept the null hypothesis. The tails, where we reject the null hypothesis, are the critical region.

Count data

(Almost) the simplest form of data we can work with
Each experiment gives us a discrete outcome
Have some “null” hypothesis about what to expect
Example: Are all jelly babies equally liked by PhD students?
            Yellow  Black  Red  Green  Orange
Observed        19     27   15     22      17
Expected        20     20   20     20      20

Chi-squared goodness of fit test
Summarize results and expectations in a table
            Yellow  Black  Red  Green  Orange
Observed        19     27   15     22      17
Expected        20     20   20     20      20
    jelly_baby <- c( 19, 27, 15, 22, 17)
    expected_jelly <- c(20,20,20,20,20)
    chisq.test( jelly_baby, p=expected_jelly, rescale.p=TRUE )
        Chi-squared test for given probabilities
    data: jelly_baby
    X-squared = 4.4, df = 4, p-value = 0.3546
    pchisq(4.4,4,lower.tail=FALSE)
    [1] 0.3545701

Chi-squared goodness of fit test
    jelly_baby <- c( 19, 27, 15, 22, 17)
    expected_jelly <- c(20,20,20,20,20)
    chisq.test( jelly_baby, p=expected_jelly, rescale.p=TRUE )
        Chi-squared test for given probabilities
    data: jelly_baby
    X-squared = 4.4, df = 4, p-value = 0.3546
What’s this “df”?
[Table: the observed counts for Yellow, Black, Red, Green and Orange with one entry replaced by “?”]
How much do we need to know to reconstruct the table?
    Number of samples
    Any four of the observations, or equivalent (ratios, for example)
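
The statistic reported by chisq.test can be reproduced by hand, which also shows where the four degrees of freedom come in. This is a minimal sketch using the counts from the R session above; the variable names are my own.

```r
observed <- c(19, 27, 15, 22, 17)   # counts from the slide
expected <- rep(20, 5)              # equal preference for all five kinds

# Pearson's statistic: sum of (observed - expected)^2 / expected.
statistic <- sum((observed - expected)^2 / expected)

# Five categories with a fixed total leave four degrees of freedom.
df <- length(observed) - 1
p_value <- pchisq(statistic, df = df, lower.tail = FALSE)

print(c(statistic = statistic, df = df, p_value = p_value))
```

The p-value is the upper-tail area of the chi-squared distribution with `df` degrees of freedom, which is what the pchisq call on the slide computes.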

More complex models
Specifying the null hypothesis entirely in advance is very restrictive
            Yellow  Black  Red  Green  Orange
Observed        19     27   15     22      17    4 df
Expected        20     20   20     20      20    0 df
Allow expected models that take some features from the data, e.g. the Red : Green ratio
Each feature is one degree of freedom
            Yellow  Black   Red  Green  Orange
Observed        19     27    15     22      17    4 df
Expected        20     20  16.2   23.8      20    1 df
16.2 : 23.8 = 15 : 22 and 16.2 + 23.8 = 40
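
A sketch of the test for this more complex model follows. It assumes the counts 19, 27, 15, 22, 17 from the earlier R code are in the column order Yellow, Black, Red, Green, Orange, and that the test is carried out with 4 - 1 = 3 degrees of freedom because one feature of the expected model was taken from the data; the slide does not show this calculation itself.

```r
observed <- c(Yellow = 19, Black = 27, Red = 15, Green = 22, Orange = 17)

# Null model: Yellow, Black and Orange expect 20 each; Red and Green
# share the remaining 40 in the observed 15:22 ratio.
expected <- c(Yellow = 20, Black = 20,
              Red = 40 * 15 / 37, Green = 40 * 22 / 37, Orange = 20)

statistic <- sum((observed - expected)^2 / expected)

# Observed data: 4 df. Expected model: 1 df (the Red:Green ratio).
df <- 4 - 1
p_value <- pchisq(statistic, df = df, lower.tail = FALSE)

print(round(expected, 1))
print(c(statistic = statistic, df = df, p_value = p_value))
```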

Example: Chargaff’s parity rule
Chargaff’s 2nd Parity Rule: in a single strand of dsDNA, %A ≈ %T and %C ≈ %G
Helicobacter pylori
[Table: counts of A, C, G and T]
From data, %AT = 61%, %CG = 39%
Apply Chargaff to get %{A,C,G,T}
[Table: expected proportion and number of A, C, G and T]
Null hypothesis has one degree of freedom
Alt. hypothesis has three degrees of freedom
Difference: two degrees of freedom

Contingency tables
Observe two variables in pairs: is there a relationship between them?
Silly example: is there a relationship between desserts and toppings?
            Pie  Fruit
Custard       a      c
Ice-cream     b      d
A test of row / column independence
Real example: McDonald-Kreitman test, Drosophila ADH locus mutations
                Between  Within
Nonsynonymous         7       2
Synonymous           17      42
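
As a sketch of how the McDonald-Kreitman table could be tested in R, the example below runs both a chi-squared test of independence and Fisher's exact test on the counts above. The choice of tests and the object names are mine; the slides do not say which test was applied to this table.

```r
# McDonald-Kreitman counts for the Drosophila ADH locus, as on the slide.
# matrix() fills column by column: Between = (7, 17), Within = (2, 42).
mk <- matrix(c(7, 17, 2, 42), nrow = 2,
             dimnames = list(c("Nonsynonymous", "Synonymous"),
                             c("Between", "Within")))

# Chi-squared test of row / column independence. R applies a continuity
# correction to 2x2 tables by default and warns when expected counts are
# small; the warning is suppressed here for brevity.
chi <- suppressWarnings(chisq.test(mk))

# Fisher's exact test as an alternative when some counts are small.
fisher <- fisher.test(mk)

print(mk)
print(c(chisq_p = chi$p.value, fisher_p = fisher$p.value))
```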

Contingency tables
A contingency table is a chi-squared test in disguise
Null hypothesis: rows and columns are independent, so multiply probabilities
            Pie          Fruit
Custard     p q          p (1-q)        p
Ice-cream   (1-p) q      (1-p)(1-q)     (1-p)
            q            (1-q)          1
Multiply by n to get the expected counts
            Pie & Custard  Pie & Ice-cream  Fruit & Custard  Fruit & Ice-cream
Observed    a              b                c                d
Expected    n p q          n (1-p) q        n p (1-q)        n (1-p)(1-q)

Contingency tables
            Pie & Custard  Pie & Ice-cream  Fruit & Custard  Fruit & Ice-cream
Observed    a              b                c                n - a - b - c
Expected    n p q          n (1-p) q        n p (1-q)        n (1-p)(1-q)
Observed:   three degrees of freedom (a, b & c)
Expected:   two degrees of freedom (p & q)
Chi-squared test with one degree of freedom
In general, for a table with r rows and c columns
Observed:   r c - 1 degrees of freedom
Expected:   (r-1) + (c-1) degrees of freedom
Difference: (r-1)(c-1) degrees of freedom
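
The construction of the expected table from the margins, and the degrees-of-freedom count, can be sketched as follows. The dessert counts are hypothetical (the slides only use the symbols a, b, c, d), and `k` is used for the number of columns to avoid masking R's `c()`.

```r
# Hypothetical dessert counts (not from the slides); rows are toppings,
# columns are desserts.
observed <- matrix(c(30, 10, 20, 40), nrow = 2,
                   dimnames = list(c("Custard", "Ice-cream"),
                                   c("Pie", "Fruit")))
n <- sum(observed)

# Estimate the margins from the data: one set of probabilities for the
# rows and one for the columns.
row_prob <- rowSums(observed) / n
col_prob <- colSums(observed) / n

# Under independence the expected count is n * row prob * column prob.
expected <- n * outer(row_prob, col_prob)

# Degrees of freedom: (r k - 1) - ((r - 1) + (k - 1)) = (r - 1)(k - 1).
r <- nrow(observed)
k <- ncol(observed)
df <- (r * k - 1) - ((r - 1) + (k - 1))

print(expected)
print(df)
```

The expected table built this way should match the `expected` component returned by chisq.test on the same matrix.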

Bisphenol A
Bisphenol A is an environmental estrogen monomer
Used to manufacture polycarbonate plastics
    lining for food cans
    dental sealants
    food packaging
Many in vivo studies on whether it is safe: could the polymer break down?
Is the result of the study independent of who performed it?
            Harmful  Non-harmful
Government       94           10
Industry          0           11
F vom Saal and C Hughes (2005) An Extensive New Literature Concerning Low-Dose Effects of Bisphenol A Shows the Need for a New Risk Assessment. Environmental Health Perspectives 113(8):928

Bisphenol A
Observed table
            Harmful  Non-harmful
Government       94           10    90.4%
Industry          0           11     9.6%
              81.7%        18.3%    115
Expected table
            Harmful  Non-harmful
Government     85.0         19.0    90.4%
Industry        9.0          2.0     9.6%
              81.7%        18.3%
E.g. 81.7% × 90.4% × 115 = 85.0
Chi-squared statistic = 48.56
Test with 1 d.f., p-value = 3.205e-12
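
The expected table and test can be reproduced from the observed counts alone. The sketch below assumes R's default chisq.test behaviour for 2×2 tables, which applies Yates' continuity correction; with that default the p-value should be close to the 3.205e-12 quoted on the slide, and the uncorrected statistic is shown for comparison.

```r
# Observed counts from the vom Saal and Hughes table.
# matrix() fills column by column: Harmful = (94, 0), Non-harmful = (10, 11).
bpa <- matrix(c(94, 0, 10, 11), nrow = 2,
              dimnames = list(c("Government", "Industry"),
                              c("Harmful", "Non-harmful")))

# Default 2x2 test, which applies Yates' continuity correction.
# R warns that some expected counts are small; suppressed here for brevity.
corrected <- suppressWarnings(chisq.test(bpa))

# The same test without the continuity correction, for comparison.
uncorrected <- suppressWarnings(chisq.test(bpa, correct = FALSE))

print(round(corrected$expected, 1))
print(c(corrected = unname(corrected$statistic),
        uncorrected = unname(uncorrected$statistic)))
print(corrected$p.value)
```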

Bisphenol A: association measure
Discovered that we have dependence. How strong is it?
Coefficient of association for a 2×2 table
    $\phi = \sqrt{\chi^2 / n}$, where $\chi^2$ is the chi-squared statistic and $n$ is the number of observations
Number between 0 = independent and 1 = complete dependence
For the Bisphenol A study data
Should the test really be one degree of freedom?
Reasonable to assume that government / industry was randomly assigned?
Perhaps the null model only has one degree of freedom
    pchisq( , df=1, lower.tail=FALSE)
    [1] e-12
    pchisq( , df=2, lower.tail=FALSE)
    [1] e-11
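
A sketch of the coefficient of association for the Bisphenol A counts is given below. It assumes the coefficient is the square root of the chi-squared statistic divided by the number of observations, and that the uncorrected statistic is used; the slides do not say which version of the statistic went into the coefficient, so the printed value is illustrative.

```r
bpa <- matrix(c(94, 0, 10, 11), nrow = 2)   # observed counts from the slides
n <- sum(bpa)

# Chi-squared statistic without continuity correction (an assumption).
chi_sq <- unname(suppressWarnings(chisq.test(bpa, correct = FALSE))$statistic)

# Coefficient of association: sqrt(chi-squared / number of observations).
phi <- sqrt(chi_sq / n)

# For a 2x2 table this equals |ad - bc| / sqrt(product of the margins).
phi_direct <- abs(bpa[1, 1] * bpa[2, 2] - bpa[1, 2] * bpa[2, 1]) /
  sqrt(prod(rowSums(bpa)) * prod(colSums(bpa)))

print(c(phi = phi, phi_direct = phi_direct))
```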

Simpson’s paradox
Famous example of Simpson’s paradox
C R Charig, D R Webb, S R Payne, and J E Wickham (1986) Comparison of treatment of renal calculi by open surgery, percutaneous nephrolithotomy, and extracorporeal shockwave lithotripsy. BMJ 292: 879–882
Compare two treatments for kidney stones
    open surgery
    percutaneous nephrolithotomy (surgery through a small puncture)
              Success  Fail
open              273    77    350    78% success
perc. neph.       289    61    350    83% success
Percutaneous nephrolithotomy appears better (but not significantly so, p-value 0.15)

Simpson’s paradox
Missed a clinically relevant factor: the size of the stones
Small kidney stones
              Success  Fail
open               81     6     87    93% success
perc. neph.       234    36    270    87% success
p-value 0.15
Large kidney stones
              Success  Fail
open              192    71    263    73% success
perc. neph.        55    25     80    69% success
p-value 0.55
The order of treatments is reversed
Combined: open 78%, perc. neph. 83%
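
The reversal can be checked directly. Only the open-surgery row for small stones (81, 6, 87) survives in this transcript, so the sketch below takes the remaining counts from the published Charig et al. table; they agree with the percentages on the slides. The helper `success_rate` is a hypothetical name.

```r
# Success / failure counts by stone size, as reported by Charig et al. (1986).
# Only the (81, 6) row is preserved in the transcript; the other counts are
# quoted from the paper and match the slide percentages.
small <- matrix(c(81, 234, 6, 36), nrow = 2,
                dimnames = list(c("open", "perc. neph."), c("Success", "Fail")))
large <- matrix(c(192, 55, 71, 25), nrow = 2,
                dimnames = list(c("open", "perc. neph."), c("Success", "Fail")))
combined <- small + large

# Percentage success for each treatment within a table.
success_rate <- function(tab) round(100 * tab[, "Success"] / rowSums(tab))

print(rbind(small = success_rate(small),
            large = success_rate(large),
            combined = success_rate(combined)))

# Test of treatment against outcome for the pooled data.
print(chisq.test(combined)$p.value)
```

Open surgery has the higher success rate within each stone size, yet the lower rate once the two groups are pooled.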

Simpson’s paradox
Small kidney stones
              Success  Fail
open               81     6     87
perc. neph.       234    36    270
Large kidney stones
              Success  Fail
open              192    71    263
perc. neph.        55    25     80
What’s happened?
Failure of randomisation (actually an observational study)
              Small  Large
open             87    263
perc. neph.     270     80
Small and large stones have a different prognosis (p-value < 1.2e-7)
              Small          Large
open          93% success    73% success
perc. neph.   87% success    69% success
total         88% success    72% success

A Lady Tasting Tea

The Lady Tasting Tea
“A LADY declares that by tasting a cup of tea made with milk she can discriminate whether the milk or the tea infusion was first added to the cup … Our experiment consists in mixing eight cups of tea, four in one way and four in the other, and presenting them to the subject for judgment in a random order. The subject has been told in advance of what the test will consist, namely that she will be asked to taste eight cups, that these shall be four of each kind, and that they shall be presented to her in a random order.”
Fisher, R. A. (1956) Mathematics of a Lady Tasting Tea
    Eight cups of tea
    Exactly four one way and four the other
    The subject knows there are four of each
    The order is randomised
              Guess tea  Guess milk
Tea first             ?           ?    4
Milk first            ?           ?    4
                      4           4    8

Fisher’s exact test
              Guess tea  Guess milk
Tea first             ?           ?    4
Milk first            ?           ?    4
                      4           4    8
Looks like a Chi-squared test, but the experiment design fixes the marginal totals
    Eight cups of tea
    Exactly four one way and four the other
    The subject knows there are four of each
    The order is randomised
Fisher’s exact test gives exact p-values with fixed marginal totals
Often incorrectly used when marginal totals not known

Sided-ness
Not interested if she can’t tell the difference
Two possible ways of being significant, only interested in one
Exactly right
              Guess tea  Guess milk
Tea first             4           0    4
Milk first            0           4    4
                      4           4    8
Exactly wrong
              Guess tea  Guess milk
Tea first             0           4    4
Milk first            4           0    4
                      4           4    8

Sided-ness
    tea <- rbind( c(0,4), c(4,0) )
    tea
         [,1] [,2]
    [1,]    0    4
    [2,]    4    0
    fisher.test(tea)$p.value
    [1] 0.02857143
    fisher.test(tea,alternative="greater")$p.value
    [1] 1
    fisher.test(tea,alternative="less")$p.value
    [1] 0.01428571
[Figure: null distribution running from “more correct” to “more wrong”]
Only interested in significantly greater
Just use area in one tail
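
With the marginal totals fixed, the null distribution is hypergeometric, and the one-tailed p-value can be written out directly. This is a minimal sketch for the eight-cup design; the helper `p_greater` is a hypothetical name.

```r
# Number of tea-first cups correctly called "tea" in each possible outcome,
# with four cups of each kind and four guesses of each kind.
correct <- 0:4
prob <- dhyper(correct, m = 4, n = 4, k = 4)   # null: she is guessing
names(prob) <- correct

# Counts out of choose(8, 4) = 70 equally likely arrangements.
print(prob * 70)

# One-sided p-value: probability of doing at least this well by chance.
p_greater <- function(x) sum(prob[correct >= x])
print(c(all_four = p_greater(4), three = p_greater(3)))

# Agreement with fisher.test for the perfect-score table.
perfect <- rbind(c(4, 0), c(0, 4))
print(fisher.test(perfect, alternative = "greater")$p.value)
```

Under this one-sided reading a perfect score has probability 1/70 under the null, and three or more correct has probability 17/70.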

Statistical power
Are eight cups of tea enough?
              Guess tea  Guess milk
Tea first             4           0    4
Milk first            0           4    4
                      4           4    8
A perfect score, p-value
              Guess tea  Guess milk
Tea first             3           1    4
Milk first            1           3    4
                      4           4    8
Better than chance, p-value

Statistical power
Assume the lady correctly guesses proportion p of the time
              Guess tea  Guess milk
Tea first     p n        (1-p) n      n
Milk first    (1-p) n    p n          n
              n          n            2n
To investigate
    simulate 10,000 experiments
    calculate p-value for each experiment
    take mean
[Table: results for several values of p, given as percentages]
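
The simulation described above could look something like the following sketch. It assumes the lady labels each cup correctly with probability p, independently of the other cups, and it uses a one-sided hypergeometric (Fisher) p-value; this simple model does not force her to guess exactly n cups of each kind, and the function name `simulate_mean_p` is hypothetical.

```r
# Assumptions of this sketch: each cup is labelled correctly with
# probability p, independently, and the p-value is one-sided.
simulate_mean_p <- function(p, n = 4, n_sim = 10000) {
  tea_right  <- rbinom(n_sim, n, p)             # tea-first cups called "tea"
  milk_right <- rbinom(n_sim, n, p)             # milk-first cups called "milk"
  guessed_tea <- tea_right + (n - milk_right)   # total number of "tea" guesses
  # P(at least tea_right correct) under the hypergeometric null.
  p_values <- phyper(tea_right - 1, n, n, guessed_tea, lower.tail = FALSE)
  mean(p_values)
}

set.seed(1)
for (p in c(0.7, 0.9)) {
  for (n in c(4, 8, 16)) {
    cat("p =", p, " cups of each kind =", n,
        " mean p-value =", signif(simulate_mean_p(p, n), 3), "\n")
  }
}
```

The mean p-value should shrink as either the proportion of correct guesses or the number of cups increases.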

Thailand HIV vaccine trials

Thailand HIV vaccine trial
News story from end of September 2009: “Phase III HIV trial in Thailand shows positive outcome”
16,402 heterosexual volunteers tested every six months for three years
[Table: sero +ve and sero -ve counts for the control and vaccine arms; total 16,395]
    fisher.test(hiv)
        Fisher's Exact Test for Count Data
    data: hiv
    p-value =
    alternative hypothesis: true odds ratio is not equal to 1
    95 percent confidence interval:
    sample estimates:
    odds ratio
Rerks-Ngarm, Pitisuttithum et al. (2009) New England Journal of Medicine /NEJMoa

Thailand HIV vaccine trial
"Oh my God, it's amazing. " … "The significance that has been established in this trial is that there is a 5% chance that this is a fluke. So we are 95% certain that what we are seeing is real and not down to pure chance. And that's great."
Significant? Would you publish? Should you publish?
Study was randomized (double-blinded)
Deals with many possible complications
    male / female balance between arms of trial
    high / low risk life styles
    volunteers who weren’t honest about their sexuality (or changed their mind mid-trial)
    genetic variability in population (e.g. CCL3L1)
    incorrect HIV test / samples mixed-up
    vaccine improperly administered

Thailand HIV vaccine trial
"Oh my God, it's amazing. " … "The significance that has been established in this trial is that there is a 5% chance that this is a fluke. So we are 95% certain that what we are seeing is real and not down to pure chance. And that's great."
Significant? Would you publish? Should you publish?
Initially published data based on modified Intent To Treat
    Intent To Treat: people count as soon as they are enrolled in the trial
    modified Intent To Treat: excluded people found to be sero +ve at the beginning of the trial
    Per-Protocol: only count people who completed the course of vaccines
A multiple testing issue?
How many unsuccessful HIV vaccine trials have there been? One or more and these results are toast.

Missing data
Some people go missing during / drop out of trials
This could be informative
    E.g. someone finds out they have HIV from another source and stops attending check-ups
Double-blinded trials help a lot
Extensive follow-up, hospital records of death etc.
    Missing Completely At Random: missing data completely unrelated to trial
    Missing At Random: missing data can be imputed
    Missing Not At Random: missing data informative about effects in trial
[Figure: test results over time for each volunteer, e.g. Volunteer 1: - - - - ? ?]

Correlation

Correlation?
[Figure: various types of data and their correlation coefficients (Wikimedia, public domain)]

Correlation does not imply causation
If A and B are correlated then one or more of the following are true
    A causes B
    B causes A
    A and B have a common cause (might be surprising)
Do pirates cause global warming?
Pirates:
R. Matthews (2000) Storks Deliver Babies (p = 0.008). Teaching Statistics 22(2):36-38

Pearson correlation coefficient
Standard measure of correlation: the “correlation coefficient”
Measure of linear correlation, statistic belongs to [-1,+1]
     0  no linear correlation (as for independent data)
     1  perfect positive correlation
    -1  perfect negative correlation
    cor.test( gene1, gene2, method="pearson")
        Pearson's product-moment correlation
    data: gene1 and gene2
    t = , df = 22808, p-value < 2.2e-16
    alternative hypothesis: true correlation is not equal to 0
    95 percent confidence interval:
    sample estimates:
    cor
Measure of correlation: the square of the coefficient is roughly the proportion of variance of gene1 explained by gene2 (or vice versa)

Pearson: when things go wrong
Pearson’s correlation test
200 observations from normal distribution: x ~ normal(0,1), y ~ normal(1,3)
[Figure: four scatter plots, each labelled with its correlation and p-value; the surviving p-values are 5.81e-06 and 6.539e-08 for the last two panels]
A single observation can change the outcome of many tests
Pearson correlation is sensitive to outliers
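
The effect of a single outlier can be sketched with simulated data like that on the slide. The example assumes normal(1, 3) means mean 1 and standard deviation 3, and the position of the added point (20, 60) is a hypothetical choice; the correlations it prints are not the ones from the original figure.

```r
set.seed(42)
# 200 independent observations, as on the slide; normal(1, 3) is read
# here as mean 1 and standard deviation 3 (an assumption).
x <- rnorm(200, mean = 0, sd = 1)
y <- rnorm(200, mean = 1, sd = 3)

# Add one extreme point (hypothetical values) far from the cloud.
x_out <- c(x, 20)
y_out <- c(y, 60)

results <- rbind(
  clean   = c(pearson  = cor(x, y),
              spearman = cor(x, y, method = "spearman")),
  outlier = c(pearson  = cor(x_out, y_out),
              spearman = cor(x_out, y_out, method = "spearman"))
)
print(round(results, 3))
```

The Pearson coefficient should jump once the extreme point is added, while the rank-based Spearman coefficient barely moves.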

Spearman’s test
Nonparametric test for correlation
    Doesn’t assume data is normal
    Insensitive to outliers
    Coefficient has roughly the same meaning
Replace observations by their ranks
[Figure: scatter plots of the raw expression values and of their ranks]
    cor.test( gene1, gene2, method="spearman")
        Spearman's rank correlation rho
    data: gene1 and gene2
    S = , p-value < 2.2e-16
    alternative hypothesis: true rho is not equal to 0
    sample estimates:
    rho
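
"Replace observations by their ranks" can be checked directly: Spearman's coefficient is Pearson's coefficient computed on the ranks. The vectors below are simulated stand-ins, not the course's expression data.

```r
set.seed(7)
# Hypothetical stand-ins for two expression vectors (not the course data).
gene1 <- rexp(500)
gene2 <- gene1 * rexp(500)

# Spearman's rho is Pearson's coefficient computed on the ranks.
rho_builtin <- cor(gene1, gene2, method = "spearman")
rho_by_rank <- cor(rank(gene1), rank(gene2), method = "pearson")

print(c(rho_builtin = rho_builtin, rho_by_rank = rho_by_rank))
```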

Comparison
Look at gene expression data
    cor.test(lge1,lge2,method="pearson")
        Pearson's product-moment correlation
    data: lge1 and lge2
    t = , df = 22808, p-value < 2.2e-16
    alternative hypothesis: true correlation is not equal to 0
    95 percent confidence interval:
    sample estimates:
    cor
    cor.test(lge1,lge2,method="spearman")
        Spearman's rank correlation rho
    data: lge1 and lge2
    S = , p-value < 2.2e-16
    alternative hypothesis: true rho is not equal to 0
    sample estimates:
    rho
    cor.test(lge1,lge2,method="kendall")
        Kendall's rank correlation tau
    data: lge1 and lge2
    z = , p-value < 2.2e-16
    alternative hypothesis: true tau is not equal to 0
    sample estimates:
    tau

Comparison
Spearman and Kendall are scale invariant: monotone transformations such as taking logs leave them unchanged
            Log - Log scale  Normal - Log scale  Normal - Normal scale
Pearson                0.97                0.56                   0.99
Spearman               0.98                0.98                   0.98
Kendall                0.90                0.90                   0.90
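
The scale invariance can be demonstrated on simulated data. The sketch below uses log-normal values as a hypothetical stand-in for expression measurements, so the numbers it prints will differ from the table above; only the pattern (rank-based coefficients unchanged, Pearson changing with the scale) is the point.

```r
set.seed(11)
# Hypothetical positive "expression" values (not the course data).
ge1 <- rlnorm(1000, meanlog = 5, sdlog = 2)
ge2 <- ge1 * rlnorm(1000, meanlog = 0, sdlog = 0.5)

methods <- c("pearson", "spearman", "kendall")

# One column per method, one row per choice of scale.
compare <- sapply(methods, function(m) c(
  log_log       = cor(log(ge1), log(ge2), method = m),
  normal_log    = cor(ge1, log(ge2), method = m),
  normal_normal = cor(ge1, ge2, method = m)
))
print(round(compare, 2))
```

Because taking logs preserves the ordering of the values, the ranks and therefore the Spearman and Kendall coefficients are identical across the three rows.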