+ Refresher in inferential statistics

+ Resources

+ Our basic question… Did something occur? Importantly, did what we predicted would occur actually transpire, i.e., is the world as we predicted? Why does this require statistics?

+ Is breastfeeding good for babies’ brains?

The association between breastfeeding and IQ is moderated by a genetic polymorphism (rs174575) in the FADS2 gene Caspi A et al. PNAS 2007;104: ©2007 by National Academy of Sciences

+ Overview
- Hypothesis testing
- p-values
- Type I vs. Type II errors
- Power
- Correlation
- Fisher’s exact test
- t-test
- Linear regression
- Non-parametric statistics (mostly for you to go over in your own time)

+ Hypothesis testing
1. Propose a null and an experimental hypothesis. Mistakes here may make the experiment un-analysable.
2. Consider the assumptions of the test: are they met? Statistical independence of observations. Distributions of the observations (Student's t distribution, normal distribution, etc.).
3. Compute the relevant test statistic: Student’s t-test -> t; ANOVA -> F; chi-squared -> χ².
4. Compute the likelihood of the test statistic: does it exceed your chosen threshold? Either reject or fail to reject the null hypothesis.

+ What mistakes can we make?
                      The World: Yes       The World: No
Your decision: Yes    correct detection    false positive
Your decision: No     false negative       correct rejection

+ Starting to make inferences… the Binomial: toss a coin

+ Dropping lots of coins... Pachinko

+ Normal compared to Binomial (n = 6, p = .5)

+ Distributions: normal (µ, σ); binomial (n, p)

+ Distributions: Poisson (lambda); Power. Examples: accidents in a period of time; publication rates

+ Testing what distribution you have

+ Why are things normal?

+ Central limit theorem: The mean of a large number of independent random variables (each with finite variance) is distributed approximately normally.
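
A minimal simulation sketch of the central limit theorem, using base R only. The exponential population, the sample size of 30, and the 5000 replicates are illustrative assumptions, not values from the slides.

```r
# Illustrative sketch: means of skewed (exponential) samples look roughly normal.
set.seed(1)                      # arbitrary seed, for reproducibility only
n_per_sample <- 30               # assumed sample size
n_reps       <- 5000             # assumed number of simulated samples

# Each replicate: draw n_per_sample exponential values (mean 1, sd 1), take the mean.
sample_means <- replicate(n_reps, mean(rexp(n_per_sample, rate = 1)))

# Theory: the means centre on 1 with standard deviation 1 / sqrt(n_per_sample).
clt_summary <- c(mean_of_means  = mean(sample_means),
                 sd_of_means    = sd(sample_means),
                 theoretical_sd = 1 / sqrt(n_per_sample))
print(round(clt_summary, 3))

# hist(sample_means) would show the roughly bell-shaped distribution.
```

Even though single exponential values are strongly skewed, the simulated means should cluster symmetrically around 1.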

+ Hypothesis testing: Making statistical decisions using experimental data. We need to form a null hypothesis (we can reject, but not confirm, hypotheses). A result is “significant” if it is unlikely to have occurred by chance. Ronald Fisher: “We may discover whether a second sample is or is not significantly different from the first”.

+ What mistakes can we make?
                      The World: Yes       The World: No
Your decision: Yes    correct detection    false positive
Your decision: No     false negative       correct rejection

+ Error
Type-I error: false alarm, a bogus effect. We reject the null hypothesis when it is really true. Much of published science is Type-I error (Ioannidis, 2008).
Type-II error: missing a real effect. We fail to reject the null hypothesis when it is false. Many small projects have this problem.
Type-III error :-) lazy, incompetent, or willful ignorance of the truth.

+ p-values: Almost any difference (a count, a difference in means, a difference in variances) can be found with some probability, irrespective of the true situation. All we can do is set a threshold likelihood for deciding that an event occurred by chance. p = .05 means that, when the null hypothesis is true, a result at least this large would occur by chance 1 time in 20.
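
A small simulation sketch of the "1 time in 20" reading of p = .05, using base R. The group size of 20 and the 2000 simulated experiments are assumptions chosen for illustration.

```r
# Illustrative sketch: how often does p < .05 occur when the null is true?
set.seed(2)                      # arbitrary seed
n_sims <- 2000                   # assumed number of simulated experiments

p_values <- replicate(n_sims, {
  a <- rnorm(20)                 # both groups come from the same population,
  b <- rnorm(20)                 # so the null hypothesis is true by construction
  t.test(a, b)$p.value
})

false_alarm_rate <- mean(p_values < 0.05)
print(false_alarm_rate)          # expected to be close to 0.05
```

Every "significant" result in this simulation is a false positive, since no real difference exists.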

+ Type I vs. Type II errors
Type I: false positive. Likelihood of a Type I error = α. Using p < .05 as the criterion = setting α to .05.
Type II: false negative. Likelihood of a Type II error = β. Power = 1 - β.
             World: Yes                   World: No
You: Yes     Correct detection (power)    Type I (α)
You: No      Type II (β)                  Correct rejection

+ P-values: The p-value is the likelihood, if the null hypothesis is true, of mean differences as large as or larger than those observed in the data occurring by chance. p-value criteria (alpha) allow us a binary answer to our questions. Questions: is a smaller p-value “more” significant? Does it indicate a “bigger” effect (if so, when)? And how could we measure “effect”?

+ Compare these two statements: “It’s ‘significant’, but how big is the effect?” “I can see it’s big: but what is the p-value?”

+ Confidence Intervals: Range of values within a given likelihood threshold (for instance 95%). Closely related to p-values: α = 1 - confidence level, i.e., if p < .05, the 95% CI for a difference will not include 0 (no difference). Would you rather have a CI or a p-value? Why? What is an effect size?

+ P and CI: You can’t go from p alone to a CI! You can go from a CI to p. For two independent means: at p = .05, the 95% CIs will overlap by less than 25% of their length; at p = .01, the 95% CI bars just touch.

+ Units of a Confidence Interval: Unlike p, CIs are given in the units of the DV (Cumming and Finch, 2005). For example, BMI in people on a low-carb diet might be 19-23 kg/m². Cumming, G. and Finch, S. (2005). Inference by eye: confidence intervals and how to read pictures of data. American Psychologist, 60.

+ Standard Errors and Standard Deviations: SE is (typically) the standard error of the mean: the precision with which we have estimated the population mean based on our sample. Computationally, it is σ/sqrt(n), with the sample standard deviation used in place of σ in practice. A 95% confidence interval is the mean ± 1.96 SE (for large samples).
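
A short worked sketch of the standard error and the resulting 95% interval in base R. The eight scores are hypothetical data made up for the example; the t-based interval is shown alongside the ± 1.96 SE rule because the sample is small.

```r
# Illustrative sketch: standard error of the mean and an approximate 95% CI.
scores <- c(12, 15, 11, 14, 13, 16, 10, 13)   # hypothetical data
n      <- length(scores)

se   <- sd(scores) / sqrt(n)                  # sample SD stands in for sigma
ci_z <- mean(scores) + c(-1.96, 1.96) * se    # large-sample (normal) interval
ci_t <- mean(scores) + qt(c(0.025, 0.975), df = n - 1) * se  # small-sample (t) interval

print(round(c(mean = mean(scores), se = se), 3))
print(round(rbind(normal = ci_z, t = ci_t), 3))
```

With only eight observations the t interval is noticeably wider than the ± 1.96 SE interval, which is why the 1.96 multiplier is best reserved for large samples.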

+ Example: coin toss. Random sample of 100 tosses of a coin believed to be fair. We observed 45 heads and 55 tails: is the coin fair?

+ Binomial test
binom.test(x = 45, n = 100, p = .5, alternative = "two.sided")
number of successes = 45, number of trials = 100, p-value = …
alternative hypothesis: true probability of success != …
… percent confidence interval: …
sample estimates: probability of success: 0.45
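
A sketch of how the same coin-toss test can be run and cross-checked by hand in base R. The hand calculation relies on the null distribution being symmetric when p = 0.5; the variable names are illustrative.

```r
# Exact binomial test for 45 heads in 100 tosses of a supposedly fair coin.
fit <- binom.test(x = 45, n = 100, p = 0.5, alternative = "two.sided")

# Same p-value by hand: with p = 0.5 the null distribution is symmetric,
# so the two-sided p-value is twice the lower tail P(X <= 45).
p_by_hand <- 2 * pbinom(45, size = 100, prob = 0.5)

print(c(binom_test = fit$p.value, by_hand = p_by_hand))
print(fit$conf.int)   # exact (Clopper-Pearson) 95% interval for the proportion
```

If the interval contains 0.5, the data give no grounds for rejecting the hypothesis that the coin is fair.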

+ Categorical Data: Fisher’s Exact Test. For categorical data resulting from classifying objects in two different ways. Tests the significance of the observed "contingency" (association) between the two classifications. Fisher, R. A. (1922). On the interpretation of χ² from contingency tables, and the calculation of P. Journal of the Royal Statistical Society, 85(1).

+ The Lady Drinking Tea
Question: Does tea taste better if the milk is added to the tea, or vice versa?
Null hypothesis: The drinker cannot tell.
Subjects: Ms Bristol.
Experiment: 8 "trials" (cups), 4 made each way, in random order.
DV: Milk-first versus milk-second discrimination.
Enter the data into a 2 x 2 contingency table.

+ Fisher Contingency Table
A = c(1, 1, 1, 0, 1, 0, 0, 0) # vector of guesses
B = c(1, 1, 1, 1, 0, 0, 0, 0) # vector of teas (truth)
guessTable <- table(A, B) # contingency table
labels = list(Guess = c("Milk", "Tea"), Truth = c("Milk", "Tea")) # make labels
dimnames(guessTable) = labels # add labels
fisher.test(guessTable, alternative = "greater") # test
Resulting table:
              Guess: Milk   Guess: Tea
Truth: Milk   3             1
Truth: Tea    1             3

+ Can she tell?
Fisher's Exact Test for Count Data
p-value = 0.24 # association could not be established
Alternative hypothesis: true odds ratio is greater than 1
95% confidence interval: … to Inf
Sample odds ratio: 6.40
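
A sketch of where the p-value of 0.24 comes from, assuming the 3-1-1-3 table from the tea-tasting slide. Under the null hypothesis the number of correctly identified milk-first cups follows a hypergeometric distribution, so the one-sided p-value can be summed directly; `tea`, `p_by_hand` and `p_fisher` are illustrative names.

```r
# 2 x 2 table from the tea-tasting example: 3 of 4 cups of each kind identified correctly.
tea <- matrix(c(3, 1, 1, 3), nrow = 2,
              dimnames = list(Guess = c("Milk", "Tea"), Truth = c("Milk", "Tea")))

# Under the null, the number of correct "milk" calls is hypergeometric:
# 4 milk-first cups, 4 tea-first cups, 4 cups labelled "milk" by the taster.
p_by_hand <- sum(dhyper(3:4, m = 4, n = 4, k = 4))   # P(3 or 4 correct) = 17/70

# fisher.test() performs the same calculation.
p_fisher <- fisher.test(tea, alternative = "greater")$p.value

print(c(by_hand = p_by_hand, fisher = p_fisher))
```

Getting 3 of 4 right happens by pure guessing about a quarter of the time, so only a perfect 4 of 4 (probability 1/70) would be convincing with eight cups.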

+ What if we have two continuous variables? Are they related? Q: If you have continuous depression scores and cut-off scores, which is more powerful?

+ Correlation of two continuous variables: Pearson’s r (all variables continuous)

+ Correlation: what are the maximum and minimum correlations?

+ Power (1 - β): Probability that a test will correctly reject the null hypothesis. Complement of the false negative rate, β. False negative = missing a real effect. 1 - β = p(correctly reject a false null hypothesis)

+ Power and how to get it Probability of rejecting the null hypothesis when it is false Whence comes power?

+ Power applied to a correlation
Samples of n = 30 from a population in which two normal traits correlate 0.3:
library(MASS) # provides mvrnorm()
library(ggplot2) # provides qplot()
r = 0.3
xy = mvrnorm(n = 30, mu = rep(0, 2), Sigma = matrix(c(1, r, r, 1), nrow = 2, ncol = 2))
xy = data.frame(xy); names(xy) <- c("x", "y")
qplot(x, y, data = xy, geom = c("point", "smooth"), method = lm)

+ Power of a correlation test
library(pwr)
pwr.r.test(n = 30, r = .3, sig.level = 0.05)
n = 30; r = 0.3; sig.level = 0.05; power = …; alternative = two.sided
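
A simulation sketch of the same power question using base R only, as an alternative to the analytic calculation. The construction of y from x and independent noise is one standard way to obtain a population correlation of 0.3; the 2000 replicates and the variable names are assumptions.

```r
# Illustrative sketch: estimate power to detect r = .3 with n = 30 by simulation.
set.seed(3)                      # arbitrary seed
r_true <- 0.3
n      <- 30
n_sims <- 2000                   # assumed number of simulated studies

significant <- replicate(n_sims, {
  x <- rnorm(n)
  # y shares variance with x so that the population correlation is r_true
  y <- r_true * x + sqrt(1 - r_true^2) * rnorm(n)
  cor.test(x, y)$p.value < 0.05
})

power_estimate <- mean(significant)   # proportion of studies that reject the null
print(power_estimate)
```

The estimate should fall well short of the conventional 0.8 target, which is the point of the example: a true correlation of 0.3 is usually missed with 30 participants.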

+ Power: r =.3

+ t-test: When we wish to compare means in a sample, we must estimate the standard deviation from the sample. Student's t-distribution describes the standardized mean of small samples from normally varying populations when the standard deviation is estimated in this way.

+ t-distribution function: t is defined as the ratio Z / sqrt(V/ν), where Z is normally distributed with expected value 0 and variance 1, V has a chi-square distribution with ν degrees of freedom, and Z and V are independent.

+ Normal and t-distributions: Normal is in blue; green = t with df = 1; red = t with df = 3 (far right = df increasing to 30)

+ Power of t-test
power.t.test(n = 15, delta = .5)
Two-sample t test power calculation
n = 15; delta = 0.5; sd = 1; sig.level = 0.05; power = 0.26; alternative = two.sided
NOTE: n is number in *each* group
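
A sketch of turning the same base R function around to ask for a sample size rather than a power. The 80% target is a conventional choice assumed here, not a value from the slide.

```r
# Power for the slide's design: 15 per group, true difference of 0.5 SD.
pow_15 <- power.t.test(n = 15, delta = 0.5, sd = 1, sig.level = 0.05)$power

# Leave n unspecified and ask for 80% power to solve for the sample size instead.
n_for_80 <- power.t.test(delta = 0.5, sd = 1, sig.level = 0.05, power = 0.80)$n

print(c(power_n15 = round(pow_15, 2), n_per_group = ceiling(n_for_80)))
```

power.t.test() solves for whichever one of n, delta, sd, sig.level, or power is left unspecified, so the same call pattern covers both planning questions.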

+ Linear regression

+ fit = lm(y ~ x1 + x2 + x3, data = mydata)
summary(fit) # show results
anova(fit) # anova table
coefficients(fit) # model coefficients
confint(fit, level = 0.95) # CIs for model parameters
fitted(fit) # predicted values
residuals(fit) # residuals
influence(fit) # regression diagnostics
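
A runnable sketch of the same workflow on a concrete data set. The built-in mtcars data and the choice of weight and horsepower as predictors of fuel economy are assumptions made purely for illustration.

```r
# Illustrative sketch: regress fuel economy on weight and horsepower (built-in mtcars data).
fit <- lm(mpg ~ wt + hp, data = mtcars)

print(summary(fit)$coefficients)   # estimates, standard errors, t and p values
ci <- confint(fit, level = 0.95)   # 95% CIs for intercept and both slopes
r2 <- summary(fit)$r.squared       # proportion of variance explained

print(ci)
print(round(r2, 3))
```

The confidence intervals are reported in the units of the outcome per unit of each predictor (miles per gallon per 1000 lb for wt), which is what makes them easier to interpret than the p-values alone.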

+ Nonparametric Statistics Timothy C. Bates

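+ Bootstrapping: Kurtosis differences
library(psych) # provides kurtosi()
kurtosisDiff <- function(x, y, B = 1000){
  kx <- replicate(B, kurtosi(sample(x, replace = TRUE))) # bootstrap kurtosis of x
  ky <- replicate(B, kurtosi(sample(y, replace = TRUE))) # bootstrap kurtosis of y
  return(kx - ky) # B bootstrap differences
}
kurtDiff … 0) # p = NS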
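
A self-contained sketch of the same bootstrap idea in base R. The generic helper `boot_diff()`, the hand-written `excess_kurtosis()` (standing in for psych's kurtosi(), whose exact estimator may differ), and the two simulated samples are all hypothetical and introduced only for illustration.

```r
# Generic bootstrap of a difference in any statistic between two samples.
boot_diff <- function(x, y, stat, B = 1000) {
  sx <- replicate(B, stat(sample(x, replace = TRUE)))  # resample x with replacement
  sy <- replicate(B, stat(sample(y, replace = TRUE)))  # resample y with replacement
  sx - sy                                              # B bootstrap differences
}

# Excess kurtosis from central moments (hypothetical helper).
excess_kurtosis <- function(v) {
  m <- mean(v)
  mean((v - m)^4) / mean((v - m)^2)^2 - 3
}

set.seed(4)                      # arbitrary seed
x <- rnorm(200)                  # hypothetical sample 1
y <- rt(200, df = 5)             # hypothetical sample 2 (heavier tails)

kurt_diff <- boot_diff(x, y, excess_kurtosis, B = 1000)
ci <- quantile(kurt_diff, c(0.025, 0.975))   # percentile 95% interval
print(ci)
```

If the percentile interval excludes 0, the two samples plausibly differ in kurtosis; passing a different function as `stat` (for example `median`) reuses the same machinery for other statistics.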

+ Parametric Statistics 1: Assume the data are drawn from populations with a certain distribution (usually normal). Compute the likelihood that groups are related/unrelated or same/different given that underlying model. Examples: t-test, Pearson’s correlation, ANOVA…

+ Parametric Statistics 2: Assumptions of parametric statistics: 1. Observations are independent. 2. Your data are normally distributed. 3. Variances are equal across groups (tests can be modified to cope with unequal σ²).

+ Non-parametric Statistics? Non-parametric statistics do not assume any underlying distribution. They compute the likelihood that your groups are the same or different by comparing the ranks of subjects across the range of scores.

+ Non-parametric Statistics Assumptions of non-parametric statistics 1. Observations are independent

+ Non-parametric Statistics? Non-parametric statistics do not assume any underlying distribution. Not estimating or modeling this distribution reduces their power to detect effects… So don’t use them unless you have to.

+ Why use a Non-parametric Statistic? Very small samples: these lead to Type-I (false alarm) errors. Outliers: these more often lead to spurious Type-I (false alarm) errors in parametric statistics. Nonparametric statistics reduce data to an ordinal rank, which reduces the impact or leverage of outliers.

+ Non-parametric Choices
Data type?
  discrete -> χ²
  continuous -> Question?
    association -> Spearman’s Rank
    different central value -> Number of groups?
      two groups -> Mann-Whitney U / Wilcoxon’s Rank Sums
      more than 2 -> Kruskal-Wallis test
    difference in σ² -> Brown-Forsythe

+ Non-parametric Choices (with parametric counterparts)
Data type?
  discrete -> χ² (no alternative)
  continuous -> Question?
    association -> Spearman’s Rank (like a Pearson’s r)
    different central value -> Number of groups?
      two groups -> Mann-Whitney U / Wilcoxon’s Rank Sums (like Student’s t)
      more than 2 -> Kruskal-Wallis test (like ANOVA)
    difference in σ² -> Brown-Forsythe (like F-test)

+ Binomial test
binom.test(45, 100, .5, alternative = "two.sided")
number of successes = 45, number of trials = 100, p-value = …
alternative hypothesis: true probability of success is not equal to …
… percent confidence interval: …
sample estimates: probability of success 0.45
binom.test(51, 235, (1/6), alternative = "greater")

+ Spearman Rank test (ρ, rho): Named after Charles Spearman. Non-parametric measure of correlation. Assesses how well an arbitrary monotonic function describes the relationship between two variables. Does not require the relationship to be linear. Does not require interval measurement.

+ Spearman Rank (ρ, rho): d = difference in rank of a given pair; n = number of pairs. Alternative test = Kendall's Tau (Kendall's τ)
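
A worked sketch of how d and n enter the calculation, assuming the usual textbook formula for untied ranks, ρ = 1 - 6 Σd² / (n(n² - 1)). The five pairs of scores are hypothetical.

```r
# Illustrative sketch: Spearman's rho from rank differences (no ties).
x <- c(10, 20, 30, 40, 50)   # hypothetical scores
y <- c(1, 3, 2, 5, 4)

d <- rank(x) - rank(y)       # difference in rank within each pair
n <- length(x)               # number of pairs

rho_by_hand <- 1 - 6 * sum(d^2) / (n * (n^2 - 1))
rho_builtin <- cor(x, y, method = "spearman")   # Pearson correlation of the ranks

print(c(by_hand = rho_by_hand, builtin = rho_builtin))
```

The two values agree only when there are no ties; with tied ranks the shortcut formula is approximate, and cor(..., method = "spearman") is the safer choice.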

+ Mann-Whitney U: AKA the “Wilcoxon rank-sum test” (Mann & Whitney, 1947; Wilcoxon, 1945). Non-parametric test for a difference in the medians of two independent samples (strictly, for a shift in location when the two distributions have a similar shape). Assumptions: samples are independent; observations can be ranked (ordinal or better).

+ Mann-Whitney U: U tests the difference in the medians of two independent samples. n1 = number of obs in sample 1; n2 = number of obs in sample 2; R = sum of ranks of the lower-ranked sample.
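
One common textbook form of the statistic that is consistent with these symbols, offered as an assumption since the slide lists only the symbols, takes sample 1 to be the lower-ranked sample and writes R_1 for its rank sum:

```latex
U = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1
```

For example, with n_1 = n_2 = 6 and R_1 = 32, U = 36 + 21 - 32 = 25.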

+ Mann-Whitney U or t? Should you use it over the t-test? Yes, if you have a very small sample (<20; central limit assumptions not met) or if your data are really ordinal. Otherwise, probably not. It is less prone to Type-I error (spurious significance) due to outliers, but it does not in fact handle comparisons of samples whose variances differ very well (use an unequal-variance t-test with rank data).

+ Wilcoxon signed-rank test (related samples): Same idea as Mann-Whitney U, generalized to matched samples. Non-parametric counterpart of the non-independent (paired) sample t-test.

+ Kruskal-Wallis: Non-parametric one-way analysis of variance by ranks (named after William Kruskal and W. Allen Wallis). Tests equality of medians across groups. It is an extension of the Mann-Whitney U test to 3 or more groups. Does not assume a normal population, but does assume that population variances among groups are equal.
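
A worked sketch of the rank-based H statistic next to base R's kruskal.test(). The three fully separated groups are hypothetical, and the hand formula H = 12 / (N(N + 1)) Σ(R_i² / n_i) - 3(N + 1) is the standard no-ties form, assumed here rather than taken from the slide.

```r
# Illustrative sketch: Kruskal-Wallis H by hand and via kruskal.test().
g1 <- 1:5; g2 <- 6:10; g3 <- 11:15            # hypothetical, fully separated groups
scores <- c(g1, g2, g3)
group  <- factor(rep(c("a", "b", "c"), each = 5))

N         <- length(scores)
rank_sums <- tapply(rank(scores), group, sum) # R_i: rank sum for each group
n_i       <- tapply(scores, group, length)    # n_i: group sizes

# H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1), valid when there are no ties
H_by_hand <- 12 / (N * (N + 1)) * sum(rank_sums^2 / n_i) - 3 * (N + 1)

kw <- kruskal.test(scores ~ group)
print(c(by_hand = H_by_hand, kruskal_test = unname(kw$statistic), p = kw$p.value))
```

H is referred to a chi-squared distribution with (number of groups - 1) degrees of freedom, which is how kruskal.test() obtains its p-value.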

+ Aesop: Mann-Whitney U Example Suppose that Aesop is dissatisfied with his classic experiment in which one tortoise was found to beat one hare in a race. He decides to carry out a significance test to discover whether the results could be extended to tortoises and hares in general …

+ Aesop 2: Mann-Whitney U. He collects a sample of 6 tortoises and 6 hares, and makes them all run his race. The order in which they reach the finishing post (their rank order) is as follows:
tort = c(1, 7, 8, 9, 10, 11)
hare = c(2, 3, 4, 5, 6, 12)
The original tortoise still goes at warp speed and the original hare is still lazy, but the others run truer to stereotype.

+ Aesop 3: Mann-Whitney U
wilcox.test(tort, hare)
Wilcoxon W = 25, p-value = 0.31
Tortoises and hares do not differ significantly.
tort = c(1, 7, 8, 9, 10, 11) (n2 = 6)
hare = c(2, 3, 4, 5, 6, 12) (n1 = 6, R1 = 32)
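
A sketch that reproduces the Aesop numbers by hand in base R, assuming the common textbook form U = n1 n2 + n1(n1 + 1)/2 - R1 with the hares as sample 1. Variable names other than `tort` and `hare` are illustrative.

```r
# Finishing positions from the Aesop example (already ranks 1..12).
tort <- c(1, 7, 8, 9, 10, 11)
hare <- c(2, 3, 4, 5, 6, 12)

n1 <- length(hare)                            # hares are sample 1
n2 <- length(tort)
R1 <- sum(rank(c(hare, tort))[seq_len(n1)])   # rank sum of the hares
U  <- n1 * n2 + n1 * (n1 + 1) / 2 - R1        # 36 + 21 - 32

res <- wilcox.test(tort, hare)                # R reports the statistic as W

print(c(R1 = R1, U = U, W = unname(res$statistic), p = round(res$p.value, 2)))
```

With six animals per group and no ties, wilcox.test() computes an exact p-value, so no normal approximation is involved in the 0.31.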