Practice & Communication of Science

Presentation transcript:

Practice & Communication of Science: From Confidence to Hypothesis-Testing @UWE_KAR

Where We Are / Where We Are Going
Many things we measure are normally distributed (ND).
When we sample an ND population, we get:
  the sample mean, an estimate of the population mean
  the standard deviation (SD) of the sample
  a measure of the mean's distribution, the standard error of the mean (SEM)
  a 95% CI of the mean
The 95% CI calculation uses the t-distribution (via the t-table):
  95% CI = mean ± (t(N-1, 0.05) × SEM)
This is a range around the estimated mean, built so that 95 out of 100 such ranges from repeated samples would contain the pop mean.
So far, so descriptive: we are using the data to estimate the pop mean.
But what if we already know the pop mean?

Comparing to Something Known
If we already know the pop mean then another possibility opens up.
Eg a survey of daily travel time to UWE (min) gave:
  26, 33, 65, 28, 34, 55, 25, 44, 50, 36, 26, 37, 43, 62, 35, 38, 45, 32, 28, 34
  mean ± SD is 38.8 ± 11.7 min, n = 20
The national average for commuting is 45.8 min.
We can ask the following question: are our times significantly different to the national average?
  ie does the difference of 7 min mean something?
This is hypothesis-testing, where:
  not significantly different is the Null Hypothesis
  significantly different is the Alternative Hypothesis

Testing Differences
We can extend our use of the t-dist to help us decide which hypothesis to accept and which to reject.
UWE (38.8 ± 11.7 min, n = 20) vs Nation (45.8 min)
  SEM = 11.7/√20 = 2.62
  DoF = N-1 = 19
  95% CI = sample mean ± (t(20-1, 0.05) × SEM)
         = 38.8 ± (2.093 × 2.62)
         = 38.8 ± 5.48
         = (33.32, 44.28) min
We are 95% confident that the mean UWE travel time lies between 33.32 and 44.28 min.
This range does not include the national average, so we conclude we are significantly different to the rest!
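The confidence-interval arithmetic above can be reproduced with a short Python sketch. It assumes only the survey data from the earlier slide and the critical t of 2.093 read from the t-table; the function name `mean_ci` is made up for illustration.

```python
import math
import statistics

# Travel times to UWE (min), copied from the survey slide
times = [26, 33, 65, 28, 34, 55, 25, 44, 50, 36,
         26, 37, 43, 62, 35, 38, 45, 32, 28, 34]

NATIONAL_MEAN = 45.8  # national average commute (min), from the slide
T_CRIT_19 = 2.093     # two-tailed critical t, 19 DoF, 0.05 level (t-table)


def mean_ci(data, t_crit):
    """Return (mean, sd, sem, lower, upper) for a t-based confidence interval."""
    n = len(data)
    mean = statistics.mean(data)
    sd = statistics.stdev(data)   # sample SD (divides by n - 1)
    sem = sd / math.sqrt(n)       # standard error of the mean
    half_width = t_crit * sem
    return mean, sd, sem, mean - half_width, mean + half_width


mean, sd, sem, lower, upper = mean_ci(times, T_CRIT_19)
print(f"mean = {mean:.1f} min, SD = {sd:.1f}, SEM = {sem:.2f}")
print(f"95% CI = ({lower:.2f}, {upper:.2f}) min")
print("national average inside CI:", lower <= NATIONAL_MEAN <= upper)
```

Because the code carries the SEM at full precision rather than rounding it to 2.62, the limits can differ from the hand calculation in the last decimal place.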

Testing Differences
In practice we usually calculate slightly differently:
  calculate the diff between sample and pop means
  calculate t as diff / SEM (this 'standardises' it)
  compare t to 'critical t' in the t-table
  if |t| > critical t then reject the Null Hypo
UWE (38.8 ± 11.7 min, n = 20) vs Nation (45.8 min)
  diff = 38.8 - 45.8 = -7 min
  SEM = 11.7/√20 = 2.62
  t = -7/2.62 = -2.67, so |t| = 2.67 (we ignore the sign)
  critical t(20-1, 0.05) = 2.093
  2.67 > 2.093 (the number of SEMs to the start of the 2.5% 'tail')
So we are in the 'tail', so p < 0.05, so reject the Null Hypo.
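As a rough sketch of the same steps in Python (the helper name `one_sample_t` is hypothetical, and the critical value is simply copied from the t-table rather than computed):

```python
import math
import statistics

# Travel times to UWE (min), copied from the survey slide
times = [26, 33, 65, 28, 34, 55, 25, 44, 50, 36,
         26, 37, 43, 62, 35, 38, 45, 32, 28, 34]

NATIONAL_MEAN = 45.8  # population mean we test against (min)
T_CRIT_19 = 2.093     # two-tailed critical t, 19 DoF, 0.05 level (t-table)


def one_sample_t(data, pop_mean):
    """t = (sample mean - population mean) / SEM."""
    sem = statistics.stdev(data) / math.sqrt(len(data))
    return (statistics.mean(data) - pop_mean) / sem


t_value = one_sample_t(times, NATIONAL_MEAN)
# Two-tailed test: only the size of t matters, not its sign
reject_null = abs(t_value) > T_CRIT_19
print(f"t = {t_value:.2f}, reject Null Hypo: {reject_null}")
```

Working from the raw data instead of the rounded SEM moves t slightly in the second decimal place, which does not change the decision.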

Testing Differences
Or stick the numbers into a stats package like Minitab and ask it to do a 1-sample t-test:

One-Sample T: C1
Test of mu = 45.8 vs not = 45.8
Variable   N   Mean   StDev  SE Mean  95% CI          T      P
UWE time   20  38.80  11.70  2.62     (33.33, 44.27)  -2.68  0.015

Note the actual p-value is calculated:
  ie a 1.5% chance that a difference in travel time at least this big would be seen if travel to UWE was no different to the national picture
  'no different' is the Null Hypo, so reject the Null Hypo in favour of the alternative
  travel to UWE is significantly quicker than the national average

A Special Case of the 1-sample t-test
Say we surveyed the same people at UWE before and after some road improvements, and we are interested in seeing the effect of the improvements.
  for each person, we have two pieces of data: before and after travel times
  if we put the data into two columns, then adjacent cells 'pair up'; the data are said to be paired
  now calculate the difference between each data pair
  we end up with a third column containing the differences
  it will have a mean, an SD and an 'n'
The Null Hypo says that the mean of our third column is not significantly different to 0.
We are doing a paired t-test.

The Paired t-test
Before travel time (min): 26, 33, 65, 28, 34, 55, 25, 44, 50, 36, 26, 37, 43, 62, 35, 38, 45, 32, 28, 34
  38.8 ± 11.7 min, n = 20
After travel time (min): 28, 30, 62, 29, 31, 54, 22, 41, 52, 33, 25, 38, 43, 60, 31, 37, 42, 31, 29, 34
  37.6 ± 11.5 min, n = 20
At first sight this does not look promising: the two means are close, and there is a huge overlap of SDs.
Differences, before minus after (min): -2, 3, 3, -1, 3, 1, 3, 3, -2, 3, 1, -1, 0, 2, 4, 1, 3, 1, -1, 0
  1.2 ± 1.908 min, n = 20
  so SEM = 1.908/√20 = 0.427 min

The Paired t-test
t = mean diff / SEM = 1.2/0.427 = 2.81
From the table, critical t(20-1, 0.05) = 2.093
  2.81 > 2.093 (the number of SEMs to the start of the 2.5% 'tail')
Our Null Hypo says that the expected mean diff is 0
  so 0 min (the Null Hypo) is in the 'tail'
  so p < 0.05
  so reject the Null Hypo
The shortening of journey time by an average of 1.2 min is significant!
The paired t-test is very powerful as it compensates for between-subject variation.
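The paired test is just a 1-sample test on the column of differences, which a few lines of Python can illustrate. This is a minimal sketch using the before and after data from the slides; the variable names are illustrative and the critical t is again taken from the t-table.

```python
import math
import statistics

# Travel times (min) for the same 20 people, from the slides
before = [26, 33, 65, 28, 34, 55, 25, 44, 50, 36,
          26, 37, 43, 62, 35, 38, 45, 32, 28, 34]
after = [28, 30, 62, 29, 31, 54, 22, 41, 52, 33,
         25, 38, 43, 60, 31, 37, 42, 31, 29, 34]

T_CRIT_19 = 2.093  # two-tailed critical t, 19 DoF, 0.05 level (t-table)

# The 'third column': one difference per person (before minus after)
diffs = [b - a for b, a in zip(before, after)]

mean_diff = statistics.mean(diffs)
sem_diff = statistics.stdev(diffs) / math.sqrt(len(diffs))

# The Null Hypo expects a mean difference of 0, so t is mean_diff / SEM
t_value = mean_diff / sem_diff
reject_null = abs(t_value) > T_CRIT_19
print(f"mean diff = {mean_diff:.1f} min, SEM = {sem_diff:.3f}, t = {t_value:.2f}")
print("reject Null Hypo:", reject_null)
```

Pairing removes the large person-to-person spread (SD around 11.7 min) and leaves only the spread of the differences (SD around 1.9 min), which is why a 1.2 min change can reach significance.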

The Paired t-test
In practice put the paired data into Minitab and ask it to do a paired t-test:

            N   Mean   StDev  SE Mean
Before      20  38.80  11.70  2.62
After       20  37.60  11.47  2.56
Difference  20  1.200  1.908  0.427

95% CI for mean difference: (0.307, 2.093)
T-Test of mean difference = 0 (vs not = 0): T-Value = 2.81  P-Value = 0.011

Note the actual p-value is calculated:
  ie a 1.1% chance that a difference in travel time at least this big would be seen if road improvements had no effect
  'no different' is the Null Hypo, so reject the Null Hypo in favour of the alternative
  travel to UWE has been significantly improved

The 2-sample t-test
The paired t-test has two columns of data, but the test is actually done on a single column, the differences between pairs of data.
But what if we have two data samples that don't 'pair up' (they are independent of each other)?
  eg data from males and also from females
This is where the 2-sample t-test (aka the unpaired t-test) comes in.
  it is an extension of what we just looked at
  but we will approach it in a way that also incorporates the four general steps that underpin hypothesis-testing

The Basis of the 2-sample t-Test
Say we are looking at growth of schoolchildren, and we measured heights.
  the values of heights will vary, and their distribution might look pretty 'normal'
  but science is all about trying to explain variation
  so one part of our 'explanation' of variation in height might be that some of it is down to gender

The Basis of the 2-sample t-Test Ie, when it comes to height, male and female schoolchildren belong to different populations…

The Basis of the 2-sample t-Test
The Null Hypothesis says: no, the two sets of data (male and female) are drawn from the same population (just plain schoolchildren).
How can we decide?
  we know that repeated sampling from a single population yields different means, just by chance
  so how far 'apart' must our male & female means be before we conclude it is not just chance?
We need:
  some sort of 'measure' of separation, called the 'test statistic'
  a probability level, p, we are happy with; generally we use the 95% confidence level (5% or 0.05 significance)
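To see how often chance alone separates two samples, here is a small simulation sketch. Everything in it is an assumption for illustration: the population of heights (mean 140 cm, SD 7 cm) is made up, the sample sizes (10 and 11) are chosen so that DoF = 19 and the table value 2.093 can be reused, and the test statistic is the difference in means divided by its standard error, the form the worked example develops.

```python
import math
import random
import statistics

T_CRIT_19 = 2.093  # two-tailed critical t, 19 DoF, 0.05 level (t-table)


def t_between(sample_a, sample_b):
    """Difference in means divided by the standard error of that difference."""
    sem_diff = math.sqrt(statistics.variance(sample_a) / len(sample_a)
                         + statistics.variance(sample_b) / len(sample_b))
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / sem_diff


def chance_rejection_rate(trials=2000, n_a=10, n_b=11, seed=1):
    """Fraction of trials where two samples from ONE population look 'different'."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Both samples come from the same hypothetical population of heights
        sample_a = [rng.gauss(140, 7) for _ in range(n_a)]
        sample_b = [rng.gauss(140, 7) for _ in range(n_b)]
        if abs(t_between(sample_a, sample_b)) > T_CRIT_19:
            rejections += 1
    return rejections / trials


rate = chance_rejection_rate()
print(f"rejected the Null Hypo in {rate:.1%} of trials")
```

The rate should come out in the region of 5%, which is what working at the 0.05 level means: when the Null Hypo is true, we still reject it about 1 time in 20.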

The Basis of the 2-sample t-Test
The 'test statistic' is a measure of 'separation' of our two putative populations, male & female.
  for the t-test it is the t-value
It depends on the diff in means and the variability:
  a big difference in means implies a strong 'signal' that the two populations differ
  the diff would be zero if the samples were the same
  big variability implies a lot of 'noise' masking the diff in means 'signal'
The t-value is like a signal-to-noise ratio!

The Basis of the 2-sample t-Test
The bigger our 'signal-to-noise ratio', t, the less likely it is that we are dealing with two samples from the same population.
  ie we can reject the Null Hypo and accept the Alternative Hypo that male and female schoolchildren differ significantly in their heights
How big t needs to be depends on:
  degrees of freedom (N-2 in this case, where N is the total number of observations in both samples)
  the significance level we are working at (alpha, usually 0.05)
We look up the critical value of t in the t-table, just like we did when calculating a Confidence Interval.

2-sample t-Test: Worked Example
Say we are looking at lung function in schoolchildren, say the FVC (forced vital capacity, in L).
  50 boys and 55 girls (n doesn't have to be the same)
Male data:
  2.159, 2.065, 1.518, 2.227, 2.09, 2.451, 1.871, 2.571, 2.532, 2.545,
  2.538, 2.795, 2.102, 1.804, 2.432, 2.704, 2.258, 2.282, 1.663, 2.795,
  2.238, 1.953, 2.382, 2.344, 2.967, 2.68, 2.413, 2.444, 1.953, 2.314,
  2.15, 2.634, 2.598, 2.09, 2.641, 2.92, 2.727, 2.307, 2.76, 2.439,
  2.259, 2.111, 2.58, 2.602, 2.461, 3.128, 2.241, 2.602, 3.177, 2.419
  mean = 2.399 L, s = 0.348 L, n = 50
Female data:
  1.913, 2.18, 1.56, 1.586, 1.712, 2.038, 1.791, 1.869, 2.296, 1.897,
  1.846, 2.246, 2.318, 1.934, 1.92, 1.958, 2.521, 2.04, 2.19, 1.886,
  1.734, 2.148, 2.198, 2.351, 2.193, 1.772, 2.38, 1.776, 2.505, 2.438,
  2.317, 2.857, 2.604, 2.275, 1.727, 2.185, 2, 2.428, 2.304, 1.775,
  2.537, 1.904, 2.519, 2.611, 2.425, 2.302, 2.366, 1.999, 3.111, 1.923,
  2.978, 2.673, 2.311, 2.428, 2.407
  mean = 2.185 L, s = 0.345 L, n = 55

2-sample t-Test: Worked Example
Like any statistical test there are four stages:
  1. Formulate the Null Hypothesis
  2. Generate a test statistic
  3. Use the test statistic to work out a probability
  4. Interpret the probability
1: Formulate the Null Hypothesis
  male and female schoolchildren both belong to a single underlying population called schoolchildren, and our observed differences in FVC arose by chance
  we keep this hypothesis if the probability of seeing such differences by chance is 5% or more
  ie gender has no significant influence on FVC in schoolchildren

2-sample t-Test: Worked Example
2: Generate the test statistic, t
  previously, t was calculated from diff in means / SEM
  it is easy to get the simple difference in male and female means
  but we have two sets of data contributing to the SEM of the difference
  if the variances (square of SD) are similar:
    SEM diff = √(SD_A²/n_A + SD_B²/n_B)
  for our data:
    t = (2.399 - 2.185) / √((0.348²/50) + (0.345²/55))
    t = 0.214 / 0.0677 = 3.16
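The same calculation can be sketched in Python from the summary statistics alone. The function name `two_sample_t` is invented for this example; the formula is the SEM-of-the-difference expression given on the slide.

```python
import math


def two_sample_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (t, SEM of the difference) for two independent samples."""
    # Each sample contributes its own variance/n to the SEM of the difference
    sem_diff = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / sem_diff, sem_diff


# Summary statistics for FVC (L) from the worked example: male, then female
t_value, sem_diff = two_sample_t(2.399, 0.348, 50, 2.185, 0.345, 55)
print(f"SEM diff = {sem_diff:.4f} L, t = {t_value:.2f}")
```

Passing summary statistics rather than raw data keeps the sketch close to the hand calculation, so each term can be checked against the slide.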

2-sample t-Test: Worked Example
3: Use the test statistic to look up the probability
  Row: DoF is N-2 (103)
  Column: significance level α = 0.05 (5%)
  critical t = 1.96 (the large-sample row of the t-table; the exact value for 103 DoF is about 1.98)
  our t, 3.16 > 1.96
  ie a mean diff of 0 (Null Hypo) lies more than 1.96 SEM along the distribution
  so p < 0.05

2-sample t-Test: Worked Example
4: Interpret the probability
  p < 0.05
  but our t-value (3.16) > critical t (2.58, or about 2.62 for 103 DoF) at α = 0.01, so actually p < 0.01
  this means that the chance of 'randomly' picking two samples from the same population that are at least as far apart as we saw, with the variability in each sample we saw, is < 1%
  ie if the Null Hypo were true, we would see a difference this big less than 1% of the time
  so we conclude that these samples represent two different populations
  ie we accept the Alt Hypo that males and females differ

2-sample t-Test: Worked Example
In practice put the data into Minitab and ask it to do a 2-sample t-test:

Two-sample T for Male vs Female
        N   Mean   StDev  SE Mean
Male    50  2.399  0.348  0.049
Female  55  2.185  0.345  0.047

Difference = mu (Male) - mu (Female)
Estimate for difference: 0.2140
95% CI for difference: (0.0798, 0.3481)
T-Test of difference = 0 (vs not =): T-Value = 3.16  P-Value = 0.002  DF = 103
Both use Pooled StDev = 0.3462

Note the actual p-value is calculated:
  ie a 0.2% chance that a difference in FVC at least this big would be seen if gender had no effect
  so reject the Null Hypo in favour of the alternative
  gender significantly affects FVC
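Minitab reports a pooled standard deviation, which the hand calculation did not use. The sketch below shows one way the pooled SD, the t-value and the confidence interval for the difference could be reproduced from the rounded summary statistics. Two assumptions are not from the slides: the critical t of 1.983 for 103 DoF, and the standard pooled-variance formula (each variance weighted by its own degrees of freedom).

```python
import math

# Summary statistics for FVC (L) from the Minitab output
MEAN_M, SD_M, N_M = 2.399, 0.348, 50
MEAN_F, SD_F, N_F = 2.185, 0.345, 55

T_CRIT_103 = 1.983  # ASSUMPTION: two-tailed critical t, 103 DoF, 0.05 level


def pooled_sd(sd_a, n_a, sd_b, n_b):
    """Pooled SD: each variance weighted by its own degrees of freedom."""
    dof = n_a + n_b - 2
    return math.sqrt(((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / dof)


sp = pooled_sd(SD_M, N_M, SD_F, N_F)
se_diff = sp * math.sqrt(1 / N_M + 1 / N_F)  # standard error of the difference
diff = MEAN_M - MEAN_F

t_value = diff / se_diff
ci_lower = diff - T_CRIT_103 * se_diff
ci_upper = diff + T_CRIT_103 * se_diff

print(f"pooled SD = {sp:.4f} L, t = {t_value:.2f}")
print(f"95% CI for difference = ({ci_lower:.4f}, {ci_upper:.4f}) L")
```

Since the inputs are the rounded means and SDs rather than the raw data, small disagreements with Minitab in the last decimal place are to be expected. With group sizes and SDs this similar, the pooled and unpooled standard errors are almost identical, so both routes give t of about 3.16.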

Summary
Hypothesis-testing tests the Null Hypothesis.
Like any statistical test there are four stages:
  1. Formulate the Null Hypothesis
  2. Generate a test statistic (in this case, t)
  3. Use the test statistic to work out a probability
  4. Interpret the probability
For the t-test, t = 'diff'/SEM ('signal'/'noise').
  the 1-sample t-test tests a sample against an expected mean
  the paired t-test tests 'before-after' data (against 0)
  the 2-sample t-test (unpaired t-test) tests two independent samples to see if their means differ