Published by Isaiah Ryman. Modified over 2 years ago.

1
The t test
Peter Shaw RIl
"Small samples are slippery customers whose word is not to be taken as gospel" (Moroney).

2
Introduction
- Last week we met the Mann-Whitney U test, a non-parametric test to examine how likely it is that 2 samples could have come from the same population.
- This week we explore other approaches to this and related situations.

3
Student’s t test
- This test was invented by a statistician working for the brewer Guinness. He was called WS Gosset (1876-1937), but preferred to remain anonymous so wrote under the name “Student”.
- Hence we have Student’s t test, the Studentised range, etc - in memory of Mr Gosset (!).

4
T vs U?
- These 2 tests are identical in hypothesis formulation.
- They require 2 samples which may be from the same population. These samples need not be of equal size, nor are they paired.
  - H0: The 2 samples are from the same population - any differences are due to chance.
  - H1: The 2 samples come from different populations.

5
1 big difference (+ a few small ones):
- The t test is a parametric test - it assumes the data are normally distributed.
- There are several different versions of the t test, depending on exactly what assumptions you make about the data. I’ll stick to the simplest.

6
The basic idea
Remember Z scores? These apply to the idealised normal distribution, with mean μ and standard deviation σ.
How many s.d.s is this data point from the mean?
$$ Z_i = \frac{X_i - \mu}{\sigma} $$
We can look up Z in tables, but these assume that the values of μ and σ are known perfectly.
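A minimal sketch of the Z score idea, assuming μ and σ are known exactly. The numbers (μ = 100, σ = 15, X = 130) are hypothetical and not from the slides; `NormalDist` from Python's standard library stands in for the printed Z tables.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """How many standard deviations x lies from the mean."""
    return (x - mu) / sigma

# Hypothetical population values, chosen only for illustration.
mu, sigma = 100.0, 15.0
z = z_score(130.0, mu, sigma)

# Area of the standard normal curve above z (what a Z table would give).
p_above = 1 - NormalDist().cdf(z)

print(f"Z = {z:.2f}, P(Z > z) = {p_above:.4f}")
```

This lookup is only valid when μ and σ are not themselves estimates from a small sample.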

7
Gosset’s discovery:
- Was the formula that takes the place of Z when the sample is small, so that μ and S are based on inadequate data.
- To distinguish this distribution from the idealised normal distribution, Gosset named the function the “t statistic”, and the value of (X_i - μ)/S when μ and S are estimates was renamed from Z to t.
- Hence t is really just a special, unreliable Z score. To identify a t score you must also specify how many data points it is based on: a value based on 6 observations is FAR less reliable than one based on 6000.

8
The theory...
You have 2 samples which may be from 1 distribution or 2. To assess the likelihood, find how many s.d.s apart the means of the 2 samples are (μ1 and μ2):
How many S.D.’s? Calculate t = (μ1 - μ2) / pooled sd

9
The details are slightly more messy..
- Because of the question “How do we calculate the pooled sd?”
- There are several ways of doing this which make different assumptions, and give slightly different answers.
- The simplest model assumes that the 2 samples have a common variance, and gives t as follows:
- Given data X1, X2 which have N1, N2 data points each, and sums of squares SSx1, SSx2:

$$ t = \frac{\mu_1 - \mu_2}{\sqrt{\dfrac{(SS_{x1} + SS_{x2})\,(1/N_1 + 1/N_2)}{N_1 + N_2 - 2}}} \qquad \text{with } N_1 + N_2 - 2 \text{ df} $$
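The common-variance formula can be sketched in a few lines of Python. This is an illustration, not the lecturer's own code: the function names `sum_of_squares` and `pooled_t` and the two tiny samples are hypothetical. The degrees of freedom are taken as N1 + N2 - 2, the same quantity that appears as the divisor under the square root.

```python
from math import sqrt

def sum_of_squares(xs):
    """Sum of squared deviations from the sample mean (SS)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)

def pooled_t(x1, x2):
    """Student's t for two unpaired samples, assuming a common variance."""
    n1, n2 = len(x1), len(x2)
    mean1, mean2 = sum(x1) / n1, sum(x2) / n2
    df = n1 + n2 - 2
    # Standard error of the difference between the two means.
    se_diff = sqrt((sum_of_squares(x1) + sum_of_squares(x2))
                   * (1 / n1 + 1 / n2) / df)
    return (mean1 - mean2) / se_diff, df

# Hypothetical data, chosen so the arithmetic is easy to follow by hand.
t, df = pooled_t([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(f"t = {t:.3f} with {df} df")
```

The sign of t only reflects which sample is listed first; for a two-tailed test it is the magnitude that is compared with the tabulated value.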

10
Beware!
- I spent an afternoon in the library once checking ways to calculate t.
- I found 3 different formulae, plus several confusing ways to express the relationship I just showed you.
- Another one widely used differs in assuming that the 2 samples have unequal variance. This gives a messier formula, plus another even messier formula for the df.
- The third approach assumes that samples are accurately paired - the paired samples t test.
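For comparison, here is a hedged sketch of the other two variants. The slides do not name the unequal-variance formula; the version below is the one usually called Welch's t test, with the Welch-Satterthwaite approximation for the df, which is an assumption about which formula was meant. The function names and sample data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev, variance

def welch_t(x1, x2):
    """Unequal-variance t (Welch) with Welch-Satterthwaite df."""
    n1, n2 = len(x1), len(x2)
    v1, v2 = variance(x1) / n1, variance(x2) / n2  # squared SE of each mean
    t = (mean(x1) - mean(x2)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df  # df is generally not a whole number

def paired_t(x1, x2):
    """Paired-samples t: a one-sample t test on the pairwise differences."""
    diffs = [a - b for a, b in zip(x1, x2)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1

# Hypothetical data; for the paired test the i-th values belong together.
a, b = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
t_welch, df_welch = welch_t(a, b)
t_pair, df_pair = paired_t(a, b)
print(f"Welch:  t = {t_welch:.3f}, df = {df_welch:.2f}")
print(f"Paired: t = {t_pair:.3f}, df = {df_pair}")
```

The same numbers give different t and df values under each set of assumptions, which is why it matters to state which version was used.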

11
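|         | x1       | x2       |
|---------|----------|----------|
|         | 57.8     | 64.2     |
|         | 56.2     | 58.7     |
|         | 61.9     | 63.1     |
|         | 54.4     | 62.5     |
|         | 53.6     | 59.8     |
|         | 56.4     | 59.2     |
|         | 53.2     |          |
| n       | 7        | 6        |
| Sum x   | 393.5    | 367.5    |
| Sum x^2 | 22174.41 | 22535.87 |
| mean    | 56.21    | 61.25    |
| SS      | 54.089   | 26.495   |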

12
So you know what to do to compare 2 groups!
- You have the choice of M-W U, or Student’s t test.
- But what if there are 3 groups (X1, X2, X3), or 4, or 5?
- You may work out the following routine:
  - Test group 1 vs group 2, then 2 vs 3, etc.
  - Clever, but WRONG! (The danger with multiple tests is that you will get a result that is “significant” at p=0.05 more often than 1 time in 20 by chance alone.)
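The size of the problem can be sketched with a little arithmetic. This assumes every pair of groups is tested and that the tests are independent, which is a simplification (pairwise tests that share a group are not truly independent), so the figures are indicative only. The function name `familywise_error` is hypothetical.

```python
from math import comb

def familywise_error(n_tests, alpha=0.05):
    """Chance of at least one false 'significant' result in n_tests
    independent tests when H0 is true for all of them."""
    return 1 - (1 - alpha) ** n_tests

for groups in (3, 4, 5):
    n_tests = comb(groups, 2)  # number of distinct pairs of groups
    risk = familywise_error(n_tests)
    print(f"{groups} groups -> {n_tests} tests -> risk {risk:.3f}")
```

Even with only 3 groups the risk of a spurious "significant" result is well above the nominal 1 in 20.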

13
Multiple groups can be compared..
- With a suitable multiple test.
- There are 2 options here, both of which are usually run on PCs.
- Parametric data: Analysis of variance (ANOVA)
- Non-parametric data: Kruskal-Wallis ANOVA.
  - I make M.Sc. students run ANOVA calculations by hand, but K-W ANOVA is PC only.

14
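Type of data, by number of groups:

| Number of groups | Parametric                   | Non-parametric       |
|------------------|------------------------------|----------------------|
| 2                | t test                       | Mann-Whitney U test  |
| >=2              | Analysis of variance (ANOVA) | Kruskal-Wallis ANOVA |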

15
Example:

|         | x1       | x2       |
|---------|----------|----------|
| n       | 7        | 6        |
| Sum x   | 393.5    | 367.5    |
| Sum x^2 | 22174.41 | 22535.87 |
| mean    | 56.21    | 61.25    |
| SS      | 54.089   | 26.495   |

$$ SE_{diff} = \sqrt{\frac{(SS_{x1} + SS_{x2})\,(1/N_1 + 1/N_2)}{N_1 + N_2 - 2}} $$

SE diff = sqrt[(54.089 + 26.495)*(1/7 + 1/6)/(7 + 6 - 2)] = sqrt[2.2675] = 1.506
Hence t = (61.25 - 56.21)/1.506 = 3.35 with 7 + 6 - 2 = 11 df
This is significant at p<0.05
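The arithmetic of this example can be checked from the summary figures alone. The sketch below uses only n, Sum x and Sum x^2 from the table; the helper name `ss_from_sums` is hypothetical, and the critical value 2.201 (two-tailed, p = 0.05, 11 df) is taken from standard t tables rather than from the slides.

```python
from math import sqrt

# Summary figures from the slide's table (x1 has 7 values, x2 has 6).
n1, n2 = 7, 6
sum1, sum2 = 393.5, 367.5
sumsq1, sumsq2 = 22174.41, 22535.87

def ss_from_sums(sum_x, sum_x2, n):
    """Sum of squares about the mean: SS = sum(x^2) - (sum x)^2 / n."""
    return sum_x2 - sum_x ** 2 / n

ss1 = ss_from_sums(sum1, sumsq1, n1)
ss2 = ss_from_sums(sum2, sumsq2, n2)
mean1, mean2 = sum1 / n1, sum2 / n2

df = n1 + n2 - 2
se_diff = sqrt((ss1 + ss2) * (1 / n1 + 1 / n2) / df)
t = (mean2 - mean1) / se_diff

T_CRIT_11DF = 2.201  # two-tailed 5% point for 11 df, from standard tables

print(f"SS: {ss1:.3f}, {ss2:.3f}")
print(f"SE diff = {se_diff:.3f}, t = {t:.2f} with {df} df")
print("significant at p<0.05" if abs(t) > T_CRIT_11DF else "not significant")
```

Working with unrounded means gives t of about 3.34 rather than 3.35; the difference comes only from rounding the means to 2 decimal places and does not change the conclusion.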
