Student’s t statistic Use Test for equality of two means

1 Student’s t statistic: Use
Test for equality of two means. E.g., compare two groups of subjects given different treatments.
Test for the value of a single mean. E.g., test whether a single group of subjects differs from a known value. This also covers the ‘matched sample’ test, where a single group is compared before and after treatment (a test for zero treatment effect).
Advanced: tests of significance of correlation/regression coefficients.

2 Student’s t statistic: Assumptions and Robustness
Assumptions: The parent population is normal. Sample observations (subjects) are independent.
Robustness to normality: Departures from normality affect the Type I error rate and the power, and may lead to inappropriate interpretation. Real data are never exactly normal, but they should not be heavily skewed.

3 Student’s t statistic: Formula (single group)
Let x1, x2, …, xn be a random sample from a normal population with mean µ and variance σ². Then the following statistic is distributed as Student’s t with (n − 1) degrees of freedom.
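The statistic meant here is presumably the usual one-sample t statistic. The symbols $\bar{x}$ (sample mean) and $s$ (sample standard deviation) are notation assumed for this sketch, since the slide text does not define them:

$$
t = \frac{\bar{x} - \mu}{s / \sqrt{n}}, \qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2 .
$$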

4 Student’s t statistic: Formula (two groups)
Case 1: Two matched samples. The following statistic follows a t distribution with n − 1 d.f., where d is the difference between the two matched measurements and Sd is the standard deviation of the variable d.
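In the notation of this slide, the matched-samples statistic is most likely the one-sample t statistic applied to the differences, testing a mean difference of zero. The symbol $\bar{d}$ for the mean of the n differences is an assumption:

$$
t = \frac{\bar{d}}{S_d / \sqrt{n}}, \qquad
\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i, \qquad
S_d^2 = \frac{1}{n-1}\sum_{i=1}^{n} (d_i - \bar{d})^2 .
$$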

5 Student’s t statistic: Formula (two groups)
Case 2: Equal population standard deviations. The following statistic is distributed as t with (n1 + n2 − 2) d.f. Here Sp is the pooled standard deviation, n1 and n2 are the sample sizes, and S1 and S2 are the sample standard deviations of the two groups.
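The standard pooled-variance form that matches these definitions is sketched below. The symbols $\bar{x}_1$, $\bar{x}_2$ (group means) and $S_p$ (pooled standard deviation) are assumed notation:

$$
t = \frac{\bar{x}_1 - \bar{x}_2}{S_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}, \qquad
S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2} .
$$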

6 Student’s t statistic: Formula (two groups)
Case 3: Unequal population standard deviations. The following statistic follows, approximately, a t distribution. The d.f. of this statistic is,
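For unequal standard deviations the usual choice is the Welch statistic with the Welch-Satterthwaite degrees of freedom, which is also what R's `t.test` computes when `var.equal = FALSE`. Assuming that is what the slide showed, and using $\bar{x}_1$, $\bar{x}_2$, $S_1$, $S_2$, $n_1$, $n_2$ as in Case 2:

$$
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}, \qquad
\text{d.f.} = \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^{2}}
{\dfrac{(S_1^2/n_1)^2}{n_1 - 1} + \dfrac{(S_2^2/n_2)^2}{n_2 - 1}} .
$$

The d.f. is generally not an integer.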

7 Student’s t statistic: One-sided vs. two-sided tests
One-sided: There can only be one direction of effect, or the investigator is only interested in one direction of effect. Greater power to detect a difference in the expected direction.
Two-sided: The difference could go in either direction. More conservative.

8 Student’s t statistic: One-sided and two-sided hypotheses
One sided
  One group: A single mean differs from a known value in a specific direction, e.g., mean > 0.
  Two groups: Two means differ from one another in a specific direction, e.g., mean2 < mean1.
Two sided
  One group: A single mean differs from a known value in either direction, e.g., mean ≠ 0.
  Two groups: Two means are not equal, that is, mean1 ≠ mean2.

9 Student’s t statistic: SPSS
One Group: Analyze > Compare Means > One-Sample T Test
Two Groups (Matched Samples): Analyze > Compare Means > Paired-Samples T Test
Two Groups: Analyze > Compare Means > Independent-Samples T Test

10 Student’s t statistic: R
The default t-test is
t.test(x, y = NULL, alternative = "two.sided", mu = 0, paired = FALSE, var.equal = FALSE, conf.level = 0.95)
where x and y are numeric vectors holding the data for the two variables. We only need to change the defaults that differ for the case we want to perform. For example:
One Group: t.test(x, alternative = "greater", mu = 30)
Two Groups (Matched Samples): t.test(x, y, alternative = "less", mu = 0, paired = TRUE)
Two Groups: t.test(x, y, alternative = "greater", mu = 0, var.equal = TRUE)
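As a rough check on how these calls relate to the formulas for the single-group, matched, pooled, and unequal-variance cases, the sketch below runs each variant on simulated data and recomputes the statistic by hand. The data, the seed, and all variable names (`x`, `x_after`, `y`, and so on) are made up for illustration and are not from the course dataset.

```r
set.seed(1)
x       <- rnorm(20, mean = 32, sd = 4)        # simulated group 1
x_after <- x + rnorm(20, mean = 1, sd = 2)     # simulated matched re-measurement
y       <- rnorm(25, mean = 30, sd = 4)        # simulated independent group 2

# One group: t = (mean - mu0) / (s / sqrt(n)), n - 1 d.f.
one   <- t.test(x, alternative = "greater", mu = 30)
t_one <- (mean(x) - 30) / (sd(x) / sqrt(length(x)))

# Matched samples: one-sample t on the differences
paired   <- t.test(x, x_after, alternative = "less", paired = TRUE)
d        <- x - x_after
t_paired <- mean(d) / (sd(d) / sqrt(length(d)))

# Two groups, equal variances: pooled standard deviation, n1 + n2 - 2 d.f.
n1 <- length(x); n2 <- length(y)
sp       <- sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2))
pooled   <- t.test(x, y, alternative = "greater", var.equal = TRUE)
t_pooled <- (mean(x) - mean(y)) / (sp * sqrt(1 / n1 + 1 / n2))

# Two groups, unequal variances (the default): Welch d.f.
welch    <- t.test(x, y)
v1 <- var(x) / n1; v2 <- var(y) / n2
df_welch <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

print(c(t.test = unname(pooled$statistic), by_hand = t_pooled))
```

The `alternative` argument changes only the p-value and confidence interval, not the t statistic itself.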

11 Student’s t statistic: MS Excel (in Tools -> Data Analysis…)
One Group: Not available
Two Groups (Matched Samples): t-Test: Paired Two Sample for Means
Two Groups (Independent Samples): t-Test: Two-Sample Assuming Equal Variances, or t-Test: Two-Sample Assuming Unequal Variances

12 Example 1
Consider the heights of children 4 to 12 years old in dataset 1 of our course website (variable ‘hgt’). Suppose we want to test whether the average height (µ) for this age group in the population is 50 inches, using our sample of 60 children. We will use a 5% level of significance. This is a one-sample, two-sided test.

13 Example 1
Hypotheses: H0: µ = 50, Ha: µ ≠ 50
Computation in Excel: Excel does not have a one-sample test, but we can fool it. Create a dummy column parallel to the hgt column with an equal number of cells, all set to 0.0. Run the matched-sample test using hgt and the dummy column, with 50 as the hypothesized mean difference. The p-value for the two-tailed test is
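The dummy-column trick works because a matched-sample test on (hgt, 0) with a hypothesized mean difference of 50 is the same calculation as a one-sample test of µ = 50. A small sketch in R illustrates the equivalence; the `hgt` values here are simulated stand-ins, not the course data.

```r
set.seed(7)
hgt   <- rnorm(60, mean = 52, sd = 6)   # simulated stand-in for the hgt column
dummy <- rep(0, length(hgt))            # the all-zero dummy column

direct <- t.test(hgt, mu = 50)                        # one-sample test
trick  <- t.test(hgt, dummy, paired = TRUE, mu = 50)  # matched test, hypothesized difference 50

print(c(direct = direct$p.value, trick = trick$p.value))
```

Both calls give the same t statistic, degrees of freedom, and p-value.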

14 Example 1
Using SPSS: Analyze > Compare Means > One-Sample T Test > select hgt > Test Value: 50 > OK. The p-value is .009.
Using R: t.test(df1$hgt, mu = 50). The two-tailed p-value is .0092.

15 Example 2
Suppose we want to compare the heights of two groups (hgt for each sex in the dataset).
H0: Mean heights are equal for the two sexes. Ha: Mean heights are not equal.
Using MS Excel:
Sort the data by sex (Data > Sort > by: sex).
In Data Analysis…, choose t-Test: Two-Sample Assuming Equal Variances.
Select the range of hgt for all sex = f as the Variable 1 Range.
Select the range of hgt for all sex = m as the Variable 2 Range.
P-value for the two-sided test = 0.205

16 Example 2
Using SPSS: Analyze > Compare Means > Independent-Samples T Test
Select hgt as the Test Variable.
Select sex as the Grouping Variable.
In Define Groups, type f for Group 1 and m for Group 2.
Click Continue, then OK.
It gives us the p-value
We can assume equal variances, as the p-value of the F statistic for testing equality of variances is
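The same two-step workflow (check equality of variances, then run the equal-variance t-test) can be sketched in R. The data frame `df1` below is simulated, with made-up column values, so its results will not match the course dataset. Note also that `var.test` is the classical variance-ratio F test, whereas SPSS reports Levene's test in this dialog, so the two need not agree numerically.

```r
set.seed(42)
# Simulated stand-in for the course data: 30 girls and 30 boys
df1 <- data.frame(
  sex = rep(c("f", "m"), each = 30),
  hgt = c(rnorm(30, mean = 50, sd = 6), rnorm(30, mean = 52, sd = 6))
)

# Step 1: F test for equality of the two population variances
vt <- var.test(hgt ~ sex, data = df1)

# Step 2: two-sample t-test assuming equal variances
tt <- t.test(hgt ~ sex, data = df1, var.equal = TRUE)

print(vt$p.value)
print(tt$p.value)
```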

17 Sign Test (Nonparametric)
Use: (1) Compare the median of a single group with a specified value (instead of the single-sample t-test). (2) Compare the medians of two matched groups (instead of the two-matched-samples t-test).
Test Statistic: The number of positive differences (observation − c), where c is the specified median. Under the null hypothesis, this count follows a Binomial distribution with success probability 0.5.

18 Sign Test (Nonparametric)
SPSS: Analyze > Nonparametric Tests > Binomial
R: sign.test(x, y = NULL, md = 0, alternative = "two.sided", conf.level = 0.95)
To test the median (md) of a single sample, supply data for one variable only. To compare paired data, supply the two paired variables.
NB: This test requires the BSDA package, where the function is named SIGN.test in current releases.
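Because the number of positive differences is Binomial under the null hypothesis, the sign test can also be sketched in base R with `binom.test`, without the BSDA package. The helper name `sign_test_base` and the sample values are hypothetical, and dropping observations equal to the hypothesized median is one common convention, assumed here.

```r
# Hypothetical helper: sign test for a single sample (or for paired differences)
sign_test_base <- function(x, md = 0, alternative = "two.sided") {
  d <- x - md
  d <- d[d != 0]   # assumption: drop observations equal to the hypothesized median
  binom.test(sum(d > 0), length(d), p = 0.5, alternative = alternative)
}

x   <- c(48, 52, 55, 47, 53, 56, 51, 58, 54, 57)   # made-up sample
res <- sign_test_base(x, md = 50)
print(res$p.value)

# For paired data, apply the same helper to the differences: sign_test_base(x1 - x2)
```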

19 Wilcoxon Signed-Rank Test
Use: Compares the medians of two paired samples.
Test Statistic: Consider n pairs of observations on two variables X and Y. The Wilcoxon signed-rank statistic is
WS = sum of the ranks of the positive differences, where ranks are assigned to the absolute values of the differences.

20 Wilcoxon Rank-Sum Test
Use: Compares the medians of two independent groups.
Test Statistic: Let X and Y be two samples of sizes m and n, and let N = m + n. Rank all N observations together. The statistic is
Wm = sum of the ranks of all observations of variable X.
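A short sketch of the rank-sum computation in R, on made-up samples. One detail to be aware of: `wilcox.test` reports W = Wm − m(m + 1)/2 (the Mann-Whitney form) rather than Wm itself, so the two numbers differ by a constant.

```r
x <- c(1.1, 3.4, 2.2, 5.0)         # made-up sample X, m = 4
y <- c(0.7, 2.9, 4.1, 6.3, 7.8)    # made-up sample Y, n = 5
m <- length(x)

ranks <- rank(c(x, y))             # rank all N = m + n observations together
wm    <- sum(ranks[seq_len(m)])    # sum of the ranks belonging to X

res <- wilcox.test(x, y)           # rank-sum test (paired = FALSE by default)
print(c(Wm = wm, W_reported = unname(res$statistic)))
```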

21 Wilcoxon Signed-Rank Test & Wilcoxon Rank-Sum Test: SPSS
Two Matched Groups: Analyze > Nonparametric Tests > 2 Related Samples
Two Groups: Analyze > Nonparametric Tests > 2 Independent Samples

22 Wilcoxon Signed-Rank Test / Wilcoxon Rank-Sum Test: R
The default test is
wilcox.test(x, y = NULL, alternative = "two.sided", mu = 0, paired = FALSE, exact = NULL, correct = TRUE, conf.int = FALSE, conf.level = 0.95)
Two Matched Groups: wilcox.test(x, y, alternative = "less", paired = TRUE)
Two Groups: wilcox.test(x, y, alternative = "greater")

23 Example 3 (two matched samples)
Hours of sleep with drug and with placebo:
Subject  Drug  Placebo  Difference  Rank (ignoring sign)
1        6.1   5.2       0.9        3.5
2        7.0   7.9      -0.9        3.5
3        8.2   3.9       4.3        10
4        7.6   4.7       2.9        7
5        6.5   5.3       1.2        5
6        8.4   5.4       3.0        8
7        6.9   4.2       2.7        6
8        6.7   6.1       0.6        2
9        7.4   3.8       3.6        9
10       5.8   6.3      -0.5        1
The 3rd and 4th ranks are tied and hence averaged.
P-value of this test is
Hence the test is significant at any level above 2%, indicating the drug is more effective than placebo.
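The signed-rank statistic for this example can be recomputed in R as a sketch. The placebo value for subject 8 (6.1) is inferred from the drug value and the difference, so treat it as an assumption. The differences are rounded to one decimal because floating-point subtraction would otherwise break the tie between 0.9 and −0.9.

```r
drug    <- c(6.1, 7.0, 8.2, 7.6, 6.5, 8.4, 6.9, 6.7, 7.4, 5.8)
placebo <- c(5.2, 7.9, 3.9, 4.7, 5.3, 5.4, 4.2, 6.1, 3.8, 6.3)  # subject 8 value inferred

d  <- round(drug - placebo, 1)   # rounding keeps |0.9| and |-0.9| exactly tied
r  <- rank(abs(d))               # ranks ignoring sign, ties averaged
ws <- sum(r[d > 0])              # sum of the ranks of the positive differences

# Signed-rank test on the differences; exact = FALSE because of the tie
res <- wilcox.test(d, exact = FALSE)
print(c(WS = ws, V = unname(res$statistic)))
print(res$p.value)
```

Passing `alternative = "greater"` instead would test specifically whether sleep is longer with the drug.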

24 Proportion Tests: Use
Test for equality of two proportions. E.g., proportions of subjects in two treatment groups who benefited from treatment.
Test for the value of a single proportion. E.g., to test whether the proportion of smokers in a population is some specified value (less than 1).

25 Proportion Tests Formula One Group: Two Groups:
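The formulas are not shown here. The large-sample z statistics usually given for these two cases are sketched below as an assumption, not as a confirmed copy of the slide. The notation is x successes out of n for one group with hypothesized proportion $p_0$, and x out of m and y out of n for two groups:

$$
\text{One group:}\quad z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}}, \qquad \hat{p} = \frac{x}{n}
$$

$$
\text{Two groups:}\quad z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\,(1 - \hat{p}) \left(\dfrac{1}{m} + \dfrac{1}{n}\right)}}, \qquad
\hat{p}_1 = \frac{x}{m},\quad \hat{p}_2 = \frac{y}{n},\quad \hat{p} = \frac{x + y}{m + n}
$$

Under the null hypothesis each z is approximately standard normal for large samples.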

26 Proportion Tests: SPSS and R
SPSS: One Group: Analyze > Nonparametric Tests > Binomial. Two Groups?
R: The default tests are:
One Group: binom.test(x, n, p = 0.5, alternative = "two.sided", conf.level = 0.95)
Two Groups: prop.test(c(x, y), c(m, n), p = NULL, alternative = "two.sided", conf.level = 0.95, correct = TRUE)
Here x and y are the numbers of successes and m and n are the sample sizes.

27 Example 4: Proportion of males in Dataset 1
n = 60 and there are 30 males. binom.test(30, 60) returns a p-value of 1.0.
SPSS: First recode sex as numeric: Transform > Recode > Into Different Variables > make all selections there and click Change after recoding the character variable into numeric. Then: Analyze > Nonparametric Tests > Binomial > select the Test Variable > set Test Proportion (the null hypothesis) = 0.5. The p-value = 1.0
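The one-group result is easy to reproduce in R, since 30 successes out of 60 is exactly the expected count under p = 0.5. The two-group call that follows uses invented counts (30 of 60 versus 18 of 60) purely to show the shape of a `prop.test` call.

```r
# One group: 30 males out of 60, null hypothesis p = 0.5
one_group <- binom.test(30, 60)
print(one_group$p.value)

# Two groups: hypothetical counts, 30/60 successes vs 18/60 successes
two_groups <- prop.test(c(30, 18), c(60, 60))
print(two_groups$p.value)
```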

28 Chi-square statistic: Use and Assumptions
Use: Testing whether the population variance equals a specified value, σ² = σ0². Testing goodness of fit. Testing the independence/association of attributes.
Assumptions: Sample observations should be independent. Cell frequencies should be >= 5. Total observed and expected frequencies are equal.

29 Chi-square statistic: Formula
If xi (i = 1, 2, …, n) are independent and normally distributed with mean µ and standard deviation σ, then,
If we don’t know µ, we estimate it using the sample mean, and then,
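The two results referred to on this slide are presumably the standard ones below, where $\bar{x}$ is the sample mean and $s^2$ the sample variance (notation assumed here). Estimating µ costs one degree of freedom:

$$
\sum_{i=1}^{n} \left( \frac{x_i - \mu}{\sigma} \right)^{2} \sim \chi^2_{n},
\qquad
\sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{\sigma} \right)^{2} = \frac{(n-1)\, s^2}{\sigma^2} \sim \chi^2_{n-1} .
$$

Setting σ = σ0 in the second expression gives the statistic for testing σ² = σ0².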

30 Chi-square statistic
For a contingency table we use the following chi-square test statistic,
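Assuming the slide showed the Pearson statistic, it has the following form for a table with r rows and c columns. Here $O_{ij}$ is the observed count, $E_{ij}$ the expected count under independence, $R_i$ and $C_j$ the row and column totals, and $N$ the grand total (all notation assumed):

$$
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad E_{ij} = \frac{R_i \, C_j}{N},
\qquad \text{d.f.} = (r - 1)(c - 1) .
$$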

31 Chi-square statistic: SPSS
Analyze > Descriptive Statistics > Crosstabs > Statistics > Chi-square
Select the variables. Click the Cells button to select the items you want in cells, rows, and columns.

32 Example 5 (class demonstration)
Make a contingency table using the two variables sex and grp from our dataset.
Analyze > Descriptive Statistics > Crosstabs > select variables for rows and columns > Statistics > Chi-square > Continue > Cells > make selections > OK.
It will give us a contingency table and the p-value of the Pearson chi-square test. For this particular case, the d.f. is 2 and the p-value of the Pearson chi-square test is
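An equivalent analysis in R can be sketched with `chisq.test`. The 2 × 3 table of counts below is invented (the real sex-by-grp counts are not given here), chosen only so that the d.f. is 2 as in the example. With raw data one would build the table with `table(df1$sex, df1$grp)`, where `df1` is a hypothetical data frame name.

```r
# Invented 2 x 3 table of counts: sex (rows) by grp (columns)
tab <- matrix(c(12, 9, 9,
                 8, 11, 11),
              nrow = 2, byrow = TRUE,
              dimnames = list(sex = c("f", "m"), grp = c("a", "b", "c")))

res <- chisq.test(tab)   # Pearson chi-square test of independence

# The same statistic by hand: expected = row total * column total / grand total
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
stat     <- sum((tab - expected)^2 / expected)

print(c(chisq.test = unname(res$statistic), by_hand = stat, df = unname(res$parameter)))
```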

33 F-statistic: Use
Testing the equality of population variances.
Testing the significance of the difference among several means in analysis of variance.

34 F-statistic
Let X and Y be two independent chi-square variables with n1 and n2 d.f. respectively; then the following statistic follows an F distribution with n1 and n2 d.f.
Let X and Y be two independent normal variables with equal population variances, observed in samples of sizes n1 and n2. Then the following statistic follows an F distribution with (n1 − 1) and (n2 − 1) d.f., where sx² and sy² are the sample variances of X and Y.
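The two statistics described here are presumably the standard ones. Note that the variance-ratio statistic has $(n_1 - 1)$ and $(n_2 - 1)$ d.f., because each sample variance uses one degree of freedom to estimate its mean:

$$
F = \frac{X / n_1}{Y / n_2} \sim F_{n_1,\, n_2},
\qquad
F = \frac{s_x^2}{s_y^2} \sim F_{n_1 - 1,\; n_2 - 1} \quad \text{when } \sigma_x^2 = \sigma_y^2 .
$$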

35 F-statistic: Hypotheses
H0: µ1 = µ2 = … = µn
Ha: Not all of the means are equal (at least one µi differs from the others).
The comparison is done using the analysis of variance (ANOVA) technique, which uses the F statistic. ANOVA will be covered in another class session.

