Statistics for Education Research Lecture 4 Tests on Two Means: Types and Paired-Sample T-tests Instructor: Dr. Tung-hsien He


1 Statistics for Education Research Lecture 4 Tests on Two Means: Types and Paired-Sample T-tests Instructor: Dr. Tung-hsien He the@tea.ntue.edu.tw

2 T-tests (Based on Student’s t distributions) 1. To test whether the difference between the two means of one sample (e.g., pretest mean vs. posttest mean) is significant or non-significant -> paired-sample t-test; 2. To test whether the difference between the means of two samples (e.g., the mean of the experimental group vs. the mean of the control group) is significant or non-significant -> independent t-test. 3. To test whether the hypothesized mean is the population mean -> too manipulative, not a good technique -> one-sample t-test.

3 Conditions for t tests to Be Used: Two Means of a Single Dependent Variable from One Sample or Two Different (Independent) Samples. 1. A Univariate Analysis: t tests analyze a “single dependent variable” at a time, which is why they are called univariate analyses. 2. A dependent variable means “the final result of treatment” or a “phenomenon” that can be measured (e.g., reading achievement, scores on written papers...). t tests are not suitable for testing multiple dependent variables.

4 3. t tests (except for one-sample t test) are used to test two means of a single dependent variable. 4. The two means can come either from a sample (i.e., the pretest and posttest on a sample) or from two samples (i.e., experimental group vs. control group).

5 The Reasons for Using Student’s t Distribution 1. Standard Sampling Distribution of the Mean: When σ is known, the z-score distribution is used: z = (X̄ − μ) / (σ/√n), where σ/√n = standard error (SE). 2. But, in reality, σ is almost always unknown. In this situation, Student’s t distribution is used as the standard sampling distribution of the mean, with Standard Error: s/√n, where s = standard deviation of the sample.

6 3. Formula (Don’t worry, you don’t need to memorize it!)

7 Features of Student’s t distributions: 1. For small sample sizes, the sampling distribution of the mean departs considerably from the normal distribution; large sample sizes are always welcome. 2. It is a family of distributions; 3. As sample sizes increase, the sampling distribution of the mean approximates the normal distribution; 4. Each t distribution is symmetric with a mean equal to 0; its SD is slightly larger than 1 and approaches 1 (the z-score distribution) as the sample size increases.

8 5. All testing procedures are identical to those for the z-score distribution. 6. Each t distribution is related to degrees of freedom (df): n − 1. (1) df: the number of elements of data that are free to vary in calculating a statistic. (2) Why df: each t distribution corresponds to a df. (3) If one restriction is added, the degrees of freedom decrease by one. 7. Check critical values for t distributions in Table c.3, p. 622.

9 Statistical Precision 1. It is the inverse of the standard error; 2. The smaller the standard error, the greater the statistical precision; 3. As the sample size increases, the precision increases accordingly. This is because the standard error is computed with the formula s/√n. When n increases, the standard error decreases and statistical precision increases.
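
The relationship between sample size and standard error described on this slide can be sketched in a few lines of Python. This is only an illustration, not part of the original lecture: the function name `standard_error`, the value s = 0.54, and the sample sizes are assumptions chosen for the example.

```python
import math

def standard_error(s, n):
    """Standard error of the mean: s / sqrt(n)."""
    return s / math.sqrt(n)

s = 0.54  # illustrative sample standard deviation (assumed value)
for n in (20, 80, 320):
    se = standard_error(s, n)
    precision = 1 / se  # statistical precision is the inverse of the standard error
    print(f"n = {n:3d}  SE = {se:.4f}  precision = {precision:.2f}")
```

Because n sits under a square root, quadrupling the sample size only halves the standard error, so precision grows more slowly than the sample size does.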

10 Confidence Interval (CI): Estimation of μ 1. CI: A range of values that we are confident contains the population parameter (i.e., μ will fall between the two values). 2. It means: at the confidence level (1 − α), μ will fall between the CI values. 3. Formula: CI = X̄ ± (t cv) × standard error (Good news, you don’t need to memorize this formula).

11 4. A researcher hypothesizes that the GPA of student athletes is less than 2.5. To test this hypothesis, the researcher selects 20 subjects and finds that the GPA mean of this sample is 2.45, s = 0.54, s² = 0.29; α is set at the 0.05 level. What is the CI?

12 Computation: 1. Standard error = s/√n -> 0.54/√20 = 0.12 2. Critical value of t when n = 20, df = 19, α = 0.05, two-tailed -> t cv = 2.093 3. CI 95 = 2.45 ± [2.093] × 0.12 = 2.45 ± 0.25 = (2.20, 2.70) Interpretation: We are 95% confident that μ will fall between 2.20 and 2.70.
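
The computation on this slide can be reproduced with a short Python sketch. The function name `t_confidence_interval` is an assumption made for illustration; the critical value 2.093 is taken from the slide rather than computed, so the code needs only the standard library.

```python
import math

def t_confidence_interval(mean, s, n, t_cv):
    """Return (lower, upper, standard_error) for CI = mean ± t_cv * s / sqrt(n)."""
    se = s / math.sqrt(n)   # standard error of the mean
    margin = t_cv * se      # half-width of the interval
    return mean - margin, mean + margin, se

# Values from the GPA example: mean = 2.45, s = 0.54, n = 20, t_cv = 2.093 (df = 19)
lower, upper, se = t_confidence_interval(2.45, 0.54, 20, 2.093)
print(f"SE = {se:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
# prints: SE = 0.12, 95% CI = (2.20, 2.70)
```

The sketch keeps full precision for the standard error instead of rounding it to 0.12 first; to two decimal places the interval is the same as on the slide.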

13 Before running statistical tests on means (e.g., t tests) or collecting survey data, researchers need to: (a) decide their sample sizes prior to the data collection and (b) test the data to see if assumptions are met. Sample Sizes: 1. Five factors for determining how large samples should be: a. Confidence Level: 1 − α b. Confidence Interval: Margin of Error

14 c. Power: 1 − β, where β = 4 × α; usually, power is set at 0.80 since it is a minimum requirement. d. Effect Size e. One-Tailed or Two-Tailed 2. Check formula 12.1 on p. 300 (Once again, no need to memorize it)

15 Desired Levels of Power: A Post Hoc Power Analysis vs. A Priori Power Analysis A Post Hoc Power Analysis a. Meaning: The complement of power (1 − power) indicates the maximum chance of making a Type II error (i.e., power and the chance of a Type II error sum to 100%). b. When a null hypothesis is retained, researchers may find this result hard to explain. So, they run a post hoc power analysis to report the chance of a Type II error, that is, the chance of retaining a false Ho.

16 c. Example: If power is set at 0.80, it means the chance of making Type II error (i.e., retaining a false Ho) is no more than 20%. A Priori Power Analysis a. Function: Depending on the desired level of power (e.g., 0.80), researchers can decide their sample sizes. This analysis is called “a priori power analysis.”
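
As a rough illustration of an a priori power analysis, the sketch below estimates the sample size for a one-sample or paired design from the desired power, α, and effect size. It uses a normal approximation, n ≈ ((z for 1 − α) + (z for power))² / d², which is an assumption of this example and not the exact t-based method used by tools such as G*Power. The function name `approx_sample_size` and its parameters are hypothetical.

```python
import math
from statistics import NormalDist

def approx_sample_size(effect_size, alpha=0.05, power=0.80, tails=1):
    """Normal-approximation sample size for a one-sample or paired design."""
    z = NormalDist()                         # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / tails)   # critical z for the chosen alpha
    z_power = z.inv_cdf(power)               # z corresponding to the desired power
    return math.ceil(((z_alpha + z_power) / effect_size) ** 2)

# One-tailed, alpha = .05, power = .80, effect size = 0.4
n = approx_sample_size(0.4, alpha=0.05, power=0.80, tails=1)
print(n)  # prints: 39
```

The approximation slightly underestimates what an exact calculation gives: for the same inputs, the paired-sample scenario later in this lecture reports a required sample size of 41.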

17 Two Meanings of Effect Size: A Specified A Priori Effect Size and A Calculated Post Hoc Effect Size 1. A Specified A Priori Effect Size: a. Meaning: Researchers decide the degree of practical significance before the study. Researchers can specify the effect size to be small, medium, or large.

18 2. A Calculated Post Hoc Effect Size a. Function: It is an index that indicates “practical significance” after “statistical significance” has been identified. b. Meaning: It is an index of the proportion (or percentage) of variability in the dependent variable that is associated with (or is explained by) the grouping variable (independent variable). c. Similarity: Its meaning is similar to r² (shared variance) in correlational studies.

19 d. Example: Check the following table to see different measures of effect size and their cutoff scores (SPSS only produces partial eta squared: ηp²)

20 e. Effect sizes are often reported in experimental studies when two means are involved, after significant differences are identified. The higher the effect size, the stronger the effect of the treatment.

21 Sample Sizes for Survey & Two or More Samples a. For Survey: Online Calculator b. For Other Univariate or Multivariate Analysis: G*Power c. Please Download “G*Power” to Your Computer. Refer to [Statistics: Sample Size Supplementary Materials, Sample Size Calculators] for detailed info.

22 Before t tests are used, two assumptions must be met: Normality and Homogeneity of Variance 1. Normality Tests for Paired-Sample t Tests & Independent-Sample t Tests: a. Apply the Kolmogorov-Smirnov test (Nonsignificance is desired) b. Check Kurtosis & Skewness (within ±1)

23 2. Homogeneity Test for Independent-Sample t Tests (Equal Variance Assumption): a. For two or more samples, their population variances must be equal (For Paired-Sample t Tests, there is no need to test homogeneity since only one population (sample) is involved)

24 b. For instance, in an experimental study, before treatment is given, the experimental group and control group must be homogeneous so that no source other than the treatment contributes to significant differences. Results found from samples will be generalized to the population. c. Levene’s test is widely used (set the α level at 0.01 or 0.001 because this test is very sensitive to sample sizes) (Nonsignificance is desired).

25 d. Samples of Same Sizes (Robustness of Homogeneity): The best way to deal with homogeneity is equal ns.
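
The skewness and kurtosis check mentioned under the normality assumption can be sketched as follows. This is a simplified illustration under stated assumptions: it uses plain moment-based formulas (with excess kurtosis, so a normal distribution scores 0), whereas statistical packages such as SPSS apply small-sample adjustments and will report slightly different values. The function names and the scores are hypothetical.

```python
from statistics import fmean

def skewness_and_kurtosis(data):
    """Moment-based skewness and excess kurtosis (no small-sample correction)."""
    n = len(data)
    m = fmean(data)
    m2 = sum((x - m) ** 2 for x in data) / n  # second central moment (variance)
    m3 = sum((x - m) ** 3 for x in data) / n  # third central moment
    m4 = sum((x - m) ** 4 for x in data) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3

def within_rule_of_thumb(data, limit=1.0):
    """True if both skewness and excess kurtosis fall within ±limit."""
    skew, kurt = skewness_and_kurtosis(data)
    return abs(skew) <= limit and abs(kurt) <= limit

scores = [6, 7, 8, 8, 9, 9, 9, 10, 10, 11, 12]  # hypothetical test scores
skew, kurt = skewness_and_kurtosis(scores)
print(f"skewness = {skew:.2f}, excess kurtosis = {kurt:.2f}, "
      f"within ±1: {within_rule_of_thumb(scores)}")
```

A check like this complements, rather than replaces, a formal test such as Kolmogorov-Smirnov.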

26 Alternatives to Violated Assumptions 1. Data Transformations: A square-root transformation may compensate for these violations (a favored technique) 2. Nonparametric Tests When Normality is Violated: a. Nonparametric tests are used because they do not need to meet the normality assumption. Thus, they do not support inferences about population parameters. Usually, the sample size is very small (n ≤ 20).

27 b. Two-Sample Kolmogorov-Smirnov Test or Mann-Whitney Test for two means, or Wilcoxon Signed Rank Test (n ≤ 20) for paired means. 3. Use of Another Type of t Test to Deal with a Violated Assumption: Welch’s t test (Not available in SPSS).

28 One-Sample t Test: 1. Purpose: One-sample t-tests test whether the mean of a single variable differs from a specified constant. 2. Examples: A researcher might want to test whether the average IQ score for a group of students differs from 100. Or, a cereal manufacturer can take a sample of boxes from the production line and check whether the mean weight of the samples differs from 1.3 pounds at the 95% confidence level. 3. Seldom used in language education research.

29 4. Recall the following example we illustrated in the previous class: We hypothesize μ to be 455, and the standard deviation of the population, σ, is hypothesized to be 100. Then we select a sample of 144 subjects and find its sample mean = 535. We formulate the following hypotheses: Ho: μ = 455, Ha: μ ≠ 455. This is the use of a one-sample t-test.
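
The test statistic for this example can be computed with a short sketch. The function name `one_sample_statistic` is an assumption made for illustration. Note that because the example treats the population standard deviation as given (100), the resulting value is strictly a z statistic; with a sample standard deviation s in its place, the same formula yields t with df = n − 1.

```python
import math

def one_sample_statistic(sample_mean, hypothesized_mean, sd, n):
    """(sample mean - hypothesized mean) divided by the standard error sd / sqrt(n)."""
    se = sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / se

# Values from the example: sample mean = 535, hypothesized mean = 455, SD = 100, n = 144
stat = one_sample_statistic(535, 455, 100, 144)
print(round(stat, 2))  # prints: 9.6
```

A value this far from 0 exceeds any conventional critical value, so Ho: μ = 455 would be rejected.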

30 Paired-Samples T-Test: 1. Paired-samples t-tests compare two means of a single dependent variable from a single sample. 2. Example: Pretest and Posttest on a Group a. The same measure is applied to a single sample twice to collect data on a single dependent variable.

31 b. Scenario in Demo File: In a study that focuses on the effects of Phonics instruction on reading comprehension, the researcher randomly selects a sample of 40 subjects. Before Phonics instruction starts, the researcher tests the selected sample’s reading comprehension. The reading test has 3 items, and each item is worth 5 points (maximum score is 15). After 4 weeks, the researcher retests the subjects using the same test. The researcher wants to know whether the instruction has a significant effect on the sample’s reading comprehension.

32 c. Condition: One sample and two sets of data on a single dependent variable d. Goal: To see whether the difference between the pretest mean and posttest mean is so large (significant) that it is very unlikely that sampling error caused this difference. e. X pretest = sampling error + reading proficiency (constant); X posttest = sampling error + reading proficiency + effects of instruction

33 f. Hypothesis Testing: Ho: X pretest = X posttest; Ha: X pretest ≠ X posttest g. one-tailed, α = .05, power = 0.80, effect size = 0.4, n = 40 (according to our analysis for sample size, the sample size should be 41) h. Proper Stat Technique: Paired-Sample t-test i. Check Assumptions for t-test

34 j. Report effect sizes (post hoc) so that “practical significance” can be clarified once “statistical significance” is found. k. Go to the following URL to compute the “effect size” Cohen’s d: http://www.uccs.edu/~faculty/lbecker/
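
The paired-sample procedure outlined in these slides can be sketched in Python. Everything specific here is an assumption made for illustration: the function name `paired_t_and_cohens_d`, the eight pretest and posttest scores (the demo file's actual data are not reproduced in this transcript), and the choice of effect size formula (mean difference divided by the SD of the differences, one of several variants of Cohen's d for paired data). The critical value 1.895 is the standard one-tailed t table entry for df = 7 at α = .05.

```python
import math
from statistics import mean, stdev

def paired_t_and_cohens_d(pre, post):
    """Return (t, df, d) for paired scores, using the post - pre differences."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)           # sample SD of the differences (n - 1 denominator)
    se = sd_diff / math.sqrt(n)      # standard error of the mean difference
    t = mean_diff / se
    d = mean_diff / sd_diff          # effect size based on the difference scores
    return t, n - 1, d

# Hypothetical scores for 8 subjects on a 15-point reading test
pretest = [8, 9, 7, 10, 6, 9, 8, 7]
posttest = [10, 11, 8, 12, 9, 10, 9, 9]

t, df, d = paired_t_and_cohens_d(pretest, posttest)
T_CRITICAL = 1.895  # one-tailed, alpha = .05, df = 7 (from a t table)
print(f"t({df}) = {t:.2f}, Cohen's d = {d:.2f}, significant: {t > T_CRITICAL}")
# prints: t(7) = 7.00, Cohen's d = 2.47, significant: True
```

With real data, the assumption checks listed earlier would come first, and the effect size would be reported only after a significant t is found.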

