SW388R6 Data Analysis and Computers I Slide 1 Independent Samples T-Test of Population Means Key Points about Statistical Test Sample Homework Problem.


1 SW388R6 Data Analysis and Computers I Slide 1 Independent Samples T-Test of Population Means
- Key Points about Statistical Test
- Sample Homework Problem
- Solving the Problem with SPSS
- Logic for Independent Samples T-Test of Population Means
- Power Analysis

2 SW388R6 Data Analysis and Computers I Slide 2 Independent Samples T-Test: Purpose
Purpose: test whether the populations represented by the two samples have different means.
Examples:
- Social work students have higher GPAs than nursing students
- Social work students volunteer for more hours per week than education majors
- UT social work students score higher on licensing exams than graduates of Texas State University

3 SW388R6 Data Analysis and Computers I Slide 3 Independent Samples T-Test: Hypotheses
Hypotheses:
- Null: mean of population 1 = mean of population 2
versus
- Research: mean of population 1 < mean of population 2
- Research: mean of population 1 ≠ mean of population 2
- Research: mean of population 1 > mean of population 2
Decision:
- Reject the null hypothesis if p(SPSS) ≤ alpha (≠ relationship)
- Reject the null hypothesis if p(SPSS) ÷ 2 ≤ alpha (< or > relationship)

4 SW388R6 Data Analysis and Computers I Slide 4 Independent Samples T-Test: Assumptions and Requirements
- Variable is interval level (ordinal with caution)
- Variable is normally distributed, established by either:
  - an acceptable degree of skewness and kurtosis, or
  - the Central Limit Theorem (30+ cases in each group)
- The variances of the two groups are not different (if different, use the alternative formula)

5 SW388R6 Data Analysis and Computers I Slide 5 Independent Samples T-Test: Effect Size
- Cohen’s d measures the difference in means in standard deviation units.
- Cohen’s d = (difference in population means) ÷ (population standard deviation)
- Interpretation:
  - small: d = .20 to .50
  - medium: d = .50 to .80
  - large: d = .80 and higher
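
The effect size formula and cutoffs above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the "below small" label is an assumption (the slide does not name values under .20), and a value that falls exactly on a shared boundary such as .50 is assigned to the higher category, which the slide leaves open.

```python
def cohens_d(mean_1, mean_2, sd):
    """Difference in means expressed in standard deviation units."""
    return (mean_1 - mean_2) / sd


def interpret_d(d):
    """Label an effect size using the cutoffs listed on the slide."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "below small"  # assumed label; the slide does not name this range


# Means and pooled standard deviation taken from the power analysis slides.
d = cohens_d(42.04, 40.55, 12.68)
print(round(d, 2), interpret_d(d))
```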

6 SW388R6 Data Analysis and Computers I Slide 6 Independent Samples T-Test: APA Style
An independent samples t-test is presented the same way as the one-sample t-test: t(75) = 2.11, p = .02 (one-tailed), d = .48
The parts of this statement are: the degrees of freedom (75), the value of the statistic (2.11), the significance of the statistic (.02), a note if the test is one-tailed, and the effect size if available (d = .48).
Example: Survey respondents who were employed by the federal, state, or local government had significantly higher socioeconomic indices (M = 55.42, SD = 19.25) than survey respondents who were employed by a private employer (M = 47.54, SD = 18.94), t(255) = 2.363, p = .01 (one-tailed).

7 SW388R6 Data Analysis and Computers I Slide 7 Homework problems: Independent Samples T-Test of Population Means
This is the general framework for the problems in the homework assignment on “Independent Samples T-Test of Population Means.” The description is similar to findings one might state in a research article.
This problem uses the data set GSS2000R.Sav to compare the average score on the variable "highest year of school completed" [educ] for groups of survey respondents defined by the variable "governmental employment" [wrkgovt]. Using an independent samples t-test with an alpha of .05, is the following statement true, true with caution, false, or an incorrect application of a statistic?
Survey respondents who were employed by the federal, state, or local government completed significantly more years of school (M = 13.97, SD = 3.27) than survey respondents who were employed by a private employer (M = 13.07, SD = 2.84).
o True
o True with caution
o False
o Incorrect application of a statistic

8 SW388R6 Data Analysis and Computers I Slide 8 Homework problems: Independent Samples T-Test - Data set, variables, and sample
The first paragraph of the problem statement identifies:
- The data set to use, e.g. GSS2000R.Sav
- The groups that will be compared in the analysis
- The variable compared in the t-test
- The alpha level to use for the hypothesis test

9 SW388R6 Data Analysis and Computers I Slide 9 Homework problems: Independent Samples T-Test - Specifications
The second paragraph of the problem statement specifies:
- The sample means and standard deviations for the groups being compared
- The relationship for deriving the research hypothesis

10 SW388R6 Data Analysis and Computers I Slide 10 Homework problems: Independent Samples T-Test - Choosing an answer
The answer to a problem will be True if the t-test supports the finding in the problem statement.
The answer to a problem will be True with caution if the t-test supports the finding in the problem statement, but the dependent variable is ordinal level.
The answer to a problem will be False if the t-test does not support the finding in the problem statement.
The answer to a problem will be Incorrect application of a statistic if:
- the t-test violates the level of measurement requirement, i.e. the dependent variable is nominal
- the assumption of normality of the dependent variable is violated and the central limit theorem doesn’t apply
- the independent variable is not dichotomous

11 SW388R6 Data Analysis and Computers I Slide 11 Solving the problem with SPSS: Identifying numeric codes for groups - 1 Our first task in SPSS is to identify the numeric codes for the groups that SPSS will require us to specify. The problem statement tells us “This problem uses the data set GSS2000R.Sav to compare the average score on the variable "highest year of school completed" [educ] for groups of survey respondents defined by the variable "governmental employment" [wrkgovt].” Select the Variables… command from the Utilities menu. NOTE: in our problems we required that the grouping, or independent variable, be dichotomous, because there are other statistical tests to use when there are more than two groups. SPSS does not require the independent variable to be dichotomous, but it does require that you enter the numeric codes for the two groups (possibly out of a larger number of groups) that you wish to compare.

12 SW388R6 Data Analysis and Computers I Slide 12 Solving the problem with SPSS: Identifying numeric codes for groups - 2 Scroll through the list of variables until you see wrkgovt. Click on wrkgovt and the information for the variable appears in the panel to the right. The Variable Information panel shows us the text labels that the creator of the data set assigned to each of the possible numeric responses for this variable. The numeric codes for the groups we want to compare are: 1 (GOVERNMENT) and 2 (PRIVATE). The remaining numeric codes represent missing data: 0 (NAP), 8 (DK), and 9 (NA). Click on Close to dismiss the dialog box.

13 SW388R6 Data Analysis and Computers I Slide 13 Solving the problem with SPSS: Level of measurement Statistical tests of means require that the dependent variable be interval level. "Highest year of school completed" [educ] is interval level, satisfying the requirement. In our analyses, we will allow the dependent variable to be ordinal, which violates this requirement in the strictest interpretation of level of measurement. However, since the research literature often computes means for ordinal level data, especially scaled measures, we will follow the convention of applying interval level statistics to ordinal data. Since not all analysts agree with this convention, a caution is added to any true findings.

14 SW388R6 Data Analysis and Computers I Slide 14 Solving the problem with SPSS: Evaluating normality - 1 The independent samples t-test uses the t-distribution for the probability of the test statistic. To obtain accurate probabilities, the variable must follow a normal distribution. We will generate descriptive statistics to evaluate normality. Select the Descriptive Statistics > Descriptives… command from the Analyze menu.

15 SW388R6 Data Analysis and Computers I Slide 15 Solving the problem with SPSS: Evaluating normality - 2 First, move the variable we will use in the t-test, educ, to the Variable(s) list box. Second, click on the Options… button to select the statistics we want.

16 SW388R6 Data Analysis and Computers I Slide 16 Solving the problem with SPSS: Evaluating normality - 3 First, in addition to the statistics SPSS has checked by default, mark the Kurtosis and Skewness check boxes on the Distribution panel. Second, click on the Continue button to close the dialog box.

17 SW388R6 Data Analysis and Computers I Slide 17 Solving the problem with SPSS: Evaluating normality - 4 Click on the OK button to obtain the output.

18 SW388R6 Data Analysis and Computers I Slide 18 Solving the problem with SPSS: Evaluating normality - 5 "Highest year of school completed" [educ] did not satisfy the criteria for a normal distribution. The skewness of the distribution (-.137) was between -1.0 and +1.0, but the kurtosis of the distribution (1.246) fell outside the range from -1.0 to +1.0. Having failed the normality requirement using these criteria, we will see if we can apply the central limit theorem.

19 SW388R6 Data Analysis and Computers I Slide 19 Solving the problem with SPSS: The independent-samples t-test - 1 The number of cases in each group is part of the output for the independent samples t-test, so we will go ahead and compute that test to continue addressing the issue of normality. Select Compare Means > Independent-Samples T Test… from the Analyze menu.

20 SW388R6 Data Analysis and Computers I Slide 20 Solving the problem with SPSS: The independent-samples t-test - 2 First, move the dependent variable educ to the Test Variable(s) list box. Second, move the independent variable wrkgovt to the Grouping Variable text box. Note that SPSS lists two question marks after the variable name and activates the Define Groups… button as its clue for what it wants us to do next. Third, click on the Define Groups… button.

21 SW388R6 Data Analysis and Computers I Slide 21 Solving the problem with SPSS: The independent-samples t-test - 3 First, type in the numeric codes for the groups in the wrkgovt variable that we looked up at the beginning of the problem. Second, click on the Continue button to close the dialog box.

22 SW388R6 Data Analysis and Computers I Slide 22 Solving the problem with SPSS: The independent-samples t-test - 4 Note that SPSS has replaced the question marks after the variable name with the numeric codes we typed in. Click on the OK button to close the dialog box.

23 SW388R6 Data Analysis and Computers I Slide 23 Solving the problem with SPSS: Evaluating normality with the central limit theorem - 6 Since survey respondents who were employed by the federal, state, or local government had 38 cases and survey respondents who were employed by a private employer had 217 cases, the assumption of normality was satisfied by the Central Limit Theorem, which requires both groups to have 30 or more cases. If we were unable to establish normality either by the distribution or by the central limit theorem, the t-test would not be an appropriate statistic.
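
The two-stage normality check used in these slides (skewness and kurtosis first, then the central limit theorem) can be written as a small function. This is a sketch of the course's rule of thumb only; the function name and return format are hypothetical, and the skewness and kurtosis values are assumed to come from the SPSS Descriptives output.

```python
def normality_satisfied(skewness, kurtosis, n_group_1, n_group_2, min_n=30):
    """Apply the slides' rule: distribution criteria first, then the CLT."""
    # Criterion 1: both skewness and kurtosis fall between -1.0 and +1.0.
    if -1.0 <= skewness <= 1.0 and -1.0 <= kurtosis <= 1.0:
        return True, "distribution"
    # Criterion 2: central limit theorem, 30 or more cases in each group.
    if n_group_1 >= min_n and n_group_2 >= min_n:
        return True, "central limit theorem"
    return False, "not satisfied"


# Values reported for educ: skewness -.137, kurtosis 1.246, groups of 38 and 217.
print(normality_satisfied(-0.137, 1.246, 38, 217))
```

With the values from the slides, the first criterion fails on kurtosis and the second succeeds, matching the conclusion reached above.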

24 SW388R6 Data Analysis and Computers I Slide 24 Solving the problem with SPSS: Evaluating equality of group variances - 1 The independent-samples t-test assumes that the variances of the dependent variable for both groups are equal in the population. This assumption is evaluated with Levene's Test for Equality of Variances. The null hypothesis for this test states that the variances of both groups are equal. The desired outcome for this test is to fail to reject the null hypothesis, which is consistent with equal variances. The probability associated with Levene's Test for Equality of Variances (.161) is greater than alpha (.05), indicating that the 'Equal variances assumed' formula for the independent samples t-test should be used for the analysis.

25 SW388R6 Data Analysis and Computers I Slide 25 Solving the problem with SPSS: Evaluating equality of group variances - 2 Since we failed to reject the null hypothesis for Levene’s test, the 'Equal variances assumed' formula for the independent samples t-test should be used for the analysis. Had the probability associated with Levene’s test been less than or equal to the alpha level, we would have used the statistics for the ‘Equal variances not assumed’ row in the table.
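
The choice between the two rows of the SPSS output reduces to one comparison. The sketch below uses a hypothetical function name and follows the "less than or equal to alpha" wording of the logic diagram later in the presentation.

```python
def choose_t_test_row(levene_p, alpha=0.05):
    """Pick the row of the SPSS t-test table based on Levene's test."""
    if levene_p <= alpha:
        # Equality of variances is rejected.
        return "Equal variances not assumed"
    return "Equal variances assumed"


print(choose_t_test_row(0.161))  # Levene probability from the slides
```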

26 SW388R6 Data Analysis and Computers I Slide 26 Solving the problem with SPSS: Answering the question - 1 The finding we are trying to verify is: Survey respondents who were employed by the federal, state, or local government completed significantly more years of school (M = 13.97, SD = 3.27) than survey respondents who were employed by a private employer (M = 13.07, SD = 2.84). Our first task is to make certain we have solved the right problem. First, we check to make certain we have the correct groups in the output. Second, we verify that the mean and standard deviations for the groups match the problem statement.

27 SW388R6 Data Analysis and Computers I Slide 27 Solving the problem with SPSS: Answering the question - 2 The finding we are trying to verify is: Survey respondents who were employed by the federal, state, or local government completed significantly more years of school (M = 13.97, SD = 3.27) than survey respondents who were employed by a private employer (M = 13.07, SD = 2.84). Since the problem states that the mean for one group is significantly higher than the mean of the other group, the research hypothesis is a one-tailed test. We divide the SPSS 2-tailed significance (.080) in half and make our decision about the null hypothesis by comparing p =.04 to alpha =.05.

28 SW388R6 Data Analysis and Computers I Slide 28 Solving the problem with SPSS: Answering the question - 3 The answer to the question is True. We can include the t-test results in our statement of the finding: Survey respondents who were employed by the federal, state, or local government completed significantly more years of school (M = 13.97, SD = 3.27) than survey respondents who were employed by a private employer (M = 13.07, SD = 2.84), t(253) = 1.761, p = .04 (one-tailed).
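
As a cross-check on the SPSS output, the 'Equal variances assumed' t statistic can be recomputed from the summary statistics in the problem statement and the group sizes reported earlier (38 and 217). The sketch below uses the standard pooled-variance formula; the function name is hypothetical, and because the means and standard deviations are rounded to two decimals, the result can differ from SPSS in the third decimal.

```python
import math


def pooled_t(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Independent-samples t statistic using the pooled variance estimate."""
    df = n_1 + n_2 - 2  # degrees of freedom for the equal-variances test
    pooled_var = ((n_1 - 1) * sd_1**2 + (n_2 - 1) * sd_2**2) / df
    standard_error = math.sqrt(pooled_var * (1 / n_1 + 1 / n_2))
    return (mean_1 - mean_2) / standard_error, df


t, df = pooled_t(13.97, 3.27, 38, 13.07, 2.84, 217)
print(round(t, 2), df)
```

The p-value is not computed here because the Python standard library has no t-distribution; the two-tailed significance (.080) is taken from the SPSS output.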

29 SW388R6 Data Analysis and Computers I Slide 29 Logic for independent-samples t-test: Level of measurement
Measurement level of independent variable?
- Dichotomous: continue
- Interval/ordinal/nominal: Inappropriate application of a statistic
Measurement level of dependent variable?
- Interval/ordinal: continue
- Nominal/dichotomous: Inappropriate application of a statistic
Strictly speaking, the test requires an interval level variable. We will allow ordinal level variables with a caution.

30 SW388R6 Data Analysis and Computers I Slide 30 Logic for independent-samples t-test: Assumption of normality
Skewness and kurtosis between -1.0 and +1.0?
- Yes: continue
- No: Number of cases in both groups is at least 30?
  - Yes: continue
  - No: Inappropriate application of a statistic

31 SW388R6 Data Analysis and Computers I Slide 31 Logic for independent-samples t-test: Assumption of equality of variances
Probability for Levene test of equality of population variances less than or equal to alpha?
- Yes: Use ‘Equal variances not assumed’
- No: Use ‘Equal variances assumed’

32 SW388R6 Data Analysis and Computers I Slide 32 Logic for independent-samples t-test: Means and standard deviations correct
Mean and standard deviation of both groups are correct?
- Yes: continue
- No: False

33 SW388R6 Data Analysis and Computers I Slide 33 Logic for independent-samples t-test: Decision about null hypothesis
One-tailed or two-tailed test?
- One-tailed: Divide two-tailed significance by 2
- Two-tailed: use the two-tailed significance as reported
Probability for t-test less than or equal to alpha?
- Yes: True (add caution for ordinal dependent variable)
- No: False
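
The full decision logic for the homework problems can be collected into one function. This is a hedged sketch, not part of the course materials: the function and parameter names are hypothetical, the inputs are assumed to have been read off the SPSS output by hand, and halving the two-tailed significance assumes the observed difference is in the direction the problem states.

```python
def answer_problem(dv_level, iv_dichotomous, normality_ok, stats_match,
                   p_two_tailed, one_tailed, alpha=0.05):
    """Return the homework answer implied by the logic slides."""
    # Level of measurement checks.
    if dv_level not in ("interval", "ordinal") or not iv_dichotomous:
        return "Incorrect application of a statistic"
    # Normality by skewness/kurtosis or by the central limit theorem.
    if not normality_ok:
        return "Incorrect application of a statistic"
    # Means and standard deviations must match the problem statement.
    if not stats_match:
        return "False"
    # Halving assumes the difference is in the hypothesized direction.
    p = p_two_tailed / 2 if one_tailed else p_two_tailed
    if p > alpha:
        return "False"
    return "True with caution" if dv_level == "ordinal" else "True"


# Sample problem: interval DV, two-tailed significance .080, one-tailed hypothesis.
print(answer_problem("interval", True, True, True, 0.080, True))
```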

34 SW388R6 Data Analysis and Computers I Slide 34 Power Analysis: Independent-samples T-test Problem that was False This problem uses the data set GSS2000R.Sav to compare the average score on the variable "number of hours worked in the past week" [hrs1] for groups of survey respondents defined by the variable "self-employment" [wrkslf]. Using an independent samples t-test with an alpha of.05, is the following statement true, true with caution, false, or an incorrect application of a statistic? Survey respondents who were self-employed worked significantly longer hours in the past week (M = 42.04, SD = 13.86) than survey respondents who were working for someone else (M = 40.55, SD = 12.46). 1 True 2 True with caution 3 False 4 Incorrect application of a statistic The answer to this problem was false because the probability for the t-test was.29 (one-tailed), greater than the alpha of 0.05. We can conduct a post-hoc power analysis to determine what number of cases would have been sufficient to have a better opportunity to find a statistically significant difference.

35 SW388R6 Data Analysis and Computers I Slide 35 Power Analysis: Statistical Results for False Independent-samples T-test - 1 The answer to the problem was false because the one-tailed significance was p =.29 (.583 ÷ 2), greater than the alpha of.05.

36 SW388R6 Data Analysis and Computers I Slide 36 Power Analysis: Statistical Results for False Independent-samples T-test - 2 To calculate the effect size, and corresponding power, for this problem, we need a pooled estimate of the standard deviation for the two groups. SamplePower will calculate that for us once we enter the sample sizes, means, and standard deviations for the two groups.

37 SW388R6 Data Analysis and Computers I Slide 37 Access to SPSS’s SamplePower Program The UT license for SPSS does not include SamplePower, the SPSS program for power analysis. However, the program is available on the UT timesharing server. Information about accessing this program is available at this site.

38 SW388R6 Data Analysis and Computers I Slide 38 Power Analysis for Independent-samples T-test - 1 In the SamplePower program on the ITS Timesharing Systems, select the New… command from the File menu.

39 SW388R6 Data Analysis and Computers I Slide 39 Power Analysis for Independent-samples T-test - 2 First, select the Means tab to access the tests for means. Second, since we want to enter the means for our two groups, select the option button for t-test for 2 (independent) groups with common variance (Enter means) Third, click on the Ok button to enter the specific values for our problem.

40 SW388R6 Data Analysis and Computers I Slide 40 Power Analysis for Independent-samples T-test – 3 I want my entries to display two decimal places, instead of the default of 1, so I click on the Decimals displayed tool button.

41 SW388R6 Data Analysis and Computers I Slide 41 Power Analysis for Independent-samples T-test – 4 First, click the up arrow button on the spinner for Decimals for data entry until 2 appears. Second, click on the OK button to close the dialog box.

42 SW388R6 Data Analysis and Computers I Slide 42 Power Analysis for Independent-samples T-test - 5 SPSS sets the default test to a two-tailed test with an alpha of .05. Since our test was a one-tailed test with an alpha of .05, we click on the text that states the SPSS default so that we can change the test specifications.

43 SW388R6 Data Analysis and Computers I Slide 43 Power Analysis for Independent-samples T-test - 6 First, click on the 1 Tailed option on the Tails panel. Second, click on the Ok button to change the test specifications.

44 SW388R6 Data Analysis and Computers I Slide 44 Power Analysis for Independent-samples T-test - 7 We enter the values from the SPSS output from the independent-samples t-test for the Population 1 group:
- 42.04 for Population Mean
- 13.86 for Standard Deviation
- 26 for the N Per Group
Note that SPSS fills in the standard deviation and N Per Group numbers for Population 2 with the same values.

45 SW388R6 Data Analysis and Computers I Slide 45 Power Analysis for Independent-samples T-test – 8 First, enter the population mean for the second group, 40.55. When we click on the box to change the Standard Deviation, this message appears. Since the standard deviation for our two groups is not the same, we click on the Yes button.

46 SW388R6 Data Analysis and Computers I Slide 46 Power Analysis for Independent-samples T-test – 9 We are now able to enter the standard deviation for the second group, 12.46.

47 SW388R6 Data Analysis and Computers I Slide 47 Power Analysis for Independent-samples T-test – 10 When we click on the box to change the N Per Group for the second group, the message box below appears. Since the number of cases for our two groups is not the same, we click on the Yes button.

48 SW388R6 Data Analysis and Computers I Slide 48 Power Analysis for Independent-samples T-test - 11 We are now able to enter the N Per Group for the second group, 145. Having entered the values for the two groups, we now click on the Compute button.

49 SW388R6 Data Analysis and Computers I Slide 49 Power Analysis for Independent-samples T-test - 12 SamplePower tells us that our power to obtain statistical significance was 14%, translating to a possible successful outcome 1 in 7 tries.

50 SW388R6 Data Analysis and Computers I Slide 50 Power Analysis for Independent-samples T-test – 13 With the mean difference of 1.49 and a pooled standard deviation of 12.68, we can use a calculator to compute the effect size of .12 (Cohen’s d), about half of what would typically be characterized as a small effect. Suppose, however, that even a very small effect of this size had important consequences. We can ask ourselves how large the sample would need to have been in order to find a statistically significant effect.
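
The pooled standard deviation, the effect size, and a rough power figure can be reproduced outside SamplePower. The sketch below is an approximation under stated assumptions: the function names are hypothetical, and the power calculation substitutes a normal approximation for whatever method SamplePower uses internally, which the slides do not describe.

```python
import math
from statistics import NormalDist


def pooled_sd(sd_1, n_1, sd_2, n_2):
    """Pooled standard deviation for two independent groups."""
    pooled_var = ((n_1 - 1) * sd_1**2 + (n_2 - 1) * sd_2**2) / (n_1 + n_2 - 2)
    return math.sqrt(pooled_var)


def approx_power(d, n_1, n_2, alpha=0.05):
    """One-tailed power from a normal approximation (assumption of this sketch)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)
    noncentrality = d * math.sqrt(n_1 * n_2 / (n_1 + n_2))
    return z.cdf(noncentrality - z_crit)


# Group statistics entered into SamplePower on the preceding slides.
sd = pooled_sd(13.86, 26, 12.46, 145)
d = (42.04 - 40.55) / sd
power = approx_power(d, 26, 145)
print(round(sd, 2), round(d, 2), round(power, 2))
```

The printed values should land close to the 12.68, .12, and 14% reported in the slides; small differences are expected because of the approximation.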

51 SW388R6 Data Analysis and Computers I Slide 51 Power Analysis for Independent-samples T-test - 14 To find the group sizes needed, select Find N for power of 80% from the Tools menu.

52 SW388R6 Data Analysis and Computers I Slide 52 Power Analysis for Independent-samples T-test – 15 This dialog box appears. SamplePower will need additional information to know how it should increase the size of each group. Click on the Yes button to link the group sample sizes.

53 SW388R6 Data Analysis and Computers I Slide 53 Power Analysis for Independent-samples T-test - 16 First, assuming the proportion of cases in each of our groups was representative of the population, we mark the option button to Link Sample Size in two groups. Second, using a calculator, I compute that group 2 was about 6 times larger than group 1, so I increase the second spinner to 6. Third, click OK to close the dialog box.

54 SW388R6 Data Analysis and Computers I Slide 54 Power Analysis for Independent-samples T-test - 17 To find the group sizes needed, again select Find N for power of 80% from the Tools menu.

55 SW388R6 Data Analysis and Computers I Slide 55 Power Analysis for Independent-samples T-test - 18 SamplePower indicates that we would have needed a total sample of 3,654 to detect this very small effect size in the population. This very small effect size would have to have very important consequences in order to justify the expense of collecting samples this large.
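
The order of magnitude of this sample size can be checked with a normal approximation. The sketch below is not SamplePower's algorithm; the function name is hypothetical, the effect size is taken as 1.49 ÷ 12.68 from the earlier slide, and group 2 is assumed to be 6 times the size of group 1, as entered in the linking dialog.

```python
import math
from statistics import NormalDist


def approx_group_sizes(d, ratio, power=0.80, alpha=0.05):
    """Approximate one-tailed group sizes when n_2 = ratio * n_1."""
    z = NormalDist()
    target = z.inv_cdf(1 - alpha) + z.inv_cdf(power)
    # The noncentrality d * sqrt(n_1 * n_2 / (n_1 + n_2)) with n_2 = ratio * n_1
    # simplifies to d * sqrt(n_1 * ratio / (1 + ratio)); solve for n_1.
    n_1 = math.ceil((target / d) ** 2 * (1 + ratio) / ratio)
    return n_1, ratio * n_1


n_1, n_2 = approx_group_sizes(1.49 / 12.68, ratio=6)
print(n_1, n_2, n_1 + n_2)
```

The total should come out within a fraction of a percent of the 3,654 reported by SamplePower, with the small gap attributable to the approximation and to rounding.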

