COURSE: JUST 3900 INTRODUCTORY STATISTICS FOR CRIMINAL JUSTICE Instructor: Dr. John J. Kerbs, Associate Professor Joint Ph.D. in Social Work and Sociology.


1 COURSE: JUST 3900 INTRODUCTORY STATISTICS FOR CRIMINAL JUSTICE Instructor: Dr. John J. Kerbs, Associate Professor Joint Ph.D. in Social Work and Sociology Chapter 9: Introduction to the t Statistic © 2013 - - DO NOT CITE, QUOTE, REPRODUCE, OR DISSEMINATE WITHOUT WRITTEN PERMISSION FROM THE AUTHOR: Dr. John J. Kerbs can be emailed for permission at kerbsj@ecu.edu

2 Some Review from Earlier Chapters The older formulas from prior chapters are important for understanding t-tests as discussed in Chapter 9. Please review these formulas and commit them to memory. NOTE: Estimated standard errors can be tricky to calculate because it is easy to confuse the steps of the two-step formula: 1) calculate the sample variance, $s^2 = SS/(n - 1) = SS/df$, and 2) calculate the estimated standard error, $s_M = \sqrt{s^2/n}$, which is the square root of the quantity (sample variance divided by n).
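The two-step calculation in the note above can be sketched in Python as follows. This is a minimal illustration, not part of the original slides; the function names and the four scores are hypothetical.

```python
import math

def sample_variance(scores):
    """Step 1: s^2 = SS / (n - 1), where SS is the sum of squared deviations from M."""
    n = len(scores)
    mean = sum(scores) / n
    ss = sum((x - mean) ** 2 for x in scores)
    return ss / (n - 1)

def estimated_standard_error(scores):
    """Step 2: s_M = sqrt(s^2 / n). Divide by n first, then take the square root."""
    return math.sqrt(sample_variance(scores) / len(scores))

scores = [2, 4, 6, 8]  # hypothetical data: M = 5, SS = 20
s2 = sample_variance(scores)
s_m = estimated_standard_error(scores)
print(round(s2, 4), round(s_m, 4))
```

Keeping the two steps in separate functions makes it harder to divide the variance by n after taking the square root, which is a common mistake.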

3 Moving from z-Scores to t-Statistics Using the information from the prior slide, we can convert the z-score formula to a t-statistic formula as follows: $z = \frac{M - \mu}{\sigma_M}$, where $\sigma_M = \sigma/\sqrt{n}$, becomes $t = \frac{M - \mu}{s_M}$, where $s_M = s/\sqrt{n} = \sqrt{s^2/n}$.

4 The t Statistic The t statistic allows researchers to use sample data to test hypotheses about an unknown population mean. The particular advantage of the t statistic is that it does not require any knowledge of the population standard deviation.

5 The t Statistic (cont’d.) Thus, the t statistic can be used to test hypotheses about a completely unknown population; that is, both μ and σ are unknown, and the only available information about the population comes from the sample. All that is required for a hypothesis test with t is a sample and a reasonable hypothesis about the population mean.

6 The Estimated Standard Error and the t Statistic The goal for a hypothesis test is to evaluate the significance of the observed discrepancy between a sample mean and the population mean. Whenever a sample is obtained from a population, you expect to find some discrepancy or "error" between the sample mean and the population mean. This general phenomenon is known as sampling error.

7 The Estimated Standard Error and the t Statistic (cont’d.) The hypothesis test attempts to decide between the following two alternatives: 1. Is it reasonable that the discrepancy between M and μ is simply due to sampling error and not the result of a treatment effect? 2. Is the discrepancy between M and μ more than would be expected by sampling error alone? That is, is the sample mean significantly different from the population mean?

8 The Estimated Standard Error and the t Statistic (cont’d.) The critical first step for the t statistic hypothesis test is to calculate exactly how much difference between M and μ is reasonable to expect. However, because the population standard deviation is unknown, it is impossible to compute the standard error of M as we did with z-scores in Chapter 8. Therefore, the t statistic requires that you use the sample data to compute an estimated standard error of M.

9 The Estimated Standard Error and the t Statistic (cont’d.) This calculation defines standard error exactly as it was defined in Chapters 7 and 8, but now we must use the sample variance, $s^2$, in place of the unknown population variance, $\sigma^2$ (or use the sample standard deviation, s, in place of the unknown population standard deviation, σ). The resulting formula for estimated standard error is: $s_M = \frac{s}{\sqrt{n}} = \sqrt{\frac{s^2}{n}}$

10 The Estimated Standard Error and the t Statistic (cont’d.) The t statistic (like the z-score) forms a ratio. The top of the ratio contains the obtained difference between the sample mean and the hypothesized population mean. The bottom of the ratio is the standard error, which measures how much difference is expected by chance. $t = \frac{\text{obtained difference}}{\text{standard error}} = \frac{M - \mu}{s_M}$
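As a rough sketch of the ratio described above (the function name and the numbers are hypothetical, chosen only so the arithmetic is easy to check by hand):

```python
import math

def t_statistic(sample_mean, mu, sample_variance, n):
    """t = (M - mu) / s_M, with s_M = sqrt(s^2 / n)."""
    s_m = math.sqrt(sample_variance / n)  # estimated standard error (denominator)
    return (sample_mean - mu) / s_m       # obtained difference over standard error

# Hypothetical values: M = 53, hypothesized mu = 50, s^2 = 36, n = 16.
# s_M = sqrt(36 / 16) = 1.5, so t = 3 / 1.5 = 2.0.
t = t_statistic(sample_mean=53.0, mu=50.0, sample_variance=36.0, n=16)
print(t)
```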

11 The Estimated Standard Error and the t Statistic (cont’d.) A large value for t (a large ratio) indicates that the obtained difference between the data and the hypothesis is greater than would be expected if the treatment has no effect.

12 Degrees of Freedom and the t Statistic You can think of the t statistic as an "estimated z-score." The estimation comes from the fact that we are using the sample variance to estimate the unknown population variance. With a large sample, the estimation is very good and the t statistic will be very similar to a z-score. With small samples, however, the t statistic will provide a relatively poor estimate of z.

13 Degrees of Freedom and the t Distribution The value of degrees of freedom, df = n − 1, is used to describe how well the t statistic represents a z-score. Also, the value of df will determine how well the distribution of t approximates a normal distribution. For large values of df, the t distribution will be nearly normal, but with small values for df, the t distribution will be flatter and more spread out than a normal distribution.

14 Normal Distribution versus t Distribution with df = 20 and df = 5 [Figure: a normal distribution compared with t distributions for df = 20 and df = 5.] The t distribution will be nearly normal for large values of df. The t distribution will be flatter and more spread out than a normal distribution for smaller values of df.
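The comparison in this slide can be checked numerically. The sketch below is an illustration, not part of the original slides: it evaluates the standard density formula for the t distribution and for the standard normal distribution at the center (0) and in the tail (3), using only Python's `math` module. The function names are hypothetical.

```python
import math

def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

# At the center the t curves are lower (flatter) than the normal curve;
# in the tail they are higher (more spread out), especially for small df.
for x in (0.0, 3.0):
    print(x, round(t_pdf(x, 5), 4), round(t_pdf(x, 20), 4), round(normal_pdf(x), 4))
```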

15 The Shape of the t Distribution (cont’d.) t distributions have four (4) key characteristics: 1. They are bell shaped. 2. They are symmetrical. 3. They have a mean of zero. 4. The larger the sample size, the closer the t distribution is to a normal distribution as seen in the z-distribution.

16 Degrees of Freedom and the t Distribution (cont’d.) To evaluate the t statistic from a hypothesis test, select an α level, find the value of df for the t statistic, and consult the t distribution table (see p. 703). If the obtained t statistic is more extreme than the critical value from the table (larger in absolute value for a two-tailed test), you can reject the null hypothesis. In this case, you have demonstrated that the obtained difference between the data and the hypothesis (numerator of the ratio) is significantly larger than the difference that would be expected if there were no treatment effect (the standard error in the denominator).

17 Degrees of Freedom and the t Distribution (cont’d.) What do I do if my t-statistic has a degrees of freedom (df) value that is not listed in the distribution table for t-statistics on page 703 of the textbook? Problem: This blocks your ability to look up an exact critical t for the df that defines your critical region(s), as needed for our 4-part hypothesis testing protocol. Solution: Look up the critical t for both of the surrounding df values listed and then use the larger value for t, which corresponds to the lower df value. This will define a larger critical t and make it more difficult to commit a Type I Error (i.e., less likely to produce a false-positive finding).
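The lookup rule above can be sketched as follows. The small dictionary holds standard two-tailed critical values at α = .05 for a few tabled df values; it is only an excerpt for illustration, and the function and variable names are hypothetical.

```python
# Excerpt of two-tailed critical t values at alpha = .05 (standard table values).
T_CRIT_05_TWO_TAILED = {30: 2.042, 40: 2.021, 60: 2.000, 120: 1.980}

def conservative_critical_t(df, table):
    """Return the tabled critical t for df, or for the next lower listed df.

    The lower df gives the larger critical value, which is the conservative choice.
    """
    if df in table:
        return table[df]
    lower = [d for d in table if d < df]
    if not lower:
        raise ValueError("df is below the smallest value in this table excerpt")
    return table[max(lower)]

# df = 53 is not listed; the surrounding entries are 40 and 60, so use df = 40.
print(conservative_critical_t(53, T_CRIT_05_TWO_TAILED))
```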

18 Hypothesis Tests with the t Statistic The hypothesis test with a t statistic follows the same four-step procedure that was used with z-score tests: 1. State the hypotheses and select a value for α. (Note: The null hypothesis always states a specific value for μ.) 2. Locate the critical region. (Note: You must find the value for df and use the t distribution table.) 3. Calculate the test statistic. 4. Make a decision. (Either "reject" or "fail to reject" the null hypothesis.)

19 Hypothesis Tests with the t Statistic (cont’d.) There are two general situations where this type of hypothesis test is used: 1. To determine the effect of treatment on a population mean 2. In situations where the population mean is unknown

20 Hypothesis Tests with the t Statistic (cont’d.) 1. In order to determine the effect of treatment on a population mean, you must know the value of μ for the original, untreated population. A sample is obtained from the population and the treatment is administered to the sample. If the resulting sample mean is significantly different from the original population mean, you can conclude that the treatment has a significant effect.

21 Hypothesis Tests with the t Statistic (cont’d.)

22 2. Occasionally a theory or other prediction will provide a hypothesized value for an unknown population mean. A sample is then obtained from the population and the t statistic is used to compare the actual sample mean with the hypothesized population mean. A significant difference indicates that the hypothesized value for μ should be rejected.

23 Hypothesis Tests with the t Statistic (cont’d.) Two basic assumptions are necessary for hypothesis tests with the t statistic: 1. The values in the sample must consist of independent observations. Two observations are independent if there is no consistent, predictable relationship between the first observation and the second observation. Stated differently, the occurrence of the first event has no effect on the probability of the second event. 2. The population that is sampled must be normal. This is a very important assumption for small samples. With larger samples, this assumption can be violated without affecting the validity of the hypothesis test.

24 Hypothesis Tests with the t Statistic (cont’d.) Both the sample size and the sample variance influence the outcome of a hypothesis test. Sample size is inversely related to estimated standard error. As the sample size increases, the standard error in the denominator of the t-statistic decreases, which increases the likelihood of a significant test because the t-statistic will increase in size. The sample variance, on the other hand, is directly and positively related to the estimated standard error. As variance increases, standard error will also increase, which decreases the likelihood of a significant test because the t-statistic will shrink in size as the standard error increases in the denominator.
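A small numeric illustration of both relationships, holding the mean difference fixed. All values and the helper name `t_for` are hypothetical.

```python
import math

def t_for(mean_diff, variance, n):
    """t for a given mean difference M - mu, sample variance s^2, and sample size n."""
    return mean_diff / math.sqrt(variance / n)

base = t_for(5.0, 100.0, 25)         # s_M = sqrt(100 / 25) = 2, t = 2.5
bigger_n = t_for(5.0, 100.0, 100)    # s_M = sqrt(100 / 100) = 1, t = 5.0
bigger_var = t_for(5.0, 400.0, 25)   # s_M = sqrt(400 / 25) = 4, t = 1.25
print(base, bigger_n, bigger_var)
```

Quadrupling n doubles t here, while quadrupling the variance halves it, because both enter the standard error under a square root.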

25 Measuring Effect Size for the t Statistic Because the significance of a treatment effect is determined partially by the size of the effect and partially by the size of the sample, you cannot assume that a significant effect is also a large effect. Therefore, it is recommended that a measure of effect size be computed along with the hypothesis test.

26 Measuring Effect Size for the t Statistic (cont’d.) For the t test, it is possible to compute an estimate of Cohen’s d just as we did for the z-score test in Chapter 8. The only change is that we now use the sample standard deviation instead of the population value (which is unknown). $\text{estimated Cohen's } d = \frac{\text{mean difference}}{\text{standard deviation}} = \frac{M - \mu}{s}$
Magnitude of d | Evaluation of Effect Size
d = 0.2 | Small effect (mean difference around 0.2 standard deviations)
d = 0.5 | Medium effect (mean difference around 0.5 standard deviations)
d = 0.8 | Large effect (mean difference around 0.8 standard deviations)
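A minimal sketch of the estimated Cohen's d calculation. The function name and the example values are hypothetical.

```python
import math

def estimated_cohens_d(sample_mean, mu, sample_variance):
    """Estimated d = (M - mu) / s, where s is the sample standard deviation."""
    s = math.sqrt(sample_variance)
    return (sample_mean - mu) / s

# Hypothetical values: M = 53, mu = 50, s^2 = 36, so s = 6 and d = 3 / 6 = 0.5.
d = estimated_cohens_d(sample_mean=53.0, mu=50.0, sample_variance=36.0)
print(d)
```

Note that n does not appear in the formula, which is why d is not inflated by a large sample the way t is.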

27 Measuring Effect Size for the t Statistic (cont’d.) As before, Cohen’s d measures the size of the treatment effect in terms of the standard deviation. With a t test, it is also possible to measure effect size by computing the percentage of variance accounted for by the treatment. This measure is based on the idea that the treatment causes the scores to change, which contributes to the observed variability in the data.

28

29 Measuring Effect Size for the t Statistic (cont’d.) By measuring the amount of variability that can be attributed to the treatment, we obtain a measure of the size of the treatment effect. For the t statistic hypothesis test: $\text{percentage of variance accounted for} = r^2 = \frac{t^2}{t^2 + df}$ Note that $r^2$ can range in value from 0.00 to 1.00, or from 0% to 100% of the variance accounted for if you convert the decimal version of $r^2$ to a percentage by multiplying by 100.
Percent of Variance Explained as Measured by r² | Evaluation of Effect Size
r² = 0.01 (0.01*100 = 1%) | Small effect
r² = 0.09 (0.09*100 = 9%) | Medium effect
r² = 0.25 (0.25*100 = 25%) | Large effect
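The formula for the percentage of variance accounted for can be sketched as a one-line function. The name `r_squared` and the example values are hypothetical.

```python
def r_squared(t, df):
    """Proportion of variance accounted for: r^2 = t^2 / (t^2 + df)."""
    return t * t / (t * t + df)

# Hypothetical values: t = 2.0 with df = 15 gives r^2 = 4 / 19.
r2 = r_squared(2.0, 15)
print(round(r2, 4), f"{r2 * 100:.1f}%")
```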

30 Measuring Effect Size for the t Statistic (cont’d.) The size of a treatment effect can also be described by computing an estimate of the unknown population mean after treatment. A confidence interval is a range of values that estimates the unknown population mean by estimating the t value.

31 Confidence Intervals Consider the example from the book on page 301 regarding the time infants spend looking at attractive versus less attractive faces. In this example, we want to construct an 80% confidence interval around M = 13, with $s_M$ = 1.00 and n = 9. We want to be 80% confident that the real population mean (μ) is actually contained in the interval. Step #1: Look up the corresponding t values in the t distribution table for scores that bound the middle 80% of the distribution. This means that you need to have 10% in each tail. Calculate the degrees of freedom for t: df = n − 1 = 9 − 1 = 8. Now look up the values of t with 8 df for a one-tailed test at 10% or a two-tailed test at 20%: t = ±1.397

32 Confidence Intervals [Figure: t distribution with 10% of the distribution in the lower tail and 10% of the distribution in the upper tail.]

33 Confidence Intervals Step #2: Calculate the bounded values of the interval. To do this, you must use M and $s_M$ as obtained from the sample data and plug these values into the estimation formula: $\mu = M \pm t \cdot s_M = 13 \pm (1.397)(1.00)$. Lower boundary: $\mu_{lower} = M - t \cdot s_M = 13 - (1.397)(1.00) = 11.603$. Upper boundary: $\mu_{upper} = M + t \cdot s_M = 13 + (1.397)(1.00) = 14.397$.
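The Step #2 arithmetic can be reproduced with a short sketch. The function name is hypothetical; the inputs are the values given in the example (M = 13, estimated standard error 1.00, tabled t = 1.397 for df = 8).

```python
def confidence_interval(sample_mean, s_m, t_crit):
    """Return (lower, upper) = M -/+ t * s_M."""
    margin = t_crit * s_m
    return sample_mean - margin, sample_mean + margin

# 80% confidence interval from the infant looking-time example.
lower, upper = confidence_interval(sample_mean=13.0, s_m=1.00, t_crit=1.397)
print(round(lower, 3), round(upper, 3))
```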

34 Confidence Intervals Step #3: Summarize the findings. The average time looking at the more attractive face for the population of infants is between μ = 11.603 and μ = 14.397 seconds. We are 80% confident that the true population mean is located within this interval. [Figure: number line showing the 80% confidence interval, with the lower boundary at 11.603, the upper boundary at 14.397, and M = 13.000 in the middle of the interval.]

35 Confidence Intervals Some Facts about Confidence Intervals: 1. Increasing the width of the interval will increase confidence. 2. Decreasing the width of the interval will decrease confidence. 3. A larger level of confidence produces a larger t value and a wider interval. 4. A smaller level of confidence produces a smaller t value and a narrower interval. 5. As sample size increases, the standard error decreases and the interval gets narrower. 6. Confidence intervals are not an adequate substitute for Cohen’s d or $r^2$ because they are influenced by sample size.
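Facts 3 and 4 can be illustrated with standard tabled t values for df = 8, holding M = 13 and the estimated standard error at 1.00 as in the preceding example. The dictionary is a table excerpt for illustration only, and the variable names are hypothetical.

```python
# Standard tabled t values for df = 8 at several confidence levels.
T_FOR_DF8 = {80: 1.397, 90: 1.860, 95: 2.306, 99: 3.355}

sample_mean, s_m = 13.0, 1.00

# Interval width is 2 * t * s_M, so it grows with the confidence level.
widths = {level: 2 * t_crit * s_m for level, t_crit in T_FOR_DF8.items()}
for level in sorted(widths):
    half = widths[level] / 2
    print(f"{level}% CI: {sample_mean - half:.3f} to {sample_mean + half:.3f}")
```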

36 Directional Hypotheses and One-Tailed t-Tests A forensic psychologist prepared a depression test that is administered to inmates on the day of release from prison in NC. The test measures how depressed each inmate feels upon release, and the standardized depression scale (range 1-20) was administered to every released inmate in 2010. The higher the score, the more depressed the inmate. The 2010 cohort of released inmates had a mean score of μ = 15. A sample of n = 9 released inmates from 2011 was selected, tested, and found to have the following scores: 7, 12, 11, 15, 7, 8, 15, 9, and 6 (M = 10, SS = 94, population variance is unknown). Are the inmates who were released in 2011 significantly less depressed than the inmates released in 2010?

37 Directional Hypotheses and One-Tailed t-Tests Step #1: State the hypotheses and select an alpha level. $H_0: \mu \ge 15$ and $H_1: \mu < 15$. For this test, we will set α = .05 and use a one-tailed test because of the directional nature of the study and associated hypothesis. Step #2: Locate the critical region. With a sample of 9 inmates, the t statistic has df = n − 1 = 8. For a one-tailed test with α = .05 and df = 8, the critical t value is t = −1.860. Thus, the obtained t value must be less than this critical value to reject $H_0$.

38 Directional Hypotheses and One-Tailed t-Tests Step #3: Compute the test statistic in 3 steps. Compute Sample Variance: $s^2 = \frac{SS}{n - 1} = \frac{94}{8} = 11.75$. Compute Estimated Standard Error: $s_M = \sqrt{\frac{s^2}{n}} = \sqrt{\frac{11.75}{9}} \approx 1.14$. Compute t Statistic: $t = \frac{M - \mu}{s_M} = \frac{10 - 15}{1.14} \approx -4.39$.

39 Directional Hypotheses and One-Tailed t-Tests Step #4: Make a decision about $H_0$ and state a conclusion. $H_0: \mu \ge 15$ and $H_1: \mu < 15$. The critical value is $t_{.05}(8) = -1.860$. If t obtained ≥ −1.860, fail to reject $H_0$. If t obtained < −1.860, reject $H_0$. Here t obtained = −4.39, which is less than −1.860; thus, reject $H_0$. We conclude that the sample of released inmates in 2011 had significantly lower levels of depression as compared to the released cohort of inmates in 2010.
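The whole inmate example can be retraced in a few lines of Python. This is a sketch rather than part of the original slides; the variable names are hypothetical, and the critical value −1.860 is taken from the example rather than computed. Without intermediate rounding the script gives t of about −4.38; the −4.39 reported above results from rounding the estimated standard error to 1.14 before dividing.

```python
import math

scores = [7, 12, 11, 15, 7, 8, 15, 9, 6]  # 2011 sample from the example
mu_0 = 15        # mean of the 2010 cohort (null-hypothesis value)
t_crit = -1.860  # one-tailed critical t for alpha = .05, df = 8 (from the table)

n = len(scores)
m = sum(scores) / n                      # sample mean M
ss = sum((x - m) ** 2 for x in scores)   # sum of squared deviations SS
s2 = ss / (n - 1)                        # sample variance
s_m = math.sqrt(s2 / n)                  # estimated standard error
t = (m - mu_0) / s_m                     # t statistic

reject = t < t_crit  # lower-tail test: reject H0 when t falls below the critical value
print(f"M = {m}, SS = {ss}, s^2 = {s2}, s_M = {s_m:.4f}, t = {t:.3f}")
print("Reject H0" if reject else "Fail to reject H0")
```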

