1 Introduction to the t Statistic

2 What were the formulae used for the z statistic?
$z = (M - \mu) / \sigma_M$
$\sigma_M = \sigma / \sqrt{n}$
$\sigma = \sqrt{SS / N}$
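
The three formulae on this slide can be sketched in Python as below. This is a minimal illustration, not part of the original slides: the function names and the numbers in the usage lines are assumptions chosen only to make the arithmetic easy to follow.

```python
import math

def population_sd(ss, n_pop):
    """sigma = sqrt(SS / N), where N is the number of scores in the population."""
    return math.sqrt(ss / n_pop)

def standard_error(sigma, n):
    """sigma_M = sigma / sqrt(n), where n is the sample size."""
    return sigma / math.sqrt(n)

def z_statistic(sample_mean, mu, sigma, n):
    """z = (M - mu) / sigma_M."""
    return (sample_mean - mu) / standard_error(sigma, n)

# Hypothetical numbers: M = 53, mu = 50, sigma = 6, n = 16.
# sigma_M = 6 / sqrt(16) = 1.5, so z = 3 / 1.5 = 2.0
z = z_statistic(53, 50, 6, 16)
```

Note that every input except M is a population parameter, which is the point the next slide raises.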

3 What Do You Notice About These Formulae?
They are based on population parameters.
How often do you think these population parameters are known?
Well, we have said that the sample mean is usually a good estimate of the population mean.
Therefore finding μ is not generally a problem we worry about.
What about σ and $\sigma_M$?
These we usually do not know either, and the z formulae cannot be used without them.

4 So, What Do We Do?
When we do not know the population variability (σ), we use t tests.

5 The Story…
The t statistic was introduced by William Sealy Gosset for cheaply monitoring the quality of beer brews. "Student" was his pen name. Gosset was a statistician for the Guinness brewery in Dublin, Ireland, and was hired due to Claude Guinness's innovative policy of recruiting the best graduates from Oxford and Cambridge to apply biochemistry and statistics to Guinness's industrial processes. Gosset published the t test in Biometrika in 1908, but was forced to use a pen name by his employer, who regarded the fact that they were using statistics as a trade secret. In fact, Gosset's identity was unknown not only to fellow statisticians but to his employer; the company insisted on the pseudonym so that it could turn a blind eye to the breach of its rules.
From Wikipedia

6 … more on Gosset…
Gosset was a chemist and was responsible for developing procedures for ensuring the similarity of batches of Guinness. The t-test was developed as a way of measuring how closely the yeast content of a particular batch of beer corresponded to the brewery's standard.

7 Why t-test?
Student's distribution arises when (as in nearly all practical statistical work) the population standard deviation is unknown and has to be estimated from the data. Textbook problems treating the standard deviation as if it were known are of two kinds:
(1) those in which the sample size is so large that one may treat a data-based estimate of the variance as if it were certain, and
(2) those that illustrate mathematical reasoning, in which the problem of estimating the standard deviation is temporarily ignored because that is not the point that the author or instructor is then explaining.
From Wikipedia

8 The t Statistic
The t statistic is used to test hypotheses about an unknown population mean (μ) when the value of σ is unknown. The formula for the t statistic has the same structure as the z-score formula, except that the t statistic uses the estimated standard error in the denominator.
$t = (M - \mu) / s_M$
$s_M = s / \sqrt{n} = \sqrt{s^2 / n}$
$s = \sqrt{SS / (n - 1)} = \sqrt{SS / df}$
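
As a rough sketch of how these formulae chain together, the Python below goes from raw scores to t. The function names and the four scores in the usage line are hypothetical, chosen for illustration only.

```python
import math

def sum_of_squares(scores):
    """SS: sum of squared deviations from the sample mean M."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores)

def t_statistic(scores, mu):
    """Return (t, df) for a single-sample t test against a hypothesized mu."""
    n = len(scores)
    df = n - 1
    m = sum(scores) / n
    s = math.sqrt(sum_of_squares(scores) / df)   # s = sqrt(SS / df)
    s_m = s / math.sqrt(n)                       # s_M = s / sqrt(n)
    return (m - mu) / s_m, df

# Hypothetical scores: M = 7, SS = 20, s^2 = 20/3, s_M = sqrt((20/3)/4)
t, df = t_statistic([4, 6, 8, 10], mu=5)   # t is about 1.55 with df = 3
```

The only structural difference from the z computation is that s, computed from the sample with df in the denominator, stands in for σ.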

9 What is the Estimated Standard Error ($s_M$)?
The estimated standard error ($s_M$) is used as an estimate of the real standard error ($\sigma_M$) when the value of σ is unknown. It is computed from the sample variance or sample standard deviation and provides an estimate of the standard distance between a sample mean M and the population mean μ.

10 What Are The Degrees of Freedom (df)?
Degrees of freedom describe the number of scores in a sample that are independent and free to vary. Because the sample mean places a restriction on the value of one score in the sample, there are n-1 degrees of freedom for the sample.
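
One way to see the restriction described above: once the sample mean and n-1 of the scores are fixed, the remaining score is forced. The helper below is a hypothetical illustration with made-up numbers, not something from the slides.

```python
def last_score(known_scores, n, sample_mean):
    """Return the single score that is forced once M and n-1 scores are fixed."""
    if len(known_scores) != n - 1:
        raise ValueError("expected exactly n - 1 known scores")
    # The scores must sum to n * M, so the last one is whatever is left over.
    return n * sample_mean - sum(known_scores)

# Hypothetical sample: n = 4 and M = 5, so the scores must sum to 20.
# Three scores are free to vary; the fourth is then 20 - (3 + 8 + 4) = 5.
forced = last_score([3, 8, 4], n=4, sample_mean=5)
```

Only three of the four scores carry independent information about variability, which is why SS is divided by n-1.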

11 Describe the Shape of the t Distribution
The t distribution is leptokurtic, but as df gets larger it more closely resembles the normal curve.
This is because $s_M$ estimates $\sigma_M$ more closely as df gets very large.
Once df is sufficiently large, t is distributed approximately as z.
What is the two-tailed critical value ($t_{crit}$) for α = .05 and df = 6?
One-tailed?
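
Critical values are normally read from a t table. As a hedged sketch of where the table entries come from, the Python below integrates the t density numerically (Simpson's rule) and searches for the cutoff by bisection, using only the standard library. The function names, step count and search bounds are assumptions made for this illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail_area(t, df, steps=2000):
    """P(T > t) for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_crit(alpha, df, two_tailed=True):
    """Critical t: the cutoff whose upper-tail area is alpha/2 (or alpha)."""
    target = alpha / 2 if two_tailed else alpha
    lo, hi = 0.0, 100.0          # assumed search bounds
    for _ in range(60):          # bisection on the tail area
        mid = (lo + hi) / 2
        if upper_tail_area(mid, df) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

two_tailed = t_crit(0.05, 6)                    # about 2.447
one_tailed = t_crit(0.05, 6, two_tailed=False)  # about 1.943
```

Calling `t_crit(0.05, df)` with increasing df gives values that shrink toward the z cutoff of about 1.96, which is the convergence the slide describes.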

12 How Did Gosset Use His Test?
He had to find out if the beer that was brewed met the brewery standards for yeast content.
First, he would take samples of the beer from each vat and determine the yeast content.
With these data he would know the desired yeast content (μ) as set by factory standards, the mean yeast content for the samples (M), and the sample standard deviation (s) for yeast content.

13 Let's See What Gosset Might Have Seen…
What if there are usually 15 grams of yeast per bottle of Guinness?
We take nine samples of beer from a vat and we get readings of {7, 12, 11, 15, 7, 8, 15, 9, 6}.
Does this vat have a significantly different (α = .05) level of yeast than what Guinness wants?

14 Step 1: State Your Hypotheses
Null: $H_0: \mu = 15$
Alternative: $H_1: \mu \neq 15$
State your alpha: α = .05

15 Step 2: Find $t_{crit}$
First find the df: df = n – 1 = 9 – 1 = 8
Find the two-tailed critical t value for df = 8 and α = .05

16 Step 3: Sample Data and Test Statistics
Mean = 10
SS = 94
$s^2 = SS / df = 94 / 8 = 11.75$
s = 3.43
$s_M = 1.14$
$t = (10 - 15) / 1.14 = -4.39$
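
The values on this slide can be reproduced from the nine readings with a short script. This is a sketch using Python's standard `statistics` module; the variable names are chosen here for illustration. One small caveat: carrying full precision gives t of about -4.38, while the slide's -4.39 results from dividing by the rounded $s_M$ of 1.14.

```python
import math
import statistics

readings = [7, 12, 11, 15, 7, 8, 15, 9, 6]   # yeast readings from the vat
mu = 15                                       # brewery standard under H0

n = len(readings)
df = n - 1                                    # 8
mean = statistics.mean(readings)              # 10
ss = sum((x - mean) ** 2 for x in readings)   # 94
variance = ss / df                            # 11.75
s = math.sqrt(variance)                       # about 3.43
s_m = s / math.sqrt(n)                        # about 1.14
t = (mean - mu) / s_m                         # about -4.38 unrounded

# Cross-check: the library's sample standard deviation also divides by n - 1.
assert abs(statistics.stdev(readings) - s) < 1e-9
```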

17 Step 4: Make a Decision
Is the magnitude of our observed t ($t_{obs}$) greater than or less than the critical value for t ($t_{crit}$)?
Therefore we make the decision (with p < .05, we reject $H_0$).
t(8) = -4.39, p < .05
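
The decision rule for a two-tailed test can be written as a one-line comparison. In the sketch below, the function name is hypothetical, and the critical value 2.306 (two-tailed, α = .05, df = 8) is taken from a standard t table rather than from the slides.

```python
def decide(t_obs, t_crit):
    """Two-tailed rule: reject H0 when |t_obs| exceeds the critical value."""
    return "reject H0" if abs(t_obs) > t_crit else "fail to reject H0"

T_CRIT_DF8 = 2.306   # assumed table value: two-tailed, alpha = .05, df = 8

decision = decide(-4.39, T_CRIT_DF8)   # |-4.39| > 2.306, so "reject H0"
```

Taking the absolute value is what makes the rule two-tailed: a vat with far too much yeast would be flagged just as readily as one with too little.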

18 Measuring Effect Size
How did we measure effect size before?
Mean difference over standard deviation.
Therefore…
Here, estimated d = mean difference / sample standard deviation

19 Measuring Effect Size (Take Two!)
We can measure effect size by looking at the proportion of variance accounted for.
This is sometimes called PRE, or Proportional Reduction in Error.
Two ways of calculating this:
1. Variability accounted for / total variability
2. $r^2 = t^2 / (t^2 + df)$
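
The two calculations give the same number, which can be checked on the yeast readings from slide 13. The sketch below rests on an interpretation the slide does not spell out: "total variability" is taken as the squared deviations from the hypothesized μ = 15, and the variability left unexplained is SS around the sample mean.

```python
readings = [7, 12, 11, 15, 7, 8, 15, 9, 6]
mu = 15
n = len(readings)
df = n - 1
mean = sum(readings) / n

# Way 1: variability accounted for / total variability
ss_total = sum((x - mu) ** 2 for x in readings)     # 319, deviations from mu
ss_error = sum((x - mean) ** 2 for x in readings)   # 94, deviations from M
r2_from_variability = (ss_total - ss_error) / ss_total

# Way 2: r^2 = t^2 / (t^2 + df), with t at full precision
s_m = (ss_error / df / n) ** 0.5
t = (mean - mu) / s_m
r2_from_t = t ** 2 / (t ** 2 + df)

# Both come to 225/319, roughly 0.71
```

The agreement is exact rather than approximate, because $t^2 / (t^2 + df)$ simplifies algebraically to $n(M - \mu)^2 / (n(M - \mu)^2 + SS)$.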

23 Effect Size
Cohen's d = mean difference / standard deviation = 5/3.43 = 1.46
$r^2$ = variability accounted for / total variability
$r^2 = t^2 / (t^2 + df)$
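
Both effect-size measures are short enough to express as helper functions. The sketch below plugs in the rounded values from the worked example (mean difference of 5, s = 3.43, t = -4.39, df = 8); the function names are assumptions for illustration, and the $r^2$ result is computed here rather than quoted from the slide.

```python
def cohens_d(mean_difference, s):
    """Estimated d: size of the mean difference in sample-SD units."""
    return abs(mean_difference) / s

def r_squared(t, df):
    """Proportion of variance accounted for, from t and its df."""
    return t ** 2 / (t ** 2 + df)

d = cohens_d(10 - 15, 3.43)    # about 1.46
r2 = r_squared(-4.39, 8)       # about 0.71
```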

24 Confidence Intervals
Point Estimate
Interval Estimate
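
The slide names the two kinds of estimate without giving formulae. As a hedged sketch, assuming the usual single-sample form (point estimate M, interval estimate M ± $t_{crit}$ × $s_M$), the Python below applies it to the yeast example. The function name is hypothetical, and the critical value 2.306 is a standard table value for df = 8 at 95% confidence, not a figure from the slides.

```python
import math

def confidence_interval(sample_mean, s, n, t_crit):
    """Interval estimate of mu: M plus or minus t_crit * s_M."""
    s_m = s / math.sqrt(n)
    margin = t_crit * s_m
    return sample_mean - margin, sample_mean + margin

point_estimate = 10                  # M from the yeast readings
s = math.sqrt(94 / 8)                # sample SD, from SS = 94 and df = 8
lower, upper = confidence_interval(point_estimate, s, n=9, t_crit=2.306)
# roughly 7.37 to 12.63
```

The hypothesized value of 15 falls outside this interval, which agrees with the decision to reject the null hypothesis at α = .05.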

25 Directional Hypotheses
When is a directional hypothesis justified?
When there is clear theoretical support for a one-tailed test.
This is established through a literature review of past findings, not simply well-thought-out logic.
What are examples of directional hypotheses?
How do we use directional hypotheses?

