The t-test With small samples, the z-test has to be modified. Statisticians use the t-test.


1 The t-test With small samples, the z-test has to be modified. Statisticians use the t-test.

2 Example In Los Angeles, many studies have been conducted to determine the concentration of CO (carbon monoxide) near freeways with various conditions of traffic flow. Some machines are used to measure concentrations up to about 100 ppm (parts per million by volume) with errors on the order of 10 ppm. The machines are so delicate that they have to be calibrated every day. This involves measuring CO concentration in a manufactured gas sample, called span gas, where the concentration is precisely controlled at 70 ppm. If the machine reads close to 70 ppm on the span gas, then it’s ready for use; otherwise, it has to be adjusted. The size of the measurement errors varies from day to day. We assume that the errors are independent and follow the normal curve; the SD is unknown and changes from day to day.

3 Example One day, a technician makes five readings on span gas: 78, 83, 68, 72, 88. Four out of five are higher than 70, some of them by quite a bit. Can this be explained on the basis of chance variation? Or does it show bias, perhaps from improper adjustment of the machine? A test of significance is needed. First, we need to set up the equation: measurement = true value + bias + chance error. That is, observed value = 70 ppm + bias + chance error.

4 Set up a test By assumption, the chance errors are independent and follow the normal curve, so they tend to even out when measurements are repeated. What is the expected average of the errors? The errors should average out to 0; their SD is unknown. According to the equation, how do we set up the null hypothesis? The null hypothesis says that there is no bias: the difference between the average of the measurements and the true value of 70 ppm is due to chance variation alone. Under the null hypothesis, this difference follows the normal curve, centered at 0.

5 Set up a test

6 Calculation
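A minimal sketch of the z-test calculation for the five readings, assuming the usual recipe: estimate the SD of the errors by the SD of the measurements, divide by the square root of the number of measurements to get the SE for the average, and convert the observed difference to standard units. The variable names are illustrative, not from the slides.

```python
import math
from statistics import NormalDist, fmean, pstdev

readings = [78, 83, 68, 72, 88]   # the five span-gas readings (ppm)
true_value = 70                   # span-gas concentration under the null (ppm)

n = len(readings)
average = fmean(readings)         # average of the measurements
sd = pstdev(readings)             # SD of the measurements (divides by n)
se_average = sd / math.sqrt(n)    # SE for the average

# Observed difference in standard units.
z = (average - true_value) / se_average

# One-sided P-value: area to the right of z under the normal curve.
p_normal = 1 - NormalDist().cdf(z)

print(f"average = {average:.1f} ppm, SD = {sd:.2f} ppm, SE = {se_average:.2f} ppm")
print(f"z = {z:.2f}, P (normal curve) = {p_normal:.1%}")
```

The P-value comes out at roughly 1%, which is the normal-curve figure the later slides compare against.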

7 Modify the test There could be a problem in the previous calculation. Why? The SD of the measurements is only an estimate of the SD of the errors. This estimate is good only when the number of measurements is reasonably large. In the example, there are only 5 measurements, so the estimate could be way off. So we have to modify the test in two steps.

8 Step 1
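In the standard textbook treatment of this example, the first step is to replace the SD of the measurements by the slightly larger SD+, which divides by (number of measurements – 1) instead of the number of measurements, and then recompute the test statistic. The sketch below assumes that is what this step does; the names are illustrative.

```python
import math
from statistics import fmean, pstdev, stdev

readings = [78, 83, 68, 72, 88]   # the five span-gas readings (ppm)
true_value = 70                   # span-gas concentration under the null (ppm)
n = len(readings)

sd = pstdev(readings)                     # SD: divides by n
sd_plus = math.sqrt(n / (n - 1)) * sd     # SD+: equivalent to dividing by n - 1

# statistics.stdev already divides by n - 1, so it matches SD+.
assert abs(sd_plus - stdev(readings)) < 1e-9

se_average = sd_plus / math.sqrt(n)       # SE for the average, based on SD+
t = (fmean(readings) - true_value) / se_average

print(f"SD = {sd:.2f}, SD+ = {sd_plus:.2f}, SE = {se_average:.2f}")
print(f"t = {t:.1f}")
```

Because SD+ is larger than SD, the SE goes up and the test statistic goes down, from about 2.4 to about 2.2.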

9 Step 2 Now we are ready to find the P-value. But with a small number of measurements, we use a different curve, called Student’s curve, rather than the normal curve. Using Student’s curve requires some work: Student’s curves are a family, with one curve for each number of degrees of freedom, and the degrees of freedom tell you which one to use. The formula is: degrees of freedom = number of measurements – one. In our example, there are 5 – 1 = 4 degrees of freedom and the t-statistic is about 2.2. From the table, the P-value is a little less than 5%. That is quite a bit more than the 1% from the normal curve. (The result is still statistically significant at the 5% level, so we still reject the null hypothesis, but the evidence is not as strong.)

10 Student’s curves The dashed line above is Student’s curve for 4 degrees of freedom. The solid line is a normal curve, for comparison.

11 Student’s curves Here the dashed line is Student’s curve for 9 degrees of freedom. Again, we compare it with the normal curve.

12 Student’s curves Student’s curves look quite a lot like the normal curve: they are symmetric, bell-shaped, and centered at 0, and each has a total area of 100%. But they are less piled up in the middle and more spread out. As the number of degrees of freedom goes up, the curves get closer and closer to the normal curve, reflecting the fact that with more measurements, the SD of the measurements gets closer and closer to the SD of the error box.
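The comparison can be checked numerically. The sketch below uses the standard formula for the height of Student's curve and a simple Simpson's-rule integration to get tail areas; `student_density` and `area_right` are hypothetical helper names, and the choice of 2 as the cut-off and 25 as a "large" number of degrees of freedom is only for illustration.

```python
import math
from statistics import NormalDist

def student_density(x, df):
    """Height of Student's curve with `df` degrees of freedom at x."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def area_right(t, df, steps=2000):
    """Area to the right of t >= 0: 50% minus the area between 0 and t.

    The area between 0 and t is computed by Simpson's rule (steps must be even).
    """
    h = t / steps
    total = student_density(0.0, df) + student_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * student_density(i * h, df)
    return 0.5 - total * h / 3

normal = NormalDist()
print(f"normal curve: height at 0 = {normal.pdf(0):.3f}, "
      f"area right of 2 = {1 - normal.cdf(2):.1%}")
for df in (4, 9, 25):
    print(f"{df:2d} degrees of freedom: height at 0 = {student_density(0, df):.3f}, "
          f"area right of 2 = {area_right(2, df):.1%}")
```

A lower height at 0 corresponds to "less piled up in the middle", and a larger area to the right of 2 corresponds to "more spread out"; both gaps shrink as the degrees of freedom go up.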

13 To read the table The rows are labeled by degrees of freedom. The columns are labeled by area to the right, as a percentage. Each entry is the value of t that has that much area to its right under the corresponding Student’s curve.

14 To read the table For example, look at the row for 4 degrees of freedom. The first entry is 1.53, in the column headed 10%. This means that the area to the right of 1.53 under Student’s curve with 4 degrees of freedom is about 10%.
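A table row can be reproduced by going the other way: given an area, search for the value of t with that area to its right. The sketch below does this by bisection on a numerically integrated tail area. The helper names are hypothetical, and the set of columns (10%, 5%, 1%) is an assumption, since the table itself is not part of the transcript.

```python
import math

def student_density(x, df):
    """Height of Student's curve with `df` degrees of freedom at x."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def area_right(t, df, steps=2000):
    """Area to the right of t >= 0, using Simpson's rule on [0, t]."""
    h = t / steps
    total = student_density(0.0, df) + student_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * student_density(i * h, df)
    return 0.5 - total * h / 3

def table_entry(percent, df):
    """Value of t with `percent`% of the area to its right, by bisection."""
    target = percent / 100
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if area_right(mid, df) > target:
            lo = mid   # too much area to the right of mid: move right
        else:
            hi = mid   # too little area to the right of mid: move left
    return (lo + hi) / 2

# Row for 4 degrees of freedom; the columns chosen here are an assumption.
row = {percent: table_entry(percent, 4) for percent in (10, 5, 1)}
for percent, entry in row.items():
    print(f"{percent:>2}% column: {entry:.2f}")
```

The 10% entry should agree with the 1.53 quoted above.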

15 To find the P-value In our previous example, with 5 measurements, there are 5 – 1 = 4 degrees of freedom. The t-statistic is about 2.2, which is greater than 0, so the P-value is the area to the right of 2.2 under Student’s curve with 4 degrees of freedom. From the table, this is about 5%; in fact, a little less.
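Instead of reading the table, the area can be computed directly. This is a sketch using the standard density formula for Student's curve and Simpson's rule; the helper names are hypothetical and the rounded value 2.2 is taken from the slide.

```python
import math

def student_density(x, df):
    """Height of Student's curve with `df` degrees of freedom at x."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def area_right(t, df, steps=2000):
    """Area to the right of t >= 0, using Simpson's rule on [0, t]."""
    h = t / steps
    total = student_density(0.0, df) + student_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * student_density(i * h, df)
    return 0.5 - total * h / 3

measurements = 5
df = measurements - 1          # degrees of freedom = number of measurements - 1
t = 2.2                        # t-statistic from the example, as rounded on the slide

p_value = area_right(t, df)    # one-sided P-value: area to the right of t
print(f"{df} degrees of freedom, t = {t}: P = {p_value:.1%}")
```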

16 Remarks The t-test should be used under the following circumstances: (a) The data are like draws from a box. (b) The SD of the box is unknown. (c) The number of observations is small, so the SD of the box cannot be estimated very accurately. (d) The histogram for the errors does not look too different from the normal curve. With 25 or more measurements, the normal curve would ordinarily be used. If the SD of the box is known, and the errors follow the normal curve, then the normal curve can be used even for small samples.

17 Another example On another day, 6 readings on span gas turn out to be 72, 79, 65, 84, 67, 77. Is the machine properly calibrated? Or do the measurements show bias?

18 Solution
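One way to work this example is to repeat the steps used for the first one: compute the average, SD+, the SE for the average, the t-statistic, and the area to its right under Student's curve with 6 – 1 = 5 degrees of freedom. The sketch below assumes a one-sided test, as in the first example; `t_test` and the other helper names are hypothetical.

```python
import math
from statistics import fmean, stdev

def student_density(x, df):
    """Height of Student's curve with `df` degrees of freedom at x."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def area_right(t, df, steps=2000):
    """Area to the right of t >= 0, using Simpson's rule on [0, t]."""
    h = t / steps
    total = student_density(0.0, df) + student_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * student_density(i * h, df)
    return 0.5 - total * h / 3

def t_test(readings, true_value):
    """Return (t-statistic, degrees of freedom, one-sided P-value)."""
    n = len(readings)
    se_average = stdev(readings) / math.sqrt(n)   # stdev divides by n - 1 (SD+)
    t = (fmean(readings) - true_value) / se_average
    df = n - 1
    return t, df, area_right(abs(t), df)

readings = [72, 79, 65, 84, 67, 77]   # the six span-gas readings (ppm)
t, df, p_value = t_test(readings, true_value=70)
print(f"average = {fmean(readings):.1f} ppm, t = {t:.1f}, "
      f"{df} degrees of freedom, P = {p_value:.0%}")
```

Under these assumptions, t is about 1.3 and the P-value is above 10%, so the readings look consistent with chance variation rather than bias.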

19

20 Notes

21 Summary

