Overview CONFIDENCE INTERVALS STUDENT’S T / T STATISTICS


1 ANALYTICAL PROPERTIES PART II ERT 207 ANALYTICAL CHEMISTRY SEMESTER 1, ACADEMIC SESSION 2015/16

2 Overview CONFIDENCE INTERVALS STUDENT’S T / T STATISTICS
STATISTICS AIDS TO HYPOTHESIS TESTING COMPARISON OF TWO EXPERIMENTAL MEANS ERRORS IN HYPOTHESIS TESTING COMPARISON OF VARIANCES ANALYSIS OF VARIANCE

3 CONFIDENCE INTERVALS The confidence interval for the mean is the range of values within which the population mean (μ) is expected to lie with a certain probability. Sometimes the limits of the interval are called confidence limits. The size of the confidence interval, which is computed from the sample standard deviation, depends on how well the sample standard deviation (s) estimates the population standard deviation (σ).

4 CONFIDENCE INTERVALS Figure 1 shows a series of five normal error curves. In each, the relative frequency is plotted as a function of the quantity z, which is the deviation from the mean divided by the population standard deviation. The numbers within the shaded areas are the percentage of the total area under the curve that is included within these values of z.

5 CONFIDENCE INTERVALS Figure 1: Areas under a Gaussian curve for various values of ±z, panels (a) to (e).

6 CONFIDENCE INTERVALS From Figure 1 (a):
50% of the area under any Gaussian curve is located between −0.67σ and +0.67σ. We may assume that 50 times out of 100 the true mean μ will fall in the interval x ± 0.67σ. Confidence level: the probability that the true mean lies within a certain interval. It is often expressed as a percentage.

7 CONFIDENCE INTERVALS Figure 1 (a):
The confidence level is 50% and the confidence interval is from −0.67σ to +0.67σ. Significance level: the probability that a result is outside the confidence interval. A general expression for the confidence interval (CI) of the true mean based on measuring a single value x: CI for μ = x ± zσ

8 CONFIDENCE INTERVALS For the experimental mean of N measurements: CI for μ = x̄ ± zσ/√N
Table 1 shows the values of z at various confidence levels. The relative size of the confidence interval as a function of N is shown in Table 2.
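The two z-based expressions above can be sketched in a few lines of Python. This is only an illustration: the function names are made up here, z is computed from the standard normal distribution rather than read from Table 1, and the inputs (1108 mg/L, a mean of 1100.3 mg/L from seven values, σ = 19 mg/L) are the glucose figures used in Example 1 of this deck.

```python
from math import sqrt
from statistics import NormalDist

def z_for_confidence(confidence):
    """Two-sided z value for a confidence level, e.g. 0.95 gives about 1.96."""
    return NormalDist().inv_cdf(0.5 + confidence / 2.0)

def z_interval(center, sigma, n=1, confidence=0.95):
    """Return (lower, upper) for center ± z*sigma/sqrt(n)."""
    half_width = z_for_confidence(confidence) * sigma / sqrt(n)
    return center - half_width, center + half_width

# Single value (n = 1) and mean of seven values (n = 7), sigma = 19 mg/L.
for level in (0.80, 0.95):
    lo1, hi1 = z_interval(1108.0, 19.0, n=1, confidence=level)
    lo7, hi7 = z_interval(1100.3, 19.0, n=7, confidence=level)
    print(f"{level:.0%}: single value {lo1:.1f} to {hi1:.1f} mg/L, "
          f"mean of 7 {lo7:.1f} to {hi7:.1f} mg/L")
```

Because the half-width scales as 1/√N, the interval for the seven-value mean is about 2.6 times narrower than the interval for a single value.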

9 CONFIDENCE INTERVALS Table 1: Table 2:

10 CONFIDENCE INTERVALS EXAMPLE 1:
Determine the 80% and 95% confidence intervals for: (a) a single data entry of 1108 mg/L glucose; (b) a mean value of 1100.3 mg/L for 1 week of data (one value recorded per day). Assume that in each part, s = 19 mg/L is a good estimate of σ.

11 STUDENT’S T / T STATISTICS
The t statistic is often called Student’s t. To account for the variability of s, we use the important statistical parameter t, which is defined in exactly the same way as z except that s is substituted for σ. For a single measurement with result x: t = (x − μ)/s. For the mean of N measurements: t = (x̄ − μ)/(s/√N).

12 STUDENT’S T / T STATISTICS
The confidence interval for the mean of N replicate measurements can be calculated from t: CI for μ = x̄ ± ts/√N
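A minimal sketch of the t-based interval, assuming the value of t is looked up in a table such as Table 3 and passed in by hand (the Python standard library has no t distribution). The function name is illustrative. The data are the blood-alcohol results of Example 2; the third value, 0.079, is an assumption inferred from the sum of 0.252 reported in the worked solution.

```python
from math import sqrt
from statistics import mean, stdev

def t_interval(data, t_value):
    """Return (mean, half_width) for the interval mean ± t*s/sqrt(N)."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (N - 1 in the denominator)
    return x_bar, t_value * s / sqrt(n)

# Third value (0.079) is inferred, not read directly from the slide.
results = [0.084, 0.089, 0.079]

# t = 4.30 for 2 degrees of freedom at the 95% confidence level.
x_bar, half_width = t_interval(results, t_value=4.30)
print(f"95% CI = {x_bar:.3f} ± {half_width:.3f} % C2H5OH")
```

With only three results, t (4.30) is more than twice z (1.96), which is why the interval widens when s must be estimated from a small data set.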

13 STUDENT’S T / T STATISTICS
Table 3:

14 STUDENT’S T / T STATISTICS
Example 2: A clinical chemist obtained the following data for the alcohol content of a sample of blood: % C2H5OH: 0.084, 0.089, and 0.079. Calculate the 95% confidence interval for the mean assuming that (a) the three results obtained are the only indication of the precision of the method, and (b) from previous experience on hundreds of samples, we know that the standard deviation of the method, s = 0.005% C2H5OH, is a good estimate of σ.

15 STATISTICS AIDS TO HYPOTHESIS TESTING
Hypothesis testing is the basis for many decisions made in science and engineering. The hypothesis tests that we describe are used to determine whether experimental results support a proposed model. If agreement is found, the hypothetical model serves as the basis for further experiments. When the hypothesis is supported by sufficient experimental data, it becomes recognized as a useful theory until such time as data are obtained that refute it.

16 STATISTICS AIDS TO HYPOTHESIS TESTING
A null hypothesis postulates that two or more observed quantities are the same. Specific examples of hypothesis tests that scientists often use include the comparison of: (1) the mean of an experimental data set with what is believed to be the true value; (2) the mean with a predicted or cutoff (threshold) value; (3) the means or the standard deviations from two or more sets of data.

17 STATISTICS AIDS TO HYPOTHESIS TESTING
Comparing an experimental mean with a known value: a statistical hypothesis test is used to draw conclusions about the population mean (μ) and its nearness to the known value (μ0). There are two contradictory outcomes that we consider in any hypothesis test: the null hypothesis H0, which states that μ = μ0, and the alternative hypothesis Ha, which can be stated in several ways.

18 STATISTICS AIDS TO HYPOTHESIS TESTING
We might reject the null hypothesis in favor of Ha if μ is different from μ0 (μ ≠ μ0). Other alternative hypotheses are μ > μ0 or μ < μ0.

19 STATISTICS AIDS TO HYPOTHESIS TESTING
Suppose we are interested in determining whether the concentration of lead in an industrial wastewater discharge exceeds the maximum permissible amount of 0.05 ppm. Our hypothesis test would be summarized: H0: μ = 0.05 ppm Ha: μ > 0.05 ppm

20 STATISTICS AIDS TO HYPOTHESIS TESTING
Large sample z test: If a large number of results are available so that s is a good estimate of σ, the z test is appropriate. (1) State the null hypothesis: H0: μ = μ0. (2) Form the test statistic: z = (x̄ − μ0)/(σ/√N). (3) State the alternative hypothesis Ha and determine the rejection region.

21 STATISTICS AIDS TO HYPOTHESIS TESTING
For Ha: μ ≠ μ0, reject H0 if z ≥ zcrit or if z ≤ −zcrit (two-tailed test). For Ha: μ > μ0, reject H0 if z ≥ zcrit (one-tailed test). For Ha: μ < μ0, reject H0 if z ≤ −zcrit (one-tailed test). Figure 2 (a): There is only a 5% probability that random error will lead to a value of z ≥ zcrit or z ≤ −zcrit. The overall significance level is α = 0.05. From Table 1, the critical value of z is 1.96.

22 STATISTICS AIDS TO HYPOTHESIS TESTING
Figure 2: Rejection regions for the 95% confidence level (a) Two-tailed test for Ha: μ≠ μ0.

23 STATISTICS AIDS TO HYPOTHESIS TESTING
Figure 2: Rejection regions for the 95% confidence level (c) One-tailed test for Ha: μ < μ0. Figure 2 (b): For a one-tailed test at the 95% confidence level, we want the probability that z exceeds zcrit to be 5%, which corresponds to a total probability in both tails of 10%. The value is therefore read from Table 1 at an overall significance level of α = 0.10, and the critical value is 1.64.
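The rejection rules and the two critical values quoted above (1.96 and 1.64) can be checked with a short sketch. The function names and the "two-sided" / "greater" / "less" labels are illustrative choices, not notation from the slides; the critical values are computed from the standard normal distribution instead of Table 1.

```python
from statistics import NormalDist

def z_critical(alpha, two_tailed=True):
    """Critical z for significance level alpha; alpha is split over both tails if two_tailed."""
    tail_area = alpha / 2.0 if two_tailed else alpha
    return NormalDist().inv_cdf(1.0 - tail_area)

def reject_h0(z, alpha=0.05, alternative="two-sided"):
    """Apply the rejection rule for Ha: mu != mu0, mu > mu0 or mu < mu0."""
    if alternative == "two-sided":
        return abs(z) >= z_critical(alpha, two_tailed=True)
    if alternative == "greater":
        return z >= z_critical(alpha, two_tailed=False)
    if alternative == "less":
        return z <= -z_critical(alpha, two_tailed=False)
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")

print(round(z_critical(0.05, two_tailed=True), 2))   # 1.96
print(round(z_critical(0.05, two_tailed=False), 2))  # 1.64
```

A one-tailed test with 5% in a single tail uses the same z as a two-tailed table entry with 10% overall, which is why the value 1.64 is read from the 90% row of a two-sided table.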

24 STATISTICS AIDS TO HYPOTHESIS TESTING
Example 3: A class of 30 students determined the activation energy of a chemical reaction to be 116 kJ/mol (mean value) with a standard deviation of 22 kJ/mol. Are the data in agreement with the literature value of 129 kJ/mol at (a) the 95% confidence level and (b) the 99% confidence level? (c) Estimate the probability of obtaining a mean equal to the student value.

25 STATISTICS AIDS TO HYPOTHESIS TESTING
For a small number of results, we use a similar procedure to the z test except that the test statistic is the t statistic. The null hypothesis is H0: μ = μ0, where μ0 is a specific value of μ such as an accepted value, a theoretical value or a threshold value. (1) State the null hypothesis: H0: μ = μ0. (2) Form the test statistic: t = (x̄ − μ0)/(s/√N). (3) State the alternative hypothesis Ha and determine the rejection region.

26 STATISTICS AIDS TO HYPOTHESIS TESTING
For Ha: μ ≠ μ0, reject H0 if t ≥ tcrit or if t ≤ −tcrit (two-tailed test). For Ha: μ > μ0, reject H0 if t ≥ tcrit (one-tailed test). For Ha: μ < μ0, reject H0 if t ≤ −tcrit (one-tailed test). Figure 3: Curve A: If the analytical method had no systematic error, or bias, random errors would give the frequency distribution shown by curve A.

27 STATISTICS AIDS TO HYPOTHESIS TESTING
Figure 3: Curve B: The frequency distribution of results by a method that could have a significant bias due to a systematic error. Figure 3: Illustration of systematic error in an analytical method.

28 STATISTICS AIDS TO HYPOTHESIS TESTING
Example 4: A new procedure for the rapid determination of sulfur in kerosenes was tested on a sample known from its method of preparation to contain 0.123% S (μ0 = 0.123% S). The results for % S were 0.112, 0.118, 0.115, and 0.119. Do the data indicate that there is a bias in the method at the 95% confidence level?

29 COMPARISON OF TWO EXPERIMENTAL MEANS
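Frequently scientists must judge whether a difference in the means of two sets of data is real or the result of random error. The t test for differences in means: the test statistic t can be found from t = (x̄1 − x̄2)/(spooled √((N1 + N2)/(N1 N2))), where spooled is the pooled standard deviation of the two data sets.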
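As a rough sketch of the t test for a difference in means, the function below uses the standard pooled-standard-deviation form, which assumes the two data sets have equal standard deviations. The function name and both data sets are hypothetical, chosen only to exercise the calculation; they are not taken from the slides.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(sample1, sample2):
    """Return (t, s_pooled, degrees_of_freedom) for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    dof = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    s_pooled = sqrt(((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / dof)
    t = (mean(sample1) - mean(sample2)) / (s_pooled * sqrt((n1 + n2) / (n1 * n2)))
    return t, s_pooled, dof

# Hypothetical replicate results from two methods (illustrative values only).
method_1 = [10.1, 10.3, 10.2, 10.4]
method_2 = [10.6, 10.8, 10.7, 10.9]

t, s_pooled, dof = pooled_t(method_1, method_2)
print(f"t = {t:.3f}, s_pooled = {s_pooled:.4f}, degrees of freedom = {dof}")
```

The magnitude of t is then compared with the tabulated critical value for N1 + N2 − 2 degrees of freedom at the chosen confidence level.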

30 COMPARISON OF TWO EXPERIMENTAL MEANS
If there is good reason to believe that the standard deviations of the two data sets differ, the two-sample t test must be used. Paired data: Scientists and engineers often make use of pairs of measurements on the same sample in order to minimize sources of variability that are not of interest. The test statistic value: t = (d̄ − Δ0)/(sd/√N), where d̄ is the average difference, Δ0 is the specific value of the difference being tested (often 0), and sd is the standard deviation of the differences.

31 COMPARISON OF TWO EXPERIMENTAL MEANS
Example 5: A new automated procedure for determining glucose in serum (Method A) is to be compared to the established method (Method B). Both methods are performed on serum from the same six patients in order to eliminate patient-to-patient variability. Do the following results confirm a difference in the two methods at the 95% confidence level?

32 ERRORS IN HYPOTHESIS TESTING
Type I error: A type I error occurs when H0 is rejected although it is actually true. In some sciences, a type I error is called a false positive. Type II error: A type II error occurs when H0 is accepted although it is actually false. It is sometimes termed a false negative.

33 ERRORS IN HYPOTHESIS TESTING
The consequences of making errors in hypothesis testing are often compared to the errors made in judicial procedures. Convicting an innocent person is usually considered a more serious error than setting a guilty person free. If we make it less likely that an innocent person gets convicted, we make it more likely that a guilty person goes free. It is important when thinking about errors in hypothesis testing to determine the consequences of making a type I or type II error.

34 ERRORS IN HYPOTHESIS TESTING
As a general rule of thumb, the largest α that is tolerable for the situation should be used. This ensures the smallest type II error while keeping the type I error within acceptable limits. For many cases in analytical chemistry, an α value of 0.05 (95% confidence level) provides an acceptable compromise.
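The trade-off between α and the type II error can be illustrated numerically. The sketch below assumes a one-tailed z test (Ha: μ > μ0) and a hypothetical true shift of the mean; the function name and the values of shift, sigma and n are assumptions for illustration only, not data from the slides.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(alpha, shift, sigma, n):
    """Probability of accepting H0 in a one-tailed z test when the true mean is mu0 + shift."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha)
    # Under the shifted mean, z is normally distributed about shift/(sigma/sqrt(n)).
    return std_normal.cdf(z_crit - shift / (sigma / sqrt(n)))

# Hypothetical case: true mean lies 1.0 unit above mu0, sigma = 2.0, N = 10.
for alpha in (0.01, 0.05, 0.10):
    beta = type_ii_error(alpha, shift=1.0, sigma=2.0, n=10)
    print(f"alpha = {alpha:.2f} -> type II error probability = {beta:.2f}")
```

Raising α lowers the type II error probability for the same data, which is the reason for choosing the largest α that is tolerable.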

35 COMPARISON OF VARIANCES
At times, there is a need to compare the variances (or standard deviations) of two data sets. The normal t test requires that the standard deviations of the data sets being compared are equal. F test: a simple statistical test that can be used to test this assumption, provided that the populations follow the normal (Gaussian) distribution.

36 COMPARISON OF VARIANCES
The F test is based on the null hypothesis that the two population variances under consideration are equal. The test statistic F, defined as the ratio of the two sample variances (F = s1²/s2²), is calculated and compared with the critical value of F at the desired significance level. The null hypothesis is rejected if the test statistic differs too much from unity.

37 COMPARISON OF VARIANCES
The F test is also used in comparing more than two means and in linear regression analysis. Critical values of F at the 0.05 significance level are shown in Table 4. Table 4:

38 COMPARISON OF VARIANCES
Two degrees of freedom are given, one associated with the numerator and the other with denominator. The F-test can be used in either a one-tailed mode or in a two-tailed mode.

39 COMPARISON OF VARIANCES
Example 6 A standard method for the determination of the carbon monoxide (CO) level in gaseous mixtures is known from many hundreds of measurements to have a standard deviation of 0.21 ppm CO. A modification of the method yields a value for s of 0.15 ppm CO for a pooled data set with 12 degrees of freedom. A second modification, also based on 12 degrees of freedom, has a standard deviation of 0.12 ppm CO. Is either modification significantly more precise than the original?

40 ANALYSIS OF VARIANCE ANOVA: the methods used for multiple comparisons fall under the general category of analysis of variance (ANOVA). After ANOVA indicates a potential difference, multiple comparison procedures can be used to identify which specific population means differ from the others. Experimental design methods take advantage of ANOVA in planning and performing experiments.

41 ANALYSIS OF VARIANCE ANOVA detects differences in several population means by comparing the variances. The following are typical applications of ANOVA: Is there a difference in the results of five analysts determining calcium by a volumetric method? Will four different solvent compositions have differing influences on the yield of a chemical synthesis? Are the results of manganese determinations by three different analytical methods different? Is there any difference in the fluorescence of a complex ion at six different values of pH?

42 ANALYSIS OF VARIANCE Figure 4: a single-factor, or one-way, ANOVA.
The basic principle of ANOVA is to compare the variation between the different factor levels (groups) to that within factor levels. When the groups are the different analysts, this is a comparison of the between-analyst variation to the within-analyst variation (Figure 5).
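The between-group versus within-group comparison can be sketched as a small one-way ANOVA calculation. The function name and the three groups below are hypothetical placeholders, chosen so the arithmetic is easy to follow by hand; they are not the calcium data of Example 7.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of measurements."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means about the grand mean (between factor levels).
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual results about their own group mean (within factor levels).
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical results from three groups (illustrative values only).
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f, df_b, df_w = one_way_anova(groups)
print(f"F = {f:.2f} with {df_b} and {df_w} degrees of freedom")
```

F is the ratio of the mean square between groups to the mean square within groups; it is compared with the critical F value for (df_between, df_within) degrees of freedom.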

43 ANALYSIS OF VARIANCE Figure 4

44 ANALYSIS OF VARIANCE Figure 5

45 ANALYSIS OF VARIANCE ANOVA Table:

46 ANALYSIS OF VARIANCE Example 7:
Five analysts determined calcium by a volumetric method and obtained the amount (in mmol Ca) shown in the table below. Do the means differ significantly at the 95% confidence level?

47 EXAMPLE 1 (a) From Table 1, z = 1.28 and 1.96 for the 80% and 95% confidence levels. 80% CI = 1108 ± 1.28 × 19 = 1108 ± 24.3 mg/L. 95% CI = 1108 ± 1.96 × 19 = 1108 ± 37.2 mg/L.

48 EXAMPLE 1 It can be concluded that it is 80% probable that the population mean (μ) lies in the interval 1083.7 to 1132.3 mg/L glucose. The probability is 95% that μ lies in the interval between 1070.8 and 1145.2 mg/L.

49 EXAMPLE 1 (b) For the seven measurements, 80% CI = 1100.3 ± (1.28 × 19)/√7 = 1100.3 ± 9.2 mg/L. 95% CI = 1100.3 ± (1.96 × 19)/√7 = 1100.3 ± 14.1 mg/L.

50 EXAMPLE 1 From the experimental mean (x̄ = 1100.3 mg/L), it can be concluded that there is an 80% chance that μ is located in the interval between 1091.1 and 1109.5 mg/L glucose and a 95% chance that it lies between 1086.2 and 1114.4 mg/L glucose. Note: the intervals are considerably smaller when we use the experimental mean instead of a single value.

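51 EXAMPLE 2 (a) Σxi = 0.084 + 0.089 + 0.079 = 0.252; Σxi² = 0.007056 + 0.007921 + 0.006241 = 0.021218; s = √[(0.021218 − (0.252)²/3)/(3 − 1)] = 0.0050% C2H5OH. In this case, x̄ = 0.252/3 = 0.084.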

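52 EXAMPLE 2 t = 4.30 for two degrees of freedom and the 95% confidence level. 95% CI = x̄ ± ts/√N = 0.084 ± (4.30 × 0.0050)/√3 = 0.084 ± 0.012% C2H5OH. (b) Because s = 0.0050% is a good estimate of σ, we can use z: 95% CI = x̄ ± zσ/√N = 0.084 ± (1.96 × 0.0050)/√3 = 0.084 ± 0.006% C2H5OH.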

53 EXAMPLE 3 μ0 is the literature value of 129 kJ/mol, so the null hypothesis is H0: μ = 129 kJ/mol. The alternative hypothesis is Ha: μ ≠ 129 kJ/mol. This is a two-tailed test. From Table 1, zcrit = 1.96 for the 95% confidence level and zcrit = 2.58 for the 99% confidence level. The test statistic is calculated as: z = (x̄ − μ0)/(σ/√N) = (116 − 129)/(22/√30) = −3.24

54 EXAMPLE 3 Since z ≤ −1.96, we reject the null hypothesis at the 95% confidence level. Since z ≤ −2.58, we also reject H0 at the 99% confidence level. To estimate the probability of obtaining a mean value of 116 kJ/mol, we must find the probability of obtaining a z value as large as 3.24 in magnitude. From Table 1, the probability of obtaining a z value this large because of random error is only about 0.2%.

55 EXAMPLE 3 All of these results lead us to conclude that the student mean is actually different from the literature value and not just the result of random error.
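The Example 3 calculation can be reproduced with a short sketch. The function names are illustrative, s = 22 kJ/mol is treated as a good estimate of σ because N = 30, and the two-tailed probability is computed from the standard normal distribution rather than interpolated from Table 1, so it will not match a table-based estimate exactly.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(x_bar, mu0, sigma, n):
    """z = (x_bar - mu0) / (sigma / sqrt(n))."""
    return (x_bar - mu0) / (sigma / sqrt(n))

def two_tailed_probability(z):
    """Probability of a |z| at least this large arising from random error alone."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

# Class mean 116 kJ/mol, literature value 129 kJ/mol, s = 22 kJ/mol, N = 30.
z = z_statistic(116.0, 129.0, 22.0, 30)
p = two_tailed_probability(z)
print(f"z = {z:.2f}, two-tailed probability = {p:.4f}")
print("reject H0 at 95%:", abs(z) >= 1.96)
print("reject H0 at 99%:", abs(z) >= 2.58)
```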

56 EXAMPLE 4 The null hypothesis is H0: μ = 0.123% S.
The alternative hypothesis is Ha: μ ≠ 0.123% S. Σxi = 0.464; x̄ = 0.464/4 = 0.116% S; s = 0.0032% S.

57 EXAMPLE 4 The test statistic can be calculated as t = (x̄ − μ0)/(s/√N) = (0.116 − 0.123)/(0.0032/√4) = −4.375.
From Table 3: The critical value of t for 3 degrees of freedom and the 95% confidence level is 3.18. Since t ≤ −3.18, we conclude that there is a significant difference at the 95% confidence level and thus bias in the method.

58 EXAMPLE 4 If we were to do this test at the 99% confidence level, tcrit = 5.84. Since t = −4.375 is greater than −5.84, we would accept the null hypothesis at the 99% confidence level and conclude there is no difference between the experimental and the accepted values.
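A brief sketch of the Example 4 test from its summary statistics. The function names are illustrative; the critical values 3.18 and 5.84 are the table values quoted above, and s = 0.0032% S is an assumption back-calculated from the reported t of −4.375 with N = 4.

```python
from math import sqrt

def t_statistic(x_bar, mu0, s, n):
    """t = (x_bar - mu0) / (s / sqrt(n))."""
    return (x_bar - mu0) / (s / sqrt(n))

def reject_two_tailed(t, t_crit):
    """Two-tailed rule: reject H0 if t >= t_crit or t <= -t_crit."""
    return abs(t) >= t_crit

# Mean 0.116 % S, accepted value 0.123 % S, N = 4; s = 0.0032 % S is assumed.
t = t_statistic(0.116, 0.123, 0.0032, 4)
print(f"t = {t:.3f}")
print("reject H0 at 95% (t_crit = 3.18):", reject_two_tailed(t, 3.18))
print("reject H0 at 99% (t_crit = 5.84):", reject_two_tailed(t, 5.84))
```

The same data lead to rejection at the 95% level but not at the 99% level, which shows how the conclusion depends on the chosen significance level.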

59 EXAMPLE 5 If μd is the true average difference between the methods, we want to test the null hypothesis H0: μd = 0 against Ha: μd ≠ 0. The test statistic is t = (d̄ − 0)/(sd/√N). N = 6, Σdi = 88, Σdi² = 1592, d̄ = 88/6 = 14.67

60 EXAMPLE 5 The standard deviation of the differences: sd = √[(1592 − (88)²/6)/(6 − 1)] = 7.76
The t statistic: t = 14.67/(7.76/√6) = 4.628. The critical value of t = 2.57 for the 95% confidence level and 5 degrees of freedom.

61 EXAMPLE 5 Since t>tcrit, we reject the null hypothesis and conclude that the two methods give different results.
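The paired t calculation of Example 5 can be reproduced from the sums quoted in the solution (N = 6, Σd = 88, Σd² = 1592). The function name and the delta0 parameter are illustrative; the critical value 2.57 is the table value quoted in the solution.

```python
from math import sqrt

def paired_t_from_sums(n, sum_d, sum_d2, delta0=0.0):
    """Return (d_bar, s_d, t) from the sum and sum of squares of the paired differences."""
    d_bar = sum_d / n
    s_d = sqrt((sum_d2 - sum_d ** 2 / n) / (n - 1))
    t = (d_bar - delta0) / (s_d / sqrt(n))
    return d_bar, s_d, t

d_bar, s_d, t = paired_t_from_sums(n=6, sum_d=88.0, sum_d2=1592.0)
print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.2f}, t = {t:.3f}")
print("reject H0 at 95% (t_crit = 2.57):", abs(t) >= 2.57)
```

Working with the differences removes the patient-to-patient variability, so only the method-to-method difference is tested.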

62 EXAMPLE 6 Null hypothesis: H0: σ²std = σ²mod. The alternative hypothesis (one-tailed): Ha: σ²mod < σ²std.
Because an improvement is claimed, the variances of the modifications are placed in the denominator: F = s²std/s²mod, where s²std is the variance of the standard method and s²mod is the variance of the modified method.

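63 EXAMPLE 6 For the 1st modification: F1 = (0.21)²/(0.15)² = 1.96. For the 2nd modification: F2 = (0.21)²/(0.12)² = 3.06.
For the standard procedure, sstd is a good estimate of σ, and the number of degrees of freedom of the numerator can be taken as infinite. With 12 degrees of freedom in the denominator, the critical value of F at the 95% confidence level is 2.30.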

64 EXAMPLE 6 F1 < 2.30: we cannot reject the null hypothesis, and we conclude that there is no improvement in precision. F2 > 2.30: we reject the null hypothesis and conclude that the second modification does appear to give better precision at the 95% confidence level. Is the precision of the second modification significantly better than that of the first?

65 EXAMPLE 6 Here the F test dictates that we must accept the null hypothesis. With the larger variance in the numerator,
F = s1²/s2² = (0.15)²/(0.12)² = 1.56. In this case, Fcrit = 2.69. Since F < 2.69, we must accept H0 and conclude that the two methods give equivalent precision.
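The three F ratios of Example 6 are simple enough to check directly. In this sketch the function name is illustrative, and the critical values 2.30 and 2.69 are those quoted in the solution rather than computed.

```python
def f_statistic(s_numerator, s_denominator):
    """F as the ratio of two variances, given the two standard deviations."""
    return s_numerator ** 2 / s_denominator ** 2

s_std, s_mod1, s_mod2 = 0.21, 0.15, 0.12  # ppm CO

f1 = f_statistic(s_std, s_mod1)    # standard method vs first modification
f2 = f_statistic(s_std, s_mod2)    # standard method vs second modification
f12 = f_statistic(s_mod1, s_mod2)  # first vs second modification

print(f"F1 = {f1:.2f}, improvement over standard: {f1 > 2.30}")
print(f"F2 = {f2:.2f}, improvement over standard: {f2 > 2.30}")
print(f"F(1 vs 2) = {f12:.2f}, second better than first: {f12 > 2.69}")
```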

66 EXAMPLE 7 We obtain the mean and standard deviations for each analyst.
The mean for analyst 1 is x̄1 = 10.5 mmol Ca. The remaining means are obtained in the same manner:

67 EXAMPLE 7 The results are summarized as follows,
The grand mean is found:

68 EXAMPLE 7

69 EXAMPLE 7 From the F table, the critical value of F at the 95% confidence level for 4 and 10 degrees of freedom is 3.48. Since F exceeds 3.48, we reject H0 at the 95% confidence level and conclude that there is a significant difference among the analysts.

