Fundamentals of Data Analysis Lecture 6 Testing of statistical hypotheses pt.3.


Nonparametric tests
Nonparametric tests can be divided into two groups: goodness-of-fit tests, which test the hypothesis that the population has a certain type of distribution, and tests of the hypothesis that two samples come from one population (i.e., that the two populations have the same distribution).

Kolmogorov test of goodness of fit
The Kolmogorov λ test of goodness of fit is suitable for verifying the hypothesis that a population has a specified distribution. In this test the empirical distribution function is compared with the hypothetical one, unlike the chi-square test, where the empirical class sizes are calculated and compared with the hypothetical class sizes. When the population distribution is consistent with the hypothesis, the values of the empirical and hypothetical distribution functions should be similar at all examined points.

The test starts with an analysis of the differences between the two distribution functions. The largest difference is then used to construct the λ statistic, whose distribution does not depend on the form of the hypothetical distribution. This distribution determines the critical value for the test. If the maximum difference at some point in the range of the variable is too large, the hypothesis that the population has the assumed cumulative distribution function should be rejected.
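As a rough illustration of the distribution that supplies the critical values: the limiting Kolmogorov distribution has the well-known series form K(λ) = 1 − 2 Σ_{k≥1} (−1)^(k−1) exp(−2k²λ²). The Python sketch below evaluates it. The function name `kolmogorov_cdf` and the truncation at 100 terms are assumptions made for this illustration and are not part of the lecture.

```python
import math


def kolmogorov_cdf(lam, terms=100):
    """Approximate K(lam) = P(lambda < lam) for the limiting Kolmogorov distribution.

    Uses the alternating series 1 - 2 * sum_{k>=1} (-1)**(k-1) * exp(-2 k^2 lam^2),
    truncated after `terms` terms (an assumption of this sketch). The series
    converges quickly for the values of lam met in practice (roughly lam > 0.2).
    """
    if lam <= 0:
        return 0.0
    total = 0.0
    for k in range(1, terms + 1):
        total += (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
    return 1.0 - 2.0 * total


# The probability of exceeding a given value is the complement of K.
def kolmogorov_tail(lam):
    return 1.0 - kolmogorov_cdf(lam)
```

Evaluating `kolmogorov_cdf(1.358)` gives a value very close to 0.95, which is one way to cross-check critical values read from printed tables.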

The usefulness of this test is limited because the hypothetical distribution must be continuous and its parameters should be known, although for large samples they can be estimated from the sample.

The procedure for the Kolmogorov test is as follows:
1. We sort the results in ascending order or group them into relatively narrow classes with right ends $x_i$ and corresponding sizes $n_i$.
2. For each $x_i$ we calculate the empirical cumulative distribution function $F_n(x)$ from the equation $F_n(x_i) = \frac{1}{n} \sum_{j \le i} n_j$.

3. From the hypothetical distribution we determine, for each $x_i$, the theoretical value of the distribution function $F(x)$.
4. For each $x_i$ we calculate the absolute value of the difference $F_n(x) - F(x)$.
5. We calculate the parameter $D = \sup_x |F_n(x) - F(x)|$ and then the value of the statistic $\lambda = \sqrt{n}\, D$, which, when the null hypothesis is true, should have the Kolmogorov distribution.

6. For a chosen confidence level $\alpha$ we read from the tables of the Kolmogorov distribution the critical value $\lambda_{cr}$ satisfying the condition $P\{\lambda \ge \lambda_{cr}\} = 1 - \alpha$. When $\lambda \ge \lambda_{cr}$ the null hypothesis should be rejected; otherwise there is no reason to reject it.
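The six steps can be sketched in Python for grouped data. This is a minimal illustration under stated assumptions: the function names (`normal_cdf`, `kolmogorov_lambda`) are invented here, the class boundaries and sizes are made-up demonstration data rather than the lecture's table, and the critical value 1.354 is the one quoted in the lecture's example for a confidence level of 0.95.

```python
import math


def normal_cdf(x, m, sigma):
    """Distribution function of N(m, sigma), computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - m) / (sigma * math.sqrt(2.0))))


def kolmogorov_lambda(right_ends, sizes, cdf):
    """Return (D, lambda) for grouped data.

    right_ends -- right ends x_i of the classes, in ascending order (step 1)
    sizes      -- class sizes n_i
    cdf        -- hypothetical distribution function F(x)
    """
    n = sum(sizes)
    cumulative = 0
    d = 0.0
    for x, size in zip(right_ends, sizes):
        cumulative += size
        f_emp = cumulative / n              # step 2: empirical F_n(x_i)
        f_hyp = cdf(x)                      # step 3: hypothetical F(x_i)
        d = max(d, abs(f_emp - f_hyp))      # steps 4-5: largest absolute difference
    return d, math.sqrt(n) * d              # step 5: lambda = sqrt(n) * D


# Made-up demonstration data (NOT the lecture's table): 100 results in 4 classes.
right_ends = [-1.0, 0.0, 1.0, 4.0]
sizes = [16, 34, 34, 16]

d, lam = kolmogorov_lambda(right_ends, sizes, lambda x: normal_cdf(x, 0.0, 1.0))

lam_cr = 1.354                 # critical value quoted in the lecture for level 0.95
reject_h0 = lam >= lam_cr      # step 6: reject when lambda >= lambda_cr
print(f"D = {d:.5f}, lambda = {lam:.5f}, reject H0: {reject_h0}")
```

Passing the hypothetical distribution as a function keeps the sketch independent of the normal case; any continuous distribution function can be substituted.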

Example
A sample of size n = 1000 was tested; the results were grouped into 10 narrow classes and are given in the table (columns: class, size). Our task is to put forward a reasonable null hypothesis about the distribution and verify it at the 95% confidence level.

The size distribution is close to symmetric, with its maximum in one of the middle classes, which suggests the hypothesis that the tested attribute has a normal distribution N(m, σ). If we take m = 65, then inside the range there are 1000 − (25 + 19) = 956 results, which is 95.6% of all results. From the properties of the normal distribution we know that the probability of a value falling within the range from m − 1.96σ to m + 1.96σ is 0.95 (that is, for a sample of 1000, about 950 results should lie inside that range).
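The figure of roughly 950 results out of 1000 within ±1.96σ of the mean can be checked numerically. The short Python sketch below uses only the standard library; the function name `central_normal_probability` is an assumption made for this illustration.

```python
import math


def central_normal_probability(z):
    """P(m - z*sigma < X < m + z*sigma) for a normally distributed X.

    For a normal variable this equals erf(z / sqrt(2)), independent of m and sigma.
    """
    return math.erf(z / math.sqrt(2.0))


p = central_normal_probability(1.96)
expected_inside = 1000 * p   # expected number of results inside the range for n = 1000
print(f"p = {p:.4f}, expected inside the range: {expected_inside:.0f}")
```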

Thus σ = 1 seems a reasonable hypothesis. Our null hypothesis is $H_0$: N(65, 1).

In the third column we place the values of the empirical distribution function, calculated from $F_n(x_i) = \frac{1}{n} \sum_{j \le i} n_j$.

In the fourth column we place the standardized right ends of the classes, (x − m)/σ. The fifth column contains the values of the distribution function read from the tables of the N(0, 1) distribution. The sixth column contains the absolute values of the differences between the two distribution functions, the largest of which is $d_4 = 0.0280$.

We then calculate $\sqrt{n}\, d_n = \sqrt{1000} \cdot 0.0280 \approx 0.885$. For a confidence level of 0.95 we read from the Kolmogorov distribution tables the critical value $\lambda_{cr} = 1.354$.

The critical value is greater than the calculated value, so the test results do not contradict the null hypothesis that the general population has the normal distribution N(65, 1).
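The final comparison in the example can be reproduced in a few lines of Python. The numbers (n = 1000, largest difference 0.0280, critical value 1.354) are taken from the example; the variable names are illustrative only.

```python
import math

n = 1000          # sample size from the example
d_max = 0.0280    # largest absolute difference (sixth column of the table)
lam_cr = 1.354    # critical value quoted for a confidence level of 0.95

lam = math.sqrt(n) * d_max     # test statistic, about 0.89
reject_h0 = lam >= lam_cr      # False here: no reason to reject N(65, 1)
print(f"lambda = {lam:.3f}, critical value = {lam_cr}, reject H0: {reject_h0}")
```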

Exercise
The capacitance of 40 capacitors was measured (in pF): 55.1; 67.3; 54.6; 52.2; 58.4; 50.4; 70.1; 55.3; 57.6; 62.5; 68.4; 54.5; 56.7; 53.5; 61.6; 59.6; 49.0; 63.7; 58.1; 56.7; 57.8; 63.6; 69.2; 60.8; 62.9; 54.3; 61.0; 58.2; 64.3; 57.4; 39.3; 59.0; 60.1; 60.7; 59.9; 70.5; 57.2; 61.8; 46.0. Use the Kolmogorov test (at the 95% confidence level) to find the distribution that the capacitance of the capacitors follows.

The end of nonparametric tests
