Presentation on theme: "Statistical descriptions of random values". Presentation transcript:
Statistical descriptions of random values. The mean, the mode and the median are numerical descriptions of the position of a random variable on the number axis. The mean $M_X$ is determined by the formula $M_X = \frac{1}{N}\sum_{i=1}^{N} x_i$, where $N$ is the number of experimental data points and $x_i$ are the observed values. The mode is the most probable value of the variable. Distributions can be polymodal (A), antimodal (B) or without a mode (C). [Figure: three distribution curves labelled A, B and C]
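A minimal Python sketch of the mean and the mode, using the marks of the two classes from the example later in this transcript. The variable names are illustrative, and `statistics.multimode` is chosen here (an assumption, not the slides' method) so that a polymodal sample returns all of its modes.

```python
from statistics import mean, multimode

# Marks taken from the worked example in this transcript.
first_class = [2, 3, 3, 4, 4, 4, 5, 5]
second_class = [2, 5, 2, 5, 2, 5, 4]

mean_first = mean(first_class)          # M_X = (1/N) * sum of x_i
modes_first = multimode(first_class)    # single most frequent value
modes_second = multimode(second_class)  # two values tie: a polymodal sample

print(mean_first)    # 3.75
print(modes_first)   # [4]
print(modes_second)  # [2, 5]
```

The second class has two equally frequent marks, so it illustrates the polymodal case (A).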
The median $Me$ is usually determined for continuous variables. It is the value of the random variable for which 50% of the observations are less than it and 50% are greater. There is another formulation: the median is the value of the random variable for which the condition $P(X < Me) = P(X > Me) = 0.5$ holds. And another: a vertical line passing through the median divides the area under the distribution curve into two equal parts, so the area to the left of the median equals the area to the right of it.
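The "50% below, 50% above" property can be checked numerically. The sketch below uses a small hypothetical sample of continuous measurements (the numbers are not from the slides) and the standard-library `statistics.median`.

```python
from statistics import median

# Hypothetical continuous measurements (not from the slides).
sample = [10.0, 12.5, 11.0, 14.5, 13.0, 9.5]

me = median(sample)

# Share of observations strictly below and strictly above the median.
share_below = sum(x < me for x in sample) / len(sample)
share_above = sum(x > me for x in sample) / len(sample)

print(me)           # 11.75
print(share_below)  # 0.5
print(share_above)  # 0.5
```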
Dispersion (variance) $D_x$ describes the scatter, that is, the degree of variation of the values of a random variable around its mean: $D_x = \frac{1}{N}\sum_{i=1}^{N}(x_i - M_X)^2$. We buy five boards. Their lengths are shown in the picture (200, 198, 202, 198 and 202). $M_x = 200$, $D_x = \frac{(200-200)^2 + (198-200)^2 + (202-200)^2 + (198-200)^2 + (202-200)^2}{5} = \frac{0+4+4+4+4}{5} = 3.2$. Another batch has lengths 100, 300, 150, 250 and 200, again with $M_x = 200$: $D_x = \frac{(100-200)^2 + (300-200)^2 + (150-200)^2 + (250-200)^2 + (200-200)^2}{5} = 5000$. Which batch has the greater variation of values? If we want a batch whose elements differ little from the mean, we need a small variance. Conversely, a large variance indicates that the elements of the batch differ strongly from one another.
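The two board calculations can be reproduced with a few lines of Python. This is a sketch of the population variance (division by N, as on the slide); the function name `population_variance` is illustrative.

```python
def population_variance(values):
    """Mean squared deviation from the mean (division by N, as on the slide)."""
    n = len(values)
    m = sum(values) / n
    return sum((x - m) ** 2 for x in values) / n

# Board lengths of the two batches from the slide.
batch_1 = [200, 198, 202, 198, 202]
batch_2 = [100, 300, 150, 250, 200]

var_1 = population_variance(batch_1)
var_2 = population_variance(batch_2)

print(var_1)  # 3.2
print(var_2)  # 5000.0
```

Both batches have the same mean of 200, yet the second variance is far larger, which is exactly the difference in scatter the slide points to.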
Example. Students in two classes received the following marks.
First class:  2 3 3 4 4 4 5 5
Second class: 2 5 2 5 2 5 4
Find the mean, the mode and the variance. Use Excel or calculate by hand. To determine the median, use the following algorithm. First build a variation table (shown here for the first class):
Value                  2      3      4      5
Count                  1      2      3      2
Frequency              0.125  0.25   0.375  0.25
Cumulative frequency   0.125  0.375  0.75   1
The median lies between 3 and 4, because the cumulative frequency 0.5 falls between these two values. For the calculation use the formula $Me = x_0 + h\,\frac{0.5 - w_{x_0}}{w_{x_1} - w_{x_0}}$, where $x_0$ is the lower bound of the interval, $h$ is its width, and $w_{x_0}$, $w_{x_1}$ are the cumulative frequencies at its lower and upper bounds. The lower bound of the interval is 3. The width of the interval is 1, because the marks differ from each other by 1. For the value 3 we have $w_x = 0.375$ (cumulative frequency), and for 4 we have $w_x = 0.75$. So the share of the interval containing the median is 0.75 - 0.375 = 0.375, and Me = 3 + (0.5 - 0.375)/0.375 = 3.33.
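The median algorithm above (cumulative frequencies plus linear interpolation inside the interval where they cross 0.5) might be sketched in Python as follows. The function name `interpolated_median` and the handling of the edge case where the smallest value already reaches 0.5 are assumptions made for this illustration.

```python
from collections import Counter

def interpolated_median(values):
    """Median by linear interpolation of cumulative relative frequencies."""
    n = len(values)
    counts = Counter(values)
    prev_value = None
    prev_cum = 0.0
    for value in sorted(counts):
        cum = prev_cum + counts[value] / n   # cumulative frequency at this value
        if cum >= 0.5:
            if prev_value is None:
                # Assumption: no lower neighbour, so return the value itself.
                return float(value)
            width = value - prev_value       # interval width (1 for the marks)
            return prev_value + width * (0.5 - prev_cum) / (cum - prev_cum)
        prev_value, prev_cum = value, cum
    raise ValueError("empty sample")

first_class = [2, 3, 3, 4, 4, 4, 5, 5]
me = interpolated_median(first_class)
print(round(me, 2))  # 3.33
```

Note that this interpolated value follows the slide's algorithm; it is not the same quantity as the ordinary sample median returned by, for example, `statistics.median`.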
The next indicators. The lower and upper quartiles are defined by analogy with the median, but they cut the chart at 25% and 75%. Let us go back to the variance. You buy a batch of apples and, let us assume, the mean weight of a box is 25 kilograms and the variance is 100 square kilograms. Is that a lot or a little? The result is unclear, because a squared kilogram cannot be compared with kilograms or tons. A ton squared! And if we estimate the variance for a group of people, do we get a result in squared units? A person squared! That cannot be. The drawback of the variance is that it is measured in squared units. The next characteristic is the standard deviation (mean quadratic deviation), $\sigma_x = \sqrt{D_x}$; its dimension coincides with the dimension of the measured quantity.
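A short Python sketch of the standard deviation and the quartiles. It uses the apple variance and the board lengths from this transcript; the choice of `method="inclusive"` for the quartiles is an assumption, since the slides do not say which interpolation rule is intended.

```python
from math import sqrt
from statistics import pstdev, quantiles

# Apples: a variance of 100 kg^2 corresponds to a standard deviation in kg.
apple_variance_kg2 = 100
apple_std_kg = sqrt(apple_variance_kg2)

# Board batches from the variance slide.
batch_1 = [200, 198, 202, 198, 202]
batch_2 = [100, 300, 150, 250, 200]
std_1 = pstdev(batch_1)  # square root of the population variance 3.2
std_2 = pstdev(batch_2)  # square root of the population variance 5000

# Lower quartile, median and upper quartile of the second batch.
quartiles_2 = quantiles(batch_2, n=4, method="inclusive")

print(apple_std_kg)      # 10.0
print(round(std_1, 2))   # 1.79
print(round(std_2, 2))   # 70.71
print(quartiles_2)       # [150.0, 200.0, 250.0]
```

A standard deviation of 10 kg against a mean box weight of 25 kg is directly interpretable, which the variance of 100 kg² is not.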
The coefficient of asymmetry (skewness) is used to estimate how skewed a distribution is and is determined by a formula (in the usual moment form, $A_s = \frac{1}{N\sigma_x^3}\sum_{i=1}^{N}(x_i - M_X)^3$). For a symmetric curve the asymmetry is equal to zero. If the right part of the chart (the chart is divided into two parts at the mean) is larger than the left, the asymmetry is positive; if the left part is larger than the right, it is negative.
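The sign rule for the asymmetry can be illustrated in Python. The sketch assumes the usual moment-based definition (third central moment divided by the cubed standard deviation); the slide's own formula is not preserved in the transcript, and the three samples are hypothetical.

```python
def skewness(values):
    """Third central moment divided by the cube of the standard deviation."""
    n = len(values)
    m = sum(values) / n
    variance = sum((x - m) ** 2 for x in values) / n
    third_moment = sum((x - m) ** 3 for x in values) / n
    return third_moment / variance ** 1.5

# Hypothetical samples (not from the slides).
symmetric = [1, 2, 3, 4, 5]
right_tail = [1, 1, 1, 2, 10]    # long tail to the right of the mean
left_tail = [1, 9, 10, 10, 10]   # long tail to the left of the mean

skew_symmetric = skewness(symmetric)
skew_right = skewness(right_tail)
skew_left = skewness(left_tail)

print(skew_symmetric)  # zero for a symmetric sample
print(skew_right > 0)  # True
print(skew_left < 0)   # True
```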
The excess (kurtosis) is determined by the formula $E_X = \frac{1}{N\sigma_x^4}\sum_{i=1}^{N}(x_i - M_X)^4 - 3$. The number 3 is subtracted because for the normal distribution the first term is equal to 3. A normal distribution arises when a great number of independent random factors act, none of which can be singled out as dominant. The excess characterizes the degree of concentration of cases around the mean value and describes the steepness of the chart. If $E_X = 0$, the distribution is normal. If $E_X > 0$, the chart has a sharp peak and the data are grouped tightly around the mean. If $E_X < 0$, the chart has a flat top.
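A minimal Python sketch of the excess, assuming the moment-based form (fourth central moment over the squared variance, minus 3). The two samples are hypothetical and chosen only to show the sign of the result.

```python
def excess_kurtosis(values):
    """Fourth central moment over squared variance, minus 3."""
    n = len(values)
    m = sum(values) / n
    variance = sum((x - m) ** 2 for x in values) / n
    fourth_moment = sum((x - m) ** 4 for x in values) / n
    return fourth_moment / variance ** 2 - 3

# Hypothetical samples (not from the slides).
flat = [1, 2, 3, 4, 5]                  # evenly spread values, flat top
peaked = [0, 5, 5, 5, 5, 5, 5, 5, 10]   # most values sit at the mean

excess_flat = excess_kurtosis(flat)
excess_peaked = excess_kurtosis(peaked)

print(excess_flat < 0)    # True
print(excess_peaked > 0)  # True
```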
The range of variation is the difference between the maximum and the minimum value of the variable.
A confidence interval for the mean is an interval of values around the estimate within which, at a given confidence level, the mean of the population lies. For example, if the sample mean is 23 and the lower and upper bounds of the confidence interval at the level p = .95 are 19 and 27, then one can conclude that with probability 95% the interval with bounds 19 and 27 covers the population mean. If you set a higher confidence level, the interval becomes wider, so the probability with which it "covers" the unknown population mean increases, and vice versa. It is well known, for example, that the more "uncertain" a weather forecast is (i.e. the wider the confidence interval), the higher the chance that it will be correct.
The width of the confidence interval depends on the sample size and on the variation (variability) of the data. Increasing the sample size makes the estimate of the mean more reliable. Increasing the variation of the observed values reduces the reliability of the estimate. The calculation of confidence intervals is based on the assumption that the observed values are normally distributed. If this assumption does not hold, the estimate can turn out to be poor, especially for small samples. As the sample size increases, say to 100 or more, the quality of the estimate improves even without the assumption of normality. So, if the bounds of the confidence interval are determined and a confidence level of 95% is set, one can assert that with probability 95% the mean lies within these bounds.
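The dependence of the interval width on the confidence level and on the sample size can be sketched in Python. The sample below is hypothetical (chosen so that its mean is 23) and the function name is illustrative. The sketch uses the normal approximation, mean ± z·s/√n, which is an assumption: for small samples a Student t quantile would normally replace z.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, level=0.95):
    """Normal-approximation confidence interval for the mean."""
    n = len(sample)
    m = mean(sample)
    standard_error = stdev(sample) / sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for level = 0.95
    return m - z * standard_error, m + z * standard_error

# Hypothetical sample with mean 23 (not data from the slides).
sample = [21, 25, 19, 27, 23, 24, 22, 26, 20, 23]

low_95, high_95 = mean_confidence_interval(sample, 0.95)
low_99, high_99 = mean_confidence_interval(sample, 0.99)
low_big, high_big = mean_confidence_interval(sample * 4, 0.95)

print(high_99 - low_99 > high_95 - low_95)    # True: higher level, wider interval
print(high_big - low_big < high_95 - low_95)  # True: larger sample, narrower interval
```

Repeating the same values four times is only a convenient way to enlarge the sample while keeping a similar spread; it is not how real data would be collected.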