Brought to you by Tutorial Support Services The Math Center
Statistics is the study of how to collect, organize, analyze, and interpret numerical information. Descriptive statistics generally characterizes or describes a set of data elements by graphically displaying the information or describing its central tendencies and how it is distributed. Inferential statistics tries to infer information about a population by using information gathered by sampling.
Population: The complete set of data elements, where N refers to the Population Size.
Sample: A portion of a population selected for further analysis.
Midrange: The arithmetic mean of the highest and lowest data elements.
Parameter: A characteristic of the whole population.
Statistic: A characteristic of a sample, presumably measurable.
The Arithmetic Mean is obtained by summing all elements of the data set and dividing by the number of elements:

$$\bar{x} = \frac{\sum x}{n}$$

The Sample Size is the number of elements in a sample. It is referred to by the symbol n, whereas x refers to each element in the data set.

The Mode is the data element which occurs most frequently.
The Median is the middle element when the data set is arranged in order of magnitude.
1. When n is odd, simply take the middle value of the data set.
2. When n is even, take the two middle values (the pair that leaves the same number of values before them as after them), add them, and divide by 2.

The Midrange is the arithmetic mean of the highest and lowest data elements:

$$\text{Midrange} = \frac{x_{\text{highest}} + x_{\text{lowest}}}{2}$$
Example: A sample of size 9 (n = 9) is taken of student quiz scores with the following results: 5, 6, 7, 7, 8, 8, 8, 9.5, 10

Answer:
The mean is:

$$\bar{x} = \frac{5 + 6 + 7 + 7 + 8 + 8 + 8 + 9.5 + 10}{9} = \frac{68.5}{9} \approx 7.61$$

The median is: 8 (since this is the middle, or fifth, element of the ordered data)
The mode is 8, since it is the data value which appears most frequently in the distribution.
The midrange is:

$$\frac{10 + 5}{2} = 7.5$$
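The four measures in this example can be checked with a short Python sketch. The helper names (`mean`, `median`, `mode`, `midrange`) are illustrative and not part of the handout, and the `mode` helper assumes the data set has a single most frequent value.

```python
from collections import Counter

def mean(data):
    # Sum of all elements divided by the number of elements.
    return sum(data) / len(data)

def median(data):
    # Middle element of the ordered data; when n is even,
    # the average of the two middle elements.
    ordered = sorted(data)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def mode(data):
    # Most frequent element (assumes exactly one such element exists).
    return Counter(data).most_common(1)[0][0]

def midrange(data):
    # Arithmetic mean of the highest and lowest elements.
    return (max(data) + min(data)) / 2

scores = [5, 6, 7, 7, 8, 8, 8, 9.5, 10]
print(round(mean(scores), 2))  # 7.61
print(median(scores))          # 8
print(mode(scores))            # 8
print(midrange(scores))        # 7.5
```

Sorting inside `median` means the function also works when the scores are not already listed in order of magnitude.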
Range is the difference between the highest and lowest data element.

The Standard Deviation is another way to calculate dispersion. This is the most common and useful measure because it reflects the typical distance of each score from the mean. The formula for the sample standard deviation is as follows:

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

where $\bar{x}$ is the sample mean. The Population Standard Deviation is as follows:

$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$

where $\mu$ is the population mean. Notice the difference between the sample and population standard deviations. The sample standard deviation uses (n - 1) in the denominator, hence is slightly larger than the population standard deviation, which uses N (which is often written as n).

Variance is the third method of measuring dispersion. It is the square of the standard deviation:

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \qquad \sigma^2 = \frac{\sum (x - \mu)^2}{N}$$
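As a rough illustration of how the two denominators differ, the sketch below computes the range and both standard deviations directly from their definitions. The function names `sample_std` and `population_std` are hypothetical, and the quiz scores from the earlier example are reused only as convenient test data.

```python
import math

def sample_std(data):
    # Sample standard deviation: divide the sum of squared
    # deviations from the mean by (n - 1), then take the square root.
    n = len(data)
    x_bar = sum(data) / n
    squared_deviations = sum((x - x_bar) ** 2 for x in data)
    return math.sqrt(squared_deviations / (n - 1))

def population_std(data):
    # Population standard deviation: same sum, but divided by N.
    N = len(data)
    mu = sum(data) / N
    squared_deviations = sum((x - mu) ** 2 for x in data)
    return math.sqrt(squared_deviations / N)

scores = [5, 6, 7, 7, 8, 8, 8, 9.5, 10]
data_range = max(scores) - min(scores)
print(data_range)                                   # 5
print(sample_std(scores) > population_std(scores))  # True
```

Squaring either result gives the corresponding variance.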
First, we want to calculate the mean and sample standard deviation of the following distribution: 1, 2, 3, 4, 5. We calculate our mean, and it is:

$$\bar{x} = \frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3$$

Now we construct a table in which to keep track of our data:

x     x - mean     (x - mean)^2
1       -2              4
2       -1              1
3        0              0
4        1              1
5        2              4
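We now want to find the sum of the squared deviations, $\sum (x - \bar{x})^2$: 4 + 1 + 0 + 1 + 4 = 10. The total number of values is n = 5. To find n - 1, subtract 1 from 5 to get 4. Now we find the sample standard deviation:

$$s = \sqrt{\frac{10}{4}} = \sqrt{2.5} \approx 1.58$$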
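Using the formula for the population standard deviation, with the same sum of squared deviations (10) and N = 5, gives us the following:

$$\sigma = \sqrt{\frac{10}{5}} = \sqrt{2} \approx 1.41$$

The variance of our distribution 1, 2, 3, 4, 5 is:

$$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$$

where $\mu = 3$ is the mean. Squaring σ gives us:

$$\sigma^2 = \left(\sqrt{2}\right)^2 = 2$$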
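The hand calculations for the distribution 1, 2, 3, 4, 5 can be cross-checked with Python's standard `statistics` module. This is only a verification sketch; the variable names are illustrative.

```python
import statistics

data = [1, 2, 3, 4, 5]

s = statistics.stdev(data)       # sample standard deviation, divides by n - 1
sigma = statistics.pstdev(data)  # population standard deviation, divides by N

print(f"{s:.2f}")                           # 1.58
print(f"{sigma:.2f}")                       # 1.41
print(f"{statistics.pvariance(data):.2f}")  # 2.00
print(f"{statistics.variance(data):.2f}")   # 2.50
```

Here `pvariance` matches the value obtained by squaring σ, while `variance` is the sample version and equals the square of s.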