 # DESCRIBING DATA: 2. Numerical summaries of data using measures of central tendency and dispersion.


DESCRIBING DATA: 2

Numerical summaries of data using measures of central tendency and dispersion

Central tendency: Mode (Table 1. Undergraduate Majors)

Bimodal Distributions (Table 1. Undergraduate Majors)
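The idea of a mode, and of a bimodal distribution with two values tied for most frequent, can be sketched in a few lines. The majors listed here are hypothetical and are not the contents of Table 1, which did not survive in this transcript; `modes` is an illustrative helper name.

```python
from collections import Counter

def modes(values):
    """Return every value that ties for the highest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

# Hypothetical majors; not the data from Table 1.
unimodal = ["Sociology", "History", "Sociology", "Biology"]
bimodal = ["Sociology", "History", "Sociology", "History", "Biology"]

print(modes(unimodal))  # ['Sociology']
print(modes(bimodal))   # ['History', 'Sociology']
```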

Mode for Grouped Frequency Distributions Based on Interval Data: the midpoint of the modal class interval
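As a rough sketch of the grouped-data rule, the snippet below finds the class interval with the highest frequency and returns its midpoint. The class limits and frequencies are assumed for illustration only; the original table is not part of this transcript.

```python
# Hypothetical class intervals as (lower limit, upper limit, frequency).
classes = [(20, 30, 10), (30, 40, 20), (40, 50, 30), (50, 60, 25), (60, 70, 15)]

def grouped_mode(classes):
    """Midpoint of the class interval with the largest frequency."""
    lower, upper, _ = max(classes, key=lambda c: c[2])
    return (lower + upper) / 2

print(grouped_mode(classes))  # 45.0
```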

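Median: the point in the distribution above which and below which exactly half the observations lie (the 50th percentile). The calculation depends on whether the number of observations is odd or even: with an odd number, the median is the middle observation of the ordered scores; with an even number, it is the average of the two middle observations.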
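The odd/even distinction can be illustrated with a minimal sketch. The scores are hypothetical, and the even-n convention used here (averaging the two middle observations) is the usual textbook one.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the single middle observation.
        return ordered[mid]
    # Even n: average of the two middle observations.
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_scores = [3, 9, 5, 1, 7]        # hypothetical data, n = 5
even_scores = [3, 9, 5, 1, 7, 11]   # hypothetical data, n = 6

print(median(odd_scores))   # 5
print(median(even_scores))  # 6.0
```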

Median = 188

Median for grouped frequency distributions based on interval data: Median = 40 + ((20/30) * 10) = 40 + 6.67 = 46.67
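The calculation above appears to follow the usual interpolation formula: lower limit of the median class, plus the fraction of that class needed to reach the middle observation, times the class width. The sketch below reproduces the slide's arithmetic. The lower limit (40), class frequency (30), and width (10) come from the slide; n = 100 is taken from the grouped mean slide, and the cumulative frequency of 30 below the median class is an assumption chosen so that n/2 minus that count equals the slide's 20.

```python
def grouped_median(lower_limit, n, cum_freq_below, class_freq, width):
    """Interpolated median for a grouped frequency distribution."""
    needed = n / 2 - cum_freq_below  # observations still needed inside the class
    return lower_limit + (needed / class_freq) * width

# cum_freq_below=30 is assumed; it is not stated in the transcript.
result = grouped_median(lower_limit=40, n=100, cum_freq_below=30,
                        class_freq=30, width=10)
print(round(result, 2))  # 46.67
```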

ARITHMETIC MEAN

Mean for Grouped Data: Mean = sum of weighted midpoints / n = 4650 / 100 = 46.5
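A small sketch of the weighted-midpoint calculation follows. The midpoints and frequencies are hypothetical: the original table is not in this transcript, and these values were simply chosen so that the totals match the slide (weighted sum 4650, n = 100).

```python
# Hypothetical (class midpoint, frequency) pairs; chosen to match the
# slide's totals, not taken from the original table.
table = [(25, 10), (35, 20), (45, 30), (55, 25), (65, 15)]

n = sum(freq for _, freq in table)
weighted_sum = sum(midpoint * freq for midpoint, freq in table)
grouped_mean = weighted_sum / n

print(weighted_sum, n, grouped_mean)  # 4650 100 46.5
```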

Mean is the balancing point of the distribution [Figure: seven observations (X) plotted on an axis running from 0 to 9, with the mean marked as the balance point]

Key Properties of the Mean: (1) The sum of the differences between the individual scores and the mean equals 0. (2) The sum of the squared differences between the individual scores and the mean is a minimum value, that is, it is smaller than the sum of squared differences around any other value.
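Both properties can be checked numerically. This is only a sketch with hypothetical scores; `sum_of_squares_around` is an illustrative helper name.

```python
scores = [2, 4, 4, 5, 10]  # hypothetical data
mean = sum(scores) / len(scores)  # 25 / 5 = 5.0

# Property 1: deviations from the mean sum to zero.
deviation_sum = sum(x - mean for x in scores)

# Property 2: squared deviations are smallest when taken around the mean.
def sum_of_squares_around(center):
    return sum((x - center) ** 2 for x in scores)

print(deviation_sum)               # 0.0
print(sum_of_squares_around(mean)) # 36.0
print(sum_of_squares_around(4))    # 41
print(sum_of_squares_around(6))    # 41
```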

Weaknesses of each measure of central tendency. MODE: ignores all other information about the values except the most frequent one. MEDIAN: ignores the LOCATION of scores above or below the midpoint. MEAN: is the most sensitive to extreme values.

Impacts of skewed distributions [Figure: relative positions of the mode, median, and mean in skewed distributions]
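To illustrate how skew separates the three measures, the sketch below uses a hypothetical right-skewed sample with one large value. In this sample the mean is pulled furthest toward the long tail, the median less so, and the mode not at all.

```python
import statistics

# Hypothetical right-skewed scores with one extreme value (20).
skewed = [1, 2, 2, 2, 3, 3, 4, 5, 8, 20]

mode_value = statistics.mode(skewed)
median_value = statistics.median(skewed)
mean_value = statistics.mean(skewed)

# For this sample: mode < median < mean.
print(mode_value, median_value, mean_value)
```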

Measures of Dispersion [Figure: Poverty Households (%) in 2 suburbs by tract; one suburb shows less dispersion, the other more dispersion]

Range: the highest value minus the lowest value. Problem: it ignores all the other values between the two extreme values.
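The weakness of the range shows up when two samples share the same extremes but differ in between. A brief sketch with hypothetical data:

```python
def value_range(values):
    """Highest value minus lowest value."""
    return max(values) - min(values)

# Hypothetical samples with identical extremes but different spread.
clustered = [10, 50, 50, 50, 90]
spread_out = [10, 30, 50, 70, 90]

print(value_range(clustered))   # 80
print(value_range(spread_out))  # 80
```

Both ranges are identical even though the middle scores of the first sample are all the same and those of the second are not.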

Interquartile range: based on the quartiles (the 25th percentile and the 75th percentile of a distribution). Interquartile range = Q3 - Q1. Semi-interquartile range = (Q3 - Q1) / 2. It eliminates the effect of extreme scores by excluding them.
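A minimal sketch of the interquartile range, using Python's `statistics.quantiles` (Python 3.8 or later). Quartile conventions differ between textbooks and software, so the exact Q1 and Q3 values may not match a hand calculation done by another rule. The data are hypothetical and include one extreme score.

```python
import statistics

# Hypothetical scores; 100 is an extreme value.
data = [2, 4, 6, 8, 10, 12, 14, 100]

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)

iqr = q3 - q1
semi_iqr = iqr / 2
data_range = max(data) - min(data)

# The extreme score inflates the range but barely affects the IQR.
print("range:", data_range)
print("IQR:", iqr, "semi-IQR:", semi_iqr)
```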

Graphic representation: Box Plot [Figure: box plots of infant mortality rate for Africa, Asia, and Latin America]

Variance: a measure of dispersion based on the second property of the mean we discussed earlier: the sum of squared differences around the mean is a minimum.

Step 1: Calculate the total sum of squares around the mean

Step 2: Take an average of this total variation by dividing the sum of squares by n - 1. Why n - 1 rather than simply n? The normal procedure involves estimating the variance of a population using data from a sample. Samples, especially small samples, are less likely to include the extreme scores in the population, so dividing by n tends to underestimate the population variance. Dividing by n - 1 compensates for this underestimate.

Step 3: Take the square root of the variance (this is the standard deviation). Purpose: it expresses dispersion in the original units of measurement, not in units of measurement squared. Like the variance: the larger the value, the greater the variability.
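The three steps can be traced in code. This is a sketch with hypothetical scores; the final lines cross-check the hand calculation against `statistics.variance` and `statistics.stdev`, which also divide by n - 1.

```python
import math
import statistics

scores = [2, 4, 4, 5, 10]  # hypothetical sample
n = len(scores)
mean = sum(scores) / n     # 5.0

# Step 1: total sum of squares around the mean.
sum_of_squares = sum((x - mean) ** 2 for x in scores)  # 36.0

# Step 2: average the total variation, dividing by n - 1.
variance = sum_of_squares / (n - 1)                    # 9.0

# Step 3: square root returns to the original units.
std_dev = math.sqrt(variance)                          # 3.0

print(sum_of_squares, variance, std_dev)  # 36.0 9.0 3.0

# Cross-check against the standard library's sample statistics.
assert math.isclose(variance, statistics.variance(scores))
assert math.isclose(std_dev, statistics.stdev(scores))
```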

Coefficient of Variation (V): V = standard deviation / mean. Value: it allows you to make comparisons of dispersion across groups with very different mean values or across variables with very different measurement scales.
