July, 2000 Guang Jin
Statistics in Applied Science and Technology
Chapter 4 Summarizing Data
Key Concepts in This Chapter
• Mean
• Median
• Mode
• Range
• Standard Deviation
• Variance
• Coefficient of Variation
Measures of Central Tendency
• Central tendency: the tendency of a set of data to center around certain values.
• The three most common values are the mean, the median, and the mode.
The Mean
• The arithmetic mean (or simply, the mean) is computed by summing all the observations in the sample and dividing the sum by the number of observations.
• Symbolically, the mean is $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$, where $x_1$ is the first and $x_i$ is the ith in a series of observations, and $n$ is the total number of observations.
The Mean (Continued)
• The arithmetic mean may be considered the balance point, or fulcrum, of a distribution.
• It is the point at which the positive and negative deviations balance, so the deviations from the mean sum to zero.
• The mean is affected by the value of every observation in the distribution and may be distorted when extreme values exist.
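To make the definition and the balance-point property concrete, here is a minimal Python sketch. The sample values and the names `observations` and `arithmetic_mean` are illustrative assumptions, not data or notation from the slides.

```python
# Hypothetical sample; these values are illustrative only.
observations = [2, 4, 4, 5, 10]


def arithmetic_mean(values):
    """Sum the observations and divide by the number of observations."""
    return sum(values) / len(values)


mean = arithmetic_mean(observations)

# Deviations from the mean cancel out: the mean is the balance point.
deviations = [x - mean for x in observations]
total_deviation = sum(deviations)

# Replacing the largest value with an extreme one pulls the mean toward it.
with_outlier = observations[:-1] + [100]
mean_with_outlier = arithmetic_mean(with_outlier)

print(mean, total_deviation, mean_with_outlier)
```

In this sample a single extreme value moves the mean from 5 to 23, although four of the five observations are unchanged.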
The Median
• The median is defined as the middle value when observations are ordered.
• The median is the value above which there are the same number of observations as below.
• For an even number of observations, the median is the average of the two middlemost values.
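The odd and even cases can be sketched as follows. The function name `median` and the sample values are assumptions chosen for illustration.

```python
def median(values):
    """Middle value of the ordered observations.

    For an even count, average the two middlemost values.
    """
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


odd_median = median([7, 1, 5])       # ordered: 1, 5, 7
even_median = median([8, 2, 6, 4])   # ordered: 2, 4, 6, 8

print(odd_median, even_median)
```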
The Mode
• The mode is the observation that occurs most frequently.
• The mode can be read from a graph as the value on the horizontal axis that corresponds to the peak of the distribution.
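A small sketch of finding the mode by counting frequencies. The slide does not say how ties are handled, so returning every value that shares the highest count is an assumption made here, as are the function name `modes` and the sample values.

```python
from collections import Counter


def modes(values):
    """Return all values that occur most frequently, in sorted order."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


single_mode = modes([3, 5, 5, 6, 7])   # one value occurs most often
two_modes = modes([1, 1, 2, 4, 4])     # two values tie for most frequent

print(single_mode, two_modes)
```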
Which Average Should You Use for Quantitative Data?
• When a distribution of observations is normal or not too skewed, the values of the mode, the median, and the mean are the same or similar, and any of them can be used to describe central tendency.
• When a distribution is skewed, there is an appreciable difference between the values of the mean and the median, so both should be reported.
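The effect of skew on the two averages can be seen with a short sketch using Python's standard `statistics` module. Both samples are invented for illustration.

```python
import statistics

# Hypothetical samples: one symmetric, one with a long right tail.
symmetric = [10, 12, 14, 16, 18]
skewed = [20, 22, 25, 27, 30, 150]

symmetric_mean = statistics.mean(symmetric)
symmetric_median = statistics.median(symmetric)

skewed_mean = statistics.mean(skewed)
skewed_median = statistics.median(skewed)

print(symmetric_mean, symmetric_median)
print(round(skewed_mean, 2), skewed_median)
```

In the symmetric sample the mean and median coincide. In the skewed sample the single large value pulls the mean well above the median, which is why both are worth reporting.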
Measures of Central Tendency for Qualitative Data
• The mode can always be used with qualitative data.
• The median can be used whenever the qualitative data are ordinal.
• The mean is not appropriate for qualitative data.
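As a rough illustration of these rules, the sketch below takes the mode of nominal data and the median of ordinal data. The categories, the ordering in `scale`, and the sample values are all hypothetical. An odd number of ratings is used so that a single middle category exists.

```python
from collections import Counter

# Nominal data: no natural order, so only the mode applies.
blood_types = ["O", "A", "O", "B", "A", "O", "AB"]
most_common_type = Counter(blood_types).most_common(1)[0][0]

# Ordinal data: categories can be ranked, so a median category exists.
scale = ["poor", "fair", "good", "excellent"]   # assumed ordering, low to high
ratings = ["good", "poor", "excellent", "fair", "good"]
ranked = sorted(ratings, key=scale.index)
median_rating = ranked[len(ranked) // 2]

print(most_common_type, median_rating)
```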
Measures of Variation
• A measure of variation (or variability) shows whether observations tend to be quite similar (homogeneous) or whether they vary considerably (heterogeneous).
• The three most common measures of variation are the range, the standard deviation, and the variance.
Range
• The range is defined as the difference in value between the highest (maximum) and lowest (minimum) observation: Range = $X_{\max} - X_{\min}$
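In code, the range is a one-line computation. The function name `value_range` and the sample are illustrative assumptions.

```python
def value_range(values):
    """Difference between the maximum and minimum observation."""
    return max(values) - min(values)


sample_range = value_range([12, 7, 3, 15, 8])   # 15 - 3

print(sample_range)
```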
Standard Deviation and Variance
• By far the most widely used measure of variation is the standard deviation, represented by the symbol $s$.
• The standard deviation is the square root of the variance (represented by the symbol $s^2$) of the observations.
• The larger the standard deviation and variance, the more heterogeneous the distribution.
Variance
• The variance ($s^2$) is computed by squaring each deviation from the mean, adding the squares up, and dividing their sum by one less than $n$, the sample size: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
Standard Deviation
• The standard deviation ($s$, sometimes represented by SD) is computed by extracting the square root of the variance: $s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
• The units of the standard deviation are the same as the units of the raw data.
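The two computations can be sketched step by step as follows. The function names and the sample `data` are assumptions for illustration. The divisor is n - 1, as described for the sample variance.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / (n - 1)


def sample_standard_deviation(values):
    """Square root of the sample variance; same units as the raw data."""
    return math.sqrt(sample_variance(values))


data = [2, 4, 4, 5, 10]   # hypothetical observations, mean = 5
variance = sample_variance(data)
std_dev = sample_standard_deviation(data)

print(variance, std_dev)
```

For this sample the squared deviations are 9, 1, 1, 0, and 25, which sum to 36. Dividing by n - 1 = 4 gives a variance of 9 and a standard deviation of 3.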
Important Generalizations
• For most frequency distributions, a majority (often as many as 68%) of all observations are within one standard deviation on either side of the mean.
• For most frequency distributions, a small minority (often as few as 5%) of all observations deviate more than two standard deviations on either side of the mean.
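These generalizations can be checked empirically. The sketch below assumes simulated, normally distributed data, where the 68% and 5% figures are known to hold approximately; other distribution shapes will give different proportions. The seed, sample size, mean, and spread are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the simulation is repeatable

# Simulated observations; a normal distribution is an assumption here.
sample = [random.gauss(50, 10) for _ in range(10_000)]

mean = statistics.mean(sample)
sd = statistics.stdev(sample)

# Proportion of observations within one standard deviation of the mean.
within_one = sum(abs(x - mean) <= sd for x in sample) / len(sample)

# Proportion deviating by more than two standard deviations.
beyond_two = sum(abs(x - mean) > 2 * sd for x in sample) / len(sample)

print(f"within 1 SD: {within_one:.1%}, beyond 2 SD: {beyond_two:.1%}")
```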
Variability for Qualitative Data
• For qualitative data that cannot be ordered, measures of variability are nonexistent.
• For qualitative data that can be ordered, it is appropriate to describe variability by identifying extreme observations.
Coefficient of Variation
• The coefficient of variation (represented by CV) is defined as the ratio of the standard deviation to the absolute value of the mean, expressed as a percentage: $CV = \frac{s}{|\bar{x}|} \times 100\%$
• The CV depicts the size of the standard deviation relative to its mean and can be used to compare the relative variation of even unrelated quantities.
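A minimal sketch of comparing relative variation across unrelated quantities, using the standard `statistics` module. The height and weight values are invented, and raising an error for a zero mean is a design choice made here, not something the slides specify.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the absolute mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / abs(mean) * 100


# Hypothetical measurements in different units.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [50, 60, 70, 80, 90]

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)

print(f"heights: {cv_heights:.2f}%  weights: {cv_weights:.2f}%")
```

Both samples have the same standard deviation, yet the weights vary more relative to their mean, which is the kind of comparison the CV is meant to support.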
Equations for Population and Sample Means and Standard Deviation
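The equations on this slide did not survive in the transcript. The standard definitions are sketched below, consistent with the sample formulas described earlier in the chapter. The population symbols $\mu$, $\sigma$, and $N$ follow the usual convention and are an assumption here, not notation recovered from the slide.

```latex
\begin{align*}
\text{Population mean:} \quad & \mu = \frac{\sum_{i=1}^{N} x_i}{N} \\
\text{Sample mean:} \quad & \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \\
\text{Population standard deviation:} \quad & \sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}} \\
\text{Sample standard deviation:} \quad & s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}
\end{align*}
```

The population formulas divide by $N$, while the sample standard deviation divides by $n - 1$.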