Dr. Serhat Eren 1 CHAPTER 6 NUMERICAL DESCRIPTORS OF DATA.


2 Dr. Serhat Eren 1 CHAPTER 6 NUMERICAL DESCRIPTORS OF DATA

3 Dr. Serhat Eren 2 6.1 CHAPTER OBJECTIVES –Numerical measures of center: the mean, the median, and the mode –Numerical measures of variability: the range and the standard deviation –Describing a set of data: the empirical rule and box-plots –Descriptive statistics for grouped data –Measures of relative standing: percentiles and percentile rank –Identifying outliers: z-scores and box-plots

4 Dr. Serhat Eren 3 6.2 DESCRIBING DATA NUMERICALLY Numerical measures calculated from the data are known as either statistics or parameters. A statistic is a numerical descriptor that is calculated from sample data and is used to describe the sample. Statistics are usually represented by Roman letters. A parameter is a numerical descriptor that is used to describe a population. Parameters are usually represented by Greek letters.

5 Dr. Serhat Eren 4 6.3 MEASURES OF CENTRAL TENDENCY 6.3.1 The Arithmetic Mean The mean, or average, is calculated by adding all of the data values in the sample and then dividing by the number of values. The symbol for the sample mean is $\bar{X}$ (read as X-bar). The sample mean is the center of balance of a set of data.

6 Dr. Serhat Eren 5 The population parameter that corresponds to the sample mean is the population mean, represented by the Greek letter μ (mu). Using the Σ (summation) notation, we can write the formula for the sample mean as: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$, where n is the number of observations and $X_i$ is the i-th data value.


8 7 6.3.1.A What Does the Sample Mean Really Measure? You can think of the sample mean as the balance point of the data. The value of $\bar{X}$ balances the higher values against the lower ones.
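
As a minimal sketch of the calculation and the balance-point idea, the Python snippet below computes the sample mean of a small made-up sample and checks that the deviations from the mean cancel out. The data values and the function name `sample_mean` are illustrative assumptions, not part of the original slides.

```python
def sample_mean(values):
    """Add up all observations and divide by the number of observations."""
    if len(values) == 0:
        raise ValueError("the sample mean is undefined for an empty sample")
    return sum(values) / len(values)


# Hypothetical sample, chosen only for illustration.
sample = [4, 8, 6, 5, 7]
x_bar = sample_mean(sample)

# Deviations above the mean offset deviations below it,
# which is why the mean acts as a balance point.
deviations = [x - x_bar for x in sample]

print(x_bar)            # 6.0
print(sum(deviations))  # 0.0
```

With real-valued data the sum of deviations may differ from zero by a tiny floating-point rounding error.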

9 Dr. Serhat Eren 8 6.3.2 The Sample Median The sample median is the value of the middle observation in an ordered set of data. Finding it requires sorting the data from lowest to highest first; the median is then the value of the observation in the middle of the sorted data.

10 Dr. Serhat Eren 9 The exact location of the middle depends on whether the number of observations in the sample, n, is odd or even. –If n is odd, the median is the value of the observation in the (n+1)/2 position. –If n is even, the median is the average of the values in the n/2 and n/2 + 1 positions.
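
The odd and even cases can be sketched in Python as follows. The function name `sample_median` and the data are hypothetical; the positions in the rule are 1-based, so the code subtracts 1 to get Python's 0-based indexes.

```python
def sample_median(values):
    """Median of a sample: sort first, then pick or average the middle."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median is undefined for an empty sample")
    if n % 2 == 1:
        # Odd n: the observation in position (n + 1) / 2 (1-based).
        return ordered[(n + 1) // 2 - 1]
    # Even n: average the observations in positions n/2 and n/2 + 1 (1-based).
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2


print(sample_median([7, 3, 9, 1, 5]))  # 5   (sorted: 1 3 5 7 9)
print(sample_median([7, 3, 9, 1]))     # 5.0 (sorted: 1 3 7 9, average of 3 and 7)
```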


13 Dr. Serhat Eren 12 6.3.2.A Why Use Two Different Measures? The median tells you that half of the observations in the sample are above that value and half are below it. Because it depends only on position in the ordered data, it ignores how far the other observations lie from the middle and may not fully reflect the sample data.

14 Dr. Serhat Eren 13 The mean uses all of the data values in its calculation and measures the center of balance of the data. While it can be shifted by extreme values, it does reflect all of the data values equally. If we start out with a symmetric, mound-shaped distribution, then the mean and the median are both located at the center of the distribution, at the bump. This is illustrated in Figure 6.2.


16 Dr. Serhat Eren 15 6.3.3 Comparing the Mean and the Median When the data are more spread out in one direction, the mean is pulled toward those values, in the direction of the skew, while the median stays near the middle of the ordered data. This is illustrated in Figure 6.3.
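
To see the pull of an extreme value in numbers, here is a small sketch using Python's standard `statistics` module. The income figures are invented for illustration: five similar values and one much larger one, so the sample is skewed to the right.

```python
import statistics

# Hypothetical incomes (in thousands); 250 is the extreme value.
incomes = [30, 32, 35, 38, 40, 250]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

print(median_income)                # 36.5
print(mean_income > median_income)  # True: the mean is pulled toward 250
```

Here the mean is roughly 70.8, nearly double the median, even though five of the six values are 40 or less.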

17 Dr. Serhat Eren 16 6.3.4 The Sample Mode The sample mode is the data value that has the highest frequency of occurrence in the sample. The mode might appear to be a very good measure of a typical value. But depending on the size of the sample and the number of possible data values, there may not be any repeated values in the sample. That is, for some samples, the mode may not exist.

18 Dr. Serhat Eren 17 For continuous data, we often refer to the modal class in a frequency distribution or histogram. The modal class is the class interval in a frequency distribution or histogram that has the highest frequency. Another problem with the mode is that there may appear to be more than one mode for a sample. This frequently happens with small samples.
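
The three situations described for the mode (one mode, no mode, more than one mode) can be sketched with `collections.Counter`. The function name `sample_modes` and the convention of returning an empty list when no value repeats are assumptions made for this illustration.

```python
from collections import Counter


def sample_modes(values):
    """Return every value tied for the highest frequency.

    Returns an empty list when no value is repeated, reflecting the
    case where the mode does not exist.
    """
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []  # every value occurs once: no mode
    return sorted(v for v, c in counts.items() if c == top)


print(sample_modes([2, 3, 3, 5, 7]))  # [3]
print(sample_modes([1, 2, 3, 4]))     # []
print(sample_modes([1, 1, 2, 2, 3]))  # [1, 2]
```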

19 Dr. Serhat Eren 18 6.4 MEASURES OF DISPERSION OR SPREAD 6.4.1 The Sample Range The simplest measure of dispersion, the sample range, involves looking at the two extreme values in the sample: the highest (maximum) and the lowest (minimum) values. The sample range, R, is the difference between the maximum and minimum observations in the sample.

20 Dr. Serhat Eren 19 The sample range is very easy to calculate and understand. It gives information about the distance from one end of an ordered data set to the other. If the sample data are symmetric, then it also gives information about the spread of the data relative to the measures of central tendency.
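
As a brief sketch, the sample range is the maximum minus the minimum. The function name `sample_range` and the data below are hypothetical.

```python
def sample_range(values):
    """Difference between the largest and smallest observations."""
    if len(values) == 0:
        raise ValueError("the range is undefined for an empty sample")
    return max(values) - min(values)


print(sample_range([12, 7, 3, 15, 9]))  # 12 (15 minus 3)
```

Because only the two extreme observations enter the calculation, changing any of the values in between leaves the range unchanged.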


22 Dr. Serhat Eren 21 6.4.2 The Sample Standard Deviation The standard deviation is most often defined relative to another measure of dispersion called the sample variance. In practice, the measure that is used is the standard deviation because its units and order of magnitude are the same as those of the actual data.

23 Dr. Serhat Eren 22 The sample variance, s², is approximately the average of the squared deviations of the data values from the sample mean: the sum of the squared deviations is divided by n − 1 rather than n. The sample standard deviation, s, is the positive square root of the sample variance.

24 Dr. Serhat Eren 23 To calculate the sample standard deviation, you first calculate the sample variance, s², using the formula $s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$. To obtain the sample standard deviation, s, you take the positive square root of the sample variance: $s = \sqrt{s^2}$.
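
The two-step calculation (variance first, then its positive square root) can be sketched as below. This assumes the usual n − 1 divisor for a sample variance; the function names and data are hypothetical.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1 (assumed divisor)."""
    n = len(values)
    if n < 2:
        raise ValueError("the sample variance needs at least two observations")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


def sample_std_dev(values):
    """Positive square root of the sample variance."""
    return math.sqrt(sample_variance(values))


# Hypothetical sample: mean 6, squared deviations 4, 4, 0, 1, 1.
sample = [4, 8, 6, 5, 7]
s_squared = sample_variance(sample)
s = sample_std_dev(sample)

print(s_squared)    # 2.5
print(round(s, 4))  # 1.5811
```

The standard deviation comes out in the same units as the data, while the variance is in squared units.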

25 Dr. Serhat Eren 24 The population variance and standard deviation are represented by the Greek letter σ (sigma), where σ² is the population variance and σ is the population standard deviation.


28 Dr. Serhat Eren 27 6.4.3 Interpreting the Standard Deviation: The Empirical Rule The empirical rule says that for a mound-shaped, symmetric distribution –about 68% of all observations are within one standard deviation of the mean –about 95% of all observations are within two standard deviations of the mean –almost all (more than 99%) of the observations are within three standard deviations of the mean.

29 Dr. Serhat Eren 28 The empirical rule (Figure 6.5) is defined for large data sets and distributions that are symmetric and mound-shaped, often called bell-shaped or normal curves.
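
One way to check the empirical rule is to simulate a large bell-shaped data set and count how many observations fall within one, two, and three standard deviations of the mean. The sketch below uses Python's standard `random` and `statistics` modules; the mean of 50, standard deviation of 10, sample size, and seed are arbitrary assumptions.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable
data = [rng.gauss(50, 10) for _ in range(10_000)]  # simulated normal data

x_bar = statistics.mean(data)
s = statistics.stdev(data)


def share_within(k):
    """Fraction of observations within k standard deviations of the mean."""
    inside = sum(1 for x in data if abs(x - x_bar) <= k * s)
    return inside / len(data)


for k in (1, 2, 3):
    print(k, round(share_within(k), 3))
```

The printed shares should land close to 0.68, 0.95, and a little above 0.99. Running the same check on strongly skewed data would generally give different shares, which is why the rule is stated only for symmetric, mound-shaped distributions.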

30 Dr. Serhat Eren 29 6.4.4 z-Scores A z-score measures the number of standard deviations that a data value is from the mean. To calculate the z-score of a data value, we first find the difference between the data value and the mean and then divide by the standard deviation: $z = \frac{X - \mu}{\sigma}$.

31 Dr. Serhat Eren 30 As in the empirical rule, for sample data we substitute $\bar{X}$ and s for μ and σ. A positive z-score indicates that the data value is above the mean, while a negative z-score indicates that the data value is below the mean.
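
A minimal sketch of the z-score calculation for sample data follows, using the sample mean and the sample standard deviation from Python's `statistics` module (`statistics.stdev` uses the n − 1 divisor). The function name `z_scores` and the data are hypothetical.

```python
import statistics


def z_scores(values):
    """Signed number of sample standard deviations each value lies from the mean."""
    x_bar = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    if s == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(x - x_bar) / s for x in values]


# Hypothetical sample with mean 6.
sample = [4, 8, 6, 5, 7]
for x, z in zip(sample, z_scores(sample)):
    print(x, round(z, 2))
```

In this sample, 4 and 5 get negative z-scores because they lie below the mean of 6, 7 and 8 get positive ones, and 6 gets a z-score of zero.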

