1 Numerical Summary Measures Lecture 03: Measures of Variation and Interpretation, and Measures of Relative Position.


2 Measures of Variation
Consider the following three data sets:
– Data 1: 1, 2, 3, 4, 5
– Data 2: 1, 1, 3, 5, 5
– Data 3: 3, 3, 3, 3, 3
All three data sets have the same mean and the same median (both equal to 3), yet they are clearly different data sets. Hence the need to measure the variation in the data.
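As a quick check of the claim on this slide, the following minimal sketch computes the mean and median of the three data sets with Python's standard `statistics` module. The dictionary and variable names are illustrative only and do not come from the lecture.

```python
from statistics import mean, median

# The three data sets from the slide.
data_sets = {
    "Data 1": [1, 2, 3, 4, 5],
    "Data 2": [1, 1, 3, 5, 5],
    "Data 3": [3, 3, 3, 3, 3],
}

# Mean and median for each data set.
summary = {name: (mean(values), median(values)) for name, values in data_sets.items()}

for name, (m, med) in summary.items():
    print(f"{name}: mean = {m}, median = {med}")
```

Every data set reports a mean and a median of 3, even though the values are spread out very differently. That difference is what a measure of variation is meant to capture.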

3 On the Perils of an “Average Value”
Situation: A man has his head in a very hot compartment, while his feet are very cold.
Question: Mister, how are you feeling?
Reply: Oh, on the average, I am just fine! … Crash! Dead!

4 Sample Variance
To measure the degree of variation, one could look at the deviations of the observations from their sample mean. The sample variance, denoted by $S^2$, is defined to be the ‘average’ of the squared deviations of the observations from their sample mean:
$$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$$

5 Computational Formula
The definitional formula is not very efficient for computing the sample variance. The computational formula is often used instead:
$$S^2 = \frac{1}{n-1} \left[ \sum_{i=1}^{n} X_i^2 - \frac{\left(\sum_{i=1}^{n} X_i\right)^2}{n} \right]$$
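The definitional and computational formulas agree because of a standard identity, sketched here. The sketch assumes the usual notation, which the transcript does not show explicitly: $X_1, \dots, X_n$ are the observations and $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ is the sample mean.

```latex
\begin{aligned}
\sum_{i=1}^{n} (X_i - \bar{X})^2
  &= \sum_{i=1}^{n} X_i^2 - 2\bar{X} \sum_{i=1}^{n} X_i + n\bar{X}^2 \\
  &= \sum_{i=1}^{n} X_i^2 - 2n\bar{X}^2 + n\bar{X}^2 \\
  &= \sum_{i=1}^{n} X_i^2 - n\bar{X}^2 \\
  &= \sum_{i=1}^{n} X_i^2 - \frac{1}{n} \Bigl( \sum_{i=1}^{n} X_i \Bigr)^2 .
\end{aligned}
```

Dividing both sides by $(n-1)$ turns the left side into the definitional form of the sample variance and the right side into the computational form. The computational form needs only the sum of the observations and the sum of their squares.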

6 Properties
– It has squared units, which leads to defining the standard deviation.
– It is always nonnegative, and equals zero if and only if all the observations are identical.
– The larger the value, the more variation in the data.
– The divisor of (n-1) instead of n makes the sample variance “unbiased” for the population variance ($\sigma^2$); this will be explained when we get into inference.

7 Standard Deviation
The sample standard deviation, denoted by S, is the positive square root of the sample variance. Purpose: to have a measure with the same units of measurement as the original observations.

8 Illustration of Computation
We use the data set from the earlier example for the mean and median.
Data: 122, 135, 110, 126, 100, 110, 110, 126, 94, 124, 108, 110, 92, 98, 118, 110, 102, 108, 126, 104, 110, 120, 110, 118, 100, 110, 120, 100, 120, 92
We illustrate the computations using the definitional and computational formulas in a spreadsheet-type format.

9 Example … continued
The spreadsheet-type table on the next slide is obtained from an Excel worksheet. The first three columns illustrate the computation using the definitional formula. The last column illustrates the computation using the computational formula. Details will be provided in class!

10 [Table: spreadsheet-type worksheet for the variance computation; not captured in the transcript.]

11 Explanations of Columns in the Sheet
– Column 1: contains the values of X, the sum of X, and the sample mean.
– Column 2: contains the deviations, Dev = X - SampleMean, and the sum of the deviations.
– Column 3: contains the squared deviations, the sum of squared deviations, the variance, and the standard deviation (via the definitional formula).
– Column 4: contains the squared X, the sum of squared X, and the variance (via the computational formula).
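The four columns described above can be reproduced outside a spreadsheet. The following Python sketch mirrors them for the 30 observations; it is not the original Excel worksheet, and the variable names are chosen here for illustration. It assumes the divisor (n-1) stated on the Properties slide.

```python
from math import sqrt

# The 30 observations from the lecture's example.
x = [122, 135, 110, 126, 100, 110, 110, 126, 94, 124,
     108, 110, 92, 98, 118, 110, 102, 108, 126, 104,
     110, 120, 110, 118, 100, 110, 120, 100, 120, 92]
n = len(x)

# Column 1: values of X, sum of X, and sample mean.
sum_x = sum(x)
sample_mean = sum_x / n

# Column 2: deviations from the sample mean and their sum (zero up to rounding).
dev = [xi - sample_mean for xi in x]
sum_dev = sum(dev)

# Column 3: squared deviations, variance and standard deviation (definitional formula).
sum_sq_dev = sum(d ** 2 for d in dev)
var_definitional = sum_sq_dev / (n - 1)
sd = sqrt(var_definitional)

# Column 4: squared X, their sum, and the variance (computational formula).
sum_sq_x = sum(xi ** 2 for xi in x)
var_computational = (sum_sq_x - sum_x ** 2 / n) / (n - 1)

print(f"n = {n}, sum = {sum_x}, mean = {sample_mean:.2f}")
print(f"sum of deviations = {sum_dev:.6f}")
print(f"variance (definitional)  = {var_definitional:.4f}")
print(f"variance (computational) = {var_computational:.4f}")
print(f"standard deviation = {sd:.4f}")
```

Both formulas give the same variance, about 123.47, and a standard deviation of about 11.11. The sum of the deviations in Column 2 is zero apart from floating-point rounding, which is a useful check on the arithmetic.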

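12 Population Parameters (Analogs)
If the quantities are computed from the population values, then we obtain population parameters such as the population mean, variance, and standard deviation. The notation is as follows: population mean $\mu$, population variance $\sigma^2$, and population standard deviation $\sigma$.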

13 Information from Mean and Standard Deviation
Empirical Rule: For symmetric mound-shaped distributions:
– Percentage of all observations within 1 standard deviation of the mean is approximately 68%.
– Percentage of all observations within 2 standard deviations of the mean is approximately 95%.
– Percentage of all observations within 3 standard deviations of the mean is approximately 100% (about 99.7%).
– Thus, usually no observations will be more than 3 standard deviations from the mean!

14 Information … continued
Chebyshev’s Rule: For any distribution (be it symmetric, skewed, bi-modal, etc.), we always have that:
– Percentage of all observations within 1 standard deviation of the mean is at least 0%.
– Percentage of all observations within 2 standard deviations of the mean is at least 75%.
– Percentage of all observations within 3 standard deviations of the mean is at least 88.89%.
– More generally, the percentage of observations within k standard deviations of the mean is at least $100(1 - 1/k^2)\%$.
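The general bound in Chebyshev's Rule is easy to tabulate. The sketch below uses a hypothetical helper, `chebyshev_lower_bound`, named here for illustration. It returns the guaranteed minimum percentage for a given k and treats k at or below 1 as the trivial bound of 0%.

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum percentage of observations within k standard deviations of the mean."""
    if k <= 0:
        raise ValueError("k must be positive")
    # For k <= 1 the formula is not positive, so only the trivial bound of 0% holds.
    return max(0.0, 100.0 * (1.0 - 1.0 / k ** 2))

# Reproduce the three cases listed on the slide.
bounds = {k: chebyshev_lower_bound(k) for k in (1, 2, 3)}
for k, pct in bounds.items():
    print(f"k = {k}: at least {pct:.2f}%")
```

The values for k = 1, 2, 3 match the slide: 0%, 75%, and 88.89%. These bounds are weaker than the Empirical Rule's percentages because Chebyshev's Rule makes no assumption about the shape of the distribution.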

15 Illustration of these Rules
Consider the sample data with 30 observations considered earlier.
Data: 122, 135, 110, 126, 100, 110, 110, 126, 94, 124, 108, 110, 92, 98, 118, 110, 102, 108, 126, 104, 110, 120, 110, 118, 100, 110, 120, 100, 120, 92
Recall that:
– Sample mean = 111.1
– Sample standard deviation ≈ 11.11
Percentages in the intervals of the form [Mean - kS, Mean + kS]:

16 Percentages in Certain Intervals
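The table for this slide was not captured in the transcript. As a hedged substitute, the sketch below counts how many of the 30 observations fall in each interval [Mean - kS, Mean + kS]. The function name `percent_within` is an illustrative choice; `statistics.stdev` uses the (n-1) divisor, matching the sample standard deviation defined earlier.

```python
from statistics import mean, stdev

# The 30 observations from the lecture's example.
x = [122, 135, 110, 126, 100, 110, 110, 126, 94, 124,
     108, 110, 92, 98, 118, 110, 102, 108, 126, 104,
     110, 120, 110, 118, 100, 110, 120, 100, 120, 92]

m = mean(x)
s = stdev(x)  # sample standard deviation, divisor n - 1

def percent_within(data, center, spread, k):
    """Percentage of observations in [center - k*spread, center + k*spread]."""
    lo, hi = center - k * spread, center + k * spread
    inside = sum(1 for v in data if lo <= v <= hi)
    return 100.0 * inside / len(data)

observed = {k: percent_within(x, m, s, k) for k in (1, 2, 3)}
for k, pct in observed.items():
    print(f"k = {k}: [{m - k * s:.2f}, {m + k * s:.2f}] contains {pct:.1f}%")
```

For this data set the observed percentages are 70% (21 of 30), about 96.7% (29 of 30), and 100%. They are close to the Empirical Rule's 68%, 95%, and roughly 100%, and comfortably above the Chebyshev lower bounds of 0%, 75%, and 88.89%.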

17 Measure of Relative Standing: Z-Score
Given a data set, the z-score (also called the standardized score) associated with an observation whose value is x is given by
$$z = \frac{x - \bar{X}}{S}$$
It measures the distance of x from the sample mean in terms of the number of standard deviations. A negative (positive) value indicates that x is smaller (larger) than the sample mean.
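To make the z-score concrete, here is a small sketch that standardizes the largest and smallest observations of the 30-value data set used earlier in the lecture. The helper `z_score` is a name introduced here, and it assumes the sample standard deviation with divisor (n-1).

```python
from statistics import mean, stdev

# The 30 observations from the lecture's example.
data = [122, 135, 110, 126, 100, 110, 110, 126, 94, 124,
        108, 110, 92, 98, 118, 110, 102, 108, 126, 104,
        110, 120, 110, 118, 100, 110, 120, 100, 120, 92]

def z_score(x, sample):
    """Number of sample standard deviations that x lies from the sample mean."""
    return (x - mean(sample)) / stdev(sample)

z_max = z_score(max(data), data)  # largest observation, 135
z_min = z_score(min(data), data)  # smallest observation, 92

print(f"z-score of {max(data)}: {z_max:.2f}")
print(f"z-score of {min(data)}: {z_min:.2f}")
```

The largest value, 135, is a little more than 2 standard deviations above the mean. The smallest value, 92, has a negative z-score because it lies below the mean.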

18 Percentiles
Given a set of n observations, the 100pth percentile, where 0 < p < 1, is the value that is larger than 100p% of all the observations and smaller than 100(1-p)% of the observations. For example, the 95th percentile is the value that is larger than 95% of all the observations and smaller than 5% of all the observations.

19 Measures of Relative Standing: Quartiles
– The first quartile, denoted by Q1, is the 25th percentile of the data set.
– The third quartile, denoted by Q3, is the 75th percentile of the data set.
– The second quartile, which is the 50th percentile, is simply the median of the data set, M.

20 Computing the Quartiles
– Divide the arranged data set into two parts using the median as the cut-off.
– If the sample size n is odd, then the median should be included in each group; if n is even, then the median is not included in either group.
– The first quartile (Q1) is the median of the lower group.
– The third quartile (Q3) is the median of the upper group.

21 Example: Quartile Computation
Arranged Data: 92, 92, 94, 98, 100, 100, 100, 102, 104, 108, 108, 110, 110, 110, 110, 110, 110, 110, 110, 118, 118, 120, 120, 120, 122, 124, 126, 126, 126, 135
– M = 110 (average of the 15th and 16th values).
– Q1 = value in 8th position = 102.
– Q3 = value in 23rd position = 120.
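The quartile procedure from the previous slide can be sketched in Python as follows. The function `quartiles` is an illustrative name. It implements the lecture's convention: split at the median and include the median in both halves only when n is odd. Other conventions exist and can give slightly different values.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, M, Q3) using the median-of-halves method described in the lecture."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    if n % 2 == 1:
        # Odd n: the median belongs to both the lower and the upper group.
        lower = ordered[: half + 1]
        upper = ordered[half:]
    else:
        # Even n: the median falls between two values and is in neither group.
        lower = ordered[:half]
        upper = ordered[half:]
    return median(lower), median(ordered), median(upper)

data = [122, 135, 110, 126, 100, 110, 110, 126, 94, 124,
        108, 110, 92, 98, 118, 110, 102, 108, 126, 104,
        110, 120, 110, 118, 100, 110, 120, 100, 120, 92]

q1, m, q3 = quartiles(data)
print(f"Q1 = {q1}, M = {m}, Q3 = {q3}")
```

With n = 30 the lower group is the first 15 arranged values and the upper group is the last 15. The result agrees with the slide: Q1 = 102, M = 110, and Q3 = 120.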

22 Box Plots
Another graphical summary of the data is provided by the boxplot. This provides information about the presence of outliers. Steps in constructing a boxplot are as follows:
– Calculate M, Q1, Q3, and the minimum and maximum values.
– Form a box with left and right ends being at Q1 and Q3, respectively.
– Draw a vertical line in the box at the location of the median.
– Connect the min and max values to the box by lines.

23 The BoxPlot
For the systolic blood pressure data set, the resulting boxplot, obtained using Minitab, is shown below.
[Figure: boxplot with labels LV, Q1, M, Q3, HV.]

24 Comparative BoxPlots
The boxplot could also be used to compare the distributions of different groups. This could be achieved by presenting the boxplots of the different groups side by side. We demonstrate this idea using the Beanie Babies Data on page 91. This data set contains the following variables:
– Name: name of beanie baby
– Age: in months, since 9/98
– Status: R=retired, C=current
– Value: value of baby

25 Comparative BoxPlots of Value by Status
Distributions for both groups are very right-skewed!

26 Comparative BoxPlots of Log(Value) by Status

27 Relationship Between Age and Value

28 Relationship Between Log(Age) and Value