GROUPED DATA LECTURE 5 OF 6 8.DATA DESCRIPTIVE SUBTOPIC


GROUPED DATA: LECTURE 5 OF 6
8. DATA DESCRIPTIVE
SUBTOPICS
8.3: Measures of Location
8.4: Measures of Dispersion

LEARNING OUTCOMES
8.3(b) Find and interpret the mean, mode, median, quartiles and percentiles for grouped data.
8.3(c) Describe the symmetry and skewness of a data distribution.
8.4(b) Find and interpret the variance, standard deviation and coefficient of variation for grouped data.

Sketch of the median, quartiles, deciles and percentiles from an ogive
[Figure: ogive of cumulative frequency against class boundaries, with X1, X2 and X3 marked on the horizontal axis.]
P25 = Q1 = D2.5
Median = P50 = Q2 = D5
P75 = Q3 = D7.5

Example 1
Using the ogive drawn below, determine the
(a) median
(b) first quartile
(c) third decile
(d) seventieth percentile
[Figure: ogive; axis ticks run from 5 to 40.]

Solution
(a) Median: 60/2 = 30th observation. From the ogive, the median = 20.
(b) First quartile: 60/4 = 15th observation. From the ogive, the first quartile = 12.5.
(c) Third decile: 3/10 × 60 = 18th observation. From the ogive, the third decile = 14.
(d) Seventieth percentile: 70/100 × 60 = 42nd observation. From the ogive, the seventieth percentile = 24.5.
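The positions used in the solution can be checked with a short calculation. This is a minimal sketch: the helper name `quantile_position` is hypothetical, and the total frequency of 60 is taken from the worked solution. It only finds which observation to locate on the cumulative-frequency axis; the value itself still has to be read off the ogive.

```python
def quantile_position(numerator, denominator, n):
    """Position of the observation to locate on the cumulative-frequency axis."""
    return numerator * n / denominator

n = 60  # total frequency in Example 1, as used in the solution

positions = {
    "median": quantile_position(1, 2, n),
    "first quartile": quantile_position(1, 4, n),
    "third decile": quantile_position(3, 10, n),
    "seventieth percentile": quantile_position(70, 100, n),
}

for name, position in positions.items():
    print(f"{name}: observation number {position:g}")
```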

[Figure: the ogive with the four readings marked: first quartile = 12.5, third decile = 14, median = 20, seventieth percentile = 24.5.]

Shape of data distribution: Symmetry and Skewness
The general shape of a data distribution can be determined from the mean, median and mode, as illustrated by a histogram or frequency curve. For a strongly skewed distribution, the median is the more appropriate measure of central tendency. For a symmetrical or almost symmetrical distribution, the mean is the appropriate measure of central tendency.

Shape of data distribution: Symmetry and Skewness
Three important shapes:
i. Symmetrical distribution
ii. Positively skewed or right-skewed distribution
iii. Negatively skewed or left-skewed distribution

(i) Symmetrical
Mean = Median = Mode
~The values of the mean, median and mode are identical.
~They lie at the center.
[Figure: symmetrical frequency curve, frequency against variable, with the mean, median and mode at the center.]

IN DETAIL A set of observations is symmetrically distributed if its graphical representation (histogram, bar chart) is symmetric with respect to a vertical axis passing through the mean. For a symmetrically distributed population or sample, the mean, median and mode have the same value. Half of all measurements are greater than the mean, while half are less than the mean.

(ii) Positively skewed or skewed to the right
Mean > Median > Mode
~The value of the mean is the largest.
~The mode is the smallest.
~The median lies between these two values.
[Figure: frequency curve with a long right tail, frequency against variable; from left to right: mode, median, mean.]

IN DETAIL A set of observations that is not symmetrically distributed is said to be skewed. It is positively skewed if a greater proportion of the observations are less than or equal to (as opposed to greater than or equal to) the mean; this indicates that the mean is larger than the median. The histogram of a positively skewed distribution will generally have a long right tail; thus, this distribution is also known as being skewed to the right.

(iii) Negatively skewed or skewed to the left
Mean < Median < Mode
~The value of the mean is the smallest.
~The mode is the largest.
~The median lies between these two values.
[Figure: frequency curve with a long left tail, frequency against variable; from left to right: mean, median, mode.]

IN DETAIL A negatively skewed distribution has more observations that are greater than or equal to the mean. Such a distribution has a mean that is less than the median. The histogram of a negatively skewed distribution will generally have a long left tail; thus, the phrase skewed to the left is applied here.
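The three shapes can be told apart by comparing the mean, median and mode, as the slides describe. The sketch below assumes that rule of thumb and nothing more; the function name `describe_shape` and the sample data are hypothetical and are not taken from the lecture.

```python
import statistics

def describe_shape(mean, median, mode, tol=1e-9):
    """Classify a distribution using the mean/median/mode ordering from the slides."""
    if abs(mean - median) <= tol and abs(median - mode) <= tol:
        return "symmetrical"
    if mean > median > mode:
        return "positively skewed"
    if mean < median < mode:
        return "negatively skewed"
    return "no clear pattern"

# Hypothetical sample with a long right tail (illustration only).
data = [1, 2, 2, 2, 3, 3, 4, 5, 9]

shape = describe_shape(
    statistics.mean(data),
    statistics.median(data),
    statistics.mode(data),
)
print(shape)
```

Real data rarely satisfies Mean = Median = Mode exactly, so in practice an almost symmetrical distribution would need a looser tolerance than the default used here.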

MEASURES OF DISPERSION
Range
Interquartile range
Variance
Standard deviation

RANGE
Range = upper boundary of the last class - lower boundary of the first class

INTERQUARTILE RANGE
Defined as the difference between the third quartile and the first quartile.
Interquartile range = Q3 - Q1

Variance and standard deviation
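The formulas on this slide are not preserved in the transcript. As a hedged reference, one standard computational form for grouped data with class marks x and frequencies f is given below. The divisor Σf - 1 (sample variance) is an assumption; some courses divide by Σf instead.

```latex
\[
\bar{x} = \frac{\sum f x}{\sum f},
\qquad
s^2 = \frac{\sum f x^2 - \dfrac{\left(\sum f x\right)^2}{\sum f}}{\sum f - 1},
\qquad
s = \sqrt{s^2}
\]
```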

Example 2: Find the range, variance and standard deviation

Class interval   Frequency f   Class mark x   fx    fx²
1-3              5             2              10    20
4-6              3             5              15    75
7-9              2             8              16    128
10-12            1             11             11    121
13-15            6             14             84    1176
16-18            4             17             68    1156

Solution:
Range = upper boundary of the last class - lower boundary of the first class
= 18.5 - 0.5
= 18
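The remaining parts of Example 2 can be sketched in code. Two assumptions are made here: the frequencies (5, 3, 2, 1, 6, 4) are read from the flattened Example 2 table, and the variance uses the sample divisor n - 1, which the transcript does not confirm. The variable names are illustrative.

```python
import math

# (lower limit, upper limit, frequency) for each class in Example 2.
# Frequencies are an interpretation of the flattened table in the transcript.
classes = [(1, 3, 5), (4, 6, 3), (7, 9, 2), (10, 12, 1), (13, 15, 6), (16, 18, 4)]

n = sum(f for _, _, f in classes)                                  # total frequency
sum_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)           # sum of f * class mark
sum_fx2 = sum(f * ((lo + hi) / 2) ** 2 for lo, hi, f in classes)   # sum of f * class mark squared

# Range uses class boundaries: 0.5 below the first limit, 0.5 above the last.
data_range = (classes[-1][1] + 0.5) - (classes[0][0] - 0.5)

mean = sum_fx / n
# Assumption: sample variance with divisor n - 1; use n for a population variance.
sample_variance = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
sample_sd = math.sqrt(sample_variance)

print(f"range = {data_range:g}")
print(f"mean = {mean:.3f}")
print(f"variance = {sample_variance:.3f}")
print(f"standard deviation = {sample_sd:.3f}")
```

With a divisor of n rather than n - 1, the variance and standard deviation come out slightly smaller, so the result should be checked against the convention the course uses.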

REMARK Sometimes we would like to compare the variability of two different data sets that have different units of measurement. Standard deviation is not suitable since it is a measure of absolute variability and not of relative variability. The most appropriate measure is the coefficient of variation (CV) which expresses standard deviation as a percentage of the mean.

Coefficient of variation
CV = (standard deviation / mean) × 100%
Note: A larger coefficient of variation means that the data are more dispersed and less consistent.

Example: Suppose we want to compare two production processes that fill containers with products.
Process A is filling fertilizer bags, which have a nominal weight of 80 pounds.
For process A: [figures missing from transcript]
Process B is filling cornflakes boxes, which have a nominal weight of 24 ounces.
For process B: [figures missing from transcript]

Is process A much more variable than process B because 1.2 is three times larger than 0.4?
No, because the two processes have very similar variability relative to the size of their means.
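The comparison can be sketched numerically. The transcript does not preserve the summary statistics, so this example assumes that 1.2 and 0.4 are the standard deviations (in pounds and ounces respectively) and that the nominal weights of 80 pounds and 24 ounces stand in for the means. The function name is hypothetical.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return std_dev / mean * 100

# Assumed inputs: standard deviations 1.2 lb and 0.4 oz, with the nominal
# weights used in place of the means (not confirmed by the transcript).
cv_a = coefficient_of_variation(1.2, 80)  # process A, fertilizer bags
cv_b = coefficient_of_variation(0.4, 24)  # process B, cornflakes boxes

print(f"CV for process A: {cv_a:.2f}%")
print(f"CV for process B: {cv_b:.2f}%")
```

Under these assumptions both coefficients fall between 1% and 2%, which matches the slide's conclusion that the two processes are similarly variable relative to their means. The CV is unit-free, so pounds and ounces can be compared directly.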