Descriptive Statistics

Presentation transcript:

Descriptive Statistics  Summarizing, Simplifying  Useful for comprehending data, and thus making meaningful interpretations, particularly in medium to large data sets.  Describing  Useful for recognizing important characteristics of data  Used in inferential statistics

Important Characteristics of Data  Center – typical data value  Variation – spread in data  Distribution – shape of data distribution  Outliers – values far from the rest of the data, which may signal problems  Time – changes over time  Population Size – number of values, N

Graphical Summary Methods for Qualitative Data  Pie Chart  Useful for qualitative or quantitative data  Bar Chart  Useful for qualitative data  Called a Pareto chart if the bars are ordered by decreasing height

Graphical Summary Methods for Quantitative Data  Frequency Histogram  A “connected bar plot” with bar height proportional to the frequency of the associated value or class (interval of values)  Graphical summary of a frequency distribution (sometimes called a frequency table)

Frequency Distribution  For Discrete Data:  Lists data values and corresponding counts  Resulting histogram has a bar on each value with height proportional to its count  For Continuous Data:  Data is divided into classes (intervals of values) and the classes are listed along with the corresponding counts

Types of Histograms  Frequency – height of bar is count  Relative Frequency – height of bar is relative frequency (proportion/percentage/probability)  Cumulative Frequency – height of bar is cumulative count  Cumulative Relative Frequency – height of bar is cumulative relative frequency (percentile)
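The four bar heights listed above can all be derived from one frequency distribution. The following is a minimal sketch for discrete data; the function name `frequency_table` and the sample values are illustrative assumptions, not part of the original slides.

```python
from collections import Counter
from itertools import accumulate

def frequency_table(values):
    """Return rows of (value, count, rel_freq, cum_count, cum_rel_freq)."""
    counts = Counter(values)
    n = len(values)
    ordered = sorted(counts)                  # distinct values, ascending
    freqs = [counts[v] for v in ordered]      # frequency of each value
    cum = list(accumulate(freqs))             # cumulative frequency
    return [(v, f, f / n, c, c / n) for v, f, c in zip(ordered, freqs, cum)]

# Hypothetical discrete data set (not from the slides).
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]
table = frequency_table(data)
for row in table:
    print(row)
```

Each column of the result corresponds to one histogram type: plotting the second column gives a frequency histogram, the third a relative frequency histogram, and so on.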

Other Types of Graphs  Stem-and-Leaf Plot  Each value is separated into a stem (such as the leftmost value or values) and a leaf (such as the rightmost value or values)  Stems are listed in order and leaves are plotted alongside the appropriate stem  Ordered Stem-and-Leaf Plot  Dotplot  Each value is plotted as a dot along an x-axis. Dots representing equal values are stacked.
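An ordered stem-and-leaf plot can be sketched in a few lines. This version assumes non-negative integers, takes the last digit as the leaf and the remaining digits as the stem; the function name and data are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return ordered (stem, leaves) rows for non-negative integers.

    Stem = all digits except the last; leaf = the last digit.
    """
    rows = defaultdict(list)
    for v in sorted(values):                  # sorting first yields an ordered plot
        rows[v // 10].append(v % 10)
    stems = range(min(rows), max(rows) + 1)   # keep empty stems so gaps stay visible
    return [(stem, rows.get(stem, [])) for stem in stems]

# Hypothetical data set (not from the slides).
data = [23, 12, 41, 15, 23, 38, 21]
plot = stem_and_leaf(data)
for stem, leaves in plot:
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```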

Other Types of Graphs  Scatter Diagram or Scatter Plot  Plot of paired (x,y) data with x on the horizontal axis and y on the vertical axis.  Useful for seeing the relationship between x and y  Time-Series Plot  A special scatter diagram which has time plotted on the horizontal axis.

Importance of Knowing the Distribution of Data  Distribution can affect the choice of an appropriate statistic to use.  Distribution can aid in determining the validity of many inferential statistics.  Common data distributions  Bell (normal), bi-modal, right-skewed (chi-squared, exponential), left-skewed

Numerical Summary Methods  Measures of Center (Location)  The middle value or typical observation from a population.  Measures of Variability  The dispersion or spread in the population.  Measures of Relative Standing  The comparative value relative to the population.

Measures of Center  Mean (Arithmetic Mean)  The size of the population is denoted by N. The sample size is denoted by n.  Population Mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$  Sample Mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$

Measures of Center  Median  Middle value in the ordered data for odd n.  Mean of the 2 middle values for even n.  Commonly called the 50th percentile.  The location of the median in the ordered data set is $(n+1)/2$.
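The location rule can be turned directly into code. In this sketch (the function name `median_by_location` is an assumption), an odd n gives a whole-number location, while an even n gives a location ending in .5, which means the two neighbouring values are averaged.

```python
def median_by_location(values):
    """Median via the 1-based location (n + 1) / 2 in the ordered data."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    xs = sorted(values)
    n = len(xs)
    if n % 2 == 1:
        location = (n + 1) // 2          # whole number: the middle value
        return xs[location - 1]          # convert 1-based location to 0-based index
    lower = xs[n // 2 - 1]               # value just below location (n + 1) / 2
    upper = xs[n // 2]                   # value just above it
    return (lower + upper) / 2

print(median_by_location([7, 1, 3]))      # n = 3, location 2
print(median_by_location([4, 1, 3, 2]))   # n = 4, location 2.5
```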

Measures of Center  Mode  Most common value (occurs most frequently)  Midrange  Midway between the lowest and highest value  Trimmed Mean  Mean of values remaining after an equal number of values are removed from each tail.
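A small sketch of these three measures follows. The function names, the choice to return every tied value as a mode, and the data set (which contains one extreme value) are illustrative assumptions.

```python
from collections import Counter

def modes(values):
    """All most-frequent values; more than one result means the mode is not unique."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

def midrange(values):
    """Midway between the lowest and highest value."""
    return (min(values) + max(values)) / 2

def trimmed_mean(values, k):
    """Mean after removing the k smallest and the k largest values."""
    xs = sorted(values)
    if 2 * k >= len(xs):
        raise ValueError("trimming would remove every value")
    kept = xs[k:len(xs) - k]
    return sum(kept) / len(kept)

# Hypothetical data with one extreme value (40).
data = [2, 3, 3, 4, 5, 6, 40]
print(modes(data))             # most common value(s)
print(midrange(data))          # pulled strongly toward the extreme value
print(trimmed_mean(data, 1))   # drops 2 and 40 before averaging
print(sum(data) / len(data))   # ordinary mean, for comparison
```

With this data the midrange and the ordinary mean are both dragged upward by the value 40, while the trimmed mean stays close to the bulk of the data.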

Skewness  Symmetric: Mode = Mean = Median  Skewed left (negatively): typically Mean < Median < Mode  Skewed right (positively): typically Mode < Median < Mean
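The typical ordering of the three measures can be checked on small examples. The two data sets below are hypothetical, built so that one is the mirror image of the other; the ordering shown is a common pattern, not a guarantee for every skewed data set.

```python
import statistics

def centers(values):
    """Return (mode, median, mean) for a data set with a single mode."""
    return (statistics.mode(values),
            statistics.median(values),
            statistics.mean(values))

# Hypothetical right-skewed data: a long tail of large values.
right_skewed = [1, 2, 2, 2, 3, 4, 5, 9, 17]
# Mirror image around 18 gives a left-skewed data set.
left_skewed = [18 - x for x in right_skewed]

r_mode, r_median, r_mean = centers(right_skewed)
l_mode, l_median, l_mean = centers(left_skewed)

print("right-skewed:", r_mode, r_median, r_mean)   # mode < median < mean
print("left-skewed: ", l_mode, l_median, l_mean)   # mean < median < mode
```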

Compare Measures of Center
                  Mean        Median  Mode                        Midrange    Trimmed Mean
Values utilized?  All         Middle  Most common                 Extreme     All but extreme
Outliers?         Not robust  Robust  Robust                      Not robust  Robust
Exists?           Unique      Unique  May not exist or be unique  Unique      Unique
Data type?        Quan.       Quan.   Any                         Quan.       Quan.

Measures of Variation  Range  Distance between minimum and maximum  Range = Max – Min  The range does not measure the overall variability in the data. A measure is needed which incorporates the variability of every value in the data. One way is to look at the deviations from the mean, $(x_i - \mu)$, for each $x_i$.
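One reason the deviations are usually squared before averaging is that the raw deviations from the mean always sum to zero. The sketch below illustrates this with a hypothetical data set.

```python
# Hypothetical data set (not from the slides).
data = [2, 4, 4, 5, 10]

data_range = max(data) - min(data)          # uses only the two extreme values
mu = sum(data) / len(data)                  # mean of the data
deviations = [x - mu for x in data]         # one deviation per value
squared = [d ** 2 for d in deviations]      # squaring removes the sign

print("range:", data_range)
print("deviations:", deviations)
print("sum of deviations:", sum(deviations))        # positive and negative cancel
print("sum of squared deviations:", sum(squared))   # does not cancel
```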

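Measures of Variation  Variance  The average squared difference of the observations from the mean.  Population Variance: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2$  Sample Variance: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$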
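The two variances differ only in the divisor: N for a population and n - 1 for a sample. A minimal sketch, using a hypothetical data set and checking the results against Python's `statistics` module:

```python
import statistics

def population_variance(values):
    """Average squared deviation from the mean, dividing by N."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def sample_variance(values):
    """Sum of squared deviations from the sample mean, dividing by n - 1."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)

# Hypothetical data set (not from the slides).
data = [2, 4, 4, 5, 10]
pop_var = population_variance(data)
samp_var = sample_variance(data)

print(pop_var, statistics.pvariance(data))   # divisor N
print(samp_var, statistics.variance(data))   # divisor n - 1
```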

Calculator Formula for Sample Variance: $s^2 = \frac{n\sum x_i^2 - \left(\sum x_i\right)^2}{n(n-1)}$
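The sketch below assumes that the calculator formula is the usual shortcut form, $s^2 = \frac{n\sum x_i^2 - (\sum x_i)^2}{n(n-1)}$, which needs only the running sums of $x$ and $x^2$. It compares the shortcut with the definition on a hypothetical data set.

```python
def variance_shortcut(values):
    """Sample variance from the sums of x and x**2 only (shortcut form)."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (n * sum_x2 - sum_x ** 2) / (n * (n - 1))

def variance_definition(values):
    """Sample variance from squared deviations about the sample mean."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)

# Hypothetical data set (not from the slides).
data = [2, 4, 4, 5, 10]
shortcut = variance_shortcut(data)
definition = variance_definition(data)
print(shortcut, definition)
```

The two forms are algebraically equal. With floating-point data of large magnitude the shortcut form can lose precision, so the definition form is generally the safer choice in software.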

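Measures of Variation  Standard Deviation  The square root of the average squared difference of the observations from the mean.  Population Standard Deviation: $\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2}$  Sample Standard Deviation: $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$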

Calculator Formula for Sample Standard Deviation: $s = \sqrt{\frac{n\sum x_i^2 - \left(\sum x_i\right)^2}{n(n-1)}}$

Empirical Rule  For data that is approximately bell-shaped in distribution:  about 68% of data values fall within 1 standard deviation of the mean,  about 95.4% of data values fall within 2 standard deviations of the mean,  about 99.7% of data values fall within 3 standard deviations of the mean.
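The percentages in the rule come from the normal distribution. As a check, this sketch computes the exact proportions for a standard normal curve with `statistics.NormalDist` (Python 3.8+); the helper name `within` is an assumption.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {within(k):.4f}")
```

Real data that is only roughly bell-shaped will match these proportions approximately, not exactly.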

Figure 2-13: The Empirical Rule (applies to bell-shaped distributions). 68% of values lie within 1 standard deviation of the mean (34% on each side), 95% within 2 standard deviations (a further 13.5% on each side), and 99.7% within 3 standard deviations (a further 2.4% on each side, leaving 0.1% in each tail).

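Measures of Relative Position  Standard Score or Z-Score  The number of standard deviations that a value x lies above or below the mean.  Sample: $z = \frac{x - \bar{x}}{s}$  Population: $z = \frac{x - \mu}{\sigma}$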
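A z-score expresses a value as a number of standard deviations from the mean. The sketch below uses the sample form, $z = (x - \bar{x})/s$, on a hypothetical data set; the function name is an assumption.

```python
import statistics

def z_scores(values):
    """Sample z-score of every value: (x - mean) / sample standard deviation."""
    x_bar = statistics.mean(values)
    s = statistics.stdev(values)      # sample standard deviation (n - 1 divisor)
    return [(x - x_bar) / s for x in values]

# Hypothetical data set (not from the slides): mean 5, s = 3.
data = [2, 4, 4, 5, 10]
scores = z_scores(data)
for x, z in zip(data, scores):
    print(f"x = {x:2d}  z = {z:+.3f}")
```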

Measures of Relative Position  Order Statistics  The order statistics, denoted by $x_{(1)}, x_{(2)}, \ldots, x_{(n)}$, are the observed data values ordered from smallest to largest.

Measures of Relative Position  Percentile  The kth percentile ($P_k$) separates the bottom k% of data from the top (100-k)% of data.  The location of $P_k$ in the order statistics is found from $L = \frac{k}{100}\,n$ (see Figure 2-15).

Finding the Value of the kth Percentile (Figure 2-15)
1. Sort the data. (Arrange the data in order of lowest to highest.)
2. Compute $L = \frac{k}{100}\,n$, where n = number of values and k = percentile in question.
3. Is L a whole number?
   Yes: The value of the kth percentile is midway between the Lth value and the next value in the sorted set of data. Find $P_k$ by adding the Lth value and the next value and dividing the total by 2.
   No: Change L by rounding it up to the next larger whole number. The value of $P_k$ is the Lth value, counting from the lowest.
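The flowchart translates almost line for line into code. This sketch assumes k is an integer strictly between 0 and 100, and uses integer arithmetic to test whether L is a whole number; the function name and data are hypothetical.

```python
def percentile(values, k):
    """Value of the kth percentile, following the Figure 2-15 procedure."""
    if not isinstance(k, int) or not 0 < k < 100:
        raise ValueError("k must be an integer with 0 < k < 100")
    xs = sorted(values)                        # step 1: sort the data
    n = len(xs)
    whole, remainder = divmod(k * n, 100)      # step 2: L = (k / 100) * n
    if remainder == 0:
        # L is a whole number: average the Lth value and the next value.
        return (xs[whole - 1] + xs[whole]) / 2
    # L is not whole: round up to position whole + 1, i.e. 0-based index `whole`.
    return xs[whole]

# Hypothetical data sets (not from the slides).
ten_values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
eight_values = [2, 4, 6, 8, 10, 12, 14, 16]

print(percentile(ten_values, 25))     # L = 2.5, rounds up to the 3rd value
print(percentile(ten_values, 50))     # L = 5, midway between 5th and 6th values
print(percentile(eight_values, 25))   # L = 2, midway between 2nd and 3rd values
```

Other textbooks and software packages use different percentile conventions, so results from library functions may differ slightly from this procedure.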

Measures of Relative Position  Quartiles  The quartiles ($Q_1 = P_{25}$, $Q_2 = P_{50}$ and $Q_3 = P_{75}$) separate the data into fourths.  Interquartile Range (IQR)  The distance between the first and third quartiles: $IQR = Q_3 - Q_1$.  The IQR is a measure of variability which is less affected by outliers than the range, variance and standard deviation.

Box-and-Whisker Diagram (Boxplot)  Graphical display of the “5 Number Summary”  $x_{(1)}$ = Min  $Q_1 = P_{25}$, $Q_2 = P_{50}$, $Q_3 = P_{75}$  $x_{(n)}$ = Max  Inner & Outer Fences  Useful for identifying potential outliers in data.
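A sketch of the numbers behind a boxplot follows. The slides do not define the fences, so this example assumes the common convention of inner fences at 1.5 × IQR and outer fences at 3 × IQR beyond the quartiles. Quartiles are computed with the percentile procedure of Figure 2-15; all function names and the data are hypothetical.

```python
def percentile(sorted_xs, k):
    """kth percentile of already-sorted data (integer k, 0 < k < 100)."""
    whole, remainder = divmod(k * len(sorted_xs), 100)
    if remainder == 0:
        return (sorted_xs[whole - 1] + sorted_xs[whole]) / 2
    return sorted_xs[whole]

def five_number_summary(values):
    """Return (min, Q1, Q2, Q3, max)."""
    xs = sorted(values)
    return (xs[0], percentile(xs, 25), percentile(xs, 50),
            percentile(xs, 75), xs[-1])

def fences(q1, q3, multiplier):
    """Lower and upper fence at `multiplier` IQRs beyond the quartiles."""
    iqr = q3 - q1
    return (q1 - multiplier * iqr, q3 + multiplier * iqr)

# Hypothetical data with one unusually large value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
summary = five_number_summary(data)
_, q1, _, q3, _ = summary

inner = fences(q1, q3, 1.5)   # assumed convention: 1.5 * IQR
outer = fences(q1, q3, 3.0)   # assumed convention: 3 * IQR
potential_outliers = [x for x in data if x < inner[0] or x > inner[1]]

print("five-number summary:", summary)
print("inner fences:", inner)
print("outer fences:", outer)
print("potential outliers:", potential_outliers)
```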

Figure 2-17: Boxplots for bell-shaped, uniform, and skewed distributions.

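Percentile Rank  If $x = P_k$ and k is the percentile rank of x, then k is approximately equal to: $k \approx \frac{\text{number of values less than } x}{\text{total number of values}} \times 100$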
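The sketch below assumes the common textbook definition of percentile rank: the number of values less than x, divided by the total number of values, times 100. The function name and data are hypothetical.

```python
def percentile_rank(values, x):
    """Approximate percentile rank of x: percentage of values strictly below x."""
    if not values:
        raise ValueError("percentile rank of an empty data set is undefined")
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)

# Hypothetical data set (not from the slides).
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(percentile_rank(data, 8))   # 7 of the 10 values are less than 8
print(percentile_rank(data, 1))   # no values are less than the minimum
```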

Exploring  Measures of center: mean, median, and mode  Measures of variation: standard deviation and range  Measures of relative location: order statistics, minimum, maximum, percentile  Unusual values: outliers  Distribution: histograms, stem-and-leaf plots, and boxplots