Copyright © 2004 Pearson Education, Inc. Chapter 2 Descriptive Statistics: Describe, Explore, and Compare Data. 2-1 Overview 2-2 Frequency Distributions.

Presentation transcript:

Copyright © 2004 Pearson Education, Inc.

Chapter 2 Descriptive Statistics Describe, Explore, and Compare Data 2-1 Overview 2-2 Frequency Distributions 2-3 Visualizing Data 2-4 Measures of Center 2-5 Measures of Variation 2-6 Measures of Relative Standing 2-7 Exploratory Data Analysis

Created by Tom Wegleitner, Centreville, Virginia. Section 2-1 Overview

Overview. Descriptive Statistics: describe the important characteristics of a set of data. Organize, present, and summarize data: 1. Graphically 2. Numerically

Important Characteristics of Quantitative Data (“Shape, Center, and Spread”): 1. Center: A representative or average value that indicates where the middle of the data set is located. 2. Variation: A measure of the amount that the values vary among themselves. 3. Distribution: The nature or shape of the distribution of data (such as bell-shaped, uniform, or skewed).

Sections 2-2 and 2-3: Frequency Distributions and Visualizing Data

Frequency Distributions and Histograms. Frequency Distribution: a table that organizes data values into classes along with the number of data values that fall in each class (frequency, f). 1. Ungrouped Frequency Distribution: for data sets with few different values. Each value is in its own class. 2. Grouped Frequency Distribution: for data sets with many different values, which are grouped together in classes.

Ungrouped Frequency Distributions. Example: Number of Peas in a Pea Pod. [Table: peas per pod vs. frequency f; sample size and table values missing from the transcript.]

Frequency Histogram: A bar graph that represents the frequency distribution of a data set. It has the following properties: 1. Horizontal scale is quantitative and measures the data values. 2. Vertical scale measures the frequencies of the classes. 3. Consecutive bars must touch.

Frequency Histogram, Ex. Peas per Pod. [Figure: histogram with peas per pod on the horizontal axis and frequency f on the vertical axis.]

Relative Frequency Distributions and Relative Frequency Histograms. Relative Frequency Distribution: shows the proportion (or percentage) of data values that fall into each class. Relative frequency: $rf = f/n$, where f is the class frequency and n is the total number of data values. Relative Frequency Histogram: has the same shape and horizontal scale as a histogram, but the vertical scale is marked with relative frequencies.

Relative Frequency Histogram (Figure 2-2)
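
The relative frequency calculation rf = f/n can be sketched in a few lines of Python. This is only an illustration: the function name `relative_frequencies` and the pea-pod counts are invented for the example and are not the data from the slides.

```python
from collections import Counter

def relative_frequencies(values):
    """Return {value: (f, rf)} for an ungrouped distribution, with rf = f / n."""
    counts = Counter(values)          # frequency f of each distinct value
    n = len(values)                   # total number of data values
    return {v: (f, f / n) for v, f in sorted(counts.items())}

# Hypothetical sample: number of peas in each of 10 pods (invented data).
peas = [4, 5, 5, 6, 6, 6, 6, 7, 7, 8]
for value, (f, rf) in relative_frequencies(peas).items():
    print(f"{value} peas: f = {f}, rf = {rf:.2f}")
```

The relative frequencies always sum to 1 (or 100%), which is a quick check on the table.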

Grouped Frequency Distributions: Group data into 5-20 classes of equal width. [Table: exam scores vs. frequency f; values missing from the transcript.]

Definitions. Lower class limits (LCL): the smallest numbers that can actually belong to the different classes. Upper class limits (UCL): the largest numbers that can actually belong to the different classes. Class width: the difference between two consecutive lower class limits or two consecutive lower class boundaries. Class midpoints: the values halfway between the LCL and the UCL of each class. Class boundaries: the values halfway between a UCL and the next LCL.

Constructing a Grouped Frequency Table: 1. Calculate the range of values to span the set: Range = High – Low. (May round up.) 2. Decide on the number of classes (should be between 5 and 20). 3. Calculate the class width: $\text{class width} \approx \frac{(\text{highest value}) - (\text{lowest value})}{\text{number of classes}}$. (May round up.) 4. Choose the first LCL (less than or equal to the smallest value). 5. Write all LCLs by adding the class width. 6. Enter all the UCLs. 7. Find the frequencies for each class.
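
The construction steps above can be sketched as follows. This is a minimal sketch that assumes integer data, takes the smallest value as the first LCL, and always rounds the class width up to the next whole number so that the highest value falls inside the last class. The function name and the exam scores are invented for illustration.

```python
def grouped_frequency_table(values, num_classes):
    """Return a list of (LCL, UCL, frequency) rows for integer data."""
    low, high = min(values), max(values)
    # Steps 1-3: range divided by number of classes, rounded up to the next integer.
    width = (high - low) // num_classes + 1
    table = []
    for i in range(num_classes):
        lcl = low + i * width            # steps 4-5: first LCL, then add the width
        ucl = lcl + width - 1            # step 6: UCL sits just below the next LCL
        f = sum(lcl <= v <= ucl for v in values)   # step 7: count values in the class
        table.append((lcl, ucl, f))
    return table

# Hypothetical exam scores (invented data).
scores = [52, 58, 61, 67, 70, 73, 75, 78, 81, 84, 88, 90, 93, 97]
for lcl, ucl, f in grouped_frequency_table(scores, 5):
    print(f"{lcl}-{ucl}: {f}")
```

In practice the first LCL and the class width are often nudged to convenient round numbers (for example 50 and 10); the sketch keeps the raw values to stay close to the listed steps.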

“Shape” of Distribution. Symmetric: Data is symmetric if the left half of its histogram is roughly a mirror image of its right half. Skewed: Data is skewed if it is not symmetric and if it extends more to one side than the other. Uniform: Data is uniform if it is equally distributed (on a histogram, all the bars are the same height).

Shape (Figure 2-11)

Outliers: “unusual” data values as compared to the rest of the set. They may be distinguished by gaps in a histogram.

Other Graphs: Besides histograms, there are other ways to graph quantitative data: 1. Stem-and-leaf plots 2. Dot plots 3. Time-series graphs

Stem-and-Leaf Plot: Represents data by separating each value into two parts: the stem (such as the leftmost digit) and the leaf (such as the rightmost digit).
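
A stem-and-leaf plot is easy to build programmatically. The sketch below assumes non-negative integers and uses the last digit as the leaf and the remaining leading digits as the stem; the function name and data are invented for illustration.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (leading digits) to its sorted list of leaves (last digit)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)   # e.g. 23 -> stem 2, leaf 3
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical data set (invented).
data = [12, 15, 21, 23, 23, 30, 38]
for stem, leaves in stem_and_leaf(data).items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```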

Dot Plot: Consists of a graph in which each data value is plotted as a point along a scale of values (Figure 2-5).

Time-Series Graph: A graph of data that have been collected at different points in time (Figure 2-8).

Qualitative Data: The two most common graphs for qualitative data are: 1. Pareto charts (bar charts) 2. Pie charts

Pareto Chart: A bar graph for qualitative data, with the bars arranged in order according to frequencies (Figure 2-6).

Pie Chart: A graph depicting qualitative data as slices of a pie (Figure 2-7).

Section 2-4 Measures of Center

Measures of Center. Measure of Center: a number representing a “typical” or central value of a data set; an “average”. There are 4 common “averages”: 1. Mean 2. Median 3. Mode 4. Midrange

The Mean: the measure of center obtained by adding the values and dividing the total by the number of values.

Notation: $\Sigma$ denotes the addition of a set of values; x is the variable usually used to represent the individual data values; n represents the number of values in a sample; N represents the number of values in a population.

Notation: $\bar{x}$ is pronounced ‘x-bar’ and denotes the mean of a set of sample values: $\bar{x} = \frac{\sum x}{n}$. $\mu$ is pronounced ‘mu’ and denotes the mean of all values in a population: $\mu = \frac{\sum x}{N}$.

Round-off Rule for Measures of Center: Carry one more decimal place than is present in the original set of values.

Median: the middle value when the original data values are arranged in order of increasing (or decreasing) magnitude. It is often denoted by $\tilde{x}$ (pronounced ‘x-tilde’) and is not affected by an extreme value.

Finding the Median: If the number of values is odd, the median is the number located in the exact middle of the list. If the number of values is even, the median is found by computing the mean of the two middle numbers.

Examples. Odd number of values: the median is the exact middle value. Even number of values: the median is the mean of the two middle numbers (in the slide's example, the median is 7.5). [Example data values missing from the transcript.]

Mode: the value that occurs most frequently. The mode is not always unique. A data set may be bimodal, multimodal, or have no mode. Examples (data lists a, b, and c missing from the transcript): a. Mode is 1.10. b. Bimodal: 27 and 55. c. No mode.

Midrange: the value midway between the highest and lowest values in the original data set. $\text{Midrange} = \frac{\text{highest score} + \text{lowest score}}{2}$
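
The four measures of center can be computed together with Python's standard `statistics` module. This is a sketch on invented data; `measures_of_center` is a hypothetical helper name. Note one assumption about conventions: `statistics.multimode` returns every value when no value repeats, whereas the slides would describe that case as "no mode".

```python
import statistics

def measures_of_center(values):
    """Return the mean, median, mode(s), and midrange of a data set."""
    return {
        "mean": statistics.mean(values),            # sum of values / number of values
        "median": statistics.median(values),        # middle value of the sorted data
        "modes": statistics.multimode(values),      # all most-frequent values (Python 3.8+)
        "midrange": (max(values) + min(values)) / 2,
    }

# Hypothetical data set (invented).
data = [1, 2, 2, 3, 10]
print(measures_of_center(data))
```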

Best Measure of Center

Picking the best “average”: The shape of your data may help determine the best measure of center. Outliers may affect the mean, making it too high or too low to represent a “typical” value. If so, the median may be the best choice.
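
A small numerical illustration of the outlier effect, using invented salary figures (in thousands): adding one extreme value pulls the mean far from the bulk of the data, while the median barely moves.

```python
import statistics

# Hypothetical salaries in thousands (invented data).
salaries = [30, 32, 35, 36, 40]
with_outlier = salaries + [400]   # one extreme value added

print("mean without outlier:  ", statistics.mean(salaries))
print("median without outlier:", statistics.median(salaries))
print("mean with outlier:     ", statistics.mean(with_outlier))
print("median with outlier:   ", statistics.median(with_outlier))
```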

Shape (Figure 2-11)

Section 2-5 Measures of Variation

Measures of Variation (“Spread”): Because this section introduces the concept of variation, this is one of the most important sections in the entire book. The two most common methods of measuring spread: 1. Range 2. Standard deviation and variance

Definition: The range of a set of data is the difference between the highest value and the lowest value: $\text{range} = (\text{highest value}) - (\text{lowest value})$.

Standard Deviation and Variance measure the amount data values vary (or deviate) from the mean. Sample variance: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$. Sample standard deviation: $s = \sqrt{s^2} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$.
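
The sample variance and standard deviation formulas translate directly into code. The sketch below follows the formulas term by term rather than calling a library routine; the function names and the three-value data set are invented for illustration.

```python
import math

def sample_variance(values):
    """s^2 = sum((x - x_bar)^2) / (n - 1)."""
    n = len(values)
    x_bar = sum(values) / n                       # sample mean
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)

def sample_std(values):
    """s = square root of the sample variance."""
    return math.sqrt(sample_variance(values))

# Hypothetical data (invented): mean is 6, deviations are -5, -3, 8.
data = [1, 3, 14]
print(sample_variance(data))   # (25 + 9 + 64) / 2
print(sample_std(data))
```

The standard library functions `statistics.variance` and `statistics.stdev` compute the same sample quantities.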

Round-off Rule for Measures of Variation: Carry one more decimal place than is present in the original set of data. Round only the final answer, not values in the middle of a calculation.

Notation
                     Sample Statistics    Population Parameters
Mean                 $\bar{x}$            $\mu$
Standard Deviation   $s$                  $\sigma$
Variance             $s^2$                $\sigma^2$

Sample vs. Population Standard Deviation. Note: Unlike $\bar{x}$ and $\mu$, the formulas for $s$ and $\sigma$ are not mathematically the same: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$ and $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$.
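
The difference between the two formulas shows up as soon as both are applied to the same numbers. In Python's `statistics` module, `stdev` divides by n - 1 (sample) and `pstdev` divides by N (population). The data set below is invented for illustration.

```python
import statistics

# Hypothetical data (invented): mean is 5, sum of squared deviations is 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

s = statistics.stdev(data)        # sample formula: divides by n - 1 = 7
sigma = statistics.pstdev(data)   # population formula: divides by N = 8

print(f"s     = {s:.3f}")
print(f"sigma = {sigma:.3f}")
```

For the same values, s is always at least as large as σ because its divisor is smaller.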

Standard Deviation - Key Points. The standard deviation is a measure of variation of all values from the mean. The larger s is, the more the data vary. (When would s = 0?) The value of the standard deviation s can increase dramatically with the inclusion of one or more outliers (data values far away from all others). The units of the standard deviation s are the same as the units of the original data values (the variance has those units squared).

Standard Deviation and “Spread”: How does s show how much the data vary? Three methods: 1. Range Rule of Thumb 2. Chebyshev’s Theorem 3. The Empirical Rule

The Range Rule of Thumb. Range Rule: For most data sets, the majority of the data lies within 2 standard deviations of the mean. Recall: Range = High – Low. Estimate: Range ≈ 4s. Alternatively, if the range is known, you can use the range rule to estimate the standard deviation: $s \approx \frac{\text{Range}}{4}$.
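
As a rough check of the range rule, the estimate Range/4 can be compared with the computed sample standard deviation. The helper name `range_rule_std` and the data are invented; the rule is only a crude approximation, so the two numbers are expected to be of similar size rather than equal.

```python
import statistics

def range_rule_std(values):
    """Estimate the standard deviation as range / 4."""
    return (max(values) - min(values)) / 4

# Hypothetical data (invented).
data = [2, 4, 4, 4, 5, 5, 7, 9]
print("range rule estimate:", range_rule_std(data))
print("computed s:         ", round(statistics.stdev(data), 3))
```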

Chebyshev’s Theorem: For data with any distribution, the proportion (or fraction) of any set of data lying within K standard deviations of the mean is always at least $1 - 1/K^2$, where K is any positive number greater than 1. For K = 2, at least 3/4 (or 75%) of all values lie within 2 standard deviations of the mean. For K = 3, at least 8/9 (or 89%) of all values lie within 3 standard deviations of the mean.
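
Chebyshev's bound can be computed and compared with the observed proportion in a data set. In this sketch the function names and the deliberately skewed data are invented, and the population standard deviation (`statistics.pstdev`) is used when measuring the observed proportion.

```python
import statistics

def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k**2

def proportion_within(values, k):
    """Observed proportion of values within k standard deviations of the mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return sum(abs(x - mean) <= k * sd for x in values) / len(values)

# Hypothetical, strongly skewed data (invented).
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 30]
for k in (2, 3):
    print(f"K = {k}: bound = {chebyshev_lower_bound(k):.3f}, "
          f"observed = {proportion_within(data, k):.3f}")
```

Even for skewed data like this, the observed proportion never drops below the bound, which is what makes the theorem usable for any distribution.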

The Empirical Rule. Empirical (68-95-99.7) Rule: For data sets having an approximately bell-shaped (symmetric, mound-shaped) distribution: About 68% of all values fall within 1 standard deviation of the mean. About 95% of all values fall within 2 standard deviations of the mean. About 99.7% of all values fall within 3 standard deviations of the mean.
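
The three percentages in the empirical rule match the areas under a normal (bell-shaped) curve. A quick way to see this, assuming an exactly normal distribution, is to use `statistics.NormalDist` from the Python standard library; the helper name `normal_proportion_within` is invented for this sketch.

```python
from statistics import NormalDist

def normal_proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    standard_normal = NormalDist()   # mean 0, standard deviation 1
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {normal_proportion_within(k):.1%}")
```

Real data sets are only approximately bell-shaped, so observed proportions will be close to these values rather than identical.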


Sections 2-6 and 2-7: Measures of Position (Relative Standing)

Measures of Position: Sometimes we want to know the “relative standing” or “relative position” of a particular data value in the set. Some measures of position: 1. Standard scores (z-scores) 2. Median, quartiles, percentiles

z-score: The z-score (or standard score) for a data value x is the number of standard deviations that x is above or below the mean.

Computing z-scores. To convert a data value x to a z-score: Sample: $z = \frac{x - \bar{x}}{s}$. Population: $z = \frac{x - \mu}{\sigma}$. Round to 2 decimal places.

Interpreting z Scores: Whenever a value is less than the mean, its corresponding z score is negative. Ordinary values: z score between –2 and 2. Unusual values: z score less than –2 or greater than 2 (Figure 2-14).
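
The z-score conversion and the "unusual value" test can be sketched as follows. The function names are invented, and the mean of 100 and standard deviation of 15 are assumed values chosen only to make the arithmetic easy to follow.

```python
def z_score(x, mean, sd):
    """Number of standard deviations x lies above (+) or below (-) the mean, rounded to 2 places."""
    return round((x - mean) / sd, 2)

def is_unusual(z):
    """A value is unusual if its z score is less than -2 or greater than 2."""
    return z < -2 or z > 2

# Hypothetical distribution: mean 100, standard deviation 15 (assumed values).
for x in (85, 130, 135):
    z = z_score(x, 100, 15)
    print(f"x = {x}: z = {z}, unusual = {is_unusual(z)}")
```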

Other Measures of Position: Median, Quartiles, Percentiles. Recall: The median separates ranked data into 2 equal parts.

Quartiles: Quartiles separate ranked data into 4 equal parts. $Q_1$ (First Quartile) separates the bottom 25% of sorted values from the top 75%. $Q_2$ (Second Quartile) is the same as the median; it separates the bottom 50% of sorted values from the top 50%. $Q_3$ (Third Quartile) separates the bottom 75% of sorted values from the top 25%.

Quartiles: $Q_1$, $Q_2$, $Q_3$ divide ranked scores into four equal parts, each containing 25% of the values: Low, $Q_1$, $Q_2$ (median), $Q_3$, High.

Percentiles: Just as there are quartiles separating data into four parts, there are 99 percentiles denoted $P_1, P_2, \ldots, P_{99}$, which partition the data into 100 groups.
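
Quartiles and percentiles can both be obtained from `statistics.quantiles` in the Python standard library (Python 3.8+), which returns the n - 1 cut points that split the data into n groups. This is a sketch on invented data; textbooks and software use several slightly different conventions for locating quartiles, so hand calculations may differ a little from the library's default method.

```python
import statistics

# Hypothetical data set (invented): the integers 1 through 11.
data = list(range(1, 12))

quartiles = statistics.quantiles(data, n=4)      # [Q1, Q2, Q3]
percentiles = statistics.quantiles(data, n=100)  # [P1, P2, ..., P99]

print("quartiles:", quartiles)
print("number of percentiles:", len(percentiles))
# P25, P50, P75 sit at list positions 24, 49, 74 because the list starts at P1.
print("P25, P50, P75:", percentiles[24], percentiles[49], percentiles[74])
```

The check at the end reflects the relationship between the two systems: P25 corresponds to Q1, P50 to Q2 (the median), and P75 to Q3.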

Tukey’s 5-number Summary: Low, $Q_1$, Median, $Q_3$, High. These 5 numbers can also give another representation of “center and spread.”
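
A 5-number summary can be assembled from the minimum, the three quartiles, and the maximum. The helper name `five_number_summary` and the data are invented for this sketch, and the quartiles come from the default method of `statistics.quantiles`, which is one of several conventions in use.

```python
import statistics

def five_number_summary(values):
    """Return (Low, Q1, Median, Q3, High) for a data set."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return (min(values), q1, median, q3, max(values))

# Hypothetical data set (invented): the integers 1 through 11.
data = list(range(1, 12))
low, q1, median, q3, high = five_number_summary(data)
print(f"Low = {low}, Q1 = {q1}, Median = {median}, Q3 = {q3}, High = {high}")
```

These are the same five values a boxplot draws: the whiskers end at Low and High, the box spans Q1 to Q3, and a line inside the box marks the median.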

Boxplots: A boxplot (or box-and-whisker plot) is a graphical representation of Tukey’s 5-number summary (Figure 2-16).

Boxplots (Figure 2-17)