4.1.1: Summarizing Numerical Data Sets

Slides:



Advertisements
Similar presentations
DESCRIBING DISTRIBUTION NUMERICALLY
Advertisements

Ch 11 – Probability & Statistics
Introduction Data sets can be compared and interpreted in the context of the problem. Data values that are much greater than or much less than the rest.
Describing Distributions Numerically
Unit 4 – Probability and Statistics
Grouped Data Calculation
Introduction Measures of center and variability can be used to describe a data set. The mean and median are two measures of center. The mean is the average.
Vocabulary for Box and Whisker Plots. Box and Whisker Plot: A diagram that summarizes data using the median, the upper and lowers quartiles, and the extreme.
Box and Whisker Plot 5 Number Summary for Odd Numbered Data Sets.
Quartiles & Extremes (displayed in a Box-and-Whisker Plot) Lower Extreme Lower Quartile Median Upper Quartile Upper Extreme Back.
CONFIDENTIAL 1 Grade 8 Algebra1 Data Distributions.
Describing distributions with numbers
Working with one variable data. Spread Joaquin’s Tests Taran’s Tests: 76, 45, 83, 68, 64 67, 70, 70, 62, 62 What can you infer, justify and conclude about.
Measures of Central Tendency & Spread
Objectives Vocabulary
Warm Up Problem of the Day Lesson Presentation Lesson Quizzes.
6-9 Data Distributions Objective Create and interpret box-and-whisker plots.
What is the MEAN? How do we find it? The mean is the numerical average of the data set. The mean is found by adding all the values in the set, then.
Numerical Statistics Given a set of data (numbers and a context) we are interested in how to describe the entire set without listing all the elements.
Bell Ringers Calculate the mean median and mode for the following sets of data. 1.15, 16, 19, 6, 16, 17, 19 Mean: Median: Mode: 2. 68, 74, 20, 45, 96,
Warm-Up Define mean, median, mode, and range in your own words. Be ready to discuss.
Measures of Variation For Example: The 11 Workers at a company have the following ages: 27, 39, 40, 22, 19, 25, 41, 58, 53, 49, 51 Order data from least.
Quantitative data. mean median mode range  average add all of the numbers and divide by the number of numbers you have  the middle number when the numbers.
Numerical Measures of Variability
Summary Statistics and Mean Absolute Deviation MM1D3a. Compare summary statistics (mean, median, quartiles, and interquartile range) from one sample data.
Introduction Measures of center and variability can be used to describe a data set. The mean and median are two measures of center. The mean is the average.
What are the effects of outliers on statistical data?
BPS - 5th Ed. Chapter 21 Describing Distributions with Numbers.
 Please pick up a copy of the Unit 4 vocabulary and the KWL chart.  Students will review Unit 4: Describing Data Essential Vocabulary and complete the.
Warm Up Simplify each expression
Summary Statistics, Center, Spread, Range, Mean, and Median Ms. Daniels Integrated Math 1.
Unit 4: Probability Day 4: Measures of Central Tendency and Box and Whisker Plots.
Concept: Comparing Data. Essential Question: How do we make comparisons between data sets? Vocabulary: Spread, variation Skewed left Skewed right Symmetric.
Unit 4 Describing Data Standards: S.ID.1 Represent data on the real number line (dot plots, histograms, and box plots) S.ID.2 Use statistics appropriate.
BPS - 5th Ed.Chapter 21 Describing Distributions with Numbers.
CCGPS Coordinate Algebra Unit 4: Describing Data.
Measures of Central Tendency, Dispersion, IQR and Standard Deviation How do we describe data using statistical measures? M2 Unit 4: Day 1.
Probability & Statistics Box Plots. Describing Distributions Numerically Five Number Summary and Box Plots (Box & Whisker Plots )
Introduction Data sets can be compared by examining the differences and similarities between measures of center and spread. The mean and median of a data.
Holt McDougal Algebra 1 Data Distributions Holt Algebra 1 Warm Up Warm Up Lesson Presentation Lesson Presentation Lesson Quiz Lesson Quiz Holt McDougal.
Statistics Vocab Notes Unit 4. Mean The average value of a data set, found by adding all values and dividing by the number of data points Example: 5 +
Introduction To compare data sets, use the same types of statistics that you use to represent or describe data sets. These statistics include measures.
Chapter 5 : Describing Distributions Numerically I
Measures of Central Tendency & Center of Spread
Measures of Central Tendency & Center of Spread
Unit 4 Statistics Review
SEE SOMETHING, SAY SOMETHING
Measures of Variation.
Warm-up 8/25/14 Compare Data A to Data B using the five number summary, measure of center and measure of spread. A) 18, 33, 18, 87, 12, 23, 93, 34, 71,
Lesson 1: Measures of Center Mean and Median
Lesson 1: Summarizing and Interpreting Data
The absolute value of each deviation.
Summarizing Numerical Data Sets
11.2 box and whisker plots.
Algebra I Unit 1.
Describe the spread of the data:
1.3 Describing Quantitative Data with Numbers
Basic Practice of Statistics - 3rd Edition
Exploratory Data Analysis
EOC Review Question of the Day.
Statistics and Data (Algebraic)
Basic Practice of Statistics - 3rd Edition
Statistics Vocabulary Continued
MCC6.SP.5c, MCC9-12.S.ID.1, MCC9-12.S.1D.2 and MCC9-12.S.ID.3
First Quartile- Q1 The middle of the lower half of data.
5 Number Summaries.
Describing Distributions with Numbers
Warm-Up Define mean, median, mode, and range in your own words. Be ready to discuss.
Core Focus on Linear Equations
Statistics Vocabulary Continued
Presentation transcript:

4.1.1: Summarizing Numerical Data Sets Introduction Measures of center and variability can be used to describe a data set. The mean and median are two measures of center. The mean is the average value of the data. The median is the middle-most value in a data set. These measures are used to generalize data sets and identify common or expected values. Interquartile range and mean absolute deviation describe variability of the data set. Interquartile range is the difference between the third and first quartiles. The first quartile is the median of the lower half of the data set. 4.1.1: Summarizing Numerical Data Sets

4.1.1: Summarizing Numerical Data Sets Introduction, continued The third quartile is the median of the upper half of the data set. The mean absolute deviation is the average absolute value of the difference between each data point and the mean. Measures of spread describe the variance of data values (how spread out they are), and identify the diversity of values in a data set. Measures of spread are used to help explain whether data values are very similar or very different. 4.1.1: Summarizing Numerical Data Sets

4.1.1: Summarizing Numerical Data Sets Key Concepts Data sets can be compared using measures of center and variability. Measures of center are used to generalize data sets and identify common or expected values. Two measures of center are the mean and median. Finding the Mean Find the sum of the data values. Divide the sum by the number of data points. This is the mean. 4.1.1: Summarizing Numerical Data Sets

4.1.1: Summarizing Numerical Data Sets Key Concepts, continued The mean is useful when data sets do not contain values that vary greatly. Median is a second measure of center. Finding the Median First arrange the data from least to greatest. Count the number of data points. If there is an even number of data points, the median is the average of the two middle-most values. If there is an odd number of data points, the median is the middle-most value. 4.1.1: Summarizing Numerical Data Sets

4.1.1: Summarizing Numerical Data Sets Key Concepts, continued The mean and median are both measures that describe the expected value of a data set. Measures of spread describe the range of data values in a data set. Mean absolute deviation and interquartile range describe variability. 4.1.1: Summarizing Numerical Data Sets

4.1.1: Summarizing Numerical Data Sets Key Concepts, continued The mean absolute deviation takes the average distance of the data points from the mean. This summarizes the variability of the data using one number. Finding the Median Absolute Deviation Find the mean. Calculate the absolute value of the difference between each data value and the mean. Determine the average of the differences found in step 2. This average is the mean absolute deviation. 4.1.1: Summarizing Numerical Data Sets

Key Concepts, continued Finding the Interquartile Range Arrange the data from least to greatest. Count the number of data points in the set. Find the median of the data set. The median divides the data into two halves: the lower half and the upper half. Find the middle-most value of the lower half of the data. The data to the left represents the first quartile, Q 1. Find the middle-most value of the upper half of the data. The data to the right is the third quartile, Q 3. Calculate the difference between the two quartiles, Q 3 – Q 1 . The interquartile range is the difference between the third and first quartiles. 4.1.1: Summarizing Numerical Data Sets

4.1.1: Summarizing Numerical Data Sets Key Concepts, continued The interquartile range finds the distance between the two data values that represent the middle 50% of the data. This summarizes the variability of the data using one number. 4.1.1: Summarizing Numerical Data Sets

4.1.1: Summarizing Numerical Data Sets Common Errors/Misconceptions confusing the terms mean and median, and how to calculate each measure forgetting to order data from least to greatest before calculating the median, quartiles, and interquartile range incorrectly finding the absolute value of the difference between each data value and the mean 4.1.1: Summarizing Numerical Data Sets

4.1.1: Summarizing Numerical Data Sets Guided Practice Example 1 Aaron has diabetes, and needs to monitor his blood sugar level. He measures his blood sugar each day before he eats dinner. Aaron’s results for the past 10 days are in the table below. Blood sugar levels for any person before meals should be between 80 and 120. Are there any striking deviations in the data? Day Blood sugar level 1 109 2 115 3 89 4 92 5 106 6 101 7 98 8 94 9 107 10 93 4.1.1: Summarizing Numerical Data Sets

4.1.1: Summarizing Numerical Data Sets Guided Practice: Example 1, continued To see if there are any deviations in the data, start by finding Aaron’s mean blood sugar level over the past 10 days. Aaron’s mean blood sugar level over the past 10 days was 100.4. 2. Examine the data for deviations. The lowest blood sugar level was 89. The highest blood sugar levels was 115. These numbers are not very far apart in terms of the mean. Also, the data is within the suggested range of blood sugar levels. There are no striking deviations in the data. 4.1.1: Summarizing Numerical Data Sets

4.1.1: Summarizing Numerical Data Sets Guided Practice Example 2 A website captures information about each customer’s order. The total dollar amounts of the last 8 orders are listed in the table to the right. What is the mean absolute deviation of the data? Order Dollar amount 1 21 2 15 3 22 4 26 5 24 6 7 17 8 4.1.1: Summarizing Numerical Data Sets

4.1.1: Summarizing Numerical Data Sets Guided Practice: Example 2, continued To find the mean absolute deviation of the data, start by finding the mean of the data set. 4.1.1: Summarizing Numerical Data Sets

4.1.1: Summarizing Numerical Data Sets Guided Practice: Example 2, continued Find the sum of the data values, and divide the sum by the number of data values. 4.1.1: Summarizing Numerical Data Sets

4.1.1: Summarizing Numerical Data Sets Guided Practice: Example 2, continued Find the absolute value of the difference between each data value and the mean: |data value – mean|. |21 – 21| = 0 |15 – 21| = 6 |22 – 21| = 1 |26 – 21| = 5 |24 – 21| = 3 |17 – 21| = 4 4.1.1: Summarizing Numerical Data Sets

4.1.1: Summarizing Numerical Data Sets Guided Practice: Example 2, continued Find the sum of the absolute values of the differences. 0 + 6 + 1 + 5 + 3 + 0 + 4 + 1 = 20 4.1.1: Summarizing Numerical Data Sets

4.1.1: Summarizing Numerical Data Sets Guided Practice: Example 3, continued Divide the sum of the absolute values of the differences by the number of data values. The mean absolute deviation of the dollar amounts of each order set is 2.5. This says that the average cost difference between the orders and the mean order is $2.50. ✔ 4.1.1: Summarizing Numerical Data Sets

4.1.1: Summarizing Numerical Data Sets Guided Practice Example 3 A company keeps track of the age at which employees retire. It is considered an early retirement if the employee retires before turning 65. The age of the 11 employees who took early retirement this year are listed in the table below. Are there any striking deviations in the data? 4.1.1: Summarizing Numerical Data Sets

Guided Practice: Example 3, continued Employee Age at early retirement 1 56 2 55 3 60 4 51 5 53 6 58 7 8 64 9 59 10 42 11 48 4.1.1: Summarizing Numerical Data Sets

4.1.1: Summarizing Numerical Data Sets Guided Practice: Example 3, continued First find the interquartile range. 4.1.1: Summarizing Numerical Data Sets

4.1.1: Summarizing Numerical Data Sets Guided Practice: Example 3, continued Order the data set from least to greatest. 42 48 51 53 55 56 56 58 59 60 64 4.1.1: Summarizing Numerical Data Sets

4.1.1: Summarizing Numerical Data Sets Guided Practice: Example 3, continued Find the median of the data set. If there is an odd number of data values, find the middle-most value. If there is an even number of data values, find the average of the two middle-most values. There are 11 data values. The sixth data value is the middle-most value, and therefore is the median. The median of this data set is 56. 42 48 51 53 55 56 56 58 59 60 64 median 4.1.1: Summarizing Numerical Data Sets

4.1.1: Summarizing Numerical Data Sets Guided Practice: Example 3, continued Find the first quartile. The first quartile is the median of the lower half of the data set, or the values less than the median value. The first five data values are the lower half of the data set: 42, 48, 51, 53, and 55. The median of the first five data values is the middle-most value of these five values. The first quartile is the third value, 51. 42 48 51 53 55 56 56 58 59 60 64 Q1 median 4.1.1: Summarizing Numerical Data Sets

4.1.1: Summarizing Numerical Data Sets Guided Practice: Example 3, continued Find the third quartile. The third quartile is the median of the upper half of the data set, or the values greater than the median value. The last five data values are the upper half of the data set: 56, 58, 59, 60, and 64. The median of the last five data values is the middle-most value of these five values. The third quartile is the third value, 59. 42 48 51 53 55 56 56 58 59 60 64 Q1 median Q3 4.1.1: Summarizing Numerical Data Sets

4.1.1: Summarizing Numerical Data Sets Guided Practice: Example 3, continued Find the difference between the third and first quartiles: third quartile – first quartile, or IQR = Q3 – Q1. IQR= 59 – 51 = 8 The interquartile range is 8. 4.1.1: Summarizing Numerical Data Sets

4.1.1: Summarizing Numerical Data Sets Guided Practice: Example 3, continued Look for striking deviations in the data. Think about the typical retirement age, which is 65. Also consider the interquartile range, which is 8. Retiring at the age of 42 is young and far away from the median of 56. The age 42 would be considered a striking deviation because it is far away from the other data values. This may be considered an outlier. We will discuss outliers later. ✔ 4.1.1: Summarizing Numerical Data Sets