Section 2.1 Frequency Distributions and Their Graphs.

How to construct the following graphs from frequency distributions: frequency histograms, relative frequency histograms, frequency polygons, and ogives.

Definition: A frequency distribution is a table that shows the classes or intervals of data entries with the number of entries in each class.

Example:

Class      Frequency
67-78      3
79-90      5
91-102     8
103-114    9
115-126    5

Class      ƒ    Boundaries
67-78      3    66.5-78.5
79-90      5    78.5-90.5
91-102     8    90.5-102.5
103-114    9    102.5-114.5
115-126    5    114.5-126.5

Class      ƒ    Midpoint
67-78      3    72.5
79-90      5    84.5
91-102     8    96.5
103-114    9    108.5
115-126    5    120.5

[Figure: relative frequency histogram. Horizontal axis: Time on Phone (minutes); vertical axis: Relative frequency.]

Class      Frequency   Relative Frequency   Cumulative Frequency
67-78      3           0.10                 3
79-90      5           0.17                 8
91-102     8           0.27                 16
103-114    9           0.30                 25
115-126    5           0.17                 30

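How to Construct a Frequency Polygon
1. Plot the midpoint and frequency of each class.
2. Connect consecutive midpoints.
3. Extend the frequency polygon to the axis by one class on each side (here, to 60.5 and 132.5).

Class      ƒ    Midpoint
67-78      3    72.5
79-90      5    84.5
91-102     8    96.5
103-114    9    108.5
115-126    5    120.5

[Figure: frequency polygon under construction. Horizontal axis: Time on Phone (minutes), marked at the midpoints 72.5 to 120.5; vertical axis: frequency, 0 to 9.]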

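[Figure: completed frequency polygon. Horizontal axis: Time on Phone (minutes), marked at 60.5, 72.5, 84.5, 96.5, 108.5, 120.5, and 132.5; vertical axis: frequency, 0 to 9. The plotted frequencies are 3, 5, 8, 9, and 5, with the polygon closed at 60.5 and 132.5.]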
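The construction steps for a frequency polygon can be sketched in a few lines of Python. This is only an illustration using the midpoints and frequencies from the table; the variable names are made up for the example, and drawing the line through the vertices is left to whatever plotting tool is at hand.

```python
# Midpoints and frequencies copied from the table in this section.
midpoints = [72.5, 84.5, 96.5, 108.5, 120.5]
freqs = [3, 5, 8, 9, 5]

# Class width is the distance between consecutive midpoints (12 here).
width = midpoints[1] - midpoints[0]

# Extend the polygon by one class on each side, where the frequency is 0.
xs = [midpoints[0] - width] + midpoints + [midpoints[-1] + width]
ys = [0] + freqs + [0]

# Connecting these vertices in order gives the frequency polygon.
vertices = list(zip(xs, ys))
print(vertices)
```

The first and last vertices land on the axis at 60.5 and 132.5, matching the extension points on the slide.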

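Class      Frequency   Relative Frequency   Cumulative Frequency
67-78      3           0.10                 3
79-90      5           0.17                 8
91-102     8           0.27                 16
103-114    9           0.30                 25
115-126    5           0.17                 30

[Figure: ogive titled "Minutes on Phone". Horizontal axis: Minutes, marked at the class boundaries 66.5, 78.5, 90.5, 102.5, 114.5, and 126.5; vertical axis: Cumulative frequency, 0 to 30. The plotted cumulative frequencies are 0, 3, 8, 16, 25, and 30.]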

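Class      Frequency   Relative Frequency   Cumulative Relative Frequency
67-78      3           0.100                0.100
79-90      5           0.167                0.267
91-102     8           0.267                0.533
103-114    9           0.300                0.833
115-126    5           0.167                1.000

Each cumulative relative frequency is the cumulative frequency divided by n = 30. Adding the rounded relative frequencies instead gives a final value of 1.001 because of rounding.

[Figure: ogive titled "Minutes on Phone". Horizontal axis: Minutes, marked at the class boundaries 66.5 to 126.5; vertical axis: Cumulative Relative Frequency, 0 to 1.0.]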
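As a rough sketch, the columns of the tables above (boundaries, midpoints, relative and cumulative frequencies) can be derived from the class limits and frequencies alone. The code assumes whole-number data, so each boundary sits half a unit beyond the class limit; the dictionary keys are names chosen for this example.

```python
# Class limits and frequencies copied from the example table.
classes = [(67, 78), (79, 90), (91, 102), (103, 114), (115, 126)]
freqs = [3, 5, 8, 9, 5]

n = sum(freqs)
rows = []
cumulative = 0
for (lower, upper), f in zip(classes, freqs):
    cumulative += f
    rows.append({
        "class": f"{lower}-{upper}",
        # Assumption: integer data, so boundaries are limits -/+ 0.5.
        "boundaries": (lower - 0.5, upper + 0.5),
        "midpoint": (lower + upper) / 2,
        "f": f,
        "rel_f": f / n,
        "cum_f": cumulative,
        "cum_rel_f": cumulative / n,
    })

for r in rows:
    print(f'{r["class"]:>8}  {r["midpoint"]:6.1f}  {r["f"]:2d}  '
          f'{r["rel_f"]:.3f}  {r["cum_f"]:2d}  {r["cum_rel_f"]:.3f}')
```

Dividing the running total by n, rather than summing rounded relative frequencies, keeps the last cumulative relative frequency at exactly 1.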

In the textbook, turn to page 31. Using the data set, do the following:
• Fill in the following frequency distribution.
• On a single piece of graph paper, construct:
  • a histogram
  • a frequency polygon
  • an ogive

Class      Frequency   Relative F.   Cumulative F.   Midpoints   Boundaries
0 – 9
10 – 19
20 – 29
30 – 39
40 – 49
50 – 59
60 – 69

Section 2.2 More Graphs and Displays

How to graph and interpret quantitative data sets using stem-and-leaf plots and dot plots
How to graph and interpret qualitative data sets using pie charts and Pareto charts
How to graph and interpret paired data sets (bivariate data) using scatter plots

Minutes Spent on the Phone

102 124 108  86 103  82  71 104 112 118
 87  95 103 116  85 122  87 100 105  97
107  67  78 125 109  99 105  99 101  92

Stem | Leaf
   6 | 7
   7 | 1 8
   8 | 2 5 6 7 7
   9 | 2 5 7 9 9
  10 | 0 1 2 3 3 4 5 5 7 8 9
  11 | 2 6 8
  12 | 2 4 5

Key: 6 | 7 means 67

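Stem-and-leaf plot with two lines per stem:

Stem | Leaf
   6 | 7
   7 | 1
   7 | 8
   8 | 2
   8 | 5 6 7 7
   9 | 2
   9 | 5 7 9 9
  10 | 0 1 2 3 3 4
  10 | 5 5 7 8 9
  11 | 2
  11 | 6 8
  12 | 2 4
  12 | 5

The first line of each stem holds the leaf digits 0 1 2 3 4; the second line holds the leaf digits 5 6 7 8 9.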
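A stem-and-leaf plot like the one for the phone data can be built by splitting each value into its leading digits (the stem) and its last digit (the leaf). The helper name `stem_and_leaf` below is just an illustrative choice, and the sketch produces the one-line-per-stem version.

```python
from collections import defaultdict

# The 30 "Minutes Spent on the Phone" values from this section.
minutes = [102, 124, 108, 86, 103, 82, 71, 104, 112, 118,
           87, 95, 103, 116, 85, 122, 87, 100, 105, 97,
           107, 67, 78, 125, 109, 99, 105, 99, 101, 92]

def stem_and_leaf(values):
    """Group each value's last digit (leaf) under its leading digits (stem)."""
    plot = defaultdict(list)
    for v in sorted(values):  # sorting keeps the leaves in increasing order
        plot[v // 10].append(v % 10)
    return dict(plot)

plot = stem_and_leaf(minutes)
for stem in sorted(plot):
    print(f"{stem:>3} | {' '.join(str(leaf) for leaf in plot[stem])}")
```

Because every leaf is kept, the original data values can be read back off the plot, which is the advantage over a histogram noted in this section.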

Example: Suppose the following stem-and-leaf plot represents the scores on a statistics quiz (out of 50).
a) How many students took the quiz? 31
b) How many students scored a perfect score of 50? 2
c) What is the lowest score on the quiz? 6
d) What score occurs the most frequently (the “mode”)? 17
e) What advantage(s) does a stem-and-leaf plot have compared to a histogram? It retains the data values.

• What is the highest grade a man earned on the quiz? 50
• How many women were in the class? 12
• What percent of the class is male? 19/31 = 61.3%
• If a passing grade is 35 or higher, what percent of the females passed? 3/12 = 25%
• Overall, do you think women or men did better on this quiz? Men

[Figure: dot plot titled "Minutes Spent on the Phone". Horizontal axis: Time Spent on Phone (minutes), marked at 66, 76, 86, 96, 106, 116, and 126. The duplicated values 87, 99, 103, and 105 appear as stacked dots.]

The following is a dot plot for a different group of students taking the 50-point statistics quiz.
a) How many students took the quiz? 26
b) How many students scored a perfect score of 50? 0
c) What is the lowest score on the quiz? 4
d) What score represents the mode? 27

• A bar chart is constructed by labeling each category of data on either the horizontal or vertical axis and the frequency (or relative frequency) on the other axis.
• A Pareto chart is a bar graph whose bars are drawn in decreasing order of height, with the tallest bar first.

A pie chart is a circle divided into sectors. Each sector represents a category of data. The area of each sector is proportional to the frequency of the category.
• What percentage of the M&M's are green? ≈12%
• What color was the most common M&M color? Orange
• Suppose 17% of the M&M's are blue. If you have a bag with 400 M&M's, how many of them would be blue? 0.17 × 400 = 68 M&M's

Central angle for each segment:

\( \text{central angle} = \frac{\text{number in category}}{\text{total number}} \times 360^\circ \)

Category             Budget   Degrees
Human Space Flight   5.7      143
Technology           5.9      149
Mission Support      2.7      68
Total                14.3     360
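The central-angle formula is easy to check with a short calculation. The sketch below uses the three budget figures from the table and rounds each angle to the nearest degree; the dictionary and variable names are illustrative.

```python
# Budget figures copied from the table (units as given on the slide).
budget = {
    "Human Space Flight": 5.7,
    "Technology": 5.9,
    "Mission Support": 2.7,
}

total = sum(budget.values())

# Central angle = (amount in category / total) * 360 degrees.
angles = {name: round(amount / total * 360) for name, amount in budget.items()}

for name, angle in angles.items():
    print(f"{name:<20} {angle:>4} degrees")
```

Rounded angles do not always add up to exactly 360; here they happen to.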

The Impact of Absences on Grades

Number of absences (x):   8    2    5   12   15    9    6
Final grade (y):         78   92   90   58   43   74   81

Do the absences depend on the grade, or do the grades depend on the absences?

[Figure: scatter plot with Absences (0 to 16) on the horizontal axis and Grade (40 to 95) on the vertical axis.]

Section 2.3 Measures of Central Tendency

Discover how outliers affect the mean, median, and mode
How to describe the shape of a distribution as symmetric, uniform, or skewed
How to find a weighted mean and the mean of a frequency distribution

Given the following data set, determine the mean, median, and mode. Then remove the outlier and determine the mean, median, and mode again.

20 20 20 20 20 20 21 21 21 21 22 22 22 23 23 23 23 24 24 65

Outlier: An individual value that falls outside the overall pattern. Here the outlier is 65.

          With outlier   Outlier removed
Mean      23.75          21.58
Median    21.5           21
Mode      20             20
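The effect of the outlier can be reproduced with Python's standard `statistics` module. This is a small sketch, with 65 removed by hand as the outlier; the list names are chosen for the example.

```python
import statistics

# The 20 values from the example, including the outlier 65.
data = [20] * 6 + [21] * 4 + [22] * 3 + [23] * 4 + [24] * 2 + [65]
trimmed = [x for x in data if x != 65]  # same data with the outlier removed

for label, values in (("with outlier", data), ("outlier removed", trimmed)):
    print(label,
          "mean:", round(statistics.mean(values), 2),
          "median:", statistics.median(values),
          "mode:", statistics.mode(values))
```

The mean moves the most (23.75 to about 21.58), the median moves slightly, and the mode does not move at all.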

Shape          Symmetry        Mean vs. median
Uniform        Symmetric       Mean = Median
Bell-shaped    Symmetric       Mean = Median
Skewed right   Not symmetric   Mean > Median
Skewed left    Not symmetric   Mean < Median

When describing the center of a data set, use the mean for symmetric distributions and the median for skewed distributions.

Weighted Mean: The mean of a data set whose entries have varying weights.

Example: You are taking AP Biology and your grade is determined by five sources: 50% tests, 15% midterm, 20% final exam, 10% lab work, and 5% homework. Your scores are 86 (tests), 96 (midterm), 82 (final), 98 (lab work), and 100 (homework). What is your weighted mean?

Source      x     w      xw
Tests       86    0.50   43
Midterm     96    0.15   14.4
Final       82    0.20   16.4
Labs        98    0.10   9.8
Homework    100   0.05   5

Σw = 1, Σxw = 88.6

The weighted mean is Σxw / Σw = 88.6 / 1 = 88.6.

The average starting salaries (by degree attained) for 20 employees at a company are given below:
2 with a doctorate: $105,000
7 with a master's degree: $63,000
11 with a bachelor's degree: $41,000
What is the mean starting salary for these employees?

Source       x         w    xw
Doctorate    105,000   2    210,000
Master's     63,000    7    441,000
Bachelor's   41,000    11   451,000

Σw = 20, Σxw = 1,102,000

Mean = Σxw / Σw = 1,102,000 / 20 = $55,100
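Both weighted-mean examples follow the same recipe: multiply each value by its weight, add the products, and divide by the sum of the weights. A minimal sketch, with `weighted_mean` as an illustrative helper name:

```python
def weighted_mean(values, weights):
    """Return sum(x * w) / sum(w) for paired values and weights."""
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)

# AP Biology grade: the weights are the percentages written as decimals.
grade = weighted_mean([86, 96, 82, 98, 100], [0.50, 0.15, 0.20, 0.10, 0.05])

# Starting salaries: the weights are the number of employees in each group.
salary = weighted_mean([105_000, 63_000, 41_000], [2, 7, 11])

print(round(grade, 1))  # grade example
print(salary)           # salary example
```

Dividing by the sum of the weights means the same function works whether the weights add up to 1 (percentages) or to a count such as 20.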

Mean of a Frequency Distribution

Example: Use the frequency distribution to approximate the mean number of minutes that a sample of Internet subscribers spent online during their most recent sessions.

x       f     xf
12.5    6     75
24.5    10    245
36.5    13    474.5
48.5    8     388
60.5    5     302.5
72.5    6     435
84.5    2     169

n = 50, Σxf = 2089.0

Mean = Σxf / n = 2089.0 / 50 ≈ 41.8 minutes

*If you are given a class, use the midpoint for x.

Example: Use the frequency distribution to approximate the mean age of the residents of Bow, Wyoming.

Age       x      f     xf
0 – 9     4.5    57    256.5
10 – 19   14.5   68    986
20 – 29   24.5   36    882
30 – 39   34.5   55    1897.5
40 – 49   44.5   71    3159.5
50 – 59   54.5   44    2398
60 – 69   64.5   36    2322
70 – 79   74.5   14    1043
80 – 89   84.5   8     676

n = 389, Σxf = 13,620.5

Mean = Σxf / n = 13,620.5 / 389 ≈ 35 years

*If you are given a class, use the midpoint for x.
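The mean of a frequency distribution is the weighted mean of the class midpoints, with the frequencies as weights. The sketch below reruns both examples; `grouped_mean` is a name made up for illustration, and the result is only an approximation because every entry in a class is treated as if it sat at the midpoint.

```python
def grouped_mean(midpoints, freqs):
    """Approximate the mean from class midpoints and class frequencies."""
    n = sum(freqs)
    return sum(x * f for x, f in zip(midpoints, freqs)) / n

# Internet session example: midpoints are given directly.
minutes_mean = grouped_mean([12.5, 24.5, 36.5, 48.5, 60.5, 72.5, 84.5],
                            [6, 10, 13, 8, 5, 6, 2])

# Bow, Wyoming example: midpoints are computed from the class limits.
age_classes = [(0, 9), (10, 19), (20, 29), (30, 39), (40, 49),
               (50, 59), (60, 69), (70, 79), (80, 89)]
age_freqs = [57, 68, 36, 55, 71, 44, 36, 14, 8]
age_mean = grouped_mean([(lo + hi) / 2 for lo, hi in age_classes], age_freqs)

print(round(minutes_mean, 2), round(age_mean, 1))
```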

1. Suppose John wants to calculate his grade in his statistics class with the grading scale listed below. There are three exams each worth 15%, a final exam worth 25%, and homework worth 30%. John has received an 81%, a 77%, and a 65% on his exams. His homework percentage is 84.6%. If John gets a 78% on the final exam, calculate his grade in the class.

2. Suppose that Ashley took 4 courses this semester. She got a 4.0 in her 3-credit statistics class (of course), a 3.0 in her 5-credit chemistry class, a 3.5 in her 2-credit fitness class, and a 2.5 in her 4-credit English class. What is her GPA for the semester?

Section 2.4 Finding and Interpreting Measures of Variation (Day 1)

How to find the range, variance, and standard deviation of a data set.

Closing prices for two stocks were recorded on ten successive Fridays. Calculate the mean, median, and mode for each.

DIS (The Walt Disney Company):   56 56 57 58 61 63 63 67 67 67
WMT (Wal-Mart Stores):           33 42 48 52 57 67 67 77 82 90

       Mean   Median   Mode   Range
DIS    61.5   62       67     11
WMT    61.5   62       67     57

Both stocks have the same mean, median, and mode. Does this mean the data are the same? How are the two stocks different? The ranges show the difference: 67 - 56 = 11 for DIS, but 90 - 33 = 57 for WMT.

To learn to calculate measures of variation that use every value in the data set, you first want to know about deviations. The deviation for each value x is the difference between the value of x and the mean of the data set.

In a population, the deviation for each value x is \( x - \mu \).
In a sample, the deviation for each value x is \( x - \bar{x} \).

Deviations for DIS (mean: 61.5)

x     Deviation
56    56 - 61.5 = -5.5
56    56 - 61.5 = -5.5
57    57 - 61.5 = -4.5
58    -3.5
61    -0.5
63    1.5
63    1.5
67    5.5
67    5.5
67    5.5

The sum of the deviations is always zero.

Population Variance: The sum of the squares of the deviations, divided by N. Sample Variance: The sum of the squares of the deviations, divided by n – 1.

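Squared deviations for DIS (mean: 61.5)

x     x - x̄    (x - x̄)²
56    -5.5     30.25
56    -5.5     30.25
57    -4.5     20.25
58    -3.5     12.25
61    -0.5     0.25
63    1.5      2.25
63    1.5      2.25
67    5.5      30.25
67    5.5      30.25
67    5.5      30.25

Sum of squared deviations: 188.50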

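Population Standard Deviation: The square root of the population variance.
Sample Standard Deviation: The square root of the sample variance.

For the Disney stock, dividing the sum of squared deviations by n - 1 = 9 and taking the square root gives \( \sqrt{188.50 / 9} \approx 4.58 \), so the sample standard deviation is about $4.58.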

Now, let's find the standard deviation for the Wal-Mart stock prices (mean: 61.5).

x     x - x̄     (x - x̄)²
33    -28.5     812.25
42    -19.5     380.25
48    -13.5     182.25
52    -9.5      90.25
57    -4.5      20.25
67    5.5       30.25
67    5.5       30.25
77    15.5      240.25
82    20.5      420.25
90    28.5      812.25

Sum of squared deviations: 3018.5

Dividing by n - 1 = 9 and taking the square root gives \( \sqrt{3018.5 / 9} \approx 18.31 \).

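The sample standard deviation for DIS stock is $4.58. The sample standard deviation for WMT stock is $18.31. Both values divide the sum of squared deviations by n - 1 = 9; dividing by N = 10 instead gives population standard deviations of about $4.34 and $17.37. Either way, the WMT prices are much more spread out than the DIS prices.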
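The range and standard deviations can be checked with the standard `statistics` module. One assumption to flag: the ten prices per stock listed in the code are the values consistent with the stated mean (61.5), median (62), mode (67), and sums of squared deviations (188.50 and 3018.5). `statistics.stdev` divides by n - 1 (sample) and `statistics.pstdev` divides by N (population).

```python
import statistics

# Assumed ten Friday closing prices per stock (see the note above).
dis = [56, 56, 57, 58, 61, 63, 63, 67, 67, 67]
wmt = [33, 42, 48, 52, 57, 67, 67, 77, 82, 90]

for name, prices in (("DIS", dis), ("WMT", wmt)):
    value_range = max(prices) - min(prices)
    sample_sd = statistics.stdev(prices)       # divides by n - 1
    population_sd = statistics.pstdev(prices)  # divides by N
    print(f"{name}: range={value_range}, "
          f"s={sample_sd:.2f}, sigma={population_sd:.2f}")
```

The figures 4.58 and 18.31 quoted for the two stocks match the n - 1 version.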

Coefficient of Variation (CV): Describes the standard deviation as a percent of the mean. Used to compare data with different units.

Example: On average, students spend 80 minutes studying per night, with a standard deviation of 15 minutes. Student height averages 66 inches, with a standard deviation of 3.5 inches. Which is more variable, height or study time?

CV (study) = 15/80 × 100% = 18.75%
CV (height) = 3.5/66 × 100% ≈ 5.30%

Study time is more variable relative to its mean than height.
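The comparison can be written as a one-line function. This is just a sketch of the calculation in the example; the function name is an illustrative choice.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation expressed as a percent of the mean."""
    return std_dev / mean * 100

cv_study = coefficient_of_variation(15, 80)    # minutes
cv_height = coefficient_of_variation(3.5, 66)  # inches

print(f"study: {cv_study:.2f}%  height: {cv_height:.2f}%")
more_variable = "study time" if cv_study > cv_height else "height"
print("More variable:", more_variable)
```

Because the units cancel in the ratio, minutes and inches can be compared directly.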

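Summary of formulas

Range = Maximum value - Minimum value
Sample Standard Deviation: \( s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \)
Sample Variance: \( s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \)
Population Standard Deviation: \( \sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} \)
Population Variance: \( \sigma^2 = \frac{\sum (x - \mu)^2}{N} \)
Coefficient of Variation: \( CV = \frac{\text{standard deviation}}{\text{mean}} \times 100\% \)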

How to approximate the sample standard deviation for grouped data

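Sample Grouped Standard Deviation: \( s = \sqrt{\frac{\sum (x - \bar{x})^2 f}{n - 1}} \)
Sample Grouped Variance: \( s^2 = \frac{\sum (x - \bar{x})^2 f}{n - 1} \)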

x    f     xf    x - x̄    (x - x̄)²    (x - x̄)² f
0    10    0     -1.8     3.24        32.40
1    19    19    -0.8     0.64        12.16
2    7     14    0.2      0.04        0.28
3    7     21    1.2      1.44        10.08
4    2     8     2.2      4.84        9.68
5    1     5     3.2      10.24       10.24
6    4     24    4.2      17.64       70.56

n = 50, Σxf = 91, x̄ = 91/50 ≈ 1.8, Σ(x - x̄)² f = 145.40

\( s = \sqrt{145.40 / 49} \approx 1.72 \)

Class     x      f    xf      x - x̄    (x - x̄)²    (x - x̄)² f
0-9       4.5    3    13.5    -25.2    635.04      1905.12
10-19     14.5   5    72.5    -15.2    231.04      1155.2
20-29     24.5   8    196     -5.2     27.04       216.32
30-39     34.5   3    103.5   4.8      23.04       69.12
40-49     44.5   3    133.5   14.8     219.04      657.12
50-59     54.5   4    218     24.8     615.04      2460.16
60-69     64.5   1    64.5    34.8     1211.04     1211.04

n = 27, Σxf = 801.5, x̄ = 801.5/27 ≈ 29.7, Σ(x - x̄)² f ≈ 7674

s² = 7674/26 ≈ 295.16, s ≈ 17.18
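The grouped standard deviation for the class table can be sketched as follows. The function name `grouped_sample_std` is illustrative. The code keeps the unrounded mean (about 29.685) instead of 29.7, so intermediate values differ slightly from the table, but the rounded results agree.

```python
import math

def grouped_sample_std(midpoints, freqs):
    """Return (mean, sample standard deviation) for grouped data."""
    n = sum(freqs)
    mean = sum(x * f for x, f in zip(midpoints, freqs)) / n
    # Sum of squared deviations, each weighted by its class frequency.
    ss = sum((x - mean) ** 2 * f for x, f in zip(midpoints, freqs))
    return mean, math.sqrt(ss / (n - 1))

midpoints = [4.5, 14.5, 24.5, 34.5, 44.5, 54.5, 64.5]
freqs = [3, 5, 8, 3, 3, 4, 1]

mean, s = grouped_sample_std(midpoints, freqs)
print(round(mean, 1), round(s, 2))
```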

In studying the behavior of Old Faithful geyser in Yellowstone National Park, geologists collect data for the time (in minutes) between eruptions. The table below summarizes actual data that were obtained. (a) What is the mean time between eruptions? (b) What is the standard deviation for the time between eruptions?

How to use the Empirical Rule to interpret standard deviation

Data with a symmetric, bell-shaped distribution have the following characteristics:
• About 68% of the data lie within 1 standard deviation of the mean.
• About 95% of the data lie within 2 standard deviations of the mean.
• About 99.7% of the data lie within 3 standard deviations of the mean.

[Figure: bell-shaped curve with the horizontal axis marked from -4 to 4 standard deviations from the mean.]

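The mean value of homes on a street is $125 thousand with a standard deviation of $5 thousand. The data set has a bell-shaped distribution. Estimate the percent of homes between $120 thousand and $135 thousand.

$120 thousand is 1 standard deviation below the mean, and $135 thousand is 2 standard deviations above the mean. About 34% of the data lie within 1 standard deviation on each side of the mean, and about 13.5% lie between 1 and 2 standard deviations above it, so 34% + 34% + 13.5% = 81.5%.

About 81.5% of the homes have a value between $120,000 and $135,000.

[Figure: bell-shaped curve with the horizontal axis marked from 105 to 145 (thousands of dollars) in steps of 5.]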
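As a cross-check, the estimate for the home values can be compared with an exact normal model using `statistics.NormalDist` from the standard library. Treating "bell-shaped" as exactly normal is an assumption made for this sketch, so the two answers are close but not identical.

```python
from statistics import NormalDist

# Assumption: home values (in thousands of dollars) follow a normal model.
homes = NormalDist(mu=125, sigma=5)

# Share of homes between 120 (1 SD below) and 135 (2 SD above the mean).
exact_share = homes.cdf(135) - homes.cdf(120)

# Empirical Rule estimate: 34% + 34% + 13.5%.
rule_share = 0.34 + 0.34 + 0.135

print(f"normal model: {exact_share:.1%}, Empirical Rule: {rule_share:.1%}")
```

The small gap comes from the rounded percentages (68%, 95%) used in the Empirical Rule.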

The distribution of heights of young women aged 18 to 24 is approximately normal with mean 64.5 inches and standard deviation 2.5 inches. (a) What percent of the young women are taller than 64.5 inches? (b) What percent of young women are between 64.5 and 69.5 inches? (c) What percent of heights are less than 59.5 inches?

Suppose it takes you, on average, 20 minutes to drive to school, with a standard deviation of 2 minutes. Suppose a Normal model is appropriate for your driving times. (a) How often will you arrive in less than 22 minutes? (b) How often will it take you more than 26 minutes? (c) How often will it take you between 22 and 28 minutes?

Each portion of the SAT reasoning test is designed to be approximately normal and have an overall mean of 500 and standard deviation of 100. (a) What percent of students will score above 500? (b) What percent of students will score below 400? (c) What percent of students will score between 600 and 800?

Section 2.5 Measures of Position

How to find the quartiles and interquartile range of a data set
How to represent the data using a box-and-whisker plot
How to interpret other fractiles, such as percentiles
How to calculate z-scores

The three quartiles Q1, Q2, and Q3 divide the data into four equal parts.
• Q2 is the same as the median.
• Q1 is the median of the data below Q2.
• Q3 is the median of the data above Q2.

You are managing a store. The average sale for each of 27 randomly selected days in the last year is given. Find Q1, Q2, and Q3.

The data in ranked order (n = 27) are:

Lower half:   17 19 20 23 27 28 30 33 35 37 37 38 39
Median:       42
Upper half:   42 43 43 44 45 45 45 46 47 48 48 51 55

The median is Q2 = 42, Q1 = 30, and Q3 = 45. The interquartile range is Q3 - Q1 = 45 - 30 = 15.

A box-and-whisker plot uses five key values to describe a set of data: Q1, Q2, and Q3, the minimum value, and the maximum value.

Five-Number Summary
Minimum value        17
First quartile Q1    30
Median Q2            42
Third quartile Q3    45
Maximum value        55

Interquartile Range = 45 - 30 = 15

[Figure: box-and-whisker plot on an axis from 15 to 55. The whiskers end at 17 and 55; the box runs from 30 to 45 with the median line at 42.]

The data in ranked order (n = 26) are:

Lower half:   51 59 65 67 72 73 73 73 75 83 85 85 86
Upper half:   88 88 89 90 91 92 93 94 96 97 98 99 100

Five-Number Summary
Minimum value        51
First quartile Q1    73
Median Q2            87
Third quartile Q3    93
Maximum value        100

Interquartile Range = 93 - 73 = 20

[Figure: box-and-whisker plot on an axis from 50 to 100. The whiskers end at 51 and 100; the box runs from 73 to 93 with the median line at 87.]
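The median-of-halves method used in both quartile examples can be sketched as below. `five_number_summary` is an illustrative helper, not a library function, and it expects at least two values. Other quartile conventions, such as the ones available in `statistics.quantiles`, may give slightly different values for Q1 and Q3.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]    # values below the median position
    upper = data[-half:]   # values above it (middle value skipped when n is odd)
    return (data[0], statistics.median(lower), statistics.median(data),
            statistics.median(upper), data[-1])

sales = [17, 19, 20, 23, 27, 28, 30, 33, 35, 37, 37, 38, 39, 42,
         42, 43, 43, 44, 45, 45, 45, 46, 47, 48, 48, 51, 55]
scores = [51, 59, 65, 67, 72, 73, 73, 73, 75, 83, 85, 85, 86,
          88, 88, 89, 90, 91, 92, 93, 94, 96, 97, 98, 99, 100]

for name, values in (("sales", sales), ("scores", scores)):
    low, q1, q2, q3, high = five_number_summary(values)
    print(name, (low, q1, q2, q3, high), "IQR =", q3 - q1)
```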

Percentiles divide the data into 100 parts. There are 99 percentiles: P1, P2, P3, …, P99.

Example: If a ten-year-old boy is at the 75th percentile for weight, the boy's weight is greater than that of 75% of all ten-year-old boys.

P50 = Q2 = median
P25 = Q1
P75 = Q3

The standard score, or z-score, represents the number of standard deviations that a data value, x, falls from the mean:

\( z = \frac{x - \mu}{\sigma} \)

Example: The test scores for a civil service exam have a mean of 152 and a standard deviation of 7. Find the z-score for a person with a score of: (a) 161 (b) 148 (c) 152

Mean: 152 and standard deviation: 7

(a) z = (161 - 152)/7 ≈ 1.29. A value of x = 161 is 1.29 standard deviations above the mean.
(b) z = (148 - 152)/7 ≈ -0.57. A value of x = 148 is 0.57 standard deviations below the mean.
(c) z = (152 - 152)/7 = 0. A value of x = 152 is equal to the mean.

What does a z-score of 1.29 mean? If the scores are approximately normally distributed, it means that 90.15% of the people scored 161 or below on their civil service exam, or equivalently that 9.85% of the people scored 161 or higher.
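The z-scores and the 90.15% figure can be reproduced with a few lines of Python. The sketch assumes the exam scores follow a normal model, which is what the percentage interpretation relies on, and it rounds z to two decimals first, as a printed z-table would. `z_score` is an illustrative helper name.

```python
from statistics import NormalDist

def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev

mean, std_dev = 152, 7
for x in (161, 148, 152):
    print(x, round(z_score(x, mean, std_dev), 2))

# Share of scores at or below 161, using z rounded to two decimals.
z = round(z_score(161, mean, std_dev), 2)
below = NormalDist().cdf(z)  # standard normal cumulative probability
print(f"{below:.2%} at or below, {1 - below:.2%} at or above")
```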
