Descriptive Statistics


Chapter 2 Descriptive Statistics
Larson/Farber 4th ed.

Useful screencasts and videos:
Creating a frequency distribution by hand: http://screencast.com/t/OGY3ZjJj
Using Excel 2007 to create frequency distributions: http://screencast.com/t/tkMv2FMWhJe
Using Excel 2007 to create a histogram: http://screencast.com/t/L0u9UI2eI

Section 2.1 Frequency Distributions and Their Graphs

Frequency Distribution: Terminology
A frequency distribution is a table that shows classes or intervals of data with a count of the number of entries in each class. The frequency, f, of a class is the number of data entries in the class.
Class      Frequency, f
1 – 5      5
6 – 10     8
11 – 15    6
16 – 20
21 – 25
26 – 30    4

Determining the Relative Frequency
The relative frequency of a class is the portion or percentage of the data that falls in that class: relative frequency = f / n, where n is the total number of data entries (here n = 50).
Class      Frequency, f   Relative frequency
7 – 18     6              6/50 = 0.12
19 – 30    10             10/50 = 0.20
31 – 42    13             13/50 = 0.26

Example: Constructing a Frequency Distribution
The following sample data set lists the number of minutes 50 Internet subscribers spent on the Internet during their most recent session. Construct a frequency distribution that has seven classes.
50 40 41 17 11  7 22 44 28 21
19 23 37 51 54 42 86 41 78 56
72 56 17  7 69 30 80 56 29 33
46 31 39 20 18 29 34 59 73 77
36 39 30 62 54 67 39 31 53 44
Video on computing a frequency distribution using this data: http://screencast.com/t/OGY3ZjJj

Expanded Frequency Distribution
Class      Frequency, f   Midpoint   Relative frequency   Cumulative frequency
7 – 18     6              12.5       0.12                 6
19 – 30    10             24.5       0.20                 16
31 – 42    13             36.5       0.26                 29
43 – 54    8              48.5       0.16                 37
55 – 66    5              60.5       0.10                 42
67 – 78    6              72.5       0.12                 48
79 – 90    2              84.5       0.04                 50
Σf = 50
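The expanded table can be reproduced with a short Python sketch. It is only an illustration under stated assumptions: the data are whole numbers, the class width is taken as the whole number just above range / number of classes, and the function name `frequency_distribution` is hypothetical rather than anything from the textbook.

```python
# Minutes online for the 50 Internet subscribers in the example.
minutes = [
    50, 40, 41, 17, 11, 7, 22, 44, 28, 21,
    19, 23, 37, 51, 54, 42, 86, 41, 78, 56,
    72, 56, 17, 7, 69, 30, 80, 56, 29, 33,
    46, 31, 39, 20, 18, 29, 34, 59, 73, 77,
    36, 39, 30, 62, 54, 67, 39, 31, 53, 44,
]


def frequency_distribution(data, num_classes):
    """Return one dict per class with limits, f, midpoint, relative and cumulative f."""
    low = min(data)
    # Assumption: integer data; use the whole number just above range / classes
    # so that the last class always reaches the maximum entry.
    width = (max(data) - low) // num_classes + 1
    n = len(data)
    rows = []
    cumulative = 0
    for i in range(num_classes):
        lower = low + i * width
        upper = lower + width - 1  # one less than the next lower class limit
        f = sum(1 for x in data if lower <= x <= upper)
        cumulative += f
        rows.append({
            "class": (lower, upper),
            "f": f,
            "midpoint": (lower + upper) / 2,
            "relative": f / n,
            "cumulative": cumulative,
        })
    return rows


rows = frequency_distribution(minutes, 7)
for row in rows:
    print(row["class"], row["f"], row["midpoint"],
          round(row["relative"], 2), row["cumulative"])
```

With seven classes the sketch gives a class width of 12 and lower limits 7, 19, 31, and so on, matching the classes in the table.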

Graphs of Frequency Distributions: Frequency Histogram
A frequency histogram is a bar graph that represents the frequency distribution. The horizontal scale is quantitative and measures the data values. The vertical scale measures the frequencies of the classes. Consecutive bars must touch.
[Figure: histogram with data values on the horizontal axis and frequency on the vertical axis]

Solution: Frequency Histogram (using Midpoints)

Graphs of Frequency Distributions: Relative Frequency Histogram
A relative frequency histogram has the same shape and the same horizontal scale as the corresponding frequency histogram. The vertical scale measures the relative frequencies, not the frequencies.
[Figure: histogram with data values on the horizontal axis and relative frequency on the vertical axis]

Solution: Relative Frequency Histogram
[Figure: relative frequency histogram with class boundaries 6.5, 18.5, 30.5, 42.5, 54.5, 66.5, 78.5, and 90.5 on the horizontal axis]
From this graph you can see that 20% of Internet subscribers spent between 18.5 minutes and 30.5 minutes online.

Section 2.2 More Graphs and Displays

Graphing Quantitative Data Sets: Stem-and-Leaf Plot
In a stem-and-leaf plot, each number is separated into a stem and a leaf. For example, the entry 26 has stem 2 and leaf 6. The plot is similar to a histogram but still contains the original data values.
Data: 21, 25, 25, 26, 27, 28, 30, 36, 36, 45
2 | 1 5 5 6 7 8
3 | 0 6 6
4 | 5
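The splitting of each entry into a stem and a leaf can be sketched in Python. This is a minimal illustration that assumes two-digit whole-number data (tens digit as stem, ones digit as leaf); the function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

data = [21, 25, 25, 26, 27, 28, 30, 36, 36, 45]


def stem_and_leaf(values):
    """Group values by tens digit (stem); the ones digits are the leaves."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)


plot = stem_and_leaf(data)
for stem in sorted(plot):
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))
```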

Graphing Qualitative Data Sets: Pie Chart
In a pie chart, a circle is divided into sectors that represent categories. The area of each sector is proportional to the frequency of each category.

Section 2.3 Measures of Central Tendency

Measures of Central Tendency
A measure of central tendency is a value that represents a typical, or central, entry of a data set. The most common measures of central tendency are the mean, the median, and the mode.

Measure of Central Tendency: Mean
The mean (average) is the sum of all the data entries divided by the number of entries.
Sigma notation: Σx means add all of the data entries (x) in the data set.
Population mean: μ = Σx / N, where N is the number of entries in the population.
Sample mean: x̄ = Σx / n, where n is the number of entries in the sample.

Example: Finding a Sample Mean
The prices (in dollars) for a sample of roundtrip flights from Chicago, Illinois to Cancun, Mexico are listed. What is the mean price of the flights?
872 432 397 427 388 782 397

Solution: Finding a Sample Mean
872 432 397 427 388 782 397
The sum of the flight prices is Σx = 872 + 432 + 397 + 427 + 388 + 782 + 397 = 3695.
To find the mean price, divide the sum of the prices by the number of prices in the sample: x̄ = Σx / n = 3695 / 7 ≈ 527.9.
The mean price of the flights is about $527.90.
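The same arithmetic can be checked with a few lines of Python. This is just a sketch of the sum-then-divide calculation; the variable names are illustrative.

```python
prices = [872, 432, 397, 427, 388, 782, 397]

total = sum(prices)      # sum of the data entries
n = len(prices)          # number of entries in the sample
sample_mean = total / n  # sample mean = sum / n

print(total, n, round(sample_mean, 2))  # prints: 3695 7 527.86
```

Rounded to one decimal place the result is 527.9, which is the "about $527.90" quoted in the solution.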

Measure of Central Tendency: Median
The median is the value that lies in the middle of the data when the data set is ordered. It measures the center of an ordered data set by dividing it into two equal parts.
If the data set has an odd number of entries, the median is the middle data entry.
If the data set has an even number of entries, the median is the mean of the two middle data entries.

Example: Finding the Median
The prices (in dollars) for a sample of roundtrip flights from Chicago, Illinois to Cancun, Mexico are listed. Find the median of the flight prices.
872 432 397 427 388 782 397

Solution: Finding the Median
872 432 397 427 388 782 397
First order the data.
388 397 397 427 432 782 872
Because there are seven entries (an odd number), the median is the middle, or fourth, data entry. The median price of the flights is $427.

Example: Finding the Median
The flight priced at $432 is no longer available. What is the median price of the remaining flights?
872 397 427 388 782 397

Solution: Finding the Median
872 397 427 388 782 397
First order the data.
388 397 397 427 782 872
Because there are six entries (an even number), the median is the mean of the two middle entries: (397 + 427) / 2 = 412. The median price of the flights is $412.
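Both median cases (odd and even number of entries) can be sketched in one small Python function. The function below is an illustrative implementation of the rule stated on the slides, not code from the textbook; Python's standard `statistics.median` follows the same rule.

```python
def median(data):
    """Middle entry of the ordered data, or the mean of the two middle entries."""
    ordered = sorted(data)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]  # odd count: the single middle entry
    return (ordered[middle - 1] + ordered[middle]) / 2  # even count


prices = [872, 432, 397, 427, 388, 782, 397]
remaining = prices.copy()
remaining.remove(432)  # the $432 flight is no longer available

median_all = median(prices)
median_remaining = median(remaining)
print(median_all, median_remaining)  # prints: 427 412.0
```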

Measure of Central Tendency: Mode
The mode is the data entry that occurs with the greatest frequency. If no entry is repeated, the data set has no mode. If two entries occur with the same greatest frequency, each entry is a mode and the data set is bimodal.

Example: Finding the Mode
The prices (in dollars) for a sample of roundtrip flights from Chicago, Illinois to Cancun, Mexico are listed. Find the mode of the flight prices.
872 432 397 427 388 782 397

Solution: Finding the Mode
872 432 397 427 388 782 397
Ordering the data helps to find the mode.
388 397 397 427 432 782 872
The entry of 397 occurs twice, whereas the other data entries occur only once. The mode of the flight prices is $397.
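The counting behind the mode can be sketched in Python. The helper `modes` is hypothetical and follows the convention stated on the slides: a data set with no repeated entry has no mode, and ties for the greatest frequency are all reported. It assumes the data set is not empty.

```python
from collections import Counter


def modes(data):
    """Return the mode(s) of data, or an empty list if no entry is repeated."""
    counts = Counter(data)
    highest = max(counts.values())
    if highest == 1:
        return []  # slide convention: no repeated entry means no mode
    return sorted(x for x, count in counts.items() if count == highest)


prices = [872, 432, 397, 427, 388, 782, 397]
print(modes(prices))  # prints: [397]
```

Note that the standard library's `statistics.multimode` would instead return every entry when nothing repeats, which is why a small custom helper is used here.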

Example: Finding the Mode
At a political debate a sample of audience members was asked to name the political party to which they belong. Their responses are shown in the table. What is the mode of the responses?
Political Party     Frequency, f
Democrat            34
Republican          56
Other               21
Did not respond     9

Solution: Finding the Mode
Political Party     Frequency, f
Democrat            34
Republican          56
Other               21
Did not respond     9
The mode is Republican (the response occurring with the greatest frequency). In this sample there were more Republicans than people of any other single affiliation.
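For qualitative data given as a frequency table, the mode is simply the category with the largest count. A brief Python sketch, with the table held in an ordinary dictionary (the variable names are illustrative):

```python
responses = {
    "Democrat": 34,
    "Republican": 56,
    "Other": 21,
    "Did not respond": 9,
}

# The mode is the category whose frequency is greatest.
mode_party = max(responses, key=responses.get)
print(mode_party, responses[mode_party])  # prints: Republican 56
```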

Section 2.4 Measures of Variation

Deviation, Variance, and Standard Deviation
The deviation of an entry x is the difference between the data entry, x, and the mean of the data set.
Population data set: Deviation of x = x – μ
Sample data set: Deviation of x = x – x̄

Example: Finding the Deviation
A corporation hired 10 graduates. The starting salaries for each graduate are shown. Find the deviation of each starting salary from the mean.
Starting salaries (1000s of dollars): 41 38 39 45 47 41 44 41 37 42
Solution: First determine the mean starting salary: μ = Σx / N = 415 / 10 = 41.5, or $41,500.

Solution: Finding the Deviation
Determine the deviation for each data entry.
Salary ($1000s), x    Deviation: x – μ
41                    41 – 41.5 = –0.5
38                    38 – 41.5 = –3.5
39                    39 – 41.5 = –2.5
45                    45 – 41.5 = 3.5
47                    47 – 41.5 = 5.5
41                    41 – 41.5 = –0.5
44                    44 – 41.5 = 2.5
41                    41 – 41.5 = –0.5
37                    37 – 41.5 = –4.5
42                    42 – 41.5 = 0.5
Σx = 415              Σ(x – μ) = 0
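The deviation column, and the fact that the deviations sum to zero, can be checked with a short Python sketch (variable names are illustrative):

```python
salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]  # in $1000s

mu = sum(salaries) / len(salaries)       # population mean
deviations = [x - mu for x in salaries]  # deviation of each entry: x - mu

print(mu)               # prints: 41.5
print(deviations)
print(sum(deviations))  # prints: 0.0
```

Because the deviations always sum to zero, they cannot be averaged directly to measure spread, which is why the variance squares them first.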

Deviation, Variance, and Standard Deviation
Sum of squares: SSx = Σ(x – μ)²
Population variance: σ² = Σ(x – μ)² / N
Population standard deviation: σ = √σ² = √(Σ(x – μ)² / N)

Deviation, Variance, and Standard Deviation
Sample variance: s² = Σ(x – x̄)² / (n – 1)
Sample standard deviation: s = √s² = √(Σ(x – x̄)² / (n – 1))
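As a worked illustration, the sketch below applies both sets of formulas to the ten starting salaries from the deviation example. Treating the same ten values first as a population and then as a sample is an assumption made here only to contrast the divisors N and n – 1; it is not a result quoted from the slides.

```python
import math

salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]  # in $1000s
count = len(salaries)
mean = sum(salaries) / count

# Sum of squares: add up the squared deviations from the mean.
ss_x = sum((x - mean) ** 2 for x in salaries)

# Population formulas divide by N.
pop_variance = ss_x / count
pop_sd = math.sqrt(pop_variance)

# Sample formulas divide by n - 1.
sample_variance = ss_x / (count - 1)
sample_sd = math.sqrt(sample_variance)

print(ss_x)                                           # prints: 88.5
print(round(pop_variance, 2), round(pop_sd, 2))       # prints: 8.85 2.97
print(round(sample_variance, 2), round(sample_sd, 2)) # prints: 9.83 3.14
```

The sample versions come out slightly larger because dividing by n – 1 rather than N inflates the result a little.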

Example: Using Technology to Find the Standard Deviation
Sample office rental rates (in dollars per square foot per year) for Miami’s central business district are shown in the table. Use a calculator or a computer to find the mean rental rate and the sample standard deviation. (Adapted from: Cushman & Wakefield Inc.)
Office Rental Rates
35.00 33.50 37.00 23.75 26.50 31.25
36.50 40.00 32.00 39.25 37.50 34.75
37.75 37.25 36.75 27.00 35.75 26.00
29.00 40.50 24.50 33.00 38.00

Solution: Using Technology to Find the Standard Deviation
[Figure: technology output reporting the sample mean and the sample standard deviation]
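One way to "use a computer" here is Python's standard `statistics` module. The sketch below is an assumption about tooling, not the calculator or software shown on the original slide. The rates are copied exactly as they appear in this transcript (23 values), so the printed result may differ from the textbook's answer if the transcript's list is incomplete.

```python
import statistics

# Office rental rates as listed in this transcript (dollars per square foot per year).
rates = [
    35.00, 33.50, 37.00, 23.75, 26.50, 31.25,
    36.50, 40.00, 32.00, 39.25, 37.50, 34.75,
    37.75, 37.25, 36.75, 27.00, 35.75, 26.00,
    29.00, 40.50, 24.50, 33.00, 38.00,
]

mean_rate = statistics.mean(rates)
sample_sd = statistics.stdev(rates)  # sample standard deviation (divides by n - 1)

print(f"n = {len(rates)}, mean = {mean_rate:.2f}, s = {sample_sd:.2f}")
```

`statistics.stdev` uses the n – 1 divisor; `statistics.pstdev` would be the population counterpart.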

Interpreting Standard Deviation
Standard deviation is a measure of the typical amount an entry deviates from the mean. The more the entries are spread out, the greater the standard deviation.

Interpreting Standard Deviation: Empirical Rule (68 – 95 – 99.7 Rule)
For data with a (symmetric) bell-shaped distribution, the standard deviation has the following characteristics:
About 68% of the data lie within one standard deviation of the mean.
About 95% of the data lie within two standard deviations of the mean.
About 99.7% of the data lie within three standard deviations of the mean.

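Interpreting Standard Deviation: Empirical Rule (68 – 95 – 99.7 Rule)
[Figure: bell-shaped curve marked at one, two, and three standard deviations on each side of the mean]
68% within 1 standard deviation (34% on each side of the mean)
95% within 2 standard deviations (a further 13.5% on each side)
99.7% within 3 standard deviations (a further 2.35% on each side)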

Example: Using the Empirical Rule
In a survey conducted by the National Center for Health Statistics, the sample mean height of women in the United States (ages 20-29) was 64 inches, with a sample standard deviation of 2.71 inches. Estimate the percent of the women whose heights are between 64 inches and 69.42 inches.

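Solution: Using the Empirical Rule
Because the distribution is bell-shaped, you can use the Empirical Rule. The heights 55.87, 58.58, 61.29, 64, 66.71, 69.42, and 72.13 inches mark the mean and one, two, and three standard deviations on either side of it, so 69.42 inches is two standard deviations above the mean.
34% + 13.5% = 47.5% of women are between 64 and 69.42 inches tall.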
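The band-adding step can be sketched in Python. The helper `empirical_percent` is hypothetical and deliberately narrow: it assumes a bell-shaped distribution and that both endpoints lie a whole number (0 to 3) of standard deviations from the mean, which is the only situation the Empirical Rule covers.

```python
# Percent of the data in each one-standard-deviation band on one side of the mean:
# 0 to 1 sd, 1 to 2 sd, 2 to 3 sd (from the Empirical Rule).
BAND_PERCENTS = [34.0, 13.5, 2.35]


def empirical_percent(mean, sd, low, high):
    """Estimate the percent of bell-shaped data lying between low and high."""

    def signed_area(value):
        z = (value - mean) / sd
        k = round(z)  # signed number of standard deviations from the mean
        if abs(k) > 3 or abs(z - k) > 1e-9:
            raise ValueError("endpoint must be 0 to 3 whole standard deviations from the mean")
        area = sum(BAND_PERCENTS[:abs(k)])
        return area if k >= 0 else -area

    return signed_area(high) - signed_area(low)


heights = empirical_percent(64, 2.71, 64, 69.42)
print(heights)  # prints: 47.5
```

The same helper handles intervals that straddle the mean, for example one standard deviation below to two above, by adding the bands on both sides.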

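Important Formulas
Range = Maximum value – Minimum value
Population variance: σ² = Σ(x – μ)² / N
Population standard deviation: σ = √σ² = √(Σ(x – μ)² / N)
Sample variance: s² = Σ(x – x̄)² / (n – 1)
Sample standard deviation: s = √s² = √(Σ(x – x̄)² / (n – 1))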

Using the Empirical Rule
1. The mean value of homes on a street is $125 thousand with a standard deviation of $5 thousand. The data set has a bell-shaped distribution. Estimate the percent of homes valued between $120 thousand and $135 thousand.
[Figure: bell-shaped curve with the horizontal axis marked from 105 to 145 in steps of 5]
$120 thousand is 1 standard deviation below the mean and $135 thousand is 2 standard deviations above the mean, so the estimate is 68% + 13.5% = 81.5%.

2. An instructor recorded the average number of absences for his students in one semester. For a random sample the data are: 2 4 2 0 40 2 4 3 6 Calculate the mean, the median, and the mode, using the appropriate notation. [Hint: is this a sample or a population?]

3. Find the class width:
(A) 3 (B) 4 (C) 5 (D) 19
Class      Frequency, f
1 – 5      21
6 – 10     16
11 – 15    28
16 – 20    13
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley

4. The mean annual automobile insurance premium is $950, with a standard deviation of $175. The data set has a bell-shaped distribution. Estimate the percent of premiums that are between $600 and $1300.
(A) 68% (B) 75% (C) 95% (D) 99.7%

Answers
1. 81.5% of the homes have a value between $120 thousand and $135 thousand.
2. x̄ = Σx / n = 63 / 9 = 7, median = 3, mode = 2. This is a sample, so these are all sample statistics.
3. (C) 5
4. (C) 95%