DESCRIBING DISTRIBUTION NUMERICALLY

DESCRIBING DISTRIBUTION NUMERICALLY
MEASURES OF CENTER:
MIDRANGE = (MAX + MIN) / 2
MEDIAN = THE MIDDLE VALUE, WITH HALF OF THE DATA ABOVE IT AND HALF BELOW IT
MEAN = (SUM OF DATA) / (NUMBER OF OBSERVATIONS n)
EXAMPLE:
DATA: 45, 46, 49, 35, 76, 80, 89, 94, 37, 61, 62, 64, 68, 56, 57, 57, 59, 71, 72 (n = 19)
SORTED DATA: 35, 37, 45, 46, 49, 56, 57, 57, 59, 61, 62, 64, 68, 71, 72, 76, 80, 89, 94
MIDRANGE = (94 + 35) / 2 = 64.5
MEDIAN = 61
MEAN = (35 + 37 + … + 94) / 19 = 1178 / 19 = 62
NOTE: FOR SKEWED DISTRIBUTIONS THE MEDIAN IS A BETTER MEASURE OF THE CENTER THAN THE MEAN.
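The three measures of center can be checked with a few lines of Python. This is a minimal sketch using the standard `statistics` module; the helper name `midrange` is an illustrative choice, not something defined in the slides.

```python
from statistics import mean, median

# The 19 observations from the example above.
data = [45, 46, 49, 35, 76, 80, 89, 94, 37, 61,
        62, 64, 68, 56, 57, 57, 59, 71, 72]

def midrange(values):
    """Average of the largest and smallest values (hypothetical helper)."""
    return (max(values) + min(values)) / 2

print("n        =", len(data))
print("midrange =", midrange(data))   # (94 + 35) / 2
print("median   =", median(data))     # middle (10th) value of the sorted data
print("mean     =", mean(data))       # 1178 / 19
```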

MEASURES OF SPREAD
RANGE = MAX - MIN
INTERQUARTILE RANGE (IQR) = Q3 - Q1
Q3 = UPPER QUARTILE = MEDIAN OF UPPER HALF OF DATA (INCLUDE THE MEDIAN IF n IS ODD)
Q1 = LOWER QUARTILE = MEDIAN OF LOWER HALF OF DATA (INCLUDE THE MEDIAN IF n IS ODD)
VARIANCE (later)
STANDARD DEVIATION (later)

Quartiles EXAMPLE: (odd number of observations, 19)
Note: Include the median in the calculation of both quartiles.
Median = 61
UPPER HALF: 35 37 45 46 49 56 57 57 59 [61 62 64 68 71 72 76 80 89 94]
Q3 = (71 + 72) / 2 = 71.5
LOWER HALF: [35 37 45 46 49 56 57 57 59 61] 62 64 68 71 72 76 80 89 94
Q1 = (49 + 56) / 2 = 52.5
IQR = 71.5 - 52.5 = 19

Quartiles EXAMPLE: (even number of observations, 18)
DATA: 35 37 45 46 49 56 57 57 59 61 62 64 68 71 72 76 80 89
Median = (59 + 61) / 2 = 60 (average of the middle two numbers)
UPPER HALF: 35 37 45 46 49 56 57 57 59 [61 62 64 68 71 72 76 80 89]
Q3 = 71
LOWER HALF: [35 37 45 46 49 56 57 57 59] 61 62 64 68 71 72 76 80 89
Q1 = 49
IQR = 71 - 49 = 22
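The quartile rule used in these two examples (median of each half, with the overall median included in both halves when n is odd) can be sketched in Python as follows. The function name `quartiles` is an assumption made for illustration. Other textbooks and software packages use different quartile conventions, so their results may not match these values.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) by the median-of-halves rule.

    When n is odd, the overall median is counted in both halves,
    following the rule stated in the slides.
    """
    s = sorted(values)
    n = len(s)
    half = (n + 1) // 2      # size of each half; includes the median if n is odd
    lower = s[:half]
    upper = s[n - half:]
    return median(lower), median(s), median(upper)

# Odd example: 19 observations.
odd = [35, 37, 45, 46, 49, 56, 57, 57, 59, 61,
       62, 64, 68, 71, 72, 76, 80, 89, 94]
# Even example: the same data without the largest value (18 observations).
even = odd[:-1]

for name, sample in (("odd", odd), ("even", even)):
    q1, med, q3 = quartiles(sample)
    print(name, "Q1 =", q1, "median =", med, "Q3 =", q3, "IQR =", q3 - q1)
```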

5-NUMBER SUMMARY: THE 5-NUMBER SUMMARY OF A DISTRIBUTION REPORTS ITS MEDIAN, QUARTILES, AND EXTREMES (MINIMUM AND MAXIMUM).
MAX = 94
Q3 = 71.5
MEDIAN = 61
Q1 = 52.5
MIN = 35
OUTLIERS: DATA VALUES THAT LIE BEYOND THE FENCES
IQR = Q3 - Q1 = 19
UPPER FENCE = Q3 + 1.5 x IQR = 71.5 + 1.5 x 19 = 100
LOWER FENCE = Q1 - 1.5 x IQR = 52.5 - 1.5 x 19 = 24
IN THE EXAMPLE CONSIDERED ABOVE, ALL VALUES LIE BETWEEN 24 AND 100, SO THERE ARE NO OUTLIERS.
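As a rough check of the summary and the fences, the sketch below recomputes them for the 19-value example. The function names `five_number_summary` and `fences` are hypothetical, and the quartiles follow the median-of-halves rule from the slides rather than any particular library's default.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), median included in both halves if n is odd."""
    s = sorted(values)
    n = len(s)
    half = (n + 1) // 2
    return s[0], median(s[:half]), median(s), median(s[n - half:]), s[-1]

def fences(q1, q3, k=1.5):
    """Return (lower fence, upper fence) at k IQRs beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

data = [35, 37, 45, 46, 49, 56, 57, 57, 59, 61,
        62, 64, 68, 71, 72, 76, 80, 89, 94]

minimum, q1, med, q3, maximum = five_number_summary(data)
lower_fence, upper_fence = fences(q1, q3)
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print("five-number summary:", (minimum, q1, med, q3, maximum))
print("fences:", (lower_fence, upper_fence))
print("outliers:", outliers)
```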

BOXPLOTS
WHENEVER WE HAVE A 5-NUMBER SUMMARY OF A (QUANTITATIVE) VARIABLE, WE CAN DISPLAY THE INFORMATION IN A BOXPLOT.
THE CENTER OF A BOXPLOT IS A BOX THAT SHOWS THE MIDDLE HALF OF THE DATA, BETWEEN THE QUARTILES. THE HEIGHT OF THE BOX IS EQUAL TO THE IQR.
IF THE MEDIAN IS ROUGHLY CENTERED BETWEEN THE QUARTILES, THEN THE MIDDLE HALF OF THE DATA IS ROUGHLY SYMMETRIC. IF IT IS NOT CENTERED, THE DISTRIBUTION IS SKEWED.
THE MAIN USE FOR BOXPLOTS IS TO COMPARE GROUPS.

BOXPLOTS

Examples:
1. Here are the costs of 10 electric smoothtop ranges rated very good or excellent by Consumer Reports in August 2002:
850 900 1400 1200 1050 1000 750 1250 1050 565
Find the following statistics by hand:
a) mean
b) median and quartiles
c) range and IQR

VARIANCE = “AVERAGE” SQUARED DEVIATION FROM THE MEAN
DEVIATION = (each data value) - mean
For the 19 observations above (mean = 62), the squared deviations sum to 4658:
VARIANCE = 4658 / (19 - 1) = 258.8
STANDARD DEVIATION = SQUARE ROOT (VARIANCE) = 16.1

Example 1 solution
Step 1: Sort the data: 565 750 850 900 1000 1050 1050 1200 1250 1400
Mean = 1001.5
Median = 1025
Q1 = 850
Q3 = 1200
Range = 835
IQR = 350

Computing the Variance
1. Deviation = (each data value) - mean
2. Squared deviation = ((each data value) - mean)^2
3. Sum all squared deviations
4. Variance = (sum of all squared deviations) / (n - 1), where n is the number of observations
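Written as a formula, the steps above correspond to the usual sample variance. The symbols are notational assumptions, not taken from the slides: x_i for the data values, x-bar for their mean, s^2 for the variance, and s for the standard deviation.

```latex
\[
  s^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2 ,
  \qquad
  s = \sqrt{s^2}
\]
```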

Variance Example:
Data    Squared Deviation
35      54.76
37      29.16
45      6.76
46      12.96
49      43.56
Mean = 42.4
Sum of squared deviations = 147.2
Variance = 147.2 / 4 = 36.8
Std deviation = square root of variance = 6.07
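The same calculation can be reproduced step by step in Python. This is only a sketch: the function name `sample_variance` is an illustrative choice, and the third data value (45) is inferred from the stated mean of 42.4 and the squared deviation 6.76.

```python
from math import sqrt
from statistics import mean

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    m = mean(values)
    squared_deviations = [(x - m) ** 2 for x in values]
    return sum(squared_deviations) / (len(values) - 1)

data = [35, 37, 45, 46, 49]   # 45 is inferred, see the note above
m = mean(data)

# Print the table of squared deviations.
for x in data:
    print(x, round((x - m) ** 2, 2))

var = sample_variance(data)
sd = sqrt(var)
print("mean =", m)
print("variance =", round(var, 1))
print("std dev  =", round(sd, 2))
```

The standard library's `statistics.variance` and `statistics.stdev` also use the n - 1 divisor, so they should agree with this hand calculation.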

Some Remarks
If the distribution is skewed, report the median and IQR. The mean and median can differ substantially. You may want to include the mean and standard deviation as well, but you should point out why the mean and the median differ.
If the histogram is symmetric, report the mean and the standard deviation, and possibly the median and IQR.