Describing Distributions Numerically

Presentation transcript:

Chapter 3: Describing Distributions Numerically

Describing the Distribution
Center: median, mean
Spread: range, interquartile range, standard deviation

Median
Literally the middle number (data value).
When n (the number of observations) is odd:
Order the data from smallest to largest.
The median is the middle number on the list, the (n+1)/2-th number from the smallest value.
Ex: If n = 11, the median is the (11+1)/2 = 6th number from the smallest value.
Ex: If n = 37, the median is the (37+1)/2 = 19th number from the smallest value.

Example – August Temps
High temperatures for Des Moines, Iowa, taken from the first 13 days of August 2005:
71 76 81 81 85 86 90 90 91 93 93 96 96
Remember to order the values if they aren’t already in order!
13 observations: (13+1)/2 = 7th observation from the bottom.
Median = 90

Median
When n is even:
Order the data from smallest to largest.
The median is the average of the two middle numbers; (n+1)/2 falls halfway between their positions.
Ex: If n = 10, (10+1)/2 = 5.5, so the median is the average of the 5th and 6th numbers from the smallest value.

Example – Yankees
Scores of the last 10 games:
2 3 3 5 5 5 6 7 7 10
Remember to order the values if they aren’t already in order!
10 observations: (10+1)/2 = 5.5, so average the 5th and 6th observations from the bottom.
Median = 5
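The position rule for the median can be checked with a short Python sketch. The function name `median` and the variable names are illustrative choices, not part of the slides; the data are the two examples above.

```python
def median(values):
    """Median via the (n+1)/2 position rule from the slides."""
    ordered = sorted(values)          # always order the data first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # n odd: the single middle value, position (n+1)/2 counting from 1
        return ordered[mid]
    # n even: average of the two values on either side of position (n+1)/2
    return (ordered[mid - 1] + ordered[mid]) / 2


august_temps = [71, 76, 81, 81, 85, 86, 90, 90, 91, 93, 93, 96, 96]
yankees_scores = [2, 3, 3, 5, 5, 5, 6, 7, 7, 10]

print(median(august_temps))    # 90
print(median(yankees_scores))  # 5.0
```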

Mean
The ordinary average: add up all observations and divide by the number of observations.
In the formula, n is the number of observations and y1, y2, y3, …, yn are the values.

Mean
\[ \bar{y} = \frac{y_1 + y_2 + \cdots + y_n}{n} = \frac{1}{n}\sum_{i=1}^{n} y_i \]

Example – Vikings (as of 1/9)
Find the mean of the scores (17 values):
13 14 16 18 20 22 23 27 27 28 28 31 31 31 34 35 38

Example – Colts (as of 1/9)
Find the mean of the scores (17 values):
14 20 23 24 24 24 31 31 34 35 35 41 41 45 49 49 51
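As a check on the two exercises, the mean can be computed directly from its definition. This is a minimal sketch; the helper name `mean` and the list names are assumptions made for illustration.

```python
def mean(values):
    """Ordinary average: sum of the observations divided by their count."""
    return sum(values) / len(values)


vikings = [13, 14, 16, 18, 20, 22, 23, 27, 27, 28, 28, 31, 31, 31, 34, 35, 38]
colts = [14, 20, 23, 24, 24, 24, 31, 31, 34, 35, 35, 41, 41, 45, 49, 49, 51]

print(round(mean(vikings), 2))  # 25.65  (436 / 17)
print(round(mean(colts), 2))    # 33.59  (571 / 17)
```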

Mean vs. Median
Median = middle number. Mean = value where the histogram balances.
Mean and median are similar when the data are symmetric.
Mean and median differ when the data are skewed or there are outliers.

Mean vs. Median
The mean is influenced by unusually high or unusually low values.
Example: Income in a small town of 6 people
$25,000 $27,000 $29,000 $35,000 $37,000 $38,000
**The mean income is $31,833
**The median income is $32,000

Mean vs. Median
Bill Gates moves to town
$25,000 $27,000 $29,000 $35,000 $37,000 $38,000 $40,000,000
**The mean income is $5,741,571
**The median income is $35,000
The mean is pulled by the outlier; the median is not.
The mean is not a good measure of center for these data.
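The effect of the single outlier can be reproduced with Python's standard `statistics` module. This is a small sketch using the incomes from the example; the variable names are illustrative.

```python
import statistics

town = [25_000, 27_000, 29_000, 35_000, 37_000, 38_000]
town_with_gates = town + [40_000_000]   # one very large outlier joins the data

# Before: mean and median are close together.
print(round(statistics.mean(town)))      # 31833
print(statistics.median(town))           # 32000.0

# After: the mean jumps by millions, the median moves by only $3,000.
print(round(statistics.mean(town_with_gates)))  # 5741571
print(statistics.median(town_with_gates))       # 35000
```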

Mean vs. Median
Skewness pulls the mean in the direction of the tail:
Skewed to the right = mean > median
Skewed to the left = mean < median
Outliers pull the mean in their direction:
Large outlier = mean > median
Small outlier = mean < median

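Weighted Mean
Used when values are not equally represented. Each value x_i is multiplied by its weight w_i, and the total is divided by the sum of the weights:
\[ \bar{x}_w = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_k x_k}{w_1 + w_2 + \cdots + w_k} \]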

Example (weighted mean)
A recent survey of a new diet cola reported the following percentages of people who liked the taste. Find the weighted mean of the percentages.
Area   % Favored   Number surveyed
1      40          1000
2      30          3000
3      50          800

Example (cont.)
x1 = 0.40, x2 = 0.30, x3 = 0.50
w1 = 1000, w2 = 3000, w3 = 800
Use the formula: {0.40(1000) + 0.30(3000) + 0.50(800)} / {1000 + 3000 + 800} = 1700/4800 = 0.354 = 35.4%
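The same calculation can be sketched in Python. The function name `weighted_mean` and the list names are hypothetical; the numbers are the survey figures from the example, with the number surveyed in each area used as the weight.

```python
def weighted_mean(values, weights):
    """Sum of weight * value, divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)


favored = [0.40, 0.30, 0.50]   # proportion who liked the taste, per area
surveyed = [1000, 3000, 800]   # number surveyed, per area (the weights)

result = weighted_mean(favored, surveyed)
print(f"{result:.1%}")  # 35.4%
```

The unweighted average of the three percentages would be 40%, so ignoring the sample sizes would overstate how well the cola was received.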

Spread
The range (Max – Min) is a very basic measure of spread.
It is highly affected by outliers, which make the spread appear larger than it really is.
Ex. The annual numbers of deaths from tornadoes in the U.S. from 1990 to 2000:
53 39 39 33 69 30 25 67 130 94 40
Range with outlier: 130 – 25 = 105
Range without outlier: 94 – 25 = 69

Spread
Interquartile Range (IQR)
First Quartile (Q1) = 25th percentile
Third Quartile (Q3) = 75th percentile
IQR = Q3 – Q1, the spread of the center (middle) 50% of the values

Finding Quartiles
Order the data.
Split into two halves at the median.
When n is odd, include the median in both halves.
When n is even, do not include the median in either half.
Q1 = median of the lower half
Q3 = median of the upper half

Top 15 Populations US Cities 2004
City                      Population*
New York, N.Y.            810
Los Angeles, Calif.       385
Chicago, Ill.             286
Houston, Tex.             201
Philadelphia, Pa.         147
Phoenix, Ariz.            142
San Diego, Calif.         126
San Antonio, Tex.         124
Dallas, Tex.              121
San Jose, Calif.          90
Detroit, Mich.            90
Indianapolis, Ind.        78
Jacksonville, Fla.        78
San Francisco, Calif.     74
* Populations were all divided by 10,000.

Example – Top City Populations
Order the values (14 values):
74 78 78 90 90 121 124 126 142 147 201 286 385 810
Lower half = 74 78 78 90 90 121 124
Q1 = median of lower half = 90
Upper half = 126 142 147 201 286 385 810
Q3 = median of upper half = 201
IQR = Q3 – Q1 = 201 – 90 = 111

August High Temps (8/1–8/13)
Order the values (13 values):
71 76 81 81 85 86 90 90 91 93 93 96 96
n is odd, so the median (90, the 7th value) is included in both halves.
Lower half = 71 76 81 81 85 86 90
Q1 = median of lower half = 81
Upper half = 90 90 91 93 93 96 96
Q3 = median of upper half = 93
IQR = Q3 – Q1 = 93 – 81 = 12

August High Temps (8/14–8/25)
Order the values (12 values):
76 77 77 79 81 83 84 85 86 88 91 93
Lower half = 76 77 77 79 81 83
Q1 = median of lower half = 78
Upper half = 84 85 86 88 91 93
Q3 = median of upper half = 87
IQR = Q3 – Q1 = 87 – 78 = 9
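The quartile procedure from the slides (split at the median, keeping the median in both halves when n is odd) can be sketched as follows. The function names are illustrative. Statistical libraries use several different quartile conventions, so their results may differ slightly from this hand method.

```python
def median(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (Q1, Q3, IQR) using the split-at-the-median rule."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    if n % 2 == 1:
        lower = ordered[:half + 1]   # n odd: median goes in both halves
        upper = ordered[half:]
    else:
        lower = ordered[:half]       # n even: clean split, median in neither
        upper = ordered[half:]
    q1, q3 = median(lower), median(upper)
    return q1, q3, q3 - q1


cities = [74, 78, 78, 90, 90, 121, 124, 126, 142, 147, 201, 286, 385, 810]
early_august = [71, 76, 81, 81, 85, 86, 90, 90, 91, 93, 93, 96, 96]
late_august = [76, 77, 77, 79, 81, 83, 84, 85, 86, 88, 91, 93]

print(quartiles(cities))        # (90, 201, 111)
print(quartiles(early_august))  # (81, 93, 12)
print(quartiles(late_august))   # (78.0, 87.0, 9.0)
```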

Five Number Summary Minimum Q1 Median Q3 Maximum

Examples
Vikings (as of 1/9): Min = 13, Q1 = 20, Median = 27, Q3 = 31, Max = 38
Colts (as of 1/9): Min = 14, Q1 = 24, Median = 34, Q3 = 41, Max = 51
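These summaries can be reproduced from the raw scores. The sketch below assumes the slides' quartile rule (median included in both halves when n is odd); `five_number_summary` is an illustrative name, not a library function.

```python
def median(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half + (n % 2)]   # keeps the median when n is odd
    upper = ordered[half:]             # starts at the median when n is odd
    return (ordered[0], median(lower), median(ordered),
            median(upper), ordered[-1])


vikings = [13, 14, 16, 18, 20, 22, 23, 27, 27, 28, 28, 31, 31, 31, 34, 35, 38]
colts = [14, 20, 23, 24, 24, 24, 31, 31, 34, 35, 35, 41, 41, 45, 49, 49, 51]

print(five_number_summary(vikings))  # (13, 20, 27, 31, 38)
print(five_number_summary(colts))    # (14, 24, 34, 41, 51)
```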

Graph of Five Number Summary
Boxplot
Box between Q1 and Q3
Line in the box marks the median
Lines extend out to the minimum and maximum
Best used for comparisons
Use this simpler method

Example – Vikings & Colts
Boxplot of Vikings scores: box from 20 to 31, line in box at 27, lines extend out from box to 13 and 38.
Boxplot of Colts scores: box from 24 to 41, line in box at 34, lines extend out from box to 14 and 51.

Side by Side Boxplots of Vikings Scores and Colts Scores

Spread
Standard deviation
“Average” spread from the mean
Most common measure of spread
Denoted by the letter s
Make a table when calculating by hand

Standard Deviation
\[ s = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}} \]

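Example – Deaths from Tornadoes
Mean = 56.27 (n = 11)
Value   Deviation from mean      Squared deviation
53      53 – 56.27 = –3.27       10.69
39      39 – 56.27 = –17.27      298.25
39      39 – 56.27 = –17.27      298.25
33      33 – 56.27 = –23.27      541.49
69      69 – 56.27 = 12.73       162.05
30      30 – 56.27 = –26.27      690.11
25      25 – 56.27 = –31.27      977.81
67      67 – 56.27 = 10.73       115.13
130     130 – 56.27 = 73.73      5436.11
94      94 – 56.27 = 37.73       1423.55
40      40 – 56.27 = –16.27      264.71
Sum of squared deviations = 10218.15, so s = sqrt(10218.15 / 10) = 31.97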

Example - Vikings Find the standard deviation of the scores of Vikings games given the following statistic:

Properties of s
s = 0 only when all observations are equal; otherwise, s > 0
s has the same units as the data
s is not resistant: skewness and outliers affect s, just like the mean
Tornado example:
s with outlier: 31.97
s without outlier: 21.70
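The by-hand table method and the effect of the outlier on s can be checked with a short sketch. The function name `sample_std` is an assumption; it divides by n – 1, which is the convention that reproduces the values quoted above.

```python
import math


def sample_std(values):
    """Sample standard deviation: sqrt of summed squared deviations / (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((y - mean) ** 2 for y in values)
    return math.sqrt(squared_deviations / (n - 1))


# Annual U.S. tornado deaths, 1990 to 2000 (from the range example)
deaths = [53, 39, 39, 33, 69, 30, 25, 67, 130, 94, 40]
without_outlier = [d for d in deaths if d != 130]

print(f"{sample_std(deaths):.2f}")           # 31.97
print(f"{sample_std(without_outlier):.2f}")  # 21.70
```

Dropping one observation out of eleven cuts s by roughly a third, which is what "not resistant" means in practice.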

Which summaries should you use with different distributions?
When your distribution is symmetric, the appropriate measures of center and spread are the mean and the standard deviation.
When your distribution is skewed, the appropriate measures of center and spread are the median and the IQR.

Comparing Variance
When comparing the variability of two sets of numbers, find the coefficient of variation for each:
\[ C_{var} = \frac{s}{\bar{y}} \cdot 100\% \]
Then compare the percentages.
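As an illustration, the coefficient of variation can be computed for the two sets of football scores used earlier in the presentation. Applying it to these data is an assumption made for this sketch, not an example taken from the slides, and the function name is hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100


vikings = [13, 14, 16, 18, 20, 22, 23, 27, 27, 28, 28, 31, 31, 31, 34, 35, 38]
colts = [14, 20, 23, 24, 24, 24, 31, 31, 34, 35, 35, 41, 41, 45, 49, 49, 51]

print(f"Vikings: {coefficient_of_variation(vikings):.1f}%")  # Vikings: 29.4%
print(f"Colts:   {coefficient_of_variation(colts):.1f}%")    # Colts:   33.2%
```

Because each standard deviation is scaled by its own mean, the two percentages can be compared even though the teams score at different levels.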

Standardizing (first look)
I got an 85 on my English test and you got a 35 on your Spanish test. Who did better?
How can we compare things that come from different scales?
Standardizing: use the z formula (the result is called a z-score).

Standardizing
\[ Z = \frac{X - \bar{X}}{S} \]
Z = standardized score
X = raw score
X-bar = mean of raw scores
S = sample standard deviation
So what does this mean for our test scores?

Standardizing
I got an 85 on my English test and you got a 35 on your Spanish test. Who did better?
Now I need to give you more information.
The English class’s tests had a mean of 83 and a standard deviation of 3.
The Spanish tests had a mean of 30 and a standard deviation of 2.

Standardizing
English: Z = (85 – 83) / 3 = 0.667
Spanish: Z = (35 – 30) / 2 = 2.5

Comparing Standardized Scores
I scored 0.667 standard deviations above the mean on my English test, whereas you scored 2.5 standard deviations above the mean on your Spanish test. Comparatively, you scored better on your exam.
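The comparison can be sketched in a few lines of Python. The function name `z_score` is illustrative; the scores, means, and standard deviations are the ones given for the two tests.

```python
def z_score(raw, mean, std):
    """Number of standard deviations a raw score lies above (+) or below (-) the mean."""
    return (raw - mean) / std


english = z_score(85, mean=83, std=3)
spanish = z_score(35, mean=30, std=2)

print(round(english, 3))  # 0.667
print(spanish)            # 2.5
print("Spanish" if spanish > english else "English")  # Spanish
```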