Published by Darin Pearson. Modified about 1 year ago.

1
Boxplot [Figure: vertical boxplot. The box runs from Q1 to Q3 with a line at the median. Each whisker extends to the largest (or smallest) observation that is not a suspected outlier. Suspected outliers are marked with *.]

2
Boxplot. May also be represented horizontally: [Figure: horizontal boxplot with the same labels: Q1, median, Q3, whiskers to the smallest and largest observations that are not suspected outliers, and * for an outlier.]

3
The data: “Guess my age” example

4
The data: “Guess my age” example, sorted

5
Calculations needed for the boxplot: the five-number summary (Min, Q1, M, Q3, Max). You may also want to know the mean.

6
Details of calculating median and quartiles. Calculating the median: n = 79. The median is the observation in position (n+1)/2 = (79+1)/2 = 40 of the sorted data.

7
Details of calculating median and quartiles. Calculating Q1: Q1 is the median of the first 50% of the data. The first 50% of the data are the first 39 observations (not counting the median). The median of these observations is the observation in position (39+1)/2 = 20. Q1 = 28.

8
Details of calculating median and quartiles. Calculating Q3: Q3 is the median of the top 50% of the data. The top 50% of the data are the highest 39 observations (again not counting the median). Within these 39 observations we look for the median, which is in position (39+1)/2 = 20. Q3 = 34.
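The position arithmetic on the last three slides can be sketched in a few lines of Python. This is only an illustration of the "median of each half, not counting the median" rule used here. The function name `summary_positions` is hypothetical, and other textbooks and software packages use slightly different quartile rules.

```python
from fractions import Fraction


def summary_positions(n):
    """Return the 1-based positions (in the sorted data) of Q1, the median and Q3.

    A position such as 11/2 means: average the two neighbouring observations.
    Hypothetical helper following the rule on the slides.
    """
    median_pos = Fraction(n + 1, 2)
    half = n // 2                      # size of each half; the median is left out when n is odd
    q1_pos = Fraction(half + 1, 2)     # median position inside the lower half
    q3_pos = (n - half) + q1_pos       # same position, counted inside the upper half
    return q1_pos, median_pos, q3_pos


q1_pos, median_pos, q3_pos = summary_positions(79)
print(q1_pos, median_pos, q3_pos)  # 20 40 60
```

For n = 79 this gives position 20 for Q1 and position 40 for the median, as on the slides. Position 60 for Q3 is the 20th observation within the top 39.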

9
Now we can draw the “box”

10
A criterion for outliers: An observation is marked as a suspected outlier if it falls outside the range [Q1 - 1.5 x IQR, Q3 + 1.5 x IQR]. For the “Guess my age” data: IQR = Q3 - Q1 = 34 - 28 = 6. 1.5 x IQR = 9. Q1 - 1.5 x IQR = 28 - 9 = 19 (lower fence). Q3 + 1.5 x IQR = 34 + 9 = 43 (upper fence). A suspected outlier is an observation below 19 or above 43.
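As a quick check of the fence arithmetic, here is a minimal Python sketch. The function name `fences` and the parameter `k` are illustrative choices, not part of the presentation.

```python
def fences(q1, q3, k=1.5):
    """Return (lower fence, upper fence) for the k x IQR outlier criterion."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles of the "Guess my age" data as given on the slide
lower, upper = fences(28, 34)
print(lower, upper)  # 19.0 43.0
```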

11
No observations are suspected outliers

12
Drawing the whiskers: Draw a line from the box to the smallest observation that is not an outlier (20). Draw a line from the box to the largest observation that is not an outlier.
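The whisker rule can be sketched as follows. The list `guesses` is made-up illustrative data, not the class data (which contains no suspected outliers), and `whiskers_and_outliers` is a hypothetical helper name.

```python
def whiskers_and_outliers(data, lower_fence, upper_fence):
    """Return (lower whisker end, upper whisker end, suspected outliers).

    Observations exactly on a fence count as inside, because the slides
    flag only values below the lower fence or above the upper fence.
    """
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    outliers = sorted(x for x in data if x < lower_fence or x > upper_fence)
    return min(inside), max(inside), outliers


# Hypothetical guesses, NOT the "Guess my age" data
guesses = [20, 25, 28, 30, 31, 34, 40, 50]
print(whiskers_and_outliers(guesses, 19, 43))  # (20, 40, [50])
```

The whiskers stop at 20 and 40, and the value 50 would be plotted individually.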

13
Box plot of “Guess my age” data:

14
Box plot of “Guess my age” data: You may add the mean (as + or a dot). Minitab: ..\SURVEY1000.MPJ

15
Box plot – building blocks:
1. Create a box from the quartiles.
2. Add the median (a line parallel to the quartile edges).
3. Optionally add the mean (a dot or + in the box).
4. Draw the whiskers (lines from the box to the largest and smallest values within the fences).
5. Plot observations more than 1.5 x IQR outside the central box individually as suspected outliers.

16
Comparative box plots: “Guess my age” data for females and males. Minitab: ..\SURVEY1000.MPJ

17
Example - Boxplot: populations of the 10 largest U.S. cities in 1990, in millions.
New York 7.323
Los Angeles 3.485
Chicago 2.784
Houston 1.631
Philadelphia 1.586
San Diego 1.111
Detroit 1.028
Dallas 1.007
Phoenix 0.983
San Antonio 0.936

18
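Example - Boxplot. Write the data in ascending order:
San Antonio 0.936
Phoenix 0.983
Dallas 1.007
Detroit 1.028
San Diego 1.111
Philadelphia 1.586
Houston 1.631
Chicago 2.784
Los Angeles 3.485
New York 7.323
M = (1.111 + 1.586)/2 = 1.349
Q1 = 1.007, Q3 = 2.784
IQR = 2.784 - 1.007 = 1.777
1.5 x IQR = 2.666
Q1 - 1.5 x IQR = 1.007 - 2.666 = -1.659 < 0 (lower fence)
Q3 + 1.5 x IQR = 2.784 + 2.666 = 5.45 (upper fence)
New York (7.323) is an outlier. (Mean = 2.187)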
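The numbers in this example can be reproduced with Python's standard `statistics` module. This is a sketch under the slides' quartile rule (median of each half). The variable names are illustrative, and statistical software may report slightly different quartiles because it uses other interpolation rules.

```python
from statistics import mean, median

# 1990 populations in millions, as listed on the slide
cities = {
    "New York": 7.323, "Los Angeles": 3.485, "Chicago": 2.784,
    "Houston": 1.631, "Philadelphia": 1.586, "San Diego": 1.111,
    "Detroit": 1.028, "Dallas": 1.007, "Phoenix": 0.983,
    "San Antonio": 0.936,
}

values = sorted(cities.values())
half = len(values) // 2                      # size of each half (median excluded when n is odd)

m = median(values)                           # average of the 5th and 6th values
q1 = median(values[:half])                   # median of the lower half
q3 = median(values[len(values) - half:])     # median of the upper half
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [name for name, pop in cities.items()
            if pop < lower_fence or pop > upper_fence]

print("M =", round(m, 4), " Q1 =", q1, " Q3 =", q3)
print("fences:", round(lower_fence, 4), round(upper_fence, 4))
print("mean =", round(mean(values), 4))
print("suspected outliers:", outliers)       # ['New York']
```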

19
Boxplot of U.S. cities' populations (in millions): [Figure: boxplot with the mean marked + and New York (N.Y.) marked * as a suspected outlier.]

20
Choosing measures of center and spread The five number summary (and the boxplot) is usually better than the mean and standard deviation for describing a skewed distribution or a distribution with strong outliers. Use the mean and standard deviation only for reasonably symmetric distributions that are free of outliers.
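To see why, one can compare the summaries of the city populations with and without New York. The sketch below uses only the standard library. The comparison itself is an illustration added here, not part of the original slides.

```python
from statistics import mean, median, stdev

# 1990 populations in millions, from the U.S. cities example
populations = [7.323, 3.485, 2.784, 1.631, 1.586,
               1.111, 1.028, 1.007, 0.983, 0.936]
without_ny = [p for p in populations if p != 7.323]  # drop the outlier

for label, data in (("all 10 cities", populations),
                    ("without New York", without_ny)):
    print(f"{label}: mean={mean(data):.3f}  sd={stdev(data):.3f}  "
          f"median={median(data):.3f}")

# How far each measure of center moves when the outlier is removed
mean_shift = mean(populations) - mean(without_ny)
median_shift = median(populations) - median(without_ny)
print(f"mean shift={mean_shift:.3f}  median shift={median_shift:.3f}")
```

Removing the single outlier moves the mean by more than twice as much as it moves the median, which is the sense in which the median is the more resistant measure for these data.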
