Published by Darin Pearson. Modified over 2 years ago.

1
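Boxplot
[Figure: labeled boxplot. The box runs from Q1 to Q3 with a line at the median. One whisker ends at the largest observation that is not a suspected outlier, the other at the smallest observation that is not a suspected outlier. An outlier is marked with *.]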

2
Boxplot
May also be represented horizontally:
[Figure: horizontal boxplot with the same labels: Q1, median, Q3, whiskers to the largest and smallest observations that are not suspected outliers, and * for an outlier.]

3
Example
The data: “Guess my age”
31 42 27 29 30 27 26 32 36 27
31 30 27 35 31 37 27 27 35 30
25 28 31 20 33 30 34 26 30 33
30 38 34 30 36 35 43 43 35 32
26 37 30 29 27 29 35 26 30 32
32 29 27 30 28 29 36 26 32 32
30 30 27 28 30 26 28 33 35 32
30 28 38 26 29 37 36 32 40

4
Example
The data: “Guess my age”, sorted
20 25
26 26 26 26 26 26 26
27 27 27 27 27 27 27 27 27
28 28 28 28 28
29 29 29 29 29 29
30 30 30 30 30 30 30 30 30 30 30 30 30 30
31 31 31 31
32 32 32 32 32 32 32 32
33 33 33
34 34
35 35 35 35 35 35
36 36 36 36
37 37 37
38 38
40 42 43 43

5
Calculations needed for the boxplot:
Five number summary:
Min   Q1   M    Q3   Max
20    28   30   34   43
You may also want to know the mean: Mean = 31.139

6
Details of calculating median and quartiles
Calculating the median: n = 79
The median is the observation in position (n+1)/2 = (79+1)/2 = 40 of the sorted data.
Median = 30

7
Details of calculating median and quartiles
Calculating Q1: Q1 is the median of the first 50% of the data.
The first 50% of the data are the first 39 observations of the sorted data (not counting the median).
The median of these observations is the observation in position (39+1)/2 = 20.
Q1 = 28

8
Details of calculating median and quartiles
Calculating Q3: Q3 is the median of the top 50% of the data.
The top 50% of the data are the highest 39 observations of the sorted data (again not counting the median).
Within these 39 observations we look for the median, which is in position (39+1)/2 = 20 of that half, that is, position 60 of the full sorted data.
Q3 = 34
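The position rules above can be checked with a short Python sketch. It assumes the convention used on these slides (with an odd n, the median is left out of both halves); other software may use a different quartile rule. The function names `median` and `five_number_summary` are illustrative, not taken from the slides.

```python
# Sorted "Guess my age" data, copied from the slides (79 observations).
ages = [
    20, 25, 26, 26, 26, 26, 26, 26, 26, 27, 27, 27, 27, 27, 27, 27, 27, 27,
    28, 28, 28, 28, 28, 29, 29, 29, 29, 29, 29,
    30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30,
    31, 31, 31, 31, 32, 32, 32, 32, 32, 32, 32, 32, 33, 33, 33, 34, 34,
    35, 35, 35, 35, 35, 35, 36, 36, 36, 36, 37, 37, 37, 38, 38, 40, 42, 43, 43,
]


def median(sorted_values):
    """Median of an already sorted list: position (n+1)/2, or the
    average of the two middle values when n is even."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def five_number_summary(values):
    """Return (Min, Q1, M, Q3, Max); the quartiles are the medians of the
    lower and upper halves, excluding the overall median when n is odd."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + 1:] if n % 2 == 1 else data[half:]
    return data[0], median(lower), median(data), median(upper), data[-1]


summary = five_number_summary(ages)
mean = sum(ages) / len(ages)
print(summary)         # (20, 28, 30, 34, 43)
print(round(mean, 3))  # 31.139
```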

9
Now we can draw the “box”
[Figure: the box drawn above an axis running from 25 to 44.]

10
A criterion for outliers:
An observation is marked as a suspected outlier if it falls outside the range
[Q1 - 1.5 x IQR, Q3 + 1.5 x IQR]
For the “Guess my age” data:
IQR = Q3 - Q1 = 34 - 28 = 6
1.5 x IQR = 9
Q1 - 1.5 x IQR = 28 - 9 = 19 (lower fence)
Q3 + 1.5 x IQR = 34 + 9 = 43 (upper fence)
An outlier is an observation below 19 or above 43.
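The fence arithmetic can be sketched in a few lines of Python. The helper names `fences` and `is_suspected_outlier` are illustrative, and the value 44 in the last call is a made-up observation used only to show the rule firing.

```python
def fences(q1, q3, k=1.5):
    """Lower and upper fences: Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def is_suspected_outlier(x, lower, upper):
    """An observation is suspected only if it lies strictly outside the fences."""
    return x < lower or x > upper


lower, upper = fences(28, 34)   # Q1 and Q3 of the "Guess my age" data
print(lower, upper)             # 19.0 43.0

print(is_suspected_outlier(20, lower, upper))  # False: the minimum is inside
print(is_suspected_outlier(43, lower, upper))  # False: on the fence, not above it
print(is_suspected_outlier(44, lower, upper))  # True: hypothetical value above 43
```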

11
No observations are suspected outliers
Every observation in the sorted data lies between 20 and 43. The smallest value, 20, is above the lower fence of 19, and the largest value, 43, lies on the upper fence rather than above it.

12
Drawing the whiskers
Draw a line to the smallest observation that is not an outlier: 20
Draw a line to the largest observation that is not an outlier: 43

13
Box plot of “Guess my age” data:
[Figure: box plot above an axis running from 20 to 43.]

14
Box plot of “Guess my age” data:
You may add the mean (as + or a dot)
[Figure: box plot with the mean marked, above an axis running from 20 to 43.]
Minitab: ..\SURVEY1000.MPJ

15
Box plot – building blocks
Create a box from the quartiles.
Add the median (a line parallel to the quartiles).
Optionally add the mean (a dot or + in the box).
Draw whiskers (lines from the box to the largest and smallest values within the fences).
Observations more than 1.5 x IQR outside the central box are plotted individually as suspected outliers.

16
Comparative box plots – “Guess my age” data for females and males:
Minitab: ..\SURVEY1000.MPJ

17
Example - Boxplot
Populations of the 10 largest U.S. cities in 1990, in millions:
City           Population
New York       7.323
Los Angeles    3.485
Chicago        2.784
Houston        1.631
Philadelphia   1.586
San Diego      1.111
Detroit        1.028
Dallas         1.007
Phoenix        0.983
San Antonio    0.936

18
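Example - Boxplot
Write in ascending order:
City           Population
San Antonio    0.936
Phoenix        0.983
Dallas         1.007
Detroit        1.028
San Diego      1.111
Philadelphia   1.586
Houston        1.631
Chicago        2.784
Los Angeles    3.485
New York       7.323
M = (1.111 + 1.586)/2 ≈ 1.349
Q1 = 1.007
Q3 = 2.784
IQR = 2.784 - 1.007 = 1.777
1.5 x IQR ≈ 2.666
Q1 - 2.666 = -1.659 (lower fence; it is below 0, so no city can fall under it)
Q3 + 2.666 = 5.45 (upper fence)
New York (7.323) is above 5.45, so it is a suspected outlier.
(mean = 2.187)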
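The same steps can be run end to end on the city data. This is a minimal sketch under the quartile convention used in this presentation (medians of the lower and upper halves); `boxplot_stats` and its dictionary keys are illustrative names. Because the code does not round 1.5 x IQR, the upper fence comes out as about 5.4495 rather than the slide's rounded 5.45.

```python
# Populations (millions, 1990) from the slide.
cities = {
    "New York": 7.323, "Los Angeles": 3.485, "Chicago": 2.784,
    "Houston": 1.631, "Philadelphia": 1.586, "San Diego": 1.111,
    "Detroit": 1.028, "Dallas": 1.007, "Phoenix": 0.983, "San Antonio": 0.936,
}


def median(sorted_values):
    """Median of an already sorted list."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def boxplot_stats(values, k=1.5):
    """Quartiles, fences, whisker ends and suspected outliers."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower_half = data[:half]
    upper_half = data[half + 1:] if n % 2 == 1 else data[half:]
    q1, q3 = median(lower_half), median(upper_half)
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    # Whiskers stop at the most extreme observations inside the fences.
    inside = [x for x in data if low_fence <= x <= high_fence]
    outliers = [x for x in data if x < low_fence or x > high_fence]
    return {
        "q1": q1, "median": median(data), "q3": q3,
        "fences": (low_fence, high_fence),
        "whiskers": (inside[0], inside[-1]),
        "outliers": outliers,
    }


stats = boxplot_stats(cities.values())
outlier_names = [name for name, pop in cities.items() if pop in stats["outliers"]]
print(stats["whiskers"])  # (0.936, 3.485)
print(outlier_names)      # ['New York']
```

The upper whisker ends at Los Angeles (3.485), the largest city inside the fences, and New York is plotted on its own.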

19
Boxplot of U.S. cities populations (in millions):
[Figure: boxplot above an axis running from 0 to 8, with + and * markers; the point labeled N.Y. is New York.]

20
Choosing measures of center and spread
The five number summary (and the boxplot) is usually better than the mean and standard deviation for describing a skewed distribution or a distribution with strong outliers.
Use the mean and standard deviation only for reasonably symmetric distributions that are free of outliers.
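One way to see this advice in action is to recompute the summaries of the city data with and without New York. The sketch below uses Python's standard `statistics` module; the comparison itself is an illustration added here, not part of the original slides.

```python
import statistics

# City populations (millions, 1990) from the earlier example.
populations = [7.323, 3.485, 2.784, 1.631, 1.586,
               1.111, 1.028, 1.007, 0.983, 0.936]
without_ny = [p for p in populations if p != 7.323]

mean_all = statistics.mean(populations)
mean_no_ny = statistics.mean(without_ny)
median_all = statistics.median(populations)
median_no_ny = statistics.median(without_ny)
sd_all = statistics.stdev(populations)
sd_no_ny = statistics.stdev(without_ny)

# How far each measure of center moves when the outlier is dropped.
mean_shift = mean_all - mean_no_ny
median_shift = median_all - median_no_ny

print(f"mean:   {mean_all:.3f} -> {mean_no_ny:.3f}")
print(f"median: {median_all:.3f} -> {median_no_ny:.3f}")
print(f"stdev:  {sd_all:.3f} -> {sd_no_ny:.3f}")
```

Dropping the single outlier moves the mean more than the median and shrinks the standard deviation, which is why the five number summary is the safer description for data like these.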
