Published by Doris Higgins. Modified over 9 years ago.
1
Percentiles and Percentile Ranks and their Graphical Representations
Chapters 2 and 3: Frequency Distributions, Histograms, Percentiles and Percentile Ranks and their Graphical Representations
Note: we’ll be skipping book sections 2.4 (apparent and real limits) and 2.8, 2.9 (percentiles and percentile ranks for grouped data).
2
Percentiles and Percentile Ranks
Chapter 2: Frequency Distributions, Histograms, Percentiles and Percentile Ranks
How can we represent or summarize a list of values?
Frequency distribution: shows the number of observations for each possible category or score value in a set of data. It can be made for data on any scale (nominal, ordinal, interval, or ratio) and is often represented as a bar graph (Chapter 3).
Example of a frequency distribution for nominal scale data, 2008 auto sales by country:
Japan: 11,563,629
China: 9,345,101
US: 8,705,239
Germany: 6,040,582
South Korea: 3,806,682
Brazil: 3,220,475
3
Car sales drawn as a histogram
[Figure: bar graph "Car Sales in 2008 (millions)", one bar per country: Japan, China, US, Germany, South Korea, Brazil.]
4
This histogram shows the proportion of members for each category.
Distribution of all M&M's.
5
Ice Dancing , compulsory dance scores, 4 Winter Olympics
Making histograms from interval and ratio data
Raw scores: 111.15 108.55 106.6 103.33 100.06 97.38 96.67 96.12 92.75 89.62 85.36 84.58 83.89 83.12 80.47 80.3 79.31 76.73 74.25 72.01 68.87 63.73 59.64
We need to bin the raw scores into a set of class intervals. How do we decide these class intervals?
Be sure the intervals don’t overlap, have the same width, and cover the entire range of scores.
Use around 10 to 20 intervals.
Use a ‘sensible’ width (like 5, and not ).
Make the lower score a multiple of the width (e.g. if the width is 5, a lower score should be 50, not 48).
If a score lands on the border, put it in the lower class interval.
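The binning rules above can be sketched in Python. This is only an illustration under stated assumptions, not the presenter's code: the function name `frequency_table` and its parameters are made up here, and the width of 5 with a lowest limit of 55 is simply one choice that satisfies the rules.

```python
import math
from collections import Counter

# Ice dancing raw scores from the slide.
scores = [111.15, 108.55, 106.6, 103.33, 100.06, 97.38, 96.67, 96.12,
          92.75, 89.62, 85.36, 84.58, 83.89, 83.12, 80.47, 80.3,
          79.31, 76.73, 74.25, 72.01, 68.87, 63.73, 59.64]

def frequency_table(values, width, lowest):
    """Count scores per class interval; a border score goes to the lower interval."""
    counts = Counter()
    for x in values:
        # Interval k covers (lowest + k*width, lowest + (k+1)*width],
        # so a score exactly on a border falls in the lower interval.
        k = max(math.ceil((x - lowest) / width) - 1, 0)
        counts[k] += 1
    top = max(counts)
    # Highest interval first, matching the layout used on the slides.
    return [(lowest + k * width, lowest + (k + 1) * width, counts[k])
            for k in range(top, -1, -1)]

table = frequency_table(scores, width=5, lowest=55)
for low, high, f in table:
    print(f"{low}-{high}: {f}")
```

Using `ceil` rather than `floor` is what sends a border score such as 85 to the 80-85 interval instead of 85-90.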
6
Ice Dancing, compulsory dance scores, Winter Olympics
Let’s use a class interval width of 5 points, with a lowest score of 55.
Count the number of scores in each bin to get the frequency:
Class interval   Frequency (f)
110-115          1
105-110          2
100-105          2
95-100           3
90-95            1
85-90            2
80-85            5
75-80            2
70-75            2
65-70            1
60-65            1
55-60            1
n = 23
7
Histogram of Ice Dancing Scores (frequency)
[Figure: histogram of the ice dancing scores; x-axis: Ice Dancing Score, 55 to 115 in class intervals of width 5; y-axis: Frequency, 1 to 5.]
8
Relative frequency
Divide each frequency by the total number of scores (n = 23) to get relative frequency as a proportion. Then multiply by 100 to get relative frequency in percent.
Class interval   Frequency (f)   Relative frequency (prop)   Relative frequency (%)
110-115          1               .0435                       4.35
105-110          2               .0870                       8.70
100-105          2               .0870                       8.70
95-100           3               .1304                       13.04
90-95            1               .0435                       4.35
85-90            2               .0870                       8.70
80-85            5               .2174                       21.74
75-80            2               .0870                       8.70
70-75            2               .0870                       8.70
65-70            1               .0435                       4.35
60-65            1               .0435                       4.35
55-60            1               .0435                       4.35
n = 23
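As a small sketch of the divide-then-multiply step, assuming the per-interval frequencies counted from the 23 raw scores (the dictionary name `frequencies` is just illustrative):

```python
# Frequencies per class interval, counted from the 23 raw scores
# (width 5, lowest limit 55, border scores in the lower interval).
frequencies = {
    "110-115": 1, "105-110": 2, "100-105": 2, "95-100": 3,
    "90-95": 1, "85-90": 2, "80-85": 5, "75-80": 2,
    "70-75": 2, "65-70": 1, "60-65": 1, "55-60": 1,
}

n = sum(frequencies.values())  # total number of scores

# Relative frequency as a proportion: f / n.
relative = {interval: f / n for interval, f in frequencies.items()}

for interval, prop in relative.items():
    # Multiply the proportion by 100 to express it in percent.
    print(f"{interval:>8}  {prop:.4f}  {prop * 100:.2f}%")
```

The proportions always sum to 1 (and the percents to 100), which is a quick check that no score was dropped or counted twice.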
9
Relative frequency histogram of Ice Dancing Scores (frequency)
[Figure: relative frequency histogram of the ice dancing scores; x-axis: Ice Dancing Score, 55 to 115 in class intervals of width 5; y-axis: Relative Frequency (%), 5 to 25.]
10
Choosing your class intervals can have an influence on the way your histogram looks
[Figure: four frequency histograms of the ice dancing scores, drawn with interval widths of 10, 5, 3, and 1; x-axis: Ice Dancing Score, y-axis: Frequency.]
11
These three graphs have the same class intervals on the same scores!
[Figure: three frequency histograms of the ice dancing scores; x-axis: Ice Dancing Score, y-axis: Frequency.]
12
When possible, include zero on your y-axis.
Not like this
13
When possible, include zero on your y-axis:
[Figure: bar charts of Enrollment (Millions), "As of March 27" versus "March 31 Goal", labeled "Like this" and "Not like this".]
14
“Fox News Apologizes For Obamacare Graphic, Corrects Its 'Mistake'”
15
Percentile ranks and percentile point:
Percentile Point: A point on the measurement scale below which a specific percentage of scores fall.
Percentile Rank: The percentage of cases that fall below a given point on the measurement scale.
Percentile ranks are always between zero and 100.
16
Growth charts convert percentile points to percentile ranks
At 30 months, P95 = 36 lbs.
17
Ice Dancing, compulsory dance scores, Winter Olympics
Percentile ranks and percentile point:
What is the percentile rank for a percentile point of 100? In other words, what proportion of scores fall below a score of 100?
Class interval   f   rel f (%)   Cumulative f   Cumulative %
110-115          1   4.35        23             100
105-110          2   8.70        22             95.65
100-105          2   8.70        20             86.96
95-100           3   13.04       18             78.26
90-95            1   4.35        15             65.22
85-90            2   8.70        14             60.87
80-85            5   21.74       12             52.17
75-80            2   8.70        7              30.43
70-75            2   8.70        5              21.74
65-70            1   4.35        3              13.04
60-65            1   4.35        2              8.70
55-60            1   4.35        1              4.35
78.26% of the scores fall below 100.
The number 78.26 is the percentile rank.
The number 100 is the corresponding percentile point.
We write P78.26 = 100.
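The cumulative columns are running totals of the frequency column. A minimal Python sketch, assuming the per-interval frequencies counted from the raw scores and listing them lowest interval first (the variable names are illustrative):

```python
from itertools import accumulate

# (upper limit of class interval, frequency), lowest interval first.
intervals = [(60, 1), (65, 1), (70, 1), (75, 2), (80, 2), (85, 5),
             (90, 2), (95, 1), (100, 3), (105, 2), (110, 2), (115, 1)]

n = sum(f for _, f in intervals)

# Cumulative frequency: how many scores fall at or below each upper limit.
cumulative_f = list(accumulate(f for _, f in intervals))

# Cumulative percent: the percentile rank of each upper limit.
cumulative_pct = {upper: 100 * cf / n
                  for (upper, _), cf in zip(intervals, cumulative_f)}

print(round(cumulative_pct[100], 2))  # percentile rank of the point 100
print(round(cumulative_pct[75], 2))   # percentile rank of the point 75
```

Read this way, each class-interval upper limit is a percentile point and its cumulative percent is the matching percentile rank.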
18
Percentile ranks and percentile point:
Using the same cumulative frequency table: 21.74% of the scores are below 75, or P21.74 = 75.
100 - 21.74 = 78.26% of the scores are above 75.
19
The Cumulative Percentage Curve
Class interval   Cumulative %
110-115          100
105-110          95.65
100-105          86.96
95-100           78.26
90-95            65.22
85-90            60.87
80-85            52.17
75-80            30.43
70-75            21.74
65-70            13.04
60-65            8.70
55-60            4.35
[Figure: cumulative percentage curve; x-axis: Ice Dancing Score, 60 to 115; y-axis: Cumulative Percentage, 10 to 100.]
21.74% of the scores fall below a score of 75.
The number 21.74 is the percentile rank.
The number 75 is the corresponding percentile point.
We write P21.74 = 75.
20
The Cumulative Percentage Curve
[Figure: cumulative percentage curve; x-axis: Ice Dancing Score, 60 to 115; y-axis: Cumulative Percentage, 10 to 100.]
78.26% of the scores fall below a score of 100.
The number 78.26 is the percentile rank.
The number 100 is the corresponding percentile point.
We write P78.26 = 100.
21
The Cumulative Percentage Curve
[Figure: cumulative percentage curve; x-axis: Ice Dancing Score, 60 to 115; y-axis: Cumulative Percentage, 10 to 100.]
50% of the scores fall below a score of about 84.
The number 50 is the percentile rank.
The number 84 is an estimate of the percentile point.
We write P50 = 84.
22
Ice Dancing, compulsory dance scores, Winter Olympics
Cumulative frequency distribution
What is the percentile point for a percentile rank of 21.74%?
Answer: 75 points (21.74% of the scores fall below 75).
23
Ice Dancing, compulsory dance scores, Winter Olympics
Cumulative frequency distribution
Class interval   Frequency (f)   Cumulative frequency   Cumulative proportion   Cumulative percent
110-115          1               23                     1.00                    100
105-110          2               22                     .96                     96
100-105          2               20                     .87                     87
95-100           3               18                     .78                     78
90-95            1               15                     .65                     65
85-90            2               14                     .61                     61
80-85            5               12                     .52                     52
75-80            2               7                      .30                     30
70-75            2               5                      .22                     22
65-70            1               3                      .13                     13
60-65            1               2                      .09                     9
55-60            1               1                      .04                     4
What is the percentile point for a percentile rank of 50? (Or what is P50?)
We know it’s between 80 and 85, since 52% fall below 85 and 30% fall below 80.
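Reading a percentile point off the cumulative percentage curve amounts to linear interpolation between two neighboring points on the curve. The sketch below is one way to do that in Python. The function name `point_from_curve` is hypothetical, and the starting point (55, 0) is an assumption that follows from no score lying below 55.

```python
# Points on the cumulative percentage curve:
# (score, percent of scores falling below that score).
curve = [(55, 0.0), (60, 4.35), (65, 8.70), (70, 13.04), (75, 21.74),
         (80, 30.43), (85, 52.17), (90, 60.87), (95, 65.22),
         (100, 78.26), (105, 86.96), (110, 95.65), (115, 100.0)]

def point_from_curve(curve, p):
    """Estimate the percentile point for percentile rank p by linear interpolation."""
    for (s_low, p_low), (s_high, p_high) in zip(curve, curve[1:]):
        if p_low <= p <= p_high:
            fraction = (p - p_low) / (p_high - p_low)
            return s_low + fraction * (s_high - s_low)
    raise ValueError("p must be between 0 and 100")

print(round(point_from_curve(curve, 50), 1))
```

For a rank of 50 the interpolation lands between 80 and 85, close to the "about 84" read by eye from the graph.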
24
note this is different from the book!
Here’s how to calculate the percentile rank for each raw score (note this is different from the book!):
Score    Rank order   Subtract 1/2   Divide by n (23)   Multiply by 100
111.15   23           22.5           0.98               98
108.55   22           21.5           0.93               93
106.6    21           20.5           0.89               89
103.33   20           19.5           0.85               85
100.06   19           18.5           0.80               80
97.38    18           17.5           0.76               76
96.67    17           16.5           0.72               72
96.12    16           15.5           0.67               67
92.75    15           14.5           0.63               63
89.62    14           13.5           0.59               59
85.36    13           12.5           0.54               54
84.58    12           11.5           0.50               50
83.89    11           10.5           0.46               46
83.12    10           9.5            0.41               41
80.47    9            8.5            0.37               37
80.3     8            7.5            0.33               33
79.31    7            6.5            0.28               28
76.73    6            5.5            0.24               24
74.25    5            4.5            0.20               20
72.01    4            3.5            0.15               15
68.87    3            2.5            0.11               11
63.73    2            1.5            0.07               7
59.64    1            0.5            0.02               2
The percentile point for a percentile rank of 50 is 84.58 (P50 = 84.58).
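The rank-order recipe (rank, subtract 1/2, divide by n, multiply by 100) can be sketched in Python as follows. The function name `percentile_ranks` is made up for this illustration, and the sketch assumes there are no tied scores, which holds for this data set.

```python
# Ice dancing raw scores from the slide.
scores = [111.15, 108.55, 106.6, 103.33, 100.06, 97.38, 96.67, 96.12,
          92.75, 89.62, 85.36, 84.58, 83.89, 83.12, 80.47, 80.3,
          79.31, 76.73, 74.25, 72.01, 68.87, 63.73, 59.64]

def percentile_ranks(values):
    """Return (score, rank order, percentile rank) rows, lowest score first."""
    n = len(values)
    rows = []
    for rank, score in enumerate(sorted(values), start=1):
        # Subtract 1/2 from the rank, divide by n, multiply by 100.
        rows.append((score, rank, (rank - 0.5) / n * 100))
    return rows

rank_table = percentile_ranks(scores)
for score, rank, pr in reversed(rank_table):  # highest score first, as on the slide
    print(f"{score:7.2f}  {rank:2d}  {pr:5.1f}")
```

Subtracting 1/2 treats each score as sitting in the middle of its own rank, so the lowest score gets a rank slightly above 0 and the highest slightly below 100.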
25
Ice Dancing, compulsory dance scores, Winter Olympics
Here’s how to calculate the percentile rank for each raw score: rank the scores, subtract 1/2 from each rank order, divide by 23, and multiply by 100 (the same table as on the previous slide).
The percentile point for a percentile rank of 80 is 100.06 (P80 = 100.06).
26
How do we calculate the percentile point for all the other ranks?
Example: What is the percentile point for the percentile rank of 90%?
Score    Rank order   Subtract 1/2   Divide by 23   Multiply by 100
111.15   23           22.5           0.98           98
108.55   22           21.5           0.93           93
106.6    21           20.5           0.89           89
103.33   20           19.5           0.85           85
100.06   19           18.5           0.80           80
97.38    18           17.5           0.76           76
96.67    17           16.5           0.72           72
We know it’s between 106.6 (percentile rank 89) and 108.55 (percentile rank 93).
In fact, it’s ¼ of the way between 106.6 and 108.55: (90 - 89)/(93 - 89) = 1/4.
That means that P90 = 106.6 + 1/4 (108.55 - 106.6) = 107.09.
27
How do we calculate the percentile point for other ranks?
Example: What is P75, the percentile point for the percentile rank of 75?
Score    Rank order   Subtract 1/2   Divide by 23   Multiply by 100
111.15   23           22.5           0.98           98
108.55   22           21.5           0.93           93
106.6    21           20.5           0.89           89
103.33   20           19.5           0.85           85
100.06   19           18.5           0.80           80
97.38    18           17.5           0.76           76
96.67    17           16.5           0.72           72
We know it’s ¾ of the way between 96.67 (percentile rank 72) and 97.38 (percentile rank 76): (75 - 72)/(76 - 72) = 3/4.
P75 = 96.67 + 3/4 (97.38 - 96.67) = 97.2
28
How do we calculate the percentile point for other ranks?
Example: What is P25, the percentile point for the percentile rank of 25?
Score   Rank order   Subtract 1/2   Divide by 23   Multiply by 100
80.47   9            8.5            0.37           37
80.3    8            7.5            0.33           33
79.31   7            6.5            0.28           28
76.73   6            5.5            0.24           24
74.25   5            4.5            0.20           20
72.01   4            3.5            0.15           15
68.87   3            2.5            0.11           11
63.73   2            1.5            0.07           7
59.64   1            0.5            0.02           2
We know it’s 1/4 of the way between 76.73 (percentile rank 24) and 79.31 (percentile rank 28): (25 - 24)/(28 - 24) = 1/4.
P25 = 76.73 + 1/4 (79.31 - 76.73) = 77.37
29
General formula for calculating percentile points:
Example: What is the percentile point for the percentile rank of 81?
Score    Rank order   Subtract 1/2   Divide by 23   Multiply by 100
111.15   23           22.5           0.98           98
108.55   22           21.5           0.93           93
106.6    21           20.5           0.89           89
103.33   20           19.5           0.85           85
100.06   19           18.5           0.80           80
97.38    18           17.5           0.76           76
96.67    17           16.5           0.72           72
Make a chart like the one above.
Find the two rows that fall above and below the percentile rank.
Let PH and PL be the high and low cumulative percentiles (85 and 80 in this example).
Let SH and SL be the high and low scores (103.33 and 100.06 in this example).
If p is the percentile rank (81 in our example), then the percentile point is:
SL + (p - PL)/(PH - PL) × (SH - SL)
For p = 81: 100.06 + (81 - 80)/(85 - 80) × (103.33 - 100.06) = 100.71
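The interpolation between the two bracketing rows can be sketched in Python. This is an illustration under assumptions, not the presenter's code: the function name `percentile_point` is hypothetical, the percentile ranks are rounded to whole numbers because the chart shows them that way, and tied scores are not handled.

```python
# Ice dancing raw scores from the slide.
scores = [111.15, 108.55, 106.6, 103.33, 100.06, 97.38, 96.67, 96.12,
          92.75, 89.62, 85.36, 84.58, 83.89, 83.12, 80.47, 80.3,
          79.31, 76.73, 74.25, 72.01, 68.87, 63.73, 59.64]

def percentile_point(values, p):
    """Interpolate the percentile point for percentile rank p."""
    ordered = sorted(values)
    n = len(ordered)
    # Percentile rank of each score, rounded to a whole number as in the chart.
    ranks = [round((i - 0.5) / n * 100) for i in range(1, n + 1)]
    for k in range(n - 1):
        p_low, p_high = ranks[k], ranks[k + 1]
        if p_low <= p <= p_high:
            s_low, s_high = ordered[k], ordered[k + 1]
            # SL + (p - PL)/(PH - PL) * (SH - SL)
            return s_low + (p - p_low) / (p_high - p_low) * (s_high - s_low)
    raise ValueError("p lies outside the percentile ranks of the data")

for p in (25, 75, 81, 90):
    print(p, round(percentile_point(scores, p), 2))
```

Working with unrounded percentile ranks instead would shift the answers slightly, so results from other software may differ from the chart-based values in the second decimal place or more.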
30
Going the other way: from percentile points to percentile ranks
Example: What is the percentile rank for the percentile point of 103.33?
Score    Rank order   Subtract 1/2   Divide by 23   Multiply by 100
111.15   23           22.5           0.98           98
108.55   22           21.5           0.93           93
106.6    21           20.5           0.89           89
103.33   20           19.5           0.85           85
100.06   19           18.5           0.80           80
97.38    18           17.5           0.76           76
96.67    17           16.5           0.72           72
This is easy, since 103.33 is one of the scores. The percentile rank is 85%.
85% of the scores fall below 103.33.
31
Going the other way: from percentile points to percentile ranks
Example: What is the percentile rank for the percentile point of 100?
Score    Rank order   Subtract 1/2   Divide by 23   Multiply by 100
111.15   23           22.5           0.98           98
108.55   22           21.5           0.93           93
106.6    21           20.5           0.89           89
103.33   20           19.5           0.85           85
100.06   19           18.5           0.80           80
97.38    18           17.5           0.76           76
96.67    17           16.5           0.72           72
This is not as easy, since 100 is not one of the scores. We do know that it is between 76 and 80. In fact, we know it must be really close to 80, since P80 is 100.06.
Here’s how to do it. After finding the two rows that bracket the percentile point, if S is the percentile point, then the percentile rank is:
PL + (S - SL)/(SH - SL) × (PH - PL)
For S = 100: 76 + (100 - 97.38)/(100.06 - 97.38) × (80 - 76) = 79.91
79.91% of the scores fall below 100.
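The reverse lookup uses the same two bracketing rows with the roles of score and percentile swapped. A hedged Python sketch follows. The function name `percentile_rank` is hypothetical, percentile ranks are rounded to whole numbers to match the chart, and tied scores are not handled.

```python
# Ice dancing raw scores from the slide.
scores = [111.15, 108.55, 106.6, 103.33, 100.06, 97.38, 96.67, 96.12,
          92.75, 89.62, 85.36, 84.58, 83.89, 83.12, 80.47, 80.3,
          79.31, 76.73, 74.25, 72.01, 68.87, 63.73, 59.64]

def percentile_rank(values, s):
    """Interpolate the percentile rank for the percentile point s."""
    ordered = sorted(values)
    n = len(ordered)
    # Percentile rank of each score, rounded to a whole number as in the chart.
    ranks = [round((i - 0.5) / n * 100) for i in range(1, n + 1)]
    for k in range(n - 1):
        s_low, s_high = ordered[k], ordered[k + 1]
        if s_low <= s <= s_high:
            p_low, p_high = ranks[k], ranks[k + 1]
            # PL + (S - SL)/(SH - SL) * (PH - PL)
            return p_low + (s - s_low) / (s_high - s_low) * (p_high - p_low)
    raise ValueError("s lies outside the range of the scores")

print(round(percentile_rank(scores, 100), 2))
print(round(percentile_rank(scores, 103.33), 2))
```

When the point is itself one of the scores, the interpolation fraction is exactly 0 or 1, so the function simply returns that score's own percentile rank.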
32
Another Example: integer valued data
Scores on Professor Flans’ Midterm (n = 20)
We’ll choose a class interval width of 3. An odd number for width is good for integer data because the middle value will be a whole number.
Raw test scores: 94 93 92 91 87 86 85 84 83 82 81 80 77 73 68 59
Class interval   f
58-61            1
61-64            0
64-67            0
67-70            1
70-73            2
73-76            0
76-79            1
79-82            5
82-85            4
85-88            2
88-91            1
91-94            3
94-97            0
97-100           0
Remember, scores that land on the border are assigned to the lower class interval. So 85 lands in the interval 82-85.
33
Bins labeled by the centers of the class intervals
[Figure: frequency histogram of the midterm scores; x-axis: Test Score, bins labeled by their centers (60, 63, 66, ..., 99); y-axis: Frequency, 1 to 5.]
34
You can also show the whole interval on the x-axis labels
[Figure: the same frequency histogram with each bin labeled by its whole class interval (58-61, 61-64, ..., 97-100); x-axis: Test Score; y-axis: Frequency, 1 to 5.]
35
The Cumulative Percentage Curve
Class interval   Frequency   Cumulative frequency   Relative frequency (%)   Cumulative frequency (%)
97-100           0           20                     0                        100
94-97            0           20                     0                        100
91-94            3           20                     15                       100
88-91            1           17                     5                        85
85-88            2           16                     10                       80
82-85            4           14                     20                       70
79-82            5           10                     25                       50
76-79            1           5                      5                        25
73-76            0           4                      0                        20
70-73            2           4                      10                       20
67-70            1           2                      5                        10
64-67            0           1                      0                        5
61-64            0           1                      0                        5
58-61            1           1                      5                        5
36
Cumulative frequency%
The Cumulative Percentage Curve for Professor Flans’ Midterm
Estimate the percentile point for a percentile rank of 50%.
[Figure: cumulative percentage curve; x-axis: Test Score, 61 to 100 in steps of 3; y-axis: Cumulative Frequency (%), 10 to 100.]
About 50% of the scores fall below 82. (So P50 is about 82.)
37
Estimate the percentile point for a percentile rank of 90%
Estimating percentile points and percentile ranks from the cumulative percentage curve
Estimate the percentile point for a percentile rank of 90%.
[Figure: cumulative percentage curve; x-axis: Test Score, 61 to 100 in steps of 3; y-axis: Cumulative Frequency (%), 10 to 100.]
90% of the scores fall below a score of about 92. (P90 is about 92.)
38
Calculating percentile points from raw data.
What is the percentile point for a percentile rank of 50%?
Test score   Rank order   Subtract 1/2   Divide by 20   Multiply by 100
94           20           19.5           0.975          97.5
93           19           18.5           0.925          92.5
92           18           17.5           0.875          87.5
91           17           16.5           0.825          82.5
87           16           15.5           0.775          77.5
86           15           14.5           0.725          72.5
85           14           13.5           0.675          67.5
84           13           12.5           0.625          62.5
--           12           11.5           0.575          57.5
83           11           10.5           0.525          52.5
82           10           9.5            0.475          47.5
81           9            8.5            0.425          42.5
--           8            7.5            0.375          37.5
80           7            6.5            0.325          32.5
--           6            5.5            0.275          27.5
77           5            4.5            0.225          22.5
73           4            3.5            0.175          17.5
--           3            2.5            0.125          12.5
68           2            1.5            0.075          7.5
59           1            0.5            0.025          2.5
It’s between 82 (percentile rank 47.5) and 83 (percentile rank 52.5), exactly halfway.
P50 = 82.5
39
It’s exactly halfway between 92 and 93
Calculating percentile points from raw data.
What is the percentile point for a percentile rank of 90%?
In the same table of midterm scores, 92 has a percentile rank of 87.5 and 93 has a percentile rank of 92.5.
It’s between 92 and 93.
It’s exactly halfway between 92 and 93, so P90 = 92.5.
40
Going the other way: from percentile points to percentile ranks
Example: What is the percentile rank for the percentile point of 90?
90 is not one of the midterm scores. It falls between 87 (rank order 16, percentile rank 15.5/20 × 100 = 77.5) and 91 (rank order 17, percentile rank 16.5/20 × 100 = 82.5).
It’s between 77.5 and 82.5: 77.5 + (90 - 87)/(91 - 87) × (82.5 - 77.5) = 81.25
81.25% of the scores fall below 90 points.
41
More stuff about frequency distributions:
[Figure: two panels showing the midterm scores as a frequency polygon and as a frequency histogram; x-axis: Test Score, 60 to 99 in steps of 3; y-axis: Frequency, 1 to 5.]
42
Properties of frequency distributions
‘Normal’ or bell-shaped
Positively skewed
Negatively skewed
43
Example of a negatively skewed distribution
[Figure: frequency histogram of GRE quant scores; x-axis: GRE quant scores, 300 to 800; y-axis: Frequency, 5 to 40.]
44
Example of positively skewed distribution: Household annual income
45
Household income distribution as of 2006:
P0-89 (bottom 90%): income below $104,696 (average income, $30,374*)
P90-100 (top 10%): income above $104,696 (average income, $269,658*)
P90-95 (next 5%): income between $104,696 and $148,423 (average income, $122,429*)
P95-99 (next 4%): income between $148,423 and $382,593 (average income, $210,597*)
P99-100 (top 1%): income above $382,593 (average income, $1,243,516*)
P99.5-100 (top 0.5%): income above $597,584 (average income, $2,022,315*)
P99.9-100 (top 0.1%): income above $1,898,200 (average income, $6,289,800*)
P99.99-100 (top .01%): income above $10,659,283 (average income, $29,638,027*)
So the ‘top 1%’ can be described as: P99 = $382,593
46
Two (of many) ways that frequency distributions differ
[Figure: two panels of frequency distributions over Scores from 20 to 100: "Shift in central tendency" and "Shift in variability".]