# Statistics What are statistics and why do we use them?

Statistics help to make sense of numbers that have been collected. For example, if there was a survey into the size of feet at your school, after asking everyone their size you would end up with hundreds of random numbers. Using statistics you could sort the numbers out to find:

- the most common size
- the mean
- the range of sizes.

There are many other, more complex, ways of evaluating data, which we go into in Intermediate 2.

At Intermediate 1 level you covered basic statistics including:

- finding the mean, median, range and mode from a set of numbers
- finding the mean, median, range and mode from a frequency table
- finding the probability of an event occurring.

In Intermediate 2 you build on this work to calculate other values that are used to evaluate data. Before we go any further into the new work, we will go over the Intermediate 1 work.

The range is used to measure how widely spread a set of values is:

range = highest value – lowest value

Example: Stephen played 12 holes at his local golf club and recorded his scores. What was his range of scores?

4, 3, 4, 6, 5, 3, 8, 6, 9, 2, 3, 7

range = highest value – lowest value = 9 – 2 = 7

Example: The next day he played another 12 holes. What is his range of scores now?

2, 9, 10, 9, 13, 12, 12, 11, 1, 3, 4, 2

range = highest value – lowest value = 13 – 1 = 12
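The range calculation can be checked with a short Python sketch. The function name `value_range` and the list names are illustrative choices, not part of the original slides.

```python
def value_range(values):
    """Return the highest value minus the lowest value."""
    return max(values) - min(values)


# Stephen's scores over the two rounds of 12 holes
first_round = [4, 3, 4, 6, 5, 3, 8, 6, 9, 2, 3, 7]
second_round = [2, 9, 10, 9, 13, 12, 12, 11, 1, 3, 4, 2]

print(value_range(first_round))   # 9 - 2 = 7
print(value_range(second_round))  # 13 - 1 = 12
```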

The average of a set of numbers can be calculated using three different methods.

1. The mode is the most common number.

Example: A group of pupils were asked how many kilometres they could run.

4, 4, 3, 2, 3, 1, 5, 6,

mode = 4

2. The mean is found by adding all the numbers together, then dividing by the number of pieces of data.

Example: 11 people were asked how much pocket money they got. What is the mean amount?

3, 4, 2, 5, 3, 6, 8, 5, 5, 7, 7

mean = (3 + 4 + 2 + 5 + 3 + 6 + 8 + 5 + 5 + 7 + 7) ÷ 11 = 55 ÷ 11 = 5

3. The median is the middle number. To see which number is in the middle you have to put them in order.

Example: Julie saved up some of her pocket money over 11 weeks for an iPod Touch. What is the median amount she saved each week?

3, 4, 2, 5, 3, 6, 8, 4, 5, 6, 7

Rearrange: 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 8

Once you’ve rearranged the numbers, count them to make sure you haven’t missed any of them out. There are 11 numbers here, so the median will be the 6th number.

median = 5

If you have an even number of amounts, the median will be between two numbers. To calculate this value, find the mean of the two middle numbers.

Example: Linda saved up some of her pocket money over a 10-week period for a Wii. What was the median amount that she saved?

3, 4, 2, 6, 3, 6, 8, 4, 6, 7

Rearrange: 2, 3, 3, 4, 4, 6, 6, 6, 7, 8

There are 10 numbers here, so the median will be between the 5th and 6th numbers. The median lies between £4 and £6. To calculate this value, find the mean of these two numbers.

median = (4 + 6) ÷ 2 = £5
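The mean and median examples can be reproduced with Python's standard `statistics` module. This is only a checking sketch; the list names are made up for illustration.

```python
from statistics import mean, median

pocket_money = [3, 4, 2, 5, 3, 6, 8, 5, 5, 7, 7]   # 11 people
julie_savings = [3, 4, 2, 5, 3, 6, 8, 4, 5, 6, 7]  # 11 weeks, odd count
linda_savings = [3, 4, 2, 6, 3, 6, 8, 4, 6, 7]     # 10 weeks, even count

# mean: add everything up, then divide by how many values there are
print(sum(pocket_money), len(pocket_money), mean(pocket_money))

# median: the function sorts the values itself; with an odd count it
# returns the middle value, with an even count the mean of the middle two
print(sorted(julie_savings), median(julie_savings))
print(sorted(linda_savings), median(linda_savings))
```

With an even number of values, `median` averages the 5th and 6th sorted values, which is the same step described for Linda's savings.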

Write in your jotters the range, mode, median and mean of the following sets of numbers.

1) The following are the distances jumped in a school sports day in metres.
4, 3, 5, 6, 4, 5, 6, 7, 8, 12, 6

2) The following numbers are the maths scores in an S2 class.
17, 12, 13, 12, 14, 15, 16, 17, 18, 17, 17, 20

3) The following are times for the 100m sprint (in seconds).
24.5, , , , , , 20, ,

4) The following are minimum temperatures (°C) in Glasgow measured over one week.
0, 3, 1, 0, 0, 0, 4, 3, 2, 3

1) Range: 9 mode: 6 median: 6 mean: 6

2) Range: 8 mode: 17 median: 16.5 mean: 15.7

3) Range: mode: median: mean: 18.5

4) Range: 4 mode: 0 median: 1.5 mean: 1.6

Frequency tables

It is possible to calculate the mean, mode and median from a frequency table by adding a third column to it.

| Number of items (x) | Frequency (f) | f × x |
|---|---|---|
| 1 | 7 | 1 × 7 = 7 |
| 2 | 4 | 2 × 4 = 8 |
| 3 | 3 | 3 × 3 = 9 |
| 4 | 4 | 4 × 4 = 16 |
| 5 | 4 | 5 × 4 = 20 |

The values for the third column are found by multiplying the values in column 1 (x) by the values in column 2 (f).

The mode is the most common number. The number 1 appears seven times, so the mode is 1.

Because there are 22 numbers (7 + 4 + 3 + 4 + 4), the median will be between the 11th and 12th numbers. The 11th number is 2 and the 12th number is 3, so the median is 2.5.

Frequency tables

To calculate the mean you use this formula, where ∑ stands for ‘sum of’:

$$\text{mean} = \frac{\sum fx}{\sum f}$$

| Number of items (x) | Frequency (f) | f × x |
|---|---|---|
| 1 | 7 | 1 × 7 = 7 |
| 2 | 4 | 2 × 4 = 8 |
| 3 | 3 | 3 × 3 = 9 |
| 4 | 4 | 4 × 4 = 16 |
| 5 | 4 | 5 × 4 = 20 |
| Totals | 22 | 60 |

mean = 60 ÷ 22 = 2.7 (to 1 decimal place)

Frequency tables

Copy and complete the frequency table to work out the mean, mode and median of the number of cars in a group of pupils’ homes.

| Number of cars (x) | Frequency (f) | f × x |
|---|---|---|
| 1 | 4 | |
| 2 | 9 | |
| 3 | 10 | |
| 4 | 6 | |
| 5 | 1 | |
| Totals | | |

Frequency tables

Now check that your answers are correct.

| Number of cars (x) | Frequency (f) | f × x |
|---|---|---|
| 1 | 4 | 4 |
| 2 | 9 | 18 |
| 3 | 10 | 30 |
| 4 | 6 | 24 |
| 5 | 1 | 5 |
| Totals | 30 | 81 |

Mean: 81 ÷ 30 = 2.7 Mode: 3 Median: 3

Frequency tables

Copy and complete the frequency table to work out the mean, mode and median of the shoe sizes of S1 pupils.

| Shoe size (x) | Frequency (f) | f × x |
|---|---|---|
| 3 | 1 | |
| 4 | 12 | |
| 5 | 11 | |
| 6 | 14 | |
| 7 | 8 | |
| Totals | | |

Frequency tables

Now check that your answers are correct.

| Shoe size (x) | Frequency (f) | f × x |
|---|---|---|
| 3 | 1 | 3 |
| 4 | 12 | 48 |
| 5 | 11 | 55 |
| 6 | 14 | 84 |
| 7 | 8 | 56 |
| Totals | 46 | 246 |

Mean: 246 ÷ 46 = 5.3 (to 1 decimal place) Mode: 6 Median: 5
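The frequency-table method can be sketched in Python as a way of checking answers. The dictionary below holds the shoe-size table (sizes 3 to 7 with frequencies 1, 12, 11, 14 and 8); the variable names are illustrative only.

```python
from statistics import median

# shoe size (x) -> frequency (f)
shoe_sizes = {3: 1, 4: 12, 5: 11, 6: 14, 7: 8}

total_f = sum(shoe_sizes.values())                      # sum of f
total_fx = sum(x * f for x, f in shoe_sizes.items())    # sum of f × x
mean_size = total_fx / total_f

# the mode is the x value with the largest frequency
mode_size = max(shoe_sizes, key=shoe_sizes.get)

# for the median, write every value out f times and take the middle
all_values = [x for x, f in shoe_sizes.items() for _ in range(f)]
median_size = median(all_values)

print(total_f, total_fx, round(mean_size, 1), mode_size, median_size)
```

Writing every value out is fine for small tables; the cumulative frequency column reaches the same median without expanding the data.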

Cumulative frequency

A cumulative frequency column can be added to a frequency table to keep a running total of the frequencies. A group of parents were asked how many children they each had.

| Number of children (x) | Frequency (f) | Cumulative frequency |
|---|---|---|
| 1 | 6 | 6 |
| 2 | 7 | 13 (6 + 7) |
| 3 | 5 | 18 (13 + 5) |
| 4 | 3 | 21 (18 + 3) |
| 5 | 1 | 22 (21 + 1) |
| Total | 22 | |

The 21 tells you that 21 parents had 4 or fewer children.

You can easily work out the median from a cumulative frequency column. There were 22 parents asked, so the median is between the 11th and 12th people asked. 6 parents had 1 child and 13 had 2 or fewer. Therefore, if the median is between the 11th and 12th, it must be 2 children.
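A running total like this can be built with `itertools.accumulate`. The sketch below uses the parents' data and a small hypothetical helper, `value_at`, to look up which number of children sits at a given position.

```python
from itertools import accumulate

children = [1, 2, 3, 4, 5]    # x
frequency = [6, 7, 5, 3, 1]   # f

# running total of the frequencies
cumulative = list(accumulate(frequency))


def value_at(position):
    """Return the x value found at a 1-based position in the ordered data."""
    for x, running_total in zip(children, cumulative):
        if position <= running_total:
            return x
    raise ValueError("position is beyond the last value")


total = cumulative[-1]
if total % 2 == 0:
    # even count: mean of the two middle values
    median_children = (value_at(total // 2) + value_at(total // 2 + 1)) / 2
else:
    median_children = value_at((total + 1) // 2)

print(cumulative, median_children)
```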

Cumulative frequency

The number of pairs of shoes owned by 5th year girls is shown below. Copy and complete the cumulative frequency table.

| Number of pairs of shoes (x) | Frequency (f) | Cumulative frequency |
|---|---|---|
| 4 | 9 | |
| 5 | 12 | |
| 6 | 8 | |
| 7 | 11 | |
| 8 | 6 | |

1) How many girls owned fewer than 6 pairs of shoes?

2) What was the median number of pairs of shoes owned by the girls?

Cumulative frequency

Now check that your answers are correct.

| Number of pairs of shoes (x) | Frequency (f) | Cumulative frequency |
|---|---|---|
| 4 | 9 | 9 |
| 5 | 12 | 21 |
| 6 | 8 | 29 |
| 7 | 11 | 40 |
| 8 | 6 | 46 |

1) 21 girls

2) Median = 6

Cumulative frequency

A group of 4th year boys were asked how many hours a week they spent playing computer games. Copy and complete the cumulative frequency table.

| Number of hours (x) | Frequency (f) | Cumulative frequency |
|---|---|---|
| 5 | 2 | |
| 6 | 5 | |
| 7 | 12 | |
| 8 | 17 | |
| 9 | 20 | |

1) How many boys played games for less than 8 hours a week?

2) What was the median number of hours spent playing computer games?

Cumulative frequency

Now check that your answers are correct.

| Number of hours (x) | Frequency (f) | Cumulative frequency |
|---|---|---|
| 5 | 2 | 2 |
| 6 | 5 | 7 |
| 7 | 12 | 19 |
| 8 | 17 | 36 |
| 9 | 20 | 56 |

1) 19 boys

2) Median = 8

Quartiles

To order a set of numbers into quartiles, we first of all have to put the numbers in order from the lowest to the highest.

10, 11, 11, 12, 12, 12, 13, 13, 13, 14, 14, 14, 15, 15, 30

Q1 = 12, Q2 = 13, Q3 = 14

The median splits the numbers into two equal parts and is the second quartile, Q2. To find the other two quartiles, Q1 and Q3, you calculate the median of the lower and upper halves. The median of the lower half is called Q1. The median of the upper half is called Q3. The quartiles must divide the numbers into four groups with the same amount of numbers in each group, in this case groups of three.

Quartiles

If you have a larger group of numbers, it might not be so easy to find which number to look to for the median and the quartiles. The following rule will help you decide, no matter how many numbers you have.

1. Divide the number of values by 4.
2. Your answer will tell you how many numbers will be in each group.
3. The remainder will tell you how many extra values there are. This will be 0, 1, 2 or 3.

Example 1: 12 numbers

2, 3, 3, 3, 4, 6, 7, 7, 8, 8, 8, 9

12 ÷ 4 = 3 r 0, therefore there will be 3 in each quarter, with 0 extra values to be fitted in.

Q1 = 3, Q2 = 6.5, Q3 = 8

Quartiles

Example 2: 13 numbers

0, 1, 2, 2, 2, 2, 3, 5, 6, 7, 7, 7, 9

13 ÷ 4 = 3 r 1, therefore there will be 3 in each quarter, with 1 extra value to be fitted in symmetrically.

Q1 = 2, Q2 = 3, Q3 = 7

Example 3: 14 numbers

0, 0, 0, 1, 1, 2, 2, 3, 4, 4, 5, 5, 5, 6

14 ÷ 4 = 3 r 2, therefore there will be 3 in each quarter, with 2 extra values to be fitted in symmetrically.

Q1 = 1, Q2 = 2.5, Q3 = 5

Quartiles

Example 4: 15 numbers

0, 1, 2, 3, 3, 3, 3, 4, 5, 5, 6, 7, 7, 8, 8

15 ÷ 4 = 3 r 3, therefore there will be 3 in each quarter, with 3 extra values to be fitted in symmetrically.

Q1 = 3, Q2 = 4, Q3 = 7
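The four examples can be checked with a short Python sketch of the "median of each half" method. The function `quartiles` is a hypothetical helper written for this check; note that library routines such as `statistics.quantiles` use different conventions and may give slightly different answers.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    With an odd count, the middle value is Q2 and belongs to neither half.
    Assumes at least two values.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]     # values below the median position
    upper = ordered[-half:]    # values above the median position
    return median(lower), median(ordered), median(upper)


example_1 = [2, 3, 3, 3, 4, 6, 7, 7, 8, 8, 8, 9]
example_2 = [0, 1, 2, 2, 2, 2, 3, 5, 6, 7, 7, 7, 9]
example_3 = [0, 0, 0, 1, 1, 2, 2, 3, 4, 4, 5, 5, 5, 6]
example_4 = [0, 1, 2, 3, 3, 3, 3, 4, 5, 5, 6, 7, 7, 8, 8]

for data in (example_1, example_2, example_3, example_4):
    print(len(data), quartiles(data))
```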

Five-figure summary

A five-figure summary is a summary of a set of numbers. The five figures are the three quartiles (Q1, Q2 and Q3) together with the highest and lowest numbers.

10, 11, 11, 12, 12, 12, 13, 13, 13, 14, 14, 14, 15, 15, 30

Using the previous example showing how to calculate the quartiles, the five-figure summary is as follows:

- highest: 30
- lowest: 10
- Q1: 12
- Q2: 13
- Q3: 14
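As a rough check, the five-figure summary can be assembled in Python. The function `five_figure_summary` is an illustrative helper, and it assumes the median-of-halves method for Q1 and Q3.

```python
from statistics import median


def five_figure_summary(values):
    """Return the lowest value, Q1, Q2, Q3 and the highest value."""
    ordered = sorted(values)
    half = len(ordered) // 2   # size of each half, excluding a middle value
    return {
        "lowest": ordered[0],
        "Q1": median(ordered[:half]),
        "Q2": median(ordered),
        "Q3": median(ordered[-half:]),
        "highest": ordered[-1],
    }


ages = [10, 11, 11, 12, 12, 12, 13, 13, 13, 14, 14, 14, 15, 15, 30]
summary = five_figure_summary(ages)
print(summary)
```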

Five-figure summary

In your jotters, write down the five-figure summary for each set of numbers.

1) 14 pupils in S4 were asked their shoe size.
4, 5, 3, 6, 4, 7, 4, 5, 6, 6, 4, 7, 8, 9

2) A group of paper boys were asked how much they earn a week.
13, 14, 15, 12, 16, 17, 18, 18, 18, 19, 29, 39, 38, 37, 36, 37

3) [Stem-and-leaf diagram: English exam scores, stems 1 to 4, n = 23]

4) A group at the swimming pool were asked their ages.
12, 8, 7, 19, 23, 25, 20, 14

1) Minimum: 3 Maximum: 9 Q1: 4 Q2: 5.5 Q3: 7

2) Minimum: 12 Maximum: 39 Q1: 15.5 Q2: 18 Q3: 36.5

3) Minimum: Maximum: Q1: Q2: Q3: 28

4) Minimum: 7 Maximum: 25 Q1: 10 Q2: 16.5 Q3: 21.5

The range

Up until now, when we calculated the range of a set of numbers, we took the lowest number from the highest. In certain situations, however, this will not give an accurate reflection of the spread of the numbers. For example, here are the ages of a group of children in the scouts and their leader.

10, 11, 12, 14, 13, 15, 13, 12, 11, 12, 14, 15, 14, 13, 30

The range here is the highest, 30, take away the lowest, 10:

range = 30 – 10 = 20

All of the children are aged between 10 and 15. The leader of the group is 30 and this gives a false impression of how widely spread the ages are. The range only uses the two end ages and disregards all the others. Another measure of spread is the semi-interquartile range, which takes into account more of the numbers to give a more accurate and relevant result.

The semi-interquartile range
Now that we know how to work out the quartiles, we can calculate the semi-interquartile range. Using the example of the scout group, we found that the range was 20 years. However, because one person was so much older than the rest, this was not an accurate reflection of the spread of ages.

10, 11, 11, 12, 12, 12, 13, 13, 13, 14, 14, 14, 15, 15, 30

Q1 = 12, Q2 = 13, Q3 = 14

To calculate the semi-interquartile range, you find the difference between the upper quartile Q3 and the lower quartile Q1, and then halve your answer:

$$\text{semi-interquartile range} = \frac{Q_3 - Q_1}{2} = \frac{14 - 12}{2} = 1$$

The semi-interquartile range
This way of working out the spread is often preferred to just taking the lowest from the highest as you do for the range. The reason for this is that it takes into account more of the numbers in the data and it also disregards what can sometimes be extreme high or low numbers that are not typical of the data.

If you ever forget the formula for calculating the semi-interquartile range, you could construct it by breaking down the words. Interquartile range is the range between the upper and lower quartile. To calculate the ‘semi’, divide your answer by 2.
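The effect of an extreme value on the two measures can be seen in a small Python sketch. The helper `semi_interquartile_range` and the "scouts only" list (the same ages with the leader removed) are illustrative additions, not part of the original slides.

```python
from statistics import median


def semi_interquartile_range(values):
    """Return (Q3 - Q1) / 2 using the median-of-halves quartiles."""
    ordered = sorted(values)
    half = len(ordered) // 2
    q1 = median(ordered[:half])
    q3 = median(ordered[-half:])
    return (q3 - q1) / 2


scouts_and_leader = [10, 11, 12, 14, 13, 15, 13, 12, 11, 12, 14, 15, 14, 13, 30]
# hypothetical comparison: the same group without the 30-year-old leader
scouts_only = [age for age in scouts_and_leader if age != 30]

for group in (scouts_and_leader, scouts_only):
    spread = max(group) - min(group)
    print(spread, semi_interquartile_range(group))
```

Removing the leader changes the range from 20 to 5, while the semi-interquartile range stays at 1 in both cases.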

The semi-interquartile range
1) A group of 20 pupils were asked how much pocket money they got each week.
7, 7, 2, 2, 3, 4, 5, 9, 10, 3, 4, 5, 6, 6, 7, 7, 7, 6, 3, 3
In your jotters, write down the five-figure summary, range and semi-interquartile range.

2) A group of 15 pupils were asked what their shoe size was.
12, 9, 2, 3, 7, 8, 5, 5, 6, 8, 9, 10, 5, 4, 6
In your jotters, write down the five-figure summary, range and semi-interquartile range.

The semi-interquartile range
Now check that your answers are correct.

1) minimum: 2 maximum: 10 Q1: 3 Q2: 5.5 Q3: 7 range: 8 semi-interquartile range: 2

2) minimum: 2 maximum: 12 Q1: 5 Q2: 6 Q3: 9 range: 10 semi-interquartile range: 2

Comparing sets of data

Very often the reason for using statistics is to compare two or more sets of results. Once you have statistics for two or more sets of data, you can make statements based on the results.

Example: As part of a school project, pupils from two schools were asked how much pocket money they received each week.

Quahog School: , 6, 5, 4, 6, 5, 7, 10, 10, 7, 8, 9, 7

Springfield Elementary: 12, 7, 3, 4, 2, 3, 4, 4, 5, 2, 6, 3, 4

Quahog: Mean = £7.08 Median = £7.00

Springfield: Mean = £4.92 Median = £4.00

Comparing sets of data By calculating the mean and median from each set of data, what statements can be made about how much each child receives? Quahog: Mean = £7.08 Median = £7.00 Springfield: Mean = £4.92 Median = £4.00 By looking at the mean and median of both sets of data, we can see that the children at Quahog are given more pocket money on average than the children that go to Springfield Elementary. The mean and median are similar in each school, which suggests that they are both a good indication of the average given to each child.

Comparing sets of data

By comparing the mean, median and range of the following sets of data, what statements can be made about the data?

1) Two companies that produce boxes of paper clips each claim that they provide their customers with more paper clips in each box. The boxes cost the same from each company.

Clips R Us: , 106, 101, 100, 99, 92, 96, 100, 101, 110, 90

Pippa’s Clippas: 87, 120, 104, 102, 100, 98, 97, 100, 101, 102, 95

2) A teacher wanted to compare the marks of her two first-year classes. What conclusions can you make about the scores?

Class 2A: , 19, 17, 18, 17, 17, 18, 18, 19, 17, 20, 16, 13, 12

Class 2B: , 20, 19, 3, 2, 4, 6, 10, 11, 3, 2, 15, 16, 17

1) Clips R Us: Mean: Median: Range: 20

Pippa’s Clippas: Mean: 100.5 Median: 100 Range: 33

By comparing the median we can see no difference in the results. The mean shows that Pippa’s Clippas have slightly more on average in each box. However, the range is much bigger, meaning that the amount in each box could vary by a fairly large amount in comparison to Clips R Us.

2) Class 2A: Mean: Median: Range: 8

Class 2B: Mean: Median: Range: 18

The mean for each class tells us that class 2A achieved a higher mark on average than 2B and the median backs this up. The range in 2B is very high, suggesting that while some people did very well, others did very poorly. The range in 2A shows that the scores that each pupil achieved were closer together, suggesting that in this class pupils are closely matched in ability.

Box plots

Once you have a five-figure summary you can represent the information on a box plot. Using the earlier example of the scout trip, we calculated that the quartiles were 12, 13 and 14.

10, 11, 11, 12, 12, 12, 13, 13, 13, 14, 14, 14, 15, 15, 30

We can see from the list that the lowest number is 10 and the highest number is 30. This information can be represented on a box plot.

[Box plot on a scale from 2 to 30, with the lowest value, Q1, the median (Q2), Q3 and the highest value labelled.]

Standard deviation

So far we have looked at two methods for checking the spread of numbers: the range and the semi-interquartile range. The last measure of spread of data we are going to look at is called standard deviation. The reason that we need to use another method is because of the limitations of the range and semi-interquartile range, which are:

- the range only uses the two end values, ignoring every other value
- the semi-interquartile range totally disregards the two end values.

The standard deviation is the most accurate measure of spread because it takes into account all of the numbers. When you work out the standard deviation you obtain a number. This number tells you how far away on average each of the values is from the mean.

Standard deviation

To work out the standard deviation of a group of numbers we are going to divide the calculation into four steps.

Example: The following group of numbers is how late the bus was (in minutes) each day as George went to work one week.

23, 15, 7, 8, 7

Calculate the mean and the standard deviation.

Step 1 Calculate the mean. Each value when we use standard deviation is represented with an x.

mean = (the sum of all the x values) ÷ (the number of values)

We are going to be using some new notation for this:

$$\bar{x} = \frac{\sum x}{n}$$

- $\bar{x}$ (pronounced x bar) is the mean
- $\sum x$ is the sum of all the x values
- $n$ is the number of values used.

Step 2 Now we draw a table to see how far each value is from the mean.

23, 15, 7, 8, 7

$\bar{x} = 12$ is the mean.

| $x$ | $x - \bar{x}$ | $(x - \bar{x})^2$ |
|---|---|---|
| 23 | 23 – 12 = 11 | 121 |
| 15 | 15 – 12 = 3 | 9 |
| 7 | 7 – 12 = -5 | 25 |
| 8 | 8 – 12 = -4 | 16 |
| 7 | 7 – 12 = -5 | 25 |

We now need to find the mean of these differences, but if we add them together we get zero! To get round this problem we square each value. The negatives disappear and we add an extra column to the table.

Step 3 We now find the mean of the numbers in the last column. For standard deviation we divide the total by the number of values minus 1. In this case 5 – 1 = 4.

(121 + 9 + 25 + 16 + 25) ÷ 4 = 196 ÷ 4 = 49

Step 4 Remember that we squared the numbers in step 2, so now we must find the square root of 49. This number is called the standard deviation and is the measure of how far each value is from the mean.

The formula for standard deviation is:

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

In this case, $s = \sqrt{49} = 7$.

When the standard deviation is low it means the scores are close to the mean. When it is high it means they are spread out from the mean. In this case it is a high number in relation to the mean, so the numbers are spread out from the mean.
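The four steps translate directly into a few lines of Python. This is a sketch for checking working, with illustrative variable names; the last line compares the result with `statistics.stdev`, which also divides by the number of values minus 1.

```python
from math import sqrt
from statistics import stdev

minutes_late = [23, 15, 7, 8, 7]
n = len(minutes_late)

# Step 1: the mean
x_bar = sum(minutes_late) / n

# Step 2: how far each value is from the mean, then squared
differences = [x - x_bar for x in minutes_late]
squared = [d ** 2 for d in differences]

# Step 3: divide the total of the squares by n - 1
total_over_n_minus_1 = sum(squared) / (n - 1)

# Step 4: take the square root
s = sqrt(total_over_n_minus_1)

print(x_bar, sum(differences), sum(squared), total_over_n_minus_1, s)
print(stdev(minutes_late))  # library value for comparison
```

The printed sum of the differences is zero, which is why step 2 squares them before averaging.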

Standard deviation

Use this formula to calculate the standard deviation of the following sets of data in your jotters.

1) The ages of four people who climbed Everest are: 28, 43, 50, 27

2) The following times show the 0 to 60 acceleration of different BMWs: 6.0, 5.2, 10.7, 9.6, 8.3, 11.5, 7.5

3) The following scores were recorded at a golf competition: 68, 72, 70, 71, 69

1) 11.3 2) 2.4 3) 1.6

Standard deviation

There is one final formula that can be used to find the standard deviation from a set of numbers. You will have noticed that in the previous examples when you calculated the mean at the beginning, it gave an easy-to-use number, i.e. the mean was either a whole number or a decimal number to 1 decimal place. If you calculate the mean and you have a number with many decimal places, you can use an alternative formula. This still gives the same answer as the one we found before, but this formula is easier to use for numbers that have more decimal places.

$$s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}}$$

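Standard deviation

Example: Calculate the mean and standard deviation of the following numbers.

22, 23, 21, 20, 20.4, 21.3

| $x$ | $x^2$ |
|---|---|
| 22 | 484 |
| 23 | 529 |
| 21 | 441 |
| 20 | 400 |
| 20.4 | 416.16 |
| 21.3 | 453.69 |
| $\sum x = 127.7$ | $\sum x^2 = 2723.85$ |

mean = 127.7 ÷ 6 = 21.2833…

$$s = \sqrt{\frac{2723.85 - \frac{127.7^2}{6}}{5}} \approx 1.09$$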
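The alternative formula, which needs only the sum of the values and the sum of their squares, can be sketched in Python as follows. The function name `standard_deviation_alt` is made up for this example, and the result is compared with `statistics.stdev` as an independent check.

```python
from math import sqrt
from statistics import stdev


def standard_deviation_alt(values):
    """Standard deviation from the sum of x and the sum of x squared."""
    n = len(values)
    sum_x = sum(values)
    sum_x_squared = sum(x ** 2 for x in values)
    return sqrt((sum_x_squared - sum_x ** 2 / n) / (n - 1))


data = [22, 23, 21, 20, 20.4, 21.3]
s_alt = standard_deviation_alt(data)

print(round(sum(data), 2), round(sum(x ** 2 for x in data), 2))
print(round(s_alt, 2), round(stdev(data), 2))
```

Both routes give the same value to the accuracy needed here; with very large values and a tiny spread, the subtraction in this formula can lose precision in floating-point arithmetic, so the library function is the safer general choice.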

Standard deviation

Use this formula to calculate the standard deviation of the following sets of data in your jotters.

1) The reaction times of four drivers were tested: 0.23, 0.85, 0.42, 0.94

2) The BMI values of a group of S5 pupils were recorded as follows: 17.7, , 21.2, 23, , 18.4

3) A group of S6 students were asked at what age they thought they would get married: 33, 32, 34, 34, 35

1) 0.3 2) 2.6 3) 1.1

Probability

Probability is the likelihood of an event happening. To calculate the probability of an event happening, the following formula can be used. P stands for probability.

$$P(\text{event}) = \frac{\text{number of favourable outcomes}}{\text{number of possible outcomes}}$$

Example: If you were to roll a dice, what would the probability be that it would land on a 2?

$$P(2) = \frac{\text{number of 2s on the dice}}{\text{total numbers on the dice}} = \frac{1}{6}$$

Probability

Example: If you were to roll a dice, what is the probability that you would roll an odd number?

$$P(\text{odd}) = \frac{\text{number of odd numbers on the dice}}{\text{total numbers on the dice}} = \frac{3}{6} = \frac{1}{2}$$

Example: If you were to pick a random card out from a pack of cards, what is the probability that you would pick out the number 4?

$$P(4) = \frac{\text{number of 4s in a pack of cards}}{\text{total number of cards in a pack}} = \frac{4}{52} = \frac{1}{13}$$
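Python's `fractions.Fraction` keeps probabilities as exact fractions and simplifies them automatically, which makes it handy for checking these examples. The helper `probability` and the variable names are illustrative only.

```python
from fractions import Fraction


def probability(favourable, possible):
    """Number of favourable outcomes over number of possible outcomes."""
    return Fraction(favourable, possible)


dice = [1, 2, 3, 4, 5, 6]

p_two = probability(dice.count(2), len(dice))
p_odd = probability(sum(1 for face in dice if face % 2 == 1), len(dice))
p_four = probability(4, 52)   # four 4s in a pack of 52 cards

print(p_two, p_odd, p_four)   # 1/6 1/2 1/13
```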

Probability

In your jotters, calculate the probability of the following events happening.

1) There are 52 cards in a pack. What is the probability that you pick out a red card?

2) A bag full of bank notes has 14 £1 notes, 6 £5 notes, 3 £10 notes and 1 £20 note. What is the probability that a £5 note would be randomly picked out?

3) There are 49 numbers in the National Lottery. What is the probability that the first ball that rolls out is a multiple of 4?

Probability

Now check that your answers are correct.

1) $\frac{26}{52} = \frac{1}{2}$

2) $\frac{6}{24} = \frac{1}{4}$

3) $\frac{12}{49}$
