S1: Chapter 4 Representation of Data. Dr J Frost. Last modified: 25th September 2014.


1 S1: Chapter 4 Representation of Data. Dr J Frost (jfrost@tiffin.kingston.sch.uk). Last modified: 25th September 2014

2 Stem and Leaf recap
Put the following measurements into a stem and leaf diagram:
4.7 3.6 3.8 4.7 4.1 2.2 3.6 4.0 4.4 5.0
3.7 4.6 4.8 3.7 3.2 2.5 3.6 4.5 4.7 5.2
4.7 4.2 3.8 5.1 1.4 2.1 3.5 4.2 2.4 5.1
1 | 4 (1)
2 | 1 2 4 5 (4)
3 | 2 5 6 6 6 7 7 8 8 (9)
4 | 0 1 2 2 4 5 6 7 7 7 7 8 (12)
5 | 0 1 1 2 (4)
Key: 2 | 1 means 2.1
Now find: [the list of quantities is missing from this transcript]
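
The construction on this slide can be sketched in Python. This is only an illustration, not part of the original slides: the function name `stem_and_leaf` is made up here, and it assumes every measurement has exactly one decimal place, as in the data above.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group one-decimal-place values into {stem: sorted list of leaves}."""
    rows = defaultdict(list)
    for v in values:
        tenths = round(v * 10)            # work in whole tenths to avoid float noise
        stem, leaf = divmod(tenths, 10)   # e.g. 47 -> stem 4, leaf 7
        rows[stem].append(leaf)
    return {stem: sorted(leaves) for stem, leaves in sorted(rows.items())}

data = [4.7, 3.6, 3.8, 4.7, 4.1, 2.2, 3.6, 4.0, 4.4, 5.0,
        3.7, 4.6, 4.8, 3.7, 3.2, 2.5, 3.6, 4.5, 4.7, 5.2,
        4.7, 4.2, 3.8, 5.1, 1.4, 2.1, 3.5, 4.2, 2.4, 5.1]

diagram = stem_and_leaf(data)
for stem, leaves in diagram.items():
    # Each row: stem, ordered leaves, and the row count in brackets
    print(f"{stem} | {' '.join(map(str, leaves))} ({len(leaves)})")
print("Key: 2 | 1 means 2.1")
```

The row counts in brackets are a quick check that no value has been dropped: they should add up to the 30 measurements.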

3 Back-to-Back Stem and Leaf recap
Girls: 55 80 84 91 80 92 98 40 60 64 66 72 96 85 88 90 76 54 58 92 78 80 79
Boys: 80 60 91 65 67 59 75 46 72 71 74 57 64 60 50 68
The data above shows the pulse rates of boys and girls in a school. Comment on the results.
      Girls |   | Boys
          0 | 4 | 6
      8 5 4 | 5 | 0 7 9
      6 4 0 | 6 | 0 0 4 5 7 8
    9 8 6 2 | 7 | 1 2 4 5
8 5 4 0 0 0 | 8 | 0
8 6 2 2 1 0 | 9 | 1
Key: 0 | 4 | 6 means 40 for girls and 46 for boys.
The back-to-back stem and leaf diagram shows that boys' pulse rates tend to be lower than girls'.
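
One way to back up the comment with a number is to compare the two medians. The sketch below is an illustration rather than something from the slides; it assumes the pulse rates are all two-digit values, so that fused entries such as "5580" in the transcript read as 55 and 80 (this split agrees with the leaves in the diagram).

```python
from statistics import median

# Pulse rates read from the slide, assuming two-digit values throughout
girls = [55, 80, 84, 91, 80, 92, 98, 40, 60, 64, 66, 72,
         96, 85, 88, 90, 76, 54, 58, 92, 78, 80, 79]
boys = [80, 60, 91, 65, 67, 59, 75, 46, 72, 71, 74, 57, 64, 60, 50, 68]

girls_median = median(girls)   # 23 values: the 12th in order
boys_median = median(boys)     # 16 values: mean of the 8th and 9th in order

print("Girls' median pulse rate:", girls_median)
print("Boys' median pulse rate:", boys_median)
```

A lower median for the boys supports the statement that boys' pulse rates tend to be lower.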

4 Box Plot recap
Box plots allow us to visually represent the distribution of the data.
Minimum: 3. Lower Quartile: 15. Median: 17. Upper Quartile: 22. Maximum: 27.
[Box plot sketched on an axis from 0 to 30, with the IQR and the range labelled]
How is the IQR represented in this diagram? How is the range represented in this diagram?

5 Box Plots recap
Sketch a box plot to represent the given weights of cats: 5lb, 6lb, 7.5lb, 8lb, 8lb, 9lb, 12lb, 14lb, 20lb
Minimum: 5. Maximum: 20. Median: 8. Lower Quartile: 7.5. Upper Quartile: 12.
[Box plot to be sketched on an axis from 0 to 24]
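
The five values for the cats can be reproduced with a short Python sketch. The slide does not state which quartile rule it uses; the code assumes the common S1 convention for listed data (find n/4, n/2 or 3n/4; if it is a whole number, average that item and the next; otherwise round up), and the function name `quartile` is made up for this example.

```python
import math

def quartile(sorted_vals, fraction):
    """Quartile by the position rule described above (an assumed convention)."""
    pos = fraction * len(sorted_vals)
    if pos == int(pos):
        k = int(pos)
        # Whole number: halfway between the k-th and (k+1)-th items
        return (sorted_vals[k - 1] + sorted_vals[k]) / 2
    # Otherwise round the position up and take that item
    return sorted_vals[math.ceil(pos) - 1]

weights = sorted([5, 6, 7.5, 8, 8, 9, 12, 14, 20])

summary = (
    weights[0],                  # minimum
    quartile(weights, 0.25),     # lower quartile
    quartile(weights, 0.5),      # median
    quartile(weights, 0.75),     # upper quartile
    weights[-1],                 # maximum
)
print(summary)
```

Other software often uses different quartile conventions, so values from a spreadsheet or library may not match the hand method.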

6 Outliers
An outlier is: an extreme value.
More specifically, it's generally a value more than 1.5 IQRs below the lower quartile or above the upper quartile. (But you will be told in the exam if the rule differs from this.)
[Box plot on an axis from 0 to 30, with the regions past these boundaries marked "Outliers beyond this point"]

7 Outliers
We can display outliers as crosses on a box plot. But if we have one, how do we display the marks for the minimum/maximum?
[Two box plots on an axis from 0 to 30]
The maximum point is not an outlier, so it remains unchanged. But we have points that are outliers at the lower end, so this mark becomes the 'outlier boundary', rather than the minimum.

8 Examples
Smallest values: 0, 3. Largest values: 21, 27. Lower Quartile: 8. Median: 10. Upper Quartile: 14.
[Box plot to be drawn on an axis from 0 to 30]
Smallest values: 3, 7. Largest values: 20, 25, 26. Lower Quartile: 12. Median: 13. Upper Quartile: 16.
[Box plot to be drawn on an axis from 0 to 30]
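
The 1.5 × IQR rule from the Outliers slides can be applied to these two examples as follows. This is a sketch for checking working, with made-up function names; it assumes the default rule and would need a different `k` if an exam question states another multiple.

```python
def outlier_boundaries(q1, q3, k=1.5):
    """Return (lower boundary, upper boundary) for the k x IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def find_outliers(values, q1, q3):
    """Values lying strictly beyond either boundary."""
    low, high = outlier_boundaries(q1, q3)
    return [v for v in values if v < low or v > high]

# Example 1: quartiles 8 and 14, extreme values 0, 3, 21, 27
bounds_1 = outlier_boundaries(8, 14)
outliers_1 = find_outliers([0, 3, 21, 27], 8, 14)

# Example 2: quartiles 12 and 16, extreme values 3, 7, 20, 25, 26
bounds_2 = outlier_boundaries(12, 16)
outliers_2 = find_outliers([3, 7, 20, 25, 26], 12, 16)

print(bounds_1, outliers_1)
print(bounds_2, outliers_2)
```

Any value reported as an outlier is drawn as a cross, and the whisker on that side stops at the boundary rather than at the extreme value.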

9 Exercises
Page 58 Exercise 4B Q2. Page 59 Exercise 4C Q1, 2.

10 Comparing Box Plots
[Box plots comparing house prices in Croydon and Kingston-upon-Thames, on an axis from £100k to £450k]
"Compare the prices of houses in Croydon with those in Kingston". (2 marks)
For 1 mark, one of: The interquartile range of house prices in Kingston is greater than in Croydon. The range of house prices in Kingston is greater than in Croydon. i.e. Something spread related.
For 1 mark: The median house price in Kingston was greater than that in Croydon. i.e. Compare some measure of location (could be minimum, lower quartile, etc.)

11 Bar Charts vs Histograms
Bar Charts: For discrete data. Frequency given by height of bars. [Example chart: frequency against shoe size, sizes 6 to 9]
Histograms: For continuous data. (Use this as a reason whenever you're asked to justify use of a histogram.) Data divided into (potentially uneven) intervals. [GCSE definition] Frequency given by area of bars.* No gaps between bars. [Example chart: frequency density against height, 1.0m to 1.8m]
* Not actually true. We'll correct this in a sec.

12 Bar Charts vs Histograms
Still using the 'incorrect' GCSE formula: Frequency Density = Frequency ÷ Width
Weight (w kg)    Frequency    Frequency Density
0 < w ≤ 10       40           4
10 < w ≤ 15      6            1.2
15 < w ≤ 35      52           2.6
35 < w ≤ 45      10           1
[Histogram: frequency density (1 to 5) against Height (m) on an axis from 10 to 50. Frequencies to find from the bars: 15, 30, 40, 25]
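
The frequency density column in the weight table can be checked with a few lines of Python. This is an illustrative sketch using the GCSE relationship frequency density = frequency ÷ class width; the tuple layout and the function name are choices made for this example.

```python
# (lower bound, upper bound, frequency) for each weight class on the slide
classes = [(0, 10, 40), (10, 15, 6), (15, 35, 52), (35, 45, 10)]

def frequency_density(lower, upper, frequency):
    """Frequency density = frequency / class width."""
    return frequency / (upper - lower)

densities = [frequency_density(*c) for c in classes]

for (lower, upper, freq), fd in zip(classes, densities):
    print(f"{lower} < w <= {upper}: frequency {freq}, frequency density {fd:g}")
```

Going the other way, a bar's frequency under this formula is its frequency density multiplied by its width.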

13 Area = frequency?

14 The key to almost every histogram question… …This diagram! Area → Frequency
For a given histogram, there's some scaling to get from an area (whether the total area or the area of a particular bar) to the corresponding frequency. Once you've worked out this scaling, any subsequent areas you calculate can be converted to frequencies.

15 Area = frequency?
There were 60 runners in a 100m race. The following histogram represents their times. Determine the number of runners with times above 14s.
[Histogram: frequency density (0 to 5) against Time (s), with 9, 12 and 18 marked on the time axis]
We first find what area represents the total frequency. Total area = 15 + 9 = 24. Then use this scaling along with the desired area.
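
The scaling idea can be sketched in Python. The histogram itself is not preserved in this transcript, so the bars below are an assumption: one bar from 9s to 12s and one from 12s to 18s, with heights chosen so the areas are 15 and 9 as in the total on the slide. The function name `area_between` is also made up for the example.

```python
def area_between(bars, lo, hi):
    """Total histogram area lying between lo and hi on the horizontal axis."""
    total = 0.0
    for start, end, height in bars:
        overlap = max(0.0, min(end, hi) - max(start, lo))
        total += overlap * height
    return total

# Assumed bars (start, end, frequency density): areas 3 * 5 = 15 and 6 * 1.5 = 9
bars = [(9, 12, 5.0), (12, 18, 1.5)]
total_frequency = 60

total_area = area_between(bars, 9, 18)          # area that represents all 60 runners
scale = total_frequency / total_area            # runners per unit of area
runners_above_14 = scale * area_between(bars, 14, 18)

print("Total area:", total_area)
print("Runners per unit area:", scale)
print("Runners with times above 14s:", runners_above_14)
```

Under these assumed bars, the scale only needs to be worked out once; any other region of the histogram can then be converted to a frequency by the same multiplication.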

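16 [Table: Weight (to nearest kg) against Frequency, with classes 1-2, 3-6 and 7-9; the frequencies are not preserved in this transcript]
[Histogram: frequency density (0 to 5) against an axis running from 1 to 10, labelled "Time (s)" on the slide]
Note the gaps! We can use the complete set of information in the first row combined with the bar to again work out the correct 'scaling'.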
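
Because the weights are recorded to the nearest kg, the gaps between classes close up once the true class boundaries are used. The sketch below shows that step; it assumes ordinary rounding to the nearest unit, and the function name `class_boundaries` is made up for illustration.

```python
def class_boundaries(low, high, precision=1):
    """True boundaries of a class whose values were rounded to `precision`."""
    half = precision / 2
    return low - half, high + half

stated_classes = [(1, 2), (3, 6), (7, 9)]   # as written in the table

boundaries = [class_boundaries(low, high) for low, high in stated_classes]
widths = [upper - lower for lower, upper in boundaries]

for (low, high), (lower, upper), width in zip(stated_classes, boundaries, widths):
    print(f"{low}-{high} kg covers {lower} to {upper}, width {width}")
```

These widths, not the differences between the stated limits, are the ones to use when working out bar areas.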

17 May 2012
A policeman records the speed of the traffic on a busy road with a 30 mph speed limit. He records the speeds of a sample of 450 cars. The histogram in Figure 2 represents the results.
(a) Calculate the number of cars that were exceeding the speed limit by at least 5 mph in the sample. (4 marks)
[Figure 2: histogram with the frequency density axis numbered 1 to 7] We can make the frequency density scale what we like.
M1 A1: Determine what one small square or one large square is worth.
M1 A1: Use this to find number of cars travelling >35mph.

18 May 2012
A policeman records the speed of the traffic on a busy road with a 30 mph speed limit. He records the speeds of a sample of 450 cars. The histogram in Figure 2 represents the results.
(b) Estimate the value of the mean speed of the cars in the sample. (3 marks)
M1 M1: Use histogram to construct sum of speeds.
A1: Correct value.
Bro Tip: Whenever you are asked to calculate mean, median or quartiles from a histogram, form a grouped frequency table. Use your scaling factor to work out the frequency of each bar.
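
Once a grouped frequency table has been formed, the mean is estimated from class midpoints. The sketch below illustrates the method only; it does not use the exam histogram (which is not preserved here) and instead reuses the weight classes from slide 12 as sample data. The function name is made up for this example.

```python
def grouped_mean(classes):
    """Estimate the mean from (lower, upper, frequency) rows using midpoints."""
    total_fx = 0.0
    total_f = 0
    for lower, upper, frequency in classes:
        midpoint = (lower + upper) / 2
        total_fx += midpoint * frequency   # each item is treated as sitting at the midpoint
        total_f += frequency
    return total_fx / total_f

# Sample data only: the weight table from slide 12
weight_classes = [(0, 10, 40), (10, 15, 6), (15, 35, 52), (35, 45, 10)]

estimate = grouped_mean(weight_classes)
print(f"Estimated mean weight: {estimate:.2f} kg")
```

The result is an estimate because the individual values inside each class are unknown.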

19 May 2012
Speed
10-15    12.5    30
20-30    15      240
30-35    32.5    90
35-40    37.5    30
40-45    42.5    60

20 Jan 2012
[Exam question not preserved in this transcript] 14; 5
Bro Tip: Be careful that you use the correct class widths!
21 + 45 + 3 = 69

21 Jan 2008
[Exam question not preserved in this transcript] M1 A1 B1 M1 A1. Answer = 12 runners

22 Answer: Distance is continuous.
Note the gaps in the class intervals!
4 / 5 = 0.8, 19 / 5 = 3.8, 53 / 10 = 5.3, ...

23 Jun 2007
[Exam question not preserved in this transcript] 35; 15; (5 × 5) + 15 = 40

24 Skew
Skew gives a measure of whether the values are more spread out above the median or below the median.
[Two sketched distributions, frequency against height and frequency against weight, each with the mode, median and mean marked]
We say this distribution has positive skew. (To remember, think that the 'tail' points in the positive direction.)
We say this distribution has negative skew.

25 Skew
Distribution: Skew
Salaries in the UK: High salaries drag mean up. So positive skew. Mean > Median
IQ: A symmetrical distribution, i.e. no skew. Mean = Median
Heights of people in the UK: Will probably be a nice 'bell curve', i.e. no skew. Mean = Median
Age of retirement: Likely to be people who retire significantly before the median age, but not many who retire significantly after. So negative skew. Mean < Median
Remember, think what direction the 'tail' is likely to point.

26 Exam Question
(d) Describe the skewness of the marks of the students, giving a reason for your answer. (2)
Negative skew (1st mark) because mean < median (2nd mark).

27 Skew
Given the quartiles and median, how would you work out whether the distribution had positive or negative skew?
[Three sketches labelled positive skew, negative skew and no skew]
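
The answers on this slide are not preserved in the transcript. A standard way to answer the question, offered here as an assumption rather than as the slide's own wording, is to compare the gaps on either side of the median: if Q3 − Q2 > Q2 − Q1 the skew is positive, if Q3 − Q2 < Q2 − Q1 it is negative, and if the gaps are equal there is no skew. A short Python sketch, with a made-up function name:

```python
def quartile_skew(q1, q2, q3):
    """Classify skew by comparing the upper and lower gaps around the median."""
    upper_gap = q3 - q2
    lower_gap = q2 - q1
    if upper_gap > lower_gap:
        return "positive"
    if upper_gap < lower_gap:
        return "negative"
    return "none"

# Quartiles from the two box plot examples on slide 8
print(quartile_skew(8, 10, 14))
print(quartile_skew(12, 13, 16))

# Hypothetical quartiles, invented here to show the other two cases
print(quartile_skew(10, 16, 18))
print(quartile_skew(10, 15, 20))
```

In an exam answer, the comparison itself should be written out with the numbers, followed by the conclusion about skew.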

28 Exam Question
[Exam question and working not preserved in this transcript; one mark for each of two steps] Therefore positive skew.

29 Calculating Skew
One measure of skew can be calculated using the following formula (Important note: this will be given to you in the exam if required):
skew = 3(mean − median) / standard deviation
When mean > median, mean < median, and mean = median, we can see this gives us a positive value, negative value, and 0 respectively, as expected.
Find the skew of the following teachers' annual salaries: £3, £3.50, £4, £7, £100
Mean = £23.50. Median = £4. Standard Deviation = £38.28. Skew = 1.53
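
The salary example can be checked in Python with the standard library. This sketch assumes the standard deviation on the slide is the population form (dividing by n), which is what `statistics.pstdev` computes and which reproduces the slide's £38.28; the function name `skew_coefficient` is made up here.

```python
from statistics import mean, median, pstdev

def skew_coefficient(values):
    """3(mean - median) / standard deviation, using the population standard deviation."""
    return 3 * (mean(values) - median(values)) / pstdev(values)

salaries = [3, 3.50, 4, 7, 100]

salary_mean = mean(salaries)
salary_median = median(salaries)
salary_sd = pstdev(salaries)
salary_skew = skew_coefficient(salaries)

print(f"Mean = {salary_mean:.2f}")
print(f"Median = {salary_median:.2f}")
print(f"Standard deviation = {salary_sd:.2f}")
print(f"Skew = {salary_skew:.2f}")
```

The single very large salary pulls the mean well above the median, which is why the coefficient comes out positive.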

30 S1: Chapter 4 Revision!

31 Revision
Stem and leaf diagrams: Can you construct one, and write the appropriate key? Can you calculate mode, mean, median and quartiles? Can you assess skewness by using these values?
Back-to-back stem and leaf diagrams: Can you construct one with an appropriate key? Can you compare the data on each side?
1 | 4 (1)
2 | 1 2 4 5 (4)
3 | 2 5 6 6 6 7 7 8 8 (9)
4 | 0 1 2 2 4 5 6 7 7 7 7 8 (12)
5 | 0 1 1 2 (4)
Key: 2 | 1 means 2.1

32 Revision
      Girls |   | Boys
          0 | 4 | 6
      8 5 4 | 5 | 0 7 9
      6 4 0 | 6 | 0 0 4 5 7 8
    9 8 6 2 | 7 | 1 2 4 5
8 5 4 0 0 0 | 8 | 0
8 6 2 2 1 0 | 9 | 1
Key: 0 | 4 | 6 means 40 for girls and 46 for boys.
Notice the values go outwards from the centre.
The data above shows the pulse rates of boys and girls in a school. Comment on the results.
Boys' pulse rates tend to be lower than girls'.

33 Revision Histograms

34 Revision
[Exam question not preserved in this transcript] M1 A1 B1 M1 A1. Answer = 12 runners

35 Smallest values: 0, 3. Largest values: 21, 27. Lower Quartile: 8. Median: 10. Upper Quartile: 14.
[Box plot to be drawn on an axis from 0 to 30]
Smallest values: 3, 7. Largest values: 20, 25, 26. Lower Quartile: 12. Median: 13. Upper Quartile: 16.
[Box plot to be drawn on an axis from 0 to 30]

36 Revision Skewness
Find the skew of the following teachers' annual salaries: £3, £3.50, £4, £7, £100
Mean = £23.50. Median = £4. Standard Deviation = £38.28. Skew = 1.53

