S1: Chapter 4 Representation of Data. Dr J Frost. Last modified: 20th September 2015.

Presentation transcript:

1 S1: Chapter 4 Representation of Data. Dr J Frost (jfrost@tiffin.kingston.sch.uk) www.drfrostmaths.com Last modified: 20th September 2015

2 Overview
We’ll look at 3 different ways of presenting data, as well as ways of analysing them (including ‘skew’).
BOX PLOTS: *NEW since GCSE!* Outliers.
STEM AND LEAF: *NEW since GCSE!* Back to back stem and leaf diagrams.
HISTOGRAMS: *NEW since GCSE!* Area is not necessarily equal to frequency.

3 Skew
Skew gives a measure of whether the values are more spread out above the median or below the median.
[Two frequency sketches (height, weight), each with the mode, median and mean marked.]
We say this distribution has positive skew. (To remember, think that the ‘tail’ points in the positive direction.)
We say this distribution has negative skew.

4 Skew
Distribution: Skew
Salaries in the UK: High salaries drag the mean up. So positive skew. Mean > Median
IQ: A symmetrical distribution, i.e. no skew. Mean = Median
Heights of people in the UK: Will probably be a nice ‘bell curve’, i.e. no skew. Mean = Median
Age of retirement: Likely to be people who retire significantly before the median age, but not many who retire significantly after. So negative skew. Mean < Median
Remember, think what direction the ‘tail’ is likely to point.

5 Skew based on mean/median
Describe the skewness of the marks of the students, giving a reason for your answer. (2)
Negative skew (1st mark) because mean < median (2nd mark).
Bro Tip: If you ever forget which way the two go, just think of salaries! High values (i.e. a positive tail) drag up the mean but not the median. So it’s the position of the mean that determines skew.

6 Skew based on quartiles
[Three cases illustrated: positive skew, negative skew, no skew.]
(The data is spread out more in the positive direction, so we have positive skew.)
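The quartile comparison can be sketched in Python. This is a minimal illustration that assumes the usual S1 convention (Q3 − Q2 > Q2 − Q1 means the data is more spread out above the median, so positive skew). The function name `quartile_skew` and the example quartiles are made up for illustration and are not taken from the slides.

```python
def quartile_skew(q1, q2, q3):
    """Describe skew by comparing the two halves of the interquartile range."""
    upper = q3 - q2  # spread between the median and the upper quartile
    lower = q2 - q1  # spread between the lower quartile and the median
    if upper > lower:
        return "positive skew"
    if upper < lower:
        return "negative skew"
    return "no skew"


# Hypothetical quartiles, chosen only to show each case.
print(quartile_skew(10, 14, 25))
print(quartile_skew(10, 21, 25))
print(quartile_skew(10, 15, 20))
```

In an exam answer the two differences would be written out explicitly before stating the conclusion.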

7 Example Exam Question (1st mark, 2nd mark) Therefore positive skew.

8 Test Your Understanding
Available Data / Comment on skew (2 marks): Little/no skew as median and mean are roughly equal.

9 Calculating Skew
One measure of skew can be calculated using the following formula (Important Note: this will be given to you in the exam if required):
skew = 3(mean − median) / standard deviation
When mean > median, mean < median, and mean = median, this gives a positive value, a negative value, and 0 respectively, as expected.
Find the skew of the following teachers’ annual salaries: £3, £3.50, £4, £7, £100
Mean = £23.50, Median = £4, Standard Deviation = £38.28
Skew = 1.53
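The salary example can be checked with a few lines of Python. This is a sketch that assumes the standard deviation on the slide is the population standard deviation (dividing by n), which reproduces the quoted £38.28. The name `skew_coefficient` is just a label chosen here.

```python
from statistics import mean, median, pstdev

salaries = [3, 3.50, 4, 7, 100]


def skew_coefficient(values):
    """3(mean - median) / standard deviation, using the population SD."""
    return 3 * (mean(values) - median(values)) / pstdev(values)


print(mean(salaries))                        # 23.5
print(median(salaries))                      # 4
print(round(pstdev(salaries), 2))            # 38.28
print(round(skew_coefficient(salaries), 2))  # 1.53
```

The single £100 value drags the mean far above the median, which is why the coefficient comes out strongly positive.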

10 Exercise 1 Q1, Q2

11 Exercise 1 Q3

12 Exercise 1 Q4

13 Stem and Leaf recap
Put the following measurements into a stem and leaf diagram:
4.7 3.6 3.8 4.7 4.1 2.2 3.6 4.0 4.4 5.0 3.7 4.6 4.8 3.7 3.2 2.5 3.6 4.5 4.7 5.2 4.7 4.2 3.8 5.1 1.4 2.1 3.5 4.2 2.4 5.1
1 | 4  (1)
2 | 1 2 4 5  (4)
3 | 2 5 6 6 6 7 7 8 8  (9)
4 | 0 1 2 2 4 5 6 7 7 7 7 8  (12)
5 | 0 1 1 2  (4)
Key: 2 | 1 means 2.1
Now find: [quantities not captured in the transcript]
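As a rough sketch, the same diagram can be built in Python by splitting each measurement into a units stem and a tenths leaf. The helper name `stem_and_leaf` is an illustrative choice, and the approach assumes every value has exactly one decimal place, as in this data set.

```python
from collections import defaultdict

measurements = [
    4.7, 3.6, 3.8, 4.7, 4.1, 2.2, 3.6, 4.0, 4.4, 5.0,
    3.7, 4.6, 4.8, 3.7, 3.2, 2.5, 3.6, 4.5, 4.7, 5.2,
    4.7, 4.2, 3.8, 5.1, 1.4, 2.1, 3.5, 4.2, 2.4, 5.1,
]


def stem_and_leaf(values):
    """Map each stem (units digit) to its sorted leaves (tenths digit)."""
    rows = defaultdict(list)
    for value in sorted(values):
        tenths = round(value * 10)  # work in whole tenths to avoid float noise
        stem, leaf = divmod(tenths, 10)
        rows[stem].append(leaf)
    return dict(rows)


diagram = stem_and_leaf(measurements)
for stem in sorted(diagram):
    leaves = diagram[stem]
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}  ({len(leaves)})")
```

The row counts in brackets make it easy to locate the median and quartiles by position.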

14 Back-to-Back Stem and Leaf recap
Girls: 55 80 84 91 80 92 98 40 60 64 66 72 96 85 88 90 76 54 58 92 78 80 79
Boys: 80 60 91 65 67 59 75 46 72 71 74 57 64 60 50 68
The data above shows the pulse rate of boys and girls in a school. Comment on the results.
      Girls |   | Boys
          0 | 4 | 6
      8 5 4 | 5 | 0 7 9
      6 4 0 | 6 | 0 0 4 5 7 8
    9 8 6 2 | 7 | 1 2 4 5
8 5 4 0 0 0 | 8 | 0
8 6 2 2 1 0 | 9 | 1
Key: 0|4|6 means 40 for girls and 46 for boys.
The back-to-back stem and leaf diagram shows that boys’ pulse rates tend to be lower than girls’.
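The comparison can be backed up numerically. The sketch below rebuilds the back-to-back diagram and compares medians. It assumes the fused values in the transcript split into the two-digit readings listed in the code (a reading that matches the leaves of the diagram). The helper name `leaves_by_stem` is illustrative.

```python
from statistics import median

girls = [55, 80, 84, 91, 80, 92, 98, 40, 60, 64, 66, 72,
         96, 85, 88, 90, 76, 54, 58, 92, 78, 80, 79]
boys = [80, 60, 91, 65, 67, 59, 75, 46, 72, 71, 74, 57, 64, 60, 50, 68]


def leaves_by_stem(values):
    """Group the units digits (leaves) under their tens digit (stem)."""
    rows = {}
    for value in sorted(values):
        rows.setdefault(value // 10, []).append(value % 10)
    return rows


girl_rows, boy_rows = leaves_by_stem(girls), leaves_by_stem(boys)
for stem in range(4, 10):
    # Girls' leaves are written right to left so the smallest sits next to the stem.
    left = " ".join(str(leaf) for leaf in reversed(girl_rows.get(stem, [])))
    right = " ".join(str(leaf) for leaf in boy_rows.get(stem, []))
    print(f"{left:>11} | {stem} | {right}")

print("Median (girls):", median(girls))
print("Median (boys):", median(boys))
```

With this reading of the data the girls' median is 80 and the boys' is 66, which supports the comment that boys' pulse rates tend to be lower.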

15 Box Plot recap
Box plots allow us to visually represent the distribution of the data.
Minimum: 3, Lower Quartile: 15, Median: 17, Upper Quartile: 22, Maximum: 27
[Box plot sketched on an axis from 0 to 30, with the IQR and the range marked.]
How is the IQR represented in this diagram? How is the range represented in this diagram?

16 Outliers
An outlier is an extreme value. More specifically, it’s generally a value more than 1.5 IQRs below the lower quartile or above the upper quartile. (But you will be told in the exam if the rule differs from this.)
[Box plot on an axis from 0 to 30, with outliers lying beyond the marked boundary.]
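The 1.5 × IQR rule is simple to express in code. The sketch below uses the summary values from the box plot recap (minimum 3, quartiles 15 and 22, maximum 27) purely as an illustration. The function name `outlier_bounds` and its `k` parameter are choices made here, with `k` left adjustable because the exam may specify a different multiplier.

```python
def outlier_bounds(lower_quartile, upper_quartile, k=1.5):
    """Return the (lower, upper) boundaries beyond which values are outliers."""
    iqr = upper_quartile - lower_quartile
    return lower_quartile - k * iqr, upper_quartile + k * iqr


low, high = outlier_bounds(15, 22)
print(low, high)

# Check the extreme values against the boundaries.
for value in (3, 27):
    is_outlier = value < low or value > high
    print(value, "is an outlier" if is_outlier else "is not an outlier")
```

Here the IQR is 7, so the boundaries are 4.5 and 32.5, and only the minimum of 3 falls outside them.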

17 Examples
Smallest values: 0, 3. Largest values: 21, 27. Lower Quartile: 8. Median: 10. Upper Quartile: 14.
Draw a box plot to represent the above data. [Axis from 0 to 30.]
Bro Exam Tip: You MUST show your outlier boundary calculations.
When there’s an outlier at one end, there are two allowable places to put the end of the whisker: the largest value that is not an outlier, 21 (I think this one makes most sense), OR the outlier boundary, 23. Use one or the other (not both).
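The boundary calculation for this example can be laid out as follows. It is a sketch assuming the standard 1.5 × IQR rule; the variable names are illustrative.

```python
lower_quartile, upper_quartile = 8, 14
smallest_values = [0, 3]
largest_values = [21, 27]

iqr = upper_quartile - lower_quartile       # 14 - 8 = 6
lower_boundary = lower_quartile - 1.5 * iqr  # 8 - 9 = -1
upper_boundary = upper_quartile + 1.5 * iqr  # 14 + 9 = 23

extremes = smallest_values + largest_values
outliers = [v for v in extremes if v < lower_boundary or v > upper_boundary]

# Two allowable ends for the upper whisker when there is an outlier above.
whisker_at_largest_non_outlier = max(v for v in largest_values if v <= upper_boundary)
whisker_at_boundary = upper_boundary

# No outlier at the lower end, so the lower whisker ends at the minimum.
lower_whisker = min(v for v in smallest_values if v >= lower_boundary)

print(outliers)
print(whisker_at_largest_non_outlier, whisker_at_boundary, lower_whisker)
```

Only 27 lies beyond the upper boundary of 23, so it is plotted as an outlier and the upper whisker stops at either 21 or 23.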

18 Test Your Understanding (on your printed sheet) (parts a to c)

19 Comparing Box Plots
[Box plots comparing house prices of Croydon and Kingston-upon-Thames, on an axis from £100k to £450k.]
“Compare the prices of houses in Croydon with those in Kingston.” (2 marks)
For 1 mark, one of: The interquartile range of house prices in Kingston is greater than in Croydon. The range of house prices in Kingston is greater than in Croydon. i.e. something spread related.
For 1 mark: The median house price in Kingston was greater than that in Croydon. i.e. compare some measure of location (could be minimum, lower quartile, etc.)

20 Test Your Understanding (on your printed sheet) Jan 2005 Q2

21 Exercise 2 (on your printed sheet) (parts a to d)

22 Exercise 2 (on your printed sheet)

23 Exercise 2 (on your printed sheet)

24 Exercise 2 (on your printed sheet)

25 Exercise 2 (on your printed sheet) (Solutions to (d) and (e) on next slide)

26 Exercise 2 (on your printed sheet)

27 Exercise 2 (on your printed sheet) 63 5 52 45 1217 28

28 (Textbook Exercise Reference) Pages 58 Exercise 4B Q2 Exercise 4G Q8

29 Bar Charts vs Histograms
[Bar chart: frequency against shoe size (6 to 9). Histogram: frequency density against height (1.0m to 1.8m).]
Bar Charts: For discrete data. Frequency given by height of bars.
Histograms: For continuous data. (Use this as a reason whenever you’re asked to justify use of a histogram.) Data divided into (potentially uneven) intervals. [GCSE definition] Frequency given by area of bars.* No gaps between bars.
* Not necessarily true. We’ll correct this in a sec.

30 Bar Charts vs Histograms
Still using the ‘incorrect’ GCSE formula: Frequency = F.D. × Width, i.e. F.D. = Freq ÷ Width
Q1
Weight (w kg) | Frequency | Frequency Density
0 < w ≤ 10 | 40 | 4
10 < w ≤ 15 | 6 | 1.2
15 < w ≤ 35 | 52 | 2.6
35 < w ≤ 45 | 10 | 1
Q2
[Histogram: frequency density (1 to 5) against height in m (10 to 50).] Frequency = 15, Frequency = 30, Frequency = 40, Frequency = 25
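The Q1 table can be reproduced with a short calculation, dividing each frequency by its class width. This is a minimal sketch and the variable names are illustrative.

```python
# (lower bound, upper bound, frequency) for each weight class in Q1
classes = [(0, 10, 40), (10, 15, 6), (15, 35, 52), (35, 45, 10)]

frequency_densities = []
for lower, upper, frequency in classes:
    width = upper - lower
    frequency_densities.append(frequency / width)  # F.D. = Freq / Width

for (lower, upper, frequency), fd in zip(classes, frequency_densities):
    print(f"{lower} < w <= {upper}: frequency {frequency}, F.D. {fd}")
```

Going the other way (as in Q2) is the same relationship rearranged: frequency = F.D. × width.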

31 SKILL #1 :: Area = frequency?
Unlike at GCSE, the area of a bar is not necessarily equal to the frequency; they are just proportional.
There were 60 runners in a 100m race. The following histogram represents their times. Determine the number of runners with times above 14s.
[Histogram: frequency density (0 to 5) against time in s, with 9, 12 and 18 marked.]
Total frequency is known; therefore find total area and hence the ‘scaling’. Total area = 15 + 9 = 24
Then use this scaling along with the desired area.
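The scaling idea can be sketched in Python. The histogram itself is not reproduced in this transcript, so the bar heights below are an assumption: bars over 9-12s and 12-18s with heights 5 and 1.5 are the values consistent with the stated areas of 15 and 9. The function name `area_above` is illustrative.

```python
# ASSUMED bars (start, end, height), inferred from "Total area = 15 + 9 = 24".
bars = [(9, 12, 5.0), (12, 18, 1.5)]
total_runners = 60

total_area = sum((end - start) * height for start, end, height in bars)
runners_per_unit_area = total_runners / total_area  # the 'scaling'


def area_above(threshold):
    """Area of the histogram to the right of the given time."""
    return sum(
        max(0, end - max(start, threshold)) * height
        for start, end, height in bars
    )


runners_above_14 = area_above(14) * runners_per_unit_area
print(total_area, runners_per_unit_area, runners_above_14)
```

Under these assumed bar heights each unit of area is worth 2.5 runners, and the area above 14s is 6, giving 15 runners.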

32 Test Your Understanding (on your printed sheet)
May 2012 Q5
A policeman records the speed of the traffic on a busy road with a 30 mph speed limit. He records the speeds of a sample of 450 cars. The histogram in Figure 2 represents the results.
[Figure 2: histogram of the recorded speeds.]
(a) Calculate the number of cars that were exceeding the speed limit by at least 5 mph in the sample. (4 marks)
Bro Tip: We can make the frequency density scale what we like.
M1 A1: Determine what one small square or one large square is worth.
M1 A1: Use this to find number of cars travelling >35mph.

33 Test Your Understanding (on your printed sheet)
(b) Estimate the value of the mean speed of the cars in the sample. (3 marks)
M1 M1: Use histogram to construct sum of speeds. A1: Correct value.
Bro Tip: Whenever you are asked to calculate mean, median or quartiles from a histogram, form a grouped frequency table. Use your scaling factor to work out the frequency of each bar.
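Once the grouped frequency table is formed, the mean estimate uses class midpoints. The exam histogram is not reproduced here, so the sketch below uses the weight table from the earlier frequency density question as stand-in data; it illustrates the method only and is not the answer to the exam question.

```python
# Stand-in grouped table (lower bound, upper bound, frequency);
# NOT the May 2012 data, which is not captured in this transcript.
table = [(0, 10, 40), (10, 15, 6), (15, 35, 52), (35, 45, 10)]

total_frequency = sum(f for _, _, f in table)

# Sum of (midpoint x frequency) estimates the sum of all the values.
sum_fx = sum((lower + upper) / 2 * f for lower, upper, f in table)

estimated_mean = sum_fx / total_frequency
print(total_frequency, sum_fx, round(estimated_mean, 2))
```

It is only an estimate because every value in a class is treated as if it sat at the class midpoint.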

34 Test Your Understanding (on your printed sheet)
(c) Estimate, to 1 decimal place, the value of the median speed of the cars in the sample. (2)
(d) Comment on the shape of the distribution. Give a reason for your answer. (2)
(e) State, with a reason, whether the estimate of the mean or the median is a better representation of the average speed of the traffic on the road. (2)
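A median estimate from grouped data is usually found by linear interpolation within the median class. The sketch below assumes the common convention of locating the (n/2)th value, and again uses the weight table from the earlier frequency density question as stand-in data because the exam histogram is not captured here. The function name `grouped_median` is illustrative.

```python
def grouped_median(table):
    """Estimate the median of (lower, upper, frequency) classes by interpolation."""
    total = sum(f for _, _, f in table)
    target = total / 2  # position of the median, using the n/2 convention
    cumulative = 0
    for lower, upper, frequency in table:
        if cumulative + frequency >= target:
            # Go the required fraction of the way through the median class.
            fraction = (target - cumulative) / frequency
            return lower + fraction * (upper - lower)
        cumulative += frequency
    raise ValueError("table has no frequencies")


# Stand-in data, NOT the May 2012 exam figures.
table = [(0, 10, 40), (10, 15, 6), (15, 35, 52), (35, 45, 10)]
print(round(grouped_median(table), 1))
```

For this stand-in table the 54th value falls in the 15 to 35 class, 8/52 of the way through it, giving about 18.1.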

35 SKILL #2 :: Gaps!
Table columns: Weight (to nearest kg), Frequency, F.D. Classes: 1-2, 3-6, 7-9.
[Histogram: frequency density (0 to 2) against a horizontal axis from 1 to 10, labelled Time (s).]
Note that the gaps affect class width!
Remember the frequency density axis is only correct to scale, so there may be some scaling. However, in an exam scaling is unlikely to be required for F.D. if the F.D. scale is already given.
We set the scaling between area and frequency to be 1.
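The effect of the gaps on class width can be made concrete. Because weights are recorded to the nearest kg, a class written as 1-2 really covers 0.5 to 2.5. The sketch below computes the true boundaries and widths; the function name `class_boundaries` and the `precision` parameter are illustrative.

```python
def class_boundaries(lower, upper, precision=1):
    """True boundaries of a class whose values are rounded to `precision`."""
    half = precision / 2
    return lower - half, upper + half


classes = [(1, 2), (3, 6), (7, 9)]  # as written, to the nearest kg

widths = []
for lower, upper in classes:
    low, high = class_boundaries(lower, upper)
    widths.append(high - low)
    print(f"{lower}-{upper}: boundaries {low} to {high}, width {high - low}")
```

The widths are 2, 4 and 3 rather than the 1, 3 and 2 obtained by naively subtracting the written limits, and it is these true widths that go into the frequency density calculation.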

36 Test Your Understanding (on your printed sheet)
Jan 2012 Q1
14, 5
Bro Tip: Be careful that you use the correct class widths!
21 + 45 + 3 = 69

37 SKILL #3 :: Width and height on diagram
An exam favourite is to ask what width and height we’d draw a bar in a drawn histogram.
Q: The frequency table shows some running times. On a histogram the bar for 0-4 seconds is drawn with width 6cm and height 8cm. Find the width and height of the bar for 4-6 seconds.
[Table: Time (seconds), Frequency.]
Bro Tip: Find the scaling for class width to drawn width and frequency density to drawn height.
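The two scalings in the tip can be sketched as below. The frequency table is not captured in this transcript, so the frequencies used here (16 for 0-4 seconds, 12 for 4-6 seconds) are hypothetical values chosen only to make the method concrete. The function name `drawn_bar` is also illustrative.

```python
def drawn_bar(ref_class, ref_freq, ref_width_cm, ref_height_cm, new_class, new_freq):
    """Scale a new bar's drawn width and height from a reference bar."""
    ref_class_width = ref_class[1] - ref_class[0]
    new_class_width = new_class[1] - new_class[0]

    # cm of drawn width per unit of class width
    width_scale = ref_width_cm / ref_class_width
    # cm of drawn height per unit of frequency density
    height_scale = ref_height_cm / (ref_freq / ref_class_width)

    drawn_width = new_class_width * width_scale
    drawn_height = (new_freq / new_class_width) * height_scale
    return drawn_width, drawn_height


# HYPOTHETICAL frequencies: 16 for 0-4 seconds, 12 for 4-6 seconds.
width_cm, height_cm = drawn_bar((0, 4), 16, 6, 8, (4, 6), 12)
print(width_cm, height_cm)
```

The width of 3cm depends only on the class widths, so it holds whatever the real frequencies are; the height of 12cm is specific to the made-up frequencies.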

38 Test Your Understanding (on your printed sheet)

39 Exercise 3 (on your printed sheet) Q1

40 Exercise 3 (on your printed sheet) Q2
Answer: Distance is continuous.
Note the gaps in the class intervals! 4 / 5 = 0.8, 19 / 5 = 3.8, 53 / 10 = 5.3, ...

41 Exercise 3 (on your printed sheet) Q3

42 Exercise 3 (on your printed sheet) Q4 [June 2007 Q5]

43 Exercise 3 (on your printed sheet) Q5

44 Exercise 3 (on your printed sheet) Q6

45 Exercise 3 (on your printed sheet) Q7 (parts a to e)

