# © Christine Crisp “Teach A Level Maths” Statistics 1.


Introduction to S1

You met some statistical diagrams when you did GCSE. This presentation and the next three remind you of them and point out some details that you may not have met before. We will start with stem and leaf diagrams (including back-to-back). Stem and leaf diagrams are sometimes called stem plots.

e.g. The table below gives the number of hours worked in a particular week by a sample of 30 men.

35 41 33 31 30 45 35 36 51 32

30 28 35 34 33 35 36 32 42

33 31 41 21 34 35 34 46 32 35

I'll use intervals of 5 hours to draw the diagram, i.e. 20–24, 25–29 etc. The stem shows the tens and the leaves the units, e.g. 46 is 4 tens and 6 units.

Weekly hours of 30 men

| Stem | Leaves |
|---|---|
| 5 | 1 |
| 4 | 5 6 |
| 4 | 1 1 2 |
| 3 | 5 5 5 5 5 5 6 6 |
| 3 | 0 0 1 1 2 2 2 2 3 3 3 4 4 4 |
| 2 | 8 |
| 2 | 1 |


N.B. 35 goes in the 35–39 row (3 | 5 5 5 5 5 5 6 6), not in the 30–34 row below it.

We must show a key.

Key: 3 | 5 means 35 hours

If you tip your head to the right and look at the diagram you can see it is just a bar chart with more detail.

Points to notice:

- The leaves are in numerical order.
- The diagram uses raw (not grouped) data.

The diagram below is a back-to-back stem and leaf diagram giving the weight in grams of eggs collected from ostriches and emus. This method can be used to compare two sets of data.

| Ostrich | Stem | Emu |
|---:|:---:|:---|
| 8 3 1 0 | 27 | 2 4 |
| 8 7 6 2 1 | 28 | 2 4 6 7 |
| 9 4 1 | 29 | 0 3 5 |
| 1 0 | 30 | 7 |

Key: 27 | 2 = 272

A grouped data stem and leaf diagram

Data: 2, 5, 5, 8, 12, 16, 17, 17, 19, 20, 22, 22, 24, 25, 25, 25, 27, 27, 29, 29, 36

Draw a stem and leaf diagram using groupings 0–4, 5–9, 10–14 etc.

| Group | Stem | Leaves |
|---|---|---|
| 0–4 | 0 | 2 |
| 5–9 | 0 | 5 5 8 |
| 10–14 | 1 | 2 |
| 15–19 | 1 | 6 7 7 9 |
| 20–24 | 2 | 0 2 2 4 |
| 25–29 | 2 | 5 5 5 7 7 9 9 |
| 30–34 | 3 | |
| 35–39 | 3 | 6 |

Key: 1 | 2 = 12, 3 | 6 = 36
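The grouping above can be reproduced with a short script. This is only a minimal sketch: the function name `split_stem_and_leaf` and the `width` parameter are illustrative choices, and it assumes non-negative whole-number data with a class width that divides 10.

```python
from collections import defaultdict

def split_stem_and_leaf(values, width=5):
    """Group values into classes of the given width (0-4, 5-9, ...).

    Returns a list of (class_label, stem, leaves) rows, including
    empty classes such as 30-34 so that gaps in the data stay visible.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        rows[v // width].append(v % 10)      # the leaf is the units digit
    table = []
    for k in range(min(rows), max(rows) + 1):
        low = k * width
        label = f"{low}-{low + width - 1}"
        table.append((label, low // 10, rows.get(k, [])))  # the stem is the tens digit
    return table

data = [2, 5, 5, 8, 12, 16, 17, 17, 19, 20, 22, 22, 24,
        25, 25, 25, 27, 27, 29, 29, 36]

table = split_stem_and_leaf(data)
for label, stem, leaves in table:
    print(f"{label:>5} | {stem} | {' '.join(map(str, leaves))}")
```

Because the values are sorted before grouping, the leaves in each row come out in numerical order, as the diagram requires.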

Histogram: a bar chart for continuous data. The bars are drawn up to the class boundaries, with NO GAPS between bars.

A class boundary lies halfway between the upper limit of one group and the lower limit of the next (except in age questions). For groups 0–9, 10–19, 20–29 etc. the class boundaries occur at 9.5, 19.5 etc. So any quantity greater than 9.5 is in group 2 and any quantity less than 9.5 is in group 1. The edges of the bars are drawn at 9.5, 19.5 etc.

It is very important that the area of each bar is proportional to the frequency.

Histograms

e.g. The projected population of the U.K. for 2005 (by age)

| AGE (years) | Freq (millions) |
|---|---|
| 0–9 | 7 |
| 10–19 | 8 |
| 20–29 | 7 |
| 30–39 | 9 |
| 40–49 | 9 |
| 50–59 | 8 |
| 60–69 | 6 |
| 70–79 | 4 |
| 80–89 | 2 |
| 90+ | 0 |

Source: USA IDB

Suppose the data are grouped so that those below 20 and above 69 are combined.

| AGE (years) | Freq (millions) |
|---|---|
| 0–19 | 15 |
| 20–29 | 7 |
| 30–39 | 9 |
| 40–49 | 9 |
| 50–59 | 8 |
| 60–69 | 6 |
| 70+ | 6 |

To draw the diagram we must have an upper class value for the last group.

I chose a sensible figure, 109, so the last class becomes 70–109.

| AGE (years) | Freq (millions) |
|---|---|
| 0–19 | 15 |
| 20–29 | 7 |
| 30–39 | 9 |
| 40–49 | 9 |
| 50–59 | 8 |
| 60–69 | 6 |
| 70–109 | 6 |

If we use these data to draw an age/frequency graph, with frequency as the height of each bar, it is very misleading, as the 1st and last bars dominate. Bar 1 should represent just over twice as many people as bar 2, but it appears to represent about 4 times as many. The eye compares areas, so in a histogram frequencies are represented by areas.
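As a rough check of the "about 4 times" remark, compare bar 1 (0–19, frequency 15) with bar 2 (20–29, frequency 7) when frequency is plotted as the height. The class widths of 20 and 10 used here assume the age convention for class boundaries.

$$
\frac{\text{area of bar 1}}{\text{area of bar 2}} = \frac{20 \times 15}{10 \times 7} = \frac{300}{70} \approx 4.3,
\qquad
\frac{\text{frequency of bar 1}}{\text{frequency of bar 2}} = \frac{15}{7} \approx 2.1
$$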

A histogram shows frequencies as areas. To draw the histogram, we need to find the width and height of each column.

The width is the class width: upper class boundary (u.c.b.) minus lower class boundary (l.c.b.).

Since these are ages, the 1st class (0–19), for example, has u.c.b. = 20 and l.c.b. = 0, so the class width is 20.

Area of a rectangle = width × height, so frequency = width × height, which gives

$$\text{height} = \frac{\text{frequency}}{\text{width}}$$

The class widths are 20 for 0–19, 10 for each class from 20–29 to 60–69, and 40 for 70–109.

The height is called the frequency density:

$$\text{frequency density} = \frac{\text{frequency}}{\text{class width}}$$

e.g. For the 1st class, freq. density = $\frac{15}{20}$ = 0·75

The frequency densities for the seven classes are 0·75, 0·7, 0·9, 0·9, 0·8, 0·6 and 0·15. We can now draw the histogram.

The projected population of the U.K. for 2005 (by age)

| AGE (years) | Freq (millions) | Class width | Freq density |
|---|---|---|---|
| 0–19 | 15 | 20 | 0·75 |
| 20–29 | 7 | 10 | 0·7 |
| 30–39 | 9 | 10 | 0·9 |
| 40–49 | 9 | 10 | 0·9 |
| 50–59 | 8 | 10 | 0·8 |
| 60–69 | 6 | 10 | 0·6 |
| 70–109 | 6 | 40 | 0·15 |

Notice that the frequencies for the last 2 classes are the same. On the histogram the areas showing these classes are the same. If we had plotted frequency on the y-axis, the diagram would be very misleading. (It would suggest there are 6 million in each age group 70–79, 80–89, 90–99 and 100–109.)

SUMMARY

Histograms are used to display grouped frequency data.

- Frequency is shown by area.
- The y-axis is used for frequency density.
- Class width is given by u.c.b. – l.c.b., where u.c.b. is the upper class boundary and l.c.b. is the lower class boundary.
- $\text{frequency density} = \dfrac{\text{frequency}}{\text{class width}}$
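The summary can be checked against the U.K. population example with a few lines of Python. This is a minimal sketch: the function name `frequency_densities` and the tuple layout are illustrative, and the class boundaries follow the age convention (0–19 runs from 0 up to 20), with 110 as the upper boundary chosen for the last class.

```python
def frequency_densities(classes):
    """classes: list of (lower_boundary, upper_boundary, frequency).

    Returns a list of (class_width, frequency_density) pairs, where
    frequency density = frequency / class width.
    """
    result = []
    for lcb, ucb, freq in classes:
        width = ucb - lcb
        result.append((width, freq / width))
    return result

# Age classes 0-19, 20-29, ..., 70-109; frequencies are in millions.
uk_2005 = [(0, 20, 15), (20, 30, 7), (30, 40, 9), (40, 50, 9),
           (50, 60, 8), (60, 70, 6), (70, 110, 6)]

densities = frequency_densities(uk_2005)
for (lcb, ucb, freq), (width, fd) in zip(uk_2005, densities):
    print(f"{lcb:>3}-{ucb - 1:<3} freq={freq:<2} width={width:<2} density={fd:.2f}")
```

Multiplying each density by its class width recovers the frequency, which is the sense in which area represents frequency.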

Exercise

95 components are tested until they fail. The table gives the times taken (hours) until failure.

| Time to failure (hours) | 0–19 | 20–29 | 30–39 | 40–44 | 45–49 | 50–59 | 60–89 |
|---|---|---|---|---|---|---|---|
| Number of components | 5 | 8 | 16 | 22 | 18 | 16 | 10 |

Find 3 things wrong with the histogram which represents the data in the table.

Answer:

- Frequency has been plotted instead of frequency density.
- There is no title.
- There are no units on the x-axis.

[Figures: "Time taken for 95 components to fail", shown as the incorrect diagram and the correct diagram]

| Length of millipede | Class boundaries | Frequency | Class width | Freq. density |
|---|---|---|---|---|
| 0–9 | 0–9.5 | 6 | 9.5 | 0.63 |
| 10–19 | 9.5–19.5 | 18 | 10 | 1.8 |
| 20–39 | 19.5–39.5 | 14 | 20 | 0.7 |

Note: the edges of the bars are drawn at 9.5, 19.5 and 39.5.

[Figure: Histogram showing length of millipede; x-axis: length, y-axis: freq density]
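The class boundaries and frequency densities in the millipede table can be rebuilt from the class limits. The sketch below is illustrative only: `class_boundaries` is a hypothetical helper, the first lower boundary is taken as 0 (as in the table), and the upper boundary of the last class is assumed to sit half a unit above its upper limit.

```python
def class_boundaries(limits, first_lower=0.0):
    """limits: list of (lower_limit, upper_limit) for consecutive classes.

    Each shared boundary is halfway between one class's upper limit
    and the next class's lower limit.
    """
    boundaries = []
    lower = first_lower
    for i, (_lo, hi) in enumerate(limits):
        if i + 1 < len(limits):
            upper = (hi + limits[i + 1][0]) / 2
        else:
            upper = hi + 0.5      # assumption: same half-unit gap after the last class
        boundaries.append((lower, upper))
        lower = upper
    return boundaries

limits = [(0, 9), (10, 19), (20, 39)]
freqs = [6, 18, 14]

bounds = class_boundaries(limits)
# One row per class: (l.c.b., u.c.b., frequency, class width, frequency density)
rows = [(lcb, ucb, f, ucb - lcb, round(f / (ucb - lcb), 2))
        for (lcb, ucb), f in zip(bounds, freqs)]
for row in rows:
    print(row)
```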

Cumulative Frequency Graphs

e.g. The projected population of the U.K. for 2005, by age:

| AGE (years) | Freq (millions) | Cu.F |
|---|---|---|
| 0–9 | 7 | 7 |
| 10–19 | 8 | 15 |
| 20–29 | 7 | 22 |
| 30–39 | 9 | 31 |
| 40–49 | 9 | 40 |
| 50–59 | 8 | 48 |
| 60–69 | 6 | 54 |
| 70–79 | 4 | 58 |
| 80–89 | 2 | 60 |
| 90+ | 0 | 60 |

Source: USA IDB

Why does the 90+ group appear as 0?

ANS: The data are given to the nearest million. The projected figure was 113,000.

In drawing the diagram I shall miss out this group.

Points to notice:

- Points are plotted at upper class boundaries (u.c.bs.), e.g. the u.c.b. for 0–9 would normally be 9·5.
- There is no gap between 9 and 10 as the data are continuous.

Age data have different u.c.bs. Can you say why this is?

ANS: If I ask children their ages, they reply 9 even if they are nearly 10, so the 0–9 group contains children right up to age 10, NOT just up to nine and a half.

The u.c.bs. for this data set are 10, 20, 30, ...
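The cumulative frequencies and the points to plot can be tabulated with a short script. This is a sketch only: the variable names are illustrative, the 90+ group is left out as in the presentation, and the starting point (0, 0) is an added assumption that nobody is younger than 0.

```python
from itertools import accumulate

# Frequencies (millions) for the classes 0-9, 10-19, ..., 80-89.
freqs = [7, 8, 7, 9, 9, 8, 6, 4, 2]

# Age data: the upper class boundaries are 10, 20, ..., 90.
ucbs = list(range(10, 100, 10))

# Running totals of the frequencies.
cum_freqs = list(accumulate(freqs))

# Points for the cumulative frequency graph: (u.c.b., cumulative frequency).
points = [(0, 0)] + list(zip(ucbs, cum_freqs))

for age, cf in points:
    print(f"age {age:>2}: {cf} million")
```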

e.g. The projected population of the U.K. for 2005, by age:

| AGE (yrs) | f (m) | Cu.f (m) | u.c.b. (yrs) |
|---|---|---|---|
| 0–9 | 7 | 7 | 10 |
| 10–19 | 8 | 15 | 20 |
| 20–29 | 7 | 22 | 30 |
| 30–39 | 9 | 31 | 40 |
| 40–49 | 9 | 40 | 50 |
| 50–59 | 8 | 48 | 60 |
| 60–69 | 6 | 54 | 70 |
| 70–79 | 4 | 58 | 80 |
| 80–89 | 2 | 60 | 90 |

Source: USA IDB

[Figure: cumulative frequency graph, "The projected population of the U.K. for 2005 (by age)"; x-axis: Age (yrs)]

The median age is estimated as the age corresponding to a cumulative frequency of 30 million. The median age is 39 years. (Half the population of the U.K. will be over 39 in 2005.)
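The presentation reads the median from the graph. A numerical estimate can be sketched with linear interpolation between plotted points, which assumes the points are joined by straight lines; the function name `value_at` is illustrative, and (0, 0) is an assumed starting point.

```python
def value_at(cum_target, points):
    """Return the value whose cumulative frequency is cum_target,
    interpolating linearly between neighbouring (u.c.b., cum. freq.) points."""
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= cum_target <= c1 and c1 > c0:
            return x0 + (cum_target - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("target outside the range of the cumulative frequencies")

# (upper class boundary in years, cumulative frequency in millions)
points = [(0, 0), (10, 7), (20, 15), (30, 22), (40, 31), (50, 40),
          (60, 48), (70, 54), (80, 58), (90, 60)]

median = value_at(30, points)   # half of 60 million
print(f"Estimated median age: {median:.1f} years")
```

The estimate falls in the 30–39 class, because the cumulative frequency passes 30 million between the points (30, 22) and (40, 31), and it rounds to the 39 years quoted above.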

The quartiles are found similarly:

- lower quartile: 20 years
- upper quartile: 56 years

The interquartile range is 36 years.

LQ = $\tfrac{1}{4}(n+1)$th item of data

UQ = $\tfrac{3}{4}(n+1)$th item of data
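As a worked check of these values, assume the quartiles are read at cumulative frequencies of 15 and 45 million (one quarter and three quarters of 60 million) and that the plotted points are joined by straight lines. The lower quartile then falls exactly on the plotted point (20, 15), and the upper quartile lies between (50, 40) and (60, 48):

$$
\text{LQ} = 20, \qquad
\text{UQ} \approx 50 + \frac{45 - 40}{48 - 40} \times 10 = 56.25 \approx 56, \qquad
\text{IQR} = \text{UQ} - \text{LQ} \approx 56 - 20 = 36
$$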

If the retirement age were to be 65 for everyone, how many people would be retired?

ANS: Reading the graph at age 65 gives a cumulative frequency of 51 million, so (60 – 51) million = 9 million.
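The same reading can be approximated numerically by interpolating in the other direction, from an age to a cumulative frequency. This is a sketch assuming straight lines between the plotted points; `cumulative_at` is an illustrative name, not part of the presentation.

```python
def cumulative_at(x, points):
    """Return the cumulative frequency at value x, interpolating linearly
    between neighbouring (u.c.b., cum. freq.) points."""
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return c0 + (x - x0) / (x1 - x0) * (c1 - c0)
    raise ValueError("x outside the range of the class boundaries")

# (upper class boundary in years, cumulative frequency in millions)
points = [(0, 0), (10, 7), (20, 15), (30, 22), (40, 31), (50, 40),
          (60, 48), (70, 54), (80, 58), (90, 60)]

below_65 = cumulative_at(65, points)     # people younger than 65
retired = points[-1][1] - below_65       # total minus those below 65
print(f"Below 65: {below_65} million, retired: {retired} million")
```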

Exercise

The table and diagram show the number of flowers in a sample of 43 antirrhinum plants.

Number of flowers on antirrhinum plants

| x | f | Cu.f |
|---|---|---|
| 20–39 | 6 | 6 |
| 40–59 | 10 | 16 |
| 60–79 | 12 | 28 |
| 80–99 | 7 | 35 |
| 100–119 | 5 | 40 |
| 120–139 | 1 | 41 |
| 140–159 | 1 | 42 |
| 160–179 | 1 | 43 |

Source: O.N.Bishop

Estimate the median number of flowers per plant and the percentage of plants that have more than 90 flowers.

Solution:

The u.c.bs. (where we plot the points) are at 39·5, 59·5 etc.

There are 43 observations, so the median is given by the 21·5th one.

Median = 70

The cumulative frequency read from the graph at 90 flowers is 32.

Number with more than 90 flowers = 43 – 32 = 11

Percentage with more than 90 flowers ≈ 26%
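The arithmetic behind the percentage can be written out. The first line is only a rough check of the graph reading of 32, assuming straight lines between the plotted points (79·5, 28) and (99·5, 35):

$$
28 + \frac{90 - 79.5}{99.5 - 79.5} \times 7 \approx 31.7 \approx 32
$$

$$
43 - 32 = 11, \qquad \frac{11}{43} \times 100 \approx 25.6 \approx 26\%
$$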
