
NHANES 1000, MARIJUANA USE: COMPARE WHAT HAPPENS WHEN WE CHANGE SAMPLE SIZE. CENSUS AT SCHOOL, ARM SPAN: LOOKING AT WHAT HAPPENS TO THE SAMPLING.


1

2

3

4

5 NHANES 1000, MARIJUANA USE: COMPARE WHAT HAPPENS WHEN WE CHANGE SAMPLE SIZE. CENSUS AT SCHOOL, ARM SPAN: LOOKING AT WHAT HAPPENS TO THE SAMPLING ERROR FOR A BOX PLOT

6 "Whenever I see I remember "

7

8

9

10 HOW CONFIDENT ARE YOU NOW?

11

12

13  OR: HOW WRONG OUR ESTIMATES COULD BE  LAST YEAR (SOME OF) YOU MET THE IDEA OF INFORMAL CONFIDENCE INTERVALS

14

15 Our sample gives us this:
But we need to see this:
So we need an idea of how “wrong” we could be:
- We could take another sample or an even larger sample
- Even better, we could take another 1000 samples and find the average of our results
- In 202 we used a formula to find an interval that had a high probability of containing the population parameter (the population median, that is)
In 302 we use something called BOOTSTRAPPING to do something similar
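As a reminder of the 202 approach, here is a minimal sketch. It assumes the formula meant is the usual Level 7 informal interval, median ± 1.5 × IQR / √n; the slide does not state the formula. The function name and the heights are made up for illustration, and quartile conventions differ slightly between tools, so a calculator or iNZight may give slightly different limits.

```python
import math
import statistics

def informal_confidence_interval(sample):
    """Return (lower, upper) using median +/- 1.5 * IQR / sqrt(n).

    Assumption: this is the Level 7 "informal confidence interval" rule.
    """
    n = len(sample)
    median = statistics.median(sample)
    # quantiles(..., n=4) returns the three quartile cut points (LQ, median, UQ).
    lower_q, _, upper_q = statistics.quantiles(sample, n=4)
    margin = 1.5 * (upper_q - lower_q) / math.sqrt(n)
    return median - margin, median + margin

# Hypothetical heights in cm, invented purely for illustration.
heights = [142, 145, 147, 148, 150, 151, 152, 153, 155, 157, 160]
low, high = informal_confidence_interval(heights)
print(f"Informal interval for the population median: {low:.1f} to {high:.1f} cm")
```

The interval gets narrower as the sample size grows, because the margin is divided by √n.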

16 https://www.youtube.com/watch?v=3Y_Ps4ETwo0
Ponyland is a mystical land, home to all kinds of magical creatures. The Little Ponies make their home in Paradise Estate, living a peaceful life filled with song and games. However, not all of the creatures of Ponyland are so peaceful, and the Ponies often find themselves having to fight for survival against witches, trolls, goblins and all the other beasts that would love to see the Little Ponies destroyed, enslaved or otherwise harmed. [1]
[1] https://en.wikipedia.org/wiki/My_Little_Pony_(TV_series)

17 How tall is your PONY?
Casio graphics calculator: RanNorm#(5, 150) RanInt#(5,150)
Excel: =NORMSINV(RAND())*5+150
TI-84+ calculator: randNorm(150,5)
Draw your PONY
Tell your neighbor about your PONY
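If you have no calculator handy, the same random height can be sketched in Python. This assumes, as the calculator commands suggest, that heights follow a normal distribution with mean 150 and standard deviation 5 (taken here to be centimetres). The function name and the seed are illustrative choices, not part of the lesson.

```python
import random

MEAN_HEIGHT_CM = 150  # centre used in the calculator commands
SD_HEIGHT_CM = 5      # spread used in the calculator commands

def random_pony_height(rng=random):
    """Draw one height from a normal distribution with mean 150 and SD 5."""
    return rng.gauss(MEAN_HEIGHT_CM, SD_HEIGHT_CM)

rng = random.Random(2024)  # fixed seed so the run is repeatable
print(f"My pony is {random_pony_height(rng):.1f} cm tall")
```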

18 I wonder what the mean height is for the Little Ponies at Paradise Estate?
- From Level 6 (Year 11): we use means/medians to estimate the population mean; however, we know there is too much uncertainty
- From Level 7 (Year 12): we use an informal confidence interval to estimate the range where the population mean will lie
- Level 8 (Year 13): you will learn a new analysis tool called bootstrapping

19
1. Find the mean and median of your sample first – write it down
2. Shuffle your Ponies
3. Select 1 and record their height in Excel
4. Put that Pony back and re-shuffle
5. Select another Pony
6. Repeat process until you have recorded 11 Pony heights
THIS IS YOUR SAMPLE OF 11
Find the mean and median of your sample
Plot your mean and median on the board using the appropriate colour
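The shuffle, select, record and put-back steps above are sampling with replacement. A minimal sketch of one such re-sample follows; the heights are hypothetical stand-ins for your own 11 ponies, and the function name is just for illustration.

```python
import random
import statistics

def resample(sample, rng):
    """Pick len(sample) values with replacement (select, record, put back)."""
    return [rng.choice(sample) for _ in sample]

# Hypothetical pony heights in cm; replace with your own 11 recorded heights.
ponies = [143.2, 146.8, 147.5, 149.0, 150.1, 150.9, 151.4, 152.6, 154.3, 156.0, 158.7]

rng = random.Random(1)  # fixed seed so the run is repeatable
one_resample = resample(ponies, rng)

print("Original mean and median:",
      round(statistics.mean(ponies), 2), statistics.median(ponies))
print("Re-sample mean and median:",
      round(statistics.mean(one_resample), 2), statistics.median(one_resample))
```

Because each Pony goes back before the next pick, the same Pony can turn up more than once in a re-sample, and some Ponies may not turn up at all.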

20

21 Good News: Bootstrapping will be the easiest part of your Inference assignment

22 Using iNZight to re-sample
- Start iNZight and select the Bootstrap Confidence Interval Construction VIT module.
- Import the Pony sample session 1 file.
- Drag Height down to the variable 1 box, and then click the Analyse tab.
- The default quantity is “mean”. Do NOT change this, just click on “Record my choices”.
- Play, and replicate what you have just done by hand. Check you know what each selection does.
- To finish, copy and paste the Bootstrap distribution of re-sample means into a Word document.
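What the module does can be sketched in a few lines of Python. This is an illustration of the idea, not iNZight's own code: the 1000 re-samples, the central 95% cut-offs and the heights are all assumptions made for the example.

```python
import random
import statistics

def bootstrap_means(sample, n_resamples, rng):
    """Re-sample with replacement many times and record each re-sample mean."""
    means = []
    for _ in range(n_resamples):
        resampled = rng.choices(sample, k=len(sample))
        means.append(statistics.mean(resampled))
    return means

def percentile_interval(values, level=0.95):
    """Return the limits of roughly the central `level` share of the values."""
    ordered = sorted(values)
    tail = (1 - level) / 2
    lower_index = int(tail * len(ordered))
    upper_index = int((1 - tail) * len(ordered)) - 1
    return ordered[lower_index], ordered[upper_index]

# Hypothetical pony heights in cm, invented for illustration.
ponies = [143.2, 146.8, 147.5, 149.0, 150.1, 150.9, 151.4, 152.6, 154.3, 156.0, 158.7]

rng = random.Random(3)
means = bootstrap_means(ponies, n_resamples=1000, rng=rng)
low, high = percentile_interval(means)
print(f"Bootstrap interval for the mean height: {low:.1f} to {high:.1f} cm")
```

The list `means` plays the role of the bootstrap distribution of re-sample means that the module draws as a dot plot.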

23

24 Using iNZight to check how well this method works
- Start iNZight and select the Confidence interval coverage VIT module (or select FILE and VIT modules).
- Import the Pony height population file.
- Drag “Height” down to the variable 1 box, and then click the Analyse tab.
- The default quantity is mean. Do NOT change this. Change the CI Method to bootstrap: percentile and the Sample Size to 10, then click on Record my choices.
- Play. Check you know what each selection does, and how it relates to the bootstrap confidence intervals.
Just remember: You will rarely have data on the whole population! This is just a teaching tool to show you how it works!
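The coverage idea can also be sketched in Python: take many samples of 10 from a known population, build a bootstrap percentile interval from each, and count how often the interval captures the true population mean. The population here is simulated (normal, mean 150, SD 5) rather than the Pony height population file, and the numbers of samples and re-samples are arbitrary choices kept small so the code runs quickly.

```python
import random

def mean(values):
    return sum(values) / len(values)

def bootstrap_percentile_ci(sample, rng, n_resamples=300, level=0.95):
    """Percentile interval from the bootstrap distribution of re-sample means."""
    means = sorted(
        mean(rng.choices(sample, k=len(sample))) for _ in range(n_resamples)
    )
    tail = (1 - level) / 2
    return means[int(tail * n_resamples)], means[int((1 - tail) * n_resamples) - 1]

def coverage(population, sample_size, n_samples, rng):
    """Share of intervals that contain the true population mean."""
    true_mean = mean(population)
    hits = 0
    for _ in range(n_samples):
        sample = rng.sample(population, sample_size)  # a fresh sample each time
        low, high = bootstrap_percentile_ci(sample, rng)
        if low <= true_mean <= high:
            hits += 1
    return hits / n_samples

rng = random.Random(7)
# Simulated stand-in for the population file (an assumption for this sketch).
population = [rng.gauss(150, 5) for _ in range(5000)]
rate = coverage(population, sample_size=10, n_samples=200, rng=rng)
print(f"Intervals that captured the population mean: {rate:.0%}")
```

Try changing `sample_size` and see what happens to the share of intervals that capture the population mean.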

25

26
- posing a comparison investigative question using a given multivariate data set
- selecting and using appropriate displays and summary statistics
- discussing sample distributions
- discussing sampling variability, including the variability of estimates
- making an appropriate formal statistical inference
- communicating findings in a conclusion.

27
- Achieved: Use statistical methods to make a formal inference involves showing evidence of using each component of the statistical enquiry cycle.
- Merit: Use statistical methods to make a formal inference, with justification involves linking components of the statistical enquiry cycle to the context, and referring to evidence such as sample statistics, data values, or features of visual displays in support of statements made.
- Excellence: Use statistical methods to make a formal inference, with statistical insight involves integrating statistical and contextual knowledge throughout the statistical enquiry cycle, and may include reflecting about the process; considering other relevant explanations.

28 PROBLEM AND PLAN STAGE

29 I wonder what the difference is between the median weight of forward and back rugby players in New Zealand according to a sample from http://www.rugby-sidestep-central.com/
What you are comparing (you must include the mean or median)
The characteristic you are grouping by
What the population is
Where your sample data is sourced from
The weight is the weight of the rugby players in kilograms, and the position is the player’s normal position on the rugby field, either forward or back.
What next? What did we do for the BIVARIATE standard?

30 I am doing this investigation as I play rugby and it has often been commented that I would be better suited to playing back due to my size and weight. I wonder how my weight compares to the median ….. I would expect the median weights for backs to be less than forwards although ……

31

32 Autism and Vaccines
As you may know, the link between autism and vaccines has a long and contentious history. Use this topic to do some research into this area. The table below may help you summarise your findings.
Table headings: Basic facts; One thing I didn’t know; One thing I found interesting
Come up with AT LEAST two different questions
I DO NOT want you to spend much time on this

33

34

35

36

37

38

39 SUCCOS
S SPREAD: Discuss the interquartile range (IQR), which is UQ – LQ. This is the spread of the middle 50%.
U UNUSUAL FEATURES: This is usually seen by looking at the raw data (dot plot) OR a long whisker.
C CLUSTERS: Where does most of the data lie between, OR are there any groupings?
C CENTRE: Compare the middle 50% of the data and which is higher up the scale.
O OVERLAP: Is there a visible overlap of the boxes?
S SHAPE: Is there an even distribution? (Median in the middle of the box and whiskers even in length.)
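Most of the SUCCOS comments rest on a handful of summary statistics. A minimal sketch of how they could be calculated follows; the two lists of weights are invented for illustration and are not the rugby sample. Quartile rules vary a little between tools, so iNZight may report slightly different LQ and UQ values.

```python
import statistics

def five_number_summary(values):
    """Return min, LQ, median, UQ, max and IQR for a list of numbers."""
    lower_q, median, upper_q = statistics.quantiles(values, n=4)
    return {
        "min": min(values),
        "LQ": lower_q,
        "median": median,
        "UQ": upper_q,
        "max": max(values),
        "IQR": upper_q - lower_q,  # spread of the middle 50%
    }

# Hypothetical weights in kg, invented for illustration only.
groups = {
    "forwards": [98, 101, 104, 105, 108, 110, 112, 116, 118, 120, 137],
    "backs": [82, 86, 88, 89, 91, 92, 93, 95, 96, 98, 101],
}
for name, weights in groups.items():
    print(name, five_number_summary(weights))
```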

40

41

42 SPREAD: Comparing the sizes of the spreads
What do you see? What does this mean for the sample? What does this mean for the population?
The interquartile range for the forwards is 12.2 kg whereas the interquartile range for the backs is 7.5 kg. The range is also greater for the forwards than the backs. The standard deviation is also higher for the forwards. This indicates that the forwards have more variation in their weights than the backs. Overall, visually, forwards seem to be slightly more spread out than backs.

43 UNUSUAL: Describing any unusual features
What do you see? What does this mean for the sample? What does this mean for the population?
Looking at the graphs I can see that the forwards have one player who weighs more than most of the other forwards. He is a New Zealander weighing 137 kg and is 1.81 m tall. This could be because he is a stockier, larger player with more muscle, causing him to weigh more, which are characteristics I would expect a forward to be more likely to have. {research?}

44 CENTRE: Comparing the middle 50%
What do you see? What does this mean for the sample? What does this mean for the population?
The forwards’ median weight is 18.50 kg higher than the backs’ median weight. The middle 50% of the forwards’ weights are between 104.8 kg and 117.0 kg whereas the middle 50% of the backs’ weights are between 88.0 kg and 95.5 kg.
Remember this structure is only a guide

45 CLUSTERS: Where does most of the data lie between and are there any groupings?
What do you see? What does this mean for the sample? What does this mean for the population?
There are two discernible groups for the forwards, one between 97 kg and 105 kg and the other between 115 kg and 120 kg. This could be due to the heavier group being props and the lighter group being flankers. (If we have access to the raw data we could actually find this out!)
This is a great opportunity to integrate some research. What can we find out about the weight of props and flankers on the internet?

46 OVERLAP: Is there a visible overlap of the boxes?
What do you see? What does this mean for the sample? What does this mean for the population?
The lower quartile for the forwards’ weights is higher than the upper quartile for the backs’ weights. Therefore the middle 50% do not overlap. This suggests that weights for forwards will be higher on average than the weights for backs.
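The overlap check is simple enough to write down as a rule. In the sketch below the quartiles are the ones quoted on the CENTRE slide for the rugby sample; the function name is illustrative.

```python
def boxes_overlap(box_a, box_b):
    """Each box is (lower_quartile, upper_quartile); True if the boxes overlap."""
    return box_a[0] <= box_b[1] and box_b[0] <= box_a[1]

# Quartiles in kg quoted on the CENTRE slide.
forwards_box = (104.8, 117.0)
backs_box = (88.0, 95.5)

if boxes_overlap(forwards_box, backs_box):
    print("The middle 50% of the two groups overlap")
else:
    print("The middle 50% of the two groups do not overlap")
```

Here the forwards’ lower quartile (104.8 kg) sits above the backs’ upper quartile (95.5 kg), so the check reports no overlap.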

47 SHAPE: What is the distribution like?
What do you see? What does this mean for the sample? What does this mean for the population?
The forwards’ weights appear to have two distinct groupings and be skewed to the right whereas the backs’ weights seem reasonably symmetrical. This means that the weights of the backs are more evenly spread out but cluster around the median following an almost normal distribution. The forwards however have weights that are more variable with two distinct groupings and a particularly heavy player who skews the data to the right. Backs appear to be unimodal whereas the forwards are potentially bimodal. However there is only one player skewing the data to the right so this could be down to sampling variability.

48 Open run mode
Import data
Choose your Variable 1 (has to be numerical)
Subset by your two groups
Import ‘Student Data’ and draw a comparison box and whisker (B & W) graph for the head perimeter between males and females.
Get summary statistics
** Data is based on Year 11 students at Blah College

49 Female / Male
S SPREAD
U UNUSUAL FEATURES
C CLUSTERS
C CENTRE
O OVERLAP
S SHAPE

50 Female / Male
SPREAD: Compare the IQR (middle 50% spread)
Female IQR = 58 – 55 = 3 cm
Male IQR = 58 – 53.75 = 4.25 cm
The middle 50% of head circumferences belonging to the male Year 11 students at Blah College are more spread out than the middle 50% of head circumferences of female Year 11 students at Blah College. This is shown by the male head circumference IQR being larger by 1.25 cm. This could be because … (possible reason why)

51 Female / Male
Unusual features/value: There is one unusually small head circumference for Year 11 males at Blah College at 46 cm whereas there are no unusual head circumferences for females at Blah College. This could be because … (possible reason why)

52 Female / Male
Clusters: Most of the head circumferences for Year 11 females at Blah College are between 53 cm and 58 cm whereas most of the head circumferences for the Year 11 males at Blah College are between 54 cm and 58 cm. There also seem to be two groupings of Year 11 female students, with head circumferences of 57 cm and 55 cm, whereas the male Year 11 students seem to be more scattered with no clusters. This could be because … (possible reason why)

53 Female / Male
Centre: Expectation is to compare the middle 50%
Female middle 50% = 55 cm to 58 cm, median = 57 cm
Male middle 50% = 53.75 cm to 58 cm, median = 55 cm
The median head circumference for Year 11 female students at Blah College is 2 cm bigger than that of the male Year 11 students at Blah College. The middle 50% of Year 11 female students at Blah College is between 55 cm and 58 cm, which is approximately the same as the Year 11 male students at Blah College. For example, the middle 50% of students have roughly the same head circumference whether they are male or female. This could be because … (possible reason why)

54 Female / Male
Overlap: Do the boxes (middle 50%) overlap?
Female middle 50% = 55 cm to 58 cm
Male middle 50% = 53.75 cm to 58 cm
There is significant overlapping of the middle 50% between male and female Year 11 students at Blah College, which suggests that we may not be able to make a call on whether there is a difference in head circumferences between males and females. This could be because … (possible reason why)

55 Female / Male
Symmetry: Both male and female students at Blah College have asymmetric distributions, meaning there is an uneven distribution. This is because the head circumferences for female Year 11 students at Blah College have been pushed slightly towards having larger head circumferences whereas the males have been pushed slightly towards having smaller head circumferences, skewing both distributions. This could be because … (possible reason why)

56 I wonder what the difference is between the mean wing length of male Pegasus Ponies that are 18 years or over and female Pegasus Ponies that are 18 years of age or over
Draw comparison box and whisker graphs of the wing length of Pegasus Ponies at Paradise Estate
Describe any features.

57

58 Comment on the sample distribution for your TWO investigation questions: Heights, Spike
Copy and paste ANY relevant graphs and/or statistics you have used.
Describe the features

59

60

61 I wonder if there are any differences between the mean wing length of Male and Female Pegasus Ponies that are 18 years of age or over
I am fairly confident that there is a difference between the mean wing length of female and male Pegasus Ponies that are 18 years or over. I can make the call that males have longer wings than females as both limits of the bootstrap confidence interval are positive. I can also say that male Pegasus Ponies 18 years and over have a mean wing length that is somewhere between 1.534 cm and 3.595 cm longer than the mean female wing length.

62 Answer both of your comparison questions
1. Open iNZight in bootstrap VIT mode
2. Import appropriate data
3. Show bootstrap distribution
4. Calculate confidence interval
5. Write an inference
Remember:
We want to create a bootstrap confidence interval for the difference between median heights of female ponies and median heights of male ponies.
We want to create a bootstrap confidence interval for the difference in median heights between the ponies chased by Spike and the ponies not chased by Spike.
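A rough Python sketch of a bootstrap confidence interval for a difference between two group medians is given below. It re-samples each group separately, which is one common choice and may not match what iNZight does internally; the heights, the 1000 re-samples and the function name are assumptions made for the example.

```python
import random
import statistics

def bootstrap_difference(group_a, group_b, rng,
                         statistic=statistics.median, n_resamples=1000):
    """Percentile interval for statistic(group_a) - statistic(group_b)."""
    diffs = []
    for _ in range(n_resamples):
        # Re-sample each group with replacement, keeping its original size.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(statistic(resample_a) - statistic(resample_b))
    diffs.sort()
    # Roughly the central 95% of the re-sampled differences.
    return diffs[int(0.025 * n_resamples)], diffs[int(0.975 * n_resamples) - 1]

# Hypothetical pony heights in cm, invented for illustration.
male_heights = [148.1, 150.3, 151.0, 152.4, 153.8, 154.2, 155.5, 157.0, 158.6, 160.2]
female_heights = [143.5, 145.0, 146.7, 147.9, 148.8, 149.6, 150.4, 151.9, 153.0, 154.1]

rng = random.Random(11)
low, high = bootstrap_difference(male_heights, female_heights, rng)
print(f"Difference in median heights (male - female): {low:.1f} to {high:.1f} cm")
if low > 0 or high < 0:
    print("Both limits have the same sign, so we can make the call")
else:
    print("The interval includes 0, so we cannot make the call")
```

Passing `statistic=statistics.mean` gives the matching interval for a difference between means.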

63 Complete this sheet in student resources

64

65 Make a formal statistical inference. Conclude your investigation, reflecting on your hypothesis and justifying your formal inference
This may include:
- Discussing sampling variability, including the variability of estimates.
- Reflecting on the process you have used to make the formal inference
- Discussing your choice of the mean or median
- Are there any lurking variables that you could consider next to improve your investigation?

66 I wonder if there are any differences between the mean wing lengths of Male and Female Pegasus Ponies that are 18 years of age or over
(Copy and paste question into your conclusion)
When looking at the sample variation between males and females, females’ wing lengths are a lot more spread out than males’. However, when you compare just the middle 50% spread, they only have a difference of 0.65 cm, which is very small. This leads me to believe that if I had a different sample the spread could potentially be different, and there may not be as many female Pegasus Ponies with short wings. If this was the case, this would push up the mean, but may have little effect on the median.
Through my research about Pegasus Ponies’ wings I have learnt that female Pegasus Ponies have a different shape of wing as they are narrower, so looking just at the length of the wing may not be enough to make a recommendation about whether to make special female army wing guards.
Looking at the standard deviation, there is not much variation in the difference of means. The bootstrap interval also lies entirely above 0, which gives me confidence that there is indeed a difference in mean wing lengths.
Based on my investigation and the sample that I was given, I would conclude that there is a difference between male and female mean wing lengths for all Pegasus Ponies that are 18 years and over. I therefore make the recommendation that they should be making special wing guards for females.

67 Because our Ponies are fictional we are not going to write a conclusion based on this.

68

69

70 Research http://nzta.govt.nz/about/advertising/drink-driving/legend.html

