Analysis. Start with describing the features you see in the data.

Starting point Overall visual non-numerical comparisons: overlap, shift, unusual features.

Starting point Overlap: I notice there is a lot of overlap between the boys’ and girls’ foot lengths.

Starting point Shift: The boys’ foot lengths are shifted further up the scale than the girls’ foot lengths.

Starting point Unusual features: There is an unusually low foot length of 15cm in the girls’ data. I suspect that this value is a mistake, as it seems too low in comparison with the other data for girls. OR One of the girls has a recorded foot length far shorter than any other girl’s.

After the initial overall visual non-numerical comparisons: Make more detailed comparative descriptions of the features including use of summary statistics and specific observation values where appropriate. Reflect and perhaps comment on some of the features using “I wonder...” and “I expect...” type statements, i.e., comment on any inferential thoughts.

Comparative descriptions of the features, including use of summary statistics: The median foot length of the boys (25cm) is 3cm longer than the median foot length of the girls (22cm). The mean foot length of the boys (25.5cm) is 2.1cm longer than the mean foot length of the girls (23.4cm).

Comparative descriptions of the features, including use of summary statistics: The range of foot lengths for the boys (9cm) is the same as the range of foot lengths for the girls (9cm) if we ignore the unusual value. Also, the interquartile range of the foot lengths for the boys (3cm) is the same as for the girls (3cm).

Comparative descriptions of the features, including use of summary statistics: The most common foot length for the boys was 25cm, but for the girls it was 22cm and 23cm. In all these cases (median, mean and most common value), the boys seem to have higher values of foot length than the girls, by about 2 to 3cm.

Comparative descriptions of the features, including use of summary statistics: The median foot length for the boys is the same as the UQ value for the girls (25cm).
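Note: The summary statistics quoted in these comparisons can be computed directly from the raw measurements. The sketch below is only an illustration: the two lists are made-up foot lengths, not the sample shown in the slides, and `summarise` is a hypothetical helper name. It uses the "inclusive" quartile method from Python's standard library; other quartile conventions can give slightly different values.

```python
import statistics

# Made-up foot lengths in cm (assumption: NOT the data from the slides).
boys = [22, 23, 24, 24, 25, 25, 25, 26, 27, 27, 28, 30]
girls = [20, 21, 22, 22, 22, 23, 23, 24, 25, 25, 26, 29]


def summarise(values):
    """Return the median, mean (1 d.p.) and IQR of a list of measurements."""
    lq, _, uq = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "median": statistics.median(values),
        "mean": round(statistics.mean(values), 1),
        "iqr": uq - lq,
    }


boys_summary = summarise(boys)
girls_summary = summarise(girls)

# Comparative statement: how much longer is the boys' median?
median_difference = boys_summary["median"] - girls_summary["median"]

print("Boys: ", boys_summary)
print("Girls:", girls_summary)
print("Difference in medians (cm):", median_difference)
```

Reporting the difference alongside the two values keeps the description comparative rather than a list of separate facts about each group.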

Make comparisons: between the groups (e.g., overlap, shift, spread and shape statements) and within each group (e.g., unusual observations).

Overlap Be aware of sampling variation: sampling alone can produce shifts. These shifts are small in large samples. They can be large in small samples.
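Note: One way to see that sampling alone can produce shifts is to draw two samples from the same population and look at the gap between their medians. The sketch below is a simulation under stated assumptions: the population is taken to be normal with mean 24cm and standard deviation 2cm (invented values), and `typical_median_shift` is a hypothetical helper name.

```python
import random
import statistics


def typical_median_shift(sample_size, trials=500, seed=1):
    """Average absolute gap between the medians of two samples drawn
    from the SAME population, so any gap is due to sampling alone."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(trials):
        # Assumed population: normal, mean 24 cm, sd 2 cm (illustrative only).
        sample_a = [rng.gauss(24, 2) for _ in range(sample_size)]
        sample_b = [rng.gauss(24, 2) for _ in range(sample_size)]
        gaps.append(abs(statistics.median(sample_a) - statistics.median(sample_b)))
    return statistics.mean(gaps)


small_sample_shift = typical_median_shift(10)
large_sample_shift = typical_median_shift(300)

print("Typical shift with samples of 10: ", round(small_sample_shift, 2))
print("Typical shift with samples of 300:", round(large_sample_shift, 2))
```

Because both samples come from one population, there is no real difference to find, yet the small samples still show a noticeable gap between medians.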

Overlap There is some overlap of the boxes but the median of the girls’ foot length is outside the boys’ box and the median of the boys’ foot length is the same as the UQ of the girls’ foot length.

Overlap OR There is some overlap for the middle 50% of the boys’ right foot lengths and the middle 50% of the girls’ right foot lengths.

Shift The boys’ values are shifted to the right of the girls’ values: the minimum, LQ, median, UQ and maximum foot lengths are all higher for the boys.

Shift The middle 50% of the boys’ foot lengths (the box) is shifted much further along the scale than the middle 50% of the girls’ foot lengths.

Spread The spread of the boys’ foot lengths is the same as the spread of the girls’ foot lengths, i.e., the range is 9cm in both cases (ignoring the unusual value for the girls) and the IQR is 3cm for both.

Spread The middle 50% of boys have a right foot measuring between 24cm and 27cm (IQR = 3cm) whereas the middle 50% of the girls are between 22 and 25cm (IQR = 3cm). This means that the foot lengths for these boys vary by about the same amount as these girls’ do.

Spread I expect that the boys’ and girls’ foot length distributions back in the two populations have similar variability.

Note: The range should not be used, as it tends to be an unstable estimate of the population spread. The range is highly likely to vary greatly from sample to sample for samples of these sizes. The range is also prone to be severely affected by the occasional extreme observation. This is why we use other, more resistant measures of spread such as the IQR. The IQR is not disturbed by the presence of a few very large or very small observations.
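Note: The effect of one extreme observation on the range and on the IQR can be checked with a small calculation. The sketch below uses made-up girls' foot lengths (an assumption, not the slides' data) and adds a single 15cm value like the unusual one discussed here; `spread` is a hypothetical helper name, and the quartiles use the standard library's "inclusive" method.

```python
import statistics


def spread(values):
    """Return the range and the IQR of a list of measurements."""
    lq, _, uq = statistics.quantiles(values, n=4, method="inclusive")
    return {"range": max(values) - min(values), "iqr": uq - lq}


# Made-up foot lengths in cm (assumption: NOT the data from the slides).
girls = [20, 21, 22, 22, 22, 23, 23, 24, 25, 25, 26, 29]
girls_with_unusual_value = girls + [15]

without_unusual = spread(girls)
with_unusual = spread(girls_with_unusual_value)

print("Without the 15 cm value:", without_unusual)
print("With the 15 cm value:   ", with_unusual)
```

In this made-up sample the single low value stretches the range while the IQR stays the same, which is the sense in which the IQR is called resistant.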

From the dot plot: Some of the boys have longer right feet than some of the girls and vice versa.

Shape The shape of the distributions is not clear from the dot plots but appears to be unimodal as would be expected and maybe slightly skewed to the right as indicated by the box plots. To get a more accurate view, we would need to increase the sample size.

Shape OR The sample distribution for the boys’ foot lengths is roughly symmetrical with a mound around 24 to 27cm, i.e., unimodal. The sample distribution for the girls’ foot lengths shows a large mound around 22 to 24cm.

Shape I wonder if boys’ and girls’ foot length distributions back in the two populations are roughly symmetric and unimodal. I expect so for a body measurement such as foot length for both girls and boys.

Unusual value I notice one of the girls has a foot length (15cm) far smaller than any other girl’s. I worry that this may be a mistake. It could be a measurement or recording mistake, or perhaps this girl is much younger than 13 years. I wouldn’t expect a 13 year-old girl to have a foot size this small. I need to check her other measurements, such as age, height etc., to further investigate this extreme value.

Gaps and clusters I notice the dots are stacked on whole numbers. This is because the foot lengths are measured to the nearest cm.

Gaps and clusters There is a gap in the girls’ group at 28cm and gaps in the boys’ group at 22 and 29cm.

Gaps and clusters Boys’ and girls’ foot length distributions back in the two populations would not have gaps at these same values. The gaps are in the sample due to the small size of the sample.

Sampling If a new random sample of 13 year-old boys and a new random sample of 13 year-old girls were taken, I would expect the plots to look different because of sampling variability. With these sample sizes, I would expect each IQR to change slightly and each box to be slightly further down or up the scale.

I wonder: whether, if I repeated this sampling process many times, the boys’ foot lengths would just about always be shifted further up the scale than the girls’; whether boys tend to have a greater foot length than girls back in the two populations; whether the median foot length of boys really is greater than that of girls back in the two populations.

Conclusion I notice that the distance between the medians is greater than 1/3 of the “overall visible spread”

Conclusion I am going to claim that the right foot lengths of 13 year-old New Zealand boys tend to be longer than the right foot lengths of 13 year-old New Zealand girls back in the two populations. I am prepared to make this call because, in my data, the distance between the boys’ and the girls’ median foot lengths is big relative to the overall visible spread. To make this call, with sample sizes of around 30, the difference between the two foot length medians needs to be more than about 1/3 of the overall visible spread. This is true for my data.
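Note: The check behind this call can be written out as a short calculation. The sketch below assumes that the "overall visible spread" is the distance from the lowest lower quartile to the highest upper quartile of the two boxes; the slides do not define it, so treat that as an assumption. The quartiles and medians are the values quoted earlier in this analysis (boys: 24, 25, 27cm; girls: 22, 22, 25cm), and `can_make_the_call` is a hypothetical helper name.

```python
def can_make_the_call(lq_a, median_a, uq_a, lq_b, median_b, uq_b, threshold=1 / 3):
    """Compare the distance between medians with the overall visible spread.

    Assumption: overall visible spread = highest UQ minus lowest LQ.
    The default threshold of 1/3 is the guide quoted for samples of around 30.
    The two boxes must not both collapse to a single value (spread of zero).
    """
    overall_visible_spread = max(uq_a, uq_b) - min(lq_a, lq_b)
    distance_between_medians = abs(median_a - median_b)
    ratio = distance_between_medians / overall_visible_spread
    return ratio, ratio > threshold


# Boys: LQ 24, median 25, UQ 27. Girls: LQ 22, median 22, UQ 25 (all in cm).
ratio, make_the_call = can_make_the_call(24, 25, 27, 22, 22, 25)

print("Distance between medians / overall visible spread:", ratio)
print("Make the call?", make_the_call)
```

With these values the distance between the medians is 3cm and the overall visible spread is 5cm, so the ratio is well above 1/3.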

Conclusion I don’t believe that the pattern in my data of the boys tending to have longer foot lengths than the girls is just due to who happened to be randomly selected in the girls’ group and who happened to be randomly selected in the boys’ group, i.e., I don’t believe this data pattern has just happened by chance. I am prepared to claim that this pattern in the data is real, i.e., that this pattern persists back in the two populations.

Notes: We use ‘… right foot length …’ because the investigative question asks about the right foot length.

Notes: Using statistics, there is always the possibility that the calls (decisions) that we make are wrong, i.e., we are making calls in the face of uncertainty. For example, we want to make a call on who tends to be taller (back in the two populations), 13 year-old boys or 13 year-old girls. We may make the call that it’s 13 year-old boys when in fact it’s girls who tend to be taller. Or, we may not want to make a call even though boys tend to be taller than girls.

Explanatory I expected that boys tend to have bigger feet than girls back in the populations and the information I collected (my data) supports this belief. I can’t think of any other factor which can explain the difference in foot size other than gender.

Notes: In this explanatory element we ask ourselves if our conclusion makes sense with what we know, i.e., whether our contextual knowledge matches our conclusions. We must try to think of other factors which may lead to alternative explanations when measuring foot lengths. These suggestions should also be present in the conclusion.