Thinking about variation

Learning Objectives

By the end of this lecture, you should be able to:
– Discuss with an example why it is important to know the variation when analyzing a dataset
– Interpret a series of Normal curves relative to each other in terms of their center and variation
– Compare values from different datasets by comparing their z-scores

Thoughts on variation continued

Let’s take a moment to think about spread (again)… Suppose you score 12 out of 15 on a test. Is that a:
– Great score?
– Good score?
– Average score?
– Poor score?
– Terrible score?

Answer: You can’t tell! I hope you’d agree that you’d at least need the mean in order to interpret how good a score this was.

Okay then, so suppose I tell you that the mean was 11/15. Now answer the same question: is 12/15, with a mean of 11, a great, good, average, poor, or terrible score?

Answer: You STILL can’t tell! While you could say that it is somewhat better than average, you really have no way of knowing whether it is approximately average, good, or great.

Thoughts on variation continued

Suppose I tell you that the mean was 11/15. Is 12/15 a:
– Great score?
– Good score?
– Average score?
– Poor score?
– Terrible score?

Discussion: What’s missing from this interpretation is a measure of spread. Suppose I told you that of the 500 students who took this test, the vast majority scored between 9.5 and [upper value missing]. In this case, you’d suspect that a score of 12 was, in fact, quite good, but you couldn’t put a number on it.

KEY POINT: In order to properly interpret any score (of a Normal distribution), we simply cannot ignore the standard deviation!
– Suppose the standard deviation was 0.5. In this case, a score of 12 is two standard deviations above the mean (z = +2). This would be a score at about the 98th percentile, which is a great result.
– Suppose the standard deviation was 2. In that case, your z-score is +0.5 and you are at about the 70th percentile, which is good, but not fantastic.

In other words, without knowing the spread, you simply do not know the story!
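To make the two scenarios concrete, here is a minimal sketch that recomputes them, assuming the scores really are Normally distributed. It uses `NormalDist` from Python's standard `statistics` module; the helper names `z_score` and `percentile` are illustrative and not part of the lecture.

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


def percentile(x, mean, sd):
    """Percent of a Normal(mean, sd) distribution that falls at or below x."""
    return 100 * NormalDist(mean, sd).cdf(x)


# Same score (12) and same mean (11), but two different spreads.
for sd in (0.5, 2):
    z = z_score(12, 11, sd)
    p = percentile(12, 11, sd)
    print(f"sd = {sd}: z = {z:+.1f}, percentile = {p:.1f}")
# sd = 0.5 gives z = +2.0 (about the 97.7th percentile)
# sd = 2   gives z = +0.5 (about the 69.1st percentile)
```

The only thing that changes between the two runs is the standard deviation, which is exactly why the same score can be "great" in one class and merely "good" in another.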

What’s different? What’s the same?

In the first group of curves, the means are different (μ = 10, 15, and 20) while the standard deviations are the same (σ = 3).

In the second group of curves, the means are the same (μ = 15) but the standard deviations are different (σ = 2, 4, and 6).
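If the curves themselves are not in front of you, one rough way to see the same pattern is to compute the height of each curve at its peak. The sketch below assumes the parameter values quoted on the slide; the function name `peak_height` is made up for illustration.

```python
from statistics import NormalDist


def peak_height(mean, sd):
    """Height of the Normal(mean, sd) density curve at its center."""
    return NormalDist(mean, sd).pdf(mean)


# Group 1: different means, same standard deviation.
# The curves slide left or right, but all peaks are equally tall.
for mean in (10, 15, 20):
    print(f"mean = {mean}, sd = 3: peak height = {peak_height(mean, 3):.4f}")

# Group 2: same mean, different standard deviations.
# The larger the spread, the lower and wider the curve.
for sd in (2, 4, 6):
    print(f"mean = 15, sd = {sd}: peak height = {peak_height(15, sd):.4f}")
```

Changing the mean only moves the curve; changing the standard deviation changes its shape, because the total area under every curve must stay equal to 1.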

Another extremely useful thing about working with Normally distributed data is that we can compare apples and oranges! That is, because we can convert any observation into a z-score, we can answer questions that compare seemingly non-comparable distributions.

SAT vs ACT

Question: Suppose that student A scores 1140 on the SAT, and student B scores 18.2 on the ACT. You are an admissions counselor and you need to make a decision based exclusively on their test scores. Can you use this data to decide?

Answer: If you can convert these numbers to their corresponding z-scores, then absolutely! To do so, you would, of course, need to know the mean and standard deviation of the two exams. This information is routinely provided by the testing services.

E.g. if student A had a z-score of +1, that means they were at about the 84th percentile for the SAT. If student B had a z-score of +1.3, that means they were at about the 90th percentile for the ACT. So even though they took completely different exams, you do have a way of comparing them!
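The z-score-to-percentile step can be checked with a few lines of code. This sketch starts from the two example z-scores on the slide (+1 and +1.3) rather than from the raw scores, because the slide does not give the means and standard deviations of the exams. The helper name `percentile_from_z` is hypothetical.

```python
from statistics import NormalDist

# The standard Normal distribution: mean 0, standard deviation 1.
STANDARD_NORMAL = NormalDist(0, 1)


def percentile_from_z(z):
    """Percent of a Normal distribution lying at or below the given z-score."""
    return 100 * STANDARD_NORMAL.cdf(z)


students = {"A (SAT)": 1.0, "B (ACT)": 1.3}
for name, z in students.items():
    print(f"Student {name}: z = {z:+.1f}, percentile = {percentile_from_z(z):.1f}")
# z = +1.0 is about the 84th percentile; z = +1.3 is about the 90th.
```

Once both scores are on the z-scale, the comparison no longer depends on which exam was taken.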

Example: Gestation time in malnourished mothers

A study was done in which the gestation time of mothers in a poor neighborhood was measured. While free prenatal vitamins were available, there was a great deal of misinformation about proper prenatal nutrition. The gestation time of this group is shown by the light-blue curve.

Over the next couple of years, a public health project was implemented at local health-care institutions in which women were also provided with nutritional counseling and healthier food. The results of a study done after the nutritional program was implemented are summarized by the orange curve.

Try to interpret the results in your own words…

Example: Gestation time in malnourished mothers

Try to interpret the results in your own words…
– The mean gestational time improved from about 250 days to about 266 days.
– In addition to the mean improving, more women were clustered close to the mean (the peak of the orange curve is higher than the peak of the blue curve).
– There was more consistency in the “better nutrition” group: the spread of the orange distribution is narrower. (You can simply eyeball this, but you can also quantify it with the standard deviation.)

Don’t feel bad if you didn’t automatically ‘get’ all these facts. That’s why we do examples here! Your goal should be to begin making these kinds of interpretations on your own.

Example: Gestation time in malnourished mothers

A commonly accepted minimum gestational period is (ideally) about 240 days. How might we quantify the improvement shown by the two curves? Instead of waiting for me to answer, try to come up with it on your own. I.e. STOP and THINK about it for a moment…

Answer: The best way would be to look at the percentage of women who reached the target of 240 days in each group.

Vitamins only (μ = 250, σ = 20, x = 240)

In the group without nutritional counseling (vitamins only), what percent of mothers failed to carry their babies at least 240 days?

Here z = (240 − 250) / 20 = −0.5, and the area to the left of z = −0.5 is about 0.31. So about 31% of women in this group failed to reach the target length of 240 days.

Nutritional counseling and better food (μ = 266, σ = 15, x = 240)

Here z = (240 − 266) / 15 ≈ −1.73, and the area to the left of z = −1.73 is about 0.04. So in the nutritional assistance program, only about 4% of women failed to carry their babies 240 days!

Conclusion: Compared to vitamin supplements alone, vitamins and better food resulted in a much smaller percentage of women with pregnancy terms below 240 days, or about 8 months (4% vs. 31%).
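The two table lookups above can be reproduced in code. This is a minimal sketch, assuming both groups are well described by Normal curves with the means and standard deviations given on the slides; `percent_below` is an illustrative helper name.

```python
from statistics import NormalDist


def percent_below(x, mean, sd):
    """Percent of a Normal(mean, sd) population with values below x."""
    return 100 * NormalDist(mean, sd).cdf(x)


TARGET_DAYS = 240

vitamins_only = percent_below(TARGET_DAYS, mean=250, sd=20)
better_food = percent_below(TARGET_DAYS, mean=266, sd=15)

print(f"Vitamins only:          {vitamins_only:.1f}% below {TARGET_DAYS} days")
print(f"Counseling and food:    {better_food:.1f}% below {TARGET_DAYS} days")
# Roughly 31% versus roughly 4%, matching the values read from the table.
```

Note that the improvement comes from two changes at once: the mean moved up and the spread shrank, and both push the 240-day cutoff further into the left tail.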

Going in the other direction…

Remember: stats teachers love this!! We may also want to find the observed range of values that corresponds to a given proportion (area under the curve). For that, we go backward; that is, we start with the Normal table:
– we first find the desired area/proportion in the body of the table,
– we then read the corresponding z-value from the left column and top row.

For an area to the left of 1.25% (0.0125), the z-value is -2.24.
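The backward table lookup has a direct software counterpart: the inverse of the cumulative distribution function. The sketch below uses `NormalDist.inv_cdf` from Python's standard library; the wrapper name `z_for_left_area` is made up for this example.

```python
from statistics import NormalDist


def z_for_left_area(area):
    """z-value that has the given proportion of the standard Normal curve to its left."""
    return NormalDist(0, 1).inv_cdf(area)


z = z_for_left_area(0.0125)
print(f"Area to the left = 0.0125 -> z = {z:.2f}")
# Rounded to two decimals this is -2.24, the same value read from the table.
```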

Example: μ = 266, σ = 15, upper area 75%

How long are the longest 75% of pregnancies when mothers in the neighborhood are entered in the “better food” program?

Answer: This is another case where we start with an area and need to work back to our x. An upper area of 75% means a lower area of 25% (0.25). The z-value with an area of 0.25 to its left is about -0.67, so x = μ + zσ = 266 + (-0.67)(15) ≈ 256.

Conclusion: The longest 75% of pregnancies in this group last about 256 days or longer.
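Here is the same calculation as a short sketch, again assuming a Normal model with the slide's values. It follows the hand method step by step (area, then z, then x) rather than asking the library for x directly; `cutoff_for_upper_area` is a hypothetical helper name.

```python
from statistics import NormalDist


def cutoff_for_upper_area(upper_area, mean, sd):
    """Value x such that the given proportion of a Normal(mean, sd) lies above x."""
    lower_area = 1 - upper_area               # the table works with areas to the left
    z = NormalDist(0, 1).inv_cdf(lower_area)  # area -> z (backward table lookup)
    return mean + z * sd                      # z -> x, undoing z = (x - mean) / sd


x = cutoff_for_upper_area(0.75, mean=266, sd=15)
print(f"The longest 75% of pregnancies last about {x:.0f} days or longer")
# x is a little under 256 days
```

The key step is converting the "upper 75%" in the question into the "lower 25%" that the table (and `inv_cdf`) expects.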