Using the 68-95-99.7 Rule Normal Quantile Plots

Learning Objectives
By the end of this lecture, you should be able to:
Do various calculations involving areas under the density curve using the 68-95-99.7 rule.
Identify the mathematical technique used to help confirm (though not guarantee!) that our distribution is indeed Normal.

A few numbers worth memorizing (though not just yet)
Because we use the Normal distribution SO much, it is worth memorizing the approximate areas from the Normal table that correspond to a few different z-scores. I say approximate because the values are rounded off. Look at the areas to the left of each z-score shown here, but don’t memorize them just yet.
z = -2 → about 2.3%
z = -1 → about 16%
z = +1 → about 84%
z = +2 → about 98%
What I do want you to memorize are the 3 numbers shown in a famous ‘rule’ on the next slide.
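The table areas quoted above can be reproduced without a printed table. The sketch below is only an illustration: the helper name `area_below` is made up for this example, and it relies on the standard identity that relates the Normal cumulative area to the error function in Python's `math` module.

```python
import math

def area_below(z: float) -> float:
    """Area under the standard Normal curve to the left of z (hypothetical helper)."""
    # Standard identity: Phi(z) = 0.5 * (1 + erf(z / sqrt(2)))
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Print the area to the left of each z-score from the slide, as a percentage.
for z in (-2, -1, 1, 2):
    print(f"z = {z:+d}: about {100 * area_below(z):.1f}% of observations lie below")
```

Rounding the printed values to whole percentages gives the ballpark figures listed on the slide.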

The 68-95-99.7% Rule for Normal Distributions
This is essentially a “shortcut” for a mental ballpark of the areas under the Normal curve. It is definitely worth memorizing.
The area between -1 and +1 standard deviations corresponds to about 68% of the observations.
The area between -2 and +2 standard deviations corresponds to about 95% of the observations.
The area between -3 and +3 standard deviations corresponds to about 99.7% of the observations.
You WILL be asked to use these numbers on quizzes and exams. Please note that on your exams you will not be provided with the three numbers (68, 95, 99.7).
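To see where the three numbers come from, the following sketch computes the area within 1, 2 and 3 standard deviations of the mean. It assumes Python's standard-library `statistics.NormalDist`; the function name `area_within` is invented for this illustration.

```python
from statistics import NormalDist

# Standard Normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

def area_within(k: float) -> float:
    """Fraction of observations between -k and +k standard deviations."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {100 * area_within(k):.2f}%")
```

The computed areas are slightly different from 68, 95 and 99.7, which is why the rule is described as a ballpark shortcut rather than an exact result.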

Examples: The 68-95-99.7% ‘Shortcut’ Rule for Normal Distributions
The z=0 line (black line) is very helpful in doing many of these calculations. Now let’s play around with these numbers by answering some questions. All numbers refer to z-scores (i.e. standard deviations from the mean).
What percentage of observations lie between -1 and +1?
Answer: As we just discussed, the percentage of observations between -1 and +1 standard deviations is 68%.
What percentage lie between 0 and +1?
Answer: Recall that z=0 splits the distribution in half, with 50% of observations on each side. Because the Normal curve is symmetric, if -1 to +1 is 68%, then 0 to +1 is half of that, which is 34%. This is an important one. Make sure you understand how to do it! One way to think of it: look at the area between z=0 (the black line) and z=+1, and note that it is half of the area between -1 and +1. If you need to visualize it (and you should!), shade in the area between z=0 and z=+1.
What percentage of observations lie below +1?
Answer: Look at your z=0 line. Make sure you recognize that the area to the left of z=0 represents 50% of observations. Now, what percentage of observations are between 0 and +1? Recall from the previous question that this is 34%. Therefore, 0 to +1 is 34% and below 0 is 50%, so the area to the left of +1 represents 84% of observations.

Examples: The 68-95-99.7% ‘Shortcut’ Rule for Normal Distributions
More examples:
What percentage of observations lies between -2 and +1?
Answer: Use your midline! I would solve this by adding the area between -2 and 0 (half of 95%) to the area between 0 and +1 (half of 68%): 47.5% + 34% = 81.5%.
What percentage of observations lies between 0 and +3?
Answer: Half of the area between -3 and +3 (99.7%), which is 49.85%.
What percentage of observations lies below -2?
Answer: While this too can be answered in a few different ways, I would like you to make sure you can do it this way: Look at the area between -2 and +2. Our ‘shortcut’ tells us that this contains 95% of observations. This means that the areas above +2 and below -2 together comprise 5% of observations. By symmetry, the area above +2 comprises 2.5% of observations, and the area below -2 also comprises 2.5%. Answer: 2.5%.
What percentage of observations lies above +3?
Answer: Use the same technique as was just discussed: the area between -3 and +3 makes up 99.7%. Therefore the areas below -3 and above +3 together make up 0.3%, so below -3 is 0.15% and above +3 is 0.15%.
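The midline technique used in these answers can be written down as a small calculator. This is a sketch only: the names `WITHIN`, `percent_below` and `percent_between` are hypothetical, and the code deliberately uses nothing but the three memorized numbers, so it accepts only integer z-scores from -3 to +3.

```python
# Percent of observations within k standard deviations of the mean (the rule itself).
WITHIN = {0: 0.0, 1: 68.0, 2: 95.0, 3: 99.7}

def percent_below(z: int) -> float:
    """Percent of observations below an integer z-score, using only the rule."""
    if z not in range(-3, 4):
        raise ValueError("the shortcut only covers integer z-scores from -3 to +3")
    half = WITHIN[abs(z)] / 2  # area between the midline (z=0) and |z|
    # Start from the 50% to the left of z=0, then add or subtract the half-area.
    return 50.0 + half if z >= 0 else 50.0 - half

def percent_between(low: int, high: int) -> float:
    """Percent of observations between two integer z-scores."""
    return percent_below(high) - percent_below(low)

print(percent_between(-2, 1))   # area between -2 and +1
print(percent_between(0, 3))    # area between 0 and +3
print(percent_below(-2))        # area below -2
print(100 - percent_below(3))   # area above +3
```

Each call mirrors one of the worked answers: add or subtract half of the relevant band from the 50% that lies on each side of z=0.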

Examples: The 68-95-99.7% ‘Shortcut’ Rule for Normal Distributions
One more!
What percentage of observations lies below +2 standard deviations?
Answer: Repeat the process from before to determine the area in each tail, beyond -2 and beyond +2. That value was 2.5% per tail. If 2.5% of observations lie above +2, then 97.5% of observations lie below it. Answer: 97.5%.

The 68-95-99.7% ‘Shortcut’ Rule for Normal Distributions
Women’s heights (in inches) follow a Normal distribution with mean µ = 64.5 and standard deviation σ = 2.5, that is, N(µ, σ) = N(64.5, 2.5).
[Figure: Normal density curve of women’s heights, with the mean and an inflection point marked.]
What percentage of women are between 62 and 67 inches tall?
Answer: This corresponds to -1 to +1 SDs, that is, about 68%.
What is the range of heights between which about 95% of women fall?
Answer: About -2 to +2 SDs, so about 59.5 to 69.5 inches tall.
What is the range of heights between which nearly all (over 99%) of women fall?
Answer: A quick answer would be simply to pick the -3 to +3 SD range (57 to 72 inches).

The 68-95-99.7% ‘Shortcut’ Rule for Normal Distributions
More examples, again using women’s heights in inches, N(µ, σ) = N(64.5, 2.5):
What percentage are taller than 67 inches?
Answer: If 68% of all women are between 62 and 67 inches tall, this means that 32% are outside of that range. By symmetry, 16% are shorter than 62 inches and 16% are taller than 67 inches.
What percentage are shorter than 59.5 inches?
Answer: If 95% of all women are between 59.5 and 69.5 inches tall, then 5% are outside of that range. In other words, 2.5% are shorter than 59.5 inches and 2.5% are taller than 69.5 inches.
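The height examples all follow the same two steps: convert a number of standard deviations into inches, then split the leftover percentage evenly between the two tails. A minimal sketch, assuming the N(64.5, 2.5) model from the slides; the names `height_range` and `tail_percent` are invented for this illustration.

```python
MEAN, SD = 64.5, 2.5  # women's heights in inches, N(64.5, 2.5) as on the slides

# The three memorized numbers: percent within 1, 2 and 3 standard deviations.
WITHIN = {1: 68.0, 2: 95.0, 3: 99.7}

def height_range(k: int) -> tuple[float, float]:
    """Heights lying k standard deviations below and above the mean."""
    return MEAN - k * SD, MEAN + k * SD

def tail_percent(k: int) -> float:
    """Percent in ONE tail beyond k standard deviations (symmetry splits the rest)."""
    return (100.0 - WITHIN[k]) / 2

for k in (1, 2, 3):
    low, high = height_range(k)
    print(f"about {WITHIN[k]}% of women are between {low} and {high} inches")

print(f"taller than {height_range(1)[1]} inches: about {tail_percent(1)}%")
print(f"shorter than {height_range(2)[0]} inches: about {tail_percent(2)}%")
```

The printed ranges and tail percentages match the answers worked out above.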

Is the distribution truly Normal?
Deciding whether data do indeed show a Normal (or close to Normal) distribution is a very important question. All the examples involving z-scores that we’ve been discussing assume that the data are Normal. If the data were not Normal, all of our answers and calculations would be flawed.
Recall that there are many other types of distributions that are not Normal. Some examples include skewed, bimodal, Binomial (later in the quarter), Poisson, etc. Each type of distribution has its own characteristic formulas, calculations, inference techniques, etc. Again, because the Normal distribution is one of the most commonly encountered distributions, we will spend lots of time discussing it.
So how do you decide if a distribution is Normal? You might be tempted to say “look at a graph”. And this is not entirely false: when examining data, a chart is a great (if not the BEST) place to start! However, as humans, we are easily fooled. There are many histograms (and related density curves) that look Normal but in fact are not. Fortunately, we do have a statistical tool that can help confirm (though not guarantee) that our dataset does indeed appear to be Normal.

Normal Quantile Plot
The Normal quantile plot is a graph that helps us determine whether a distribution is indeed Normal. It is a mathematical plot that we can create using our statistical software package of choice.
Here is the method (which is provided for interest only): The data points are ranked, and the percentile ranks are converted to z-scores with Table A. The z-scores are then used for the x-axis, against which the data are plotted on the y-axis of the Normal quantile plot.
If the distribution is indeed Normal, the plot will show a straight line, indicating a good match between the data and a Normal distribution. Systematic deviations from a straight line indicate a non-Normal distribution. Outliers appear as points that are far away from the overall pattern of the plot.
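The method described above can be sketched in a few lines. Several things here are assumptions rather than part of the lecture: the function name `normal_quantile_points` is hypothetical, the percentile assigned to each rank uses the common (rank - 0.5) / n convention (statistical packages differ on this), `statistics.NormalDist().inv_cdf` stands in for looking up Table A, and the sample values are made up.

```python
from statistics import NormalDist

def normal_quantile_points(data):
    """Return (z-score, value) pairs for a Normal quantile plot."""
    ordered = sorted(data)            # step 1: rank the data points
    n = len(ordered)
    std_normal = NormalDist()         # standard Normal, used in place of Table A
    points = []
    for rank, value in enumerate(ordered, start=1):
        percentile = (rank - 0.5) / n           # assumed plotting position
        z = std_normal.inv_cdf(percentile)      # step 2: percentile -> z-score
        points.append((z, value))               # step 3: x = z-score, y = data
    return points

# Made-up sample of seven heights in inches, for illustration only.
sample = [61.0, 66.2, 64.5, 63.1, 68.0, 65.4, 62.3]
points = normal_quantile_points(sample)
for z, value in points:
    print(f"z = {z:+.2f}   value = {value}")
```

Plotting these pairs with any charting tool gives the Normal quantile plot; the closer the points fall to a straight line, the better the match to a Normal distribution.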

Good fit to a straight line: the distribution of rainwater pH values is close to Normal.
Curved pattern: the survival-time data are not Normally distributed. Instead, they show a right skew: a few individuals have particularly long survival times.
Normal quantile plots are complex to do by hand, but they are standard features in most statistical software.
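One rough way to put a number on "how straight" a Normal quantile plot is, offered here as an illustration and not as part of the lecture, is the correlation between the z-scores and the sorted data. The sketch below uses simulated stand-ins (a bell-shaped sample and a right-skewed sample), not the rainwater or survival data from the slides; the function name, the seed, the sample sizes and the (rank - 0.5) / n plotting position are all assumptions.

```python
import math
import random
from statistics import NormalDist

def quantile_plot_correlation(data):
    """Pearson correlation between Normal z-scores and the sorted data."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()
    zs = [std_normal.inv_cdf((rank - 0.5) / n) for rank in range(1, n + 1)]
    mean_z = sum(zs) / n
    mean_x = sum(ordered) / n
    sxy = sum((z - mean_z) * (x - mean_x) for z, x in zip(zs, ordered))
    szz = sum((z - mean_z) ** 2 for z in zs)
    sxx = sum((x - mean_x) ** 2 for x in ordered)
    return sxy / math.sqrt(szz * sxx)

rng = random.Random(0)  # fixed seed so the simulation is repeatable
bell_shaped = [rng.gauss(5.4, 0.5) for _ in range(200)]            # simulated, roughly Normal
right_skewed = [rng.expovariate(1 / 100) for _ in range(200)]      # simulated, long right tail

r_bell = quantile_plot_correlation(bell_shaped)
r_skewed = quantile_plot_correlation(right_skewed)
print(f"bell-shaped sample:  r = {r_bell:.3f}")
print(f"right-skewed sample: r = {r_skewed:.3f}")
```

A correlation near 1 corresponds to a nearly straight plot, while the curved pattern of a right-skewed sample pulls the correlation down. As with the plot itself, a high value supports Normality without guaranteeing it.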

The Normal quantile plot supports Normality, but does NOT guarantee it!
Two key points here:
If the plot IS straight, then you have supported the idea that your dataset is Normal. However, you have NOT guaranteed it!
If, on the other hand, the plot is NOT straight, then your data are NOT Normal!
This concept (supportive tests) will come up with certain other statistical tests that we discuss down the road.

Shortcut Rule or Z-Table?
Students have often been confused as to which should be used. Whenever possible, use your z-table, as you will get a much more accurate result. In particular, if you are given z-scores that are not anywhere near whole numbers (e.g. 2.332), then there is no shortcut to use! The shortcut can only be used with whole (integer) z-scores between -3 and +3.
The main purpose of learning the ‘shortcut’ rule (in addition to the fact that it comes up on all kinds of exams) is to encourage you to develop an understanding of what you are trying to do, rather than just jumping to calculators and z-tables. For this course, you will be asked to do both.
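The difference between the two approaches can be seen numerically. In this sketch, Python's `statistics.NormalDist` plays the role of the z-table (an assumption for illustration; in class you would read the printed table), and the variable names are made up.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard Normal, standing in for the z-table

# Whole-number z-scores: the shortcut says 95%, the table is a little more precise.
shortcut_within_2 = 95.0
exact_within_2 = 100 * (std_normal.cdf(2) - std_normal.cdf(-2))
print(f"between -2 and +2: shortcut {shortcut_within_2}%, table {exact_within_2:.2f}%")

# A z-score such as 2.332 has no shortcut at all; only the table (or software) works.
below_2332 = 100 * std_normal.cdf(2.332)
print(f"below z = 2.332: table {below_2332:.2f}%")
```

The shortcut is close for whole-number z-scores, but it has nothing to say about a value like 2.332, which is the case where the z-table is required.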