Normal and Poisson Distributions


Normal and Poisson Distributions GTECH 201 Lecture 14

Sampling
Population: the entire group of objects about which information is sought.
Unit: any individual member of the population.
Sample: a part or subset of the population used to gain information about the whole.
Sampling frame: the list of units from which the sample is chosen.

Simple Random Sampling
A simple random sample (SRS) of size n is a sample of n units chosen in such a way that every collection of n units from the sampling frame has the same chance of being chosen.

Random Sampling in R
In R you can simulate random draws. For example, to pick five numbers at random from the set 1:40:
> sample(1:40, 5)
[1] 4 30 28 40 13

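Sampling with Replacement
The default in R is sampling without replacement. Set replace=TRUE to sample with replacement, and pass prob to give the outcomes unequal probabilities:
> sample(c("H", "T"), 10, replace=TRUE)
[1] "T" "T" "T" "T" "T" "H" "H" "T" "H" "T"
> prob = c(.9, .1)
> sample(c("S", "F"), 10, replace=TRUE, prob=prob)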

Random Number Tables
A table of random digits is a list of the 10 digits 0 through 9 with the following properties:
The digit in any position in the list has the same chance of being any of 0 through 9.
The digits in different positions are independent, in that the value of one has no influence on the value of any other.
Any pair of digits has the same chance of being any of the 100 possible pairs, i.e., 00, 01, 02, …, 98, 99.
Any triple of digits has the same chance of being any of the 1000 possible triples, i.e., 000, 001, 002, …, 998, 999.

Using Random Number Tables
A health inspector must select an SRS of size 5 from 100 containers of ice cream to check for E. coli contamination.
The task is to draw a set of units from the sampling frame.
Assign a number to each individual: label the containers 00, 01, 02, …, 99.
Enter the table and read across any line: 81486 69487 60513 09297
Read off in pairs, this gives 81, 48, 66, 94, 87, 60, 51, 30, 92, 97. The first five distinct labels (81, 48, 66, 94, 87) make up the sample.
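The table-reading procedure can be sketched in R. This is only an illustration and not part of the original slides; the variable names (digits, labels, srs) are made up for the example, and the digit string is the table line quoted on the slide with the spaces removed.

```r
# Digits read across one line of the random number table (spaces removed)
digits <- "81486694876051309297"

# Split the line into consecutive two-digit container labels
starts <- seq(1, nchar(digits) - 1, by = 2)
labels <- substring(digits, starts, starts + 1)

# Keep the first five distinct labels as the simple random sample
srs <- unique(labels)[1:5]

print(labels)
print(srs)
```

Taking unique() before selecting guards against a label appearing twice in the line, which would otherwise put the same container in the sample twice.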

Random Number Generation in R
> rnorm(10)
> rnorm(10, mean=7, sd=5)
> rbinom(10, size=20, prob=.5)
We will revisit the meaning of the parameters at the end of today's session.

Combinatorics 1
Back to drawing five out of 40:
> sample(1:40, 5)
The probability for any given number is 1/40 in the first draw, 1/39 in the second, and so on, so a particular ordered sequence of five numbers has probability
P(x) = 1/(40*39*38*37*36)
> 1/prod(40:36)
[1] 1.266449e-08
But…

Combinatorics 2
We don't care about the order of the five numbers out of 40.
There are 5*4*3*2*1 = 120 possible orderings of the five drawn numbers, so the probability of drawing a given set is
> prod(1:5) / prod(40:36)
[1] 1.519738e-06
Shorthand for the above in R:
> 1/choose(40,5)
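As a quick check of the counting argument, the sketch below (an addition, with illustrative variable names) counts the ordered sequences, divides by the number of orderings of five numbers, and compares the result with choose(40, 5).

```r
# Ordered sequences of 5 distinct numbers drawn from 40
n_ordered <- prod(40:36)

# Orderings of any one set of 5 numbers
n_orderings <- factorial(5)

# Unordered sets of 5 numbers out of 40
n_sets <- n_ordered / n_orderings

print(n_sets)
print(choose(40, 5))
print(all.equal(1 / n_sets, 1 / choose(40, 5)))
```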

Binomial Distribution
Discrete probability distribution.
Events have only 2 possible outcomes: binary, yes-no, presence-absence.
Used for computing the probability of multiple events or trials.
Examples:
Probability that x people are alive at the age of 65.
Probability of a river reaching flood stage for three consecutive years.

When to Apply Binomial
If the sample is less than 10% of a large population in which a proportion p have a characteristic of interest, then the distribution of X, the number in the sample with that characteristic, is approximately Binomial(n, p), where n is the sample size.
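A minimal sketch of the flood-stage example using R's dbinom. The annual flood probability of 0.1 is a hypothetical value chosen for illustration (the slides give no number), and the years are assumed to be independent.

```r
# Hypothetical annual probability of reaching flood stage (assumed, not from the slides)
p_flood <- 0.1

# P(flood in all 3 of 3 independent years) = P(X = 3) for X ~ Binomial(3, p_flood)
p_three <- dbinom(3, size = 3, prob = p_flood)

# Full distribution of the number of flood years out of 3
dist <- dbinom(0:3, size = 3, prob = p_flood)

print(p_three)
print(dist)
```

With three trials and three successes the binomial probability reduces to p_flood^3, which is a convenient sanity check.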

Geometric Distribution
Tossing a biased coin until the first head appears, with pr(H) = p:
$pr(X = x) = pr(TT \ldots TH) = pr(T_1 \cap T_2 \cap \ldots \cap H_x) = (1 - p)^{x-1} p$
The geometric distribution is the distribution of the number of tosses of a biased coin up to and including the first head.
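The geometric formula can be compared with R's built-in dgeom. This is a sketch with a hypothetical p = 0.3. Note that dgeom counts the number of tails before the first head rather than the total number of tosses, so the argument is shifted by one.

```r
p <- 0.3   # hypothetical probability of a head
x <- 1:5   # toss on which the first head appears

# Direct formula: (1 - p)^(x - 1) * p
by_formula <- (1 - p)^(x - 1) * p

# dgeom is parameterized by the number of failures, so pass x - 1
by_dgeom <- dgeom(x - 1, prob = p)

print(cbind(x, by_formula, by_dgeom))
```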

Poisson Distribution
Discrete probability distribution, named in honor of Simeon Poisson (1781-1840).
What is it used for?
To model the frequency with which a specified event occurs over a period of time.
The specified event occurs randomly, independent of past or future occurrences.
Geographers also use this distribution to model how frequently an event occurs across a particular area.
We can also examine a data set of frequency counts in order to determine whether a random distribution exists.

Poisson Distribution is used…
To analyze the number of patients arriving at a hospital emergency room between 6 AM and 7 AM on a particular day (with obvious implications for resource allocation).
To analyze the number of phone calls per day arriving at a telephone switchboard.
To analyze the number of cars using the drive-through window at a fast-food restaurant.
To analyze hailstorm occurrence in one Canadian province.

The Poisson Probability Formula
$P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}$
Lambda (λ) is a positive real number (the mean frequency).
e ≈ 2.718 (mathematical constant).
x = 0, 1, 2, 3, … (frequency of an occurrence).
x! = x factorial.

Example - 1
General Hospital, located in Phoenix, keeps records of emergency room traffic. From these records, we find that the number of patients arriving between 10 AM and 12 Noon has a Poisson distribution with parameter λ = 6.9.
Determine the probability that, on any given day, the number of patients arriving at that emergency room between 10 AM and 12 Noon will be:
Exactly four
At most two

Exactly four arrivals, x=4

At the most, two arrivals…
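Both parts of the emergency-room example can be computed with R's Poisson functions. The sketch below is an addition to the slides: dpois gives a point probability and ppois a cumulative one, with the parameter 6.9 taken from the example.

```r
lambda <- 6.9   # mean number of arrivals between 10 AM and 12 Noon

# Exactly four arrivals: P(X = 4)
p_exactly_4 <- dpois(4, lambda)

# At most two arrivals: P(X <= 2) = P(0) + P(1) + P(2)
p_at_most_2 <- ppois(2, lambda)

print(round(p_exactly_4, 4))
print(round(p_at_most_2, 4))
```

The results are roughly 0.095 for exactly four arrivals and 0.032 for at most two.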

Revisiting Mean and Standard Deviation

What if… We wanted to obtain a table of probabilities for the random variable X, the number of patients arriving between 10AM and 12 Noon?
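One way to build such a table in R is sketched below. Stopping at x = 15 is an arbitrary choice for display; the probability left over beyond that point is printed so the truncation is visible.

```r
lambda <- 6.9
x <- 0:15   # arbitrary cut-off for display

# Table of P(X = x) for each x, rounded to four decimals
prob_table <- data.frame(x = x, probability = round(dpois(x, lambda), 4))
print(prob_table)

# Probability not shown in the table: P(X > 15)
tail_prob <- 1 - ppois(15, lambda)
print(tail_prob)
```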

Discrete versus Continuous Distributions
Moving from individual probabilities to total number of successes or failures.
Probability distribution for discrete events: $f(x) = P(X = x)$
Probability distribution for continuous events: $P(a \le X \le b) = \int_a^b f(x)\,dx$

Expected Values
Population standard deviation: the square root of the average squared distance of X from the mean μ,
$sd(X) = \sqrt{E[(X - \mu)^2]}$

Expected Values
Mean and Poisson distribution: the mean is $E(X) = \sum_{x=0}^{\infty} x \, \frac{e^{-\lambda} \lambda^{x}}{x!}$, and it can be shown that this adds to λ.
Thus, for Poisson-distributed populations, E(X) = λ.
The standard deviation sd(X) for Poisson(λ) is √λ.
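The claim about the Poisson mean and standard deviation can be checked numerically. This sketch sums over x = 0 to 100, an assumed truncation point beyond which the probabilities are negligible for this parameter, and reuses 6.9 from the hospital example.

```r
lambda <- 6.9
x <- 0:100   # truncation point; the tail beyond 100 is negligible here
px <- dpois(x, lambda)

# Mean: probability-weighted average of x
mean_x <- sum(x * px)

# Standard deviation: square root of the average squared distance from the mean
sd_x <- sqrt(sum((x - mean_x)^2 * px))

print(c(mean = mean_x, lambda = lambda))
print(c(sd = sd_x, sqrt_lambda = sqrt(lambda)))
```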

Probability Density Functions
Moving from the discrete to the continuous.
Increasing the frequency of observations results in an ever finer histogram.
Total area under the curve = 1

Probability Density Functions
Population means and standard deviations:
$\mu_X$ balances the distribution.
The standard deviation is calculated as for discrete density functions.
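For a continuous density the sums become integrals. The sketch below uses R's integrate on a normal density with a hypothetical mean of 2 and standard deviation of 3 (values chosen only for illustration) to recover the total area, the mean and the standard deviation numerically.

```r
# Example density: normal with hypothetical mean 2 and standard deviation 3
f <- function(x) dnorm(x, mean = 2, sd = 3)

# Total area under the density
total_area <- integrate(f, -Inf, Inf)$value

# Mean: the balance point of the density
mu <- integrate(function(x) x * f(x), -Inf, Inf)$value

# Standard deviation: root of the average squared distance from the mean
sigma <- sqrt(integrate(function(x) (x - mu)^2 * f(x), -Inf, Inf)$value)

print(c(area = total_area, mean = mu, sd = sigma))
```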

The Normal Distribution

Properties of a Normal Distribution
Continuous probability distribution.
Symmetrical about a central point; no skewness.
The central point corresponds to all three measures of central tendency (mean, median and mode).
Also called a bell curve.
If we accept or assume that our data are normally distributed, then we can compute the probability of different outcomes.

Properties of a Normal Distribution
Using the symmetrical property of the distribution, we can conclude:
50% of values must lie to the right, i.e. they are greater than the mean.
50% of values must lie to the left, i.e. they are less than the mean.
If the data is normally distributed, the probability values are also normally distributed.
The total area under the normal curve represents all (100%) of probable outcomes.
What can you say about data values in a normally distributed data set?

Normal Distribution and Standard Deviations
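The share of a normal distribution lying within one, two and three standard deviations of the mean can be computed from the standard normal distribution function. A short sketch, added here for illustration:

```r
k <- 1:3

# Area between -k and +k standard deviations under the standard normal curve
within_k_sd <- pnorm(k) - pnorm(-k)

print(round(within_k_sd, 4))
```

These are the familiar values of about 68%, 95% and 99.7%.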

Approximating a Normal Distribution
In reality, if a variable's distribution is shaped roughly like a normal curve, then the variable approximates a normal distribution.
A normal distribution is determined by its mean and standard deviation.
These measures are considered the parameters of a normal distribution / normal curve.

Equation of a Normal Curve
$y = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-(x - \mu)^2 / (2\sigma^2)}$
Mean = μ; Standard Deviation = σ
e ≈ 2.718; π ≈ 3.142
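The normal curve with mean μ and standard deviation σ has the standard form y = 1/(σ√(2π)) · exp(−(x − μ)² / (2σ²)). The sketch below writes it out as an R function and compares it with the built-in dnorm. The function name normal_curve and the parameter values (mean 7, standard deviation 5) are illustrative choices, not taken from the slides.

```r
# Normal curve written out from its equation (hypothetical helper)
normal_curve <- function(x, mu = 0, sigma = 1) {
  1 / (sigma * sqrt(2 * pi)) * exp(-(x - mu)^2 / (2 * sigma^2))
}

x <- seq(-10, 25, by = 0.5)
y_formula <- normal_curve(x, mu = 7, sigma = 5)
y_builtin <- dnorm(x, mean = 7, sd = 5)

# The hand-written formula and dnorm should agree
print(all.equal(y_formula, y_builtin))
```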

Areas Within the Normal Curve
For a normally distributed variable, the percentage of all possible observations that lie within any specified range equals the corresponding area under its associated normal curve, expressed as a percentage.
Example: a college has an enrollment of 3264 female students. Mean height is 64.4 inches and the standard deviation is 2.4 inches. Frequency and relative frequency are presented.

Relative Frequency Table
From the frequency and relative frequency table: 0.0735, i.e. 7.35% of the students, are between 67 and 68 inches tall.

Relative Frequency Histogram with Normal Curve
0.0735 = the area that has been cross-hatched.
The shaded area under the normal curve approximates the percentage of students who are between 67 and 68 inches tall.
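The area under the normal curve between 67 and 68 inches can be computed directly and set against the observed relative frequency of 0.0735. This sketch assumes the heights follow a normal distribution with the stated mean of 64.4 and standard deviation of 2.4.

```r
mu <- 64.4     # mean height in inches
sigma <- 2.4   # standard deviation in inches

# Area under the normal curve between 67 and 68 inches
area_67_68 <- pnorm(68, mean = mu, sd = sigma) - pnorm(67, mean = mu, sd = sigma)
print(area_67_68)

# Number of the 3264 students this area corresponds to
print(3264 * area_67_68)
```

The computed area is close to the observed relative frequency, though not identical to it, because the normal curve only approximates the histogram.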

Standardizing a Normal Variable
Once we have the mean and standard deviation of a curve, we know its distribution and the associated normal curve.
Percentages for a normally distributed variable are equal to the areas under the associated normal curve.
There could be hundreds of different normal curves (one for each choice of mean or standard deviation value).
How can we find the areas under a standard normal curve?
A normally distributed variable with a mean of 0 and a standard deviation of 1 is said to have a standard normal distribution.

Z Score
$z = \frac{x - \mu}{\sigma}$
The variable z is called the standardized version of x, or the standardized variable corresponding to x, with mean = 0 and standard deviation = 1.
Almost all observations in a dataset will lie within three standard deviations to either side of the mean, i.e., almost all possible observations will have z scores between -3 and +3.

Normal Curve Properties
The total area under the standard normal curve is equal to 1.
The standard normal curve extends infinitely in both directions, approaching but never touching the horizontal axis.
The standard normal curve is symmetric about 0.
Most of the area under a standard normal curve lies between -3 and +3.

Using the Standard Normal Table
The times taken for runners to complete a local 10 km race are normally distributed with a mean of 61 minutes and a standard deviation of 9 minutes. Let x be the finish time of a randomly selected runner. Find the probability that x > 75 minutes.
Step 1: Calculate the standard score. z = (75 - 61)/9 = 1.56
Step 2: Determine the probability from the normal table. For z of 1.56, p = 0.4406
Step 3: Interpret the result. p(x > 75) = 0.5 - 0.4406 = 0.0594, or a 5.9% chance

Using the Standard Normal Table
In the previous example, what is the probability that someone finishes in less than 45 minutes?
Step 1: Calculate the standard score. z = (45 - 61)/9 = -1.78
Step 2: Determine the probability from the normal table. For z of -1.78, area = 0.4625
Step 3: Interpret the result. p(x < 45) = 1 - (0.5 + 0.4625) = 0.0375, i.e. about 3.8% of the runners finish in less than 45 minutes
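The two table lookups for the race example can be reproduced with pnorm. This is a sketch added for comparison; because the table uses z rounded to two decimals, the values from R differ slightly in the third decimal place.

```r
mu <- 61     # mean finish time in minutes
sigma <- 9   # standard deviation in minutes

# Standard scores for the two finish times
z_75 <- (75 - mu) / sigma
z_45 <- (45 - mu) / sigma

# Upper tail for x > 75, lower tail for x < 45
p_over_75 <- 1 - pnorm(z_75)
p_under_45 <- pnorm(z_45)

print(round(c(z_75 = z_75, z_45 = z_45), 2))
print(c(p_over_75 = p_over_75, p_under_45 = p_under_45))
```

Standardizing by hand is optional in R: pnorm(45, mean = 61, sd = 9) gives the same lower-tail probability directly.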

Three Distributions

Normal Approximations for Discrete Distributions
Approximation of the Binomial:
For large n and small p, the Binomial is approximated by the Poisson.
If p is moderate (not close to 0 or 1), then the Binomial can be approximated by the normal.
Rule of thumb: np(1-p) ≥ 10
Other normal approximations:
If X ~ Poisson(λ), normal works well for λ ≥ 10
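A rough illustration of how well the normal approximation works when the rules of thumb are met. The values n = 100, p = 0.5 and λ = 20 are hypothetical, and the continuity correction of 0.5 is an addition that the slides do not mention.

```r
# Binomial case: n * p * (1 - p) = 25, which satisfies the rule of thumb
n <- 100
p <- 0.5
exact_binom <- pbinom(55, size = n, prob = p)
approx_binom <- pnorm(55.5, mean = n * p, sd = sqrt(n * p * (1 - p)))

# Poisson case: lambda = 20 >= 10, with mean lambda and sd sqrt(lambda)
lambda <- 20
exact_pois <- ppois(25, lambda)
approx_pois <- pnorm(25.5, mean = lambda, sd = sqrt(lambda))

print(c(exact = exact_binom, normal = approx_binom))
print(c(exact = exact_pois, normal = approx_pois))
```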

Built-in Distributions in R
Four fundamental items can be calculated for a statistical distribution:
Density or point probability
Cumulated probability, distribution function
Quantiles
Pseudo-random numbers
In R there are functions for each of these.
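In R the four items correspond to the prefixes d, p, q and r in front of the distribution name. A brief sketch for the normal distribution follows; the seed is set only so that the random draws are repeatable.

```r
# Density at 0 (height of the standard normal curve)
d <- dnorm(0)

# Cumulative probability P(Z <= 1.96)
p <- pnorm(1.96)

# Quantile: the z value with 97.5% of the area to its left
q <- qnorm(0.975)

# Five pseudo-random draws from the standard normal
set.seed(1)
r <- rnorm(5)

print(c(density = d, cumulative = p, quantile = q))
print(r)
```

The same naming pattern applies to other distributions, for example dbinom, pbinom, qbinom, rbinom and dpois, ppois, qpois, rpois.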

Density of a Normal Distribution
> x = seq(-4, 4, 0.1)
> plot(x, dnorm(x), type="l")

For Discrete Distributions…
> x = 0:50
> plot(x, dbinom(x, size=50, prob=.33), type="h")

Cumulative Distribution Functions
Could be graphed, but the graph is not very informative.
Example: blood sugar concentration in the US population has a mean of 132 and a standard deviation of 13. How special is a patient with a value of 160?
> 1 - pnorm(160, mean=132, sd=13)
[1] 0.01562612
or about 1.6%

Random Number Generation in R
> rnorm(10)
> rnorm(10, mean=7, sd=5)
> rbinom(10, size=20, prob=.5)
Now you understand the parameters…