Binomial setting and distributions

Binomial distributions are models for some categorical variables, typically representing the number of successes in a series of n independent trials. The observations must meet these requirements:
- the total number of observations n is fixed in advance
- each observation falls into just one of two categories: success and failure
- the outcomes of all n observations are statistically independent
- all n observations have the same probability p of "success"

Applications for binomial distributions

Binomial distributions describe the possible number of times that a particular event will occur in a sequence of observations.
- In a clinical trial, a patient's condition may improve or not. The binomial distribution describes the number of patients who improved (not how much better they feel) among the study participants.
- Is a child obese or not (based on their body mass index)? The binomial distribution describes the number of obese children in a random sample of school-age children.
- In a quality control study, we assess the number of defective items in a lot of goods, irrespective of the type of defect.

Binomial parameters

We express a binomial distribution for the count X of successes among n observations as a function of the parameters n and p: X ~ B(n, p).
- The parameter n is the total number of observations.
- The parameter p is the probability of success on each observation.
- The count of successes X can be any whole number between 0 and n.

Example: The CDC estimates that a third of adult men are obese. In a random sample of 10 adult men, each man is either obese or not. The variable X is the number of obese men among those 10 men sampled, our count of "successes." For each man, the probability of success, "obese," is 1/3. The number X of obese men among 10 men has the binomial distribution B(n = 10, p = 1/3).
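As a small illustration of the B(n = 10, p = 1/3) example, the following R sketch lists the probability of each possible count from 0 to 10. The variable names (`n`, `p`, `k`, `probs`) are chosen here for illustration and do not come from the slides.

```r
# X ~ B(n = 10, p = 1/3): number of obese men among 10 sampled men
n <- 10
p <- 1 / 3
k <- 0:n                                   # every value the count X can take
probs <- dbinom(k, size = n, prob = p)     # P(X = k) for each k

# Show the distribution, rounded for readability
print(round(data.frame(k = k, prob = probs), 4))

# The probabilities over all possible counts add up to 1
print(sum(probs))
```

Because X can only be a whole number between 0 and n, these eleven probabilities describe the whole distribution.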

Binomial probabilities

The number of ways of arranging k successes in a series of n observations (with constant probability p of success) is the number of possible combinations (unordered sequences). This can be calculated with the binomial coefficient:

C(n, k) = n! / (k!(n − k)!), where k = 0, 1, 2, ..., or n

In R: choose(n, k)

The binomial coefficient "n choose k" uses the factorial notation "!". The factorial n! for any strictly positive whole number n is:

n! = n × (n − 1) × (n − 2) × … × 3 × 2 × 1

By convention, 0! = 1.
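A quick check, sketched in R, that the factorial definition of the binomial coefficient agrees with the built-in `choose()`. The values n = 25 and k = 5 are picked only as an example; the variable names are illustrative.

```r
# Binomial coefficient computed two ways for n = 25, k = 5
n <- 25
k <- 5

# From the factorial definition: n! / (k! (n - k)!)
by_factorials <- factorial(n) / (factorial(k) * factorial(n - k))

# With R's built-in function
by_choose <- choose(n, k)

print(by_choose)                                    # 53130
print(isTRUE(all.equal(by_factorials, by_choose)))  # TRUE up to rounding error
```

For large n the factorials become huge, so `choose()` is the safer route in practice.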

The binomial coefficient counts the number of ways in which k successes can be arranged among n observations. The binomial probability P(X = k) is this count multiplied by the probability of any specific arrangement of the k successes:

P(X = k) = [n! / (k!(n − k)!)] p^k (1 − p)^(n − k)

Table: the possible values X = 0, 1, 2, …, k, …, n, each with its probability P(X); the probabilities total 1.

The probability that a binomial random variable takes any range of values is the sum of the probabilities of getting exactly each number of successes in that range. For example:

P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2)
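The two ideas above (count of arrangements times the probability of one arrangement, then summing over a range) can be sketched in R as follows. `binom_prob` is a hypothetical helper written for this illustration, and n = 10, p = 1/3 are assumed example values; `dbinom()` and `pbinom()` are R's built-in equivalents.

```r
# Hypothetical helper: binomial probability straight from the formula
binom_prob <- function(k, n, p) {
  choose(n, k) * p^k * (1 - p)^(n - k)
}

n <- 10
p <- 1 / 3

# P(X = 0), P(X = 1), P(X = 2) from the formula
by_formula <- binom_prob(0:2, n, p)
print(isTRUE(all.equal(by_formula, dbinom(0:2, size = n, prob = p))))

# P(X <= 2) is the sum of the three exact probabilities
p_le_2 <- sum(by_formula)
print(isTRUE(all.equal(p_le_2, pbinom(2, size = n, prob = p))))
```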

The frequency of color blindness (dyschromatopsia) in the Caucasian American male population is estimated to be about 8%. In a group of 25 Caucasian American males, what is the probability that exactly five are color blind?

P(X = 5) = [n! / (k!(n − k)!)] p^k (1 − p)^(n − k)
= [25! / (5! 20!)] × 0.08^5 × 0.92^20
= [(21 × 22 × 23 × 24 × 25) / (1 × 2 × 3 × 4 × 5)] × 0.08^5 × 0.92^20
= 53,130 × 0.0000033 × 0.1887 ≈ 0.0329

Use technology:
> dbinom(5, 25, .08)
which returns approximately 0.0329.

The incidence of major depression in adults is about 10%. A random sample of 50 adults will be tested for depression. The variable X is the number of individuals diagnosed with depression among all 50 and has the binomial distribution Bin(n = 50, p = 0.1).

The probability that exactly 2 adults in the sample have depression is ?

A) B) C) D) E) 0.112
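One way to check an answer to this question is to evaluate the binomial probability directly in R. This is only a sketch; the variable name `p_exactly_2` is illustrative.

```r
# X ~ Bin(n = 50, p = 0.1): probability that exactly 2 adults have depression
p_exactly_2 <- dbinom(2, size = 50, prob = 0.1)

# Rounded to three decimals for comparison with the answer choices
print(round(p_exactly_2, 3))   # about 0.078
```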

Binomial mean and variance

The center and spread of the binomial distribution for a count X are defined by the mean μ and standard deviation σ:

μ = np
σ = √(np(1 − p)), so the variance is σ² = np(1 − p)

Example: The incidence of major depression in adults is about 10%. A random sample of 50 adults will be tested for depression. The variable X is the number of individuals diagnosed with depression among all 50 and has the binomial distribution Bin(n = 50, p = 0.1). Thus, μ = 50 × 0.1 = 5 and σ = √(50 × 0.1 × 0.9) = √4.5 ≈ 2.12.
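The mean and standard deviation for Bin(n = 50, p = 0.1) can be computed from the usual closed forms np and √(np(1 − p)), and cross-checked against the definition of a mean and variance applied to the full list of probabilities. The R sketch below does both; all variable names are illustrative.

```r
n <- 50
p <- 0.1

# Closed-form mean and standard deviation of a binomial count
mu <- n * p
sigma <- sqrt(n * p * (1 - p))

# Cross-check from the probabilities themselves
k <- 0:n
probs <- dbinom(k, size = n, prob = p)
mu_check <- sum(k * probs)                          # sum of k * P(X = k)
sigma_check <- sqrt(sum((k - mu_check)^2 * probs))  # root of the variance

print(c(mu = mu, sigma = round(sigma, 3)))
print(c(mu_check = mu_check, sigma_check = round(sigma_check, 3)))
```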

Effect of changing p when n is fixed

Binomial distributions are skewed when p is close to 0 or close to 1 (especially if the sample is small).
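To see the skew numerically, the R sketch below prints the probabilities for a small sample (n = 10, an assumed value) at p = 0.1, 0.5, and 0.9. The distribution at p = 0.5 reads the same forwards and backwards, while the ones at 0.1 and 0.9 pile up at opposite ends and mirror each other.

```r
n <- 10
k <- 0:n

# Print P(X = k) for k = 0..10 at three values of p
for (p in c(0.1, 0.5, 0.9)) {
  probs <- dbinom(k, size = n, prob = p)
  cat("p =", p, ":", round(probs, 3), "\n")
}

# p = 0.5 gives a symmetric distribution
sym <- dbinom(k, size = n, prob = 0.5)
is_symmetric <- isTRUE(all.equal(sym, rev(sym)))

# p = 0.1 and p = 0.9 are mirror images of each other
is_mirror <- isTRUE(all.equal(dbinom(k, n, 0.1), rev(dbinom(k, n, 0.9))))

print(c(is_symmetric = is_symmetric, is_mirror = is_mirror))
```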

Effect of changing n for a fixed value of p

Normal approximation to binomial

A binomial distribution can be approximated by a Normal distribution when both np ≥ 10 and n(1 − p) ≥ 10. The approximation can be improved by using a continuity correction, which accounts for the fact that the binomial count is discrete while the Normal distribution is continuous.

Hint: P(X = x) ≈ P(x − 0.5 ≤ Y ≤ x + 0.5), where Y follows the approximating Normal distribution.

The incidence of major depression in adults is about 10%.
- Count of adults diagnosed with depression in a sample of 20 adults, Bin(n = 20, p = 0.1): no Normal approximation. Why? Here np = 2, which is below 10.
- Count of adults diagnosed with depression in a sample of 100 adults, Bin(n = 100, p = 0.1): Normal approximation OK. Why? Here np = 10 and n(1 − p) = 90, so both conditions are met.

[Figure panels: Binomial, n=20, p=0.1; Binomial, n=100, p=0.1]
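The rule of thumb (both np ≥ 10 and n(1 − p) ≥ 10) is easy to encode. `normal_ok` below is a hypothetical helper, not a standard R function; the tiny tolerance is an added safeguard against floating-point rounding when a product lands exactly on 10.

```r
# Hypothetical helper: does the Normal approximation rule of thumb hold?
normal_ok <- function(n, p, threshold = 10) {
  tol <- 1e-9  # guards against rounding when n * p is exactly at the threshold
  (n * p >= threshold - tol) && (n * (1 - p) >= threshold - tol)
}

small_sample <- normal_ok(20, 0.1)    # np = 2, so the rule fails
large_sample <- normal_ok(100, 0.1)   # np = 10 and n(1 - p) = 90, so it holds

print(c(n20 = small_sample, n100 = large_sample))
```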

The frequency of color blindness (dyschromatopsia) in the Caucasian American male population is about 8%. We take a random sample of size 125 from this population. What is the probability that 6 individuals or fewer in the sample are color blind?

- Distribution of the count X: B(n = 125, p = 0.08), so np = 10.
  P(X ≤ 6) = pbinom(6, size=125, prob=.08) in R = 0.1198, or about 12%
- Normal approximation: N(μ = np = 10, σ = √(np(1 − p)) = 3.033)
  P(X ≤ 6) = pnorm(6, mean=10, sd=3.033) = 0.0936, or about 9%
  Or z = (x − µ)/σ = (6 − 10)/3.033 = −1.32, so P(X ≤ 6) = 0.0934 from Table B

The Normal approximation is reasonable, but not very close to 12%. Here p = .08 is not close to 0.5, but np = 10 just meets the criterion. Using a continuity correction (6.5 in place of 6) greatly improves the approximation:

P(X ≤ 6) ≈ pnorm(6.5, mean=10, sd=3.033) = 0.1243
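The three calculations in this example can be run side by side. This R sketch uses the unrounded standard deviation rather than 3.033, so the last digits may differ slightly from the slide's values; variable names are illustrative.

```r
n <- 125
p <- 0.08
mu <- n * p                        # 10
sigma <- sqrt(n * p * (1 - p))     # about 3.033

exact        <- pbinom(6, size = n, prob = p)        # exact binomial P(X <= 6)
approx_plain <- pnorm(6, mean = mu, sd = sigma)      # Normal, no correction
approx_cc    <- pnorm(6.5, mean = mu, sd = sigma)    # Normal with continuity correction

print(round(c(exact = exact, normal = approx_plain, corrected = approx_cc), 4))
```

Note that the argument for the Normal standard deviation in `pnorm()` is `sd`.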

Distributions for the color blindness example, for n = 50, n = 125, and n = 1000. The larger the sample size, the better the Normal approximation fits the binomial distribution.

The Poisson distributions

A Poisson distribution describes the count X of occurrences of an event in fixed, finite intervals of time or space when
- occurrences are all independent, and
- the probability of an occurrence is the same over all possible intervals.

Think of the Poisson distribution as describing the number of items in containers.

Items                    Containers
Radioactive decays       Second
Weeds                    Acre of farm land
Fleas                    Dog
Cardiovascular deaths    County / year

If we divide a natural lawn into 1 ft² quadrants, we can count how many dandelions are in each quadrant. Dandelion seeds are wind-spread. The probabilities of a quadrant containing 0, 1, 2, 3, … dandelions are given by a Poisson distribution, assuming:
(i) independence of dandelions: the presence of one dandelion in a quadrant does not make the presence of another more or less likely.
(ii) homogeneity of quadrants: each quadrant is equally likely to contain dandelions.

Poisson probabilities

If μ is the population mean number of occurrences for a specified interval of time or space, then the Poisson probability of observing k occurrences (k = 0, 1, 2, …) at constant μ (> 0) is:

P(X = k) = e^(−μ) μ^k / k!

The Poisson distribution has mean μ and standard deviation σ:

σ = √μ
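The Poisson formula P(X = k) = e^(−μ) μ^k / k! can be checked against R's built-in `dpois()`. `pois_prob` is a hypothetical helper written for this sketch, and μ = 4 is an assumed example value.

```r
# Hypothetical helper: Poisson probability straight from the formula
pois_prob <- function(k, mu) {
  exp(-mu) * mu^k / factorial(k)
}

mu <- 4
k <- 0:10

by_formula <- pois_prob(k, mu)
built_in <- dpois(k, lambda = mu)
print(isTRUE(all.equal(by_formula, built_in)))

# Mean and standard deviation of the Poisson distribution
print(c(mean = mu, sd = sqrt(mu)))
```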

Effect of changing μ: The Poisson distribution is right-skewed when μ is small (below about 5).

The number of deer crossing a road at night during mating season in a particular rural area can be modeled with a Poisson distribution. A local survey conducted over 4 nights found a total of 20 deer crossings. Based on this information, what is the probability that fewer than three deer would cross on a given night during mating season in this area?

To compute this probability using the Poisson distribution, we need to know μ. In this case μ = 20 / 4 = 5 deer crossings per night. "Fewer than three" means X ≤ 2:

> ppois(2, lambda=5)
which returns approximately 0.125.
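The deer calculation can be sketched in R from the survey numbers. The variable names are illustrative; the key step is that "fewer than three" translates to the cumulative probability P(X ≤ 2).

```r
# Estimate the mean number of crossings per night from the survey
total_crossings <- 20
nights <- 4
mu <- total_crossings / nights           # 5 crossings per night

# P(X < 3) = P(X <= 2) for a Poisson count with mean mu
p_fewer_than_3 <- ppois(2, lambda = mu)

# Same value obtained by summing P(X = 0), P(X = 1), P(X = 2)
p_by_sum <- sum(dpois(0:2, lambda = mu))

print(round(c(ppois = p_fewer_than_3, by_sum = p_by_sum), 4))
```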

Historical records over 20 years in a particular town indicate an average of 4 severe rainstorms per year. Modeling the occurrences of severe rainstorms with the Poisson distribution (μ = 4):
- Probability of no severe rainstorm next year: P(X = 0) = 4^0 e^(−4) / 0! = 0.0183
- Probability of 5 severe rainstorms next year: P(X = 5) = 4^5 e^(−4) / 5! = 0.1563
- Probability of 1 or more severe rainstorms next year: P(X ≥ 1) = 1 − P(X = 0) = 1 − 0.0183 = 0.9817
- Probability of more than 5 severe rainstorms next year: P(X > 5) = 1 − P(X ≤ 5) = 1 − 0.7851 = 0.2149

Table: P(X = x) and P(X ≤ x) for each x. First rows: x = 0: 1.832%, 1.832%; x = 1: 7.326%, 9.158%.
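A table of P(X = x) and P(X ≤ x) for the rainstorm example can be regenerated in R, along with the four probabilities discussed. This is a sketch: the range x = 0 to 14 and all variable names are assumptions made for illustration.

```r
mu <- 4          # average number of severe rainstorms per year
x <- 0:14        # assumed range of counts to tabulate

# Exact and cumulative Poisson probabilities for each count
tab <- data.frame(
  x    = x,
  p_eq = dpois(x, lambda = mu),   # P(X = x)
  p_le = ppois(x, lambda = mu)    # P(X <= x)
)
print(round(tab, 5))

p_none        <- dpois(0, lambda = mu)                      # P(X = 0)
p_five        <- dpois(5, lambda = mu)                      # P(X = 5)
p_at_least_1  <- 1 - p_none                                 # P(X >= 1)
p_more_than_5 <- ppois(5, lambda = mu, lower.tail = FALSE)  # P(X > 5)

print(round(c(none = p_none, five = p_five,
              at_least_1 = p_at_least_1, more_than_5 = p_more_than_5), 4))
```

Setting `lower.tail = FALSE` in `ppois()` returns P(X > 5) directly, which is equivalent to 1 − P(X ≤ 5).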