Stat 111 - Lecture 9 - Proportions (June 10, 2008)
Introduction to Inference: Sampling Distributions for Counts and Proportions

Presentation transcript:


Slide 2: Administrative Notes
Homework 3 is due on Monday, June 15th
– Covers chapters 1-5 in textbook
Exam on Monday, June 15th
Review session on Thursday

Slide 3: Last Class
We focused on models for continuous data, using the sample mean $\bar{x}$ as our estimate of the population mean $\mu$.
Sampling distribution of the sample mean: how does the sample mean change over different samples?
[Figure: a population with parameter $\mu$; samples 1 through 6, each of size n, each give a sample mean $\bar{x}$. What is the distribution of these values?]

Slide 4: Today's Class
We will now focus on count data: categorical data that takes on only two different values, "Success" ($Y_i = 1$) or "Failure" ($Y_i = 0$).
The goal is to estimate the population proportion: p = proportion of $Y_i = 1$ in the population.

Slide 5: Examples
Gender: our class has 83 women and 42 men. What is the proportion of women in the Penn student population?
Presidential election: out of 2000 people sampled, 1150 will vote for McCain in the upcoming election. What proportion of the total population will vote for McCain?
Quality control: inspection of a sample of 100 microchips from a large shipment shows 10 failures. What is the proportion of failures in all shipments?

Slide 6: Inference for Count Data
The goal for count data is to estimate the population proportion p.
From a sample of size n, we can calculate two statistics:
1. the sample count Y
2. the sample proportion $\hat{p} = Y/n$
We use the sample proportion $\hat{p}$ as our estimate of the population proportion p.
Sampling distribution of the sample proportion: how does the sample proportion change over different samples?
[Figure: a population with parameter p; samples 1 through 6, each of size n, each give a sample proportion $\hat{p}$. What is the distribution of these values?]
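As a small illustration of the two statistics, the sketch below computes the sample proportion for the three examples from the earlier slide. The dictionary layout and the function name `sample_proportion` are illustrative choices, not part of the lecture.

```python
# Sample counts Y and sample sizes n taken from the lecture's three examples.
examples = {
    "women in class": (83, 83 + 42),
    "McCain voters in poll": (1150, 2000),
    "faulty microchips": (10, 100),
}


def sample_proportion(count, n):
    """Return the sample proportion p-hat = Y / n."""
    return count / n


for name, (count, n) in examples.items():
    p_hat = sample_proportion(count, n)
    print(f"{name}: Y = {count}, n = {n}, p-hat = {p_hat:.3f}")
```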

Slide 7: The Binomial Setting for Count Data
1. Fixed number n of observations (or trials)
2. Each observation is independent
3. Each observation falls into 1 of 2 categories: Success (Y = 1) or Failure (Y = 0)
4. Each observation has the same probability of success: p = P(Y = 1)

Slide 8: Binomial Distribution for Sample Count
The sample count Y (the number of $Y_i = 1$ in a sample of size n) has a Binomial distribution.
The binomial distribution has two parameters: the number of trials n and the population proportion p.
$P(Y = k) = \binom{n}{k} p^k (1-p)^{n-k}$
The binomial formula accounts for:
the number of successes: $p^k$
the number of failures: $(1-p)^{n-k}$
the different orders of successes/failures: $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$
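The binomial formula can be sketched in a few lines of Python. The function name `binomial_pmf` is an illustrative choice; `math.comb` supplies the number of orderings n!/(k!(n-k)!).

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(Y = k) for a Binomial(n, p) count: orderings * successes * failures."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Probabilities for every possible count when n = 4 and p = 0.25.
probs = [binomial_pmf(k, 4, 0.25) for k in range(5)]
for k, prob in enumerate(probs):
    print(f"P(Y = {k}) = {prob:.4f}")
```

Because the counts 0 through n cover every possible outcome, the probabilities in `probs` sum to 1.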

Slide 9: Binomial Probability Histogram
We can make a histogram out of these probabilities.
We can add up bars of the histogram to get any probability we want, e.g. P(Y < 4).
Different values of n and p have different histograms, but Table C in the book has probabilities for many values of n and p.

Slide 10: Binomial Table

Slide 11: Example: Genetics
If a couple are both carriers of a certain disease, then their children each have probability 0.25 of being born with the disease.
Suppose that the couple has 4 children.
P(none of their children have the disease)?
$P(Y = 0) = \frac{4!}{0!\,4!} \times 0.25^0 \times (1 - 0.25)^4 = 0.75^4 \approx 0.316$
P(at least two children have the disease)?
$P(Y \ge 2) = P(Y = 2) + P(Y = 3) + P(Y = 4) = 0.2109 + 0.0469 + 0.0039 = 0.2617$ (from table)
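The genetics example can be checked without the table. This is a minimal sketch that assumes the binomial setting holds for the four children; the helper `binomial_pmf` is an illustrative name.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(Y = k) for a Binomial(n, p) count."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 4, 0.25  # four children, each with probability 0.25 of the disease

p_none = binomial_pmf(0, n, p)
p_at_least_two = sum(binomial_pmf(k, n, p) for k in range(2, n + 1))
# The same probability via the complement: 1 - P(Y = 0) - P(Y = 1).
p_complement = 1 - binomial_pmf(0, n, p) - binomial_pmf(1, n, p)

print(f"P(Y = 0)  = {p_none:.4f}")
print(f"P(Y >= 2) = {p_at_least_two:.4f}")
```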

Slide 12: Example: Quality Control
A worker inspects a sample of n = 20 microchips from a large shipment.
The probability of a microchip being faulty is 10% (p = 0.10).
What is the probability that there are fewer than three failures in the sample?
P(Y < 3) = P(Y = 0) + P(Y = 1) + P(Y = 2) = 0.677 (from table)
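Adding up histogram bars is a short loop in code. The sketch below reproduces the table lookup for this example; `prob_count_below` is an illustrative helper name.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(Y = k) for a Binomial(n, p) count."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_count_below(limit, n, p):
    """P(Y < limit): add up the bars for counts 0, 1, ..., limit - 1."""
    return sum(binomial_pmf(k, n, p) for k in range(limit))


p_fewer_than_three = prob_count_below(3, n=20, p=0.10)
print(f"P(Y < 3) = {p_fewer_than_three:.3f}")
```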

Slide 13: Sample Proportions
Usually, we are more interested in the sample proportion $\hat{p} = Y/n$ than in the sample count.
$P(\hat{p} < k) = P(Y < n \cdot k)$
Example: a worker inspects a sample of 20 microchips from a large shipment, where the probability of a microchip being faulty is 0.1.
What is the probability that our sample proportion of faulty chips is less than 0.05?
$P(\hat{p} < 0.05) = P(Y < 1) = P(Y = 0) = 0.9^{20} \approx 0.122$
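A probability about the sample proportion can be computed by translating it back to counts. In this sketch the helper `prob_proportion_below` (an illustrative name) adds up the binomial probabilities of every count k whose proportion k/n falls below the threshold.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(Y = k) for a Binomial(n, p) count."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_proportion_below(threshold, n, p):
    """P(p-hat < threshold), summing over counts k with k / n < threshold."""
    return sum(binomial_pmf(k, n, p) for k in range(n + 1) if k / n < threshold)


# Only k = 0 gives a proportion below 0.05 when n = 20.
result = prob_proportion_below(0.05, n=20, p=0.1)
print(f"P(p-hat < 0.05) = {result:.3f}")
```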

Slide 14: Mean and Variance of Binomial Counts
If our sample count Y is a random variable with a Binomial distribution, what are the mean and variance of Y across all samples?
This is useful since we only observe the value of Y for our sample, but what are the values in other samples?
We can calculate the mean and variance of a Binomial distribution with parameters n and p:
$\mu_Y = np$
$\sigma_Y^2 = np(1-p)$
$\sigma_Y = \sqrt{np(1-p)}$
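The formulas can be checked numerically against the definition of mean and variance. This sketch weights each possible count by its binomial probability; the values n = 20 and p = 0.1 are borrowed from the quality control example, and the function names are illustrative.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(Y = k) for a Binomial(n, p) count."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def mean_and_variance_from_pmf(n, p):
    """Mean and variance of Y computed directly from the probabilities."""
    mean = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
    variance = sum((k - mean) ** 2 * binomial_pmf(k, n, p) for k in range(n + 1))
    return mean, variance


n, p = 20, 0.1
mean, variance = mean_and_variance_from_pmf(n, p)
print(f"from probabilities: mean = {mean:.4f}, variance = {variance:.4f}")
print(f"from formulas:      mean = {n * p:.4f}, variance = {n * p * (1 - p):.4f}")
```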

Slide 15: Mean/Variance of Binomial Proportions
The sample proportion is a linear transformation of the sample count ($\hat{p} = Y/n$).
$\mu_{\hat{p}} = \frac{1}{n}\mu_Y = \frac{1}{n} \cdot np = p$
The mean of the sample proportion is the true probability of success p.
$\sigma_{\hat{p}}^2 = \frac{1}{n^2}\mathrm{Var}(Y) = \frac{1}{n^2} \cdot np(1-p) = \frac{p(1-p)}{n}$
The variance of the sample proportion decreases as the sample size n increases!
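To see how quickly the variance shrinks, the sketch below evaluates p(1-p)/n for a few sample sizes. The choice p = 0.5 and the sample sizes are arbitrary illustration values, not from the lecture.

```python
from math import sqrt


def proportion_variance(p, n):
    """Variance of the sample proportion: p(1 - p) / n."""
    return p * (1 - p) / n


p = 0.5  # illustrative value
for n in (10, 100, 1000, 10_000):
    var = proportion_variance(p, n)
    print(f"n = {n:>6}: variance = {var:.6f}, standard deviation = {sqrt(var):.4f}")
```

Multiplying the sample size by 100 divides the variance by 100 but the standard deviation only by 10.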

Slide 16: Variance over Long-Run
Lower variance with a larger sample size means that the sample proportion will tend to be closer to the population proportion in larger samples.
[Figure: long-run behaviour of two different coin tossing runs.]
Unexpected events are much less likely in larger samples.
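The long-run behaviour of coin tossing runs can be imitated with a small simulation. This is a sketch under assumptions: a fair coin (p = 0.5), 10,000 tosses per run, and arbitrary seeds; none of these settings come from the slide.

```python
import random


def running_proportions(n_tosses, p=0.5, seed=0):
    """Return the proportion of heads after each toss of a simulated coin."""
    rng = random.Random(seed)
    heads = 0
    proportions = []
    for toss in range(1, n_tosses + 1):
        heads += rng.random() < p  # True counts as 1, False as 0
        proportions.append(heads / toss)
    return proportions


# Two independent runs, as in the slide's figure.
for seed in (1, 2):
    run = running_proportions(10_000, seed=seed)
    print(f"run {seed}: after 10 tosses {run[9]:.3f}, "
          f"after 100 tosses {run[99]:.3f}, after 10000 tosses {run[-1]:.3f}")
```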

Slide 17: Binomial Probabilities in Large Samples
In large samples, it is often tedious to calculate probabilities using the binomial distribution.
Example: Gallup poll for presidential election. Bush has 49% of the vote in the population. What is the probability that Bush gets a count over 550 in a sample of 1000 people?
P(Y > 550) = P(Y = 551) + P(Y = 552) + … + P(Y = 1000) = 450 terms to look up in the table!
We can instead use the fact that for large samples, the Binomial distribution is closely approximated by the Normal distribution.
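The 450-term sum is tedious by hand but quick by computer. The sketch below is one way to do it, working with logarithms (via `math.lgamma`) so the very large binomial coefficients and very small powers stay within floating-point range; the function names are illustrative.

```python
from math import exp, lgamma, log


def log_binomial_pmf(k, n, p):
    """log P(Y = k) for a Binomial(n, p) count, computed on the log scale."""
    log_orderings = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_orderings + k * log(p) + (n - k) * log(1 - p)


def prob_count_above(threshold, n, p):
    """P(Y > threshold): sum the terms for threshold + 1, ..., n."""
    return sum(exp(log_binomial_pmf(k, n, p)) for k in range(threshold + 1, n + 1))


n_terms = len(range(551, 1001))
tail = prob_count_above(550, n=1000, p=0.49)
print(f"{n_terms} terms, P(Y > 550) = {tail:.2e}")
```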


Slide 19: Normal Approximation to Binomial
If the count Y follows a binomial distribution with parameters n and p, then Y approximately follows a Normal distribution with mean and variance:
$\mu_Y = np, \quad \sigma_Y^2 = np(1-p)$
This approximation is only good if n is "large enough".
Rule of thumb for "large enough": $np \ge 10$ and $n(1-p) \ge 10$
This also works for the sample proportion: $\hat{p} = Y/n$ approximately follows a Normal distribution with mean and variance:
$\mu_{\hat{p}} = p, \quad \sigma_{\hat{p}}^2 = \frac{p(1-p)}{n}$
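The rule of thumb and the approximating parameters are easy to package as helpers. This is a minimal sketch; the function names are illustrative, and the Normal CDF is built from `math.erf` rather than read from a table.

```python
from math import erf, sqrt


def large_enough(n, p):
    """Rule of thumb for the Normal approximation: np >= 10 and n(1-p) >= 10."""
    return n * p >= 10 and n * (1 - p) >= 10


def count_normal_params(n, p):
    """Mean and standard deviation of the Normal approximating the count Y."""
    return n * p, sqrt(n * p * (1 - p))


def proportion_normal_params(n, p):
    """Mean and standard deviation of the Normal approximating p-hat = Y/n."""
    return p, sqrt(p * (1 - p) / n)


def normal_cdf(x, mean=0.0, sd=1.0):
    """P(X <= x) for a Normal(mean, sd) variable."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))


print(large_enough(20, 0.1))     # np = 2, so the rule of thumb is not met
print(large_enough(1000, 0.49))  # np = 490 and n(1-p) = 510
```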

Slide 20: Example: Quality Control
We take a sample of 100 microchips, where as usual 10% of microchips are faulty. What is the probability that there are at least 17 bad chips in our sample?
Using the Binomial calculation/table is tedious. Instead use the Normal approximation:
Mean $= np = 100 \times 0.10 = 10$
Var $= np(1-p) = 100 \times 0.10 \times 0.90 = 9$, so the standard deviation is 3
$P(Y \ge 17) = P\left(Z \ge \frac{17 - 10}{3}\right) = P(Z \ge 2.33) = 1 - P(Z \le 2.33) = 0.01$ (from table)
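The Normal calculation can be reproduced in code and set beside the exact binomial sum. This sketch follows the slide's approach of standardizing 17 directly, with no continuity correction, so the two numbers it prints are not expected to agree exactly; the helper names are illustrative.

```python
from math import comb, erf, sqrt


def normal_cdf(z):
    """P(Z <= z) for a standard Normal variable."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def binomial_pmf(k, n, p):
    """P(Y = k) for a Binomial(n, p) count."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 100, 0.10
mean = n * p                 # 10
sd = sqrt(n * p * (1 - p))   # 3

z = (17 - mean) / sd
normal_tail = 1 - normal_cdf(z)  # Normal approximation to P(Y >= 17)
exact_tail = sum(binomial_pmf(k, n, p) for k in range(17, n + 1))

print(f"z = {z:.2f}")
print(f"Normal approximation: {normal_tail:.4f}")
print(f"Exact binomial sum:   {exact_tail:.4f}")
```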

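Slide 21: Example: Gallup Poll
Bush has 49% of the vote in the population. What is the probability that Bush gets a sample proportion over 0.51 in a sample of size 1000?
Use the Normal distribution with mean $= p = 0.49$ and variance $= p(1-p)/n = 0.49 \times 0.51 / 1000 \approx 0.00025$, so the standard deviation is about 0.0158.
$P(\hat{p} > 0.51) = P\left(Z \ge \frac{0.51 - 0.49}{0.0158}\right) = P(Z \ge 1.27) = 1 - P(Z \le 1.27) = 0.102$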
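The same steps for the Gallup example, as a sketch. The code keeps the unrounded z-score, so its result can differ slightly in the third decimal from a table lookup at z = 1.27.

```python
from math import erf, sqrt


def normal_cdf(z):
    """P(Z <= z) for a standard Normal variable."""
    return 0.5 * (1 + erf(z / sqrt(2)))


n, p = 1000, 0.49
mean = p
sd = sqrt(p * (1 - p) / n)  # standard deviation of the sample proportion

z = (0.51 - mean) / sd
tail = 1 - normal_cdf(z)  # Normal approximation to P(p-hat > 0.51)

print(f"sd = {sd:.4f}, z = {z:.2f}, P(p-hat > 0.51) is about {tail:.3f}")
```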

Slide 22: Why does Normal Approximation work?
Central Limit Theorem: in large samples, the distribution of the sample mean is approximately Normal.
Well, our count data takes on two different values: "Success" ($Y_i = 1$) or "Failure" ($Y_i = 0$).
The sample proportion is the same as the sample mean for count data!
So, the Central Limit Theorem works for sample proportions as well!
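The identity between the sample proportion and the sample mean of 0/1 data can be seen on a simulated sample. The sample size, the value p = 0.49, and the seed below are illustrative assumptions.

```python
import random
from statistics import mean

rng = random.Random(0)
p = 0.49  # illustrative population proportion

# Each observation is 1 ("Success") with probability p, otherwise 0 ("Failure").
sample = [1 if rng.random() < p else 0 for _ in range(1000)]

sample_count = sum(sample)                        # Y
sample_proportion = sample_count / len(sample)    # p-hat = Y / n
sample_mean = mean(sample)                        # average of the 0/1 values

print(f"count = {sample_count}, proportion = {sample_proportion}, mean = {sample_mean}")
```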

Slide 23: Next Class - Lecture 10
Review session on Wednesday/Thursday
– Show up with questions!