Sample Means & Proportions


Week 7: Sample Means & Proportions

Variability of Summary Statistics Variability in shape of distn of sample Variability in summary statistics Mean, median, st devn, upper quartile, … Summary statistics have distributions

Parameters and statistics Parameter describes underlying population: Constant; Greek letter (e.g. μ, σ, π, …); Unknown value in practice. Summary statistic: Random; Roman letter (e.g. m, s, p, …). We hope statistic will tell us about corresponding parameter

Distn of sample vs Sampling distn of statistic Values in a single random sample have a distribution Single sample --> single value for statistic Sample-to-sample variability of statistic is its sampling distribution.

Means Unknown population mean, μ. Sample mean, X̄, has a distribution, its sampling distribution. Usually x̄ ≠ μ. A single sample mean, x̄, gives us information about μ

Sampling distribution of mean If sample size, n, increases: Spread of distn of sample is (approx) same. Spread of sampling distn of mean gets smaller. x̄ is likely to be closer to μ. x̄ becomes a better estimate of μ

Sampling distribution of mean Population with mean μ, st devn σ. Random sample (n independent values). Sample mean, X̄, has sampling distn with: Mean = μ, St devn = σ/√n. (We will deal later with the problem that μ and σ are unknown in practice.)

Weight loss Estimate mean weight loss for those attending clinic for 10 weeks. Random sample of n = 25 people. Sample mean, x̄. How accurate? Let’s see, if the population distn of weight loss is: [Figure: population distn of weight loss]

Some samples Four random samples of n = 25 people: Mean = 8.32 pounds, st devn = 4.74 pounds; Mean = 8.48 pounds, st devn = 5.27 pounds; Mean = 7.16 pounds, st devn = 5.93 pounds. N.B. In all samples, x̄ ≠ μ

Sampling distribution Means from simulation of 400 samples. Theory: mean = μ = 8 lb, s.d.(x̄) = σ/√n = 5/√25 = 1 lb. (How does this compare to simulation? To popn distn?)

Errors in estimation Sampling distribution of mean: mean = μ = 8 lb, s.d.(x̄) = 1 lb. From 70-95-100 rule, x̄ will almost certainly be within 8 ± 3 lb, so x̄ is unlikely to be more than 3 lb in error. This holds even if we didn’t know μ

Increasing sample size, n If we sample n = 100 people instead of 25: s.d.(x̄) = 5/√100 = 0.5 lb. Larger samples → more accurate estimates
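
The effect of n on the spread of the sample mean can be checked by simulation. The sketch below is illustrative only: the transcript does not show the shape of the population, so a normal population with mean 8 lb and st devn 5 lb is assumed (5 lb is inferred from the "8 ± 3 lb" statement), and the function name `simulate_sample_means` is hypothetical.

```python
import random
import statistics

def simulate_sample_means(mu, sigma, n, reps, seed=0):
    """Draw `reps` random samples of size n and return their sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(reps)
    ]

# ASSUMPTION: population is normal with mean 8 lb and st devn 5 lb.
means_25 = simulate_sample_means(mu=8, sigma=5, n=25, reps=400)
means_100 = simulate_sample_means(mu=8, sigma=5, n=100, reps=400)

sd_25 = statistics.stdev(means_25)    # theory: 5 / sqrt(25) = 1.0 lb
sd_100 = statistics.stdev(means_100)  # theory: 5 / sqrt(100) = 0.5 lb
print(round(sd_25, 2), round(sd_100, 2))
```

Quadrupling the sample size should roughly halve the spread of the simulated means.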

Central Limit Theorem If population is normal (μ, σ): X̄ is exactly normal (μ, σ/√n). If popn is non-normal with (μ, σ) but n is large: X̄ is approx normal (μ, σ/√n). Guideline: n > 30 even if very non-normal

Other summary statistics E.g. lower quartile, proportion, correlation. Usually not normal distns. Formula for standard devn of sampling distn sometimes available. Sampling distn usually close to normal if n is large

Lottery problem Pennsylvania Cash 5 lottery 5 numbers selected from 1-39 Pick birthdays of family members (none 32-39) P(highest selected is 32 or over)? Statistic: H = highest of 5 random numbers (without replacement)

Lottery simulation Theory? Fairly hard. Simulation: Generated 5 numbers (without replacement) 1560 times Highest number > 31 in about 72% of repetitions
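
The simulation on this slide can be sketched as follows, alongside one counting argument for the exact value (the highest number is below 32 only when all five picks come from 1-31). The function names and the seed are hypothetical, and this is not necessarily how the original simulation was run.

```python
import math
import random

def prob_highest_at_least(threshold, pool=39, picks=5):
    """Exact P(highest of `picks` numbers drawn without replacement >= threshold)."""
    # Complement: every pick falls among the numbers below the threshold.
    return 1 - math.comb(threshold - 1, picks) / math.comb(pool, picks)

def simulate(threshold, pool=39, picks=5, reps=1560, seed=1):
    """Proportion of simulated draws whose highest number is >= threshold."""
    rng = random.Random(seed)
    hits = sum(
        max(rng.sample(range(1, pool + 1), picks)) >= threshold
        for _ in range(reps)
    )
    return hits / reps

exact = prob_highest_at_least(32)
approx = simulate(32)
print(round(exact, 4), round(approx, 3))
```

The exact value is close to 0.70, so a simulated 72% from 1560 repetitions is within ordinary sampling variability.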

Normal distributions Family of distributions (populations). Shape depends only on parameters μ (mean) & σ (st devn). All have same symmetric ‘bell shape’. Example: μ = 65 inches, σ = 2.7 inches

Importance of normal distn A reasonable model for many data sets Transformed data often approx normal Sample means (and many other statistics) are approx normal.

Standard normal distribution Z ~ Normal (μ = 0, σ = 1) [Figure: standard normal density, axis from -3 to 3, shaded area = Prob (Z < z*)]

Probabilities for normal (0, 1) Check from tables: P(Z ≤ -3.00) = 0.0013, P(Z ≤ -2.59) = 0.0048, P(Z ≤ 1.31) = 0.9049, P(Z ≤ 2.00) = 0.9772, P(Z ≤ -4.75) = 0.000001

Probability Z > 1.31 P(Z > 1.31) = 1 – P(Z ≤ 1.31) = 1 – .9049 = .0951

Prob ( Z between –2.59 and 1.31) P(-2.59 ≤ Z ≤ 1.31) = P(Z ≤ 1.31) – P(Z ≤ -2.59) = .9049 – .0048 = .9001
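
The table look-ups on the last three slides can be reproduced without tables. This is a minimal sketch using the error function from Python's standard library; the helper name `phi` is hypothetical.

```python
import math

def phi(z):
    """Standard normal CDF, P(Z <= z), via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Cumulative probabilities for the z values listed on the slide.
for z in (-3.00, -2.59, 1.31, 2.00, -4.75):
    print(f"P(Z <= {z:5.2f}) = {phi(z):.6f}")

upper_tail = 1 - phi(1.31)        # P(Z > 1.31)
between = phi(1.31) - phi(-2.59)  # P(-2.59 <= Z <= 1.31)
print(round(upper_tail, 4), round(between, 4))
```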

Standard devns from mean Normal (μ, σ) [Figure: normal density with axis marked in st devns either side of μ] Heights of students: μ = 65 inches, σ = 2.7 inches

Probability and area X ~ normal (μ = 65, σ = 2.7) P (X ≤ 67.7) = area

Probability and area (cont.) Normal (μ, σ): exact values behind the 70-95-100 rule. P(X within σ of μ) = 0.683, approx 70%. P(X within 2σ of μ) = 0.954, approx 95%. P(X within 3σ of μ) = 0.997, approx 100%

Finding approx probabilities Ht of college woman, X ~ normal (μ = 65, σ = 2.7) Prob (X ≤ 62)? Sketch normal density. Estimate area: P (X ≤ 62) = area, about 1/8

Translate question from X to Z X ~ Normal (μ, σ) Find P(X ≤ x*) Translate to z-score: z* = (x* − μ)/σ, with Z ~ Normal (μ = 0, σ = 1), so P(X ≤ x*) = P(Z ≤ z*)

Finding probabilities Prob (height of randomly selected college woman ≤ 62 )? About 13%.

Prob (X > value) Ht of college woman, X ~ normal (μ = 65, σ = 2.7) Prob (X > 68 inches)?
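
Both height questions can be answered by converting to a z-score or by working with the normal (65, 2.7) distribution directly. The sketch below uses `statistics.NormalDist` from the Python standard library (3.8+); the variable names are illustrative.

```python
from statistics import NormalDist

heights = NormalDist(mu=65, sigma=2.7)  # heights of college women, inches

z_62 = (62 - 65) / 2.7                  # z-score for 62 inches, about -1.11
p_below_62 = heights.cdf(62)            # P(X <= 62)
p_above_68 = 1 - heights.cdf(68)        # P(X > 68) = 1 - P(X <= 68)

print(round(z_62, 2), round(p_below_62, 3), round(p_above_68, 3))
```

Because 62 and 68 are the same distance from the mean, the two tail probabilities are equal by symmetry.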

Finding upper quartile Blood pressures are normal with mean 120 and standard deviation 10. What is the 75th percentile? Step 1: Solve for z-score. Closest z* with area of 0.7500 (tables): z* = 0.67. Step 2: Calculate x = z*σ + μ: x = (0.67)(10) + 120 = 126.7 or about 127.
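
The two steps above can be sketched in code, with the inverse CDF of the standard normal standing in for the table look-up. `NormalDist.inv_cdf` is part of the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 120, 10

# Step 1: z-score with cumulative area 0.75 (tables give 0.67).
z_star = NormalDist().inv_cdf(0.75)

# Step 2: convert back to the blood-pressure scale.
x_75 = z_star * sigma + mu

print(round(z_star, 2), round(x_75, 1))
```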

Probabilities about means Blood pressure ~ normal (μ = 120, σ = 10). 8 people given drug. If drug does not affect blood pressure, find P(average blood pressure > 130)

P (X̄ > 130)? X ~ normal (μ = 120, σ = 10), n = 8, so X̄ ~ normal (mean = 120, st devn = 10/√8 = 3.54). z = (130 − 120)/3.54 = 2.83, prob = 0.0023. Very little chance!
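
A short check of this calculation, assuming (as the slide does) that the drug has no effect and the 8 readings are independent. The variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 120, 10, 8

sd_mean = sigma / math.sqrt(n)            # st devn of the sample mean
z = (130 - mu) / sd_mean                  # z-score of an average of 130
p_mean_above_130 = 1 - NormalDist(mu, sd_mean).cdf(130)

print(round(sd_mean, 2), round(z, 2), round(p_mean_above_130, 4))
```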

Distribution of sum X ~ distn with (μ, σ); aX ~ distn with (aμ, aσ), e.g. miles to kilometers. Sum of n values = n × X̄, so the sum has mean nμ and st devn n × σ/√n = √n σ. Central Limit Theorem implies approx normal

Probabilities about sum Profit in 1 day ~ normal (μ = $300, σ = $200). Prob(total profit in week < $1,000)? Total ~ normal (mean = 7 × 300 = $2,100, st devn = √7 × 200 = $529). Prob = 0.0188. Assumes independence
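
The weekly-profit probability follows from the mean and st devn of a sum of independent values. The sketch below assumes a 7-day week and independent daily profits, which is what reproduces the 0.0188 on the slide.

```python
import math
from statistics import NormalDist

mu_day, sigma_day, days = 300, 200, 7   # ASSUMPTION: week = 7 independent days

mean_total = days * mu_day                # 7 * 300 = 2100
sd_total = math.sqrt(days) * sigma_day    # sqrt(7) * 200, about 529
p_total_below_1000 = NormalDist(mean_total, sd_total).cdf(1000)

print(mean_total, round(sd_total, 1), round(p_total_below_1000, 4))
```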

Categorical data Most important parameter is π = Prob (success). Corresponding summary statistic is p = Proportion (success). N.B. Textbook uses p and p̂

Number of successes Easiest to deal with count of successes before proportion. If… 1. n “trials” (fixed beforehand). 2. Only “success” or “failure” possible for each trial. 3. Outcomes are independent. 4. Prob (success), π, remains same for all trials. Prob (failure) is 1 – π. Then X = number of successes ~ binomial (n, π)

Examples

Binomial Probabilities P(X = k) = [n! / (k! (n − k)!)] π^k (1 − π)^(n − k) for k = 0, 1, 2, …, n. You won’t need to use this!! Prob (win game) = 0.2. Plays of game are independent. What is Prob (wins 2 out of 3 games)? What is P(X = 2)?
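
The question on this slide can be answered with the standard binomial probability formula. A minimal sketch follows; the function name `binomial_pmf` is hypothetical.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Wins 2 out of 3 independent games, each won with probability 0.2:
# 3 ways to choose the two wins, times 0.2 * 0.2 * 0.8.
p_two_of_three = binomial_pmf(2, 3, 0.2)
print(round(p_two_of_three, 3))  # 0.096
```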

Mean & st devn of Binomial For a binomial (n, π): mean = nπ, st devn = √(nπ(1 – π))

Extraterrestrial Life? 50% of large population would say “yes” if asked, “Do you believe there is extraterrestrial life?” Sample of n = 100. X = # “yes” ~ binomial (n = 100, π = 0.5)

Extraterrestrial Life? Sample of n = 100. X = # “yes” ~ binomial (n = 100, π = 0.5), so mean = 50 and st devn = 5. 70-95-100 rule of thumb for # “yes”: About 95% chance of between 40 & 60. Almost certainly between 35 & 65
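
The rule-of-thumb intervals can be compared with exact binomial probabilities. Because π = 0.5 here, every outcome of 100 answers is equally likely, so the sketch simply counts outcomes; the variable names are illustrative.

```python
import math

n, pi = 100, 0.5

mean = n * pi                        # 50
sd = math.sqrt(n * pi * (1 - pi))    # 5

# With pi = 0.5, P(X = k) = C(n, k) / 2**n.
p_40_to_60 = sum(math.comb(n, k) for k in range(40, 61)) / 2**n
p_35_to_65 = sum(math.comb(n, k) for k in range(35, 66)) / 2**n

print(mean, sd, round(p_40_to_60, 3), round(p_35_to_65, 4))
```

The exact probabilities come out slightly above the 95% and "almost 100%" figures because the intervals include their endpoints.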

Normal approx to binomial If X is binomial (n, π), and n is large, then X is also approximately normal, with mean nπ and st devn √(nπ(1 – π)). Conditions: Both nπ and n(1 – π) are at least 10. (Justified by Central Limit Theorem)

Number of H in 30 Flips X = # heads in n = 30 flips of fair coin. X ~ binomial (n = 30, π = 0.5). Bell-shaped & approx normal.

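Opinion poll n = 500 adults; 240 agreed with statement. If π = 0.5 of all adults agree, what is P(X ≤ 240)? X is approx normal with mean = 500 × 0.5 = 250 and st devn = √(500 × 0.5 × 0.5) = 11.2, so z = (240 − 250)/11.2 = −0.89 and P(X ≤ 240) ≈ 0.19. Not unlikely to see 48% or less, even if 50% in population agree.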
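
The normal approximation for this poll can be sketched as below. No continuity correction is applied, which is an assumption about how the slide's calculation was done; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n, pi = 500, 0.5

mean = n * pi                        # 250
sd = math.sqrt(n * pi * (1 - pi))    # about 11.18

z = (240 - mean) / sd
p_at_most_240 = NormalDist(mean, sd).cdf(240)

print(round(sd, 2), round(z, 2), round(p_at_most_240, 3))
```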

Sample Proportion Suppose (unknown to us) 40% of a population carry the gene for a disease (π = 0.40). Random sample of 25 people; X = # with gene. X ~ binomial (n = 25, π = 0.4). p = X/n = proportion with gene

Distn of sample proportion X ~ binomial (n, π). p = X/n has mean π and st devn √(π(1 – π)/n). Large n: p is approx normal (nπ ≥ 10 & n(1 – π) ≥ 10)

Examples Election Polls: to estimate proportion who favor a candidate; units = all voters. Television Ratings: to estimate proportion of households watching TV program; units = all households with TV. Consumer Preferences: to estimate proportion of consumers who prefer new recipe compared with old; units = all consumers. Testing ESP: to estimate probability a person can successfully guess which of 5 symbols on a hidden card; repeatable situation = a guess.

Public opinion poll Suppose 40% of all voters favor Candidate A. Pollsters sample n = 2400 voters. Propn voting for A is approx normal with mean 0.40 and st devn √(0.4 × 0.6/2400) = 0.01. Simulation 400 times & theory.

Probability from normal approx If 40% of voters favor Candidate A, and n = 2400 sampled Sample proportion, p, is almost certain to be between 0.37 and 0.43 Prob 0.95 of p being between 0.38 and 0.42
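
The poll simulation and the theory can be compared with a short sketch. The function name `simulate_proportions` and the seed are hypothetical, and this is not the code behind the original slides.

```python
import math
import random
import statistics

def simulate_proportions(pi, n, reps, seed=2):
    """Simulate `reps` polls of n voters; return each poll's sample proportion."""
    rng = random.Random(seed)
    return [sum(rng.random() < pi for _ in range(n)) / n for _ in range(reps)]

props = simulate_proportions(pi=0.40, n=2400, reps=400)

sd_theory = math.sqrt(0.40 * 0.60 / 2400)   # 0.01
sd_sim = statistics.stdev(props)

# Share of simulated polls within 2 and 3 st devns of 0.40.
within_2sd = sum(0.38 <= p <= 0.42 for p in props) / len(props)
within_3sd = sum(0.37 <= p <= 0.43 for p in props) / len(props)

print(round(sd_theory, 4), round(sd_sim, 4), within_2sd, within_3sd)
```

The simulated st devn should sit near 0.01, with roughly 95% of proportions between 0.38 and 0.42 and nearly all between 0.37 and 0.43.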