Sampling Distribution Models

Presentation transcript:

Week 9, Chapter 15: Sampling Distribution Models

Probability of the Possible Outcome

Suppose there are two candidates in an electoral campaign. Let Y denote the outcome of asking a voter whether they select candidate #1: y = 0 (no) or y = 1 (yes). Suppose that each candidate has an equal probability of being selected (0.50). Let P(y) denote the probability of outcome y.

Random outcome Y:
  Possible outcome y    Probability P(y)
  0                     ½ = 0.5
  1                     ½ = 0.5

Let $\mu$ denote the population mean of Y: $\mu = (0 \times \tfrac{1}{2}) + (1 \times \tfrac{1}{2}) = \tfrac{1}{2} = 0.50$

This means that in the long run, on average, 0.50 of the population would vote for candidate #1.

Probability of the Possible Outcome

Let P(y) denote the probability of outcome y for selecting candidate #1.

Random outcome Y:
  Possible outcome y    Probability P(y)
  0                     ½ = 0.5
  1                     ½ = 0.5

The population mean, which is the population proportion of votes for candidate #1, is: $\mu = (0 \times \tfrac{1}{2}) + (1 \times \tfrac{1}{2}) = \tfrac{1}{2} = 0.50$

The population variance is the sum of the squared deviations of each possible outcome from the mean, each weighted by the probability of that outcome: $\sigma^2 = (0 - 0.5)^2 (0.5) + (1 - 0.5)^2 (0.5) = 0.25$

The population standard deviation is the positive square root of the variance: $\sigma = \sqrt{0.25} = 0.50$
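The arithmetic on this slide can be checked with a few lines of Python. This is only an illustrative sketch; the variable names (`population`, `mu`, `variance`, `sigma`) are made up for the example and are not part of the original slides.

```python
import math

# Population distribution of Y from the slide: outcome -> probability.
population = {0: 0.5, 1: 0.5}

# Mean: each outcome weighted by its probability.
mu = sum(y * p for y, p in population.items())

# Variance: squared deviation from the mean, weighted by probability.
variance = sum((y - mu) ** 2 * p for y, p in population.items())

# Standard deviation: positive square root of the variance.
sigma = math.sqrt(variance)

print(mu, variance, sigma)  # 0.5 0.25 0.5
```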

For three randomly selected eligible voters, the sampling distribution of the sample proportion when the population proportion is 0.50 is as follows:

  Possible sample   Number of 1s in the sample   Sample proportion   Probability of the sample
  (1, 1, 1)         3                            3/3 = 1.000         0.5 x 0.5 x 0.5 = 0.125
  (1, 1, 0)         2                            2/3 = 0.667         0.5 x 0.5 x 0.5 = 0.125
  (1, 0, 1)         2                            2/3 = 0.667         0.5 x 0.5 x 0.5 = 0.125
  (1, 0, 0)         1                            1/3 = 0.333         0.5 x 0.5 x 0.5 = 0.125
  (0, 1, 1)         2                            2/3 = 0.667         0.5 x 0.5 x 0.5 = 0.125
  (0, 1, 0)         1                            1/3 = 0.333         0.5 x 0.5 x 0.5 = 0.125
  (0, 0, 1)         1                            1/3 = 0.333         0.5 x 0.5 x 0.5 = 0.125
  (0, 0, 0)         0                            0/3 = 0.000         0.5 x 0.5 x 0.5 = 0.125

Note: we multiply the probabilities because the outcomes (whether or not each person votes for candidate #1) are independent from one eligible voter to another.

For three randomly selected eligible voters, the sampling distribution of the sample proportion when the population proportion is 0.50 is as follows. This table is the organized version of the previous table.

  Sample proportion   Probability
  0.000               1 x 0.125 = 0.125
  0.333               3 x 0.125 = 0.375
  0.667               3 x 0.125 = 0.375
  1.000               1 x 0.125 = 0.125
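The same sampling distribution can be built by enumerating every possible sample of three voters. The sketch below is one way to do that in Python, assuming independent voters with p = 0.5 as on the slide; the names `sampling_dist`, `p`, and `n` are illustrative only.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

p = Fraction(1, 2)  # population proportion voting for candidate #1
n = 3               # sample size

# Map each sample proportion to its total probability.
sampling_dist = defaultdict(Fraction)
for sample in product([0, 1], repeat=n):  # all 2**n possible samples
    ones = sum(sample)
    # Independence: multiply the per-voter probabilities.
    prob = p ** ones * (1 - p) ** (n - ones)
    sampling_dist[Fraction(ones, n)] += prob

for prop in sorted(sampling_dist):
    print(f"{float(prop):.3f}  {float(sampling_dist[prop]):.3f}")
# 0.000  0.125
# 0.333  0.375
# 0.667  0.375
# 1.000  0.125
```

Exact fractions are used so that the probabilities add up to exactly 1 with no rounding error.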

Let’s explore this concept with a simulation using StatCrunch. The idea is the same as tossing a fair coin (with equal probability, 0.50, of obtaining a head or a tail). Let’s toss one fair coin a fixed number of times.

Toss 1 fair coin 1000 times: suppose we take a random sample of 1 eligible voter, 1000 times. Histogram: [figure]

Toss 2 fair coins 1000 times: suppose we take a random sample of 2 eligible voters, 1000 times.

Toss 5 fair coins 1000 times: suppose we take a random sample of 5 eligible voters, 1000 times.

Toss 20 fair coins 1000 times: suppose we take a random sample of 20 eligible voters, 1000 times.

Toss 50 fair coins 1000 times: suppose we take a random sample of 50 eligible voters, 1000 times.

Toss 100 fair coins 1000 times: suppose we take a random sample of 100 eligible voters, 1000 times.

Toss 1000 fair coins 1000 times: suppose we take a random sample of 1000 eligible voters, 1000 times.

I saved the data for each experiment in a worksheet:
  Toss 1 fair coin (n* = 1) 1000 times
  Toss 2 fair coins (n* = 2) 1000 times
  Toss 5 fair coins (n* = 5) 1000 times
  Toss 20 fair coins (n* = 20) 1000 times
  Toss 50 fair coins (n* = 50) 1000 times
  Toss 100 fair coins (n* = 100) 1000 times
  Toss 1000 fair coins (n* = 1000) 1000 times

Worksheet columns: n* = 1, n* = 2, n* = 5, n* = 20, n* = 50, n* = 100, n* = 1000

Group dot plots: What do you see? What happens as we increase the size of our sample (n*) while the number of repetitions stays fixed at n = 1000? [Figure: dot plots for n* = 1000, 100, 50, 20, 5, 2, 1]

Group dot plots: What do you see? Note that the spread gets narrower as the sample size (n*) increases.

What do you see in the descriptive statistics table (e.g., the sample mean (sample proportion) for each sample)?

What do you see in the descriptive statistics table (e.g., the sample mean (sample proportion) for each sample)? Note that the sample mean varies from sample to sample, and as the sample size (n*) increases, the sample proportion tends to get closer to the actual population proportion (0.5).
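The slides run this experiment in StatCrunch. As a rough stand-in, the sketch below repeats the same idea in plain Python: for each sample size n*, draw 1000 samples of fair coin tosses and summarize the 1000 sample proportions. The function name `simulate_proportions`, the seed, and the use of Python's `random` module are assumptions for illustration. The numbers will not match the StatCrunch worksheet exactly, since every simulation produces different random draws.

```python
import random
import statistics

random.seed(0)  # arbitrary seed, used only to make the run repeatable

def simulate_proportions(sample_size, repetitions=1000, p=0.5):
    """Return one sample proportion of heads per repetition."""
    return [
        sum(random.random() < p for _ in range(sample_size)) / sample_size
        for _ in range(repetitions)
    ]

# Sample size n* -> (mean, standard deviation) of the 1000 sample proportions.
results = {}
for sample_size in [1, 2, 5, 20, 50, 100, 1000]:
    props = simulate_proportions(sample_size)
    results[sample_size] = (statistics.mean(props), statistics.pstdev(props))
    mean, sd = results[sample_size]
    print(f"n* = {sample_size:>4}: mean = {mean:.3f}, sd = {sd:.3f}")
```

In a run like this, the mean of the sample proportions should stay near 0.5 for every n*, while their standard deviation shrinks as n* grows, which is the narrowing seen in the dot plots.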

Sampling Distribution of the Sample Mean, $\bar{x}$

The sample mean is a variable because it varies from sample to sample. For random samples, the sample mean fluctuates around the population mean $\mu$. The standard deviation of the sample mean $\bar{x}$ is called the standard error of the sample mean and is denoted by $\sigma_{\bar{x}}$. In practice, we don’t do repeated sampling; we use the theory behind the idea of repeated sampling. Hence,

$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$

The standard error $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ is a fraction of the spread of the population. Individual observations tend to vary much more than sample means vary from sample to sample. As the sample size increases, the standard error decreases, and the sampling distribution gets narrower (as we saw in our previous experiment with tossing a fair coin many times) and concentrates more tightly around the actual population parameter (e.g., mean, proportion).
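To see how quickly the standard error shrinks, the formula can be evaluated for the coin-toss population from the earlier slides ($\sigma = 0.5$) at the same sample sizes used in the simulation. This is a small illustrative sketch; `standard_errors` is a made-up name.

```python
import math

sigma = 0.5  # population SD of the 0/1 vote variable, from the earlier slide

# Theoretical standard error sigma / sqrt(n) for each sample size.
standard_errors = {n: sigma / math.sqrt(n) for n in [1, 2, 5, 20, 50, 100, 1000]}

for n, se in standard_errors.items():
    print(f"n = {n:>4}: standard error = {se:.4f}")
```

Because of the square root, cutting the standard error in half requires four times as many observations.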

Central Limit Theorem (CLT)

Regardless of the original shape of the population, for a large random sample of size n, the sampling distribution of $\bar{x}$ is approximately normal:

$\bar{x} \sim N\left(\mu,\ \sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}\right)$

We can apply the empirical rule: the sample mean almost certainly (with probability close to 1) falls within 3 standard errors of the population mean.

Example

The distribution of household electricity usage is right-skewed with mean 673 kWh and standard deviation 556 kWh. Suppose a researcher takes a random sample of 900 households. For the sampling distribution of his sample mean,
a. specify its mean and its standard deviation (standard error).
b. specify its shape:
c. specify the theorem that you used to answer part a:
d. sketch the sampling distribution of the sample mean for n = 900.

Example

The distribution of household electricity usage is right-skewed with mean 673 kWh and standard deviation 556 kWh. Suppose a researcher takes a random sample of 900 households. For the sampling distribution of his sample mean,
a. specify its mean and its standard deviation (standard error). The sample mean $\bar{x}$ has mean $\mu = 673$ and standard error $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{556}{\sqrt{900}} = 18.53$
b. specify its shape: Approximately normal
c. specify the theorem that you used to answer part a: Central Limit Theorem
d. sketch the sampling distribution of the sample mean for n = 900.

Example

The distribution of household electricity usage is right-skewed with mean 673 kWh and standard deviation 556 kWh. Suppose a researcher takes a random sample of 900 households. What is the probability that his sample mean is more than 720?

By the CLT: $\bar{x} \sim N\left(\mu = 673,\ \sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{556}{\sqrt{900}} = 18.53\right)$

Formula: $Z = \dfrac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$

$Z = \dfrac{720 - 673}{556/\sqrt{900}} = \dfrac{47}{18.53} = 2.54$

The area above Z = 2.54, based on our table, is 1 minus the area below Z = 2.54: 1 – 0.9945 = 0.0055. Or you can think of it as follows: the area above Z = 2.54 is equal to the area below Z = -2.54 in the Z-table, which is 0.0055. Thus, it is very unlikely that his sample mean would be more than 720.
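The table lookup can be cross-checked in Python using the standard library's `statistics.NormalDist`. This is a sketch for verification only, not part of the original slides; the names `se`, `z`, `p_table`, and `p_exact` are illustrative. `p_table` mimics the Z-table by rounding Z to two decimals first, while `p_exact` skips the rounding, so the two values may differ slightly in the fourth decimal place.

```python
import math
from statistics import NormalDist

mu, sigma, n = 673, 556, 900  # values given in the example

se = sigma / math.sqrt(n)     # standard error of the sample mean
z = (720 - mu) / se           # z-score of a sample mean of 720

# Upper-tail area using Z rounded to two decimals, as with a printed Z-table.
p_table = 1 - NormalDist().cdf(round(z, 2))

# Upper-tail area computed directly from N(mu, se), with no rounding of Z.
p_exact = 1 - NormalDist(mu, se).cdf(720)

print(f"SE = {se:.2f}, Z = {z:.2f}")
print(f"P(sample mean > 720): table-style = {p_table:.4f}, unrounded = {p_exact:.4f}")
```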

Watch the following video about the idea of Central Limit Theorem  http://vimeo.com/75089338