Populations & Samples Objectives:

Slides:



Advertisements
Similar presentations
Confidence Intervals Chapter 19. Rate your confidence Name Mr. Holloways age within 10 years? within 5 years? within 1 year? Shooting a basketball.
Advertisements

Estimating Population Values
Aron, Aron, & Coups, Statistics for the Behavioral and Social Sciences: A Brief Course (3e), © 2005 Prentice Hall Chapter 6 Hypothesis Tests with Means.
Confidence Intervals Objectives: Students should know how to calculate a standard error, given a sample mean, standard deviation, and sample size Students.
“Students” t-test.
CHAPTER 14: Confidence Intervals: The Basics
CmpE 104 SOFTWARE STATISTICAL TOOLS & METHODS MEASURING & ESTIMATING SOFTWARE SIZE AND RESOURCE & SCHEDULE ESTIMATING.
Sampling: Final and Initial Sample Size Determination
Sampling Distributions (§ )
Chapter 10: Sampling and Sampling Distributions
Ch 6 Introduction to Formal Statistical Inference.
Topics: Inferential Statistics
Sampling Distributions
1 The Basics of Regression Regression is a statistical technique that can ultimately be used for forecasting.
Statistical Concepts (continued) Concepts to cover or review today: –Population parameter –Sample statistics –Mean –Standard deviation –Coefficient of.
1 Hypothesis Testing In this section I want to review a few things and then introduce hypothesis testing.
T T07-01 Sample Size Effect – Normal Distribution Purpose Allows the analyst to analyze the effect that sample size has on a sampling distribution.
1 Inference About a Population Variance Sometimes we are interested in making inference about the variability of processes. Examples: –Investors use variance.
Clt1 CENTRAL LIMIT THEOREM  specifies a theoretical distribution  formulated by the selection of all possible random samples of a fixed size n  a sample.
1 Psych 5500/6500 Statistics and Parameters Fall, 2008.
Review of normal distribution. Exercise Solution.
Chapter 7 Estimation: Single Population
Chapter 11: Estimation Estimation Defined Confidence Levels
Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc. Chap 8-1 Confidence Interval Estimation.
Introduction to Statistical Inference Chapter 11 Announcement: Read chapter 12 to page 299.
Population All members of a set which have a given characteristic. Population Data Data associated with a certain population. Population Parameter A measure.
Confidence Intervals for Means. point estimate – using a single value (or point) to approximate a population parameter. –the sample mean is the best point.
Instructor Resource Chapter 5 Copyright © Scott B. Patten, Permission granted for classroom use with Epidemiology for Canadian Students: Principles,
Normal Distributions Z Transformations Central Limit Theorem Standard Normal Distribution Z Distribution Table Confidence Intervals Levels of Significance.
Chapter 11 – 1 Chapter 7: Sampling and Sampling Distributions Aims of Sampling Basic Principles of Probability Types of Random Samples Sampling Distributions.
Chapter 7 Estimation Procedures. Basic Logic  In estimation procedures, statistics calculated from random samples are used to estimate the value of population.
Chapter 7: Sampling and Sampling Distributions
Chapter 8 Confidence Intervals 8.1 Confidence Intervals about a Population Mean,  Known.
AP Statistics Chapter 10 Notes. Confidence Interval Statistical Inference: Methods for drawing conclusions about a population based on sample data. Statistical.
Estimating a Population Mean
Lecture PowerPoint Slides Basic Practice of Statistics 7 th Edition.
Chapter 10: Confidence Intervals
Mystery 1Mystery 2Mystery 3.
Review Normal Distributions –Draw a picture. –Convert to standard normal (if necessary) –Use the binomial tables to look up the value. –In the case of.
Point Estimates point estimate A point estimate is a single number determined from a sample that is used to estimate the corresponding population parameter.
Review of Statistical Terms Population Sample Parameter Statistic.
Introduction to Inference Sampling Distributions.
POLS 7000X STATISTICS IN POLITICAL SCIENCE CLASS 5 BROOKLYN COLLEGE-CUNY SHANG E. HA Leon-Guerrero and Frankfort-Nachmias, Essentials of Statistics for.
1 Probability and Statistics Confidence Intervals.
9-1 ESTIMATION Session Factors Affecting Confidence Interval Estimates The factors that determine the width of a confidence interval are: 1.The.
10.1 – Estimating with Confidence. Recall: The Law of Large Numbers says the sample mean from a large SRS will be close to the unknown population mean.
ESTIMATION OF THE MEAN. 2 INTRO :: ESTIMATION Definition The assignment of plausible value(s) to a population parameter based on a value of a sample statistic.
Ex St 801 Statistical Methods Inference about a Single Population Mean (CI)
Chapter 12 Inference for Proportions AP Statistics 12.2 – Comparing Two Population Proportions.
Lab Chapter 9: Confidence Interval E370 Spring 2013.
ESTIMATION OF THE MEAN. 2 INTRO :: ESTIMATION Definition The assignment of plausible value(s) to a population parameter based on a value of a sample statistic.
Sampling and Sampling Distributions. Sampling Distribution Basics Sample statistics (the mean and standard deviation are examples) vary from sample to.
CHAPTER 6: SAMPLING, SAMPLING DISTRIBUTIONS, AND ESTIMATION Leon-Guerrero and Frankfort-Nachmias, Essentials of Statistics for a Diverse Society.
Inference: Conclusion with Confidence
Introduction to Inference
Inference for the Mean of a Population
ESTIMATION.
Inference: Conclusion with Confidence
Statistics in Applied Science and Technology
CONCEPTS OF ESTIMATION
Introduction to Inference
CHAPTER 22: Inference about a Population Proportion
CENTRAL LIMIT THEOREM specifies a theoretical distribution
Tutorial 9 Suppose that a random sample of size 10 is drawn from a normal distribution with mean 10 and variance 4. Find the following probabilities:
Confidence Intervals Accelerated Math 3.
Chapter 8: Estimating With Confidence
Chapter 12 Inference for Proportions
Sampling Distributions (§ )
How Confident Are You?.
Hypothesis Tests with Means of Samples
Presentation transcript:

Populations & Samples Objectives: Students should know the difference between a population and a sample Students should be able to demonstrate populations and samples using a GATE frame Students should know the difference between a parameter and a statistic Students should know the main purpose for estimation and hypothesis testing Students should know how to calculate standard error The first learning objective for this module is that students should know the difference between a population and a sample. Second, students should be able to demonstrate populations and samples using a GATE frame. Third, students should know the difference between a parameter and a statistic. Fourth, students should know the main purpose for estimation and hypothesis testing. Finally, students should know how to calculate standard error.

GATE Frame: Populations & Samples A population is any entire collection of people, animals, plants or objects which demonstrate a phenomenon of interest. Population A sample is a subset of the population; the group of participants from which data is collected. In this module, we are going to discuss populations and samples, critical concepts for understanding the goals of statistical testing. A population is any entire collection of people, animals, plants, or objects which demonstrate a phenomenon of interest. A sample is a subset of the population, or the group of participants from which data is collected. In most cases, we use samples because studying an entire population is not possible due to time and resource constraints. So, data is typically collected from a sample of the population of interest and then used to estimate the phenomenon in the population. Eligible In most situations, studying an entire population is not possible, so data is collected from a sample and used to estimate the phenomenon in the population. Sample/ Participants

Parameters & Statistics A population value is called a parameter. A value calculated from a sample is called a statistic. A population value is called a parameter. A value calculated from a sample is called a statistic. A sample statistic is simply a point estimate of a population parameter. Remember, P and P (population and parameter) and S and S (sample and statistic). We can use carefully drawn samples for testing hypotheses about populations and for estimating population parameters, if our sample is representative of the larger population. When we draw a representative sample from a larger population, the sample statistic serves as a point estimate of a population parameter. For example, if we draw a representative sample and calculate the sample mean, that mean serves as a point estimate of the mean of the population from which the sample is drawn. The sample mean should be representative, or give us a reasonable estimate of, the population mean. Note: A sample statistic is a point estimate of a population parameter.

Estimating Population Parameters Confidence intervals (CI) are ranges defined by lower and upper endpoints constructed around the point estimate based on a preset level of confidence. Hypothesis Testing is used to determine probabilities of obtaining results from a sample or samples if the result is not true in the population. Confidence intervals are ranges defined by lower and upper endpoints constructed around the point estimate, based on a preset level of confidence. We use these confidence intervals to communicate the range within which we would expect to find the population mean, with a certain level of confidence and given the information we have about the sample mean and and standard deviation. The goal of hypothesis testing then, is to determine the probabilities of obtaining results from a sample or samples, if the result is not true of the population.

Sample Estimates of Population Parameters Sample Statistic (point estimate) Population Parameter L = lower value U = upper value The sample is a subset of observations drawn from a larger population. We use the sample statistic, in this case x-bar or sample mean, combined with a measure of variability of the point estimate, to determine the range within which we would expect to find the true population value, in this case mu or population mean. Combine with measure of variability of the point estimate Construct a range of values with an associated probability of containing the true population value

We draw our sample, and the mean HR is 72 bpm What is Standard Error? Suppose a population of 1000 people has a mean heart rate of 75 bpm (but we don’t know this). We want to estimate the HR from a sample of 100 people drawn from the population: Population N=1000 n=100 Let’s walk through our heart rate example again and as we go, you’ll see how we use sample statistics to determine population parameters and also become more familiar with the concept of standard error. First, let’s suppose we have a population of 1000 people and within that population we think the mean heart rate is 75 beats per minute. We don’t know this for sure so we want to estimate the population’s mean heart rate by drawing a sample of 100 individuals from that population. So we draw a sample of 100 people from that population, measure their heart rates, and determine that the mean heart rate for this sample is 72 beats per minute. We draw our sample, and the mean HR is 72 bpm

Standard Error If we draw another sample, the mean will probably be a little different from 72, and if we draw lots of samples we will probably get lots of estimates of the population mean: Population N=1000 n=100 n=100 n=100 Since that’s only 1 small sample from a much larger population, let’s draw another sample. This time we draw another sample of 100 people from the population, measure their heart rates, and determine that the mean heart rate is 71. This is slightly different from the 72 mean heart rate we found in the first sample. We can draw multiple samples of 100 individuals from the larger population and it’s likely we will get different estimates of the population mean heart rate for each one. n=100 n=100 n=100

All possible samples of size n=100 Standard Error The mean of the means of all possible samples of size 100 would exactly equal the population mean: Population N=1000 All possible samples of size n=100 Next, we take all the means from all the different samples and we calculate the mean of those mean scores. If all these samples of 100 individuals are truly representative of the population, the mean of the means of all possible samples will exactly equal the population mean. The standard deviation of the means of all possible samples is the standard error of the mean. The standard error reflects, for a given sample size, now close the means of repeated samples will come to the population mean. The standard deviation of the means of all possible samples is the standard error of the mean

Sample Representativeness The sample means will follow a normal distribution, and: 95% of the sample means will be between the population mean and ±1.96 standard errors. Because the samples are representative of the population and are each of sufficient size, the sample means will also follow a normal distribution so 95% of the sample means will be between the population mean and +/- 1.96 standard errors.

95% of the intervals will contain the true population mean.  Population Mean Sample Means In addition, if we constructed 95% confidence intervals around each individual sample mean: 95% of the intervals will contain the true population mean. In the same way, if we construct 95% confidence intervals around each individual sample mean, 95% of the intervals will contain the true population mean.

Why is This Important and Useful? We rarely have the opportunity to draw repeated samples from a population, and usually only have one sample to make an inference about the population parameter: The standard error can be estimated from a single sample, by dividing the sample standard deviation by the square root of the sample size: This seems so theoretical, so why is it really important? In research, we rarely have the opportunity to draw repeated samples from a population. We usually only have one chance, or sample, that we can use to make an inference about a population parameter. We need to be able to calculate standard error so we know how representative the sample is of the population. The standard error is calculated by dividing the sample standard deviation by the square root of the sample size. Note: you will need to calculate the standard error for your practice exercises and competency quiz. Note: You will need to calculate the standard error in this course.

Standard Error and Confidence Intervals The sample SE can then be used to construct an interval around the sample statistic with a specified level of confidence of containing the true population value: The interval is called a confidence interval The Most Commonly Used Confidence Intervals: 90% = sample statistic + 1.645 SE 95% = sample statistic + 1.960 SE 99% = sample statistic + 2.575 SE We will go into more detail about confidence intervals in the next module, but as an introduction – we use the sample standard error value to construct an interval around the sample statistic with a specified level of confidence that the interval contains the true population value. The most commonly used confidence intervals are calculated in the following ways: To calculate a 90% confidence interval, you add the value for the sample statistic + /- 1.645 times the standard error. To determine a 95% confidence interval, you calculate the sample statistic +/- 1.96 times the standard error. The 99% confidence interval is calculated by adding the sample statistic +/- 2.575 times the standard error.