Estimating Population Parameters Based on a Sample

Estimating Population Parameters Based on a Sample 5/12/2019 HK 396 - Dr. Sasho MacKenzie

Why Estimate?
It is often not feasible, for lack of time and money, to measure an entire population. A researcher must therefore select a representative sample from the population and estimate the population's characteristics from it. This general principle, known as statistical inference, is used frequently in research.

Estimating a Population Mean
Researchers often want to know the mean of a population. For example, Health Canada may want to understand obesity trends over the next 10 years. The first step would be to measure obesity in the population annually, using body mass index (BMI). Researchers cannot measure all 20 million adult Canadians every year, so each year a random sample is measured and used to estimate the mean for the entire population.

Sampling Error
It is unlikely that the sample will have exactly the same mean as the entire population. Sampling error is the amount of error in an estimate of a population parameter that is based on a sample statistic. Health Canada therefore needs to determine how accurate the sample's mean BMI is and what the odds are that it differs from the population mean by a given amount.

Standard Error of the Mean (SEM)
The standard error of the mean is a numeric value that indicates the amount of error that may occur when estimating a population mean. An estimate of the population mean is always an educated guess and is accompanied by a probability statement. That is, upper and lower limits can be set around the estimated mean, and the chance that the true mean falls outside this range can be stated as a probability, such as 5 out of 100 times, or p = .05.

Understanding SEM
Consider the following theoretical exercise. Take 100 random samples (N = 400 each) of the Canadian adult population and find the mean BMI of each sample. That is, measure 400 people and calculate their mean BMI, then repeat the process 100 times. This generates 100 estimates of the population mean.

Understanding SEM
Most of the 100 sample means will cluster around the true population mean, but some will stray further from it. The sample means will form a normal distribution, just as the individual BMI measurements within a sample do. The standard deviation of the 100 sample means is the SEM.
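
The thought experiment above can be tried numerically. The following is only a sketch: it assumes the population BMI is exactly normal with mean 27 and SD 4 (the values used in the slides' example), and the seed and all names are illustrative choices, not part of the original lecture.

```python
import random
import statistics

random.seed(396)  # arbitrary seed so the run is repeatable

# Assumed population values, borrowed from the slides' example.
POP_MEAN, POP_SD = 27.0, 4.0
N_PER_SAMPLE = 400   # people measured in each sample
N_SAMPLES = 100      # number of repeated samples


def sample_mean(n):
    """Mean BMI of one simulated random sample of n adults."""
    return statistics.fmean(random.gauss(POP_MEAN, POP_SD) for _ in range(n))


# Repeat the sampling process 100 times, keeping one mean per sample.
sample_means = [sample_mean(N_PER_SAMPLE) for _ in range(N_SAMPLES)]

mean_of_means = statistics.fmean(sample_means)
empirical_sem = statistics.stdev(sample_means)  # SD of the sample means

print(f"mean of the {N_SAMPLES} sample means: {mean_of_means:.2f}")
print(f"SD of the sample means (empirical SEM): {empirical_sem:.3f}")
```

With these assumed inputs, the SD of the simulated sample means should land near 0.2, although the exact figure varies with the seed.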

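[Figure: Individual BMI Scores of One Sample. Normal curve of frequency against BMI (kg/m²) with standard deviation = 4 and x-axis ticks at 15, 19, 23, 27, 31, 35, and 39. Areas per band on each side of the mean: 34.1%, 13.6%, 2.2%, 0.1%. Cumulative areas: 68.2% within 1 SD, 95.4% within 2 SD, 99.8% within 3 SD.]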

Interpreting Previous Slide
The sample had a mean of 27 and an SD of 4. It formed a normal distribution, which means:
68.2% of the scores lie between 23 and 31
95.4% of the scores lie between 19 and 35
99.8% of the scores lie between 15 and 39
These values can be used to estimate the proportion of the entire population that would fall within the above limits. But Health Canada needs an estimate of the mean!
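
As a rough check of these proportions, the areas can be computed from the normal curve directly. This sketch assumes an exactly normal distribution with mean 27 and SD 4; `share_between` is a hypothetical helper name. The computed values can differ in the last digit from the rounded figures on the slide.

```python
from statistics import NormalDist

# Assumed model: individual BMI scores are exactly normal, mean 27, SD 4.
bmi = NormalDist(mu=27, sigma=4)


def share_between(dist, low, high):
    """Proportion of the distribution lying between low and high."""
    return dist.cdf(high) - dist.cdf(low)


for k in (1, 2, 3):
    low, high = 27 - k * 4, 27 + k * 4
    share = share_between(bmi, low, high)
    print(f"within {k} SD ({low} to {high}): {share:.1%}")
```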

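[Figure: Distribution of the 100 Sample Means. Normal curve of frequency against BMI (kg/m²) with SEM = 0.2 and x-axis ticks at 26.4, 26.6, 26.8, 27, 27.2, 27.4, and 27.6. Areas per band on each side of the mean: 34.1%, 13.6%, 2.2%, 0.1%. Cumulative areas: 68.2% within 1 SEM, 95.4% within 2 SEM, 99.8% within 3 SEM.]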

Interpreting Previous Slide
The mean of the 100 sample means was 27, and the SD of the 100 sample means was 0.2. The means are normally distributed; therefore, there is a
68.2% chance that: 26.8 < true mean < 27.2
95.4% chance that: 26.6 < true mean < 27.4
99.8% chance that: 26.4 < true mean < 27.6
The more precise, or narrow, the estimate, the lower the odds of being correct. As the estimate becomes more encompassing, the odds of being correct improve.

Calculating SEM in Reality
It is not practical to take 100 samples and then find the SD of their means. Instead, SEM is calculated with an equation based on the SD of the sample and the number of measurements in the sample:
$$\mathrm{SEM} = \frac{\mathrm{SD}}{\sqrt{N}}$$
where SD = the sample standard deviation and N = the number of measurements in the sample.

SEM Example
Suppose Health Canada measured the BMI of one sample of 400 adults and found: Mean = 27, SD = 4. Therefore,
$$\mathrm{SEM} = \frac{\mathrm{SD}}{\sqrt{N}} = \frac{4}{\sqrt{400}} = \frac{4}{20} = 0.2$$
This agrees with the standard deviation of the 100 sample means from the last graph.
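
The same arithmetic can be written as a short function. This is a minimal sketch; `standard_error_of_mean` is an illustrative name, and the inputs are the SD and N from the example.

```python
import math


def standard_error_of_mean(sd, n):
    """SEM = SD / sqrt(N), for a sample SD and a sample size n."""
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    return sd / math.sqrt(n)


# Values from the example: SD = 4, N = 400.
sem = standard_error_of_mean(sd=4, n=400)
print(f"SEM = {sem}")
```

Because N sits under a square root, quadrupling the sample size only halves the SEM.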

SEM is a Z-score
SEM is actually a standard deviation on a normal curve; therefore, one SEM is equivalent to a Z-score of ±1. The true mean of the population can be represented by the following equation:
$$\text{True mean} = \text{sample mean} \pm Z \times \mathrm{SEM}$$
Using the previous example and Z = 1, True mean = 27 ± 0.2.

Level of Confidence
A level of confidence (LOC) is a percentage that expresses the probability that a statement is correct. It is based on the characteristics of the normal curve. Using the example from the last slide, Health Canada can conclude that the mean BMI for adults, 27 ± 0.2, is accurate at the 68% level of confidence.

What if 68% isn’t enough?
If Health Canada wanted to be 95.4% confident, it would broaden the estimate of the mean to the values on the normal curve that encompass 95.4% of the area. This requires 2 standard deviations: with Z = 2, True mean = 27 ± 0.4. This estimate is accurate at the 95.4% LOC.
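
Widening the estimate for a higher level of confidence can be sketched as follows. The function names are illustrative, and the Z value for an arbitrary level of confidence comes from Python's standard-library normal distribution rather than from anything in the original slides.

```python
from statistics import NormalDist


def mean_interval(sample_mean, sem, z):
    """Lower and upper limits: sample mean plus or minus Z times SEM."""
    margin = z * sem
    return sample_mean - margin, sample_mean + margin


def z_for_confidence(loc):
    """Two-tailed Z for a level of confidence given as a fraction (0.95 for 95%)."""
    return NormalDist().inv_cdf(0.5 + loc / 2)


# Values from the example: sample mean 27, SEM 0.2.
for z in (1, 2):
    low, high = mean_interval(27, 0.2, z)
    print(f"Z = {z}: {low:.1f} < true mean < {high:.1f}")

# A 95% level of confidence needs a Z slightly below 2.
z95 = z_for_confidence(0.95)
low95, high95 = mean_interval(27, 0.2, z95)
print(f"95% LOC: Z = {z95:.2f}, {low95:.2f} < true mean < {high95:.2f}")
```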

Probability of Error (p-value)
If there is a 68% chance of being correct, there is also a 32% chance of being incorrect. This is referred to as the probability of error. The area under the curve that represents the probability of error is called alpha (α). Alpha is the level of chance occurrence. Alpha is directly related to Z because alpha is the area under the curve that extends beyond a given Z-score.

Z, Level of Confidence, and P-value
Z       LOC     p-value (two-tailed)
1.00    68%     .32
1.65    90%     .10
1.96    95%     .05
2.58    99%     .01
The table above shows the relationship among Z-score, LOC, and the two-tailed p-value. By tradition, LOC is presented as a percentage and the probability of error as a decimal.
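
The rows of this table can be reproduced from the standard normal curve. The sketch below uses hypothetical helper names; the level of confidence is the area between -Z and +Z, and the two-tailed p-value is the remaining area in both tails.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, SD 1


def level_of_confidence(z):
    """Area under the standard normal curve between -z and +z."""
    return standard_normal.cdf(z) - standard_normal.cdf(-z)


def two_tailed_p(z):
    """Area in both tails beyond -z and +z (the probability of error)."""
    return 1 - level_of_confidence(z)


print("Z      LOC    p")
for z in (1.00, 1.65, 1.96, 2.58):
    print(f"{z:.2f}   {level_of_confidence(z):.0%}    {two_tailed_p(z):.2f}")
```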

[Figure: Graphic of LOC, p-value, and alpha. Normal curve of frequency against BMI (kg/m²) for the sample means, with x-axis ticks from 26.4 to 27.6. The central 90% of the area lies between 26.67 (Z = -1.65) and 27.33 (Z = 1.65), with 5% in each tail. Level of confidence = 90%; probability of error = 0.1 (5% + 5%); alpha = 0.1.]

Tails of the Normal Curve
On the last slide, Health Canada had to consider the area at both ends (tails) of the curve. This was necessary because the true mean could be either above or below the estimated range. This is considered a two-tailed problem. The following question, by contrast, is a one-tailed problem: What is the chance that the mean BMI of the population is greater than 27.5?

One-Tailed Problem
To answer this question, we need to convert 27.5 to a Z-score and determine the area under the normal curve beyond that Z-score.
Z = (27.5 - 27) / 0.2 = 2.5 standard deviations
To find the area beyond Z = 2.5, we could consult a table in a statistics book or use Excel. The Excel formula =1-NORMSDIST(2.5) returns the p-value of 0.006. This means there is a 0.6% chance that the mean BMI of adult Canadians is greater than 27.5 kg/m².
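
For readers without Excel, the same one-tailed calculation can be sketched in Python. The variable names are illustrative; the inputs are the sample mean of 27 and SEM of 0.2 from the example.

```python
from statistics import NormalDist

sample_mean = 27.0
sem = 0.2
threshold = 27.5

# Convert the threshold to a Z-score on the distribution of sample means.
z = (threshold - sample_mean) / sem

# One-tailed p-value: area under the standard normal curve beyond z.
p_one_tailed = 1 - NormalDist().cdf(z)

print(f"Z = {z:.1f}")
print(f"one-tailed p = {p_one_tailed:.3f}")
```

This plays the same role as =1-NORMSDIST(2.5) and rounds to the 0.006 quoted above.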