Survey Methodology EPID 626 Sampling, Part II Manya Magnus, Ph.D. Fall 2001.

Presentation transcript:


Lecture overview Comments about Assignment I More sampling techniques Sampling error Sample sizes

Comments about Assignment I Late policy Location of mailbox Randomization vs. random selection Validity, reliability Sampling frames Physician responses=?=“gold standard” Research questions vs. survey questions Registering for class

Comments about Assignment I Grading Looked for completeness in answering questions, care in discussion of survey, effort, basically correct information, not just cut-n-paste, synthesis. Questions about grade:

Comments about Assignment I Grading: – […] % – + 80-89% – […] 70-79% – - 60-69% – -- <60% – 0 not turned in

Random digit dialing (1) Delineate the geographic boundaries of the sampling area Identify all of the exchanges used in the geographic area Identify the distribution of prefixes within the sampling area –Example: There may be 8 exchanges, but you may find that 3 of them are used for nearly two-thirds of residential lines.

Random digit dialing (2) You may stratify based on the distribution of prefixes –Ex. Take more samples from the 3 exchanges that account for the most residential lines Try to identify vacuous suffixes –These are suffixes not yet assigned or assigned in large groups to a business –Usually consider suffixes in blocks of 100 (ex. …)

Random digit dialing (3) May randomly select the four-digit suffixes –ex. use a random-numbers table Alternatively, you may use a plus-one approach –When you reach a residence, use that number as a seed, and add a fixed digit (one or two) to get the next number in the sample
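The two selection approaches above can be sketched in a few lines of Python. This is only an illustration: the area code, the exchanges, and their weights are hypothetical, and the function names are not from the lecture.

```python
import random

def random_suffix_numbers(area_code, exchanges, weights, n, seed=None):
    """Draw n phone numbers for a random digit dialing sample.

    Each draw picks an exchange in proportion to `weights` (an assumed
    share of residential lines per exchange) and then a random
    four-digit suffix between 0000 and 9999.
    """
    rng = random.Random(seed)
    numbers = []
    for _ in range(n):
        exchange = rng.choices(exchanges, weights=weights, k=1)[0]
        suffix = rng.randrange(10000)
        numbers.append(f"{area_code}-{exchange}-{suffix:04d}")
    return numbers

def plus_one(number, step=1):
    """Plus-one approach: add a fixed step to the suffix of a number
    that reached a residence. Wrapping past 9999 back to 0000 is an
    assumption made here for simplicity."""
    area_code, exchange, suffix = number.split("-")
    next_suffix = (int(suffix) + step) % 10000
    return f"{area_code}-{exchange}-{next_suffix:04d}"

# Hypothetical sampling area with three exchanges; the first is assumed
# to carry most residential lines, so it is sampled more heavily.
sample = random_suffix_numbers("504", ["555", "556", "557"], [4, 1, 1], n=5, seed=1)
print(sample)
print(plus_one("504-555-0199"))
```

Vacuous suffix blocks, once identified, could be excluded by redrawing any suffix that falls in them; that step is left out of the sketch.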

Random digit dialing (4) Provides a nonzero chance of reaching any household within a sampling area that has a telephone line regardless of whether the number is listed Is the probability of reaching every household equal? –No. Households with more than one phone line will have a greater probability than households with one phone line. –Adjust for unequal probability by weighting
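One common way to adjust for the unequal probability is to weight each household by the inverse of its number of phone lines. A minimal sketch, with made-up data and hypothetical function names:

```python
def selection_weights(phone_lines):
    """Weight each household by 1 / (number of phone lines),
    rescaled so that the weights average 1."""
    raw = [1.0 / lines for lines in phone_lines]
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]

def weighted_proportion(answers, weights):
    """Weighted share of households answering yes (True)."""
    yes_weight = sum(w for a, w in zip(answers, weights) if a)
    return yes_weight / sum(weights)

# Four hypothetical households: two with one line, two with two lines.
lines = [1, 1, 2, 2]
answers = [True, False, True, True]

weights = selection_weights(lines)
print(sum(answers) / len(answers))            # unweighted proportion
print(weighted_proportion(answers, weights))  # weighted proportion
```

Households with two lines were twice as likely to be dialed, so each counts half as much in the weighted estimate.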

Random digit dialing (5) Advantages: Inexpensive and easy to do Disadvantages: 1. Large number of unfruitful calls 2. Will exclude individuals without phones 3. May be difficult to ascertain geographic area

Sampling distributions The central limit theorem: If many samples of the same size are drawn from a population, the estimates of a particular quantity (say a mean) will be approximately normally distributed around the true population value As sample size increases, the distribution of the estimates becomes increasingly normal

This variation around the true value is the sampling error—it stems from the fact that, by chance, samples may differ from the population as a whole.

The larger the sample size and the less variance of what is being measured, the more tightly the sample estimates will “bunch” around the true population value, and the more accurate the sample-based estimate will be.
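A small simulation can make this concrete. The sketch below assumes a normally distributed population with hypothetical mean and standard deviation, draws many samples, and measures how widely the sample means spread around the true value.

```python
import random
import statistics

def sample_means(mu, sigma, n, n_samples=2000, seed=0):
    """Draw n_samples independent samples of size n from a normal
    population and return the mean of each sample."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(n_samples)
    ]

def spread(means):
    """Standard deviation of the simulated sample means."""
    return statistics.pstdev(means)

base = spread(sample_means(mu=50, sigma=10, n=25))
bigger_n = spread(sample_means(mu=50, sigma=10, n=100))
less_var = spread(sample_means(mu=50, sigma=5, n=100))

# Theory predicts sigma / sqrt(n): about 2, 1 and 0.5 respectively.
print(base, bigger_n, less_var)
```

Quadrupling the sample size halves the spread, and halving the population standard deviation halves it again.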

Example (1) (adapted from Babbie) Survey at TUSPHTM Approval of new Lundi Gras holiday Dichotomous outcome: approve/disapprove Survey population—aggregation of students Sampling frame—student list Random sample of students; representative sample of student body

Example (2) (adapted from Babbie) Extremes and all combinations in between are possible: from 100% approve to 100% disapprove (1% approve and 99% disapprove, etc.) First random sample: 48% approve, 52% disapprove Second random sample: 20% approve, 80% disapprove And so forth

Example (3) (adapted from Babbie) What results from this exercise is a distribution of sample statistics, or a sampling distribution. As more independent random samples are selected, the sample statistics obtained will be distributed around the true population value in a known way.

Example (4) (adapted from Babbie) They will be clustered about the true value within a certain range. The range is given by the standard error. We do not know if the value in our sample is within the range, just that if many similar samples were taken in the same fashion, X% would fall within the specified range; this one may or may not.

Example (5) (adapted from Babbie) Probability theory says that about 68% of sample estimates will fall within one standard error of the parameter and about 95% will fall within two standard errors of the parameter Confidence increases as the range widens
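The 68% and 95% figures can be checked by simulation. The sketch below assumes a true approval rate of 50% and samples of 1,000 students (both hypothetical) and counts how many sample proportions land within one and two standard errors of the true value.

```python
import math
import random

def coverage(true_p=0.5, n=1000, n_samples=2000, seed=1):
    """Share of simulated sample proportions that fall within one and
    within two standard errors of the true proportion."""
    rng = random.Random(seed)
    se = math.sqrt(true_p * (1 - true_p) / n)
    within_one = 0
    within_two = 0
    for _ in range(n_samples):
        p_hat = sum(rng.random() < true_p for _ in range(n)) / n
        distance = abs(p_hat - true_p)
        within_one += distance <= se
        within_two += distance <= 2 * se
    return within_one / n_samples, within_two / n_samples

one_se, two_se = coverage()
print(one_se, two_se)  # roughly 0.68 and 0.95
```

Because a count of approvals is discrete, the simulated shares differ slightly from the normal-theory values, and more so for small samples.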

Note difference between standard errors & standard deviations

Standard error of a mean

The standard deviation of the distribution of sample estimates of the mean that would be formed if an infinite number of samples of a given size were drawn.
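The formula did not survive in the transcript. In the usual notation (an assumption here, not taken from the slide), with sample standard deviation s and sample size n under simple random sampling, the standard error of a mean is estimated as:

```latex
\mathrm{SE}(\bar{x}) = \sqrt{\frac{s^{2}}{n}} = \frac{s}{\sqrt{n}}
```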

Proportions Mean of a two-value (binomial) distribution Var of a proportion = p(1-p) So the standard error of a proportion = sqrt(p(1-p)/n)

Table 2.1 Confidence Ranges for Variability Attributable to Sampling Trends If sample size=75 and p=0.20,

Confidence intervals In a survey of 100 respondents, 20% say yes. What is the confidence interval for a 95% confidence level? In a survey of 250 respondents, 10% say yes. What is the confidence interval for a 95% confidence level? What if 50% said yes?

In a survey of 100 respondents, 20% say yes. What is the confidence interval for a 95% confidence level? Interval is ±8 percentage points. 95% CI=(12%, 28%)

In a survey of 250 respondents, 10% say yes. What is the confidence interval for a 95% confidence level? What if 50% said yes? Interval is about ±3.8 percentage points. 95% CI is about (6.2%, 13.8%) If 50% said yes, CI is about (43.7%, 56.3%)
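These answers can be reproduced with a short calculation. The sketch below assumes the intervals were built as the estimate plus or minus two standard errors, which matches the rounded figures on the slides; 1.96 is the more exact multiplier for 95%. The function name is hypothetical.

```python
import math

def proportion_ci(p, n, z=2.0):
    """Approximate confidence interval for a proportion from a simple
    random sample: p +/- z * sqrt(p * (1 - p) / n).

    z=2.0 is an assumption chosen to match the slides' rounded answers;
    pass z=1.96 for the usual 95% multiplier.
    """
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

for p, n in [(0.20, 100), (0.10, 250), (0.50, 250)]:
    low, high = proportion_ci(p, n)
    print(f"n={n}, p={p:.0%}: ({low:.1%}, {high:.1%})")
```

For the same sample size, the interval is widest at 50% because p(1-p) is largest there.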

Sampling error and sampling strategy Sampling error for SRS is approximated by the standard error Systematic sampling –If not stratified, sampling error is the same as in SRS. –If stratified, errors are lower than those associated with an SRS of the same size for variables that differ (on average) by stratum, if rates of selection are constant across strata.

Sampling error and sampling strategy (2) Unequal rates of selection decrease sampling error for oversampled groups. They will generally produce sampling errors for the whole sample that are higher than those associated with an SRS of the same size for variables that differ by stratum.

Sampling error and sampling strategy (3) Clusters will produce sampling errors that are higher than those of an SRS of the same size for variables that are more homogeneous within clusters than in the population as a whole. You must look at the nature of the clusters to evaluate the effect on the sampling error.
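One standard way to quantify the effect of clustering, not given in the lecture, is Kish's design effect for equal-sized clusters: 1 + (m - 1) * rho, where m is the cluster size and rho is the intraclass correlation (how alike members of the same cluster are). A rough sketch with hypothetical values:

```python
import math

def design_effect(cluster_size, icc):
    """Kish's approximate design effect for equal-sized clusters.
    `icc` is the intraclass correlation, assumed known here."""
    return 1 + (cluster_size - 1) * icc

def cluster_standard_error(srs_se, cluster_size, icc):
    """Inflate an SRS standard error by the square root of the
    design effect."""
    return srs_se * math.sqrt(design_effect(cluster_size, icc))

# Hypothetical: 20 respondents per cluster, modest within-cluster similarity.
print(design_effect(20, 0.05))
print(cluster_standard_error(0.04, 20, 0.05))
print(design_effect(20, 0.0))  # no within-cluster similarity: same as SRS
```

When responses within a cluster are no more alike than in the population as a whole, the design effect is 1 and clustering costs nothing in precision.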

Caveats Sampling error is by no means the only source of error. Non-sampling error, bias, and error resulting from incorrect specification of the sampling frame, among others, are also sources of error. Often the latter are more insidious, as they are seldom quantifiable. A total survey approach is useful in this regard.

Sample size (1) Very important to consider before undertaking a study Consult a biostatistician Many references available: texts, spreadsheets, stat programs, EpiInfo, etc. Never feel bad asking for assistance

Sample size (2) What not to do 1. Sample size does not rely on the fraction of the population that is sampled. Nor does it depend on the size of the population you want to describe. 2. Sample size should not be decided solely based on what others have previously done. 3. Sample size should not be based on the desired level of precision for just one estimate.

Sample size (3) What to do –develop an analysis plan –decide the desired precision of estimates for subgroups –consider the research questions –consider affordability –consider feasibility –and, to some extent, consider previous studies

Sample size (5) Parameters required to calculate sample size: –Null hypothesis: what precisely are you asking/testing? –α [Pr(type I error)] –β [Pr(type II error)], usually included as 1-β = power –What difference between groups do you want to observe? (e.g., μ1 - μ2) –What is a good estimate of variance in the population?
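To show how these parameters combine, here is a rough sketch for one common case: comparing the means of two equal-sized groups with a two-sided test and a normal approximation. The formula and function name are assumptions for illustration and do not replace a consultation with a biostatistician.

```python
import math
from statistics import NormalDist

def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate sample size per group to detect a difference in
    means of `delta`, given a common standard deviation `sigma`.

    Uses n = 2 * sigma^2 * (z_(1-alpha/2) + z_(power))^2 / delta^2,
    rounded up to the next whole subject.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n = 2 * sigma ** 2 * (z_alpha + z_beta) ** 2 / delta ** 2
    return math.ceil(n)

# Hypothetical inputs: detect a 5-point difference when SD is 10.
print(n_per_group(delta=5, sigma=10))
# A difference twice as large needs about a quarter of the sample.
print(n_per_group(delta=10, sigma=10))
```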

Sample size (6) How sample size works—some examples

Sample size (7) [Figure: effect of sample size on power; distributions for Group A and Group B]

Sample size (8) [Figure: effect of sample size on power; distributions A and B]

Sample size (9) [Figure: effect of variability on power; distributions A and B]

Sample size (10) [Figure: effect of variability on power; distributions A and B]
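The relationships between sample size, variability, and power in these four slides can also be explored numerically. The sketch below assumes a two-sided comparison of two group means with equal group sizes and a normal approximation; it ignores the negligible chance of rejecting in the wrong direction. All inputs are hypothetical.

```python
import math
from statistics import NormalDist

def power_two_means(delta, sigma, n, alpha=0.05):
    """Approximate power to detect a difference in means of `delta`
    with n subjects per group and common standard deviation `sigma`."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    se_diff = sigma * math.sqrt(2 / n)  # standard error of the difference
    return NormalDist().cdf(abs(delta) / se_diff - z_alpha)

small_n = power_two_means(delta=5, sigma=10, n=30)
large_n = power_two_means(delta=5, sigma=10, n=63)
low_var = power_two_means(delta=5, sigma=5, n=30)

print(small_n, large_n, low_var)
```

Power rises when the sample grows (small_n to large_n) and when variability falls (small_n to low_var), because both shrink the overlap between the two groups' sampling distributions.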

Non-response (1) Very big issue Source of non-sampling error Can lead to bias, uninterpretability of results Violates whole point of probability sample, yet unavoidable

Non-response (2) Issue in probability as well as non-probability samples Exists on many levels

Non-response (3) Whole sample: Reached / Not reached

Non-response (4) Reached: Can participate / Cannot participate

Non-response (5) Reached: Enrolled / Refused

Non-response (6) Participated: Answer individual question / Did not answer individual question
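The levels of non-response in the last four slides form a cascade, and the loss at each level can be tracked as a rate. A minimal sketch with made-up counts and hypothetical names:

```python
def response_cascade(sampled, reached, able, enrolled, answered_item):
    """Rate retained at each level of the non-response cascade, plus
    the overall share of the sample that answered a given question."""
    stages = [
        ("reached", reached, sampled),
        ("able to participate", able, reached),
        ("enrolled", enrolled, able),
        ("answered item", answered_item, enrolled),
    ]
    rates = {name: count / base for name, count, base in stages}
    rates["overall"] = answered_item / sampled
    return rates

# Hypothetical survey: 1,000 sampled, 540 answered the question of interest.
rates = response_cascade(
    sampled=1000, reached=800, able=760, enrolled=570, answered_item=540
)
for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
```

Even with fairly high rates at every level, the overall rate is the product of all of them, which is why modest losses at each step add up to substantial non-response.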