Social Statistics Estimation and complex survey design Ian Plewis, CCSR, University of Manchester.


PARAMETER: population mean, probability, variance etc.: μ, π, σ²
ESTIMATOR: sample mean, proportion, variance etc.: x̄, p, s²
ESTIMATE: usually a number calculated from the observed data.
We combine the estimate and its standard error to make inferences about the parameter.

With simple random sampling, the standard error of x̄ is σ/√n, usually estimated by s/√n, where n is the sample size. The standard error of p is usually estimated by √(p(1 − p)/n).
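As a minimal sketch of the two simple random sampling formulas, s/√n for a mean and √(p(1 − p)/n) for a proportion, the following Python functions compute the estimated standard errors. The function names and the data values are made up for illustration, and no finite population correction is applied.

```python
import math


def se_mean(values):
    """Estimated standard error of the sample mean under SRS: s / sqrt(n)."""
    n = len(values)
    mean = sum(values) / n
    # Sample variance s^2, using the n - 1 divisor.
    s2 = sum((v - mean) ** 2 for v in values) / (n - 1)
    return math.sqrt(s2 / n)


def se_proportion(p, n):
    """Estimated standard error of a sample proportion: sqrt(p(1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)


sample = [2.0, 4.0, 4.0, 6.0]  # hypothetical observations
print(se_mean(sample))
print(se_proportion(0.5, 100))
```

Both functions assume independent observations with equal selection probabilities, so they are not appropriate as they stand for a clustered or disproportionately stratified design.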

MILLENNIUM COHORT STUDY
Stratification
The population was stratified by UK country: England, Wales, Scotland and Northern Ireland. For England, the population was then stratified, via the stratification of electoral wards extant on 1 April 1998, into three strata:
1) The 'ethnic minority' stratum: children living in wards which, in the 1991 Census of Population, had an ethnic minority indicator of at least 30%.
2) The 'disadvantaged' stratum: children living in wards, other than those falling into stratum (1) above, which fell into the upper quartile (i.e. the poorest 25% of wards) of the ward-based Child Poverty Index (CPI).
3) The 'advantaged' stratum: children living in wards, other than those falling into stratum (1) above, which were not in the top quartile of the CPI. Advantaged is therefore a relative term in this context.
For Wales, Scotland and Northern Ireland, there were just two strata: disadvantaged and advantaged.

Clustering
The wish to bring the broader socio-economic context into the analysis, particularly as represented by the areas or local neighbourhoods that the children live in, and the need to keep field costs down, led to the decision to cluster the sample. Moreover, the chosen method of stratification, by characteristics of electoral wards, meant that using wards, rather than alternative geographical aggregates such as postcode sectors, was the most appropriate way to implement the clustering. In addition, the issues of measuring local context and reducing fieldwork costs pointed to the advantages of including all births in selected wards in the sample, rather than sub-sampling within wards.

The sample is a disproportionately stratified cluster sample. The disproportionality means that the sample is not self-weighting and so weighted estimates of means, variances etc. are needed. The clustering implies that observations are not independent and so allowance must be made for the dependence so induced when sampling errors are computed. It was likely that the design effects for the sample would be greater than one. In other words, the sample would be somewhat less precise than a simple random sample of the same size would have been, although this depends on how far the gains from stratification and systematic selection are offset by the losses from clustering which, in turn, would vary across measures.

Table: Sampling Fractions and Weights across Strata

The weights should be applied when estimating a mean for the sample, say. In other words, the weighted mean for variable y is:
$$\bar{y}_w = \frac{\sum_h w_h \sum_{i=1}^{x_h} y_{hi}}{\sum_h w_h x_h}$$
where i (i = 1..$x_h$) indexes the elements in stratum h, so $x_h$ is the sample size for stratum h. The appropriate weights $w_h$ from the table should be used.
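The weighted mean can be sketched in a few lines of Python, assuming each stratum h carries a single weight w_h applied to all of its sampled elements. The function name, the weights and the data below are hypothetical, not the MCS values.

```python
def weighted_mean(strata):
    """Stratum-weighted mean.

    strata: list of (w_h, values) pairs, one pair per stratum h, where
    values holds the observed y for the sampled elements of that stratum.
    """
    # Numerator: sum over strata of w_h times the stratum total of y.
    numerator = sum(w * sum(values) for w, values in strata)
    # Denominator: sum over strata of w_h times the stratum sample size.
    denominator = sum(w * len(values) for w, values in strata)
    return numerator / denominator


# Hypothetical example: an under-sampled stratum (weight 2.0) and an
# over-sampled stratum (weight 0.5).
strata = [(2.0, [1.0, 3.0]), (0.5, [10.0, 10.0, 10.0, 10.0])]
print(weighted_mean(strata))
```

With equal weights in every stratum the function reduces to the ordinary unweighted mean, which is one quick way to check it.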

Table: Sampling Errors, Proportion of Cohort Members Regarded as Non-white by Stratum and Country

Table: Sampling Errors, Proportion of Natural Mothers who do not have a Longstanding Illness by Stratum and Country

Table: Sampling Errors, Family Income, Natural Mothers by Stratum and Country

Representation of design effects in Stata
MEFF: ratio of the variance allowing for all aspects of the study design, including all weights, to the variance if the sample were a simple random sample (SRS) of the same size.
MEFT: √(MEFF)
DEFF: ratio of the variance allowing for all aspects of the sample design to the variance if the sample were an SRS of the same size.
DEFT: √(DEFF)
See Kreuter, F. and Valliant, R. (2007) Stata Journal, 7, 1-21.
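The arithmetic behind DEFF and DEFT can be illustrated with a short Python sketch. It assumes a design-based standard error has already been obtained from survey software; the numbers and function names are hypothetical, and this is not how Stata computes these quantities internally.

```python
import math


def srs_variance_of_proportion(p, n):
    """Variance of a proportion if the sample were an SRS of size n."""
    return p * (1 - p) / n


def design_effects(var_design, var_srs):
    """Return (DEFF, DEFT): the variance ratio and its square root."""
    deff = var_design / var_srs
    deft = math.sqrt(deff)
    return deff, deft


# Hypothetical inputs: an estimated proportion of 0.2 from n = 1600 cases,
# with a design-based standard error of 0.015.
p, n, se_design = 0.2, 1600, 0.015
deff, deft = design_effects(se_design ** 2, srs_variance_of_proportion(p, n))
print(deff, deft)
```

DEFT is a ratio of standard errors, so it can be read directly as the factor by which a confidence interval widens relative to simple random sampling.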

Types of populations
The MCS population definition is a finite one; we could in principle count all its members. The distinction between a finite and an infinite population is an important one for statistical inference and analysis. Finite populations are often regarded as samples from an infinite or super-population or universe. Inferences about finite populations are essentially descriptive. They are widely used in sample surveys, especially by official statisticians. A perfect Census of Population, in these circumstances, is not a sample: the resulting inferences are made with complete certainty. Descriptive inferences from finite populations are often known as design-based inferences.

Often, we are more interested in analytic or model-based inferences. We want to test a hypothesis, for example, or we want to estimate a relation between two or more variables. We are then more interested in an infinite or super-population:
As a basis for scientific generalizations and decisions, a census is only a sample…A census describes a population that is subject to the variations of chance because it is only one of the many possible populations that might have resulted from the same underlying system of social and economic causes. A sample enquiry is then a sample of a sample, and a so-called 100% sample is simply a larger sample, but is still only a sample. (quotes from Deming and Stephan, 1941, p. 48)

Suppose we have data obtained by measuring the elements.. in a probability sample.. drawn from the finite population U. Two important types of inference are as follows:
a. Inference about the finite population U itself.
b. Inference about a model or a superpopulation thought to have generated U.
(Särndal et al., 1992)
Provided we have selected our sample from the finite population using probability sampling as the selection mechanism (the simplest case of which is simple random sampling), we can rely on the idea of repeatedly sampling in the same way from the population to generate what is known as a randomisation distribution, which we can use to make inferences about a parameter of interest. A probability sample is one in which each member of the finite population has a known and non-zero chance of being selected into the sample. In simple random sampling, these known selection probabilities are equal.
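The idea of a randomisation distribution can be sketched by simulation: draw many simple random samples, without replacement, from one fixed finite population and look at how the sample mean varies across draws. The population below (the integers 1 to 100), the sample size and the number of draws are all hypothetical choices for illustration.

```python
import random
import statistics


def randomisation_distribution(population, n, draws, seed=0):
    """Sample means from repeated simple random samples of size n.

    Each draw samples without replacement, so every member of the finite
    population has the same known selection probability n / N.
    """
    rng = random.Random(seed)
    return [statistics.mean(rng.sample(population, n)) for _ in range(draws)]


population = list(range(1, 101))  # hypothetical finite population U, N = 100
means = randomisation_distribution(population, n=10, draws=2000)

# The centre of the distribution should be close to the population mean
# (50.5), and its spread approximates the standard error of the sample mean.
print(statistics.mean(means))
print(statistics.stdev(means))
```

The population values never change between draws; all of the variation comes from the selection mechanism, which is what makes the resulting inference design-based.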

A probability sample is not essential for model-based inference. Instead, we rely on being able to define the statistical or probability model that generated the data, and then use the likelihood of the parameter given the model and the data for estimation. On the other hand, a probability sample from a finite population guards against bias and is usually regarded as desirable. Finite populations can be defined unambiguously. Super-populations, although conceptually important, are abstractions that are less easily defined. They will usually extend the finite population across space and time.

Classic reference: Kish, L. (1965) Survey Sampling. New York: Wiley.
More recent reference: Särndal, C.-E., Swensson, B. and Wretman, J. (1992) Model Assisted Survey Sampling. New York: Springer.