The Weighting Strategy of the Canadian Community Health Survey Cathlin Sarafin Methodologist Statistics Canada March 25, 2008.



Outline Introduction Methodology The Canadian Community Health Survey (CCHS) The Multiple Frames The Weighting Strategy of the CCHS Methodology Recruitment Process

Introduction Methodology Structure:
You: Recruits are called Junior Methodologists
Your Unit: 2 to 7 Methodologists supervised by one Senior Methodologist
Your Section: 3 to 6 units working on related projects, managed by a Chief
Your Division: A division has roughly 100 people, usually all together on one floor of the building

Introduction Every person has their own responsibilities Senior Methodologist outlines tasks Discuss options and approaches as a team

Introduction Survey Methodology: Frame creation, Sampling, Questionnaire design, Data collection methods, Data processing, Edit and imputation, Weighting and estimation, Variance estimation, Data quality indicators, Record linkage, Time series, Data analysis, Disclosure control, Research and development

The CCHS Collects general health information on the Canadian population Estimates produced for more than 120 Health Regions (HRs) across Canada Produces estimates on: Health Risk Factors Health Status Health Care Services

The CCHS The CCHS was introduced in 2000 Data was collected every second year for a total sample size of 130,000 per year It was redesigned in 2007 Data is now collected continuously for a total sample size of ≈ 65,000 respondents per year Annual files are released Multi-year files will be produced starting in 2009

The CCHS is a cross-sectional survey: it surveys a specific population for a given period of time. By contrast, a longitudinal survey surveys a specific population repeatedly over time

The CCHS Target population: Individuals living in private dwellings aged 12 years old and over Exclusions: those living on Indian Reserves and Crown Lands, residents of institutions, full-time members of the Canadian Forces and residents of some remote areas CCHS covers ~98% of the Canadian population

The CCHS Has a complex, multi-stage, dual frame design Area frame (49%) Telephone list frame (50%) Random digit dialing (RDD) frame (1%) The telephone frames complement the area frame in most HRs

The Area Frame Units are geographical areas Target sampling units are not listed Based on Labour Force Survey (LFS) design 6 rotation groups Stratified probability proportional to size sample of clusters Systematic sample of dwellings Random selection of a start Probabilistic sample of one individual per household

The Area Frame: LFS Sample Selection [Diagram: Province XYZ divided into Stratum #1 and Stratum #2]
1. Each province is divided into geographic strata
2. Clusters selected within strata (PPS sampling): 1st stage
3. Dwellings selected within clusters (systematic sampling): 2nd stage
4. People selected within responding dwellings: 3rd stage
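The second stage above (a systematic sample of dwellings with a random start) can be sketched in a few lines. This is only an illustration: the function name, the dwelling labels and the sampling interval are hypothetical and do not come from the LFS or CCHS systems.

```python
import random


def systematic_sample(dwellings, interval, rng=None):
    """Select every `interval`-th dwelling after a random start.

    Hypothetical helper for illustration; `dwellings` is the ordered
    list of dwellings in one cluster.
    """
    rng = rng or random.Random()
    start = rng.randrange(interval)  # random start in [0, interval)
    return dwellings[start::interval]


# Illustrative cluster of 20 dwellings, sampling 1 in 5.
dwellings = [f"dwelling_{i}" for i in range(20)]
sample = systematic_sample(dwellings, interval=5, rng=random.Random(1))
```

Each possible start defines one distinct systematic sample, which is why the later slides speak of a "list of available starts" within a cluster.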

The Area Frame Why use such a design? Stratification: Better coverage of the entire region of interest Increases precision Clustering: Efficient for interviewing (less travel, less costly) Decreases precision

The Area Frame The CCHS selection process: The LFS provides a list of available starts (systematic samples) within each cluster The clusters are mapped to the CCHS HRs A random selection of starts is chosen within a HR Probabilistic sample of one individual per household

The Area Frame 2-phase sample 1st phase is the LFS sample of starts within the LFS strata 2nd phase is the CCHS sample of starts within the HRs

The Area Frame Why use the LFS? No adequate list of addresses available Costly to create and maintain such a frame LFS has good coverage of target population It is a monthly sample conducted at Statistics Canada Continually updated

The Telephone Frame List of telephone numbers from across Canada Created using InfoDirect © files Stratified by HR SRSWOR sample of phone numbers Probabilistic sample of one individual per household

The RDD Frame Phone numbers are grouped into banks Banks are assigned to a HR Computer randomly generates the last 2 numbers Probabilistic sample of one individual per household
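A minimal sketch of the RDD step, assuming a bank is the first 8 digits of a 10-digit telephone number so that the randomly generated "last 2 numbers" complete it. The bank value and function name below are hypothetical.

```python
import random


def generate_rdd_numbers(bank, count, rng=None):
    """Generate `count` distinct phone numbers within one bank.

    `bank` is assumed to be the first 8 digits of a 10-digit number;
    the last two digits are drawn at random without replacement.
    """
    rng = rng or random.Random()
    suffixes = rng.sample(range(100), count)  # 100 possible endings per bank
    return [f"{bank}{suffix:02d}" for suffix in suffixes]


# Hypothetical bank assigned to one HR.
numbers = generate_rdd_numbers("61355501", count=5, rng=random.Random(7))
```

Because every two-digit ending is eligible, many generated numbers will not be working residential lines, which is the inefficiency noted for the RDD frame later in the presentation.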

Dual Frame Design Multiple frames are used to: Improve the coverage of the target population Reduce costs Area Frame Covers target population Costly to implement Listing costs Face-to-face interview costs

Dual Frame Design Telephone Frame Only covers population with listed phone numbers Undercoverage may bias the estimates Growing problem with the increasing popularity of cell phones Less costly to implement Calls made from regional offices

Dual Frame Design RDD Frame Inefficient Results in a large amount of out-of-scope numbers Used alone for 2 northern regions LFS is not adequate for these 2 regions Used as a complement to the area frame in Whitehorse and Yellowknife Quality of telephone frame is considered poor in these regions

The Weighting Strategy of the CCHS
Area Frame: A0 - Initial weight; A1 - Sub-cluster adjustment; A2 - Stabilization; A3 - Out-of-scope dwellings; A4 - Household nonresponse
Telephone Frame: T0 - Initial weight; T1 - Number of collection periods; T2 - Out-of-scope numbers; T3 - Household nonresponse; T4 - Multiple phone lines
Combined Frame: I1 - Integration; I2 - Person selection; I3 - Person nonresponse; I4 - Winsorization; I5 - Calibration
Final CCHS Weight

Sampling Weights Number of people in the population represented by the interviewed person Ex: $w_i = 500$ Can be broken down into 3 major steps: Design weights Nonresponse adjustment Calibration

Design Weights Weights determined by the design of the survey They are the inverse of the inclusion probability A person selected according to a sampling fraction of 1% will have a weight of 1/0.01 = 100 The design weights in the CCHS are calculated separately for each frame Sampling fractions differ between HRs, therefore design weights are not uniform

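List Frame Design Weights The sample is stratified by HR, so weights are calculated within HR It is an SRSWOR of phone numbers Probability of selection within HR g is, for an SRSWOR, $\pi_g = n_g / N_g$, where $n_g$ is the number of phone numbers sampled and $N_g$ the number of phone numbers on the list frame in HR g; the design weight is the inverse, $N_g / n_g$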
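As a small worked sketch of the list frame design weights: under SRSWOR within each HR, the weight is the number of phone numbers on the frame divided by the number sampled. The HR labels and counts below are invented for illustration.

```python
def list_frame_design_weights(frame_counts, sample_counts):
    """Return the SRSWOR design weight N_g / n_g for each HR g.

    Both arguments are dicts keyed by HR; names and values here are
    hypothetical, not CCHS figures.
    """
    weights = {}
    for hr, n_frame in frame_counts.items():
        n_sample = sample_counts[hr]
        if not 0 < n_sample <= n_frame:
            raise ValueError(f"invalid sample size for {hr}")
        weights[hr] = n_frame / n_sample  # inverse of inclusion probability
    return weights


frame_counts = {"HR_A": 50_000, "HR_B": 12_000}
sample_counts = {"HR_A": 500, "HR_B": 400}
weights = list_frame_design_weights(frame_counts, sample_counts)
```

With these numbers HR_A has a 1% sampling fraction and a weight of 100, while HR_B is sampled more heavily and gets a weight of 30, which shows why design weights are not uniform across HRs.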

Area Frame Design Weights The LFS is redesigned every 10 years A 20-year sample plan is created The LFS provides a list of available starts Typically consists of 40 columns and 6 rows per LFS stratum Each row represents a rotation group Each column represents a monthly LFS sample

Area Frame Design Weights [Table: one LFS stratum; each row is a rotation group and each column lists a cluster and a start; one column is one LFS sample]

Area Frame Design Weights The LFS provides a weight for one LFS sample A weight for every start in one column This weight is used to assign a weight to all available starts The weights are then redistributed to the CCHS selected starts within each HR

Nonresponse Adjustments The design weights are corrected for total nonresponse (NR) All the variables for the respondent are missing Complete refusal Unable to contact the respondent Respondent absent for the duration of the survey Language barrier Information obtained is unusable

Nonresponse Adjustments There are 2 types of NR in the CCHS Household level Person level The weights of the nonrespondents have to be redistributed to the respondents Form groups based on auxiliary information

NR Adjustments There are several methods available for the creation of response homogeneity groups (RHGs) The CCHS uses the scoring method Logistic regression is used to obtain a probability of response ($\hat{p}$) for every unit Groups are formed based on the values of $\hat{p}$

NR Adjustments Logistic Regression Models Variables include geographic information, process data and socio-economic indicators Variables derived from process data include: Number of attempts Time/day of attempt Called on weekday/weekend

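NR Adjustments Initial groups are formed using a clustering algorithm in SAS These groups are then collapsed to ensure: A response rate of at least 50% At least 20 observations The adjustment within each RHG is, in its standard form, the sum of the weights of all units in the group divided by the sum of the weights of the respondents in the group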
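The redistribution of nonrespondent weights within one RHG can be sketched as below. This assumes the standard ratio adjustment (total weight over respondent weight) and an unweighted response rate for the collapsing check; neither detail is confirmed by the slides, and all names are hypothetical.

```python
def nonresponse_adjustment(units):
    """Adjustment factor for one RHG.

    `units` is a list of (weight, responded) pairs. The factor is the
    sum of all weights divided by the sum of respondent weights.
    """
    total = sum(weight for weight, _ in units)
    responding = sum(weight for weight, responded in units if responded)
    if responding == 0:
        raise ValueError("RHG has no respondents; collapse it first")
    return total / responding


def needs_collapsing(units, min_obs=20, min_rate=0.5):
    """True if the RHG fails the size or response-rate criteria."""
    n = len(units)
    n_resp = sum(1 for _, responded in units if responded)
    return n == 0 or n < min_obs or n_resp / n < min_rate


# Illustrative RHG: 15 respondents and 10 nonrespondents, weight 100 each.
rhg = [(100.0, True)] * 15 + [(100.0, False)] * 10
factor = nonresponse_adjustment(rhg)
adjusted = [weight * factor for weight, responded in rhg if responded]
```

After the adjustment, the respondents' weights sum to the original total for the group, so the nonrespondents' share of the population is carried by respondents with similar response propensity.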

Integration of Frames Area Frame Telephone Frame No phone line Unlisted phone number Listed phone number

Integration of Frames Area Frame Population = A Sample = S A Telephone Frame Population = B Sample = S B

Integration Integration factor: A number between 0 and 1 For CCHS it is based on sample size

Integration Parameter of interest: Unbiased estimates

Integration Composite estimation
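The equations for the two slides above are not in the transcript. One common way to write a dual-frame composite estimator that is consistent with the notation used here (area frame population A, telephone frame population B, an integration factor between 0 and 1) is sketched below. It assumes B is contained in A, and the symbols $\alpha$, $Y$ and the superscripts are illustrative choices, not necessarily those of the original slides.

```latex
% Parameter of interest: the total splits into the part covered only by
% the area frame and the part covered by both frames (B assumed inside A).
Y = Y_{A \setminus B} + Y_{B}

% Composite estimation: the overlap total is estimated from both samples
% and the two estimates are combined with the integration factor \alpha.
\hat{Y} = \hat{Y}^{S_A}_{A \setminus B}
        + \alpha \, \hat{Y}^{S_A}_{B}
        + (1 - \alpha) \, \hat{Y}^{S_B}_{B},
\qquad 0 \le \alpha \le 1
```

If each component is unbiased for its own total, the combined estimator is unbiased for any fixed $\alpha$; choosing $\alpha$ from the relative sample sizes of the two frames, as the CCHS is said to do, affects only the variance.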

Integration of Frames Possible to integrate only the overlapping populations covered by the 2 frames Problem identifying the overlapping portion for the area frame due to nonresponse Possible to impute these cases

Integration of Frames [Diagram: area frame and telephone frame samples, with regions labelled $S_A$, $S_{AB}$, $S_{AU}$ and $S_B$]

Integration of Frames Logistic regression is used to assign a probability of belonging to the non-common part of $S_A$ The final integration method is

Calibration Weights are adjusted to match population projection counts Based on the Census Adjusted to account for births, deaths, immigration and emigration The rounded average of the monthly projection counts is used within each post-stratum

Calibration Why is calibration used? Gives confidence when estimating totals Improves precision of the estimates If auxiliary variables are well correlated to the survey variables Adjusts for coverage inadequacies when the survey population differs from the target population

Calibration In the CCHS All post-strata with at least 20 observations are calibrated at the HR by age by sex level HR: 120 across Canada Age groups: 12-19, 20-29, 30-44, 45-64 and 65+ Sex: Male and Female
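Calibration to projection counts within post-strata can be sketched as a simple ratio adjustment (post-stratification). This is a simplified illustration that assumes every post-stratum already has enough observations; the record layout, labels and counts are hypothetical.

```python
from collections import defaultdict


def poststratify(records, projections):
    """Scale weights so each post-stratum sums to its projection count.

    `records` is a list of dicts with 'post_stratum' and 'weight';
    `projections` maps each post-stratum to its population count.
    """
    totals = defaultdict(float)
    for record in records:
        totals[record["post_stratum"]] += record["weight"]
    return [
        {
            **record,
            "weight": record["weight"]
            * projections[record["post_stratum"]]
            / totals[record["post_stratum"]],
        }
        for record in records
    ]


# Post-strata keyed by (HR, age group, sex); values are illustrative.
records = [
    {"post_stratum": ("HR2", "12-19", "F"), "weight": 100.0},
    {"post_stratum": ("HR2", "12-19", "F"), "weight": 150.0},
    {"post_stratum": ("HR2", "12-19", "M"), "weight": 200.0},
]
projections = {("HR2", "12-19", "F"): 300.0, ("HR2", "12-19", "M"): 180.0}
calibrated = poststratify(records, projections)
```

In this toy case the female weights are scaled up by 1.2 and the male weight down by 0.9, so the calibrated weights reproduce the projection counts exactly.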

Calibration Example: HR 2 [Table: number of observations by age group for females and males; post-strata shown as HR by age by sex, HR by sex, and Prov by age by sex]

Final Weights Master: Contains all variables for all respondents Share: Contains all variables for the subset of people who agreed to share (subset of records) PUMF: Contains a subset of variables for all respondents (subset of variables) Dummy: Contains a subset of records from the master file. Scrambled data used for testing and remote access purposes Bootstrap: Created for variance estimation purposes Special Requests: linkage, different geographies, etc.

Methodology Typical tasks: Write computer programs to solve problems or explore data Attend meetings Write documentation Present our work at seminars Work on different committees

Methodology Working Conditions Permanent job Continuous learning: Computer courses Statistics and methodology courses Language courses Seminars, conferences and publications

Methodology All methodologists work at the Head Office in Ottawa

Recruitment Our recruitment campaign takes place each fall Detailed presentations at the universities by early October It is a 3-step process: On-line application Starts in September Deadline in mid-October Written Exam Early November Interview January

Recruitment Who can apply? Persons residing in Canada and Canadian citizens residing abroad Preference will be given to Canadian citizens Bilingualism No preference is given to those who speak both English and French

For more information please contact Under: About Us Employment opportunities Mathematical statisticians (MA) Telephone:

Thank you Canadian Community Health Survey