Canadian Community Health Survey Cycle 1.1 Overview of methodological issues and more...


Presentation Outline
- Sample Design
  - Target population, sample allocation and frames
  - Sampling strategies, oversampling of sub-populations
  - Data collection, response rates
  - Imputation
  - Weighting
- Sampling error
  - Sampling variability guidelines
  - Variance estimation: Bootstrap re-sampling technique
  - CV look-up tables
- Analysis
  - Examples
  - How to use the Bootvar programs

CCHS - Cycle 1.1 Health Region-level survey
- Main objective
  - Produce timely cross-sectional estimates for 136 health regions
- Target population
  - Individuals aged 12 or older living in private occupied dwellings
  - Exclusions: those living on Indian Reserves and Crown Lands, residents of institutions, full-time members of the Canadian Armed Forces and residents of some remote areas
- CCHS 1.1 covers ~98% of the Canadian population

CCHS - Sample Allocation to Provinces

Prov   Pop Size   # of HRs   1st Step (500/HR)   2nd Step (X-prop)   Total Sample
NFLD      551K      6*              2,780               1,230             4,010
PEI       135K      2               1,000               1,000             2,000
NS        909K      6               3,000               2,040             5,040
NB        738K      7               3,500               1,650             5,150
QUE     7,139K     16               8,000              16,280            24,280
ONT    10,714K     37              18,500              23,760            42,260
MAN     1,114K     11               5,500               2,500             8,000
SASK      990K     11*              5,400               2,320             7,720
ALB     2,697K     17*              8,150               6,050            14,200
BC      3,725K     20              10,000               8,090            18,090
CAN    29,000K    133              65,830              64,920           130,750

* The sampling fraction in some small HRs was capped at 1 in 20 households

CCHS - Sample Allocation to Health Regions

Size class   Pop. Size Range     # of HRs   Mean Sample Size
Small        less than 75,
Medium       75, ,
Large        240, , ,500
X-Large      640,000 and more    7          2,500

CCHS - Sample Allocation to Territories

Territory   Population   Sample
Yukon       25,
NWT         36,
Nunavut     22,000       800

CCHS - Sample Frame
- CCHS sample selected from three frames:
  - Area frame (Labour Force Survey structure)
  - RDD frame of telephone numbers (Random Digit Dialling)
  - List frame of telephone numbers
Three frames are needed for CCHS for the following reasons:
1. To yield the desired sample sizes in all health regions
2. To have a telephone data collection structure in place to quickly address provincial/regional requests for buy-in sample and/or content at any point in time
3. To optimize collection costs

Area frame - Sampling of households
- 83% of CCHS sampled households
- Multistage stratified cluster sample design
  1. Each health region is divided into strata
  2. Clusters selected within strata (PPS sampling) (1st stage)
  3. Dwellings selected within clusters (2nd stage)

RDD frame of telephone numbers - Sampling of households
- Elimination of non-working banks method
  - 7% of CCHS sampled households
  - Telephone bank: area code + first 5 digits of a 7-digit phone number
1. Keep the banks with at least one valid phone number
2. Group the banks to encompass as closely as possible the health region areas (RDD strata)
3. Within each RDD stratum, first select one bank at random and then generate at random one number between 00 and 99
4. Repeat the process until the required number of telephone numbers within the RDD stratum is reached
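The RDD selection loop can be sketched in Python. This is only an illustration: the function name `draw_rdd_numbers`, the bank values and the seed are hypothetical, the two-digit suffix range 00-99 is inferred from a bank being the first 5 of 7 local digits, and drawing distinct numbers is an assumption rather than something the slide states.

```python
import random

def draw_rdd_numbers(banks, n_numbers, seed=None):
    """Draw n_numbers distinct phone numbers from the working banks of one RDD stratum.

    banks: strings made of area code + first 5 digits of the local number.
    """
    if n_numbers > 100 * len(set(banks)):
        raise ValueError("not enough possible numbers in these banks")
    rng = random.Random(seed)
    drawn = set()
    while len(drawn) < n_numbers:            # repeat until the target is reached
        bank = rng.choice(banks)             # select one bank at random
        suffix = rng.randint(0, 99)          # random final two digits, 00 to 99
        drawn.add(f"{bank}{suffix:02d}")     # assumption: duplicates are redrawn
    return sorted(drawn)

# Hypothetical banks for one stratum (area code 613, local prefix 555-01, etc.)
stratum_banks = ["61355501", "61355502", "61355517"]
sample = draw_rdd_numbers(stratum_banks, 5, seed=1)
print(sample)
```

Numbers generated this way still have to be dialled to find out whether they reach a household, which is why the collection plan quotes a telephone hit rate.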

List frame of telephone numbers - Sampling of households
- Simple random sample of telephone numbers
  - 10% of CCHS sampled households
  - Telephone companies' billing address files and Telephone Infobase (repository of phone directories)
1. Create a list of phone numbers
2. Stratify the phone numbers by health region using the residential postal codes
3. Select phone numbers at random within a health region
4. Repeat the process until the required number of telephone numbers is reached

CCHS - Sampling of persons
- Area frame
  - Simple random sample (SRS) of one person aged 12 or older (82% of households)
  - SRS of two persons aged 12 or older (18% of households)
- RDD / List frames
  - SRS of one person aged 12 or older

CCHS - Sampling of persons
[Table: distribution by age group in the 1996 Census, the LFS sample (all persons) and the simulated CCHS sample* (only 1 person); values not preserved]
* averaged distribution over 100 repetitions using the May 99 LFS sample

CCHS - Representativity of sub-populations
To address users' needs, two sub-population groups needed larger effective sample sizes:
- Youths (12-19 years old)
  - Decision: oversample youths by selecting a second person (12-19) in some households, based on their composition
- Elderly (65 years old and over)
  - Decision: do not oversample; let the general sample selection process address the issue by itself

Sampling strategy based on household composition
Columns: number of persons aged 20 or over. Rows: number of persons aged 12-19.

0     -  A  A  A  A  B
1     A  A  C  C  C  B
2     A  C  C  C  C  C
3+    A  C  C  C  C  C

A: Simple random sample (SRS) of one person aged 12+
B: SRS of two persons aged 12+
C: SRS of one person in the 12-19 age group and SRS of one person 20+

CCHS - Sample Distribution after Oversampling
[Table: distribution by age group in the 1996 Census, the simulated CCHS sample* (only 1 person) and the simulated CCHS sample* (some 2 persons); values not preserved]
* averaged distribution over 100 repetitions using the May 99 LFS sample

CCHS - Initial data collection plan
- 12 monthly samples
- 12 collection months + 1: (09 / / 2001) + 09 / 2001
- Area frame
  - CAPI
  - STC field interviewers
  - targeted response rate: 90%
  - anticipated vacancy rate: 13%
- RDD / List frames
  - CATI
  - STC call centres
  - targeted response rate: 85%
  - telephone hit rate: 15-60%

CCHS data collection - Observed situation
- Field interviewers
  - workload exceeded field staff capacity
- Call centres
  - new collection infrastructure
  - unequal allocation of work among call centres
- Descriptive paper:
  - "Preventing nonresponse in the Canadian Community Health Survey", Y. Béland, J. Dufour, and M. Hamel. 2001, Hull, Statistics Canada XVIIIth International Symposium.

CCHS - Final response rates
[Table: response rates for field, call centres and total, by province and territory (NFLD, PEI, NS, NB, QUE, ONT, MAN, SASK, ALB, BC, YUK, NWT, NUN) and for Canada; values not preserved]

CCHS - Proxy interviews
- Definition: when another person in the household responds to the survey on behalf of the selected person in the sample
- Acceptable in the following cases:
  - Out of the country for a long period of time
  - Mental or physical state of health
  - Language barrier
- Usually, 2 to 3% of proxy respondents
- Because of the field problems: 6.3%
  - Higher rate in some health regions, for men and younger respondents
- Major consequence: one third of the questionnaire is missing
  - Personal or sensitive types of questions not asked
- Solution: imputation

Modules for proxy and non-proxy
- Alcohol
- Chronic condition
- Exposure to second hand smoke
- Food insecurity
- General health (Q1, Q2 and Q7)
- Health care utilization
- Health Utility Index (HUI)
- Height / Weight (Q2 and Q3)
- Injuries
- Restriction of activities
- Smoking
- Tobacco alternatives
- Two-week disability
- Household composition & housing
- Income
- Labour force
- Socio-demographic characteristics
- Administration
- Drug use (optional)
- Home care (optional)

Modules for non-proxy only
- Alcohol dependence / abuse
- Blood pressure check
- Breastfeeding
- Contacts with mental health professionals
- Mammography
- Fruit & vegetable consumption
- General health (Q3-Q6, Q8-Q10)
- Height / Weight (Q4 only)
- PAP smear test
- PSA test
- Physical activities
- Patient Satisfaction**
- Breast examinations
- Breast self examinations
- Changes made to improve health
- Depression
- Dental visits
- Distress
- Driving under influence
- Eye examinations
- Flu shots*
- Mastery
- Mood
- Physical check-up
- Sedentary activities
- Self-esteem
- Sexual behaviours
- Smoking cessation aids
- Social support
- Spirituality
- Suicidal thoughts and attempts
- Use of protective equipment
- Work stress

Imputation Strategy - 5 passes
- 1st pass: Health prevention modules
  - 3 modules imputed completely
  - 6 modules imputed partially (some questions only)
  - 2 modules not imputed
- 2nd pass: Mental health modules
  - 6 modules imputed completely
  - 7 modules not imputed
- 3rd pass: Sexual behaviours
- 4th pass: Fruit and vegetable consumption
- 5th pass: one question in the module height and weight
Note that some modules and/or questions are not imputed (such as physical activity, distress, work stress, time since last flu shot, etc.)

Imputation Strategy
- Strategy applied at each imputation pass
1. Create imputation classes
   - Usually: Province x Sex x Age groups x Filters
   - The donor has to be in the same imputation class as the recipient
   - Minimum donor rule: donors / (donors + recipients) >= 60%
2. Identify a list of matching variables
3. Assign a weight to each matching variable
   - Default weight = 1, sometimes weight = 2, 3 or more
4. Find the nearest donor
   - Highest total weighted match
   - If more than one possible donor, select one randomly from them
   - No imputation if no donor over a minimum number of matches
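A minimal Python sketch of the nearest-donor search may help make these four steps concrete. Everything named here is hypothetical (the record layout, `imp_class`, the matching variables and their weights), and the "minimum number of matches" is modelled as a minimum weighted score, which is an assumption.

```python
import random

def donor_rule_ok(n_donors, n_recipients, threshold=0.60):
    """Minimum donor rule: donors / (donors + recipients) >= 60%."""
    return n_donors / (n_donors + n_recipients) >= threshold

def pick_donor(recipient, donors, match_weights, min_score, rng):
    """Return the donor with the highest total weighted match, or None."""
    def score(donor):
        # Add the weight of every matching variable on which both records agree
        return sum(w for var, w in match_weights.items()
                   if donor[var] == recipient[var])

    # Donors must come from the recipient's imputation class
    same_class = [d for d in donors if d["imp_class"] == recipient["imp_class"]]
    if not same_class:
        return None
    best = max(score(d) for d in same_class)
    if best < min_score:
        return None                      # no imputation: match is too weak
    # Ties between equally good donors are broken at random
    return rng.choice([d for d in same_class if score(d) == best])

donors = [
    {"id": "d1", "imp_class": "ON-F-20_29", "smoker": 1, "bmi_cat": 2, "flu_shot": "yes"},
    {"id": "d2", "imp_class": "ON-F-20_29", "smoker": 0, "bmi_cat": 2, "flu_shot": "no"},
    {"id": "d3", "imp_class": "QC-F-20_29", "smoker": 1, "bmi_cat": 2, "flu_shot": "no"},
]
recipient = {"id": "r1", "imp_class": "ON-F-20_29", "smoker": 1, "bmi_cat": 2}
match_weights = {"smoker": 2, "bmi_cat": 1}   # default weight 1, sometimes more

donor = pick_donor(recipient, donors, match_weights, min_score=2, rng=random.Random(0))
if donor is not None and donor_rule_ok(n_donors=2, n_recipients=1):
    recipient["flu_shot"] = donor["flu_shot"]   # copy the missing answer
print(recipient)
```

Here `d1` agrees with the recipient on both matching variables (score 3) while `d2` agrees on one (score 1), so `d1` is the donor; `d3` is never considered because it belongs to another imputation class.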

CCHS - Weighting and Estimation
- Estimation relates the sample back to the population
- Weights MUST be used in the calculation of estimates to correctly draw conclusions about the population of interest
- The sampling weight is related to the probability of selecting a person in the sample
- Persons are selected with unequal probabilities and therefore have varying weights

CCHS - Weighting and Estimation
- Three separate weighting systems:
  - Area frame design
  - RDD frame design
  - List frame design
- Several adjustments
  - non-response (household and person)
  - seasonal factor
  - etc.
- Integration of the two weighting systems based on design effects and sample sizes (n / deff)
- Calibration using a one-dimensional poststratification adjustment of ten age/sex poststrata within each health region
- Variance estimation: bootstrap re-sampling approach
  - set of 500 bootstrap weights for each individual

Weighting & Estimation
- Initial weight: inverse of the probability of being selected

Weighting & Estimation
- Household nonresponse: distribute the weight of nonresponding households to responding ones
  - Using nonresponse classes (such as HR, collection period and rural/urban)
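As a rough illustration of this adjustment, the sketch below redistributes the weight of nonresponding households to respondents within the same nonresponse class. The field names (`nr_class`, `responded`, `weight`) and the toy weights are hypothetical; the actual classes used for CCHS are only described in general terms on the slide.

```python
from collections import defaultdict

def adjust_for_nonresponse(households):
    """Inflate respondent weights so each class keeps its total weight."""
    total = defaultdict(float)       # weight of all sampled households, by class
    responding = defaultdict(float)  # weight of responding households, by class
    for h in households:
        total[h["nr_class"]] += h["weight"]
        if h["responded"]:
            responding[h["nr_class"]] += h["weight"]

    adjusted = []
    for h in households:
        if h["responded"]:
            factor = total[h["nr_class"]] / responding[h["nr_class"]]
            adjusted.append({**h, "weight": h["weight"] * factor})
    return adjusted                  # nonrespondents are dropped

households = [
    {"id": 1, "nr_class": "HR1-urban", "responded": True,  "weight": 100.0},
    {"id": 2, "nr_class": "HR1-urban", "responded": True,  "weight": 100.0},
    {"id": 3, "nr_class": "HR1-urban", "responded": False, "weight": 50.0},
    {"id": 4, "nr_class": "HR1-rural", "responded": True,  "weight": 80.0},
    {"id": 5, "nr_class": "HR1-rural", "responded": False, "weight": 80.0},
]
adjusted = adjust_for_nonresponse(households)
print([(h["id"], h["weight"]) for h in adjusted])
```

The total weight is the same before and after the adjustment (410 in this toy example); only its distribution among responding households changes.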

Weighting & Estimation
- No phone lines: households without a phone line are not covered. Weights are "boosted" by a certain rate (specific to each HR)
  - Rates of "no phone lines" calculated using area frame data

Weighting & Estimation
- Number of people in household: convert the household-level weight into a person-level weight (multiply by the number of people)
  - Depends on the number of people selected (1 or 2), and their age

Weighting & Estimation
- Person-level nonresponse: redistribute the weight of selected persons who did not respond to those who responded
  - Using classes (age, sex, number of persons selected, collection period, etc.)

Weighting & Estimation
- Multiple phone lines: more phone lines = higher probability of being selected
  - weight divided by the number of residential phone lines

Weighting & Estimation
- Final weight: each frame's final weights are representative of the total population. To create a single set of weights, they are combined through "Integration"

Weighting & Estimation
- Integration: combine the 2 sets of weights into one single set of weights
  - Based on sample size and design effect of each frame
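One common way to combine two sets of weights that each represent the whole population is to scale them by complementary factors proportional to each frame's effective sample size, n / deff. The sketch below assumes that form; the slides do not give the exact CCHS formula, and the sample sizes, design effects and weights are made up.

```python
def integration_factor(n_a, deff_a, n_b, deff_b):
    """Share given to frame A, proportional to its effective sample size n / deff."""
    eff_a = n_a / deff_a
    eff_b = n_b / deff_b
    return eff_a / (eff_a + eff_b)

# Hypothetical frames: A = area frame, B = telephone frame
alpha = integration_factor(n_a=800, deff_a=2.0, n_b=200, deff_b=1.0)

area_weights = [300.0, 450.0, 250.0]   # sums to a population of 1,000
phone_weights = [600.0, 400.0]         # also sums to 1,000

# Assumed integration rule: frame A weights times alpha, frame B times (1 - alpha)
integrated = ([w * alpha for w in area_weights]
              + [w * (1 - alpha) for w in phone_weights])
print(round(alpha, 3), round(sum(integrated), 1))
```

With these numbers the area frame has an effective size of 400 against 200 for the telephone frame, so it receives two thirds of the combined weight, and the integrated weights still sum to the population total.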

Weighting & Estimation
- Seasonal effect: adjust weights so that each season contains 25% of the total population
  - Based on the collection period (Sept-Nov / Dec-Feb / Mar-May / June-Aug)

Weighting & Estimation
- Post-stratification: ensure the sum of weights matches the estimated population projections in each HR, for 10 age-sex groups
  - 12-19, 20-29, 30-44, 45-64 and 65+ crossed with two sexes
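Post-stratification amounts to a ratio adjustment within each poststratum. The sketch below is a minimal illustration with hypothetical records and population counts; the poststratum key (health region, sex, age group) follows the description on the slide.

```python
from collections import defaultdict

def poststratify(records, pop_counts):
    """Scale weights so they sum to the population count of each poststratum."""
    weight_sums = defaultdict(float)
    for r in records:
        weight_sums[r["poststratum"]] += r["weight"]
    return [
        {**r, "weight": r["weight"] * pop_counts[r["poststratum"]]
                        / weight_sums[r["poststratum"]]}
        for r in records
    ]

# Hypothetical respondents and population projections
records = [
    {"id": 1, "poststratum": ("HR1", "M", "12-19"), "weight": 40.0},
    {"id": 2, "poststratum": ("HR1", "M", "12-19"), "weight": 60.0},
    {"id": 3, "poststratum": ("HR1", "F", "65+"),   "weight": 50.0},
]
pop_counts = {("HR1", "M", "12-19"): 120.0, ("HR1", "F", "65+"): 45.0}

calibrated = poststratify(records, pop_counts)
print([(r["id"], r["weight"]) for r in calibrated])
```

Weights in the first poststratum are scaled up by 120 / 100 and the single weight in the second is scaled down by 45 / 50, so each group's weights add up to its population projection.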

Weighting & Estimation
- Final CCHS weight: final weight present on the CCHS master file

CCHS - Special Weights
- For various reasons, many other weights are produced
  - Quarter 4 special weight
  - PEI special weight
  - Share weights (master, Q4 and PEI special)
  - Link weights (master, Q4 and PEI special)

Sampling Error
- Difference in estimates obtained from a sample as compared to a census
- The extent of this error depends on four factors:
  - sample size
  - variability of the characteristic of interest
  - sample design
  - estimation method
- Generally, the sampling error decreases as the size of the sample increases

Sampling Error
- Measures of precision associated with an estimate
  - Variance
  - Standard deviation (square root of the variance)
  - 95% confidence interval (estimate ± 1.96 x standard deviation)
  - Coefficient of variation (CV)
    - Standard deviation of estimate x 100% / estimate itself
    - CV allows comparison of precision of estimates with different scales
- Example: 24% of population are daily smokers, std dev. = 0.003
  - CV = 0.003 / 0.24 x 100% = 1.25%
  - 95% CI: 0.24 ± 1.96 x 0.003 : {0.234 ; 0.246}
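The arithmetic of the smoking example can be checked with a few lines of Python. The function names are illustrative only; the inputs (an estimate of 0.24 and a standard deviation of 0.003) are the ones used in the CV calculation on the slide.

```python
def cv_percent(estimate, std_dev):
    """Coefficient of variation: standard deviation as a percentage of the estimate."""
    return std_dev / estimate * 100

def ci95(estimate, std_dev):
    """95% confidence interval: estimate plus or minus 1.96 standard deviations."""
    half_width = 1.96 * std_dev
    return estimate - half_width, estimate + half_width

cv = cv_percent(0.24, 0.003)
low, high = ci95(0.24, 0.003)
print(f"CV = {cv:.2f}%  95% CI = ({low:.3f} ; {high:.3f})")
```

Because the CV is a relative measure, it can be compared across estimates of very different magnitude, which is what makes it usable as a release criterion.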

Sampling Variability Guidelines (by type of estimate, based on its CV)
- Acceptable: general unrestricted release.
- Marginal: general unrestricted release, but with a warning cautioning users of the high sampling variability. Should be identified by letter E.
- Unacceptable (CV > 33.3): no release. Should be flagged with letter F.

Sampling Error
- Measuring sampling error for complex sample designs:
  - Simple formulas not available
  - Most software packages do not incorporate the design effect (and weight adjustments) appropriately in calculations
  - Solution for CCHS: the Bootstrap re-sampling method

Bootstrap method
- Principle:
  - You want to estimate how precise your estimate of the number of smokers in Canada is
  - You could draw 500 totally new CCHS samples and compare the 500 estimates you would get from these samples. The variance of these 500 estimates would indicate the precision.
  - Problem: drawing 500 new CCHS samples is $$$
  - Solution: assuming your sample is representative of the population, draw 500 subsamples from it and compute new sampling weights for each subsample.

Bootstrap method
T = 40
Var = Σ (B_i - B̄)² / 499
- How CCHS Bootstrap weights are created (the secret is now revealed!!!)

Bootstrap Method
- How are Bootstrap replicates built?
- The "real" recipe
  1. Subsample clusters (SRS) within a design stratum
  2. Apply (initial design) weight
  3. Adjust (boost) weight for selection of n-1 among n
  4. Apply all standard weight adjustments (nonresponse, integration, share, etc.)
  5. Post-stratification to population counts
- The bootstrap method intends to mimic the same approach used for the sampling and weighting processes

Bootstrap Method
- Sampling weight versus Bootstrap weights
  - The sampling weight is used to compute the estimate of a parameter (e.g. the number of smokers)
  - The bootstrap weights are used to compute the precision of the estimate (e.g. the CV of the estimated number of smokers)
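The division of labour between the two kinds of weights can be sketched as follows. The data are made up, only three replicates are used instead of the 500 distributed with CCHS, and the replicate-by-row layout is a hypothetical convenience; the variance formula is the one shown earlier, with the number of replicates minus one (499 for 500 replicates) as divisor.

```python
import math

def weighted_total(values, weights):
    """Weighted total of a 0/1 characteristic (or any numeric variable)."""
    return sum(v * w for v, w in zip(values, weights))

def bootstrap_variance(replicate_estimates):
    """Var = sum((B_i - mean(B))^2) / (number of replicates - 1)."""
    b = len(replicate_estimates)
    mean = sum(replicate_estimates) / b
    return sum((x - mean) ** 2 for x in replicate_estimates) / (b - 1)

smoker = [1, 0, 1, 0]                    # hypothetical respondents
sampling_weight = [50, 60, 50, 70]
bootstrap_weights = [                    # one row per replicate (3 here, 500 in CCHS)
    [100, 60, 0, 70],
    [50, 0, 100, 70],
    [0, 120, 50, 70],
]

# Point estimate: uses the sampling weight only
estimate = weighted_total(smoker, sampling_weight)

# Precision: recompute the same estimate with each set of bootstrap weights
replicates = [weighted_total(smoker, bw) for bw in bootstrap_weights]
variance = bootstrap_variance(replicates)
cv = math.sqrt(variance) / estimate * 100
print(estimate, replicates, variance, cv)
```

The same pattern applies to any statistic: compute it once with the sampling weight, once per bootstrap weight, and take the variance of the replicate values.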

Bootstrap Method
- The process of variance estimation is divided into two phases:
  1. Calculation of bootstrap weights
     - Need to be produced only once
     - Done by Statistics Canada methodologists

Bootstrap Method
  2. Variance estimation using bootstrap weights
     - Done by anyone, internally or externally
     - Bootstrap weights files distributed with all CCHS files, except the Public-Use Microdata File (PUMF)
       - Bootstrap weights are in a separate file (match using IDs)
       - Not for PUMF because bootstrap weights reveal confidential info
       - PUMF users must proceed through remote access to get 'exact' variances or use the CV look-up tables

Bootstrap Method
- Variance estimation using bootstrap weights
  - SAS and SPSS (beta) macro programs provided to users (BOOTVAR)
  - Allow users to perform a few statistical analyses (totals, proportions, differences of proportions and regression analysis)
  - Fully documented with examples
  - Bootstrap hands-on workshop

How to use the Bootvar program
STEP #1: Create your "analytical file"
- Read CCHS data file
- Prepare the necessary dummy variables
- Keep only the necessary variables
- Perform the analysis to obtain the point estimates (not essential but recommended)
STEP #2: Compute your variances using Bootvar
- Specify the location of the files:
  - Your "analytical file"
  - Bootstrap weights file
- Specify the level of geography
- Specify the analysis to perform
  - Total, proportion, diff. of prop.
  - Regression (linear & logistic)
  - Generalized linear model

How to use the Bootvar program
Statistical analysis
- Using the NPHS cycle 3 (1998) cross-sectional dummy data, estimate the number of Ontarians aged 12 or older, by gender, who perceive themselves as being:
  - in poor or fair health,
  - in good health,
  - in very good health,
  - in excellent health.
- Compute the 95% confidence interval for each point estimate using the Bootvar program.

Necessary variables for the analysis
- Self-perceived health (GHC8DHDI): 0 = poor, 1 = fair, 2 = good, 3 = very good, 4 = excellent, 9 = not stated
- Age (DHC8_AGE): >= 12
- Sex (DHC8_SEX): 1 = male, 2 = female
- Province (PRC8_CUR): 35 = Ontario
- Sampling weight (WT68)
- Record identifier for the household (REALUKEY)
- Number identifying the person in the household (PERSONID)

Basic theoretical notions for estimating a proportion
Example of a data file

ID   Weight   Sex   Asthma   Asthma_id
A      50      M     YES        1
B      60      M     NO         0
C      50      M     NO         0
D      70      M     YES        1
E      50      M     NO         0

%(asthma_men) = (Weight A + Weight D) / (Weight A + Weight B + Weight C + Weight D + Weight E) * 100
              = (50 + 70) / (50 + 60 + 50 + 70 + 50) * 100
              = 120 / 280 * 100 = 43%
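The same calculation written as a short Python sketch, using the five records of the example file. The function name and the tuple layout are illustrative only.

```python
# (ID, weight, sex, asthma indicator) for the five records of the example
records = [
    ("A", 50, "M", 1),
    ("B", 60, "M", 0),
    ("C", 50, "M", 0),
    ("D", 70, "M", 1),
    ("E", 50, "M", 0),
]

def weighted_percent(rows):
    """Weighted percentage: sum of weights with the characteristic over sum of all weights."""
    numerator = sum(weight * flag for _, weight, _, flag in rows)
    denominator = sum(weight for _, weight, _, _ in rows)
    return numerator / denominator * 100

pct_asthma_men = weighted_percent(records)
print(f"{pct_asthma_men:.0f}%")
```

An unweighted calculation would give 2 / 5 = 40%; the weighted figure differs because records A and D do not carry the same weight as the others.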

Little trick for the statistical analysis
Create your univariate dummy variables:
- Men = 1, 0 (men)
- Good health = 1, 0 (good)
- Men in good health: mgood = men * good

Results of the statistical analysis
Self-perceived health of Ontarians aged 12 or older by gender in 1998

                 # ('000)   95% CI             %      95% CI
Men
- Poor / fair        391    (330 ; 452)        8.4    (7.1 ; 9.8)
- Good             1,106    (1,007 ; 1,204)   23.9    (21.7 ; 26.0)
- Very good        1,764    (1,648 ; 1,880)   38.1    (35.6 ; 40.6)
- Excellent        1,373    (1,268 ; 1,479)   29.6    (27.4 ; 31.9)
Women
- Poor / fair        480    (409 ; 551)        9.9    (8.5 ; 11.4)
- Good             1,258    (1,151 ; 1,364)   26.1    (23.9 ; 28.3)
- Very good        1,846    (1,726 ; 1,965)   38.2    (35.8 ; 40.7)
- Excellent        1,243    (1,138 ; 1,348)   25.8    (23.6 ; 27.9)

Why use the Bootstrap method?
Other techniques:
- Taylor
  - Need to define a linear equation for each statistic examined
- Jackknife
  - Number of replicates depends on the number of strata (a large number of strata makes it impossible to disseminate)

Why use the Bootstrap method?
BOOTSTRAP
- more user-friendly when there is a large number of strata
- sets of 500 bootstrap weights can be distributed to data users
- recommended (over the jackknife) for estimating the variance of nonsmooth functions like quantiles, LICO
- Official reference:
  - "Bootstrap Variance Estimation for the National Population Health Survey", D. Yeo, H. Mantel, and T.-P. Liu. 1999, Baltimore, ASA Conference.

CV Look-up Tables
- Alternative to bootstrap
- Approximate
- Can only be used for categorical variables, and for estimates of totals and proportions
- Available for every health region, province and Canada
- Provided with PUMF and Share file for some subpopulations

CV Look-up Tables - Example
National Population Health Survey - 1996/1997
Approximate Sampling Variability Tables for Ontario Health Area: OTTAWA CARLETON - Selected members
[Table: rows give the numerator of the estimated percentage ('000), from 1 to 500; columns give the estimated percentage (0.1%, 1.0%, 2.0%, 5.0%, 10.0%, 15.0%, 20.0%, 25.0%, 30.0%, 35.0%, 40.0%, 50.0%, 70.0%, 90.0%); CV values not preserved]
NOTE: FOR CORRECT USAGE OF THESE TABLES PLEASE REFER TO MICRODATA DOCUMENTATION

Another example using the Bootvar program
Statistical analysis
- Using the NPHS cycle 3 (1998) cross-sectional dummy data, determine whether or not the number of men aged 12 or older who perceive themselves as being in excellent health in Ontario is statistically different (at level α = 5%) from the number of women.

Basic theoretical notions for performing a Z-test
M_excel = estimated proportion of men in excellent health
F_excel = estimated proportion of women in excellent health
Hypothesis test:
  H0: M_excel = F_excel
  H1: M_excel ≠ F_excel
At level α = 0.05, we conclude H0 if |z| <= 1.96. We conclude H1 otherwise.
  Z = (M_excel - F_excel) / sd(M_excel - F_excel)
We use the section "difference of proportions" of the BOOTVAR program to estimate the standard deviation of the difference between the two estimates.

Results
M_excel = 29.64% ; F_excel = 25.75% ; sd(M_excel - F_excel) = 1.62
  Z = (M_excel - F_excel) / sd(M_excel - F_excel) = (29.64 - 25.75) / 1.62 = 3.89 / 1.62 = 2.40
At the α = 0.05 level, we conclude H1 because z = 2.40 > 1.96. We can then conclude that among Ontarians aged 12 or older there is a statistically significant difference between men and women with regard to the characteristic "self-perceived health = excellent".
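The test can be reproduced in a few lines of Python from the three figures reported above. The function name is illustrative; in practice the standard deviation of the difference comes from the "difference of proportions" section of Bootvar.

```python
def z_statistic(p1, p2, sd_diff):
    """Z = (p1 - p2) / sd(p1 - p2), all expressed in percentage points."""
    return (p1 - p2) / sd_diff

m_excel = 29.64   # % of men in excellent health
f_excel = 25.75   # % of women in excellent health
sd_diff = 1.62    # bootstrap standard deviation of the difference

z = z_statistic(m_excel, f_excel, sd_diff)
reject_h0 = abs(z) > 1.96      # two-sided test at the 5% level
print(f"z = {z:.2f}, conclude H1: {reject_h0}")
```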

CCHS - Data Dissemination Strategy
- Wide range of users and capacity
  - 136 health regions
  - 13 provincial/territorial Ministries of Health
  - Health Canada and CIHI
  - Internal STC analysts
  - Academics
  - Others
- Data products
  - Microdata
  - Analytical products (Health Reports, How Healthy are Canadians, etc.)
  - Tabular statistics (ePubs, Cansim II, community profiles, etc.)
  - Client support (head and regional offices, CCHS website, workshops, etc.)

CCHS - Access to microdata
- Master file
  - all records, all variables
    - Statistics Canada
    - university research data centres
    - remote access
- Share / Link files
  - respondents who agreed to share / link
    - provincial/territorial Ministries of Health
    - health regions (through the STC third-party share agreement)
- Public Use Microdata File (PUMF)
  - all records, subset of variables with collapsed response categories
    - free for 136 health regions
    - cost recovery for others

CCHS - Overview of Cycle 1.2
- Produce provincial cross-sectional estimates from a sample of 30,000 respondents
- Area frame sample only / one person per household
- CAPI only
- 90-minute in-depth interviews on mental health and well-being based on the WMH2000 questionnaire
- Scheduled to begin collection in May 2002

CCHS - Future Plans
- Same two-year cycle approach:
  - health region level survey starting in January 2003
  - provincial level survey starting in January 2004
- New consultation process with provincial and regional authorities
- Flexible sample designs (adaptable to regional needs)
- Development of in-depth nutrition-focused content (Cycle 2.2)

CCHS Web site

Contacts in Methodology
- Yves Béland
- François Brisebois