Canadian Census 2006 Public Use Microdata File Presentation at the SARS Conference Manchester, United Kingdom September 3, 2008 Presented by: Sri Kanagarajah,


1 Canadian Census 2006 Public Use Microdata File Presentation at the SARS Conference Manchester, United Kingdom September 3, 2008 Presented by: Sri Kanagarajah, Chief, Census Client Services Census Operations Division Statistics Canada

2 2006 Census Data (released March 13, 2007 to May 1, 2008)

3 2006 Census Data Largest Metropolitan Areas

4 2006 Census Data Mother Tongue

5 2006 Public Use Microdata File: Agenda
1. Census PUMF 1971 –
2. Consultations Summary
3. Factors promoting new approach
4. Challenges with micro data
5. Addressing confidentiality concerns
6. The 2 New 2006 Census PUMFs
7. Comparing 2006 Census PUMFs
8. Analytic Content: additions and losses
9. International Comparison (macro level)

6 1. Census PUMF: Structure and variables
Characteristics
–File type: 3 single, separate files with no relationship between them
–Geography: Province, Census Metropolitan Area (large urban)
–Variables: repeated in the 3 files; most of them are derived variables
Example: 2001 Census PUMF
–INDIVIDUALS: 2.7% of the population, 140 variables, 801,055 records
–FAMILIES: 2.7% of the population, 163 variables, 348,104 records
–HOUSEHOLDS & DWELLINGS: 2.7% of the population, 150 variables, 312,513 records
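As a rough illustration of what a 2.7% sample means, the sketch below converts the sampling fraction into an approximate per-record weight and applies it to the record count of the 2001 individuals file. It assumes equal-probability sampling, which the slide does not state; the weights actually supplied with a PUMF may differ, and the function names are hypothetical.

```python
# Figures taken from the slide (2001 Census PUMF, individuals file).
SAMPLING_FRACTION = 0.027
INDIVIDUAL_RECORDS = 801_055


def design_weight(sampling_fraction: float) -> float:
    """Approximate number of people each record stands for.

    Assumes every person had the same chance of selection (an assumption,
    not something the slide confirms).
    """
    return 1.0 / sampling_fraction


def estimated_population(records: int, sampling_fraction: float) -> float:
    """Scale a record count up to the population it approximately represents."""
    return records * design_weight(sampling_fraction)


weight = design_weight(SAMPLING_FRACTION)
population = estimated_population(INDIVIDUAL_RECORDS, SAMPLING_FRACTION)

print(f"approximate weight per record: {weight:.1f}")
print(f"approximate population represented: {population:,.0f}")
```

Under that assumption each record stands for about 37 people, and the individuals file represents roughly 29.7 million people.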

7 1. Census PUMF: Variables per universe, 2001 Census PUMF
Variables are classified by derivation level: Level 1 (very simple), Level 2 (simple), Level 3 (complex), Level 4 (very complex).
% of complex variables (levels 3-4) out of total, by universe:
–PUMF of Individuals: 6.4 %
–PUMF of Families: 83 %
–PUMF of Households: 48 %

8 2. Consultations Summary
Objectives:
–Present the upcoming changes for the 2006 PUMF
–Get feedback concerning the proposed changes
–Get information about how often the file is used and about the data needs of PUMF users
Consultations:
–Data Liberation Initiative: program that provides colleges and universities with data produced by Statistics Canada, including the Public Use Microdata File
–Federal departments that paid part of the collection costs for the 2B long-form questionnaire
–Questionnaire for experienced university researchers (Queens/UofT/UofAlb)
–Academics & private sector research
–International comparison (macro level)

9 2. Consultations Summary (continued)
2.1 Scenarios presented to users
Possible scenarios presented for the 2006 PUMF:
1. Status quo as in 2001 (3 universe files)
2. Individual single file
3. Hierarchical file

10 2. Consultations Summary (continued)
Most discussed topics of interest during the consultations
2.2 Geography
–Users wanted provinces, and some also wanted Census Metropolitan Areas & Census Subdivisions
2.3 Variables
–The variables taken from the questionnaires and the most common derived variables
–Derived variables, e.g. LICO, POW
–More flexibility for users to create their own derived variables

11 2. Consultations Summary (continued)
Most discussed topics of interest during the consultations (end)
2.4 Type of file
Most requested files:
a) Individual file with geography of Province & CMAs
b) Hierarchical file (link between universes for better analysis)

12 3. Factors promoting a new approach
–Statistics Canada provides greater accessibility of census data than before
–Improvements to provide more analytic content and greater use at the national and international levels
–Speed up release of PUMFs

13 4. Challenges with micro data
–Statistics Canada senior managers are concerned about confidentiality
–Data confidentiality constraints

14 4. Research Data Centre (RDC) Microdata
–Across Canada: 15 RDC centres, 9 branches & 26 partners
–The entire 2B data and details: 100% of 2B questionnaire data (ethnicity, visible minority, labour force, language, place of work, immigration, income, etc.)
–Years available: 2001, 1996 & Census RDC file - December 2008
–Social Sciences and Humanities Research Council (SSHRC) reviews proposals
–Statistics Canada senior managers
–Restricted access: a committee reviews each proposal and approves access

15 5. Addressing confidentiality concerns (initial plan)
How are we addressing these concerns?
–Number of files: 3 files reduced to 2 files
–File size: 2006 smaller than 2001; Individual (800K) & Hierarchical (150K)
–Limited geography: province for the single file & regions for the hierarchical file
–Age variable: collapse the age variable
–Income variable: modify the income categorisation
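Collapsing the age variable and re-categorising income can be pictured as mapping exact values onto coarser bands. The following is a minimal sketch of that idea only; the band width, top code and income edges are hypothetical and are not the categories Statistics Canada used.

```python
from bisect import bisect_right

# Hypothetical parameters, chosen only for illustration.
AGE_BAND_WIDTH = 5
AGE_TOP_CODE = 85
INCOME_EDGES = [0, 10_000, 20_000, 40_000, 60_000, 100_000]


def collapse_age(age: int) -> str:
    """Map an exact age to a band, with one open-ended band at the top."""
    if age >= AGE_TOP_CODE:
        return f"{AGE_TOP_CODE}+"
    low = (age // AGE_BAND_WIDTH) * AGE_BAND_WIDTH
    return f"{low}-{low + AGE_BAND_WIDTH - 1}"


def income_category(income: float) -> str:
    """Map an exact income to a category defined by INCOME_EDGES."""
    if income < INCOME_EDGES[0]:
        return f"under {INCOME_EDGES[0]}"
    # Index of the first edge strictly greater than the income.
    i = bisect_right(INCOME_EDGES, income)
    if i == len(INCOME_EDGES):
        return f"{INCOME_EDGES[-1]}+"
    return f"{INCOME_EDGES[i - 1]}-{INCOME_EDGES[i] - 1}"


print(collapse_age(37), collapse_age(90))
print(income_category(15_000), income_category(250_000))
```

Coarser bands put more records in each cell, which is the reason such recoding reduces disclosure risk, at the cost of analytic detail.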

16 5. Addressing confidentiality concerns
–Independent samples where possible
–Eliminate values with low Canada frequencies
Geography available, by file:
–Canada: 2006 Single file and 2006 Hierarchical file
–Provinces: 2006 Single file
–Regions (at least 1 million people: Atlantic, QC, ON, Prairies, BC, North territories): 2006 Hierarchical file
–CMAs: 2006 Single file (most?) and 2006 Hierarchical file (Top 6?)
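Eliminating values with low frequencies can be sketched as counting how often each value occurs and folding rare ones into a residual category. This is an illustrative sketch only: the threshold, the "Other" label and the sample data are hypothetical, and a production rule would presumably work on Canada-level frequencies rather than counts in a small list.

```python
from collections import Counter
from typing import List

MIN_FREQUENCY = 3      # hypothetical threshold
SUPPRESSED = "Other"   # hypothetical residual category


def suppress_rare_values(values: List[str],
                         min_frequency: int = MIN_FREQUENCY,
                         replacement: str = SUPPRESSED) -> List[str]:
    """Replace every value that occurs fewer than min_frequency times."""
    counts = Counter(values)
    return [v if counts[v] >= min_frequency else replacement for v in values]


# Made-up mother tongue responses, for illustration only.
sample = ["English"] * 5 + ["French"] * 4 + ["Cree"] * 2 + ["Inuktitut"]
released = suppress_rare_values(sample)
print(Counter(released))
```

In this made-up sample the two rare values are merged, so no released category has fewer than three records.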

17 6. The 2 new 2006 Census PUMFs: Single file and Hierarchical file (initial plan)
Single file (2.7% of population)
–Keep provinces (legal jurisdiction for education, health etc.)
–Keep most CMAs for diversity studies
–Variables taken from the questionnaire; users can create their own derived variables
–Release projected for summer 2009
Hierarchical file (1% of population)
–Keep some CMAs for diversity studies
–Links the 3 universes (individual, family & household)
–Variables taken from the questionnaire; users can create their own derived variables
–Release projected for summer 2010
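The idea of a hierarchical file that links the three universes, and from which users derive their own variables, can be sketched as nested records. The layout below is a hypothetical illustration, not the actual PUMF record layout; it also simplifies by assuming every person belongs to a family, which is not true of real households.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Person:
    person_id: int
    age: int
    income: float


@dataclass
class Family:
    family_id: int
    members: List[Person] = field(default_factory=list)


@dataclass
class Household:
    household_id: int
    families: List[Family] = field(default_factory=list)

    def persons(self) -> List[Person]:
        """All persons in the household, reached through its families."""
        return [p for f in self.families for p in f.members]


def household_income(household: Household) -> float:
    """User-derived household variable: sum of members' incomes."""
    return sum(p.income for p in household.persons())


def family_size(family: Family) -> int:
    """User-derived family variable: number of members."""
    return len(family.members)


# Made-up example: one household containing two families.
fam_a = Family(1, [Person(1, 41, 52_000.0), Person(2, 39, 48_000.0), Person(3, 8, 0.0)])
fam_b = Family(2, [Person(4, 70, 21_000.0)])
home = Household(100, [fam_a, fam_b])

print(household_income(home), family_size(fam_a), len(home.persons()))
```

With three unlinked files, as in the 2001 structure, a derived variable such as household income could not be rebuilt from individual records; the links are what make it possible.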

18 7. Comparing the 2006 Census PUMFs: Structure and variables
Individual Single File
–Size: 2.7% of the population
–Some people represent a family or a household
–Geography limited to Canada, provinces and CMAs
–Loss of information about families and households
–Variables taken from the questionnaire so that users can create their own derived variables
–Analytic content extended to the Individual Universe
–Release projected for summer 2009
–Same suppression level as in 2001
Hierarchical File (individuals, family & household)
–Size: 1% of the population
–All families and persons in households sampled are included
–Geography more limited: Canada, regions & CMAs with a population of at least 1 million
–File representative of households; more varied content including all data
–Variables taken from the questionnaire so that users can create their own derived variables
–Analytic content extended to the three universes; greater potential for analysis and international comparison
–Release projected for summer 2010
–Same or lower suppression level (fewer geographies)

19 8. Analytic Content: additions and losses
PUMF-2001 (Status Quo)
–Content: independent samples of the three universes
–Content: diverse geographies at the province and CMA levels
–Content: families and households well represented
–Content: repetition of variables between the 3 universes; complex derived variables
–Content: analytic content limited to one universe at a time
–Production requirements: certification and production projected for end of 2010
PUMF-2006 (Single File)
–Content: some people represent a family or a household
–Content: geography limited to provinces and most major CMAs
–Content: loss of information about families and households
–Content: variables taken from the questionnaire so that users can create their own derived variables
–Content: analytic content extended to the Individual Universe
–Production requirements: production projected for summer 2009
–Confidentiality: same suppression level as in 2001
–Confidentiality: detailed age & income
PUMF-2006 (Hierarchical File)
–Content: all families and persons in households sampled are included
–Content: geography more limited, to regions and CMAs (population of at least 1 million)
–Content: file representative of households; more varied content including all data
–Content: variables taken from the questionnaire so that users can create their own derived variables
–Content: analytic content extended to the three universes; greater potential for analysis and international comparison
–Production requirements: production projected for summer 2010
–Confidentiality: same or lower suppression level as in 2001 (fewer geographies)
–Confidentiality: reduced categories of age and income

20 9. International Comparison: Similarities & Differences (2001)
Type: Single
–Canada: Individual 2.7%, Family 2.7%, Household & Housing 2.7%
–UK: SAM 5%
Type: Hierarchical
–Canada: Not available
–Australia: Basic & Extended 1% (Person, Family, Dwelling)
–UK: SL-HSAR %, IL-SAR %, H-CAMS %, I-CAMS %
–U.S.A.: 1% & 5% (Person, Housing)
Geographic Identifier
–Canada: Prov/Terr, CMA >= 100K
–Australia: State/Ter, SR, ASR; Pop >= 124K (ASR), 250K (SR)
–UK: RGR (SAR-H-1), LIDA >= 120K (SAR-I-91), GOR (IL-SAR-01), LA-Eng & Wales, CA-Scot, PC-NI, UK (SAM-CAMS)
–U.S.A.: State, Super-PUMA >= 400K, PUMA >= 100K

21 Some International Microdata Acronyms: United Kingdom
File names
–SAM: Small Area Microdata (individual file)
–CAMS: Controlled Access Microdata Samples
–SAR: Samples of Anonymised Records
–SL: Special Licensed Household
–IL-SAR: Individual Licensed Samples of Anonymised Records
–H-CAMS: Controlled Access Microdata Samples
–I-CAMS: Individual Controlled Access Microdata Samples
–SAR-H: Household Samples of Anonymised Records
–SAR-I: Individual Samples of Anonymised Records
Geography
–LA-Eng: Local Authority, England
–CA-Scot, PC-NI, UK: Council Area, Scotland; Parliamentary Constituency, Northern Ireland; United Kingdom
–RGR: Registrar General Standard Regions
–LIDA: Large Local Authority District
–GOR: Government Office Region

22 Some International Microdata Acronyms: Australia and USA
Australia
–File name: CURF, Confidentialised Unit Record File (the most detailed microdata statistical information available from the Australian Bureau of Statistics (ABS))
–Geography: ASR, Aggregated Statistical Regions; SR, Statistical Region
USA
–File name: PUMS, Public Use Microdata Sample
–Geography: PUMA, Public Use Microdata Area; Super-PUMA, Super Public Use Microdata Area

23 Comments and/or questions Thank you !

