Res Meth Workshop Dec 04 Disclosure problems with design information for surveys Gillian Raab Kathy Buckner/Iona Waterston Napier University Susan Purdon.

Presentation transcript:

Res Meth Workshop Dec 04
Disclosure problems with design information for surveys
Gillian Raab, Kathy Buckner/Iona Waterston, Napier University
Susan Purdon, National Centre for Social Research

Background
PEAS project
– Uses real survey data, not textbook examples
– Illustrates how the data can be analysed using different methodologies and their implementations in software packages
– Links the analyses with sections on the theory relevant to the design and analysis of surveys

Data availability
ESRC stipulation
– Data used in the exemplars must be available via the ESRC data archive
But if this were the ONLY way the data were available, it would make the site hard to use
So the exemplars use extracts, of just a few variables, available on the web

Need to make survey design variables available
Cluster (primary sampling unit) identifiers
– If the sample is clustered (here it was)
Indicators of the strata used
– Here stratification was by local authority
Weights
Cluster and stratum identifiers may not be made available via the data archive, or may be in restricted files

Clusters are about 10 respondents each
Strata are local authorities
In other cases strata might be (e.g.) large firms in a business survey

Disclosure can happen if
– We know the location of individual clusters
– We can identify an individual within a cluster
– A stratum is small and a large proportion of it is sampled
– We have some means of linking the data on the web back to the full data source
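The small-stratum condition can be screened for by comparing sample and population counts per stratum. The sketch below is only an illustration: the function name, the stratum labels and the 0.5 threshold are assumptions, not part of the PEAS exemplars, and it presumes population counts per stratum are known.

```python
def high_fraction_strata(sample_counts, population_counts, threshold=0.5):
    """Return {stratum: sampling fraction} for strata sampled at or above threshold.

    sample_counts and population_counts are dicts keyed by stratum label.
    The threshold of 0.5 is an arbitrary illustrative choice.
    """
    flagged = {}
    for stratum, n_sampled in sample_counts.items():
        fraction = n_sampled / population_counts[stratum]
        if fraction >= threshold:
            flagged[stratum] = fraction
    return flagged


# Hypothetical business-survey strata: most large firms are sampled.
sample = {"large_firms": 8, "small_firms": 20}
population = {"large_firms": 10, "small_firms": 1000}
flagged = high_fraction_strata(sample, population)
```

Here only the hypothetical large-firm stratum is flagged, because 8 of its 10 units are in the sample.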

Steps to prevent disclosure
– Change cluster identifiers so they no longer reveal location
– Change IDs so they cannot link back
– Add noise to the weights so they do not identify individuals
– Make the details of how the strata are defined unavailable (not in this exemplar)
– Maybe more things?
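The first three steps above can be sketched in a few lines. This is a minimal illustration and not the procedure actually used for the exemplars: the field names (`id`, `cluster`, `weight`), the uniform multiplicative noise and the 5% noise level are all assumptions made for the example.

```python
import random


def mask_design_variables(records, noise=0.05, seed=None):
    """Return masked copies of records (a list of dicts); the input is not modified.

    Assumed fields: "id", "cluster", "weight". Other fields are copied unchanged.
    """
    rng = random.Random(seed)

    # Relabel clusters with randomly assigned codes so the label carries no location.
    clusters = sorted({r["cluster"] for r in records})
    codes = list(range(1, len(clusters) + 1))
    rng.shuffle(codes)
    cluster_map = dict(zip(clusters, codes))

    # Shuffle row order so the new sequential IDs do not mirror the source file order.
    shuffled = list(records)
    rng.shuffle(shuffled)

    masked = []
    for new_id, original in enumerate(shuffled, start=1):
        row = dict(original)
        row["id"] = new_id
        row["cluster"] = cluster_map[original["cluster"]]
        # Multiplicative noise of at most +/- `noise` (illustrative choice).
        row["weight"] = original["weight"] * (1.0 + rng.uniform(-noise, noise))
        masked.append(row)
    return masked
```

Records that shared a cluster before masking still share one afterwards, so variance estimation that uses the cluster structure is unaffected; only the labels change. Noise on the weights does perturb weighted estimates slightly, so the noise level is a trade-off.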

What are the principles?
Do we need to worry about
– Population-unique individuals
– Sample-unique individuals
Logically we would expect the former
But the latter may also be important
– If you know you are in the survey?
– If you know that someone else was in the survey?
Principles for individuals and organisations may have to be different
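Sample uniques are straightforward to count once a set of identifying variables has been chosen. The sketch below is an illustration only; the function name and the key variables in the example are hypothetical. Population uniqueness cannot be checked this way, since it needs information about the whole population.

```python
from collections import Counter


def sample_uniques(records, key_vars):
    """Return the records whose combination of key_vars occurs exactly once in the sample."""
    combos = [tuple(r[k] for k in key_vars) for r in records]
    counts = Counter(combos)
    return [r for r, combo in zip(records, combos) if counts[combo] == 1]


# Hypothetical extract with two identifying variables.
extract = [
    {"age_group": "16-24", "tenure": "rent"},
    {"age_group": "16-24", "tenure": "rent"},
    {"age_group": "75+", "tenure": "own"},
    {"age_group": "25-44", "tenure": "own"},
]
uniques = sample_uniques(extract, ["age_group", "tenure"])
```

In this made-up extract the last two records are sample unique on the chosen variables. Whether that matters depends on the points above: for example, whether someone knows that a particular person took part.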

Another way round this
– Surveys come with sets of replicate weights
– Standard errors for surveys are provided using jackknife or bootstrap methods
– The user does not need to have access to the individual design variables
– This approach has been pioneered by Statistics Canada
– But a sharp investigator could still work out clusters
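A rough sketch of how replicate weights work, assuming a delete-one-cluster jackknife within strata. This is one common construction and is not claimed to be the scheme any particular agency uses; all function names are hypothetical. The data producer runs `jackknife_replicates` using the design variables; the data user only needs the replicate weights and multipliers to run `replicate_standard_error`.

```python
def weighted_mean(values, weights):
    """Weighted mean of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def jackknife_replicates(weights, strata, clusters):
    """Producer side: build delete-one-cluster jackknife replicate weights.

    Returns (replicates, multipliers): one weight column per dropped cluster,
    and the factor (n_h - 1) / n_h to apply to that replicate's squared deviation.
    """
    clusters_in = {}
    for s, c in zip(strata, clusters):
        clusters_in.setdefault(s, set()).add(c)

    replicates, multipliers = [], []
    for s in sorted(clusters_in):
        n_h = len(clusters_in[s])
        if n_h < 2:
            raise ValueError("each stratum needs at least two clusters")
        for dropped in sorted(clusters_in[s]):
            column = []
            for w, s_i, c_i in zip(weights, strata, clusters):
                if s_i != s:
                    column.append(w)                    # other strata unchanged
                elif c_i == dropped:
                    column.append(0.0)                  # dropped cluster
                else:
                    column.append(w * n_h / (n_h - 1))  # rescale the rest of the stratum
            replicates.append(column)
            multipliers.append((n_h - 1) / n_h)
    return replicates, multipliers


def replicate_standard_error(values, weights, replicates, multipliers):
    """User side: standard error of a weighted mean from replicate weights alone."""
    full = weighted_mean(values, weights)
    variance = sum(
        m * (weighted_mean(values, column) - full) ** 2
        for column, m in zip(replicates, multipliers)
    )
    return variance ** 0.5
```

With this construction, records that have zero weight in the same replicate belong to the same cluster, which is one way a sharp investigator could reconstruct the clusters from the replicate weights.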

Relevance to researchers
We have been able to get the data we wanted for our exemplars so far
But there are some surveys at the ESRC data archive where the cluster identifiers are
– Not available at all
– Available, but the information is obscure
A consistent policy (perhaps with restrictions) would be helpful