Accessing data – a user’s perspective


Accessing data – a user’s perspective
Presented to the Atlantic DLI Training Workshop
Acadia University, April 11-12, 2017
E. Dianne Looker, Professor emerita
dianne.looker@acadiau.ca

Outline of presentation
- DLI Opportunities & Responsibilities
- Secondary versus primary data
- Accessing data using CANSIM
- Accessing data using the RDCs
- Accessing data using DLI
- Ways to improve DLI access
- Providing information on data repositories
- Pro-active information to researchers

DLI Opportunities & Responsibilities
- Pro-active promotion of data options
- Providing information on data options to faculty
- Continuum of data options from Statistics Canada
- Other data options through data repositories
- Providing access to Public Use Microdata Files (PUMFs)
- Providing support for access to PUMFs

Secondary versus primary data
Advantages of researchers collecting primary data:
- Researcher control
- Access to all data; researcher controls any suppression
Disadvantages of primary data collection:
- Cost
- Time – data collection, coding, data cleaning
- Often limited in scope; response rate issues
- Complexities and constraints of obtaining ethics approval

Advantages of secondary data:
- Low or no cost to individual researcher
- Data processed, cleaned and checked
Disadvantages of secondary data:
- No researcher control – measures and sample might not match your goals
- Data suppression – variables and codes
- Need access – takes some time and knowledge to access and run
- “Over” interpretation of particular data sets
- Example of “over” interpretation – High School and Beyond data set, U.S.

My experiences using CANSIM
- Analysis of data on graduate students in Canada for Canadian Association for Graduate Studies – 41st and 42nd Reports
- Summary data for background for research papers
HOWEVER, I only recently learned of CANSIM

Accessing data using CANSIM
Advantages:
- Large, often nationally representative data sets
- Excellent for time trends
- Data quality has been checked
- No data collection or processing costs
- Documentation of measures
Disadvantages:
- Limited number of variables
- Limits on possible multi-variate analyses
- Codes and recodes set by others
- Documentation is often restricted to footnotes in HTML tables – not evident if you download the data

My experience using the RDCs:
- Co-director of the Atlantic RDC (ARDC) & representative to national Coordinating Committee, 2006-2009
- Institutional representative to the ARDC, 2006-2012
- Researcher using various data sets, including: Youth in Transition and National Longitudinal Survey of Children and Youth
- Reviewer for access to RDCs

Accessing data through the RDCs
Advantages:
- No data collection or processing costs
- Few suppressed variables
- Can undertake multi-variate analyses
Disadvantages:
- Physical access to RDC required
- Need focused project – no “exploration”
- Challenges downloading data into SPSS
- Delays in accessing output
- Challenge of avoiding residual disclosure problems
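To illustrate the residual disclosure problem mentioned above: a suppressed value can sometimes be recovered by subtraction when totals and the other cells are released. The following is a minimal sketch only. The counts, region labels, and the `recover_suppressed` helper are all hypothetical and are not taken from any Statistics Canada file or vetting rule.

```python
# Hypothetical counts, invented for illustration only (not from any survey).
row_total = 120
released = {"Region A": 70, "Region B": 47, "Region C": None}  # None = suppressed cell


def recover_suppressed(cells, total):
    """Return {label: value} if exactly one suppressed cell can be
    recovered by subtracting the released cells from the total."""
    hidden = [label for label, value in cells.items() if value is None]
    if len(hidden) != 1:
        return {}  # nothing suppressed, or too many unknowns to solve
    known = sum(value for value in cells.values() if value is not None)
    return {hidden[0]: total - known}


recovered = recover_suppressed(released, row_total)
print(recovered)
```

A check of this kind, run over a set of related output tables before requesting release, is one way a researcher might spot output that could be flagged for residual disclosure.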

My experience with DLI:
- Analysed data from various General Social Surveys
- Analysed data from Survey of Approaches to Educational Planning
- Accessed Aboriginal Peoples Survey
- Accessed National Graduate Survey

Finding Public Use Data Files
- Statistics Canada lists the Public Use Data file surveys (PUMFs) and measures: http://www120.statcan.gc.ca/dli/e1/stu?fq=studyType%3A%22PUMFFILE%22&showSum=hide#
- Odesi allows you to search by variable in a range of data sets: http://search1.odesi.ca/#/
- See also this “small area” data guide: http://www.ruralontarioinstitute.ca/uploads/userfiles/files/Small%20Area%20Data%20Guide_FINAL2.pdf
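The Statistics Canada link above carries its search filter in the query string. As a small sketch, the standard library can decode it to show which filter is being applied. This assumes the space inside "PU MFFILE" on the slide is a line-wrap artifact; the code only parses the string and does not contact the site.

```python
from urllib.parse import urlsplit, parse_qs

# URL from the slide, with the stray space in "PU MFFILE" removed
# (assumption: the space is an artifact of line wrapping).
url = (
    "http://www120.statcan.gc.ca/dli/e1/stu"
    "?fq=studyType%3A%22PUMFFILE%22&showSum=hide#"
)

# Split off the query string and decode the percent-encoded values.
query = parse_qs(urlsplit(url).query)
study_filter = query["fq"][0]
print(study_filter)         # studyType:"PUMFFILE"
print(query["showSum"][0])  # hide
```

Read this way, the link appears to restrict the study listing to entries whose study type is PUMFFILE.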

Available data:
- Statistics Canada
- Other Government agencies, including:
  - Employment & Social Development Canada
  - Immigration, Refugees and Citizenship Canada
  - Canada Revenue Agency
  - Canada Mortgage and Housing Corporation
  - Indigenous and Northern Affairs Canada
  - Innovation, Science and Economic Development Canada
  - Natural Resources Canada

Other data:
- Archived data in data repositories (see https://library.carleton.ca/find/data/available-data/online-data-repositories for Canadian and international repositories)
- The three granting councils require that: “Research data resulting from agency funding should normally be preserved in a publicly accessible, secure and curated repository or other platform for discovery and reuse by others.”

Accessing data through DLI
Advantages:
- No data collection or processing costs
- Data has been cleaned and checked
- Can access national surveys
Disadvantages:
- Some variables are suppressed
- No list of suppressed variables; documentation is usually for the full data set
- Some codes are collapsed
- You have to take the time to access the data to find out which codes are collapsed
- Different files are in different formats

Using DLI – there is a need for:
- Clear lists of suppressed variables for each data set
- Clear lists of codes used in the PUMF (not just in the full master file)
- Having these lists would also facilitate applications to the RDC
- Information on related files that complement particular PUMF files in terms of topic areas
- Information on how to access alternate data files that are stored in a data repository
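The lists called for above could in principle be produced by comparing a master-file codebook with the PUMF codebook. What follows is a rough sketch under stated assumptions: both codebooks are represented as plain dictionaries of variable name to category codes, and every variable name and code shown is hypothetical, not drawn from any actual survey documentation.

```python
# Hypothetical codebooks: variable name -> set of category codes.
master_codebook = {
    "AGEGRP": {1, 2, 3, 4, 5, 6},
    "PROV": {10, 11, 12, 13},
    "INCOME": {1, 2, 3, 4, 5},
    "POSTCODE": {1, 2, 3},
}
pumf_codebook = {
    "AGEGRP": {1, 2, 3},        # fewer categories than the master file
    "PROV": {10, 11, 12, 13},
    "INCOME": {1, 2, 3, 4, 5},
}

# Variables documented for the master file but missing from the PUMF.
suppressed = sorted(set(master_codebook) - set(pumf_codebook))

# Variables present in both files where the PUMF has fewer codes.
collapsed = sorted(
    name
    for name in set(master_codebook) & set(pumf_codebook)
    if len(pumf_codebook[name]) < len(master_codebook[name])
)

print("Suppressed variables:", suppressed)
print("Variables with collapsed codes:", collapsed)
```

Counting codes is only a rough signal of collapsing. A real comparison would need the category labels from the documentation as well.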

In general:
- Researchers need to know about the range of options and how to access them
- Many researchers do not know about CANSIM, the RDCs or PUMFs
- More standardization in file format for PUMFs would be helpful
- More and better documentation of PUMF contents, available before accessing the files
- Many researchers will need assistance setting up and accessing the files

In sum:
- There are a lot of options for accessing data for research
- All have advantages and disadvantages
- PUMFs offer flexibility combined with access to micro-data
- Better documentation would increase the usefulness of PUMFs
- Data librarians and DLI representatives have the opportunity (and responsibility?) to pro-actively promote the range of data available

Pro-active promotion
- Presentations in university classes that teach and use quantitative methods would increase exposure of PUMFs (and other data sources) to potential researchers
- Social sciences and related disciplines (Economics, Sociology, Psychology, Geography, Education, Business administration, Law, Religious studies, etc.)
- Life sciences and related disciplines (Health and medicine, Kinesiology, Tourism and recreation, etc.)
- Other?

Comments, questions, insights
E. Dianne Looker
dianne.looker@acadiau.ca