Accessing data – a user’s perspective

1 Accessing data – a user’s perspective
Presented to the Atlantic DLI Training Workshop, Acadia University, April 11-12, 2017
E. Dianne Looker, Professor emerita

2 Outline of presentation
- DLI Opportunities & Responsibilities
- Secondary versus primary data
- Accessing data using CANSIM
- Accessing data using the RDCs
- Accessing data using DLI
- Ways to improve DLI access
- Providing information on data repositories
- Pro-active information to researchers

3 DLI Opportunities & Responsibilities
- Pro-active promotion of data options
- Providing information on data options to faculty
- Continuum of data options from Statistics Canada
- Other data options through data repositories
- Providing access to Public Use Microdata Files (PUMFs)
- Providing support for access to PUMFs

4 Secondary versus primary data
Advantages of researchers collecting primary data:
- Researcher control
- Access to all data; researcher controls any suppression
Disadvantages of primary data collection:
- Cost
- Time – data collection, coding, data cleaning
- Often limited in scope; response rate issues
- Complexities and constraints of obtaining ethics approval

5 Advantages of secondary data:
- Low or no cost to individual researcher
- Data processed, cleaned and checked
Disadvantages of secondary data:
- No researcher control – measures and sample might not match your goals
- Data suppression – variables and codes
- Need access – takes some time and knowledge to access and run
- “Over” interpretation of particular data sets
- Example of “over” interpretation – High School and Beyond data set, U.S.

6 My experiences Using CANSIM
- Analysis of data on graduate students in Canada for Canadian Association for Graduate Studies – 41st and 42nd Reports
- Summary data for background for research papers
- HOWEVER, I only recently learned of CANSIM

7 Accessing data using CANSIM
Advantages:
- Large, often nationally representative data sets
- Excellent for time trends
- Data quality has been checked
- No data collection or processing costs
- Documentation of measures
Disadvantages:
- Limited number of variables
- Limits on possible multi-variate analyses
- Codes and recodes set by others
- Documentation is often restricted to footnotes in HTML tables – not evident if you download

8 My experience using the RDCs:
- Co-director of the Atlantic RDC (ARDC) & representative to national Coordinating Committee
- Institutional representative to the ARDC
- Researcher using various data sets, including: Youth in Transition and National Longitudinal Survey of Children and Youth
- Reviewer for access to RDCs

9 Accessing data through the RDCs
Advantages:
- No data collection or processing costs
- Few suppressed variables
- Can undertake multi-variate analyses
Disadvantages:
- Physical access to RDC required
- Need focused project – no “exploration”
- Challenges downloading data into SPSS
- Delays in accessing output
- Challenge of avoiding residual disclosure problems

10 My experience with DLI:
- Analysed data from various General Social Surveys
- Analysed data from Survey of Approaches to Educational Planning
- Accessed Aboriginal Peoples Survey
- Accessed National Graduate Survey

11 Finding Public Use Data Files
- Statistics Canada lists the Public Use Data file surveys (PUMFs) and measures: MFFILE%22&showSum=hide#
- Odesi allows you to search by variable in a range of data sets
- See also this “small area” data guide: a%20Data%20Guide_FINAL2.pdf

12 Available data:
- Statistics Canada
- Other Government agencies, including:
  - Employment & Social Development Canada
  - Immigration, Refugees and Citizenship Canada
  - Canada Revenue Agency
  - Canada Mortgage and Housing Corporation
  - Indigenous and Northern Affairs Canada
  - Innovation, Science and Economic Development Canada
  - Natural Resources Canada

13 Other data:
- Archived data in data repositories (see repositories for Canadian and international repositories)
- The three granting councils require that: “Research data resulting from agency funding should normally be preserved in a publicly accessible, secure and curated repository or other platform for discovery and reuse by others.”

14 Accessing data through DLI
Advantages:
- No data collection or processing costs
- Data has been cleaned and checked
- Can access national surveys
Disadvantages:
- Some variables are suppressed
- No list of suppressed variables; documentation is usually for the full data set
- Some codes are collapsed
- You have to take the time to access the data to find that the codes are collapsed
- Different files are in different formats

15 Using DLI – there is a need for:
- Clear lists of suppressed variables for each data set
- Clear lists of codes used in PUMF (not just in the full master file)
- Having these lists would also facilitate applications to the RDC.
- Information on related files that complement particular PUMF files in terms of topic areas
- Information on how to access alternate data files that are stored in a data repository
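
If both the master-file codebook and the PUMF codebook were available in machine-readable form, the two lists requested above could in principle be generated mechanically. The Python sketch below is only an illustration of that idea: the variable names, codes, and the `compare_codebooks` helper are hypothetical, and it assumes each codebook has already been reduced to a mapping from variable name to its set of codes. It is not a Statistics Canada or DLI tool.

```python
def compare_codebooks(master, pumf):
    """Compare two codebooks given as {variable name: set of codes}.

    Returns (suppressed, changed):
      suppressed - variables in the master file that are absent from the PUMF
      changed    - for variables in both files, the master codes that do not
                   appear in the PUMF (a hint that codes were collapsed)
    """
    suppressed = sorted(var for var in master if var not in pumf)
    changed = {}
    for var in sorted(master.keys() & pumf.keys()):
        missing_codes = master[var] - pumf[var]
        if missing_codes:
            changed[var] = sorted(missing_codes)
    return suppressed, changed


# Hypothetical codebooks, invented for illustration only.
master_codebook = {
    "AGE": {"15-19", "20-24", "25-29", "30-34"},
    "PROVINCE": {"NS", "NB", "PE", "NL"},
    "POSTAL_CODE": {"B4P", "B3H"},
}
pumf_codebook = {
    "AGE": {"15-24", "25-34"},          # age groups collapsed in the PUMF
    "PROVINCE": {"NS", "NB", "PE", "NL"},
}

suppressed, changed = compare_codebooks(master_codebook, pumf_codebook)
print("Suppressed variables:", suppressed)
print("Variables with changed codes:", changed)
```

A list like this, published alongside each PUMF, would let a researcher see what is missing before downloading the file and decide earlier whether an RDC application is needed.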

16 In general:
- Researchers need to know about the range of options and how to access them
- Many researchers do not know about CANSIM, the RDCs or PUMFs
- More standardization in file format for PUMFs would be helpful
- More and better documentation of PUMFs should be available before accessing the files
- Many researchers will need assistance setting up and accessing the files

17 In sum:
- There are a lot of options for accessing data for research
- All have advantages and disadvantages
- PUMFs offer flexibility combined with access to micro-data
- Better documentation would increase the usefulness of PUMFs
- Data librarians and DLI representatives have the opportunity (and responsibility?) to pro-actively promote the range of data available

18 Pro-active promotion
- Presentations in university classes that teach and use quantitative methods would increase exposure of PUMFs (and other data sources) to potential researchers
  - Social sciences and related disciplines (Economics, Sociology, Psychology, Geography, Education, Business administration, Law, Religious studies, etc.)
  - Life sciences and related disciplines (Health and medicine, Kinesiology, Tourism and recreation, etc.)
  - Other?

19 Comments, questions, insights
E. Dianne Looker

