Making the Case for Metadata at SRS-NSF National Science Foundation Division of Science Resources Statistics Jeri Mulrow, Geetha Srinivasarao, and John.

Presentation transcript:

Making the Case for Metadata at SRS-NSF
National Science Foundation, Division of Science Resources Statistics
Jeri Mulrow, Geetha Srinivasarao, and John Gawalt
FedCASIC Workshops, BLS, March 17, 2010

1984

1,984

1

1 9

1 9 8

Today’s Talk
- A bit about SRS
- Historical perspective of data and metadata dissemination
- Metadata users and their metadata needs
- Standardization efforts
- Challenges and future vision

A bit about the Division of Science Resources Statistics (SRS)
- Federal statistical agency within NSF
- 11 periodic data collections on the U.S. science and engineering enterprise
- Data dating back to the 1950s

Historical Perspective of SRS data and metadata dissemination
- 1950s – early 1990s: paper only
- Detailed statistical tables with minimum metadata as footnotes
- Publications included:
  - Highlights about the survey
  - Scope and method of survey
  - Questionnaire
  - Cover letters

Example s publication

1990s through 2000s
- 1992: electronic format
- Detailed statistical tables in spreadsheets with minimum metadata as footnotes
- Kept paper, added electronic text: Survey Methodology, Limitations to the data, Definitions, Historical revisions, List of tables
- PDF added: Questionnaire, Cover letters, Instructions

Example PDF

Example – 1991 Electronic spreadsheet

Example – 1991 text

Today
- Source data tables in Excel with footnotes
- HTML / PDF:
  - Highlights of the survey
  - Links to references
  - Survey description
- PDF:
  - Survey Questionnaire
  - Instructions
  - Definitions

Example – 2007 Excel spreadsheet

Example SIRD1

Example – 2007 HTML

Example – 2007 PDF

BUT THAT’S NOT ALL
- Electronic databases
  - Create and download your own customized aggregate tables
- Public use files
  - Access to some microdata series

Metadata in WebCASPAR ….

Metadata in WebCASPAR
- Variable-specific metadata is available under the Info link
- Metadata is not tightly integrated with the data itself: it does not get downloaded with the data

WebCASPAR Taxonomy
- Survey-specific taxonomies
- NCES IPEDS Classification of Instructional Program codes (CIP)
- Integrated taxonomy for querying across surveys

Metadata in SESTAT
- Metadata Explorer is separate from the data
  - Individual variable information
  - Description
  - Question
  - Domain/Availability – history
  - Valid response categories
  - Keywords
- Metadata is not tightly integrated with the data itself: it does not get downloaded with the data
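The gap noted on this slide (metadata that does not travel with the downloaded data) can be illustrated with a minimal Python sketch. It is not how SESTAT or WebCASPAR work; it only assumes that a download could ship a CSV together with a JSON "sidecar" holding the variable-level fields the slide lists. The class `VariableMetadata`, the function `export_with_metadata`, and the variable `EMP_STATUS` with its codes are all hypothetical.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class VariableMetadata:
    """Variable-level metadata, using the fields listed on the slide."""
    name: str
    description: str
    question: str
    availability: List[int]          # survey years in which the variable exists
    valid_responses: Dict[str, str]  # response code -> label
    keywords: List[str] = field(default_factory=list)


def export_with_metadata(rows, variables):
    """Return (csv_text, json_text) so the metadata travels with the data."""
    names = [v.name for v in variables]
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=names, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    sidecar = json.dumps({v.name: asdict(v) for v in variables}, indent=2)
    return buf.getvalue(), sidecar


# Hypothetical variable and rows, invented for illustration only.
emp_status = VariableMetadata(
    name="EMP_STATUS",
    description="Employment status during the reference week",
    question="Were you working for pay or profit during the reference week?",
    availability=[2003, 2006, 2008],
    valid_responses={"1": "Employed", "2": "Not employed"},
    keywords=["employment", "labor force"],
)
rows = [{"EMP_STATUS": "1"}, {"EMP_STATUS": "2"}]
csv_text, sidecar = export_with_metadata(rows, [emp_status])
```

A user who receives both strings can decode any code in the CSV from the sidecar alone, without returning to a separate metadata browser.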

Example -- Public Use file

Example -- Public Use file

Summary – Where are we?
- Different surveys have evolved differently
  - Varying levels of detail/metadata
- Not in a standardized structure
- Hodgepodge

Metadata Users & Their Metadata Needs
- Not a one-to-one relationship, but many-to-many
- They occur at all stages of the survey process

Survey Process
Steps: Define research objectives; Choose mode of collection; Choose sampling frame; Construct and pretest questionnaire; Design and select sample; Recruit and measure sample; Code and edit data; Make postsurvey adjustments; Perform analysis
Stages: Define Scope; Develop Survey Instrument; Develop Sample Design; Collect Data; Process Data; Disseminate Data
Source: Survey Methodology (2009), Groves, Fowler, Couper, Lepkowski, Singer & Tourangeau.

Define Scope
Users: Data User; Survey Manager; Subject Matter Expert; Statistician; Survey Methodologist; Respondent
Metadata (general): Topic; Population of interest; Other data sources
Metadata (specific): Frame options; Sample design options; Historical info/data; User needs; Federal Register notices

Develop Survey Instrument
Users: Data User; Survey Manager; Subject Matter Expert; Statistician; Survey Methodologist; Respondent
Metadata: Questions; Answer choices; Definition of terms; Instructions; Logic flow of questions; Cognitive work; Validity assessments; Reliability assessments; Functionality testing; Alternative questions; Instrument design specs – paper, web, CATI

Develop Sample Design
Users: Data User; Survey Manager; Subject Matter Expert; Statistician
Metadata: Population of interest; Sampling frame / Universe specs; Update schedule; Sample design specs; Desired criteria; Sample selection techniques; Historical information on performance of designs; Estimation methods

Collect Data
Users: Data User; Survey Manager; Subject Matter Expert; Statistician; Database Administrators; Software Developers
Metadata: Variable names and formats; Variable data types; Physical storage; Tables and relationships; Mapping of questions to variables and definitions; Logic flow of questions; Response rates over time; Paradata; Cover letter

Process Data
Users: Data User; Survey Manager; Subject Matter Expert; Statistician; Database Administrators; Software Developers
Metadata: Item response rates; Zero vs. null vs. missing; Edit specifications; Imputation specifications; Recode specifications; Data table specifications; Changes across survey cycles
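The "zero vs. null vs. missing" item is the kind of metadata that changes numeric results, so a small Python sketch may help. The three status codes below are an assumption made for illustration (in particular, reading "null" as "not applicable by design"); the slides do not define them. The names `Status`, `Cell`, `total_reported`, and `item_response_rate` are hypothetical.

```python
from enum import Enum
from typing import NamedTuple, Optional


class Status(Enum):
    REPORTED = "reported"              # respondent gave a value (possibly zero)
    NOT_APPLICABLE = "not_applicable"  # item skipped by design ("null" here)
    MISSING = "missing"                # value was expected but not provided


class Cell(NamedTuple):
    value: Optional[float]
    status: Status


def total_reported(cells):
    """Sum only reported values; a true zero counts, a missing value does not."""
    return sum(c.value for c in cells if c.status is Status.REPORTED)


def item_response_rate(cells):
    """Share of eligible cells (reported or missing) that were reported."""
    eligible = [c for c in cells if c.status is not Status.NOT_APPLICABLE]
    reported = [c for c in eligible if c.status is Status.REPORTED]
    return len(reported) / len(eligible) if eligible else None


# Hypothetical responses to a single item, invented for illustration.
cells = [
    Cell(0.0, Status.REPORTED),
    Cell(125.0, Status.REPORTED),
    Cell(None, Status.MISSING),
    Cell(None, Status.NOT_APPLICABLE),
]
print(total_reported(cells), item_response_rate(cells))
```

Storing the status beside each value keeps a reported zero inside the response-rate numerator, while a missing value lowers the rate and a not-applicable cell is excluded from it altogether.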

Data Dissemination and Publication
Users: Data User; Survey Manager; Subject Matter Expert; Statistician; Database Administrators; Software Developers; Archivist
Metadata: History of changes; Methodology report; Public use files with documentation; Author/contact source; Who can access what; Type of product; Content format; URL; Keywords; Relationships; Metadata schema

Who are the Metadata Users?
- Data users
  - Basic & advanced analysts
  - General public
- Respondent
- Survey Manager
- Survey Methodologist
- Statistician
- Subject Matter Expert
- Software Developer
- Database Administrator
- Archivist

Need for Standardization of Metadata is Apparent, is Critical

Standardization Efforts
- Dublin Core
- SDMX (aggregate level)
- DDI 3.0 (record level)
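As a rough illustration of the simplest of these standards, the Python sketch below describes this presentation with a few Dublin Core elements (`title`, `creator`, `date`, `type`, `subject`). The element names and namespace URI belong to the Dublin Core Metadata Element Set; the wrapping `metadata` element, the helper `dublin_core_record`, and the choice of subject term are assumptions made for the example, not an SRS format.

```python
import xml.etree.ElementTree as ET

# Namespace of the 15-element Dublin Core Metadata Element Set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a flat XML record from (element, value) pairs.

    The <metadata> wrapper is a hypothetical container for this sketch.
    """
    root = ET.Element("metadata")
    for element, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return root


# Values taken from the title slide; "subject" is an illustrative keyword.
record = dublin_core_record([
    ("title", "Making the Case for Metadata at SRS-NSF"),
    ("creator", "Jeri Mulrow"),
    ("creator", "Geetha Srinivasarao"),
    ("creator", "John Gawalt"),
    ("date", "2010-03-17"),
    ("type", "Text"),
    ("subject", "metadata"),
])
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Dublin Core describes a resource as a whole, which is why the slide pairs it with SDMX for aggregate data and DDI 3.0 for record-level data; neither of those is sketched here.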

Recent SRS Efforts
- Data Repository (Oracle)
- Inclusion of some metadata
- SAS/ACCESS User Interface for internal users
- Evaluating external user interfaces

SRS Efforts -- Working with Commercial Contractors
- Requirements for Data / Metadata delivery
- Examples document
- Standard contracting language
- Checklist

SRS Adopted Basic Operating Procedures
- Using Oracle to store microdata and metadata
- Collecting metadata in whatever format
- Keeping it all organized

Challenges
- Getting all the players on the same page
  - Many different users
  - Many different uses
  - Many different providers
  - Many different products
  - Many different formats
- Cost
- Keeping it all straight

Near Future Vision
SRS Data Repository Data and Metadata Taxonomy Efforts Data & Metadata Dissemination Analytic tools DDI 3.0, SDMX …

Near Future Vision
SRS Data Repository Data and Metadata Taxonomy Efforts Data & Metadata Dissemination Analytic tools DDI 3.0, SDMX … Paradata

1984

Thank you!