Metadata models to support the statistical cycle: IMDB

Presentation transcript:

Metadata models to support the statistical cycle: IMDB Alice Born Statistics Canada UNECE Workshop on Statistical Metadata July 4 to 6, 2007

Outline
Survey life cycle and the IMDB
IMDB model
Data dimension model
Business dimension model
Questionnaire model
Registration
Classification of administered items
Use of metadata in the statistical system

Role of the IMDB
Information management: interpretability of Statistics Canada’s 590+ current surveys
Assist in coherence of the data
Promote knowledge sharing across STC and with external users
Preserve corporate memory
Promote reuse of our metadata assets

IMDB in the survey life cycle
[Diagram. Survey phases: Design, Collect, Edit, Estimate, Tabulate, Publish, Archive, with IMDB metadata alongside. Other labels: Operational Data, Registers, Survey Data, Administrative Data, Operational Data Stores, Data Warehouses, Operations Management, Quality Assurance, Analysis, Dissemination.]

IMDB metadata model
Corporate Metadata Repository (CMR), which is an extension of ISO/IEC 11179 Metadata Registries
Statistical surveys; sample; questionnaire; data sets; products; systems
IMDB: data dimension, business dimension, questionnaire model, administration and documents model

Data dimension model: ISO/IEC 11179
[Diagram labels: Data Element (survey variable), Data Element Concept, Object Class, Property, Conceptual Domain, Value Domain.]

Data dimension model
Currently in the IMDB:
85 object classes (statistical units)
290 properties
506 data element concepts (object class + property)
202 conceptual domains (representation class + property)
1509 value domains (classifications)
1034 data elements (representation class + property + object class; variables)
Example: Type of revenues of establishments
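The composition rules on this slide (data element concept = object class + property; data element = representation class + property + object class) can be illustrated with a minimal Python sketch. The class names, the split of the example name into "Type", "revenues" and "establishments", and the placeholder permitted values are assumptions made for illustration; they are not taken from the IMDB itself.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ObjectClass:
    name: str  # the statistical unit, e.g. "establishments"


@dataclass(frozen=True)
class Property:
    name: str  # the characteristic measured, e.g. "revenues"


@dataclass(frozen=True)
class DataElementConcept:
    object_class: ObjectClass
    prop: Property

    @property
    def name(self) -> str:
        # Data element concept = object class + property
        return f"{self.prop.name} of {self.object_class.name}"


@dataclass(frozen=True)
class ValueDomain:
    name: str
    permitted_values: Tuple[str, ...]  # placeholder codes, not real IMDB values


@dataclass(frozen=True)
class DataElement:
    concept: DataElementConcept
    representation_class: str
    value_domain: ValueDomain

    @property
    def name(self) -> str:
        # Data element = representation class + property + object class
        return f"{self.representation_class} of {self.concept.name}"


concept = DataElementConcept(ObjectClass("establishments"), Property("revenues"))
domain = ValueDomain("Hypothetical revenue type codes", ("A", "B"))
element = DataElement(concept, "Type", domain)
print(element.name)
```

Keeping the concept separate from its representation is what lets one concept be reused by several data elements that differ only in how values are represented.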

Business dimension model in the IMDB
[Diagram labels: Survey, Survey instance, Survey design, Frame and sample, Questionnaire, Applications/Software, Datasets, Products (COR), Data elements, Value domains.]

Administered items and the administrative layer
[Diagram. Labels shown: Statistical Activity, Survey, Survey instance, Universe, Frame, Instrument, Question, Data file, Methodology, Data Element, Data Element Concept, Object Class, Property, Formula, Conceptual Domain, Value Domain, Organization, Stewardship, Contact, Documentation, Identification, Time Frame, Keyword, Classification, Theme.
Methodology components shown: Instrument design, Sampling, Data source, Error detection, Imputation, Estimation, Quality evaluation, Disclosure control, Revisions and seasonal adjustment, Data accuracy.]

Information management - Administered items
An administered item is any item that is managed, tracked, organized and registered in a registry.
Each administered item has its own item-specific characteristics, plus shared administrative characteristics that are common to all administered items: the administrative layer.

Information management - Administrative Layer
Shared administrative characteristics:
Terminological Designation (Names)
Terminological Description
Time Frame
Organization/Contact
Reference Document (1)
Version Management
Stewardship/Registration
Classification
(1) A reference document is itself an administered item with all the administrative layer characteristics.

IMDB Administrative Layer - Version Management
A version is a snapshot of the information recorded for the administered item.
Rules for creating a version are established for each type of administered item.

Information Management - IMDB Administrative Layer
The administrative layer is used to manage administrative information for all IMDB administered items.
Administered items are managed in a consistent manner.
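As a rough illustration of the idea that every administered item carries the same administrative layer next to its item-specific characteristics, and that a version is a snapshot of what is recorded, here is a minimal Python sketch. All class names, field names and sample values are hypothetical; the actual IMDB structures are not described in the slides.

```python
import copy
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class AdministrativeLayer:
    # Shared characteristics, common to every administered item
    name: str
    description: str
    organization: str
    steward: str
    keywords: List[str] = field(default_factory=list)


@dataclass
class AdministeredItem:
    item_id: int
    item_type: str                      # e.g. "Survey", "Data element"
    admin: AdministrativeLayer          # shared administrative layer
    specific: Dict[str, Any] = field(default_factory=dict)   # item-specific part
    versions: List[Dict[str, Any]] = field(default_factory=list)

    def snapshot(self) -> int:
        """Freeze the current state as a new version and return its number."""
        self.versions.append(
            {"admin": asdict(self.admin), "specific": copy.deepcopy(self.specific)}
        )
        return len(self.versions)


survey = AdministeredItem(
    item_id=1,
    item_type="Survey",
    admin=AdministrativeLayer(
        name="Example survey",
        description="First description",
        organization="Example division",
        steward="Example steward",
    ),
    specific={"survey_type": "Direct"},
)
version_number = survey.snapshot()
survey.admin.description = "Revised description"   # later edits leave version 1 intact
print(version_number, survey.versions[0]["admin"]["description"])
```

When a snapshot is taken would depend on the item type, since the slides say versioning rules are set per type of administered item; this sketch leaves that decision to the caller.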

Surveys
Metadata in the IMDB is organized around the survey administered item.
A survey refers to the collection, compilation and publication of data measuring characteristics of a population.
Three types of surveys are recognized: direct, administrative and derived.

Statistical Activities
A group of surveys that share a common feature and common explanatory text.
E.g., System of National Accounts, Unified Enterprise Statistics, Health Statistics

Common metadata set
Statistical activity
Survey (direct, administrative, derived)
Target population (population, statistical unit)
Survey instance (each survey process)
Collection instrument (questionnaire)
Methodology
Data accuracy
Documentation
Data file (data elements, value domains)

Common metadata set for survey life cycle: Methodology
Instrument design: questionnaire design, testing
Sampling: frame, stratification, methods
Collection method: mandatory/voluntary, survey type, collection period, follow up, paper/CATI/CAPI
Error detection: capture and edit methods
Imputation: manual/automatic, rates, methods, software
Estimation: non-response adjustments, calibration, weight-share methods, variance estimation methods
Quality evaluation
Disclosure control
Revisions and seasonal adjustment

Questionnaire model
[Entity diagram; attributes as shown on the slide.]
Question block: Item_ID, Block_type, etc.
(unlabelled entity): Item_ID, etc.
Question: Item_ID, DE_item_ID, etc.
Response choice: Question_item_ID, Response choice, etc.
Data element: Item_ID, Representation_class, etc.
Value domain: Item_ID, VD_type, etc.
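The entities and key attributes named on this slide can be sketched as relational tables. The following Python example uses the standard sqlite3 module with an in-memory database. Only the column names that appear on the slide come from the source; the `question_text` column, the foreign-key reading of `DE_item_ID` and `Question_item_ID`, and all sample rows are assumptions for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE question_block (item_id INTEGER PRIMARY KEY, block_type TEXT);
CREATE TABLE value_domain   (item_id INTEGER PRIMARY KEY, vd_type TEXT);
CREATE TABLE data_element   (item_id INTEGER PRIMARY KEY, representation_class TEXT);
CREATE TABLE question (
    item_id       INTEGER PRIMARY KEY,
    de_item_id    INTEGER REFERENCES data_element(item_id),  -- assumed link
    question_text TEXT                                       -- assumed column
);
CREATE TABLE response_choice (
    question_item_id INTEGER REFERENCES question(item_id),   -- assumed link
    response_choice  TEXT
);
"""


def build_demo() -> sqlite3.Connection:
    """Create the sketch schema and load a few hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO data_element VALUES (10, 'Type')")
    conn.execute("INSERT INTO question VALUES (1, 10, 'What is your marital status?')")
    conn.executemany(
        "INSERT INTO response_choice VALUES (?, ?)",
        [(1, "Married"), (1, "Single")],
    )
    return conn


def choices_for_data_element(conn: sqlite3.Connection, de_item_id: int) -> list:
    """List the response choices of every question tied to one data element."""
    rows = conn.execute(
        """
        SELECT rc.response_choice
        FROM response_choice AS rc
        JOIN question AS q ON q.item_id = rc.question_item_id
        WHERE q.de_item_id = ?
        ORDER BY rc.rowid
        """,
        (de_item_id,),
    ).fetchall()
    return [row[0] for row in rows]


demo = build_demo()
print(choices_for_data_element(demo, 10))
```

Linking each question to a data element is what would allow a designer to ask whether a concept or question already exists before writing a new one.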

Questionnaire model in the IMDB
Metadata for survey planning and design phase
Does the concept or question already exist? Metadata discovery: STCWiki
Align with output variables: definitions
Harmonized Content Modules Project
Content development of key socio-demographic data elements (e.g., marital status, age, ethnic origin) in the IMDB for registration as an STC standard
Leading to development of standard question blocks and questions, stored in the IMDB
Specifications (i.e., skip patterns, modes), BLAISE and other code stored in Survey Specification Manager

Registration/Stewardship
Registration and stewardship information is managed for each administered item:
Who is the owner of the item?
Who is responsible for the item’s information?
Who is responsible for registration?
Verification for editorial, accuracy, bilingual conformance?
State: new, candidate, recorded, qualified, standard, preferred/prescribed standard, retired?
Degree of sharing/harmonization: divisional, branch, agency, provincial, national, international?
Dissemination: internal, public?
Versioning note

Registration Attributes in the IMDB
Three registration attributes:
Registration status: identifies the quality or progression of quality
Registration level: level of conformance or harmonization
Administrative status: stage in the registration process

1. Registration status
[Diagram. Statuses shown: Incomplete, Candidate, Recorded, Qualified, Standard, Preferred standard; also Application, Historical, Superseded, Retired.
Roles shown: Submitter, Steward, Responsible Owner (content), Regular Registrar, Standards Division Registrar, Registration Authority.
Criteria noted: completeness, accuracy, adherence to quality and terminological description standards.]

2. Registration level
Level of conformance or harmonization
[Diagram. Levels shown: Survey, Program-specific, Departmental, Provincial, Canadian, U.S., International, Recommended.]

3. Administrative status
Stages in registration process
[Diagram. Stages shown: Not registered, New, Reserved for edit, Registered, De-registered.]
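The registration attributes can be pictured as enumerations attached to each administered item. The Python sketch below is illustrative only: the ordering of the registration statuses is read from the slides, the one-step `promote` rule is an assumption rather than a documented IMDB procedure, and the terminal statuses (retired, superseded, historical) are left out for brevity.

```python
from enum import Enum, IntEnum


class RegistrationStatus(IntEnum):
    """Quality progression; the numeric order is an assumed reading of the slide."""
    INCOMPLETE = 0
    CANDIDATE = 1
    RECORDED = 2
    QUALIFIED = 3
    STANDARD = 4
    PREFERRED_STANDARD = 5


class AdministrativeStatus(Enum):
    """Stage in the registration process (no ordering implied here)."""
    NOT_REGISTERED = "Not registered"
    NEW = "New"
    RESERVED_FOR_EDIT = "Reserved for edit"
    REGISTERED = "Registered"
    DE_REGISTERED = "De-registered"


def promote(status: RegistrationStatus) -> RegistrationStatus:
    """Move an item one step up the quality progression (hypothetical rule)."""
    if status is RegistrationStatus.PREFERRED_STANDARD:
        raise ValueError("already at the highest registration status")
    return RegistrationStatus(status + 1)


current = promote(RegistrationStatus.CANDIDATE)
print(current.name, AdministrativeStatus.REGISTERED.value)
```

Using an ordered enumeration for status makes comparisons such as "at least qualified" a simple inequality, while the administrative status stays an unordered label.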

Classification of administered items
Organization and classification of the administered item:
Keyword
STC taxonomy (28 themes, 200+ sub-themes)
UNECE Classification of International Statistical Activities: data elements
Program Activity Architecture for reporting to Treasury Board Secretariat and to parliament
…
Organization of the item’s administrative and item-specific information for different purposes:
HTML, Wiki, SDMX, CWM, DDI, XBRL, …

Survey design and dissemination phases
[Diagram. Phases: Collect, Edit, Estimate, Tabulate, Publish.
IMDB content shown: Concepts (Object Class, Property, Data Element Concept); Data Elements; Questions; Question Blocks; Classifications (Conceptual Domain, Value Domain); Survey; Universe; Frame; Instance; Collection Instrument; Methodology; Data Files.
Other label: Enterprise Architecture.]

Reuse of Information Assets in Applications Development
[Diagram. IMDB linked to: classification coding; collection instrument development (Survey Specification Manager; Integrated Questionnaire and Metadata System); publishing; other applications; Software Register.]

Reuse of Information Assets: Integration with Data
[Diagram labels: IMDB, Data Warehouses, CANSIM.]

Reuse of Information Assets in Dissemination and information discovery
One metadata source + many uses for the information = many output formats
[Diagram. IMDB feeding: Wiki, HTML, SDMX, DDI (?).]
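The "one source, many output formats" idea can be sketched in Python by rendering a single metadata record in two ways. The record fields and the placeholder definition are hypothetical, and the two renderers are generic stand-ins (an HTML definition list and a MediaWiki-style definition list); they do not reproduce the actual IMDB outputs or the SDMX and DDI formats.

```python
from html import escape

# A single hypothetical metadata record; field names are illustrative only.
RECORD = {
    "name": "Type of revenues of establishments",
    "item_type": "Data element",
    "definition": "(placeholder definition)",
}


def to_html(record: dict) -> str:
    """Render the record as an HTML definition list."""
    items = "".join(
        f"<dt>{escape(key)}</dt><dd>{escape(value)}</dd>"
        for key, value in record.items()
    )
    return f"<dl>{items}</dl>"


def to_wiki(record: dict) -> str:
    """Render the same record as a MediaWiki-style definition list."""
    return "\n".join(f"; {key} : {value}" for key, value in record.items())


print(to_html(RECORD))
print(to_wiki(RECORD))
```

Because both renderers read the same record, a correction made once in the source shows up in every output format.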

Corporate Memory: Data Files
Dissemination and archive phases
[Diagram labels: Operational Data, Registers, Survey Data, Administrative Data, Operational Data Stores, Clean Master File, Public Use Master File, Archived Data, IMDB archival information.]