Metadata models to support the statistical cycle: IMDB

1 Metadata models to support the statistical cycle: IMDB
Alice Born, Statistics Canada
UNECE Workshop on Statistical Metadata, July 4 to 6, 2007

2 Outline
Survey life cycle and the IMDB
IMDB model
Data dimension model
Business dimension model
Questionnaire model
Registration
Classification of administered items
Use of metadata in the statistical system

3 Role of the IMDB
Information management: interpretability of Statistics Canada’s 590+ current surveys
Assist in coherence of the data
Promote knowledge sharing across STC and with external users
Preserve corporate memory
Promote reuse of our metadata assets

4 IMDB in the survey life cycle
Diagram labels: survey phases (Design, Collect, Edit, Estimate, Tabulate, Publish, Archive); IMDB metadata; Operations Management, Quality Assurance, Analysis, Dissemination; Operational Data Registers, Survey Data, Administrative Data, Operational Data Stores, Data Warehouses

5 IMDB metadata model
Corporate Metadata Repository (CMR), which is an extension of ISO/IEC 11179 Metadata Registries
Statistical surveys, Sample, Questionnaire, Data sets, Products, Systems
IMDB: data dimension, business dimension, questionnaire model, administration and documents model

6 Data dimension model – ISO/IEC 11179
Diagram labels: Data Element, Data Element Concept, Object Class, Survey variable, Property, Conceptual Domain, Value Domain

7 Data dimension model
Currently in the IMDB:
85 object classes (statistical units)
290 properties
506 data element concepts (object class + property)
202 conceptual domains (representation class + property)
1509 value domains (classifications)
1034 data elements (variables = representation class + property + object class)
Type of revenues of establishments
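
The composition rules on this slide can be sketched in Python. This is only an illustration, not the IMDB schema: the class and field names are hypothetical, the permitted values are made up, and the split of "Type of revenues of establishments" into representation class "Type", property "revenues" and object class "establishments" is an assumption read off the slide's formula.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DataElementConcept:
    """Object class + property, as stated on the slide."""
    object_class: str   # the statistical unit, e.g. "establishments"
    property_name: str  # the characteristic, e.g. "revenues"

    def label(self) -> str:
        return f"{self.property_name} of {self.object_class}"


@dataclass(frozen=True)
class ValueDomain:
    """A classification: the permitted values for a data element."""
    name: str
    permitted_values: Tuple[str, ...]


@dataclass(frozen=True)
class DataElement:
    """Representation class + property + object class (a survey variable)."""
    representation_class: str
    concept: DataElementConcept
    value_domain: ValueDomain

    def label(self) -> str:
        return f"{self.representation_class} of {self.concept.label()}"


# Hypothetical values, for illustration only.
concept = DataElementConcept(object_class="establishments", property_name="revenues")
domain = ValueDomain(name="Revenue types (hypothetical)", permitted_values=("sales", "rentals"))
element = DataElement(representation_class="Type", concept=concept, value_domain=domain)

print(element.label())
```

Keeping the concept separate from the value domain is what lets one data element concept be paired with different classifications without redefining the object class or property.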

8 Business dimension model in the IMDB
Diagram labels: Survey, Applications/Software, Frame and sample, Survey instance, Questionnaire, Datasets, Products (COR), Survey design, Data elements, Value domains

9 Administered items
Diagram labels, grouped:
Administered items: Statistical Activity, Survey, Universe, Frame, Survey instance, Instrument, Question, Data file, Methodology, Data Element, Data Element Concept, Object Class, Property, Formula, Conceptual Domain, Value Domain
Administrative layer: Organization, Stewardship, Contact, Documentation, Identification, Time Frame, Keyword, Classification, Theme
Methodology: Instrument design, Sampling, Data source, Error detection, Imputation, Estimation, Quality evaluation, Disclosure control, Revisions and seasonal adjustment, Data accuracy

10 Information management - Administered items
An administered item is any item that is managed, tracked, organized and registered in a registry.
Each administered item has its own item-specific characteristics, plus shared administrative characteristics that are common to all administered items (the administrative layer).

11 Information management - Administrative Layer
Shared administrative characteristics:
Terminological Designation (Names)
Terminological Description
Time Frame
Organization/Contact
Reference Document (itself an administered item with all the administrative layer characteristics)
Version Management
Stewardship/Registration
Classification
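
One way to picture the shared characteristics is as a common base type that every administered item inherits. The sketch below is an assumption about how this could be modelled, not the IMDB implementation; all class names, field names and sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AdministrativeLayer:
    """Characteristics shared by every administered item (hypothetical field names)."""
    name: str                                   # terminological designation
    description: str = ""                       # terminological description
    time_frame: Optional[str] = None
    organization_contact: Optional[str] = None
    reference_documents: List["ReferenceDocument"] = field(default_factory=list)
    version: int = 1
    registration_status: str = "Incomplete"
    keywords: List[str] = field(default_factory=list)  # classification


@dataclass
class ReferenceDocument(AdministrativeLayer):
    """A reference document is itself an administered item."""
    citation: str = ""


@dataclass
class Survey(AdministrativeLayer):
    """Item-specific characteristics sit on top of the shared layer."""
    survey_type: str = "Direct"  # Direct, Administrative or Derived


# Hypothetical records, for illustration only.
doc = ReferenceDocument(name="Methodology report", citation="Internal report (hypothetical)")
survey = Survey(name="Hypothetical Survey", survey_type="Derived", reference_documents=[doc])
```

Because `ReferenceDocument` inherits the same layer, it carries its own version, registration status and keywords, which mirrors the note on the slide.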

12 IMDB Administrative Layer - Version Management
A snapshot of the information recorded for the administered item. Rules for creation of a version are established for each type of administered item.
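
A minimal sketch of snapshot-based versioning with per-type rules follows. The rule table, the item types and the field names (`text`, `note`, `methodology`) are all hypothetical; the slide does not say what the actual IMDB rules are.

```python
import copy

# Hypothetical rules: which changes trigger a new version for each item type.
VERSION_RULES = {
    "question": lambda old, new: old.get("text") != new.get("text"),
    "survey": lambda old, new: old.get("methodology") != new.get("methodology"),
}


class VersionedItem:
    def __init__(self, item_type, info):
        self.item_type = item_type
        self.info = dict(info)
        # Version 1 is a snapshot of the information as first recorded.
        self.versions = [copy.deepcopy(self.info)]

    def update(self, **changes):
        """Apply changes; store a new snapshot only if the type's rule says so."""
        new_info = {**self.info, **changes}
        if VERSION_RULES[self.item_type](self.info, new_info):
            self.versions.append(copy.deepcopy(new_info))
        self.info = new_info
        return len(self.versions)


question = VersionedItem("question", {"text": "What is your age?", "note": ""})
after_note = question.update(note="editorial fix")                 # no new version
after_text = question.update(text="What is your date of birth?")   # new version
```

Earlier snapshots stay untouched, so the first entry in `question.versions` still holds the original wording.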

13 Information Management - IMDB Administrative Layer
The administrative layer is used to manage administrative information for all IMDB administered items. Administered items are managed in a consistent manner.

14 Surveys
Metadata in the IMDB is organized around the survey administered item
A survey refers to the collection, compilation and publication of data measuring characteristics of a population
Three types of surveys are recognized: direct, administrative, derived

15 Statistical Activities
A group of surveys that share common features and common explanatory text
E.g., System of National Accounts, Unified Enterprise Statistics, Health Statistics

16 Common metadata set
Statistical activity
Survey (direct, administrative, derived)
Target population (population, statistical unit)
Survey instance (each survey process)
Collection instrument (questionnaire)
Methodology
Data accuracy
Documentation
Data file (data elements, value domains)

17 Common metadata set for survey life cycle
Methodology:
Instrument design: questionnaire design, testing
Sampling: frame, stratification, methods
Collection method: mandatory/voluntary, survey type, collection period, follow up, paper/CATI/CAPI
Error detection: capture and edit methods
Imputation: manual/automatic, rates, methods, software
Estimation: non-response adjustments, calibration, weight-share methods, variance estimation methods
Quality evaluation
Disclosure control
Revisions and seasonal adjustment
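
As a small illustration of how a common metadata set can be enforced, the sketch below checks a methodology record against the section names listed on this slide. The function name and the dictionary layout are hypothetical, and the sample text is invented.

```python
# Section names taken from the slide; the record layout is an assumption.
METHODOLOGY_SECTIONS = (
    "Instrument design",
    "Sampling",
    "Collection method",
    "Error detection",
    "Imputation",
    "Estimation",
    "Quality evaluation",
    "Disclosure control",
    "Revisions and seasonal adjustment",
)


def missing_sections(methodology):
    """Return the section names that are absent or contain only whitespace."""
    return [s for s in METHODOLOGY_SECTIONS if not methodology.get(s, "").strip()]


# Hypothetical draft record: one section filled in, one left blank.
draft = {"Sampling": "Stratified sample drawn from a frame.", "Imputation": "   "}
gaps = missing_sections(draft)
```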

18 Questionnaire model
Entity diagram (attributes in parentheses):
Question block (Item_ID, Block_type, etc…)
[unlabelled entity] (Item_ID, etc…)
Data element (Item_ID, Representation_class, etc…)
Response choice (Question_item_ID, Response choice, etc…)
Question (Item_ID, DE_item_ID, etc…)
Value domain (Item_ID, VD_type, etc…)
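
The key attributes on this slide suggest that a question points to its data element through DE_item_ID and that a response choice points to its question through Question_item_ID. The sketch below assumes exactly that and nothing more: the `text` field, the helper functions and the sample rows are hypothetical, and the attributes elided as "etc…" are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    item_id: int
    representation_class: str


@dataclass(frozen=True)
class Question:
    item_id: int
    de_item_id: int  # assumed to reference DataElement.item_id
    text: str        # hypothetical field, not shown on the slide


@dataclass(frozen=True)
class ResponseChoice:
    question_item_id: int  # assumed to reference Question.item_id
    response_choice: str


def questions_for_data_element(questions, de_item_id):
    """Find existing questions that already measure a given data element."""
    return [q for q in questions if q.de_item_id == de_item_id]


def choices_for_question(choices, question_item_id):
    return [c.response_choice for c in choices if c.question_item_id == question_item_id]


# Hypothetical rows, for illustration only.
marital_status = DataElement(item_id=10, representation_class="Category")
questions = [
    Question(item_id=100, de_item_id=10, text="What is your marital status?"),
    Question(item_id=101, de_item_id=11, text="What is your age?"),
]
choices = [
    ResponseChoice(question_item_id=100, response_choice="Married"),
    ResponseChoice(question_item_id=100, response_choice="Single"),
]

existing = questions_for_data_element(questions, marital_status.item_id)
options = choices_for_question(choices, 100)
```

A lookup such as `questions_for_data_element` is the kind of query a survey designer would run to see whether a question for a concept already exists before writing a new one.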

19 Questionnaire model in the IMDB
Metadata for survey planning and design phase
Does the concept or question already exist?
Metadata discovery: STCWiki
Align with output variables: definitions
Harmonized Content Modules Project
Content development of key socio-demographic data elements (e.g., marital status, age, ethnic origin) in the IMDB for registration as an STC standard
Leading to development of standard question blocks and questions, stored in the IMDB
Specifications (i.e., skip patterns, modes), BLAISE and other code stored in Survey Specification Manager

20 Registration/Stewardship
Registration and stewardship information is managed for each administered item:
Who is the owner of the item?
Who is responsible for the item’s information?
Who is responsible for registration?
Verification for editorial, accuracy, bilingual conformance?
State: new, candidate, recorded, qualified, standard, preferred/prescribed standard, retired?
Degree of sharing/harmonization: divisional, branch, agency, provincial, national, international?
Dissemination: internal, public?
Versioning note

21 Registration Attributes in the IMDB
Three registration attributes:
Registration status: identifies the quality or progression of quality
Registration level: level of conformance or harmonization
Administrative status: stage in the registration process

22 Registration status
1. Registration status (completeness, accuracy, adherence to quality and terminological description standards)
Diagram labels, grouped:
Statuses: Preferred standard, Standard, Qualified, Recorded, Candidate, Incomplete; also Retired, Superseded, Historical, Application
Roles: Registration Authority, Standards Division Registrar, Regular Registrar, Responsible Owner (Content), Submitter, Steward
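
Since registration status is described as a progression of quality, it can be sketched as an ordered enumeration. The ordering below (Incomplete up to Preferred standard) is an assumption read from the slides, the statuses Retired, Superseded, Historical and Application are left out because their place in the order is not stated, and the helper name and sample items are hypothetical.

```python
from enum import IntEnum


class RegistrationStatus(IntEnum):
    """Assumed progression, lowest to highest quality."""
    INCOMPLETE = 0
    CANDIDATE = 1
    RECORDED = 2
    QUALIFIED = 3
    STANDARD = 4
    PREFERRED_STANDARD = 5


def at_least(items, minimum):
    """Return names of items whose status has reached the given minimum."""
    return [name for name, status in items.items() if status >= minimum]


# Hypothetical registry contents, for illustration only.
registry = {
    "Marital status": RegistrationStatus.STANDARD,
    "Age": RegistrationStatus.QUALIFIED,
    "Ethnic origin": RegistrationStatus.CANDIDATE,
}

reusable = at_least(registry, RegistrationStatus.QUALIFIED)
```

An ordered type makes questions such as "which data elements are mature enough to reuse in a new questionnaire" a simple comparison.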

23 Level of conformance or harmonization
2. Registration level
Diagram labels: Departmental, International, Recommended, U.S., Program-specific, Canadian, Survey, Provincial

24 Stages in registration process
3. Administrative status
Diagram labels: De-registered, Registered, Reserved for edit, New, Not registered

25 Classification of administered items
Organization and classification of the administered item:
Keyword
STC taxonomy (28 themes, 200+ sub-themes)
UNECE Classification of International Statistical Activities – data elements
Program Activity Architecture for reporting to Treasury Board Secretariat and to Parliament
Organization of the item’s administrative and item-specific information for different purposes: HTML, Wiki, SDMX, CWM, DDI, XBRL, …

26 Survey design and dissemination phases
Diagram labels: Collect, Edit, Estimate, Tabulate, Publish; Concepts (Object Class, Property, Data Element Concept); Data Elements; Questions; Question Blocks; Classifications (Conceptual Domain, Value Domain); Survey; Universe; Frame; Instance; Collection Instrument; Methodology; Data Files; Enterprise Architecture; IMDB

27 Reuse of Information Assets in Applications Development
Diagram labels: IMDB; Classification coding; Collection instrument development (Survey Specification Manager; Integrated Questionnaire and Metadata System); Publishing; Other applications; Software Register

28 Reuse of Information Assets: Integration with Data
Diagram labels: IMDB, Data Warehouses, CANSIM

29 Reuse of Information Assets in Dissemination and information discovery
One metadata source + many uses for the information = many output formats
Diagram labels: IMDB, Wiki, HTML, SDMX, DDI (?)
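
The idea of one metadata source feeding several output formats can be sketched as a single record passed to interchangeable renderers. This is only an illustration: the record layout, the renderer names and the definition text are hypothetical, and only HTML and wiki definition-list markup are shown because SDMX and DDI have their own schemas that are not reproduced here.

```python
import html

# One hypothetical metadata record; the definition text is invented.
record = {
    "name": "Type of revenues of establishments",
    "definition": "Illustrative definition <not from the IMDB>.",
}


def to_html(rec):
    """Render as an HTML definition-list entry, escaping special characters."""
    return f"<dt>{html.escape(rec['name'])}</dt><dd>{html.escape(rec['definition'])}</dd>"


def to_wiki(rec):
    """Render as a wiki definition-list entry (';' term, ':' definition)."""
    return f"; {rec['name']}\n: {rec['definition']}"


RENDERERS = {"html": to_html, "wiki": to_wiki}


def render(rec, fmt):
    return RENDERERS[fmt](rec)


html_out = render(record, "html")
wiki_out = render(record, "wiki")
```

Adding another output format then means registering one more function in `RENDERERS`, while the stored record stays the same.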

30 Corporate Memory: Data Files
Dissemination and archive phases
Diagram labels: Operational Data Registers, Survey Data, Administrative Data, Operational Data Stores, Public Use Master File, IMDB, Archival information, Clean Master File, Archived Data

