1
The Picasso Project
EDDI16 – European DDI Users Conference
December 7, 2016
License: CC BY 4.0
2
Background
“…metadata should be integral to the process and, indeed, drive the process. Metadata should precede data.”
“A strong, corporate statistical information management framework is required.”
Architectural Principles Report of the Task Force on Corporate Business Architecture
Statistics Canada • Statistique Canada 7/12/2016
3
Statistical metadata strategy
- Fill metadata gaps across the business process
- Develop operational governance
- Elaborate the statistical metadata portion of the enterprise information architecture
- Formalize centres of responsibility for each domain
- Establish a single point of access
- Adopt structure and content standards
- Communication and training
4
Local Survey Infrastructures
[Diagram labels, grouped in extraction order]
SurveyDB’s Satellite Registers Survey files Local Survey Infrastructures
Census Data Census files Census Infrastructure Data Integration, Processing, Tabulation
IBSP DB Survey files IBSP Infrastructure Data Integration, Processing, Tabulation
SSME DB Survey files Common Tools Infrastructure
PFM / SSPE Survey files National Accounts Infrastructure Data Integration, Analysis
MEA Hub Input Channels Coding QA & Validation Reception Data Preprocessing Capture Transform
Analysis (SAS) SAS Survey files Statistical Analysis Data Exploration & Analysis
Data Integration Record Linkage Edit & Imputation Processing
Link Agreements Agency Link Register Business / Institution Structure People Activity Core Register Infrastructure
Statistical Metadata (Picasso)
CODR OMR Web Presentation Dissemination (NDM)
Web Scraping Sensor (IoT) Admin Files Data Science (R, etc.) Various Survey files Innovation
CKAN Data Web Presentation Open Data Portal
EAIP Service Integration (Statistical Functions, Information Services)
i-EQ, r-EQ CDW Survey Collection MCS CMP
Listing Frames & Sampling Survey Design Statistical Design Questionnaire Design
Microsimulation, analysis Various Survey files Analytical Studies
StatCan “Deliverology” Portal StatCan Reporting & Audit Stakeholder Services
Core Information Management Services Data Warehouses Data Lakes (RDCs, CDER) BI and Analytics Data Provisioning Provision Agreements (LADMS) Access Authorizations (CARS) Authentication & Authorization (ActiveDirectory) EDRMS (GCDocs) Communities, Spaces & Collaboration (Confluence, STCWiki, GCConnex / Forum, ICN)
Hosting Services & Infrastructure
5
Picasso Wireframe - Concept
6
Picasso will deliver…
- An enterprise hub to manage:
  - statistical metadata
  - “fit for use” data assets - surveys, administrative data + record linkage projects
- Metadata-driven processes to link metadata to data using common information models
- Interpret, integrate, share, use + reuse data
7
Picasso is also delivering…
- Enterprise search and discovery
- Single point of access with powerful search across databases, shared drives, collaboration sites, internal networks
- Visualization tools to promote data integration
- Alignment opportunities with DDI
8
Local management, limited governance + use of standards, ad hoc search, discovery, access, archive.
Enterprise management, standards-driven governance, integrated enterprise search, discovery, access, ILCM archive.
(ILCM = Information Lifecycle Management)
9
IMDB - dissemination metadata
10
Metadata Core Solution Architecture
External systems access the Metadata core via Entity Services.
The Data Access Layer (DAL) provides access to the Picasso Registry and Repository via Common Information Exchange Models (CIEMs) for both Entity Services and Picasso RDF components.
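The layering described on this slide can be sketched roughly as follows. This is a minimal illustration only: the class names (`ConceptCIEM`, `DataAccessLayer`, `ConceptEntityService`), the field names, the sample record and the in-memory dictionary standing in for the Picasso Registry and Repository are all hypothetical and are not taken from the actual Picasso implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptCIEM:
    """Hypothetical common information exchange model for a 'concept' entity."""
    concept_id: str
    name: str
    definition: str


class DataAccessLayer:
    """Hypothetical DAL: the only component that touches the stored records."""

    def __init__(self, repository: dict[str, dict[str, str]]) -> None:
        # A plain dict stands in for the registry and repository.
        self._repository = repository

    def get_concept(self, concept_id: str) -> ConceptCIEM:
        row = self._repository[concept_id]
        # Map the storage-specific record to the common model on the way out;
        # the stored record itself is not modified.
        return ConceptCIEM(concept_id=concept_id, name=row["nm"], definition=row["defn"])


class ConceptEntityService:
    """Hypothetical entity service: the entry point for external systems."""

    def __init__(self, dal: DataAccessLayer) -> None:
        self._dal = dal

    def lookup(self, concept_id: str) -> dict[str, str]:
        concept = self._dal.get_concept(concept_id)
        return {"id": concept.concept_id, "name": concept.name, "definition": concept.definition}


# Invented sample record, for illustration only.
repository = {"c-001": {"nm": "Household income", "defn": "Total income of all household members."}}
service = ConceptEntityService(DataAccessLayer(repository))
result = service.lookup("c-001")
print(result)
```

In this sketch the caller never sees the storage field names (`nm`, `defn`); it only sees the common model, which is one way to read the idea that external systems go through Entity Services and the DAL rather than reading the repository directly.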
11
Data in transit is mapped… data at rest stays at rest
12
Shift to common models
- Move from static metadata products to entities
- Enter once, re-use many times
- “Pick and play” publishing
Link metadata to ‘fit for use’ data files
- Discover new data sources
- Access directly in Picasso
Governance + automation where it makes sense
- Registration approvals where metadata is created
- IM - know what you have, where + how long to keep
- Bulk loaders, workflow registration processes
13
14
15
Delivered so far…
- Independent validation of solution architecture (RDF, SharePoint)
- Limited single point of access
- Metadata search
- Loaded ADI data holdings to new metadata core
- Services – classification, concept, survey, variable, data holding
16
Future releases
- Enhanced Search
- Link statistical metadata + data assets
- Services supporting DDI and SDMX
- Lifecycle management process (registration)
- Simple subscription for users
- Full production by March 31, 2018
17
Common Statistical Production Architecture (CSPA) Compliance
BA decision principles (most)
- Capitalize on and influence national and international developments
- Deliver enterprise-wide benefits
- Increase the value of our statistical assets
- Maximize the use of existing data / Minimize respondent load
IA principles (all)
- Manage information as an asset
- Manage the information lifecycle
- Protect information appropriately
- Use agreed models and standards
- Capture information as early as possible
- Describe to ensure reuse
- Ensure there is an authoritative source
- Preserve information input into Statistical Services
- Describe information by metadata
BA design principles (most)
- Re-use existing before designing new
- Design new for re-use and easy assembly
- Processes are metadata driven
- Adopt available standards
- Enable discoverability and accessibility
Application design principles (all)
- Maintain independence between design and implementation
- Use available standards
- Use architecture patterns
- Implement using GSIM (modulo some renaming)
- Minimize coupling
- Maximize Service Autonomy
- Include non-functional requirements (in progress)
We also follow the CSPA approach to specifications by having three levels: conceptual, logical and physical.
18
Technical details
Statistical Metadata and Data Management
- IMDB Replacement (legacy) – ISO/IEC 11179 schema
- Data Service Centre functionality – CBA Transformation
- Information Management - EDRMS linkage
- Search, Discover, Navigate, Register, Report, Innovate
- Aggregator / Linkage focus – e.g. National Accounts, Record Linkage
Linked Data and Metadata
- Oracle 12c RDF graph database
- URI-linked data sets, documents
Service-Oriented Architecture integration
- Metadata services (CSPA compliant) – QSCV…
- Data Asset management services – ILCM linkage
- Oracle BPM Suite workflow, Oracle Service Bus
- Statistical Functions Services, Statistical Entity Services roadmaps
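The idea of URI-linked data sets and documents in an RDF graph can be illustrated with a small sketch. It uses plain Python tuples rather than Oracle's RDF store, and every URI and property name below is a hypothetical placeholder under `http://example.org/`; none of them are actual Picasso identifiers.

```python
from collections import defaultdict

# Hypothetical URIs; the real Picasso identifiers are not given in the slides.
BASE = "http://example.org/picasso/"
Triple = tuple[str, str, str]

# Each triple is (subject URI, predicate URI, object URI).
triples: list[Triple] = [
    (BASE + "variable/age", BASE + "prop/measuredIn", BASE + "survey/demo-survey"),
    (BASE + "survey/demo-survey", BASE + "prop/hasDataSet", BASE + "dataset/demo-2016"),
    (BASE + "dataset/demo-2016", BASE + "prop/describedBy", BASE + "document/user-guide"),
]


def build_index(graph: list[Triple]) -> dict[str, list[tuple[str, str]]]:
    """Index outgoing (predicate, object) pairs by subject URI."""
    index: dict[str, list[tuple[str, str]]] = defaultdict(list)
    for subject, predicate, obj in graph:
        index[subject].append((predicate, obj))
    return index


def reachable(graph: list[Triple], start: str) -> list[str]:
    """Return every URI reachable from `start` by following links, in visit order."""
    index = build_index(graph)
    seen: list[str] = []
    stack = [start]
    while stack:
        node = stack.pop()
        for _predicate, target in index.get(node, []):
            if target not in seen:
                seen.append(target)
                stack.append(target)
    return seen


linked = reachable(triples, BASE + "variable/age")
print(linked)
```

Starting from a variable, the traversal reaches the survey, then its data set, then the associated document, which is the kind of metadata-to-data navigation a linked-data store makes possible.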
19
Picasso Wireframe – Details (animated)
[Wireframe callouts: Search Bar; Metadata Graph Visualization; Detailed Definition; Facet Browse; Associated Data Assets; Metadata Designer; Personalization; Notifications]
20
Current Picasso screenshots of user design interface
21
- Search verticals – all, ICN, IMDB, other
- Search bar
- Pop up hover panels
- Search results
- Metadata Icon
- Refiners – metadata types (supports faceted search)
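Faceted search with refiners on metadata type can be sketched in a few lines. This is an assumption-laden illustration, not the Picasso search implementation: the result records, their titles and the helper names `facet_counts` and `refine` are invented, and only the type labels (variable, classification, survey) echo metadata types named elsewhere in the deck.

```python
from collections import Counter

# Hypothetical search results; titles are invented for illustration.
results = [
    {"title": "Age of respondent", "type": "variable"},
    {"title": "Labour force status", "type": "variable"},
    {"title": "Industry classification", "type": "classification"},
    {"title": "Household survey", "type": "survey"},
]


def facet_counts(items: list[dict[str, str]], facet: str) -> Counter:
    """Count how many results fall under each value of the given facet."""
    return Counter(item[facet] for item in items)


def refine(items: list[dict[str, str]], facet: str, value: str) -> list[dict[str, str]]:
    """Keep only the results whose facet matches the selected refiner value."""
    return [item for item in items if item[facet] == value]


counts = facet_counts(results, "type")
variables = refine(results, "type", "variable")
print(dict(counts))
print([item["title"] for item in variables])
```

The counts would drive the refiner panel (how many hits per metadata type), and selecting a refiner narrows the result list to that type.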
22
24
25
Picasso – impact on external users
- delivers an internal solution based on common information exchange models
- will accelerate the implementation of Semantic Web Linked Open Data by facilitating the sharing of data and metadata with external researchers
26
Research Data Centres
Research Data Centres, in partnership with universities and Canadian research funding agencies, give researchers access, in a secure university setting, to microdata from population and household surveys and administrative data.
27
Access to metadata externally
- Retrospective manual conversion of metadata for household surveys to the DDI-Lifecycle standard
- Developed a DDI extraction tool to pull survey-level metadata from the IMDB
- Metadata published to NESSTAR since 2014
28
RDC Tool – NESSTAR view
29
Picasso outcomes
- Moves the Agency from custom point-to-point integration to a framework of common information exchange models
- Leverages the DDI ecosystem in the interface with outside users
- Publishing service in the data access layer will output metadata into DDI
- Provides flexibility to integrate DDI into models going forward
- Data set model
30
For more information
Kathryn Stevenson – Project Manager
Flavio Rizzolo – Information/Solution Architect