Metadata Standards and XML Technologies.

Metadata Standards and XML Technologies for Unlocking Statistical Data
Pascal Heus
Metadata Technology / Open Data Foundation

Context
Domain: socio-economic data, health data, official statistics, and the like
Kind of data: both microdata (datasets) and macrodata (aggregated statistics / time series)
Produced by: government/federal agencies (data.gov), national statistical offices, international organizations, researchers
Consumed by: producers, policy/decision makers, academic researchers, economists, public agencies, private sector, press, other users
The same approach can also be used in other domains

Challenges
Discovery / Access / Exchange
Data Quality / Usefulness
Data Documentation
Harmonization / Comparability
Preservation
Data production
Data privacy vs. openness, transparency
…in a world of legacy systems, proprietary software, traditional practices…

Is there a magic bullet?
Some of these have been seen before…
  The rise of the Internet, business to business, business to consumer, communities, social networks, etc.
  Solved through XML technologies, web service oriented architecture, adoption of standard models, web 2.0
The same industry standards strategies can be used for managing microdata and official statistics
The past decade has focused on the development of domain-specific standards and tools by key stakeholders
But we need more than one bullet…
  Data Documentation Initiative (DDI)
  Statistical Data and Metadata Exchange standard (SDMX)
  Complementing standards: ISO 11179, ISO 19115, Dublin Core, METS, …

Data Documentation Initiative (DDI)
Originates in the data archiving community
For microdata or low-level administrative statistics
Data typically in SAS, Stata, SPSS, R, ASCII, SQL, …
DDI-Codebook (1.x-2.x):
  Fit for preservation/archive, single survey, after-the-fact documentation, stand-alone agency
  Simple to use, in existence for over a decade
  Widely adopted, tools/training available
DDI-LifeCycle (3.x):
  Fit for managing data and metadata across production/archive/dissemination/research
  Published in 2006, encourages reuse and robust design
  Ongoing rapid adoption, tools/training becoming available

Statistical Data and Metadata Exchange Standard (SDMX)
Initiative from international organizations
  BIS, ECB, EUROSTAT, IMF, OECD, UN, World Bank
  Need for a global data exchange framework
  Is an ISO standard (17369)
Used for aggregated statistics / time series
  Typically consumed in Excel (analyst), HTML (casual web user), data warehouse (analytics)
SDMX describes information (structure) but also natively carries data
  Publication of static data for exchange between organizations or access by end users
Recognized in 2007 by the UN Statistical Commission as the preferred standard for the exchange and sharing of data and metadata
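To make the point that SDMX carries both structure and data concrete, here is a minimal sketch in Python. The XML fragment is a simplified, hypothetical message loosely modeled on an SDMX-ML time-series layout: it omits namespaces, headers, and the data structure definition, and the dimension names and values are invented for illustration. The helper `read_series` is likewise an assumption, not part of any SDMX library.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified SDMX-like message: each Series carries its
# dimension values as attributes, and each Obs carries one observation.
SDMX_LIKE = """
<DataSet>
  <Series FREQ="A" REF_AREA="XX" INDICATOR="POP">
    <Obs TIME_PERIOD="2009" OBS_VALUE="10.0"/>
    <Obs TIME_PERIOD="2010" OBS_VALUE="10.5"/>
  </Series>
  <Series FREQ="A" REF_AREA="YY" INDICATOR="POP">
    <Obs TIME_PERIOD="2010" OBS_VALUE="3.2"/>
  </Series>
</DataSet>
"""


def read_series(xml_text):
    """Flatten series and observations into one record per observation."""
    root = ET.fromstring(xml_text)
    rows = []
    for series in root.findall("Series"):
        dimensions = dict(series.attrib)  # structure: the series key
        for obs in series.findall("Obs"):
            row = dict(dimensions)
            row["TIME_PERIOD"] = obs.get("TIME_PERIOD")
            row["OBS_VALUE"] = float(obs.get("OBS_VALUE"))  # data: the value
            rows.append(row)
    return rows


rows = read_series(SDMX_LIKE)
for row in rows:
    print(row)
```

Because the dimensions travel with the observations, a consumer can rebuild a flat table without a separate file layout description. A real SDMX message would additionally reference a data structure definition that declares the allowed dimensions and code lists.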

DDI + SDMX
Leverage proven industry standard technologies
  XML, XML Schema, XSL/XPath/XQuery, SOA
  Cost-effective implementation / integration
  Human readable and machine actionable
Are complementing standards / work hand-in-hand
  Designed for different use cases / needs
  But together, cover the entire data lifecycle, from respondent to policy maker
Broad adoption
  Internationally recognized, widely used, large community
  Face little or no competition
Designed to work with other standards
  ISO 11179, ISO 19115, Dublin Core, RDF, …
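As an illustration of "human readable and machine actionable", the sketch below uses Python's standard-library XML parser and its limited XPath support to pull a variable dictionary out of a codebook. The fragment is a simplified assumption: it borrows a few element names in the style of DDI-Codebook (`codeBook`, `dataDscr`, `var`, `labl`, `catgry`, `catValu`), drops namespaces, and has not been validated against the DDI schema. The variables and the function `variable_dictionary` are invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified codebook fragment in the style of DDI-Codebook.
DDI_LIKE = """
<codeBook>
  <dataDscr>
    <var name="SEX">
      <labl>Sex of respondent</labl>
      <catgry><catValu>1</catValu><labl>Male</labl></catgry>
      <catgry><catValu>2</catValu><labl>Female</labl></catgry>
    </var>
    <var name="AGE">
      <labl>Age in completed years</labl>
    </var>
  </dataDscr>
</codeBook>
"""


def variable_dictionary(xml_text):
    """Map each variable name to its label and its category codes."""
    root = ET.fromstring(xml_text)
    result = {}
    # ElementTree supports a subset of XPath, enough for simple paths.
    for var in root.findall("./dataDscr/var"):
        categories = {
            cat.findtext("catValu"): cat.findtext("labl")
            for cat in var.findall("catgry")
        }
        result[var.get("name")] = {
            "label": var.findtext("labl"),  # first direct child <labl>
            "categories": categories,
        }
    return result


variables = variable_dictionary(DDI_LIKE)
for name, info in variables.items():
    print(name, info)
```

The same document a person can read in a text editor can drive a search index, a statistical package import script, or a web page, which is the practical sense of machine-actionable metadata.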

DDI / SDMX Map (draft) Courtesy of Australian Bureau of Statistics

Selected success stories
International Household Survey Network
  Rolled out DDI in developing countries around the globe
  Hundreds of surveys now documented, often with underlying data available over the web
  See
Social science data archives
  Across Europe, North America, Australia, using DDI to capture and publish information on large data collections
  Thousands of surveys documented
  Examples
Etc.

Selected success stories
Inter-university Consortium for Political and Social Research (ICPSR)
  An international consortium of about 700 academic institutions and research organizations
  Provides access to over half a million data files
  Captures DDI metadata for over 7,000 surveys (and is also the birthplace of DDI)
  xml/index.jsp

Selected success stories
European Central Bank
  Demonstrated savings of 120 FTEs per year by basing their systems on the SDMX model
Federal Reserve Board of NY
  Doubled the hit rate on their website when they exposed what had been CSV files in an SDMX format
  kFed.ppt
Joint External Debt web site
  Turned a 3-month production process into minutes by adopting SDMX

Ongoing efforts
Numerous projects are under way leveraging DDI and SDMX.
Australian Bureau of Statistics
  Implementing an SDMX+DDI framework
Canada Research Data Center Network
  DDI-LifeCycle driven data/metadata management framework
Data without Boundaries
  Integrated model for accessing official data in Europe
  27 partners from 12 European countries
Tools
  Open source and commercial software is becoming increasingly available
  IHSN Toolkit, Nesstar, CRDCN project, OECD.Stat, Algenta Colectica, Stat/Transfer, Space Time Research, etc.

What’s the catch?
Not only a technology challenge (that's the easy part); we also need:
  Infrastructure, tools, training
  Awareness and adoption (need success stories)
  Models are complex (but compliance is all that is needed)
Managing the change
  Integration in the day-to-day environment
  Upper-level management / funding agencies need to drive adoption
  Show benefits and return on investment
Broad adoption for full benefits
  By producers, archives, researchers, vendors

Open Data Warning
This is not only for public data!
  The information is about individuals and corporate entities; this is not data in the “Internet” sense
  Transparency is important, but access must be in accordance with the UN fundamental statistical principles and legal systems, and must respect respondent privacy
  This is often misunderstood or overlooked
Broad definition of “open data”
  Accessible, fit for use, safe and secure, meets the needs of producers and end users, well documented
  Can be protected / non-public data
Comprehensive metadata are essential to responsibly and effectively providing access to all data

The data.gov use case
Several data.gov-like initiatives (US, UK, NZ, CA, …)
  All are attempting to bring together data from numerous sources
  All could significantly benefit from structured metadata standards such as DDI/SDMX
The US data.gov
  Extra challenge: the US federal statistical system is composed of over 120 statistical agencies (no national statistics office!), plus state-level agencies…
  This is a perfect use case for DDI and SDMX
Recommendations
  Consider adopting DDI / SDMX and related standards across the statistical system (don’t make up new ones!)
  Advocate and support the use by federal agencies of industry standard XML and SOA IT architecture

Summary
Mature, domain-specific metadata specifications are available for managing socio-economic data, health data, and official statistics
  The Data Documentation Initiative (DDI) is the recommended specification for “microdata”
  The Statistical Data and Metadata Exchange standard (SDMX) is the recommended standard for aggregated data / time series
  Both leverage IT industry standard technologies
These standards are seeing broad adoption around the globe and can benefit numerous organizations and individuals
  We need to encourage adoption by key agencies and stakeholders
  They can also be applicable to other domains

Announcements
The IASSIST 2012 conference will be held at GWU in Washington, DC, June 6-8
New DDI/SDMX community portal to launch next year (signup at
Contact us for ODaF membership or join the LinkedIn group
For more information
Contact
Thank you!

References
"The Data Documentation Initiative (DDI): An Introduction for National Statistical Institutes", Arofan Gregory, Open Data Foundation, Jul 2011
"Open Data and Metadata Standards: Should We Be Satisfied with 'Good Enough'?", Arofan Gregory, Open Data Foundation, Jun 2011
"Maximizing the Potential of Data: Modern IT Tools, Best Practices, and Metadata Standards for SBE Sciences", Pascal Heus (Metadata Technology North America Inc.), Arofan Gregory (Open Data Foundation), Oct
"Metadata", Arofan Gregory (ODaF), Pascal Heus (ODaF), German Council for Social and Economic Data Working Paper no. 57/2009, March 2009
"DDI and SDMX: Complementary, Not Competing, Standards", A. Gregory, P. Heus, Open Data Foundation, July 2007
"Combining Metadata Standards: Approaches and benefits", Arofan Gregory, Open Data Foundation, Work Session on Statistical Metadata (METIS) (Geneva, Switzerland, March 2010)
"Data Documentation Initiative: Toward a Standard for the Social Sciences", Mary Vardigan (ICPSR), Pascal Heus (ODaF), Wendy Thomas (MPC), International Journal of Digital Curation, Vol 3, No 1, Aug 2008