www.uis.unesco.org
Building SDMX Data Structure Definitions based on a generic conceptual model for contents
Experience with the joint Eurostat-UNESCO-OECD education statistics questionnaire
Michael Bruneforth, UNESCO Institute for Statistics
Expert Group on SDMX
firstname.lastname@example.org
May 10, 2007, UN, Geneva
Overview
• The world of international education data collections
• Why build a conceptual model
• Steps to build the model
• The model
• From the model towards an SDMX data structure definition
The world of international education data collections
Instruments used in the system of international education data collections
• The UNESCO-UIS / OECD / EUROSTAT (UOE) Data Collection on Education Statistics
  - Excel-based questionnaire, organized in 31 worksheets
  - 47 countries, 14,000+ data points
  - Changes: 2003
• The World Education Indicators Project (WEI)
  - Based on UOE instruments, extended by 10 worksheets
  - 16 countries, 15,000+ data points
  - Examples at www.uis.unesco.org/publications/wei2006
• The UIS Survey
  - PDF-based e-questionnaire infrastructure, plus paper form
  - All remaining countries, 5,000+ data points
  - Examples at www.uis.unesco.org -> current surveys
Instruments used in the system of international education data collections (II)
[Diagram: the UOE, UIS and WEI instruments and how they relate; label: "Can be transformed"]
Education Questionnaires: ever changing
• 1998: Tables were introduced after ISCED 97 was adopted.
• 2000: Redesign of Finance tables.
• 2001 – 2005: ???
• 2005: Major redesign: tables redesigned, some tables split or combined.
• 2006: In ENRL8a, ENRL8b and ENRL8c, the Caribbean countries are now included with Latin America instead of Northern America.
• 2007: In table ENRL-7, three new sub-categories (unknown residence, unknown prior education, and unknown citizenship) have been added. In ENTR-2, a new row has been added to collect typical age of entry. In GRAD-1 and GRAD-3, a new row has been added to collect typical graduation age.
Why build a conceptual model?
• Metadata
  - Theoretical basis for describing data
  - Visualization of data
  - Validation of codes
• Questionnaire design
  - Improving internal consistency in questionnaires
  - Maintaining the coding schemes:
    » Avoiding random or ad hoc data descriptions that lead to inconsistent, incomprehensible systems
• (We need discipline as much as a model!)
Why use a conceptual model as a basis for SDMX?
• A model describes a universe of questionnaires
  - Consistency across questionnaires
  - Consistency across tables
  - Consistency across statistical units
  - Facilitates adaptation of SDMX to changes in tables
    » Typically no or few keys need to be changed; most new data can be defined using existing keys
• A model can be used to describe indicators and derived data
  - SDMX exchange of results (-> World Bank, MDG)
• A model can be transformed into and from database definitions
  - Use of existing metadata (efficiency)
  - Avoid redundant information (less error-prone)
  - Basis to match national data to international SDMX definitions
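The point that table changes rarely require new keys can be illustrated with a minimal sketch. It assumes that a data point is addressed by one code per dimension rather than by its position in a worksheet. The dimension names and codes below (STAT_UNIT, ISCED_LEVEL, SEX, MEASURE and their values) are hypothetical and are not taken from the UOE data structure definition.

```python
# Hypothetical code lists, one per dimension (illustration only,
# not the actual UOE code lists).
CODE_LISTS = {
    "STAT_UNIT": {"STUDENT", "ENTRANT", "GRADUATE"},
    "ISCED_LEVEL": {"ISCED1", "ISCED2", "ISCED3"},
    "SEX": {"F", "M", "T"},
    "MEASURE": {"COUNT"},
}


def missing_codes(key):
    """Return (dimension, code) pairs the code lists do not contain yet."""
    return [(dim, code) for dim, code in key.items()
            if code not in CODE_LISTS[dim]]


def register(key):
    """Add any missing codes to the code lists and report what was added."""
    added = missing_codes(key)
    for dim, code in added:
        CODE_LISTS[dim].add(code)
    return added


# Two new questionnaire rows of the kind added in 2007, described as keys.
entry_age = {"STAT_UNIT": "ENTRANT", "ISCED_LEVEL": "ISCED1",
             "SEX": "T", "MEASURE": "TYPICAL_AGE"}
graduation_age = {"STAT_UNIT": "GRADUATE", "ISCED_LEVEL": "ISCED3",
                  "SEX": "T", "MEASURE": "TYPICAL_AGE"}

# The first new row needs exactly one new code.
print(missing_codes(entry_age))
```

Under these assumptions, calling `register(entry_age)` adds a single code, after which `graduation_age` can be described entirely with existing codes, even though it belongs to a different table.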
Building the model
  Step 1: Bo Sundgren's analysis of the UIS questionnaire
  Step 2: Analysis of the relational database at UIS
  Step 3: Correction / expansion of Bo's model
• Step 4: Model verification 1: review of UOE questionnaires
• Step 5a: Model verification 2: transformation of the UIS database model to the conceptual model, automated creation of the full code list
○ Step 5b: Model verification 2: analysis of the relational database at OECD
○ Step 6: Creation of the data structure definition based on existing metadata
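As a rough illustration of step 5a, the automated creation of a code list can be sketched as collecting the distinct values of each dimension from existing metadata rows. The row layout, column names and codes are assumptions made for this example. They do not reproduce the UIS database schema.

```python
from collections import defaultdict

# Hypothetical rows standing in for a relational metadata table,
# one row per data point. Column names and codes are illustrative.
METADATA_ROWS = [
    {"table": "ENRL1", "stat_unit": "STUDENT", "isced_level": "ISCED1", "sex": "T"},
    {"table": "ENRL1", "stat_unit": "STUDENT", "isced_level": "ISCED1", "sex": "F"},
    {"table": "ENRL1", "stat_unit": "STUDENT", "isced_level": "ISCED2", "sex": "T"},
    {"table": "GRAD-1", "stat_unit": "GRADUATE", "isced_level": "ISCED3", "sex": "T"},
]

# The worksheet name is deliberately not a dimension: the key describes
# what a data point means, not where it was collected.
DIMENSIONS = ("stat_unit", "isced_level", "sex")


def build_code_lists(rows, dimensions):
    """Collect the distinct codes used for each dimension, sorted."""
    code_lists = defaultdict(set)
    for row in rows:
        for dim in dimensions:
            code_lists[dim].add(row[dim])
    return {dim: sorted(codes) for dim, codes in code_lists.items()}


code_lists = build_code_lists(METADATA_ROWS, DIMENSIONS)
for dim, codes in code_lists.items():
    print(dim, codes)
```

Deriving the lists from the metadata, rather than typing them in again, follows the principle of capturing questionnaire information only once.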
Principles for generating the detailed model for individual data points
• Use existing metadata
• Avoid multiple capturing of questionnaire information
• Ensure consistency with existing systems
Generate the detailed model for individual data points
The basis: the UIS metadata (relational database description)
Example: UIS metadata (relational database codes, XML version)
What is needed beyond the model to get a complete data structure definition?
• The data structure definition has to cope with data points collected twice.
  - The total number of primary students is collected in ENRL1a, ENRL1, ENRL3, ENRL4 and CLASS1.
• The data structure definition has to cope with adjustments to the coverage of data.
  - The count of students is collected with coverage adjusted to expenditure data.
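One way to picture both requirements is sketched below. It assumes that every reported cell is mapped to a key, and that coverage is part of that key. The COVERAGE codes, the table name FIN-X and all values are made up for illustration, and the presentation does not state how the actual definition handles either case.

```python
from collections import defaultdict

# Keys are (stat_unit, isced_level, sex, coverage). The coverage codes
# are hypothetical.
PRIMARY_TOTAL = ("STUDENT", "ISCED1", "T", "STANDARD")
PRIMARY_TOTAL_ADJUSTED = ("STUDENT", "ISCED1", "T", "EXPENDITURE")

# (table, key, reported value). The values are invented, and FIN-X is
# a placeholder table name.
REPORTED = [
    ("ENRL1a", PRIMARY_TOTAL, 1200),
    ("ENRL1", PRIMARY_TOTAL, 1200),
    ("ENRL3", PRIMARY_TOTAL, 1200),
    ("ENRL4", PRIMARY_TOTAL, 1190),
    ("CLASS1", PRIMARY_TOTAL, 1200),
    ("FIN-X", PRIMARY_TOTAL_ADJUSTED, 1150),
]


def group_by_key(reported):
    """Group reported cells by key, whichever table they came from."""
    grouped = defaultdict(list)
    for table, key, value in reported:
        grouped[key].append((table, value))
    return dict(grouped)


def inconsistent_keys(reported):
    """Return keys collected more than once with differing values."""
    return {key: cells for key, cells in group_by_key(reported).items()
            if len({value for _, value in cells}) > 1}


for key, cells in inconsistent_keys(REPORTED).items():
    print(key, cells)
```

In this sketch the coverage-adjusted count has its own key, so it is not flagged as conflicting with the standard count. The five repeated collections of the same total are compared with each other.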
Questions, comments?
Education content: Michael Bruneforth (email@example.com)
IT: Brian Buffett (email@example.com)
Thanks