Metadata projects and tasks at Statistics Finland METIS 2010 Saija Ylönen


1 Metadata projects and tasks at Statistics Finland METIS 2010 Saija Ylönen saija.ylonen@stat.fi

2 Organizational chart 11/03/2010 Saija Ylönen

3 Co-operating parties of the metadata tasks: organizational units
- IT Management: situated in the Secretariat of the Director General; co-ordinates the general information architecture, of which metadata tasks form one element
- Classification and Metadata Services: situated in the IT and Statistical Methods department; an operational unit with an active role in developing metadata
- Dissemination Services: situated in the IT and Statistical Methods department; develops the metadata connected with dissemination

4 Metadata Co-ordination Group
- Originally a co-operation group for persons working with metadata issues in the support function departments of Statistics Finland (SF)
- The present objective is to intensify co-operation between the statistics departments and the parties responsible for general metadata work
- Comprises members working on metadata and permanent members from all statistics departments
- The goal is to widen knowledge about metadata and metadata systems and to give the statistics departments an opportunity to discuss their metadata needs with metadata specialists

5 CoSSI Steering Group and CoSSI model
- Foundation for the metadata system
- Modular, XML-based model for describing statistical tables, classifications, concepts, variables, general information on statistical documents, quality, etc.
- Expandable
- The CoSSI Steering Group is in charge of managing and developing the model according to user needs in a manner that does not put its main structure at risk

6 Definition of metadata
1) Statistical metadata
- variable and data descriptions
- classifications, concepts
2) Statistical data quality
- quality reports
- statistical method descriptions
3) Metadata of statistical documents or products
- producers
- publication information
- field or subject area

7 Definition of metadata II
4) Process metadata
a) Technical metadata: guides the workflow of data production, makes it possible to follow data production, and documents the working process.
b) Conceptual process metadata: technical information on the data and variables used in producing data, e.g. minimum or maximum values, various calculation rules, or the use of certain classification values.
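As a rough illustration of conceptual process metadata, the sketch below stores a minimum, a maximum and a set of permitted classification values for a variable and checks observed values against them. The names `VariableRule` and `check_value`, and the example variables `age` and `sex`, are hypothetical and are not taken from the CoSSI model or from any Statistics Finland system.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class VariableRule:
    """Hypothetical conceptual process metadata for one variable."""
    name: str
    minimum: Optional[float] = None
    maximum: Optional[float] = None
    allowed_codes: Optional[FrozenSet[str]] = None


def check_value(rule: VariableRule, value) -> List[str]:
    """Return a list of rule violations; an empty list means the value passes."""
    problems = []
    # Classified variables are checked against their permitted codes only.
    if rule.allowed_codes is not None:
        if str(value) not in rule.allowed_codes:
            problems.append(f"{rule.name}: code {value!r} not in classification")
        return problems
    # Numeric variables are checked against the optional range limits.
    if rule.minimum is not None and value < rule.minimum:
        problems.append(f"{rule.name}: {value} below minimum {rule.minimum}")
    if rule.maximum is not None and value > rule.maximum:
        problems.append(f"{rule.name}: {value} above maximum {rule.maximum}")
    return problems


# Illustrative rules only; the limits and codes are assumptions.
age_rule = VariableRule(name="age", minimum=0, maximum=120)
sex_rule = VariableRule(name="sex", allowed_codes=frozenset({"1", "2"}))

print(check_value(age_rule, 34))
print(check_value(age_rule, 130))
print(check_value(sex_rule, "3"))
```

Keeping the rules as data rather than as code in each production program is what would let one set of checks be reused across statistics.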

8 Metadata systems at Statistics Finland

9 Metadata systems: present situation
- We are in a transitional phase from relational databases to an XML-based environment
- Relational databases: classifications, concepts and definitions, archiving database
- XML database eXist: publications, classifications, concepts, data descriptions

10 Relational databases
- Built in the 1990s
- Used in statistics production, but not in all statistical processes or for all statistics
- Classifications in the relational databases are used in SAS and Superstar
- The archiving database is in use in the archiving process
- Classifications and concepts are generated from the relational databases to the web pages

11 XML database
- At the moment, the XML database is used mostly in the creation of publications with an Arbortext word processor
- Classifications and concepts are copied to the XML database from the relational databases and are ready to use
- Tools for utilising metadata objects from the XML database are being constructed
- The first metadata tool linked to the XML database is the variable editor
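Copying a classification from a relational table into an XML store can be pictured roughly as follows. This is only a sketch under stated assumptions: the row layout `(code, parent_code, label)`, the element names `classification`, `item` and `label`, and the function `rows_to_xml` are all invented for illustration and do not reflect the actual Sybase tables or the CoSSI schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows as they might come from a relational classification table:
# (code, parent_code, label). A parent_code of None marks a top-level item.
rows = [
    ("A", None, "Example main category A"),
    ("01", "A", "Example subcategory 01"),
    ("02", "A", "Example subcategory 02"),
    ("B", None, "Example main category B"),
]


def rows_to_xml(classification_id, rows):
    """Build a nested XML tree from flat (code, parent_code, label) rows."""
    root = ET.Element("classification", {"id": classification_id})
    elements = {}
    # First pass: create one <item> element per row.
    for code, _parent, label in rows:
        item = ET.Element("item", {"code": code})
        ET.SubElement(item, "label").text = label
        elements[code] = item
    # Second pass: attach each item to its parent, or to the root.
    for code, parent, _label in rows:
        target = root if parent is None else elements[parent]
        target.append(elements[code])
    return root


tree = rows_to_xml("example-classification", rows)
print(ET.tostring(tree, encoding="unicode"))
```

Two passes are used so that a child row may appear before its parent in the input without breaking the conversion.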

12 Variable editor
- For creating and maintaining the descriptions of statistical data and variables
- In the testing phase
- Implementation begins in 2010
- Descriptions are saved in the eXist XML database as XML documents conforming to the CoSSI model

13 Content and functions of the variable editor
- A data description comprises a general description of the data, a list of variables, and information about each individual variable
- The general data description includes descriptive information on the entire data document
- The variable list interleaf allows management of the list of variables in the data description and selection of the variable whose description needs editing

14 Variable list interleaf

15 Variable metadata

| Field name | Description |
|---|---|
| short name | Short identifying name of variable |
| long name | Name of variable in natural language |
| concept definition | Basic conceptual description of variable |
| operational definition | Verbal description of the formation of the variable |
| deduction rule | E.g. programming instructions, mathematical formula, etc. |
| classification ID | Identifier of classification. Refers to a classification in the classification database. |
| unit of measure | Measurement unit of variable |
| variable modified | Date of creation or modification of variable (yyyy-mm-dd) |
| start of validity | Start date of validity of variable (yyyy-mm-dd) |
| end of validity | End date of validity of variable (yyyy-mm-dd) |
| status | Stage of editing of variable: draft, ready, validated |
| variable group | Name of group to which variable belongs. Makes working with long variable lists easier. |
| work comment | Free text field. Contains information only for the use of the maintainer of a description. |
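The variable metadata fields listed on this slide could be modelled along the following lines. This is a minimal sketch, not the variable editor's implementation: the class `VariableDescription`, its defaults, the choice of which fields are mandatory, and the XML element names (which simply reuse the field names) are assumptions, and the output is not claimed to conform to the CoSSI model.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields
from datetime import datetime
from typing import Optional

ALLOWED_STATUSES = ("draft", "ready", "validated")  # stages named on the slide
DATE_FIELDS = ("variable_modified", "start_of_validity", "end_of_validity")


@dataclass
class VariableDescription:
    """Hypothetical record holding the slide's variable metadata fields."""
    short_name: str
    long_name: str
    concept_definition: str = ""
    operational_definition: str = ""
    deduction_rule: str = ""
    classification_id: Optional[str] = None
    unit_of_measure: Optional[str] = None
    variable_modified: Optional[str] = None   # yyyy-mm-dd
    start_of_validity: Optional[str] = None   # yyyy-mm-dd
    end_of_validity: Optional[str] = None     # yyyy-mm-dd
    status: str = "draft"
    variable_group: Optional[str] = None
    work_comment: str = ""

    def __post_init__(self):
        if self.status not in ALLOWED_STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")
        for name in DATE_FIELDS:
            value = getattr(self, name)
            if value is not None:
                # Raises ValueError if the text is not a year-month-day date.
                datetime.strptime(value, "%Y-%m-%d")

    def to_xml(self) -> ET.Element:
        """Serialise non-empty fields to XML (element names are assumptions)."""
        root = ET.Element("variable")
        for f in fields(self):
            value = getattr(self, f.name)
            if value not in (None, ""):
                ET.SubElement(root, f.name).text = str(value)
        return root


example = VariableDescription(
    short_name="age",
    long_name="Age in years",
    unit_of_measure="year",
    variable_modified="2010-03-11",
)
print(ET.tostring(example.to_xml(), encoding="unicode"))
```

Validating the status and date fields when the record is created keeps malformed descriptions from reaching the XML store in the first place.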

16 Results from the variable editor project
In addition to the actual variable editor application, the project also created preconditions for:
- the development of a consistent information architecture
- the construction of production applications in which metadata need not be separately produced or manually added to data when publishing or archiving statistics
- an information service in which excessive time need not be spent on searching for metadata or on reproducing metadata for special compilation assignments
- a system from which tabulation applications can retrieve table column and row headings in multiple languages for all statistics using the same methods
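Retrieving table headings in several languages from a shared metadata store might look roughly like this. The in-memory `LABELS` dictionary stands in for the metadata system, and the functions `heading` and `table_headings`, the language codes, and the fallback-to-English rule are all assumptions made for the sketch.

```python
from typing import Dict, List

# Hypothetical multilingual labels keyed by variable short name.
# "municipality" deliberately lacks a Swedish label to show the fallback.
LABELS: Dict[str, Dict[str, str]] = {
    "age": {"fi": "Ikä", "sv": "Ålder", "en": "Age"},
    "sex": {"fi": "Sukupuoli", "sv": "Kön", "en": "Sex"},
    "municipality": {"fi": "Kunta", "en": "Municipality"},
}


def heading(short_name: str, lang: str, fallback: str = "en") -> str:
    """Return the label in `lang`, else in `fallback`, else the short name."""
    labels = LABELS.get(short_name, {})
    return labels.get(lang) or labels.get(fallback) or short_name


def table_headings(columns: List[str], lang: str) -> List[str]:
    """Resolve the headings for a list of table columns in one language."""
    return [heading(column, lang) for column in columns]


print(table_headings(["age", "sex", "municipality"], "fi"))
print(table_headings(["age", "sex", "municipality"], "sv"))
```

Because every tabulation application would call the same lookup, a corrected label would only need to be changed in one place.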

17 Experiences gained during the variable editor project
- Various questions concerning standardisation had to be addressed in the project although they were not originally within its scope; they had to be resolved and they took a lot of time
- Because the variable editor project was the first stage in the revision of the metadata system, it was subject to a diversity of expectations
- The project was a good test run for the CoSSI model: the data content of the model proved to be exhaustive

18 The planning and building of a classification editor
Reasons for renewing the classification system:
- the present way of maintaining classifications has been viewed as inflexible by statistics
- the Sybase relational databases are being given up
- ICT strategy: in the next few years the agency will introduce a common statistical metadata system based on the CoSSI model
Classification editor project 2010:
1) definition stage
2) construction stage

19 Goals of the classification editor project
- Analyse the service needs that a centralised classification system must meet
- Create maintenance tools for classifications in connection with the CoSSI/eXist metadata store, so that the basic classification maintenance needs of individual statistics are met in a user-oriented manner that also allows further development of the classification system
- Produce solutions that ensure the interoperability of the Sybase classification database and the eXist metadata database
- Compile user instructions for the editor
- Pilot test the editor

20 Benefits of the new classification system
- A classification system that serves its users well will encourage centralised and structured maintenance of classifications
- The documentation of classifications will improve, making them easy to find for in-house use and for the provision of information services
- The new classification system will support smooth movement between data descriptions, variable descriptions and the maintenance of classifications, and thus improve the efficiency of the maintenance and use of classifications in statistics

21 General benefits of the common classification system
- A centralised classification system eases the workload of maintaining classifications because they are maintained in only one place
- Reduces the possibility of errors because classifications are documented in the system consistently, so that they are accessible to everybody and easy to find
- Improves the efficiency of time use because working hours need not be spent on looking for classifications and their background information
- Makes the classifications used in different statistics visible to everybody and thus creates possibilities for their harmonisation

22 In conclusion: Why do some statistics departments still have their own metadata systems instead of using the centralised system?
- Centralised metadata work progresses too slowly from the perspective of individual statistics. We should rethink our construction and implementation strategy.
- A common attitude still regards the process of an individual set of statistics as unique, and therefore incapable of exploiting systems that are meant for all statistics. We have to get quick results to prove the benefits of the system.
- Commitment by the Management and their support for the work are crucial. We have to convince them.

23 THANK YOU FOR YOUR ATTENTION!

