A modular metadata-driven statistical production system The case of price index production system at Statistics Finland Pekka Mäkelä, Mika Sirviö.

Presentation transcript:

Background
- Thoughts about a generic production and information system for price indices appeared around 2000, and preliminary planning started in 2004
- The system has been designed not only for calculation but also for planning and administration of indices
- During the project, a need for a model of the statistical production process emerged
Objectives
- Efficiency gains, reliability improvement
- Same data used for multiple purposes
- Reduction of data-specific applications
Implications
- Common terminology, process, working methods

Modularity in production of statistics
- According to the principles of modularity, the different phases of statistical production need to be standardized and independent of each other
- The complexity of the system emerges from the interaction of the specialized (domain-specific) modules
- Modules only respond to inputs of a specific class and produce outputs of a specific class
- The interaction of the modules requires standardization of the interfaces between the modules
- A strictly modular statistical production process can increase productivity but requires an advanced process management system
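The modularity principles above can be illustrated with a minimal sketch. It is not the Statistics Finland implementation: the classes `RawPrices`, `ValidatedPrices`, `AveragePrices` and `Module`, and the functions `validate`, `average` and `run_chain`, are hypothetical names chosen for illustration. Each module declares the class of input it responds to and the class of output it produces, and the chain checks that adjacent interfaces match before anything runs.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable


@dataclass(frozen=True)
class RawPrices:
    """Hypothetical input class: collected price quotes per item."""
    quotes: dict


@dataclass(frozen=True)
class ValidatedPrices:
    """Hypothetical intermediate class: quotes that passed validation."""
    quotes: dict


@dataclass(frozen=True)
class AveragePrices:
    """Hypothetical output class: one average price per item."""
    averages: dict


@dataclass(frozen=True)
class Module:
    """A production phase with a declared input class and output class."""
    name: str
    input_type: type
    output_type: type
    run: Callable

    def __call__(self, data):
        # A module only responds to inputs of its declared class.
        if not isinstance(data, self.input_type):
            raise TypeError(f"{self.name} expects {self.input_type.__name__}")
        result = self.run(data)
        # It must also produce outputs of its declared class.
        if not isinstance(result, self.output_type):
            raise TypeError(f"{self.name} must produce {self.output_type.__name__}")
        return result


def validate(raw):
    # Keep only positive quotes; drop items left without any quote.
    kept = {item: [p for p in ps if p > 0] for item, ps in raw.quotes.items()}
    return ValidatedPrices({item: ps for item, ps in kept.items() if ps})


def average(valid):
    # Summarize each item's validated quotes with an arithmetic mean.
    return AveragePrices({item: mean(ps) for item, ps in valid.quotes.items()})


def run_chain(modules, data):
    # Standardized interfaces: each module's output class must be the
    # next module's input class, checked before any module runs.
    for left, right in zip(modules, modules[1:]):
        if left.output_type is not right.input_type:
            raise TypeError(f"{left.name} -> {right.name}: interface mismatch")
    for module in modules:
        data = module(data)
    return data


validation = Module("validation", RawPrices, ValidatedPrices, validate)
averaging = Module("averaging", ValidatedPrices, AveragePrices, average)

result = run_chain([validation, averaging], RawPrices({"milk": [1.0, -5.0, 3.0]}))
print(result.averages)
```

Because the modules only meet at the declared classes, either one could be replaced without touching the other, which is the independence the slide asks for.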

Value chain of producing statistical information
- The value chain of producing statistical information is a high-level description of the production process.
- It highlights the activities that produce value for the customer, and their interdependencies.
- The value chain offers two perspectives on the production process: value creation and production costs.
- The effectiveness of the value chain depends both on the effectiveness and expediency of each single link and on the cooperation of the modules.
- Critical factors in the functioning of the value chain are the interfaces between the modules and their standardisation.
Links of the value chain:
- Emerging need of information
- Determining product concept
- Specifying presentation content
- Building measuring plan
- Acquiring data and metadata
- Creating statistical information
- Producing presentation content
- Providing communication

Determining product concept
- Product concept is a high-level definition of an information product, including
  - main features of the product
  - features and properties that define the product or product group or differentiate it from other products
  - customer-product interactions (e.g. potential use and customers)
- Product concept describes the benefits of the product for the customer and why the product cannot be replaced by other products; it does not specify the implementation of the functionality and properties

Creating statistical information
Creating statistical information for providing communication
- Organizing measured/empirical data and metadata into standardized and validated data elements
- Summarizing data, estimating and validating the values of population parameters based on data elements
- Creating the specified presentation content of information products
  - analysis of outputs and identification of core elements/outputs
  - description and interpretation of statistics (graphs, tables, figures etc.)

Process resource description model for production and information system for price indices
Description model has four distinct elements
- The values that guide the activities
- The activities, which consist of statistical production and various supporting functions
- The instruments, which complement the values by providing practical means for the realisation of the values in the business process
- The object of activity, input/output

The key values
- General values
  - Fundamental principles of official statistics
  - European statistics code of practice
  - Human resource management strategy
  - Customer services policy
- Product concept and quality specifications
- Production model (implemented product concept)
- Data and metadata
The key instruments
- Technical instruments
  - Information and communication technology hardware and software
- Communicative instruments
  - Work culture, distributed cognition
- Conceptual instruments
  - Terminological concept analysis
  - Conceptual modelling
  - Descriptive statistics, inferential statistics

Terminological concept analysis and modeling
Why is terminological concept analysis used for information system applications?
- Terminological concept modeling produces valuable semantic information about concepts and concept relations.
- Concept modeling produces easy-to-use and widely applicable IT and concept systems, e.g. classifications in accordance with a complete concept hierarchy

Computer-aided conceptual modelling
- The generic production and information system for price indices enables semi-automatic construction of concept systems, or ontologies
- Some practical implementations
  - Structuring of the domain of statistics and creating classifications by using delimiting characteristics as subdivision criteria
  - Characteristics are modelled by attribute-value pairs
  - Common pool of characteristics for all concepts in the system
  - Generic product specification models using the common pool of characteristics
  - Configuration of information products

Creating classification, background
- A concept is a unit of knowledge created by a unique combination of characteristics
- Characteristics reflect shared properties of the objects belonging to the extension of a concept
- A delimiting characteristic is an abstract concept which consists of a set of concepts which are distinct and jointly exhaustive
- Delimiting characteristics and their values can be used for subdividing a concept into several subconcepts
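These definitions map naturally onto a small data structure. The following is only a sketch under assumptions: the function names `concept`, `is_subconcept` and `subdivide`, and the attribute `life_stage`, are hypothetical and not taken from the system described in the slides. A concept is stored as its combination of attribute-value pairs, so a subconcept is simply a concept carrying all the characteristics of the broader concept plus at least one more.

```python
def concept(**characteristics):
    """Represent a concept by its combination of (attribute, value) pairs."""
    return frozenset(characteristics.items())


def is_subconcept(narrower, broader):
    # A subconcept has every characteristic of the broader concept plus more,
    # so the broader concept's characteristics form a proper subset.
    return broader < narrower


def subdivide(parent, attribute, values):
    """Split a concept into subconcepts, one per value of a delimiting characteristic."""
    return {value: parent | {(attribute, value)} for value in values}


adult = concept(life_stage="adult")
by_gender = subdivide(adult, "gender", ["male", "female"])

print(is_subconcept(by_gender["male"], adult))
print(by_gender["male"] == concept(life_stage="adult", gender="male"))
```

Storing concepts as sets of pairs keeps the relations between concepts explicit: no separate table of parent links is needed, since the hierarchy can be derived from the characteristics themselves.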

Classification by delimiting characteristics
Domain: Adults
Delimiting characteristics: gender, parental relationship (PR)
Values of gender: male, female
Values of parental relationship: has PR, no PR

Order of the delimiting characteristics: gender, PR
  Gender=male
    Gender=male, has PR (A)
    Gender=male, no PR (B)
  Gender=female
    Gender=female, has PR (C)
    Gender=female, no PR (D)

Order of the delimiting characteristics: PR, gender
  Has PR
    Has PR, gender=male (A)
    Has PR, gender=female (C)
  No PR
    No PR, gender=male (B)
    No PR, gender=female (D)
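The example can be reproduced with a short sketch. The function names `build_hierarchy` and `leaf_classes` are hypothetical, and the code is an illustration rather than the system's actual method. It shows that changing the order of the delimiting characteristics changes the intermediate levels of the hierarchy, while the most specific classes (A to D) stay the same.

```python
from itertools import product


def build_hierarchy(characteristics):
    """Nest classes following the given order of delimiting characteristics."""
    if not characteristics:
        return {}
    (attribute, values), rest = characteristics[0], characteristics[1:]
    return {f"{attribute}={value}": build_hierarchy(rest) for value in values}


def leaf_classes(characteristics):
    """Return the most specific classes as sets of (attribute, value) pairs."""
    attributes = [attribute for attribute, _ in characteristics]
    value_lists = [values for _, values in characteristics]
    return {frozenset(zip(attributes, combo)) for combo in product(*value_lists)}


gender = ("gender", ["male", "female"])
pr = ("PR", ["has PR", "no PR"])

by_gender_first = build_hierarchy([gender, pr])
by_pr_first = build_hierarchy([pr, gender])

# The top level differs with the order of the characteristics...
print(list(by_gender_first))
print(list(by_pr_first))
# ...but the leaf classes are identical in both orders.
print(leaf_classes([gender, pr]) == leaf_classes([pr, gender]))
```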

Benefits of using delimiting characteristics as subdivision criteria in creating classifications
- Creating a classification and naming the created classes according to predefined naming conventions are clearly separate tasks (modularity).
- Every class in the hierarchy is connected with rich metadata which tell users explicitly the characteristics of the concept and its relations to other concepts.
- The framework for creating classifications presents and structures the information of the domain of statistics precisely.
- Explicit structuring of the domain of statistics enables automatic or semi-automatic processing of metadata by computers

Configurable information products
“A configurable product, or product family, is such that each product individual is adapted to the requirements of a particular customer order on the basis of a predefined configuration model, which describes the set of legal product variants (Sabin et al., 1998; Soininen, 2000). Configurable products clearly separate between the process of designing a product family and the process of generating a product individual according to the product configuration model. This places configurable product in between mass-products and one-of-a-kind products by enabling customer specific adaptation without losing all the economical benefits of mass-products (Tiihonen et al., 1998)”

Examples of configuration in production and information system for price indices
- Indices with the same domain and nomenclature but which are calculated differently:
  - Different index formulas can be used as input in the system.
  - An index formula in MathML format is used as a predefined component.
  - Indices in accordance with new index formulas can be calculated instantly.
- Each product has a product specification:
  - The same product specification can be shared by many products.
  - Product specifications are generated from characteristics modelled by formal feature specifications, i.e. attribute-value pairs
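A minimal sketch of this kind of configuration follows. It makes one loud simplification: the slides describe index formulas supplied in MathML, whereas here each formula is a plain Python function in a registry, because parsing MathML is outside the scope of a short example. The names `FORMULAS`, `calculate`, `product_a` and `product_b`, and all the data values, are hypothetical. The Laspeyres and Paasche formulas themselves are the standard textbook definitions.

```python
def laspeyres(p0, p1, q0, q1):
    """Laspeyres price index: base-period quantities as weights."""
    return 100 * sum(p1[i] * q0[i] for i in p0) / sum(p0[i] * q0[i] for i in p0)


def paasche(p0, p1, q0, q1):
    """Paasche price index: current-period quantities as weights."""
    return 100 * sum(p1[i] * q1[i] for i in p0) / sum(p0[i] * q1[i] for i in p0)


# Predefined components: index formulas selectable by name (stand-in for MathML).
FORMULAS = {"laspeyres": laspeyres, "paasche": paasche}

# Product specifications as attribute-value pairs. Both products share the
# same domain and nomenclature and differ only in the index formula.
shared = {"domain": "example goods", "nomenclature": "example nomenclature"}
product_a = {**shared, "formula": "laspeyres"}
product_b = {**shared, "formula": "paasche"}


def calculate(spec, p0, p1, q0, q1):
    # Look up the formula named in the product specification and apply it.
    return FORMULAS[spec["formula"]](p0, p1, q0, q1)


# Illustrative data only: prices and quantities for two items in two periods.
p0 = {"a": 1.0, "b": 2.0}
p1 = {"a": 2.0, "b": 2.0}
q0 = {"a": 10, "b": 5}
q1 = {"a": 5, "b": 10}

print(calculate(product_a, p0, p1, q0, q1))
print(calculate(product_b, p0, p1, q0, q1))
```

Registering another function under a new key in `FORMULAS` makes that formula available to every product specification without changing `calculate`, which mirrors the point that indices using new formulas can be calculated without new data-specific applications.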

Process management
- Process control focuses on the requirements that the process has to fulfill (e.g. timetables, quality criteria, archiving, confidentiality)
- Process analysis
  - analyzes the key factors affecting the process in order to guide and improve the process
  - provides accurate and timely (possibly real-time) information on the production of statistics for the production management and operative staff, to enable fast reaction and to support process development

Implementation timetable
- 2008: Index of real estate maintenance costs
- 2009: Price index of newly built dwellings
- 2010: Building cost index
- 2011: Deflator indices
- 2011: Index of producer prices of agricultural products
- 2012: Consumer price indices
- 2013: Producer price indices for services
- 2013: Producer price indices