Presentation on theme: "Database Management 2nd EUKLEMS Consortium Meeting, 9-11 June 2005, Helsinki This project is funded by the European Commission, Research Directorate General."— Presentation transcript:
Database Management 2nd EUKLEMS Consortium Meeting, 9-11 June 2005, Helsinki This project is funded by the European Commission, Research Directorate General as part of the 6th Framework Programme, Priority 8, "Policy Support and Anticipating Scientific and Technological Needs".
EUKLEMS Database requirements Interface with Consortium inputs Produce consistent KLEMS data from disparate data Store EUKLEMS database Generate Productivity estimates Store and document Assumptions and Methods Interface with EUKLEMS public output
Examples of Practical Problems (1) Aggregation: Chain aggregate 2-digit output by industry to a 1-digit level Disaggregation: Estimate computer mfg. and other machinery output given output of total machinery mfg. Ensure that the results aggregate to the observed total. Employ available indicators (pattern data) of how to split the two. Balancing Estimate investment by industry and by asset type. Observe detail in only 1 dimension at a time. Estimates aggregated by type to equal the observed totals by industry. Estimates aggregated by industry to equal the observed totals by type. Employ available indicators (e.g. from other countries) to estimate the detailed breakdown.
Examples of Practical Problems (2) Concording Converting data for timeseries breaks in national industry codes Converting data from national industry code to EUKLEMS definitions Cross-classification (technically the same as concording) Estimating labor by quality-level using data by occupation.
Productivity Research Data System Programmed as application layer on top of SAS, designed to overcome practical data hurdles Data to be manipulated are coded in relational databases Identities coded in meta data SAS macros to manipulate the data make use of the meta data Aggregation, disaggregation, balancing, and concording Specialized tools to estimate Capital stocks, labour services, capital services, TFP, Domar weighting, alternative deflation In addition: Documentation reference manual (also on-line) users guide User community (in statu nascendi)
What is a relational database? Seperate data and meta-data Meta-data has number of dimensions, e.g. Year: 1970, 1971,….. Country: A, B,C,.. Industry: 1, 2, 3, …. Unit: nominal, real, deflator Type: investment in building, machinery,…, 1970_B_3_nom_mach: 5,870
General lay-out of Analytical module 1. Collection of basic data A. Basic data by country 2. Making meta-data (hierarchies, concordances) B. Data in Prodsys readable format 3. Combining datasources: (dis)agg, concor, balancing C. Input data KLEMS: full time-series and industries 4. Applying productivity tools: harmonization and experiment D. Output data KLEMS 5. Selection of series for analytical database E. Public analytical database Statistical module is same as analytical, except only data and procedures which are backed by NSO
Partner Involvement Use SAS/Prodsys or Do not use SAS/Prodsys? Advantages of using SAS/Prodsys is full flexibility to experiment, but steep learning curve and expensive Importantly: SAS/Prodsys is not needed for data delivery, only Excel!
Data delivery standard Each table neeeds the following information Description of source (publisher, NSO) and version (release date) Classification Meta-data associated with each dimension of national database Concordance with EUK dimension lists Document with guidelines will be prepared
Tree structure of data hierarchy: A as Grand Total with nodes (children) B and C each subdivided further down A B C D E F G H
Highest node on top, children indented, each node always immediately followed by its own children: A.B..D..E..F.C..G..H
EU KLEMS row tree structure
Sequence of processing raw data Prepare metadata (hierarchy) and concordance Metadata and concordance are read into ProdSys Then data is read into ProdSys Metadata determine how original raw data gets translated to SAS (table) version of original raw data Metadata + already present EU KLEMS metadata + concordance determine how SAS version of original raw data gets transformed into EU KLEMS data You dont have to rearrange the actual data into this hierarchical structure; ProdSys assigns the data to the correct category in SAS once the category is part of the metadata.
Concordances A raw data category may have to be split into several EU KLEMS categories We need to be able to determine how the split is made: weights Use pattern data to calculate these weights