Conversions from national grid data to harmonized European grid data: production and challenges. EFGS, Lisbon, 12-14 October 2011. Rina Tammisto, Senior Statistician, Statistics Finland; Marja Tammilehto-Luode, Chief Adviser, Statistics Finland
Harmonization. Data harmonization: source data (georeferenced national data; disaggregated European data); methods used (aggregated, disaggregated, hybrid method). Spatial harmonization: a grid net covers the whole of Europe.
ETRS89-LAEA grid net. Downloadable ZIP: http://www.efgs.info/data/GEOSTAT-1km-Grid.zip/view (Grid_ETRS89_LAEA_1K.shp, about 500 MB)
Figures: LAEA grid net in relation to the national grid net in Finland; LAEA grid net in relation to the national grid net in Austria.
Differences in the locations of grid cells in different projections (or co-ordinate systems). A grid cell produced using the national ETRS89-TM35FIN co-ordinate system and projection is divided among several ETRS89-LAEA grid cells. Direct conversion of grids between co-ordinate systems or projections is not usable, because the grids are located differently in relation to each other. An issue to be solved: how to use national grid datasets when direct conversion is not applicable?
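The splitting of one national cell among several target cells can be illustrated with a minimal sketch. It assumes, purely for illustration, that the two grids differ by a small rotation; the `rotate` function and the 5-degree angle are hypothetical stand-ins for the real ETRS89-TM35FIN to ETRS89-LAEA transformation, and the co-ordinates are invented.

```python
import math

CELL = 1000.0  # grid cell size in metres


def rotate(x, y, angle_deg):
    """Rotate a point about the origin (hypothetical stand-in for a reprojection)."""
    a = math.radians(angle_deg)
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a))


def target_cells_hit(x0, y0, angle_deg):
    """Target-grid cells reached by sample points inside one source cell.

    (x0, y0) is the lower-left corner of the source cell. Points are sampled
    slightly inside the cell so that an unrotated cell maps to exactly one cell.
    """
    fractions = (0.05, 0.5, 0.95)
    hit = set()
    for fx in fractions:
        for fy in fractions:
            xr, yr = rotate(x0 + fx * CELL, y0 + fy * CELL, angle_deg)
            hit.add((math.floor(xr / CELL), math.floor(yr / CELL)))
    return hit


aligned = target_cells_hit(10_000.0, 20_000.0, angle_deg=0.0)
rotated = target_cells_hit(10_000.0, 20_000.0, angle_deg=5.0)
print(len(aligned), len(rotated))
```

When the grids are aligned, every sample point stays in a single target cell; once the grids are rotated relative to each other, the same source cell spreads over several target cells, which is why a cell-to-cell conversion has no unique answer.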
Tested method 1: aggregation of grid data using converted building points
1) The georeferenced source data is converted: buildings are converted from ETRS89-TM35FIN to ETRS89-LAEA
2) The converted building points are joined with the ETRS89-LAEA grid net
3) The statistical data is aggregated
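The three steps of method 1 can be sketched roughly as follows. This is an illustration under stated assumptions, not the production process described in the slides: the building points are assumed to be available as ETRS89 latitude/longitude (starting from ETRS89-TM35FIN eastings and northings would need an extra inverse transverse Mercator step or a projection library), the cell code format `1kmN…E…` is assumed to follow the INSPIRE-style naming, and the building records are invented.

```python
import math
from collections import defaultdict

# ETRS89-LAEA (EPSG:3035) parameters on the GRS80 ellipsoid.
A = 6378137.0
F = 1 / 298.257222101
E2 = F * (2 - F)
E = math.sqrt(E2)
LAT0, LON0 = math.radians(52.0), math.radians(10.0)
FALSE_E, FALSE_N = 4321000.0, 3210000.0


def _q(lat):
    """Authalic-latitude helper of the ellipsoidal LAEA projection."""
    s = math.sin(lat)
    return (1 - E2) * (s / (1 - E2 * s * s)
                       - math.log((1 - E * s) / (1 + E * s)) / (2 * E))


QP = _q(math.pi / 2)
BETA0 = math.asin(_q(LAT0) / QP)
RQ = A * math.sqrt(QP / 2)
D = (A * math.cos(LAT0) / math.sqrt(1 - E2 * math.sin(LAT0) ** 2)
     / (RQ * math.cos(BETA0)))


def to_laea(lat_deg, lon_deg):
    """Step 1: convert geographic ETRS89 co-ordinates to ETRS89-LAEA metres."""
    lat, dlon = math.radians(lat_deg), math.radians(lon_deg) - LON0
    beta = math.asin(_q(lat) / QP)
    b = RQ * math.sqrt(2 / (1 + math.sin(BETA0) * math.sin(beta)
                            + math.cos(BETA0) * math.cos(beta) * math.cos(dlon)))
    x = FALSE_E + b * D * math.cos(beta) * math.sin(dlon)
    y = FALSE_N + (b / D) * (math.cos(BETA0) * math.sin(beta)
                             - math.sin(BETA0) * math.cos(beta) * math.cos(dlon))
    return x, y


def cell_id(x, y):
    """Step 2: 1 km cell containing the point (assumed INSPIRE-style code)."""
    return f"1kmN{int(y // 1000)}E{int(x // 1000)}"


def aggregate(buildings):
    """Step 3: sum inhabitants per 1 km LAEA cell."""
    totals = defaultdict(int)
    for lat, lon, inhabitants in buildings:
        totals[cell_id(*to_laea(lat, lon))] += inhabitants
    return dict(totals)


# Invented building points: (latitude, longitude, inhabitants).
buildings = [(60.1700, 24.9400, 12), (60.1701, 24.9402, 30), (61.5000, 23.7500, 7)]
population_by_cell = aggregate(buildings)
print(population_by_cell)
```

Because every building is converted individually before the join, the positional accuracy of the source data carries over to the LAEA grid, which is the advantage the slides attribute to this method.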
Figures: building points in ETRS89-TM35FIN; building points in ETRS89-LAEA; aggregation of statistical data.
Method 1. Advantages: points are easily convertible and the original quality of location is maintained; from a geostatistical point of view, data quality is entirely the same as in the national data. Disadvantages: double sets of primary data; double production processes from the beginning; risk of data disclosure due to the use of several co-ordinate systems (gaps between datasets).
Tested method 2: conversion of grid data using ready-made national grid datasets
1) The ready-made national grid dataset in ETRS89-TM35FIN is converted into ETRS89-LAEA: polygon to point, using the middle points of the national grid cells, followed by conversion of the middle points
2) The converted points are joined with the ETRS89-LAEA grid net
3) The statistical data is aggregated
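A minimal sketch of method 2 follows, assuming national cells are keyed by their lower-left corner in metres. The `placeholder_convert` function is a hypothetical stand-in (a rotation plus a shift) for the real ETRS89-TM35FIN to ETRS89-LAEA transformation, and the cell values are invented; only the sequence middle point, conversion, join, aggregation is taken from the steps above.

```python
import math
from collections import defaultdict


def centroid(x_ll, y_ll, size):
    """Polygon to point: middle point of a square cell from its lower-left corner."""
    return x_ll + size / 2, y_ll + size / 2


def regrid(national_cells, size, convert, target=1000):
    """Convert cell middle points, join them to the target grid and sum values."""
    totals = defaultdict(int)
    for (x_ll, y_ll), inhabitants in national_cells.items():
        x, y = convert(*centroid(x_ll, y_ll, size))
        key = (math.floor(x / target), math.floor(y / target))
        totals[key] += inhabitants
    return dict(totals)


def placeholder_convert(x, y):
    """Hypothetical stand-in for the real conversion: small rotation plus shift."""
    a = math.radians(5.0)
    return (x * math.cos(a) - y * math.sin(a) + 4_000_000,
            x * math.sin(a) + y * math.cos(a) + 3_000_000)


# Invented 250 m national cells: lower-left corner -> inhabitants.
national_250m = {(0, 0): 10, (250, 0): 4, (0, 250): 7, (750, 750): 2}
laea_1km = regrid(national_250m, 250, placeholder_convert)

# Each national cell is assigned whole to one target cell, so the total is kept.
print(sum(national_250m.values()), sum(laea_1km.values()))
```

Since each national cell goes entirely to the target cell containing its middle point, the total number of inhabitants is preserved, while the allocation of individual cells near target-cell borders can differ from the building-based result; a smaller national cell size limits that displacement.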
Production of the national grid data -> middle points of national grids -> conversion of the points, spatial join with the ETRS89-LAEA grid net -> aggregation of statistical data
Effects of grid cell size on the quality of the produced data. Tested grid cell sizes of the national grid data: 125 m x 125 m (highest-resolution data), 250 m x 250 m, 1 km x 1 km. Reference data: data produced using method 1 (conversion made on building points). Additional test: JRC/GISCO disaggregated data - data produced for the Finnish Grid Database.
Comparison of the test datasets
Statistics: number of grid cells (N), mean (inhabitants per populated grid cell), total number of inhabitants in the dataset (Sum), minimum, maximum

Dataset                                 Variable      N        Mean  Sum        Minimum  Maximum
Dataset from converted building points  POP_1KM_LAEA  102 050  51.0  5 204 192  1        14 053
Datasets from converted grid points     POP_1KM_125M  102 249  50.9  5 204 192  1        14 197
                                        POP_1KM_250M  102 759  50.6  5 204 166  1        13 283
                                        POP_1KM_1KM    99 049  52.5  5 204 179  1        19 175
JRC dataset                             POP_DISAGG    159 921  32.4  5 181 806  0.01      5 866
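As a quick consistency check on the figures, the Mean column should equal Sum divided by N. The sketch below uses the N, Mean and Sum values as read from the table above; the dictionary layout is only an illustration.

```python
# (N, Mean, Sum) per variable, as read from the comparison table.
rows = {
    "POP_1KM_LAEA": (102050, 51.0, 5204192),
    "POP_1KM_125M": (102249, 50.9, 5204192),
    "POP_1KM_250M": (102759, 50.6, 5204166),
    "POP_1KM_1KM": (99049, 52.5, 5204179),
    "POP_DISAGG": (159921, 32.4, 5181806),
}

# Recompute the mean from Sum / N and compare it with the reported mean.
recomputed = {name: round(total / n, 1) for name, (n, _, total) in rows.items()}
for name, (n, mean, total) in rows.items():
    print(f"{name}: reported {mean}, recomputed {recomputed[name]}")

# Difference of each total from the building-based reference total.
reference_total = rows["POP_1KM_LAEA"][2]
total_gap = {name: total - reference_total for name, (_, _, total) in rows.items()}
print(total_gap)
```

The gap in totals shows how closely each conversion meets the target of keeping the sum of the whole dataset correct, independently of how inhabitants are distributed among cells.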
Pearson correlation coefficients (Prob > |r| under H0: Rho=0 is <.0001 for every pair of different variables)

              POP_1KM_LAEA  POP_1KM_125M  POP_1KM_250M  POP_1KM_1KM  POP_DISAGG
POP_1KM_LAEA  1.00000       0.99900       0.99495       0.90989      0.79804
POP_1KM_125M  0.99900       1.00000       0.99471       0.90990      0.79857
POP_1KM_250M  0.99495       0.99471       1.00000       0.90611      0.79840
POP_1KM_1KM   0.90989       0.90990       0.90611       1.00000      0.74920
POP_DISAGG    0.79804       0.79857       0.79840       0.74920      1.00000

Number of observations

              POP_1KM_LAEA  POP_1KM_125M  POP_1KM_250M  POP_1KM_1KM  POP_DISAGG
POP_1KM_LAEA  102 050        99 372        97 216       81 647        85 737
POP_1KM_125M   99 372       102 249        97 488       81 808        85 871
POP_1KM_250M   97 216        97 488       102 759       82 185        86 268
POP_1KM_1KM    81 647        81 808        82 185       99 049        82 069
POP_DISAGG     85 737        85 871        86 268       82 069       159 921

POP_1KM_LAEA: dataset from converted building points. POP_1KM_125M, POP_1KM_250M, POP_1KM_1KM: datasets from converted grid points. POP_DISAGG: JRC dataset.
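The number of observations differs from pair to pair, which suggests that each coefficient was computed only over grid cells present in both datasets. That reading is an assumption; under it, a pairwise Pearson correlation could be sketched as below, with invented cell identifiers and values.

```python
import math


def pearson_on_common_cells(a, b):
    """Pearson r over cells present in both datasets, plus the number of such cells."""
    common = sorted(set(a) & set(b))
    n = len(common)
    xs = [a[c] for c in common]
    ys = [b[c] for c in common]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy), n


# Invented example: inhabitants per cell in a reference and a converted dataset.
reference = {"A": 10, "B": 20, "C": 30, "D": 5}
converted = {"A": 12, "B": 18, "C": 33, "E": 4}

r, n_obs = pearson_on_common_cells(reference, converted)
print(round(r, 5), n_obs)
```

Cells "D" and "E" appear in only one dataset and are left out, so the coefficient says nothing about inhabitants that ended up in a cell the other dataset leaves empty.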
Evaluation of differences using absolute values of inhabitants/km² grid cell (absolute values of differences). Figure: values of the converted dataset in relation to the values of the national datasets, with the identity line (the 45-degree line).
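The cell-by-cell comparison can be sketched as follows. Treating a cell that is missing from one dataset as having zero inhabitants is an assumption made here for illustration, and the cell identifiers and values are invented.

```python
def absolute_differences(reference, converted):
    """Absolute difference in inhabitants for every cell in either dataset."""
    cells = set(reference) | set(converted)
    return {c: abs(converted.get(c, 0) - reference.get(c, 0)) for c in cells}


# Invented example: inhabitants per 1 km cell.
reference = {"A": 10, "B": 20, "C": 30, "D": 5}
converted = {"A": 10, "B": 18, "C": 33, "E": 4}

diffs = absolute_differences(reference, converted)
on_identity_line = sorted(c for c, d in diffs.items() if d == 0)
total_abs_diff = sum(diffs.values())
print(on_identity_line, total_abs_diff)
```

Cells with a zero difference lie on the identity line of the scatter plot; the further a cell's difference is from zero, the further its point falls from that line.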
Method 2. Advantages: use of the ready-made grid datasets; fewer phases; smaller data mass; the level of quality is a matter of choice (adequate level of quality (?), dependent on use; minimum target: the SUM of the whole dataset is correct); no increase in confidentiality problems from double datasets. Disadvantages: from a geostatistical point of view, data quality is weaker than in the original national data; quality errors, i.e. distortion compared to the correct values (measured by number of inhabitants).
Next steps for the GEOSTAT 1A project, October-November 2011: more tests (any volunteers?); quality definitions concerning the adequate level of quality and the grid scale used; step-by-step guidelines; LAEA dataset: filling the empty grid net with data!