Grids and GeoStatistics

1 Grids and GeoStatistics
A work in progress
Routines for delivering products to the national spatial data interest/infrastructure community
The European Forum for Geostatistics workshop in Bled, Slovenia, 1st – 3rd October 2008
The next step towards an integration of statistics and geography for sustainable development
Vilni Verner Holst Bloch, MSc. landscape ecology and natural resources
Statistics Norway, Otervegen 23, N Kongsvinger
Tel : ++47 / Fax : ++47 /
Session 2: Concept for an integrated web solution / An infrastructure for geostatistics. (The Subproject 3)

2 Overview of the presentation
Background – introduction
Defining the grid
Making guidelines for geostatistics
Standards and metadata
Confidentiality and quality
Further work and deliveries

3 Background Partnership in Norway Digital (obligations)
Expressed needs from partners
Statistics Norway: registers on address level
Population
Buildings
Ground properties
Sport facilities
Farms
Businesses
And many more

4 Background Download page for Norway Digital partners

5 Background Formalising a national grid for geostatistics
Internal drive within Statistics Norway (coming censuses, preparations for a geodatabase)

The Norwegian geospatial data infrastructure (Norge digitalt) was established in [year missing]. All public institutions producing spatial data, including Statistics Norway, are members of the infrastructure. Statistics Norway is the main provider of demographic and socio-economic data in Norway. It therefore has a special responsibility to ensure compatibility between social science data and the traditional topographic and physical data provided by other partners in the infrastructure. A standardized grid is one way to provide such compatibility. By setting the SSBgrid as a standard, Statistics Norway intends to provide a common reference frame for spatial grid data for use in Norge digitalt and thus promote further integration of data from the different partners in the spatial infrastructure.

6 Defining the grid
SSBgrid is an open-ended definition of a family of spatial tessellation models for use in Norway. The models are all built with square grid cells. Each grid is named by concatenating the capital letters 'SSB' with the grid cell size, measured as the length of a side of a grid cell in meters or kilometers. The cell size is also denoted K, with either m or km to indicate the unit of length.
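The naming convention can be sketched in a few lines of Python. This is only an illustration: the function name `ssb_grid_name` is hypothetical, and the rule that whole kilometres get the `km` suffix is an assumption read off the grid names used in this presentation (SSB100m, SSB1km, SSB250km).

```python
def ssb_grid_name(cell_size_m: int) -> str:
    """Return an SSBgrid name for a cell side length K given in metres.

    Hypothetical helper: assumes whole kilometres are written with 'km'
    and all other sizes with 'm', as in the grid names on the slides.
    """
    if cell_size_m <= 0:
        raise ValueError("cell size must be positive")
    if cell_size_m % 1000 == 0:
        return f"SSB{cell_size_m // 1000}km"
    return f"SSB{cell_size_m}m"


print(ssb_grid_name(250))     # cell side of 250 metres
print(ssb_grid_name(5000))    # cell side of 5 kilometres
```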

7 Defining the grid A system of standard grids
Grid name       Cell size        Number of cells
SSB100m (1)     0.01 km2         Approx. 35 […] cells
SSB250m (1)     0.0625 km2       Approx. 5 […] cells
SSB500m (1)     0.25 km2         Approx. 1 […] cells
SSB1km          1 km2            Approx. 350 000 cells
SSB5km          25 km2           Approx. 15 000 cells
SSB10km         100 km2          Approx. 15 000 cells
SSB25km (2)     625 km2          Approx. 500 cells
SSB50km (2)     2 500 km2        Approx. 150 cells
SSB100km (2)    10 000 km2       Approx. 40 cells
SSB250km (2)    62 500 km2       Approx. 10 cells
SSB500km (2)    250 000 km2      Approx. 4 cells

(1) Because of limitations in many software packages and for practical use, these grids are recommended as grids with a county coverage.
(2) These grids might also cover sea territories.

Number of cells refers to coverage of the Norwegian mainland. One has, however, to be aware of deviations in grid cell areas for regions remote from the Norwegian mainland and Svalbard.

8 Defining the grid Identification
The identification (ID) of a grid cell with its southwestern corner at [XC,YC] in an SSBgrid with grid cell size K is built from the two corner coordinates. Following the description under Comments, ID = (XC + 2 000 000) × 10 000 000 + YC.

Computational efficiency
Since a false easting is needed to compute XC and YC for locations along the western coast of Norway, the false easting used in the ID should also be used for this purpose when the ID is calculated directly from point coordinates.

Comments
Every grid cell in an SSBgrid has a unique, 14-digit ID. The first seven digits are the east coordinate of the southwestern (lower left) corner of the grid cell (as meter coordinate in UTM33/EUREF89) with an additional false easting of 2 000 000. The false easting is added in order to avoid problems generated by negative east coordinates and is added to the standard false easting of the UTM coordinate. The last seven digits are the north coordinate of the southwestern (lower left) corner of the grid cell (as meter coordinate in UTM33/EUREF89).

The ID is the key element of the SSBgrid system. Thanks to the ID, SSBgrid data can be distributed as tables instead of spatially organized "raster" data. Several datasets using the same SSBgrid can easily be joined together (using the ID as the key), and the ID facilitates manipulation and analysis of data with standard statistical or tabular data processing tools (e.g. Excel or SPSS).
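A minimal sketch of the ID calculation, assuming the rule described in the comments above: the east coordinate of the lower left corner plus the SSBgrid false easting of 2 000 000 fills the first seven digits, and the north coordinate fills the last seven. The function and constant names are hypothetical, and the exact formula on the original slide is not reproduced here. Adding the false easting before snapping to the grid keeps the easting positive, which is the point made under "Computational efficiency".

```python
import math

SSB_FALSE_EASTING = 2_000_000  # extra false easting defined by SSBgrid (metres)
EAST_SHIFT = 10_000_000        # moves the easting into the first seven digits


def ssb_grid_id(x: float, y: float, k: int) -> int:
    """ID of the cell containing point (x, y) in UTM33/EUREF89, cell size k metres."""
    # Add the false easting first so the snapped easting is never negative.
    shifted_xc = math.floor((x + SSB_FALSE_EASTING) / k) * k
    yc = math.floor(y / k) * k
    return int(shifted_xc) * EAST_SHIFT + int(yc)


def ssb_grid_corner(grid_id: int) -> tuple:
    """Recover the lower left corner [XC, YC] of a cell from its ID."""
    shifted_xc, yc = divmod(grid_id, EAST_SHIFT)
    return shifted_xc - SSB_FALSE_EASTING, yc


# Illustrative coordinates only, not taken from the presentation.
print(ssb_grid_id(262345.6, 6651234.9, 1000))
print(ssb_grid_corner(ssb_grid_id(-30250, 6800499, 250)))
```

Because the ID encodes the corner coordinates, the cell geometry can be rebuilt from a plain table whenever a map is needed: the cell spans from [XC, YC] to [XC + K, YC + K].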

9 Defining the grid

10 Defining the grid

11 Defining the grid Projection and datum
All SSBgrids are defined in UTM33/EUREF89.

Extent
SSBgrids have no definite extent.

Exchange format
SSBgrids have no particular exchange format.

Comments
Grids can be drawn in other projections, but the grid cells will then not appear as squares.

The origin of the grid is the point where the 15° east meridian crosses the equator. The "x" coordinate for this point is, however, [2 500 000, 0] (and not [0,0]) because a false easting of 500 000 is added by the definition of the UTM system and another false easting of 2 000 000 is added by the definition of the SSBgrid in order to avoid using negative numbers in the identification of the grid cells. To cover Norway, it is recommended to start at [ , ] and include an area reaching 1 200 000 meters eastward and 1 500 000 meters northward.

SSBgrid data are most conveniently exchanged using simple tabular formats (text files, Excel spreadsheets, SAS or SPSS data files) where each grid cell is represented as a row in the table. A dataset must include the ID of each grid cell and can contain any number of additional data columns.
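Since each grid cell is one table row keyed by its ID, combining datasets is an ordinary table join. The sketch below uses plain Python dictionaries; the function name `join_by_id`, the column names and all values are made up for illustration and do not come from Statistics Norway data.

```python
def join_by_id(*tables):
    """Outer-join tables given as {grid_id: {column: value}} dictionaries."""
    joined = {}
    for table in tables:
        for grid_id, columns in table.items():
            # Create the row on first sight of an ID, then add the new columns.
            joined.setdefault(grid_id, {}).update(columns)
    return joined


# Hypothetical SSB1km rows (IDs and counts are illustrative only).
population = {
    22620006651000: {"population": 124},
    22630006651000: {"population": 5},
}
buildings = {
    22620006651000: {"buildings": 61},
    22620006652000: {"buildings": 3},
}

merged = join_by_id(population, buildings)
for grid_id in sorted(merged):
    print(grid_id, merged[grid_id])
```

The same join works in a spreadsheet, SAS or SPSS, which is the reason the ID is treated as the key element of the system.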

12 Organisation – responsibility and contact points
S320 Population statistics: Paul Inge Severeide (Head of division), Gjermund Nygårdseter (GIS contact)
S410 Enterprise register: Jan Ole Furseth (Head of division), Beate Bartsch (GIS contact)
S430 Agricultural statistics: Ole Osvald Moss (Head of division), Anne Snellingen Bye (GIS contact)
S460 Construction and services statistics: Roger Jensen (Head of division), Birgit Bjørnsgard (GIS contact)

Does not distinguish between land use and land cover. Changes are often more important than status figures.

13 Quality checks
Units with coordinates   # records in original file   # grid cells   Dif. (n)   Dif. (%)
Enterprises 2008         35 960                       35 861         99         0.28
Population 2000          55 749                       55 653         97         0.17
Population 2001          55 992                       55 797         196        0.35
Population 2002          55 946                       55 812         134        0.24
Population 2003          56 829                       56 743         87         0.15
Population 2004          55 805                       55 721         85
Population 2005          55 672                       55 588
Population 2006          55 703                       55 616         88         0.16
Population 2007          55 572                       55 489         53         0.10
Population 2008          55 546                       55 467         80         0.14
Buildings 2006                                                       434        0.37
Buildings 2007                                                       435
Buildings 2008                                                       382        0.33
Farms 1999               33 406                       33 309         98         0.29
Farms 2006               26 993                       26 909         84         0.31

14 Confidentiality rules
Enterprises   No lower limit
Population    1 – 9 => 5
Buildings
Farms

Under the Statistical Act § 2-6, Statistics Norway as a main rule does not publish tables containing fewer than 3 units in a group (table cell) in which the sampling method can allow identification of individuals. Grid cells are table cells in the sense used in the Statistical Act.

In a grid map of population using 1x1 square km grid cells, a number of grid cells will have one or two observations. In Norway, selecting this grid cell size will result in about 3 400 cells with 1 person and about 3 800 cells with 2 persons (each 6-7 per cent of the approximately 55 500 inhabited grid cells in Norway). In comparison, there are about 13 400 inhabited basic regional units (grunnkretser). The 7 200 cells with either 1 or 2 persons constitute 12.8 per cent of the inhabited grid cells.

These cells with 1 or 2 observations are unevenly distributed between urban and rural areas. In the sparsely populated northernmost county of Finnmark (see figure), almost every fifth inhabited cell (20%) has fewer than three persons. Only 7% of the cells have fewer than three inhabitants in the more densely populated county of Akershus.

Confidentiality issues also arise when other geographical information is combined with the grid net. For example, combining the grid with digital municipal borders or a digital road network will give information on where the cells are located.

If these grid cells are to be anonymous, several issues need to be considered. Must the sum of values from the grid give the correct population total for the country, or is it sufficient to display all inhabited cells without necessarily giving the exact number for every cell? Is three observations per grid cell an acceptable lowest value in terms of maintaining anonymity, or should the threshold value be increased?
The board of confidentiality at Statistics Norway was asked (Ottestad, 2006) to consider various methods for handling confidentiality and to set criteria for disclosure control. The following methods were discussed:
Suppression method
Heldal's method
Least value method
Larger grid cells
Clustering method
Average method

The board of confidentiality recognized that all of these methods might be used, but that they would produce different results. Methods which do not display all inhabited grid cells were considered to be poor solutions from a user perspective. Changing the grid size and/or shapes would also reduce user friendliness.

Setting limits or cut-off values for individuals presents challenges with regard to households. When variables other than residents are used (for example households), one should consider setting a threshold for the number of inhabited addresses in each grid cell. At present we lack a satisfactory overview of households in Norway, and hence cannot set limits for the number of households per grid cell. An alternative would be to set the threshold number of individuals per cell so high that it would normally include more than one household.

Viewed as an isolated piece of information, presentation of the fact that there is one resident in a square kilometre is not in conflict with confidentiality rules. However, later publication of other socio-economic variables or information about people in grid cells where the number of persons is low would be in conflict with the confidentiality rules.

The board of confidentiality at Statistics Norway concluded that publication of population statistics (number of persons) on a one square kilometre grid should not include exact values for grid cells containing fewer than 10 persons. Statistics Norway will therefore use the following values for grid population statistics: 0, 1-9, 10, 11, 12 and so on. For grid cells with 1-9 persons the value is set to 5.

Figure: Population by 1 square kilometre grid, Finnmark county.
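The published-value rule for population on the 1 km grid (0 stays 0, 1-9 becomes 5, 10 and above is published exactly) can be sketched as follows. The function name `published_population` and the sample counts are hypothetical; this is an illustration of the stated rule, not Statistics Norway's production code.

```python
def published_population(count: int) -> int:
    """Value published for a 1 km grid cell holding `count` residents."""
    if count < 0:
        raise ValueError("population count cannot be negative")
    if 1 <= count <= 9:
        return 5       # every cell with 1-9 persons is published as 5
    return count       # 0 and values of 10 or more are published unchanged


# Made-up cell counts, for illustration only.
true_counts = [0, 1, 2, 9, 10, 57]
published = [published_population(c) for c in true_counts]
print(published)
print(sum(true_counts), sum(published))
```

All inhabited cells remain visible under this rule, but the sum of published values will generally differ from the true total, which is one of the trade-offs raised in the notes above.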

15 Metadata Task: To transfer existing metadata to an xml-document

16 Metadata Lots of standards and tools

17 Metadata editor – INSPIRE online

18 Metadata editor – ESRI ArcGIS

19 Metadata editor – ESRI ArcGIS

20 Metadata editor – OS MapWindow GIS

21 Metadata
Example: UML model for adm. boundaries

22 Are people blue in Java?

23 … and red in Japan?

24 Making guidelines
Use
Presentation
Delimitations
Interpretation
Article in print
Example: Identification of residents within a distance from a hospital.

25 From noise to information

26 Preliminary suggestions
Buildings
Employees
Population
Topography
Bathymetry

27 Possible future deliveries
Households
Dwellings
Income
Migration
Education
Agricultural statistics
Historical statistics

28 Further work and cooperation
Products developed in cooperation with users
Enhanced metadata
Improve quality checks and measures
Make guidelines for use of products
System for exchange of data

Thank you ☺

