WFM 6202: Remote Sensing and GIS in Water Management

Geographic Information System (GIS)
Lecture 6: Geo-spatial Interpolation
Dr. A.K.M. Saiful Islam
Institute of Water and Flood Management (IWFM)
Bangladesh University of Engineering and Technology (BUET)
January, 2016

Principle of Interpolation
Interpolation is the procedure of estimating the value of properties at unsampled points or areas using a limited number of sampled observations.

Interpolating a Surface From Sampled Point Data
Interpolation: the point being estimated lies within the range of the sample data.

Extrapolation: the point being estimated lies outside the range of the sample data.

Sampling Strategies for Interpolation
Regular sampling
Random sampling

Linear Interpolation
If A = 8 feet and B = 4 feet, then the midpoint C = (8 + 4) / 2 = 6 feet.
[Figure: elevation profile through sample elevation points A, C, and B]
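The midpoint calculation is a special case of linear interpolation between two samples. A minimal sketch follows; the function name and the horizontal positions 0, 10, and 5 are hypothetical, since the slide gives only the elevations.

```python
def linear_interpolate(x_a, z_a, x_b, z_b, x_c):
    """Estimate z at x_c on the straight line through (x_a, z_a) and (x_b, z_b)."""
    if x_a == x_b:
        raise ValueError("the two sample points must be at different positions")
    t = (x_c - x_a) / (x_b - x_a)  # fractional position of C between A and B
    return z_a + t * (z_b - z_a)

# Slide example: A = 8 ft, B = 4 ft, C halfway between them.
# The positions 0, 10 and 5 are assumed; only their ratio matters.
elevation_c = linear_interpolate(0.0, 8.0, 10.0, 4.0, 5.0)
print(elevation_c)  # 6.0
```

If C were not at the midpoint, the same function would weight A and B by how close C is to each.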

Non-Linear Interpolation
Often results in a more realistic interpolation, but estimating missing data values is more complex.
[Figure: elevation profile through sample elevation points A, C, and B]

Global Interpolation
Uses all known sample points to estimate a value at an unsampled location.

Local Interpolation
Uses a neighborhood of sample points to estimate a value at an unsampled location, i.e. the closest n points, or all points within a given search radius.

Interpolation examples
Our interpolated surface (represented in 1-D by the blue line) would look like this

Interpolation examples: Sample rate
Here our interpolated surface is much closer to reality at the local level, but we pay for this in the form of higher data gathering cost

Trend Surface

Interpolation examples
Elevation: Source: LUBOS MITAS AND HELENA MITASOVA, University of Illinois ©2005 Austin Troy

Trend Surface
Global method
Inexact
Can be linear or non-linear
Predicts a z elevation value [dependent variable] from x and y location values [independent variables]

1st Order Trend Surface
In one dimension: z varies as a linear function of x
$z = b_0 + b_1 x + e$

1st Order Trend Surface
In two dimensions: z varies as a linear function of x and y
$z = b_0 + b_1 x + b_2 y + e$
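As an illustration of the two-dimensional case, the coefficients b0, b1, and b2 can be estimated by ordinary least squares. The sketch below assumes NumPy is available; the function name and the sample coordinates are hypothetical and chosen so that the points lie exactly on a known plane.

```python
import numpy as np

def fit_first_order_trend(x, y, z):
    """Least-squares estimates of b0, b1, b2 in z = b0 + b1*x + b2*y + e."""
    x, y, z = (np.asarray(v, dtype=float) for v in (x, y, z))
    # Design matrix: a column of ones for b0, then the x and y coordinates
    design = np.column_stack([np.ones_like(x), x, y])
    coeffs, *_ = np.linalg.lstsq(design, z, rcond=None)
    return coeffs

# Hypothetical samples lying exactly on the plane z = 10 + 2x - y
xs = [0, 1, 2, 0, 1, 2]
ys = [0, 0, 0, 1, 1, 1]
zs = [10 + 2 * a - b for a, b in zip(xs, ys)]
b0, b1, b2 = fit_first_order_trend(xs, ys, zs)
print(b0, b1, b2)
```

With real data the residual e is non-zero at the sample points, which is why the trend surface is an inexact interpolator.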

Surface interpolation
Let's say we have our groundwater pollution samples. Interpolating them gives us a surface.

Trend Surface

Example: Approximate Interpolation
Curve Fitting by the Least Squares Method. The least squares method (sometimes called a regression model) is a statistical approach for estimating the most probable value or function from observations that contain random errors. In practice, finding the most probable estimate is replaced by minimizing the sum of the squared residuals. For a straight line $y = a x + b$, the method yields the slope $a$ and the intercept $b$.
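For a straight line, minimizing the sum of squared residuals has a well-known closed-form solution. The sketch below uses that textbook formula; the function name and the four data points are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the line minimizing the sum of squared residuals."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x  # the fitted line passes through the means
    return slope, intercept

# Hypothetical noisy observations scattered around a straight line
obs_x = [1.0, 2.0, 3.0, 4.0]
obs_y = [2.1, 3.9, 6.2, 7.8]
slope, intercept = least_squares_line(obs_x, obs_y)
residuals = [y - (slope * x + intercept) for x, y in zip(obs_x, obs_y)]
```

A useful check on any least squares line fit is that the residuals sum to zero, up to rounding.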

Inverse Distance Weighted (IDW)

Inverse Distance Weighted
Local method
Exact
Can be linear or non-linear
The weight (influence) of a sampled data value is inversely proportional to its distance from the estimated location, raised to a power r

Inverse Distance Weighted (Example)
IDW: closest 3 neighbors, power r = 2

Point   Distance d   Value z   Weight w = 1 / d^2    w * z
A       4            100       1 / 4^2 = 0.0625      6.25
B       3            160       1 / 3^2 = 0.1111      17.78
C       2            200       1 / 2^2 = 0.2500      50.00
Total                          0.4236                74.03

Estimate = 74.03 / 0.4236 = 174.8, or about 175
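The slide's example (neighbors at distances 4, 3, and 2 with values 100, 160, and 200, and power r = 2) can be reproduced with a few lines of code. This is a minimal sketch; the function name is hypothetical, and returning the sample value at zero distance is an assumption made so that the interpolator stays exact.

```python
def idw_estimate(distances, values, power=2):
    """Inverse distance weighted estimate from neighbor distances and values."""
    for d, v in zip(distances, values):
        if d == 0:
            return v  # estimating at a sample location returns the sample itself
    weights = [1.0 / d ** power for d in distances]
    weighted_sum = sum(w * v for w, v in zip(weights, values))
    return weighted_sum / sum(weights)

# Slide example: A, B, C at distances 4, 3, 2 with values 100, 160, 200
estimate = idw_estimate([4, 3, 2], [100, 160, 200], power=2)
print(round(estimate))  # 175
```

Raising the power gives the nearest neighbor (C in this example) more influence, pulling the estimate toward 200.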

Geostatistics

Geostatistics
The original purpose of geostatistics centered on estimating changes in ore grade within a mine. Its principles have since been applied to a variety of areas in geology and other scientific disciplines. A unique aspect of geostatistics is the use of regionalized variables, which fall between random variables and completely deterministic variables.

Geostatistics
Regionalized variables describe phenomena with a geographical distribution (e.g. elevation of the ground surface). Such phenomena exhibit spatial continuity.

Geostatistics
It is not always possible to sample every location. Therefore, unknown values must be estimated from data taken at specific locations that can be sampled. The size, shape, orientation, and spatial arrangement of the sample locations are termed the support, and they influence the capability to predict the unknown values.

Semivariance

Semivariance
Regionalized variable theory uses a related property called the semivariance to express the degree of relationship between points on a surface. The semivariance is simply half the variance of the differences between all possible points spaced a constant distance apart. Semivariance is a measure of the degree of spatial dependence between samples (e.g. of elevation).

Semivariogram
The semivariogram function quantifies the assumption that things nearby tend to be more similar than things that are farther apart. The semivariogram measures the strength of statistical correlation as a function of distance.
Semivariance: $\gamma(h) = \tfrac{1}{2}\,[Z(x_i) - Z(x_j)]^2$, averaged over all pairs of locations $x_i$, $x_j$ separated by distance $h$
Covariance: $C(h) = \text{Sill} - \gamma(h)$

Semivariance
The magnitude of the semivariance between points depends on the distance between them. A smaller distance yields a smaller semivariance, and a larger distance results in a larger semivariance.

Semivariance
$\gamma(h) = \frac{1}{2 N_h} \sum_{i=1}^{N_h} (Z_i - Z_{i+h})^2$
$Z_i$ is the measurement of a regionalized variable taken at location $i$, and $Z_{i+h}$ is another measurement taken $h$ intervals away. $N_h$ is the number of pairs of points separated by the lag $h$; if the points are located along a single profile, $N_h$ = number of points - lag.
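For regularly spaced points along a single profile, the semivariance at a lag of h intervals is commonly estimated as the sum of squared differences between points h intervals apart, divided by twice the number of pairs. The sketch below assumes that standard estimator; the function name and the profile values are hypothetical.

```python
def semivariance(profile, lag):
    """Semivariance of a regularly spaced profile at a lag of `lag` intervals."""
    n_pairs = len(profile) - lag  # N_h = number of points - lag
    if lag < 1 or n_pairs < 1:
        raise ValueError("lag must be at least 1 and smaller than the profile length")
    total = sum((profile[i] - profile[i + lag]) ** 2 for i in range(n_pairs))
    return total / (2 * n_pairs)

# Hypothetical measurements taken at equal intervals along a line
profile = [1.0, 3.0, 2.0, 5.0, 4.0]
print(semivariance(profile, 1))  # 1.875
print(semivariance(profile, 2))  # 1.5
```

Computing this for lags 1, 2, 3, and so on gives the points of the experimental semivariogram.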

Calculating the Semivariance (Irregularly Spaced Points)
Here we explore directional variograms. A directional variogram describes the spatial variation among points separated by a lag h in space. The difference from the omnidirectional variogram is that h is a vector rather than a scalar. For example, if the lag vector is d = {d1, d2}, then each pair of compared samples should be separated by d1 in the E-W direction and by d2 in the S-N direction.

Calculating the Semivariance (Irregularly Spaced Points)
In practice, it is difficult to find enough sample points that are separated by exactly the same lag vector [d]. The set of all possible lag vectors is therefore usually partitioned into classes.
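One way to picture the partitioning into classes is to group point pairs by separation distance. The sketch below is a simplification: it bins by distance only (an omnidirectional variogram), whereas a directional variogram would also bin by direction. The function name, bin width, and sample data are hypothetical.

```python
import math
from collections import defaultdict

def empirical_semivariogram(points, values, lag_width):
    """Group all point pairs into distance classes and average per class.

    Returns {class_index: (class_midpoint, semivariance, pair_count)}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            h = math.dist(points[i], points[j])
            k = int(h // lag_width)  # class k covers [k*width, (k+1)*width)
            sums[k] += (values[i] - values[j]) ** 2
            counts[k] += 1
    return {
        k: ((k + 0.5) * lag_width, sums[k] / (2 * counts[k]), counts[k])
        for k in sorted(counts)
    }

# Hypothetical irregularly spaced samples
sample_points = [(0, 0), (1, 0), (0, 1), (3, 0)]
sample_values = [1.0, 2.0, 4.0, 3.0]
classes = empirical_semivariogram(sample_points, sample_values, lag_width=2.0)
for midpoint, gamma, n_pairs in classes.values():
    print(midpoint, gamma, n_pairs)
```

Wider classes contain more pairs and give steadier estimates, at the cost of resolution along the distance axis.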

Variogram

Variogram The plot of the semivariances as a function of distance from a point is referred to as a semivariogram or variogram.

Variogram The semivariance at a distance d = 0 should be zero, because there are no differences between points that are compared to themselves. However, as points are compared to increasingly distant points, the semivariance increases.

Variogram The range is the greatest distance over which the value at a point on the surface is related to the value at another point. The range defines the maximum neighborhood over which control points should be selected to estimate a grid node.

Variogram (Models)
This is a ‘model’ semi-variogram and is usually called the spherical model.
a is called the range of influence of a sample.
C is called the sill of the semi-variogram.

Variogram (Models)
Exponential Model
[Figure: spherical and exponential models with the same range and sill]
[Figure: spherical and exponential models with the same sill and the same initial slope]
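The model equations did not survive in this transcript, so the sketch below uses common textbook forms of the spherical and exponential models with range a and sill C and no nugget. The exponential form 1 - exp(-h/a) is an assumption; some software uses exp(-3h/a) so that a is the practical range.

```python
import math

def spherical(h, sill, a):
    """Spherical model: rises to the sill and reaches it exactly at h = a."""
    if h >= a:
        return sill
    r = h / a
    return sill * (1.5 * r - 0.5 * r ** 3)

def exponential(h, sill, a):
    """Exponential model (assumed form): approaches the sill asymptotically."""
    return sill * (1.0 - math.exp(-h / a))

# Compare the two models for a hypothetical sill of 10 and range of 10
for h in (0.0, 2.5, 5.0, 10.0, 20.0):
    print(h, spherical(h, 10.0, 10.0), exponential(h, 10.0, 10.0))
```

With the same a and C, the spherical model starts with slope 1.5 C / a and the exponential with slope C / a, which is why the two curves can be matched either on range and sill or on sill and initial slope, but not on all three.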

Kriging Interpolation

Kriging Interpolation
Kriging is named after the South African engineer D. G. Krige, who first developed the method. Kriging uses the semivariogram in calculating estimates of the surface at the grid nodes.

Kriging Interpolation
The procedures involved in kriging incorporate measures of error and uncertainty when determining estimates. In the kriging method, every known data value and every missing data value has an associated variance. If a value ‘C’ is a constant (i.e. it is known exactly), its variance is zero. Based on the semivariogram used, optimal weights are assigned to known values in order to calculate unknown ones. Since the variogram changes with distance, the weights depend on the known sample distribution.

Geo-statistical method- Kriging
Kriging is a geostatistical method for spatial interpolation. It can assess the quality of prediction with estimated prediction errors. It uses statistical models that allow a variety of map outputs including predictions, prediction standard errors, probability, etc.
The semivariogram can be fitted with the following models:
Ordinary Kriging models: Spherical, Circular, Exponential, Gaussian, and Linear
Universal Kriging models: Linear with Linear drift, and Linear with Quadratic drift

Ordinary Kriging

Ordinary Kriging Ordinary kriging is the simplest form of kriging. It uses dimensionless points to estimate other dimensionless points, e.g. elevation contour plots. In Ordinary kriging, the regionalized variable is assumed to be stationary.

Ordinary Kriging
In our case, the value of Z at point p, $Z_e(p)$, is calculated as a weighted average of the known values at the control points:
$Z_e(p) = \sum_i w_i Z(p_i)$
This estimated value will most likely differ from the actual value at point p, $Z_a(p)$, and this difference is called the estimation error:
$\varepsilon_p = Z_e(p) - Z_a(p)$

Ordinary Kriging
If no drift exists and the weights used in the estimation sum to one, then the estimated value is said to be unbiased. The scatter of the estimates about the true value is termed the error or estimation variance.

Ordinary Kriging
Kriging tries to choose the optimal weights that produce the minimum estimation error. Optimal weights, those that produce unbiased estimates and have a minimum estimation variance, are obtained by solving a set of simultaneous equations.

Ordinary Kriging
A fourth variable, the Lagrange multiplier $\lambda$, is introduced alongside the three weights so that the weights are constrained to sum to one.

Punctual (Ordinary) Kriging
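Once the individual weights are known, an estimate can be made by
$Z_e(p) = \sum_i w_i Z(p_i)$
and an estimation variance can be calculated by
$s_e^2 = \sum_i w_i \gamma(h_{ip}) + \lambda$
where $\gamma(h_{ip})$ is the semivariance for the distance between control point i and point p, and $\lambda$ is the Lagrange multiplier.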
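The whole procedure, from the simultaneous equations to the estimate and its variance, can be sketched in a few lines. This is a minimal illustration of the standard ordinary kriging system, assuming NumPy is available; the three control points, their values, and the linear semivariogram gamma(h) = 4h are all hypothetical.

```python
import math
import numpy as np

def ordinary_kriging(points, values, target, variogram):
    """Solve the ordinary kriging system for one target location.

    Returns (estimate, estimation_variance, weights).
    """
    n = len(points)
    # Left-hand matrix: semivariances between control points, bordered by
    # ones (the unbiasedness constraint) with a zero in the corner.
    a = np.ones((n + 1, n + 1))
    a[n, n] = 0.0
    for i in range(n):
        for j in range(n):
            a[i, j] = variogram(math.dist(points[i], points[j]))
    # Right-hand side: semivariances between each control point and the
    # target, followed by 1 (the weights must sum to one).
    b = np.ones(n + 1)
    b[:n] = [variogram(math.dist(p, target)) for p in points]
    solution = np.linalg.solve(a, b)
    weights, lagrange = solution[:n], solution[n]
    estimate = float(weights @ np.asarray(values, dtype=float))
    variance = float(weights @ b[:n] + lagrange)
    return estimate, variance, weights

def linear_variogram(h):
    """Hypothetical linear semivariogram with slope 4 and no nugget."""
    return 4.0 * h

control_points = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
control_values = [100.0, 160.0, 200.0]
estimate, variance, weights = ordinary_kriging(
    control_points, control_values, (1.0, 1.0), linear_variogram
)
print(estimate, variance, weights)
```

With three control points the system has four unknowns, the three weights and the Lagrange multiplier. Kriging at one of the control points returns that point's value with zero variance, which shows that the method is an exact interpolator when there is no nugget.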

Thiessen Polygons
Thiessen polygons can be generated using a distance operator, which creates the polygon boundaries as the intersections of radial expansions from the observation points. This method is also known as Voronoi tessellation.

Thiessen polygons
This method builds polygons, rather than a raster surface, from control points.
It “grows” polygons around sample points; the polygons are supposed to represent areas of homogeneity.
Source: Jens-Ulrich Nomme

Density Functions
We can also use sample points to map out density raster surfaces. This need not require a z value for each point; it can simply be based on the abundance and distribution of points.

Density Functions
These settings would give us a raster density surface, based just on the abundance of points within a “kernel” or data frame. In this case, a z value for each point is not necessary.
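The idea of a density surface that needs no z values can be sketched as a simple point count per unit area around each raster cell center. This is a minimal illustration using a circular neighborhood with uniform weighting, not the algorithm of any particular GIS tool; the function name, point coordinates, and radius are hypothetical.

```python
import math

def point_density(points, cell_centers, radius):
    """Number of points per unit area within `radius` of each cell center."""
    area = math.pi * radius ** 2
    return [
        sum(1 for p in points if math.dist(p, center) <= radius) / area
        for center in cell_centers
    ]

# Hypothetical sample locations; no z value is attached to any of them
sample_locations = [(0, 0), (1, 0), (0, 1), (5, 5)]
centers = [(0, 0), (5, 5), (10, 10)]
densities = point_density(sample_locations, centers, radius=1.5)
print(densities)
```

A true kernel density estimate would additionally weight each point by its distance from the cell center, giving a smoother surface.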

Spline Method
Another option for an interpolation method.
Fits a curve through the sample data and assigns values to other locations based on their position on the curve.
Thin plate splines create a surface that passes through the sample points with the least possible change in slope at all points, that is, a minimum-curvature surface.
SPLINE has two types: regularized and tension.
Tension results in a rougher surface that more closely adheres to abrupt changes in sample points.
Regularized results in a smoother surface that smooths out abruptly changing values somewhat.

Surface Fitting for random Points
A triangular network, called a Triangulated Irregular Network (TIN), is applied.

Compare Interpolation methods
Thiessen polygons are used for service area analysis of public facilities such as hospitals. They were originally proposed in 1911 for estimating areal averages of precipitation.
Inverse Distance Weighted can be a good way to take a first look at an interpolated surface. However, there is no assessment of prediction errors. Accuracy depends on the selection of a power value and the neighborhood search strategy. A smaller number of neighbors (6) can actually produce better estimates than a larger number (12).
Thin-plate Splines (applied to surfaces) are recommended for smooth, continuous surfaces such as elevation and water table. They are also used for interpolating mean rainfall surfaces and land demand surfaces.

Geo-statistical Analyst of ArcGIS
This training will be on:
Histogram
Normal QQ plot
Trend Analysis
Creating a prediction map using the geo-statistical wizard
Semivariogram / covariance modeling
Searching neighbor
Creating a prediction standard error map
Display Formats
Input Data: Groundwater well data of Dinajpur district of Bangladesh

Steps in Geo-statistical Analyst
Representation of the Data
Explore the Data
Fit a model (create surface)
Perform Diagnostics

Representation of Data
Representing the data is a vital first step in assessing the validity of the data and identifying external factors that may ultimately play a role in the distribution of data.

Explore the Data
Examine the distribution of the data, look for data trends, look for global and local outliers, examine spatial autocorrelation, and understand the co-variation among multiple data sets.

Explore data
Histogram
Q-Q plot
Trend Analysis
Semivariogram
Voronoi map
Cross covariance

Histogram
Shows the frequency distribution as a bar graph that displays how often observed values fall within certain intervals or classes.

Normal distribution
Skewness is zero for a normal distribution.
[Figure panels: normally distributed; positively skewed]

Q-Q Plot
A Normal QQ plot is created by plotting the data values against the values of a standard normal distribution at which their cumulative distributions are equal.
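The pairing of data values with standard normal values can be sketched with the Python standard library. The plotting position (i + 0.5) / n used below is one common convention and is an assumption here, as are the function name and the sample values.

```python
from statistics import NormalDist

def normal_qq_points(data):
    """Pair each sorted data value with the matching standard normal quantile."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # (i + 0.5) / n is the assumed cumulative probability of the i-th value
    return [
        (std_normal.inv_cdf((i + 0.5) / n), value)
        for i, value in enumerate(ordered)
    ]

# Hypothetical groundwater levels in metres
levels = [4.2, 3.1, 5.0, 3.8, 4.6]
qq_points = normal_qq_points(levels)
for quantile, value in qq_points:
    print(round(quantile, 3), value)
```

If the data are close to normally distributed, the plotted pairs fall near a straight line; systematic curvature suggests skewness and may call for a data transformation.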

Trend Analysis The Trend Analysis tool provides a three-dimensional perspective of the data. The locations of sample points are plotted on the x,y plane. Above each sample point, the value is given by the height of a stick in the z dimension. The unique feature of the Trend Analysis tool is that the values are then projected onto the x,z plane and the y,z plane as scatter plots. This can be thought of as sideways views through the three-dimensional data. Polynomials are then fit through the scatter plots on the projected planes.

Voronoi map Voronoi maps are constructed from a series of polygons formed around the location of a sample point. Voronoi polygons are created so that every location within a polygon is closer to the sample point in that polygon than any other sample point. After the polygons are created, neighbors of a sample point are defined as any other sample point whose polygon shares a border with the chosen sample point. For example, in the following figure, the bright green sample point is enclosed by a polygon, given as red. Every location within the red polygon is closer to the bright green sample point than any other sample point (given as small dark blue dots). The blue polygons all share a border with the red polygon, so the sample points within the blue polygons are neighbors of the bright green sample point.

Cross covariance
The Crosscovariance cloud shows the empirical crosscovariance for all pairs of locations between two datasets and plots them as a function of the distance between the two locations.

Fit a Model
A wide variety of interpolation methods is available to create a surface. There are two main groups of interpolation techniques:
1. Deterministic
2. Geo-statistical

Interpolation techniques
1. Deterministic: used for creating surfaces from measured points, based either on the extent of similarity (Inverse Distance Weighted (IDW)) or on the degree of smoothing (radial basis functions and polynomials).
2. Geo-statistical: based on statistics and used for more advanced surface modeling that also includes the errors or uncertainty of prediction.

Deterministic Methods
Four types:
Inverse Distance Weighted (IDW)
Global Polynomial
Local Polynomial
Radial Basis Functions
Can be classified into two groups:
Global: uses the entire data set (global polynomial)
Local: calculates predictions from the measured points within specified neighborhoods (IDW, local polynomials, radial basis functions)

Inverse Distance Weighted (IDW)
A window of circular shape with the radius of dmax is drawn at a point to be interpolated, so as to involve six to eight surrounding observed points.

Global polynomial interpolation
Global Polynomial interpolation fits a smooth surface that is defined by a mathematical function (a polynomial) to the input sample points. The Global Polynomial surface changes gradually and captures coarse-scale pattern in the data. Conceptually, Global Polynomial interpolation is like taking a piece of paper and fitting it between the raised points (raised to the height of value). This is demonstrated in the diagram below for a set of sample points of elevation taken on a gently sloping hill (the piece of paper is magenta).

Local Polynomial interpolation
While Global Polynomial interpolation fits a polynomial to the entire surface, Local Polynomial interpolation fits many polynomials, each within specified overlapping neighborhoods. The search neighborhood can be defined using the search neighborhood dialog.

Radial Basis Functions (RBF)
RBF methods are a series of exact interpolation techniques; that is, the surface must go through each measured sample value. There are five different basis functions: thin-plate spline, spline with tension, completely regularized spline, multi-quadric function, and inverse multi-quadric function. RBF methods are a form of artificial neural networks.

Geo-statistical Methods
Kriging and Co-kriging
Algorithm
Ordinary: A variety of kriging which assumes that local means are not necessarily closely related to the population mean, and which therefore uses only the samples in the local neighbourhood for the estimate. Ordinary kriging is the most commonly used method for environmental situations.
Simple: A variety of kriging which assumes that local means are relatively constant and equal to the population mean, which is well known. The population mean is used as a factor in each local estimate, along with the samples in the local neighborhood. This is not usually the most appropriate method for environmental situations.
Universal
Indicator
Probability
Disjunctive
Output Surfaces
Prediction and prediction standard error
Quantile
Probability and standard errors of indicators

Kriging Kriging is a geostatistical method for spatial interpolation.
It can assess the quality of prediction with estimated prediction errors. It uses statistical models that allow a variety of map outputs including predictions, prediction standard errors, probability, etc.

Interpolation using Kriging
Kriging weights

Semivariogram
The semivariogram function quantifies the assumption that things nearby tend to be more similar than things that are farther apart. The semivariogram measures the strength of statistical correlation as a function of distance.
Semivariance: $\gamma(h) = \tfrac{1}{2}\,[Z(x_i) - Z(x_j)]^2$, averaged over all pairs of locations $x_i$, $x_j$ separated by distance $h$
Covariance: $C(h) = \text{Sill} - \gamma(h)$

Types of semivariogram models
Geostatistical Analyst provides the following functions to choose from to model the empirical semivariogram: Circular, Spherical, Tetraspherical, Pentaspherical, Exponential, Gaussian, Rational Quadratic, Hole Effect, K-Bessel, J-Bessel, and Stable.

Semi-variogram Models

Trend An example of a global trend can be seen in the effects of the prevailing winds on a smoke stack at a factory (below). In the image, the higher concentrations of pollution are depicted in the warm colors (reds and yellows) and the lower concentrations in the cool colors (greens and blues). Notice that the values of the pollutant change more slowly in the east–west direction than in the north–south direction. This is because east–west is aligned with the wind while north–south is perpendicular to the wind.

Detrending tool

Anisotropy Anisotropy is a characteristic of a random process that shows higher autocorrelation in one direction than another. The following image shows conceptually how the process might look. Once again, the higher concentrations of pollution are depicted in the warm colors (reds and yellows) and the lower concentrations in the cool colors (greens and blues). The random process shows undulations that are shorter in one direction than another.

Example: Anisotropy and trend
Point                          X     A     B     C     D
Grade                          4.1   10    2     1     8
Real distance from X           -     178   222   397   548
Anisotropic distance from X    -     249   191   502   591

Accounting for Anisotropy

Searching Neighbor The points highlighted in the data view give an indicator of the weights (absolute value in percent) associated with each point located in the moving window. The weights are used to estimate the value at the unknown location which is at the center of the cross hair.

Data transformation

Declustering method There are two ways to decluster your data: by the cell method and by Voronoi polygons. Samples should be taken so they are representative of the entire surface. However, many times the samples are taken where the concentration is most severe, thus skewing the view of the surface. Declustering accounts for skewed representation of the samples by weighting them appropriately so that a more accurate surface can be created.

Bi-variate normal distribution

Output Surface

Cross Validation Cross-validation uses all of the data to estimate the trend and autocorrelation models. It removes each data location, one at a time, and predicts the associated data value.
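The remove-one-and-predict loop can be sketched as follows. As a stand-in for the fitted kriging model, this illustration uses a simple inverse distance weighted predictor with power 2; that substitution, the function names, and the sample data are all assumptions made to keep the example short.

```python
import math

def idw_predict(points, values, target, power=2):
    """Stand-in predictor: inverse distance weighted estimate at `target`."""
    numerator = denominator = 0.0
    for p, v in zip(points, values):
        d = math.dist(p, target)
        if d == 0:
            return v
        w = 1.0 / d ** power
        numerator += w * v
        denominator += w
    return numerator / denominator

def leave_one_out_errors(points, values, power=2):
    """Remove each location in turn, predict it from the rest, record the error."""
    errors = []
    for i in range(len(points)):
        other_points = points[:i] + points[i + 1:]
        other_values = values[:i] + values[i + 1:]
        predicted = idw_predict(other_points, other_values, points[i], power)
        errors.append(predicted - values[i])
    return errors

def rmse(errors):
    """Root-mean-square of the cross-validation errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Hypothetical well locations and groundwater levels
wells = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0), (1.0, 1.0)]
levels = [4.0, 5.0, 6.0, 7.0, 5.5]
cv_errors = leave_one_out_errors(wells, levels)
print(rmse(cv_errors))
```

Summary statistics of these errors, such as the mean error and the root-mean-square error, are what make it possible to compare one model against another.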

Various Surface produced using ordinary kriging

Model comparison Comparison helps you determine how good the model that created a geostatistical layer is relative to another model.

Display Format
Filled contour
Contours
Grids
Combination of contours, filled contour and hill shade
Hill shade

Exercise on Geo-statistical Analyst
Data from 21 groundwater observation wells as shape file “gwowell_bwdb.shp”
Weekly data from December to May for 1994 to 2003
Upazilla shape file “upazila.shp”
Tasks:
Represent data
Explore data
Fit model
Diagnostic output
Create output maps
