Spatial Interpolation

Presentation transcript:

Spatial Interpolation: Inverse Distance Weighting, The Variogram, Kriging. Many thanks to Bill Harper for his insights in Practical Geostatistics 2000 and in personal conversation.

Objectives: In this session we will evaluate a dataset and attempt to: explore the theory and implementation of inverse distance weighting (IDW); evaluate issues with IDW interpolation; explore the theory and implementation of the semi-variogram and its applicability to interpolation; explore the theory and implementation of kriging and its applicability to interpolation.

Data Set: Simulated borehole data (PG 2000), iron concentration. We need to interpolate iron content for unsampled areas. General statistics: 47 samples; mean value: 36.3; S.D.: 3.73.

General Statistics: The histogram shows the relative distribution of the data, which generally follows a normal distribution. Other observations: a minor skew, not a concern.

Data Set: The best unbiased estimate for the standard deviation is 3.726 (see right). Therefore, we are 90% confident that a point drawn at random would be: 30 < T < 42.6. This is based on consulting a Student's t distribution with 47 samples.
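The interval on this slide can be reproduced with a few lines of arithmetic. The sketch below is only illustrative: the function name `t_interval` is hypothetical, and it assumes the critical value is the two-sided 90% point of Student's t with 46 degrees of freedom (n - 1), taken from a standard table as roughly 1.679.

```python
def t_interval(mean, sd, t_crit):
    """Return (low, high) = mean -/+ t_crit * sd for a single point drawn at random."""
    half_width = t_crit * sd
    return mean - half_width, mean + half_width

# Mean and standard deviation are the values quoted on the slides.
# t_crit = 1.679 is an assumption: two-sided 90%, 46 degrees of freedom.
low, high = t_interval(mean=36.3, sd=3.726, t_crit=1.679)
print(f"{low:.1f} < T < {high:.1f}")  # 30.0 < T < 42.6
```

The result agrees with the slide's interval to the precision shown there.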

Subset of Area (northwest area): A subset of the borehole data from the upper left side. General statistics: 7 samples; mean value: 40; S.D.: 2.82. Getting somewhat better.

The best unbiased estimate for the standard deviation is 3.05 (see right). Therefore, we are 90% confident that a point drawn at random would be: 34.2 < T < 45.7. This is based on consulting a Student's t distribution with 7 samples. Now, the question is: do some of the points exhibit more influence than others? Probably, so let's evaluate the point taking nearness into account.

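Inverse Distance Weighting: IDW estimates an unknown value as a weighted average of the known values, with weights based on the distances from the unknown location to the known points. The weights are scaled to sum to one, which keeps the estimate unbiased. Weights may be defined in a number of different ways.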

IDW: ArcGIS provides a nice interface to view points. This example looks at 7 neighbors. Now, let's look at it the “old fashioned way…”

IDW: Using 7 neighboring points allows us to interpolate a value based on distances. The interpolated value is 39.9. So, our calculation is the same as that in ArcGIS; it's just math.
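The hand calculation can be sketched in code. This is a minimal illustration, not the ArcGIS implementation: the function name `idw`, the default power of 2, and the sample coordinates are all assumptions made for the example, and the points are not the borehole data from the slides.

```python
import math

def idw(target, samples, power=2.0):
    """Inverse distance weighted estimate at `target`.

    target  -- (x, y) of the unsampled location
    samples -- list of (x, y, value) for the known points
    power   -- exponent applied to the distance (assumed default of 2)
    """
    raw = []
    for x, y, value in samples:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return value  # target coincides with a sample: return its value
        raw.append((1.0 / d ** power, value))
    total = sum(w for w, _ in raw)
    # Dividing by the total makes the weights sum to one.
    return sum(w * v for w, v in raw) / total

# Hypothetical points (x, y, %Fe), for illustration only.
samples = [(0.0, 0.0, 38.0), (100.0, 0.0, 42.0), (0.0, 200.0, 36.0)]
estimate = idw((50.0, 0.0), samples)
print(round(estimate, 2))
```

A larger `power` concentrates the weight on the nearest samples, while a smaller one moves the estimate toward the plain average of the neighbors.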

IDW: Standard Error. We will compute it without considering the autocorrelation in the data. Standard error: 2.75. Therefore, we are 90% confident that a point drawn at random would be: 34.7 < T < 45.1. This is based on consulting a Student's t distribution with 7 samples. Caveat: we are treating IDW like a weighted mean, and the standard deviation like a weighted standard deviation. In reality, you shouldn't develop confidence intervals this way for data that are autocorrelated.

IDW Methods: So which is best? Power = 2, search = 230.

10 Questions to Evaluate¹
1. What function of distance should we use?
2. How do we handle different continuity in different directions?
3. How many samples should we include in the estimation?
4. How do we compensate for irregularly spaced or highly clustered sampling?
5. How far should we go to include samples in our estimation process?
6. Should we honor the sample values?
7. How reliable is the estimate when we have it?
8. Why is our map too smooth?
9. What happens if our sample data is not Normal?
10. What happens if there is a strong trend in the values?
¹ Clark and Harper, Practical Geostatistics 2000. Ecosse North America, LLC.

Answering the 10 Questions: The Variogram

What is a Semi-Variogram? The semi-variogram is a function that relates the semi-variance (or dissimilarity) of data points to the distance that separates them. If we can understand the difference between an unknown quantity and a known quantity, we can estimate the unknown point.

Estimating via semi-variogram: Let's assume the relationship between the unknown and known point depends on distance: 121 feet NE/SW. If these two points have the same relationship as the other points, we can look at the other pairs of points that are 121 feet apart NE/SW.

Computing the standard differences: For all 31 pairs we can compute the standard deviation of the differences. We are assuming a mean of 0 and a normal distribution.

Computing the standard differences: The single point we are looking at is 37% Fe. If our original samples come from a normal distribution, the differences will be normal, so we can be 90% confident that a point drawn at random would be:

Taking the semi-variogram further: Chances are, we won't get to sample our data on a regular grid. We have to algebraically define some function that relates distance to the differences in value. Therefore, we will assign h to the distance.

Variograms: The variogram is defined as $\gamma(h) = \tfrac{1}{2}\,\mathrm{Var}[Z(x) - Z(x+h)] = \tfrac{1}{2}\,E[\{Z(x) - Z(x+h)\}^2]$. In practice: $\gamma(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} [Z(x_i) - Z(x_i+h)]^2$, where $N(h)$ is the total number of pairs of observations separated by a distance $h$. The fitted curve minimizes the variance of the errors.
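The "in practice" estimator can be sketched as follows for irregularly spaced data. This is a minimal illustration: the function name, the binning rule (each pair goes to the nearest multiple of `lag` and is kept only if it lies within `tolerance` of it), and the three-point data set are assumptions made for the example.

```python
import math

def empirical_semivariogram(samples, lag, tolerance, n_lags):
    """Return a list of (lag_distance, gamma, n_pairs), one entry per non-empty bin.

    samples -- list of (x, y, value)
    Each pair is assigned to the nearest multiple of `lag` and kept only
    if its separation is within `tolerance` of that multiple.
    """
    sums = [0.0] * (n_lags + 1)
    counts = [0] * (n_lags + 1)
    for i in range(len(samples)):
        xi, yi, zi = samples[i]
        for j in range(i + 1, len(samples)):
            xj, yj, zj = samples[j]
            d = math.hypot(xi - xj, yi - yj)
            k = round(d / lag)
            if 1 <= k <= n_lags and abs(d - k * lag) <= tolerance:
                sums[k] += (zi - zj) ** 2
                counts[k] += 1
    result = []
    for k in range(1, n_lags + 1):
        if counts[k]:
            # gamma(h) = sum of squared differences / (2 * N(h))
            result.append((k * lag, sums[k] / (2 * counts[k]), counts[k]))
    return result

# Hypothetical transect of three points, for illustration only.
points = [(0.0, 0.0, 1.0), (1.0, 0.0, 3.0), (2.0, 0.0, 6.0)]
print(empirical_semivariogram(points, lag=1.0, tolerance=0.5, n_lags=2))
# [(1.0, 3.25, 2), (2.0, 12.5, 1)]
```

The pair count is returned alongside each value because bins with few pairs give unreliable estimates.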

Variogram components: Nugget variance: a non-zero value for γ(h) when h = 0, produced by various sources of unexplained error (e.g. measurement error). Sill: for large values of h the variogram levels out, indicating that there is no longer any correlation between data points. The sill should be equal to the variance of the data set. Range: the value of h at which the sill is reached (or 95% of the value of the sill). In general, 30 or more pairs per point are needed to generate a reasonable sample variogram. The most important part of a variogram is its shape near the origin, as the closest points are given more weight in the interpolation process.

Variogram models: Variogram models must be “positive definite” so that the covariance matrix based on them can be inverted (which occurs in the kriging process). Because of this, only certain models can be used.

Semi-variogram models: We can enter some numbers in Mathcad and see how the variogram changes.
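The same experiment can be tried outside Mathcad. The sketch below implements two commonly used admissible models, spherical and exponential, in terms of a nugget, a partial sill (sill minus nugget) and a range `a`. The parameter values are hypothetical, and the factor of 3 in the exponential model is one common convention under which the model reaches about 95% of its sill at `a`.

```python
import math

def spherical(h, nugget, partial_sill, a):
    """Spherical model: rises from the nugget and reaches the sill exactly at h = a."""
    if h == 0:
        return 0.0
    if h >= a:
        return nugget + partial_sill
    r = h / a
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)

def exponential(h, nugget, partial_sill, a):
    """Exponential model: approaches the sill asymptotically (about 95% at h = a)."""
    if h == 0:
        return 0.0
    return nugget + partial_sill * (1.0 - math.exp(-3.0 * h / a))

# Hypothetical parameters, chosen only to show how the curves behave.
for h in (0, 25, 50, 100, 200):
    print(h, round(spherical(h, 1.0, 9.0, 100.0), 3),
          round(exponential(h, 1.0, 9.0, 100.0), 3))
```

Changing the nugget shifts the whole curve up near the origin, while changing the range stretches it horizontally.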

Effect of lag size on variograms: [Figures: a variogram with a lag size of 5 m and a lag tolerance of 2.5 m; a variogram with a lag size of 10 m and a lag tolerance of 5 m.]

Anisotropy: There may be higher spatial autocorrelation in one direction than in others, which is called anisotropy. The figure shows a case of geometric anisotropy, which is incorporated in the variogram model by means of a linear transformation.
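One way to picture that linear transformation is as a rotation followed by a stretch of the coordinate differences, after which an ordinary isotropic variogram model can be applied. The sketch below assumes a particular convention (angle measured counterclockwise from the x axis, `ratio` equal to the long range divided by the short range); the function name and the numbers are hypothetical.

```python
import math

def anisotropic_distance(p, q, angle_deg, ratio):
    """Separation between p and q after correcting for geometric anisotropy.

    angle_deg -- direction of greatest continuity, counterclockwise from
                 the x axis (assumed convention)
    ratio     -- range along that direction divided by the range across it
    """
    dx, dy = q[0] - p[0], q[1] - p[1]
    a = math.radians(angle_deg)
    # Rotate so that the direction of greatest continuity lies along the first axis.
    along = dx * math.cos(a) + dy * math.sin(a)
    across = -dx * math.sin(a) + dy * math.cos(a)
    # Stretch the across-direction so one isotropic model fits both directions.
    return math.hypot(along, across * ratio)

print(anisotropic_distance((0, 0), (10, 0), angle_deg=0, ratio=2))  # 10.0
print(anisotropic_distance((0, 0), (0, 10), angle_deg=0, ratio=2))  # 20.0
```

With these settings, two points 10 units apart across the direction of continuity are treated as if they were 20 units apart, so they appear less correlated.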

Semi-variogram tips: We are assuming a normal distribution. The semi-variogram gives us a picture of the relationship of data values with distance. If you don't have a good spatial structure in the semi-variogram, don't revert to IDW: that makes no sense, because IDW also assumes that values are related by distance.

Comparing Software for Computing the Semi-Variogram: Practical Geostatistics 2000 vs. ArcGIS Geostatistical Analyst

Assessing Fit of the Variogram: Cressie goodness of fit. For each point used to create the variogram, measure how well the model actually fits it.
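A point-by-point measure of this kind can be sketched as a weighted sum of squared relative errors, where each experimental point is weighted by its number of pairs. This follows the weighted least squares criterion commonly attributed to Cressie. Whether it matches the exact statistic used in the software shown on the slide is an assumption, and the function name and data are hypothetical.

```python
def cressie_fit(points, model):
    """Sum over experimental points of n_pairs * ((gamma_exp - gamma_model) / gamma_model)^2.

    points -- list of (h, gamma_exp, n_pairs) from the experimental variogram
    model  -- callable returning the model semi-variance at distance h
    Smaller values indicate a closer fit.
    """
    total = 0.0
    for h, gamma_exp, n_pairs in points:
        g = model(h)
        total += n_pairs * ((gamma_exp - g) / g) ** 2
    return total

# Hypothetical experimental points and a hypothetical linear model.
experimental = [(10.0, 2.2, 30), (20.0, 3.9, 45), (30.0, 6.3, 40)]
linear_model = lambda h: 0.2 * h
print(round(cressie_fit(experimental, linear_model), 4))
```

Because the errors are relative and weighted by pair counts, points near the origin with many pairs influence the score most.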

Kriging: Kriging is based on the idea that you can make inferences regarding a random function Z(x), given data points Z(x1), Z(x2), …, Z(xn). It has 3 components: a structural component (constant mean), a random spatially correlated component, and residual error: Z(x) = m(x) + γ(h) + ε″.

Kriging: This is our variogram from the borehole data. To discuss the mathematics of kriging, we will look at a simple example of 3 points, and get back to our data in a moment.

Kriging: Numerical Example of Iron Ore Data From Practical Geostatistics 2000

Data Set: Iron ore data, based on the sample set from PG 2000. A three-point example for simplicity.

Calculating Distances: The first thing we do is determine the distances between each pair of points. We also calculate the difference in Z values between all points.

Semi-Variogram: We apply the GLM, based on other tests performed on the data. The values chosen give the best Cressie statistics for fit on all data points. Note: Mathcad is not great at creating semi-variograms!

Computing Weights: Using basic matrix algebra, we can solve for the weights. The weights will add to one, due to our eventual “sleight of hand” with the last row.

Solving the Unknown: Basic matrix algebra will solve for the unknown value. We also compute the standard error and variance.
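The matrix algebra can be sketched as an ordinary kriging system in which the semi-variances between samples form the matrix, and an extra row and column of ones force the weights to sum to one. This is a minimal illustration under stated assumptions: the function names, the small Gaussian elimination solver, the linear variogram and the sample points are all hypothetical, and this is not the Mathcad worksheet from the slides.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ordinary_kriging(target, samples, gamma):
    """Return (estimate, standard_error, weights) at `target`.

    samples -- list of (x, y, value); gamma -- variogram model, callable on distance
    """
    n = len(samples)
    dist = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
    # Semi-variances between samples, bordered by ones (the unbiasedness constraint).
    a = [[gamma(dist(samples[i], samples[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    # Semi-variances between each sample and the target; the final 1 closes the constraint.
    b = [gamma(dist(s, target)) for s in samples] + [1.0]
    sol = solve(a, b)
    weights, lagrange = sol[:n], sol[n]
    estimate = sum(w * s[2] for w, s in zip(weights, samples))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + lagrange
    return estimate, math.sqrt(max(variance, 0.0)), weights

# Hypothetical three-point example with a hypothetical linear variogram.
pts = [(0.0, 0.0, 38.0), (100.0, 0.0, 42.0), (0.0, 200.0, 36.0)]
est, se, w = ordinary_kriging((50.0, 50.0), pts, gamma=lambda h: 0.05 * h)
print(round(est, 2), round(se, 2), [round(v, 3) for v in w], round(sum(w), 6))
```

Unlike IDW, the weights depend on the distances among the samples as well as on their distances to the target, and the same solution yields a standard error for the estimate.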

Solving Our Borehole Data: Start with our original example. Since we have 7 points rather than 3, the screens will be “busier”.

Borehole Data: The ability to create semi-variograms in Mathcad is pretty bad, but this allows us to visualize the mathematics. Here we are using the spherical model.

Borehole Data: Again, we can see that with this dataset the weights also add up to one.

Solution: Here we've computed the value of the unknown point and the standard error. This was based on the limited set of 7 points; now we'll do it with the rest.

Predicting the Point: ArcGIS has a good interface for evaluating the weights of the points, in addition to predicting a test location.

Kriging Results: ESRI Geostatistical Analyst: interpolated value 41.26, standard error 2.16. PG 2000: interpolated value 41.14, standard error 2.11.

Standard Errors: Based on the kriging results, we can state the “true” value of the unknown point, with 90% confidence, as: 37.6 < T < 44.68 %Fe, centered on the estimate of 41.14. So, we are getting better results, better looking maps, and smaller confidence intervals.
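The arithmetic behind this interval can be checked by hand. The worked step below uses the PG 2000 estimate and standard error reported in the kriging results, and assumes a critical value of about 1.679 (Student's t, 46 degrees of freedom, two-sided 90%), which is the value that reproduces the slide's bounds.

```latex
\[
41.14 \pm 1.679 \times 2.11 \;=\; 41.14 \pm 3.54
\quad\Longrightarrow\quad
37.60 < T < 44.68 \ \%\mathrm{Fe}
\]
```

The half-width of about 3.5 %Fe is noticeably smaller than the roughly 6.3 %Fe obtained from the whole-data-set interval (30 < T < 42.6).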

IDW vs. Kriging: Kriging appears to give a more “natural” look to the data. Kriging avoids the “bull's-eye” effect. Kriging also gives us a standard error.

Results

Review of 10 Questions to Ask¹
1. What function of distance should we use? The variogram shows us the spatial structure and association of the data, and gives us a hint as to which function to use.
2. How do we handle different continuity in different directions? Here again, the variogram tells us whether there is any spatial association, and we can determine the direction by evaluating whether anisotropy exists.
3. How many samples should we include in the estimation? Again, we can look at the variogram.
4. How do we compensate for irregularly spaced or highly clustered sampling? The variogram defines the relationship between points and their distances from other points. Calculating weights in kriging takes the distances among all points into account.
¹ Clark and Harper, Practical Geostatistics 2000. Ecosse North America, LLC.

10 Questions to Ask¹
5. How far should we go to include samples in our estimation process? By looking at the variogram we can identify the sill (that area where the spatial correlation has little value). The range tells us the distance at which the points are no longer correlated.
6. Should we honor the sample values? Still lots of debate on this one. IDW says yes, and that's why we get the bull's-eye. The nugget effect in kriging allows us to say no. But we can set the nugget to zero with kriging.
7. How reliable is the estimate when we have it? Kriging allows us to compute the standard error.
8. Why is our IDW map too smooth? In IDW, when you include points far away they become part of the weights. Since the weights have to add up to one, you are basically taking power away from the closer ones.
¹ Clark and Harper, Practical Geostatistics 2000. Ecosse North America, LLC.

10 Questions to Ask
9. What happens if our sample data is not Normal? Basically, transform the data to make it normal.
10. What happens if there is a strong trend in the values? First remove the trend, then re-interpolate the points (see the ESRI California ozone example, or the Clark and Harper Wolfcamp data).

Conclusions: It is possible to interpolate an unknown point based on other points in a data set. While it can be done with descriptive statistics, other methods are clearly better. The variogram helps answer many questions related to our data, and provides a wealth of information about the spatial structure of the data. More robust (geostatistical) methods for interpolation appear to provide better results.