Modeling Spatial Correlation (The Semivariogram) ©2007 Dr. B. C. Paul.



Fitting Spatially Correlated Data
- This is the case where grabbing samples at random locations does not produce a random result.
  - Samples closer together in location are likely to be similar.
- Real engineering-world situations
  - Location is location.
  - You're taking samples of a site; those close together are likely to be similar.
- Common with environmental clean-ups or ore reserves.
- We encountered this situation when we were looking at the decreasing variance of the mean.
  - I showed you a plot of half squared differences that a computer referenced to see how much variance would average out when one went to larger samples.

The Semivariogram
[Plot: half squared differences (called semivariance) on the vertical axis, distance on the horizontal axis. Data usually follows a line like the one drawn.]
Semivariograms are another type of model where it often pays to use your judgment and fit the model yourself rather than just telling the computer to do least squares.

How Do You Do A Semivariogram (With a Computer, Unless You Have a Death Wish)
Suppose I have this grid of samples spaced on 50 foot centers. I tell my computer to look at all possible pairs of points that are 50 feet apart. (There are a whole bunch of them.) For each pair the computer subtracts one value from the other to get a difference. It squares the difference and then adds up the results for every pair.

Continuing My Computational Love Fest
- The computer eventually totals up all the squared differences for every sample pair in the whole grid that is 50 feet apart.
- The computer then looks at its tally of how many pairs it found and divides the total by the number of pairs to get an average squared difference for pairs 50 feet apart.
- The computer then divides that value by two.
  - It's a calculus and derivation thing where an extra two shows up, and it's convenient to just include it in the definition of semivariance.
- I have my computer do this for every other distance and compute a semivariance for that distance.
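The tally described above can be sketched in a few lines. This is only an illustration under stated assumptions: the samples sit on a regular square grid held in a 2-D array, pairs are taken along rows and columns only, and the function name `semivariance_at_lag` and the sample values are made up for the example.

```python
import numpy as np

def semivariance_at_lag(grid, lag_cells):
    """Semivariance for pairs lag_cells grid steps apart (lag_cells >= 1).

    Only pairs along rows and columns are counted (an assumption of this sketch).
    """
    grid = np.asarray(grid, dtype=float)
    # Difference between every pair of samples lag_cells apart, in each grid direction.
    diffs_x = grid[:, lag_cells:] - grid[:, :-lag_cells]
    diffs_y = grid[lag_cells:, :] - grid[:-lag_cells, :]
    squared = np.concatenate([diffs_x.ravel(), diffs_y.ravel()]) ** 2
    n_pairs = squared.size
    # Total the squared differences, divide by the pair count, then divide by two.
    gamma = squared.sum() / n_pairs / 2.0
    return gamma, n_pairs

# Hypothetical 3 x 3 grid on 50 foot centers, so one grid step is 50 feet.
samples = [[25, 31, 38],
           [20, 27, 33],
           [22, 26, 30]]
gamma_50, pairs_50 = semivariance_at_lag(samples, 1)
print(gamma_50, pairs_50)
```

Calling the function again with `lag_cells=2` gives the value for pairs 100 feet apart, and so on for each distance to be plotted.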

The Result
[Plot: semivariance vs. distance]
We now have a sort of histogram of the spatial correlation of samples.

Model Fitting Time
[Plot: semivariance vs. distance]
We will try to fit a mathematical model to our pattern of spatial correlation. (Not just any line will do; it has to meet specific mathematical conditions you don't want to understand.)

We Will Look at Fitting a Spherical Model (Because It Works About 95% of the Time)
[Plot: semivariance vs. distance]
Plot a line for the overall variance of samples. (As samples become very far apart they have no spatial relationship and tend to have the same variance as just the background of the sample set.)

A Range of Influence
[Plot: semivariance vs. distance]
Look for a linear trend in the data at first. (You will probably see your semivariance rising to meet your variance of samples.)

Working on the Range
[Plot: semivariance vs. distance]
The linear trend will intersect the variance of samples at about 2/3rds of the range of influence. (The actual model curves up to meet the sample variance at the full range.)

What Does Range of Influence Mean?
- Samples located within the range of influence of each other are spatially correlated. When you draw one sample, the value of the other sample a distance away is not a matter of random chance.
  - When you have a spatially correlated sample set, you can use that information to make better than luck-of-the-draw guesses about the values at points that were never sampled.
- (You can see how I could mine an ore deposit or clean up an environmental mess better if I knew that kind of stuff.)

The Cill and Nugget
[Plot: semivariance vs. distance]
The marked value is the Cill. It represents the amount of the variation in the deposit that shows spatial correlation over a range of influence.
Your linear trend normally does not intersect 0 at zero distance. Most real deposits have a random element, called the nugget. (It first got its name from whether your gold sampling happened to hit a gold nugget or not.)

The Model
- Semivariance is represented by gamma.
- d represents distance.
- N is the nugget value fit to the graph.
- C and N represent Cill and nugget. (Cill got misspelled because the mathematicians were French.)
- R is the range of influence.
- The model has three parts: one applies if d = 0, one gives the semivariance for d > 0 and d < R, and one gives the semivariance for d > R.
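The equations themselves did not survive in this transcript, so what follows is a sketch of the standard spherical form rather than a copy of the slide. It is consistent with the worked numbers later in the deck (a Cill of 80, a nugget of 20, a range of 500 and d = 75 give 37.87). The value at d = 0 is an assumption: the usual convention is gamma(0) = 0, with the nugget appearing as soon as d is greater than zero. The function name `spherical_gamma` is made up for the example.

```python
def spherical_gamma(d, nugget, cill, range_of_influence):
    """Three part spherical semivariogram model (standard form, assumed here)."""
    if d < 0:
        raise ValueError("distance must be non-negative")
    if d == 0:
        # Assumption: standard convention, semivariance is zero at zero distance.
        return 0.0
    if d < range_of_influence:
        ratio = d / range_of_influence
        # Second part: rises from the nugget toward nugget + Cill.
        return nugget + cill * (1.5 * ratio - 0.5 * ratio ** 3)
    # Third part: beyond the range the model levels out at nugget + Cill.
    return nugget + cill

print(round(spherical_gamma(75, nugget=20, cill=80, range_of_influence=500), 2))
```

At d equal to the range, the second part also evaluates to nugget + Cill, so the curve joins the flat third part without a jump.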

The Model
- The spherical model is called a three part model. (Any guesses about why?)
- Our model represents the average similarity of sample values located a distance d apart.
  - We'll look at how we can use that later on.
- One assumption we make is "stationarity".
  - This means that our model of spatial correlation continues to fit over the entire study area.
- You may need different models for different types of mineralized rock or contaminated soils.

Some Exceptions
- Sometimes the model varies depending on which direction you are moving.
  - In that case you have to have your computer look by direction as well as distance for pairs of samples in the set.
  - You will fit the model differently in different directions.
- This is called anisotropy.
  - A more detailed study of geostatistics is needed to get a good explanation of anisotropic models and how to fit them.

The Not Really a Grid Factor
Sometimes samples are not on a regular square grid. In that case you have the computer start with each sample individually and look for possible pairs in a certain direction within a cone of tolerance. The cone is in turn broken up into steps of distance. Any sample pairs located in the interval are treated as if they were at a grid point (similar to the arbitrary limits we use for cells in histograms).
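A rough sketch of that pairing scheme is given below. Several details are assumptions made for illustration, not taken from the slides: the direction is an angle measured counterclockwise from the X axis, the cone is a fixed angular tolerance on either side of it, and each pair is assigned to the nearest multiple of the distance step. The function name `binned_semivariogram` and the sample points are hypothetical.

```python
import math
from collections import defaultdict

def binned_semivariogram(points, lag_width, azimuth_deg=None, tolerance_deg=22.5):
    """points: list of (x, y, value).

    Returns {lag_index: (mean_distance, semivariance, n_pairs)}.
    If azimuth_deg is None, pairs in every direction are used.
    """
    totals = defaultdict(lambda: [0.0, 0.0, 0])  # distance sum, squared diff sum, pairs
    for i in range(len(points)):
        x1, y1, v1 = points[i]
        for j in range(i + 1, len(points)):
            x2, y2, v2 = points[j]
            dx, dy = x2 - x1, y2 - y1
            dist = math.hypot(dx, dy)
            if dist == 0:
                continue
            if azimuth_deg is not None:
                # Fold directions onto 0-180 degrees: a pair has no forward or backward sense.
                angle = math.degrees(math.atan2(dy, dx)) % 180.0
                offset = abs(angle - (azimuth_deg % 180.0))
                offset = min(offset, 180.0 - offset)
                if offset > tolerance_deg:
                    continue  # outside the cone of tolerance
            # Treat the pair as if it sat at the nearest multiple of the lag width.
            lag_index = math.floor(dist / lag_width + 0.5)
            entry = totals[lag_index]
            entry[0] += dist
            entry[1] += (v1 - v2) ** 2
            entry[2] += 1
    return {k: (s_d / n, s_sq / n / 2.0, n) for k, (s_d, s_sq, n) in totals.items()}

# Hypothetical, slightly irregular samples.
pts = [(0, 0, 25), (48, 0, 31), (101, 0, 38), (0, 50, 20)]
result = binned_semivariogram(pts, lag_width=50, azimuth_deg=0)
print(result)
```

Widening `tolerance_deg` or `lag_width` collects more pairs per plotted point at the cost of blurring direction and distance, which is the same trade-off as choosing cell limits for a histogram.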

Illustrations of Fitting
Our sample variance is 100.
Checking for anisotropy indications:
- Both directional plots appear to be leveling out at 100.
- Both appear to have a nugget of about 20.
- Common Cills and nuggets mean no zonal anisotropy.
Checking for indications of different range:
- The ranges appear about the same, so there is no geometric anisotropy.

Fitting the Model
- The nugget reads as about 20.
- The curve levels out at 100.
- Cill = 100 - 20 = 80.
- 2/3rds R is about 350, so the range of influence is about 525.
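The graph readings convert to model parameters with simple arithmetic, sketched here. The 2/3rds rule follows from the standard spherical form, whose initial slope is 1.5 × Cill / R, so the straight-line trend reaches nugget + Cill at d = 2R/3. The function name `spherical_parameters` is made up for the example.

```python
def spherical_parameters(sample_variance, nugget, trend_intersection):
    """Turn three graph readings into (nugget, Cill, range of influence).

    trend_intersection is the distance where the initial linear trend
    meets the sample variance line, read off the plot by eye.
    """
    cill = sample_variance - nugget
    range_of_influence = 1.5 * trend_intersection  # trend hits at 2/3rds of R
    return nugget, cill, range_of_influence

print(spherical_parameters(sample_variance=100, nugget=20, trend_intersection=350))
```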

Let's Try Another One
The variance of samples is 100 again.
Checking for anisotropy:
- Both directions appear to have about the same nugget, at 20.
- Both appear to level off around 100.
- Therefore there is probably no zonal anisotropy.
Checking for a common range of influence:
- I don't think so. This must be a geometric anisotropy.

Checking Out the Range
- 2/3rds R is about 350 again, so the range is about 525.
- Nugget = 20
- Cill = 80

Check Out the Y Axis
- 2/3rds R is about 150, so the range in Y is about 225.
- The range is a little more than twice as far in the X direction (the principal axis).
- Cill is 80.
- Nugget is 20.

Let's Try Another One
- The sample variance is still 100, but the X direction only gets to about 70 and the Y direction gets to about 130.
- Also, the X-direction semivariogram appears to hit the vertical axis at 10, while the Y-direction one appears to hit it at 30.
- This appears to be a zonal anisotropy.

I'm a Little Unsure About the Range Being the Same or Different
- 2/3rds R is about 340, so R is about 510 on the X axis.

Checking Out the Y Axis
- 2/3rds R is about 350, which implies R = 525.
- R = 510 and R = 525 are similar enough that an anisotropy in range is not really worth modeling.

Trying One More Case
- The sample variance is again 100.
- Both directions appear to intersect the vertical axis around 10 and to level out at 100.
- There is probably no zonal anisotropy.
- The range is not obviously different between directions, but it is a little "funky".

When Range is Kinky (The Nested Structure)
Sometimes mineralization may be controlled by processes that have different ranges of influence. In this case we have something with a short range and something with a long range.
I'm guessing I have a short range structure with a range of around 100.

My Long Range Structure
- 2/3rds R is about 525, so the long range R is about 787.

Making Semivariogram Math Work Out
- Isotropic with no nested structures
  - Just use the 3 part model.
- Geometric anisotropy (means the range varies by direction)
  - Use a coordinate transform.
  - If Rx is twice Ry, use an isotropic three part model with range Rx, but double the Y component of distance before doing the 3 part model calculation.

Handling Zonal Anisotropies and Nested Structures
- Both are handled by adding components.
- For zonal anisotropy, add separate models together to get the total model.
  - Nugget*cos(θ) will give a variable nugget by direction.
  - A normal 3 part model that only counts the distance component in one direction will make the Cill change.
  - You can have an isotropic or geometric anisotropy model to handle the other component.
- Nested structures
  - Just separately compute the long and short range models and then add them up.
- As a practical matter, feed the task to a computer and let it calculate the predicted gamma values for the model you figured out.

Mathematical Examples: Computing a Raw Value of Gamma
The following samples are a distance of 50 feet apart on an isotropic semivariogram model.
- P1 = 25, P2 = 31
  - Difference is 6
  - Squared difference is 36
  - Half squared difference is 18
  - Total so far is 18
  - Number of pairs so far is 1

Continuing the computation
- P1 = 25, P3 = 20
  - Difference is 5
  - Squared difference is 25
  - Half squared difference is 12.5
  - Total to this point is 30.5
  - Number of pairs so far is 2
- P2 = 31, P4 = 38
  - Difference is 7
  - Squared difference is 49
  - Half squared difference is 24.5
  - Total to this point is 55
  - Number of pairs so far is 3

Continuing the computation
- P3 = 20, P5 = 27
  - Difference is 7
  - Squared difference is 49
  - Half squared difference is 24.5
  - Total to this point is 79.5
  - Number of pairs so far is 4
- P4 = 38, P6 = 33
  - Difference is 5
  - Squared difference is 25
  - Half squared difference is 12.5
  - Total to this point is 92
  - Number of pairs so far is 5

Finishing the raw gamma value for pairs 50 feet apart
- 92 / 5 = 18.4
- 18.4 would be the value plotted on the semivariogram.
- Most values plotted in reality will be based on more than 5 pairs; however, the calculation procedure is the same.
- Eventually the so-called sample points on the semivariogram will be replaced with a mathematical model that will be used to compute gamma values wherever they are needed.
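The hand tally can be checked with a few lines of code. This sketch uses the five pairs listed in the example: their half squared differences (18, 12.5, 24.5, 24.5 and 12.5) sum to 92, which gives an average of 18.4. The variable names are made up for the example.

```python
# The five sample pairs from the worked example, each 50 feet apart.
pairs = [(25, 31), (25, 20), (31, 38), (20, 27), (38, 33)]

# Half squared difference for each pair.
half_squared = [(a - b) ** 2 / 2 for a, b in pairs]

total = sum(half_squared)
raw_gamma = total / len(pairs)  # average over the number of pairs
print(half_squared, total, raw_gamma)
```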

Examples of Semivariogram Mathematical Models
Case 1: an isotropic spherical model with a Cill of 80, a nugget of 20 and a range of 500.
- Let point 1 be X = 0, Y = 0.
- Let point 2 be X = 75, Y = 0.
- ΔX = 75, ΔY = 0
- The Pythagorean distance is 75, which is > 0 but less than 500, so use the second part of the spherical model.
- gamma = 20 + 80 * [1.5 * (75/500) - 0.5 * (75/500)^3] = 37.87

More Mathematical Models
An isotropic nested structure: Nugget = 20; Range 1 = 50 feet, Cill 1 = 20; Range 2 = 500 feet, Cill 2 = 60.
- Nested structures are accomplished by simply adding model components.
- P1: X = 0, Y = 0. P2: X = 53.03, Y = 53.03.
- ΔX = 53.03, ΔY = 53.03
- Pythagorean distance = 75
- Nugget = 20
- Model 1: 75 > 50, so this component is at its full Cill of 20 (the nugget is counted separately).
- Model 2: 75 > 0 and 75 < 500, so use the second part of the model. This component is 60 * [1.5 * (75/500) - 0.5 * (75/500)^3] = 13.4.
- Add the components: 20 + 20 + 13.4 = 53.4
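The nested calculation can be sketched as a sum of components. The sketch assumes the standard spherical form for each structure and gamma(0) = 0 at zero distance; the function names `spherical_structure` and `nested_gamma` are made up for the example.

```python
import math

def spherical_structure(d, cill, range_of_influence):
    """Structured part of one spherical component (no nugget), for d > 0."""
    if d >= range_of_influence:
        return cill  # beyond its range the component contributes its full Cill
    ratio = d / range_of_influence
    return cill * (1.5 * ratio - 0.5 * ratio ** 3)

def nested_gamma(d, nugget, structures):
    """structures: list of (cill, range) pairs that are simply added together."""
    if d == 0:
        return 0.0  # assumption: standard convention at zero distance
    return nugget + sum(spherical_structure(d, c, r) for c, r in structures)

d = math.hypot(53.03, 53.03)  # about 75
gamma = nested_gamma(d, nugget=20, structures=[(20, 50), (60, 500)])
print(round(d, 2), round(gamma, 1))
```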

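More Models
A geometric anisotropy with Nugget = 20, Cill = 80, Range X = 500, Range Y = 100.
- P1: X = 0, Y = 0. P2: X = 53.03, Y = 53.03.
- Geometric anisotropy is handled by stretching the distance in the short axis direction.
  - The Y range is 1/5th of the X range, so ΔY gets multiplied by 5.
- ΔX = 53.03, ΔY = 265.15 (i.e. 53.03 * 5)
- Pythagorean distance is about 270.4.
- 270.4 is > 0 and < 500, so use the 2nd part of the formula.
- The structured part is 80 * [1.5 * (270.4/500) - 0.5 * (270.4/500)^3] = 58.57; adding the nugget of 20 gives gamma = 78.57.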
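The stretch-then-evaluate step can be sketched as follows, assuming the standard spherical form with the X range as the model range. With these inputs the structured part comes to about 58.57, and adding the nugget of 20 gives about 78.57. The function name `geometric_anisotropy_gamma` is made up for the example.

```python
import math

def geometric_anisotropy_gamma(dx, dy, nugget, cill, range_x, range_y):
    """Spherical model with geometric anisotropy via a coordinate stretch."""
    # Stretch the short-axis component so both directions share range_x.
    stretched_dy = dy * (range_x / range_y)
    d = math.hypot(dx, stretched_dy)
    if d == 0:
        return 0.0  # assumption: standard convention at zero distance
    if d >= range_x:
        return nugget + cill
    ratio = d / range_x
    return nugget + cill * (1.5 * ratio - 0.5 * ratio ** 3)

gamma = geometric_anisotropy_gamma(53.03, 53.03, nugget=20, cill=80,
                                   range_x=500, range_y=100)
print(round(gamma, 2))
```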