Chapter 6 – Analysis of mapped point patterns


1 Chapter 6 – Analysis of mapped point patterns

This chapter introduces methods for analyzing and modeling the spatial distribution of mapped point data, in which the location of every individual in the population is known. Two types of analysis can be conducted with mapped point patterns: (1) pattern detection (hypothesis testing: is the pattern random, regular, or aggregated?), and (2) model fitting (inference, e.g., fitting point pattern models to an observed point pattern; see Chapter 7). This chapter concentrates on the first type of analysis by introducing important techniques for detecting spatial patterns in mapped data.

2 Nearest-neighbor distribution functions: G(r) and F(r)

The distance methods presented in Chapter 4 only provide summary information on a spatial pattern at a particular distance (e.g., the first nearest-neighbor distance). We now present methods that describe the whole distribution of the nearest-neighbor (nn) distances, i.e., we treat the nn distance as a random variable.

G(r) is the probability that the distance from a randomly chosen event to its nearest neighboring event is less than or equal to r:

$$G(r) = P(r_i \le r).$$

The estimator is

$$\hat{G}(r) = \frac{1}{n} \sum_{i=1}^{n} I(r_i \le r),$$

where $r_i$ is the nn distance for a randomly chosen event i (i = 1, 2, …, n) and $I(r_i \le r)$ is an indicator function: $I(r_i \le r) = 1$ if $r_i \le r$ is true, and 0 otherwise.

F(r) is the probability that the distance from a randomly chosen point to its nearest event is less than or equal to r; it is also called the "empty space function". It has exactly the same expression as G(r), but the distance in F(r) is a point-to-event distance rather than an event-to-event distance.
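As a minimal sketch of the two estimators, the following base-R code computes the empirical G(r) and F(r) for a toy pattern. It applies no edge correction, and the function names (nn_dist, G_hat, F_hat) and the toy coordinates are hypothetical, introduced only for illustration.

```r
# Nearest-neighbor distance r_i for every event (no edge correction).
nn_dist <- function(x, y) {
  d <- as.matrix(dist(cbind(x, y)))  # all pairwise event-to-event distances
  diag(d) <- Inf                     # an event is not its own neighbor
  apply(d, 1, min)
}

# G-hat(r) = (1/n) * sum of I(r_i <= r), evaluated at each value in r.
G_hat <- function(x, y, r) {
  ri <- nn_dist(x, y)
  sapply(r, function(r0) mean(ri <= r0))
}

# F-hat(r): same form, but distances run from reference points (px, py)
# to their nearest event.
F_hat <- function(x, y, px, py, r) {
  di <- sapply(seq_along(px),
               function(k) min(sqrt((x - px[k])^2 + (y - py[k])^2)))
  sapply(r, function(r0) mean(di <= r0))
}

# Toy pattern: four events on the corners of a unit square, one inside it.
x <- c(0, 1, 0, 1, 0.5)
y <- c(0, 0, 1, 1, 0.6)
g_vals <- G_hat(x, y, r = c(0.5, 0.7, 0.9))
f_vals <- F_hat(x, y, px = 0.5, py = 0.5, r = c(0.05, 0.2))
```

In practice the reference points for F(r) would be many random or gridded locations in the study area; a single point is used here only to keep the arithmetic easy to follow.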

3 More on G(r) and F(r)

Under csr (complete spatial randomness), it can be shown that G(r) has the form

$$G(r) = 1 - e^{-\lambda \pi r^2},$$

where $\lambda$ is the intensity (expected number of events per unit area).

To judge how far the empirical $\hat{G}(r)$ is from csr, a simulation envelope can be computed for $\hat{G}(r)$ based on, say, 100 realizations of $s_1, s_2, \ldots, s_{652}$ from a uniform distribution over the study area (i.e., assuming the 652 Douglas-firs follow a homogeneous Poisson process). The estimator $\hat{G}(r)$ is calculated from each realization and, for each distance r, the largest and smallest values define the simulation envelope. (The envelope is not shown in the figure here.)

Figure: $\hat{G}(r) \times 652$ (y-axis, 0 to 600) against distance (x-axis, 0 to 30) for Douglas-fir (n = 652).
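The envelope construction can be sketched in base R as follows. This is an illustration under stated assumptions, not the spatstat implementation: the window size, the number of events (n = 50) and all object names are hypothetical, and no edge correction is applied.

```r
set.seed(1)

# Nearest-neighbor distances for a pattern (no edge correction).
nn_dist <- function(x, y) {
  d <- as.matrix(dist(cbind(x, y)))
  diag(d) <- Inf
  apply(d, 1, min)
}
# Empirical G evaluated on a grid of distances r.
G_hat <- function(ri, r) sapply(r, function(r0) mean(ri <= r0))

n <- 50; a <- 10; b <- 10            # hypothetical pattern size and window [0,a] x [0,b]
r <- seq(0, 2, by = 0.1)
lambda <- n / (a * b)
G_csr <- 1 - exp(-lambda * pi * r^2) # theoretical G(r) under csr

# 100 csr realizations: n uniform points each; one column of G-hat per realization.
nsim <- 100
sims <- replicate(nsim, G_hat(nn_dist(runif(n, 0, a), runif(n, 0, b)), r))

# Pointwise envelope: smallest and largest simulated value at each r.
G_lo <- apply(sims, 1, min)
G_hi <- apply(sims, 1, max)
```

An observed $\hat{G}(r)$ lying above G_hi at short distances would point toward aggregation, and one lying below G_lo toward regularity.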

4 R package spatstat for point pattern analysis

Developed by Adrian Baddeley and Rolf Turner. The package supports:
1. creation, manipulation and plotting of point patterns
2. exploratory data analysis
3. simulation of point process models
4. parametric model-fitting
5. hypothesis tests and diagnostics

The first step in all of these analyses is to create a ppp object. Using the Douglas-fir data as an example (the three ppp calls below are alternative ways of specifying the window):

> df.dat=subset(victoria.dat,victoria.dat$sp=="DF")
> df.ppp=ppp(df.dat$x,df.dat$y,c(0,103),c(0,87)) # rectangular window given as x and y ranges
> df.ppp=ppp(df.dat$x,df.dat$y,window=owin(c(0,103),c(0,87))) # the same window as an owin object
> df.ppp=ppp(df.dat$x,df.dat$y,poly=list(x=c(0,50,60,0),y=c(0,0,60,50))) # polygon window

5 R implementation

Figure labels: "Regular", "1st nn dist", "Aggregated".

1. Prepare the Douglas-fir and hemlock data in ppp format (df.ppp, hl.ppp).
2. df.G=Gest(df.ppp)
3. plot(df.G)
4. plot(envelope(df.ppp,fun=Gest)) # generate the simulation envelope
5. The pointwise envelopes are not "confidence bands" for the true value of the function! The test is constructed by choosing a fixed value of r and rejecting the null hypothesis if the observed function value lies outside the envelope at that value of r. This test has exact significance level alpha = 2*nrank/(1 + nsim), where nsim is the number of simulations and nrank is the rank of the envelope value amongst the nsim simulated values.

* Baddeley, A.J. & Gill, R.D. 1997. Kaplan-Meier estimators of interpoint distance distributions for spatial point processes. Annals of Statistics 25:263-292.
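A quick arithmetic check of the significance-level formula, assuming nsim = 99 simulations and nrank = 1 (the envelope is then the minimum and maximum of the simulated values; these values are chosen for illustration):

```r
nsim  <- 99   # number of simulated patterns (assumed for this example)
nrank <- 1    # envelope uses the most extreme simulated values
alpha <- 2 * nrank / (1 + nsim)  # two-sided pointwise significance level
alpha
```

With these settings the pointwise test at a single, pre-chosen r has level 0.02; taking nrank = 5 with nsim = 199 would give 0.05.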

6 K-function

The K-function is the most important function for quantifying mapped point patterns. It was proposed by Ripley in 1976* and is often called Ripley's K-function. The K-function is a second-moment measure, as it is closely related to the second-order intensity of a stationary isotropic point process. It captures the spatial (in)dependence between different regions of the point process.

Let's first look at the 1st- and 2nd-order properties of a spatial point process.

1st-order property:

$$\lambda(x) = \lim_{|A_x| \to 0} \frac{E[N(A_x)]}{|A_x|},$$

where $A_x$ is an infinitesimal region which contains the point x, $N(A_x)$ is the number of events in $A_x$, and $|A_x|$ is its area. For a stationary process, $\lambda(x) = \lambda$ is constant.

2nd-order property:

$$\lambda_2(x, y) = \lim_{|A_x|, |A_y| \to 0} \frac{E[N(A_x) N(A_y)]}{|A_x|\,|A_y|}.$$

For a stationary and isotropic process, $\lambda_2(x, y) = \lambda_2(h)$, where $h = |x - y|$.

* Ripley, B. D. 1976. The second-order analysis of stationary point processes. J. Appl. Prob. 13:255-266.

7 Definition of K-function

The K-function is defined as

$$K(h) = \lambda^{-1} E(\text{number of other events within distance } h \text{ of an arbitrary event}),$$

so that E(number of other events within distance h of an arbitrary event) = $\lambda K(h)$.

8 The relationship between the K-function and $\lambda_2(x, y)$

$$\lambda K(h) = \int_0^{2\pi} \int_0^h \frac{\lambda_2(r)}{\lambda}\, r \, dr \, d\theta = \frac{2\pi}{\lambda} \int_0^h \lambda_2(r)\, r \, dr,$$

where $\lambda_2(r)/\lambda$ is interpreted as the conditional intensity of an event at x given an event at 0, i.e., $\lambda_2(0, x)/\lambda$ with $r = |x|$: the intensity at the point x conditional on there being an event at 0.

For a Poisson process, $\lambda_2(r) = \lambda^2$, so $K(h) = \pi h^2$. Use this as a null model for csr:
$K(h) > \pi h^2$ suggests an aggregated pattern.
$K(h) = \pi h^2$ suggests a random pattern.
$K(h) < \pi h^2$ suggests a regular pattern.
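A short worked check of the Poisson case, assuming the standard polar-coordinate form of the relationship, $K(h) = 2\pi\lambda^{-2}\int_0^h \lambda_2(r)\, r\, dr$:

```latex
\begin{aligned}
K(h) &= \frac{2\pi}{\lambda^{2}} \int_0^h \lambda_2(r)\, r \, dr
      && \text{(stationary, isotropic process)} \\
     &= \frac{2\pi}{\lambda^{2}} \int_0^h \lambda^{2}\, r \, dr
      && \text{(Poisson: } \lambda_2(r) = \lambda^{2}\text{)} \\
     &= 2\pi \cdot \frac{h^{2}}{2} = \pi h^{2}.
\end{aligned}
```

Equivalently, $\lambda K(h) = \lambda \pi h^2$ is the expected number of events in a disc of radius h, which is what independence of events implies.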

9 The properties of K(h)

1. For a Poisson process, $\lambda_2(r) = \lambda^2$, so $K(h) = \pi h^2$. Use this as a null model for csr:
   $K(h) > \pi h^2$ suggests an aggregated pattern.
   $K(h) = \pi h^2$ suggests a random pattern.
   $K(h) < \pi h^2$ suggests a regular pattern.
2. The K-function is invariant under random thinning. By "random thinning" we mean that each event of a process is retained or deleted according to a series of independent Bernoulli trials. This property means that the K-function of the resulting thinned process is identical to that of the original, unthinned process.

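10 A simple estimator of K(h)

$$\hat{K}(h) = \frac{1}{\hat{\lambda}\, n} \sum_{i=1}^{n} \sum_{j \ne i} I(\|s_i - s_j\| \le h), \qquad \hat{\lambda} = n/|A|,$$

where $|A|$ is the area of the study region A.

Figure: events $s_i$ and $s_j$ with circles of radius h.

Edge effect: points close to the edges will have fewer points within the circle of radius h than points far from the edges.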
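A minimal base-R sketch of a simple (uncorrected) estimator, assuming the common form in which the number of ordered pairs within distance h is divided by $\hat{\lambda} n$ with $\hat{\lambda} = n/|A|$. The function name K_naive and the toy pattern are hypothetical.

```r
# Naive K estimator: no edge correction, so it is biased downward near edges.
K_naive <- function(x, y, h, area) {
  n <- length(x)
  d <- as.matrix(dist(cbind(x, y)))  # pairwise distances ||s_i - s_j||
  diag(d) <- Inf                     # exclude i == j
  lambda_hat <- n / area             # estimated intensity
  # sum(d <= h0) counts ordered pairs (i, j), i != j, within distance h0
  sapply(h, function(h0) sum(d <= h0) / (lambda_hat * n))
}

# Toy pattern on the unit square (area = 1): four corners plus one interior event.
x <- c(0, 1, 0, 1, 0.5)
y <- c(0, 0, 1, 1, 0.6)
k_vals <- K_naive(x, y, h = c(0.7, 0.8, 1), area = 1)
```

For this toy pattern the counts of ordered pairs are 4, 8 and 16, giving 0.16, 0.32 and 0.64 after division by $\hat{\lambda} n = 25$.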

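11 Toroidal unbiased estimator of K(h)

Because of the edge effect, the simple estimator is biased and not very efficient. An alternative is the estimator based on toroidal correction:

$$\hat{K}(h) = \frac{N_+(h)}{\hat{\lambda}\, n},$$

where $N_+(h)$ is the number of ordered pairs of points $(s_i, s_j)$, $i \ne j$, with $\|s_i - s_j\| \le h$, the distance being measured on the torus obtained by joining opposite edges of the study area.

Figure panels: (1) toroidal edge correction, to be used only for stationary and isotropic patterns; (2) misuse of toroidal edge correction for a non-stationary pattern.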
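The toroidal correction can be sketched in base R by wrapping coordinate differences around a rectangular window [0, a] × [0, b]. The function name K_torus and the two-point example are hypothetical, and the scaling by $|A|/n^2$ is an assumed, commonly used normalization.

```r
# K estimator with toroidal (periodic) edge correction on [0, a] x [0, b].
K_torus <- function(x, y, h, a, b) {
  n <- length(x)
  dx <- abs(outer(x, x, "-")); dx <- pmin(dx, a - dx)  # wrap in x
  dy <- abs(outer(y, y, "-")); dy <- pmin(dy, b - dy)  # wrap in y
  d <- sqrt(dx^2 + dy^2)                               # distance on the torus
  diag(d) <- Inf                                       # exclude i == j
  sapply(h, function(h0) (a * b) * sum(d <= h0) / n^2)
}

# Two events near opposite edges: 9 units apart in the plane, 1 unit on the torus.
k_torus <- K_torus(x = c(0.5, 9.5), y = c(5, 5), h = c(0.5, 2), a = 10, b = 10)
```

The wrap-around treats the window as one tile of a repeating pattern, which is only reasonable when the process is stationary.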

12 Weighted unbiased estimator of K(h)

Another unbiased estimator, initially proposed by Ripley (1976), gives more weight to points near the boundaries:

$$\hat{K}(h) = \frac{1}{\hat{\lambda}\, n} \sum_{i=1}^{n} \sum_{j \ne i} w(s_i, s_j)^{-1}\, I(\|s_i - s_j\| \le h),$$

where the weight $w(s_i, s_j)$ is the proportion of the circumference of the circle centered at $s_i$ and passing through $s_j$ that lies within the study area ($s_i$ must be within the study area). $w(s_i, s_j) = 1$ if the circle lies entirely within the study area.

Figure: events $s_i$ and $s_j$ with circles of radius h.

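13 Computing $w(s_i, s_j)$

Assume the study area is $A = [0, a] \times [0, b]$ and $s_i$ has coordinates $s_i = (x, y)$. Rewrite $w(s_i, s_j) = w(s_i, h)$, where h is the radius of a circle centered at $s_i$. Denote $d_1 = \min(x, a - x)$ and $d_2 = \min(y, b - y)$; thus $d_1$ and $d_2$ are the distances from $s_i$ to the nearest vertical and horizontal edges of A. $w(s_i, h)$ is calculated as follows:

1. If $h^2 > d_1^2 + d_2^2$ (the circle reaches past the nearest corner, so it intersects both the vertical and the horizontal edge):

$$w(s_i, h) = 0.75 - \frac{1}{2\pi}\left[\cos^{-1}\!\left(\frac{d_1}{h}\right) + \cos^{-1}\!\left(\frac{d_2}{h}\right)\right]$$

2. If $h^2 \le d_1^2 + d_2^2$ (the nearest corner lies outside the circle, e.g., the circle intersects only one edge):

$$w(s_i, h) = 1 - \frac{1}{\pi}\left[\cos^{-1}\!\left(\frac{\min(d_1, h)}{h}\right) + \cos^{-1}\!\left(\frac{\min(d_2, h)}{h}\right)\right]$$

Figure: three example positions of $s_i$ relative to the edges of the study area, each with a circle of radius h.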
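A minimal base-R sketch of the weight for a rectangular window, assuming the commonly used two-case expressions: $1 - \pi^{-1}[\cos^{-1}(\min(d_1,h)/h) + \cos^{-1}(\min(d_2,h)/h)]$ when $h^2 \le d_1^2 + d_2^2$, and $0.75 - (2\pi)^{-1}[\cos^{-1}(d_1/h) + \cos^{-1}(d_2/h)]$ otherwise. The function name edge_weight is hypothetical, and the sketch assumes h is small enough that only the two nearest edges can be crossed.

```r
# Proportion of the circle of radius h centered at (x, y) that lies
# inside the rectangle [0, a] x [0, b]; requires h > 0.
edge_weight <- function(x, y, h, a, b) {
  d1 <- min(x, a - x)  # distance to the nearest vertical edge
  d2 <- min(y, b - y)  # distance to the nearest horizontal edge
  if (h^2 <= d1^2 + d2^2) {
    # nearest corner outside the circle: the cut-off arcs do not overlap
    1 - (acos(min(d1, h) / h) + acos(min(d2, h) / h)) / pi
  } else {
    # nearest corner inside the circle: the two cut-off arcs merge
    0.75 - (acos(d1 / h) + acos(d2 / h)) / (2 * pi)
  }
}

w_interior <- edge_weight(5, 5, h = 1, a = 10, b = 10)  # circle fully inside
w_on_edge  <- edge_weight(0, 5, h = 1, a = 10, b = 10)  # center on one edge
w_corner   <- edge_weight(0, 0, h = 1, a = 10, b = 10)  # center on a corner
```

The three cases give 1, 1/2 and 1/4, matching the geometric intuition that half of the circle is lost on an edge and three quarters at a corner.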

14 Variance and simulation envelopes

As mentioned earlier, csr has $K(h) = \pi h^2$. It is usual to express K(h) as

$$K_0(h) = \sqrt{K(h)/\pi} \qquad (*)$$

so that $K_0(h) = h$ under csr. This transformation stabilizes the variance: under csr, and ignoring edge effects, the variance of the estimated $K_0(h)$ is approximately

$$\operatorname{var}[\hat{K}_0(h)] \approx \frac{|A|}{2\pi n^2},$$

which does not depend on h.

To judge how far the observed K-function deviates from csr, a simulation envelope can be constructed based on, say, 99 realizations of $s_1, s_2, \ldots, s_{982}$ from a uniform distribution over the study area (i.e., assuming the 982 western hemlock trees follow a homogeneous Poisson process). The K-function is calculated from each realization, and for each distance h the largest and smallest values define the simulation envelope.

15 R implementation

Let's model the distribution of the 982 western hemlocks. The spatstat function Kest computes K(h); the transformation presented on the previous slide is then applied to its output.

> hl.kest=Kest(hl.ppp) # hl.ppp is a ppp object of spatstat
> plot(hl.kest)
> plot(hl.kest$r,sqrt(hl.kest$iso/pi)-hl.kest$r) # transformed K minus h (isotropic edge correction)
> hl.env=envelope(hl.ppp) # simulation envelope under csr; Kest is the default summary function
> plot(hl.kest$r,sqrt(hl.kest$iso/pi))
> lines(hl.env$r,sqrt(hl.env$lo/pi),col=2) # lower envelope
> lines(hl.env$r,sqrt(hl.env$hi/pi),col=2) # upper envelope

16 L-function

In practice, the K-function is usually displayed as the L-function, defined as

$$L(h) = \sqrt{K(h)/\pi} - h.$$

For an aggregated distribution, $L(h) > 0$.
For a random distribution, $L(h) = 0$.
For a regular distribution, $L(h) < 0$.

Examples (figure): L(h) against h (x-axis, 0 to 50) for Douglas-fir (n = 652; y-axis from -0.5 to 0.5) and for hemlock (n = 982; y-axis from 0 to 3).
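A small base-R illustration of the transformation, assuming the form $L(h) = \sqrt{K(h)/\pi} - h$. The function name L_fun and the scaled K-functions used as stand-ins for aggregated and regular patterns are hypothetical.

```r
# L-function from a K-function evaluated at distances h.
L_fun <- function(K, h) sqrt(K / pi) - h

h <- seq(0, 5, by = 0.5)
L_csr <- L_fun(pi * h^2, h)        # csr benchmark: K(h) = pi * h^2
L_agg <- L_fun(1.5 * pi * h^2, h)  # K above the benchmark (aggregation-like)
L_reg <- L_fun(0.5 * pi * h^2, h)  # K below the benchmark (regularity-like)
```

L_csr is zero at every h (up to floating-point error), while L_agg is positive and L_reg negative for h > 0, which is what makes departures from csr easy to read off a plot.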

17 g-function (pair correlation function)

The g-function is based on the derivative of the K-function, defined as

$$g(h) = \frac{K'(h)}{2\pi h}.$$

The g-function describes how the K-function changes with the spatial distance lag h; under csr, $K(h) = \pi h^2$ gives $g(h) = 1$. The K-function is a cumulative function, so it may confound large-scale (large h) effects with small-scale (small h) effects. The g-function is said to be able to separate these effects.

R implementation:

> pcf(hl.ppp)

18 Bivariate spatial point patterns

A bivariate spatial point pattern consists of the locations of two types of events in a bounded study area A, e.g., the distributions of two tree species (Douglas-fir and western hemlock). It can be defined as $\{s_j^{(i)} : i = 1, 2;\ j = 1, 2, \ldots\}$, where $s_j^{(i)}$ is the j-th location of the type i (i = 1, 2) species.

The two species may or may not be spatially independent. A natural working hypothesis is that the patterns of the two species are independent. However, it is worth noting that independence does not guarantee csr for each of the species.

As in the univariate case, the K-function can be extended to the bivariate case to quantify the relationship between the two species. It is defined as

$$K_{12}(h) = \lambda_2^{-1} E(\text{number of type 2 events within distance } h \text{ of an arbitrary type 1 event}),$$

where $\lambda_2$ here denotes the intensity of type 2 events (not the second-order intensity). If both species follow independent csr, $K_{12}(h)$ ($= K_{21}(h)$) has the simple form $K_{12}(h) = \pi h^2$.

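19 An unbiased estimator

For a given data set with $n_1$ type 1 events and $n_2$ type 2 events, $K_{12}(h)$ and $K_{21}(h)$ can be estimated, respectively, as

$$\hat{K}_{12}(h) = \frac{|A|}{n_1 n_2} \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} w(s_i^{(1)}, s_j^{(2)})^{-1}\, I(\|s_i^{(1)} - s_j^{(2)}\| \le h),$$

$$\hat{K}_{21}(h) = \frac{|A|}{n_1 n_2} \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} w(s_j^{(2)}, s_i^{(1)})^{-1}\, I(\|s_i^{(1)} - s_j^{(2)}\| \le h),$$

where $w(s_i^{(1)}, s_j^{(2)})$ is the proportion of the circumference of the circle with centre $s_i^{(1)}$ and passing through $s_j^{(2)}$ that lies within the study region A.

Figure: a type 1 event $s_i^{(1)}$ and a type 2 event $s_j^{(2)}$ with a circle of radius h.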
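A minimal base-R sketch of a bivariate estimator without edge correction (all weights set to 1), assuming the normalization $|A|/(n_1 n_2)$. The function name K12_naive and the toy coordinates are hypothetical.

```r
# Cross-type K estimator without edge correction.
# (x1, y1): type 1 events; (x2, y2): type 2 events; area = |A|.
K12_naive <- function(x1, y1, x2, y2, h, area) {
  n1 <- length(x1)
  n2 <- length(x2)
  # n1 x n2 matrix of distances from each type 1 event to each type 2 event
  d <- sqrt(outer(x1, x2, "-")^2 + outer(y1, y2, "-")^2)
  sapply(h, function(h0) area * sum(d <= h0) / (n1 * n2))
}

# Toy pattern in a 2 x 2 window (area = 4).
x1 <- c(0.5, 1.5); y1 <- c(0.5, 0.5)  # type 1
x2 <- c(0.5, 1.5); y2 <- c(1.0, 1.5)  # type 2
k12 <- K12_naive(x1, y1, x2, y2, h = c(0.6, 1.2), area = 4)
k21 <- K12_naive(x2, y2, x1, y1, h = c(0.6, 1.2), area = 4)
```

Without edge weights the two estimates coincide, because the same pairs are counted in both directions; with edge correction they generally differ, since the weight depends on which event is taken as the circle's centre.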

20 An estimator of variance reduction

When the underlying processes for both species are independent Poisson processes, Lotwick & Silverman (1982)* show that the most efficient estimator is a linear combination of $\hat{K}_{12}(h)$ and $\hat{K}_{21}(h)$.

Because $K_{12}(h) = \pi h^2$ for csr, we can similarly define an L-function:

$$L_{12}(h) = \sqrt{K_{12}(h)/\pi} - h.$$

For an aggregated distribution, $L_{12}(h) > 0$.
For a random distribution, $L_{12}(h) = 0$.
For a regular distribution, $L_{12}(h) < 0$.

* Lotwick, H. W. & Silverman, B. W. 1982. Methods for analysing spatial processes of several types of points. J. R. Stat. Soc. B 44:406-413.

21 R implementation

Let's use spatstat to compute $K_{12}(h)$ for redcedar and western hemlock.

> victoria.ppp=ppp(victoria.dat$x,victoria.dat$y,c(0,103),c(0,87),marks=factor(victoria.dat$sp)) # Kcross needs factor marks
> cdhl.kcross=Kcross(victoria.ppp,"HL","CD")
> plot(cdhl.kcross)

Also see Kmulti:

> plot(Kmulti(victoria.ppp,victoria.ppp$marks=="CD",victoria.ppp$marks=="HL"))

22 Figure panels: hemlock-redcedar; Douglas-fir-hemlock.

23 Assignment: Compute the bivariate L function for CD and HL of Victoria.dat

> victoria.ppp=ppp(victoria.dat$x,victoria.dat$y,c(0,103),c(0,87),marks=factor(victoria.dat$sp)) # Kcross needs factor marks
> cdhl.kcross=Kcross(victoria.ppp,"HL","CD")
> plot(cdhl.kcross)
> cdhl.env=envelope(victoria.ppp, Kcross, i="HL", j="CD") # simulation envelope for the cross K-function
> cdhl.lfn=sqrt(cdhl.kcross$iso/pi)-cdhl.kcross$r # bivariate L function
> plot(cdhl.kcross$r, cdhl.lfn, ylim=c(-0.25,1.1), xlab="h", ylab="L function")
> lines(cdhl.env$r, sqrt(cdhl.env$hi/pi)-cdhl.env$r, col="red") # upper envelope
> lines(cdhl.env$r, sqrt(cdhl.env$lo/pi)-cdhl.env$r, col="blue") # lower envelope

