ES 470 SAMPLING AND ANALYSIS OF HYDROLOGICAL DATA Manoj K. Shukla, Ph.D. Assistant Professor Environmental Soil Physics FEBRUARY 09, 2006, (W147, 3 - 5.


ES 470: SAMPLING AND ANALYSIS OF HYDROLOGICAL DATA
Manoj K. Shukla, Ph.D., Assistant Professor, Environmental Soil Physics
February 9, 2006 (W147, 3 - 5 PM)

J. H. Dane and G. C. Topp (Editors), Methods of Soil Analysis, Part 4: Physical Methods. ES-470

Scales of Variability (smallest to largest): Molecules; Particles or Pores; Aggregate; Column or Horizon; Field or Watershed; Regional; Pedosphere

Variability
• Spatial: variability with increasing distance (space) from a location
• Temporal: variability with increasing duration (time)
We will limit our discussion to the field scale.

Agricultural Field
• In situ soil exhibits a large degree of variability, or heterogeneity
• Changes in soil type need to be accounted for in composite sampling
• The composite sample must maintain the heterogeneity of the in situ soil

Sources
• Intrinsic factors: soil forming factors, time, soil texture, mineralogy, pedogenesis (geological, hydrological, biological factors). The intrinsic variables have a distinct component that can be called regionalized, i.e., it varies in space, with nearby areas tending to be alike.
• Extrinsic factors: land use and management, fertilizer application, other amendments, drainage, tillage

Structure of Variability
• Random sampling is done to ensure that estimates are unbiased
• It meets the criterion of independent sampling under identical conditions
$Y_i = \mu + \varepsilon_i$
where $Y_i$ is the realization of a soil attribute at location $i$, $\mu$ is the mean value for the spatial domain, and $\varepsilon_i$ is a random error term.

An attribute (e.g., bulk density, nitrate concentration) is described through two statistical parameters:
$E[Y_i] = \mu$ (first moment, or mean)
$E[(Y_i - \mu)^2] = \sigma^2$ (second moment, or variance)

The mean and variance (first and second moments) are often assumed to be the parameters of a normal (Gaussian) probability distribution function, and they allow for a series of sophisticated statistical analyses.
$E[Y_i] = \mu$; $E[(Y_i - \mu)^2] = \sigma^2$
Arithmetic mean: $m = (x_1 + x_2 + x_3)/3$
Geometric mean: $m = (x_1 \cdot x_2 \cdot x_3)^{1/3}$
Harmonic mean: $m = \left[ (1/x_1 + 1/x_2 + 1/x_3) \cdot (1/n) \right]^{-1}$
Variance: $s^2 = (1/n) \sum (x_i - x_m)^2$
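The three means and the variance can be checked numerically. The following is a minimal sketch, not part of the original slides; the function names and the sample values are illustrative assumptions. The harmonic mean is written as the reciprocal of the mean of reciprocals, and the variance uses the 1/n divisor shown on the slide.

```python
import math

def arithmetic_mean(x):
    # Sum of the values divided by their count
    return sum(x) / len(x)

def geometric_mean(x):
    # n-th root of the product; requires strictly positive values
    return math.prod(x) ** (1.0 / len(x))

def harmonic_mean(x):
    # Reciprocal of the mean of the reciprocals; requires non-zero values
    return len(x) / sum(1.0 / v for v in x)

def variance(x):
    # Mean squared deviation from the arithmetic mean (1/n divisor, as on the slide)
    m = arithmetic_mean(x)
    return sum((v - m) ** 2 for v in x) / len(x)

# Hypothetical values, chosen only to make the arithmetic easy to follow
x = [1.0, 2.0, 4.0]
print(arithmetic_mean(x), geometric_mean(x), harmonic_mean(x), variance(x))
```

For positive data the harmonic mean never exceeds the geometric mean, which never exceeds the arithmetic mean; this ordering is a quick sanity check on any implementation.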

Soil N content data (figure): Mean = 1.35 g kg$^{-1}$; the other mean and variance values are not given. $E[Y_i] = \mu$; $E[Y_i] = \mu \pm \sigma$

Normal (Gaussian) Distribution
The function is symmetric about the mean. It attains its maximum value at the mean and approaches its minimum value at plus and minus infinity.

Histogram for Sand Content (SigmaPlot 8.0): normal distribution

Histogram for Saturated Hydraulic Conductivity: positively skewed distribution

Skewed distributions: negative and positive
When one of the tails is longer than the other, the distribution is skewed.

Different Data Structures

So in place of $E[Y_i] = \mu$, an appropriate model is
$Y_i = \mu(x_i) + \varepsilon_i$
where $\mu(x_i)$ can be a constant or a function, both dependent on a spatial or temporal scale.
Therefore, simple randomization may not be sufficient, and stratified sampling will be better. In stratified sampling, the area is divided into sub-areas called strata.

Case Study
1. Formulate objectives
2. Formulate hypotheses
3. Design a sampling scheme
4. Collect data
5. Interpret data
Objective: determine the relative magnitude of statistical and spatial variability at the field scale

Sampling Design?
1. Simple random
2. Stratified
3. Two-stage
4. Cluster
5. Systematic
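Three of the designs listed above can be sketched for a rectangular field. This is an illustrative sketch only: the field dimensions, the rectangular strata, and the function names are assumptions, and a real stratification would normally follow soil type or management boundaries, not a regular partition.

```python
import random

def simple_random(n, width, height, rng):
    # n locations drawn uniformly over the whole field
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]

def stratified_random(nx, ny, per_stratum, width, height, rng):
    # Field split into nx * ny rectangular strata (an assumption);
    # per_stratum locations are drawn uniformly inside each stratum
    dx, dy = width / nx, height / ny
    points = []
    for i in range(nx):
        for j in range(ny):
            for _ in range(per_stratum):
                points.append((rng.uniform(i * dx, (i + 1) * dx),
                               rng.uniform(j * dy, (j + 1) * dy)))
    return points

def systematic_grid(spacing, width, height):
    # Regular grid with one location at the center of each spacing x spacing cell
    nx, ny = int(width // spacing), int(height // spacing)
    return [((i + 0.5) * spacing, (j + 0.5) * spacing)
            for i in range(nx) for j in range(ny)]

rng = random.Random(470)  # fixed seed so the draw is repeatable
srs = simple_random(16, 100.0, 100.0, rng)
strat = stratified_random(4, 4, 1, 100.0, 100.0, rng)
grid = systematic_grid(20.0, 100.0, 100.0)
print(len(srs), len(strat), len(grid))
```

With the same number of samples, the stratified draw guarantees coverage of every stratum, whereas the simple random draw can leave parts of the field unsampled.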

How many samples? Sample size for simple random sampling
• The standard deviation or coefficient of variation is known
• Relative error should be smaller than a chosen limit $r$: $n = \left( \frac{u_{1-\alpha/2}\, S}{r\, \bar{y}} \right)^2$
• Absolute error should be smaller than a chosen limit $d$: $n = \left( \frac{u_{1-\alpha/2}\, S}{d} \right)^2$
where $u_{1-\alpha/2}$ is the $(1-\alpha/2)$ quantile of the standard normal distribution, $S$ is the standard deviation of $y$ in the area, and $\bar{y}$ is the mean.
• Time and resources?

Student's t-table (rows: df; columns: p), where df = degrees of freedom and p is the probability level

Example data of N concentration: 1.10, 1.11, 1.12, 1.13, 1.13, 1.14, 1.16, 1.17, 1.19, 1.20, 1.23, 1.24, 1.25
• Relative error = 0.01 g kg$^{-1}$
• Mean of Y = 1.17 g kg$^{-1}$
• Standard deviation = 0.05
• Alpha = 0.05
• Degrees of freedom = 13 - 1 = 12
• Student's t (table) = 1.782

• Relative error (r) = 0.02 g kg$^{-1}$
• Alpha = 0.10
• Degrees of freedom = 13 - 1 = 12
• Student's t (table) = (not given)
[Figure panels: r = 0.01; r = 0.02]
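The sample-size calculation for this example can be sketched as follows. This is a hedged illustration, not the slide's own computation: it assumes the common form $n = (t\,S/d)^2$ for an absolute error limit $d$ and $n = (t\,S/(r\,\bar{y}))^2$ for a relative error limit $r$, with the tabulated Student's t standing in for the normal quantile. The slide quotes its error limit with units of g kg$^{-1}$, so both readings are shown; the function names are hypothetical.

```python
import math
import statistics

# N concentration example data from the slide (g/kg)
n_conc = [1.10, 1.11, 1.12, 1.13, 1.13, 1.14, 1.16,
          1.17, 1.19, 1.20, 1.23, 1.24, 1.25]

mean_y = statistics.mean(n_conc)   # about 1.17 g/kg
s = statistics.stdev(n_conc)       # sample standard deviation, about 0.05

def n_for_absolute_error(t, s, d):
    # Assumed form n = (t * s / d)^2, rounded up to a whole sample
    return math.ceil((t * s / d) ** 2)

def n_for_relative_error(t, s, mean, r):
    # Assumed form n = (t * s / (r * mean))^2, rounded up to a whole sample
    return math.ceil((t * s / (r * mean)) ** 2)

t_table = 1.782  # Student's t quoted on the slide for 12 degrees of freedom

# Using the rounded values quoted on the slide (s = 0.05, mean = 1.17)
print(n_for_absolute_error(t_table, 0.05, 0.01))
print(n_for_absolute_error(t_table, 0.05, 0.02))
print(n_for_relative_error(t_table, 0.05, 1.17, 0.01))
```

Under either reading, doubling the allowed error cuts the required number of samples to roughly one quarter, because the limit enters the formula squared.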

Variation in properties
• Deterministic parameters: $\mathrm{Var}(Y_i) = 0$
• Stochastic parameters: a mean value and an uncertainty statistic, $E(Y_i)_s = Y_m$
  Variance: $\mathrm{Var}(Y_i)_s = \sigma_s^2$
  Semivariogram function: $\mathrm{Var}[(Y_i)_s - (Y_{i+h})_s] = 2\gamma(h)$
It is always implied that:
• The domain is first- or second-order stationary
• The process is adequately characterized by a mean value and an uncertainty statistic

We will use data collected on a 20 x 20 cm grid in a field seeded to grass for the last 20 years.

Variability can be expressed by the coefficient of variation (CV), where:
x = an individual value
n = the number of test values
$\bar{x}$ = the mean of the n values
Standard deviation of two independent sets, where:
$n_1$ = number of values in the first set
$s_1$ = standard deviation of the first set of values
$n_2$ = number of values in the second set
$s_2$ = standard deviation of the second set of values
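The slide's equations did not survive in the transcript, so the sketch below uses the standard textbook forms as an assumption: CV = $(s/\bar{x}) \times 100$ with the sample standard deviation $s$ (n - 1 divisor), and a pooled standard deviation $s_p = \sqrt{[(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2]/(n_1 + n_2 - 2)}$ for two independent sets. Function names are illustrative.

```python
import math
import statistics

def coefficient_of_variation(values):
    # CV in percent: sample standard deviation relative to the mean
    return statistics.stdev(values) / statistics.mean(values) * 100.0

def pooled_std(n1, s1, n2, s2):
    # Pooled standard deviation of two independent sets (assumed textbook form)
    return math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))

# N concentration example values quoted elsewhere in this lecture (g/kg)
n_conc = [1.10, 1.11, 1.12, 1.13, 1.13, 1.14, 1.16,
          1.17, 1.19, 1.20, 1.23, 1.24, 1.25]

print(round(coefficient_of_variation(n_conc), 1))  # CV in percent
print(pooled_std(10, 2.0, 10, 2.0))                # equal spreads pool to the same value
```

Because the CV is dimensionless, it allows properties measured in different units (for example cm/min and %) to be compared on one scale.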

Coefficient of variation (CV): statistical variability of soil properties at local scale (Shukla et al.)
[Figure: CV of textural and water transmission properties]
• $i_c$: steady state infiltration rate (cm/min)
• $K_s$: saturated hydraulic conductivity (cm/min)
• $I$: cumulative infiltration (cm)
• $I_5$: infiltration rate at 5 min (cm/min)
• AWC: available water content (cm)
• VTP: volume of transport pores ($\theta_s - \theta_6$) (%)
• VSP: volume of storage pores (%)

Descriptive statistics (or CV) cannot discriminate between intrinsic (natural) and extrinsic (imposed) sources of variability.
Geostatistical analysis: grid-based or spatial sampling, for example 20 m x 20 m

[Figure: variogram showing nugget ($C_0$), partial sill ($C_1$), and range ($a$) against lag ($h$; m). Pannatier, 1996. Software: ArcView, Variowin]

Note:
• $\gamma(h)$ increases with increasing lag, or separation distance
• A small non-zero value may exist at $h = 0$
• This limiting value is known as the nugget variance
• It results from various sources of unexplained error, such as measurement error or variability occurring at scales too small to characterize given the available data
• At large $h$, many variograms have another limiting value
• This limiting value is known as the sill
• Theoretically, it is equal to the variance of the data
• The value of $h$ where the sill occurs is known as the range

Variogram
• The most common function used in geostatistical studies to characterize spatial correlation is the variogram
• The variogram, $\gamma(h)$, is defined as one-half the variance of the difference between the sample values for all points separated by the distance $h$:
$\gamma(h) = \frac{1}{2} \mathrm{var}[Z(x) - Z(x+h)] = \frac{1}{2} E\{[Z(x) - Z(x+h)]^2\}$
where var [ ] indicates variance and E { } the expected value.

The estimator for the variogram is calculated from data using
$\hat{\gamma}(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} [Z(x_i) - Z(x_i + h)]^2$
where $N(h)$ is the total number of pairs of observations separated by a distance $h$.
Caution: variograms can be strongly affected by outliers in the data.

Variogram Model
• A variogram model is a mathematical description of the relationship between the variance and the separation distance (or lag), $h$
• There are four widely used equations

Isotropic Models
• Spherical model
• Linear model
• Exponential model
• Gaussian model

• Linear model: does not have a sill or range, and the variance is undefined
• Spherical model: precisely defined sill and range
[Figure labels: $C_0$, nugget; $a$, range; sill]

• Exponential model: $b \approx a/3$; the range parameter is 1/3 of the range for the spherical model
• Gaussian model: $b \approx a/3^{0.5}$; the range parameter is $1/\sqrt{3}$ of the range for the spherical model
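The four isotropic models can be written as short functions. The model equations themselves are not preserved in this transcript, so the forms below are the commonly used textbook versions and should be read as assumptions; `c0` is the nugget, `c1` the partial sill, `a` the spherical range, and `b` the range parameter of the exponential and Gaussian models. The numeric values are illustrative only.

```python
import math

def spherical(h, c0, c1, a):
    # Rises to the sill c0 + c1 exactly at h = a and stays there
    if h <= 0:
        return 0.0
    if h >= a:
        return c0 + c1
    r = h / a
    return c0 + c1 * (1.5 * r - 0.5 * r ** 3)

def linear(h, c0, slope):
    # No sill and no range: semivariance keeps increasing with lag
    return 0.0 if h <= 0 else c0 + slope * h

def exponential(h, c0, c1, b):
    # Approaches the sill asymptotically; about 95% of c1 is reached at h = 3b
    return 0.0 if h <= 0 else c0 + c1 * (1.0 - math.exp(-h / b))

def gaussian(h, c0, c1, b):
    # Parabolic near the origin; about 95% of c1 is reached at h = sqrt(3) * b
    return 0.0 if h <= 0 else c0 + c1 * (1.0 - math.exp(-(h / b) ** 2))

# Illustrative parameters (not fitted values)
c0, c1, a = 3.0, 13.0, 20.0
print(spherical(a, c0, c1, a))
print(exponential(a, c0, c1, a / 3.0))
print(gaussian(a, c0, c1, a / math.sqrt(3.0)))
```

Setting $b = a/3$ for the exponential model and $b = a/\sqrt{3}$ for the Gaussian model makes both reach about 95% of the partial sill at the lag where the spherical model reaches its sill, which is the relationship the slide summarizes.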

A variogram is constructed by
1. Calculating the squared differences for each pair of observations, $(x_j - x_k)^2$
2. Determining the distance between each pair of observations
3. Averaging the squared differences for those pairs of observations with the same separation distance
If observations are evenly spaced on a transect, separation distances are multiples of the smallest distance: h1 = 2 m; h2 = 4 m; h3 = 6 m; ...
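The three steps above can be sketched for an evenly spaced transect. This is a minimal illustration with a hypothetical function name and made-up values; the averaged squared difference is halved, following the definition of the variogram as one-half the variance of the differences.

```python
def transect_semivariogram(values, spacing, max_lag_steps):
    """Return (lag distance, number of pairs, semivariance) for each lag."""
    result = []
    for k in range(1, max_lag_steps + 1):
        # Steps 1 and 2: squared differences of all pairs that are k steps apart
        sq_diffs = [(values[i] - values[i + k]) ** 2
                    for i in range(len(values) - k)]
        if not sq_diffs:
            break  # no pairs left at this separation
        # Step 3: average the squared differences, then halve
        gamma = sum(sq_diffs) / (2 * len(sq_diffs))
        result.append((k * spacing, len(sq_diffs), gamma))
    return result

# Hypothetical observations every 2 m along a transect
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
for lag, pairs, gamma in transect_semivariogram(obs, 2.0, 3):
    print(lag, pairs, gamma)
```

The number of pairs shrinks as the lag grows, which is why estimates at the largest lags are the least reliable.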

When observations are placed on an irregular pattern, variograms are constructed by
• Assigning an appropriate lag interval
• A binning procedure
• Bins are created with interval centers at distances h1 = (1-2) m; h2 = (2-4) m; h3 = (4-6) m; ...
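A binning procedure for irregularly placed observations might look like the sketch below. The coordinates, values, and function name are hypothetical; the bin edges follow the intervals quoted above, and each bin is treated as half-open (lower edge included, upper edge excluded), which is an assumption.

```python
import math
from itertools import combinations

def binned_semivariogram(points, values, bin_edges):
    """Return (lower edge, upper edge, pair count, semivariance or None) per bin."""
    n_bins = len(bin_edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i, j in combinations(range(len(points)), 2):
        d = math.dist(points[i], points[j])  # separation distance of the pair
        for b in range(n_bins):
            if bin_edges[b] <= d < bin_edges[b + 1]:
                sums[b] += (values[i] - values[j]) ** 2
                counts[b] += 1
                break
    return [(bin_edges[b], bin_edges[b + 1], counts[b],
             sums[b] / (2 * counts[b]) if counts[b] else None)
            for b in range(n_bins)]

# Hypothetical sampling locations (m) and measured values
pts = [(0.0, 0.0), (1.5, 0.0), (3.0, 0.0), (0.0, 4.5)]
vals = [10.0, 12.0, 11.0, 15.0]
for row in binned_semivariogram(pts, vals, [1.0, 2.0, 4.0, 6.0]):
    print(row)
```

Wider bins collect more pairs and give smoother estimates, at the cost of blurring how the semivariance changes with distance.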

Important considerations when calculating a variogram:
• As the separation distance becomes too large, spurious results occur because fewer pairs of observations exist for large separations, owing to the finite boundary
• The width of the lag interval can affect the sample variogram, through the number of samples and the variation in the separation distances that fall into a particular lag interval
• Uncorrelated and correlated data show different nugget effects
• The number of data used influences the variogram

Before you start spatial analysis: check for normal distribution
• WSA: water stability of aggregates (%)
• sand: sand content (%)
• Ic: saturated hydraulic conductivity (cm/h)

Use of descriptive statistics: mean, median (middle value), skewness, etc.
[Table columns: Property, Sand, Silt, Clay, AWC, Ic. Table rows: Mean, Median, Std Error, Std Dev, Skewness, Minimum, Maximum. The values are not preserved.]
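The same set of summary statistics can be computed with the Python standard library. This is a sketch under assumptions: the function name is hypothetical, the skewness is the moment coefficient $m_3 / m_2^{1.5}$ (statistical packages may apply a small-sample correction), and the input is the N concentration example used earlier in this lecture, not the soil data of the table.

```python
import math
import statistics

def describe(values):
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)                    # sample standard deviation
    m2 = sum((v - mean) ** 2 for v in values) / n    # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n    # third central moment
    return {
        "mean": mean,
        "median": statistics.median(values),
        "std_error": sd / math.sqrt(n),
        "std_dev": sd,
        "skewness": m3 / m2 ** 1.5,  # > 0: longer right tail; < 0: longer left tail
        "minimum": min(values),
        "maximum": max(values),
    }

n_conc = [1.10, 1.11, 1.12, 1.13, 1.13, 1.14, 1.16,
          1.17, 1.19, 1.20, 1.23, 1.24, 1.25]
summary = describe(n_conc)
for name, value in summary.items():
    print(name, round(value, 4))
```

A mean noticeably larger than the median, together with a positive skewness, is the usual sign of a positively skewed property such as hydraulic conductivity.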

Plot the data to see the structure
[Figure: saturated hydraulic conductivity plotted against X and Y coordinates]

Example: estimator variance
[Figure: scatter plots of Z(x) against Z(x+h), Z(x+2h), and Z(x+3h); panel labels: Variance = 13.7, Variance = 15.5, Variance = 16.1]

[Figure panels: Sand Content; Saturated Hydraulic Conductivity]

Modeling of Variogram: Sand Content
• Spherical model: SS = (not given); Nugget = 0; Range = (not given) m; Sill = 16.0
• Spherical model: SS = (not given); Nugget = 3.04; Range = (not given) m; Sill = 16.0

Modeling of Variogram: Saturated Hydraulic Conductivity
• Spherical model: SS = (not given); Nugget = 0; Range = 19.8 m; Sill = (not given)
• Spherical model: SS = (not given); Nugget = (not given); Range = 19.8 m; Sill = (not given)

Parameters for spherical variogram model for soil properties
[Table columns: Parameters, Nugget, Range (m), Sill. Table rows: Sand content, Silt content (%), Clay content (%), Cumulative infiltration (cm), Steady state infiltration rate (cm/min), Available water content (cm). The values are not preserved.]

Spatial variability: nugget to total sill ratio (NSR)
• Lower NSR: higher spatial dependence
• NSR < 0.25: highly spatially variable
• NSR > 0.75: less spatially variable (Cambardella et al., 1994)
[Figure: nugget to total sill ratio for textural and water transmission properties; Shukla et al.]
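The NSR thresholds can be applied with a few lines of code. This is a sketch under assumptions: the function names and class labels are illustrative, the intermediate class between 0.25 and 0.75 is labeled "moderate", and the example treats the sand content sill of 16.0 with nugget 3.04 as the total sill.

```python
def nugget_sill_ratio(nugget, total_sill):
    # NSR = nugget / total sill, where total sill = nugget + partial sill
    if total_sill <= 0:
        raise ValueError("total sill must be positive")
    return nugget / total_sill

def spatial_dependence(nsr):
    # Thresholds as quoted on the slide
    if nsr < 0.25:
        return "strong"
    if nsr > 0.75:
        return "weak"
    return "moderate"

# Sand content example: nugget 3.04, sill 16.0 (assumed to be the total sill)
nsr = nugget_sill_ratio(3.04, 16.0)
print(round(nsr, 2), spatial_dependence(nsr))
```

A pure nugget variogram gives NSR = 1, meaning no spatial structure was resolved at the sampling spacing used.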