Published by Gaven Bainum. Modified about 1 year ago.

1
1 Case Example: Using a Stratified Sampling Design & Field XRF to Reduce the 95% UCL for Residential Soil Lead
Deana Crumbling, EPA/OSRTI/TIFSD, crumbling.deana@epa.gov, 703-603-0643
2009 EPA Annual Quality Conference

2
2 What things increase the interval between the sample mean & UCL?
- High variability in data set
- Data set is from a non-normal or non-parametric distribution
- Small number of physical samples in the statistical sample
What creates high data variability?
- True changes in matrix concentrations across space
- Inadequate soil sample homogenization
- Artifact caused by small analytical subsample mass

3
3 Variability as an artifact of small analytical sample mass As analytical sample volumes increase, data variability decreases & distribution goes from lognormal to normal (assumes whole sample is measured)

4
4 Reduce the UCL by addressing:
- Variability artifacts
- Non-normal statistical distributions
- Small number of physical samples in the statistical sample
- High variability due to true variation
By procedures that support:
- Sample homogenization
- Increased sample mass
- True changes in matrix concentrations across space
- Physical manipulation of sample, increase volume (MIS) and/or sufficient replicate analyses

5
5 Can anything be done about true spatial variations in concentration?
(Statistical) Stratified Sampling Design
Purpose: determine the overall mean & UCL for a decision unit (DU) when different sections of the DU have different means & standard deviations (SDs).
References:
- "Methods for Evaluating the Attainment of Cleanup Standards, Volume 1: Soils and Solid Media", 1989, section 6.4. http://www.cluin.org/download/stats/vol1soils.pdf
- Guidance on Choosing a Sampling Design for Environmental Data Collection (EPA QA/G-5S), 2002, Chap 6. http://www.epa.gov/quality/qs-docs/g5s-final.pdf
- Data Quality Assessment: Statistical Methods for Practitioners (EPA QA/G-9S), 2006, section 3.2.1.3. http://www.epa.gov/quality/qs-docs/g9s-final.pdf

6
6 What Makes a Stratified Design Different?
To calculate the average over the entire area, routine practice is that data go straight into a database, and then…
Sample results (ppm): 16, 22, 20, 18, 15, 21, 25, 120, 184, 155, 1100, 1040
Sum(all) = 2736; then 2736 ÷ 12 = 228 ppm
"Dividing by 12" assumes equal weight is given to each sample (1/12th of total area)
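The routine calculation on this slide can be checked in a few lines of Python. This is a minimal sketch: the twelve values are the sample results listed on the slide, and the variable names are illustrative only.

```python
# Twelve sample results (ppm) as listed on the slide.
samples_ppm = [16, 22, 20, 18, 15, 21, 25, 120, 184, 155, 1100, 1040]

# Routine practice: every sample gets the same weight, 1/n of the total area.
total = sum(samples_ppm)
routine_mean = total / len(samples_ppm)

print(total)         # 2736
print(routine_mean)  # 228.0
```

Because every value is weighted 1/12, the two highest results (2 of 12 samples) contribute 2140 of the 2736 total.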

7
7 But the CSM supports partitioning the site into 3 distinct portions based on similar populations
Low stratum (75% of area): 16, 22, 20, 18, 15, 21, 25; ave = 20
Mid stratum (20% of area): 120, 184, 155; ave = 153
High stratum (5% of area): 1100, 1040; ave = 1070
Area-weighted mean: 20(0.75) + 153(0.20) + 1070(0.05) = 99 ppm
A spatially weighted mean makes a difference!
Area | High | Mid | Low | Routine | Stratified
Mean | 1070 | 153 | 20 | 228 | 99
SD | 42 | 32 | 4 | 398 | 80
95% UCL | | | | 434 (Δ=206) | 143 (Δ=44)
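The per-stratum statistics, the area-weighted mean, and the routine SD and UCL in the table can be reproduced as sketched below. Assumptions to note: the t value for 11 degrees of freedom is hard-coded from a standard table, and all names are illustrative. The stratified SD (80) and UCL (143) are not recomputed here because the slide does not state the formula behind them.

```python
import math
import statistics

# Sample results (ppm) grouped by stratum, with each stratum's share of the DU area.
strata = {
    "low":  {"area_fraction": 0.75, "ppm": [16, 22, 20, 18, 15, 21, 25]},
    "mid":  {"area_fraction": 0.20, "ppm": [120, 184, 155]},
    "high": {"area_fraction": 0.05, "ppm": [1100, 1040]},
}

# Per-stratum mean and sample SD, rounded to whole ppm as on the slide.
stratum_mean = {k: round(statistics.mean(v["ppm"])) for k, v in strata.items()}
stratum_sd = {k: round(statistics.stdev(v["ppm"])) for k, v in strata.items()}

# Area-weighted (stratified) mean: sum of area fraction x stratum mean.
weighted_mean = sum(v["area_fraction"] * stratum_mean[k] for k, v in strata.items())

# Routine calculation for comparison: pool everything with equal weights.
pooled = [x for v in strata.values() for x in v["ppm"]]
routine_mean = statistics.mean(pooled)
routine_sd = statistics.stdev(pooled)

# One-sided 95% UCL on the routine mean using Student's t.
# T_95_DF11 is the tabulated value for 11 degrees of freedom (an assumption
# about which critical value was used; it is not stated on the slide).
T_95_DF11 = 1.796
routine_ucl = routine_mean + T_95_DF11 * routine_sd / math.sqrt(len(pooled))

print(round(weighted_mean), round(routine_sd), round(routine_ucl))
```

With these inputs the rounded stratum means are 20, 153 and 1070, and the weighted mean rounds to 99 ppm, matching the slide.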

8
8 Basic Principles of a Stratified Sampling Design
- The CSM is the basis for defining both the DU & its strata
- Decision Unit (DU) = a unit for which a decision is made: a single drum, a batch of drums, risk exposure unit, remediation unit, etc.
- The DU is the volume & dimensions over which an average conc is desired
- Strata are created by different release or transport mechanisms, which cause different contaminant patterns within the DU
- Target properties like conc level & variability differ from stratum to stratum w/in the DU

9
9 Basic Principles (cont'd)
- DU is delineated (stratified) into non-overlapping subsections according to the CSM
- Each stratum's area/volume is recorded as a fraction of the DU's area/volume
- Each stratum's conc mean & SD are determined
- The means & SDs are weighted and mathematically combined to give the overall mean & UCL for the DU
- Can apply stratification to data analysis even if not planned into sampling, but must have spatial info & final CSM available
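In symbols, the weighting step is usually written as below. This is the standard textbook estimator for a stratified mean, offered as a sketch; the slides do not state the exact formula used. Assumed notation (not from the slides): W_h is the fraction of the DU's area or volume in stratum h, n_h is the number of samples in that stratum, x̄_h and s_h are the stratum mean and SD, and L is the number of strata.

```latex
\begin{align}
\bar{x}_{st} &= \sum_{h=1}^{L} W_h \,\bar{x}_h, \qquad \sum_{h=1}^{L} W_h = 1 \\
\mathrm{SE}(\bar{x}_{st}) &= \sqrt{\sum_{h=1}^{L} \frac{W_h^{2}\, s_h^{2}}{n_h}} \\
\mathrm{UCL}_{95} &= \bar{x}_{st} + t_{0.95,\,\nu}\;\mathrm{SE}(\bar{x}_{st}) \\
\nu &\approx \frac{\left(\sum_{h=1}^{L} W_h^{2} s_h^{2}/n_h\right)^{2}}
                 {\sum_{h=1}^{L} \dfrac{\left(W_h^{2} s_h^{2}/n_h\right)^{2}}{n_h - 1}}
\end{align}
```

The last line is the Satterthwaite approximation for the degrees of freedom, one common choice when the strata have unequal SDs and sample counts.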

10
10 Benefits of a Stratified Sampling Design
- Small areas of very high or low conc do not bias the overall mean of the DU.
- Reduces variability (SD) in the DU data set
- Reduces statistical uncertainty (as distance between mean & UCL)
- Preserves spatial information to identify source/transport mechanisms & support remedial design.

11
11 Case Example: XRF with stratified sampling design
- Properties in old town near Pb battery recycling plant
- XRF Pb data from bagged soil samples (~300 gram)
[Photo: plastic bag of soil]

12
12 Decision Goals
- Resolve confusion over past conflicting data.
- Determine mean (95% UCL) for exposure unit (entire yard): 500 ppm risk-based A/L; if over, clean up high contamination areas
- Pb source? Suggested by spatial contaminant pattern (does facility have liability?)
Data Collection Design
- Property divided into 3 sections (strata):
  - Front yard (likely "same" conc within & own SD)
  - Side yard (ditto)
  - Back yard (ditto)
- Each stratum divided into 5 ~equal subsections (sample units)
- 1 grab (or MIS) sample (300-400 g) into plastic bag from each sample unit
- 5 sample units/stratum or 15 sample units/DU (the EU)

13
13 Preliminary CSM of Simplified Property
[Figure: site sketch showing the house footprint and three strata: Back Yard (5 samples), Front Yard (5 samples), Side Yard (5 bagged samples)]
Area fractions shown on the figure: 0.60, 0.25, 0.15
Action Level (entire yard) = 500 ppm
Stratum annotations:
- Potential release: Traffic (facility truck, Pb gasoline); Pb house paint; facility's atmospheric deposition; combination. Expected Pb conc: Higher.
- Potential release: Pb paint; atmos dep. Pb conc: Uncertain (near road, house?)
- Potential release: Pb paint (near structures); atmos dep. Expected Pb conc: Lower.

14
14 XRF Bag Analysis
- Four 30-sec XRF readings on each bag (2 on front & 2 on back)
- Results entered real-time into pre-programmed spreadsheet
- Spreadsheet immediately calculates:
  1. ave & SD for each bag
  2. ave & SD within each stratum (yard section)
  3. ave & UCL for the decision unit (entire property)
  4. the greater of within-bag vs. between-bag variability
- IF statistical uncertainty interferes w/ desired decision confidence for DU:
  - Use #4 & a series of decision trees to reduce statistical uncertainty until a confident decision is possible
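The spreadsheet logic described above might look roughly like the following. This is a hedged sketch, not the project's actual spreadsheet: the readings are invented for illustration, the assignment of area fractions to yards is an assumption, and the UCL part of step 3 is left out because the slides do not give the formula used.

```python
import statistics

# Hypothetical XRF readings (ppm): 4 readings per bag, bags grouped by stratum.
readings = {
    "front": [[700, 720, 680, 700], [900, 880, 920, 900]],
    "side":  [[200, 220, 180, 200], [300, 280, 320, 300]],
    "back":  [[50, 60, 40, 50], [70, 80, 60, 70]],
}
# Hypothetical assignment of area fractions to strata; must sum to 1.
area_fraction = {"front": 0.15, "side": 0.25, "back": 0.60}

def bag_stats(bag):
    """Step 1: average and SD of the replicate readings on one bag."""
    return statistics.mean(bag), statistics.stdev(bag)

def stratum_stats(bags):
    """Step 2: average and SD across the bag averages within one stratum."""
    bag_means = [bag_stats(b)[0] for b in bags]
    return statistics.mean(bag_means), statistics.stdev(bag_means)

def du_mean(readings, area_fraction):
    """Step 3 (mean only): area-weighted average over the strata."""
    return sum(area_fraction[s] * stratum_stats(bags)[0]
               for s, bags in readings.items())

def larger_variability(bags):
    """Step 4: compare the typical within-bag SD with the between-bag SD."""
    within = statistics.mean([bag_stats(b)[1] for b in bags])
    between = stratum_stats(bags)[1]
    label = "within-bag" if within > between else "between-bag"
    return label, within, between

print(du_mean(readings, area_fraction))
print(larger_variability(readings["front"])[0])
```

In this made-up data set the bags differ from each other far more than the readings on any one bag, so step 4 would point toward collecting more bags rather than more readings per bag.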

15
15 Minimizing Variability Improves Statistical Confidence in EPCs
Strategy & Results for Example Yard | Mean (XRF) | 95UCL (1/2 CI width)
Uncontrolled micro-scale (within-bag) variability (single analysis) & routine calc | 476 | 647 (171)
Control within-bag variability (replicates); still use routine EPC calculation | 453 | 607 (154)
Stratified sampling & data analysis on preliminary CSM | 192 | 227 (35)
Stratified sampling & data analysis on mature CSM | 199 | 231 (32)
NOTE: "Routine" calculation applies same weighting to data points & database loses their spatial representativeness
Note: ½ CI width = mean-to-UCL width

16
16 Data Used to Mature the CSM
- Preliminary CSM: an informed hypothesis about strata boundaries
- Mature CSM: data confirm or modify the hypothesis about strata boundaries
[Figure: site sketches around the house footprint for the preliminary and mature CSM]

17
17 Progressive Data Uncertainty Management
Unit | Value (ppm) | CV | 95% UCL (Mean-to-UCL width)
1 XRF reading on 1 Front yard (FY) bag (instrument-reported error) | 750 | 0.041 (XRF instrument only) | 801* (51*)
1 Bag (4 XRF readings on same bag) | 789 | 0.087 (+ micro-scale) | 870 (81)
Immature CSM, FY section only (10 bag samples) | 771 | 0.30 (+ short-scale) | 907 (136)
Mature CSM, revised FY section only (7 bag samples) | 900 | 0.12 (CSM ↓) | 977 (77)
Combine w/ Side & Back sections: mature CSM, entire yard (area-wt'd) | 199 | 0.49 (+ long-scale over property) | 231 (32); stratification & ↑ n → ↓ width
* Normal z-distribution used for the XRF instrument's counting statistics; the rest of the rows use the t-distribution
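The first four rows of the table appear consistent with a one-sided 95% UCL computed from the mean, the CV and the sample count. The sketch below is an assumption about that calculation, not the author's spreadsheet: `ucl95` is a hypothetical helper, the critical values are hard-coded from standard z and t tables, and the area-weighted last row is not attempted.

```python
import math

# Tabulated one-sided 95% critical values (hard-coded from standard tables).
Z_95 = 1.645
T_95 = {3: 2.353, 6: 1.943, 9: 1.833}  # keyed by degrees of freedom, n - 1

def ucl95(mean, cv, n=1, use_z=False):
    """Return (UCL, mean-to-UCL width) with SD = CV * mean.

    UCL = mean + critical value * SD / sqrt(n).
    """
    sd = cv * mean
    crit = Z_95 if use_z else T_95[n - 1]
    half_width = crit * sd / math.sqrt(n)
    return mean + half_width, half_width

single_reading = ucl95(750, 0.041, use_z=True)  # instrument counting error only
one_bag = ucl95(789, 0.087, n=4)                # 4 readings on one bag
immature_fy = ucl95(771, 0.30, n=10)            # 10 bag samples
mature_fy = ucl95(900, 0.12, n=7)               # 7 bag samples

for ucl, width in (single_reading, one_bag, immature_fy, mature_fy):
    print(round(ucl), round(width))
```

Under these assumptions the first two rows reproduce to the nearest ppm (801 and 870). The third and fourth land within a few ppm of the table values, which is about what two-digit rounding of the CV would allow.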

18
18 There is the question of XRF-ICP data comparability for this project, but no time in this talk to cover it. Bottom line: adjustment of the XRF data set to be more comparable to the ICP data set was needed; however, it did not change the compliance decision for this property.

19
19 There is the Question of XRF-ICP Data Comparability
- No time here for details about XRF-ICP data comparability. But there was a problem.
- XRF was significantly biased LOW compared to ICP.
- Investigation found that the plastic bags used decreased the XRF signal.
- When plastic interference combined with soil moisture in the 15-20% range, the XRF data needed adjustment to be more comparable to the ICP data.
- Adjusted XRF data were usable for decisions.

20
20 How Do They Compare? Notice strong upward deviation from the ideal regression line (i.e., slope > 1). Indicates that ICP results are consistently higher than XRF results

21
21 Adjusting XRF Data to Make More Comparable to ICP
- There was a consistent, statistically significant bias between XRF & ICP data.
- The mathematical relationship was consistent enough to adjust the XRF results using the ICP vs. XRF regression eqn.
- Strategy: use the XRF data (x) to predict what ICP results (y) would be
- Don't need to adjust every XRF data point. Can adjust the XRF means & UCLs directly. (Note: DO NOT adjust SDs!)
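A short sketch of why adjusting the means directly is enough, assuming the regression is a straight line. The slides do not give the regression coefficients, so `SLOPE` and `INTERCEPT` below are placeholders, and the readings are invented. The comment on SDs is one plausible reading of the warning, not the author's stated rationale.

```python
import statistics

# Placeholder coefficients for ICP = SLOPE * XRF + INTERCEPT (not from the slides).
SLOPE = 1.3
INTERCEPT = 20.0

def adjust(xrf_value):
    """Predict the ICP-equivalent value from an XRF mean, UCL, or reading."""
    return SLOPE * xrf_value + INTERCEPT

xrf = [150.0, 200.0, 250.0]  # hypothetical XRF results (ppm)
adjusted = [adjust(x) for x in xrf]

# Adjusting the mean directly matches the mean of the adjusted readings.
mean_direct = adjust(statistics.mean(xrf))
mean_pointwise = statistics.mean(adjusted)

# An SD is a spread, not a concentration: under a linear equation it scales by
# the slope only, so pushing it through adjust() adds the intercept in error.
sd_pointwise = statistics.stdev(adjusted)
sd_wrong = adjust(statistics.stdev(xrf))

print(mean_direct, mean_pointwise)
print(sd_pointwise, sd_wrong)
```

With these placeholder numbers both routes give an adjusted mean of 280 ppm, while the SD of the adjusted readings (65) differs from the SD pushed through the equation (85).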

22
22 Previous summary table w/ adjustment for XRF bias
Strategy | Mean (XRF) | Mean (Adj) | 95UCL (XRF) | 95UCL (Adj)
Uncontrolled var. & traditional calc | 476 | 640 | 647 | 862
Controlled var.; traditional calc | 453 | 610 | 607 | 810
Stratified on prelim CSM | 196 | 275 | 237 | 329
Stratified on mature CSM | 199 | 279 | 231 | 322
Decision: The Pb conc for this property is compliant with the 500 ppm risk benchmark.

23
23 Questions?
Deana M. Crumbling, M.S.
U.S. EPA, Office of Superfund Remediation & Technology Innovation
1200 Pennsylvania Ave., NW (5203P), Washington, DC 20460
PH: (703) 603-0643
crumbling.deana@epa.gov
www.triadcentral.org
