Contaminated lake sediment can destroy an ecosystem through bioaccumulation; it harms not only sediment-dwelling organisms, but also the fish that depend on them.


Methods for Left-Censored Data Analysis of Lake Sediment Pollution
Daniel Able, Andrew Lithio, Piamou Liu, Bassirou Sarr
Advisors: Julie Legler, Laura Chihara

Introduction

Contaminated lake sediment can destroy an ecosystem through bioaccumulation; it harms not only sediment-dwelling organisms, but also the fish that depend on them.

Data: Concentrations of metals in sediment from a random sample of 50 Minnesota lakes, collected by the MPCA and US EPA.

It is common practice in environmental statistics to substitute one-half the detection limit for censored data. This practice:
- has no statistical justification
- can give biased results

Goal: Compare approaches for analyzing left-censored metal concentration data for lake sediment samples.

Of the 19 metals analysed:
- 10 had no censoring
- 4 were less than 20% censored
- 3 were 50-70% censored
- 2 were 100% censored

Approaches

Simple Substitution
Substitutes some fraction of the detection limit (usually one-half in environmental statistics) for all censored data. Uses the now fully numerical data to estimate summary statistics, perform hypothesis tests, etc.

Kaplan-Meier
Borrowed from survival analysis, where it is usually used for right-censored data. Non-parametric; rank-based.

Regression on Order Statistics
Performs a linear regression of the data against the normal quantiles of the plotting positions, and uses the regression line to predict censored values.

Methods

Kaplan-Meier
Incremental Probability: conditional; given that a concentration is less than or equal to a certain value, what is the probability that the concentration is less than that value?
Survival Probability: the product of all Incremental Probabilities of concentrations greater than or equal to a given concentration; gives the probability that a concentration is less than or equal to a given value.

Concentrations are plotted along the x-axis; the survival probability forms the y-axis and represents the probability that a given concentration is less than or equal to a certain value. The estimated median of the data is the x-value where y = 0.5; the estimated mean is the area under the curve.

[Figure: Kaplan-Meier curve of Cadmium]

Robust Regression on Order Statistics
- If the data follow the assumed distribution, the probability plot will be linear.
- Quantiles with censored values can be predicted using the regression line.
- Mean, median, etc. are then estimated using the predicted and actual data.

[Figure: The probability plot of Cadmium, with its linear regression]

Simulation Results

We performed simulations to judge the performance of Kaplan-Meier (KM) and ROS. Data were generated from distributions with known means and medians, and varying levels of censoring were set. "Bias" was used to measure accuracy.

1st simulation: Random samples from a normal distribution. ROS was more accurate at all censoring levels.
2nd simulation: Random samples from a lognormal distribution. Results were similar up to about 30% censoring; beyond that, KM becomes significantly less accurate.
3rd simulation: Bootstrap sampling from metals with less than 20% censoring. Results for KM and ROS were similar up to about 40% censoring; beyond that, KM becomes less accurate. KM was also more variable at all censoring levels.
4th simulation: Metals with originally no censoring were artificially censored at various censoring levels. ROS had lower bias for nearly all observations.

Conclusions

While simple substitution performs better than we would expect, there are more accurate and justifiable ways of treating censored data. Robust Regression on Order Statistics shows less bias in our simulations. Kaplan-Meier remains a justifiable method to employ, even compared to ROS, because it avoids assuming a distribution and fabricating data. Based on existing literature and our own research, we recommend using Kaplan-Meier when there is less than 30% censoring, and robust ROS for 30-80% censoring. Bias increases greatly above 80% censoring, and medians usually cannot be estimated accurately.

References & Acknowledgements

R Development Core Team (2008). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, http://www.R-project.org.
Helsel, Dennis R. Nondetects and Data Analysis: Statistics for Censored Environmental Data. John Wiley and Sons: New York, 2005.

Thanks to Judy Crane and Steve Hennes of the MPCA for the opportunity and support.
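The Kaplan-Meier calculation described in the transcript can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the tie convention (a nondetect reported as "<DL" is treated as lying below a detect equal to DL), and the example concentrations are hypothetical, and this is not the authors' R code or the MPCA data.

```python
import numpy as np

def km_left_censored(values, censored):
    """Kaplan-Meier estimate for left-censored data.

    values:   detected concentrations, or the detection limit for nondetects
    censored: True where the observation is a nondetect (< detection limit)
    Returns (detects, cdf, below) where cdf[i] estimates P(X <= detects[i])
    and below estimates P(X < smallest detected value).
    """
    values = np.asarray(values, dtype=float)
    censored = np.asarray(censored, dtype=bool)
    detects = np.unique(values[~censored])  # distinct detected values, ascending
    cdf = np.ones(len(detects))
    running = 1.0
    # Walk from the largest detected value downward, multiplying the
    # incremental probabilities (n - d) / n as we go.
    for i in range(len(detects) - 1, -1, -1):
        x = detects[i]
        cdf[i] = running                       # product over detects greater than x
        n = np.sum(values <= x)                # detects and nondetects at or below x
        d = np.sum((values == x) & ~censored)  # detects exactly at x
        running *= (n - d) / n
    return detects, cdf, running

def km_median(detects, cdf, below):
    """Smallest detected value whose estimated P(X <= x) reaches 0.5."""
    if below > 0.5:
        return float("nan")  # median falls below the detected range
    return float(detects[np.searchsorted(cdf, 0.5, side="left")])

# Hypothetical sample: two nondetects at a detection limit of 0.5.
conc = [0.5, 0.5, 1.0, 1.2, 2.0, 3.0, 4.0]
is_censored = [True, True, False, False, False, False, False]

x, cdf, below = km_left_censored(conc, is_censored)
median = km_median(x, cdf, below)
print(x, cdf, below, median)
```

With a single detection limit below every detect, as in this toy sample, the estimate coincides with the ordinary empirical distribution; the method only departs from it when nondetects are interleaved with detected values.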
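The regression-on-order-statistics steps listed in the transcript can likewise be sketched in a simplified form. The sketch below assumes a single detection limit lying below every detected value, Weibull plotting positions i/(n+1), and a regression on log concentrations (a lognormal assumption); the function name `ros_single_limit` is hypothetical. The multiple-detection-limit plotting positions used in full robust ROS are not covered.

```python
import numpy as np
from statistics import NormalDist

def ros_single_limit(detects, n_censored):
    """Simplified ROS for one detection limit (illustrative assumption).

    detects:    detected concentrations (all above the detection limit)
    n_censored: number of nondetects, which occupy the lowest ranks
    Returns the sample with nondetects replaced by regression predictions.
    """
    detects = np.sort(np.asarray(detects, dtype=float))
    n = len(detects) + n_censored
    # Weibull plotting positions for ranks 1..n and their normal quantiles.
    positions = np.arange(1, n + 1) / (n + 1)
    z = np.array([NormalDist().inv_cdf(p) for p in positions])
    z_censored, z_detected = z[:n_censored], z[n_censored:]
    # Fit log(concentration) against normal quantiles using detects only.
    slope, intercept = np.polyfit(z_detected, np.log(detects), 1)
    # Predict the censored observations from the fitted line.
    imputed = np.exp(intercept + slope * z_censored)
    return np.concatenate([imputed, detects])

# Sanity check on synthetic values that lie exactly on a lognormal line:
# censoring the three lowest values and refitting should recover them.
z_all = np.array([NormalDist().inv_cdf(i / 11) for i in range(1, 11)])
full = np.exp(0.2 + 0.5 * z_all)
completed = ros_single_limit(full[3:], n_censored=3)

print(np.mean(completed), np.median(completed))
```

The predicted values are used only collectively, to estimate summary statistics such as the mean and median; they are not estimates of the individual censored observations.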

