A. H. El-Shaarawi National Water Research Institute and McMaster University Southern Ontario Statistics, Graduate Student Seminar Days, 2006 McMaster University.

Presentation transcript:

A. H. El-Shaarawi National Water Research Institute and McMaster University Southern Ontario Statistics, Graduate Student Seminar Days, 2006 McMaster University May 12, 2006

Outline

What is statistical science? A coherent system of knowledge that has its own methods and areas of application. The success of the methods is measured by their universal acceptability and by the breadth of the scope of their applications. Statistics has broad applications (to almost all human activities, including science and technology). Environmental problems are complex and subject to many sources of uncertainty, and thus statistics will have a greater role to play in furthering the understanding of environmental problems. The word “ENVIRONMETRICS” refers in part to Environmental Statistics.

What are the Sources of the foundations? Concepts and abstraction. Schematization == Models. Models and reality (deficiency in theory leads to revision of models).

What are the Tools? Philosophy “different schools of statistical inference”. Mathematics. Science and technology.

How to become a successful statistician? Continue to upgrade your statistical knowledge. Improve your ability to perform statistical computation. Be knowledgeable in your area of application. Understand the objectives and scope of the problem in which you are involved. Read about the problem and discuss it with experts in relevant fields. Learn the art of oral and written communication. The message communicated depends on the interests of those for whom it is intended.

Environmental Problem. Tools for: Data Acquisition, Analysis & Interpretation, Modeling, Model Assessment, Trend Analysis, Regulations, Improving Sampling Network, Estimation of Loading, Spatial & Temporal Change. E Canada, H Canada, DFO, INAC, Provincial, EPA, International. Hazards, Exposure, Control.

Data Acquisition, Data Analysis, Empirical Models, Process Models, Information, Prior Information

Modeling Data: Time, Space, Seasonal, Trend, Input-output, Network, Error, + Covariates

Measurements: Input, System, Output. Desirable qualities of measurements: effects related; easy and inexpensive; rapid; responsive and more informative (high statistical power).

Burlington Beach

Designing Sampling Program for Recreational Water (EC, EPA). Sampling grid for bathing beach water quality. Setting the regulatory limits: select the indicators; determine the indicator-illness association; select the indicator levels that correspond to an acceptable risk level. Sampling Problems.

Sampling Designs Model based Design based Examples of sampling designs 1. Simple random sampling 2. Composite sampling 3. Ranked set sampling

Composite Sampling

Efficiency of Composite Sampling

Efficiency for estimating the mean and variance of the distribution. Number of composite samples = m. Number of sub-samples in a single composite sample = k. Properties of the estimator of the variance: 1. It is an unbiased estimator of the variance regardless of the values taken by k and […]. 2. The variance of this estimator is given by […]. This expression shows that for: (a) […], composite sampling improves the efficiency of the estimator regardless of the value of k, and in this case the maximum efficiency is obtained for k = 1, which corresponds to discrete sampling; (b) […], the efficiency of composite sampling depends only on m and is completely independent of k; (c) […], composite sampling results in higher variance, and for fixed m the variance is maximized when k = 1. It should be noted that the frequently used models to represent bacterial counts belong to case (c) above. This implies that the efficiency declines by composite sampling and maximum efficiency occurs when k = 1. Case (b) corresponds to the normal distribution, where the efficiency is completely independent of the number of discrete samples included in the composite sample.
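The two properties that can be read off the slide unambiguously (unbiasedness for any k, and independence from k under normality) can be checked by simulation. The sketch below is only an illustration under stated assumptions: it assumes each composite measurement is the arithmetic mean of its k sub-samples and that the variance estimator is k times the sample variance of the m composite values. Neither definition survives in the transcript, and the function names, seed, and parameter values are hypothetical.

```python
import random

def sample_variance(values):
    """Unbiased sample variance (divisor n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)

def composite_variance_estimates(draw, m, k, reps, rng):
    """Simulate `reps` values of the assumed estimator k * S^2,
    where S^2 is the sample variance of m composite means."""
    estimates = []
    for _ in range(reps):
        composites = [sum(draw(rng) for _ in range(k)) / k for _ in range(m)]
        estimates.append(k * sample_variance(composites))
    return estimates

rng = random.Random(2006)          # fixed seed so the run is repeatable
sigma = 2.0                        # assumed true standard deviation
m, reps = 10, 4000                 # composites per survey, simulated surveys

def draw_normal(r):
    return r.gauss(0.0, sigma)

summary = {}
for k in (1, 2, 5):
    est = composite_variance_estimates(draw_normal, m, k, reps, rng)
    # (mean of the estimator, variance of the estimator)
    summary[k] = (sum(est) / reps, sample_variance(est))
    print(k, summary[k])
```

Under normality the simulated mean should stay close to sigma squared and the simulated variance close to 2 sigma^4 / (m - 1) for every k. Replacing `draw_normal` with a skewed count model would be the way to explore the other cases.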

Health Survey

The effects of exposure to contaminated water

Surface water quality criteria (CFU/100 mL) proposed by EPA for primary contact recreational use. Columns: Water, Indicator, Geometric Mean, Single Sample Maximum. Marine: Enterococci, 35, 104. Fresh: Enterococci; E. coli. Based on not less than 5 samples equally spaced over a 30-day period. The selection of: indicators; summary statistics, number of samples and the reporting period; control limits.
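As an illustration of how the two criteria operate together, the sketch below checks a set of counts against the marine enterococci limits quoted on the slide (geometric mean 35, single sample maximum 104). The function name, the example counts, and the choice to treat a value equal to the limit as compliant are assumptions made for the example, not part of the EPA text.

```python
import statistics

GM_LIMIT = 35.0     # geometric mean limit, CFU/100 mL (marine enterococci)
SSM_LIMIT = 104.0   # single sample maximum, CFU/100 mL (marine enterococci)

def check_compliance(counts, gm_limit=GM_LIMIT, ssm_limit=SSM_LIMIT, min_samples=5):
    """Evaluate both rules for one reporting period.

    `counts` must be strictly positive, because the geometric mean is
    undefined for zeros; how zero counts are handled is left open here.
    """
    if len(counts) < min_samples:
        raise ValueError("at least %d samples are required" % min_samples)
    gm = statistics.geometric_mean(counts)
    return {
        "geometric_mean": gm,
        "gm_ok": gm <= gm_limit,             # mean rule
        "ssm_ok": max(counts) <= ssm_limit,  # single sample rule
    }

clean = check_compliance([10, 20, 30, 40, 50])
spike = check_compliance([10, 20, 30, 40, 120])
print(clean)
print(spike)
```

The second data set passes the mean rule but fails the single sample rule, which is the kind of disagreement the later slides quantify.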

Approximate expression for probability of compliance with the regulations

Sample size n=5 and 10 # of simulations =10000
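A simulation of this kind can be sketched as follows. The sketch assumes that counts are lognormal, so their logarithms are normal with mean log(true geometric mean) and standard deviation `log_sd`. That distributional choice, the parameter values, and the function names are assumptions for illustration, and the closed-form values are the exact probabilities under that assumption rather than the approximate expression from the talk.

```python
import math
import random
from statistics import NormalDist

GM_LIMIT, SSM_LIMIT = 35.0, 104.0   # marine enterococci limits from the slide

def simulate_compliance(true_gm, log_sd, n, n_sim=10000, seed=1):
    """Monte Carlo probability of passing each rule with n samples."""
    rng = random.Random(seed)
    mu = math.log(true_gm)
    log_gm, log_ssm = math.log(GM_LIMIT), math.log(SSM_LIMIT)
    gm_pass = ssm_pass = 0
    for _ in range(n_sim):
        logs = [rng.gauss(mu, log_sd) for _ in range(n)]
        gm_pass += sum(logs) / n <= log_gm    # geometric mean rule
        ssm_pass += max(logs) <= log_ssm      # single sample rule
    return gm_pass / n_sim, ssm_pass / n_sim

def exact_compliance(true_gm, log_sd, n):
    """Closed-form probabilities under the same lognormal assumption."""
    z = NormalDist()
    mu = math.log(true_gm)
    p_gm = z.cdf(math.sqrt(n) * (math.log(GM_LIMIT) - mu) / log_sd)
    p_ssm = z.cdf((math.log(SSM_LIMIT) - mu) / log_sd) ** n
    return p_gm, p_ssm

sim_gm, sim_ssm = simulate_compliance(true_gm=35.0, log_sd=0.7, n=5)
exact_gm, exact_ssm = exact_compliance(true_gm=35.0, log_sd=0.7, n=5)
# Ratio of single sample rejection probability to that of the mean rule
rejection_ratio = (1 - sim_ssm) / (1 - sim_gm)
print(sim_gm, exact_gm, sim_ssm, exact_ssm, rejection_ratio)
```

Rerunning with n = 10 or 20 and a range of `true_gm` values gives the kind of comparison between the two rules that the slides summarize.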

Ratio of single sample rejection probability to that of the mean rule (n = 5,10 and 20)

The fish (trout) contamination data: 1. Lake Ontario (n = 171); Lake Superior (n = 61). 2. Measurements (total PCBs in whole fish, age, weight, length, % fat). Fish were collected from several locations (representative of the population in the lake because the fish move all over the lake).

Let x(t) be a random variable representing the contaminant level in a fish at age t. The expected value of x(t) is frequently represented by the expression E[x(t)] = b{1 − exp(−λt)}, where b is the asymptotic accumulation level and λ is the growth parameter. Note that 1 − exp(−λt) is the cdf of the exponential distribution E(λ), and so an immediate generalization of this is E[x(t)] = b F(t; λ), where F is a cdf with density f. The expected instantaneous accumulation rate is f(t; λ)/F(t; λ). One possible extension is to use the Weibull cdf.
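The accumulation curve and its Weibull extension can be sketched numerically. The Weibull parameterization used here, F(t) = 1 − exp(−(λt)^shape), and the parameter values are assumptions, since the slide does not show which form was used. With shape = 1 it reduces to the exponential case.

```python
import math

def exponential_cdf(t, lam):
    """cdf of the exponential distribution with rate lam."""
    return 1.0 - math.exp(-lam * t)

def weibull_cdf(t, lam, shape):
    """Weibull cdf in the (assumed) form 1 - exp(-(lam * t) ** shape)."""
    return 1.0 - math.exp(-((lam * t) ** shape))

def expected_level(t, b, cdf, **params):
    """Expected contaminant level at age t: b times a cdf evaluated at t."""
    return b * cdf(t, **params)

b, lam = 5.0, 0.3   # hypothetical asymptote and growth parameter
for age in (0, 2, 5, 10, 20):
    print(age,
          round(expected_level(age, b, exponential_cdf, lam=lam), 3),
          round(expected_level(age, b, weibull_cdf, lam=lam, shape=1.5), 3))
```

Both curves start at zero and approach b; the shape parameter controls how quickly accumulation takes off at young ages.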

Modeling: Consider a continuous-time system with a stochastic perturbation, dx/dt = b(x) + σ(x) ξ, with initial condition x(0) = x0, where b(x) is a given function of x and t, σ(x) is the amplitude of the perturbation, and ξ = dw/dt is a white noise, assumed to be the time derivative of a Wiener process. Examples for σ(x) = 0: 1. b(x) = −λx, giving μ(x) = μ(0) exp(−λt) (pure decay). 2. b(x) = λ{μ(0) − μ(x)}, giving μ(x) = μ(0){1 − exp(−λt)} (Bertalanffy equation). When σ(x) > 0, a complete description of the process requires finding the pdf f(t, x) and its moments given f(0, x).
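One way to get a feel for such a system is to simulate it. The sketch below uses the Euler-Maruyama scheme, which is a choice made here rather than a method named in the talk, and applies it to the pure decay example with a constant noise amplitude. The step size, number of paths, and parameter values are hypothetical.

```python
import math
import random

def euler_maruyama(b, sigma, x0, t_end, dt, rng):
    """Simulate one path of dx = b(x) dt + sigma(x) dW and return x(t_end)."""
    x = x0
    for _ in range(round(t_end / dt)):
        dw = math.sqrt(dt) * rng.gauss(0.0, 1.0)   # Wiener increment
        x += b(x) * dt + sigma(x) * dw
    return x

lam, noise, x0 = 1.0, 0.5, 10.0    # decay rate, noise amplitude, start value
rng = random.Random(12)
finals = [
    euler_maruyama(lambda x: -lam * x, lambda x: noise, x0, 1.0, 0.01, rng)
    for _ in range(2000)
]
sim_mean = sum(finals) / len(finals)
sim_var = sum((v - sim_mean) ** 2 for v in finals) / (len(finals) - 1)

# Known results for this linear model: mean x0 * exp(-lam * t) and
# variance noise^2 * (1 - exp(-2 * lam * t)) / (2 * lam) at time t.
exact_mean = x0 * math.exp(-lam)
exact_var = noise ** 2 * (1 - math.exp(-2 * lam)) / (2 * lam)
print(sim_mean, exact_mean, sim_var, exact_var)
```

The simulated mean follows the deterministic decay curve, while the spread of the paths comes entirely from the noise term.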

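The density f(t, x) satisfies the Fokker-Planck equation (or Kolmogorov forward equation): ∂f/∂t = −Σ_i ∂[b_i(x) f]/∂x_i + (1/2) Σ_i Σ_j ∂²[a_ij(x) f]/∂x_i ∂x_j, where a(x) = σ(x) σ(x)ᵀ and d is the dimension of x. When d = 1 this equation simplifies to ∂f/∂t = −∂[b(x) f]/∂x + (1/2) ∂²[σ²(x) f]/∂x². Multiplying by x^n and integrating we obtain the moments equation dm_n/dt = n E[x^(n−1) b(x)] + (1/2) n(n−1) E[x^(n−2) σ²(x)], where m_n = E(x^n). Clearly dm_0/dt = 0 and dm_1/dt = E(b).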

In the first example with b(x) = -λx and σ(x) = σ, we have In the second example with b(x) = λ{B - μ(x)} and σ(x) = σ, we have
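The equations for these two examples are not legible in the transcript. As a hedged reconstruction, the block below works them out from the standard one-dimensional moment equation for a diffusion, with m_n = E(x^n). For the second example it assumes the drift is b(x) = λ(B − x), which is one reading of the slide's notation.

```latex
\begin{align*}
% Assumed moment equation for d = 1
\frac{dm_n}{dt} &= n\,E\!\left[x^{n-1} b(x)\right]
                 + \tfrac{1}{2}\,n(n-1)\,E\!\left[x^{n-2}\sigma^2(x)\right] \\[4pt]
% Example 1: b(x) = -\lambda x, \sigma(x) = \sigma
\frac{dm_1}{dt} &= -\lambda m_1, &
\frac{dm_2}{dt} &= -2\lambda m_2 + \sigma^2 \\[4pt]
% Example 2: b(x) = \lambda (B - x) (assumed), \sigma(x) = \sigma
\frac{dm_1}{dt} &= \lambda\,(B - m_1), &
\frac{dm_2}{dt} &= 2\lambda\,(B m_1 - m_2) + \sigma^2 \\[4pt]
% In both examples the variance V = m_2 - m_1^2 obeys the same equation
\frac{dV}{dt} &= -2\lambda V + \sigma^2, &
V(t) &= V(0)\,e^{-2\lambda t} + \frac{\sigma^2}{2\lambda}\left(1 - e^{-2\lambda t}\right)
\end{align*}
```

Under these assumptions the mean follows the deterministic curve of the corresponding σ(x) = 0 example, and the variance settles at σ²/(2λ) for large t.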

The Quasi Likelihood Equations and the variance of

Fraser River (BC)

Hansard/Red Pass

Ratio of GEV Distributions

Example: Canadian Ecological Effects Monitoring (EEM) Program for Pulp Mills. Risk Identification, Risk Assessment, Risk Management.

Objectives of Environmental Effects Monitoring Program: Does effluent cause an effect in the environment? Is effect persistent over time? Does effect warrant correction? What are the causative stressors? From 1992, all new effluent regulations require sites to do EEM. Pulp and Paper Pilot program

Environmental Effects Monitoring: Canadian Pulp and Paper Industry Structure Data and Objective

Example of data (daphnia survival and reproduction): number of neonates produced per replicate and total female adult mortality

Example of reproduction data (one cycle)

Some simulation results (MLE) n

Table 2: Skewness. The MLE has a heavy right-tailed distribution (skewed to the right).

Table 3: Kurtosis. The MLE has heavy tails and a sharp central part for kurtosis > 0, while the tails are lighter and the central part is flatter for kurtosis < 0.

UMVU Estimator

UMVU : Closed form expression for n=2m-1

UMVU: n even

Modified Estimator
