Estimation Techniques for Dose-response Functions Presented by Bahman Shafii, Ph.D. Statistical Programs College of Agricultural and Life Sciences University.

Presentation transcript:

Estimation Techniques for Dose-response Functions Presented by Bahman Shafii, Ph.D. Statistical Programs College of Agricultural and Life Sciences University of Idaho

Acknowledgments
Research partially funded by USDA-ARS Hatch Project IDA01412, Idaho Agricultural Experiment Station.
Collaborators:
- William J. Price, Ph.D., Statistical Programs, University of Idaho.
- Steven Seefeldt, Ph.D., USDA-ARS, University of Alaska Fairbanks.

Introduction
Dose-response models are common in agricultural research. They can encompass many types of problems:
- Modeling environmental effects due to exposure to chemical or temperature regimes.
- Estimation of time-dependent responses such as germination, emergence, or hatching (e.g. Shafii and Price 2001; Shafii et al. 2009).
- Bioassay assessments via calibration curves and quantal estimation (e.g. Shafii and Price 2006).

Estimation
- Curve estimation.
  - Linear or non-linear techniques.
- Estimate other quantities: percentiles.
  - Typically: LD50, LC50, EC50, etc.
  - Percentile estimation is problematic:
    - inverted solutions.
    - unknown distributions.
    - approximate variances.

The response distribution:
- Continuous: Normal, Log Normal, Gamma, etc.
- Discrete (quantal responses): Binomial, Multinomial (yes/no); Poisson (count).

The response form:
Typically expressed as a nonlinear curve:
- increasing or decreasing sigmoidal form
- increasing or decreasing asymptotic form
[Figure: example curves; axes: Dose, Response.]

Bioassay and Calibration
Given a dose-response curve and an observed response:
- What dose generated the response?
- What is the probability of a dose given an observed response and the calibration curve?
This problem fits naturally into a Bayesian framework.

[Figure: calibration on a dose-response curve. Axes: Dose, Response. Labels: Measured Response, Unknown Dose.]

Typical dose-response estimation assumes that the functional form, or tolerance distribution, is known, e.g. a sigmoidal shape. In some cases, however, it may be advantageous to relax this assumption and restrict estimation to a family of dose-response forms:
- The dose-response population consists of a mixture of subpopulations which cannot be sampled separately.
- The dose-response series exhibits a more complex behavior than a simple sigmoidal shape, e.g. hormesis.

Objectives
Outline estimation methods for dose-response models.
- Traditional approaches.
  - Probit - Least Squares.
- Modern approaches.
  - Probit - Maximum Likelihood.
  - Generalized non-linear models.
  - Bayesian solutions.

Objectives
Demonstrate solutions for calibration of an unknown dose with a binary response assuming:
- A known dose-response form.
  - Standard MLE estimation.
  - Standard parametric Bayesian estimation.
- A family of dose-response forms.
  - Nonparametric Bayesian estimation.

Estimation Methods
Traditional Approach
Probit Analysis - Least Squares
A linearized least squares estimation (Bliss, 1934; Fisher, 1935; Finney, 1971):

$$\text{Probit}_i = \Phi^{-1}(p_{ij}) = \beta_0 + \beta_1 \cdot \text{dose}_i + \epsilon_{ij} \qquad (1)$$

where $p_{ij} = y_{ij}/N$ and $y_{ij}$ is the number of successes out of $N$ trials in the $j$th replication of the $i$th dose. $\beta_0$ and $\beta_1$ are regression parameters and $\epsilon_{ij}$ is a random error; $\epsilon_{ij} \sim N(0, \sigma^2)$.

Minimize: $SS_{error} = \sum_{ij} \left( \Phi^{-1}(p_{ij}) - \widehat{\text{Probit}}_i \right)^2$

$\Phi$ is a convenient CDF form or "tolerance distribution", e.g.
- Normal: $p_{ij} = \int_{-\infty}^{\text{dose}_i} \frac{1}{\sigma\sqrt{2\pi}} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right) dx$
- Logistic: $p_{ij} = 1 / (1 + \exp(-\beta_1 \text{dose}_i - \beta_0))$
- Modified Logistic: $p_{ij} = C + (M - C) / (1 + \exp(-\beta_1 \text{dose}_i - \beta_0))$, with lower limit $C$ and upper limit $M$ (e.g. Seefeldt et al. 1995)
- Gompertz: $p_{ij} = \beta_0 (1 - \exp(\exp(-\beta_1 \text{dose}_i)))$
- Exponential: $p_{ij} = \beta_0 \exp(-\beta_1 \text{dose}_i)$
SAS: PROC REG.
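The linearized fit in (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the doses and proportions are hypothetical, there is one replicate per dose, the tolerance distribution is the standard normal, and the function name `probit_least_squares` is introduced here rather than taken from the presentation (which uses SAS PROC REG).

```python
from statistics import NormalDist


def probit_least_squares(doses, proportions):
    """Fit Probit = b0 + b1 * dose by ordinary least squares on the probit scale."""
    std_normal = NormalDist()
    # Probit transform; proportions of exactly 0 or 1 have no finite probit.
    probits = [std_normal.inv_cdf(p) for p in proportions]
    n = len(doses)
    mean_x = sum(doses) / n
    mean_z = sum(probits) / n
    sxx = sum((x - mean_x) ** 2 for x in doses)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in zip(doses, probits))
    b1 = sxz / sxx
    b0 = mean_z - b1 * mean_x
    return b0, b1


# Hypothetical data: observed proportion responding at each dose.
doses = [1.0, 2.0, 3.0, 4.0, 5.0]
proportions = [0.10, 0.25, 0.50, 0.75, 0.90]

b0, b1 = probit_least_squares(doses, proportions)
ld50 = -b0 / b1  # dose at which the fitted probit is 0, i.e. p = 0.5
print(f"b0={b0:.3f}, b1={b1:.3f}, LD50={ld50:.3f}")
```

Because the example data are symmetric about dose 3, the estimated LD50 lands at 3, which gives a quick sanity check on the arithmetic.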

Modern Approaches
Probit Analysis - Maximum Likelihood
The responses, $y_{ij}$, are assumed binomial at each dose $i$ with parameter $\pi_i$. Using the joint likelihood, $L(\pi_i)$:

Maximize: $$L(\pi_i) \propto \prod_{ij} (\pi_i)^{y_{ij}} (1 - \pi_i)^{(N - y_{ij})} \qquad (2)$$

for data set $y_{ij}$, where $\pi_i = \Phi(\beta_0 + \beta_1 \cdot \text{dose}_i)$ and $\beta_0$, $\beta_1$, and $\text{dose}_i$ are those given previously. The CDF, $\Phi$, is typically defined as a Normal, Logistic, or Gompertz distribution as given above.
SAS: PROC PROBIT.
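As a rough illustration of maximizing the binomial likelihood in (2), the sketch below uses Fisher scoring with a logistic tolerance distribution, chosen here only because its derivatives are simple. The data, the function name `fit_logistic_ml`, and the fixed iteration count are assumptions made for this example; the presentation itself relies on SAS PROC PROBIT.

```python
import math


def fit_logistic_ml(doses, successes, trials, n_iter=25):
    """Maximize the binomial likelihood with a logistic CDF by Fisher scoring."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        score0 = score1 = 0.0           # gradient of the log-likelihood
        info00 = info01 = info11 = 0.0  # Fisher information matrix entries
        for x, y, n in zip(doses, successes, trials):
            pi = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            weight = n * pi * (1.0 - pi)
            score0 += y - n * pi
            score1 += (y - n * pi) * x
            info00 += weight
            info01 += weight * x
            info11 += weight * x * x
        det = info00 * info11 - info01 * info01
        # Scoring update: beta <- beta + inverse(information) * score
        b0 += (info11 * score0 - info01 * score1) / det
        b1 += (info00 * score1 - info01 * score0) / det
    return b0, b1


# Hypothetical data: successes out of 20 trials at each of five doses.
doses = [1.0, 2.0, 3.0, 4.0, 5.0]
successes = [2, 5, 10, 15, 18]
trials = [20] * 5

b0_hat, b1_hat = fit_logistic_ml(doses, successes, trials)
ld50_hat = -b0_hat / b1_hat  # percentile obtained by inverting the fitted curve
print(f"b0={b0_hat:.3f}, b1={b1_hat:.3f}, LD50={ld50_hat:.3f}")
```

Note that the LD50 here is still found by inversion of the fitted curve, which is the limitation the next slide raises.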

Probit Analysis
Limitations:
- Least squares limited.
  - Linearized solution to a non-linear problem.
- Even under ML, solution for percentiles approximated.
  - inversion.
  - use of the ratio $\beta_0 / \beta_1$ (Fieller, 1944).
- Appropriate only for proportional data.
  - Assumes the response $\Phi^{-1}(p_{ij}) \sim N(\mu, \sigma^2)$.
- Interval estimation and comparison of percentile values approximated.
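The inversion step behind the percentile estimates can be written out explicitly. As a sketch, assuming the probit model in (1) with a standard normal $\Phi$ and writing hats for fitted values (notation introduced here), the dose producing response proportion $p$ is:

```latex
\widehat{\text{dose}}_p = \frac{\Phi^{-1}(p) - \hat{\beta}_0}{\hat{\beta}_1},
\qquad
\widehat{LD}_{50} = -\frac{\hat{\beta}_0}{\hat{\beta}_1}
\quad \text{since } \Phi^{-1}(0.5) = 0 .
```

The estimate is a ratio of two correlated estimators, so its sampling distribution is not normal in general, which is why interval estimates rely on approximations such as Fieller's theorem.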

Modern Approaches (cont)
Nonlinear Regression - Iterative Least Squares
Directly models the response as:

$$y_{ij} = f(\text{dose}_i) + \epsilon_{ij} \qquad (3)$$

where $y_{ij}$ is an observed continuous response, $f(\text{dose}_i)$ may be generalized to any continuous function of dose, and $\epsilon_{ij} \sim N(0, \sigma^2)$.

Minimize: $SS_{error} = \sum_{ij} [\, y_{ij} - f(\text{dose}_i) \,]^2$.
SAS: PROC NLIN.
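A minimal sketch of iterative least squares for (3) follows, using Gauss-Newton iterations on the exponential form $f(\text{dose}) = \beta_0 \exp(-\beta_1 \cdot \text{dose})$ listed earlier. The data are hypothetical, the helper names are introduced here, and starting values come from a straight-line fit of log(y) on dose, which is one common choice rather than the presenters' stated method.

```python
import math


def model(dose, b0, b1):
    return b0 * math.exp(-b1 * dose)


def sum_sq(doses, y, b0, b1):
    return sum((yi - model(x, b0, b1)) ** 2 for x, yi in zip(doses, y))


def log_linear_start(doses, y):
    """Starting values from a straight-line fit of log(y) on dose."""
    logs = [math.log(v) for v in y]
    n = len(doses)
    mx, mz = sum(doses) / n, sum(logs) / n
    slope = (sum((x - mx) * (z - mz) for x, z in zip(doses, logs))
             / sum((x - mx) ** 2 for x in doses))
    return math.exp(mz - slope * mx), -slope


def gauss_newton(doses, y, b0, b1, n_iter=20):
    """Iteratively solve the linearized least squares problem."""
    for _ in range(n_iter):
        a00 = a01 = a11 = g0 = g1 = 0.0
        for x, yi in zip(doses, y):
            e = math.exp(-b1 * x)
            resid = yi - b0 * e
            j0 = e              # derivative of f with respect to b0
            j1 = -b0 * x * e    # derivative of f with respect to b1
            a00 += j0 * j0
            a01 += j0 * j1
            a11 += j1 * j1
            g0 += j0 * resid
            g1 += j1 * resid
        det = a00 * a11 - a01 * a01
        # Step = inverse(J'J) * J'r for the 2 x 2 system
        b0 += (a11 * g0 - a01 * g1) / det
        b1 += (a00 * g1 - a01 * g0) / det
    return b0, b1


# Hypothetical continuous responses at six doses.
doses = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.2, 5.9, 3.8, 2.2, 1.3, 0.85]

b0_start, b1_start = log_linear_start(doses, y)
b0_hat, b1_hat = gauss_newton(doses, y, b0_start, b1_start)
print(f"b0={b0_hat:.3f}, b1={b1_hat:.3f}, SS={sum_sq(doses, y, b0_hat, b1_hat):.4f}")
```

Plain Gauss-Newton has no step-size control, so it can fail from poor starting values; production routines add safeguards for that.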

Nonlinear Regression - Iterative Least Squares
Limitations:
- assumes the data, $y_{ij}$, are continuous; they could be discrete.
- the response distribution may not be Normal, i.e. the assumption $\epsilon_{ij} \sim N(0, \sigma^2)$ may not hold.
- standard errors and inference are asymptotic.
- treatment comparisons are difficult in PROC NLIN.
  - differential sums of squares, or specialized SAS code (PROC IML).

Modern Approaches (cont)
Generalized Nonlinear Model - Maximum Likelihood
Directly models the response as:

$$y_{ij} = f(\text{dose}_i) + \epsilon_{ij}$$

where $y_{ij}$ and $f(\text{dose}_i)$ are as defined above. Estimation through maximum likelihood where the response distribution may take on many forms:
- Normal: $y_{ij} \sim N(\mu_i, \sigma)$,
- Binomial: $y_{ij} \sim \text{bin}(N, \pi_i)$,
- Poisson: $y_{ij} \sim \text{poisson}(\lambda_i)$,
- or in general: $y_{ij} \sim f(\theta)$.

Generalized Nonlinear Model - Maximum Likelihood
Maximize: $$L(\theta) \propto \prod_{ij} f(\theta \mid y_{ij}) \qquad (4)$$
- Nonlinear estimation.
- Response distribution not restricted to Normal.
- May also incorporate random components into the model.
- Treatment comparisons easier in SAS.
  - Contrast and estimate statements.
SAS: PROC NLMIXED.
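To make the idea in (4) concrete, the sketch below keeps one nonlinear mean function and swaps the response distribution by name. Everything here is illustrative: the count data are hypothetical, the mean is the exponential form $\beta_0 \exp(-\beta_1 \cdot \text{dose})$, and the derivative-free `compass_search` optimizer is a simple stand-in chosen for brevity, not the algorithm used by PROC NLMIXED.

```python
import math


def log_density(family, y, mean, n_trials=None, sigma=None):
    """Log-density of one observation under the chosen response distribution."""
    if family == "normal":
        return -math.log(sigma * math.sqrt(2 * math.pi)) - 0.5 * ((y - mean) / sigma) ** 2
    if family == "binomial":  # here `mean` is the success probability pi_i
        if not 0.0 < mean < 1.0:
            return -math.inf
        return (math.lgamma(n_trials + 1) - math.lgamma(y + 1)
                - math.lgamma(n_trials - y + 1)
                + y * math.log(mean) + (n_trials - y) * math.log(1.0 - mean))
    if family == "poisson":   # here `mean` is the expected count lambda_i
        if mean <= 0.0:
            return -math.inf
        return y * math.log(mean) - mean - math.lgamma(y + 1)
    raise ValueError(f"unknown family: {family}")


def log_likelihood(theta, doses, y, family, **kwargs):
    """Joint log-likelihood with mean f(dose) = b0 * exp(-b1 * dose)."""
    b0, b1 = theta
    return sum(log_density(family, yi, b0 * math.exp(-b1 * x), **kwargs)
               for x, yi in zip(doses, y))


def compass_search(fn, start, step=0.5, tol=1e-6):
    """Maximize fn by trying +/- step along each coordinate, halving on failure."""
    point, best = list(start), fn(start)
    while step > tol:
        improved = False
        for k in range(len(point)):
            for delta in (step, -step):
                trial = point[:]
                trial[k] += delta
                value = fn(trial)
                if value > best:
                    point, best, improved = trial, value, True
        if not improved:
            step /= 2.0
    return point, best


# Hypothetical count data, e.g. surviving individuals per plot at each dose.
doses = [0.0, 1.0, 2.0, 3.0, 4.0]
counts = [20, 13, 7, 4, 3]

(b0_hat, b1_hat), max_loglik = compass_search(
    lambda th: log_likelihood(th, doses, counts, "poisson"), [15.0, 0.3])
fitted_total = sum(b0_hat * math.exp(-b1_hat * x) for x in doses)
print(f"b0={b0_hat:.3f}, b1={b1_hat:.3f}, logL={max_loglik:.3f}")
```

Changing `"poisson"` to `"normal"` (with `sigma=...`) or `"binomial"` (with `n_trials=...` and a mean function bounded in (0, 1)) changes the assumed response distribution without touching the optimizer.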

Generalized Nonlinear Model - Inference
Formulate a full dummy variable model encompassing $k$ treatments. The joint likelihood over the $k$ treatments becomes:

$$L(\theta_k) \propto \prod_{ijk} f(\theta_k \mid y_{ijk}) \qquad (5)$$

where $y_{ijk}$ is the $j$th replication of the $i$th dose in the $k$th treatment and $\theta_k$ are the parameters of the $k$th treatment. Comparison of parameter values is then possible through single and multiple degree of freedom contrasts.

Generalized Nonlinear Model
Limitations:
- percentile solution may still be based on inversion or Fieller's theorem.
- inferences based on normal theory approximations.
- standard errors and confidence intervals asymptotic.

Modern Approaches (cont)
Bayesian Estimation - Iterative Numerical Techniques
Considers the probability of the parameters, $\theta$, given the data $y_{ij}$. Using Bayes theorem, estimate:

$$p(\theta \mid y_{ij}) = \frac{p(y_{ij} \mid \theta) \, p(\theta)}{\int p(y_{ij} \mid \theta) \, p(\theta) \, d\theta} \qquad (6)$$

where $p(\theta \mid y_{ij})$ is the posterior distribution of $\theta$ given the data $y_{ij}$, $p(y_{ij} \mid \theta)$ is the likelihood defined above, and $p(\theta)$ is a prior probability distribution for the parameters $\theta$.

Bayesian Estimation - Iterative Numerical Techniques
- Nonlinear estimation.
- Percentiles can be found from the distribution of $\theta$.
- The likelihood is the same as that of the Generalized Nonlinear Model.
  - flexibility in the response distribution.
  - $f(\text{dose}_i)$ can be any continuous function of dose.
- Inherently allows updating of the estimation.
- Correct interval estimation (credible intervals).
  - agrees well with GNLM at midrange percentiles.
  - can perform better at extreme percentiles.
SAS: PROC MCMC.
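For a small model, the posterior in (6) can be approximated on a grid instead of by MCMC, which makes the mechanics easy to see. The sketch below is an illustration only: it assumes a two-parameter logistic curve $\pi_i = 1/(1 + \exp(-\beta(\text{dose}_i - \gamma)))$ so that $\gamma$ is the LD50 directly, flat priors over the grid ranges, and hypothetical binomial data. None of these choices come from the presentation.

```python
import math

# Hypothetical data: successes out of 20 trials at each of five doses.
doses = [1.0, 2.0, 3.0, 4.0, 5.0]
successes = [2, 5, 10, 15, 18]
trials = [20] * 5


def log_likelihood(beta, gamma):
    """Binomial log-likelihood kernel for the logistic curve."""
    total = 0.0
    for x, y, n in zip(doses, successes, trials):
        pi = 1.0 / (1.0 + math.exp(-beta * (x - gamma)))
        total += y * math.log(pi) + (n - y) * math.log(1.0 - pi)
    return total


# Flat priors are implied by evaluating the likelihood on a uniform grid.
beta_grid = [0.05 * k for k in range(1, 61)]       # 0.05 to 3.0
gamma_grid = [1.0 + 0.02 * k for k in range(201)]  # 1.0 to 5.0

log_post = [[log_likelihood(b, g) for g in gamma_grid] for b in beta_grid]
peak = max(max(row) for row in log_post)

# Marginal posterior of gamma: sum over beta, then normalize (the integral in (6)).
marginal = [sum(math.exp(log_post[i][j] - peak) for i in range(len(beta_grid)))
            for j in range(len(gamma_grid))]
norm = sum(marginal)
marginal = [m / norm for m in marginal]


def quantile(q):
    """Smallest grid value whose cumulative posterior probability reaches q."""
    cumulative = 0.0
    for g, m in zip(gamma_grid, marginal):
        cumulative += m
        if cumulative >= q:
            return g
    return gamma_grid[-1]


post_mean = sum(g * m for g, m in zip(gamma_grid, marginal))
lower, upper = quantile(0.025), quantile(0.975)
print(f"LD50 posterior mean={post_mean:.3f}, 95% credible interval=({lower:.2f}, {upper:.2f})")
```

The credible interval is read straight off the posterior distribution of the percentile, with no inversion or normal approximation. Grid evaluation scales poorly beyond two or three parameters, which is where MCMC takes over.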

Bayesian Estimation - Iterative Numerical Techniques
Limitations:
- User must specify a prior probability $p(\theta)$.
- Estimation requires custom programming.
  - SAS: PROC MCMC
  - Specialized software: WinBUGS
- Computationally intensive solutions.
- Requires statistical expertise.
Sample programs and data are available at:

Calibration Methods
Tolerance Distribution: Logistic
The response $y_{ij}/N_i$ at dose $i = 1$ to $k$, and replication $j = 1$ to $r$, is binomial with the proportion of success given by:

$$y_{ij}/N_i = M / (1 + \exp(-\beta(\text{dose}_i - \gamma))) \qquad (7)$$

where $\beta$ is a rate-related parameter and $\gamma$ is the $\text{dose}_i$ for which the proportion of success, $y_{ij}/N_i$, is $M/2$. $M$ is the theoretical maximum proportion attainable.

A convenient generalization of (7) will allow $\gamma$ to represent any dose at which $y_{ij}/N_i = Q$:

$$y_{ij}/N_i = M \cdot C / (C + \exp(-\beta(\text{dose}_i - \gamma))) \qquad (8)$$

where the constant $C = Q/(M - Q)$. Note that, if $Q = M/2$, then $C = 1$ and equation (8) reverts to the standard form given in (7). Equation (8), therefore, permits an unknown dose at a given response, $Q$, to be estimated through parameter $\gamma$.
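The two properties claimed for (8) are easy to check numerically. The short sketch below uses hypothetical parameter values and function names introduced here (`logistic_eq7`, `logistic_eq8`); it evaluates the curve at dose = gamma and compares (8) with (7) when Q = M/2.

```python
import math


def logistic_eq7(dose, M, beta, gamma):
    """Standard form: gamma is the dose giving response M/2."""
    return M / (1.0 + math.exp(-beta * (dose - gamma)))


def logistic_eq8(dose, M, beta, gamma, Q):
    """Generalized form: gamma is the dose giving response Q (requires 0 < Q < M)."""
    C = Q / (M - Q)
    return M * C / (C + math.exp(-beta * (dose - gamma)))


# Hypothetical parameter values.
M, beta, gamma = 0.9, 1.5, 3.0

at_gamma = logistic_eq8(gamma, M, beta, gamma, Q=0.2)
same_as_eq7 = logistic_eq8(4.0, M, beta, gamma, Q=M / 2)

print(at_gamma)                                       # response at dose = gamma
print(same_as_eq7, logistic_eq7(4.0, M, beta, gamma))  # the two forms agree
```

At dose = gamma the exponential term equals 1, so the response is M*C/(C + 1), which simplifies to Q for any admissible Q.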

Maximum Likelihood
Given the binomial responses, $y_{ij}/N_i$, a joint likelihood may be defined as:

$$L(\pi_i \mid y_{ij}/N_i) \propto \prod_{ij} (\pi_i)^{y_{ij}} (1 - \pi_i)^{(N_i - y_{ij})} \qquad (9)$$

where the binomial parameter, $\pi_i$, is defined by (8) and the associated parameters, $\theta = [M, \beta, \gamma]$, are estimated through maximization of (9). $N_i$ and $y_{ij}$ are the total number of trials and number of successes, respectively. Inferences on $\theta$ are carried out assuming $\theta \sim N(\mu_\theta, \Sigma_\theta)$.
SAS: PROC NLMIXED

Bayesian: Parametric
A Bayesian posterior distribution for $\theta$ is given by:

$$pr(\theta \mid y_{ij}/N_i) \propto pr(y_{ij}/N_i \mid \theta) \cdot pr(\theta) \qquad (10)$$

where $pr(y_{ij}/N_i \mid \theta)$ is the likelihood shown in (9) and $pr(\theta)$ is a prior distribution for the parameters $\theta = [M, \beta, \gamma]$. Estimation of $\theta$ is carried out through numerically intensive techniques such as MCMC (e.g. Price and Shafii 2005). Inference on $\gamma$ is obtained through integration of (10) over the parameter space of $M$ and $\beta$.
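The sampling step can be sketched with a random-walk Metropolis algorithm, one of the simplest MCMC methods. This is an illustration under explicit assumptions, not the authors' program: the data are hypothetical, the priors are flat on bounded ranges, the proposal step sizes are hand-picked, and Q = 0.45 is an arbitrary known response. Keeping only the draws of gamma performs the integration over M and beta.

```python
import math
import random

# Hypothetical calibration data: successes out of 50 trials at six doses.
doses = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
successes = [2, 8, 22, 37, 43, 45]
trials = [50] * 6
Q = 0.45  # known response whose generating dose, gamma, is sought


def log_posterior(M, beta, gamma):
    """Log of (10) up to a constant, using likelihood (9) with curve (8)."""
    # Flat priors on bounded ranges (an assumption made for this sketch).
    if not (Q < M < 1.0 and 0.0 < beta < 10.0 and 0.0 < gamma < 10.0):
        return -math.inf
    C = Q / (M - Q)
    total = 0.0
    for x, y, n in zip(doses, successes, trials):
        pi = M * C / (C + math.exp(-beta * (x - gamma)))
        if not 0.0 < pi < 1.0:
            return -math.inf
        total += y * math.log(pi) + (n - y) * math.log(1.0 - pi)
    return total


def metropolis(n_iter=20000, burn_in=5000, seed=1):
    """Random-walk Metropolis over theta = (M, beta, gamma)."""
    rng = random.Random(seed)
    state = (0.9, 1.0, 3.0)          # starting values
    step = (0.03, 0.2, 0.15)         # proposal standard deviations
    current = log_posterior(*state)
    gamma_draws, accepted = [], 0
    for it in range(n_iter):
        proposal = tuple(s + rng.gauss(0.0, w) for s, w in zip(state, step))
        candidate = log_posterior(*proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            state, current = proposal, candidate
            accepted += 1
        if it >= burn_in:
            gamma_draws.append(state[2])
    return gamma_draws, accepted / n_iter


gamma_draws, accept_rate = metropolis()
gamma_sorted = sorted(gamma_draws)
gamma_mean = sum(gamma_draws) / len(gamma_draws)
gamma_lower = gamma_sorted[int(0.025 * len(gamma_sorted))]
gamma_upper = gamma_sorted[int(0.975 * len(gamma_sorted))]
print(f"gamma: mean={gamma_mean:.2f}, 95% interval=({gamma_lower:.2f}, {gamma_upper:.2f})")
```

In practice one would also check convergence with multiple chains and trace plots, and tune the proposal scale, which purpose-built software such as PROC MCMC or WinBUGS handles more carefully.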

Bayesian: Nonparametric
This methodology was first proposed by Mukhopadhyay (2000) and followed by Kottas et al. (2002). The technique considers the dose-response series as a multinomial process with parameters $P = [p_1, p_2, p_3, \ldots, p_k]$. Assuming the responses, $y_{ij}/N_i$, are binomial, a likelihood can then be defined as:

$$L(P \mid y_{ij}/N_i) \propto \prod_{ij} (p_i)^{y_{ij}} (1 - p_i)^{(N_i - y_{ij})} \qquad (11)$$

If the random segments between true response rates, $p_i$, are distributed as a Dirichlet Process (DP), a joint prior distribution on the $p_i$ may then be defined by:

$$pr(P) \propto \prod_i (p_i - p_{i-1})^{(\alpha_i - 1)} \qquad (12)$$

where $\alpha_i = \alpha \{ F_0(\text{dose}_i) - F_0(\text{dose}_{i-1}) \}$, $\alpha$ is a precision parameter, and $F_0$ is a base tolerance distribution. The precision parameter, $\alpha$, reflects how closely the final estimation follows the base distribution. Low values indicate less correspondence, while larger values indicate a tighter association. The base distribution, $F_0(\cdot)$, defines a family of tolerance distributions.

A posterior distribution for $P$ can then be defined by combining (11) and (12) as:

$$pr(P \mid y_{ij}/N_i) \propto \prod_{ij} (p_i)^{y_{ij}} (1 - p_i)^{(N_i - y_{ij})} \prod_i (p_i - p_{i-1})^{(\alpha_i - 1)} \qquad (13)$$

Estimation of this posterior is again carried out numerically using techniques such as MCMC. Inference on an unknown dose, $\gamma$, at a known response $p_0 = y_0/N_0$, is obtained through sampling of the posterior given in (13).
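Any MCMC sampler for (13) needs a function that evaluates the log posterior for a candidate vector P. The sketch below shows one way to write it, under assumptions that are not stated on the slides: the boundary conventions $p_0 = 0$, $p_{k+1} = 1$, $F_0 = 0$ below the lowest dose and $F_0 = 1$ above the highest; a logistic base distribution with hypothetical location 3 and scale 1; one replicate per dose; and hypothetical data.

```python
import math


def base_cdf(dose, location=3.0, scale=1.0):
    """Hypothetical base tolerance distribution F0 (logistic)."""
    return 1.0 / (1.0 + math.exp(-(dose - location) / scale))


def log_posterior(p, doses, successes, trials, alpha):
    """Log of (13) up to a constant, for ordered response rates p_1 < ... < p_k."""
    cuts = [0.0] + list(p) + [1.0]  # assumed end points p_0 = 0 and p_{k+1} = 1
    if any(b <= a for a, b in zip(cuts, cuts[1:])):
        return -math.inf            # violates the ordering 0 < p_1 < ... < p_k < 1
    base = [0.0] + [base_cdf(d) for d in doses] + [1.0]
    total = 0.0
    # Binomial likelihood (11).
    for p_i, y, n in zip(p, successes, trials):
        total += y * math.log(p_i) + (n - y) * math.log(1.0 - p_i)
    # Prior (12) on the segments between successive response rates.
    for i in range(1, len(cuts)):
        alpha_i = alpha * (base[i] - base[i - 1])
        total += (alpha_i - 1.0) * math.log(cuts[i] - cuts[i - 1])
    return total


# Hypothetical data: successes out of 20 trials at each of five doses.
doses = [1.0, 2.0, 3.0, 4.0, 5.0]
successes = [2, 5, 10, 15, 18]
trials = [20] * 5

observed = [y / n for y, n in zip(successes, trials)]
lp_observed = log_posterior(observed, doses, successes, trials, alpha=10.0)
lp_poor = log_posterior([0.5, 0.6, 0.7, 0.8, 0.9], doses, successes, trials, alpha=10.0)
lp_unordered = log_posterior([0.5, 0.3, 0.6, 0.7, 0.9], doses, successes, trials, alpha=10.0)
print(lp_observed, lp_poor, lp_unordered)
```

A sampler would propose new values for one $p_i$ at a time within its neighbours, accept or reject using this function, and then convert each sampled curve into a dose at the known response $p_0$, for example by interpolation between doses.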

Concluding Remarks
- Dose-response models have wide application in agriculture.
- Probit models of estimation are limited in scope.
- Generalized nonlinear and Bayesian models provide the most flexible framework for dose-response estimation.
  - Can use various response distributions.
  - Can use various dose-response models.
  - Can incorporate random model effects.
  - Can be used to compare treatments.
    - GNLM: full dummy variable modeling.
    - Bayesian methods: probability statements.
  - They are useful for quantifying the relative efficacy of treatments.
- Generalized nonlinear models are sufficient in most situations.
- Bayesian estimation is preferred when estimating extreme percentiles.

Concluding Remarks (cont)
- Bioassay is an important part of dose-response analysis.
- Determining an unknown dose can be problematic for some parametric functional forms.
- Dose estimation fits naturally in a Bayesian framework.
- Some dose-response data may not follow typical sigmoidal patterns.
- Methodology proposed here uses a base tolerance distribution.
  - Should be used and interpreted with caution.
  - Standard model assessment techniques still apply.
  - Introduces more uncertainty into the estimation situation.

References
Bliss, C. I. 1934. The method of probits. Science 79:2037.
Bliss, C. I. 1938. The determination of dosage-mortality curves from small numbers. Quart. J. Pharm. 11.
Berkson, J. 1944. Application of the Logistic function to bio-assay. J. Amer. Stat. Assoc. 39.
Fieller, E. C. 1944. A fundamental formula in the statistics of biological assay and some applications. Quart. J. Pharm. 17.
Finney, D. J. 1971. Probit Analysis. Cambridge University Press, London.
Fisher, R. A. 1935. Appendix to Bliss, C. I.: The case of zero survivors. Ann. Appl. Biol. 22.
SAS Inst. Inc. SAS OnlineDoc, Version 9, Cary, NC.
Seefeldt, S. S., J. E. Jensen, and P. Fuerst. 1995. Log-logistic analysis of herbicide dose-response relationships. Weed Technol. 9.
Kottas, A., M. D. Branco, and A. E. Gelfand. 2002. A Nonparametric Bayesian Modeling Approach for Cytogenetic Dosimetry. Biometrics 58.
Mukhopadhyay, S. 2000. Bayesian Nonparametric Inference on the Dose Level with Specified Response Rate. Biometrics 56.
Price, W. J. and B. Shafii. 2005. Bayesian Analysis of Dose-response Calibration Curves. Proceedings of the Seventeenth Annual Kansas State University Conference on Applied Statistics in Agriculture [CDROM], April 25-27, Manhattan, Kansas.
Shafii, B. and W. J. Price. 2001. Estimation of cardinal temperatures in germination data analysis. Journal of Agricultural, Biological and Environmental Statistics 6(3).
Shafii, B. and W. J. Price. 2006. Bayesian approaches to dose-response calibration models. Abstract: Proceedings of the XXIII International Biometrics Conference [CDROM], July, Montreal, Quebec, Canada.
Shafii, B., W. J. Price, D. L. Barney, and O. A. Lopez. 2009. Effects of stratification and cold storage on the seed germination characteristics of cascade huckleberry and oval-leaved bilberry. Acta Hort. 810.

Questions / Comments