# Estimation Methods for Dose-response Functions


Bahman Shafii, Statistical Programs, College of Agricultural and Life Sciences, University of Idaho, Moscow, Idaho

## Introduction

Dose-response models are common in agricultural research. They can encompass many types of problems.

Related problems:

- Bioassay: standard curves and determination of unknown quantities.
- Time effects: germination, emergence, hatching, and exposure times.
- Environmental effects: temperature exposure, chemical exposure, and depth or distance from exposure.

## The response distribution

- Continuous: Normal, Log Normal, Gamma, etc.
- Discrete (quantal responses): Binomial or Multinomial (yes/no), Poisson (count).

## The response form

Typically expressed as a nonlinear curve:

- increasing or decreasing sigmoidal form
- increasing or decreasing asymptotic form

[Figure: two panels, each plotting Response against Dose.]

## Estimation

- Curve estimation: linear or non-linear techniques.
- Estimation of other quantities: percentiles, typically LD50, LC50, EC50, etc.
- Percentile estimation is problematic: inverted solutions, unknown distributions, approximate variances.

## Objectives

Outline estimation methods for dose-response models.

- Traditional approaches: Probit - Least Squares.
- Modern approaches: Probit - Maximum Likelihood, generalized non-linear models, Bayesian solutions.

## Methods: Traditional Approach

Probit Analysis - Least Squares

A linearized least squares estimation (Bliss, 1934; Fisher, 1935; Finney, 1971):

$$\text{Probit}_i = \Phi^{-1}(p_{ij}) = \beta_0 + \beta_1\,\text{dose}_i + \epsilon_{ij} \qquad (1)$$

where $p_{ij} = y_{ij}/N$ and $y_{ij}$ is the number of successes out of $N$ trials in the $j$th replication of the $i$th dose. $\beta_0$ and $\beta_1$ are regression parameters and $\epsilon_{ij}$ is a random error; $\epsilon_{ij} \sim N(0, \sigma^2)$.

Minimize, on the probit scale: $SS_{error} = \sum_{i,j} \left(\Phi^{-1}(p_{ij}) - \widehat{\text{Probit}}_i\right)^2$

$\Phi$ is a convenient CDF form or "tolerance distribution", e.g.

- Normal: $p_{ij} = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\text{dose}_i} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) dx$
- Logistic: $p_{ij} = 1 / (1 + \exp(-\beta_1\,\text{dose}_i - \beta_0))$
- Modified Logistic: $p_{ij} = C + (C - M) / (1 + \exp(-\beta_1\,\text{dose}_i - \beta_0))$ (e.g. Seefeldt et al. 1995)
- Gompertz: $p_{ij} = \beta_0 (1 - \exp(\exp(-\beta_1\,\text{dose})))$
- Exponential: $p_{ij} = \beta_0 \exp(-\beta_1\,\text{dose})$

SAS: PROC REG.
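The linearized fit in equation (1) can be illustrated with a short sketch. It is written in Python rather than SAS, uses made-up data, and assumes a standard Normal tolerance distribution; the function name `probit_least_squares` is hypothetical and is not part of the presentation's sample programs.

```python
from statistics import NormalDist

# Hypothetical data: number responding out of N trials at each dose.
doses = [1.0, 2.0, 3.0, 4.0, 5.0]
successes = [2, 6, 10, 14, 18]
N = 20

def probit_least_squares(doses, successes, n_trials):
    """Fit Probit = b0 + b1 * dose by ordinary least squares on probit(p)."""
    std_normal = NormalDist()
    # Proportions of exactly 0 or 1 have infinite probits, so they are skipped.
    pairs = [(d, std_normal.inv_cdf(y / n_trials))
             for d, y in zip(doses, successes) if 0 < y < n_trials]
    xs = [d for d, _ in pairs]
    zs = [z for _, z in pairs]
    x_bar = sum(xs) / len(xs)
    z_bar = sum(zs) / len(zs)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxz = sum((x - x_bar) * (z - z_bar) for x, z in zip(xs, zs))
    b1 = sxz / sxx            # slope on the probit scale
    b0 = z_bar - b1 * x_bar   # intercept on the probit scale
    return b0, b1

b0, b1 = probit_least_squares(doses, successes, N)
ld50 = -b0 / b1  # dose where the fitted probit is 0, i.e. p = 0.5
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, LD50 = {ld50:.3f}")
```

Dropping doses with observed proportions of 0 or 1 is one simple way to handle infinite probits; it is a choice made for this sketch, not a recommendation from the slides.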

## Modern Approach: Probit Analysis - Maximum Likelihood

The responses, $y_{ij}$, are assumed binomial at each dose $i$ with parameter $\pi_i$. Using the joint likelihood, $L(\pi_i)$:

Maximize:

$$L(\pi_i) \propto \prod_{i,j} \pi_i^{\,y_{ij}} (1 - \pi_i)^{(N - y_{ij})} \qquad (2)$$

for data set $y_{ij}$, where $\pi_i = \Phi(\beta_0 + \beta_1\,\text{dose}_i)$ and $\beta_0$, $\beta_1$, and $\text{dose}_i$ are those given previously. The CDF, $\Phi$, is typically defined as a Normal, Logistic, or Gompertz distribution as given above.

SAS: PROC PROBIT.
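As a rough illustration of maximizing likelihood (2), the sketch below uses a Logistic tolerance distribution and Newton-Raphson (Fisher scoring) updates. The data are made up, the function name `fit_logistic_ml` is hypothetical, and the code is not a substitute for PROC PROBIT.

```python
import math

# Hypothetical data: number responding out of N trials at each dose.
doses = [1.0, 2.0, 3.0, 4.0, 5.0]
successes = [2, 6, 10, 14, 18]
N = 20

def fit_logistic_ml(doses, successes, n_trials, n_iter=25):
    """Maximize the binomial likelihood with pi_i = 1 / (1 + exp(-(b0 + b1*dose)))."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        u0 = u1 = 0.0          # score vector (gradient of the log-likelihood)
        i00 = i01 = i11 = 0.0  # expected information matrix
        for d, y in zip(doses, successes):
            pi = 1.0 / (1.0 + math.exp(-(b0 + b1 * d)))
            w = n_trials * pi * (1.0 - pi)
            r = y - n_trials * pi
            u0 += r
            u1 += r * d
            i00 += w
            i01 += w * d
            i11 += w * d * d
        # Solve the 2x2 system (information) * step = score.
        det = i00 * i11 - i01 * i01
        step0 = (i11 * u0 - i01 * u1) / det
        step1 = (i00 * u1 - i01 * u0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < 1e-10:
            break
    return b0, b1

b0_ml, b1_ml = fit_logistic_ml(doses, successes, N)
ld50_ml = -b0_ml / b1_ml  # dose at which the fitted probability is 0.5
print(f"b0 = {b0_ml:.3f}, b1 = {b1_ml:.3f}, LD50 = {ld50_ml:.3f}")
```

Unlike the linearized fit, this works on the counts directly, so doses with 0 or N responders need no special handling.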

## Probit Analysis: Limitations

- Least squares is limited: it is a linearized solution to a non-linear problem.
- Even under ML, the solution for percentiles is approximated: inversion; use of the ratio $\beta_0/\beta_1$ (Fieller, 1944).
- Appropriate only for proportional data.
- Assumes the response $\Phi^{-1}(p_{ij}) \sim N(\mu, \sigma^2)$.
- Interval estimation and comparison of percentile values are approximated.
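The inversion step can be written out for the 50th percentile. The following is a sketch only: it assumes a symmetric tolerance distribution (Normal or Logistic), so that $\Phi(0) = 0.5$, and it uses the delta method as a simpler stand-in for Fieller's theorem, which is a substitution made here and not the method cited on the slide.

```latex
\Phi(\beta_0 + \beta_1\,\mathrm{dose}) = 0.5
\;\Longrightarrow\;
\widehat{LD}_{50} = -\frac{\hat\beta_0}{\hat\beta_1},
\qquad
\widehat{\mathrm{Var}}\bigl(\widehat{LD}_{50}\bigr) \approx
\frac{1}{\hat\beta_1^{2}}
\Bigl[\mathrm{Var}(\hat\beta_0)
+ 2\,\widehat{LD}_{50}\,\mathrm{Cov}(\hat\beta_0,\hat\beta_1)
+ \widehat{LD}_{50}^{\,2}\,\mathrm{Var}(\hat\beta_1)\Bigr]
```

Both the point estimate and its variance depend on a ratio of estimated parameters, which is why the resulting intervals are only approximate.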

## Modern Approaches (cont): Nonlinear Regression - Iterative Least Squares

Directly models the response as:

$$y_{ij} = f(\text{dose}_i) + \epsilon_{ij} \qquad (3)$$

where $y_{ij}$ is an observed continuous response, $f(\text{dose}_i)$ may be generalized to any continuous function of dose, and $\epsilon_{ij} \sim N(0, \sigma^2)$.

Minimize: $SS_{error} = \sum_{i,j} [\,y_{ij} - f(\text{dose}_i)\,]^2$.

SAS: PROC NLIN.
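The iterative least squares idea behind equation (3) can be sketched with a plain Gauss-Newton loop. The example assumes the exponential form $f(\text{dose}) = \beta_0 \exp(-\beta_1\,\text{dose})$, synthetic noise-free data, and hand-picked starting values; the function name `fit_exponential_nls` is hypothetical, and PROC NLIN offers more robust algorithms than this undamped version.

```python
import math

# Synthetic, noise-free responses generated from b0 = 10, b1 = 0.5.
doses = [0.0, 1.0, 2.0, 3.0, 4.0, 6.0]
responses = [10.0 * math.exp(-0.5 * d) for d in doses]

def fit_exponential_nls(doses, responses, b0, b1, n_iter=50):
    """Gauss-Newton iterations minimizing sum of (y - b0*exp(-b1*dose))^2."""
    for _ in range(n_iter):
        jtj00 = jtj01 = jtj11 = 0.0  # J'J, with J the Jacobian of f
        jtr0 = jtr1 = 0.0            # J'r, with r the residuals
        for d, y in zip(doses, responses):
            e = math.exp(-b1 * d)
            resid = y - b0 * e
            j0 = e             # df/db0
            j1 = -b0 * d * e   # df/db1
            jtj00 += j0 * j0
            jtj01 += j0 * j1
            jtj11 += j1 * j1
            jtr0 += j0 * resid
            jtr1 += j1 * resid
        # Solve the 2x2 normal equations (J'J) * step = J'r.
        det = jtj00 * jtj11 - jtj01 * jtj01
        step0 = (jtj11 * jtr0 - jtj01 * jtr1) / det
        step1 = (jtj00 * jtr1 - jtj01 * jtr0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < 1e-12:
            break
    return b0, b1

# Starting values are guesses; poor guesses can make Gauss-Newton diverge.
b0_hat, b1_hat = fit_exponential_nls(doses, responses, b0=8.0, b1=0.3)
print(f"b0 = {b0_hat:.4f}, b1 = {b1_hat:.4f}")
```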

## Nonlinear Regression - Iterative Least Squares: Limitations

- Assumes the data, $y_{ij}$, are continuous; they could be discrete.
- The response distribution may not be Normal, i.e. the assumption $\epsilon_{ij} \sim N(0, \sigma^2)$ may not hold.
- Standard errors and inference are asymptotic.
- Treatment comparisons are difficult in SAS: differential sums of squares; specialized SAS codes; PROC IML.

## Modern Approaches (cont): Generalized Nonlinear Model - Maximum Likelihood

Directly models the response as:

$$y_{ij} = f(\text{dose}_i) + \epsilon_{ij}$$

where $y_{ij}$ and $f(\text{dose}_i)$ are as defined above. Estimation is through maximum likelihood, where the response distribution may take on many forms:

- Normal: $y_{ij} \sim N(\mu_i, \sigma)$
- Binomial: $y_{ij} \sim \text{bin}(N, \pi_i)$
- Poisson: $y_{ij} \sim \text{poisson}(\lambda_i)$
- or in general: $y_{ij} \sim ƒ(\theta)$

## Generalized Nonlinear Model - Maximum Likelihood

Maximize:

$$L(\theta) \propto \prod_{i,j} ƒ(\theta \mid y_{ij}) \qquad (4)$$

- Nonlinear estimation.
- Response distribution not restricted to Normal.
- May also incorporate random components into the model.
- Treatment comparisons easier in SAS: contrast and estimate statements.

SAS: PROC NLMIXED.
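To make likelihood (4) concrete, the sketch below pairs a Poisson response distribution with an exponential dose-response curve and maximizes the log-likelihood by a crude grid search. The data, the curve, and the grid limits are all assumptions chosen for illustration; PROC NLMIXED uses proper numerical optimizers instead of a grid.

```python
import math

# Hypothetical count data (e.g. survivors) observed at each dose.
doses = [0, 1, 2, 3, 4, 5]
counts = [20, 12, 7, 4, 3, 1]

def mean_curve(dose, b0, b1):
    """Assumed dose-response curve f(dose) = b0 * exp(-b1 * dose)."""
    return b0 * math.exp(-b1 * dose)

def poisson_loglik(b0, b1):
    """Poisson log-likelihood with mean f(dose) at each observation."""
    total = 0.0
    for d, y in zip(doses, counts):
        mu = mean_curve(d, b0, b1)
        total += y * math.log(mu) - mu - math.lgamma(y + 1)
    return total

# Coarse grid over plausible parameter values (limits chosen by eye).
b0_grid = [10.0 + 0.5 * i for i in range(41)]   # 10.0 to 30.0
b1_grid = [0.10 + 0.01 * j for j in range(91)]  # 0.10 to 1.00

# max() compares tuples by their first element, the log-likelihood.
loglik_hat, b0_hat, b1_hat = max(
    (poisson_loglik(b0, b1), b0, b1) for b0 in b0_grid for b1 in b1_grid
)
print(f"b0 = {b0_hat:.2f}, b1 = {b1_hat:.2f}, logL = {loglik_hat:.3f}")
```

Swapping `poisson_loglik` for a Normal or Binomial log-density changes the response distribution without touching the dose-response curve, which is the flexibility the slide describes.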

## Generalized Non-linear Model - Inference

Formulate a full dummy variable model encompassing $k$ treatments. The joint likelihood over the $k$ treatments becomes:

$$L(\theta_k) \propto \prod_{i,j,k} ƒ(\theta_k \mid y_{ijk}) \qquad (5)$$

where $y_{ijk}$ is the $j$th replication of the $i$th dose in the $k$th treatment and $\theta_k$ are the parameters of the $k$th treatment. Comparison of parameter values is then possible through single and multiple degree of freedom contrasts.

## Generalized Nonlinear Model: Limitations

- Percentile solution may still be based on inversion or Fieller's theorem.
- Inferences are based on normal theory approximations.
- Standard errors and confidence intervals are asymptotic.

## Modern Approaches (cont): Bayesian Estimation - Iterative Numerical Techniques

Considers the probability of the parameters, $\theta$, given the data $y_{ij}$. Using Bayes theorem, estimate:

$$p(\theta \mid y_{ij}) = \frac{p(y_{ij} \mid \theta)\, p(\theta)}{\int p(y_{ij} \mid \theta)\, p(\theta)\, d\theta} \qquad (6)$$

where $p(\theta \mid y_{ij})$ is the posterior distribution of $\theta$ given the data $y_{ij}$, $p(y_{ij} \mid \theta)$ is the likelihood defined above, and $p(\theta)$ is a prior probability distribution for the parameters $\theta$.

## Bayesian Estimation - Iterative Numerical Techniques

- Nonlinear estimation.
- Percentiles can be found from the distribution of $\theta$.
- The likelihood is the same as for the Generalized Nonlinear Model: flexibility in the response distribution; $f(\text{dose}_i)$ can be any continuous function of dose.
- Inherently allows updating of the estimation.
- Correct interval estimation (credible intervals): agrees well with GNLM at midrange percentiles; can perform better at extreme percentiles.
- SAS: No procedure available.
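One simple way to see how percentiles follow from the posterior is a grid approximation. The sketch below assumes a binomial likelihood with a Logistic curve, a flat prior over a bounded grid of $(\beta_0, \beta_1)$, and made-up data; all names and grid limits are hypothetical, and this is not the iterative numerical technique used in the presentation's own programs.

```python
import math

# Hypothetical data: number responding out of N trials at each dose.
doses = [1.0, 2.0, 3.0, 4.0, 5.0]
successes = [2, 6, 10, 14, 18]
N = 20

def log_likelihood(b0, b1):
    """Binomial log-likelihood with pi = 1 / (1 + exp(-(b0 + b1 * dose)))."""
    total = 0.0
    for d, y in zip(doses, successes):
        eta = b0 + b1 * d
        log_pi = -math.log1p(math.exp(-eta))
        log_one_minus_pi = -math.log1p(math.exp(eta))
        total += y * log_pi + (N - y) * log_one_minus_pi
    return total

# Flat prior on a bounded grid (an assumption), so posterior is proportional
# to the likelihood evaluated at each grid cell.
b0_grid = [-8.0 + 0.05 * i for i in range(161)]  # -8.0 to 0.0
b1_grid = [0.2 + 0.02 * j for j in range(141)]   # 0.2 to 3.0
cells = [(b0, b1, log_likelihood(b0, b1)) for b0 in b0_grid for b1 in b1_grid]

# Subtract the maximum before exponentiating to avoid underflow.
max_ll = max(ll for _, _, ll in cells)
weights = [math.exp(ll - max_ll) for _, _, ll in cells]
total_weight = sum(weights)

# Posterior of LD50 = -b0 / b1: one value per cell, sorted by LD50.
draws = sorted((-b0 / b1, w / total_weight)
               for (b0, b1, _), w in zip(cells, weights))

def posterior_quantile(q):
    """Smallest LD50 value whose cumulative posterior mass reaches q."""
    cumulative = 0.0
    for value, prob in draws:
        cumulative += prob
        if cumulative >= q:
            return value
    return draws[-1][0]

lower, median, upper = (posterior_quantile(q) for q in (0.025, 0.5, 0.975))
print(f"LD50 median = {median:.2f}, 95% credible interval = ({lower:.2f}, {upper:.2f})")
```

Because the interval is read directly from the posterior of the percentile, no inversion formula or normal approximation is needed; the price is the choice of prior and the extra computation.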

## Bayesian Estimation - Iterative Numerical Techniques: Limitations

- User must specify a prior probability $p(\theta)$.
- Estimation requires custom programming: SAS (Datastep, PROC IML); custom C program codes; specialized software (WinBUGS).
- Computationally intensive solutions.
- Requires statistical expertise.

Sample programs and data are available at: http://www.uidaho.edu/ag/statprog

Summary of Estimation Methods

## Concluding Remarks

- Dose-response models have wide application in agriculture.
- Probit models are limited in scope.
- Generalized nonlinear and Bayesian models provide the most flexible framework for estimating dose-response:
  - Can use various response distributions.
  - Can use various dose-response models.
  - Can incorporate random model effects.
  - Can be used to compare treatments. GNLM: full dummy variable modeling. Bayesian methods: probability statements.
- They are useful for quantifying the relative efficacy of various treatments.

## Concluding Remarks (cont)

- Both GNLM and Bayesian methods give similar percentile estimates for midrange percentiles.
- Bayesian estimation is preferred when estimating extreme percentiles. Custom programming required.
- Generalized nonlinear models are sufficient in most situations. Software available.

## References

- Berkson, J. 1944. Application of the Logistic function to bio-assay. J. Amer. Stat. Assoc. 39: 357-65.
- Bliss, C. I. 1934. The method of probits. Science 79(2037): 38-39.
- Bliss, C. I. 1938. The determination of dosage-mortality curves from small numbers. Quart. J. Pharm. 11: 192-216.
- Fieller, E. C. 1944. A fundamental formula in the statistics of biological assay and some applications. Quart. J. Pharm. 17: 117-23.
- Finney, D. J. 1971. Probit Analysis. Cambridge University Press, London.
- Fisher, R. A. 1935. Appendix to Bliss, C. I.: The case of zero survivors. Ann. Appl. Biol. 22: 164-5.
- SAS Inst. Inc. 2004. SAS OnlineDoc, Version 9. Cary, NC.
- Seefeldt, S. S., J. E. Jensen, and P. Fuerst. 1995. Log-logistic analysis of herbicide dose-response relationships. Weed Technol. 9: 218-227.

## "Top Ten Things A Statistician Does Not Want to Hear"

- (10) I have never had a course in statistics, but how hard can it be?
- (9) I don't have a design!
- (8) I should have talked to you before I ran the experiment, but.....
- (7) Why should I replicate? I might get a different answer!
- (6) I should have randomized what?

- (5) Could you have this by tomorrow?
- (4) Halfway through the experiment, we changed.....
- (3) Can you make it so that the p-value is less than.....?
- (2) I have 20,000 observations from this one cow!
- (1) Do you have a minute?

Thank you!