Designing Factorial Experiments with Binary Response
Tel-Aviv University, Faculty of Exact Sciences, Department of Statistics and Operations Research


1 Designing Factorial Experiments with Binary Response
Hovav A. Dror & David M. Steinberg
Tel-Aviv University, Faculty of Exact Sciences, Department of Statistics and Operations Research
International Conference on DOE, Nankai University, July 2006

2 Overview
• Introduction: Designs for GLMs
• Local D-optimal Designs
• Robust Designs
• Sequential Designs
• Conclusions
Robust Experimental Design for multivariate GLM
Technical reports and MATLAB macros available at www.math.tau.ac.il/~dms/GLM_Design

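3 D-optimal GLM designs
• The theory parallels that for the linear model, with one crucial difference: Fisher's information matrix changes from $F^T F$ to $F^T W F$, where the weight matrix $W$ depends on the unknown model coefficients.
• D-optimality: maximize $|F^T W F|$. Because $W$ depends on the coefficients, the optimum is only local (Local D-optimal design, Local D-Efficiency).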
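To make the role of the weights concrete, here is a minimal sketch that evaluates $|F^T W F|$ for a one-variable logistic model with linear predictor b0 + b1*x. The function name `info_det`, the two-point design and the coefficient values are assumptions chosen for illustration; they do not come from the presentation. The weight p(1 - p) is the standard GLM weight for the logistic link.

```python
import math

def info_det(design, b0, b1):
    """Determinant of F^T W F for a logistic model with predictor b0 + b1*x.

    `design` is a list of x values, one per run. For the logistic link the
    GLM weight of a run is p * (1 - p), where p is the success probability.
    """
    s0 = s1 = s2 = 0.0
    for x in design:
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        w = p * (1.0 - p)
        s0 += w          # entry (1,1) of F^T W F
        s1 += w * x      # entries (1,2) and (2,1)
        s2 += w * x * x  # entry (2,2)
    return s0 * s2 - s1 * s1

# The same design is informative under one coefficient vector and nearly
# useless under another, which is why the optimum is only "local".
design = [-1.5, 1.5]
good = info_det(design, 0.0, 1.0)  # responses near p = 0.18 and p = 0.82
poor = info_det(design, 6.0, 1.0)  # responses almost always 1
```

With all weights equal to a constant the expression reduces to a multiple of $|F^T F|$, the linear-model criterion.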

4 Introduction: Visualization

5 Introduction: Main Objectives
• Construct an algorithm to find Local D-optimal Designs.
• Generalization: from locally optimal designs to robust designs, which take account of the uncertainty in the model parameters.
• Further robustness: to different link functions, linear predictors, etc.
• Sequential design: use data to estimate the model and improve the design as the experiment runs.

6 Overview
• Introduction
• Local D-optimal Designs
• Robust Designs
• Sequential Designs
• Conclusions

7 Local D-optimal designs: Algorithm
• Mimics algorithms for linear models.
• Main element: a row exchange procedure.
• Rows are added or deleted, weighting the regression functions in accord with the mean value.
• Timing: 1 second for a 16-point Poisson regression with 5 variables plus interactions (accuracy: 2 decimal places).
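The exchange idea can be sketched as follows. This is a simplified swap-style exchange for a one-variable logistic model, not the authors' implementation (their MATLAB macros are referenced on the overview slide). The names `exchange` and `info_det`, the candidate grid, and the starting design are all assumptions made for illustration.

```python
import math

def info_det(design, b0, b1):
    """|F^T W F| for a logistic model with predictor b0 + b1*x."""
    s0 = s1 = s2 = 0.0
    for x in design:
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        w = p * (1.0 - p)  # logistic GLM weight
        s0 += w
        s1 += w * x
        s2 += w * x * x
    return s0 * s2 - s1 * s1

def exchange(design, candidates, b0, b1, max_sweeps=100):
    """Replace one run at a time by the candidate that most increases the determinant."""
    design = list(design)
    best = info_det(design, b0, b1)
    for _ in range(max_sweeps):
        improved = False
        for i in range(len(design)):
            for c in candidates:
                trial = design[:i] + [c] + design[i + 1:]
                d = info_det(trial, b0, b1)
                if d > best * (1.0 + 1e-12):  # accept strict improvements only
                    design, best, improved = trial, d, True
        if not improved:
            break  # no single swap helps any more
    return sorted(design), best

candidates = [i / 10 for i in range(-30, 31)]  # grid on [-3, 3]
start = [-3.0, -1.0, 1.0, 3.0]
local_design, local_det = exchange(start, candidates, 0.0, 1.0)
```

Each accepted swap strictly increases the determinant, so the loop terminates. The result is a local optimum for the assumed coefficients only, and a different coefficient vector generally gives a different design.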

8 Overview
• Introduction
• Local D-optimal Designs
• Robust Designs
   – Clustering: Motivating Example
   – Clustering vs. Bayesian Designs
   – Clustering vs. Compromise Designs
   – Linear Predictor and Link Function Robustness
   – Ink Production Example
• Sequential Designs
• Conclusions

9 Clustering: Motivating Example
• Proximity of 25 local D-optimal designs for a logistic model with uncertainty in the intercept value

10 Overview
• Introduction
• Local D-optimal Designs
• Robust Designs
   – Clustering: Motivating Example
   – Clustering vs. Bayesian Designs
   – Clustering vs. Compromise Designs
   – Linear Predictor and Link Function Robustness
   – Ink Production Example
• Conclusions

11 Clustering vs. Bayesian Designs (1)
• Chaloner & Larntz (1989) design criterion: maximize the mean, over a prior distribution, of the log determinant of the information matrix.
• Their optimal Bayesian design:
   – Uses 7 support points
   – Reported value of -4.5783 for the criterion
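A Monte Carlo version of this criterion can be sketched as below, assuming a one-variable logistic model and a design given as (x, design weight) pairs whose weights sum to 1. The uniform prior, the sample size and the function names are illustrative assumptions; they are not the prior or the numerical method used by Chaloner & Larntz.

```python
import math
import random

def log_det_info(design, b0, b1):
    """log |M| for a weighted design [(x, design_weight), ...] under a logistic model.

    The design needs at least two distinct support points, otherwise |M| = 0.
    """
    s0 = s1 = s2 = 0.0
    for x, omega in design:
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        w = omega * p * (1.0 - p)
        s0 += w
        s1 += w * x
        s2 += w * x * x
    return math.log(s0 * s2 - s1 * s1)

def mean_log_det(design, prior_draws):
    """Average of log |M| over coefficient vectors drawn from the prior."""
    return sum(log_det_info(design, b0, b1) for b0, b1 in prior_draws) / len(prior_draws)

rng = random.Random(0)
# Hypothetical prior: intercept uniform on [-1, 1], slope uniform on [0.5, 1.5].
draws = [(rng.uniform(-1.0, 1.0), rng.uniform(0.5, 1.5)) for _ in range(1000)]

spread = [(-1.5, 0.5), (1.5, 0.5)]
narrow = [(-0.1, 0.5), (0.1, 0.5)]
score_spread = mean_log_det(spread, draws)
score_narrow = mean_log_det(narrow, draws)
```

Comparing two candidate designs on the same set of draws, as done here, removes sampling noise from the comparison.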

12 Clustering vs. Bayesian Designs (2)
• K-means clustering over 100 local designs
• Local designs' coefficients: low-discrepancy sequence (Niederreiter's)
• Evaluated over 10,000 coefficient vectors
• Both designs (almost) meet sufficient requirements for optimality proof
[Figure: average log determinant of the information matrix vs. number of support points, with the value reported by Chaloner and Larntz (1989) marked]

13 Clustering vs. Bayesian Designs (3)
• Expect the Bayesian design to be generally better.
• But if clustering does not fall far behind, it offers:
   – Simplicity of creation
   – Considerably lower computational needs
   – An almost trivial extension to multivariate problems

14 Overview
• Introduction
• Local D-optimal Designs
• Robust Designs
   – Clustering: Motivating Example
   – Clustering vs. Bayesian Designs
   – Clustering vs. Compromise Designs
   – Linear Predictor and Link Function Robustness
   – Ink Production Example
• Conclusions

15 Clustering vs. Multivariate Compromise Designs (1)
• Woods, Lewis, Eccleston and Russell (Technometrics, May 2006):
   – A method for finding exact designs for experiments with several explanatory variables
   – Use simulated annealing to find a design under the same criterion as Chaloner & Larntz
   – They note that evaluating the integral is too computationally intensive to incorporate within a search algorithm, and therefore average over a partial set

16 Clustering vs. Multivariate Compromise Designs (2)
• Crystallography experiment
   – 4 variables (rate of agitation during mixing, volume of composition, temperature and evaporation rate)
   – These affect the probability that a new product is formed
   – First-order logistic model (with no interactions)
   – 16 (/48) observations
   – Parameter space: (demonstrating algorithm's superiority)
• Performance evaluated using median and minimum Local D-Efficiencies relative to 10,000 random parameter vectors
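The evaluation described above can be sketched for a much smaller problem: a one-variable logistic model, where the locally D-optimal design is known to put equal weight on the two points at which the linear predictor equals about ±1.5434. Local D-efficiency is taken as (|M(design)| / |M(local optimum)|)^(1/p) with p = 2 parameters. The sampling ranges, the candidate design and all function names are assumptions for illustration, not the crystallography setup.

```python
import math
import random
import statistics

C = 1.5434  # |linear predictor| at the support points of the local optimum

def det_info(design, b0, b1):
    """|M| for a weighted design [(x, design_weight), ...] under a logistic model."""
    s0 = s1 = s2 = 0.0
    for x, omega in design:
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        w = omega * p * (1.0 - p)
        s0 += w
        s1 += w * x
        s2 += w * x * x
    return s0 * s2 - s1 * s1

def local_optimum(b0, b1):
    """Locally D-optimal two-point design for coefficients (b0, b1), b1 != 0."""
    return [((-C - b0) / b1, 0.5), ((C - b0) / b1, 0.5)]

def efficiency(design, b0, b1):
    """Local D-efficiency of `design` at (b0, b1); the exponent is 1/p with p = 2."""
    return math.sqrt(det_info(design, b0, b1) / det_info(local_optimum(b0, b1), b0, b1))

def summarize(design, betas):
    """Median and minimum local D-efficiency over a set of coefficient vectors."""
    effs = [efficiency(design, b0, b1) for b0, b1 in betas]
    return statistics.median(effs), min(effs)

rng = random.Random(1)
betas = [(rng.uniform(-1.0, 1.0), rng.uniform(0.5, 1.5)) for _ in range(1000)]
median_eff, min_eff = summarize([(-1.5, 0.5), (1.5, 0.5)], betas)
```

The median describes typical performance across the parameter space, while the minimum exposes the worst case, which is why the comparison tables report both.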

17 Clustering vs. Multivariate Compromise Designs (3)
Design                     Median Efficiency   Minimum Efficiency
Standard 2^4 factorial     0.07                0.003
Woods' Compromise design   0.41                0.12

18 Clustering vs. Multivariate Compromise Designs (4)
• Clustering procedure (1):
   – First, create Local Designs for 100 parameter vectors (Niederreiter sequence): 30 seconds
   – K-means clustering (K=16) of the resulting 1,600 points: 0.25 seconds
Design                   Median Efficiency    Minimum Efficiency   Minutes
Standard 2^4 factorial   0.07                 0.003
Woods' Compromise        0.41                 0.12                 7
Clustering (1)           0.40 [0.38, 0.42]    0.09 [0.06, 0.12]    1
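The pooling-and-clustering step can be sketched with a plain Lloyd's K-means, shown here on a toy one-variable logistic problem with an uncertain intercept. The `kmeans` function, its deterministic initialization, and the three intercept values are assumptions for illustration; the slides do not say which K-means variant or initialization was used. Each local design is taken as the two points where the linear predictor equals about ±1.5434, the known local optimum for this small model.

```python
import math

def kmeans(points, k, iters=100):
    """Plain Lloyd's algorithm on a list of coordinate tuples."""
    # Deterministic start: k points spread evenly through the list.
    centers = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            groups[nearest].append(p)
        # New center = coordinate-wise mean; an empty cluster keeps its old center.
        new = [tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[j]
               for j, g in enumerate(groups)]
        if new == centers:
            break
        centers = new
    return centers

C = 1.5434
intercepts = [-0.5, 0.0, 0.5]  # hypothetical uncertainty in b0, with b1 fixed at 1
local_designs = [[(-C - b0,), (C - b0,)] for b0 in intercepts]

# Pool every support point of every local design, then cluster into K design points.
pooled = sorted(p for design in local_designs for p in design)
robust_design = kmeans(pooled, 2)
```

The cluster centers serve as the support points of the robust design. The crystallography example works the same way with four coordinates per point and K = 16.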

19 Clustering vs. Multivariate Compromise Designs (5)
• Clustering procedure (2):
   – Choose the clustering with the highest average log determinant of the information matrix over N clustering repetitions
Design                   Median Efficiency     Minimum Efficiency    Minutes
Standard 2^4 factorial   0.07                  0.003
Woods' Compromise        0.41                  0.12                  7
Clustering (1)           0.40                  0.09                  1
Clustering (2)           0.42 [0.416, 0.430]   0.096 [0.06, 0.13]    1

20 Clustering vs. Multivariate Compromise Designs (6)
• Fast procedure
• Examine the effect of the number of support points (20 seconds)
[Figure: approximate median and minimum efficiency vs. number of support points]

21 Clustering vs. Multivariate Compromise Designs (7)
Crystallography experiment: summary
Design                   Median Efficiency      Minimum Efficiency     Minutes
Standard 2^4 factorial   0.07                   0.003
Woods' Compromise        0.41                   0.12                   7
Clustering (1)           0.40                   0.09                   1
Clustering (2)           0.42                   0.096                  1
Clustering (3)           0.423 [0.415, 0.432]   0.177 [0.141, 0.213]   2.5

22 Clustering vs. Multivariate Compromise Designs (6)
• Advantageous byproduct of clustering (20 seconds):
[Figure: approximate median and minimum efficiency vs. number of support points]

23 Overview
• Introduction
• Prior Work
• Local D-optimal Designs
• Robust Designs
   – Clustering: Motivating Example
   – Clustering vs. Bayesian Designs
   – Clustering vs. Compromise Designs
   – Linear Predictor and Link Function Robustness
   – Ink Production Example
• Conclusions

24 Robustness for Linear Predictors and Link Functions
• (again from Woods et al.)
• 2 variables
• 2 linear predictors: with / without interactions
• 2 link functions: Probit / CLL (complementary log-log)
• Given (known) coefficient values

25 Overview
• Introduction
• Local D-optimal Designs
• Robust Designs
   – Clustering: Motivating Example
   – Clustering vs. Bayesian Designs
   – Clustering vs. Compromise Designs
   – Linear Predictor and Link Function Robustness
   – Ink Production Example
• Conclusions

26 Ink Production Example (1)
• A Poisson model
• 5 variables
• Normally distributed uncertainty in the coefficient values
• Uncertainty about interaction effects
• Centroid design reasonably efficient

27 Ink Production Example (2)
• 5 tubes, each with a different chemical
• Each tube: chosen concentration (fixed volume)
• Ink quality classification: number of imperfect marks on a standard printed test page
• Low concentrations: low quality, unusable
• High concentrations: expensive
• Model building based on experts' opinions

28 Ink Production Example (3)
• Model building based on experts' opinions

29 Ink Production Example (4)
• Full Factorial D-Efficiency:

30 Ink Production Example (5)
• Cluster Design D-Efficiency:

31 Ink Production Example (6)
• Centroid Design D-Efficiency:

32 Ink Production Example (6)
Centroid Design D-Efficiency; Cluster Design D-Efficiency

33 Ink Production Example (7)
[Figure: axes labeled Efficiency (0 to 1) and Equivalent Sample Size (0 to 100)]

34 Overview
• Introduction
• Local D-optimal Designs
• Robust Designs
• Sequential Designs
• Conclusions

35 Sequential Designs
• Good design requires knowledge of the coefficients.
• Use the data thus far to assess the model and the coefficients.
• Augment the design accordingly.
• A Bayesian framework is natural.
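One way to read these steps is sketched below for a one-variable logistic model: keep a discrete set of candidate coefficient vectors with prior weights, update the weights by Bayes' rule after each binary response, and pick the next run to maximize the posterior mean of the log determinant of the augmented design. This is only an illustration of the general idea. The slides do not describe the authors' actual procedure, and every name, candidate value and response below is an assumption.

```python
import math

def prob(x, b0, b1):
    """Success probability under the logistic model."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def update(particles, weights, x, y):
    """Bayes update of the discrete weights after observing y (0 or 1) at x."""
    post = [w * (prob(x, b0, b1) if y else 1.0 - prob(x, b0, b1))
            for (b0, b1), w in zip(particles, weights)]
    total = sum(post)
    return [w / total for w in post]

def info_det(xs, b0, b1):
    """|F^T W F| for runs at the points xs."""
    s0 = s1 = s2 = 0.0
    for x in xs:
        p = prob(x, b0, b1)
        w = p * (1.0 - p)
        s0 += w
        s1 += w * x
        s2 += w * x * x
    return s0 * s2 - s1 * s1

def next_point(xs, candidates, particles, weights):
    """Candidate maximizing the posterior mean log determinant of xs + [candidate].

    xs must already contain at least two distinct points so the determinant is positive.
    """
    def score(c):
        return sum(w * math.log(info_det(xs + [c], b0, b1))
                   for (b0, b1), w in zip(particles, weights))
    return max(candidates, key=score)

particles = [(-1.0, 1.0), (0.0, 1.0), (1.0, 1.0)]  # hypothetical coefficient vectors
weights = [1.0 / 3.0] * 3                            # flat prior over them
xs, ys = [-1.0, 1.0], [1, 1]                         # hypothetical runs and responses
for x, y in zip(xs, ys):
    weights = update(particles, weights, x, y)

candidates = [i / 10 for i in range(-30, 31)]
x_next = next_point(xs, candidates, particles, weights)
```

The same scoring function can rank a whole batch of candidate runs, which is how a group-sequential variant of this sketch would proceed.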

36 Sequential Designs
Current methods:
• Bruceton (Dixon and Mood 1948)
• Langlie (1965)
• Neyer (1994)
• Wang, Smith & Ye (2006)

37 Sequential Designs
Current methods are limited to:
• One-factor experiments.
• Fully sequential experiments.
Our method can be applied with many factors and in both fully sequential and group-sequential settings.

38 Efficiency Comparison (48 points)
Design             Median Efficiency   5% Quantile of Efficiency
One-stage robust   0.67                0.30
Sequential         0.98                0.85

39 Overview
• Introduction
• Local D-optimal Designs
• Robust Designs
• Sequential Designs
• Conclusions

40 Summary & Conclusions
• Local D-optimal designs for GLMs can be found easily.
• Clustering a database of local D-optimal designs creates a robust design.
• Clustering is robust to many types of uncertainty:
   – parameter space, linear predictors, link functions, …
• Simple procedure, minimal computational resources.
• Speed allows exploration of various designs and investigation of different numbers of support points.
• Outperforms more sophisticated and complex design optimization methods.
• Efficient sequential designs result from combining these ideas with a Bayesian updating approach.

