Statistics 2 for Chemical Engineering lecture 4


1 Statistics 2 for Chemical Engineering lecture 4
2DS01 Statistics 2 for Chemical Engineering lecture 4

2 Contents
- Summary of previous lectures
- Limitations of factorial designs and standard RSM designs
- Mixture designs
- D-optimal designs

3 Summary of previous lectures
- one-way ANOVA: compare means of several groups
- noise reduction through blocking
- factorial designs: screening, blocks, fractions, centre points
- optimisation: steepest ascent
- designs: CCD, Box-Behnken

4 Example 1: adhesive
Factors:
- amount of adhesive
- temperature
Constraints (in terms of coded variables):
- too little adhesive at too low temperature: unsatisfactory bonding
- too much adhesive at too high temperature: damage
Experimental region: [figure not reproduced in transcript]
Source: Montgomery, Design and Analysis of Experiments, 5th edition

5 Example 2: separation of chlorophenols
Factors:
- pH
- percentage of organic modifier
Constraints: retention times should be neither too short nor too long
Model (based on RPLC knowledge): complete second-order model plus a third-order term in pH
Experimental region: [figure not reproduced in transcript]
Source: P.F. de Aguiar et al., Chem. Intell. Lab. Syst. 30 (1995)
RPLC = reversed-phase liquid chromatography

6 Example 3: Blending of gasoline
Factors: types of octanes
Constraints: the effect of the octanes depends only on their proportions
Model: not known in general; sometimes only a small number of octanes are active
Experimental region: simplex (triangle, tetrahedron)

7 Mixtures: necessity for new designs
- for independent factors, factorial designs are suitable (experimental region: hypercube)
- in mixtures, factors are dependent because they add up to 100%
- notions of effects and interactions do not carry over to mixture experiments
- hypercube experimental regions give poor coverage of the experimental region of mixtures [figure not reproduced in transcript]

8 Mixture designs
- factors are ingredients of a mixture
- factors are dependent
- constraints: $0 \le x_i \le 1$ and $x_1 + x_2 + \dots + x_p = 1$
- experimental region is a simplex: a line segment for $x_1 + x_2 = 1$, a triangle for $x_1 + x_2 + x_3 = 1$

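9 Trilinear coordinate system
[Figure: trilinear (ternary) coordinate plot with vertices $x_1$ at $(1,0,0)$, $x_2$ at $(0,1,0)$ and $x_3$ at $(0,0,1)$; grid lines at 0.2, 0.4, 0.6 and 0.8; marked points include $(1/2, 1/2, 0)$ and the centroid $(1/3, 1/3, 1/3)$]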

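10 Simplex lattice design
$\{p, m\}$ simplex lattice design:
- $p$ = number of factors
- $m + 1$ = number of factor levels
- $x_i = 0, 1/m, 2/m, \dots, 1$ for $i = 1, \dots, p$
- total number of design points: $\binom{p+m-1}{m}$
Examples: $\{3,2\}$ lattice (6 points), $\{3,3\}$ lattice (10 points) [figures not reproduced in transcript]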
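The lattice definition above can be turned into a short enumeration. This is a minimal sketch, not taken from the lecture: the function name `simplex_lattice` is illustrative, and exact fractions are used so that the sum-to-one constraint can be checked without rounding.

```python
from fractions import Fraction
from itertools import product
from math import comb

def simplex_lattice(p, m):
    """Return all points of the {p, m} simplex-lattice design as tuples of Fractions."""
    points = []
    # Each coordinate is k/m with integer k in 0..m; keep the tuples whose k's sum to m,
    # which is the same as requiring the proportions to sum to 1.
    for ks in product(range(m + 1), repeat=p):
        if sum(ks) == m:
            points.append(tuple(Fraction(k, m) for k in ks))
    return points

lattice_32 = simplex_lattice(3, 2)
lattice_33 = simplex_lattice(3, 3)

# The number of points agrees with the binomial coefficient C(p + m - 1, m).
print(len(lattice_32), comb(3 + 2 - 1, 2))
print(len(lattice_33), comb(3 + 3 - 1, 3))
```

For three components this gives 6 points for the {3,2} lattice and 10 points for the {3,3} lattice.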

11 Simplex centroid design
$p$ components:
- the $p$ permutations of $(1, 0, \dots, 0)$
- the $\binom{p}{2}$ permutations of $(1/2, 1/2, 0, \dots, 0)$
- the $\binom{p}{3}$ permutations of $(1/3, 1/3, 1/3, 0, \dots, 0)$
- ...
- total: $2^p - 1$ design points
Example with 3 components (7 points): $x_1 = 1$; $x_2 = 1$; $x_3 = 1$; $x_1 = x_2 = 1/2$; $x_1 = x_3 = 1/2$; $x_2 = x_3 = 1/2$; $x_1 = x_2 = x_3 = 1/3$
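The centroid design can be generated by looping over all non-empty subsets of components and giving each selected component an equal share. A minimal sketch under that reading; the name `simplex_centroid` is illustrative and not part of the lecture.

```python
from fractions import Fraction
from itertools import combinations

def simplex_centroid(p):
    """Return the 2^p - 1 points of the simplex-centroid design for p components."""
    points = []
    for k in range(1, p + 1):                      # blends of k components
        for subset in combinations(range(p), k):   # which components take part
            points.append(tuple(
                Fraction(1, k) if i in subset else Fraction(0)
                for i in range(p)
            ))
    return points

centroid_3 = simplex_centroid(3)
for point in centroid_3:
    print(point)
print(len(centroid_3), 2 ** 3 - 1)
```

The points come out in the order in which the design can be run sequentially: pure blends first, then binary blends, then the overall centroid.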

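12 Models for mixture designs
Polynomial models for mixture responses may be written in different ways because of the constraint $x_1 + x_2 + \dots + x_p = 1$. The usual interpretation of the constant term does not make sense, because measurements at $(0, 0, \dots, 0)$ are impossible. The constant term can always be removed; e.g., for 3 components we may write $\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 = (\beta_0 + \beta_1) x_1 + (\beta_0 + \beta_2) x_2 + (\beta_0 + \beta_3) x_3$, since $\beta_0 = \beta_0 (x_1 + x_2 + x_3)$.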

13 Scheffé canonical polynomials
To give the coefficients a meaningful interpretation, one uses canonical forms of polynomials for mixture data. Scheffé introduced the following polynomials (the slide gives examples for $p = 3$):
- linear
- quadratic
- special cubic
- cubic
Other types of canonical polynomials exist:
- Cox polynomials
- homogeneous polynomials (Kronecker type)
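The formulas for the four Scheffé polynomials are missing from this transcript. For reference, the forms usually given in textbooks for $p = 3$ are sketched below; the symbol $\delta_{ij}$ for the extra cubic coefficients is a common textbook notation and an assumption here, not necessarily the notation of the slide.

```latex
\begin{align*}
\text{linear:} \quad
  & E(Y) = \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 \\
\text{quadratic:} \quad
  & E(Y) = \sum_{i=1}^{3} \beta_i x_i
    + \beta_{12} x_1 x_2 + \beta_{13} x_1 x_3 + \beta_{23} x_2 x_3 \\
\text{special cubic:} \quad
  & E(Y) = \sum_{i=1}^{3} \beta_i x_i + \sum_{i<j} \beta_{ij} x_i x_j
    + \beta_{123} x_1 x_2 x_3 \\
\text{cubic:} \quad
  & E(Y) = \sum_{i=1}^{3} \beta_i x_i + \sum_{i<j} \beta_{ij} x_i x_j
    + \sum_{i<j} \delta_{ij} x_i x_j (x_i - x_j)
    + \beta_{123} x_1 x_2 x_3
\end{align*}
```

None of these forms contains a constant term, in line with the mixture constraint.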

14 Mixture models: interpretation of coefficients
- the usual interpretation of interaction no longer holds, because the mixture factors are dependent
- $\beta_i$ is the expected response when $x_i = 1$ and $x_j = 0$ for all $j \ne i$ (“pure blend”)
- $\beta_i x_i + \beta_j x_j + \beta_{ij} x_i x_j$ is the expected response for a binary blend ($x_i + x_j = 1$)
- the excess term $\beta_{ij} x_i x_j$ indicates an “interaction” effect:
  - $\beta_{ij} > 0$: “(binary) synergistic blending”
  - $\beta_{ij} < 0$: “(binary) antagonistic blending”
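To make the interpretation concrete, the following sketch evaluates a Scheffé quadratic model at a pure blend and at a 50/50 binary blend. The coefficient values are hypothetical and chosen only for illustration; the function name `scheffe_quadratic` is likewise an assumption.

```python
def scheffe_quadratic(x, beta, beta_pair):
    """Expected response of a Scheffe quadratic model.

    x         : proportions (must sum to 1)
    beta      : linear coefficients, one per component
    beta_pair : dict {(i, j): coefficient} for i < j
    """
    linear = sum(b * xi for b, xi in zip(beta, x))
    blending = sum(c * x[i] * x[j] for (i, j), c in beta_pair.items())
    return linear + blending

# Hypothetical coefficients, for illustration only.
beta = [10.0, 20.0, 30.0]
beta_pair = {(0, 1): 8.0, (0, 2): -4.0, (1, 2): 0.0}

pure_1 = scheffe_quadratic((1.0, 0.0, 0.0), beta, beta_pair)
half_12 = scheffe_quadratic((0.5, 0.5, 0.0), beta, beta_pair)

# Excess over the straight-line average of the two pure blends.
excess_12 = half_12 - 0.5 * (beta[0] + beta[1])
print(pure_1, half_12, excess_12)
```

At the 50/50 blend the excess equals the pair coefficient divided by 4, so a positive coefficient lifts the response above the line between the two pure blends (synergistic) and a negative one pulls it below (antagonistic).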

15 Simplex-lattice versus simplex-centroid designs
- simplex-lattice allows for a fine grid on the experimental region
- a $\{p, m\}$ simplex-lattice cannot detect synergisms of order higher than $m$
- simplex-centroid may be executed sequentially (first pure blends, then binary mixtures, ...)
- both designs have most of their points on the boundary (= at least one factor equal to 0)

16 General recommendations for mixture designs
- allow enough degrees of freedom (number of design points minus number of model terms) for precise estimation of the variance:
  - add extra points of special interest
  - replicate the design
- add points in the interior:
  - to increase coverage of the experimental region
  - to increase degrees of freedom for variance estimation
- perform a lack-of-fit test if there are replicates
- use a linear model when screening; use higher-order models for optimization
- perform blocking if necessary

17 Various remarks about mixture designs
- mixture designs may be combined with factorial designs when some variables are not related to the mixture (“process variables”)
- pseudocomponents may be used when there are further restrictions on the mixture ingredients, such as $0 \le x_i \le 0.3$

18 Example of analysis of mixture data
- octane blending with 3 components
- response is octane rating
- goal is optimization of octane rating
- simplex centroid design: $2^3 - 1 = 7$ points
- two additional check points: of commercial interest and of the current production process
- every observation repeated, so 18 observations in total
- all experiments under the same conditions, so no blocks
- because the goal is optimization, we start with the quadratic model (the simplest model that allows optimization)

19 Results of analysis of mixture data: quadratic model
- residuals look OK
- significant model (p-value in ANOVA < 0.05; see also high $R^2$)
- BUT: significant lack-of-fit (this option must be activated in Statgraphics with a right mouse click)

20 Results of analysis of mixture data: special-cubic model
- choose the next simplest model (leaves more degrees of freedom for accurate estimation of the error variance)
- residuals look OK
- significant model (p-value in ANOVA < 0.05) and no significant lack-of-fit

21 Further results: special-cubic model
- residuals show only a slight indication of non-normality
- slight pattern in residual plots (variance not constant)
- BC “interaction” not significant (unimportant when optimizing)
- antagonistic blending of AB and AC

22 Optimization results
optimum near $x_1 = 1.0$

23 Limitations of factorial designs + classical RSM designs
- experimental region may not be a hypercube:
  - impossibility to reach a corner of the experimental region
  - specific constraints
  - process factors are ingredients of a mixture
- chemical knowledge postulates an asymmetrical model:
  - interaction not possible
  - extra higher-order term for one factor
Factorial designs and classical RSM designs (CCD, Box-Behnken) cannot be used in these circumstances.

24 Some desirable properties of designs
1. require a minimum number of experimental runs
2. allow precise estimates of regression coefficients
3. allow precise predictions of responses
4. allow experiments to be performed in blocks
5. make it possible to detect lack-of-fit
Note: 2. and 3. seem similar, but are not the same! We will generalize the use of corner points in $2^p$ designs using criterion 2.

25 Example: simple linear regression
- given: minimal and maximal settings of the factor
- problem: which settings are optimal for determining the slope?
[Figure: two panels on the axis from min to max, one labelled "large effect in slope" and one labelled "small effect in slope"]

26 Simple linear regression: variance of slope
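The formula on this slide did not survive in the transcript. The standard result for the least-squares slope in the model $E(Y) = \beta_0 + \beta_1 x$ with independent errors of variance $\sigma^2$ is given below; the symbol $S$ matches the quantity used on the next slide.

```latex
\[
\operatorname{Var}(\hat{\beta}_1) = \frac{\sigma^2}{S},
\qquad
S = \sum_{i=1}^{n} (x_i - \bar{x})^2 .
\]
```

Since $\sigma^2$ is fixed by the process, the only way to reduce the variance of the slope through the design is to choose settings $x_i$ that make $S$ large.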

27 Distribution of design points: simple linear regression
Recall: the variance of the slope is small if $S = \sum_i (x_i - \bar{x})^2$ is large.
Experimental region: $-1 \le x \le +1$
- $n = 2$: $x_1 = -1$ and $x_2 = +1$ (or vice versa): $S = 2$
- $n = 3$:
  - $x_1 = -1$, $x_2 = 0$, $x_3 = +1$: $S = 2$
  - $x_1 = -1$, $x_2 = -1$, $x_3 = +1$: $S = 8/3 > 2$
  - $x_1 = -1$, $x_2 = c$, $x_3 = +1$: $S = \tfrac{2}{3}(c^2 + 3)$
  - “optimal solution” (not feasible!): 1½ measurements at $-1$ and 1½ measurements at $+1$
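The values of S quoted above can be checked with a few lines of exact arithmetic. A minimal sketch; the helper name `sum_of_squares` is illustrative.

```python
from fractions import Fraction

def sum_of_squares(xs):
    """S = sum of squared deviations of the settings from their mean."""
    xs = [Fraction(x) for x in xs]
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)

S_two = sum_of_squares([-1, 1])
S_symmetric = sum_of_squares([-1, 0, 1])
S_skewed = sum_of_squares([-1, -1, 1])

# General middle point c: S should equal (2/3) * (c^2 + 3).
c = Fraction(1, 2)
S_general = sum_of_squares([-1, c, 1])
S_formula = Fraction(2, 3) * (c ** 2 + 3)

print(S_two, S_symmetric, S_skewed, S_general, S_formula)
```

Moving the middle point to either end of the region raises S from 2 to 8/3, which is why the centre point does not help when only the slope is of interest.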

28 General setup: matrix formulation
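The content of this slide is missing from the transcript. The standard matrix formulation of linear regression, on which the following slides build, is sketched here; the notation $X^t$ for the transpose follows the later slides, and the remaining symbols are the usual textbook ones rather than confirmed slide notation.

```latex
\[
Y = X\beta + \varepsilon,
\qquad
\hat{\beta} = (X^t X)^{-1} X^t Y,
\qquad
\operatorname{Cov}(\hat{\beta}) = \sigma^2 (X^t X)^{-1},
\]
```

where $Y$ is the vector of $n$ observations, $X$ the $n \times k$ design matrix with one row per run and one column per model term, and $\varepsilon$ a vector of independent errors with variance $\sigma^2$. The matrix $X^t X$ is the information matrix.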

29 Design matrix: quadratic linear regression
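The slide body is missing from the transcript. As an illustration of what a design matrix for a quadratic model looks like, the sketch below assumes a full second-order model in two factors run on a 3 x 3 grid of coded levels; both the model and the grid are assumptions made for this example.

```python
from itertools import product

def quadratic_row(x1, x2):
    """Model terms for one run: 1, x1, x2, x1^2, x2^2, x1*x2."""
    return [1, x1, x2, x1 * x1, x2 * x2, x1 * x2]

def design_matrix(points):
    """One row of model terms per experimental run."""
    return [quadratic_row(x1, x2) for x1, x2 in points]

def information_matrix(X):
    """X^t X, computed entry by entry."""
    k = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(k)]
            for i in range(k)]

points = list(product([-1, 0, 1], repeat=2))  # 3^2 factorial in coded units
X = design_matrix(points)
XtX = information_matrix(X)

for row in X:
    print(row)
```

Each column of `X` corresponds to one regression coefficient, so the number of columns (here 6) is the minimum number of distinct runs needed to estimate the model.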

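30 Information matrix and confidence regions
Confidence region for the regression parameters (standard form): $(\beta - \hat{\beta})^t X^t X (\beta - \hat{\beta}) \le k\, s^2 F_{k,\, n-k;\, \alpha}$, with $k$ the number of regression parameters and $s^2$ the estimate of the error variance.
Properties of the confidence region:
- it is an ellipsoid
- its volume is proportional to $\left(\det (X^t X)^{-1}\right)^{1/2}$
- the lengths of its axes are proportional to the square roots of the eigenvalues of $(X^t X)^{-1}$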

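31 Information matrix and prediction variance
Variance of the predicted response at a point $x$: $\operatorname{Var}(\hat{Y}(x)) = \sigma^2 f^t(x) (X^t X)^{-1} f(x)$, where $f^t(x)$ is a row vector with the entries of the corresponding row of the design matrix $X$.
Example: [not reproduced in transcript]
To compare designs one uses the scaled prediction variance: $n \operatorname{Var}(\hat{Y}(x)) / \sigma^2 = n f^t(x) (X^t X)^{-1} f(x)$.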

32 Comparison of designs: n=3
Model for both designs: $E(Y) = \beta_0 + \beta_1 x$
Design $-1, 0, 1$:
- $(X^t X)^{-1}(2,2) = 1/2$
- scaled prediction variance: $1 + \tfrac{3}{2} x^2$
- better choice for maximum prediction variance
Design $-1, 1, 1$:
- $(X^t X)^{-1}(2,2) = 3/8$
- scaled prediction variance: $\tfrac{3}{8}(3 - 2x + 3x^2)$
- better choice for slope
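The two designs can be compared numerically. The sketch below assumes the straight-line model with f(x) = (1, x) and uses the closed-form inverse of a 2 x 2 matrix; the function names are illustrative.

```python
from fractions import Fraction

def information_inverse(xs):
    """(X^t X)^{-1} for the model E(Y) = b0 + b1*x, as a 2x2 nested tuple."""
    n = len(xs)
    sx = sum(Fraction(x) for x in xs)
    sxx = sum(Fraction(x) ** 2 for x in xs)
    det = n * sxx - sx * sx
    return ((sxx / det, -sx / det), (-sx / det, Fraction(n) / det))

def scaled_prediction_variance(xs, x):
    """n * f(x)^t (X^t X)^{-1} f(x) with f(x) = (1, x)."""
    (a, b), (_, d) = information_inverse(xs)
    x = Fraction(x)
    return len(xs) * (a + 2 * b * x + d * x * x)

symmetric = [-1, 0, 1]
skewed = [-1, 1, 1]
grid = [Fraction(k, 10) for k in range(-10, 11)]  # points in [-1, 1]

for name, design in (("-1,0,1", symmetric), ("-1,1,1", skewed)):
    slope_entry = information_inverse(design)[1][1]
    worst = max(scaled_prediction_variance(design, x) for x in grid)
    print(name, "slope entry:", slope_entry, "max scaled variance:", worst)
```

The skewed design has the smaller slope entry (3/8 against 1/2), while the symmetric design has the smaller maximum scaled prediction variance over the region (5/2 against 3), so the two criteria rank the designs differently.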

33 Exact designs versus continuous designs
- a mathematical design puts weights on design points
- continuous design: the optimal distribution may not be feasible (non-integer weights)
- exact design: the optimal distribution has integer weights and is therefore feasible

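34 Confidence region: example 1
- $\beta_1$: small variance, i.e. known with high precision
- $\beta_2$: large variance, i.e. known with low precision
- axes of the ellipsoid are parallel to the coordinate axes, hence the parameter estimates for $\beta_1$ and $\beta_2$ are uncorrelated
[Figure: confidence ellipse in the ($\beta_1$, $\beta_2$) plane]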

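35 Confidence region: example 2
- $\beta_1$ and $\beta_2$ known with the same precision
- axes of the ellipsoid are parallel to the coordinate axes, hence the parameter estimates for $\beta_1$ and $\beta_2$ are uncorrelated
[Figure: confidence ellipse in the ($\beta_1$, $\beta_2$) plane]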

36 Confidence region: example 3
- $\beta_1$: medium variance, i.e. known with medium precision
- $\beta_2$: large variance, i.e. known with low precision
- axes of the ellipsoid are not parallel to the coordinate axes, hence the parameter estimates for $\beta_1$ and $\beta_2$ are correlated
[Figure: confidence ellipse in the ($\beta_1$, $\beta_2$) plane]

37 Optimality criteria
Several criteria are used to construct optimal designs.
Based on $(X^t X)^{-1}$:
- A-optimality: minimize the trace (= sum of the eigenvalues) of $(X^t X)^{-1}$
- D-optimality: minimize the determinant of $(X^t X)^{-1}$, i.e. maximize $\det(X^t X)$
Based on prediction variance:
- G-optimality: minimize the maximum scaled prediction variance
- V-optimality: minimize the average scaled prediction variance
Note: the usual $2^p$ designs are D-optimal!
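As a small illustration of the D-criterion, the sketch below compares the 2^2 factorial with a four-run design in which one corner is replaced by the centre point. The first-order model with terms (1, x1, x2) and the alternative design are assumptions chosen for this example; a larger determinant of the information matrix X^t X means a smaller confidence region for the coefficients.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by Gaussian elimination with exact fractions."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)          # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result            # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def information_matrix(points, model):
    rows = [model(pt) for pt in points]
    k = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]

def first_order(pt):
    x1, x2 = pt
    return (1, x1, x2)

factorial = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
altered = [(-1, -1), (-1, 1), (1, -1), (0, 0)]   # corner (1, 1) replaced by centre

d_factorial = det(information_matrix(factorial, first_order))
d_altered = det(information_matrix(altered, first_order))
print(d_factorial, d_altered)
```

The factorial design gives the larger determinant (64 against 24), which is consistent with the note that corner-point designs are D-optimal for this kind of model.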

38 Algorithms
- several algorithms exist to compute (approximately) D-optimal designs
- algorithms usually require a candidate set of design points
- exhaustive search of all possible subsets is often not possible
- exchange algorithms try to optimize the criterion by exchanging candidate points or coordinates of candidate points
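The idea of an exchange algorithm can be sketched in a few lines. This is a simplified point-exchange loop written for illustration only; it is not the algorithm implemented in any particular package. The one-factor quadratic model, the five-point candidate set and the starting design are all assumptions.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by Gaussian elimination with exact fractions."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def d_criterion(design, model):
    """det(X^t X) for the given design points and model."""
    rows = [model(x) for x in design]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    return det(xtx)

def point_exchange(design, candidates, model):
    """Repeatedly apply the single swap that most increases det(X^t X)."""
    design = list(design)
    best = d_criterion(design, model)
    improved = True
    while improved:
        improved = False
        best_swap = None
        for i in range(len(design)):
            for cand in candidates:
                trial = design[:i] + [cand] + design[i + 1:]
                value = d_criterion(trial, model)
                if value > best:
                    best, best_swap = value, (i, cand)
        if best_swap is not None:
            i, cand = best_swap
            design[i] = cand
            improved = True
    return design, best

def quadratic_model(x):
    return (Fraction(1), x, x * x)

candidates = [Fraction(k, 2) for k in range(-2, 3)]       # -1, -1/2, 0, 1/2, 1
start = [Fraction(-1), Fraction(-1, 2), Fraction(0)]      # arbitrary starting design
final_design, final_value = point_exchange(start, candidates, quadratic_model)
print(sorted(final_design), final_value)
```

From this start the loop ends at the design with points -1, 0 and +1. In general an exchange algorithm only guarantees a local optimum, which is why practical implementations are usually restarted from several starting designs.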

39 Software
Matlab -> Statistics Toolbox:
- cordexch (coordinate exchange algorithm)
- rowexch (row exchange algorithm)
- x2fx (generates design matrix for standard models)
Statgraphics -> Special -> Experimental Design -> Optimize Design
Gosset: (limited Windows version (called Strategy) available at )

40 Example: separation of chlorophenols
- steps in pH: 0.1
- steps in organic modifier: 1%
- constraints:
  - $5.7 \le \text{pH} \le 7.2$
  - $24\% \le \text{modifier} \le 50\%$
  - modifier + 14.8 * pH [inequality sign lost in transcript] 129.8
- model: $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{11} x_1^2 + \beta_{22} x_2^2 + \beta_{12} x_1 x_2 + \beta_{111} x_1^3$
- a minimum of 7 runs is necessary for the 7 parameters, plus additional runs to estimate the variance
- possible combinations to check????
Source: P.F. de Aguiar et al., Chem. Intell. Lab. Syst. 30 (1995)
RPLC = reversed-phase liquid chromatography
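The candidate set for this example can be enumerated directly from the step sizes and bounds. The direction of the third constraint is not legible in the transcript, so the sketch below takes it as a parameter (`upper=True` means "at most 129.8"); which direction applies is an assumption to be checked against the original paper. The function names are illustrative.

```python
from fractions import Fraction
from math import comb

def candidate_grid():
    """All (pH, % modifier) combinations on the stated step sizes."""
    ph_values = [Fraction(57 + k, 10) for k in range(16)]   # 5.7, 5.8, ..., 7.2
    modifier_values = range(24, 51)                         # 24%, 25%, ..., 50%
    return [(ph, mod) for ph in ph_values for mod in modifier_values]

def satisfies_mixed_constraint(ph, modifier, upper=True):
    """modifier + 14.8 * pH compared with 129.8; direction is an ASSUMPTION."""
    value = modifier + Fraction(148, 10) * ph
    bound = Fraction(1298, 10)
    return value <= bound if upper else value >= bound

def model_row(ph, modifier):
    """Terms of the 7-parameter model: second order plus a cubic term in pH."""
    x1, x2 = ph, Fraction(modifier)
    return (1, x1, x2, x1 ** 2, x2 ** 2, x1 * x2, x1 ** 3)

grid = candidate_grid()
feasible_upper = [pt for pt in grid if satisfies_mixed_constraint(*pt, upper=True)]
feasible_lower = [pt for pt in grid if satisfies_mixed_constraint(*pt, upper=False)]

print(len(grid), len(feasible_upper), len(feasible_lower))
# Number of distinct 7-point subsets an exhaustive search would have to examine.
print(comb(len(feasible_upper), 7), comb(len(feasible_lower), 7))
```

Whichever direction the constraint takes, the number of possible 7-run subsets is far too large to enumerate, which is the motivation for the exchange algorithms of the previous slides.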

41 Literature
- P.F. de Aguiar et al., D-optimal designs (tutorial), Chem. Intell. Lab. Syst. 30 (1995).
- L.E. Eriksson et al., Mixture design – design generation, PLS analysis, and model usage (tutorial), Chem. Intell. Lab. Syst. 43 (1998), 1-24.
- NIST Engineering Statistics Handbook

