
15. Design of Experiments

Updated on February 14, 2026
When setting up a separation method, performing some experiments is almost inevitable. This applies to the complete method, including the actual separation, injection, detection, and sample preparation. Because separations and sample preparation both require significant time, effort, and consumables, it is paramount that method-development strategies are as efficient as possible. Moreover, the quality of the results determines the value and attractiveness of the ultimate method (e.g. analysis time, specificity, detection limit). In this class we focus on systematic experimental-design strategies, commonly referred to as Design of Experiments (DoE), which is the preferred approach for multivariate optimization.

Learning Goals #

  • Understand why DoE is superior to traditional approaches for reaching analytical objectives.
  • Become familiar with the principles of parameter screening, optimization designs, and empirical modelling.
  • Predict optimal parameter settings.
Analytical Separation Science by B.W.J. Pirok and P.J. Schoenmakers
READ SECTION 9.8.1

DESIGN OF EXPERIMENTS

1. Introduction #

Four different strategies for developing (parts of) analytical methods are outlined in Figure 1. A key step in any procedure is a clear formulation of the objective. Among the many possibilities are

  1. Achieving optimal separation performance, such as a favourable compromise between resolution and analysis time. This is far from straightforward, as described in Module 10.3 of the book.
  2. Maximizing the yield of an extraction process or its selectivity.
  3. Optimizing injection parameters, such as to obtain a narrow injection pulse that is fully representative of the sample;
  4. Optimizing detector parameters to achieve maximal sensitivity.

The most traditional approach relies on trial-and-error experiments (top row in Figure 1). Possible conditions may be chosen based on the knowledge and experience of the analyst. This approach has proven to yield acceptable results, especially for highly skilled, senior analysts ("gurus"). However, performing multiple experiments in a non-systematic way can be very inefficient, and extrapolating existing solutions to new problems is not likely to yield optimal results.

Figure 1. Different strategies to reach an analytical objective. The dark purple columns indicate important aspects of the different (horizontal) strategies. The pink bar indicates the Design-of-Experiments (DoE) approach discussed in this lesson. All aspects of this strategy, i.e. variable screening, experimental design, modelling, prediction, and defining the objective of the process, are addressed in Module 9.8.

Another traditional approach that is more systematic is to optimize one variable at a time (OVAT; second row in Figure 1). This is a valid approach only if the effects of the different variables are mutually independent. For example, if the optimum temperature for a derivatization reaction is independent of the pH (and vice versa) the two variables may be optimized successively. However, if the optimum temperature varies with the pH (i.e. if there is interaction between the variables pH and temperature) this approach will yield sub-optimal results, as illustrated in Figure 2. In optimizing (parts of) separation methods mutually dependent variables are the rule rather than the exception. Even in the case of mutually independent variables the OVAT approach yields relatively little information from a relatively large number of experiments.
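The pitfall described above can be sketched numerically. The snippet below uses an arbitrary slanted-Gaussian yield function with a T-pH interaction (an illustrative surface, not the exact one of Figure 2) and shows OVAT settling at a point below the true optimum:

```python
import numpy as np

# Hypothetical yield surface with a T-pH interaction (slanted ellipse);
# the centre (T = 55, pH = 5) and widths are invented for illustration.
def yield_pct(T, pH):
    u = (T - 55) / 15 + (pH - 5.0)   # correlated direction
    v = (T - 55) / 15 - (pH - 5.0)
    return 100 * np.exp(-(u**2 + 0.2 * v**2))

pH_grid = np.linspace(2, 7, 501)
T_grid = np.linspace(20, 90, 701)

# OVAT: first optimize pH at fixed T = 60 C, then T at the pH found
pH1 = pH_grid[np.argmax(yield_pct(60.0, pH_grid))]
T2 = T_grid[np.argmax(yield_pct(T_grid, pH1))]

# Exhaustive grid search for the true optimum, for comparison
TT, PP = np.meshgrid(T_grid, pH_grid)
Z = yield_pct(TT, PP)
i, j = np.unravel_index(np.argmax(Z), Z.shape)

print(f"OVAT end point: T={T2:.1f}, pH={pH1:.2f}, yield={yield_pct(T2, pH1):.1f}")
print(f"True optimum:   T={TT[i, j]:.1f}, pH={PP[i, j]:.2f}, yield={Z[i, j]:.1f}")
```

Because the iso-response ellipses are slanted, each one-dimensional search stops on a ridge rather than at the summit, so the OVAT end point retrieves a lower yield than the grid optimum.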

Figure 2. The one-variable-at-a-time (OVAT) approach to finding optimal conditions. The blue shading indicates the reaction yield (lighter colours correspond to higher yields). By first optimizing the pH at a temperature of 60 °C, an optimum value of pH = 4.8 is found (point 1). Subsequent optimization of the temperature at this pH yields an optimum at point 2 (T = 49 °C), which is different from the true optimum located at point 3. In a simple response surface as shown here, circular iso-response lines (or horizontal or vertical ellipses) indicate mutually independent variables, whereas slanted features, such as the light-blue ellipse, indicate dependent variables.
EXERCISE 1: ONE AT A TIME

In the example shown in Figure 2, which actions would allow you to approach the true optimum (point 3) more closely when using a one-variable-at-a-time (OVAT) strategy? Select all that apply.

Assume the response surface in Figure 2 contains more than one local optimum. What are the implications for the OVAT approach illustrated?

The bottom row of Figure 1 indicates the attractive situation, in which a sound theoretical framework exists that allows accurate prediction of optimal conditions. In separation science, theoretical models still fall short of predicting retention and selectivity with sufficient accuracy. Also, optimal injection parameters, extraction conditions, or settings for a mass spectrometer cannot usually be predicted from theory.
Models for chromatographic efficiency

We may have adequate models to predict chromatographic efficiency as a function of column length and diameter, flow rate and particle size (for packed columns; see Module 1.7) or film thickness (for open-tubular columns; see Section 1.7.7).  

The remaining (third) row in Figure 1, which is highlighted in pink and dark red, concerns the systematic design of experiments (DoE) that is the subject of the remainder of this class.

2. Design of Experiments #

In a DoE approach, a predefined set of experiments is performed, with the different variables varied simultaneously. In Figure 3, a set of 13 experiments is indicated. The results (yield y) can then be fitted to a linear model of the form

y=b_0 + b_T\cdot T + b_{\text{pH}}\cdot\text{pH} + b_{T,2} \cdot T^2 + b_{\text{pH},2}\cdot\text{pH}^2 + b_{T,\text{pH}}\cdot T\cdot\text{pH}

and plotted in the form of iso-response lines (as in Figure 3A) or a quasi-3D plot (Figure 3B).

EXERCISE 2: LINEAR?

Why is a model with quadratic and interaction terms still called linear?

Having established such a model, we can predict the yield at any combination of temperature and pH. Using about the same number of experiments as needed to reach point 2 in Figure 2, the DoE approach yields much better results, in that

  1. We obtain a good impression of the entire response surface;
  2. We can establish optimal conditions (maximum yield) without further experiments;
  3. We have a quantitative estimate of the degree of correlation between the effects of temperature and pH (i.e. the coefficient b_{T,\text{pH}});
  4. We can predict how much the yield will decrease if temperature and pH vary around the optimum.

Note that there is some redundancy in our experimental design. Six experiments is the minimum number needed to estimate the six coefficients in our model. The additional experiments, well spread across the parameter space, provide an indication of the quality of the model.

Figure 3. Establishing a model for the response surface of Figure 2 using a DoE approach.

Fitting any model that is linear in the b coefficients is straightforward. If we have n experiments (denoted 1\ldots n, with n equal to or larger than the number of coefficients in the model), we may write

\begin{bmatrix}y_1\\y_2\\\vdots\\y_n\end{bmatrix}=\begin{bmatrix}1&T_1&\mathrm{pH}_1&T_1^2&\mathrm{pH}_1^2&T_1\mathrm{pH}_1\\1&T_2&\mathrm{pH}_2&T_2^2&\mathrm{pH}_2^2&T_2\mathrm{pH}_2\\\vdots&\vdots&\vdots&\vdots&\vdots&\vdots\\1&T_n&\mathrm{pH}_n&T_n^2&\mathrm{pH}_n^2&T_n\mathrm{pH}_n\end{bmatrix}\begin{bmatrix}b_0\\b_T\\b_{\mathrm{pH}}\\b_{T,2}\\b_{\mathrm{pH},2}\\b_{T,\mathrm{pH}}\end{bmatrix}+\begin{bmatrix}e_1\\e_2\\\vdots\\e_n\end{bmatrix}

where the e vector represents the residuals (differences between the experimental and model values). We abbreviate the model (without the residuals) to

\textbf{y}=\textbf{X} \cdot \mathbf{\beta}

Note: It is generally recommended to centre and scale the variables, such as pH and temperature. This will be explained in a following section.

To find a least-squares estimate \mathbf{b} of the coefficients \beta we may write

\mathbf{X}^{T}\mathbf{y}=\mathbf{X}^{T}\mathbf{X}\,\boldsymbol{\beta}

from which we find

\mathbf{b}=(\mathbf{X}^{T}\mathbf{X})^{-1}\mathbf{X}^{T}\mathbf{y}

The predicted response values are

\hat{\mathbf{y}}=\mathbf{X}(\mathbf{X}^{T}\mathbf{X})^{-1}\mathbf{X}^{T}\mathbf{y}

and the error vector \epsilon that estimates the deviations between the model and the individual data points is

\boldsymbol{\varepsilon}=\hat{\mathbf{y}}-\mathbf{y}=\mathbf{X}(\mathbf{X}^{T}\mathbf{X})^{-1}\mathbf{X}^{T}\mathbf{y}-\mathbf{y}

These simple solutions are a great advantage of models that are linear in the coefficients. For non-linear models an iterative process is needed to arrive at a solution for the coefficients.
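As a sketch of these matrix operations, the snippet below fits the quadratic model of Section 2 to synthetic data; the "true" coefficients, the 13 random settings, and the noise level are all invented for illustration:

```python
import numpy as np

# Fit y = b0 + bT*T + bpH*pH + bT2*T^2 + bpH2*pH^2 + bTpH*T*pH
rng = np.random.default_rng(1)
T = rng.uniform(-1, 1, 13)       # scaled temperature, 13 experiments
pH = rng.uniform(-1, 1, 13)      # scaled pH
beta_true = np.array([90.0, 2.0, -1.5, -8.0, -6.0, 4.0])  # illustrative values

# Design matrix X, one row per experiment, one column per coefficient
X = np.column_stack([np.ones_like(T), T, pH, T**2, pH**2, T * pH])
y = X @ beta_true + rng.normal(0, 0.1, 13)   # add small "experimental" noise

# Least-squares estimate from the normal equations: (X^T X) b = X^T y
b = np.linalg.solve(X.T @ X, X.T @ y)

y_hat = X @ b        # predicted responses  y_hat = X (X^T X)^-1 X^T y
resid = y_hat - y    # residual vector epsilon
print(np.round(b, 2))
```

Solving the normal equations directly mirrors the formula above; in practice `np.linalg.lstsq(X, y)` is numerically preferable for ill-conditioned design matrices.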
Analytical Separation Science by B.W.J. Pirok and P.J. Schoenmakers
READ SECTION 9.8.2

Factorial Designs

3. Factorial designs #

Without sufficient knowledge and expertise of the analyst, DoE strategies may yield nonsensical results. For example, a competent analyst may decide that a prudent range of mobile-phase pH values on a conventional octadecylsilica (ODS) column for HPLC is 2 < pH < 7. They will also be aware that the effect of pH may follow a typical sigmoidal pattern, so that studying just two levels will be inadequate to describe the effect of pH.

Analyst and parameter space

Many different experimental designs exist, and software packages may be helpful to establish a design, set up a series of experiments, fit a model, and perform statistical calculations. However, the role of the analyst is essential. They should

  • understand the process.
  • define clear and unambiguous (mathematical) objectives.
  • direct the creation of the parameter space, by selecting relevant variables and sensible limits (see also below).
  • oversee the creation of the experimental design and list of experiments (e.g. blocking, randomisation).

To give each quantitative variable the same weight in the fitting process and to reduce the correlation between the different variables in the \textbf{X} matrix (such as between T and T^2), it is important to scale them, usually to a range between -1 and +1. The positions of up to five equidistant points on such a scale are summarized in Table 1.

Table 1. Positions of equidistant points on a normalized scale from -1 to +1.

# Equidistant Points    Scaled Values
2                       -1; +1
3                       -1; 0; +1
4                       -1; -1/3; +1/3; +1
5                       -1; -1/2; 0; +1/2; +1

For example, there are five levels in the design shown in Figure 3A. For temperature, the transformation from a temperature T (in °C) to a (dimensionless) scaled value T' is

T'=\frac{T-50}{25}

while for pH we have

\mathrm{pH}'=\frac{\mathrm{pH}-4.5}{2.5}

To use the obtained model to predict a response, these transformations must be applied to the settings. Alternatively, the transformation equations above may be substituted into the model to obtain modified values for the coefficients, yielding an equation in which actual temperatures and pH values can be used.
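A minimal sketch of these transformations, using the centres and half-ranges from the two equations above:

```python
def scale(x, center, half_range):
    """Map an actual value to the dimensionless -1..+1 scale."""
    return (x - center) / half_range

def unscale(x_scaled, center, half_range):
    """Map a scaled value back to actual units."""
    return x_scaled * half_range + center

# The transformations from the text: T' = (T - 50)/25 and pH' = (pH - 4.5)/2.5
print(scale(75, 50, 25))        # T = 75 C  -> 1.0
print(scale(4.5, 4.5, 2.5))     # pH = 4.5  -> 0.0
print(unscale(-1, 4.5, 2.5))    # pH' = -1  -> 2.0
```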

Randomisation and blocking

It is generally recommended to shuffle or randomize the measurements prescribed by the design in a random order, so as to avoid any bias due to a gradual change in the system under study.

However, in some cases complete randomization is not realistic. For example, a series of samples may be measured three times, each time on a different day. The design can then be split into three blocks. The various samples can then be compared within each block, or it can be implicitly assumed that the inter-day variation is negligible.

In other situations the total body of measurements may need to be split for other reasons. For example, optimization of an LC separation may involve temperature, pH, and three different columns as variables. In that case we are interested in the differences between the columns, and the total exploration is split into three separate designs, one for each column. All measurements in one design (i.e. on the same column) are performed first, followed by the other designs. Within each block, the measurements should be randomized. The critical assumption here is that all conditions other than the column are identical within each design, so as not to confound the effects of the column with other effects. For example, if three LC columns are studied on consecutive days, it is assumed that there is no day-to-day variation in the measurements.
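Blocking on the column with randomization within each block can be sketched as follows; the column names and condition levels are invented for illustration:

```python
import random

random.seed(7)  # reproducible run order for this sketch

# Hypothetical 3^2 design (T x pH) to be run on each of three LC columns
design = [(T, pH) for T in (40, 60, 80) for pH in (3, 5, 7)]
columns = ["column A", "column B", "column C"]

run_order = []
for col in columns:              # block on the column: run each design whole
    block = design.copy()
    random.shuffle(block)        # randomize the run order within the block
    run_order += [(col, T, pH) for (T, pH) in block]

for run in run_order[:3]:        # first few runs, all on column A
    print(run)
```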

Factorial designs are among the most commonly used for establishing a set of experiments with the purpose of fitting a model. A full factorial design includes all combinations of n levels of m variables (or parameters), resulting in n^m experiments. Such a design is also commonly referred to as an n^m full factorial design. Two examples are shown in Figure 4A (2^3 design, 8 experiments) and Figure 4B (3^3 design, 27 experiments).
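Enumerating an n^m full factorial design in scaled coordinates is straightforward; a minimal sketch:

```python
from itertools import product

def full_factorial(levels, m):
    """All combinations of the given scaled levels for m variables (n^m runs)."""
    return list(product(levels, repeat=m))

design_2_3 = full_factorial([-1, +1], 3)       # 2^3 -> 8 experiments (Figure 4A)
design_3_3 = full_factorial([-1, 0, +1], 3)    # 3^3 -> 27 experiments (Figure 4B)
print(len(design_2_3), len(design_3_3))        # prints: 8 27
```

Each tuple gives the scaled settings of one experiment; the run order would still be randomized before measuring, as discussed above.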

Figure 4. Illustration of full factorial 2^3 (A) and 3^3 (B) designs, as well as a fractional factorial (C) and a central-composite (D) design for 3 variables at 3 levels, in which the blue points are positioned outside the cube.

Figure 4C shows a fractional factorial design for 3 variables at 3 levels, in which almost half of the data points are omitted (15 experiments). The design shown in Figure 3A can be seen as a fractional factorial design for 2 variables at 5 levels. Figure 4D shows a central-composite design (also 15 experiments), which is another common design for building a model. 

EXERCISE 3: MISSING VALUES
In the table below the design of Figure 4D is specified in terms of scaled values for three parameters, i.e. temperature (T), reagent concentration (c), and amount of catalyst (w) added. Complete the list of actual parameter values (“?”) on the right-hand side of the table.

 

Exp. #    Scaled Values          Actual Values
          T      c      w        T      c      w
1         0      0      1.5      ?      ?      ?
2         1     -1      1        ?      ?      ?
3        -1.5    0      0        ?      ?      ?
4         1      1      1        80     200    750
5         0      1.5    0        ?      ?      ?
6        -1     -1     -1        40     100    250
7         0     -1.5    0        ?      ?      ?
8         1      1     -1        ?      ?      ?
9         0      0      0        ?      ?      ?
10       -1      1     -1        ?      ?      ?
11        0      0     -1.5      ?      ?      ?
12       -1      1      1        ?      ?      ?
13        1.5    0      0        ?      ?      ?
14        1     -1     -1        ?      ?      ?
15       -1     -1      1        ?      ?      ?
Solution:

Exp. #    Scaled Values          Actual Values
          T      c      w        T      c      w
1         0      0      1.5      60     150    875
2         1     -1      1        80     100    750
3        -1.5    0      0        30     150    500
4         1      1      1        80     200    750
5         0      1.5    0        60     225    500
6        -1     -1     -1        40     100    250
7         0     -1.5    0        60     75     500
8         1      1     -1        80     200    250
9         0      0      0        60     150    500
10       -1      1     -1        40     200    250
11        0      0     -1.5      60     150    125
12       -1      1      1        40     200    750
13        1.5    0      0        90     150    500
14        1     -1     -1        80     100    250
15       -1     -1      1        40     100    750

3.1. Number of experiments #

How many experiments can realistically be performed depends strongly on the type of measurements. For example, if DoE is used to align a laser system so as to maintain maximum light intensity at a detector, measurements can be nearly instantaneous and many experiments can be performed in a short time (provided all settings can be implemented automatically). Chromatographic experiments (each under different conditions) tend to take considerable time and to consume materials (such as mobile phases), so the realistic limit may be much lower.

When experiments require time and effort, it is crucial to only select the most relevant, mutually dependent variables in the design. Groups of variables that are independent may be optimized separately, even if variables within each group may show interaction effects. For example, the variables that determine chromatographic selectivity may be optimized first, followed by an optimization of the chromatographic efficiency.

3.2. Mixture designs #

Mixture designs are appropriate when the values of the variables are constrained. An example is the optimization of a ternary solvent mixture, where the sum of the volume fractions (in percentages) should equal 100, i.e.

\varphi_1 + \varphi_2 + \varphi_3 = 100

Figure 5 shows a possible design that may be used for such a purpose.
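Candidate compositions that satisfy the mixture constraint can be generated on a simplex-lattice grid. The sketch below uses an arbitrarily chosen grid step of 25%; the points are illustrative and not necessarily those of Figure 5:

```python
def simplex_lattice(step=25):
    """All ternary compositions (in percent) on a grid with the given step,
    satisfying phi1 + phi2 + phi3 = 100 (a simplex-lattice mixture design)."""
    points = []
    for p1 in range(0, 101, step):
        for p2 in range(0, 101 - p1, step):
            points.append((p1, p2, 100 - p1 - p2))
    return points

for pt in simplex_lattice(25):
    print(pt)   # 15 compositions, e.g. (0, 0, 100), (25, 50, 25), (100, 0, 0)
```

Because the third fraction is fixed by the other two, the design effectively has two free variables and the points lie on a triangular (simplex) plane.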

Figure 5. Example of a mixture design for modelling the effect of a ternary solvent composition. The highlighted points (related to the exercise) coincide with the pink experimental sampling points.
EXERCISE 4: SOLVENT COMPOSITION

Specify which compositions (in percentages of solvents 1, 2, and 3, respectively; no decimals) correspond to the highlighted points 1 through 6 in Figure 5.

Analytical Separation Science by B.W.J. Pirok and P.J. Schoenmakers
READ SECTION 9.8.3

SCREENING DESIGNS

4. Screening designs #

Screening designs can be used to scan the effects of many variables (factors) with very few experiments. Very sparse fractional factorial designs may be used, for example one eighth of a 2^6 full factorial design, which yields a 2^(6-3) design of eight experiments. The full 2^6 design (64 experiments) would allow estimation of all main effects, all interactions between two factors, as well as the increasingly unlikely three-, four-, five-, and six-factor interactions. A 2^(6-3) design only allows estimation of the main effects, but this may suffice for screening purposes.

Plackett-Burman designs are another option for testing many factors. They are defined for n experiments (with n a multiple of four), each accommodating up to n-1 factors. The disadvantage of such minimal designs is that they typically cover just two levels and ignore interaction effects between variables. They are meant to discover which factors have the largest main effect on the outcome. If interaction effects are significant, they will be confounded with main effects.

Applications

There are two main applications of screening designs in analytical chemistry (as explained in Section 9.8.3 of the book).

1. To identify which variables to include in a design to build a model and to establish optimal conditions.

2. To demonstrate the robustness of a method and to establish windows of operation for the variables.

An example of a Plackett-Burman design for 11 factors at two levels (12 experiments) is provided in Table 2. If only 10 or 9 factors are tested, there are one or two "dummy factors". The main effects of these should be small, providing an indication of the validity of the outcome.
Table 2. Plackett-Burman design for 11 factors at two levels.

Exp. #   Factor number
          1    2    3    4    5    6    7    8    9   10   11
1         1    1   -1    1    1    1   -1   -1   -1    1   -1
2        -1    1    1   -1    1    1    1   -1   -1   -1    1
3         1   -1    1    1   -1    1    1    1   -1   -1   -1
4        -1    1   -1    1    1   -1    1    1    1   -1   -1
5        -1   -1    1   -1    1    1   -1    1    1    1   -1
6        -1   -1   -1    1   -1    1    1   -1    1    1    1
7         1   -1   -1   -1    1   -1    1    1   -1    1    1
8         1    1   -1   -1   -1    1   -1    1    1   -1    1
9         1    1    1   -1   -1   -1    1   -1    1    1   -1
10       -1    1    1    1   -1   -1   -1    1   -1    1    1
11        1   -1    1    1    1   -1   -1   -1    1   -1    1
12       -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1
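Twelve-run Plackett-Burman designs such as Table 2 are conventionally generated by cyclically shifting a seed row of 11 signs and appending an all-minus run; the sketch below reproduces the table:

```python
# Seed row: the signs of experiment 1 in Table 2
seed = [1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1]

rows = []
row = seed
for _ in range(11):
    rows.append(row)
    row = [row[-1]] + row[:-1]   # right cyclic shift gives the next experiment
rows.append([-1] * 11)           # experiment 12: all factors at the low level

for i, r in enumerate(rows, start=1):
    print(i, r)
```

Each of the first 11 rows contains six +1 and five -1 entries, which keeps the factor levels balanced across the design.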

Concluding remarks #

Design of Experiments (DoE) provides a systematic and efficient framework for developing and optimizing analytical methods in situations where theory alone is insufficient. By varying multiple parameters simultaneously, DoE allows interactions and curvature to be detected, enables empirical models to be constructed, and supports reliable prediction of optimal conditions with a limited number of experiments. Compared to trial-and-error or one-variable-at-a-time approaches, DoE yields more information, better insight into the system under study, and more robust outcomes. The effectiveness of DoE, however, critically depends on the expertise of the analyst in defining relevant variables, realistic parameter ranges, and meaningful objectives. These principles form the foundation for advanced experimental optimization strategies discussed further in Module 9.8 of the course material. 

 
