Welcome to the course on Biostatistics and Design of Experiments. We have been talking about various types of screening designs. The most important one is the factorial design. Then we looked at the fractional factorial design, which means running only a fraction of the full factorial. Then I talked about confounding, and I also talked about resolution. Resolution III designs are generally not preferred; Resolution IV or above is preferred.

If you say 2 power 4 minus 1, the 4 is the number of factors, or parameters, you are looking at, and the minus 1 means you run half of the full design. 2 power 4 is 2 into 2 into 2 into 2, which is 16, and half of that is 8 experiments. So what do you do? You take the 8 experiments of the 3 factor design in A, B and C, and you put a new factor D in place of the ABC interaction column. So D is confounded with ABC. This is a Resolution IV design, because the design generator is I = ABCD, which has 4 letters. That is how you go about it, and a Resolution IV design is preferred.

A Resolution III design is not preferred. Suppose you have 3 parameters, or 3 factors. 2 raised to the power 3 is 8 experiments, and you do not want to do 8 experiments. So what do you do? You run 2 raised to the power 3 minus 1, which is half of 8, so you are doing only 4 experiments. You replace AB with C, which means AB is confounded with C. The design generator is I = ABC, and this is Resolution III, because the generator has 3 letters. So the generator tells you which terms are confounded, and it also lets you work out the resolution.
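
The two half fractions recapped above can be written out in a few lines. This is only a minimal sketch: the function name `half_fraction` and the dictionary layout are assumptions made for illustration, not something shown in the lecture.

```python
from itertools import product

def half_fraction(base_factors, new_factor):
    """Full 2-level factorial in base_factors; new_factor is set to their product."""
    runs = []
    for levels in product((-1, 1), repeat=len(base_factors)):
        run = dict(zip(base_factors, levels))
        sign = 1
        for v in levels:
            sign *= v
        run[new_factor] = sign  # the new factor is aliased with the full interaction
        runs.append(run)
    return runs

design_4 = half_fraction("ABC", "D")  # 2^(4-1): D = ABC, so I = ABCD
design_3 = half_fraction("AB", "C")   # 2^(3-1): C = AB,  so I = ABC

for run in design_4:
    print(" ".join(f"{run[f]:+d}" for f in "ABCD"))
```

In every run of `design_4` the product A·B·C·D is +1, which is just another way of saying I = ABCD; the same holds for A·B·C in `design_3`.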

This one will be Resolution IV, and so on, along with the number of runs. I also taught you how to do the calculations afterwards, once you have the results. Then we went on to something else, the Plackett-Burman designs. These are also very good screening designs. We can screen a large number of factors with a minimal number of experiments. They work best when the number of factors is 3, 7, 11 and so on, because the design generators for Plackett-Burman designs are meant for 8 runs, 12 runs, 16 runs, 20 runs, 24 runs. If the number of factors is one less than the number of runs, the Plackett-Burman design is very good, because with few experiments you get a very good idea of the main effects. So you have the design generator here. I taught you how to build up these tables.
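
As a reminder of how such a table is built, here is a minimal sketch. The generator rows below are the commonly tabulated ones for 8 and 12 runs and are assumed here rather than copied from the lecture slide; the direction of the cyclic shift is also an assumption, and either direction gives a valid table.

```python
def plackett_burman(first_row):
    """Build an N-run table from a generator row of N-1 signs (+1/-1)."""
    n = len(first_row)
    # Each new row is the generator shifted cyclically by one more position.
    rows = [first_row[n - i:] + first_row[:n - i] for i in range(n)]
    rows.append([-1] * n)  # the final run is all minus
    return rows

def is_balanced_and_orthogonal(table):
    columns = list(zip(*table))
    balanced = all(sum(col) == 0 for col in columns)
    orthogonal = all(
        sum(x * y for x, y in zip(columns[i], columns[j])) == 0
        for i in range(len(columns))
        for j in range(i + 1, len(columns))
    )
    return balanced and orthogonal

GEN_8 = [1, 1, 1, -1, 1, -1, -1]                  # 7 factors in 8 runs
GEN_12 = [1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1]   # 11 factors in 12 runs

pb8 = plackett_burman(GEN_8)
pb12 = plackett_burman(GEN_12)
print(len(pb8), len(pb12))
print(is_balanced_and_orthogonal(pb8), is_balanced_and_orthogonal(pb12))
```

If fewer factors are needed than there are columns, the unused columns can simply be left as dummy variables.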

The last row is always minus, minus, minus, and you push these signs up to build the rest of the table. So the Plackett-Burman design is very good if we have 3, 7, 11, 15 parameters and so on. If the number of parameters is slightly less than that, we can always treat some of the columns as dummy variables. This is taken from this particular reference.

Then we talked about the Latin Square Design. This is also a very important and useful type of design. I showed you a Latin Square for 3 parameters, A, B and C, each operated at 3 levels. How do you represent that? Minus 1, 0 and 1. For 2 levels we say minus 1 and 1; for 3 levels we say minus 1, 0, 1. Here you can see that you need to do 9 experiments. This will be the first experiment, this the second, this the third, then 4, 5, 6, 7, 8, 9. So it is a very nice design for a 3 variable problem. If it is a 2 variable problem, with 2 levels, we can have a 2 by 2 Latin Square, and so on.

Then comes another design, which is also a screening design: the Taguchi designs. Traditional DOEs focus on how each factor affects the output, y. Y is your output, or the dependent variable. Suppose I am running a bioprocess; the yield of a metabolite, or the biomass, is the output. Taguchi, on the other hand, looked at robust design, and he is more interested in the variation, through the loss function. So Taguchi developed designs aimed at robustness, looking at the variation rather than the averages. Whereas in normal designs, like your factorial designs or even your Plackett-Burman, we are looking at the average output; that is what we are more interested in.

In Taguchi design there is a design selector. This tells you the number of parameters, and this tells you the number of levels. This table was taken from this particular reference. So if I have 2 levels and 2 parameters, there is something called the L4 design, and the L4 table is available here.

This is the L4 table. As you can see, L4 means 4 runs. Suppose I have 2 variables; I can pick columns 1 and 2 for A and B. If I have 3 parameters, I can also use this table, taking the third column as well. Then it becomes like a confounding; do you understand? If you look at the selector, there are tables like L9, L18 and L27, which are meant for 3 levels. There is an L16 table meant for 4 levels, and L25 and L50 tables for 5 levels. So standard tables are available. We just pick one up and use it.

Let us look at 2 levels. You have tables for L4 and L8. If I have 2 or 3 parameters, I can use the L4 table. If I have 4, 5, 6 or even 7 parameters, I can use the L8 table. And if I have 8, 9, 10 or 11 parameters, I can use the L12 table, and so on. With the L4 table at 2 levels, I can do a full factorial design: if I keep to 2 factors, I get at least Resolution IV, or I can go up to 3 factors for screening. So if I am interested in screening work, I can take this table and look at 3 variables. For example, this is the L4 table, with 4 runs. These columns are for factor A, factor B and factor C. If I put A, B and C, then obviously it becomes a screening design. Whereas if I put only A and B, it becomes a high resolution, full factorial design.
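
The two ways of using the L4 table can be checked directly. This sketch uses the standard L4 array written with levels 1 and 2; the helper name `coded` and the mapping of level 1 to minus 1 and level 2 to plus 1 are assumptions chosen for illustration.

```python
# Standard L4 orthogonal array, written with levels 1 and 2.
L4 = [
    (1, 1, 1),
    (1, 2, 2),
    (2, 1, 2),
    (2, 2, 1),
]

def coded(table):
    """Map level 1 -> -1 and level 2 -> +1."""
    return [tuple(2 * v - 3 for v in row) for row in table]

runs = coded(L4)

# Columns 1 and 2 alone: every (A, B) combination appears once,
# which is a full 2^2 factorial.
ab_pairs = sorted((a, b) for a, b, _ in runs)

# Column 3 equals the A x B product up to a fixed sign, so a factor C
# placed there is confounded with the AB interaction.
products = [a * b * c for a, b, c in runs]

print(ab_pairs)
print(products)
```

Because `products` is constant across the four runs, column 3 carries no information that is independent of AB; that is exactly the confounding accepted when a third factor is screened in 4 runs.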

This is the table called the L4 table. So I just pick up the L4 design for 2 levels. If there are 2 factors, I put A and B and ignore the third column; if there are 3 factors, I put A, B and C, and do the experiments accordingly.

Let us look at the L8 design; that means 8 runs. The number tells you how many runs: 4, 8, 9 and so on. So the 8 tells you the number of experiments. In the L8 design at 2 levels, if I have 3 factors, it becomes a full factorial, right? 2 raised to the power 3 is 8. But if I want to do screening, I can go up to 7 factors; it is almost like your Plackett-Burman. If I want Resolution V or better, I stick to only 3 factors.

This is how the L8 design looks. I can have fewer factors: with only 3 factors, A, B and C, it becomes a full factorial design, and these other columns become AB, BC, AC and ABC. Or I can have 7 factors, A, B, C, D, E, F and G, while still doing 8 experiments. That is like your Plackett-Burman design. Notice that even here you have the balancing, 4 minuses and 4 pluses in each column. You should always have balance and orthogonality. If I want to do a full factorial design, 2 raised to the power 3 is 8 experiments, so I can select this column and this column. Then I do not select this one, of course, because this and this look the same. So I select this one instead; do you understand? That is why I select this column, this column and this column for a 3 factor full factorial design. If I want to do screening, I can go right up to 7 factors: A, B, C, D, E, F and so on, using all the columns. So I can keep increasing.
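
The column structure just described can be sketched as follows. The layout puts A, B and C in columns 1, 2 and 4 and fills the remaining columns with their products; this matches the standard L8 table only up to the sign convention of the interaction columns, so treat it as an illustrative construction rather than a copy of the slide.

```python
from itertools import product

def l8_array():
    """Eight runs, seven columns, in -1/+1 coding."""
    rows = []
    for a, b, c in product((-1, 1), repeat=3):
        # Columns 1, 2, 4 carry A, B, C; columns 3, 5, 6, 7 are AB, AC, BC, ABC.
        rows.append((a, b, a * b, c, a * c, b * c, a * b * c))
    return rows

table = l8_array()
columns = list(zip(*table))

# Balance: four minuses and four pluses in every column.
balanced = all(sum(col) == 0 for col in columns)

# Orthogonality: every pair of columns has zero dot product.
orthogonal = all(
    sum(x * y for x, y in zip(columns[i], columns[j])) == 0
    for i in range(7)
    for j in range(i + 1, 7)
)

# Columns 1, 2 and 4 on their own give all 8 combinations: a full 2^3 factorial.
full_factorial_124 = len({(r[0], r[1], r[3]) for r in table}) == 8

print(balanced, orthogonal, full_factorial_124)
```

Assigning new factors to the product columns is what turns the same 8 runs into a screening design for up to 7 factors, at the cost of confounding those factors with interactions.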

So, columns 1, 2, 4, 7. I will not use 5, because 4 and 5 look similar. So I can use this, this, this, and then this. So that is L4 and L8. If I want to do a full factorial design with L8, I can go up to 3 factors, and that gives the full factorial with the interactions. And if I want to look at 7 factors, I can do that as well; it will be almost like a Plackett-Burman design.

Then you have the L9 designs. L9 designs are generally for 3 levels: minus 1, 0, 1. If I have only 2 factors, it is 3 power 2, which is 3 into 3, which is 9, so I get a full factorial. But I can go up to 4 factors as a screening design, still with 9 runs. As you can see, there are minus 1s, zeros and plus 1s: the lowest level, the middle level and the highest level. For example, for temperature, if I am interested in 30, 40 and 50, plus 1 means 50, minus 1 means 30 and 0 means 40. For pH, if I am looking at 3, 4 and 5, minus 1 means 3, 0 means 4 and plus 1 means 5. If I go to 3 factors, I can take column 1, column 2 and column 4; that is already a fraction, since a full factorial for 3 factors at 3 levels needs 27 runs. For screening you can keep adding factors in the same way.

Then you also have the L12 design. As you can see, the L12 design is at 2 levels, so we can use it as a screening design for 11 factors. I can put 11 factors here, A, B, C, D, E and so on, and I will do 12 runs. This is almost like your Plackett-Burman design.

It is like a Plackett-Burman design. The only difference is that the columns and the rows have been permuted; they have interchanged some of the columns and rows. In the Plackett-Burman design, if you look at the bottom, the last of the 12 runs is always all negatives. So there is some difference, a switching over of some rows and columns, but otherwise it is exactly like Plackett-Burman. So with 11 factors, I can use a Plackett-Burman design and do 12 experiments, or I can pick up the Taguchi L12 table and do a screening design. I have both options for 12 experiments.

Then, just like L4, L8, L9 and L12, you also have the L16 table. This is the L16 table. So there are 16 runs.

So the number indicates the number of runs; you have 16 runs here. I can do a full factorial with 4 factors: 2 raised to the power 4 is 2 into 2 into 2 into 2, which is 16. I can go up to 5 factors and still maintain Resolution V. If I want to do a screening design, I can go up to 15 factors, just like a Plackett-Burman experiment. Then there are the L18 designs, meaning 18 experiments, the L27 designs with 27 experiments, and so on.

So the advantage of the Taguchi designs is that there are tables available. We can use different tables depending upon the number of parameters and the number of levels we have. If we have fewer parameters than the table provides for, we can ignore the extra columns. One important point is that you need to crosscheck that the design is balanced and orthogonal, because any design we take needs to satisfy those conditions; that is very important. Sometimes it will look just like a Plackett-Burman design, especially when you take L12 and screen 11 factors. So we have different types of tables; that is the advantage of the Taguchi method.

This is an L18 design. We have 18 experiments here, at 3 levels: minus 1, 0 and plus 1. As I said, the main advantage is that the design tables are available. Taguchi has prepared them and given them to us, so we can use those tables directly, whether it is 2 levels or 3 levels. As you can see here, he even has tables for 2 levels, 3 levels, 4 levels and 5 levels. Generally, we will not go to 4 levels and 5 levels.
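
The table lookup described above amounts to a small selector. The sketch below encodes only the capacities quoted in this lecture (L4, L8, L12 and L16 at 2 levels, L9 at 3 levels); the names `ARRAYS` and `pick_array` are made up for illustration, and the larger tables are left out because their capacities were not stated.

```python
# levels -> list of (maximum number of factors, array name), smallest array first.
ARRAYS = {
    2: [(3, "L4"), (7, "L8"), (11, "L12"), (15, "L16")],
    3: [(4, "L9")],
}

def pick_array(levels, n_factors):
    """Return the smallest listed array that can hold n_factors, or None."""
    for max_factors, name in ARRAYS.get(levels, []):
        if n_factors <= max_factors:
            return name
    return None  # not covered by the tables quoted in the lecture

print(pick_array(2, 2))    # 2 factors at 2 levels
print(pick_array(2, 5))    # 5 factors at 2 levels
print(pick_array(2, 11))   # 11 factors at 2 levels
print(pick_array(3, 4))    # 4 factors at 3 levels
```

A returned name only says which table is large enough; whether the result is a full factorial or a screening design still depends on how many of its columns are filled.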

So for 2 levels and 3 levels there are tables available, depending upon the number of parameters. There are tables for L32 also, which means you need to do 32 experiments. You just pick up those tables and make use of them for your design.

So we have covered a lot of designs which can be used for screening: the full factorial, the fractional factorial, the Plackett-Burman design, the Latin Square design and the Taguchi design. All of these are very good. With Taguchi, of course, you can mix and match, and use it as a full factorial or even a fractional factorial.

Screening with these designs is generally done at 2 levels, which means you can get only a linear relationship. If temperature varies from 30 to 40, I measure at 30 and at 40. For the effect of pH, I measure at, say, 3 and at 4; for carbon concentration, at 1 percent and at 2 percent. I am measuring at two places, so obviously, if I generate a regression equation, I can get only a linear relation. But ultimately we are interested in optimization, and for that I want a non-linear relation. In such situations I need to have data at more than two points. Only then can I get at least a square term, a quadratic term, which is non-linear. Then I can find an optimum: a maximum yield, or a minimum impurity, and so on. A linear relation is always monotonically increasing or monotonically decreasing; there is no optimum. So initially, during screening, we do all the experiments at 2 levels, whether it is 2 raised to the power n, or 2 raised to the power n minus 1, or whatever it is. We have minus 1 and plus 1 as the coded levels. We cannot use those results to develop a second order model. We can use them to develop a linear model, not a quadratic model.

So screening designs are used for eliminating the many parameters which are of no significance, and they can be used to develop a linear relationship. But if we are interested in developing a non-linear relationship, we need to go for a second order model. When you start your experiments, you do not immediately jump into a second order model; that is a very bad idea. What you do is a screening design first. You have 5 parameters, you do a screening design, you come down to 2 parameters, and then you do a second order design for those 2 parameters. When you are doing a second order design, you require results at 3 levels, minus 1, 0 and 1, so the number of experiments also increases; with experiments at 3 levels, you will be able to get a quadratic relationship. So always remember: when you start your experiments, never go straight to second order designs. Start with screening designs like factorial, fractional factorial, Taguchi, Plackett-Burman, or even Latin Square. Eliminate many variables, and then do a detailed design; and that is what we are going to talk about.
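
The point about two levels versus three can be made concrete with one factor. In this sketch the yields are made-up numbers used purely for illustration, and the function names are hypothetical; the arithmetic is just the quadratic that passes through three equally spaced coded levels.

```python
def code_level(value, low, high):
    """Map a real setting to coded units: low -> -1, midpoint -> 0, high -> +1."""
    center = (low + high) / 2
    half_range = (high - low) / 2
    return (value - center) / half_range

def quadratic_through(y_low, y_mid, y_high):
    """Coefficients of y = b0 + b1*x + b2*x^2 through coded x = -1, 0, +1."""
    b0 = y_mid
    b1 = (y_high - y_low) / 2          # the slope, all that two levels can give
    b2 = (y_high + y_low) / 2 - y_mid  # curvature, needs the middle level
    return b0, b1, b2

# Hypothetical yields at 30, 40 and 50 degrees (illustrative values only).
low_t, high_t = 30.0, 50.0
b0, b1, b2 = quadratic_through(2.0, 3.5, 3.0)

x_opt = -b1 / (2 * b2)  # stationary point in coded units
t_opt = (low_t + high_t) / 2 + x_opt * (high_t - low_t) / 2
print(b0, b1, b2, x_opt, t_opt)
```

With only the 30 and 50 degree runs, the slope `b1` would be the whole model and the predicted yield would keep rising with temperature; the middle run is what reveals the negative curvature and hence an interior maximum.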

These detailed designs are generally second order designs, which can be used to generate a non-linear relation like a quadratic. These second order designs are extremely useful in the later part of the work. We can use them for optimization. I want to maximize my biomass production. I want to maximize my metabolite production. Then we need to have a second order, or quadratic, model.

There are several second order designs: the Central Composite Design, the 3^k Factorial Design and the Box-Behnken Design. Many commercial software packages will generate these designs automatically, so it is not a big deal, and I will also tell you how to go about doing it. It is very useful to be able to do it even with paper and pencil. There are other designs, the Koshal design and the Hybrid design; we will not talk about them. The main designs I will talk about are the central composite and Box-Behnken designs. These designs are very useful, widely used and well accepted.

For second order designs, of course, you have to have data at 3 levels of the factors, minus 1, 0 and 1; that means temperature at 30 degrees, 40 degrees and 50 degrees. Whereas in a screening design, if you are changing temperature, you may run at 30 and 40 degrees, or at 30 and 50 degrees only. You may ask: can I do a 3 raised to the power n factorial design? Yes, you can, but you do not do 3 raised to the power n designs during screening. Never, never, never do that, because the number of experiments increases, and you do not yet know which factors are useful and which are of no use. After screening you may eliminate some factors, so what is the point of running many levels of unwanted factors? That is why we generally do screening at 2 levels, and later on the detailed design at minus 1, 0 and 1.

Central composite designs were developed by Box and Wilson. These are first order designs augmented with center points and star points. The center point is right in the middle, and the star points are outside the first order design. The first order designs are your factorial designs. The 3^k Factorial Design is a factorial arrangement with k factors, each at 3 levels. The Box-Behnken Design is also almost like the central composite, but the corner points are not used; it uses some other points instead. We will look at each one of them.

Imagine I have a 2 power 3 factorial, 8 experiments: 1, 2, 3, 4, 5, 6, 7, and the hidden corner, 8. For 2 power 3 it is like a cube, with factor A, factor B and factor C each at the minus 1 and plus 1 levels. In a central composite design, in addition to these 8 full factorial points, we do an experiment at the central point, here, and we also do experiments at 6 star points which are outside this cube. Do you understand? So 8 experiments plus 1 is 9, plus 6 star points. There are 6 faces to the cube, so there are 6 star points, one outside each face. That means they are beyond the plus 1 level, or below the minus 1 level; they are called the star points. So in total, for 3 factors, you do 8 plus 1 plus 6, which is 15 experiments. You are actually doing experiments at 5 levels, can you imagine?

Below the minus 1 level is one level; minus 1 is the second level; the center point is the third level; plus 1 is the fourth; and the star above plus 1 is the fifth. So, 5 levels: 1, 2, 3, 4, 5. Each of these 3 variables is run at 5 levels; fantastic, is it not? At 5 levels, but you are doing only 15 experiments. So we can fit non-linear equations for the effect of factor A on the output, the effect of factor B on the output and the effect of factor C on the output, just by doing 15 experiments. That is the beauty of the CCD. We cut down the number of experiments tremendously, but we get a lot of information; that is the main advantage.

Now, how do you decide on the star points? These star points are outside the faces of the cube. A cube has 6 faces, so obviously there are 6 star points: for each factor, one above plus 1 and one below minus 1. So 8 factorial points, 1 central point and 6 star points make 15 points, and each variable is run at 5 levels. You get a very good idea about the effect of each factor on your output, and you can develop a very good non-linear relationship. It is even better than your 3 raised to the power n design, because there you are working at only 3 levels, but the number of experiments may be very high. If I want to run 3 factors, A, B and C, at 3 levels as a full factorial, it becomes 3 raised to the power 3, which is 3 into 3 into 3, that is, 27 experiments. Whereas with the 15 experiments of a CCD, I get a lot of information. That is the main advantage of the CCD.
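
The run count and the five levels can be verified by listing the points. This is a minimal sketch: the function name `central_composite` is hypothetical, the star distance of 1.5 is an arbitrary value greater than 1 chosen only for illustration, and a single center point is used as in the count above.

```python
from itertools import product

def central_composite(k, alpha):
    """Factorial corners, 2k star points at distance alpha, and one center point."""
    corners = [tuple(float(v) for v in p) for p in product((-1, 1), repeat=k)]
    stars = []
    for i in range(k):
        for s in (-alpha, alpha):
            point = [0.0] * k
            point[i] = s  # move along one axis only, through a face of the cube
            stars.append(tuple(point))
    center = [(0.0,) * k]
    return corners + stars + center

points = central_composite(3, alpha=1.5)
levels_of_a = sorted({p[0] for p in points})

print(len(points))   # runs in the CCD for 3 factors
print(levels_of_a)   # distinct settings of factor A
print(3 ** 3)        # runs in the 3-level full factorial, for comparison
```

Every factor sees the same five settings, so the comparison with the 27-run design is between fewer runs at more levels and more runs at fewer levels.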

So I do the full factorial inside, which means I get quite a lot of idea about my interactions also. Then I do these extra star points and the central point, so each variable is also run at 5 levels. It is much superior to the 3 raised to the power 3 type of design, do you understand? So it is really good.

The other one is the Box-Behnken design. Instead of running at the corners, it runs at the centers of the edges. That is what this picture is about. You do the same thing at the center of the cube, but you also do experiments at the center of each edge, rather than at the corners of the cube. Of course, the standard tables cover only 3 to 7 factors. So it is not running the corners; it is running the centers of the edges. That is called the Box-Behnken Design.

Now suppose we have 2 parameters; that is like a square. If you are doing a CCD, you have the 4 points of the square, that is, the 2 raised to the power 2 factorial, then the center point, and then the star points. If the star points are on the edges of the square, these are the 4; these are called the star points.

When you are doing high, medium and low, medium is your 0, 0 center; so this is the design. But you can extend each star point a little bit outward, if the region of interest is larger, and then you end up doing experiments at 5 levels. Whereas if you keep your star points right on the edges, you are doing only 3 level experiments for each factor: high, medium and low. If you extend them beyond the edges, you are working at 5 levels. Do not forget that; see, this is what it is.

So if you extend beyond those points, each variable is changed over 5 levels: level 1, level 2, level 3, level 4, level 5. If we call the star point distance alpha, then with alpha equal to 1 those points lie right on the edges; whereas if alpha is greater than 1, they are beyond the square. So the design is exploring beyond the square; that is what it is all about. This is for 2 raised to the power 2, that is, 2 parameters.

How do we calculate the optimum distance? For 2 factors, the formula generally gives the square root of 2. So if this distance is 1, then the star distance will be 1.414, because the square root of 2 is 1.414. That is how you select it, going beyond the square. With 2 raised to the power 2, we have 4 points here. The central point is the fifth, and then we have another 4 star points. So how many experiments? 1, 2, 3, 4, 5, 6, 7, 8, 9. For 2 factors, the CCD needs nine experiments.
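
The 2 factor case can be written out in full. The lecture quotes the square root of 2 for 2 factors; the general rule used in this sketch, the fourth root of the number of factorial points, is the usual rotatability criterion and is brought in here as an assumption rather than taken from the lecture. The name `rotatable_alpha` is likewise hypothetical.

```python
import math

def rotatable_alpha(k):
    """Star distance as (number of factorial points) ** 0.25 for a full 2^k core."""
    return (2 ** k) ** 0.25

alpha = rotatable_alpha(2)  # about 1.414 for 2 factors

corners = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
stars = [(-alpha, 0), (alpha, 0), (0, -alpha), (0, alpha)]
center = [(0, 0)]
ccd_2 = corners + stars + center

# Distinct settings of the first factor, rounded for display.
levels = sorted({round(x, 3) for x, _ in ccd_2})

# With this alpha, corners and star points are all equally far from the center.
distances = [math.hypot(x, y) for x, y in corners + stars]

print(round(alpha, 3), len(ccd_2), levels)
```

With alpha set to 1 instead, the star points would sit on the edges of the square and `levels` would collapse to three values, which is the 3 level situation described above.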

If I have 2 parameters and do a 3 level factorial, that is 3 raised to the power 2, which is also 9 experiments, but there each variable is at only 3 levels. Whereas when I use a CCD, a central composite design, I am looking at each variable at 5 levels: level 1, level 2, level 3, level 4, level 5. So for 2 factors, the number of experiments in a CCD is the same as in a 3 raised to the power 2 factorial design. But in the CCD you are looking at each variable at 5 levels, whereas in the 3 raised to the power 2 full factorial you are looking at each at only 3 levels. That is the difference. So the CCD will be much better than your 3 raised to the power 2 type of design. We will continue with these second order designs in the next class.

Thank you very much.

Key words: 2^k Fractional Designs, Plackett-Burman designs, Latin Square Design, Taguchi design, L4 Design, L8 Design, L9 Design, L12 Design, L16 Design, L18 Design, Second Order DOE Designs, 3^k Factorial Designs