Welcome to the course on Biostatistics and Design of Experiments. We have been talking about
various types of screening designs. The most important one is the factorial design. Then we
looked at the fractional factorial design, which means running only a fraction of the full
factorial. Then I talked about confounding, and also about resolution. Resolution III designs
are generally not preferred; Resolution IV or above are preferred. If you say 2 power 4 minus 1,
the 4 is the number of factors, or parameters, you are looking at, and the minus 1 means a half
fraction. 2 power 4 is 2 into 2 into 2 into 2, which is 16; half of that is 8 experiments. So
what do you do with 8 experiments? You put a new factor D in place of ABC, so D is confounded
with ABC. This is a Resolution IV design, because the design generator, I is equal to ABCD, has
4 terms. Resolution IV is preferred; Resolution III is not. Now take 3 parameters, or 3 factors:
2 raised to the power 3 is 8 experiments, and suppose you do not want to do 8 experiments.
So, what do you do? 2 raised to the power 3 minus 1, which is half of 8 experiments.
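As a rough sketch of how such half fractions are built: take the full factorial in all but the last factor, and set the last factor equal to the product of the others. With 4 factors this gives D = ABC in 8 runs; with 3 factors it gives C = AB in 4 runs. The function name `half_fraction` and the coding of levels as -1 and +1 are illustrative choices, not something taken from the lecture slides.

```python
from itertools import product
from math import prod

def half_fraction(n_factors):
    """Sketch of a 2^(k-1) design: a full factorial in the first k-1 factors,
    with the last factor set to the product of the others (e.g. D = ABC)."""
    runs = []
    for base in product((-1, 1), repeat=n_factors - 1):
        runs.append(base + (prod(base),))
    return runs

design_4 = half_fraction(4)  # 8 runs, D = ABC, generator I = ABCD
design_3 = half_fraction(3)  # 4 runs, C = AB,  generator I = ABC

for run in design_3:
    print(run)
```

In both designs the product of all columns is +1 in every run, which is just another way of writing the generator I = ABCD or I = ABC.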
You are doing only 4 experiments. You replace AB with C, which means AB is confounded with C.
So the design generator is I is equal to ABC, and this is Resolution III, because the generator
has 3 terms. The generator tells you which terms are confounded, and it also tells you the
resolution and the number of runs. I also taught you how to do the calculations once you have
the results. Then we went on to something else, the Plackett-Burman designs. These are also
very good screening designs.
We can screen a large number of factors with a minimal number of experiments. They work best
when the number of factors is 3, 7, 11 and so on, because the design generators for
Plackett-Burman designs are meant for 8 runs, 12 runs, 16 runs, 20 runs, 24 runs. So, if the
number of factors is one less than the number of runs, the Plackett-Burman design is very good,
because with a small number of experiments you get a very good idea about the main effects.
So, you have the design generator here. I taught you how to build up these tables.
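A minimal sketch of building such a table for 12 runs is given below. The generator row used here is the one commonly tabulated for the 12-run Plackett-Burman design; it is an assumption on my part and is not copied from the lecture slide, and the direction in which the signs are shifted is likewise just one convention.

```python
# Commonly tabulated generator row for 12 runs and 11 factors (assumed, see above).
GENERATOR_12 = [+1, +1, -1, +1, +1, +1, -1, -1, -1, +1, -1]

def plackett_burman_12():
    """Shift the generator row cyclically to get 11 rows, then append a row of all -1."""
    n = len(GENERATOR_12)
    rows = [GENERATOR_12[-shift:] + GENERATOR_12[:-shift] for shift in range(n)]
    rows.append([-1] * n)
    return rows

pb12 = plackett_burman_12()
columns = list(zip(*pb12))

# Balance: every column has six +1 and six -1, so it sums to zero.
balanced = all(sum(col) == 0 for col in columns)

# Orthogonality: every pair of columns has a zero dot product.
orthogonal = all(
    sum(x * y for x, y in zip(columns[i], columns[j])) == 0
    for i in range(len(columns))
    for j in range(i + 1, len(columns))
)
print(len(pb12), "runs, balanced:", balanced, "orthogonal:", orthogonal)
```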
The last row is always minus, minus, minus, and you shift the signs of the generator to build
up the other rows. So the Plackett-Burman design is very good if we have 3, 7, 11, 15
parameters and so on. If the number of parameters is slightly less than that, we can always
treat some of the columns as dummy variables. This is taken from this particular reference.
Then, we talked about the Latin Square Design.
This is also a very important and useful type of design. I showed you a Latin Square for 3
parameters A, B, C, each operated at 3 levels. How do you represent that? Minus 1, 0 and 1.
For 2 levels we say minus 1 and 1; for 3 levels we say minus 1, 0, 1. Here you can see you need
to do 9 experiments: this is the first experiment, this the second, this the third, then 4, 5,
6, 7, 8, 9. So it is a very nice Latin Square Design for a 3 variable problem. For a smaller
problem, with 2 levels, we can have a 2 by 2 Latin Square, and so on. Then comes another
screening design, the Taguchi design. Traditional DOEs focus on how each factor (not each
level) affects the output y. y is your output, or the dependent variable. Suppose I am running
a bioprocess; the yield of a metabolite, or the biomass, is the output. Taguchi, in contrast,
looked at robust design, and he is more interested in the variation, the loss function. So
Taguchi developed designs aimed at robustness, looking at the variations rather than the
averages. In normal designs, like the factorial designs or even Plackett-Burman, we are looking
at the average output; that is what we are more interested in.
For Taguchi designs there is a design selector. This tells you the number of parameters and
this tells you the number of levels. This table was taken from this particular reference. So,
if I have 2 levels and 2 parameters, there is something called the L4 design, and the L4 table
is available here. This is the L4 table. As you can see, L4 means 4 runs. Suppose I have 2
variables: I can pick up columns 1 and 2 for A and B, and do these two.
so, I can do these two. If I have 3 parameters also, I can take up
this. So, this will become like a confounding; do you understand? And so on. So, if you look
at this table, there are tables available like L 9 table, L 18 table, L 27 table. They
are meant for 3 levels and so on. L 16 table meant for 4 level, L 25 tables, 50 tables
available for 5 levels. So, standard tables are available. We just pick up and then use
it. So, let us look at this level 2. So, you have tables for L 4, L 8. So, we can do it
for, suppose I have parameters 3 or 2, I can use the L 4 table. If I have parameters 4,
or 5, or 6, or even 7, I can use the L 8 table, ok. And, if I have parameters 8, 9, 10, 11,
I can use L 12 table and so on, actually. So, if you look at the L 4 table, 2 levels,
I can do a full factorial design. I can, if I maintain 2 factors, I can get a Resolution
IV, or I can do, go up to 3 factors for screening. new slide So, if I am interested in doing a screening
type of work, I can go up to, take this table, and I can look at 3 variables. For example,
this. This is the L 4 table; 4 runs. This is for factor A, factor B, factor C. So, if
I put A, B, C, then, obviously it becomes a screening design. Whereas, if I put only
A, B, I can do a very good, high resolution, full factor, it becomes a full factorial design.
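To see why the third column behaves like a confounding, here is a small sketch of an L4-style table in coded units. The mapping of the tabulated levels 1 and 2 to -1 and +1 is my own assumption for illustration; with it, the third column is exactly the negative of the product of the first two, so a factor C placed there cannot be separated from the AB interaction.

```python
# L4-style array in coded units (level 1 -> -1, level 2 -> +1; mapping assumed).
L4 = [
    (-1, -1, -1),
    (-1, +1, +1),
    (+1, -1, +1),
    (+1, +1, -1),
]

# Two factors: columns 1 and 2 alone form the full 2^2 factorial.
two_factor_runs = {(a, b) for a, b, _ in L4}
is_full_factorial = len(two_factor_runs) == 4

# Three factors: column 3 is determined by columns 1 and 2 (here C = -AB),
# so the main effect of C is confounded with the AB interaction.
c_is_aliased_with_ab = all(c == -(a * b) for a, b, c in L4)

print("full factorial in A, B:", is_full_factorial)
print("C confounded with AB:", c_is_aliased_with_ab)
```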
This is the table called the L4 table. So I simply pick up the L4 design for 2 levels; if there
are 2 factors, I put A, B and ignore the third column; if there are 3 factors, I put A, B, C,
and do the experiments accordingly. Let us look at the L8 design; that means 8 runs. This
number tells you how many runs: 4, 8, 9 and so on. So, the 8 tells you the number of
experiments.
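Since the number in the table name is the run count, the choice of a 2-level table described earlier reduces to a small lookup: the smallest table whose run count minus one is at least the number of factors. The sketch below covers only the 2-level tables named in this lecture (L4, L8, L12, L16); the function name is hypothetical.

```python
# 2-level tables mentioned in the lecture, as (name, number of runs).
TWO_LEVEL_TABLES = [("L4", 4), ("L8", 8), ("L12", 12), ("L16", 16)]

def smallest_two_level_table(n_factors):
    """Return the smallest listed table that can hold n_factors at 2 levels.
    A table with N runs has at most N - 1 columns for factors."""
    if n_factors < 2:
        raise ValueError("need at least 2 factors")
    for name, runs in TWO_LEVEL_TABLES:
        if n_factors <= runs - 1:
            return name
    raise ValueError("more factors than the tables listed here can hold")

for k in (2, 3, 5, 7, 11, 15):
    print(k, "factors ->", smallest_two_level_table(k))
```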
If you look at the L8 design at 2 levels, with 3 factors it becomes a full factorial, right?
2 raised to the power 3 is 8. But if I want to do screening, I can go up to 7 factors; it is
almost like your Plackett-Burman. If I want to get Resolution V, then I stick to only 3
factors. So let us look at the L8 design. This is how it looks. Either I can have fewer
factors: with only 3 factors A, B, C it becomes a full factorial design, and the other columns
become AB, BC, AC and ABC. Or I can have 7 factors, A, B, C, D, E, F, G, while still doing 8
experiments, like your Plackett-Burman design. Notice that even here you have the balancing,
4 minuses and 4 pluses in each column. You should always have balance and orthogonality. If I
want to do a full factorial design, 2 raised to the power 3 is 8 experiments, so I select this
column and this column. I do not select this one, of course, because this column and that one
look the same. So I select this one instead. That is why I select this column, this column and
this column for a 3 factor full factorial design. If I want to do screening, I can go right up
to 7 factors, A, B, C, D, E, F, G, using all the columns. So, I can keep increasing.
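The structure of the 8-run table can be sketched by starting from three base columns and filling the remaining four with their products. The column order used below (base factors in positions 1, 2 and 4) follows the layout commonly printed for L8 and is an assumption here; the signs may also differ from a published table, which does not affect balance or orthogonality.

```python
from itertools import product

def l8_like():
    """8-run, 7-column two-level array: three base columns and their products."""
    rows = []
    for a, b, c in product((-1, 1), repeat=3):
        rows.append((a, b, a * b, c, a * c, b * c, a * b * c))
    return rows

def is_balanced(design):
    # Each column has as many -1 as +1, so it sums to zero.
    return all(sum(col) == 0 for col in zip(*design))

def is_orthogonal(design):
    # Every pair of distinct columns has a zero dot product.
    cols = list(zip(*design))
    return all(
        sum(x * y for x, y in zip(cols[i], cols[j])) == 0
        for i in range(len(cols))
        for j in range(i + 1, len(cols))
    )

l8 = l8_like()
# Columns 1, 2 and 4 (indices 0, 1, 3) on their own give the full 2^3 factorial.
full_factorial_runs = {(row[0], row[1], row[3]) for row in l8}
print(is_balanced(l8), is_orthogonal(l8), len(full_factorial_runs))
```

Used with 3 factors, the other four columns estimate the interactions; used with 7 factors, every column carries a main effect and the interactions are confounded with them.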
So, columns 1, 2, 4, 7. I will not use 5, because 4 and 5 look similar. So I can use this,
this, this and then this, and so on for the others. So, L4 and L8: if I want to do a full
factorial design with the L8, I can do up to 3 factors, and I get the main effects with their
interactions; and if I want to look at 7 factors, I can do that too, and it will be almost like
a Plackett-Burman design. Then you have the L9 design. L9 designs are generally for 3 levels,
minus 1, 0, 1. If I have only 2 factors, it is 3 power 2, which is 3 into 3, that is 9, so I
get a full factorial. But I can go up to 4 factors as a screening design, still with 9 runs.
As you can see, there are minus 1s, zeros and plus 1s: this is the highest level, this is the
lowest level, and this is the middle level. For example, for temperature, if I am interested in
30, 40, 50, then plus 1 means 50, minus 1 means 30 and 0 means 40. For pH, if I am looking at
3, 4, 5, then minus 1 means 3, plus 1 means 5 and 0 means 4. If I want to do a full factorial
design, I take 2 factors, in column 1 and column 2. For a screening design, I can keep adding
factors in the remaining columns. Then you also have the L12 design. As you can see, the L12
design is at 2 levels, so we can use it as a screening design for 11 factors. I can put 11
factors here, A, B, C, D, E and so on, and I will do 12 runs. This is almost like your
Plackett-Burman design. The only difference is that the columns and rows have been permuted; in
some cases they have interchanged the columns and the rows. In the Plackett-Burman design, if
you look at the bottom, the last of the 12 runs is always all negatives. So there is a
switching over of some rows and columns, but otherwise it is exactly like Plackett-Burman. So
with 11 factors I can use a Plackett-Burman design and do 12 experiments, or I can pick up the
Taguchi table for L12 and do a screening design. I have both options for 12 experiments. Just
like L4, L8, L9 and L12, you also have the L16 table. This is the L16 table. So, there are 16
runs.
This number indicates the number of runs; you have 16 runs here. So I can do a full factorial:
2 raised to the power 4 is a full factorial, right, 2 into 2 into 2 into 2. With the full
factorial I have 4 factors; I can go up to 5 factors and maintain Resolution V. If I want to do
a screening design, I can go up to 15 factors, just like your Plackett-Burman experiment. And
then there are L18 designs, that means 18 experiments, 27 experiments and so on. The advantage
of the Taguchi designs is that tables are available. We can use different tables depending upon
the number of parameters and the number of levels we have. If we have fewer parameters than the
table provides, we can ignore the extra columns. One important point is that you need to
crosscheck that balance and orthogonality hold, because whatever design you take, we need those
conditions; that is very important. Sometimes it will even look like a Plackett-Burman design,
especially when you are looking at the L12 and screening for 11 factors. So, we have different
types of tables; that is the advantage of the Taguchi method. This is an L18 design. We have 18
experiments here. This is at 3 levels: minus 1, 0 and plus 1. The main advantage, as I said, is
that the design tables are available. Taguchi has prepared them, so we can use those tables
directly, whether it is 2 levels or 3 levels. As you can see here, he even has tables for 2
levels, 3 levels, 4 levels and 5 levels. Generally, we will not go to 4 levels and 5 levels.
So, for 2 levels and 3 levels there are tables available, depending upon the number of
parameters. There are tables for L32 also, which means you need to do 32 experiments. You just
pick up those tables and make use of them for your design. So we have covered a lot of designs
which can be used for screening: the full factorial, the fractional factorial, the
Plackett-Burman design, the Latin Square design and the Taguchi design. All these are very
good. With Taguchi, of course, you can mix and match, and use it for a full factorial or even a
fractional factorial. Most of these screening designs are run at 2 levels, which means you can
get only a linear relationship. If temperature varies from 30 to 40, I am measuring at 30 and
at 40. For the effect of pH, I am measuring at, say, 3 and at 4; for carbon concentration, at
1 percent and at 2 percent. Since I am measuring at two places, obviously I can get only a
linear relationship; if I generate a regression equation, it is a linear relation. But
ultimately we are interested in optimization, and for that I want a non-linear relation. In
such situations I need data at more than two points. Only then can I get at least a square
term, a quadratic term, which is non-linear. Then I can find an optimum, a maximum yield or a
minimum impurity and so on. A linear relation is always monotonically increasing or
monotonically decreasing; there is no optimum. So initially, during screening, we do all the
experiments at 2 levels, whether it is 2 raised to the power n, or 2 raised to the power n
minus 1, or whatever it is, with minus 1 and plus 1 as the coded levels. We cannot use those
results to develop a second order model; we can use them to develop a linear model, not a
quadratic model. So screening designs are used for eliminating the many parameters which are of
no significance, and they can be used to develop a linear relationship. But if we are
interested in developing a non-linear relationship, then we need to go for a second order
model. When you start your design, you do not immediately jump into a second order model; that
is a very bad idea. What you do is a screening design first. You have 5 parameters, you do a
screening design, come down to 2 parameters, and then you do a second order design for those 2
parameters. A second order design requires results at 3 levels, minus 1, 0, 1, because only
with experiments at 3 levels can you get a quadratic relationship; so the number of experiments
also increases. So always remember: when you start your experiments, never go straight to
second order designs. Start with screening designs like factorial, fractional factorial,
Taguchi, Plackett-Burman or even Latin Square. Eliminate many variables, and then do a detailed
design; that is what we are going to talk about.
These detailed designs are generally second order designs, which can be used to generate a
non-linear relation such as a quadratic relation. These second order designs are extremely
useful in the later part; we can use them for optimization. I want to maximize my biomass
production, or maximize my metabolite production; then I need a second order, or quadratic,
model. There are several second order designs: the Central Composite Design, the 3^k Factorial
Design and the Box-Behnken Design. Many commercial software packages will generate these
designs automatically, so it is not a big deal, and I will also tell you how to go about doing
it. It is very useful to be able to do that even with paper and pencil.
There are other designs, such as the Koshal design and the Hybrid design; we will not talk
about them. The main designs I will talk about are the central composite and Box-Behnken
designs. These designs are very useful, widely used and well accepted. For second order
designs, of course, you have to have data at 3 levels of each factor, minus 1, 0, 1; that
means, for example, temperature at 30 degrees, 40 degrees and 50 degrees. When you are doing a
screening design and changing temperature, you may run at 30 and 40 degrees, or at 30 and 50
degrees only. In a second order design you have 3 levels, so you run at 30, 40 and 50. Of
course, you may ask: can I do a 3 raised to the power n factorial design? Yes, you can, but you
do not do 3 raised to the power n designs during the screening process. Never do that, because
the number of experiments increases, and you do not yet know which factors are useful and which
are not. After screening you may eliminate some factors, so what is the point of running many
levels of unwanted factors? That is why we generally do screening at 2 levels, and later on the
detailed design at minus 1, 0, 1.
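To illustrate why the third level matters, the sketch below fits y = b0 + b1 x + b11 x^2 exactly through responses at the coded levels -1, 0 and +1 and locates the stationary point. The yield values are made up purely for illustration, and the decoding uses the 30, 40, 50 degree example, where coded 0 is 40 degrees and one coded unit is 10 degrees.

```python
def fit_quadratic_three_levels(y_low, y_mid, y_high):
    """Coefficients of y = b0 + b1*x + b11*x**2 through x = -1, 0, +1."""
    b0 = y_mid
    b1 = (y_high - y_low) / 2
    b11 = (y_high + y_low) / 2 - y_mid
    return b0, b1, b11

def stationary_point(b1, b11):
    """Coded x where dy/dx = 0; undefined when there is no curvature."""
    if b11 == 0:
        raise ValueError("no curvature: the fitted model is linear, so there is no optimum")
    return -b1 / (2 * b11)

# Hypothetical yields at 30, 40 and 50 degrees (coded -1, 0, +1).
b0, b1, b11 = fit_quadratic_three_levels(3.0, 6.0, 5.0)
x_opt = stationary_point(b1, b11)
temperature_opt = 40 + 10 * x_opt  # decode from coded units to degrees

print(b0, b1, b11)             # 6.0 1.0 -2.0
print(x_opt, temperature_opt)  # 0.25 42.5
```

With only the two outer points, b11 cannot be estimated at all; the same data would give just the straight line through (-1, 3.0) and (+1, 5.0), which keeps increasing and has no optimum.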
Central composite designs were developed by Box and Wilson. These are first order designs
augmented with center points and star points. The center point is right in the middle, and the
star points are outside the first order design. First order designs are your factorial designs.
The 3^k Factorial Design is a factorial arrangement with k factors, each at 3 levels. The
Box-Behnken Design is almost like the central composite, but the corner points are not used; it
uses some other points instead. We will look at each one of them. So imagine I have a 2 power 3
factorial, 8 experiments: 1, 2, 3, 4, 5, 6, 7, and the 8th at the back. A 2 power 3 design is
like a cube, with factor A, factor B and factor C each at the minus 1 and plus 1 levels. In a
central composite design, in addition to these 8 full factorial points, we do an experiment at
the central point, here, and we also do experiments at 6 star points which are outside this
cube. So, 8 experiments plus 1 is 9 experiments, plus 6 star points. There are 6 faces to the
cube, so there are 6 star points, one outside each face; that is, beyond the plus 1 level or
below the minus 1 level. These are called the star points. So in total, for 3 factors, you do
8 plus 1 plus 6, that is 15 experiments. And you are actually doing experiments at 5 levels,
can you imagine?
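The count of 15 runs and 5 levels can be checked with a short sketch that lists the points of a central composite design in coded units. The star distance alpha = 1.682 used for 3 factors is an assumed value (a commonly quoted choice), since the lecture has not given one for this case; any alpha greater than 1 gives the same counts.

```python
from itertools import product

def central_composite(k, alpha):
    """Corner points of the 2^k factorial, 2k star points at distance alpha
    along each axis, and one center point, all in coded units."""
    corners = [tuple(float(v) for v in p) for p in product((-1, 1), repeat=k)]
    stars = []
    for axis in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[axis] = sign * alpha
            stars.append(tuple(point))
    center = [tuple([0.0] * k)]
    return corners + stars + center

ccd3 = central_composite(3, alpha=1.682)  # alpha is an assumed value
levels_of_a = sorted({run[0] for run in ccd3})

print(len(ccd3))    # 15
print(levels_of_a)  # [-1.682, -1.0, 0.0, 1.0, 1.682]
```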
One below the minus 1 level is one level; minus 1 is one level; the center point is level 3;
then plus 1 is one level; and the star above plus 1 is another level. So, 5 levels: 1, 2, 3, 4,
5. Each of these 3 variables is run at 5 levels; fantastic, is it not? 5 levels, but you are
doing only 15 experiments. So we can fit non-linear equations for the effect of factor A on the
output, the effect of factor B on the output and the effect of factor C on the output, just by
doing 15 experiments. That is the beauty of CCDs. We cut down the number of experiments
tremendously, but we get a lot of information using this type of approach; that is the main
advantage. Now, how do you decide on the star points?
These star points are outside the faces of the cube. A cube has 6 faces, so obviously there are
6 star points. There are 8 points for the factorial, 1 central point, which makes 9, and 6
stars: above plus 1 and below minus 1 for each of the three factors. So 8 plus 1 plus 6 is 15
points, and each variable is run at 5 levels. You get a very good idea about the effect of each
parameter, or factor, on your output, and you can develop a very good non-linear relationship.
It is even better than your 3 raised to the power n design, because in 3 raised to the power n
you have only 3 levels, but the number of experiments may be very high. Imagine I want to do a
factorial for 3 factors A, B, C at 3 levels: it becomes 3 raised to the power 3, which is 3
into 3 into 3, that is 27 experiments. Whereas with 15 experiments in a CCD I can get a lot of
information. That is the main advantage of the CCD.
I do the full factorial inside, which means I also get quite a good idea about my interactions.
Then I do the extra star points and the central point, so each variable is run at 5 levels. So
it is much superior to 3 raised to the power 3 type designs; it is really good. The other one
is the Box-Behnken design. Instead of running at the corners, it runs at the centers of the
edges. That is what this picture is about. You still do an experiment at the center of the
cube, but you also do experiments at the centers of the edges, rather than at the corners of
the cube. Of course, it exists only when the number of factors is 3 to 7. So it is not the
corners; it is the centers of the edges. That is called the Box-Behnken Design. Now, if we have
2 parameters, the design is like a square. If you are doing a CCD, you have the 4 points of the
square, that is, 2 raised to the power 2, then the center point, and then 4 more points, which
here sit on the edges of the square; these are called the star points.
When you are doing high, medium, low, the medium is your 0, 0 point; so, this is the design.
But if the region of interest is larger, you can extend each star point a little bit outwards,
and that way you end up doing 5 level experiments. Whereas, if you keep your star points right
on the edges, you are doing only 3 level experiments for each parameter, or variable, or
factor; it will be high, medium and low only. If you extend beyond the edges, then you are
doing 5 levels. Do not forget that; this is what it looks like.
Each variable is changed over level 1, level 2, level 3, level 4, level 5. Here, if you call
the star point distance alpha, then if alpha is 1 the star points lie right on the edges of the
square, whereas if alpha is greater than 1 they are beyond the square. So the design explores
beyond the square; that is what this type of design is all about. This is for 2 raised to the
power 2, that is, 2 parameters. What is the optimum distance for the star points? Generally,
for this case, the formula gives square root of 2 as the optimum distance.
So, if this is 1, then this will be 1.414, because the square root of 2 is 1.414. That is how
you select the star points beyond the square. For 2 raised to the power 2 we have 4 points
here, the central point is the 5th, and then we have another 4 points. So how many experiments?
1, 2, 3, 4, 5, 6, 7, 8, 9 experiments for 2 factors.
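The value 1.414 is consistent with the usual rotatability rule, alpha equal to the fourth root of the number of factorial points, which for 2 factors is the fourth root of 4. That rule is not stated in the lecture, so treat it as an assumption of this sketch. The code lists the 9 runs of the 2-factor design and compares the number of levels per factor with a 3 raised to the power 2 factorial.

```python
from itertools import product
from math import isclose, sqrt

def rotatable_alpha(k):
    """Fourth root of the number of factorial points, 2**k (assumed rule)."""
    return (2 ** k) ** 0.25

alpha = rotatable_alpha(2)  # about 1.414 for 2 factors

corners = list(product((-1.0, 1.0), repeat=2))
stars = [(-alpha, 0.0), (alpha, 0.0), (0.0, -alpha), (0.0, alpha)]
ccd2 = corners + stars + [(0.0, 0.0)]

three_level = list(product((-1.0, 0.0, 1.0), repeat=2))

ccd_levels = len({x for x, _ in ccd2})
factorial_levels = len({x for x, _ in three_level})

print(round(alpha, 3), len(ccd2), ccd_levels)  # 1.414 9 5
print(len(three_level), factorial_levels)      # 9 3
```

Both designs use 9 runs, but the central composite layout spreads them over 5 levels of each factor instead of 3.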
If I have 2 parameters and do a 3 level factorial, that is 3 raised to the power 2, which is 9
experiments, but each variable is run at only 3 levels. Whereas when I use a CCD, a central
composite design, I am looking at each variable at 5 levels. So for 2 factors the number of
experiments in a CCD is the same as in a 3 raised to the power 2 factorial design, but in the
CCD each variable is looked at 5 levels, whereas in the full factorial 3 raised to the power 2
it is looked at only 3 levels. That is the difference. So a CCD is much better than a 3 raised
to the power 2 type of design. We will continue with these second order designs in the next
class.
Thank you very much.
Key words: 2^k Fractional Designs, Plackett-Burman designs, Latin Square Design, Taguchi
design, L4 Design, L8 Design, L9 Design, L12 Design, L16 Design, L18 Design, Second Order DOE
Designs, 3^k Factorial Designs