Design of Experiments

Hi! In this blog, I will be covering a popular experimental tool used widely by scientists and researchers: Design of Experiments (DOE). 🤩🌟

First, I shall cover some background about this tool, followed by how to use it in a relevant Case Study on Popcorn Kernels and my personal learning reflection.

Preview of DOE

Why DOE?

DOE is used to study the effect of multiple (>1) parameters (factors/independent variables) on a response output (dependent variable).

Compared to a guess-and-check method, it allows us to do things more systematically and easily.

DOE can also be used to study the interaction between 2-3 factors. However, at this educational level, we shall only touch on 2nd-order (two-factor) interactions.


In this DOE, each factor has 2 levels:

HIGH (+) and LOW (-). You can decide the value of each level, but it must be kept constant throughout the experiment.

E.g. Temperature Factor

HIGH (+): 100°C 

LOW (-): 60°C 

For each experiment set, the factor levels can be varied to create multiple treatments (combinations of factor levels).

A guideline formula to decide how many experimental runs to carry out:
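A minimal sketch of the guideline, assuming it is the usual rule for a full factorial design: number of runs = levels ^ factors. This matches the numbers used in this post (3 factors give 8 runs, 10 factors give 1024 treatments). The function name `full_factorial_runs` is my own, and the sketch ignores any replicates.

```python
def full_factorial_runs(n_factors: int, n_levels: int = 2) -> int:
    """Number of treatments in a full factorial design (assumed rule: levels ** factors)."""
    return n_levels ** n_factors


# Two-level designs with 2, 3 and 10 factors
for n in (2, 3, 10):
    print(f"{n} factors -> {full_factorial_runs(n)} runs")
```

The jump from 8 runs to 1024 runs is what makes a full factorial impractical once there are many factors.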



A quick tip when designing a DOE Test Set Run 💡:

It is important to carry out the experimental runs in a randomised order to ensure the fairness and reliability of the results.

When runs with the same factor levels are always carried out one after another in a fixed sequence, anything that drifts over time can bias the results and make them unreliable.

An example is shown here:

Example to show the difference between run set and run order

In an experiment, we shall follow the run order to carry out tests, but for recording data, we will record them in the relevant run sets.
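One way to get a run order is to shuffle the run set numbers. This is only a sketch: the helper name `randomised_run_order` and the seed value are my own choices, and the seed is there only so the same order can be reproduced.

```python
import random


def randomised_run_order(n_run_sets: int, seed=None) -> list[int]:
    """Return the run set numbers 1..n shuffled into a random execution order."""
    rng = random.Random(seed)  # separate generator so the global random state is untouched
    run_sets = list(range(1, n_run_sets + 1))
    rng.shuffle(run_sets)
    return run_sets


order = randomised_run_order(8, seed=42)
for position, run_set in enumerate(order, start=1):
    # Carry out the tests in this order, but record each result under its run set
    print(f"Run order {position}: carry out run set {run_set}")
```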

The run sets in the data table should ALWAYS be listed in the following sequence for a 3-factor DOE setup, so that recording and calculations stay consistent:


Run #1 is always (-)(-)(-)
Run #2 is always (+)(-)(-)
and so on...
ending with Run #8 as (+)(+)(+)
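The sign pattern above, where the first factor alternates fastest, can be generated instead of typed by hand. Here is a minimal sketch, assuming three factors named A, B and C; the function name `standard_order` is my own.

```python
from itertools import product


def standard_order(factors=("A", "B", "C")):
    """Return the sign rows of a two-level design with the first factor alternating fastest."""
    rows = []
    # product() varies its LAST position fastest, so each combination is reversed
    for combo in product("-+", repeat=len(factors)):
        rows.append(combo[::-1])
    return rows


table = standard_order()
for run, signs in enumerate(table, start=1):
    print(f"Run #{run}: " + "".join(f"({s})" for s in signs))
```

Checking a hand-made table against this list is a quick way to catch a swapped sign before doing any calculations.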


This is an important aspect to check for when setting up an experimental run as well as during calculations.

Full VS Fractional Factorial Method (Which to choose?)

The Full Factorial Method is the most thorough method, as it tests every combination of factor levels and so gives the most complete and precise analysis.

However, it becomes infeasible as the number of factors grows, because the number of treatments increases exponentially. For example, 10 factors give 1024 treatments to study. It is impractical to carry out test runs for such a huge number.

Hence, the Fractional Factorial Method is used when there are too many experimental sets to study; running only a fraction of them makes us more efficient.

Therefore, the Full Factorial Method is preferred where it is feasible, followed by the Fractional Factorial Method.
Nonetheless, whichever method is used, there will always be a tradeoff to consider.

Using a relevant Case Study, I shall explain the subsequent topics.

Fractional Factorial Method (How to Fractionalise?)


Using an excerpt from the case study's full factorial method, we want to lower the number of test sets we need to conduct by keeping only a fraction of the initial amount.

Currently, there are a total of 8 test sets. By fractionalising, we can reduce it to 4 test sets.

To select good test sets for this method, the chosen fraction has to be orthogonal so that it has good statistical properties.
Hence, each factor has to occur the same number of times at each level. For example, with 4 runs and 3 factors, each factor is LOW (-) twice and HIGH (+) twice, giving 6 LOW (-) and 6 HIGH (+) in total. Each factor is at LOW and at HIGH 50% of the time here.

Runs 1, 3, 4, 7 are selected from the full set of 8 runs shown previously
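The balance rule above can be checked in code before committing to a fraction. The sketch below also checks each pair of factor columns, because in standard DOE practice a fraction can be balanced for every factor and still not be orthogonal. Both fractions here are hypothetical examples coded as (A, B, C) with -1 for LOW and +1 for HIGH. They are not the runs from the case study, and the function names are my own.

```python
from itertools import combinations


def is_balanced(runs):
    """True if every factor column has as many +1 as -1 (column sum is zero)."""
    return all(sum(col) == 0 for col in zip(*runs))


def is_orthogonal(runs):
    """True if every pair of factor columns has a zero dot product."""
    cols = list(zip(*runs))
    return all(
        sum(x * y for x, y in zip(c1, c2)) == 0
        for c1, c2 in combinations(cols, 2)
    )


# Hypothetical half fractions of a 3-factor, two-level design
good_fraction = [(-1, -1, +1), (+1, -1, -1), (-1, +1, -1), (+1, +1, +1)]
poor_fraction = [(-1, -1, +1), (-1, +1, +1), (+1, -1, -1), (+1, +1, -1)]

for name, runs in (("good", good_fraction), ("poor", poor_fraction)):
    print(name, "balanced:", is_balanced(runs), "orthogonal:", is_orthogonal(runs))
```

In the poor fraction, A is HIGH exactly when C is LOW, so the two effects could not be separated even though every factor has 2 LOWs and 2 HIGHs.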

Finally, check that the runs you selected actually let you carry out the fractional factorial calculations without difficulty.
At every step, ensure that you are analysing the results to make sure they make sense before proceeding further.

CASE STUDY: Comparison of Factors affecting Popcorn Bullet Yield

Full Factorial Method is on Sheet 1
Fractional Factorial Method is on Sheet 2
My administrative number is 2122388; hence, following the deliverable requirement, I replaced the last digit of some of the results with the digit 8 (X >> 8).

Introduction:
In an ideal world, having no popcorn bullets in your bag of popcorn is what everyone would want.
I don't know about you, but having that hard-crushing feeling and taste in between the sweet taste of my popcorn just ruins it. However, I do love biting down on the kernels from time to time.
After all, popcorn is a treat reserved for the movies 🎥🍿🍿🍿

Therefore, we shall investigate the factors affecting popcorn bullet yield and discover the best parameters for making popcorn with as few "bullets" as possible.


In this case study, three factors are identified as causes of lost popcorn yield:
  1. Diameter of bowls to contain the corn, 10 cm and 15 cm
  2. Microwaving time, 4 minutes and 6 minutes
  3. Power setting of microwave, 75% and 100%
8 runs were performed with 100 g of corn used in every run. The measured variable is the amount of "bullets" formed, in grams. The data sets collected are shown below:





Full factorial method (How to?)

1. Create the data table with all the results from the experiment




2. Next, write down the reference key to factors


In order to study the effect of factors, having a graph will allow us to analyse the factor's effect at a glance. 

3. Key in data from the test runs to create a graph

Here, we calculate the mean output for each factor by averaging the runs where the factor is HIGH (+) and the runs where it is LOW (-) separately.



The total effect of a factor will be the difference between the HIGH (+) and LOW (-)
HIGH - LOW = Difference = Total Effect
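As a sketch of this calculation outside Excel: the function below averages the HIGH runs and the LOW runs of one factor and takes the difference. The numbers are illustrative only and are NOT the case study's data, and the function name `main_effect` is my own.

```python
def main_effect(levels, responses):
    """Return (mean at HIGH, mean at LOW, total effect) for one factor."""
    high = [y for s, y in zip(levels, responses) if s > 0]
    low = [y for s, y in zip(levels, responses) if s < 0]
    mean_high = sum(high) / len(high)
    mean_low = sum(low) / len(low)
    return mean_high, mean_low, mean_high - mean_low  # HIGH - LOW = total effect


# Illustrative values only (not the case study's results)
factor_levels = [-1, +1, -1, +1, -1, +1, -1, +1]        # level of one factor in runs 1..8
bullets_g = [3.0, 2.5, 2.0, 1.5, 1.0, 0.5, 1.0, 0.5]    # response in grams

mean_high, mean_low, effect = main_effect(factor_levels, bullets_g)
print(mean_high, mean_low, effect)
```

A negative total effect means the output goes down when the factor moves from LOW to HIGH.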

4. We shall use a 2-D Line Chart to analyse the data.


Explanation of graph

Significance of Factors

Factor A
  • Factor A has a negative effect on the popcorn bullets formed: the larger the diameter of the bowl, the smaller the amount of popcorn bullets formed. However, the effect is minor.
  • When the diameter of the bowl increases from 10 cm to 15 cm, the average amount of "bullets" formed reduces slightly from 1.91 g to 1.76 g.
  • To minimise the amount of popcorn bullets formed, increase the diameter of the bowl used.
Factor B
  • Factor B also has a negative effect on the popcorn bullets formed: the longer the microwaving time, the smaller the amount of popcorn bullets formed. It has a medium effect on the output.
  • When the microwaving time increases from 4 min to 6 min, the average amount of "bullets" formed reduces from 2.17 g to a decent 1.49 g.
  • To minimise the amount of popcorn bullets formed, increase the microwaving time.
Factor C
  • Factor C has a negative effect on the popcorn bullets formed: the greater the microwave heating power, the smaller the amount of popcorn bullets formed. It has a major effect on the output.
  • When the heating power increases from 75% to 100%, the average amount of "bullets" formed reduces from 2.94 g to a shocking 0.72 g.
  • To minimise the amount of popcorn bullets formed, increase the microwave heating power.
Ranking of Factors (from greatest effect to lowest effect)
C > B > A
On the graph, the steeper the line, the greater the effect, so the significance of the factors is very easy to compare.
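The ranking can also be checked with a few lines of code. This sketch uses only the LOW and HIGH averages quoted in the bullet points above. The variable names are my own, and the rounding to 2 decimal places is just to keep the output tidy.

```python
# (LOW average, HIGH average) in grams, as quoted in the bullet points above
averages = {"A": (1.91, 1.76), "B": (2.17, 1.49), "C": (2.94, 0.72)}

# Total effect = HIGH - LOW
effects = {f: round(high - low, 2) for f, (low, high) in averages.items()}

# Rank by the size of the effect, ignoring its sign
ranking = sorted(effects, key=lambda f: abs(effects[f]), reverse=True)

print(effects)
print(" > ".join(ranking))
```

All three effects are negative, so moving any of the factors from LOW to HIGH reduces the amount of bullets in the full factorial data.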

Interaction Effect

We will now study the 2nd-order (two-factor) interaction effects. To compare them, we shall again use the graphical method, as it allows us to analyse the data effectively.

Likewise, in our data table, we take the average values at the HIGH and LOW levels of the factors.
The factor named second is held constant at each of its levels in turn, while the factor named first is the one that is varied and averaged.
E.g. A x B
At LOW B... Avg of HIGH A... Avg of LOW A
At HIGH B... Avg of HIGH A... Avg of LOW A
Calculations are similar to those for Significance.
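The averaging above can be sketched as follows: for each level of the second factor, average the response at LOW and at HIGH of the first factor, then compare the gradients of the two lines. The data is illustrative only and is NOT the case study's results; the function name `interaction_lines` is my own.

```python
def interaction_lines(first, second, responses):
    """For each level of `second`, return (mean at LOW first, mean at HIGH first, gradient)."""
    lines = {}
    for second_level in (-1, +1):
        cell = {-1: [], +1: []}
        for f, s, y in zip(first, second, responses):
            if s == second_level:
                cell[f].append(y)
        low = sum(cell[-1]) / len(cell[-1])
        high = sum(cell[+1]) / len(cell[+1])
        lines[second_level] = (low, high, high - low)
    return lines


# Illustrative data only (not the case study's results), 8 runs of a 3-factor design
A = [-1, +1, -1, +1, -1, +1, -1, +1]
B = [-1, -1, +1, +1, -1, -1, +1, +1]
y = [3.0, 2.0, 2.0, 2.5, 1.0, 1.0, 1.0, 0.5]

lines = interaction_lines(A, B, y)
for level, (low, high, gradient) in lines.items():
    label = "LOW B" if level < 0 else "HIGH B"
    print(f"At {label}: avg at LOW A = {low}, avg at HIGH A = {high}, gradient = {gradient}")
```

If the two gradients are equal, the lines are parallel and there is no interaction; the more they differ, the stronger the interaction.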


Once again, when inputting values into the graph, check back with the data table to ensure that the results make sense.
(For example, the table cannot show a negative effect while the graph shows a positive effect on the output variable.)

This way, we can troubleshoot whether a mismatch came from our calculations or just from the data selected when building the chart.

A x B:
A: Diameter of Bowl
B: Microwaving Time


The gradients of the two lines are different: one is negative and one is positive.
The effect of A therefore reverses direction depending on the level of B.
Therefore, there is a significant interaction between A and B.

A x C:
A: Diameter of Bowl
C: Heating Power


The gradients of the two lines differ by a small margin.
Therefore, there IS an interaction between A and C, but the interaction is small.

Only when both lines are parallel (the gradients are the same) is there no interaction between the factors.

B x C:
B: Microwaving Time
C: Heating Power


The gradients of both lines are negative but of different values.
Hence, there is a significant interaction between B and C.

Fractional Factorial Method:

From the full data set, I selected runs 1, 3, 4 and 7 to conduct the fractional factorial method.
Each factor occurs the same number of times at each level in this selection (50% each), hence I considered this data set to be orthogonal with good statistical properties.

To select a good fraction from the original set, each factor should, as mentioned above, be at (+) HIGH in exactly 50% of the selected runs and at (-) LOW in the other 50% to be useful for the fractional factorial method.

I will explain more about my experience with fractionalising the data below.

Initially, I selected a poor data set despite all factors appearing an equal number of times. Then, I realised that I was being too lazy and decided to select my data set more meticulously.

I also tested out the data set before finalising it to ensure it was workable. 
(2 LOWS and 2 HIGHS for each factor is good.)

1. Select the data set used for fractional method.
I blacked out the data of the runs that were not selected so that I won't be confused.

This style is effective for analysing interaction effects as well!





2. Calculating Significance of Factors by solving for the Means


The total effect of a factor will be the difference between the HIGH (+) and LOW (-)
HIGH - LOW = Difference = Total Effect

3. Using a 2-D Line Chart to input our data


Explanation of graph:

Significance of Factors

Factor A:
  • Factor A has a positive effect on the popcorn bullets formed: the larger the diameter of the bowl, the greater the amount of popcorn bullets formed. It has a strong effect on the output.
  • When the diameter of the bowl increases from 10 cm to 15 cm, the average amount of "bullets" formed increases from 0.81 g to a great 2.88 g.
  • To minimise the amount of popcorn bullets formed, reduce the diameter of the bowl.
Factor B:
  • Factor B has a negative effect on the popcorn bullets formed: the longer the microwaving time, the smaller the amount of popcorn bullets formed. It has a medium effect on the output.
  • When the microwaving time increases from 4 min to 6 min, the average amount of "bullets" formed decreases from 2.31 g to 1.38 g.
  • To minimise the amount of popcorn bullets formed, increase the microwaving time.
Factor C:
  • Factor C has a negative effect on the popcorn bullets formed: the greater the heating power, the smaller the amount of popcorn bullets formed. It has a large effect on the output.
  • When the heating power increases from 75% to 100%, the average amount of "bullets" formed decreases from 2.88 g to a low 0.81 g.
  • To minimise the amount of popcorn bullets formed, increase the heating power.
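Ranking of Factors (from greatest effect to lowest effect)
C > A > B

However, based on their gradients, C and A have effects of identical size on the popcorn bullet output, just in opposite directions. This suggests that, in the four selected runs, A was HIGH whenever C was LOW, in which case the two effects cannot be told apart.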
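To see how the two methods compare, the effects can be placed side by side. This sketch uses only the LOW and HIGH averages quoted in this post for the full and fractional factorial methods; the variable and function names are my own.

```python
# (LOW average, HIGH average) in grams, as quoted in this post
full = {"A": (1.91, 1.76), "B": (2.17, 1.49), "C": (2.94, 0.72)}
fractional = {"A": (0.81, 2.88), "B": (2.31, 1.38), "C": (2.88, 0.81)}


def total_effects(averages):
    """Total effect of each factor = HIGH average - LOW average."""
    return {f: round(high - low, 2) for f, (low, high) in averages.items()}


full_fx = total_effects(full)
frac_fx = total_effects(fractional)

# A factor is flagged when its effect changes sign between the two methods
flipped = {f: (full_fx[f] > 0) != (frac_fx[f] > 0) for f in full_fx}

for f in full_fx:
    note = "  <-- direction flipped" if flipped[f] else ""
    print(f"{f}: full = {full_fx[f]:+.2f} g, fractional = {frac_fx[f]:+.2f} g{note}")
```

With these numbers, B and C keep their direction, while A changes from a small negative effect to a large positive one of the same size as C's.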

Learning Reflection

I will be starting an internship next semester (AY23/24, Y3S1), and the position entails carrying out DOE.
My to-be supervisor mentioned that mastering DOE will help my storytelling abilities, so that I can tell a story of how the process works. I still have little understanding of this. However, learning DOE in CPDD has shed some light on this tool for me, and I may even have gained some understanding of what the supervisor meant.

Having done both methods, I tend to trust the full factorial method more, as it uses all the runs and is therefore more reliable and accurate.
Furthermore, the fractional factorial method can give unusual results, and these depend largely on the data sets that you choose.

An example here:
In the case study displayed above, the full factorial method showed an inverse relationship between Factor A and the yield of "bullets". However, in the fractional factorial method, Factor A had a direct relationship with the yield of "bullets".

This is an example of sampling data, where you take a fraction of the population data and the results may not speak for the entire population, hence providing an inaccurate or insufficient statistic.

Such issues occur when the data studied contains a large number of variables as well.

However, in actual experiments where the number of runs is huge and there are more factors to study, I will probably have to use the fractional factorial method instead.

It is like trying something you love: once you have, you never want to go back.
One difficulty I faced while keying in and calculating the results was having to analyse so much data in a single table and keeping my eyes on the right data to key in.
However, using the Excel Sheet was an advantage as it simplified everything for me.
Having a good focus will make this calculation process efficient.
I put on some music to place my mind into the zone and smashed the keys on my keyboard.
My colourful RGB keyboard designed for gaming which my brother gave to me


DOE is a really great tool to analyse our experiment results, and most of the work is done during the experimental runs and collecting of data. The rest of the data analysis is actually rather mundane and done on a computer.

During my practical on DOE, my team and I got the chance to play with a catapult and shoot some targets for practice and our grades 🤣🤣🤣


The fun was amplified when we were given photos of the lecturers teaching the CPDD module as targets to vent our frustrations on.


In this practical, we got to see the result of our DOE study at full effect during the catapult shooting challenge where we competed with other teams to win 1st place to get the most marks.
We analysed our data and configured our catapult setup to what our analytical data provided us.

One challenge here was a 200+ cm target, for which we had no data, as our maximum recorded distance was shorter than that. Hence, we analysed the configurations of other teams and tried them out. Our attempts failed, as we either overshot or undershot the target. (That one was a hit-or-miss chance since we had insufficient data.)

My team tied for 2nd place, and here is a video clip of me landing the first shot and what sounded like my teammate Cheryl doubting our test results. 💸


From this experiment, I learned to have a little faith in my ability and to trust the results.
One thing I can improve on in DOE is looking into the interaction effects of factors and understanding which combination of parameters will be better for the output variable.
I can also improve on my Fractional Factorial method analysis.

Lastly, I was very proud that I completed the case study using a completely blank Excel sheet rather than copying the template that was provided as an example. I felt that this helped my learning of DOE greatly, so that I can apply it practically.

Ultimately, DOE is used to study the significance and interaction between independent variables (cause) to deduce their effect on the dependent variable (effect).

Hence, to study the effect of just 1 factor/cause on a process/effect, we look into the technique of Hypothesis Testing, which will be covered in the subsequent blog. Look out for that one!

From this short learning package, it is safe to say that I have mastered DOE and will be ready for my internship this coming March!

My wish will be to establish ideal production parameters for the process task that I am given and be proud of what I achieve and learn during the process.

I just have to keep my mind clear and free from harbouring any expectations of privileges.

