Root Cause Analysis is more than just stating a problem

by | Oct 1, 2019

Recently I had the opportunity to talk at the Test Automation and QA Summit in Toronto about Root Cause Analysis and what it should be used for.

One of the interesting things about this talk was that the summit was geared towards test automation and the use of technology to improve quality, while my talk was process-based. I did not bring up any significant use of technology (although I did mention Excel).

From the start, I went into examples with the audience on finding issues with processes and getting away from laying blame on individuals. Using the “5 Why” method, we walked through a scenario that led us to the process behind the issue. That process turned out to be outside Software Development altogether: it was actually an HR process. It was interesting to see the looks on everyone’s faces as we went through it.

One of the key issues that people run into when performing a Root Cause Analysis is the lack of data to back up what is truly going on. The normal course of action is to put a Band-Aid on the problem and hope it goes away. This will only cause more issues as the problem begins to take hold of other processes, causing a snowball effect that will just get worse as time goes on. What is needed is the right data to back up what is being talked about, finding the process that is causing the recurring issue, and working as a team to find a solution.

One of the other points I brought up was “recurring”. Yes, there will be events that seem like a one-off, where a quick solution can be used. In those situations, the organization should keep an eye out for similar issues so that it is aware of them. It is the recurring issues that will add up over time, though.

Here is an example:

Looking at two scenarios, I will show how a good Root Cause Analysis process can help.

Example 1:

A server room flooded. I have seen this happen, and it was caused by an issue with the AC in the building. In the end, the total cost was about $60,000 to repair and replace the structure and equipment. The price tag also included restoring the data from the backups.

Example 2:

Every release has about 150 overtime hours, adding approximately $9,000.00 to the budget. With 6 releases a year, that is $54,000.00 of additional spend.
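The arithmetic behind Example 2 can be sketched in a few lines of Python. This is only an illustration using the figures quoted above; the variable names are my own, and the hourly rate is simply what those figures imply, not a number from an actual budget.

```python
# Figures quoted in Example 2; variable names are illustrative only.
overtime_hours_per_release = 150
overtime_cost_per_release = 9_000.00
releases_per_year = 6

# Hourly overtime rate implied by the two per-release figures.
implied_hourly_rate = overtime_cost_per_release / overtime_hours_per_release

# Recurring totals over one year of releases.
annual_overtime_hours = overtime_hours_per_release * releases_per_year
annual_overtime_cost = overtime_cost_per_release * releases_per_year

print(f"Implied overtime rate: ${implied_hourly_rate:,.2f} per hour")
print(f"Overtime hours per year: {annual_overtime_hours}")
print(f"Overtime cost per year: ${annual_overtime_cost:,.2f}")
```

Laying the numbers out like this is the kind of data that makes a recurring cost visible instead of letting it blend into the background.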

Taking a look at the 2 examples, the first one is important and critical. It is also more expensive than the second, especially taking into account the unknown costs of client dissatisfaction. It does all add up.

For example 2, what will sometimes happen is that the overtime becomes common knowledge and the organization becomes complacent about what is going on. So it goes unnoticed.

Now let’s say we do a good Root Cause Analysis on example 2 and the team finds that there is a process in the SDLC that is creating a bottleneck, which then requires overtime to make up for the lost time. It is then discovered that the change to fix it will cost $25,000.00 to implement. On the initial review of the proposed change, the executives may pass on it due to cost (remember complacency). It will happen: sticker shock. What needs to be added is the ROI. Telling the executives that after the change is implemented they will see an ROI of over 50% in the first year will begin to pique interest. Here you are showing value, and that value is ongoing, since the savings keep adding up every year.
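To put numbers behind that ROI statement, here is a minimal sketch in Python. It uses the $25,000.00 implementation cost and the $54,000.00 of yearly overtime from example 2. The function names and the `savings_fraction` parameter are my own illustration, and the default assumes the change removes all of the overtime, which a real analysis would need to confirm with data.

```python
# Figures from example 2; everything else here is a hypothetical illustration.
ANNUAL_OVERTIME_COST = 54_000.00   # 6 releases x $9,000.00 of overtime
IMPLEMENTATION_COST = 25_000.00    # one-time cost of the process change


def cumulative_roi(years, savings_fraction=1.0):
    """ROI after `years`, as a fraction of the implementation cost.

    savings_fraction is the assumed share of overtime the change removes
    (1.0 means all of it; this is an assumption, not a measured value).
    """
    savings = ANNUAL_OVERTIME_COST * savings_fraction * years
    return (savings - IMPLEMENTATION_COST) / IMPLEMENTATION_COST


def payback_months(savings_fraction=1.0):
    """Months of avoided overtime needed to cover the implementation cost."""
    monthly_savings = ANNUAL_OVERTIME_COST * savings_fraction / 12
    return IMPLEMENTATION_COST / monthly_savings


for year in (1, 2, 3):
    print(f"Year {year}: cumulative ROI {cumulative_roi(year):.0%}")
print(f"Payback period: {payback_months():.1f} months")
```

Under the full-savings assumption, the first-year figure comes out well above the 50% mentioned above, and even a partial reduction in overtime can clear that bar. Because the cost is paid once and the savings recur, the cumulative ROI grows each year.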

This is one of the main premises of ISO 9001:2015: Continuous Improvement.

Again, example 1 does have an ROI. It is just not a recurring one, as it was based on an accident.

I ended my discussion at the summit with the point that those who work on a Root Cause Analysis, because it is a team effort, should be able to tell the right story so that those reviewing it understand what is being said. From there, the value is immense.