Quick Guide to Design Failure Mode & Effect Analysis - DFMEA

How does one go about conducting a DFMEA? Use this quick guide to understand the process, its aims, and how to use it as an effective risk-mitigation tool.

Summary

  • Risk Priority Numbers (RPNs) guide action - Multiply severity, frequency, and detection difficulty (each rated 1-10) to identify which design failures demand immediate attention versus later fixes
  • 80/20 rule applies to failure modes - Twenty percent of potential design problems typically cause eighty percent of actual issues, so drilling into high-RPN failures addresses most risks efficiently
  • Cross-functional teams catch more problems - Include stakeholders from design, manufacturing, suppliers, and customers in brainstorming sessions to identify failure modes one person would miss

Potential product defects can be quite harmful to any business. In some cases, a design error could lead to a large-scale product recall, costing you millions of dollars.

Even if a design flaw is minor, the damage can still be a real setback for the organization. Small flaws add up.

Design Failure Mode and Effect Analysis (DFMEA) can help avoid all that. DFMEA is a problem-solving methodology that makes it easier to detect potential issues and solve them before they have much of an impact.

How does DFMEA work?

You will certainly look at the figures if you are trying to determine the tolerances of a component, but DFMEA is really a qualitative tool. You are probably familiar with Murphy’s Law: “If something can go wrong, it will.”

DFMEA tries to get ahead of Murphy by looking at exactly what can go wrong with your product or process, why it might go wrong, how likely it is to happen, and what the consequences might be. In our conversations with quality directors at aerospace and manufacturing companies, we have observed that those who document their failure analysis process with structured workflows catch issues an average of 40% earlier in the production cycle than those relying on ad-hoc spreadsheets and emails.

Obviously, the next step is to determine what can be done to eliminate the possible failure or reduce its likelihood to the point where it is negligible.

Why stopping at abnormalities actually works

Here’s something that connects DFMEA to a broader manufacturing philosophy most people miss. Toyota built their production system around a concept called Jidoka - the idea that when something goes wrong, you stop everything. Not later. Now. The machine stops. The line stops. Everyone focuses on that one problem until it’s fixed.

DFMEA takes that same thinking and applies it before you even build anything. Instead of firefighting defects after they show up in the field (expensive, embarrassing, sometimes dangerous), you’re forcing your team to stop and consider: what could go wrong here? It’s preventive Jidoka, if you will. And there’s something psychologically effective about this forced pause. When you make people stop and document a potential failure mode, they can’t hand-wave it away. They have to think through the consequences, assign numbers, and own the problem. That immediate focus tends to surface root causes that slip through when you’re just racing to ship.

Although DFMEA is best known from auto manufacturing, the principle is flexible enough to be useful in just about any business, be it a manufacturing concern or a service provider.

Having a structured way to track and audit your DFMEA processes helps ensure nothing slips through the cracks.

The ultimate aim of DFMEA is company success and customer satisfaction, as well as minimizing any potential risks.

The first step is to assemble a team and spot the potential failure modes

Two heads are better than one - and a whole team of participants will come up with and consider more design failure mode possibilities than any single person ever will. At Tallyfy, we’ve seen that the most valuable insights often come from the person you least expect - the intern who spots an obvious flaw everyone else overlooked, or the operations manager who remembers a similar problem from years ago.

Your brainstorming team should consist of stakeholders across the spectrum. You might decide to include suppliers and customers as well as process and product designers and the managers who will be directly responsible for the product or process.

Now it’s time to tune into “negative” mode with a positive aim. Your team is going to look for problems that haven’t occurred yet, and they’re going to think of unusual circumstances that might cause an otherwise effective design to fail.

Since any product or service is likely to consist of several components that work together, your team will carefully consider each one and tell you what might fail, for what reason, and under what circumstances.

How to record your findings

Design Failure Mode and Effect Analysis is commonly used as a Six Sigma tool, and it is usually presented in the form of a spreadsheet. Your team will look at each component of the design, or each step in the designed process, in turn and answer the following questions:

  • What is the designed item or process step under analysis?
  • What is the failure type? In other words, describe what could go wrong
  • What is the impact of the failure and who is affected?
  • On a scale of one to ten, how severe is the potential impact?
  • What might cause the failure being considered?
  • How often would this type of failure occur?
  • How would the failure be detected?
  • How easy or difficult is it to detect an impending failure?
  • How urgently should the potential problem be addressed? Allocate a risk priority value - we discuss this in more detail below.
  • What actions should be taken to prevent this kind of failure or to make its consequences less severe?
  • Who will be responsible for what action? (Using a RACI Matrix can be helpful here)
  • When should the action be carried out?
  • Having taken this action, how would the severity of the consequences, the frequency of the failure, its ease of detection, and the priority of the risk be affected?

When determining the impact of a failure mode and when assessing actions to be taken, three items are given a numerical rating between one and ten. These are:

  • Severity
  • Possible frequency
  • Difficulty of detection prior to failure

Low numbers indicate a less severe, infrequent, or easily detectable issue. Higher numbers indicate severe, frequent, or difficult-to-detect failures.
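
If you would rather keep the worksheet in code than in a spreadsheet, one row could be modeled roughly like this. This is only a sketch: the class name `FailureMode`, its field names, and the sample row are all made up for illustration and are not part of any DFMEA standard. The only rule taken from the text is that each of the three ratings is a whole number from one to ten.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    """One row of a DFMEA worksheet (hypothetical field names)."""
    item: str        # designed item or process step under analysis
    failure: str     # what could go wrong
    effect: str      # impact of the failure and who is affected
    cause: str       # what might cause the failure
    severity: int    # 1 = minor, 10 = most severe
    frequency: int   # 1 = rare, 10 = very frequent
    detection: int   # 1 = easy to detect, 10 = very hard to detect

    def __post_init__(self):
        # Enforce the one-to-ten scale on all three ratings.
        for name in ("severity", "frequency", "detection"):
            value = getattr(self, name)
            if not isinstance(value, int) or not 1 <= value <= 10:
                raise ValueError(
                    f"{name} must be an integer from 1 to 10, got {value!r}"
                )


# A made-up example row.
row = FailureMode(
    item="Battery door latch",
    failure="Latch snaps after repeated use",
    effect="Batteries fall out; customer cannot use the product",
    cause="Plastic too thin at the hinge",
    severity=7,
    frequency=4,
    detection=6,
)
```

Rejecting out-of-range ratings at the moment a row is created keeps a typo such as 70 from distorting the priorities later on.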

Allocating and using Risk Priority Numbers (RPNs)

Now that you have severity, frequency, and failure detection figures, you can determine the RPN by simply multiplying the three figures together. The higher the numbers, the higher the total, and the higher the priority.
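
The multiplication can be sketched in a few lines of Python. The function name `rpn` and the ratings for the two failure modes are hypothetical, chosen only to show the arithmetic.

```python
def rpn(severity: int, frequency: int, detection: int) -> int:
    """Risk Priority Number: the product of the three 1-10 ratings."""
    return severity * frequency * detection


# Made-up ratings for two failure modes.
latch_rpn = rpn(severity=7, frequency=4, detection=6)  # 7 * 4 * 6 = 168
label_rpn = rpn(severity=2, frequency=5, detection=1)  # 2 * 5 * 1 = 10

print(latch_rpn, label_rpn)  # prints: 168 10
```

Because each rating runs from one to ten, an RPN always falls between 1 and 1,000.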

When it comes to addressing the risks implicit in a design failure, the ones with the highest RPNs will be tackled first.

Admittedly, these numbers come from qualitative data, but they do help in identifying the failures that would have the greatest impact.

Spreadsheets are handy in this context because you can sort your DFMEA table from highest to lowest RPN score to show what areas require the most urgent attention.
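
The same sort is easy to do outside a spreadsheet. Here is a small sketch in Python, where the three rows and their ratings are invented for the example.

```python
# Each tuple: (failure mode, severity, frequency, detection). Values are made up.
rows = [
    ("Label peels off", 2, 5, 1),
    ("Latch snaps after repeated use", 7, 4, 6),
    ("Seal leaks under pressure", 9, 2, 8),
]

# Sort by RPN (severity * frequency * detection), highest first.
ranked = sorted(rows, key=lambda r: r[1] * r[2] * r[3], reverse=True)

for name, s, f, d in ranked:
    print(f"{s * f * d:>4}  {name}")
```

With these numbers the latch (168) comes first, the seal (144) second, and the label (10) last.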

You might well find that the 80/20 principle applies.

That’s to say, eighty percent of issues are likely to be caused by twenty percent of the possible failure modes. Drilling down into that twenty percent should, therefore, address about eighty percent of the issues.
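
One rough way to look for that "vital few" is to take failure modes from the top of the ranking until they account for most of the total score. The sketch below assumes that a failure mode's share of the total RPN is a fair stand-in for its share of real-world issues, which is a simplification on our part. The function name `vital_few`, the 0.8 cut-off parameter, and the scores are all hypothetical.

```python
# Made-up RPN scores keyed by failure mode.
rpns = {
    "Seal leaks under pressure": 420,
    "Latch snaps after repeated use": 240,
    "Connector corrodes": 60,
    "Screw loosens": 40,
    "Label peels off": 30,
    "Paint scratches": 20,
    "Logo fades": 10,
}


def vital_few(scores: dict, share: float = 0.8) -> list:
    """Return the top-scoring modes that together reach `share` of the total."""
    total = sum(scores.values())
    selected, running = [], 0
    for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
        if running >= share * total:
            break  # the modes picked so far already cover the target share
        selected.append(name)
        running += score
    return selected


print(vital_few(rpns))
```

In this invented data set, two of the seven failure modes carry just over eighty percent of the total score, so those two are where the team would start.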

A final review of your DFMEA

The key to resolving or reducing the possible failures identified in the DFMEA lies in action. Your team will probably have agreed on design changes and actions to be taken during the initial analysis phase.

Once the responsible teams or individuals have followed the recommended course of action, it’s time to get your team together and reassess the risk priorities indicated in your DFMEA.

Reducing risk means adjusting the design so that the identified failure mode becomes less frequent, less severe, or easier to detect before failure occurs.

The team will assign new scores to each of these elements, and will then be able to see to what degree the possible impact of failures is lower.
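
To make the before-and-after comparison concrete, here is a short sketch in Python. The helper `rpn` and both sets of scores are made up. In this example the severity stays the same because the imagined action only makes the failure rarer and easier to catch.

```python
def rpn(severity: int, frequency: int, detection: int) -> int:
    """Risk Priority Number: the product of the three 1-10 ratings."""
    return severity * frequency * detection


# Hypothetical scores for one failure mode, before and after the agreed actions.
before = rpn(severity=7, frequency=4, detection=6)  # 7 * 4 * 6 = 168
after = rpn(severity=7, frequency=2, detection=3)   # 7 * 2 * 3 = 42

reduction = (before - after) / before
print(f"RPN {before} -> {after} ({reduction:.0%} lower)")
# prints: RPN 168 -> 42 (75% lower)
```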

That may not be the end of the line. You may decide on a new set of actions that will reduce the RPN score even more. Iteration matters. Keep repeating the process until you are satisfied with the resulting design. Manufacturing operations represent about 8% of our conversations at Tallyfy, and I have seen teams reduce their high-priority failure modes by 60-80% after just two or three iteration cycles - the key is actually following through on the actions rather than just documenting them.

About the Author

Amit is the CEO of Tallyfy. He is a workflow expert and specializes in process automation and the next generation of business process management in the post-flowchart age. He has decades of consulting experience in task and workflow automation, continuous improvement (all the flavors) and AI-driven workflows for small and large companies. Amit did a Computer Science degree at the University of Bath and moved from the UK to St. Louis, MO in 2014. He loves watching American robins and their nesting behaviors!
