Root Cause Failure Analysis. Trinath Sahoo


TITANIC’s superior performance. Ismay survived the disaster and testified at the inquiries that this speed increase was approved by Captain Smith and the helmsman was operating under his Captain’s direction.

      Latent Roots

      All physical failures are triggered by humans, but humans are negatively influenced by latent forces. These forces within the organization cause people to make serious mistakes, and the goal is to identify and remove them. Latent causes reveal themselves in layers. One after the other, the layers can be peeled back, like the layers of an onion, and it often seems as if there is no end.

      These latent forces are management system weaknesses, which include training, policies, procedures and specifications. People make decisions based on these, and if the system is flawed, the decision will be in error and will become the triggering mechanism that causes the mechanical failure. The most proactive of all industrial actions might be to identify and remove these latent traps. But all attempts to identify and remove latent causes of failure start at the human. Humans do things “inappropriately” for “latent” reasons. To understand these reasons, we must first understand what “errors” are being made. This puts people at risk, especially the “culprits”: once exposed, they are in danger of being inappropriately disciplined.

      To understand the different levels of root causes, let us take one industrial case.

      Consider this example: During the overhaul of a large reciprocating compressor, the maintenance supervisor discovers a damaged compressor rod that requires replacement. He decides to have a rod made in a local shop, which fabricates it with cut threads. The OEM’s design department, however, had recommended rolled threads for compressor rods of this frame size. As a result of the improper fabrication, the rod fails due to fatigue in the thread area and causes extensive secondary damage inside the compressor.

Schematic illustration of events leading to compressor failure.

      If you study this example, you can discern the following events leading to the costly failure:

       The warehouse did not stock spares for this rod because it was a new compressor installation.

       The maintenance supervisor decided to have a rod fabricated without drawings.

       Neither the user nor the local shop investigated the thread requirements.

       Because the compressor was not equipped with vibration shutdowns, it ran for a significant amount of time before it was shut down.

      There were several chances to break the chain of events leading to the catastrophic compressor failure. If the project engineer had ordered spare parts through the OEM, this failure probably would have been avoided. If either the maintenance supervisor or the local machine shop had talked to the OEM, or studied the failed rod, they would have been aware of the importance of rolled threads. Lastly, if a vibration shutdown had been in place, the compressor would have shut down after only minimal damage. In all, six major events led to the secondary compressor damage. These events were as follows:

       No procedure was in place to order spare parts for newly purchased equipment (latent root).

       The improper installation of the packing led to rod scoring.

       Because a spare rod was not available and plant management wanted the compressor back in operation as soon as possible, it was decided to have a replacement rod fabricated at a local machine shop.

       No one checked with the OEM about rod thread specifications (physical root).

       The rod failed after two days of operation.

       The broken rod caused extensive damage to the cylinder, packing box, distance piece, and cross‐head.

      After examining the remains of the failure, the rotating equipment (RE) engineer would discover a fatigue failure in the threaded portion of the rod. From this, he would conclude that an improper thread design led to a stress riser and a shortened fatigue life. After talking to the OEM, he would write a report recommending that all compressor rods in the plant have rolled threads.

      This recommendation will surely reduce rod failures, but the investigation did not uncover the latent root of the failure. The stress riser, due to the improper thread design, is called the “physical root,” because it initiated the physical events leading to the secondary damage. However, significant events preceded the physical root. Had the RE engineer had the time and resources, he would have discovered that the absence of a procedure requiring new equipment to be purchased with adequate spares directly initiated the sequence of events. This basic event is called the “latent root.”

      By requiring that spare parts be purchased from the OEM for all new equipment, the latent root is eliminated, not only for this scenario but, potentially, for many other similar events. This example demonstrates the importance of finding the “latent root” of rotating equipment failures. Stopping at the “physical root” deprives the organization of a valuable opportunity for improvement. An RCFA, then, is a detailed analysis of a complex, multi‐event failure, such as the example above, that aims to establish the sequence of events along with the initiating event. The initiating event is called the root cause, and factors that contributed to the severity of the failure or perpetuated the events leading to it are called contributing events.
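      As a rough illustration, the six events of the compressor example can be written down as an ordered chain and queried for what an investigation misses when it stops at the physical root. This is only a sketch: the `Event` class, the `CHAIN` list and both helper functions are hypothetical names introduced here, and the “latent” and “physical” labels simply follow the labels given in the event list of the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    description: str
    # "latent", "physical", or None; labels follow the example's event list
    root_type: Optional[str] = None


# The six events of the compressor example, in the order they are listed.
CHAIN: List[Event] = [
    Event("No procedure to order spare parts for new equipment", "latent"),
    Event("Improper packing installation scores the rod"),
    Event("Replacement rod fabricated at a local machine shop"),
    Event("No one checks rod thread specifications with the OEM", "physical"),
    Event("Rod fails after two days of operation"),
    Event("Broken rod damages cylinder, packing box, distance piece, cross-head"),
]


def first_of_type(chain: List[Event], root_type: str) -> Optional[Event]:
    """Return the earliest event carrying the given root label, or None."""
    for event in chain:
        if event.root_type == root_type:
            return event
    return None


def events_preceding(chain: List[Event], root_type: str) -> List[Event]:
    """Return the events that occur before the first event with the label."""
    for index, event in enumerate(chain):
        if event.root_type == root_type:
            return chain[:index]
    return []


# An investigation that stops at the physical root never looks at these.
missed = events_preceding(CHAIN, "physical")
print(len(missed))  # 3 events precede the physical root
for event in missed:
    print("-", event.description, f"({event.root_type or 'contributing'})")
```

      The events returned by `events_preceding` include the latent root itself, which is the point of the example: the report on rolled threads addresses only the fourth event in the chain.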

      Industry personnel generally divide failure analysis into three categories, in order of increasing complexity and depth of investigation.

      They are:

      1. Component failure analysis (CFA) looks at the specific physical cause of failure, such as fatigue, overload, or corrosion, of the machine element that failed, for example, a bearing or a gear. This type of analysis focuses mainly on finding the physical causes of the failure.

      2. Root cause investigation (RCI) is conducted in greater depth than the CFA and goes substantially beyond the physical root of a problem. It seeks the human errors involved but does not address management system deficiencies.

      3. Root cause analysis (RCA) includes everything the RCI covers, plus the management system problems that allow the human errors and other system weaknesses to exist.
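      One way to picture the three categories is as nested scopes, each covering the previous one plus one more kind of cause. The sketch below is illustrative only: the `Cause` enum, the `SCOPE` table and the `shallowest_analysis` function are hypothetical names, and equating management system problems with latent causes is an assumption drawn from the earlier discussion of latent roots.

```python
from enum import Enum
from typing import Iterable


class Cause(Enum):
    PHYSICAL = "physical"  # e.g. fatigue, overload, corrosion
    HUMAN = "human"        # human errors
    LATENT = "latent"      # management system weaknesses (assumed mapping)


# Each deeper analysis covers everything the shallower one does.
SCOPE = {
    "CFA": {Cause.PHYSICAL},
    "RCI": {Cause.PHYSICAL, Cause.HUMAN},
    "RCA": {Cause.PHYSICAL, Cause.HUMAN, Cause.LATENT},
}


def shallowest_analysis(required: Iterable[Cause]) -> str:
    """Return the least deep analysis whose scope covers all required causes."""
    needed = set(required)
    for name in ("CFA", "RCI", "RCA"):  # ordered from shallowest to deepest
        if needed <= SCOPE[name]:
            return name
    raise ValueError("no analysis category covers the requested causes")


print(shallowest_analysis({Cause.PHYSICAL}))              # CFA
print(shallowest_analysis({Cause.PHYSICAL, Cause.HUMAN}))  # RCI
print(shallowest_analysis({Cause.LATENT}))                # RCA
```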
