How can maintenance reliability be improved?

16 February 2023 | Patrice Duchesne

The notion of reliability refers to the capacity of a system or a piece of equipment to perform a required function under certain conditions for a certain length of time.

In that light, the process of increasing reliability may be regarded in various ways based on organizational objectives. Expectations also vary over time. In the 1970s, for example, since automobile problems were more frequent, users looked for a good warranty to avoid having to keep on shelling out money for poor quality. Nowadays, if you keep your vehicle regularly maintained, you can drive for 500,000 km or more without any major problems arising.

How do we achieve reliability?

Various studies have shown that reliability stems primarily from two factors: equipment strategies and reliability engineering.

Equipment strategies are all actions taken to upgrade an asset and maintain it in optimal condition. The longstanding methodology—still regarded as the standard—consists of determining the critical equipment/components and then establishing the best preventive measures.

Another course of action is to target the main failure modes; the standard industry term for this approach is reliability-centred maintenance (RCM). More advanced models quantify the failure curves, using the continuous probability distribution that Waloddi Weibull described in 1951.

Weibull’s work is still used to model the failure curve of each component. These curves feed a system model that uses the Monte Carlo method to extrapolate how the components interact over time and how reliable the system will be. The method simulates a large number of possible scenarios in order to estimate, with a certain degree of confidence, the final reliability as well as the potential losses (or gains) associated with various decisions.
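As a rough illustration of the idea, here is a minimal Python sketch that draws Weibull failure times for each component and uses Monte Carlo trials to estimate the reliability of a system over a mission time. It assumes a simple series system (the system fails as soon as one component fails), and the component parameters are hypothetical values chosen for the example, not data from real equipment.

```python
import math
import random


def weibull_reliability(t, scale, shape):
    """Probability that a component survives past time t: exp(-(t/scale)**shape)."""
    return math.exp(-((t / scale) ** shape))


def simulate_system_reliability(components, mission_time, trials=20_000, seed=0):
    """Monte Carlo estimate of the chance a series system survives mission_time.

    components: list of (scale, shape) Weibull parameters, one pair per component.
    Assumption: a series system, which fails when its first component fails.
    """
    rng = random.Random(seed)
    survived = 0
    for _ in range(trials):
        # Draw one failure time per component; the earliest one stops the system.
        first_failure = min(
            rng.weibullvariate(scale, shape) for scale, shape in components
        )
        if first_failure > mission_time:
            survived += 1
    return survived / trials


# Hypothetical parameters: (scale in hours, shape). Shape < 1 models early
# failures, shape > 1 models wear-out.
COMPONENTS = [(8000.0, 1.5), (12000.0, 2.0), (20000.0, 0.9)]
MISSION_TIME = 2000.0

simulated = simulate_system_reliability(COMPONENTS, MISSION_TIME)
# For a series system the exact answer is the product of component reliabilities,
# which gives a sanity check on the simulation.
analytic = math.prod(weibull_reliability(MISSION_TIME, s, k) for s, k in COMPONENTS)
print(f"simulated: {simulated:.3f}  analytic: {analytic:.3f}")
```

In this simple case the analytic product is available, so the simulation is not strictly needed. Monte Carlo becomes useful when the system includes redundancy, repairs or maintenance decisions for which no closed-form answer exists.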

Failure curve modelling can therefore be used to test strategies and model their consequences, which makes it a very powerful decision-making tool. However, such a high-powered methodology is generally justified only for equipment or processes that demand high performance, e.g. in the aerospace sector.

As for which strategy should be used, the answer is the one best adapted to the situation observed on a given site. Otherwise, the equipment strategy review may produce results that are unusable because they are overly complex and poorly adapted to users’ needs.

On the ground, the preferred method is clearly the one based on critical equipment. This method uses RCM concepts to determine which equipment strategies are the best suited or the most relevant, based on an understanding of failure modes.

A methodology based on the systematic review of failure modes is used to specify (step by step) the best strategy for increasing the reliability of a given piece of equipment:

  • Condition of physical assets: upgrades
  • Predictive
  • Preventive
  • Periodic
  • Do nothing
  • Reliability engineering (re-engineering)

Because this method applies RCM principles to the maximum amount of information already collected on site, that is, to known data, it is called “reverse RCM”.

What about reliability engineering?

Depending on a company’s objectives, analyzing the causes of failures is another way to improve reliability. The process begins by collecting as much current loss data as possible (production, safety/security, costs, etc.). The main causes of failures (the “bad actors”) are then identified by analyzing the data with a Pareto diagram, which makes it possible to isolate the roughly 20% of causes associated with 80% of impacts.

Vilfredo Pareto, an Italian economist, formulated various observations relating to population and wealth. He observed that approximately 80% of the land in Italy was owned by 20% of the population. By extension, a similar imbalance has been observed in many other phenomena, hence the use of Pareto diagrams to identify the main reliability issues.
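The Pareto analysis described above can be sketched in a few lines of Python. This is only an illustration: the function name, the 80% threshold parameter and the loss records are assumptions made for the example, and the "loss" could be downtime hours, cost or any other impact measure collected on site.

```python
from collections import Counter


def pareto_bad_actors(loss_records, threshold=0.8):
    """Return the causes that together account for `threshold` of total loss.

    loss_records: iterable of (cause, loss) pairs.
    Returns a list of (cause, loss, cumulative_share), largest loss first.
    """
    totals = Counter()
    for cause, loss in loss_records:
        totals[cause] += loss

    grand_total = sum(totals.values())
    if grand_total <= 0:
        return []

    ranked = []
    cumulative = 0.0
    # most_common() yields causes sorted by total loss, largest first.
    for cause, loss in totals.most_common():
        cumulative += loss
        ranked.append((cause, loss, cumulative / grand_total))
        if cumulative / grand_total >= threshold:
            break
    return ranked


# Hypothetical downtime records (cause, hours lost), for illustration only.
RECORDS = [
    ("pump seal leak", 120),
    ("conveyor misalignment", 45),
    ("pump seal leak", 80),
    ("sensor drift", 10),
    ("bearing overheating", 60),
    ("operator error", 15),
    ("conveyor misalignment", 30),
    ("lubrication", 5),
]

for cause, loss, share in pareto_bad_actors(RECORDS):
    print(f"{cause:<25} {loss:>5} h  cumulative {share:.0%}")
```

The causes returned by the function are the candidates for the problem-solving teams. The exact 80/20 split rarely holds in real data, so the threshold is best treated as a tunable cut-off rather than a rule.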

For each problem identified, a team is put together consisting of operators, maintainers, sometimes equipment manufacturers, etc., all of whom share their respective views on the most likely causes of the problem and then work together to solve it. A systematic problem-solving method is then used to identify the fundamental cause(s) of the problem with a view to addressing the situation.

Condition of physical assets

The equipment strategy review and the process of resolving recurring problems tie in with what is known as a “short loop” (less than five years).

When the conversation turns to the reliability of equipment or groups of equipment, the first thing to ascertain is the current physical condition. A quick upgrade may prove necessary to reap the benefits of the equipment strategies. In any event, expect a good deal of corrective work once the inspections begin.

On another note, the equipment life cycle (ELC) must also be considered. The ELC is a long-term consideration because the useful life of a piece of equipment is measured in years. Here are the questions to ask yourself: What is the general condition of the equipment? How many more years before it will have to be replaced? Would an upgrade be less expensive?
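One simple way to frame the upgrade-versus-replacement question is to compare the average cost per year of service for each option. The Python sketch below does this under strong simplifying assumptions: no discounting, constant annual maintenance cost, and no difference in production losses between the options. The function names and all figures are hypothetical.

```python
def annualized_cost(capital_cost, annual_maintenance, years_of_service):
    """Average cost per year of service (no discounting, a deliberate simplification)."""
    if years_of_service <= 0:
        raise ValueError("years_of_service must be positive")
    return capital_cost / years_of_service + annual_maintenance


def cheaper_option(options):
    """Return the name of the option with the lowest annualized cost.

    options: dict mapping name -> (capital_cost, annual_maintenance, years_of_service)
    """
    return min(options, key=lambda name: annualized_cost(*options[name]))


# Hypothetical figures for illustration only.
OPTIONS = {
    "upgrade": (150_000, 40_000, 8),   # cheaper now, shorter remaining life
    "replace": (600_000, 15_000, 25),  # costlier now, longer life, less upkeep
}

for name, params in OPTIONS.items():
    print(f"{name}: {annualized_cost(*params):,.0f} per year")
print("lower annualized cost:", cheaper_option(OPTIONS))
```

A real life-cycle analysis would also account for the time value of money, downtime during the work and the risk of failure before replacement. The sketch only shows why the lower upfront cost is not automatically the less expensive choice.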

All these issues must be examined to ensure the long-term viability of the equipment. In addition, equipment strategies must be reviewed and applied so that they sustain both the performance and the physical condition of the equipment over the long term.

More specifically, what are the benefits of the combined approaches?

  • Production gains thanks to fewer and shorter shutdowns
  • Improved health and safety
  • Optimized maintenance costs over the long term
  • Optimized inventories (raw materials and parts)
  • Improved equipment integrity
  • Improved stakeholder engagement
  • Improved morale thanks to greater employee engagement across the board