The Complete Guide to Maintenance & Reliability: PM / CM / CBM / RCM + KPIs + A Sample Annual Plan


What Exactly Is Maintenance? (Standard Definition)

Before we discuss PM/CM/CBM/RCM professionally, we need a standard definition of “maintenance.” In the IEC Electropedia / IEV (International Electrotechnical Vocabulary), maintenance is defined as a combination of technical and administrative actions carried out to retain or restore an item/asset to a state in which it can perform the required function.

This definition contains two key insights:

  • Maintenance is not only technical work; it also includes management activities (planning, resources, contractors, data, reporting).
  • The ultimate goal is not “doing tasks.” The goal is ensuring the asset can perform its required function.

What Are CM, PM, CBM, and RCM in Standard Terms?

1) CM — Corrective Maintenance

Corrective Maintenance is maintenance performed after a fault has been detected, in order to restore the equipment to an operational state.

  • Advantage: simple and appears low-cost in the short term
  • Risk: service interruption, emergency costs, cascading failures, safety degradation

2) PM — Preventive Maintenance

Preventive Maintenance is maintenance performed to reduce wear and lower the probability of failure.

PM is typically delivered in two common forms:

  • Time / calendar-based (e.g., monthly, quarterly)
  • Usage / operating-hours-based (e.g., every 2,000 hours)

“Planned / Scheduled maintenance” is also defined in the IEV as maintenance carried out according to a defined schedule.


3) CBM — Condition-Based Maintenance

In the IEV, Condition-Based Maintenance is defined as a type of preventive maintenance carried out based on the assessment of the physical condition of the equipment.

To execute CBM effectively, you typically need a condition monitoring program. The ISO standard ISO 17359 provides general guidance for establishing a machinery condition monitoring program.

  • Benefits: fewer unnecessary tasks, earlier detection, fewer unexpected shutdowns
  • Prerequisites: reliable data + sensors/inspections + analytics + a clear response process
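The "clear response process" prerequisite can be sketched in a few lines of Python: a condition reading is compared against an alert limit, and an exceedance produces a work-order request. This is a minimal sketch, not a reference implementation. The `Reading` class, the `ALERT_LIMITS` values, and the work-order fields are all hypothetical; real limits would come from OEM data, baselines, or the monitoring program itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reading:
    asset_id: str
    parameter: str  # e.g. "vibration_mm_s"
    value: float


# Hypothetical alert limits, for illustration only.
ALERT_LIMITS = {"vibration_mm_s": 8.0, "bearing_temp_c": 85.0}


def evaluate(reading: Reading) -> Optional[dict]:
    """Return a work-order request if the reading exceeds its limit, else None."""
    limit = ALERT_LIMITS.get(reading.parameter)
    if limit is None or reading.value <= limit:
        return None  # unknown parameter or healthy reading: no action
    return {
        "asset_id": reading.asset_id,
        "type": "CBM",
        "reason": f"{reading.parameter}={reading.value} exceeds limit {limit}",
    }


print(evaluate(Reading("PUMP-01", "vibration_mm_s", 9.3)))
print(evaluate(Reading("PUMP-01", "bearing_temp_c", 60.0)))  # None
```

The point of the sketch is the last step: every exceedance ends in a work-order request rather than in a dashboard that nobody acts on.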

4) RCM — Reliability-Centred Maintenance

In the IEV, RCM is a systematic approach to determining maintenance actions and intervals in order to achieve the required reliability.

The SAE International standard SAE JA1011 also specifies the criteria a process must meet to be called “RCM.”

From a practical, facility-oriented perspective, the WBDG (NIBS, USA) describes RCM as an optimal mix of reactive, time-based, condition-based, and proactive approaches—aimed at improving reliability and reducing life-cycle cost.

Correct interpretation:
RCM means that instead of assigning the same PM routine to everything, you decide—based on risk and failure consequences—where PM is necessary, where CBM makes sense, where redesign/change is required, and where CM is acceptable.


When Should You Use Each Strategy? (A Quick Decision Framework)

To choose a strategy based on logic rather than preference, ask five questions:

Question 1: How severe are the consequences of failure?

  • High consequence assets (safety/fire protection, hospitals, server rooms, critical production lines)
    → typically require CBM + focused PM + RCM
  • Low consequence, quickly replaceable assets
    → CM or simple PM may be sufficient

Question 2: Is the failure pattern predictable?

  • If wear/aging is observable → CBM is highly effective
  • If failures are random and sudden → heavy PM may only add cost; RCM helps select the right policy

Question 3: Is condition monitoring economically justified?

For expensive and critical assets (chillers, generators, main pumps, critical switchboards), condition monitoring often has a strong business case.


Question 4: Do you have redundancy?

With redundancy, you can manage part of the risk. Without it, you generally need a stricter maintenance program.


Question 5: Do you have reliable data?

Without data, CBM and RCM become slogans. Start by ensuring proper logging of events, failures, and repair durations.
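The five questions can be captured as a small screening function. The sketch below is a deliberate simplification under stated assumptions: the `Asset` fields, the yes/no answers, and the way the answers are combined are illustrative choices, not a rule taken from any standard.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    high_consequence: bool        # Question 1
    degradation_observable: bool  # Question 2
    monitoring_justified: bool    # Question 3
    has_redundancy: bool          # Question 4
    has_reliable_data: bool       # Question 5


def suggest_strategy(asset: Asset):
    """Return (strategies, notes) as a first-pass suggestion for discussion."""
    notes = []
    if asset.high_consequence:
        strategies = ["focused PM", "RCM review"]
        if asset.degradation_observable and asset.monitoring_justified:
            strategies.insert(0, "CBM")
        if not asset.has_redundancy:
            notes.append("no redundancy: apply a stricter program")
    else:
        strategies = ["CM or simple PM"]
    if not asset.has_reliable_data:
        notes.append("log events, failures, and repair durations first")
    return strategies, notes


chiller = Asset("Chiller 1", True, True, True, False, True)
print(suggest_strategy(chiller))
```

The output is a starting point for a review meeting, not a decision; borderline assets still need engineering judgment.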


Key Maintenance KPIs (With Standard Definitions)

If KPIs are not defined properly, you end up creating “performance theater” instead of improvement. These five metrics are the backbone of maintenance control:

1) MTTR — Mean Time to Restoration / Repair

The IEV defines MTTR as Mean Time to Restoration.

Use: the lower the MTTR, the faster assets return to service.

Recommended practical formula:

MTTR = Total repair time (from start of action to handover to operations) ÷ Number of failures
(Always define whether you include spare-part lead time and contractor waiting time.)
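As a minimal sketch of this formula, assuming each failure is logged as a pair of timestamps (start of action, handover to operations); the function name and the sample log are hypothetical:

```python
from datetime import datetime


def mttr_hours(repairs):
    """Mean time to restoration in hours.

    `repairs` is a list of (start, handover) datetime pairs, one per failure.
    """
    if not repairs:
        raise ValueError("at least one failure is required")
    total_seconds = sum((end - start).total_seconds() for start, end in repairs)
    return total_seconds / 3600 / len(repairs)


# Illustrative work-order log (made-up values).
repairs = [
    (datetime(2024, 1, 5, 8, 0), datetime(2024, 1, 5, 12, 0)),     # 4 h
    (datetime(2024, 2, 10, 14, 0), datetime(2024, 2, 10, 16, 0)),  # 2 h
    (datetime(2024, 3, 1, 9, 0), datetime(2024, 3, 1, 12, 0)),     # 3 h
]
print(mttr_hours(repairs))  # 3.0
```

If spare-part lead time or contractor waiting time is to be excluded, the timestamps fed into the function must already reflect that choice.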


2) MTBF — Mean Time Between Failures

Electropedia defines “mean operating time between failures” as the expected operating time between failures.

Use: a core reliability metric for repairable assets.

Recommended practical formula:

MTBF = Total operating time (uptime) ÷ Number of failures
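A short sketch of the same calculation, assuming operating hours and failure counts come from the work-order history; the function name and the numbers are illustrative only:

```python
def mtbf_hours(total_operating_hours, failure_count):
    """Mean operating time between failures, in hours."""
    if failure_count <= 0:
        raise ValueError("MTBF is undefined with zero failures in the period")
    return total_operating_hours / failure_count


# Example: 4,000 operating hours and 5 failures in the period.
print(mtbf_hours(4000, 5))  # 800.0
```

A period with zero failures raises an error here on purpose: reporting an "infinite" MTBF usually hides a period that is too short to measure.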


3) Availability — Readiness / Service Availability

For many contexts (hospitals, data centers, hotels), Availability matters more than “raw MTBF,” because it reflects whether the service is actually deliverable.

Common management approximation:

Availability ≈ MTBF ÷ (MTBF + MTTR)
(Valid only when MTBF and MTTR use consistent time definitions and cover the same reporting period.)
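The approximation is easy to check numerically. The sketch below assumes MTBF and MTTR are both expressed in hours; the function name and the sample values are hypothetical:

```python
def availability(mtbf_hours, mttr_hours):
    """Approximate availability as MTBF / (MTBF + MTTR)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


a = availability(990, 10)
print(a)                         # 0.99
print(round((1 - a) * 8760, 1))  # 87.6 expected downtime hours per year
```

Converting the ratio into downtime hours per year often makes the figure easier to discuss with operations than a percentage.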


4) PM Compliance — On-Time PM Execution Rate

PM Compliance = PM tasks completed on time ÷ PM tasks scheduled
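A minimal sketch of this ratio, assuming "on time" means completed on or before the due date. Many sites allow a grace window instead, so that rule is an assumption, as are the task records shown:

```python
from datetime import date


def pm_compliance(tasks):
    """Share of scheduled PM tasks completed on or before their due date."""
    if not tasks:
        raise ValueError("no PM tasks were scheduled in the period")
    on_time = sum(
        1 for t in tasks
        if t["completed"] is not None and t["completed"] <= t["due"]
    )
    return on_time / len(tasks)


# Illustrative schedule (made-up records).
tasks = [
    {"id": "PM-001", "due": date(2024, 3, 31), "completed": date(2024, 3, 28)},
    {"id": "PM-002", "due": date(2024, 3, 31), "completed": date(2024, 4, 4)},  # late
    {"id": "PM-003", "due": date(2024, 3, 31), "completed": None},              # not done
    {"id": "PM-004", "due": date(2024, 3, 31), "completed": date(2024, 3, 31)},
]
print(f"{pm_compliance(tasks):.0%}")  # 50%
```

Late and unfinished tasks both count against compliance here, which keeps the KPI from being inflated by closing overdue work after the fact.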

Warning: High PM volume is not a success metric. PM must reduce failures and cost—otherwise, the program needs redesign.


5) Planned vs. Unplanned Work Ratio

This KPI is the share of planned work relative to emergency/unplanned work.

Maturity goal: increase planned work—without inflating “useless PM.”
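One way to compute this share is by labor hours, as sketched below. Measuring by hours rather than by work-order count is an assumption, and the record fields are hypothetical:

```python
def planned_share(work_orders):
    """Planned labor hours as a fraction of all recorded labor hours."""
    total = sum(wo["hours"] for wo in work_orders)
    if total == 0:
        raise ValueError("no labor hours recorded")
    planned = sum(wo["hours"] for wo in work_orders if wo["planned"])
    return planned / total


# Illustrative work orders (made-up values).
work_orders = [
    {"id": "WO-1", "planned": True, "hours": 6.0},
    {"id": "WO-2", "planned": True, "hours": 2.0},
    {"id": "WO-3", "planned": False, "hours": 2.0},
]
print(planned_share(work_orders))  # 0.8
```

Whichever basis is chosen, it should stay the same from month to month so the trend remains comparable.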


What Does a Professional Maintenance Program Include?

A strong maintenance program is not just a checklist—it is a system:

  • Asset classification and criticality: if this asset fails, what happens?
  • Asset register and coding: without an ID and history, the program collapses
  • Defined PM/CBM tasks: task, interval, standard time, required skill, required parts
  • Work Order process: request, prioritization, assignment, execution, close-out
  • Spare parts management: critical stock levels, lead times, contracts
  • Reporting and analysis: repeat failures, root causes, MTBF/MTTR trends
  • Periodic review: monthly/quarterly improvement based on real operating data

Sample Annual Maintenance Plan (Reusable Template)

You can localize this template for an office/commercial/healthcare facility or an industrial site. (Frequencies are examples—final intervals must align with OEM guidance, operating conditions, and risk.)

A) Program Levels

  • Daily / per shift: operator inspections (simple checklist + log entries)
  • Weekly: routine inspections + minor defect correction
  • Monthly: light PM + safety checks
  • Quarterly (every 3 months): major PM + key calibrations
  • Semi-annual: heavier servicing + performance testing
  • Annual: selective overhauls + risk/budget/renewal review

B) Sample Annual Calendar (Management Summary)

1) HVAC (Chillers / Boilers / AHUs / FCUs)

  • Monthly: leakage/vibration/noise checks, filter condition, alarms, log key temperatures/pressures
  • Quarterly: coil cleaning, belt/bearing inspection, control valve performance testing
  • Semi-annual: energy/flow/ΔT trend analysis, control tuning, functional performance tests
  • Annual: selective service/overhaul based on condition (PM + CBM mix)

2) Emergency Power (Generator / UPS)

  • Weekly/Monthly: no-load / load testing per instruction, log voltage/frequency/battery health
  • Quarterly: load transfer test, fuel/cooling checks, safety inspections
  • Annual: major service + failure and response-time trend analysis

3) Fire Protection & Safety

  • Monthly: visual inspections, pressure checks, routine tests
  • Quarterly/Semi-annual: functional tests per operational requirements
  • Annual: full drill/system test + nonconformity correction

4) Pumps & Water/Wastewater Systems

  • Monthly: leakage/seals, noise/vibration, log current draw
  • Recommended CBM for critical pumps: vibration/temperature/power consumption trending
  • Annual: strategy review (if repeat failures exist → run a lightweight RCM study)

C) Management Programs That Are Often Forgotten

  • Monthly: KPI review meeting (MTBF/MTTR/PM Compliance) + top 5 repeat failures
  • Quarterly: review critical spares inventory levels + contractor performance
  • Annual: strategy review (PM/CBM/RCM) + renewal plan for high-risk assets

How to Implement RCM Practically and Economically (Lightweight RCM to Start)

You don’t need to launch a “heavy” RCM program from day one. A cost-effective starting approach:

  1. Select 10 critical assets
  2. For each asset define:
    • required function
    • key failure modes
    • failure consequences (safety/service/cost)
  3. Decide the best policy:
    • time-based PM?
    • CBM?
    • redesign/redundancy?
    • or accept CM?
  4. After 3 months, refine the plan using MTBF/MTTR data

This aligns with the intent of SAE JA1011, which treats RCM as an evaluation process for selecting failure management policies.
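The four steps can be held in a very small worksheet structure. The sketch below is one hypothetical layout: the `RcmEntry` fields mirror the list above, and the refinement rule (flag an asset when MTBF did not rise and MTTR did not fall over the trial period) is an illustrative assumption, not a requirement of SAE JA1011.

```python
from dataclasses import dataclass

POLICIES = {"time-based PM", "CBM", "redesign/redundancy", "accept CM"}


@dataclass
class RcmEntry:
    asset: str
    required_function: str
    failure_modes: list
    consequence: str  # "safety", "service", or "cost"
    policy: str

    def __post_init__(self):
        if self.policy not in POLICIES:
            raise ValueError(f"unknown policy: {self.policy}")


def needs_refinement(mtbf_before, mtbf_after, mttr_before, mttr_after):
    """True when neither MTBF improved nor MTTR improved over the trial period."""
    return mtbf_after <= mtbf_before and mttr_after >= mttr_before


entry = RcmEntry(
    asset="Main chilled-water pump",
    required_function="Deliver design flow to the AHUs",
    failure_modes=["seal leak", "bearing wear"],
    consequence="service",
    policy="CBM",
)
print(entry.policy, needs_refinement(800, 780, 3.0, 3.5))  # CBM True
```

Ten such entries fit in a spreadsheet just as well; the structure matters more than the tool.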


Common Maintenance Mistakes (And Quick Fixes)

  • “More PM = more professionalism” → No. PM must reduce failures and cost; if not, redesign is needed.
  • CBM without a response workflow → If alarms don’t generate work orders, sensor investment is wasted.
  • Measuring without definitions → Define exactly what time is included/excluded in MTTR/MTBF.
  • Relying on contractors without KPI/SLA → Contractor management must be data-driven.
  • “RCM is only for industry” → RCM is highly relevant to hospitals/data centers/hotels because service continuity is critical.

 

 


If You Want To…

  • reduce emergency work,
  • lower MTTR and increase Availability,
  • build a real, auditable PM program,
  • and implement cost-effective CBM and RCM for critical assets,

we can support you end-to-end—from maintenance strategy design to CMMS/CAFM implementation, team training, KPI definition, and performance audits.


FAQ (Frequently Asked Questions)

1) What are the standard definitions of PM and CM?

  • PM: maintenance performed to reduce wear and the probability of failure.
  • CM: maintenance performed after fault detection to restore function.

2) What exactly is CBM?

A type of preventive maintenance performed based on an assessment of the physical condition.

3) What exactly is RCM?

A systematic method for determining maintenance actions and intervals in order to achieve the required reliability. Per the WBDG, it is an optimal mix of reactive, time-based, condition-based, and proactive approaches.

4) Where can we find valid definitions for MTTR and MTBF?

The IEC Electropedia / IEV defines MTTR as mean time to restoration and defines MTBF as the expected operating time between failures.

5) Which standard is recommended for CBM?

ISO 17359 provides general guidance for establishing a condition monitoring program for machines.

 
