
Downloaded from Reliabilityweb.com on the web at http://www.reliabilityweb.com

Effective Maintenance Program Development/Optimization

Sammy Seifeddine
HSB Reliability Technologies
Senior Project Manager
800 Rockmead Drive
Three Kingwood Place, Suite 180
Kingwood, TX 77339
(281) 358-1477 ext. 276
(281) 358-1871 fax
sseifeddine@hsbrt.com

12th International
Process Plant Reliability Conference

October 22-23, 2003

Houston, Texas


Abstract

This paper describes a proven process for developing, optimizing, and managing
effective maintenance programs for new and in-service assets based on risk and cost-
benefit principles. The process calls for utilizing operational and maintenance experience
as long as the experience is documented for the proper class of assets in the form of
standard tasks. In the absence of standard tasks, a more comprehensive analysis is
performed using Reliability-Centered Maintenance (RCM2) or Failure Modes Effects
Analysis (FMEA) to develop an optimum program. Asset performance data is used to
continually adjust the maintenance program to meet user objectives.

1.0 Introduction

A maintenance program is effective when it targets critical production equipment and
puts emphasis on minimizing risk, which will lead to improved reliability, availability,
and resource utilization.

This paper focuses on a process for developing effective maintenance programs for new
assets or optimizing existing ones. The process is a component of the asset's overall
Life Cycle Management (LCM).

2.0 Maintenance Program Development/Optimization

This process consists of the following steps (refer to Figure 1):

1. Identification of business objectives.
2. Development of the plant/asset technical model.
3. Condition assessment of installed assets.
4. Criticality and risk assessment.
5. Maintenance program development/review.
6. Loading of maintenance tasks into the Computerized Maintenance Management System
(CMMS).
7. Maintenance spares strategy (not covered in this document).

These steps are considered in more detail in the following sections.

3.0 Business Objective

Business objectives are set at the corporate and plant levels. They reflect market
conditions, shareholder expectations, and regulatory compliance. Objectives at this level
include production levels, product quality, safe operation policies and requirements,
environmental integrity requirements, and operating cost targets.

Objectives are then translated into specific performance expectations for major assets.
Measures at this level might include availability, asset utilization, efficiency, specific
product qualities, Overall Equipment Effectiveness (OEE), cost per unit produced, etc.
Target values are set by plant operating departments and approved by plant and corporate
management.

Performance expectations for major assets or systems are further refined to the individual
equipment level. Here, target values for measures such as Mean Time Between Failure
(MTBF), Mean Time To Repair (MTTR), availability, etc., are set and approved.
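
As a rough illustration of how equipment-level targets relate to one another, the sketch below uses one common textbook relation, inherent availability = MTBF / (MTBF + MTTR). This relation is not stated in the paper and is only an assumption here; the function name and the sample figures are hypothetical.

```python
def inherent_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability from MTBF and MTTR (assumed textbook relation)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical equipment targets: 950 h between failures, 50 h to repair.
target = inherent_availability(950.0, 50.0)
print(f"Implied availability target: {target:.1%}")
```

A check of this kind can help confirm that the MTBF, MTTR, and availability targets approved for an equipment item do not contradict each other.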

This process is repeated periodically, and the objectives are changed to reflect the
company’s position regarding the main business drivers. Figure 2 identifies the steps
involved in developing asset performance expectations.

Business objectives and performance expectations set the stage for defining equipment
performance standards for high-risk equipment, for which RCM2 is the method used to
develop/optimize the maintenance programs.

4.0 Plant Technical Model

The plant technical model (also known as asset hierarchy) is composed of a hierarchy of
systems and sub-systems that gradually represent increased levels of detail in describing
the asset. The model reflects how systems and sub-systems fit together, interrelate and
operate to provide the intended business function. As such, the hierarchy reflects both the
structural and process flow characteristics of the plant/asset.

The model starts with the process flow diagram representing the overall operation of a
plant. This level consists of the major plant production units, utility systems (such as
electricity, water, steam, air, fuel, etc.), feed and raw material preparation facilities, final
product storage, plant control systems and local area network(s), infrastructures, etc.

The next level breaks down each unit into systems and sub-systems as depicted on the unit
process flow diagram and P&IDs. Examples at this level include systems such as feed
filtration, feed pressurization, feed heating, atmospheric fractionation, etc. At
progressively lower levels of the model, the breakdown of the plant becomes more
detailed. At the end, the plant is reduced to a set of systems and sub-systems and the
equipment items that support each of them.

Control and protective systems are incorporated in the hierarchy at the appropriate levels.
Where a control or protective system is dedicated to one system or sub-system, it should
be set up as a sub-element of that system. Where a control/protective system controls or
protects multiple systems, it should be set up as an element at the same level in the
hierarchy as those systems.

Every hierarchy element, whether it is a system, sub-system, or an equipment item, has a
clearly defined boundary. Boundary definitions are standardized for classes of
system/equipment items.

The steps involved in developing a plant technical model are as follows (see Figure 3):

1. Collect technical information and drawings (PFDs, P&IDs, line diagrams,
datasheets, O&M manuals, etc.).
2. Establish a standard for defining system boundaries. See references 4 and 6 for
details.
3. Develop plant technical hierarchy.
4. Define system functions (optional).
5. Load hierarchy into the plant maintenance information system (CMMS).
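
The hierarchy described in this section can be pictured as a simple tree. The following is a minimal sketch, not the data model of any particular CMMS; the class name `Element`, the tag names, and the `kind` labels are all hypothetical. It also illustrates the placement rule for control and protective systems: a dedicated one sits under the system it serves, a shared one sits alongside the systems it protects.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Element:
    """One node of the plant technical model (hypothetical structure)."""
    tag: str
    kind: str  # e.g. "plant", "unit", "system", "equipment"
    children: List["Element"] = field(default_factory=list)

    def add(self, child: "Element") -> "Element":
        self.children.append(child)
        return child

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "Element"]]:
        """Yield (depth, element) pairs, parents before children."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


plant = Element("PLANT", "plant")
unit = plant.add(Element("CRUDE-UNIT", "unit"))
heating = unit.add(Element("FEED-HEATING", "system"))
heating.add(Element("HEATER-01", "equipment"))
# Protective system dedicated to one system: sub-element of that system.
heating.add(Element("HEATER-01-TRIP", "protective system"))
# Protective system shared by several systems: same level as those systems.
unit.add(Element("UNIT-ESD", "protective system"))

for depth, element in plant.walk():
    print("  " * depth + f"{element.tag} ({element.kind})")
```

Selecting any element for review then implicitly selects everything returned by its `walk()`, which matches the statement later in the paper that a selected system includes all lower level elements.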

5.0 Criticality and Risk Assessment

Criticality and risk assessment is a qualitative analysis of asset failure events and the
ranking of those events according to their impact on the business goals of the company.
The process consists of the following main activities (see Figure 4):

1. Establish criticality assessment criteria.
2. Define, for each assessment criterion, the failure consequences and their scores.
3. Collect equipment condition assessment records or generic failure frequencies.
4. Determine failure frequencies and their ratings.
5. Define criticality ranking scores.
6. Define criticality ranking rules.
7. Select systems and/or equipment for assessment.
8. Perform the analysis.
9. Rank systems/equipment by criticality.
10. Rank systems/equipment by risk.

These steps are considered in more detail in the following sections.

5.1 Assessment Criteria

The first step in the analysis is to use the organizational business objectives to define the
criticality assessment criteria. The following are some suggested criticality assessment
criteria.

• Health and Safety.
• Environmental Integrity.
• Throughput.
• Customer Service.
• Operating Cost.

Each criterion is given a maximum score to reflect its consequences and relative
importance. In Table 1, the safety criterion is given a maximum score of twenty (20),
while the operating cost criterion is given a maximum score of ten (10).

5.2 Failure Consequences

Failure consequences within each criterion are defined and given an evaluation score.
Table 2 provides examples for the safety, throughput/downtime, product quality, and
maintenance and operating cost criteria, with their associated consequences of failure
and scores.

5.3 Failure Frequencies

Failure frequencies are defined based on systems and equipment performance. When
defining failure frequencies, consideration is given to aspects such as:

• Operational failure history (where available).
• Generic reliability data.
• Equipment redundancy.
• Mode of equipment operation.
• Equipment stress variations, etc.

The frequency of failure score is used in the calculation of relative risk to reflect how
likely it is that a failure of the assessed system or equipment item will impact the
organization's business. Table 3 shows a sample of frequency scores.
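
A lookup of this kind is easy to automate once failure history is available. The sketch below maps a mean interval between failures (in days) to the sample scores of Table 3. The numeric band edges are an assumption made for illustration, since Table 3 gives only verbal descriptions; the names `FREQUENCY_BANDS` and `frequency_score` are hypothetical.

```python
# (upper bound of mean days between failures, score), scores taken from Table 3.
# The exact band edges are assumed; Table 3 only describes them in words.
FREQUENCY_BANDS = [
    (1.0, 10),          # failures occur daily
    (7.0, 9),           # weekly
    (30.0, 8),          # monthly
    (364.0, 7),         # between one month and one year
    (365.0, 6),         # yearly
    (5 * 365.0, 5),     # between 1 and 5 years
    (10 * 365.0, 4),    # between 5 and 10 years
]
RARE_SCORE = 1          # less frequently than once in 10 years


def frequency_score(mean_days_between_failures: float) -> int:
    """Return the Table 3 score for a mean interval between failures."""
    if mean_days_between_failures <= 0:
        raise ValueError("interval must be positive")
    for upper_bound, score in FREQUENCY_BANDS:
        if mean_days_between_failures <= upper_bound:
            return score
    return RARE_SCORE


print(frequency_score(200))   # a failure roughly every 200 days
print(frequency_score(5000))  # a failure roughly every 14 years
```

Where no operating history exists, the interval would have to come from generic reliability data, adjusted for redundancy and operating mode as listed above.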

5.4 Criticality Ranks and Rules

The criticality rank number of a system or equipment is a function of the system's or
equipment's impact on the business when the system or equipment fails, regardless of
how often the failure occurs. For example, a set of criticality ranking numbers might
range from 1 to 10. Criticality rank number 10 represents the highest rank while number
1 represents the lowest.

Criticality ranking rules are defined to assist in assigning criticality ranks to systems or
equipment during the analysis. The rules are established by considering the combined
consequence scores for all assessment criteria. For example, a rule can be defined as
“Assign criticality of 10 to a system/equipment, if any of safety or environmental
consequence scores are greater than 18, or any of throughput, product quality or
maintenance and operating cost consequence scores are equal to 10”, and so forth.


The equipment criticality rank numbers, number range, and the rules for assigning the
numbers to systems or equipment under assessment are defined before conducting the
analysis.

Criticality rank numbers are assigned to systems and/or equipment based on the rules
developed. This is accomplished by comparing the equipment’s criteria consequence
scores to the criticality rank number’s rules. If the equipment matches the rules, the
equipment is assigned that criticality rank number. The equipment is always assigned the
highest criticality rank number it matches.
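
The rule-matching step can be sketched as an ordered set of predicates over the consequence scores. In the sketch below only the rank-10 rule comes from the example quoted in this section; the rank-5 and rank-1 rules, the dictionary keys, and the equipment scores are hypothetical placeholders.

```python
from typing import Callable, Dict, List, Tuple

Scores = Dict[str, int]


def rank_10(s: Scores) -> bool:
    """Rule quoted in the text: safety/environmental > 18, or any other score == 10."""
    return (
        s["safety"] > 18
        or s["environmental"] > 18
        or s["throughput"] == 10
        or s["quality"] == 10
        or s["cost"] == 10
    )


def rank_5(s: Scores) -> bool:
    """Hypothetical mid-level rule: any single consequence score of 6 or more."""
    return max(s.values()) >= 6


# Rules are defined before the analysis; rank 1 is a hypothetical catch-all.
RULES: List[Tuple[int, Callable[[Scores], bool]]] = [
    (10, rank_10),
    (5, rank_5),
    (1, lambda s: True),
]


def criticality_rank(scores: Scores) -> int:
    """Assign the highest criticality rank whose rule the scores match."""
    return max(rank for rank, rule in RULES if rule(scores))


pump = {"safety": 20, "environmental": 0, "throughput": 4, "quality": 0, "cost": 2}
fan = {"safety": 6, "environmental": 0, "throughput": 2, "quality": 0, "cost": 1}
print(criticality_rank(pump), criticality_rank(fan))
```

Note that failure frequency does not appear anywhere in these rules, consistent with the statement that criticality is independent of how often the failure occurs.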

5.5 Criticality and Risk Assessment

The assessment starts by analyzing the selected system and/or equipment failure
consequences. The most serious failure consequence in each defined consequence
criterion is identified and its score recorded.

System and equipment failure consequences are analyzed in terms of the resultant effects
on the asset as a whole, considering the impact of the failure on the safety of personnel
and on the asset's commercial performance. The latter requires consideration of both
direct and indirect failure costs.

The analysis is conducted by answering a series of questions about each system or
equipment item. These questions assess both the consequence of system or equipment
failure and the frequency/probability of failure with respect to the assessment criteria.
The criticality number and relative risk are calculated during the assessment from
responses to the questions.

Questions are formulated in the following form:

“If the system/equipment fails, could it result in a safety consequence? If yes, how
serious should the potential consequence be rated?”

5.6 Results of Criticality and Risk Assessment

5.6.1 Outcome of the Assessment

Criticality and risk assessment produces the following results:

1. Systems/equipment criticality ranks.
2. Relative risk.
3. Total consequence scores.
4. Individual system/equipment scores.


5.6.2 Relative Risk

The criticality and risk assessment (CARA) uses the concept of relative risk (RR) to
identify the systems/equipment that have the greatest potential impact on the business
goals of the company. The probability of failure is used in combination with the total
failure consequence of a system/equipment to determine its RR value.

The RR of a system or equipment is the product of its Total Consequence Score (TC) and
the Frequency/Probability (F/P) Number. It is called “relative risk” because it only has
meaning relative to the other equipment evaluated by the same method.

The Total Consequence (TC) is the sum of all the scores assigned to each of the criteria
including: Safety (S), Environmental (E), Quality (Q), Throughput (T), Customer Service
(CS) and Operating Cost (OC).

TC = S + E + Q + T + CS + OC

RR = TC * F/P
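
The two formulas above can be worked through directly. The sketch below is an illustration only: the equipment tags are hypothetical, and the scores are arbitrary picks from the sample values in Tables 2 and 3, not results from the paper.

```python
from typing import Dict


def total_consequence(scores: Dict[str, int]) -> int:
    """TC = S + E + Q + T + CS + OC."""
    return sum(scores[key] for key in ("S", "E", "Q", "T", "CS", "OC"))


def relative_risk(scores: Dict[str, int], frequency_score: int) -> int:
    """RR = TC * F/P, where F/P is the frequency/probability number."""
    return total_consequence(scores) * frequency_score


# Hypothetical equipment: (consequence scores, frequency/probability number).
items = {
    "P-101": ({"S": 14, "E": 0, "Q": 5, "T": 8, "CS": 0, "OC": 4}, 7),
    "K-201": ({"S": 18, "E": 0, "Q": 0, "T": 10, "CS": 0, "OC": 8}, 4),
    "E-301": ({"S": 0, "E": 0, "Q": 0, "T": 2, "CS": 0, "OC": 2}, 5),
}

ranked = sorted(items, key=lambda tag: relative_risk(*items[tag]), reverse=True)
for tag in ranked:
    scores, fp = items[tag]
    print(tag, "TC =", total_consequence(scores), "RR =", relative_risk(scores, fp))
```

In this made-up data set K-201 has the highest total consequence (36) but P-101 has the highest relative risk (217 against 144) because it fails more often, which is why the process ranks by both criticality and risk.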

6.0 Maintenance Tasks Development/Optimization (MTD/O)

The MTD/O process described in this paper establishes a structured framework for
developing or assessing maintenance programs for in-service or newly commissioned
assets. The process emphasizes the use of operation and maintenance experience
documented in a form of standard maintenance tasks (SMT).

6.1 Maintenance Tasks Development/Optimization (MTD/O) Overview

The flowchart in Figure 5 describes the steps involved in carrying out the MTD/O
process.

The steps involved in the development/optimization of maintenance tasks are as follows:

1. A system is identified for review by selecting an element from the plant technical
hierarchy. As described earlier, the selected system boundary should be clearly
defined. The selected system includes all lower level elements.

2. A risk analysis is performed per section 5 of this paper. If an analysis was conducted
in the past, the failure frequencies are reviewed in light of the current condition of the
system/equipment items, and the frequency scores are changed as necessary. The
system/equipment items selected are then ranked by their risk ranking.

3. If the system under review belongs to an equipment class group that has a documented
Standard Maintenance Task (SMT), then for low risk systems/equipment it is only
necessary to verify that any specific company, standards, and regulatory requirements
are applicable and that simple service activities are adequate and cost efficient. For
high and medium risk systems/equipment, verification of all SMT elements is required.

4. When an applicable SMT is not available, a more detailed analysis is required for
high and medium risk systems/equipment. For high risk items, a complete RCM2
analysis is recommended, while for medium risk items, RCM2 (FMEA) is sufficient
to develop/optimize the maintenance program. The outcome of RCM2 or RCM2
FMEA is a set of proposed tasks, their frequencies, and the crafts and skill levels of
individuals performing the work, or recommended actions in case suitable routine
tasks cannot be found.

5. For low risk items not governed by any company, standard, or governmental
requirements, a run-to-failure strategy is adopted. When requirements exist, routine
tasks are developed and incorporated into work packages.

6. From the output of RCM2 or RCM2 (FMEA), detailed routine task descriptions are
developed and then incorporated into work packages.

7. SMTs are developed to reduce task development time and effort, and to ensure
consistency when dealing with equipment from the same equipment group.
Developed SMTs are kept in a library for future reference. Routine updates are made
to SMTs to reflect the current condition of equipment, maintenance and operating
experience gained, and any changes/modifications to systems and equipment.

8. The final step in the analysis is to upload the developed work packages into Plant
Reliability Information Management Systems (PRIMS). PRIMS include maintenance
systems such as MAXIMO, SAP Plant Maintenance, Document Management
Systems, Inspection Systems, etc.

9. Monitoring developed/optimized maintenance programs is essential to ensure their
effectiveness in meeting the objectives set by the organization. An established method
for recording failure modes, failure effects, and failure causes, as well as the
corrective actions taken to eliminate/reduce the failure effects, is critical to the
successful implementation of any maintenance program.
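
The branching in steps 3 to 5 can be summarized as a small decision function. This is a sketch of one reading of the steps above, not a reproduction of Figure 5; the function name, argument names, and returned strings are hypothetical.

```python
def select_approach(risk: str, smt_exists: bool, has_requirements: bool) -> str:
    """Pick the task-development route for one system/equipment item.

    risk: "low", "medium" or "high" (from the criticality and risk assessment)
    smt_exists: an applicable Standard Maintenance Task is in the library
    has_requirements: company, standard, or governmental requirements apply
    """
    if risk not in ("low", "medium", "high"):
        raise ValueError(f"unknown risk level: {risk!r}")

    if smt_exists:
        # Step 3: the depth of verification depends on the risk level.
        if risk == "low":
            return "apply SMT: verify requirements and simple service activities"
        return "apply SMT: verify all SMT elements"

    # Step 4: no applicable SMT, so a more detailed analysis is needed.
    if risk == "high":
        return "complete RCM2 analysis"
    if risk == "medium":
        return "RCM2 (FMEA)"

    # Step 5: low risk without an SMT.
    if has_requirements:
        return "develop routine tasks for the requirements"
    return "run-to-failure"


print(select_approach("high", smt_exists=False, has_requirements=True))
print(select_approach("low", smt_exists=False, has_requirements=False))
```

Whatever route is chosen, the output still flows into work packages (steps 6 and 8) and feeds the SMT library (step 7).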

6.2 Standard Maintenance Task (SMT)

An SMT is a set of maintenance activities that constitutes a technically feasible and
cost-effective maintenance strategy for a defined equipment group. An equipment group
is a set of equipment of the same class that functions in an identical operating context.
The members of an equipment group have similar designs, failure modes, and failure
frequencies.

Establishing a library of SMTs ensures consistent documentation of maintenance
strategies, reduces the effort required to develop maintenance programs for new systems,
ensures the application of uniform, consistent, and cost-effective maintenance activities,
and facilitates analysis of equipment groups.

It is recommended to include the following information when documenting a standard
maintenance task:

1. Applicable company requirements.
2. Applicable governing standards.
3. Governmental requirements/regulations.
4. Completed RCM2 analysis.
5. Description of equipment boundary and proper reference to drawings/isometrics.
6. Description of operating context (operational and environmental.)
7. Assumptions/requirements for/from risk assessment.
8. Dominating failure modes with approximate probability.
9. The maintenance activities selected to reduce the probability that identified failure
mechanisms cause failure, along with the proper intervals (time-based or
performance/condition-based).
10. All equipment monitored parameters (RCM2) with their sensitivity to faults/failures.
11. Established performance indicators.
12. Experience from using a known maintenance strategy along with periodic monitoring
of established performance indicators.
13. For non-evident failure modes, the tests/inspections required to determine the
expected availability of the equipment.
14. Required experience and competency of maintenance personnel.
15. Estimated person-hours for maintenance activities.
16. Estimated repair time.
17. Essential spare parts, tools, equipment, and lead times.

The extent of documentation depends on the complexity of and the risk assigned to the
assets under review. For low risk assets, it is only required to document items 1 to 3
above, together with an assessment of whether simple service activities are adequate and
cost effective. For high and medium risk assets, it is recommended that the SMT document
all of the listed items.

6.3 Condition Monitoring

The MTD/O review may determine that the best maintenance strategy is to perform “on
condition maintenance.”

Equipment condition is determined by monitoring operational and non-operational
parameters sensitive to failure modes. Since not all parameters are effective in detecting
failure modes, a formal analysis is needed to select the right corroborative set of
parameters. The analysis must identify the failure-sensitive parameters and the
practicality of monitoring them.


After establishing the technical feasibility of condition monitoring, the economic viability
must be considered. The costs associated with the operation and on-going support of the
condition-monitoring program must be considered against the potential cost savings and
cost of alternative maintenance strategies.
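
A first-pass economic comparison can be as simple as an expected annual cost per strategy. The sketch below is a deliberately crude model, assuming that each strategy is characterized by a fixed annual program cost and an expected number of failures per year at a common cost per failure; all names and figures are hypothetical and are not taken from the paper.

```python
def annual_cost(program_cost: float, failures_per_year: float,
                cost_per_failure: float) -> float:
    """Expected yearly cost: running the program plus the failures it does not prevent."""
    return program_cost + failures_per_year * cost_per_failure


COST_PER_FAILURE = 60_000.0  # hypothetical direct plus indirect cost of one failure

# Hypothetical strategies: (annual program cost, expected failures per year).
options = {
    "run-to-failure": annual_cost(0.0, 2.0, COST_PER_FAILURE),
    "time-based PM": annual_cost(25_000.0, 1.0, COST_PER_FAILURE),
    "condition monitoring": annual_cost(18_000.0, 0.5, COST_PER_FAILURE),
}

best = min(options, key=options.get)
for name, cost in sorted(options.items(), key=lambda kv: kv[1]):
    print(f"{name:22s} {cost:>10,.0f}")
print("Lowest expected annual cost:", best)
```

With different inputs the ranking can easily reverse, which is the point of the check: technical feasibility alone does not justify a condition-monitoring program.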

6.4 Monitoring Maintenance Program Effectiveness

Monitoring the effectiveness of the developed maintenance programs is accomplished by
tracking and trending a set of key performance indicators. The indicators are
established during the asset condition assessment phase. Progress reports are produced
periodically. Modifications to maintenance tasks are made when necessary.

7.0 Application

This process was introduced and implemented at several plants in North America. Asset
condition assessment studies were conducted and baselines established for each facility.
The studies helped in developing the frequency score tables and provided points of
reference for future analyses to assess the effectiveness of the devised maintenance
programs.

Areas of assessment included the following:

• Mean time between failures.
• Downtime due to unscheduled maintenance.
• Downtime for scheduled maintenance.
• Asset downtime due to failures of utilities, upstream, and downstream production
assets.
• Slowdowns due to equipment failures.
• Slowdowns due to utilities, upstream, and downstream failures.
• Quality problems due to equipment failures.
• Maintenance cost.
• Increased operating cost due to equipment failures.
• Safety incidents due to equipment failures.
• Environmental releases and damages due to equipment failures.
• Spares consumption.
• Survey of existing PM and PdM tasks.

Data on operational downtime and slowdowns were collected but not used for this analysis.

The impact of adopting this process on asset performance and maintenance
organizations is summarized in Table 4.

[Flowchart. Boxes: Start; Develop Plant/Asset Technical Model; Perform Criticality &
Risk Assessment; New/Existing Plant/Asset? (Existing: Assess Plant/Asset Condition);
Develop/Optimize MP; Develop/Optimize Spares Strategy; Modify/Load MP To PRIMS;
Monitor MP Effectiveness; End.]

MP: Maintenance Program
PRIMS: Plant Reliability Information Management Systems

Figure 1: Maintenance Program Development Process.

[Flowchart. Boxes: Start; Corporate Objectives; Plant Objectives; Major Assets/Systems
Performance Expectations; Equipment Item Performance Expectations; End.]

Figure 2: Setting Performance Expectations.

[Flowchart. Boxes: Start; Collect Plant/Asset Technical Data; Establish Boundary
Definition Standards; Develop Plant/Asset Technical Model; Describe Systems' Functions;
Load Plant/Asset Model & Equipment To PRIMS; End.]

PRIMS: Plant Reliability Information Management Systems

Figure 3: Plant Technical Model Development.

[Flowchart. Boxes: Start; Business Objectives; Establish Assessment Criteria; Develop
Plant/Asset Technical Model; Select System; Cycle through systems/equipment list;
Generic Failure Data; Define Failure Consequences & Their Ratings; Perform the Analysis;
Assess Plant/Asset Condition; Determine Failure Frequencies & Their Ratings; Assign
Criticality & Risk Ranks To System(s)/Equipment; Define Criticality Ranking Table;
Define Criticality Assignment Rules; More Systems/Equipment? (Yes/No); End.]

Figure 4: Criticality and Risk Assessment.

[Flowchart. Boxes: Start; Select A System/Equipment; Perform Risk Assessment; Risk?
(Low/Medium/High); Regulatory Requirements? (Yes/No); Perform Planned Corrective Repair
(Run-to-Failure); Perform RCM2 (FMEA); Perform RCM2 Analysis; Standard Maintenance Task
Exist? (Yes/No); Relevant as SMT? (Yes/No); Establish SMT; Routine Activities,
Frequencies, Required Resources; Select Proper SMT From Library; Add SMT to Library;
Write Detailed Work Instructions; Verify Work Packages; Determine Work Packages; Load
Work Packages To PRIMS; More Systems/Equipment? (Yes/No); End.]

Figure 5: Maintenance Tasks Development/Optimization.

Criterion                  Maximum Score
Health and Safety          20
Environmental Integrity    20
Production Throughput      10
Operating Cost             10

Table 1: Assessment Criteria Scores.

Score   Consequence

Safety
20      Fatalities.
18      Disabling injury.
14      Serious injury.
6       Minor or first aid injury.
0       No injury.

Throughput/Downtime
10      Production downtime equal to or greater than 7 days.
9       Production downtime from 3 to 7 days.
8       Production downtime from 1 to 3 days.
7       One day of production downtime.
6       Production throughput at 25% of capacity.
4       Production throughput at 50% of capacity.
2       Production throughput at 75% of capacity.
0       No impact on throughput.

Product Quality
10      Unacceptable quality resulting in TOTAL product loss.
5       Unacceptable quality resulting in TOTAL product rework.
0       No effect on product quality.

Maintenance and Operating Cost
10      Incurred cost >$400K.
8       Incurred cost >$100K and <$400K.
6       Incurred cost >$50K and <$100K.
4       Incurred cost >$10K and <$50K.
2       Incurred cost >$1K and <$10K.
1       Incurred cost <$1K.

Table 2: Safety Criterion Consequence Table.

Failure Frequency                                           Score
Failures occur daily                                        10
Failures occur weekly                                       9
Failures occur monthly                                      8
Failures occur between one month and one year intervals     7
Failures occur yearly                                       6
Failures occur between 1 and 5 years                        5
Failures occur between 5 and 10 years                       4
Failures occur less frequently than once in 10 years        1

Table 3: Failure Frequency Scores.

                   Availability   Downtime   RAV       Product Quality
                   (%) [1]        (%) [2]    [3]       Rejects (%) [4]
Before   Plant 1   88             8          4.1       6
         Plant 2   89             7          3.5       6
         Plant 3   92             5          3.1       4
         Plant 4   93             4          2.5       2
After    Plant 1   92             4          3.25      4.5
         Plant 2   91.5           4.5        2.85      4.2
         Plant 3   94.5           2.5        2.4       2.4
         Plant 4   94.5           2.5        2.1       1.3

[1] Availability ([operating time - all downtimes including slowdowns]*100/operating time).
[2] Planned and unplanned downtime for maintenance (excluding TA).
[3] Percent of maintenance cost to asset replacement value.
[4] Percent reject due to equipment failure (includes startup and shutdown off-spec products).

Table 4: Implementation Results.


Appendix A: Definitions

Asset: May refer to a plant, system, or a piece of equipment.

Failure Mechanism: Physical, chemical, or other processes which lead or have led to
failure.

Maintenance Program: A comprehensive set of maintenance activities, their intervals,
and required resources, along with the documentation of the maintenance analysis
performed.

Maintenance Strategy: The means by which equipment is maintained. The
maintenance strategy can be of four main types: run-to-failure, preventive, predictive
(on condition maintenance), or redesign (of the equipment).

Standard Maintenance Task (SMT): A set of cost-effective maintenance actions for an
equipment class group.

Equipment Group: A set of equipment of the same class that functions in an identical
operating context.


Appendix B: References

1. AIChE/CCPS, Guidelines for Process Equipment Reliability Data, Center for
Chemical Process Safety, American Institute of Chemical Engineers, New York,
1989.
2. Blanchard, Benjamin S., Logistics Engineering and Management, Prentice Hall, Inc.,
1998.
3. EXP Training Documentation, IVARA Corporation, 2002.
4. Moubray, John, Reliability-Centered Maintenance (RCM II), 2nd Edition, Industrial
Press, 1997.
5. ISO 14224, “Petroleum and Natural Gas Industries – Collection and Exchange of
Reliability and Maintenance Data for Equipment,” International Organization for
Standardization, First Edition, 1999.
6. Norsok Standard, “Criticality Analysis for Maintenance Purposes,” Z-008, Rev. 2,
November 2001.
7. OREDA-97, Offshore Reliability Data, Det Norske Veritas, P.O. Box 300, N-1322
Hovik, Norway, 3rd Edition, 1997.
8. Seifeddine, Sammy, “Criticality and Risk Assessment,” HSB Reliability
Technologies, Project Document, 2000.
