
Energy 35 (2010) 964–975


Energy
journal homepage: www.elsevier.com/locate/energy

Risk-based decision making method for maintenance policy selection of thermal power plant equipment
F.G. Carazas, G.F.M. Souza*
Polytechnic School, University of São Paulo, São Paulo, Brazil

Article info

Article history:
Received 30 October 2008
Received in revised form 19 June 2009
Accepted 26 June 2009
Available online 28 July 2009

Keywords:
Availability
Risk analysis
Decision making
Gas turbines

Abstract

This study presents a decision-making method for maintenance policy selection of power plant equipment. The method is based on risk analysis concepts. Its first step consists of identifying the equipment that is critical for both power plant operational performance and availability, based on risk concepts. The second step involves the proposal of potential maintenance policies that could be applied to critical equipment in order to increase its availability. The costs associated with each potential maintenance policy must be estimated, including the maintenance costs and the cost of failure, which measures the consequences of critical equipment failure for the power plant operation. Once the failure probabilities and the costs of failures are estimated, a decision-making procedure is applied to select the best maintenance policy. The decision criterion is to minimize the equipment cost of failure, considering the costs and likelihood of occurrence of failure scenarios. The method is applied to the analysis of a lubrication oil system used in gas turbine journal bearings. The turbine has a nominal output of more than 150 MW and is installed in an open cycle thermoelectric power plant. A design modification, the installation of a redundant oil pump, is proposed to improve the availability of the lubricating oil system.
© 2009 Elsevier Ltd. All rights reserved.

* Corresponding author. Tel.: +55 11 3091 9656; fax: +55 11 3091 5461.
E-mail address: gfmsouza@usp.br (G.F.M. Souza).

0360-5442/$ – see front matter © 2009 Elsevier Ltd. All rights reserved.
doi:10.1016/j.energy.2009.06.054

Nomenclature

A                            uniform payment in the equal-payment-series capital recovery
C_f                          fixed cost of failure (cost of spare parts)
C_v                          variable cost per hour of down time, including labor rate and crew size
Consequences_i               consequences associated with the sequence represented by the ith cause-consequence diagram branch
DT                           power plant down time [in fraction of h]
EMV                          expected monetary value
F(t)                         failure probability at time t
j                            interest rate
MTTF                         mean time to failure
MTTR                         mean time to repair
p                            probability that the maintenance action is carried out unsatisfactorily
p(branch_i | initial event)  probability of occurrence of the sequence represented by the ith cause-consequence diagram branch, given the occurrence of the initial event
p_initial event              probability of occurrence of the initiating event
P                            present value in the equal-payment-series capital recovery
PL                           energy production loss [MWh]
R(t)                         reliability at time t
Risk_i                       risk associated with the sequence represented by the ith cause-consequence diagram branch
SP                           selling price of generated electricity [$/(MWh)]
t                            time period [h]
T                            time interval between two consecutive preventive maintenance actions
T_i                          time interval between two consecutive predictive maintenance actions
ΔT_i                         time to failure of the ith equipment failure
β                            Weibull distribution shape parameter
η                            Weibull distribution characteristic life [h]
λ                            exponential distribution parameter (failure rate) [failures/h]

1. Introduction

The main objective of any electric power generation system based on hydroelectric or thermoelectric power plants is to supply the amount of energy demanded by the market and to comply with the regulatory requirements defined by government laws.
To attain that objective, one of the most important requirements for any power generation system is to guarantee its technical availability.
The availability of a complex system such as a thermal power plant is strongly associated with the reliability of its parts and with the maintenance policy. That policy influences not only the repair time of the parts but also their reliability, affecting the system degradation and availability [3,7,8].
Availability is a measure of the percentage of time in which a plant is capable of producing its end product at some specified acceptable level. In a simple way, availability is controlled by two parameters:

• Mean time to failure (MTTF), which is a measure of how long, on average, the plant will perform as specified before an unplanned failure occurs, being associated with equipment reliability;
• Mean time to repair (MTTR), which is a measure of how long, on average, it will take to bring the equipment back to normal serviceability when it does fail.

Although reliability can be at least estimated during the plant design stages, availability is strongly influenced by the uncertainties in the repair time. Those uncertainties are influenced by many factors, such as the ability to diagnose the cause of failure or the availability of equipment and skilled personnel to carry out the repair procedures.
The maintenance policy of some power plant equipment, such as gas and steam turbines, must follow very stringent recommendations defined by the manufacturer. Most of the maintenance procedure tasks, involving periodical inspection and replacement of parts, are related to parts subjected to very high temperatures. The periodical inspection schedule is based on the number of equipment start-ups and operational hours.
For auxiliary systems, such as the lubricating oil system, the heat recovery steam generator feed water pumping systems, and other auxiliary systems, the manufacturer recommends periodical inspections but does not clearly define what kind of maintenance policies could be applied to the components of those systems.
In order to guarantee the plant operational performance, the Reliability Centered Maintenance (RCM) philosophy can be applied to define the maintenance policy [4,7], mainly for equipment that is not subject to stringent maintenance tasks. The main goal of that philosophy is "to determine what must be done to ensure that any physical asset continues to do whatever its users want it to do in its present operating context". The main difficulty for that application is the decision-making process regarding the cost-effectiveness of a given maintenance policy selected for the power plant equipment.

1.1. Objective

The aim here is to present a risk-based maintenance policy selection methodology for power plant equipment.
The proposed method is based on risk analysis and decision-making concepts. The risk analysis concepts are used to structure the assessment of equipment failure consequences, considering the power plant operational profile. The decision-making concepts are used to balance the costs associated with a given equipment maintenance policy against its failure probability, aiming at the minimization of the mean power plant operational costs in a given operational profile.
The methodology can be considered complementary to the Reliability Centered Maintenance (RCM) philosophy. According to the decision-making diagrams proposed by the RCM philosophy, the maintenance cost-effectiveness must be considered as a decision variable for maintenance policy selection. The risk-based method presented in the paper can be used for cost-effectiveness analysis, supporting maintenance managers in selecting the most useful maintenance policy among a set of technically feasible maintenance alternatives [4,5].
The method integrates financial and technical aspects associated with all applicable maintenance policies for a piece of equipment, including the evaluation of the equipment reliability when assisted by a given maintenance policy and the costs associated with the equipment failure, named cost of failure consequences. That cost is estimated based on the effects of equipment failure on the power plant operation, including environmental, safety and operational aspects, mainly the reduction of power plant output that causes production loss.

2. Method development

Based on RCM concepts, a primary maintenance practice selection procedure for equipment can be developed. That procedure, presented in Table 1, is based on the presence of symptoms that indicate whether a given failure mode is being developed and on the pattern of failure occurrence frequency, which can be random or repetitive.

Table 1
Simple decision-making procedure for maintenance selection.

                       Failure frequency of occurrence
Equipment failure      Random          Repetitive pattern
With symptoms          Predictive      Preventive or predictive
Without symptoms       Corrective      Preventive

According to RCM concepts, equipment failed states are known as functional failures, because they occur when the equipment is unable to fulfill a function to a standard of performance which is acceptable to the user. In addition to the total inability to function, this definition encompasses partial failures, where the equipment still functions but at an unacceptable level of performance.
In Fig. 1 a graphic indicates, in a generic way, the functional behavior of equipment, the performance of which presents degradation during operational time. If the times to failure (ΔT_i) recorded in the maintenance database are quite similar (repetitive failure pattern), the equipment presents a frequency of failure that is almost constant during the operational period. If the failure root-cause analysis indicates that most of the failures are caused by the same age-related degradation mechanism, the equipment failure mode presents a repetitive pattern. The aging failure is a gradual failure, meaning that the performance of the equipment is gradually drifting out of the specified range. According to Table 1, preventive maintenance tasks may be used to lower that frequency of failure. Those tasks, based on the scheduled replacement or restoration of components the failure of which causes the operational performance degradation of the equipment, aim at restoring the initial performance of equipment at a specified operational time limit, regardless of its apparent condition at the time. The frequency of scheduled maintenance tasks is governed by the operational age at which the equipment shows a rapid decrease in performance. If the equipment presents performance degradation (named as failure symptoms), as shown in Fig. 1, due to some component loss of performance associated with a failure mode development, a monitoring system may be used to detect failure mode development aiming at the use of predictive
maintenance practice instead of preventive practice. The predictive (or on-condition) tasks entail checking for potential failures, so that action can be taken to prevent the equipment failure. The maintenance tasks are not previously scheduled, being executed based on the assessment of the condition of the equipment. The use of that maintenance practice is recommended for both random frequency of failure (quite different ΔT_i) and constant frequency of failure.

Fig. 1. Degradation of equipment performance with operational time.

In Fig. 2 a graphic indicates the functional behavior of equipment, the performance of which does not present degradation until the occurrence of functional failure. In that case the failure occurs suddenly without any previous symptoms. As proposed in Table 1, for that case, if the equipment presents a repetitive failure pattern the use of preventive maintenance is recommended. On the contrary, if the equipment does not present performance degradation and the failure pattern is random, the maintenance planner can only use corrective actions to restore the equipment to its normal operational condition. Corrective maintenance is performed after failure has occurred in order to return the equipment to service as soon as possible.

Fig. 2. Constant equipment performance with operational time.

Although the decision-making procedure presented in Table 1 allows the selection of a set of maintenance practices to be used in equipment, it does not indicate the selection of the most cost-effective maintenance practice.
So as to deal with this issue, one needs to evaluate the equipment failure probability and the costs of maintenance and failure consequences.
The methodology proposed here for maintenance policy selection is presented in Fig. 3.

Fig. 3. Flowchart for risk-based methodology.

Initially, the maintenance planner must list all possible maintenance procedures that are technically feasible for the equipment. For each one of those maintenance procedures, the equipment failure probability must be evaluated through the use of reliability concepts. Based on a 'time to failure' database, the analyst can calculate the equipment reliability as presented in [9]. For each one of those maintenance procedures the future equipment reliability can be predicted according to the procedures presented in [2,3,7,8].
The second step in the procedure is the assessment of maintenance procedure costs and equipment failure consequence costs based on a cost database.
The equipment failure consequence cost assessment involves the definition of the equipment failure effects on the power plant operational availability and safety, including environmental impact. The risk analysis concepts are usually used to predict the equipment failure consequences given a power plant operational scenario.
Once the equipment failure probability is evaluated for each of the feasible maintenance procedures and the failure consequences and maintenance costs are estimated, a decision-making procedure, based on a decision tree, is used to select the maintenance procedure that minimizes the risk associated with the equipment failure, expressed by the mean failure costs.
The present method can be used not only to select maintenance procedures but also to evaluate the feasibility of changes in power plant design such as the use of redundant equipment, installation of new control and monitoring systems, equipment retrofitting and even changes in the plant operational procedure.

2.1. Risk analysis concepts

Risk analysis is a technique for identifying, characterizing, quantifying, and evaluating the loss of an event. The risk analysis approach integrates probability and consequence analysis at various stages of the analysis and attempts to answer the following questions [4]:

• What can go wrong that could lead to a system failure?
• How can it go wrong?
• How likely is its occurrence?
• What would the consequences be in case it happens?

In this context, risk can be defined qualitatively/quantitatively as the following set of duplets for a particular failure scenario.
$$\text{Risk} = (\text{failure probability}) \times (\text{consequence of the failure}) \tag{1}$$

The risk analysis method aims at the evaluation of the likelihood of occurrence of equipment failures and their consequences for the power plant operation, characterizing a quantitative risk analysis. The output of a quantitative risk assessment will typically be a number, such as cost impact ($) per unit time. The number could be used to prioritize a series of items that have been risk assessed. Quantitative risk assessment requires a great deal of data both for the assessment of probabilities and for the assessment of consequences. The procedure is presented in Fig. 4.

Fig. 4. Risk analysis method.

2.1.1. System description

Defining the system is an important first step in performing a risk assessment. A system can be defined as an entity comprising an interacting collection of discrete elements. The definition of the system is based on analyzing its functional and/or performance requirements.
The first step involves the definition of the analysis scope, which includes the study of the power plant and equipment operational procedures and the federal legislation associated with power generation and distribution. The equipment analysis can be performed through the use of a functional tree that represents the functional links between the equipment subsystems [11].
The system breakdown structure used in the functional tree is the top-down division of a system into subsystems and components. This architecture provides internal boundaries for the system. For failure analysis, the resolution of that breakdown should go down to the component level where failure data are available. Often the systems/subsystems are identified as functional requirements that eventually lead to the component level of detail. The functional level of a system identifies the function(s) that must be performed for the operation of the system.
Taking into consideration that the risk analysis is based on the evaluation of the equipment failure consequences on the system performance, a method must be used to select the critical components or equipment with respect to the potential effect of their failure on the system operational performance. That criticality must be judged on the plant level.
The present method uses the Failure Mode and Effects Analysis (FMEA) to identify the component (equipment) failure modes and the impacts on the surrounding components and the system. The analysis tool assumes that a failure mode occurs in an equipment component through some failure mechanism; the effect of this failure on the system is then evaluated. A criticality ranking can be developed to measure the effect of a component failure mode on the overall performance of the system. Reference [11] suggests the use of a numerical code ranking from 1 to 10. The higher the number, the higher is the criticality of the component failure mode.
The risk assessment analysis is executed for the most critical components.

2.1.2. Risk quantification

The second step involves the risk quantification, which must be executed in two steps: equipment failure probability estimate and failure consequences analysis.

2.1.2.1. Reliability analysis. Reliability can be defined for a system or a component as its ability to fulfill its design functions under designed operational or environmental conditions for a specified time period. This ability is commonly measured using probabilities. Reliability is, therefore, the occurrence of the complementary event to failure, as provided in the following expression:

$$R(t) = 1 - F(t) \tag{2}$$

Probably the single most used parameter to characterize reliability is the mean time to failure (or MTTF). It is just the expected or mean value of the failure time, expressed as:

$$MTTF = \int_0^{\infty} R(t)\,dt \tag{3}$$

Random failures (represented by the exponential probability function) constitute the most widely used model for describing reliability phenomena. They are defined by the assumption that the rate of failure of a system is independent of its age and other characteristics of its operating history. In that case the use of mean time to failure to describe reliability can be acceptable, since the exponential distribution parameter, the failure rate, is directly associated with MTTF.
The constant failure rate approximation is often quite adequate even though a system or some of its components may exhibit moderate early-failure or aging effects. The magnitude of early failures is limited by strict quality control in manufacturing, and aging effects can be sharply limited by careful predictive or preventive maintenance.
When the phenomena of early failures, aging effects, or both, are present, the reliability of a device or system becomes a strong function of its age.
The Weibull probability distribution is one of the most widely used distributions in reliability calculations involving time-related failures. Through the appropriate choice of parameters a variety of failure rate behaviors can be modeled, including constant failure rate, in addition to failure rates modeling both wear-in and wear-out phenomena.
The two-parameter Weibull distribution, typically used to model wear-out or fatigue failures, is represented by the following equation:

$$R(t) = e^{-(t/\eta)^{\beta}} \tag{4}$$

The distribution parameters are estimated through the use of parametric estimation methods that fit the distribution to the 'time to failure' data. There are procedures for estimating the Weibull distribution parameters from data, using what is known as the maximum likelihood estimation method.
The reliability of an equipment or component, the failure of which can be considered as an initiating event for risk analysis, should be determined from historical data if a significant number of failures have occurred in the past. For the present study equipment failure probability can be estimated based on the 'time to failure' database developed by the power plant or even, in case of lack of information, based on equipment reliability database issued by
international associations such as the RAC [10]. The external information available should be considered carefully before its use, because such information is generally available at a rather coarse level.
Any component or system presents an inherent reliability that is a function of the design and the build (or manufacturing) quality. An effective maintenance program will ensure that the inherent reliability is realized. It cannot, however, improve the reliability of the component. This is only possible through redesign or modification [6].
If some maintenance actions are performed on the system at constant time intervals T (characterizing preventive intervention), smaller than the expected operational time, it is possible to define the reliability of the maintained system as [9]:

$$R_M(t) = [R(T)]^N \, R(t - NT), \quad NT \le t < (N+1)T \tag{5}$$

where N = 0, 1, 2, ... is the number of maintenance interventions in a given operational time t.
The analysis is based on the hypothesis that the system is restored to an 'as good as new' condition after the maintenance action. This implies that the maintained system at time t > T has no memory of accumulated wear effects for times before T. Thus, in the interval NT < t ≤ (N + 1)T, the reliability is the product of the probability [R(T)]^N that the system operated without failures to NT, and the probability R(t - NT) that a system 'as good as new' at NT will operate without failure for a time (t - NT).
Preventive maintenance has quite an effect when aging or wear causes the failure rate to become time-dependent, and the effect on reliability is presented in Fig. 5.

Fig. 5. The effect of preventive maintenance on reliability [9].

The use of predictive maintenance has a similar effect on the system reliability, but the maintenance actions are executed at variable time intervals. The reliability of the maintained system can be expressed as:

$$R_M(t) = \left[\prod_{i=1}^{N} R(T_i)\right] R\!\left(t - \sum_{i=1}^{N} T_i\right), \quad t > \sum_{i=1}^{N} T_i \tag{6}$$

The mean time to failure for a system with preventive or predictive maintenance can be determined by:

$$MTTF = \int_0^{\infty} R_M(t)\,dt \tag{7}$$

Considering that the maintenance actions will increase the system mean time to failure, the availability of the system will also increase. So the selection of a maintenance policy has a direct effect on the failure probability of a system, thus affecting its operational risk profile.

2.1.2.2. Failure consequence analysis. The goal of the scenario development is to derive a complete set of scenarios that encompasses all of the propagation paths following the occurrence of the initiating event. To describe the cause and effect relationship between initiators and the event progression, it is necessary to identify those functions that act as barriers to the failure progression.
A failure scenario is a description of a series of events which leads to a failure event. It may contain a single event or a combination of sequential events. The expectation of a scenario does not mean it will indeed occur, but that there is a reasonable probability that it would occur. A scenario is neither a specific situation nor a specific event, but a description of a typical situation that covers a set of possible events or situations. Failure scenarios are generated based on the operational characteristics of the system, the physical conditions under which operation occurs, the geometry of the system, and safety arrangements.
The development of a failure scenario should be based on the following steps:

i. Identify the mitigating functions for each initiating event;
ii. Identify the corresponding human actions, systems or hardware operations that can be considered barriers for the initiating event propagation;
iii. Develop a failure scenario based on cause-consequence analysis methodology, such as Fault Tree Analysis or Event Tree Analysis.

The present study suggests the use of the cause–consequence diagram method to evaluate the failure scenarios. According to Ridley and Andrews [12] the method was developed at RISO National Laboratories, Denmark, aiming at assisting in the cause–consequence accident analysis of nuclear power plants.
The cause–consequence diagram focuses on the occurrence of an initiating event. For power plants the initiating event is typically a failure of an electromechanical device. Once that event has been identified, all potential consequences can be developed based on the Event Tree Analysis. The event tree method is used as the link between the occurrence of the initiating event and the various consequences that could result. The initiating event is followed by other events leading to an overall result (consequence). Those events are named reactionary events; they can be interpreted as the barriers to the initiating event progression and should be arranged according to the temporal action of the system. The reaction can be either a success or a failure. The functionality of each event (usually represented by the operational condition of a given component) is investigated, and expressed as 'yes' or 'no' answers. That answer is probabilistic, since the component may fail during the plant operation. The probability of failure (representing the 'no' answer) is estimated based on the reliability analysis of the component (or group of components) associated with the event under analysis. The probability of success ('yes' answer) is represented by the component reliability.
The consequence tracing part of the diagram involves taking the initiating event and following the resulting chain of events through the system. At various steps, the chains may branch into multiple paths. The consequence analysis results in a description of the relevant accident scenarios given the occurrence of the initiating event and is used to calculate both the likelihood and the consequences of each accident scenario. A quantitative evaluation of the diagram probability values can be used to estimate the probability of the overall system state. The probabilities of various events in a sequence can be viewed as conditional probabilities and therefore can be multiplied to obtain the occurrence probability of a given sequence. The probabilities of various sequences can be summed
up to determine the overall probability of a certain outcome. The addition of consequence evaluation of a scenario allows for generation of a risk value.
The risk associated with a given cause–consequence diagram branch is calculated through the following expression:

$$\text{Risk}_i = (p_{\text{initial event}}) \; p(\text{branch}_i \mid \text{initial event}) \; (\text{Consequences}_i) \tag{8}$$

Fig. 6 shows the cause–consequence logic.

Fig. 6. Cause–consequence diagram logical notation.

Once the cause–consequence diagram is developed for the main equipment installed in a power generation plant, the risk analyst can select the most important equipment for plant operation using as prioritization criterion the severity of its failure consequences. The higher that severity, the higher is the priority of the equipment for maintenance planning.
For that equipment, the maintenance planner can select the most feasible maintenance procedures aiming at the reduction of the failure probability and, consequently, aiming at risk minimization [1].

2.2. Cost analysis

The objective here is to quantify the potential consequences of the functional failure, which represents a credible scenario. The total consequences assessment usually is a combination of four major categories of consequences: (i) system performance loss, (ii) financial loss, (iii) human health loss, and (iv) environmental and/or ecological loss. The method of quantification of these four categories may change according to the scope of the study undertaken.
To complete the risk analysis of power plant equipment, the consequences of equipment failure must be expressed in monetary values. Many aspects influence that cost evaluation, such as power plant location and configuration, operational pattern and federal legislation. Additionally, the costs of maintenance procedures must also be evaluated.
The cost analysis is dependent on the existence of a database that relates costs to some undesirable failure events associated with power plant equipment. For the present analysis, the costs are divided into three classes, as presented in Fig. 7: fixed operational costs, variable operational costs and unavailability costs.

Fig. 7. Power generation costs.

The total operational costs can be calculated by the sum of those costs as:

$$\text{Power generation costs} = \text{fixed cost} + \text{variable cost} + \text{unavailability costs} \tag{9}$$

The operation and maintenance (O&M) fixed costs are related to the power plant operation independently of the amount of energy generated. Those costs include plant operators' wages, general and equipment maintenance costs (for procedures that do not depend on the equipment operation time history), insurance and taxes.
The variable O&M costs are those that depend on the amount of energy generated or on the equipment operation time history.
Both classes of costs are dependent on the maintenance policy applied to the power plant equipment.
The unavailability costs are related to the consequences of unexpected equipment failure that requires corrective maintenance actions, defined according to the risk analysis procedure. Those costs consider corrective maintenance actions, including spare parts and labor hours, and a monetary expression of equipment failure consequence costs, mainly production loss cost. For that estimate, one should include the reduction in the power plant energy output that affects commercial contracts, and environmental and operational safety degradation. Both environmental and operational safety degradation costs are strongly influenced by the power plant location and by regulatory laws related to the environmental impacts.
Corrective maintenance cost typically includes the cost of labor, parts, and the down time associated with the repair. The maintenance cost can be calculated using the following equation:

$$MC = C_f + DT \times C_v \tag{10}$$

The cost of spares includes the cost of raw material, internally manufactured parts, the parts sent away for repairs, new spare parts, consumables, small tools, testing equipment, and rent for special equipment. The cost of spares and raw materials is drawn from the plant stock book.
Maintenance down time includes the total amount of time the plant would be out of service as a result of failure, from the moment it fails until the moment it is fully operational again.
The cost of labor is an important component of the maintenance cost. This is based on the hourly rate for various trades and the information is drawn from the plant documentation. Those costs depend on the union agreements and federal laws, varying from country to country.
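As a minimal numerical sketch of Eqs. (8) and (10), the snippet below evaluates the corrective maintenance cost MC = Cf + DT × Cv and then the risk of each branch of a small cause–consequence diagram. The branch names, probabilities and cost figures are hypothetical values chosen only to exercise the formulas; they are not data from the study, and the helper names (`Branch`, `branch_risk`, `maintenance_cost`) are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Branch:
    """One branch of a cause-consequence diagram (illustrative structure)."""
    name: str
    p_given_initial: float  # p(branch_i | initial event)
    consequence: float      # Consequences_i, in monetary units


def branch_risk(p_initial_event: float, branch: Branch) -> float:
    """Eq. (8): Risk_i = p_initial_event * p(branch_i | initial event) * Consequences_i."""
    return p_initial_event * branch.p_given_initial * branch.consequence


def maintenance_cost(cf: float, dt_hours: float, cv_per_hour: float) -> float:
    """Eq. (10): MC = Cf + DT * Cv, with DT in hours and Cv per hour of down time."""
    return cf + dt_hours * cv_per_hour


# Hypothetical inputs (assumptions, not values from the paper).
p_initial = 0.5
mc = maintenance_cost(cf=5000.0, dt_hours=8.0, cv_per_hour=250.0)

# Two mutually exclusive branches following the initiating event:
# the barrier either stops the failure progression or it does not.
branches = [
    Branch("barrier succeeds", 0.75, 0.0),
    Branch("barrier fails", 0.25, mc),
]

risks = {b.name: branch_risk(p_initial, b) for b in branches}
total_risk = sum(risks.values())

for name, value in risks.items():
    print(f"{name}: {value:.2f}")
print(f"total: {total_risk:.2f}")
```

In this sketch the consequence of the failing branch is taken to be the corrective maintenance cost alone; in a fuller evaluation the production loss cost would be added to it before applying Eq. (8).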
970 F.G. Carazas, G.F.M. Souza / Energy 35 (2010) 964–975

Down time associated with forced outage and forced de-rating states must be estimated from the failure data collected at the power plant. Owing to the lack of data, the down time and the number of maintenance personnel involved in repair are estimated by interviewing the maintenance personnel.

The production loss cost can be estimated using the following formula:

$$\mathrm{PLC} = \mathrm{DT} \times \mathrm{PL} \times \mathrm{SP} \tag{11}$$

Production loss in MWh must be computed from the failure data. The combination of the production loss cost and the maintenance cost gives the consequence of the failure in monetary values.

For the Brazilian electrical energy market, the evaluation of the production loss cost due to power plant unavailability depends on the bilateral agreement between the electrical energy generator agent (power plant owner) and the Brazilian Electricity Regulatory Agency (ANEEL). Those agreements determine prices and sales volume of electrical energy during certain periods.

In case of equipment failure, the power plant performance can be affected in different ways. If the amount of delivered energy is smaller than the sales volume defined in the agreement, the generator agent may suffer penalties, including fines or a reduction of sales volume in future contracts. Those penalties affect the production loss cost.

In order to avoid those penalties, the generator agent can buy electrical energy in a spot market to comply with the contracted sales volume. The spot market is a segment of the wholesale electrical energy market in which energy not contracted for bilaterally, and surpluses over the contracted amounts, are bought and sold. The price varies with demand and greatly affects the power plant production loss cost.

2.3. Decision-making procedure

A decision generally deals with three elements: alternatives, consequences, and preferences. The alternatives are the possible choices for consideration. The consequences are the potential outcomes of a decision. Decision analysis provides methods for quantifying preference tradeoffs for performance along multiple decision attributes while taking into account risk objectives. The decision outcomes may be affected by uncertainty; however, the goal is to choose the best alternative with proper consideration of uncertainty.

Once the feasible maintenance procedures for the equipment are listed and the equipment reliability and the costs associated with the power plant operation are estimated, a decision-making procedure should be applied to select the most feasible maintenance policy. The objective of the decision-making procedure is to minimize the mean costs associated with power plant operation.

The decision tree is used to structure the decision-making problem. In decision trees, squares represent decisions to be made, while circles represent chance events. The branches emanating from a square correspond to the choices available to the decision-maker (in the present case, the maintenance policies), and the branches from a circle represent the possible outcomes of a chance event (the equipment failure probability). The third decision element, the consequences, is specified at the ends of the branches.

Given the costs of the consequences at the end of each branch and the probability of occurrence of each outcome at a chance node, it is possible to calculate the expected monetary value (EMV) for each decision alternative, which represents the mean costs associated with power plant operation. The best alternative is the one that has the lowest EMV.

3. Application

The proposed risk-based maintenance policy selection is applied to the analysis of a heavy-duty F series gas turbine with a 150 MW nominal output. The turbine generates power in an open cycle thermal power plant.

The maintenance policy of gas turbines must follow very stringent recommendations defined by the manufacturer. Most of the maintenance procedure tasks, involving periodical inspection and replacement of parts, are related to parts subjected to very high temperature and located in the hot gas path (combustion chamber plus turbine). The parts that compose those subsystems can present severe wear, affecting gas turbine performance. For those parts the manufacturer does not allow any change in maintenance procedure, since the procedures are defined in maintenance contracts involving equipment warranty. The periodical inspection schedule is based on the number of equipment start-ups and operational hours.

For auxiliary systems, such as the lubricating oil system, the manufacturer recommends periodic inspections but does not clearly define what kind of maintenance policies could be applied to the components of those systems. Given that the plant operator faces difficulties in planning maintenance tasks for the lubricating system, the present paper applies the risk-based maintenance analysis to evaluate possible actions aimed at reducing the lubrication system failure frequency.

The lubricating system is responsible for providing pressurized oil for the turbine radial and axial sliding bearings. That system uses two oil pumps: a main unit (with 35 kW output) and an emergency unit.

Bearing in mind the importance of the main oil pump for turbine operation, the power plant managers developed a maintenance policy based on the application of predictive and preventive practices, including a bimonthly preventive inspection and periodic vibration monitoring.

Even with a very restrictive preventive maintenance plan aimed at reducing the pump failure probability, the power plant still faces some emergency shut downs due to pump instability.

To reduce the number of shut downs, the plant managers planned to install a passive redundant oil pump that should keep the lubricating system working even in case of main pump failure. Due to the high investment associated with that design change, a risk-based analysis is executed to evaluate the feasibility of the lubricating oil system modification.

3.1. Risk analysis

The reliability and availability of quite similar equipment were evaluated, and the results are presented in reference [11]. For that analysis the functional tree and the failure mode and effects analysis were developed for the F series gas turbine, and some of the results can be used for the present study.

According to reference [11], the most critical components for the turbine are:

- compressor: blades, vanes and shaft;
- bearings: oil heat exchanger, lubrication oil system, including pump, piping and filter;
- combustion subsystem: combustion chamber, ignition and transition duct;
- turbine: blades and vanes refrigeration system, shaft and exhaustion.

Fig. 8. Gas turbine functional tree with emphasis on the lubrication system.

The simplified functional tree for the gas turbine, with emphasis on the lubricating system, is presented in Fig. 8. The functional tree shows that a failure in the lubricating system greatly affects the performance of the gas turbine bearings (radial and thrust). A total absence of lubrication in the journal-bearing system leads to bearing seizure and, normally, to total destruction of the part. However, an altogether more frequent phenomenon is fatigue due to oil starvation, whereby the amount of oil reaching the journal-bearing system is insufficient to maintain the oil film, leading to metal-to-metal contact between the two parts. Prolonged operation under such conditions will also result in total destruction of the whole.

The most critical component of the lubricating system is the oil pump due to its many components, such as impeller, bearings, shaft and coupling. The greater the number of components in a piece of equipment, the greater the number of failure modes that can affect its functional performance. The failure of the main pump reduces the oil pressure and flow. If the pressure and/or flow are below a pre-set value, the control system shuts the turbine down. For that emergency operation the lubricating system uses the emergency pump.

The oil pump is mechanical equipment that typically presents aging failures associated with component fatigue or wear-out.

The cause–consequence diagram presented in Fig. 9 is developed to define the power plant failure scenarios associated with the oil pump failures.

The initiating event is the reduction of oil pressure and flow in the turbine bearings.

The diagram considers that once the initiating event is detected the plant operator will check whether the main oil pump is operational. If the pump performance is not achieving the design value, the equipment must be monitored in order to detect the failure mode development. If the main pump fails and the emergency pump presents an on-demand failure, the gas turbine bearings will suffer severe damage due to lack of lubrication oil during the shut down process.

If the main pump is operating according to the design performance, the loss of oil pressure or oil flow may be caused by piping or reservoir failure. Those failure modes can be easily detected in comparison with the detection of main pump failure.

According to the diagram, the worst scenario involves the failure of both the main pump and the emergency pump (which may fail in the stand-by mode). That scenario would cause an emergency gas turbine shut down without bearing lubrication, causing severe damage to the bearings.

If the emergency pump is operational, the lubrication system will need corrective repair but the gas turbine bearings will not suffer any damage during the shut down process. Another branch indicates that if the main pump failure is not clearly diagnosed, the lubricating system operation must be continuously monitored. The diagram also shows that the lubricating system failure may be caused by leakage in the oil piping or by failure in the lube oil reservoir. For those scenarios the damage caused to the bearings is less severe than the damage caused by main pump failure, since the shut down process is executed with oil flow, although above the flow design value.

3.2. Risk quantification

The main pump reliability is defined according to a ‘time to failure’ database registered in the power plant maintenance control system.

Fig. 9. Cause–consequence diagram for main pump failure.



Since reliability is expressed through a probability distribution function, it is necessary to fit the ‘time to failure’ data to some particular distribution in order to evaluate the parameters of the probability function. There is a variety of advanced statistical methods for determining the goodness of fit of data to a particular distribution and for estimating the parameters. The maximum likelihood estimation method is used here to verify the goodness of fit of the data for four distributions: normal, lognormal, exponential and Weibull. The exponential distribution presented the best fit and is used to model the main pump reliability.

Using those data, the main pump reliability is calculated using the maximum likelihood criterion and is represented by an exponential distribution:

$$R(t) = e^{-\lambda t} \tag{12}$$

where the failure rate parameter $\lambda$ is equal to $5.8 \times 10^{-5}$ failures/h. The reliability distribution is shown in Fig. 10. The points presented in the graph represent the median rank plotting reliability estimate for each of the time to failure data, arranged in increasing order. Those points are used to verify the adherence of the reliability distribution to the failure data.

The risk analysis is conducted considering a continuous operation of 1440 h, since the main pump is subjected to preventive inspection every other month. Repairs are executed on the pump if degradation is detected. For the analysis, the pump is considered ‘as good as new’ after preventive tasks.

For 1440 h, the pump reliability is equal to 0.92 and consequently the failure probability is equal to 0.08.

The emergency pump did not fail in the five-year operational period considered in the present analysis. Nevertheless, it might fail in the stand-by mode. Due to the lack of information regarding emergency pump failure, a standard reliability database was used to estimate that pump's failure probability on demand. According to reference [10], that probability is close to 0.08.

The costs associated with the main pump failure are estimated based on the power plant accounting system. The costs of failure consequences for the branch that supposes failure of both main and emergency pumps are estimated, since that event has never occurred at the power plant under analysis.

The main pump operational costs are close to $1800.00. Those costs correspond to 1440 h of operation and include the rated fixed O&M power plant costs.

The corrective repair tasks and the unavailability costs in case of main pump failure are close to $2300.00.

The cost of gas turbine bearing failure due to the main lubrication oil pump failure is estimated based on private conversations between the authors and the power plant managers. That cost can be close to $350,000.00, depending on the extent of the damage. That value includes production loss costs, since the time to repair is very high, and can be treated as an upper bound estimate for the decision problem.

The investment to install a stand-by lubrication oil pump, including sensors and control system, is close to $2000.00.

For decision-making, the investment representing the acquisition and installation of the stand-by pump is modeled as an equal-payment-series capital recovery calculated with twelve bimonthly equal payments (twelve operational periods of 1440 h), corresponding to three operational years. After three years, according to the plant operator, the stand-by pump should be subjected to overhaul maintenance, so the economic life of the equipment would be three years.

The capital recovery factor can be interpreted as the amount of the equal (or uniform) payments (A) to be paid for 12 periods of time such that the total present value of all these equal payments is equivalent to a payment of $2000.00 at present (the cost of acquisition and installation of the stand-by pump), if the interest rate is j.

Fig. 11 summarizes the monetary flow for the present analysis. The monetary value of the payment A is calculated according to the following equation, where P is the present value of the investment:

$$A = P \left[ \frac{j(1 + j)^{12}}{(1 + j)^{12} - 1} \right] \tag{13}$$

For the present analysis the annual interest rate is equal to 9.5% (according to the Brazilian monetary authority, the Central Bank, in June 2009), corresponding to a bimonthly interest rate of 1.53%. The monetary value of A for the present study is $183.70.

The costs associated with main pump failure are considered to occur at a random time in the cash flow analysis and are not distributed through the 1440 h (or two months).

The costs associated with the decision problem are summarized in Table 2.

3.3. Decision problem

The basic decision tree for the present problem is presented in Fig. 12. For the first analysis, the emergency pump is considered to have reliability equal to 1 (it will not present failure on demand), so the gas turbine will shut down without damage to the bearings. The back-up pump is modeled with an on-demand probability of failure equal to 0.1.

The basic decision choice node represents the choice of keeping the system design (only one main pump) or installing the back-up oil pump. The main pump failure probability is equal to 0.08 and its reliability is equal to 0.92.
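As a quick numerical check, the reliability of 0.92 and the failure probability of 0.08 follow directly from Eq. (12) with the stated failure rate and the 1440 h operating period. The sketch below is illustrative only: the function names are hypothetical, and the estimator `exponential_mle_rate` assumes complete (uncensored) time-to-failure records, which the paper does not state.

```python
import math

FAILURE_RATE_PER_HOUR = 5.8e-5   # lambda reported with Eq. (12)
OPERATING_PERIOD_H = 1440.0      # continuous operation between inspections


def exponential_mle_rate(times_to_failure):
    """Maximum likelihood estimate of an exponential failure rate.

    Assumes every record is an observed failure (no censoring):
    rate = number of failures / total accumulated time.
    """
    if not times_to_failure:
        raise ValueError("at least one time to failure is required")
    return len(times_to_failure) / sum(times_to_failure)


def reliability(t_hours, rate=FAILURE_RATE_PER_HOUR):
    """Exponential reliability R(t) = exp(-rate * t), as in Eq. (12)."""
    return math.exp(-rate * t_hours)


pump_reliability = reliability(OPERATING_PERIOD_H)
failure_probability = 1.0 - pump_reliability
print(f"R(1440 h) = {pump_reliability:.2f}")
print(f"Failure probability = {failure_probability:.2f}")
```

The plant's actual time-to-failure records are not given in the paper, so the estimator is shown only to make the fitting step concrete.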

Fig. 10. Main pump reliability distribution.

Fig. 11. Monetary flow for stand-by pump investment.
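The equal payment A of Eq. (13) can be reproduced from the figures given in the text: a present value of $2000.00, a bimonthly interest rate of 1.53% and twelve periods. This is a minimal sketch; the function name `capital_recovery_payment` is an assumption made for illustration.

```python
def capital_recovery_payment(present_value, rate_per_period, n_periods):
    """Equal payment A whose present value equals `present_value`.

    Implements A = P * j * (1 + j)**n / ((1 + j)**n - 1), the form of Eq. (13).
    """
    if rate_per_period <= 0 or n_periods <= 0:
        raise ValueError("rate and number of periods must be positive")
    growth = (1.0 + rate_per_period) ** n_periods
    return present_value * rate_per_period * growth / (growth - 1.0)


# Values taken from the text: P = $2000.00, j = 1.53% per two-month period, 12 periods.
payment = capital_recovery_payment(2000.00, 0.0153, 12)
print(f"A = ${payment:.2f}")
```

With these inputs the result agrees with the $183.70 used in the decision analysis.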

Table 2
Summary of consequence costs, in US dollars.

Consequence                                                               Cost ($)
Gas turbine shut down. No damage. Corrective maintenance in main pump.     2300.00
Gas turbine shut down. Damage in bearings. Corrective maintenance.      350,000.00
Normal gas turbine operation for 1440 h.                                   1800.00

At the end of each branch the consequences are expressed in monetary values according to the following calculations:

- Install Back-up Pump
  - Normal operation with back-up = back-up pump operational cost + back-up pump investment = $1800.00 + $183.70 = $1983.70
  - Turbine shut down = main pump repair cost + back-up pump investment = $2300.00 + $183.70 = $2483.70
  - Normal operation = main pump operational cost + back-up pump investment = $1800.00 + $183.70 = $1983.70
- Keep System Design
  - Normal operation = main pump operational cost = $1800.00
  - Turbine shut down = main pump repair cost = $2300.00

The operational cost of the back-up pump is considered equal to the operational cost of the main pump for the decision analysis.

The expected monetary costs for each decision choice are presented in Table 3.

Table 3
Decision-making EMV ($).

Decision choice                      EMV ($)
No failure in emergency pump
  Keep System Design                 1840.00
  Install Back-up Pump               1987.70
Failure in emergency pump
  Keep System Design                 4065.28
  Install Back-up Pump               2368.92

The expected monetary values (EMV) for each decision choice are:

- Install Back-up Pump:

EMV = [0.9 × $1983.70 + 0.1 × $2483.70] × 0.08 + 0.92 × $1983.70
EMV = $1987.70

- Keep System Design:

EMV = 0.08 × $2300.00 + 0.92 × $1800.00
EMV = $1840.00

The use of a redundant oil lubrication pump does not seem to be economically feasible, since the initial investment for the pump installation is high in comparison with the pump maintenance costs.

Fig. 12. Decision tree – no failure in emergency pump.

If the probability of emergency pump failure is added to the decision problem, the decision tree is updated, as shown in Fig. 13. In that case, considering the actual system design, if the emergency pump fails the turbine shuts down but the bearings will suffer severe damage. The plant unavailability cost (including bearing repair cost plus production loss cost) is $350,000.00. The expected monetary values are calculated using the same procedure used for the tree presented in Fig. 12.

Fig. 13. Decision tree – failure in emergency pump.

Table 3 presents the expected monetary values for each decision choice; in this case the use of a redundant pump is economically feasible. The results show how important it is to consider the probability of failure of all pumps used in the lubricating system even though, historically, those probabilities are rather small.

Given the importance of the emergency pump failure for the decision problem, a sensitivity analysis is made varying that pump's failure probability. The results, presented in Table 4, show that the higher the emergency pump reliability, the lower the feasibility of a back-up pump. The plant managers must carefully investigate the emergency pump maintenance policy in order to avoid the investment in the back-up pump.

Table 4
Sensitivity analysis for EMV ($).

Decision choice                              EMV ($)
Emergency pump failure probability = 0.01
  Keep System Design                         2118.16
  Install Back-up Pump                       2173.50
Emergency pump failure probability = 0.02
  Keep System Design                         2396.32
  Install Back-up Pump                       2202.02
Emergency pump failure probability = 0.03
  Keep System Design                         2674.48
  Install Back-up Pump                       2229.84
Emergency pump failure probability = 0.05
  Keep System Design                         3230.80
  Install Back-up Pump                       2282.47

If the maintenance policy of the main pump is not able to restore the equipment’s properties to the initial state, the main pump reliability will decrease after each preventive maintenance intervention.

The effect of imperfect maintenance on the reliability modeling is based on the consideration that there is a finite probability, p, that the maintenance action is carried out unsatisfactorily, in such a way that the faulty maintenance causes a system failure immediately thereafter. To take that effect into account, the reliability of the equipment after the maintenance procedure must be multiplied by the maintenance nonfailure probability, (1 − p), each time that maintenance is performed. For the case of preventive maintenance performed at intervals of length T, the reliability expression is [9]:

$$R_M(t) = R(T)^N (1 - p)^N R(t - NT), \qquad NT \le t < (N + 1)T \tag{14}$$
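Eq. (14) can be evaluated directly once a base reliability function is chosen. The sketch below assumes the exponential model of Eq. (12) with the reported failure rate; the function names are hypothetical, and the value p = 0.05 in the example is an illustrative assumption, not a figure from the paper.

```python
import math

FAILURE_RATE_PER_HOUR = 5.8e-5  # lambda reported with Eq. (12)


def base_reliability(t_hours, rate=FAILURE_RATE_PER_HOUR):
    """Exponential reliability R(t) = exp(-rate * t)."""
    return math.exp(-rate * t_hours)


def reliability_imperfect_pm(t_hours, interval_h, p_faulty,
                             rate=FAILURE_RATE_PER_HOUR):
    """Reliability under periodic, possibly faulty, preventive maintenance.

    Follows the form of Eq. (14):
    R_M(t) = R(T)**N * (1 - p)**N * R(t - N*T), with N*T <= t < (N + 1)*T.
    """
    if t_hours < 0 or interval_h <= 0:
        raise ValueError("time must be non-negative and interval positive")
    if not 0.0 <= p_faulty <= 1.0:
        raise ValueError("p_faulty must be a probability")
    n_done = int(t_hours // interval_h)  # N: maintenance actions completed so far
    return (base_reliability(interval_h, rate) ** n_done
            * (1.0 - p_faulty) ** n_done
            * base_reliability(t_hours - n_done * interval_h, rate))


# Illustrative values only: 1440 h interval from the text, p = 0.05 assumed.
for p in (0.0, 0.05):
    r = reliability_imperfect_pm(3000.0, 1440.0, p)
    print(f"p = {p:.2f}: R_M(3000 h) = {r:.4f}")
```

With a constant failure rate the term R(T)^N R(t − NT) collapses to R(t), so under this assumed model preventive maintenance brings no reliability gain and only the (1 − p)^N penalty remains. The trade-off discussed in the text appears once a wear-out distribution is used as the base model.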


The trade-off between the improved reliability from the replacement of wearing parts and the degradation that comes about because of maintenance error may now be considered, as shown in Fig. 14 [9].

A sensitivity analysis is made varying the main pump failure probability due to imperfect maintenance, using the decision tree presented in Fig. 13. The results of the analysis are presented in Fig. 15. The EMV associated with the ‘Keep System Design’ choice increases greatly with the increase in the main pump failure probability. The EMV associated with the ‘Install Back-up Pump’ choice does not increase greatly with the increase in the main pump failure probability. The effect of imperfect maintenance strengthens the decision for installing a back-up pump.

Fig. 15. Decision sensitivity analysis regarding effect of imperfect maintenance.

The study can also be applied to analyze possible changes in the oil piping or oil reservoir maintenance policy, since their failures can also cause damage to the turbine journal bearings. During five operational years those pieces of equipment presented no failure according to the power plant maintenance record. Based on that information it is possible to conclude that the failure probabilities of those pieces of equipment are smaller than the failure probability of the main oil pump. So the risk profiles associated with that equipment are different from the one observed for the main oil pump and can be considered less critical than the one calculated for the pump.
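As a numerical check on Section 3.3, the sketch below rolls back the decision tree using the probabilities and costs stated in the text. It reproduces both EMVs for the case without emergency pump failure (Table 3) and the ‘Keep System Design’ values of Tables 3 and 4. The branch layout of the ‘Install Back-up Pump’ choice in Fig. 13 is not described in the text, so that case is deliberately left out. All function names are hypothetical.

```python
P_MAIN_FAIL = 0.08               # main pump failure probability over 1440 h
P_BACKUP_FAIL = 0.10             # back-up pump failure probability on demand
COST_NORMAL = 1800.00            # normal operation for 1440 h
COST_SHUTDOWN = 2300.00          # shut down without bearing damage
COST_BEARING_DAMAGE = 350000.00  # shut down with bearing damage
BACKUP_PAYMENT = 183.70          # payment A per 1440 h period, Eq. (13)


def expected_value(branches):
    """Expected cost at a chance node given (probability, cost) pairs."""
    total = sum(p for p, _ in branches)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("branch probabilities must sum to 1")
    return sum(p * cost for p, cost in branches)


def emv_keep_design(p_emergency_fail=0.0):
    """EMV of keeping the design; emergency pump may fail on demand."""
    after_main_failure = expected_value([
        (1.0 - p_emergency_fail, COST_SHUTDOWN),
        (p_emergency_fail, COST_BEARING_DAMAGE),
    ])
    return expected_value([
        (P_MAIN_FAIL, after_main_failure),
        (1.0 - P_MAIN_FAIL, COST_NORMAL),
    ])


def emv_install_backup_reliable_emergency():
    """EMV of installing the back-up pump, emergency pump reliability = 1."""
    after_main_failure = expected_value([
        (1.0 - P_BACKUP_FAIL, COST_NORMAL + BACKUP_PAYMENT),
        (P_BACKUP_FAIL, COST_SHUTDOWN + BACKUP_PAYMENT),
    ])
    return expected_value([
        (P_MAIN_FAIL, after_main_failure),
        (1.0 - P_MAIN_FAIL, COST_NORMAL + BACKUP_PAYMENT),
    ])


print(f"Keep design, reliable emergency pump: {emv_keep_design():.2f}")
print(f"Install back-up, reliable emergency pump: "
      f"{emv_install_backup_reliable_emergency():.2f}")
for q in (0.01, 0.02, 0.03, 0.05, 0.08):
    print(f"Keep design, emergency failure probability {q:.2f}: "
          f"{emv_keep_design(q):.2f}")
```

Each chance node is written as a list of (probability, cost) pairs so that nested nodes are evaluated from the leaves back to the decision node, which mirrors the order of the hand calculation in the text.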

Fig. 14. The effect of imperfect preventive maintenance on reliability [9].

4. Conclusions

Maintenance is aimed at increasing the availability of any system, taking account of safety or environmental issues and optimizing total life cycle cost. Risk assessment integrates reliability analysis with safety and environmental issues. Risk-based maintenance attempts to answer four important questions related to failure-free operation of the system:

- What can cause the system to fail?
- How can it cause the system to fail?
- What would be the consequences if it fails?
- How probable is it to occur?

Knowing the answers to these four questions, it is safe to say that maintenance planning based on risk analysis is expected to provide cost-effective maintenance, which minimizes the consequences (related to safety, economics, and the environment) of a system failure.

Sometimes, depending on the system operation, the use of preventive or predictive maintenance practices for critical equipment will not reduce the probability of failure to an acceptable level. Then there is a need to redesign or modify the item. If the consequences of failure relate to safety or the environment, this redesign recommendation will normally be mandatory. For operational and economic consequences of failure it may be desirable, but a cost-benefit assessment has to be performed.

The paper presents a methodology for decision problems involving equipment maintenance policy selection or design changes (including overhaul and retrofitting) that can be applied in large power plants.

The method is based on risk and decision-making concepts, aiming at balancing the probability of occurrence of a given failure scenario and the consequences of that failure, expressed in monetary values.

Considering that the failure of a system is rarely the result of a single failure, the cause–consequence diagram is used to trace the consequences (or relevant accident scenarios) for the power plant operation associated with a given initiating event, characterizing the failure consequences analysis. The diagram allows evaluating the risk associated with each failure scenario.

Unlike the other risk-based analyses listed in the references, the method proposes the use of the decision tree to select the most suitable maintenance policy (or even design modification) to minimize the risk associated with the power plant operation. That method allows evaluating the trade-off between the costs associated with the implementation of a given maintenance policy and the risk associated with a given failure scenario, supporting the decision-maker with a structured decision method.

For power plants, the consequences associated with a failure scenario represent reduction of power plant performance (affecting plant output), loss of operational safety and environmental impact.

The paper illustrates the application of the method to a heavy-duty gas turbine lubricating system.

The main conclusions are:

i) The risk-based methodology is capable of dealing with the uncertainties associated with the occurrence of a failure scenario. Those uncertainties are modeled through the use of the cause–consequence diagram and reliability concepts;
ii) To use the present methodology, the power plant must have a structured database that registers the ‘time to failure’ of the plant equipment and the costs associated with the equipment repair, including those incurred to resolve safety and environmental degradation and to pay penalties associated with breach of commercial contracts. Those data support the reliability and cost analyses;
iii) The method, due to its complexity, should be applied to strategic decisions that may impact the power plant performance for the next three to five years;
iv) Regarding the example of application:
  1) The proposed risk-based method is very useful for identifying the failure scenarios and consequences associated with the main lubrication pump failure;
  2) The emergency lubrication pump failure has great importance for the study. The likelihood of emergency pump failure changes the decision regarding the installation of a redundant lubrication pump;
  3) The expected monetary value for the costs associated with the main pump failure, considering the existence of a redundant pump, is $2368.92. That value means that, in case the main pump fails, the power plant will lose on average $2368.92, even using the stand-by equipment.

The reliability and the failure costs presented for the gas turbine lubricating system reflect on-site behavior, including the effects of changes in the auxiliary system maintenance policy. Those estimates can be used for benchmarking, in order to compare the performance of the same model, or even different gas turbine models, operating at different sites.

Finally, the results of any risk-based analysis should be continuously updated as additional information regarding equipment reliability becomes available.

References

[1] Al-Mansour F, Kozuh M. Risk analysis for CHP decision making within the conditions of open electricity market. Energy 2007;32(12):1905–16.
[2] Eti M, Ogaji S, Probert S. Integrating reliability, availability, maintainability and supportability with risk analysis for improved operation of the AFAM thermal power-station. Applied Energy 2007;84(12):202–21.
[3] Frangopoulos C, Dimopoulos G. Effect of reliability considerations on the optimal synthesis, design and operational of a cogeneration system. Energy 2004;29(12):309–29.
[4] Khan F, Haddara M. Risk-based maintenance of ethylene oxide production facilities. Journal of Hazardous Materials 2004;3(12):147–59.
[5] Vatn J, Hokstad P, Bodsberg L. An overall model for maintenance optimization. Reliability Engineering and System Safety 1996;51(12):241–57.
[6] Dhillon B. Engineering maintenance, a modern approach. New York: CRC Press; 2002. p. 45–60.
[7] Modarres M. What every engineer should know about reliability and risk analysis. New York: Marcel Dekker; 1993. pp. 200–9; 289–90.
[8] Smith A, Hinchcliffe G. Reliability-centered maintenance: a gateway to world class maintenance. New York: Butterworth-Heinemann; 2003. pp. 54–6; 290–9.
[9] Lewis E. Introduction to reliability engineering. New York: John Wiley & Sons; 1987. p. 251–59.
[10] Reliability Analysis Center (RAC). Non-electric components reliability data. New York: Center for Reliability Assessment; 2002.
[11] Carazas F, Souza G. Availability analysis of gas turbines used in thermoelectric power plants. In: Proceedings of the 20th International Conference on Efficiency, Cost Optimization, Simulation and Environmental Impact of Energy Systems, Padova, Italy. 2007, p. 277–85.
[12] Andrews J, Ridley L. Reliability of sequential systems using the cause–consequence diagram method. Proceedings of the Institution of Mechanical Engineers. Part E. Journal of Process Mechanical Engineering 2001;vol. 215:207–20. ISSN 0954-4089.
