www.elsevier.com/locate/jlp
Abstract
The overall objective of the maintenance process is to increase the profitability of the operation and optimize the total life cycle
cost without compromising safety or environmental issues. Risk assessment integrates reliability with safety and environmental
issues and therefore can be used as a decision tool for preventive maintenance planning. Maintenance planning based on risk
analysis minimizes the probability of system failure and its consequences (related to safety, economics, and the environment). It helps
management in making correct decisions concerning investment in maintenance or related fields. This will, in turn, result in better
asset and capital utilization.
This paper presents a new methodology for risk-based maintenance. The proposed methodology is comprehensive and quantitative.
It comprises three main modules: risk estimation module, risk evaluation module, and maintenance planning module. Details of
the three modules are given. A case study, which exemplifies the application of the methodology to a heating, ventilation and air-conditioning
(HVAC) system, is also discussed.
© 2003 Elsevier Ltd. All rights reserved.
Keywords: Maintenance; Risk assessment; Risk-based maintenance; Risk-based inspection; Maintenance planning
0950-4230/$ - see front matter © 2003 Elsevier Ltd. All rights reserved.
doi:10.1016/j.jlp.2003.08.011
562 F.I. Khan, M.M. Haddara / Journal of Loss Prevention in the Process Industries 16 (2003) 561–573
Nomenclature
A system performance loss factor (dimensionless)
B financial loss factor (dimensionless)
C human health loss factor (dimensionless)
D environmental loss factor (dimensionless)
AR area under the damage radius (m²)
AD asset density in the vicinity of the event (up to ~500 m radius) ($/m²)
PDI population density in the vicinity of the event (up to ~500 m radius) (persons/m²)
IM importance factor, derived from Fig. 4 (dimensionless)
i number of events, fire, explosion, toxic release, etc.
t failure time (h)
h characteristic life of the component (scale parameter of the Weibull distribution) (h)
β slope or shape factor of Weibull distribution
F(t) failure probability function
PDF1 population distribution factor (dimensionless)
ment owned by Brunei Shell Petroleum (Hagemeijer and Kerkveld (1998)).
A risk-based approach has been applied successfully to the maintenance of oil pipelines. Dey, Ogunlana, Gupta, and Tabucanon (1998) discussed a simple risk-based model for the maintenance of a cross-country pipeline. Nessim and Stephens (1998) proposed a quantitative risk analysis model, and recently Dey (2001) described a more general model for risk-based inspection and maintenance of cross-country pipelines.
The use of a risk-based policy in the maintenance of medical devices has been tackled by Capuano and Koritko (1996) and Ridgway (2001).
The review of the literature indicates that there is a new trend to use the level of risk as a criterion to plan maintenance tasks. However, most of the previous studies focused on a particular equipment type. It seems that there is a need for a more generalized methodology that can be applied to all types of assets irrespective of their characteristics.
There is also a need for a more realistic quantification of risk factors. The quantitative description of risk is affected by the quality of the consequence study and the accuracy of the estimates of the probability of failure. This study will focus, among other things, on these two factors. It is hoped that this study will lead to a mathematical model that can be used to develop an optimum maintenance strategy.

1.1. Concept of risk and its relevance in maintenance

One of the main objectives of a sound maintenance strategy is the minimization of hazards, both to humans and to the environment, caused by the unexpected failure of the equipment. In addition, the strategy has to be cost effective. Using a risk-based approach ensures a strategy that meets these objectives. Such an approach uses information obtained from the study of failure modes and their economic consequences.
Risk analysis is a technique for identifying, characterizing, quantifying, and evaluating the loss from an event. The risk analysis approach integrates probability and consequence analysis at various stages of the analysis and attempts to answer the following questions:

• What can go wrong that could lead to a system failure?
• How can it go wrong?
• How likely is its occurrence?
• What would be the consequences if it happens?

In this context, risk can be defined qualitatively/quantitatively as the following set of duplets for a particular failure scenario.

Risk = probability of failure × consequence of the failure

Risk assessment can be quantitative or qualitative. The output of a quantitative risk assessment will typically be a number, such as cost impact ($) per unit time. The number could be used to prioritize a series of items that have been risk assessed. Quantitative risk assessment requires a great deal of data both for the assessment of probabilities and assessment of consequences. Fault trees or decision trees are often used to determine the probability that a certain sequence of events will result in a certain consequence. Qualitative risk assessment is less rigorous and the results are often shown in the form of a simple risk matrix where one axis of the matrix represents the probability and the other represents the consequences. If a value is given to each of the probability and the consequence, a relative value for risk can be calculated. It is important to recognize that the qualitative risk value is a relative number that has little meaning outside the framework of the matrix. Within the framework of the matrix, it provides a natural prioritization of items assessed using the matrix. However, as these risk values are subjective, prioritizations based on these values are always debatable.
The proposed risk-based maintenance (RBM) strategy aims at reducing the overall risk of failure of the operating facilities. In areas of high and medium risk, a focused maintenance effort is required, whereas in areas of low risk, the effort is minimized to reduce the total scope of work and cost of the maintenance program in a structured and justifiable way. The quantitative value of the risk is used to prioritize inspection and maintenance activities. RBM suggests a set of recommendations on how many preventive tasks (including the type, means, and timing) are to be performed. The implementation of RBM will reduce the likelihood of an unexpected failure. A detailed description of the methodology is presented in subsequent sections.

(2) risk evaluation, which consists of risk aversion and risk acceptance analysis, and
(3) maintenance planning considering risk factors.

3. Module I: risk estimation

This module comprises four steps, which are logically linked as shown in Fig. 2. A detailed description of each step is presented below.

3.1. Step I.1. Failure scenario development

A failure scenario is a description of a series of events which may lead to a system failure. It may contain a single event or a combination of sequential events. Usually, a system failure occurs as a result of an interacting sequence of events. The expectation of a scenario does not mean it will indeed occur, but that there is a reasonable probability that it would occur. A failure scenario
is the basis of the risk study; it tells us what may happen so that we can devise ways and means of preventing or minimizing the possibility of its occurrence. Such scenarios are generated based on the operational characteristics of the system; the physical conditions under which operation occurs; the geometry of the system; and safety arrangements, etc. Recently, Khan (2001) has proposed a systematic procedure, maximum credible accident scenario (MCAS), to evaluate failure (accident) scenarios in a process system. The procedure introduces the concept of maximum credible scenarios as an alternative to the current methodology based on the worst-case scenario as recommended by many regulatory agencies. The developed failure scenarios are then screened to short-list the ones that are more relevant to the system at hand. MCAS provides the criteria to form this short list.

3.2. Step I.2. Consequence assessment

The objective here is to prioritize equipment and their components on the basis of their contribution to a system failure. For example, in the case of a pressure containment, a pinhole leak on a process line may not lead to a total loss of production. This is in contrast to a failure of a pipe valve, which may cause a shutdown of the line. Consequence analysis involves assessment of likely consequences if a failure scenario does materialize. Initially, consequences are quantified in terms of damage radii (the radius of the area in which the damage would readily occur), damage to property (shattering of window panes, caving of buildings) and toxic effects (chronic/acute toxicity, mortality). The calculated damage radii are later used to assess the effect on human health, and environmental and production losses. Fig. 3 illustrates the procedure for this step. The assessment of consequences involves a wide variety of mathematical models. For example, source models are used to predict the rate of release of hazardous material, the degree of flashing, and the rate of evaporation. The models for explosions and fires are used to predict the characteristics of explosions and fires. The impact intensity models are used to predict the damage zones due to fires, explosion and toxic load. Lastly, toxic gas models are used to predict human response to different levels of exposures to toxic chemicals. There are many tools available to conduct this analysis such as WHAZAN, MAXCRED, RISKIT, etc. (Khan & Abbasi, 1999a). MAXCRED is one of the recent tools that is built upon the latest models of fires, explosions and toxic release and dispersion (Khan & Abbasi, 1999a). The total consequence assessment is a combination of four major categories as described below.

3.2.1. System performance loss
Factor A accounts for the system's performance loss due to component/unit failure. This is estimated semi-qualitatively based on the expert's opinion. In this work, the following procedure is suggested for determining the value of this parameter:

$$A_i = \text{function}(\text{performance}) \qquad (1)$$

Details of the function are given in Table 1.

3.2.2. Financial loss
Factor B accounts for the damage to the property or assets and may be estimated for each accident scenario using the following relations:

$$B_i = (AR)_i \times (AD)_i / UFL \qquad (2)$$

$$B = \sum_{i=1}^{n} B_i \qquad (3)$$

where i denotes the number of events, i.e. fire, explosion, toxic release, etc. The UFL in Eq. (2) signifies the level of an unacceptable loss. In the present study, we will use a value of 1000 for UFL. This value is subjective and may change from case to case as per an organization's criterion.
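As a minimal sketch of Eqs. (2) and (3), the financial loss factor can be computed by summing the per-event contributions. The function name `financial_loss_factor` and the input values below are hypothetical and chosen only for illustration; the default UFL of 1000 is the value stated in the text.

```python
from typing import Iterable, Tuple

UFL = 1000.0  # level of unacceptable loss used in the paper (subjective value)


def financial_loss_factor(
    events: Iterable[Tuple[float, float]], ufl: float = UFL
) -> float:
    """Return B = sum_i (AR_i * AD_i / UFL), following Eqs. (2) and (3).

    Each event is a pair (AR, AD): damage area in m^2 and asset
    density in $/m^2.
    """
    return sum(area * density / ufl for area, density in events)


# Hypothetical inputs, not taken from the paper: two events (e.g. fire, explosion)
example_events = [(200.0, 10.0), (50.0, 4.0)]
B = financial_loss_factor(example_events)
print(B)
```

Passing `ufl` explicitly lets an organization substitute its own criterion, as the text allows.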
$$C = \sum_{i=1}^{n} C_i \qquad (6)$$
Table 1
Quantification scheme for system performance function used in Eq. (1)
radius), the factor is assigned a value of 1; if the population is localized and away from the point of accident the lowest value 0.2 is assigned. Values for this parameter have been adapted from the latest work of Hirst and Carter (2000).

3.2.4. Environment and/or ecological loss
The factor D signifies damage to the ecosystem, which can be estimated as:

$$D_i = (AR)_i \times (IM)_i / UDA \qquad (7)$$

$$D = \sum_{i=1}^{n} D_i \qquad (8)$$

UDA indicates a level for the unacceptable damaging area; the suggested value for this parameter is 1000 m² (a subjective value that may change from case to case). IM denotes the importance factor. IM is unity if the damage radius is higher than the distance between an accident and the location of the ecosystem. This parameter is quantified using Fig. 4, see Khan and Abbasi (1997).
Finally, the four factors A, B, C and D are combined to yield the factor Con.

$$Con = [0.25A^2 + 0.25B^2 + 0.25C^2 + 0.25D^2]^{0.5} \qquad (9)$$

Fig. 4. Quantification of importance factor (IM).

3.3. Step I.3. Probabilistic failure analysis

Probabilistic failure analysis is conducted using fault tree analysis (FTA). The use of FTA, together with components' failure data and human reliability data, enables the determination of the frequency of occurrence of an accident. Developing probabilistic fault trees is made easier using a methodology called "analytical simulation", see Khan and Abbasi (2000).
The key features of this step are:

(1) Fault tree development: The top event is identified based on the detailed study of the process, control arrangement, and behavior of components of the unit/plant. A logical dependency between the causes leading to the top event (failure) is developed.
(2) Boolean matrix creation: The fault tree developed is transformed to a Boolean matrix. If the dimension of the Boolean matrix is too large to be handled by the available computer, a structural moduling technique may be applied (Shafaghi, 1988; Yllera, 1988). This technique proposes moduling of the fault tree into a number of smaller submodules with dependency relations among them. This reduces the memory allocation problem and makes the computation faster.
(3) Finding of minimum cutsets and optimization: Minimum cutsets are determined from the Boolean matrix (Greenberg & Slater, 1992). If the problem has been structurally moduled, then each module is solved independently, and the results are combined. The minimum cutsets are then optimized using an appropriate technique. Optimization is necessary in order to eliminate the unimportant paths (cutsets).
(4) Probability analysis: The optimized minimum cutsets are used to estimate probabilities. The present authors recommend the use of the Monte–Carlo simulation method (Rauzy, 1993; Soon, Joo, & Myung, 1985) for this purpose. The simulation methods not
only give the probability of the top event but they also provide information on the sensitivity of the results. In addition, simulation is helpful in studying the impact of each of the initiating events. To increase the accuracy of the computations and reduce the margin of error due to inaccuracies involved in the reliability data of the basic events (initiating events), we recommend the use of a fuzzy probability set (Dubois & Prade, 1980; Lai, Shenoi, & Fan, 1988; Noma, Tankara, & Asai, 1981; Tanaka, Fan, Lai, & Toguchi, 1983). Fuzzy probability set theory is used in the analytical simulation algorithm and coded in the PROFAT software (Khan & Abbasi, 1999b).
(5) Improvement index estimation: The improvement index provides a measure of the impact of each root cause on the final failure event. Improvement indices are estimated using the simulation results. To estimate the impact of a root cause, the simulation is carried out twice: with and without the cause. The "improvement index" is then obtained as a measure of the change in the probability of occurrence of the final event.

3.4. Step I.4. Risk estimation

The results of the consequence and the probabilistic failure analyses are then used to estimate the risk that may result from the failure of each unit. In the next module, we will show how the estimated risk is evaluated against acceptance criteria.

estimated risk exceeds the acceptance criteria are identified. These are the units that should have an improved maintenance plan.

5. Module III: maintenance planning

Units whose level of estimated risk exceeds the acceptance criteria are studied in detail with the objective of reducing the level of risk through a better maintenance plan. The details of this analysis are given below, see Fig. 6.

5.1. Step III.1. Estimation of optimal maintenance duration

The individual failure causes are studied to determine which one affects the probability of failure adversely. A reverse fault analysis is carried out to determine the required value of the probability of failure of the root event. A maintenance plan is then completed.

5.2. Step III.2. Re-estimation and re-evaluation of risk

The last step in this methodology aims at verifying that the maintenance plan developed produces an acceptable total risk level for the system.

6. Case study: a maintenance plan for an HVAC system
operation. The process flow diagram of the HVAC is shown in Fig. 7.

6.2. Risk estimation

6.2.1. Failure scenarios
The complete HVAC system has been divided into 10 different functional units according to their operational characteristics (Table 2). The two most probable failure scenarios have been developed for most of the units and are listed in Table 2. These failure scenarios have been subjected to consequence analysis.

6.2.2. Consequence analysis
Consequence analysis has been carried out for envisaged failure scenarios for each of the 10 units. The operation of an HVAC system does not involve the processing of chemicals and the effect of its stoppage cannot be measured in terms of the lost production. Thus, the consequences related to financial and to human health loss have been ignored. The focus in this study is on the consequences related to system performance and the effect on the environment. The consequences for these two major classes are combined by applying Eq. (9). They are then normalized on a scale of 1–10 for each failure scenario. The results of consequence analysis for envisaged failure scenarios of the different units of the system are presented in Table 3. It is evident from Table 3 that the highest impact on the system performance results from three units: the air supply fan, the EP relay, and the freeze protection system.

6.2.3. Probabilistic failure analysis
An analysis to determine the failure probability distribution for each unit is listed in Table 3. For units having more than one failure scenario, the scenarios that have the maximum consequences are selected for subsequent analysis.
Memorial University facility management division has been maintaining performance data of all of the components of the HVAC system. In the present study, 5 year data have been used (1993–1998) (Wong, 2000). These data have been used to verify various failure functions (distributions) and it has been observed that the two-parameter Weibull distribution describes them best (Eq. (10)).

$$F(t) = 1 - \exp\left\{-\left(\frac{t}{h}\right)^{\beta}\right\} \qquad (10)$$

The two parameters, h and β, are estimated for the dif-
Table 2
Units in a typical HVAC system
Table 3
Results of consequence analysis for different accident scenarios
Table 4
Results of risk estimation module; units in italics exceed the acceptance level
a Failure data for these units were not available, as they never failed in operation; the failure probability for these units is adopted from the literature (Lees, 1996).
Table 5
Details of air supply fan failure
Component number used in Fig. 9 | Unit name | h | β | Probability of failure in 1 year | Risk factor
environmental issues. Risk-based maintenance attempts to answer five important questions related to integrity and fault free operation of the system:

• What can cause the system to fail?
• How can it cause the system to fail?
• What would be the consequences if it fails?
• How probable is it to occur?
• How frequent an inspection/maintenance of what components would avert such failure?
Fig. 9. Fault tree for air supply fan failure scenario; numbers are explained in Table 5.

A.1. Failure scenarios
Table 6
Results of optimal maintenance duration computations
Unit name | Optimal maintenance duration (days) | Revised frequency of failure (year⁻¹) | Un-revised risk factor | Revised risk factor
No environmental and ecological loss, D = 0
$$Con = [0.25 \times 10^2 + 0.25 \times 6^2]^{0.5} = 5.83 \approx 6$$

A.2.2. Scenario 2

Significant loss of system performance, A = 8
No financial loss, B = 0
Moderately serious human health effects, C = 4
No environmental and ecological loss, D = 0
$$Con = [0.25 \times 8^2 + 0.25 \times 4^2]^{0.5} = 4.47 \approx 4$$

Final consequence results = maximum of 4 and 6 = 6

A.3. Probabilistic failure analysis

Failure probability in 1 year of operation
$$\text{Failure probability} = 1 - e^{-(t/h)^{\beta}} = 1 - e^{-(365 \times 24/51966.6)^{3.85}} = 1.05 \times 10^{-3}$$

A.4. Risk estimation

Risk factor due to damper motor = 6 × 1.05 × 10⁻³ = 6.3 × 10⁻³
Total calculated risk of the HVAC system = 1.01

A.5. Risk evaluation and maintenance planning

HVAC target risk = 2.2 × 10⁻²
Target risk calculated for damper motor based on HVAC target risk and reverse fault tree analysis = 2.1 × 10⁻⁵
Based on target risk, preventive maintenance time, 132 days.

References

Aller, J. E., Horowitz, N. C., Reynolds, J. T., & Weber, B. J. (1995). Risk based inspection for petrochemical industry. In Risk and safety assessment where is the balance? New York: American Society of Mechanical Engineers.
API (1995). Base resource document on risk based inspection for API committee on refinery equipment. Washington, DC: American Petroleum Institute.
ASME (1991). Research task force on risk based inspection guidelines, risk based inspection development of guidelines. In General document CRTD 20-1. Washington, DC: American Society of Mechanical Engineers.
Capuano, M., & Koritko, S. (1996). Risk oriented maintenance. Biomedical Instrumentation and Technology, January/February, 25–37.
Chen, L. N., & Toyoda, J. (1990). Maintenance scheduling based on two level hierarchical structure to equalize incremental risk. IEEE Transactions on Power Systems, 5(4), 1510–1561.
Dey, P. M. (2001). A risk-based model for inspection and maintenance of cross-country petroleum pipeline. Journal of Quality in Maintenance Engineering, 7(1), 25–41.
Dey, K. P., Ogunlana, S. O., Gupta, S. S., & Tabucanon, M. T. (1998). A risk-based maintenance model for cross-country pipelines. Cost Engineering, 40(4), 24–31.
Dubois, D., & Prade, H. (1980). Fuzzy sets and systems: Theory and applications. New York: Academic Press.
Greenberg, H. R., & Slater, B. B. (1992). Fault tree and event tree analysis. New York: Van Nostrand Reinhold.
Hagemeijer, P. M., & Kerkveld, G. (1998). A methodology for risk-based inspection of pressurized systems. Proceedings of the Institute of Mechanical Engineers, Part E, 212, 37–47.
Hirst, I. L., & Carter, D. A. (2000). A "Worst Case" methodology for risk assessment of major accident installations. Process Safety Progress, 19(2), 78–82.
Khan, F. I. (2001). Maximum credible accident scenario for realistic and reliable risk assessment. Chemical Engineering Progress, November, 55–67.
Khan, F. I., & Abbasi, S. A. (1997). Accident hazard index: A multi-attribute scheme for process industry hazard rating. Institution of Chemical Engineers (IChemE) of UK (Environmental Protection and Safety), IChemE, UK, 75B, 217.
Khan, F. I., & Abbasi, S. A. (1998). Safe maintenance practice. Chemical Industry Digest, March, 91–105.
Khan, F. I., & Abbasi, S. A. (1999a). MAXCRED: a new software package for rapid risk assessment in chemical process industries. Environment Modeling and Software, 14, 11–25.
Khan, F. I., & Abbasi, S. A. (1999b). PROFAT: A user-friendly system for probabilistic fault tree analysis. Process Safety Progress, 18(1), 42–49.
Khan, F. I., & Abbasi, S. A. (2000). Analytical simulation and PROFAT II: A new methodology and a computer automated tool for fault tree analysis in chemical process industries. Journal of Hazardous Materials, 75, 1–27.
Kletz, T. A. (1994). What went wrong. Houston, TX: Gulf Publication House.
Kumar, U. (1998). Maintenance strategies for mechanized and automated mining systems: a reliability and risk analysis based approach. Journal of Mines, Metals and Fuels, Annual review, 343–347.
Lai, F. S., Shenoi, S., & Fan, L. T. (1988). Fuzzy fault tree analysis theory and applications. In Kandel, & Avni (Eds.), Engineering risk and hazard assessment, vol. 1 (pp. 139–167). Florida: CRC Press Inc.
Lees, F. P. (1996). Loss prevention in chemical process industries, vol. 1. London: Butterworths.
Nessim, M., & Stephens, M. (1998). Quantitative risk-analysis model guides maintenance budgeting. Pipe Line and Gas Industry, 81(6), 1–33.
Noma, K., Tankara, H., & Asai, K. (1981). Fault tree analysis with fuzzy probability. Journal of Ergonomics, 17, 291–297.
Rauzy, A. (1993). New algorithms for fault tree analysis. Reliability Engineering and System Safety, 40, 203–211.
Reynolds, J. T. (1995). Risk based inspection improves safety of pressure equipment. Oil and Gas Journal, special 16 January issue.
Ridgway, M. (2001). Classifying medical devices according to their maintenance sensitivity: A practical, risk-based approach to PM program management. Biomedical Instrumentation and Technology, May/June, 167–176.
Shafaghi, A. (1988). Structure modeling of process systems for risk and reliability analysis. In Kandel, & Avni (Eds.), Engineering risk and hazard assessment, vol. 2 (pp. 45–64). Florida: CRC Press Inc.
Soon, H. C., Joo, Y. P., & Myung, K. K. (1985). The Monte–Carlo method without sorting for uncertainty propagation analysis in PRA. Reliability Engineering, 10, 233.
Tanaka, H., Fan, L. T., Lai, F. S., & Toguchi, K. (1983). Fault tree analysis by fuzzy probability. IEEE Transactions on Reliability, R-32, 453–456.
Wong, D. (2000). A knowledge-based decision support system in reliability-centered maintenance of HVAC systems. PhD thesis, Memorial University of Newfoundland, St. John's, Canada.
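
Supplementary sketch. The appendix arithmetic (A.2 to A.4) can be cross-checked with a few lines of Python. This is a minimal sketch, not the authors' software: the function names `consequence` and `weibull_failure_probability` are hypothetical, and the inputs are the values quoted in the appendix (Scenario 2 factors, h = 51966.6 h, shape factor 3.85, final consequence 6).

```python
import math


def consequence(a: float, b: float, c: float, d: float) -> float:
    """Eq. (9): square root of the equally weighted sum of squared loss factors."""
    return math.sqrt(0.25 * (a**2 + b**2 + c**2 + d**2))


def weibull_failure_probability(t: float, h: float, beta: float) -> float:
    """Eq. (10): F(t) = 1 - exp(-(t/h)^beta), with t and h in hours."""
    return 1.0 - math.exp(-((t / h) ** beta))


# A.2.2, Scenario 2: A = 8, B = 0, C = 4, D = 0
con_2 = consequence(8, 0, 4, 0)  # about 4.47

# A.3: one year of operation expressed in hours
p_fail = weibull_failure_probability(365 * 24, 51966.6, 3.85)  # about 1.05e-3

# A.4: risk factor = final consequence (6) x failure probability
risk = 6 * p_fail  # about 6.3e-3

print(round(con_2, 2), f"{p_fail:.2e}", f"{risk:.1e}")
```

The recomputed values agree with the figures reported in the appendix to the precision given there.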