
The current issue and full text archive of this journal is available at www.emeraldinsight.com/1355-2511.htm

A structured approach for the assessment of system availability and reliability using Monte Carlo simulation
Adolfo Crespo Marquez
Department of Industrial Management, School of Engineering, University of Seville, Sevilla, Spain, and

Monte Carlo simulation

125

Benoît Iung
Faculté des Sciences, Université Henri Poincaré, Vandoeuvre-lès-Nancy, France
Abstract
Purpose: This paper proposes a method to model and assess the availability and reliability of a system when numerous factors, such as system complexity, a wide range of failure modes, environment and sustainability, may influence system behaviour.
Design/methodology/approach: The approach to the reliability/availability study uses continuous time stochastic simulation (Monte Carlo simulation) and is based on seven steps covering the logical phases from system description to discussion of the simulation results. The feasibility and benefits of this approach are shown in a case study on a cogeneration plant.
Findings: Owing to the factors influencing system behaviour, the opportunity to carry out system availability/reliability assessment through analytical models is often very restricted. A general approach to this problem is therefore proposed, based on Monte Carlo (stochastic) simulation. The system's life process is simulated on a computer, and estimates are made for the desired measures of performance. The simulation is then treated as a series of real experiments, and statistical inference is used to estimate confidence intervals for the performance metrics.
Practical implications: Individuals, companies and society in general are becoming more and more dependent on increasingly complex technical systems. Moreover, failure of these complex systems often causes a major loss of service with potentially serious consequences (i.e. critical risk). Thus their dependability, with facets such as reliability, availability and safety, has become an important issue. For example, the ability to assess the reliability/availability of such systems is invaluable in industrial domains, where it is used for various purposes such as maintenance strategy selection, maintenance planning, production planning, and risk and cost evaluations. Faced with this complexity, the existing analytical models are not well adapted to system modelling and assessment, mainly because they rest on assumptions that are difficult to validate. This paper addresses the issue by proposing a generic approach based on Monte Carlo (stochastic) simulation.
Originality/value: The Monte Carlo simulation method allows one to consider various relevant aspects of system operation that cannot be easily captured by analytical models. The use of this method is growing for the assessment of overall plant availability and the monetary value of plant operation.
Keywords: Monte Carlo simulation, Assessment
Paper type: Research paper
Journal of Quality in Maintenance Engineering, Vol. 13 No. 2, 2007, pp. 125-136. © Emerald Group Publishing Limited, 1355-2511. DOI 10.1108/13552510710753032

1. Introduction
The Monte Carlo simulation method allows us (Marseguerra and Zio, 2000) to consider various relevant aspects of system operation that cannot be easily captured by analytical models (K-out-of-N, redundancies, stand-by nodes, etc.). The use of this method is growing for the assessment of overall plant availability (Taylor et al., 2000) and the monetary value of plant operation (Marseguerra and Zio, 2000).
In this paper we explain the use of continuous time simulation techniques to produce Monte Carlo simulation models that can help us to assess system reliability and availability. This continuous time simulation model evaluates the system state at every constant time interval ($\Delta t$); the new system state is then recorded and statistics are collected (Ventana Systems Inc., 2006). Time is then incremented by another $\Delta t$, and so on. As a simulation tool we use VENSIM (Pidd, 2003), which has special features to ease Monte Carlo type simulation experiments and to provide confidence interval estimations.
The weak point of the Monte Carlo method is the computing time (Marseguerra and Zio, 2000), especially when we deal with the problem of finding suitable maintenance control policies and the search space for the control variables to be tested increases. However, assessing the availability of predetermined maintenance strategies is normally easy to do and not time-consuming. In these cases, randomness can be constrained to the failure generation process, while the maintenance policy is predetermined by the plant manufacturer or the equipment user (whoever is interested in the study). In our model, pseudo-random numbers are generated at every time interval; therefore, when considering the entire simulation horizon, the required number of simulation runs is not expected to be very demanding, as we will show in the paper.
The rest of the paper is organised as follows: section 2 is dedicated to the continuous time Monte Carlo modelling of a system's maintenance. Section 3 presents a method to use this model for availability and reliability assessment.
Section 4 presents a case study, and section 5 presents the conclusions. Finally, section 6 contains the literature references used in this study.

2. A sample continuous time model of system's maintenance
A generic system's maintenance model is now presented, following basic principles explained in Crespo Marquez et al. (2003). The notation that we will use for the modelling of each functional block is as follows.

System status information related variables:
$A_t$: System availability (1 available, 0 unavailable) at $t$.
$CA_t$: Decrease in system's age due to corrective maintenance action in $t$.
$LC_t$: Time when the last corrective maintenance, for a system in $t$, started.
$NCTIn_t$: Input of a new corrective time in period $t$.
$LP_t$: Time when the last preventive maintenance, for a system in $t$, started.
$NPTIn_t$: Input of a new preventive time in period $t$.
$PA_t$: Decrease in system's age due to preventive maintenance action in $t$.
$OCTOut_t$: Output of the last corrective time in period $t$.
$OPTOut_t$: Output of the last preventive time in period $t$.
$RN_t$: Random number, interval (0,1), generated in $t$.
$T_t$: System's age in $t$.
$TI_t$: Increase of system's age in period $t$.
$TO_t$: Decrease of system's age in period $t$.

Model parameters:
$\lambda(T_t)$: Failure rate of the system in $t$.
$CT$: Average time of a corrective maintenance action.
$n$: Minimum age of the system to do preventive maintenance actions.
$PT$: Average time of a preventive maintenance action.
$T1$: Maximum time the system operates without a failure.

The process first requires modelling the age of the system ($T_t$):

$$T_t = T_{t-1} + TI_t - TO_t \tag{1}$$

Let us consider that the age of the system increases when the system is available (a system that is neither idling nor standing by). We can then define this age increase as 1 when the system is available and 0 when it is unavailable, which can be expressed as:

$$TI_t = A_t \tag{2}$$

Age decreases when the system is maintained. If we assume that preventive and corrective maintenance restore the equipment to an as-good-as-new condition, then we can express the decrease in system age as:

$$TO_t = \begin{cases} PA_t & \text{if } PA_t \neq 0 \text{ and } CA_t \neq 0 \\ PA_t + CA_t & \text{otherwise} \end{cases} \tag{3}$$

with

$$CA_t = \begin{cases} T_t & \text{if } \lambda(T_t) \geq RN_t \\ 0 & \text{otherwise} \end{cases} \tag{4}$$

where $RN_t$ is a random number generated for every $t$ within the range (0,1), $\lambda(T_t)$ is the failure rate of the system, and $CA_t$ and $PA_t$ are the decreases in the system's age (in time units) as a consequence of the corrective and preventive maintenance actions, respectively.
Once we have modelled the age of the system we can model, for instance, an age-based maintenance policy (ABMP). In the ABMP the system is preventively maintained when it reaches a given age $n$, and it is correctively maintained when a failure takes place, at the failure time:

$$PA_t = \begin{cases} T_t & \text{if } T_t \geq n \\ 0 & \text{otherwise} \end{cases} \tag{5}$$

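Notice that the conditions that make the system unavailable are the corrective or preventive maintenance actions being carried out:

$$A_t = \begin{cases} 1 - [\mathrm{Pulse}(LC_t, CT, t) + \mathrm{Pulse}(LP_t, PT, t)] & \text{if } LC_t > 0 \text{ or } LP_t > 0 \\ 1 & \text{otherwise} \end{cases} \tag{6}$$

The function Pulse, introduced above to calculate $A_t$, is defined as follows:

$$\mathrm{Pulse}(a, b, t) = \begin{cases} 1 & \text{if } a < t < a + b \\ 0 & \text{otherwise} \end{cases} \tag{7}$$

$LC_t$ and $LP_t$ are the times when the last corrective (or preventive, respectively) maintenance, for a system in $t$, started. Notice that when $t = 0$, $LC_t = LP_t = 0$. When $t > 0$ these variables can be modelled as follows:

$$LC_t = LC_{t-1} + NCTIn_t - OCTOut_t \tag{8}$$

with

$$NCTIn_t = \begin{cases} t & \text{if } CA_t > 0 \\ 0 & \text{otherwise} \end{cases} \tag{9}$$

$$OCTOut_t = \begin{cases} t & \text{if } CA_t > 0 \text{ and } t > LC_t \\ 0 & \text{otherwise} \end{cases} \tag{10}$$

Similarly,

$$LP_t = LP_{t-1} + NPTIn_t - OPTOut_t \tag{11}$$

with

$$NPTIn_t = \begin{cases} t & \text{if } PA_t > 0 \\ 0 & \text{otherwise} \end{cases} \tag{12}$$

$$OPTOut_t = \begin{cases} t & \text{if } PA_t > 0 \text{ and } t > LP_t \\ 0 & \text{otherwise} \end{cases} \tag{13}$$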
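As an illustration only, the age and availability logic of equations (1)-(7) can be sketched in a general-purpose language. The Python sketch below is not the VENSIM model used in this paper. It assumes a unit time step, draws failures only while the unit is available, checks for failure and preventive maintenance after the age increase of each step, clamps overlapping pulses so that availability stays in {0, 1}, and stores the start times of the last corrective and preventive actions directly rather than through the stock equations (8)-(13). The names `pulse` and `simulate_abmp` are hypothetical.

```python
import random


def pulse(a, b, t):
    """Equation (7): 1 while a < t < a + b, otherwise 0."""
    return 1 if a < t < a + b else 0


def simulate_abmp(horizon, failure_rate, n, ct, pt, seed=0):
    """Simulate one unit under an age-based maintenance policy (time step = 1).

    failure_rate -- callable giving lambda(T) for a system age T
    n            -- age at which preventive maintenance is triggered
    ct, pt       -- durations of corrective / preventive maintenance
    Returns the list of availability flags A_t for t = 1..horizon.
    """
    rng = random.Random(seed)
    age = 0       # T_t
    lc = lp = 0   # start time of the last corrective / preventive action
    availability = []
    for t in range(1, horizon + 1):
        # Equation (6): the unit is down while a maintenance pulse is active.
        # Assumption: overlapping pulses are clamped so A_t stays in {0, 1}.
        down = (lc > 0 and pulse(lc, ct, t)) or (lp > 0 and pulse(lp, pt, t))
        a_t = 0 if down else 1
        availability.append(a_t)
        if a_t == 0:
            continue  # Assumption: no ageing and no failures while down.
        age += a_t                                             # equations (1)-(2)
        ca = age if failure_rate(age) >= rng.random() else 0   # equation (4)
        pa = age if age >= n else 0                            # equation (5)
        to = pa if (pa != 0 and ca != 0) else pa + ca          # equation (3)
        age -= to                                              # equation (1)
        if ca > 0:
            lc = t   # a corrective action starts now
        if pa > 0:
            lp = t   # a preventive action starts now
    return availability
```

With a unit time step, the strict inequalities in equation (7) mean that a pulse of length b keeps the unit down only at the integer times strictly between a and a + b. A call such as `sum(simulate_abmp(6205, lambda age: 0.006, n=365, ct=3, pt=2)) / 6205` would give the fraction of available steps for one replication; these parameter values are arbitrary and are not taken from the case study.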

The previous equations can be represented graphically by building a stock and flow diagram with the VENSIM software (Figure 1).

Figure 1. Stock and flow diagram for the system's age modelling

Equipment preventive maintenance time is predetermined. Sometimes, however, even though that time arrives, it is convenient for the equipment to remain in operation for a certain period. For example, consider a pumping unit with two pumps in parallel. Imagine that one of the two pumps fails and that, while the fault is being repaired, the second pump has a preventive maintenance activity scheduled. We would probably wait to carry out the preventive maintenance until the repair of the faulty unit is finished. When modelling maintenance we have to consider these cases, in which there is a backlog of maintenance activities, i.e. activities which are due and waiting to be carried out by the maintenance department.
In order to model situations where a maintenance backlog exists, we will use the following additional notation in our formal model:
$AA_t$: All sub-systems available (1 yes, 0 no) at $t$.
$SM_{t,i}$: Scheduled maintenance for unit $i$ (1 yes, 0 no) in period $t$.
$MB_{t,i}$: Maintenance backlogged for unit $i$ (1 yes, 0 no) at $t$.
$RM_{t,i}$: Maintenance released for unit $i$ (1 yes, 0 no) in period $t$.
Suppose, for instance, that we have a system with two units ($i = 1, 2$), and that we require both of them to be in good operating condition in order to preventively maintain one of them, i.e. $AA_t = 1$, where $AA_t$ is defined in equation (14) as follows (Figure 2):
$$AA_t = \prod_{i=1}^{2} A_{t,i}, \quad \text{with } i = 1, 2 \tag{14}$$

Figure 2. Stock and flow diagram for the backlog of maintenance activities

Maintenance activities are scheduled according to (15), may then be backlogged according to (16), and are finally released as explained in (17). Notice that if both units are OK (i.e. $AA_t = 1$), a scheduled maintenance is immediately released, just in time, without being backlogged. Then, when a preventive activity is released, we record this time (in $LP_t$) to allow downtime modelling as explained previously in (6):

$$SM_{t,i} = \begin{cases} 1 & \text{if } t_i/n = \mathrm{Int}(t_i/n) \text{ and } t_i > 0 \\ 0 & \text{otherwise} \end{cases} \tag{15}$$

$$MB_{t,i} = MB_{t-1,i} + SM_{t,i} - RM_{t,i} \tag{16}$$

$$RM_{t,i} = \begin{cases} 1 & \text{if } (SM_{t,i} = 1 \text{ or } MB_{t,i} = 1) \text{ and } AA_t = 1 \\ 0 & \text{otherwise} \end{cases} \tag{17}$$
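The scheduling, backlog and release rules in (14)-(17) can be sketched as three small functions. This is a minimal Python illustration under stated assumptions, not the VENSIM formulation: the release test uses the backlog carried over from the previous period ($MB_{t-1,i}$) to avoid the circular reference between (16) and (17), and $t_i$ and $n$ are taken to be integers so that the test $t_i/n = \mathrm{Int}(t_i/n)$ becomes a divisibility check. The function names are hypothetical.

```python
import math


def all_available(unit_availability):
    """Equation (14): AA_t is the product of the unit availabilities A_{t,i}."""
    return math.prod(unit_availability)


def is_scheduled(t_i, n):
    """Equation (15): maintenance falls due when t_i is a positive multiple of n."""
    return 1 if t_i > 0 and t_i % n == 0 else 0


def backlog_step(scheduled, backlog, aa):
    """One period of equations (16)-(17) for a single unit.

    scheduled -- SM_{t,i}
    backlog   -- MB_{t-1,i} (assumption: previous-period backlog)
    aa        -- AA_t
    Returns (RM_{t,i}, MB_{t,i}).
    """
    released = 1 if (scheduled == 1 or backlog == 1) and aa == 1 else 0
    new_backlog = backlog + scheduled - released
    return released, new_backlog
```

For example, a maintenance that falls due while the other unit is down (`backlog_step(1, 0, 0)`) is held in the backlog, and it is released in the first later period in which all units are available (`backlog_step(0, 1, 1)`).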

We will suppose, to simplify the formalisation of the model, that a backlogged activity is released before a new preventive maintenance is scheduled; of course, this is not the general case, which would require extra formulation.

3. A method for the assessment of system availability and reliability
The procedure that we propose in this paper to develop the availability study using continuous time stochastic simulation is described in Table I, where we distinguish a total of seven steps. It is important to make clear that before building the simulation model (step 5), we need to know all the details of the system design, as well as full reliability and maintainability data for each item that is part of it (Taylor et al., 2000).
Another important step of the study is the selection/determination of the functional blocks (step 2). A functional block provides the output of a system as the outcome of a joint event defined by the inputs to the system and its various states. Functional blocks corresponding to different subsystems are combined together to form a functional block diagram representing the functional characteristics of the combined system (Papazoglou, 1998). Conversely, a complex system represented by a single functional block is decomposed into constituent components with a corresponding functional block diagram.

Table I. Steps of the method for the availability and reliability assessment
Step  Description
1     Description of the system: required functions, limits and operational context
2     Determination of the basic functional blocks of the system, for every required system function to analyse
3     Determination of the dependencies among functional blocks for the fulfilment of every function
4     Reliability and maintainability data gathering for each one of the functional blocks
5     Simulation model building using the principles in previous sections of this paper
6     Scenarios and experiments design
7     Simulation, results and their discussion

4. A case study
We will briefly present an application of this modelling approach to the assessment of the reliability and availability of the services provided by a cogeneration plant (for the description of a more complex case study for cogeneration plants with different possible configurations, the reader is referred to Crespo Marquez et al., 2005). It is well known that electrical power generation systems are processes where Monte Carlo techniques traditionally provide a practical approach to reliability analysis (Crespo Marquez et al., 2005) and where using deterministic methods for availability is virtually impossible (Taylor et al., 2000), since repairable systems are considered.
Cogeneration plants produce electrical power and co-produce steam according to certain very demanding requirements. For instance, the availability requirements for electrical power and steam production will be the following:
(1) Steam production availability:
    . Steam 600 psi: 70 T/h 350 days/yr, 35 T/h 15 days/yr.
    . Steam 150 psi: 70 T/h 365 days/yr.
(2) Electrical power production availability:
    . 40 MW with back-up for 335 days/yr.
    . 20 MW with back-up + 20 MW without back-up for 30 days/yr.
The system used for the generation of electrical power is composed of a dual turbine (where dual refers to the possibility of using gas or fuel as the combustible, with natural gas used under normal operating conditions) and a turbine-coupled generator. The plant has a two-turbine configuration with 45 MW output per unit. Output in both cases will be at a nominal voltage of 12 kV. A transformer and a circuit-breaker are also foreseen for each turbine-generator subsystem.
The generation of steam is done using a boiler that benefits from the temperature of the turbine exhaust gases in order to generate the necessary steam flow (see Figure 3). The boiler, using a by-pass system, allows a set of post-combustion burners to be used, providing back-up in case of a turbine-generator set failure. Finally, the use of several economisers permits the production of steam at low (150 psi) and high (600 psi) pressure.

Figure 3. Description of the cogeneration unit from

The steam generation system takes its water from a demineralised water plant, which produces an amount of water equal to that required by the steam generation system. In addition, a water tank will allow total supply of water during a sufficiently long period (this provides 100 per cent water supply back-up). In this paper we consider perfect water supply to the steam generation system; therefore we will only include in our study those elements that compose the water network, the boilers and the steam network in the plant.
Taking into account the previous considerations, we have decided to model the two different functional blocks described in Table II, where the components of the blocks are also listed. Some of these blocks will then be replicated, according to the specific physical system configuration analysed.
The functional blocks' dependencies are as follows:
(1) For steam production availability: the steam production at 600 psi with volumes of 70 Tn/hr and 35 Tn/hr is obtained when the following conditions are met (see the simulation model condition for this example in Figure 4):
    . The 70 T/hr flow requirements are met on days when both boilers work simultaneously.
    . The 35 T/hr flow requirements are met on days when only one of the two boilers works.
(2) For electrical power production availability:
    . The 45 MW of electrical power (not only 40) with 45 MW back-up in stand-by is available on days when the two turbo-generators are available.
    . The production of 20 MW with 20 MW back-up in stand-by + 20 MW without back-up does not apply in this case.
The functional blocks' failure rate and mean time to repair data are shown in Table III.
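The dependencies listed above amount to a mapping from the daily availability of the replicated blocks to the service level delivered on that day. A minimal Python sketch of that mapping follows. It assumes two turbo-generator blocks and two boiler blocks, each with a 0/1 availability flag per day, and it treats the boiler and turbo-generator blocks as independent inputs. The "45 MW without back-up" level for a single available turbo-generator is inferred from the row labels of Table V rather than stated in the list above, and the function names and label strings are hypothetical.

```python
from collections import Counter


def classify_day(tg_avail, boiler_avail):
    """Return the (power, steam) service levels for one simulated day.

    tg_avail, boiler_avail -- sequences of two 0/1 availability flags.
    """
    power_levels = {
        2: "45 MW with back-up",
        1: "45 MW without back-up",  # assumption inferred from Table V labels
        0: "no power",
    }
    steam_levels = {2: "70 T/h", 1: "35 T/h", 0: "no steam"}
    return power_levels[sum(tg_avail)], steam_levels[sum(boiler_avail)]


def count_service_days(daily_states):
    """Count the days spent at each service level over a simulated horizon."""
    counts = Counter()
    for tg_avail, boiler_avail in daily_states:
        power, steam = classify_day(tg_avail, boiler_avail)
        counts[power] += 1
        counts[steam] += 1
    return counts
```

Applied to the daily block states of one replication, `count_service_days` yields day counts of the same kind as those compared against the requirements in the results.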

Table II. Functional blocks
Functional block                                       Components
Power generation system, named [RGasTGasGen]           Gas network
                                                       Turbine
                                                       Generator
                                                       Transformer
                                                       Circuit switch
Steam generation system, named [RAguaCaldera]          Boiler
                                                       Steam extraction and network system
                                                       Pump 1 (from degasifier to boiler)
                                                       Pump 2 (between economisers)
                                                       Valve 1 (after pump 1, before boiler)
                                                       Valve 2 (after pump 2, to bypass second economiser)
                                                       Ten joints and connections (water network system)

Figure 4. Functional blocks and function (steam production) fulfilment (step 3)

Table III. Reliability and maintainability data for functional blocks
Functional block     Failure rate: $\sum \lambda_i$ (failures per day)     MTTR: $\sum(\lambda_i/\mu_i)/\sum \lambda_i$ (repair days)
TG [RGasTGasGen]     0.00597766                                            2.45204697
B [RAguaCaldera]     0.02208000                                            <1
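As a rough plausibility check on Table III, and not a result reported in this paper, the long-run availability of a block that is subject only to corrective repairs can be approximated as 1 / (1 + failure rate × MTTR). The Python sketch below applies this to the turbo-generator block. It assumes a constant failure rate and ignores scheduled downtime, and the function name `inherent_availability` is hypothetical.

```python
def inherent_availability(failure_rate, mttr):
    """Steady-state availability with corrective downtime only.

    failure_rate -- failures per day (assumed constant)
    mttr         -- mean time to repair, in days
    Uses MTBF / (MTBF + MTTR) with MTBF = 1 / failure_rate.
    """
    return 1.0 / (1.0 + failure_rate * mttr)


# Turbo-generator block, values taken from Table III.
tg_availability = inherent_availability(0.00597766, 2.45204697)
```

Under these assumptions the turbo-generator block comes out at roughly 98.6 per cent. Any availability obtained from the full simulation would also include scheduled maintenance downtime, so it would be expected to be lower than this figure.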

The functional blocks' maintenance policy is as follows: the preventive maintenance of the plant is conditioned by the preventive maintenance of the turbine sets, so in the simulation model we suppose that every turbine-generator-boiler set stops together for the different maintenance steps. The maintenance of the boilers is therefore opportunistic and determined by that of the turbo-generator sets (Table IV). Turbo-generator sets are never stopped simultaneously for a preventive task. Therefore, on some occasions, maintenance will be backlogged.

Table IV. Maintenance steps and scheduled downtime for the whole turbo-generator and boiler set
Maintenance step                Scheduled downtime (hours)
Monthly (off-line wash)         6
Each 4,000 operating hrs        48
Each 8,000 operating hrs        120
Each 50,000 operating hrs       240

The model simulates a time horizon of 6,205 days (selected for the project elicitation), with a simulation time step of one day. Every day, the failures that take place in the available functional blocks are randomly obtained; in the event that a failure takes place in a block, the block becomes unavailable. Blocks return to the available state as soon as the maintenance operations are finished.
Results obtained with the simulation model for the production of electrical power (see Table V) indicate that the system fulfils the requirement of 5,025 days providing 40 MW with back-up. The requirement of 20 MW with back-up + 20 MW without back-up is not applicable. However, an availability of 45 MW without back-up during 339 days can now be reached and offered.
Results obtained with the simulation model for the production of steam indicate that the requirement of a minimum of 5,250 days of 70 Tn/hr steam at 600 psi is not met. A more reasonable estimate would be 4,987 days of 70 Tn/hr. The system delivers more than 35 Tn/hr flow of 600 psi steam. The minimum of 5,475 days of 70 Tn/hr 150 psi steam is not fulfilled either. A reasonable estimate is 4,987 days of 70 Tn/hr at 150 psi.
Simulation results regarding the availability and reliability of the blocks indicate that the availability of the turbo-generators is within the range 96-97 per cent with 95 per cent confidence, and that the reliability of the same equipment, with identical confidence, is within the interval 98-99 per cent.

Table V. Simulation results (values in days) for different replications and statistics
Supply (requirements)                                                               Rep. 1  Rep. 2  Rep. 3  Rep. 4  Rep. 5  Mean     Std. Dev.  Conf. (±95 per cent)  Max      Min
45 MW with back-up (note: MIN = 5,025 days 40 MW with back-up)                      5115    5100    5135    5110    5134    5118.80  15.32      13.43                 5132.23  5105.37
45 MW without back-up (note: MAX = 450 days 20 MW with B. + 20 MW without B.)       352     366     338     352     338     349.20   11.71      10.27                 359.47   338.93
70 Tn/h 600 psi steam (note: MIN = 5,250 days 70 Tn/hr)                             4980    5006    4996    5026    4999    5001.40  16.73      14.66                 5016.06  4986.74
35 Tn/h 600 psi steam (note: MAX = 225 days 35 Tn/hr)                               425     409     412     383     419     409.60   16.12      14.13                 423.73   395.47
70 Tn/h 150 psi steam (note: MIN = 5,475 days 70 Tn/hr)                             4980    5006    4996    5026    4999    5001.40  16.73      14.66                 5016.06  4986.74
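The statistics reported in Table V can be recomputed from the replication values. The Python sketch below assumes that the standard deviation is the sample standard deviation and that the confidence half-width is 1.96 standard errors (a normal approximation), which appears to match the tabulated figures; the function name `replication_summary` is hypothetical. With only five replications, a Student t quantile (about 2.776 for four degrees of freedom) would give wider intervals.

```python
import math
import statistics


def replication_summary(values, z=1.96):
    """Return (mean, std, half_width, upper, lower) for a set of replications.

    std        -- sample standard deviation (n - 1 in the denominator)
    half_width -- z * std / sqrt(n), i.e. a normal-approximation interval
    """
    mean = statistics.mean(values)
    std = statistics.stdev(values)
    half_width = z * std / math.sqrt(len(values))
    return mean, std, half_width, mean + half_width, mean - half_width


# Replication values for "45 MW without back-up" from Table V.
summary = replication_summary([352, 366, 338, 352, 338])
```

The lower limit of the interval is the value quoted in the discussion of results, for instance the 339 days of 45 MW without back-up.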

5. Conclusions
This paper presents a method for the assessment of system availability and reliability using continuous time Monte Carlo simulation models. The method has been used to assess whether a system fulfils certain availability and reliability requirements. Moreover, the method offered more realistic estimates of what could be expected from the system studied. These estimates are based on validated reliability and maintainability calculations for each functional block considered in the simulation model.
References
Crespo Marquez, A., Gupta, J.N.D. and Sanchez Herguedas, A. (2003), "Maintenance policies for a production system with constrained production rate and buffer capacity", International Journal of Production Research, Vol. 41 No. 9, pp. 1909-26.
Crespo Marquez, A., Iung, B. and Sanchez Herguedas, A. (2005), "Monte Carlo based assessment of system availability: a case study for cogeneration plants", Reliability Engineering and System Safety, Vol. 83 No. 3, pp. 279-83.
Marseguerra, M. and Zio, E. (2000), "Optimizing maintenance and repair policies via combination of genetic algorithms and Monte Carlo simulation", Reliability Engineering and System Safety, Vol. 68, pp. 69-83.
Papazoglou, I.A. (1998), "Functional block diagrams and automated construction of event trees", Reliability Engineering & System Safety, Vol. 61 No. 3, pp. 185-214.
Pidd, M. (2003), Tools for Thinking: Modelling in Management Science, 2nd ed., Wiley, Chichester.
Taylor, N.P., Knight, P.J. and Ward, D.J. (2000), "A model of the availability of a fusion power plant", Fusion Engineering and Design, Vol. 51, pp. 363-9.
Ventana Systems Inc. (2006), The Ventana Simulation Environment: VENSIM 5.5D, Ventana Systems Inc., Cambridge, MA.

Corresponding author
Adolfo Crespo-Marquez can be contacted at: adolfo.crespo@esi.us.es

