
Modelling imperfect inspection and maintenance in defence aviation through


bayesian analysis of the KIJIMA type I general renewal process (GRP)

Conference Paper · February 2006


DOI: 10.1109/RAMS.2006.1677418 · Source: IEEE Xplore



Modelling Imperfect Inspection and Maintenance in Defence Aviation
through Bayesian Analysis of the KIJIMA Type I General Renewal
Process (GRP)
Andrew Jacopino, Royal Australian Air Force
Frank Groen, Ph. D., Prediction Technologies, Inc.
Ali Mosleh, Ph. D., University of Maryland
Key Words: General Renewal Process, Repairable Item, Imperfect Maintenance, Imperfect Inspection, Aviation

SUMMARY & CONCLUSIONS

To ensure the effective and efficient operation of Defence aviation equipment there is a clear need for a component life model that is representative of the true life of a component. However, the large and often sophisticated RAM models used to manage defence aviation platforms, through various engineering and logistics activities, use models that cannot accurately represent this life. The main difference is in the underlying repair assumption. Specifically, the Ordinary Renewal Process (ORP) uses an as-good-as-new repair assumption while the Non-Homogenous Poisson Process (NHPP) uses an as-bad-as-old repair assumption. However, it is highly unlikely that any component, typically referred to as a Repairable Item (RI), will readily fit into either repair assumption. Therefore, despite the best endeavours of both engineering and logistics staff, given the underlying repair assumptions and the limitations these impose on the model, any solution will be suboptimal. Accordingly, there is a need for an RI life model that can contend with imperfect maintenance and imperfect inspection, can adapt to the limitations in data, and can include a number of additional factors including aging of the component, number of repairs, effectiveness of the repair, skill of the technicians, etc.

Eight cases were developed as part of the overall modelling scheme. These eight cases are further divided into 2 main types; the first type representing cases where failure times are known and the second type where failure times are unknown. The cases incrementally modify these types through the addition of factors including multiple failure modes and their inter-dependence, and imperfect inspection and maintenance, in order to achieve a more realistic representation. Each of these cases was then solved using a Markov Chain Monte Carlo (MCMC) sampling procedure, concentrating only on the analysis of the KIJIMA Type I GRP model with an underlying Weibull Time-To-Failure (TTF) distribution. The MCMC was made possible through the use of Slice Sampling and Auxiliary Variable techniques. The resulting models have the ability to accurately model, and specifically predict, the future failure trends. Furthermore, the model allows the analyst to compare the maintenance effectiveness either in isolation, or in comparison (benchmarking) of various maintenance activities/facilities.

1. INTRODUCTION

The following paper is intended to define a number of cases, including the mathematical model and algorithms, which can be used to realistically represent, and therefore predict, the future life of an RI within the defence aviation environment.

The remainder of the paper is divided into five sections. Section 2 provides a brief background on the need for an alternative model to support optimal management of RIs in the defence aviation environment. Section 3 introduces eight cases that have been developed as part of the overall modelling scheme. Section 4 solves the cases using a MCMC sampling procedure based on Slice Sampling and Auxiliary Variable techniques. Section 5 then provides examples of the analysis of components. Finally, Section 6 describes the potential future work.

1.1 Acronyms & Abbreviations

BIT     Built In Test
FM      Failure Mode
GRP     General Renewal Process
NHPP    Non-Homogenous Poisson Process
ORP     Ordinary Renewal Process
RCM     Reliability Centred Maintenance
RI      Repairable Item
ROCOF   Rate of Occurrence of Failure
TTF     Time-To-Failure
VA      Virtual Age

2. BACKGROUND

It is widely accepted that, over a typical utilisation phase of 20 years, only about 20% of the Total Cost of Ownership (TCO) of an aviation platform relates to the acquisition cost. For example, the Australian Department of Defence spends approximately 1 Billion Australian Dollars, or 5.7% of the total annual budget of 17.5 Billion Australian Dollars, on sustainment of their aircraft

1-4244-0008-2/06/$20.00 (C) 2006 IEEE


which total 283 fixed wing and 145 rotary wing aircraft. While it is true that this sustainment cost includes engineering, certification and logistics functions, a large portion of this cost relates to the management of RIs. One of the main facets of RI management is ensuring that a working spare is available to the user within a prescribed timeframe. However, the key to optimal management of RIs is knowing the arising rate (not failure rate since aviation tends to avoid unscheduled failures) to calculate the total number of spares, the repair turn-around time, or a combination of both. Furthermore, unlike commercial aviation organisations, defence aviation is less concerned with cost than with the operational capability of the platform; specifically Systems Readiness (i.e. Availability), Mission Success (i.e. Mission Reliability), Logistics Footprint (i.e. support required to operate the aircraft such as maintenance staff, spares, test equipment, etc) and Demand Satisfaction Rate (i.e. availability of spares).

Unfortunately, the large and often sophisticated RAM models used to manage defence aviation platforms, through engineering and logistics activities, use either as-good-as-new (i.e. ORP) or as-bad-as-old (i.e. NHPP) repair assumptions. However, it is highly unlikely that any RI will readily fit into either repair assumption. Therefore, despite the best endeavours of both engineering and logistics staff, given the underlying repair assumptions of the ORP and NHPP models, any solution will be suboptimal, resulting in either aircraft being unavailable to conduct a mission while awaiting a spare, or a higher TCO due to spare RIs that are not being utilised. Neither result is acceptable and accordingly there is clearly a need to construct a model that more realistically represents the true life of an RI within the Defence aviation environment.

3. CASES

Eight Cases were developed in an incremental manner allowing the various complexities of the life of an RI to be accurately modelled.

3.1 General Case Assumptions

Before describing the set-up of the individual cases, all cases include the following underlying model assumptions:
• Only valid for a single RI (e.g. single serial number from a fleet).
• Only 2 states; serviceable (i.e. working) and unserviceable (i.e. failed).
• When a repairable item can have multiple failure modes, the repair action is assumed to address only the failure mode that has occurred. The repair action does not remove the possibility of the other failure modes occurring following the repair action.

3.2 Case Type

The eight cases modelled were divided into two distinct case types for the purposes of the modelling, labelled Case A and Case B. The differences between Case A and Case B are detailed in Table 1.

• Case A: represents components with Known Failure Times. These models are representative of RIs where a failure can be observed and repaired as it occurs (e.g. avionics with an automatic BIT). This definition is similar to the concept of an evident failure under the RCM logic.
• Case B: represents components with Unknown Failure Times. These models are representative of RIs where a failure is not observable during operation. Failure is only observed, and therefore repaired, at a scheduled inspection point. This definition is similar to the concept of a hidden failure under the RCM logic.
Table 1: Description of Case Types

3.3 Cases

Cases were developed for each type that ranged in realism, and therefore complexity, beginning with the modelling of simple single Failure Mode (FM), Instantaneous Repair and Perfect Inspection through to Multiple Dependent FMs, Delayed Repair and Imperfect Inspection. A summary of the Cases, by type, including reference to the equipment they are representative of, is provided in Table 2. A diagram was used to assist in defining each case, its specific times and interactions. An example of the diagram for Case 4A is provided at Figure 1.

Columns (case number):
• CASE 1: Single Failure Mode, Instantaneous Repair, Perfect Inspection.
• CASE 2: Multiple Independent Failure Modes, Instantaneous Repair, Perfect Inspection.
• CASE 3: Multiple Dependent Failure Modes, Instantaneous Repair, Perfect Inspection.
• CASE 4: Multiple Dependent Failure Modes, Instantaneous Repair, Imperfect Inspection.
Rows (case type):
• CASE A: Known Failure Times (failure observed and repaired as it occurs).
• CASE B: Unknown Failure Times (failure non-observable and repaired at scheduled inspection point).
Entries:
• Case 1A: Representative of equipment that is constantly monitored for failures through BIT, etc and repaired during After Flight (AF)/Turn-around (TA)/rectification action.
• Cases 2 to 4 (all): only the failure mode that was discovered in the failed state is repaired (e.g. no repair of other failures regardless of whether they are in a failed state).
• Case 2A: Representative of the case where failure modes can be analysed separately and then combined in the GRP simulation.
• Case 3A: Representative of the case where failure modes will impact (e.g. 'age') other failure modes.
• Case 4A: Representative of the case where the inspection process can 'age' a different failure mode that did not fail and/or wasn't repaired.
• Case 1B: Representative of equipment that is not monitored for failures (e.g. failure can occur without being observed). Inspection, and AF/TA/rectification action, occurs at fixed intervals.
• Case 2B, Case 3B, Case 4B: As above.
Table 2: Description of RI Life Cases

4. MODEL SETUP

4.1 Basic GRP Model

Since the method and function of the GRP have been previously discussed at length in KIJIMA and SUMITA [1], JACOPINO, GROEN and MOSLEH [2], KAMINSKIY and KRIVTSOV [3], METTAS and ZHAO [4], and YANEZ, JOGLAR and MODARRES [5], it is not the intent of this paper to re-cover the GRP concept. However, to allow the reader to understand this paper in isolation, the following brief description of the GRP, specifically the KIJIMA Type I model, is provided.

Simplistically, realistic repair actions and aging effects are modelled through the reduction of Rate of Occurrence of Failure (ROCOF) during the lifetime of a system/component. In the simplest form of GRP, a single parameter, q, is introduced representing the goodness of repairs, or Repair Effectiveness.

The variation between the two extreme repair assumptions (i.e. ORP and NHPP) is achieved through the notion of virtual age. KIJIMA [6] introduced 2 models for virtual age typically referred to as the KIJIMA Type I and Type II equations:

$V_i = V_{i-1} + A_i Y_i$        KIJIMA Type I   (1)
$V_i = A_i (Y_i + V_{i-1})$      KIJIMA Type II  (2)

where $V_i$ is the virtual age (VA) after the ith repair, $Y_i$ is the most recent operating time during the ith interval and $A_i \in [0, 1]$ may be a random variable representing the Repair Effectiveness. However, for simplicity, all models assume $A_i$

= constant = q.

If q = 0, the solution of the GRP corresponds to the solution of an ORP model. Conversely, if q = 1, the solution of the GRP corresponds to the solution of a NHPP.

The focus on the KIJIMA Type I GRP equation is based on the findings from JACOPINO et al [2] which indicated that the KIJIMA Type I GRP equation should be used to represent an individual RI. Conversely, JACOPINO et al [2] stated that the KIJIMA Type II GRP equation should be used to represent complex systems such as aircraft or cars.

4.2 Slice Sampling

As discussed in KAMINSKIY and KRIVTSOV [3] there is no closed form solution to the GRP equation and accordingly, a MCMC method, such as Gibbs sampling or the Metropolis algorithm, must be used to sample from the complex, multivariate distribution defined by these cases. However, both these MCMC methods have a number of disadvantages and accordingly an alternative MCMC sampler developed by NEAL [7], referred to as Slice Sampling, was utilised. It is not the intention of this paper to describe the basis of Slice Sampling, which is adequately covered by NEAL [7], but rather to describe those issues connected to the resolution of the GRP question posed.

4.2.1 Set-up of Model

The model is based on the concept from KIJIMA and SUMITA [1] that the VA of an RI will change with a maintenance event; notionally a repair. However, this concept can be extended to represent a change in the VA due to any maintenance event including inspections, preventative maintenance and corrective maintenance. Based on this extension the following rules are made:
• An inspection will not decrease the VA of the RI since an inspection is not an active maintenance action such as the replenishment of lubricants. Accordingly, an inspection does not conduct any restorative action. However, an inspection may increase (i.e. make worse) the VA, especially if the inspection is intrusive (e.g. opening panels, disconnecting cables or hydraulic lines, etc).
• While a maintenance action on one FM may restore the VA (i.e. 0 ≤ q ≤ 1), the maintenance action may affect the VA of the non-repaired FM(s) by decreasing the VA (i.e. 0 ≤ q < 1), making no change to the VA (i.e. q = 1) or increasing the VA (i.e. q > 1).

Based on these rules for Case 4A with two FMs there are now 10 variables representing the factors in Table 3.

                               FM1                     FM2
Weibull Parameters             αFM1, βFM1              αFM2, βFM2
GRP Parameter 'qinspection'    qinspection_FM1         qinspection_FM2
GRP Parameter 'qmaintenance'   qmaintenance_FM1        qmaintenance_FM2
                               qmaintenance_FM1-2      qmaintenance_FM2-1
Table 3: Parameter Set for Case 4A

It can be seen from Table 3 that, for complex RIs with multiple FMs, the number of parameters for the solution to Case 4A quickly expands. For example, while a 2 FM case solution requires 10 parameters, a 4 FM solution requires 28 parameters. Both figures are consistent with four parameters per FM (α, β, qinspection, qmaintenance) plus one cross-FM qmaintenance term for each ordered pair of FMs, i.e. 4n + n(n − 1) parameters for n FMs.

While the Slice Sampling technique is not overly complex, its Bayesian implementation was not without difficulty. The key to the use of the Slice Sampler method for any continuous distribution is the ability to compute some function, f(x), that is proportional to the density. In this application, the f(x) must be representative of the pdf, including a Bayesian prior, of the underlying KIJIMA Type I GRP equation. Furthermore, this f(x) will vary depending on the case under consideration.
[Figure: Case 4A (Known Failure Times, Multiple Dependent Failure Modes, Instantaneous Repair, Imperfect Inspection). Two panels, one for Failure Mode 1 and one for Failure Mode 2, each plotting virtual time (V) against real time (t) with events marked at t0 = 0, t1 = a and t2 = b. Annotations for Failure Mode 1: combined failure/inspection point for FM1 only (failure observed, instantaneous repair); reduction in Virtual Age of FM1 due to repair of FM1; increase in Virtual Age of FM1 due to imperfect inspection to find FM2. Annotations for Failure Mode 2: combined failure/inspection point for FM2 only (failure observed, instantaneous repair); reduction in Virtual Age of FM2 due to repair of FM2; increase in Virtual Age of FM2 due to imperfect inspection to find FM1.]

Note: the virtual age takes into account imperfect maintenance directly; imperfect inspection can be considered similar, but separately.

Figure 1: Description of Case 4A
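For readers unfamiliar with the sampler referred to in Section 4.2, one univariate Slice Sampling update can be sketched as below, using the stepping-out and shrinkage procedures described by NEAL [7]. The paper does not say which variant or tuning was used, so the function name slice_sample_1d, the initial interval width and the limit on stepping-out are assumptions made purely for illustration.

```python
import math
import random

def slice_sample_1d(log_f, x0, width=1.0, max_steps=50, rng=random):
    """Return a new point drawn by one slice-sampling update from x0.

    log_f     : log of a function proportional to the target density
    width     : initial interval width (tuning value, assumed here)
    max_steps : cap on the total number of stepping-out moves (assumed)
    """
    # Auxiliary variable: a level drawn uniformly under f(x0), in log form
    log_y = log_f(x0) + math.log(1.0 - rng.random())

    # Stepping out: place an interval of size `width` randomly around x0,
    # then expand each end until it lies outside the slice
    left = x0 - width * rng.random()
    right = left + width
    j = int(max_steps * rng.random())   # steps allowed to the left
    k = (max_steps - 1) - j             # steps allowed to the right
    while j > 0 and log_f(left) > log_y:
        left -= width
        j -= 1
    while k > 0 and log_f(right) > log_y:
        right += width
        k -= 1

    # Shrinkage: sample uniformly from the interval, shrinking it towards
    # x0 whenever the proposal falls outside the slice
    while True:
        x1 = left + (right - left) * rng.random()
        if log_f(x1) > log_y:
            return x1
        if x1 < x0:
            left = x1
        else:
            right = x1
```

In a multi-parameter problem such as the GRP, one simple option is to apply this update to each parameter in turn while holding the others fixed, with log_f returning minus infinity outside the permitted range of the parameter.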

Using the Slice Sampler it is possible to generate a number of parameter sets that solve the underlying Case set-up, allowing the simulation of the expected CIF curve.

Once the individual parameter sets have been calculated, a simulation routine is performed using these parameter sets, based on at least 100 simulations for each set. The average of each set is then recorded. GRP simulation has been previously discussed by JACOPINO et al [2].

Once all the parameter sets have been simulated, it is then possible to average over all parameter sets to get the average CIF curve, and non-parametric upper and lower confidence limits.

4.2.2 Implementation Issues

The Slice Sampler, like most MCMC samplers, suffers from a high level of autocorrelation. While the fact that the output was correlated was not unexpected, the duration of the correlation (e.g. even at lag-5) was surprising. Given the clear indication that the output from the parameter estimation was strongly correlated, an interleaving technique was included within the algorithm. The aim of the interleave interval is to only record every ith value from the MCMC and therefore reduce the effect of correlation. However, the selection of the interleave interval needs to reflect the opposing requirements of a sufficiently large interval to reduce the effect of the autocorrelation to a satisfactory level without significantly reducing the efficiency of the MCMC sampler. Based on a trial, an interleave value of 5 was selected for this model.

5. RESULTS

The original intent of this research was to examine real RI data from the defence aviation environment. Unfortunately, it is very difficult to get the life of a single RI (e.g. serial number) from the current Maintenance Management System, a common problem within the aviation industry.

Furthermore, due to the number of cases it is impossible to demonstrate the output from the model for every case, and therefore a representative sample will be shown for illustrative purposes. Specifically, the 2 cases to be shown are Case 1A and Case 4A.

5.1 Case 1A Results

The Case 1A (single FM, Instantaneous Repair, Perfect Inspection) results are based on the data from MEEKER and ESCOBAR [8] for the USS Grampus Number 4 Main Propulsion Diesel Engine.

The estimator was run for 50,000 iterations with an interleave value of 5, resulting in 10,000 parameter sets of α, β and q. An initial guess of α = 1, β = 1 and q = 0.5 was used. To allow comparison of the model to the data, the first two thirds of the data set was used by the parameter estimator, while the simulator predicted the whole data set. Each of these parameter sets was then simulated 100 times and the averages recorded, resulting in 10,000 CIF curves. The 10,000 CIF curves were then averaged to find the average CIF curve for all parameter sets, and the non-parametric 90% upper and lower confidence limits.

As can be seen from Figure 2, the results accurately predict the additional data points, and the non-parametric 90% upper and lower confidence limits effectively bound the data.

5.2 Case 4A Results

For the reasons previously stated the Case 4A results are based on simulated data for a 2 FM RI. The input to the data simulator is provided in Table 4.
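The interleave, simulate and average procedure used for both sets of results (Sections 4.2.1 and 5.1) can be sketched for the single-FM case as below. This is a minimal illustration, not the authors' MathCad implementation: the function names are hypothetical, the CIF curve is taken to be the expected cumulative number of renewals on a time grid, simulated times are drawn by inverting the conditional Weibull distribution given the virtual age, and the 90% limits are taken as the 5th and 95th percentiles across parameter sets. All of these are assumptions.

```python
import math
import random

def simulate_failure_times(alpha, beta, q, horizon, rng):
    """Simulate cumulative failure times of one RI up to `horizon`
    (Kijima Type I virtual age, Weibull TTF)."""
    t, v, times = 0.0, 0.0, []
    while True:
        u = rng.random()
        # Invert the conditional Weibull CDF given the current virtual age v
        x = alpha * ((v / alpha) ** beta - math.log(1.0 - u)) ** (1.0 / beta) - v
        t += x
        if t > horizon:
            return times
        times.append(t)
        v += q * x  # Kijima Type I update

def mean_cif(params, grid, n_sims, rng):
    """Average cumulative number of failures on an ascending time grid."""
    alpha, beta, q = params
    totals = [0.0] * len(grid)
    for _ in range(n_sims):
        times = simulate_failure_times(alpha, beta, q, grid[-1], rng)
        for k, g in enumerate(grid):
            totals[k] += sum(1 for s in times if s <= g)
    return [c / n_sims for c in totals]

def summarise(chain, grid, interleave=5, n_sims=100,
              lower=0.05, upper=0.95, seed=0):
    """Thin the MCMC output, simulate each retained parameter set and
    return (mean curve, lower limit, upper limit) on the grid."""
    rng = random.Random(seed)
    thinned = chain[interleave - 1::interleave]  # keep every 5th draw by default
    curves = [mean_cif(p, grid, n_sims, rng) for p in thinned]
    mean, lo, hi = [], [], []
    for k in range(len(grid)):
        col = sorted(c[k] for c in curves)
        mean.append(sum(col) / len(col))
        lo.append(col[int(lower * (len(col) - 1))])  # assumed 5th percentile
        hi.append(col[int(upper * (len(col) - 1))])  # assumed 95th percentile
    return mean, lo, hi
```

Here chain is a list of (α, β, q) tuples from the parameter estimator; with 50,000 draws and an interleave of 5 this retains 10,000 parameter sets, matching the counts quoted in the text.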
[Figure: plot of No. of Renewals against Time To Failure (Hours). Legend: Average Simulated GRP from Parameter Estimator; 90% UCL and LCL Simulated GRP from Parameter Estimator; Simulated Average TTF from Known Parameters; Input TTF Data to Parameter Estimator.]

Figure 2: Case 1A Simulator Results

                       Failure Mode 1              Failure Mode 2
α                      αFM1 = 150 hours            αFM2 = 300 hours
β                      βFM1 = 1.5                  βFM2 = 3.5
qinspection            qinspection_FM1 = 0.5       qinspection_FM2 = 1
qmaintenance           qmaintenance_FM1 = 0.7      qmaintenance_FM2 = 0.2
qmaintenance (cross)   qmaintenance_FM1-2 = 1.2    qmaintenance_FM2-1 = 0.2
Scheduled Inspection at 400, 800, 1200 and 1600 hours
Table 4: Input to Data Simulator for Case 4A

The estimator was run for 50,000 iterations with an interleave value of 5, resulting in 10,000 parameter sets of the 10 variables. To allow comparison of the model to the simulated data, the first 11 data points were used by the parameter estimator, while the simulator predicted the whole 16 data points. Each of these parameter sets was then simulated 100 times and the averages recorded, resulting in 10,000 CIF curves. The 10,000 CIF curves were then averaged to find the average CIF curve for all parameter sets, and the non-parametric 90% upper and lower confidence limits.

As can be seen from Figure 3, the results accurately predict the additional 5 data points, and the non-parametric 90% upper and lower confidence limits effectively bound the simulated data.

[Figure: legend entries are Input Data (11 points); Input Data (16 points); Simulated Average for Data; 90% UCL Simulated Average; 90% LCL Simulated Average; Actual Simulated Average.]

Figure 3: Case 4A Simulator Results

6. FUTURE WORK

Due to the unexpected lack of data for individual components based on serial numbers, the logical next step is to apply this technique to a number of actual Australian Department of Defence aviation datasets. This data analysis phase is to specifically examine the variation of the parameters, especially 'q', for the same item due to location (i.e. temperature, humidity, etc), maintenance facility/staff, maintenance tools/manuals/procedures, incorporation of a design change/modification, mission profile, etc. In parallel, work will also be done to transfer the algorithms from the MathCad® environment used to develop and test the concept into a Windows® based environment, including the optimisation of the code.

REFERENCES

1. Kijima, M. and Sumita, N., "A useful generalization of renewal theory: counting process governed by nonnegative Markovian increments", Journal of Applied Probability, 23, pp. 71–88, 1986.
2. Jacopino, A., Groen, F. and Mosleh, A., "Behavioural Study of the General Renewal Process", Reliability, Availability and Maintainability Symposium, Los Angeles, USA, 2004.
3. Kaminskiy, M. and Krivtsov, V., "A Monte Carlo approach to repairable system reliability analysis", Probabilistic Safety Assessment and Management, New York: Springer, pp. 1063–1068, 1998.
4. Mettas, A. and Zhao, W., "Modeling and Analysis of Repairable Systems with General Repair", Reliability, Availability and Maintainability Symposium, Alexandria, USA, 2005.
5. Yanez, M., Joglar, F. and Modarres, M., "Generalized renewal process for analysis of repairable systems with limited failure experience", Reliability Engineering and System Safety, vol. 77, iss. 2, pp. 167–180, 2002.
6. Kijima, M., "Some Results for Repairable Systems with General Repair", Journal of Applied Probability, #20, 1989, pp. 851–859.
7. Neal, R., "Slice Sampling", The Annals of Statistics, Vol 31, No. 3, 2003.
8. Meeker, W.Q. and Escobar, L.A., Statistical Methods for Reliability Data, Wiley-Interscience, USA, 1998.

BIOGRAPHIES

Andrew Jacopino
Directorate of Aerospace Systems
Defence Materiel Organisation (Aerospace Systems Division)
Department of Defence
Russell Offices, Canberra, ACT 2600 Australia

Andrew.Jacopino@defence.gov.au

Squadron Leader Andrew Jacopino is the R&M subject matter expert for Aerospace Systems Division (ASD) within the Defence Materiel Organisation (DMO) located in Canberra, Australia. He is involved in all aspects of R&M within the Division: RAM specification, evaluation and review in materiel acquisition contracts; the monitoring of in-service performance of equipment, including the use of R&M in Performance Based Logistic contracts; and R&M training, including acting as the course manager of the Australian Defence Organisation (ADO) Defence Reliability Management Course.

Frank Groen, PhD
Prediction Technologies, Inc.
6525 Belcrest Road #513
Hyattsville, Maryland 20782 USA

fgroen@prediction-technologies.com

Dr. Frank Groen is the President of Prediction Technologies, Inc, a company whose products (R-DAT 1.5 and BRASS 1.0) have been designed to allow the user to conduct Bayesian reliability data collection and analysis. His research interests include the Bayesian treatment of uncertain evidence and methods for Probabilistic Risk Assessment.

Professor Ali Mosleh, PhD
Department of Reliability Engineering
University of Maryland
2100 Marie Mount Hall,
College Park, Maryland 20742-7531 USA

mosleh@eng.umd.edu

Professor Ali Mosleh is a full professor at the University of Maryland, Reliability Engineering Program. His research interests include dynamic PRA methods, human error modelling, and model uncertainty.

Prof. Ali Mosleh is professor and director of the Reliability Engineering Program and director of the Center for Risk and Reliability at the University of Maryland. He conducts research on methods for probabilistic risk analysis (PRA) and reliability of complex systems, and has made many contributions to diverse fields of theory and application. These include Bayesian methods for inference with uncertain evidence; analysis of data and expert judgment; treatment of model uncertainty; risk and reliability of hybrid systems of hardware, human and software; methods and tools for dynamic PRA; cognitive models for human reliability analysis; and models of the influence of organizational factors on system safety. He is the developer of the Accident Precursor Analysis methodology and many of the methods currently used for the treatment of common cause failures in highly reliable systems. On these topics he holds several patents, and has edited, authored or co-authored more than 250 publications including books, guidebooks and papers in technical journals and for conferences. Mosleh received his PhD in Nuclear Science and Engineering from the University of California, Los Angeles in 1981.
