
Introduction to RELIABILITY and MAINTENANCE

Reliability.Asset.Integrity Center

Session Objectives

To recognize the importance of reliability
To understand the basic definitions of reliability and its measures
To understand the concept of the bathtub reliability curve
To understand the basic methodology of reliability analysis and its relation to maintenance

OPERATIONAL ISSUES
Highly competitive business environment
Increased concern for safety and the environment
Tight profit margins
Escalating operational costs
Increased system complexity
Depletion of oil and gas resources
Increased demand
Changes in materials, operating conditions and equipment age

Together these issues put pressure on operators to run a safe, reliable and efficient plant.

Why RELIABILITY?

PETROCHEMICAL BUSINESS DRIVERS

Reduce operational cost
Healthy, safe and environmentally friendly operation
Maximize utilization
Meet operational targets and customer demand
Reduce waste, failures and downtime
High availability
Continuously improve plant performance

RELIABILITY DIRECTLY IMPACTS ALL THESE

Reliability and Organizational Profitability

The recent oil spill in the Gulf of Mexico caused an estimated loss of USD 23 billion to BP.

What caused it?

Bad cement job
Failure of the shoe track barrier
The negative pressure test was accepted when it should not have been
Failure in well control procedures
Failure of the blow-out preventer
The rig's fire and gas system failed to prevent ignition

Source: BP report, www.bp.com

System Performance Improvement

Improve system performance
  Improve reliability: prolong the life of equipment/components
  Improve maintainability: minimize downtime
Related activities shown in the diagram: study reliability engineering issues; estimate and reduce the failure rate

(Modarres et al., 1999)

Failure Causes for Engineering Components and Systems

Cause: Description
1. Poor design: improper design, dimensions or tolerances; stress concentration; no interchangeability of parts
2. Improper installation: improper foundation, excessive vibration, inadequate inputs (e.g. voltage), wrong techniques/tools
3. Incorrect production: outdated technology, wrong equipment, lack of process control and calibrated equipment, inadequate training
4. Improper maintenance: under- or over-maintenance, wrong tools/techniques, poor spare part management, insufficient skills and training
5. Complexity: large number of components, interfaces and interconnections
6. Poor operational instructions / SOP: wrong instructions, lack of clarity, difficult to understand, poor language
7. Human error: lack of understanding of process and equipment, carelessness, forgetfulness, poor judgment

What is RELIABILITY?

Reliability is the probability that an item will perform its required function under given conditions for a stated time interval.

Probability: describes the stochastic (random) behaviour of the occurrence of failure
Required function: the designed function of the system
Given conditions: the external conditions in which the system usually operates
Time interval: the design life period of the system

RELIABILITY MEASURES

MEAN TIME TO FAILURE (MTTF)

The average time that elapses until a failure occurs. It applies to non-repairable items.

$$\mathrm{MTTF} = \frac{1}{n}\sum_{i=1}^{n} t_i$$

where $t_i$ is the time to failure of item $i$ and $n$ is the number of items.

Example:

Consider 6 components of similar type with failure times of 23, 34, 32, 28, 19 and 27 days respectively.

MTTF = (23+34+32+28+19+27) / 6 = 27.2 days
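The MTTF example can be checked with a few lines of Python. This is a minimal sketch; the function name `mttf` is an illustrative choice and does not come from the slides.

```python
from statistics import mean


def mttf(failure_times):
    """Return the arithmetic mean of observed times to failure.

    Assumes every item was run to failure (complete data, no censoring).
    """
    if not failure_times:
        raise ValueError("at least one failure time is required")
    return mean(failure_times)


# Failure times in days, taken from the example above.
failure_times_days = [23, 34, 32, 28, 19, 27]
print(round(mttf(failure_times_days), 1))  # 27.2
```

The simple mean is only appropriate when all failure times are known exactly; censored observations would need a different estimator.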

RELIABILITY MEASURES

MEAN TIME BETWEEN FAILURE (MTBF)

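The average time between successive failures. It is used for repairable systems when the failure rate is assumed to be constant (random failure).

$$\mathrm{MTBF} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

where $x_i$ is the uptime before the $i$-th failure and $n$ is the number of failures.

Example:

[Figure: timeline in days showing four uptime periods of 50, 30, 60 and 46 days, each ended by a failure and followed by downtime]

MTBF = (50+30+60+46) / 4 = 46.5 days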
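A short Python sketch of the same calculation follows. It also derives the failure rate as 1/MTBF, which holds only under the constant failure rate assumption stated above. The function names are illustrative, not part of the slides.

```python
from statistics import mean


def mtbf(uptimes):
    """Mean time between failures: the average of the uptime periods."""
    if not uptimes:
        raise ValueError("at least one uptime period is required")
    return mean(uptimes)


def constant_failure_rate(uptimes):
    """Failure rate as 1 / MTBF, valid only if the failure rate is constant."""
    return 1.0 / mtbf(uptimes)


# Uptime periods in days, read off the timeline in the example above.
uptimes_days = [50, 30, 60, 46]
print(mtbf(uptimes_days))                             # 46.5
print(round(constant_failure_rate(uptimes_days), 4))  # 0.0215 failures per day
```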

RELIABILITY MEASURES

FAILURE RATE (HAZARD RATE)

Failure rate (hazard rate) is the conditional probability that a component fails in a small time interval, given that it has survived from time zero until the beginning of that interval.

[Figure: time axis with the interval from t to t + Δt marked. The item survives up to t; what is the probability of failure in the interval from t to t + Δt?]

Note: The term failure rate is widely used to describe the reliability of both non-repairable components and repairable systems. The more appropriate term for non-repairable components is hazard rate, and for repairable systems it is rate of occurrence of failure (ROCOF).
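The verbal definition above corresponds to the standard textbook expression for the hazard rate. The slides do not write it out, so the symbols here are assumptions: $T$ is the random time to failure, $f(t)$ its probability density and $R(t) = P(T > t)$ the reliability function.

```latex
h(t) = \lim_{\Delta t \to 0}
       \frac{P\left(t < T \le t + \Delta t \mid T > t\right)}{\Delta t}
     = \frac{f(t)}{R(t)}
```

Strictly, $h(t)$ is a conditional probability per unit time, so $h(t)\,\Delta t$ approximates the conditional probability of failure in a short interval of length $\Delta t$.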

RELIABILITY MEASURES

FAILURE RATE (HAZARD RATE) (continued)

Failure rate is an important function in reliability studies because it describes how the probability of failure changes over the lifetime of the item, and hence the item's reliability performance.

Increasing rate = reliability deteriorates
Decreasing rate = reliability improves
Constant rate = reliability is maintained

Bathtub curve
The bathtub curve is a conceptual model of the reliability characteristics (failure rate) of a component or system over its lifetime. It is divided into three regions:

1. Early failures
2. Useful life
3. Wear out

[Figure: bathtub curve of failure rate versus time, with the three regions marked]

Bathtub curve

1. Early failures (infant mortality or burn-in period)

The failure rate is initially higher due to issues such as improper manufacturing, improper installation and poor materials.

[Figure: failure rate versus time, with the early failures region highlighted]

Bathtub curve

2. Useful life

The failure rate is approximately constant because the failures, assumed to be mostly stress-related, occur at random. This flat portion of the bathtub curve is also referred to as the normal operating life of a component or system, where realistically many components or systems spend most of their operating lifetimes.

[Figure: failure rate versus time, with the useful life region highlighted]

Bathtub curve

3. Wear out

The failure rate increases because of degradation phenomena due to wear out. Wear out is generally caused by fatigue, corrosion, creep, friction and other aging factors.

[Figure: failure rate versus time, with the wear out region highlighted]
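The three regions of the bathtub curve can be illustrated numerically. The slides do not name a distribution, so the Weibull hazard used below is an assumption made for illustration: a shape parameter below 1 gives a decreasing failure rate, exactly 1 a constant rate, and above 1 an increasing rate. The scale of 1000 hours and the shape values are hypothetical.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    if t <= 0 or shape <= 0 or scale <= 0:
        raise ValueError("t, shape and scale must all be positive")
    return (shape / scale) * (t / scale) ** (shape - 1)


# Hypothetical shape values chosen only to show the three behaviours.
regions = [("early failures", 0.5), ("useful life", 1.0), ("wear out", 3.0)]
for label, shape in regions:
    rates = [weibull_hazard(t, shape, scale=1000.0) for t in (100.0, 500.0, 900.0)]
    print(label, ["%.2e" % r for r in rates])
```

A real bathtub curve is usually a combination of such behaviours over different periods of life, so a single Weibull shape describes only one region at a time.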

Failure rate curve: Repairable system

[Figure: failure rate versus time for a repairable system. The original system goes through a decreasing failure rate phase, a useful life phase and a wear-out phase (the original fielded system failure curve). Major maintenance actions, labelled Improvement #1 and Improvement #2 along a time axis marked t1, t2, ..., tn, each move the system wear-out phase later, giving a useful life extension.]

Equipment / system useful life phase extension (Wasson, 2006)

Various types of Failure rate curve


1. Traditional view (random failure, then wear out)
   Typical equipment: belts, chains, impellers
   Maintenance strategy: preventive maintenance

2. Bathtub curve
   Typical equipment: electro-mechanical components and motors
   Maintenance strategy: condition monitoring

3. Slow aging (steady increase in failure rate)
   Typical equipment: turbines, engines, compressors, piping
   Maintenance strategy: condition monitoring

Various types of Failure rate curve


4. Best New (sharp increase in failure rate, then levels off)
   Typical equipment: hydraulic and pneumatic equipment
   Maintenance strategy: condition based maintenance

5. Random failure (failure rate is constant, no age-related failure pattern)
   Typical equipment: ball and roller bearings
   Maintenance strategy: condition based maintenance

6. Worst New (high infant mortality, then random failure)
   Typical equipment: electronic equipment/components
   Maintenance strategy: condition based maintenance

Reliability Analysis

Statistical concepts play a critical role in reliability analysis and techniques.
Applications of reliability techniques to real-world problems generally involve three main elements:
Acquisition: effective and efficient data collection
Analysis: description and analysis of data (descriptive and inferential statistics)
Interpretation: use of the results to solve the problem

General Methodology for Reliability Analysis

1. Setting objectives
2. Definition of system and failure
3. Data gathering
4. Exploratory analysis
5. Distribution analysis
6. Estimation of reliability measures
7. Recommendations for operation and maintenance improvement

Setting Objective

A clear objective is a very important factor for a successful reliability study.

Have a clear definition of the specific purpose to be achieved at the end of the analysis.

The objective of the reliability study strongly influences the approach and the modeling and analysis methods used.

A precise objective sets the proper conditions for collecting the relevant maintenance data to be used in the analysis.

System Definition
Example: Gas Compression Train system boundary (adapted from OREDA (2002))

[Figure: block diagram of a gas compression train with its system boundary marked. Labelled blocks include inlet gas conditioning (scrubber, cooler etc.), inlet valve, recycle valve, inter-stage conditioning (scrubber, cooler etc.), after cooler, outlet valve, fuel/gas inlet equipment, fuel/gas control, local valve, air inlet equipment, exhaust, gas generator, power turbine, gear box, compressor unit (1st stage and 2nd stage), starter system, lubrication system, shaft seal system, control and monitoring, and miscellaneous. Labelled interfaces include power, coolant and remote instrumentation.]

Source of Data

Historical data: test and field data on the same components/equipment
Vendor data: data from the manufacturer, vendor or consultant
Test data: experimental data on the parts
Operational data: field data collected under actual operating conditions
Handbook data: theoretical data from standard engineering handbooks and reliability databases, e.g. OREDA, MIL-HDBK-217F
Judgmental data: information based on expert opinion
Cost data: data on sales, maintenance and operational costs

Operational Data

Main categories of data for reliability analysis:

Inventory data: information on equipment related to design, operational, functional and environmental characteristics. It can be classified under equipment identification, manufacturing and design, maintenance and test, and engineering and process data.

Failure event data: detailed records of failure incidents, e.g. event date, duration, modes, causes, codes, severity and effect on the system, downtime date and duration.

Operating time data: the time and duration of each operating state, i.e. operation, standby and downtime.

Types of Data

Complete data: the exact time to failure is known.
Right censored (suspension): the item is still running at the end of the observation time.
Left censored: the failure time is only known to be before a certain time.
Interval censored: the failure time is only known to lie within an interval.
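One way to hold all four data types in a single record is to store a lower and an upper bound on the failure time. This is a sketch under that assumption; the `Observation` class and its field names are hypothetical and not taken from the slides.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """Bounds on an item's failure time (hypothetical record layout)."""
    lower: float  # latest time the item was known to be working
    upper: float  # earliest time the item was known to have failed (inf if still running)

    def __post_init__(self):
        if self.lower < 0 or self.upper < self.lower:
            raise ValueError("require 0 <= lower <= upper")

    def kind(self) -> str:
        """Classify the observation by how much is known about the failure time."""
        if self.lower == self.upper:
            return "complete"
        if math.isinf(self.upper):
            return "right censored"
        if self.lower == 0:
            return "left censored"
        return "interval censored"


print(Observation(120.0, 120.0).kind())     # complete
print(Observation(200.0, math.inf).kind())  # right censored
print(Observation(0.0, 50.0).kind())        # left censored
print(Observation(50.0, 80.0).kind())       # interval censored
```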

Exploratory Data Analysis

Use statistical tools and techniques to investigate data sets in order to gain insight into the data, understand their important characteristics, identify outliers and extract important factors.

Common exploratory tools:

Histogram
Pie chart
Pareto chart
Box plot
Trend chart
Scatter plot

Exploratory Analysis
Example: gas compression train

No.  Subsystem                    Code
1    Gas Turbine                  GT
2    Centrifugal Gas Compressor   GC
3    Starter System               STS
4    Gearbox                      GB
5    Fuel System                  FS
6    Vibration Monitoring System  VMS
7    Anti-surge Valve System      AVS
8    Lube Oil System              LOS
9    Process and Utilities        PRO
10   Turbine Control System       TCS

[Figure: pie charts of the share of failures by subsystem for Train 1 and Train 2]
[Figure: Pareto chart for the gas compression train (overall), showing number of failures and cumulative % by subsystem in the order GT, TCS, GC, AVS, PRO, LOS, STS, FS, VMS, GB]
[Figure: trend chart of the number of failures by subsystem per year, 2002 to 2009]

Types of Configurations

Series
Parallel

[Figure: example RBD for an Acid Gas Removal Unit, combining series and parallel blocks. Block labels include M201, M202, T201A to T201D, T202A, T202B, T203-A to T203-D and A201; equipment names include feed gas separator, feed/pure gas exchanger and absorber.]

Series Configuration

Blocks are connected in series.

It can be thought of as an OR relationship (i.e. the system fails if A OR B fails).

It implies no redundancy in the components.

If units are in series, then all units must work for the system to work. If any unit in the series fails, the system fails.

The reliability of the system is given by:

$$R_s = R_1 \times R_2 \times \cdots \times R_n$$

[Figure: blocks R1, R2 and R3 connected in series]

Reliability Calculation for Series System

Calculate the system reliability given R1 = 0.90, R2 = 0.95 and R3 = 0.98.

[Figure: blocks R1, R2 and R3 connected in series]

$$R_s = R_1 R_2 R_3 = (0.90)(0.95)(0.98) = 0.8379$$
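The series calculation is a plain product of the block reliabilities. A minimal Python sketch, with `series_reliability` as an illustrative name and assuming the blocks fail independently:

```python
from math import prod


def series_reliability(reliabilities):
    """Reliability of independent blocks in series: the product of the block values."""
    values = list(reliabilities)
    if any(not 0.0 <= r <= 1.0 for r in values):
        raise ValueError("each reliability must lie between 0 and 1")
    return prod(values)


# Block reliabilities from the example above.
print(round(series_reliability([0.90, 0.95, 0.98]), 4))  # 0.8379
```

The system reliability is always lower than that of the weakest block, which is why adding components in series reduces reliability.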

Reliability Calculation for Series System

What is the system reliability and failure rate?

Assume that the components have constant failure rates $\lambda_1$, $\lambda_2$ and $\lambda_3$.

[Figure: blocks R1, R2 and R3 connected in series]

Then the system reliability is

$$R_s(t) = R_1(t)\,R_2(t)\,R_3(t) = e^{-\lambda_1 t}\,e^{-\lambda_2 t}\,e^{-\lambda_3 t} = e^{-(\lambda_1+\lambda_2+\lambda_3)t}$$

So the failure rate of the system is

$$\lambda_s = \lambda_1 + \lambda_2 + \lambda_3$$

Exercise for Series System

Consider a system with three components in series.

[Figure: blocks R1, R2 and R3 connected in series]

You are required to achieve a system reliability of 0.98 over 800 hours of non-stop operation.

1. What would be the target failure rate for the system?

$$\begin{aligned}
R_s(t) &= e^{-\lambda_s t} \\
0.98 &= e^{-\lambda_s (800)} \\
\ln(0.98) &= -\lambda_s (800) \\
\lambda_s &= -\frac{\ln(0.98)}{800} = 2.53 \times 10^{-5} \text{ per hour}
\end{aligned}$$

Exercise for Series System

2. What would the system MTBF be?

$$\mathrm{MTBF}_s = \frac{1}{\lambda_s} = \frac{1}{2.53 \times 10^{-5}} \approx 39{,}599 \text{ hours} \approx 1650 \text{ days}$$

(The figure of 39,599 hours is computed from the unrounded value of $\lambda_s$.)

Exercise for Series System

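3. Assuming the component failures are identically distributed,

a) What should be the component failure rate?

$$\lambda_s = \lambda_1 + \lambda_2 + \lambda_3 = 3\lambda
\quad\Rightarrow\quad
\lambda = \frac{2.53 \times 10^{-5}}{3} = 8.42 \times 10^{-6} \text{ per hour}$$

b) What would be the component MTBF?

$$\mathrm{MTBF} = \frac{1}{\lambda} = \frac{1}{8.42 \times 10^{-6}} \approx 118{,}796 \text{ hours} \approx 4950 \text{ days}$$

c) What should be the component reliability?

$$R(t) = e^{-\lambda t} = e^{-(8.42 \times 10^{-6})(800)} = 0.993$$

(The results are computed from the unrounded failure rates.)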
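The three parts of the exercise can be reproduced in Python. This is a sketch under the exercise's own assumptions (constant failure rates, three identical components in series); the function and variable names are illustrative.

```python
import math


def required_failure_rate(target_reliability, mission_time):
    """Solve R = exp(-rate * t) for the rate, given a target R over mission time t."""
    if not 0.0 < target_reliability <= 1.0 or mission_time <= 0:
        raise ValueError("need 0 < R <= 1 and a positive mission time")
    return -math.log(target_reliability) / mission_time


MISSION_HOURS = 800.0
N_COMPONENTS = 3

system_rate = required_failure_rate(0.98, MISSION_HOURS)   # question 1
system_mtbf = 1.0 / system_rate                            # question 2
component_rate = system_rate / N_COMPONENTS                # question 3a
component_mtbf = 1.0 / component_rate                      # question 3b
component_reliability = math.exp(-component_rate * MISSION_HOURS)  # question 3c

print("system failure rate    %.2e per hour" % system_rate)
print("system MTBF            %.0f hours" % system_mtbf)
print("component failure rate %.2e per hour" % component_rate)
print("component MTBF         %.0f hours" % component_mtbf)
print("component reliability  %.3f" % component_reliability)
```

As a cross-check, the component reliability raised to the third power returns the system target of 0.98.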

Parallel Configuration

The system fails only when all the units fail.

It can be thought of as an AND relationship (i.e. the system fails if 1 AND 2 AND ... AND n fail).

At least one unit must succeed for a successful mission.

The reliability of the system is given by:

$$R_s = 1 - \left[(1-R_1)(1-R_2)\cdots(1-R_n)\right]$$

[Figure: blocks 1, 2, 3, ..., n connected in parallel]

Reliability Calculation for Parallel System

Calculate the system reliability given R1 = 0.90 and R2 = 0.98.

[Figure: blocks 1 and 2 connected in parallel]

$$\begin{aligned}
R_s &= 1 - \left[(1-R_1)(1-R_2)\right] \\
    &= 1 - \left[(1-0.90)(1-0.98)\right] \\
    &= 1 - (0.10)(0.02) \\
    &= 1 - 0.002 \\
    &= 0.998
\end{aligned}$$
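The parallel calculation multiplies the block unreliabilities and subtracts the result from one. A minimal Python sketch, with `parallel_reliability` as an illustrative name and assuming the blocks fail independently:

```python
from math import prod


def parallel_reliability(reliabilities):
    """Reliability of independent blocks in parallel: 1 minus the product of unreliabilities."""
    values = list(reliabilities)
    if any(not 0.0 <= r <= 1.0 for r in values):
        raise ValueError("each reliability must lie between 0 and 1")
    return 1.0 - prod(1.0 - r for r in values)


# Block reliabilities from the example above.
print(round(parallel_reliability([0.90, 0.98]), 3))  # 0.998
```

Unlike the series case, the system reliability here is higher than that of the best single block.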

Combination of Basic Configurations

Any of the previous configuration types can be used simultaneously in one diagram.

Consider a system made up of subsystems.

[Figure: reliability block diagram of numbered blocks arranged in a combination of series and parallel paths]
Steps to calculate system reliability for a combined series-parallel configuration

1. Break the system into smaller series and parallel arrangements.
2. Calculate the reliability of each arrangement identified in step 1.
3. Finally, calculate Rs using the reliabilities obtained in step 2.
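The three steps can be sketched in Python on a small layout. The layout below (block 1 in series with a parallel pair of blocks 2 and 3) and its reliability values are hypothetical, chosen only to show the procedure; they are not the diagram from the slides. Independent block failures are assumed.

```python
from math import prod


def series(reliabilities):
    """Independent blocks in series: all must work."""
    return prod(reliabilities)


def parallel(reliabilities):
    """Independent blocks in parallel: at least one must work."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


# Hypothetical block reliabilities for the illustrative layout.
r1, r2, r3 = 0.95, 0.90, 0.98

# Step 1: the system breaks down into a parallel group (2, 3) in series with block 1.
# Step 2: calculate the reliability of the parallel group.
r_group = parallel([r2, r3])
# Step 3: combine the group with block 1 in series.
r_system = series([r1, r_group])

print(round(r_group, 3))   # 0.998
print(round(r_system, 4))  # 0.9481
```

Larger diagrams are handled the same way, collapsing the innermost arrangements first and working outwards.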

k-out-of-n Redundancy

At times, a system is such that k out of its n components need to be working for the system to function.

[Figure: n blocks (1, 2, 3, 4, ...) in parallel feeding a k/n node; the example shown is a 3/4 node]

k-out-of-n Redundancy

A node is used to signify k-out-of-n redundancy.

The basic property of the node is to define the number of incoming paths that must be Good for the system to be Good.

k-out-of-n Redundancy

For n identical components (i.e. same reliability values), the system reliability is calculated as

$$R_s = P(\text{at least } k \text{ components are working}) = \sum_{x=k}^{n} P(x)$$

where

$$P(x) = \binom{n}{x} R^{x} (1-R)^{n-x} \quad \text{(binomial distribution)}$$

and

$$\binom{n}{x} = \frac{n!}{x!\,(n-x)!}$$

[Figure: n blocks in parallel feeding a k/n node]

Example: k-out-of-n Redundancy

A high pressure boiler is fitted with 5 identical pressure relief valves. The pressure inside the boiler is successfully controlled by any three of these valves. If the failure probability of a relief valve is 0.05, compute the reliability of the pressure relief valve system.

Solution: This is a 3-out-of-5 system where n = 5, k = 3 and R = 1 - 0.05 = 0.95.

$$\begin{aligned}
R_s &= \sum_{x=k}^{n} \binom{n}{x} R^{x} (1-R)^{n-x} \\
    &= \frac{5!}{3!\,(5-3)!}\,0.95^{3}\,(1-0.95)^{5-3}
     + \frac{5!}{4!\,(5-4)!}\,0.95^{4}\,(1-0.95)^{5-4}
     + \frac{5!}{5!\,(5-5)!}\,0.95^{5}\,(1-0.95)^{5-5} \\
    &= 0.99884
\end{aligned}$$
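The binomial sum is easy to evaluate in Python. This is a minimal sketch for n identical, independent components; `k_out_of_n_reliability` is an illustrative name.

```python
from math import comb


def k_out_of_n_reliability(k, n, r):
    """Probability that at least k of n identical, independent components work."""
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    if not 0.0 <= r <= 1.0:
        raise ValueError("component reliability must lie between 0 and 1")
    return sum(comb(n, x) * r**x * (1.0 - r) ** (n - x) for x in range(k, n + 1))


# Relief valve example: 3-out-of-5 with component reliability 0.95.
print(round(k_out_of_n_reliability(3, 5, 0.95), 5))  # 0.99884
```

Setting k = 1 reproduces the parallel configuration and k = n the series configuration, which gives a quick sanity check on the formula.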

AVAILABILITY

Definition
The probability that a system or component is performing its required function at a given point in time, or over a stated period of time, when operated and maintained in a prescribed manner.
(Ebeling, 1997)

AVAILABILITY

Three Types of Availability Measures

1. Inherent, Ai
Steady state availability which considers only corrective maintenance (CM).

$$A_i = \frac{\mathrm{MTBF}}{\mathrm{MTBF} + \mathrm{MTTR}}$$

2. Achieved, Aa
Steady state availability which includes both corrective maintenance (CM) and preventive maintenance (PM).

$$A_a = \frac{\mathrm{MTBM}}{\mathrm{MTBM} + \mathrm{MMT}}$$

3. Operational, Ao

$$A_o = \frac{\mathrm{MTBM}}{\mathrm{MTBM} + \mathrm{MMT} + \mathrm{MLDT}}$$

It can also be expressed as

$$A_o = \frac{\text{Uptime}}{\text{Uptime} + \text{Downtime}}$$

MTBF = mean time between failures
MTTR = mean time to repair
MTBM = mean time between maintenance
MMT = mean maintenance time
MLDT = mean logistics down time (LDT + ADT)
LDT = logistics delay time
ADT = administrative delay time
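The three availability measures differ only in which downtime contributions enter the denominator. The Python sketch below makes that explicit. The function names are illustrative, and apart from the MTBF of 46.5 days from the earlier example, all input values are hypothetical.

```python
def inherent_availability(mtbf, mttr):
    """Ai: counts only corrective maintenance downtime."""
    return mtbf / (mtbf + mttr)


def achieved_availability(mtbm, mmt):
    """Aa: counts corrective and preventive maintenance downtime."""
    return mtbm / (mtbm + mmt)


def operational_availability(mtbm, mmt, mldt):
    """Ao: also counts logistics and administrative delays."""
    return mtbm / (mtbm + mmt + mldt)


# MTBF of 46.5 days from the earlier MTBF example; MTTR of 1.5 days is hypothetical.
print(inherent_availability(46.5, 1.5))  # 0.96875

# Hypothetical values in days: MTBM 40, MMT 2, MLDT 3.
print(round(achieved_availability(40.0, 2.0), 4))
print(round(operational_availability(40.0, 2.0, 3.0), 4))
```

For the same MTBM and MMT, operational availability can never exceed achieved availability, because the delay time only adds to the denominator.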

Operational Availability

$$A_o = \frac{\text{UPTIME}}{\text{UPTIME} + \text{DOWNTIME}}$$

UPTIME consists of:
Standby time
Operating time

DOWNTIME consists of:
Logistics Delay Time (LDT): parts availability, waiting for items/services
Administrative Delay Time (ADT): locating tools, setting up test equipment, finding personnel, reviewing manuals
Corrective Maintenance Time (CMT): preparation time, fault location time, getting parts, correcting fault, test and check out
Preventive Maintenance Time (PMT): servicing, inspection, overhaul

THANK YOU

References

Modarres, M., Kaminskiy, M. and Krivtsov, V. (1999). Reliability Engineering and Risk Analysis. Marcel Dekker, New York.

OREDA (2002). Offshore Reliability Data Handbook, 4th Edition. OREDA Participants.

Ebeling, C. (1997). An Introduction to Reliability and Maintainability Engineering. McGraw-Hill Companies, Inc., Boston.

Wasson, C. S. (2006). System Analysis, Design, and Development. John Wiley & Sons, Hoboken, NJ, USA.