Module I
Definition of reliability - key elements; failure analysis - failure density - failure rate -
probability of failure - bathtub curve - Basic reliability equations - Reliability in terms of
failure rate, failure density - relation between reliability, failure density and hazard
rate - Mean time to failure (MTTF) - Integral equation of MTTF in terms of reliability
Module II
Hazard models - constant hazard model - linearly increasing hazard model - expressions
for reliability, failure density, and probability of failure of these models - problems
Module III
System reliability - components connected in series - components connected in parallel -
mixed configuration - reliability block diagrams (RBD) - distinction between physical
configuration and logical configuration - problems
Module IV
Reliability improvement methods - Redundancy - unit redundancies - element
redundancies - simplification of design - parts derating - operating environment; Cost of
reliability - factors to be considered for optimizing the reliability cost
References
1. L. S. Srinath, Reliability Engineering, Affiliated East-West Press Ltd., 1985.
2. E. Balagurusamy, Reliability Engineering, Tata McGraw Hill Publishing Co., 1984.
3. Charles E. Ebeling, Reliability & Maintainability Engg., Tata McGraw Hill Publishing Co., 1997.
4. Alessandro Birolini, Reliability Engineering: Theory and Practice, Springer, 2007.
5. E. Lewis, Introduction to Reliability Engineering, John Wiley & Sons, 1995.
Time   No. of     Cumulative   No. of      Failure   Failure   Reliability
(t)    failures   failures     survivors   density   rate      (R)
       (f)        (F)          (S)         (fd)      (Z)
0      -          0            1000        -         -         1.000
1      130        130          870         0.130     0.139     0.870
2      83         213          787         0.083     0.100     0.787
3      75         288          712         0.075     0.100     0.712
4      68         356          644         0.068     0.100     0.644
5      62         418          582         0.062     0.101     0.582
6      56         474          526         0.056     0.101     0.526
7      51         525          475         0.051     0.101     0.475
8      46         571          429         0.046     0.101     0.429
9      41         612          388         0.041     0.100     0.388
10     37         649          351         0.037     0.100     0.351
11     34         683          317         0.034     0.101     0.317
12     31         714          286         0.031     0.103     0.286
13     28         742          258         0.028     0.103     0.258
14     64         806          194         0.064     0.283     0.194
15     76         882          118         0.076     0.486     0.118
16     62         944          56          0.062     0.714     0.056
17     40         984          16          0.040     1.110     0.016
18     12         996          4           0.012     1.200     0.004
19     4          1000         0           0.004     2.000     0.000
Failure density: Total = 1
Failure rate: Mean = 0.376
The failure density, failure rate, reliability and probability of failure can be defined as
follows.
Failure density (fd)
This is the ratio of the number of failures during a given unit interval of time to the total
number of items at the very beginning of the test (also called the initial population). For
the given data, the failure density associated with the first unit interval is
$f_{d1} = \frac{130}{1000} = 0.13$
Similarly, $f_{d2} = \frac{83}{1000} = 0.083$, and so on.
Failure rate (Z)
This is the ratio of the number of failures during a particular unit time interval to the average
population during that interval. The average population during an interval is the average
of the populations at the beginning and at the end of the interval. For the given data, the failure
rate during the first unit interval is
$Z(1) = \frac{130}{\frac{1000 + 870}{2}} = 0.139$
Similarly, $Z(2) = \frac{83}{\frac{870 + 787}{2}} = 0.100$, and so on.
Reliability (R)
This is the ratio of survivors at any given time to the total initial population. Probability
of survival is another name for reliability. For the given data, the reliability corresponding
to the first hour is
$R(1) = \frac{870}{1000} = 0.870$
Similarly, $R(2) = \frac{787}{1000} = 0.787$, and so on.
Probability of failure
Probability of failure can also be termed the unreliability factor. Since survival and failure
are complementary events, Probability of failure = 1 - Probability of success (or
reliability).
For the given data, probability of failure corresponding to the first hour = 1 - R(1)
= 1 - 0.870
= 0.130
Similarly, probability of failure corresponding to the fifth hour = 1 - R(5)
= 1 - 0.582
= 0.418
[Figures: failure density versus time interval; failure rate versus time interval; reliability versus time]
Conclusions
(i) Let $f_{d1}$ = failure density associated with the first unit time interval
$f_{d2}$ = failure density associated with the second unit time interval
.
.
$f_{dl}$ = failure density associated with the last unit time interval
Then, $f_{d1} + f_{d2} + \dots + f_{dl} = 1$
(ii) Let $n_1$ = number of failed components during the first unit interval
$n_2$ = number of failed components during the second unit interval
.
.
$n_t$ = number of failed components associated with the t-th unit interval
N = Total initial population
Then, reliability for the t-th hour is the number of survivors till the t-th hour divided by the
initial population.
That is, $R(t) = \frac{N - (n_1 + n_2 + \dots + n_t)}{N} = 1 - \frac{n_1}{N} - \frac{n_2}{N} - \dots - \frac{n_t}{N}$
That is, $R(t) = 1 - (f_{d1} + f_{d2} + \dots + f_{dt})$
(iii) Probability of failure for the t-th hour = $f_{d1} + f_{d2} + \dots + f_{dt}$
(iv) Failure rate or hazard rate associated with the t-th hour,
$Z(t) = \frac{N R(t-1) - N R(t)}{\frac{N R(t-1) + N R(t)}{2}} = \frac{2\,[R(t-1) - R(t)]}{R(t-1) + R(t)}$ (for unit time interval)
$Z(t) = \frac{2\,[R(t - \Delta t) - R(t)]}{\Delta t\,[R(t - \Delta t) + R(t)]}$ (for a time interval of $\Delta t$)
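The four conclusions above can be checked numerically against the test data. The following Python sketch is illustrative only: the function name `life_table` and its return layout are assumptions made here, and the failure counts are the per-interval values from the test on 1000 components.

```python
def life_table(failures, n0):
    """Return (fd, rel, z) for unit time intervals.

    failures[k] is the number of failures in interval k + 1 and
    n0 is the initial population N. rel[t] is R(t), with rel[0] = 1.
    """
    fd, rel, z = [], [1.0], []
    survivors = n0
    for n in failures:
        start = survivors
        survivors -= n
        fd.append(n / n0)                        # failure density
        rel.append(survivors / n0)               # reliability R(t)
        z.append(n / ((start + survivors) / 2))  # failure (hazard) rate
    return fd, rel, z


# Failures per unit interval in the test on N = 1000 components.
FAILURES = [130, 83, 75, 68, 62, 56, 51, 46, 41, 37,
            34, 31, 28, 64, 76, 62, 40, 12, 4]

fd, rel, z = life_table(FAILURES, 1000)
t = 5
print(sum(fd))                  # conclusion (i): the densities sum to 1
print(1 - sum(fd[:t]), rel[t])  # conclusion (ii): both give R(5)
print(sum(fd[:t]))              # conclusion (iii): probability of failure
print(2 * (rel[t - 1] - rel[t]) / (rel[t - 1] + rel[t]), z[t - 1])  # (iv)
```

Each printed pair should agree up to floating-point rounding, which is all that the algebra in (ii) and (iv) claims.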
Problem
Following table gives the results of tests conducted under severe conditions on 1000
safety valves. Obtain the failure density and hazard rates for various time intervals.
Time interval     0-4    4-8    8-12   12-16  16-20  20-24
No. of failures   267    59     36     24     23     11
Lecture notes on Process Reliability Engineering
Subject handled by Dr.Shouri P. V., Associate Professor in Mechanical Engineering, MEC, Cochin.
(for 1st year M.Tech. Mechanical Engineering batch)
Solution
Time interval   No. of failures   Failure density   Hazard rate
0-4             267               0.0668            0.0770
4-8             59                0.0148            0.0210
8-12            36                0.0090            0.0137
12-16           24                0.0060            0.0096
16-20           23                0.0058            0.0095
20-24           11                0.0028            0.0047
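Sample calculation for the time interval 0-4 is given below.
$f_d = \frac{267}{4 \times 1000} = 0.0668$
$Z = \frac{267}{4 \times \left(1000 - \frac{267}{2}\right)} = 0.0770$
Here $1000 - \frac{267}{2} = 866.5$ is the average population during the interval.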
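The same calculation can be repeated for every interval of the safety valve data. The sketch below assumes equal interval widths; the function name `interval_rates` is a hypothetical helper introduced here, not something defined in these notes.

```python
def interval_rates(failures, n0, dt):
    """Failure density and hazard rate for equal intervals of width dt.

    Returns one (fd, z) pair per interval, both expressed per unit time.
    """
    rows = []
    survivors = n0
    for n in failures:
        start = survivors
        survivors -= n
        fd = n / (n0 * dt)                      # failures / (initial population x width)
        z = n / (dt * (start + survivors) / 2)  # failures / (width x average population)
        rows.append((fd, z))
    return rows


rows = interval_rates([267, 59, 36, 24, 23, 11], n0=1000, dt=4)
for fd, z in rows:
    print(f"{fd:.4f}  {z:.4f}")
```

The average population is taken as the mean of the survivors at the start and end of each interval, which is the definition of failure rate used earlier in the notes.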
$R(t) = 1 - (f_{d1} + f_{d2} + \dots + f_{dt})$, with $f_{d1} = \frac{n_1}{N}$, $f_{d2} = \frac{n_2}{N}$,
where $f_{d1}$, $f_{d2}$ etc. are the failure densities associated with the respective intervals.
When the total population is large and the time interval is very small, the variation of
failure density with time will be a smooth curve and the summation can be represented by
integrals.
$f_{d1} + f_{d2} + \dots + f_{dt} = \sum_{0}^{t} f_d \;\to\; \int_0^t f_d(\xi)\,d\xi$
$R(t) = 1 - \int_0^t f_d(\xi)\,d\xi$
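When $\Delta t$ is very small and tends to zero, the value of $R(t)$ approaches that of
$R(t - \Delta t)$ and we get
$Z(t) = \lim_{\Delta t \to 0} \frac{R(t - \Delta t) - R(t)}{\Delta t\, R(t)} = -\frac{1}{R(t)} \frac{dR(t)}{dt}$
since $-\frac{dR(t)}{dt} = \lim_{\Delta t \to 0} \frac{R(t - \Delta t) - R(t)}{\Delta t}$.
Therefore, $Z(t) = -\frac{d}{dt} \ln R(t)$,
since $\frac{d}{dt} \ln R(t) = \frac{1}{R(t)} \frac{dR(t)}{dt}$.
For example, $\frac{d}{dx} \ln(2x) = \frac{1}{2x} \times 2$.
Integrating,
$\int_0^t Z(\xi)\,d\xi = -\ln R(t) + C$
Since at $t = 0$, Reliability $= 1$, we get $C = 0$. Here $\xi$ is a dummy variable.
$\int_0^t Z(\xi)\,d\xi = -\ln R(t)$
$R(t) = \exp\left(-\int_0^t Z(\xi)\,d\xi\right)$
When $\Delta t \to 0$, the failure density is
$f_d(t) = \lim_{\Delta t \to 0} \frac{1}{N} \cdot \frac{N R(t - \Delta t) - N R(t)}{\Delta t} = -\frac{dR(t)}{dt}$
Also we have, $Z(t)\,R(t) = -\frac{dR(t)}{dt}$
Combining the above two equations we get,
$f_d(t) = Z(t)\,R(t)$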
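As a numerical illustration of the relation between hazard rate and reliability, the sketch below integrates an assumed hazard function by the trapezoidal rule and then forms the failure density as Z(t) R(t). The function names, the step count and the example hazard value are assumptions chosen for illustration.

```python
import math


def reliability_from_hazard(z, t, steps=10_000):
    """Approximate R(t) = exp(-integral from 0 to t of z(x) dx).

    z is a function giving the hazard rate at time x; the integral is
    evaluated with the composite trapezoidal rule using `steps` panels.
    """
    h = t / steps
    area = 0.5 * (z(0.0) + z(t))
    for i in range(1, steps):
        area += z(i * h)
    return math.exp(-area * h)


def failure_density(z, t):
    """f_d(t) = Z(t) * R(t)."""
    return z(t) * reliability_from_hazard(z, t)


lam = 0.002  # illustrative constant hazard rate, per hour
r = reliability_from_hazard(lambda x: lam, 500.0)
print(r, math.exp(-lam * 500.0))  # numerical and closed-form values
```

For a constant or linear hazard the trapezoidal rule is exact apart from rounding, so the two printed values should coincide; other hazard shapes would need enough steps for the required accuracy.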
Mean Time To Fail (MTTF) and Mean Time Between Failures (MTBF)
MTTF is the mean time to first failure and is used in case of components that are not
repaired when they fail, but are replaced by new components. On the other hand, MTBF
is the mean time between two successive component failures and is used with repairable
equipment or systems.
Note:
1) When the number of samples tested is small, it is possible to note the time to failure of each
sample, and the mean failure rate is given by the formula
$Z(T) = \frac{1}{T} \cdot \frac{N(0) - N(T)}{N(0)}$
where $N(0)$ is the population at $t = 0$ and $N(T)$ is the population surviving at $t = T$.
2) As the number of specimens tested becomes large, it is tedious to record the time to failure of each
specimen. Instead, the number that fail during specific intervals of time is recorded. Here the
mean time to failure for N specimens will be
$MTTF = \frac{1}{N} \left( n_1 \Delta t + 2 n_2 \Delta t + 3 n_3 \Delta t + \dots + l\, n_l \Delta t \right)$
where,
$\Delta t$ is the time interval
$n_1$ is the number of specimens that failed during the first interval
.
.
$n_l$ is the number of specimens that failed during the last interval
That is, in general, $MTTF = \frac{1}{N} \sum_{K=1}^{l} n_K\, K\, \Delta t$
Problem
In the life testing of 10 specimens of a mini-mixer, the time to failure of each specimen is
recorded as given in the following table. Calculate the mean failure rate for 900 hours and
the mean time to failure for all ten specimens.
Specimen number           1     2     3     4     5     6     7     8     9     10
Time to failure (hours)   805   810   815   820   825   832   842   856   875   900
Solution
The mean failure rate can be calculated using the expression:
$Z(T) = \frac{1}{T} \cdot \frac{N(0) - N(T)}{N(0)}$
Here, Z(900) is to be calculated and N(0) = 10; N(900) = 0
$Z(900) = \frac{1}{900} \cdot \frac{10 - 0}{10} = \frac{1}{900} = 1.11 \times 10^{-3}$ (per hour)
Mean Time to Fail (MTTF)
$MTTF = \frac{1}{10} (805 + 810 + 815 + 820 + 825 + 832 + 842 + 856 + 875 + 900) = 838$ hours
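The two quantities in this solution are short enough to script. The following is a minimal sketch; the function names `mean_failure_rate` and `mttf` are hypothetical helpers named here for illustration.

```python
def mean_failure_rate(n_start, n_end, duration):
    """Z(T) = (1/T) * (N(0) - N(T)) / N(0)."""
    return (n_start - n_end) / (n_start * duration)


def mttf(times_to_failure):
    """Mean of the individually recorded times to failure."""
    return sum(times_to_failure) / len(times_to_failure)


# Times to failure of the ten mini-mixer specimens, in hours.
TIMES = [805, 810, 815, 820, 825, 832, 842, 856, 875, 900]

print(mean_failure_rate(10, 0, 900))  # mean failure rate over 900 hours
print(mttf(TIMES))                    # mean time to failure, hours
```

The same `mean_failure_rate` helper applies whenever only the starting and surviving populations over a test period are known.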
Problem
Ten transformers were tested for 500 hours each within the prescribed operating
conditions, and one transformer failed exactly at the end of the 500 hours exposure. What
is the failure rate for this type of transformer?
Solution
The mean failure rate can be calculated using the expression:
$Z(T) = \frac{1}{T} \cdot \frac{N(0) - N(T)}{N(0)}$
$Z(500) = \frac{1}{500} \cdot \frac{10 - 9}{10} = \frac{1}{5000} = 0.0002$ failure / hour
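Mean Time to Fail (MTTF) in integral form
$MTTF = \frac{1}{N} \sum_{K=1}^{l} n_K\, K\, \Delta t$
Also, by definition failure density can be expressed as $f_{dK} = \frac{n_K}{N \Delta t}$, where $n_K$ is the
number of specimens that failed during the K-th interval. Therefore,
$MTTF = \sum_{K=1}^{l} f_{dK}\, (K \Delta t)\, \Delta t$
Further $K \Delta t$ is the elapsed time $t$ and therefore the expression for the MTTF becomes
$MTTF = \sum_{K=1}^{l} t\, f_d\, \Delta t = \int_0^l t f_d\,dt$
In the above equation, upper limit l is the number of hours after which there are no
survivors. It is customary to replace this by infinity since all components will have failed
at the end of an infinite test period.
$MTTF = \int_0^\infty t f_d\,dt$
Since $R(t) = 1 - \int_0^t f_d(\xi)\,d\xi$, we have
$f_d(t) = -\frac{dR(t)}{dt}$
and substituting this in the equation for MTTF,
$MTTF = -\int_0^\infty t\, \frac{dR(t)}{dt}\,dt = -\int_0^\infty t\,dR(t) = -\left[ t R(t) \right]_0^\infty + \int_0^\infty R(t)\,dt$   {using $\int u\,dv = uv - \int v\,du$}
The term $t R(t)$ vanishes at both limits, so
$MTTF = \int_0^\infty R(t)\,dt$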
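A discrete check of this result, as a sketch: with unit intervals, the sum of t·f_d over all intervals and the sum of R(t) taken at the start of each interval should agree, mirroring the integration by parts. The variable names are illustrative, and the failure counts are those of the test on 1000 components used earlier in these notes.

```python
# Failures per unit interval in the test on 1000 components.
FAILURES = [130, 83, 75, 68, 62, 56, 51, 46, 41, 37,
            34, 31, 28, 64, 76, 62, 40, 12, 4]
N = sum(FAILURES)  # initial population; every item has failed by the end

# Discrete form of the integral of t * f_d dt (interval width 1).
mttf_from_density = sum(k * n / N for k, n in enumerate(FAILURES, start=1))

# Discrete form of the integral of R(t) dt, with R taken at the start
# of each unit interval.
survivors = N
mttf_from_reliability = 0.0
for n in FAILURES:
    mttf_from_reliability += survivors / N
    survivors -= n

print(mttf_from_density, mttf_from_reliability)
```

Both sums give the same value because each failed item contributes one unit of survival time for every interval it entered, which is the discrete counterpart of the step from $\int t f_d\,dt$ to $\int R(t)\,dt$.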
Hazard Models
The data obtained from failure tests can be analyzed to obtain reliability, failure density,
hazard rate and other necessary information. Obviously, the behavioural characteristics
exhibited by one class of components differ from those exhibited by another class of
components. In order to compare different behavioural characteristics and also to draw
general conclusions from behavioural patterns of similar components, a mathematical
model representing the failure characteristics of the components becomes necessary. The
procedure involves assuming a function for hazard rate and thereby obtaining reliability
and failure density by using this failure rate function. The assumed function for the
hazard rate will be the hazard model. Some of the common hazard models are discussed
below.
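Constant hazard model
Here the failure rate is assumed to remain constant with time.
That is, $Z(t) = \lambda$, a constant.
$R(t) = \exp\left(-\int_0^t Z(\xi)\,d\xi\right) = \exp\left(-\int_0^t \lambda\,d\xi\right) = \exp\left(-\lambda \left[\xi\right]_0^t\right) = \exp(-\lambda t)$
[Figure: Variation of failure rate, reliability, probability of failure, and failure density for a constant hazard model]
It can be seen that, for a constant hazard model, the mean time to failure is the reciprocal
of the failure rate.
That is, $MTTF = \int_0^\infty R(t)\,dt = \int_0^\infty e^{-\lambda t}\,dt = -\frac{1}{\lambda} \left[ e^{-\lambda t} \right]_0^\infty = -\frac{1}{\lambda}(0 - 1) = \frac{1}{\lambda}$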
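For the constant hazard model, probability of failure and failure density follow from relations already derived in these notes: F(t) = 1 - R(t) and f_d(t) = Z(t) R(t). The sketch below evaluates all three; the function name `constant_hazard` and the numerical values are illustrative assumptions.

```python
import math


def constant_hazard(lam, t):
    """Return (R, F, fd) at time t for a constant hazard rate lam."""
    r = math.exp(-lam * t)   # reliability
    f = 1.0 - r              # probability of failure
    fd = lam * r             # failure density, Z(t) * R(t)
    return r, f, fd


lam = 1 / 5000               # illustrative rate per hour, so MTTF = 5000 h
r, f, fd = constant_hazard(lam, 200)
print(r, f, fd)

# Reliability evaluated at t = MTTF = 1 / lam.
r_at_mttf, _, _ = constant_hazard(lam, 1 / lam)
print(r_at_mttf)
```

The last value equals exp(-1), roughly 0.37, so under this model only about 37% of items are expected to survive as long as the MTTF.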
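Linearly increasing hazard model
Here the failure rate is assumed to increase linearly with time.
That is, $Z(t) = Kt$, where K is a constant.
$R(t) = \exp\left(-\int_0^t Z(\xi)\,d\xi\right) = \exp\left(-\int_0^t K\xi\,d\xi\right) = \exp\left(-K \left[\frac{\xi^2}{2}\right]_0^t\right) = \exp\left(-\frac{Kt^2}{2}\right)$
$f_d(t) = Z(t)\,R(t) = Kt \exp\left(-\frac{Kt^2}{2}\right)$
$F(t) = 1 - R(t) = 1 - \exp\left(-\frac{Kt^2}{2}\right)$
[Figure: Variation of failure rate, reliability, probability of failure, and failure density for a linearly increasing hazard model]
It can be seen from the failure density curve that the curve has a slope equal to K at
time $t = 0$. Also the value of $f_d(t)$ reaches a maximum of $\sqrt{K/e}$ at time $t = \frac{1}{\sqrt{K}}$, and tends
to zero as t becomes larger.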
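The stated properties of the failure density curve can be confirmed numerically. The sketch below assumes the density f_d(t) = K t exp(-K t^2 / 2) that follows from Z(t) = K t; the value of K, the grid spacing and the function name are illustrative choices.

```python
import math


def linear_hazard_density(k, t):
    """f_d(t) = K * t * exp(-K * t**2 / 2) for the hazard Z(t) = K * t."""
    return k * t * math.exp(-k * t * t / 2)


K = 0.04                                  # illustrative hazard slope
t_peak = 1 / math.sqrt(K)                 # stated location of the maximum
peak = linear_hazard_density(K, t_peak)   # value at that point

# Scan a fine grid on (0, 20] to confirm that no point exceeds the peak.
grid_max = max(linear_hazard_density(K, i * 0.001) for i in range(1, 20001))

# Slope near t = 0, estimated by a forward difference from f_d(0) = 0.
slope_at_zero = linear_hazard_density(K, 1e-6) / 1e-6

print(peak, math.sqrt(K / math.e), grid_max, slope_at_zero)
```

Setting the derivative K exp(-K t^2 / 2) (1 - K t^2) to zero gives t = 1/sqrt(K), which is why the grid search cannot find a larger value.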
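Weibull model
Here the hazard rate is assumed to be of the form $Z(t) = K t^m$, for $m > -1$.
$R(t) = \exp\left(-\int_0^t Z(\xi)\,d\xi\right) = \exp\left(-\int_0^t K \xi^m\,d\xi\right) = \exp\left(-K \left[\frac{\xi^{m+1}}{m+1}\right]_0^t\right) = \exp\left(-\frac{K t^{m+1}}{m+1}\right)$
$f_d(t) = Z(t)\,R(t) = K t^m \exp\left(-\frac{K t^{m+1}}{m+1}\right)$
$F(t) = 1 - R(t) = 1 - \exp\left(-\frac{K t^{m+1}}{m+1}\right)$
The following figure shows the variation of reliability in the case of the Weibull model for various
values of K and m.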
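A short sketch shows how the Weibull form contains the two earlier models as special cases: m = 0 gives the constant hazard model and m = 1 gives the linearly increasing hazard model. The function name and the sample values of K, m and t are assumptions made for illustration.

```python
import math


def weibull_reliability(k, m, t):
    """R(t) = exp(-K * t**(m + 1) / (m + 1)) for the hazard Z(t) = K * t**m."""
    if m <= -1:
        raise ValueError("m must be greater than -1")
    return math.exp(-k * t ** (m + 1) / (m + 1))


# m = 0 reduces to the constant hazard model, R(t) = exp(-K * t).
print(weibull_reliability(0.01, 0, 50), math.exp(-0.01 * 50))

# m = 1 reduces to the linearly increasing model, R(t) = exp(-K * t**2 / 2).
print(weibull_reliability(0.02, 1, 10), math.exp(-0.02 * 10 ** 2 / 2))
```

Values of m between -1 and 0 give a decreasing hazard rate, so the same expression can also describe an early-failure period.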
Bath Tub Curve
Component failure rate as a function of age follows a curve that is concave upward, as
shown in the above figure. Because of its shape, this curve is also referred to as bath tub
curve. This curve exhibits three distinct zones. The first is the short initial period called
variously the early failure, infant mortality, or the burn in period. The decreasing but
greater failure rate early in life of the system is due to one or more of several potential
causes. The causes include inadequate testing or screening of components during
selection or acceptance, damage to components during production, assembly, or testing,
and choice of components which have too great a failure variability. It shall be a specific
goal of the supplier to ensure that the early failure period is rigorously controlled and
covered by a suitable warranty.
The failures in the second zone are termed service failures. During this period, the failure
or hazard rate is constant and it represents the effective life of the product.
The failures in the third zone are the wear-out failures. The incidence of failure in this
zone is high since most of the components will have exceeded their service life, and
consequently would have deteriorated. Hence, they are appropriately called wear-out
failures.
Note: Failure (death) rates for human beings differ by sex, race, nationality, and other factors, but
all failure rates for humans appear to exhibit this distinctive bath tub curve. The failure rate for infants is
extremely high for the first few months, drops sharply, remains fairly constant for many years, and then
slowly climbs as the person ages.
System reliability
A system or a complex product is an assembly of a number of parts or components. The
components may be connected in series or in parallel, or it may be a mixed system, where
the components are connected in series as well as in parallel.
Series configuration
If the components of an assembly are connected in series the failure of any component
causes the failure of the assembly or system.
Let us consider a system consisting of n units which are connected in series as shown in
the following figure.
$R(t) = P(X_1) \cdot P(X_2) \cdots P(X_n) = R_1 R_2 R_3 \cdots R_n$,
where $R_i = P(X_i)$ is the reliability of the i-th unit.
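Parallel configuration
If the components are connected in parallel, the system fails only when all the components fail.
$F(t) = P(\bar{X}_1) \cdot P(\bar{X}_2) \cdots P(\bar{X}_n) = 1 - R(t)$
$R(t) = 1 - [1 - P(X_1)][1 - P(X_2)] \cdots [1 - P(X_n)]$
$= 1 - (1 - R_1)(1 - R_2) \cdots (1 - R_n)$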
Mixed configuration
If a system is having a mixed configuration, then it will have components connected in
parallel as well as in series and the following figure indicates a system having
components in series and parallel.
$R(t) = R_1 R_2 \left[ 1 - (1 - R_3)(1 - R_4)(1 - R_5) \right] R_6$
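The series and parallel rules compose naturally, which makes a mixed configuration easy to evaluate in code. The following is a minimal sketch assuming independent component failures; the helper names `series` and `parallel` and the six reliability values are hypothetical.

```python
from math import prod


def series(*reliabilities):
    """All units must work: the product of the unit reliabilities."""
    return prod(reliabilities)


def parallel(*reliabilities):
    """At least one unit must work: 1 minus the product of unreliabilities."""
    return 1 - prod(1 - r for r in reliabilities)


# Hypothetical reliabilities for units 1 to 6 of a mixed system in which
# units 3, 4 and 5 form a parallel group in series with units 1, 2 and 6.
r1, r2, r3, r4, r5, r6 = 0.95, 0.90, 0.80, 0.80, 0.80, 0.99
system = series(r1, r2, parallel(r3, r4, r5), r6)
print(system)
```

Treating each parallel group as a single equivalent unit and then multiplying along the series path is the same reduction that a reliability block diagram expresses graphically.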
Problem
A certain type of electronic component has a uniform failure rate of 0.00001 per hour.
What is the reliability for a specified period of service of 10,000 hours?
Solution
$\lambda = 0.00001$ per hour
$t = 10000$ hours
$R(10000) = e^{-(0.00001)(10000)} = e^{-0.1} = 0.9048$
Problem
Given a MTTF of 5000 hours and a uniform failure rate. What is the probability that the
system failure occurs within 200 hours?
Solution
$\lambda = \frac{1}{5000}$ (per hour)
t = 200 hours
$R(200) = e^{-\frac{200}{5000}} = 0.96079$ (or) 96.079%
$F(200) = 1 - 0.96079 = 0.03921$ (or) 3.921%
Problem
The following reliability requirements have been set on the subsystems of a
communication system.
Subsystem
Receiver
Control System
Power Supply
Antenna
Solution
$R(t) = (0.95)(0.96)\left[ 1 - (1 - 0.95)(1 - 0.94) \right](0.90) = 0.8183$ (or) 81.83%
Problem
The MTBF of equipment is 500 hours. What is the failure rate expressed in
a) Failures / hour
b) Failures / $10^6$ hours
c) % failures / 1000 hours
Is MTBF a guaranteed failure free period?
Solution
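Failure rate, $\lambda = \frac{1}{MTBF} = \frac{1}{500} = 0.002$ failures / hour {Answer (a)}
$= 0.002 \times 10^6 = 2000$ failures / $10^6$ hours {Answer (b)}
$= 0.002 \times 1000 \times 100 = 200$ % failures / 1000 hours {Answer (c)}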
MTBF cannot be regarded as a guaranteed failure free period as it is only a mean value of
operating times between failures.
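The unit conversions, and the reason MTBF is not a guaranteed failure-free period, can be illustrated with a few lines of Python. This sketch assumes a constant failure rate (an assumption made here, not stated in the problem) when it evaluates the chance of running for one full MTBF without failure; the function and key names are hypothetical.

```python
import math


def failure_rate_units(mtbf_hours):
    """Express the failure rate 1 / MTBF in three common units."""
    lam = 1 / mtbf_hours
    return {
        "per_hour": lam,
        "per_million_hours": lam * 1e6,
        "percent_per_1000_hours": lam * 1000 * 100,
    }


rates = failure_rate_units(500)
print(rates)

# Probability of surviving one full MTBF, assuming a constant hazard rate.
survival_to_mtbf = math.exp(-rates["per_hour"] * 500)
print(survival_to_mtbf)
```

Under that assumption the survival probability at t = MTBF is exp(-1), about 0.37, so most units would be expected to fail before reaching the MTBF.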
Reliability Increasing Techniques
One way of achieving high reliability is by introducing redundant parts. For example, we
may have two parts in parallel such that the system operates if at least one part operates.
Here the probability that the system fails is equal to the probability that both parts fail. If the
failures are assumed to be independent, then the system reliability will be
R(t) = 1 - (1 - R1)(1 - R2), where R1 and R2 are the reliabilities of the two parts respectively. If the
reliability of each part is 0.95 at time t, then the reliability of the system is
R(t) = 1 - (1 - 0.95)(1 - 0.95) = 0.9975
By adding a redundant part we have increased the reliability of the system at time t from 0.95
to 0.9975.
We have been assuming that both parts are operating whenever the system is on and that the
failure of one part does not affect the operation of the other part. This is sometimes called
hot standby and is not always practical. We may need to provide a cold standby, where
the second part is switched into service when the first one fails. Then we must also take
into account the reliability of the switch. If we assume we have, as before, two
components with reliability 0.95 at time t and a switching device with reliability 0.98 at
time t, we have the system reliability at time t as
R(t) = 0.95 + (0.05)(0.98)(0.95) = 0.9966
The above equation is just the probability the first part is operating plus the probability
the first part fails times the probability the switch operates times the probability the
second part operates.
It is to be noted that there is a point of diminishing returns in using redundancy
configurations: an increase in the level of parallel redundancy increases the size,
weight, cost and volume of the equipment, and often requires complicated failure-sensing
devices whose reliability also needs to be considered.
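The diminishing returns from redundancy can be seen by tabulating system reliability against the number of parallel parts. The sketch below assumes identical, independent parts of reliability 0.95 for the hot standby case and uses the cold standby expression with a switch; the function names are hypothetical.

```python
def hot_standby(r, n=2):
    """n identical parts in parallel, all operating: 1 - (1 - r)**n."""
    return 1 - (1 - r) ** n


def cold_standby(r_primary, r_switch, r_backup):
    """Primary works, or primary fails and both switch and backup work."""
    return r_primary + (1 - r_primary) * r_switch * r_backup


# Reliability gained by each additional parallel part of reliability 0.95.
levels = [hot_standby(0.95, n) for n in range(1, 6)]
gains = [b - a for a, b in zip(levels, levels[1:])]
for n, level in enumerate(levels, start=1):
    print(n, level)
print(gains)

print(cold_standby(0.95, 0.98, 0.95))
```

Each added part multiplies the remaining unreliability by 0.05, so the absolute gain shrinks by the same factor at every step while cost and weight keep growing roughly linearly.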
0.73
2) Review the selection of any parts that are relatively new and unproven. Use
standard parts whose reliability has been proved. (However, be sure that the
conditions of previous use are applicable to the new product.)
3) Use derating to assure that stresses applied to the parts are lower than the stresses
the parts can normally withstand.
4) Use robust design methods that enable a product to handle unexpected
environments.
5) Control the operating environment to provide conditions that yield lower failure
rates. Common examples are (a) potting electronic components to protect them
against climate and shock, and (b) use of cooling equipment to keep down
ambient temperatures.
6) Specify replacement schedules to remove and replace low-reliability parts before
they reach the wear-out stage.
7) Prescribe screening tests to detect infant-mortality failures and to eliminate
substandard components. The tests include burn in, accelerated life tests etc.
8) Conduct research and development to attain an improvement in the basic
reliability of those components which contribute most of the unreliability.