Design For Reliability


• Reliability design is an iterative process that begins with the
specification of reliability goals consistent with cost and
performance objectives.
• It considers life-cycle costs of the system and system
effectiveness.
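The design process proceeds as follows:
1. Specify reliability goals.
2. Allocate reliability to components.
3. Implement design methods.
4. Perform failure analysis (FMEA/FMECA). Are goals achieved? If no,
iterate on the design; if yes, continue.
5. Perform system safety analysis (FTA). Are goals achieved? If no,
iterate on the design; if yes, continue.
6. Ready for production.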

Reliability activities and product life cycle
• Conceptual and preliminary design: specification; allocation;
design methods.
• Detailed design, development, and prototyping: design methods;
failure analysis; growth testing; safety analysis.
• Production and manufacturing: acceptance testing; quality control;
burn-in and screen testing.
• Product use and support: preventive and predictive maintenance;
modification; parts replacements.
Reliability specification and system measurements
• Mean time to failure (MTTF) or mean time between failures
(MTBF) has been the primary, and often the only, measure of
reliability.
• This parameter is not sufficient except in the case of an
exponential failure distribution.
• A better measure is to specify the reliability at specific
points in time.
For example, stating that 99% reliability is required after 2 years
of operation and 95% reliability is required after 5 years of
operation is much more precise than stating that an MTTF of 10 years
is required.
• If the MTTF is to be the only reliability specification,
a constant failure rate should be assumed as part of the
specification and subsequently demonstrated. Otherwise, a
wide range of distributions must be tried.
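To illustrate why an MTTF alone does not pin down reliability at a given time, the following minimal sketch compares two distributions that share an MTTF of 10 years. The Weibull shape value of 3 and the function names are assumptions chosen for illustration; they do not come from the slides.

```python
import math

def exponential_reliability(t, mttf):
    """R(t) = exp(-t / MTTF) for a constant failure rate of 1 / MTTF."""
    return math.exp(-t / mttf)

def weibull_reliability(t, mttf, shape):
    """Weibull reliability with the scale chosen so that the mean equals mttf."""
    scale = mttf / math.gamma(1.0 + 1.0 / shape)
    return math.exp(-((t / scale) ** shape))

# Same MTTF (10 years), different reliability at 2 and 5 years.
for years in (2, 5):
    r_exp = exponential_reliability(years, mttf=10.0)
    r_wei = weibull_reliability(years, mttf=10.0, shape=3.0)  # shape is an assumed value
    print(f"t = {years} y: exponential {r_exp:.4f}, Weibull(shape=3) {r_wei:.4f}")
```

Under the exponential assumption, an MTTF of 10 years gives a reliability of about 0.82 at 2 years and 0.61 at 5 years, well short of the 99% and 95% used in the example.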
System effectiveness
System effectiveness is the probability that the system can
successfully perform its intended purpose or mission when operated
under specified conditions. It includes reliability.
System effectiveness (AND): operational readiness, mission
availability, design adequacy.
Mission availability (OR): reliability, maintainability.
• Operational readiness is the probability that the system is
operational when first used or at the start of the mission.
• Mission availability is the percentage of time the system will be
operating during the mission. If the system is not repairable,
availability is the same as reliability.
• Design adequacy is the probability that the system will
accomplish its mission given that the system is operating within
its design parameters.
System effectiveness = operational readiness × mission availability
× design adequacy
Example: A copy machine works on demand. However, in order
to complete the job in time, it must operate at a speed of 45 copies a
minute. If it cannot do this, then the copier is inadequately
designed for the job.
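The product form of system effectiveness can be sketched as below. The function name and the three input values are hypothetical and serve only to show the arithmetic.

```python
def system_effectiveness(operational_readiness, mission_availability, design_adequacy):
    """Product of the three probabilities defined above."""
    for p in (operational_readiness, mission_availability, design_adequacy):
        if not 0.0 <= p <= 1.0:
            raise ValueError("each factor must be a probability in [0, 1]")
    return operational_readiness * mission_availability * design_adequacy

# Hypothetical values for illustration only.
effectiveness = system_effectiveness(0.98, 0.95, 0.90)
print(f"system effectiveness = {effectiveness:.4f}")
```

Because the factors multiply, a weak value in any one of them limits the overall effectiveness.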
Reliability allocation
Once the system reliability goals have been defined, reliability must
then be allocated to the components and possibly subcomponents in
a manner that will support these goals.
$$h\{R_1(t), R_2(t), \ldots, R_n(t)\} \ge R^*(t)$$
where $R_i(t)$ is the reliability at time t of the ith component,
$R^*(t)$ is the system reliability goal at time t, and h is a function
that relates the component reliabilities to the system reliability.
Similarly for MTTF:
$$g\{\mathrm{MTTF}_1, \mathrm{MTTF}_2, \ldots, \mathrm{MTTF}_n\} \ge \mathrm{MTTF}^*$$
If all components are serially related and their failures are
independent:
$$\prod_{i=1}^{n} R_i(t) \ge R^*(t)$$
Exponential case
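If all the components have a constant failure rate $\lambda_i$:
$$\prod_{i=1}^{n} e^{-\lambda_i t} \ge R^*(t)$$
or equivalently
$$\sum_{i=1}^{n} \lambda_i \le \lambda_s$$
where $\lambda_s$ is the system failure rate goal, so that
$R^*(t) = e^{-\lambda_s t}$.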
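As a small check of the exponential series case, the sketch below computes the series reliability and tests the failure rate condition. The function names and the numerical failure rates are assumptions for illustration.

```python
import math

def series_reliability(failure_rates, t):
    """Reliability of independent series components with constant failure rates."""
    return math.prod(math.exp(-lam * t) for lam in failure_rates)

def meets_rate_goal(failure_rates, system_rate_goal):
    """True when the sum of component failure rates is within the system goal."""
    return sum(failure_rates) <= system_rate_goal

rates = [1e-4, 2e-4, 3e-4]   # hypothetical failure rates per hour
print(series_reliability(rates, t=1000.0))
print(meets_rate_goal(rates, system_rate_goal=7e-4))
```

The product of the exponentials equals the exponential of the summed rates, which is why the two forms of the constraint are equivalent.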
Optimal allocation
Ideally, reliability allocation should be accomplished in a least-cost
manner.
Let each component have a current reliability $R_i$, where
$\prod_i R_i < R^*$. The optimal solution may be obtained by solving:
$$\min z = \sum_{i=1}^{n} C_i(x_i)$$
$$\text{subject to } \prod_{i=1}^{n} (R_i + x_i) \ge R^* \text{ and } 0 < R_i + x_i \le B_i < 1$$
where i = 1, 2, …, n, $x_i$ is the increase in the reliability of the ith
component, $C_i(x_i)$ is the corresponding cost to achieve this growth,
and $B_i$ is an upper bound on the attainable component reliability.
Consider a quadratic cost function (the most common function in which
the cost of reliability growth increases at an increasing rate):
$$\min z = \sum_{i=1}^{n} c_i x_i^2$$
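Ignoring the inequality sign (treating the reliability constraint as
an equality), form the Lagrangian function
$$L(x_i, \theta) = \sum_{i=1}^{n} c_i x_i^2 - \theta \left[ \prod_{i=1}^{n} (R_i + x_i) - R^* \right]$$
where $\theta$ is the Lagrange multiplier. Setting the partial
derivatives to zero:
$$\frac{\partial L(x_i, \theta)}{\partial x_i} = 2 c_i x_i - \theta \prod_{j=1,\, j \ne i}^{n} (R_j + x_j) = 0$$
$$\frac{\partial L(x_i, \theta)}{\partial \theta} = -\left[ \prod_{i=1}^{n} (R_i + x_i) - R^* \right] = 0$$
Multiplying the first equation by $(R_i + x_i)$ and rearranging terms:
$$2 c_i x_i (R_i + x_i) = \theta \prod_{j=1}^{n} (R_j + x_j) = \theta R^*$$
Rearranging gives a quadratic equation in $x_i$:
$$2 c_i x_i^2 + 2 c_i R_i x_i - \theta R^* = 0$$
The solution (taking the positive root) is:
$$x_i = \frac{-2 c_i R_i + \sqrt{4 c_i^2 R_i^2 + 8 c_i \theta R^*}}{4 c_i}$$
The multiplier $\theta$ is then chosen so that the second condition,
$\prod_i (R_i + x_i) = R^*$, is satisfied.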
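The closed-form expression for the reliability increase still depends on the multiplier θ. One way to find θ, sketched below, is a bisection search on the equality constraint. The bisection, the function name, and the example data are assumptions for illustration; the sketch also ignores the upper bounds B_i.

```python
import math

def allocate_quadratic(R, c, R_goal, tol=1e-12):
    """Find theta so that prod(R_i + x_i) = R_goal under cost sum(c_i * x_i**2).

    Assumes prod(R) < R_goal and c_i > 0; the upper bounds B_i are not enforced.
    """
    def increments(theta):
        # Positive root of 2*c*x**2 + 2*c*R*x - theta*R_goal = 0 for each component.
        return [(-2 * ci * Ri + math.sqrt(4 * ci**2 * Ri**2 + 8 * ci * theta * R_goal))
                / (4 * ci) for Ri, ci in zip(R, c)]

    def product(theta):
        return math.prod(Ri + xi for Ri, xi in zip(R, increments(theta)))

    lo, hi = 0.0, 1.0
    while product(hi) < R_goal:      # expand the bracket until the goal is reached
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if product(mid) < R_goal:
            lo = mid
        else:
            hi = mid
    return hi, increments(hi)

R = [0.90, 0.80]      # hypothetical current reliabilities
c = [1.0, 1.0]        # hypothetical cost coefficients
theta, x = allocate_quadratic(R, c, R_goal=0.81)
print(theta, x)
```

The search works because every increase grows with θ, so the product of the improved reliabilities is monotone in θ. If a resulting value exceeds its bound B_i, that component would need to be fixed at the bound and the remaining ones solved again.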
A few popular strategies discussed in the literature do not require
optimization. Two of them are the ARINC and AGREE methods.
ARINC method
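The simplest one is ARINC, which assumes components are in series,
are independent, and have constant failure rates. Each current failure
rate $\lambda_i$ is scaled by its weight:
$$w_i = \frac{\lambda_i}{\sum_{j=1}^{n} \lambda_j}, \qquad \lambda_i^{\text{new}} = w_i \lambda^*, \qquad i = 1, 2, \ldots, n$$
where $\lambda^*$ is the target failure rate.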
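A minimal sketch of the ARINC weighting follows. The function name and the example failure rates are assumptions for illustration.

```python
def arinc_allocation(current_rates, target_rate):
    """Scale each current failure rate by its share of the total."""
    total = sum(current_rates)
    weights = [lam / total for lam in current_rates]
    return [w * target_rate for w in weights]

current = [0.001, 0.002, 0.002]     # hypothetical current failure rates
new_rates = arinc_allocation(current, target_rate=0.004)
print(new_rates)
```

Because the weights sum to 1, the allocated failure rates add up to the target, and each component keeps its relative share of the total.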
AGREE method
The AGREE (Advisory Group on Reliability of Electronic Equipment)
method considers a system composed of n components, each having
$n_i$ modules or subcomponents.
This approach allocates an equal share of the reliability to each
module in the system. The contribution of the ith component to system
reliability is given by $[R^*(t)]^{n_i/N}$. This leads to
$$w_i \left(1 - e^{-\lambda_i t_i}\right) = 1 - [R^*(t)]^{n_i/N}$$
The left side is the joint probability that the ith component fails
and that this failure results in a system failure. The right side is
the failure probability allocated to the ith component. Solving for
$\lambda_i$ results in:
$$\lambda_i = -\frac{1}{t_i} \ln\left[1 - \frac{1 - [R^*(t)]^{n_i/N}}{w_i}\right]$$
such that
$$\prod_{i=1}^{n} e^{-\lambda_i t_i} \le R^*(t)$$
where
t is the system operating time
$R^*(t)$ is the system reliability goal
n is the number of components
$n_i$ is the complexity number, the number of modules within component i
N is the sum of the $n_i$, the total number of modules in the system
$t_i$ is the operating time of the ith component
$\lambda_i$ is the failure rate of the ith component
$w_i$ is the probability that the system will fail given that component i
has failed
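The AGREE formula can be evaluated directly, as in the sketch below. The function name and all input values are hypothetical.

```python
import math

def agree_allocation(R_goal, modules, op_times, weights):
    """Failure rate allocated to each component by the AGREE formula above."""
    N = sum(modules)
    rates = []
    for n_i, t_i, w_i in zip(modules, op_times, weights):
        allocated_failure = 1.0 - R_goal ** (n_i / N)   # right side of the equation
        ratio = allocated_failure / w_i
        if ratio >= 1.0:
            raise ValueError("importance weight too small for this allocation")
        rates.append(-math.log(1.0 - ratio) / t_i)
    return rates

R_goal = 0.95                      # hypothetical system reliability goal
modules = [10, 20, 20]             # hypothetical complexity numbers n_i
op_times = [100.0, 100.0, 50.0]    # hypothetical operating times t_i
weights = [1.0, 1.0, 0.8]          # hypothetical importance weights w_i
agree_rates = agree_allocation(R_goal, modules, op_times, weights)
print(agree_rates)
```

A component with more modules receives a larger share of the allowed failure probability, and a weight below 1 raises the allowed failure rate because not every failure of that component brings the system down.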
Redundancies
Redundancy may be used to achieve the allocated component reliability.
[Figure: component 1 in series with the parallel pair of components 2
and 3, followed in series by component 4; R' denotes the reliability
of the parallel pair.]
If $R^*$ is the system reliability goal, one can write
$R^* = R_1 R' R_4$.
Assuming $R'$ is the allocated reliability of the parallel pair:
$$R' = 1 - (1 - R_2)(1 - R_3) = R_2 + R_3 - R_2 R_3$$
One can assign a reliability to one of the components, say component
2, such that $R_2 < R'$, and then solve for the other:
$$R_3 = \frac{R' - R_2}{1 - R_2}$$
If both components receive the same reliability R, we have
$R' = 2R - R^2$, which has the solution $R = 1 - (1 - R')^{0.5}$.
In the case of complex systems, it is generally possible to reduce the
system initially to serially related components and then further
decompose as necessary.
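The two ways of splitting an allocated reliability across a parallel pair can be sketched as follows. The function names and the value of 0.99 for the allocated reliability are assumptions for illustration.

```python
import math

def other_component(R_pair, R2):
    """Reliability needed from component 3 once component 2 is fixed."""
    if not R2 < R_pair:
        raise ValueError("R2 must be less than the allocated pair reliability")
    return (R_pair - R2) / (1.0 - R2)

def equal_share(R_pair):
    """Common reliability R when both parallel components receive the same value."""
    return 1.0 - math.sqrt(1.0 - R_pair)

R_pair = 0.99                      # hypothetical allocated reliability R'
print(other_component(R_pair, R2=0.90))
print(equal_share(R_pair))
```

Either result can be checked by substituting back into the expression for the reliability of the parallel pair.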
Design methods
A product fails prematurely because of inadequate design
features, manufacturing and part defects, abnormal stresses introduced
by packaging or distribution, operator and maintenance error, or
external conditions that exceed the design parameters.
Various activities and parameters are involved in the design of
products:
1. Material selection. It involves consideration of the following
parameters:
• Tensile strength
• Hardness
• Fatigue life
• Creep
2. Derating. It is the use of a component under stress significantly
below its rated value.
3. Stress-strength analysis. The traditional approach is to design
safety margins, or safety factors, into the equipment or component.
Failure is likely to occur if the safety factor is less than 1 or the
safety margin is negative.
• Safety factor: the ratio of the capacity of the system to the load
placed on the system.
• Safety margin: the difference between the system capacity and the
load.
4. Complexity and technology. The number of parts in a
system is a measure of system complexity.
5. Redundancy. Redundancy includes both active and standby
units. There may be duplicate active units, all operating, with only
one required to survive, or there may be the more general k-out-of-n
redundancy.
• Redundancy optimization
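The safety factor and safety margin from item 3, and the k-out-of-n redundancy from item 5, can be sketched as below. The k-out-of-n function assumes identical, independent units, which the slides do not state; the function names and numbers are also assumptions.

```python
from math import comb

def safety_factor(capacity, load):
    """Ratio of system capacity to the load placed on it."""
    return capacity / load

def safety_margin(capacity, load):
    """Difference between system capacity and load."""
    return capacity - load

def k_out_of_n_reliability(k, n, r):
    """Probability that at least k of n identical, independent units survive."""
    return sum(comb(n, j) * r**j * (1.0 - r) ** (n - j) for j in range(k, n + 1))

print(safety_factor(1500.0, 1000.0), safety_margin(1500.0, 1000.0))
print(k_out_of_n_reliability(1, 2, 0.9))   # two active units, one required
print(k_out_of_n_reliability(2, 3, 0.9))   # 2-out-of-3 redundancy
```

With k = 1 the function reduces to the usual active parallel case, and with k = n it reduces to a series system.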
Redundancy optimization
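Let
$R_i(t)$ be the reliability of component i at time t
$n_i$ be the number of parallel units of component i
$c_i$ be the unit cost of component i
B be the budget available for additional units
The problem is to find the optimal $n_i$ so that
$$\max \prod_{i=1}^{M} \left[1 - (1 - R_i(t))^{n_i}\right]$$
$$\text{subject to } \sum_{i=1}^{M} c_i n_i \le B + \sum_{i=1}^{M} c_i$$
The right side of the constraint is the budget plus the cost of the
one unit of each component that is already in place.
Marginal analysis may be used to solve this problem if the natural log
of the reliability is maximized rather than the function itself:
$$\max \sum_{i=1}^{M} \ln\left[1 - (1 - R_i(t))^{n_i}\right]$$
$$\Delta_i = \frac{\ln\left[1 - (1 - R_i(t))^{n_i + 1}\right] - \ln\left[1 - (1 - R_i(t))^{n_i}\right]}{c_i}$$
Here $\Delta_i$ is the gain in log reliability per unit cost from adding
one more unit of component i.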
Marginal analysis consists of the following steps:
1. Set $n_i = 1$, i = 1, 2, 3, …, M, and set cost = 0.
2. Compute $\Delta_i$, i = 1, 2, 3, …, M.
3. Find max{$\Delta_1, \Delta_2, \Delta_3, \ldots, \Delta_M$}; call it $\Delta_k$.
4. Set cost = cost + $c_k$.
5. If cost < B, then set $n_k = n_k + 1$, recompute $\Delta_k$, and go to
step 3; otherwise stop.
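The steps above can be sketched as a short loop. One deliberate assumption: the sketch adds a unit whenever the running cost does not exceed B, so the budget can be spent exactly, whereas step 5 uses a strict inequality. The function name and data are hypothetical, and each reliability is assumed to lie strictly between 0 and 1.

```python
import math

def marginal_analysis(R, c, budget):
    """Greedy redundancy allocation by largest gain in log reliability per unit cost."""
    M = len(R)
    n = [1] * M                      # step 1: one unit of each component
    cost = 0.0

    def delta(i):
        q = 1.0 - R[i]
        gain = math.log(1.0 - q ** (n[i] + 1)) - math.log(1.0 - q ** n[i])
        return gain / c[i]

    deltas = [delta(i) for i in range(M)]            # step 2
    while True:
        k = max(range(M), key=lambda i: deltas[i])   # step 3
        cost += c[k]                                 # step 4
        if cost > budget:                            # step 5 (assumed: cost <= B allowed)
            cost -= c[k]
            break
        n[k] += 1
        deltas[k] = delta(k)
    reliability = math.prod(1.0 - (1.0 - R[i]) ** n[i] for i in range(M))
    return n, cost, reliability

units, spent, system_reliability = marginal_analysis([0.8, 0.9], [1.0, 1.0], budget=2.0)
print(units, spent, system_reliability)
```

As in the listed steps, the loop stops as soon as the best candidate is unaffordable, even if a cheaper unit of another component would still fit the budget.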
Failure Analysis
Failure mode and effect analysis (FMEA), or failure mode, effect, and
criticality analysis (FMECA), is a formalized design process with the
objective of improving the inherent reliability.
This is an iterative process that influences design by identifying
failure modes and assessing their probabilities of occurrence and their
effects on the system. It may also include isolating the causes
and determining corrective actions or preventive measures.
FMEA comprises the following steps:
1. System definition
This step identifies those system components that will be subject
to failure. A functional and physical description of the system provides
the definition and boundaries for performing the analysis.
2. Identification of failure modes
Failure modes are identified by a hardware or a functional approach.
Failure modes are the observable manners in which a component fails.
For example: valve open, circuit short, pipe or valve rupture, power
loss, etc.
3. Determination of causes
For each failure mode, an assessment is made of the probable
cause or causes. A failure mode may have more than one
cause. Examples include:
Failure mode        Category    Cause                       Failure mechanism
Capacitor short     Electrical  High voltage                Derating
Failure of metal    Chemical    Humid and salty atmosphere  Corrosion
Connector fracture  Mechanical  Excessive vibration         Fatigue
4. Effect assessment
The impact each failure has on the operation or status of the system
is assessed. Effects may range from complete system failure to
partial degradation to no impact on performance.
Failure mechanism               Failure mode          Failure effect
Corrosion                       Failure of tank wall  Tank rupture
Manufacturing defect in casing  Leaking battery       Failure of flashlight to light
Friction and excessive wear     Drive belt break      Shutdown of production line
Prolonged low temperature       Brittle seals         Leakage in hydraulic system
5. Classification of severity
A severity classification is assigned to each failure mode to be used
for prioritization of corrective actions. Generally, severity is
classified in four classes.
Category I: Catastrophic. Significant system failure occurs that can
result in injury, loss of life, or major damage.
Category II: Critical. Complete loss of system occurs, performance
unacceptable.
Category III: Marginal. System is degraded with partial loss in
performance.
Category IV: Negligible. Minor failure occurs with no effect on
acceptable system performance.
6. Estimation of probability of occurrence
The probability of occurrence of each failure mode is estimated,
generally using handbooks or existing databases. Some of the standard
handbooks on FMEA classify the frequency of occurrence qualitatively
in five major levels:
Level A: Frequent: High probability of failure (p ≥ 0.20)
Level B: Probable: Moderate probability of failure (0.1 ≤ p ≤ 0.20)
Level C: Occasional: Marginal probability of failure (0.01 ≤ p ≤ 0.1)
Level D: Remote: Unlikely probability of failure (0.001 ≤ p ≤ 0.01)
Level E: Extremely unlikely: Rare probability of failure (p ≤ 0.001)
7. Computation of criticality index
This is a quantitative measure of the criticality of the failure mode
that combines the probability of the failure mode's occurrence with
its severity ranking. The index may be defined as:
$$C_k = \alpha_{kp} \, \beta_k \, \lambda_p \, t$$
where
$C_k$ is the criticality index for failure mode k
$\alpha_{kp}$ is the fraction of component p's failures having failure
mode k
$\beta_k$ is the conditional probability that failure mode k will result
in the identified failure effect
$\lambda_p$ is the failure rate of component p
t is the duration of time used in the analysis
The conditional probability $\beta_k$ is a subjective estimate that may
be quantified as:
Failure effect  β
Certain         β = 1.0
Probable        0.10 < β < 1.0
Possible        0 < β < 0.10
No effect       β = 0
For a given p, the sum of $\alpha_{kp}$ over all its failure modes would
normally equal 1.
Failure mode classification matrix
[Figure: criticality index matrix with probability-of-occurrence
levels A through E on one axis and severity classifications IV, III,
II, I on the other.]
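The criticality index from step 7 can be computed and used to rank failure modes, as in the sketch below. The class name, the failure modes, and every numerical value are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class FailureMode:
    name: str
    alpha: float         # fraction of the component's failures with this mode
    beta: float          # conditional probability of the identified failure effect
    failure_rate: float  # failure rate of the component (per hour)

def criticality_index(mode, duration):
    """C_k = alpha_kp * beta_k * lambda_p * t."""
    return mode.alpha * mode.beta * mode.failure_rate * duration

# Two hypothetical failure modes of the same component; the alphas sum to 1.
modes = [
    FailureMode("valve leaks", alpha=0.4, beta=0.1, failure_rate=2e-5),
    FailureMode("valve stuck open", alpha=0.6, beta=1.0, failure_rate=2e-5),
]
ranked = sorted(modes, key=lambda m: criticality_index(m, 1000.0), reverse=True)
for m in ranked:
    print(f"{m.name}: C = {criticality_index(m, 1000.0):.2e}")
```

Sorting by the index gives one input to prioritization; the severity category of each mode still has to be considered alongside it.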
8. Determination of corrective action
This is very dependent on the problem. Those failure modes having
a high criticality index and severity classification should receive
the most attention.
Design activities should be oriented toward removing the cause of
failure, decreasing the probability of occurrence, and reducing the
severity of failure.
Fault Tree Analysis
Fault tree: a logical representation of the relationship
between an accident or event and its basic
causes. The relationship is expressed using
logic gates (AND or OR).
Fault tree analysis: a deductive study to quantify the
probability of occurrence of an accident using
the fault tree and basic failure data.
Logic gates
AND: an output from the gate (event) will occur only when all inputs
occur.
OR: an output from the gate (event) will occur if any of the inputs
occur.
INHIBIT gate: an output from the gate (event) will occur when the
inputs occur and the inhibit event also occurs.
Basic event: a fault event that needs no further definition.
Undeveloped event: an event that cannot be further developed due to
lack of information.
Steps in Fault Tree Analysis
1. Fault Tree Development

2. Minimum cut set finding
3. Probabilistic analysis using basic failure data
4. Importance factor estimation
Fault Tree Development
1. Identify the top event as the accident scenario.
2. Identify the events that may cause the top event to occur.
3. Can these events be broken down?
• Yes: identify the causes that may lead to these events, identify the
relationship between the events and their basic causes, and transform
these relationships into a fault tree using gates.
• No: identify the relationship of these events to the top event, and
transform these relationships into a fault tree using gates.
Minimum Cut Set Finding

• Analytical Procedure
• Simulation methods
Probabilistic analysis using basic failure data
• Analytical Procedure
• Monte Carlo Simulation method
Importance factor estimation

Here the importance of each basic event is quantified.
This is done by repeating steps 2 and 3 while one particular event is
made totally safe (not to fail).
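Steps 2 to 4 can be illustrated on a small tree. The sketch below is a generic analytical approach assuming independent basic events; it is not the PROFAT algorithm, and the tree, event names, and probabilities are hypothetical.

```python
from itertools import combinations

# A node is a basic-event name (str) or a tuple (gate, [children]).
def cut_sets(node):
    """Minimal cut sets of a node, as a list of frozensets of basic events."""
    if isinstance(node, str):
        return [frozenset([node])]
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":                       # any child cut set fails the gate
        result = [cs for sets in child_sets for cs in sets]
    elif gate == "AND":                    # one cut set from every child is needed
        result = [frozenset()]
        for sets in child_sets:
            result = [a | b for a in result for b in sets]
    else:
        raise ValueError(f"unsupported gate: {gate}")
    unique = set(result)
    return [s for s in unique if not any(other < s for other in unique)]

def top_probability(min_cuts, p):
    """Exact top-event probability by inclusion-exclusion (independent events)."""
    total = 0.0
    for r in range(1, len(min_cuts) + 1):
        for combo in combinations(min_cuts, r):
            term = 1.0
            for event in frozenset().union(*combo):
                term *= p[event]
            total += (-1) ** (r + 1) * term
    return total

def improvement(min_cuts, p, event):
    """Drop in top-event probability when one event is made totally safe."""
    safe = dict(p)
    safe[event] = 0.0
    return top_probability(min_cuts, p) - top_probability(min_cuts, safe)

tree = ("OR", ["A", ("AND", ["B", "C"]), ("AND", ["A", "B"])])
p = {"A": 0.01, "B": 0.1, "C": 0.2}        # hypothetical basic failure data
cuts = cut_sets(tree)
print(sorted(sorted(c) for c in cuts), top_probability(cuts, p))
print({e: improvement(cuts, p, e) for e in p})
```

Inclusion-exclusion grows exponentially with the number of cut sets, so it suits only small trees; larger trees are usually handled by bounds or by simulation, as the slides indicate.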
Analytical Simulation Methodology and PROFAT
1. Start.
2. Represent the undesired event in terms of a fault tree.
3. Transform the fault tree into a Boolean matrix.
4. Solve the Boolean matrix for minimum cut sets.
5. Optimize the cut sets using the optimization criteria. Is
optimization over? If no, repeat this step; if yes, continue.
6. Perform probability analysis. The input probabilities are first
transformed from static probabilities to fuzzy probability sets.
7. Calculate the improvement index.
8. Stop.
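Minimum cut set determination:
1. Start.
2. Represent the fault tree in a Boolean matrix such that
row = gate and column = event + gate.
3. Take the top gate.
4. Check the type of gate. For an OR gate, open a new row to enter the
elements of this gate. For an AND gate, enter all entries of this gate
in the same row.
5. Calculate factor 'G'.
6. Is any gate present? If yes, check the type of that gate as in
step 4; if no, continue.
7. Are all rows checked? If no, take a new row and repeat; if yes,
continue.
8. Optimization: apply optimization techniques using the optimization
criteria.
9. The result is the optimum minimal cut set.
10. Stop.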
