
Proceedings of IMECE 04 2004 IMECE Design Engineering Technical Conferences November 13-19, Anaheim, CA

IMECE2004/DFM-59612
PREDICTING COST OF POOR QUALITY AND RELIABILITY FOR SYSTEMS USING FAILURE MODES AND EFFECTS ANALYSIS
Seung J. Rhee rhees@mml.stanford.edu Kosuke Ishii ishii@stanford.edu

Department of Mechanical Engineering Design Division Stanford University Stanford, California, 94305-4022, USA

ABSTRACT
Failure Modes and Effects Analysis (FMEA) is a design tool that mitigates risks during the design phase, before failures occur. Although many industries use the traditional FMEA technique, it has many limitations and problems. Traditional FMEA identifies failure modes with high risk but does not consider the consequences in terms of cost, which could lead to unnecessarily expensive solutions. We have used a new methodology, "Life Cost-Based FMEA", which measures risk of failure in terms of cost, to compare two different technologies that might be used for the Next Linear Collider (NLC) magnets: electromagnets or permanent magnets. We derived the availability estimates for the two types of magnet systems using empirical data from the Stanford Linear Accelerator Center's (SLAC) accelerator failure database, as well as expert opinions on permanent magnet failure modes and industry standard failure data. We predict the labor and material costs to repair magnet failures using a Monte Carlo simulation of all possible magnet failures over a 30-year lifetime. Our goal is to maximize up-time of the NLC through magnet design improvements and the optimal combination of electromagnets and permanent magnets, while reducing magnet system lifecycle costs.

KEYWORDS: reliability, failure, field data, failure cost, system, validation

1. INTRODUCTION
Practitioners have identified fundamental problems with traditional FMEA over the years:
1. FMEA is performed too late to impact design decisions [1,2].
2. Risk priority number is not a consistent measure of risk [3,4].
3. The FMEA process does not usually capture human errors and environmental effects, which have a significant impact on the performance of the system [5].

Risk contains two basic elements: (1) chance, measured by probability, and (2) consequence, measured by cost. A new methodology, called "Life Cost-Based FMEA", has been developed to overcome these shortcomings. The Life Cost-Based methodology has been discussed in detail in previous publications by Rhee and Ishii [6], [7], [12]. It measures risk of failure in terms of cost. Cost is a universal language understood by engineers without ambiguity. Expected failure cost is defined as the product of the probability of a particular failure and the cost associated with that failure. Lifetime failure cost is the sum of all the expected costs for all failure scenarios at all stages of a system component's life: design, manufacture, installation, and operation. The probability of a failure can be characterized as the frequency of such failures in a system containing multiple components. Field failures do not generally occur at a uniform rate, but follow a distribution in time commonly known as the bathtub curve. The bathtub curve consists of three stages: infant mortality, the useful life period, and the wear-out period. The infant mortality phase is usually the first year of device operation. The useful life stage follows the infant mortality phase; failures at this stage are random and are caused by normal wear and tear or by unexpected conditions. The wear-out phase is due to component aging. Making predictions for infant mortality is difficult. This paper will discuss failures that

Copyright 2004 by ASME

happen during the useful life phase where failures are unexpected.

2. BACKGROUND
The Stanford Linear Accelerator Center (SLAC) is a national research laboratory that is charged with investigating the most basic elements of matter. Engineers at SLAC and other labs are currently designing the Next Linear Collider (NLC), which will be 20 miles long, 10 times longer than the current linear accelerator at SLAC. The proposed NLC has an 85% overall availability goal; the availability specifications for its 8653 magnets and their 6167 power supplies are 97.5% each. SLAC intends to operate the NLC 24 hours per day, 7 days a week, for 9 months a year. Thus, all of the electromagnets and their power supplies must be highly reliable or quickly repairable to minimize interruption of the particle physics research program. SLAC has kept a history of all failures for the past 15 years in an online database called the Computer Aided Trouble Entry and Reporting (CATER) system [8]. Empirical data from SLAC's accelerator failure database and design experience are used to calculate the Mean Time Between Failures (MTBF) for failure modes identified using FMEA. The occurrence or probability of a given failure mode can then be determined from its MTBF.

The NLC requires 8653 magnets to control its particle beams. Two different technologies could be used for the magnets: electromagnets or permanent magnets. An electromagnet's strength is varied by changing the electric current in the coils, so a power supply is required as part of the system. SLAC has been using more than 3000 electromagnets over the past 30 years and their failure data are readily available. The competing technology uses permanent magnets without any current. Permanent magnets are simpler in design and their initial manufacturing cost may be smaller. However, the technology risk has not been ascertained for adjustable permanent magnets.
The following section compares the cost of poor quality of the two competing technologies: electromagnets and permanent magnets.

3. ANALYSIS
3.1. Electromagnet System
The failure modes and operating conditions for the electromagnets are first investigated to make predictions for the NLC electromagnets. The configuration of the electromagnet system for SLAC is shown in Figure 1. Magnets and power supplies make up the electromagnet system. To make the calculation manageable, the 129 different types of electromagnets at SLAC are categorized into 2 fundamental designs: water-cooled and solid wire electromagnets. Power supplies are categorized into 2 different sizes, small (< 12V, < 500W) and large (> 12V, > 500W), for switching and non-switching types. Although the SLC uses both switching and non-switching supplies, the NLC will use only the switching type.

Electromagnets (total 5750): solid wire 1923, water cooled 3827
Power supplies (total 3468): non-switching 2121, switching 1347

Figure 1 Electromagnet System Configuration for SLC

3.1.1. Electromagnets
Identification of the magnet and power supply configuration is necessary because the number of components changes according to the type of experiment being conducted at the time. The history of magnet failures from 1997 to 2001 was collected using the SLAC CATER system. Table 1 shows a total of seventy-six incidents where the beam line had to be shut down due to electromagnet failures. 90% of the failure incidents and 96% of the total downtime were associated with water-cooled electromagnets.

Table 1 Downtime of SLC Due to Magnet Failure (units: hour)
Type           # of Events   Total Downtime   Avg   Min   Max
Solid Wire     6             25.8             4.3   1.8   11
Water Cooled   70            521              7.5   0.1   12
Total          76            546.8            7.1

Table 2 lists the five most common failure modes of the electromagnets with their frequency and downtime. Insulation failures are related to magnets becoming shorted: the insulation material around the coil degrades due to radiation or thermal effects. Water leaks from the cooling system are a major problem with electromagnets. They are mainly caused by failures of flexible hoses due to radiation and by copper corrosion. Human errors range from not following procedures to forgetting tasks. Connector failures are mainly mechanical failures due to mechanical and thermal cycling.

Table 2 Failure Frequency and Downtime of Electromagnets
Failure Mode     Frequency   Min (hr)   Max (hr)   Avg (hr)
Insulation       29          0.2        27.2       8.82
Water Leak       22          1          32         9.7
Water Blockage   5           0.5        7.5        3.92
Human Error      5           0.7        6          2.5
Connector        3           1          3.2        1.733
Other            12          0.9        10.2       5.8

Table 3 shows the different types of beamlines and their durations during the period from 1997 to 2002 and the average


Table 3 Availability Prediction for a Single Water Cooled Electromagnet (SLAC 1997-2001)
Dates Line Ran        Beam Line      Run Hour   No. of Magnets   Magnet Hours   No. of Failures   MTBF (hr)   TR (hr)   MTTR (hr)   Availability 1 Mag   PPM
2/4/97 - 4/30/97      Linac/BSY      1547       520              804,440        1                 804,440     0.2       0.20        0.999999751          0.2
2/4/97 - 4/30/97      HER            181        1200             217,200
5/1/97 - 6/8/98       SLC            8828       2302             20,322,056     32                635,064     289.5     9.05        0.999985755          14.2
7/10/98 - 7/31/98     HER            918        1240             1,138,320
10/30/98 - 12/15/98   PEP II         575        2602             1,496,150      2                 748,075     9         4.50        0.999993985          6.0
1/15/99 - 2/22/99     PEP II         1040       2602             2,706,080      6                 451,013     40.1      6.68        0.999985182          14.8
2/24/99 - 5/1/99      PEP II         844        2602             2,196,088      4                 549,022     15.6      3.90        0.999992897          7.1
5/1/99 - 11/29/99     Linac          1461       520              759,720        2                 379,860     26.1      13.05       0.999965646          34.4
5/1/99 - 11/29/99     PEP II         4797       2602             12,481,794     7                 1,783,113   65.65     9.38        0.99999474           5.3
1/12/00 - 10/31/00    PEP II         6624       2602             17,235,648     7                 2,462,235   34.6      4.94        0.999997993          2.0
1/12/00 - 10/31/00    BSY/FFTB       2196       198              434,808
1/12/00 - 10/31/00    BSY/A-Line     630        520              327,600
1/10/01 - 12/31/01    PEP II         7411       2602             19,283,422     7                 2,754,775   37.9      5.41        0.999998035          2.0
1/10/01 - 12/31/01    BSY/FFTB(e+)   2795       198              553,410        2                 276,705     3.05      1.53        0.999994489          5.5
1/10/01 - 12/31/01    BSY/A-Line     820        520              426,400
Sum (SLAC Average)                                               80,383,136     70                1,148,331   521.70    7.45        0.9999935120         6.5

                          PEP II (Actual)   SLC (Actual)   NLC (2003) (Predicted)
Magnets                   2602              2302           6085
Availability, 1 magnet    0.999993805       0.999985755    0.9999935120
Availability, all magnets 0.984009925       0.967738715    0.961289556

Prediction for NLC: Operation 6480 hr/yr; Expected Downtime 250.8 hr/year; Occurrences 33.7 water cooled magnet failures/year

availability for water cooled electromagnet. The second column indicates the type of line running during the period (this determines the number of magnets turned on during the experiment). The fourth column shows the number of watercooled electromagnets for that particular line. The fifth column is the product of run hour and the number of magnets: magnet hours. The sixth column indicates the number of failures identified during that particular period. The MTBF in the seventh column is a result of magnet hours divided by the number of failures. The eighth column indicates the total repair time for those failures in that period and the ninth column is MTTR. Based on these numbers the availability of any one magnet in a beamline can be calculated. The average availability of one water-cooled magnet at SLAC is found to be 0.999993512. The configuration for the NLC is shown in Figure 2. The NLC has 2903 more electromagnets and 3212 more PS compared to the SLC.
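The per-magnet figures in the "Sum" row of Table 3 follow from three field quantities: total magnet hours, number of failures, and total repair time. The sketch below reproduces them, assuming the steady-state relation A = MTBF / (MTBF + MTTR) that the paper uses later for the improved designs. The function and variable names are illustrative, not taken from the paper.

```python
# Field data for water-cooled electromagnets (Table 3, "Sum" row).
magnet_hours = 80_383_136    # run hours x number of magnets, summed over beam lines
failures = 70                # beam-stopping failures, 1997-2001
total_repair_hours = 521.7   # TR column, summed


def single_unit_availability(unit_hours, n_failures, repair_hours):
    """Return (MTBF, MTTR, availability) for one unit of a population.

    Assumes availability = MTBF / (MTBF + MTTR), i.e. a steady-state
    model with no scheduled maintenance (an assumption of this sketch).
    """
    mtbf = unit_hours / n_failures
    mttr = repair_hours / n_failures
    availability = mtbf / (mtbf + mttr)
    return mtbf, mttr, availability


mtbf, mttr, avail = single_unit_availability(
    magnet_hours, failures, total_repair_hours
)
print(f"MTBF = {mtbf:,.0f} hr, MTTR = {mttr:.2f} hr, A = {avail:.9f}")
```

The result agrees with the tabulated MTBF and MTTR; the availability can differ from the tabulated value in the ninth decimal because the paper appears to round MTTR before dividing.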

The availability of the NLC's electromagnet subsystem is estimated using the following equation:
$$A_{\mathrm{System}} = \left(A_{\mathrm{Single\ Component}}\right)^{n} \qquad (1)$$

where A is availability and n is the number of components. Assuming the average availability of each individual water-cooled electromagnet is 0.999993512 (as was found for SLAC magnets), the availability for 6085 water-cooled electromagnets would be 0.96129. This is lower than the target value of 97.5% for the magnet subsystem. Therefore, the magnet designers know they must improve the reliability of the magnets they design for the NLC compared to the SLAC magnets. Given 6480 hours of operation time per year, the expected downtime of the NLC due to electromagnet failure would be about 250 hours/year: Expected downtime = (1 - 0.96129) x 6480 hours = 250.8 hours

Since the average MTTR is 7.45 hours, we can estimate the number of failures for a given year to be 33.7 occurrences: Number of failures = downtime / MTTR = 250.8 / 7.45 = 33.7 times/year

Electromagnets (total 8653): solid wire 2568, water cooled 6085
Power supplies (total 6680): non-switching 2512, switching 4168
Figure 2 Electromagnet System Configuration for NLC
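The three estimates for the water-cooled magnets (series availability from Equation 1, expected downtime, and failures per year) chain together as in the short sketch below. It uses only numbers quoted in the text; the function name `series_availability` is a label chosen for this illustration.

```python
def series_availability(a_single, n):
    """Availability of n identical units in series (Equation 1)."""
    return a_single ** n


a_single = 0.999993512   # one water-cooled magnet, from SLAC field data
n_magnets = 6085         # water-cooled magnets planned for the NLC
hours_per_year = 6480    # 24 h/day, 7 days/week, 9 months/year
mttr_hours = 7.45        # average repair time per failure

a_system = series_availability(a_single, n_magnets)
downtime = (1 - a_system) * hours_per_year    # hours per year
failures_per_year = downtime / mttr_hours

print(f"A = {a_system:.5f}, downtime = {downtime:.1f} hr/yr, "
      f"failures = {failures_per_year:.1f} /yr")
```

Because availability enters as a power of n, even a per-magnet unavailability of a few parts per million becomes several percent at the system level.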

Availability of solid wire magnets is calculated in the same manner. The expected number of failures for solid wire magnets in the NLC is about two per year. Since all of the magnets are in series, the overall availability of the NLC magnet system (A_MSys) can be calculated using the following equation:
$$A_{MSys} = A_{SM} \times A_{WM} = 0.998628 \times 0.96129 = 0.9599 \qquad (2)$$
where A_SM is the availability of the solid wire magnets and A_WM is the availability of the water-cooled magnets. Thus, the magnet system would fall short of the 97.5% availability goal if the engineers used the same magnet technology for the NLC as they used for the SLC. A summary of the electromagnet availability for the NLC is shown in Table 4.

Table 4 Predicted Availability of Electromagnets for the NLC
Type           # of Magnets   Availability   Expected Downtime   Occurrence
Solid Wire     2568           0.998628       8.3 hrs/yr          2.1 /yr
Water-cooled   6085           0.96129        250 hrs/yr          33.7 /yr

The root causes of water leak failures are identified as fitting, hose crack, and coil leak failures, as tabulated in Table 5. A total of 23 water leak related failure events were identified during the 5 year period.

Table 5 Water Leak Failure from SLAC Field Data
Failure Scenario   # Events   Repair Time Min (hr)   Repair Time Max (hr)
Fitting            3          6.5                    32
External Water     3          2                      9
Hose               7          2                      12.5
Coil Leak          10         1                      23.5
Total              23         1                      32

3.1.2 Power Supply
The power supplies that provide the electric current to the electromagnets are categorized into 2 main categories: small (< 12 Amps, 50 Volts) and large (> 12 Amps, 50 Volts). A summary of the SLAC switching type power supply failures from the CATER system between 1997 and 2001 is shown in Table 6. The total number of failures is 162, 2.5 times greater than the number of electromagnet failures, but the total downtime is less than half that of the electromagnet failures. This is because the average downtime for a power supply is only 1.65 hours, as opposed to 7.2 hours for an electromagnet. When a power supply fails, it is replaced with a refurbished or new unit to minimize downtime. Thus, the average downtime is significantly shorter than for magnet failures. Only failures that forced the accelerator to shut down are considered in the analysis.

Table 6 Downtime of Accelerator Due to Power Supply Failure (units: hour)
Type of PS   Events   Total Down Time   Max    Min   Avg Down Time
Large        92       178               11     0.1   1.93
Small        70       88.7              11.5   0.2   1.27
Total        162      266.7             32     0.1   1.65

Availability, MTBF, and occurrence for the power supplies can be estimated following the same steps as with the electromagnets studied for the 5 year period (Table 7).

Table 7 Predicted Availability of Power Supplies for NLC
Size    # of Power Supplies   Availability   Expected Downtime   MTBF
Small   2512                  0.989          70 hrs/yr           97 hrs
Large   4168                  0.9256         483 hrs/yr          25.6 hrs

The overall availability of the power supply system (A_PSSys) is the product of the availabilities of the two types of power supplies (small and large):
$$A_{PSSys} = A_{SPS} \times A_{LPS} = 0.989 \times 0.9256 = 0.9154$$
where A_SPS is the availability of the small power supplies and A_LPS is the availability of the large power supplies. As seen from the result, the availability of the large PS is very low. The power supplies cannot be used for the NLC as is; otherwise the target availability will never be met.

3.1.3 All Electromagnet Configuration
The electromagnet system has power supplies that control the electric field for each magnet. The preliminary design for the NLC has 8653 electromagnets and 6680 power supplies to operate. The accelerator will shut down if any one of the 8653 magnets or 6680 power supplies fails. Thus, the overall availability of the system is the product of the magnet and power supply availabilities. The system availability for the NLC is calculated to be 0.8786 if the current hardware were to be implemented:
$$A_{Sys} = A_{MSys} \times A_{PSSys} = 0.9599 \times 0.9154 = 0.8786$$
The expected downtime and occurrence for the NLC are the sums of the corresponding values from each subsystem:
Expected Downtime = Downtime (solid wire magnet, water cooled magnet, small PS, large PS) = 9 + 251 + 70 + 483 (hours/year) = 813 (hours/year)


Expected Occurrence = Occurrence (solid wire magnet, water cooled magnet, small PS, large PS) = 2 + 34 + 53 + 292 = 381 events/year
System MTBF = MTBF (solid wire magnet, water cooled magnet, small PS, large PS):
$$\text{System MTBF} = \left(\frac{2568}{8{,}044{,}091} + \frac{6085}{1{,}148{,}331} + \frac{2512}{294{,}646} + \frac{4168}{106{,}700}\right)^{-1} = 18.8\ \text{hrs}$$

This means that the NLC will be disrupted every 19 hours due to either an electromagnet or a power supply failure. A Life Cost-Based FMEA sheet, as shown in Table 8, was completed for the electromagnet. First, the origin of the failure and the detection stage were identified for each scenario. Failure frequency was assigned from the historical data for failures with root causes in insulation, water leak, water blockage, human error, and connector failures. Experts in their respective fields gave failure frequencies related to design, manufacturing, and installation. Repair for an electromagnet failure can range from replacing a simple water hose to replacing the whole magnet. Material cost for a simple replacement of a hose can range from $35 to $70. Replacing an entire solid wire magnet can cost from $400 to $2000, and a water-cooled magnet from $4000 to $30,000, depending on the size of the magnet. Power supply repairs usually only require the electronic boards to be swapped, and parts cost for boards ranges from $300 to $700 depending on the size. A Monte Carlo simulation is applied to the Life Cost-Based FMEA to consider the sensitivity of the variables associated with failure cost: frequency, detection time, fixing time, delay time, and parts cost. A 30 year predicted failure cost for the electromagnet system is summarized in Table 9. As shown in the table, opportunity cost can be 30 to 150 times greater than the labor and material cost.

Table 8 Life Cost Based FMEA Worksheet
Failure mode and root cause entries: Too many loads on circuit; Water passage is blocked; Damaged coil; Thermal switch trip due to overheating; Water sprayed on to coil
Effect of Failure (all rows): Magnet turned off
Row   Origin   Detection Phase   Frequency   Detection Time   Fixing Time   Delay Time   Re-occurring   Loss Time   Quantity   Parts Cost ($)   Labor Cost ($)   Material Cost ($)   Opportunity Cost ($)
1     Oper     Oper              30          0.001            0.5           4            0              5           1          50               180              15                  112,500
2     Oper     Oper              30          2                1             4            0              5           1          50               38,400           3,000               125,000
3     Inst     TR                1           4                0.5           2            0              3           1          1250             1,280            5,000               62,500
4     Oper     Oper              30          3                2             8            0              10          1          50               115,200          4,500               250,000

Table 9 Predicted Life Cycle Failure Cost for Electromagnet System for NLC for 30 yrs (Unit: $Million)
Total System            Probability 5%   Probability 50%   Probability 95%
Labor Cost              $4.2             $5.3              $6.3
Material Cost           $2.8             $3.5              $4.1
Sub Total               $6.4             $7.8              $9.5
Opportunity Cost $10k   $228.0           $285.6            $346.4
Opportunity Cost $25k   $555.0           $714.4            $864.4
Opportunity Cost $50k   $1,110.0         $1,426.0          $1,725.0
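The system MTBF of roughly 19 hours can be checked by adding the failure rates of all units in series. The sketch below assumes constant failure rates during the useful life phase (rate = 1/MTBF per unit), which is the simplification implied by summing count/MTBF terms. The dictionary layout and function name are illustrative.

```python
# (number of units, MTBF of one unit in hours) for each component class.
components = {
    "solid wire magnet": (2568, 8_044_091),
    "water-cooled magnet": (6085, 1_148_331),
    "small power supply": (2512, 294_646),
    "large power supply": (4168, 106_700),
}


def series_mtbf(parts):
    """MTBF of a series system, assuming constant, independent failure rates.

    Any single failure stops the system, so the per-unit rates add.
    """
    total_rate = sum(count / mtbf for count, mtbf in parts.values())
    return 1.0 / total_rate


system_mtbf = series_mtbf(components)
print(f"System MTBF = {system_mtbf:.1f} hr")

# Share of the total failure rate contributed by each component class.
total_rate = 1.0 / system_mtbf
for name, (count, mtbf) in components.items():
    print(f"{name}: {count / mtbf / total_rate:.0%} of failures")
```

The breakdown makes the design priority visible: under these inputs the large power supplies account for most of the system failure rate.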

3.1.4 Design Improvements
As derived in the previous section, the availability of the electromagnet system falls short of the target goal of 95%. To increase it, the availability of the magnets or of the power supplies has to be increased. Among the magnets, the water-cooled electromagnets have the lower availability. To increase the availability of the water-cooled magnets for the NLC, two measures can be taken: reduce the MTTR or increase the reliability of the electromagnets. The average MTTR is 7.2 hours for the water-cooled electromagnets and 4 hours for the solid wire magnets, as determined from empirical data at SLAC.

3.1.4.1 Redundant Large Power Supply
In terms of component reliability, the large power supply has the lowest availability of the major components in the system. The availability of the large PS has to be increased to meet the target availability. One way of achieving this is to design redundancy into the system. Since the small power supply has a relatively high availability, redundancy is considered only for the large power supplies to minimize acquisition cost. Assuming the large power supplies are in redundancy mode, the MTBF for the large PS becomes 4.1 million hours. A comparison of MTBF and expected downtime for redundant and non-redundant PS is shown in Table 10.


Table 10 Redundant and Non-Redundant PS
                          Large PS No Redundancy   Large PS Redundancy (Hot swappable)
MTBF for one large PS     106,700 hr               4,107,950 hr
Availability of 4168 PS   0.92556                  0.9983
Expected Downtime         483 hr/yr                10.5 hr/yr
Failure Occurrences       244 /yr                  5.3 /yr
MTBF of 4168 PS           25.6 hr                  1214 hr

With redundancy of the large power supplies, the availability of one redundant large PS (A_1LPS) is 0.99999961:
$$A_{1LPS} = \frac{MTBF}{MTBF + MTTR} = \frac{4{,}107{,}950}{4{,}107{,}950 + 1.6} = 0.99999961$$
With 4168 large PS redundant pairs in the system, the availability for the large PS becomes:
$$A_{4168\,LPS} = (0.99999961)^{4168} = 0.99838$$
Thus, the improved availability of the entire power supply system becomes:
$$A_{PSSys} = 0.988 \times 0.99838 = 0.9864$$
The expected downtime due to power supply failure is 6480 hours x (1 - 0.986) = 90.7 hours/yr. Using an average fixing time for the power supply of 1.5 hours, the average number of failures during the year is 90.7 / 1.5 = 60.5 events/year. This is a significant improvement over the existing design. However, the overall availability is 0.9435, which is just shy of the requirement, 0.95. Labor cost, material cost, opportunity cost, and availability of the electromagnet subsystems are tabulated in Table 11.

3.1.4.2 Reducing Repair Time for Electromagnets
Further modifications and design improvements have to be considered. One possible solution is to reduce the MTTR of the electromagnets by 30%, to 5 hours, through a hoist design inside the tunnel that makes it easier for technicians to lift the electromagnets. With the reduction in repair time, the availability of a single water-cooled magnet (A_1WM) becomes:
$$A_{1WM} = \frac{MTBF}{MTBF + MTTR} = \frac{1{,}148{,}331}{1{,}148{,}331 + 5} = 0.99999565$$
Thus, for the 6085 water-cooled electromagnets, the availability becomes:
$$A_{6085\,WM} = (0.99999565)^{6085} = 0.97385$$
The availability of the electromagnet system (A_MSys) becomes:
$$A_{MSys} = 0.998628 \times 0.97385 = 0.9725$$
The overall availability of the magnet system becomes:
$$A_{Sys} = A_{MSys} \times A_{PSSys} = 0.9725 \times 0.986 = 0.958$$
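The chain of calculations for the improved design (redundant large power supplies plus a 5-hour MTTR for water-cooled magnets) can be reproduced as follows. This is a sketch using only the inputs quoted in the text; the variable names are illustrative, and the final product can differ from the paper's 0.958 in the third decimal because the paper rounds intermediate values.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability of one repairable unit."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Water-cooled magnets with MTTR reduced from about 7.45 h to 5 h.
a_1wm = availability(1_148_331, 5.0)
a_wm = a_1wm ** 6085                 # 6085 magnets in series
a_msys = 0.998628 * a_wm             # times solid wire magnet availability

# Large power supplies operated as redundant pairs.
a_1lps = availability(4_107_950, 1.6)
a_lps = a_1lps ** 4168               # 4168 redundant pairs in series
a_pssys = 0.988 * a_lps              # times small power supply availability

a_sys = a_msys * a_pssys
print(f"magnets {a_msys:.4f}, power supplies {a_pssys:.4f}, system {a_sys:.4f}")
```

Changing the MTTR argument or the redundant MTBF shows quickly how far each measure alone moves the system toward the 0.95 target.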

Table 11 Predicted Failure Cost of Electromagnet System with Redundant Large PS
Subsystem              Probability   Labor Cost   Material Cost   Sub Total   Opp. Cost $10k/hr   Opp. Cost $25k/hr   Opp. Cost $50k/hr
Correctors             5%            $0.205       $0.052          $0.26       $3.3                $8.2                $16.4
Correctors             50%           $0.260       $0.065          $0.33       $3.9                $9.8                $19.6
Correctors             95%           $0.354       $0.085          $0.44       $4.6                $11.5               $23.0
Water Cooled           5%            $1.080       $0.871          $1.95       $66.0               $165                $330.0
Water Cooled           50%           $1.300       $1.085          $2.39       $80.0               $200                $400.0
Water Cooled           95%           $1.537       $1.334          $2.87       $97.2               $243                $486.0
Power Supplies         5%            $0.78        $0.32           $1.12       $27                 $68                 $135
Power Supplies         50%           $0.980       $0.400          $1.400      $33.8               $84.5               $169.1
Power Supplies         95%           $1.18        $0.48           $1.68       $41                 $101                $203
Electro Magnet System  5%            $2.07        $1.24           $3.33       $96                 $241                $482
Electro Magnet System  50%           $2.54        $1.55           $4.12       $118                $294                $589
Electro Magnet System  95%           $3.07        $1.90           $4.99       $142                $356                $712
Availability: Correctors 0.9988; Water Cooled 0.9576; Power Supplies 0.9864; Electro Magnet System 0.9435

Table 12 Predicted Failure Cost of Electromagnet System with Redundant Large PS and 30% Reduction in MTTR for Water Cooled Electromagnets
Subsystem        Probability   Labor Cost   Material Cost   Sub Total   Opp. Cost $10k/hr   Opp. Cost $25k/hr   Opp. Cost $50k/hr
Correctors       5%            $0.205       $0.052          $0.26       $3.3                $8.2                $16.4
Correctors       50%           $0.260       $0.065          $0.33       $3.9                $9.8                $19.6
Correctors       95%           $0.354       $0.085          $0.44       $4.6                $11.5               $23.0
Water Cooled     5%            $0.580       $0.871          $1.45       $40                 $100                $200
Water Cooled     50%           $0.860       $1.085          $1.95       $31                 $120                $155
Water Cooled     95%           $1.130       $1.334          $2.46       $58                 $146                $290
Power Supplies   5%            $0.780       $0.320          $1.12       $27                 $68                 $135
Power Supplies   50%           $0.980       $0.400          $1.40       $34                 $85                 $169
Power Supplies   95%           $1.180       $0.480          $1.68       $41                 $102                $203
Electro Magnet   5%            $1.565       $1.243          $2.831      $70.3               $176                $351
Electro Magnet   50%           $2.100       $1.550          $3.675      $68.9               $214                $344
Electro Magnet   95%           $2.664       $1.899          $4.584      $103.6              $260                $516
Availability: Correctors 0.9988; Water Cooled 0.9771; Power Supplies 0.9864; Electro Magnet 0.9627


Labor cost, material cost, opportunity cost, and availability of the electromagnet subsystem with redundant PS and with the MTTR for water-cooled magnets reduced to 5 hours are tabulated in Table 12. As seen from the results, the large PS has to be made redundant and the MTTR for the electromagnets has to be reduced by 30% in order to meet the target 95% availability.

3.2 Permanent Magnet System
3.2.1. Permanent Magnet
An adjustable permanent magnet uses permanent bricks to drive flux through steel pole tips and has rotating rods with permanent cylinders just outside the core [9]. The field strength in the bore varies with the rods' angular position. The so-called tuners are rotated by an electro-mechanical linkage system driven by a stepper motor, which is controlled by standard electronics. The tuning rods will rotate to the minimum field position if certain parts of the stepper motor system fail. The SLC has only 80 sets of permanent magnets, which are not adjustable, but it has movers for other purposes. Thus, to predict the overall MTBF, the permanent magnet system is divided into components for which historical data are available and components for which they are not. Since the permanent magnets were installed at the SLC only a few years ago and have only a couple of failure reports, the MTBF for the permanent magnets has to be obtained by testing a prototype or by making an educated guess. To estimate the MTBF of such adjustable permanent magnets, SLAC magnet mover failures that caused the beam to be lost during the past five years are considered. Because these movers have almost all the same components as the proposed tuner controller system and have been used in a similar radiation environment with a similar duty cycle, they are the best source of information next to the actual system. Historical data for the movers at SLAC for the five year period (1997-2001) show an MTBF of 347,687 hours, which is about 3 times that of the average large power supply.
MTBF estimates for the additional pinion gear, 2 sun gear bearings, and 16 shaft bearings of the tuner system are made using standard manufacturers' MTBF guidelines. The Weibull distribution is a useful way to analyze failures of components and quantify their reliability; its widespread use has led to tables of MTBFs being developed and published for all kinds of items. A typical shaft bearing with a 100% duty cycle has an MTBF of 50,000 hours. However, the adjustable permanent magnet will only be used twice a month to allow for a beam-based alignment of the magnets. This procedure will take 30 minutes each time, for a total of 1 hour/month of operation time. The MTTR for the movers is found to be 2 hours. The duty cycle is the element (component) usage time divided by the system operating time. The actual duty cycle for the movers is:
$$\text{Duty cycle} = \frac{\text{Element Operating Time}}{\text{System Operating Time}} = \frac{2 \times 12}{8790} = 0.0027$$

Thus, the actual duty cycle is expected to be 0.3%, which gives an MTBF of 16.6 million hours for one bearing. Combining all 19 bearings for the permanent magnet system, the MTBF for the bearings is 877,193 hours for a 0.3% duty cycle and 263,158 hours for a 1% duty cycle (a more conservative number). Deterioration of the magnet itself due to time and radiation is still under testing, and test results do not yet exist. Thus, an expert's educated guess has to be made at this point. Experts believe that 3 permanent magnets out of 100 will fail every year, yielding an MTBF of 2,730,510 hours for one single permanent magnet. The MTTR for one permanent magnet is expected to be 13.5 hours. For the optimistic duty cycle of 0.3%, the MTBF for a single permanent magnet is 228,186 hours:
$$\frac{1}{MTBF_{PermanentMagnet}} = \frac{1}{MTBF_{Motor}} + \frac{1}{MTBF_{Bearing}} + \frac{1}{MTBF_{Radiation}} = \frac{1}{347{,}687} + \frac{1}{877{,}193} + \frac{1}{2{,}730{,}510} = \frac{1}{228{,}186}$$
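The permanent magnet MTBF combines a duty-cycle correction for the bearings with a sum of failure rates across the three contributors. The sketch below follows the numbers in the text, including the rounded 0.3% duty cycle and the 8790-hour denominator as printed. It assumes that MTBF scales inversely with duty cycle and that failure rates are constant and independent; the function names are illustrative.

```python
def derated_mtbf(rated_mtbf_hours, duty):
    """MTBF in calendar hours for a part that runs only a fraction of the time."""
    return rated_mtbf_hours / duty


def series_mtbf(mtbfs):
    """MTBF of elements in series: failure rates (1/MTBF) add."""
    return 1.0 / sum(1.0 / m for m in mtbfs)


# Duty cycle as computed in the text: (2 x 12) operating hours over 8790 hours.
duty_exact = (2 * 12) / 8790
duty_used = 0.003                      # rounded to 0.3% in the paper

# 19 bearing-type elements, each rated 50,000 h at 100% duty cycle.
one_bearing = derated_mtbf(50_000, duty_used)
bearing_set = series_mtbf([one_bearing] * 19)

# Motor/electrical (field data), bearing set, radiation damage (expert estimate).
permanent_magnet = series_mtbf([347_687, bearing_set, 2_730_510])

print(f"duty cycle = {duty_exact:.4f}")
print(f"bearing set MTBF = {bearing_set:,.0f} hr")
print(f"permanent magnet MTBF = {permanent_magnet:,.0f} hr")
```

Replacing `duty_used` with 0.01 gives the more conservative bearing figure mentioned in the text.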

3.2.2. Permanent Magnet Configuration
The configuration for the permanent magnet system with the maximum possible number of permanent magnets is shown in Figure 3. Compared to the all-electromagnet configuration, the total number of magnets remains the same, 8653. However, the number of power supplies is reduced by 40%, to 3998.
Magnets (total 8653): permanent magnet 3371, solid wire electromagnets 2568, water cooled electromagnets 2714
Power supplies (total 3998): small PS 2512, large PS 1486
Figure 3 Mixed Permanent and Electromagnet Configuration for NLC

The MTBFs of the major subcomponents of the permanent magnet are listed in Table 13. Average downtimes for the three major failure types are listed in the Downtime column. Based on the predicted downtime, frequencies for these failures are entered in the Life Cost-Based FMEA sheet.


Table 13 MTBF Calculation for Permanent Magnets


Permanent Magnet Motor & Electrical Comp 3371 Motors Observed Avail, MTTR MTBF Bearings 3371 Bearing sets Radiation damage on Magnet 3371 magnets MTBF 347,687 103 877,193 260 2,730,510 810 MTTR 2 Availability 0.9999942 0.9807958 0.9999943 0.9809688 0.9999949 0.9828646 Downtime 124.4

123.3

requirement. A summary of the mixed configuration results is shown in Table 14. Permanent magnet and power supply availability are the subsystems with low availability that brings the systems availability down. Permanent magnet and power supplies only have a MTBF of 50 and 44.5 hours respectfully. Table 14 Prediction Summary for NLC Mixed System
Solid Wire Availability Predicted Downtime (hr) No of Occurrences (/yr) MTBF (hr) 0.99860 2.0 1 8238 Water Cooled 0.98250 113.4 15 423 Permanent Power Supply Magnet 0.9094 0.9623 587.0 244.3 118 144 50 44.5 Magnet System 0.85860 946.7 277.8 22.2

14

111.0

Summing the recovery time from the LCFMEA sheet, the average downtime for the permanent magnet is expected to be 587 hours/year. Availability for the permanent magnets is 0.909:
Availabili ty3371 Permanent Magnets Uptime 6480 587 = = = 0.9094 ScheduledTime 6480

As seen from the summary table, the oval availability is 0.858632 which is slightly lower than the all electromagnet configuration, 0.8786. 3.2.2.2 Life Cost Based FMEA for Mixed Configuration Based on the number of failure frequencies in Table 14, a new set of Lifecycle cost based FMEA is generated to calculate the cost portion. The predicted lifecycle failure cost result for the mixed magnet system is tabulated in Table 15. Comparing it to the all electromagnet configuration, electromagnet and power supply costs have come down due to reduction in the number of respective units. However, permanent magnet failure cost is extremely high. The predicted labor and material cost alone for the 30 years is expected to be $12.7 million. This is primarily due to the fact that the entire permanent magnets have to be replaced due to radiation and material cost is much higher than the electromagnets. Table 15 Predicted Life Cycle Failure Cost for Mixed Magnet System for the NLC for 30 years
| System | Probability | Labor Cost | Material Cost | Sub Total | Opportunity Cost ($10k) | Opportunity Cost ($25k) | Opportunity Cost ($50k) |
|---|---|---|---|---|---|---|---|
| Mixed Magnet System | 5% | $5.673 | $9.255 | $14.9 | $364 | $912 | $1,824 |
| Mixed Magnet System | 50% | $6.409 | $12.485 | $18.9 | $428 | $1,070 | $2,141 |
| Mixed Magnet System | 95% | $7.515 | $16.315 | $23.8 | $497 | $1,245 | $2,490 |

Units: $ Million

3.2.2.1 Availability for Mixed Magnet Configuration
Since the NLC cannot be built entirely with permanent magnets due to technical difficulty, a mixed configuration as shown in Figure 3 is considered. Availability for a single solid wire or water cooled magnet is the same as discussed in 3.1.1. Since there is no change in the number of solid wire magnets for the mixed magnet configuration, availability for the solid wire magnets remains the same, 0.9986. Availability for the 2714 water cooled magnets is:

\[ A_{2714\ \text{Water cooled magnets}} = (0.999993512)^{2714} = 0.9825 \]

Availability for just the magnets (permanent and electro) is calculated by taking the product of the three different magnet availabilities:

\[ A_{\text{Mixed Magnets}} = A_{\text{Permanent}} \times A_{\text{Solid Wire}} \times A_{\text{Water cooled}} = 0.9094 \times 0.9986 \times 0.9825 = 0.8922 \]

There are 2512 small power supplies and 1486 large power supplies in the mixed configuration. The number of small power supplies is the same as in the all-electromagnet configuration, so their availability remains the same, 0.98925. Availability for the 1486 large power supplies is obtained using the availability of a single large PS:

\[ A_{1486\ \text{Large PS}} = (0.9999814)^{1486} = 0.9727 \]

Availability for the power supplies in the mixed configuration is:

\[ A_{\text{Mixed PS}} = 0.98925 \times 0.9727 = 0.9622 \]

Thus, the availability of the mixed configuration is:

\[ A_{\text{Mixed configuration}} = 0.8922 \times 0.9622 = 0.8586 \]

This is far short of the 0.95 availability for the magnet system.
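The series-availability arithmetic above can be re-checked with a short script. This is only a sketch: the helper names `fleet_availability` and `series_availability` are illustrative and do not come from the paper, and the code assumes, as the calculation above does, that units fail independently and that any single failure takes the system down.

```python
import math


def fleet_availability(unit_availability: float, count: int) -> float:
    """Availability of `count` identical, independent units that must all be up."""
    return unit_availability ** count


def series_availability(*parts: float) -> float:
    """Availability of subsystems in series: the product of their availabilities."""
    return math.prod(parts)


# Values quoted in Section 3.2.2.1
a_permanent = 0.9094
a_solid_wire = 0.9986
a_small_ps = 0.98925

a_water = fleet_availability(0.999993512, 2714)   # water cooled magnets
a_large_ps = fleet_availability(0.9999814, 1486)  # large power supplies

a_magnets = series_availability(a_permanent, a_solid_wire, a_water)
a_ps = series_availability(a_small_ps, a_large_ps)
a_mixed = series_availability(a_magnets, a_ps)

print(f"water cooled magnets: {a_water:.4f}")
print(f"large power supplies: {a_large_ps:.4f}")
print(f"all magnets:          {a_magnets:.4f}")
print(f"all power supplies:   {a_ps:.4f}")
print(f"mixed configuration:  {a_mixed:.4f}")
```

Because every subsystem sits in series, the lowest single availability (here the permanent magnets at 0.9094) bounds the system result from above, which is why the design improvements that follow target that subsystem and the power supplies.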

Comparing the results in Table 15 against the system in Table 9, direct labor and material cost is 115% higher for the permanent magnet configuration than for the all-electromagnet configuration. Opportunity cost is 63% greater for the permanent magnet configuration than for the all-electromagnet configuration.

3.2.3 Design Improvements for Mixed System
The permanent magnets and power supplies are the subsystems with the lowest availability, and several improvements are considered to increase it. Just as with the all-electromagnet configuration, redundancy is considered for the large power supplies. For the permanent magnets themselves, the accelerator will shut down if one of the motors fails, because the failure forces the beam position back to its initial setting. Thus, a latch in the system will prevent the beam from returning to its original position.

Table 17 Predicted Life Cycle Failure Cost for Mixed Magnet Sub-System for the NLC for 30 years with Design Improvements

| Sub-System | Probability | Labor Cost | Material Cost | Sub Total | Opportunity Cost ($10k) | Opportunity Cost ($25k) | Opportunity Cost ($50k) |
|---|---|---|---|---|---|---|---|
| Correctors | 5% | $0.300 | $0.065 | $0.37 | $6.0 | $15.0 | $30.0 |
| Correctors | 50% | $0.395 | $0.082 | $0.48 | $7.8 | $19.7 | $38.0 |
| Correctors | 95% | $0.532 | $0.102 | $0.63 | $9.7 | $24.2 | $48.0 |
| Permanent Magnet | 5% | $1.59 | $3.180 | $4.77 | $59.2 | $148.0 | $296.0 |
| Permanent Magnet | 50% | $1.89 | $4.21 | $6.10 | $71.6 | $179.0 | $358.0 |
| Permanent Magnet | 95% | $2.27 | $5.66 | $7.93 | $86.0 | $215.0 | $430.0 |
| Electro Magnet | 5% | $1.125 | $0.896 | $2.021 | $27.4 | $68.4 | $136.8 |
| Electro Magnet | 50% | $1.389 | $1.119 | $2.508 | $34.2 | $85.5 | $171.0 |
| Electro Magnet | 95% | $1.671 | $1.358 | $3.029 | $41.4 | $103.5 | $206.9 |
| Redundant Power Supply | 5% | $0.470 | $0.563 | $1.033 | $18.8 | $45.8 | $91.5 |
| Redundant Power Supply | 50% | $0.470 | $0.750 | $1.220 | $25.0 | $61.0 | $122.0 |
| Redundant Power Supply | 95% | $0.588 | $0.938 | $1.525 | $31.3 | $76.3 | $152.5 |

Units: $ Million

3.2.3.1 Redundant Large Power Supply
With active redundancy of the large power supplies, the availability of one redundant large PS is 0.99999961. With 1486 large PS in the mixed configuration, availability for the large PS becomes 0.9994:

\[ A_{1486\,LPS} = (0.99999961)^{1486} = 0.9994 \]

Thus, the improved availability for the PS system ($A_{PSSys}$) becomes 0.9874:

\[ A_{PSSys} = 0.988 \times 0.9994 = 0.9874 \]

The expected downtime of the accelerator due to power supply failure is 6480 x (1 - 0.9874) = 81.6 hours. This is a 66% reduction in downtime compared to the single PS source configuration (Table 14). With the redundant large PS, the mixed magnet availability becomes 0.881:

\[ A_{Sys} = A_{SWMag} \times A_{WCMag} \times A_{PMag} \times A_{PSSys} = 0.9986 \times 0.9825 \times 0.9094 \times 0.9874 = 0.881 \]

3.2.3.2 Latch Design for Permanent Magnet
The current magnet movers are designed such that if the stepper motors fail, the rod in the permanent magnet returns to its original position. This causes the beam to go out of position and ultimately disrupts the accelerator. A latch that prevents the rod from returning to its default position would prevent the beam from going out of position. Only permanent magnets in the injector and main Linac are affected by the beam-based alignment, 1916 and 865 respectively. Experts believe that 8 motors can fail before the accelerator shuts down due to beam misalignment. The MTBF for the 2781 stepper motors, for failures related to motor and bearing, is 89.5 hours:

\[ \frac{1}{MTBF} = \frac{2781}{347{,}687} + \frac{2781}{877{,}193} = \frac{1}{89.5} \]

On average, 8 motors would have failed during a 30-day period:

\[ \text{Number of Motors Failed} = \frac{30\ \text{days} \times 24\ \text{hours}}{89.5\ \text{hours}} \approx 8 \]
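The two results above follow from adding the failure rates of independent units. The sketch below reproduces them; `series_mtbf` is an illustrative helper name rather than anything defined in the paper, and the code assumes constant, independent failure rates so that the rates of all units simply sum.

```python
def series_mtbf(groups):
    """Combined MTBF (hours) for groups of (unit_count, unit_mtbf_hours).

    Assumes independent units with constant failure rates, so the
    system failure rate is the sum of count / unit_mtbf over all groups.
    """
    failure_rate = sum(count / unit_mtbf for count, unit_mtbf in groups)
    return 1.0 / failure_rate


# 2781 stepper motors, each exposed to the two failure modes quoted above
motor_groups = [(2781, 347_687), (2781, 877_193)]

mtbf_motors = series_mtbf(motor_groups)
failures_in_30_days = 30 * 24 / mtbf_motors

print(f"combined MTBF: {mtbf_motors:.1f} hours")
print(f"expected failures in 30 days: {failures_in_30_days:.1f}")
```

With roughly one motor failure every 89.5 hours, the 8-failure tolerance quoted by the experts is reached in about a month, which is what motivates the latch and the preventive maintenance discussed next.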

Preventive maintenance should be conducted prior to beam-based alignment: broken stepper motors and bearings have to be fixed before beam-based alignment is conducted. Experts estimate that the average repair time for the preventive maintenance will be 6 hours. Thus, designing in the latch increases the MTBF of motor and bearing failures 10-fold for the permanent magnets in the injector and main Linac; the rest remain the same:
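\[ \frac{1}{MTBF} = \frac{1}{10}\cdot\frac{2781}{347{,}687} + \frac{1}{10}\cdot\frac{2781}{877{,}193} + \frac{590}{347{,}687} + \frac{590}{877{,}193} \]

\[ \frac{1}{MTBF} = \frac{1}{287} \]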
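As a rough check of the latch calculation, the sketch below applies the 10-fold MTBF improvement to the 2781 latched magnets and leaves the remaining 590 unchanged. The function name `combined_mtbf` is hypothetical. The availability step assumes the standard steady-state relation A = MTBF / (MTBF + MTTR) with the 6-hour repair time quoted above, which is numerically consistent with the first availability value in Table 16 but is not stated explicitly in the text.

```python
def combined_mtbf(populations, unit_mtbfs):
    """Combined MTBF (hours) for several unit populations.

    populations: list of (unit_count, mtbf_multiplier); a multiplier of 10
                 models the 10-fold MTBF improvement from the latch.
    unit_mtbfs:  per-unit MTBF (hours) of each failure mode.
    Assumes independent units with constant failure rates.
    """
    failure_rate = sum(
        count / (multiplier * unit_mtbf)
        for count, multiplier in populations
        for unit_mtbf in unit_mtbfs
    )
    return 1.0 / failure_rate


unit_mtbfs = (347_687, 877_193)
populations = [(2781, 10), (590, 1)]  # latched magnets, remaining magnets

mtbf_latched = combined_mtbf(populations, unit_mtbfs)

# Assumed steady-state availability with a 6-hour repair time
n_magnets = 3371
mttr_hours = 6.0
unit_mtbf = mtbf_latched * n_magnets
a_single = unit_mtbf / (unit_mtbf + mttr_hours)
a_all_magnets = a_single ** n_magnets

print(f"MTBF with latch: {mtbf_latched:.0f} hours")
print(f"availability, single magnet: {a_single:.7f}")
print(f"availability, {n_magnets} magnets: {a_all_magnets:.4f}")
```

The latch roughly triples the combined MTBF (from about 89.5 to about 287 hours) even though only the injector and main Linac magnets are improved, because those magnets make up most of the population.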

Table 16 Availability for Permanent Magnet with Latch Design


Permanent Magnet Permanent Magnet 3371 Magnets Observed Avail, MTTR MTBF Radiation damage on Magnet 3371 Magnets Permanent Magnet System MTBF 967,477 287 2,730,510 379.5 MTTR 6 Availability 0.9999938 0.9793112 0.9833000 0.9630

13.5

Next, the radiation effect on the permanent magnets is considered. Table 16 shows the calculation for the permanent magnets with the latch design that prevents the beam position from returning to its default position. Availability for the 3371 permanent magnets increases to 96.3% from 90%. Availability for the mixed magnet system with the two design improvements is 0.933:


\[ A_{Sys} = A_{SWMag} \times A_{WCMag} \times A_{PMag} \times A_{PSSys} = 0.9986 \times 0.9825 \times 0.963 \times 0.9874 = 0.933 \]

This is a significant increase compared to the mixed magnet configuration with no modification, 0.8586. However, it is still lower than the all-electromagnet configuration with modifications, 0.9627. Life cycle cost predictions for the mixed configuration with modifications are shown in Table 17. Even though the difference in availability between the two configurations with modifications is only 3%, the expected direct labor and material cost is $3.6 million vs. $10.3 million, and the opportunity cost is $170 million vs. $345 million, for the all-electromagnet and hybrid configurations respectively.

4. VALIDATION
To see how accurate the availability numbers for electromagnets and power supplies are, predictions for the SLC's 2002 operation are made using these numbers and compared with the observed failures. First, predictions are made for 2002 using two different MTBF numbers: conservative and optimistic. Experts believe that the magnets in the damping ring were poorly designed and that the NLC will not be designed in the same way. Thus, an optimistic view would take out all the electromagnet failures in the damping ring and make the prediction for the NLC. Taking the damping ring out, the MTBF for solid wire magnets increases to 21 million hours and the MTBF for water cooled magnets increases to 1.7 million hours. A more optimistic outlook would only consider failures during the last two years (2000 to 2001) since they are the latest. The MTBF for the water cooled magnets would then increase to 3 million hours. During 2002, no failures were observed for the solid wire magnets and 7 failures were found for electromagnets.

There are 1450 solid wire magnets in the SLC. Using the availability with no damping ring failures, the expected availability for the solid wire magnets for a one-year period is:

\[ A_{\text{Solid Wire}} = (0.99999982)^{1450} = 0.99982 \]

The expected number of failures is then:

\[ \text{Number of Failures} = \frac{(1 - \text{Availability}) \times \text{Operation Time}}{\text{MTTR}} \]

There are 2277 water cooled magnets in the SLC. Predictions using the conservative MTBF of 1.7 million hours yield 8.4 failures per year, and predictions using the optimistic MTBF of 3 million hours yield 4.9 failures per year. Table 18 summarizes the predicted and actual failure occurrences for magnets in 2002.

Table 18 Prediction and Validation of Magnet Failures for 2002

| | Solid wire Magnets: Availability | Solid wire Magnets: # of Failures | Water cooled magnets: Availability | Water cooled magnets: # of Failures |
|---|---|---|---|---|
| Conservative Prediction | 0.99983 | 0.44 | 0.993 | 8.4 |
| Optimistic Prediction | 0.99983 | 0.44 | 0.9959 | 4.9 |
| Actual # of failures | 1 | 0 | 0.9947 | 7 |

As seen from the table, the actual failure frequency falls between the conservative and optimistic estimates. Next, predictions are made for the power supplies. For the small power supplies, the 5-year (1997-2001) MTBF (294,656) is used as the conservative number and the 2-year (2000-01) MTBF (585,893) is used as the optimistic estimate. Large power supply predictions are made with conservative (MTBF = 106,700, MTTR = 1.98) and optimistic (MTBF = 117,243, MTTR = 1.46) numbers for the same periods. Table 19 summarizes the predicted and actual failure occurrences for small and large power supplies in the SLC in 2002. As seen from the table, the actual failure frequency falls between the conservative and optimistic estimates, except for the small PS, for which both predictions overestimated the failure frequency.

Table 19 Prediction and Validation of Power Supply Failures for 2002

| | Small PS: Availability | Small PS: # of Failures | Large PS: Availability | Large PS: # of Failures |
|---|---|---|---|---|
| Conservative Prediction | 0.9969 | 15.5 | 0.9939 | 19.7 |
| Optimistic Prediction | 0.9987 | 7.8 | 0.9959 | 17.9 |
| Actual # of failures | 0.9993 | 5 | 0.9964 | 18 |
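The failure counts in the validation tables come from converting unavailability into downtime hours and dividing by the repair time per failure. A minimal sketch of that conversion follows; the function name `expected_failures` is illustrative, and the inputs are the rounded solid wire availability from Table 18 (0.99983), the 6480-hour operation time, and the 2.5-hour MTTR that appear in the paper's worked calculation.

```python
def expected_failures(availability: float,
                      operation_hours: float,
                      mttr_hours: float) -> float:
    """Expected number of failures in an operating period.

    Downtime is (1 - availability) * operation_hours; dividing by the
    mean time to repair gives the number of repair events, i.e. failures.
    """
    return (1.0 - availability) * operation_hours / mttr_hours


# Solid wire magnets, using the rounded availability listed in Table 18
solid_wire_failures = expected_failures(0.99983, 6480, 2.5)

print(f"expected solid wire magnet failures per year: {solid_wire_failures:.2f}")
```

Note that the result is sensitive to rounding of the availability: a change in the fifth decimal place moves the predicted count by a few hundredths, so small differences between a printed availability and a printed failure count are to be expected.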

For the solid wire magnets, the expected number of failures in one year is:

\[ \text{Number of Failures} = \frac{(1 - 0.99982) \times 6480}{2.5} = 0.44 \]

5. CONCLUSIONS
This paper demonstrated the systematic use of empirical, test, and expert opinion data in performing Life Cost-Based FMEA through two very different examples, and showed how it can improve reliability and life cycle cost by predicting the cost of poor quality for complex systems such as the linear particle collider. Thus, Life Cost-Based FMEA has three main benefits: 1) estimation of life-cycle cost, 2) FMEA, and 3) Service Mode Analysis (SMA) [10]. The proposed method inherently captures a system's lifecycle costs related to component failures during design, manufacturing, installation, and operation. Designers can readily incorporate changes in the model to estimate an improved life cycle cost. The root causes direct designers to focus their efforts on problem systems, components, and processes.

Complex systems usually have a set target availability. One means to achieve the target is to increase the reliability of all subsystems. However, guaranteeing higher reliability often incurs cost increases. Another solution is to schedule preventive maintenance. Our proposed methodology allows comparison of different availability enhancement measures and trace analysis in terms of cost, a widely accepted measure of risk.

The authors agree that extracting relevant knowledge from pre-existing data (the CATER system) is hard work because the system collects data without the purpose of improving reliability or serviceability. It is evident that engineers should consider reliability and serviceability when maintenance management systems are put into place. Many times the management system records data in a way that cannot be used effectively.

Life Cost-Based FMEA can also provide a fair comparison between competing designs of subsystems. The case study presented in this paper compared two different technologies, one proven and the other not. Empirical data exists for the proven technology, and one can predict failure and life cycle cost for a new design using existing technology with fairly high confidence. For unproven technology, however, the data sources have greater uncertainty: reliability and failure cost estimates rely heavily on test data and expert opinion, which tend to have higher uncertainty. Thus, future research lies in minimizing uncertainty when combining data from different sources that have different uncertainty.

REFERENCES
[1] McKinney, B., FMECA, The Right Way, Proceedings of the 1991 IEEE Annual Reliability and Maintainability Symposium, 1991, pp. 253-259.
[2] Kara-Zaitri, C., Keller, A., Barody, I., Fleming, P., An Improved FMEA Methodology, Proceedings of the 1991 Annual Reliability and Maintainability Symposium, pp. 248-252.
[3] Gilchrist, W., Modeling Failure Modes and Effects Analysis, International Journal of Quality and Reliability Management, 1993, Vol. 10, No. 5, pp. 16-23.
[4] Palady, P., Failure Modes and Effects Analysis: Predicting & Preventing Problems Before They Occur, PT Publications, 1995, West Palm Beach, FL.
[5] Stamatis, D.H., Failure Modes and Effects Analysis, FMEA from Theory to Execution, ASQ Quality Press, 1995, Milwaukee, WI.
[6] Rhee, S., Ishii, K., Life Cost-Based FMEA Incorporating Data Uncertainty, Proceedings of the ASME Design Engineering Technical Conference: Design for Manufacturing, September 2002, Montreal, Canada.
[7] Rhee, S., Ishii, K., Life Cost-Based FMEA Using Empirical Data, Proceedings of the ASME Design Engineering Technical Conference, 2003, Chicago, IL.
[8] Sass, R., Shoaee, H., CATER: an Online Problem Tracking Facility for SLC, Proceedings of Particle Accelerator Conference, 1993, Washington DC.
[9] Bellomo, P., Rago, C.E., Spencer, C.M., Wilson, Z., A Novel Approach to Increasing the Reliability of Accelerator Magnets, IEEE Transactions on Applied Superconductivity, 2000, 10 (1), p. 284.
[10] Gershenson, J., Ishii, K., Design for Serviceability, in Kusiak, A. (ed.), Concurrent Engineering: Theory and Practice, 1992, pp. 19-39, Wiley, New York.
[11] Manago, M., Auriol, E., Using Data Mining to Improve Feedback from Experience for Equipment in the Manufacturing & Transport Industries, Institute for Operations Research and the Management Sciences, Oct. 1996.
[12] Spencer, C., Rhee, S., Comparison Study of Electromagnet and Permanent Magnet Systems for an Accelerator Using Cost-Based Failure Modes and Effects Analysis, International Conference on Magnet Technology, Morioka, Japan, Oct. 2003.


Copyright 2004 by ASME
