
Maintenance Best Practices for Switching Equipment and Transformers with Key Performance Indicators (KPIs) and Algorithms for Living Reliability Centered Maintenance (RCM) and Performance Based Maintenance (PBM)

Technical Report

1010555

Interim Report, December 2005

EPRI Project Manager B. Desai

ELECTRIC POWER RESEARCH INSTITUTE
3420 Hillview Avenue, Palo Alto, California 94304-1395
PO Box 10412, Palo Alto, California 94303-0813 USA
800.313.3774
650.855.2121
askepri@epri.com
www.epri.com

DISCLAIMER OF WARRANTIES AND LIMITATION OF LIABILITIES


THIS DOCUMENT WAS PREPARED BY THE ORGANIZATION(S) NAMED BELOW AS AN ACCOUNT OF WORK SPONSORED OR COSPONSORED BY THE ELECTRIC POWER RESEARCH INSTITUTE, INC. (EPRI). NEITHER EPRI, ANY MEMBER OF EPRI, ANY COSPONSOR, THE ORGANIZATION(S) BELOW, NOR ANY PERSON ACTING ON BEHALF OF ANY OF THEM:

(A) MAKES ANY WARRANTY OR REPRESENTATION WHATSOEVER, EXPRESS OR IMPLIED, (I) WITH RESPECT TO THE USE OF ANY INFORMATION, APPARATUS, METHOD, PROCESS, OR SIMILAR ITEM DISCLOSED IN THIS DOCUMENT, INCLUDING MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, OR (II) THAT SUCH USE DOES NOT INFRINGE ON OR INTERFERE WITH PRIVATELY OWNED RIGHTS, INCLUDING ANY PARTY'S INTELLECTUAL PROPERTY, OR (III) THAT THIS DOCUMENT IS SUITABLE TO ANY PARTICULAR USER'S CIRCUMSTANCE; OR

(B) ASSUMES RESPONSIBILITY FOR ANY DAMAGES OR OTHER LIABILITY WHATSOEVER (INCLUDING ANY CONSEQUENTIAL DAMAGES, EVEN IF EPRI OR ANY EPRI REPRESENTATIVE HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES) RESULTING FROM YOUR SELECTION OR USE OF THIS DOCUMENT OR ANY INFORMATION, APPARATUS, METHOD, PROCESS, OR SIMILAR ITEM DISCLOSED IN THIS DOCUMENT.

ORGANIZATION(S) THAT PREPARED THIS DOCUMENT
Maintenance and Test Engineering, LLC

NOTE
For further information about EPRI, call the EPRI Customer Assistance Center at 800.313.3774 or e-mail askepri@epri.com.

Electric Power Research Institute and EPRI are registered service marks of the Electric Power Research Institute, Inc.

Copyright 2005 Electric Power Research Institute, Inc. All rights reserved.

CITATIONS
This report was prepared by

Maintenance and Test Engineering, LLC
2037 North Berry Street
Olympia, WA 98506

Principal Investigator
J. Skog

This report describes research sponsored by the Electric Power Research Institute (EPRI).

The report is a corporate document that should be cited in the literature in the following manner:

Maintenance Best Practices for Switching Equipment and Transformers with Key Performance Indicators (KPIs) and Algorithms for Living Reliability Centered Maintenance (RCM) and Performance Based Maintenance (PBM): EPRI, Palo Alto, CA: 2005. 1010555.


PRODUCT DESCRIPTION
Over the past several decades, utilities have taken two significant approaches to improve the effectiveness and efficiency of their maintenance programs. These approaches have focused on maintenance task improvements and technology improvements associated with the equipment operation and the use of on-line monitors. While these two approaches have resulted in improvements, they have not necessarily taken full advantage of existing data available through supervisory control and data acquisition (SCADA) devices and intelligent electronic devices (IEDs), optimized maintenance cycles, or focused on improving the overall performance of the maintenance program.

Performance focused maintenance (PFM) is an all-inclusive approach to maintenance. PFM brings together what previously appeared to be distinctly different approaches to maintenance under a single umbrella. PFM recognizes that maintenance is both a technical and business process that must be managed and, at a high level, should be very similar across the whole landscape of utilities. PFM acknowledges that the specific application of these processes and approaches will differ due to the wide range of customer requirements, electric infrastructures, and maintenance organizations. The adaptive approach of PFM allows utilities to meet their own specific maintenance and operational goals and at the same time be confident that they are effectively managing the process and following industry best practices.

Results and Findings

This report outlines the initial effort to identify the value of PFM and to provide a broad overview of the topics involved. This work will provide direction for future EPRI efforts. The key purposes of this report are to establish the need for PFM in the minds of utility personnel and to demonstrate its potential value. The previous paradigm for maintenance is no longer sufficient to ensure optimal performance; PFM will provide the next step in maintenance optimization.
This report contains suggestions to improve maintenance performance using techniques that have been developed but only partially tested and formally documented. Because this document is a work-in-process report, these suggestions are in transition.

Challenges and Objectives

Although this report outlines the PFM approach for substations, PFM is directed at all utility personnel involved in the management of maintenance processes. It gives power delivery maintenance managers access to a series of tools and thought processes that will enable them to identify areas for improvement using PFM strategies. Improving maintenance remains crucial to improved utility financial performance. Operations and maintenance (O&M) are the largest controllable costs for most utility organizations. Because both maintenance and operations are labor intensive, proper management of the workforce is vital to success in each area. EPRI has been a leader in the application of reliability centered maintenance (RCM) and other maintenance performance enhancement tools. PFM is the logical extension of these tools.

Applications, Value, and Use

PFM is applicable to all forms of asset management in the power delivery sector. This report provides an overview of a comprehensive yet adaptable approach to maintenance that can easily be applied to a utility's unique asset management strategy. PFM is an overall process of utility service optimization that the utility can apply in totality or on a targeted basis.

EPRI Perspective

This work is only a beginning in the development of PFM. PFM incorporates an assortment of tools previously developed by EPRI as well as forward-looking maintenance strategies. PFM combines the elements into a master framework that allows the utility to integrate them in a cohesive manner into their maintenance program. PFM is a vital part of power delivery asset management and is linked to future asset management projects. EPRI has long been a leader in developing and adapting new approaches for improving maintenance performance. In addition, EPRI has an unmatched ability to extract intelligence from utility personnel, vendors, and other industry leaders and to combine that intelligence into a program that will benefit the entire community.

Approach

This report explains the value of PFM and outlines a possible approach to implementing it. To achieve this objective, EPRI obtained the services of key industry leaders in the areas of performance enhancement including maintenance management workstation (MMW), the integrated monitoring and diagnostics (IMD) and XVisor programs, and RCM. These industry leaders synthesized the lessons learned from the past 10 years of EPRI-sponsored optimization efforts to generate the contents of this report. PFM will serve as a basis for ideas for the next generation of EPRI-sponsored products.
EPRI has established plans for industry-wide review of this report to determine the future direction of its efforts.

Keywords

Availability
Key performance indicator (KPI)
Performance focused
Reliability
Aging models
Maintenance optimization


ABSTRACT
This interim report introduces an advanced and comprehensive approach to power delivery maintenance: performance focused maintenance (PFM). PFM is an all-inclusive approach to maintenance that goes well beyond reliability centered maintenance (RCM) and condition based maintenance (CBM). PFM includes not only the technical aspects of maintenance but also the business, risk management, economic, organizational, and continuous improvement processes. PFM emphasizes the appropriate and judicious use of data and establishes feedback loops. Specifically, this report emphasizes the feedback loops involving all levels of the maintenance organization, from feedback on the performance of individual tasks to feedback on the performance of the utility as a maintenance provider and manager, to meet the continuing competitive challenge of improving maintenance performance while reducing costs.


CONTENTS

1 INTRODUCTION
  Goals of the Project
  The PFM Concept

2 BALANCED MAINTENANCE APPROACH
  Definitions Used in this Report
  Maintenance Philosophy
  Maintenance Strategy
  Reliability Centered Maintenance
  Maintenance Basis
  Maintenance Tasks
  Corrective Maintenance Tasks
  Preventive Maintenance Tasks
  Condition Directed/Based Maintenance Tasks
  Predictive Maintenance Tasks (Also Referred to as Condition Based Maintenance)
  Hidden Failure Finding Tasks

3 ELEMENTS OF PFM
  1. Planning - Aligning Maintenance with Utility Goals
  PFM Planning Process Objective
  Elements of the Planning Process
  Executive Sponsorship and Reporting
  System Selection
  Team Assembly
  Understanding Utility Goals
  2. Developing a Technical Maintenance Approach
  PFM Technical Process Objectives
  Elements for Developing a Technical Maintenance Approach
  Identifying Critical Functions
  How Do Failures Manifest Themselves?
  Identifying the Effects of Failure
  Selecting the Right Preventive Strategy
  Data and Measures
  3. Building an Aging Model
  PFM Aging Model Objectives
  Elements for Building Aging Models
  Determining the Aging Mechanism
  Can Aging Be Measured?
  What is an Acceptable Level of Risk?
  Limiting the Risk
  4. Creating a Maintenance Plan - Best Practices
  PFM Plan Objectives
  Elements for Building the Maintenance Plan
  Task Triggers
  Optimizing Maintenance Intervals
  Building a Maintenance Plan
  Dynamic Prioritization
  5. Measuring Performance
  PFM Measurement Objectives
  Setting Specific Maintenance Goals
  Developing Metrics and KPIs
  Determining Data Requirements
  Setting Targets
  Identifying the Current State of Maintenance
  Gap Analysis
  6. Documentation and Implementation
  PFM Documentation and Implementation Objectives
  Elements for Documenting and Implementing the PFM Recommendations
  Reconciliation
  Identifying Change
  Impact Analysis
  Change Management
  Implementation Plans
  7. Measurement and Feedback
  PFM Measurement and Feedback Objectives
  Elements for Measuring Maintenance Effectiveness and Providing Feedback
  Measurement
  Reporting
  Making Corrections
  Implementing New Technologies and Maintenance Tasks

4 USING A TARGETED APPROACH WITH PFM
  Closing the Gaps with PFM Selective Activities
  Goals Are Unclear
  Reliability Is Below Expectations
  Executives Are Confused About the Value of Their Maintenance Investments
  Regulators Are Challenging Your Maintenance Program
  Availability Requirements Are Tightened
  Maintenance Tasks Are Not Achieving Desired Results
  Want to Make Better Use of Data
  Maintenance Task Intervals Are Suboptimal
  There Is Too Much Work and Not Enough Resources
  A Replacement Strategy Is Needed
  Intellectual Property Is Lost

5 EXAMPLE APPLICATION OF PFM MEASURE AND PERFORMANCE ACTIVITIES
  Overview
  PFM Findings
  1. LTC Oil Temperature
  2. Differential Temperature
  3. Differential Temperature with Trending
  4. Temperature Index
  Using Readily Available Data
  LTC Failure Avoided

6 THE ROLE OF DATA IN PFM
  Multiple Uses of Data
  Where Data Are Applied in PFM

7 EFFECTIVELY USING DATA FOR RISK ANALYSIS
  Introduction
  Data Drivers and Risk Management
  Risk Decision Process
  Risk Assessment
  An Effective Assessment Model at the Network/System and Asset/Component Levels
  Example Using the Assessment Model
  Pre-Service Information
  Information Regarding Service Life
  Analysis of the Situation
  Decision Based Upon Analysis
  Technical PFM Interaction
  Practical Application of the Risk Assessment Approach
  The Decision Process Considering Various Scenarios
  Assessment Steps
  Susceptibility Assessment
  Consequence Assessment
  Technical Assessment (of Expected Asset Performance)
  Economic Assessment
  The Decision Support Model Functionality and Structure
  Decision Model Supporting Effective Use of Data
  The Decision Support Model, a Practical Example
  Condition Analysis
  Consistency of Information
  Maximum Quality of Condition Information
  Data Mining and Decision Support
  Practical Example of Data Mining: Cable Condition Assessment
  Condition Analysis of Power Cables
  Knowledge Rules
  Database for Condition Assessment Support
  Determinations of Norms and Criteria
  Database Application for Condition Assessment

8 PROJECT OPPORTUNITIES
  Load-Tap-Changer Opportunities
  Medium-Voltage Circuit Breakers
  High-Voltage SF6 Circuit Breakers

9 NEXT STEPS

10 REFERENCES

A APPLICATION STUDY FOR LOAD-TAP-CHANGERS
  Performance Focused Maintenance - LTC Application
  LTC Population Characteristics
  Operating and Maintenance History
  LTC Diagnostics and Observations
  Industry LTC Experience
  Main Insulation Package Diagnostics and Observations
  PFM Technical Analysis
  PFM Technical Summary
  PFM Risk Analysis
  Developing Aging Models
  Implications of Aging/Wear Models
  Transformer Winding Maintenance
  LTC Maintenance
  Performance Measurement
  Conclusion


LIST OF FIGURES
Figure 2-1 The PFM Framework
Figure 3-1 PFM Process Block Diagram
Figure 3-2 The Relationships Among Utility Goal, Maintenance, and Performance
Figure 4-1 Simplified PFM Decision Diagram for Selective Implementation
Figure 5-1 LTC and Main Tank Temperature Profile
Figure 5-2 Failed Reversing Identified by Temperature Index
Figure 7-1 Establishing Priorities
Figure 7-2 Gearbox Approaches to Balancing Stakeholder Needs
Figure 7-3 Risk Decision Process
Figure 7-4 Asset Grouping by Risk and Performance Expectations
Figure 7-5 Risk-Based Failure Consequence and Probability Matrix
Figure 7-6 Asset-Directed Activity Matrix
Figure 7-7 Single-Line Diagram and Gas Sectionalizing Compartments (Inlay)
Figure 7-8 GIS Example, Transforming Information into Consequence and Activity Matrices
Figure 7-9 PFM Technical Analysis for Switchgear
Figure 7-10 PFM/FMECA for a Transformer
Figure 7-11 Information Flow and Processing
Figure 7-12 Five Steps Decision Flowchart for Asset Management (AM) Decision
Figure 7-13 Flowchart for Determining Redundancy Factor
Figure 7-14 Flowchart Example End Result
Figure 7-15 Exchange of Condition Data
Figure 7-16 Example of Analysis Tool
Figure 7-17 Schematic Structure of Data Mining Process
Figure 7-18 Relations of the Directly and Indirectly Analyzed PD Properties
Figure 7-19 Decision Support Flow Diagram for PD Diagnosis
Figure 7-20 Example of Time (Upper) and Type (Lower) Analysis
Figure 7-21 Schematic Structure of a Diagnostics Database
Figure 7-22 Screenshot of Cable Sections
Figure 7-23 Measurement Add and Update Screen
Figure 7-24 Dialog for Adding and Updating a Filter
Figure 7-25 Histograms Created in the Type View
Figure 7-26 Cable System in Original (Left) and Modified (Right) Form
Figure 7-27 Experience Norms/Rejection Levels for the PD Amplitude Levels
Figure 7-28 PD Occurrence Frequency
Figure 7-29 Database View of the Different Diagnosed Cable Systems
Figure A-1 PFM Framework
Figure A-2 Winding Failure Risk Analysis - Older Westinghouse
Figure A-3 Winding Failure Risk Analysis - Others
Figure A-4 LTC Failure Risk Analysis - Poor Performer
Figure A-5 LTC Failure Risk Analysis - Fair Performer
Figure A-6 LTC Failure Risk Analysis - Good Performer
Figure A-7 Main Winding Aging Model (Normal Loading)
Figure A-8 LTC Wear Model
Figure A-9 Average Failure Rate and Risk for an Aging Fleet of Transformers
Figure A-10 Example of Optimizing Maintenance Intervals Based on Lowest Life-Cycle Cost


LIST OF TABLES
Table 1-1 Typical PFM Drivers and Benefits
Table 4-1 Selective PFM Benefits
Table 5-1 Novel Condition Monitoring Approaches Used to Trigger Just in Time LTC Maintenance
Table 7-1 Scenario Approach
Table 7-2 FMECA of the Switchgear Secondary (Control) System
Table 7-3 Part of the FMECA Drive System Circuit Breaker
Table 7-4 Final Step: Decision With Action
Table A-1 LTC Population Characteristics, PFM Drivers and Benefits
Table A-2 LTC Condition Summary
Table A-3 Industry Experience with LTCs
Table A-4 Transformer Insulation Condition
Table A-5 PFM Technical Analysis Summary
Table A-6 Number of LTC Operations Where 63% Contact Wear Is Expected
Table A-7 Metrics for LTC Performance
Table A-8 Metrics for Main Insulation Performance


1
INTRODUCTION
Performance focused maintenance (PFM) is a methodology to help maintenance and asset managers direct their limited resources to maintenance tasks that will best contribute to reaching the organization's business goals. PFM integrates various technologies and techniques into maintenance in a phased approach that can be used to augment existing maintenance policies and programs and also develop new programs. The intent is to strengthen and build upon sound maintenance foundations rather than replace a utility's current maintenance practices. PFM recognizes that, at the highest level, the maintenance process should be very similar for most utilities. However, PFM also recognizes that the approach to and application of these processes will differ from company to company due to individual circumstances, including the wide range of customer requirements, electric infrastructures, and maintenance organizations. The flexible approach of PFM allows utilities to meet their specific operation and maintenance (O&M) goals and at the same time be confident that they are following an industry-accepted practice and the latest developments in maintenance technologies and methodologies.

Goals of the Project


The goal of the Electric Power Research Institute (EPRI) PFM project is to provide a maintenance framework that integrates many of the technical, economic, and managerial concepts that have been a foundation of maintenance for the past several decades as well as recently introduced concepts and ideas. PFM incorporates the concepts of modern asset management and integrates many previous EPRI activities, allowing utilities to build custom maintenance strategies that meet their business needs and utilize the best practices of the industry.

The PFM Concept


At the highest level, PFM is a methodology to answer the question: "Are my maintenance resources being used in the most effective and efficient way to achieve the desired performance goals of my utility?" A utility does not need to be dissatisfied with its existing maintenance program to apply some or all of the PFM concepts; in fact, existing maintenance programs are generally serving the utility well. It is also expected that some of the PFM elements are already incorporated into the utility's maintenance strategy but that there is room for improvement. PFM provides a structured but flexible approach to improving overall maintenance effectiveness that can adapt to an individual utility's needs and resources.


There are two basic starting points for reviewing and improving maintenance:

• One can start from an identified issue that requires correction and then work back, analyzing and reviewing the maintenance tasks and strategies that influence that issue. The scope of such a review depends on the issues being addressed. For example, if the issue identified is excessive maintenance costs across the board, a total review of maintenance could be initiated. However, if the concerns were about excessive corrective maintenance costs, an analysis and identification of the major contributors to those costs would be indicated. If a particular type or model of equipment was identified as a significant factor resulting in corrective maintenance, the PFM review could be directed at that equipment and the maintenance associated with it.

• The other starting point would be at the level of a particular maintenance subprogram or task in order to evaluate the results achieved by that activity in relationship to the costs expended. This approach addresses issues such as:
  - Is this the most efficient task for the desired results?
  - Can monitoring be cost effectively substituted for scheduled maintenance?
  - What are the impacts on performance if a particular task is eliminated?

Regardless of the starting point, application of PFM should result in a similar set of recommendations. PFM directs review and analysis only at the areas targeted for improvement. Potential benefits of PFM are listed in Table 1-1. The PFM methodology will be further explained and illustrated with examples in the following sections.
Table 1-1
Typical PFM Drivers and Benefits

Problem: Underfunding
Benefits:
• Identifies where cutbacks are prudent
• Identifies consequences of reduced maintenance
• Links maintenance requirements with utility goals and objectives

Problem: Reduced reliability
Benefits:
• Identifies appropriate tasks for preventing loss of function and risk of failure
• Establishes realistic reliability goals
• Suggests design changes to improve reliability

Problem: Inefficient use of data
Benefits:
• Identifies what data are needed for maintenance and how the data are to be used in a continuous improvement process
• Reveals how data can be used on a predictive basis

Problem: Lack of executive support
Benefits:
• Tightly links maintenance activities to executive goals
• Measures continuous progress
• Identifies risks associated with program changes

Problem: Regulatory oversight
Benefits:
• Reassures regulatory bodies that maintenance is being effectively managed
• Ensures that regulatory requirements are followed
• Charts progress
• Provides a documented basis for the current approach


2
BALANCED MAINTENANCE APPROACH
Over the past decades, utility maintenance has transformed from a quiet, routine, and behind-the-scenes activity to one of the most dynamic and high-interest segments of the utility industry. Numerous approaches to maintenance have been developed, each publicized as the most advanced. While some of these new approaches have been more revolutionary than others, they have fueled a desire to identify a single best practice approach to maintenance. However, because there is no one-size-fits-all approach to maintenance, the benefits of these new approaches are often either overstated or not realized. PFM acknowledges that each utility must find its own balance between the desire to prevent failures and the ability to finance maintenance, between its aversion to risk and its ability to predict failures, and between what its customers desire in terms of reliability and what its regulators are willing to include in rates. To achieve and maintain these balances, PFM includes a framework of key elements that are pictured in Figure 2-1.

Figure 2-1 The PFM Framework

The PFM framework is designed to:

• Provide a methodology to effectively and appropriately apply new maintenance concepts
• Incorporate risk analysis
• Identify a detailed and in-depth approach to managing and prioritizing the maintenance of specific assets and an overall approach to optimize asset and task performance
• Allow utilities to build their own individual maintenance strategies with the assurance that they are incorporating best practices that fit their own specific corporate and customer service objectives
• Leverage currently available data and information resources to use in algorithms to predict asset deterioration and incipient functional failure
• Embrace business goals and customer service objectives by employing maintenance approaches that are both technically and economically effective
• Support dynamic prioritization
• Set realistic maintenance goals
• Supply feedback, making continuous maintenance improvements that are a requirement, not an option, of maintenance
• Identify best practice building blocks
• Identify models that take greater advantage of data and underutilized native intelligence to trigger maintenance and replacement decisions
• Build from previous EPRI projects

Definitions Used in this Report


Utility maintenance has undergone some revolutionary changes during the past few decades. With these changes has come a new vocabulary that sometimes adds as much confusion as clarity to the subject. Because PFM encompasses many of these recent maintenance concepts, it is important to establish a clear set of definitions.

In the past, maintenance terms and definitions varied from source to source. A dictionary might define maintenance as "the act of keeping equipment in the state of repair." In PFM, a much broader definition is used: maintenance refers to all activities performed on equipment and systems in order to manage, assess, maintain, or restore their operating functionality. It is important to note that monitoring, inspecting, testing, and measuring are maintenance activities.

In the utility industry, maintenance has been targeted for improvement with a focus on maximizing equipment reliability while minimizing the cost of performing time-based and condition-based maintenance tasks. As a result, significant effort has been placed on creating a common set of maintenance process and maintenance task definitions. EPRI has recently published various documents in the Generation sector with a clear set of maintenance terms and definitions. To make the best use of this guide, it is useful to review and understand these basic terms and concepts. The definitions are consistent with those widely accepted and practiced in the fossil and nuclear utility industries [1]. They are also consistent with, and easily associated with, the many terms used in the transmission and substation utility business units.


Maintenance Philosophy

A maintenance philosophy is an organization's basic set of beliefs for developing a strategy to meet its overall business goals. Maintenance philosophies include goals such as maintaining high reliability, being the low-cost provider, minimizing capital investments, and increasing customer satisfaction.

Maintenance Strategy

A maintenance strategy is the specific set of actions and plans developed by an organization to support its philosophy and accomplish its goals. The specific strategy that an organization deploys consistent with its philosophy includes an appropriate mix of the maintenance processes and tasks/activities described below.

Reliability Centered Maintenance

Reliability centered maintenance (RCM) is a process to study and analyze equipment criticality, functions, failure modes, and causes to determine the appropriate technical mix of maintenance tasks that will best help an organization achieve its reliability goals. It is a step-by-step approach to optimize the maintenance task balance by incorporating equipment/plant knowledge, maintenance history, and industry experience. The RCM process results in a documented set of technically and economically effective maintenance tasks.

Maintenance Basis

The maintenance basis (MB) is the documented understanding of expected equipment and system failures, together with the rationale for the maintenance tasks and frequencies chosen to achieve an organization's desired goals for safety, the environment, equipment reliability, and O&M costs.


Maintenance Tasks

There are five basic maintenance tasks that an organization must perform to protect or restore component/equipment/system functions. These tasks are:

• Corrective maintenance (CM) tasks
• Preventive maintenance (PM) tasks
• Condition directed/based maintenance (CDM) tasks
• Predictive maintenance (PdM) tasks or condition based maintenance (CBM)
• Hidden failure finding (HFF) tasks

Corrective Maintenance Tasks

The corrective maintenance (CM) process is the most basic of maintenance processes. It is also commonly referred to as reactive maintenance. CM is the process of restoring equipment or components affecting personnel safety or equipment/plant reliability that have failed, are degraded, or do not conform to their original design, configuration, performance criteria, or intended functions. A component should be considered failed or degraded if the deficiency is similar to any of the following examples:

• Is removed from service because of actual or incipient failure
• Does not meet design specifications for configuration or performance
• Creates a personnel safety hazard or equipment reliability concern
• Adversely affects the performance of nearby equipment (for example, missing piping insulation that increases the operating temperature of nearby electrical equipment)
• Releases fluids that create contamination concerns (or have the potential to, under postulated accident conditions)
• Adversely affects controls or process indications that directly or indirectly impair an operator's ability to operate the equipment or reduce redundancy of important equipment

There are two types or classifications of CM tasks:

• If an organization's strategy for a component or equipment is to run the component/equipment to failure (RTF), the failure is expected. The resulting maintenance task to restore or repair the component or equipment is considered to be an expected corrective maintenance activity (CM-E).
• If an organization's strategy for a component or equipment is to prevent or avoid equipment failure due to the significance of the function of a component or equipment, the resulting maintenance task to restore or repair the component or equipment is considered to be an unexpected or undesired corrective maintenance activity (CM-U).
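The CM-E/CM-U distinction above depends only on the strategy that was assigned to the equipment before it failed. A minimal Python sketch of that rule follows; the names `Strategy`, `CorrectiveWorkOrder`, and `classify_cm` are hypothetical and are not taken from this report.

```python
from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):
    """Hypothetical labels for the strategy assigned to a piece of equipment."""
    RUN_TO_FAILURE = "RTF"        # failure is expected and accepted
    PREVENT_FAILURE = "prevent"   # failure is to be avoided


@dataclass
class CorrectiveWorkOrder:
    equipment_id: str
    strategy: Strategy  # strategy in force before the failure occurred


def classify_cm(order: CorrectiveWorkOrder) -> str:
    """Return 'CM-E' for expected and 'CM-U' for unexpected corrective work."""
    if order.strategy is Strategy.RUN_TO_FAILURE:
        return "CM-E"
    return "CM-U"
```

Counting CM-U work orders separately from CM-E work orders is one way such a classification might be used, since only the former indicates that a preventive strategy did not work as intended.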


Preventive Maintenance Tasks

The preventive maintenance (PM) process includes all program aspects to effectively manage periodic condition monitoring and periodic time-based actions taken to maintain or ascertain the condition of a piece of equipment within design operating conditions and to extend its life. These actions are performed before equipment failure, to avoid performance degradation or to reduce the likelihood of equipment failure. The maintenance tasks/activities that are generated as a result of the PM process are:

• Time-based tasks to restore a piece of equipment to a new or improved condition or to replace a piece of equipment at a certain point in time. These tasks are identified as PM-RRs.
• Condition monitoring tasks that include the collection of data that could be used to indicate condition, such as visual inspections, electrical testing, infrared thermography, and oil or gas analysis. These tasks are identified as PM-CMTs.

Condition Directed/Based Maintenance Tasks

Condition directed/based maintenance (CDM) tasks are the resulting tasks that are triggered when a condition monitoring task indicates that end-of-life is near. These tasks renew the item to a "like new" or "good as new" condition.
Predictive Maintenance Tasks (Also Referred to as Condition Based Maintenance)

Predictive maintenance (PdM) tasks are maintenance activities that require models, technologies, people skills, and communication to integrate all equipment condition data to make timely decisions about maintenance requirements. PdM tasks are often confused with condition monitoring tasks (PM-CMTs), which consist of collecting condition monitoring data. The result of an effective PdM process is informed and effective CDM decisions. The maintenance tasks carried out as a result of decisions generated by the PdM process are referred to as CDMs.
Hidden Failure Finding Tasks

Hidden failure finding (HFF) tasks are a special form of condition monitoring. Failure finding involves the detection of a failed function that is not obvious to the operators of the utility system. The task generally involves the functional operation of the device or system, resulting in a go or no-go decision. If the function cannot be performed, a condition directed task is implemented to return the item to a fully functional condition.
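The go/no-go logic of a failure finding task can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `failure_finding_task` and its `operate_function` argument are hypothetical names, and treating an error raised during the test as a no-go result is a design choice of this sketch, not something the report specifies.

```python
from typing import Callable, Optional


def failure_finding_task(operate_function: Callable[[], bool]) -> Optional[str]:
    """Exercise a normally dormant function and report the follow-up task.

    Returns None on a "go" result and "CDM" (a condition directed task)
    on a "no-go" result.
    """
    try:
        went = operate_function()
    except Exception:
        # Assumption of this sketch: a test that cannot complete is a no-go.
        went = False
    return None if went else "CDM"
```

For example, `failure_finding_task(lambda: True)` returns `None`, while a test callable that returns `False` yields `"CDM"`.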


3
ELEMENTS OF PFM
To provide a balanced approach to maintenance, PFM employs several subprocesses to ensure that the needs of all maintenance stakeholders are addressed. This section of the report describes each of these subprocesses at a high level. Detailed documentation of subprocess procedures will be part of future EPRI work. In its simplest form, PFM can be broken into seven major sequential processes, as described in Figure 3-1. These processes are:

1. Planning
2. Technical maintenance approach
3. Aging and modeling
4. Best practices
5. Measuring performance
6. Documentation and implementation
7. Measurement and feedback


Figure 3-1 PFM Process Block Diagram

1. Planning: Aligning Maintenance with Utility Goals


Regardless of the state of the current maintenance program, it is critical that one understands that maintenance is just one of many functions contained in the utility operating and business structures. Maintenance must not function as an isolated organization but integrate with all of the utility business elements to ensure that the overall goals and objectives of the utility are being met. Linking maintenance to important strategic utility goals guarantees long-term support and helps define maintenance success.


PFM Planning Process Objective

The objective of the PFM planning process is to assemble the appropriate team for performing the PFM process and to ensure that the team understands how maintenance supports all or parts of each specific strategic utility business, employee, and customer service goal.

Elements of the Planning Process

Executive Sponsorship and Reporting

Development of effective and far-reaching initiatives will not be possible if the PFM process is not supported at the appropriate utility management level. Sustained executive sponsorship and visibility is necessary for success. The PFM process requires resources from many organizations and could have far-reaching impacts. Clear identification of sponsors, reporting structures, analysis scope, and timetables must be made before undertaking all or part of a PFM process.

System Selection

Maintenance is focused on ensuring the functional performance of specific utility assets. Each of these assets has its own technical operating and business requirements. PFM is a focused process that requires the resources of subject matter experts (SMEs). Each asset or system of assets has its own set of experts. To provide focus for the PFM team and to ensure that it includes the right mix of resources, the assets and systems to be analyzed must be clearly defined.

Team Assembly

PFM is performed by a multi-disciplined team that has a good understanding of critical maintenance stakeholder needs. Team members are not necessarily involved in all phases of the PFM process but should understand the objectives and agree on the findings. Potential team members include:

• Asset managers
• Maintenance engineers
• Maintenance supervisors
• Maintenance technicians
• System planners
• Apparatus engineers
• Data analysts
• Financial analysts
• System operators


Understanding Utility Goals

Maintenance is a necessary element of every utility. It is a support organization that plays a major role in ensuring that the strategic utility goals are consistently being met. The PFM team must understand how maintenance successes and failures impact each of these strategic goals. How the utility goals place constraints on maintenance must also be understood; future PFM recommendations must be consistent with both these goals and constraints.

2. Developing a Technical Maintenance Approach


At the most fundamental level, the maintenance goal is to prevent the loss of critical functions by performing an appropriate set of maintenance tasks, thus reducing the likelihood of functional failure to an acceptable level.

PFM Technical Process Objectives

The objectives of this PFM process are to develop a technical understanding of how failures manifest themselves, the effects of failure, and how the failures can be effectively prevented.

Elements for Developing a Technical Maintenance Approach

Identifying Critical Functions

Every asset in the utility has a series of critical functions that it is expected to perform at a very high degree of reliability. These critical functions must be identified so that the appropriate maintenance strategy for guaranteeing their success can be developed.

How Do Failures Manifest Themselves?

Only by understanding the mode of failure and the precursors to failure can one develop an effective, preventive strategy.

Identifying the Effects of Failure

Failures with benign effects are good corrective maintenance candidates. Failures with severe effects are worthy of a preventive strategy or even a design change. The effect of a failure along with its probability identifies the risk associated with the failure. Understanding the effects of a failure in terms of safety, costs, system impact, and customer outage consequences is helpful in setting the level of acceptable risk.
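The relationship between failure effects, probability, and risk can be illustrated with a short Python sketch. It assumes the common simplification that risk is the product of an annual failure probability and a single monetized consequence; the report itself only states that effect and probability together identify risk. The names `FailureMode` and `suggest_strategy` and all numeric inputs are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    annual_probability: float  # assumed estimate, between 0 and 1
    consequence_cost: float    # assumed combined effect, in currency units

    @property
    def risk(self) -> float:
        # Assumption of this sketch: risk = probability x consequence.
        return self.annual_probability * self.consequence_cost


def suggest_strategy(mode: FailureMode, acceptable_risk: float) -> str:
    """Benign failure modes stay corrective; the rest need prevention."""
    if mode.risk <= acceptable_risk:
        return "corrective maintenance candidate"
    return "preventive strategy or design change"


def rank_by_risk(modes):
    """Highest-risk failure modes first."""
    return sorted(modes, key=lambda m: m.risk, reverse=True)
```

A ranking of this kind is only as good as the probability and consequence estimates behind it, which is why the acceptable risk level and the effect categories (safety, cost, system impact, customer outage) need to be agreed on first.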


Selecting the Right Preventive Strategy

There are often several effective methods for preventing a failure from taking place or reducing its impacts. Choosing the best single approach requires both a technical understanding of the equipment and a practical understanding of utility operations. PFM task selection models help to ensure that technically and economically effective preventive strategies are developed.

Data and Measures

Many maintenance strategies are driven by operational and diagnostic data. A listing of all available data elements is developed and a determination is made as to their value, usefulness, and impacts on future maintenance activities. The resultant list provides a foundation for future database requirements, performance metrics, and the development of predictive algorithms.

3. Building an Aging Model


Each failure has its own unique mechanism of manifesting itself. Understanding this mechanism allows one to identify conditional changes that occur prior to failure and to build models that predict those conditional changes.

PFM Aging Model Objectives

The objective of the PFM aging model process is to use information and insight gained during the development of a technical maintenance approach and build models that describe how an item ages. These models focus on the root cause of failure and its associated aging mechanism. Because the probability of failure changes with age for most failure mechanisms, the risk associated with failure also changes. By developing a mathematical aging model, one can better assess the risk associated with various maintenance strategies as well as develop a predictive approach to some maintenance tasks.

Elements for Building Aging Models

Determining the Aging Mechanism

Time is not the only driver of an aging process. Aging is often influenced by operating events, temperature, loading, or other factors. The dominant cause of aging must be identified in order to build an appropriate aging model.


Can Aging Be Measured?

Aging models are often built from data collected during maintenance and diagnostic testing. If historical data are available, models can be created directly from the data. If data are not currently available, the analysis team should determine what data can be collected in the future to calibrate the model.

What is an Acceptable Level of Risk?

Acceptable risk levels are rarely explicitly defined by the utility. These levels are more generally alluded to in the utility's stated goals and objectives in various forms. These forms may include:

• System average interruption duration index (SAIDI) and system average interruption frequency index (SAIFI) targets
• Maximum allowable failure rates
• Environmental exposure
• Safety goals

These generalized goals and objectives must be transformed into specific failure probabilities or risk limits.

Limiting the Risk

Risk is a function of the probability of failure and the total effects of a failure. The aging model that predicts end-of-life can also be used to predict probability of failure and the associated risk. By applying the maintenance tasks identified above at the appropriate apparatus age, the amount of risk being taken by the utility can now be capped.
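As an illustration of how an aging model can be used to cap risk, the following Python sketch assumes a two-parameter Weibull distribution for the cumulative probability of failure and assumes that risk is probability multiplied by a single consequence value. Neither assumption is prescribed by the text above, and the function names and parameters (`eta` for characteristic life, `beta` for shape) are hypothetical inputs that would have to be fitted from the utility's own data.

```python
import math


def weibull_cdf(age: float, eta: float, beta: float) -> float:
    """Assumed aging model: cumulative probability of failure by a given age."""
    return 1.0 - math.exp(-((age / eta) ** beta))


def age_at_probability(p: float, eta: float, beta: float) -> float:
    """Invert the Weibull CDF: age at which failure probability reaches p."""
    return eta * (-math.log(1.0 - p)) ** (1.0 / beta)


def latest_maintenance_age(risk_cap: float, consequence: float,
                           eta: float, beta: float) -> float:
    """Latest age at which risk (probability x consequence) stays within the cap."""
    p_limit = risk_cap / consequence
    if p_limit >= 1.0:
        return math.inf  # even certain failure stays within the cap
    return age_at_probability(p_limit, eta, beta)
```

In this sketch the age may be measured in years, operations, or any other unit that matches the dominant aging mechanism. Scheduling the renewing task before `latest_maintenance_age` is what keeps the risk under the cap.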

4. Creating a Maintenance Plan: Best Practices


Maintenance plans are composed of a series of tasks aimed at effectively and technically reducing the probability of a functional failure. Each task has its own requirements and costs. Since maintenance involves human and equipment resources, the various maintenance task activities must be coordinated in a manner that produces high reliability, high availability, and lower costs.

PFM Plan Objectives

The objective of PFM plan development is to ensure that maintenance tasks are logically integrated and that intervals and task triggers result in the correct balance between reliability and cost.


Elements for Building the Maintenance Plan

Task Triggers

Maintenance activities must be performed at the appropriate time, either during the periodic scheduling of activities based on the calendar or after a series of operating events. Maintenance is also often performed when a specific condition is observed or measured (resulting in CDM). For each maintenance task, the appropriate trigger must be identified. A determination of the urgency of each task should also be made so that an understanding of the risk associated with delays can be predetermined.

Optimizing Maintenance Intervals

The scheduling bandwidth associated with periodic maintenance is often quite wide. Reduced maintenance intervals may result in reduced risk and higher reliability at the expense of increased maintenance costs. Extended periodic maintenance intervals can have the opposite impact. Choosing the right periodic interval should result in:

• Not exceeding risk levels
• Low cost
• High reliability

The aging and risk models developed previously are used to identify the risk, reliability, and costs associated with various periodic maintenance intervals. These models are then used to set the optimal maintenance interval.

Building a Maintenance Plan

Effective maintenance takes full advantage of available O&M resources. This means that the technical maintenance strategy identified previously is not implemented in a vacuum. The development of a maintenance plan recognizes that maintenance must also be a coordinated process making effective use of:

• Planned outages
• Routine observations
• Automated data and event collection
• Specialty human and technical resources
• Customer needs
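The interval optimization described under Optimizing Maintenance Intervals can be sketched as a small search over candidate intervals. The Python example below is a minimal illustration, assuming a Weibull aging model and the classic age-replacement cost-rate formulation (expected cost per cycle divided by expected time in service per cycle); the report does not state which model it uses. All names and parameters are hypothetical.

```python
import math
from typing import Iterable, Optional


def reliability(t: float, eta: float, beta: float) -> float:
    """Assumed Weibull survival probability at age t."""
    return math.exp(-((t / eta) ** beta))


def cost_rate(interval: float, eta: float, beta: float,
              cost_pm: float, cost_failure: float, steps: int = 1000) -> float:
    """Expected cost per unit time for maintenance at a fixed interval (> 0)."""
    dt = interval / steps
    # Expected time in service per cycle: integral of R(t) from 0 to interval,
    # approximated with the trapezoid rule.
    expected_life = sum(
        0.5 * (reliability(i * dt, eta, beta)
               + reliability((i + 1) * dt, eta, beta)) * dt
        for i in range(steps)
    )
    survive = reliability(interval, eta, beta)
    expected_cost = cost_pm * survive + cost_failure * (1.0 - survive)
    return expected_cost / expected_life


def best_interval(candidates: Iterable[float], eta: float, beta: float,
                  cost_pm: float, cost_failure: float,
                  max_failure_prob: float) -> Optional[float]:
    """Cheapest candidate interval whose failure probability stays within the limit."""
    feasible = [t for t in candidates
                if 1.0 - reliability(t, eta, beta) <= max_failure_prob]
    if not feasible:
        return None  # no candidate satisfies the risk limit
    return min(feasible,
               key=lambda t: cost_rate(t, eta, beta, cost_pm, cost_failure))
```

The risk limit is applied as a hard constraint before cost is considered, which mirrors the ordering of the criteria above: risk levels must not be exceeded, and cost is minimized among the intervals that remain.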


Dynamic Prioritization

Even the best maintenance plans cannot guarantee the availability of all necessary resources to ensure that they are performed on schedule. As maintenance is based more and more on diagnostics, condition monitoring, and predictive algorithms, it becomes difficult to develop detailed, long-range maintenance plans. It is inevitable that some maintenance delays will take place. It is important that these maintenance delays be managed in a way that minimizes their impact. Dynamic prioritization involves understanding the effects of delayed maintenance and the risk sensitivity of each maintenance trigger. Dynamic prioritization implies that maintenance scheduling delays are acceptable and each maintenance plan has its own resilience to these delays. Dynamic prioritization allows dissimilar maintenance plans to be compared and identifies the relative criticality of each plan.
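One way to compare dissimilar maintenance tasks, sketched below in Python, is to rank them by the additional risk incurred if each is delayed. The scoring rule (increase in failure probability multiplied by consequence) is an assumption made for this illustration, and the names `PendingTask`, `delay_risk`, and `prioritize` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PendingTask:
    name: str
    consequence: float         # assumed combined effect of a failure
    p_fail_on_time: float      # failure probability if done on schedule
    p_fail_if_delayed: float   # failure probability if delayed one period

    @property
    def delay_risk(self) -> float:
        # Additional risk taken on by delaying the task.
        return (self.p_fail_if_delayed - self.p_fail_on_time) * self.consequence


def prioritize(tasks: List[PendingTask]) -> List[PendingTask]:
    """Tasks that are most sensitive to delay come first."""
    return sorted(tasks, key=lambda t: t.delay_risk, reverse=True)
```

Tasks near the bottom of such a ranking are the ones whose plans are most resilient to delay, so they are the natural candidates to defer when resources are short.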

5. Measuring Performance
After a maintenance plan has been developed but before it is implemented, the expected results of the plan must be identified. These performance targets allow for meaningful external review by the maintenance stakeholders and set the stage for the continuous improvement of the maintenance program.

PFM Measurement Objectives

The goal of the PFM performance measurement process is to identify a set of metrics and key performance indicators (KPIs) that objectively and meaningfully identify the progress of maintenance and provide linkage to higher level strategic utility goals and objectives. For each metric and KPI, a set of targets must be set so that success is defined. Data required to calculate these metrics and KPIs must be readily available, or an action plan to obtain these data must be created.

Setting Specific Maintenance Goals

Regardless of the forces that are driving a review of the current maintenance program, realistic goals must be set so that targeted actions can take place. These goals must be specific, be achievable, and address the requirements of all major utility stakeholders. Without goals it is difficult to determine if overall improvement really takes place and impossible to know when success is achieved. For maintenance, these goals can be generalized and quantified in terms of:

• Safety
• Functional reliability
• Equipment or system availability
• Equipment maintainability
• Economics
• Quality of service

These goals should form the foundation of any maintenance program and serve as both the starting point and final destination for PFM. If goals have not been correctly established, it is difficult to choose the appropriate course of action. If stakeholders do not agree that the goals are adequate, they will never be satisfied. Many times the goals of the maintenance organization, while valiant, do not meet and sometimes conflict with the expectations of the various stakeholders. Developing realistic goals that can be embraced by all stakeholders provides the foundation for building a sustainable maintenance program. Sometimes every goal the stakeholder desires can be met, and sometimes they cannot. In the latter case, re-evaluation of the goals, system design, or both may be required.

Developing Metrics and KPIs

For each maintenance plan, a specific set of metrics must be developed to ensure that the plan is technically effective and that utility, reliability, and operating objectives will continue to be met into the future. These metrics and KPIs address:

• Reliability
• Availability
• Maintainability
• Supportability
• Financial prudence

Determining Data Requirements

When developing the technical approach for maintenance, a list of available data elements is developed. This list must be reviewed in the context of metrics and KPIs. Each data element should support an immediate need to take action, a predictive aging model, or one or more metrics or KPIs. Data that do not support any of these items must be reassessed for their value. Conversely, each metric and KPI should rely on readily available data. If data are not available, a determination of the value of the metric or KPI must be made, resulting in either a change in metrics or KPIs or the development of a new data collection requirement.
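The two-way review described above (data elements against their uses, and metrics against available data) can be expressed as a simple cross-check. The Python sketch below is illustrative only; the function name, the dictionary layout, and the example data element names are assumptions, not items from this report.

```python
from typing import Dict, List, Set, Tuple


def review_data_requirements(
    data_uses: Dict[str, Set[str]],
    metric_inputs: Dict[str, Set[str]],
) -> Tuple[List[str], Dict[str, List[str]]]:
    """Cross-check data elements against metrics and KPIs.

    data_uses maps each available data element to its non-metric uses
    (for example "action" or "aging_model"); metric_inputs maps each metric
    or KPI to the data elements it needs.
    """
    used_by_metrics = set().union(*metric_inputs.values())
    # Data elements that support no action, no aging model, and no metric.
    unsupported_data = sorted(
        name for name, uses in data_uses.items()
        if not uses and name not in used_by_metrics
    )
    available = set(data_uses)
    # Metrics or KPIs that depend on data not currently collected.
    metrics_missing_data = {
        metric: sorted(inputs - available)
        for metric, inputs in metric_inputs.items()
        if inputs - available
    }
    return unsupported_data, metrics_missing_data
```

The first result lists data elements whose value should be reassessed; the second lists metrics and KPIs that need either to be changed or to be backed by a new data collection requirement.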


Setting Targets

All maintenance metrics and KPIs must have a set of targets that indicate the following:

- Goals are being met.
- Goals are not being met, but the program is under control.
- The maintenance plan is not effective, and changes are necessary.

The targets must align with the aging model development and risk analysis previously performed. Failing to align these items will result in targets that are well below the risk acceptance level or well beyond the capabilities of the existing maintenance resources.

Identifying the Current State of Maintenance

Once the goals have been established, it is necessary to determine the maintenance program's current proximity to these goals. If historical data are available, this task should be relatively simple and accurate. If historical data are not available or are in the wrong form, performance estimates must be made from expert knowledge and whatever data are available. A lack of data should not be an insurmountable impediment; performance estimates can be made using data from a random sample of equipment. Care must be taken to ensure that the sample is large enough to represent the whole population and covers an appropriate time frame. Figure 3-2 shows the relationships among a utility's goals, maintenance performed to achieve those goals, and the performance results.

Gap Analysis

The difference between the desired goal and the current situation is known as a gap. The wider the gap, the more likely the current maintenance program is in need of in-depth review or change. Similarly, if there is a large number of gaps, a broader analysis must be made in order to understand the cause for these gaps and the best methodology to close them. The identification of gaps may indicate a need to revise previously identified maintenance program targets or a need to conclusively determine if the proposed maintenance program can effectively meet expectations.
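A minimal sketch of how the three target zones and the gap could be computed for a single KPI follows. The threshold names (`goal`, `control_limit`), the zone labels, and the sample values are illustrative assumptions; the report does not prescribe a specific calculation.

```python
def kpi_status(value, goal, control_limit, higher_is_better=True):
    """Classify a KPI value into one of the three target zones."""
    if not higher_is_better:
        # Flip signs so the same comparisons work for "lower is better" KPIs.
        value, goal, control_limit = -value, -goal, -control_limit
    if value >= goal:
        return "goals met"
    if value >= control_limit:
        return "not met, under control"
    return "plan not effective"

def gap(value, goal, higher_is_better=True):
    """Shortfall between goal and current value, in KPI units (0 if the goal is met)."""
    shortfall = goal - value if higher_is_better else value - goal
    return max(0.0, shortfall)

# Hypothetical availability KPI in percent (higher is better).
print(kpi_status(97.2, goal=99.0, control_limit=98.0))  # plan not effective
print(round(gap(97.2, goal=99.0), 2))                   # 1.8

# Hypothetical failures-per-year KPI (lower is better).
print(kpi_status(3, goal=2, control_limit=4, higher_is_better=False))  # not met, under control
```

Ranking KPIs by the size of `gap` (after normalizing units) is one way to decide where an in-depth review is most needed.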


Figure 3-2 The Relationships Among Utility Goal, Maintenance, and Performance

6. Documentation and Implementation


Comprehensible documentation is necessary to gain outside support and to keep the maintenance program living. The documentation forms the basis for plan acceptance and future enhancements. Although each step of the PFM process should be documented at the time it is performed, a final summary document must be developed.

Implementation of a new maintenance program is the most overlooked aspect of any maintenance review project. Unless the existing maintenance program has been well refined or resistance to change has been insurmountable, it is necessary to formulate an implementation plan. It should be recognized that:

- Implementation failures account for a large percentage of program failures.
- Implementation costs make up about 25% of total PFM program costs.
- Implementation is the time to solidify worker and management acceptance and buy-in.


PFM Documentation and Implementation Objectives

The objective of the implementation and documentation process is to ensure that the PFM recommendations are implemented with minimal objection, that the work of the PFM team can be understood by all maintenance stakeholders, and that continuous improvements can be sustained well into the future. The process requires full reconciliation with the existing maintenance strategy and documentation as to why the change is necessary.

Elements for Documenting and Implementing the PFM Recommendations

Reconciliation

If a maintenance program previously existed, the PFM-based program must be compared with the current maintenance program. This comparison will identify differences that must be justified and potentially effective maintenance tasks that were overlooked or discarded during the PFM technical processes. Any resulting program changes must be identified and provided with supporting documentation so that a change implementation plan can be developed.

Identifying Change

Change can be a difficult process, especially if it is the response to a knee-jerk reaction. The PFM process will identify many changes that should be made to the existing maintenance program and its ongoing support processes and organization. Many of these changes will be identified by the stakeholders themselves and thus will have a pre-established base of support. It must be recognized that even this base of support does not guarantee that all organizations will immediately understand and embrace the change. Although not all changes require up-front approval and buy-in, there will be a few changes that are pivotal to the implementation of the PFM findings. These critical changes must be identified well in advance and formal change-management activities initiated. This change-management process may need the support of:

- Asset management
- Information technology
- Labor relations
- Maintenance craft personnel
- First-line supervisors
- Upper management


Impact Analysis

Every change to an existing maintenance plan has some impact. Some of these impacts are minor, while others may be severe and require significant resources to mitigate. The impact of each change must be identified. Then, the overall value of the change to the ability of the maintenance program to meet its goals and objectives must be determined. In some cases, the impacts might exceed the value of the change, and thus a different approach is required.

Change Management

Whether a change is welcome or not, it generally does not happen without support and management. All changes identified by the PFM process must be managed to ensure that they are made correctly and in a timely manner.

Implementation Plans

Changes to an existing maintenance plan and system can be significant events themselves. Many times these changes require both financial and technical resources that must be secured in advance. To ensure that the implementation goes smoothly, specific implementation plans must be developed to address:

- Changes to the maintenance management system
- Resources needed for implementation
- Additional training
- New equipment purchases
- New procedures
- New monitoring equipment
- Responsibilities for action and approvals, interfaces, schedules, and milestones

Implementation planning also requires:

- Executive sponsorship
- Training
- Leadership program development team
- Tight link between program developers and craft
- Feedback mechanism
- Sustained support


7. Measurement and Feedback


No single piece of equipment is 100% reliable, nor is any PM program 100% effective. PM programs improve with age only if O&M targets are defined and focused. The path to meeting these goals is not singular and linear but iterative and circular.

PFM Measurement and Feedback Objectives

The objectives of the measurement and feedback process are to ensure that the maintenance program is on track to meet maintenance technical, financial, and operational objectives and to communicate progress to all maintenance stakeholders.

Elements for Measuring Maintenance Effectiveness and Providing Feedback

Measurement

The metrics and KPIs previously identified must be updated on a periodic basis. Data should be readily obtained from appropriate collection systems and suitable calculations made. Depending on the KPI or metric, updating may be necessary only at quarterly or yearly intervals.

Reporting

Not all maintenance stakeholders are interested in all the KPIs or metrics. Only items of interest need to be periodically reported to each stakeholder group. The reporting should include:

- Historical trends
- Summary analysis of the data
- Future expected maintenance actions that have influence on these items

Making Corrections

Each metric or KPI has a predefined acceptance level. Negative trends or off-target conditions for any of these reporting items must receive follow-up analysis. This analysis must identify:

- Data or operating anomalies
- New information not identified or addressed by previous PFM analyses
- Needs to take action
- Suggested corrective actions


Implementing New Technologies and Maintenance Tasks

New technologies and maintenance techniques are continually being developed. Many of these items have the potential of improving the existing maintenance program, while others may just be a cost resulting in little or no performance improvement. These new opportunities must be analyzed, tested, and validated before they are made part of the mainstream maintenance program.


4
USING A TARGETED APPROACH WITH PFM
Although PFM is a comprehensive process for developing and assessing a maintenance program, it does not need to be employed in totality. PFM can be applied on a targeted basis, using only selected PFM activities. Figure 4-1 shows a logic diagram for selectively implementing PFM.


Figure 4-1 Simplified PFM Decision Diagram for Selective Implementation


Some of the potential benefits that can be gained through selective implementation are listed in Table 4-1.
Table 4-1 Selective PFM Benefits

Activity: Typical Benefits Realized from the Selective Activity

Stakeholder identification:
- Identifies your customers/maintenance allies
- Expands the list of actual maintenance benefactors

Maintenance goal setting:
- Clearly identifies the actual needs of all maintenance benefactors
- Legitimizes the impacts of failure
- Sets a foundation for building business cases for various maintenance initiatives
- Sets specific and measurable maintenance targets

Measures and metrics:
- Provides a path for continuous improvement
- Provides a cause and effect relationship between maintenance strategy and maintenance results
- Identifies specific methods for determining success and the needs for change
- Provides clarity to data collection requirements
- Identifies the need for corrective actions when metrics exceed boundaries
- Provides input to reliability models
- Charts progress

KPI development:
- Provides executives with a meaningful view of the value of maintenance
- Provides regulators with a level of confidence that maintenance is being managed
- Charts progress

Function and failure analysis:
- Identifies most effective maintenance tasks
- Eliminates ineffective and redundant tasks
- Identifies the need for design change

Predictive modeling:
- Leverages the use of data
- Reduces maintenance costs
- Increases equipment reliability and availability
- Forms a technical basis for equipment replacement

Task interval optimization:
- Reduces total overall maintenance costs
- Provides a direct linkage between reliability and task frequency
- Provides justification for maintenance intervals
- Identifies the impacts of budget changes

Maintenance strategy documentation:
- Provides a consistent basis for maintenance decisions
- Instills confidence in the minds of executives and regulators that maintenance is being well managed

Task prioritization:
- Helps allocate limited resources effectively
- Ensures that the utility gets the biggest bang for its maintenance dollar

Closing the Gaps with PFM Selective Activities


The following examples describe common ways that the PFM templates have been used to enhance mature maintenance programs.

Goals Are Unclear

The PFM performance measurement activity provided the utility asset manager with a methodology for setting maintenance goals with results that are measurable.

Reliability Is Below Expectations

Reliability requirements were identified for each critical function, and appropriate metrics were developed that clearly indicated whether or not progress was being made. The metrics template provided the PFM leader with a framework for identifying suitable metrics and measures.


Executives Are Confused About the Value of Their Maintenance Investments

Maintenance is a complex subject that must be managed by experts. Executives cannot be expected to understand maintenance at the same level as asset managers, but they must be able to see how maintenance allows the utility to meet its business and customer service goals. PFM KPI activities were used to distill maintenance activities into a balanced scorecard that provided executives with constant insight into how maintenance was effectively using precious resources to meet critical utility goals and how maintenance objectives aligned with the utility's risk profile.

Regulators Are Challenging Your Maintenance Program

Regulators understand process and results, but they do not understand the details of maintenance. When utility operation and customer service results do not meet expectations and there are no supporting processes to validate the current maintenance approach, regulators respond with orders that are not necessarily in the best interest of the customer or the utility. The PFM activity for building an active and documented maintenance strategy that could withstand the scrutiny of regulators was employed. It provided persuasive arguments supporting the utility's maintenance approach.

Availability Requirements Are Tightened

Reduced downtime, whether scheduled or nonscheduled, is the only way to increase equipment availability. Successful application of PFM technical approaches for developing a predictive maintenance strategy improved both reliability and availability.

Maintenance Tasks Are Not Achieving Desired Results

New technologies promise improved reliability at reduced costs, but will they work? The PFM task selection process ensures that the most proactive maintenance practices are being used.

Want to Make Better Use of Data

The approach to maintenance must be intelligent and business based, effectively using readily available data to predict equipment condition and measure both asset and program performance. PFM aging models have been used to better predict end-of-life.

Maintenance Task Intervals Are Suboptimal

Periodic maintenance intervals have been optimized based on technical and economic objectives, not arbitrary time intervals.


There Is Too Much Work and Not Enough Resources

PFM prioritization and resource allocation processes ensured that the most critical functions were receiving the appropriate attention of maintenance.

A Replacement Strategy Is Needed

Efficient direction of capital expenditures for replacement of equipment occurs when maintenance cannot achieve reliability and availability requirements and risk tolerances are exceeded. PFM activities for evaluating risk will serve as the basis for building an equipment replacement strategy.

Intellectual Property Is Lost

Expert system knowledge currently possessed by experienced personnel was captured and institutionalized, elevating the proficiency level of all technical and operations personnel during PFM documentation stages.


5
EXAMPLE APPLICATION OF PFM MEASURE AND PERFORMANCE ACTIVITIES

Overview
A large East Coast utility had embraced the concepts of RCM and applied them to much of its substation maintenance program. The revised maintenance program was superior to previous activities and was based on a strong technical foundation. Condition-monitoring activities were heavily employed to initiate just-in-time maintenance for its load tap changing (LTC) transformers. The novel condition monitoring approaches and their derived benefits are listed in Table 5-1.
Table 5-1 Novel Condition Monitoring Approaches Used to Trigger Just in Time LTC Maintenance

LTC Attribute Monitored: Derived Benefits

Operations counter:
- Frequent use may indicate excessive contact wear and require maintenance.
- Infrequent use may indicate that the tap changer requires exercising through the tap change or reversing switch.
- Assesses contact life from cumulative operations.

Operations per tap:
- Number of operations at each tap position may indicate excessive wear on certain taps or the reversing switch.
- Ensures regulating voltage range.
- Comparison with system voltage levels may indicate a tap changer control deficiency such as phase out of sync.
- Assesses contact life of each tap from cumulative operations.

Limit switch counter:
- Limit switch operations may indicate a gearing, voltage, or control deficiency.

Motor run time:
- Changes in run times above baseline levels may indicate a mechanical deficiency such as contact wear, gear fouling, motor fouling, or AC supply failure.
- Indicates contact or bearing wear.

Motor current:
- Changes in current may indicate a power supply failure or mechanical deficiency such as gear or motor fouling.

Gas-in-oil in diverter compartment oil:
- Changes in concentration of gas-in-oil above the baseline may indicate coking or incipient dielectric failure.
- Indicates overheating due to high resistance current paths.
- Indicates contact wear.
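Several entries in Table 5-1 depend on detecting changes "above baseline levels." A minimal sketch of one such check follows, assuming a simple statistical rule (baseline mean plus `k` standard deviations). The rule, the default `k`, and the sample motor run times are illustrative assumptions; the report does not state how the utility defined its baselines.

```python
from statistics import mean, stdev

def above_baseline(baseline, recent, k=3.0):
    """Return the recent readings that exceed baseline mean + k * standard deviation."""
    limit = mean(baseline) + k * stdev(baseline)
    return [x for x in recent if x > limit]

# Hypothetical motor run times per tap change, in seconds.
baseline_run_times = [5.0, 5.1, 4.9, 5.0, 5.2, 4.8]
recent_run_times = [5.1, 5.3, 6.0]
print(above_baseline(baseline_run_times, recent_run_times))  # [6.0]
```

The same function could be applied to motor current or gas-in-oil concentration, each with its own baseline data.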

PFM Findings
The PFM measurement and performance process revealed a significant gap between reliability and availability requirements and actual LTC performance. LTC contact wear was not being effectively monitored by the techniques presented in Table 5-1. A PFM technical maintenance approach was performed, and it was determined that LTC temperature modeling could provide significant insight into the state of each contact.

Overheated contacts in an LTC can result from various causes, such as coking, misalignment, and loss of spring pressure. Because contact temperature cannot be directly measured, the overheating will generally be detected by an increase in the LTC oil temperature. If the overheating progresses to an advanced stage, the oil quality will deteriorate and bubbles may form. A flashover between contacts could occur, which would place a short circuit on the regulating winding and cause the transformer to fail.

The technical analysis identified that LTC temperature profiles are normally influenced by weather conditions, cooling bank status, and electrical load. However, abnormal sources of energy (losses) also affected the temperature profile. Four potentially effective methods for determining when one or more contacts were near end-of-life were identified.

1. LTC Oil Temperature

The simplest temperature-related diagnostic involves monitoring the temperature level. LTC temperature in excess of a certain level may be an indication of equipment trouble. Setting a high temperature level for triggering LTC maintenance is quite challenging because normal loading and ambient temperatures cause LTC oil temperatures to change significantly.

2. Differential Temperature

Monitoring the temperature difference between the main tank and LTC compartment when the tap changer is in a compartment separate from the main tank can indicate high resistance conduction paths in the LTC. Typically, the main tank temperature will be higher than the tap changer compartment temperature. Many factors influence differential temperature. Excessive losses caused by bad contacts or coking in the tap changer may be detectable. However, the LTC temperature can exceed main tank temperature periodically under normal conditions. Hourly variations in electrical load, weather conditions, and cooling bank activation can result in main tank temperatures below the tap changer.

Figure 5-1 shows a typical curve of the top oil temperature in the main tank and of the LTC compartment temperature for reactance-type tap changers. It should be noted that a signature needs to be developed for each type of tap changer. The top trace (black) is the main tank top oil temperature, and the bottom trace (gray) is the LTC compartment temperature.

Figure 5-1 LTC and Main Tank Temperature Profile

3. Differential Temperature with Trending

A method to distinguish between normal and abnormal differential temperature is to include load trends. If the LTC temperature is greater than that of the main tank when load is decreasing, this is deemed a normal condition. However, if the tap changer temperature exceeds the main tank temperature when load is increasing, this may indicate an LTC contact problem.

4. Temperature Index

Another method used to examine temperature differential involves computing the area between the two temperature curves over a rolling window of time. This quantity is called the temperature index and is usually expressed in degree-hours. Normal temperature difference (defined as the main tank level above the LTC) is counted as negative area, and the reverse is positive area. Therefore, over a period of time, the index reflects the general relationship between the two measurements without changing significantly due to normal daily variations in temperature. Under abnormal conditions, the index will exhibit an increasing trend because the LTC tends to operate at a higher temperature relative to the main tank. This method eliminates false alarms associated with simple differential monitoring but responds slowly to abnormal conditions.
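The last two diagnostics can be sketched in a few lines. This is an illustration under stated assumptions, not the utility's algorithm: the function names are hypothetical, readings are assumed to be evenly spaced, and the area is approximated with a simple rectangle rule.

```python
def abnormal_with_load_trend(ltc_temp, tank_temp, load_prev, load_now):
    """Method 3: LTC hotter than the main tank while load is increasing is suspect."""
    return ltc_temp > tank_temp and load_now > load_prev

def temperature_index(ltc_temps, tank_temps, hours_per_sample=1.0, window=None):
    """Method 4: signed area between the two curves, in degree-hours.

    Main tank above LTC (normal) contributes negative area;
    LTC above main tank contributes positive area.
    If window is given, only the most recent `window` samples are used.
    """
    if len(ltc_temps) != len(tank_temps):
        raise ValueError("temperature series must have the same length")
    if window is not None and window <= 0:
        raise ValueError("window must be a positive number of samples")
    diffs = [ltc - tank for ltc, tank in zip(ltc_temps, tank_temps)]
    if window is not None:
        diffs = diffs[-window:]
    return sum(diffs) * hours_per_sample

# Hypothetical hourly readings in degrees C.
tank = [60, 62, 65, 63]
ltc = [55, 58, 66, 60]
print(temperature_index(ltc, tank))             # -11.0
print(temperature_index(ltc, tank, window=2))   # -2.0
print(abnormal_with_load_trend(66, 65, load_prev=40.0, load_now=45.0))  # True
```

In this example the single hour where the LTC runs hotter does not make the index positive, which matches the text's point that the index is insensitive to normal daily variation.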

Using Readily Available Data


Temperature monitoring of transformer main tank oil through SCADA was typical for this utility. Monitoring of LTC oil temperature with SCADA was not typical but was easy and inexpensive to implement. Temperature data were archived in the SCADA data historian, which was accessible to the utility's maintenance management workstation (MMW). Development of an LTC temperature indexing algorithm was straightforward. The algorithm calculated the temperature index on a periodic basis and provided both reports and maintenance triggering.
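A periodic calculation with maintenance triggering might look like the following sketch. It assumes evenly spaced archived readings, a rolling window measured in samples, and a trigger based on the least-squares slope of the index series; the window length, the slope limit, and all function names are hypothetical and would need to be tuned per tap changer type.

```python
def rolling_index(ltc_temps, tank_temps, window, hours_per_sample=1.0):
    """Temperature index (degree-hours) for each full rolling window of samples."""
    diffs = [ltc - tank for ltc, tank in zip(ltc_temps, tank_temps)]
    return [
        sum(diffs[i - window:i]) * hours_per_sample
        for i in range(window, len(diffs) + 1)
    ]

def slope(values):
    """Least-squares slope of values against their sample position."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den

def maintenance_trigger(index_series, slope_limit):
    """Trigger when the index shows an increasing trend steeper than slope_limit."""
    return len(index_series) >= 2 and slope(index_series) > slope_limit

# Hypothetical hourly data: the LTC steadily closes in on the main tank temperature.
tank = [60, 60, 60, 60, 60, 60]
ltc = [55, 56, 57, 58, 59, 60]
series = rolling_index(ltc, tank, window=3)
print(series)                                         # [-12.0, -9.0, -6.0, -3.0]
print(maintenance_trigger(series, slope_limit=1.0))   # True
```

The index in this example is still negative, yet its rising trend is what the text identifies as the sign of an abnormal condition.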

LTC Failure Avoided


Implementation of the temperature index model on a 65-MVA LTC transformer had positive results. Within a short period of time, an incipient reversing switch failure was detected and averted. The potential impact of this was $1.2 million in avoided replacement expenditures. The nearly failed contacts are shown in Figure 5-2.


Figure 5-2 Failed Reversing Switch Identified by Temperature Index


6
THE ROLE OF DATA IN PFM
PFM is a consumer of data and a generator of information. Within the PFM construct, it is realized that proper use of data can not only improve equipment reliability but also improve the maintenance management process by delivering important information to numerous organizations throughout the utility.

Developing and implementing algorithms to identify asset degradation and incipient functional failures is an important approach to maintenance. Traditional approaches to PdM have required the installation of numerous sensors and expensive data collection systems. PFM embraces the use of readily available data from different data sources and can reduce the cost of implementing and the time required to develop predictive maintenance strategies. Typical data elements used by PFM include:

Design
- Original equipment manufacturer (OEM) design
- System design
- Fault duty

O&M
- Loading, voltage, and current
- Fault and switching operations
- Operating events and times
- Temperatures
- Run times
- Maintenance events
- Troubles and failures
- Outage times
- Part usage

Diagnostic
- Periodic off-line measurements and diagnostic tests
- On-line monitoring
- Visual observations
- Laboratory tests
- Forensic tests

Industry
- Trouble and failure experiences for similar equipment
- Generic failure models
- Availability statistics
- Reliability statistics

Economic
- Asset replacement costs
- Labor costs (CM)
- Labor costs (PM)
- Labor costs (CDM)
- Parts costs
- Failure effects

Multiple Uses of Data


Utility maintenance organizations are no different from many other businesses when it comes to the topic of data; they find themselves inundated with data but challenged as to how to transform them into real information. For maintenance managers, the challenge is twofold:

- How can existing data sources be leveraged to improve the current maintenance process?
- How can a maintenance organization transform its data into meaningful information that can be of value to other departments and organizations within the utility?

Data and data utilization are extremely important elements of PFM. The PFM project includes the building of models that integrate several utility data sources, domain expertise, and existing EPRI tools, resulting in improved maintenance approaches that target specific equipment families.

Consider today's microprocessor protective relays. These devices not only detect the presence of a fault, they also measure real-time currents and record sequences of events. This information is not only useful in evaluating the performance of the protective scheme but can also be used to evaluate:

- Breaker mechanism performance
- Breaker contact wear
- Transformer through-fault impacts


Where Data Are Applied in PFM


PFM leverages data whenever they are available. When they are not available, PFM suggests the initial use of proxies until the time that solid data are obtained. Typical areas where data are used to make decisions include:

- Understanding failure consequence
- Determining current failure probability
- Identifying and quantifying risk
- Building failure models
- Optimizing maintenance intervals
- Developing predictive models
- Setting norms
- Triggering maintenance


7
EFFECTIVELY USING DATA FOR RISK ANALYSIS
Introduction
Asset maintenance strategies based on performance require external drivers or circumstances that set the performance level. Maintenance plans will differ depending on the specific demands but will nearly always strive to achieve maximum performance given a set of performance requirements. Performance requirements have different characters, depending on the stakeholder being addressed. Shareholders can be interested in short-term capital growth, long-term capital growth, or both; industrial consumers may be interested in a variable relationship between availability and cost; and society (including captive customers and government) would generally be interested in continuous availability at the lowest possible price as well as a safe and clean environment for people.

High level maintenance decisions should not be driven by a ruling principle forced by privatization or regulation. Opinions and/or (contractual) requirements on the one hand and technical information regarding asset performance on the other should be the performance drivers. Information resulting from effective analysis of the proper data should lead to decisions with respect to maintenance frequency, refurbishment, replacement, and task content, all at the utility level.

The effective use of data, leading to the essential information, is dependent on the data collected, the collection process, and the analysis methodology. The quality of such an analysis is improved if it is based upon a larger data population rather than solely on limited data derived from the utility alone. Maximizing the data population can be realized either by making use of industrial data concerning failure analysis or by sharing condition/performance data regarding the asset (or group/type of assets) with other owners of the same asset type. The leading guideline in all analysis processes is very simple: do the right things right at the right moment.
Simply collecting mountains of data is not in itself valuable; the real trick is to focus on the essential data and convert them into useable management information. This section addresses the process of deciding how to identify the data that will be needed for various maintenance information activities and how to apply this information in a decision process. It shows the necessity of applying subjective data related to the position of the asset in the network against the more objective asset condition information. It concentrates on information that should support decisions regarding the handling of equipment in relation to the total relevant maintenance stakeholder environment. It emphasizes and describes methods that can be applied to generate the proper information/data in a robust collection process. In many situations, this means not only mining existing databases but also using special distilling processes such as Delphi methods, applied in discussions with experienced engineers and technicians.

Moreover, once it has been determined what information is relevant to the maintenance process, a consistent method of generating information is required. At the asset level, proper technical PFM activities such as failure modes, effects, and criticality analysis (FMECA) become the basis for determining what information is relevant. PFM analysis tells us that the best way to apply FMECA is based upon a functional division. This approach enables us to understand the mutual relationships between critical functions and the desired performance of each function.

Apart from deciding which data entities are of relevance for maintenance decisions, the number of data entities becomes relevant when performance databases are designed. Collecting and storing measured data in a consistent way offers possibilities for applying different analytical tools that optimize the generation of expert knowledge. If properly stored in a database of sufficient quality and structure, combinations of measured data can be analyzed per type, fleet, or location. Knowledge about performance can thus be transferred into the creation of new expert rules and applied to maintenance decisions, especially if the collected data are shared with others.

Data Drivers and Risk Management


As stated in the introduction, the main challenge of asset management is to make the right decision at the right moment. This series of correct actions implies a decision process at different levels, leading to the proper order/priority of the execution of activities. The decision as to whether or not to execute a maintenance action requires insight not only into the asset's performance but also into its short-, mid-, and long-term effects at the system level and the respective consequences to the stakeholders.

General utility management must realize that they severely challenge the abilities of the asset managers when they set qualitative and quantitative requirements for O&M. Requirements that address expectations regarding the quality of supply, environmental load, customer service, and employee satisfaction challenge the asset manager to develop plans that realize these goals while simultaneously achieving the lowest possible long-term cost of ownership. Many times these requirements force the asset manager to find an optimal balance between the often conflicting satisfaction factors of stakeholders.

The asset manager has to assign a (limited) budget in the correct order to a series of capital and O&M projects. The order of assignment should depend on the contribution the project/activity makes to closing the gap between the assumed actual (is) and the anticipated (should be) needs for all stakeholders. This balancing process is often referred to as risk management. Figure 7-1 depicts the challenges the asset manager is facing in this priority process.


Figure 7-1 Establishing Priorities
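One simple way to illustrate assigning a limited budget in order of contribution to the gap is a greedy ranking by gap closed per unit cost. This is a sketch under strong assumptions: the `gap_closed` score is a hypothetical single number standing in for the contribution across all stakeholders, the project data are invented, and a greedy ordering is not guaranteed to be the optimal allocation.

```python
def prioritize(projects, budget):
    """Fund projects in order of gap closed per unit cost until the budget runs out.

    Returns (names_of_funded_projects, remaining_budget).
    """
    ranked = sorted(projects, key=lambda p: p["gap_closed"] / p["cost"], reverse=True)
    funded, remaining = [], budget
    for project in ranked:
        if project["cost"] <= remaining:
            funded.append(project["name"])
            remaining -= project["cost"]
    return funded, remaining

# Hypothetical projects; cost in thousands of dollars, gap_closed in arbitrary score units.
projects = [
    {"name": "LTC temperature monitoring", "cost": 50, "gap_closed": 10},
    {"name": "Breaker overhaul", "cost": 200, "gap_closed": 20},
    {"name": "Cabinet repainting", "cost": 30, "gap_closed": 1},
]
funded, remaining = prioritize(projects, budget=260)
print(funded)      # ['LTC temperature monitoring', 'Breaker overhaul']
print(remaining)   # 10
```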

Risk management may be expressed, in the utility environment, as a function of the probability of the occurrence of supply failures. The translation from accepted risk levels into activities necessary to maintain the supply function through the upkeep of the asset performance is thus of extreme importance. Methods to measure the influence of asset performance on the system level are based upon historic information regarding the quality of supply. Methods to measure the quality and expected performance of the asset itself strongly depend on the information with respect to the specific asset type as integrated into the PFM failure modes and effects analysis (FMEA). Industrial data or, preferably, data for a complete asset fleet type can generate only generic performance information. Applying data-mining techniques to a combination of utility-specific and industry databases provides better insight into future asset behavior and becomes a better platform for making risk decisions. A properly designed data collection process will support not only the quality of the decision but also the application of a certified process.

Most asset management strategies attempt to achieve an optimum balance between investment return and stakeholders' values. Risk decisions are based upon two things: the judgment of the acceptable risk and the (assumed or predicted) performance of the asset string concerned. To be fully informed about the performance capabilities of the asset string, CBM tasks are applied intensively. A PFM technical analysis of the asset string (such as FMECA) creates a standardized and human-independent environment for the identification and execution of condition measurements and CDM.
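The two ingredients of a risk decision named above, the acceptable risk and the performance of the asset string, can be combined in a short sketch. It rests on assumptions that are not in the report: the assets in a string are treated as a series system with independent failures, risk is taken as probability multiplied by consequence, and all numbers are hypothetical.

```python
import math

def string_failure_probability(asset_probs):
    """Probability that at least one asset in a series string fails (independence assumed)."""
    return 1.0 - math.prod(1.0 - p for p in asset_probs)

def risk_decision(asset_probs, consequence, acceptable_risk):
    """Return (risk, exceeds_acceptable) with risk = probability * consequence."""
    risk = string_failure_probability(asset_probs) * consequence
    return risk, risk > acceptable_risk

# Hypothetical annual failure probabilities for a breaker, a transformer, and a relay.
probs = [0.01, 0.02, 0.005]
risk, exceeds = risk_decision(probs, consequence=1_000_000, acceptable_risk=30_000)
print(round(string_failure_probability(probs), 6))  # 0.034651
print(round(risk, 2), exceeds)                      # 34651.0 True
```

When the computed risk exceeds the acceptable level, the options discussed in this section apply: additional condition-based tasks, refurbishment, or replacement of the weakest asset in the string.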


The drivers that stimulate the asset manager to show maximum and continuous effort are:
• The requirements set by utility executives
• The challenge and possibility to be creative in the translation from data to information
• The desire to develop a solid information basis for making optimal decisions

Risk Decision Process


The secret of a well-designed and applied risk management process lies in finding the optimal balance of value-added service versus risk taken between shareholders, regulators, employees, consumers, and society. Risk decisions are made by defining alternatives in terms of financial, quality, and environmental impacts. Based upon the risk analysis, control measures, and execution of actions, stepwise actualization of improvements occurs in a more or less continuous process. Risk management implies that decisions are based not only upon realizing the agreed-upon reliability targets or the profit promised but also upon the desires of other stakeholders. All decision aspects are based on economic needs, technical requirements, environmental impacts, and customer and employee values. The situation can be regarded as a gearbox of interlocking cogwheels, as given in Figure 7-2. The cogwheels represent the same functions in both diagrams and show the mutual influence of each other's position. Turning the maintenance activity wheel will influence all other positions, and a careful balance must be found.

Figure 7-2 Gearbox Approaches to Balancing Stakeholder Needs

The issue of risk management is a constant and complex theme for the asset manager. One way to perform risk assessments is to separate observations and studies into two separate activities, one for a system and one for components, using reliability management as the linking process. This approach eases the handling of data and distinguishes between the system risk and asset failure mode while guaranteeing the interaction between supply/demand and decisions regarding the execution of maintenance activities.

The asset manager basically applies a step-by-step decision process as shown in Figure 7-3. The decision process again is separated into a risk (network) and a condition (component) orientation with reliability management as the link between both. A corporate level is added, addressing the risks involved in items like safety, corporate image, and environmental damage. Based on the information available (external, financial, and asset information sources), different impact assessments are executed.

Regarding the first two levels, asset and system, the proper information is collected in a process as shown in Figure 7-3 (right). This process shows the steps taken from data input up to the decision-making level. It strongly emphasizes the necessity of a data warehouse approach. The effectiveness of making use of data is strongly influenced by approaches as shown in Figure 7-3 (left).

Reliability management aims to identify the processes of degradation while describing and quantifying their effects on the availability of a system. It opts to combine the (expected) condition of the system elements with financial consequences, such as the investments needed to realize the required availability. Weak results may lead to a change in the design of the system, the replacement of a component, or the application of another maintenance strategy. Taking into account societal information, one reaches the decision at the corporate level, and the risk management process is completed. Items such as utility image, environmental impacts, and personnel safety (as well as failure consequences at the societal/strategic level) are taken into account by this model at the corporate level.

Figure 7-3 Risk Decision Process


The described impact assessment approach necessitates three elementary types of information:
• Feedback from the corporate level:
  - Analysis reports of effects from realized investment and upkeep activities on the corporate strategic level as related to stakeholder satisfaction
  - Reports on environmental consequences
  - Possible safety hazards for personnel and the environment
• Feedback from the system level:
  - Information on interruptions
  - Quality of the supply to customers
  - Energy not supplied
• Feedback from the component level:
  - Information on equipment failures and measured conditions
  - Maintenance activity
  - Operational activity

Risk Assessment

Based upon the theoretic model of the risk decision process described in Figure 7-3, the risk assessment approach is more practically visualized in its pre-design schema as shown in Figure 7-4 and in Figures 7-5 and 7-6. Simply said, in an asset life-cycle management process, the asset manager's main objective is to find out which assets need to be maintained, which need to be replaced, and which can be ignored for a while. Then, given that information, a decision must be made about the order in which the chosen actions should be executed to obtain optimal system, societal, and financial results. Roughly translated, the asset manager has to keep risk at an accepted level. The asset manager has to determine which assets have had too much money invested in them (so that a decrease would be acceptable) and which assets are at high risk and should have money spent on them. With respect to Figure 7-4, more money should be spent on assets positioned in the top right corner of the figure, and less spending is required on assets positioned in the bottom left corner. After making this investment decision, the asset manager can determine which actions are reasonable and appropriate: refurbish, renovate, replace, or redesign.


Figure 7-4 Asset Grouping by Risk and Performance Expectations

The action-type decisions are based upon analyzing the different solution scenarios and determining their specific economic impact and the value added to other stakeholders. Table 7-1 exemplifies the use of a tool for evaluating several different action scenarios.
Table 7-1 Scenario Approach: Scenario Process Decision Evaluation Matrix with Stakeholders' Preferences

Scenario Process: 1, 1, 2, 2
Alternative: A, B, C, B
Flexibility Score: ++, +
Safety Score: +, +, ++, +
Reliability %: 99.84, 96.70, 99.00, 96.00
Durability Score: +, +
NPV/EVA $: X, Y, Z, W

In this example, only one hard data element is present: the costs related to the scenario involved, expressed as net present value (NPV) or economic value added (EVA). The EVA approach not only takes into account the life-cycle costs expressed in present money (that is, NPV) but also evaluates the consequences of borrowing money (weighted average cost of supplied capital), corporate revenues and costs, and long-term consequences of changes in corporate rating. Normally, the cheapest solution has the preference of the shareholders, although flexibility scores can also influence their desires. The flexibility scores express the possibility of postponing the actual decision to a later period or allowing for additional options on future decisions. The other stakeholder preferences are covered by quality elements: safety (for both personnel and the environment), reliability/availability (the situation with respect to redundancy and asset performance), and durability (mainly environmental). Presenting and assessing alternatives in this way makes the decision-making process more objective, although it remains partly subjective.
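The NPV element of such a scenario comparison can be illustrated with a minimal Python sketch. The cash flows, the discount rate, and the scenario labels below are hypothetical values chosen for illustration only; they are not taken from Table 7-1, and the EVA extensions (cost of capital, corporate revenues, rating effects) are deliberately left out.

```python
def npv(rate, cash_flows):
    """Net present value of yearly cash flows; cash_flows[0] occurs now (year 0)."""
    return sum(cf / (1.0 + rate) ** year for year, cf in enumerate(cash_flows))

# HYPOTHETICAL yearly cash flows in dollars (negative = spending).
scenarios = {
    "A (refurbish)": [-400_000, -60_000, -60_000, -60_000],
    "B (replace)": [-900_000, -10_000, -10_000, -10_000],
}
discount_rate = 0.08  # ASSUMED discount rate, not from the report

results = {name: npv(discount_rate, flows) for name, flows in scenarios.items()}

# With cost-only cash flows, the least negative NPV is the cheapest scenario.
cheapest = max(results, key=results.get)

for name, value in results.items():
    print(f"{name}: NPV = {value:,.0f}")
print("Cheapest:", cheapest)
```

In a full evaluation, the NPV figure would be only one column next to the flexibility, safety, reliability, and durability scores.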


An Effective Assessment Model at the Network/System and Asset/Component Levels

In present utility settings, there are numerous data entities stored in databases. On many occasions, it is recognized that decision-supporting data were available but not known at the moment of decision. This situation also implies that many data entities are collected and stored for no reason at all. It is the intention of this section to give guidance on the process of deciding which data are relevant and important to collect and store in databases. To meet this objective, the approach depicted in Figure 7-4 is described in more detail in Figure 7-5 and Figure 7-6.

Besides further detailing the risk-performance model, the separation of the system/network level and the asset level is introduced. This separation splits the discussion between the more theoretical system/network circumstances and the more practically oriented asset needs. It is on the corporate/system level that the influence of the different stakeholders becomes relevant and the first step of deciding which information is relevant is taken. Stakeholders are not interested in whether an asset operates well; they are interested only in whether the energy supply delivers what was promised: safe and environmentally acceptable energy, profit, and costs.

Figure 7-5 shows a nine-field matrix where the classification of risk depends on the relation between consequences and probabilities of occurrence. The latter is roughly derived from the susceptibility to failures of a circuit, asset, or subsystem. The consequence axis aims to take the different stakeholder categories into account. The approach forms the basis for a number of (commercially available) tools commonly used by some utilities. For reasons of understanding and learning, this section describes the basic approach (including an example). Once it is understood, an internally developed, spreadsheet-based software support tool will serve most engineers and analysts well. Of course, one can make a more detailed matrix (containing 16, 25, or even 36 fields) in order to further detail the decisions to be made.


Figure 7-5 Risk-Based Failure Consequence and Probability Matrix

In the approach, one can recognize the following:
• On the X-axis, the different consequences for the different stakeholder areas are introduced. Each (stakeholder) consequence class (low, medium, high) has its own measurable level per stakeholder category. The asset manager decides under which circumstances the different consequences lead to a certain consequence class. A situation with physical long-term effects is easily judged; a combination of medium-classed medical treatment and negative PR is more difficult and will depend on utility objectives (or history).
• On the Y-axis, the concept of circuit, network, or asset susceptibility is introduced. Susceptibility classification is almost an intuitive process, taking into account the average asset age, the level of redundancy, the influence of polluted/salted air, aggressive soil, distance to sea, and the influence of sand or ice.

By ranking susceptibility and consequences for (all) stakeholders, including penalties and constraint costs, a non-critical, critical, or vital situation is judged. Although more rankings are possible, experience shows that a three-level/nine-field approach is already quite complex and results in sufficient sensitivity. With respect to effectively making use of data, the figure forms the basis for the choice of which data entities are relevant to measure at this level. These entities are not limited to this example; in many situations, an intense discussion between experts (the so-called Delphi method) will give access to the relevant items to collect.

The next step is the transformation of this knowledge into the actions to execute on the assets that are part of the circuit or system part. This is shown in the matrix given in Figure 7-6. The judgment regarding the failure consequence class is placed on the risk axis (Y) of the asset-directed activity matrix. Accordingly, this risk (failure consequence class) is weighted against the condition or performance of the asset. The resulting matrix field gives insight into which decision should be taken regarding maintenance and/or reinvestment priorities. The two matrices are thus linked via the consequence judgment: vital, critical, or non-critical.

The risk analysis matrix in Figure 7-5 is related to the activity matrix of Figure 7-6 by stakeholder consequence management. The basic idea is to translate the network failure probabilities and consequences for stakeholders into decisions at the component (asset) level. The matrix in Figure 7-6 also provides information about the data needed at the asset level: a qualitative indicator giving information about the condition of the assets belonging to a circuit or subsystem. Such an analysis is normally based upon the results of PFM technical activities. In ideal situations, a data mining process using all relevant asset information, preferably set against a larger population of measured data from identical or comparable asset types, improves this judgment.
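The nine-field classification lends itself to the kind of simple lookup tool mentioned earlier. The following Python sketch shows one possible form. The assignment of non-critical, critical, and vital to the nine fields, the rule that the worst stakeholder consequence determines the overall consequence class, and the stakeholder category names are all assumptions made for illustration; the actual field assignment is the one in Figure 7-5 or the one chosen by the asset manager.

```python
# Ordered classes used on both axes of the assumed nine-field matrix.
LEVELS = ("low", "medium", "high")

# ASSUMPTION: illustrative field assignment, not copied from Figure 7-5.
# Outer key = susceptibility (Y-axis), inner key = consequence class (X-axis).
RISK_MATRIX = {
    "low":    {"low": "non-critical", "medium": "non-critical", "high": "critical"},
    "medium": {"low": "non-critical", "medium": "critical",     "high": "vital"},
    "high":   {"low": "critical",     "medium": "vital",        "high": "vital"},
}

def worst_consequence(consequences):
    """ASSUMPTION: the overall class is the worst class over all stakeholder categories."""
    return max(consequences.values(), key=LEVELS.index)

def failure_consequence_class(susceptibility, consequences):
    """Look up non-critical / critical / vital for one circuit, asset, or subsystem."""
    return RISK_MATRIX[susceptibility][worst_consequence(consequences)]

# Hypothetical circuit with hypothetical stakeholder judgments.
circuit = {
    "health and safety": "medium",
    "societal damage and image": "high",
    "environment": "low",
}
print(failure_consequence_class("high", circuit))  # vital
```

A utility that weighs stakeholder categories differently could replace `worst_consequence` with a weighted rule without changing the lookup itself.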

Figure 7-6 Asset-Directed Activity Matrix

The condition class given in Figure 7-6 has two components. One component is given by the PFM-based maintenance and condition measurement/assessment and is referred to as a quality or asset health rating of 1, 3, or 5 (or whatever is practiced). The other component expresses the failure history of the asset (corrective maintenance). If less than one corrective measure per year was necessary, the asset is judged good; less than three per year is medium class; and more than three per year is classified as bad. Of course, the setting of these figures depends on the opinion of the organization.

By deciding to which failure consequence class a circuit, asset, or system belongs, one can determine which action must be executed to bring the asset to the necessary performance level. If the quality level is too high compared to the failure consequence class, one can decide to decrease investments in maintenance. On the other hand, if the performance is too low given the failure consequences, a decision to increase maintenance or even to refurbish or replace the asset might be logical. Alternatively, one can decide to intervene in the susceptibility class of Figure 7-5 and lower the probability (risk of occurrence) by investing in redundancy or in reducing asset age.
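The thresholds and the direction of the decision described above can be sketched in a few lines of Python. Only the failure-history thresholds come from the text. The meaning of the health ratings (1 = good, 5 = bad), the rule that the worse of the two components sets the condition class, the required condition per failure consequence class, and the treatment of exactly three corrective measures per year are assumptions for illustration; the real assignment is the one in Figure 7-6.

```python
CONDITION_RANK = {"bad": 0, "medium": 1, "good": 2}

# ASSUMPTION: rating scale direction (the report only says "1, 3, or 5").
HEALTH_TO_CLASS = {1: "good", 3: "medium", 5: "bad"}

# ASSUMPTION: condition each failure consequence class is expected to have.
REQUIRED_CONDITION = {"non-critical": "bad", "critical": "medium", "vital": "good"}

def failure_history_class(corrective_per_year):
    """Thresholds from the text; exactly 3 per year is treated as bad (assumption)."""
    if corrective_per_year < 1:
        return "good"
    if corrective_per_year < 3:
        return "medium"
    return "bad"

def condition_class(health_rating, corrective_per_year):
    """ASSUMPTION: the worse of the two components determines the class."""
    candidates = (HEALTH_TO_CLASS[health_rating],
                  failure_history_class(corrective_per_year))
    return min(candidates, key=CONDITION_RANK.get)

def recommended_action(consequence_class, condition):
    """Compare actual condition with the condition required for the class."""
    gap = CONDITION_RANK[condition] - CONDITION_RANK[REQUIRED_CONDITION[consequence_class]]
    if gap > 0:
        return "decrease maintenance"
    if gap < 0:
        return "increase maintenance, refurbish, or replace"
    return "keep current maintenance level"

print(recommended_action("vital", condition_class(3, 2)))
```

Because every threshold and mapping sits in a small table at the top, an organization can adjust the figures to its own opinion without touching the decision logic.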

Example Using the Assessment Model


To better explain the previous approaches, an example from a utility practice is presented. The following sections describe the example, using the assessment model:
• Pre-service information
• Information regarding service life
• Analysis of the situation
• Decision based upon analysis

Pre-Service Information

In the early 1980s, a 12-bay, double-busbar, gas-insulated substation (GIS) switchyard of the latest generation was placed into service. The 12 bays were divided over two (8/4) busbar sections, according to the single-line diagram in Figure 7-7. The busbar sections were separated by longitudinal sectionalizers, and only the 8-bay section was equipped with a coupling bay. The switchyard was connected through circuit breakers and a separate busbar system to another, older GIS. Although the system was operated at 150 kV, the design was based upon 300 kV (at higher gas pressure).


Figure 7-7 Single-Line Diagram and Gas Sectionalizing Compartments (Inlay)

The installation was connected to one generator, three 50-kV power transformers, three connections that incorporated the substation into the 150-kV grid, and two cable connections that fed a large industrial facility. The inlay of the single-line diagram shows the complex gas sectionalizing arrangement. During installation and acceptance testing, some high levels of partial discharges (PDs) were detected, and some voltage transformers produced excessive noise. These problems were corrected by the supplier, and follow-up testing was performed to the utility's satisfaction.


Information Regarding Service Life

Routine PD tests were performed periodically through a special connection and a special GIS testing transformer. During these testing periods, it proved very difficult to take part of the system out of service. This was caused by the arrangement of the different feeders/grid connections, transformers, and industry connections over the busbars. Another cause was the safety requirement that it was not acceptable to have rated power on one side of an isolating point and a test voltage on the other (which is the standard situation using single-gap isolators).

The routine tests did reveal a relatively high level of PDs. In the first few years of service life, these PDs were found in the circuit breakers and voltage transformers; in the later period, PDs were also found in the busbar systems. Further investigations indicated that there were probably two causes for this:
• The mounting construction was designed incorrectly and had to be adapted during erection.
• The cleaning of the internal parts was not carried out to the normal standard.

The first cause resulted in high mechanical stresses on the circuit breakers, voltage transformers, and current transformers during switching operations, causing loose particles to move and insulators to be damaged. The second cause (probably based upon the idea that cleaning was not so critical because of the higher rated design voltage) resulted in the presence of many particles in the system. In practice, a number of internal breakdowns occurred over the first 10 years of operation. Damage to gas-tight insulators could, in some cases, result in a personnel hazard due to explosion after opening a compartment for repair. The gas compartmentalization, resulting from logical requirements, made it impossible to open a single compartment if opposite parts were in service or if neighboring compartments were at rated pressure. As a result of this design, small repairs required a significant bus outage.

Analysis of the Situation

Modeling susceptibility for this situation using the upper part of the risk-based failure consequence matrix shown in Figure 7-5, one initially concludes that there was a reasonable level of redundancy (double busbar, two busbar sections, connection to another switchyard, 50-kV supporting grid, and higher rated design voltage). In practice, however, this appeared not to be true because of the distribution of the different incoming and outgoing bays over both busbar sections. Moreover, the configuration (generation plant, heavy industry) did not allow maintenance staff to freely choose the periods for inspection and for performing diagnostic and condition monitoring tasks. Although the heavy industry and sea coast were nearby, pollution was not an issue (the GIS was mounted in a building), nor was the age of the installation. Apart from the busbar configuration, the installation of the system was critical with many (potential) fault situations, and thus the susceptibility was rated as high.


Applying the lower part of the risk-based failure consequence matrix found in Figure 7-5 to identify risk from a stakeholder point of view, one can argue from a technical point of view that the mechanical forces experienced during operation considerably accelerated the aging process, as evidenced by:
• Cracks in insulators
• Loosening of bolts in voltage transformers
• Loosening of bolts in parts of circuit breaker constructions
• Short circuits in primary circuits of current transformers
• Internal breakdown from loose free particles

This situation led to, at minimum, the judgment "medium aged" after only 10 years of service. There was also an incident where experienced personnel were almost injured during the opening of the system when an insulator disk broke. This created the possibility of an internal breakdown during routine operation, and thus the health and safety index was set to medium (with a strong tendency to set it to high). Societal damage and image impacts were rated as severe because the loss of production in the heavy industry had significant regional economic impacts and would definitely tarnish the utility's image. Finally, the switchyard was very clearly situated in a heavy industry area, which is also the highest category. Including all the specific consequences for stakeholders resulted in a failure consequence class rating somewhere between critical and vital, as shown in the left matrix of Figure 7-8, a rather harsh rating for such a young installation.

Figure 7-8 GIS Example, Transforming Information into Consequence and Activity Matrices

Transposing this same information to the asset-directed activity matrix found in Figure 7-6, it is now essential to have a proper assessment with respect to the expected performance of the switchgear. One of the measurable elements is the number of forced outages per year that are caused by the system and followed by corrective maintenance. In the period of analysis, this was between one and three failures per year, sometimes minor, sometimes major. Although no complete PFM technical analysis was applied, standard maintenance activities such as functional testing (for protection and control), PD measurement, and breaker opening and closing velocity measurements were applied regularly. This information revealed that, apart from the critical situation with respect to PDs, large time delays for circuit breaker pole openings were also routinely observed. Theoretically, this could have caused overvoltages at the terminals of the connected 150-kV transformers and could have led to more extensive damage. The general conclusion (see the right matrix in Figure 7-8) was that the performance of the asset was poor and, even worse, could not be improved because of limitations on taking it out of service for refurbishment.

Note: Where Figure 7-8 gives a more or less qualitative approach, a quantitative analysis based upon the same example will be described later.

Decision Based Upon Analysis

As shown in the analysis, the utility clearly had a situation that needed to be rectified. A station with a very critical/vital failure consequence classification had a very low performance rating. Based upon a discussion like the one shown in the decision evaluation matrix of Table 7-1 and given specific situations with respect to expansion as well as decreasing prices of new switchgear, a dramatic decision was made to replace the switchyard completely with a new installation (after only 20 years of service). The lessons of the previous situation were taken into account. Sectionalizing was optimized over feeders and large customers, the mounting design was scrutinized and improved (same building), and enhancements were made to facilitate the testing and maintenance of the bus during service.

Technical PFM Interaction


Given the approach described in Figure 7-6, performing a PFM FMECA becomes necessary. Properly applied, this analysis provides information on the current and expected performance of an item. A specific FMECA approach, based upon functional analysis, may help set the proper actions for the improvement of performance. This approach is shown in Figure 7-9, where switchgear is analyzed. This approach can also be applied to all other asset types and is thoroughly described in a previous section of this report. The type of FMECA presented in Figure 7-9 is not based upon collecting all possible data but is focused on the potential hazards or defects that can occur. It is mainly based upon answering questions such as:
• What are the main functions of the asset?
• How can the internal functional relationships be described?


Data needs are thus limited compared to the more detailed, parts-oriented PFM technical analysis.

Figure 7-9 PFM Technical Analysis for Switchgear

The approach forces one to judge the specific outputs that interact with the different functions, to validate them, and to describe the risk of occurrence and the effect of failure on the system as a whole. The end result is a set of condition measurements and preventive actions that describe the current quality of the asset and its expected performance. The PFM technical analysis is an important part of risk analysis.

A simple example of an FMECA is provided to strengthen these relationships and concepts. The FMECA approach is an inductive method of performing a qualitative asset reliability or safety analysis. It addresses the failure behavior and criticality of a specific stand-alone asset (one that is not related to a system). The FMECA study also forms the necessary basis for deciding which measurements can be of use in a CBM process. The FMECA process determines the critical parts of an asset. Logically, each type of switchgear will have its own FMECA, based upon the functional division, as shown in Figure 7-7. Addressed subsystems for this example include:
• Mechanical
• Dielectric primary
• Control secondary
• Stored energy system


Each subsystem has its own unique primary function and contains all components that provide this function. For example, the dielectric subsystem contains all components, such as oil, epoxy, SF6 gas, and insulating material, that insulate the high-voltage (HV) parts of the switchgear from ground. The inventory of components and failure information is achieved by brainstorming sessions with technicians, by analyzing event reports and archives, by visiting maintenance activities, and through discussions with manufacturers and (preferably) other users.

All components and their failure behavior are weighted by the four criteria described hereafter. All criteria are defined from a technical and a more practical point of view. As the risk element is already covered (or will be) by the described approach, the FMECA does not consider this. Consequences at the stakeholder level are thus not taken into account, which significantly eases the analysis. The score for each component is achieved by using the Delphi method in brainstorming sessions and is used to estimate the necessity to address the possible hazard by maintenance or design action, either preventive or predictive. Of course, these actions give, at the same time, a good impression of the asset quality/expected performance.

The abbreviations and possible values that are used in Table 7-2 are as follows:

Failure frequency (F)
• 0 = No failure has occurred yet
• 1 = Failure incidentally occurs (< once a year)
• 2 = Failure frequently occurs (> once a year)

Impact of failure on energy delivery (I)
• 1 = No impact
• 2 = Has impact only on the circuit that is directly connected to the switchgear
• 3 = Has impact on more than just the circuit that is connected to the switchgear

Corrective costs (C)
• 1 = None
• 2 = < $5,000
• 3 = > $5,000

Environmental impact (E)
• 1 = Technical impact only
• 2 = Impact on personnel and environmental safety

The final score for each component is achieved by multiplying the scores of the four criteria. The weight factors depend on a company's philosophy and are changeable. For example, in this case, the safety impact of a failure weighs twice as much as the technical impact but could be higher if a utility has safety as a key utility objective. Another example is the failure of a protective relay, which has no impact on safety but has consequences for the circuit that is connected to the switchgear.
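The multiplication of the four criteria can be written as a short Python sketch. The function name and the range checks are illustrative assumptions; the allowed values are those listed for F, I, C, and E, and the example scores 2, 3, 2, 1 are the ones that appear in Table 7-2.

```python
# Allowed scores per criterion, as listed in the text.
ALLOWED = {
    "F": {0, 1, 2},  # failure frequency
    "I": {1, 2, 3},  # impact of failure on energy delivery
    "C": {1, 2, 3},  # corrective costs
    "E": {1, 2},     # environmental impact
}

def weight_factor(F, I, C, E):
    """Return W = F x I x C x E after checking each score against its scale."""
    scores = {"F": F, "I": I, "C": C, "E": E}
    for name, value in scores.items():
        if value not in ALLOWED[name]:
            raise ValueError(f"{name}={value} is outside the allowed scale {sorted(ALLOWED[name])}")
    return F * I * C * E

print(weight_factor(2, 3, 2, 1))  # 12
```

With these scales, W ranges from 0 to 36, and a component that has never failed (F = 0) always scores 0 regardless of the other criteria. A utility that wants safety to weigh more than twice the technical impact would raise the upper value on the E scale.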


Finally, a classification should be determined according to the utility's ability to detect a deviation, to measure the condition, or to replace the part causing the problem. A component should be replaced on a regular basis if its failure mode is not detectable or not worthwhile to measure. With respect to the detectable and diagnosable critical components, an optimized inspection and diagnostic program can be determined, including the specific inspection points and measurement types.

Table 7-2 shows an example of an FMECA for a switchgear control (secondary) subsystem. The main function of the control subsystem is to protect and operate the switchgear at the right moment. Components of this subsystem can be defined as necessary parts that contribute to this main function. The failure description of the component describes each situation that deviates from the component's healthy situation. In this example, three deviated situations are described for auxiliary contacts as part of the switchgear control subsystem: oxidation, burned contacts, and loose wiring. These deviated situations are defined as touchable and visible.

The root cause must have a causal relationship with the deviated situation. For example, low-contact pressure for the auxiliary contacts is not a deviated situation until it damages the contacts. Only if low-contact pressure results in a burned contact is there a causal relationship between low-contact pressure and the burned contacts. In this case, the low-contact pressure is defined as the real cause of the burned contacts as a deviated situation.

The effect of the deviated situation has to be defined from the point of view of the function of the component within the subsystem. In this example, moisture causes the oxidation of the contacts, which might cause a malfunction or even inoperability of the switchgear. Another example is illustrated by the failure of the trip coil. The trip coil is used for translating the control command into the operation of the mechanical drive mechanism. A badly moving plunger in the trip coil is a well-known failure caused by oxidation of the plunger. As such, the plunger is not the deviated situation, but the oxidation caused by a high moisture level is considered the deviated situation. The effect is deviated or delayed switchgear operation.


Table 7-2 FMECA of the Switchgear Secondary (Control) System

FMECA for Secondary Subsystem Switchgear Type

Component | Deviated Situation | Cause | Effect | Weight Factor F x I x C x E = W | Measurement/Maintenance | Measurable Quantity
Auxiliary contacts | Oxidated contacts | Moisture | Refused command | 2 x 3 x 2 x 1 = 12 | Visual inspection; trip coil measurement | Coil time; oxidation level; wiring connection
Auxiliary contacts | Burned contacts | Low-contact pressure | Refused command | | |
Auxiliary contacts | Loose wiring | Vibration | Refused command | | |
Trip coil | Burned coil | Deviated powering | Refused switchgear operation | 12 | Visual inspection; trip coil measurement | Coil surface; coil time and curve
Trip coil | Oxidated plunger | Moisture | Deviated switchgear operation | | |
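The way FMECA rows feed the choice between periodic replacement and an inspection/diagnostic task can be sketched as follows in Python. The record layout, the `detectable` flag, the pairing of each row with a single activity, the weights per row, and the third row (a made-up component) are assumptions for illustration; only the component names, deviated situations, causes, and effects of the first two rows are taken from Table 7-2.

```python
from dataclasses import dataclass

@dataclass
class FmecaRow:
    component: str
    deviated_situation: str
    cause: str
    effect: str
    weight: int        # W = F x I x C x E
    detectable: bool   # ASSUMED flag: can the deviation be detected or measured?
    activity: str      # inspection or measurement used when detectable

rows = [
    FmecaRow("Auxiliary contacts", "Oxidated contacts", "Moisture",
             "Refused command", 12, True, "Visual inspection"),
    FmecaRow("Trip coil", "Oxidated plunger", "Moisture",
             "Deviated switchgear operation", 12, True, "Trip coil measurement"),
    # HYPOTHETICAL row, not in Table 7-2: a failure mode that cannot be detected.
    FmecaRow("Hypothetical sealed relay", "Internal contact wear", "Aging",
             "Refused command", 6, False, ""),
]

def maintenance_plan(fmeca_rows):
    """Highest weight first; non-detectable failure modes get periodic replacement."""
    plan = []
    for row in sorted(fmeca_rows, key=lambda r: r.weight, reverse=True):
        task = row.activity if row.detectable else "Replace on a regular basis"
        plan.append((row.component, row.deviated_situation, row.weight, task))
    return plan

plan = maintenance_plan(rows)
for component, situation, weight, task in plan:
    print(f"W={weight:>2}  {component}: {situation} -> {task}")
```

The sketch leaves out task frequency, which depends on the aging process of the component and the accepted level of risk.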

Within this framework of conceptual and practical thinking, all components must be analyzed according to the subsystem division. The weight factors are determined for all components as described. Depending on the score on each of the four criteria, different failures on the same component can have different final weight factors. For example, a deviated situation of auxiliary contacts has low impact on environmental safety but might cause a trip failure when a damaged power cable has to be isolated. In this situation, the damaged power cable will be switched off by the backup breakers, and an unnecessarily large area is switched off because of the unresponsive switching command. For this reason, a deviated situation of the auxiliary contact has impact on more than just the control circuit that is connected to the switchgear.

Based on the classification of the components and their associated weight factors, the necessary maintenance activities can be determined. The deviated situation of auxiliary contacts can be detected by visual inspections, functional tests, timing measurements, or the trip coil resistance measurement. From a technical point of view, the information entered into the measurement/maintenance and measurable quantity cells describes the total maintenance and measurement activities for the specific type of switchgear, except for their frequency. The frequency for applying these maintenance or measurement activities depends on the component's aging process and the accepted level of risk.

In Table 7-3, the main function of the circuit breaker driving subsystem is to store energy for operating the main contacts and to actuate the circuit breaker's main contacts with the right acceleration and velocity. The functional demand on the circuit breaker drive system is to supply the right amount of stored energy to achieve the right switching curve.

Effectively Using Data for Risk Analysis

Table 7-3 Part of the FMECA Drive System Circuit Breaker

Driving Subsystem Circuit Breaker (CB) Type

Component | Deviated Situation | Cause | Effect | Weight Factor (F x I x C x E = W) | Measurement/Maintenance | Measurable Quantity
Closing springs | Decreased spring energy | Springs overwound for a long period | Slow closing or failure to close | | CB timing; minimum energy check; visual inspection | Speed and contact time; CB operates in minimum energy position; distance between spring housing and spring deviated compared to other CB/OK/wrong distance but repaired
Motor | Burnout | Overheating of windings | Closing springs are unwound; motor failed to operate | | Functional test | Motor operates/does not operate/does not operate and replaced
Motor | Broken brush caps | Aging or overheating of holder | Closing springs are unwound; motor failed to operate | | Visual inspection | Broken/unbroken but replaced
Gearing | Broken tooth | Jumped mechanism; loose mounting of the motor | No winding of the spring; strip the gear | | Visual inspection | Broken tooth/OK/broken tooth and replaced
Spring winding gear/worm gear | Incorrectly greased | Grease supplied to the spring winding gear | Disruption of self lubricating system | | Maintenance | Remove grease
Spring housing | Loose bolts | Aging/vibration | Failure to wind the springs | | Visual inspection | Tight/loose but tightened
Close latch cam | Stiff roller | Drying of grease | Failure to close | | Functional test | Free/stiff but freed
Closing latch block | Stiff latch block rollers | Drying of grease, incorrect use of grease | Failure to close | | Functional test | Free/stiff but freed

As will be shown later, the PFM technical analysis results are of major importance for collecting condition/performance asset data in a way that keeps human interference to a minimum. This avoidance of individual and subjective opinions is necessary to make optimal analyses of situations. Finally, and just to show that the approach is not limited to switchgear or circuit breakers, a model for transformers is shown in Figure 7-10.

Figure 7-10 PFM/FMECA for a Transformer

Practical Application of the Risk Assessment Approach


The approach, as described in the previous section, should be applied within asset management departments. Applying this step-by-step process is very educational for understanding the difficulties met while implementing risk-based asset management. The described approach stimulates the constant awareness that a sustainable asset management process considers technical and economic as well as sociological and ecological/environmental aspects. A balance must be found between all stakeholder values and requirements. The achievement of effective asset management and a good service level is nothing more or less than finding a proper balance in the satisfaction of all, often conflicting, stakeholder needs. Unfortunately, senior utility management often faces a lack of well-prepared information, skills, and decision-supporting tools. In some instances, this may lead to polarized strategies, which fail to satisfy the previously mentioned objectives of sustainability and overall stakeholder satisfaction. Corrective measures (that is, from regulatory or other bodies) that the asset manager would prefer to avoid then become necessary. The information in this section is aimed at dealing with the challenge of how to change data into information that can support the decisions relevant to the asset manager.

The Decision Process Considering Various Scenarios


A structured approach that converts data into asset decision information, following the three decision levels of Figure 7-3, is given in Figure 7-11. A step-by-step approach is described, starting with information at the asset level, followed by the influence at the system/network level and the consequences at the corporate level.

Figure 7-11 Information Flow and Processing

Based upon this detailed approach, multiple scenarios can be found to influence the performance at all three levels: increasing or decreasing maintenance and inspection intervals, replacement or refurbishment, and changing the maintenance strategy. Also, a combination of technical information with relevant economic data and future system performance will result in quantified benefits. Benefits are not exclusively expressed in economic terms, but also in terms of reliability. On the corporate level, balancing the cost and benefits of the scenarios with the risk involved with each scenario will result in the strategic decision that has the best-managed risk. On this level, all stakeholders' expectations can be taken into account when they are formulated as business values, resulting in a set of performance indicators that give expression to those expectations. The societal and economic information, together with the reliability of the equipment inventory and the system performance, form the ingredients of the risks involved. Balancing the risks will lead to the final decision.

The asset management decision process that is supported by this information flow model consists of three assessment types, which, if applied correctly and completely, support decisions while taking account of the system-oriented risk and the asset-based activity.

Assessment Steps

Based upon previous statements and having developed an approach for network/asset risk assessment (see the matrices in Figures 7-5 and 7-6), a controllable assessment process has to be started. This process is based upon dedicated assessments that will ultimately lead to a proper decision. Each assessment step is defined as follows.

Susceptibility Assessment

A susceptibility assessment focuses on topics such as:
- Distance to sea
- Polluted air
- Specific soil situation (for cable connections)
- General age of circuit or asset
- Level of meshed network
- Level of redundancy

Consequence Assessment

A consequence assessment should cover all mentioned aspects shown in Figure 7-6, including:
- Economic damage, including penalties and constraint costs
- Aging
- Societal assessment
- Issues such as utility image, safety, and environmental hazards
- Relationship with respect to number and type of customers

The combination of susceptibility and consequence assessment will lead to a proper failure consequence class (criticality level) that, in turn, is weighted against the results of the technical and economic assessments.

Technical Assessment (of Expected Asset Performance)

The technical assessment will be the basis for a decision on the asset level and focuses on topics such as:
- Technical remaining life
- Condition of the asset
- Dielectric strength
- Condition of the mechanical controls and parts

Data mining techniques are applied to generate the best possible knowledge about the norms and criteria used for judging the quality level of the asset. Aging/degradation processes can thus be monitored, which might influence both the susceptibility of the circuit for disturbances and the condition of the specific asset for PdM and CDM activity decisions. This process will improve significantly if asset condition information is shared between asset owners. This process will lead to a positioning in one of the fields in the activity matrix and will be followed by an economic assessment to make an optimal decision with the best overall effects.

Economic Assessment

The expenses related to the operation of the asset are reviewed. Life-cycle costing (LCC) is one way of looking at the economic settings of an asset. Other important topics that are taken into consideration in this stage are:
- Depreciation
- Weighted average costs of capital (WACC)

The benefits related to the operation of the asset are considered in the approach as illustrated in Table 7-1 in both financial as well as more practical values for all preferred scenarios.
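The report does not give a formula for life-cycle costing, so the following is only a sketch of one common way to compare scenarios: discounting yearly cost amounts to a present value, with the WACC used as the discount rate. The function name, the cash-flow figures, and the 7% rate are hypothetical and chosen purely for illustration.

```python
def present_value(cash_flows, rate):
    """Discount yearly amounts (year 0 first) to a present value."""
    return sum(cf / (1.0 + rate) ** year for year, cf in enumerate(cash_flows))


# Hypothetical scenario: 100 000 up front, then 5 000 per year for 3 years.
costs = [100_000, 5_000, 5_000, 5_000]
# Assumed WACC of 7%, used here as the discount rate.
life_cycle_cost = present_value(costs, rate=0.07)
print(round(life_cycle_cost, 2))
```

Running the same function over the cost series of each scenario gives figures that can be compared directly, which is the kind of input an NPV/EVA column in a decision matrix would need.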

The Decision Support Model Functionality and Structure


Based upon the discussions in the previous sections, a model, as shown in Figure 7-12, can be applied. Although basically a transformation of the approach, as described in Figure 7-5 and Figure 7-6, it also takes into account the detailed decision process model, as shown in Figure 7-11.

Figure 7-12 Five Steps Decision Flowchart for Asset Management (AM) Decision

The process itself starts with a profiling phase to get a better picture of the stakeholders' requirements. In the decision stage, it will become clear how well the advice complies with the objectives, given the set of stakeholder demands and wishes. Apart from the very important analysis of the stakeholders' expectations, an analysis of the environmental situation and system development stage should clarify the susceptibility of the judged system/asset. Both analyses are partly characterized by fuzzy elements and thus are strongly dependent on the engineer's or manager's opinion regarding these subjects.

During the analysis of the (expected) technical performance, the asset is analyzed by assessing the influence of the condition measurement results and the historic information regarding failures. If relevant industrial failure data can be used here, assessment based upon a larger data population of the same asset type is, however, preferable. A very practical way of approaching a not-too-detailed analysis is to introduce the categories bad, medium, or good (or in figures 1, 3, or 5). The same approach can be used with the economic and societal assessment, although this will be strongly influenced by utility objectives. The set category division is open for changes and can be adapted to the specific situation.

In the final step, the position of the asset in regard to its necessary/expected performance is clarified. The analysis stage after that position is confirmed is based solely upon the level of value added to the stakeholders' requirements, as shown in the processes described in Table 7-1.

Because it is beyond the scope of this report to describe the complete scheme, a single example (see Figure 7-13) is given based upon the assessment of the susceptibility class of a cable circuit. To guarantee a certain quality of service level, some redundancy is applied. To assess how well redundancy is implemented in the case of cable connections, two measurable entities are taken into account:
- The time needed for switching of the short circuit
- The location of the asset within the circuit

The range of the three categories of the time parameter was defined by a Delphi-method-based obtainment of data:
- 1 ms or less: category 5
- More than 1 ms, up to 1 s: category 3
- More than 1 s: category 1

The categories of the other parameter are defined by the three grid configurations that are commonly used:
- Radial configuration: category 1
- Loop configuration: category 3
- Meshed configuration: category 5

The evaluating method used to obtain a value for the redundancy factor is displayed in Figure 7-13.

Figure 7-13 Flowchart for Determining Redundancy Factor

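The following equations show the decision rules that can be applied and the resulting redundancy factor (RF):

$$RF = 1 \quad \text{if } (Q_1 = 1,\ Q_2 = 3) \ \text{or}\ (Q_1 \in \{1, 3, 5\},\ Q_2 = 1) \qquad \text{(Eq. 7-1)}$$

$$RF = 3 \quad \text{if } (Q_1 = 1,\ Q_2 = 5) \ \text{or}\ (Q_1 = 3,\ Q_2 = 3) \qquad \text{(Eq. 7-2)}$$

$$RF = 5 \quad \text{if } (Q_1 = 1,\ Q_2 = 5) \ \text{or}\ (Q_1 = 3,\ Q_2 = 5) \ \text{or}\ (Q_1 = 5,\ Q_2 = 5) \qquad \text{(Eq. 7-3)}$$

where:
Q1: Duration of the service down period
Q2: Net configuration
RF: (Value) redundancy factor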

These decision rules can also be expressed in text in the following manner: A mesh configuration results in a redundancy factor of 5, as long as the switching time is 1 second or less. A radial configuration results in a redundancy factor of 1, independent of the switching time.
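The decision rules can be sketched as a lookup table keyed on the two categories. This is an illustration under stated assumptions, not the report's implementation: the function names are hypothetical, and the category boundaries are taken as inclusive at 1 ms and 1 s (the text above only confirms "1 second or less"). The pair (Q1 = 1, Q2 = 5) is printed under both Eq. 7-2 and Eq. 7-3; the sketch follows the text rule that a mesh yields 5 only when the switching time is 1 second or less, so that pair maps to 3. No rule is printed for (Q1 = 5, Q2 = 3), so the sketch raises an error for it rather than guessing.

```python
def switching_time_category(seconds: float) -> int:
    """Q1: map a switching time in seconds to category 5, 3, or 1."""
    if seconds <= 1e-3:      # assumed inclusive boundary at 1 ms
        return 5
    if seconds <= 1.0:       # "1 second or less"
        return 3
    return 1


# Q2: category per grid configuration, as listed in the text.
CONFIG_CATEGORY = {"radial": 1, "loop": 3, "meshed": 5}

# (Q1, Q2) -> RF, following Eq. 7-1 to 7-3.
RF_RULES = {
    (1, 1): 1, (3, 1): 1, (5, 1): 1, (1, 3): 1,   # Eq. 7-1
    (1, 5): 3, (3, 3): 3,                          # Eq. 7-2
    (3, 5): 5, (5, 5): 5,                          # Eq. 7-3, minus the duplicated (1, 5)
}


def redundancy_factor(seconds: float, configuration: str) -> int:
    """Look up the redundancy factor for a switching time and configuration."""
    q1 = switching_time_category(seconds)
    q2 = CONFIG_CATEGORY[configuration]
    if (q1, q2) not in RF_RULES:
        raise ValueError(f"no rule listed for Q1={q1}, Q2={q2}")
    return RF_RULES[(q1, q2)]


print(redundancy_factor(0.5, "meshed"))   # mesh, 1 s or less
print(redundancy_factor(10.0, "radial"))  # radial, any switching time
```

Keeping the rules in a dictionary makes the gaps and overlaps in the printed equations visible at a glance, which is useful when the rule set is later reviewed or extended.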

By addressing the other factors (such as environmental conditions and age) in the same way and by deciding about the mutual weight, ultimately, a circuit or subsystem falls into one of the three categories: high, medium, or low. Applying the same systematic approach to the stakeholders' requirements will give comparable results. A weighting of the situation into vital, critical, and non-critical is the result of the complete analysis.

Decision Model Supporting Effective Use of Data

Based upon the previously described approach, a decision support system can be applied, either commercially purchased or internally developed. All these tools are more or less based upon the previously given description and need the following as input:
- Specific parameter data, based upon a well-designed and detailed discussion with regard to the measuring parameters
- A pre-design decision regarding the different weights of the influencing factors
- A pre-design decision regarding inter-related weights of the different factors

The described risk and activity analysis is strongly dependent on the use of accurate data. The process identifies which data are essential for analysis and identifies the possibility of ignoring data that are not relevant. Considering the division into technical, economic, and social/societal areas, it is best to decide which data to collect per area. In the first stage of risk analysis (see Figure 7-5), information regarding environment and relevant position to stakeholders is of importance, while technical information differs from the activity matrix (see Figure 7-6). Economic issues are taken into account later in the decision evaluation process for different scenarios. This stage can be seen as a discussion of where and how to spend money and will be strongly influenced by utility philosophies. It is essential that top management understands this process and does not limit its target to financial results alone. In the second stage, the activity matrix, data covering technical information consider the condition/performance of the asset which, depending on the asset amount per type, can be improved by sharing data. Where other discussed analysis issues are partly influenced by subjective opinions, the technical assessment is to be made maximally objective. The final outcome of the model is a set of advisable actions that can be of multiple characters. One direction can advise to improve asset quality and bring the asset to a less critical level in view of its position in the system. Obviously, this can be realized by replacement, refurbishment, or intensified maintenance. The other direction can accept the expected (limited) performance but aims at decreasing the susceptibility of the system/circuit and, as such, brings the asset to a less critical position. This could be realized by making better use of the connected (lower voltage) grid, by protection of the switchgear against ambient conditions (building), or by improving the redundancy situation. 
As long as the decision attempts to balance all stakeholders' requirements, the best decision regarding the circumstances will be taken. Sensitivity analysis of different scenarios will certainly support the decision on how best to spend the money and in which priority.

The Decision Support Model, a Practical Example

The GIS substation situation, as described earlier in the qualitative example, is used to give a further practical quantitative example of the application of the different models presented. The example follows the AM flowchart decision model found in Figure 7-12. The process as exemplified there can, however, be applied according to the following analysis stages:

1. Analysis regarding the susceptibility. In the assessed situation, age and pollution were not an issue (building), nor was soil quality. Redundancy, however, was considered to be very limited due to the restricted possibility of using the lower voltage grid, the specific design with respect to bay distribution, and the position of the longitudinal disconnectors. The susceptibility rating was also influenced by mounting difficulties experienced during the erection stage, and an overall susceptibility rating of medium to bad quality (4) was given (see Figure 7-14).

2. Stakeholder consequence analysis. The stakeholder situation was judged quite severely, considering:
- The possibility for SF6 pollution, affecting both personnel and the environment.
- The image consequences that were very negative and, principally, could mean loss of a major customer.
- From a societal point of view, the industrial area could not be harmed anymore; the area was considered the largest population of employers in the region.
- Aging processes in the specific equipment were very evident and could not be stopped due to the circumstances; moreover, the regularly occurring overvoltages had the potential of damaging/aging the connected transformers.
- Direct damages to the surrounding industries were in the range of $15 million.

As such, the situation was judged very seriously: level 5.

3. Failure consequence class. Susceptibility was rated medium to bad while the consequence category was set on high. This led to the assessment conclusion of vital.

4. Transposing the failure consequence assessment result (in this case, vital) to the risk analysis matrix provides a basis for determining the expected performance level. This expected situation is compared to the actual asset performance, based upon PFM-related condition measurements and analyses. In the assessed situation, the condition class was considered to be level 1 because of high PD levels and inconsistent pole velocities of circuit breakers.

Figure 7-14 Flowchart Example End Result

In the final analysis step, the result of the assessment becomes clear: one either has to do something about the asset quality or bring the situation to a lower failure consequence class. In this practical example, the position of the switchgear was clearly in the top right corner of the matrix and an action was urgently needed.

Table 7-4 Final Step: Decision With Action Scenario Process

Decision Evaluation Matrix with Stakeholders' Preferences

Scenario:            1      1      2      2
Alternative:         A      B      C      B
Flexibility (class): ++ +
Safety (class):      +      +      ++     +
Reliability (%):     99.84  96.70  99.00  96.00
Durability (class):  + +
NPV/EVA (USD):       X      Y      Z      W

It was apparent that the current layout of the busbar was inadequate and three designs were considered (alternatives A, B, and C). From a technical point of view, there were only two general options for each design: complete refurbishment or replacement (scenarios 1 and 2). While this left the possibility of six different solutions, only four combinations were practical. The four solutions for rectifying the situation are listed in Table 7-4. Although the NPV of all alternatives was low, the two different approaches associated with design B (scenarios 1 and 2, alternative B) were considered unacceptable because of lack of flexibility and low reliability. From the remaining design alternatives, which were basically both applicable because of complete erection alongside the original switchgear, alternative A won mainly because of flexibility.

Condition Analysis
It is obvious that the condition of an asset forms an important trigger when deciding which activity should be performed. Depending on the risk axis (failure consequence class), as shown in Figure 7-6 and as referred to in Figure 7-2, this is in fact the only steering entity/variable of the approach (maybe apart from redundancy). Information about the condition (present and future performance) can be collected from available industrial figures and relevant measurement data. Industrial data provide information about generic failure rates, minutes of energy not supplied, mean times between failures of specific switchgear principles/drives, and more. These data can be very valuable but do not describe the specific condition of the specific device/asset. The latter information is based upon the earlier described PFM technical analysis, which identified the performance entities to measure and the norms to apply.

Consistency of Information

Deciding upon the performance data necessary to make a rigorous asset condition analysis is only part of the problem. Although this process is necessary to limit the quantity of data to those relevant for the decision, one should also guarantee the quality of the data. This means providing intensive support for maintenance technicians performing measurements and maintenance.

Maximum Quality of Condition Information

Determining the condition of an asset is not dependent solely on the local situation. One can use information from other assets as well, especially those assets of the same type and operational requirements, thus avoiding being too subjective in this part of the decision process. A maximally objective decision regarding asset performance is supported by broadly collected asset condition information that is analyzed and translated into knowledge rules and norms. Some commercially available software tools support this idea of a generic or generally applied data bank/database with the sole objective of storing a maximum amount of measured condition data for analysis purposes. The basic idea is shown in Figure 7-15.

Figure 7-15 Exchange of Condition Data

As a result of this joining initiative, participants can expect a far better judgment that supports an improved decision regarding the risk-based activity to be executed on the asset.

Data Mining and Decision Support

In general, a data mining process involves the earlier stages of data selection and data transformation and the subsequent stages of validation and interpretation. Data mining aims to provide an alternative to a data analysis based upon hypothesis and theory. The idea behind data mining is to find intelligible patterns that are not predicted by the established theories. Formatting the output data in a visual form that human intelligence can interpret is important, especially for the evaluation of existing experiences in condition monitoring, including validity and relevance of results.

The full potential of condition and maintenance information cannot always be realized by using traditional techniques of data handling and analysis. There are often underlying trends or features of the data that are not evident from the usual analysis techniques. Such details and trends can be important for the assessment of the equipment operation. Increasing requirements from operators and asset managers force utilities to fully exploit the capabilities of these data to optimize the use of an HV electrical plant. The method of extracting full value from such extensive databases using new analysis techniques is commonly called data mining. In its basics, data mining is the application of relatively new data-driven approaches to find patterns in data obtained from (in this case) electrical equipment. The data mining techniques are then used to relate these patterns to the operational condition of the equipment and to provide new knowledge about aging mechanisms and norms and, as such, gain control of maintenance activities.

In order to support the data mining process, the utility may apply the knowledge expert database in which all the measured data and failure information are stored and analyzed. Figure 7-16 shows an example of such an analysis tool that is based upon a data mining expert system. The general features of this approach will be shown later.

Figure 7-16 Example of Analysis Tool

Practical Example of Data Mining: Cable Condition Assessment


For the classification of the conditional quality of HV assets, rejection levels are necessary to determine the necessity of CDM actions or replacement. For example, a PD source is not always harmful to the asset's insulation in the short term. To distinguish the condition state of an asset, decision support needs to be available. One way to develop this decision support is by using statistical analysis on large numbers of field measurement data. This will help determine experience norms (rejection levels) for different diagnostic properties applicable for condition assessment of HV assets, in order to find a basis for rejection of an asset.

Figure 7-17 Schematic Structure of Data Mining Process

During routine maintenance activities, large amounts of condition-related data from different components are collected. These data can be used to perform data analysis on different levels. This data mining is carried out using a condition database, which contains all the information of the asset characteristics and its related diagnostic information. From the collected data in the database (see Figure 7-17), statistical distributions for analysis can be obtained for different diagnostic properties. Furthermore, between different diagnostic properties and component failures, correlations can be obtained that justify condition assessment actions. This process provides three outcomes (A, B, and C) of the data mining approach: Outcome A refers to new knowledge about aging mechanisms of power cable components. Outcome B refers to recommended maintenance activities on power cable components, resulting from the database analysis. Due to the large amount of measurement data stored in the software data system, operating norms and criteria are continuously updated and fed back to workers in the field as determined by result C.

When PFM technical analyses on power cables are performed, the result is that the major failure causes are related to damages of an external nature (digging activities in the ground) and internal insulation problems in the cable and its accessories. Analyzing the frequently occurring defect types and the degradation modes of the different insulation materials, the material degradation in the cable network can be categorized into four local degradation processes that are related to PDs. As a result, PD characteristics provide a sensitive parameter to detect degradation processes in the power cables. With PD diagnostics, the insulation defects can be pinpointed to a specific component of the cable system. The location of the PD sources along the cable length can be analyzed by time domain reflectometry.

Condition Analysis of Power Cables

Decision making for the condition assessment of power cables is mainly related to its dielectric condition, where different PD properties have their relevant contribution. These PD properties each have their own relation to the insulation degradation of a cable system and its components, as shown in Figure 7-18. These different PD properties should be taken into account for the determination of the insulation quality. Their effect on the degradation of the insulation is dependent on each PD property. To discern the contributions of different PD properties, different weights are assigned to each PD property. In Figure 7-18, a general overview of PD measurements on power cable systems is given. These are rules of thumb that support the analysis of the measurement results. These decision criteria are based on the different aspects of the cable's condition analysis from inspections (a set of PD properties obtained from a measurement). The knowledge rules contain the effect of PD properties on the insulation degradation determined by the detected PD property size and weight.

Figure 7-18 Relations of the Directly and Indirectly Analyzed PD Properties

For supporting decision making based on the insulation quality of a cable system, a flow diagram can be applied in which the different PD properties are used, as shown in Figure 7-19. By analyzing the derived measurement data following the diagram, the cable system's insulation condition is determined to be in one of these three classes:
- Not OK: Defective cable component (or length) in the cable system should be replaced, as multiple PD properties are outside the experience norms.
- Trending: Possible degradation, trending on the cable component is required (for example, 1 year or 3 years), as some of the PD properties are outside the experience norms.
- OK: No weak spots in the cable system. Cable system is OK, as none of the most important PD properties are outside the experience norms.

Figure 7-19 Decision Support Flow Diagram for PD Diagnosis
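The three classes can be illustrated with a simplified sketch that only counts how many PD properties exceed their experience norm. This is not the flow diagram of Figure 7-19, which applies the properties in a weighted order; the function names, the property names, the norm values, and the threshold of two exceedances for "Not OK" are all assumptions made for illustration. The second function applies the weakest-link idea stated in the text after the diagram: a cable system is as good as its worst component.

```python
def classify_component(measured, norms, not_ok_count=2):
    """Classify one component from PD property values and upper norm limits.

    measured and norms are dicts keyed by property name; a property is
    outside the norm when its measured value exceeds the norm value.
    """
    exceeded = [name for name, value in measured.items() if value > norms[name]]
    if len(exceeded) >= not_ok_count:   # assumed meaning of "multiple"
        return "Not OK"
    if exceeded:                        # at least one, fewer than not_ok_count
        return "Trending"
    return "OK"


SEVERITY = {"OK": 0, "Trending": 1, "Not OK": 2}


def classify_system(component_classes):
    """Weakest link: the system takes the worst class of its components."""
    return max(component_classes, key=SEVERITY.__getitem__)


# Hypothetical norms and measurements for a single joint.
norms = {"pd_magnitude_pc": 1000.0, "pd_intensity_per_cycle": 5.0}
joint = {"pd_magnitude_pc": 1500.0, "pd_intensity_per_cycle": 2.0}
print(classify_component(joint, norms))
print(classify_system(["OK", classify_component(joint, norms), "OK"]))
```

A production rule set would replace the simple count with the weights assigned to each PD property, but the three-way outcome and the worst-component aggregation would stay the same.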

The order of the PD properties, as applied in the decision diagram, is based on the contribution and the recognition value of PD properties for degradation processes. On the basis of these weight factors, a decision can be formulated for the qualification of the insulation condition. As the cable system consists of a series of different components, the condition of a cable system is determined by its weakest link. So, by combining the conditions of the individual components, the quality of a cable system can be indicated, for example, as not OK, trending, or OK. This approach provides a defendable maintenance plan in the case of a critical decision. The measurement values and the norms can be compared on a per component basis.

Knowledge Rules

To find the rejection levels of different properties, visual inspections of the replaced cable components are often performed. However, it is very difficult to derive general relationships between the degradation symptoms and the forensic evidence. In most cases, when a cable component is inspected, only a limited view of the interior of the opened component is obtained. In order to open a component, destructive actions are needed that will influence the accuracy of the inspection. The analysis of the forensic evidence often relies on subjective judgments made during the visual inspection. For example, it is not possible to determine visually whether the electric field at the location of a cavity in the insulation is high enough to ignite the PD. When a cable component is opened, usually more than one defect can be found. These manually constructed components often show additional defects. It is always difficult to visually determine the relationship between the found defects and the detected PD properties. The relationship between the PD symptoms and the forensic evidence is mostly inaccurate. The solution for the analysis of a diagnosis of a cable system (or its components) is to make comparisons to other measurement results. This comparison can be performed in two different ways, as illustrated in Figure 7-20:
- Time analysis: Comparison of a PD property to the properties as obtained in previous measurements on the same object. If the trend of a PD property is negative (for example, increasing PD magnitudes), from the point of view of that property, the condition of that component is deteriorating. Trending of a power cable condition may be time and cost consuming but shows the advancement of a degradation process clearly. A trend line together with a norm level will contribute to a failure prediction.
- Type analysis: Comparison of a PD property to that same type of properties, as obtained for the total population of the same specific component type. If a PD property of a specific component is out of the range of the typical observations, this indicates that, from the point of view of that property, the condition of that component deviates from the normal condition of the total population. As a result, the statistical response of a PD property, as obtained in a measurement, can be compared to the true population of that PD property. A sample is generally rejected as belonging to a certain population when it is found outside of the 90% or 95% confidence interval of the true population.

Figure 7-20 Example of Time (Upper) and Type (Lower) Analysis

Both of the previously described analyses show that there is an independent way to determine diagnostic norms for a condition assessment. Tolerance levels are necessary to assess the deviation of a diagnostic property in relation to time or population. These tolerance levels can be determined independently by statistical analysis of large amounts of condition data, from which the statistical distributions of the different diagnostic properties can be obtained. The populations used for the statistical analysis should be representative of the total population of the full network. A database is necessary to keep these statistical analyses of field measurements consistent.
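As a minimal sketch of how a tolerance level could be taken directly from population data, the following computes an empirical quantile and flags a component whose value lies above it. The one-sided test (only high values count as deviating), the linear interpolation, and the function names are assumptions made for illustration, not part of the report.

```python
def empirical_quantile(values, p):
    """p-quantile (0 <= p <= 1) of a sample, with linear interpolation
    between neighboring order statistics."""
    if not values:
        raise ValueError("population is empty")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    data = sorted(values)
    position = p * (len(data) - 1)
    lower = int(position)
    upper = min(lower + 1, len(data) - 1)
    fraction = position - lower
    return data[lower] + fraction * (data[upper] - data[lower])


def deviates_from_population(value, population, level=0.95):
    """True if value lies above the chosen tolerance level of the population.

    Assumption: larger values of the property indicate a worse condition.
    """
    return value > empirical_quantile(population, level)


# Hypothetical population of PD magnitudes for one component type.
population = [float(x) for x in range(1, 101)]
tolerance = empirical_quantile(population, 0.95)
```

For a property where low values are the concern (such as an inception voltage), the comparison would be reversed and a low quantile used instead.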

Database for Condition Assessment Support
To support a condition assessment of distribution power cables, database systems are needed to manage the data from measurements and inspections. The database for management of measurement data is also used to perform statistical analysis from which knowledge rules and norms can be set. The system, as applied for the data collection and analysis in this report, consists of the following components:
Module to input or define the cable data (such as structure and component types)
Module to input measurements
Module to analyze the data
Database to store the data

Separating the functionality of the system into these four blocks has the advantage that extra modules can be added to store data in or to use data from the database. The database system should be able to interact with its environment to be functional. To do so, four modules will be implemented as separate software programs, respectively called:
Definition
Input
Measurement
Analysis

A schematic structure of a diagnostics database is shown in Figure 7-21.

Figure 7-21 Schematic Structure of a Diagnostics Database

A set of diagnostic tools provides a number of algorithms for indicating the condition of an HV component. Depending on the diagnostics, a choice has to be made as to which data should be collected in the database. Only condition data that provide relevant information on the condition of an HV component and that can be used for data mining purposes should be stored. Free-text remarks are very hard to use in a comparative analysis, but standard remark texts or numerical properties can be used for a variety of analyses.

The different cable component types can also be defined in the database. For each cable component type, norms can be assigned for different PD properties. From the different components, the diagnosed cable systems can be added to the database, as shown in Figure 7-22. All of the required information on the cable system can be added. The obtained measurement data are assigned to the individual cable components or to the full cable system.

Figure 7-22 Screenshot of Cable Sections

After one of the cables in the database is selected, measurement results can be added, removed, or updated for the cable system or for the individual components, as can be seen in Figure 7-23. On the left, an overview of the cable system is shown; after a selection is made there, the measurement input can be filled in. The inserted measurement data are then available for time analysis or type analysis. Because the database can contain a large amount of data, filters can be used to restrict the amount of data for analysis purposes.

The analysis can be performed on three different levels:
Component level
Type level
Group level

The component view is used to view and analyze the measurement data per component. The type view enables statistical analysis of components of the same type. The group view is used to view and analyze the measurement data on groups of components.

Figure 7-23 Measurement Add and Update Screen

A filter is constructed of rules, and rules are constructed of conditions. After the root of the tree is selected, controls appear for adding rules. When a rule is selected, conditions can be added to it, or the rule can be removed. A condition restricts the value of a property of a component (such as a joint, termination, cable part, or cable system) (see Figure 7-24). Expressions are composed of one or two properties and an operator and restrict the amount of data to be analyzed. By combining these restrictions, specific subsets of data can be selected for analysis (for example, all cable joints of type A situated in paper oil insulated joints with a partial discharge inception voltage [PDIV] below service voltage).

The analysis on the component level shows the properties that are relevant to the selected component for the different measurement sessions. An output matrix shows the different PD properties of the different measurements. When norms are in use, the elements of the component tree are colored red if they are outside a norm and green if they are inside a norm. This coloring is recursive, so a cable system is colored red if one of its components is red.
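The recursive coloring described above can be sketched as a small tree walk. The `Node` class, the property names, and the representation of a norm as an allowed (low, high) range stored on each node are hypothetical; the report does not describe how the tool stores norms internally.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """A cable system or one of its components (hypothetical structure)."""
    name: str
    values: dict = field(default_factory=dict)    # PD property -> measured value
    norms: dict = field(default_factory=dict)     # PD property -> (low, high) allowed range
    children: list = field(default_factory=list)  # sub-components


def own_color(node):
    """Red if any measured property of this node lies outside its norm."""
    for prop, value in node.values.items():
        if prop in node.norms:
            low, high = node.norms[prop]
            if not (low <= value <= high):
                return "red"
    return "green"


def color(node):
    """Recursive coloring: a node is red if it, or any descendant, is red."""
    if own_color(node) == "red":
        return "red"
    if any(color(child) == "red" for child in node.children):
        return "red"
    return "green"


# Hypothetical example: the PDIV must not fall below a 9 kV service voltage.
pdiv_norm = {"pdiv_kv": (9.0, math.inf)}
joint_a = Node("joint A", values={"pdiv_kv": 12.0}, norms=pdiv_norm)
termination_1 = Node("termination 1", values={"pdiv_kv": 7.5}, norms=pdiv_norm)
system = Node("cable system", children=[joint_a, termination_1])
```

In this example `color(system)` evaluates to red because one termination is outside its norm, which mirrors the weakest-link reasoning used for the cable system as a whole.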

Figure 7-24 Dialog for Adding and Updating a Filter
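A filter built from rules and conditions could be sketched as follows. The report does not state how conditions and rules are combined, so this sketch assumes that all conditions inside a rule must hold and that a record passes if any rule holds. The `Prop` marker class and the record field names are hypothetical.

```python
import operator
from dataclasses import dataclass

# Comparison operators a condition may use.
OPS = {
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
    "==": operator.eq, "!=": operator.ne,
}


@dataclass(frozen=True)
class Prop:
    """Marks an operand as a property name rather than a constant."""
    name: str


def resolve(record, operand):
    """Look up a property in the record, or pass a constant through."""
    return record[operand.name] if isinstance(operand, Prop) else operand


def condition_holds(record, condition):
    """A condition is (left operand, operator symbol, right operand)."""
    left, symbol, right = condition
    return OPS[symbol](resolve(record, left), resolve(record, right))


def rule_holds(record, rule):
    """Assumption: every condition in a rule must hold."""
    return all(condition_holds(record, c) for c in rule)


def apply_filter(records, rules):
    """Assumption: a record passes if any rule holds; no rules means no filtering."""
    if not rules:
        return list(records)
    return [r for r in records if any(rule_holds(r, rule) for rule in rules)]


# Joints of type A with a PDIV below service voltage (two-property expression).
rule = [
    (Prop("kind"), "==", "joint"),
    (Prop("type"), "==", "A"),
    (Prop("pdiv_kv"), "<", Prop("service_kv")),
]
records = [
    {"kind": "joint", "type": "A", "pdiv_kv": 7.0, "service_kv": 9.0},
    {"kind": "joint", "type": "A", "pdiv_kv": 12.0, "service_kv": 9.0},
    {"kind": "termination", "type": "A", "pdiv_kv": 6.0, "service_kv": 9.0},
]
selected = apply_filter(records, [rule])
```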

The analysis at the type level displays statistical analyses of measurement data, as shown in Figure 7-25. The tree on the left displays the different types of components. Selecting a component type and a PD property displays the histogram of the measurement results that satisfy the filter and were performed on the selected component type. The histogram of the measurement data can also be displayed and exported as a table.

Figure 7-25 Histograms Created in the Type View

When a database is used as described, the integrity of the collected data becomes an important issue. Suppose that the measurement data in one file or from one measurement session contain information about a cable system and its components, and that one of the subcomponents is later replaced. The measurement data are then no longer valid for the new combination of the component and subcomponents. If these data stay in the database unchanged, they are no longer valid. If these data are deleted, information about the non-replaced components is lost.

The cable system changes when components are replaced or when the cable system is enlarged with another cable system or divided into two parts. The new cable system is no longer the same cable system as the old one. A new cable system should therefore be entered in the database, and the old cable system should be marked as historic and remain in the database. In this way, cable parts, joints, or terminations can belong to multiple cable systems in the database. Only one of these cable systems can be the current cable system; the older cable systems must all be historic. By using a reference to its former cable system(s), the history of a cable system can be maintained. Because PD property values are linked to the components themselves, the historic data from the components are still accessible from the new current cable system.

An example of this is shown in Figure 7-26. At time t, two joints C and D and a short cable part replace joint B. The measurement data on terminations 1 and 2 and joint A are the only ones that remain valid in the new situation. The date when a cable system became historic should remain stored in the database so that measurement data that apply only to the historic cable can still be viewed.

Figure 7-26 Cable System in Original (Left) and Modified (Right) Form
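The bookkeeping for historic and current cable systems can be sketched with two small record types. The class and field names are hypothetical and only illustrate the idea that measurements stay attached to components while each cable system keeps a reference to its predecessor.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Component:
    """A joint, termination, or cable part; measurements stay with it."""
    name: str
    measurements: list = field(default_factory=list)  # (date, property, value)


@dataclass
class CableSystem:
    name: str
    components: list
    predecessor: Optional["CableSystem"] = None  # former cable system, if any
    historic_since: Optional[date] = None        # None means current

    @property
    def is_current(self):
        return self.historic_since is None


def replace_components(old, removed, added, when, new_name):
    """Mark the old system historic and return the new current system."""
    removed_ids = {id(c) for c in removed}
    kept = [c for c in old.components if id(c) not in removed_ids]
    old.historic_since = when
    return CableSystem(new_name, kept + list(added), predecessor=old)


def history(system):
    """The current system followed by all of its former systems."""
    chain = []
    while system is not None:
        chain.append(system)
        system = system.predecessor
    return chain


# The situation of Figure 7-26: joints C and D and a short part replace joint B.
t1, t2 = Component("termination 1"), Component("termination 2")
joint_a, joint_b = Component("joint A"), Component("joint B")
joint_a.measurements.append((date(2003, 5, 1), "pd_level_pc", 250.0))  # made-up value
original = CableSystem("cable 1", [t1, joint_a, joint_b, t2])
modified = replace_components(
    original, [joint_b],
    [Component("joint C"), Component("joint D"), Component("short cable part")],
    when=date(2005, 6, 1), new_name="cable 1 (modified)",
)
```

Because `joint_a` is the same object in both systems, its earlier measurements remain reachable from the modified system, while joint B and its data are reachable only through the historic one.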

Determinations of Norms and Criteria
The scatter in the distributions of the populations can be rather large because of relatively few observations. For statistical analysis of the different properties, the condition data can be characterized by the type and the shape of the distribution. Calculations on the representative distributions can be performed to determine the deviation of a diagnostic property or to determine the deviation limits (norms) for a population of a property. From the condition data per type of component, different statistical distributions, as shown in Figure 7-25, can be obtained, depending on the type of diagnostic property. The distribution used for the statistical analysis should be representative of the total population of the full network.

Figure 7-28 shows examples of different PD diagnostic properties for distribution power cables. For the PD level, a Weibull distribution can be applied as a mathematical fit to the distribution of the PD magnitude levels. The Weibull distribution can accommodate the different shapes that the distribution of PD amplitude levels can take. The parameters of the distribution can then be determined from the sampled measurement data. For the PD occurrence frequency at a location, the Poisson distribution can be applied.

Figure 7-27 Experience Norms/Rejection Levels for the PD Amplitude Levels

Figure 7-28 PD Occurrence Frequency

After the determination of the type and shape of a distribution, the required rejection levels are calculated. Because the distribution represents the true population, the norm levels of the different diagnostic properties can be calculated from it. The diagnostic norm level can be set at a chosen level of the distribution (for example, the 95% level). This diagnostic norm implies that the remaining part of the distribution is regarded as typical for this diagnostic property and for this asset, and that such assets are not expected to fail within the next maintenance interval.
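Once distribution parameters are available, a norm at a chosen level follows from the quantile of the fitted distribution. The sketch below assumes a two-parameter Weibull distribution for PD magnitude and a Poisson distribution for PD occurrence counts, with parameters that have already been fitted elsewhere; the function names and the numerical values are illustrative only.

```python
import math


def weibull_quantile(p, scale, shape):
    """Value below which a fraction p of a two-parameter Weibull
    population lies: scale * (-ln(1 - p)) ** (1 / shape)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    if scale <= 0 or shape <= 0:
        raise ValueError("scale and shape must be positive")
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)


def poisson_upper_level(p, mean):
    """Smallest count k for which P(X <= k) >= p when X is Poisson(mean)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    if mean <= 0:
        raise ValueError("mean must be positive")
    k = 0
    term = math.exp(-mean)  # P(X = 0)
    cumulative = term
    while cumulative < p:
        k += 1
        term *= mean / k    # P(X = k) from P(X = k - 1)
        cumulative += term
    return k


# Hypothetical fitted parameters for one component type.
pd_level_norm = weibull_quantile(0.95, scale=500.0, shape=2.0)
pd_count_norm = poisson_upper_level(0.95, mean=2.0)
```

Choosing 0.90 instead of 0.95 lowers both norms and so rejects more components, which is one way an asset manager could tighten the criteria.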

The determined diagnostic norms depend on the goals of the asset manager. If the failure rate of a component increases, the experience norm should be adapted so that more defective components are taken out of service. Likewise, if the condition assessment interval is increased from 5 to 10 years, the experience norm can be lowered to keep the risk at the same level. Furthermore, because real-time condition data of large populations of various assets are used, the rejection levels are determined in an independent manner. The data are also obtained from the same service-aged assets for which the diagnostic norms are determined.

Database Application for Condition Assessment
In the condition database, the cable system component data, the condition data from the diagnostic inspections, and the experience norms (knowledge rules) are combined for condition assessment. As a result, based on the decision support described previously in this section, an overview of the diagnosed cables in the network (or part of the network) can be obtained, as shown in Figure 7-29.

Figure 7-29 Database View of the Different Diagnosed Cable Systems

Figure 7-29 shows a database view of the different diagnosed cable systems of a specific network owner. The tree of the cable systems, on the left side of the figure, shows the condition of each cable system by color (red, orange, and green). Because a cable system consists of a series of different components, its overall condition is determined by the component with the worst insulation condition. Therefore, the tree can be expanded to show the components of that cable system with their individual insulation conditions. Because all of the PD properties are stored in the database, it is possible to trace back to the criteria on which the decision is based. A filter can be created to decrease the total amount of data for analysis or condition assessment. On the right side, the PD properties of a cable component are shown with the applied filter, indicating which PD properties determine the condition of that component. In this example, the PD properties of a diagnosed cable termination are shown, indicating that its condition is determined by the PD activity in phase L3. The PDIV is below the operating voltage (9 kV), and the PD levels at U0 and 2U0 are above the experience norm for the shrink termination. Phase L1 also shows PD activity in this termination, but the PD properties are not critical.

8
PROJECT OPPORTUNITIES
PFM embraces the use of models in its subprocesses. These models are very useful in predicting such things as:
Maintenance costs
Optimum maintenance intervals
Wear
End-of-life
Risk
Maintenance effectiveness

The development of accurate maintenance models requires an in-depth understanding of asset functions as well as practical insight into their operation, risk environment, and failure mechanisms. The result of this amassed knowledge is a comprehensive maintenance strategy that properly applies relevant technologies, uses readily available data, and meets reliability and availability goals at the lowest total cost of asset ownership. These models cannot be developed in a vacuum; they must be developed in a collaborative environment involving researchers, asset managers, and maintenance technicians.

Load-Tap-Changer Opportunities
Load-tap-changers (LTCs) continue to be one of the larger consumers of maintenance resources. While great strides have been made in modeling the aging mechanisms of some models and subcomponents, the work is far from complete. Prudent application of life extension technologies coupled with better identification of subcomponent deterioration/failure and improved end-of-life predictors can dramatically reduce maintenance costs while simultaneously improving reliability. By working collaboratively with utilities that have a keen interest in optimizing their LTC maintenance program, a comprehensive methodology for developing and managing these maintenance programs can be developed. The resultant product will include:
A documented methodology and base data for applying a strategic asset maintenance model
Revised methodologies on how to perform PFM studies

For condition assessment tasks, adding specifically:
    Triggers for scheduling condition assessment tasks
    Assessment determinants for triggering condition-directed tasks as a result of the assessment process
Optimal interval determination
KPIs:
    Measurement techniques
    Metrics
    Actions to be taken when KPIs show inappropriate trends
Prioritization methodology
Linkage to:
    Life extension project
    Best practices project
    Industry database

Medium-Voltage Circuit Breakers


Medium-voltage circuit breakers are the largest major asset components found in the substation arena. These devices are a critical element of the substation's protection scheme and see numerous switching and fault operations in a typical year. While their individual replacement costs are on the low end of the spectrum compared to power transformers, their population is quite large, and the effects of a failure are significant to both the customer and the operating utility. Because the risk of functional failure is significant and the population is large, traditional maintenance approaches not only dominate the maintenance landscape but also consume a large amount of labor resources. An improved understanding of the failure mechanism and increased use of readily available data from SCADA systems and inspection activities have the potential not only to drive down maintenance costs but also to improve reliability and extend useful operating life. By working collaboratively with utilities that have a keen interest in optimizing their medium-voltage circuit breaker maintenance program, a comprehensive methodology for developing and managing these maintenance programs can be developed. The resultant product will include:
A documented methodology and base data for applying a strategic asset maintenance model
Development and implementation of predictive maintenance algorithms
Revised methodologies on how to perform PFM studies

For condition assessment tasks, adding specifically:
    Triggers for scheduling condition assessment tasks
    Assessment determinants for triggering condition-directed tasks as a result of the assessment process
Optimal interval determination
KPIs:
    Measurement techniques
    Metrics
    Actions to be taken when KPIs show inappropriate trends
Prioritization methodology
Linkage to:
    Life extension project
    Best practices project
    Industry database

High-Voltage SF6 Circuit Breakers


High-voltage circuit breakers are the largest major asset components found in the transmission substation arena. These devices are a critical element of the substation's protection scheme and see numerous switching and fault operations in a typical year. While their individual replacement costs are on the low end of the spectrum compared to power transformers, their population is quite large, and the effects of a failure are significant to both the customer and the operating utility. Because the risk of functional failure is significant and the population is large, traditional maintenance approaches not only dominate the maintenance landscape but also consume a large amount of labor resources. An improved understanding of the failure mechanism and increased use of readily available data from SCADA systems and inspection activities have the potential not only to drive down maintenance costs but also to improve reliability and extend useful operating life. By working collaboratively with utilities that have a keen interest in optimizing their high-voltage SF6 circuit breaker maintenance program, a comprehensive methodology for developing and managing these maintenance programs can be developed. The resultant product will include:
A documented methodology and base data for applying a strategic asset maintenance model
Development and implementation of predictive maintenance algorithms
Revised methodologies on how to perform PFM studies

For condition assessment tasks, adding specifically:
    Triggers for scheduling condition assessment tasks
    Assessment determinants for triggering condition-directed tasks as a result of the assessment process
Optimal interval determination
KPIs:
    Measurement techniques
    Metrics
    Actions to be taken when KPIs show inappropriate trends
Prioritization methodology
Linkage to:
    Life extension project
    Best practices project
    Industry database

9
NEXT STEPS
This phase of the PFM project was focused on developing the concepts and creating an integrated framework. Many of the concepts have been tested with various degrees of documentation. The next phase of this project requires a set of collaborative utilities to share their maintenance performance data and jointly develop a core set of performance metrics and aging models. Once real data and aging models become available, predictive models can be further refined and updated by using near real-time data from existing SCADA and other monitoring systems. The result would be a robust set of algorithms that can be used to trigger maintenance and predict remaining equipment life.

10
REFERENCES
1. Technical Update: Maintenance and Monitoring Best Practices for Substations Equipment 2004. EPRI, Palo Alto, CA: 2004. 1008673.

A
APPLICATION STUDY FOR LOAD-TAP-CHANGERS
Performance Focused Maintenance LTC Application
Over the past several decades, utilities have taken two significant approaches to improving the maintenance effectiveness and efficiency of their LTC transformers. These approaches have focused on maintenance task improvements and on technology improvements associated with the operation of the LTC and the use of on-line monitors. While these two approaches have resulted in improvements, they have not necessarily optimized the use of existing data available through SCADA and IEDs or focused on improving the overall performance of the LTC and its associated maintenance.

PFM is an all-inclusive approach to maintenance. It brings together what previously appeared to be distinctly different approaches to maintenance under a single umbrella. PFM recognizes that maintenance is both a technical and a business process that must be managed and that should be very similar across the whole landscape of utilities. PFM also acknowledges that the specific application of these processes and approaches will differ because of the wide range of customer requirements, electric infrastructures, and maintenance organizations. The adaptive approach of PFM allows utilities to meet their own specific maintenance and operational goals and, at the same time, be confident that they are effectively managing the process and following best industry practices.

A framework for PFM, shown in Figure A-1, displays the key concepts that go into a performance-based approach to maintenance. From the diagram, one can see that the approach is quite robust and contains the technical and business elements needed for a world-class maintenance program.

Figure A-1 PFM Framework

To further these concepts, a simple example focused on distribution LTC transformers is presented. This example application will apply only some of the PFM concepts to a family of 20 MVA LTC power transformers. Each of these transformers has an LTC, which will be the focus of this study. Due to a limited amount of available data, only a limited analysis of the main winding will be made, and there will be no analysis of the bushings.

LTC Population Characteristics
Nineteen LTC transformers are included in the analysis population. These transformers are rated:
Primary voltage = 110 kV
Secondary voltage = 12.5 kV and 24 kV
MVA rating = 12/16/20 MVA

All of the LTCs use a reactive type design with a preventive autotransformer. The LTC models and populations are listed in Table A-1.
Table A-1 LTC Population Characteristics, PFM Drivers and Benefits
Manufacturer                  LTC Model   Population   Average Age (years)
Allis Chalmers                TLH21       3            31.0
General Electric              LR654A      5            33.0
McGraw Edison/Pennsylvania    550B        3            37.7
McGraw Edison                 550C        2            24.0
RTE                           UZD         2            26.0
Westinghouse                  URT         3            45.7
Westinghouse                  UVT         2            31.0

Operating and Maintenance History
For this example application of PFM concepts, a utility has provided EPRI with a limited amount of LTC O&M history. Due to the small amount of available data, some general assumptions are made. These assumptions, while typical of many LTC transformers, may need to be adjusted at a future date in order to better describe the actual condition of the transformers and the future anticipated operating performance of the LTCs.

Specific assumptions being made include:
No failures have taken place.
Contact replacement takes place at the time of internal inspection of the LTC even if some life is still left.
LTC oil is filtered at the time of each inspection.
No transformer is loaded beyond its nameplate rating.
The aging of the main insulation package can be determined by oil testing.
No bushing analysis is made.

LTC Diagnostics and Observations
Recent oil samples from the LTC were analyzed by an oil laboratory, and several observations were made. A summary of these observations is listed in Table A-2.
Table A-2 LTC Condition Summary
Manufacturer Allis Chalmers Allis Chalmers General Electric McGraw Edison/ Pennsylvania McGraw Edison RTE Westinghouse Westinghouse Westinghouse LTC Model TLH21 TLH21 LR654A 550B 550C UZD URT URT URT LTC oil is oxidized and needs refurbishing. LTC oil in worse condition. Moisture: 98 ppm, dielectric strength: 13 kV, IFT: 19 mN/m needs rerating of bank after LTC internal inspection. LTC oil has high arcing products, and oil probably is oxidized. Coking or overheating may have occurred. No problems reported. No problems reported. LTC oil in poor condition. Coking or overheating may be present. LTC is in bad shape. Needs serious repair to the gearbox and control parts. No problems reported. No problems reported. Oil Test Results High moisture (45 ppm) content in LTC oil. Possibility of mild overheating or coking in LTC. LTC needs minor repairs. Other Observations

Westinghouse

UVT

Industry LTC Experience
A survey of other utilities was made to better understand the experience of others and determine if any of the transformers had a higher than normal failure or LTC wear rate. A summary of this survey is shown in Table A-3.
Table A-3 Industry Experience with LTCs
Manufacturer Allis Chalmers General Electric McGraw Edison/ Pennsylvania McGraw Edison RTE LTC Model TLH21 LR654A 550B 550C UZD Industry Experience Proper contact alignment is critical. Excessive wear above 800 amps. Solid tap changer at currents less than 1000 amps. Poor contact performance. Reversing switch was of a poor design. Improvement of the 550B model. Good contact performance. Have found problems with the barrier board studs. Solid tap changer at currents less than 1000 amps. Some problems with controls. Westinghouse UVT Vacuum bottles reduce contact wear significantly. Needs desiccant breather. Good Good Industry Rating of LTC Poor Fair Poor Fair

Westinghouse

URT

Fair

Main Insulation Package Diagnostics and Observations
Recent oil samples from the main tank were analyzed by an oil laboratory, and several observations were made. A summary of these observations is described in Table A-4.
Table A-4 Transformer Insulation Condition
Manufacturer                  LTC Model   Aging                           Other Observations
Allis Chalmers                TLH21       Moderate to marginal            40-70% remaining life by DP analysis
General Electric              LR654A      Normal                          50-95% remaining life by DP analysis
McGraw Edison/Pennsylvania    550B        Marginal                        67% remaining life by DP analysis
McGraw Edison                 550C        Marginal                        45-77% remaining life by DP analysis
RTE                           UZD         Normal                          77-84% remaining life by DP analysis
Westinghouse                  URT         Accelerated aging on one unit   12-95% remaining life by DP analysis
Westinghouse                  UVT         Excellent                       95% remaining life by DP analysis

PFM Technical Analysis
A traditional failure mode and criticality analysis was performed on this family of distribution class transformers with LTC. The results are summarized in Table A-5.
Table A-5 PFM Technical Analysis Summary
Column headings: Functions and Regulatory Requirements; Critical Function (Mark with an X); Failure Modes; Dominant Failure Modes (0 = Exceptional, 1 = Seldom, 2 = Real Possibility); Failure Effects: Equipment; Failure Effects: System; Failure Effects: Remote; Failure Effects: Customer; Failure Causes; Dominant Cause (0 = Rare, 1 = Seldom, 2 = Real Possibility, 3 = Exceptional Problem); Aging Mechanisms; Applicable Performance Metrics; Safety Impact? (Yes, No)

Transform voltage at rated KVA

High

Fails to transform voltage at rated MVA Fails to provide cooling Fails to adjust output voltage Fails to adjust output voltage Fails to adjust output voltage

0; never happens on its own

Component failure

Loss of critical function Could lead to dielectric failure Loss of critical function Loss of critical function

Highside protection operates Highside protection operates

Extended outage

Result of other failure modes Cooling control failure or loss of station service Loose electrical connections Contact misalignment or failure

No aging mechanism

None

N/A

High

1; multi-stage cooling

Loss of life

Extended outage

Random

Failure rate of cooling control system

No

Automatically adjust output voltage (LTC)

High

Component failure

Maintenance inconvenience

Quality of service None; must take place when xfr is off-line

OEM workmanship

Failure rate for this mode

No

Component failure

Maintenance inconvenience Loss of transfer capacity, possible transformer failure Loss of transfer capacity, possible transformer failure

OEM workmanship

Failure rate for this mode

No

Component failure

Loss of critical function

Extended outage

Drive mechanism

OEM workmanship

Failure rate for this mode

No

Fails to adjust output voltage

Component failure

Loss of critical function

Extended outage

Failed reversing switch

OEM workmanship

Failure rate for this mode

No

Fails to adjust output voltage

Component failure

Loss of critical function

Maintenance inconvenience

Quality of service reduction (back-up controls prevent overvoltage condition) Quality of service reduction (back-up controls prevent overvoltage condition) None; must take place when xfr is off-line None; must take place when xfr is off-line None; must take place when xfr is off-line

Failed controls

OEM workmanship

Failure rate for this mode

No

Fails to adjust output voltage

Component failure

Loss of critical function

Maintenance inconvenience

Loss of sensing (voltage input)

OEM workmanship

Failure rate for this mode

No

Manually adjust output voltage (NLTC)

High

Fails to adjust output voltage Fails to adjust output voltage Fails to adjust output voltage Incorrectly indicates oil level is too high Indicates oil level is ok but is low

0 = in service; 1 = during installation

Component failure

Loss of critical function

Maintenance inconvenience

Loose electrical connection

OEM workmanship

Report exception during installation or resetting Report exception during installation or resetting Report exception during installation or resetting Report exception during inspection Report exception during inspection

No

0 = in service; 1 = during installation

Component failure

Loss of critical function

Maintenance inconvenience

Contact misalignment

OEM workmanship

No

0 = in service; 1 = during installation

Component failure

Loss of critical function

Maintenance inconvenience

Broken mechanical connection

OEM workmanship

No

Provide oil level indication

Gauge or float figure

Loss of alarming

None

None

Mechanical binding

Random

No

Gauge or float figure

Loss of alarming

None

None

Mechanical binding

Random

No

Incorrectly indicates oil level is too low Fails to provide containment; minor leak Fails to provide containment; minor leak Fails to provide containment; minor leak Fails to provide containment; major leak Provide connectivity to HV or LV system Failure to provide conduction path

Gauge or float figure

Loss of alarming Possible dielectric failure as a result of oil heating Possible dielectric failure as a result of oil heating Possible dielectric failure as a result of oil heating Possible dielectric failure as a result of oil heating

None

None

Broken float

Random

Report exception during inspection

No

Contain oil

High

Loss of cooling and insulation

Loss of transfer capacity

None; planned load transfer None; planned load transfer None; planned load transfer

Rust/corrosion

Time and location dependent (environment)

Failure rate for this mode

No

Loss of cooling and insulation

Loss of transfer capacity

Gasket failure

Time and temperature

Failure rate for this mode

No

Loss of cooling and insulation

Loss of transfer capacity

Weld fatigue

Time-OEM manufacturing error

Failure rate for this mode

No

Loss of cooling and insulation

Highside protection operates

Extended outage

Vandalism

Random

Failure rate for this mode

No

High

Local overheating

Hotspot

Could lead to flashover and outage Highside protection operates Highside protection operates

Voltage flicker and possible outage Extended outage

Loose bushing connection

Random

Failure rate for this mode

No

1 Fails to provide insulation - normal operation

Bushing failure

Loss of function

Broken bushing

Random

Failure rate for this mode

No

Provide rated insulation

High

Loss of function

Damaged coils

Extended outage

Overload, overheating

Time and temperature

Failure rate for this mode

No

Loss of function

Tank rupture or fire (rare)

Highside protection operates

Extended outage

Contamination, water ingress

Time or random event that allows water ingress

Failure rate for this mode

No

0


Loss of function Loss of function Loss of function

Tank rupture or fire (rare) Damaged coils Damaged coils

Highside protection operates Highside protection operates Highside protection operates

Extended outage Extended outage Extended outage

Manufacturing contaminants

N/A

Failure rate for this mode Failure rate for this mode Failure rate for this mode Failure rate for insulation failure mode (only root cause analysis can differentiate the exact cause and mode) Failure rate for insulation failure mode (only root cause analysis can differentiate the exact cause and mode) Failure rate for insulation failure mode (only root cause analysis can differentiate the exact cause and mode) Failure rate for insulation failure mode (only root cause analysis can differentiate the exact cause and mode)

No

Age

Time

No

Low oil, loss of cooling

Temperature

No

High

Fails to provide insulation - surge and transient

Loss of function

Damaged coils

Highside protection operates

Extended outage

Overload, overheating

Time and temperature

No

Loss of function

Damaged coils

Highside protection operates

Extended outage

Contamination, water ingress

Time or random event that allows water ingress

No

Loss of function

Damaged coils

Highside protection operates

Extended outage

Manufacturing contaminants

N/A

No

Loss of function

Damaged coils

Highside protection operates

Extended outage

Age

Time

No


Loss of function

Damaged coils

Highside protection operates

Extended outage

Loss of coil compression

Time

Failure rate for insulation failure mode (only root cause analysis can differentiate the exact cause and mode) Failure rate for insulation failure mode (only root cause analysis can differentiate the exact cause and mode)

No

Loss of function

Damaged coils

Highside protection operates

Extended outage

Low oil

Random

No

Provide pressure relief

High

Fails to open

Damage to pressure relief device

Tank deformation, rupture, or fire (rare)

The initiating event that caused a change in pressure generally causes a protective device to operate.

Extended outage

Corrosion or contamination in spring

Time and environment

Report exception during inspection

Yes

Fails to reset

Damage to pressure relief device

Could allow contamination to enter the transformer

None

May lead to future insulation failure and thus, extended outage

Spring failure

Report exception during inspection

No


PFM Technical Summary

The PFM analysis identified the dominant modes and causes of failure to be:

Dominant failure modes:
• Failure to adjust output voltage automatically
• Failure to provide insulation - surge and transient
• Failure to provide insulation - normal operation

Dominant causes:
• Stationary contact failure
• Moving contact failure
• Reversing switch failure
• Loss of clamping pressure
• Age
• Moisture intrusion

Other causes of functional failure (such as leaks) will not be further investigated because the current substation and transformer maintenance inspection program is a prudent and successful approach.

PFM Risk Analysis

An analysis of the risk associated with the dominant causes of failure was performed. The results are summarized in Figure A-2 through Figure A-6 for the following failure modes:
• Failure to adjust output voltage automatically
• Failure to provide insulation - surge and transient
• Failure to provide insulation - normal operation

Figure A-2 Winding Failure Risk Analysis - Older Westinghouse

Figure A-3 Winding Failure Risk Analysis - Others

Figure A-4 LTC Failure Risk Analysis - Poor Performer

Figure A-5 LTC Failure Risk Analysis - Fair Performer

Figure A-6 LTC Failure Risk Analysis - Good Performer

From the risk analysis, several potential maintenance/reliability areas requiring more in-depth analyses or potentially a change in strategy were recognized. Three areas that will be further analyzed in this report include:
• Advanced aging of some main winding insulation systems
• Short contact lives associated with the TLH21 model of LTC
• The potential to extend maintenance intervals for type UVT and UZD model tap changers

Note: Due to limited information, other areas could not be sufficiently reviewed.

Developing Aging Models

Initial risk analysis activities gave good insight into areas of potential concern and opportunity. The process was very subjective due to the limited amount of available data. This limited amount of data does not imply that the risk analysis was of little value; in fact, just the opposite is true. The risk analysis also does a good job in identifying what future data collection activities will benefit the utility.

Two aging models were developed, using PFM techniques to model:
• The aging process of paper insulation in transformers
• The wear process for LTC contacts


The two models developed are graphed in Figure A-7 and Figure A-8.

Figure A-7 Main Winding Aging Model (Normal Loading)

Figure A-8 LTC Wear Model

Implications of Aging/Wear Models

It is recognized that the above aging/wear models are second approximations as to how the aging process actually takes place. They are an improvement over first approximation models, which are sometimes referred to as failure rate models, in several ways. The improvements include:
• Realization that most aging processes are not linear or constant. The rate of aging can change with time.
• Time is not the only aging mechanism.
• Wear is described as the probability of end-of-life.
• Models are specific to each equipment design.


With the wear models, risk can now be quantified as a function of age, acknowledging the fact that risk is dynamic and changes with time. Risk and benefit can now be defined as:

$$\text{Risk} = (\text{Impacts of failure in dollars}) \times (\text{Probability of failure})$$

$$\text{Benefit} = \text{Risk}_1 - \text{Risk}_2 \qquad \text{(Eq. A-1)}$$

Where:
$\text{Risk}_1$ = Risk associated with operation or maintenance scenario 1
$\text{Risk}_2$ = Risk associated with operation or maintenance scenario 2
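As a minimal sketch of how Eq. A-1 might be evaluated, the following Python snippet computes the risk for two scenarios and the benefit of moving from one to the other. The function names and all dollar and probability values are hypothetical, chosen only for illustration; they do not come from the report.

```python
def risk(impact_dollars: float, probability_of_failure: float) -> float:
    """Risk in dollars per Eq. A-1: impact of failure times probability of failure."""
    if not 0.0 <= probability_of_failure <= 1.0:
        raise ValueError("probability_of_failure must be between 0 and 1")
    return impact_dollars * probability_of_failure


def benefit(risk_1: float, risk_2: float) -> float:
    """Benefit per Eq. A-1: risk under scenario 1 minus risk under scenario 2."""
    return risk_1 - risk_2


# Hypothetical inputs: a $1.2M failure impact, with the annual probability of
# failure dropping from 4% (scenario 1) to 1.5% (scenario 2).
risk_1 = risk(1_200_000, 0.04)
risk_2 = risk(1_200_000, 0.015)
print(f"Risk 1 = ${risk_1:,.0f}, Risk 2 = ${risk_2:,.0f}")
print(f"Benefit = ${benefit(risk_1, risk_2):,.0f}")
```

A positive benefit means scenario 2 carries less risk than scenario 1; whether that justifies the change still depends on what scenario 2 costs to implement.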

This risk can now be applied to the fleet of transformers to determine the average annual failure rates and the expected budget impacts as shown in Figure A-9.

Figure A-9 Average Failure Rate and Risk for an Aging Fleet of Transformers
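One possible way to roll the per-unit risk up to a fleet is sketched below. It assumes, purely for illustration, that the winding age model can be approximated by a Weibull distribution; the report does not state the distribution it used. The fleet ages, Weibull parameters, and failure impact are all hypothetical.

```python
import math

# Hypothetical inputs (not from the report).
SCALE_YEARS = 55.0       # assumed Weibull scale parameter, in years
SHAPE = 4.0              # assumed Weibull shape parameter (wear-out behavior)
IMPACT = 1_000_000       # assumed impact of one failure, in dollars
fleet_ages = [5, 12, 20, 28, 35, 41, 47]   # assumed unit ages, in years


def weibull_cdf(age: float, scale: float, shape: float) -> float:
    """Probability that end-of-life has been reached by the given age."""
    return 1.0 - math.exp(-((age / scale) ** shape))


def annual_failure_probability(age: float, scale: float, shape: float) -> float:
    """Probability of failing within the next year, given survival to `age`."""
    survive_now = 1.0 - weibull_cdf(age, scale, shape)
    survive_next = 1.0 - weibull_cdf(age + 1, scale, shape)
    return (survive_now - survive_next) / survive_now


probs = [annual_failure_probability(a, SCALE_YEARS, SHAPE) for a in fleet_ages]
average_rate = sum(probs) / len(probs)   # average annual failure rate
expected_budget = IMPACT * sum(probs)    # expected annual risk in dollars

print(f"Average annual failure rate: {average_rate:.4f}")
print(f"Expected annual budget impact: ${expected_budget:,.0f}")
```

Repeating the calculation with every age advanced by one year shows how the average failure rate and budget exposure grow as the fleet ages.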

Transformer Winding Maintenance

The PFM analysis reinforced the facts that:
• Renewal of the paper insulation system is not a maintenance activity.
• Renewal of the paper insulation system is, in most cases, not cost effective.
• Coil clamping pressure directly affects the capability of the transformer to withstand a through-fault.
• Aging is a function of:
  - Time
  - Temperature/loading
  - Available oxygen
• An age limit on transformer operating life is prudent and could be determined from the utility's risk tolerance level.


A third approximation of winding age should now be developed in order to more accurately quantify the annual risk of failure. This model can take into account the previously mentioned aging elements and be calibrated by a combination of:
• Power factor test measurements
• Oil (furan) tests
• Dissolved gas analysis (DGA) tests (CO and CO2)
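To illustrate how the temperature/loading element could enter such a model, the sketch below uses the aging acceleration factor from IEEE Std C57.91 (110 °C reference hot-spot temperature for thermally upgraded paper). This is a commonly used relation and not the model developed in this report; the hourly hot-spot temperatures are hypothetical, and the sketch ignores oxygen and moisture effects.

```python
import math


def aging_acceleration_factor(hot_spot_c: float) -> float:
    """IEEE C57.91 aging acceleration factor relative to a 110 C hot spot."""
    return math.exp(15000.0 / 383.0 - 15000.0 / (hot_spot_c + 273.0))


def equivalent_aging_hours(hourly_hot_spots_c) -> float:
    """Equivalent hours of aging at the reference temperature.

    Each entry is assumed to be the winding hot-spot temperature (deg C)
    held for one hour.
    """
    return sum(aging_acceleration_factor(t) for t in hourly_hot_spots_c)


# Hypothetical 24-hour profile: 16 hours at 95 C and 8 hours at 120 C.
profile = [95.0] * 16 + [120.0] * 8
print(f"Factor at 120 C: {aging_acceleration_factor(120.0):.2f}")
print(f"Equivalent aging over the day: {equivalent_aging_hours(profile):.1f} hours")
```

A calibrated third approximation would adjust such a thermal estimate using the power factor, furan, and DGA results listed above.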

The topic of transformer coil clamping pressure must be further investigated. It is acknowledged that test techniques such as swept frequency response analysis (SFRA) can detect the movement of windings resulting from through-faults and reduced clamping pressure. This analysis is after-the-fact, and, although it can potentially alert the utility to an abnormal risk of near-term failure, it does not prevent winding movements.
LTC Maintenance

The maintenance of LTCs is a traditional renewal activity that can be improved by the results of the above analysis. It is possible to set more exact maintenance triggers for LTC inspection based on the above models and the level of risk the utility is willing to take. Table A-6 identifies some possible trigger levels for LTC internal inspection.
Table A-6 Number of LTC Operations Where 63% Contact Wear Is Expected

Model | Operations Before Maintenance
TLH21 | 15,000
LR654 | 40,000
550B | 30,000
550C | 40,000
UZD | 80,000
URT | 30,000
UVT | 60,000
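The trigger levels in Table A-6 can be applied with a simple lookup, sketched below. The operation counts are taken from the table; the function names and the `margin` parameter (a hypothetical way to express the utility's risk tolerance as a fraction of the table value) are illustrative assumptions.

```python
# Operations before maintenance, from Table A-6.
OPERATIONS_BEFORE_MAINTENANCE = {
    "TLH21": 15_000,
    "LR654": 40_000,
    "550B": 30_000,
    "550C": 40_000,
    "UZD": 80_000,
    "URT": 30_000,
    "UVT": 60_000,
}


def wear_life_used(model: str, operations_since_maintenance: int) -> float:
    """Fraction of the Table A-6 operation limit consumed so far."""
    return operations_since_maintenance / OPERATIONS_BEFORE_MAINTENANCE[model]


def inspection_due(model: str, operations_since_maintenance: int,
                   margin: float = 1.0) -> bool:
    """True when the operation count reaches `margin` times the table limit.

    A margin below 1.0 (hypothetical) triggers inspection earlier for a
    utility that accepts less risk.
    """
    return wear_life_used(model, operations_since_maintenance) >= margin


print(inspection_due("TLH21", 15_000))             # at the table limit
print(inspection_due("UZD", 15_000))               # well below the limit
print(inspection_due("LR654", 34_000, margin=0.8)) # earlier trigger
```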


Another approach to triggering LTC maintenance is to perform maintenance when the total cost of maintenance and risk is minimal. This approach requires the utility to quantify:
• The cost of maintenance
• The cost of failure:
  - Equipment repair and replacement costs
  - Contractual impacts
  - Supply impacts
  - Revenue impacts
  - Social impacts:
    - Customer impacts
    - Environmental impacts
    - Political impacts

This approach is location sensitive and results in an optimum maintenance interval. It is approximated in Figure A-10 for demonstration of the concept only.

Figure A-10 Example of Optimizing Maintenance Intervals Based on Lowest Life-Cycle Cost
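The lowest-cost interval concept can be sketched numerically with the textbook age-replacement cost model, which is used here as a stand-in because the report does not give the formula behind Figure A-10. The sketch assumes a Weibull wear model whose scale equals the TLH21 value from Table A-6 (63% is approximately the fraction reaching end-of-life at the Weibull scale parameter); the shape parameter and both cost figures are hypothetical.

```python
import math

# Assumed inputs: only the 15,000-operation figure comes from Table A-6.
SCALE_OPS = 15_000            # TLH21 value, read here as a Weibull scale
SHAPE = 3.0                   # hypothetical Weibull shape (wear-out behavior)
COST_MAINTENANCE = 8_000.0    # hypothetical cost of a planned inspection
COST_FAILURE = 250_000.0      # hypothetical total cost of an in-service failure


def survival(ops: float) -> float:
    """Probability that the contacts have not reached end-of-life by `ops`."""
    return math.exp(-((ops / SCALE_OPS) ** SHAPE))


def expected_cycle_length(interval: float, steps: int = 200) -> float:
    """Expected operations per cycle: trapezoidal integral of survival()."""
    h = interval / steps
    inner = sum(survival(i * h) for i in range(1, steps))
    return h * (0.5 * (survival(0.0) + survival(interval)) + inner)


def cost_per_operation(interval: float) -> float:
    """Expected cost per cycle divided by expected cycle length."""
    p_survive = survival(interval)
    expected_cost = (COST_MAINTENANCE * p_survive
                     + COST_FAILURE * (1.0 - p_survive))
    return expected_cost / expected_cycle_length(interval)


candidates = range(1_000, 30_001, 500)   # maintenance intervals to compare
best_interval = min(candidates, key=cost_per_operation)
print(f"Lowest-cost interval on this grid: {best_interval:,} operations "
      f"(${cost_per_operation(best_interval):.2f} per operation)")
```

Because the cost of failure is location sensitive, the same LTC model can have different optimum intervals at different substations; rerunning the sketch with a different `COST_FAILURE` shows the effect.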

Beyond setting operation limits for each LTC model, the analysis identified a need to reassess the maintenance approach for the model TLH21 tap changer and potentially other models as well. This revised approach may include:
• Use of on-line LTC oil filters
• LTC temperature monitoring - temperature index
• LTC DGA analysis


The second approximation model of LTC contact wear can clearly be improved. A third-generation approximation can be developed that incorporates:
• Actual loading at the time of contact change
• Tap positions at the time of change
• A large sampling of observations from multiple utilities

This improvement can be personalized by using readily available, utility-specific O&M data that come from field inspections, SCADA systems, and data historians.


Performance Measurement

Both the PFM technical analysis and the age model identified important data elements to measure during routine transformer O&M as well as metrics to calculate for measuring transformer and LTC performance. Some of the measures and metrics identified during this analysis are shown in Tables A-7 and A-8.
Table A-7 Metrics for LTC Performance

Function | Information to Collect/Use | Source | Equip. Spec. or Aggregate? | Metrics and KPIs (mark only one): Static Equip. Data? / Measurement Data? / Metric Info? / KPI Info? | Action If Off-Target | Report Needed
Adjust output voltage (LTC) | Loading | SCADA/historian | Specific | | N/A | Base data
Adjust output voltage (LTC) | Tap position | SCADA/historian | Specific | | N/A | Base data
Adjust output voltage (LTC) | LTC oil temperature | SCADA/historian | Specific | | N/A | Base data
Adjust output voltage (LTC) | LTC oil temperature | SCADA/historian | Specific | | N/A | Base data
Adjust output voltage (LTC) | Failure event | CMMS | Specific | | N/A | Base data
Adjust output voltage (LTC) | Percent wear at the time of maintenance | CMMS | Specific | | N/A | Base data
Adjust output voltage (LTC) | Contact wear as function of operations and model | CMMS | Aggregate | | Determine if maintenance triggers are correct or if impacted by other factors, such as loading | Wear distribution by age/operations

Table A-7 (cont.) Metrics for LTC Performance

Function | Information to Collect/Use | Source | Equip. Spec. or Aggregate? | Metrics and KPIs (mark only one): Static Equip. Data? / Measurement Data? / Metric Info? / KPI Info? | Action If Off-Target | Report Needed
Adjust output voltage (LTC) | Reversing switch wear as function of operations and model | CMMS | Aggregate | | Determine if maintenance triggers are correct or if impacted by other factors, such as loading | Wear distribution by age/operations
Adjust output voltage (LTC) | LTC temperature index | MMW | Specific | | Perform DGA and analysis and/or schedule internal inspection of LTC | 13-month index history of all LTCs
Adjust output voltage (LTC) | Operations per month | MMW | Specific | | Investigate bandwidth and time delay settings on LTC controls | Items with more operations than monthly threshold
Adjust output voltage (LTC) | Reversing switch operations per month | MMW | Specific | | Investigate NLTC setting | Items with fewer than needed operations of the reversing switch
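The "operations per month" metrics in Table A-7 could be derived from periodic counter readings along the lines of the sketch below. The LTC identifiers, counter readings, and both thresholds are hypothetical; a utility would set the thresholds from its own wear models and OEM guidance.

```python
def monthly_operations(counter_readings):
    """Operations in each month, from consecutive month-end counter readings."""
    return [later - earlier
            for earlier, later in zip(counter_readings, counter_readings[1:])]


def exceeds_monthly_threshold(readings_by_ltc, max_ops_per_month):
    """LTCs whose latest month had more operations than the threshold."""
    flagged = {}
    for ltc_id, readings in readings_by_ltc.items():
        ops = monthly_operations(readings)
        if ops and ops[-1] > max_ops_per_month:
            flagged[ltc_id] = ops[-1]
    return flagged


def below_reversing_switch_minimum(readings_by_ltc, min_ops_per_month):
    """LTCs whose reversing switch operated fewer times than needed last month."""
    flagged = {}
    for ltc_id, readings in readings_by_ltc.items():
        ops = monthly_operations(readings)
        if ops and ops[-1] < min_ops_per_month:
            flagged[ltc_id] = ops[-1]
    return flagged


# Hypothetical month-end counter readings.
ltc_counters = {"SUB1-T1": [41_200, 41_950, 43_400], "SUB2-T3": [12_050, 12_300, 12_520]}
reversing_counters = {"SUB1-T1": [310, 318, 331], "SUB2-T3": [95, 95, 95]}

print(exceeds_monthly_threshold(ltc_counters, max_ops_per_month=1_000))
print(below_reversing_switch_minimum(reversing_counters, min_ops_per_month=1))
```

Per Table A-7, a unit on the first list would prompt a review of bandwidth and time delay settings on the LTC controls, and a unit on the second a review of the NLTC setting.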

Table A-8 Metrics for Main Insulation Performance

Function | Information to Collect/Use | Source | Equip. Spec. or Aggregate? | Metrics and KPIs (mark only one): Static Equip. Data? / Measurement Data? / Metric Info? / KPI Info? | Action If Off-Target | Report Needed
Provide rated insulation | Outage event/frequency for insulation failure mode | | Specific | | Determine if it is a batch or age problem | Outage count distribution by mode and age
Provide rated insulation | Outage duration for insulation failure mode | | Specific | | Look at emergency replacement process | Outage duration distribution by mode

Table A-8 (cont.) Metrics for Main Insulation Performance

Function | Information to Collect/Use | Source | Equip. Spec. or Aggregate? | Metrics and KPIs (mark only one): Static Equip. Data? / Measurement Data? / Metric Info? / KPI Info? | Action If Off-Target | Report Needed
Provide rated insulation | Customers affected for insulation failure mode | | Specific | | Determine if it is a batch or age problem | Customers impacted distribution by mode and age
Provide rated insulation | DGA results | | Specific | | Determine if a specific transformer problem exists | Items exceeding gas threshold
Provide rated insulation | Furan results | | Specific | | Determine if a specific transformer problem exists | Items exceeding implied age or furan threshold
Provide rated insulation | Failure mode distribution by age | | Transformer application specific | | Investigate OEM quality control | Failure mode distribution by age
Provide rated insulation | Failure rate by age for insulation failure mode | | Transformer application specific | | Investigate OEM quality control | Failure rate distribution by mode
Provide rated insulation | SAIDI for all transformer insulation failure modes | | Aggregate | | Modify replacement criteria or aging model | SAIDI report by application
Provide rated insulation | SAIFI for all transformer insulation failure modes | | Aggregate | | Modify replacement criteria or aging model | SAIFI report by application
Provide rated insulation | Operating life index distribution | | Aggregate | | Review replacement plan | Graph showing the current operating life distribution and the trends toward older or newer
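The SAIDI and SAIFI entries in Table A-8 can be computed per failure mode as sketched below, using the standard definitions from IEEE Std 1366 (customer-minutes interrupted and customer interruptions, each divided by customers served). The `Outage` record layout and all outage data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Outage:
    failure_mode: str            # e.g., "insulation" or "LTC" (hypothetical labels)
    customers_interrupted: int
    duration_minutes: float


def saifi(outages, customers_served: int) -> float:
    """Average number of interruptions per customer served."""
    return sum(o.customers_interrupted for o in outages) / customers_served


def saidi(outages, customers_served: int) -> float:
    """Average interruption minutes per customer served."""
    return sum(o.customers_interrupted * o.duration_minutes
               for o in outages) / customers_served


# Hypothetical one-year outage history for a 50,000-customer system.
history = [
    Outage("insulation", 1_200, 180.0),
    Outage("insulation", 800, 240.0),
    Outage("LTC", 500, 60.0),
]
insulation_outages = [o for o in history if o.failure_mode == "insulation"]

print(f"SAIFI (insulation): {saifi(insulation_outages, 50_000):.3f}")
print(f"SAIDI (insulation): {saidi(insulation_outages, 50_000):.2f} minutes")
```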


Conclusion
The concepts of PFM are quite thorough and cover the gamut of technical, business, and information requirements needed for a robust and responsive maintenance program. Although the analysis presented previously is rather short and limited, one can see how these concepts can be further expanded upon, resulting in further insight into how to manage and improve maintenance. The implications that can be drawn from a larger pool of data than what has been provided here can have a great financial and reliability impact on a utility. Potential impacts include:
• Developing an objective transformer replacement program
• Delaying premature transformer replacements where prudent
• Limiting risk equitably across the whole population of transformers
• Changing LTC purchase specifications so that life-cycle O&M costs are minimal and maximum reliability is realized
• Identifying how existing data sources can be used as drivers in a PM approach
• Statistically demonstrating the value of maintenance
• Identifying areas of maintenance that are not under control


Export Control Restrictions

Access to and use of EPRI Intellectual Property is granted with the specific understanding and requirement that responsibility for ensuring full compliance with all applicable U.S. and foreign export laws and regulations is being undertaken by you and your company. This includes an obligation to ensure that any individual receiving access hereunder who is not a U.S. citizen or permanent U.S. resident is permitted access under applicable U.S. and foreign export laws and regulations. In the event you are uncertain whether you or your company may lawfully obtain access to this EPRI Intellectual Property, you acknowledge that it is your obligation to consult with your company's legal counsel to determine whether this access is lawful. Although EPRI may make available on a case-by-case basis an informal assessment of the applicable U.S. export classification for specific EPRI Intellectual Property, you and your company acknowledge that this assessment is solely for informational purposes and not for reliance purposes. You and your company acknowledge that it is still the obligation of you and your company to make your own assessment of the applicable U.S. export classification and ensure compliance accordingly. You and your company understand and acknowledge your obligations to make a prompt report to EPRI and the appropriate authorities regarding any access to or use of EPRI Intellectual Property hereunder that may be in violation of applicable U.S. or foreign export laws or regulations.

The Electric Power Research Institute (EPRI)

The Electric Power Research Institute (EPRI), with major locations in Palo Alto, California, and Charlotte, North Carolina, was established in 1973 as an independent, nonprofit center for public interest energy and environmental research. EPRI brings together members, participants, the Institute's scientists and engineers, and other leading experts to work collaboratively on solutions to the challenges of electric power. These solutions span nearly every area of electricity generation, delivery, and use, including health, safety, and environment. EPRI's members represent over 90% of the electricity generated in the United States. International participation represents nearly 15% of EPRI's total research, development, and demonstration program.

Together...Shaping the Future of Electricity

Program: Substations

© 2005 Electric Power Research Institute (EPRI), Inc. All rights reserved. Electric Power Research Institute and EPRI are registered service marks of the Electric Power Research Institute, Inc.
Printed on recycled paper in the United States of America
