
Synthesized Diagnosis on Transformer Fault

Based on Principle Component Analysis and

Support Vector Machine

Tangbing Li, Peng Wang, Qiukuan Zhou, Yuting Liu

Jiangxi Electric Power Testing & Research Institute
Nanchang, China

Abstract: A synthesized method for transformer fault diagnosis is presented in this paper. The model combines principal component analysis and the support vector machine. First, principal component analysis extracts the characteristics of the sample data, retains the main information, and creates a new sample set. Then, a support vector machine model is created and trained on the new sample set. The method combines the advantages of the two algorithms. The accuracy of transformer fault diagnosis based on this method is improved when the sample information is noisy or incomplete. Experimental results show that the method is valid and feasible and has better diagnostic accuracy.

Keywords: Principal Component Analysis; Support Vector Machine; Transformer; Fault Diagnosis

I. INTRODUCTION

At present, the three-ratio fault diagnosis based on dissolved gas in oil is one of the simplest and most effective methods for fault diagnosis of power transformers. However, dissolved gas in transformer oil does not carry all of the fault information, so in recent years comprehensive diagnosis methods that combine dissolved gas analysis with the preventive test results of the transformer have been introduced into transformer fault diagnosis. Among them, the support vector machine, rough set theory and fuzzy neural network algorithms have achieved good results. However, owing to the limitations of the algorithms themselves, it is difficult to overcome the problem that the sample data are easily polluted by noise and the information is incomplete.

Principal component analysis (PCA) is a data mining technique in multivariate statistics. The purpose of PCA is to reduce the dimension of the data, converting a number of indicators into a few comprehensive indexes without losing the main information, and to exclude noise pollution and overlapping information among the numerous indicators [10-11]. The support vector machine (SVM) is a new pattern recognition method. SVM adopts the principle of structural risk minimization, taking into account both the training error and the generalization ability. SVM has advantages in solving small-sample, nonlinear, local-minimum and other pattern recognition problems [12-16].

This paper presents a synthesized method based on PCA and SVM for transformer fault diagnosis. First, the method analyses the impact factors using PCA to achieve dimension reduction and noise filtering of the data. Then the SVM is trained and tested on the noise-reduced data samples, and good results are obtained.

II. PCA-SVM THEORY

A. Principal Component Analysis

Assume that there are $p$ attributes $x_1, x_2, \ldots, x_p$ describing the various characteristics of the object under study. Let $x = (x_1, x_2, \ldots, x_p)^T$ be a $p$-dimensional vector, and denote by $u = E(x)$ and $\Sigma = D(x)$ the mean and the covariance matrix of $x$. Consider its linear transformation:

$$
\begin{cases}
y_1 = l_1^T x = l_{11} x_1 + l_{21} x_2 + \cdots + l_{p1} x_p \\
\quad \vdots \\
y_p = l_p^T x = l_{1p} x_1 + l_{2p} x_2 + \cdots + l_{pp} x_p
\end{cases}
\tag{1}
$$

We get:

$$
\begin{aligned}
\mathrm{Var}(y_i) &= l_i^T \Sigma \, l_i \\
\mathrm{Cov}(y_i, y_j) &= l_i^T \Sigma \, l_j, \quad i, j = 1, \ldots, p
\end{aligned}
\tag{2}
$$

If $y_1$ is used in place of the original $p$ variables $x_1, x_2, \ldots, x_p$, it should reflect as much of the information contained in these variables as possible, and the most classical measure of that information is the variance of $y_1$. The greater $\mathrm{Var}(y_1)$, the more information $y_1$ contains. To prevent $\mathrm{Var}(y_1) \to \infty$, the coefficient vectors $l_i$ have to be constrained:

$$
l_i^T l_i = 1, \quad i = 1, \ldots, p
\tag{3}
$$
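As a numerical illustration of (2) and (3), the following Python sketch checks that the sample variance of $y = l^T x$ equals $l^T \Sigma l$, and that among unit vectors the variance is largest along the leading eigenvector of $\Sigma$ (a standard result). NumPy is assumed; the synthetic data, the mixing matrix `A` and the helper name `projected_variance` are made up for illustration and do not come from the paper.

```python
import numpy as np

def projected_variance(l, sigma):
    """Variance of y = l^T x for covariance matrix sigma, i.e. l^T Sigma l as in (2)."""
    return float(l @ sigma @ l)

rng = np.random.default_rng(0)

# Synthetic correlated data: 500 samples of p = 3 attributes (illustrative only).
A = np.array([[2.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.5, 0.2, 0.3]])
X = rng.standard_normal((500, 3)) @ A.T
sigma = np.cov(X, rowvar=False)  # sample covariance matrix

# eigh handles symmetric matrices and returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(sigma)
l1 = eigvecs[:, -1]  # unit eigenvector of the largest eigenvalue

# Sample variance of y1 = l1^T x, to be compared with l1^T Sigma l1.
y1 = X @ l1
var_y1 = np.var(y1, ddof=1)

# Random unit vectors satisfy constraint (3); none should beat l1.
trial = rng.standard_normal((1000, 3))
trial /= np.linalg.norm(trial, axis=1, keepdims=True)
max_trial_var = max(projected_variance(l, sigma) for l in trial)

print(var_y1, projected_variance(l1, sigma), eigvals[-1], max_trial_var)
```

The first three printed numbers agree up to rounding, and the fourth does not exceed them, which is what the variance-maximization argument predicts for this toy data.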

978-1-5090-0496-6/16/$31.00 ©2016 IEEE

Under constraint (3), the $l_1$ that maximizes $\mathrm{Var}(y_1)$ is sought, and the resulting $y_1$ is called the first principal component. If one principal component is not sufficient to represent the original $p$ variables, a second component $y_2$ is added. To represent the original information most effectively, the information already contained in $y_1$ should not appear again in $y_2$, that is:

$$
\mathrm{Cov}(y_1, y_2) = 0
\tag{4}
$$

Under constraints (3) and (4), the $l_2$ that maximizes $\mathrm{Var}(y_2)$ is sought, and $y_2$ is called the second principal component. The third, fourth and further principal components are defined similarly.

Let $\lambda_i$ be the eigenvalues of $\Sigma$, the covariance matrix of $x$, and $t_i$ the corresponding unit eigenvectors. If the principal components are extracted with these eigenvectors, then by the theorem of principal component analysis

$$
\mathrm{Var}(y_i) = \lambda_i, \qquad \sum_{i=1}^{p} \lambda_i = \sum_{i=1}^{p} \sigma_{ii},
$$

where $\sigma_{ii}$ are the diagonal elements of $\Sigma$. The share of the variation of the original $p$ variables that is reflected by the principal components can therefore be expressed as a ratio. The ratio $\lambda_k / \sum_{i=1}^{p} \lambda_i$ is called the contribution rate of the principal component $y_k$, and $\sum_{i=1}^{m} \lambda_i / \sum_{i=1}^{p} \lambda_i$ is known as the cumulative contribution rate of the principal components $y_1, y_2, \ldots, y_m$. In short, principal component analysis maps the original attributes to one or a few mutually uncorrelated principal components while keeping most of the original attribute information. The information that is not represented by the retained principal components is treated as noise and removed, because it is of minor importance.

B. Support Vector Machine

SVM currently has two main types of application, namely pattern recognition and regression analysis. This paper discusses the classification problem, which belongs to pattern recognition. Without loss of generality, a classification problem can ultimately be reduced to two-class problems. The goal is to derive, from known samples, a function that separates the two classes of objects.

Consider a given training set whose samples are divided into two classes:

$$
S = \{(x_1, y_1), (x_2, y_2), \ldots, (x_l, y_l)\}, \quad x_i \in R^n, \; y_i \in \{-1, 1\}
\tag{5}
$$

where $x_i$ is the input vector, $y_i$ is the expected output, and $l$ is the total number of data points.

First, the training data are mapped to a high-dimensional linear feature space by a nonlinear function $\varphi$, and a hyperplane is then built in this feature space, so that the classifier takes the form:

$$
f(x) = \mathrm{sgn}(w \cdot \varphi(x) + b)
\tag{6}
$$

where $w$ is the weight vector, whose dimension equals that of the feature space, and $b$ is a constant. The optimization problem is:

$$
\begin{aligned}
\min_{w, b, \xi} \quad & \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{l} (\xi_i + \xi_i^*) \\
\text{s.t.} \quad & y_i - w \cdot \varphi(x_i) - b \le \varepsilon + \xi_i \\
& w \cdot \varphi(x_i) + b - y_i \le \varepsilon + \xi_i^* \\
& \xi_i, \xi_i^* \ge 0, \quad i = 1, \ldots, l
\end{aligned}
\tag{7}
$$

In the above formula, $C > 0$ is the fault-tolerant penalty coefficient, and $\xi_i$, $\xi_i^*$ are the slack variables.

The dual of the above optimization problem is

$$
\begin{aligned}
\max \; L_D = {} & -\frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) K(x_i, x_j) \\
& - \varepsilon \sum_{i=1}^{l} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{l} y_i (\alpha_i - \alpha_i^*) \\
\text{s.t.} \quad & \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) = 0, \qquad 0 \le \alpha_i, \alpha_i^* \le C
\end{aligned}
\tag{8}
$$

where $K(x_i, x_j) = \varphi(x_i) \cdot \varphi(x_j)$ is called the kernel function. Let SV and NSV denote the support vector set and the standard support vector set, respectively. The nonlinear classifier is then

$$
f(x) = \mathrm{sgn}\Big( \sum_{x_i \in SV} (\alpha_i - \alpha_i^*) K(x, x_i) + b \Big)
\tag{9}
$$

where $b$ can be calculated as follows, with $N_{NSV}$ the number of standard support vectors:

$$
b = \frac{1}{N_{NSV}} \Big\{ \sum_{0 < \alpha_i < C} \Big[ y_i - \sum_{x_j \in SV} (\alpha_j - \alpha_j^*) K(x_i, x_j) - \varepsilon \Big] + \sum_{0 < \alpha_i^* < C} \Big[ y_i - \sum_{x_j \in SV} (\alpha_j - \alpha_j^*) K(x_i, x_j) + \varepsilon \Big] \Big\}
\tag{10}
$$

A. Fault Type, Attribute Variable and Determination of Decision Table

The fault classes, fault attribute set and decision table are shown in
Table 1, Table 2 and Table 3 [7,8]. In Table 3, X2 = 0, X2 = 1, X2 = 2, X5 = 0, X5 = 1 and X5 = 2 mean, respectively, that the dissolved gas analysis indicates no overheating, low-temperature overheating, high-temperature overheating, no discharge, low-energy discharge and high-energy discharge. For X1, X3, X4, X7, X8, X9, X10, X11, X12, X13 and X14, the values 0, 1 and 2 indicate, respectively, that neither the attribute value nor its trend value is excessive, that either the attribute value or the trend value is excessive, and that both the attribute value and the trend value are excessive.

TABLE.I FAULT CLASS

Number  Fault type
D0      Transformer is normal
D1      Multipoint grounding of iron core or partial short circuit
D2      Winding deformation and turn to turn short
D3      Magnetic induced heating or magnetic shielding overheating
D4      Turn insulation damage and turn to turn short circuit
D5      Insulation damp
D6      Defect of tap changer and lead
D7      Suspended discharge
D8      Folding screen discharge
D9      Insulation aging
D10     Bare metal overheating
D11     Transformer oil flow blocked caused by overheating
D12     On load tap changer (OLTC) oil leakage

TABLE.II ATTRIBUTE SET

Number  Attribute
X1      Iron core grounding current
X2      DGA diagnosis results for overheating
X3      Three-phase unbalance of DC winding
X4      Micro-water content in bulk oil
X5      DGA diagnostic results for discharge
X6      Protective action of gas relay
X7      Winding ratio
X8      Value of partial discharge
X9      Gas ratio in oil CO2/CO
X10     The capacitance between windings or winding-to-earth capacitance
X11     DC resistance of winding
X12     Dielectric loss of winding
X13     No-load current and loss
X14     Insulation resistance, absorption ratio, polarization index
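The 0/1/2 coding of the test attributes described in this section can be sketched in Python as follows. This is only an illustration under stated assumptions: the paper does not give thresholds, so `value_limit` and `trend_limit` are hypothetical inputs, and the names `Level`, `encode_attribute`, `OVERHEATING_CODES` and `DISCHARGE_CODES` are invented here.

```python
from enum import IntEnum

class Level(IntEnum):
    NORMAL = 0         # neither the attribute value nor its trend is excessive
    ONE_EXCEEDED = 1   # either the attribute value or the trend is excessive
    BOTH_EXCEEDED = 2  # both the attribute value and the trend are excessive

def encode_attribute(value, trend, value_limit, trend_limit):
    """Map a measured value and its trend to the 0/1/2 code (limits are hypothetical)."""
    exceeded = int(value > value_limit) + int(trend > trend_limit)
    return Level(exceeded)

# The two DGA-based attributes carry category codes rather than limit checks.
OVERHEATING_CODES = {"none": 0, "low_temperature": 1, "high_temperature": 2}  # X2
DISCHARGE_CODES = {"none": 0, "low_energy": 1, "high_energy": 2}              # X5

# Made-up readings: the value exceeds its limit, the trend does not.
code = encode_attribute(120.0, 3.0, value_limit=100.0, trend_limit=5.0)
print(code, int(code))
```

Keeping the limits as explicit arguments leaves the choice of thresholds to the applicable test standard rather than fixing them in the code.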


TABLE.III DECISION TABLE

Serial number  Quantity  X1  X2  X3  X4  X5  X6  X7  X8  X9  X10  X11  X12  X13  X14  Fault class
1              20        0   0   0   0   0   0   0   0   0   0    0    0    0    0    D0
2              17        2   2   0   1   0   1   *   *   *   0    1    0    2    1    D1
3              6         1   2   0   0   *   *   *   1   1   *    *    *    2    1    D1
4              8         *   1   0   1   0   0   *   0   2   0    0    *    *    1    D2
5              12        0   2   0   0   0   *   *   *   *   0    0    0    1    *    D3
6              20        0   2   *   0   0   2   0   0   0   *    *    *    *    0    D3
7              15        0   0   1   0   2   1   0   1   1   *    1    0    2    1    D4
8              8         *   0   1   1   0   *   *   0   1   *    *    0    *    0    D5
9              4         *   2   2   1   0   1   *   1   0   *    *    *    *    0    D6
10             7         1   0   1   0   2   1   0   2   0   0    0    *    *    1    D7
11             15        0   0   0   1   2   1   0   2   2   0    0    0    *    *    D8
12             19        1   0   *   0   2   *   0   0   0   *    1    0    0    2    D9
13             14        *   2   0   *   0   *   *   *   1   0    1    0    1    *    D10
14             2         0   1   0   0   0   0   0   0   0   0    1    1    0    *    D11
15             4         0   1   0   1   2   0   *   1   0   *    *    0    *    1    D12
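One possible way to turn rows of the decision table above into numeric samples is sketched below in Python with NumPy. The paper does not say how entries marked `*` are handled, so this sketch assumes they are unspecified values and draws them uniformly from {0, 1, 2}; that choice, the use of only three rows, and the helper names `expand_rule` and `build_dataset` are assumptions made for illustration.

```python
import numpy as np

# (quantity, values of X1..X14, fault class) copied from rows 1, 2 and 4 of the decision table.
ROWS = [
    (20, "0 0 0 0 0 0 0 0 0 0 0 0 0 0", "D0"),
    (17, "2 2 0 1 0 1 * * * 0 1 0 2 1", "D1"),
    (8,  "* 1 0 1 0 0 * 0 2 0 0 * * 1", "D2"),
]

def expand_rule(quantity, values, label, rng):
    """Return `quantity` samples for one rule.

    Assumption: '*' entries are unspecified and are drawn uniformly from {0, 1, 2}.
    """
    tokens = values.split()
    X = np.empty((quantity, len(tokens)), dtype=int)
    for col, tok in enumerate(tokens):
        if tok == "*":
            X[:, col] = rng.integers(0, 3, size=quantity)
        else:
            X[:, col] = int(tok)
    y = np.full(quantity, int(label[1:]))  # "D2" -> class index 2
    return X, y

def build_dataset(rows, seed=0):
    """Stack the expanded rules into one attribute matrix and one label vector."""
    rng = np.random.default_rng(seed)
    parts = [expand_rule(q, v, c, rng) for q, v, c in rows]
    X = np.vstack([p[0] for p in parts])
    y = np.concatenate([p[1] for p in parts])
    return X, y

X, y = build_dataset(ROWS)
print(X.shape, np.bincount(y))
```

Other treatments of `*`, such as a separate "not tested" code, would fit the same structure by changing only the branch inside `expand_rule`.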

B. SVM Training Model

In this paper, we choose the Gauss radial basis function as the kernel function, which has the ability to deal with nonlinear data:

$$
K(x_i, x_j) = \exp\left( -\frac{\|x_i - x_j\|^2}{2 p_l^2} \right), \quad p_l > 0
\tag{11}
$$
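Equation (11) can be evaluated for whole sample sets at once. The following Python sketch (NumPy assumed; the function name `rbf_kernel` and the toy inputs are illustrative, not taken from the paper) computes the kernel matrix between two sets of row vectors.

```python
import numpy as np

def rbf_kernel(X1, X2, p_l):
    """Kernel matrix with K[i, j] = exp(-||X1[i] - X2[j]||^2 / (2 * p_l**2)), as in (11)."""
    if p_l <= 0:
        raise ValueError("p_l must be positive")
    X1 = np.asarray(X1, dtype=float)
    X2 = np.asarray(X2, dtype=float)
    # Squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    sq = (np.sum(X1**2, axis=1)[:, None]
          + np.sum(X2**2, axis=1)[None, :]
          - 2.0 * X1 @ X2.T)
    sq = np.maximum(sq, 0.0)  # clip tiny negative values caused by rounding
    return np.exp(-sq / (2.0 * p_l**2))

# Toy check: two points at distance 5 with p_l = 5 give exp(-0.5) off the diagonal.
pts = np.array([[0.0, 0.0], [3.0, 4.0]])
K = rbf_kernel(pts, pts, p_l=5.0)
print(K)
```

In libraries that write the same kernel as exp(-gamma * ||x - y||^2), such as scikit-learn's `SVC`, the width used here corresponds to gamma = 1 / (2 * p_l**2).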
Here $p_l$ is the key parameter of the model. One method for determining the key parameter $p_l$ and the fault-tolerant penalty coefficient $C$ is to first select $p_l$, which determines the complexity of the function set, and then to vary $C$ so as to adjust the balance between the empirical risk and the confidence interval associated with the complexity fixed by $p_l$.

Fault diagnosis based on support vector regression is a generalization of the data fitting problem. The process is as follows: a model is built from the training set and then applied to the test data. The resulting model resembles a neural network, as shown in Fig.1. The output is a linear combination of the intermediate nodes; each weight is the difference between a pair of Lagrange multipliers, and each intermediate node is the inner product of a support vector and the input vector. The input nodes correspond to the dimensions of the input sample, $n$ is the dimension, and $l$ is the number of training samples.

[Figure: input nodes $x^1, x^2, \ldots, x^n$; intermediate nodes $K(x, x_1), K(x, x_2), \ldots, K(x, x_l)$ weighted by $\alpha_1 - \alpha_1^*, \alpha_2 - \alpha_2^*, \ldots, \alpha_l - \alpha_l^*$; output $f(x)$]

Fig.1 SVM model diagram

C. PCA-SVM Fault Diagnosis

Fault diagnosis based on PCA-SVM can be divided into the following steps:

1) Use PCA to process the sample set. A linear transformation is applied to the sample set, the principal components are selected according to the cumulative contribution rate, and a new sample set is obtained by reducing the dimension of the transformed sample set according to the selected principal components.
2) Train the SVM on the new sample set and construct the form of the input variables.
3) Enter the sample to be diagnosed into the support vector machine model, and calculate its predicted value.
4) Restore the obtained predicted value to the fault diagnosis result through the coefficients used in the process of principal component analysis.

In this paper, based on the decision table provided by the literature [7], combined with the transformer test data collected by the author in the power production unit, 250 samples were collected; 200 cases were taken as the training set and 50 cases as the test set.

After calculation, the number of principal components in the PCA is determined to be 10, and the cumulative contribution rate is 90.2%. Experience shows that when the cumulative contribution rate is higher than 80%, the principal components can be considered to reflect well the information expressed by all the original attributes. In training the SVM, the key parameter $p_l$ and the penalty parameter $C$ were set to 16 and 95, respectively, according to the training effect.

TABLE.IV COMPARISON OF THE RESULTS OF DIAGNOSIS

Modified attribute  Sample    Accuracy rate (%)
number              quantity  PCA-SVM  SVM    Literature [8] method
0                   50        100      100    100
1                   50        98       94     98
2                   500       95.2     89.4   91.2
3                   500       93.0     79.8   88.4
4                   500       89.2     68.4   81.6

The attribute information of samples in actual engineering is often incomplete. In transformer testing, testers often stop testing other items once they have tested some key items and found problems. At the same time, owing to differences in experimental conditions, environment and personnel, the transformer test data actually obtained may deviate from the true values. In this paper, incomplete and deviating information is treated as noise pollution of the attribute data. To simulate this kind of noise pollution, starting from the original test set, one known attribute value was deliberately changed to form a test set of 50 cases, and for 2 to 4 modified attributes, known attribute values of each test sample were changed at random to form a test set of 500 cases in each setting. Table 4 compares the results of several methods for the diagnosis of samples contaminated with noise.

From the results in Table 4 we can see that transformer fault diagnosis based on PCA-SVM performs better than SVM and the method of literature [8] in dealing with noisy data. The advantage becomes more obvious as the noise pollution increases, that is, as more attributes are modified.

This paper studies a synthesized transformer fault diagnosis method that combines principal component analysis and the support vector machine. The method overcomes the problem that the sample data for transformer fault diagnosis are easily contaminated by noise and that the information is incomplete, and it ensures a high diagnostic accuracy when there is noise pollution in the sample data, so the method has high practical value.

REFERENCES

[1] GB 7252-2001, Guide for the analysis and judgment of dissolved gases in transformer oil[S], 1987.
[2] CAI Jinding, WANG Shaofang, “Application of rough set theory in IEC-6059 three ratio fault diagnosis decision rules[J]”, Proceedings of the CSEE, 2005, 25(11), pp. 134-139.
[3] WANG Yong-qiang, LV Fang-cheng, LI He-ming, “Transformer fault diagnosis based on Bayesian network and DGA[J]”, High Voltage Engineering, 2004, 30(5), pp. 12-13.
[4] YU Jie, ZHOU Hao, “Fault diagnosis model of dissolved gas in transformer based on immune algorithm[J]”, High Voltage Engineering, 2006, 32(3), pp. 49-50.
[5] LU Gan-yun, CHENG Hao-zhong, DONG Li-xin, et al, “Fault identification of power transformer based on multi-layer SVM classifier[J]”, Proceedings of the CSU-EPSA, 2005, 17(1), pp. 19-22.
[6] LIU Na, GAO Wen-sheng, TAN Kexiong, “Fault diagnosis of power transformer based on combined neural network model[J]”, Transactions of China Electrotechnical Society, 2003, 18(2), pp. 83-86.
[7] MO Juan, WANG Xue, DONG Ming, et al, “Fault diagnosis method of power transformer based on Rough Set Theory[J]”, Proceedings of the CSEE, 2004, 24(7), pp. 162-167.
[8] ZHU Yong-Li, WU Li-zeng, LI Xue-yu, “Synthesized diagnosis on transformer fault based on Bayesian classifier and Rough set”, Proceedings of the CSEE, 2005, 25(10), pp. 159-165.
[9] WU Zhong-li, YANG Jian, ZHU Yong-li, “Fault diagnosis of transformer based on rough set theory and support vector machine[J]”, 2010, 38(18),
[10] Tipping M E, Probabilistic principal component analysis, Journal of the Royal Statistical Society, 1999, 61(3), pp. 611-622.
[11] Xiang Dongjin, Practical multivariate statistical analysis[M], Wuhan: China University of Geosciences Press, 2005.
[12] Vapnik V, The Nature of Statistical Learning Theory[M], New York: Springer-Verlag, 1995.
[13] Deng Naiyang, Tian Yingjie, SVM-A new method in data mining[M], Beijing: Science Press, 2002.
[14] Nello Cristianini, John Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-based Learning Methods[M], Beijing: Publishing House of Electronics Industry, 2004.
[15] S. K. Shevade, S. S. Keerthi, C. Bhattacharyya and K. R. K. Murthy, “Improvements to the SMO Algorithm for SVM Regression”, IEEE Transactions on Neural Networks, 2000, 11(5), pp. 1188-1193.
[16] Danian Zheng, Jiaxin Wang, Yannan Zhao, “Non-flat function estimation with a multi-scale support vector regression[J]”, Neurocomputing, 2006, 70, pp. 420-429.