
IEEE PES PowerAfrica 2007 Conference and Exposition

Johannesburg, South Africa, 16-20 July 2007

Online Voltage Stability Monitoring and Contingency Ranking using RBF Neural Network
B. Moradzadeh, S.H. Hosseinian, M.R. Toosi and M.B. Menhaj

Department of Electrical Engineering, Amirkabir University of Technology (Tehran Polytechnic), Iran

Email: b.moradzadeh@aut.ac.ir
Abstract: Voltage stability is one of the major concerns in
competitive electricity markets. In this paper, RBF neural network
is applied to predict the static voltage stability index and rank the
critical line outage contingencies. Three distinct feature extraction
algorithms are proposed to speed up the neural network training
process by reducing the dimension of the input training vectors. Based
on the weak buses identification method, the first developed
algorithm introduces a new feature extraction technique. The
second and third algorithms are based on Principal Component
Analysis (PCA) and Independent Component Analysis (ICA)
respectively which are statistical methods. These algorithms offer
beneficial solutions for neural network training speed
enhancement. In all presented algorithms, a clustering method is
applied to reduce the number of neural network training vectors.
The simulation results for the IEEE-30 bus test system
demonstrate the effectiveness of the proposed algorithms for online voltage stability index prediction and contingency ranking.

I. INTRODUCTION
Voltage stability is defined as the ability of a power
system to maintain steadily acceptable bus voltages at every
node under normal operating conditions, after load increases,
following system configuration changes, or when the system
is subjected to disturbances such as line or generator
outages [1]. Voltage collapse may be caused by a
variety of single or multiple contingencies known as voltage
contingencies in which the voltage stability of the power
system is threatened [2]. Conventional evaluation techniques
based on the full or reduced load flow Jacobian matrix
analysis such as, singular value decomposition, eigenvalue
calculations, sensitivity factor, and modal analysis are time
consuming [3]-[5]. Therefore, they are not suitable for online
applications in large scale power systems. Since the 1990s,
extensive research work has been carried out on the
application of neural networks to power system problems [6].
Artificial neural networks (ANNs) have shown great promise
in power system engineering due to their ability to synthesize
complex mappings accurately and quickly. Most of the published
work in this area utilizes the multilayer perceptron (MLP) model
based on the back propagation (BP) algorithm, which usually
suffers from local minima and overfitting problems.
Some research work has been devoted to neural network
applications to voltage security assessment and monitoring
[7]-[11]. Multi-layered feed-forward neural networks have
been used for power margin estimation associated with static
voltage stability by means of different training criteria and


algorithms. The active and reactive powers on load and
generation buses are frequently used as the inputs to the
multi-layered feed-forward neural network [7]-[10]; bus
voltages and angles are also part of the inputs in some
research work [7], [9]; and in [11], the active and reactive
power flows of some selected lines are also used as inputs.
Radial Basis Function Network (RBFN), with nonlinear
mapping capability, has become increasingly popular in
recent years due to its simple structure and training
efficiency. RBFN has only one nonlinear hidden layer and
one linear output layer. RBFN has been applied for active
power contingency ranking/screening in [12] and a separate
RBFN was trained for each contingency. In [2], both
unsupervised and supervised learning is applied to RBFN in
order to reduce the number of neural networks required for
voltage contingency screening and ranking. An approach
based on class separability index and correlation coefficient is
used to select the relevant features for the RBFN. In [1],
active and reactive loads on all PQ buses are considered as
input set. In large scale power systems, the mentioned
method generates large input training vectors, which leads to
a slow training process. In the present paper, three
distinct feature extraction algorithms are proposed in order to
reduce the dimension of the input training vectors and speed up the
RBFN training.
II. VOLTAGE STABILITY INDEX
Minimum singular value of the load-flow Jacobian matrix
(MSV) is proposed as an index for quantifying the proximity
to the voltage collapse point. Right Singular Vector (RSV),
corresponding to a minimum singular value of the Jacobian
matrix, can be utilized for indicating sensitive voltages that
identify the weakest node in the power system [5].
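As a rough sketch of how this index can be read off a numerical SVD, the snippet below computes the MSV and the RSV-based weak-bus indicator. The Jacobian here is a random placeholder, not a real load-flow Jacobian, and the NumPy usage is a generic illustration rather than the paper's implementation.

```python
# Minimum singular value (MSV) of the load-flow Jacobian as a proximity
# index to voltage collapse, and the right singular vector (RSV) whose
# largest entry flags the weakest (most voltage-sensitive) bus.
import numpy as np

def msv_and_weak_bus(jacobian):
    """Return the MSV, the matching RSV and the index of the weakest bus."""
    U, s, Vt = np.linalg.svd(jacobian)
    msv = s[-1]                      # singular values are sorted descending
    rsv = Vt[-1, :]                  # right singular vector paired with the MSV
    weak_bus = int(np.argmax(np.abs(rsv)))
    return msv, rsv, weak_bus

rng = np.random.default_rng(0)
J = rng.standard_normal((6, 6))      # placeholder Jacobian, not a real one
msv, rsv, weak = msv_and_weak_bus(J)
print(msv, weak)
```

In a real application the Jacobian would come from the converged load-flow solution at the current operating point.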

III. PROPOSED ALGORITHMS


In this paper three different algorithms are proposed for
feature extraction. These algorithms reduce the training time
of the RBFN with acceptable accuracies.

A. First algorithm
In this algorithm, three groups of parameters are
considered for feature extraction. The first group includes the
active and reactive loads on the weak PQ buses. Load
variations on these buses have a great effect on the voltage
stability index. The second important parameter group
consists of the active and reactive loads on terminal buses of
critical lines. This group must be considered in the input set
to enhance the accuracy of the contingency ranking. The third
parameter group is the ratio of the sum of the active and
reactive loads on the remaining PQ buses to the sum of
their base values.
B. Second algorithm
In this algorithm, Principal Component Analysis (PCA)
method is employed in order to reduce the dimension of the
neural network input training vectors.
PCA is a way of identifying patterns in data and
expressing the data in such a way as to highlight their
similarities and differences. It describes a data set in terms
of its variance: each principal component accounts for a
percentage of the total variance of the data set, together with
the loadings, or weights, that each variate contributes to that
variance. That is to say, the first principal component of a
data set describes the greatest amount of variance in the data
set, and the coefficients of the principal components quantify
the loading, or weight, of each variate in that amount of variance.
Since patterns can be hard to find in high-dimensional data,
where the luxury of graphical representation is not available,
the other main advantage of PCA is that once these patterns
are found, the data can be compressed by reducing the number
of dimensions without much loss of information [13]-[15].
Principal components can be calculated using the eigenvectors
and eigenvalues of the covariance or correlation matrix:

Cov(X)_ij = (1/n) Σ_{k=1..n} (X_ik − X̄_i)(X_jk − X̄_j)    (1)

where n is the number of input training samples and
i = 1, 2, …, m, j = 1, 2, …, m (m is the vector dimension);
X̄_i is the average of the i-th variate over all samples. All
components can be computed by the solution of:

C w_p = λ_p w_p,   p = 1, 2, …, m    (2)

where C is the covariance matrix, w_p is the p-th principal
component (eigenvector) and λ_p is the corresponding
eigenvalue. The λ_p are positive values proportional to the
fraction of the total variance accounted for by each component,
and the components have the important property of forming an
orthogonal set. As PCA packs the greatest energy into the
smallest number of principal components, the components
corresponding to eigenvalues less than a threshold can be
discarded with minimal loss in representational capability. The
coefficients of the principal components of the q-th vector are
then given by:

a_qp = Σ_{i=1..m} x_qi w_ip,   p = 1, 2, …, z,  q = 1, 2, …, n    (3)

where z is the number of eigenvalues that are bigger
than the threshold.
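The covariance-eigendecomposition steps of equations (1)-(3) can be sketched as below. The data and the threshold are illustrative stand-ins, not the paper's 42-dimensional load vectors or its tuned cutoff.

```python
# PCA dimension reduction: covariance matrix, eigendecomposition, and
# projection onto the eigenvectors whose eigenvalues exceed a threshold.
import numpy as np

def pca_reduce(X, threshold):
    """X: (n, m) array of n training vectors of dimension m.
    Returns the (n, z) matrix of principal-component coefficients."""
    Xc = X - X.mean(axis=0)                # subtract the average vector
    C = np.cov(Xc, rowvar=False)           # m x m covariance matrix, eq. (1)
    eigvals, eigvecs = np.linalg.eigh(C)   # eigenvalues in ascending order
    keep = eigvals > threshold             # discard low-variance components
    W = eigvecs[:, keep]                   # m x z matrix of kept eigenvectors
    return Xc @ W                          # coefficients a_qp, eq. (3)

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 42))         # e.g. 42-dimensional load vectors
A = pca_reduce(X, threshold=0.5)           # hypothetical threshold
print(A.shape)
```

With real, correlated load data the spread of eigenvalues would be much larger, so far fewer components would survive the cut than with this synthetic input.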
C. Third algorithm
In this section, we review the definition of Independent
Component Analysis (ICA) as well as the difference between
PCA and ICA. ICA is a statistical method for transforming
multidimensional vectors into components which are
statistically as independent from each other as possible [16].
ICA was originally developed to deal with blind source
separation problems. But, ICA has been applied to many
different problems including exploratory data analysis, blind
de-convolution, and feature extraction. A fundamental
problem in signal processing or data mining is to find suitable
representations for image, audio or other kinds of data for
tasks like compression and de-noising. ICA assumes each
observed data vector to be a linear combination of unknown
statistically independent components. Let us denote by x the
n-dimensional observed data vector whose elements are the
mixtures {x_1, x_2, …, x_n} and likewise by s the m-dimensional
source vector with elements {s_1, s_2, …, s_m}. Let us denote by
A the mixing matrix with elements a_ij. All vectors are
understood as column vectors. Using this notation, the ICA
mixing model is written as

x = As    (4)

The basic model of ICA is shown in Fig. 1. The ICA


model is a generative model, which means that it describes
how the observed data is generated by a process of mixing
the components si. These components cannot be directly
observed. Also, the mixing matrix is assumed to be unknown.
All the data we can use is only the observed data vector x,
and we should estimate both A and s using it. The starting
point of ICA is the assumption that components si are
statistically independent. For simplicity, we also assume that
n is equal to m, but this assumption is sometimes relaxed. As
shown in equation (5), the recently developed techniques of
ICA can be used to estimate the unmixing matrix W based on
the independence of the estimated independent components u
[17]-[18]:

u = Wx    (5)

The representation of the observed data x in terms of the
estimated independent components u is obtained through the
inverse of W. That representation can be used instead of the
original observed data x as the input of the prediction model.
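A toy illustration of the mixing model of equation (4) and the unmixing of equation (5) is given below. Note the shortcut: here W is taken as the exact inverse of a known mixing matrix A purely to show the relationship between x, s and u; in practice A is unknown and W must be estimated from x alone by an algorithm such as EASI or infomax.

```python
# ICA mixing model: two independent non-Gaussian sources s are mixed by
# A into observations x (eq. 4); a separating matrix W recovers u = Wx
# (eq. 5). W = inv(A) is an idealization, not an ICA estimate.
import numpy as np

rng = np.random.default_rng(2)
s = rng.uniform(-1, 1, size=(2, 1000))   # independent uniform sources
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])               # mixing matrix (unknown in practice)
x = A @ s                                # observed mixtures, eq. (4)

W = np.linalg.inv(A)                     # ideal unmixing matrix
u = W @ x                                # estimated components, eq. (5)
print(np.allclose(u, s))                 # True: sources recovered exactly
```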

There is no specific rule for calculating the neighborhood
radius r used in the clustering stage (section IV); it must be
found by trial and error. The selected r must lead to an
acceptable accuracy and training speed.

Fig. 1. The basic model of ICA.

D. Differences between PCA and ICA

The main goal of PCA is finding appropriate directions


within data x that maximize variance and sometimes reduce
the effects from noises. As a result, the dimensionality of the
data is reduced by using only principal components that
contribute to the covariance, and it may lead to increased
visibility. In the mean-square (second-order) error sense,
PCA may be the optimal method for dimension reduction.
However, the projections on the principal components
derived by PCA may provide less information than ones
derived by higher-order methods like ICA. Fig. 2 clearly
shows the difference between the directions determined by
PCA and ICA on a bivariate data set. The directions
determined by PCA, i.e. principal components (PCs), are
orthogonal to each other and are directed towards the
maximum variance because of the basic assumption that the
distribution of the data is Gaussian. However, the directions
guided by ICA, the independent components (ICs), point in
different directions that provide a more meaningful
interpretation of the given data than PCA. This is possible
because ICA does not assume Gaussianity of the observed
data [19].
IV. CLUSTERING ALGORITHM

In this paper, a clustering method is used in order to


reduce the number of the neural network training vectors
[20]. The procedure of the applied algorithm is exhibited in
Fig. 3. The center of each cluster is a representative of all
vectors in that cluster, so the other vectors in the cluster are
discarded from the training vector list.
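A minimal sketch of this vector-reduction step is shown below, assuming a simple leader-style pass: each training vector either joins the first existing cluster whose center lies within the neighborhood radius r, or founds a new cluster, and only the centers are retained. The paper's exact flow chart (Fig. 3) may differ in detail.

```python
# Leader-style clustering: keep only one representative (the cluster
# center) per neighborhood of radius r, discarding the other vectors.
import numpy as np

def reduce_training_vectors(X, r):
    """X: (k, n) array of k training vectors. Returns retained centers."""
    centers = []
    for x in X:
        for c in centers:
            if np.linalg.norm(x - c) <= r:   # x falls inside an existing cluster
                break
        else:
            centers.append(x)                # x founds a new cluster
    return np.array(centers)

rng = np.random.default_rng(3)
X = rng.standard_normal((500, 4))            # illustrative training vectors
centers = reduce_training_vectors(X, r=1.5)  # hypothetical radius
print(len(centers), "of", len(X), "vectors kept")
```

By construction every retained center is farther than r from every other center, which is what makes the centers a non-redundant summary of the training set.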

Fig. 2. (a) Principal component analysis and (b) independent
component analysis: the difference between PCA and ICA
interpretations of a bivariate data set.

V. RBFN STRUCTURE

The Radial Basis Function network (RBFN) has been found
very attractive for many engineering problems because it has
a very compact topology and its locally tuned neurons lead to
a high learning speed [21]. The RBFN is a feed-forward
architecture with an input layer, a hidden layer and an output
layer. The RBFN structure is shown in Fig. 4. The input layer
units are fully connected to the hidden layer units. In this
structure, the hidden nodes are named RBFN units. These
units are fully connected to the output layer units.

The activation function of the RBF units is expressed as
follows [7]:

R_i(X) = R_i(d_i(X)),   i = 1, 2, …, s    (7)

d_i(X) = ||X − C_i|| / σ_i    (8)

where d_i(X) is called the distance function of the i-th RBFN
unit, X = (x_1, x_2, …, x_n)^T is an n-dimensional input feature
vector, C_i is an n-dimensional vector called the center of the
i-th RBFN unit, σ_i is the width of the i-th RBFN unit and s is
the number of RBFN units. Typically, the Gaussian function is
chosen as the RBFN unit activation function:

R_i(X) = exp[−d_i²(X)]    (9)

The output units are linear and therefore the j-th output for
input X is given by equation (10):

y_j(X) = Σ_{i=1..s} R_i(X) W2(j, i) + b(j)    (10)

where W2(j, i) is the connection weight of the i-th RBFN unit
to the j-th output node and b(j) is the bias of the j-th output.
The bias is omitted in this network in order to reduce the
network complexity. Therefore, equation (10) can be contracted
into the simpler equation (11):

y_j(X) = Σ_{i=1..s} R_i(X) W2(j, i)    (11)

Fig. 3. A flow chart of the applied clustering method. k is the
number of input-output pairs, m is the number of clusters, Xctm is
the center of cluster number m and Clust is the set of generated
clusters.

Fig. 4. RBF neural network structure.
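The RBFN forward pass of equations (7)-(11) can be sketched as follows. The centers, widths and output weights here are random placeholders rather than trained values, since training (center selection and weight fitting) is outside the scope of this snippet.

```python
# RBFN forward pass: Gaussian hidden units R_i(X) = exp(-d_i^2(X)) with
# d_i(X) = ||X - C_i|| / sigma_i, and a linear bias-free output layer
# y_j(X) = sum_i R_i(X) * W2[j, i].
import numpy as np

def rbfn_forward(X, centers, widths, W2):
    """X: (n,) input; centers: (s, n); widths: (s,); W2: (outputs, s)."""
    d = np.linalg.norm(X - centers, axis=1) / widths   # eq. (8)
    R = np.exp(-d ** 2)                                # eq. (9)
    return W2 @ R                                      # eq. (11), no bias

rng = np.random.default_rng(4)
s, n = 50, 22                        # e.g. 50 hidden units, 22-dim input
centers = rng.standard_normal((s, n))
widths = np.full(s, 1.0)
W2 = rng.standard_normal((1, s))     # single output: the predicted MSV
x_in = rng.standard_normal(n)
y = rbfn_forward(x_in, centers, widths, W2)
print(y.shape)
```

Each hidden unit responds only near its center, which is the "local tuning" property credited for the RBFN's fast training.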

VI. NUMERICAL RESULTS

The 30-bus IEEE test system is selected to verify the


effectiveness of the proposed algorithms. It consists of 6
generators, 21 PQ buses and 41 lines. Load flow program
converges for 37 line outages and critical lines are recognized
under several loading conditions. In this paper, the 11
most critical lines are considered for the study. Randomly
changing the loads on PQ buses between 50% and 150% of
their base values, 2500 loading vectors are generated. 2000
vectors are used for training and the remaining vectors are
used for the test. In [9] all active and reactive loads on PQ
buses are considered for training. There are 21 PQ buses in
IEEE-30 bus system and the mentioned method in [9]
generates 42 dimensional input training vectors. This method
leads to a slow training process in large scale power systems
because of the large input set. Therefore, this method is not
suitable for large scale power systems. In the present article,
three different algorithms are proposed to reduce the
dimension of the input vectors and improve the training speed
of the neural network.
A. First algorithm

As mentioned in section III-A, three groups of parameters
are considered for feature extraction. The first group includes
the values of the active and reactive loads on the weak buses.

Weak buses can be identified using the RSV index. The result
of the bus ranking for the 10 weakest buses is presented in
Table I. The higher the rank, the weaker the bus is. In
addition to the weak buses, the buses which are the terminals
of the critical lines must be considered to enhance the
accuracy of contingency ranking. Finally, the selected buses
are: 26, 29, 30, 24, 21, 15, 12, 10, 4 and 2. The active and
reactive loads on these buses in addition to the third
parameter, mentioned in III-A, should be considered for
feature extraction. Using this algorithm, the dimensions of the
input vectors is reduced from 42 to 22.
Finally, the clustering method is applied in order to reduce
the number of training vectors. Selecting a neighborhood
radius equal to 0.085, 1024 clusters (training vectors) are
chosen for training. Therefore, the number of training vectors
is reduced from 2000 to 1024.
Table I. Weak bus ranking.

Rank  Bus  Index    Rank  Bus  Index
1     26   0.2118   16    10   0.1537
2     30   0.2062   17    16   0.1513
3     29   0.2005   18    12   0.1467
4     25   0.1883   19    13   0.1441
5     24   0.1777   20    9    0.1378
6     27   0.1743   21    11   0.1358
7     19   0.1719   22    28   0.1138
8     23   0.1715   23    8    0.1057
9     18   0.1703   24    6    0.1026
10    20   0.1693   25    7    0.0951
11    22   0.1629   26    4    0.0912
12    21   0.1630   27    5    0.0798
13    15   0.1587   28    3    0.0763
14    14   0.1574   29    2    0.0471
15    17   0.1555   30    1    1E-5
B. Second algorithm
In this method, PCA is employed in order to reduce the
dimension of the input training vectors. Using this algorithm,
it is found that 20 eigenvalues are much bigger than the
others. Hence, their corresponding eigenvectors (components)
are selected and, after implementing the steps described in
III-B, the dimension of the vectors is reduced from 42 to 20.
Ten principal components (as a sample) and their
corresponding variances are shown in Fig. 5. The sum of the
variances corresponding to all 42 components is 100%.
Finally, the clustering method is applied in order to reduce
the number of the vectors. Selecting a neighborhood radius
equal to 0.48, 1033 clusters (training vectors) are chosen for
training. Thus, the number of training vectors is reduced from
2000 to 1033.


Fig. 5. 10 principal components and their corresponding variances.

C. Third algorithm
In this section, ICA is applied for dimension reduction of
the input training vectors. Two different ICA algorithms have
been tested on the input training vectors. Then the
dimensionally reduced vectors are trained by RBF neural
network. These algorithms are Cardoso's Equivariant
Adaptive Separation via Independence (EASI) [22] and Bell
and Sejnowski's infomax algorithm (BSICA) [17]. The minimum
square error and training time for both algorithms are
exhibited in Table II. These values are obtained for a specific
desired vector dimension (2000 20-dimensional vectors).
Table II shows that the EASI algorithm leads to more accurate
training in comparison with BSICA. Hence, this algorithm
has been selected for the study. Using a trial and error
approach, it is found that 20 components out of all 42
components lead to an acceptable accuracy. So the dimension
of the input training vectors has been reduced from 42 to 20.
Table II. MSE and training speed corresponding to EASI and BSICA.

Algorithm   MSE      Training time (s)   Hidden layer neurons
EASI        0.0035   25.42               50
BSICA       0.0152   27.42               50

After that, the clustering algorithm is applied to reduce the
number of training vectors. Selecting a neighborhood
radius equal to 1.65, 1024 clusters (training vectors) are
chosen for training. The performance (speed and accuracy)
of the three proposed algorithms, compared with the case
where the active and reactive loads at all PQ buses are
considered for training (case 1), is shown in Table III. Note
that in case 1 there are 2000 42-dimensional input training
vectors. The number of hidden layer neurons for each algorithm
has been obtained by trial and error. This study
demonstrates that the first algorithm, which is based on weak
bus identification, yields a more accurate and faster training
procedure in comparison with PCA and ICA. In addition,
ICA shows a better performance than PCA. Results of voltage
stability index prediction for the base case (system with no
contingencies) and contingency ranking for two different
loading conditions, using the proposed algorithms, are
exhibited in Table IV. It is obvious that fast performance,
accurate evaluation and good prediction accuracy for the
voltage stability index have been obtained.

VII. CONCLUSION

In this paper, an RBF neural network is employed to
precisely predict the voltage stability index (MSV) and to
rank contingencies under different loading conditions. Three
different algorithms are proposed to reduce the dimension
and the number of the training vectors and thereby improve
the speed of the neural network training process. The first
algorithm is a novel feature extraction approach based on
power system engineering concepts. The second and third
are statistical methods. These algorithms exhibit good
performance in voltage stability prediction and online
contingency ranking, whereas computing the MSV using
conventional methods is very time consuming for large scale
power systems.

Table III. Performance of the proposed algorithms.

Algorithm   MSE      Training time (s)   Hidden layer neurons
case 1      0.0006   357.7               300
weak bus    0.0031   7.4                 50
PCA         0.0104   92.5                300
ICA         0.0043   7.4                 50

Table IV. Results of voltage stability index prediction and online contingency ranking for two different loading conditions (a) and (b).

(a)
           Load flow     case 1        weak bus      PCA           ICA
MSV base   0.1652        0.1653        0.1650        0.1657        0.1655
           rank  MSV     rank  MSV     rank  MSV     rank  MSV     rank  MSV
           2   0.0233    2   0.0225    2   0.0231    2   0.0250    2   0.0214
           11  0.0981    11  0.0982    11  0.1002    11  0.0890    11  0.0957
           12  0.0991    12  0.0995    12  0.1006    12  0.1015    12  0.1017
           9   0.1100    9   0.1101    9   0.1103    9   0.1107    9   0.1106
           3   0.1310    3   0.1311    3   0.1319    3   0.1333    3   0.1324
           10  0.1431    10  0.1432    10  0.1436    10  0.1421    10  0.1431
           4   0.1439    4   0.1441    4   0.1445    4   0.1437    4   0.1435
           7   0.1442    7   0.1443    7   0.1452    7   0.1445    7   0.1452
           8   0.1503    8   0.1505    8   0.1518    6   0.1513    8   0.1511
           6   0.1533    6   0.1534    6   0.1520    8   0.1516    6   0.1528
           5   0.1570    5   0.1578    5   0.1573    5   0.1593    5   0.1555

(b)
MSV base: 0.1627 under all five columns (load flow, case 1, weak bus, PCA and ICA).
Load flow contingency ranking (rank: MSV): 2: 0.0246, 12: 0.0817, 11: 0.1000, 9: 0.1090, 3: 0.1318, 10: 0.1387, 4: 0.1406, 7: 0.1424, 5: 0.1461, 8: 0.1492, 6: 0.1534.

VIII. REFERENCES

[1] S. Sahari, A. F. Abidin, and T. K. Abdulrahman, "Development of
Artificial Neural Network for Voltage Stability Monitoring,"
National Power and Energy Conference (PECon) 2003 Proceedings,
Bangi, Malaysia.
[2] T. Jain, L. Srivastava and S. N. Singh, "Fast Voltage Contingency
Screening using Radial Basis Function Neural Network," IEEE
Trans. Power Syst., vol. 18, no. 4, pp. 705-715, Nov. 2003.
[3] M. M. Begovic and A. G. Phadke, "Control of Voltage Stability using
Sensitivity Analysis," IEEE Trans. on Power Systems, vol. 7, no. 1,
pp. 114-123, Feb. 1992.
[4] N. Flatabo, et al., "Voltage Stability Condition in a Power
Transmission System calculated by Sensitivity Analysis," IEEE
Trans. on Power Systems, vol. 5, no. 4, pp. 1286-1293, Nov. 1990.
[5] Y. L. Chen, C. W. Chang and C. C. Liu, "Efficient Methods for
Identifying Weak Nodes in Electrical Power Networks," IEE Proc.
Gener. Transm. Distrib., vol. 142, no. 3, May 1995.
[6] M. T. Haque and A. M. Kashtiban, "Application of neural networks in
power systems - a review," Transactions on Engineering, Computing
and Technology, 2005, 6, pp. 53-57.
[7] M. L. Scala, M. Trovato and F. Torelli, "A neural network-based
method for voltage security monitoring," IEEE Trans. Power Syst.,
1996, 11, (3), pp. 1332-1341.
[8] D. Popovic, D. Kukolj and F. Kulic, "Monitoring and assessment of
voltage stability margins using artificial neural networks with a
reduced input set," IEE Proc. Gener. Transm. Distrib., 1998, 145,
(4), pp. 355-362.
[9] H. B. Wan and Y. H. Song, "Hybrid supervised and unsupervised
neural network approach to voltage stability analysis," Electric
Power Systems Research, 1998, 47, (2), pp. 115-122.
[10] L. Srivastava, S. N. Singh and J. Sharma, "Estimation of loadability
margin using parallel self-organizing hierarchical neural network,"
Computers and Electrical Engineering, 2000, 26, (2), pp. 151-167.
[11] S. Chakrabarti and B. Jeyasurya, "On-line voltage stability monitoring
using artificial neural network," Proc. 2004 Large Engineering
Systems Conference on Power Engineering, Westin Nova Scotian,
Canada, July 2004, pp. 71-75.
[12] D. K. Ranaweera and G. G. Karady, "Active power contingency
ranking using a radial basis function network," Int. J. Eng. Intell.
Syst. for Elect. Eng. Communications, vol. 2, no. 3, pp. 201-206,
Sept. 1994.
[13] D. F. Morrison, Multivariate Statistical Methods, McGraw-Hill,
New York, 1976.
[14] L. I. Smith, A Tutorial on Principal Components Analysis, 26 February
2002 (http://kybele.psych.cornell.edu/~edelman/Psych-465-spring2003/PCA-tutorial).
[15] R. B. Panerai, A. Luisa, A. S. Ferreira and O. F. Brum, "Principal
component analysis of multiple noninvasive blood flow derived
signals," IEEE Trans. Biomed. Eng., 35 (1998) 7.
[16] P. Comon, "Independent component analysis - a new concept?"
Signal Processing, 36(3), 287-314, 1994.
[17] A. J. Bell and T. J. Sejnowski, "An information-maximization
approach to blind separation and blind deconvolution," Neural
Computation, 7(6), 1129-1159, 1995.
[18] Z. Roth and Y. Baram, "Multidimensional density shaping by
sigmoids," IEEE Transactions on Neural Networks, 7(5), 1291-1298,
1996.
[19] M. Kermit and O. Tomic, "Independent component analysis
applied on gas sensor array measurement data," IEEE Sensors
Journal, 3(2), 218-228, 2003.
[20] H. Spath, Cluster Dissection and Analysis: Theory, FORTRAN
Programs, Examples, translated by J. Goldschmidt, Halsted Press,
New York, 1985, 226 pp.
[21] J. Haddadnia and K. Faez, "Neural network human face recognition
based on moment invariants," Proceedings of the IEEE International
Conference on Image Processing, Thessaloniki, Greece, pp. 1018-1021,
7-10 October 2001.
[22] J.-F. Cardoso and B. H. Laheld, "Equivariant adaptive source
separation," IEEE Transactions on Signal Processing, vol. 44, issue
12, Dec. 1996.