Gianfranco Chicco, Member, IEEE, Roberto Napoli, Member, IEEE, Federico Piglione
Abstract- Load forecasting algorithms try to capture regular behaviours in historic load time series in order to perform an accurate forecast. The presence of anomalous days (holidays, working days between holidays, social events) is a serious drawback and requires a dedicated forecast. The successful application of Artificial Neural Networks (ANN) in this field suggested the use of the Kohonen Self-Organising Map for clustering the similar load patterns and classifying day typologies. In order to evaluate the benefits of this choice, this work compares the Kohonen map with a classic clustering algorithm, both applied to grouping the daily load patterns in homogeneous sets. The information gathered by the clustered data is then applied to the 24-hour ahead load forecasting of anomalous days, by means of an ANN-based approach. The results show that the combined use of both clustering techniques allows better understanding of the anomalous load patterns.
Index terms-short-term load forecasting, cluster analysis, artificial neural networks, self-organising map, radial basis function
In ANN-based STLF, a cluster of similar load patterns belonging to a given day typology forms the Training Set (TS) of a feed-forward ANN that learns to forecast that specific day typology. The method described in [2] uses an unsupervised algorithm that forms several clusters of similar load patterns; a dedicated Functional Link Network (FLN) is then trained for each cluster. In References 3 and 4, a Self-Organising Map (SOM) assists the human expert in finding the TS of a dedicated MultiLayer Perceptron (MLP). Moreover, the SOM has been directly employed as an associative memory to find a first guess of the forecasted load profile [5]. There are, however, some differences between using the SOM and using clustering algorithms. The SOM recognises clusters of similar patterns in a set of raw data very well. Moreover, it has the interesting property of topological preservation, i.e. similar data are grouped together in the same region of the map. In fact, the SOM projects the n-dimensional space of the input data onto the two-dimensional grid of the map units. This feature is very useful for visual inspection of the peculiarities of the data set, though some skill is required to train the SOM correctly. Topological preservation does not exist in clustering algorithms, which only split the data into several classes. If only the latter information is needed, the problems concerning design and training of the SOM can be avoided. This paper compares the performance of the SOM and of clustering algorithms in the classification of daily load patterns, mainly in relation to the anomalous days problem. First, the algorithms and their features are summarised. Afterwards, some anomalous periods are considered and the classification uncertainties discussed. Finally, the hourly load forecast of the anomalous days is performed by means of the clustered data and a forecasting model based on the Radial Basis Function (RBF) ANN.
I. INTRODUCTION
Short-Term Load Forecasting (STLF) predicts the power system load 1-7 days in advance. It is therefore one of the main tools in power system management, allowing safe and economical operation. The traditional STLF procedures are based on statistical approaches, such as multiple regression, the Box-Jenkins method, and spectral decomposition. In the last decade Artificial Neural Networks (ANN) have been increasingly regarded as an effective approach to the STLF problem [1]. Whatever the method employed, it is widely recognised that better forecasts are obtained if the historical time series present a regular behaviour, which can be easily captured by the forecast algorithm. Anomalous days (holidays, working days between holidays) therefore require dedicated forecasts. In order to forecast the next day's load, the human expert searches the historic database for similar days and drafts a forecast, which is subsequently refined by means of current information on weather and social events. In the same way, the grouping of similar daily load patterns allows prediction algorithms to reduce the forecasting error.
Paper PPT-377 accepted for presentation at the IEEE Porto Power Tech 2001 Conference, Porto, Portugal, September 10-13, 2001. The authors are with the Dipartimento di Ingegneria Elettrica Industriale, Politecnico di Torino, corso Duca degli Abruzzi 24, I-10129 Torino, Italy (E-mail: chicco@polito.it, rnapoli@polito.it, piglione@athena.polito.it)
grid. The learning algorithm, inspired by biological considerations, changes not only the weights of the winning unit (the unit whose weight vector is nearest, according to some distance criterion, to the presented sample), but also the weights of its neighbours in inverse proportion of their distance from the winning unit (neighbourhood function). During the training, separate areas (bubbles of activity), which follow the probability distribution of the TS samples, grow up spontaneously in the map. The most attractive feature of the SOM is that, once trained, the map represents the projection of the TS data, belonging to an n-dimensional space, into a bidimensional one. The mutual distances between the bubbles are then proportional to those in the original data space. The topological preservation of the input structure and the simulation of the bubbles of activity distinguish this approach from the traditional unsupervised clustering approaches.
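The update rule described above (move the winning unit and, to a lesser extent, its neighbours towards each presented sample) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the implementation used for the tests: the Gaussian neighbourhood function, the random initialisation from training samples, the floor on the radius and the names `train_som` and `winning_unit` are all assumptions.

```python
import numpy as np

def train_som(samples, rows=10, cols=10, steps=10000, lr0=0.05, radius0=10.0, seed=0):
    """Train a rectangular SOM; returns weights of shape (rows, cols, n_features)."""
    rng = np.random.default_rng(seed)
    samples = np.asarray(samples, dtype=float)
    n_features = samples.shape[1]
    # Assumption: initialise the unit weights with randomly chosen training samples.
    idx = rng.integers(0, len(samples), size=rows * cols)
    weights = samples[idx].reshape(rows, cols, n_features).copy()
    # Grid coordinates of every unit, used to measure distances on the map.
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)
    for step in range(steps):
        frac = 1.0 - step / steps
        lr = lr0 * frac                    # learning rate decreasing linearly to zero
        radius = max(radius0 * frac, 0.5)  # shrinking neighbourhood (floor is an assumption)
        x = samples[rng.integers(len(samples))]
        # Winning unit: the one whose weight vector is nearest to the sample.
        dist = np.linalg.norm(weights - x, axis=2)
        winner = np.unravel_index(np.argmin(dist), dist.shape)
        # Assumed Gaussian neighbourhood: the update fades with grid distance.
        grid_dist2 = np.sum((grid - np.array(winner)) ** 2, axis=2)
        h = np.exp(-grid_dist2 / (2.0 * radius ** 2))
        weights += lr * h[..., None] * (x - weights)
    return weights

def winning_unit(weights, x):
    """Grid position (row, col) of the unit nearest to sample x."""
    dist = np.linalg.norm(weights - np.asarray(x, dtype=float), axis=2)
    return np.unravel_index(np.argmin(dist), dist.shape)
```

Calling `winning_unit` for every daily profile and plotting the day numbers at the returned grid positions gives the kind of map inspected visually in the text.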
For the numerical tests we employed hourly load data extracted from the database of a small Italian electric utility. This utility supplies a mixed industrial, commercial and residential load. The winter load peak on weekdays is about 270 MW, with a base load of 100 MW. A database with three years (1995-1997) of hourly load data was available. These data were analysed by the authors in Reference 10, in order to develop an ANN-based load forecasting method for ordinary weekdays. A correlation analysis showed that the behaviour of the time series is nearly autoregressive, since the influence of the exogenous variables is weak. In particular, the correlation with weather variables is weak, so that they act mostly as a seasonal effect. Two cases are presented: a winter month (December 1996) with several holidays and some anomalous days (days between holidays), and a spring month (April 1996) which includes the Easter week.
Unsupervised clustering algorithms, such as k-means, Isodata and Maximin distance [8], are able, like the SOM, to discover regularities in data sets. In this work we employed the algorithm proposed by Pao [2,9], whose main features are efficiency and simplicity. The algorithm is a variant of the classic 'follow the leader' approach and requires neither an initial guess of the cluster centre coordinates nor the initial number of clusters. The clustering process is controlled by a threshold called the Vigilance Parameter (VP) and uses the Euclidean metric. The first sample is selected as the centre of the first cluster. Then, the next sample is compared to the first cluster centre. If the Euclidean distance is smaller than the VP, it is clustered with the first. Otherwise, it becomes the centre of a new cluster. This process is repeated for all samples, and reiterated until a stable cluster formation occurs. The VP is then the average radius of the clusters. A good heuristic guess for the VP is half of the Mean Euclidean Distance (MED) among the N samples x_i, defined by:
Fig. 1 shows the monthly load pattern of December 1996. The first day is a Sunday; then, beginning from the fourth Sunday, the load shape becomes anomalous, owing to the Christmas holidays.
[Plot: load (MW) versus hours]
Fig. 1. December 1996: monthly load pattern.
The TS is composed of 31 daily load patterns of 24 elements each. At first, we trained a SOM grid of 10x10 units. The initial learning rate, decreasing linearly to zero during the training, was 0.05 and the initial radius (in units) of the training area was 10. A training session of 10000 steps was performed. The resulting map is shown in Fig. 2, where the radius of the circles representing the units is proportional to the number of points classified by each unit and the day number is superimposed on the map. Four main activity bubbles, which can be separated by decision boundaries, are evident:
The computational speed of the Pao algorithm easily allows repeated tries in order to find a satisfactory value of the VP.
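The follow-the-leader clustering controlled by the VP can be sketched as below. This is a minimal sketch under stated assumptions, not the Pao algorithm as published: each sample is compared with the nearest existing centre, centres are recomputed as the mean of their members after every pass, and the MED is taken as the average distance over all distinct pairs of samples. The function names are hypothetical.

```python
import numpy as np

def mean_euclidean_distance(samples):
    """Average Euclidean distance over all distinct pairs (assumed definition of the MED)."""
    x = np.asarray(samples, dtype=float)
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    n = len(x)
    # The diagonal is zero and every pair is counted twice, hence n * (n - 1).
    return d.sum() / (n * (n - 1))

def leader_clustering(samples, vp, max_passes=100):
    """Cluster samples with a vigilance parameter vp; returns (labels, centres)."""
    x = np.asarray(samples, dtype=float)
    centres = [x[0].copy()]          # the first sample is the first cluster centre
    labels = np.full(len(x), -1)
    for _ in range(max_passes):
        new_labels = np.empty(len(x), dtype=int)
        for i, sample in enumerate(x):
            dist = np.linalg.norm(np.asarray(centres) - sample, axis=1)
            k = int(np.argmin(dist))
            if dist[k] < vp:
                new_labels[i] = k    # close enough: join the nearest cluster
            else:
                centres.append(sample.copy())   # otherwise start a new cluster
                new_labels[i] = len(centres) - 1
        # Assumption: centres become the mean of their members; empty clusters are dropped.
        used = np.unique(new_labels)
        centres = [x[new_labels == k].mean(axis=0) for k in used]
        new_labels = np.searchsorted(used, new_labels)
        if np.array_equal(new_labels, labels):
            break                    # stable cluster formation reached
        labels = new_labels
    return labels, np.asarray(centres)
```

A typical call would be `leader_clustering(profiles, vp=mean_euclidean_distance(profiles) / 2)`, with the VP then adjusted by repeated tries as the text suggests.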
IV. APPLICATION TO LOAD PATTERN CLUSTERING
The performances of the SOM and the Pao unsupervised clustering algorithm have been compared in the task of clustering daily load patterns. The aim is to detect the anomalous days and to obtain an adequate grouping of similar load patterns.
A) normal working days;
B) Sundays 1, 8, 15, 22;
C) Saturdays 7, 14, 21, Christmas and New Year Eves;
D) Christmas Day, Thursday 26 (holiday), and Sunday 29.
Moreover, there are some anomalous days classified separately. They are 23, 27, 28 and 30 (working days enclosed between holidays), whose attribution to one of the four main bubbles by visual inspection is rather doubtful.
10x10 units. In the resulting map (Fig. 3), with MQE 12.2 MW, four main activity bubbles are evident:
A) normal working days;
B) Sundays 14th, 21st, 28th, and Thursday 25th (Italian national holiday);
C) Saturdays 6th, 13th, 20th, 27th;
D) Sunday 7th (Easter) and Monday 8th (holiday).
Moreover, there is an anomalous day (Friday 26th), classified separately because the nearby holidays obviously encourage a long weekend. The Pao clustering algorithm, applied with VP = 80 MW (the MED was 190 MW), produced the five clusters shown in Table II. In this case, the classification was identical to that of the SOM.
TABLE II - April 1996: clustered load patterns
Cluster  Days
1        Mo 1, Tu 2, We 3, Th 4, Fr 5, Tu 9, We 10, Th 11, Fr 12, Mo 15, Tu 16, We 17, Th 18, Fr 19, Mo 22, Tu 23, We 24, Mo 29, Tu 30
2        Su 14, Su 21, Th 25, Su 28
3        Su 7, Mo 8
4        Sa 6, Sa 13, Sa 20, Sa 27
5        Fr 26

For a TS of N samples, a useful index of the quantisation quality of the SOM is the Mean Quantisation Error (MQE), given by:

$$ MQE = \frac{1}{N} \sum_{i=1}^{N} \lVert x_i - m_i \rVert \qquad (2) $$
where x_i is the generic sample and m_i is the winning map unit for the sample x_i. In the map of Fig. 2 the MQE was 11.4 MW. To apply the Pao clustering algorithm to the same case, we computed the MED of the TS, which was 207 MW. Therefore, we chose 100 MW as the initial guess of the VP. The algorithm produced the four clusters shown in Table I. They are very similar to those deduced from the SOM, and group the working days (cluster 1), the Sundays (cluster 2), the Saturdays and the holiday eves (cluster 3), and finally the Christmas holidays (cluster 4). The main difference is that the anomalous days (December 23, 27, 28, 30), which the SOM classified separately, now belong to clusters 2 and 3.
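The mean quantisation error can be computed with the same routine for SOM units and for cluster centres, which makes the two methods directly comparable. The sketch below assumes the usual definition (average Euclidean distance between each sample and its nearest unit or centre); the function name `mean_quantisation_error` is hypothetical.

```python
import numpy as np

def mean_quantisation_error(samples, units):
    """Average distance between each sample and its nearest unit (or cluster centre)."""
    x = np.asarray(samples, dtype=float)
    # Accept SOM weights of shape (rows, cols, n) as well as a flat list of centres.
    m = np.asarray(units, dtype=float).reshape(-1, x.shape[1])
    d = np.linalg.norm(x[:, None, :] - m[None, :, :], axis=2)
    return d.min(axis=1).mean()

# Two samples, one unit at the origin: distances are 0 and 5, so the MQE is 2.5.
print(mean_quantisation_error([[0.0, 0.0], [3.0, 4.0]], [[0.0, 0.0]]))
```

With few centres, as produced by a clustering run, the value is naturally larger than with the many units of a SOM grid.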
Fig. 3. SOM trained on April 1996.

TABLE I - December 1996: clustered load patterns
Cluster  Days
1        Mo 2, Tu 3, We 4, Th 5, Fr 6, Mo 9, Tu 10, We 11, Th 12, Fr 13, Mo 16, Tu 17, We 18, Th 19, Fr 20
2        Su 1, Su 8, Su 15, Su 22, Sa 28
3        Sa 7, Sa 14, Sa 21, Mo 23, Tu 24, Fr 27, Mo 30, Tu 31
4        We 25, Th 26, Su 29

The computed MQE is 33.3 MW. This value is larger than that obtained by the SOM, but the SOM usually has many units in the same activity bubble, and therefore the average distance between samples and grid units is smaller. On the other hand, the data grouping is more definite with the clustering algorithm, which produces only a few cluster centres. In this case, the anomalous days have been forced into the main clusters.
A. Cluster-based forecast model
We employed an original RBF-based algorithm, previously described in Reference 10. Because anomalous days do not have meaningful correlation with the immediately previous days, the method employs 24-hour load profiles picked from the same cluster. The load forecasting model has been
Case II. April 1996
The TS is composed of 30 daily load patterns with 24 elements each. Also in this case, we used a SOM grid of
accordingly modified: (3)
where:
t: time (hour)
L(t): hourly load at time t
h(t): hour-of-day number (1-24)
The forecast procedure is then arranged as follows:
(i) Cluster the load profiles of a given period (e.g. a month) of the previous year and check for anomalous days.
(ii) Locate in the present year the anomalous day in the corresponding calendar position.
(iii) According to model (3), build up a TS composed of the load profiles of the present year in the corresponding calendar position.
(iv) Use the RBF in train/recall mode according to the method of Reference 10.
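The train/recall step of the procedure above can be illustrated with a generic Gaussian RBF network. The sketch below is not model (3) and not the fast-learning RBF of Reference 10: it assumes that the only input is the hour-of-day number h(t), that the centres are fixed and evenly spaced, and that the output weights are found by linear least squares. All names and parameter values are hypothetical.

```python
import numpy as np

def rbf_design(hours, centres, width):
    """Gaussian RBF activations for each hour, plus a constant bias column."""
    h = np.asarray(hours, dtype=float)[:, None]
    c = np.asarray(centres, dtype=float)[None, :]
    phi = np.exp(-((h - c) ** 2) / (2.0 * width ** 2))
    return np.hstack([phi, np.ones((phi.shape[0], 1))])

def fit_rbf_profile(profiles, n_centres=12, width=2.0):
    """Train on the 24-hour profiles of similar days (array of shape n_days x 24)."""
    profiles = np.asarray(profiles, dtype=float)
    hours = np.tile(np.arange(1, 25), profiles.shape[0])  # h(t) for every training sample
    targets = profiles.ravel()                            # L(t) in the same order
    centres = np.linspace(1, 24, n_centres)               # assumed fixed centres
    phi = rbf_design(hours, centres, width)
    weights, *_ = np.linalg.lstsq(phi, targets, rcond=None)
    return centres, width, weights

def recall_rbf_profile(model):
    """Recall mode: forecast load for hours 1..24."""
    centres, width, weights = model
    return rbf_design(np.arange(1, 25), centres, width) @ weights
```

In this simplified form the network returns a smoothed profile typical of the days in the TS, which is why the choice of the cluster supplying the TS matters so much for anomalous days.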
together with the actual load profile of Saturday 27 (line with circles). The hypothesis is thus verified, as is also shown in Fig. 6, where the very different load pattern of a normal Saturday (20 December) is presented together with the profiles of Fig. 5. Finally, the load forecast of Saturday 27 is obtained by training an RBF with the available similar load patterns, i.e. Sundays 7, 14 and 21 December 1997 and Saturday 28 December 1996. The actual and forecast load patterns are shown in Fig. 7. The Mean Absolute Percentage Error (MAPE) was 4.4%, with a maximum absolute error of 13.8 MW.
B. Forecast examples
As a first example, let us consider 28 December 1996, a Saturday enclosed between the Christmas holidays, which was classified in a doubtful way by the SOM (see Fig. 2). By contrast, the Pao algorithm classified that day in cluster 2, together with the December Sundays. This is confirmed by visual inspection of the load patterns. Fig. 4 shows the load patterns of the five days grouped in cluster 2 of Table I. It is clear that the load profile of Saturday 28 (line with circles) is anomalous and very similar to those of the four December Sundays.
[Plot: load (MW) versus hour]
Fig. 4. Load profiles of cluster 2 of Table I.
In order to exploit in December 1997 the information gathered in the previous year, the corresponding calendar position must be taken into account. In 1997 the day corresponding to Saturday 28 December 1996 is Saturday 27 December. Therefore, the hypothesis is that Saturday 27 would be an anomalous day and belong to the cluster of the Sundays of December 1997. The load profiles of the first three Sundays of December 1997 are shown in Fig. 5,
Similar results have been obtained in the forecast of Tuesday 30 December 1997. This is an anomalous weekday included between the Christmas and New Year holidays. The corresponding day in the previous year is Monday 30 December 1996, that is classified in a doubtful way by the SOM (see Fig. 2), but belongs to cluster 3 in Table I, with the Saturdays and some other anomalous days. Let's apply this information to December 1997. The load profiles of the first three Saturdays of December 1997 and the Christmas Eve are shown in Fig. 8, together with the actual load profile of Tuesday 30 (line with circles).
profiles are compared in Fig. 12. The MAPE is 4.3%, with a maximum absolute error of 14 MW.
In order to forecast the load profile of Tuesday 30, we then build up a TS composed of the days 6, 13, 20 and 24 of December 1997, and Monday 30 December 1996. The result is shown in Fig. 9. The MAPE is 3.3%, with a maximum absolute error of 13.9 MW.
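The two accuracy figures quoted for each forecast follow from the hourly actual and forecast loads as sketched below. The definitions are the usual ones (mean of the hourly absolute percentage errors, and largest hourly absolute error); the function names and the two-hour example values are purely illustrative.

```python
import numpy as np

def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent."""
    a = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    return 100.0 * np.mean(np.abs(a - f) / a)

def max_abs_error(actual, forecast):
    """Largest absolute error, in the same unit as the loads (MW here)."""
    a = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    return np.max(np.abs(a - f))

# Illustrative values only: hourly errors of 10% and 5% give a MAPE of 7.5%.
print(mape([100.0, 200.0], [110.0, 190.0]), max_abs_error([100.0, 200.0], [110.0, 190.0]))
```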
[Plot: load (MW) versus hour; legend includes "actual"]
Fig. 9. Load forecasting of Tuesday 30 December 1997.
Finally, we applied this method to a spring midweek holiday, 25 April (Italian National Holiday). For 1996, both the SOM and the Pao algorithm classify this holiday together with the April Sundays, as shown in Fig. 3 and Table II. Sunday 7 April 1996 is Easter Sunday and obviously falls in a separate cluster. These load profiles are shown in Fig. 10, where Thursday 25 April is represented by a line with circles. The corresponding load profiles in April 1997, presented in Fig. 11, show the same similarities as in the previous year. The TS is composed of Sundays 6, 13 and 20 April 1997, and Thursday 25 April 1996. The forecast and actual load
draw the correct boundary. The problem could derive from the nature of the data, but also from the grid shape and the training parameters. Therefore, a time-consuming trial-and-error procedure is often necessary in order to obtain a satisfactory map. The Pao clustering algorithm does not produce a refined visual projection of the data space; nevertheless, it performs a fast and efficient clustering. In fact, the pattern clusters in the previous examples are almost identical to those obtained by the SOM. The advantage is that there is only one training parameter, the VP, and the execution time is almost negligible. Obviously, it is possible to reduce the quantisation error by reducing the VP. The optimal clustering then becomes a compromise between the average cluster variance and the number of clusters. Also in this case a trial-and-error procedure is required, but it is made easier by the higher execution speed. In order to forecast the hourly load pattern of anomalous days, the combined use of both clustering methods proved very effective in discovering the hidden similarities. The experimental results are quite good for anomalous days, since a MAPE of 4% corresponds in our data to an average absolute error of about 8 MW. In fact, the useful historical references for anomalous days are rather far in the past and short-term effects are hard to capture. We conclude that the SOM produces a useful planar map of the clustered data, but sometimes it does not explain the data that fall outside the activity bubbles. On the other hand, the Pao algorithm (like other clustering algorithms) is less suited to the visual approach, but allows a simple direct measure of the pattern similarity. Therefore, the judicious use of both methods seems to be the best approach.
VII. REFERENCES
[1] H.S. Hippert, C.E. Pedreira, R.C. Souza, Neural Networks for Short-Term Load Forecasting: A Review and Evaluation, IEEE Transactions on Power Systems, Vol. 16, N. 1, Feb. 2001, pp. 44-55
[2] M. Djukanovic, B. Babic, D.J. Sobajic, Y.-H. Pao, Unsupervised/supervised learning concept for 24-hour load forecasting, IEE Proceedings-C, 140, 1993, pp. 311-318
[3] Y.-Y. Hsu, C.-C. Yang, Design of artificial neural networks for short-term load forecasting. Part 1: Self-organising feature maps for day type identification, IEE Proceedings-C, 138, 1991, pp. 407-413
[4] R. Lamedica, A. Prudenzi, M. Sforna, M. Caciotta, V. Orsolini Cencelli, A neural network based technique for short-term forecasting of anomalous load periods, IEEE Transactions on Power Systems, Vol. 11, N. 4, Nov. 1996, pp. 1749-1756
[5] A.J. Germond, N. Macabrey, T. Baumann, Application of artificial neural networks to load forecasting, Proc. INNS Summer Workshop Neural Networks Computing for the Electric Power Industry, Stanford, CA (August 17-19
Technology, 1995
[8] J.T. Tou, R.C. Gonzalez, Pattern Recognition Principles, Addison-Wesley, 1974
[9] Y.-H. Pao, Adaptive Pattern Recognition and Neural Networks, Addison-Wesley, Reading, MA, USA, 1989
[10] E. Bompard, E. Carpaneto, G. Chicco, R. Napoli, F. Piglione, Short-term load forecasting of a small electric utility by a fast learning RBF neural network, Proc. PMAPS 2000, Funchal, Madeira, Portugal, September 25-28, 2000, Vol. 2, paper FOR-129