
Applied Energy 88 (2011) 368–375

Contents lists available at ScienceDirect

Applied Energy
journal homepage: www.elsevier.com/locate/apenergy

Modeling and prediction of Turkey's electricity consumption using Support Vector Regression
Kadir Kavaklioglu *
Pamukkale University, Computer Science Department, Denizli, Turkey

Article info

Article history: Received 21 February 2010; received in revised form 9 July 2010; accepted 24 July 2010; available online 30 August 2010.

Keywords: Electricity consumption; Support Vector Regression; Turkey; Energy modeling; Time series; Prediction

Abstract

Support Vector Regression (SVR) methodology is used to model and predict Turkey's electricity consumption. Among various SVR formalisms, the ε-SVR method was used since the training pattern set was relatively small. Electricity consumption is modeled as a function of socio-economic indicators such as population, Gross National Product, imports and exports. In order to facilitate future predictions of electricity consumption, a separate SVR model was created for each of the input variables using their current and past values, and these models were combined to yield consumption prediction values. A grid search for the model parameters was performed to find the best ε-SVR model for each variable based on Root Mean Square Error. Electricity consumption of Turkey is predicted until 2026 using data from 1975 to 2006. The results show that electricity consumption can be modeled using Support Vector Regression and the models can be used to predict future electricity consumption. © 2010 Elsevier Ltd. All rights reserved.

Abbreviations: ANN, Artificial Neural Network; GNP, Gross National Product; MAED, Model for Analysis of Energy Demand; MENR, Ministry of Energy and Natural Resources; RMS, Root Mean Square; RMSE, Root Mean Square Error; SPO, State Planning Organization; SVC, Support Vector Classification; SVM, Support Vector Machine; SVR, Support Vector Regression; TEIAS, Turkish Electricity Transmission Company; TPES, Total Primary Energy Supply.
* Address: Pamukkale University, Computer Science Department, 20070 Kinikli, Denizli, Turkey. Tel.: +90 258 296 3329; fax: +90 258 296 3262. E-mail addresses: kkavaklioglu@pau.edu.tr, kkavaklioglu@hotmail.com
0306-2619/$ - see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.apenergy.2010.07.021

1. Introduction

Electricity is one of the main forms of energy that modern life is built upon. It affects a society's quality of living, the efficiency and quality of its work, and its manufacturing and competitiveness in an ever-growing global world. Governments and their related branches in developed and developing countries put major emphasis on modeling and predicting electricity consumption. Forecasting errors for electricity consumption result in either shortages or excess capacity, both of which are highly undesirable from a financial standpoint. Therefore, modeling electricity consumption with a high degree of accuracy is vital in order to avoid these costly mistakes.

As a developing country and an emerging market, Turkey expects an increasing but somewhat volatile behavior in electricity consumption in the future as its economy moves rapidly to respond to expansions and crises of national and global scale. Because of its limited energy resources, Turkey depends on imported energy sources both for electricity production and for general energy use. For instance, imported oil and gas contribute about 62% of the Total Primary Energy Supply (TPES) of Turkey, and the country imported 77% of its energy needs in 2004 according to the WECTNC report published in 2006 [1]. Due to this extreme dependency on foreign fossil fuels, building models for Turkey's electricity consumption and accurately predicting future electricity consumption are essential.

Studies on energy prediction in Turkey are carried out by the Ministry of Energy and Natural Resources (MENR) and the State Planning Organization (SPO). The MENR uses the Model for Analysis of Energy Demand (MAED), which historically has not produced reliable intermediate- to long-term results for Turkey. This has motivated the present research to study a different modeling technique for forecasting Turkey's electricity consumption.

A review of the literature shows that there are numerous studies on the relationship between energy consumption and the economy, which inspired the selection of input variables for the present paper, such as Ebohan in 1996 [2], Kavrakoglu in 1983 [3], Say and Yucel in 2006 [4], Uri in 1980 [5] and Yu and Been in 1984 [6]. Total and sectoral energy modeling and prediction studies have been carried out by many researchers. Gilland [7] developed an energy demand projection of the world for the years 2000 and 2020. Ediger and Tatlidil [8] and Ediger and Camdali [9] proposed an approach that uses the analysis of cyclic patterns in historical curves to predict the primary energy demand in Turkey. Yumurtaci and Asmaz [10] proposed an approach to calculate the future energy demand of Turkey for the period between 1980 and 2050, based on the population and energy consumption increase rates per capita. Sozen et al. [11] used Artificial Neural Networks (ANN) to predict Turkey's net energy consumption. Toksari [12] developed an ant colony energy demand estimation model for Turkey. Akay and Atak [13] proposed an approach using grey prediction with rolling mechanism (GPRM) to predict Turkey's total and industrial electricity consumption. Sozen and Arcaklioglu [14] developed energy source estimation equations using the ANN approach in order to estimate future projections and make correct investments in Turkey. Murat and Ceylan [15] concluded that energy consumption can be modeled with ANNs, but that the ANN models they built, while good at fitting current data, are weak at future prediction because they do not use any mathematical model. Hamzacebi [16] used ANNs with time series structure to predict Turkish electricity consumption.

The goal of accurate modeling of consumption requires attention to a few extremely important points. The first point is to identify all the necessary variables and parameters that contribute to electricity consumption in a given country. In this paper, the ε-SVR models take time, population, Gross National Product (GNP), import and export values as inputs to model and predict the total electricity consumption of Turkey. Consumption is clearly related to population: as the population increases, more electricity is consumed. Imports and exports for Turkey are related to manufacturing processes and therefore strongly affect electricity consumption. Finally, GNP is a measure of all economic activities, and increasing GNP means improved living standards and thus increased energy use. The second point is to choose a modeling methodology that can handle the difficulties of the consumption modeling task.
One difficulty in this area, as in many other modeling studies, is that the relationship between the input variables and the output variable is nonlinear and the nature of the nonlinearity is not well known. Therefore it is not easy to postulate the form and/or order of a mathematical function whose parameters could be estimated through regression-type computations. Black-box models such as Artificial Neural Networks (ANN) can be applied in these situations since they can approximate nonlinearities to an arbitrary degree of accuracy [17]. However, although it is the most important aspect, accuracy is never the only concern in a modeling analysis. Models should also be able to handle imperfections in the data such as noise, errors, missing data points, disturbances and short-term effects. Another area of concern in modeling is the so-called local minima. Most modeling techniques compute model parameters by minimizing a cost function that may have several local minima, only one of which is the global minimum. Getting trapped in one of the local minima then prevents the method from reaching the truly optimal parameters. Therefore a good modeling method should either be able to guarantee reaching the global minimum or should have a cost function with a single minimum. In summary, the models should exhibit good global optimization and robustness characteristics as well as accuracy in prediction.

Furthermore, the methodology used should allow future prediction by letting the model topology cover one-step or multi-step ahead predictions. This is critical since the eventual goal here is to predict the future values of electricity production, which leads to dynamic model structures. It is well known that time series formalism using nonlinear models is very suitable for these types of problems. There are numerous published papers on successful applications of time series to complicated problems [18–21].
In addition, the methodology should allow rangeability, as the data available for modeling often do not cover the ranges of the data for the times at which the predictions need to be made. For instance, the population of Turkey ranges over 40.4–74.6 million for the 1974–2006 period; however, it is expected to be much higher for year 2026. The models will then see the [40.4, 74.6] range during training while they will be presented with much higher values during future prediction.

In the light of the preceding argument, Support Vector Regression methodology was proposed to model Turkey's electricity consumption. The literature search did not yield any study where the SVR technique is applied to consumption modeling of Turkey; therefore, the present paper is original and should contribute significantly to the area of electricity demand forecasting. SVR methodology allows multiple inputs, so it can be used to model consumption as a function of the aforementioned socio-economic indicators. SVR-based regression models can also handle nonlinearity since they use kernel functions that can be chosen from a number of different parametric nonlinear functions. SVR models can also handle data imperfections through an adjustable parameter (ε): the mismatch between the data and the model output is ignored if it is below that parameter. That means the method will not force the model to go through every point exactly, which greatly enhances its overall performance with little tradeoff in accuracy. SVR modeling also has excellent global optimization characteristics since the computation of model parameters becomes a convex optimization problem with a single minimum and no other local minima. Last but not least, SVR models can handle different data ranges properly through data preprocessing and normalization but, more importantly, through the parameters of the kernel functions.

The main goal of this research is to develop accurate, robust, globally optimal SVR models for the electricity consumption of Turkey. Another equally important goal is to build models that can predict consumption for strategic planning.
In order to incorporate prediction capabilities into the eventual system, a separate SVR model is built for each of the input variables using their past values as inputs. This results in four one-step-ahead predictor models for population, GNP, imports and exports. Then, a final one-step-ahead predictor model is built with the electricity consumption as output and the past values of the other variables as inputs.

2. Support Vector Regression

The foundations of Support Vector Machines (SVM) were laid by Vapnik and Chervonenkis [22–24], and the methodology has been gaining popularity ever since. SVMs that deal with classification problems are called Support Vector Classification (SVC) and SVMs that deal with modeling and prediction are called Support Vector Regression (SVR). There are numerous resources in the form of books, reports and papers that give a thorough overview of SVMs. An excellent tutorial on SVC has been published by Burges [25] and an excellent tutorial on SVR has been published by Smola and Schölkopf [26]. This paper deals exclusively with modeling and prediction, and therefore an overview of the SVR technique is given here.

Let us assume one is trying to model a single output ($y$) as a function of $n$ input variables ($x$) and is given a training data set of length $N$: $T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$, where $x_k \in \mathbb{R}^n$ and $y_k \in \mathbb{R}$, $k = 1, 2, \ldots, N$. In essence, the $x_k$ are $n$-dimensional vectors carrying the values of each input at time step $k$ and the $y_k$ are scalars carrying the values of the output variable at time step $k$. The problem then becomes finding the model that best explains this training set. In the original SVR formulation a linear model is proposed:

$$\hat{y}(x) = \langle w, x \rangle + b \qquad (1)$$

where $\hat{y}$ is the estimated output of the model, $w$ is a weight vector, $b$ is a bias term and $\langle \cdot, \cdot \rangle$ denotes the vector inner product. The vector $w$ is actually an element of the feature space of the problem. However, most real-world problems cannot be modeled using linear forms. In order to allow nonlinear modeling, SVR methodology allows the use of nonlinear functions $\Phi(x)$ that map the input space to the feature space:

$$\hat{y}(x) = \langle w, \Phi(x) \rangle + b \qquad (2)$$

Then the problem becomes choosing the right nonlinear maps $\Phi(x)$ and fitting for the best $w$. Radial basis functions are among the most widely used nonlinear kernels for SVR models and are implemented in this study. The SVR algorithm wants the model to have good generalization performance, which translates into the requirement that $w$ be as flat as possible. This means that the norm ($\|\cdot\|$) of the $w$ vector needs to be minimized, subject to the following constraints for every data point $i = 1, 2, \ldots, N$:

$$\begin{aligned} \text{minimize} \quad & \tfrac{1}{2}\|w\|^2 \\ \text{subject to} \quad & y_i - \langle w, \Phi(x_i) \rangle - b \le \varepsilon \\ & \langle w, \Phi(x_i) \rangle + b - y_i \le \varepsilon \end{aligned} \qquad (3)$$

At this point we are not guaranteed feasible constraints. One can introduce slack variables ($\xi_i, \xi_i^*$) to make the constraints feasible. Also, the variant of SVR used in this research (called ε-SVR) is insensitive to small errors, as it starts to penalize only errors that are greater than ε in magnitude. This is accomplished through the use of an ε-insensitive loss function $L(\varepsilon, y, \hat{y})$ as shown in Fig. 1. Then the task is to estimate the parameters $w$ and $b$ that minimize the following cost function, which is now a feasible convex optimization problem:

$$\begin{aligned} \text{minimize} \quad & \tfrac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} (\xi_i + \xi_i^*) \\ \text{subject to} \quad & y_i - \langle w, \Phi(x_i) \rangle - b \le \varepsilon + \xi_i \\ & \langle w, \Phi(x_i) \rangle + b - y_i \le \varepsilon + \xi_i^* \\ & \xi_i, \xi_i^* \ge 0 \end{aligned} \qquad (4)$$

Fig. 1. Graphical details of ε-insensitive loss function.

The parameter $C$ (also called the regularization parameter) in the above cost function determines the balance between the flatness of the $w$ vector and the penalty for errors that are greater than ε. This optimization problem is often solved in its dual form, where the constraints are handled by Lagrange multipliers and the Lagrangian ($L$) to be minimized becomes:

$$\begin{aligned} L = {} & \tfrac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} (\xi_i + \xi_i^*) - \sum_{i=1}^{N} \beta_i \left( \varepsilon + \xi_i - y_i + \langle w, \Phi(x_i) \rangle + b \right) \\ & - \sum_{i=1}^{N} \beta_i^* \left( \varepsilon + \xi_i^* + y_i - \langle w, \Phi(x_i) \rangle - b \right) - \sum_{i=1}^{N} (\eta_i \xi_i + \eta_i^* \xi_i^*) \end{aligned} \qquad (5)$$

where $\beta$, $\beta^*$, $\eta$ and $\eta^*$ are the Lagrange multipliers. In order to find the minimum, one needs to take the partial derivatives of the Lagrangian with respect to $\xi_i$, $\xi_i^*$, $w$ and $b$, and set them to zero. These expressions, along with the complementary Karush–Kuhn–Tucker conditions, lead to the following final quadratic programming problem:

$$\begin{aligned} \text{minimize} \quad & \tfrac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} K_{ij} (\beta_i - \beta_i^*)(\beta_j - \beta_j^*) + \varepsilon \sum_{i=1}^{N} (\beta_i + \beta_i^*) - \sum_{i=1}^{N} y_i (\beta_i - \beta_i^*) \\ \text{subject to} \quad & \sum_{i=1}^{N} (\beta_i - \beta_i^*) = 0 \\ & 0 \le \beta_i, \beta_i^* \le C \end{aligned} \qquad (6)$$

where $K_{ij} = K(x_i, x_j) = \Phi(x_i)^T \Phi(x_j)$ is the kernel function (which makes this a kernel-based method). It is based on the original nonlinear maps and takes care of the inner product in the feature space, so we no longer need to compute $w$ explicitly. The solution of the above quadratic programming (QP) problem yields the optimum $\beta_i, \beta_i^*$ parameters for $i = 1, 2, \ldots, N$. By virtue of being a convex QP problem, the global optimum is guaranteed. For every training data point there is a pair $\beta_i, \beta_i^*$. However, some of the pairs may vanish (by having values of zero), leading to a sparse model. A training data point $x_i$ whose $\beta_i, \beta_i^*$ pair does not vanish is called a support vector. The bias $b$ is computed such that the condition $\varepsilon - |y_i - \hat{y}_i| = 0$ is satisfied for all the support vectors. Finally, the model form in the dual space can be written as:

$$\hat{y}(x) = \sum_{i=1}^{N} (\beta_i - \beta_i^*) K(x, x_i) + b \qquad (7)$$

If the model is sparse, the summation in the above equation only needs to be taken over the support vectors. The parameters a user needs to determine for ε-SVR are: ε, the maximum allowable error in the output; $C$, the regularization parameter; $N$, the number of input patterns; and any parameters of the chosen kernel function. There are no guaranteed ways of choosing these parameters, and therefore a grid search has been performed in this study. The kernel function used throughout this research is the radial basis function, defined as follows:

$$K_{ij} = K(x_i, x_j) = \exp\left( -\frac{(x_i - x_j)^T (x_i - x_j)}{2\sigma^2} \right) \qquad (8)$$

This function has a single parameter σ that determines the spread of the function.

3. Application to consumption modeling

In this research, the electricity consumption of Turkey is modeled as a function of four socio-economic indicators, namely population, GNP, imports and exports. An analysis of historical data for these four variables has shown that, except for the population variable, there are no steady trends within these indicators. This is probably because Turkey is a developing country that is very vulnerable to global influences and has a highly dynamic internal structure. In addition, these four variables do not constitute a full list of all the variables that affect electricity consumption, although they are responsible for the majority of the input–output relationship. To address these issues, time (in years) is introduced as another input variable to capture non-uniform trends and the deficient input set. This means that the input vector $x$ of the SVR method becomes $x = [t \;\; x_1 \;\; x_2 \;\; x_3 \;\; x_4]^T$. Table 1 lists all the variables used in this study and Fig. 2 presents the general form of the electricity consumption model.

One goal of this research is to model electricity consumption; the other is to predict this consumption for future dates. That is why the model form needs to be chosen such that it can produce results for times that it has not seen during model training. This leads to one-step-ahead predictor type modeling, where the model output at a given year $[k]$ is modeled as a function of all input variables at the previous year $[k-1]$, 2 years before $[k-2]$ and 3 years before $[k-3]$ (see Fig. 3). This approach allows the model to capture not only the static relationship among these variables, but also the dynamic relationships that determine future behavior. As always, one has to pay a price for improvements, which in this case is the increase in the number of inputs. With this dynamic formalism, the number of inputs jumps from 5 to 15 and the new $x$ vector becomes:

$$x = [\,t[k-1] \;\; t[k-2] \;\; t[k-3] \;\; x_1[k-1] \;\; x_1[k-2] \;\; x_1[k-3] \;\; \cdots \;\; x_4[k-1] \;\; x_4[k-2] \;\; x_4[k-3]\,]^T$$
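The construction of the 15-element lagged input vector can be illustrated with a short sketch. This is not the author's code (the study used MATLAB); the function name `build_lagged_patterns`, the dictionary layout and the toy numbers are assumptions made only for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def build_lagged_patterns(
    t: Sequence[float],
    inputs: Dict[str, Sequence[float]],
    y: Sequence[float],
    depth: int = 3,
) -> List[Tuple[List[float], float]]:
    """Pair each output y[k] with t and every input at steps k-1 ... k-depth."""
    patterns = []
    for k in range(depth, len(t)):
        x: List[float] = []
        # Order: t first, then x1, x2, x3, x4; within each variable the
        # most recent lag (k-1) comes first and the oldest (k-depth) last.
        for series in [list(t)] + [list(inputs[name]) for name in sorted(inputs)]:
            x.extend(series[k - lag] for lag in range(1, depth + 1))
        patterns.append((x, y[k]))
    return patterns


# Toy data with made-up values (NOT the data set used in the paper).
years = [1975, 1976, 1977, 1978, 1979]
toy_inputs = {
    "x1": [40.0, 41.0, 42.0, 43.0, 44.0],
    "x2": [1.0, 2.0, 3.0, 4.0, 5.0],
    "x3": [10.0, 11.0, 12.0, 13.0, 14.0],
    "x4": [5.0, 6.0, 7.0, 8.0, 9.0],
}
toy_y = [20.0, 21.0, 22.0, 23.0, 24.0]

patterns = build_lagged_patterns(years, toy_inputs, toy_y)
```

With a depth of three, the first three years cannot serve as targets because their lagged inputs are incomplete, so five yearly samples yield only two patterns here.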

Table 1
List of all variables used in this study.

Variable    Description
t           Years
x1          Population
x2          Gross National Product (GNP)
x3          Imports
x4          Exports
y           Total electricity consumption

Fig. 2. General form of the electricity consumption model (inputs t, x1, x2, x3, x4).

Fig. 3. One-step-ahead predictor model for electricity consumption model (inputs t and x1 to x4 at steps k-3, k-2 and k-1; output y[k]).

Use of multi-step-ahead predictors was also considered for this research. One-step-ahead prediction was used because the number of points in the original data set is very low. As mentioned earlier, the data consist of values of the input and output variables for the years 1975–2006, a total of 32 data points. It is a rule of thumb to reserve 20% of the data (six points for this study) for testing. Therefore the total number of data points available for training the models was 26. Multi-step-ahead predictors would reduce this number even further, to a point where the models would no longer be reliable. The downside of modeling with one-step-ahead prediction is that the reliability of the prediction gets worse as the years move further into the future.

Another decision in these models was to use three time steps as the depth of the input variables. That makes it a 3rd-order model, which implies that the underlying dynamics can be represented by a 3rd-order difference equation (or a 3rd-order differential equation). It is good practice to start with low model orders and move to higher model orders only if the results are not satisfactory. Akay and Atak have also shown that a lower-order model can sufficiently model the electricity consumption dynamics of Turkey [13]. Finally, the same depth of three steps was used for all the input variables. This is because all the input and output variables are inter-related, and deciding individual depths based on error measures is not well justified.

Building and testing this model and estimating its parameters are straightforward for the existing data, which comprise all the input and output variables for the years 1975–2006. In order to make predictions beyond 2006 with this model, four more individual ε-SVR models are necessary so that the input vector for the future years can be obtained. Fig. 4 presents the overall final structure of the models developed.
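The way four input models can feed a fifth consumption model year after year may be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: each input model is assumed to map the last three values of its own series to the next value, and the "models" used here are simple hypothetical stand-ins (linear extrapolation and a toy sum) rather than trained ε-SVR models.

```python
from typing import Callable, Dict, List

Predictor = Callable[[List[float]], float]
INPUT_NAMES = ["x1", "x2", "x3", "x4"]


def lags(values: List[float], depth: int) -> List[float]:
    """Most recent value first: [v[k-1], v[k-2], ..., v[k-depth]]."""
    return [values[-lag] for lag in range(1, depth + 1)]


def forecast_recursively(
    history: Dict[str, List[float]],      # keys "t", "x1" ... "x4"
    input_models: Dict[str, Predictor],   # assumed: own lags -> next value
    consumption_model: Predictor,         # 15 lagged inputs -> y[k]
    steps: int,
    depth: int = 3,
) -> List[float]:
    series = {name: list(values) for name, values in history.items()}
    forecasts = []
    for _ in range(steps):
        # Assemble the lagged input vector for the next year.
        x = lags(series["t"], depth)
        for name in INPUT_NAMES:
            x.extend(lags(series[name], depth))
        forecasts.append(consumption_model(x))
        # Advance every input by one year using its own predictor, so the
        # following step can again draw on three lagged values.
        next_values = {n: input_models[n](lags(series[n], depth)) for n in INPUT_NAMES}
        for name in INPUT_NAMES:
            series[name].append(next_values[name])
        series["t"].append(series["t"][-1] + 1)
    return forecasts


# Hypothetical stand-ins for trained models (illustration only).
def linear_trend(recent: List[float]) -> float:
    return 2 * recent[0] - recent[1]


def toy_consumption(x: List[float]) -> float:
    return x[3] + x[6]  # x1[k-1] + x2[k-1]


toy_history = {
    "t": [2004, 2005, 2006],
    "x1": [1.0, 2.0, 3.0],
    "x2": [2.0, 4.0, 6.0],
    "x3": [0.0, 0.0, 0.0],
    "x4": [5.0, 5.0, 5.0],
}
toy_models = {name: linear_trend for name in INPUT_NAMES}
toy_forecast = forecast_recursively(toy_history, toy_models, toy_consumption, steps=3)
```

Because every predicted input is fed back as a lagged value for later years, errors in the input models accumulate, which is consistent with the remark above that reliability degrades further into the future.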
Four individual ε-SVR models produce the necessary input for the fifth model, which is for the electricity consumption.

As with most modeling techniques, data preprocessing was performed to take care of unit and scaling issues so that the importance of the inputs is not altered by their relative magnitudes. Each variable, including years, is linearly transformed into the unit interval [0, 1] using its own minimum and maximum values. All the modeling and prediction was done using these transformed values, and the inverse linear transform was used to bring the results back into their nominal values.

Fig. 4. Final form of the electricity consumption model (four predictor models fx1 to fx4 supply the lagged inputs x1 to x4, which together with the lagged t values feed the consumption model producing y[k]).

4. Results and discussions

The data used for Turkey's electricity consumption modeling span the years from 1975 to 2006, which corresponds to the most up-to-date data available at the time this research was performed. First, all five ε-SVR models (see Fig. 4) were built, and then they were used for future consumption prediction until year 2026.

4.1. Electricity consumption model

Finding the model within the context of SVR means evaluating the model parameters $\beta_i, \beta_i^*$ and the bias term in Eq. (7). However, there could be infinitely many models depending on the SVR parameters ε, C and σ chosen for expressions (6) and (8). The question then becomes which set of ε, C and σ gives the best model. This immediately raises the question of how to measure the performance of a given model. Root-mean-square (RMS) error is used as the performance measure for the models throughout this research. This is simply the square root of the mean squared error between a model's output and the actual output. However, where this RMS error is measured is also important. Only the RMS error of the test points is used here; the RMS error for the training points is excluded. The reason is that the models built in this work are intended solely for future prediction, and therefore their prediction performance on data that they have not seen during training is of most importance.

The modeling process consists of two phases, namely training and testing. For a given set of SVR parameters (ε, C and σ), the first 80% of the data are used for training, i.e., finding the model parameters ($\beta_i, \beta_i^*$ and the bias term), and the remaining 20% of the data are used to evaluate the performance of the model using its RMS error. The complete data set consists of the values of all the input and output variables for the years 1975–2006, for a total of 32 patterns. The first 26 patterns (years 1975–2000, roughly the first 80% of the complete data set) are used for training. The remaining data (years 2001–2006, roughly the last 20% of the complete data set) are used for testing. The main reason for using the last portion of the data for testing (rather than randomly choosing training and test sets) is that we wanted to measure the prediction capability of a model. In this way, the model is built using the data between 1975 and 2000 and its RMS error is computed only for the years 2001–2006, which the model has never seen before. This allows us to compare different models (for different SVR parameters) based on their future prediction, and thus extrapolation, capability.

This modeling process is repeated for different SVR parameter sets. A brute-force grid search was performed over the SVR parameters, where ε is varied from 0.001 to 0.01 in increments of 0.001, C is varied from 10 to 100 in increments of 10 and σ is varied from 1.0 to 3.0 in increments of 0.1. Therefore, a total of 10 × 10 × 20 = 2000 different models were built, and the one with the least test RMS error was chosen as the best model, as it would have the best prediction ability. All of the computations in this study were performed using MATLAB R2007a software. Optimization for the SVR model was performed using the quadprog routine in MATLAB.

Absolute and relative RMS errors for the optimum models are given in Table 2. The RMS error for population is very small, which agrees with the fact that this variable changes almost linearly throughout the years in the data set. Since there is no complicated behavior, the prediction is relatively easy for the model.

Table 2
List of absolute and relative RMS errors for models.

Variable       RMS error         Relative RMS error (%)
Population     0.07 million      0.13
GNP            2.80 billion $    1.96
Imports        0.88 billion $    2.68
Exports        0.42 billion $    1.86
Consumption    0.76 TWh          1.51
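The preprocessing and evaluation protocol described in this section (min-max scaling, chronological split, test RMSE and grid enumeration) can be sketched in a few lines. This is an illustrative sketch only: the helper names are hypothetical, and `toy_evaluate` is an arbitrary stand-in for "train an ε-SVR and return its test RMSE", not a result from the study. Enumerating σ from 1.0 to 3.0 inclusive in steps of 0.1 gives 21 values, so this sketch visits 2100 combinations rather than the 2000 stated above; which endpoint was excluded in the study is not specified.

```python
import math
from itertools import product


def minmax_scale(values):
    """Map values linearly onto [0, 1]; also return the bounds for inversion."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values], (lo, hi)


def minmax_inverse(scaled, bounds):
    lo, hi = bounds
    return [s * (hi - lo) + lo for s in scaled]


def chronological_split(patterns, n_train):
    """Earliest n_train patterns for training, the rest for testing."""
    return patterns[:n_train], patterns[n_train:]


def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def inclusive_grid(start, stop, step):
    """Evenly spaced values including both ends, built from an integer count."""
    n = int(round((stop - start) / step))
    return [round(start + i * step, 10) for i in range(n + 1)]


def grid_search(evaluate, eps_grid, c_grid, sigma_grid):
    """Return (lowest score, (eps, C, sigma)) over the full grid."""
    best = None
    for eps, c, sigma in product(eps_grid, c_grid, sigma_grid):
        score = evaluate(eps, c, sigma)
        if best is None or score < best[0]:
            best = (score, (eps, c, sigma))
    return best


def toy_evaluate(eps, c, sigma):
    # Arbitrary surrogate with a known minimum; a real evaluator would train
    # a model on the training patterns and return its RMSE on the test patterns.
    return abs(eps - 0.005) * 100 + abs(c - 50) / 100 + abs(sigma - 2.0)


eps_grid = inclusive_grid(0.001, 0.01, 0.001)
c_grid = inclusive_grid(10, 100, 10)
sigma_grid = inclusive_grid(1.0, 3.0, 0.1)
best_score, best_params = grid_search(toy_evaluate, eps_grid, c_grid, sigma_grid)

train_years, test_years = chronological_split(list(range(1975, 2007)), 26)
scaled, bounds = minmax_scale([40.0, 50.0, 60.0])
```

Splitting by position rather than at random keeps the test years strictly after the training years, which is what makes the test RMSE a measure of extrapolation.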

RMS errors for all the other variables are less than 3%, which is quite good considering the complicated nature of these variables. Besides, the goal of SVR is to have good global performance rather than the best local accuracy.

The optimal SVR parameters are summarized in Table 3. The second column (ε) of this table shows the optimal ε parameter for each variable. The values are all close to the lower threshold of 0.001, which means that most of the data points are used for the model and the models are not sparse. This is also evident in the last column (# SV), which presents the number of support vectors for each variable. Except for the population variable, the number of support vectors is close to the total number of data points used. The third column (C) of Table 3 indicates that the parameter C is low for the electricity consumption variable and high for the rest of the variables. This means that the relative importance of the flatness of the weights versus the prediction errors is different for these two groups of variables. The consumption model puts more emphasis on the flatness of the weight vector, whereas the models for the other variables put more emphasis on prediction errors. In other words, the consumption model puts more emphasis on global performance, whereas the other models put more emphasis on local performance. This is exactly what is needed: a globally correct output and locally correct input variables. The fourth and fifth columns of Table 3 present the optimal σ parameters and bias values for each variable, respectively. The kernel function (radial basis function) parameters are around 2.9 for most variables, except for the consumption variable, which has a σ of 1.7. Bias values vary between 0.405 and 1.096.

Table 3
List of optimized parameters for ε-SVR models.

Variable       ε        C      σ      Bias     # SV
Population     0.001    100    2.8    1.096    16
GNP            0.003    70     2.9    0.710    29
Imports        0.002    80     2.9    0.525    28
Exports        0.002    80     2.9    1.071    29
Consumption    0.004    20     1.7    0.405    29

Plots of the model output and actual data are presented in Fig. 5a–d for the input variables and in Fig. 6 for the output variable. Fig. 5a shows that the population variable is almost linear and the model follows its behavior well. The nature of GNP, imports, exports and electricity consumption is more complicated, but the model performance is still very good.

Fig. 5. Actual and predicted values for input variables (panels a–d: population in millions, GNP, imports and exports in billion $, each plotted against years).

Fig. 6. Actual and predicted values for electricity consumption (consumption in TWh plotted against years).

At this point, it would be beneficial to analyze the behavior of the suboptimal solutions for the output variable, i.e., solutions with RMS errors (RMSE) higher than that of the optimal model. This gives useful information about the behavior of the models in different SVR parameter regions. Within the context of this research, finding the optimal SVR model means obtaining the values of ε, C and σ that give the least RMSE. As mentioned earlier in this section, ε is varied from 0.001 to 0.01 in increments of 0.001, C is varied from 10 to 100 in increments of 10 and σ is varied from 1.0 to 3.0 in increments of 0.1, yielding 2000 different SVR models and therefore 2000 different RMSE values. The minimum RMSE is found to be 0.76 TWh (Table 2) for the electricity consumption, and the SVR parameters for this optimal model are ε = 0.004, C = 20 and σ = 1.7. In describing the behavior toward optimization, it would be best if one could plot RMSE versus ε, C and σ on the same plot.

However, this would require a four-dimensional plot, which is impossible to draw and to visualize directly. Instead, the behavior of RMSE is presented as a function of only one of the SVR parameters while the other two are held constant. RMSE behavior is presented in Fig. 7 as a function of ε while C and σ are kept constant. The behavior was similar for different pairs of C and σ, and therefore the optimal C and optimal σ were used in Fig. 7. Based on this figure, the optimal value for ε is 0.004, which corresponds to an RMSE of 0.76. As is evident from the figure, the RMSE increases as ε gets lower or higher than 0.004. It may also be important to note that the RMSE increases faster for higher ε and more slowly for smaller ε. For instance, if the same model were used with ε = 0.008 rather than ε = 0.004 (a 100% increase in the parameter), we would get about a 22% increase in the RMSE, which means that the model output for the test cases would be worse in comparison with the actual data. On the other hand, Fig. 8 presents RMSE as a function of σ while ε and C are held constant at their optimal values. This figure clearly shows that the optimal value for σ is 1.7, which corresponds to an RMSE of 0.76. RMSE increases on both sides of the optimal value. In contrast with the ε behavior, the RMSE increases faster for lower σ and more slowly for higher σ. Increasing σ from 1.7 to 3 (less than a 100% increase) would bring the RMSE from 0.76 to 3.7, an increase of about 400% that causes a significantly worse mismatch between the model output and the actual data.
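To make the roles of ε and σ in this discussion more concrete, the sketch below writes out the ε-insensitive loss, the radial basis kernel of expression (8) and the dual-form model of Eq. (7) restricted to support vectors. It is a minimal illustration, not the study's MATLAB code; the function names and the toy support vectors and coefficients are assumptions, and no quadratic program is solved here.

```python
import math


def eps_insensitive_loss(y, y_hat, eps):
    """Zero inside the tube |y - y_hat| <= eps, linear outside it."""
    return max(0.0, abs(y - y_hat) - eps)


def rbf_kernel(xi, xj, sigma):
    """Radial basis kernel: exp(-||xi - xj||^2 / (2 sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def svr_predict(x, support_vectors, coeffs, bias, sigma):
    """Dual-form output; coeffs[i] plays the role of (beta_i - beta_i*)."""
    return sum(c * rbf_kernel(x, sv, sigma) for sv, c in zip(support_vectors, coeffs)) + bias


# Errors smaller than eps cost nothing; larger ones are penalized linearly.
inside_tube = eps_insensitive_loss(1.0, 1.003, eps=0.004)
outside_tube = eps_insensitive_loss(1.0, 1.010, eps=0.004)

# A larger sigma spreads the kernel, so distant points still look similar.
narrow = rbf_kernel([0.0], [1.0], sigma=1.7)
wide = rbf_kernel([0.0], [1.0], sigma=3.0)

# Toy support vectors and coefficients (hypothetical, not fitted values).
toy_prediction = svr_predict([0.5], [[0.0], [1.0]], [1.0, -1.0], bias=0.5, sigma=1.7)
```

A smaller ε leaves fewer points inside the tube, so more training points end up as support vectors, which matches the non-sparse models reported for ε values near 0.001.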


The bias was found to be 3.392 and the noise standard deviation was found to be 2.076. These random walk model parameters were used to predict the consumption values for the test set covering the years 2001–2006. The results presented in Fig. 9 show that the performance of the random walk model is not acceptable. The RMSE of the random walk model for the test set was found to be 12.56, which is far higher than the SVR model's RMSE. The random walk model also did not capture the trend in the data properly. Admittedly, this is one of the simplest time series models and does not constitute a thorough comparison between SVR models and time series models. A full investigation of time series models is beyond the scope of this paper and could be the subject of another research project.

4.2. Electricity consumption future predictions
Fig. 7. RMSE behavior as a function of epsilon (ε). Axes: Epsilon (×10⁻³) against RMSE.

Finally, to justify the SVR model, a simple time series model was built for the overall electricity consumption and its error measure was compared against the SVR model error. One such model is the so-called random walk model, which can be considered a special case of the ARMA model structure [27]. The ARMA model in general can be expressed as follows:

Based on the optimal consumption model obtained with the SVR method, future predictions of consumption are evaluated from 2007 to 2026, a 20-year period. The results are given graphically in Fig. 10 and numerically in Table 4. The results indicate that the electricity consumption of Turkey will not increase drastically until 2014 and will thereafter follow a steady upward trend. The

$$A(q)\,y(t) = C(q)\,e(t) \tag{10}$$

where A(q) and C(q) are polynomials in the time-shift operator q, y(t) is the measurement sequence being analyzed and e(t) is a sequence of independent random variables, called white noise. With proper selections of A(q) and C(q), one obtains the random walk model, which can be expressed as follows:

$$y(t) = y(t-1) + e(t) \tag{11}$$
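The text does not state which polynomials give the random walk. One choice that does so is sketched below, under the assumption that q⁻¹ acts as a one-step backward shift, so that q⁻¹y(t) = y(t−1):

```latex
A(q) = 1 - q^{-1}, \qquad C(q) = 1
\;\;\Longrightarrow\;\;
\left(1 - q^{-1}\right) y(t) = e(t)
\;\;\Longrightarrow\;\;
y(t) - y(t-1) = e(t)
\;\;\Longrightarrow\;\;
y(t) = y(t-1) + e(t)
```

With this selection the general ARMA form reduces term by term to the random walk expression.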

For a stationary time series, the mean value of the noise sequence is zero and therefore the process moves around its previous value randomly. However, the electricity consumption variable is not stationary, and therefore a random walk model for this variable would include a bias at each time step plus white noise. First differences of the consumption values for the years 1975–2000 were used to estimate the bias term and the noise standard deviation.
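The estimation step described above can be sketched as follows. This is a minimal illustration, not the author's implementation: the function names are hypothetical, the training values are made up for the example, the sample standard deviation is an assumption (the text does not say which estimator was used), and the forecast simply adds the bias once per step from the last training value.

```python
from statistics import mean, stdev

def fit_random_walk_with_bias(series):
    """Estimate the bias and noise standard deviation from first differences."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    return mean(diffs), stdev(diffs)  # stdev = sample standard deviation (assumption)

def forecast(last_value, bias, steps):
    """Expected future values: the bias is added at each step, the noise has zero mean."""
    return [last_value + bias * k for k in range(1, steps + 1)]

# Hypothetical training values, NOT the consumption data used in the paper
training = [10.0, 12.5, 14.0, 17.5, 20.0]

bias, noise_std = fit_random_walk_with_bias(training)
predictions = forecast(training[-1], bias, steps=3)

print("bias:", bias)
print("noise std:", round(noise_std, 4))
print("predictions:", predictions)
```

Because the noise term has zero mean, the expected forecast is a straight line with slope equal to the bias, which is one reason such a model may struggle to follow a curving trend.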

Fig. 9. Performance comparison of SVR and random walk models. Axes: Years (2001 to 2006) against Consumption (TWh); series: Actual, SVR, Random Walk.

Fig. 8. RMSE behavior as a function of sigma (σ). Axes: Sigma against RMSE.

Fig. 10. Future predicted values for total electricity consumption of Turkey. Axes: Years (2008 to 2026) against Consumption (TWh).

Table 4
Predicted values of electricity consumption until 2026.

Years   Consumption (TWh)   Years   Consumption (TWh)
2007    151.05              2017    199.63
2008    156.72              2018    207.56
2009    163.21              2019    215.96
2010    170.23              2020    224.44
2011    175.05              2021    233.20
2012    177.16              2022    242.52
2013    177.63              2023    252.45
2014    179.18              2024    262.87
2015    184.10              2025    273.68
2016    193.20              2026    284.90
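As a rough way to read Table 4, the sketch below computes the implied average annual growth rate over 2007–2026 and the years with the smallest and largest year-on-year increments. It uses only the tabulated predictions; the variable names are illustrative and the calculation is not part of the original study.

```python
# Predicted consumption (TWh) copied from Table 4
table4 = {
    2007: 151.05, 2008: 156.72, 2009: 163.21, 2010: 170.23, 2011: 175.05,
    2012: 177.16, 2013: 177.63, 2014: 179.18, 2015: 184.10, 2016: 193.20,
    2017: 199.63, 2018: 207.56, 2019: 215.96, 2020: 224.44, 2021: 233.20,
    2022: 242.52, 2023: 252.45, 2024: 262.87, 2025: 273.68, 2026: 284.90,
}

years = sorted(table4)
first, last = years[0], years[-1]

# Compound average annual growth rate between the first and last predicted years
cagr = (table4[last] / table4[first]) ** (1.0 / (last - first)) - 1.0

# Year-on-year increments, keyed by the later year
increments = {y: table4[y] - table4[y - 1] for y in years[1:]}
slowest_year = min(increments, key=increments.get)
fastest_year = max(increments, key=increments.get)

print(f"average annual growth {first}-{last}: {cagr:.2%}")
print("smallest increment in:", slowest_year)
print("largest increment in:", fastest_year)
```

The smallest increment falls in the flat stretch before 2014 and the largest at the end of the horizon, consistent with the slow-then-steady growth pattern described for Fig. 10.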


References
[1] World Energy Council–Turkish National Committee (WEC–TNC), Energy Statistics, Ankara, Turkey; 2006 [in Turkish].
[2] Ebohan OJ. Energy economic growth and causality in developing countries: a case study of Tanzania and Nigeria. Energy Policy 1996;24:447–53.
[3] Kavrakoglu I. Modeling energy–economy interactions. Eur J Oper Res 1983;13(1):29–40.
[4] Say NP, Yucel M. Energy consumption and CO2 emissions in Turkey: empirical analysis and future projection based on an economic growth. Energy Policy 2006;34(18):3870–6.
[5] Uri ND. Energy GDP and causality: a statistical look at the issue. Energy Commun 1980;6(1):1–15.
[6] Yu ESH, Been KH. The relationship between energy and GDP: further results. Energy Econ 1984;6(3):186–90.
[7] Gilland B. Population, economic growth, and energy demand, 1985–2020. Popul Dev Rev 1988;14(2):233–44.
[8] Ediger VS, Tatlidil H. Forecasting the primary energy demand in Turkey and analysis of cyclic patterns. Energy Convers Manage 2002;43:473–87.
[9] Ediger VS, Camdali U. Energy and exergy efficiencies in Turkish transportation sector, 1988–2004. Energy Policy 2007;35:1238–44.
[10] Yumurtaci Z, Asmaz E. Electric energy demand of Turkey for the year 2050. Energy Source 2004;26:1157–64.
[11] Sozen A, Arcaklioglu E, Ozkaymak M. Modelling of the Turkey's net energy consumption using artificial neural network. Int J Comp Appl Technol 2005;22(2/3).
[12] Toksari MD. Ant colony optimization approach to estimate energy demand in Turkey. Energy Policy 2007;35:3984–90.
[13] Akay D, Atak M. Grey prediction with rolling mechanism for electricity demand forecasting of Turkey. Energy 2007;32(9):1670–5.
[14] Sozen A, Arcaklioglu E. Prospects for future projections of the basic energy sources in Turkey. Energy Source Part B: Econ Plan Policy 2007;2(2):183–201.
[15] Murat YS, Ceylan HH. Use of artificial neural networks for transport energy demand modeling. Energy Policy 2006;34:3165–72.
[16] Hamzacebi C. Forecasting of Turkey's net electricity energy consumption on sectoral bases. Energy Policy 2007;35:2009–16.
[17] Lippmann RP. An introduction to computing with neural nets. IEEE ASSP Mag 1987;4(4):4–22.
[18] Beliaev I, Kozma R. Time series prediction using chaotic neural networks on the CATS benchmark. Neurocomputing 2007;70:2426–39.
[19] Herrera LJ, Pomares H, Rojas I, Guillen A, Gonzalez J, Awad M, et al. Multigrid-based fuzzy systems for time series prediction: CATS competition. Neurocomputing 2007;70:2410–25.
[20] Sarkka S, Vehtari A, Lampinen J. CATS benchmark time series prediction by Kalman smoother with cross-validated noise density. Neurocomputing 2007;70:2331–41.
[21] Simon G, Lee JA, Cottrell M, Verleysen M. Forecasting the CATS benchmark with the Double Vector Quantization method. Neurocomputing 2007;70:2400–9.
[22] Vapnik V, Chervonenkis A. Theory of pattern recognition [in Russian]. Moscow: Nauka; 1974 [German translation: Wapnik W, Tscherwonenkis A. Theorie der Zeichenerkennung. Berlin: Akademie-Verlag; 1979].
[23] Vapnik V. Estimation of dependences based on empirical data. Berlin: Springer; 1982.
[24] Vapnik V. The nature of statistical learning theory. New York: Springer; 1995.
[25] Burges CJC. A tutorial on support vector machines for pattern recognition. Data Min Knowl Disc 1998;2(2):121–67.
[26] Smola AJ, Schölkopf B. A tutorial on support vector regression. Stat Comput 2004;14:199–222.
[27] Soderstrom T. Discrete-time stochastic systems: estimation and control. New York: Prentice Hall; 1994.

model predicts that the consumption will reach 284.9 TWh in the year 2026.

5. Conclusions

The major conclusion of this research is that the electricity consumption of Turkey can be modeled as a function of the aforementioned socio-economic variables using Support Vector Regression. Modeling performance in terms of RMS errors can be made relatively small without sacrificing too much generalization performance. It is found that the number of support vectors is relatively high, indicating that the models are not sparse and actually use most of the data set. Determination of the optimal SVR parameters, and thus the optimal model parameters, requires a search for the optimal model, and it is concluded that a grid search suffices in the case of Turkey. It is also concluded that SVR models can be put in such forms that future prediction is possible. Future prediction capability in this study is achieved by using a one-step-ahead predictor structure for the SVR model. However, conclusions about the future prediction quality of SVR models cannot be drawn until the actual electricity consumption data become available. Finally, based on the results presented in the previous section, the model predicts Turkey's electricity consumption to reach 284.9 TWh in the next 20 years, which is about twice the 2006 value. This means that the Turkish government and related organizations should plan for a significant increase in electric power plant capacity in the future.

Acknowledgement

The author would like to thank the Scientific Research Projects Division (BAP Project No.: 2010FBE011) of Pamukkale University for its support during this research.
