MODELLING
Submitted by
A.AADHITYA(16C001)
M.AKASH(16C007)
V.BHUJITH MADAV(16C015)
ABSTRACT
In this paper we present a very short-term (15 min and 1 hour ahead) solar
irradiance forecast using a stacked ensemble approach. Our main objective is
to show how the proposed model outperforms the existing models in terms of
forecast error. The dataset consists of time series data from 2014–2016 with
1-min resolution global horizontal irradiance and direct normal irradiance
measurements in Folsom, California. Additionally, the dataset includes sky
images, satellite imagery, and Numerical Weather Prediction forecasts.
Currently, numerical weather prediction models, statistical techniques such as
ARMA and ARIMA, and machine learning techniques such as regression, ANN and
support vector machines are in use. The methodology we suggest here is a
stacked ensemble model, which to the best of our knowledge has not yet been
explored.
INTRODUCTION
Today the world is in desperate need of non-conventional and renewable
energy sources and technologies. The word ‘sustainable’ holds more importance
than ever before. Renewable energy has been on a growth trajectory for the past
few years due to climate change and various other reasons[1,2]. One of the most
important and most abundant sources of energy is solar energy, but its
potential has not yet been fully harnessed[3]. Today photovoltaic (PV) energy
provides just 0.1% of total worldwide electricity production, but PV is
expected to deliver 5% of global electricity consumption in 2030[4]. A
major concern surrounding solar power is the variability and unpredictability of
sunlight. If it is overcast or cloud cover is present during the day, the
photovoltaic cells produce much less electricity. This inherent variability
poses issues for grid reliability and for the expenses associated with operating
the solar units. Hence forecasting future solar energy is an important step
towards efficiently integrating solar energy into the power grid[5].
PROBLEM FORMULATION
Currently the solar installation in India stands at 33.8 GW[6]. A total of
175 GW of renewable energy is to be installed in India by 2022[7]. Load Dispatch
Centers (LDC) are responsible for the distribution of power to households and
for commercial purposes. Load dispatchers collect information from the
various generating stations about how much power they will provide the next day
and prepare a schedule with 96 time blocks of 15 minutes each[8].
1) If the generating stations do not supply the load they committed to the LDC
a day before (due to reasons such as a gloomy sky or sudden overcast
conditions), there may be a demand-supply mismatch. The LDC then has to request
load from other power plants, which is a strenuous process[9].
EXISTING SOLAR FORECASTING TECHNIQUES
The first approach comprises (i) Artificial Neural Network (ANN), (ii)
Support Vector Machine (SVM) and (iii) Regression models.
Among the time-series-based forecasters, ANN-based forecasting is one of the
most effective methods. But ANN and other regression models form a
relationship between an individual feature vector and an individual output value
at a specific hour, which neglects the dependence between consecutive hours of
the same day[11]. ARMA and ARIMA models are statistical models and are
predominantly used, but they are computationally intensive and do not achieve
high accuracy[12,13].
The physical methods consist of three sub-models, i.e., (i) Numerical Weather
Prediction (NWP), (ii) Sky Imagery and (iii) Satellite-Imaging models.
Physical methods consist of mathematical equations that describe the physical
and dynamic state of the atmosphere[1]. NWP tools provide information about
atmospheric conditions for a given time scale. Sky imagery techniques are
used for very short-term forecasts of future cloud patterns at solar plants.
Cloud motion information and cloud properties are provided by satellite imaging
models.
DATASETS:
The dataset consists of three years (2014–2016) of 1-min resolution Global
Horizontal Irradiance (GHI) and Direct Normal Irradiance (DNI) data for Folsom,
a city in California, USA. Additionally, exogenous data including sky images
and weather data are also provided[14].
AVG() and STD(): The mean and standard deviation values of the RGB colors of
the captured images are provided. The sky camera captures Red-Green-Blue
(RGB) color images at a medium resolution (1536 x 1536) at intervals of 1 min.
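As a rough illustration of how AVG() and STD() features of this kind could be derived, the sketch below computes the per-channel mean and standard deviation of an RGB array. The function name rgb_stats, the dictionary keys and the random stand-in image are assumptions made for illustration; the dataset itself ships these features precomputed.

```python
import numpy as np

def rgb_stats(image: np.ndarray) -> dict:
    """Return per-channel mean and standard deviation of an H x W x 3 RGB array."""
    pixels = image.reshape(-1, 3).astype(np.float64)
    means = pixels.mean(axis=0)
    stds = pixels.std(axis=0)
    # Key names are illustrative, not necessarily the dataset's column names.
    return {
        "AVG(R)": means[0], "AVG(G)": means[1], "AVG(B)": means[2],
        "STD(R)": stds[0], "STD(G)": stds[1], "STD(B)": stds[2],
    }

# Hypothetical stand-in for one sky image (small size keeps the example fast).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
features = rgb_stats(image)
```

Each image then contributes six numbers to the feature vector, which is far cheaper to feed into a forecasting model than the raw pixels.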
24 such factors have been employed for predicting Global Horizontal
Irradiance (GHI) (W/m2). GHI is the total amount of radiation received from
the sun by a surface horizontal to the ground. We predict irradiance because it
is the most important variable in predicting PV energy compared to the other
exogenous variables. PV power output is directly proportional to solar
irradiance intensity[15]. The plot below shows the seasonal variation in the
GHI and DHI data.
From the above correlation plot we find that the strongest correlation is
between GHI and temperature. However, increasing air temperature negatively
affects solar energy output[16].
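A correlation analysis of this kind can be sketched as follows. The series are synthetic and hypothetical (temperature is constructed to follow irradiance, wind speed is unrelated by construction), so the numbers say nothing about the actual dataset; the sketch only shows how a Pearson correlation matrix is obtained.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic, hypothetical series: temperature loosely follows irradiance.
ghi = rng.uniform(0.0, 1000.0, size=500)
temperature = 10.0 + 0.02 * ghi + rng.normal(0.0, 2.0, size=500)
wind_speed = rng.uniform(0.0, 10.0, size=500)  # unrelated to GHI by construction

# Rows are variables, so the result is a 3 x 3 Pearson correlation matrix.
data = np.vstack([ghi, temperature, wind_speed])
corr = np.corrcoef(data)

ghi_vs_temperature = corr[0, 1]
ghi_vs_wind = corr[0, 2]
```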
DATA PREPROCESSING:
The timestamp values are available in UTC format and are first
converted into DateTime format. The 1-min resolution data is then resampled
to 15 minutes, 1 hour and 24 hours for making predictions. If certain columns
contain NaN values (indicating measurements that are not available), they are
filled with the mean of all the observations in the respective column.
The data is then scaled using the MinMaxScaler available in the preprocessing
module of scikit-learn.
January-September 2016 is treated as training data and GHI predictions are made
for October-December 2016.
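The preprocessing steps can be sketched as below, under stated assumptions: the column name ghi and the two hours of synthetic data are hypothetical, and averaging is assumed as the resampling aggregation since the text does not name one.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical 1-min data: two hours of GHI with a few missing readings.
index = pd.date_range("2016-01-01 18:00", periods=120, freq="min", tz="UTC")
df = pd.DataFrame({"ghi": np.linspace(100.0, 400.0, 120)}, index=index)
df.iloc[5:8, 0] = np.nan  # simulate unavailable measurements

# Fill NaN values with the column mean, as described in the text.
df = df.fillna(df.mean())

# Resample the 1-min series to 15-min and 1-hour averages (mean is an assumption).
ghi_15min = df.resample("15min").mean()
ghi_1h = df.resample("60min").mean()

# Scale to [0, 1]. Fitting on all rows is a simplification for this sketch;
# fitting on the training period only avoids leaking test information.
scaler = MinMaxScaler()
ghi_15min_scaled = scaler.fit_transform(ghi_15min)
```

Keeping the fitted scaler around matters in practice, because scaler.inverse_transform is needed to report errors in W/m2 rather than in scaled units.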
METHODOLOGY:
The methodology used here for GHI prediction is the stacked ensemble
model. Ensemble models are a means of improving prediction accuracy; they make
it possible to average out noise from diverse models and thereby enhance the
generalizable signal. Stacked ensemble techniques combine predictions from
multiple machine learning algorithms and use these predictions as inputs to
second-level learning models, or meta learners[18].
Dietterich[17] emphasizes that “A necessary and sufficient condition for an
ensemble model to be more accurate than any of its individual members is if the
classifiers are accurate and diverse”. The more diverse the first-level
learners are, the more accurate the predictions tend to be.
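The noise-averaging argument can be illustrated with a small numerical sketch. It assumes five hypothetical models whose errors are independent and equally large, which is an idealization of "diverse" learners; real models have correlated errors, so the gain is usually smaller.

```python
import numpy as np

rng = np.random.default_rng(42)
truth = np.sin(np.linspace(0.0, 6.0, 2000))  # hypothetical target signal

# Five hypothetical models with independent errors of standard deviation 0.3.
predictions = [truth + rng.normal(0.0, 0.3, size=truth.shape) for _ in range(5)]

def rmse(pred, target):
    """Root mean squared error between a prediction and the target."""
    return float(np.sqrt(np.mean((pred - target) ** 2)))

individual_rmse = [rmse(p, truth) for p in predictions]

# A plain average of the five predictions; independent errors partly cancel.
ensemble_rmse = rmse(np.mean(predictions, axis=0), truth)
```

With independent errors the averaged error shrinks roughly by a factor of the square root of the number of models. A stacked ensemble replaces the plain average with a learned combination.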
[Figure: stacked ensemble architecture. Blocks shown: MLP 1, MLP 2, MLP 3, LSTM 1, LSTM 2, Support Vector Regressor and Multiple Linear Regressor; the individual predictions are combined into the final GHI predictions.]
The six heterogeneous first-level learners are listed below, with a basic
description and their tuned hyper-parameters:
MULTI-LAYER PERCEPTRON(MLP) NETWORK
[Figure: multi-layer perceptron network. Source: Pinterest]
MLP 1
Optimizer=Adam(Learning rate=0.001)
MLP 2
1 output layer (Activation function=Linear function, No of neurons=1)
Optimizer=Adam(Learning rate=0.005)
MLP 3
Optimizer=Adam(Learning rate=0.001)
source:[23]
Support Vector Regression (SVR) is similar to SVM, with a few minor
differences. The basic idea is to form a decision boundary at a distance ‘e’
from the original hyperplane so that the support vectors lie within that
boundary. In simple terms, only the points with the least error are taken into
consideration, which minimizes the error[24].
Kernel function=Radial Basis function
Epsilon=0.1
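A minimal sketch of an SVR with these two settings, using scikit-learn, is given below. Only the RBF kernel and epsilon=0.1 come from the text; the synthetic data and all other settings (left at the scikit-learn defaults) are assumptions.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Hypothetical scaled inputs in [0, 1] and a smooth target with small noise.
X = rng.uniform(0.0, 1.0, size=(200, 1))
y = np.sin(2.0 * np.pi * X[:, 0]) + rng.normal(0.0, 0.05, size=200)

# RBF kernel and epsilon=0.1 as stated in the text; C and gamma are defaults.
model = SVR(kernel="rbf", epsilon=0.1)
model.fit(X, y)
pred = model.predict(X)

# Support vectors are the training points on or outside the epsilon tube.
n_support = len(model.support_)
```

Errors smaller than epsilon are not penalized at all, so a larger epsilon tends to give a sparser model at the cost of a coarser fit.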
Source:[19]
LSTM 1
Layer (type)          Output Shape       Param #
=====================================================
lstm (LSTM)           (None, 1, 50)      15000
_____________________________________________________
dropout (Dropout)     (None, 1, 50)      0
_____________________________________________________
lstm (LSTM)           (None, 1, 30)      9720
_____________________________________________________
dropout (Dropout)     (None, 1, 30)      0
_____________________________________________________
lstm (LSTM)           (None, 20)         4080
_____________________________________________________
dense (Dense)         (None, 1)          21
=====================================================
Total params: 28,821
LSTM 2
Layer (type)          Output Shape       Param #
=====================================================
lstm (LSTM)           (None, 1, 50)      15000
_____________________________________________________
dropout (Dropout)     (None, 1, 50)      0
_____________________________________________________
lstm (LSTM)           (None, 20)         5680
_____________________________________________________
dense (Dense)         (None, 1)          21
=====================================================
Total params: 20,701
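The parameter counts in both summaries can be checked by hand. The sketch below uses the standard LSTM parameter formula (four gates, each with input weights, recurrent weights and a bias) and assumes 24 input features per time step, which matches the 24 factors mentioned in the dataset description and reproduces the 15000 parameters of the first layer. The helper names are hypothetical.

```python
def lstm_params(input_dim: int, units: int) -> int:
    """Standard LSTM layer: 4 gates, each with input, recurrent and bias weights."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim: int, units: int) -> int:
    """Fully connected layer: one weight per input-unit pair plus one bias per unit."""
    return input_dim * units + units

n_features = 24  # assumption: one input per predictor variable

# LSTM 1: 50 -> 30 -> 20 units, then a single-neuron dense output.
lstm1 = [lstm_params(n_features, 50), lstm_params(50, 30),
         lstm_params(30, 20), dense_params(20, 1)]

# LSTM 2: 50 -> 20 units, then a single-neuron dense output.
lstm2 = [lstm_params(n_features, 50), lstm_params(50, 20), dense_params(20, 1)]

print(lstm1, sum(lstm1))  # [15000, 9720, 4080, 21] 28821
print(lstm2, sum(lstm2))  # [15000, 5680, 21] 20701
```

Dropout layers contribute no parameters, which is why they appear with a count of 0.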
Here the second level learner is
The first-level learners are trained using the training data (Jan-Sep 2016) and
predictions are made by each model. These predictions are stacked together
and serve as input to the second-level learner. The test data is then split
into two parts. The first part is used for fitting the meta learner.
Finally, GHI predictions are made for the second part of the test data.
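The training procedure described above can be sketched as follows. This is a simplified stand-in rather than the authors' implementation: it uses two scikit-learn base learners instead of six, synthetic data instead of the Folsom dataset, and a linear regression as the meta learner, all of which are assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic stand-in for scaled features and a GHI-like target (hypothetical).
X = rng.uniform(0.0, 1.0, size=(600, 4))
y = 2.0 * X[:, 0] + np.sin(3.0 * X[:, 1]) + 0.05 * rng.normal(size=600)

# Chronological split: the first block trains the base learners, the rest is test.
X_train, y_train = X[:400], y[:400]
X_test, y_test = X[400:], y[400:]

base_learners = [
    MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    SVR(kernel="rbf", epsilon=0.1),
]
for model in base_learners:
    model.fit(X_train, y_train)

# Stack the base predictions column-wise: one column per first-level learner.
Z_test = np.column_stack([m.predict(X_test) for m in base_learners])

# First half of the test block fits the meta learner, second half is scored.
half = len(Z_test) // 2
meta = LinearRegression().fit(Z_test[:half], y_test[:half])
final_pred = meta.predict(Z_test[half:])

rmse = float(np.sqrt(np.mean((final_pred - y_test[half:]) ** 2)))
```

Fitting the meta learner on data the base learners never saw keeps it from simply rewarding whichever base model overfits the training period most.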
RESULTS
Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) are used as
metrics to evaluate the performance of the models. The lower the values of MAE
and RMSE, the higher the accuracy of the model. MAE is the average of the
absolute error values over the entire test dataset. RMSE gives more weight to
large prediction errors than to small ones[25]. All RMSE and MAE values
are in W/m2 (the SI unit of irradiance).
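The difference between the two metrics is easiest to see on a small worked example. The observed and forecast values below are hypothetical; three forecasts miss by 10 W/m2 and one misses by 100 W/m2.

```python
import numpy as np

def mae(y_true, y_pred) -> float:
    """Mean of the absolute errors."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def rmse(y_true, y_pred) -> float:
    """Square root of the mean squared error."""
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

# Hypothetical GHI values in W/m2; the last forecast is a single large miss.
observed = [500.0, 520.0, 480.0, 510.0]
forecast = [490.0, 530.0, 470.0, 410.0]  # errors: 10, -10, 10, 100

print(mae(observed, forecast))   # 32.5
print(rmse(observed, forecast))  # about 50.74
```

The single 100 W/m2 miss pulls RMSE well above MAE, which is the extra weight on large errors mentioned above.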
The table below provides a summary of the performance of the various models
under varying time horizons. Each model, when treated alone, shows better
performance for particular time horizons. But it is clearly evident that the
stacked ensemble model outperforms the stand-alone models by a wide
margin in terms of the error evaluation metrics.
CONCLUSION
The variability in solar energy production due to environmental factors
makes it appear an unreliable source of energy. In this work we have made
GHI predictions for 15 minutes and 1 hour ahead using stacked models.
We certainly cannot conclude that the proposed model performs best and is
suitable for all forecasting horizons. All models have their own benefits. Even
the models not discussed here hold significant value and importance for various
types of forecasts. Accurate solar irradiance predictions help in forecasting
more precisely the amount of solar energy that will be produced in the future,
which increases the reliability of solar energy in the electricity grid.
FUTURE WORK
Integrating solar forecasting facilities with electricity grids leads to the
development of smart grids[27]. Here we have tried to forecast solar irradiance.
In our future work we will focus on forecasting solar energy on an
hourly basis, which would take us further towards the aforementioned goal. As
performance is a key parameter of any new development, we will focus on
how accuracy can be improved by tuning the hyper-parameters and
combining various alternative models.
REFERENCES
1. Nespoli, Alfredo, Emanuele Ogliari, Sonia Leva, Alessandro Massi Pavan,
Adel Mellit, Vanni Lughi, and Alberto Dolara. "Day-ahead photovoltaic
forecasting: A comparison of the most effective techniques." Energies 12,
no. 9 (2019): 1621.
5. Mishra, Sakshi, and Praveen Palanisamy. "Multi-time-horizon Solar
Forecasting Using Recurrent Neural Network." In 2018 IEEE Energy
Conversion Congress and Exposition (ECCE), pp. 18-24. IEEE, 2018.
11. Qing, Xiangyun, and Yugang Niu. "Hourly day-ahead solar irradiance
prediction using weather forecasts by LSTM." Energy 148 (2018): 461-
468.
12. Reikard, Gordon. "Predicting solar radiation at high resolutions: A
comparison of time series forecasts." Solar Energy 83, no. 3 (2009): 342-
349.
13. Das, Utpal Kumar, Kok Soon Tey, Mehdi Seyedmahmoudian, Saad
Mekhilef, Moh Yamani Idna Idris, Willem Van Deventer, Bend Horan,
and Alex Stojcevski. "Forecasting of photovoltaic power generation and
model optimization: A review." Renewable and Sustainable Energy
Reviews 81 (2018): 912-928.
14. Pedro, Hugo TC, David P. Larson, and Carlos FM Coimbra. "A
comprehensive dataset for the accelerated development and benchmarking
of solar forecasting methods." Journal of Renewable and Sustainable
Energy 11, no. 3 (2019): 036102.
18. Güneş, Funda, Russ Wolfinger, and Pei-Yi Tan. "Stacked ensemble
models for improved prediction accuracy." In Proc. Static Anal. Symp., pp.
1-19. 2017.
19. Duong, Tran Anh, Minh Duc Bui, and Peter Rutschmann.
"Long short term memory for monthly rainfall
prediction in Camau, Vietnam."
21. Gers, Felix A., Jürgen Schmidhuber, and Fred Cummins. "Learning to
forget: Continual prediction with LSTM." (1999): 850-855.
24. Sapankevych, Nicholas I., and Ravi Sankar. "Time series prediction using
support vector machines: a survey." IEEE Computational Intelligence
Magazine 4, no. 2 (2009): 24-38.
25. Monjoly, Stéphanie, Maïna André, Rudy Calif, and Ted Soubdhan.
"Hourly forecasting of global solar radiation based on multiscale
decomposition methods: A hybrid approach." Energy 119 (2017): 288-298.
26. Gensler, André, Janosch Henze, Bernhard Sick, and Nils Raabe. "Deep
Learning for solar power forecasting—An approach using AutoEncoder
and LSTM Neural Networks." In 2016 IEEE international conference on
systems, man, and cybernetics (SMC), pp. 002858-002865. IEEE, 2016.
27. Wan, Can, Jian Zhao, Yonghua Song, Zhao Xu, Jin Lin, and Zechun Hu.
"Photovoltaic and solar power forecasting for smart grid energy
management." CSEE Journal of Power and Energy Systems 1, no. 4
(2015): 38-46.