Table of Contents:
The first chapter gives an introduction to the basic concepts of neural networks. The second chapter explains a neural network for forecasting electricity generation in billion kWh for India. The third chapter delineates a neural network classification model, developed on a hypothetical dataset.
Chapter 01
Introduction to Neural Networks
The artificial neural network (ANN) is based on the concept of machine learning. An ANN attempts to mimic the human brain in a machine (computer). ANNs attempt to learn the patterns exhibited in the training dataset (input: independent variables; output: dependent variable). Then, based on this learning, the ANN attempts to give a generalized solution for the total dataset. The solution could be a forecast from a time-series or causal regression model. It could also be a classification of the dataset, like the models developed in discriminant analysis or logistic regression.
Artificial neural networks are data-driven and self-adaptive. There is no need to make any a priori assumptions about the input data: the neural network model is formed in an adaptive manner based on the patterns exhibited by the dataset. The artificial neural network is basically non-linear, in contrast to linear models such as regression and ARIMA. This makes it adaptable to different kinds of datasets, and its predictions are often more accurate than those of other forecasting or classification models.
ANN Architecture:
The most widely used ANN architecture is the multilayer perceptron (MLP). An MLP generally comprises an input layer, a single hidden layer, and an output layer, although there can be more than one hidden layer. The nodes in the various layers are known as processing elements. The elements in the input, hidden, and output layers are connected by acyclic links, and each link is associated with a weight. The network works on the feed-forward network (FNN) principle: the output $Y_k$ is computed as
$$Y_k = \pi_{\mathrm{out}}\left(\alpha_k + \sum_{h} W_{hk}\,\pi_h\left(\alpha_h + \sum_{i} W_{ih} X_i\right)\right)$$

where
W = weight associated with each link in the network,
i = index running over the inputs,
h = index running over the hidden-layer elements,
k = index running over the outputs.

The activation function $\pi_h$ of the hidden layer is taken to be the logistic function, $\pi_{\mathrm{out}}$ is the activation function at the output layer, and $\alpha_h$ and $\alpha_k$ are bias coefficients. The logistic function is

$$f(z) = \frac{e^{z}}{1 + e^{z}}$$
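As a concrete illustration of this computation, the forward pass can be written in a few lines of R. The 2-3-1 network, its weights, and the inputs below are arbitrary values chosen for demonstration, not taken from any model in this document.

> # one forward pass through a 2-3-1 MLP with logistic hidden units
> logistic <- function(z) exp(z)/(1 + exp(z))
> x <- c(0.4, 0.7)                                            # two scaled inputs
> Wih <- matrix(c(0.1, -0.3, 0.5, 0.2, -0.4, 0.6), nrow = 2)  # 2 inputs x 3 hidden units
> alpha_h <- c(0.05, -0.10, 0.15)                             # hidden-layer biases
> Whk <- c(0.7, -0.2, 0.4)                                    # hidden-to-output weights
> alpha_k <- 0.1                                              # output bias
> h <- logistic(alpha_h + as.vector(t(Wih) %*% x))            # hidden activations pi_h
> y <- alpha_k + sum(Whk * h)                                 # pi_out taken as identity
> y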
Chapter 02
Neural Networks – Forecasting Model
Forecasts for the next 10 years of the independent variables, population in million (pop_mn) and GDP in billion rupees (gdp_bn), have been estimated using ARIMA models. Hence, the model developed for forecasting annual electricity generation is hybrid in nature.
> #R code:
> abc <- read.csv("bnkwh_gdp.csv", sep=",", header=TRUE)
> head(abc)
year pop_mn gdp_bn gdp_pcap bn_kwh
1 1960-61 434 178.70 411.75 16.94
2 1961-62 444 189.12 425.95 20.15
3 1962-63 454 203.21 447.60 23.36
4 1963-64 464 233.50 503.23 26.57
5 1964-65 474 272.22 574.30 29.78
6 1965-66 485 286.93 591.61 32.99
> kwhgdp<-data.frame(abc[,c(2,3,5)])   # keep pop_mn, gdp_bn, and bn_kwh
> kwhgdp
> scalemin<-apply(kwhgdp,2,min)   # column minima
> scalemax<-apply(kwhgdp,2,max)   # column maxima
> # min-max scale every column to the [0,1] range
> scaledata<-data.frame(scale(kwhgdp, center=scalemin, scale=scalemax-scalemin))
> scaledata
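The transformation can be verified by checking the range of every scaled column, which should span exactly 0 to 1:

> apply(scaledata, 2, range)   # each column should run from 0 to 1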
> set.seed(123)
> library(nnet)
> library(NeuralNetTools)
> # 2-5-1 network: 5 hidden units, weight decay 1e-3, linear output for regression
> kwh_nn <- nnet(bn_kwh ~ gdp_bn + pop_mn, scaledata, size = 5, decay = 1e-3, linout = T, skip = F, maxit = 1000, Hess = T)
# weights: 21
initial value 7.652763
iter 10 value 0.029795
iter 20 value 0.026070
iter 30 value 0.024238
iter 40 value 0.023878
iter 50 value 0.023284
iter 60 value 0.022457
iter 70 value 0.020235
iter 80 value 0.019555
> summary(kwh_nn)
a 2-5-1 network with 21 weights
options were - linear output units decay=0.001
b->h1 i1->h1 i2->h1
0.02 0.12 -0.16
b->h2 i1->h2 i2->h2
1.41 -1.53 -2.24
b->h3 i1->h3 i2->h3
0.02 0.12 -0.16
b->h4 i1->h4 i2->h4
-0.19 0.43 -1.02
b->h5 i1->h5 i2->h5
0.02 0.12 -0.16
b->o h1->o h2->o h3->o h4->o h5->o
0.35 0.31 -1.62 0.31 1.05 0.31
> plotnet(kwh_nn)
> # rescale the fitted values from [0,1] back to the original bnkwh units
> ycap <- fitted.values(kwh_nn)*(max(kwhgdp$bn_kwh)-min(kwhgdp$bn_kwh))+min(kwhgdp$bn_kwh)
> bnkwh<-data.frame(actualbnkwh=kwhgdp$bn_kwh, fittedbnkwh=ycap[,1])
> bnkwh
actualbnkwh fittedbnkwh
1 16.94 4.92
2 20.15 8.53
3 23.36 12.26
4 26.57 16.17
5 29.78 20.22
6 32.99 24.71
7 37.81 29.01
8 42.62 33.92
9 47.43 39.29
10 51.62 44.45
11 55.80 50.17
12 59.43 56.57
13 63.06 63.23
14 66.69 70.37
15 72.94 77.71
16 79.20 85.55
17 85.30 93.07
18 91.40 101.60
19 102.52 110.20
20 104.70 120.35
21 120.80 130.69
22 122.10 140.08
23 130.30 151.37
24 140.20 162.79
25 156.86 174.92
26 170.40 187.56
27 187.70 200.51
28 202.10 214.87
29 221.40 230.57
30 245.44 246.40
31 264.30 263.36
32 287.03 280.68
33 301.40 298.08
34 324.00 319.73
35 350.40 341.75
36 380.00 365.27
37 395.89 389.52
38 421.70 411.92
39 448.50 438.89
40 480.70 464.53
41 499.50 486.61
42 517.44 512.71
43 532.70 534.66
44 565.10 561.97
45 594.40 594.35
46 623.80 628.45
47 670.65 667.76
48 723.00 711.02
49 741.20 749.76
50 799.80 796.10
51 844.80 857.26
52 923.20 913.66
53 964.50 954.29
54 1014.80 1008.98
55 1048.40 1053.77
56 1090.85 1108.56
57 1135.33 1152.28
58 1206.31 1192.35
59 1249.34 1245.93
Figure 2.2 shows the actual bnkwh and fitted bnkwh as a scatter plot.
> rmse<-sqrt(sum((bnkwh$actualbnkwh-bnkwh$fittedbnkwh)^2)/nrow(bnkwh))
> rmse
[1] 10.47169
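For reference, the quantity computed above is the root mean squared error of the fitted values:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

where $y_i$ is the actual bnkwh, $\hat{y}_i$ the fitted bnkwh, and $n$ the number of observations.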
The actual bnkwh versus predicted bnkwh scatter (Figure 2.2) has been plotted in order to understand the model fit. The plotting call itself is not shown in the transcript; a minimal version could be the sketch below.
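> # sketch: actual versus fitted scatter with a 45-degree reference line
> plot(bnkwh$actualbnkwh, bnkwh$fittedbnkwh, xlab="actual bnkwh", ylab="fitted bnkwh", main="actual vs fitted bnkwh")
> abline(0, 1, col=2)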
The forecast of bnkwh has been made for the next 10 years. The future values of the regressor variables have been estimated using ARIMA models; in fact, the auto.arima function of the 'forecast' package has been used to estimate gross domestic product in billion rupees and population in million.
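The transcript does not show the code that produced the regressor forecasts or the bnkwhfuture series used below; a minimal sketch, assuming the forecasts are generated on the scaled series and fed through kwh_nn (the object name newscaled is hypothetical), is:

> # sketch (assumed): forecast the scaled regressors 10 years ahead,
> # feed them to the fitted network, and rescale the predictions
> library(forecast)
> poparima <- auto.arima(ts(scaledata$pop_mn))   # selects ARIMA(0,2,1) here
> gdparima <- auto.arima(ts(scaledata$gdp_bn))   # selects ARIMA(1,2,1) here
> newscaled <- data.frame(pop_mn = as.numeric(forecast(poparima, h=10)$mean),
+     gdp_bn = as.numeric(forecast(gdparima, h=10)$mean))
> bnkwhfuture <- predict(kwh_nn, newscaled)[,1] *
+     (max(kwhgdp$bn_kwh)-min(kwhgdp$bn_kwh)) + min(kwhgdp$bn_kwh)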
Series: scaledata$pop_mn
ARIMA(0,2,1)
Call:
arima(x = scaledata$pop_mn, order = c(0, 2, 1))
Coefficients:
          ma1
      -0.8907
s.e.   0.0551

Series: scaledata$gdp_bn
ARIMA(1,2,1)
Call:
arima(x = scaledata$gdp_bn, order = c(1, 2, 1))
Coefficients:
         ar1      ma1
      0.9820  -0.8670
s.e.  0.0287   0.0836
5 1454.158
6 1486.896
7 1518.160
8 1548.522
9 1578.469
10 1608.398
> fittedbnkwh<-ts(bnkwh$fittedbnkwh,start=1)
> futurebnkwh<-ts(bnkwhfuture,start=60)
> ts.plot(fittedbnkwh, futurebnkwh, type="l", col=c(1,2), lwd=c(1,4), xlab="year", ylab="bnkwh", main="bnkwh forecast 10 yrs")
> legend("topleft", lty=c(1,1), col=c(1,2), lwd=c(1,4), legend=c("fitted bnkwh","forecast 10 yrs"), bty="n")
Figure 2.3 shows the electricity generation (bnkwh) forecast for the next 10 years based on the hybrid neural network model.
> #R code:
> abc <- read.csv("bnkwh_gdp.csv", sep=",", header=TRUE)
> kwhgdp<-data.frame(abc[,c(2,3,5)])
> scalemin<-apply(kwhgdp,2,min)
> scalemax<-apply(kwhgdp,2,max)
> scaledata<-data.frame(scale(kwhgdp, center=scalemin, scale=scalemax-scalemin))
> scaledata
> nnpredlevel<-numeric(nrow(kwhgdp))   # one leave-one-out prediction per observation
> xyz<-numeric(nrow(kwhgdp))           # collects the predictions of each repetition
> library(nnet)
> # leave-one-out cross validation, repeated 5 times with different random starts
> for (j in 1:5){
+   for (i in 1:nrow(kwhgdp)) {
+     nntest<-as.data.frame(scaledata[i,c(1:2)])   # row i as test set (predictors only)
+     nntrain<-as.data.frame(scaledata[-i,])       # remaining rows as training set
+     nnfit<-nnet(bn_kwh~gdp_bn+pop_mn, data=nntrain, size = 4, decay = 1e-3, linout = T, skip = F, maxit = 1000, Hess = T)
+     nnpred<-predict(nnfit, nntest)
+     # rescale the prediction back to the original bnkwh units
+     nnpredlevel[i]<-nnpred[,1]*(max(kwhgdp$bn_kwh)-min(kwhgdp$bn_kwh))+min(kwhgdp$bn_kwh)
+   }
+   xyz<-cbind(xyz,nnpredlevel)
+ }
# weights: 17
initial value 6.315430
iter 10 value 0.043400
iter 20 value 0.029248
iter 30 value 0.026542
iter 40 value 0.025696
iter 50 value 0.024177
iter 60 value 0.022226
iter 70 value 0.021287
iter 80 value 0.020766
iter 90 value 0.020102
iter 100 value 0.019594
iter 110 value 0.019382
iter 120 value 0.019246
iter 130 value 0.019207
iter 140 value 0.019160
final value 0.019156
converged
........................
# weights: 17
initial value 5.533753
iter 10 value 0.041560
iter 20 value 0.027292
iter 30 value 0.022990
iter 40 value 0.022279
iter 50 value 0.020596
iter 60 value 0.020322
iter 70 value 0.020144
iter 80 value 0.019869
iter 90 value 0.019411
iter 100 value 0.019302
iter 110 value 0.019168
iter 120 value 0.019069
iter 130 value 0.019016
iter 140 value 0.018898
> xyz<-xyz[,-1]
> xyz
> result<-rowMeans(xyz)   # average the 5 LOOCV predictions per observation
> resultdf<-data.frame(actualbnkwh=kwhgdp$bn_kwh, predbnkwh=result)
> resultdf
    actualbnkwh predbnkwh
44 565.10 561.64
45 594.40 594.46
46 623.80 629.11
47 670.65 667.54
48 723.00 710.06
49 741.20 750.57
50 799.80 795.71
51 844.80 858.96
52 923.20 911.47
53 964.50 952.15
54 1014.80 1007.52
55 1048.40 1054.47
56 1090.85 1113.96
57 1135.33 1157.78
58 1206.31 1187.76
59 1249.34 1240.17
Figure 2.4 shows the actual bnkwh versus the fitted bnkwh for the leave-one-out cross-validated model. The cross-validated RMSE computed below (11.75) is only marginally higher than the in-sample RMSE (10.47), which indicates that the network generalizes well rather than overfitting the training data.
> rmse<-sqrt(sum((resultdf$actualbnkwh-resultdf$predbnkwh)^2)/nrow(resultdf))
> rmse
[1] 11.74785
Chapter 03
Neural Networks – Classification Model
Hypothetical University allows students to select the teacher for a particular subject and the number of electives per semester. The case describes a very simple situation: there are three teachers, and each of them is capable of teaching all four electives offered to the students. The student result is the response variable, which is evaluated against the different selection combinations of teacher and number of electives. The actual result of each student is compared against the result predicted by the neural network model. The file 'acad2019.csv' holds the dataset for this example. This is a purely hypothetical case for demonstrating the neural network concept on a classification model. The R code for running this neural network model is given below; the transcript uses objects acad, traindata, and testdata whose construction is not shown, so a plausible reconstruction is sketched first.
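A minimal sketch of the data preparation, assuming a random 60/40 train-test split (the proportions and the seed are assumptions, not from the source):

> # sketch (assumed): read the data and split it into training and test sets
> acad <- read.csv("acad2019.csv", sep=",", header=TRUE)
> set.seed(123)
> ind <- sample(2, nrow(acad), replace=TRUE, prob=c(0.6, 0.4))
> traindata <- acad[ind==1,]
> testdata <- acad[ind==2,]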
> tail(acad)
> library(nnet)
> library(NeuralNetTools)
> # 2-5-1 network with logistic output (linout = F) for the binary result
> acadfit <- nnet(result ~ teacher + electives, traindata, size = 5, decay = 1e-3, linout = F, skip = F, maxit = 1000, Hess = T)
# weights: 21
initial value 56.289146
iter 10 value 23.080040
iter 20 value 22.162028
iter 30 value 21.788613
iter 40 value 15.999344
........................
iter 380 value 15.657065
iter 390 value 15.656995
iter 400 value 15.656938
iter 410 value 15.656852
final value 15.656648
converged
> summary(acadfit)
a 2-5-1 network with 21 weights
options were - decay=0.001
b->h1 i1->h1 i2->h1
7.60 -0.93 -1.54
b->h2 i1->h2 i2->h2
-0.37 1.48 -0.41
b->h3 i1->h3 i2->h3
-4.64 -0.79 2.78
b->h4 i1->h4 i2->h4
0.60 2.53 -0.93
b->h5 i1->h5 i2->h5
1.44 2.96 -3.17
b->o h1->o h2->o h3->o h4->o h5->o
1.90 -6.45 3.13 -4.20 3.86 4.65
> plotnet(acadfit)
Figure 3.1 shows the neural network model for the acad2019 classification. Teacher and electives are the input variables, and result is the response or output variable, also called the dependent variable. The predicted classes for the test set, printed below, are generated as sketched next.
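The predicted values come from predval, whose construction is not shown in the transcript; a minimal sketch, assuming a 0.5 cut-off on the network output, is:

> # sketch (assumed): predict probabilities on the test set and
> # convert them to 0/1 classes with a 0.5 cut-off
> predval <- ifelse(predict(acadfit, testdata) > 0.5, 1, 0)
> predval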
1 2 3 4 5 10 11 12 13 16 29 32 37 38 40 44 46 47 48 50
1 1 1 1 0 1 1 1 1 0 1 1 1 1 1 1 1 1 0 1
51 52 57 61 63 66 67 68 71 73 76 83 85 86 87 90 93 94 97 100
1 1 0 1 1 0 0 1 1 1 0 1 1 1 1 0 1 0 0 1
105 109 110 111 112 113 115 116 117 118 124 129 130 142 146 150 152 153 154 155
0 0 1 1 1 1 0 1 1 1 1 1 1 1 0 1 1 0 1 1
158 160 162 168 169 172 173 174 177 181 182 185 193 195 198 199 201 202 203 208
0 1 0 1 1 0 0 1 1 0 0 1 0 0 0 0 0 1 1 1
209 210 217 219 221 223 225 227 228 229 235 239 240 241 242 243 250 254 255 256
1 1 1 1 1 1 0 0 1 1 0 1 0 1 0 1 0 1 1 1
259 260 269 273 274 275 276 279 280 282 283 284 292 295 296 299 300 303 306 309
1 0 1 0 1 1 1 1 0 1 1 0 1 1 1 1 1 0 1 1
310 312 313 316 317 318 319 326 329 331 334 335 336 339 342 345 352 355 356 359
1 1 0 0 1 0 1 1 1 0 1 1 1 0 1 1 0 1 0 1
361 363 365 370 372 373 379 380 382 386 389 391 396
0 0 0 1 1 1 0 1 0 1 1 1 1
> actualresult<-acad[ind==2,4]
> result<-data.frame(actualresult=actualresult, predresult=predval)
> head(result)
actualresult predresult
1 1 1
2 1 1
3 1 1
4 1 1
5 1 0
10 1 1
> tb1<-table(result)
> tb1
predresult
actualresult 0 1
0 41 5
1 8 99
> misclass<-1-sum(diag(tb1))/sum(tb1)   # 1 minus accuracy from the confusion matrix
> misclass
[1] 0.08496732
The misclassification rate of the neural network model is 8.5 percent. The result of the classification-type neural network model is quite satisfactory.