
INTELLIGENT CONTROL SYSTEM (ICS) Semester 1, Session 2013/2014

Multilayered Perceptrons (MLPs)


The Multilayered Perceptron is a natural extension of the single layer perceptron that was very popular in the 1960s.
Multilayered perceptrons are able to overcome the severe limitations of their single layer predecessor.
This, plus the availability of several learning algorithms for finding suitable weights and thresholds or biases, has made multilayered perceptrons widely popular.
Their applications are in finance, chemistry, plant control, autonomous vehicle steering, system identification, control and various other function approximation and general pattern recognition problems (Burke, 1991; Lippmann, 1987; Hopfield and Tank, 1985).
There are numerous works on the study and applications of multilayered perceptrons. The different variants of this model differ in the way the weights are updated during learning; among these is backpropagation, or more formally, backward propagation of error.
Concept of Training
Training an MLP consists of deciding the number of layers and the number of units in each layer, and then selecting the network's weights and thresholds so as to minimize the prediction error made by the network.
Learning rules are used to automatically adjust the weights and thresholds in order to minimize this error.
The error of a particular configuration of the network can be determined by running all the training cases through the network and comparing the actual outputs generated with the desired or target outputs.
A helpful concept is the error surface; the objective of network training is to find the lowest point in this many-dimensional surface.
Concept of Training (cont)
Neural network error surfaces are much more complex, and are
characterized by a number of unhelpful features, such as local minima
(which are lower than the surrounding terrain, but above the global
minimum), flat spots and plateaus, saddle points, and long narrow
ravines.
It is not possible to analytically determine where the global minimum of
the error surface is, and so neural network training is essentially an
exploration of the error surface.

From an initially random configuration of weights and thresholds (i.e., a random point on the error surface), the training algorithms incrementally seek the global minimum.
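As a rough illustration of this idea (not from the slides), the sketch below runs steepest descent on a made-up one-weight error surface that has both a local and a global minimum; the surface, the random starting point, and the learning rate are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical 1-D "error surface" with one local and one global minimum
# (illustrative only; real MLP error surfaces are high-dimensional).
def error(w):
    return 0.05 * w**4 - 0.5 * w**2 + 0.2 * w + 2.0

def d_error(w):
    return 0.2 * w**3 - 1.0 * w + 0.2

w = np.random.uniform(-4.0, 4.0)   # random starting point on the surface
eta = 0.05                          # learning rate (step size)
for k in range(200):
    w -= eta * d_error(w)           # incremental step downhill

# Depending on the random start, the descent may end in the local
# or in the global minimum, which is exactly the difficulty described above.
print(f"final w = {w:.3f}, error = {error(w):.3f}")
```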
Concept of Training (cont)
[Figure: MSE versus iteration, showing a local minimum and the global minimum of the error surface reached during training.]
Concept of Training (cont)
The global minimum corresponds to the minimum value of the performance criterion. The performance criterion is based on the training error given below:
$E = \sum_{i=1}^{P}\sum_{j=1}^{M}\left[t_j - y_j\right]_i^{2}$
But sometimes the Mean Square Error is used instead:
$MSE = \dfrac{1}{P}\sum_{i=1}^{P}\left(t_i - y_i\right)^{2}$
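As a small illustration (not part of the slides), the snippet below evaluates both criteria for a made-up set of targets and outputs; the arrays t and y are assumed values, and the MSE follows the convention above of dividing the summed squared error by the number of patterns P.

```python
import numpy as np

# Illustrative targets and network outputs for P patterns and M outputs
# (values are made up for this example).
t = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])   # targets, shape (P, M)
y = np.array([[0.8, 0.1],
              [0.3, 0.7],
              [0.9, 0.6]])   # actual outputs, shape (P, M)

P = t.shape[0]

E = np.sum((t - y) ** 2)      # summed squared training error
mse = E / P                   # mean square error, averaged over the P patterns

print(f"E = {E:.4f}, MSE = {mse:.4f}")
```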
Backpropagation Networks
Backpropagation networks and multilayered perceptrons in
general, are feedforward networks with distinct input, output, and
hidden layers. The units function basically like perceptrons, except
that the transition (output) rule and the weight update (learning)
mechanism are more complex.

Multilayer feedforward neural networks have proven their ability to solve and handle a wide range of problems and applications, and these systems have overcome the limitations of the single-layer system. In order to learn a solution, a training strategy must be used.
One of the most popular training methods is backpropagation.
No learning algorithm had been available for multilayer networks until Rumelhart, Hinton, and Williams introduced the backpropagation training algorithm, also referred to as the generalized delta rule (1988).
STEEPEST DESCENT BACKPROPAGATION (SDBP) NEURAL NETWORK
The Backpropagation Algorithm (BPA) is a supervised learning method for training ANNs, and is one of the most common forms of training technique. It uses a steepest descent (gradient descent) optimization method, also referred to as the delta rule when applied to a feedforward network. A feedforward network that employs the delta rule for training is called a Multi-Layer Perceptron (MLP).
For a single neuron, the weighted input sum and output are
$net_j = \sum_{i=1}^{N} w_{ji}\,x_i + b_j, \qquad y_j = f(net_j)$   (1)
The performance index or cost function J takes the form of a summed squared error function, $J = \tfrac{1}{2}\sum_j (t_j - y_j)^2$, and the weights are adjusted in the direction of steepest descent,
$\Delta w_{ji} = -\eta\,\dfrac{\partial J}{\partial w_{ji}}$   (2)
where $\eta$ is the learning rate.

Backpropagation Networks (cont)
From equations (1) and (2), applying the chain rule,
$\Delta w_{ji} = -\eta\,\dfrac{\partial J}{\partial net_j}\,\dfrac{\partial net_j}{\partial w_{ji}}$   (3)
If the activation function is the sigmoid function
$f(net_j) = \dfrac{1}{1 + e^{-net_j}}$
then its derivative is
$f'(net_j) = f(net_j)\,\bigl(1 - f(net_j)\bigr)$   (4)
Since $f(net_j)$ is the neuron output $y_j$, equation (4) can be written as
$\dfrac{\partial y_j}{\partial net_j} = y_j\,(1 - y_j)$   (5)
From equation (3), again using the chain rule,
$\dfrac{\partial J}{\partial net_j} = \dfrac{\partial J}{\partial y_j}\,\dfrac{\partial y_j}{\partial net_j}$   (6)
If, in equation (1), the bias $b_j$ is called $w_{j0}$ (with $x_0 = 1$), equation (1) may be written as
$net_j = \sum_{i=0}^{N} w_{ji}\,x_i$   (7)
so that
$\dfrac{\partial net_j}{\partial w_{ji}} = x_i$   (8)
Substituting equation (5) into (6), and noting from the cost function that $\partial J/\partial y_j = -(t_j - y_j)$, gives
$\dfrac{\partial J}{\partial net_j} = -(t_j - y_j)\,y_j\,(1 - y_j)$   (9)
Putting equations (8) and (9) into (3) gives
$\Delta w_{ji} = \eta\,(t_j - y_j)\,y_j\,(1 - y_j)\,x_i$   (10)
Defining the local error (delta) of neuron $j$ as
$\delta_j = -\dfrac{\partial J}{\partial net_j}$   (11)
the delta of an output-layer neuron becomes
$\delta_j = (t_j - y_j)\,y_j\,(1 - y_j)$   (12)
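As a quick sanity check on equations (9) to (12) (not in the original slides), the sketch below compares the analytic output-layer delta of equation (12) with a finite-difference estimate of $-\partial J/\partial net_j$ for a single sigmoid output neuron; the values of net_j and t_j are arbitrary example numbers.

```python
import numpy as np

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

def J(net, t):
    # cost contribution of one output neuron, J = 1/2 (t - y)^2
    y = sigmoid(net)
    return 0.5 * (t - y) ** 2

net_j, t_j = 0.4, 1.0            # assumed example values
y_j = sigmoid(net_j)

# analytic delta from equation (12)
delta_analytic = (t_j - y_j) * y_j * (1.0 - y_j)

# numerical estimate of -dJ/dnet_j, i.e. equation (11)
eps = 1e-6
delta_numeric = -(J(net_j + eps, t_j) - J(net_j - eps, t_j)) / (2 * eps)

print(delta_analytic, delta_numeric)   # the two values should agree closely
```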
Substituting equations (8) and (11) into (3) gives
$\Delta w_{ji}(kT) = \eta\,\delta_j\,x_i$   (13)
and the new weight is
$w_{ji}\bigl((k+1)T\bigr) = w_{ji}(kT) + \Delta w_{ji}(kT)$   (14)
A general formulation that also considers the previous value of the weight increment is
$\Delta w_{ji}(kT) = \eta\,\delta_j\,x_i + \alpha\,\Delta w_{ji}\bigl((k-1)T\bigr)$   (15)
Calculate $\delta_j$ using equation (12), and adjust the weights using equation (14).
To adjust the weights of the hidden layer ($l = 2$), equation (12) is replaced by
$\delta_j = y_j\,(1 - y_j)\sum_{k} \delta_k\,w_{kj}$   (16)
where the sum runs over the neurons $k$ of the following layer.
The equations that govern the BPA can be summarized as:
Single neuron summation
$net_j = \sum_{i=1}^{N} w_{ji}\,x_i + b_j$   (17)
Sigmoid activation function
$y_j = f(net_j) = \dfrac{1}{1 + e^{-net_j}}$   (18)
Delta rule
$\Delta w_{ji}(kT) = \eta\,\delta_j\,x_i$   (19)
New weight
$w_{ji}\bigl((k+1)T\bigr) = w_{ji}(kT) + \Delta w_{ji}(kT)$   (20)
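To make equations (12) and (16) to (20) concrete, here is a minimal sketch (not taken from the lecture) of one training iteration for a 2-input, 2-hidden, 1-output sigmoid network; the inputs, target, initial weights, biases, and learning rate are assumed for illustration.

```python
import numpy as np

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

# Assumed example data: 2 inputs, 2 hidden neurons, 1 output neuron
x = np.array([1.0, 0.0])           # inputs
t = 1.0                            # target
W1 = np.array([[0.1, 0.2],         # hidden weights, shape (2 hidden, 2 inputs)
               [0.3, 0.1]])
b1 = np.array([0.1, 0.1])          # hidden biases
W2 = np.array([0.2, 0.3])          # output weights
b2 = 0.1                           # output bias
eta = 0.5                          # learning rate

# Forward pass: equations (17) and (18)
h = sigmoid(W1 @ x + b1)           # hidden outputs
y = sigmoid(W2 @ h + b2)           # network output

# Output-layer delta, equation (12); hidden-layer deltas, equation (16)
delta_out = (t - y) * y * (1.0 - y)
delta_hid = h * (1.0 - h) * (delta_out * W2)

# Delta rule and weight update, equations (19) and (20)
W2 += eta * delta_out * h
b2 += eta * delta_out              # bias treated as a weight with input 1, as in (7)
W1 += eta * np.outer(delta_hid, x)
b1 += eta * delta_hid

print("output before update:", y)
```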



MOMENTUM BACKPROPAGATION (MOBP) NEURAL NETWORK
The delta rule given in equation (19) can be modified to include momentum, as indicated in equation (24):
$\Delta w_{ji}(kT) = \eta\,\delta_j\,x_i + \alpha\,\Delta w_{ji}\bigl((k-1)T\bigr)$   (24)
where $\alpha$ is the momentum coefficient.
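A minimal sketch of the momentum update in equation (24); the learning rate, momentum coefficient, local error, input, previous increment, and old weight are all assumed example values.

```python
eta, alpha = 0.5, 0.75            # learning rate and momentum (example values)
delta_j, x_i = 0.12, 1.0          # assumed local error and input
dw_prev = 0.04                    # previous weight increment, delta_w((k-1)T)
w_old = 0.2                       # assumed old weight value

# Equation (24): current delta-rule term plus a fraction of the previous increment
dw = eta * delta_j * x_i + alpha * dw_prev
w_new = w_old + dw                # new weight, as in equation (20)

print(f"increment = {dw:.4f}, new weight = {w_new:.4f}")
```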
Stopping criteria
Training performance can be assessed using
$E = \sum_{i=1}^{P}\sum_{j=1}^{M}\left[t_j - y_j\right]_i^{2}$
where P = number of training patterns, M = number of output units, t = target, and y = actual output; equivalently, the mean square error obtained by dividing E by P can be used.
Training could be stopped when the rate of change of E is small, suggesting convergence.
However, the aim is for new (unseen) patterns to be classified correctly.
Typically, though, the error on the training set will keep decreasing as training continues, while the generalisation error (the error on unseen data) reaches a minimum and then increases again as the model becomes overly complex (overfitting).
A more sophisticated stopping criterion is therefore wanted.
[Figure: error/MSE against training time; the training error keeps falling while the generalisation error reaches a minimum and then rises.]
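One common way to act on this observation is early stopping: keep the weights from the epoch with the lowest generalisation (validation) error and stop once that error has not improved for a while. The sketch below is only an illustration with made-up error curves and an assumed patience value; it is not prescribed by the slides.

```python
# Early-stopping check: training error (falling) and generalisation error
# (falling, then rising) are made-up values mimicking the curves in the figure.
train_err = [0.90, 0.60, 0.45, 0.35, 0.30, 0.27, 0.25, 0.24, 0.23, 0.22]
valid_err = [0.95, 0.70, 0.55, 0.48, 0.46, 0.47, 0.49, 0.52, 0.56, 0.60]

patience = 2                                  # assumed tolerance, in epochs
best, best_epoch = float("inf"), 0
for epoch, (tr, va) in enumerate(zip(train_err, valid_err)):
    print(f"epoch {epoch}: train error {tr:.2f}, generalisation error {va:.2f}")
    if va < best:
        best, best_epoch = va, epoch          # still improving on unseen data
    elif epoch - best_epoch >= patience:      # no improvement for `patience` epochs
        print(f"stop at epoch {epoch}; keep weights from epoch {best_epoch}")
        break
```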
Example 1:
Example 2:
Related to Example 1 above, the number of training patterns is increased as below:
x1   x2   t
1    1    1
0    1    1
Determine the new weights and plot the MSE for 1 iteration.


Example 3:
The feedforward network above is trained with the SDBP learning algorithm, with the initial conditions given below:
a. Determine the new weights for 1 iteration if the network activation function is the sigmoid.
b. Plot the MSE.
(b) Plot the MSE
Answer:
$MSE = \dfrac{1}{P}\sum_{i=1}^{P}\left(t_i - y_i\right)^{2}$, where P = number of training patterns
$MSE = \dfrac{1}{1}\,(1 - 0.5199)^2 = 0.2305$
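A one-line check (outside the slides) of the value above, using the stated output y = 0.5199, target t = 1, and P = 1:

```python
t, y, P = 1.0, 0.5199, 1
mse = (1.0 / P) * (t - y) ** 2     # MSE = (1/P) * sum_i (t_i - y_i)^2, with P = 1
print(round(mse, 4))               # prints 0.2305
```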
Example 4
The feedforward network above is trained with the MOBP learning algorithm, with the same activation function for the hidden and output layer neurons, and has the training patterns below:
x1   x2   x3   t
0    0    0    1
0    0    1    0
activation function $f(net) = \dfrac{1}{1 + e^{-net}}$, learning rate $\eta = 0.5$ and momentum $\alpha = 0.75$
$w_1 = 0.01,\; w_2 = 0.01,\; w_3 = 0.11,\; w_4 = 0.21,\; w_5 = 0.11,\; w_6 = 0.2,\; w_7 = 0.15,\; w_8 = 0.31$
$w_1 = 0.11,\; w_2 = 0.21,\; w_3 = 0.01,\; w_4 = 0.2,\; w_5 = 0.01,\; w_6 = 0.22,\; w_7 = 0.05,\; w_8 = 0.03$
a. Determine the new weights for 2 iterations using the data above.
b. Plot the MSE.
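Since the network figure is not reproduced here, the sketch below only illustrates how such a two-iteration MOBP calculation could be organised. It assumes a 3-input, 2-hidden, 1-output sigmoid network with no biases, assumes that w1 to w6 are the input-to-hidden weights and w7, w8 the hidden-to-output weights, takes the first of the two weight sets listed above as the initial weights, and starts the previous increments at zero; all of these are assumptions, not given in the example.

```python
import numpy as np

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

# Training patterns from the table above (x1, x2, x3 -> t)
X = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
T = np.array([1.0, 0.0])

eta, alpha = 0.5, 0.75

# Assumed wiring: w1..w3 feed hidden neuron 1, w4..w6 feed hidden neuron 2,
# w7 and w8 connect the hidden neurons to the single output (no biases assumed).
W1 = np.array([[0.01, 0.01, 0.11],
               [0.21, 0.11, 0.20]])
W2 = np.array([0.15, 0.31])
dW1_prev = np.zeros_like(W1)            # previous increments start at zero (assumed)
dW2_prev = np.zeros_like(W2)

for iteration in range(2):              # two iterations, as asked
    mse = 0.0
    for x, t in zip(X, T):
        h = sigmoid(W1 @ x)                             # hidden outputs
        y = sigmoid(W2 @ h)                             # network output
        delta_out = (t - y) * y * (1 - y)               # equation (12)
        delta_hid = h * (1 - h) * delta_out * W2        # equation (16)
        dW2 = eta * delta_out * h + alpha * dW2_prev    # equation (24)
        dW1 = eta * np.outer(delta_hid, x) + alpha * dW1_prev
        W2, W1 = W2 + dW2, W1 + dW1                     # equation (20)
        dW2_prev, dW1_prev = dW2, dW1
        mse += (t - y) ** 2
    print(f"iteration {iteration + 1}: MSE = {mse / len(X):.4f}")

print("new weights:", W1, W2)
```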
Comparison between neural networks and fuzzy logic according to structure
A fuzzy logic system enables the inclusion of linguistic knowledge in a systematic way. For adaptive fuzzy logic systems this means that the system's initial parameters can be set very well. In artificial neural networks, the non-transparent network design prevents the inclusion of linguistic knowledge, which is why a random selection of initial parameters is needed, and this prolongs the learning phase.
All parameters of a fuzzy logic system have a physical meaning. There is no such clear connection between inputs, individual parameters, and outputs in artificial neural networks.
From the viewpoint of classical system identification, artificial neural networks belong to the black box class of approaches, while fuzzy logic systems belong to the gray box class.
Only in a few cases do we not have at least basic linguistic knowledge about the system or process available. In such cases it is possible to construct a fuzzy logic system with an adaptive algorithm that functions in the same way as an artificial neural network.
When an artificial neural network and an adaptive fuzzy logic system are used to solve the same tasks, the fuzzy logic system with adaptive parameters is usually significantly less extensive than an equally efficient artificial neural network. It therefore needs less processor time for the same effect, which is extremely important in real-time applications.

