
Artificial Neural Networks

Sargur Srihari


Role of ANNs in Pattern Recognition


Generalization of linear discriminant functions, which have limited capability
SVMs need a proper choice of kernel functions
Three- and four-layer nets overcome the drawbacks of two-layer networks


Biological Neural Network

Neuron, a cell

[Figure: a biological neuron, with its axon, dendrites, and a synapse labeled.]

Idealization of a Neuron


Common ANN


Decision Boundaries of ANNs

[Figure: decision boundaries; one panel labeled "Linear", one labeled "Arbitrary".]


ANN for XOR


[Figure: a network computing XOR. Each of the two input units connects to the output unit with weight +1 and to a hidden unit with weight +1; the hidden unit (threshold 1.5) connects to the output unit (threshold 0.5) with weight -2.]
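A quick check of the weights in the figure, assuming hard-threshold units that fire when the net input exceeds the threshold (the code and names are a sketch, not from the slides):

def step(net, threshold):
    # Hard-threshold unit: fires iff the net input exceeds the threshold.
    return 1 if net > threshold else 0

def xor_net(x1, x2):
    # Hidden unit: weights +1, +1 from the inputs, threshold 1.5 (fires only on input 1,1).
    h = step(x1 + x2, 1.5)
    # Output unit: weights +1, +1 from the inputs, -2 from the hidden unit, threshold 0.5.
    return step(x1 + x2 - 2 * h, 0.5)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, '->', xor_net(x1, x2))   # prints the XOR truth table: 0, 1, 1, 0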

4-2-4 Encoder ANN


[Figure: a 4-2-4 encoder network: four input units, two hidden units, four output units, and a bias unit.]

INPUT PATTERNS    HIDDEN UNIT OUTPUTS    ACTUAL OUTPUTS
1 0 0 0           0.03 0.97              0.91 0.10 0.00 0.07
0 1 0 0           0.98 0.96              0.07 0.88 0.06 0.00
0 0 1 0           0.91 0.02              0.00 0.10 0.91 0.06
0 0 0 1           0.03 0.07              0.07 0.00 0.09 0.90
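A minimal numpy sketch of the same experiment, using the back-propagation rules derived later in this deck; the seed, the step size eta = 2.0, and the iteration count are assumptions, not values from the slide. With most seeds the two hidden units settle into distinct near-binary codes like those in the table:

import numpy as np

rng = np.random.default_rng(0)
sig = lambda z: 1.0 / (1.0 + np.exp(-z))

X = np.eye(4)                        # the four one-hot input patterns (also the targets)
W1 = rng.normal(0, 0.5, (4, 2)); b1 = np.zeros(2)   # input -> hidden weights, bias
W2 = rng.normal(0, 0.5, (2, 4)); b2 = np.zeros(4)   # hidden -> output weights, bias

eta = 2.0                            # step size (assumed)
for _ in range(5000):
    H = sig(X @ W1 + b1)             # hidden-unit activities
    Y = sig(H @ W2 + b2)             # output-unit activities
    dY = (Y - X) * Y * (1 - Y)       # output deltas for squared error with sigmoid units
    dH = (dY @ W2.T) * H * (1 - H)   # deltas back-propagated to the hidden layer
    W2 -= eta * H.T @ dY; b2 -= eta * dY.sum(axis=0)
    W1 -= eta * X.T @ dH; b1 -= eta * dH.sum(axis=0)

print(np.round(sig(X @ W1 + b1), 2))                     # hidden codes, one row per pattern
print(np.round(sig(sig(X @ W1 + b1) @ W2 + b2), 2))      # reconstructions, near the identity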

Simple three-layer ANN


Sigmoid Activation Function


The sigmoid function, defined by

    y = 1/(1 + e^(-x))

produces almost the same output as an ordinary threshold (a step function) but is mathematically simpler. Its derivative is

    dy/dx = y(1 - y)

[Figure: plot of the sigmoid for x from -10 to +10; y passes through 0.5 at x = 0.]
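A small numeric check of the derivative identity; the function names and test points are my own:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

xs = np.linspace(-10, 10, 9)
y = sigmoid(xs)
analytic = y * (1 - y)                                      # dy/dx = y(1 - y)
numeric = (sigmoid(xs + 1e-6) - sigmoid(xs - 1e-6)) / 2e-6  # central difference
print(np.allclose(analytic, numeric))                       # True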

2-4-1 ANN


The Back-Propagation Algorithm


To train a neural network to perform some
task, we must adjust the weights of each
unit in such a way that the error between
the desired output and the actual output
is reduced. This process requires that
the neural network compute the error
derivative of the weights (EW). In other
words, it must calculate how the error
changes as each weight is increased or
decreased slightly. The back-propagation
algorithm is the most widely used method
for determining EW.

A fully connected network


Implementing the Back-Propagation Algorithm

Assume unit j is a typical unit in the output layer and unit i is a typical unit in the previous layer. A unit in the output layer determines its activity by following a two-step procedure. First, it computes the total weighted input net_j, using the formula

    net_j = Σ_i x_i w_ij

where x_i is the activity level of the ith unit in the previous layer and w_ij is the weight of the connection between the ith and jth units.

Implementing the Back-Propagation Algorithm

Next, the unit calculates the activity x_j using some function of the total weighted input. Typically, we use the sigmoid function:

    x_j = 1/(1 + e^(-net_j))
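The two-step procedure as a short Python sketch; the input activities and weights are made-up values:

import numpy as np

def unit_activity(x, w):
    # Step 1: total weighted input net_j = sum_i x_i w_ij.
    net_j = np.dot(x, w)
    # Step 2: sigmoid of the total weighted input.
    return 1.0 / (1.0 + np.exp(-net_j))

x = np.array([0.5, 0.1, 0.9])    # activities of the previous layer (illustrative)
w = np.array([0.4, -0.6, 0.2])   # weights into unit j (illustrative)
print(unit_activity(x, w))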

Calculating Error E

Once the activities of all the output units have been determined, the network computes the error E:

    E = (1/2) Σ_j (x_j - t_j)^2

where x_j is the activity level of the jth unit in the top layer and t_j is the desired target output of the jth unit.

Four Steps to Calculating the Back-Propagation Algorithm

1. Compute how fast the error changes as the activity of an output unit is changed. The error derivative (EA) is the difference between the actual and the desired activity:

    EA_j = ∂E/∂x_j = x_j - t_j

Four Steps to Calculating the Back-Propagation Algorithm

2. Compute how fast the error changes as the total input received by an output unit is changed. This quantity (EI) is the answer from step 1 multiplied by the rate at which the output of the unit changes as its total input is changed:

    EI_j = ∂E/∂net_j = (∂E/∂x_j)(dx_j/dnet_j) = EA_j x_j (1 - x_j)

Four Steps to Calculating the Back-Propagation Algorithm

3. Compute how fast the error changes as a weight on the connection into an output unit is changed. This quantity (EW) is the answer from step 2 multiplied by the activity level of the unit from which the connection emanates:

    EW_ij = ∂E/∂w_ij = (∂E/∂net_j)(∂net_j/∂w_ij) = EI_j x_i

Four Steps to Calculating the Back-Propagation Algorithm

4. Compute how fast the error changes as the activity of a unit in the previous layer is changed. This crucial step allows back-propagation to be applied to multi-layer networks. When the activity of a unit in the previous layer changes, it affects the activities of all output units to which it is connected. So to compute the overall effect on the error, we add together all these separate effects on output units. But each effect is simple to calculate: it is the answer in step 2 multiplied by the weight of the connection to that output unit.

    EA_i = ∂E/∂x_i = Σ_j (∂E/∂net_j)(∂net_j/∂x_i) = Σ_j EI_j w_ij
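The four steps side by side in numpy for a made-up layer of two output units, with a finite-difference check on one weight; all values and names here are illustrative:

import numpy as np

sig = lambda z: 1.0 / (1.0 + np.exp(-z))

x_prev = np.array([0.3, 0.8])               # activities x_i of the previous layer
W = np.array([[0.5, -0.4], [0.1, 0.7]])     # weights w_ij, indexed [i, j]
t = np.array([1.0, 0.0])                    # desired outputs t_j

x_out = sig(x_prev @ W)                     # x_j = sigmoid(net_j)
EA = x_out - t                              # step 1: dE/dx_j
EI = EA * x_out * (1 - x_out)               # step 2: dE/dnet_j
EW = np.outer(x_prev, EI)                   # step 3: dE/dw_ij = EI_j x_i
EA_prev = W @ EI                            # step 4: dE/dx_i = sum_j EI_j w_ij

# Check step 3 against a finite difference on E = 1/2 sum_j (x_j - t_j)^2.
def E(Wp):
    return 0.5 * np.sum((sig(x_prev @ Wp) - t) ** 2)

Wp = W.copy(); Wp[0, 1] += 1e-6
print(EW[0, 1], (E(Wp) - E(W)) / 1e-6)      # the two numbers agree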

The Back-Propagation Algorithm


Conclusion
By using steps 2 and 4, we can convert
the EAs of one layer of units into EAs
for the previous layer. This procedure
can be repeated to get the EAs for as
many previous layers as desired. Once we
know the EA of a unit, we can use steps 2
and 3 to compute the EWs on its incoming
connections.


The Back-Propagation Algorithm


To train a neural network: adjust the weights of
each unit such that the error between the
desired output and the actual output is
reduced
Process requires that the neural network
compute the error derivative of the weights
(EW)
it must calculate how the error changes as each weight is increased or decreased slightly

Implementing the Back-Propagation Algorithm

Describe the neural network in mathematical terms:
Assume that unit j is a typical unit in the output layer and unit i is a typical unit in the previous layer
A unit in the output layer determines its activity by following a two-step procedure

Implementing the Back-Propagation Algorithm

Step 1: The unit computes the total weighted input net_j, using the formula

    net_j = Σ_i x_i w_ij

where
x_i = activity level of the ith unit in the previous layer
w_ij = weight of the connection between the ith and jth units

Implementing the Back-Propagation Algorithm

Step 2: The unit calculates the activity x_j using some function of the total weighted input
Usually the sigmoid function is used:

    x_j = 1/(1 + e^(-net_j))

Calculating Error E

Once the activities of all the output units have been determined, the network computes the error E, which is defined by the expression

    E = (1/2) Σ_j (x_j - t_j)^2

where
x_j = activity level of the jth unit in the top layer
t_j = desired output of the jth unit

4 Steps to Calculating the Back-Propagation Algorithm

Step 1: Compute how fast the error changes as the activity of an output unit is changed
The error derivative (EA) is the difference between the actual and the desired activity

    EA_j = ∂E/∂x_j = x_j - t_j

4 Steps to Calculating the Back-Propagation Algorithm

Step 2: Compute how fast the error changes as the total input received by an output unit is changed
The quantity (EI) is the answer from step 1 multiplied by the rate at which the output of a unit changes as its total input is changed

    EI_j = ∂E/∂net_j = (∂E/∂x_j)(dx_j/dnet_j) = EA_j x_j (1 - x_j)

4 Steps to Calculating the Back-Propagation Algorithm

Step 3: Compute how fast the error changes as a weight on the connection into an output unit is changed

    EW_ij = ∂E/∂w_ij = (∂E/∂net_j)(∂net_j/∂w_ij) = EI_j x_i
          = EA_j x_j (1 - x_j) x_i = (x_j - t_j) x_j (1 - x_j) x_i

4 Steps to Calculating the Back-Propagation Algorithm

Step 4: Compute how fast the error changes as the activity of a unit in the previous layer is changed
Allows back-propagation to be applied to multi-layer networks
When the activity of a unit in the previous layer changes, it affects the activities of all output units to which it is connected
To compute the overall effect on the error, add together all the separate effects on output units

    EA_i = ∂E/∂x_i = Σ_j (∂E/∂net_j)(∂net_j/∂x_i) = Σ_j EI_j w_ij

The Back-Propagation Algorithm


Conclusion
By using steps 2 and 4, we can convert the
EAs of one layer of units into EAs for the
previous layer.
This procedure can be repeated to get the
EAs for as many previous layers as desired.
Once we know the EA of a unit, we can use steps 2 and 3 to compute the EWs on its incoming connections.

Training an Artificial Neural Network

A network learns by successive repetitions of a problem, making a smaller error with each iteration
A commonly used error function is the sum of the squared errors of the output units:

    E = (1/2) Σ_i (x_i - t_i)^2

where x_i = 1/(1 + e^(-net_i)) is the activity of output unit i and t_i is the desired output of unit i


Training an Artificial Neural Network

To minimize the error, take the derivative of the error with respect to w_ij, the weight between units i and j:

    ∂E/∂w_ij = x_i x_j (1 - x_j) δ_j

where
δ_j = (x_j - t_j) for output units, and
δ_j = Σ_k w_jk x_k (1 - x_k) δ_k for hidden units (k ranges over the units in the next layer to which unit j is connected)

Training an Artificial Neural Network

For the links going into the output units, the derivative can be computed directly
For hidden units, the derivative depends on values calculated at all the layers that come after it
i.e., the values must be back-propagated through the network to calculate the derivatives

Training an Artificial Neural Network

Using these equations, we can state the back-propagation algorithm as follows:

Choose a step size η (used to update the weights)
Until the network is trained:
    For each sample pattern:
        Do a forward pass through the net, producing an output pattern
        For all output units, calculate δ_j = (x_j - t_j)
        For all other units (from the last layer to the first), calculate δ_j = Σ_k w_jk x_k (1 - x_k) δ_k, using the δs already computed for the layer after it
        Update each weight by Δw_ij = -η x_i x_j (1 - x_j) δ_j
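A direct transcription of this algorithm into numpy for a 2-2-1 network on XOR; the architecture, seed, step size eta = 0.5, and iteration count are my own choices, not from the slides:

import numpy as np

rng = np.random.default_rng(1)
sig = lambda z: 1.0 / (1.0 + np.exp(-z))

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])   # sample patterns
T = np.array([0., 1., 1., 0.])                           # XOR targets

W1 = rng.normal(0, 1, (2, 2)); b1 = np.zeros(2)          # input -> hidden
W2 = rng.normal(0, 1, (2, 1)); b2 = np.zeros(1)          # hidden -> output

eta = 0.5                                                # step size (assumed)
for _ in range(20000):                                   # "until the network is trained"
    for x, t in zip(X, T):                               # for each sample pattern
        h = sig(x @ W1 + b1)                             # forward pass ...
        y = sig(h @ W2 + b2)                             # ... producing an output pattern
        d_out = y - t                                    # output units: delta_j = x_j - t_j
        d_hid = W2 @ (y * (1 - y) * d_out)               # hidden: sum_k w_jk x_k (1 - x_k) delta_k
        # Weight updates: delta_w_ij = -eta x_i x_j (1 - x_j) delta_j.
        W2 -= eta * np.outer(h, y * (1 - y) * d_out); b2 -= eta * y * (1 - y) * d_out
        W1 -= eta * np.outer(x, h * (1 - h) * d_hid); b1 -= eta * h * (1 - h) * d_hid

print(np.round(sig(sig(X @ W1 + b1) @ W2 + b2).ravel(), 2))   # typically close to [0, 1, 1, 0]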
