
Artificial Neural Networks

Prajith CA
Associate Professor
ECE, CET.

Contents
• Biologically Inspired
• Brief History and Origins
• Types and topologies of ANN
• Activation Functions
• Perceptron
• Examples
• Backpropagation Neural Network
• Conclusion

Biological neuron
[Figure: two biological neurons, with soma, dendrites, axon, and the synapses between them labelled]

• A neuron has
– A branching input (dendrites)
– A branching output (the axon)
• The information flows from the dendrites to the axon via
the cell body (soma)
• Axon connects to dendrites via synapses
– Synapses vary in strength
– Synapses may be excitatory or inhibitory
• A neural network can be defined as a model of
reasoning based on the human brain. The brain
consists of a densely interconnected set of nerve
cells, or basic information-processing units, called
neurons.
• The human brain incorporates nearly 10 billion
neurons and 60 trillion connections (synapses)
between them. By using multiple neurons
simultaneously, the brain can perform its functions
much faster than the fastest computers in existence
today.

• Each neuron has a very simple structure, but an
army of such elements constitutes a tremendous
processing power.
• A neuron consists of a cell body (soma), a number
of fibers called dendrites, and a single long fiber
called the axon.

Analogy between biological and
artificial neural networks

Biological Neural Network      Artificial Neural Network

Soma                           Neuron
Dendrite                       Input
Axon                           Output
Synapse                        Weight

What is an artificial neuron?
• Definition: a non-linear, parameterized function with a
restricted output range

[Figure: a single artificial neuron — inputs x1, x2, x3 weighted by w1, w2, w3, a bias weight w0, and output y]

y = f( w0 + Σ_{i=1..n} wi xi )
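A minimal Python sketch of this definition (the choice of a sigmoid for f is an assumption; any non-linear squashing function with a restricted output range fits):

    import math

    def sigmoid(v):
        # squashes any real value into the restricted range (0, 1)
        return 1.0 / (1.0 + math.exp(-v))

    def neuron(x, w, w0):
        # weighted sum of the inputs plus the bias weight w0,
        # passed through the non-linear activation function f
        net = w0 + sum(wi * xi for wi, xi in zip(w, x))
        return sigmoid(net)

    # example: three inputs with arbitrary weights
    print(neuron([1.0, 0.5, -1.0], [0.2, -0.4, 0.1], w0=0.05))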
More Complex Model of a Neuron

[Figure: model of a neuron — input signals x1 … xp weighted by the synaptic weights wk1 … wkp of neuron k, a linear combiner (summing function) producing uk, and an activation function φ(·) with threshold θk producing the output yk]
Architecture of a typical artificial neural network

[Figure: a typical artificial neural network — input signals enter the input layer, pass through a middle (hidden) layer, and leave the output layer as output signals]
Learning in Neural Networks

1. Supervised learning (i.e. learning with a teacher)
2. Unsupervised learning (i.e. learning with no help)
Classification and Clustering
• In general, in classification you have a set of predefined classes and
want to know which class a new object belongs to.
• Classification predicts categorical class labels: it constructs a model from a
training set whose records carry known class labels, and then uses that model
to classify new data.

• Clustering tries to group a set of objects and discover whether there are
relationships between them.
• A cluster is a collection of data objects that are similar to one another within
the same cluster and dissimilar to the objects in other clusters.

• In the context of machine learning, classification is supervised learning and
clustering is unsupervised learning.
What are Artificial Neural Networks Used
for?
• Brain modeling
– Models of human development – help children with developmental problems
– Simulations of adult performance – aid our understanding of how the brain works
– Neuropsychological models – suggest remedial actions for brain damaged patients

• Real world applications
– Financial modeling – predicting stocks, shares, currency exchange rates
– Other time series prediction – climate, weather, airline marketing tactician
– Computer games – intelligent agents, backgammon, first person shooters
– Control systems – autonomous adaptable robots, microwave controllers
– Pattern recognition – speech recognition, handwriting recognition, sonar signals
– Data analysis – data compression, data mining
– Noise reduction – function approximation, ECG noise reduction
– Bioinformatics – protein secondary structure, DNA sequencing
A Brief History
• 1943 McCulloch and Pitts proposed the McCulloch-Pitts neuron model

• 1949 Hebb published his book The Organization of Behavior, in which the Hebbian learning rule was
proposed.

• 1958 Rosenblatt introduced the simple single layer networks now called Perceptrons.

• 1969 Minsky and Papert’s book Perceptrons demonstrated the limitation of single layer perceptrons, and
almost the whole field went into hibernation.

• 1982 Hopfield published a series of papers on Hopfield networks.

• 1982 Kohonen developed the Self-Organizing Maps that now bear his name.

• 1986 The Back-Propagation learning algorithm for Multi-Layer Perceptrons was re-discovered and the
whole field took off again.

• 1990s The sub-field of Radial Basis Function Networks was developed.

• 2000s The power of Ensembles of Neural Networks and Support Vector Machines becomes apparent.

ANN Topologies
• Mathematically, ANNs can be represented as weighted directed graphs.
For our purposes, we can simply think in terms of activation flowing
between processing units via one-way connections
– Single-Layer Feed-forward NNs One input layer and one output layer of
processing units. No feed-back connections. (For example, a simple
Perceptron.)

– Multi-Layer Feed-forward NNs One input layer, one output layer, and one or
more hidden layers of processing units. No feed-back connections. The hidden
layers sit in between the input and output layers, and are thus hidden from
the outside world. (For example, a Multi-Layer Perceptron.)

– Recurrent NNs Any network with at least one feed-back connection. It may, or
may not, have hidden units. (For example, a Simple Recurrent Network.)

ANN Topologies

[Figure: examples of the three topologies — single-layer feed-forward, multi-layer feed-forward, and recurrent networks]
Types of Neural Networks

Neural Network types can be classified based on the following attributes:

• Applications
-Classification
-Clustering
-Function approximation
-Prediction
• Connection Type
- feedforward
- feedback
• Topology
- Single layer
- Multilayer
- Recurrent

• Learning Methods
- Supervised
- Unsupervised

Activation functions of a neuron

[Figure: graphs of the activation functions]
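As a sketch in Python (which specific functions the original figure showed is not preserved in this extract; step, sign, sigmoid, and linear are typical choices):

    import math

    def step(v, theta=0.0):
        # hard limiter: 1 at or above the threshold, otherwise 0
        return 1 if v >= theta else 0

    def sign(v):
        # bipolar hard limiter: +1 or -1
        return 1 if v >= 0 else -1

    def sigmoid(v):
        # smooth, differentiable squashing function with range (0, 1)
        return 1.0 / (1.0 + math.exp(-v))

    def linear(v):
        # identity: the output equals the net input
        return v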
PERCEPTRON

Bias Node (Default Activation)
• Useful to allow nodes to have a default activation.
• Guarantees that every receiving node has some
input even if all other nodes are off.
• Since the output of the bias node is always 1.0, the
input it sends to any other node is 1.0 * wij (the
value of the weight itself).
• Only one bias node is needed per network.
• Useful to allow individual nodes to have
different defaults.
Perceptron

Perceptron Algorithm
• Step 0: Initialize weights and bias
– For simplicity, set weights and bias to zero
– Set learning rate α (0 < α ≤ 1)
• Step 1: While stopping condition is false, do
steps 2-6
• Step 2: For each training pair s:t, do steps 3-5
• Step 3: Set activations of input units:
xi = si
Perceptron Algorithm
• Step 4: Compute response of output unit:

y_in = b + Σ_i xi wi

y =  1   if y_in > θ
     0   if -θ ≤ y_in ≤ θ
    -1   if y_in < -θ
Perceptron Algorithm
• Step 5: Update weights and bias if an error occurred
for this pattern
if y != t
wi(new) = wi(old) + α t xi
b(new) = b(old) + α t
else
wi(new) = wi(old)
b(new) = b(old)

• Step 6: Test stopping condition
– If no weights changed in Step 2, stop; else, continue
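A minimal Python sketch of Steps 0-6 (variable names are illustrative; the output unit uses the three-valued activation with threshold θ from Step 4):

    def perceptron_train(patterns, alpha=1.0, theta=0.0, max_epochs=100):
        # patterns: list of (x, t) pairs; x is a tuple of inputs, t the bipolar target
        n = len(patterns[0][0])
        w = [0.0] * n                 # Step 0: initialise weights ...
        b = 0.0                       #         ... and bias to zero
        for epoch in range(max_epochs):                    # Step 1
            changed = False
            for x, t in patterns:                          # Steps 2-3
                y_in = b + sum(wi * xi for wi, xi in zip(w, x))   # Step 4
                if y_in > theta:
                    y = 1
                elif y_in < -theta:
                    y = -1
                else:
                    y = 0
                if y != t:                                 # Step 5: update on error
                    w = [wi + alpha * t * xi for wi, xi in zip(w, x)]
                    b = b + alpha * t
                    changed = True
            if not changed:                                # Step 6: stopping condition
                return w, b, epoch + 1
        return w, b, max_epochs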

Convergence of Perceptron Learning
• The weight changes Δwij need to be applied
repeatedly – for each weight wij in the network, and
for each training pattern in the training set. One pass
through all the weights for the whole training set is
called one epoch of training.

• Eventually, usually after many epochs, when all the
network outputs match the targets for all the training
patterns, all the Δwij will be zero and the process of
training will cease. We then say that the training
process has converged to a solution.
Convergence of Perceptron Learning
• It can be shown that if there does exist a possible set
of weights for a Perceptron which solves the given
problem correctly, then the Perceptron Learning Rule
will find them in a finite number of iterations.

• Moreover, it can be shown that if a problem is
linearly separable, then the Perceptron Learning Rule
will find a set of weights in a finite number of
iterations that solves the problem correctly.
AND function: Bipolar input and targets

Input (x1 x2 1) | Net y_in | Out y | Target t | Δw1 Δw2 Δb | w1 w2  b
                |          |       |          |            |  0  0  0
 1   1   1      |    0     |   0   |    1     |  1   1   1 |  1  1  1
 1  -1   1      |    1     |   1   |   -1     | -1   1  -1 |  0  2  0
-1   1   1      |    2     |   1   |   -1     |  1  -1  -1 |  1  1 -1
-1  -1   1      |   -3     |  -1   |   -1     |  0   0   0 |  1  1 -1
 1   1   1      |    1     |   1   |    1     |  0   0   0 |  1  1 -1
 1  -1   1      |   -1     |  -1   |   -1     |  0   0   0 |  1  1 -1
-1   1   1      |   -1     |  -1   |   -1     |  0   0   0 |  1  1 -1
-1  -1   1      |   -3     |  -1   |   -1     |  0   0   0 |  1  1 -1

Update rule (α = 1, θ = 0):
  if y != t:  wi(new) = wi(old) + α t xi,  b(new) = b(old) + α t
  else:       wi(new) = wi(old),           b(new) = b(old)

Activation:
  y_in = b + Σ_i xi wi
  y = 1 if y_in > θ;   0 if -θ ≤ y_in ≤ θ;   -1 if y_in < -θ
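Running the sketch from the algorithm slides on the bipolar AND patterns reproduces this table: training stops after the error-free second epoch with w1 = w2 = 1 and b = -1.

    and_bipolar = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
    w, b, epochs = perceptron_train(and_bipolar, alpha=1.0, theta=0.0)
    print(w, b, epochs)   # [1.0, 1.0] -1.0 2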
AND function: Binary input and Bipolar targets

Input (x1 x2 1) | Net y_in | Out y | Target t | Δw1 Δw2 Δb | w1 w2  b
                |          |       |          |            |  0  0  0
 1   1   1      |    0     |   0   |    1     |  1   1   1 |  1  1  1
AND function: Binary input and Bipolar targets

Input (x1 x2 1) | Net y_in | Out y | Target t | Δw1 Δw2 Δb | w1 w2  b
                |          |       |          |            |  1  1  1
 1   0   1      |    2     |   1   |   -1     | -1   0  -1 |  0  1  0
AND function: Binary input and Bipolar targets (1st Epoch)

α = 1, θ = 0.2

Input (x1 x2 1) | Net y_in | Out y | Target t | Δw1 Δw2 Δb | w1 w2  b
                |          |       |          |            |  0  0  0
 1   1   1      |    0     |   0   |    1     |  1   1   1 |  1  1  1
 1   0   1      |    2     |   1   |   -1     | -1   0  -1 |  0  1  0
 0   1   1      |    1     |   1   |   -1     |  0  -1  -1 |  0  0 -1
 0   0   1      |   -1     |  -1   |   -1     |  0   0   0 |  0  0 -1
AND function: Binary input and Bipolar targets (After 2nd Epoch)

Input (x1 x2 1) | Net y_in | Out y | Target t | Δw1 Δw2 Δb | w1 w2  b
                |          |       |          |            |  0  0 -1
 1   1   1      |   -1     |  -1   |    1     |  1   1   1 |  1  1  0
 1   0   1      |    1     |   1   |   -1     | -1   0  -1 |  0  1 -1
 0   1   1      |    0     |   0   |   -1     |  0  -1  -1 |  0  0 -2
 0   0   1      |   -2     |  -1   |   -1     |  0   0   0 |  0  0 -2
AND function: Binary input and Bipolar targets (After 10th Epoch)

Input (x1 x2 1) | Net y_in | Out y | Target t | Δw1 Δw2 Δb | w1 w2  b
                |          |       |          |            |  2  3 -4
 1   1   1      |    1     |   1   |    1     |  0   0   0 |  2  3 -4
 1   0   1      |   -2     |  -1   |   -1     |  0   0   0 |  2  3 -4
 0   1   1      |   -1     |  -1   |   -1     |  0   0   0 |  2  3 -4
 0   0   1      |   -4     |  -1   |   -1     |  0   0   0 |  2  3 -4
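With binary inputs, bipolar targets, and θ = 0.2, the same sketch (presenting the patterns in the order used above) takes longer and stops after the 10th, error-free, epoch with the weights shown in the table:

    and_binary = [((1, 1), 1), ((1, 0), -1), ((0, 1), -1), ((0, 0), -1)]
    w, b, epochs = perceptron_train(and_binary, alpha=1.0, theta=0.2)
    print(w, b, epochs)   # [2.0, 3.0] -4.0 10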
Linear separability in the perceptrons
[Figure: (a) a two-input perceptron separates class A1 from class A2 with the decision boundary x1 w1 + x2 w2 - θ = 0 in the (x1, x2) plane; (b) a three-input perceptron separates the classes with the plane x1 w1 + x2 w2 + x3 w3 - θ = 0 in (x1, x2, x3) space]
Overview and Review
• Neural network classifiers learn decision boundaries from
training data

• Simple Perceptrons can only cope with linearly separable
problems

• Trained networks are expected to generalize, i.e. deal
appropriately with input data they were not trained on

• One can train networks by iteratively updating their weights

• The Perceptron Learning Rule will find weights for linearly
separable problems in a finite number of iterations.
Derivations
• Delta rule for a single output unit
– The delta rule changes the weights of the
connections so as to minimize the difference between
the net input to the output unit (y_in) and the target (t)
– It does this by reducing the error for each training
pattern, one pattern at a time
– The delta rule for the Ith weight (for each pattern) is
ΔwI = α (t – y_in) xI
Derivations
• The squared error for a particular training pattern is
E = (t – y_in)².
E is a function of all the weights wi, i = 1, …, n

• The gradient of E is the vector consisting of the partial
derivatives ∂E/∂wi of E with respect to each of the weights

• The gradient gives the direction of most rapid increase in E;
the opposite direction gives the most rapid decrease in the error

• The error can therefore be reduced by adjusting the weight wI
in the direction of  -∂E/∂wI
Derivations
• Since  y_in = Σ_i xi wi ,

∂E/∂wI = -2 (t – y_in) ∂(y_in)/∂wI
        = -2 (t – y_in) xI

• The local error will therefore be reduced most rapidly
by adjusting the weights according to the delta rule:
ΔwI = α (t – y_in) xI
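A minimal sketch of one delta-rule step for a single linear output unit (names are my own; the constant factor 2 from the derivative is absorbed into the learning rate α, as is conventional):

    def delta_rule_step(x, t, w, b, alpha=0.1):
        # one pattern, one output unit: gradient-descent step on E = (t - y_in)^2
        y_in = b + sum(wi * xi for wi, xi in zip(w, x))
        error = t - y_in
        w = [wi + alpha * error * xi for wi, xi in zip(w, x)]   # Δw_I = α (t - y_in) x_I
        b = b + alpha * error        # bias treated as a weight on a constant input of 1
        return w, b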
Derivations
• Delta rule for multiple output units
– The delta rule for the weight from the Ith input to the
Jth output unit (for each pattern) is
ΔwIJ = α (tJ – y_inJ) xI
Derivations
• The squared error for a particular training pattern is
E = Σ_{j=1}^{m} (tj – y_inj)².
E is a function of all the weights wIJ

• The error can be reduced by adjusting the weight wIJ in
the direction of  -∂E/∂wIJ , where
∂E/∂wIJ = ∂/∂wIJ Σ_{j=1}^{m} (tj – y_inj)²
        = ∂/∂wIJ (tJ – y_inJ)²
(only the Jth output unit depends on wIJ)
Derivations
• Since  y_inJ = Σ_i xi wiJ ,

∂E/∂wIJ = -2 (tJ – y_inJ) ∂(y_inJ)/∂wIJ
        = -2 (tJ – y_inJ) xI

• The local error will therefore be reduced most rapidly
by adjusting the weights according to the delta rule:
ΔwIJ = α (tJ – y_inJ) xI
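The same step written for several output units (a sketch; W[J] holds the weight vector of output unit J and b[J] its bias):

    def delta_rule_step_multi(x, t, W, b, alpha=0.1):
        # W[J][I]: weight from input I to output unit J
        for J in range(len(W)):
            y_in_J = b[J] + sum(wi * xi for wi, xi in zip(W[J], x))
            error = t[J] - y_in_J
            W[J] = [wi + alpha * error * xi for wi, xi in zip(W[J], x)]  # Δw_IJ = α (t_J - y_in_J) x_I
            b[J] = b[J] + alpha * error
        return W, b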
Backpropagation neural network

BPN

Backpropagation neural network with
one hidden layer

BP Nets

• Architecture:

– Multi-layer
– Feed-forward
(full connection between nodes in adjacent layers,
no connection within a layer)
– One or more hidden layers with non-linear
activation function
(most commonly used are sigmoid functions)

Summary of BP Nets

• Back-Propagation learning algorithm:

– Supervised learning
– Approach: gradient descent to reduce the total
error (it is also called the generalized delta rule)
– Error terms are computed at the output nodes and then
propagated back to compute error terms at the hidden
nodes (which is why it is called error back-propagation)
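A minimal sketch of one back-propagation training step for a network with one hidden layer of sigmoid units (plain Python; bias terms are omitted to keep the sketch short, and the variable names are my own):

    import math

    def sigmoid(v):
        return 1.0 / (1.0 + math.exp(-v))

    def train_step(x, t, V, W, alpha=0.5):
        # V[j][i]: weight from input i to hidden unit j
        # W[k][j]: weight from hidden unit j to output unit k
        # forward pass
        z = [sigmoid(sum(V[j][i] * x[i] for i in range(len(x)))) for j in range(len(V))]
        y = [sigmoid(sum(W[k][j] * z[j] for j in range(len(z)))) for k in range(len(W))]
        # error terms at the output nodes
        d_out = [(t[k] - y[k]) * y[k] * (1.0 - y[k]) for k in range(len(y))]
        # error terms at the hidden nodes, back-propagated through W
        d_hid = [z[j] * (1.0 - z[j]) * sum(d_out[k] * W[k][j] for k in range(len(W)))
                 for j in range(len(z))]
        # gradient-descent weight updates (generalized delta rule)
        for k in range(len(W)):
            for j in range(len(z)):
                W[k][j] += alpha * d_out[k] * z[j]
        for j in range(len(V)):
            for i in range(len(x)):
                V[j][i] += alpha * d_hid[j] * x[i]
        return y

Calling train_step repeatedly over the whole training set, epoch after epoch, until the total error is acceptably small constitutes the training loop.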
