An Introduction to
Artificial Neural Networks
TERM PAPER DOCUMENT
B.TECH 6TH SEMESTER
PAPER CSEB 605(P)
Kaushik Bose
13/06/2014
TABLE OF CONTENTS
Introduction
    Basics
    Networks
    Why Neural Networks?
        Technical Viewpoint
        Biological Viewpoint
Biological Neural Networks
    Characteristics that ANN Share with Biological Neural Systems
Artificial Neural Networks
    What Is a Neural Network?
    Formal Definition of an Artificial Neural Network
    Characterization of ANN
    A General Framework for ANN Models
    Neurons: The Basic Computational Entities
    The Perceptron and Linear Separability
        Perceptron for Classification
        Limitations of the Perceptron
Artificial Neural Network Architecture or Topology
    Architecture Based on Number of Layers
        Single-Layer Neural Network
        Multilayer Neural Network
    Architecture Based on the Connection Pattern
        Totally Connected Neural Network
        Partially Connected Neural Network
    Architecture Based on Information Flow
        Feed-Forward Neural Network
        Feed-Back or Recurrent Neural Network
ANN Learning Process
    Supervised Learning
    Reinforcement Learning
    Unsupervised Learning
    Back Propagation
    Learning Laws
        Hebb's Rule
        Hopfield Law
INTRODUCTION
BASICS
The great majority of digital computers in use today are based around the
principle of using one very powerful processor through which all computations
are channelled. This is the so called von Neumann architecture, after John von
Neumann, one of the pioneers of modern computing. The power of such a
processor can be measured in terms of its speed (number of instructions that it
can execute in a unit of time) and complexity (the number of different
instructions that it can execute).
Nowadays there is a new field of computational science that integrates
different methods of solving problems that cannot easily be described
with a traditional algorithmic approach. These methods, in one way or
another, have their origin in the more or less faithful emulation of the
behaviour of biological systems.
It is a new way of computing, known as Artificial Intelligence, which
through different methods is capable of managing the imprecision and
uncertainty that appear when trying to solve problems related to the real
world, offering robust solutions that are easy to implement. One of these
techniques is Artificial Neural Networks (ANN), inspired by the
functioning of the human brain.
NETWORKS
A network consists of nodes joined by links that carry the information flow
between nodes. Links can be unidirectional, when the information flows in
only one direction, and bidirectional, when the information flows in either
direction.
Networks are used to model a wide range of phenomena in physics,
computer science, biochemistry, ethology, mathematics, sociology, economics,
telecommunications, and many other areas. This is because many systems can
be seen as a network: proteins, computers, communities, etc.
WHY NEURAL NETWORKS?
The basic unit of a neural network, the artificial neuron, simulates the four basic
functions of a natural neuron: it receives inputs from other sources, combines them in
some way, performs a generally nonlinear operation on the result, and then outputs
the final result. Artificial neurons are much simpler than biological neurons. Here
we identify three basic elements of the artificial neuron model:
A set of synapses or connecting links, each of which is characterized by a weight
or strength of its own. Specifically, a signal xj at the input of synapse j connected
to neuron k is multiplied by the synaptic weight wj. Unlike a synapse in the brain,
the synaptic weight of an artificial neuron may lie in a range that includes
negative as well as positive values.
An adder for summing the input signals, weighted by the respective synapses
of the neuron; the operations described here constitute a linear combiner.
An activation function for limiting the amplitude of the output of a neuron. The
activation function is also referred to as a squashing function in that it squashes
(limits) the permissible amplitude range of the output signal to some finite
value.
The neuronal model of Figure 1 also includes an externally applied bias, denoted by b.
The bias b has the effect of increasing or lowering the net input of the activation
function, depending on whether it is positive or negative, respectively.
In mathematical terms, we may describe a neuron by writing the following pair of
equations:

    u = w1x1 + w2x2 + … + wmxm

and

    y = φ(u + b)

where x1, x2, …, xm are the input signals; w1, w2, …, wm are the synaptic weights
of the neuron; u is the linear combiner output due to the input signals; b is the bias;
φ(·) is the activation function; and y is the output signal of the neuron. The use of
the bias b has the effect of applying an affine transformation to the output u of the
linear combiner in the model of Figure 1:

    v = u + b

where v is called the induced local field of the neuron.
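The pair of equations above maps directly onto a few lines of code. The sketch below is illustrative rather than taken from the paper; the logistic (sigmoid) squashing function is assumed here as one common choice for φ.

```python
import math

def neuron_output(x, w, b):
    """Compute y = phi(u + b), where u = sum_j w_j * x_j.

    x: input signals x_1 .. x_m
    w: synaptic weights w_1 .. w_m
    b: externally applied bias
    """
    u = sum(wj * xj for wj, xj in zip(w, x))  # linear combiner output
    v = u + b                                 # induced local field
    return 1.0 / (1.0 + math.exp(-v))         # squashing (activation) function

# When the induced local field v is zero, the sigmoid returns exactly 0.5
print(neuron_output([1.0, 2.0], [0.5, -0.25], 0.0))  # -> 0.5
```

The sigmoid limits the permissible amplitude range of the output to (0, 1), exactly the "squashing" behaviour described above.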
THE PERCEPTRON AND LINEAR SEPARABILITY
The Perceptron was the first supervised artificial neural network model. In
1958 Frank Rosenblatt proposed the Perceptron model, which can also be used as a
pattern classifier. The single-layer perceptron consists of one layer of binary
input units and one layer of binary output units. There are no hidden layers, and
therefore there is only one layer of modifiable weights.
A perceptron uses a step function that returns +1 if the weighted sum of its
inputs is greater than or equal to 0, and -1 otherwise:

    φ(v) = +1 if v ≥ 0
           -1 if v < 0
[Figure: a two-input perceptron (X1, X2) separating the true and false points of a Boolean function with a linear decision boundary]
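A perceptron with this step activation can classify a linearly separable Boolean function. The sketch below is illustrative; the weights and bias realizing logical AND over inputs in {-1, +1} are hand-picked for demonstration and are not given in the paper.

```python
def perceptron(x, w, b):
    """Step-activation perceptron: +1 if the weighted sum w.x + b >= 0, else -1."""
    v = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if v >= 0 else -1

# Hand-picked (illustrative) weights realizing logical AND over {-1, +1} inputs
w, b = [1.0, 1.0], -1.5
for x1 in (-1, 1):
    for x2 in (-1, 1):
        print((x1, x2), perceptron([x1, x2], w, b))
# Only (1, 1) is classified as +1; the other three corners give -1
```

Because a single perceptron draws one linear boundary, it can represent AND or OR but not XOR, which is the classic limitation mentioned in the section on the perceptron's limits.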
SINGLE LAYER NEURAL NETWORK
This is the simplest form of layered neural network. Here an input layer of
source nodes (input nodes) projects onto an output layer of neurons, but not vice versa.
MULTILAYER NEURAL NETWORK
A multilayer neural network is a network with one or more layers (or levels) of
nodes (the so-called hidden units) between the input layer and the output layer.
Multilayer neural networks can solve more complicated problems than single-layer
neural networks can, but training may be more difficult.
TOTALLY CONNECTED NEURAL NETWORK
A neural network is said to be totally connected when every node in one layer
sends its output to each node in the following layer. In this case there will be
more connections than nodes.
FEED-FORWARD NEURAL NETWORK
In a feed-forward Artificial Neural Network, a unit only sends its output to units
from which it does not receive an input directly or indirectly (via other units). In other
words, there are no feedback loops. A feed-forward ANN arranged in layers, where the
units are connected only to the units situated in the next consecutive layer, is called a
strictly feed-forward ANN.
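A strictly feed-forward pass can be sketched as a simple loop over layers. This is an illustrative Python sketch, not from the paper; the 2-2-1 layer sizes and all weight values are invented for demonstration.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def forward(x, layers):
    """Propagate input x through a strictly feed-forward network.

    layers: list of (weights, biases) pairs, one per layer, where
    weights[i][j] connects unit j of the previous layer to unit i of
    this layer. Activations only ever flow to the next consecutive
    layer, so there are no feedback loops.
    """
    activations = x
    for weights, biases in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations

# A hypothetical 2-2-1 network (illustrative values)
hidden = ([[0.5, -0.5], [0.3, 0.8]], [0.0, 0.0])
output = ([[1.0, -1.0]], [0.0])
print(forward([1.0, 0.0], [hidden, output]))
```

Each iteration of the loop corresponds to one layer; a recurrent network would instead feed some activations back into earlier layers, which this structure cannot express.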
BACK PROPAGATION
LEARNING LAWS
HEBB'S RULE:
The first, and undoubtedly the best known, learning rule was introduced by
Donald Hebb. The description appeared in his book The Organization of Behaviour in
1949. His basic rule is: if a neuron receives an input from another neuron, and if both
are highly active (mathematically, have the same sign), the weight between the
neurons should be strengthened.
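Hebb's rule can be written as a one-line weight update. This is a minimal sketch with an illustrative learning rate; the function name is hypothetical, not from the paper.

```python
def hebb_update(w, x, y, rate=0.1):
    """Hebb's rule: strengthen w_j when input x_j and output y are both
    highly active (same sign), weaken it when their signs disagree."""
    return [wj + rate * xj * y for wj, xj in zip(w, x)]

# The first input agrees in sign with the output, so its weight grows;
# the second disagrees, so its weight is pushed negative.
print(hebb_update([0.0, 0.0], [1.0, -1.0], 1.0))  # -> [0.1, -0.1]
```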
HOPFIELD LAW:
This law is similar to Hebb's Rule with the exception that it specifies the
magnitude of the strengthening or weakening. It states, "If the desired output and the
input are both active or both inactive, increment the connection weight by the learning
rate, otherwise decrement the weight by the learning rate." (Most learning functions
have some provision for a learning rate, or a learning constant. Usually this term is
positive and between zero and one.)
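The quoted statement translates into a single conditional update. This sketch is illustrative; the function name and the default learning rate (a positive constant between zero and one, as noted above) are assumptions.

```python
def hopfield_update(w, input_active, desired_active, rate=0.1):
    """Hopfield law: increment the weight by the learning rate when the
    desired output and the input agree (both active or both inactive),
    otherwise decrement it by the learning rate."""
    return w + rate if input_active == desired_active else w - rate

print(hopfield_update(0.5, True, True))   # agreement: 0.5 -> 0.6
print(hopfield_update(0.5, True, False))  # disagreement: 0.5 -> 0.4
```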
THE DELTA RULE:
This rule is a further variation of Hebb's Rule. It is one of the most commonly
used. This rule is based on the simple idea of continuously modifying the strengths of
the input connections to reduce the difference (the delta) between the desired output
value and the actual output of a processing element. This rule changes the synaptic
weights in the way that minimizes the mean squared error of the network. This rule is
also referred to as the Widrow-Hoff Learning Rule and the Least Mean Square (LMS)
Learning Rule.
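The delta (Widrow-Hoff, LMS) update described above can be sketched as follows. The function name, inputs, and learning rate are illustrative, not from the paper.

```python
def delta_update(w, x, desired, actual, rate=0.1):
    """Widrow-Hoff (LMS) delta rule: move each weight in the direction
    that reduces the difference (the delta) between the desired output
    and the actual output of the processing element."""
    delta = desired - actual
    return [wj + rate * delta * xj for wj, xj in zip(w, x)]

print(delta_update([0.2, -0.4], [1.0, 0.5], desired=1.0, actual=0.5))
# -> [0.25, -0.375]
```

Applied repeatedly over a training set, this update performs gradient descent on the mean squared error of the network, which is why it is also called the Least Mean Square rule.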
KOHONEN'S LEARNING LAW:
In this procedure the processing elements compete for the opportunity to learn.
The processing element with the largest output is declared the winner and has the
capability of inhibiting its competitors as well as exciting its neighbours. Only the
winner is permitted an output, and only the winner plus its neighbours are allowed
to update their connection weights.
The Kohonen rule does not require a desired output. Therefore, it is implemented
in unsupervised methods of learning.
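One winner-take-all step of this competitive procedure can be sketched as below. Neighbourhood updates are omitted for brevity, only the winner learns here, and all names and values are illustrative rather than taken from the paper.

```python
def kohonen_step(weights, x, rate=0.5):
    """One winner-take-all Kohonen step: the unit whose weight vector is
    closest to the input wins, and only the winner's weights move toward
    the input. No desired output is needed, so the rule is unsupervised.
    (Updating the winner's neighbours is omitted in this sketch.)"""
    def dist2(w):
        return sum((wj - xj) ** 2 for wj, xj in zip(w, x))
    winner = min(range(len(weights)), key=lambda i: dist2(weights[i]))
    weights[winner] = [wj + rate * (xj - wj)
                       for wj, xj in zip(weights[winner], x)]
    return winner

weights = [[0.0, 0.0], [1.0, 1.0]]
print(kohonen_step(weights, [0.9, 0.8]))  # unit 1 is closer, so it wins
```

Repeating this step over many inputs pulls each unit's weight vector toward a cluster of similar inputs, which is how the rule discovers structure without labels.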
It is apparent that a neural network derives its computing power through, first,
its massively parallel distributed structure and, second, its ability to learn and
therefore generalize. Generalization refers to the neural network producing
reasonable outputs for inputs not encountered during training (learning). These two
information-processing capabilities make it possible for neural networks to solve
complex (large-scale) problems that are currently intractable.
NONLINEARITY
ADAPTIVITY
The design of a neural network is motivated by analogy with the brain, which is
living proof that fault-tolerant parallel processing is not only physically possible but
also fast and powerful. Neurobiologists look to (artificial) neural networks as a
research tool for the interpretation of neurobiological phenomena. On the other hand,
engineers look to neurobiology for new ideas to solve problems more complex than
those based on conventional hard-wired design techniques.
APPLICATIONS OF ANN
SIGNAL PROCESSING
There are many applications of neural networks in the general area of signal
processing. One of the first commercial applications was (and still is) to suppress noise
on a telephone line. The neural net used for this purpose is a form of ADALINE.
PATTERN RECOGNITION
Many interesting problems fall into the general area of pattern recognition. One
specific area in which many neural network applications have been developed is the
automatic recognition of handwritten characters (digits or letters).
MEDICINE
Learning to read English text aloud is a difficult task, because the correct
phonetic pronunciation of a letter depends on the context in which the letter appears.
SPEECH RECOGNITION
In clustering, there are no training data with known class labels. A clustering
technique explores similarity between the patterns and places similar pattern in a
cluster.
PREDICTION/FORECASTING