Abstract
Artificial neural networks (ANNs) provide a general, practical method for learning real-valued, discrete-valued and vector-valued functions. Algorithms such as Back-propagation use gradient descent to tune network parameters to best fit a training set of input-output pairs. ANN learning is robust to errors in the training data and has been successfully applied to problems such as interpreting visual scenes, speech recognition and learning robot control strategies.
In this paper, a variant of the Back-propagation algorithm is proposed for feed-forward neural network learning. The proposed algorithm improves Back-propagation training in terms of quick convergence of the solution, depending on the slope of the error graph, and increases the speed of convergence of the system.
Keywords: Neural Networks, Adaptive navigation.

1. Introduction
Machine learning refers to a system capable of the autonomous acquisition and integration of knowledge. This capacity to learn from experience, analytical observation, and other means results in a system that can continuously self-improve and thereby offer increased efficiency and effectiveness. Over the past 50 years, the study of machine learning has grown from the efforts of a handful of computer engineers exploring whether computers could learn to play games, and a field of statistics that largely ignored computational considerations, to a broad discipline that has produced fundamental statistical-computational theories of learning processes, has designed learning algorithms that are routinely used in commercial systems from speech recognition to computer vision, and has spun off an industry in data mining to discover hidden regularities in the growing volume of online data.
Neural network learning methods provide a robust approach to approximating real-valued, discrete-valued, and vector-valued target functions. For certain types of problems, such as learning to interpret complex real-world sensor data, artificial neural networks are among the most effective learning methods currently known.
The remainder of the paper is organized as follows: Section (2) focuses on theoretical concepts of the Perceptron and the Multilayer Feed-forward network. Section (3) emphasizes the Back-propagation algorithm and its variant. Lastly, we conclude by providing results.

2. Perceptron and Multilayer Feed-forward Network
Fig: 1 A Perceptron
A perceptron takes a vector of real-valued inputs, calculates a linear combination of these inputs, then outputs a 1 if the result is greater than some threshold and -1 otherwise. More precisely, given inputs x1 through xn, the output o(x1, ..., xn) computed by the perceptron is

o(x1, ..., xn) = 1 if w0 + w1x1 + w2x2 + ... + wnxn > 0, and -1 otherwise,

where each wi is a real-valued constant, or weight, that determines the contribution of input xi to the perceptron output.
Learning a perceptron involves choosing values for the weights w0, ..., wn. Let us begin by understanding how to learn the weights for a single perceptron. Here the precise learning problem is to determine a weight vector that causes the perceptron to produce the correct +1/-1 output. We can train a perceptron using the perceptron rule.
One way to learn an acceptable weight vector is to begin with random weights, then iteratively apply the perceptron to each training example, modifying the perceptron weights whenever it misclassifies an example. This process is repeated, iterating through the training examples as many times as needed, until the perceptron classifies all training examples correctly.
But in some cases a multi-layer perceptron network, i.e. a feed-forward neural network, is needed, namely whenever nonlinear decision surfaces are to be used. Fig: 2 shows a feed-forward network.
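The perceptron rule described above can be sketched in code. This is a minimal illustrative sketch, not the paper's implementation; the logical-AND training set and the learning rate are assumptions, chosen because AND is linearly separable and the rule is guaranteed to converge on it:

```python
# Minimal sketch of the perceptron training rule described above.
# The dataset (logical AND) and the learning rate are illustrative assumptions.

def predict(weights, x):
    # weights[0] plays the role of w0 (the threshold/bias weight);
    # output +1 if the linear combination exceeds 0, and -1 otherwise
    s = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
    return 1 if s > 0 else -1

def train_perceptron(examples, eta=1.0, max_epochs=100):
    n = len(examples[0][0])
    weights = [0.0] * (n + 1)           # start from zero (or random) weights
    for _ in range(max_epochs):
        errors = 0
        for x, target in examples:
            o = predict(weights, x)
            if o != target:             # update only on a misclassified example
                weights[0] += eta * (target - o)
                for i, xi in enumerate(x):
                    weights[i + 1] += eta * (target - o) * xi
                errors += 1
        if errors == 0:                 # every training example classified correctly
            break
    return weights

# Logical AND is linearly separable, so the rule converges
data = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
w = train_perceptron(data)
```

The outer loop mirrors the text: iterate through the training examples as many times as needed, modifying the weights only when an example is misclassified, and stop once all examples are classified correctly.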
3. Back-Propagation Algorithm
Back-propagation, or propagation of error, is a common method of teaching artificial neural networks how to perform a given task. It was first described by Paul Werbos in 1974.
In back-propagation there are two phases in the learning cycle: one to propagate the input pattern through the network, and the other to adapt the output by changing the weights in the network. It is the error signals that are back-propagated in the network operation to the hidden layer(s). The portion of the error signal that a hidden-layer neuron receives in this process is an estimate of the contribution of that neuron to the output error. By adjusting the weights of the connections on this basis, the squared error, or some other metric, is reduced in each cycle and finally minimized, if possible.

Mathematical Analysis: Assume a network with N inputs and M outputs. Let xi be the input to the ith neuron in the input layer, Bj be the output of the jth neuron before activation, yj be the output after activation, bj be the bias between the input and hidden layers, bk be the bias between the hidden and output layers, wij be the weight between the input and hidden layers, and wjk be the weight between the hidden and output layers. Let η be the learning rate and δ the error. Also, let i, j and k be the indexes of the input, hidden and output layers respectively.
The response of each unit is computed as:

Bj = Σi wij xi + bj,   yj = f(Bj),

where f is the activation function.
Weights and bias between the input and hidden layers are updated as follows:

wij = wij + η δj xi,   bj = bj + η δj.

4. Variant of Back-propagation Algorithm
The Back-propagation algorithm described above has many shortcomings. The time complexity of the algorithm is high, and it frequently gets trapped in sub-optimal solutions. It is also difficult to choose an optimum step size for the learning process, since a large step size would mean faster learning, which may miss an optimal solution altogether, and a small step size would mean a very high time complexity for the learning process. Hence, we discuss a variant of the above algorithm with the following changes.
A) Momentum: A simple change to the training law that sometimes results in much faster training is the addition of a momentum term. With this change, the weight change continues in the direction it was heading; in the absence of error, it would be a constant multiple of the previous weight change. The momentum term is an attempt to keep the weight-change process moving, and it also makes convergence faster and training more stable.
B) Dynamic control of the learning rate and momentum: Learning parameters such as the learning rate and momentum serve a better purpose if they can be changed dynamically during the course of training. The learning rate can be high when the system is far from the goal, and can be decreased as the system gets nearer to the goal, so that the optimal solution is not missed.
C) Gradient Following: Gradient following has been added to enable quick convergence of the solution. When the system is far from the solution, the learning rate is further increased by a constant parameter C1, and when the system is close to a solution, the learning rate is decreased by a constant parameter C2.
D) Speed Factor: To increase the speed of convergence of the system, a speed factor S has been used.
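The momentum term in A) and the dynamic learning-rate control in B) can be sketched as follows. This is a minimal illustration on a single linear unit with one training example; the constants used here (momentum factor 0.5, growth factor 1.05, decay factor 0.5) are illustrative assumptions, not the paper's parameters C1, C2 or S:

```python
# Sketch of gradient descent with a momentum term and a dynamically
# adapted learning rate, on a single linear unit (output y = w * x)
# trained on one example. All constants are illustrative assumptions.

def train(x, target, eta=0.1, alpha=0.5, epochs=200):
    w = 0.0                 # the single weight being learned
    dw_prev = 0.0           # previous weight change, reused by the momentum term
    prev_err = float("inf")
    for _ in range(epochs):
        y = w * x
        err = (target - y) ** 2
        grad = -2.0 * (target - y) * x       # dE/dw for the squared error
        # dynamic control: grow eta while the error keeps falling,
        # halve it when a step overshoots and the error rises
        eta = eta * 1.05 if err < prev_err else eta * 0.5
        prev_err = err
        # momentum: keep a fraction of the previous change moving
        dw = -eta * grad + alpha * dw_prev
        w += dw
        dw_prev = dw
    return w

w = train(2.0, 6.0)   # ideal weight is 3, since 3 * 2 = 6
```

The update dw = -eta * grad + alpha * dw_prev is the momentum rule: when the gradient is zero, the weight change is a constant multiple (alpha) of the previous change, exactly as described in A), while the eta adjustment keeps the step size large far from the goal and shrinks it near the goal, as described in B).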