Convergence of basic neural network architecture to CNN

In the field of machine learning, recognition is the main task of concern alongside the analytical
approach. Since the emergence of the convolutional neural network (CNN), deep learning has become a
sensational new field in artificial intelligence. The convolutional neural network is an advanced version
of the basic neural network. Ultimately, a neural network is a classifier: it consists of several artificial
neurons, often referred to as nodes, interconnected and organized in layers. Each node sums all of the
input values scaled by weight values and adds a bias, so that the whole input is represented by a straight
line. Thus each node is itself a classifier, only a simple one whose ability to classify is limited on
complex problems. It turns out that these limitations of simple classifiers can be overcome by
interconnecting a number of nodes to form a powerful neural network. Even with interconnected nodes,
however, nonlinear problems cannot be classified using purely linear nodes. So every node value is made
nonlinear with the help of an activation function. The activation decides whether the node should be
activated or not by mapping the linear node value into a range of values with a nonlinear function. In
this way every node checks for the presence of a particular feature, and as the number of layers
increases the network can classify better.
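
As a rough sketch of one such node (assuming a sigmoid activation; the input, weight, and bias values
below are arbitrary examples), the computation can be written in Python/NumPy as:

import numpy as np

def node(inputs, weights, bias):
    # Linear part: every input value scaled by its weight, summed, plus a bias.
    z = np.dot(weights, inputs) + bias
    # Nonlinear activation (a sigmoid here) maps the linear value into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative values only.
x = np.array([0.5, -1.2, 3.0])
w = np.array([0.4, 0.1, -0.7])
print(node(x, w, bias=0.2))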

Simple neural network architecture

This basic neural network structure cannot pick up spatially invariant features such as edges and circles,
since every node value is a representation of all the input values. The architecture therefore treats any
spatial ordering of the input values alike. Signals such as images and sounds are very sensitive to
spatial orientation, so effective extraction of features becomes important for recognition.

Instead of using all the weight values with all the input values to calculate a node, a weight-sharing
approach over small sections of the input can overcome this issue. In a CNN architecture, the filter size
defines how many input values and weights are involved in creating one node value. By striding the
weight filter across the input, all the corresponding node values are calculated, and the resulting set of
nodes is grouped into a single layer. For every stride, the convolution operation is the inner product of
the filter weights with the corresponding signal or image patch, plus a bias, and all node values are
activated with an activation function, just as in the basic neural network. CNN also introduces the
concept of depth for each layer: the same layer process is repeated with different weight values while
keeping the filter size and striding the same. Depth increases the number of nodes in a layer, which
strengthens the network, and multiple nodes performing the same type of feature search give clearer
feature extraction.
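
A minimal sketch of this layer computation in Python/NumPy, assuming a square single-channel input,
square filters, a ReLU activation, and no padding (all values below are illustrative):

import numpy as np

def conv_layer(image, filters, biases, stride=1):
    # filters has shape (depth, k, k): one k x k shared weight filter per output map.
    k = filters.shape[1]
    out = (image.shape[0] - k) // stride + 1
    nodes = np.zeros((filters.shape[0], out, out))
    for d in range(filters.shape[0]):          # depth: repeat the layer with different weights
        for i in range(out):
            for j in range(out):
                patch = image[i*stride:i*stride + k, j*stride:j*stride + k]
                # Node value: inner product of the shared weights with the patch,
                # plus a bias, passed through a ReLU activation.
                nodes[d, i, j] = max(0.0, float(np.sum(patch * filters[d]) + biases[d]))
    return nodes

# Hypothetical 6x6 input and two 3x3 filters with stride 1.
img = np.random.rand(6, 6)
f = np.random.rand(2, 3, 3)
b = np.zeros(2)
print(conv_layer(img, f, b).shape)   # (2, 4, 4)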

CNN architecture

The nodes of the first CNN layer find relationships between neighboring inputs within the filter size. This
filter size is the receptive field of the first layer, that is, the area of the original input that the output of
the layer can see. These initial layers can only extract low-level features such as edges and lines from an
image, so the first receptive field alone cannot detect abstract features without deeper layers. Nodes in
further layers, calculated from the first-layer nodes, increase the receptive field of the network according
to the filter size and striding parameters. This multi-layer design expands the network's receptive field,
which allows the convolution layers to combine low-level features into higher-level features such as
curves, textures, and objects. As the signal moves forward through the layers, the features become more
abstract, and the last layer normally has a receptive field of the same size as the input signal.
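
As a sketch of how this expansion accumulates, the receptive field of the last layer can be computed from
each layer's filter size and stride (the three-layer stack below is purely hypothetical):

def receptive_field(filter_sizes, strides):
    # Start from a single input value and grow the field layer by layer.
    rf, jump = 1, 1
    for k, s in zip(filter_sizes, strides):
        rf += (k - 1) * jump        # a larger filter widens the field
        jump *= s                   # striding multiplies the step between nodes
    return rf

# Hypothetical stack: three 3x3 layers, the second one with stride 2.
print(receptive_field([3, 3, 3], [1, 2, 1]))   # 9
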
Complex networks with a large number of parameters make the architecture too noisy and lead to
overfitting issues. A pooling step is therefore introduced after every layer: by reducing the spatial
dimension of the feature map, it allows features to shift slightly relative to each other, which gives robust
matching of features even in the presence of small distortions. This reduces the number of parameters
higher up the processing hierarchy and simplifies the overall model complexity.
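
A minimal sketch of one common choice, 2x2 max pooling with stride 2, which halves the spatial
dimension of a feature map (values are illustrative):

import numpy as np

def max_pool(feature_map, size=2, stride=2):
    # Keep only the strongest response in each window, shrinking the feature map.
    h = (feature_map.shape[0] - size) // stride + 1
    w = (feature_map.shape[1] - size) // stride + 1
    pooled = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            pooled[i, j] = feature_map[i*stride:i*stride + size,
                                       j*stride:j*stride + size].max()
    return pooled

print(max_pool(np.random.rand(4, 4)).shape)   # (2, 2)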

So convolution layers extract features from the low level up to the abstract level, which overcomes the
drawback of the basic neural network. The basic neural network architecture is then applied to the nodes
of the last convolution layer; this part is named the fully connected network. Its goal is to combine the
features detected from the image patches for a particular task. Simply put, convolution layers are smart
feature extractors, and the fully connected layers are the actual network.
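
As a sketch, the fully connected part simply flattens the nodes of the last convolution layer and applies a
basic neural network layer to them (the feature-map shape and the ten output classes below are
hypothetical):

import numpy as np

def fully_connected(conv_nodes, weights, bias):
    # Flatten the extracted feature nodes and connect every output to every one of them.
    flat = conv_nodes.reshape(-1)
    return weights @ flat + bias     # one score per class, before any softmax

# Hypothetical final feature map: depth 8, spatial size 4x4, mapped to 10 classes.
feats = np.random.rand(8, 4, 4)
w = np.random.rand(10, feats.size)
print(fully_connected(feats, w, np.zeros(10)).shape)   # (10,)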
