CNN (Convolutional Neural Network), RNN (Recurrent Neural Network) and Their Applications
Presented by:
Arjun Upadhyay (074-MSICE-003)
Ramesh Kumar Shah (074-MSICE-013)
Roshan Kandel (074-MSICE-014)
Roshan Kumar Nandan (074-MSICE-015)
Table of Contents:

1. Overview
2. CNN
3. RNN
4. Applications
Overview of Neural Network
Overview of Machine Learning
Overview of Neural Network and Deep Learning
Overview of Convolutional Neural Network (CNN)
• For images, multilayer perceptrons are not well adapted.
• Indeed, images are transformed into vectors, losing along the way the spatial information contained in the images.
• Before the development of deep learning for computer vision, learning was based on the extraction of variables of interest, called features.
• The introduction of convolutional neural networks (CNNs) revolutionized image processing and removed the need for manual feature extraction.
• CNNs act directly on matrices, or even on tensors for images with three RGB color channels.
• CNNs are now widely used for image classification, image segmentation, object recognition and face recognition.
Overview of Convolutional Neural Network (CNN)
• A Convolutional Neural Network (CNN) consists of one or more convolutional layers.
• A CNN is made up primarily of three kinds of layers: convolutional layers, pooling layers and fully connected layers.
Overview of Recurrent Neural Network (RNN)
• Conventional neural networks lack the ability to retain information spread across time steps.
• In tasks like sequence prediction, the input and output of the network at time instant t − 1 affect the output and input at time t.
• So there is a need for a network with some “memory” to store past experiences and use them later on.
• To infer sequential data such as text or time series, Recurrent Neural Networks (RNNs) are used.
• RNNs have a feedback loop, through which they can feed their output at time t − 1 back in as an input at time t.
Overview of Recurrent Neural Network (RNN)
Convolutional Neural Networks
Convolutional Neural Network
• Convolutional neural networks are neural networks that use convolution in place
of general matrix multiplication in at least one of their layers.
• They are very powerful in processing data with a grid-like topology.
• Convolutional Neural Networks (ConvNets or CNNs) are a category of Neural
Networks that have proven very effective in areas such as image recognition and
classification.
Types of inputs
• Color images are three-dimensional and so have a volume.
• Time-domain speech signals are 1-D, while frequency-domain representations (e.g. MFCC vectors) take a 2-D form. They can also be looked at as a time sequence.
• Medical images (such as CT/MRI) are multidimensional.
• Videos have an additional temporal dimension compared to stationary images.
• Variable-length sequences and time-series data are again multidimensional.
• Hence it makes sense to model them as tensors instead of vectors.
• The classifier then needs to accept a tensor as input and perform the necessary machine learning task.
• In the case of an image, this tensor represents a volume (illustrated in the sketch below).
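A minimal NumPy sketch of these input shapes; the exact sizes (224 x 224 images, 16 kHz audio, 13 MFCC coefficients, 30 frames) are illustrative assumptions, not values from the slides.

import numpy as np

image = np.zeros((224, 224, 3))      # color image: height x width x RGB channels
speech = np.zeros((16000,))          # 1 s of speech at 16 kHz: a 1-D signal
mfcc = np.zeros((100, 13))           # MFCC representation: time frames x coefficients
video = np.zeros((30, 224, 224, 3))  # video: frames x height x width x channels

Each of these is a tensor; the image tensor in particular represents a volume.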
The Problem with Traditional Neural Networks
• Traditional neural networks, called multilayer perceptrons (MLPs), have certain drawbacks.
• These are modeled on the human brain, whereby neurons are stimulated
by connected nodes and are only activated when a certain threshold
value is reached.
Figure: A standard multilayer perceptron (traditional neural network).
• MLPs use one perceptron for each input (e.g. each pixel in an image, multiplied by 3 in the RGB case). The number of weights rapidly becomes unmanageable for large images.
• For a 224 x 224 pixel image with 3 color channels, each neuron in the first layer already needs around 150,000 weights that must be trained (see the computation below). As a result, difficulties arise while training and overfitting can occur.
• MLPs react differently to an input image and its shifted version: they are not translation invariant.
• Clearly, MLPs are not the best choice for image processing. One of the main problems is that spatial information is lost when the image is flattened into an MLP.
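A quick arithmetic check of that figure, assuming a fully connected first layer:

input_values = 224 * 224 * 3   # one weight per input value for each neuron
print(input_values)            # 150528, i.e. roughly 150,000 weights per neuron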
Convolution
• Convolution in 1 dimension: $s(t) = (x * w)(t) = \sum_{a} x(a)\, w(t - a)$
• Convolution in 2 dimensions: $S(i, j) = (I * K)(i, j) = \sum_{m} \sum_{n} I(m, n)\, K(i - m, j - n)$
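A short sketch of both operations using NumPy and SciPy; the signal values and the 3 x 3 kernel are made-up examples.

import numpy as np
from scipy.signal import convolve2d

# 1-D convolution of a signal with a small smoothing kernel.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
w = np.array([0.25, 0.5, 0.25])
print(np.convolve(x, w, mode="valid"))   # [2. 3. 4.]

# 2-D convolution of a 5x5 "image" with a 3x3 edge-detection kernel.
I = np.random.rand(5, 5)
K = np.array([[-1.0, -1.0, -1.0],
              [-1.0,  8.0, -1.0],
              [-1.0, -1.0, -1.0]])
print(convolve2d(I, K, mode="valid").shape)  # (3, 3)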
Types of layers in a CNN
• Convolution Layer
• Pooling Layer
• Fully Connected Layer
Figure: Architecture of a CNN
Convolution Layer
• The convolution layer is the core building block of the CNN. It carries the main
portion of the network’s computational load.
• This layer performs a dot product between two matrices, where one matrix is the
set of learnable parameters known as a kernel, and the other matrix is the restricted
portion of the receptive field.
• The kernel is spatially smaller than the image but extends through its full depth. This means that, if the image is composed of three (RGB) channels, the kernel height and width will be spatially small, but its depth extends across all three channels.
• During the forward pass, the kernel slides across the height and width of the image
producing the image representation of that receptive region.
• This produces a two-dimensional representation of the image known as an
activation map that gives the response of the kernel at each spatial position of the
image.
• The step size with which the kernel slides is called the stride (see the sketch below).
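A minimal NumPy sketch of this forward pass, assuming a single kernel, no padding and no bias term (real layers use many kernels plus a bias per kernel):

import numpy as np

def conv_forward(image, kernel, stride=1):
    """Slide `kernel` over `image` (H x W x C), taking a dot product over
    each receptive field to build the 2-D activation map."""
    H, W, C = image.shape
    kh, kw, _ = kernel.shape
    out_h = (H - kh) // stride + 1
    out_w = (W - kw) // stride + 1
    activation = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw, :]
            activation[i, j] = np.sum(patch * kernel)  # response at this position
    return activation

img = np.random.rand(7, 7, 3)  # a tiny RGB "image"
k = np.random.rand(3, 3, 3)    # kernel: small height/width, full depth
print(conv_forward(img, k, stride=2).shape)  # (3, 3) activation map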
Pooling Layer
• Its function is to progressively reduce the spatial size of the representation, reducing the number of parameters and the amount of computation in the network.
• The pooling layer operates on each feature map independently.
• The most common approach is max pooling, which reports the maximum output from each neighborhood (see the sketch below).
Figure: Pooling Operation
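A minimal max-pooling sketch with an illustrative 4 x 4 feature map and a 2 x 2 window:

import numpy as np

def max_pool(feature_map, size=2, stride=2):
    """Report the maximum of each size x size neighborhood."""
    H, W = feature_map.shape
    out_h = (H - size) // stride + 1
    out_w = (W - size) // stride + 1
    pooled = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            pooled[i, j] = feature_map[i*stride:i*stride+size,
                                       j*stride:j*stride+size].max()
    return pooled

fm = np.array([[1, 3, 2, 4],
               [5, 6, 7, 8],
               [3, 2, 1, 0],
               [1, 2, 3, 4]], dtype=float)
print(max_pool(fm))  # [[6. 8.]
                     #  [3. 4.]]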
Fully Connected Layer
• Neurons in this layer have full connectivity with all neurons in the preceding and succeeding layers, as in a regular fully connected neural network (FCNN).
• Its output can therefore be computed as usual: a matrix multiplication followed by a bias offset, as in the sketch below.
• The FC layer helps map the representation between the input and the output.
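The whole layer in two lines, with illustrative sizes (128 flattened features in, 10 class scores out):

import numpy as np

x = np.random.rand(128)      # flattened feature vector from the earlier layers
W = np.random.rand(10, 128)  # one row of weights per output neuron
b = np.random.rand(10)       # bias vector
y = W @ x + b                # matrix multiplication plus bias offset
print(y.shape)               # (10,)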
Recurrent Neural Networks
RNN (Recurrent Neural Networks)
• The idea behind RNNs is to make use of sequential information.
• In a traditional neural network we assume that all inputs (and outputs) are independent of each other.
• But for many tasks that is a very bad assumption.
• If you want to predict the next word in a sentence, you had better know which words came before it.
• RNNs are called recurrent because they perform the same task for every element of a sequence, with the output depending on the previous computations; a “memory” captures information about what has been calculated so far.
RNN Architecture
Figure: Recurrent neural network
• Here x_1, x_2, x_3, …, x_t represent the input words from the text,
• y_1, y_2, y_3, …, y_t represent the predicted next words,
• and h_0, h_1, h_2, h_3, …, h_t hold the information for the previous input words.
• Equations needed for training:
• 1) $h_t = \tanh(W^{hh} h_{t-1} + W^{hx} x_t)$: holds information about the previous words in the sequence.
• 2) $\hat{y}_t = \mathrm{softmax}(W^{S} h_t)$: calculates the predicted word vector at a given time step t.
• 3) $J_t(\theta) = -\sum_{j} y_{t,j} \log \hat{y}_{t,j}$: the cross-entropy loss at each time step t, measuring the error between the predicted and actual word.
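A NumPy sketch of one time step implementing equations 1)–3); the dimensions and weight names are illustrative assumptions.

import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

hidden, vocab = 16, 100
W_hh = np.random.randn(hidden, hidden) * 0.01  # hidden-to-hidden weights
W_hx = np.random.randn(hidden, vocab) * 0.01   # input-to-hidden weights
W_s = np.random.randn(vocab, hidden) * 0.01    # hidden-to-output weights

h_prev = np.zeros(hidden)
x_t = np.zeros(vocab); x_t[7] = 1.0   # one-hot current word
y_t = np.zeros(vocab); y_t[12] = 1.0  # one-hot actual next word

h_t = np.tanh(W_hh @ h_prev + W_hx @ x_t)  # eq. 1: new hidden state
y_hat = softmax(W_s @ h_t)                 # eq. 2: predicted word distribution
loss = -np.sum(y_t * np.log(y_hat))        # eq. 3: cross-entropy error
print(loss)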
Training a Neural Network
• Training a neural network has three major steps.
• First, it does a forward pass and makes a prediction.
• Second, it compares the prediction to the ground truth using a loss function. The
loss function outputs an error value which is an estimate of how poorly the
network is performing.
• Last, it uses that error value to do backpropagation, which calculates the gradients for each node in the network.
• The gradient is the value used to adjust the network's internal weights, allowing the network to learn.
• The bigger the gradient, the bigger the adjustments, and vice versa (these steps are sketched in code below).
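A minimal PyTorch sketch of the three steps on a toy model; the architecture and data here are illustrative assumptions.

import torch
import torch.nn as nn

model = nn.Linear(4, 2)                  # a toy "network"
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(8, 4)                    # a batch of inputs
target = torch.randint(0, 2, (8,))       # ground-truth labels

prediction = model(x)                    # step 1: forward pass
loss = loss_fn(prediction, target)       # step 2: compare to ground truth
optimizer.zero_grad()
loss.backward()                          # step 3: backpropagate gradients
optimizer.step()                         # adjust weights using the gradients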
Problems with a standard RNN
• The simplest RNN model has a major drawback, called the vanishing gradient problem, which prevents it from being accurate.
• This means that the network experiences difficulty in memorising words from far away in the sequence and makes predictions based only on the most recent ones.
• That is why more powerful models like the LSTM and GRU come in handy.
Interpretation of the Vanishing Gradient Problem
LSTM
• A special kind of RNN, capable of learning long-term dependencies.
• An LSTM cell has a three-gate process (sketched in code below):
• Forget Gate: decides how much of the past you should remember.
• Update Gate/Input Gate: decides how much of this unit is added to the current state.
• Output Gate: decides which part of the current cell makes it to the output.
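A NumPy sketch of one LSTM step matching those three gates; the parameter layout (per-gate dictionaries W, U, b) is an illustrative assumption.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    f = sigmoid(W["f"] @ x + U["f"] @ h_prev + b["f"])        # forget gate: keep how much past?
    i = sigmoid(W["i"] @ x + U["i"] @ h_prev + b["i"])        # update/input gate: add how much new?
    o = sigmoid(W["o"] @ x + U["o"] @ h_prev + b["o"])        # output gate: expose which part?
    c_tilde = np.tanh(W["c"] @ x + U["c"] @ h_prev + b["c"])  # candidate cell state
    c = f * c_prev + i * c_tilde                              # updated cell state
    h = o * np.tanh(c)                                        # new hidden state / output
    return h, c

n_in, n_h = 4, 3
W = {k: np.random.randn(n_h, n_in) * 0.1 for k in "fioc"}
U = {k: np.random.randn(n_h, n_h) * 0.1 for k in "fioc"}
b = {k: np.zeros(n_h) for k in "fioc"}
h, c = lstm_step(np.random.randn(n_in), np.zeros(n_h), np.zeros(n_h), W, U, b)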
GRU
• The GRU is the newer generation of recurrent neural networks and is pretty similar to an LSTM.
• It has only two gates, a reset gate and an update gate (compared with the LSTM below).
• Update Gate: acts similarly to the forget and input gates of an LSTM. It decides what information to throw away and what new information to add.
• Reset Gate: decides how much past information to forget.
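A short PyTorch usage sketch: the built-in GRU and LSTM share the same sequence interface, but the GRU has fewer parameters because it has two gates and no separate cell state.

import torch
import torch.nn as nn

seq = torch.randn(5, 1, 10)   # (time steps, batch, features)
gru = nn.GRU(input_size=10, hidden_size=20)
lstm = nn.LSTM(input_size=10, hidden_size=20)

out_g, h_g = gru(seq)         # GRU returns output and hidden state
out_l, (h_l, c_l) = lstm(seq) # LSTM also carries a separate cell state
print(sum(p.numel() for p in gru.parameters()))   # 1920
print(sum(p.numel() for p in lstm.parameters()))  # 2560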
Applications of CNN & RNN
Applications of CNN
• 1) Mobile applications (gesture control)
• 2) Surveillance (object detection, people detection)
• 3) Automotive (image segmentation, pedestrian detection, traffic sign recognition)
• 4) AR/VR
• Depth creation (done through stereo vision)
• Measuring the dimensions of a room
• Creating objects in the room
• Moving those objects in the room
Figure: Virtual Reality (VR)
Figure: Augmented Reality (AR)
5) Signal and image processing
6) Handwritten text/digit recognition
7) Natural object classification (photos and videos)
8) Face detection
• Identifying all the faces in the picture
• Focusing on each face despite bad lighting or a different pose
• Identifying unique features
• Comparing identified features to an existing database and determining the person's name
Applications of RNN
1) Sentiment analysis
2) Language modelling
3) Translation
4) Conversational agents
5) Image captioning
6) Text summarization
Natural Language Processing (NLP)
• Natural language processing is a branch of computer science and artificial intelligence concerned with the interactions between computers and human languages.

Applications of NLP

1) Sentiment analysis
2) Chatbots
3) Speech recognition (Google Assistant)
