
Deep Learning 4/7:

Convolutional Neural Networks


C. De Castro, IEIIT-CNR, cristina.decastro@ieiit.cnr.it
Big Data & Deep Learning Series

1. Machine Learning Basics


2. Neural Networks
3. Laboratory: Make Your Own Neural Network
4. Convolutional Neural Networks
5. Bayesian Reasoning and Prob. Graphical Models
6. Deep Networks
7. Big Learning with Bayesian Methods

References

• I. Goodfellow et al., Deep Learning, The MIT Press, 2016, www.deeplearningbook.org

Outline
• Introduction
• The Convolution Operation
• Main Properties of CNNs (due to Convolution)
• Pooling
• Variants of the Basic Convolution Function
• Data Types
• Training Convolutional Networks

Applications
• CNNs have applications in image and video
recognition, recommender systems and natural
language processing.

• CNNs have played a
fundamental role in the
history of deep learning.
• They are still in use and
an important topic of
research.

Inspiration
• Convolutional Neural Networks (CNNs; LeCun,
1989) were inspired by the animal visual cortex.

Inspiration
• Individual cortical neurons respond to stimuli only
in a restricted region of the visual field known as
the receptive field.
• The receptive fields of different neurons partially
overlap such that they cover the entire visual field.

Topology of Input Types

• CNNs are a specialized kind of neural network for
processing data that has a known grid-like topology,
e.g.:
• images (a 2-D grid of pixels)
• time-series data, which can be thought of as a 1-D
grid of samples taken at regular time intervals
• …

Main Operations of CNNs

• Such networks employ the linear mathematical
operation of convolution in place of general matrix
multiplication in at least one of their layers.
• Almost all CNNs also use pooling.

An Architecture

The Convolution Operation

Suppose we are tracking the location of a spaceship
with a laser sensor.
Our laser sensor provides a single output x(t), the
position of the spaceship at time t.
Both x and t are real-valued, that is, we can get a
different reading from the laser at any instant in time.

The Convolution Operation
Now suppose that our laser sensor is noisy and we would
like to average several measurements:

s(t) = ∫ x(a) w(t − a) da

s(t) = (x ∗ w)(t)

where x is the input, w is the kernel (a weighting
function), and s is called the feature map.
Discretization
When we work with data in a computer, time is
discrete:

s(t) = (x ∗ w)(t) = Σ_{a = −∞}^{+∞} x(a) w(t − a)
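As a minimal sketch of this formula (plain Python, not from the slides), the full discrete convolution of two finite sequences has len(x) + len(w) − 1 points; a 3-point averaging kernel then smooths a noisy reading exactly as in the laser-sensor example:

```python
def conv1d(x, w):
    """Full discrete convolution: s(t) = sum_a x(a) * w(t - a)."""
    n = len(x) + len(w) - 1
    s = [0.0] * n
    for t in range(n):
        for a in range(len(x)):
            if 0 <= t - a < len(w):  # skip terms where w is not defined
                s[t] += x[a] * w[t - a]
    return s

# Smoothing a noisy reading with a 3-point moving-average kernel:
x = [1.0, 2.0, 4.0, 2.0, 1.0]
w = [1 / 3] * 3
print(conv1d(x, w))
```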

Discretization
• In machine learning applications, the input is
usually a multidimensional array of data and the
kernel a multidimensional array of parameters that
are adapted by the learning algorithm.
• Such multidimensional arrays are called tensors.
• Because every element must be stored explicitly, we
can assume these functions are zero everywhere but
the finite set of points for which values are stored.
For a 2-D image I and a 2-D kernel K:

S(i,j) = (I ∗ K)(i,j) = Σ_m Σ_n I(m, n) K(i − m, j − n)

• Convolution is commutative:
S(i,j) = (K ∗ I)(i,j) = Σ_m Σ_n I(i − m, j − n) K(m, n)

• Cross-correlation (what many libraries implement
under the name convolution):
S(i,j) = (I ⋆ K)(i,j) = Σ_m Σ_n I(i + m, j + n) K(m, n)
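A sketch of both operations in plain Python (illustrative, not from the slides): full 2-D convolution, and 'valid' cross-correlation, which also lets one check commutativity numerically.

```python
def conv2d(I, K):
    """Full 2-D convolution: S(i,j) = sum_m sum_n I(m,n) * K(i-m, j-n)."""
    hI, wI, hK, wK = len(I), len(I[0]), len(K), len(K[0])
    S = [[0.0] * (wI + wK - 1) for _ in range(hI + hK - 1)]
    for i in range(hI + hK - 1):
        for j in range(wI + wK - 1):
            for m in range(hI):
                for n in range(wI):
                    if 0 <= i - m < hK and 0 <= j - n < wK:
                        S[i][j] += I[m][n] * K[i - m][j - n]
    return S

def xcorr2d(I, K):
    """'Valid' cross-correlation: S(i,j) = sum_m sum_n I(i+m, j+n) * K(m,n)."""
    hI, wI, hK, wK = len(I), len(I[0]), len(K), len(K[0])
    return [[sum(I[i + m][j + n] * K[m][n]
                 for m in range(hK) for n in range(wK))
             for j in range(wI - wK + 1)]
            for i in range(hI - hK + 1)]
```

Flipping the kernel turns cross-correlation into convolution; since the kernel weights are learned anyway, the two are interchangeable in practice.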

Main Properties of CNNs

• Convolution leverages three important ideas that
improve machine learning:
• sparse interactions
• parameter sharing
• equivariance to translation
• In addition, it makes it possible to work with inputs
of variable size.

Sparse Interactions

• In traditional neural
networks every output unit
interacts with every input
unit.
• Convolutional networks,
instead, thanks to kernels
smaller than the input,
achieve sparse interactions
(sparse connectivity).

Sparse Interactions

• For example, when processing an image, the input
image might have thousands or millions of pixels,
but we can detect small, meaningful features such
as edges with kernels that occupy only tens or
hundreds of pixels.
• This means we need to store fewer parameters
(memory and computational efficiency).
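A back-of-the-envelope comparison (the sizes here are hypothetical, not from the slides): a fully connected layer mapping a 1000×1000 image to a same-sized output versus a single shared 3×3 kernel.

```python
pixels = 1_000 * 1_000          # input (and output) units of the layer
dense_params = pixels * pixels  # every output unit connects to every input unit
conv_params = 3 * 3             # one shared 3x3 kernel
print(f"dense: {dense_params:,} weights, conv: {conv_params} weights")
```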

Example:
efficiency of edge detection
(convolution is about 60,000 times more efficient than dense matrix multiplication)

Parameter Sharing (Tied Weights)

• Parameter sharing refers to using the same
parameter for more than one function in a model.
• In a traditional neural network, each element of the
weight matrix is used exactly once when computing
the output of a layer.
• In a CNN, by contrast, each element of the kernel is
used at every position of the input (up to boundary
effects).

Equivariance to Translation
• Equivariance means that if the input shifts, the
output shifts in the same way: for a translated
image I′(x, y) = I(x − 1, y), convolving I′ gives the
convolution of I shifted by the same amount.
• When processing time-series data, this means that
convolution produces a sort of timeline that shows
when given features appear in the input.
• The same applies to images.
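A small numerical check of this property (an illustrative 1-D sketch): delaying the input by one sample delays the feature map by exactly one sample.

```python
def conv1d(x, w):
    """Full 1-D convolution of two finite sequences."""
    s = [0.0] * (len(x) + len(w) - 1)
    for t in range(len(s)):
        for a in range(len(x)):
            if 0 <= t - a < len(w):
                s[t] += x[a] * w[t - a]
    return s

x = [0.0, 1.0, 3.0, 1.0, 0.0]
w = [1.0, -1.0]          # a simple edge-detecting kernel
shifted = [0.0] + x      # the same signal, one step later
# Equivariance: conv(shift(x)) == shift(conv(x))
assert conv1d(shifted, w) == [0.0] + conv1d(x, w)
```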

Pooling
• A typical layer of a
convolutional network
consists of three stages:
1. first stage, several
convolutions;
2. second stage, several
nonlinear activations
(e.g.: rectified linear);
3. third stage: pooling
function.
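The three stages can be sketched for a 1-D input (the convolution stage is written as the cross-correlation most libraries use; the function name is illustrative):

```python
def conv_layer_1d(x, w, pool=2):
    # stage 1: 'valid' convolution (written as cross-correlation)
    z = [sum(x[i + k] * w[k] for k in range(len(w)))
         for i in range(len(x) - len(w) + 1)]
    # stage 2: nonlinear activation (rectified linear: max(0, z))
    a = [max(0.0, v) for v in z]
    # stage 3: non-overlapping max pooling
    return [max(a[i:i + pool]) for i in range(0, len(a) - pool + 1, pool)]

print(conv_layer_1d([0.0, 1.0, 2.0, 1.0, 0.0, -1.0], [1.0, -1.0]))
```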

Pooling
• A pooling function replaces the output of the net at
a certain location with a summary statistic of the
nearby outputs.
• For example, the max pooling operation reports the
maximum output within a rectangular
neighborhood.
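A minimal sketch of 2×2 max pooling with stride 2 over a 2-D feature map (plain Python, not from the slides):

```python
def max_pool2d(X, size=2, stride=2):
    """Replace each size x size neighborhood with its maximum."""
    H = (len(X) - size) // stride + 1
    W = (len(X[0]) - size) // stride + 1
    return [[max(X[i * stride + m][j * stride + n]
                 for m in range(size) for n in range(size))
             for j in range(W)]
            for i in range(H)]

X = [[1, 3, 2, 0],
     [4, 2, 1, 1],
     [0, 1, 5, 2],
     [2, 0, 1, 3]]
print(max_pool2d(X))
```

Shifting X by one pixel changes at most the borders of each pooled neighborhood, which is why the pooled representation is approximately invariant to small translations.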

Pooling

• Other popular pooling functions include the average
of a rectangular neighborhood, the L2 norm of a
rectangular neighborhood, or a weighted average
based on the distance from the central pixel.

Pooling

• In all cases, pooling helps to make the
representation approximately invariant to small
translations of the input.

Pooling

• This can be useful if we care more about whether
some feature is present than exactly where it is.
• In other contexts, it is more important to preserve
the location of a feature.

ReLU (Rectified Linear Unit)

• Both sparse interactions and parameter sharing
lead to great efficiency in terms of performance.
Example of processing:
input → processing → output
Stride ("long step")

• Strided (downsampled) convolution: the kernel is
shifted by more than one position at a time.
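A stride-s convolution is equivalent to a unit-stride convolution followed by keeping every s-th output, but it only computes the samples it keeps. A 1-D sketch (illustrative, not from the slides):

```python
def conv1d_valid(x, w, stride=1):
    """'Valid' 1-D convolution (written as cross-correlation) with a stride."""
    return [sum(x[i + k] * w[k] for k in range(len(w)))
            for i in range(0, len(x) - len(w) + 1, stride)]

x = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0]
w = [0.5, 0.5]
# stride 2 is the same as unit stride followed by downsampling by 2:
assert conv1d_valid(x, w, stride=2) == conv1d_valid(x, w)[::2]
```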

Unshared Convolution (locally connected layer: each
output position uses its own kernel, with no sharing)

Tiled Convolution (a small set of kernels is cycled
through as we move across space: a compromise
between unshared and standard convolution)

Data Types

• The data used with a convolutional network
consists of several channels, each being the
observation of a different quantity at some point in
space or time.

Data Types

• One advantage of convolutional networks is that
they can also process inputs with varying spatial
extents. These kinds of input simply cannot be
represented by traditional, matrix-multiplication-based
neural networks.
• This provides a compelling reason to use
convolutional networks even when computational
cost and overfitting are not significant issues.
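One common way to obtain a fixed-size representation from variable-sized inputs is to pool over the entire feature map (global pooling). The sketch below is illustrative, not from the slides:

```python
def global_max_pool(channels):
    """Collapse each channel (a 2-D map of any size) to a single number."""
    return [max(max(row) for row in fmap) for fmap in channels]

small = [[[1, 5], [2, 0]]]                   # one channel, 2x2
large = [[[0, 1, 2], [3, 9, 4], [1, 0, 2]]]  # one channel, 3x3
# Different spatial extents, same output size:
assert len(global_max_pool(small)) == len(global_max_pool(large)) == 1
print(global_max_pool(small), global_max_pool(large))
```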

Training Convolutional Networks

• Typically, the most expensive part of convolutional
network training is learning the features.
• A good approach is greedy layer-wise pretraining:
train the first layer in isolation, extract all features
from the first layer only once, then train the second
layer in isolation given those features, and so on.
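The loop structure of this approach can be sketched as below; train_layer and extract_features are hypothetical placeholders for whatever single-layer trainer and feature extractor are used, not a real library API.

```python
def greedy_layerwise_pretrain(layer_configs, data, train_layer, extract_features):
    """Train each layer in isolation on the features of the previous one."""
    features = data
    trained_layers = []
    for config in layer_configs:
        layer = train_layer(config, features)        # train this layer alone
        features = extract_features(layer, features) # compute its features once
        trained_layers.append(layer)
    return trained_layers

# Toy stand-ins: each 'layer' just scales its input.
layers = greedy_layerwise_pretrain(
    [2, 10],                             # hypothetical per-layer configs
    [1.0, 2.0],
    train_layer=lambda cfg, feats: cfg,  # 'training' returns the scale factor
    extract_features=lambda layer, feats: [layer * v for v in feats],
)
print(layers)
```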

Thank you for your kind attention!
Any questions?

