
7. More Convolutional Neural Networks
CS 519 Deep Learning, Winter 2016
Fuxin Li
With materials from Zsolt Kira, Roger Grosse, Nitish Srivastava

Action Items
Project proposal teams are due on 2/9, end of the day (11:59PM)
Reminder: no more than 3 people per team
We will then arrange the presentation order before 2/11
Another reminder: the class on 2/11 will last until 6PM!

Fall-back project ideas will be posted on 1/31

But hopefully you will be working on your own project

The next quiz will be on 2/18

To leave you time to prepare the project proposal

Quiz grades will come later

Hopefully by the end of next week

Difference between convolutional networks and locally-connected networks
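The distinction can be sketched in NumPy (a hypothetical 1-D illustration for brevity; real layers are 2-D, and deep-learning "convolution" is usually cross-correlation, i.e. no kernel flip): a convolutional layer shares one kernel across all positions, while a locally-connected layer unties the weights so each output position gets its own kernel.

```python
import numpy as np

def conv1d(x, w):
    """Convolutional layer (1-D, 'valid'): ONE kernel w shared across all positions."""
    n, k = len(x), len(w)
    return np.array([np.dot(x[i:i + k], w) for i in range(n - k + 1)])

def locally_connected1d(x, W):
    """Locally-connected layer: a SEPARATE kernel W[i] per output position."""
    n_out, k = W.shape
    return np.array([np.dot(x[i:i + k], W[i]) for i in range(n_out)])

x = np.arange(5.0)           # toy input of length 5
w = np.array([1.0, -1.0])    # one shared kernel -> 2 parameters
W = np.tile(w, (4, 1))       # 4 untied kernels  -> 8 parameters

# With identical weights the outputs match; the difference is parameter count
# (and that the locally-connected layer is no longer translation-equivariant).
assert np.allclose(conv1d(x, w), locally_connected1d(x, W))
```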

2D Convolution with Padding
[Figure sequence: a kernel slides over a zero-padded input image, computing one output entry per step as a sum of elementwise products, then a "What if" variant with different values; the images and intermediate numbers did not survive extraction.]
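The stepping procedure in those slides can be written out directly (a minimal sketch; the image and kernel values here are illustrative, since the slide figures did not survive, and "convolution" is implemented as cross-correlation, as deep-learning libraries do):

```python
import numpy as np

def conv2d_same(image, kernel):
    """2-D cross-correlation with zero padding so output size == input size."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))  # zeros outside the border
    out = np.empty_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            # one slide "step": elementwise product of the window with the
            # kernel, summed into a single output entry
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

image = np.array([[3., 1., 0.],
                  [1., 1., 2.],
                  [2., 0., 1.]])
kernel = np.array([[ 0., -1., -1.],
                   [-2., -2., -2.],
                   [ 0.,  1.,  1.]])  # illustrative values, not the slide's exact filter
out = conv2d_same(image, kernel)
```

With "same" zero padding the output map has the same height and width as the input, which is why the padded border rows and columns contribute zeros to the corner entries.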

Rectifier
ReLU(a) = max(0, a)

We need nonlinearity
ReLU makes the gradient sparser and simpler to compute

Convolution/ReLU/Pooling

MNIST again

LeNet
Convolutional nets were invented by Yann LeCun et al. in 1989
For handwritten digit classification

Many hidden layers

Many maps of replicated units in each layer.
Pooling of the outputs of nearby replicated units.
A wide net that can cope with several characters at once, even if they
overlap.

A clever way of training a complete system, not just a
recognizer.
This net was used for reading ~10% of the checks in North America.
See the impressive demos of LeNet at http://yann.lecun.com

The architecture of LeNet-5 (LeCun et al. 1998)

ConvNets performance on MNIST

Network                                                          Preprocessing            Error %  Reference
Convolutional net LeNet-1                                        subsampling to 16x16 px  1.7      LeCun et al. 1998
Convolutional net LeNet-4                                        none                     1.1      LeCun et al. 1998
Convolutional net LeNet-4, K-NN instead of last layer            none                     1.1      LeCun et al. 1998
Convolutional net LeNet-4, local learning instead of last layer  none                     1.1      LeCun et al. 1998
Convolutional net LeNet-5 [no distortions]                       none                     0.95     LeCun et al. 1998
Convolutional net, cross-entropy [elastic distortions]           none                     0.4      Simard et al., ICDAR 2003

The 82 errors
made by LeNet-5
The human error rate is
probably about 0.2%-0.3% (on this quite clean dataset)

The errors made by the Ciresan et al. net

The top printed digit is the
right answer. The bottom two
printed digits are the
network's best two guesses.
The right answer is almost
always in the top 2 guesses.
With model averaging they
can now get about 25 errors.

What's different between then and now?


Computers are bigger, faster
GPUs

What else is different?


ReLU vs. Sigmoid

ReLU rectifier

Max-pooling

Grab local features and make them global

Dropout regularization (to be discussed)

Replaceable by some other regularization techniques
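The max-pooling item above can be sketched as follows (a minimal non-overlapping 2x2 pooling; modern nets also use overlapping windows and strides): each window keeps only its largest activation, making a local feature detection more global and translation-tolerant.

```python
import numpy as np

def max_pool2d(x, size=2):
    """Non-overlapping max-pooling: keep the largest activation in each
    size x size window, shrinking spatial resolution by `size`."""
    h, w = x.shape
    h2, w2 = h // size, w // size
    x = x[:h2 * size, :w2 * size]  # drop any ragged border
    # reshape into (rows of windows, window height, cols of windows, window width)
    return x.reshape(h2, size, w2, size).max(axis=(1, 3))

x = np.array([[1., 2., 5., 0.],
              [3., 4., 1., 1.],
              [0., 0., 2., 8.],
              [7., 1., 3., 6.]])
pooled = max_pool2d(x)  # each entry is the max of one 2x2 block
```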

Backpropagation for the convolution operator

Forward pass:
Compute y = f(x; w)  (the convolution of input x with filter w)
Backward pass:
Given dL/dy, compute
dL/dw = ?  (to update the filter)
dL/dx = ?  (to propagate to the layer below)
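A sketch of the standard answer, in the 1-D "valid" case for clarity (2-D is analogous): dL/dw is the input cross-correlated with the upstream gradient, and dL/dx is the upstream gradient fully convolved with the flipped kernel. Both identities are checked here against a finite-difference gradient.

```python
import numpy as np

def conv1d(x, w):
    """Forward pass: 'valid' cross-correlation, y[i] = sum_k w[k] * x[i+k]."""
    n, k = len(x), len(w)
    return np.array([np.dot(w, x[i:i + k]) for i in range(n - k + 1)])

def conv1d_backward(x, w, dy):
    """Backward pass, given dy = dL/dy.
    dL/dw[k] = sum_i dy[i] * x[i+k]  -- x cross-correlated with dy;
    dL/dx    = dy 'full'-convolved with w (np.convolve flips its 2nd
    argument, which is exactly the kernel flip the gradient needs)."""
    dw = np.array([np.dot(dy, x[k:k + len(dy)]) for k in range(len(w))])
    dx = np.convolve(dy, w, mode='full')  # length comes back to len(x)
    return dw, dx

rng = np.random.default_rng(0)
x = rng.standard_normal(7)
w = rng.standard_normal(3)
dy = rng.standard_normal(5)   # upstream gradient; len(y) = 7 - 3 + 1 = 5
dw, dx = conv1d_backward(x, w, dy)
```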

Visualization of the filters

11x11 filters (1st layer)

Visualization of second-level filters

Visualization of third-level filters

Visualization of layer 5

Another visualization: Maximizing class score

Start with a zero image


Keep weights fixed, perform backpropagation for a class!
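The procedure above can be sketched with a toy linear stand-in for a trained network (an assumption for self-containedness; in a real convnet the input gradient comes from full backpropagation, and papers such as Simonyan et al. add L2 regularization on the image):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((10, 64)) * 0.1  # frozen weights of a toy linear "net"
x = np.zeros(64)                         # start with a zero image
target = 3                               # class whose score we maximize

lr = 0.5
for _ in range(100):
    # gradient of scores[target] w.r.t. x is simply row `target` of W here;
    # in a real convnet it is obtained by backprop with weights held fixed
    x += lr * W[target]

scores = W @ x  # the synthesized "image" now scores highest on the target class
```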

Convolution/ReLU/Pooling

Graham. Fractional Max-pooling. arXiv:1412.6071v4

arXiv!
Because of how fast the field evolves, most deep learning papers appear on
arXiv first
http://arxiv.org/list/cs.CV/recent
http://arxiv.org/list/cs.CL/recent
Check those for the newest papers/ideas!

Deconvolutional Network
Instead of mapping
pixels to features,
map the other way
around
Can be used to
learn unsupervised
features
Here, attached to a
trained convnet
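The feature-to-pixel direction can be illustrated with the core filtering step (a 1-D sketch under simplifying assumptions: if "valid" convolution is written as a matrix M acting on pixels, the deconvolutional direction applies the transpose; the Zeiler & Fergus deconvnet additionally unpools with recorded switch locations and passes through ReLU, which is omitted here):

```python
import numpy as np

def conv_matrix(w, n):
    """Dense matrix form of 'valid' 1-D cross-correlation with kernel w
    on inputs of length n: features = M @ pixels."""
    k = len(w)
    M = np.zeros((n - k + 1, n))
    for i in range(n - k + 1):
        M[i, i:i + k] = w
    return M

w = np.array([1., 2., 1.])
M = conv_matrix(w, 6)       # pixels (6) -> features (4)
feat = M @ np.arange(6.0)
recon = M.T @ feat          # transposed convolution: features -> pixel space
```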

Visualization of second-level filters

Visualization of third-level filters

Visualization of layer 5
