
CS231A

Computer Vision:
From 3D Reconstruction
to Recognition

Professor Silvio Savarese


Computational Vision and Geometry Lab

Silvio Savarese, Lecture 1, 28-Mar-16


CS231A
Instructor
o Silvio Savarese
o ssilvio@stanford.edu
o Office: Gates Building, room 154
o Office hours: Thursday 2-3pm or by appointment

CAs:
o Kenji Hata (head CA)
o Sumant Sharma
o Matthew Cong
o Bryan Anenberg
o Kratarth Goel
o Kevin Chen
o Lyne Petse

Class Time & Location

o M-W, 3-4:20PM, Skilling Auditorium



Prerequisites
This course requires knowledge of linear algebra, probability,
statistics, machine learning and computer vision, as well as
decent programming skills. Though not an absolute
requirement, it is encouraged and preferred that you have
taken at least one of CS221, CS229 or CS131A, or have equivalent
knowledge.

We will leverage concepts from low-level image processing
(CS131A) (e.g., linear filters, edge detectors, corner detectors,
etc.) and machine learning (CS229) (e.g., SVM, basic Bayesian
inference, clustering, neural networks, etc.) which we won't
cover in this class.

We will provide links to background material related to CS131A
and CS229 (or discuss it during TA sessions) so students can
refresh or study those topics if needed.



Text books

Required:
- [FP] D. A. Forsyth and J. Ponce. Computer Vision: A Modern Approach (2nd
Edition). Prentice Hall, 2011.
- [HZ] R. Hartley and A. Zisserman. Multiple View Geometry in Computer
Vision. Academic Press, 2002.

Recommended:
- R. Szeliski. Computer Vision: Algorithms and Applications. Springer, 2011.
- D. Hoiem and S. Savarese. Representations and Techniques for 3D Object
Recognition and Scene Interpretation, Synthesis
Lectures on Artificial Intelligence and Machine Learning. Morgan & Claypool
Publishers, 2011.
- Learning OpenCV, by Gary Bradski & Adrian Kaehler, O'Reilly Media, 2008.



Course assignments
1 warm up problem set (HW-0)
4 problem sets (first problem set released next week!)
1 mid-term exam
1 project

Look up class schedule for release and due dates.


Problems will be released through the schedule page and must
be submitted through Gradescope (use code MB5ZB9).



Midterm Exam
The exam will be held in class and you will have 80
minutes to complete it.
The exam will be open-book and open-notes.
You will be updated with more details, e.g., material
to be covered, review sessions etc., as we approach
the midterm.

Course Projects
Replicate an interesting paper
Compare different methods on a test bed
A new approach to an existing problem
Original research

Write a 10-page paper summarizing your results
Release the final code
Give a final in-class presentation
SCPD students can send videos instead.

We will introduce projects in 1-2 weeks

Important dates: look up class schedule
Course Projects
Form your team:
1-4 people
The larger the team, the more work we expect
from it
Be nice to your partner: do you plan to drop the
course?
Evaluation
Quality of the project (including writing)
Final project in-class presentation (~ TBA minutes
spotlight presentations)
Grading policy
Homeworks: 42%
2% for HW0
10% each for HW1, HW2, HW3, HW4

Midterm exam: 15%

Course project: 38%
Midterm progress report: 5%
Final report: 25%
Presentation: 8%

Attendance and class participation: 5%
Questions, answers, remarks, Piazza posts

Class participation is waived for SCPD students. For the project
presentation, SCPD students can send videos instead.
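
As a quick sanity check on the breakdown above, the weights can be written down and summed. This is only an illustrative sketch: the dictionary keys and the `final_grade` helper are hypothetical names, and it assumes the components combine linearly, which the slide does not state explicitly. It also does not model the SCPD participation waiver.

```python
# Weights copied from the grading-policy slide (percent of the final grade).
# The key names are illustrative, not official identifiers.
WEIGHTS = {
    "hw0": 2,
    "hw1": 10,
    "hw2": 10,
    "hw3": 10,
    "hw4": 10,
    "midterm": 15,
    "project_progress_report": 5,
    "project_final_report": 25,
    "project_presentation": 8,
    "participation": 5,
}


def final_grade(scores):
    """Weighted sum of component scores, each a fraction in [0, 1].

    Assumption: components combine linearly and a missing component
    counts as 0.
    """
    return sum(WEIGHTS[name] * scores.get(name, 0.0) for name in WEIGHTS)


# Homeworks (42) + midterm (15) + project (38) + participation (5).
total_weight = sum(WEIGHTS.values())
```

Under these assumptions the weights account for the full grade, and a perfect score on every component yields 100.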
Grading policy
Late policy for homeworks:

If 1 day late, 50% off the grade for that homework
Zero credit if more than one day late.
Two "48-hour one-time late submission bonuses" are
available; that is, you can use a bonus to submit a homework
up to 48 hours late. This is a one-time deal: after you
use all your bonuses, you must adhere to the standard late
submission policy.
No exceptions will be made.
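
The homework late policy can be read as a small lookup. The sketch below is one possible interpretation, not an official rule: `homework_credit` is a hypothetical helper, lateness is counted in whole days, and it assumes that a homework submitted with a bonus but more than 48 hours late receives zero credit (the slide does not cover that case). Tracking that only two bonuses exist is left to the caller.

```python
def homework_credit(days_late, use_bonus=False):
    """Fraction of the earned homework grade kept under the late policy.

    days_late: whole days past the deadline (0 means on time).
    use_bonus: spend one 48-hour late-submission bonus on this homework.
    Assumption: with a bonus, up to 2 days late keeps full credit and
    anything later gets zero; the slide does not spell out this case.
    """
    if days_late <= 0:
        return 1.0
    if use_bonus:
        return 1.0 if days_late <= 2 else 0.0
    if days_late == 1:
        return 0.5  # 50% off
    return 0.0      # more than one day late


on_time = homework_credit(0)
one_day = homework_credit(1)
two_days_with_bonus = homework_credit(2, use_bonus=True)
```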
Grading policy

Late policy for the project:
If 1 day late, 25% off the grade for the project
If 2 days late, 50% off the grade for the project
Zero credit if more than 2 days late
No "late submission bonus" is allowed when submitting your
progress report or project report
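
The project late policy differs from the homework one in its step sizes and in having no bonus. A minimal sketch, assuming lateness is counted in whole days (`project_credit` is a hypothetical name):

```python
def project_credit(days_late):
    """Fraction of the earned project grade kept under the late policy.

    Assumption: lateness is counted in whole days; no late-submission
    bonus applies to the progress report or the project report.
    """
    if days_late <= 0:
        return 1.0
    if days_late == 1:
        return 0.75  # 25% off
    if days_late == 2:
        return 0.5   # 50% off
    return 0.0       # more than 2 days late


# Example: a report graded 90/100 and submitted 1 day late.
penalized = 90 * project_credit(1)
```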

Collaboration policy
Read the student code book; understand what counts as
collaboration and what counts as an academic infraction.
Discussing project assignments with each other is allowed,
but coding must be done individually.
Homework and class project coding policy: using online
code or other students'/researchers' code is not allowed in
general. Exceptions can be made, and individual cases will be
discussed with the instructor.
Lecture 1
Introduction

An introduction to computer vision

Course overview



There was a table set out under
a tree in front of the house,
and the March Hare and the
Hatter were having tea at it.

The table was a large one, but
the three were all crowded
together at one corner of it

From "A Mad Tea-Party",
Alice's Adventures in Wonderland
by Lewis Carroll
Illustration by Arthur Rackham


Computer vision

Image/video

Object 1 ... Object N
- semantic
- geometry

spatial & temporal relations

Scene
- semantic
- geometry
Computer vision

Sensing device
Computational device: Information extraction, Interpretation

1. Information extraction: features, 3D structure, motion flows, etc.
2. Interpretation: recognize objects, scenes, actions, events
Computer vision and Applications

Timeline: 1990, 2000, 2010 (EosSystems)

Fingerprint biometrics
Augmentation with 3D
computer graphics

3D object prototyping (EosSystems Photomodeler)
Computer vision and Applications

New feature detectors/descriptors
CV leverages machine learning

Timeline: 1990, 2000, 2010 (EosSystems, Autostitch)

Face detection

Web applications (Photometria)

Panoramic photography (kolor)

3D modeling of landmarks
Computer vision and Applications
Efficient SLAM/SFM
Large scale image repositories
Deep learning (e.g. ImageNet)

Better clouds
More bandwidth
Increased computational power

Timeline: 1990, 2000, 2010 (EosSystems, Autostitch, A9, Google Goggles, Kooaba, Kinect)
Image search engines

Movies, news, sports

Visual search and
landmark recognition (Google Goggles)

Augmented reality

Motion sensing and
gesture recognition

Autonomous navigation and safety

Mobileye: Vision systems in high-end BMW, GM, Volvo models
But also Toyota, Google, Apple, Tesla, Nissan, Ford, etc.

Source: A. Shashua, S. Seitz

Personal robotics
Computer vision and Applications

Assistive technologies
Surveillance
Factory inspection
Vision for robotics, space exploration
Autonomous driving, robot navigation
Security

Sources: K. Grauman, L. Fei-Fei, S. Lazebnik


Computer vision and Applications

Timeline: 1990, 2000, 2010 (EosSystems, Autostitch, A9, Google Goggles, Kooaba, Kinect)

3D: EosSystems
2D: Google Goggles
Current state of computer vision

3D Reconstruction
3D shape recovery
3D scene reconstruction
Camera localization
Pose estimation

2D Recognition
Object detection
Texture classification
Target tracking
Activity recognition
Current state of computer vision



Snavely et al., 06-08


3D Reconstruction

3D shape recovery
3D scene reconstruction
Camera localization
Pose estimation

Levoy et al., 00 Golparvar-Fard, et al. JAEI 10


Lucas & Kanade, 81 Pandey et al. IFAC , 2010
Chen & Medioni, 92 Hartley & Zisserman, 00
Dellaert et al., 00 Pandey et al. ICRA 2011
Debevec et al., 96
Savarese et al. IJCV 05
Levoy & Hanrahan, 96 Rusinkiewicz et al., 02
Nistér, 04
Savarese et al. IJCV 06
Fitzgibbon & Zisserman, 98 Microsoft's PhotoSynth
Triggs et al., 99 Brown & Lowe, 04
Schindler et al, 04 Snavely et al., 06-08
Pollefeys et al., 99 Schindler et al., 08
Kutulakos & Seitz, 99 Lourakis & Argyros, 04
Colombo et al. 05 Agarwal et al., 09
Frahm et al., 10
Current state of computer vision




2D Recognition

Object detection
Texture classification
Target tracking
Activity recognition

Turk & Pentland, 91 Agarwal & Roth, 02 He et al. 06


Poggio et al., 93 Ramanan & Forsyth, 03 Gould et al. 08
Belhumeur et al., 97 Weber et al., 00 Maire et al. 08
LeCun et al. 98 Vidal-Naquet & Ullman 02 Felzenszwalb et al., 08
Amit and Geman, 99 Fergus et al., 03
Shi & Malik, 00 Kohli et al. 09
Torralba et al., 03 L.-J. Li et al. 09
Viola & Jones, 00
Vogel & Schiele, 03 Ladicky et al. 10,11
Felzenszwalb & Huttenlocher 00
Barnard et al., 03 Gonfaus et al. 10
Belongie & Malik, 02 Fei-Fei et al., 04
Ullman et al. 02 Farhadi et al., 09
Kumar & Hebert 04 Lampert et al., 09
Current state of computer vision

2D Recognition (example image labels: Building, Tree, Car, Person, Bike, Street)
Current state of computer vision

3D Reconstruction and 2D Recognition

Perceiving the World in 3D!
Visual processing in the brain

V1
"where" pathway (dorsal stream)
"what" pathway (ventral stream)
Pre-frontal cortex
CS 231A course overview

1. Geometry
2. Semantics

Geometry:
- How to extract 3D information?
- Which cues are useful?
- What are the mathematical tools?
Camera systems
Establish a mapping from 3D to 2D
How to calibrate a camera
Estimate camera parameters such as pose or focal length
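
To make the "mapping from 3D to 2D" concrete, here is a minimal sketch of an ideal pinhole projection. It is an illustration under stated assumptions rather than the model as the course defines it: the point is already expressed in camera coordinates, there is no lens distortion, and the function name `project_point` and the focal length of 800 pixels are made up for the example.

```python
def project_point(point_3d, focal_length, principal_point=(0.0, 0.0)):
    """Project a 3D point in camera coordinates onto the image plane.

    Ideal pinhole model: x = f * X / Z + cx, y = f * Y / Z + cy.
    Assumptions: the camera looks along +Z and there is no distortion.
    """
    X, Y, Z = point_3d
    if Z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    cx, cy = principal_point
    return (focal_length * X / Z + cx, focal_length * Y / Z + cy)


# Two points on the same ray through the camera center land on the
# same image location: the projection discards depth.
near = project_point((1.0, 2.0, 4.0), focal_length=800.0)
far = project_point((2.0, 4.0, 8.0), focal_length=800.0)
```

Both calls return the same image coordinates, which is why recovering 3D structure needs additional cues or views. Calibration is the inverse task of estimating quantities such as the focal length from observations.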

Single view metrology
Estimate 3D properties of the world from a single image

Multiple view geometry
Estimate 3D properties of the world from multiple views
Mathematical tools

Epipolar geometry

Tomasi & Kanade (1993) Projective Photoconsistency

structure from motion:
Here be dragons!
Structure from motion

Courtesy of Oxford Visual Geometry Group

Structured lighting and volumetric stereo

Scanning Michelangelo's "The David"

The Digital Michelangelo Project
- http://graphics.stanford.edu/projects/mich/
2 BILLION polygons, accuracy to .29mm
CS 231A course overview

1. Geometry
2. Semantics

Semantics:
- How to recognize objects?
- How to classify images or understand a scene?
- How to segment out critical semantics?
- How to estimate 3D properties (pose, size, shape)?
Object recognition and categorization
Downtown Chicago
Building
clock
person
car
Pedestrians crossing street

Classification:
Is this a forest?
No!

Classification:
Does this image contain a building? [yes/no]
Yes!

Detection:
Does this image contain a car? [where?]
car

Detection:
Which objects does this image contain? [where?]
Building
clock
person
car

Detection:
Accurate localization (segmentation)
clock

Detection:
Estimating 3D geometrical properties
Building, 45 degrees
Car, side view
Person, back
Challenges: viewpoint variation

slide credit: Fei-Fei, Fergus & Torralba


Challenges: illumination

image credit: J. Koenderink


Challenges: scale

slide credit: Fei-Fei, Fergus & Torralba


Challenges: deformation
Challenges: occlusion

Magritte, 1957 slide credit: Fei-Fei, Fergus & Torralba


Challenges: background clutter

Kilmeny Niland. 1995


Challenges: object intra-class variation

slide credit: Fei-Fei, Fergus & Torralba


CS 231A course overview

1. Geometry
2. Semantics

Joint recovery of geometry and semantics!


Joint reconstruction and recognition

Input images

Car, Person, Tree, Sky, Street, Building, Else
Syllabus
Lecture Topic

March
1 Introduction
2 Camera models

3D geometry
3 Camera calibration
4 Single view metrology
5 Epipolar geometry
6 Multi-view geometry

April
7 Structure from motion / SLAM
8 Volumetric stereo
9 Fitting and Matching
10 Detectors and Descriptors (Proposal due)

Recognition
11 Intro to Recognition; Object classification I
12 Object classification II
13 2D Object detection

May
14 3D Object recognition
15 Scene understanding & segmentation
16 3D Scene understanding

June
Project presentations
CS231
Introduction to
Computer Vision

Next lecture: Camera systems

