Yi Ma
Electrical and Computer Engineering, UIUC & Visual Computing Group, MSRA
Recognition
Surveillance
Bioinformatics
The curse of dimensionality: we increasingly need to perform inference with limited samples of very high-dimensional data.
The blessing of dimensionality: real data highly concentrate on low-dimensional, sparse, or degenerate structures in the high-dimensional space.
But nothing is free: gross errors and irrelevant measurements are now ubiquitous in massive, cheap data.
Gaussian samples in 2D
Computational Tools:
Linear programming, convex optimization, greedy pursuit, boosting, parallel processing
Practical Applications:
Compressive sensing, sketching, sampling, audio, image, video, bioinformatics, classification, recognition
Part I: Key Ideas and Application
Robust Face Recognition via Sparse Representation
# ACLU: Face-Recognition Systems Won't Work. ZDNet, Nov. 2, 2001.
# ACLU Warns of Face Recognition Pitfalls. Newsbytes, Nov. 2, 2001.
# Identix, Visionics Double Up. CNN / Money Magazine, Feb. 22, 2002.
# 'Face testing' at Logan is found lacking. Boston Globe, July 17, 2002.
# Reliability of face scan technology in dispute. Boston Globe, August 5, 2002.
# Tampa drops face-recognition system. CNET, August 21, 2003.
# Airport anti-terror systems flub tests. USA Today, September 2, 2003.
# Anti-terror face recognition system flunks tests. The Register, September 3, 2003.
# Passport ID technology has high error rate. The Washington Post, August 6, 2004.
# Smiling Germans ruin biometric passport system. VNUNet, November 10, 2005.
# U.K. cops look into face-recognition tech. ZDNet News, January 17, 2006.
# Police build national mugshot database. Silicon.com, January 16, 2006.
# Face Recognition Algorithms Surpass Humans Matching Faces. PAMI, 2007.
# 100% Accuracy in Automatic Face Recognition. Science, January 25, 2008.
Training Images
Images of the same face under varying illumination lie approximately on a low (nine)-dimensional subspace, known as the harmonic plane [Basri & Jacobs, PAMI, 2003].
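As an illustration of this low-dimensional structure, here is a minimal numerical sketch (mine, not from the slides) of how one might check the claim: stack vectorized images of a single subject under varying illumination as columns of a matrix and measure how much energy the top nine singular components capture. The `images` array below is a random stand-in; real data such as cropped Extended Yale B images would be needed to reproduce the effect.

import numpy as np

# Each column of X is one vectorized image of a SINGLE subject under a
# different illumination. `images` is a stand-in; replace it with real data
# (e.g. cropped Extended Yale B images) to reproduce the effect.
images = np.random.rand(64, 192, 168)          # 64 illuminations, 192x168 pixels
X = images.reshape(images.shape[0], -1).T      # pixels x illuminations

# Singular values of the image matrix.
s = np.linalg.svd(X, compute_uv=False)

# Fraction of the total energy captured by the leading nine components.
r = 9
energy = (s[:r] ** 2).sum() / (s ** 2).sum()
print(f"energy captured by a {r}-dimensional subspace: {energy:.1%}")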
The solution x should be a sparse vector: most of its entries should be zero, except for those associated with the correct subject.
This x is the solution of an under-determined system of linear equations, y = A x, whose columns are the training images of all subjects.
The problem can be solved efficiently via Linear Programming, and the solution is stable under moderate noise [Candes & Tao04, Donoho04].
The L0-L1 equivalence holds if and only if the desired solution is sufficiently sparse.
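To make the linear-programming claim above concrete, here is a minimal sketch (my own toy code, not the authors') of basis pursuit, min ||x||_1 subject to A x = y, posed as an LP using the standard split x = u - v with u, v >= 0 and solved with scipy's linprog; the problem sizes and data are arbitrary toy choices.

import numpy as np
from scipy.optimize import linprog

def basis_pursuit(A, y):
    """Solve min ||x||_1 subject to A x = y as a linear program.

    Standard reformulation: write x = u - v with u, v >= 0 and minimize
    1^T (u + v) subject to [A, -A] [u; v] = y.
    """
    m, n = A.shape
    c = np.ones(2 * n)                       # objective: sum of u and v
    res = linprog(c, A_eq=np.hstack([A, -A]), b_eq=y,
                  bounds=(0, None), method="highs")
    return res.x[:n] - res.x[n:]

# Toy check: a 5-sparse vector in R^100 recovered from 40 random measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100))
x_true = np.zeros(100)
x_true[rng.choice(100, 5, replace=False)] = rng.standard_normal(5)
x_hat = basis_pursuit(A, A @ x_true)
print("recovery error:", np.linalg.norm(x_hat - x_true))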
Wright, Yang, Ganesh, Sastry, and Ma. Robust Face Recognition via Sparse Representation, PAMI 2009
Face recognition as determining which facet of the polytope the test image belongs to.
[Figure: input test image expressed as a combination of training images from subjects 1, 2, 3, ..., i, ..., N]
[Figure: recognition under random pixel corruption at 30%, 50%, and 70% corruption; recognition rates shown include 99.3%, 90.7%, and 37.5%]
[Figure: recognition under occlusion; at 30% occlusion, rates shown include 98.5%, 90.3%, and 65.3%]
Results corroborate findings in human vision: the eyebrow or eye region is most informative for recognition [Sinha06]. However, the difference is less significant for our algorithm than for humans.
The AR Database (100 subjects). Training: 799 un-occluded images (EBP = 11.6%). Testing: 200 images with glasses and 200 images with a scarf.
Model: test image = training dictionary times coefficients, plus an error due to corruption or occlusion, i.e. y = A x + e.
The solution (x, e) should be sparse: x should be supported only on images of the same subject (ideally the sparse solution is unique), and e is expected to be sparse because corruption or occlusion affects only a subset of the pixels. Seek the sparsest solution:
min ||x||_0 + ||e||_0 subject to A x + e = y.
Convex relaxation:
min ||x||_1 + ||e||_1 subject to A x + e = y.
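A minimal sketch of the resulting classification rule (my own toy reimplementation, not the authors' code): solve the extended L1 problem over [A I], keep the coefficients of each class in turn, and assign the test image to the class with the smallest residual. The random "training images", toy sizes, and the scipy LP solver are all assumptions of the sketch.

import numpy as np
from scipy.optimize import linprog

def src_classify(A, labels, y):
    """Toy sparse-representation classification.

    Solve  min ||x||_1 + ||e||_1  s.t.  A x + e = y  via basis pursuit on the
    extended dictionary B = [A, I], then assign y to the class whose training
    images explain it best (smallest residual).
    """
    m, n = A.shape
    B = np.hstack([A, np.eye(m)])                 # extended dictionary [A I]
    c = np.ones(2 * (n + m))                      # ||w||_1 with w = [x; e]
    res = linprog(c, A_eq=np.hstack([B, -B]), b_eq=y,
                  bounds=(0, None), method="highs")
    w = res.x[: n + m] - res.x[n + m :]
    x, e = w[:n], w[n:]
    residuals = {}
    for cls in np.unique(labels):
        x_cls = np.where(labels == cls, x, 0.0)   # keep one class's coefficients
        residuals[cls] = np.linalg.norm(y - e - A @ x_cls)
    return min(residuals, key=residuals.get)

# Toy usage: 8 subjects with 5 random "training images" each.
rng = np.random.default_rng(1)
A = rng.random((64, 40))
A /= np.linalg.norm(A, axis=0)
labels = np.repeat(np.arange(8), 5)
y = A[:, labels == 3] @ rng.random(5) + 0.05 * rng.standard_normal(64)
print("predicted subject:", src_classify(A, labels, y))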
L0 minimization:
min ||x||_0 + ||e||_0 subject to A x + e = y, yielding the sparsest solution (x_0, e_0).
Succeeds whenever the true solution is sufficiently sparse, but the problem is combinatorial.

L1 minimization:
min ||x||_1 + ||e||_1 subject to A x + e = y, yielding a convex program.
Succeeds whenever the dictionary is incoherent enough and the solution sparse enough (see the prior work below).

This work: Instead analyze the L1 problem over the combined dictionary [A I],
min ||w||_1 subject to [A I] w = y, with w = [x; e],
and show that it succeeds even though the columns of A are highly correlated and the error e may be dense.
PRIOR WORK
Equivalence in terms of mutual incoherence [Donoho + Elad 03]: L1 recovers the L0 solution whenever the number of nonzeros is below (1 + 1/mu(A)) / 2, where mu(A) is the largest inner product between distinct (normalized) columns.
Restricted Isometry [Candes + Tao]: near-isometry of A on all sparse vectors suffices.
Existing theory thus requires the columns of A to be incoherent and the solution to be very sparse: here the columns of A are face images and hence highly correlated, so neither assumption applies.
Wright, and Ma. ICASSP 2009, submitted to IEEE Trans. Information Theory.
Conjecture: If the matrices A are sufficiently coherent (a "bouquet"), then for any error fraction rho < 1, as the dimension m grows, solving
min ||x||_1 + ||e||_1 subject to A x + e = y
recovers the true (x_0, e_0) with overwhelming probability.
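A small simulation sketch of this setting (toy sizes and parameters chosen by me, not taken from the paper): the columns of A are tightly clustered around a common direction (the "bouquet"), the signal is sparse, a large fraction of the observations are grossly corrupted, and the joint L1 problem is solved as a linear program.

import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)
m, n, k = 200, 20, 3          # observation dim, signal dim, signal sparsity
rho = 0.4                     # fraction of corrupted observations

# "Bouquet": columns of A tightly clustered around a common mean direction.
mu = rng.standard_normal(m)
mu /= np.linalg.norm(mu)
A = mu[:, None] + 0.05 * rng.standard_normal((m, n))
A /= np.linalg.norm(A, axis=0)

x0 = np.zeros(n)
x0[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
e0 = np.zeros(m)
supp = rng.choice(m, int(rho * m), replace=False)
e0[supp] = 5.0 * rng.standard_normal(supp.size)    # gross errors, arbitrary size
y = A @ x0 + e0

# Joint L1 minimization:  min ||x||_1 + ||e||_1  s.t.  A x + e = y,
# written as an LP over the extended dictionary [A I] with a +/- split.
B = np.hstack([A, np.eye(m)])
res = linprog(np.ones(2 * (n + m)), A_eq=np.hstack([B, -B]), b_eq=y,
              bounds=(0, None), method="highs")
w = res.x[: n + m] - res.x[n + m :]
print("relative error in x:", np.linalg.norm(w[:n] - x0) / np.linalg.norm(x0))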
Weak proportional growth: the observation dimension m grows; the problem size grows proportionally (n on the order of m); the error support grows proportionally (||e_0||_0 = rho * m); the support size of the signal is sublinear in m (||x_0||_0 = o(m)).
In this regime, L1 minimization recovers any sparse signal from almost any error with density less than 1.
[Plot: recovery performance of L1 minimization with the combined dictionary [A I], "L1-comp", and ROMP (regularized orthogonal matching pursuit)]
Invalid Subject
[Examples at 0%, 10%, 20%, 30%, and 50% corruption]
Application to error correction over a channel: the transmitter encodes a message x as y = A x; the channel corrupts some entries, giving y' = A x + e; the receiver recovers x by solving min ||x||_1 + ||e||_1 subject to A x + e = y'.
Alice intentionally corrupts messages.
Remaining obstacles to truly practical automatic face recognition:
- Pose and misalignment: real face detector imprecision!
- Obtaining sufficient training: which illuminations are truly needed?
- Scalability to large databases: both in speed and accuracy.
All three difficulties can be addressed within the same unified framework of sparse representation.
Recognition succeeds only if the test image is well aligned with the training images. Seek the deformation and the sparse representation jointly; the problem is nonconvex in the deformation, so linearize about the current estimate and iterate, each step reducing to a linear program (or a simple least-squares solution). Solve, set the new alignment, and repeat; once aligned, classify based on the per-subject residuals.
Excellent classification, validation and robustness with a linear-time algorithm that is efficient in practice and highly parallelizable.
Receiver Operating Characteristic (ROC). Validation performance: is the subject in the database of 249 people? NN, NS, and LDA are not much better than chance; our method achieves an equal error rate below 10%.
[Figure: which training illuminations are needed? 32 illumination cells; recognition succeeds only when rear illuminations are included. Recognition rates for different training subsets: 95.9%, 91.5%, 62.3%, 73.7%, 53.5%]
[Figure: recognition under block occlusion up to 60%; rates shown include 98.5%, 90.3%, 65.3%, and 37.5%]
[Figure: query image, recovered error, and recovered image]
Longer-term direction: sparse representation on structured domains (a la [Baraniuk 08, Do 07]):
[Figure: face images downsampled to 12x10 pixels, i.e. 120-dimensional features]
Compressed sensing:
Number of linear measurements is more important than specific details of how those measurements are taken.
d > 2k log (N/d) random measurements suffice to efficiently reconstruct any k-sparse signal. [Donoho and Tanner 07]
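A small arithmetic sketch of the quoted rule of thumb: the inequality d > 2k log(N/d) is implicit in d, so the smallest such d can be found by a simple fixed-point iteration (this only evaluates the asymptotic formula; it is not a finite-sample guarantee).

import math

def measurements_needed(k, N, iters=100):
    """Approximate the smallest d with d > 2 * k * log(N / d).

    Simple fixed-point iteration on d = 2 k log(N / d); purely an evaluation
    of the asymptotic rule of thumb quoted from [Donoho and Tanner 07].
    """
    d = 2 * k * math.log(N)        # crude starting point
    for _ in range(iters):
        d = 2 * k * math.log(N / d)
    return math.ceil(d)

# Example: a 10-sparse signal in dimension N = 10,000 needs on the order of
# a hundred random measurements by this rule of thumb.
print(measurements_needed(10, 10_000))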
[Figure: results of MRF / BP [Freeman IJCV 00] and our method, alongside the originals]
Precision of 98.8% and recall of 94.2%, far better than existing detectors and classifiers.
Understanding either behavior requires a much more expressive model of what happens inside the bouquet.
Robust PCA: decompose the observation D into a low-rank component A and a sparse error E, D = A + E:
min rank(A) + gamma ||E||_0 subject to A + E = D.
Convex relaxation: replace the rank with the nuclear norm ||A||_* (the sum of the singular values) and the L0 norm with the L1 norm:
min ||A||_* + lambda ||E||_1 subject to A + E = D.
Wright, Ganesh, Rao and Ma, submitted to the Journal of the ACM.
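A minimal sketch of one standard way to solve this convex relaxation (an ADMM / augmented-Lagrangian iteration alternating singular value thresholding and entrywise soft thresholding). The parameter choices are common defaults, and this is not the authors' own solver.

import numpy as np

def svt(M, tau):
    """Singular value thresholding: shrink the singular values of M by tau."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def shrink(M, tau):
    """Entrywise soft thresholding."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def rpca_admm(D, lam=None, mu=None, n_iter=200):
    """Sketch of Robust PCA:  min ||A||_* + lam ||E||_1  s.t.  A + E = D.

    Standard ADMM / augmented Lagrangian updates with common defaults
    (lam = 1 / sqrt(max dimension)); not the authors' exact algorithm.
    """
    m, n = D.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))
    if mu is None:
        mu = 0.25 * D.size / np.abs(D).sum()
    A = np.zeros_like(D)
    E = np.zeros_like(D)
    Y = np.zeros_like(D)
    for _ in range(n_iter):
        A = svt(D - E + Y / mu, 1.0 / mu)        # low-rank update
        E = shrink(D - A + Y / mu, lam / mu)     # sparse-error update
        Y = Y + mu * (D - A - E)                 # dual (multiplier) update
    return A, E

# Synthetic check: a rank-2 matrix plus sparse gross errors.
rng = np.random.default_rng(3)
L = rng.standard_normal((80, 2)) @ rng.standard_normal((2, 80))
S = np.zeros((80, 80))
mask = rng.random((80, 80)) < 0.05
S[mask] = 10.0 * rng.standard_normal(mask.sum())
A_hat, E_hat = rpca_admm(L + S)
print("low-rank recovery error:", np.linalg.norm(A_hat - L) / np.linalg.norm(L))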
Random orthogonal model (of rank r) [Candes & Recht 08]: the singular subspaces are independent samples from the invariant measure on the Stiefel manifold of rank-r orthobases; the singular values are arbitrary. Error model: the support of E is random, and the magnitude of its nonzero entries is arbitrary.
Convex optimization recovers almost any matrix of rank O(m / log m) from errors affecting O(m^2) of the observations!
Key technique: Iterative surgery for producing a certifying dual vector (extends [Wright and Ma 08]).
Convex optimization exactly recovers matrices of rank O(m), even when O(m^2) of the entries are missing!
Caveats:
- [C-T 09] is tighter for small r.
- [C-T 09] generalizes better to other matrix ensembles.
Future direction: sampling approximations to the singular value thresholding operator [Rudelson and Vershynin 08] ?
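As context for the singular value thresholding operator mentioned above, here is a minimal sketch of the classical SVT iteration for matrix completion of Cai, Candes, and Shen; the step size and threshold are common defaults, and the data are synthetic.

import numpy as np

def svt_complete(D_obs, mask, tau=None, delta=None, n_iter=300):
    """Matrix completion sketch via singular value thresholding.

    Approximately solves  min ||A||_*  s.t.  A agrees with D_obs on the
    observed entries (mask == True). D_obs must be zero off the mask.
    """
    m, n = D_obs.shape
    if tau is None:
        tau = 5.0 * np.sqrt(m * n)
    if delta is None:
        delta = 1.2 * mask.size / mask.sum()     # common step-size heuristic
    Y = np.zeros_like(D_obs)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        A = U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt    # SVT step
        Y = Y + delta * mask * (D_obs - A)                # step on observed entries
    return A

# Synthetic check: a rank-2 matrix with roughly half of its entries observed.
rng = np.random.default_rng(4)
L = rng.standard_normal((60, 2)) @ rng.standard_normal((2, 60))
mask = rng.random((60, 60)) < 0.5
A_hat = svt_complete(L * mask, mask)
print("completion error:", np.linalg.norm(A_hat - L) / np.linalg.norm(L))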
[Figures: video frames decomposed into a low-rank approximation capturing background variation and a sparse error capturing anomalous activity]
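As a usage sketch for the surveillance example (assuming the rpca_admm helper from the Robust PCA sketch above; the synthetic "video" here is just a stand-in): each frame is vectorized as one column of D, the low-rank part gives the background, and the sparse part flags the moving object.

import numpy as np
# Assumes rpca_admm(...) from the Robust PCA sketch earlier in this document.

# Stand-in "video": 50 frames of 24x32 pixels, a static background plus a
# small bright square moving across the scene (the anomalous activity).
rng = np.random.default_rng(5)
background = rng.random((24, 32))
frames = np.repeat(background[None], 50, axis=0)
for t in range(50):
    frames[t, 5:9, (t % 28):(t % 28) + 4] += 1.0

D = frames.reshape(50, -1).T                   # one column per vectorized frame
A_hat, E_hat = rpca_admm(D)                    # low-rank background, sparse foreground
background_est = A_hat[:, 0].reshape(24, 32)   # estimated background frame
foreground = np.abs(E_hat).T.reshape(50, 24, 32) > 0.5   # per-frame activity mask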
CONCLUSIONS
Analytic and algorithmic tools from sparse representation lead to a new approach to face recognition:
- Robustness to corruption and occlusion
- Performance exceeding expectations and human ability
Face recognition reveals new phenomena in high-dimensional statistics and geometry:
- Dense error correction with a coherent dictionary
- Recovery of corrupted low-rank matrices
Theoretical insights into mathematical models lead back to practical gains.
REFERENCES + ACKNOWLEDGEMENT
- Robust Face Recognition via Sparse Representation. IEEE Trans. on Pattern Analysis and Machine Intelligence, February 2009.
- Dense Error Correction via L1-Minimization. ICASSP 2008; submitted to IEEE Trans. Information Theory, September 2008.
- Towards a Practical Face Recognition System: Robust Alignment and Illumination via Sparse Representation. IEEE Conference on Computer Vision and Pattern Recognition, June 2009.
- Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Matrices by Convex Optimization. Submitted to the Journal of the ACM, May 2009.
John Wright, Allen Yang, Andrew Wagner, Arvind Ganesh, Zihan Zhou
THANK YOU
Questions, please?
Extended Yale B

L1
Dimension (d)   Downsample [%]   Fisher [%]
30              76.2             85.9
56              87.6             N/A
120             92.7             N/A
504             96.9             N/A

Nearest Neighbor
Dimension (d)   Eigen [%]   Laplacian [%]   Random [%]   Downsample [%]   Fisher [%]
30              72.0        75.6            60.1         46.7             87.7
56              79.8        81.3            66.5         54.7             N/A
120             83.9        85.2            67.8         61.8             N/A
504             85.8        87.7            66.4         65.4             N/A

Nearest Subspace
Dimension (d)   Eigen [%]   Laplacian [%]   Random [%]   Downsample [%]   Fisher [%]
30              89.9        89.0            87.4         80.8             81.9
56              91.1        90.4            91.5         88.2             N/A
120             92.5        91.9            93.9         91.1             N/A
504             93.2        93.4            94.1         93.4             N/A

AR Database

L1
Dimension (d)   Eigen [%]   Laplacian [%]   Random [%]   Downsample [%]   Fisher [%]
30              71.1        73.7            57.8         46.8             87.0
56              80.0        84.7            75.5         67.0             92.3
120             85.7        91.0            87.5         84.6             N/A
504             92.0        94.3            94.7         93.9             N/A

Nearest Neighbor
Dimension (d)   Eigen [%]   Laplacian [%]   Random [%]   Downsample [%]   Fisher [%]
30              68.1        73.1            56.7         51.7             83.4
56              74.8        77.1            63.7         60.9             86.8
120             79.3        83.8            71.4         69.2             N/A
504             80.5        89.7            75.0         73.7             N/A

Nearest Subspace
Dimension (d)   Eigen [%]   Laplacian [%]   Random [%]   Downsample [%]   Fisher [%]
30              64.1        66.0            59.2         56.2             80.3
56              77.1        77.5            68.2         67.7             85.8
120             82.0        84.3            80.0         77.0             N/A
504             85.1        90.3            83.3         82.1             N/A
Partial face features (including the nose and right eye regions)
Dimension   L1 [%]   NN [%]   NS [%]   SVM [%]
4,270       87.3     49.2     83.7     70.8
5,050       93.7     68.8     78.6     85.8
12,936      98.3     72.7     94.4     95.3
[Backup slides: proof sketch. Call a pair L1-recoverable if it is the unique solution of the L1 minimization; w.l.o.g. restrict to a lower-dimensional set and write the condition there (the NSC reduces to a statement about a hyperplane); the argument then proceeds by induction, with a base case, an inductive step, and a bound on the magnitude of the relevant quantities.]