
Sparse & Redundant Representations

and Their Applications in


Signal and Image Processing
Theoretical Study of the Approximate Pursuit Problem

Michael Elad
The Computer Science Department
The Technion – Israel Institute of Technology
Haifa 32000, Israel
Uniqueness vs. Stability
– Gaining Intuition
The (P0) Problem

 We have just decided to migrate from


(P0) to its approximated version (P0)

(P0 ) min x 0
s.t. A x  b
x

 2 2
(P ) m in x
0 0
s.t. Ax  b 2

x

 The question we aim to address now is this:


What is the parallel to the uniqueness claims
made for (P0) in the new formulated problem ?

Michael Elad | The Computer-Science Department | The Technion


Recall: Uniqueness

• For the (P0) problem, the uniqueness result stated this:
  o Suppose that $b = A x_0$ and $x_0$ is sparse
  o We are given A and b, and we solve (somehow) the (P0) problem
    $(P_0):\ \min_x \|x\|_0 \ \text{s.t.}\ Ax = b$, getting $\hat{x}$
  o Then, if $x_0$ is sparse enough,
    $\|x_0\|_0 < \frac{1}{2}\left(1 + \frac{1}{\mu(A)}\right)$,
    then $\hat{x} = x_0$
• Could the same be claimed for $(P_0^\epsilon)$?
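The uniqueness threshold above is easy to evaluate numerically. A minimal sketch (my code, not from the slides) that computes the mutual coherence $\mu(A)$ and the resulting sparsity threshold, using the 2×4 dictionary of the illustrative example that follows:

```python
import numpy as np

def mutual_coherence(A):
    """mu(A): largest absolute inner product between two distinct,
    normalized columns (atoms) of A."""
    A = A / np.linalg.norm(A, axis=0)      # normalize the atoms
    G = np.abs(A.T @ A)                    # absolute Gram matrix
    np.fill_diagonal(G, 0.0)               # ignore the trivial diagonal
    return G.max()

A = np.array([[1.0, 0.0, 0.6, 0.8],
              [0.0, 1.0, 0.8, 0.6]])
mu = mutual_coherence(A)                   # 0.96 here (the two oblique atoms)
threshold = 0.5 * (1.0 + 1.0 / mu)         # uniqueness if ||x0||_0 < threshold
print(mu, threshold)
```

For this highly coherent dictionary the threshold barely exceeds 1, so only 1-sparse representations are guaranteed to be unique.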
An Illustrative Example

• We consider the following case:
    $A = \begin{bmatrix} 1.0 & 0 & 0.6 & 0.8 \\ 0 & 1.0 & 0.8 & 0.6 \end{bmatrix}, \qquad x_0 = [0,\ 0,\ 2,\ 0]^T$
• The vector b is created by $b = A x_0 + v$, where $\|v\|_2 = 0.5$
• Given A, b and $\epsilon = 0.5$, we solve
    $(P_0^\epsilon):\ \min_x \|x\|_0 \ \text{s.t.}\ \|Ax - b\|_2 \le \epsilon$
• Where is the solution $\hat{x}$ located? We will use a geometrical point of view in order to answer this question
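To make the example concrete, here is a small brute-force sketch (my code; the specific noise realization is arbitrary) that solves $(P_0^\epsilon)$ exactly by sweeping supports in order of increasing cardinality:

```python
import numpy as np
from itertools import combinations

A = np.array([[1.0, 0.0, 0.6, 0.8],
              [0.0, 1.0, 0.8, 0.6]])
x0 = np.array([0.0, 0.0, 2.0, 0.0])

rng = np.random.default_rng(0)
v = rng.standard_normal(2)
v *= 0.5 / np.linalg.norm(v)               # noise with norm exactly 0.5
b = A @ x0 + v
eps = 0.5

def solve_p0_eps(A, b, eps):
    """Brute-force (P0^eps): return the sparsest x with ||Ax-b||_2 <= eps,
    trying supports in order of increasing cardinality."""
    m = A.shape[1]
    for k in range(m + 1):
        for S in combinations(range(m), k):
            cols = list(S)
            x = np.zeros(m)
            if cols:
                x[cols], *_ = np.linalg.lstsq(A[:, cols], b, rcond=None)
            if np.linalg.norm(A @ x - b) <= eps:
                return x
    return None

xhat = solve_p0_eps(A, b, eps)
print(np.count_nonzero(xhat))   # 1 -- sparser than x0 itself, and not equal to it
```

The cardinality-1 solution exists for any noise of norm 0.5, because the component of b orthogonal to the third atom equals the orthogonal component of v, whose norm is at most 0.5 = ε.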
An Illustrative Example

[Figure: the four atoms of A drawn as vectors in the plane, together with $A x_0$ and $b = A x_0 + v$.]

An Illustrative Example

[Figure: the ball $\|Ax - b\|_2 \le \epsilon$ around b. The possible solutions $A\hat{x}$ of $(P_0^\epsilon)$ fall inside this ball; among them are solutions of cardinality 1.]


An Illustrative Example

[Figure: the possible solutions $A\hat{x}$ inside the ball around $b = A x_0 + v$.]

We see that even if we get the perfect $(P_0^\epsilon)$ solution, it is not unique, and it does not necessarily coincide with $x_0$. This brings us to move from "Uniqueness" to "Stability", claiming that the two are not far apart.

An Illustrative Example

[Figure: the same geometry with a larger ball $\|Ax - b\|_2 \le \epsilon$.]

As the noise strengthens, the result might be of a different support.


The Restricted Isometry
Property (RIP)
Recall: The Uniqueness of (P0)

• $x_0$ is an s-sparse vector, and we create $b = A x_0$
• Solving the (P0) problem we get $\hat{x}$, which must satisfy
  (i) $A\hat{x} = b$  and  (ii) $\|\hat{x}\|_0 \le s$
• Thus: $A x_0 = b$ and $A\hat{x} = b$ imply $A(x_0 - \hat{x}) = 0$
• $x_0 - \hat{x}$ must be either zero, or in the null-space of A, implying
    $2s \ge \|x_0 - \hat{x}\|_0 \ge \mathrm{Spark}(A) \ge 1 + \frac{1}{\mu(A)}$
• Therefore, if $s < 0.5\,(1 + 1/\mu)$, necessarily $\hat{x} = x_0$, establishing the uniqueness property


So, How to Analyze Stability of (P0) ?

 x0 is an s-sparse vector, and we create


b  A x 0  v where v 2

 Solving the (P0) problem

(P0 ) m in x 0
s.t. Ax  b 2

x
A x0
we get x̂ satisfying: b
a) A x̂  b 2   and

b) x̂ 0  s as well A x̂

 Thus:
A ˆ
x  x0   2 & ˆ
x  x0  2s
2 0

Michael Elad | The Computer-Science Department | The Technion


So, How Do We Proceed from Here?

Define $z = \hat{x} - x_0$:
    $\|Az\|_2 \le 2\epsilon$  and  $\|z\|_0 \le 2s$

• Is the spark still relevant for the noisy case?
• The answer is negative! We should generalize it somehow to characterize vectors satisfying $\|Az\|_2 \le 2\epsilon$


Revisiting the Spark Definition

• For a matrix A, define $A_s$ as a submatrix containing s of its columns (atoms)
• Consider all (m-choose-s) such possible submatrices $A_s$. Then …
• Assume that $\mathrm{Spark}(A) = s$. Then, at least one of these $A_s$ has linearly dependent columns
• Equivalently, for that submatrix: $\sigma_{\min}(A_s) = 0$
Generalizing this Definition

• We use the same submatrices $A_s$
• We generalize the spark by allowing the smallest singular value to be > 0
• Put differently: for all $A_s$ we have
    $\sigma_{\min}(A_s) = \min_{z \in \mathbb{R}^s,\ z \ne 0} \frac{\|A_s z\|_2}{\|z\|_2} \ge \sigma_{\min}$
• By stating what is the minimal such $\sigma_{\min}$ over all $A_s$, we characterize the behavior of the complete matrix A with respect to supports of cardinality s
Modification #1 to this Definition

• What if $A_s$ has orthonormal columns? In that case $\sigma_{\min} = 1$, as long as $s \le n$
• As we will see next, this is the ideal behavior, and we would like to quantify the deviation from it
• Thus we modify the definition to be
    $\min_{z \in \mathbb{R}^s,\ z \ne 0} \frac{\|A_s z\|_2^2}{\|z\|_2^2} = 1 - \delta_s \ge 0$
  or equivalently, for all $z \in \mathbb{R}^s$:
    $(1 - \delta_s)\,\|z\|_2^2 \le \|A_s z\|_2^2$


Modification #2 to this Definition

• What is the meaning of this definition?
    $\forall z \in \mathbb{R}^s:\ (1 - \delta_s)\,\|z\|_2^2 \le \|A_s z\|_2^2$
• Answer: If $\delta_s \ll 1$, the multiplication by $A_s$ does not shrink the vector too much
• Amend this definition by making it two-sided:
    $\forall z \in \mathbb{R}^s:\ (1 - \delta_s)\,\|z\|_2^2 \le \|A_s z\|_2^2 \le (1 + \delta_s)\,\|z\|_2^2$
• The intuition: multiplication by $A_s$ hardly changes the length of z. This is the Restricted Isometry Property (RIP)


RIP Definition

• So, let's make it precise:

Definition: For a given matrix A, and for a cardinality s, we define the RIP constant as the smallest possible $\delta_s$ satisfying the condition
    $(1 - \delta_s)\,\|z\|_2^2 \le \|A_s z\|_2^2 \le (1 + \delta_s)\,\|z\|_2^2$
for all $A_s$ and all $z \in \mathbb{R}^s$

• The smaller $\delta_s$, the better the behavior of A for sparse vectors of cardinality s
RIP: Alternative Definition

Definition: For a given matrix A, and for a cardinality s, we define the RIP constant as the smallest possible $\delta_s$ satisfying the condition
    $(1 - \delta_s)\,\|x\|_2^2 \le \|A x\|_2^2 \le (1 + \delta_s)\,\|x\|_2^2$
for all $x \in \mathbb{R}^m$ such that $\|x\|_0 \le s$

The implication: A is "well-behaved" when operating on s-sparse vectors, acting nearly as an orthogonal matrix
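For small dictionaries, $\delta_s$ can be evaluated exactly by the brute-force sweep over supports that the definition prescribes. A sketch (mine; the random dictionary is only for illustration), using the fact that the extreme values of $\|A_S z\|_2^2 / \|z\|_2^2$ are the extreme eigenvalues of the Gram matrix $A_S^T A_S$:

```python
import numpy as np
from itertools import combinations

def rip_constant(A, s):
    """Brute-force the RIP constant delta_s of A (columns assumed
    unit-norm): the max over all s-column submatrices A_S of
    max(1 - lam_min(A_S^T A_S), lam_max(A_S^T A_S) - 1)."""
    m = A.shape[1]
    delta = 0.0
    for S in combinations(range(m), s):
        G = A[:, S].T @ A[:, S]            # s-by-s Gram matrix
        lam = np.linalg.eigvalsh(G)
        delta = max(delta, 1 - lam[0], lam[-1] - 1)
    return delta

# tiny demo on a random dictionary with normalized columns
rng = np.random.default_rng(1)
A = rng.standard_normal((8, 12))
A /= np.linalg.norm(A, axis=0)
print(rip_constant(A, 1))                        # ~0 (see Property 1 below)
print(rip_constant(A, 2) <= rip_constant(A, 3))  # True: monotone (Property 2)
```

The exponential number of supports is exactly why the RIP is considered incomputable for realistic dictionary sizes.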
Key Properties of the
Restricted Isometry Property
(RIP)
Property 1: δ_1 = 0

• The RIP is characterized by the value $\delta_s$:
    $\forall z \in \mathbb{R}^s:\ (1 - \delta_s)\,\|z\|_2^2 \le \|A_s z\|_2^2 \le (1 + \delta_s)\,\|z\|_2^2$
• What if s = 1? In this case z is a scalar, and $A_s = a_k$ is a single (normalized) column from A
• Thus
    $\|A_s z\|_2^2 = z\, a_k^T a_k\, z = z^2 \quad\Rightarrow\quad \delta_1 = 0$


Property 2: δ_s Is Monotonically Increasing

• The RIP is characterized by the value $\delta_s$:
    $\forall z \in \mathbb{R}^s:\ (1 - \delta_s)\,\|z\|_2^2 \le \|A_s z\|_2^2 \le (1 + \delta_s)\,\|z\|_2^2$
• How does $\delta_s$ behave as a function of s?
• Answer: $\delta_s \le \delta_{s+1}$ (why? every support of cardinality s is contained in one of cardinality s+1, so the set of constraints only grows)

[Figure: $\delta_s$ as a nondecreasing function of s.]
Property 3: δ_s ≥ 1?

• What happens when $\delta_s \ge 1$? The lower bound becomes vacuous:
    $(1 - \delta_s)\,\|z\|_2^2 \le 0 \le \|A_s z\|_2^2$
• On the other hand we must have $0 \le \|A_s z\|_2^2$, and $\delta_s = 1$ is attained exactly when $A_s z = 0$ for some $z \ne 0$, i.e., when the columns of $A_s$ are linearly dependent
• Conclusion: $\delta_s \ge 1$ exactly when $s \ge \mathrm{Spark}(A)$

[Figure: $\delta_s$ reaching the value 1 at $s = \mathrm{Spark}(A)$.]
Property 4: Relation to μ(A)

• Another way to write the RIP definition is
    $\forall z \in \mathbb{R}^s,\ z \ne 0:\quad 1 - \delta_s \le \frac{\|A_s z\|_2^2}{\|z\|_2^2} \le 1 + \delta_s$
• Recall the Rayleigh-Ritz characterization of eigenvalues:
    $\min_z / \max_z\ \frac{\|B z\|_2^2}{\|z\|_2^2} = \lambda_{\min/\max}\!\left(B^T B\right)$
• Therefore, $1 \mp \delta_s$ is the smallest/largest possible eigenvalue of $A_s^T A_s$, searched over all possible supports of cardinality s
• What can we say about these eigenvalues?
Property 4: Relation to μ(A)

• $A_s^T A_s$ is an s×s positive-definite matrix
• The main diagonal of $A_s^T A_s$ contains 1's
• The off-diagonal entries are in the range $[-\mu, +\mu]$
• Based on Gershgorin's disk theorem:
    $1 - (s-1)\mu \le \lambda\!\left(A_s^T A_s\right) \le 1 + (s-1)\mu$
  and therefore
    $\delta_s \le (s-1)\,\mu$
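The Gershgorin bound $\delta_s \le (s-1)\mu$ is easy to check numerically. A sketch (my code; dictionary and cardinality are arbitrary choices) that sweeps all s-column Gram submatrices and compares the worst eigenvalue deviation against the bound:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(2)
A = rng.standard_normal((10, 15))
A /= np.linalg.norm(A, axis=0)            # normalized atoms

G = A.T @ A
mu = np.max(np.abs(G - np.eye(15)))       # mutual coherence

s = 3
worst = 0.0                               # this sweep computes delta_s exactly
for S in combinations(range(15), s):
    lam = np.linalg.eigvalsh(G[np.ix_(S, S)])
    worst = max(worst, 1 - lam[0], lam[-1] - 1)

print(worst <= (s - 1) * mu)              # True: the Gershgorin bound holds
```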


Property 5: Another Relation to μ(A)

• What if s = 2? In this case $A_s$ contains two (normalized) columns from A:
    $A_s^T A_s = \begin{bmatrix} 1 & a_1^T a_2 \\ a_1^T a_2 & 1 \end{bmatrix}$
• There must be at least one pair of atoms that gives $|a_1^T a_2| = \mu$
• Thus, the eigenvalues of this matrix are EXACTLY $1 - \mu$ and $1 + \mu$
• Conclusion: $\delta_2 = \mu$
RIP – Summary

• The mutual coherence μ characterizes A by exploring the inter-relationships between atom pairs
• As such, μ is perfectly suited for cardinality s = 2. What happens when s > 2?
• Answer: you can either use bounds on the eigenvalues, such as $\delta_s \le (s-1)\mu$, … or use the RIP directly
• The RIP is a generalization of μ for arbitrary support sizes s, … BUT
• The RIP is impossible to compute in general, since we need to sweep through all m-choose-s options
Theoretical Study of (P0)
The Stability Problem

An v 
2
s-sparse
vector
x0  Multiply
by A
b xv
b0  A x 0

Clearly, x̂  x0 s
0 0 Min x 0
s.t.
x̂ x

b  Ax 2

Stability How far is x0 from x̂ ?

Michael Elad | The Computer-Science Department | The Technion


Analyzing the Stability of (P0)

 x0 is an s-sparse vector, and we create


b  A x  v w here v 2

 Solving the (P0) problem

(P0 ) ˆ
x  arg m in x 0
s.t. Ax  b 2

x

we get a solution satisfying:


A x0
A x̂  b
2
 and x̂ 0
s b

 Thus: A  x̂  x 0   2
A x̂

2

& ˆ
x  x0  2s
0

Michael Elad | The Computer-Science Department | The Technion


Analyzing the Stability of (P0)
2 2
d ˆ
x  x0 : Ad  4 & d  2s
2 0

 Invoking the RIP we get

1   2 s 
2 2 2
d 2
 Ad 2
 4

2
2 2 4
d  ˆ
x  x0 
2 2
1  2s

 Exploiting the relation to the coherence


2 2
2 4 4
x̂  x 0  
2
1  2s 1    2s  1 

Michael Elad | The Computer-Science Department | The Technion


(P0) Stability

Theorem: Given b=Ax0+v where x0 is s-sparse


and ||v||2, the solution of the (P0)
x̂  arg m in x 0
s.t. Ax  b 2

x

exhibits stability if s<0.5(1+1/):


2
2 4
x̂  x 0 
2
1    2s  1 

Conclusions:
 We have stability – the solution of (P0) is not far off
 The sparser x0 and the less coherent the dictionary is,
the stronger the stability

Michael Elad | The Computer-Science Department | The Technion
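The theorem can be exercised end-to-end on a small instance. A sketch (my setup: a random dictionary and s = 1, which satisfies the sparsity condition for any coherence below 1) that solves $(P_0^\epsilon)$ by brute force and checks the error against the bound:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(3)
n, m, s, eps = 40, 50, 1, 0.1

A = rng.standard_normal((n, m))
A /= np.linalg.norm(A, axis=0)             # normalized atoms
G = np.abs(A.T @ A); np.fill_diagonal(G, 0.0)
mu = G.max()                               # mutual coherence
assert s < 0.5 * (1 + 1 / mu)              # the theorem's sparsity condition

x0 = np.zeros(m)
x0[rng.choice(m, s, replace=False)] = 1.0  # an s-sparse unknown
v = rng.standard_normal(n); v *= eps / np.linalg.norm(v)
b = A @ x0 + v

# brute-force (P0^eps): sparsest x with ||Ax - b||_2 <= eps
xhat = None
for k in range(m + 1):
    for S in combinations(range(m), k):
        cols = list(S)
        x = np.zeros(m)
        if cols:
            x[cols], *_ = np.linalg.lstsq(A[:, cols], b, rcond=None)
        if np.linalg.norm(A @ x - b) <= eps:
            xhat = x
            break
    if xhat is not None:
        break

err2 = np.linalg.norm(xhat - x0) ** 2
bound = 4 * eps ** 2 / (1 - mu * (2 * s - 1))
print(err2 <= bound)   # the stability bound holds
```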


Remaining Difficulties

Question 1: Should we be happy with this result? This stability result suggests a magnification of the noise by a factor of 4 and beyond!
• This is an artifact of our noise being adversarial: v with $\|v\|_2 \le \epsilon$ can take any form to work against us
• Alternative: Assume random noise – e.g. $v \sim \mathcal{N}(0, \sigma^2 I)$ – and analyze accordingly (not shown here), giving:
  o The squared distance is $\sim \log(m)\, s\, \sigma^2$ (instead of $n\sigma^2$), and
  o The bound is posed in probabilistic terms

Question 2: Could a theoretical result be derived for the success in recovering the proper support of $x_0$?
Performance of Pursuit
Algorithms – General
Stability of Any Pursuit Algorithm

• $x_0$ is an $s_0$-sparse vector, and we create $b = A x_0 + v$, where $\|v\|_2 \le \epsilon$
• Using a pursuit method we get $\hat{x}$ satisfying:
    $\|A\hat{x} - b\|_2 \le \epsilon$  and  $\|\hat{x}\|_0 \le s_1$
• Using the same analysis as before:
    $\|\hat{x} - x_0\|_2^2 \le \frac{4\epsilon^2}{1 - \mu(s_0 + s_1 - 1)}, \qquad \text{valid for } s_0 + s_1 < 1 + \frac{1}{\mu}$
• Conclusions:
  o Any pursuit algorithm that can create a feasible and sparse solution is necessarily stable
  o However, this does not explain why such a pursuit will succeed in the first place
What is the Best Possible Performance?

• Suppose that x is s-sparse, and we create $b = A x + v = A_s x_s + v$, where $v \sim \mathcal{N}(0, \sigma^2 I)$
• The measured signal contains $n\sigma^2$ noise energy (in expectation)
• Question: What is the best performance that a pursuit algorithm could obtain?
• Answer: A perfect recovery of the original support seems to be the best to expect
• What would be the denoising effect?
• It turns out that answering this is rather easy, leading to what we shall refer to as the oracle performance
The Oracle Error: Signal

• If the support is known, we compute the non-zero coefficients by solving a LS problem:
    $\hat{z}_{opt} = \arg\min_z \|A_s z - b\|_2^2 = \left(A_s^T A_s\right)^{-1} A_s^T b$
• The error in the signal domain is given by
    $\epsilon_b^2 = E\,\|A_s \hat{z}_{opt} - A_s x_s\|_2^2 = E\,\left\|A_s \left(A_s^T A_s\right)^{-1} A_s^T b - A_s x_s\right\|_2^2$
• Plug in the relation $b = A_s x_s + v$:
    $\epsilon_b^2 = E\,\left\|A_s \left(A_s^T A_s\right)^{-1} A_s^T (A_s x_s + v) - A_s x_s\right\|_2^2 = E\,\left\|A_s \left(A_s^T A_s\right)^{-1} A_s^T v\right\|_2^2$

The Oracle Error: Signal

• We proceed with the analysis:
    $\epsilon_b^2 = E\,\left\|A_s \left(A_s^T A_s\right)^{-1} A_s^T v\right\|_2^2$
    $= E\left[v^T A_s \left(A_s^T A_s\right)^{-1} \underbrace{\left(A_s^T A_s\right)\left(A_s^T A_s\right)^{-1}}_{=I} A_s^T v\right]$
    $= E\left[v^T A_s \left(A_s^T A_s\right)^{-1} A_s^T v\right]$
    $= \mathrm{tr}\left(A_s^T\, \underbrace{E\!\left[v v^T\right]}_{=\sigma^2 I}\, A_s \left(A_s^T A_s\right)^{-1}\right) = \sigma^2\, \mathrm{tr}(I_s) = s\,\sigma^2$
• Implication: When the support is recovered perfectly, the noise is attenuated by a factor s/n
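The $s\sigma^2$ oracle error is easy to reproduce empirically. A sketch (my code; sizes and seed are arbitrary) that averages the signal-domain error of the oracle LS estimator over many noise realizations:

```python
import numpy as np

rng = np.random.default_rng(4)
n, m, s, sigma = 64, 128, 4, 0.1

A = rng.standard_normal((n, m))
A /= np.linalg.norm(A, axis=0)
S = rng.choice(m, s, replace=False)        # the "oracle" support
xs = rng.standard_normal(s)

trials, err = 2000, 0.0
for _ in range(trials):
    v = sigma * rng.standard_normal(n)
    b = A[:, S] @ xs + v
    z, *_ = np.linalg.lstsq(A[:, S], b, rcond=None)   # oracle LS fit
    err += np.linalg.norm(A[:, S] @ z - A[:, S] @ xs) ** 2
err /= trials

print(err, s * sigma ** 2)   # empirical error vs. the theoretical s*sigma^2
```

Compare with the full noise energy $n\sigma^2 = 0.64$: knowing the support shrinks the error by roughly the factor s/n.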


The Oracle Error: Representation

• We now turn to evaluate the error in the sparse vector itself, still relying on the estimation:
    $\hat{z}_{opt} = \arg\min_z \|A_s z - b\|_2^2 = \left(A_s^T A_s\right)^{-1} A_s^T b$
• The error in the representation domain is
    $\epsilon_x^2 = E\,\|\hat{z}_{opt} - x_s\|_2^2 = E\,\left\|\left(A_s^T A_s\right)^{-1} A_s^T b - x_s\right\|_2^2$
• Plug in the relation $b = A_s x_s + v$:
    $\epsilon_x^2 = E\,\left\|\left(A_s^T A_s\right)^{-1} A_s^T (A_s x_s + v) - x_s\right\|_2^2 = E\,\left\|\left(A_s^T A_s\right)^{-1} A_s^T v\right\|_2^2$

The Oracle Error: Representation

• We proceed with the analysis as before:
    $\epsilon_x^2 = E\,\left\|\left(A_s^T A_s\right)^{-1} A_s^T v\right\|_2^2$
    $= E\left[v^T A_s \left(A_s^T A_s\right)^{-1} \left(A_s^T A_s\right)^{-1} A_s^T v\right]$
    $= \mathrm{tr}\left(A_s^T\, \underbrace{E\!\left[v v^T\right]}_{=\sigma^2 I}\, A_s \left(A_s^T A_s\right)^{-2}\right) = \sigma^2\, \mathrm{tr}\left(\left(A_s^T A_s\right)^{-1}\right)$
• Recall: the eigenvalues of $A_s^T A_s$ are in the range $[1 - (s-1)\mu,\ 1 + (s-1)\mu]$, and thus its inverse has the reciprocals of these values:
    $\epsilon_x^2 \le \frac{s\,\sigma^2}{1 - (s-1)\,\mu}$
Summary

[Diagram: x is an s-sparse vector; $b = Ax + v$, where v is noise; solve $(P_0^\epsilon)$ or approximate it in order to recover x. If s is small enough, the solution obtained is stable.]

• A perfect recovery of the support leads to a strong denoising effect
• We proceed now to analyze the performance of several pursuit algorithms (BP, THR, OMP)
• We present the simplest (worst-case) results in order to keep the discussion pleasant
Basis-Pursuit Stability
Guarantee
Basis-Pursuit Stability (1)

• $x_0$ is an s-sparse vector, and we create $b = A x_0 + v$, where $\|v\|_2 \le \epsilon$
• We turn now to study the conditions for the Basis Pursuit (BP) to succeed in approximating the solution of $(P_0^\epsilon)$
• Basis Pursuit: solving this problem
    $(P_1^\epsilon):\ \hat{x} = \arg\min_x \|x\|_1 \ \text{s.t.}\ \|Ax - b\|_2 \le \epsilon$
• Given the solution, we have that
    $\|A\hat{x} - b\|_2 \le \epsilon$  and  $\|A x_0 - b\|_2 \le \epsilon$
    $\Rightarrow\quad \|A(\hat{x} - x_0)\|_2 \le 2\epsilon$


Basis-Pursuit Stability (2)

    $\|Ad\|_2 = \|A(\hat{x} - x_0)\|_2 \le 2\epsilon \qquad (d = \hat{x} - x_0)$

• The beginning of our analysis resembles the one performed for the stability of $(P_0^\epsilon)$
• However, we cannot use the RIP, since we cannot say that the BP solution is sparse
• So, how to proceed? By exploiting BP properties:
    $4\epsilon^2 \ge \|Ad\|_2^2 = d^T A^T A\, d = \|d\|_2^2 + d^T \left(A^T A - I\right) d$
    $\ge \|d\|_2^2 - \mu\, |d|^T \left(\mathbf{1}\mathbf{1}^T - I\right) |d|$

Basis-Pursuit Stability (3)

    $4\epsilon^2 \ge \|d\|_2^2 - \mu\, |d|^T \left(\mathbf{1}\mathbf{1}^T - I\right) |d| = (1 + \mu)\,\|d\|_2^2 - \mu\,\|d\|_1^2$

• We need to do something about the $\ell_1$ part above, and luckily for us, BP has something to say about this …
• Obviously, since BP minimizes the $\ell_1$-norm:
    $\|\hat{x}\|_1 \le \|x_0\|_1$
• We have seen before (in the BP analysis) how this can be turned into a constructive inequality

Basis-Pursuit Stability (4)

    $\|\hat{x}\|_1 \le \|x_0\|_1$

• Put differently (recall $d = \hat{x} - x_0$, with $x_0$ supported on the first s entries):
    $0 \ge \|d + x_0\|_1 - \|x_0\|_1 = \sum_{k \le s} \left(|d_k + x_k^0| - |x_k^0|\right) + \sum_{k > s} |d_k|$
    $\ge -\sum_{k \le s} |d_k| + \sum_{k > s} |d_k| = \|d\|_1 - 2\,\mathbf{1}_s^T |d|$
  using $|a + b| \ge |b| - |a|$. Therefore,
    $\|d\|_1 \le 2\,\mathbf{1}_s^T |d|$

Basis-Pursuit Stability (5)

    $4\epsilon^2 \ge (1 + \mu)\,\|d\|_2^2 - \mu\,\|d\|_1^2 \quad\text{and}\quad \|d\|_1 \le 2\,\mathbf{1}_s^T |d|$

• Therefore,
    $4\epsilon^2 \ge (1 + \mu)\,\|d\|_2^2 - 4\mu \left(\mathbf{1}_s^T |d|\right)^2$
• Using $\|u\|_1 \le \sqrt{s}\,\|u\|_2$ (for vectors of length s) on the last term:
    $4\epsilon^2 \ge (1 + \mu)\,\|d\|_2^2 - 4\mu s \sum_{k \le s} d_k^2 \ \ge\ \left(1 + \mu - 4\mu s\right)\|d\|_2^2$
• This coefficient must be positive, since we are dividing by it


(P1) Stability

Theorem: Given b=Ax0+v where x0 is


s-sparse and ||v||2, if
1 1
s  1  
4 

then the solution of the (P1)


x̂  arg m in x 1
s.t. Ax  b 2

x

exhibits stability, i.e.,


2
2 4
x̂  x 0 
2
1    4s  1

Michael Elad | The Computer-Science Department | The Technion


(P1) Stability: Comments

 Notice that for ε=0 we get a result for the


noiseless case … but it is weaker than the
one we have already seen
 As before, this is a worst-case analysis
 A far better bounds could be obtained
under a random noise assumption,
including a near-oracle denoising effect
that states that 2 2
x̂  x 0  c  l og m  s 
2

i.e., the error is only a log factor away from


the outcome that knows the support
Michael Elad | The Computer-Science Department | The Technion
Thresholding Stability
Guarantee: Worst-Case
Proof Strategy

• Almost the same as for the noiseless case
• We shall assume that the first s elements in $x_0$ are the non-zero ones, ordered in decreasing order of absolute value:
    $b = \sum_{i=1}^{s} x_i a_i + v, \qquad \|v\|_2 \le \epsilon, \qquad |x_1| \ge |x_2| \ge \cdots \ge |x_s| > 0$
THR: Terms of Success

• The THR algorithm succeeds if the inner products with the true atoms are dominant:
    $\min_{1 \le i \le s} |b^T a_i| > \max_{j > s} |b^T a_j|$
• Just as before, we will use the bounding idea:
    $\min_{1 \le i \le s} |b^T a_i| \ \ge\ \text{lower bound for the LHS} \ >\ \text{upper bound for the RHS} \ \ge\ \max_{j > s} |b^T a_j|$
• The main (and only) difference: now we have to take the noise into account


Upper-Bounding the RHS

    $\mathrm{RHS} = \max_{j > s} |b^T a_j| = \max_{j > s} \left| v^T a_j + \sum_{i=1}^{s} x_i\, a_i^T a_j \right|$
    $\le \max_{j > s} |v^T a_j| + \max_{j > s} \sum_{i=1}^{s} |x_i|\, |a_i^T a_j| \ \le\ \epsilon + |x_{\max}|\, s\, \mu$

using $b = \sum_{i=1}^{s} x_i a_i + v$, $\max_{i \ne j} |a_i^T a_j| \le \mu$, and the Cauchy-Schwarz inequality $|v^T a_j| \le \|v\|_2 \|a_j\|_2 \le \epsilon$.

Lower-Bounding the LHS

    $\mathrm{LHS} = \min_{1 \le i \le s} |b^T a_i| = \min_{1 \le i \le s} \left| v^T a_i + x_i + \sum_{t=1,\ t \ne i}^{s} x_t\, a_t^T a_i \right|$
    $\ge \min_{1 \le i \le s} |x_i| - \max_{1 \le i \le s} \sum_{t=1,\ t \ne i}^{s} |x_t|\, |a_t^T a_i| - \epsilon$

using $|a + b| \ge |a| - |b|$.

Lower-Bounding the LHS

Therefore,
    $\mathrm{LHS} \ \ge\ |x_{\min}| - |x_{\max}|\,(s - 1)\,\mu(A) - \epsilon$
where $|x_{\min}| = |x_s|$, $|x_{\max}| = |x_1|$, and $\max_{i \ne j} |a_i^T a_j| \le \mu(A)$.


Gathering the Bounds …

Requiring the lower bound on the LHS to exceed the upper bound on the RHS:
    $|x_{\min}| - |x_{\max}|\,(s - 1)\,\mu - \epsilon \ >\ |x_{\max}|\, s\, \mu + \epsilon$
    $\Rightarrow\quad \frac{|x_{\min}|}{|x_{\max}|} > (2s - 1)\,\mu + \frac{2\epsilon}{|x_{\max}|}$
    $\Rightarrow\quad s < \frac{1}{2}\left(\frac{|x_{\min}|}{|x_{\max}|} \cdot \frac{1}{\mu} + 1\right) - \frac{\epsilon}{\mu\, |x_{\max}|}$


THR Success

We are given A and b defining the problem $(P_0^\epsilon)$,
    $(P_0^\epsilon):\ \min_x \|x\|_0 \ \text{s.t.}\ \|Ax - b\|_2 \le \epsilon$,
and we deploy the THR algorithm for its solution.

Theorem: Given the above $(P_0^\epsilon)$, if the unknown to be found satisfies
    $\|x_0\|_0 = s < \frac{1}{2}\left(\frac{|x_{\min}|}{|x_{\max}|} \cdot \frac{1}{\mu(A)} + 1\right) - \frac{\epsilon}{\mu(A)\,|x_{\max}|}$,
then THR is guaranteed to find the exact support of $x_0$.
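The THR algorithm itself is only two lines of numpy. A sketch (my code; the demo deliberately uses an easy orthonormal dictionary, my choice, where success is guaranteed):

```python
import numpy as np

def thresholding(A, b, s):
    """THR: pick the s atoms with the largest |<b, a_i>|, then
    LS-fit the coefficients on that support."""
    S = np.argsort(-np.abs(A.T @ b))[:s]
    x = np.zeros(A.shape[1])
    x[S], *_ = np.linalg.lstsq(A[:, S], b, rcond=None)
    return x, np.sort(S)

rng = np.random.default_rng(5)
Q, _ = np.linalg.qr(rng.standard_normal((50, 50)))   # orthonormal dictionary
S_true = np.array([3, 17, 41])
b = Q[:, S_true] @ np.array([1.0, -1.0, 1.0]) + 0.01 * rng.standard_normal(50)

x, S = thresholding(Q, b, 3)
print(np.array_equal(S, S_true))   # True in this easy orthonormal case
```

With a coherent dictionary, strong contrast between the non-zeros, or heavier noise, the support recovery above can fail, exactly as the severe condition in the theorem suggests.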


THR: The Good and the Bad

• The main achievement here is the ability of the THR to lead to oracle performance if $x_0$ is sparse enough:
    $\|x_0\|_0 = s < \frac{1}{2}\left(\frac{|x_{\min}|}{|x_{\max}|} \cdot \frac{1}{\mu(A)} + 1\right) - \frac{\epsilon}{\mu(A)\,|x_{\max}|}$
• On the down side, this condition is quite severe, as
  (i) it depends on the contrast within the non-zeros, and
  (ii) it is influenced badly by the signal-to-noise ratio
OMP Stability Guarantee
OMP: First Step Success

• The first step of the OMP succeeds if the inner product of b with $a_1$ is bigger (in absolute value) than the inner products with all the off-support columns of A:
    $|b^T a_1| > \max_{j > s} |b^T a_j|$
• We shall proceed by expanding these two expressions, lower-bounding the left and upper-bounding the right, this way deriving a condition for this inequality to be satisfied:
    $|b^T a_1| \ \ge\ \text{lower bound for the LHS} \ >\ \text{upper bound for the RHS} \ \ge\ \max_{j > s} |b^T a_j|$


Upper-Bounding the RHS

    $\mathrm{RHS} = \max_{j > s} |b^T a_j| = \max_{j > s} \left| v^T a_j + \sum_{i=1}^{s} x_i\, a_i^T a_j \right|$
    $\le \epsilon + \max_{j > s} \sum_{i=1}^{s} |x_i|\, |a_i^T a_j| \ \le\ \epsilon + |x_{\max}|\, s\, \mu$

using $b = \sum_{i=1}^{s} x_i a_i + v$ and $\max_{i \ne j} |a_i^T a_j| \le \mu$.

Lower-Bounding the LHS

    $\mathrm{LHS} = |b^T a_1| = \left| v^T a_1 + x_1 + \sum_{i=2}^{s} x_i\, a_i^T a_1 \right|$
    $\ge |x_{\max}| - \sum_{i=2}^{s} |x_i|\, |a_i^T a_1| - \epsilon \ \ge\ |x_{\max}|\left(1 - (s - 1)\,\mu\right) - \epsilon$

using $|x_1| = |x_{\max}| \ge |x_i|$ for $i \ge 2$, and $\max_{i \ne j} |a_i^T a_j| \le \mu(A)$.

Gathering the Bounds …

Requiring the lower bound on the LHS to exceed the upper bound on the RHS:
    $|x_{\max}|\left(1 - (s - 1)\,\mu\right) - \epsilon \ >\ |x_{\max}|\, s\, \mu + \epsilon$
    $\Rightarrow\quad 1 - (2s - 1)\,\mu > \frac{2\epsilon}{|x_{\max}|}$
    $\Rightarrow\quad s < \frac{1}{2}\left(1 + \frac{1}{\mu}\right) - \frac{\epsilon}{\mu\,|x_{\max}|}$
Moving to the Next Step

• Conclusion so far: if $x_0$ is sparse enough,
    $\|x_0\|_0 = s < \frac{1}{2}\left(1 + \frac{1}{\mu(A)}\right) - \frac{\epsilon}{\mu(A)\,|x_{\max}|}$,
  then the first step of the OMP is successful, finding an atom $i_0$ from within the support
• The next OMP steps:
  o Update the solution to get $x^1$
  o Update the residual by $r^1 = b - A x^1 = b - c_1 a_{i_0}$


Moving to the Next Step

Observe that:
• The updated residual is a linear combination of the same s atoms (as in b):
    $r^1 = b - c_1 a_{i_0} = \sum_{i=1}^{s} \tilde{x}_i a_i + v$
• Therefore, the same condition as before will guarantee the success of the next round, … but …
• This would be true for a new value of $|x_{\max}|$ that corresponds to the new coefficients $\tilde{x}_i$ – what do we do about it?
Moving to the Next Steps

Claim: If the updated residual at the k-th stage is given by
    $r^k = \sum_{i=1}^{s} \tilde{x}_i a_i + v$,
we have that $|\tilde{x}_{\max}| = \max_{1 \le i \le s} |\tilde{x}_i| \ge |x_{\min}|$.

Proof:
• In $r^k$, only the coefficients of the chosen atoms change, while the rest remain with their original values
• Thus, the maximal non-zero (in absolute value) cannot be smaller than $|x_{\min}|$, as claimed
Worsening the Condition

• If $x_0$ is sparse enough,
    $\|x_0\|_0 = s < \frac{1}{2}\left(1 + \frac{1}{\mu(A)}\right) - \frac{\epsilon}{\mu(A)\,|x_{\max}|}$,
  then the first step of the OMP is successful
• Obviously, this implies that the same remains true with the following worse (i.e., more demanding) condition:
    $\|x_0\|_0 = s < \frac{1}{2}\left(1 + \frac{1}{\mu(A)}\right) - \frac{\epsilon}{\mu(A)\,|x_{\min}|}$
• The idea: with this change we guarantee the success of every OMP step
Worsening the Condition

• We just saw that if
    $s < \frac{1}{2}\left(1 + \frac{1}{\mu(A)}\right) - \frac{\epsilon}{\mu(A)\,|x_{\min}|}$,
  then every step of the OMP is successful
• This implies that we will find the exact and complete support of $x_0$ in exactly s steps
• What happens after s steps? As we solve the least-squares over this support, the residual energy must be ε and below (why? because $x_0$ itself is a candidate on this support, leaving only the noise as residual)
• Thus, the OMP stops at this stage
OMP Stability

We are given A and b defining the problem $(P_0^\epsilon)$,
    $(P_0^\epsilon):\ \min_x \|x\|_0 \ \text{s.t.}\ \|Ax - b\|_2 \le \epsilon$,
and we deploy OMP for its solution.

Theorem: Given the $(P_0^\epsilon)$ problem, if the unknown to be found is sparse enough,
    $\|x_0\|_0 = s < \frac{1}{2}\left(1 + \frac{1}{\mu(A)}\right) - \frac{\epsilon}{\mu(A)\,|x_{\min}|}$,
then OMP is guaranteed to find its exact support. Furthermore, OMP finds this solution in exactly s steps.
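For completeness, a compact OMP sketch (my implementation; sizes, seed, and stopping threshold are arbitrary choices) that stops once the residual reaches the noise level ε:

```python
import numpy as np

def omp(A, b, eps):
    """Orthogonal Matching Pursuit: repeatedly pick the atom most
    correlated with the residual, re-fit all chosen coefficients by
    least squares, and stop once ||r||_2 <= eps."""
    m = A.shape[1]
    S, x = [], np.zeros(m)
    r = b.copy()
    while np.linalg.norm(r) > eps and len(S) < m:
        i = int(np.argmax(np.abs(A.T @ r)))
        if i in S:                         # safeguard: no progress possible
            break
        S.append(i)
        x = np.zeros(m)
        x[S], *_ = np.linalg.lstsq(A[:, S], b, rcond=None)
        r = b - A @ x
    return x, sorted(S)

rng = np.random.default_rng(6)
A = rng.standard_normal((64, 100)); A /= np.linalg.norm(A, axis=0)
x0 = np.zeros(100); x0[[10, 40, 70]] = [2.0, -2.0, 2.0]
v = rng.standard_normal(64); v *= 0.15 / np.linalg.norm(v)
b = A @ x0 + v

x, S = omp(A, b, eps=0.2)
print(S)   # typically the true support [10, 40, 70]
```

Note that, unlike plain MP, the least-squares re-fit makes the residual orthogonal to all chosen atoms, so no atom is ever selected twice.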
Algorithms’ Performance: Summary

• BP:  stability if $s < \frac{1}{4}\left(1 + \frac{1}{\mu}\right)$;  error bound $\frac{4\epsilon^2}{1 - \mu(4s - 1)}$
• OMP: stability if $s < \frac{1}{2}\left(1 + \frac{1}{\mu(A)}\right) - \frac{\epsilon}{\mu(A)\,|x_{\min}|}$;  error bound $\frac{4\epsilon^2}{1 - \mu(s - 1)}$
• THR: stability if $s < \frac{1}{2}\left(\frac{|x_{\min}|}{|x_{\max}|} \cdot \frac{1}{\mu(A)} + 1\right) - \frac{\epsilon}{\mu(A)\,|x_{\max}|}$;  error bound $\frac{4\epsilon^2}{1 - \mu(s - 1)}$

• So, which is better?
• The general experience: THR < OMP ≈ BP
Rate of Decay of the
Residual in Greedy Methods
Recall the Matching Pursuit Algorithm

Initialization: $k = 0$, $x^0 = 0$, $r^0 = b - A x^0 = b$, $S^0 = \emptyset$

Main iteration ($k \leftarrow k + 1$):
1. Compute $p(i) = |a_i^T r^{k-1}|$ for $1 \le i \le m$
2. Choose $i_0$ s.t. $\forall\, 1 \le i \le m:\ p(i_0) \ge p(i)$
3. Update the support: $S^k = S^{k-1} \cup \{i_0\}$
4. Update the solution: $x^k = x^{k-1}$, and then $x^k(i_0) = x^k(i_0) + a_{i_0}^T r^{k-1}$
5. Update the residual: $r^k = b - A x^k = r^{k-1} - \left(a_{i_0}^T r^{k-1}\right) a_{i_0}$
Stop when $\|r^k\|_2 \le \epsilon$.

• The question we address: how fast can we expect this residual to decay?
• While different from the rest of the analysis shown so far, this result is quite interesting and elegant
The Residual Recursion

    $r^k = r^{k-1} - \left(a_{i_0}^T r^{k-1}\right) a_{i_0}$
    $\|r^k\|_2^2 = \left\| r^{k-1} - \left(a_{i_0}^T r^{k-1}\right) a_{i_0} \right\|_2^2$
    $= \|r^{k-1}\|_2^2 - 2\left(a_{i_0}^T r^{k-1}\right)^2 + \left(a_{i_0}^T r^{k-1}\right)^2 \|a_{i_0}\|_2^2$
    $= \|r^{k-1}\|_2^2 - \left(a_{i_0}^T r^{k-1}\right)^2$
    $= \|r^{k-1}\|_2^2 - \max_{1 \le i \le m} \left(a_i^T r^{k-1}\right)^2$
    $= \|r^{k-1}\|_2^2 - \left\|A^T r^{k-1}\right\|_\infty^2$

The Residual Recursion

Definition: Given a matrix A, we define its minimal magnification factor $\delta(A)$ by
    $\delta(A) = \min_{v \ne 0} \frac{\|A^T v\|_\infty^2}{\|v\|_2^2}$

• Could $\delta(A) = 0$? Since A is full-rank and $m \ge n$, this is impossible, because it would imply $A^T v = 0$ for some $v \ne 0$
• What is $\delta(I)$? This is a well-known result in ratios of norms, obtained for $v^T = [1, 1, \ldots, 1]$:
    $\delta(I) = \min_{v \ne 0} \frac{\|v\|_\infty^2}{\|v\|_2^2} = \frac{1}{n}$

The Residual Recursion

We have defined $\delta(A) = \min_{v \ne 0} \frac{\|A^T v\|_\infty^2}{\|v\|_2^2}$. Let's use this in order to analyze the rate of decay of the MP residual:

    $\|r^k\|_2^2 = \|r^{k-1}\|_2^2 - \left\|A^T r^{k-1}\right\|_\infty^2 = \|r^{k-1}\|_2^2 \left(1 - \frac{\|A^T r^{k-1}\|_\infty^2}{\|r^{k-1}\|_2^2}\right)$
    $\le \left(1 - \delta(A)\right) \|r^{k-1}\|_2^2 \ \le\ \cdots \ \le\ \left(1 - \delta(A)\right)^k \|r^0\|_2^2 = \left(1 - \delta(A)\right)^k \|b\|_2^2$


The Residual Recursion

Theorem: The worst-case decay of the MP residual is exponential, with a rate given by
    $\|r^k\|_2^2 \le \left(1 - \delta(A)\right)^k \|b\|_2^2$

Implications:
• Clearly this rule applies to OMP and LS-OMP as well, as they are more aggressive in reducing the residual's energy
• This result provides another justification for the hope of getting a sparse solution from MP and other greedy methods
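The decay bound can be observed numerically. Since the exact $\delta(A)$ requires solving a min-max problem, the sketch below (mine) uses the crude computable lower bound $\delta(A) \ge \lambda_{\min}(A A^T)/m$, which follows from $\|u\|_\infty^2 \ge \|u\|_2^2 / m$; the resulting (weaker) exponential bound must still hold:

```python
import numpy as np

def mp_residual_norms(A, b, steps):
    """Plain Matching Pursuit: run `steps` greedy iterations and
    return the residual norms ||r^0||, ..., ||r^steps||."""
    r = b.copy()
    norms = [np.linalg.norm(r)]
    for _ in range(steps):
        c = A.T @ r
        i = int(np.argmax(np.abs(c)))
        r = r - c[i] * A[:, i]          # atoms are unit-norm
        norms.append(np.linalg.norm(r))
    return np.array(norms)

rng = np.random.default_rng(7)
A = rng.standard_normal((20, 40)); A /= np.linalg.norm(A, axis=0)
b = rng.standard_normal(20)

# delta(A) >= lambda_min(A A^T) / m, since ||u||_inf^2 >= ||u||_2^2 / m
lb = np.linalg.eigvalsh(A @ A.T)[0] / A.shape[1]
norms = mp_residual_norms(A, b, 30)
ok = all(norms[k] ** 2 <= (1 - lb) ** k * norms[0] ** 2 + 1e-9
         for k in range(len(norms)))
print(ok)   # True: the exponential-decay bound holds at every step
```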
