© 2004 Kluwer Academic Publishers. Manufactured in The Netherlands.
ANTONIN CHAMBOLLE
CEREMADE–CNRS UMR 7534, Université de Paris-Dauphine, 75775 Paris Cedex 16, France
antonin@ceremade.dauphine.fr
Abstract. We propose an algorithm for minimizing the total variation of an image, and provide a proof of
convergence. We show applications to image denoising, zooming, and the computation of the mean curvature
motion of interfaces.
Keywords: total variation, image reconstruction, denoising, zooming, mean curvature motion
2. Notations and Preliminary Remarks

Let us fix our main notations. To simplify, our images will be 2-dimensional matrices of size $N \times N$ (adaptation to other cases or higher dimension is not difficult). We denote by $X$ the Euclidean space $\mathbb{R}^{N \times N}$. To define the discrete total variation, we introduce a discrete (linear) gradient operator. If $u \in X$, the gradient $\nabla u$ is a vector in $Y = X \times X$, and the discrete total variation is

$$J(u) = \sum_{1 \le i,j \le N} |(\nabla u)_{i,j}| \tag{1}$$

with $|y| := \sqrt{y_1^2 + y_2^2}$ for every $y = (y_1, y_2) \in \mathbb{R}^2$.

Let us observe here that this functional $J$ is a discretization of the standard total variation, defined in the continuous setting for a function $u \in L^1(\Omega)$ ($\Omega$ an open subset of $\mathbb{R}^2$) by

$$J(u) = \sup\left\{ \int_\Omega u(x)\,\operatorname{div}\xi(x)\,dx : \xi \in C_c^1(\Omega; \mathbb{R}^2),\ |\xi(x)| \le 1\ \forall x \in \Omega \right\} \tag{2}$$
(see for instance [12]). It is well known that $J$, defined by (2), is finite if and only if the distributional derivative $Du$ of $u$ is a finite Radon measure in $\Omega$, in which case we have $J(u) = |Du|(\Omega)$. If $u$ has a gradient $\nabla u \in L^1(\Omega; \mathbb{R}^2)$, then $J(u) = \int_\Omega |\nabla u(x)|\,dx$.

We will work mostly, in this note, in the discrete setting. Let us however make the observation that if some step-size (or pixel size) $h \sim 1/N$ is introduced in the discrete definition of $J$ (defining a new functional $J_h$ equal to $h$ times the expression in (1)), one can show that as $h \to 0$ (and the number of pixels $N$ goes to infinity), $J_h$ "$\Gamma$-converges" (see for instance [1]) to the continuous $J$ (defined by (2) on $\Omega = (0,1) \times (0,1)$). This means that the minimizers of the problems we are going to consider approximate correctly, if the pixel size is very small, the minimizers of similar problems defined in the continuous setting with the functional (2).

Since $J$ is one-homogeneous (that is, $J(\lambda u) = \lambda J(u)$ for every $u$ and $\lambda > 0$), it is a standard fact in convex analysis (we refer to [11] for a quite complete introduction to convex analysis, and to [14] for a monograph on convex optimization problems) that the Legendre–Fenchel transform $J^*(v) = \sup_u \{\langle u, v\rangle_X - J(u)\}$ of $J$ is the characteristic function of a closed convex set $K$:

$$J^*(v) = \chi_K(v) = \begin{cases} 0 & \text{if } v \in K, \\ +\infty & \text{otherwise.} \end{cases} \tag{3}$$

We endow $Y$ with the scalar product $\langle p, q\rangle_Y = \sum_{1 \le i,j \le N} \big( p^1_{i,j} q^1_{i,j} + p^2_{i,j} q^2_{i,j} \big)$ for every $p = (p^1, p^2)$, $q = (q^1, q^2) \in Y$. Then, for every $u$,

$$J(u) = \sup_{p} \langle p, \nabla u\rangle_Y \tag{5}$$

where the sup is taken on all $p \in Y$ such that $|p_{i,j}| \le 1$ for every $i, j$. We introduce a discrete divergence $\operatorname{div}: Y \to X$ defined, by analogy with the continuous setting, by $\operatorname{div} = -\nabla^*$ ($\nabla^*$ being the adjoint of $\nabla$). That is, for every $p \in Y$ and $u \in X$, $\langle -\operatorname{div} p, u\rangle_X = \langle p, \nabla u\rangle_Y$. One checks easily that $\operatorname{div}$ is given by

$$(\operatorname{div} p)_{i,j} = \begin{cases} p^1_{i,j} - p^1_{i-1,j} & \text{if } 1 < i < N \\ p^1_{i,j} & \text{if } i = 1 \\ -p^1_{i-1,j} & \text{if } i = N \end{cases} \;+\; \begin{cases} p^2_{i,j} - p^2_{i,j-1} & \text{if } 1 < j < N \\ p^2_{i,j} & \text{if } j = 1 \\ -p^2_{i,j-1} & \text{if } j = N \end{cases}$$

for every $p = (p^1, p^2) \in Y$. From (5) and the definition of the operator $\operatorname{div}$, one immediately deduces (4), with $K$ given by

$$K = \{\operatorname{div} p : p \in Y,\ |p_{i,j}| \le 1\ \forall\, i, j\}.$$
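The discrete operators just introduced are straightforward to implement. The following numpy sketch is mine, not the paper's: the function names are my own, and the forward-difference definition of $\nabla$ (whose componentwise formula fell outside this excerpt) is the one forced by the divergence formula above through $\operatorname{div} = -\nabla^*$.

```python
import numpy as np

def grad(u):
    """Discrete gradient X -> Y = X x X: forward differences, zero at the far boundary."""
    p = np.zeros((2,) + u.shape)
    p[0, :-1, :] = np.diff(u, axis=0)  # (grad u)^1_{i,j} = u_{i+1,j} - u_{i,j} (0 if i = N)
    p[1, :, :-1] = np.diff(u, axis=1)  # (grad u)^2_{i,j} = u_{i,j+1} - u_{i,j} (0 if j = N)
    return p

def div(p):
    """Discrete divergence div = -grad^*, following the three-case formula in the text."""
    d1 = np.concatenate([p[0, :1], np.diff(p[0, :-1], axis=0), -p[0, -2:-1]], axis=0)
    d2 = np.concatenate([p[1, :, :1], np.diff(p[1, :, :-1], axis=1), -p[1, :, -2:-1]], axis=1)
    return d1 + d2
```

One can verify numerically that $\langle \operatorname{div} p, u\rangle_X = -\langle p, \nabla u\rangle_Y$ holds to machine precision for random $u$ and $p$, so the pair really is an adjoint pair.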
we get that $w = (g - u)/\lambda$ is the minimizer of

$$\frac{\|w - (g/\lambda)\|^2}{2} + \frac{1}{\lambda} J^*(w).$$

Since $J^*$ is given by (3), we deduce $w = \pi_K(g/\lambda)$. Hence the solution $u$ of problem (6) is simply given by $u = g - \pi_{\lambda K}(g)$. Computing this nonlinear projection $\pi_{\lambda K}(g)$ amounts to solving

$$\min\big\{\|\lambda \operatorname{div} p - g\|^2 : p \in Y,\ |p_{i,j}|^2 - 1 \le 0\ \ \forall\, i, j = 1, \dots, N\big\}. \tag{8}$$

The Karush–Kuhn–Tucker conditions (cf. [14, Vol. I, Theorem 2.1.4] or [7, Theorem 9.2-4]) yield the existence of a Lagrange multiplier $\alpha_{i,j} \ge 0$, associated to each constraint in problem (8), such that we have for each $i, j$

$$-(\nabla(\lambda \operatorname{div} p - g))_{i,j} + \alpha_{i,j}\, p_{i,j} = 0,$$

with either $\alpha_{i,j} > 0$ and $|p_{i,j}| = 1$, or $\alpha_{i,j} = 0$ and $|p_{i,j}| \le 1$. We now can show the following result.

**Theorem 3.1.** Let $\tau \le 1/8$. Then, $\lambda \operatorname{div} p^n$ converges to $\pi_{\lambda K}(g)$ as $n \to \infty$.

**Proof:** By induction we easily see that for every $n \ge 0$, $|p^n_{i,j}| \le 1$ for all $i, j$. Let us fix $n \ge 0$ and let $\eta = (p^{n+1} - p^n)/\tau$. Since $\eta_{i,j}$ is of the form $(\nabla(\operatorname{div} p^n - g/\lambda))_{i,j} - \rho_{i,j}$ (with $\rho_{i,j} = |(\nabla(\operatorname{div} p^n - g/\lambda))_{i,j}|\, p^{n+1}_{i,j}$), we have for every $i, j$

$$2\eta_{i,j} \cdot (\nabla(\operatorname{div} p^n - g/\lambda))_{i,j} - \kappa^2 \tau |\eta_{i,j}|^2 = (1 - \kappa^2 \tau)|\eta_{i,j}|^2 + \big(|(\nabla(\operatorname{div} p^n - g/\lambda))_{i,j}|^2 - |\rho_{i,j}|^2\big).$$

At a limit point $\bar p$ of the sequence $(p^n)$, one obtains

$$-(\nabla(\lambda \operatorname{div} \bar p - g))_{i,j} + |(\nabla(\lambda \operatorname{div} \bar p - g))_{i,j}|\, \bar p_{i,j} = 0,$$
which is the Euler equation for a solution of (8). One can deduce that $\bar p$ solves (8) (see for instance [7, Theorem 9.2-4]) and that $\lambda \operatorname{div} \bar p$ is the projection $\pi_{\lambda K}(g)$. Since this projection is unique, we deduce that the whole sequence $\lambda \operatorname{div} p^n$ converges to $\pi_{\lambda K}(g)$. The theorem is proved if we can show that $\kappa^2 \le 8$.

By definition, $\kappa = \sup_{\|p\|_Y \le 1} \|\operatorname{div} p\|$. Now (adopting the convention that $p_{0,j} = p_{N,j} = p_{i,0} = p_{i,N} = 0$ for every $i, j$),

$$\|\operatorname{div} p\|^2 = \sum_{1 \le i,j \le N} \big|p^1_{i,j} - p^1_{i-1,j} + p^2_{i,j} - p^2_{i,j-1}\big|^2 \le 4 \sum_{1 \le i,j \le N} \Big( \big|p^1_{i,j}\big|^2 + \big|p^1_{i-1,j}\big|^2 + \big|p^2_{i,j}\big|^2 + \big|p^2_{i,j-1}\big|^2 \Big) \le 8 \|p\|_Y^2.$$

Hence $\kappa^2 \le 8$. □

*Remark.* Choosing $p^1_{i,j} = p^2_{i,j} = (-1)^{i+j}$ shows that $\kappa^2 \ge 8 - O(1/N)$.

*Remark.* In practice, it appears that the optimal constant for the stability and convergence of the algorithm is not $1/8$ but $1/4$. We do not know the reason for this. If $\tau < 1/4$, then it is easy to check that both applications $p^n \to \tilde p^n$ and $\tilde p^n \to p^{n+1}$, defined respectively by

$$\tilde p^n_{i,j} = p^n_{i,j} + \tau (\nabla(\operatorname{div} p^n - g/\lambda))_{i,j} \quad\text{and}\quad p^{n+1}_{i,j} = \frac{\tilde p^n_{i,j}}{1 + \tau |(\nabla(\operatorname{div} p^n - g/\lambda))_{i,j}|},$$

are contractions, but each in a different norm (the first one for the semi-norm $\|\operatorname{div} p\|$, the second one for the norm $\sup_{i,j} |p_{i,j}|$).

*Remark.* To our knowledge there exist two other important contributions addressing the same issue, that is, the minimization of total variation through a dual approach. One is the paper of Chan, Golub and Mulet [6], the other is the thesis of Carter [3]. In both works, the proposed algorithms are quite different. They share the advantage that they are supposed to work also for "deconvolution" problems, that is, when instead of (6), the problem to solve is

$$\min_{u \in X} \frac{\|Au - g\|^2}{2\lambda} + J(u), \tag{10}$$

with $A$ a linear operator (corresponding in general to a low-pass filtering, that is, a blurring of the image). It is not clear how to adapt our approach to this case, and it is the subject of future studies. We show in Section 5 how to treat the particular case where $A$ is an orthogonal projection (zooming). On the other hand, the advantage of our approach is the existence of the convergence Theorem 3.1, which ensures its efficiency and stability. It also provides a framework for understanding the behavior of the algorithms proposed in [6] and [3], at least in the case $A = \mathrm{Id}$.

4. Image Denoising

The idea of minimizing total variation for image denoising, suggested in [17], assumes that the observed image $g = (g_{i,j})_{1 \le i,j \le N}$ is the addition of an a priori piecewise smooth (or with little oscillation) image $u = (u_{i,j})_{1 \le i,j \le N}$ and a random Gaussian noise of estimated variance $\sigma^2$. It is hence suggested to recover the original image $u$ by trying to solve the problem

$$\min\{J(u) : \|u - g\|^2 = N^2 \sigma^2\} \tag{11}$$

($N^2$ being the total number of pixels). It can be shown (see for instance [5]) that there exists (both in the continuous and discrete settings, in fact) a Lagrange multiplier $\lambda > 0$ such that, provided $\|g - \bar g\|^2 \ge N^2 \sigma^2$ (with $\bar g$ the average value of the pixels $g_{i,j}$), this problem has a unique solution that is given by the equivalent problem (6). We have just shown how to numerically solve problem (6); however, since $\sigma$ is in general less difficult to estimate than $\lambda$, we propose another algorithm that tackles directly the resolution of (11). The task is to find $\lambda > 0$ such that $\|\pi_{\lambda K} g\|^2 = N^2 \sigma^2$. For $s > 0$, let us set $f(s) = \|\pi_{s K} g\|$. The following lemma states the main properties of $f$.

**Lemma 4.1.** The function $f(s)$ maps $[0, +\infty)$ onto $[0, \|g - \bar g\|]$. It is non-decreasing, while the function $s \mapsto f(s)/s$ is non-increasing. Moreover, $f \in W^{1,\infty}([0, +\infty))$ and satisfies, for a.e. $s \ge 0$,

$$0 \le f'(s) \le \frac{f(s)}{s} \le 2\sqrt{2}\, N.$$

**Proof:** Fix $s, s'$ and let $v = \pi_{sK}\, g$, $v' = \pi_{s'K}\, g$. By definition of the projection, we have $\langle g - v, w - v\rangle \le 0$ for every $w \in sK$.
An Algorithm for Total Variation Minimization and Applications
f (s)
0 ≤ f (s) ≤ ≤c
s
for a.e. s ≥ 0. Eventually, we can easily show that any
u ∈ X with u = 0 can be written div p for some p ∈
Y , so that there exists s ∗ ≥ 0 such that g − g ∈ s ∗ K ,
hence f (s) = g − g for every s ≥ s ∗ . This ends
the proof of the lemma.
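The pieces above assemble into a complete denoising routine: the fixed-point iteration of Section 3 computes $\pi_{\lambda K}(g)$, and the monotonicity of $f$ in Lemma 4.1 justifies a one-dimensional search for the $\lambda$ realizing $\|\pi_{\lambda K} g\| = N\sigma$. The numpy sketch below is mine, not the paper's pseudocode; in particular, the paper's own update rule for $\lambda$ is not reproduced in this excerpt, so a plain bisection on $f$ is substituted.

```python
import numpy as np

def grad(u):
    # forward differences, zero at the far boundary
    p = np.zeros((2,) + u.shape)
    p[0, :-1, :] = np.diff(u, axis=0)
    p[1, :, :-1] = np.diff(u, axis=1)
    return p

def div(p):
    # discrete divergence div = -grad^* (three-case formula of Section 2)
    d1 = np.concatenate([p[0, :1], np.diff(p[0, :-1], axis=0), -p[0, -2:-1]], axis=0)
    d2 = np.concatenate([p[1, :, :1], np.diff(p[1, :, :-1], axis=1), -p[1, :, -2:-1]], axis=1)
    return d1 + d2

def project_tv(g, lam, tau=0.25, n_iter=200):
    """pi_{lam K}(g) via the fixed point p <- (p + tau q) / (1 + tau |q|),
    q = grad(div p - g/lam); Theorem 3.1 gives convergence for tau <= 1/8,
    and tau = 1/4 is the practical constant observed in the remarks."""
    p = np.zeros((2,) + g.shape)
    for _ in range(n_iter):
        q = grad(div(p) - g / lam)
        p = (p + tau * q) / (1.0 + tau * np.hypot(q[0], q[1]))
    return lam * div(p)

def denoise(g, sigma, lam_lo=1e-3, lam_hi=1e3, n_bisect=40):
    """Solve (11): find lam with f(lam) = ||pi_{lam K} g|| = N sigma (f is
    non-decreasing by Lemma 4.1), then return u = g - pi_{lam K}(g)."""
    target = g.shape[0] * sigma            # N sigma = ||u - g|| at the solution
    for _ in range(n_bisect):
        lam = np.sqrt(lam_lo * lam_hi)     # geometric bisection
        if np.linalg.norm(project_tv(g, lam)) < target:
            lam_lo = lam                   # projection too small: increase lam
        else:
            lam_hi = lam
    return g - project_tv(g, lam)
```

On a noisy image this returns the minimizer of (6) for the selected $\lambda$, with $\|u - g\|$ matching $N\sigma$ up to the accuracy of the inner iteration.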
deviation respectively 12 and 25 has been added. The original is a 256 × 256 square image with values ranging from 0 to 255. The CPU time for computing the reconstructed images is in both cases approximately 1.9 seconds, on a 900 MHz Pentium III processor with 2 Mb of cache. The criterion for stopping the iteration just consists in checking that the maximum variation between $p^n_{i,j}$ and $p^{n+1}_{i,j}$ is less than $1/100$. Notice that this algorithm can very easily be parallelized.

5. Zooming

In the case of zooming, the inverse problem that has to be solved is now (in its most simple formulation, as proposed by Guichard and Malgouyres, see [13, 16] for a general presentation)

$$\min_{u \in X} \frac{\|Au - g\|^2}{2\lambda} + J(u), \tag{12}$$

where $g \in X$ is a coarse image, that is, belonging to a "coarse" subspace $Z \subset X$, and $A$ is the orthogonal projection onto $Z$. For instance, $Z$ might be the set of vectors $g_{i,j}$ such that $g_{2k,2l} = g_{2k+1,2l} = g_{2k,2l+1} = g_{2k+1,2l+1}$ for every $k, l \le N/2$, in which case we expect $u$ to be a zooming of factor 2 of $g$. We have $Ag = g$, and it is clear that

$$\|Au - g\| = \|A(u - g)\| = \min_{w \in Z^\perp} \|u - g - w\|.$$

Hence (12) may be reformulated as

$$\min_{u \in X,\, w \in Z^\perp} \frac{\|u - (g + w)\|^2}{2\lambda} + J(u).$$

This provides an obvious algorithm for solving the problem, by alternate minimizations of the energy with respect to $w$ and $u$. We let $w^0 = 0$ and set for every $n \ge 0$

$$u^n = (g + w^n) - \pi_{\lambda K}(g + w^n),$$

which is computed using the algorithm (9), and

$$w^{n+1} = \pi_{Z^\perp}(u^n - g),$$

which is a straightforward calculation. It is very easy to establish the convergence of this algorithm, as $n \to \infty$, to a minimizer $(u, w)$ of the convex energy $(u, w) \mapsto \|u - (g + w)\|^2/(2\lambda) + J(u)$ (as long as the vectors in $Z^\perp$ have zero average, which is usually the case). We leave it to the reader.

We illustrate the output of this algorithm on Fig. 4. As expected (see [16]), the result is very good. However, we found out that our method is quite slow, and does not seem to be a great improvement with respect to standard methods. Still, some work has to be done in order to understand better how the energy is decreased at each iteration, and to try to find faster strategies.

Figure 4. Left: original 512 × 512 Lena and a 128 × 128 reduction. Middle: the small image expanded by a factor 4. Right: the small image expanded by 4 using the algorithm of Section 5.

6. Mean Curvature Motion

We mention here quickly another possible application of our algorithm. We do not intend to give too many details in this section (which has a priori little application to imaging and vision). This will be the subject of a forthcoming paper [4]. We present the isotropic case, although the method is very general and also works for anisotropic curvature motion.

Consider a set $E \subset\subset \Omega \subset \mathbb{R}^2$, such that the convex envelope of $E$ is strictly inside $\Omega$. Let $d_E$ be the signed distance to $\partial E$, such that $d_E \ge 0$ in $E$ and $d_E \le 0$ in $\Omega \setminus E$. This distance can be computed in a quite efficient way, using a fast-marching algorithm [18]. We choose $h > 0$ and then solve, using our algorithm, a discretization of the problem

$$\min_w \frac{1}{2h} \int_\Omega |w(x) - d_E(x)|^2\, dx + J(w). \tag{13}$$

Figure 5. An original curve (left), and its evolution for times t = 1, 30, 70, 100, 140 (right).
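Each time step of this scheme is just a denoising problem with the signed distance as data: minimize $\|w - d_E\|^2/(2h) + J(w)$ and keep $\{w > 0\}$. A toy numpy sketch follows; the names are mine, a brute-force distance stands in for the fast-marching computation of [18], and the projection routine is the fixed-point iteration of Section 3.

```python
import numpy as np

def grad(u):
    p = np.zeros((2,) + u.shape)
    p[0, :-1, :] = np.diff(u, axis=0)
    p[1, :, :-1] = np.diff(u, axis=1)
    return p

def div(p):
    d1 = np.concatenate([p[0, :1], np.diff(p[0, :-1], axis=0), -p[0, -2:-1]], axis=0)
    d2 = np.concatenate([p[1, :, :1], np.diff(p[1, :, :-1], axis=1), -p[1, :, -2:-1]], axis=1)
    return d1 + d2

def project_tv(g, lam, tau=0.25, n_iter=300):
    # fixed-point iteration of Section 3 for pi_{lam K}(g)
    p = np.zeros((2,) + g.shape)
    for _ in range(n_iter):
        q = grad(div(p) - g / lam)
        p = (p + tau * q) / (1.0 + tau * np.hypot(q[0], q[1]))
    return lam * div(p)

def signed_distance(E):
    """Brute-force signed distance: >= 0 inside E, <= 0 outside (fast marching
    [18] would be the efficient choice)."""
    N = E.shape[0]
    ii, jj = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
    pts = np.stack([ii.ravel(), jj.ravel()], axis=1).astype(float)
    ins, out = pts[E.ravel()], pts[~E.ravel()]
    d_to_out = np.linalg.norm(pts[:, None] - out[None], axis=2).min(axis=1)
    d_to_in = np.linalg.norm(pts[:, None] - ins[None], axis=2).min(axis=1)
    return np.where(E.ravel(), d_to_out, -d_to_in).reshape(N, N)

def mcm_step(E, h):
    """T_h E = {w > 0}, with w the minimizer of problem (13): a denoising
    step (6) with data d_E and lambda = h."""
    d = signed_distance(E)
    w = d - project_tv(d, h)
    return w > 0
```

Iterating `E = mcm_step(E, h)` then traces the discrete evolution $E_h(t) = (T_h)^n E^0$ illustrated in Fig. 5; a disk, for instance, shrinks toward a point.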
Here $J$ is defined by (2). We define the operator $T_h$ by letting $T_h E = \{w > 0\}$, with $w$ the solution of (13). Given an initial set $E^0$, we let, for $h > 0$ small and every $t > 0$,

$$E_h(t) = (T_h)^n E^0$$

with $n = [t/h]$ the integer part of $t/h$. Then, if $\partial E^0$ is smooth, we have the following result.

**Theorem 6.1.** There exists $t_0 > 0$ such that, as $h \to 0$, the boundaries $\partial E_h(t)$ converge to $\Gamma(t)$ in the Hausdorff sense for $0 \le t \le t_0$, where $\Gamma(t)$ is the mean curvature evolution starting from $\partial E^0$.

For the definition of the mean curvature motion, we refer to [2] and the huge literature that has followed. This result holds in fact in any dimension. The proof will be given in [4]. Figure 5 shows the evolution of a curve computed with this algorithm.

Note

1. We will sometimes drop the subscript "X", when not ambiguous.

References

1. A. Braides, Γ-Convergence for Beginners, No. 22 in Oxford Lecture Series in Mathematics and Its Applications. Oxford University Press, 2002.
2. K.A. Brakke, The Motion of a Surface by its Mean Curvature, Vol. 20 of Mathematical Notes. Princeton University Press: Princeton, NJ, 1978.
3. J.L. Carter, "Dual methods for total variation-based image restoration," Ph.D. thesis, U.C.L.A. (Advisor: T.F. Chan), 2001.
4. A. Chambolle, "An algorithm for mean curvature motion," to appear in Interfaces Free Bound.
5. A. Chambolle and P.-L. Lions, "Image recovery via total variation minimization and related problems," Numer. Math., Vol. 76, No. 2, pp. 167–188, 1997.
6. T.F. Chan, G.H. Golub, and P. Mulet, "A nonlinear primal-dual method for total variation-based image restoration," SIAM J. Sci. Comput., Vol. 20, No. 6, pp. 1964–1977, 1999 (electronic).
7. P.G. Ciarlet, Introduction à l'analyse numérique matricielle et à l'optimisation, Collection Mathématiques Appliquées pour la Maîtrise. Masson: Paris, 1982.
8. P.L. Combettes and J. Luo, "An adaptive level set method for nondifferentiable constrained image recovery," IEEE Trans. Image Process., Vol. 11, 2002.
9. F. Dibos and G. Koepfler, "Global total variation minimization," SIAM J. Numer. Anal., Vol. 37, No. 2, pp. 646–664, 2000 (electronic).
10. D.C. Dobson and C.R. Vogel, "Convergence of an iterative method for total variation denoising," SIAM J. Numer. Anal., Vol. 34, No. 5, pp. 1779–1791, 1997.
11. I. Ekeland and R. Temam, Convex Analysis and Variational Problems. North Holland: Amsterdam, 1976.
12. E. Giusti, Minimal Surfaces and Functions of Bounded Variation. Birkhäuser Verlag: Basel, 1984.
13. F. Guichard and F. Malgouyres, "Total variation based interpolation," in Proceedings of the European Signal Processing Conference, Vol. 3, pp. 1741–1744, 1998.
14. J.-B. Hiriart-Urruty and C. Lemaréchal, Convex Analysis and Minimization Algorithms. I, II, Vol. 305–306 of Grundlehren der Mathematischen Wissenschaften. Springer-Verlag: Berlin, 1993 (two volumes).
15. Y. Li and F. Santosa, "A computational algorithm for minimizing total variation in image restoration," IEEE Trans. Image Processing, Vol. 5, pp. 987–995, 1996.
16. F. Malgouyres and F. Guichard, "Edge direction preserving image zooming: A mathematical and numerical analysis," SIAM J. Numer. Anal., Vol. 39, No. 1, pp. 1–37, 2001 (electronic).
17. L.I. Rudin, S. Osher, and E. Fatemi, "Nonlinear total variation based noise removal algorithms," Physica D, Vol. 60, pp. 259–268, 1992.
18. J.A. Sethian, "Fast marching methods," SIAM Rev., Vol. 41, No. 2, pp. 199–235, 1999 (electronic).
19. C.R. Vogel and M.E. Oman, "Iterative methods for total variation denoising," SIAM J. Sci. Comput., Vol. 17, No. 1, pp. 227–238, 1996. Special issue on iterative methods in numerical linear algebra (Breckenridge, CO, 1994).

Antonin Chambolle received his Ph.D. in mathematics from the Université de Paris-Dauphine in 1993. Since then he has been a CNRS researcher at the CEREMADE, Université de Paris-Dauphine, and, for a short period, a researcher at the SISSA, Trieste, Italy. His research interests include calculus of variations, with applications to shape optimization, mechanics and image processing.