
IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 25, NO. 6, JUNE 2016 2519

Fast and Provably Accurate Bilateral Filtering


Kunal N. Chaudhury, Senior Member, IEEE, and Swapnil D. Dabhade

Manuscript received September 11, 2015; revised February 24, 2016 and March 26, 2016; accepted March 26, 2016. Date of publication March 29, 2016; date of current version April 14, 2016. This work was supported by the Indian Institute of Science through the Startup Grant. Some of the results in this paper were presented at the IEEE International Conference on Image Processing 2015 [1]. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Alessandro Foi. (Corresponding author: Kunal N. Chaudhury.)

The authors are with the Department of Electrical Engineering, Indian Institute of Science, Bangalore 560012, India (e-mail: k.n.chaudhury@ieee.org; swapnilddabhade@yahoo.co.in).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TIP.2016.2548363

1057-7149 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Abstract: The bilateral filter is a non-linear filter that uses a range filter along with a spatial filter to perform edge-preserving smoothing of images. A direct computation of the bilateral filter requires $O(S)$ operations per pixel, where $S$ is the size of the support of the spatial filter. In this paper, we present a fast and provably accurate algorithm for approximating the bilateral filter when the range kernel is Gaussian. In particular, for box and Gaussian spatial filters, the proposed algorithm can cut down the complexity to $O(1)$ per pixel for any arbitrary $S$. The algorithm has a simple implementation involving $N + 1$ spatial filterings, where $N$ is the approximation order. We give a detailed analysis of the filtering accuracy that can be achieved by the proposed approximation in relation to the target bilateral filter. This allows us to estimate the order $N$ required to obtain a given accuracy. We also present comprehensive numerical results to demonstrate that the proposed algorithm is competitive with the state-of-the-art methods in terms of speed and accuracy.

Index Terms: Edge-preserving smoothing, bilateral filter, kernel, approximation, fast algorithm, error analysis, bounds.

I. INTRODUCTION

Gaussian and box filters typically work well in applications where the amount of smoothing required is small. For example, they are quite effective in removing small dosages of noise from natural images. However, when the noise floor is large, and one is required to average more pixels to suppress the noise, these filters begin to over-smooth sharp image features such as edges and corners. The over-smoothing can, however, be alleviated using some form of data-driven (non-linear) diffusion, where the quantum of smoothing is controlled using the image features. A classical example in this regard is the famous PDE-based diffusion of Perona and Malik [2]. The bilateral filter was proposed by Tomasi and Manduchi [3] as a filtering-based alternative to the Perona-Malik diffusion. The bilateral filter has turned out to be a versatile tool that has found widespread applications in image processing, computer graphics, computer vision, and computational photography [4]. More recently, the bilateral filter has received renewed attention in the context of image denoising [5], [6].

In this paper, we consider a standard form of the bilateral filter where a Gaussian kernel is used for range filtering, and a box or Gaussian kernel is used for spatial filtering [3]. In this setting, the bilateral filtering of an image $\{f(i) : i \in I\}$, $I$ being some finite rectangular domain of $\mathbb{Z}^2$, is given by

$$f_{BF}(i) = \frac{\sum_{j \in \Omega} w(j)\, g_{\sigma_r}(f(i-j) - f(i))\, f(i-j)}{\sum_{j \in \Omega} w(j)\, g_{\sigma_r}(f(i-j) - f(i))} \tag{1}$$

where

$$g_{\sigma_r}(t) = \exp\left(-\frac{t^2}{2\sigma_r^2}\right). \tag{2}$$

The spatial filter is a Gaussian:

$$w(i) = \exp\left(-\frac{\|i\|^2}{2\sigma_s^2}\right) \quad (i \in \Omega), \tag{3}$$

or a box:

$$w(i) = 1/|\Omega| \quad (i \in \Omega). \tag{4}$$

The domain $\Omega$ of the spatial kernel is a square neighbourhood, $\Omega = [-W, W] \times [-W, W]$, where $W = 3\sigma_s$ for the Gaussian filter. We refer the interested reader to [3] and [4] for a detailed exposition on the working of the filter. We note that the bilateral filter has a straightforward extension to video and volume data. Another natural extension is the cross (or joint) bilateral filter [4]. While we will limit our discussion to the standard bilateral filter, the main ideas in this paper can also be applied to the above-mentioned extensions.

A. Fast Bilateral Filtering

It is clear that a direct computation of (1) requires $O(W^2)$ operations per pixel. In fact, the computation is slow for practical settings of $W$. To address this issue, researchers have come up with several fast algorithms [7]–[14]. Most of these are based on some form of approximation, and provide various levels of compromise between speed and quality of approximation. One of the early algorithms for fast bilateral filtering involved the quantization of the image intensities, where the final output was obtained via the interpolation of the output of a set of linear filters [7]. It was later shown that this approximation can be used to obtain a constant-time implementation which further improves its speed [8]. In a different direction, it was observed in [9] that the bilateral filter can be conceived as a linear filter acting in three dimensions, where the three dimensions are obtained by augmenting the image intensity to the spatial dimensions. This observation was used to derive a fast filtering in three dimensions, which was then sampled to obtain the final output. We refer the interested reader to [10] for a survey of fast algorithms for bilateral filtering.
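To make the $O(W^2)$ per-pixel cost of the direct computation concrete, the following is a minimal Python sketch of (1) with the kernels (2)-(4). It is an illustration only, not the authors' code: the function name `bilateral_direct`, the list-of-rows image layout, and the handling of image borders (neighbours outside the image are skipped) are assumptions made here.

```python
import math

def bilateral_direct(f, W, sigma_r, sigma_s=None):
    """Direct evaluation of (1): O(W^2) work per pixel.

    f        : image as a list of rows (lists of floats)
    W        : half-width of the window [-W, W] x [-W, W]
    sigma_r  : standard deviation of the Gaussian range kernel (2)
    sigma_s  : if given, Gaussian spatial kernel (3); otherwise box kernel (4)
    Neighbours falling outside the image are skipped (an assumption,
    the paper does not specify the boundary handling).
    """
    rows, cols = len(f), len(f[0])
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            num = den = 0.0
            for dy in range(-W, W + 1):
                for dx in range(-W, W + 1):
                    yy, xx = y - dy, x - dx
                    if not (0 <= yy < rows and 0 <= xx < cols):
                        continue
                    if sigma_s is None:
                        w = 1.0  # box kernel (4); the constant 1/|Omega| cancels in the ratio
                    else:
                        w = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma_s ** 2))
                    d = f[yy][xx] - f[y][x]
                    g = math.exp(-d * d / (2.0 * sigma_r ** 2))  # range kernel (2)
                    num += w * g * f[yy][xx]
                    den += w * g
            out[y][x] = num / den  # den >= 1 because the centre pixel has weight 1
    return out

# A sharp step edge survives when sigma_r is small relative to the step height.
step = [[0.0, 0.0, 0.0, 100.0, 100.0, 100.0] for _ in range(4)]
print(bilateral_direct(step, W=1, sigma_r=5.0)[1])
```

The two inner loops over `dy` and `dx` are what the fast algorithms discussed in this section aim to avoid.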

The algorithms in [10]–[12] are particularly relevant to the present work. Here the authors proceed by approximating (2) using polynomial and trigonometric functions, and demonstrate how the bilateral filter can be decomposed into a series of spatial filterings as a result. As is well-known, since spatial box and Gaussian filters can be implemented in constant-time using separability and recursion [13], the overall approximation can therefore be computed in constant-time.

B. Present Contribution

We propose a fast algorithm for computing (1) which was motivated by the line of work in [12] and [14]. In particular, similar to these papers, we present a novel approximation of (2) that allows us to decompose the bilateral filter into a series of spatial convolutions. The fundamental difference between the above papers and the present approach is that, instead of approximating (2) and then translating the approximation in range space, we directly approximate the translated Gaussians appearing in (1). In particular, the computational advantages obtained using the proposed approximation are the following:

• For a fixed approximation order (to be defined shortly), the proposed approximation requires half the number of spatial filterings required by the approximations in [8], [10], and [12].

• The proposed approximation does not involve the transcendental functions $\cos(\omega x)$ and $\sin(\omega x)$ which are used in [12] and [14]. It only involves polynomials (and just a single Gaussian), and hence can be efficiently implemented on hardware [15]. Moreover, the rounding error is small when working with polynomials.

As will be demonstrated shortly, the proposed algorithm is generally faster and more accurate than Yang's algorithm [8], which is currently considered to be the state-of-the-art [10], [16]. In particular, we perform an error analysis whereby we compare the output obtained using the proposed algorithm with that of the exact bilateral filter. Due to the particular nature of the proposed approximation, our analysis is much simpler than that carried out for Yang's algorithm in [16]. Nevertheless, compared to Yang's algorithm, we are able to establish a smaller bound on the number of spatial filterings required to achieve a given filtering accuracy. The latter is defined in terms of the error between the outputs of the bilateral filter and the fast algorithm (this will be made precise in Section III). To the best of our knowledge, with the exception of [8], this is the only fast algorithm that comes with a provable guarantee on the quality of approximation. At this point, we note that the term "accurate" is used in the paper not just to signify that the output of the fast algorithm is visibly close to that of the target bilateral filter. It also has a precise technical meaning, namely, that we can control the approximation order to make the error between the outputs of the bilateral filter and the fast algorithm arbitrarily small.

C. Organization

The rest of the paper is organized as follows. We present the proposed kernel approximation and the error analysis in Section II. In Section III, we develop a fast constant-time algorithm arising from the Gaussian-polynomial approximation and analyze its approximation quality. We present exhaustive numerical results in Section IV, and demonstrate its superior performance over existing algorithms.

II. GAUSSIAN-POLYNOMIAL APPROXIMATION

The present idea is to consider the translated kernel $g_{\sigma_r}(t - \tau)$ that appears in (1), where $t = f(i - j)$ and $\tau = f(i)$. We can write

$$g_{\sigma_r}(t - \tau) = \exp\left(-\frac{\tau^2}{2\sigma_r^2}\right) \exp\left(-\frac{t^2}{2\sigma_r^2}\right) \exp\left(\frac{\tau t}{\sigma_r^2}\right). \tag{5}$$

For a fixed translation $\tau$, this is a function of $t$. Notice that the first term is simply a scaling factor, while the second term is a Gaussian centered at the origin. In fact, the second term essentially contributes to the bell shape of the translated Gaussian. The third term is a monotonic exponential, which is increasing or decreasing depending on the sign of $\tau$; this term helps in translating the Gaussian to $t = \tau$.

We assume (without loss of generality, as will be explained at the start of Section III) that the dynamic range of the image is $[-T, T]$. That is, the arguments $t = f(i - j)$ and $\tau = f(i)$ in (5) take values in $[-T, T]$. This means that the product $\tau t$ appearing in (5) takes values in $[-T^2, T^2]$. Consider the Taylor expansion of the exponential term about the origin:

$$\exp\left(\frac{\tau t}{\sigma_r^2}\right) = \sum_{n=0}^{N-1} \frac{1}{n!} \left(\frac{\tau t}{\sigma_r^2}\right)^n + \text{higher-order terms}. \tag{6}$$

By dropping the higher-order terms, we obtain the following approximation of (5):

$$\phi_{N,\sigma_r}(t, \tau) = \exp\left(-\frac{t^2 + \tau^2}{2\sigma_r^2}\right) \sum_{n=0}^{N-1} \frac{1}{n!} \left(\frac{\tau t}{\sigma_r^2}\right)^n. \tag{7}$$

Being the product of a bivariate Gaussian and a polynomial, we will henceforth refer to (7) as a Gaussian-polynomial, where $N$ is its approximation order. By construction, we have the pointwise convergence

$$\lim_{N \to \infty} \phi_{N,\sigma_r}(t, \tau) = g_{\sigma_r}(t - \tau). \tag{8}$$

We would like to note that the above idea of splitting the kernel and approximating a part of it using Taylor polynomials was employed in [17] in the context of the fast Gauss transform. To the best of our knowledge, this idea has not been exploited for fast bilateral filtering along the lines of the present work.

In Figure 1, we study the approximations corresponding to different $N$. The fundamental difference between the Taylor approximation in [11] and the Gaussian-polynomial approximation (8) is that instead of approximating the entire Gaussian, we approximate one of its components, namely the exponential function in (5). The intuition behind this is that the Taylor polynomial blows up as one moves away from the origin. This makes it difficult to approximate the tail part of a Gaussian using such polynomials. On the other hand, the exponential in (5) is monotonic, and hence can be closely approximated using polynomials. This point is explained with an example in Figure 2. In particular, notice in Figure 2b that the Gaussian-polynomial approximation is quite precise over the range of

interest, and is comparable to the raised-cosine approximation of the same order [12].

Fig. 1. Approximation of $g_{30}(t - \tau)$ using Gaussian-polynomials $\phi_{N,30}(t, \tau)$ with different $N$. The bivariate functions $g_{30}(t - \tau)$ and $\phi_{N,30}(t, \tau)$ have been sampled along $t = -\tau$ to generate a one-dimensional profile.

Fig. 2. Comparison of the approximations of $g_{30}(t - 10)$ using raised-cosine [12], Taylor polynomial [11], and Gaussian-polynomial of order 10. We notice in (a) that the Taylor polynomial quickly goes off to $+\infty$ as one moves away from the origin. For this reason, we restricted the plot to $[-90, 90]$, although the desired approximation range is the full dynamic range $[-128, 128]$. The plots over $[-80, 80]$ are separately provided in (b) for comparing the raised-cosine and the Gaussian-polynomial approximations with the target Gaussian.

A. Quantitative Error Analysis

Before explaining how we can use Gaussian-polynomials to derive a fast bilateral filter in Section III, we study the kernel error incurred by approximating (2) using Gaussian-polynomials. We will see in Section III that a bound on the kernel error can in turn be used to bound the filtering accuracy of the fast algorithm. Note that (8) tells us that Gaussian-polynomials can be used to approximate the range kernel with arbitrary accuracy. However, in practice, we will be required to use a Gaussian-polynomial of some fixed order $N$. A relevant question is: what is the size of the error incurred for a given $N$? A related question is: given some error margin $\varepsilon > 0$, how do we fix the smallest $N$ such that the corresponding error is within $\varepsilon$? To begin with, we define the error function

$$E_{N,\sigma_r}(t, \tau) = g_{\sigma_r}(t - \tau) - \phi_{N,\sigma_r}(t, \tau) = \exp\left(-\frac{t^2 + \tau^2}{2\sigma_r^2}\right) \sum_{n=N}^{\infty} \frac{1}{n!} \left(\frac{\tau t}{\sigma_r^2}\right)^n. \tag{9}$$

The mathematical problem is one of bounding (9) for fixed $N$ and $\sigma_r$. In this work, we consider the $\ell_\infty$ error given by

$$\|E_{N,\sigma_r}\|_\infty = \max\left\{ |E_{N,\sigma_r}(t, \tau)| : -T \le t, \tau \le T \right\}. \tag{10}$$

This is also referred to as the worst-case or uniform error. We note that one can measure the error using other means, e.g., using the $\ell_2$ metric. The reason why we choose the $\ell_\infty$ metric is that our ultimate goal is to quantify the $\ell_\infty$ accuracy of the final filtering arising from the approximation, and a bound on (10) is sufficient for this purpose. Moreover, computing the $\ell_\infty$ error is relatively simple.

Using the inequality $(t^2 + \tau^2)/2 \ge |\tau t|$, we can bound the first term in (9) by $\exp(-|\tau t|/\sigma_r^2)$. Therefore, we have

$$\|E_{N,\sigma_r}\|_\infty \le \max_{s \in [0, T^2]} \psi_{N,\sigma_r}(s), \tag{11}$$

where

$$\psi_{N,\sigma_r}(s) = \exp\left(-\frac{s}{\sigma_r^2}\right) \sum_{n=N}^{\infty} \frac{1}{n!} \left(\frac{s}{\sigma_r^2}\right)^n. \tag{12}$$

Using (11), we obtain the following result. We note that this bound is stronger than that derived for the fast Gauss transform in [17].

Proposition 1:

$$\|E_{N,\sigma_r}\|_\infty \le \sum_{n=N}^{\infty} \frac{e^{-\lambda} \lambda^n}{n!}, \qquad \lambda = T^2/\sigma_r^2. \tag{13}$$

To arrive at (13), we proceed by writing (12) as

$$\psi_{N,\sigma_r}(s) = 1 - \exp\left(-\frac{s}{\sigma_r^2}\right) \sum_{n=0}^{N-1} \frac{1}{n!} \left(\frac{s}{\sigma_r^2}\right)^n.$$

After differentiation, we get

$$\psi'_{N,\sigma_r}(s) = \frac{1}{(N-1)!\,\sigma_r^2} \left(\frac{s}{\sigma_r^2}\right)^{N-1} \exp\left(-\frac{s}{\sigma_r^2}\right) \ge 0.$$

Thus, (12) is non-decreasing on $[0, T^2]$, whereby we conclude that the maximum in (11) is attained at $s = T^2$. This establishes Proposition 1.

Fig. 3. Comparison of the actual error (9) and the bound in (12) for $T = 128$ and $\sigma_r = 30$. We plot the samples of the error function $E_{40,30}(t, \tau)$ over the square domain $-128 \le t, \tau \le 128$ in (a). We compare this with the samples of (12) over the same domain in (b), where we have set $s = |\tau t|$. Notice that the suprema of the two plots are of the same order of magnitude.

To get an idea of the tightness of the bound in (13), we compare the mesh plots of (9) and (12) in Figure 3 when $\sigma_r = 30$ and $N = 40$. While there is a gap between the error and the corresponding bound at certain values of $(t, \tau)$, the supremum of the latter (which occurs at one of the boundaries as predicted above) is nevertheless of the same order of magnitude as the supremum of the former.
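The kernel approximation (7) and the bound (13) can be checked numerically. The sketch below is a hedged illustration in plain Python rather than the authors' code: the function names are hypothetical, and the error (10) is only sampled on a coarse grid (the `step` parameter is an assumption made here to keep the run short).

```python
import math

def gauss_range(t, tau, sigma_r):
    """Translated range kernel g_{sigma_r}(t - tau), see (2)."""
    return math.exp(-((t - tau) ** 2) / (2.0 * sigma_r ** 2))

def gauss_poly(t, tau, sigma_r, N):
    """Gaussian-polynomial phi_{N,sigma_r}(t, tau) from (7)."""
    x = tau * t / sigma_r ** 2
    term, total = 1.0, 0.0           # term holds x^n / n!
    for n in range(N):
        total += term
        term *= x / (n + 1)
    return math.exp(-(t * t + tau * tau) / (2.0 * sigma_r ** 2)) * total

def poisson_tail(lam, N):
    """Right-hand side of (13): Prob(X >= N) for X ~ Poisson(lam)."""
    term, head = math.exp(-lam), 0.0  # term holds e^{-lam} lam^n / n!
    for n in range(N):
        head += term
        term *= lam / (n + 1)
    return max(0.0, 1.0 - head)

def max_kernel_error(sigma_r, N, T=128, step=8):
    """Sampled version of the l_inf error (10) on a (t, tau) grid."""
    grid = range(-T, T + 1, step)
    return max(abs(gauss_range(t, tau, sigma_r) - gauss_poly(t, tau, sigma_r, N))
               for t in grid for tau in grid)

T, sigma_r, N = 128, 30.0, 40        # the setting used for Figure 3
err = max_kernel_error(sigma_r, N)
bound = poisson_tail((T / sigma_r) ** 2, N)
print(f"sampled error = {err:.3e}, bound (13) = {bound:.3e}")
```

Note that `poisson_tail` computes the tail as one minus the head of the series, so it loses precision once the tail drops near machine epsilon, and `math.exp(-lam)` underflows for very small $\sigma_r$; a log-domain evaluation would be needed in those regimes.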

TABLE I. Comparison of the Gaussian-polynomial order obtained using (18), where (1) $W_0$ is computed using the MATLAB function lambertw ($N_0$), (2) $W_0$ is given by (19) ($N_0$), and (3) the series evaluation is refined using three Newton iterations ($N_0$).

Algorithm 1. Estimation of the Approximation Order
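The order estimation developed in Section II-B can be sketched in a few lines of Python. This is a hedged reading of the estimates (16) and (18), the series (19), and the Newton iterations (20), not a transcription of Algorithm 1: the function names, the stopping tolerance, and the fallback starting point used when the series gives an unusable initial guess are assumptions made for this illustration.

```python
import math

def lambert_w0_series(t):
    """Four-term series (19) for W_0(t); only accurate for small |t|."""
    return t - t ** 2 + 1.5 * t ** 3 - (8.0 / 3.0) * t ** 4

def chebyshev_order(lam, eps):
    """Estimate (16), obtained from the Chebyshev bound (14)."""
    return math.ceil(lam + math.sqrt(lam / eps))

def chernoff_order(lam, eps, max_iter=50, tol=1e-9):
    """Estimate (18), refined with the Newton iterations (20)."""
    p = 1.0 + math.log(lam)
    q = -lam - math.log(eps)
    w = lambert_w0_series(q * math.exp(-p))
    x = q / w if w != 0.0 else lam + 1.0
    if x <= lam:                      # fallback start (an assumption, not from the paper)
        x = lam + 1.0
    for _ in range(max_iter):         # the paper reports that 3-4 iterations suffice
        step = (x * math.log(x) - p * x - q) / (math.log(x) + 1.0 - p)
        x -= step
        if abs(step) < tol:
            break
    return math.ceil(x)

def chernoff_bound(lam, N):
    """Right-hand side of (15), evaluated in the log domain."""
    return math.exp(-lam + N * (1.0 + math.log(lam)) - N * math.log(N))

def estimate_order(sigma_r, eps, T=128.0):
    """Order for a kernel tolerance eps; fixed order 10 when sigma_r > 70."""
    if sigma_r > 70:
        return 10
    return chernoff_order((T / sigma_r) ** 2, eps)

lam = (128.0 / 30.0) ** 2
for eps in (0.1, 0.001):
    print(eps, chebyshev_order(lam, eps), chernoff_order(lam, eps))
```

Because $\nu(x) = x \log x - px - q$ is convex and increasing for $x > \lambda$, Newton's method started anywhere to the right of $\lambda$ converges to the larger root, which is why a crude starting point is tolerable here.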

B. Relation Between N and Kernel Error

Having obtained a bound on the approximation error, we consider the problem of finding the smallest $N$ such that (10) is within some allowed error margin $\varepsilon > 0$. Note that the quantity on the right in (13) is simply the tail probability of a Poisson random variable with parameter $\lambda$. We recall that a random variable $X$ taking values in $\{0, 1, 2, \ldots\}$ is said to follow a Poisson distribution with parameter $\lambda > 0$ if

$$\mathrm{Prob}(X = n) = \frac{e^{-\lambda} \lambda^n}{n!} \quad (n = 0, 1, 2, \ldots).$$

We can thus interpret the quantity on the right in (13) as the probability $\mathrm{Prob}(X \ge N)$. In this context, the leading question is the following: given $\varepsilon > 0$, find the smallest $N$ such that $\mathrm{Prob}(X \ge N) \le \varepsilon$. The advantage of expressing the problem in this form is that it brings to our disposal various tools for bounding the tail probability. For example, assuming that $N > \lambda$, we have the Chebyshev bound [18]:

$$\mathrm{Prob}(X \ge N) \le \frac{\lambda}{(N - \lambda)^2}. \tag{14}$$

On the other hand, the Chernoff bound [18] when $N > \lambda$ is given by

$$\mathrm{Prob}(X \ge N) \le \frac{e^{-\lambda} (e\lambda)^N}{N^N}. \tag{15}$$

Numerical experiments suggest that for $\sigma_r < 70$ and for a range of values of $\varepsilon$ (to be reported shortly), the empirically computed $N$ is always larger than $\lambda$. Under this assumption, we have the following estimate of the smallest $N$ using (14):

$$N_0 = \left\lceil \lambda + \sqrt{\lambda/\varepsilon} \right\rceil, \tag{16}$$

where $\lceil x \rceil$ is the smallest integer greater than or equal to $x$.

As is well-known, the Chernoff bound (15) is typically tighter than the Chebyshev bound. However, finding the smallest $N$ such that

$$\frac{e^{-\lambda} (e\lambda)^N}{N^N} \le \varepsilon \tag{17}$$

is somewhat more involved.

Proposition 2: Let $t \mapsto W_0(t)$ be the inverse of the map $t \mapsto t \exp(t)$ on $(0, \infty]$. Then the smallest integer greater than $\lambda$ for which (17) holds is

$$N_0 = \left\lceil q / W_0(q e^{-p}) \right\rceil, \tag{18}$$

where $p = 1 + \log(\lambda)$ and $q = -\lambda - \log \varepsilon$.

The details are provided in Appendix VI-A. While $W_0(t)$ can be computed using the Matlab script lambertw(0,t), we note that $W_0(t)$ can be approximated using a series expansion [19]. In particular, the first four terms are

$$W_0(t) = t - t^2 + \frac{3}{2} t^3 - \frac{8}{3} t^4. \tag{19}$$

However, we observed that (19) provides inexact estimates when $\lambda$ is large, that is, when $\sigma_r$ is small. An extremely large number of terms of the series are required to get a precise estimate. To address this problem, we propose to use Newton iterations for finding the positive root of $\nu(x) = x \log x - px - q = 0$ (see Appendix VI-A for notations), where the initialization is done using (18) and (19). Namely, starting with $x_0 = q / W_0(q e^{-p})$, we run the following iterations for $k \ge 0$:

$$x_{k+1} = x_k - \frac{\nu(x_k)}{\nu'(x_k)} = x_k - \frac{x_k \log x_k - p x_k - q}{\log x_k + 1 - p}. \tag{20}$$

In practice, we noticed that about 3-4 iterations are sufficient to produce a good solution. In Table I, we illustrate the improvement obtained after performing the Newton iterations. The complete scheme for computing the order for a given accuracy $\varepsilon$ is summarized in Algorithm 1.

Note that for $\sigma_r > 70$, we use a fixed order of 10. This is because the condition $N > \lambda$ in (14) and (15) is violated in this regime. Moreover, we have noticed that a small order suffices when $\sigma_r$ is large. In Figure 4, we compare the estimated order $N_0$ obtained using the following methods: Chebyshev (16), Chernoff (18) along with (19), and Chernoff followed by Newton iterations (20). We also compare the corresponding errors (computed using exhaustive search) given by (10). Notice that the estimates are close to that obtained using exhaustive search when $\varepsilon = 0.1$; however, when $\varepsilon = 0.001$, the Chebyshev bound is quite loose.

III. FAST BILATERAL FILTERING

We now explain how Gaussian-polynomials can be used to derive a fast algorithm for implementing (1). As a first step,

we center the intensity range $\{f(i) : i \in I\}$ around the origin. This is in keeping with the Taylor expansion in (7) which is performed around the origin. A simple means of doing so is to set $t_c = T$, assuming the dynamic range to be $[0, 2T]$, and to consider the centred image $\{h(i) : i \in I\}$ given by

$$h(i) = f(i) - t_c \quad (i \in I). \tag{21}$$

The crucial observation is that the shift operation in (21) commutes with the non-linear bilateral filtering.

Proposition 3: For $i \in I$,

$$f_{BF}(i) = h_{BF}(i) + t_c. \tag{22}$$

In other words, we can first centre the intensity range, apply the bilateral filter, and finally add back the centre to the output. Henceforth, we will assume that the range of the input image is $[-T, T]$. For an 8-bit grayscale image, $T = 128$.

Fig. 4. For $\varepsilon = 0.1$ and $0.001$, we compare the order $N_0$ obtained using various methods (top row) and the corresponding error (bottom row). (a) $\varepsilon = 0.1$. (b) $\varepsilon = 0.001$. (c) $\varepsilon = 0.1$. (d) $\varepsilon = 0.001$.

A. Fast Algorithm

The underlying mechanism of the proposed fast algorithm is related to the fast algorithms in [12] and [14]. The subtle difference is that instead of directly approximating (2), we approximate its translates in (1). In particular, we fix some order $N$, and approximate the range kernel in (1) using (7). This gives us the following Gaussian-Polynomial Approximation (GPA) of (1):

$$f_{GPA}(i) = \frac{\sum_{j \in \Omega} w(j)\, \phi_{N,\sigma_r}(f(i-j) - f(i))\, f(i-j)}{\sum_{j \in \Omega} w(j)\, \phi_{N,\sigma_r}(f(i-j) - f(i))}. \tag{23}$$

Next, for $n = 0, \ldots, N - 1$, we define the images

$$G_n(i) = \left(\frac{f(i)}{\sigma_r}\right)^n, \qquad F_n(i) = \exp\left(-\frac{f(i)^2}{2\sigma_r^2}\right) G_n(i), \tag{24}$$

and set

$$\bar{F}_n(i) = (F_n * w)(i) = \sum_{j \in \Omega} w(j) F_n(i - j). \tag{25}$$

We can then write (23) as (cf. Appendix VI-B)

$$f_{GPA}(i) = \frac{P(i)}{Q(i)}, \tag{26}$$

where

$$P(i) = \sigma_r \sum_{n=0}^{N-1} \frac{1}{n!} G_n(i) \bar{F}_{n+1}(i), \tag{27}$$

and

$$Q(i) = \sum_{n=0}^{N-1} \frac{1}{n!} G_n(i) \bar{F}_n(i). \tag{28}$$

Notice that we have effectively transferred the non-linearity of the bilateral filter to the intermediate images in (24), which are obtained from the input image using simple pointwise transforms. The computational advantage that we get from the above manipulation is that the spatial filtering in (25) can be computed using $O(1)$ operations per pixel when $w$ is a box or a Gaussian [13]. The overall cost of computing (23) is therefore $O(1)$ per pixel with respect to the filter size $W$. This is a substantial reduction from the $O(W^2)$ complexity of the direct implementation of (1).

Algorithm 2. Gaussian-Polynomial Approximation (GPA)

The complete algorithm for computing (23) is summarized in Algorithm 2, which we will continue to refer to as GPA.
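The decomposition (24)-(28) can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the listing of Algorithm 2: it uses a box spatial kernel only, the box filter is a naive windowed sum rather than the recursive $O(1)$ filter of [13], windows are clipped at the image border, and the names `box_sum`, `gpa_box`, and `bf_box` are hypothetical. A direct evaluation of (1) is included purely as a reference for comparison.

```python
import math

def box_sum(img, W):
    """Windowed sum over a (2W+1)x(2W+1) box, clipped at the border.
    Naive for brevity; a running-sum version would cost O(1) per pixel."""
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            out[y][x] = sum(img[yy][xx]
                            for yy in range(max(0, y - W), min(rows, y + W + 1))
                            for xx in range(max(0, x - W), min(cols, x + W + 1)))
    return out

def gpa_box(f, W, sigma_r, N, T=128.0):
    """Order-N Gaussian-polynomial approximation (26) with a box kernel."""
    rows, cols = len(f), len(f[0])
    h = [[v - T for v in row] for row in f]                   # centring (21)
    G = [[1.0] * cols for _ in range(rows)]                   # G_0
    F = [[math.exp(-v * v / (2.0 * sigma_r ** 2)) for v in row] for row in h]  # F_0
    Fbar = box_sum(F, W)                                      # filtered F_0, see (25)
    P = [[0.0] * cols for _ in range(rows)]
    Q = [[0.0] * cols for _ in range(rows)]
    inv_fact = 1.0                                            # 1/n!
    for n in range(N):
        # F_{n+1} = F_n * h / sigma_r, followed by one spatial filtering
        F = [[F[y][x] * h[y][x] / sigma_r for x in range(cols)] for y in range(rows)]
        Fbar_next = box_sum(F, W)
        for y in range(rows):
            for x in range(cols):
                c = inv_fact * G[y][x]
                Q[y][x] += c * Fbar[y][x]                     # term n of (28)
                P[y][x] += c * Fbar_next[y][x]                # term n of (27), without sigma_r
                G[y][x] *= h[y][x] / sigma_r                  # G_{n+1}
        Fbar = Fbar_next
        inv_fact /= (n + 1)
    # add the centre back, see (22)
    return [[T + sigma_r * P[y][x] / Q[y][x] for x in range(cols)] for y in range(rows)]

def bf_box(f, W, sigma_r):
    """Direct evaluation of (1) with a box kernel, used as a reference."""
    rows, cols = len(f), len(f[0])
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            num = den = 0.0
            for yy in range(max(0, y - W), min(rows, y + W + 1)):
                for xx in range(max(0, x - W), min(cols, x + W + 1)):
                    g = math.exp(-(f[yy][xx] - f[y][x]) ** 2 / (2.0 * sigma_r ** 2))
                    num += g * f[yy][xx]
                    den += g
            out[y][x] = num / den
    return out

img = [[float((37 * x + 91 * y * y) % 256) for x in range(8)] for y in range(8)]
exact = bf_box(img, 1, 40.0)
approx = gpa_box(img, 1, 40.0, N=40)
err = max(abs(a - b) for ra, rb in zip(exact, approx) for a, b in zip(ra, rb))
print(f"max |BF - GPA| = {err:.2e}")
```

The sketch performs one spatial filtering for $F_0$ and one per loop iteration, that is, $N + 1$ filterings in total. Because the same unnormalized box sum appears in both $P$ and $Q$, the normalization $1/|\Omega|$ cancels in the ratio.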

Note that we efficiently implement steps (24) to (28) by avoiding redundant computations. In particular, we recursively compute the images in (24) and the factorials in (27) and (28). Notice that steps 6-11, 16-17, 21-22, and 26 are cheap pointwise operations. The main computation in Algorithm 2 is the spatial filtering in step 19, and the initial filtering in step 13. That is, the overall cost is dominated by the cost of computing $N + 1$ spatial filterings. In this regard, we note that for the same degree $N$, the number of spatial filterings required in [12] and [14] is $4N$, and that in [8] is $2N$. Moreover, we note that the proposed algorithm involves the evaluation of a transcendental function just once, namely in step 7. In contrast, the algorithm in [8] requires $N$ evaluations of the Gaussian over the whole image. Thus, the present algorithm has smaller rounding errors, and is better suited for hardware implementations [15] compared to the above-mentioned algorithms. Yet another key advantage with Algorithm 2 is that we need just six images (excluding the input and output images) for the complete pipeline. As against this, the algorithm in [8] requires the computation and storage of $N$ principal images, which are interpolated to get the final output.

B. Filtering Accuracy

It is clear that the kernel error, and hence the overall quality of approximation, is controlled by the order $N$. In this regard, we need a rule to fix $N$ in Algorithm 2. As before, we will consider the worst-case error given by

$$\|f_{BF} - f_{GPA}\|_\infty = \max\left\{ |f_{BF}(i) - f_{GPA}(i)| : i \in I \right\}. \tag{29}$$

By bounding (29), we can control the pixelwise difference between the exact and the approximate bilateral filter. In particular, we have the following result which formally establishes the intuitive fact that the filtering accuracy is essentially within a certain factor of the kernel error given by (10). The details of the derivation are provided in Appendix VI-C.

Proposition 4: Suppose that the spatial filter is non-negative and normalized, i.e., $w(j) \ge 0$ for all $j \in \Omega$, and

$$\sum_{j \in \Omega} w(j) = 1. \tag{30}$$

Then

$$\|f_{BF} - f_{GPA}\|_\infty \le 2\, \frac{T\, \|E_{N,\sigma_r}\|_\infty}{w(0) - \|E_{N,\sigma_r}\|_\infty}. \tag{31}$$

We note that the spatial filters (3) and (4) are non-negative, and that $w(i)$ appears in both the numerator and denominator of (1) and (23). Therefore, we can assume (30) without any loss of generality. In fact, (30) is automatically true for the box filter. We also recall that the range of the image is assumed to be centered; the intensity values are in the interval $[-T, T]$, where $T \approx 128$ for most grayscale images.

C. Relation Between Accuracy and $N_0$

Note that by combining (31) with the bound in (15), we get a direct control on the filtering accuracy in terms of the approximation order. In particular, suppose that we want (29) to be within $\pm\delta$. A sufficient condition for this is that

$$\|E_{N,\sigma_r}\|_\infty \le \frac{w(0)\,\delta}{2T + \delta}.$$

To summarize, we have the following guarantee that follows from (31).

Corollary 5: Suppose that $N$ is set using Algorithm 1, where $\varepsilon$ is given by

$$\varepsilon = \frac{w(0)\,\delta}{2T + \delta}. \tag{32}$$

Then the output of Algorithm 2 is within $\pm\delta$ of the output of the bilateral filter.

IV. EXPERIMENTS AND DISCUSSION

We implemented the proposed GPA algorithm using Matlab 8.4 on an Intel 3.4 GHz Linux system with 8 GB memory. The Matlab implementation has been shared here [20]. The set of grayscale images used for the experiments is shown in Figure 5. We compared the proposed algorithm with the following fast algorithms: Yang [8], Paris [9], Weiss [22], and the Shiftable Bilateral Filter (SBF) [14]. We used the Matlab implementation of these algorithms to make the comparison fair; moreover, we used the parameter settings suggested in the respective papers. For determining the order in [8] for a given accuracy parameter $\delta$, we have used (34).

Fig. 5. List of grayscale images used for the experiments in Section IV. The images were obtained from [21]. All images are of size $512 \times 512$. (a) $I_1$. (b) $I_2$. (c) $I_3$. (d) $I_4$. (e) $I_5$. (f) $I_6$. (g) $I_7$. (h) $I_8$.

A. Experiment 1

The output of the proposed GPA algorithm on a couple of images is shown in Figures 6 and 7. We also provide the output obtained using exact bilateral filtering. We performed the comparison using the box and the Gaussian kernels for the spatial filter. Notice that the speedup obtained is significant. Moreover, the filtered images are visually identical and numerically very close, in terms of the $\ell_\infty$ and mean-squared errors. We have used the following definition of mean-squared error (MSE):

$$\mathrm{MSE} = 10 \log_{10} \left( |I|^{-1} \sum_{i \in I} \left( f_{BF}(i) - f_{GPA}(i) \right)^2 \right),$$

where $|I|$ denotes the number of pixels in the image.
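The relation between the target accuracy $\delta$, the kernel tolerance (32), and the bound (31), together with the two error measures used in the experiments, can be written out as a short Python sketch. The function names are hypothetical and images are assumed to be lists of rows; the value $w(0) = 1/9$ in the example corresponds to a normalized $3 \times 3$ box filter.

```python
import math

def kernel_tolerance(delta, w0, T=128.0):
    """Kernel error eps from (32) that guarantees a filtering error within delta."""
    return w0 * delta / (2.0 * T + delta)

def filtering_bound(kernel_err, w0, T=128.0):
    """Right-hand side of (31); requires kernel_err < w0."""
    return 2.0 * T * kernel_err / (w0 - kernel_err)

def linf_error(a, b):
    """Worst-case pixelwise error (29) between two images."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def mse_db(a, b):
    """Mean-squared error in decibels, as defined in Experiment 1.
    Raises ValueError when the two images are identical (log of zero)."""
    n = sum(len(row) for row in a)
    s = sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return 10.0 * math.log10(s / n)

w0 = 1.0 / 9.0                       # normalized 3x3 box filter
eps = kernel_tolerance(0.1, w0)
print(f"eps = {eps:.3e}, bound (31) at eps = {filtering_bound(eps, w0):.3f}")
```

Substituting (32) into (31) returns exactly $\delta$, which is why the tolerance in Corollary 5 is sufficient.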

Fig. 6. Comparison of the exact bilateral filter (BF) and the proposed approximation (GPA) on images I1 and I2 . A Gaussian kernel (σs = 5) is used for
the spatial filter, and σr = 50 for the Gaussian range kernel. The accuracy parameter was set to δ = 0.1 for the GPA. In the caption of (a) and (c), we
report the run time of the BF. In the caption of (b) and (d), we report the run time of the GPA, and the $\ell_\infty$ and mean-squared errors between BF and GPA.
(a) BF (10.2 sec). (b) GPA (0.77 sec, −72 dB, −174 dB). (c) BF (9.4 sec). (d) GPA (0.85 sec, −58 dB, −162 dB).

Fig. 7. The setup here is identical to that in Figure 6, with the difference that a box kernel (W = 10) is used for the spatial filter instead of the Gaussian.
(a) BF (4.3 sec). (b) GPA (0.61 sec, −65 dB, −164 dB). (c) BF (4.22 sec). (d) GPA (0.62 sec, −54 dB, −155 dB).

Fig. 8. Comparison of the filtering accuracy and the run time of four different algorithms as a function of the parameters σs (Gaussian spatial filter) and σr .
We used image I1 in Figure 5 for the comparison. We used δ = 1 for GPA and Yang’s algorithm [8]. A tolerance of 0.01 was used for the SBF [14].
(a) σs = 3. (b) σs = 3. (c) σr = 30. (d) σr = 30.

To get a better understanding of how $N_0$ varies with $\delta$, we used the following approximation (see Appendix VI-D):

$$N_0 \approx 1.72 \left(\frac{T}{\sigma_r}\right)^2 + \log\left(\frac{2T}{w(0)\,\delta}\right). \tag{33}$$

An important point to note in (33) is the logarithmic dependence on $\delta$. In fact, the $\log(1/\delta)$ factor can be traced back to the tail bound in (15), which, in turn, follows from the particular splitting in (7). The implication of the logarithmic dependence is that we can force $\delta$ to be quite small without blowing up $N_0$.

TABLE II. Comparison of the order $N_0$ required to achieve a desired accuracy $\delta$ when $\sigma_s = 5$ and $\sigma_r = 30$.

To further highlight the importance of (33), we compared (33) with the corresponding estimate for Yang's algorithm [8]:

$$N_0 \approx \frac{1.14 \times 10^5}{\delta^{1/2}\, \sigma_r^2}. \tag{34}$$

The above estimate was recently derived in [16]. In particular, notice that the dependence on $\sigma_r$ is similar to that in (33). However, the dependence on $\delta$ is much stronger in (34)

Fig. 9. Comparison of the filtering accuracy and the run time as a function of W and σr . The settings are identical to that in Figure 8; the difference here
is that we have used a box spatial kernel instead of a Gaussian kernel. (a) W = 4. (b) W = 4. (c) σr = 30. (d) σr = 30.

Fig. 10. Comparison of the filtering accuracy ($\ell_\infty$ error and MSE) of various fast algorithms on the images in Figure 5. The red horizontal lines
in 10a and 10b represent the accuracy parameter δ used for GPA and Yang’s algorithm. The tolerance for SBF was set to be 0.01. In 10a - 10b, we
show the results for a Gaussian spatial kernel, and in 10c - 10d we show the results for a box kernel. We used σr = 30 for the Gaussian range kernel in all
the experiments. (a) σs = 5. (b) σs = 5. (c) W = 4. (d) W = 4.

TABLE III. Comparison of the proposed GPA algorithm with Yang's algorithm [8] for different order $N$. The $\ell_\infty$ error and the MSE are in decibels, while the time is in milliseconds. The comparison is done on image $I_5$ using both box and Gaussian spatial filters; the type of spatial filter is mentioned within brackets. The respective parameters for the box and Gaussian filter are $W = 4$ and $\sigma_s = 5$, and $\sigma_r = 30$ for the Gaussian range kernel. Notice that the accuracy of GPA saturates above $N = 60$.

compared to (33), since $\log(1/\delta) \ll \delta^{-1/2}$ when $\delta < 1$. Moreover, the leading constant in (34) is much larger than the constant in the first term in (33). As an example, when $\delta = 3$ and $\sigma_r = 50$, we have $N_0 \approx 27$ for Yang's algorithm (this is the estimate reported in [16]). On the other hand, the corresponding estimate for our algorithm is $N_0 \approx 19$ (assuming that we use a box filter of size 3×3). The difference becomes dramatic for smaller values of $\delta$. For example, when $\sigma_r = 50$ and $\delta = 0.01$, the estimate from (33) is 24, while that from (34) is 456. Further comparisons are provided in Table II. Notice that the order for Yang's approximation explodes when $\delta < 1$ (sub-pixel accuracy). It is also seen from the table that (33) provides a close approximation of (18) for the setting under consideration.

B. Experiment 2
A graphic comparison of the algorithms for various settings of the spatial and range kernels is presented in Figures 8 and 9. As before, we performed the comparison for both the box and Gaussian spatial filters. It is evident from these results that the proposed method is competitive with existing methods in terms of the speed-accuracy tradeoff.

C. Experiment 3
We next compared the proposed algorithm with existing fast algorithms on the images shown in Figure 5. A summary of the comparisons (in terms of maximum pixelwise error and MSE) is provided in Figure 10.

D. Experiment 4
Finally, we performed a detailed comparison of the proposed algorithm with Yang's algorithm, which is widely considered to be the state-of-the-art algorithm. In the first comparison, we fixed an image and the parameters of the bilateral filter. The order N was then varied and the corresponding error and run times were noted. The results are presented in Table III.
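The two error measures used in these comparisons (maximum pixelwise error and MSE between the exact and the approximate filter outputs) could be computed along the following lines. This is only a minimal sketch: the function names `error_metrics` and `to_decibels` are hypothetical, and the decibel convention (10 log10 for the MSE, 20 log10 for the maximum error) is an assumption, since the convention behind the reported numbers is not restated here.

```python
import math


def error_metrics(reference, approx):
    """Return (max_abs_error, mse) for two equally sized images.

    Images are plain lists of rows of numbers.
    """
    diffs = [abs(r - a)
             for ref_row, app_row in zip(reference, approx)
             for r, a in zip(ref_row, app_row)]
    if not diffs:
        raise ValueError("images must be non-empty")
    max_err = max(diffs)                          # maximum pixelwise error
    mse = sum(d * d for d in diffs) / len(diffs)  # mean squared error
    return max_err, mse


def to_decibels(value, power=True):
    """Assumed convention: 10*log10 for power-like values (MSE),
    20*log10 for amplitude-like values (maximum error)."""
    if value <= 0:
        return float("-inf")
    return (10.0 if power else 20.0) * math.log10(value)


# Toy 2x2 example standing in for the exact and approximate outputs.
reference = [[10.0, 20.0], [30.0, 40.0]]
approx = [[10.5, 20.0], [29.0, 40.0]]
max_err, mse = error_metrics(reference, approx)
print(to_decibels(max_err, power=False), to_decibels(mse))
```

In an experiment of this kind, `reference` would be the output of the direct bilateral filter (1) and `approx` the output of the fast algorithm at a given order N.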

TABLE IV
Comparison of the GPA algorithm with Yang's algorithm [8] at different δ. See Table III for the parameter settings.

Notice that the run time of GPA is consistently smaller than that of Yang's algorithm for both the box and Gaussian kernels. Indeed, as remarked earlier, for a fixed order N, Yang's algorithm [8] requires 2N spatial filterings, while GPA requires only N + 1 spatial filterings. Thus, the run time of GPA is about half of that of Yang's algorithm. Moreover, beyond a certain N, GPA provides much better filtering accuracy. We performed a similar experiment by varying δ, the results of which are reported in Table IV. Notice that the run time of Yang's algorithm becomes prohibitively large when δ is small.

V. CONCLUSION

We presented a novel fast algorithm for approximating the bilateral filter. The algorithm was shown to be both fast and accurate in practice using extensive experiments. The space and time complexity of the proposed algorithm is smaller than that of the state-of-the-art algorithm of Yang et al. [8]; moreover, the proposed algorithm was shown to provide much better accuracy. We also performed an error analysis of the approximation scheme, and presented a rule for setting the approximation order that can guarantee the filtering accuracy to be within a desired margin.

APPENDIX

A. Derivation of (18)
Taking the logarithm of (17), we can restate the problem as one of finding the smallest integer $x > \lambda$ such that

$$\nu(x) = x \log x - p x - q \geq 0, \tag{35}$$

where $p = 1 + \log(\lambda)$ and $q = -\lambda - \log \varepsilon$.

Notice that $\nu'(\lambda) = 0$ and $\nu''(x) = 1/x > 0$. Hence, $\nu(x)$ is strictly convex over $(0, \infty)$ with a minimum at $x = \lambda$. Since $\nu(\lambda) = \log \varepsilon < 0$ when $\varepsilon < 1$, we conclude that there exists some $\theta > \lambda$ for which $\nu(\theta) = 0$. The smallest integer solution of (17) is precisely $\lceil \theta \rceil$. To find $\theta$, we solve the equations $\nu(\theta) = 0$ and $\theta > \lambda$. Note that we can write $\nu(\theta) = 0$ as

$$\frac{q}{\theta} \exp\left(\frac{q}{\theta}\right) = q e^{-p}, \tag{36}$$

which is of the form $y \exp(y) = q e^{-p}$, where $y = q/\theta$. The inverse of the mapping $y \mapsto y \exp(y)$ is a well-studied function called the Lambert W-function [19]. In particular, the inverse (which is generally multivalued) in this case is given by

$$\frac{q}{\theta} = W_0(q e^{-p}),$$

where $W_0(t)$ is one of the two branches of the Lambert W-function [19]. This gives us estimate (18).

B. Derivation of (26)
In terms of (24), we can write $\phi_{N,\sigma_r}(f(i-j) - f(i))\, f(i-j)$ as

$$\sigma_r \exp\left(-\frac{f(i)^2}{2\sigma_r^2}\right) \sum_{n=0}^{N-1} \frac{1}{n!}\, G_n(i)\, F_{n+1}(i-j). \tag{37}$$

On substituting (37) in the numerator of (23), and exchanging the summations, we get

$$\sum_{j \in \Omega} w(j)\, \phi_{N,\sigma_r}(f(i-j) - f(i))\, f(i-j) = \exp\left(-\frac{f(i)^2}{2\sigma_r^2}\right) P(i),$$

which gives us (27) where we have used (25). Similarly, on substituting (37) in the denominator of (23), and exchanging the summations, we get

$$\sum_{j \in \Omega} w(j)\, \phi_{N,\sigma_r}(f(i-j) - f(i)) = \exp\left(-\frac{f(i)^2}{2\sigma_r^2}\right) Q(i),$$

where $Q(i)$ is given by (28). Cancelling the common exponential term from the numerator and denominator, we get (26).

C. Derivation of (31)
To establish (31), we write (1) as $f_{BF}(i) = P_1(i)/Q_1(i)$, where

$$P_1(i) = \sum_{j \in \Omega} w(j)\, g_{\sigma_r}(f(i-j) - f(i))\, f(i-j),$$

and

$$Q_1(i) = \sum_{j \in \Omega} w(j)\, g_{\sigma_r}(f(i-j) - f(i)).$$

Similarly, we write (23) as $f_{GPA}(i) = P_2(i)/Q_2(i)$, where

$$P_2(i) = \sum_{j \in \Omega} w(j)\, \phi_{N,\sigma_r}(f(i-j) - f(i))\, f(i-j),$$

and

$$Q_2(i) = \sum_{j \in \Omega} w(j)\, \phi_{N,\sigma_r}(f(i-j) - f(i)).$$

We can then write $f_{BF}(i) - f_{GPA}(i)$ as

$$\frac{P_1(i)\big(Q_2(i) - Q_1(i)\big) + Q_1(i)\big(P_1(i) - P_2(i)\big)}{Q_1(i)\, Q_2(i)} = \frac{1}{Q_2(i)} \Big[ f_{BF}(i)\big(Q_2(i) - Q_1(i)\big) + P_1(i) - P_2(i) \Big]. \tag{38}$$
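The rearrangement in (38) is purely algebraic, so it can be sanity-checked numerically before the individual terms are bounded. The following is a minimal sketch under stated assumptions: the function `stand_in_kernel` is a hypothetical perturbation of the Gaussian used in place of $\phi_{N,\sigma_r}$ (it is not the Gauss-polynomial kernel itself), the weights are those of a box filter as in (4), and the pixel values are random numbers in [0, 255].

```python
import math
import random


def gaussian(t, sigma_r):
    """Gaussian range kernel g_{sigma_r}(t) as in (2)."""
    return math.exp(-t * t / (2.0 * sigma_r * sigma_r))


def stand_in_kernel(t, sigma_r, eps):
    """Hypothetical stand-in for phi_{N,sigma_r}: Gaussian plus a small bounded error."""
    return gaussian(t, sigma_r) + eps * math.sin(t)


def both_sides_of_38(seed=0, size=9, sigma_r=30.0, eps=1e-3):
    rng = random.Random(seed)
    center = rng.uniform(0.0, 255.0)  # plays the role of f(i)
    # Values f(i - j) over the window; j = 0 contributes the center pixel.
    window = [center] + [rng.uniform(0.0, 255.0) for _ in range(size - 1)]
    w = 1.0 / size  # box weights

    p1 = sum(w * gaussian(v - center, sigma_r) * v for v in window)
    q1 = sum(w * gaussian(v - center, sigma_r) for v in window)
    p2 = sum(w * stand_in_kernel(v - center, sigma_r, eps) * v for v in window)
    q2 = sum(w * stand_in_kernel(v - center, sigma_r, eps) for v in window)

    f_bf = p1 / q1
    lhs = f_bf - p2 / q2                             # f_BF(i) - f_GPA(i)
    rhs = (f_bf * (q2 - q1) + p1 - p2) / q2          # right-hand side of (38)
    return lhs, rhs


lhs, rhs = both_sides_of_38()
print(abs(lhs - rhs))  # expected to be at the level of floating-point round-off
```

Because the center pixel is included in the window, both denominators stay well away from zero in this toy setting, which mirrors the role of the lower bound on $Q_2(i)$ in the derivation.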

We uniformly upper-bound (resp. lower-bound) the numerator (resp. denominator) in (38). In particular, note that

$$\|f_{BF}\|_\infty \leq T. \tag{39}$$

This follows from the fact that $f_{BF}(i)$ in (1) can be expressed as a convex combination of $\{f(i-j) : j \in \Omega\}$. On the other hand, $Q_1(i) - Q_2(i)$ is

$$\sum_{j \in \Omega} w(j) \Big[ g_{\sigma_r}(f(i-j) - f(i)) - \phi_{N,\sigma_r}(f(i-j) - f(i)) \Big].$$

Therefore, using (30), we get

$$\|Q_1 - Q_2\|_\infty \leq \|E_{N,\sigma_r}\|_\infty. \tag{40}$$

Similarly,

$$\|P_1 - P_2\|_\infty \leq \|E_{N,\sigma_r}\|_\infty\, T. \tag{41}$$

To uniformly lower-bound $Q_2(i)$, we note that for $i \in I$,

$$Q_1(i) = w(0)\, g_{\sigma_r}(0) + \sum_{j \in \Omega \setminus \{0\}} w(j)\, g_{\sigma_r}(f(i-j) - f(i)) \geq w(0),$$

where we have used the non-negativity of the range and spatial kernels. Using the inverse triangle inequality along with (40), we have for $i \in I$,

$$|Q_2(i)| \geq Q_1(i) - |Q_2(i) - Q_1(i)| \geq w(0) - \|E_{N,\sigma_r}\|_\infty. \tag{42}$$

Combining (38)-(42), we arrive at (31).

D. Derivation of (33)
Note that typically $\delta \ll T$. For example, $T$ is in hundreds for a grayscale image, whereas $\delta \sim 1$. Therefore, it follows from (32) that $\varepsilon \approx w(0)\delta/(2T)$. On the other hand, from (18) and (19), we have

$$N_0 \approx \frac{q}{t - t^2} = \frac{e\lambda}{1 - (q/e\lambda)},$$

where $t = q/e\lambda$ and $q = -\lambda + \log(1/\varepsilon)$. Since $|q| < e\lambda$,

$$\frac{1}{1 - (q/e\lambda)} \approx 1 + (q/e\lambda).$$

Therefore, $N_0 \approx e\lambda + q = (e - 1)\lambda + \log(1/\varepsilon)$.

ACKNOWLEDGEMENTS

The authors thank Dr. Alessandro Foi and the anonymous reviewers for their useful comments and suggestions.

REFERENCES

[1] K. N. Chaudhury, "Fast and accurate bilateral filtering using Gauss-polynomial decomposition," in Proc. IEEE ICIP, Sep. 2015, pp. 2005–2009.
[2] P. Perona and J. Malik, "Scale-space and edge detection using anisotropic diffusion," IEEE Trans. Pattern Anal. Mach. Intell., vol. 12, no. 7, pp. 629–639, Jul. 1990.
[3] C. Tomasi and R. Manduchi, "Bilateral filtering for gray and color images," in Proc. IEEE 6th ICCV, Jan. 1998, pp. 839–846.
[4] S. Paris, P. Kornprobst, J. Tumblin, and F. Durand, "Bilateral filtering: Theory and applications," Found. Trends Comput. Graph. Vis., vol. 4, no. 1, pp. 1–73, 2009.
[5] C. Knaus and M. Zwicker, "Progressive image denoising," IEEE Trans. Image Process., vol. 23, no. 7, pp. 3114–3125, Jul. 2014.
[6] K. N. Chaudhury and K. Rithwik, "Image denoising using optimally weighted bilateral filters: A sure and fast approach," in Proc. IEEE ICIP, Sep. 2015, pp. 108–112.
[7] F. Durand and J. Dorsey, "Fast bilateral filtering for the display of high-dynamic-range images," ACM Trans. Graph., vol. 21, no. 3, pp. 257–266, 2002.
[8] Q. Yang, K.-H. Tan, and N. Ahuja, "Real-time O(1) bilateral filtering," in Proc. IEEE CVPR, Jun. 2009, pp. 557–564.
[9] S. Paris and F. Durand, "A fast approximation of the bilateral filter using a signal processing approach," in Proc. ECCV, 2006, pp. 568–580.
[10] K. Sugimoto and S. I. Kamata, "Compressive bilateral filtering," IEEE Trans. Image Process., vol. 24, no. 11, pp. 3357–3369, Nov. 2015.
[11] F. Porikli, "Constant time O(1) bilateral filtering," in Proc. IEEE CVPR, Jun. 2008, pp. 1–8.
[12] K. N. Chaudhury, D. Sage, and M. Unser, "Fast O(1) bilateral filtering using trigonometric range kernels," IEEE Trans. Image Process., vol. 20, no. 12, pp. 3376–3382, Dec. 2011.
[13] R. Deriche, "Recursively implementating the Gaussian and its derivatives," INRIA, France, Res. Rep. INRIA-00074778, 1993.
[14] K. N. Chaudhury, "Acceleration of the shiftable O(1) algorithm for bilateral filtering and nonlocal means," IEEE Trans. Image Process., vol. 22, no. 4, pp. 1291–1300, Apr. 2013.
[15] J.-M. Muller, Elementary Functions: Algorithms and Implementation. Cambridge, MA, USA: Birkhäuser, 2006.
[16] S. An, F. Boussaid, M. Bennamoun, and F. Sohel, "Quantitative error analysis of bilateral filtering," IEEE Signal Process. Lett., vol. 22, no. 2, pp. 202–206, Feb. 2015.
[17] C. Yang, R. Duraiswami, and N. A. Gumerov, "Improved fast Gauss transform," Dept. Comput. Sci., Univ. Maryland, College Park, MD, USA, Tech. Rep. CS-TR-4495, 2003.
[18] M. Mitzenmacher and E. Upfal, Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge, U.K.: Cambridge Univ. Press, 2005.
[19] R. M. Corless, G. H. Gonnet, D. E. G. Hare, D. J. Jeffrey, and D. E. Knuth, "On the Lambert W function," Adv. Comput. Math., vol. 5, no. 1, pp. 329–359, 1996.
[20] K. Chaudhury and S. Dabhade. Fast and Accurate Bilateral Filtering, MATLAB Central File Exchange, accessed on Mar. 25, 2016. [Online]. Available: http://www.mathworks.com/matlabcentral/fileexchange/56158
[21] Image Databases, accessed on Sep. 2, 2015. [Online]. Available: http://www.imageprocessingplace.com/root_files_V3/image_databases.htm
[22] B. Weiss, "Fast median and bilateral filtering," ACM Trans. Graph., vol. 25, no. 3, pp. 519–526, 2006.

Kunal N. Chaudhury (M'08–SM'14) is currently an Assistant Professor of Electrical Engineering with the Indian Institute of Science. His research areas include fast algorithms for image and video processing, convex optimization models and fast solvers, sensor network localization, multiview registration, and Fourier and wavelet analysis. He is a member of SIAM. He is on the Editorial Board of the SPIE Journal of Electronic Imaging.

Swapnil D. Dabhade received the B.E. degree in electronics and telecommunication engineering from the Vishwakarma Institute of Information Technology, Pune, India, in 2013. He is currently pursuing the M.E. degree in system science and automation with the Indian Institute of Science. He was a project intern with the Tata Institute of Fundamental Research, Ooty, in 2012 and 2013. His research interests include image processing, computer vision, and VLSI and FPGA design.