
The Entropy of Scale-Space

Jon Sporring
(sporring@diku.dk)
Department of Computer Science / University of Copenhagen
Universitetsparken 1 / DK-2100 Copenhagen East
DENMARK

Abstract

Viewing images as distributions of light quanta enables an information theoretic study of image structures on different scales. This article combines Shannon's entropy and Witkin and Koenderink's Scale-Space to establish a precise connection between the Heat Equation and the Thermodynamic Entropy in Scale-Space. Experimentally, the entropy function is used to study global textures.

1. Introduction

A recent paper by Jägersand [3] investigated the so-called Kullback measure as a measure of the change of information in Scale-Space on images. This measure does not perform well in the limit of infinitesimal change, as is shown in appendix A. A similar description is based on the derivative of the entropy with respect to the natural parameter in Scale-Space. In the present paper, this measure will be derived and applied to the study of textures.

Further, and most importantly, the connection between the Heat Equation and the Thermodynamic Entropy is established in the context of Gaussian Scale-Space.

2. Discretizing Distributions

The mean entropy [7] of a discretely defined probability distribution p is defined as,

    \langle S(p) \rangle = -\sum_x p(x) \log p(x)

which is usually interpreted as the mean information content of the distribution. While it is intuitively easy to interpret entropies of discrete distributions, the same is not true for continuously defined distributions. This paper analyses continuously defined distributions by discretization. Such a discretization can be performed as,

    p_{\Delta x}(i) = \Delta x \, \langle p(x = i \Delta x) \rangle_{\Delta x}

where i is an integer, \Delta x is the discretization constant, and \langle p(x) \rangle_{\Delta x} is the mean value of p(x) in the x \pm \frac{\Delta x}{2} interval. This corresponds to sampling p convolved with a box filter at points spaced according to \Delta x, where the width of the box filter also is \Delta x. Many arguments from various fields such as Numerical Analysis [10, p. 606ff], Statistical Inference [4], and Physics [6] indicate that the least committed choice of a sampling function is not the box filter, but the Gaussian filter, i.e.

    p_t(i) = (G_t * p)\big|_{x = i \Delta x_t}

where the sampling factor now is proportional to the standard deviation of the Gaussian function, i.e. \Delta x_t = c \sqrt{t}, where c is an arbitrary constant. The entropy thus becomes,

    \langle S_t(p) \rangle = -\sum_i \Delta x_t \, p_t(i) \log (\Delta x_t \, p_t(i))
                           = -\log \Delta x_t - \Delta x_t \sum_i p_t(i) \log p_t(i)

The last term can be approximated as an integral for easier study by averaging over each interval i. Thus,

    \langle S_t(p) \rangle \simeq -\log \Delta x_t - \int_x p_t(x) \log p_t(x) \, dx
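As an added illustration (not from the paper), the following Python sketch discretizes a one-dimensional density in the way just described: smooth with G_t, sample at spacing \Delta x_t = c \sqrt{t}, and compute the mean entropy of the cell masses \Delta x_t p_t(i). The test density and the constant c = 1 are arbitrary choices for the example.

    import numpy as np
    from scipy.ndimage import gaussian_filter1d

    def discretized_entropy(p, dx, t, c=1.0):
        # p: samples of a continuous density on a fine grid with spacing dx.
        # Smooth with G_t (sigma = sqrt(t)), then resample at spacing dx_t = c*sqrt(t).
        sigma = np.sqrt(t)
        p_t = gaussian_filter1d(p, sigma=sigma / dx, mode='constant')
        dx_t = c * sigma
        step = max(1, int(round(dx_t / dx)))
        cells = dx * step * p_t[::step]          # approximate cell masses dx_t * p_t(i)
        cells = cells[cells > 0]
        return -np.sum(cells * np.log(cells))    # <S_t(p)> in nats

    # Example: a mixture of two Gaussians sampled on a fine grid.
    x = np.linspace(-20, 20, 4001)
    dx = x[1] - x[0]
    p = 0.5 * np.exp(-0.5 * (x - 3) ** 2) + 0.5 * np.exp(-0.5 * (x + 3) ** 2)
    p /= np.trapz(p, x)
    for t in (0.25, 1.0, 4.0):
        print(t, discretized_entropy(p, dx, t))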
3. A Spatial Distribution of Image Points

Think of an image as a realisation of spatially distributed light quanta. Each point in the image registers the number of light quantum hits. Equivalently, each point is the (unnormalised) probability of light. In this way we may define,

    p(x) = \frac{I(x)}{\sum_x I(x)}

This spatial probability perspective corresponds very nicely with the theory of Scale-Space [13, 6]. At coarse resolution the spatial smoothing is high or, equivalently, the spatial uncertainty is high and the spatial probability distribution is close to uniform. What is further shown in appendix B is that the spatial entropy complies with the molecular entropy of thermodynamics. This is of importance since the Gaussian Scale-Space can be viewed as a physical model of sampling, as Heat-Diffusion, and thus the study of the Thermodynamic Entropy is one important aspect of this model.

The above distribution is completely invariant to multiplicative constants as a consequence of the normalization, i.e.

    \frac{k I(x)}{\sum_x k I(x)} = \frac{I(x)}{\sum_x I(x)}

However, it is not invariant to additive constants. In brief, the change in entropy when adding a constant to the image, I(x) = I'(x) + c, is,

    \frac{\partial S(I)}{\partial c} = -\frac{1}{\langle I \rangle} \left( S + \langle \log I \rangle \right)

where \langle \cdot \rangle is the mean value operator. This function is strictly positive for discrete images, since S can be shown to be less than -\langle \log I \rangle, see e.g. Rissanen [11]. An interpretation of this is that an additive constant changes the function proportionally to the difference between the mean of \log I under the I distribution and under the uniform distribution.
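The two invariance statements are easy to check numerically; the following small Python example (added here, not part of the paper) shows that scaling the image leaves the entropy of the normalised distribution unchanged, while an additive offset does not.

    import numpy as np

    def entropy_of_image(I):
        # Entropy of the spatial distribution p(x) = I(x) / sum_x I(x), in nats.
        p = I / I.sum()
        p = p[p > 0]
        return -np.sum(p * np.log(p))

    I = np.random.poisson(lam=5.0, size=(64, 64)).astype(float) + 1.0  # keep I > 0

    print(entropy_of_image(I), entropy_of_image(3.0 * I))   # identical: multiplicative invariance
    print(entropy_of_image(I), entropy_of_image(I + 10.0))  # differ: an additive offset raises the entropy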
4. Entropy Change: A characteristic function

As shown in appendix B, the entropy as a function of scale is a monotonically growing function, starting with the entropy at scale 'zero' and ending at maximum entropy when the mean value image has been reached. The characteristic functionality is thus in the derivatives of the entropy with respect to the natural scale parameter \tau = \log t = 2 \log \sigma. The natural scale parameter is defined in Florack [6, 2]. The entropy change is thus,

    \frac{d \langle S_t \rangle}{d\tau} = -\int_{-\infty}^{\infty} (1 + \log p_t(x)) \frac{\partial p_t(x)}{\partial \tau} \, dx - \frac{1}{2}
                                        = -t \int_{-\infty}^{\infty} (1 + \log p_t(x)) \frac{\partial p_t(x)}{\partial t} \, dx - \frac{1}{2}

It can be seen that the spatial quantization has only a constant effect on the information change and will hereafter be ignored. Using the image as a distribution, p_t = \frac{1}{c} I_t = \frac{G_t * I}{c}, where c = \int I(x) \, dx, and \frac{\partial I}{\partial t} = \frac{1}{2} \nabla^2 I (the Heat Equation), where \nabla^2 is the Laplacian operator, thus yields,

    \frac{d \langle S_t \rangle}{d\tau} = -\frac{t}{2c} \int_{-\infty}^{\infty} (k + \log I_t(x)) \nabla^2 I_t(x) \, dx - \frac{1}{2}

where k = 1 - \log c. The spatial information change is tightly coupled to the second derivatives, as should be expected when using Scale-Space. Numerically, the second order derivatives can only be evaluated at scales larger than the inner scale (\sigma \geq 1). On the other hand, experiments have shown that significant change can take place at lower scales. Luckily, experiments also seem to indicate that the entropy as a function of scale is a smooth function (for large images), and thus a numerical differentiation of the entropy function can be used at lower scales instead.
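The following Python sketch is one way to carry out that numerical differentiation (an added illustration, not the author's code): it assumes t = \sigma^2, uses a Gaussian filter with periodic ('wrap') boundaries as a stand-in for the torus assumption of Section 5, and differentiates the sampled entropy with respect to \tau = \log t.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def entropy(p):
        # <S> = -sum p log p over the strictly positive part of the distribution.
        p = p[p > 0]
        return -np.sum(p * np.log(p))

    def entropy_scale_function(I, log_t):
        # Entropy of the smoothed, normalised image at each scale t = sigma^2,
        # with periodic boundaries standing in for the torus assumption.
        S = []
        for t in np.exp(log_t):
            I_t = gaussian_filter(I, sigma=np.sqrt(t), mode='wrap')
            S.append(entropy(I_t / I_t.sum()))
        return np.array(S)

    I = np.random.rand(128, 128)              # placeholder texture image
    log_t = np.linspace(-3.0, 5.0, 81)        # natural scale parameter tau = log t
    S = entropy_scale_function(I, log_t)
    dS_dtau = np.gradient(S, log_t)           # numerical entropy change, d<S_t>/dtau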

5. Experiments

The entropy change by scale is a global measure, and thus the experiments performed concentrate on the characterization of globally defined texture images. Furthermore, Gauss-convolution has been implemented using the Fast Fourier Transform, and therefore the image is implicitly assumed to lie on a torus. This will give some 'peculiar' boundary effects. This boundary will increase in size with scale while the information content will decrease. For large images this is not expected to have a dominating effect on the result. All images used in these experiments are from the 'VisTex' collection [9].

Figure 1 shows examples of the non-resampled entropy functions, i.e. the \log \Delta x_t term has been ignored. Note the monotonicity of the entropy function, and the regularity of the first derivative with respect to logarithmic scale.

A first order effect to expect from simple textures is that the point of maximal entropy change should correspond to the size of the dominating image structures. Figure 2a-b shows images from a lab experiment: the camera is placed fronto-parallel to a plane with a very simple texture. A sequence is then produced as a series of increasing zoom values. In figure 2c the standard deviation at the point of maximum entropy change has been plotted against the mean size of the small blobs. As can be seen, the relation is linear except in the last few images, where the larger blob starts to have an effect on the maximum. The l_2 distance error from linearity to the shown line is 0.4854.

The higher order moments of the entropy change function do not grow monotonically with increased inner scale. The main reason for this is that different structures dominate at different inner scales. But comparing the entropy change functions in figure 1 with 2, it is seen that the difference in the finer structure results in a difference in the functions' 'skew'. It is the experience of this author, based on textures from VisTex, that the entropy change function is indeed a characteristic function of a particular texture.
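The per-panel statistics quoted in Figure 1 (max, mean, var, skew, kurt) can be read as moments of the entropy change curve; how exactly they were computed is not stated in the recovered text, so the Python sketch below is only one plausible reading, treating the non-negative part of d\langle S_t \rangle/d\tau as a density over \tau.

    import numpy as np

    def change_function_descriptors(dS_dtau, log_t):
        # Treat the (non-negative part of the) entropy-change curve as a density over
        # tau = log t and report its first four standardised moments plus the maximum.
        w = np.clip(dS_dtau, 0.0, None)
        w = w / np.trapz(w, log_t)
        mean = np.trapz(w * log_t, log_t)
        var = np.trapz(w * (log_t - mean) ** 2, log_t)
        skew = np.trapz(w * (log_t - mean) ** 3, log_t) / var ** 1.5
        kurt = np.trapz(w * (log_t - mean) ** 4, log_t) / var ** 2 - 3.0
        i = np.argmax(dS_dtau)
        return {'max': (log_t[i], dS_dtau[i]), 'mean': mean,
                'var': var, 'skew': skew, 'kurt': kurt}

With the arrays from the previous sketch this would be called as change_function_descriptors(dS_dtau, log_t), giving one descriptor set per texture.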
[Figure 1 here: for each of the textures Fabric_0005, Fabric_0007, Fabric_0008 and Fabric_0012 — (a) the image (x, y in pixels), (b) the entropy (bits) versus scale (log(pixels^2)), and (c) the entropy change (bits/log(pixel^2)) versus scale, annotated with the position of its maximum and the mean, variance, skew and kurtosis of the change function: Fabric_0005 — max (-0.49, 0.0012), mean 1, var 3.6, skew 0.28, kurt -1.3; Fabric_0007 — max (0.012, 0.00066), mean 0.095, var 0.99, skew 1.1, kurt 2.9; Fabric_0008 — max (1.4, 0.0028), mean 0.66, var 1, skew -0.37, kurt -0.048; Fabric_0012 — max (-0.46, 0.00098), mean 0.35, var 1.6, skew 0.28, kurt -0.58.]
Figure 1. Examples of the entropy-scale function. The a's are the image/distribution, the b's the entropy as a function of logarithmic scale, and the c's the entropy change with respect to the logarithmic scale.
[Figure 2 here: (a) level006.b and (b) level015.b (x, y in pixels), and (c) the standard deviation at the maximum (pixels) plotted against the mean blob size (pixels).]
Figure 2. The images in (a) and (b) are the first and the last from a sequence of images of a fronto-parallel plane with circular texture taken with different zoom values. The graph in (c) is the point of maximum entropy change plotted against the estimated 'blob' expansion.

6. Discussion

This paper has empirically indicated that the derivative of the entropy of the spatial image distribution with respect to the natural scale parameter, \log t, is a characteristic function of textures.

The formal correspondence of the entropy in Scale-Space and the Thermodynamic Entropy has been established. By the second law of thermodynamics it follows that the spatial entropy is an increasing function of scale.

One application of the entropy is scale-selection: scales of maximal information loss are the scales at which the dominating image content deteriorates fastest. It might even be possible to globally distinguish several dominating scales. Another application is quantization: using the entropy change, a spatial down-sampling (a pyramid) can be calculated in such a way that the information loss is constant. This refines the a priori natural logarithmic scale by image content.

A natural extension of this paper would be local entropies, e.g. for texture segmentation. It is of course straightforward to window the images, but this would not be a 'true' local method. Another important matter is that the presented entropy function is just one of a continuum of information measuring functions. These two extensions are currently being investigated.
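A minimal sketch of the quantization idea under this reading (an added illustration, not code from the paper): given a sampled entropy-scale curve such as the one computed in the earlier sketch, pyramid levels can be placed so that consecutive levels differ by a constant entropy increment.

    import numpy as np

    def constant_information_levels(S, log_t, n_levels):
        # Pick scales tau_k so that the entropy increases by the same amount
        # between consecutive pyramid levels (constant information loss).
        # Assumes S is monotonically increasing in log_t, as argued in appendix B.
        targets = np.linspace(S[0], S[-1], n_levels)
        return np.interp(targets, S, log_t)

    S = np.linspace(10.0, 12.5, 81)        # placeholder for an actual entropy-scale curve
    log_t = np.linspace(-3.0, 5.0, 81)
    print(constant_information_levels(S, log_t, 6))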
7. Acknowledgements

To end, I would like to mention that this article would never have come about without the many and very useful insights by Mads Nielsen and Luc Florack. Finally, Niels Holm Olsen should be acknowledged for his help with some of the programming.

A. The Kullback divergence

The Kullback divergence has been used by Jägersand [3] to measure the change of information when images are viewed at different scales in Scale-Space. The Kullback divergence is given as,

    K[p; q] = \int p(x) \log p(x) - p(x) \log q(x) \, dx = \int p(x) \log \frac{p(x)}{q(x)} \, dx

where p and q are two distributions defined on identical N-dimensional domains to which x belongs. Historically, the Kullback divergence is a measure of the waste of bandwidth when coding with an incorrect distribution.

In the case of the spatial distributions in Scale-Space, p and q belong to a continuous one-parameter family of distributions, and hence in the limit of infinitesimal change the change in Kullback is,

    K[p_a; p_{a+\epsilon}] = \int p_a(x) \log \frac{p_a(x)}{p_{a+\epsilon}(x)} \, dx = \epsilon \int p_a(x) \, \frac{\log p_a(x) - \log p_{a+\epsilon}(x)}{\epsilon} \, dx

It is now straightforward to see that,

    \lim_{\epsilon \to 0} \frac{K[p_t; p_{t(1+\epsilon)}]}{\epsilon} = -t \int p_t(x) \frac{\partial \log p_t(x)}{\partial t} \, dx = -t \int \frac{\partial p_t(x)}{\partial t} \, dx

Thus the value of K only depends on the values of p_t at the limits of the integral. The Kullback measure is therefore not a global measure of the information change in Scale-Space in the infinitesimal limit.
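An added numerical illustration of this appendix (assuming Gaussian smoothing with t = \sigma^2 and periodic boundaries, where the boundary term vanishes): the Kullback divergence between the spatial distributions at nearby scales, divided by \epsilon, tends to zero and so carries no global information about the image.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def spatial_distribution(I, t):
        # Smooth with the Gaussian of variance t (periodic boundaries) and normalise.
        I_t = gaussian_filter(I, sigma=np.sqrt(t), mode='wrap')
        return I_t / I_t.sum()

    def kullback(p, q):
        return np.sum(p * (np.log(p) - np.log(q)))

    I = np.random.rand(128, 128) + 0.1
    t = 4.0
    for eps in (1e-1, 1e-2, 1e-3):
        p = spatial_distribution(I, t)
        q = spatial_distribution(I, t * (1.0 + eps))
        print(eps, kullback(p, q) / eps)   # tends towards zero as eps shrinks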
B. Monotonicity of Entropy by Scale

The spatial entropy is equivalent to the thermodynamic entropy as defined in statistical mechanics. Since both Scale-Space and the spatial entropy mimic thermodynamic processes, the second law of thermodynamics ensures that the spatial entropy is an increasing function of scale. This formal relation will be established in the following. For a reference on thermodynamics and statistical mechanics see for example Atkins [1] and Kittel [5].

The spatial entropy of an image is given as,

    \langle S_t(I) \rangle = -\sum_x \frac{I_t(x)}{\sum_x I_t(x)} \log \frac{I_t(x)}{\sum_x I_t(x)}

One of the basic assumptions of statistical mechanics is the indifference of thermodynamics to the position of the system under study. The number of different arrangements of a fixed number of subsystems is signified by the multiplicity function,

    g(\{n_i\}) = \frac{N!}{\prod_i n_i!}

where n_i is the number of particles in subsystem i and N = \sum_i n_i is the total number of particles in the thermodynamically closed system under study. The thermodynamic (dimensionless) entropy is given as,

    S = \log g

which can be simplified using Stirling's approximation,

    x! \simeq \sqrt{2\pi} \, x^{x + \frac{1}{2}} \exp(-x + \dots)

where the higher order terms can be neglected for x greater than about 10. The thermodynamic entropy is thus approximated as,

    S = \log g = \log N! - \sum_i \log n_i!
      = \frac{1-k}{2} \log 2\pi + (N + \tfrac{1}{2}) \log N - N - \sum_{i=1}^{k} (n_i + \tfrac{1}{2}) \log n_i + \sum_{i=1}^{k} n_i
      = \frac{1-k}{2} \log 2\pi + (N + \tfrac{1}{2}) \log N - \sum_{i=1}^{k} (n_i + \tfrac{1}{2}) \log \frac{n_i}{N} - (N + \tfrac{k}{2}) \log N
      = \frac{1-k}{2} \log 2\pi N - \sum_{i=1}^{k} (n_i + \tfrac{1}{2}) \log \frac{n_i}{N}
      = c + N \langle S(N) \rangle - \frac{1}{2} \sum_{i=1}^{k} \log \frac{n_i}{N}

where k is the number of subsystems and c = \frac{1-k}{2} \log 2\pi N collects the constant terms. For large N's the spatial entropy part will be dominating, or in other words, when n_i \gg \tfrac{1}{2}, this simplifies to,

    S \simeq c + N \langle S(N) \rangle

The second law of thermodynamics states that the entropy of a closed system will grow monotonically with time towards equilibrium. Scale-Space is governed by the heat diffusion of a perfect gas, where each intensity value signifies the number of molecules at each spatial position (subsystem). Hence the spatial entropy will grow monotonically with time.
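A small numerical check of this approximation (an added illustration; the subsystem counts are arbitrary): the log-multiplicity is compared with N times the spatial entropy of the count distribution.

    import numpy as np
    from math import lgamma

    def log_multiplicity(n):
        # log g({n_i}) = log N! - sum_i log n_i!  via the log-gamma function.
        N = n.sum()
        return lgamma(N + 1) - sum(lgamma(v + 1) for v in n)

    n = np.random.randint(50, 500, size=100)      # particle counts per subsystem
    N = n.sum()
    p = n / N
    S_spatial = -np.sum(p * np.log(p))            # spatial (distribution) entropy
    print(log_multiplicity(n) / (N * S_spatial))  # close to 1 when all n_i >> 1/2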
References

[1] P. W. Atkins. Physical Chemistry. Oxford University Press, 1990.
[2] L. Florack, B. ter Haar Romeny, J. Koenderink, and M. Viergever. Linear scale-space. Journal of Mathematical Imaging and Vision, 4:325-351, 1994.
[3] M. Jägersand. Saliency maps and attention selection in scale and spatial coordinates: An information theoretic approach. In Fifth International Conference on Computer Vision, pages 195-202. IEEE Computer Society Press, June 1995.
[4] E. T. Jaynes. Prior probabilities. IEEE Transactions on Systems Science and Cybernetics, 4(3):227-241, 1968.
[5] C. Kittel and H. Kroemer. Thermal Physics. W. H. Freeman and Company, New York, 1980.
[6] J. J. Koenderink. The structure of images. Biological Cybernetics, 50:363-370, 1984.
[7] M. Li and P. Vitányi. An Introduction to Kolmogorov Complexity and its Applications. Springer-Verlag, 1993.
[8] T. Lindeberg. Scale-Space Theory in Computer Vision. Kluwer Academic Publishers, The Netherlands, 1994.
[9] R. Picard, C. Graczyk, S. Mann, J. Wachman, L. Picard, and L. Campbell. VisTex. Via ftp:whitechapel.media.mit.edu, 1995. Copyright 1995 Massachusetts Institute of Technology.
[10] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery. Numerical Recipes in C. Cambridge University Press, 1992.
[11] J. Rissanen. Stochastic Complexity in Statistical Inquiry. World Scientific, 1989.
[12] B. M. ter Haar Romeny, editor. Geometry-Driven Diffusion in Computer Vision. Kluwer Academic Publishers, The Netherlands, 1994.
[13] A. P. Witkin. Scale space filtering. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), Karlsruhe, Germany, 1983.