
Digital Signal Processing 43 (2015) 17–27


Robust image hashing with embedding vector variance of LLE


Zhenjun Tang a,b,∗, Linlin Ruan b, Chuan Qin c, Xianquan Zhang a,b,d, Chunqiang Yu d

a Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin 541004, PR China
b Department of Computer Science, Guangxi Normal University, Guilin 541004, PR China
c School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, PR China
d Network Center, Guangxi Normal University, Guilin 541004, PR China

Article info

Article history:
Available online 9 May 2015
Keywords:
Image hashing
Robust hashing
Locally linear embedding
Data reduction
Secondary image
CIE L*a*b* color space

Abstract
Locally linear embedding (LLE) has been widely used in data processing, such as data clustering, video identification and face recognition, but its application to image hashing is still limited. In this work, we investigate the use of LLE in image hashing and find that embedding vector variances of LLE are approximately linearly changed by content-preserving operations. Based on this observation, we propose a novel LLE-based image hashing. Specifically, an input image is firstly mapped to a normalized matrix by bilinear interpolation, color space conversion, block mean extraction, and Gaussian low-pass filtering. The normalized matrix is then exploited to construct a secondary image. Finally, LLE is applied to the secondary image and the embedding vector variances of LLE are used to form the image hash. Hash similarity is determined by the correlation coefficient. Many experiments are conducted to validate our efficiency, and the results illustrate that our hashing is robust to content-preserving operations and reaches good discrimination. Comparisons of receiver operating characteristics (ROC) curves indicate that our hashing outperforms some notable hashing algorithms in classification between robustness and discrimination.
© 2015 Elsevier Inc. All rights reserved.

1. Introduction
Nowadays, the popularization of imaging devices, such as smart phones, digital cameras and scanners, provides us with more and more digital images. Consequently, efficient techniques are needed for storing and retrieving hundreds of thousands of images. Meanwhile, it is easy to copy, edit and distribute images via powerful tools and the Internet. Therefore, digital rights management (DRM) (image authentication, image forensics, copyright protection, etc.) is in demand. All these practical issues have led to the emergence of image hashing. Image hashing is a novel technology for mapping an input image into a short string called an image hash. It not only allows us to retrieve images from large-scale databases, but can also be applied to DRM. In fact, it has been widely used in image authentication [1], digital watermarking [2], image copy detection, tamper detection, image indexing [3], image retrieval, image forensics [4], and image quality assessment [5].
Generally, image hashing has two basic properties [6-8]. The first one is perceptual robustness. It requires that, for visually identical images, image hashing should generate the same or very similar image hashes no matter whether their digital representations are the same or not. This means that image hashing must be robust against content-preserving operations, such as JPEG compression, brightness adjustment, contrast adjustment, watermark embedding and image scaling. The second property is called discrimination. It implies that image hashing should extract different hashes from different images. In other words, the similarity between hashes of different images should be small enough. Note that the two properties contradict each other [8]. The first property requires robustness under small perturbations, whereas the second property amounts to minimization of the collision probability for images with different contents. High-performance algorithms should reach a good trade-off between the two properties. In addition to the basic properties, image hashing should have further properties in specific applications. For example, it should be key-dependent for image authentication [9].

∗ Corresponding author at: Department of Computer Science, Guangxi Normal University, No. 15 YuCai Road, Guilin 541004, PR China.
E-mail address: tangzj230@163.com (Z. Tang).
http://dx.doi.org/10.1016/j.dsp.2015.05.002
1051-2004/© 2015 Elsevier Inc. All rights reserved.
Due to the wide use of image hashing, many researchers have paid attention to hashing techniques. For example, Venkatesan et al. [10] exploited statistics of discrete wavelet transform (DWT) coefficients to generate image hashes. This hashing is robust to JPEG compression and small-angle rotation, but sensitive to gamma correction and contrast adjustment. Lefebvre et al. [11] pioneered the use of the Radon transform (RT) for hash extraction. This scheme can resist geometric transforms, such as rotation and scaling, but its discriminative capability is limited. Kozat et al. [12] viewed images and attacks as a sequence of linear operators and presented


Fig. 1. Block diagram of our image hashing.

an image hashing with two singular value decompositions (SVDs). The SVD-SVD hashing can tolerate geometric transforms at the cost of significantly decreased discrimination. In another study, Swaminathan et al. [13] proposed to calculate image hashes based on coefficients of the Fourier-Mellin transform. This algorithm is robust against moderate geometric transforms and filtering. Monga and Mihcak [14] were the first to use non-negative matrix factorization (NMF) to derive image hashing. This method is robust against many popular digital operations, but sensitive to watermark embedding. Tang et al. [15] found an invariant relation in the NMF coefficient matrix and exploited it to design hashing. This scheme is resistant to JPEG compression and watermark embedding, but sensitive to image rotation. In another work, Ou et al. [16] used the RT combined with the discrete cosine transform (DCT) to generate image hashes. The RT-DCT hashing is resilient to image rotation, but its discrimination is not good enough. Kang et al. [17] introduced a compressive sensing-based image hashing. This method is also sensitive to image rotation. Recently, Sun et al. [18] presented a robust image hashing using relations in the weight matrix of locally linear embedding (LLE). This LLE-based hashing can tolerate JPEG compression, but is fragile to rotation, and its discrimination is also not good enough. Tang et al. [19] exploited structural features to extract image hashes and introduced a novel similarity metric for tampering detection. This method is also sensitive to image rotation. In [20], Li et al. extracted image hashes by random Gabor filtering (GF) and dithered lattice vector quantization (LVQ). The GF-LVQ hashing performs better than the well-known algorithms [13,17], but its discrimination is also not desirable enough. Zhao et al. [21] exploited Zernike moments (ZM) to calculate image hashes. The ZM-based hashing only tolerates rotation within 5°. Tang et al. [22] investigated the use of color vector angle (CVA) and then exploited CVA and DWT to design image hashing. The CVA-DWT hashing is also robust to rotation within 5°, but its discrimination can be improved.
Although many hashing algorithms have been reported, there are still some problems in hashing design. For example, more efforts are still needed to develop high-performance algorithms that reach a desirable balance between robustness and discrimination. In this work, we propose a novel LLE-based image hashing which achieves a good trade-off between robustness and discrimination. The key technique of our work is an innovative use of LLE, based on the property that embedding vector variances are approximately linearly changed by content-preserving operations. Since LLE can efficiently learn the global structure of nonlinear manifolds and discover compact representations of high-dimensional data, the use of LLE provides our hashing with good discrimination. Many experiments are conducted to validate the efficiency of our hashing. The results illustrate that our hashing is robust against popular digital operations and reaches good discrimination. Comparisons indicate that our hashing outperforms some notable algorithms in classification between robustness and discrimination.
The rest of this paper is organized as follows. Section 2 introduces the proposed image hashing. Section 3 presents experimental results and Section 4 discusses performance comparisons. Finally, conclusions are drawn in Section 5.

2. Proposed image hashing


Our proposed image hashing is a three-step method, whose block diagram is presented in Fig. 1. In the first step, our hashing converts the input image into a normalized matrix by preprocessing. In the second step, our method constructs a secondary image from the normalized matrix. Finally, we apply LLE to the secondary image and exploit the LLE results to produce the image hash. Details of these steps are described as follows.
2.1. Preprocessing
To make a normalized image for constructing the secondary image, some digital operations are applied to the input image. Firstly, bilinear interpolation is used to resize the input image to a standard size M × M, which makes our method robust against image rescaling. For an RGB color image, the resized image is then converted into the CIE L*a*b* color space and the L* component is taken to represent the resized image. We choose the L* component for image representation because the CIE L*a*b* color space is perceptually uniform and the L* component closely matches human perception of lightness [23,24]. For each image pixel, let L* be the color lightness, and a* and b* be the chromaticity coordinates, respectively. Thus, color space conversion [23] can be done by the following rules:

L^* = \begin{cases} 116\,(Y_1/Y_0)^{1/3} - 16, & \text{if } Y_1/Y_0 > 0.008856 \\ 903.3\,(Y_1/Y_0), & \text{otherwise} \end{cases}    (1)

a^* = 500\,[\,f(X_1/X_0) - f(Y_1/Y_0)\,]    (2)

b^* = 200\,[\,f(Y_1/Y_0) - f(Z_1/Z_0)\,]    (3)

where X_0 = 0.950456, Y_0 = 1.0 and Z_0 = 1.088754 are the CIE XYZ tristimulus values of the reference white point, and f(t) is defined as:

f(t) = \begin{cases} t^{1/3}, & \text{if } t > 0.008856 \\ 7.787\,t + 16/116, & \text{otherwise} \end{cases}    (4)

X_1, Y_1 and Z_1 are the CIE XYZ tristimulus values [24], which are calculated by the equation:

\begin{bmatrix} X_1 \\ Y_1 \\ Z_1 \end{bmatrix} = \begin{bmatrix} 0.4125 & 0.3576 & 0.1804 \\ 0.2127 & 0.7152 & 0.0722 \\ 0.0193 & 0.1192 & 0.9502 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}    (5)

where R, G and B are the red, green and blue components of each image pixel, respectively.
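The conversion of Eqs. (1)-(5) can be sketched in Python as follows; `rgb_to_lab` is our own name for the routine, and RGB channel values are assumed to be already scaled to [0, 1]:

```python
import numpy as np

# Sketch of the color space conversion of Eqs. (1)-(5); names are ours.
M_RGB2XYZ = np.array([[0.4125, 0.3576, 0.1804],
                      [0.2127, 0.7152, 0.0722],
                      [0.0193, 0.1192, 0.9502]])
X0, Y0, Z0 = 0.950456, 1.0, 1.088754   # reference white point

def f(t):
    # Eq. (4)
    return np.where(t > 0.008856, np.cbrt(t), 7.787 * t + 16.0 / 116.0)

def rgb_to_lab(rgb):
    """Map an H x W x 3 RGB array (values in [0, 1]) to CIE L*a*b*."""
    xyz = rgb @ M_RGB2XYZ.T               # per-pixel Eq. (5)
    x = xyz[..., 0] / X0
    y = xyz[..., 1] / Y0
    z = xyz[..., 2] / Z0
    L = np.where(y > 0.008856, 116.0 * np.cbrt(y) - 16.0, 903.3 * y)  # Eq. (1)
    a = 500.0 * (f(x) - f(y))             # Eq. (2)
    b = 200.0 * (f(y) - f(z))             # Eq. (3)
    return L, a, b
```

Only the L* component produced here is kept for the subsequent steps.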
Next, the L* component is divided into non-overlapping blocks of a small size s_1 × s_1. For simplicity, let M be an integral multiple of s_1, and M_1 = M/s_1. Thus, to make an initial compression, we calculate block means and use these means to construct a feature matrix as follows:

F = \begin{bmatrix} \mu_{1,1} & \mu_{1,2} & \cdots & \mu_{1,M_1} \\ \mu_{2,1} & \mu_{2,2} & \cdots & \mu_{2,M_1} \\ \vdots & \vdots & \ddots & \vdots \\ \mu_{M_1,1} & \mu_{M_1,2} & \cdots & \mu_{M_1,M_1} \end{bmatrix}    (6)

where \mu_{i,j} is the mean of the block in the i-th row and the j-th column of the L* component (1 ≤ i ≤ M_1, 1 ≤ j ≤ M_1). This operation not only achieves initial compression, but also makes our


Fig. 2. Instance of the preprocessing.

method resistant to small-angle rotation, which can be understood as follows. Although small-angle rotation alters pixel positions, pixels in a small region have similar values, so the block means will not be significantly changed. Note that a big block size helps to improve robustness against large-angle rotation, but a bigger block size leads to fewer features in F, which inevitably hurts discrimination. In the experiments, we choose 2 × 2 as the block size, which reaches a desirable balance between robustness and discrimination. Finally, a rotationally symmetric Gaussian low-pass filter is applied to the matrix F. This reduces the influence of digital operations on F. In practice, the elements of the Gaussian low-pass filter can be calculated by:

G(i, j) = \frac{G^{(1)}(i, j)}{\sum_{i}\sum_{j} G^{(1)}(i, j)}    (7)

in which G^{(1)}(i, j) is defined as

G^{(1)}(i, j) = e^{-\frac{i^2 + j^2}{2\sigma^2}}    (8)

where σ is a given standard deviation of the Gaussian distribution. For example, if the filter size is 3 × 3, then -1 ≤ i ≤ 1 and -1 ≤ j ≤ 1. Fig. 2 presents an instance of the preprocessing with M = 512, s_1 = 2, and a 3 × 3 filter size, where (a) is the original input image, (b) is the resized image, (c) is the L* component of (b), (d) is the matrix F, and (e) is its blurred version.

Fig. 3. Schematic diagram of secondary image construction.
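The block-mean extraction of Eq. (6) and the Gaussian filtering of Eqs. (7)-(8) can be sketched as follows; function names are ours, and the bilinear resize step is omitted (any standard routine can produce the M × M input):

```python
import numpy as np

# Sketch of the remaining preprocessing: block means (Eq. (6)) and
# normalized Gaussian low-pass filtering (Eqs. (7)-(8)).
def block_means(L_comp, s1=2):
    """Divide the M x M L* component into s1 x s1 blocks and return
    the M1 x M1 matrix F of block means (M assumed a multiple of s1)."""
    M = L_comp.shape[0]
    M1 = M // s1
    return L_comp.reshape(M1, s1, M1, s1).mean(axis=(1, 3))

def gaussian_kernel(size=3, sigma=1.0):
    """Rotationally symmetric Gaussian kernel, normalized to sum 1."""
    r = size // 2
    ax = np.arange(-r, r + 1)
    i, j = np.meshgrid(ax, ax, indexing='ij')
    g1 = np.exp(-(i**2 + j**2) / (2.0 * sigma**2))   # Eq. (8)
    return g1 / g1.sum()                             # Eq. (7)

def smooth(F, kernel):
    """2-D correlation of F with the kernel, symmetric border padding."""
    r = kernel.shape[0] // 2
    Fp = np.pad(F, r, mode='symmetric')
    out = np.zeros_like(F, dtype=float)
    for di in range(-r, r + 1):
        for dj in range(-r, r + 1):
            out += kernel[di + r, dj + r] * Fp[r + di:r + di + F.shape[0],
                                               r + dj:r + dj + F.shape[1]]
    return out
```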
2.2. Secondary image construction
To construct a secondary image for data reduction, we randomly select N blocks sized n × n from F under the control of a secret key. We view each block as a high-dimensional vector of size n² × 1 obtained by concatenating the block entries column by column. Let x_i be the vector corresponding to the i-th block (1 ≤ i ≤ N). Thus, we obtain the secondary image X as follows:

X = [x_1, x_2, \ldots, x_N]    (9)

Note that, during block selection, there can be overlapping regions between blocks. However, identically selected blocks are discarded, since duplicate vectors are not wanted in the secondary image. Compared with the input image, the secondary image has fewer columns. As the column number equals the vector number, which is kept unchanged during data reduction, a small vector number helps to make a short image hash. Fig. 3 is the schematic diagram of secondary image construction.
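Under these rules, secondary-image construction can be sketched as follows; the names are ours, and NumPy's seeded generator stands in for the secret-key-driven selection (an assumption):

```python
import numpy as np

# Sketch of secondary-image construction (Eq. (9)): N blocks of size
# n x n are chosen from the M1 x M1 matrix F under a secret key, each
# flattened column by column; duplicate block positions are discarded.
def build_secondary_image(F, N=50, n=50, key=12345):
    rng = np.random.default_rng(key)      # keyed pseudo-random generator
    M1 = F.shape[0]
    positions = set()
    cols = []
    while len(cols) < N:
        r, c = rng.integers(0, M1 - n + 1, size=2)
        if (r, c) in positions:           # same block selected twice: skip
            continue
        positions.add((r, c))
        block = F[r:r + n, c:c + n]
        cols.append(block.flatten(order='F'))   # column-by-column
    return np.column_stack(cols)          # X has shape n^2 x N
```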
2.3. Locally linear embedding
Locally linear embedding (LLE) [25] is a well-known algorithm for non-linear dimensionality reduction. It can efficiently discover compact representations of high-dimensional data by computing low-dimensional, neighborhood-preserving embeddings and learning the global structure of nonlinear manifolds, such as those generated by face images or text documents [25]. LLE has been shown to perform better than some popular methods, such as principal component analysis (PCA) [26] and multidimensional scaling (MDS) [27]. Indeed, LLE has been widely used in many applications, such as data clustering [28], video identification [29], gait analysis [30] and face recognition [31].
The classical LLE algorithm [25] consists of three steps, i.e., neighbor selection, weight calculation, and low-dimensional embedding vector computation. For simplicity, suppose that x_i is a vector of dimensionality D, where D = n² and 1 ≤ i ≤ N. Details of these steps are as follows.
(1) Neighbor selection. For each vector x_i (1 ≤ i ≤ N), its K nearest neighbors are chosen. This can be determined by the Euclidean


distance between x_i and every other vector x_j (1 ≤ j ≤ N, j ≠ i), defined as follows:

U(x_i, x_j) = \sqrt{ \sum_{l=1}^{D} [\,x_i(l) - x_j(l)\,]^2 }    (10)

where x_i(l) and x_j(l) are the l-th elements of x_i and x_j, respectively. The vectors corresponding to the K smallest distances are the K nearest neighbors of x_i.
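Step (1) can be sketched as follows, under our own naming; the brute-force pairwise distance of Eq. (10) is adequate at the paper's scale of N = 50 columns:

```python
import numpy as np

# Sketch of LLE step (1): for each column x_i of the secondary image X,
# find its K nearest neighbors under the Euclidean distance of Eq. (10).
def nearest_neighbors(X, K):
    """X is D x N; returns an N x K array of neighbor column indices."""
    # Pairwise squared distances U(x_i, x_j)^2 between all columns.
    d2 = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)
    # A vector must never count as its own neighbor.
    np.fill_diagonal(d2, np.inf)
    return np.argsort(d2, axis=1)[:, :K]
```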
(2) Weight computation. Calculate the weight matrix W = (W_{i,j})_{N×K} that best linearly reconstructs each x_i from its nearest neighbors. The reconstruction errors are computed by the following cost function:

\varepsilon(W) = \sum_{i=1}^{N} \Big\| x_i - \sum_{j} W_{i,j}\, x_j \Big\|^2    (11)

where W_{i,j} is the weight between x_i and x_j. In practice, W can be calculated by minimizing Eq. (11) subject to two constraints. First, W_{i,j} = 0 if x_j is not a nearest neighbor of x_i. Second, the sum of the neighbor weights of x_i is 1, i.e., \sum_j W_{i,j} = 1.
(3) Low-dimensional embedding vector calculation. After the weight matrix is obtained, each high-dimensional vector x_i is mapped to a low-dimensional vector y_i of dimensionality d. This can be done by minimizing the cost function below:

\varepsilon(Y) = \sum_{i=1}^{N} \Big\| y_i - \sum_{j} W_{i,j}\, y_j \Big\|^2    (12)

where Y = [y_1, y_2, …, y_N] is the matrix formed by all low-dimensional embedding vectors. For more details of the LLE algorithm, please refer to [25,32]. MATLAB code for the LLE algorithm can be downloaded from the personal website of Roweis [33].
Having obtained these low-dimensional embedding vectors, we calculate a statistic of each embedding vector to produce a short image hash. Here we choose variance as the feature representing a low-dimensional embedding vector. This is because variance can efficiently measure the fluctuation of vector elements, and we also find the LLE property that embedding vector variances are approximately linearly changed by content-preserving operations. This LLE property will be validated in Section 3.1. The reason for the LLE property is that the effect of content-preserving operations on the embedding vector variances is relatively small and resembles a Gaussian noise disturbance. Note that the use of LLE in our work is different from that of [18], which used the LLE weight matrix to construct a hash but cannot achieve good classification between robustness and discrimination. The variance of y_i is defined as follows:

\sigma_i^2 = \frac{1}{d - 1} \sum_{l=1}^{d} [\,y_i(l) - \mu_i\,]^2    (13)

where y_i(l) is the l-th element of y_i and \mu_i is the mean calculated by the equation below:

\mu_i = \frac{1}{d} \sum_{l=1}^{d} y_i(l)    (14)

To reduce storage, each variance is quantized to an integer as follows:

c(i) = \mathrm{Round}(\sigma_i^2 \times 1000)    (15)

where Round(·) is the rounding operation and 1 ≤ i ≤ N. Next, we scramble the integer sequence c = [c(1), c(2), …, c(N)] with a pseudo-random generator to make a secure image hash. Specifically, we set a secret key as the seed of the pseudo-random generator and create N random numbers. Then we sort these N random numbers and use an array P[N] to record the original positions of the sorted elements. Therefore, the i-th hash element is obtained by the equation below:

h(i) = c(P[i])    (16)

Finally, our image hash h is obtained as follows:

h = [h(1), h(2), …, h(N)]    (17)

It is clear that our hash consists of N integers. In experiments, we find that each integer requires at most 11 bits for storage. Therefore, the length of our hash is 11N bits. This will be validated in Section 3.3.

2.4. Hash similarity evaluation

Let h_1 = [h_1(1), h_1(2), …, h_1(N)] and h_2 = [h_2(1), h_2(2), …, h_2(N)] be a pair of hashes of two images. In this study, the well-known correlation coefficient is exploited to evaluate the similarity between h_1 and h_2. Specifically, the correlation coefficient used is defined as follows:

S(h_1, h_2) = \frac{ \sum_{l=1}^{N} [h_1(l) - m_1][h_2(l) - m_2] }{ \sqrt{ \sum_{l=1}^{N} [h_1(l) - m_1]^2 \cdot \sum_{l=1}^{N} [h_2(l) - m_2]^2 } + \varepsilon_s }    (18)

where \varepsilon_s is a small constant to avoid a zero denominator, and m_1 and m_2 are the means of h_1 and h_2, respectively. The range of the correlation coefficient is [-1, 1]. The greater the correlation coefficient, the more similar the evaluated hashes and thus the more similar the corresponding images. If the correlation coefficient is greater than a threshold, the two images of the input hashes are considered visually identical. Otherwise, they are considered images with different contents. The correlation coefficient is chosen as the similarity metric based on the observation that content-preserving operations approximately linearly change the variances of the low-dimensional embedding vectors. Section 3.1 will empirically verify this.
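The hash finalization of Eqs. (13)-(17) and the similarity measure of Eq. (18) can be sketched as follows; `make_hash` and `similarity` are our own names, and NumPy's seeded generator stands in for the key-driven pseudo-random scrambling (an assumption):

```python
import numpy as np

# Sketch of hash finalization (Eqs. (13)-(17)) and similarity (Eq. (18)).
# Y is the d x N matrix of LLE embedding vectors (one column per y_i).
def make_hash(Y, key=12345):
    var = Y.var(axis=0, ddof=1)           # Eq. (13): 1/(d-1) variance per y_i
    c = np.round(var * 1000).astype(int)  # Eq. (15): quantize to integers
    rng = np.random.default_rng(key)      # secret key seeds the generator
    P = np.argsort(rng.random(Y.shape[1]))  # positions of the sorted randoms
    return c[P]                           # Eqs. (16)-(17): scrambled hash

def similarity(h1, h2, eps_s=1e-10):
    """Correlation coefficient of Eq. (18); its range is [-1, 1]."""
    u = h1 - h1.mean()
    v = h2 - h2.mean()
    return float((u * v).sum() /
                 (np.sqrt((u**2).sum() * (v**2).sum()) + eps_s))
```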
3. Experimental results
In the experiments, our parameter settings are as follows. In the preprocessing, the input image is resized to 512 × 512, the L* component of the input image is divided into 2 × 2 non-overlapping blocks, and a 3 × 3 Gaussian low-pass filter with zero mean and a unit standard deviation is taken. During secondary image construction, 50 blocks of size 50 × 50 are randomly chosen. For LLE, 30 nearest neighbors are selected for each vector and the dimensionality of the low-dimensional embedding vector is 30. In other words, the parameters of our hashing are: M = 512, s_1 = 2, n = 50, N = 50, K = 30 and d = 30. Thus, our hash length is 50 integers. To validate the efficiency of our hashing, robustness and discrimination are tested in Sections 3.1 and 3.2, respectively. Section 3.3 presents hash length analysis and Section 3.4 discusses the effect of different parameter settings on hash performance.
3.1. Robustness validation
Many images in the USC-SIPI Image Database [34] are taken as test images, including 8 standard color images sized 512 × 512 and all 37 color images, sized from 512 × 512 to 2250 × 2250, in the Aerials volume. Fig. 4 illustrates the standard color images and Fig. 5 presents typical images in the Aerials volume. We exploit Photoshop, MATLAB and StirMark [35] to generate visually

Z. Tang et al. / Digital Signal Processing 43 (2015) 1727

21

Fig. 4. Standard color images for robustness validation.

Fig. 5. Typical images in the Aerials volume.


Table 1
Digital operations and their parameter settings.

Tool        Operation                          Parameter            Parameter setting                         Number of operations
Photoshop   Brightness adjustment              Photoshop's scale    ±10, ±20                                  4
Photoshop   Contrast adjustment                Photoshop's scale    ±10, ±20                                  4
MATLAB      Gamma correction                   γ                    0.75, 0.9, 1.1, 1.25                      4
MATLAB      3 × 3 Gaussian low-pass filtering  Standard deviation   0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0    8
MATLAB      Speckle noise                      Variance             0.001, 0.002, ..., 0.01                   10
MATLAB      Salt and pepper noise              Density              0.001, 0.002, ..., 0.01                   10
StirMark    JPEG compression                   Quality factor       30, 40, 50, 60, 70, 80, 90, 100           8
StirMark    Watermark embedding                Strength             10, 20, 30, 40, 50, 60, 70, 80, 90, 100   10
StirMark    Image scaling                      Ratio                0.5, 0.75, 0.9, 1.1, 1.5, 2.0             6
StirMark    Rotation, cropping and rescaling   Angle in degree      ±0.25, ±0.5, ±0.75, ±1.0, ±1.25, ±1.5     12
Total                                                                                                         76

identical versions of these test images. The adopted digital operations include brightness adjustment, contrast adjustment, gamma correction, 3 × 3 Gaussian low-pass filtering, speckle noise, salt and pepper noise, JPEG compression, watermark embedding, image scaling, and the combined operation of rotation, cropping and rescaling. For the latter, each test image is firstly rotated, the rotated version is then cropped to remove the padded pixels introduced by rotation, and the cropped version is finally resized to the original size of the test image. Detailed parameter settings of each operation are listed in Table 1. It is observed from Table 1 that the total number of operations used is 76. This means that each test image has 76 visually similar versions. Therefore, there are (8 + 37) × 76 = 3420 pairs of visually similar images.
We extract image hashes of the test images and their similar versions, calculate the similarity between each pair of hashes, and find that our hashing is robust to the digital operations used. For reasons of space, only the results of the 8 typical standard color images are plotted here. Fig. 6 presents the robustness results under various digital operations. Clearly, all results are bigger than 0.70. To demonstrate the robustness performance of our hashing on a big dataset, we calculate statistics of correlation coefficients based on the above-mentioned 3420 pairs of visually similar images. The results are listed in Table 2. It is observed from these results that the means of correlation coefficients for all digital operations are bigger than 0.87, and all standard deviations are very small. Note that the correlation coefficient is an effective metric for measuring linearity. These big correlation coefficients empirically verify that the variances of LLE results are approximately linearly changed by content-preserving operations. In addition, the minimum values of correlation coefficients for all digital operations are bigger than 0.75, except for the operation of rotation, cropping and rescaling. Its minimum value is 0.4572, which is much smaller than those of the other operations. This is because it is a combined operation, which causes more changes in the attacked images than the other operations.

Fig. 6. Robustness results under various digital operations.

Consequently, if there are no rotated images in an application, we can select 0.75 as the threshold. In this case, almost all similar images are correctly detected. If there exist some rotated images, the threshold should be lowered. From Fig. 6(j), we find that only a few cases are smaller than 0.75. Therefore, we can choose 0.70 as the threshold to resist most of the operations used above.
3.2. Discrimination test
To test the discrimination of our hashing, we collect a large database of 200 different color images via the Internet and a digital camera, whose sizes range from 256 × 256 to 2048 × 1536. Specifically, we download 67 images from the Internet, take 100 images from a well-known public database, i.e., the Ground Truth Database [36], and capture 33 images with a camera. We apply our hashing to the image database, extract 200 image hashes, calculate the similarity between each pair of hashes, and thus obtain 200 × (200 - 1)/2 = 19 900 correlation coefficients. The distribution of these results is shown in Fig. 7, where the x-axis is the correlation coefficient and the y-axis is its frequency. It is observed that the minimum and the maximum of the correlation coefficients are -0.6891 and 0.6967, respectively. Furthermore, the mean and standard deviation of the distribution in Fig. 7 are 0.0066 and 0.1905, respectively. Clearly, if we choose 0.80 or 0.70 as the threshold, no images will be falsely considered similar. If the thresh-


Table 2
Statistics of correlation coefficients based on 3420 pairs of similar images.

Operation                          Maximum   Minimum   Mean     Standard deviation
Brightness adjustment              1         0.8426    0.9694   0.0240
Contrast adjustment                0.9977    0.8457    0.9667   0.0227
Gamma correction                   1         0.7765    0.9713   0.0259
3 × 3 Gaussian low-pass filtering  1         0.9198    0.9825   0.0160
Speckle noise                      0.9986    0.8677    0.9706   0.0218
Salt and pepper noise              0.9983    0.8173    0.9701   0.0219
JPEG compression                   0.9999    0.9031    0.9691   0.0200
Watermark embedding                1         0.7569    0.9765   0.0269
Image scaling                      0.9980    0.8626    0.9765   0.0269
Rotation, cropping and rescaling   0.9914    0.4572    0.8717   0.0781

old is 0.60, only 0.05% of different images will be mistakenly detected as visually identical images. Actually, a big threshold will improve discrimination, but will also inevitably decrease robustness. Table 3 illustrates our detection performances under different thresholds, where robustness is described by the percentage of similar images judged as identical images and discrimination is indicated by the percentage of different images detected as similar images. In practice, we can choose a threshold in terms of the specific application.

3.3. Hash length analysis

A short length is a basic requirement for an image hash. To analyze the number of bits required to store our hash, we take the 200 image hashes extracted in the discrimination test as sample data. As each image hash contains 50 elements, the total number of hash elements is 50 × 200 = 10 000. Fig. 8 is the distribution of these hash elements, where the x-axis is the value of a hash element and the y-axis is its frequency. It is found that the minimum value is 4 and the maximum value is 1666. This means that storing a hash element requires only 11 bits, which can represent integers ranging from 0 to 2^11 - 1 = 2047. Therefore, the length of our hash is 50 × 11 = 550 bits, a reasonably short length. As a reference, the lengths of the SVD-SVD hashing [12], the ZM-based hashing [21], and the CVA-DWT hashing [22] are 1600 digits, 560 bits, and 960 bits, respectively.

3.4. Effect of parameters on hash performances

To evaluate the effect of parameter settings on our hash performance, we use different parameter values to validate robustness and discrimination. To make visual comparisons between the results of different settings, the notable tool, i.e., receiver operating


Fig. 7. Distribution of correlation coecients between hashes of different images.

Fig. 9. ROC curve comparisons under different K values when d = 30.

Table 3
Detection performances under different thresholds.

Threshold   Similar images judged as identical images   Different images detected as similar images
0.90        89.91%                                      0
0.80        97.16%                                      0
0.75        98.92%                                      0
0.70        99.53%                                      0
0.60        99.91%                                      0.05%
0.55        99.94%                                      0.12%
0.50        99.97%                                      0.37%
0.45        100%                                        0.87%

Fig. 8. Distribution of hash elements based on 200 different images.

characteristics (ROC) graph [37], is adopted here. In the ROC graph, the x-axis is generally defined as the false positive rate (FPR) P_1 and the y-axis represents the true positive rate (TPR) P_2. The two rates are calculated by the equations below:

P_1 = n_1 / N_1    (19)

P_2 = n_2 / N_2    (20)

where n_1 is the number of visually different images detected as similar images, N_1 is the total number of different images, n_2 is the number of visually similar images judged as identical images, and N_2 is the total number of similar images. Clearly, P_1 and P_2 are indicators of discrimination and robustness, respectively. The smaller the P_1 value, the better the discrimination performance; the bigger the P_2 value, the better the robustness performance. Since an ROC curve is formed by a set of points (P_1, P_2), a curve close to the top-left corner (a small P_1 and a big P_2) is better than one far away from the top-left corner.
Here, we mainly discuss the key parameters, i.e., the number of nearest neighbors K and the dimensionality d of the low-dimensional vectors for LLE. Firstly, we vary only the K value and keep the other parameters unchanged. The K values used are 5, 15, 25, 30, 35, and 40. Fig. 9 shows the ROC curve comparisons among the different K values. It is observed that all ROC curves are close to the top-left corner, indicating that our hashing reaches a satisfactory balance between robustness and discrimination. To view the differences, the ROC curves around the top-left corner are enlarged in the right-bottom part of Fig. 9. It is found that a moderate K value, such as 30 or 35, reaches a slightly better classification performance than small K values (such as 5 or 15) and big K values (e.g., 40). This can be understood as follows. In the LLE algorithm, K nearest neighbors are used to reconstruct a vector. A small K value cannot make a good reconstruction due to the lack of enough nearest neighbors, while a big K value brings some unnecessary neighbors, leading to a larger reconstruction error. In the experiments, we find that, for 512 × 512 images, K = 30 or K = 35 reaches a good classification performance between robustness and discrimination. Moreover, computational time is also compared. To this end, the total time consumed in extracting the 200 hashes in the discrimination test is recorded, and the average time for generating a hash is then obtained. Our hashing is implemented with MATLAB R2012a, running on a desktop PC with a 3.4 GHz Intel Core i5-3570 CPU and 4.0 GB RAM. We observe that the running time under different K values is around 0.61 seconds, and the time under K = 40 is a little more than those of the other values. Table 4 summarizes the comparison of computational time under different K values.
Secondly, we change only the d value. The adopted values of d are 25, 30, 35, and 40. Fig. 10 presents the ROC curve compar-

Table 4
Time comparison under different K values.

K                             5      15     25     30     35     40
Computational time (second)   0.62   0.61   0.61   0.61   0.62   0.64


Fig. 10. ROC curve comparisons under different d values when K = 30.


Fig. 11. ROC curve comparisons among different hashing algorithms.

Table 5
Time comparison under different d values.

d                             25     30     35     40
Computational time (second)   0.62   0.61   0.61   0.62

From the results, we find that all ROC curves reach the top-left corner. This implies that the d value has little effect on the classification performance of our hashing. To view more details, the bottom-right region of Fig. 10 shows a magnified part of the ROC curves around the top-left corner. We observe that the curves of d = 30 and d = 35 are almost the same, and that the curves of d = 25 and d = 40 are slightly below them. Therefore, for 512 × 512 images, we can choose d = 30 or d = 35 to achieve a good classification tradeoff between robustness and discrimination. In addition, we calculate the average time of hash generation and find that the time under different d values is also around 0.61 seconds. The comparison of computational time is presented in Table 5.
4. Performance comparisons with notable algorithms
To illustrate our superiority in classification performance, we compare our hashing with some notable image hashing algorithms, including LLE-based hashing [18], RTDCT hashing [16], GFLVQ hashing [20] and CVADWT hashing [22]. To ensure fair comparisons, the test images used in Sections 3.1 and 3.2 are also taken here. In the comparisons, all images are converted to 512 × 512, and the other parameters are the same as the default settings of the assessed algorithms. The original metrics for measuring hash similarity are also adopted here, i.e., the Hamming distance for LLE-based hashing and RTDCT hashing, the normalized Hamming distance for GFLVQ hashing, and the L2 norm for CVADWT hashing. Thus, the hash lengths of LLE-based hashing, RTDCT hashing, GFLVQ hashing, and CVADWT hashing are 300, 240, 120, and 960 bits, respectively.
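For reference, the three similarity metrics named above have simple generic forms (our own sketch; function and parameter names are illustrative):

```python
import numpy as np

def hamming_distance(b1, b2):
    """Number of differing bits between two binary hashes."""
    return int(np.sum(np.asarray(b1) != np.asarray(b2)))

def normalized_hamming(b1, b2):
    """Hamming distance divided by hash length, giving a value in [0, 1]."""
    return hamming_distance(b1, b2) / len(b1)

def l2_distance(h1, h2):
    """Euclidean (L2) distance between two real-valued hashes."""
    return float(np.linalg.norm(np.asarray(h1, float) - np.asarray(h2, float)))

b1 = np.array([1, 0, 1, 1, 0, 0, 1, 0])
b2 = np.array([1, 0, 0, 1, 0, 1, 1, 0])
print(hamming_distance(b1, b2))    # 2
print(normalized_hamming(b1, b2))  # 0.25
```

In each case a smaller distance indicates more similar images, so a decision threshold separates similar pairs from different pairs.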
We exploit the compared algorithms to generate hashes of the above images and then calculate hash similarity with the respective metrics. For our hashing, the results of K = 30 and d = 30 are taken here. To make a theoretical analysis, the ROC graph is used again. In the ROC graph, if two algorithms reach the same TPR, the one with the smaller FPR is better. Similarly, the algorithm with the bigger TPR outperforms the algorithm with the smaller TPR when they have the same FPR. Intuitively, the larger the area under the ROC curve, the better the classification performance. Fig. 11 presents the ROC curve comparisons among the assessed algorithms. Clearly, our ROC curve is above the curves of the compared algorithms. In other words, the area under our ROC curve is much larger than theirs. This means that our hashing has better classification performance than the other image hashing algorithms. For example, when TPR = 1.0, the best FPRs of LLE-based hashing, RTDCT hashing, GFLVQ hashing, CVADWT hashing, and our hashing are 0.5569, 0.9963, 0.7040, 0.1033, and 0.0087, respectively. And when FPR = 0, the optimum TPRs of LLE-based hashing, RTDCT hashing, GFLVQ hashing, CVADWT hashing, and our hashing are 0.8959, 0.7018, 0.7827, 0.9406, and 0.9953, respectively.
In fact, the good classification performance of our hashing is contributed by our well-designed steps, including the similarity metric. Specifically, our preprocessing reduces the influences of content-preserving operations and thus helps to achieve good robustness. Our second step provides a secondary image suitable for data reduction with LLE. The use of LLE in the third step makes our hashing discriminative, since it can efficiently learn the global structure of nonlinear manifolds and discover compact representations. Finally, according to the LLE property, we select the correlation coefficient as the similarity metric, which can effectively measure the linearly changed variances of embedding vectors and thus makes our algorithm reach a desirable tradeoff between robustness and discrimination.
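Assuming the standard Pearson form, the correlation-coefficient metric can be sketched as follows (a generic implementation; the example hash values are illustrative):

```python
import numpy as np

def hash_similarity(h1, h2):
    """Correlation coefficient between two variance-based hashes.
    A value near 1 indicates visually identical content; a threshold
    chosen experimentally separates similar from different images."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    a, b = h1 - h1.mean(), h2 - h2.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# A hash whose variances are linearly scaled and shifted, as an
# (idealized) content-preserving operation would do, keeps a
# correlation near 1, so the metric tolerates such changes.
h = np.array([0.8, 2.1, 1.3, 0.4, 3.0, 1.7])
print(hash_similarity(h, 2.5 * h + 0.1))  # close to 1.0
```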
Moreover, the computational time of generating a hash is also compared. This is done by recording the total time consumed in extracting 200 hashes in the respective discrimination tests. The average time of LLE-based hashing, RTDCT hashing, GFLVQ hashing and CVADWT hashing is 0.04, 3.04, 0.42 and 0.27 seconds, respectively. Our average time is about 0.61 seconds, which is slower than those of LLE-based hashing, GFLVQ hashing and CVADWT hashing, but much faster than that of RTDCT hashing. The low speed of RTDCT hashing is due to the high complexity of the Radon transform (RT).
Performance comparisons are summarized in Table 6. Our hashing outperforms the compared algorithms in classification with respect to robustness and discrimination. As to computational time and hash length, our hashing reaches moderate performance. The fastest algorithm is the LLE-based hashing, and the GFLVQ hashing has the shortest hash length.
5. Conclusions
In this work, we have proposed a robust image hashing based on LLE. The key contribution of our work is the innovative use of LLE. As LLE is good at discovering compact representations by learning the global structure of input data, it provides our image hashing with a

Table 6
Performance comparisons among different algorithms.

Performance                  GFLVQ hashing  RTDCT hashing  LLE-based hashing  CVADWT hashing  Our hashing
Classification               Moderate       Bad            Moderate           Good            Best
Computational time (second)  0.42           3.04           0.04               0.27            0.61
Hash length (bit)            120            240            300                960             550

desirable discrimination. More specifically, we have observed the LLE property that embedding vector variances are approximately linearly changed by content-preserving operations, so that hash similarity can be measured with the correlation coefficient. Many experiments have been conducted to validate the efficiency of our image hashing. The results have shown that our image hashing is robust to many common digital operations and reaches a good discrimination. ROC curve comparisons with some well-known algorithms have illustrated that our hashing outperforms the compared algorithms in classification performance with respect to robustness and discrimination.
Acknowledgments
This work is partially supported by the National Natural
Science Foundation of China (61300109, 61363034, 61303203),
the Guangxi Natural Science Foundation (2012GXNSFBA053166,
2012GXNSFGA060004), Guangxi Bagui Scholar Teams for Innovation and Research, the Scientific and Technological Research Projects in Guangxi Higher Education Institutions (YB2014048,
ZL2014005), the Project of the Guangxi Key Lab of Multi-source Information Mining & Security (14-A-02-02, 13-A-03-01), the Project
of Outstanding Young Teachers Training in Higher Education Institutions of Guangxi, and Guangxi Collaborative Innovation Center
of Multi-source Information Integration and Intelligent Processing.
The authors would like to thank the anonymous referees for their
valuable comments and suggestions.
References
[1] F. Ahmed, M.Y. Siyal, V.U. Abbas, A secure and robust hash-based scheme for image authentication, Signal Process. 90 (5) (2010) 1456–1470.
[2] J. Fridrich, M. Goljan, Robust hash functions for digital watermarking, in: Proc. of IEEE International Conference on Information Technology: Coding and Computing, 2000, pp. 178–183.
[3] C. Winter, M. Steinebach, Y. Yannikos, Fast indexing strategies for robust image hashes, Digit. Investig. 11 (2014) S27–S35.
[4] W. Lu, M. Wu, Multimedia forensic hash based on visual words, in: Proc. of IEEE International Conference on Image Processing, 2010, pp. 989–992.
[5] X. Lv, Z.J. Wang, Reduced-reference image quality assessment based on perceptual image hashing, in: Proc. of IEEE International Conference on Image Processing, 2009, pp. 4361–4364.
[6] C. Qin, C.C. Chang, P.L. Tsou, Robust image hashing using non-uniform sampling in discrete Fourier domain, Digit. Signal Process. 23 (2) (2013) 578–585.
[7] Z. Tang, Y. Dai, X. Zhang, S. Zhang, Perceptual Image Hashing with Histogram of Color Vector Angles, Lecture Notes in Computer Science, vol. 7669, 2012, pp. 237–246.
[8] V. Monga, Perceptually based methods for robust image hashing, PhD dissertation, The University of Texas at Austin, 2005.
[9] Y. Lei, Y. Wang, J. Huang, Robust image hash in Radon transform domain for authentication, Signal Process. Image Commun. 26 (6) (2011) 280–288.
[10] R. Venkatesan, S.-M. Koon, M.H. Jakubowski, P. Moulin, Robust image hashing, in: Proc. of IEEE International Conference on Image Processing, 2000, pp. 664–666.
[11] F. Lefebvre, B. Macq, J.-D. Legat, RASH: Radon soft hash algorithm, in: Proc. of European Signal Processing Conference, 2002, pp. 299–302.
[12] S.S. Kozat, R. Venkatesan, M.K. Mihcak, Robust perceptual image hashing via matrix invariants, in: Proc. of IEEE International Conference on Image Processing, 2004, pp. 3443–3446.
[13] A. Swaminathan, Y. Mao, M. Wu, Robust and secure image hashing, IEEE Trans. Inf. Forensics Secur. 1 (2) (2006) 215–230.
[14] V. Monga, M.K. Mihcak, Robust and secure image hashing via non-negative matrix factorizations, IEEE Trans. Inf. Forensics Secur. 2 (3) (2007) 376–390.
[15] Z. Tang, S. Wang, X. Zhang, W. Wei, S. Su, Robust image hashing for tamper detection using non-negative matrix factorization, J. Ubiquitous Convergence Technol. 2 (1) (2008) 18–26.
[16] Y. Ou, K.H. Rhee, A key-dependent secure image hashing scheme by using Radon transform, in: Proc. of the IEEE International Symposium on Intelligent Signal Processing and Communication Systems, 2009, pp. 595–598.
[17] L. Kang, C. Lu, C. Hsu, Compressive sensing-based image hashing, in: Proc. of IEEE International Conference on Image Processing, Cairo, Egypt, 2009, pp. 1285–1288.
[18] R. Sun, X. Yan, Z. Ding, Robust image hashing using locally linear embedding, in: Proc. of the 2011 International Conference on Computer Science and Service System (CSSS), 2011, pp. 715–718.
[19] Z. Tang, S. Wang, X. Zhang, W. Wei, Structural feature-based image hashing and similarity metric for tampering detection, Fundam. Inform. 106 (1) (2011) 75–91.
[20] Y. Li, Z. Lu, C. Zhu, X. Niu, Robust image hashing based on random Gabor filtering and dithered lattice vector quantization, IEEE Trans. Image Process. 21 (4) (2012) 1963–1980.
[21] Y. Zhao, S. Wang, X. Zhang, H. Yao, Robust hashing for image authentication using Zernike moments and local features, IEEE Trans. Inf. Forensics Secur. 8 (1) (2013) 55–63.
[22] Z. Tang, Y. Dai, X. Zhang, L. Huang, F. Yang, Robust image hashing via colour vector angles and discrete wavelet transform, IET Image Process. 8 (3) (2014) 142–149.
[23] R.C. Gonzalez, R.E. Woods, Digital Image Processing, 3rd edition, Prentice Hall, 2007.
[24] W. Burger, M.J. Burge, Principles of Digital Image Processing: Core Algorithms, Springer, London, 2009.
[25] S.T. Roweis, L.K. Saul, Nonlinear dimensionality reduction by locally linear embedding, Science 290 (2000) 2323–2326.
[26] I.T. Jolliffe, Principal Component Analysis, Springer-Verlag, New York, 1989.
[27] T. Cox, M. Cox, Multidimensional Scaling, Chapman & Hall, London, 1994.
[28] S. Pang, D. Kim, S.Y. Bang, Face membership authentication using SVM classification tree generated by membership-based LLE data partition, IEEE Trans. Neural Netw. 16 (2) (2005) 436–446.
[29] X. Nie, J. Qiao, J. Liu, J. Sun, X. Li, W. Liu, LLE-based video hashing for video identification, in: Proc. of the IEEE 10th International Conference on Signal Processing (ICSP), 2010, pp. 1837–1840.
[30] H. Li, X. Li, Gait analysis using LLE, in: Proc. of the IEEE 7th International Conference on Signal Processing (ICSP), vol. 2, 2004, pp. 1423–1426.
[31] A. Hadid, O. Kouropteva, M. Pietikainen, Unsupervised learning using locally linear embedding: experiments with face pose analysis, in: Proc. of the 16th International Conference on Pattern Recognition, vol. 1, 2002, pp. 111–114.
[32] S.T. Roweis, L.K. Saul, An introduction to locally linear embedding [online]. Available: http://www.cs.nyu.edu/~roweis/lle/papers/lleintroa4.pdf, 2001.
[33] LLE code page [online]. Available: http://www.cs.nyu.edu/~roweis/lle/code.html, 2013.
[34] USC-SIPI Image Database [online]. Available: http://sipi.usc.edu/database/, 2007.
[35] F.A.P. Petitcolas, Watermarking schemes evaluation, IEEE Signal Process. Mag. 17 (5) (2000) 58–64.
[36] Ground Truth Database [online]. Available: http://www.cs.washington.edu/reseach/imagedatabase/groundtruth/, 2008.
[37] T. Fawcett, An introduction to ROC analysis, Pattern Recognit. Lett. 27 (8) (2006) 861–874.

Zhenjun Tang received the B.S. and M.Eng. degrees from Guangxi Normal University, Guilin, P.R.
China, in 2003 and 2006, respectively, and the Ph.D.
degree from Shanghai University, Shanghai, P.R. China,
in 2010. He is now a professor with the Department
of Computer Science, Guangxi Normal University. His
research interests include image processing and multimedia security. He has contributed more than 30 papers in international journals such as IEEE Transactions
on Knowledge and Data Engineering, Signal Processing, Applied Mathematics
and Computation, IET Image Processing, Fundamenta Informaticae, Multimedia
Tools and Applications, Imaging Science Journal, Applied Mathematics & Information Sciences, AE-International Journal of Electronics and Communications,

and Optik-International Journal for Light and Electron Optics. He holds three
China patents. He is a reviewer of some reputable journals such as IEEE
Transactions on Image Processing, IEEE Transactions on Information Forensics
and Security, IEEE Transactions on Multimedia, IEEE Transactions on Circuits
and Systems for Video Technology, Signal Processing, Digital Signal Processing,
IET Image Processing, IET Computer Vision, Neurocomputing, Multimedia Tools
and Applications, and Imaging Science Journal.

Linlin Ruan received the B.S. degree from Wuhan


University of Science and Engineering, Wuhan, P.R.
China, in 2009. Currently, he is pursuing the M.Eng.
degree at the Department of Computer Science,
Guangxi Normal University. His research interests include image processing and multimedia security.

Chuan Qin received the B.S. and M.S. degrees in


electronic engineering from Hefei University of Technology, Anhui, China, in 2002 and 2005, respectively,
and the Ph.D. degree in signal and information processing from Shanghai University, Shanghai, China, in
2008. Since 2008, he has been with the faculty of the
School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, where he is currently an Associate Professor. He

was with Feng Chia University at Taiwan as a Postdoctoral Researcher and


Adjunct Assistant Professor from July 2010 to July 2012. His research interests include image processing and multimedia security.

Xianquan Zhang received the M.Eng. degree from


Chongqing University, Chongqing, P.R. China, in 1996.
He is currently a Professor with the Department of
Computer Science, Guangxi Normal University. His research interests include image processing and computer graphics. He has contributed more than 70 papers.

Chunqiang Yu received the M.Eng. degree from


Guangxi Normal University in 2013. His research interests include image processing and information hiding.
