International Journal of Communications (IJC) Volume 2 Issue 2, June 2013 www.seipub.org/ijc
Improved Nonlinear Image Enhancement for Video Coding
Lung-Jen Wang*1
Department of Computer Science and Information Engineering, National Pingtung Institute of Commerce, Taiwan
51 Min Sheng E. Road, Pingtung 900, Taiwan, R.O.C.
*1 ljwang@npic.edu.tw
Abstract
It is well known that the B-spline filter can yield a very accurate algorithm for smoothing. In this paper, it is shown that a cubic B-spline filter can be used to improve the nonlinear image enhancement method. In nonlinear image enhancement, a higher-frequency component can be predicted to solve the blurring problem of an enlarged image. This paper also presents a new three-dimensional (3-D) down-scaling scheme to subsample video data for video coding. Furthermore, a novel nonlinear image enhancement compensation algorithm with the cubic B-spline filter is presented to improve the prediction of the higher-frequency component and, accordingly, the efficiency of the 3-D down-scaling video coding. Finally, a computer simulation shows that the proposed method yields a better quality of the decoded image than other nonlinear enhancement methods.
Keywords
Cubic B-spline Filter; Nonlinear Image Enhancement; Video Coding
Introduction
With the growing interest in digital image processing, applications in this domain such as digital high-definition television (HDTV), Internet protocol television (IPTV), and videophone, which have become an integral part of our life, rely on image enlargement techniques. Typically, however, image enlargement causes a blurred image because there is no power in the high-frequency component of the enlarged image [14]. To improve the quality of such a blurred image, image enhancement is an indispensable post-processing method.
Image enhancement is a very important research topic. The principle of image enhancement is to process an image so that the result is more suitable than the original image for a given application. A typical image enhancement is achieved through a high-pass filter followed by post-processing in order to make the image suitable. In other words, this method uses the principle behind unsharp masking and high-boost filtering [14].
Nonlinear image enhancement [1, 2, 13] is similar to the typical image enhancement except that the high-pass filter is replaced by nonlinear operations. This enhancement method uses the Gaussian-pyramid [1] or filter-subtract-and-decimate (FSD) pyramid [2] representation of an image to extract the high-frequency component from the input (blurred) image. That is, a high-frequency component L1 can be obtained from a blurred image by a nonlinear filter as shown in Fig. 1. A new output image is then generated as the sum of the given input image and the high-frequency component L1. The major nonlinear step involves clipping and scaling the extracted components.
FIG. 1 THE NONLINEAR IMAGE ENHANCEMENT
Cubic B-spline filters have been extensively used in image processing [3, 6, 7, 8, 10]. Part of this work, an improved nonlinear image enhancement method for video coding, was proposed in [11]. However, the detailed derivation of both the nonlinear image enhancement with the cubic B-spline filter and the 3-D down-scaling video coding has not been presented. This paper therefore gives more detailed descriptions of the nonlinear image enhancement compensation with the cubic B-spline filter and the 3-D down-scaling video coding. Furthermore, the novel nonlinear image enhancement compensation is used along with video coding to improve the quality of the decoded image. Finally, experimental results show that the proposed method obtains better subjective quality and objective PSNR performance than other nonlinear image enhancement methods.
The rest of this paper is organized as follows. Section 2 describes the cubic B-spline filter. The nonlinear image enhancement is proposed in Section 3. Section 4 shows the 3-D down-scaling scheme for video coding. In Section 5, the video coding using nonlinear image enhancement compensation is described. The computer simulation is illustrated in Section 6. The last section gives the conclusions of this paper.
Cubic B-Spline Filter
Splines are piecewise polynomials with pieces that are smoothly connected together. The B-splines (where the B may stand for basis or basic) are the basic building blocks for splines [8]. In addition, it is shown in [3, 6, 7, 10] that the B-spline is a very good low-pass filter for image representation.
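Let $\pi: \xi_0 < \xi_1 < \cdots < \xi_n < \xi_{n+1}$ be a partition of the interval $[\xi_0, \xi_{n+1}]$ on the real axis. The spline basis function of degree $n$ on $\pi$ in [3] is the following piecewise polynomial:

$$B_n(\xi; \xi_0, \xi_1, \xi_2, \ldots, \xi_{n+1}) = (n+1) \sum_{k=0}^{n+1} \frac{(\xi - \xi_k)^n \, U(\xi - \xi_k)}{\omega(\xi_k)}, \quad \text{for } n = 0, 1, 2, \ldots \tag{1}$$

where

$$\omega(\xi_k) = \prod_{j=0,\, j \neq k}^{n+1} (\xi_k - \xi_j)$$

and

$$U(\xi - \xi_k) = \begin{cases} (\xi - \xi_k)^0, & \text{for } \xi > \xi_k \\ 0, & \text{for } \xi \le \xi_k \end{cases}$$

is a unit step function.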
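From (1), the cubic B-spline function is given by

$$\begin{aligned} S_k(\xi) &= B_3(\xi; \xi_{k-2}, \xi_{k-1}, \xi_k, \xi_{k+1}, \xi_{k+2}) \\ &= [(\xi - \xi_{k-2})^3 U(\xi - \xi_{k-2}) - 4(\xi - \xi_{k-1})^3 U(\xi - \xi_{k-1}) + 6(\xi - \xi_k)^3 U(\xi - \xi_k) \\ &\quad - 4(\xi - \xi_{k+1})^3 U(\xi - \xi_{k+1}) + (\xi - \xi_{k+2})^3 U(\xi - \xi_{k+2})] / (6\Delta^4), \end{aligned} \tag{2}$$

where $\Delta = \xi_k - \xi_{k-1}$.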
An interpolation function can be expressed in the following form:

$$f(\xi) = \sum_{k=1}^{N} c_k S_k(\xi) \tag{3}$$

where $c_k$ are the coefficients to be determined from the input data, $S_k(\xi)$ is the spline basis function, and $N$ is the number of given data points.
FIG. 2 $f(\xi)$ INTERPOLATED BY CUBIC B-SPLINE
From (1)-(3), and as illustrated in Fig. 2, the cubic B-spline interpolation function in one dimension (1-D) can be obtained as

$$\begin{aligned} f(\xi) = \{ & c_{k-1}[(\xi - \xi_{k-3})^3 - 4(\xi - \xi_{k-2})^3 + 6(\xi - \xi_{k-1})^3 - 4(\xi - \xi_k)^3] \\ & + c_k[(\xi - \xi_{k-2})^3 - 4(\xi - \xi_{k-1})^3 + 6(\xi - \xi_k)^3] \\ & + c_{k+1}[(\xi - \xi_{k-1})^3 - 4(\xi - \xi_k)^3] + c_{k+2}[(\xi - \xi_k)^3] \} / (6\Delta^4), \end{aligned} \tag{4}$$

where $c_{k-1}, c_k, c_{k+1}$ and $c_{k+2}$ are the coefficients to be determined from the input data.
Let $\xi = \xi_k + x\Delta$, where $0 \le x \le 1$. The cubic B-spline interpolation function in (4) can be written as

$$\begin{aligned} f(\xi_k + x\Delta) = \{ & c_{k-1}[(x+3)^3 - 4(x+2)^3 + 6(x+1)^3 - 4x^3] + c_k[(x+2)^3 - 4(x+1)^3 + 6x^3] \\ & + c_{k+1}[(x+1)^3 - 4x^3] + c_{k+2}\,x^3 \} / (6\Delta) \\ = \{ & x^3(-c_{k-1} + 3c_k - 3c_{k+1} + c_{k+2}) + x^2(3c_{k-1} - 6c_k + 3c_{k+1}) \\ & + x(-3c_{k-1} + 3c_{k+1}) + (c_{k-1} + 4c_k + c_{k+1}) \} / (6\Delta). \end{aligned} \tag{5}$$
Then (5) can be used to find the interpolation at any point among the sampled points. In particular, at the node point $\xi = \xi_k$, i.e., $x = 0$, (5) becomes

$$f(\xi_k) = (c_{k-1} + 4c_k + c_{k+1}) / (6\Delta). \tag{6}$$

In other words, the 1-D filter of the cubic B-spline function in (6) is [1, 4, 1]/6.
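The node weights [1, 4, 1]/6 can be checked numerically. The following Python sketch evaluates the four polynomial weights that multiply the coefficients in (5). It assumes unit spacing (Δ = 1), and the function name `cubic_bspline_weights` is illustrative, not taken from the paper.

```python
from fractions import Fraction


def cubic_bspline_weights(x):
    """Weights of c[k-1], c[k], c[k+1], c[k+2] in (5), assuming Delta = 1."""
    x = Fraction(x)
    return (
        (1 - x) ** 3 / 6,                        # equals (-x^3 + 3x^2 - 3x + 1)/6
        (3 * x**3 - 6 * x**2 + 4) / 6,           # weight of c[k]
        (-3 * x**3 + 3 * x**2 + 3 * x + 1) / 6,  # weight of c[k+1]
        x**3 / 6,                                # weight of c[k+2]
    )


# At the node point (x = 0) the weights reduce to the filter [1, 4, 1]/6.
node = cubic_bspline_weights(0)
# Halfway between two nodes all four coefficients contribute.
mid = cubic_bspline_weights(Fraction(1, 2))
```

Exact fractions are used so that the weights can be compared without rounding error; for any x in [0, 1] the four weights sum to one.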
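Based on the definition of the two-dimensional (2-D) interpolation function [10], the cubic B-spline interpolation function can be extended from the 1-D interpolation function to the 2-D interpolation function. Let $f(x, y)$ be the 2-D interpolation function at offsets $0 \le x, y \le 1$ from the node point $(\xi_k, \eta_l)$, and let $f_l(x)$ be the 1-D interpolation (5) along line $l$. Then

$$\begin{aligned} f(x, y) = \{ & f_{l-1}(x)[(y+3)^3 - 4(y+2)^3 + 6(y+1)^3 - 4y^3] + f_l(x)[(y+2)^3 - 4(y+1)^3 + 6y^3] \\ & + f_{l+1}(x)[(y+1)^3 - 4y^3] + f_{l+2}(x)\,y^3 \} / (6\Delta). \end{aligned} \tag{7}$$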
In particular, at the node point $(\xi_k, \eta_l)$, i.e., $x = 0$ and $y = 0$, (7) gives

$$\begin{aligned} f(\xi_k, \eta_l) = \{ & (c_{k-1,l-1} + 4c_{k,l-1} + c_{k+1,l-1}) + 4(c_{k-1,l} + 4c_{k,l} + c_{k+1,l}) \\ & + (c_{k-1,l+1} + 4c_{k,l+1} + c_{k+1,l+1}) \} / (36\Delta^2), \end{aligned} \tag{8}$$

for all $k = 1, 2, \ldots, K$ and $l = 1, 2, \ldots, L$.

In other words, the 2-D filter of the cubic B-spline function in (8) is [1, 4, 1; 4, 16, 4; 1, 4, 1]/36.
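As a minimal sketch, the 2-D filter can be built as the outer product of the 1-D filter [1, 4, 1]/6 with itself and applied to an image stored as a list of rows. The mirrored boundary handling and the helper names (`bspline_kernel_2d`, `reflect`, `lowpass`) are assumptions made for illustration; the paper does not specify how the filter treats image borders.

```python
def bspline_kernel_2d():
    """3x3 kernel of (8): outer product of [1, 4, 1]/6 with itself."""
    w = [1 / 6, 4 / 6, 1 / 6]
    return [[a * b for b in w] for a in w]


def reflect(i, n):
    """Mirror index i back into range(n); assumed boundary rule, needs n >= 2."""
    if i < 0:
        return -i
    if i >= n:
        return 2 * (n - 1) - i
    return i


def lowpass(image):
    """Filter a 2-D list of numbers with the 3x3 cubic B-spline kernel."""
    kernel = bspline_kernel_2d()
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = reflect(r + dr, rows), reflect(c + dc, cols)
                    acc += kernel[dr + 1][dc + 1] * image[rr][cc]
            out[r][c] = acc
    return out


# A single bright pixel of value 36 is spread according to the kernel weights.
impulse = [[0, 0, 0], [0, 36, 0], [0, 0, 0]]
smoothed = lowpass(impulse)
```

Because the kernel sums to one, a constant image passes through the filter unchanged.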
The Proposed Enhancement Scheme
In this paper, the proposed enhancement algorithm applies the philosophy of the cubic B-spline filter to improve the nonlinear image enhancement method [2], as shown in Fig. 3.
FIG. 3 THE PROPOSED NONLINEAR IMAGE ENHANCEMENT METHOD
The low-frequency image $I_1$ is obtained from the input blurred image $I_0$ using the cubic B-spline filter, and the high-frequency image $K_0$, called the residual image, is obtained by subtracting the low-frequency image $I_1$ from the input blurred image $I_0$, i.e.,

$$K_0 = I_0 - I_1. \tag{9}$$

By [2], the enhanced image $I_{-1}$ is generated as the sum of the input blurred image $I_0$ and the predicted higher-frequency image $K_{-1}$; that is,

$$I_{-1} = I_0 + K_{-1}, \tag{10}$$

where $K_{-1} = NL(K_0)$ is a nonlinear operator of $K_0$, which includes both scaling and clipping steps, defined as follows:

$$NL(K_0) = s \times Clip(K_0), \tag{11}$$

where the scaling constant $s$ ranges between 1 and 10 and $Clip(x)$ is given by

$$Clip(x) = \begin{cases} T, & \text{if } x > T \\ x, & \text{if } -T \le x \le T \\ -T, & \text{if } x < -T \end{cases} \tag{12}$$

where $x$ is a pixel of the high-frequency image $K_0$,

$$T = c \times K_{0\max}, \tag{13}$$

where $K_{0\max}$ is the maximum pixel value of the high-frequency image $K_0$ and the clipping constant $c$ ranges between 0 and 1.

After the nonlinear operator is applied, the higher-frequency image $K_{-1}$ can be utilized to enhance the input blurred image $I_0$.
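The clipping and scaling steps in (9)-(13) can be sketched in a few lines of Python. The sketch takes the blurred image and its low-pass filtered version as lists of rows, so the choice of low-pass filter stays outside the example. The function names and the sample values are illustrative, and the code assumes that the residual image has a positive maximum, so that the threshold T is positive.

```python
def clip(x, t):
    """Clip(x) of (12): limit x to the range [-t, t], assuming t >= 0."""
    return max(-t, min(t, x))


def enhance(i0, i1, s=3.0, c=0.45):
    """Return the enhanced image I0 + s * Clip(K0), where K0 = I0 - I1."""
    # Residual (high-frequency) image, as in (9).
    k0 = [[a - b for a, b in zip(row0, row1)] for row0, row1 in zip(i0, i1)]
    # Clipping threshold T = c * K0max, as in (13).
    t = c * max(max(row) for row in k0)
    # Add the clipped and scaled residual to the input, as in (10) and (11).
    return [[a + s * clip(k, t) for a, k in zip(row0, rowk)]
            for row0, rowk in zip(i0, k0)]


# One image row containing a soft edge, with a hypothetical low-pass version.
blurred = [[10.0, 10.0, 20.0, 20.0]]
low = [[10.0, 12.0, 18.0, 20.0]]
sharpened = enhance(blurred, low)
```

In this example the residual is [0, -2, 2, 0], so T = 0.9 and the edge pixels move from 10 and 20 to about 7.3 and 22.7, which steepens the transition.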
Then, using the description in [2], the values of the parameters $c$ and $s$ are estimated as follows. Let the standard deviation $\sigma_0$ of the input image $I_0$ be 0.9, let the standard deviation $\sigma_{-1}$ of the enhanced image $I_{-1}$ be 0.45, and let the low-pass (cubic B-spline) filter LF be normally distributed with standard deviation $\sigma_{LF} = 1$. The low-frequency image $I_1$ is acquired through the low-pass filter LF, so that the standard deviation $\sigma_1$ of $I_1$ is 1.345 ($\sigma_1^2 = \sigma_0^2 + \sigma_{LF}^2 = 1.81$). The residual image $K_0$ in (9) is obtained by subtracting $I_1$ from $I_0$. Through the normalized Gaussian filter, the error function is generated. That is,

$$K_0 = Erf(x / \sigma_0) - Erf(x / \sigma_1). \tag{14}$$

Therefore, the maximum value can be obtained as

$$Erf'(x_{\max} / \sigma_0) - Erf'(x_{\max} / \sigma_1) = 0, \tag{15}$$

or

$$I_{\sigma_0}(x_{\max}) - I_{\sigma_1}(x_{\max}) = 0, \tag{16}$$

and

$$x_{\max} = \sqrt{\frac{2 \log(\sigma_1 / \sigma_0)}{1/\sigma_0^2 - 1/\sigma_1^2}}. \tag{17}$$

It follows from [2] that the solution of (14) and (17) can be confirmed. Thus, one parameter combination of c = 0.45, s = 3 from the theoretical evaluation and another parameter combination of c = 0.4, s = 5 from the estimation analysis are proposed for the nonlinear image enhancement.
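The numbers in this estimate can be reproduced with a short Python sketch. It assumes that the logarithm in (17) is the natural logarithm and that the error function is the integral of a normalized Gaussian, so that its derivative is a Gaussian density. The function name `residual_peak` is illustrative.

```python
import math


def residual_peak(sigma0=0.9, sigma_lf=1.0):
    """Return (sigma1, x_max) from sigma1^2 = sigma0^2 + sigma_lf^2 and (17)."""
    sigma1 = math.sqrt(sigma0**2 + sigma_lf**2)
    x_max = math.sqrt(2 * math.log(sigma1 / sigma0)
                      / (1 / sigma0**2 - 1 / sigma1**2))
    return sigma1, x_max


def gaussian(x, sigma):
    """Normalized Gaussian density with standard deviation sigma."""
    return math.exp(-x**2 / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))


sigma1, x_max = residual_peak()
# At x_max the two Gaussian densities are equal, so the residual is stationary.
gap = gaussian(x_max, 0.9) - gaussian(x_max, sigma1)
```

Under these assumptions `sigma1` matches the value 1.345 quoted in the text, and `gap` vanishes up to rounding error.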
A 3-D Down-scaling for Video Coding
In order to obtain a low-bit-rate video, a new type of 3-D down-scaling scheme [11] is presented for video coding. In general, the process of decreasing the data rate is called decimation, while the process of increasing the number of data samples is called interpolation [3]. This 3-D down-scaling scheme applies a 3-D decimation with a compression ratio of 8 to 1 as the pre-processing step of the encoder. Correspondingly, a 3-D interpolation with a ratio of 1 to 8 is used as the post-processing step of the decoder.
A 3-D Linear Decimated Scheme
Let $t_1, t_2$ and $t_3$ be integer indices, and let $n_1, n_2$ and $n_3$ also be integers. The 3-D decimated scheme [12] takes a video $X(t_1, t_2, t_3)$ as input and produces an output $Y(t_1, t_2, t_3)$ decimated by a factor of 2 in each dimension as follows:

$$Y(t_1, t_2, t_3) = \underset{i_1, i_2, i_3 \in \{0, 1\}}{\mathrm{avg}} \big( X(2t_1 + i_1, 2t_2 + i_2, 2t_3 + i_3) \big), \quad \text{for } 0 \le t_i \le n_i - 1,\ i = 1, 2, 3, \tag{18}$$

where avg(·) returns the average (arithmetic mean) of a set of numeric values. Fig. 4 shows the down-sampling of the pre-processing stage using the 3-D linear method. In this figure, a simplified example with two adjacent 4×4 frames, that is, frame N and frame N+1, is illustrated. First, the two adjacent 4×4 frames are divided into four separate 2×2 groups as depicted in Fig. 4(a). Then, each 2×2 group is calculated by (18) to obtain the corresponding decimated data $Y(t_1, t_2, t_3)$ as shown in Fig. 4(b).
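The block averaging in (18) can be sketched as follows. The video is stored as nested lists indexed as `x[t1][t2][t3]`; treating the three indices as row, column and frame is an assumption made for this example, as is the function name `decimate_3d`.

```python
def decimate_3d(x):
    """Average each 2x2x2 block of x[t1][t2][t3], as in (18)."""
    n1, n2, n3 = len(x) // 2, len(x[0]) // 2, len(x[0][0]) // 2
    y = [[[0.0] * n3 for _ in range(n2)] for _ in range(n1)]
    for t1 in range(n1):
        for t2 in range(n2):
            for t3 in range(n3):
                total = sum(x[2 * t1 + i1][2 * t2 + i2][2 * t3 + i3]
                            for i1 in (0, 1) for i2 in (0, 1) for i3 in (0, 1))
                y[t1][t2][t3] = total / 8
    return y


# Two adjacent 4x4 frames: pixel (r, c) of frame f holds the value 4r + c + 16f.
video = [[[4 * r + c + 16 * f for f in range(2)] for c in range(4)]
         for r in range(4)]
small = decimate_3d(video)  # 2x2 spatial positions, one frame
```

Each output sample replaces eight input samples, which gives the 8-to-1 ratio of the scheme.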
A 3-D Linear Interpolated Scheme
Following [9], a symmetric extension scheme is used to solve the boundary condition problem when computing the 3-D interpolation at both boundaries of the image. To illustrate this, consider the 1-D case shown in Fig. 5: if $X_k$ for $k = 0, 1, 2, \ldots, 7$ is the original data, then $X_{-k} = X_k$ and $X_{7+k} = X_{7-k}$ are extended from the left and right boundaries of $X_k$, respectively.
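A minimal Python sketch of this mirroring rule is given here for the 1-D case. The function name `symmetric_extend` and the `pad` parameter are illustrative, and the sketch assumes that `pad` is smaller than the number of samples.

```python
def symmetric_extend(data, pad):
    """Mirror `pad` samples on each side: X[-k] = X[k], X[n-1+k] = X[n-1-k]."""
    n = len(data)
    left = [data[k] for k in range(pad, 0, -1)]           # X[-pad] ... X[-1]
    right = [data[n - 1 - k] for k in range(1, pad + 1)]  # X[n] ... X[n-1+pad]
    return left + list(data) + right


# Eight original samples X0 ... X7 extended by four samples on each side.
extended = symmetric_extend([0, 1, 2, 3, 4, 5, 6, 7], 4)
# extended == [4, 3, 2, 1, 0, 1, 2, 3, 4, 5, 6, 7, 6, 5, 4, 3]
```

The boundary samples X0 and X7 are not repeated, which matches the relations for the extended data given above.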
FIG. 4 A 3-D LINEAR DECIMATED SCHEME
[Figure: original data X0 to X7, with extended data X-4 to X-1 at the left boundary and X8 to X11 at the right boundary]
FIG. 5 SYMMETRIC EXTENSION OF IMAGE DATA
FIG. 6 A 3-D LINEAR INTERPOLATED SCHEME
Using the decimated video $Y(t_1, t_2, t_3)$ obtained from (18), the 3-D reconstructed video [12] can be calculated by the linear interpolation shown in Fig. 6 and given by

$$\hat{X}(t_1, t_2, t_3) = \sum_{k_1=0}^{1} \sum_{k_2=0}^{1} \sum_{k_3=0}^{1} Y(k_1, k_2, k_3)\, R(t_1 - 2k_1, t_2 - 2k_2, t_3 - 2k_3), \quad \text{for } 0 \le t_i \le 3,\ i = 1, 2, 3, \tag{19}$$

where $R(t_1 - 2k_1, t_2 - 2k_2, t_3 - 2k_3)$ is the 3-D linear function defined as

$$R(t_1, t_2, t_3) = R(t_1)\,R(t_2)\,R(t_3) \tag{20}$$

and $R(t)$ is the 1-D linear function given by

$$R(t) = \begin{cases} 1 - |t|/2, & |t| \le 2 \\ 0, & \text{otherwise.} \end{cases} \tag{21}$$

That is, in the 3-D linear interpolated scheme, the reconstructed video points $\hat{X}(t_1, t_2, t_3)$ between the decimated video points are obtained from the values at the decimated video points $Y(t_1, t_2, t_3)$ by the use of (19). A simplified example in which the 2×2 decimated video points $Y(t_1, t_2, t_3)$ are interpolated is depicted in Fig. 6(b), and in Fig. 6(a) a mirrored method based on the above symmetric extension scheme is used to solve the boundary condition problems of computing $\hat{X}(t_1, t_2, t_3)$ in (19). Finally, the two 4×4 reconstructed video frames are obtained as shown in Fig. 6.
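Because the 3-D linear function in (20) is a product of 1-D functions, the interpolation can be applied along one axis at a time. The following Python sketch shows the 1-D step only. It assumes that each decimated sample sits at an even position of the output grid and that one mirrored sample is appended at the right boundary; both are one reading of the scheme rather than details confirmed by the paper, and the function names are illustrative.

```python
def linear_kernel(t):
    """1-D linear function R(t) of (21)."""
    return 1 - abs(t) / 2 if abs(t) <= 2 else 0.0


def interpolate_1d(y):
    """Up-sample y by a factor of 2: x_hat[t] = sum_k y_ext[k] * R(t - 2k)."""
    n = len(y)
    # Assumed symmetric extension at the right boundary: Y[n] = Y[n-2].
    y_ext = list(y) + [y[n - 2] if n > 1 else y[0]]
    return [sum(y_ext[k] * linear_kernel(t - 2 * k) for k in range(n + 1))
            for t in range(2 * n)]


# Two decimated samples are expanded to four reconstructed samples.
restored = interpolate_1d([10.0, 20.0])
# restored == [10.0, 15.0, 20.0, 15.0]
```

Even output positions copy a decimated sample and odd positions take the mean of the two neighbouring samples, so the step needs only additions and shifts in an integer implementation.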
Computation Complexity
In order to illustrate the computation complexity of the proposed 3-D down-scaling scheme, the numbers of multiplications and of additions, subtractions and shifts are estimated in Table 1. In this table, the operations are categorized into two groups: one is the multiplication operation and the other is the addition, subtraction, and shift operations. The estimated operation counts of the proposed 3-D down-scaling scheme in both decimation and interpolation are small (no multiplications are needed), and thus the proposed 3-D down-scaling scheme can be realized quite easily in real time.
TABLE 1. NUMBER OF OPERATIONS FOR 3-D DOWN-SCALING SCHEME.

| Function | Unit | Multiplication | Addition, Subtraction, Shift | Estimation Basis |
|---|---|---|---|---|
| 3-D Linear Decimated Scheme | (4×4)×2 blocks → 2×2 block | 0 | 36 | 3-D decimation of spatial (vertical, horizontal) and temporal directions is performed independently. Vertical direction: (4×2)×2 = 16 additions; horizontal direction: (2×2)×2 = 8 additions; temporal direction: 2×2 = 4 additions; rounding: 2×2 = 4 additions; divided by 8: 2×2 = 4 3-bit shifts. |
| 3-D Linear Interpolated Scheme | 2×2 block → (4×4)×2 blocks | 0 | 120 | 3-D interpolation of spatial (vertical, horizontal) and temporal directions is performed independently. Vertical direction: 4×2 = 8 additions; horizontal direction: 4×4 = 16 additions; temporal direction: (4×4)×2 = 32 additions; rounding: (4×4)×2 = 32 additions; divided by 8: (4×4)×2 = 32 3-bit shifts. |
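The totals in Table 1 follow from adding the per-direction counts listed in the estimation basis. A short Python check, with dictionary keys chosen for illustration:

```python
# Operation counts copied from the "Estimation Basis" column of Table 1.
decimation = {
    "vertical additions": (4 * 2) * 2,
    "horizontal additions": (2 * 2) * 2,
    "temporal additions": 2 * 2,
    "rounding additions": 2 * 2,
    "3-bit shifts": 2 * 2,
}
interpolation = {
    "vertical additions": 4 * 2,
    "horizontal additions": 4 * 4,
    "temporal additions": (4 * 4) * 2,
    "rounding additions": (4 * 4) * 2,
    "3-bit shifts": (4 * 4) * 2,
}

decimation_total = sum(decimation.values())        # 36
interpolation_total = sum(interpolation.values())  # 120
```

Both totals agree with the addition, subtraction and shift column of the table.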
Video Coding Using Non-linear Image Enhancement Compensation
The 3-D down-scaling for video coding provides better performance at lower-bit-rate transmission; however, this method blurs the decoded image. In this section, the proposed nonlinear image enhancement compensation with the cubic B-spline filter is used to improve the decoded quality of the 3-D down-scaling for video coding, as shown in Fig. 7. In this figure, the algorithm applies the 3-D linear decimation (3-D LI) as the encoder and the 3-D linear interpolation (3-D LI) as the decoder for video coding. The proposed nonlinear image enhancement compensation with the cubic B-spline filter is then used as the post-processing step of this decoder.
For this algorithm, an original video (image) sequence in the RGB color space is converted into another preliminary sequence in the YUV [4, 5] color space prior to the 3-D LI processing. The size of the original RGB video sequence in 352×288 resolution is assumed to be 352×288×3×30 bytes for a period of N = 30 frames. After color-space conversion, one set of 352×288×30 bytes is used for Y, and two sets of 176×144×30 bytes are used for the U and V video sequences. In the proposed 3-D LI procedure, the input video sequence is a Y video sequence of size 352×288×30 bytes, and the output video sequence is a decimated video sequence of size 176×144×15 bytes. For the U and V video sequences, each input video sequence has 176×144×30 bytes, so that each decimated output video sequence has 88×72×15 bytes. At the end of the encoder, the three separate Y, U, and V decimated video sequences are combined into one YUV decimated video sequence.
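The sequence sizes quoted above can be tallied with a few lines of Python, which also confirms the 8-to-1 ratio between the YUV sequence and its decimated version. The variable and function names are illustrative.

```python
WIDTH, HEIGHT, FRAMES = 352, 288, 30

rgb_bytes = WIDTH * HEIGHT * 3 * FRAMES                  # original RGB sequence
y_bytes = WIDTH * HEIGHT * FRAMES                        # Y sequence
uv_bytes = 2 * (WIDTH // 2) * (HEIGHT // 2) * FRAMES     # U and V sequences


def decimated_bytes(width, height, frames):
    """Size after halving the width, the height and the number of frames."""
    return (width // 2) * (height // 2) * (frames // 2)


y_dec = decimated_bytes(WIDTH, HEIGHT, FRAMES)                # 176 x 144 x 15
uv_dec = 2 * decimated_bytes(WIDTH // 2, HEIGHT // 2, FRAMES)  # 2 x (88 x 72 x 15)
ratio = (y_bytes + uv_bytes) / (y_dec + uv_dec)
```

With these sizes the decimated YUV sequence occupies 570,240 bytes, compared with 4,561,920 bytes before decimation and 9,123,840 bytes for the original RGB sequence.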
In the decoder, two processes are used that reverse the encoding steps. In the first step, the received (decimated) YUV video sequence is separated into three separate Y, U, and V decimated video sequences. Then, the proposed 3-D LI process uses linear interpolation to reconstruct the video sequences. After this interpolation, the size of the Y video sequence is converted from 176×144×15 bytes to 352×288×30 bytes, and the U and V video sequences are increased from 88×72×15 bytes to 176×144×30 bytes. In the second step, the proposed image enhancement compensation method, described in Section 3, is used for the Y video sequence only; it is not used for the U and V video sequences. The three Y, U, and V video sequences are then combined again into one YUV format. Finally, following [4, 5], this YUV video sequence is converted into the reconstructed RGB video sequence of size 352×288×3×30 bytes.
FIG. 7 VIDEO CODING USING NONLINEAR IMAGE ENHANCEMENT COMPENSATION
Computer Simulation
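Let $X(i, j)$ and $\hat{X}(i, j)$ denote the original and reconstructed images of size $M \times N$, respectively. The mean square error (MSE) between $X(i, j)$ and $\hat{X}(i, j)$ is given by

$$MSE = \left( \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \big( X(i, j) - \hat{X}(i, j) \big)^2 \right) / (M \times N). \tag{22}$$

Thus the peak signal-to-noise ratio (PSNR) between $X(i, j)$ and $\hat{X}(i, j)$ is defined by

$$PSNR\,(dB) = 10 \log_{10} (b^2 / MSE), \tag{23}$$

where $b$ is the largest value of the image signal (typically 255 for 8 bits of gray level).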
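The two quality measures in (22) and (23) can be sketched in Python for images stored as lists of rows. The function names are illustrative, and returning infinity for identical images is a convention chosen here, not one stated in the paper.

```python
import math


def mse(x, x_hat):
    """Mean square error of (22) between two images of equal size."""
    rows, cols = len(x), len(x[0])
    total = sum((a - b) ** 2
                for row, row_hat in zip(x, x_hat)
                for a, b in zip(row, row_hat))
    return total / (rows * cols)


def psnr(x, x_hat, b=255):
    """PSNR in dB of (23); infinite when the two images are identical."""
    err = mse(x, x_hat)
    return math.inf if err == 0 else 10 * math.log10(b**2 / err)


original = [[100, 100], [100, 100]]
decoded = [[101, 99], [100, 100]]
quality = psnr(original, decoded)  # MSE = 0.5, about 51.14 dB
```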
In this section, experimental results for three gray images (Aerial, Baboon, Barbara) in 512×512 resolution, three color images (Lena, Peppers, Sailboat) in 512×512 resolution, and three video sequences (Mother, Stefan, Table) in 352×288 resolution are presented by computer simulations. First, the gray and color images are blurred to low-resolution images; then the proposed method and the methods in [1] and [2] are used to enhance these low-resolution images. The PSNR values of these experimental results are compared in Table 2 and Table 3. Furthermore, the 3-D down-scaling for video coding is used along with the proposed method and the methods in [1] and [2] to enhance the reconstructed images of the above three video sequences, and the PSNR values of these experimental results are compared in Table 4. In Tables 2, 3 and 4, the quality of the reconstructed images using the proposed method is better than that of the methods in [1] and [2].
Note that the scaling and clipping parameters s = 5, c = 0.4 are selected for both gray and color images, whereas s = 3, c = 0.45 are chosen for the 3-D down-scaling video coding, because these parameters give better PSNR results.
TABLE 2. PSNR (DB) OF GRAY ENHANCED IMAGE OF SIZE 512×512 FOR GAUSSIAN [1], FSD [2], AND PROPOSED METHODS.

| Image Name | Blurred Image | Gaussian method [1] ($0.04\,G_{0\max}$) | FSD method [2] (s=3, c=0.45) | FSD method [2] (s=5, c=0.4) | Proposed method (s=5, c=0.4) |
|---|---|---|---|---|---|
| Aerial | 25.57 | 25.81 | 27.94 | 25.12 | 28.46 |
| Baboon | 21.88 | 22.20 | 23.49 | 22.59 | 23.50 |
| Barbara | 24.68 | 24.54 | 25.58 | 24.87 | 25.72 |
TABLE 3. PSNR (DB) OF COLOR ENHANCED IMAGE (Y) OF SIZE 512×512 FOR GAUSSIAN [1], FSD [2], AND PROPOSED METHODS.

| Image Name | Blurred Image | Gaussian method [1] ($0.04\,G_{0\max}$) | FSD method [2] (s=3, c=0.45) | FSD method [2] (s=5, c=0.4) | Proposed method (s=5, c=0.4) |
|---|---|---|---|---|---|
| Lena | 31.97 | 30.79 | 34.36 | 30.79 | 34.81 |
| Peppers | 29.80 | 28.97 | 31.18 | 29.28 | 31.44 |
| Sailboat | 27.62 | 27.43 | 29.66 | 26.97 | 29.93 |
TABLE 4. PSNR (DB) OF ENHANCED VIDEO SEQUENCE (Y) OF SIZE 352×288 BY 3-D DOWN-SCALING FOR GAUSSIAN [1], FSD [2], AND PROPOSED METHODS.

| Sequence Name | 3-D LI / 3-D LI | Gaussian method [1] ($0.04\,G_{0\max}$) | FSD method [2] (s=3, c=0.45) | FSD method [2] (s=5, c=0.4) | Proposed method (s=3, c=0.45) |
|---|---|---|---|---|---|
| Mother | 33.97 | 32.74 | 33.74 | 29.93 | 35.19 |
| Stefan | 21.43 | 21.60 | 21.93 | 20.64 | 22.04 |
| Table | 24.07 | 24.26 | 24.54 | 22.92 | 24.94 |