
Robust Subband Video Coding with Leaky Prediction

Arild Fuldseth and Tor A. Ramstad


Department of Telecommunications,
Norwegian University of Science and Technology (NTNU)
N-7034 Trondheim, Norway
Telephone: +47 73 59 26 50. Fax: +47 73 59 26 40.
email: {fuldseth,tor}@tele.unit.no

ABSTRACT

The use of leaky prediction in a video transmission system designed for noisy
channels is investigated. For the proposed system, graceful degradation is
achieved by finding good index maps between the quantized source coder
parameters and the amplitude levels of a QAM signal constellation. The
contribution of this paper, however, is to investigate the use of leaky
prediction to further increase the robustness to channel noise. It is shown
that by using a temporal prediction coefficient slightly less than one, large
improvements can be achieved with only a small degradation for the noiseless
case. The performance of the proposed system is comparable to a reference
system based on the H.263 video coder for high CSNR values, and degrades far
more gracefully for low CSNR values.

1. INTRODUCTION

Recently, there has been much interest in combined source coding and channel
coding/modulation for transmission over noisy channels. In [1], the index
assignment problem was addressed by using simulated annealing and assuming a
binary symmetric channel (BSC). The idea behind this method was to assign
binary codewords to the quantizer indices so as to minimize the effects of
channel errors. In [2, 3, 4], the simulated annealing method was modified to
optimize the index assignment (index map) with multilevel modulation instead
of a BSC, and with the transmitted power as an additional constraint. In
particular, the method was used to design a robust system for still image
transmission by optimizing the index maps from the source coder parameter
space to the amplitude levels of a quadrature amplitude modulation (QAM)
signal constellation (e.g. 64-QAM). Furthermore, in [5] the techniques in
[2, 3, 4] were extended to video coding. In this work, we address the problem
of channel error propagation due to the temporal prediction feedback loop of
the video coder structure. In particular, the robustness of the video
transmission system of [5] is increased further by using leaky interframe
prediction. This reduces the propagation of channel errors to subsequent
frames at the cost of less prediction gain. Thus, for a given channel
signal-to-noise ratio (CSNR), there is an optimal value of the interframe
prediction coefficient $\alpha$ corresponding to the best possible trade-off
between the amount of error propagation and the prediction gain. In this
paper, approximate optimal values of $\alpha$ for various CSNR values are
found experimentally. Furthermore, for the cases where the CSNR is not known
at the encoder side, the effects of channel mismatch are addressed. Finally,
the proposed system is compared to a traditional video communication system
based on the H.263 video coder.

2. SYSTEM DESCRIPTIONS

2.1. Reference System

The reference system, which is based on the H.263 video coder and QAM, is
illustrated in Figure 1. The H.263 coder was implemented according to [6]
without any of the optional coding techniques, and without the chrominance
information. In addition to the specifications given in [6], resynchronization
based on the group of blocks (GOB) header as well as error detection and error
concealment techniques are used in the decoder. Error detection is based on
detection of illegal codewords, while error concealment is performed by
repeating information from the previous GOB, or from the previous frame,
whenever an error is detected. Furthermore, in order to reduce the propagation
of errors through subsequent frames, $N_I$ intrablocks are introduced
systematically in each frame at the encoder side. Note that the effects of
introducing intrablocks systematically are similar to the effects of leaky
prediction, since the error propagation for noisy channels is reduced at the
cost of less coding efficiency for a noiseless channel. The output bit stream
from the H.263 coder is mapped to a 16-QAM signal constellation by traditional
Gray coding. Note that since the H.263 coder uses variable length coding, it
is not possible to take the significance or the meaning of the individual bits
into account when choosing the mapping.

Figure 1. Reference system (transmitter only).
[Block diagram: video frames, H.263 encoder, bit stream, Gray coding, QAM
symbols.]

Figure 2. Proposed system (transmitter only). The normalization and
renormalization operations are assumed to be included in VQ1-VQ3.
[Block diagram; recoverable labels: video frames, analysis filterbank,
classification, class 0, VQ1-VQ3, MAP1-MAP3, MAP, block classification,
quantizer scale factors, motion vectors, QAM symbols, synthesis filterbank,
motion compensation, frame delay, motion estimation.]

2.2. Proposed System

The proposed system, illustrated in Figure 2, uses subband decomposition and
temporal prediction with motion compensation as in H.263. The subband signals
are coded by subdividing each subband into blocks of 4 × 4 samples. Each
block is
classified into one of four classes according to its mean squared value, and
the subband samples belonging to class 1-3 are normalized using the
corresponding class mean squared value. Next, the normalized subband samples
are quantized using vector quantizers (VQs) of three different rates (measured
in bits per source sample), with the highest rate for the class corresponding
to the highest mean squared value. The blocks having the smallest mean squared
value (class 0) are neither quantized nor transmitted.

Next, the VQ output indices are mapped to the amplitude levels of the QAM
channel symbols by the index maps. The index maps were designed to have the
following two properties: First, in order to minimize the effects of channel
errors, points that are close in the QAM signal constellation (channel space)
should correspond to points (codebook vectors) that are close in the source
space. Second, in order to maximize the minimum distance, $d_{\min}$, of the
signal constellation for a fixed transmitted power, the most probable codebook
vectors should be mapped to the QAM points having the smallest amplitude. In
this work, simulated annealing was used to design the index maps as described
in [5].

For the proposed system, the choices of source vector dimension (L), codebook
size (N), and channel space dimension (K) for each of the three classes are
listed in Table 1. The QAM signal constellations are chosen with an odd number
of amplitudes in each dimension. This allows for power efficient transmission
by assigning the codebook vector having the highest probability to a channel
symbol with zero amplitude. As an example, consider class 2. The subband
samples belonging to this class are divided into vectors of length L = 3, and
quantized using a VQ codebook of size N = 81. The codebook indices are then
mapped to an 81-QAM signal constellation (K = 2) by the corresponding index
map.

    class    L    N    K
    1        2    9    1
    2        3    81   2
    3        1    15   1

Table 1. Source vector dimension (L), codebook size (N), and channel space
dimension (K) for each class. The case K = 1 corresponds to the real or
imaginary axis of a QAM constellation.

The side information parameters consisting of the motion vectors, the block
classification table, and the quantizer scale factors are particularly
sensitive to channel noise. Also, robust mappings are difficult to design in
this case, since the neighbor properties for these parameters are not well
defined. In order to reduce the error rate for the side information, a sparse
subset of an 81-QAM signal constellation is used. This reduces the error rate
significantly for a given noise level. In addition, power efficient
transmission is achieved by mapping the most probable values of the side
information parameters to the QAM symbols having the smallest amplitudes.

3. LEAKY PREDICTION

As mentioned above, the optimal value of the leaky prediction coefficient is a
result of a trade-off between limiting the channel noise propagation and
maximizing the interframe prediction gain. In [7], a similar optimization
problem for a DPCM system with scalar quantization was studied. In the case of
a 1st order predictor, it was shown that for a temporal correlation
coefficient $\rho$, the normalized reconstruction error can be expressed as

$$\epsilon = \left[\epsilon_q + \epsilon_m + \frac{\epsilon_n}{1 - \alpha^2}\right]\left[1 + \alpha^2 - 2\alpha\rho\right] \qquad (1)$$

where $\epsilon_q$ and $\epsilon_n$ correspond to the quantization and the
channel noise respectively, and $\epsilon_m$ is a mutual term.

Thus, the optimal value of $\alpha$ must satisfy the following fifth order
equation

$$(1 - \alpha^2)^2(\alpha - \rho)(\epsilon_q + \epsilon_m) - (\rho\alpha^2 - 2\alpha + \rho)\,\epsilon_n = 0 \qquad (2)$$

Assuming that the quantizer as well as the source and channel statistics are
known, the optimal value of the prediction coefficient, $\alpha_{opt}$, can be
found numerically. A special case is for very noisy channels, i.e.
$\epsilon_n \gg \epsilon_q + \epsilon_m$. In this case we have [7]

$$\alpha_{opt} \simeq \frac{1 - \sqrt{1 - \rho^2}}{\rho} \qquad (3)$$

A similar theoretical analysis can be applied to the proposed system, but some
modifications are needed to take the subband decomposition and the
classification scheme into account. This will be a topic for further research.
In this paper the optimal values of $\alpha$ were estimated experimentally.

4. EXPERIMENTS

All our experiments were performed by coding 400 luminance frames of the QCIF
Foreman sequence, and by simulating transmission over an additive white
Gaussian noise (AWGN) channel with maximum likelihood detection on the
receiver side. For the reference system, the simulations were conducted by
assuming error free transmission of the picture header, ensuring correct
resynchronization of the bit pattern at the beginning of each frame.
Furthermore, the H.263 coder was implemented without rate control, using the
average channel rate as a reference. All our experiments were conducted by
averaging over 20 separate simulations of the entire sequence using different
random noise sequences.

In the first experiment, the reference system was simulated with a quantizer
step size of 12 (QP = 6) and with various values of $N_I$ (the number of intra
blocks per frame), resulting in various bit rates and channel symbol rates.
The average number of bits per pixel (bpp) and QAM symbols per pixel (spp) for
each value of $N_I$ are shown in Table 2. In Figure 3, the resulting peak
signal-to-noise ratio (PSNR) of the reconstructed sequence is evaluated for
various values of $N_I$ and the CSNR. As can be seen from the figure, the
robustness to channel noise increases slightly with an increased value of
$N_I$.

    N_I    bpp      spp
    0      0.338    0.085
    1      0.348    0.087
    3      0.367    0.092
    9      0.425    0.106

Table 2. Average bpp and spp for various values of $N_I$.

Next, the proposed system was simulated with various values of $\alpha$ at a
channel rate of 0.092 spp. The results are illustrated in Figure 4. From the
figure, we observe that the robustness to channel noise can be improved
significantly by reducing $\alpha$ from 1.0 to e.g. 0.99. On the other hand,
the corresponding performance loss for high CSNR values is small. Furthermore,
optimal values of $\alpha$ can be estimated as follows. At CSNR values above
21 dB, $\alpha$ should be above 0.99, approaching 1.0 as the CSNR increases.
As the CSNR is reduced from 21 dB to 16 dB, $\alpha_{opt}$ decreases to about
0.90. Note that $\alpha_{opt}$ does not approach zero as CSNR decreases to 0.
This can be seen from equation 3. Typically, the temporal correlation
coefficient $\rho$ could be as high as 0.99 or higher. The resulting value of
$\alpha_{opt}$ is in the order of 0.9 (equation 3). Although equation 3 does
not apply directly to the problem at hand, it indicates an optimal value of
$\alpha$ that is much closer to 1.0 than to 0.0 as CSNR approaches zero.

Finally, the proposed system was compared to the reference system. The
reference system was simulated with $N_I = 3$ resulting in a QAM symbol rate
of 0.092 spp. The proposed system was simulated at the same spp and with
$\alpha = 0.99$. The equivalent bit rate in terms of bpp for the proposed
system is higher than for the reference system since the power efficient index
maps allow for larger signal constellations (e.g. 81-QAM vs. 16-QAM). This is
necessary in order to compensate for the less efficient fixed length coding
strategy of the proposed system. Note however, that since the performances of
the two systems are compared for the same channel symbol rate and CSNR, the
equivalent bpp is irrelevant. The results are illustrated in Figure 5. The
performances of the two systems are quite close at CSNR values down to about
20 dB. Below this point, the reference system degrades abruptly, while the
proposed system degrades gracefully.
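
As a numerical check on the value quoted from equation 3, the following minimal Python sketch evaluates the very-noisy-channel limit and also minimizes the reconstruction error of equation 1 by a simple grid search. It is an illustration only, not the procedure used in the experiments: `alpha` and `rho` stand for the prediction and temporal correlation coefficients, the channel noise term of equation 1 is taken as divided by (1 - alpha^2), which is the form consistent with equation 2, and the noise levels passed to the functions are hypothetical values.

```python
import math


def reconstruction_error(alpha, rho, eps_q, eps_m, eps_n):
    """Equation 1: normalized reconstruction error, valid for 0 <= alpha < 1."""
    noise_terms = eps_q + eps_m + eps_n / (1.0 - alpha ** 2)
    prediction_error = 1.0 + alpha ** 2 - 2.0 * alpha * rho
    return noise_terms * prediction_error


def alpha_opt_noisy_limit(rho):
    """Equation 3: optimum when channel noise dominates (eps_n >> eps_q + eps_m)."""
    return (1.0 - math.sqrt(1.0 - rho ** 2)) / rho


def alpha_opt_numeric(rho, eps_q, eps_m, eps_n, steps=20000):
    """Grid search over [0, 1) for the alpha that minimizes equation 1."""
    best_alpha, best_err = 0.0, float("inf")
    for i in range(steps):
        alpha = i / steps
        err = reconstruction_error(alpha, rho, eps_q, eps_m, eps_n)
        if err < best_err:
            best_alpha, best_err = alpha, err
    return best_alpha


if __name__ == "__main__":
    rho = 0.99  # temporal correlation value quoted in the text
    print("equation 3 limit:", round(alpha_opt_noisy_limit(rho), 4))
    # Hypothetical noise levels: quantization noise fixed, channel noise swept.
    for eps_n in (0.0, 0.001, 0.01, 0.1, 1.0, 100.0):
        print(eps_n, round(alpha_opt_numeric(rho, 1.0, 0.0, eps_n), 4))
```

Under these assumptions the optimum moves from `rho` (no channel noise) towards the equation 3 limit of roughly 0.87 as the channel noise grows, and never approaches zero, which matches the qualitative behaviour described above.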

Figure 3. PSNR vs. CSNR for various values of $N_I$: $N_I = 0$ (solid line),
$N_I = 1$ (– – –), $N_I = 3$ (–––), $N_I = 9$ (· · · ·).
[Plot: PSNR (dB), 20 to 35, versus CSNR (dB), 12 to 24.]

Figure 4. PSNR vs. CSNR for various values of $\alpha$. (From top to bottom at
CSNR = 24 dB: $\alpha$ = 1.0, 0.99, 0.98, 0.95, 0.9, 0.85, 0.80.)
[Plot: PSNR (dB), 20 to 35, versus CSNR (dB), 12 to 24.]

Figure 5. PSNR vs. CSNR: reference system, $N_I = 3$ (solid line); proposed
system, $\alpha = 0.99$ (- - -).
[Plot: PSNR (dB), 20 to 35, versus CSNR (dB), 12 to 24.]

5. CONCLUSIONS

For the proposed video transmission system we have demonstrated how the
robustness to channel noise can be improved by using leaky interframe
prediction matched to the channel statistics. For the cases where the channel
statistics are not known on the encoder side, the robustness can be increased
significantly at the cost of a small quality reduction for noiseless channels.
Finally, the proposed system compares favorably to a reference system based on
the H.263 video coder due to graceful degradation for poor channel conditions.
The inherent robustness to channel noise makes the proposed system
particularly useful for video transmission over time varying wireless channels
such as in mobile and broadcasting applications. The ideas behind the proposed
system might also be of interest for the functionalities of the new MPEG4
video coding standard for applications requiring error resilience.

6. ACKNOWLEDGEMENT

The authors want to thank Gisle Bjøntegaard at Telenor Research for suggesting
the error detection and concealment techniques for the H.263 video coder,
making it a realistic and challenging reference.

REFERENCES

[1] N. Farvardin, "A study of vector quantization for noisy channels," IEEE
    Trans. Inform. Theory, vol. 36, pp. 799-809, July 1990.
[2] J. M. Lervik, A. Fuldseth, and T. A. Ramstad, "Combined image subband
    coding and multilevel modulation for communication over power- and
    bandwidth limited channels," in Proc. Workshop on Visual Signal Processing
    and Communications, (New Brunswick, NJ, USA), pp. 173-178, IEEE,
    Sept. 1994.
[3] A. Fuldseth and J. M. Lervik, "Combined source and channel coding for
    channels with a power constraint and multilevel signaling," in Proc. ITG
    Conference, (München, Germany), pp. 429-436, ITG, Oct. 1994.
[4] J. M. Lervik, A. Grøvlen, and T. A. Ramstad, "Robust digital signal
    compression and modulation exploiting the advantages of analog
    communication," in Proc. IEEE GLOBECOM, (Singapore), pp. 1044-1048, IEEE,
    Nov. 1995.
[5] A. Fuldseth and T. A. Ramstad, "Combined video coding and multilevel
    modulation," in Proc. Int. Conf. on Image Processing, (Lausanne,
    Switzerland), Sept. 1996. To appear.
[6] Telecommunication Standardization Sector, Study Group 15, Working Party
    15/1, Expert's Group on Very Low Bitrate Videophone, Video Codec Test
    Model, TMN5, Jan. 1995. Source: Telenor Research.
[7] K. Y. Chang and R. W. Donaldson, "Analysis, optimization, and sensitivity
    study of differential PCM systems operating on noisy communication
    channels," IEEE Trans. Commun., vol. COM-20, pp. 338-350, June 1972.
