Karin Engdahl, Student Member, IEEE and Kamil Sh. Zigangirov, Member, IEEE3
1 This work was supported in part by Swedish Research Council for Engineering Sciences under
Grant 95-164
2 Part of this work was presented at the IEEE International Symposium on Information Theory,
Ulm, Germany, June 29 - July 4, 1997
3 Both authors are at the Department of Information Technology, Telecommunication Theory
Group, Lund University, Box 118, S-221 00 Lund, Sweden, Phone +46 46 222 3450 and +46 46
222 4460, Fax +46 46 222 4714, e-mail karin@it.lth.se and kamil@it.lth.se
I. INTRODUCTION
The principle of trellis-coded modulation was described by Ungerboeck in his
paper of 1982 [8], and the concept of multilevel modulation was introduced by
Imai and Hirakawa [6]. In this method the channel signal set is successively binary
partitioned using the set partitioning rule where the binary labels of the branches
from one level of the partition chain to the next are encoded by independent binary
codes. The multilevel scheme enables the usage of a suboptimal multistage decoder
[6], [7], which performs almost as well as an optimum joint maximum likelihood
sequence estimator over all levels [4], but is much less complex.
In this paper we will study a multilevel modulation scheme using QAM-signaling,
transmitting over a discrete memoryless Gaussian channel and employing a multi-
stage decoder. To further reduce complexity we use a suboptimal metric in each
decoding stage. This is shown to generate very slight performance loss in terms of
capacity.
In the following section we introduce the QAM multilevel modulation scheme,
which is a modified version of the one proposed in [6]. The channel characteristics
and the decoding procedure are also given. In Section III we upper-bound the block
and burst error probabilities, for block and convolutional codes respectively. The
calculation of upper bounds for these probabilities reduces to the calculation of code
generating functions with the Chernoff bounding parameter $Z$ as argument. This
parameter is a function of the intra-set squared Euclidean distance $\delta_k^2$ on level $k$, the
noise variance $\sigma^2$ and the number of signal points on the corresponding level of set
partitioning.
For 2-QAM transmission (the last level of the multilevel QAM scheme) $Z =
\exp(-\delta_k^2/8\sigma^2)$. This value is a lower bound on $Z$ for the other levels of the
scheme, and it is often used as an approximation of $Z$, see for example [1], although with
this approximation the bounds on the error probability lose their validity. The inaccuracy
increases with the ratio $\delta_k^2/\sigma^2$. An accurate upper bound on $Z$ for any level of
the QAM scheme is $Z = 4\exp(-\delta_k^2/8\sigma^2)$. This is a consequence of the "nearest
neighbor error events principle" [4], [7], and the fact that each signal point of an
$M$-QAM set has at most 4 nearest neighbors. This value of $Z$ yields loose bounds on the
error probabilities, especially for small values of $\delta_k^2/\sigma^2$. In the case of
16-QAM the average number of nearest neighbors is 3, so here the estimate $Z =
3\exp(-\delta_k^2/8\sigma^2)$ can be used [5], and for 4-QAM each signal point has 2 nearest
neighbors, so $Z = 2\exp(-\delta_k^2/8\sigma^2)$ is an approximation of $Z$.
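The relation between these estimates can be made concrete with a short numerical sketch. It takes the ratio $\rho = \delta_k/\sigma$ as its only input; the function names are illustrative and do not come from the paper.

```python
import math


def z_two_qam(rho: float) -> float:
    """Z for 2-QAM: exp(-delta_k^2 / (8 sigma^2)), with rho = delta_k / sigma."""
    return math.exp(-rho * rho / 8.0)


def z_estimates(rho: float) -> dict:
    """Lower bound, nearest-neighbor estimates and upper bound on Z.

    The multipliers 2, 3 and 4 are the nearest-neighbor counts quoted in the
    text for 4-QAM, 16-QAM (average) and general M-QAM (maximum).
    """
    z2 = z_two_qam(rho)
    return {
        "lower_bound": z2,
        "approx_4qam": 2.0 * z2,
        "approx_16qam": 3.0 * z2,
        "upper_bound": 4.0 * z2,
    }


if __name__ == "__main__":
    for rho in (2.0, 4.0, 6.0):
        print(rho, z_estimates(rho))
```

For small ratios the value $4\exp(-\delta_k^2/8\sigma^2)$ exceeds 1, which is one way to see why the corresponding error bounds are loose there.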
In Sections IV and V we calculate better upper bounds on $Z$ that tighten
the error bounds without causing them to lose their rigor. This is done using the
Chernoff bounding method, which gives exponentially tight bounds [11]. The last
but one level, where the signal set is 4-QAM, is considered in Section IV, and in
Section V we calculate $Z$ for a QAM signal set with an infinite number of points.
The assumption of $M$-QAM with $M = \infty$ is a mathematical abstraction, but it
is a good approximation of an $M$-QAM multilevel coded modulation scheme with
$M < \infty$, at least for large $M$. Finally, in Section VI we calculate the capacity of
this multilevel modulation scheme using QAM-signaling and this type of suboptimal
decoder, and compare it to the capacity of the same scheme using the optimal metric in
each decoding stage.
II. SYSTEM DESCRIPTION
The transmitter and receiver described in Figure 1 are a generalization of the
scheme of Imai and Hirakawa [6] to multilevel QAM. A binary information sequence
$u$ is partitioned into $K$ binary subsequences $u^{(1)}, u^{(2)}, \ldots, u^{(K)}$, where each
subsequence is encoded by an independent binary component code $C_k$ (block or
convolutional). A set of $K$ bits, $\{v^{(1)}(n), v^{(2)}(n), \ldots, v^{(K)}(n)\}$, one bit from each
code sequence $v^{(1)}, v^{(2)}, \ldots, v^{(K)}$, is synchronously mapped onto one of the
$2^K$-QAM signal points, $s(n)$. The $K$-level partitioning of the signal set is a sequence
of $K$ partitions $S^{(0)}/S^{(1)}/\ldots/S^{(K)}$. The mapping by set partitioning is illustrated
in Figure 2 for $K = 4$. The squared Euclidean intra-set distance on the $k$th level of
partitioning is $\delta_k^2 = 2^{k-1}\delta_1^2$, $k = 1, 2, \ldots, K$. Assuming equiprobable signal points,
the total average signal energy per channel use is $E_s = (2^K - 1)\,\delta_1^2/6$
when $K$ is even, and $E_s = (2^{K+1} - 1)\,\delta_1^2/12$ when $K$ is odd.
When passed through the discrete memoryless Gaussian channel, the complex input
sequence $s = s(1), s(2), \ldots, s(n), \ldots$ is corrupted by the error sequence
$e = e(1), e(2), \ldots, e(n), \ldots$ such that the complex received sequence is $r = s + e$.
Here $s(n) = a(n) + jb(n)$ is the channel input at the $n$th moment of time, with

$$a(n), b(n) \in \left\{\pm 1, \pm 3, \ldots, \pm\left(2^{K/2} - 1\right)\right\}\frac{\delta_1}{2}$$

when $K$ is even, and

$$a(n), b(n) \in \left\{\pm 1, \pm 3, \ldots, \pm\left(2^{(K+1)/2} - 1\right)\right\}\frac{\delta_1}{2\sqrt{2}}
\quad \text{s.t.} \quad
a(n) + b(n) \in \left\{0, \pm 4, \pm 8, \ldots, \pm 4\left(2^{(K-1)/2} - 1\right)\right\}\frac{\delta_1}{2\sqrt{2}}$$

when $K$ is odd. Furthermore $e(n) = e^{(I)}(n) + je^{(Q)}(n)$, where $e^{(I)}(n)$ and $e^{(Q)}(n)$ are
independent Gaussian random variables with zero mean and variance $\sigma^2$.
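As a sanity check on the signal set description and the average energy expressions, the following sketch enumerates the constellation for a given $K$ and compares its mean energy with $E_s$. The function names and the default $\delta_1 = 1$ are assumptions made for illustration only.

```python
import itertools
import math


def qam_points(K: int, delta1: float = 1.0) -> list:
    """Enumerate the 2^K-QAM signal set as complex numbers a + jb."""
    if K % 2 == 0:
        m = 2 ** (K // 2)
        odd = range(-(m - 1), m, 2)          # -(m-1), ..., -1, 1, ..., m-1
        scale = delta1 / 2.0
        return [complex(a * scale, b * scale) for a in odd for b in odd]
    m = 2 ** ((K + 1) // 2)
    odd = range(-(m - 1), m, 2)
    scale = delta1 / (2.0 * math.sqrt(2.0))
    bound = 4 * (2 ** ((K - 1) // 2) - 1)
    # Keep only the points whose coordinate sum is a multiple of 4 (odd K).
    return [
        complex(a * scale, b * scale)
        for a in odd
        for b in odd
        if (a + b) % 4 == 0 and abs(a + b) <= bound
    ]


def average_energy(points: list) -> float:
    """Mean squared magnitude, assuming equiprobable signal points."""
    return sum(abs(p) ** 2 for p in points) / len(points)


def energy_formula(K: int, delta1: float = 1.0) -> float:
    """Closed-form E_s for even and odd K as stated in the text."""
    if K % 2 == 0:
        return (2 ** K - 1) * delta1 ** 2 / 6.0
    return (2 ** (K + 1) - 1) * delta1 ** 2 / 12.0


def minimum_distance(points: list) -> float:
    """Smallest Euclidean distance between two distinct signal points."""
    return min(abs(p - q) for p, q in itertools.combinations(points, 2))


if __name__ == "__main__":
    for K in range(1, 7):
        pts = qam_points(K)
        print(K, len(pts), average_energy(pts), energy_formula(K))
```

For every $K$ the enumeration should return $2^K$ points with minimum distance $\delta_1$, which matches $\delta_k^2 = 2^{k-1}\delta_1^2$ at $k = 1$.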
The multistage decoder consists of a set of suboptimal decoders matched to the
codes used on the corresponding levels of encoding. Each decoding stage consists of
calculation of distances (metrics) to the received sequence r from all possible code
words on the corresponding level of set partitioning. The side information from
the previous decoding stages determines, according to the set partitioning structure
(illustrated in Figure 2), the signal set upon which the metrics are calculated.
When calculating the metrics, the decoder uses the following suboptimal principle.
Let us suppose that a binary block code (the extension to convolutional
codes is straightforward) of length $N$ is used on the $k$th level of the encoding,
and that the decoding on the previous $(k-1)$ decoding stages determines the
subsets $S^{(k-1)}(1), S^{(k-1)}(2), \ldots, S^{(k-1)}(N)$, to which the transmitted symbols of
the codeword $v^{(k)} = v^{(k)}(1), v^{(k)}(2), \ldots, v^{(k)}(N)$, $v^{(k)}(n) \in \{0, 1\}$, belong. Let
$s^{(k)}(n) \in S^{(k-1)}(n)$, $n = 1, 2, \ldots, N$, and $s^{(k)} = s^{(k)}(1), s^{(k)}(2), \ldots, s^{(k)}(N)$.
Let $S_0^{(k-1)}(n)$ and $S_1^{(k-1)}(n)$ be the subsets of $S^{(k-1)}(n)$ corresponding to
transmission of $v^{(k)}(n) = 0$ and $v^{(k)}(n) = 1$ respectively. Finally let
$S^{(k-1)}_{v^{(k)}} = S^{(k-1)}_{v^{(k)}(1)}(1), S^{(k-1)}_{v^{(k)}(2)}(2), \ldots, S^{(k-1)}_{v^{(k)}(N)}(N)$
be the sequence of subsets corresponding to transmission
of the codeword $v^{(k)}$. Then the distance (metric) between the received sequence
$r = r(1), r(2), \ldots, r(N)$ and the codeword $v^{(k)}$ is determined as

$$\mu\left(r, v^{(k)}\right) = \min_{s^{(k)} \in S^{(k-1)}_{v^{(k)}}} d_E\left(r, s^{(k)}\right), \tag{1}$$

where $d_E(x, y)$ means the squared Euclidean distance between the $N$-dimensional
vectors $x$ and $y$. The decoding consists of choosing the codeword $v^{(k)}$ for which the
metric $\mu(r, v^{(k)})$ above is minimal.
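The metric in (1) separates into a sum of per-symbol minima, because the minimization runs over a product of subsets. A minimal sketch of one decoding stage under that observation is given below; the data layout (a pair of subsets per symbol position) and all names are assumptions made for the example, and the exhaustive search over the codebook is only practical for very small codes.

```python
from typing import Sequence, Tuple

Subset = Sequence[complex]


def squared_distance(x: complex, y: complex) -> float:
    """Squared Euclidean distance between two points in the complex plane."""
    return abs(x - y) ** 2


def suboptimal_metric(
    received: Sequence[complex],
    subsets: Sequence[Tuple[Subset, Subset]],
    codeword: Sequence[int],
) -> float:
    """Metric (1): per symbol, distance to the nearest point of the subset
    selected by the code bit, summed over the block.

    subsets[n] holds (S0, S1), the two halves of the set fixed by the
    previous decoding stages at position n.
    """
    total = 0.0
    for r_n, (s0, s1), bit in zip(received, subsets, codeword):
        candidates = s1 if bit else s0
        total += min(squared_distance(r_n, s) for s in candidates)
    return total


def decode(received, subsets, codebook):
    """Pick the codeword with the smallest metric (exhaustive search)."""
    return min(codebook, key=lambda v: suboptimal_metric(received, subsets, v))


if __name__ == "__main__":
    # Hypothetical 4-QAM level: S0 and S1 are the two diagonal pairs.
    s0 = (1 + 1j, -1 - 1j)
    s1 = (1 - 1j, -1 + 1j)
    subsets = [(s0, s1)] * 3
    codebook = [(0, 0, 0), (1, 1, 1)]      # length-3 repetition code
    r = [0.9 + 1.1j, -1.2 - 0.8j, 0.2 + 0.1j]
    print(decode(r, subsets, codebook))
```

The metric is suboptimal because only the nearest point of each subset contributes; an optimal metric would account for all points of the subset.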
exponentially tight bound [11]. From (2) and (3) we get for linear block codes [10]
$$P\left(\varepsilon^{(k)}\right) \le \sum_{l=1}^{L-1} \left(Z^{(k)}\right)^{w_l^{(k)}}
= \sum_{w=d_{\min}^{(k)}}^{N} a_w^{(k)} \left(Z^{(k)}\right)^{w}
= G^{(k)}(D)\Big|_{D=Z^{(k)}}, \tag{4}$$

where $\{a_w^{(k)}\}$ is the weight distribution of the code on the $k$th decoding level, i.e.
$a_w^{(k)}$ is the number of codewords having weight $w$, $d_{\min}^{(k)}$ is the minimal Hamming
distance of the code, and $G^{(k)}(D)$ is the generating function of the linear block code.
When a convolutional code is used, the burst error (first-event) probability $P(\varepsilon^{(k)})$
is upper-bounded by the union type bound [9]

$$P\left(\varepsilon^{(k)}\right) < T^{(k)}(D)\Big|_{D=Z^{(k)}}, \tag{5}$$

where $T^{(k)}(D)$ is the generating function of the convolutional code on the $k$th
decoding level.
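To illustrate how the bound (4) is evaluated, the sketch below sums the weight enumerator at a given $Z$. The (7,4) Hamming code, with 7 codewords of weight 3, 7 of weight 4 and 1 of weight 7, is chosen purely as an example and is not a code studied in the paper; the function names are likewise hypothetical.

```python
import math


def block_union_bound(weight_distribution: dict, z: float) -> float:
    """Evaluate G(D) at D = z, i.e. the sum over w of a_w * z^w.

    weight_distribution maps each nonzero Hamming weight w to the number
    a_w of codewords of that weight.
    """
    return sum(a_w * z ** w for w, a_w in weight_distribution.items())


# Nonzero weights of the (7,4) Hamming code (illustrative choice).
HAMMING_7_4 = {3: 7, 4: 7, 7: 1}


if __name__ == "__main__":
    rho = 5.0                                  # assumed ratio delta_k / sigma
    z_low = math.exp(-rho ** 2 / 8.0)          # lower bound on Z
    z_high = 4.0 * z_low                       # upper bound on Z
    print("with Z lower bound:", block_union_bound(HAMMING_7_4, z_low))
    print("with Z upper bound:", block_union_bound(HAMMING_7_4, z_high))
```

Because the leading term grows as $Z^{d_{\min}}$, a factor of 4 in $Z$ changes the bound by roughly $4^{d_{\min}}$, which is why a tighter value of $Z$ matters.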
We note that the commonly used upper bounds for block and burst error probabilities
in the block and convolutional coding cases are both functions of the parameter
$Z^{(k)}$. The same holds for bit error probabilities. Calculation
of the minimal possible $Z^{(k)}$ yields tight upper bounds on the decoding error
probability on each decoding level. As mentioned in the introduction,

$$\exp\left(-\frac{\delta_k^2}{8\sigma^2}\right) \le Z^{(k)} < 4\exp\left(-\frac{\delta_k^2}{8\sigma^2}\right) \tag{6}$$

when QAM-signaling is used. The bounds (4) and (5) are often violated through
the use of the lower bound $\exp(-\delta_k^2/8\sigma^2)$ as an approximation to $Z^{(k)}$; the
upper bounds (4) and (5) thereby lose their validity. On the other hand, the upper bound
$4\exp(-\delta_k^2/8\sigma^2)$ sometimes gives bounds that are loose. In the following sections
we calculate the values $Z^{(k)}$ that give exponentially tight union bounds, (4) and
(5), without causing them to lose their validity. This is done by using the Chernoff
bounding method, and the analysis is performed for the last two decoding levels and
for a QAM signal set with an infinite number of signal points. The application of
the Chernoff bound to the situation considered is explained in Appendix A.
$0 \le x \le \rho_k/\sqrt{2}$, with itself, as is shown in Appendix C. There the following theorem
is also proven.
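Theorem 2 On each level of suboptimal decoding of a multilevel coded QAM signal
set with an infinite number of signal points, the Chernoff bounding parameter has
the value

$$Z^{(k)} = Z_\infty = \min_{s \ge 0} \exp\left(-\frac{s\rho_k}{\sqrt{2}}\right)\left(g^{(k)}(s)\right)^2, \tag{10}$$

where

$$g^{(k)}(s) = \sum_{i=-\infty}^{\infty} 2\exp\left(\frac{s^2}{2} - \sqrt{2}\,s\,i\rho_k\right)
\left(Q\left(\sqrt{2}\,i\rho_k - s\right) - Q\left(\sqrt{2}\,i\rho_k - s + \frac{\rho_k}{\sqrt{2}}\right)\right)$$
$$= \frac{\sqrt{2}}{s\rho_k}\left(\exp\left(\frac{s\rho_k}{\sqrt{2}}\right) - 1\right)
+ \frac{4\sqrt{2}\,s}{\rho_k}\sum_{i=1}^{\infty}
\frac{\left((-1)^i \exp\left(\frac{s\rho_k}{\sqrt{2}}\right) - 1\right)\exp\left(-\left(\frac{i\pi}{\rho_k}\right)^2\right)}
{2s^2 + \left(\frac{2i\pi}{\rho_k}\right)^2}. \tag{11}$$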
The infinite sum in the first expression for $g^{(k)}(s)$ in (11) converges faster for large
$\rho_k$ than for small $\rho_k$. If fast convergence is desired when $\rho_k$ is small, one
should use the second expression in (11). Numerical values of $Z_\infty$ are shown in Table
I for a number of different $\rho = \rho_k$. Here it can also be seen that $Z_\infty$ approaches
$4Z_2$ for large $\rho$, due to the "nearest neighbor error events principle". We conclude
that the rough bounds $Z_2 < Z_M < 4Z_2$ have been tightened to $Z_4 < Z_M < Z_\infty$ for
$M > 4$. In Figure 6 we show by two examples how the bounds on error probability
are improved by the use of $Z_\infty$ instead of $4Z_2$.
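A numerical evaluation of $Z_\infty$ along the lines of Theorem 2 can be sketched as follows. The code uses the series form of $g^{(k)}(s)$, written here as $\frac{\sqrt{2}}{s\rho}(e^{s\rho/\sqrt{2}} - 1) + \frac{4\sqrt{2}s}{\rho}\sum_{i \ge 1} \frac{((-1)^i e^{s\rho/\sqrt{2}} - 1)\,e^{-(i\pi/\rho)^2}}{2s^2 + (2i\pi/\rho)^2}$ with $\rho = \delta_k/\sigma$, and minimizes over $s$ by a ternary search. The search interval $[0, \rho]$, the truncation rule for the series and the function names are assumptions of this sketch, not choices documented in the paper.

```python
import math


def g_series(s: float, rho: float, max_terms: int = 500) -> float:
    """Series form of g(s); g(0) = 1 because g is the integral of a density."""
    if s == 0.0:
        return 1.0
    a = rho / math.sqrt(2.0)
    first = math.sqrt(2.0) / (s * rho) * math.expm1(s * a)
    growth = math.exp(s * a)
    series = 0.0
    for i in range(1, max_terms + 1):
        damp = math.exp(-((i * math.pi / rho) ** 2))
        if damp < 1e-30:                      # remaining terms are negligible
            break
        sign = -1.0 if i % 2 else 1.0
        series += (sign * growth - 1.0) * damp / (
            2.0 * s * s + (2.0 * i * math.pi / rho) ** 2
        )
    return first + 4.0 * math.sqrt(2.0) * s / rho * series


def z_infinity(rho: float, iterations: int = 200) -> float:
    """Minimize exp(-s*rho/sqrt(2)) * g(s)^2 over s in [0, rho].

    The objective is treated as unimodal on the interval (a moment
    generating function is log-convex), so a ternary search is used.
    """
    a = rho / math.sqrt(2.0)

    def objective(s: float) -> float:
        return math.exp(-s * a) * g_series(s, rho) ** 2

    lo, hi = 0.0, rho
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            hi = m2
        else:
            lo = m1
    return objective(0.5 * (lo + hi))


if __name__ == "__main__":
    for rho in (0.5, 2.0, 4.0, 7.0):
        z2 = math.exp(-rho ** 2 / 8.0)
        print(rho, z2, z_infinity(rho), 4.0 * z2)
```

The printed values can be compared against the $Z_2$ and $Z_\infty$ columns of Table I.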
VI. CAPACITY AND CUTOFF RATE OF MULTISTAGE DECODING
USING SUBOPTIMAL METRIC
Analogously to [4] we consider the transmission on each level of modulation as
transmission over an individual channel. But in contrast to [4], where the capacities
of these individual channels for optimal decoding on each level of modulation
were calculated, we calculate the capacities when the suboptimal decoding described
in Section II is used. We consider the QAM signal set with an infinite number of
signal points, and as the output of the individual channel we consider the sequence
of statistics $\lambda^{(k)}$ introduced in Section IV.
The capacity for the kth level of the given multilevel QAM system using multi-
stage decoding with suboptimal metric is
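$$C^{(k)} = H\left(\lambda^{(k)}\right) - H\left(\lambda^{(k)} \mid v^{(k)} = 0\right) = \tag{12}$$
$$= -\int_{-\rho_k^2}^{\rho_k^2}
\left(\tfrac{1}{2} f^{(0)}_{\lambda^{(k)}}(\lambda) + \tfrac{1}{2} f^{(1)}_{\lambda^{(k)}}(\lambda)\right)
\log_2\left(\tfrac{1}{2} f^{(0)}_{\lambda^{(k)}}(\lambda) + \tfrac{1}{2} f^{(1)}_{\lambda^{(k)}}(\lambda)\right) d\lambda
+ \int_{-\rho_k^2}^{\rho_k^2} f^{(0)}_{\lambda^{(k)}}(\lambda) \log_2 f^{(0)}_{\lambda^{(k)}}(\lambda)\, d\lambda,$$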
where the superscript denotes the transmitted bit, and where the probability density
function of $\lambda^{(k)}$ conditioned on a one having been transmitted satisfies
$f^{(1)}_{\lambda^{(k)}}(\lambda) = f^{(0)}_{\lambda^{(k)}}(-\lambda)$ by symmetry. All signal points are
assumed to be equally likely. Considering the ensemble of codes and calculating the
expectation of the decoding error probability gives that the cutoff rate is
$R_{comp}^{(k)} = 1 - \log_2\left(1 + Z^{(k)}\right)$. The capacity
and computational cutoff rate of a level using QAM with an infinite number of signal
points are shown in Figure 7. In this figure we also show the capacity for the same
system using optimal metric at each decoding stage calculated in [4]. We conclude
that the capacity for the system using suboptimal metric that we considered is close
to the capacity for a system using optimal metric and multistage decoding. This
indicates that it is not necessary to use the optimal metric in multistage decoding
of multilevel QAM. That is, a complexity reduction by use of the suboptimal metric
can be achieved at very small loss in capacity.
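The computational cutoff rate $R_{comp} = 1 - \log_2(1 + Z)$ is simple to evaluate once $Z$ is known. The sketch below applies it to three $Z_\infty$ values copied from Table I; the function name and the choice of rows are illustrative.

```python
import math


def cutoff_rate(z: float) -> float:
    """Computational cutoff rate 1 - log2(1 + Z), in bits per level and channel use."""
    if not 0.0 <= z <= 1.0:
        raise ValueError("the Chernoff parameter must lie in [0, 1]")
    return 1.0 - math.log2(1.0 + z)


# Z_infinity values taken from Table I, keyed by the ratio rho.
TABLE_Z_INFINITY = {2.0: 0.9858, 4.0: 0.4135, 6.0: 0.0420}


if __name__ == "__main__":
    for rho, z in TABLE_Z_INFINITY.items():
        print(rho, round(cutoff_rate(z), 4))
```

The rate runs from 0 at $Z = 1$ to 1 bit at $Z = 0$, consistent with the vertical axis of Figure 7.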
VII. CONCLUSIONS
The decoding error probability for multilevel QAM with multistage decoding
using suboptimal metric has been analyzed. Rigorous union bounds for error
probabilities of component codes have been presented and formulas for the parameter
$Z$ have been derived. To do this the Chernoff bounding method, which gives
exponentially tight bounds, was applied. These formulas yield a $Z$ that gives tighter
bounds than the commonly used "nearest neighbor error events principle", while the
bounds do not lose their validity, as is the case when the often used approximation
$Z = \exp(-\delta^2/8\sigma^2)$ is applied. Comparison to traditional bounding techniques
for the probability of decoding error has been made for 4-QAM and a QAM signal
constellation with an infinite number of signal points. Calculation of capacity
and computational cutoff rate was made for QAM with an infinite number of signal
points. It was concluded that the capacity for the decoding using suboptimal metric
is close to the capacity for the system where optimal metric is used. Hence in terms
of capacity the need for using optimal metric is small when multilevel QAM with
multistage decoding is considered.
APPENDIX A.
DERIVATION OF THE CHERNOFF BOUNDING PARAMETER
Let $\mu(r, v_0^{(k)})$ and $\mu(r, v_l^{(k)})$ be the squared Euclidean distances (metrics) from
$r$ to the transmitted codeword and from $r$ to the $l$th codeword, $l = 1, 2, \ldots, L-1$,
respectively. The decision is then taken in favor of the codeword whose metric is
minimal. From (1) we have

$$\mu\left(r, v_l^{(k)}\right) = \sum_{n=1}^{N} d_E\left(r(n), v_l^{(k)}(n)\right), \quad l = 0, 1, \ldots, L-1, \tag{13}$$

where

$$d_E\left(r(n), v_l^{(k)}(n)\right) = \min_{s^{(k)}(n) \in S^{(k-1)}_{v_l^{(k)}(n)}(n)} d_E\left(r(n), s^{(k)}(n)\right). \tag{14}$$

Then the decision criterion $\mu(r, v_0^{(k)}) - \mu(r, v_l^{(k)}) \gtrless 0$, $k = 1, 2, \ldots, K$, becomes
$\lambda_l^{(k)} = \sum_{n=1}^{N} \lambda_l^{(k)}(n) \gtrless 0$, where

$$\lambda_l^{(k)}(n) = d_E\left(r(n), v_0^{(k)}(n)\right) - d_E\left(r(n), v_l^{(k)}(n)\right). \tag{15}$$

Without loss of generality we suppose that $v_0^{(k)}$ is the all-zero codeword and
that $v_l^{(k)}$ is a codeword of Hamming weight $w_l^{(k)}$. To simplify the analysis, we also
change the order of the transmitted symbols, such that the first $w_l^{(k)}$ symbols of $v_l^{(k)}$
are ones. Then the decision statistic $\lambda_l^{(k)} = \sum_{n=1}^{w_l^{(k)}} \lambda^{(k)}(n)$ is a sum of
independent identically distributed random variables $\lambda^{(k)}(n)$.
To get an exponentially tight bound for $P(\varepsilon_l^{(k)})$, let us introduce the generating
function of $\lambda_l^{(k)}$, $\Phi_l^{(k)}(s) = \int_{-\infty}^{\infty} \exp(s\lambda) f_{\lambda_l^{(k)}}(\lambda)\, d\lambda$, where
$f_{\lambda_l^{(k)}}(\lambda)$ is the probability density function of $\lambda_l^{(k)}$. Since $\lambda_l^{(k)}$ is a sum of
$w_l^{(k)}$ independent identically distributed random variables we have
$\Phi_l^{(k)}(s) = \left(\varphi^{(k)}(s)\right)^{w_l^{(k)}}$, where
$\varphi^{(k)}(s) = \int_{-\infty}^{\infty} \exp(s\lambda) f_{\lambda^{(k)}}(\lambda)\, d\lambda$ and $f_{\lambda^{(k)}}(\lambda)$ is the
probability density function of $\lambda^{(k)}(n)$.
Using the Chernoff bound [11] we get

$$P\left(\varepsilon_l^{(k)}\right) = P\left(\lambda_l^{(k)} \ge 0\right) \le \min_{s \ge 0} \Phi_l^{(k)}(s)
= \left(\min_{s \ge 0} \varphi^{(k)}(s)\right)^{w_l^{(k)}}, \tag{16}$$

so that

$$Z^{(k)} = \min_{s \ge 0} \varphi^{(k)}(s). \tag{17}$$
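As a concrete instance of this minimization, consider the 2-QAM case, where the introduction states $Z = \exp(-\delta_k^2/8\sigma^2)$. Under the assumption (made for this sketch, not quoted from the paper) that the per-symbol statistic is then Gaussian with mean $-\delta^2$ and variance $4\delta^2\sigma^2$, its generating function is $\exp(-s\delta^2 + 2s^2\delta^2\sigma^2)$, and the minimum over $s \ge 0$ can be found numerically and compared with the closed form. All names and the search interval are hypothetical.

```python
import math
from typing import Callable


def chernoff_parameter(
    mgf: Callable[[float], float], s_max: float, iterations: int = 200
) -> float:
    """Minimize a generating function over s in [0, s_max] by ternary search.

    Assumes the function is unimodal on the interval, which holds for a
    moment generating function because it is convex in s.
    """
    lo, hi = 0.0, s_max
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if mgf(m1) < mgf(m2):
            hi = m2
        else:
            lo = m1
    return mgf(0.5 * (lo + hi))


def mgf_two_qam(s: float, delta: float, sigma: float) -> float:
    """Generating function of a Gaussian statistic with mean -delta^2 and
    variance 4 * delta^2 * sigma^2 (assumed model for the 2-QAM level)."""
    return math.exp(-s * delta ** 2 + 2.0 * s ** 2 * delta ** 2 * sigma ** 2)


if __name__ == "__main__":
    delta, sigma = 2.0, 0.7
    z = chernoff_parameter(
        lambda s: mgf_two_qam(s, delta, sigma), s_max=1.0 / sigma ** 2
    )
    print(z, math.exp(-delta ** 2 / (8.0 * sigma ** 2)))
```

In this Gaussian case the minimum is attained at $s = 1/(4\sigma^2)$, which lies inside the assumed search interval.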
APPENDIX B.
PROOF OF THEOREM 1
Let us consider the last but one level, Figure 3. To study the probability density
function of $\lambda^{(K-1)}(n)$, we introduce the system of coordinates $(x, y)$. In this system
the four quadrants define 4 regions, where the received point has the same nearest
point from each set. For example in the first quadrant the nearest point from $S_0^{(K-2)}$
APPENDIX C.
PROOF OF THEOREM 2
Here we consider a level where the QAM signal set has infinitely many signal
points, Figure 5. To study the probability density function of the statistics $\lambda^{(k)}(n)$,
we introduce the system of coordinates $(x, y)$, Figure 5. The square regions defined
in this figure are regions in which the received point has the same nearest point from
each set. For example in the region $\{x, y \text{ s.t. } 0 < x < \rho_k/\sqrt{2},\ 0 < y < \rho_k/\sqrt{2}\}$
point A is the nearest point from $S_0^{(k-1)}$, and the nearest point from $S_1^{(k-1)}$ is point
B. The noise components in this system of coordinates are independent Gaussian
random variables with zero mean and variance equal to one.
Let us first study the region $\{x, y \text{ s.t. } 0 < x < \rho_k/\sqrt{2},\ 0 < y < \rho_k/\sqrt{2}\}$. We
suppose that the received point is in this region, and that $v_0^{(k)}$ was transmitted. Then
the squared Euclidean distance from the received point to the nearest reference point
is

$$d_E\left(r(n), v_0^{(k)}(n)\right) = (X(n))^2 + (Y(n))^2 \tag{25}$$

and the squared Euclidean distance from the received point to the nearest opposite
point is

$$d_E\left(r(n), v_l^{(k)}(n)\right) = \left(X(n) - \frac{\rho_k}{\sqrt{2}}\right)^2 + \left(Y(n) - \frac{\rho_k}{\sqrt{2}}\right)^2. \tag{26}$$

So the difference between the squared Euclidean distances (15) becomes

$$\lambda^{(k)}(n) = \sqrt{2}\,\rho_k\,(X(n) + Y(n)) - \rho_k^2. \tag{27}$$
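conditional probability density functions of $X$ and $Y$ given that $v_0^{(k)}$ was transmitted
and given that the received signal is in the region
$\{x, y \text{ s.t. } 0 < x < \rho_k/\sqrt{2},\ 0 < y < \rho_k/\sqrt{2}\}$ are (see Figure 8)

$$f_X^{(0)}(x) = f_Y^{(0)}(x) = \lim_{m \to \infty}
\frac{\sum_{i=-m/2}^{m/2} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(x + \sqrt{2}\,i\rho_k\right)^2\right)}
{\int_0^{\rho_k/\sqrt{2}} \sum_{i=-m/2}^{m/2} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(x + \sqrt{2}\,i\rho_k\right)^2\right) dx} = \tag{28}$$
$$= \lim_{m \to \infty}
\frac{\sum_{i=-m/2}^{m/2} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(x + \sqrt{2}\,i\rho_k\right)^2\right)}
{\frac{1}{2} - Q\left(\frac{\rho_k}{\sqrt{2}}(m+1)\right)}
= \sum_{i=-\infty}^{\infty} \frac{2}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(x + \sqrt{2}\,i\rho_k\right)^2\right),$$

$0 < x < \rho_k/\sqrt{2}$, and the conditional probability density function of $\lambda^{(k)}$ is a shifted
version of the scaled convolution of the probability density functions of $X$ and $Y$.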
Obviously, for each of the square regions defined in Figure 5 we can introduce a
system of coordinates such that the nearest reference point has coordinates
$(0, 0)$ and the nearest opposite point has coordinates $(\rho_k/\sqrt{2}, \rho_k/\sqrt{2})$. The difference
between the squared Euclidean distances satisfies (27), where $X$ and $Y$ are the coordinates
of the received point in this system of coordinates. Therefore (28) defines the
conditional probability density functions of $X$ and $Y$ given that $v_0^{(k)}$ was transmitted,
independently of the square region in which the received point lies.
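We have that $\min_{s \ge 0} \varphi^{(k)}(s) = \min_{s \ge 0} \varphi^{(k)}\left(s/(\sqrt{2}\rho_k)\right)$, and from (17), (27) and
(28) it follows that $\varphi^{(k)}\left(s/(\sqrt{2}\rho_k)\right) = \exp\left(-s\rho_k/\sqrt{2}\right) \varphi_X(s)\, \varphi_Y(s)$, where

$$\varphi_X(s) = \varphi_Y(s) = \int_0^{\rho_k/\sqrt{2}} \exp(sx)\, f_X^{(0)}(x)\, dx = \tag{29}$$
$$= \sum_{i=-\infty}^{\infty} 2\exp\left(\frac{s^2}{2} - \sqrt{2}\,s\,i\rho_k\right)
\left(Q\left(\sqrt{2}\,i\rho_k - s\right) - Q\left(\sqrt{2}\,i\rho_k - s + \frac{\rho_k}{\sqrt{2}}\right)\right),$$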
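which proves the first part of Theorem 2. Now let us prove the second part of the
theorem. We have from (28) that

$$f_X^{(0)}(x) = \frac{1}{\sqrt{2\pi}} \sum_{i=-\infty}^{\infty}
\left[\exp\left(-\frac{1}{2}\left(x + \sqrt{2}\,i\rho_k\right)^2\right) + \exp\left(-\frac{1}{2}\left(x - \sqrt{2}\,i\rho_k\right)^2\right)\right],$$

which can be expressed as

$$f_X^{(0)}(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \psi(x, \xi) \exp\left(-\frac{\xi^2}{2}\right) d\xi, \tag{30}$$

where $\psi(x, \xi) = \sum_{j=-\infty}^{\infty} \left[\delta\left(\xi - x - \sqrt{2}\,j\rho_k\right) + \delta\left(\xi + x + \sqrt{2}\,j\rho_k\right)\right]$,
with $\delta(\cdot)$ the Dirac delta function. The Fourier
series expansion of $\psi(x, \xi)$ in $\xi$ is

$$\psi(x, \xi) = \frac{\sqrt{2}}{\rho_k} + \frac{2\sqrt{2}}{\rho_k} \sum_{i=1}^{\infty}
\cos\left(\frac{\sqrt{2}\,\pi i x}{\rho_k}\right) \cos\left(\frac{\sqrt{2}\,\pi i \xi}{\rho_k}\right). \tag{31}$$

Now, (30)-(31) together with the equality
$\int_{-\infty}^{\infty} \exp\left(-p^2x^2 \pm qx\right) dx = \exp\left(q^2/4p^2\right)\sqrt{\pi}/p$,
$p > 0$ [2] gives

$$f_X^{(0)}(x) = \frac{\sqrt{2}}{\rho_k} + \frac{2\sqrt{2}}{\rho_k} \sum_{i=1}^{\infty}
\cos\left(\frac{\sqrt{2}\,\pi i x}{\rho_k}\right) \exp\left(-\left(\frac{\pi i}{\rho_k}\right)^2\right) \tag{32}$$

and thus

$$g^{(k)}(s) = \int_0^{\rho_k/\sqrt{2}} \exp(sx)\, f_X^{(0)}(x)\, dx = \tag{33}$$
$$= \frac{\sqrt{2}}{s\rho_k}\left(\exp\left(\frac{s\rho_k}{\sqrt{2}}\right) - 1\right)
+ \frac{4\sqrt{2}\,s}{\rho_k}\sum_{i=1}^{\infty}
\frac{\left((-1)^i \exp\left(\frac{s\rho_k}{\sqrt{2}}\right) - 1\right)\exp\left(-\left(\frac{i\pi}{\rho_k}\right)^2\right)}
{2s^2 + \left(\frac{2i\pi}{\rho_k}\right)^2},$$

and the proof is complete.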
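The step from the sum of Gaussian terms to the cosine series is an instance of Poisson summation, and it can be checked numerically. The sketch below evaluates the folded density both ways, as $\frac{2}{\sqrt{2\pi}}\sum_i \exp(-\frac{1}{2}(x + \sqrt{2}i\rho)^2)$ and as $\frac{\sqrt{2}}{\rho}\left(1 + 2\sum_{i \ge 1}\cos(\sqrt{2}\pi i x/\rho)\exp(-(\pi i/\rho)^2)\right)$, with $\rho = \delta_k/\sigma$. Truncation lengths and function names are assumptions of the sketch.

```python
import math


def density_gaussian_sum(x: float, rho: float, terms: int = 60) -> float:
    """Folded density as a sum of shifted unit-variance Gaussians."""
    period = math.sqrt(2.0) * rho
    total = sum(
        math.exp(-0.5 * (x + i * period) ** 2) for i in range(-terms, terms + 1)
    )
    return 2.0 / math.sqrt(2.0 * math.pi) * total


def density_fourier(x: float, rho: float, terms: int = 200) -> float:
    """The same density written as a cosine series."""
    total = 1.0
    for i in range(1, terms + 1):
        total += (
            2.0
            * math.cos(math.sqrt(2.0) * math.pi * i * x / rho)
            * math.exp(-((math.pi * i / rho) ** 2))
        )
    return math.sqrt(2.0) / rho * total


if __name__ == "__main__":
    rho = 3.0
    width = rho / math.sqrt(2.0)
    for frac in (0.0, 0.25, 0.5, 0.75, 1.0):
        x = frac * width
        print(x, density_gaussian_sum(x, rho), density_fourier(x, rho))
```

The Gaussian sum needs few terms when $\rho$ is large and the cosine series needs few terms when $\rho$ is small, which mirrors the convergence remark made after Theorem 2.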
References
[1] E. Biglieri, D. Divsalar, P. J. McLane and M. K. Simon, Introduction to Trellis-Coded
Modulation with Applications. Macmillan, 1991.
[4] J. Huber, "Multilevel Codes: Distance Profiles and Channel Capacity," in ITG-
Fachbereicht 130, Oct. 1994, pp. 305-319. Conference Record.
[6] H. Imai and S. Hirakawa, "A New Multilevel Coding Method Using Error-Correcting
Codes," IEEE Trans. Inform. Theory, vol. IT-23, pp. 371-377, May 1977.
[9] A. J. Viterbi and J. K. Omura, Principles of Digital Communication and Coding.
McGraw Hill, 1979.
LIST OF FIGURE CAPTIONS
Figure 1: The system model showing the transmitter, the additive white Gaussian
noise (AWGN) channel and the multistage receiver.
Figure 7: Capacity (solid line) and computational cutoff rate (dashed line) for
QAM with an infinite number of signal points, using suboptimal metric. The
capacities for 16-QAM (stars) and a "large" QAM constellation (circles), both
using optimal metric [4], are also shown.
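Figure 1: [Block diagram. The information sequence $u$ is partitioned into
$u^{(1)}, u^{(2)}, \ldots, u^{(K)}$; each $u^{(k)}$ is encoded by $C_k$ into $v^{(k)}$; the $2^K$-QAM mapper
outputs $s$; the AWGN channel delivers $r$ to the suboptimal decoder, which outputs $\hat{u}$.]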
Figure 2: [Set partitioning tree for 16-QAM. Levels: $S^{(0)}$ with $\delta_1$; $S^{(1)}$ with
$\delta_2 = \sqrt{2}\,\delta_1$; $S^{(2)}$ with $\delta_3 = 2\delta_1$; $S^{(3)}$ with $\delta_4 = 2\sqrt{2}\,\delta_1$. The branches
are labeled $v^{(1)}(n)$, $v^{(2)}(n)$ and $v^{(3)}(n)$, each equal to 0 or 1.]
Figure 3: [Geometry for the last but one level. Coordinate axes $(x, y)$ and $(a, b)$,
the received point, regions 1 to 4, and the distances $d_E(r(n), v_0^{(K-1)}(n) = 0)$ and
$d_E(r(n), v_l^{(K-1)}(n) = 1)$. Legend: reference point $\in S_0^{(K-2)}$, opposite point
$\in S_1^{(K-2)}$.]
$\rho$    $Z_2$     $Z_4$     $Z_\infty$   $Z_4/Z_2$   $Z_\infty/Z_2$
0.5    0.9692    0.9984    1.0000      1.03        1.03
1.0    0.8825    0.9793    1.0000      1.11        1.13
1.5    0.7548    0.9198    0.9997      1.22        1.32
2.0    0.6065    0.8124    0.9858      1.34        1.63
2.5    0.4578    0.6686    0.9152      1.46        2.00
3.0    0.3247    0.5108    0.7740      1.57        2.38
3.5    0.2163    0.3618    0.5932      1.67        2.74
4.0    0.1353    0.2379    0.4135      1.76        3.06
4.5    0.0796    0.1453    0.2636      1.82        3.31
5.0    0.0439    0.0825    0.1545      1.88        3.52
5.5    0.0228    0.0437    0.0837      1.92        3.67
6.0    0.0111    0.0216    0.0420      1.94        3.78
6.5    0.0051    0.0100    0.0196      1.96        3.86
7.0    0.0022    0.0043    0.0086      1.98        3.91
Table I:
Figure 4: [Two panels, (a) and (b). Logarithmic vertical axes, spanning $10^{-2}$ to
$10^{-8}$ in one panel and $10^{-4}$ to $10^{-14}$ in the other; horizontal axes from 3 to 6.]
Figure 5: [Geometry for a QAM signal set with infinitely many signal points.
Coordinate axes $(x, y)$ and $(a, b)$, the received point, points A and B, and the
distances $d_E(r(n), v_0^{(k)}(n) = 0)$ and $d_E(r(n), v_l^{(k)}(n) = 1)$. Legend: reference
point $\in S_0^{(k-1)}$, opposite point $\in S_1^{(k-1)}$.]
Figure 6: [Two panels, (a) and (b). Logarithmic vertical axes, spanning $10^{-2}$ to
$10^{-8}$ in one panel and $10^{-4}$ to $10^{-14}$ in the other; horizontal axes from 3 to 6.]
Figure 7: [Plot. Vertical axis: bits per level and channel use, from 0 to 1.
Horizontal axis: logarithmic, from $10^0$ to $10^1$.]
Figure 8: