Вы находитесь на странице: 1из 27

On the Calculation of the Error Probability for a Multilevel

Modulation Scheme Using QAM-signaling1 2 ;

Karin Engdahl, Student Member, IEEE and Kamil Sh. Zigangirov, Member, IEEE3

Abstract{ We analyze the quadrature amplitude modulation (QAM) version of the


multilevel modulation scheme with multistage decoding using suboptimal metric,
when transmission takes place over a memoryless Gaussian channel. The upper
bounds for decoding error probabilities are functions of the Cherno bounding pa-
rameter Z . We argue that the conventional approximation of Z is not adequate, and
new values of Z that tightens the error bounds without causing them to lose their
validity are given. The capacity for this system is also calculated, and the result
hereof is the conclusion that the use of suboptimal metric in multistage decoding
causes very little degradation in capacity compared to when optimal metric is used
in each decoding stage.

Index terms{ Multilevel modulation, multistage decoding, QAM-signaling, set par-


titioning.

1 This work was supported in part by Swedish Research Council for Engineering Sciences under
Grant 95-164
2 Part of this work was presented at IEEE International Symposium on Information Theory,
Ulm, Germany, June 29 { July 4 1997
3 Both authors are at the Department of Information Technology, Telecommunication Theory
Group, Lund University, Box 118, S-221 00 Lund, Sweden, Phone +46 46 222 3450 and +46 46
222 4460, Fax +46 46 222 4714, e-mail karin@it.lth.se and kamil@it.lth.se

1
I. INTRODUCTION
The principle of trellis-coded modulation was described by Ungerbck in his
paper of 1982 [8], and the concept of multilevel modulation was introduced by
Imai and Hirakawa [6]. In this method the channel signal set is successively binary
partitioned using the set partitioning rule where the binary labels of the branches
from one level of the partition chain to the next are encoded by independent binary
codes. The multilevel scheme enables the usage of a suboptimal multistage decoder
[6], [7], which performs almost as well as an optimum joint maximum likelihood
sequence estimator over all levels [4], but is much less complex.
In this paper we will study a multilevel modulation scheme using QAM-signaling,
transmitting over a discrete memoryless Gaussian channel and employing a multi-
stage decoder. To further reduce complexity we use a suboptimal metric in each
decoding stage. This is shown to generate very slight performance loss in terms of
capacity.
In the following section we introduce the QAM multilevel modulation scheme,
which is a modi ed version of the one proposed in [6]. The channel characteristics
and the decoding procedure are also given. In Section III we upperbound the block
and burst error probabilities, for block and convolutional codes respectively. The
calculation of upper bounds for these probabilities reduces to calculation of code
generating functions with the Cherno bounding parameter Z as argument. This
parameter is a function of the intra-set squared Euclidean distance k2 on level k, the
noise variance 2 and the number of signal points on the corresponding level of set
partitioning.

2
For 2-QAM transmission (the last level of the multilevel QAM scheme) Z =
exp ( k2 =82). This value is a lower bound on Z for other levels of the scheme, and
is often used as an approximation of Z , see for example [1], though the approximation
results in that the bounds on error probability lose their validity. The inaccuracy
increases with the ratio k2 =2. An accurate upper bound on Z for any level of
the QAM scheme is Z = 4 exp ( k2 =82). This is a consequence of the \nearest
neighbor error events principle" [4], [7], and the fact that each signal point of a
M -QAM set has at most 4 nearest neighbors. This value of Z yields, especially
for small values of k2 =2, loose bounds on the error probabilities. In the case of
16-QAM the average number of nearest neighbors is 3 so here the estimation Z =
3 exp ( k2 =82) can be used [5], and for 4-QAM each signal point has 2 nearest
neighbors so Z = 2 exp ( k2 =82) is an approximation of Z .
In Sections IV and V we calculate better upper bounds on Z that tightens
the error bounds without causing them to lose their rigor. This is done using the
Cherno bounding method, which gives exponentially tight bounds [11]. The last
but one level, when the signal set is 4-QAM, is considered in Section IV, and in
Section V we calculate Z for a QAM signal set with an in nite number of points.
The assumption of M -QAM with M = 1 is a mathematical abstraction, but it
is a good approximation of a M -QAM multilevel coded modulation scheme with
M < 1, at least for large M . Finally, in Section VI we calculate the capacity of
this multilevel modulation scheme using QAM-signaling and this type of suboptimal
decoder, and compare to the capacity for the same scheme using optimal metric in
each decoding stage.

3
II. SYSTEM DESCRIPTION
The transmitter and receiver described in Figure 1 is a generalization of the
scheme of Imai and Hirakawa [6] to multilevel QAM. A binary information se-
quence u is partitioned into K binary subsequences u(1) ; u(2) ; : : : ; u(K ), where each
subsequence is encoded by an independent binary component code Ck (block or
n o
convolutional). A set of K bits, v(1) (n) ; v(2) (n) ; : : : ; v(K ) (n) , one bit from each
code sequence v(1); v(2) ; : : : ; v(K ), are synchronously mapped onto one of the 2K -
QAM signal points, s (n). The K -level partitioning of the signal set is a sequence
of K partitions S (0) =S (1)= : : : =S (K ). The mapping by set partitioning is illustrated
in Figure 2 for K = 4. The squared Euclidean intra-set distance on the kth level of
partitioning is k2 = 2k 112, k = 1; 2; : : : ; K . Assuming equiprobable signal points
 
we get that the total average signal energy per channel use is Es = 2K 1 12 =6
 
when K is even, and Es = 2K +1 1 12=12 when K is odd.
When passed through the discrete memoryless Gaussian channel the complex in-
put sequence s = s (1) ; s (2) ; : : : ; s (n) ; : : : (where s (n) = a (n)+jb (n) is the channel
n   o
input at the nth moment of time, a (n) ; b (n) 2 1; 3; : : : ;  2K=2 1 1=2
n   p o
when K is even and a (n) ; b (n) 2 1; 3; : : : ;  2K +1=2 1 1=2 2 s.t. a (n)+
n   p o
b (n) 2 0; 4; 8; : : : ; 4 2K 1=2 1 1=2 2 when K is odd) is corrupted by
the error sequence e = e (1) ; e (2) ; : : : ; e (n) ; : : : such that the complex received se-
quence is r = s + e. Here e (n) = e(I ) (n) + je(Q) (n), where e(I ) (n) and e(Q) (n) are
independent Gaussian random variables with zero mean and variance 2.
The multistage decoder consists of a set of suboptimal decoders matched to the
codes used on the corresponding levels of encoding. Each decoding stage consists of

4
calculation of distances (metrics) to the received sequence r from all possible code
words on the corresponding level of set partitioning. The side information from
the previous decoding stages determines, according to the set partitioning structure
(illustrated in Figure 2), the signal set upon which the metrics are calculated.
When calculating the metrics, the decoder uses the following suboptimal prin-
ciple. Let us suppose that a binary block code (the extension to convolutional
codes is straight forward) of length N is used on the kth level of the encoding,
and that the decoding on the previous (k 1) decoding stages determines the
subsets S (k ; S (k 1) (2) ; : : : ; S (k 1) (N ), to which the transmitted symbols of
1) (1)

the codeword v(k) = v(k) (1) ; v(k) (2) ; : : : ; v(k) (N ), v(k) (n) 2 f0; 1g, belong. Let
s(k) (n) 2 S (k 1) (n), n = 1; 2; : : : ; N and s(k) = s(k) (1) ; s(k) (2) ; : : : ; s(k) (N ).
Let S0(k 1)
(n) and S1(k 1)
(n) be subsets of S (k
n), corresponding to transmis-
1) (

sion of v(k) (n) = 0 and v(k) (n) = 1 respectively. Finally let S(vk(k)1) = Sv(k(k)(1)
1)
(1) ;
Sv(k(k)(2)
1)
(2) ; : : : ; Sv(k(k)(1)N ) (N ) be the sequence of subsets corresponding to transmission
of the code word v(k). Then the distance (metric) between the received sequence
r = r (1) ; r (2) ; : : : ; r (N ) and the codeword v(k) is determined as
   
 r; v(k) = (k)min(k 1) dE r; s(k) ; (1)
s 2Sv(k)

where dE (x; y) means the squared Euclidean distance between the N -dimensional
vectors x and y. The decoding consists of choosing the codeword v(k) for which the
 
metric  r; v(k) above is minimal.

III. ERROR PERFORMANCE


The performance of a multilevel coded modulation system, which employs a
5
multistage decoder, is commonly estimated by average bit error probability of each
component code [1], [4], [7] or by block error probability and burst error probability
for block and convolutional coding respectively [1], [4]. In this section we present up-
per bounds for the last two probabilities. When considering the overall performance
of the multilevel system, it should be noted that the overall decoding error rate is
upperbounded by the sum, over all levels, of the probability that a decoding error
occurs at one level conditioned that all the previous levels were correctly decoded
[3].
Consider the kth decoding level, k = 1; 2; : : : ; K . Let us suppose that a linear
binary block code of length N (k), with L codewords, v0(k); v1(k) ; : : : ; vL(k) 1 , is used on
   
this level. Let the transmitted codeword be v0(k). If  r; v0(k)   r; vl(k) we say
that the lth codeword, l = 1; 2; : : : ; L 1, causes a decoding error and we will denote
 
this event "(l k). Then the probability of decoding error P "(k) on the kth level is
upperbounded by the union bound [9], [11]
  LX1  
P "(k)  P "(l k) (2)
l=1
 
In the binary case P "(l k) is a function of the Hamming distance wl(k) between v0(k)
and vl(k). The main term of the bound is decreasing exponentially with wl(k), i.e. the
bound has the form
 (k)  (k)wl(k)
P "l < Z ; (3)
where Z (k) is a function of k2 and k2 . In this paper we will use the Cherno bounding
 
method for upperbounding of P "(l k) . This method gives, as is well known, an

6
exponentially tight bound [11]. From (2) and (3) we get for linear block codes [10]
 (k)  LX1  (k)wl(k) NX(k) (k)  (k)w (k)
P "  Z = aw Z = G (D) jD=Z (k) ; (4)
l=1 w=d(min
k)

where fa(wk)g is the weight distribution of the code on the kth decoding level, i.e.
k) is the minimal Hamming
a(wk) is the number of code words having weight w, d(min
distance of the code, and G(k) (D) is the generating function of the linear block code.
 
When a convolutional code is used, the burst error ( rst-event) probability P "(k)
is upperbounded by the union type bound [9]
 
P "(k) < T (k) (D) jD=Z (k) (5)

where T (k) (D) is the generating function of the convolutional code on the kth de-
coding level.
We note that the commonly used upper bounds for block and burst error proba-
bilities in the block and convolutional coding cases are both functions of the param-
eter Z (k). The same is correct when considering bit error probabilities. Calculation
of the minimal possible Z (k) yields tight upperbounding of the decoding error prob-
ability on each decoding level. As mentioned in the introduction
! !
exp k2  Z (k) < 4 exp k2 (6)
8 2 8 2
when QAM-signaling is used. The bounds (4) and (5) are often violated through
the use of the lower bound exp ( k2 =82) as an approximation to Z (k). Thereby the
upper bounds (4) and (5) lose their validity. On the other hand the upper bound
4 exp ( k2 =82) sometimes gives bounds that are loose. In the following sections
we calculate the values Z (k) that give exponentially tight union bounds, (4) and
7
(5), without causing them to lose their validity. This is done by using the Cherno
bounding method and the analysis is performed for the last two decoding levels and
for a QAM signal set with an in nite number of signal points. The application of
the Cherno bound to the situation considered is explained in Appendix A.

IV. THE CHERNOFF BOUNDING PARAMETER FOR THE LAST TWO


LEVELS
On the last decoding level each of the subsets S0(K 1)
and S1(K 1)
consists of
one point, and the squared Euclidean distance between the subsets is K2 . It is well
known that in this case the probability that the lth codeword will cause the decoding
error can be calculated exactly,
0 q (K ) 1 0 (K ) 2 1
 (K ) w   w(K)
P "l = Q @ 2l  K A < exp @ wl82K A = Z (K ) l ; (7)
R p
where Q (x) = x1 exp ( t2 =2) dt= 2, and Z (K ) = Z2 = exp ( K2 =82), i.e. Z (K )
coincides with the lower bound in (6). The parameter Z (K ) is a function of the
normalized Euclidean distance K = K =, and since all the expressions of Z (k) in
the further evaluation of the error probability on the kth level will also be functions
of k = k =, we will keep this notation. Thus, without loss of generality, we
consider on the kth level of set partitioning the signal set with the minimal inter-set
Euclidean distance k and additive Gaussian noise with the variance 2 equal to 1.
Numerical values of the Cherno bounding parameter Z2 are shown in Table I for a
number of di erent  = K .
Consider now the decoding on the last but one level, when the subsets S0(K 2)
and
S1(K 2)
consist of two points each. The corresponding set partitioning is presented
   
in Figure 3. We introduce the notation (k ) = dE r; S0(k 1)
dE r; S1(k 1) , where
8
 
dE r; Si(k 1) , i = 0; 1 is the squared Euclidean distance from the received signal
point r to the nearest point of the set Si(k 1). In Appendix B it is shown that
f (0)(K 1) ( ), the probability density function of (K 1) conditioned on that a zero was
sent, is equal to a scaled convolution of
!
fX (x) = p2 exp
(0) x2
2 2
and
0 2 !1 0 2 !1
1
fY (y) = p exp 2 y p
(0) @ 1  K 1 A 1
+ p exp @ 1 y+ p K 1 A
;
2 2 2 2 2
x; y  0. It is also proven i Appendix B that the following theorem applies.
Theorem 1 For the last but one level of suboptimal decoding of a multilevel coded
QAM signal set, the Cherno bounding parameter has the value
Z (K 1) = Z4 = min
s0
'(K 1) ( s) ; (8)
where
   p 
'( K s) = 2 exp 2 (sK 1)2 Q
1) ( 2sK 1  (9)
  !   !!
 exp s2K 1 Q pK2 1 (2s 1) + exp s2K 1 Q pK2 1 (2s + 1) :
Numerical values of Z4 are shown in Table I for a number of di erent  = K 1.
We can see that for small  the values of Z4 are close to the lower bound Z2 , but for
large  Z4 approaches 2Z2 due to the \nearest neighbor error events principle". In
Figure 4 we show by two examples how the bounds on error probability are improved
by the use of Z4 instead of 2Z2.
In principal the value Z that gives an exponentially tight union bound for the
error probability can be calculated for any decoding level of multilevel coded M -
QAM, but for M > 4 the calculation becomes cumbersome. For example, when the
9
signal set is the 8-QAM signal set, we would have to consider nine reception regions
of which there are 6 di erent types, as compared to the 4-QAM case where we
consider four regions that are all of the same type. Instead of carrying through with
these calculations we perform an analysis of the QAM set with an in nite number
of signal points. The resulting Cherno bounding parameter Z1 upperbounds the
values of Z for any nite QAM signal set. To make this plausible we present the
following intuitive argument. Consider a M -QAM scheme for any nite M . Since
the suboptimal decoding procedure only takes the distance to the two nearest points
into consideration, the adding of ctional points from the QAM set with an in nite
number of signal points, can only make the performance worse. This is because, in
this case, the distance from the received point to the nearest point of the reference
set competes not only with the distance from the received point to the nearest point
of the opposite set, but also with the distance from the received point to the nearest
ctional point.
V. THE CHERNOFF BOUNDING PARAMETER FOR A QAM SET
WITH AN INFINITE NUMBER OF SIGNAL POINTS
Here the Cherno bounding parameter for a multilevel QAM set with an in nite
number of signal points, i.e. the (2Z + 1)2 -lattice, will be given. In this case the
subsets S0(k 1)
and S1(k 1)
consist, on each decoding level, of in nitely many points
each. The corresponding set partitioning is presented in Figure 5. The probability
density function of (k ) , introduced in Section IV, f (0)(k) ( ), is equal to a shifted
version of the scaled convolution of
X1 2  1  p 2
(0)
fX (x) = p exp 2 x + 2ik
i= 1 2

10
p
0  x  k = 2, with itself, as is shown in Appendix C. There the following theorem
is also proven.
Theorem 2 On each level of suboptimal decoding of a multilevel coded QAM signal
set with an in nite number of signal points, the Cherno bounding parameter has
the value
! !
Z (k) = Z1 = min exp spk g(k) (s)2 ; (10)
s0 2
where
X
1 p ! p  p !!
(k) (
g s) = 2 exp 2s2
2sik Q 2ik s Q 2ik s + p k
=
i= 1 2
p ! ! p X  i  sk     i 2 
2 s 4 2 s 1 ( 1) exp p2 1 exp k
= s exp p k
1 +    2 : (11)
k 2 k i=1 2s2 + 2i k
The in nite sum in the rst expression for g (k) ( s) in (11) converges faster for large
k than for small k . If it is desirable to have fast convergence when k is small one
should use the second expression in (11). Numerical values of Z1 are shown in Table
I for a number of di erent  = k . Here it can also be seen that Z1 approaches
4Z2 for large , due to the \nearest neighbor error events principle". We conclude
that the rough bounds Z2 < ZM < 4Z2 have been tightened to Z4 < ZM < Z1 for
M > 4. In Figure 6 we show by two examples how the bounds on error probability
are improved by the use of Z1 instead of 4Z2.
VI. CAPACITY AND CUTOFF RATE OF MULTISTAGE DECODING
USING SUBOPTIMAL METRIC
Analogously to [4] we consider the transmission on each level of modulation as
transmission over an individual channel. But in contrast to [4], where the capa-
cities of these individual channels for optimal decoding on each level of modulation
11
was calculated, we calculate the capacities when suboptimal decoding, described
in Section II is used. We consider the QAM signal set with an in nite number of
signal points, and as an output of the individual channel we consider the sequence
of statistics (k) introduced in Section IV.
The capacity for the kth level of the given multilevel QAM system using multi-
stage decoding with suboptimal metric is
   
C (k ) = H H (k) j v(k) = 0 =
(k) (12)
Z 2k  1 (0) 1   1 (0) 1 
= (1) (1)
f (k) ( ) + 2 f (k) ( ) log 2 f (k) ( ) + 2 f (k) ( ) d +
2k 2
Z 2k (0)
+ 2 f (k) ( ) log f (0)(k) ( )d ;
k

where the superscript denotes the transmitted bit, and where the probability den-
sity function of (k) conditioned on that a one was transmitted satis es f (1)(k) ( )
= f (0)(k) ( ) by symmetry. All signal points are assumed to be equally likely. Con-
sidering the ensemble of codes and calculating the expectation of decoding error
 
(k) = 1 log 1 + Z (k) . The capacity
probability gives that the cuto rate is Rcomp 2

and computational cuto rate of a level using QAM with an in nite number of signal
points is shown in Figure 7. In this gure we also show the capacity for the same
system using optimal metric at each decoding stage calculated in [4]. We conclude
that the capacity for the system using suboptimal metric that we considered is close
to the capacity for a system using optimal metric and multistage decoding. This
indicates that it is not necessary to use the optimal metric in multistage decoding
of multilevel QAM. That is, a complexity reduction by use of the suboptimal metric
can be achieved at very small loss in capacity.

12
VII. CONCLUSIONS
The decoding error probability for multilevel QAM with multistage decoding
using suboptimal metric has been analyzed. Rigorous union bounds for error prob-
abilities of component codes have been presented and formulas for the parameter
Z have been derived. To do this the Cherno bounding method, which gives ex-
ponentially tight bounds, was applied. This formula yields a Z that gives tighter
bounds than the commonly used \nearest neighbor error events principle", but the
bounds do not lose their validity as is the case when the often used approxima-
tion Z = exp ( 2 =82) is applied. Comparison to traditional bounding techniques
for the probability of decoding error has been made for 4-QAM and a QAM sig-
nal constellation with an in nite number of signal points. Calculation of capacity
and computational cuto rate was made for QAM with an in nite number of signal
points. It was concluded that the capacity for the decoding using suboptimal metric
is close to the capacity for the system where optimal metric is used. Hence in terms
of capacity the need for using optimal metric is small when multilevel QAM with
multistage decoding is considered.

13
APPENDIX A.
DERIVATION OF THE CHERNOFF BOUNDING PARAMETER
   
Let  r; v0(k) and  r; vl(k) be the squared Euclidean distances (metrics) from
r to the transmitted codeword and from r to the lth codeword, l = 1; 2; : : : ; L 1,
respectively. The decision is then taken in favor of the codeword whose metric is
minimal. From (1) we have
  X
N  
 r; vl(k) = dE r (n) ; vl(k) (n) ; l = 0; 1; : : : ; L 1; (13)
n=1

where
   
dE r (n) ; vl(k) (n) = min
(k 1)
dE r (n) ; s(k) (n) : (14)
s(k) (n)2S (k) (n)
vl (n)
   
Then the decision criterion  r; v0(k)  r; vl(k) >< 0, k = 1; 2; : : : ; K becomes
(l k) = PNn=1 (l k) (n) >
< 0, where
(k)
   
l (n) = dE r (n) ; v0(k) (n) dE r (n) ; vl(k) (n) : (15)

Without loss of generality we suppose that v0(k) is the all-zero codeword and
that vl(k) is a codeword of Hamming weight wl(k). To simplify the analysis, we also
change the order of the transmitted symbols, such that the rst wl(k) symbols of vl(k)
are ones. Then the decision statistic (l k) = Pwn=1
(k )
l (k ) ( n) is a sum of independent
identically distributed random variables (k ) (
n).
 
To get an exponentially tight bound for P "(l k) , let us introduce the generating
R
function of (l k), (l k) (s) = 11 exp (s ) f(lk) ( ) d , where f(l k) ( ) is the probabil-
ity density function of (l k). Since (l k) is a sum of wl(k) independent identically
 w(k)
distributed random variables we have (l k) (s) = '(k) (s) l , where '(k) (s) =
14
R 1 exp (s ) f
1 (k) ( ) d and f (k ) ( ) is the probability density function of (k) ( n).
Using the Cherno bound [11] we get
 (k)   (k)   (k )   wl(k)
P "l = P l  0  min
s0
l (s) = min
s0
(k
' (s)
) ; (16)

which is equivalent to (3) with

Z (k) = min
s0
'(k) (s) : (17)

APPENDIX B.
PROOF OF THEOREM 1
Let us consider the last but one level, Figure 3. To study the probability density
function of n), we introduce the system of coordinates (x; y). In this system
(K 1) (

the four quadrants de ne 4 regions, where the received point has the same nearest
point from each set. For example in the rst quadrant the nearest point from S0(K 2)

is 2 and from S1(K 2)


it is 1. The noise components in this system of coordinates are
independent Gaussian random variables with zero mean and variance equal to one.
The conditional probability that the received point is in the rst quadrant given
that v0(K 1)
was transmitted is equal to 1=4. In this quadrant the squared Euclidean
distance from the received point to the nearest point of the set S0(K 2)
is
  !2
d E r (n) ; v (K 1)
0 ( n) = (X (n)) + Y (n) pK 1
2
(18)
2
and the squared Euclidean distance from the received point to the nearest point of
the set S1(K 2)
is
  !2
dE r (n) ; vl (K 1)
(n) = X (n) pK 1 + (Y (n))2 : (19)
2
15
So the di erence between the squared Euclidean distances (15) becomes
(K 1) (
p
n) = 2K 1 (X (n) Y (n)) : (20)

Since the probability density function of (K 1) ( n) does not depend on n, for n =


1; 2; : : : ; wl(K 1)
we will drop the argument n. The random variables X and Y are
independent. The conditional probability density functions of X and Y given that
v0(K 1) was transmitted and given that the received signal is in the rst quadrant
(x  0 and y  0) are
2 x 2!
fX (x) = p exp 2 ;
(0)
(21)
2
0 !2 1 0 !2 1
(0) 1 1 
fY (y) = p exp @ 2 y pK 1 1
A + p exp @ 1 p A ; (22)
K 1
2 2 2 2 y+ 2
and the conditional probability density function of (K 1) is a scaled convolution of
the probability density functions of X and Y .
If the received signal is in the second, third or fourth quadrant, we get formulas
for the representation of (K 1) analogous to (20) and for the density functions for X
and Y analogous to (21) and (22), and therefore (20){(22) are valid for all possible
received signals.
From (20){(22) follows that '(K 1) ( s) = 'X (s) 'Y (s) where
Z1 p   p 
'X (s) = exp 2sxK 1 fX(0) (x) dx = 2 exp (sK 1)2 Q 2sK 1 (23)
0
 p Z1 
'Y (s) = exp 2syK 1 fY(0) (y) dy = (24)
0
 2  K 1 !  2  K 1 !
= exp sK 1 (s 1) Q p (2s 1) +exp sK 1 (s + 1) Q p (2s + 1) ;
2 2
and Theorem 1 follows from (17), (23) and (24).

16
APPENDIX C.
PROOF OF THEOREM 2
Here we consider a level where the QAM signal set has in nitely many signal
points, Figure 5. To study the probability density function of the statistics n), (k) (

we introduce the system of coordinates (x; y), Figure 5. The square regions de ned
in this gure are regions in which the received point has the same nearest point from
p p
each set. For example in the region fx; y s.t. 0 < x < k = 2; 0 < y < k = 2g
point A is the nearest point from S0(k 1) , and the nearest point from S1(k 1)
is point
B. The noise components in this system of coordinates are independent Gaussian
random variables with zero mean and variance equal to one.
p p
Let us rst study the region fx; y s.t. 0 < x < k = 2; 0 < y < k = 2g. We
suppose that the received point is in this region, and that v0(k) was transmitted. Then
the squared Euclidean distance from the received point to the nearest reference point
is
 
dE r (n) ; v0(k) (n) = (X (n))2 + (Y (n))2 (25)
and the squared Euclidean distance from the received point to the nearest opposite
point is !2 !2
   
dE r (n) ; vl (n) = X (n) p + Y (n) p :
(k) k k
(26)
2 2
So the di erence between the squared Euclidean distances (15) becomes
(k ) (
p
n) = 2k (X (n) + Y (n)) 2k : (27)

As in Appendix B we drop the argument n since the probability density function of


(k) ( n) does not depend on n. The random variables X and Y are independent. The

17
conditional probability density functions of X and Y given that v0(k) was transmitted
p
and given that the received signal is in the region fx; y s.t. 0 < x < k = 2; 0 <
p
y < k = 2g are (see Figure 8)
P m2   p 
1 x + 2i  2
i= m2 p2 exp
1
2 k
fX(0) (x) = fY(0) (x) = mlim
!1 Rpk mP 2
  p = (28)
2 2 m p1 exp
2 x + 2ik
1
0 i= 2 2

P m2   p 2 
1
i= m2 p2 exp 1 x + 2ik X1 2   p 
= mlim  k2
 = p exp 12 x + 2ik 2 ;
!1 Q p2 (m + 1) i= 1 2
1
2
p (k)
0 < x < k = 2, and the conditional probability density function of is a shifted
version of the scaled convolution of the probability density functions of X and Y .
Obviously, for each of the square regions de ned in Figure 5 we can introduce a
system of coordinates such that the nearest reference point would have coordinates
 p p
(0; 0) and nearest opposite point coordinates k = 2; k = 2 . The di erence be-
tween the squared Euclidean distances satis es (27) where X and Y are coordinates
of received point in this system of coordinate. Therefore (28) de nes the condi-
tional probability density functions of X and Y given that v0(k) was transmitted,
independently from in which square region the received point is.
p 
We have that mins0 '(k) (s) = mins0 '(k) 2k s , and from (17), (27) and
p   p
(28) follows that '(k) 2k s = exp sk = 2 'X (s) 'Y (s), where
Z pk
2
'X (s) = 'Y (s) = exp (sx) fX(0) (x) dx = (29)
0

2 X1  1  p 2 Z pk2  1  p 2


p exp (ik ) + 2 s 2ik
2
exp 2 x s 2ik dx
2 i= 1 0
X
1 s2 p  p  p !!
=2 e2 2 si k Q 2ik s Q 2ik s + pk
= g(k) (s) ;
i= 1 2

18
which proves the rst part of Theorem 2. Now let us prove the second part of the
theorem. We have from (28) that
1 X1   1  p 2   1  p 2
fX (x) = p
(0)
exp 2 x + 2ik + exp 2 x 2ik ;
2 i= 1
which can be expressed as
Z1 !
fX (x) = p1
(0)
 (x;  ) exp  2 d; (30)
2 1 2
  p   p 
where  (x;  ) = P1j= 1   x 2 j k +   x + 2j k . The Fourier
series expansion of  (x;  ) in  is
X p
1 2 2 p p p
2 2ix cos 2i :
 (x;  ) =  +  cos  (31)
k i=1 k k k
R p
Now, (30){(31) together with the equality 11 exp ( p2x2  qx) dx =  exp (q2 =4p2) =p,
p > 0 [2] gives
p X
1 p p  i 2!
(0) 2 2
fX (x) =  +  2 cos 2ix exp (32)
k k i=1 k k
and thus
Z pk
2
g (s) = exp (sx) fX(0) (x) dx = (33)
0
 i  sk     2 
p ! ! p X 1 ( 1) exp p2 1 exp i
= s2 exp spk 4 2s k
1 +   2i 2 ;
k 2 k i=1 2s2 + k
and the proof is complete.

19
References
[1] E. Biglieri, D. Divsalar, P. J. McLane and M. K. Simon, Introduction to Trellis-
Coded Modulation with Applications. Macmillan, 1991.

[2] I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products.


Academic Press, 1980.

[3] H. Herzberg, \On the Spectrum of Distances of a Multilevel Code, Decoded by


a Multistage Decoder," IEEE Trans. Inform. Theory, vol. IT-43, pp. 1736-1739,
Sept. 1997.

[4] J. Huber, \Multilevel Codes: Distance Pro les and Channel Capacity," in ITG-
Fachbereicht 130, Oct. 1994, pp. 305-319. Conference Record.

[5] J. Huber, private communication

[6] H. Imai and S. Hirakawa, \A New Multilevel Coding Method Using Error-
Correcting Codes," IEEE Trans. Inform. Theory, vol. IT-23, pp. 371-377, May
1977.

[7] Y. Kofman, E. Zehavi and S. Shamai, \Performance Analysis of a Multilevel


Coded Modulation System," IEEE Trans. Commun., vol. COM-42, pp. 299-
312, Feb./Mar./Apr. 1994.

[8] G. Ungerbck, \Channel Coding with Multilevel/Phase Signals," IEEE Trans.


Inform. Theory, vol. IT-28, pp. 55-67, Jan. 1982.

20
[9] A. J. Viterbi and J. K. Omura, Principles of Digital Communication and Cod-
ing. McGraw Hill, 1979.

[10] S. G. Wilson, Digital Modulation and Coding. Prentice Hall, 1996.

[11] J. M. Wozencraft and I. M. Jacobs, Principles of Communication Engineering.


Wiley, 1965.

21
LIST OF FIGURE CAPTIONS
Figure 1: The system model showing the transmitter, the additive white Gaussian
noise (AWGN) channel and the multistage receiver.

Figure 2: Set partitioning when K = 4.


Figure 3: The signal constellation on the last but one level.
Figure 4: Two examples of the error bound (5) for the last but one level in a QAM
system. The gure shows the bound when the lower bound in (6) D = Z2
(dashed) and the \nearest neighbor" upper bound for 4-QAM D = 2Z2
(dash-dotted) are used, and when D = Z4 (solid) is applied. (a) T (D) =
D5= (1 2D), (b) T (D) = D10 (D8 4D6 + 5D4 4D2 + 3) = ( D12 + 4D10
4D8 + 2D6 2D4 2D2 + 1).

Figure 5: The signal constellation for M -QAM with M = 1.


Figure 6: Two examples of the error bound (5) for a level of a M -QAM system with
M = 1. The gure shows the bound when the lower and upper bounds in
(6) D = Z2 (dashed) and D = 4Z2 (dash-dotted) are used, and when D = Z1
(solid) is applied. The generating functions are the same as in Figure 4.

Figure 7: Capacity (solid line) and computational cuto rate (dashed line) for
QAM with an in nite number of signal points, using suboptimal metric. The
capacities for 16-QAM (stars) and a \large" QAM constellation (circles), both
using optimal metric [4], are also shown.

Figure 8: Illustration to equation (28).


22
LIST OF TABLE CAPTIONS
Table I: Numerical values of Z .

23
u(1) C v(1)
1
u(2)
C
v (2)
2
u Partition
of
u(3) C3
v(3) 2K -QAM s r subopti-
mal
u^
mapper AWGN
information p decoder
p
p
u(K ) C v(K )
K

Figure 1:

uuuu
uuuu
uuuu S (0) , 1
uuuu
v(1) (n) = 0  XXv(1) (n) = 1
 XXX


9 XX
z
X
eueu ueue
ueue eueu p
eueu ueue S (1) , 2 = 21
ueue eueu
v(2) (n) = 0  QQv(2) (n) = 1 v(2) (n) = 0  QQv(2) (n) = 1
+
 s
Q +
 s
Q
eeee eueu ueue eeee
ueue eeee eeee eueu
eeee eueu ueue eeee S (2) , 3 = 21
ueue eeee eeee eueu

Jv ((3)
n) = 1
Jv ((3)
n) = 1
Jv ((3)
n) = 1
Jv (n) = 1
(3) (3) (3) (3)

v (3) (n) = 0 
v ( n ) = 0 v ( n ) = 0 v ( n ) = 0
e e e e e^e e e e ee u e^u e e u ee e e^e u e e ee e eJ^e e e
J
J
J

e e u e u e e e e e e e e e e e e e e e e e e e e u e e e e e u (3) p
e e e e e e e e e u e e e e e u e e u e u e e e e e e e e e e e S , 4 = 2 2 1
ueee eeue eeee eeee eeee eeee eeeu euee

Figure 2:

24
received
point
y@
I
@ (
b(
6
((H
x
( ( H
2@@x(((  Hh1
I
@
 (K 1)
 @ @  
d E r (n) ; v
0 ( n) = 0 @
@
@
dE r (n) ; vl(K 1)
(n) = 1
@
@ -
@
@
a
@
@
@
u Referencepoint 2 S0(K 2)
3 h K 1 @x 4 e Opposite point 2 S (K 2)
@ 1

Figure 3:
 Z2 Z4 Z1 Z4=Z2 Z1=Z2
0.5 0.9692 0.9984 1.0000 1.03 1.03
1.0 0.8825 0.9793 1.0000 1.11 1.13
1.5 0.7548 0.9198 0.9997 1.22 1.32
2.0 0.6065 0.8124 0.9858 1.34 1.63
2.5 0.4578 0.6686 0.9152 1.46 2.00
3.0 0.3247 0.5108 0.7740 1.57 2.38
3.5 0.2163 0.3618 0.5932 1.67 2.74
4.0 0.1353 0.2379 0.4135 1.76 3.06
4.5 0.0796 0.1453 0.2636 1.82 3.31
5.0 0.0439 0.0825 0.1545 1.88 3.52
5.5 0.0228 0.0437 0.0837 1.92 3.67
6.0 0.0111 0.0216 0.0420 1.94 3.78
6.5 0.0051 0.0100 0.0196 1.96 3.86
7.0 0.0022 0.0043 0.0086 1.98 3.91

Table I:

25
10
−2
(a) (b)
−4
10
−3
10

−6
10
−4
10

−8
10
−5
10

−10
10
−6
10

−12
10
−7
10

−14
10
−8

3 4 5 6 3 4 5 6
10

 
Figure 4:
@
I y b  x B  
@
@ 6 
j
 A@ )
 dE r (n) ; vl(k) (n) = 1
@   A @
@u @ @ e @@u @@e @@u  A @  received point
@ @   
@ @ @ @@
@ e@ u@  @ @
 @   
@ @ B@e @u @e @
@
 i
P P P dE r (n) ; v0(k) (n) = 0
@ @ @ @ 
 k @ @ @ @ @
@ @ 
@u @e A@u @e @u @z

A
@ @ @ @
@ e @ u @ e @ u @ e
@ -
a
u Reference point 2 S0(k 1)
@ @ @ @ @
@ @ @ @ @
e Opposite point 2 S1(k 1)
Figure 5:

10
−2
(a) (b)
−4
10
−3
10

−6
10
−4
10

−8
10
−5
10

−10
10
−6
10

−12
10
−7
10

−14
10
−8

3 4 5 6 3 4 5 6
10

 
Figure 6:

26
1

bits per
level and 0.8

channel use
0.6

0.4

0.2

0 0 1
10 10


Figure 7:

Figure 8:

27