
Chapter 5

Nyquist Signalling
5.1 Introduction
In this and the following chapter, we focus on widespread signaling techniques. This
chapter is devoted to the waveform former and its receiver-side counterpart, the n-tuple
former. Chapter 6 focuses on the encoder/decoder pair.
In principle, what we do in these two chapters is general enough to cover both baseband
and passband communication. However, for reasons of practicality and robustness, we
assume for now that we are doing baseband communication. The extension to passband
communication will be handled by a final layer, discussed in Chapter 7.
The examples in Section 4.8 suggest that symbol-by-symbol on a pulse train is a signaling
candidate that should be pursued further. In the next section we rediscover symbol-
by-symbol on a pulse train as the natural technique for dealing with baseband channels
that have a strict bandwidth limitation.
Once we have committed to symbol-by-symbol on a pulse train, the only freedom we have
at the waveform former is the pulse choice. The pulse that we use in the next section is
not suitable in practice, but we will be able to generalize the theory and discover a design
method that allows us to construct various pulses that can be used in real applications.
To keep the notation to a minimum, in this chapter we write

    w(t) = ∑_{j=1}^{n} c_j ψ(t - jT)    (5.1)

instead of w_i(t) = ∑_{j=1}^{n} c_{i,j} ψ(t - jT). We drop the message index i because we will be
studying properties of the pulse ψ(t), as well as properties of the stochastic process that
models the transmitter output signal, neither of which depends on a particular message
choice. Following common practice, we refer to c_j as a symbol. The next example gives
us three common choices.
Example 71. (Modulation) The widely used PAM and QAM modulations are indeed
symbol-by-symbol on a pulse train, with the symbols taking value in a PAM or QAM
constellation, respectively. See Figure 2.6 for an example of a PAM constellation and
Figure 2.7 for a QAM constellation. Similarly, PSK modulation is symbol-by-symbol on
a pulse train, with the symbols taking value in a PSK constellation, an example of which
is given in Figure 2.9.
For mathematical convenience, in this chapter we let the symbol sequence be of infinite
length.
Symbol-by-symbol on a pulse train is not part of the standard terminology. We use it
because we find it to be appropriate and because there is no standard substitute for it. The word
modulation is too general; for instance, it can be used in conjunction with pulse position
modulation, which does not fit in the framework of this chapter.
With the pulse we want to obtain a desired power spectral density for the stochastic
process that models the transmitted signal. We are interested in controlling the power
spectral density because in many applications, notably cellular communications, it has
to fit a certain frequency-domain mask. This restriction is meant to limit the amount
of interference that a user can cause to users of adjacent bands. In certain situations, a
restriction is self-imposed for efficiency. For instance, if the channel attenuates certain
frequencies more than others, we can optimize the system performance by shaping the
spectrum of the transmitted signal so as to invest more power at those frequencies where
the channel is better. This is done according to a technique called water-filling.¹
This chapter is organized as follows. In Section 5.2, we develop an instructive special
case where the channel is strictly bandlimited, and we rediscover symbol-by-symbol on
a pulse train as a natural signaling technique for strictly bandlimited channels. In
Section 5.3 we derive the expression for the power spectral density of the transmitted signal
for an arbitrary pulse when the symbol sequence constitutes a discrete-time wide-sense-
stationary process. As a preview, we discover that when the symbols are uncorrelated,
which is frequently the case, the spectrum is proportional to |ψ_F(f)|^2. In Section 5.4, we
derive the necessary and sufficient condition on |ψ_F(f)|^2 in order for {ψ_j(t)}_{j∈Z} to be an
orthonormal set when ψ_j(t) = ψ(t - jT). We will see that this is the case if and only if
|ψ_F(f)|^2 fulfills the so-called Nyquist criterion.
5.2 The Ideal Lowpass Case
In this section we assume that the channel contains an ideal lowpass filter of impulse
response h(t) and frequency response

    h_F(f) = { 1,  |f| ≤ W/2
             { 0,  otherwise,
¹ The water-filling idea is typically treated in an information theory course. See e.g. [2].
see Figure 5.1. This is an idealized version of a lowpass channel.
Figure 5.1: Lowpass channel model. (The transmitted signal w(t) passes through the
ideal lowpass filter h(t), and white Gaussian noise N(t) of power spectral density N_0/2
is added to form the received signal R(t).)
The filter blocks all the signal components that fall outside the frequency interval
[-W/2, W/2]. Hence we need to consider only signals that are strictly band-limited to [-W/2, W/2].
The sampling theorem, stated below and proved in Appendix 5.D, tells us that such sig-
nals can be described by a sequence of numbers. The idea is to let the encoder produce
these numbers and let the waveform former convert them into the desired w(t).
Theorem 72. (Sampling Theorem) Let w(t) be a continuous L^2 function (possibly
complex-valued) and let its Fourier transform w_F(f) vanish for f ∉ [-W/2, W/2]. Then
w(t) can be reconstructed from the sequence of T-spaced samples {w(nT)}_{n∈Z}, provided
that T ≤ 1/W. Specifically,

    w(t) = ∑_{n=-∞}^{∞} w(nT) sinc(t/T - n),    (5.2)

where sinc(t) = sin(πt)/(πt).
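The reconstruction formula (5.2) is easy to exercise numerically. The sketch below (assuming Python with NumPy, whose np.sinc matches the definition sinc(t) = sin(πt)/(πt); the test signal and truncation window are arbitrary choices) keeps only the T-spaced samples of a bandlimited signal and rebuilds it at an off-grid time instant.

```python
import numpy as np

T = 1.0                                   # sample spacing, T <= 1/W

# A test signal bandlimited to [-W/2, W/2] with W = 1/T: a sum of two
# sinc pulses, so the hypotheses of Theorem 72 hold exactly.
def w(t):
    return 2.0 * np.sinc(t / T) - 1.0 * np.sinc(t / T - 3)

n = np.arange(-200, 201)                  # truncated index range for the demo
samples = w(n * T)                        # the T-spaced samples w(nT)

def reconstruct(t):
    """Interpolation formula (5.2): sum_n w(nT) sinc(t/T - n)."""
    return np.sum(samples * np.sinc(t / T - n))

t0 = 0.4321                               # an off-grid time instant
print(abs(reconstruct(t0) - w(t0)))       # tiny (truncation/rounding only)
```

The truncation of the sum to finitely many terms is what keeps the residual from being exactly zero.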
In the sampling theorem, we assume that the signal of interest is in L^2. Essentially, L^2
is a vector space of finite-energy functions, and this is all we really need to know about
L^2 to get the most out of this text. However, I recommend reading Appendix 5.A, which
contains an informal introduction to L^2, because we often read about L^2 in the technical
literature. The appendix is also necessary to understand some of the subtleties related to
the Fourier transform (Appendix 5.B) and Fourier series (Appendix 5.C). Our reason for
referring to L^2 is that it is necessary for a rigorous proof of the sampling theorem. All
finite-energy signals that we encounter in this text are L^2 functions. It is safe to say that
all finite-energy functions that anyone encounters while practicing engineering are in L^2.
In the statement of the sampling theorem, we require continuity. Continuity is needed,
and it does not follow from the condition that w_F(f) is band-limited. In fact, if we take a
nice (continuous and L^2) function and modify it at a single point, say at t = T/2, then
the original and the modified functions have the same Fourier transform. If the original
is band-limited, then so is the modified function. The reconstruction formula (5.2) will
reconstruct the original (continuous) function.
The sinc pulse used in the statement of the sampling theorem is not normalized to unit
energy. If we normalize it, specifically define ψ(t) = (1/√T) sinc(t/T), then {ψ(t - jT)}_{j=-∞}^{∞}
forms an orthonormal set. (This is implied by Theorem 79. The impatient reader can
verify it by direct calculation, using Parseval's relationship. Useful facts about the sinc
function and its Fourier transform are contained in Appendix 5.B.) Thus (5.2) can be
rewritten as

    w(t) = ∑_{j=-∞}^{∞} c_j ψ(t - jT),   ψ(t) = (1/√T) sinc(t/T),    (5.3)

where c_j = w(jT)√T. Hence a signal w(t) that fulfills the conditions of the sampling the-
orem is one that lives in the inner product space spanned by {ψ(t - jT)}_{j=-∞}^{∞}. When we
sample such a signal, we obtain (up to a scaling factor) the coefficients of its orthonormal
expansion with respect to the orthonormal basis {ψ(t - jT)}_{j=-∞}^{∞}.
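The orthonormality of the T-spaced translates of the normalized sinc can also be checked directly by numerical integration. A minimal sketch (assuming Python with NumPy; the window and grid spacing are arbitrary, and the small residuals come from truncating the sinc tails):

```python
import numpy as np

T, dt = 1.0, 0.01
t = np.arange(-60.0, 60.0, dt)            # long window to tame the sinc tails

def psi(t, j=0):
    """Normalized pulse psi(t - jT) = (1/sqrt(T)) sinc((t - jT)/T)."""
    return (1.0 / np.sqrt(T)) * np.sinc((t - j * T) / T)

# <psi(t), psi(t - jT)> should be close to 1 for j = 0 and 0 otherwise.
for j in range(4):
    ip = np.sum(psi(t) * psi(t, j)) * dt
    print(j, round(ip, 2))
```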
Now let us go back to our communication problem. We have just seen that any physical
(continuous and L^2) signal w(t) that has no energy outside the frequency range [-W/2, W/2]
can be synthesized as w(t) = ∑_j c_j ψ(t - jT). This signal has exactly the form of
symbol-by-symbol on a pulse train. To implement this signaling method, we let the j-th
encoder output be c_j = w(jT)√T, and let the waveform former be defined by the pulse
ψ(t) = (1/√T) sinc(t/T). The waveform former, the channel, and the n-tuple former are shown
in Figure 5.2.
Figure 5.2: Symbol-by-symbol on a pulse train obtained naturally from the
sampling theorem. (The waveform former maps the symbols c_j into ∑_j c_j ψ(t - jT);
the channel filters with h(t) and adds the noise N(t); the n-tuple former filters the
received signal R(t) with the matched filter ψ*(-t) and samples at t = jT to produce
Y_j = ⟨R, ψ_j⟩.)
Discussing the various ways to implement the waveform former is outside the scope of
this text, but we do mention a particularly interesting one. To generate c_j ψ(t - jT) (which
is the j-th term in the orthonormal expansion of w(t)), we produce the narrow rectangle
(c_j/ε) 1_{[jT, jT+ε]}(t), where 0 < ε ≪ T, and let it be the input to a filter of impulse response
ψ(t). As ε approaches 0, the input approaches the Dirac delta function positioned at
jT and scaled by c_j, and the output approaches the filter impulse response delayed by
jT and scaled by c_j, as desired.
It is interesting to observe that we are using the sampling theorem somewhat backwards,
in the following sense. Typically, the sampling theorem is introduced as a means to
represent a continuous and band-limited L^2 source signal by means of its samples. In that case
the sampling happens at the source and the reconstruction (or interpolation) happens
at the destination. In the diagram of Figure 5.2, the reconstruction happens at the
transmitter and the sampling at the receiver. The matched filter in this case is a lowpass
filter that removes the out-of-band noise. Its output is continuous and bandlimited, hence
it can be sampled without losing any information. To see that the matched filter is indeed
a lowpass filter, recall that ψ(t) is a sinc function, hence real-valued and even. Thus
ψ*(-t) = ψ(t). Its Fourier transform is

    ψ_F(f) = { √T,  |f| ≤ 1/(2T)
             { 0,   otherwise,

where W = 1/T. (Appendix 5.B explains an effortless method for relating the rectangle
and the sinc as Fourier pairs.)
The following examples are meant to add some perspective by providing a few scenarios
where the sampling theorem is used in a related setting. (The T referred to in the
examples that follow is the one for which {ψ(t - jT)}_{j=-∞}^{∞} forms an orthonormal set.)
Example 73. (Digital communication) Here we assume the source outputs a sequence
of bits. (There is no need to assume that they are independent and uniformly distributed.) We
choose the symbol alphabet to be of cardinality 2^k for some integer k. To be concrete, let
us assume that we choose the 8-ary PAM constellation. (Figure 2.6 shows the 6-ary PAM
constellation.) In this case k = 3. The encoder partitions the bit sequence into triplets
and maps each triplet into a symbol of the 8-ary PAM constellation. With reference to
Figure 5.2, in the absence of noise Y_j = c_j.
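The triplet-to-symbol mapping of Example 73 can be sketched in a few lines. This is a hypothetical mapper, assuming Python: the text does not fix the 8-ary PAM levels, so the common equally spaced, zero-mean constellation {-7, -5, -3, -1, +1, +3, +5, +7} (unnormalized) is assumed here, with the triplet read as a plain binary index.

```python
# Assumed 8-ary PAM constellation (the chapter does not specify the levels).
LEVELS = [-7, -5, -3, -1, 1, 3, 5, 7]

def bits_to_8pam(bits):
    """Partition the bit sequence into triplets (k = 3) and map each
    triplet, read as an integer in 0..7, to a PAM level."""
    assert len(bits) % 3 == 0
    symbols = []
    for i in range(0, len(bits), 3):
        index = bits[i] * 4 + bits[i + 1] * 2 + bits[i + 2]
        symbols.append(LEVELS[index])
    return symbols

print(bits_to_8pam([0, 0, 0, 1, 1, 1, 1, 0, 0]))  # [-7, 7, 1]
```

In practice a Gray mapping would usually be preferred, so that nearest-neighbor symbol errors cause a single bit error.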
Example 74. (Communication of an analog discrete-time source) If the source produces
a sequence of real-valued numbers, we could associate the j-th source output with the
symbol c_j. In this case no encoder/decoder pair is needed. Here we are implicitly as-
suming that the source produces one output every T seconds. If it produces outputs
at a rate smaller than one every T seconds, then we can buffer the source outputs to
form packets of some length. The sender then works in burst mode, meaning that once
in a while it transmits a packet of symbols at the rate of one symbol every T seconds.
In between bursts, the channel can be used for other purposes. It could be used by the
same transmitter to send data from a different source, or it could be used by a different
transmitter. In the latter case, the two transmitters have to coordinate their activity or
else they might interfere with one another.
Example 75. (Communication of a band-limited continuous-time source: analog ap-
proach) If the source is in continuous time and band-limited to [-W/2, W/2], where W = 1/T,
then we can sample the source and proceed as in the previous example. However, if the
source bandwidth equals the channel filter bandwidth, then it is easier to plug the source
directly into the channel. This is the most basic form of analog communication.
Example 76. (Communication of a band-limited continuous-time source: digital ap-
proach) Assume that the source is as in the previous example. The state-of-the-art ap-
proach is to sample the source, quantize the samples, and convert them to binary form.
Now we are in the domain of digital communication. In particular, we are free to choose
the desired symbol constellation and to choose a very basic encoder such as the one in
Example 73 or, if we are not satisfied with the error probability obtained this way, we can
choose a more sophisticated encoder as described in Chapter 6.
In the rest of the chapter, we generalize symbol-by-symbol on a pulse train. The goal is
to understand which pulses ψ(t) are allowed and to determine their effect on the power
spectral density of the communication signal they produce.
5.3 Power Spectral Density
A typical requirement imposed by regulators is that the power spectral density (PSD)
of the transmitter's output signal be below a given frequency-domain mask. In this
section we compute the PSD of the transmitter's output signal modeled as

    X(t) = ∑_{i=-∞}^{∞} X_i ψ(t - iT - Θ),    (5.4)

where {X_j}_{j=-∞}^{∞} is a zero-mean wide-sense stationary discrete-time process, ψ(t) an
arbitrary L^2 function (not necessarily normalized or orthogonal to its T-spaced time
translates), and Θ is a random dither (or delay), independent of {X_j}_{j=-∞}^{∞} and uniformly
distributed in the interval [0, T).
The insertion of the random dither Θ, not considered so far in our signal model, needs
to be justified. It models the fact that a transmitter is switched on at a random time. Hence
an observer that knows the signal's structure has no information regarding the relative
position of a pulse with respect to an interval of the form [a, a+T], where a is an arbitrary
number. The dither models this uncertainty. The reason we have not inserted the dither
thus far is that we did not want to make the signal model more complicated than
needed. Upon a sufficiently long observation time, the receiver can estimate the dither
and compensate for it. Hence there is no need to include it in the signal model used to
derive the receiver. We include it now because the instrument used to measure the power
spectral density of the transmitter's output signal does not try to synchronize its clock to
that of the measured signal, and because the dither makes X(t) a wide-sense stationary
(WSS) process, thus greatly simplifying our derivation of the power spectral density. We
are also assuming that {X_j}_{j=-∞}^{∞} is zero-mean and wide-sense stationary. Whether or not
this is a valid assumption depends on the encoder. It is for many encoders, an
example of which will be studied in the next chapter.
In the derivation that follows, we use the autocovariance

    K_X[i] := E[(X_{j+i} - E[X_{j+i}])(X_j - E[X_j])*] = E[X_{j+i} X_j*],

which depends only on i since, by assumption, {X_j}_{j=-∞}^{∞} is WSS. We also use the
self-similarity function of the pulse ψ(·), defined as

    R_ψ(τ) := ∫_{-∞}^{∞} ψ(α + τ) ψ*(α) dα.    (5.5)

(Think of the definition of an inner product if you tend to forget where to put the * in
the above two definitions.)
The process X(t) is zero-mean. Indeed, using the independence between X_i and Θ and
the fact that E[X_i] = 0, we obtain E[X(t)] = ∑_{i=-∞}^{∞} E[X_i] E[ψ(t - iT - Θ)] = 0.
The autocovariance of X(t) is

    K_X(t + τ, t) = E[(X(t + τ) - E[X(t + τ)])(X(t) - E[X(t)])*]
        = E[X(t + τ) X*(t)]
        = E[ ∑_{i=-∞}^{∞} X_i ψ(t + τ - iT - Θ) ∑_{j=-∞}^{∞} X_j* ψ*(t - jT - Θ) ]
        = E[ ∑_{i=-∞}^{∞} ∑_{j=-∞}^{∞} X_i X_j* ψ(t + τ - iT - Θ) ψ*(t - jT - Θ) ]
        = ∑_{i=-∞}^{∞} ∑_{j=-∞}^{∞} E[X_i X_j*] E[ψ(t + τ - iT - Θ) ψ*(t - jT - Θ)]
        = ∑_{i=-∞}^{∞} ∑_{j=-∞}^{∞} K_X[i - j] E[ψ(t + τ - iT - Θ) ψ*(t - jT - Θ)]
    (a) = ∑_{k=-∞}^{∞} K_X[k] ∑_{i=-∞}^{∞} (1/T) ∫_0^T ψ(t + τ - iT - θ) ψ*(t - iT + kT - θ) dθ
    (b) = ∑_{k=-∞}^{∞} K_X[k] (1/T) ∫_{-∞}^{∞} ψ(t + τ - θ) ψ*(t + kT - θ) dθ
        = ∑_k K_X[k] (1/T) R_ψ(τ - kT),

where in (a) we make the change of variable k = i - j and in (b) we use the fact that, for
an arbitrary function u : R → R, an arbitrary number a ∈ R, and a positive (interval
length) b,

    ∑_{i=-∞}^{∞} ∫_a^{a+b} u(x + ib) dx = ∫_{-∞}^{∞} u(x) dx.    (5.6)
(If (5.6) is not clear, a picture is worth a thousand words: picture integrating from a to
a + 2b by integrating first from a to a + b, then from a + b to a + 2b, and summing
the results. This corresponds to the right-hand side. Now consider integrating both times from a to
a + b, but before you perform the second integration you shift the function to the left by
b. This corresponds to the left-hand side.)
We see that K_X(t + τ, t) depends only on τ. Hence we simplify notation and write K_X(τ)
instead of K_X(t + τ, t). We summarize:

    K_X(τ) = ∑_k K_X[k] (1/T) R_ψ(τ - kT).    (5.7)
The process X(t) is WSS because neither its mean nor its autocovariance depends on t.
For the last step in the derivation of the power spectral density, we use the fact that the
Fourier transform of R_ψ(τ) is |ψ_F(f)|^2. This follows from Parseval's relationship:

    R_ψ(τ) = ∫ ψ(α + τ) ψ*(α) dα
           = ∫ ψ_F(f) ψ_F*(f) exp(j2πfτ) df
           = ∫ |ψ_F(f)|^2 exp(j2πfτ) df.
Now we can take the Fourier transform of K_X(τ) to obtain the power spectral density

    S_X(f) = (|ψ_F(f)|^2 / T) ∑_k K_X[k] exp(-j2πkfT).    (5.8)

The above expression is in a form that suits us. In many situations, the infinite sum has
only a small number of non-zero terms. Note that the summation in (5.8) is the discrete-
time Fourier transform of {K_X[k]}_{k=-∞}^{∞}, evaluated at fT. This is the power spectral
density of the discrete-time process {X_i}_{i=-∞}^{∞}. If we think of |ψ_F(f)|^2 / T as being the power
spectral density of ψ(t), we can interpret S_X(f) as being the product of two PSDs: that
of ψ(t) and that of {X_i}_{i=-∞}^{∞}.
In many cases of interest, K_X[k] = ℰ δ_k, where ℰ = E[|X_j|^2]. In this case we say that the
zero-mean WSS process {X_i}_{i=-∞}^{∞} is uncorrelated. Then (5.7) and (5.8) simplify to

    K_X(τ) = ℰ R_ψ(τ) / T    (5.9)
    S_X(f) = ℰ |ψ_F(f)|^2 / T.    (5.10)
Example 77. Suppose that {X_i}_{i=-∞}^{∞} is an independent and uniformly distributed se-
quence taking values in ±√ℰ, and ψ(t) = √(1/T) sinc(t/T). Then K_X[k] = ℰ δ_k and

    S_X(f) = ℰ 1_{[-W/2, W/2]}(f),

where W = 1/T. This is consistent with our intuition. When we use the pulse sinc(t/T), we
expect a flat power spectrum over [-W/2, W/2] and no power outside this interval. The energy
per symbol is ℰ, hence the power is ℰ/T, and the power spectral density is ℰ/(WT) = ℰ.
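A quick empirical check of the uncorrelated case is possible. The sketch below (assuming Python with NumPy; the energy value ℰ = 4 and the sequence length are arbitrary choices) draws an i.i.d. ±√ℰ sequence and verifies that its autocovariance is ℰ at lag 0 and essentially 0 elsewhere, so that by (5.8) the spectrum is flat in the band.

```python
import numpy as np

rng = np.random.default_rng(0)
E = 4.0                    # symbol energy, E = E[|X_j|^2] (arbitrary choice)
N = 200_000
X = rng.choice([-np.sqrt(E), np.sqrt(E)], size=N)

def K(X, k):
    """Empirical autocovariance E[X_{j+k} X_j] of a zero-mean sequence."""
    return np.mean(X[k:] * X[:len(X) - k]) if k else np.mean(X * X)

print(K(X, 0))             # exactly E here, since every X_j^2 = E
print(K(X, 1), K(X, 2))    # both close to 0: the symbols are uncorrelated
```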
In the next example, we work out a case where K_X[k] ≠ ℰ δ_k. In this case we say that
the zero-mean WSS process {X_i}_{i=-∞}^{∞} is correlated.
Example 78. (Correlated Symbol Sequence) Suppose that the pulse is as in Example 77,
but the symbol sequence is now the output of an encoder described by X_i = √ℰ (B_i - B_{i-2}),
where {B_i}_{i=-∞}^{∞} is a sequence of independent random variables that are uniformly
distributed over the alphabet {0, 1}. After verifying that the symbol sequence is zero-
mean, we insert X_i = √ℰ (B_i - B_{i-2}) into K_X[k] = E[X_{j+k} X_j*] and obtain

    K_X[k] = { ℰ,     k = 0
             { -ℰ/2,  k = ±2
             { 0,     otherwise,

    S_X(f) = (|ψ_F(f)|^2 / T) ℰ (1 - (1/2) e^{j4πfT} - (1/2) e^{-j4πfT})
           = 2ℰ sin^2(2πfT) 1_{[-W/2, W/2]}(f).    (5.11)
When we compare this example to Example 77, we see that this encoder shapes the power
spectral density from a rectangular shape into a squared sine. Notice that the spectral
density vanishes at f = 0. This is desirable if the channel blocks very low frequencies,
which happens, for instance, for a cable that contains amplifiers. To avoid amplifying offset
voltages and leakage currents, amplifiers are AC (alternating current) coupled. This means
that amplifiers have a high-pass filter at the input, often just a capacitor, that blocks
DC (direct current) signals. Notice that the encoder is a linear time-invariant system
(with respect to addition and multiplication in R). Hence the cascade of the encoder and
the pulse forms a linear time-invariant system. It is immediate to verify that its impulse
response is ξ(t) = ψ(t) - ψ(t - 2T). Hence in this case we can write

    X(t) = ∑_l X_l ψ(t - lT) = √ℰ ∑_l B_l ξ(t - lT).

The technique described in this example is called correlative encoding or partial response
signaling.
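The shaping effect of the correlative encoder can be verified empirically. The sketch below (assuming Python with NumPy) works with the unit-scale encoder X_i = B_i - B_{i-2}, which sidesteps the overall energy normalization: what it checks is the structure of the covariance (non-zero only at lags 0 and ±2, with ratio -1/2) and the resulting DC null in the in-band factor ∑_k K_X[k] e^{-j2πkfT} of (5.8).

```python
import numpy as np

rng = np.random.default_rng(1)
N = 400_000
B = rng.integers(0, 2, size=N).astype(float)  # i.i.d. uniform over {0, 1}
X = B[2:] - B[:-2]                            # unit-scale X_i = B_i - B_{i-2}

def K(X, k):
    """Empirical autocovariance E[X_{j+k} X_j] of a zero-mean sequence."""
    return np.mean(X[k:] * X[:len(X) - k]) if k else np.mean(X * X)

# Covariance structure: non-zero only at lags 0 and +-2, with ratio -1/2.
print(round(K(X, 2) / K(X, 0), 2))            # close to -0.5
print(round(K(X, 1), 3))                      # close to 0

# In-band spectral factor K[0] + 2 K[2] cos(4 pi fT), sampled over fT.
fT = np.linspace(0.0, 0.5, 6)
S = K(X, 0) + 2 * K(X, 2) * np.cos(4 * np.pi * fT)
print(np.round(S, 3))                         # starts near 0: the DC null
```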
The encoder in Example 78 is linear with respect to (the field) R, and this is the reason its
effect can be incorporated into the pulse; but this is not the case in general (see Exercise 2
and Chapter 6).
5.4 Nyquist Criterion for Orthonormal Bases
In the previous section we saw that symbol-by-symbol on a pulse train with uncorrelated
symbols (a condition often met) generates a stochastic process of power spectral density
ℰ |ψ_F(f)|^2 / T, where ℰ is the symbols' average energy. This is progress, because it tells us how
to choose ψ(t) to obtain a desired power spectral density. But remember that ψ(t) has
another important constraint: it must be orthogonal to its T-spaced time translates. It
would be nice to have a simple-to-work-with characterization of |ψ_F(f)|^2 that guarantees
orthogonality between ψ(t) and ψ(t - lT) for all l ∈ Z. This seems to be asking too
much, but it is exactly what the next result is all about.

So, we are looking for a frequency-domain equivalent of the condition

    ∫ ψ(t - nT) ψ*(t) dt = δ_n.    (5.12)
The form of the left-hand side suggests using Parseval's relationship. Doing so yields

    δ_n = ∫ ψ(t - nT) ψ*(t) dt = ∫ ψ_F(f) ψ_F*(f) e^{-j2πnTf} df
        = ∫ |ψ_F(f)|^2 e^{-j2πnTf} df
    (a) = ∫_{-1/(2T)}^{1/(2T)} ∑_{k∈Z} |ψ_F(f + k/T)|^2 e^{-j2πnT(f + k/T)} df
    (b) = ∫_{-1/(2T)}^{1/(2T)} ∑_{k∈Z} |ψ_F(f + k/T)|^2 e^{-j2πnTf} df
    (c) = ∫_{-1/(2T)}^{1/(2T)} g(f) e^{-j2πnTf} df,

where in (a) we use again (5.6) (but in the other direction), in (b) we use the fact that
e^{-j2πnT(f + k/T)} = e^{-j2πnTf}, and in (c) we introduce the function

    g(f) = ∑_{k∈Z} |ψ_F(f + k/T)|^2.
Notice that g(f) is a periodic function of period 1/T, and the right-hand side of (c) is
1/T times the n-th Fourier series coefficient A_n of the periodic function g(f). (A review
of Fourier series is given in Appendix 5.C.) Because A_0 = T and A_k = 0 for k ≠ 0, the
Fourier series of g(f) is the constant T. Up to a technicality discussed below, this proves
the following result.
Theorem 79. (Nyquist Criterion for Orthonormal Pulses) Let ψ(t) be an L^2 function.
The set of waveforms {ψ(t - jT)}_{j=-∞}^{∞} is an orthonormal one if and only if |ψ_F(f)|^2
satisfies the Nyquist criterion for T, i.e.,

    l.i.m. ∑_{k=-∞}^{∞} |ψ_F(f + k/T)|^2 = T,   f ∈ R.    (5.13)
The l.i.m. stands for limit in mean square. It means that as we add more and more terms
to the sum on the left-hand side of (5.13), it becomes L^2 equivalent to the constant on the
right-hand side. Two L^2 functions are L^2 equivalent if their difference is a zero-energy
function (see Appendix 5.A). The l.i.m. is a technicality due to the Fourier series. To see
that the l.i.m. is needed, take a ψ_F(f) such that |ψ_F(f)|^2 fulfills (5.13) even without the
l.i.m. An example of such a function is the rectangle |ψ_F(f)|^2 = T for f ∈ [-T/2, T/2)
and zero elsewhere. Now, take a copy ψ'_F(f) of ψ_F(f) and modify |ψ'_F(f)|^2 at an arbitrary
isolated point. For instance, we set |ψ'_F(0)|^2 = 0. The inverse Fourier transform of ψ'_F(f)
is still ψ(t). Hence it is orthogonal to its T-spaced time translates. Yet (5.13) is no longer
fulfilled if we omit the l.i.m. For our specific example, the left and the right side differ at
exactly one point of each period. Equality still holds in the l.i.m. sense. In all practical
applications, ψ_F(f) is a smooth function and we can ignore the l.i.m. in (5.13).
Notice that the left side of the equality in (5.13) is periodic with period 1/T. Hence, to
verify that |ψ_F(f)|^2 fulfills the Nyquist criterion, it is sufficient to verify that (5.13) holds
over an interval of length 1/T.
Example 80. The following functions satisfy the Nyquist criterion with parameter T:

(a) |ψ_F(f)|^2 = T 1_{[-1/(2T), 1/(2T)]}(f)

(b) |ψ_F(f)|^2 = T cos^2(πfT/2) 1_{[-1/T, 1/T]}(f)

(c) |ψ_F(f)|^2 = T (1 - T|f|) 1_{[-1/T, 1/T]}(f)
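The three functions of Example 80 can be checked against (5.13) numerically. A minimal sketch (assuming Python with NumPy; T = 1 and the frequency grid are arbitrary choices) sums the 1/T-shifted copies of each |ψ_F(f)|^2 over one period. The rectangle's indicator is taken half-open at the right edge so that boundary points are not counted twice on the discrete grid.

```python
import numpy as np

T = 1.0
f = np.linspace(-0.5 / T, 0.5 / T, 1001)   # one period of length 1/T suffices

def rect(f):   # (a): T on [-1/(2T), 1/(2T)), half-open at the right edge
    return T * ((f >= -1 / (2 * T)) & (f < 1 / (2 * T)))

def cos2(f):   # (b): T cos^2(pi f T / 2) on [-1/T, 1/T]
    return T * np.cos(np.pi * f * T / 2) ** 2 * (np.abs(f) <= 1 / T)

def tri(f):    # (c): T (1 - T |f|) on [-1/T, 1/T]
    return T * (1 - T * np.abs(f)) * (np.abs(f) <= 1 / T)

for g in (rect, cos2, tri):
    s = sum(g(f + k / T) for k in range(-3, 4))  # few shifts cover the support
    print(g.__name__, np.max(np.abs(s - T)) < 1e-9)
```

For (b), the check boils down to cos^2 + sin^2 = 1; for (c), two shifted triangles always add up to T.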
Two comments are in order:

(a) A function |ψ_F(f)|^2 cannot fulfill the Nyquist criterion for T if its support is contained
in an interval of the form [-W/2, W/2] with 0 < W < 1/T. If this is not clear, draw
a picture of the left-hand side of (5.13) using |ψ_F(f)|^2 = 1_{[-W/2, W/2]}(f).

(b) If |ψ_F(f)|^2 vanishes outside [-1/T, 1/T], the Nyquist criterion is satisfied if and only if
|ψ_F(f - 1/(2T))|^2 + |ψ_F(f + 1/(2T))|^2 = T for f ∈ [-1/(2T), 1/(2T)] (see the picture below). If, in
addition, ψ(t) is real-valued, which is typically the case, then |ψ_F(-f)|^2 = |ψ_F(f)|^2.
In this case, it is sufficient that we check the positive frequencies, i.e., the Nyquist
criterion is met if

    |ψ_F(1/(2T) - ε)|^2 + |ψ_F(1/(2T) + ε)|^2 = T,   ε ∈ [0, 1/(2T)].

This means that |ψ_F(1/(2T))|^2 = T/2, and the amount by which |ψ_F(f)|^2 increases when
we go from f = 1/(2T) to f = 1/(2T) - ε equals the decrease when we go from f = 1/(2T) to
f = 1/(2T) + ε. For examples of such band-edge symmetry, see Figure 5.3, Figure 5.4
(top), and the three functions in Example 80.
Figure 5.3: Band-edge symmetry for a pulse |ψ_F(f)|^2 that vanishes outside
[-1/T, 1/T] and fulfills the Nyquist criterion. (The copies |ψ_F(f)|^2 and |ψ_F(f - 1/T)|^2
cross at f = 1/(2T), where |ψ_F(1/(2T) - ε)|^2 + |ψ_F(1/(2T) + ε)|^2 = T.)
Example 81. For every β ∈ (0, 1) and every T > 0, the raised-cosine function

    |ψ_F(f)|^2 = { T,                                           |f| ≤ (1-β)/(2T)
                 { (T/2) [1 + cos((πT/β)(|f| - (1-β)/(2T)))],   (1-β)/(2T) < |f| < (1+β)/(2T)
                 { 0,                                           otherwise

fulfills the Nyquist criterion (see Figure 5.4 for a raised-cosine pulse with β = 1/2). Using the
relationship (1/2)(1 + cos α) = cos^2(α/2), we can take the square root of the above and obtain
the root raised-cosine function (also called square-root raised-cosine function)

    ψ_F(f) = { √T,                                      |f| ≤ (1-β)/(2T)
             { √T cos((πT/(2β))(|f| - (1-β)/(2T))),     (1-β)/(2T) < |f| ≤ (1+β)/(2T)
             { 0,                                       otherwise.

The inverse Fourier transform of ψ_F(f), derived in Appendix 5.E, is

    ψ(t) = (4β/(π√T)) [cos((1+β)πt/T) + ((1-β)π/(4β)) sinc((1-β)t/T)] / [1 - (4βt/T)^2],

where sinc(x) = sin(πx)/(πx). Notice that when t = ±T/(4β), both the numerator and the denom-
inator of the above expression become 0. We can show that the limit as t approaches
±T/(4β) is (β/√(2T)) [(1 + 2/π) sin(π/(4β)) + (1 - 2/π) cos(π/(4β))]. The pulse ψ(t) is plotted in Figure 5.4 (bottom)
for β = 1/2. The rectangular function and the squared-cosine function of Example 80
are raised-cosine functions with β = 0 and β = 1, respectively.
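The root raised-cosine pulse and its orthonormality property can be checked numerically. The sketch below (assuming Python with NumPy; β = 1/2, T = 1, and the integration grid are arbitrary choices) implements ψ(t) with the removable singularities at |t| = T/(4β) filled in by the limit stated above, then verifies ⟨ψ(t), ψ(t - jT)⟩ ≈ δ_j by numerical integration over a truncated window.

```python
import numpy as np

def rrc(t, T=1.0, beta=0.5):
    """Root raised-cosine pulse psi(t) of Example 81 (vectorized)."""
    t = np.asarray(t, dtype=float)
    out = np.empty_like(t)
    sing = np.isclose(np.abs(t), T / (4 * beta))   # removable singularities
    tr = t[~sing]
    num = (np.cos((1 + beta) * np.pi * tr / T)
           + (1 - beta) * np.pi / (4 * beta) * np.sinc((1 - beta) * tr / T))
    den = 1 - (4 * beta * tr / T) ** 2
    out[~sing] = 4 * beta / (np.pi * np.sqrt(T)) * num / den
    out[sing] = (beta / np.sqrt(2 * T)) * (        # limit at t = +-T/(4 beta)
        (1 + 2 / np.pi) * np.sin(np.pi / (4 * beta))
        + (1 - 2 / np.pi) * np.cos(np.pi / (4 * beta)))
    return out

T, dt = 1.0, 0.001
t = np.arange(-30.0, 30.0, dt)
psi = rrc(t, T)
for j in range(3):
    ip = np.sum(psi * rrc(t - j * T, T)) * dt      # <psi(t), psi(t - jT)>
    print(j, round(ip, 3))                          # close to 1, 0, 0
```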
Figure 5.4: Raised-cosine function |ψ_F(f)|^2 with β = 1/2 (top) and
corresponding pulse ψ(t) (bottom).
5.5 Symbol Synchronization
Consider the following network scenario. There are several transmitters and several re-
ceivers. Any transmitter can send to any receiver. We assume that information is com-
municated via symbol-by-symbol on a pulse train, with a common symbol alphabet and
a common pulse. Let T be the symbol time. Equipment is occasionally switched off
and on. When a receiver is switched off, all the electronics turn off, including the clock.
When the receiver is turned on, the clock starts at some default initial time. We assume
that there is no mechanism to align clocks. Suppose that after a silent period, a receiver
detects a signal at the matched filter output. The receiver can assume that the signal
comes from one of the transmitters, but it does not know the identity of the sender until
it decodes the symbols and reads the header. Before it can decode, the receiver needs to
determine when to sample the matched filter output. Specifically, it needs to estimate
the θ for which θ + lT, l integer, are the desired sampling times.

Estimating θ is a parameter estimation problem. Parameter estimation, like detection, is
a well-established field. In a typical detection problem, there is a clear objective function,
namely the error probability to be minimized, and it is minimized by a MAP decision. On
the contrary, in estimation problems it is often not obvious what the objective function
should be. Maximum likelihood (ML) estimation is an approach that does not require an
objective function, and it is often the approach of choice. We discuss the ML approach
in Section 5.5.1, and in Section 5.5.2 we discuss a more pragmatic approach, the
delay locked loop (DLL). In both cases, we assume that the sender starts with a training
sequence used by the receiver to estimate θ.
5.5.1 Maximum Likelihood Approach
Suppose that the transmission starts with a training signal s(t) known to the receiver.
For the moment, we assume that s(t) is real-valued. The extension to a complex-valued s(t)
is done in Section 7.6.

The channel output signal is

    R(t) = α s(t - θ) + N(t),

where N(t) is white Gaussian noise of power spectral density N_0/2, α is an unknown
scaling factor (channel attenuation and receiver front-end amplification), and θ is the
unknown parameter to be estimated. We can assume that the receiver knows that θ is
in some interval, say [0, θ_max], for some possibly large constant θ_max.
To describe the ML estimate of θ, we need a statistical description of the received signal
as a function of θ and α. Towards this goal, suppose that we have an orthonormal basis
φ_1(t), φ_2(t), ..., φ_n(t) that spans the set {s(t - θ̃) : θ̃ ∈ [0, θ_max]}. For instance, if s(t) is
continuous, has finite duration, and is essentially band-limited, then the sampling theorem
tells us that we can use sinc functions for such a basis.

For i = 1, ..., n, let Y_i = ⟨R(t), φ_i(t)⟩ and let y_i be the observed sample value of Y_i.
The random vector Y = (Y_1, ..., Y_n)^T consists of independent random variables with
Y_i ∼ N(α m_i(θ), σ^2), where m_i(θ) = ⟨s(t - θ), φ_i(t)⟩ and σ^2 = N_0/2. Hence, the density of
Y parameterized by θ and α is

    f(y; θ, α) = (2πσ^2)^{-n/2} exp( -∑_{i=1}^{n} (y_i - α m_i(θ))^2 / (2σ^2) ).    (5.14)
A maximum likelihood (ML) estimate θ̂_ML is a value of θ̃ that maximizes the likelihood
function f(y; θ̃, α). This is the same as maximizing ∑_{i=1}^{n} (α y_i m_i(θ̃) - α^2 m_i^2(θ̃)/2), obtained
by taking the natural log of (5.14), removing the terms that do not depend on θ̃, and
multiplying by σ^2.
Because the collection φ_1(t), φ_2(t), ..., φ_n(t) spans the set {s(t - θ̃) : θ̃ ∈ [0, θ_max]}, addi-
tional projections of R(t) onto normalized functions that are orthogonal to φ_1(t), ..., φ_n(t)
lead to zero-mean Gaussian random variables of variance N_0/2. Their inclusion in the
likelihood function has no effect on the maximizer θ̂_ML. Furthermore, by assumption,
∑_{i=1}^{n} m_i^2(θ̃) equals ∫ s^2(t - θ̃) dt = ‖s‖^2, which does not depend on θ̃. Finally, R(t) can be
written as ∑_i Y_i φ_i(t) + N⊥(t), where ⟨N⊥(t), s(t - θ̃)⟩ = 0 for all θ̃ ∈ [0, θ_max]. This implies
that ∫ R(t) s(t - θ̃) dt equals ∑_i Y_i m_i(θ̃). Putting everything together, we obtain that the
maximum likelihood estimate θ̂_ML is the θ̃ that maximizes

    ∫ r(t) s(t - θ̃) dt,    (5.15)

where r(t) is the observed sample path of R(t).
This result is intuitive. If we write r(t) = α s(t - θ) + n(t), where n(t) is the sample path
of N(t), we see that we are maximizing α R_s(θ̃ - θ) + ∫ n(t) s(t - θ̃) dt, where R_s(·) is the
self-similarity function of s(t) and ∫ n(t) s(t - θ̃) dt is a sample of a zero-mean Gaussian
random variable of variance that does not depend on θ̃. The self-similarity function of
a real-valued function achieves its maximum at the origin; hence α R_s(θ̃ - θ) achieves its
maximum at θ̃ = θ.
Notice that we have introduced y_1, ..., y_n for an intermediate step, but the final result
(5.15) does not depend on them.
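In discrete time, (5.15) amounts to sliding the known training signal over the received waveform and picking the delay with the largest correlation. The sketch below (assuming Python with NumPy; the rectangular pulse, training length, noise level, and sample-spaced candidate grid are all arbitrary choices) illustrates this.

```python
import numpy as np

rng = np.random.default_rng(2)

spT = 8                                  # samples per symbol time T
L = 64                                   # number of training symbols
c = rng.choice([-1.0, 1.0], size=L)      # training symbols known to the receiver
pulse = np.ones(spT) / np.sqrt(spT)      # rectangular pulse of unit energy
s = np.kron(c, pulse)                    # sampled training signal s(t)

true_delay = 37                          # theta, unknown to the receiver
alpha = 0.7                              # unknown scaling factor
r = np.zeros(len(s) + 200)
r[true_delay:true_delay + len(s)] += alpha * s
r += 0.1 * rng.standard_normal(len(r))   # white Gaussian noise

# ML estimate (5.15): argmax over candidate delays of int r(t) s(t - theta~) dt
corr = [np.dot(r[d:d + len(s)], s) for d in range(200)]
theta_hat = int(np.argmax(corr))
print(theta_hat)                         # a delay close to true_delay
```

Note that the estimate does not require knowing α: the scaling factor multiplies every candidate correlation by the same positive constant and so does not move the argmax.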
The ML approach to parameter estimation is a well-established method, but it is not the
only one. If we model θ as the sample of a random variable and we have a cost function
c(θ, θ̃) that quantifies the cost of deciding that the parameter is θ̃ when it is θ, then we
can aim for the estimate that minimizes the expected cost. This is the Bayesian approach
to parameter estimation. Another approach is least-squares estimation (LSE). It seeks
the θ̃ for which ∑_i (y_i - m_i(θ̃))^2 is minimized. In words, the LSE approach chooses
the parameter that provides the most accurate description of the measurements, where
accuracy is measured in terms of squared distance. This is a different objective than that
of the ML approach, which chooses the parameter for which the measurements are the
most likely. When the noise is additive and Gaussian, as in our case, the ML approach
and the LSE approach lead to the same estimate. This is due to the fact that the likelihood
function f(y; θ̃, α) depends on the squared distance ∑_i (y_i - α m_i(θ̃))^2.
5.5.2 Delay Locked Loop Approach

The shape of the training signal did not matter for the derivation of the ML estimate, but it does matter for the delay locked loop approach. We assume that the training signal takes the same form as the communication signal, namely

$$s(t) = \sum_{l=0}^{L-1} c_l\, \psi(t - lT),$$

where $c_0, \ldots, c_{L-1}$ are training symbols known to the receiver. The easiest way to see how the delay locked loop works is to assume that $\psi(t)$ is the rectangular pulse

$$\psi(t) = \begin{cases} \frac{1}{\sqrt{T}}, & 0 \le t \le T \\ 0, & \text{otherwise,} \end{cases}$$

and let the training symbol sequence $c_0, \ldots, c_{L-1}$ be the sign-alternating sequence $c_s, -c_s, c_s, \ldots, -c_s$ for some positive constant $c_s$. The corresponding received signal is

$$R(t) = \beta \sum_{l=0}^{L-1} c_l\, \psi(t - lT - \theta)$$
plus white Gaussian noise, where $\beta$ is the unknown scaling factor. If we neglect the noise for the moment, the matched filter output (before sampling) is the convolution of $R(t)$ with $\psi(-t)$, which can be written as

$$M(t) = \beta \sum_{l=0}^{L-1} c_l\, R_{\psi}(t - lT - \theta),$$

where

$$R_{\psi}(\tau) = \Big(1 - \frac{|\tau|}{T}\Big)\, \mathbb{1}\{-T \le \tau \le T\}$$

is the self-similarity function of $\psi(t)$. Figure 5.5 plots a piece of $M(t)$.
Figure 5.5: Shape of $M(t)$.
Figure 5.6: DLL sampling points. The three consecutive dots of each subfigure (a)-(d) are examples of $M(t_k^E)$, $M(t_k)$, and $M(t_k^L)$, respectively.
The desired sampling times of the form $\theta + lT$, $l$ integer, correspond to the maxima and minima of $M(t)$. Let $t_k$ be the $k$-th sampling point. Until symbol synchronization is achieved, $M(t_k)$ is not necessarily near a maximum or a minimum of $M(t)$. For every sample point $t_k$, we collect also an early sample at $t_k^E = t_k - \delta$ and a late sample at $t_k^L = t_k + \delta$, where $\delta$ is some small positive value (smaller than $T/2$). The dots in Figure 5.6 are examples of sample values. Consider the cases when $M(t_k)$ is positive (subfigures (a) and (b)). We see that $M(t_k^L) - M(t_k^E)$ is negative when $t_k$ is late with respect to the target, and it is positive when $t_k$ is early. The opposite is true when $M(t_k)$ is negative (subfigures (c) and (d)). Hence, in general,

$$\big[ M(t_k^L) - M(t_k^E) \big]\, M(t_k)$$

can be used as a feedback signal to the clock that determines the sampling times. A positive feedback is a sign for the clock to speed up and a negative value is a sign to slow down. This can be implemented via a voltage-controlled oscillator (VCO), with the feedback signal as the controlling voltage.
Now consider the effect of noise. The noise added to $M(t)$ is zero-mean. Intuitively, if the VCO does not react too quickly to the feedback signal, or equivalently if the feedback signal is lowpass filtered, then we expect the sampling point to settle at the correct position even when the feedback signal is noisy. A rigorous analysis is possible but outside the scope of this text.
Notice the similarity between the ML and the DLL solutions. Ultimately they both make use of the fact that a self-similarity function achieves its maximum at the origin. It is also useful to think of a correlation such as

$$\frac{\int r(t)\,s(t-\hat\theta)\,dt}{\|r(t)\|\,\|s(t)\|}$$

as a measure of the degree of similarity between two functions, where the denominator serves to make the result invariant to a scaling of either function. In this case the two functions are $r(t)$ and $s(t-\hat\theta)$, and the maximum is achieved when $s(t-\hat\theta)$ lines up with the $s(t-\theta)$ contained in $r(t)$. The solution to the ML approach correlates with the entire training signal $s(t)$, whereas the DLL correlates with the pulse $\psi(t)$; but it does so repeatedly and averages the results. The DLL is designed to work with a VCO, and together they provide a complete and easy-to-implement solution that tracks the sampling times. It is easy to see that the DLL provides valuable feedback even after the transition from the training symbols to the regular symbols, provided that the symbols change polarity sufficiently often. To implement the ML approach, we still need a good way to find the maximum of (5.15) and to offset the clock accordingly. This can easily be done if the receiver is implemented on a digital signal processor (DSP), but could be costly in terms of additional hardware if the receiver is implemented with analog technology.
5.6 Summary
For the transmitter, the communication problem consists in signaling to the recipient the message chosen by the sender. The message is one out of $m$ possible messages. For each message there is a unique signal used as messenger to communicate the message across the channel.

In principle we can start our design by choosing $m$ finite-energy signals $w_0(t), \ldots, w_{m-1}(t)$. These signals span some subspace $\mathcal{V}$ of the vector space $L^2$. To avoid anomalies due to the fact that $L^2$ contains functions that are unsuitable to model physically realizable signals, we assume that the only zero-energy signal of $\mathcal{V}$ is the zero signal (which must be in $\mathcal{V}$ by the axioms of a vector space). This is enough to guarantee that $\mathcal{V}$ is a finite-dimensional inner product space with respect to the inner product (5.17). Every finite-dimensional inner product space has an orthonormal basis that can be found, for instance, via Gram-Schmidt. Let $\psi_1(t), \ldots, \psi_n(t)$ be the chosen basis for $\mathcal{V}$ and let $c_0, \ldots, c_{m-1}$ in $\mathbb{R}^n$ or $\mathbb{C}^n$ be the codewords such that

$$w_i(t) = \sum_{j=1}^{n} c_{i,j}\,\psi_j(t), \qquad i = 0, \ldots, m-1. \qquad (5.16)$$
This implies that the decomposition of a sender into an encoder and a waveform former, as shown in Figure 3.2, is completely general. The orthonormal basis specifies the waveform former and the codebook specifies the encoder.

The common practice is to follow the opposite path, i.e., to choose the codebook and, independently, the orthonormal basis, and to let the signals be the result of these choices. Quite popular is the choice $\psi_j(t) = \psi(t - jT)$, obtained from a pulse $\psi(t)$ such that $|\psi_{\mathcal F}(f)|^2$ fulfills Nyquist's criterion. Often people opt for raised-cosine functions. To experiment with them, we have created a MATLAB file, available upon request. Approaching the waveform design this way allows us to have good control over the power spectral density of the transmitter output signal. It is also a good choice for the complexity of the $n$-tuple former.

When $\psi_j(t) = \psi(t - jT)$, the cascade of the waveform former, the channel, and the $n$-tuple former is equivalent to one use every $T$ seconds of the discrete-time AWGN channel depicted in Figure 5.7.

Questions that pertain to the encoder, such as bit rate [bits/sec], average energy per symbol, and error probability, can be answered based on the discrete-time channel. Questions related to the signal's time/frequency characteristics are the domain of the waveform former. We see that the encoder and the waveform former can essentially be designed independently.
Figure 5.7: Equivalent discrete-time channel: $c_j \longrightarrow Y_j = c_j + Z_j$, where $Z_j \sim \mathcal{N}\big(0, \frac{N_0}{2}\big)$.
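For instance, the error probability of antipodal signaling (a 2-PAM constellation) can be estimated directly on this discrete-time channel. A minimal sketch, with arbitrarily chosen values of the symbol energy E and noise parameter N0, compares a Monte-Carlo estimate to the Q-function value:

```python
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(1)
E, N0 = 4.0, 2.0                  # symbol energy and noise parameter (assumed)
n = 200_000                       # number of channel uses

c = rng.choice([-sqrt(E), sqrt(E)], size=n)      # antipodal symbols c_j
Y = c + rng.normal(0.0, sqrt(N0 / 2), size=n)    # Y_j = c_j + Z_j, Z ~ N(0, N0/2)

p_sim = np.count_nonzero(np.sign(Y) != np.sign(c)) / n
p_theory = 0.5 * erfc(sqrt(E / N0))              # Q(sqrt(2E/N0)) in erfc form
```

The simulated error rate agrees with the theoretical value Q(sqrt(2E/N0)), confirming that such questions can be settled entirely on the discrete-time model.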
Appendix 5.A $L^1$, $L^2$, and Lebesgue Integral: A Primer
In the sampling theorem (Theorem 72) we have introduced the notion of an $L^2$ function. A function $g: \mathbb{R} \to \mathbb{C}$ belongs to $L^2$ if it is Lebesgue measurable and has a finite Lebesgue integral $\int_{-\infty}^{\infty} |g(t)|^2\, dt$. Lebesgue measure, Lebesgue integral, and $L^2$ are technical terms, but the main ideas are quite natural. In this appendix, our goal is to introduce them informally. The hurried reader can skip this appendix and associate $L^2$ functions with finite-energy functions and the Lebesgue integral with the integral.
We can associate the integral of a non-negative function $g: \mathbb{R} \to \mathbb{R}$ with the area between the abscissa and the function's graph. We attempt to determine this area via a sequence of approximations that we hope converges to the desired result. The Riemann integral, the one we learn in high school and the only one used in a typical undergraduate engineering curriculum, partitions the abscissa into small intervals of some fixed width and makes vertical slices. The area of each slice is approximated by the area of a rectangle. For the height of the rectangle, we can choose any value taken by the function inside the interval. Hence the rectangle's area might be an underestimation or an overestimation of the slice's area. This is shown in Figure 5.8(a). By adding up the areas of the rectangles, we obtain an approximation of the integral. As the intervals' width becomes smaller, the sequence of approximations may or may not converge. If it converges to the same number for every such construction, then the Riemann integral of that function is well defined and equal to the limit. The Lebesgue integral is defined similarly but uses a different technique to approximate. It partitions the ordinate and makes horizontal slices. The area of a slice is approximated by the area of a rectangle that fits under the graph, hence it underestimates the slice's area (Figure 5.8(b)). As we take the limit by making the intervals smaller and smaller, we obtain a non-decreasing sequence of area approximations. The limit of a non-decreasing sequence always exists, and this is the value assigned to the integral. Every Riemann-integrable function is Lebesgue-integrable, and the values of the two integrals agree whenever they are both defined.
(a) Riemann integration (b) Lebesgue integration
Figure 5.8: Integration
Example 82. The Dirichlet function is 0 where its argument is irrational and 1 otherwise. Its Lebesgue integral is well defined and equal to 0. But to see why, we need to introduce the notion of measure. We will do so shortly. The Riemann integral of the Dirichlet function is undefined. The problem is that in each interval of the abscissa, no matter how small, we can find both rational and irrational numbers. So to each approximating rectangle, we can assign height 0 or 1. If we assign height 0 to all approximating
rectangles, the integral is approximated by 0; if we assign height 1 to all rectangles, the integral is approximated by the length of the integration interval. And this is so no matter how narrow we make the vertical slices. Clearly we can create approximation sequences that do not converge. This could not happen with a continuous function or a function that has a finite number of discontinuities, but the Dirichlet function is discontinuous everywhere.
The way we have informally described the Lebesgue integral assumes that the "area" of each horizontal slice is well defined. We write "area" in quotation marks because, to integrate the Dirichlet function, we need a generalized notion of area, called measure. In approximating the Lebesgue integral of the Dirichlet function, we make horizontal slices that are rather bizarre subsets of a rectangle. A slice is a set of the form

$$\{(x, y) : y \text{ is in the vertical interval defined by the slice},\; x \text{ is rational}\}.$$

Since there are only countably many rational numbers, the measure ("area") of such a set is 0.
It turns out that if we try to assign a measure to every set, including the most bizarre ones, we can construct contradictions of the kind that the measure of the union of disjoint sets is not the sum of the individual measures (see e.g. the example at the end of Appendix 4.9 of [6]). The sets to which we can assign a measure are called measurable. A function is measurable if all the horizontal slices that come up in the definition of the Lebesgue integral are measurable sets. Only such functions are Lebesgue integrable.
Lebesgue's approach to integration was summarized in a letter to Paul Montel [Wikipedia, July 27, 2012, which in turn credits Siegmund-Schultze 2008]. Lebesgue wrote:

I have to pay a certain sum, which I have collected in my pocket. I take the bills and coins out of my pocket and give them to the creditor in the order I find them until I have reached the total sum. This is the Riemann integral. But I can proceed differently. After I have taken all the money out of my pocket I order the bills and coins according to identical values and then I pay the several heaps one after the other to the creditor. This is my integral.
We would not mention the Lebesgue integral if it were just a matter of integrating bizarre functions. The real power of Lebesgue integration theory comes from a number of theorems that give precise conditions under which a limit and an integral can be exchanged. Such operations come up frequently in the study of Fourier transforms and Fourier series. The reader should not be alarmed at this point. We will interchange integrals, swap integrals and sums, and swap limits and integrals without fuss, but we do this because we know that Lebesgue integration theory allows us to do so in the cases of interest to us.
In introducing the Lebesgue integral we have assumed that the function being integrated
is non-negative. The same idea applies to non-positive functions. (In fact we can integrate
the negative of the function and then take the negative of the result.) If the function takes
on positive as well as negative values, we split the function into the sum of two functions, one that is non-negative and one that is non-positive, we compute the respective integrals, and we sum the results. This works as long as the two intermediate results are not $+\infty$ and $-\infty$, respectively. If they are, the Lebesgue integral is undefined. Otherwise the Lebesgue integral is defined.
A complex-valued function $g: \mathbb{R} \to \mathbb{C}$ is integrated by separately integrating its real and imaginary parts. If the Lebesgue integral of both parts is defined and finite, the function is said to be Lebesgue integrable. The set of Lebesgue integrable functions is denoted by $L^1$. The notation comes from the easy-to-verify fact that for a Lebesgue measurable function $g$, the Lebesgue integral is finite if and only if $\int_{-\infty}^{\infty} |g(x)|\,dx < \infty$.

All functions that model physical processes are finite-energy and measurable, hence $L^2$ functions. All finite-energy functions that we encounter in this text are measurable, hence $L^2$ functions. There are examples of finite-energy functions that are not measurable, but it is hard to imagine an engineering problem where such a function arises.
The set of $L^2$ functions forms a complex vector space with the zero vector being the all-zero function. If we modify an $L^2$ function in a countable number of points, the result is also an $L^2$ function and the (Lebesgue) integral of the two functions is the same. (More generally, the same is true if we modify the function over a set of measure zero.) The difference between the two functions is an $L^2$ function $\xi(t)$ such that the Lebesgue integral $\int |\xi(t)|^2\, dt = 0$. Two $L^2$ functions with this property are said to be $L^2$ equivalent.
Unfortunately the function that maps $a(t), b(t) \in L^2$ to

$$\langle a, b \rangle = \int a(t)\, b^*(t)\, dt \qquad (5.17)$$

is not an inner product over $L^2$, because one axiom is not fulfilled. In fact, for a general $a(t) \in L^2$, $\langle a, a \rangle = 0$ implies that $a(t)$ is $L^2$ equivalent to the zero function, but this is not enough: to satisfy the axiom, $a(t)$ must be the zero function.
There are two obvious ways around this problem if we want to treat finite-energy functions as vectors. One way is to consider only subspaces $\mathcal{V}$ of $L^2$ such that there is only one vector $\xi(t)$ in $\mathcal{V}$ that has the property $\int |\xi(t)|^2\, dt = 0$. We made this assumption in Chapter 3. See also the next example. Another way is to form equivalence classes, as outlined in the second example below.
Example 83. The set $\mathcal{V}$ that consists of the continuous functions of $L^2$ is a vector space. Continuity is sufficient to ensure that there is only one function $\xi(t) \in \mathcal{V}$ for which the integral $\int |\xi(t)|^2\, dt = 0$, namely the zero function. Hence $\mathcal{V}$ equipped with the $\langle \cdot, \cdot \rangle$ of (5.17) forms an inner product space.
Example 84. For engineering applications there is no need to distinguish among signals that are $L^2$ equivalent. As already mentioned, such signals cannot be distinguished by means of a physical experiment. Hence the idea of partitioning $L^2$ into equivalence classes,
with the property that two functions are in the same equivalence class if and only if they are $L^2$ equivalent. With an appropriately defined vector addition and multiplication of a vector by a scalar, the set of equivalence classes forms a complex vector space. We can use (5.17) to define an inner product over this vector space: the inner product between one equivalence class and another is the result of applying (5.17) to any element of the first class and any element of the second class. The result does not depend on which elements of the classes we choose to perform the calculation. This way $L^2$ can be transformed into an inner product space, denoted by $\mathcal{L}^2$. As a sanity check, say that we want to compute the inner product of a vector with itself. Let $a(t)$ be an arbitrary element of the corresponding class. If $\langle a, a \rangle = 0$, then $a(t)$ is in the equivalence class that contains all the functions that have 0 norm. This class is the zero vector.
For a thorough treatment of measurability, the Lebesgue integral, and $L^2$ functions, we refer to a real-analysis book (see e.g. [9]). For an excellent introduction to the Lebesgue integral in a communication engineering context, we recommend the graduate-level text [6, Section 4.3]. Even more details can be found in [10]. A very nice summary of Lebesgue integration can be found on Wikipedia (July 27, 2012).
Appendix 5.B Fourier Transform: a Review
In this appendix, we review a few facts about the Fourier transform. They belong to two categories. One category consists of facts that have an operational value. You should know them, as they help us in routine manipulations. The other category consists of mathematical subtleties that you should be aware of, even though they do not make a difference in engineering applications. We mention them for awareness, and because in certain cases they affect the language we use in formulating theorems. For an in-depth discussion of this second category we recommend [6, Section 4.5] and [10].
The following two integrals relate a function $g: \mathbb{R} \to \mathbb{C}$ to its Fourier transform $g_{\mathcal F}: \mathbb{R} \to \mathbb{C}$:

$$g(u) = \int_{-\infty}^{\infty} g_{\mathcal F}(\alpha)\, e^{j2\pi \alpha u}\, d\alpha \qquad (5.18)$$

$$g_{\mathcal F}(v) = \int_{-\infty}^{\infty} g(\alpha)\, e^{-j2\pi \alpha v}\, d\alpha. \qquad (5.19)$$
Some books use capital letters for the Fourier transform, such as $G(f)$ for the Fourier transform of $g(t)$. Other books use a hat, as in $\hat g(f)$. We use the subscript $\mathcal F$ because, in this text, capital letters denote random objects (random variables and random processes) and hats are used for estimates in detection-related problems.
Typically we take the Fourier transform of a time-domain function and the inverse Fourier transform of a frequency-domain function. However, from a mathematical point of view it does not matter whether or not there is a physical meaning associated with the variable of the function being transformed. To underline this fact, we use the unbiased dummy variable $\alpha$ in (5.18) and (5.19).
Sometimes the Fourier transform is defined with $e^{-j\omega t}$ (where $\omega = 2\pi f$) in place of the exponential in (5.19); the inverse then uses $e^{j\omega t}$ but also inherits the factor $\frac{1}{2\pi}$, which breaks the nice symmetry between (5.18) and (5.19). Most books in communication define the Fourier transform as we do, because it makes the formulas easier to remember. Lapidoth [10, Section 6.2.1] gives five good reasons for defining the Fourier transform as we do.
When we say that the Fourier transform of a function exists, we mean that the integral (5.19) is defined. It does not imply that the inverse Fourier transform exists. The Fourier integral (5.19) exists and is finite for every $L^2$ function defined over a finite-length interval. The technical reason for this is that an $L^2$ function $f(t)$ defined over a finite-length interval is also an $L^1$ function, i.e., $\int |f(t)|\, dt$ is finite, which excludes the problem mentioned in Appendix 5.A. Then also $\int |f(t)\, e^{-j2\pi \alpha t}|\, dt = \int |f(t)|\, dt$ is finite. Hence (5.19) is finite.
If the $L^2$ function is defined over $\mathbb{R}$, then the problem mentioned in Appendix 5.A can arise. This is the case with the $\mathrm{sinc}(t)$ function. In such cases, we truncate the function to the interval $[-T, T]$, compute its Fourier transform, and let $T$ go to infinity. It is in this sense that the Fourier transform of $\mathrm{sinc}(t)$ is defined. An important result of Fourier analysis says that we are allowed to do so (see Plancherel's theorem, [6, Section 4.5.2]). Fortunately we rarely have to do this, because the Fourier transforms of most functions of interest to us are tabulated.
Thanks to Plancherel's theorem, we can make the sweeping statement that the Fourier transform is defined for all $L^2$ functions. The transformed function is in $L^2$, hence its inverse is also defined. Be aware, though: when we compute the transform and then the inverse of the transform, we do not necessarily obtain the original function. However, what we obtain is $L^2$ equivalent to the original. As already mentioned, no physical experiment will ever be able to detect the difference between the original and the modified function.
Example 85. The function $g(t)$ that has value 1 at $t = 0$ and 0 everywhere else is an $L^2$ function. Its Fourier transform $g_{\mathcal F}(f)$ is 0 everywhere. The inverse Fourier transform of $g_{\mathcal F}(f)$ is also 0 everywhere.
One way to remember whether (5.18) or (5.19) has the minus sign in the exponent is to think of the Fourier transform as a tool that allows us to write a function $g(u)$ as a linear combination of complex exponentials. Hence we are writing $g(u) = \int g_{\mathcal F}(\alpha)\, \phi(\alpha, u)\, d\alpha$ with $\phi(\alpha, u) = \exp(j2\pi \alpha u)$. Technically this is not an orthonormal expansion, but it looks like one, where $g_{\mathcal F}(\alpha)$ is the coefficient of the function $\phi(\alpha, u)$ (seen as a function of $u$ with $\alpha$ as a parameter). As for an orthonormal expansion, the coefficient is obtained from an expression of the form $g_{\mathcal F}(\alpha) = \langle g(\cdot), \phi(\alpha, \cdot) \rangle = \int g(u)\, \phi^*(\alpha, u)\, du$. It is the complex conjugate in the computation of the inner product that brings in the minus sign in the exponent. We emphasize that we are working by analogy here: the complex exponential has infinite energy, hence it is not a unit-norm function (at least not with respect
to the standard inner product).
Rectangular pulses and their Fourier transforms often show up in examples, and we should know how to go back and forth between them. Two tricks make it easy to relate the rectangle and the sinc as Fourier transform pairs. First, let us recall how the sinc is defined: $\mathrm{sinc}(x) = \frac{\sin(\pi x)}{\pi x}$. (The $\pi$ makes $\mathrm{sinc}(x)$ vanish at all integer values of $x$ except for $x = 0$, where it is 1.)

The first trick is well known: the value of a (time-domain) function at (time) 0 equals the area under its Fourier transform. Similarly, the value of a (frequency-domain) function at (frequency) 0 equals the area under its inverse Fourier transform. These properties follow directly from the definition of the Fourier transform, namely

$$g(0) = \int_{-\infty}^{\infty} g_{\mathcal F}(\alpha)\, d\alpha, \qquad g_{\mathcal F}(0) = \int_{-\infty}^{\infty} g(\alpha)\, d\alpha.$$

Everyone knows how to compute the area of a rectangle. But how do we compute the area under a sinc? Here is where the second and not-so-well-known trick comes in handy: the area under a sinc is the area of the triangle inscribed in its main lobe. Hence the integrals under the following two curves are identical and equal to $ab$, and this is true for all positive values $a$, $b$.
Figure 5.9: $\int_{-\infty}^{\infty} b\,\mathrm{sinc}(\frac{x}{a})\, dx$ equals the area under the triangle on the right (height $b$, base $[-a, a]$).
Let us consider a specific example of how to use the above two tricks. It does not matter whether we start from a rectangle or from a sinc, and whether we want to find its Fourier transform or its inverse Fourier transform. Let $a$, $b$, $c$, and $d$ be as shown in Figure 5.10.
Figure 5.10: Rectangle $b\,\mathbb{1}_{[-a,a]}(x)$ (height $b$) and sinc $d\,\mathrm{sinc}(\frac{x}{c})$ (height $d$, first zero at $c$) to be related as Fourier transform pairs.
We want to relate $a, b$ to $c, d$ (or vice versa). Since $b$ must equal the area under the sinc and $d$ the area under the rectangle, we have

$$b = cd, \qquad d = 2ab,$$

which may be solved for $a, b$ or for $c, d$.
Example 86. The Fourier transform of $b\,\mathbb{1}_{[-a,a]}(x)$ is a sinc of height $2ab$. Its first zero crossing must be at $\frac{1}{2a}$ so that the area under the sinc becomes $b$. The result is $2ab\,\mathrm{sinc}(2af)$.
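Example 86 can be checked numerically (this demo is not part of the text): evaluate the transform integral (5.19) of the rectangle by brute force and compare it with $2ab\,\mathrm{sinc}(2af)$. Note that `np.sinc` uses exactly the convention $\mathrm{sinc}(x) = \sin(\pi x)/(\pi x)$ recalled above; the values of $a$, $b$, and the test frequencies are arbitrary:

```python
import numpy as np

a, b = 0.5, 2.0                       # rectangle of height b on [-a, a]
f = np.array([0.0, 0.3, 0.9, 1.7])    # a few test frequencies

# Direct numerical evaluation of the Fourier integral over the support
x = np.linspace(-a, a, 200_001)
dx = x[1] - x[0]
gF_num = np.array([np.sum(b * np.exp(-2j * np.pi * fv * x)) * dx for fv in f])

gF_formula = 2 * a * b * np.sinc(2 * a * f)   # closed form from Example 86
```

The numerically computed transform is real (the rectangle is even) and matches the closed form at every tested frequency.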
Example 87. The Fourier transform of $b\,\mathrm{sinc}(\frac{t}{a})$ is a rectangle of height $ab$. Its width must be $\frac{1}{a}$ so that its area is $b$. The result is $ab\,\mathbb{1}_{[-\frac{1}{2a}, \frac{1}{2a}]}(f)$.
See Exercise 15 for a list of useful Fourier transform relations and see Table 5.1 for the Fourier transform of a few $L^2$ functions.

$$e^{-a|t|} \;\overset{\mathcal F}{\longleftrightarrow}\; \frac{2a}{a^2 + (2\pi f)^2}$$

$$e^{-at},\; t \ge 0 \;\overset{\mathcal F}{\longleftrightarrow}\; \frac{1}{a + j2\pi f}$$

$$e^{-\pi t^2} \;\overset{\mathcal F}{\longleftrightarrow}\; e^{-\pi f^2}$$

$$\mathbb{1}_{[-a,a]}(t) \;\overset{\mathcal F}{\longleftrightarrow}\; 2a\,\mathrm{sinc}(2af)$$

Table 5.1: Fourier transforms (right) for a few $L^2$ functions (left), where $a > 0$.
Appendix 5.C Fourier Series: a Review
We review the Fourier series, focusing on the big picture and on how to remember things.
Let f(x) be a periodic function, x R. It has period p if f(x) = f(x+p) for all x R.
Its fundamental period is the smallest positive such p. We are using the physically
unbiased variable x instead of t , because we want to emphasize that we are dealing
with a general periodic function, not necessarily a function of time.
A function $f(x)$ of fundamental period $p$ can be written as a linear combination of all the complex exponentials of period $p$. Hence

$$f(x) = \sum_{i \in \mathbb{Z}} A_i\, e^{j2\pi \frac{x}{p} i} \qquad (5.20)$$

for some sequence of coefficients $\ldots, A_{-1}, A_0, A_1, \ldots$ with values in $\mathbb{C}$.
Two functions of fundamental period $p$ are identical if and only if they coincide over a period. Hence, to check whether a given sequence of coefficients $\ldots, A_{-1}, A_0, A_1, \ldots$ is the correct one, it suffices to verify that

$$f(x)\,\mathbb{1}_{[-\frac p2, \frac p2]}(x) = \sum_{i \in \mathbb{Z}} \sqrt{p}\,A_i\, \frac{e^{j\frac{2\pi}{p} x i}}{\sqrt{p}}\, \mathbb{1}_{[-\frac p2, \frac p2]}(x),$$

where we have multiplied and divided by $\sqrt{p}$ to make

$$\phi_i(x) = \frac{e^{j\frac{2\pi}{p} x i}}{\sqrt{p}}\, \mathbb{1}_{[-\frac p2, \frac p2]}(x)$$

of unit norm, $i \in \mathbb{Z}$. Thus we can write

$$f(x)\,\mathbb{1}_{[-\frac p2, \frac p2]}(x) = \sum_{i \in \mathbb{Z}} \sqrt{p}\,A_i\,\phi_i(x). \qquad (5.21)$$
The right-hand side of the above expression is an orthonormal expansion. The coefficients of an orthonormal expansion are always computed according to the same expression: the $i$-th coefficient $\sqrt{p}\,A_i$ equals $\langle f, \phi_i \rangle$. Hence,

$$A_i = \frac{1}{p} \int_{-\frac p2}^{\frac p2} f(x)\, e^{-j\frac{2\pi}{p} x i}\, dx. \qquad (5.22)$$
Notice that the right-hand side of (5.20) is periodic, while that of (5.21) has finite support. In fact we can use the Fourier series for both kinds of functions, periodic and finite-support. In communications, we are more interested in functions that have finite support. (Once we have seen one period of a periodic function, we have seen them all: from that point on there is no information being conveyed.)
In terms of mathematical rigor, there are two things that can go wrong in what we have said: (1) for some $i$, the integral in (5.22) might not be defined or might not be finite. We can show that neither is the case if $f(x)\,\mathbb{1}_{[-\frac p2, \frac p2]}(x)$ is an $L^2$ function. (2) For a specific $x$, the truncated series $\sum_{i=-l}^{l} \sqrt{p}\,A_i\,\phi_i(x)$ might not converge as $l$ goes to infinity, or might converge to a value that differs from $f(x)\,\mathbb{1}_{[-\frac p2, \frac p2]}(x)$. It is not hard to show that the norm of the function

$$f(x)\,\mathbb{1}_{[-\frac p2, \frac p2]}(x) - \sum_{i=-l}^{l} \sqrt{p}\,A_i\,\phi_i(x)$$
goes to zero as $l$ goes to infinity. Hence the two functions are $L^2$ equivalent. We write this as follows:

$$f(x)\,\mathbb{1}_{[-\frac p2, \frac p2]}(x) = \operatorname{l.i.m.} \sum_{i \in \mathbb{Z}} \sqrt{p}\,A_i\,\phi_i(x),$$

where l.i.m. means limit in mean-square; it is a shorthand notation for

$$\lim_{l \to \infty} \int_{-\frac p2}^{\frac p2} \Big| f(x) - \sum_{i=-l}^{l} \sqrt{p}\,A_i\,\phi_i(x) \Big|^2 dx = 0.$$
We summarize with the following rigorous statement. The details that we have omitted in our proof are in [6, Section 4.4].

Theorem 88. (Fourier series) Let $g(x): [-\frac p2, \frac p2] \to \mathbb{C}$ be an $L^2$ function. Then for each $k \in \mathbb{Z}$, the (Lebesgue) integral

$$A_k = \frac{1}{p} \int_{-p/2}^{p/2} g(x)\, e^{-j2\pi k \frac{x}{p}}\, dx$$

exists and is finite. Furthermore,

$$g(x) = \operatorname{l.i.m.} \sum_{k} A_k\, e^{j2\pi k \frac{x}{p}}\, \mathbb{1}_{[-p/2, p/2]}(x).$$
Appendix 5.D Proof of the Sampling Theorem
In this appendix, we prove Theorem 72. By assumption, $w_{\mathcal F}(f) = 0$ for $f \notin [-\frac W2, \frac W2]$. Hence we can write $w_{\mathcal F}(f)$ as a Fourier series (Theorem 88):

$$w_{\mathcal F}(f) = \operatorname{l.i.m.} \sum_{k} A_k\, e^{+j\frac{2\pi}{W} f k}\, \mathbb{1}_{[-\frac W2, \frac W2]}(f).$$

$w_{\mathcal F}(f)$ is in $L^2$ (it is the Fourier transform of an $L^2$ function) and vanishes outside $[-\frac W2, \frac W2]$, hence it is $L^1$ (see Exercise 7.11). The inverse Fourier transform of an $L^1$ function is continuous (see Exercise 7.11). Hence it must be identical to $w(t)$, which is also continuous.
Even though $w_{\mathcal F}(f)$ and $\sum_k A_k\, e^{+j\frac{2\pi}{W} f k}\, \mathbb{1}_{[-\frac W2, \frac W2]}(f)$ might not agree at every point, they are $L^2$ equivalent. Hence they have the same inverse Fourier transform which, as argued above, is $w(t)$. Thus

$$w(t) = \sum_{k} A_k\, W\, \mathrm{sinc}(Wt + k) = \sum_{k} \frac{A_k}{T}\, \mathrm{sinc}\Big(\frac{t}{T} + k\Big),$$

where $T = 1/W$.
We still need to determine $\frac{A_k}{T}$. It is straightforward to determine $A_k$ from the definition of the Fourier series, but it is even easier to plug $t = nT$ into both sides of the above expression to obtain $w(nT) = \frac{A_{-n}}{T}$. Substituting back (with $k = -n$) gives $w(t) = \sum_n w(nT)\,\mathrm{sinc}(\frac{t}{T} - n)$, which completes the proof. To see that we can easily obtain $A_k$ from the definition (5.22), we write

$$A_k = \frac{1}{W} \int_{-\frac W2}^{\frac W2} w_{\mathcal F}(f)\, e^{-j\frac{2\pi}{W} k f}\, df = T \int_{-\infty}^{\infty} w_{\mathcal F}(f)\, e^{-j2\pi T k f}\, df = T\, w(-kT),$$

where the first equality is the definition of the Fourier coefficient $A_k$, the second uses the fact that $w_{\mathcal F}(f) = 0$ for $f \notin [-\frac W2, \frac W2]$, and the third is the inverse Fourier transform evaluated at $t = -kT$.
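The interpolation formula this proof leads to can be checked numerically. The sketch below (the bandlimited test function and all parameter values are arbitrary choices) reconstructs a bandlimited signal from its samples via a truncated sinc series:

```python
import numpy as np

a = 0.4                       # w(t) = sinc(a t)^2 is bandlimited to [-a, a]
W, T = 2 * a, 1.0             # bandwidth W = 2a = 0.8, so T = 1 < 1/W is allowed

def w(t):
    return np.sinc(a * t) ** 2

t = np.linspace(-4.0, 4.0, 101)       # points at which to test the reconstruction
n = np.arange(-400, 401)              # truncation of the interpolation series
w_rec = np.array([np.sum(w(n * T) * np.sinc(tv / T - n)) for tv in t])
max_err = float(np.max(np.abs(w_rec - w(t))))
```

The truncated series reproduces the function to within the truncation error, which is tiny here because both the function and the sinc kernel decay.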
Appendix 5.E Square-Root Raised-Cosine Pulse
In this appendix, we derive the inverse Fourier transform of the square-root raised-cosine pulse

$$\psi_{\mathcal F}(f) = \begin{cases} \sqrt{T}, & |f| \le \frac{1-\beta}{2T} \\ \sqrt{T}\, \cos\Big( \frac{\pi T}{2\beta} \big( |f| - \frac{1-\beta}{2T} \big) \Big), & \frac{1-\beta}{2T} < |f| \le \frac{1+\beta}{2T} \\ 0, & \text{otherwise,} \end{cases}$$

where $\beta$ denotes the roll-off factor.
We write $\psi_{\mathcal F}(f) = a_{\mathcal F}(f) + b_{\mathcal F}(f)$, where $a_{\mathcal F}(f) = \sqrt{T}\, \mathbb{1}_{[-\frac{1-\beta}{2T}, \frac{1-\beta}{2T}]}(f)$ is the central piece of the square-root raised-cosine pulse and $b_{\mathcal F}(f)$ accounts for the two square-root raised-cosine edges.

The inverse Fourier transform of $a_{\mathcal F}(f)$ is

$$a(t) = \frac{\sqrt{T}}{\pi t}\, \sin\Big( (1-\beta)\pi \frac{t}{T} \Big).$$
Write $b_{\mathcal F}(f) = b^{-}_{\mathcal F}(f) + b^{+}_{\mathcal F}(f)$, where $b^{+}_{\mathcal F}(f) = b_{\mathcal F}(f)$ for $f \ge 0$ and zero otherwise. Let $c_{\mathcal F}(f) = b^{+}_{\mathcal F}(f + \frac{1}{2T})$. Specifically,

$$c_{\mathcal F}(f) = \sqrt{T}\, \cos\Big( \frac{\pi T}{2\beta} \Big( f + \frac{\beta}{2T} \Big) \Big)\, \mathbb{1}_{[-\frac{\beta}{2T}, \frac{\beta}{2T}]}(f).$$
The inverse Fourier transform $c(t)$ is

$$c(t) = \frac{\beta}{2\sqrt{T}} \Big[ e^{-j\frac{\pi}{4}}\, \mathrm{sinc}\Big( \frac{\beta t}{T} - \frac{1}{4} \Big) + e^{j\frac{\pi}{4}}\, \mathrm{sinc}\Big( \frac{\beta t}{T} + \frac{1}{4} \Big) \Big].$$
Now we use the relationship $b(t) = 2\,\Re\big\{ c(t)\, e^{j2\pi \frac{1}{2T} t} \big\}$ to obtain

$$b(t) = \frac{\beta}{\sqrt{T}} \Big[ \mathrm{sinc}\Big( \frac{\beta t}{T} - \frac{1}{4} \Big) \cos\Big( \frac{\pi t}{T} - \frac{\pi}{4} \Big) + \mathrm{sinc}\Big( \frac{\beta t}{T} + \frac{1}{4} \Big) \cos\Big( \frac{\pi t}{T} + \frac{\pi}{4} \Big) \Big].$$
After some manipulations of $\psi(t) = a(t) + b(t)$ we obtain the desired expression

$$\psi(t) = \frac{4\beta}{\pi\sqrt{T}}\; \frac{\cos\big( (1+\beta)\pi \frac{t}{T} \big) + \frac{(1-\beta)\pi}{4\beta}\, \mathrm{sinc}\big( (1-\beta) \frac{t}{T} \big)}{1 - \big( 4\beta \frac{t}{T} \big)^2}.$$
Appendix 5.F The Picket Fence Miracle
In this appendix, we derive the picket fence miracle, which is a useful tool for informal derivations related to sampling (see Example 89 and Exercises 11 and 13). The derivation is not rigorous in that we hand-wave over convergence issues.
Recall that $\delta(t)$ is the (generalized) function defined through its integral against a function $f(t)$:

$$\int_{-\infty}^{\infty} f(t)\,\delta(t)\,dt = f(0).$$

It follows that $\int_{-\infty}^{\infty} \delta(t)\,dt = 1$, that $f(t)\delta(t) = f(0)\delta(t)$, and that the Fourier transform of $\delta(t)$ is 1 for all $f \in \mathbb{R}$.
The $T$-spaced picket fence is the train of Dirac delta functions

$$\sum_{n=-\infty}^{\infty} \delta(t - nT).$$
The picket fence miracle refers to the fact that the Fourier transform of a picket fence is again a (scaled) picket fence. Specifically,

$$\mathcal{F}\Big[ \sum_{n=-\infty}^{\infty} \delta(t - nT) \Big] = \frac{1}{T} \sum_{n=-\infty}^{\infty} \delta\Big( f - \frac{n}{T} \Big),$$
where $\mathcal{F}[\cdot]$ stands for the Fourier transform of the enclosed expression. The above relationship can be derived by expanding the periodic function $\sum_n \delta(t - nT)$ as a Fourier series, namely:

$$\sum_{n=-\infty}^{\infty} \delta(t - nT) = \frac{1}{T} \sum_{n=-\infty}^{\infty} e^{j2\pi \frac{t}{T} n}.$$

(The careful reader should wonder in which sense the above equality holds. We are indeed being informal.)
Taking the Fourier transform on both sides yields

$$\mathcal{F}\Big[ \sum_{n=-\infty}^{\infty} \delta(t - nT) \Big] = \frac{1}{T} \sum_{n=-\infty}^{\infty} \delta\Big( f - \frac{n}{T} \Big),$$

which is what we wanted to prove.
It is convenient to have a notation for the picket fence. Thus we define²

$$E_T(x) = \sum_{n=-\infty}^{\infty} \delta(x - nT).$$

Using this notation, the relationship that we just proved can be written as

$$\mathcal{F}\big[ E_T(t) \big] = \frac{1}{T}\, E_{\frac{1}{T}}(f).$$
The picket fence miracle is a practical tool in engineering and physics, but in the stated form it is not appropriate for obtaining results that are mathematically rigorous. An example follows.

Example 89. We give an informal proof of the sampling theorem by using the picket fence miracle. Let $s(t)$ be such that $s_{\mathcal F}(f) = 0$ for $f \notin [-\frac W2, \frac W2]$ and let $T < 1/W$. We want to show that $s(t)$ can be reconstructed from the $T$-spaced samples $\{s(lT)\}_{l \in \mathbb{Z}}$. Define

$$s_{|}(t) = \sum_{n=-\infty}^{\infty} s(nT)\,\delta(t - nT).$$

($s_{|}$ is just a name for the expression on the right-hand side of the equality.) Notice that we can also write

$$s_{|}(t) = s(t)\, E_T(t).$$
Taking the Fourier transform on both sides yields

$$\mathcal{F}\big[ s_{|}(t) \big] = s_{\mathcal F}(f) \star \Big[ \frac{1}{T}\, E_{\frac{1}{T}}(f) \Big] = \frac{1}{T} \sum_{n} s_{\mathcal F}\Big( f - \frac{n}{T} \Big).$$

The relationship between $s_{\mathcal F}(f)$ and $\mathcal{F}[s_{|}(t)]$ is depicted in Figure 5.11.
f
a
b
a +
1
T
b +
1
T
1
T

i
s
F
(f
l
T
)
f
a
b
s
F
(f)
Figure 5.11: Fourier transform of a function s(t) (top) and of
s[(t) =

s(nT)(t nT) (bottom).
²The choice of the letter E is suggested by the fact that it looks like a picket fence when rotated 90 degrees.
From the figure, it is obvious that we can reconstruct the original signal s(t) by filtering s|(t) with a filter that scales (1/T)s_F(f) by T and blocks (1/T) Σ_{n≠0} s_F(f − n/T). Such a filter exists if, as in the figure, the support of s_F(f) does not intersect the support of Σ_{n≠0} s_F(f − n/T). This is clearly the case if T < 1/W. If h(t) is the impulse response of such a filter, the filter output y(t) when the input is s|(t) satisfies
  y_F(f) = [(1/T) Σ_{n=−∞}^{∞} s_F(f − n/T)] h_F(f) = s_F(f).
After taking the inverse Fourier transform, we obtain the reconstruction (also called interpolation) formula
  y(t) = [Σ_{n=−∞}^{∞} s(nT)δ(t − nT)] ⋆ h(t) = Σ_{n=−∞}^{∞} s(nT)h(t − nT) = s(t).
A specific filter that has the desired properties is the lowpass filter of frequency response
  h_F(f) = { T, f ∈ [−1/(2T), 1/(2T)]
           { 0, otherwise.
Its impulse response is sinc(t/T). Inserting into the reconstruction formula yields
  s(t) = Σ_{n=−∞}^{∞} s(nT) sinc(t/T − n),    (5.23)
which matches (5.2).
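A quick numerical sanity check of (5.23): take s(t) = sinc(t) (so W = 1) and T = 1/2, and compare a truncated version of the interpolation sum with the true signal at an off-grid point. The truncation is our own; the summand decays like 1/n², so a few thousand terms suffice.

```python
import math

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

W = 1.0          # s_F(f) = 0 outside [-W/2, W/2] for s(t) = sinc(t)
T = 0.5          # T < 1/W, as required
t = 0.25         # an off-grid point

# Truncated version of s(t) = sum_n s(nT) sinc(t/T - n), cf. (5.23)
N = 2000
approx = sum(sinc(n * T) * sinc(t / T - n) for n in range(-N, N + 1))

print(abs(approx - sinc(t)))   # small truncation error
```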
Now we see why the picket fence miracle is not suitable for a rigorous proof of the sampling theorem. In fact, the right-hand side of (5.23) is a continuous function, yet in this derivation of the sampling theorem we did not require that s(t) be continuous. To see that continuity is needed, start with s(t) = W sinc(Wt) for some positive W. Its Fourier transform satisfies s_F(f) = 0 for f ∉ [−W/2, W/2]. For this s(t), equation (5.23) holds with any T that satisfies the sampling theorem, including T = 1/(2W) (which corresponds to a sampling rate twice the minimum required). Now modify s(t) by setting it to 0 at the sampling points. The rest of the function remains unchanged. This modification does not affect the Fourier transform. Yet the right-hand side of (5.23) is now 0 for all t ∈ ℝ.
The picket fence miracle is useful for computing the spectrum of certain signals related
to sampling. Examples are found in Exercises 11 and 13.
Appendix 5.G Exercises
Problem 1. (Matched Filter Basics) Let
  w(t) = Σ_{k=1}^{K} d_k φ(t − kT)
be a transmitted signal, where φ(t) satisfies
  ∫_{−∞}^{∞} φ(t)φ(t − kT)dt = { 0, k ≠ 0
                                { 1, k = 0,
and d_i ∈ {−1, 1}.
(a) Suppose that w(t) is filtered at the receiver by the matched filter with impulse response φ(−t). Show that the filter output y(t) sampled at mT, m integer, yields y(mT) = d_m, for 1 ≤ m ≤ K.
(b) Now suppose that the channel outputs the input plus a delayed and scaled replica of the input. That is, the channel's impulse response takes the form h(t) = δ(t) + αδ(t − T) for some α ∈ [−1, 1]. The transmitted signal w(t) is filtered by h(t), then filtered at the receiver by φ(−t). The resulting waveform y(t) is again sampled at multiples of T. Determine the samples y(mT), for 1 ≤ m ≤ K.
(c) Suppose that the kth received sample is Y_k = d_k + αd_{k−1} + Z_k, where Z_k ∼ N(0, σ²) and 0 < α ≤ 1 is a constant. d_k and d_{k−1} are realizations of independent random variables that take on the values 1 and −1 with equal probability. Suppose that the receiver decides d̂_k = 1 if Y_k > 0, and decides d̂_k = −1 otherwise. Find the probability of error for this receiver.
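Part (c) lends itself to a Monte Carlo sanity check. Conditioning on d_{k−1} = ±1 shifts the decision statistic by ±α, which suggests the expression ½[Q((1+α)/σ) + Q((1−α)/σ)]; this derivation is ours, not the text's, and the sketch below (plain Python) compares it against a simulation for one parameter choice.

```python
import math, random

def Q(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

alpha, sigma = 0.5, 0.6
analytic = 0.5 * (Q((1 + alpha) / sigma) + Q((1 - alpha) / sigma))

random.seed(1)
trials, errors = 200_000, 0
for _ in range(trials):
    dk = random.choice((-1, 1))
    dk1 = random.choice((-1, 1))
    y = dk + alpha * dk1 + random.gauss(0, sigma)
    decision = 1 if y > 0 else -1
    errors += (decision != dk)

print(analytic, errors / trials)   # the two numbers should be close
```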
Problem 2. (Differential Encoding) For many years, telephone companies built their networks on twisted pairs. This is a twisted pair of copper wires invented by Alexander Graham Bell in 1881 as a means to mitigate the effect of electromagnetic interference. In essence, an alternating magnetic field induces an electric field in a loop. This applies also to the loop created by two parallel wires connected at both ends. If the wire is twisted, the electric field components that build up along the wire alternate polarity and tend to cancel out one another. If we swap the two contacts at one end of the cable, the signal's polarity at one end is the opposite of that at the other end. Differential encoding is a technique for encoding the information in such a way that it makes no difference when the polarity is inverted. The differential encoder takes the data sequence {D_i}_{i=1}^{n}, here assumed to have independent and uniformly distributed components taking value in {0, 1}, and produces the symbol sequence {X_i}_{i=1}^{n} according to the following encoding rule:
  X_i = { X_{i−1},  D_i = 0,
        { −X_{i−1}, D_i = 1,
where X_0 = √E by convention. Suppose that the symbol sequence is used to form
  X(t) = Σ_{i=1}^{n} X_i φ(t − iT),
where φ(t) is normalized and orthogonal to its T-spaced time translates. The signal is sent over the AWGN channel of power spectral density N_0/2 and at the receiver is passed through the matched filter of impulse response φ(−t). Let Y_i be the filter output at time iT.
(a) Determine R_X[k], k ∈ ℤ, assuming an infinite sequence {X_i}_{i=−∞}^{∞}.
(b) Describe a method to estimate D_l from Y_l and Y_{l−1}, such that the performance is the same if the polarity of Y_i is inverted for all i. We ask for a simple decoder, not necessarily ML.
(c) Determine (or estimate) the error probability of your decoder.
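The polarity-invariance aimed at in part (b) can be checked in a few lines of code: a decoder that compares the signs of consecutive samples gives the same output whether or not the whole received sequence is negated, except for the first bit, which depends on the reference. This is a noise-free sketch with amplitudes normalized to ±1; the function names are ours.

```python
import random

def diff_encode(bits, x0=1):
    """X_i = X_{i-1} if D_i = 0, else -X_{i-1} (amplitudes normalized to +-1)."""
    x, out = x0, []
    for d in bits:
        x = x if d == 0 else -x
        out.append(x)
    return out

def diff_decode(y, y0=1):
    """Estimate D_i from the signs of consecutive samples."""
    prev, out = y0, []
    for yi in y:
        out.append(0 if yi * prev > 0 else 1)
        prev = yi
    return out

random.seed(0)
bits = [random.randint(0, 1) for _ in range(20)]
x = diff_encode(bits)

assert diff_decode(x) == bits                    # normal polarity
inverted = diff_decode([-v for v in x])
assert inverted[1:] == bits[1:]                  # only the first bit depends
                                                 # on the sign of the reference
```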
Problem 3. (Nyquist Original) A communication system uses signals of the form
  Σ_{l∈ℤ} c_l p(t − lT),
where c_l takes values in some symbol alphabet and p(t) is a finite-energy pulse. The transmitted signal is first filtered by a channel of impulse response h(t) and then corrupted by additive white Gaussian noise of power spectral density N_0/2. The receiver front end is a filter of impulse response q(t).
(a) Neglecting the noise, show that the front-end filter output has the form
  y(t) = Σ_{l∈ℤ} c_l g(t − lT),
where g(t) = (p ⋆ h ⋆ q)(t) and ⋆ denotes convolution.
(b) A necessary and sufficient (time-domain) condition that g(t) has to fulfill so that the samples of y(t) satisfy y(lT) = c_l, l ∈ ℤ, is
  g(lT) = δ_l.
A function g(t) that fulfills this condition is called a Nyquist pulse of parameter T. Prove the following theorem:
Theorem 90. (Nyquist Criterion for Nyquist Pulses) The L² pulse g(t) is a Nyquist pulse (of parameter T) if and only if its Fourier transform g_F(f) fulfills Nyquist's criterion (with parameter T), i.e.,
  l.i.m. Σ_{l∈ℤ} g_F(f − l/T) = T, f ∈ ℝ.
Note: Due to the periodicity of the left-hand side, equality is fulfilled if and only if it is fulfilled over an interval of length 1/T. Hint: Set g(t) = ∫ g_F(f)e^{j2πft}df, insert t = lT on both sides, and proceed as in the proof of Theorem 79.
(c) Prove Theorem 79 as a corollary to the above theorem. Hint: δ_l = ∫ φ(t − lT)φ∗(t)dt if and only if the self-similarity function R_φ(τ) is a Nyquist pulse of parameter T.
(d) Let p(t) and q(t) be real-valued with Fourier transforms as shown in Figure 5.12, where only positive frequencies are plotted (both functions being even). The channel frequency response is h_F(f) = 1. Determine y(kT).
[Figure 5.12: the Fourier transforms p_F(f) and q_F(f), plotted for positive frequencies; each is flat over the interval [0, 1/T].]
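Theorem 90 can be probed numerically. For g(t) = sinc²(t/T), whose spectrum is the triangle g_F(f) = T(1 − |f|T) on [−1/T, 1/T], both the time-domain condition g(lT) = δ_l and the folded-spectrum condition Σ_l g_F(f − l/T) = T are easy to verify on a grid. This is a sketch under our own choice of pulse.

```python
import math

T = 1.0

def g(t):
    """g(t) = sinc^2(t/T), a Nyquist pulse of parameter T."""
    if t == 0:
        return 1.0
    return (math.sin(math.pi * t / T) / (math.pi * t / T)) ** 2

def g_F(f):
    """Its Fourier transform: a triangle of height T on [-1/T, 1/T]."""
    return T * (1 - abs(f) * T) if abs(f) <= 1 / T else 0.0

# Time-domain Nyquist condition: g(lT) = delta_l
for l in range(-5, 6):
    assert abs(g(l * T) - (1.0 if l == 0 else 0.0)) < 1e-12

# Frequency-domain Nyquist criterion: sum_l g_F(f - l/T) = T
for i in range(50):
    f = i / 50 / T                      # grid over one period [0, 1/T)
    folded = sum(g_F(f - l / T) for l in range(-3, 4))
    assert abs(folded - T) < 1e-12
```

The overlapping triangle flanks sum to a constant, which is the geometric content of the folded-spectrum criterion.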
Problem 4. (Power Spectrum: Manchester Pulse) Derive the power spectral density of the random process
  X(t) = Σ_{i=−∞}^{∞} X_i φ(t − iT − Θ),
where {X_i}_{i=−∞}^{∞} is an iid sequence of uniformly distributed random variables taking values in {±√E}, Θ is uniformly distributed in the interval [0, T], and φ(t) is the so-called Manchester pulse shown in Figure 5.13. The Manchester pulse guarantees that X(t) has at least one transition per symbol, which facilitates the clock recovery at the receiver.
[Figure 5.13: the Manchester pulse φ(t), equal to 1/√T on the first half of [0, T] and −1/√T on the second half.]
Problem 5. (Nyquist Criterion) For each function |φ_F(f)|² in Figure 5.14, indicate whether the corresponding pulse φ(t) has unit norm, is a Nyquist pulse, is orthogonal to its time translates by multiples of T, or neither. The function in Figure 5.14(d) is sinc²(fT).
[Figure 5.14: four candidates for |φ_F(f)|²: (a) height T on [−1/(2T), 1/(2T)]; (b) height T/2 on [−1/T, 1/T]; (c) height T, supported on [−1/(2T), 1/(2T)]; (d) sinc²(fT), of peak value 1 and first zeros at ±1/T.]
Problem 6. (Communication Link Design) Specify the block diagram for a digital communication system that uses twisted copper wires to connect devices that are 5 km apart from each other. The cable has an attenuation of 16 dB/km. You are allowed to use the spectrum between −5 and 5 MHz. The noise at the receiver input is white and Gaussian, with power spectral density N_0/2 = 4.2 × 10⁻²¹ W/Hz. The required bit rate is 40 Mbps (megabits per second) and the bit error probability should be less than 10⁻⁵. Be sure to specify the symbol alphabet and the waveform former of the system you propose. Give precise values or bounds for the bandwidth used, the power of the channel input signal, the bit rate, and the error probability. Indicate which bandwidth definition you use.
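As a hedged starting point for this design problem (the numbers below are simply the problem's givens plugged into standard link-budget arithmetic, not the intended solution): the cable contributes 16 × 5 = 80 dB of attenuation, and the noise power over a bandwidth B is N₀B. The target receive SNR of 20 dB is our illustrative assumption.

```python
# Problem givens
att_db_per_km, distance_km = 16.0, 5.0
N0_over_2 = 4.2e-21        # W/Hz
B = 10e6                   # -5 to 5 MHz: 10 MHz of bandwidth

att_db = att_db_per_km * distance_km          # total cable attenuation, dB
noise_power = 2 * N0_over_2 * B               # N0 * B, in W

# Suppose (our assumption, for illustration only) the chosen constellation
# needs 20 dB of SNR at the receiver:
snr_db = 20.0
p_rx = noise_power * 10 ** (snr_db / 10)      # required received power
p_tx = p_rx * 10 ** (att_db / 10)             # required transmitted power

print(att_db, noise_power, p_rx, p_tx)
```

The actual SNR requirement depends on the symbol alphabet and the target error probability; this sketch only fixes the orders of magnitude involved.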
Problem 7. (Mixed Questions)
(a) Consider the signal x(t) = cos(2πt)(sin(πt)/(πt))². Assume that we sample x(t) with sampling period T. What is the maximum T that guarantees signal recovery?
(b) Consider the three signals s₁(t) = 1, s₂(t) = cos(2πt), s₃(t) = sin²(πt), for 0 ≤ t ≤ 1. What is the dimension of the signal space spanned by {s₁(t), s₂(t), s₃(t)}?
(c) You are given a pulse p(t) with spectrum p_F(f) = √(T(1 − |f|T)), 0 ≤ |f| ≤ 1/T. What is the value of ∫ p(t)p(t − 3T)dt?
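For questions like (b), the dimension of the span of a set of signals can be checked numerically through the Gram matrix G_{ij} = ⟨s_i, s_j⟩: the dimension equals the rank of G. A minimal sketch with a midpoint-rule inner product (helper names ours; a vanishing determinant together with a nonvanishing leading minor pins down the rank):

```python
import math

def inner(f, g, a=0.0, b=1.0, n=4000):
    """Midpoint-rule approximation of the inner product on [a, b]."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) * g(a + (k + 0.5) * h) for k in range(n)) * h

s = [lambda t: 1.0,
     lambda t: math.cos(2 * math.pi * t),
     lambda t: math.sin(math.pi * t) ** 2]

G = [[inner(si, sj) for sj in s] for si in s]

# Rank via the 3x3 determinant and a 2x2 leading minor.
det3 = (G[0][0] * (G[1][1] * G[2][2] - G[1][2] * G[2][1])
        - G[0][1] * (G[1][0] * G[2][2] - G[1][2] * G[2][0])
        + G[0][2] * (G[1][0] * G[2][1] - G[1][1] * G[2][0]))
det2 = G[0][0] * G[1][1] - G[0][1] * G[1][0]
print(det3, det2)   # det3 ~ 0 signals a linear dependence among the signals
```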
Problem 8. (Pulse Orthogonal to its T-Spaced Time Translates) The following figure shows part of the plot of a function |φ_F(f)|², where φ_F(f) is the Fourier transform of some pulse φ(t).

[Figure: the plot of |φ_F(f)|² is shown only over the interval [0, 1/(2T)]; the ordinate is not labeled.]

Complete the plot (for positive and negative frequencies) and label the ordinate, knowing that the following conditions are satisfied:
• For every pair of integers k, l, ∫ φ(t − kT)φ(t − lT)dt = δ_{kl}.
• φ(t) is real-valued.
• φ_F(f) = 0 for |f| > 1/T.
Problem 9. (Peculiarity of the Sinc Pulse) Let {U_k}_{k=0}^{n} be an iid sequence of uniformly distributed bits taking value in {±1}. Prove that for certain values of t and for n sufficiently large, s(t) = Σ_{k=0}^{n} U_k sinc(t − k) can become larger than any given constant. Hint: The series Σ_{k=1}^{∞} 1/k diverges and so does Σ_{k=1}^{∞} 1/(k + a) for any constant a.
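The phenomenon behind Problem 9 is visible numerically: at t = n + 1/2, every |sinc(t − k)| equals 1/(π(n + 1/2 − k)), so choosing U_k = sign(sinc(t − k)) makes the sum a harmonic-like series that grows logarithmically in n. This sketch computes that worst case (the adversarial sign choice is ours, as a standard way to exhibit the divergence).

```python
import math

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def worst_case(n):
    """sup over sign choices of sum_{k=0}^{n} U_k sinc(t - k) at t = n + 1/2."""
    t = n + 0.5
    return sum(abs(sinc(t - k)) for k in range(n + 1))

for n in (10, 100, 1000, 10000):
    print(n, worst_case(n))   # grows without bound, roughly like (1/pi) log n
```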
190 Chapter 5.
Problem 10. (Properties of the Self-Similarity Function) Prove the following properties of the self-similarity function (5.5). Recall that the self-similarity function of a pulse φ(t) is R_φ(τ) = ∫_{−∞}^{∞} φ(t + τ)φ∗(t)dt.
(a) Value at zero:
  |R_φ(τ)| ≤ R_φ(0) = ‖φ‖², τ ∈ ℝ. (5.24)
(b) Conjugate symmetry:
  R_φ(τ) = R_φ∗(−τ), τ ∈ ℝ. (5.25)
(c) Convolution representation:
  R_φ(τ) = φ(τ) ⋆ φ∗(−τ), τ ∈ ℝ. (5.26)
Note: The convolution between a(t) and b(t) can be written as (a ⋆ b)(t) or as a(t) ⋆ b(t). Both versions are used in the literature. We prefer the first version, but in the above case the second version does not require the introduction of a name for φ∗(−τ).
(d) Fourier relationship:
  R_φ(τ) is the inverse Fourier transform of |φ_F(f)|². (5.27)
Note: This implies that R_φ(τ) is a continuous function.
Problem 11. (Sampling and Reconstruction) Here we use the picket fence miracle to investigate practical ways to approximate sampling and/or reconstruction. We assume that for some positive W, s(t) satisfies s_F(f) = 0 for f ∉ [−W/2, W/2]. Let T be such that 0 < T < 1/W.
(a) As a reference, review Example 89 of Appendix 5.F.
(b) To generate the intermediate signal s(t)E_T(t) of Example 89, we need an electrical circuit that produces Diracs. Such a circuit does not exist. As a substitute for δ(t), we use a rectangular pulse of small width T_w, where 0 < T_w ≪ T. The intermediate signal at the input of the reconstruction filter is then (1/T_w)[s(t)E_T(t)] ⋆ 1_{[0,T_w]}(t). (We can generate this signal without passing through E_T(t).) Express the Fourier transform y_F(f) of the reconstruction filter output.
(c) In the so-called zero-order interpolator, the reconstructed approximation is the step-wise signal (1/T)[s(t)E_T(t)] ⋆ 1_{[0,T]}(t). This is the intermediate signal of Part (b) with T_w = T. Express its Fourier transform. Note: There is no interpolation filter in this case.
(d) In the first-order interpolator, the reconstructed approximation consists of straight lines connecting the values of the original signal at the sampling points. This can be written as [s(t)E_T(t)] ⋆ p(t), where p(t) is the triangular-shaped waveform
  p(t) = { (T − |t|)/T, t ∈ [−T, T]
         { 0, otherwise.
Express the Fourier transform of the reconstructed approximation.
Compare s_F(f) to the Fourier transforms of the various reconstructions you have obtained.
Problem 12. (Sampling and Projections) We have seen that the reconstruction formula of the sampling theorem can be rewritten in such a way that it becomes an orthonormal expansion (expression (5.3)). If ψ_j(t) is the jth element of an orthonormal set of functions used to expand w(t), the jth coefficient c_j equals the inner product ⟨w, ψ_j⟩. Explain why we do not need to explicitly perform an inner product to obtain the coefficients used in the reconstruction formula (5.3).
Problem 13. (Nyquist Criterion via Picket Fence Miracle) Give an informal proof of Theorem 79 (Nyquist Criterion for Nyquist Pulses) using the picket fence miracle (Appendix 5.F). Hint: A function p(t) is a Nyquist pulse if and only if p(t)E_T(t) = δ(t).
Problem 14. (Peculiarity of Nyquist's Criterion) Let
  g_F^{(0)}(f) = T · 1_{[−1/(3T), 1/(3T)]}(f)
be the central rectangle in Figure 5.15, and for every positive integer n, let g_F^{(n)}(f) consist of g_F^{(0)}(f) plus 2n smaller rectangles of height T/(2n) and width 1/(3T), each placed in the middle of an interval of the form [l/T, (l+1)/T], l = −n, −n+1, ..., n−1. Figure 5.15 shows g_F^{(3)}(f).
[Figure 5.15: g_F^{(3)}(f): the central rectangle of height T over [−1/(3T), 1/(3T)], plus six rectangles of height T/6 centered in the intervals [l/T, (l+1)/T], l = −3, ..., 2; marked abscissas include 1/(3T), 2/(3T), ±1/T, ±2/T, ±3/T.]
(a) Show that for every n ≥ 1, g_F^{(n)}(f) fulfills Nyquist's criterion with parameter T. Hint: It is sufficient that you verify that Nyquist's criterion is fulfilled for f ∈ [0, 1/T]. Towards that end, first check what happens to the central rectangle when you perform the operation Σ_{l∈ℤ} g_F^{(n)}(f − l/T). Then see how the small rectangles fill in the gaps.
(b) As n goes to infinity, g_F^{(n)}(f) converges to g_F^{(0)}(f). (It converges for every f and also it converges in L², i.e., the squared norm of the difference g_F^{(n)}(f) − g_F^{(0)}(f) goes to zero.) Peculiar is that the limiting function g_F^{(0)}(f) fulfills Nyquist's criterion with parameter T^{(0)} ≠ T. What is T^{(0)}?
(c) For a Nyquist pulse of parameter T, the absolute bandwidth has to be at least 1/T. What about for other bandwidth criteria? Hint: Consider e.g. the 3-dB bandwidth or the ε-bandwidth.
(d) Suppose that we use symbol-by-symbol on a pulse train to communicate across an AWGN channel. To do so, we choose a pulse φ(t) such that |φ_F(f)|² = g_F^{(n)}(f) for some n, and we choose n sufficiently large that T/(2n) is much smaller than the noise power spectral density N_0/2. In this case, we can argue that our bandwidth is only 2/(3T). This means a bandwidth reduction of one third with respect to the minimum absolute bandwidth 1/T. This reduction is non-negligible if we pay for the bandwidth we use. How do you explain that such a pulse is not used in practice? Hint: What do you expect φ(t) to look like?
(e) Construct a function g_F(f) that looks like Figure 5.15 in the interval shown by the figure except for the heights of the rectangles. Your function should have infinitely many smaller rectangles on each side of the central rectangle and (like g_F^{(n)}(f)) shall satisfy Nyquist's criterion. Hint: One such construction is suggested by the infinite geometric series Σ_{i=1}^{∞} (1/2)^i, which adds to 1.
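Part (a) can be checked numerically: implement g_F^{(n)}, fold it, and confirm that the sum equals T away from the rectangle boundaries. (Sketch; endpoint values of the indicator functions are a measure-zero detail that we sidestep by sampling off the boundaries.)

```python
T = 1.0
n = 3

def g_n(f):
    """g_F^(n)(f): central rectangle of height T plus 2n small ones of
    height T/(2n), each filling the middle third of [l/T, (l+1)/T]."""
    if abs(f) <= 1 / (3 * T):
        return T
    for l in range(-n, n):
        if l / T + 1 / (3 * T) < f < l / T + 2 / (3 * T):
            return T / (2 * n)
    return 0.0

# Fold: sum over shifts by l/T, evaluated just off the rectangle edges.
for i in range(1, 300):
    f = (i / 300) / T + 1e-4
    folded = sum(g_n(f - l / T) for l in range(-(n + 2), n + 3))
    assert abs(folded - T) < 1e-9
```

In the gap (1/(3T), 2/(3T)) the folded central rectangle contributes nothing, and exactly 2n shifted small rectangles land there, each adding T/(2n), which restores the constant T.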
Problem 15. (Properties of the Fourier Transform) Prove the following properties of the Fourier transform. The sign ⟷ relates Fourier transform pairs, with the function on the right being the Fourier transform of that on the left. The Fourier transforms of v(t) and w(t) are denoted by v_F(f) and w_F(f), respectively.
(a) linearity:
  αv(t) + βw(t) ⟷ αv_F(f) + βw_F(f).
(b) time shifting:
  v(t − t₀) ⟷ v_F(f)e^{−j2πft₀}.
(c) frequency shifting:
  v(t)e^{j2πf₀t} ⟷ v_F(f − f₀).
(d) convolution in time:
  (v ⋆ w)(t) ⟷ v_F(f)w_F(f).
(e) time scaling by α ≠ 0:
  v(αt) ⟷ (1/|α|) v_F(f/α).
(f) time-frequency duality:
  v_F(t) ⟷ v(−f).
(g) conjugation:
  v∗(t) ⟷ v_F∗(−f).
(h) Parseval's relationship:
  ∫ v(t)w∗(t)dt = ∫ v_F(f)w_F∗(f)df.
Note: As a mnemonic, notice that the above can be written as ⟨v, w⟩ = ⟨v_F, w_F⟩.
(i) correlation:
  ∫ v(t + τ)w∗(t)dt ⟷ v_F(f)w_F∗(f).
Hint: use Parseval's relationship on the expression on the right and interpret the result.
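Several of these properties have exact discrete (DFT) analogues, which makes them easy to verify numerically; below we check linearity and time shifting, where property (b) becomes multiplication by e^{−j2πkn₀/N} under a circular shift. Plain Python sketch with a naive DFT.

```python
import cmath, random

def dft(x):
    """Naive discrete Fourier transform."""
    N = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / N) for m in range(N))
            for k in range(N)]

random.seed(0)
N = 16
v = [random.random() for _ in range(N)]
w = [random.random() for _ in range(N)]
V, W = dft(v), dft(w)

# (a) linearity
lin = dft([2 * a + 3 * b for a, b in zip(v, w)])
assert all(abs(lin[k] - (2 * V[k] + 3 * W[k])) < 1e-9 for k in range(N))

# (b) time shifting: a circular shift by n0 multiplies the DFT by a phase ramp
n0 = 3
shifted = dft([v[(m - n0) % N] for m in range(N)])
for k in range(N):
    phase = cmath.exp(-2j * cmath.pi * k * n0 / N)
    assert abs(shifted[k] - V[k] * phase) < 1e-9
```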