
Chapter 7: Channel capacity

University of Illinois at Chicago ECE 534, Fall 2009, Natasha Devroye

Chapter 7 outline

Definition and examples of channel capacity


Symmetric channels
Channel capacity properties
Definitions and jointly typical sequences
Channel coding theorem: achievability and converse
Hamming codes
Channels with feedback
Source-channel separation theorem


Generic communication block diagram

[Block diagram: Source → Encoder → Channel (noise added) → Decoder → Destination. In expanded form: Source → Source coder (remove redundancy) → Channel coder (controlled adding of redundancy) → Channel (noise added) → Channel decoder (decode signals, detect/correct errors) → Source decoder (restore source) → Destination.]

Communication system

[Block diagram: a source produces a message, which is encoded, sent over a noisy channel, decoded, and delivered to the destination as an estimate of the message.]

Capacity: key ideas

[Block diagram: message → encoder → channel → decoder → estimate of the message.]

Choose the input set of codewords so that they are non-confusable at the output.
The number of such codewords that we can choose will determine the channel's capacity!
The number that we can choose will depend on the distribution p(y|x), which characterizes the channel.
For now we deal with discrete channels.

Discrete channel capacity


" #
n
!
n ofi capacity ni
Mathematical
description
f (1 f )
Pe =
i
i=m+1
Information channel capacity:

C = max I(X; Y )
p(x)

1
C = log2 (1 + |h|2 P/PN )
2
Highest rate (bits/channel use) that can
1communicate at2 reliably
2 log2 (1 + |h| P/PN )
C = theorem' says: information capacity = operational
( capacity
Channel coding

1
2
Eh 2 log2 (1 + |h| P/PN )
)
)

)
1
)
maxQ:T r(Q)=P 2 log2 IMR + HQH
Operational channel capacity:

C=

max

r(Q)=P
ChannelQ:T
capacity

EH

)
)(
)
)
log2 IMR + HQH
2

'1

Y =Channel:
HX +p(y|x)
N
X
Y
1
X=H U+N

1I(X; Y )
Capacity
C H(H
= max
Y=
p(x) U) + N
=U+N

bits/channel use

hard part, to find the ``capacity achieving input


distribution.
"
#
!
p(x, y)
p(x, y) log
I(X; Y ) =
p(x)p(y)
x,y

= ,

1
C = log2B(1= +
B1 P/N)
+ B2
2
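Finding the maximizing p(x) is in general a numerical optimization. Below is a minimal sketch of the Blahut-Arimoto iteration for computing C = max_{p(x)} I(X; Y) of a discrete memoryless channel; the code and the BSC test matrix are illustrative assumptions, not taken from these slides.

import numpy as np

def blahut_arimoto(W, tol=1e-9, max_iter=10_000):
    """Capacity (in bits) of a DMC with transition matrix W[x, y] = p(y|x)."""
    nx = W.shape[0]
    p = np.full(nx, 1.0 / nx)              # start from the uniform input distribution
    for _ in range(max_iter):
        q = p @ W                           # output distribution q(y)
        # D[x] = sum_y p(y|x) log2( p(y|x) / q(y) ), with the convention 0 log 0 = 0
        with np.errstate(divide="ignore", invalid="ignore"):
            ratio = np.where(W > 0, W / q, 1.0)
        D = np.sum(W * np.log2(ratio), axis=1)
        c = 2.0 ** D
        C_low = np.log2(p @ c)              # lower bound on capacity
        C_up = np.log2(c.max())             # upper bound on capacity
        p = p * c / (p @ c)                 # multiplicative update of p(x)
        if C_up - C_low < tol:
            break
    return C_low, p

f = 0.1                                     # illustrative BSC crossover probability
W = np.array([[1 - f, f], [f, 1 - f]])
C, p_star = blahut_arimoto(W)
print(C, p_star)                            # ~0.531 bits (= 1 - H(0.1)), p* ~ [0.5, 0.5]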

Noiseless channel

Capacity? 1 bit/channel use

C = max_{p(x)} I(X; Y)
  = max_{p(x)} [ H(X) - H(X|Y) ]
  = max_{p(x)} H(X) - 0
  = 1

Non-overlapping outputs

[Figure: each input maps to two outputs with probability 1/2 each, but no output can be produced by both inputs.]

Capacity? 1 bit/channel use

C = max_{p(x)} I(X; Y)
  = max_{p(x)} [ H(X) - H(X|Y) ]
  = max_{p(x)} H(X) - 0
  = 1

Noisy typewriter

[Figure: noisy typewriter channel over a 27-letter alphabet; each received letter could have come from one of three input letters.]

Capacity? log2(9) bits/channel use

C = max_{p(x)} I(X; Y)
  = max_{p(x)} [ H(X) - H(X|Y) ]
  = max_{p(x)} H(X) - log2(3)
  = log2(27) - log2(3) = log2(9)

Binary erasure channel

[Figure: inputs 0 and 1 (with input distribution p, 1-p) are received correctly with probability 1-f and erased with probability f.]

Capacity? 1-f bits/channel use
Binary symmetric channel

Capacity? 1-H(f) bits/channel use

Conditional distributions (transition probability matrix):
p(y=0|x=0) = p(y=1|x=1) = 1-f
p(y=1|x=0) = p(y=0|x=1) = f

[Cover & Thomas, p. 187]


Binary Channels

Binary Symmetric Channel: X = {0, 1} and Y = {0, 1}, with transition matrix

p(y|x) = [ 1-f   f  ]
         [  f   1-f ]

Binary Erasure Channel: X = {0, 1} and Y = {0, ?, 1}, with transition matrix

p(y|x) = [ 1-f   f    0  ]
         [  0    f   1-f ]

Z channel: X = {0, 1} and Y = {0, 1}, with transition matrix

p(y|x) = [  1    0  ]
         [  f   1-f ]

[B. Smida (ES250), Channel Capacity, Fall 2008-09]
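The BSC and BEC capacities have closed forms (1-H(f) and 1-f); the Z channel does not, but for any binary-input channel the capacity can be found by maximizing I(X;Y) over the single parameter p = P(X=1). The grid-search helper and the value f = 0.1 below are illustrative assumptions, not from the slides.

import numpy as np

def mutual_information_bits(p1, W):
    """I(X;Y) in bits for P(X=1) = p1 and a 2-row transition matrix W[x, y] = p(y|x)."""
    px = np.array([1 - p1, p1])
    pxy = px[:, None] * W                  # joint distribution p(x, y)
    py = pxy.sum(axis=0)                   # output marginal p(y)
    mask = pxy > 0
    return float(np.sum(pxy[mask] * np.log2(pxy[mask] / (px[:, None] * py)[mask])))

def binary_input_capacity(W, grid=20001):
    """Maximize I(X;Y) over p = P(X=1) by a fine grid search."""
    ps = np.linspace(0.0, 1.0, grid)
    vals = [mutual_information_bits(p, W) for p in ps]
    i = int(np.argmax(vals))
    return vals[i], ps[i]

f = 0.1  # illustrative crossover / erasure probability
bsc = np.array([[1 - f, f], [f, 1 - f]])
bec = np.array([[1 - f, f, 0.0], [0.0, f, 1 - f]])
zch = np.array([[1.0, 0.0], [f, 1 - f]])

for name, W in [("BSC", bsc), ("BEC", bec), ("Z", zch)]:
    C, p_star = binary_input_capacity(W)
    print(f"{name}: C = {C:.4f} bits/use, achieved at P(X=1) = {p_star:.3f}")
# BSC: ~0.5310 (= 1 - H(0.1)); BEC: ~0.9000 (= 1 - f); Z: ~0.7629, with optimal P(X=1) < 0.5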

Symmetric channels

(Continuing the binary erasure channel calculation, with erasure probability α and π = Pr(X = 1):)

C = max_π (1 - α) H(π) + H(α) - H(α)      (7.14)
  = max_π (1 - α) H(π)                    (7.15)
  = 1 - α,                                (7.16)

where capacity is achieved by π = 1/2.

The expression for the capacity has some intuitive meaning: Since a proportion α of the bits are lost in the channel, we can recover (at most) a proportion 1 - α of the bits. Hence the capacity is at most 1 - α. It is not immediately obvious that it is possible to achieve this rate. This will follow from Shannon's second theorem.

In many practical channels, the sender receives some feedback from the receiver. If feedback is available for the binary erasure channel, it is very clear what to do: If a bit is lost, retransmit it until it gets through. Since the bits get through with probability 1 - α, the effective rate of transmission is 1 - α. In this way we are easily able to achieve a capacity of 1 - α with feedback. Later in the chapter we prove that the rate 1 - α is the best that can be achieved both with and without feedback. This is one of the consequences of the surprising fact that feedback does not increase the capacity of discrete memoryless channels.

7.2 SYMMETRIC CHANNELS

The capacity of the binary symmetric channel is C = 1 - H(p) bits per transmission, and the capacity of the binary erasure channel is C = 1 - α bits per transmission. Now consider the channel with transition matrix

          [ 0.3  0.2  0.5 ]
p(y|x) =  [ 0.5  0.3  0.2 ]              (7.17)
          [ 0.2  0.5  0.3 ]

Here the entry in the xth row and the yth column denotes the conditional probability p(y|x) that y is received when x is sent. In this channel, all the rows of the probability transition matrix are permutations of each other and so are the columns. Such a channel is said to be symmetric. Another example of a symmetric channel is one of the form

Y = X + Z  (mod c).                       (7.18)

...and C is achieved by a uniform distribution on the input. The transition matrix of the symmetric channel defined above is doubly stochastic. In the computation of the capacity, we used the facts that the rows were permutations of one another and that all the column sums were equal. Considering these properties, we can define a generalization of the concept of a symmetric channel as follows:

Definition. A channel is said to be symmetric if the rows of the channel transition matrix p(y|x) are permutations of each other and the columns are permutations of each other. A channel is said to be weakly symmetric if every row of the transition matrix p(·|x) is a permutation of every other row and all the column sums Σ_x p(y|x) are equal.

For example, the channel with transition matrix

p(y|x) =  [ 1/3  1/6  1/2 ]              (7.24)
          [ 1/3  1/2  1/6 ]

is weakly symmetric but not symmetric.

Capacity of weakly symmetric channels
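For a weakly symmetric channel, Cover & Thomas show that C = log2|Y| - H(r), where r is any row of the transition matrix, and that C is achieved by the uniform input distribution (this is the result the heading above refers to). Below is a minimal numerical check on the two matrices above; the helper function is an illustrative sketch, not from the text.

import numpy as np

def weakly_symmetric_capacity(W):
    """C = log2(|Y|) - H(row), in bits; valid when W[x, y] = p(y|x) is weakly symmetric."""
    row = W[0]
    H_row = -np.sum(row * np.log2(row))    # entropy of one row of the transition matrix
    return np.log2(W.shape[1]) - H_row

symmetric = np.array([[0.3, 0.2, 0.5],
                      [0.5, 0.3, 0.2],
                      [0.2, 0.5, 0.3]])    # Eq. (7.17)
weakly_sym = np.array([[1/3, 1/6, 1/2],
                       [1/3, 1/2, 1/6]])   # Eq. (7.24)

print(weakly_symmetric_capacity(symmetric))   # log2(3) - H(0.3, 0.2, 0.5) ≈ 0.0995 bits
print(weakly_symmetric_capacity(weakly_sym))  # log2(3) - H(1/3, 1/6, 1/2) ≈ 0.126 bits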

" #
n
!
n i
f (1 f )ni
Pe =
Properties of the channel
i capacity
i=m+1
C = max I(X; Y )
p(x)

C=

C=

C=

1
2

1
log2 (1 + |h|2 P/PN )
2
log2 (1 + |h|2 P/PN )

(
Eh 2 log2 (1 + |h| P/PN )
)
)

)
1
)
maxQ:T r(Q)=P 2 log2 IMR + HQH

'1

maxQ:T r(Q)=P EH

)
)(

log2 )IMR + HQH )


2

'1

Preview of the channel coding theorem

Y = HX + N
X = H1 U + N

Y = H(H1 U) + N
=U+N

What happens when we use the channel n times?

C=

1
log2 (1 + P/N)
2

An average input sequence corresponds to about 2^{nH(Y|X)} typical output sequences.
There are a total of 2^{nH(Y)} typical output sequences.
For nearly error-free transmission, we select a number of input sequences whose corresponding sets of output sequences hardly overlap.
The maximum number of distinct sets of output sequences is 2^{n(H(Y) - H(Y|X))} = 2^{nI(X;Y)}.

[B. Smida (ES250), Channel Capacity, Fall 2008-09]

Let's make this rigorous!
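As a concrete (hypothetical) instance of this counting argument, take a BSC with crossover probability f = 0.1, uniform inputs, and block length n = 1000:

number of typical output sequences         ≈ 2^{nH(Y)} = 2^{1000}
typical outputs produced by one codeword   ≈ 2^{nH(Y|X)} = 2^{nH(0.1)} ≈ 2^{469}
maximum number of non-confusable codewords ≈ 2^{n(H(Y) - H(Y|X))} = 2^{n(1 - H(0.1))} ≈ 2^{531}

i.e. about 0.531 bits per channel use, which is exactly the BSC capacity 1 - H(f).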

Definitions

[Block diagram: message → source → encoder → channel → decoder → destination → estimate of the message.]

What's our goal?
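For reference, the standard definitions that these slides build on (as in Cover & Thomas, Ch. 7):

An (M, n) code for the channel (X, p(y|x), Y) consists of an index set {1, ..., M}, an encoding function X^n : {1, ..., M} → X^n giving codewords x^n(1), ..., x^n(M), and a decoding function g : Y^n → {1, ..., M}.
Conditional probability of error: λ_i = Pr( g(Y^n) ≠ i | X^n = x^n(i) ); maximal error λ^(n) = max_i λ_i; average error P_e^(n) = (1/M) Σ_i λ_i.
Rate: R = (log2 M) / n bits per transmission.
A rate R is achievable if there exists a sequence of (2^{nR}, n) codes with λ^(n) → 0 as n → ∞; the (operational) capacity is the supremum of all achievable rates.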

" #
n
!
n ofi capacity ni
Mathematical
description
f (1 f )
Pe =
i
i=m+1
Information channel capacity:

C = max I(X; Y )
p(x)

1
C = log2 (1 + |h|2 P/PN )
2
Highest rate (bits/channel use) that can
1communicate at2 reliably
2 log2 (1 + |h| P/PN )
C = theorem' says: information capacity = operational
( capacity
Channel coding

1
2
Eh 2 log2 (1 + |h| P/PN )
)
)

1
maxQ:T r(Q)=P 2 log2 )IMR + HQH )
Operational channel capacity:

Recall the definition of typical sequences....

Let's make this 2-D!

Jointly typical sequences

Joint Asymptotic Equipartition Theorem (AEP)

Joint typicality images

Channel Coding Theorem

Proof of Achievability

Proof of Converse

Jointly Typical Diagram

There are about 2^{nH(X)} typical X sequences in all.
Each typical Y is jointly typical with about 2^{nH(X|Y)} of those typical X sequences.

Channel coding theorem

Key ideas behind channel coding theorem


Allow for arbitrarily small but nonzero probability of error
Use channel many times in succession: law of large numbers!
Probability of error calculated over a random choice of codebooks
Joint typicality decoders used for simplicity of proof
NOT constructive! Does NOT tell us how to code to achieve capacity!

Key ideas behind the channel coding theorem

Random codes

[Block diagram: message → encoder → channel → decoder → estimate of the message.]

Transmission

Probability of error

Random codes?

Analogy....

[Figure: the set A_Y of typical output sequences has about 2^{NH(Y)} elements; each typical input maps to about 2^{NH(Y|X)} typical outputs, and each typical output is jointly typical with about 2^{NH(X|Y)} typical inputs.]

10.3 Proof of the noisy-channel coding theorem

Analogy
Imagine that we wish to prove that there is a baby in a class of one hundred babies who weighs less than 10 kg. Individual babies are difficult to catch and weigh. Shannon's method of solving the task is to scoop up all the babies and weigh them all at once on a big weighing machine. If we find that their average weight is smaller than 10 kg, there must exist at least one baby who weighs less than 10 kg; indeed, there must be many! Shannon's method isn't guaranteed to reveal the existence of an underweight child, since it relies on there being a tiny number of elephants in the class. But if we use his method and get a total weight smaller than 1000 kg, then our task is solved.

Figure 10.3. Shannon's method for proving one baby weighs less than 10 kg.

From skinny children to fantastic codes

We wish to show that there exists a code and a decoder having small probability of error. Evaluating the probability of error of any particular coding and decoding system is not easy. Shannon's innovation was this: instead of constructing a good coding and decoding system and evaluating its error probability, Shannon calculated the average probability of block error of all codes, and proved that this average is small. There must then exist individual codes that have small probability of block error.

[MacKay textbook, pg. 164]

Random coding and typical-set decoding

Consider the following encoding/decoding system, whose rate is R'.

1. We fix P(x) and generate the S = 2^{NR'} codewords of a (N, NR') = ...
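Below is a minimal simulation sketch of this random-coding idea over a BSC. It is an illustration rather than MacKay's construction: the parameters are arbitrary, and minimum-Hamming-distance decoding (the ML rule for a BSC with f < 1/2) stands in for typical-set decoding.

import numpy as np

rng = np.random.default_rng(0)

def random_code_error_rate(N, R, f, trials=500):
    """Average block-error rate of one randomly drawn code of rate ~R on a BSC(f)."""
    S = 2 ** max(1, round(N * R))                 # number of codewords, S ≈ 2^{NR}
    code = rng.integers(0, 2, size=(S, N), dtype=np.uint8)   # i.i.d. Bernoulli(1/2) codewords
    errors = 0
    for _ in range(trials):
        s = rng.integers(S)                       # message chosen uniformly at random
        noise = (rng.random(N) < f).astype(np.uint8)          # BSC flips each bit w.p. f
        y = code[s] ^ noise
        d = np.count_nonzero(code ^ y, axis=1)    # Hamming distance to every codeword
        errors += int(np.argmin(d) != s)          # decode to the nearest codeword
    return errors / trials

f = 0.1                                           # BSC(0.1): capacity 1 - H(0.1) ≈ 0.53 bits/use
print(random_code_error_rate(N=20, R=0.25, f=f))  # R < C: most blocks decode correctly
print(random_code_error_rate(N=20, R=0.80, f=f))  # R > C: almost every block is decoded wrongly
# The theorem says the first error rate can be driven to 0 by increasing N at fixed R < C,
# while for R > C no code (random or not) can do the same.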

Channel coding theorem

Converse to the channel coding theorem


Based on Fano's inequality:
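As a reminder of the standard argument (following Cover & Thomas), for a (2^{nR}, n) code with the message W uniform on {1, ..., 2^{nR}} and average error probability P_e^(n):

H(W | Ŵ) ≤ 1 + P_e^(n) · nR                          (Fano's inequality)

nR = H(W) = H(W | Ŵ) + I(W; Ŵ)
   ≤ 1 + P_e^(n) · nR + I(X^n; Y^n)                  (data-processing inequality)
   ≤ 1 + P_e^(n) · nR + nC                           (memoryless channel, used without feedback)

Dividing by n gives R ≤ C + P_e^(n) R + 1/n, so P_e^(n) → 0 forces R ≤ C.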

Converse to the channel coding theorem


Need one more result before we prove the converse:

What does this mean?

Now let's prove the channel coding converse

Converse to the channel coding theorem

Weak versus strong converses


Weak converse:

Strong converse:

Channel capacity is a sharp dividing point: below it, the probability of error can be made to go to zero exponentially fast, and above it, the probability of error goes to one exponentially fast.

Practical coding schemes

[Block diagram: Source → Source coder → Channel coder → Channel (noise added) → Channel decoder → Source decoder → Destination.]

Example: channel coding

[With permission from David J.C. MacKay]

Rate?
R = # source bits / # coded bits

Use a repetition code of rate R = 1/n:  0 → 000...0,  1 → 111...1

Decoder? Majority vote

Probability of error? [n = 2m+1]

Pe = Σ_{i=m+1}^{n} (n choose i) f^i (1 - f)^{n-i}

Need n → ∞ for reliable communication!

[With permission from David J.C. MacKay]
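A quick numerical check of the error-probability formula above; the helper function and the value f = 0.1 are illustrative assumptions.

from math import comb

def repetition_error_prob(n, f):
    """P(majority-vote decoding fails) for a length-n repetition code on a BSC(f), n = 2m+1."""
    m = (n - 1) // 2
    return sum(comb(n, i) * f**i * (1 - f) ** (n - i) for i in range(m + 1, n + 1))

f = 0.1
for n in (1, 3, 5, 11, 61):
    print(n, repetition_error_prob(n, f))
# Pe -> 0 as n grows (~2.8e-2 at n=3, ~8.6e-3 at n=5), but the rate R = 1/n -> 0 as well.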

Channel capacity

Is capacity R = 0?

No! Just need better coding!

Now, we're more interested in determining capacity than in finding codes that achieve it.

Benchmarks

Practical coding schemes

Linear block codes

Properties of linear block codes

Properties of linear block codes

Examples

Hamming codes

We can extend the idea of parity checks to allow for more than one parity check bit and to allow the parity checks to depend on various subsets of the information bits. The Hamming code that we describe below is an example of a parity check code. We describe it using some simple ideas from linear algebra.

To illustrate the principles of Hamming codes, we consider a binary code of block length 7. All operations will be done modulo 2. Consider the set of all nonzero binary vectors of length 3. Arrange them in columns to form a matrix:

     [ 0 0 0 1 1 1 1 ]
H =  [ 0 1 1 0 0 1 1 ]                   (7.117)
     [ 1 0 1 0 1 0 1 ]

# of codewords?

Consider the set of vectors of length 7 in the null space of H (the vectors which when multiplied by H give 000). From the theory of linear spaces, since H has rank 3, we expect the null space of H to have dimension 4. These 2^4 codewords are

0000000  0100101  1000011  1100110
0001111  0101010  1001100  1101001
0010110  0110011  1010101  1110000
0011001  0111100  1011010  1111111

Since the set of codewords is the null space of a matrix, it is linear in the sense that the sum of any two codewords is also a codeword. The set of codewords therefore forms a linear subspace of dimension 4 in the vector space of dimension 7. The first k bits of the codeword represent the message, and the last n - k bits are parity check bits. Such a code is called a systematic code. The code is often identified by its block length n, the number of information bits k and the minimum distance d. For example, the above code is called a (7,4,3) Hamming code (i.e., n = 7, k = 4, and d = 3).
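A minimal sketch of this construction in code (an illustration, not from the text): enumerate the null space of H by brute force, then correct a single bit error by syndrome decoding, using the fact that the syndrome of a single error equals the corresponding column of H.

import numpy as np
from itertools import product

# Parity check matrix: columns are all nonzero binary vectors of length 3, Eq. (7.117)
H = np.array([[0, 0, 0, 1, 1, 1, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [1, 0, 1, 0, 1, 0, 1]])

# The code is the null space of H over GF(2): brute-force all 2^7 binary vectors of length 7.
codewords = [np.array(v) for v in product([0, 1], repeat=7)
             if not (H @ np.array(v) % 2).any()]
print(len(codewords))                     # 16 = 2^4 codewords, as expected

def correct_single_error(y):
    """Syndrome decoding: a single bit error in position j gives syndrome H[:, j]."""
    s = H @ y % 2
    if s.any():
        j = next(j for j in range(7) if np.array_equal(H[:, j], s))   # locate the error
        y = y.copy()
        y[j] ^= 1                         # flip the erroneous bit
    return y

c = codewords[5]                          # any codeword
y = c.copy()
y[2] ^= 1                                 # introduce one bit error
print(np.array_equal(correct_single_error(y), c))   # True: the single error is corrected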


An easy way to see how Hamming codes work is by means of a Venn
diagram. Consider the following Venn diagram with three circles and with
four intersection regions as shown in Figure 7.10. To send the information
sequence 1101, we place the 4 information bits in the four intersection
regions as shown in the figure. We then place a parity bit in each of the
three remaining regions so that the parity of each circle is even (i.e., there
are an even number of 1s in each circle). Thus, the parity bits are as
shown in Figure 7.11.
Now assume that one of the bits is changed; for example one of the
information bits is changed from 1 to 0 as shown in Figure 7.12. Then
the parity constraints are violated for two of the circles (highlighted in the
figure), and it is not hard to see that given these violations, the only single
bit error that could have caused it is at the intersection of the two circles
(i.e., the bit that was changed). Similarly working through the other error
cases, it is not hard to see that this code can detect and correct any single
bit error in the received codeword.
We can easily generalize this procedure to construct larger matrices H. In general, if we use l rows in H, the code that we obtain will have block length n = 2^l - 1, k = 2^l - l - 1, and minimum distance 3. All these codes are called Hamming codes and can correct one error.

A curiosity: Venn diagrams + Hamming codes

Hamming codes can correct up to 1 error.

FIGURE 7.10. Venn diagram with information bits.
FIGURE 7.11. Venn diagram with information bits and parity bits with even parity for each circle.
FIGURE 7.12. Venn diagram with one of the information bits changed.

Hamming codes are the simplest examples of linear parity check codes.
They demonstrate the principle that underlies the construction of other
linear codes. But with large block lengths it is likely that there will be
more than one error in the block. In the early 1950s, Reed and Solomon


...the block as a 1; otherwise, we decode it as 0. An error occurs if and only if more than three of the bits are changed. By using longer repetition codes, we can achieve an arbitrarily low probability of error. But the rate of the code also goes to zero with block length, so even though the code is simple, it is really not a very useful code.

Instead of simply repeating the bits, we can combine the bits in some intelligent fashion so that each extra bit checks whether there is an error in some subset of the information bits. A simple example of this is a parity check code. Starting with a block of n - 1 information bits, we choose the nth bit so that the parity of the entire block is 0 (the number of 1s in the block is even). Then if there is an odd number of errors during the transmission, the receiver will notice that the parity has changed and detect the error. This is the simplest example of an error-detecting code. The code does not detect an even number of errors and does not give any information about how to correct the errors that occur.

Achieving capacity

Linear block codes: not good enough...
Convolutional codes: not good enough...
Turbo codes: in 1993, Berrou et al. considered two interleaved convolutional codes with parallel cooperative decoders. Achieved close to capacity!
LDPC codes: Low-Density Parity-Check codes, introduced by Gallager in his 1963 thesis, later kept alive by Michael Tanner (UIC's provost!) in the 80s and then rediscovered in the 90s, when an iterative message-passing decoding algorithm was formulated. They also achieve close to capacity!

An excellent survey (linked to on the course website):
Forney, G.D. and Costello, D.J., "Channel Coding: The Road to Channel Capacity," Proceedings of the IEEE, vol. 95, no. 6, pp. 1150-1177, June 2007. ISSN: 0018-9219. DOI: 10.1109/JPROC.2007.895188.

Feedback capacity

Channel WITHOUT feedback:
[Block diagram: message → encoder → channel → decoder → estimate of the message.]

Channel WITH feedback:
[Block diagram: message → encoder → channel → decoder → estimate of the message, with the channel output fed back to the encoder.]
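The punchline, already stated in the erasure-channel discussion above: feedback does not increase the capacity of a discrete memoryless channel,

C_FB = max_{p(x)} I(X; Y) = C.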

Source-channel separation
When are we allowed to design the source and channel coder separately AND
remain optimal from an end-to-end perspective?
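A compact statement of the source-channel separation theorem (as in Cover & Thomas): a stationary ergodic source V can be transmitted over a discrete memoryless channel with error probability tending to zero if H(V) < C, and cannot if H(V) > C. So nothing is lost by first compressing the source to about H(V) bits per symbol and then protecting those bits with a capacity-approaching channel code.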
[Block diagrams: joint design (Source → Encoder → Channel, with noise, → Decoder → Destination) versus separate design (Source → Source coder → Channel coder → Channel, with noise, → Channel decoder → Source decoder → Destination).]

Source-channel separation

Source-channel separation: achievability

[Block diagram: Source → Encoder → Channel → Decoder → Destination.]

Source-channel separation: converse

Source-channel separation

[Block diagrams as above: joint source-channel coding versus separate source and channel coding.]
