Вы находитесь на странице: 1из 6

Hamming and Huffman Coding Tutorial

By Tom S. Lee

Encoding:

Hamming and Huffman codes are completely different tools used by computers. Hamming code is
an error detecting and correcting tool, while Huffman is a compression tool. Both can be used
together, however, to both compress and error detect a message. The following tutorial will show
both methods used together.

Let's begin by compressing the simple message "codes_are_cool". The first thing we need to do is
detect which letters are being used in the message, and their frequency. The list is as follows:

c 2/14
o 3/14
d 1/14
e 2/14
s 1/14
_ 2/14
a 1/14
r 1/14
l 1/14
14/14

We then make a tree graph, beginning at the bottom and working our way up. We start with the
lowest frequency numbers and combine them to one node. The graph is as follows:

Next, we read our codes from the top down (to keep it prefix free) and encode our characters:

c 011
o 11
d 0000
e 100
s 0001
_ 010
a 0010
r 0011
l 101

We can now convert our code:


0111100001000001010001000111000100111111101

Now this is our Huffman code, but what about hamming? we must now break this code into byte
sized chunks. I will break it down into 4 bit sections:

0111
1000
0100
0001
0100
0100
0111
0001
0011
1111
101

For each section, I need to find the check bits. First set up a table like the following:

0 0 0 0 0 0 0 1
0 0 0 1 1 1 1 0
0 1 1 0 0 1 1 0
1 0 1 0 1 0 1 0

and every colomn that has a single 1, put a check bit. Everywhere else, put a data bit:

0 0 0 0 0 0 0 1
0 0 0 1 1 1 1 0
0 1 1 0 0 1 1 0
1 0 1 0 1 0 1 0
C1 C2 D1 C3 D2 D3 D4 C4

Now input the first data into our table:

0 0 0 0 0 0 0 1
0 0 0 1 1 1 1 0
0 1 1 0 0 1 1 0
1 0 1 0 1 0 1 0
C1 C2 D1 C3 D2 D3 D4 C4
0 1 1 1

We must now figure out what our check bits are. Let's leave off C4 for simplicity. Look in the row
where C1 has a 1. Every place that has a 1 needs to be xor'd together. This will give us the value
of that check bit. We do this in each row. For example:
C1 = D1 xor D2 xor D4
C1 = 0 xor 1 xor 1
C1 = 0

C2 = D1 xor D3 xor D4
C2 = 0 xor 1 xor 1
C2 = 0

C3 = D2 xor D3 xor D4
C3 = 1 xor 1 xor 1
C3 = 1
So our Hamming code of 0111 becomes:

0001111

And thats all there is to it. Lets try the second code:

0 0 0 0 0 0 0 1
0 0 0 1 1 1 1 0
0 1 1 0 0 1 1 0
1 0 1 0 1 0 1 0
C1 C2 D1 C3 D2 D3 D4 C4
1 0 0 0

C1 = D1 xor D2 xor D4
C1 = 1 xor 0 xor 0
C1 = 1

C2 = D1 xor D3 xor D4
C2 = 1 xor 0 xor 0
C2 = 1

C3 = D2 xor D3 xor D4
C3 = 0 xor 0 xor 0
C3 = 0

Our resulting code is:

1110000

We can go through all of these in the same manner and obtain a correct code for each, but I will
leave that as an exercise for the reader.

Decoding:

For this part of the tutorial, I will use a different example, as the previous is quite an exhaustive
amount of work. Here is the Huffman code that is given:

d 10
g 110
o 0
_ 111

The hamming codes that were given are as follows (ignore the 8th bit in each):

0 1 0 1 1 0 0 0
0 1 0 0 1 1 1 0
0 1 1 0 1 1 0 0
1 1 0 0 0 1 0 0

So we create the same table that we did in the encoding part and place the first byte of code in it:

0 0 0 0 0 0 0 1
0 0 0 1 1 1 1 0
0 1 1 0 0 1 1 0
1 0 1 0 1 0 1 0
C1 C2 D1 C3 D2 D3 D4 C4
0 1 0 1 1 0 0 0

Now similarly to encoding, we must xor all bits whose row corresponds to the check bit in question.
The difference is, this time we must also xor in the check bit. The result should be zero. If not, an
error has occurred in transmission.

A1 = C1 xor D1 xor D2 xor D4


A1 = 0 xor 0 xor 1 xor 0
A1 = 1

A2 = C2 xor D1 xor D3 xor D4


A2 = 1 xor 0 xor 0 xor 0
A2 = 1

A3 = C3 xor D2 xor D3 xor D4


A3 = 1 xor 1 xor 0 xor 0
A3 = 0

We can see that all three A's are not equal to zero. This lets us know that there is an error in the
code. It also tells us where exactly the error is. If you read the errors in reverse order, you will have
the binary location of the error. Looking at this error, we can see there is an error in location three.
We simply flip the bit, and we get the Huffman code:

1100

Now we try the second byte.

0 0 0 0 0 0 0 1
0 0 0 1 1 1 1 0
0 1 1 0 0 1 1 0
1 0 1 0 1 0 1 0
C1 C2 D1 C3 D2 D3 D4 C4
0 1 0 0 1 1 1 0

A1 = C1 xor D1 xor D2 xor D4


A1 = 0 xor 0 xor 1 xor 1
A1 = 0

A2 = C2 xor D1 xor D3 xor D4


A2 = 1 xor 0 xor 1 xor 1
A2 = 1

A3 = C3 xor D2 xor D3 xor D4


A3 = 0 xor 1 xor 1 xor 1
A3 = 1

We can see there is an error in location 6 so we flip the bit and get the huffman code:

0101

Now we try byte three

0 0 0 0 0 0 0 1
0 0 0 1 1 1 1 0
0 1 1 0 0 1 1 0
1 0 1 0 1 0 1 0
C1 C2 D1 C3 D2 D3 D4 C4
0 1 1 0 1 1 0 0

A1 = C1 xor D1 xor D2 xor D4


A1 = 0 xor 1 xor 1 xor 0
A1 = 0

A2 = C2 xor D1 xor D3 xor D4


A2 = 1 xor 1 xor 1 xor 0
A2 = 1

A3 = C3 xor D2 xor D3 xor D4


A3 = 0 xor 1 xor 1 xor 0
A3 = 0

So there is an error in bit two. Bit two is a check bit, so it has no effect on the huffman code, which
is:

1110

For the last bit, we get:

0 0 0 0 0 0 0 1
0 0 0 1 1 1 1 0
0 1 1 0 0 1 1 0
1 0 1 0 1 0 1 0
C1 C2 D1 C3 D2 D3 D4 C4
1 1 0 0 0 1 0 0

A1 = C1 xor D1 xor D2 xor D4


A1 = 1 xor 0 xor 0 xor 0
A1 = 1

A2 = C2 xor D1 xor D3 xor D4


A2 = 1 xor 0 xor 1 xor 0
A2 = 0

A3 = C3 xor D2 xor D3 xor D4


A3 = 0 xor 0 xor 1 xor 0
A3 = 1

So there is an error in bit 5. Simply by flipping the bit, we get the huffman code:

0110

Now that we have corrected all the errors in the code, we can begin to decompress it. The code
reads:

1100 0101 1110 0110

Now looking at the huffman chart, we can decipher this code:

d 10
g 110
o 0
_ 111

110 g
0 o
0 o
10 d
111 _
10 d
0 o
110 g

So our message reads:

good_dog

Вам также может понравиться