
CS301 Data Structure-Assign_04- by Muhammad Ishfaq

Assignment 4: Idea solution with reference and theory
Detail of Assignment:
Objective: Understanding Huffman Coding
Total Marks: 30
Due Date: 28/06/2011
Lectures Covered: 25-27


Important Note: The Huffman codes for the same sentence may legitimately differ from one student to another, depending on how each student constructs the Huffman tree.

Consider the following phrase:

Welcome to Virtual University

There are 29 characters in it in total: 26 letters and 3 spaces.

Draw the frequency table [5 marks]
Draw the Huffman tree [10 marks]
Determine the Huffman code for each character [5 marks]
Encode the above phrase using the Huffman codes calculated above [5 marks]
Encode the above phrase using ASCII codes [5 marks]

Solution:

Some important questions

What is meant by a frequency table? This table shows how many times a number or character is repeated in a given data set or line of characters.

Frequency Table:

Draw the Huffman encoding tree (step by step), with a brief description, for the following sentence:

"go go gophers"

Also draw the frequency table and write the Huffman code for each character.


Answer:
The frequency table of the sentence:
char   frequency
'g'    3
'o'    3
'p'    1
'h'    1
'e'    1
'r'    1
's'    1
' '    2
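A frequency table like the one above can also be produced mechanically. The following is a minimal Python sketch, assuming the sentence is available as a plain string; the function name frequency_table is made up for illustration and is not part of the assignment.

```python
from collections import Counter


def frequency_table(text):
    """Return (character, count) pairs, most frequent first."""
    counts = Counter(text)  # counts every character, including spaces
    # sorted() is stable, so characters with equal counts stay in first-seen order
    return sorted(counts.items(), key=lambda pair: -pair[1])


table = frequency_table("go go gophers")
for char, count in table:
    print(repr(char), count)
```

The counts add up to 13, the length of the sentence, which is a quick sanity check on any hand-made table.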

Initially we have the forest shown below. The nodes are shown with a weight/count that represents the number of times the node's character occurs.

We pick two minimal nodes. There are five nodes with the minimal weight of one, so it doesn't matter which two we pick. In a program, the deterministic aspects of the program will dictate which two are chosen, e.g., the first two in an array, or the elements returned by a priority queue implementation. We create a new tree whose root is weighted by the sum of the weights chosen. We now have a forest of seven trees as shown here:

Choosing two minimal trees yields another tree with weight two as shown below. There are now six trees in the forest of trees that will eventually build an encoding tree.

Again we must choose the two trees of minimal weight. The lowest weight is the 'e'-node/tree with weight equal to one. There are three trees with weight two; we can choose any of these to create a new tree whose weight will be three.

Now there are two trees with weight equal to two. These are joined into a new tree whose weight is four. There are four trees left, one whose weight is four and three with a weight of three.

Two minimal (three weight) trees are joined into a tree whose weight is six. In the diagram below we choose the 'g' and 'o' trees (we could have chosen the 'g' tree and the space-'e' tree or the 'o' tree and the space-'e' tree.) There are three trees left.

The minimal trees have weights of three and four; these are joined into a tree whose weight is seven leaving two trees.


Finally, the last two trees are joined into a final tree whose weight is thirteen, the sum of the two weights six and seven.
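The merging steps described above can be sketched with a priority queue. This is only an illustration under stated assumptions: the tuple layout for nodes, the name build_huffman_tree and the sequence-number tie-break are choices made here, not something taken from the lecture handouts.

```python
import heapq
from itertools import count


def build_huffman_tree(freqs):
    """Merge the two lightest trees until a single tree remains.

    A leaf is (weight, char); an internal node is (weight, left, right).
    """
    tiebreak = count()  # unique sequence number, so the heap never compares trees
    heap = [(w, next(tiebreak), (w, ch)) for ch, w in freqs.items()]
    heapq.heapify(heap)
    merge_weights = []
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # lightest tree
        w2, _, right = heapq.heappop(heap)  # second lightest tree
        merged = (w1 + w2, left, right)
        merge_weights.append(w1 + w2)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2], merge_weights


freqs = {'g': 3, 'o': 3, ' ': 2, 'p': 1, 'h': 1, 'e': 1, 'r': 1, 's': 1}
root, merge_weights = build_huffman_tree(freqs)
print(root[0])        # weight of the final tree
print(merge_weights)  # weight of each newly created tree, in order
```

For "go go gophers" the new trees have weights 2, 2, 3, 4, 6, 7 and 13, the same sequence as in the walkthrough. Which characters end up in which subtree depends on the tie-break, so the shape of the tree may differ from the diagrams.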

The character encoding induced by the last tree is shown below where again, 0 is used for left edges and 1 for right edges.
char   binary
'g'    00
'o'    01
'p'    1110
'h'    1101
'e'    101
'r'    1111
's'    1100
' '    100
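To see the code table in use, here is a small sketch that encodes and decodes the sentence with exactly the codes listed above. The helper names encode and decode are illustrative. Decoding bit by bit works because no code in the table is a prefix of another.

```python
CODES = {'g': '00', 'o': '01', 'p': '1110', 'h': '1101',
         'e': '101', 'r': '1111', 's': '1100', ' ': '100'}


def encode(text, codes):
    """Concatenate the code of every character."""
    return ''.join(codes[ch] for ch in text)


def decode(bits, codes):
    """Read bits until they match a code, emit the character, start again."""
    reverse = {code: ch for ch, code in codes.items()}
    out, current = [], ''
    for bit in bits:
        current += bit
        if current in reverse:
            out.append(reverse[current])
            current = ''
    return ''.join(out)


text = "go go gophers"
bits = encode(text, CODES)
print(bits, len(bits))
print(decode(bits, CODES))
```

The encoded sentence is 37 bits long, compared with 13 x 8 = 104 bits when every character takes one 8-bit ASCII code.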


The frequency table of the sentence:

Sr   Character   Frequency
1    W           1
2    e           3
3    l           2
4    c           1
5    m           1
6    t           3
7    o           2
8    V           1
9    i           3
10   r           2
11   u           1
12   a           1
13   U           1
14   n           1
15   v           1
16   s           1
17   y           2
18   SP          3
     Total       30

This table contains an error that I introduced deliberately. Please check and correct it.

What is meant by a frequency table? (It shows how many times each character occurs in the sentence; for example, "u" occurs 2 times in "virtual university".)

How to construct a frequency table? (See above; its construction is very simple.)

What is the ASCII code table and what is it used for? (In my 1st semester I wrote a program that prints all the ASCII codes, and I have attached that .cpp file here. ASCII is the numeric representation of every character, the space bar, or any other key on the keyboard. Its range is 0-127, whereas extended ASCII covers 0-255.)

Huffman Tree:

What is meant by a Huffman table? (A Huffman table is nothing but a frequency table.)

How to construct a Huffman table? (If you would like to ask how to construct a Huffman tree, then see........)


(Just like a binary tree: (i) make a node for every character and mark its frequency; (ii) group the nodes that have the same frequency; (iii) build the tree by joining nodes under a blank root node. The whole process is covered in Lectures 25 and 26 of the handouts.)

Where is the Huffman table used? (There is no Huffman table as such, only the Huffman CODE. To determine the Huffman code, label every link ON THE HUFFMAN TREE: 1 for the link to the right of a node and 0 for the link to the left. Then traverse from the ROOT NODE down to the required character.)

Huffman Code for each Character: (As described above, traverse from the root node to the required node.)
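The root-to-leaf traversal can be sketched as a short recursive function. The example below reuses the "go go gophers" tree from earlier, written as nested pairs (left, right) with characters at the leaves. That representation and the name assign_codes are assumptions made for this illustration.

```python
def assign_codes(tree, prefix=""):
    """Walk from the root: add '0' for a left edge and '1' for a right edge."""
    if isinstance(tree, str):            # a leaf holds a single character
        return {tree: prefix or "0"}     # "0" covers a one-character tree
    left, right = tree
    codes = assign_codes(left, prefix + "0")
    codes.update(assign_codes(right, prefix + "1"))
    return codes


# Final "go go gophers" tree as nested (left, right) pairs
gophers_tree = (('g', 'o'), ((' ', 'e'), (('s', 'h'), ('p', 'r'))))
codes = assign_codes(gophers_tree)
for char, code in codes.items():
    print(repr(char), code)
```

The path to a character is its code, so characters near the root (here 'g' and 'o') get the shortest codes.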

Huffman encoding of given Phrase: (Taking "Virtual University" as an example, put the codes of the individual characters one after another: the code of V, then i, then r, then t, and so on.)

ASCII encoding of given phrase: (Just put all the ASCII codes in order: the ASCII code for V, then i, then r, and so on. The ASCII codes can be obtained by running the attached program named ASCII.cpp.)

Character   Frequency
e           3
t           3
i           3
SP          3
l           2
o           2
r           2
W           1
c           1
m           1
V           1
u           1
a           1
U           1
n           1
v           1
s           1
y           1

Initially we have the forest shown below. The nodes are shown with a weight/count that represents the number of times the node's character occurs.

e 3   t 3   i 3   SP 3   l 2   o 2   r 2   s 1   y 1   v 1   W 1   c 1   m 1   V 1   u 1   a 1   U 1   n 1
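The frequencies listed above can be cross-checked against the phrase itself. This is a small sketch in which the dictionary is typed in by hand from the table, with SP written as a real space character.

```python
from collections import Counter

phrase = "Welcome to Virtual University"

# Frequencies as listed in the table above (SP stands for the space character)
listed = {'e': 3, 't': 3, 'i': 3, ' ': 3, 'l': 2, 'o': 2, 'r': 2,
          'W': 1, 'c': 1, 'm': 1, 'V': 1, 'u': 1, 'a': 1, 'U': 1,
          'n': 1, 'v': 1, 's': 1, 'y': 1}

actual = Counter(phrase)
# Collect every character whose listed count differs from the real count
mismatches = {ch: (listed.get(ch), actual.get(ch))
              for ch in set(listed) | set(actual)
              if listed.get(ch) != actual.get(ch)}

print(sum(listed.values()), len(phrase))  # table total vs. phrase length
print(mismatches)                         # empty when the table is right
```

Both totals come out as 29 (26 letters plus 3 spaces), and upper-case and lower-case letters are counted separately, as in the table.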

We pick two minimal nodes. There are 11 nodes with the minimal weight of one, so it doesn't matter which two we pick. In a program, the deterministic aspects of the program will dictate which two are chosen, e.g., the first two in an array, or the elements returned by a priority queue implementation. We create a new tree whose root is weighted by the sum of the weights chosen. We now have a forest of trees as shown here:
[Figures: the forest after each merge step, ending with the complete tree in which every left edge is labelled 0 and every right edge is labelled 1.]

Hence the required Huffman Tree is given below.

Huffman Code for each Character:

Sr   Character   Huffman Code
1    W           11000
2    e           0000
3    l
4    c
5    m
6    t
7    o
8    V
9    i
10   r
11   u
12   a
13   U
14   n
15   v
16   s
17   y
18   SP          0011

Huffman encoding of given Phrase: I have filled in just two columns; fill in the others with the help of the table above.

Character    W       e       l       c       o       m
Huff. code   11000   0000

Character    e       (SP)    t       o       (SP)    V
Huff. code

Character    i       r       t       u       a       l
Huff. code

Character    (SP)    U       n       i       v       e
Huff. code

Character    r       s       i       t       y
Huff. code

Note: (SP) = blank space
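The total length of the Huffman encoding is the sum of frequency times code length over all characters. As a rough cross-check that needs only the frequencies, the sketch below adds up the weight of every merged tree. It assumes the two lightest trees are always the ones merged, and the function name huffman_total_bits is illustrative.

```python
import heapq


def huffman_total_bits(weights):
    """Total encoded length when the two lightest trees are always merged."""
    heap = list(weights)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged  # each merge adds one bit to every character below it
        heapq.heappush(heap, merged)
    return total


# Frequencies of "Welcome to Virtual University": four 3s, three 2s, eleven 1s
phrase_weights = [3] * 4 + [2] * 3 + [1] * 11
huffman_bits = huffman_total_bits(phrase_weights)
ascii_bits = 8 * sum(phrase_weights)
print(huffman_bits, ascii_bits)
```

Under that assumption the phrase needs 117 bits, against 232 bits for 8-bit ASCII. Individual codes depend on how ties are broken, but the total is the same for any tree built by always merging the two lightest trees.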


ASCII encoding of given phrase:

Character    W          e          l          c          o          m
ASCII code   01010111   01100101   01101100   01100011   01101111   01101101

Character    e          (SP)       t          o          (SP)       V
ASCII code   01100101   00100000   01110100   01101111   00100000   01010110

Character    i          r          t          u          a          l
ASCII code   01101001   01110010   01110100   01110101   01100001   01101100

Character    (SP)       U          n          i          v          e
ASCII code   00100000   01010101   01101110   01101001   01110110   01100101

Character    r          s          i          t          y
ASCII code   01110010   01110011   01101001   01110100   01111001
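The ASCII codes above can be reproduced without looking them up by hand. The following is a Python sketch standing in for the attached ASCII.cpp, which is not reproduced here; the function name to_ascii_binary is hypothetical.

```python
def to_ascii_binary(text):
    """Return the 8-bit binary ASCII code of each character."""
    # ord() gives the numeric code; '08b' pads it to 8 binary digits
    return [format(ord(ch), '08b') for ch in text]


codes = to_ascii_binary("Welcome to Virtual University")
print(' '.join(codes[:7]))  # codes for "Welcome"
print(len(codes) * 8)       # total number of bits
```

Every character costs 8 bits regardless of how often it occurs, so the 29-character phrase takes 232 bits.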
