
Assignment # 6

Huffman coding

Saad Iftikhar Munhal Imran Muhammad Hassan Zia Moeez Aslam Hamza Hashmi junaid Afzal Swatti Ali Tausif Armaghan Ahmed Zohair Fakhar Mohsin Altaf




Instructor: Sir Qasim Umer Khan


Huffman Coding implementation in Matlab

Abstract In this assignment we have implemented the Huffman entropy encoding algorithm for data compression. Extensive testing with different data sets gave acceptable results and confirmed the notion that the more similar the elements of a data set, the better the compression achieved by the Huffman algorithm.

In computer science and information theory, Huffman coding is an entropy encoding algorithm used for lossless data compression. The term refers to the use of a variable-length code table for encoding a source symbol (such as a character in a file), where the variable-length code table has been derived in a particular way based on the estimated probability of occurrence for each possible value of the source symbol. Huffman coding uses a specific method for choosing the representation for each symbol, resulting in a prefix code (sometimes called "prefix-free codes", that is, the bit string representing some particular symbol is never a prefix of the bit string representing any other symbol) that expresses the most common source symbols using shorter strings of bits than are used for less common source symbols.

II. ASSIGNMENT In this assignment we were required to implement the Huffman algorithm in MATLAB.

The MATLAB code follows:

CODE: Class of Huffman code:

%-----huffman coding-------------%
%%%----- version 1--------------%%%
%%- data structure (classes)---%%%%
%%------18-12-2013------------%%%
%%
classdef huffman
    % data values that the node has in its structure
    properties
        leftNode = []
        rightNode = []
        probability
        code = [];
        symbol
        huffy % will store the Huffman code, just for checking
    end
end
%%%%%%%%%%%--------------%%%%%%%%%%%%%%

Code for probability finding of data:

%- calculating frequency of elements--%
%%%%--- Saad Iftikhar-------%%%%%%%%%%%
%%%---- 17 december 2013----%%%%%%%%%%
%% calculate how many same numbers occur
function [data_unique,data_freq]=frequency(data)
    clc;
    % data=[22 33 55 66 11 22 33 44 66];
    data_unique=unique(data);
    % unique creates an array sorted in ascending order that holds
    % only the unique elements; no element is repeated
    data_unique1=zeros(1,length(data_unique));
    for i=1:length(data_unique)
        % number of times the i-th unique element occurs in the data
        data_unique1(i)=sum(data == data_unique(i));
    end
    % normalise the counts so that they become probabilities
    data_freq=data_unique1/sum(data_unique1);
end
%%%%%%%%%%%------------%%%%%%%%%%%%%%

Conversion of data from binary to decimal:

%--- calculating frequency of elements binary version-------------%
%%%%%%--- 20-21 december 2013---%%%%%%
%% convert the data to decimal
function [convData]=dataConv(data,M)

    %%% k always comes out a whole number, not a fraction
    % M=8;
    k=log2(M);
    convData=[];
    remainder=mod(length(data),k);
    % check whether the data length is exactly divisible by k;
    % if it is not, 0 bits are prepended to fill the first group
    if(remainder~=0)
        append=k-remainder;
    else
        append=0;
    end
    data=[zeros(1,append) data];
    for dataLength=1:k:length(data)
        string=num2str(data(dataLength:dataLength+k-1));
        decimal=bin2dec(string);
        convData=[convData decimal];
    end
%%%%%%%%%%%------------%%%%%%%%%%%%%%

Main code of Huffman:

%%%%% main code of huffman coding algo using classes-------%%%%%
%%%%%-- Saad Iftikhar 18-12-2013-%%%%%
%%%---creating binary tree-----%%%%%%
%%%---huffman code using classes---%%%%
%%
function codedData=sourceHuffman(information,M)
    clc;
    clear symbol; clear codeHuff; clear codeBits; clear arrr;
    %% initializing
    global symbol;   % global variable that holds the symbols
    global codeHuff; % global variable that holds the Huffman codes of the symbols
    global codeBits; % global variable that holds the length of each Huffman code
    decodeData=[];
    symbol=[];
    codeHuff=[];
    codeBits=[];
    % hard-coded test values; they override the function arguments
    M=4;
    information=[0 0 1 0 1 0 0 0 0 1];
    % information=randint(1,3000);
    % dataConv converts our data from binary format to decimal,
    % as the rest of the program is written for decimal
    convdata=dataConv(information,M);
    [data,prob]=frequency(convdata);
    %% Empty Array of Object Huffman
    array = huffman.empty(length(prob),0);
    array_final = huffman.empty(length(prob),0);
    %% Assign Initial
    % save the probability of each number in the probability property
    % of the class/structure, together with its symbol
    for i=1:length(data)
        array(i).probability = prob(i);
        array(i).symbol = data(i);
        array_final(i).probability = prob(i);
        array_final(i).symbol = data(i);
    end
    % create a temporary array for the sorting; the algorithm we are
    % using is bubble sort in ascending order
    temparray = array;
    %% Creating the Binary Tree
    % in a binary tree a node/parent has two children; the lower
    % probability one is on the left and the higher one is on the right.
    % To create the tree we have to iterate (number of nodes - 1) times.
    % size(a,2) gives the number of columns, as size always returns a
    % 2-dimensional vector
    for k = 1:size(temparray,2)-1
        % first sort the temp array using bubble sort
        for i=1:size(temparray,2)
            for j = 1:size(temparray,2)-1
                if (temparray(j).probability > temparray(j+1).probability)
                    tempnode = temparray(j); % this is the swapping operation
                    temparray(j) = temparray(j+1);
                    temparray(j+1) = tempnode;
                end
            end
        end
        % now we have to create a new node
        newnode = huffman; % a node of the class huffman
        % add the probabilities: the two lowest-probability nodes are
        % merged into one single node that holds the sum of both
        newnode.probability = temparray(1).probability + temparray(2).probability;
        % assign 0 to the lower-probability node and 1 to the
        % higher-probability node
        temparray(1).code = [0];
        temparray(2).code = [1];
        % attach the children nodes to the new parent node
        newnode.leftNode = temparray(1);
        newnode.rightNode = temparray(2);
        % remove the previous two nodes and replace them by the parent
        % node, just like in C++ we would remove the pointers of the
        % children nodes and replace them by the pointer of the parent
        temparray = temparray(3:size(temparray,2)); % first two nodes are gone
        % now prepend the new parent node
        temparray = [newnode temparray];
    end % end the looping and hence the binary tree is created
    %%
    rootNode = temparray(1); % the root node is always the first node
    le_code = []; % that will be the final Huffman code
    % looping through the tree; see the recursive function traverse
    final_data=[]; % variable definition, used later
    check=huffman;
    % here we generate the Huffman codes: traverse walks the whole tree
    f=traverse(rootNode,le_code);
    % here is the loop for replacing the data with its Huffman code
    for codeLength=1:length(convdata)
        for inner=1:length(symbol)
            if(convdata(codeLength)==symbol(inner))
                level=sum(codeBits(1:inner-1))+1;
                final_data=[final_data codeHuff(level:level+codeBits(inner)-1)];
            end
        end
    end
    codedData=final_data;
%%%%%%%%%%%----------------%%%%%%%%%%%%%%

Traversal code for code generation:

%%%%%----------- function for traversal of the binary tree-------%%%%%
%%%%%---- 20-12-2013------------%%%%%
%%%--algorithm for traversing the tree whole to get the code-----%%%%%%%

function f = traverse(tempNode,codec)
    % these global variables store our symbols and codes, because local
    % variables are continuously overwritten in recursive functions
    global symbol;
    global codeHuff;
    global codeBits;
    if ~isempty(tempNode) % check whether there is a next node
        codec = [codec tempNode.code]; % append to the code of the previous node
        if ~isempty(tempNode.symbol)
            % disp(tempNode.symbol);
            tempNode.huffy=[codec];
            % disp(codec);
            symbol=[symbol tempNode.symbol];
            codeHuff=[codeHuff codec];
            codeBits=[codeBits length(codec)];
        end
        traverse(tempNode.leftNode,codec);
        traverse(tempNode.rightNode,codec);
    end
    f=codec;
end
%%%%%%%%%-------------%%%%%%%%%%%%%%

Code for decoding of Huffman:

%---Huffman decoding algorithm---------%
%%%%%------- Saad Iftikhar----%%%%%%%%%%
%-- 21-12-2013------------------------%

%%
% function decodedData=decodeHuffman(data,M,rootNode)
% let's traverse the data and create the decoded string using the structures
function [realdata,olright] = dHuffman(tempNode,data,M)
    %% traverse the tree; when we reach a leaf we append the value of the leaf node to the data vector
    decoded=[]; % defining variables
    realdata=[];
    i=1;
    k=1;
    centerNode=tempNode; % variable of class huffman
    while(k<length(data)+1) % outer loop, one decoded symbol per pass
        flag=0; % flag to terminate the inner loop
        centerNode=tempNode; % temporary node, restart at the root
        while (flag==0 && i<=length(data)) % loop condition
            if (data(i)==0) % if data is 0 check the left child
                if ~isempty(centerNode.leftNode)
                    % decoded=[decoded tempNode.leftNode.code]; % this was only
                    % checking that we are traversing correctly
                    i=i+1;
                    centerNode=centerNode.leftNode;
                else
                    realdata=[realdata centerNode.symbol];
                    flag=1;
                    k=i;
                    break;
                end
            end
            if (i>length(data)) % the previous bit was the last one, so this node's symbol is decoded
                realdata=[realdata centerNode.symbol];
                flag=1;
                k=i+1;
                break
            end
            if(data(i)==1) % data is 1, so check the right child
                if ~isempty(centerNode.rightNode)
                    % decoded=[decoded tempNode.rightNode.code]; % to check if we
                    % are traversing correctly
                    i=i+1;
                    centerNode=centerNode.rightNode;
                else
                    realdata=[realdata centerNode.symbol];
                    flag=1;
                    k=i;
                    break;
                end
            end
            if (i>length(data)) % if the data is over, this is the leaf
                realdata=[realdata centerNode.symbol];
                flag=1;
                k=i+1;
                break
            end
        end
    end
    %% here we convert the data from decimal back to binary format
    binaryReal=dec2bin(realdata,log2(M));
    binaryReal1=binaryReal';
    binaryRealFinal=binaryReal1(:);
    binaryRealFinal=binaryRealFinal';
    for loop=1:length(binaryRealFinal)
        olright(loop)=str2double(binaryRealFinal(loop));
    end
end
%%%%%%%%%%%------%%%%%%%%%%%%%%


information is the original data; it is 21 bits long. final_data is the Huffman-compressed data; its size is greatly reduced, to 10 bits. In this case M=8 and k=3, so the 21 bits form 7 three-bit symbols, and the data is quite similar.
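The figures quoted above can be turned into a compression ratio with a few lines of MATLAB. This is only an illustrative sketch: the bit counts are the ones reported in the text for this one test run, and the variable names are assumptions that do not appear in the submitted code.

```matlab
% Bit counts reported in the text for this test run
originalBits   = 21;   % length of information
compressedBits = 10;   % length of final_data
M = 8;                 % alphabet size used in this run
k = log2(M);           % bits per symbol before compression (3)

numSymbols       = originalBits / k;                % 7 symbols
compressionRatio = originalBits / compressedBits;   % 2.1
avgBitsPerSymbol = compressedBits / numSymbols;     % about 1.43 instead of 3
savingPercent    = 100 * (1 - compressedBits / originalBits);

fprintf('ratio %.2f, %.2f bits/symbol, %.1f%% saved\n', ...
        compressionRatio, avgBitsPerSymbol, savingPercent);
```

On average each symbol therefore costs about 1.43 bits after coding instead of the 3 bits of the fixed-length representation.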

This window shows that final_data, the Huffman-encoded data, returns the original data when it is passed to the decoding function: sum(ol==information) returns 21, which means that all 21 bits of the decoded data match the original data.
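The same check can be written slightly more defensively. The sketch below uses stand-in vectors (in the real program ol would come from dHuffman), and the variable names lossless and numMatches are assumptions. It relies on isequal, which returns false when the two vectors differ in length, whereas the == comparison raises a size error in that case. A length difference can happen when dataConv has to prepend padding zeros.

```matlab
% Stand-in vectors for illustration only
information = [0 0 1 0 1 0 0 0 0 1];   % original bits
ol          = information;             % assumed perfect decode

% isequal compares size and content in one step
lossless = isequal(ol, information);

% counting matches is only valid when both vectors have the same length
if numel(ol) == numel(information)
    numMatches = sum(ol == information);
else
    numMatches = NaN;   % lengths differ, element-wise count is undefined
end
```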

IV. CONCLUSION The implementation of the Huffman lossless entropy encoding algorithm has confirmed the notion that when the data has many similar elements, this compression reduces the length of the code and hence increases the amount of useful information sent per bit.
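This conclusion can be made quantitative by comparing the entropy of the source with the average length of the Huffman code. The sketch below is an illustration under stated assumptions: the probabilities and code lengths are a hypothetical textbook example, not data from the tests in this assignment, and the variable names are not taken from the submitted code.

```matlab
% Hypothetical symbol probabilities and the Huffman code lengths they produce
p   = [0.5 0.25 0.125 0.125];
len = [1   2    3     3];

H        = -sum(p .* log2(p));   % source entropy, bits per symbol
Lavg     = sum(p .* len);        % average Huffman code length, bits per symbol
fixedLen = log2(numel(p));       % fixed-length code needs 2 bits per symbol

efficiency = H / Lavg;           % fraction of each sent bit that is information
fprintf('H = %.2f, Lavg = %.2f, fixed = %.0f, efficiency = %.2f\n', ...
        H, Lavg, fixedLen, efficiency);
```

For this skewed distribution the Huffman code needs 1.75 bits per symbol instead of 2. With four equally likely symbols the entropy would be 2 bits and no saving would be possible, which is why data with many repeated elements compresses better.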