
Why AVL Trees?

Most of the BST operations (e.g., search, max, min, insert, delete) take O(h) time, where h is the height
of the BST. For a skewed binary tree, the cost of these operations may become O(n). If we make sure that
the height of the tree remains O(log n) after every insertion and deletion, then we can guarantee an upper
bound of O(log n) for all these operations. The height of an AVL tree is always O(log n), where n is the number
of nodes in the tree.

Insertion
To make sure that the given tree remains AVL after every insertion, we must augment the standard BST
insert operation to perform some re-balancing. The following two basic operations can be performed
to re-balance a BST without violating the BST property (keys(left) < key(root) < keys(right)):
1) Left Rotation
2) Right Rotation
T1, T2 and T3 are subtrees of the tree rooted with y (on the left side) or x (on the right side).

         y                               x
        / \     Right Rotation          / \
       x   T3   - - - - - - - >       T1   y
      / \       < - - - - - - -           / \
    T1   T2     Left Rotation           T2   T3

Keys in both of the above trees follow the order keys(T1) < key(x) < keys(T2) < key(y) < keys(T3),
so the BST property is not violated anywhere.

Steps to follow for insertion

Let the newly inserted node be w.
1) Perform the standard BST insert for w.
2) Starting from w, travel up and find the first unbalanced node. Let z be the first unbalanced node, y
be the child of z on the path from w to z, and x be the grandchild of z on that path.
3) Re-balance the tree by performing appropriate rotations on the subtree rooted with z. There are
4 possible cases that need to be handled, since x, y and z can be arranged in 4 ways:
a. y is the left child of z and x is the left child of y (Left Left Case)
b. y is the left child of z and x is the right child of y (Left Right Case)
c. y is the right child of z and x is the right child of y (Right Right Case)
d. y is the right child of z and x is the left child of y (Right Left Case)
Following are the operations to be performed in the above 4 cases. In all of the cases, we only need
to re-balance the subtree rooted with z, and the complete tree becomes balanced, because the height of the
subtree rooted with z (after the appropriate rotations) becomes the same as it was before the insertion.
a) Left Left Case
T1, T2, T3 and T4 are subtrees.

          z                                      y
         / \                                   /   \
        y   T4      Right Rotate (z)          x     z
       / \          - - - - - - - - ->       / \   / \
      x   T3                               T1  T2 T3  T4
     / \
   T1   T2
b) Left Right Case

     z                             z                              x
    / \                           / \                           /   \
   y   T4   Left Rotate (y)      x   T4   Right Rotate (z)     y     z
  / \       - - - - - - - ->    / \       - - - - - - - ->    / \   / \
T1   x                         y   T3                       T1  T2 T3  T4
    / \                       / \
  T2   T3                   T1   T2
c) Right Right Case

    z                                  y
   / \                               /   \
 T1    y      Left Rotate (z)       z     x
      / \     - - - - - - - ->     / \   / \
    T2    x                      T1  T2 T3  T4
         / \
       T3   T4
d) Right Left Case

   z                             z                              x
  / \                           / \                           /   \
T1   y    Right Rotate (y)    T1   x      Left Rotate (z)    z     y
    / \   - - - - - - - - ->      / \     - - - - - - - ->  / \   / \
   x   T4                       T2   y                    T1  T2 T3  T4
  / \                               / \
T2   T3                           T3   T4
Implementation
Following is the implementation of AVL tree insertion. It uses the recursive BST insert to add the
new node; as the recursion unwinds after insertion, we get pointers to all ancestors one by one in a
bottom-up manner. So we don't need a parent pointer to travel up: the recursive code itself travels up
and visits all the ancestors of the newly inserted node.

1) Perform the normal BST insertion.
2) The current node must be one of the ancestors of the newly inserted node. Update the height of
the current node.
3) Get the balance factor (left subtree height – right subtree height) of the current node.
4) If the balance factor is greater than 1, the current node is unbalanced and we are in either the
Left Left case or the Left Right case. To check which, compare the newly inserted key
with the key in the left subtree's root.
5) If the balance factor is less than -1, the current node is unbalanced and we are in either the
Right Right case or the Right Left case. To check which, compare the newly inserted
key with the key in the right subtree's root.

// C program to insert a node in an AVL tree
#include <stdio.h>
#include <stdlib.h>

// An AVL tree node
struct Node {
    int key;
    struct Node *left;
    struct Node *right;
    int height;
};

// A utility function to get the height of the tree
int height(struct Node *N) {
    if (N == NULL)
        return 0;
    return N->height;
}

// A utility function to get the maximum of two integers
int max(int a, int b) {
    return (a > b) ? a : b;
}

/* Helper function that allocates a new node with the given key and
   NULL left and right pointers. */
struct Node* newNode(int key) {
    struct Node* node = (struct Node*) malloc(sizeof(struct Node));
    node->key = key;
    node->left = NULL;
    node->right = NULL;
    node->height = 1; // new node is initially added as a leaf
    return node;
}

// A utility function to right rotate the subtree rooted with y.
// See the diagram given above.
struct Node *rightRotate(struct Node *y) {
    struct Node *x = y->left;
    struct Node *T2 = x->right;

    // Perform rotation
    x->right = y;
    y->left = T2;

    // Update heights
    y->height = max(height(y->left), height(y->right)) + 1;
    x->height = max(height(x->left), height(x->right)) + 1;

    // Return new root
    return x;
}

// A utility function to left rotate the subtree rooted with x.
// See the diagram given above.
struct Node *leftRotate(struct Node *x) {
    struct Node *y = x->right;
    struct Node *T2 = y->left;

    // Perform rotation
    y->left = x;
    x->right = T2;

    // Update heights
    x->height = max(height(x->left), height(x->right)) + 1;
    y->height = max(height(y->left), height(y->right)) + 1;

    // Return new root
    return y;
}

// Get the balance factor of node N
int getBalance(struct Node *N) {
    if (N == NULL)
        return 0;
    return height(N->left) - height(N->right);
}

// Recursive function to insert a key in the subtree rooted with node;
// returns the new root of the subtree.
struct Node* insert(struct Node* node, int key) {
    /* 1. Perform the normal BST insertion */
    if (node == NULL)
        return newNode(key);
    if (key < node->key)
        node->left = insert(node->left, key);
    else if (key > node->key)
        node->right = insert(node->right, key);
    else // Equal keys are not allowed in a BST
        return node;

    /* 2. Update the height of this ancestor node */
    node->height = 1 + max(height(node->left), height(node->right));

    /* 3. Get the balance factor of this ancestor node to check
          whether this node became unbalanced */
    int balance = getBalance(node);

    // If this node becomes unbalanced, then there are 4 cases

    // Left Left Case
    if (balance > 1 && key < node->left->key)
        return rightRotate(node);

    // Right Right Case
    if (balance < -1 && key > node->right->key)
        return leftRotate(node);

    // Left Right Case
    if (balance > 1 && key > node->left->key) {
        node->left = leftRotate(node->left);
        return rightRotate(node);
    }

    // Right Left Case
    if (balance < -1 && key < node->right->key) {
        node->right = rightRotate(node->right);
        return leftRotate(node);
    }

    /* Return the (unchanged) node pointer */
    return node;
}

// A utility function to print a preorder traversal of the tree
void preOrder(struct Node *root) {
    if (root != NULL) {
        printf("%d ", root->key);
        preOrder(root->left);
        preOrder(root->right);
    }
}

/* Driver program to test the above functions */
int main() {
    struct Node *root = NULL;

    /* Constructing the tree given in the above figure */
    root = insert(root, 10);
    root = insert(root, 20);
    root = insert(root, 30);
    root = insert(root, 40);
    root = insert(root, 50);
    root = insert(root, 25);

    /* The constructed AVL tree would be
              30
             /  \
           20    40
          /  \     \
        10    25    50 */

    printf("Preorder traversal of the constructed AVL tree is: \n");
    preOrder(root);
    return 0;
}

Time Complexity: The rotation operations (left and right rotate) take constant time, as only a few pointers
are changed. Updating the height and getting the balance factor also take constant time. So the time
complexity of AVL insert remains the same as BST insert, O(h), where h is the height of the tree.
Since an AVL tree is balanced, the height is O(log n), so the time complexity of AVL insert is O(log n).

Huffman Tree
This problem is that of finding the minimum length bit string which can be used to encode a string of symbols.
One application is text compression:

What's the smallest number of bits (hence the minimum size of file) we can use to store an arbitrary piece
of text?

Huffman's scheme uses a table of frequency of occurrence for each symbol (or character) in the input. This
table may be derived from the input itself or from data which is representative of the input. For instance,
the frequency of occurrence of letters in normal English might be derived from processing a large number
of text documents and then used for encoding all text documents. We then need to assign a variable-length
bit string to each character that unambiguously represents that character. This means that no character's
encoding may be a prefix of another's (a prefix-free code). The characters to be encoded are arranged in a
binary tree:

An encoding for each character is found by following the tree from the root to the character's leaf:
the encoding is the string of symbols on each branch followed.

For example:

String   Encoding
TEA      10 00 010
SEA      011 00 010
TEN      10 00 110

[Figure: encoding tree for E, T, A, S, N, O]

Notes:
1. As desired, the highest-frequency letters, E and T, have two-digit encodings, whereas all the others
have three-digit encodings.
2. Encoding would be done with a lookup table.

A divide-and-conquer approach might have us asking which characters should appear in the left and right
subtrees and trying to build the tree from the top down. As with the optimal binary search tree, this leads
to an exponential-time algorithm.
A greedy approach instead places our n characters in n sub-trees and starts by combining the two least-weight
nodes into a tree whose root node is assigned the sum of the two leaf-node weights as its weight.

Operation of the Huffman algorithm.


The time complexity of the Huffman algorithm is O(n log n): using a heap to store the weight of each tree,
each iteration requires O(log n) time to determine the cheapest weight and insert the new weight, and there
are O(n) iterations, one for each item.

Decoding Huffman-encoded Data


"How do we decode a Huffman-encoded bit string? With these variable length strings, it's not possible to
break up an encoded string of bits into characters!"

The decoding procedure is deceptively simple. Starting with the first bit in the stream, one uses
successive bits from the stream to determine whether to go left or right in the decoding tree. When
we reach a leaf of the tree, we've decoded a character, so we place that character onto the
(uncompressed) output stream. The next bit in the input stream is the first bit of the next character.

Transmission and storage of Huffman-encoded Data

If your system continually deals with data in which the symbols have similar frequencies of occurrence,
then both encoders and decoders can use a standard encoding table/decoding tree. However, even text
data from various sources will have quite different characteristics. For example, ordinary English text will
generally have 'e' nearest the root of the tree, with short encodings for 'a' and 't', whereas C programs would
generally have ';' nearest the root, with short encodings for other punctuation marks such as '(' and ')'
(depending on the number and length of comments!). If the data has variable frequencies, then, for optimal
encoding, we have to generate an encoding tree for each data set and store or transmit the encoding with the
data. The extra cost of transmitting the encoding tree means that we will not gain an overall benefit unless
the data stream to be encoded is quite long, so that the savings through compression more than compensate
for the cost of transmitting the encoding tree.

Sorting
A sorting algorithm is used to rearrange the elements of a given array or list according to a comparison
operator on the elements. The comparison operator decides the new order of the elements in the respective
data structure.
 Selection Sort
 Bubble Sort
 Recursive Bubble Sort
 Insertion Sort
 Recursive Insertion Sort
 Merge Sort
 Iterative Merge Sort
 Quick Sort
 Iterative Quick Sort
 Heap Sort
 Counting Sort
 Radix Sort
 Bucket Sort
 ShellSort
 TimSort
 Comb Sort
 Pigeonhole Sort
 Cycle Sort
 Cocktail Sort
 Strand Sort
 Bitonic Sort
 Pancake Sorting
 Binary Insertion Sort
 BogoSort or Permutation Sort
 Gnome Sort
 Sleep Sort – The King of Laziness / Sorting while Sleeping
 Structure Sorting (By Multiple Rules) in C++
 Stooge Sort
 Tag Sort (to get both sorted and original)
 Tree Sort
 Cartesian Tree Sorting
 Odd-Even Sort / Brick Sort
 QuickSort on Singly Linked List
 QuickSort on Doubly Linked List
 3-Way QuickSort (Dutch National Flag)
 Merge Sort for Linked Lists
 Merge Sort for Doubly Linked List
 3-way Merge Sort
Internal & External Sort
In internal sorting, all the data to be sorted is stored in memory at all times while sorting is in progress.
In external sorting, data is stored outside memory (e.g., on disk) and only loaded into memory in small
chunks. External sorting is usually applied when the data can't fit into memory entirely.

So in internal sorting you can do something like Shell sort: access whatever array elements you want at
whatever moment you want. You can't do that in external sorting; the array is not entirely in memory, so
you can't just randomly access any element, and accessing it randomly on disk is usually extremely slow.
An external sorting algorithm has to load and unload chunks of data in an optimal manner.

Insertion sort
Insertion sort is a to some extent an interesting algorithm with an expensive runtime characteristic having
O(n2). This algorithm can be best thought of as a sorting scheme which can be compared to that of sorting
a hand of playing cards, i.e., you take one card and then look at the rest with the intent of building up an
ordered set of cards in your hand.
Insertion sort algorithm:
1. If it is the first element, it is already sorted.
2. Pick the next element.
3. Compare it with all the elements in the sorted sub-list.
4. Shift all the elements in the sorted sub-list that are greater than the value to be inserted.
5. Insert the value.
6. Repeat until the list is sorted.
Important characteristics of insertion sort:
 It is efficient for smaller data sets, but very inefficient for larger lists.
 Insertion sort is adaptive: it reduces its total number of steps if given a partially sorted list,
which increases its efficiency.
 Its space overhead is small: insertion sort requires only a single additional memory location.
 The overall time complexity of insertion sort is O(n²).

Input elements: 89 17 8 12 0
Step 1: [89] 17 8 12 0 (the bracketed elements form the sorted sub-list, the rest the unsorted sub-list)
Step 2: [17 89] 8 12 0 (each element is removed from the unsorted list and placed at the right position
in the sorted list)
Step 3: [8 17 89] 12 0
Step 4: [8 12 17 89] 0
Step 5: [0 8 12 17 89]

C Program – Insertion Sort implementation

#include <stdio.h>

int main() {
    /* Here i & j are loop counters, temp is for swapping, count is the
       total number of elements, and number[] stores the input numbers.
       You can increase or decrease the size of the number array as per
       requirement. */
    int i, j, count, temp, number[25];

    printf("How many numbers are you going to enter?: ");
    scanf("%d", &count);
    printf("Enter %d elements: ", count);

    // This loop stores the input numbers in the array
    for (i = 0; i < count; i++)
        scanf("%d", &number[i]);

    // Implementation of the insertion sort algorithm
    for (i = 1; i < count; i++) {
        temp = number[i];
        j = i - 1;
        /* Note: j >= 0 must be tested first; otherwise number[-1] would
           be read when temp is smaller than every sorted element */
        while ((j >= 0) && (temp < number[j])) {
            number[j + 1] = number[j];
            j = j - 1;
        }
        number[j + 1] = temp;
    }

    printf("Order of sorted elements: ");
    for (i = 0; i < count; i++)
        printf(" %d", number[i]);
    return 0;
}

Selection or exchange sort

The selection sort algorithm starts by comparing the first two elements of an array and swapping them if
necessary: if you want to sort the elements in ascending order and the first element is greater than the
second, you swap them; if the first element is smaller than the second, you leave them as they are. Then
the first and third elements are compared and swapped if necessary. This process goes on until the first
and last elements of the array have been compared. This completes the first step of selection sort.

If there are n elements to be sorted, the process mentioned above must be repeated n-1 times to get the
required result. But, for better performance, in the second step the comparison starts from the second
element, because after the first step the required number is automatically placed at the first position
(in the case of sorting in ascending order, the smallest element; in the case of descending order, the
largest). Similarly, in the third step the comparison starts from the third element, and so on.

[Figure: how the selection sort algorithm works, step by step]

C Program to Sort Elements Using the Selection Sort Algorithm

#include <stdio.h>

int main() {
    int data[100], i, n, steps, temp;

    printf("Enter the number of elements to be sorted: ");
    scanf("%d", &n);
    for (i = 0; i < n; ++i) {
        printf("%d. Enter element: ", i + 1);
        scanf("%d", &data[i]);
    }

    for (steps = 0; steps < n; ++steps) {
        for (i = steps + 1; i < n; ++i) {
            if (data[steps] > data[i]) { /* To sort in descending order, change > to <. */
                temp = data[steps];
                data[steps] = data[i];
                data[i] = temp;
            }
        }
    }

    printf("In ascending order: ");
    for (i = 0; i < n; ++i)
        printf("%d ", data[i]);
    return 0;
}
