
A binary search tree is a simple data structure for which the running time of most operations is O(log N) on average.

*Trees are used to implement the file system of several popular operating systems.

*Trees can be used to support searching operations in O(log N) average time.

*Trees can be used to implement symbol tables.

Tree (recursive definition): A tree is a collection of nodes. The collection can be empty; otherwise, a tree consists of a distinguished node r, called the root, and zero or more nonempty (sub)trees T1, T2, ..., Tk, each of whose roots is connected by a directed edge from r.

(Figure: an example tree with root A and nodes B through Q on the lower levels.)

*In a tree with N nodes, the number of edges is N-1.

*Root: a node without a parent.

*Leaves: nodes without children are known as leaves.

*Siblings: nodes with the same parent.

*Path: a path from node n1 to nk is defined as a sequence of nodes n1, n2, ..., nk such that ni is the parent of ni+1 for 1 <= i < k.

*Path length: the number of edges on the path. If there are k nodes in the path, the path length is k-1.

*Depth: the depth of a node is the path length from the root to the given node.

*The depth of the tree is the maximum depth occurring in the tree.

*The depth of the root is zero.

*Height: the height of a node is the maximum path length from a leaf to the given node. The height of a tree is the maximum height occurring in the tree.

*The height of a leaf is zero.

*The height of the tree is equal to the depth of the tree.

*Ancestors: all nodes in the path from the root to the parent of a given node.

*Descendants: all the children, grandchildren, etc. of a node are called its descendants.

Generic tree: a tree in which any node can have any number of children.

m-ary tree: a tree in which any node can have at most m children.

2-ary tree or binary tree: a tree in which no node can have more than two children.

Implementation of Trees: generic tree to binary tree.

Since the number of children per node can vary so greatly and is not known in advance, it might be infeasible to make the children direct links in the data structure, because there would be too much wasted space. The solution is to keep the children of each node in a linked list of tree nodes.

Ex: node structure:

data | First Child | Next Sibling

(Figure: the example tree with root A redrawn using first-child and next-sibling links.)
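The first-child/next-sibling idea above can be sketched in C as follows; the struct and function names are illustrative, not from the original notes:

```c
#include <stdlib.h>

/* One node of a generic tree in "first child / next sibling" form:
   each node keeps one link to its leftmost child and one link to the
   sibling immediately to its right, so the fan-out can be arbitrary. */
struct gtnode {
    char data;
    struct gtnode *firstchild;   /* leftmost child, or NULL */
    struct gtnode *nextsibling;  /* next child of the same parent, or NULL */
};

/* Allocate a node with no children and no sibling yet. */
struct gtnode *gt_make(char d)
{
    struct gtnode *n = malloc(sizeof *n);
    if (n) { n->data = d; n->firstchild = n->nextsibling = NULL; }
    return n;
}

/* Attach c as the new first child of p; the older children
   become siblings of c through the nextsibling chain. */
void gt_addchild(struct gtnode *p, struct gtnode *c)
{
    c->nextsibling = p->firstchild;
    p->firstchild = c;
}
```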

Directory Structure:

Binary Tree:

(Figure: the worst case is a degenerate tree in which every node has a single child; in general a binary tree is a root with two subtrees T1 and T2.)

*The depth of a binary tree may vary from N-1 (worst case) to log2 N (best case).

*The average depth of a binary tree is O(sqrt(N)).

Perfectly Balanced Binary Tree:

If the height of the left subtree is equal to the height of the right subtree at every node, the tree is called a perfectly balanced binary tree. (Example: A with children B and C, where B has children D, E and C has children F, G.)

*The maximum number of nodes in a binary tree of height H is 2^(H+1) - 1. For a perfectly balanced tree of depth D, N = 2^(D+1) - 1.

*Full node: a full node is a node with two children.

*The number of full nodes plus one is equal to the number of leaves in a nonempty binary tree.

*If the height of the left subtree is greater than the height of the right subtree, the tree is called left heavy.

*If the height of the right subtree is greater than the height of the left subtree, it is called a right heavy tree.

*Any node with outdegree 0 is called a terminal node or a leaf; all other nodes are called branch nodes/internal nodes.

*Level: the level of any node is the length of its path from the root.

*Ordered tree: if in a (directed) tree an ordering of the nodes at each level is prescribed, then such a tree is called an ordered tree.

*Degree of a node: the number of subtrees of a node is called the degree of the node.

*A set of disjoint trees is a forest.

*If the outdegree of every node is exactly m or 0, and the number of nodes at level i is m^(i-1) (assuming the root is at level 1), then the tree is called a full or complete m-ary tree.

*The number of ordered trees with n nodes is (1/(n+1)) * C(2n, n).

Storage Representation of Binary Trees:

1. Sequential/array representation.

2. Linked storage representation.

3. Threaded storage representation.

Sequential Representation:

If the root/parent is stored in the ith location, its left child must be stored in the (2i)th location and its right child in the (2i+1)th location (1-based indexing).

In C (0-based indexing): the left child goes in the (2i+1)th location and the right child in the (2i+2)th location.
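The 0-based index arithmetic can be captured in three tiny helpers (names are illustrative):

```c
/* 0-based array representation of a binary tree: the root sits at
   index 0, and children/parent are found by arithmetic alone. */
int left_child(int i)  { return 2 * i + 1; }
int right_child(int i) { return 2 * i + 2; }
int parent_of(int i)   { return (i - 1) / 2; }   /* valid for i > 0 */
```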

Eg:

(Figure: an array of 22 cells storing a sample tree with keys A through K; each key sits at the index computed from its parent's index, and cells for absent children stay unused.)

Disadvantage: memory wastage.

*Linked storage representation: in each node two link fields (pointers) are required.

left | data | right

(Figure: a linked binary tree with nodes A through K.)

*In a binary tree with n nodes there exist (n+1) NULL links.

Binary search tree: a binary tree in which, for every node x, the values of all the keys in its left subtree are smaller than the key value in x, and the values of all the keys in its right subtree are larger than the key value in x.

eg: (Figures: a BST with root 15 over the keys 2, 5, 8, 10, 13, and a BST with root 34 over the keys 18, 25, 30, 38, 41.)

Implementation of BST:

left | data | right

Find, insert, delete operations.

typedef struct treenode *treptr;
struct treenode
{
    treptr left;
    int data;
    treptr right;
};

Searching for a given element (find):

treptr find(treptr T, int x)
{
    if (T == NULL)
        return NULL;
    if (x < T->data)
        return find(T->left, x);
    else if (x > T->data)
        return find(T->right, x);
    else
        return T;
}

*Find min and find max:

treptr findmin(treptr T)
{
    if (T == NULL)
        return NULL;
    else if (T->left == NULL)
        return T;
    else
        return findmin(T->left);
}

*Nonrecursive implementation of find max:

treptr findmax(treptr T)
{
    if (T != NULL)
        while (T->right != NULL)
            T = T->right;
    return T;
}

*Inserting an element into a binary search tree if the element is not already present:

treptr insert(treptr T, int x)
{
    if (T == NULL)
    {
        T = (treptr)malloc(sizeof(struct treenode));
        if (T == NULL)
            printf("out of space");
        else
        {
            T->data = x;
            T->left = T->right = NULL;
        }
    }
    else if (x < T->data)
        T->left = insert(T->left, x);
    else if (x > T->data)
        T->right = insert(T->right, x);
    return T;
}

*Duplicates can be handled by keeping an extra field in the node indicating the frequency of occurrence.

*If the key is only part of a larger structure, then we can keep all of the structures that have the same key in an auxiliary data structure, such as a list or another search tree.

Deleting an element:

If the node to be deleted has two children, the general strategy is to replace its data with the smallest data of the right subtree and recursively delete that node.

If the node to be deleted has one child, delete the node and return the child to its parent.

If the node has no children, free the node and return NULL to its parent.

*Lazy deletion: when an element is to be deleted, it is left in the tree and merely marked as deleted (using an extra field in the node). It is used when duplicate keys are present, when the number of deletions is expected to be small, or when the deleted keys are to be reinserted in the future.

*A small time penalty is associated with lazy deletion.

The time complexity of the insert/delete operations is O(log2 n) on average for a balanced tree.

Deletion routine for BST:

treptr delete(treptr T, int x)
{
    treptr temp;
    if (T == NULL)
        printf("Element is not existing");
    else if (x < T->data)
        T->left = delete(T->left, x);
    else if (x > T->data)
        T->right = delete(T->right, x);
    else if (T->left && T->right)      /* two children */
    {
        temp = findmin(T->right);
        T->data = temp->data;
        T->right = delete(T->right, T->data);
    }
    else                               /* zero or one child */
    {
        temp = T;
        if (T->left == NULL)
            T = T->right;
        else if (T->right == NULL)
            T = T->left;
        free(temp);
    }
    return T;
}

Expression Trees:

The leaves of an expression tree are operands, and the other nodes contain operators.

eg: (a + b * c) + ((d * e + f) * g)

(Figure: the root + has a left subtree for a + b*c and a right subtree for (d*e + f) * g.)

Prefix exp: if a preorder traversal is applied to an expression tree, we get the prefix expression.

Postfix exp: if a postorder traversal is applied to an expression tree, we get the postfix expression.

Constructing an Expression Tree:

From a given postfix expression we can construct an expression tree.

1. Repeat through step 5 until the end of the postfix expression.

2. Read a symbol from the expression.

3. Create a node and store the symbol in the data field.

4. If the symbol is an operand, push the node address onto the stack and go to step 2.

5. If the symbol is an operator, pop two nodes from the stack, attach them as the right and left children of the node respectively, push it onto the stack, and go to step 2.

6. Pop the root address from the stack.
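The six steps above can be sketched in C, assuming single-character operands and operators and a fixed-size local stack (both simplifications not stated in the notes):

```c
#include <stdlib.h>
#include <ctype.h>

struct enode {
    char data;
    struct enode *left, *right;
};

/* Build an expression tree from a postfix string, following the
   stack procedure: operands are pushed; an operator pops its right
   child first, then its left child, and is pushed back. */
struct enode *build_exptree(const char *postfix)
{
    struct enode *stack[64];
    int tos = -1;

    for (; *postfix; postfix++) {
        struct enode *n = malloc(sizeof *n);
        n->data = *postfix;
        n->left = n->right = NULL;
        if (!isalnum((unsigned char)*postfix)) {  /* operator */
            n->right = stack[tos--];  /* first pop becomes right child */
            n->left  = stack[tos--];  /* second pop becomes left child */
        }
        stack[++tos] = n;             /* push operand or subtree root */
    }
    return stack[tos];                /* step 6: pop the root address */
}
```

For example, "abc*+" yields a tree whose root is '+' with left child 'a' and right child '*'.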

Tree Traversals

Applications: to list the names of all the files in a directory; to swap the left and right children of all the nodes; to prepare a copy of an existing tree.

Tree traversal is a procedure by which each node in the tree is processed exactly once in a systematic manner. The meaning of "processed" depends on the nature of the application.

Main traversals:

Within the children, if we assume that the left is to be visited first, then depending upon the position of the parent we get preorder, inorder, and postorder traversals.

If we assume that the right is to be visited first, we get reverse (converse) preorder, reverse inorder, and reverse postorder traversals.

The nodes in the tree can also be visited level by level; this is called level-order traversal, either level order left to right or level order right to left.

Preorder traversal of a binary tree is defined as follows:

1. Process the root/parent node.
2. Traverse the left subtree in preorder.
3. Traverse the right subtree in preorder.

Inorder traversal of a binary tree is given by the following steps:

1. Traverse the left subtree in inorder.
2. Process the root/parent node.
3. Traverse the right subtree in inorder.

Postorder traversal of a binary tree is defined as follows:

1. Traverse the left subtree in postorder.
2. Traverse the right subtree in postorder.
3. Process the root/parent node.

(Example tree: A has left child B and right child D; B has left child C; D has left child E and right child G; E has right child F.)

Preorder o/p: A B C D E F G
Inorder o/p: C B A E F D G
Postorder o/p: C B F E G D A

Implementation of traversals (recursive algorithms):

void rpreorder(treptr T)
{
    if (T != NULL)
    {
        printf("%d ", T->data);
        rpreorder(T->left);
        rpreorder(T->right);
    }
}

void rinorder(treptr T)
{
    if (T != NULL)
    {
        rinorder(T->left);
        printf("%d ", T->data);
        rinorder(T->right);
    }
}

void rpostorder(treptr T)
{
    if (T != NULL)
    {
        rpostorder(T->left);
        rpostorder(T->right);
        printf("%d ", T->data);
    }
}

Nonrecursive preorder routine:

General algorithm:

1. If the tree is empty then write "tree empty" and return; else place the pointer to the root of the tree on the stack.

2. Repeat step 3 while the stack is not empty.

3. Pop the top pointer off the stack. Repeat while the pointer value is not null: write the data associated with the node; if the right subtree is not empty then stack the pointer to the right subtree; set the pointer value to the left subtree.

*Assuming tos is global and push & pop routines are available:

void preorder(treptr T)
{
    treptr p, S[20];
    tos = -1;
    if (T == NULL)
        printf("tree is empty");
    else
    {
        push(S, T);
        while (tos != -1)
        {
            p = pop(S);
            while (p != NULL)
            {
                printf("%d ", p->data);
                if (p->right != NULL)
                    push(S, p->right);
                p = p->left;
            }
        }
    }
}

Iterative Postorder Traversal.

General algorithm:

1. If the tree is empty then write "empty tree" and return; else initialize the stack and set the pointer value to the root of the tree.

2. Start an infinite loop to repeat through step 5.

3. Repeat while the pointer value is not null: stack the current pointer value; set the pointer value to the left subtree.

4. Repeat while the top pointer on the stack is negative: pop the pointer off the stack; write the data associated with the positive value of this pointer; if the stack is empty then return.

5. Set the pointer value to the right subtree of the value on top of the stack, and negate the pointer on top of the stack.

Assuming push & pop routines exist and tos is global. (A negated stack entry marks a node whose right subtree has already been dispatched; negating a pointer this way is a lecture-note convention, not portable C.)

void postorder(treptr T)
{
    treptr p, s[50];
    if (T == NULL)
    {
        printf("Tree is empty");
        return;
    }
    else
    {
        p = T;
        tos = -1;
    }
    while (1)
    {
        while (p != NULL)
        {
            push(s, p);
            p = p->left;
        }
        while (s[tos] < 0)
        {
            p = -pop(s);          /* restore the positive pointer */
            printf("%d ", p->data);
            if (tos == -1)
                return;
        }
        p = s[tos]->right;
        s[tos] = -s[tos];         /* mark: right subtree dispatched */
    }
}

*Routine to prepare a copy of the given tree T:

treptr copy(treptr T)
{
    treptr temp;
    if (T == NULL)
        return NULL;
    temp = (treptr)malloc(sizeof(struct treenode));
    temp->data = T->data;
    temp->left = copy(T->left);
    temp->right = copy(T->right);
    return temp;
}

*If two traversal outputs are given, namely inorder-preorder or inorder-postorder, we can construct the tree. If only the preorder and postorder outputs are given, we cannot construct a unique tree.

ex: preorder output: a, b, d, e, g, c, f
    inorder output: d, b, g, e, a, f, c

*From the preorder output take the first data item, construct a node, and use it to divide the inorder output: a splits the inorder output into (d, b, g, e) on the left and (f, c) on the right.

Take the next data item, create a node, and attach it at the appropriate place: b becomes the left child of a, splitting (d, b, g, e) into (d) and (g, e).

Repeat the above until all the data items of the preorder output are explored.
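The recursive splitting described above can be sketched directly in C, assuming single-character keys that are all distinct (the struct and function names are illustrative):

```c
#include <stdlib.h>
#include <string.h>

struct bnode { char data; struct bnode *left, *right; };

/* Rebuild the tree from its preorder and inorder sequences (n symbols
   each): the first preorder symbol is the root; its position in the
   inorder sequence splits the rest into left and right subtrees. */
struct bnode *build(const char *pre, const char *in, int n)
{
    if (n == 0)
        return NULL;
    struct bnode *root = malloc(sizeof *root);
    root->data = pre[0];
    int k = (int)(strchr(in, pre[0]) - in);   /* root's index in inorder */
    root->left  = build(pre + 1,     in,         k);
    root->right = build(pre + 1 + k, in + k + 1, n - 1 - k);
    return root;
}
```

Running it on the example above, build("abdegcf", "dbgeafc", 7) recovers the tree with root a, left child b, and right child c.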

The elements of a binary search tree can be printed in ascending order (e.g. a, b, c, d) by applying inorder traversal.

*Threaded storage representation for binary trees:

The wasted NULL links in the linked representation can be replaced by threads. A binary tree is threaded according to a particular traversal order.

eg: Threads for the inorder traversal of a tree are pointers to its higher nodes. If the left link of a node is null, it is replaced by the address of the node's inorder predecessor; if the right link of a node is null, it is replaced by the address of the node's inorder successor. A null link that has no predecessor/successor must be handled specially (for example, by pointing it at a head node).

*A thread can be represented using a negative address.

*Alternatively, separate flag fields can be used:

left | leftthread | data | rightthread | right

AVL (Adelson-Velskii and Landis) Trees:

*If the tree is balanced then its depth will be O(log2 N), and any operation can be implemented in O(log2 N) time.

*As the number of insertions/deletions on a BST increases, the tree may become either left heavy or right heavy, and the time complexity becomes O(N).

*To implement every operation in O(log2 N) time, we insist on an extra structural condition called balance.

*Data structures in which, after every operation, a restructuring rule is applied that tends to make future operations efficient are classified as self-adjusting: AVL trees, splay trees.

An AVL tree is a binary search tree with a balance condition which ensures that the depth of the tree is O(log N).

*An AVL tree is identical to a binary search tree, except that for every node in the tree, the heights of the left and right subtrees can differ by at most 1.

*Height information is kept for each node.

*The height of an empty tree is defined to be -1.

*The height of an AVL tree is at most roughly h = 1.44 log2(N+2) - 1.328.

*The minimum number of nodes, s(h), in an AVL tree of height h is given by s(h) = s(h-1) + s(h-2) + 1, with s(0) = 1 and s(1) = 2.

*When we do an insert operation, we need to update the balancing information for all the nodes on the path back to the root.

*Normally lazy deletion is performed.

*When we insert a new element, a node could violate the AVL tree property, which can be restored with a simple modification to the tree known as a rotation.

The violation may occur in four cases:

1. An insertion into the left subtree of the left child of T.
2. An insertion into the right subtree of the right child of T.
3. An insertion into the right subtree of the left child of T.
4. An insertion into the left subtree of the right child of T.

*If the insertion occurs on the outside (left-left or right-right), it is fixed by a single rotation of the tree.

*If the insertion occurs on the inside (left-right or right-left), it is handled by a double rotation.

Single rotation: the idea here is that if the height of the left subtree is greater due to an outside insertion, make the left child the new root and the old root the right child of the new root, making the new links with the order property in mind.

Single rotation with the left child: k2 with left child k1 (subtrees A and B under k1, C under k2) becomes k1 with right child k2; B moves across to become k2's left subtree. Single rotation with the right child is the mirror image.

Example: inserting the elements 3, 2, 1, 4, 5, 6, 7 one by one, performing a single rotation whenever a node becomes unbalanced, yields the final tree with root 4, children 2 and 6, and leaves 1, 3, 5, 7.

Double rotation: can be implemented using two single rotations.

Left-right double rotation: k3 has left child k1, and k1 has right child k2 (subtree A under k1, B and C under k2, D under k3). After the two rotations k2 becomes the new root, with k1 (holding A and B) on its left and k3 (holding C and D) on its right.

Right-left double rotation is the mirror image: k1 has right child k3, and k3 has left child k2; after the two rotations k2 becomes the new root with k1 on its left and k3 on its right.

Implementation:

left | data | height | right

typedef struct avlnode *avlptr;
struct avlnode
{
    avlptr left;
    int data;
    int height;
    avlptr right;
};

int height(avlptr T)
{
    if (T == NULL)
        return -1;
    else
        return T->height;
}

This function can be called only if k2 has a left child. Perform a rotation between a node (k2) and its left child, update the heights, and return the new root.

avlptr srotatewithleft(avlptr k2)
{
    avlptr k1;
    k1 = k2->left;
    k2->left = k1->right;
    k1->right = k2;
    k2->height = max(height(k2->left), height(k2->right)) + 1;
    k1->height = max(height(k1->left), k2->height) + 1;
    return k1;
}

This function can be called only if k3 has a left child and k3's left child has a right child. Do the left-right double rotation, update the heights, then return the new root.

avlptr drotatewithleft(avlptr k3)
{
    k3->left = srotatewithright(k3->left);
    return srotatewithleft(k3);
}

avlptr drotatewithright(avlptr k3)
{
    k3->right = srotatewithleft(k3->right);
    return srotatewithright(k3);
}

Insertion into an AVL tree:

avlptr insert(avlptr T, int x)
{
    if (T == NULL)
    {
        T = (avlptr)malloc(sizeof(struct avlnode));
        if (T == NULL)
            printf("out of space");
        else
        {
            T->data = x;
            T->height = 0;
            T->left = T->right = NULL;
        }
    }
    else if (x < T->data)
    {
        T->left = insert(T->left, x);
        if (height(T->left) - height(T->right) == 2)
        {
            if (x < T->left->data)
                T = srotatewithleft(T);
            else
                T = drotatewithleft(T);
        }
    }
    else if (x > T->data)
    {
        T->right = insert(T->right, x);
        if (height(T->left) - height(T->right) == -2)
        {
            if (x > T->right->data)
                T = srotatewithright(T);
            else
                T = drotatewithright(T);
        }
    }
    T->height = max(height(T->left), height(T->right)) + 1;
    return T;
}

Splay Trees

A splay tree guarantees that any M consecutive tree operations starting from an empty tree take at most O(M log N) time; it has an O(log N) amortized cost per operation.

The basic idea of the splay tree is that after a node is accessed, it is pushed to the root by a series of AVL-tree-style rotations. This is based on the locality of reference principle: in many applications, when a node is accessed, it is likely to be accessed again in the near future.

A simple idea: rotate every node on the access path with its parent (single rotations, bottom up), e.g. on a find of k1. The problem is that such an access pushes other nodes deep into the tree.

*If instead we perform ordered rotations on the required node, it can become the new root and simultaneously some balancing can be achieved.

*If the node has both a parent and a grandparent, a double rotation (zig-zig/zig-zag) is performed; otherwise a single rotation is performed.

*If the total number of nodes in the path is odd, then a single rotation is required at the end; otherwise double rotations are sufficient.

Single rotations:

Zig left: x is the left child of the root p; rotate so that x becomes the root and p its right child.

Zig right: x is the right child of the root p; rotate so that x becomes the root and p its left child.

Double rotations:

Zig-zig left / zig-zig right: the node, its parent, and its grandparent lie on one outside path; two rotations bring the node to the top.

Zig-zag left / zig-zag right: the node is an inside grandchild; the double rotation brings it to the top, as in an AVL double rotation.

The splay tree node & structure definition:

left | data | parent | right

typedef struct splaynode *splptr;
struct splaynode
{
    splptr left;
    int data;
    splptr parent;
    splptr right;
};

Implementation of splay tree:

Basic splay routine:

void splay(splptr current)
{
    splptr father;
    father = current->parent;
    while (father != NULL)
    {
        if (father->parent == NULL)
            Zig(current);
        else
            doublerotate(current);
        father = current->parent;
    }
}

Single rotate function:

void Zig(splptr current)
{
    if (current->parent->left == current)
        Zigleft(current);
    else
        Zigright(current);
}

Double rotate function:

void doublerotate(splptr current)
{
    splptr p, g;
    p = current->parent;
    g = p->parent;
    if (g->left == p)
    {
        if (p->left == current)
            ZigZigleft(current);
        else
            ZigZagleft(current);
    }
    else
    {
        if (p->right == current)
            ZigZigright(current);
        else
            ZigZagright(current);
    }
}

Routine for Zigleft (current is the left child of the root p; current's right subtree B moves across to become p's left subtree):

void Zigleft(splptr current)
{
    splptr p, B;
    p = current->parent;
    B = current->right;
    p->left = B;
    if (B != NULL)
        B->parent = p;
    current->right = p;
    p->parent = current;
    current->parent = NULL;
}

ZigZig left routine:

void ZigZigleft(splptr current)
{
    splptr p, g, ggp, B, c;
    p = current->parent;
    g = p->parent;
    ggp = g->parent;
    B = current->right;
    current->right = p;
    p->parent = current;
    p->left = B;
    if (B != NULL)
        B->parent = p;
    c = p->right;
    p->right = g;
    g->parent = p;
    g->left = c;
    if (c != NULL)
        c->parent = g;
    current->parent = ggp;
    if (ggp != NULL)
    {
        if (ggp->left == g)
            ggp->left = current;
        else
            ggp->right = current;
    }
}

(Zigright, ZigZigright, and the two ZigZag routines are the mirror images of the routines above.)

B-Tree:

It is a popular search tree that is not binary.

*A B-tree of order M is a tree with the following structural properties:

- The root is either a leaf or has between 2 and M children.
- All nonleaf nodes (except the root) have between ceil(M/2) and M children.
- All leaves are at the same depth.

B-trees are used for indexing purposes, e.g. indexed sequential files. All data items are stored at the leaves.

*The number of keys in a (nonroot) leaf is also between ceil(M/2) and M.

Each interior node contains pointers p1, p2, ..., pm to its m children and values k1, k2, ..., km-1, where ki represents the smallest key found in the subtree pointed to by pi+1. If a pointer pi is NULL, the corresponding ki is undefined.

The node structure:

p1 | k1 | p2 | k2 | p3 | ... | kn-1 | pn

The leaves contain all the data, which are either the keys themselves or pointers to records containing the keys.

(Figure: an example B-tree of order 4, with an interior node holding the values 21, 48, 72.)

A B-tree of order 4 is popularly known as a 2-3-4 tree.

A B-tree of order 3 is known as a 2-3 tree.

Insert operation on B-trees, using the special case of the 2-3 tree:

The interior nodes (nonleaves) are drawn in ellipses, and the leaves are drawn in boxes containing the keys; the keys in the leaves are ordered.

(Figure: an example 2-3 tree with interior separator values such as 41:58, 16:-, and 22:-, and leaves such as 8,11,12 / 16,17 / 22,23,31 / 41,52 / 58,59,60.)

Case 1: To insert a node with key 18, we can just add it to a leaf without causing any violation of the 2-3 tree properties.

Case 2: To insert a key X when there is no room in the leaf, split the leaf into two and attach both to its parent.

Case 3: If there is no room at the parent, split the parent into two and attach both at the grandparent, and so on.

When we insert elements using the above procedure while satisfying the B-tree properties, the number of levels of the tree may increase.

Deletion: we can perform deletion by finding the key to be deleted and removing it. If this violates the property on the number of keys, combine the leaf with its sibling; if there is then a violation at the parent, merge the parent with its sibling, and so on. When we delete an element from a B-tree, the number of levels of the B-tree may be reduced.

The worst-case running time for each of the insert/delete operations is O(M logM N); a find operation takes O(log N).

The real use of B-trees lies in database systems.

B+-tree: the minimum key of the right subtree is maintained in the root/parent.

B*-tree: horizontal links are maintained, so links are available from a node to its siblings.

*A node is not split until all the siblings are about 70% full; shifting of data from one node to its siblings is implemented.

Hashing

The process of converting a key into an address is called hashing. The key-to-address transformation is defined as a mapping or a hashing function.

Hash table: the sequence of memory locations (an array of memory locations) in which the keys are to be stored is called a hash table.

Hash table size (m): the number of memory locations in the hash table. The table size m should be a prime number for an even distribution of the keys in the table.

Load factor (λ): the ratio of the present number of elements in the table to the table size.

Collision: if more than one key is transformed into the same address, it is called a collision. Collisions are resolved using collision resolution techniques.

Preconditioning: the process of converting alphanumeric keys into a form which can be more easily manipulated by a hashing function is called preconditioning.

Hashing is mainly used to implement direct files. Hash search is faster than other search algorithms.

Hashing Functions:

eg: with h(x) = x mod 10 and keys 21, 83, 15, 17:

cell: 0  1   2  3   4  5   6  7   8  9
key:  -  21  -  83  -  15  -  17  -  -

m = 10, λ = 4/10 = 0.4

The Division Method:

H(x) = x mod m

The term x mod m has a value equal to the remainder of dividing x by m, so the division method yields a hash value belonging to the set {0, 1, 2, ..., m-1}. It is a simple and widely accepted method, but the possibility of collisions is comparatively high.

Midsquare Method:

In this method a key is multiplied by itself, and the address is obtained by selecting an appropriate number of bits or digits from the middle of the square, depending upon the table size.

Ex: x = 123456, x^2 = 15241383936; taking 3 middle digits gives h(x) = 138.

The Folding Method:

In this method the key is partitioned into a number of parts, each of which has the same length as the required address, with the possible exception of the last part.

Fold-shifting method: all the parts are added together, ignoring the final carry. In the binary case, exclusive-OR can be used instead of addition.

Fold-boundary method: a variation of the basic method that involves the reversal of the digits in the outermost partitions.

Folding is a hashing function which is also useful in converting multiword keys into a single word so that other hashing functions can be used.

Digit Analysis:

This method forms addresses by selecting and shifting digits or bits of the original key. It is, in a sense, distribution dependent: for a given key set, the same positions in the key and the same rearrangement pattern must be used consistently. After analysis of a sample of the key set, the digit positions having the most uniform distributions are selected. It is used in conjunction with static key sets (i.e. key sets that do not change over time).

The Length-Dependent Method:

It is commonly used in table-handling applications. The length of the key is used along with some portion of the key to produce either a table address directly or an intermediate key, which is then used, for example, with the division method to produce a final table address.

ex: the sum of the binary equivalents of the first and last characters plus 6 times the length of the key.

Algebraic Coding:

It is a cluster-separating hashing function based on algebraic coding theory. An r-bit key (k1, k2, ..., kr)2 is considered as a polynomial

K(x) = sum_{i=1}^{r} ki x^(i-1)

With table size m = 2^t - 1, choose a divisor polynomial

P(x) = x^t + sum_{i=1}^{t} pi x^(i-1)

and take the hash value from the remainder

K(x) mod P(x) = sum_{i=1}^{t} hi x^(i-1)

Multiplicative Hashing:

For a nonnegative integral key x and a constant c such that 0 < c < 1, the function is

H(x) = floor(m * (cx mod 1))

Here cx mod 1 is the fractional part of cx.

Collision Resolution Techniques.

Separate Chaining:

The idea is to keep a list of all elements that hash to the same value. To perform a find, we use the hash function to determine which list to traverse; we then traverse that list in the normal manner, returning the position where the item is found. To insert an element, find the required list and insert the new element either at the beginning or at the end.

Implementation:

typedef struct node *nptr;
struct node
{
    int data;
    nptr next;
};

Creating the hash table, assuming memory is allocated by malloc:

nptr *create_ht(int m)
{
    nptr *HT;
    int i;
    HT = (nptr *)malloc(sizeof(nptr) * m);
    for (i = 0; i < m; i++)
    {
        HT[i] = createhead();   /* allocate a header node for each list */
    }
    return HT;
}

Disadvantage: space for the pointers.
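The find and insert operations for separate chaining can be sketched as below. Unlike create_ht above, this sketch assumes plain NULL-initialized buckets (no header nodes) and uses h(x) = x mod m with insertion at the beginning of the list:

```c
#include <stdlib.h>

typedef struct node *nptr;
struct node { int data; nptr next; };

/* Walk the one list that x hashes to; return its node or NULL. */
nptr chain_find(nptr *HT, int m, int x)
{
    nptr p;
    for (p = HT[x % m]; p != NULL; p = p->next)
        if (p->data == x)
            return p;
    return NULL;
}

/* Insert x at the beginning of its list, ignoring duplicates. */
void chain_insert(nptr *HT, int m, int x)
{
    if (chain_find(HT, m, x) != NULL)
        return;
    nptr n = malloc(sizeof *n);
    n->data = x;
    n->next = HT[x % m];
    HT[x % m] = n;
}
```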

Open Addressing: in this scheme, if a collision occurs, alternative cells are tried until an empty cell is found. More formally, cells h0(x), h1(x), h2(x), ... are tried in succession, where hi(x) = (Hash(x) + F(i)) mod TableSize:

F(i) = i           -> linear probing
F(i) = i*i         -> quadratic probing
F(i) = i*h2(x)     -> double hashing

Generally the load factor should be kept below 0.5 (50%) for open addressing hashing, so a bigger table is needed for open addressing than for separate chaining.

Linear Probing:

In linear probing, F is a linear function of i, typically F(i) = i. This amounts to trying cells sequentially (with wraparound) in search of an empty cell.

Ex: hi(x) = (x mod m + i) mod m, for i = 0, 1, ..., m-1.

As long as the table is big enough, a free cell can always be found, but the time to do so can get quite large. Worse, even if the table is relatively empty, blocks of occupied cells start forming. This effect, known as primary clustering, means that any key that hashes into the cluster will require several attempts to resolve the collision, and will then add to the cluster.

The expected number of probes using linear probing is roughly (1/2)(1 + 1/(1-λ)^2) for insertions and unsuccessful searches, and (1/2)(1 + 1/(1-λ)) for successful searches.

The mean insertion time is I(λ) = (1/λ) * integral from 0 to λ of dx/(1-x) = (1/λ) ln(1/(1-λ)).

Inserting 89, 18, 49, 58, 69 with h(x) = x mod 10 and linear probing:

cell | empty | after 89 | after 18 | after 49 | after 58 | after 69
  0  |       |          |          |    49    |    49    |    49
  1  |       |          |          |          |    58    |    58
  2  |       |          |          |          |          |    69
  8  |       |          |    18    |    18    |    18    |    18
  9  |       |    89    |    89    |    89    |    89    |    89
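The linear-probing insertion for the 89, 18, 49, 58, 69 example can be sketched with an int array, where -1 marks an empty cell (both simplifications are this sketch's assumptions, not from the notes):

```c
#define EMPTY (-1)

/* Open addressing with linear probing into table T of size m,
   h(x) = x mod m; empty cells hold EMPTY.  Returns the index used,
   or -1 if the whole table is full. */
int lp_insert(int *T, int m, int x)
{
    int i;
    for (i = 0; i < m; i++) {
        int pos = (x % m + i) % m;   /* try h(x), h(x)+1, ... wrapping */
        if (T[pos] == EMPTY) {
            T[pos] = x;
            return pos;
        }
    }
    return -1;                        /* no free cell */
}
```

Inserting 89, 18, 49, 58, 69 into an empty table of size 10 lands them in cells 9, 8, 0, 1, 2, matching the table above.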

Quadratic Probing:

It is a collision resolution method that eliminates the primary clustering problem of linear probing by using hi(x) = (h(x) + F(i)) mod m with F(i) = i*i.

Inserting the same keys with quadratic probing:

cell | empty | after 89 | after 18 | after 49 | after 58 | after 69
  0  |       |          |          |    49    |    49    |    49
  2  |       |          |          |          |    58    |    58
  3  |       |          |          |          |          |    69
  8  |       |          |    18    |    18    |    18    |    18
  9  |       |    89    |    89    |    89    |    89    |    89

There is no guarantee of finding an empty cell once the table gets more than half full, or even before the table gets half full if the table size is not prime. If quadratic probing is used and the table size is prime, then a new element can always be inserted if the table is at least half empty.

Standard deletion cannot be performed in an open addressing hash table, because the cell might have caused a collision to go past it. Open addressing hash tables require lazy deletion.

Secondary Clustering:

Although quadratic probing eliminates primary clustering, elements that hash to the same position will probe the same alternative cells. This is known as secondary clustering.

Double Hashing:

hi(x) = (h1(x) + F(i)) mod m, where F(i) = i*h2(x) and h2(x) is another hash function. A good choice is h2(x) = R - (x mod R), with R a prime number smaller than the table size.

If double hashing is correctly implemented, the expected number of probes is almost the same as for a random collision resolution strategy. Quadratic probing, however, does not require a second hash function and is thus likely to be simpler and faster in practice.

Rehashing:

If the table gets too full, the running time for the operations will start taking too long, and inserts might fail for open addressing hashing with quadratic resolution. This can also happen if there are too many removals intermixed with insertions. A solution is to build another table that is about twice as big (with an associated new hash function), scan down the entire original hash table, compute the new hash value for each (non-deleted) element, and insert it in the new table.

This entire operation is called rehashing. Rehashing can be performed when the load factor λ exceeds 0.5.

Example. Table of size 7 after inserting 13, 15, 24, 6, and 23 (h(x) = x mod 7, linear probing):

0: 6
1: 15
2: 23
3: 24
4:
5:
6: 13

After rehashing to a table of size 17 (h(x) = x mod 17):

 0:
 1:
 2:
 3:
 4:
 5:
 6: 6
 7: 23
 8: 24
 9:
10:
11:
12:
13: 13
14:
15: 15
16:

The running time of the rehashing operation is O(N).

Rehashing can also be done when an insertion fails.

Rehashing frees the programmer from worrying about the table size, which is important because hash tables cannot be made arbitrarily large in complex programs.

Extendible Hashing:

This deals with the case where the amount of data is too large to fit in main memory. We assume that at any point we have N records to store; the value of N changes over time. Furthermore, at most M records (4 in this case) fit in one disk block.

Extendible hashing allows a find to be performed in two disk accesses. Insertions also require few disk accesses.

As M increases, the depth of a B-tree decreases.

To insert an element when the leaf is full, we split the leaf; if three bits are then required to identify a key (one more than before), the directory address is extended.

The expected number of leaves is (N/M) log2 e. Thus the average leaf is ln 2 = 0.69 full.

Priority Queues (Heaps)

Operations: insert(pq, x) and y = deletemax(pq) (or deletemin).

In the case of a single server and several users, a queue is used; but if the users have different priorities, then a priority queue is used. A priority queue is a queue in which the elements inserted have different priorities.

A priority queue is a data structure that allows at least the following two operations: insert, which does the obvious thing, and deletion, which finds, returns, and removes the minimum (or maximum) element in the priority queue.

Applications: It is used in operating systems. It is used for external sorting (replacement selection, to produce longer runs). Priority queues are also important in the implementation of greedy algorithms, which operate by repeatedly finding a minimum.

Ways of Implementing a Priority Queue:

1. Use a simple linked list, performing insertion at the front in O(1) and traversing the list, which requires O(N) time, to delete the minimum.

2. Always maintain a sorted list. This makes insertions expensive (O(N)) and deletemin cheap (O(1)).

3. A priority queue can be implemented using a binary search tree. This gives an O(log N) average running time for both operations. In a BST the minimum always lies to the left, so after repeated deletions the tree may become right heavy.

The basic data structure we will use requires no pointers and supports both operations in O(log N) worst-case time. Insertion actually takes constant time on average, and our implementation allows building a priority queue of N items in linear time if no deletions intervene. This data structure is known as a binary heap.

Binary Heap (heap):

Heaps have two properties: 1. the structure property and 2. the heap order property.

Structure property: A heap is a binary tree that is completely filled, with the possible exception of the bottom level, which is filled from left to right. Such a tree is known as a complete binary tree.

The height of a complete binary tree is ⌊log N⌋.

Heap order property:

In a heap, for every node x, the key in the parent of x is smaller than (or equal to) the key in x, with the exception of the root (which has no parent). Such a heap is called a min-heap.

Max heap order property:

For every node x, the key in the parent of x is greater than (or equal to) the key in x, with the exception of the root (which has no parent).

A complete binary tree is so regular that it can be represented in an array, and no pointers are necessary. For any element in array position i, the left child is in position 2i, the right child is in the cell after the left child (2i + 1), and the parent is in position ⌊i/2⌋.

Example (cell 0 unused; heapsize = 10):

index:   1   2   3   4   5   6   7   8   9  10
value:  13  21  16  24  31  19  68  65  26  32

[Figure: the same keys drawn as a min-heap tree; a corresponding max-heap is also shown.]

After an insert or delete operation, the structure and order properties, if disturbed, must be restored.

To implement the heap ADT:

A sentinel value is stored in array position 0; the elements occupy positions 1 .. hsize of the array (indices 0 .. heap-1, where heap is the capacity).

[Figures: two example heaps drawn as trees, and a diagram of the heap record H with its fields heap, hsize, and harray.]

struct heap
{
    int heap;     /* capacity of the heap */
    int hsize;    /* current number of elements */
    int *harray;  /* array holding the elements */
};
typedef struct heap *hptr;

In array location 0 a sentinel value (minimum for a min-heap, maximum for a max-heap) is stored.

Creation of a priority queue:

hptr createpq(int n)
{
    hptr H;
    H = (hptr) malloc(sizeof(struct heap));
    if (H == NULL)
        printf("out of space");
    else
    {
        H->harray = (int *) malloc((n + 1) * sizeof(int)); /* +1 for the sentinel */
        if (H->harray == NULL)
        {
            printf("out of space");
            free(H);
            return NULL;
        }
        else
        {
            H->heap = n;
            H->hsize = 0;
            H->harray[0] = MinData; /* sentinel value */
        }
    }
    return H;
}

Insert Operation:

To insert an element x into the heap, we create a hole in the next available location (by incrementing the current heap size), which maintains the structure property. If x can be placed in the hole without violating heap order, we insert it there; otherwise we move the parent into the hole and the hole into the parent's position, continuing this percolate-up until we find the correct place for x and insert it.

Insert function:

void Insert(hptr H, int x)
{
    int i;
    if (H->hsize >= H->heap)
        printf("overflow on insert");
    else
    {
        /* percolate the hole up until the parent is <= x */
        for (i = ++H->hsize; H->harray[i / 2] > x; i /= 2)
            H->harray[i] = H->harray[i / 2];
        H->harray[i] = x;
    }
}

DeleteMin:

The minimum is in the root. Deleting it creates a hole there (a violation of the structure property), which is percolated down, and the last element is inserted in the appropriate position.

int deletemin(hptr H)
{
    int i, child, min, last;
    if (H->hsize <= 0)
    {
        printf("underflow on delete");
        return -1;
    }
    else
    {
        min = H->harray[1];
        last = H->harray[H->hsize--];
        /* percolate the hole down, pulling up the smaller child */
        for (i = 1; i * 2 <= H->hsize; i = child)
        {
            child = i * 2;
            if (child != H->hsize && H->harray[child + 1] < H->harray[child])
                child++;
            if (last > H->harray[child])
                H->harray[i] = H->harray[child];
            else
                break;
        }
        H->harray[i] = last;
        return min;
    }
}

Insert and deletemin operations can be completed in O(log N) time.

Other Heap Operations:

The decrease_key(H, k, p) operation lowers the value of the key at position p by a positive amount k. Since this might violate the heap order, it must be fixed by a percolate-up. This operation can be useful to system administrators: they can make their programs run with highest priority.

The increase_key(H, k, p) operation increases the value of the key at position p by a positive amount k. This is fixed with a percolate-down. Many schedulers automatically drop the priority of a process that is consuming excessive CPU time.

The delete(H, p) operation removes the node at position p from the heap. This is done by first performing decrease_key(H, ∞, p) and then performing deletemin(H). When a process is terminated by a user (instead of finishing normally), it must be removed from the priority queue.

Build Heap:

The buildheap(H) operation takes as input N keys and places them into an empty heap. This can be done with N successive inserts. Since each insert takes O(1) average and O(log N) worst-case time, the total running time of this algorithm would be O(N) average but O(N log N) worst case.

With reasonable care, a linear time bound can be guaranteed. The general algorithm is to place the N keys into the tree in any order, maintaining the structure property, and then percolate down all the elements from i = N/2 to 1:

for (i = n/2; i > 0; i--)
    percolatedown(i);

The time complexity of this algorithm is O(n).

For a perfect binary tree of height h containing 2^(h+1) - 1 nodes, the sum of the heights of the nodes is 2^(h+1) - 1 - (h + 1).

The heap is represented using a simple array. The percolate-down routine is given below.

[Figure: the max heap 97 53 59 26 41 58 31 drawn as a tree.]

index:   0   1   2   3   4   5   6
value:  97  53  59  26  41  58  31

void percolatedown(int H[], int i, int n)
{
    int child, temp;
    /* 0-indexed max heap: the children of i are 2i+1 and 2i+2 */
    for (temp = H[i]; 2 * i + 1 < n; i = child)
    {
        child = 2 * i + 1;
        if (child != n - 1 && H[child] < H[child + 1])
            child++;
        if (temp < H[child])
            H[i] = H[child];
        else
            break;
    }
    H[i] = temp;
}

HeapSort:

Space complexity: O(1) extra. Running time: T(N) = O(N log N).

Basic strategy: Build a binary heap of N elements; this takes O(N) time. Then perform N deletemin operations, which requires N extra locations.

To reduce the space complexity, build a max heap and perform delete-maximum operations. Each maximum can be placed in the last location of the (shrinking) heap. The same can be implemented by swapping the first and last elements of the heap and percolating down the first element. Repeating this procedure N - 1 times sorts the given list. The function is:

void heapsort(int a[], int n)
{
    int i;
    for (i = n / 2; i >= 0; i--)     /* build the max heap */
        percolatedown(a, i, n);
    for (i = n - 1; i > 0; i--)
    {
        swap(&a[0], &a[i]);          /* move the current maximum to the end */
        percolatedown(a, 0, i);
    }
}

The average number of comparisons used to heapsort a random permutation of N distinct items is 2N log N - O(N log log N) = O(N log N).
