
# HASHING

CSCI 203
Hash tables

1. Hash Tables
3. Hash tables

## CSI 203 UoW Dubai H.M.

11/25/2018 Khelalfa 2
Hash tables
• Many applications require:
  • A dynamic set (a set that can grow and shrink)
  • Dictionary operations:
    • INSERT
    • SEARCH
    • DELETE

Hash table
• Effective for implementing dictionaries
• Searching for an element in a hash table can take as long as searching for an element in a linked list: Θ(n) in the worst case
• But, under reasonable assumptions, the expected time to search a hash table is O(1)

Direct-address tables
• Simple technique
• Works well when the universe U of keys is reasonably small

• Assume an application needs a dynamic set
• For example, each student record may contain several fields (attributes), such as:
  • Student ID number
  • Name
  • Date of birth
  • Email
  • Phone number

• Each record has a key drawn from the universe U
• U = {0, 1, 2, ..., m-1}
• Think of the key as a student ID, for example
• Assume that m is not too large
• This dictionary (dynamic set) can be represented by an array, or direct-address table
• Each position, or slot, of the table corresponds to a key
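The direct-address idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `DirectAddressTable`, its method names, and the sample record are hypothetical and do not come from the slides.

```python
class DirectAddressTable:
    """Dictionary over keys 0..m-1 with one slot per possible key."""

    def __init__(self, m):
        # Slot k holds the record whose key is k, or None if there is none.
        self.slots = [None] * m

    def insert(self, key, record):
        self.slots[key] = record

    def search(self, key):
        return self.slots[key]

    def delete(self, key):
        self.slots[key] = None


table = DirectAddressTable(m=10)
table.insert(3, {"name": "Alice", "email": "alice@example.com"})
print(table.search(3)["name"])  # Alice
table.delete(3)
print(table.search(3))          # None
```

Each operation touches exactly one slot, which is why the approach is fast but needs one slot for every possible key.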

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

The key is the student ID; the satellite data are name, date of birth, sex, email, etc.

• This simple technique becomes a problem when U is large:
  1. Storing a table of size |U| can be impractical, even impossible (memory requirement).
  2. When the set K of keys stored in the dictionary is much smaller than the set U of all possible keys, we waste a lot of memory space.
• The solution is hashing

• Direct addressing: an element with key k is stored in slot k
• Hashing: an element with key k is stored in slot h(k)

1. The hash function h computes the slot from the key k
2. h maps the universe U of keys into the slots of a hash table T[0..m-1]

$h : U \rightarrow \{0, 1, \ldots, m-1\}$

Hash table: basic idea
Hashing pays off when the set K of keys stored in a dictionary is much smaller than the universe U of all possible keys

Hash function requirements (1)
• Hash table size:
  1. Should not be excessively large compared to the number of keys
  2. But should be large enough not to jeopardize the time efficiency of the implementation

Hash function requirements (2)
• Hash function:
  1. Needs to distribute the keys among the cells of the table as uniformly as possible
  2. Must be easy to compute

Hash table
• Hash objective: instead of handling |U| values, we need only handle m values.
• Storage requirements are therefore reduced
• BUT ...

Hash table
• There is a potential problem: collision
• Two keys may hash to the same slot.

Hash table
• Collision resolution:
  • Chaining, also called open hashing or separate chaining
  • Closed hashing (also known as open addressing)

Hash table: chaining
• All elements that hash to the same slot are placed in the same linked list

How well does hashing with chaining perform?
• Assume a hash table with m slots that stores n elements
• The worst case is terrible:
  1. All n keys hash to the same slot
  2. This creates a list of length n
  3. Worst-case search time is Θ(n) plus the time to compute the hash function

The load factor of table T is:

$\alpha = n / m$

That is, the average number of elements stored in a chain

Open chaining
• Keys are stored in linked lists attached to the cells of a hash table
• Each list contains all the keys hashed to its cell
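A minimal sketch of a chained table, assuming integer keys and a caller-supplied hash function. The class name `ChainedHashTable` and its methods are hypothetical, and Python lists stand in for the linked lists.

```python
class ChainedHashTable:
    """Hash table in which each cell holds a chain of the keys hashed to it."""

    def __init__(self, m, h):
        self.m = m                            # number of cells
        self.h = h                            # hash function (assumed to return an int)
        self.cells = [[] for _ in range(m)]   # one chain per cell

    def _chain(self, key):
        return self.cells[self.h(key) % self.m]

    def insert(self, key):
        chain = self._chain(key)
        if key not in chain:                  # avoid storing duplicates
            chain.append(key)

    def search(self, key):
        return key in self._chain(key)        # scans only one chain

    def delete(self, key):
        chain = self._chain(key)
        if key in chain:
            chain.remove(key)


table = ChainedHashTable(m=7, h=lambda k: k)
for key in (10, 17, 3):                       # all three keys hash to cell 3
    table.insert(key)
print(table.cells[3])      # [10, 17, 3]
print(table.search(17))    # True
```

INSERT, SEARCH and DELETE each inspect a single chain, so their cost depends on the chain length rather than on the total number of keys.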

Open chaining example
• List of objects:
  • A, FOOL, AND, HIS, MONEY, ARE, SOON, PARTED
• Example of a hash function h:
  • Add the positions of a word's letters in the alphabet
  • Compute the sum's remainder after division by 13
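The hash function described above can be written directly in Python. This sketch assumes the words contain only the letters A to Z; the parameter name `m` for the divisor is an assumption.

```python
def h(word, m=13):
    """Sum the alphabet positions of the letters (A=1, ..., Z=26), then take mod m."""
    return sum(ord(c) - ord("A") + 1 for c in word.upper()) % m


words = ["A", "FOOL", "AND", "HIS", "MONEY", "ARE", "SOON", "PARTED"]
print([h(w) for w in words])  # [1, 9, 6, 10, 7, 11, 11, 12]
```

ARE and SOON both hash to 11, so this small input already produces one collision.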

Open chaining example
h(A) = 1 mod 13 = 1
h(FOOL) = (6 + 15 + 15 + 12) mod 13 = 9

| Key  | A | FOOL | AND | HIS | MONEY | ARE | SOON | PARTED |
|------|---|------|-----|-----|-------|-----|------|--------|
| Hash | 1 | 9    | 6   | 10  | 7     | 11  | 11   | 12     |

| Cell  | 0 | 1 | 2 | 3 | 4 | 5 | 6   | 7     | 8 | 9    | 10  | 11         | 12     |
|-------|---|---|---|---|---|---|-----|-------|---|------|-----|------------|--------|
| Chain |   | A |   |   |   |   | AND | MONEY |   | FOOL | HIS | ARE → SOON | PARTED |

How do we perform a search?
• We apply to the search key the same procedure that was used to create the table.
• Example: search for the key KID in the hash table
  • Compute h(KID) = (11 + 9 + 4) mod 13 = 11
  • Look up the list attached to cell 11
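The search procedure can be sketched as follows, assuming the letter-sum hash function of the example and Python lists in place of linked lists. The helper names `build` and `search` are hypothetical.

```python
def h(word, m=13):
    """Letter-position sum modulo m (A=1, ..., Z=26)."""
    return sum(ord(c) - ord("A") + 1 for c in word.upper()) % m


def build(words, m=13):
    """Place every word in the chain of the cell it hashes to."""
    cells = [[] for _ in range(m)]
    for w in words:
        cells[h(w, m)].append(w)
    return cells


def search(cells, key):
    """Return (found, number of key comparisons made)."""
    comparisons = 0
    for stored in cells[h(key, len(cells))]:   # only this one chain is scanned
        comparisons += 1
        if stored == key:
            return True, comparisons
    return False, comparisons


cells = build(["A", "FOOL", "AND", "HIS", "MONEY", "ARE", "SOON", "PARTED"])
print(cells[11])              # ['ARE', 'SOON']
print(search(cells, "KID"))   # (False, 2)
print(search(cells, "SOON"))  # (True, 2)
```

KID hashes to cell 11, is compared with ARE and SOON, and is not found.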

Open chaining example
h(KID) = 11: we must traverse the linked list attached to cell 11 (ARE → SOON)

Open chaining example
• Let α = n/m be the load factor, with the n keys distributed among the m cells
• S = average number of pointers inspected in a successful search
• U = average number of pointers inspected in an unsuccessful search

Open chaining example

$S \approx 1 + \frac{\alpha}{2}$

$U = \alpha$
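To get a feel for these estimates, the sketch below evaluates them for the eight-word example (n = 8, m = 13) and compares S with the average actually measured on that table. The function names are hypothetical. The measurement counts key comparisons and assumes each key is appended to the end of its chain.

```python
def chaining_estimates(n, m):
    """Return the estimates (S, U) for separate chaining with load factor n/m."""
    alpha = n / m
    return 1 + alpha / 2, alpha


def measured_successful(hashes):
    """Average comparisons needed to find each stored key.

    `hashes` lists the hash value of every key in insertion order; a key that
    is the i-th one in its chain costs i comparisons to find.
    """
    chain_length = {}
    total = 0
    for value in hashes:
        chain_length[value] = chain_length.get(value, 0) + 1
        total += chain_length[value]
    return total / len(hashes)


s, u = chaining_estimates(n=8, m=13)
print(round(s, 3), round(u, 3))                               # 1.308 0.615
print(measured_successful([1, 9, 6, 10, 7, 11, 11, 12]))      # 1.125
```

The formulas describe averages over an even distribution of keys, so a single small table need not match them exactly.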

Open chaining example
• The load factor should be close to 1.
• What if it is too small?
  • Lots of empty lists: inefficient use of space
• What if it is too large?
  • Longer linked lists: longer search times

Closed hashing
• All keys are stored in the hash table itself
• What does this imply for the table size m, given that we have n keys?
• The table size m must be at least equal to n

Closed hashing
• What happens if there is a collision?

Closed hashing: linear probing as a solution to collisions
1. Check the cell following the one where the collision occurs.
2. If that cell is empty, the new key is stored there.
3. If it is not, the availability of that cell's immediate successor is checked, and so on.
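The probing steps above can be sketched as an insertion routine. To keep the illustration separate from the word example, it assumes integer keys and the hash function h(k) = k mod m; both are assumptions made here, as is wrapping around to cell 0 after the last cell.

```python
def insert(table, key):
    """Insert key with linear probing; return the cell where it was stored."""
    m = len(table)
    i = key % m                      # assumed hash function: h(k) = k mod m
    for _ in range(m):               # at most m cells can be examined
        if table[i] is None:         # empty cell: store the key here
            table[i] = key
            return i
        i = (i + 1) % m              # occupied: try the next cell, wrapping around
    raise RuntimeError("hash table is full")


table = [None] * 7
for key in (10, 17, 3, 6, 13):
    insert(table, key)
print(table)  # [13, None, None, 10, 17, 3, 6]
```

Keys 10, 17 and 3 all hash to cell 3 and end up in cells 3, 4 and 5. Key 13 hashes to the occupied cell 6 and wraps around to cell 0.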

Exercise
Linear probing

| Key  | A | FOOL | AND | HIS | MONEY | ARE | SOON | PARTED |
|------|---|------|-----|-----|-------|-----|------|--------|
| Hash | 1 | 9    | 6   | 10  | 7     | 11  | 11   | 12     |

Insert the keys into cells 0 to 12 one at a time: A goes into cell 1, FOOL into cell 9, AND into cell 6, and so on.

Search for KID, h(KID) = 11
• We compare KID with ARE, SOON, PARTED, A
• The next cell (cell 2) is empty, so we stop at that point: the search is unsuccessful
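This search can be traced with a short sketch, assuming the letter-sum hash function of the example and wrap-around from cell 12 to cell 0. The function names are hypothetical, and `search` also returns the keys it compared so the trace can be checked against the slide.

```python
def h(word, m=13):
    """Letter-position sum modulo m (A=1, ..., Z=26)."""
    return sum(ord(c) - ord("A") + 1 for c in word.upper()) % m


def insert(table, key):
    i = h(key, len(table))
    while table[i] is not None:      # assumes the table is never completely full
        i = (i + 1) % len(table)
    table[i] = key


def search(table, key):
    """Return (cell index or None, list of keys the search compared against)."""
    m = len(table)
    i = h(key, m)
    compared = []
    for _ in range(m):
        if table[i] is None:         # empty cell: the key cannot be in the table
            return None, compared
        compared.append(table[i])
        if table[i] == key:
            return i, compared
        i = (i + 1) % m
    return None, compared


table = [None] * 13
for word in ["A", "FOOL", "AND", "HIS", "MONEY", "ARE", "SOON", "PARTED"]:
    insert(table, word)

print(search(table, "KID"))   # (None, ['ARE', 'SOON', 'PARTED', 'A'])
print(search(table, "SOON"))  # (12, ['ARE', 'SOON'])
```

The unsuccessful search for KID costs four comparisons because it has to walk through the whole cluster in cells 11, 12, 0 and 1.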

Search for LIT, h(LIT) = 2
• Cell 2 is empty, so we stop immediately: the search is unsuccessful

Assume we delete the key ARE
• Cell 11 becomes empty

Assume that after deleting ARE, we search for the key SOON; h(SOON) = 11
• Cell 11 is now empty, so the search stops and is reported as unsuccessful, even though SOON is still stored in the table

Simple solution
• Lazy deletion
• Mark previously occupied cells with a special symbol to distinguish them from cells that have never been occupied.
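A minimal sketch of this marking scheme, assuming the letter-sum hash function of the example. The marker name `DELETED` and the function names are hypothetical. For brevity, `insert` reuses the first free or marked cell and does not check whether the key is already stored further along the probe sequence.

```python
DELETED = object()   # special marker for a cell that was occupied and then freed


def h(word, m=13):
    return sum(ord(c) - ord("A") + 1 for c in word.upper()) % m


def insert(table, key):
    m = len(table)
    i = h(key, m)
    for _ in range(m):
        if table[i] is None or table[i] is DELETED:   # both kinds of cell are free
            table[i] = key
            return i
        i = (i + 1) % m
    raise RuntimeError("hash table is full")


def search(table, key):
    m = len(table)
    i = h(key, m)
    for _ in range(m):
        if table[i] is None:          # never occupied: the key cannot lie further on
            return None
        if table[i] is not DELETED and table[i] == key:
            return i
        i = (i + 1) % m               # occupied or marked: keep probing
    return None


def delete(table, key):
    i = search(table, key)
    if i is not None:
        table[i] = DELETED            # mark the cell instead of emptying it
    return i


table = [None] * 13
for word in ["A", "FOOL", "AND", "HIS", "MONEY", "ARE", "SOON", "PARTED"]:
    insert(table, word)

delete(table, "ARE")
print(search(table, "SOON"))  # 12
print(search(table, "ARE"))   # None
```

Because cell 11 is marked rather than emptied, the search for SOON probes past it and still finds the key in cell 12.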

Linear probing complexity

$S \approx \frac{1}{2}\left(1 + \frac{1}{1-\alpha}\right)$

$U \approx \frac{1}{2}\left(1 + \frac{1}{(1-\alpha)^2}\right)$

Linear probing

| α   | S   | U   |
|-----|-----|-----|
| 50% | 1.5 | 2.5 |
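The table entries follow from the approximations S ≈ ½(1 + 1/(1 − α)) and U ≈ ½(1 + 1/(1 − α)²). The sketch below evaluates them for a few load factors. The function name is hypothetical, and the formulas only make sense for 0 ≤ α < 1.

```python
def linear_probing_estimates(alpha):
    """Return the estimates (S, U) for linear probing at load factor alpha."""
    if not 0 <= alpha < 1:
        raise ValueError("load factor must satisfy 0 <= alpha < 1")
    s = 0.5 * (1 + 1 / (1 - alpha))
    u = 0.5 * (1 + 1 / (1 - alpha) ** 2)
    return s, u


for alpha in (0.5, 0.75, 0.9):
    s, u = linear_probing_estimates(alpha)
    print(f"{alpha:.0%}  S = {s:.1f}  U = {u:.1f}")
```

For α = 50% this reproduces the values 1.5 and 2.5. The cost of an unsuccessful search grows quickly as α approaches 1.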

Linear probing: clustering
• As the table gets closer to being full, clustering occurs
• Here, a cluster means a sequence of contiguously occupied cells, with possible wrapping
• Clusters make dictionary operations inefficient

Linear probing: clustering
• As clusters become larger, the probability that a new element will be attached to a cluster increases

One solution: double hashing
• Another hash function, s(K), is used
• It determines a fixed increment for the probing sequence, to be used after a collision at location l = h(K):
  • (l + s(K)) mod m,
  • (l + 2s(K)) mod m,
  • ...

Double hashing
• To guarantee that every location in the table is probed by sequence (a), the increment s(K) and the table size m must be relatively prime (their gcd is 1)
• Some recommend:
  • s(K) = m - 2 - (K mod (m - 2)), or s(K) = 8 - (K mod 8), for small tables
  • s(K) = (K mod 97) + 1 for large tables

$(l + s(K)) \bmod m,\quad (l + 2s(K)) \bmod m,\quad \ldots \qquad (a)$
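The role of the relatively-prime condition can be checked with a small sketch. The function name `probe_sequence` is hypothetical, and the table sizes and increments below are chosen only for illustration. The returned list starts with the home cell l itself, followed by the cells of sequence (a).

```python
from math import gcd


def probe_sequence(l, s, m):
    """Cells examined for home cell l and increment s: l, (l+s) mod m, (l+2s) mod m, ..."""
    return [(l + i * s) % m for i in range(m)]


# m = 13 and s = 5 are relatively prime: every cell is reached.
seq = probe_sequence(l=11, s=5, m=13)
print(seq)                       # [11, 3, 8, 0, 5, 10, 2, 7, 12, 4, 9, 1, 6]
print(gcd(5, 13), len(set(seq)))  # 1 13

# m = 12 and s = 4 share the factor 4: only 3 of the 12 cells are ever probed.
print(gcd(4, 12), len(set(probe_sequence(l=0, s=4, m=12))))  # 4 3
```

Choosing a prime table size m is one simple way to make every increment between 1 and m - 1 relatively prime to m.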

Exercise: open hashing
• For the input 30, 30, 36, 75, 31, 19,
• and the hash function h(K) = K mod 11:
  • Construct the open hash table
  • Find the largest number of key comparisons in a successful search in this table
  • Find the average number of key comparisons in a successful search in this table
