[Figure: the hash function index = h(key) maps the key space (e.g., integers, strings) to table indices 0 through size - 1]
The goal:
Aim for constant-time find, insert, and delete "on
average" under reasonable assumptions
July 9, 2012 CSE 332 Data Abstractions, Summer 2012 2
An Ideal Hash Function
Is fast to compute
Rarely hashes two keys to the same index
Such clashes are known as collisions
Zero collisions often impossible in theory but
reasonably achievable in practice
An inherent trade-off:
hashing-time vs. collision-avoidance
Hashing Integers
key space = integers
TableSize = 10
Insert keys 7, 18, 41, 34, 10
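The insertions above can be sketched as follows. This is a minimal illustration that assumes the hash function h(key) = key % TableSize (the slide text does not state the function, so that choice is an assumption) and does no collision handling, which these five keys happen not to need.

```python
TABLE_SIZE = 10  # TableSize from the slide


def h(key: int) -> int:
    # Assumed hash function: remainder modulo the table size.
    return key % TABLE_SIZE


table = [None] * TABLE_SIZE
for key in [7, 18, 41, 34, 10]:
    index = h(key)
    # This sketch has no collision resolution; fail loudly if one occurs.
    assert table[index] is None, "collision: slot already occupied"
    table[index] = key

print(table)  # [10, 41, None, None, 34, None, None, 7, 18, None]
```

Under this assumption each key lands in the slot given by its last decimal digit, so a sixth key such as 17 would collide with 7.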
Programming Trade-off:
Calculation speed
Avoiding distinct keys hashing to same ints
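$h(K) = s_0 \mathbin{\%} \mathit{TableSize}$
$h(K) = \left( \sum_{i=0}^{k-1} s_i \right) \mathbin{\%} \mathit{TableSize}$
$h(K) = \left( \sum_{i=0}^{k-1} s_i \cdot 37^i \right) \mathbin{\%} \mathit{TableSize}$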
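The three string hash functions above might be written as below. This is a sketch under assumptions: each character s_i is taken to be its code point via `ord`, the key is non-empty, and the function names are hypothetical labels chosen for illustration.

```python
def hash_first_char(s: str, table_size: int) -> int:
    # h(K) = s_0 % TableSize: fastest, but every key with the same
    # first character collides.
    return ord(s[0]) % table_size


def hash_char_sum(s: str, table_size: int) -> int:
    # h(K) = (sum of s_i) % TableSize: uses every character, but
    # anagrams collide because addition ignores character order.
    return sum(ord(c) for c in s) % table_size


def hash_polynomial(s: str, table_size: int) -> int:
    # h(K) = (sum of s_i * 37^i) % TableSize: weights each character
    # by its position, so reordering characters changes the sum.
    total = 0
    for i, c in enumerate(s):
        total += ord(c) * 37 ** i
    return total % table_size


for name in (hash_first_char, hash_char_sum, hash_polynomial):
    print(name.__name__, name("abc", 101), name("cba", 101))
```

With a table size of 101 (an arbitrary choice for the demonstration), the anagrams "abc" and "cba" collide under the character sum but land on different indices under the positional version, which illustrates the trade-off between calculation speed and collision avoidance.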
COLLISION RESOLUTION
Open Addressing
Linear Probing
Quadratic Probing
Double Hashing
0: 10
1: (empty)
2: 42 → 12 → 22
3: (empty)
4: (empty)
5: (empty)
6: 86
7: (empty)
8: (empty)
9: (empty)
$\lambda = \frac{n}{\mathit{TableSize}} = \frac{5}{10} = 0.5$
Load Factor?
0: 10
1: 71 → 2 → 31
2: 42 → 12 → 22
3: 63 → 73
4: (empty)
5: 75 → 5 → 65 → 95
6: 86
7: 27 → 47
8: 88 → 18 → 38 → 98
9: 99
$\lambda = \frac{n}{\mathit{TableSize}} = \frac{21}{10} = 2.1$
Rigorous Separate Chaining Analysis
The load factor, $\lambda$, of a hash table is calculated as
$\lambda = \frac{n}{\mathit{TableSize}}$
where $n$ is the number of items currently in the table
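As a quick check of this definition, the sketch below recomputes the load factor for the two separate-chaining tables shown earlier. It assumes each table is represented as a Python list of chains (one list per index); the helper name `load_factor` is hypothetical.

```python
def load_factor(chains: list[list[int]]) -> float:
    # n is the total number of stored items across all chains.
    n = sum(len(chain) for chain in chains)
    # TableSize is the number of indices, occupied or not.
    return n / len(chains)


# The two example tables, transcribed chain by chain.
small = [[10], [], [42, 12, 22], [], [], [], [86], [], [], []]
large = [
    [10], [71, 2, 31], [42, 12, 22], [63, 73], [],
    [75, 5, 65, 95], [86], [27, 47], [88, 18, 38, 98], [99],
]

print(load_factor(small))  # 0.5
print(load_factor(large))  # 2.1
```

With separate chaining, the load factor is also the average chain length, which is why it can exceed 1.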
REHASHING
Rehash Algorithm:
Go through old table
Do standard insert for each item into new table
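The two steps above can be sketched as follows. The sketch assumes a separate-chaining table stored as a list of lists, integer keys, and the hash function key % TableSize; none of these choices are given in the slide text, and the helper names `insert` and `rehash` are hypothetical. The new table size is left to the caller because the slides do not say how to pick it.

```python
def insert(table: list[list[int]], key: int) -> None:
    # Standard insert: hash with the table's own size, then append
    # the key to the chain at that index.
    table[key % len(table)].append(key)


def rehash(old_table: list[list[int]], new_size: int) -> list[list[int]]:
    new_table = [[] for _ in range(new_size)]
    # Go through the old table...
    for chain in old_table:
        for key in chain:
            # ...and do a standard insert into the new table. Indices
            # must be recomputed because the table size has changed.
            insert(new_table, key)
    return new_table


old = [[] for _ in range(5)]
for key in [10, 42, 12, 22, 86]:
    insert(old, key)

new = rehash(old, 11)  # 11 is an arbitrary larger size for the demo
print(old)
print(new)
```

Items cannot simply be copied to the same index in the new table: 42, 12, and 22 share index 2 when the size is 5 but spread across indices 9, 1, and 0 when the size is 11.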
IMPLEMENTING HASHING