
Uploaded by api-3801329


1

Tables: rows & columns of information

• A telephone book may have fields: name, address, phone number
• A user account table may have fields: user id, password, home folder

Name          Address                                  Phone
Sohail Aslam  50 Zahoor Elahi Rd, Gulberg-4, Lahore    576-3205
Imran Ahmad   30-T Phase-IV, LCCHS, Lahore             572-4409

2

Tables: rows & columns of information

• When we search a table, we usually know the contents of only one of the fields (not all of them); that field is the key
• In a telephone book, the key is usually “name”
• In a user account table, the key is usually “user id”

3

Tables: rows & columns of information

• If the key is “name” and no two entries in the telephone book have the same name, the key uniquely identifies the entries

Name          Address                                  Phone
Sohail Aslam  50 Zahoor Elahi Rd, Gulberg-4, Lahore    576-3205
Imran Ahmad   30-T Phase-IV, LCCHS, Lahore             572-4409

4

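The Table ADT: operations

• insert: given a key and an entry, inserts the entry into the table
• find: given a key, finds the entry associated with the key
• remove: given a key, finds the entry associated with the key, and removes it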

5

How should we implement a table?

The choice of implementation depends on the answers to the following questions:
• How many of the possible key values are likely to be used?
• What is the likely pattern of searching for keys? E.g. will most of the accesses be to just one or two key values?
• Is the table small enough to fit into memory?
• How long will the table exist?

6

TableNode: a key and its entry

• For searching purposes, it is convenient to store the key and the entry separately (even though the key’s value may be inside the entry)

TableNode examples:

key        entry
“Saleem”   “Saleem”, “124 Hawkers Lane”, “9675846”
“Yunus”    “Yunus”, “1 Apple Crescent”, “0044 1970 622455”

7

Implementation 1: unsorted sequential array

• An array in which TableNodes are stored consecutively in any order
• insert: add to back of array; O(1)
• find: search through the keys one at a time, potentially all of the keys; O(n)
• remove: find + replace removed node with last node; O(n)

[Figure: array with slots 0, 1, 2, 3, … and so on, each holding a key and an entry]
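A minimal sketch of the three operations on an unsorted array, to make the costs concrete. The names `ArrayTable`, `table_insert`, `table_find`, `table_remove` and the fixed `CAPACITY` are assumptions made for illustration; they do not come from the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CAPACITY 100                  /* assumed fixed capacity */

typedef struct {
    const char *key;                  /* e.g. a name */
    const char *entry;                /* e.g. the full record */
} TableNode;

typedef struct {
    TableNode nodes[CAPACITY];
    size_t count;                     /* number of slots in use */
} ArrayTable;

/* insert: add to the back of the array, O(1). Returns 0 if full. */
int table_insert(ArrayTable *t, const char *key, const char *entry)
{
    assert(t != NULL && key != NULL);
    if (t->count == CAPACITY)
        return 0;
    t->nodes[t->count].key = key;
    t->nodes[t->count].entry = entry;
    t->count++;
    return 1;
}

/* find: scan the keys one at a time, O(n). Returns NULL if absent. */
const char *table_find(const ArrayTable *t, const char *key)
{
    size_t i;
    assert(t != NULL && key != NULL);
    for (i = 0; i < t->count; i++)
        if (strcmp(t->nodes[i].key, key) == 0)
            return t->nodes[i].entry;
    return NULL;
}

/* remove: find the node, then overwrite it with the last node, O(n). */
int table_remove(ArrayTable *t, const char *key)
{
    size_t i;
    assert(t != NULL && key != NULL);
    for (i = 0; i < t->count; i++) {
        if (strcmp(t->nodes[i].key, key) == 0) {
            t->nodes[i] = t->nodes[t->count - 1];
            t->count--;
            return 1;
        }
    }
    return 0;
}
```

Moving the last node into the hole keeps the nodes consecutive without shuffling, which is only acceptable because the order does not matter here.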

8

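Implementation 2: sorted sequential array

• An array in which TableNodes are stored consecutively, sorted by key
• insert: add in sorted order; O(n)
• find: binary search; O(log n)
• remove: find, remove node and shuffle down; O(n)
• We can use binary search because the array elements are sorted

[Figure: array with slots 0, 1, 2, 3, … and so on, each holding a key and an entry]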

9

Searching an Array: Binary Search

• Binary search is like looking up a name in the phone book or a word in the dictionary
• Start in middle of book
• If name you're looking for comes before names on page, look in first half
• Otherwise, look in second half
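The phone-book procedure above can be sketched as follows. This is an illustrative version over a sorted array of strings; the function name `binary_search` and its signature are assumptions, not taken from the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the index of key in the sorted array keys[0..n-1], or -1. */
int binary_search(const char *const keys[], int n, const char *key)
{
    int low = 0, high = n - 1;
    assert(keys != NULL && key != NULL && n >= 0);
    while (low <= high) {
        int mid = low + (high - low) / 2;     /* "start in the middle" */
        int cmp = strcmp(key, keys[mid]);
        if (cmp == 0)
            return mid;                       /* found */
        if (cmp < 0)
            high = mid - 1;                   /* look in first half */
        else
            low = mid + 1;                    /* look in second half */
    }
    return -1;                                /* not present */
}
```

Each pass halves the remaining range, which is where the O(log n) cost of find in a sorted array comes from.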

10

Implementation 3: linked list

• A linked list in which TableNodes are stored consecutively (unsorted or sorted)
• insert: add to front; O(1), or O(n) for a sorted list
• find: search through potentially all the keys, one at a time; O(n) for an unsorted or for a sorted list
• remove: find, remove using pointer alterations; O(n)

[Figure: chain of linked nodes, each holding a key and an entry, and so on]

11

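Implementation 4: AVL tree

• An AVL tree of TableNodes, ordered by key
• insert: a standard insert; O(log n)
• find: a standard find (without removing, of course); O(log n)
• remove: a standard remove; O(log n)

[Figure: binary tree of nodes, each holding a key and an entry, and so on]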

12

Anything better?

• So far, the time for insert, find and remove varies between constant and log n.
• It would be nice if all three were constant time operations!

13

Implementation 5: Hashing

• An array in which TableNodes are not stored consecutively
• Their place of storage is calculated using the key and a hash function
• Keys and entries are scattered throughout the array

Key → hash function → array index

[Figure: array with TableNodes (key, entry) at scattered indices 4, 10 and 123]

14

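Hashing

• insert: calculate place of storage, insert TableNode; O(1)
• find: calculate place of storage, retrieve entry; O(1)
• remove: calculate place of storage, set it to null; O(1)
• All are constant time, O(1)!

[Figure: array with TableNodes (key, entry) at scattered indices 4, 10 and 123]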

15

Hashing

• We use an array of some fixed size T to hold the data. T is typically prime.
• Each key is mapped into some number in the range 0 to T-1 using a hash function, which ideally should be efficient to compute.

16

Example: fruits

• Suppose our hash function gave us the following values:

hashCode("apple") = 5
hashCode("watermelon") = 3
hashCode("grapes") = 8
hashCode("cantaloupe") = 7
hashCode("kiwi") = 0
hashCode("strawberry") = 9
hashCode("mango") = 6
hashCode("banana") = 2

index  value
0      kiwi
1
2      banana
3      watermelon
4
5      apple
6      mango
7      cantaloupe
8      grapes
9      strawberry

17

Example

• Store the data in the table array:

table[5] = "apple"
table[3] = "watermelon"
table[8] = "grapes"
table[7] = "cantaloupe"
table[0] = "kiwi"
table[9] = "strawberry"
table[6] = "mango"
table[2] = "banana"

[Figure: the resulting 10-slot table: 0 kiwi, 1 (empty), 2 banana, 3 watermelon, 4 (empty), 5 apple, 6 mango, 7 cantaloupe, 8 grapes, 9 strawberry]

18

Example

• In effect, the table is indexed by key:

table["apple"]
table["watermelon"]
table["grapes"]
table["cantaloupe"]
table["kiwi"]
table["strawberry"]
table["mango"]
table["banana"]

[Figure: the same 10-slot table: 0 kiwi, 1 (empty), 2 banana, 3 watermelon, 4 (empty), 5 apple, 6 mango, 7 cantaloupe, 8 grapes, 9 strawberry]

19

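Example Hash Functions

• If the keys are strings, the hash function is some function of the characters in the strings.
• One possibility is to simply add the ASCII values of the characters:

$$h(\mathit{str}) = \left( \sum_{i=0}^{\mathit{length}-1} \mathit{str}[i] \right) \bmod \mathit{TableSize}$$

Example:

$$h(\text{ABC}) = (65 + 66 + 67) \bmod \mathit{TableSize}$$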

20

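Finding the hash function

#include <string.h>

/* The function header was lost from the slide; the name hashCode is taken
   from the examples. TABLESIZE is assumed to be defined elsewhere as T. */
int hashCode(const char *s)
{
    int i, sum;
    int len = (int)strlen(s);              /* compute the length once */
    sum = 0;
    for (i = 0; i < len; i++)
        sum = sum + (unsigned char)s[i];   /* ASCII value */
    return sum % TABLESIZE;
}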

21

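Example Hash Functions

• Another possibility is to convert the string into some number in some arbitrary base b (b also might be a prime number):

$$h(\mathit{str}) = \left( \sum_{i=0}^{\mathit{length}-1} \mathit{str}[i] \times b^{i} \right) \bmod T$$

Example:

$$h(\text{ABC}) = (65 b^{0} + 66 b^{1} + 67 b^{2}) \bmod T$$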

22

Example Hash Functions

• If the keys are integers, then key%T is generally a good hash function, unless the data has some undesirable features.
• For example, if T = 10 and all keys end in zeros, then key%T = 0 for all keys.
• In general, to avoid situations like this, T should be a prime number.
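A small illustration of the point about T, under the assumption of non-negative integer keys. The helper `distinct_slots` is hypothetical; it counts how many different slots a set of keys lands in under key % T.

```c
#include <assert.h>
#include <stddef.h>

/* Counts how many distinct slots the keys occupy under key % T. */
int distinct_slots(const int keys[], int n, int T)
{
    int used[64] = {0};               /* demo limit: assumes T <= 64 */
    int i, count = 0;
    assert(keys != NULL && n >= 0 && T > 0 && T <= 64);
    for (i = 0; i < n; i++) {
        int slot = keys[i] % T;       /* keys assumed non-negative */
        if (!used[slot]) {
            used[slot] = 1;
            count++;
        }
    }
    return count;
}
```

For the keys 10, 20, …, 100, T = 10 puts every key in slot 0 (one distinct slot), while the prime T = 11 spreads them over ten different slots.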

23

Collision

• Suppose our hash function gave us the following values:

hash("apple") = 5
hash("watermelon") = 3
hash("grapes") = 8
hash("cantaloupe") = 7
hash("kiwi") = 0
hash("strawberry") = 9
hash("mango") = 6
hash("banana") = 2
hash("honeydew") = 6

• Now what?

[Figure: the 10-slot table: 0 kiwi, 1 (empty), 2 banana, 3 watermelon, 4 (empty), 5 apple, 6 mango, 7 cantaloupe, 8 grapes, 9 strawberry]

24

Collision

• When two values hash to the same array location, this is called a collision
• Collisions are normally treated as “first come, first served”: the first value that hashes to the location gets it
• We have to find something to do with the second and subsequent values that hash to this same location.

25

Solution for Handling collisions

• Solution #1: Search from there for an empty location
• Can stop searching when we find the value or an empty location.
• Search must be wrap-around at the end.

26

Solution for Handling collisions

• Solution #2: Use a second hash function
• ...and a third, and a fourth, and a fifth, ...

27

Solution for Handling collisions

• Solution #3: Use the array location as the header of a linked list of values that hash to this location

28

Solution 1: Open Addressing

• This approach of handling collisions is called open addressing; it is also known as closed hashing.
• More formally, cells at h0(x), h1(x), h2(x), … are tried in succession where

$$h_i(x) = (\mathit{hash}(x) + f(i)) \bmod \mathit{TableSize}$$

with f(0) = 0.
• The function, f, is the collision resolution strategy.

29

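Linear Probing

• We use f(i) = i, i.e., f is a linear function of i. Thus h_i(x) = (hash(x) + i) mod TableSize
• The collision resolution strategy is called linear probing because it scans the array sequentially (with wrap around) in search of an empty cell.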

30

Linear Probing: insert

Suppose you want to add seagull to this hash table. Also suppose:
• hashCode(“seagull”) = 143
• table[143] is not empty
• table[143] != seagull
• table[144] is not empty
• table[144] != seagull
• table[145] is empty
Therefore, put seagull at location 145

Table after the insert: ..., 141 (empty), 142 robin, 143 sparrow, 144 hawk, 145 seagull, 146 (empty), 147 bluejay, 148 owl, ...

31

Linear Probing: insert

Suppose you want to add hawk to this hash table. Also suppose:
• hashCode(“hawk”) = 143
• table[143] is not empty
• table[143] != hawk
• table[144] is not empty
• table[144] == hawk
hawk is already in the table, so do nothing.

Table: ..., 141 (empty), 142 robin, 143 sparrow, 144 hawk, 145 seagull, 146 (empty), 147 bluejay, 148 owl, ...

32

Linear Probing: insert

Suppose:
• You want to add cardinal to this hash table
• hashCode(“cardinal”) = 147
• The last location is 148
• 147 and 148 are occupied
Solution:
• Treat the table as circular; after 148 comes 0
• Hence, cardinal goes in location 0 (or 1, or 2, or ...)

Table: ..., 141 (empty), 142 robin, 143 sparrow, 144 hawk, 145 seagull, 146 (empty), 147 bluejay, 148 owl

33

Linear Probing: find

Suppose you want to look up hawk in this hash table. We proceed as follows:
• hashCode(“hawk”) = 143
• table[143] is not empty
• table[143] != hawk
• table[144] is not empty
• table[144] == hawk (found!)
We use the same procedure for looking things up in the table as we do for inserting them

Table: ..., 141 (empty), 142 robin, 143 sparrow, 144 hawk, 145 seagull, 146 (empty), 147 bluejay, 148 owl, ...
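Because insert and find follow the same procedure, both can share one probing loop. The sketch below assumes string keys, the additive hash from the earlier slide, and a small table; `TABLESIZE`, `probe`, `lp_insert` and `lp_find` are illustrative names, not the deck's own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TABLESIZE 11                  /* assumed small prime table size */

/* Additive hash: sum of character values, modulo the table size. */
static int hash_code(const char *s)
{
    int sum = 0;
    for (; *s != '\0'; s++)
        sum += (unsigned char)*s;
    return sum % TABLESIZE;
}

/* Returns the slot holding key, or the first empty slot on its probe
   path, or -1 if the table is full and key is absent. */
static int probe(const char *table[], const char *key)
{
    int home, i;
    assert(table != NULL && key != NULL);
    home = hash_code(key);
    for (i = 0; i < TABLESIZE; i++) {
        int slot = (home + i) % TABLESIZE;        /* wrap around */
        if (table[slot] == NULL || strcmp(table[slot], key) == 0)
            return slot;
    }
    return -1;
}

/* insert: returns the slot used, or -1 if the table is full. */
int lp_insert(const char *table[], const char *key)
{
    int slot = probe(table, key);
    if (slot < 0)
        return -1;
    table[slot] = key;                /* no change if key was already there */
    return slot;
}

/* find: returns the slot holding key, or -1 if it is not in the table. */
int lp_find(const char *table[], const char *key)
{
    int slot = probe(table, key);
    return (slot >= 0 && table[slot] != NULL) ? slot : -1;
}
```

With this hash, "ab" and "ba" both hash to 8, so the second goes to slot 9; "b" and "m" both hash to 10, so "m" wraps around to slot 0.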

34

Linear Probing and Deletion

• Suppose an item is placed several probes past its home location, and then the item just before it is deleted
• How will probe determine that the “hole” does not indicate the item is not in the array?
• Have three states for each location
  • Occupied
  • Empty (never used)
  • Deleted (previously used)
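One way to realise the three states is sketched below, assuming non-negative integer keys and linear probing. The names `ProbeTable`, `pt_find`, `pt_insert` and `pt_remove` are hypothetical. A search passes over Deleted slots and stops only at a never-used Empty slot.

```c
#include <assert.h>
#include <stddef.h>

#define TABLESIZE 11                           /* assumed prime table size */

typedef enum { EMPTY, OCCUPIED, DELETED } SlotState;

typedef struct {
    SlotState state[TABLESIZE];                /* all zero means all EMPTY */
    int key[TABLESIZE];
} ProbeTable;

/* find: skip DELETED slots, stop only at a never-used EMPTY slot. */
int pt_find(const ProbeTable *t, int key)
{
    int i;
    assert(t != NULL && key >= 0);
    for (i = 0; i < TABLESIZE; i++) {
        int slot = (key % TABLESIZE + i) % TABLESIZE;
        if (t->state[slot] == EMPTY)
            return -1;
        if (t->state[slot] == OCCUPIED && t->key[slot] == key)
            return slot;
    }
    return -1;
}

/* insert: reuse the first free slot unless key turns up further on. */
int pt_insert(ProbeTable *t, int key)
{
    int i, free_slot = -1;
    assert(t != NULL && key >= 0);
    for (i = 0; i < TABLESIZE; i++) {
        int slot = (key % TABLESIZE + i) % TABLESIZE;
        if (t->state[slot] == OCCUPIED) {
            if (t->key[slot] == key)
                return slot;                   /* already present */
        } else {
            if (free_slot < 0)
                free_slot = slot;
            if (t->state[slot] == EMPTY)
                break;                         /* key cannot be further on */
        }
    }
    if (free_slot < 0)
        return -1;                             /* table full */
    t->state[free_slot] = OCCUPIED;
    t->key[free_slot] = key;
    return free_slot;
}

/* remove: mark the slot DELETED rather than EMPTY. */
int pt_remove(ProbeTable *t, int key)
{
    int slot = pt_find(t, key);
    if (slot < 0)
        return 0;
    t->state[slot] = DELETED;
    return 1;
}
```

Keys 3, 14 and 25 all hash to slot 3 and occupy slots 3, 4 and 5. After 14 is removed, 25 is still found because slot 4 is marked Deleted, not Empty.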

35

Clustering

• One problem with the linear probing technique is the tendency to form “clusters”.
• A cluster is a group of items not containing any open slots
• The bigger a cluster gets, the more likely it is that new values will hash into the cluster, and make it ever bigger.
• Clusters cause efficiency to degrade.

36

Quadratic Probing

• Use F(i) = i^2 to resolve collisions
• If hash function resolves to H and a search in cell H is inconclusive, try H + 1^2, H + 2^2, H + 3^2, …
• Probe array[hash(key)+1^2], then array[hash(key)+2^2], then array[hash(key)+3^2], and so on
• Virtually eliminates primary clusters
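The probe sequence can be sketched as a single function. The name `quadratic_probe` is an assumption, and so is the wrap-around modulo the table size, which the slide does not show explicitly but which is applied here as in linear probing.

```c
#include <assert.h>

/* Slot tried on the i-th probe: (home + i*i) % table_size, i = 0, 1, 2, ... */
int quadratic_probe(int home, int i, int table_size)
{
    assert(table_size > 0 && home >= 0 && home < table_size && i >= 0);
    return (int)(((long long)home + (long long)i * i) % table_size);
}
```

For home = 6 and a table of size 11, the first five probes visit slots 6, 7, 10, 4 and 0, so successive probes move away from the home slot instead of filling the cells right next to it.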

37

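Collision resolution: chaining

• Each table position is a linked list
• Add the keys and entries anywhere in the list (front easiest)

[Figure: array positions 4, 10 and 123, each heading a linked list of (key, entry) nodes]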

38

Collision resolution: chaining

Advantages over open addressing:
• Simpler insertion and removal
• Array size is not a limitation
Disadvantage:
• Memory overhead is large if entries are small.

[Figure: array positions 4, 10 and 123, each heading a linked list of (key, entry) nodes]
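A minimal chaining sketch, assuming string keys and the additive hash from the earlier slide. `ChainNode`, `chain_insert`, `chain_find` and `chain_remove` are illustrative names, and duplicate keys are not checked.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TABLESIZE 11                      /* assumed prime table size */

typedef struct ChainNode {
    const char *key;
    const char *entry;
    struct ChainNode *next;
} ChainNode;

/* Additive hash: sum of character values, modulo the table size. */
static int hash_code(const char *s)
{
    int sum = 0;
    for (; *s != '\0'; s++)
        sum += (unsigned char)*s;
    return sum % TABLESIZE;
}

/* insert: push a new node on the front of the chain.
   Returns 0 on allocation failure. */
int chain_insert(ChainNode *table[], const char *key, const char *entry)
{
    int slot;
    ChainNode *node;
    assert(table != NULL && key != NULL);
    node = malloc(sizeof *node);
    if (node == NULL)
        return 0;
    slot = hash_code(key);
    node->key = key;
    node->entry = entry;
    node->next = table[slot];
    table[slot] = node;
    return 1;
}

/* find: walk the chain at the key's slot. Returns NULL if absent. */
const char *chain_find(ChainNode *table[], const char *key)
{
    ChainNode *n;
    assert(table != NULL && key != NULL);
    for (n = table[hash_code(key)]; n != NULL; n = n->next)
        if (strcmp(n->key, key) == 0)
            return n->entry;
    return NULL;
}

/* remove: unlink the node by pointer alteration and free it. */
int chain_remove(ChainNode *table[], const char *key)
{
    ChainNode **link;
    assert(table != NULL && key != NULL);
    for (link = &table[hash_code(key)]; *link != NULL; link = &(*link)->next) {
        if (strcmp((*link)->key, key) == 0) {
            ChainNode *dead = *link;
            *link = dead->next;
            free(dead);
            return 1;
        }
    }
    return 0;
}
```

The extra `next` pointer in every node is the memory overhead mentioned above: it is the same size whether the entry is a large record or a single small value.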

39

Applications of Hashing

• Compilers use hash tables to keep track of declared variables (symbol table).
• A hash table can be used for on-line spelling checkers: if misspelling detection (rather than correction) is important, an entire dictionary can be hashed and words checked in constant time.

40

Applications of Hashing

• Game playing programs use hash tables to store seen positions, thereby saving computation time if the position is encountered again.
• Hash functions can be used to quickly check for inequality: if two elements hash to different values they must be different.

41

When is hashing suitable?

• Hash tables are very good if there is a need for many searches in a reasonably stable table.
• Hash tables are not so good if there are many insertions and deletions, or if table traversals are needed; in this case, AVL trees are better.
• Also, hashing is very slow for any operations which require the entries to be sorted, e.g. find the minimum key
