
Resmi N.G.
References: Data Structures and Algorithms, Alfred V. Aho, John E. Hopcroft, Jeffrey D. Ullman

Syllabus
Searching: Sequential Search (Searching Arrays and Linked Lists), Binary Searching (Searching Arrays and Binary Search Trees)
Hashing: Open and Closed Hashing, Hash Functions, Resolution of Collision
Sorting: n^2 Sorts (Bubble Sort, Insertion Sort, Selection Sort), n log n Sorts (Quick Sort, Heap Sort, Merge Sort), External Sort (Merge Files)
10/25/2012 CS09 303 Data Structures - Module 4 2

Bubble Sort
Bubble sort is a comparison-sort algorithm. The algorithm starts at one end of the data set. It compares two adjacent elements, and if they are in wrong order, it swaps them. It continues doing this, for each pair of adjacent elements, to the other end of the data set.


The algorithm gets its name from the way smaller elements "bubble" to the top of the list (or larger elements bubble to the end of the list). Because it only uses comparisons to operate on elements, it is a comparison sort.


Bubble Sort (from the beginning)


The algorithm starts at the beginning of the data set. It compares the first two elements, and if they are in the wrong order, it swaps them. It continues doing this for each pair of adjacent elements, up to the end of the data set. After the first pass, the largest element will be at the last position.

It then starts again with the first two elements, repeating the process until a pass completes with no swaps. After the second pass, the second largest element will be in the second-to-last position in the array, and so on.


Bubble Sort (from the end)


The algorithm starts at the end of the data set. It compares the last two elements, and if they are in the wrong order, it swaps them. It continues doing this for each pair of adjacent elements, down to the beginning of the data set. After the first pass, the smallest element will be at the first position.

It then starts again from the end, stopping short of the first element, which is already in its final position, and compares adjacent elements as before. The process repeats until a pass completes with no swaps. After the second pass, the second smallest element will be in its right position in the array, and so on.


Algorithm 1
{Smaller elements bubble to the beginning of the list.}
for i := 1 to n-1 do
    for j := n downto i+1 do
        if A[j] < A[j-1] then
        begin
            {swap A[j] and A[j-1]}
            temp := A[j];
            A[j] := A[j-1];
            A[j-1] := temp
        end;

Algorithm 2
{Larger elements bubble to the end of the list.}
for i := 1 to n-1 do
    for j := 1 to n-i do
        if A[j] > A[j+1] then
        begin
            {swap A[j] and A[j+1]}
            temp := A[j];
            A[j] := A[j+1];
            A[j+1] := temp
        end;
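
As a rough illustration, Algorithm 2 can be sketched in Python as follows. The function name `bubble_sort`, the 0-based indexing, and the `swapped` flag are assumptions of this sketch; the flag implements the "stop when a pass makes no swaps" idea from the description, which the pseudocode itself does not include.

```python
def bubble_sort(a):
    """Sort list a in place; larger elements bubble to the end of the list."""
    n = len(a)
    for i in range(1, n):              # passes i = 1 .. n-1
        swapped = False
        for j in range(0, n - i):      # 0-based form of j = 1 .. n-i
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swapped = True
        if not swapped:                # a pass without swaps: already sorted
            break
    return a


print(bubble_sort([5, 1, 3, 4, 6, 2]))
```

The early exit does not change the result; it only skips passes that would make no swaps.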

First Pass: i = 1
Index:                 1 2 3 4 5 6
Start:                 5 1 3 4 6 2
j=1: 5 > 1, swap    -> 1 5 3 4 6 2
j=2: 5 > 3, swap    -> 1 3 5 4 6 2
j=3: 5 > 4, swap    -> 1 3 4 5 6 2
j=4: 5 < 6, no swap -> 1 3 4 5 6 2
j=5: 6 > 2, swap    -> 1 3 4 5 2 6
Sorted so far: A[6] = 6

Second Pass: i = 2
Index:                 1 2 3 4 5 6
Start:                 1 3 4 5 2 6
j=1: 1 < 3, no swap -> 1 3 4 5 2 6
j=2: 3 < 4, no swap -> 1 3 4 5 2 6
j=3: 4 < 5, no swap -> 1 3 4 5 2 6
j=4: 5 > 2, swap    -> 1 3 4 2 5 6
Sorted so far: A[5..6] = 5 6

Third Pass: i = 3
Index:                 1 2 3 4 5 6
Start:                 1 3 4 2 5 6
j=1: 1 < 3, no swap -> 1 3 4 2 5 6
j=2: 3 < 4, no swap -> 1 3 4 2 5 6
j=3: 4 > 2, swap    -> 1 3 2 4 5 6
Sorted so far: A[4..6] = 4 5 6

Fourth Pass: i = 4
Index:                 1 2 3 4 5 6
Start:                 1 3 2 4 5 6
j=1: 1 < 3, no swap -> 1 3 2 4 5 6
j=2: 3 > 2, swap    -> 1 2 3 4 5 6
Sorted so far: A[3..6] = 3 4 5 6

Fifth Pass: i = 5
Index:                 1 2 3 4 5 6
Start:                 1 2 3 4 5 6
j=1: 1 < 2, no swap -> 1 2 3 4 5 6
The whole array is now sorted.

Selection Sort
The algorithm finds the minimum value, swaps it with the value in the first position, and repeats these steps for the remainder of the list. In the ith pass, the lowest among A[i], ..., A[n] is selected and swapped with A[i]. After i passes, the lowest i keys will occupy A[1], A[2], ..., A[i] in sorted order. It does no more than n swaps for an array of n elements.
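
The passes described above can be sketched in Python roughly as follows. The function name `selection_sort` and the variable `low` are illustrative choices, and the indices are 0-based rather than the 1-based indices used in the slides.

```python
def selection_sort(a):
    """Sort list a in place by repeatedly selecting the lowest remaining key."""
    n = len(a)
    for i in range(n - 1):
        low = i                        # index of the lowest key seen in a[i:]
        for j in range(i + 1, n):
            if a[j] < a[low]:
                low = j
        if low != i:                   # at most one swap per pass
            a[i], a[low] = a[low], a[i]
    return a


print(selection_sort([5, 1, 3, 4, 6, 2]))
```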

First Pass: i = 1
Index: 1 2 3 4 5 6
Start: 5 1 3 4 6 2
min = 1 (A[1] = 5)
j=2: A[2] = 1 < 5, so min = 2
j=3 to 6: 3, 4, 6, 2 are not less than 1, so min stays 2
Swap A[1] and A[2] -> 1 5 3 4 6 2
Sorted so far: A[1] = 1

Second Pass: i = 2
Index: 1 2 3 4 5 6
Start: 1 5 3 4 6 2
min = 2 (A[2] = 5)
j=3: A[3] = 3 < 5, so min = 3
j=4: A[4] = 4 is not less than 3
j=5: A[5] = 6 is not less than 3
j=6: A[6] = 2 < 3, so min = 6
Swap A[2] and A[6] -> 1 2 3 4 6 5
Sorted so far: A[1..2] = 1 2

Third Pass: i = 3
Index: 1 2 3 4 5 6
Start: 1 2 3 4 6 5
min = 3 (A[3] = 3)
j=4 to 6: 4, 6, 5 are not less than 3, so min stays 3
No swap -> 1 2 3 4 6 5
Sorted so far: A[1..3] = 1 2 3

Fourth Pass: i = 4
Index: 1 2 3 4 5 6
Start: 1 2 3 4 6 5
min = 4 (A[4] = 4)
j=5 to 6: 6, 5 are not less than 4, so min stays 4
No swap -> 1 2 3 4 6 5
Sorted so far: A[1..4] = 1 2 3 4

Fifth Pass: i = 5
Index: 1 2 3 4 5 6
Start: 1 2 3 4 6 5
min = 5 (A[5] = 6)
j=6: A[6] = 5 < 6, so min = 6
Swap A[5] and A[6] -> 1 2 3 4 5 6
The whole array is now sorted.

Insertion Sort
Insertion sort is a sorting algorithm that builds the final sorted array (or list) one item at a time. Here, on the ith pass, the ith element A[i] is inserted into its right position among A[1], ..., A[i-1], which were previously placed in sorted order. After inserting A[i], the elements A[1], A[2], ..., A[i] are in sorted order.
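
A minimal Python sketch of this insertion step is given below. It moves A[i] leftward by swapping adjacent elements; the function name `insertion_sort` and the explicit `j > 0` bound check (instead of a sentinel in position 0) are assumptions of this sketch.

```python
def insertion_sort(a):
    """Sort list a in place by inserting each element into the sorted prefix."""
    for i in range(1, len(a)):
        j = i
        # Move a[i] left until the element before it is not larger.
        while j > 0 and a[j] < a[j - 1]:
            a[j], a[j - 1] = a[j - 1], a[j]
            j -= 1
    return a


print(insertion_sort([18, 7, 20, 11, 15, 9]))
```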

i = 2
Index:                      1  2  3  4  5  6
Start:                     18  7 20 11 15  9
j=2: 7 < 18, swap       ->  7 18 20 11 15  9
j=1: A[0] is the sentinel (shown as -), stop
Sorted so far: A[1..2] = 7 18

i = 3
Index:                      1  2  3  4  5  6
Start:                      7 18 20 11 15  9
j=3: 20 > 18, no swap   ->  7 18 20 11 15  9
Sorted so far: A[1..3] = 7 18 20

i = 4
Index:                      1  2  3  4  5  6
Start:                      7 18 20 11 15  9
j=4: 11 < 20, swap      ->  7 18 11 20 15  9
j=3: 11 < 18, swap      ->  7 11 18 20 15  9
j=2: 11 > 7, no swap    ->  7 11 18 20 15  9
Sorted so far: A[1..4] = 7 11 18 20

i = 5
Index:                      1  2  3  4  5  6
Start:                      7 11 18 20 15  9
j=5: 15 < 20, swap      ->  7 11 18 15 20  9
j=4: 15 < 18, swap      ->  7 11 15 18 20  9
j=3: 15 > 11, no swap   ->  7 11 15 18 20  9
Sorted so far: A[1..5] = 7 11 15 18 20

i = 6
Index:                      1  2  3  4  5  6
Start:                      7 11 15 18 20  9
j=6: 9 < 20, swap       ->  7 11 15 18  9 20
j=5: 9 < 18, swap       ->  7 11 15  9 18 20
j=4: 9 < 15, swap       ->  7 11  9 15 18 20
j=3: 9 < 11, swap       ->  7  9 11 15 18 20
j=2: 9 > 7, no swap     ->  7  9 11 15 18 20
The whole array is now sorted.

Quick Sort
Quicksort is a divide and conquer algorithm. The steps are:
1) Pick an element, called a pivot, from the list.
2) Reorder the list so that all elements with values less than the pivot come before the pivot, while all elements with values greater than the pivot come after it (equal values can go either way). This is called the partition operation. After this partitioning, the pivot is in its final position.
3) Recursively sort the sub-list of lesser elements and the sub-list of greater elements.
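
The three steps can be sketched in Python as below. The slides do not say how the pivot is chosen or how the partition is carried out, so this sketch assumes the last element as pivot and a simple single-scan partition; the names `partition` and `quick_sort` are illustrative.

```python
def partition(a, lo, hi):
    """Partition a[lo..hi] around the pivot a[hi]; return the pivot's final index."""
    pivot = a[hi]                      # assumption: the last element is the pivot
    i = lo                             # a[lo..i-1] holds elements less than the pivot
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]          # put the pivot in its final position
    return i


def quick_sort(a, lo=0, hi=None):
    """Sort list a in place by recursive partitioning."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        p = partition(a, lo, hi)
        quick_sort(a, lo, p - 1)       # sub-list of lesser elements
        quick_sort(a, p + 1, hi)       # sub-list of greater (or equal) elements
    return a


print(quick_sort([5, 1, 3, 4, 6, 2]))
```

Other pivot choices (first element, median of three, random) fit the same outline; only `partition` changes.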

Heap
A heap is a specialized tree-based data structure in which all the nodes satisfy the heap property: Either the keys of parent nodes are always greater than or equal to those of the children and the highest key is in the root node (this kind of heap is called max heap) or the keys of parent nodes are less than or equal to those of the children (min heap).

Heap
Min-Heap: A balanced, left-justified binary tree in which no node has a value less than the value in its parent.


Max-Heap: A balanced, left-justified binary tree in which no node has a value greater than the value in its parent.


Constructing a Heap
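Construct a heap by adding nodes one at a time: Add the node just to the right of the rightmost node in the deepest level. If the deepest level is full, start a new level. If the new node violates the heap property, swap it with its parent, and repeat until the property holds.
[Figure: two example trees, each marked "Add a new node here"]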


Heap Sort
Heapsort is an in-place algorithm. It is a two-step algorithm.
1) Build a heap out of the data.
2) Repeat until all the elements have been removed from the heap:
    Remove the smallest (or largest) element from the min-heap (or max-heap) and insert it into the array.
    Reconstruct the heap.
After all the elements have been removed from the heap, we have a sorted array.
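
The two steps can be sketched in Python as follows, under some assumptions that go beyond the slides: a max-heap is stored in the list itself (children of index r at 2r+1 and 2r+2), the heap is built bottom-up rather than by adding nodes one at a time, and the helper name `sift_down` is illustrative.

```python
def sift_down(a, root, end):
    """Restore the max-heap property for the subtree rooted at root within a[:end]."""
    while True:
        child = 2 * root + 1                   # left child
        if child >= end:
            return
        if child + 1 < end and a[child + 1] > a[child]:
            child += 1                         # pick the larger child
        if a[root] >= a[child]:
            return
        a[root], a[child] = a[child], a[root]
        root = child


def heap_sort(a):
    """Sort list a in place in ascending order using a max-heap."""
    n = len(a)
    for root in range(n // 2 - 1, -1, -1):     # step 1: build the heap
        sift_down(a, root, n)
    for end in range(n - 1, 0, -1):            # step 2: remove the largest repeatedly
        a[0], a[end] = a[end], a[0]            # largest goes to its final slot
        sift_down(a, 0, end)                   # reconstruct the heap on a[:end]
    return a


print(heap_sort([5, 1, 3, 4, 6, 2]))
```

Because each removed maximum is stored in the slot freed at the end of the shrinking heap, no second array is needed.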

Merge Sort
Merge sort is an O(n log n) comparison-based sorting algorithm. It is a divide and conquer algorithm that works as follows:
1) Divide the unsorted list into n sublists, each containing 1 element (a list of 1 element is considered sorted).
2) Repeatedly merge the sublists to produce new sublists until there is only 1 sublist remaining. This will be the sorted list.

Mergesort
A divide-and-conquer algorithm:
Divide the unsorted array into 2 halves until the subarrays contain only one element.
Merge the sub-problem solutions together:
    Compare the sub-arrays' first elements.
    Remove the smallest element and put it into the result array.
    Continue the process until all elements have been put into the result array.
Example array: 37 23 6 89 15 12 2 19


Informal Algorithm
Mergesort(passed an array)
    if array size > 1
        Divide array in half.
        Call Mergesort on first half.
        Call Mergesort on second half.
        Merge two halves.

Merge(passed two arrays)
    Compare leading element in each array.
    Select lower and place in new array.
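
The informal algorithm can be turned into a short Python sketch. The function names `merge` and `merge_sort` mirror the pseudocode; returning a new list rather than sorting in place is a choice of this sketch, not something the slides specify.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:        # compare the leading elements
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])            # one of these two is already empty
    result.extend(right[j:])
    return result


def merge_sort(a):
    """Return a new sorted list containing the elements of a."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))


print(merge_sort([37, 23, 6, 89, 15, 12, 2, 19]))
# prints [2, 6, 12, 15, 19, 23, 37, 89]
```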

Divide the unsorted collection into two until the subarrays only contain one element. Then merge the sub-problem solutions together.


External Sorting
Merge sort (merging files): sorted runs are merged repeatedly, taking O(log n) passes over the data.


HASHING
Hashing is a method of Information Retrieval - typically used for database management systems, other systems in which rapid storage and retrieval of information is necessary.

Hashing is used to compute the location of the desired record in order to retrieve it in a single access. The field used for this computation, e.g. empcode in an employee file, is called a key.

Hashing takes a potentially huge range of values and maps it to a much smaller range of values.

It is used to implement dictionaries.


Hash Tables
Motivation: symbol tables
A compiler uses a symbol table to relate symbols to associated data.
    Symbols: variable names, procedure names, etc.
    Associated data: memory location, call graph, etc.
For a symbol table (also called a dictionary), we care about search, insertion, and deletion.


HASH FUNCTIONS
A hash function transforms a key into the corresponding location in the hash table.

A hash function H can be defined as a function that takes a key as input and transforms it into a hash table index.

H: KEY------> INDEX or ADDRESS


There are mainly 3 hash functions:
1) Division Method
2) Mid-Square Method
3) Folding Method


The Division Method (MODULO arithmetic)


h(k) = k mod m
Hash the key k into a table with m slots, using the slot given by the remainder of k divided by m. Pick the table size m to be a prime number not too close to a power of 2 (or 10), to keep the number of collisions low.


A simple hash function H(k)


function H(x: array[1..10] of char): 0..n-1;
var
    i, sum: integer;
begin
    sum := 0;
    {add up the character codes of the key}
    for i := 1 to 10 do
        sum := sum + ord(x[i]);
    {reduce the sum to a table index}
    H := sum mod n
end; {H}
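
For comparison, the same idea can be sketched in Python. The function name `char_sum_hash` is an illustrative choice, and unlike the Pascal version this sketch accepts a string of any length rather than exactly 10 characters.

```python
def char_sum_hash(x, n):
    """Return (sum of the character codes of x) mod n, an index in 0..n-1."""
    total = 0
    for ch in x:
        total += ord(ch)
    return total % n


# ord("a") + ord("b") = 97 + 98 = 195, and 195 mod 7 = 6
print(char_sum_hash("ab", 7))
```

Keys made of the same characters in a different order (such as "ab" and "ba") always collide under this function, which is one reason it is only described as a simple hash function.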

H(k) = k mod m
HASH TABLE: Let m = 7, where TABLE contains 5 records (i.e., m should be selected such that it is greater than the total number of records in the TABLE).

Hash address   Employee code (key, K)   Employee name
0              49                       John
1
2
3              500                      Tom
4              11                       Bell

Table size of 100
3-digit numbers are the keys: 999 possible items
Indices 0..99 on the table
999 % 100 = 99 (100 is the table size)
524 % 100 = 24
199 % 100 = 99 (COLLISION)


Mid-Square Method


In midsquare hashing, the key is squared and the address is selected from the middle of the squared number. The most obvious limitation of this method is the size of the key. Given a key of 6 digits, the product will be 12 digits, which is beyond the maximum integer size of many computers.


H(k) = middle digit(s) of k^2. The same number of digits must be used for all of the keys.

K       14    15    26
K^2     196   225   676
H(K)    9     2     7
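
The table above can be reproduced with a small Python sketch. It assumes, as in the example, that a single middle digit of the square is used as the address; the function name `mid_square` is illustrative, and the sketch is only meaningful when the square has an odd number of digits.

```python
def mid_square(k):
    """Return the middle digit of k squared, used as the hash address."""
    digits = str(k * k)
    return int(digits[len(digits) // 2])   # exact middle when the length is odd


for key in (14, 15, 26):
    print(key, key * key, mid_square(key))
```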


Hash address   Employee code (key, K)   Employee name
0
1
2              15                       Anu
3
4
5
6
7              26                       Sam
8
9              14                       Neenu

FOLDING Method
H(K) = K1 + K2 + ... + Kr
The key is partitioned into a number of parts. Each part should have the same number of digits as the required hash address. The parts are then added together, ignoring the last carry.

K        K1, K2, K3    H(K) = K1 + K2 + K3
2103     21, 03        21 + 03 = 24
7148     71, 48        71 + 48 = 119 -> 19 (carry ignored)
12345    12, 34, 5     12 + 34 + 5 = 51

H(K) = K1 + K2 + ... + Kr
Extra milling can also be applied to the even-numbered parts, i.e., K2, K4, ... are reversed before addition.

K        K1, K2, K3    After reversing K2, K4    H(K) = K1 + K2 + K3
2103     21, 03        21, 30                    21 + 30 = 51
7148     71, 48        71, 84                    71 + 84 = 155 -> 55 (carry ignored)
12345    12, 34, 5     12, 43, 5                 12 + 43 + 5 = 60
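
Both folding variants can be sketched in one Python function. The name `fold` and the parameters `width` and `reverse_even` are assumptions of this sketch; "ignoring the last carry" is implemented by keeping only the lowest `width` digits of the sum.

```python
def fold(key, width=2, reverse_even=False):
    """Fold key into parts of `width` digits, add them, and drop the carry."""
    digits = str(key)
    parts = [digits[i:i + width] for i in range(0, len(digits), width)]
    if reverse_even:
        # Reverse the even-numbered parts K2, K4, ... (0-based indices 1, 3, ...).
        parts = [p[::-1] if idx % 2 == 1 else p for idx, p in enumerate(parts)]
    return sum(int(p) for p in parts) % (10 ** width)


for key in (2103, 7148, 12345):
    print(key, fold(key), fold(key, reverse_even=True))
```

For the three example keys this yields 24, 19, 51 without reversal and 51, 55, 60 with reversal.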

Hash Collision
Sometimes, 2 different keys may hash to the same location. This is called a COLLISION.

Hash address   Employee code (key, K)                             Employee name
0              49 (if a key 14 occurs, there is a collision)      Anju
1
2
3              500                                                Meena
4              11                                                 Clark

Collision Resolution
Handling Collisions - Techniques
Two Major Strategies:
1) Open Addressing - Find another spot in the "Table" (same contiguous address space)
2) Chaining - Find another spot outside the "Table"


Resolving Collisions
Solution 1: Chaining
    Keep a linked list of elements in each slot.
    Upon collision, just add the new element to the list.

Solution 2: Open addressing
    To insert: if the slot is full, try another slot, and another, until an open slot is found (linear probing).
    To search: follow the same sequence of probes as would be used when inserting the element.

Solution 3: Bucket addressing
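
Open addressing with linear probing (Solution 2) can be sketched in Python as follows. The table is a plain list with `None` marking an empty slot, the hash function is assumed to be k mod m, and the names `insert` and `search` are illustrative; deletion is not handled in this sketch.

```python
def insert(table, key):
    """Insert key using linear probing; return the slot used."""
    m = len(table)
    for i in range(m):
        slot = (key % m + i) % m           # h, h+1, h+2, ... wrapping around
        if table[slot] is None or table[slot] == key:
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")


def search(table, key):
    """Follow the same probe sequence as insert; return the slot or None."""
    m = len(table)
    for i in range(m):
        slot = (key % m + i) % m
        if table[slot] is None:            # an empty slot ends the search
            return None
        if table[slot] == key:
            return slot
    return None


table = [None] * 7
for k in (49, 500, 11, 14):                # 14 collides with 49 at address 0
    insert(table, k)
print(table)
```

With m = 7, key 14 hashes to address 0, finds it occupied by 49, and lands in slot 1.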



Chaining
How do we insert an element?
[Figure: keys k1 to k8 drawn from the universe of keys U (actual keys K) are hashed into table T; keys that hash to the same slot are linked in a chain]

Chaining
How do we search for an element with a given key?
[Figure: keys k1 to k8 drawn from the universe of keys U (actual keys K) are hashed into table T; keys that hash to the same slot are linked in a chain]
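
One possible answer to the insertion and search questions is sketched below in Python. The class name `ChainedHashTable` is hypothetical, Python lists stand in for the linked lists, and the hash function is assumed to be k mod m.

```python
class ChainedHashTable:
    """Hash table in which each slot holds a chain of (key, value) pairs."""

    def __init__(self, m=7):
        self.slots = [[] for _ in range(m)]

    def _chain(self, key):
        return self.slots[key % len(self.slots)]

    def insert(self, key, value):
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:                   # key already present: update it
                chain[i] = (key, value)
                return
        chain.append((key, value))         # otherwise add it to the chain

    def search(self, key):
        for k, v in self._chain(key):      # only one chain has to be scanned
            if k == key:
                return v
        return None


t = ChainedHashTable(7)
t.insert(49, "first")
t.insert(14, "second")                     # 14 and 49 both hash to slot 0
print(t.search(14), len(t.slots[0]))
```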

Variation of Open addressing


Quadratic probing: Suppose a record R with key k has the hash address H(k) = h. Then, instead of searching the locations with addresses h, h+1, h+2, ..., h+i, ..., we search for a free hash address among h, h+1, h+4, h+9, ..., h+i^2, ...
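
The probe sequence can be generated with a short Python sketch. Wrapping the addresses around with mod m is an assumption of this sketch (the slide lists the addresses without it), and the function name `quadratic_probe_sequence` is illustrative.

```python
def quadratic_probe_sequence(h, m, count):
    """Return the first `count` addresses h, h+1, h+4, h+9, ... taken mod m."""
    return [(h + i * i) % m for i in range(count)]


# Home address 3 in a table of 11 slots: 3, 4, 7, then 12 mod 11 = 1
print(quadratic_probe_sequence(3, 11, 4))
```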


Variation of Open addressing


Double Hashing: A second hash function is used to resolve the collision. Suppose the primary hash function is H(k) = k mod m. If a collision occurs, apply a second hash function, say H'(k) = k mod m1.
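
The slide does not say how the second function is combined with the first. The sketch below assumes one common textbook form, in which the second hash gives a step size and the probes are H(k), H(k) + step, H(k) + 2*step, ... (mod m). The name `double_hash_insert` and the "+ 1" that keeps the step nonzero are assumptions of this sketch.

```python
def double_hash_insert(table, key, m1):
    """Insert key into table (None = empty) using a second hash as the step size."""
    m = len(table)
    start = key % m                        # primary hash H(k) = k mod m
    step = 1 + key % m1                    # assumption: second hash plus 1, never 0
    for i in range(m):
        slot = (start + i * step) % m
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("no free slot found")


table = [None] * 7
print(double_hash_insert(table, 49, 5))    # 49 mod 7 = 0, slot 0 is free
print(double_hash_insert(table, 14, 5))    # collides at 0, step = 1 + 14 mod 5 = 5
```

When m is prime and the step is between 1 and m-1, the probe sequence visits every slot before repeating.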


BUCKET Addressing
Store colliding elements in the same position in the table by introducing a bucket with each hash address. A bucket is a block of memory space that is large enough to store multiple items. If a bucket is full, the colliding item can be stored in a new bucket that is linked to the previous bucket.


Thank You
