
Modified by Dr. ISSAM ALHADID, 11/3/2019
 In computer science, a search algorithm is
an algorithm that retrieves information
stored within some data structure, or
calculated in the search space of a problem
domain.
 Data structures can include linked lists,
arrays, search trees, hash tables, or various
other storage methods.
 Search space: the set of candidate states within which we try to find a
goal state with a desired property.

 The appropriate search algorithm often
depends on the data structure being
searched. Searching also encompasses
algorithms that query the data structure,
such as the SQL SELECT command.

 Search functions are also evaluated on the
basis of their complexity, or maximum
theoretical run time.
 Binary search, for example, has a
worst-case complexity of O(log n), or
logarithmic time. This means that the
maximum number of operations needed to
find the search target is a logarithmic
function of the size of the search space.
 The choice of search algorithm depends on the data structure being searched.
 Hence, we first review the basic data structures.

1. Array: a data structure used to store homogeneous elements at
contiguous locations. The size of an array must be provided before
storing data.

2. Linked List: a linear data structure. Unlike arrays, linked list
elements are not stored at contiguous locations; the elements are
linked using pointers.

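As a quick illustration, here is a minimal Python sketch of a singly linked list; the class and method names are illustrative, not part of the original slides.

class Node:
    def __init__(self, data):
        self.data = data      # the stored element
        self.next = None      # pointer to the next node (None marks the end)

class LinkedList:
    def __init__(self):
        self.head = None

    def append(self, data):
        # Walk to the tail and link a new node; nodes need not be contiguous in memory.
        node = Node(data)
        if self.head is None:
            self.head = node
            return
        cur = self.head
        while cur.next is not None:
            cur = cur.next
        cur.next = node

lst = LinkedList()
for x in (10, 20, 30):
    lst.append(x)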
1. Stack: a linear data structure which follows a particular
order in which the operations are performed. The order may
be LIFO (Last In, First Out) or, equivalently, FILO (First In, Last Out). Mainly the
following basic operations are performed on a stack:
◦ Push: adds an item to the stack. If the stack is full, this is said to be an
Overflow condition.
◦ Pop: removes an item from the stack. Items are popped in the reverse of
the order in which they were pushed. If the stack is empty, this is
said to be an Underflow condition.
◦ Peek or Top: returns the top element of the stack.
◦ isEmpty: returns true if the stack is empty, else false.

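A minimal Python sketch of these stack operations, backed by a plain list (an assumption; the slides do not prescribe an implementation):

class Stack:
    def __init__(self):
        self._items = []                 # a Python list grows dynamically, so overflow does not occur here

    def push(self, item):
        self._items.append(item)         # add an item on top

    def pop(self):
        if self.is_empty():
            raise IndexError("Underflow: pop from an empty stack")
        return self._items.pop()         # items come off in LIFO order

    def peek(self):
        return self._items[-1]           # top element, without removing it

    def is_empty(self):
        return len(self._items) == 0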
 Queue: a queue, or FIFO (first in, first out), is an
abstract data type that serves as a collection of
elements with two principal operations: enqueue,
the process of adding an element to the collection
(the element is added at the rear), and dequeue,
the process of removing the first element that was
added (the element is removed from the front). A queue
can be implemented using either an array or a linked list.

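A short Python sketch of the two principal operations, here backed by collections.deque (one possible implementation; an array or linked list would also work, as the slide notes):

from collections import deque

q = deque()
q.append('a')          # enqueue: add at the rear
q.append('b')
first = q.popleft()    # dequeue: remove from the front -> 'a' (FIFO order)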
 Hashing: a hash function converts a given
value (for example, a big phone number) to a
small, practical integer value. The mapped
integer value is used as an index into a hash
table.
 In simple terms, a hash function maps a big
number or string to a small integer that can
be used as an index into a hash table.
 Hash Table: an array that stores pointers to records
corresponding to a given phone number. An entry in the hash
table is NIL if no existing phone number has a hash value
equal to that entry's index.

(Figure: a hash table with slots [0] through [700]; occupied slots point to records for numbers 281942902, 233667136, 580625685, 506643548, and 155778322, while unused slots are NIL. Number 701466868 is shown being hashed into the table.)
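A tiny Python sketch of the idea, assuming (as in the figure) a 701-slot table and taking the phone number modulo the table size as the hash function; the modulo rule is an assumption, since the slide only says that the hash maps a big number to a small index. The phone numbers are the illustrative ones from the figure. This sketch ignores collisions, which the next slides address.

TABLE_SIZE = 701
table = [None] * TABLE_SIZE              # NIL entries mean "no record here"

def hash_phone(number):
    return number % TABLE_SIZE           # map a big number to a small index

for phone in (281942902, 233667136, 580625685, 506643548, 155778322):
    table[hash_phone(phone)] = phone     # store (a pointer to) the record at its index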
 Collision Handling: since a hash function gives us a small
number for a big key, there is a possibility that two keys hash
to the same value. The situation where a newly inserted key maps
to an already occupied slot in the hash table is called a collision,
and it must be handled using some collision handling
technique. The following are ways to handle collisions:
◦ Chaining: the idea is to make each cell of the hash table point
to a linked list of the records that have the same hash
value. Chaining is simple, but requires additional memory
outside the table (a sketch follows below).
◦ Open Addressing: in open addressing, all elements are
stored in the hash table itself. Each table entry contains
either a record or NIL. When searching for an element, we
examine table slots one by one until the desired element is
found or it is clear that the element is not in the table.
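A minimal Python sketch of chaining, where each slot holds a Python list acting as the chain of records (a simplification of the linked-list chains described above; names and the default size are illustrative):

class ChainedHashTable:
    def __init__(self, size=701):
        self.buckets = [[] for _ in range(size)]   # each cell points to a chain of records

    def _index(self, key):
        return hash(key) % len(self.buckets)

    def insert(self, key, value):
        self.buckets[self._index(key)].append((key, value))   # colliding keys share a chain

    def search(self, key):
        for k, v in self.buckets[self._index(key)]:            # scan only this slot's chain
            if k == key:
                return v
        return None                                            # NIL: key is not present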
 Inserting number 701466868: this is called a
collision, because there is already another valid record
at slot [2]. When a collision occurs (in open addressing),
move forward until you find an empty slot.

(Figure: the hash table from before; number 701466868 hashes to the already occupied slot [2], and linear probing moves it forward to the next empty slot.)
 In chained hashing, each location in the hash
table contains a list of records whose keys map
to that location:
(Figure: a chained hash table with slots [0] through [n]; slots 0, 1, and 3 each point to a linked chain of the records whose keys hash to that slot, while the other slots are empty.)
 Graph Search Methods.
 Linear Search or Sequential search
 Binary, or half interval search
 Breadth-First Search (BFS).
 Depth-First Search (DFS).
 Searching falls under Artificial Intelligence (AI). A
major goal of AI is to give computers the ability to
think, or in other words, mimic human behavior.
The problem is, unfortunately, that computers don't
function in the same way our minds do. They
require a series of well-reasoned steps before
finding a solution.
 Your goal, then, is to take a complicated task and
convert it into simpler steps that your computer
can handle. That conversion from something
complex to something simple is what this tutorial
is primarily about. Learning how to use two search
algorithms is just a welcome side-effect.
 A searching algorithm can be classified as a blind
search algorithm or as a heuristic search
algorithm.
 Blind search (uninformed search) algorithms have
no information about the states or the search space.
These algorithms generate successors
and test whether each successor is the goal state
or not. Consequently, all the generated nodes
have the same priority of expansion.
 Examples of blind search algorithms are
breadth-first search and depth-first search
algorithms.
 Heuristic search (informed search) algorithms
have additional information about the cost of the
path between any state in the search space
and the goal state.
 Examples of heuristic search algorithms are
Best-First search, A-Star (A*), and Tabu
search algorithms.
•Linear search algorithms check every record for the one
associated with a target key in a linear fashion.
•Elements are not sorted…
 procedure linear_search (list, value)
◦ for each item in the list  1…2…3… n times
 if match item == value  1…2…3… n times
 return the item's location  1
 end if
◦ end for
 end procedure

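A runnable Python version of the pseudocode above (the function and parameter names mirror the pseudocode):

def linear_search(lst, value):
    for i, item in enumerate(lst):   # up to n comparisons
        if item == value:
            return i                 # the item's location
    return -1                        # not found

print(linear_search([7, 3, 9, 4], 9))   # -> 2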
 Best Case  O(1) … first element
 Worst case  O(n)… last element or not
found !
 Average case  about n/2 comparisons
◦ i.e., Θ(n), since constant factors are dropped
•Binary, or half interval searches, repeatedly target the
center of the search structure and divide the search
space in half.
•Elements are sorted… (Example… External File)
1. To search for a key in the search space (i.e., a sorted
list of data points), we find the midpoint of the data and
check whether the key is present there. If it is found, we stop
any further iteration; in this best case the time complexity
is O(1). Otherwise, we move to the next step.
2. Decide whether the key must lie in the left or the right half by
comparing the search key with the current data item.
3. Repeat steps 1 and 2 until there is a match or there are
no further points to search.
 The binary search algorithm breaks the search space in half
in each iteration.
 So how many times k do we need to divide by 2 until
we have only one element?
◦ n / 2^k = 1
 We can rewrite this as:
◦ 2^k = n
 Taking the log of both sides, we get:
◦ k = log2 n
 So, in the average and worst cases, the time complexity of
binary search is O(log n).
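A Python sketch of the procedure just described, assuming the input list is already sorted:

def binary_search(sorted_list, key):
    low, high = 0, len(sorted_list) - 1
    while low <= high:                    # repeat until no points remain
        mid = (low + high) // 2           # midpoint of the current search space
        if sorted_list[mid] == key:
            return mid                    # found: O(1) in the best case
        elif key < sorted_list[mid]:
            high = mid - 1                # key can only be in the left half
        else:
            low = mid + 1                 # key can only be in the right half
    return -1                             # not found, after at most about log2(n) halvings

print(binary_search([2, 5, 8, 12, 16, 23, 38], 16))   # -> 4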
 Many graph problems are solved using blind
search algorithms:
◦ Is there a path from one vertex to another?
◦ Is the graph connected?
◦ Is there a cycle in the graph?
 Commonly used search algorithms:
◦ Breadth-First Search (BFS).
◦ Depth-First Search (DFS).
 A vertex u is reachable from vertex v iff there is
a path from v to u.
 A search method starts at a given vertex v and
visits every vertex that is reachable from v.
 Given: a graph G = (V, E), directed or
undirected.
 Goal: methodically explore every vertex and
every edge.
 Ultimately (finally): build a tree on the graph:
◦ Pick a vertex as the root.
◦ Choose certain edges to produce a tree.
◦ Note: might also build a forest if graph is not
connected.
 DFS Algorithm
◦ DFS Example
◦ DFS Running Time and Space
◦ DFS Predecessor Subgraph
◦ DFS Time Stamping
◦ DFS Parenthesis Theorem
◦ DFS Edge Classification
◦ DFS and Graph Cycles
 Depth-first search is another strategy for
exploring a graph.
 Explore “deeper” in the graph whenever
possible.
 Edges are explored out of the most recently
discovered vertex v that still has unexplored
edges.
 When all of v’s edges have been explored,
backtrack to the vertex from which v was
discovered.
 DFS uses a LIFO stack (either explicitly, or implicitly through recursion).
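A short Python sketch of DFS driven by an explicit LIFO stack (the adjacency-dictionary representation is an assumption):

def dfs_stack(graph, start):
    visited, stack = [], [start]
    while stack:
        u = stack.pop()                  # LIFO: take the most recently discovered vertex
        if u in visited:
            continue
        visited.append(u)
        # push neighbors; they will be explored before older, shallower vertices
        for v in reversed(graph[u]):
            if v not in visited:
                stack.append(v)
    return visited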
 In the graph above, the
path connections are not two-way.
 All paths go only from
top to bottom. In other
words, A has a path to B
and C, but B and C do
not have a path back to A. It
is basically like a one-way street.
 Each lettered circle in our
graph is a node.
 A node can be connected
to others via an
edge/path, and the
nodes it connects to
are called its neighbors.
 B and C are neighbors of
A. E and D are neighbors
of B, but B is not a
neighbor of D or E,
because B cannot be
reached using either D or E.
 Our search graph also
contains depth.
 We now have a way of
describing location in our
graph. We know how the
various nodes (the lettered
circles) are related to each
other (neighbors), and we
have a way of characterizing
the depth each belongs to.
Knowing this information isn't
directly relevant to creating
our search algorithm, but it
does help us to better
understand the problem.
 Depth first search works by taking a node,
checking its neighbors, expanding the first node it
finds among the neighbors, checking if that
expanded node is our destination, and if not,
continue exploring more nodes.
 The above explanation is probably confusing if this
is your first exposure to depth first search. I hope
the following demonstration will help more. Using
our same search tree, let's find a path between
nodes A and F:
 Let's start with our root/goal node:
 I will be using two lists to keep track of what we
are doing - an Open list and a Closed List. An Open
list keeps track of what you need to do, and the
Closed List keeps track of what you have already
done. Right now, we only have our starting point,
node A. We haven't done anything to it yet, so let's
add it to our Open list.
 Open List: A
 Closed List: <empty>

 Now, let's explore the neighbors of our A node. To put
it another way, let's take the first item from our Open list
and explore its neighbors:
 Node A's neighbors are the B and C nodes. Because we
are now done with our A node, we can remove it from
our Open list and add it to our Closed List. You aren't
done with this step though. You now have two new
nodes B and C that need exploring. Add those two
nodes to our Open list.
 Our current Open and Closed Lists contain the following
data:
 Open List: B, C
 Closed List: A

 Our Open list contains two items. For depth first
search and breadth first search, you always explore
the first item from our Open list. The first item in
our Open list is the B node. B is not our
destination, so let's explore its neighbors:
 Because I have now expanded B, I am going to
remove it from the Open list and add it to the
Closed List. Our new nodes are D and E, and we
add these nodes to the beginning of our Open list:
 Open List: D, E, C
 Closed List: A, B

 You should start to see a pattern forming.
Because D is at the beginning of our Open
List, we expand it. D isn't our destination, and
it does not contain any neighbors. All you do
in this step is remove D from our Open List
and add it to our Closed List:
 Open List: E, C
 Closed List: A, B, D

 We now expand the E node from our Open list. E is
not our destination, so we explore its neighbors
and find out that it contains the neighbors F and G.
Remember, F is our target, but we don't stop here
though. Despite F being on our path, we only end
when we are about to expand our target Node - F
in this case:
 Our Open list will have the E node removed and the
F and G nodes added. The removed E node will be
added to our Closed List:
 Open List: F, G, C
 Closed List: A, B, D, E

 We now expand the F node. Since it is our
intended destination, we stop:
 We remove F from our Open list and add it
to our Closed List. Since we are at our
destination, there is no need to expand F in
order to find its neighbors. Our final Open
and Closed Lists contain the following data:
 Open List: G, C
 Closed List: A, B, D, E, F

 The final path taken by our depth-first search
method is the final value of our Closed
List: A, B, D, E, F. Towards the end of this
tutorial, I will analyze these results in greater
detail so that you have a better
understanding of this search method.

step   Open list     Closed list
0      A             (empty)
1      B, C          A
2      D, E, C       A, B
3      E, C          A, B, D
4      F, G, C       A, B, D, E
5      G, C          A, B, D, E, F
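The bookkeeping in the walkthrough can be written directly as a small Python sketch. The graph dictionary below is an assumption reconstructed from the walkthrough (A connects to B and C, B to D and E, E to F and G, one-way only):

graph = {'A': ['B', 'C'], 'B': ['D', 'E'], 'C': [],
         'D': [], 'E': ['F', 'G'], 'F': [], 'G': []}

def dfs_open_closed(start, goal):
    open_list, closed_list = [start], []
    while open_list:
        node = open_list.pop(0)          # always expand the first item of the Open list
        closed_list.append(node)
        if node == goal:                 # stop when the goal node is expanded
            return closed_list
        new = [n for n in graph[node] if n not in closed_list and n not in open_list]
        open_list = new + open_list      # depth-first: add new nodes to the *front*
    return None

print(dfs_open_closed('A', 'F'))         # -> ['A', 'B', 'D', 'E', 'F']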
 To detect cycles in a graph, we choose an arbitrary white
node and run DFS from it.
 If the DFS completes and there are still white nodes left
over, we choose another white node arbitrarily and repeat,
until eventually all nodes are colored black.
 If at any time we follow an edge to a gray (already
discovered) node, there is a cycle in the graph.
Therefore, cycles can be detected in O(V + E) time.
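A Python sketch of this coloring scheme for a directed graph (the adjacency-dictionary representation and function name are illustrative assumptions):

WHITE, GRAY, BLACK = 0, 1, 2

def has_cycle(graph):
    color = {u: WHITE for u in graph}

    def visit(u):
        color[u] = GRAY                      # u is discovered but not finished
        for v in graph[u]:
            if color[v] == GRAY:             # edge to a gray vertex: a cycle
                return True
            if color[v] == WHITE and visit(v):
                return True
        color[u] = BLACK                     # u and all its descendants are finished
        return False

    # repeat from every still-white vertex until all vertices are black
    return any(color[u] == WHITE and visit(u) for u in graph)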
 Lines 1-3 paint all vertices white and initialize
their p fields to NIL.
 Line 4 resets the global time counter.
 Lines 5-7 check each vertex in V in turn and,
when a white vertex is found, visit it using
DFS_Visit.
 Every time DFS-Visit(u) is called in line 7, vertex
u becomes the root of a new tree in the depth-
first forest.
 When DFS returns, every vertex u has been
assigned a discovery time d[u] and a finishing
time f[u].
 In each call DFS-Visit(u), vertex u is initially white.
 Line 1 paints u gray.
 Line 2 increments the global variable time.
 Line 3 records the new value of time as the discovery
time d[u].
 Lines 4-7 examine each vertex v adjacent to u and
recursively visit v if it is white.
 As each vertex v ∈ Adj[u] is considered in line 4, we
say that edge (u, v) is explored by the depth-first
search.
 Finally, after every edge leaving u has been explored,
lines 8-10 paint u black and record the finishing time
in f[u].
So, running time of DFS = Θ(V+E)
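The line numbers above refer to the standard DFS / DFS-Visit pseudocode, which is not reproduced on these slides. A Python sketch of that procedure, with comments marking the steps just described, might look like this:

WHITE, GRAY, BLACK = 0, 1, 2

def dfs(graph):                         # graph: dict of adjacency lists (assumed representation)
    color = {u: WHITE for u in graph}   # paint all vertices white ...
    p = {u: None for u in graph}        # ... and initialize their p fields to NIL
    d, f = {}, {}
    time = 0                            # reset the global time counter

    def dfs_visit(u):
        nonlocal time
        color[u] = GRAY                 # u has just been discovered
        time += 1
        d[u] = time                     # record the discovery time d[u]
        for v in graph[u]:              # explore each edge (u, v)
            if color[v] == WHITE:
                p[v] = u
                dfs_visit(v)            # recursively visit white neighbors
        color[u] = BLACK                # every edge leaving u has been explored
        time += 1
        f[u] = time                     # record the finishing time f[u]

    for u in graph:                     # check each vertex in turn; every call made here
        if color[u] == WHITE:           # roots a new tree in the depth-first forest
            dfs_visit(u)
    return d, f, p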
(Figure sequence, credit David Luebke: DFS traced step by step on an example graph from a source vertex, showing each vertex's discovery time d and finishing time f as they are assigned; the final frame shows the d/f pairs 1/12, 8/11, 13/16, 2/7, 9/10, 3/4, 5/6, and 14/15. A question posed along the way: what is the structure of the gray vertices, and what do they represent?)
 Define the predecessor subgraph of G as
Gp = (V, Ep), where:
Ep = {(p[v], v) ∈ E : v ∈ V and p[v] ≠ NIL}.
 The predecessor subgraph of a depth-first search
forms a depth-first forest composed of
several depth-first trees.
 The edges in Gp are called tree edges.
 DFS introduces an important distinction
among the edges of the original graph:
◦ Tree edge: encounters a new (white) vertex.
◦ Back edge: from a descendant to an ancestor.
◦ Forward edge: from an ancestor to a descendant.
 Not a tree edge, though.
 Goes from a gray node to a black node.
◦ Cross edge: between trees or subtrees.
 Goes from a gray node to a black node (one that is not an ancestor).
 Theorem 23.9: if G is undirected, a DFS produces
only tree and back edges.
 Proof by contradiction:
◦ Assume there is a forward edge from u to a descendant w.
◦ But that edge must actually be a back edge (why?),
since the edge (w, u) is explored from
descendant to ancestor
(gray to gray).
 The DFS algorithm maintains a monotonically
increasing global clock:
◦ discovery time d[u] and finishing time f[u].
 For every vertex u, the inequality d[u] < f[u]
must hold.
 Vertex u is
◦ White before time d[u]
◦ Gray between time d[u] and time f[u], and
◦ Black thereafter.
 Notice the structure throughout the
algorithm:
◦ the gray vertices form a linear chain,
◦ which corresponds to a stack of vertices that have not
been exhaustively explored (DFS-Visit has started but
not yet finished).
 Exercise: What is the size of the stack in DFS?
 Discovery and finish times have parenthesis
structure:
 Represent the discovery of u with a left parenthesis "(u".
 Represent the finishing of u with a right parenthesis "u)".
 History of discoveries and finishings makes a
well-formed expression (parenthesis are
properly nested).
 Parenthesis theorem: if v is a descendant
of u, then the discovery time of v is later than
the discovery time of u. However, the
finishing time of v is earlier than the finishing
time of u. You can see this using the recursive
call structure of DFS-Visit.
 First we call DFS-Visit on u, and then we recurse
on its descendants (children); each inner
recursion must start after the outer recursion starts,
but finish before the outer recursion finishes.
(Figure: each vertex's interval from its discovery time to its finishing time, nested like parentheses.)
 Visit the start vertex s and put it into a FIFO
queue.
 Repeatedly remove a vertex from the queue,
visit its unvisited adjacent vertices, and put the newly
visited vertices into the queue.
 All vertices reachable from the start vertex s
(including the start vertex itself) are visited.
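A Python sketch of this procedure, using collections.deque as the FIFO queue (the adjacency-dictionary representation is assumed):

from collections import deque

def bfs(graph, s):
    visited = {s}
    order = [s]
    q = deque([s])                   # visit the start vertex and put it into the queue
    while q:
        u = q.popleft()              # repeatedly remove a vertex from the queue
        for v in graph[u]:
            if v not in visited:     # visit its unvisited adjacent vertices
                visited.add(v)
                order.append(v)
                q.append(v)          # put newly visited vertices into the queue
    return order                     # every vertex reachable from s, including s itself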
(Figure sequence, credit David Luebke: BFS traced on an example graph with vertices r, s, t, u, v, w, x, y, starting from s. Distances from s are assigned level by level (s = 0; r, w = 1; t, x, v = 2; u, y = 3), and the queue evolves as: s | w, r | r, t, x | t, x, v | x, v, u | v, u, y | u, y | y | empty.)
 The reason I cover both depth-first and breadth-first search
in the same tutorial is that they are very similar.
In depth-first search, newly explored nodes were added to
the beginning of your Open list. In breadth-first search, newly
explored nodes are added to the end of your Open list.
 Let's see how that change will affect our results. For
reference, here is our original search tree:
 Let's try to find a path between
nodes A and E.

 Let's start with our root/goal node:
 Like before, I will continue to employ the
Open and Closed Lists to keep track of what
needs to be done:
 Open List: A
 Closed List: <empty>

 Now, let's explore the neighbors of our A node. So
far, we are following in depth-first search's footsteps:
 We remove A from our Open list and add A to our
Closed List. A's neighbors, the B and C nodes, are
added to our Open list. They are added to the end
of our Open list, but since our Open list was empty
(after removing A), it's hard to show that in this
step.
 Our current Open and Closed Lists contain the
following data:
 Open List: B, C
 Closed List: A

 Here is where things start to diverge from our
depth-first search method. We take a look at the B
node because it appears first in our Open List.
Because B isn't our intended destination, we
explore its neighbors:
 B is now moved to our Closed List, but the
neighbors of B, nodes D and E are added to
the end of our Open list:
 Open List: C, D, E
 Closed List: A, B

 We now expand our C node:
 Since C has no neighbors, all we do is remove
C from our Open List, add it to the end of our
Closed List, and move on:
 Open List: D, E
 Closed List: A, B, C

 Similar to Step 3, we expand node D. Since it
isn't our destination, and it too does not have
any neighbors, we simply remove D from our
Open list, add D to our Closed List, and
continue on:
 Open List: E
 Closed List: A, B, C, D

 Because our Open list only has one item, we
have no choice but to take a look at node E.
Since node E is our destination, we can stop
here:
 Our final versions of the Open and Closed
Lists contain the following data:
 Open List: <empty>
 Closed List: A, B, C, D, E

 Traveling from A to E takes you through B, C,
and D using breadth first search. In the next
page, I will summarize the algorithm for both
our search methods and cover the Flash code
needed to re-create them.

step   Open list     Closed list
0      A             (empty)
1      B, C          A
2      C, D, E       A, B
3      D, E          A, B, C
4      E             A, B, C, D
5      (empty)       A, B, C, D, E
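Compared with the depth-first sketch shown earlier, the only change is where new nodes are inserted in the Open list: at the end rather than at the front. A minimal Python sketch using the same assumed example graph:

graph = {'A': ['B', 'C'], 'B': ['D', 'E'], 'C': [],
         'D': [], 'E': ['F', 'G'], 'F': [], 'G': []}

def bfs_open_closed(start, goal):
    open_list, closed_list = [start], []
    while open_list:
        node = open_list.pop(0)              # expand the first item of the Open list
        closed_list.append(node)
        if node == goal:
            return closed_list
        new = [n for n in graph[node] if n not in closed_list and n not in open_list]
        open_list = open_list + new          # breadth-first: add new nodes to the *end*
    return None

print(bfs_open_closed('A', 'E'))             # -> ['A', 'B', 'C', 'D', 'E']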
 Given a graph G = (V, E):
◦ Vertices are enqueued only if their color is white.
◦ Assuming that enqueuing and dequeuing take O(1)
time, the total cost of these queue operations is O(V).
◦ The adjacency list of a vertex is scanned when the
vertex is dequeued, and at most once.
◦ The sum of the lengths of all adjacency lists is Θ(E);
consequently, O(E) time is spent scanning them.
◦ Initializing the algorithm takes O(V).
 Total running time: O(V + E).
 Space?
 Start a breadth-first search at any vertex of
the graph.
 The graph is connected iff all n vertices get
visited.
 Time is O(V+E).
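Reusing the bfs sketch from the BFS section above, this check is one comparison (a sketch under the same assumed adjacency-dictionary representation, for an undirected graph):

def is_connected(graph):
    if not graph:
        return True
    start = next(iter(graph))                      # start BFS at any vertex
    return len(bfs(graph, start)) == len(graph)    # connected iff all n vertices get visited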
