WEEK #1
LECTURE #1
Introduction

Why study Data Structures and Algorithms?
Working with large data sets
CS146 Data Structures and Algorithms, Spring 2015, Angus Yeung, Ph.D.
Assessment

Class Profile: Sophomore, Junior, Senior, Open University, etc.
Class Climate

Work really hard: this is a challenging class!

Attend each class, because:
- Not everything I teach will be available on Canvas
- I will go over some selected problem sets in class
- The students who skipped my last class all failed the class
Assignments

Assignments will include both written and programming problem sets.

Do your own homework assignments, as the concepts will be tested in quizzes/exams.

All homework assignments will be submitted electronically on Canvas. As such, no late or make-up assignments will be graded.
Getting a Recommendation

You must earn an A- or better grade in this class if you want to ask me for a recommendation letter.

Don't add me on Facebook; but I can consider your invitation to connect on LinkedIn after knowing you for some time in this class.

It is okay to ask me questions about the hi-tech industry.
Textbook

We will use "Data Structures and Algorithm Analysis in Java" by Mark Weiss in this class.
Slide Deck

PowerPoint slides for each lecture will follow the textbook very closely.

Highlight / Comment / Textbook Section / Math
Grading Policy

The percentage weights assigned to class assignments, the group project, and the final exam are listed below:
Grading Policy

Grades will be assigned as described below. This scale may be adjusted once the final exam has been graded, to provide a letter grade distribution that matches the expected average for this class.
Course Schedule
Course Schedule
Course Structure

Foundation: Attend lectures; Read book chapters

Integration: Introduction to CS146, Algorithm Analysis, Trees, Hashing, Priority Queues, Sorting, The Disjoint Set Class, Graph Algorithms

Reinforcement: Assignment 1, Assignment 2, Assignment 3, Assignment 4, Mid-Term, Final Exam
1.0 Selection Problem

Determine the kth largest of a group of N numbers.

Algorithm 1:
Algorithm 2:
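The slide's algorithm listings were figures; as a sketch only (class and method names are mine, and Algorithm 2 here uses a size-k min-heap rather than whatever variant the slide showed), the two standard approaches look like this:

```java
import java.util.Arrays;
import java.util.PriorityQueue;

public class Selection {
    // Algorithm 1: sort all N numbers, then read off the kth largest.
    // Runs in O(N log N) time.
    public static int kthLargestBySort(int[] a, int k) {
        int[] copy = a.clone();
        Arrays.sort(copy);                // ascending order
        return copy[copy.length - k];     // kth largest counted from the end
    }

    // Algorithm 2 (one possible improvement): keep only the k largest
    // values seen so far in a min-heap of size k; the heap minimum at
    // the end is the kth largest. Runs in O(N log k) time.
    public static int kthLargestByHeap(int[] a, int k) {
        PriorityQueue<Integer> heap = new PriorityQueue<>();
        for (int x : a) {
            heap.add(x);
            if (heap.size() > k)
                heap.poll();              // discard the smallest kept value
        }
        return heap.peek();
    }
}
```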
1.1
Book-Keeping

Mathematics Review:
- Exponents, Logarithms
- Series, Harmonic Number, Euler's Constant
- Modular Arithmetic
- Proof by Induction
- Recursive Function
WEEK #1
LECTURE #2
Files Uploaded:
- Slide deck for Chapter 1
- Green Sheet

Upcoming:
- Assignment 1 will be posted
- Revised slide deck for Chapter 1
1.2 Mathematics Review

Exponents
Logarithms
Mathema+cs
Review
Series
Geometric
Series:
Harmonic Number:
Eulers Constant:
CS146 Data Structures and Algorithms, Spring 2015, Angus Yeung, Ph.D.
Arithme+c Series:
H1
H2
H3
H4
H5
H6
H7
H8
H9
25
ln
ln
ln
ln
ln
ln
ln
ln
ln
1
2
3
4
5
6
7
8
9
=
=
=
=
=
=
=
=
=
1,
0.806852...,
0.734721...,
0.697038...,
0.673895...,
0.658240...,
0.646946...,
0.638415...,
0.631743...,
1.2 Mathematics Review: Modular Arithmetic

Example:

If N is a prime number: ab ≡ 0 (mod N) is true if and only if a ≡ 0 (mod N) or b ≡ 0 (mod N).
1.2 Mathematics Review: Proof by Induction

Prove that the Fibonacci numbers
  F0 = 1, F1 = 1, F2 = 2, F3 = 3, F4 = 5, ..., Fi = Fi−1 + Fi−2
satisfy Fi < (5/3)^i, for i ≥ 1.

Base Case: F1 = 1 < 5/3 and F2 = 2 < 25/9.

Inductive Hypothesis: We assume that the theorem is true for i = 1, 2, ..., k.

Proof: By definition, Fk+1 = Fk + Fk−1.

Use the inductive hypothesis on the right-hand side:
  Fk+1 < (5/3)^k + (5/3)^(k−1)
       = (3/5)(5/3)^(k+1) + (3/5)²(5/3)^(k+1)
       = (3/5)(5/3)^(k+1) + (9/25)(5/3)^(k+1)
       = (3/5 + 9/25)(5/3)^(k+1)
       = (24/25)(5/3)^(k+1)
       < (5/3)^(k+1)
1.2 Mathematics Review: Proof by Induction

Prove that:

Base Case: the theorem is true when N = 1.

Inductive Hypothesis: We assume that the theorem is true for 1 ≤ k ≤ N.

Proof: We have:

Use the inductive hypothesis on the right-hand side:
1.2 Mathematics Review

Proof by Counterexample

Prove that the statement "Fk ≤ k²" is false.
The easiest way to prove this is to compute F11 = 144 > 11² = 121.

Proof by Contradiction

Prove that there is an infinite number of primes.
1. Assume that the theorem is false, so there is some largest prime Pk.
2. Show that this assumption implies that some known property is false.
1.2 Recursive Function

Consider the following function:
  f(0) = 0             (base case)
  f(x) = 2f(x−1) + x²  (recursive call)
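The recursive definition above translates directly into Java; a minimal sketch (the class name is mine):

```java
public class Recursion {
    // f(0) = 0 and f(x) = 2 f(x-1) + x^2, as defined on the slide.
    public static long f(int x) {
        if (x == 0)
            return 0;                        // base case
        return 2 * f(x - 1) + (long) x * x;  // recursive call
    }
}
```

For example, f(1) = 1, f(2) = 6, f(3) = 21, f(4) = 58.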
1.3 Generic Implementation

The generic mechanism promotes code reuse, an important aspect of object-oriented programming.

Java didn't support generic implementation directly until Java 5.

Pre-Java 5: generic methods and classes can be implemented in Java using the principle of inheritance:
- Use Object for genericity
- Use interface types for genericity
1.4
1.4

Methods inherited from Object:
  String toString()
  boolean equals(Object obj)
  Object clone()
  int hashCode()

intValue() is a method of Integer.
1.4

Primitives cannot be passed as Comparables, but the wrappers can.

Covariant Data Type: be careful with it!

Must implement the Comparable interface. Objects must be compatible, e.g. share the same superclass Shape.
1.4

Generic Types are available in Java 5 and higher.

1.5.1
For example,

public interface Map<K, V> {}

defines a generic Map interface with two type parameters, K and V.
Generic Methods

Generic method = method with type parameter(s)

public class Utils
{
    public static <E> void fill(ArrayList<E> a, E value, int count)
    {
        for (int i = 0; i < count; i++)
            a.add(value);
    }
}
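At the call site the type parameter E is inferred from the arguments; a small self-contained usage sketch of the same method:

```java
import java.util.ArrayList;

public class Utils {
    // Same generic method as on the slide: <E> is declared
    // before the return type and inferred at the call site.
    public static <E> void fill(ArrayList<E> a, E value, int count) {
        for (int i = 0; i < count; i++)
            a.add(value);
    }

    public static void main(String[] args) {
        ArrayList<String> names = new ArrayList<>();
        fill(names, "x", 3);          // E is inferred as String
        System.out.println(names);    // [x, x, x]
    }
}
```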
Autoboxing / Unboxing

Java 5 adds autoboxing and unboxing features.

Autoboxing: If an int is passed in a place where an Integer is required, the compiler will insert a call to the Integer constructor behind the scenes.

Auto-unboxing: If an Integer is passed in a place where an int is required, the compiler will insert a call to the intValue method behind the scenes.
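Both conversions can be seen in a few lines (the class name is mine):

```java
public class BoxingDemo {
    // Returns the sum of a boxed and an unboxed value; the compiler
    // inserts the boxing/unboxing calls described above.
    public static int sum() {
        Integer boxed = 5;      // autoboxing: int -> Integer
        int unboxed = boxed;    // auto-unboxing: Integer -> int
        return boxed + unboxed; // boxed is unboxed again for the +
    }
}
```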
1.5.2

1.5.3

1.5.4
Remember: this is a generic method (instead of a generic class or interface).

1.5.5 Type Bounds

The type bound specifies properties that the parameter types must have.

1.5.6
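A typical type bound is the one used for finding a maximum: the bound guarantees that compareTo exists on T. A sketch (the class name is mine; the bound pattern follows the textbook's findMax style):

```java
public class MaxFinder {
    // Type bound: T must be comparable to itself or to a supertype,
    // so arr[i].compareTo(max) is guaranteed to compile.
    public static <T extends Comparable<? super T>> T findMax(T[] arr) {
        T max = arr[0];
        for (int i = 1; i < arr.length; i++)
            if (arr[i].compareTo(max) > 0)
                max = arr[i];
        return max;
    }
}
```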
2. Algorithm Analysis

WEEK #2
LECTURE #3

Add code for Assignment #1
Generic Implementation
Algorithmic Analysis
Course Structure

Foundation: Attend lectures; Read book chapters

Integration: Introduction to CS146, Algorithm Analysis, Trees, Hashing, Priority Queues, Sorting, The Disjoint Set Class, Graph Algorithms

Reinforcement: Assignment 1, Assignment 2, Assignment 3, Assignment 4, Mid-Term, Final Exam
Time Estimation: How to estimate the time required for a program.

Reduction of Running Time: How to reduce the running time of a program, for example, from days or years to fractions of a second.

Recursion: The results of careless use of recursion.

Efficient Algorithms: Very efficient algorithms to raise a number to a power and to compute the greatest common divisor of two numbers.
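The last bullet refers to two classic routines; a sketch of both (the class name is mine):

```java
public class Numeric {
    // Raise x to the nth power with O(log n) multiplications
    // instead of O(n), by squaring the half-power.
    public static long pow(long x, int n) {
        if (n == 0)
            return 1;
        long half = pow(x, n / 2);
        return (n % 2 == 0) ? half * half : half * half * x;
    }

    // Euclid's algorithm for the greatest common divisor:
    // repeatedly replace (m, n) by (n, m mod n).
    public static long gcd(long m, long n) {
        while (n != 0) {
            long rem = m % n;
            m = n;
            n = rem;
        }
        return m;
    }
}
```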
2.0 Mathematical Background

Throughout this course we will use the following four definitions to establish a relative order among functions:
2.1

There is some point n0 past which c·f(N) is always at least as large as T(N), so if constant factors are ignored, f(N) is at least as big as T(N).

Definition 2.4 Explained: Growth Rate of T(N) < Growth Rate of h(N)
Ω Notation: c·g(n) is a lower bound for f(n).

Θ Notation: c1·g(n) is an upper bound for f(n) and c2·g(n) is a lower bound for f(n).
Notation | Example | Used to
Big Theta Θ(N²) | N², 10N², 5N² + 22N log N + 3N | Classify algorithms
Big Oh O(N²) | 10N², 100N, 22N log N + 3N | Develop upper bounds
Big Omega Ω(N²) | N², N⁵, N³ + 22N log N + 3N | Develop lower bounds
2.1 Order-of-Growth Classifications

Common order-of-growth classifications: 1, log N, N, N log N, N², N³, and 2^N
Order-of-growth Classifications

Order of Growth | Name | Typical Code | Description | Example
1 | constant | a = b + c; | Statement |
log N | logarithmic | while (N > 1) { N = N / 2; ... } | Divide in half | Binary Search
N | linear | loop | |
N log N | linearithmic | See mergesort | Divide and conquer | Mergesort
N² | quadratic | Double loop | |
N³ | cubic | Triple loop | |
2^N | exponential | | | Exhaustive Search
Useful Rules

Rule 1: If T1(N) = O(f(N)) and T2(N) = O(g(N)), then
  T1(N) + T2(N) = O(f(N) + g(N))
  T1(N) * T2(N) = O(f(N) * g(N))

Rule 2

Rule 3

2.1
WEEK #2
LECTURE #4

Big-Oh Revisited
Book-keeping
Assignment #1

Tip 1: Do not write
  T(N) = O(2N²)
  T(N) = O(N² + N)
Write this:
  T(N) = O(N²)
Tip 2: To compare the relative growth rates of two functions f(N) and g(N), use L'Hôpital's rule if necessary (most of the time, this method is overkill).

2.1
2.1

Memory Usage
What to Analyze

Running time is the most important resource to analyze. Several factors could affect the running time:
- Computer
- Compiler
- Programming Language
- Algorithm
2.3
Count for *, *, +, =: 4N
Count: 0

2.4
General Rules

Rule 1 (for loops): The running time of the statements inside the for loop, times the number of iterations, gives the total: O(N).

Example: 4 × N = 4N

2.4
General Rules

Rule 3 (Consecutive Statements): Just add (the maximum is the one that counts).

Example: O(N) lower order can be ignored; O(N²) higher order counts.

Rule 4 (if/else)

2.4
Algorithm 1

Cubic maximum contiguous subsequence sum algorithm: three nested loops give O(N³); the O(N²) lower-order work can be ignored.

2.4
Algorithm 2

Eliminating lines 13 and 14 in Algorithm 1, we can reduce the running time to O(N²).

Simplification

2.4
Algorithm 3

We can use a divide-and-conquer strategy and further reduce the running time to O(N log N). The idea is to split the problem into two roughly equal subproblems and solve them recursively.

Divide
Conquer

2.4
Algorithm 3

We can use the divide-and-conquer strategy and simplify the solution to O(N log N).

- Stop condition
- Special case for an odd number of input entries
- Recursive calls

2.4
Algorithm 3

Let T(N) be the time to solve a maximum subsequence sum problem of size N, with T(1) as one unit.

T(1) = 1, T(N) = 2T(N/2) + O(N)

Observation: T(2) = 2·2, T(4) = 4·3, T(8) = 8·4, T(16) = 16·5

Conclusion: If N = 2^k, then T(N) = N(k + 1) = N log N + N = O(N log N)

2.4
Algorithm 4

If we don't need to know the actual best subsequence, the design of the algorithm can be further simplified to O(N).

2.4
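The linear-time idea can be sketched in a few lines (a sketch in the style of the textbook's Algorithm 4, not its exact listing; the class name is mine). It keeps a running sum and resets it whenever it goes negative, since a negative prefix can never start a best subsequence:

```java
public class MaxSubSum {
    // O(N) maximum contiguous subsequence sum. By the usual
    // convention, if all values are negative the empty
    // subsequence (sum 0) is the answer.
    public static int maxSubSum(int[] a) {
        int maxSum = 0, thisSum = 0;
        for (int j = 0; j < a.length; j++) {
            thisSum += a[j];
            if (thisSum > maxSum)
                maxSum = thisSum;   // new best subsequence ends at j
            else if (thisSum < 0)
                thisSum = 0;        // a negative prefix never helps
        }
        return maxSum;
    }
}
```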
Course Structure

Foundation: Attend lectures; Read book chapters

Integration: Introduction to CS146, Algorithm Analysis, Trees, Hashing, Priority Queues, Sorting, The Disjoint Set Class, Graph Algorithms

Reinforcement: Assignment 1, Assignment 2, Assignment 3, Assignment 4, Mid-Term, Final Exam
Collection Interface
Iterators
The List Interface, ArrayList, and LinkedList
ListIterators
Implementation of ArrayList
  The Basic Class
  The Iterator and Java Nested and Inner Classes
Java Review: Static Class, Inner Class, Local Inner Class
WEEK #3
LECTURE #6
Chapter 3 Overview

Chapter 3 discusses some of the simplest and most basic data structures:
- Introduce the concept of Abstract Data Types (ADTs)
- Show how to efficiently perform operations on Lists
- Introduce the Stack ADT and its use in implementing recursion
- Introduce the Queue ADT and its use in operating systems and algorithm design

3.0
Operations: add, remove, contains, union, find, etc.

Implementation: may have multiple implementations hidden away from users

3.1
List ADT

The List ADT views its data much like an array does: elements are accessible via consecutive indices.
Cons: The trade-off for this flexibility is that some operations are O(N), instead of O(1) as in other data structures.

Worst case: Inserting into position 0 requires shifting all the elements in the list up one spot.
List: Shifting

5, 8, 2, 1, 4, 7
add(3, 6)
5, 8, 2, 6, 1, 4, 7
removeAt(1) returns 8
5, 2, 6, 1, 4, 7

Elements changed positions.
3.2.1

3.2.2 Insertion

3.3.1

Capitalized I

3.3.2
Iterators

Collections that implement the Iterable interface must provide a method named iterator. The iterator method returns an object of type Iterator. Iterator is an interface defined in package java.util and is shown below:
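The slide's figure is missing here; the interface it refers to, java.util.Iterator, declares hasNext(), next(), and remove(). A small self-contained sketch of the protocol in use (the class and method names are mine):

```java
import java.util.ArrayList;
import java.util.Iterator;

public class IteratorDemo {
    // Sums a list using only the Iterator protocol:
    // hasNext() tests for more items, next() advances and returns one.
    public static int sum(ArrayList<Integer> items) {
        int total = 0;
        Iterator<Integer> it = items.iterator();
        while (it.hasNext())
            total += it.next();
        return total;
    }
}
```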
3.3.2

3.3.3
Making a List

Whether an ArrayList or LinkedList is passed as a parameter, the running time of the following method is O(N), because each call to add (to the end of the list) takes constant time.

3.3.3
Implementation of ArrayList
  The Basic Class
  The Iterator and Java Nested and Inner Classes
Java Review: Static Class, Inner Class, Local Inner Class, Anonymous Inner Class
Implementation of LinkedList

WEEK #4
LECTURE #7
After: 5, 1

Before:
  ArrayList -> O(N²)
  LinkedList -> O(N²)
After:
  ArrayList -> O(N²) because array items must be shifted
  LinkedList -> O(N)

3.3.4

List Type | No. of Items | Time
LinkedList<Integer> | 800,000 | 0.039
LinkedList<Integer> | 1,600,000 | 0.073
ArrayList<Integer> | 800,000 | 300
ArrayList<Integer> | 1,600,000 | 1,200

3.3.4
ListIterators

A ListIterator extends the functionality of an Iterator for Lists:
- previous and hasPrevious allow traversal of the list from the back to the front
- add places a new item into the list at the current position

3.3.5
Implementation of ArrayList

Outline of a usable generic ArrayList class: MyArrayList

3.4
3.4.1
Inner Class - 1

This iterator version doesn't work because theItems and size() are not part of the ArrayListIterator class.

3.4.2
Inner Class - 2

The iterator is a top-level class and stores the current position and a link to the MyArrayList. It doesn't work because theItems is private in the MyArrayList class.

It is defined as private. It is a HAS-A relationship.

Error here!

3.4.2
Inner Class - 3

This time it works: the iterator is a nested class and stores the current position and a link to the MyArrayList. It works because the nested class is considered part of the MyArrayList class. ArrayListIterator is defined inside of MyArrayList.

static indicates a nested class.

3.4.2
Inner Class - 4

This one works as well: the iterator is an inner class and stores the current position and an implicit link to the MyArrayList.

An inner class doesn't have the static keyword.

3.4.2
Static Classes

The nested class has access to its containing class's private static members (is it useful at all?)

package pizza;
Inner Classes

An inner class is a class declared as a non-static member of another class. The inner class instance has access to the instance members of the containing class instance. These enclosing instance members are referred to inside the inner class via just their simple names, not via this (this in the inner class refers to the inner class instance, not the associated containing class instance).

package pizza;
Implementation of LinkedList

MyLinkedList: contains links to both ends, the size of the list, and a host of methods.

Node: contains the data and links to the previous and next nodes, along with appropriate constructors.

LinkedListIterator: implements Iterator with next, hasNext, and remove methods.

3.5
Adding a Node

3.5

Removing a Node

3.5
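The add and remove pictures above come down to rewiring a few links around sentinel nodes. A minimal sketch, assuming the structure described on the previous slide (class and method names here are mine, not the textbook's exact listing):

```java
public class MiniLinkedList {
    // Minimal doubly linked node, mirroring the Node class described above.
    private static class Node<E> {
        E data; Node<E> prev; Node<E> next;
        Node(E d, Node<E> p, Node<E> n) { data = d; prev = p; next = n; }
    }

    private final Node<Integer> head;   // sentinel before the first element
    private final Node<Integer> tail;   // sentinel after the last element
    private int size = 0;

    public MiniLinkedList() {
        head = new Node<>(null, null, null);
        tail = new Node<>(null, head, null);
        head.next = tail;
    }

    // Adding a node: splice a new node in just before position p
    // by rewiring four links, in O(1) time.
    private void addBefore(Node<Integer> p, int x) {
        Node<Integer> newNode = new Node<>(x, p.prev, p);
        p.prev.next = newNode;
        p.prev = newNode;
        size++;
    }

    public void add(int x) { addBefore(tail, x); }

    // Removing a node: bypass p in both directions.
    private void remove(Node<Integer> p) {
        p.prev.next = p.next;
        p.next.prev = p.prev;
        size--;
    }

    public void removeFirst() { remove(head.next); }
    public int first() { return head.next.data; }
    public int size() { return size; }
}
```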
WEEK #4
LECTURE #8

3.6.1
Implementation of Stacks

Linked List Implementation of Stacks:
- Push: insert at the front of the list.
- Top/Pop: return the value of the element at the front of the list and delete it.

3.6.2

3.6.3
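Because both operations touch only the front of the list, push and pop are O(1). A minimal sketch of the linked-list stack (class name mine):

```java
public class LinkedStack<E> {
    // Push inserts at the front of the list; pop removes from the
    // front, so both operations take O(1) time.
    private static class Node<E> {
        E data; Node<E> next;
        Node(E d, Node<E> n) { data = d; next = n; }
    }

    private Node<E> top = null;

    public void push(E x)    { top = new Node<>(x, top); }
    public E top()           { return top.data; }
    public E pop()           { E v = top.data; top = top.next; return v; }
    public boolean isEmpty() { return top == null; }
}
```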
A + is read, so 3 and 2 are popped from the stack.
Next, 8 is pushed.
Next, a + is seen, so 40 and 5 are popped and 5 + 40 = 45 is pushed.
Now 3 is pushed.

3.6.3

a b c * + d e * f + g * +

3.6.3
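The stack walkthrough above can be sketched as a small postfix evaluator (class name mine; the test expression "6 5 2 3 + 8 * + 3 + *" is the textbook's classic example, which this walkthrough appears to trace):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class Postfix {
    // Evaluate a space-separated postfix expression with a stack:
    // operands are pushed; an operator pops two operands and pushes
    // the result.
    public static int eval(String expr) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String tok : expr.split(" ")) {
            switch (tok) {
                case "+": stack.push(stack.pop() + stack.pop()); break;
                case "*": stack.push(stack.pop() * stack.pop()); break;
                default:  stack.push(Integer.parseInt(tok));
            }
        }
        return stack.pop();
    }
}
```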
3.6.3

3.7

3.7.2
Anonymous Subclass

Given:

What is the result?
A. An exception occurs at runtime
B. true
C. Fred
D. Compilation fails because of an error on line 3
E. Compilation fails because of an error on line 4
F. Compilation fails because of an error on line 8
G. Compilation fails because of an error on a line other than 3, 4, or 8
Inner Class

Given:

Which, inserted independently at line 6, compile and produce the output "spooky"? (Choose all that apply.)
A. Sanctum s = c.new Sanctum();
B. c.Sanctum s = c.new Sanctum();
C. c.Sanctum s = Cathedral.new Sanctum();
D. Cathedral.Sanctum s = c.new Sanctum();
E. Cathedral.Sanctum s = Cathedral.new Sanctum();
4. Trees
Course Structure

Foundation: Attend lectures; Read book chapters

Integration: Introduction to CS146, Algorithm Analysis, Trees, Hashing, Priority Queues, Sorting, The Disjoint Set Class, Graph Algorithms

Reinforcement: Assignment 1, Assignment 2, Assignment 3, Assignment 4, Mid-Term, Final Exam
Trees (Chapter 4)

Preliminaries
Implementation of Trees
Tree Traversals with an Application
Binary Trees
  Implementation
  An Example: Expression Trees
contains
findMin and findMax
insert
remove
Average-Case Analysis

WEEK #5
LECTURE #9
Chapter 4 Overview

Trees in general are very useful abstractions in computer science. In Chapter 4, we will:
- See how trees are used to implement the file system of several popular operating systems.
- See how trees can be used to evaluate arithmetic expressions.
- Show how to use trees to support searching operations in O(log N) average time, and how to refine these ideas to obtain O(log N) worst-case bounds.
- Discuss and use the TreeSet and TreeMap classes.

4.0
Tree Preliminaries

parent, child, grandparent, grandchild, siblings, edge
Depth = 2
Height (the longest path) = 3

4.1
Implementation of Trees

class TreeNode
{
    Object element;
    TreeNode firstChild;
    TreeNode nextSibling;
}

4.1.1
Advantages:
- Allow users to organize their data logically.
- Two files in different directories can share the same name.

4.1.2
Pre-order Traversal

Post-order Traversal

4.1.2
Binary Trees

Worst-case binary tree

4.2
4.2.1

General Strategy:

4.2.2
For every node X in a binary search tree:
- the values of all the items in its left subtree are smaller than the item in X,
- the values of all the items in its right subtree are larger than the item in X.

This is NOT a binary search tree. / This is a binary search tree.

4.3
insert

To insert X into tree T, proceed down the tree as you would with a contains. If X is found, do nothing (or update something). Otherwise, insert X at the last spot on the path traversed.

Example: adding 5 to the tree.

4.3.3
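The insert description above can be sketched recursively: walk down exactly as contains would, and attach the new node where the walk falls off the tree (a sketch with int elements; the class name is mine):

```java
public class BST {
    static class Node {
        int element;
        Node left, right;
        Node(int e) { element = e; }
    }

    // Walk down as contains would; attach a new node at the
    // last spot on the path traversed.
    static Node insert(int x, Node t) {
        if (t == null)
            return new Node(x);
        if (x < t.element)
            t.left = insert(x, t.left);
        else if (x > t.element)
            t.right = insert(x, t.right);
        // else: x is already present, do nothing
        return t;
    }

    static boolean contains(int x, Node t) {
        if (t == null) return false;
        if (x < t.element) return contains(x, t.left);
        if (x > t.element) return contains(x, t.right);
        return true;
    }
}
```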
remove

To remove node X from tree T:

4.3.4
Average-Case Analysis

A binary search tree operation should take O(log N) time, because in constant time we descend a level in the tree, thus operating on a tree that is now roughly half as large.

Internal Path Length, D(N): the sum of the depths of all nodes in a tree (N-node tree).

D(N) = D(i) + D(N − i − 1) + N − 1

If all subtree sizes are equally likely, we can put the average value of the subtree sizes into D(N):

4.3.5
Unbalanced Tree

After a quarter-million random insert/remove pairs, the tree looks decidedly unbalanced (average depth equals 12.51).

4.3.5
AVL Trees
  Single Rotation
  Double Rotation
  Review of source code
Splay Trees
  Splaying
B-Trees

WEEK #5
LECTURE #10
AVL Trees

An AVL (Adelson-Velskii and Landis) tree is a binary search tree with a balance condition.

Requiring the left and right subtrees to have the same height is too strict. Instead, an AVL tree is identical to a binary search tree, except that for every node, the heights of the left and right subtrees can differ by at most 1.

Height of an AVL tree: at most 1.44 log(N + 2) − 1.328

AVL / Unbalanced

4.4
4.4
After insertion in Case 1, the left subtree of node k2 is two levels deeper than its right subtree.

Rebalancing

4.4.1

The child becomes the new root.

4.4.1
Inserting 6
Inserting 4, 5
Inserting 7

4.4.1
Double Rotation

Single rotation fails to fix cases 2 or 3.

4.4.2
Double Rotation

Left-right double rotation to fix case 2:
http://www.cs.uah.edu/~rcoleman/CS221/Trees/AVLTree.html

Double Rotation

Right-left double rotation to fix case 3:
http://www.cs.uah.edu/~rcoleman/CS221/Trees/AVLTree.html

4.4.2
4.4.2
WEEK #6
LECTURE #11

Splay Trees
B-Trees
Sets and Maps
Review of AVL source code (if time allows)
Splay Trees

Basic ideas for the Splay Tree:

4.5
Splay Trees

Whenever a splay tree node is accessed, the tree performs splaying operations that move the accessed node to the root of the tree.

Splaying a node consists of a series of rotations, similar to AVL tree rotations.

4.5
Splay Trees

If a node has not been accessed in a while, you will pay the performance penalty of splaying the next time it is accessed. But access of that node in the near future is very fast. So we amortize the cost of splaying over future operations.

4.5
Zig-zag

Strategy: rotate bottom-up along the access path.

X is the right child and P is the left child: perform a double rotation.

4.5
Example

Example, with a contains on k1.

k1 is a zig-zag, so we do a rotation with k1, k4 and k5.

4.5
Splay Trees

How is a worst-case BST created? When all the nodes are entered in sorted order.

Suppose the bottom node is accessed in such a tree:

4.5
Splay Trees: Splaying at node 1
Splay Trees: Splaying at node 2
Splay Trees: Splaying at node 3
Splay Trees: Splaying at node 4
Splay Trees: Splaying at node 5
Splay Trees: Splaying at node 6
Splay Trees: Splaying at node 7
Splay Trees: Splaying at node 8
Splay Trees: Splaying at node 9

4.5
B-trees

A B-tree is a tree data structure suitable for disk drives:
- It may take up to 11 ms to access data on disk.
- Today's modern CPUs can execute billions of instructions per second.
- Therefore, it makes sense for us to spend CPU cycles to reduce the number of disk accesses.

4.7

B-trees

A B-tree is an m-ary (allowing for M-way branching) tree.

4.7
B-trees

B-tree of order 5

B-tree after insertion of 57 into the tree

Insertion of 55 into the B-tree causes a split into two leaves

Insertion of 40 causes a split into two leaves and then a split of the parent node

B-tree after the deletion of 99 from the B-tree

4.7
Sets

The Set interface is inherited from Collection:
- Does not allow duplicates
- Operations: insert, remove, etc.
- Very efficient basic search
- SortedSet interface

4.8.1

Maps

The Map interface stores key-value pairs (note that, unlike Set, Map does not extend Collection).
- SortedMap interface

4.8.2
Summary of search trees (Tree / Description / Application):
- Binary Search Trees: left subtree is less than X; right subtree is larger than X
- AVL Trees
- Splay Trees
- B-Trees
5. Hashing
Course Structure
Foundation, Reinforcement (attend lectures, read book chapters), Integration.
Topics: Introduction to CS146, Algorithm Analysis, Trees, Hashing, Priority Queues, Mid-Term, Sorting, The Disjoint Set Class, Graph Algorithms, Final Exam (Assignments 1-4 interspersed).
Hashing (Chapter 5)
General Idea; Hash Function; Separate Chaining; Hash Tables without Linked Lists; Rehashing
WEEK #6
LECTURE #12
For instance, suppose we just happen to know that the item we want is at position 3; we can apply: myitem = myarray[3]; With this, we don't have to search through each element in the array; we just access position 3. The question is: how do we know that position 3 stores the data that we are interested in? This is where hashing comes in handy. Given some key, we can apply a hash function to it to find an index or position that we want to access.
Chapter Overview
Hashing: the implementation of hash tables, a technique for performing insertions, deletions, and searches in constant average time.
General Idea
Hash table data structure: an array of some fixed size, containing the items. Each item could consist of a key and additional data fields.
Hash function: the mapping that converts each key into some number in the range 0 to TableSize - 1; the item is then placed in the appropriate cell. Ideally it distributes the keys evenly among the cells.
Collision: when two keys hash to the same value.
5.1
Hash Function
Some simple hash functions:
- Keys are integers: return Key mod TableSize.
- Keys are strings: add up the ASCII values of the characters in the string. (What if TableSize = 10,007 and a typical key has at most eight characters? ASCII values are at most 127, so the hash can be at most 127 * 8 = 1,016; the keys crowd into the low cells, which is not a good, equitable distribution.)
5.2
Hash Function
Another simple hash function assumes keys have at least three characters and combines just the first three (e.g., key[0] + 27 * key[1] + 729 * key[2]). Since English text is not random, this also fails to distribute keys well.
A good hash function involves all the characters, for example mapping a key to the polynomial sum of key[i] * 37^i, computed with Horner's rule and taken mod TableSize.
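A Horner's-rule string hash along the lines sketched above; the multiplier 37 follows the slide, and the table size here is arbitrary:

```java
public class StringHash {
    // Polynomial hash: sum of key[i] * 37^i, evaluated with Horner's rule.
    // Java's int arithmetic may wrap, so a negative remainder is corrected.
    public static int hash(String key, int tableSize) {
        int hashVal = 0;
        for (int i = 0; i < key.length(); i++) {
            hashVal = 37 * hashVal + key.charAt(i);
        }
        hashVal %= tableSize;
        if (hashVal < 0) {
            hashVal += tableSize; // overflow can make the remainder negative
        }
        return hashVal;
    }

    public static void main(String[] args) {
        System.out.println(hash("junk", 10007));
    }
}
```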
Separate Chaining (5.3)
Linear Probing (5.4.1)
Linear Probing: Number of probes plotted against load factor for linear probing (dashed) and the random strategy (S is successful search, U is unsuccessful search, and I is insertion).
5.4.1
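Linear probing can be sketched as follows: on a collision at h(x), try cells h(x)+1, h(x)+2, ... with wraparound until an empty cell is found (the keys and table size are illustrative):

```java
public class LinearProbing {
    // Build an open-addressing table; empty cells are null.
    public static Integer[] build(int[] keys, int tableSize) {
        Integer[] table = new Integer[tableSize];
        for (int key : keys) {
            int pos = key % tableSize;
            while (table[pos] != null) {      // collision: probe the next cell
                pos = (pos + 1) % tableSize;  // linear probing, wrapping around
            }
            table[pos] = key;
        }
        return table;
    }

    public static void main(String[] args) {
        Integer[] t = build(new int[]{89, 18, 49, 58, 69}, 10);
        for (int i = 0; i < t.length; i++) {
            System.out.println(i + ": " + t[i]);
        }
    }
}
```

Note how 49, 58, and 69 all collide near cell 9 and pile up in cells 0, 1, and 2: this pile-up is the primary clustering that the next slides address.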
Quadratic Probing
Quadratic probing is a collision resolution method that eliminates the primary clustering problem of linear probing: f(i) = i^2. When a collision occurs, the next position attempted is one cell away; subsequent attempts are 4, 9, ... cells away.
5.4.2
Double
Hashing
Secondary
clustering:
quadraJc
probing
eliminates
primary
clustering,
elements
that
hash
to
the
same
posiJon
will
probe
the
same
alternaJve
cells.
Double
hashing
eliminates
secondary
clustering:
f(i)
=
i.hash2(x)
by
applying
a
second
hash
funcJon
to
x
and
probing
at
a
distance
hash2(x),
2hash2(x),
and
so
on.
5.4.3
Rehashing
Problems when the table gets too full:
- Running time for the operations will take too long.
- Insertions might fail for open addressing hashing with quadratic resolution, especially if there are too many removals intermixed with insertions.
Solution:
- Build another table that is about twice as big (with an associated new hash function).
- Scan down the entire original hash table and compute the new hash value for each (non-deleted) element.
- Insert each element into the new table at its new hash value.
5.5
Rehashing
Hash function h(x) = x mod 7. After inserting 23, the hash table is over 70% of capacity.
Linear probing hash table after rehashing. New hash function: h(x) = x mod 17.
5.5
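The rehashing step can be sketched as follows, using the slide's h(x) = x mod 7 table grown to mod 17, with linear probing resolving collisions (the particular keys are illustrative):

```java
import java.util.Arrays;

public class Rehash {
    // Insert with linear probing.
    static void insert(Integer[] table, int x) {
        int pos = x % table.length;
        while (table[pos] != null) {
            pos = (pos + 1) % table.length;
        }
        table[pos] = x;
    }

    // Build a larger table and re-insert every element with the new hash.
    static Integer[] rehash(Integer[] old, int newSize) {
        Integer[] bigger = new Integer[newSize];
        for (Integer x : old) {
            if (x != null) {
                insert(bigger, x);
            }
        }
        return bigger;
    }

    static Integer[] demo() {
        Integer[] table = new Integer[7];          // h(x) = x mod 7
        for (int x : new int[]{13, 15, 6, 24, 23}) {
            insert(table, x);
        }
        return rehash(table, 17);                  // new h(x) = x mod 17
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));
    }
}
```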
WEEK #7
LECTURE #13
Extendible Hashing
A map in which the key is a word length, and the value is a collection of all words of that word length. -> HashMap
A map in which the key is a representative, and the value is a collection of all words with that representative. -> HashMap
A map in which the key is a word, and the value is a collection of all words that differ in only one character from that word. -> HashMap
Implementing hashCode
- If a class overrides equals, it must override hashCode.
- When they are both overridden, equals and hashCode must use the same set of fields.
- If two objects are equal, then their hashCode values must be equal as well.
- If the object is immutable, then hashCode is a candidate for caching and lazy initialization.
It's a popular misconception that hashCode provides a unique identifier for an object. It does not.
5.6
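The contract above can be illustrated with a small immutable class (the class and its fields are made up for illustration):

```java
import java.util.Objects;

public final class Point {
    private final int x;
    private final int y;

    public Point(int x, int y) { this.x = x; this.y = y; }

    // equals and hashCode are overridden together, using the same fields.
    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        Point p = (Point) o;
        return x == p.x && y == p.y;
    }

    @Override public int hashCode() {
        return Objects.hash(x, y); // equal objects produce equal hash codes
    }

    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
    }
}
```

The converse does not hold: two unequal objects may share a hash code, which is exactly why hashCode is not a unique identifier.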
Perfect Hashing
If we have N items, how do we lower the probability of collisions?
One approach: use a separate chaining implementation, and keep each list at most a constant number of items. As we make more lists, the lists will on average be shorter.
Perfect Hashing
Even with lots of lists, we might still get unlucky. Choose M (the number of lists) to be sufficiently large that the probability of no collisions is at least 1/2. If a collision is detected, we simply clear out the table and try again using a different hash function that is independent of the first. Keep trying until we get no collisions. The expected number of trials will be at most 2 (since the success probability is at least 1/2).
5.7.1
Perfect Hashing
The number of lists might be unreasonably large. How large does M need to be? Quite large: M = Theta(N^2). If M = N^2, the table is collision free with probability at least 1/2 (Theorem 5.2).
5.7.1
Perfect Hashing
Using N^2 lists is impractical. A more practical implementation: use only N bins, but resolve the collisions in each bin by using hash tables instead of linked lists. Since the bins are expected to have only a few items each, the hash table for each bin can be quadratic in the bin size.
5.7.1
Perfect Hashing: Perfect hashing table using secondary hash tables.
Perfect Hashing: The scheme of perfect hashing (Theorem 5.3).
5.7.1
Cuckoo Hashing
If N items are randomly tossed into N bins, the size of the largest bin is expected to be Theta(log N / log log N). If, at each toss, two bins were randomly chosen and the item was tossed into the more empty bin (at the time), then the size of the largest bin would only be Theta(log log N), a significantly lower number. That is the so-called power of two choices.
5.7.2
Cuckoo Hashing
Given N items, we maintain two tables, each more than half empty, and each with an independent hash function for assigning each item to a position in each table.
5.7.2
Cuckoo Hashing
Item A can be at either position 0 in Table 1 or position 2 in Table 2. A search in a cuckoo hash table requires at most two table accesses.
5.7.2
Cuckoo Hashing
The cuckoo hashing algorithm: to insert a new item x, first make sure it is not already there. If the first table location is empty, the item can be placed there.
5.7.2
Cuckoo Hashing
To insert B, we can add it at position 0 in Table 1 or position 0 in Table 2. Table 1 is already occupied by A in position 0. Cuckoo hashing will preemptively displace A and does not bother to look at Table 2.
5.7.2
Cuckoo Hashing
Insertion of C is straightforward. For insertion of D with hash locations (1, 0), the Table 1 location is already taken, but we don't look at the Table 2 location.
5.7.2
Cuckoo Hashing
E can be easily inserted. In order to insert F, we need to displace E, then A, and then B.
5.7.2
Cuckoo Hashing
But we cannot successfully insert G! With hash locations (1, 2), G displaces D, which displaces B, then A, then E, then F, then C, which displaces G again: CIRCULAR DEPENDENCE!
5.7.2
Cuckoo Hashing
Fortunately, if the table's load factor is below 0.5, the probability of a cycle is very low. If circular dependence really occurs, we can simply rebuild the tables with new hash functions after a certain number of displacements are detected.
5.7.2
Cuckoo Hashing
Cuckoo hash table implementation:
- Allow an arbitrary number of hash functions.
- Use a single array that is addressed by all the hash functions (instead of two separately addressable hash tables).
- Specify the maximum load to be 0.4 (auto expansion if the load is higher).
- Specify how many rehashes we will perform.
5.7.2
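The displacement loop described in the slides can be sketched with two tables; the two hash functions here are made up for illustration, and a real implementation would rehash instead of just reporting failure:

```java
public class CuckooSketch {
    // Two toy hash functions, for illustration only.
    static int h1(int x, int size) { return x % size; }
    static int h2(int x, int size) { return (x / size) % size; }

    // Insert x, displacing residents cuckoo-style; give up after maxKicks
    // displacements (a probable cycle, handled by rehashing in practice).
    static boolean insert(Integer[] t1, Integer[] t2, int x, int maxKicks) {
        Integer cur = x;
        for (int kick = 0; kick < maxKicks; kick++) {
            int p1 = h1(cur, t1.length);
            if (t1[p1] == null) { t1[p1] = cur; return true; }
            Integer evicted = t1[p1];      // displace the resident of table 1
            t1[p1] = cur;
            cur = evicted;
            int p2 = h2(cur, t2.length);
            if (t2[p2] == null) { t2[p2] = cur; return true; }
            evicted = t2[p2];              // displace the resident of table 2
            t2[p2] = cur;
            cur = evicted;
        }
        return false;                      // caller should rebuild the tables
    }

    public static void main(String[] args) {
        Integer[] t1 = new Integer[5], t2 = new Integer[5];
        for (int x : new int[]{7, 12, 17}) {
            System.out.println(x + " inserted: " + insert(t1, t2, x, 10));
        }
    }
}
```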
Hopscotch Hashing
Hopscotch hashing bounds the maximal length of the probe sequence by a predetermined constant that is optimized to the underlying computer's architecture, for example MAX_DIST = 4. This gives constant-time lookups in the worst case, and the lookup could be parallelized to simultaneously check the bounded set of possible locations.
5.7.3
Hopscotch Hashing
The hops tell which of the positions in the block are occupied with cells containing this hash value. Thus Hop[8] = 0010 indicates that only position 10 currently contains items whose hash value is 8, while positions 8, 9, and 11 do not.
5.7.3
Hopscotch Hashing
Attempting to insert H: linear probing suggests location 13, but that is too far, so we evict G from position 11 to find a closer position.
5.7.3
Hopscotch Hashing
Attempting to insert I: linear probing suggests location 14, but that is too far; consulting Hop[11], we see that G can move down, leaving position 13 open. Consulting Hop[10] gives no suggestions. Hop[11] does not help either (why?), so Hop[12] suggests moving F.
5.7.3
Hopscotch Hashing
Insertion of I continues: next B is evicted, and finally we have a spot that is close enough to the hash value and can insert I.
5.7.3
Extendible Hashing
What if the full amount of data is too large to fit in memory? Our main concern is the number of disk accesses to get a given data item. Suppose N items are to be stored and M items fit on each disk block. Collisions will cause a number of blocks to be examined, resulting in significant disk read cost, and when the hash table becomes too full, rehashing will be needed, at a cost of O(N) disk accesses.
Extendible hashing:
- Search: two disk accesses
- Insertion: few disk accesses
5.9
Extendible Hashing: Original data.
Extendible Hashing: After insertion of 100100 and directory split.
Extendible Hashing: After insertion of 000000 and leaf split.
5.9
6. Priority Queues
Hashing (Chapter 5)
Hash Tables with Worst-Case O(1) Access; Hopscotch Hashing; Extendible Hashing
WEEK #7
LECTURE #14
HASHING
(Figure: number grid 10-30 with the primes 11, 13, 17, 19, 23, and 29 picked out.)
PRIORITY QUEUES
Chapter Overview
Priority queues are used in many applications, for example the queue for print jobs sent to a printer: 1-page jobs should be prioritized over a 100-page job.
Model
Two main operations for a priority queue:
- insert: the equivalent of the enqueue operation
- deleteMin: the equivalent of the dequeue operation
6.1
Simple Implementations
There are several obvious ways to implement a priority queue, for example a linked list.
Binary Heap
A binary heap (or just heap) is a binary tree that is complete: all levels of the tree are full except possibly the bottom level, which is filled from left to right.
6.3
Binary Heap
Conceptually, a heap is a binary tree, but we can implement it as an array. For any element in array position i:
- Left child is at position 2i
- Right child is at position 2i + 1
- Parent is at position i / 2
6.3
Heap-Order Property
We want to find the minimum value (highest priority) very quickly, so make the minimum value always be at the root, and apply this rule also to the roots of subtrees. This is a weaker rule than for a binary search tree: it is not necessary that values in the left subtree be less than the root value and values in the right subtree be greater than the root value.
6.3
Heap-Order Property
Two complete trees (only the left tree is a heap).
6.3
Heap Insertion
Insertion strategy, percolate up: repeatedly do a heap insertion on the list of values, percolating the hole up each time from the bottom of the heap.
Inserting 14.
6.3
Heap Insertion
Procedure to insert into a binary heap:
6.3
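The insertion procedure (percolate up) can be sketched as follows, using the 1-based array layout from the slides (left child 2i, right child 2i+1, parent i/2); this is a sketch with a fixed capacity, not the textbook's exact routine:

```java
public class BinaryHeap {
    private int[] array = new int[64]; // position 0 unused; root at index 1
    private int size = 0;

    // Insert x by creating a hole at the end and percolating it up
    // until the parent is no smaller than x (min-heap order).
    public void insert(int x) {
        int hole = ++size;
        for (; hole > 1 && x < array[hole / 2]; hole /= 2) {
            array[hole] = array[hole / 2]; // slide the parent down into the hole
        }
        array[hole] = x;
    }

    public int findMin() {
        return array[1]; // heap order keeps the minimum at the root
    }

    public static void main(String[] args) {
        BinaryHeap h = new BinaryHeap();
        for (int x : new int[]{31, 41, 59, 26, 53, 58, 97, 14}) {
            h.insert(x);
        }
        System.out.println(h.findMin()); // 14
    }
}
```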
buildHeap
Each dashed line in the figures corresponds to two comparisons during a call to percolateDown(): one to find the smaller child, and one to compare the smaller child to the node. The maximum number of dashed lines is the sum of the heights of all the nodes in the heap. Prove that this sum is O(N).
6.4
Building Heap: Sketch of buildHeap.
Building Heap: Initial heap and after percolateDown(7).
Building Heap: percolateDown(6) and percolateDown(5).
Building Heap: percolateDown(4) and percolateDown(3).
Building Heap: percolateDown(2) and percolateDown(1).
6.4
The sum of the heights of the nodes in a perfect binary tree of height h is

    S = sum_{i=0}^{h} 2^i (h - i)

which evaluates to

    S = -h + 2 + 4 + 8 + ... + 2^(h-1) + 2^h = (2^(h+1) - 1) - (h + 1)

6.4
WEEK #8
LECTURE #15
Leftist Heaps
A leftist heap is a heap that supports efficient merging. A node insertion into a leftist heap is a merge with a one-node tree. A deletion of the root splits a leftist tree into two trees, which are then merged back together.
6.6
Skew Heaps
Skew heaps: a self-adjusting version of a leftist heap, simple to implement. They are binary trees with heap order but without a structural constraint on the trees. Unlike leftist heaps, skew heaps keep no NPL (null path length) information in any node and always perform an unconditional swap. (Leftist heaps check whether the left and right children satisfy the leftist heap structure property and swap them only if they do not.) Skew heaps make one exception: the largest of all the nodes on the right paths does not have its children swapped.
6.7
Skew Heaps: Two skew heaps H1 and H2.
Skew Heaps: Result of merging H2 with H1's right subheap.
Skew Heaps: Result of merging skew heaps H1 and H2.
6.7
Binomial Queues
Binomial queues are similar to leftist and skew heaps in supporting the merging, insertion, and deleteMin operations, and in having O(log N) worst-case time per operation for those three operations. But insertions take only constant time on average.
6.8
Binomial Queues
A binomial queue is a collection of heap-ordered trees, known as a forest.
Binomial trees B0, B1, B2, B3, and B4.
6.8
7. Sorting
Sorting (Chapter 7)
Insertion Sort; Shellsort; Heapsort; Mergesort
WEEK #9
LECTURE #17
Insertion Sort
Number of inputs, N = 6. Positions 0 through p-1 are already sorted; the element in position p is moved left until its correct place among the first p+1 elements is found.
7.2
7.2
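Insertion sort as described can be sketched as follows (the input array is the six-element example):

```java
import java.util.Arrays;

public class InsertionSort {
    // In pass p, positions 0..p-1 are already sorted; slide larger
    // elements right until a[p]'s correct place opens up.
    public static void sort(int[] a) {
        for (int p = 1; p < a.length; p++) {
            int tmp = a[p];
            int j = p;
            for (; j > 0 && tmp < a[j - 1]; j--) {
                a[j] = a[j - 1];
            }
            a[j] = tmp;
        }
    }

    public static void main(String[] args) {
        int[] a = {34, 8, 64, 51, 32, 21}; // N = 6
        sort(a);
        System.out.println(Arrays.toString(a)); // [8, 21, 32, 34, 51, 64]
    }
}
```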
7.3
Shellsort
Other notes on insertion sort:
- Insertion sort is fast if the array is nearly sorted.
- Parallelism? If we can swap non-adjacent values, we may be able to remove more than one inversion at a time.
- If we can get the array nearly sorted as soon as possible, insertion sort can finish the job quickly.
Shellsort
Basic principle: we start by comparing elements that are distant. The distance, h, between comparisons decreases as the algorithm runs, until the last phase, in which adjacent elements are compared. This is referred to as diminishing increment sort.
7.4
Shellsort
Shellsort uses a sequence h1, h2, ..., ht, called the increment sequence. It works like insertion sort, except that we compare values that are h elements apart in the list: a[i] and a[i+hk]. hk diminishes after completing a pass, e.g., 5, 3, and 1. The file is said to be hk-sorted, for example 5-sorted, 3-sorted, etc. The final value, h1, must be 1, so the final pass is always a regular insertion sort.
7.4
A Shellsort Example
Shellsort after each pass, using [1, 3, 5] as the increment sequence.
7.4
Implementation of Shellsort
Shellsort routine using Shell's increments (better increments are possible). Shell's increments: ht = floor(N/2) and hk = floor(h(k+1)/2).
7.4
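A shellsort routine using Shell's increments, as the slide describes (each pass is a gap-insertion sort):

```java
import java.util.Arrays;

public class Shellsort {
    // Gap starts at N/2 and halves each pass (Shell's increments);
    // each pass insertion-sorts the elements that are gap apart.
    public static void sort(int[] a) {
        for (int gap = a.length / 2; gap > 0; gap /= 2) {
            for (int i = gap; i < a.length; i++) {
                int tmp = a[i];
                int j = i;
                for (; j >= gap && tmp < a[j - gap]; j -= gap) {
                    a[j] = a[j - gap];
                }
                a[j] = tmp;
            }
        }
    }

    public static void main(String[] args) {
        int[] a = {81, 94, 11, 96, 12, 35, 17, 95, 28, 58, 41, 75, 15};
        sort(a);
        System.out.println(Arrays.toString(a));
    }
}
```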
Heapsort
Heapsort is based on using a priority queue, with running time O(N log N). To sort N values into increasing order:
- Build a heap: running time O(N).
- Perform N deletions: O(log N) each.
- Sorted values can be appended to the end of the underlying array.
7.5
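Heapsort as outlined above; this sketch uses a max-heap stored 0-based so each deleted maximum can be swapped to the end of the same array:

```java
import java.util.Arrays;

public class Heapsort {
    private static int leftChild(int i) { return 2 * i + 1; } // 0-based layout

    // Percolate a[i] down within a[0..n-1] to restore max-heap order.
    private static void percDown(int[] a, int i, int n) {
        int child;
        int tmp = a[i];
        for (; leftChild(i) < n; i = child) {
            child = leftChild(i);
            if (child != n - 1 && a[child] < a[child + 1]) {
                child++; // pick the larger child
            }
            if (tmp < a[child]) {
                a[i] = a[child];
            } else {
                break;
            }
        }
        a[i] = tmp;
    }

    public static void sort(int[] a) {
        for (int i = a.length / 2 - 1; i >= 0; i--) {
            percDown(a, i, a.length);                // build the heap: O(N)
        }
        for (int i = a.length - 1; i > 0; i--) {
            int tmp = a[0]; a[0] = a[i]; a[i] = tmp; // move the max to the end
            percDown(a, 0, i);                       // O(log N) per deletion
        }
    }

    public static void main(String[] args) {
        int[] a = {59, 36, 58, 21, 41, 97, 31, 16, 26, 53};
        sort(a);
        System.out.println(Arrays.toString(a));
    }
}
```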
Mergesort
Mergesort uses the strategy of divide and conquer.
- Divide: split the list of values into two halves and recursively sort each half.
- Conquer: merge the two sorted halves back together.
7.6
Mergesort Illustrated
The basic merging algorithm takes two input arrays A and B, an output array C, and three counters, Actr, Bctr, and Cctr.
7.6
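The merge step on arrays A and B, with the counters named above, can be sketched as follows (the main method uses the worked example from the slides):

```java
import java.util.Arrays;

public class Merge {
    // Merge sorted arrays a and b into a new array c, repeatedly copying
    // the smaller front element; then copy whichever input remains.
    public static int[] merge(int[] a, int[] b) {
        int[] c = new int[a.length + b.length];
        int actr = 0, bctr = 0, cctr = 0;
        while (actr < a.length && bctr < b.length) {
            c[cctr++] = (a[actr] <= b[bctr]) ? a[actr++] : b[bctr++];
        }
        while (actr < a.length) c[cctr++] = a[actr++]; // A's remainder
        while (bctr < b.length) c[cctr++] = b[bctr++]; // B's remainder
        return c;
    }

    public static void main(String[] args) {
        int[] a = {1, 13, 24, 26};
        int[] b = {2, 15, 27, 38};
        System.out.println(Arrays.toString(merge(a, b)));
        // [1, 2, 13, 15, 24, 26, 27, 38]
    }
}
```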
Mergesort
Illustrated
If
the
array
A
contains
1,
13,
24,
26,
and
B
contains
2,
15,
27,
38,
then
the
algorithm
proceeds
as
follows:
First,
a
comparison
is
done
between
1
and
2.
1
is
added
to
C,
and
then
13
and
2
are
compared.
7.6
Mergesort Illustrated: 2 is added to C, and then 13 and 15 are compared.
Mergesort Illustrated: 13 is added to C, and then 24 and 15 are compared. This proceeds until 26 and 27 are compared.
Mergesort Illustrated: 26 is added to C, and the A array is exhausted.
Mergesort Illustrated: The remainder of the B array is then copied to C.
7.6
Analysis of Mergesort
What is the running time for mergesort? Let T(N) be the time to sort N values. For N = 1, the time to mergesort is constant, O(1). Otherwise, it takes 2T(N/2) for the two recursive mergesorts, plus N to do the merge.
7.6
Analysis of Mergesort
Solving the recurrence T(N) = 2T(N/2) + N gives T(N) = O(N log N).
7.6
Sorting (Chapter 7)
Quicksort; Picking the Pivot; Partitioning Strategy
WEEK #10
LECTURE #20
Quicksort
Quicksort is one of the most elegant and useful algorithms in computer science: a fast divide-and-conquer recursive algorithm with a very tight and highly optimized inner loop.
Performance: average running time is O(N log N); worst-case performance is O(N^2).
Basic idea: find a good pivot value in the list, partition around it, and recursively sort the two sublists. Similar to mergesort, but it does not require merging or a temporary array.
7.7
Quicksort: Algorithm
A simple recursive sorting algorithm: the input is divided into three sublists of elements smaller than, the same as, and larger than the pivot.
Quicksort: Example
1. Select the pivot. 2. Partition. 3. Recursively sort. The sublists may not be of equal size.
7.7
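The three-sublist formulation above can be sketched directly (a production quicksort partitions in place; this version favors clarity over speed):

```java
import java.util.ArrayList;
import java.util.List;

public class SimpleQuicksort {
    // Partition into smaller / same / larger around a pivot,
    // recursively sort the outer two, and concatenate.
    public static List<Integer> sort(List<Integer> items) {
        if (items.size() <= 1) {
            return new ArrayList<>(items);
        }
        int pivot = items.get(items.size() / 2); // middle element as pivot
        List<Integer> smaller = new ArrayList<>();
        List<Integer> same = new ArrayList<>();
        List<Integer> larger = new ArrayList<>();
        for (int x : items) {
            if (x < pivot) smaller.add(x);
            else if (x > pivot) larger.add(x);
            else same.add(x);
        }
        List<Integer> result = sort(smaller);
        result.addAll(same);
        result.addAll(sort(larger));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sort(List.of(13, 81, 92, 43, 65, 31, 57, 26, 75, 0)));
    }
}
```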
Median-of-Three Partitioning
The median of the array is hard to calculate. Alternative: pick three elements randomly and use the median of these three as the pivot. Better for implementation: pick the left, right, and center elements. Result: reduces the number of comparisons by about 14%.
4. Stop when i and j cross over.
7.7.2
WEEK #11
LECTURE #21
Quicksort (7.7)
Analysis of Quicksort (7.7.5)
A Linear-Expected-Time Algorithm for Selection (7.7.6)
Quicksort: Analysis
What is the running time to quicksort a list of N values? Partition the array into two subarrays (cN time), then make a recursive call on each subarray. A recurrence relation:
7.7.5
With a random pivot, each subarray size is equally likely, so the average value of T(i), and hence of T(N - i - 1), is (1/N) * sum_{j=0}^{N-1} T(j). Therefore:

    T(N) = (2/N) * sum_{j=0}^{N-1} T(j) + cN

Multiply through by N:

    N T(N) = 2 * sum_{j=0}^{N-1} T(j) + cN^2        (a)

Subtracting the same equation with N - 1 in place of N removes the sum:

    N T(N) - (N - 1) T(N - 1) = 2 T(N - 1) + 2cN - c

Dropping the insignificant -c and rearranging:

    N T(N) = (N + 1) T(N - 1) + 2cN

Divide through by N(N + 1):

    T(N)/(N + 1) = T(N - 1)/N + 2c/(N + 1)

Telescope:

    T(N - 1)/N       = T(N - 2)/(N - 1) + 2c/N
    T(N - 2)/(N - 1) = T(N - 3)/(N - 2) + 2c/(N - 1)
    ...
    T(2)/3           = T(1)/2 + 2c/3

Summing, T(N)/(N + 1) = T(1)/2 + 2c * sum_{i=3}^{N+1} 1/i. Recall the harmonic sum: sum 1/i is approximately log_e N. And so:

    T(N)/(N + 1) = O(log N)

Therefore:

    T(N) = O(N log N)
7.7.5
7.8
Bucket Sort
Input: A1, A2, ..., AN, positive integers < M.
Bucket sort algorithm:
- Keep an array called count, of size M, initialized with all 0s.
- When Ai is read, increment count[Ai] by 1.
- After all the input is read, scan the count array, printing out a representation of the sorted list.
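The count-array algorithm above, sketched in Java (here the "printed representation" is returned as a sorted array instead):

```java
import java.util.Arrays;

public class BucketSort {
    // Sort positive integers smaller than m by counting occurrences.
    public static int[] sort(int[] a, int m) {
        int[] count = new int[m];          // all zeros initially
        for (int x : a) {
            count[x]++;                    // tally each input value
        }
        int[] sorted = new int[a.length];
        int out = 0;
        for (int v = 0; v < m; v++) {      // scan counts in increasing order
            for (int k = 0; k < count[v]; k++) {
                sorted[out++] = v;
            }
        }
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sort(new int[]{5, 1, 4, 1, 3}, 10)));
        // [1, 1, 3, 4, 5]
    }
}
```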
Radix Sort
Radix sort is sometimes known as card sort. It was used by the old electromechanical IBM card sorters to sort punched cards.
7.11
Radix Sort
Input: 10 numbers in the range 0 to 999.
Principle: too many buckets would make bucket sort not so useful here. How about using several passes of bucket sort? Perform the bucket sorts digit by digit, starting with the least significant digit first.
7.11
7.11
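Least-significant-digit-first radix sort, built from three stable bucket-sort passes for the 0..999 range mentioned above:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RadixSort {
    // Sort numbers in 0..999 with three stable bucket-sort passes,
    // least significant digit first.
    public static void sort(int[] a) {
        for (int digit = 0, div = 1; digit < 3; digit++, div *= 10) {
            List<List<Integer>> buckets = new ArrayList<>();
            for (int d = 0; d < 10; d++) {
                buckets.add(new ArrayList<>());
            }
            for (int x : a) {
                buckets.get((x / div) % 10).add(x); // bucket by current digit
            }
            int out = 0;
            for (List<Integer> b : buckets) {       // stable collection pass
                for (int x : b) {
                    a[out++] = x;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] a = {64, 8, 216, 512, 27, 729, 0, 1, 343, 125};
        sort(a);
        System.out.println(Arrays.toString(a));
        // [0, 1, 8, 27, 64, 125, 216, 343, 512, 729]
    }
}
```

Stability of each pass is what makes this work: ties on the current digit preserve the order established by the earlier, less significant digits.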
External Sorting
Internal sorting algorithms take advantage of the fact that memory is directly addressable. External sorting algorithms are designed to handle very large inputs, where the input is much too large to fit into memory. Sometimes the time it takes to read the input is significant compared to the time to sort the input: even though sorting is an O(N log N) operation and reading the input is only O(N), in reality the cost of reading the input is much larger than its O(N) bound suggests.
7.12
Assume that the internal memory can hold and sort M records at a time, so M records are read at a time from the input tape. We use four tapes: two input and two output tapes.
(Figures: runs are repeatedly merged; "Merge", "Merged".)
With a k-way merge, the number of merge passes is ceiling(log_k(N/M)); for example, ceiling(log_3(13/3)) = 2.
7.12
WEEK #11
LECTURE #22
Wearable Computing
Interested in wearable computing projects? Android programming (Android L, Android Wear, Google Fit), iOS programming (Swift, ANCS), Arduino, C programming.
Introduction
The Disjoint Set Class is an efficient data structure to solve the equivalence problem. The data structure is simple to implement and extremely fast, but its analysis is extremely difficult.
Equivalence Relations
Define a relation R on members of a set S: for each pair of elements (a, b), where a and b are in S, a R b is either true or false. If a R b is true, then a is related to b.
- Reflexive: a R a for all a in S (any component is connected to itself).
- Symmetric: a R b if and only if b R a (if a is electrically connected to b, then b must be electrically connected to a).
- Transitive: if a R b and b R c, then a R c (if a is connected to b and b is connected to c, then a is connected to c).
8.1
Another example, with roads between cities:
- Reflexive: a R a for all a in S (any city is connected to itself).
- Symmetric: a R b if and only if b R a (if it is possible to travel from city a to city b by roads, then it is also possible to travel from city b to city a by roads).
- Transitive: if a R b and b R c, then a R c (if it is possible to travel from city a to city b and from city b to city c, then it is possible to travel from city a to city c).
8.1
8.2
8.3
WEEK #12
LECTURE #23
8.4
Path Compression
Problems with the union/find algorithms: the worst case of O(M log N) for the union/find algorithm can occur fairly easily and naturally. If there are many more finds than unions, this running time is bad.
8.5
Path Compression
Path compression is an operation that does something clever on the find operation. It is performed during a find and is independent of the strategy used to perform unions. Suppose the operation is find(x): the effect of path compression is that every node on the path from x to the root has its parent changed to the root.
8.5
Path Compression
An example of path compression after find(14) on the generic worst tree: nodes 12 and 13, and nodes 14 and 15, are now closer to the root.
8.5
Path Compression
Code for the disjoint set find with path compression.
8.5
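The find-with-path-compression routine referred to above can be sketched as follows; a common textbook layout is assumed, where a negative array entry marks a root (no smart union rule is shown in this sketch):

```java
public class DisjSets {
    private int[] s;

    // Each element starts as its own root; a negative entry marks a root.
    public DisjSets(int numElements) {
        s = new int[numElements];
        java.util.Arrays.fill(s, -1);
    }

    // Find with path compression: every node visited on the way to the
    // root gets its parent pointer changed to point directly at the root.
    public int find(int x) {
        if (s[x] < 0) {
            return x;
        }
        return s[x] = find(s[x]);
    }

    // Simple union of two roots.
    public void union(int root1, int root2) {
        s[root2] = root1;
    }

    public static void main(String[] args) {
        DisjSets ds = new DisjSets(8);
        ds.union(0, 1);
        ds.union(0, 2);
        System.out.println(ds.find(2) == ds.find(1)); // true: same set
    }
}
```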
An Application
An example of the use of the union/find data structure is the generation of mazes.
8.7
An Application
Initial state: all walls up, all cells in their own set.
8.7
An Application
At some point in the algorithm: several walls are down and sets have merged. If at this point the wall between 8 and 13 is randomly selected, this wall is not knocked down, because 8 and 13 are already connected.
8.7
An Application
The wall between squares 18 and 13 is randomly selected; this wall is knocked down, because 18 and 13 are not already connected; their sets are merged.
8.7
An Application
Eventually, 24 walls are knocked down; all elements are in the same set.
BACKUP
8.6
Iterated Logarithm
For practicality, the iterated logarithm with base 2 has a value no more than 5. lg* 4 = 2.
8.6
9. Graph Algorithms
Course Structure
Foundation, Reinforcement (attend lectures, read book chapters), Integration.
Milestones: Introduction to CS146, Algorithm Analysis, Assignment 1, Quiz 1, Assignment 2, Quizzes 2 and 3, Assignment 3, Mid-Term, Assignment 4, This Lecture, Quiz 4, Final Exam.
WEEK #12
LECTURE #24
Introduction
We are going to discuss several common problems in graph theory. In many applications, the obvious algorithms are too slow unless we pay attention to the choice of data structure.
Graph Theory
In computer science, graph theory is the study of graphs, which are mathematical structures used to model pairwise relations between objects.
Path
A path is a sequence of vertices w1, w2, w3, ..., wN where (wi, wi+1) is in E, for 1 <= i < N. The length of the path is the number of edges on the path. A simple path has all distinct vertices, except that the first and last can be the same.
9.1
Cycle
A cycle in a directed graph is a path of length at least 1 where w1 = wN. A directed graph with no cycles is acyclic. A DAG is a directed acyclic graph.
9.1
More on Definitions
An undirected graph is connected if there is a path from every vertex to every other vertex. A directed graph with this property is strongly connected. A directed graph is weakly connected if it is not strongly connected but the underlying undirected graph is connected.
Representation of Graphs
Represent a directed graph with an adjacency list: for each vertex, keep a list of all adjacent vertices.
9.1
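An adjacency-list representation can be sketched with a map from each vertex to its list of neighbors (the vertex names are illustrative):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Digraph {
    // For each vertex, keep a list of all adjacent vertices.
    private final Map<String, List<String>> adj = new HashMap<>();

    public void addEdge(String from, String to) {
        adj.computeIfAbsent(from, v -> new ArrayList<>()).add(to);
        adj.computeIfAbsent(to, v -> new ArrayList<>()); // ensure vertex exists
    }

    public List<String> neighbors(String v) {
        return adj.getOrDefault(v, List.of());
    }

    public static void main(String[] args) {
        Digraph g = new Digraph();
        g.addEdge("v1", "v2");
        g.addEdge("v1", "v4");
        g.addEdge("v2", "v4");
        System.out.println(g.neighbors("v1")); // [v2, v4]
    }
}
```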
Topological Sort
A topological sort is an ordering of vertices in a directed acyclic graph, such that if there is a path from vi to vj, then vi comes before vj in the ordering.
9.2
Topological Sort
We can use a graph to represent the prerequisites in a course of study: a directed edge from Course A to Course B means that Course A is a prerequisite for Course B.
9.2
Topological Sort
Topological sort example using a queue: start with vertex v1. On each pass, remove the vertices with indegree = 0 and subtract 1 from the indegree of their adjacent vertices.
9.2
Topological Sort
Result of applying topological sort to the graph: a vertex is put on the queue as soon as its indegree falls to 0.
9.2
9.2
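The queue-based algorithm above in sketch form, with vertices numbered 0..n-1 for simplicity:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class TopoSort {
    // Queue-based topological sort: repeatedly dequeue a vertex of
    // indegree 0 and lower the indegree of its neighbors.
    public static List<Integer> sort(List<List<Integer>> adj) {
        int n = adj.size();
        int[] indegree = new int[n];
        for (List<Integer> out : adj) {
            for (int w : out) {
                indegree[w]++;
            }
        }
        Queue<Integer> q = new ArrayDeque<>();
        for (int v = 0; v < n; v++) {
            if (indegree[v] == 0) q.add(v);
        }
        List<Integer> order = new ArrayList<>();
        while (!q.isEmpty()) {
            int v = q.remove();
            order.add(v);
            for (int w : adj.get(v)) {
                if (--indegree[w] == 0) q.add(w); // enqueue when indegree hits 0
            }
        }
        return order; // shorter than n if the graph has a cycle
    }

    public static void main(String[] args) {
        // 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 3
        List<List<Integer>> adj = List.of(
            List.of(1, 2), List.of(2), List.of(3), List.of());
        System.out.println(sort(adj)); // [0, 1, 2, 3]
    }
}
```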
Book-Keeping
Slide deck uploaded to Canvas; midterm grades.
WEEK #13
LECTURE #25
Shortest-Path Algorithms
9.3
Shortest-Path Algorithms
Find the least-cost path from a distinguished vertex s to every other vertex in the graph.
9.3
Shortest-Path Algorithms
A graph with a negative-cost cycle: the shortest path from v5 to v4 is undefined.
9.3
Shortest-Path Algorithms
We are going to examine algorithms to solve four versions of the shortest-path problem:
- Solve the unweighted shortest-path problem.
- Solve the weighted shortest-path problem if there are no negative edges.
- Solve the weighted shortest-path problem if the graph has negative edges.
- Solve the weighted problem for the special case of acyclic graphs in linear time.
9.3
Unweighted Shortest-Path
An unweighted directed graph G. The unweighted shortest path is clearly a special case of the weighted shortest-path problem, since we could assign all edges a weight of 1.
9.3.1
Breadth-First Strategy
Breadth-first search: processing vertices in layers. The vertices closest to the start are evaluated first, and the most distant vertices are evaluated last.
Graph after marking the start node as reachable in zero edges.
9.3.1
Breadth-First Strategy: Graph after finding all vertices whose path length from s is 1.
Breadth-First Strategy: Graph after finding all vertices whose shortest path is 2.
Breadth-First Strategy: Final shortest paths.
9.3.1
9.3.1
Unweighted Shortest-Path
Refined algorithm using a queue (the Known field can be discarded).
9.3.1
Implementation: keep the same information for each vertex: whether it is known or unknown, the tentative distance dv, and the path information pv.
9.3.2
Dijkstra's Algorithm
Initial configuration of the table used in Dijkstra's algorithm.
After v1 is declared known.
After v4 is declared known.
After v2 is declared known. (v2: not updating v5, since (2 + 10) > 3.)
After v5 and then v3 are declared known. (v3: updating v6, since (3 + 5) < 9; v5: not updating v7, since (3 + 6) > 5.)
After v7 is declared known. (v7: updating v6, since (5 + 1) < 8.)
After v6 is declared known and the algorithm terminates.
Pseudocode for Dijkstra's algorithm.
9.3.2
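Dijkstra's algorithm as a sketch on an adjacency matrix; a priority queue would speed up selecting the minimum unknown vertex, but this simple version just scans the table:

```java
import java.util.Arrays;

public class Dijkstra {
    static final int INF = Integer.MAX_VALUE / 2; // avoids overflow when adding

    // g[v][w] is the edge cost, or INF if there is no edge; returns the
    // shortest distances from s to every vertex (no negative edges allowed).
    public static int[] shortestPaths(int[][] g, int s) {
        int n = g.length;
        int[] dist = new int[n];
        boolean[] known = new boolean[n];
        Arrays.fill(dist, INF);
        dist[s] = 0;
        for (int iter = 0; iter < n; iter++) {
            int v = -1;
            for (int u = 0; u < n; u++) {        // pick the smallest unknown dv
                if (!known[u] && (v == -1 || dist[u] < dist[v])) v = u;
            }
            known[v] = true;                     // dv can no longer improve
            for (int w = 0; w < n; w++) {
                if (!known[w] && dist[v] + g[v][w] < dist[w]) {
                    dist[w] = dist[v] + g[v][w]; // relax edge (v, w)
                }
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        int[][] g = {
            {0,   2,   INF},
            {INF, 0,   3},
            {INF, INF, 0}};
        System.out.println(Arrays.toString(shortestPaths(g, 0))); // [0, 2, 5]
    }
}
```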
WEEK #13
LECTURE #26
Acyclic Graphs
Vertex selection rule: if the graph is known to be acyclic, we can improve Dijkstra's algorithm by changing the order in which vertices are declared known. The new rule is to select vertices in topological order. It works because when a vertex is selected, its distance can no longer be lowered, since by the topological ordering rule it has no incoming edges emanating from unknown nodes.
9.3.4
Shortest-Path Example
Word ladder problem. For instance: zero -> hero -> here -> hire -> fire -> five.
9.3.6
Graph = maze; edge = path; vertex = each intersection in the maze.
The max flow is 2 + 3 = 5 (into the sink).
9.4
9.4
9.4.1
Undoing the flow
9.4.1
It is a greedy algorithm.
9.5
Prim's Algorithm
Prim's algorithm after each stage.
Initial configuration of the table used in Prim's algorithm for Minimum Spanning Tree.
9.5.1
Prim's Algorithm: The table after v1 is declared known.
Prim's Algorithm: The table after v4 is declared known.
Prim's Algorithm: The table after v2 and then v3 are declared known.
Prim's Algorithm: The table after v7 is declared known.
Prim's Algorithm: The table after v6 and then v5 are selected (Prim's algorithm terminates).
9.5.1
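Prim's algorithm uses the same table mechanics as Dijkstra's, except that dv now holds the cheapest single edge connecting v to the tree rather than a path length. A minimal sketch; the adjacency-matrix representation (0 meaning "no edge") and starting vertex are assumptions:

```java
import java.util.*;

public class Prim {
    // Returns the total weight of a minimum spanning tree.
    // g[v][w] is the edge weight, or 0 if no edge; the graph must be connected.
    public static int mstWeight(int[][] g) {
        int n = g.length;
        int[] dv = new int[n];               // cheapest edge connecting v to the tree
        boolean[] known = new boolean[n];
        Arrays.fill(dv, Integer.MAX_VALUE);
        dv[0] = 0;                           // start the tree at v0
        int total = 0;
        for (int step = 0; step < n; step++) {
            int v = -1;
            for (int u = 0; u < n; u++)      // pick the unknown vertex with smallest dv
                if (!known[u] && (v == -1 || dv[u] < dv[v])) v = u;
            known[v] = true;                 // declare v known
            total += dv[v];
            for (int w = 0; w < n; w++)      // update dv for unknown neighbors of v
                if (g[v][w] != 0 && !known[w] && g[v][w] < dv[w])
                    dv[w] = g[v][w];
        }
        return total;
    }
}
```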
Kruskal's Algorithm
Kruskal's is a greedy algorithm using equivalence classes:
- First partition the vertices into |V| equivalence classes
- Process the edges in order of weight
- Add an edge to the Minimum Spanning Tree, and combine two equivalence classes, if the edge connects two vertices in different equivalence classes
9.5.2

Kruskal's Algorithm
Action of Kruskal's algorithm on G:
9.5.2

Kruskal's Algorithm
Kruskal's algorithm after each stage:
9.5.2

Kruskal's Algorithm
Pseudocode for Kruskal's algorithm:
9.5.2
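The three bullets above map directly onto a disjoint-set (union/find) structure, as in the Disjoint Set Class chapter. A minimal sketch, not the book's code; the edge encoding `{weight, v, w}` is an assumption:

```java
import java.util.*;

public class Kruskal {
    // Disjoint-set find with path compression: one equivalence class per root.
    static int find(int[] parent, int x) {
        return parent[x] < 0 ? x : (parent[x] = find(parent, parent[x]));
    }

    // edges[i] = {weight, v, w}; returns total MST weight for a connected graph.
    public static int mstWeight(int n, int[][] edges) {
        int[] parent = new int[n];
        Arrays.fill(parent, -1);                       // |V| singleton equivalence classes
        Arrays.sort(edges, Comparator.comparingInt((int[] e) -> e[0])); // by weight
        int total = 0, accepted = 0;
        for (int[] e : edges) {
            int a = find(parent, e[1]), b = find(parent, e[2]);
            if (a != b) {                              // endpoints in different classes
                parent[a] = b;                         // combine the two classes
                total += e[0];
                if (++accepted == n - 1) break;        // an MST has |V| - 1 edges
            }
        }
        return total;
    }
}
```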
9.6
Undirected Graphs
An undirected graph and a depth-first search of the graph:
9.6.1
Undirected Graphs
An undirected graph and a depth-first search of the graph (edge labels: forward, already marked, return):
9.6.1
Biconnectivity
A connected undirected graph is biconnected if there are no vertices whose removal disconnects the rest of the graph. Vertices whose removal disconnects the graph are called articulation points.
A graph with articulation points C and D, and the depth-first tree with Num and Low:
Low(v) is the minimum of:
- Num(v)
- the lowest Num(w) among all back edges (v, w)
- the lowest Low(w) among all tree edges (v, w)
Articulation point test: some child satisfies Low(child) >= Num(v); the root is a special case.
9.6.2
Biconnectivity
The depth-first tree that results if the depth-first search starts at C:
Articulation point test: Low(child) >= Num(v); the root is a special case.
9.6.2
Biconnectivity
Routine to assign Num to the vertices:
9.6.2
Biconnectivity
Pseudocode to compute Low and to test for articulation points (the test for the root is omitted):
9.6.2
Biconnectivity
Testing for articulation points in one depth-first search.
9.6.2
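The Num/Low computation and the articulation-point test fit into one recursive depth-first search. A sketch under assumed conventions (adjacency lists of `Integer` neighbors, a connected graph, DFS started at vertex 0):

```java
import java.util.*;

public class ArticulationPoints {
    static int counter = 1;

    // num[v]: preorder number; low[v]: lowest num reachable from v's subtree
    // using tree edges plus at most one back edge.
    static void dfs(List<Integer>[] adj, int v, int parent,
                    int[] num, int[] low, boolean[] artic) {
        num[v] = low[v] = counter++;
        int children = 0;
        for (int w : adj[v]) {
            if (num[w] == 0) {                   // (v, w) is a tree edge
                children++;
                dfs(adj, w, v, num, low, artic);
                low[v] = Math.min(low[v], low[w]);
                if (parent != -1 && low[w] >= num[v])
                    artic[v] = true;             // non-root articulation test
            } else if (w != parent) {            // (v, w) is a back edge
                low[v] = Math.min(low[v], num[w]);
            }
        }
        if (parent == -1 && children > 1)        // root special case: >= 2 tree children
            artic[v] = true;
    }

    public static boolean[] find(List<Integer>[] adj) {
        counter = 1;
        int n = adj.length;
        int[] num = new int[n], low = new int[n];
        boolean[] artic = new boolean[n];
        dfs(adj, 0, -1, num, low, artic);        // assumes the graph is connected
        return artic;
    }
}
```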
Euler Circuits
A Puzzle: Reconstruct these figures using a pen, drawing each line exactly once. The pen may not be lifted from the paper while the drawing is being performed. As an extra challenge, make the pen finish at the same point at which it started.
9.6.3
Euler Circuits
Conversion of the puzzle to a graph:
9.6.3

Euler Circuits
Graph for the Euler circuit problem:
9.6.3

Euler Circuits
Graph remaining after 5, 4, 10, 5:
9.6.3

Euler Circuits
Graph remaining after 5, 4, 1, 3, 7, 4, 11, 10, 7, 9, 3, 4, 10, 5:
9.6.3

Euler Circuits
Graph remaining after 5, 4, 1, 3, 2, 8, 9, 6, 3, 7, 4, 11, 10, 7, 9, 3, 4, 10, 5:
9.6.3
Course Structure
Foundation, Reinforcement, Integration: attend lectures, read book chapters.
Introduction to CS146, Algorithm Analysis, Assignments 1-4, Quizzes 1-4, Mid-Term, Final Exam (this lecture).

CS146 Data Structures and Algorithms, Spring 2015, Angus Yeung, Ph.D.
WEEK #14
LECTURE #27
Introduction
Lists, Stacks, and Queues
Trees
Hashing
Priority Queues
Sorting
The Disjoint Set Class
Graph Algorithms
Algorithm Design Techniques
In Chapter 10, we discuss the design of algorithms and look at the general approaches.
10.0
Divide and Conquer
- Running Time
- Closest-Points Problem
- Selection Problem
Dynamic Programming
Randomized Algorithms
Backtracking Algorithms
Greedy Algorithms
We have already seen three greedy algorithms in the previous chapter: Dijkstra's, Prim's, and Kruskal's.
Greedy algorithms always choose the local optimum (a "take what you can get now" strategy) instead of the global optimum.
For example, the Coin Changing Problem: we repeatedly dispense the largest denomination. To give out $17.61, we give out a $10 bill, a $5 bill, two $1 bills, two quarters ($0.25), one dime ($0.10), and one penny ($0.01).
10.1
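The largest-denomination rule can be sketched directly. Working in cents avoids floating-point error; the denomination list below is an assumption matching the example above:

```java
import java.util.*;

public class CoinChange {
    // Greedy change-making: repeatedly dispense the largest denomination that fits.
    // denominations must be sorted largest first.
    public static List<Integer> makeChange(int cents, int[] denominations) {
        List<Integer> dispensed = new ArrayList<>();
        for (int d : denominations) {
            while (cents >= d) {       // take what you can get now
                dispensed.add(d);
                cents -= d;
            }
        }
        return dispensed;
    }
}
```

For U.S. denominations the greedy choice happens to be optimal; for arbitrary denominations greedy change-making can fail (e.g., with 12-, 5-, and 1-cent coins, greedy makes 15 cents as 12 + 1 + 1 + 1 instead of 5 + 5 + 5).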
10.1.1
WEEK #14
LECTURE #28
Dynamic Programming
Using a Table Instead of Recursion
Optimal Binary Search Tree
WEEK #15
LECTURE #29
Closest-Points Problem
Dynamic Programming: Using a Table Instead of Recursion
Randomized Algorithms
Backtracking Algorithms
The Turnpike Reconstruction Problem
Huffman Codes
Huffman codes are used in a greedy algorithm for file compression.
For example, suppose a file contains only a, e, i, s, t, spaces, and newlines. This file requires 174 bits to represent, since each character requires three bits.
10.1.2
Huffman Codes
In large files, there is usually a big disparity between the most frequent and least frequent characters. Huffman codes allow the code length to vary from character to character and ensure that frequently occurring characters have short codes.
Huffman codes are efficient in representing data (they remove data redundancy). If all the characters occur with the same frequency, there are not likely to be any savings.
10.1.2
Huffman Codes
The binary code that represents the alphabet can be represented by a binary tree. The representation of each character can be found by starting at the root and recording the path, using a 0 to indicate the left branch and a 1 to indicate the right branch. For instance, s is reached by going left, then right, and finally right. This is encoded as 011.
10.1.2
Huffman Codes
Since the newline is an only child, we can place the newline one level higher, at its parent. This saves 1 bit in representing the newline, and the new tree has a cost of 173.
10.1.2
Prefix Code
If the characters are placed only at the leaves, any sequence of bits can always be decoded unambiguously. For instance, suppose 010011110001011000100011 is the encoded string. 0 is not a character code, 01 is not a character code, but 010 represents i. It does not matter if the character codes are of different lengths, as long as no character code is a prefix of another character code.
10.1.2
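Decoding exploits exactly this prefix property: because no code is a prefix of another, the first complete code word we match is the only possible match. A minimal sketch; the code table used in the test is hypothetical, not the book's optimal tree:

```java
import java.util.*;

public class PrefixDecode {
    // Decode a bit string using a prefix-free code table.
    public static String decode(String bits, Map<String, Character> codes) {
        StringBuilder out = new StringBuilder();
        StringBuilder buf = new StringBuilder();   // bits read since the last match
        for (char b : bits.toCharArray()) {
            buf.append(b);
            Character c = codes.get(buf.toString());
            if (c != null) {                       // a full code word has been read
                out.append(c);
                buf.setLength(0);
            }
        }
        return out.toString();
    }
}
```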
Prefix Code
Optimal prefix code:
10.1.2
Huffman's Algorithm
Assume the number of characters is C. Huffman's algorithm: maintain a forest of trees. The weight of a tree is equal to the sum of the frequencies of its leaves.
10.1.2
Humans
Algorithm
C-1
Vmes,
select
the
two
trees,
T1
and
T2,
of
smallest
weight,
breaking
Ves
arbitrarily,
and
form
a
new
tree
with
subtrees
T1
and
T2.
Humans
algorithm
arer
the
rst
merge:
10.1.2
Humans
Algorithm
Humans
algorithm
arer
the
second
merge:
10.1.2
Humans
Algorithm
Humans
algorithm
arer
the
third
merge:
10.1.2
Humans
Algorithm
Humans
algorithm
arer
the
fourth
merge:
10.1.2
Humans
Algorithm
Humans
algorithm
arer
the
rh
merge:
10.1.2
Humans
Algorithm
Humans
algorithm
arer
the
nal
merge:
10.1.2
Huffman's Algorithm
There are two details that must be considered.
Transmission of the code book: the encoding information must be transmitted at the start of the compressed file, since otherwise it will be impossible to decode.
Two-pass algorithm: the first pass collects the frequency data and the second pass does the encoding.
10.1.2
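The C − 1 merges map naturally onto a priority queue of trees keyed by weight. A sketch that builds the tree and reports each character's code length; the `Node` layout is an assumption, not the book's code:

```java
import java.util.*;

public class Huffman {
    static class Node {
        int weight;
        Node left, right;     // null for a leaf
        Character ch;         // set for leaves only
        Node(int w, Character c) { weight = w; ch = c; }
        Node(Node l, Node r) { weight = l.weight + r.weight; left = l; right = r; }
    }

    // Build the Huffman tree and return each character's code length (tree depth).
    public static Map<Character, Integer> codeLengths(Map<Character, Integer> freq) {
        PriorityQueue<Node> forest =
            new PriorityQueue<>(Comparator.comparingInt((Node n) -> n.weight));
        for (Map.Entry<Character, Integer> e : freq.entrySet())
            forest.add(new Node(e.getValue(), e.getKey()));
        while (forest.size() > 1) {                       // C - 1 merges
            Node t1 = forest.poll(), t2 = forest.poll();  // two smallest-weight trees
            forest.add(new Node(t1, t2));
        }
        Map<Character, Integer> depths = new HashMap<>();
        collect(forest.poll(), 0, depths);
        return depths;
    }

    static void collect(Node n, int depth, Map<Character, Integer> depths) {
        if (n.ch != null) { depths.put(n.ch, depth); return; }
        collect(n.left, depth + 1, depths);
        collect(n.right, depth + 1, depths);
    }
}
```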
Past Examples:
Chapter 2: Maximum Subsequence Sum Problem with an O(N log N) solution
Chapter 4: Linear-time tree traversal strategies (preorder and postorder traversal)
Chapter 7: Mergesort and quicksort
10.2
Closest-Points Problem
In the Closest-Points Problem, we are required to find the closest pair of points. Below is a small point set.
10.2.2
Closest-Points Problem
We can compute dL and dR recursively. Then how about dC?
P partitioned into PL and PR; the shortest distances are shown.
10.2.2
Closest-Points Problem
Let δ = min(dL, dR). We only need to compute dC if dC improves on δ.
Below is a two-lane strip, containing all points considered for the dC computation.
10.2.2
Closest-Points Problem
For large point sets that are uniformly distributed, the number of points expected to be in the strip is very small. In this case, we can use a brute-force calculation of min(δ, dC).
10.2.2
Closest-Points Problem
In the worst case, all the points could be in the strip. We need a better algorithm, using a refined calculation of min(δ, dC).
10.2.2
Closest-Points Problem
For p3, only p4 and p5 are considered in the second for loop, since they lie in the strip within vertical distance δ.
10.2.2
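The refined calculation of min(δ, dC) can be sketched as the strip scan alone (the surrounding divide-and-conquer recursion is omitted). Points are assumed to be `{x, y}` pairs, with the strip already sorted by y-coordinate; the inner loop breaks as soon as the vertical gap rules out any closer pair:

```java
public class StripScan {
    // Given the points inside the strip, sorted by y-coordinate, return min(delta, dC).
    // The break makes each point compare against only a constant number of others.
    public static double minInStrip(double[][] strip, double delta) {
        double best = delta;
        for (int i = 0; i < strip.length; i++) {
            for (int j = i + 1; j < strip.length; j++) {
                double dy = strip[j][1] - strip[i][1];
                if (dy >= best) break;            // all later points are even farther in y
                double dx = strip[j][0] - strip[i][0];
                best = Math.min(best, Math.sqrt(dx * dx + dy * dy));
            }
        }
        return best;
    }
}
```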
Dynamic Programming
A problem that can be mathematically expressed recursively can also be expressed as a recursive algorithm. But a direct translation of the recursive formula may not give an efficient program.
Dynamic programming rewrites the recursive algorithm as a nonrecursive algorithm that systematically records the answers to the subproblems in a table, yielding a much more efficient program.
10.3
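A small illustration of replacing recursion with a table is the Fibonacci numbers: the direct translation of the recursive formula takes exponential time, while filling in a table takes linear time. A minimal sketch:

```java
public class FibTable {
    // Direct translation of the recursive formula: exponential time,
    // because the same subproblems are recomputed over and over.
    public static long fibRecursive(int n) {
        if (n <= 1) return n;
        return fibRecursive(n - 1) + fibRecursive(n - 2);
    }

    // Dynamic programming: record answers to subproblems in a table. O(N) time.
    public static long fibTable(int n) {
        if (n <= 1) return n;
        long[] table = new long[n + 1];
        table[0] = 0;
        table[1] = 1;
        for (int i = 2; i <= n; i++)
            table[i] = table[i - 1] + table[i - 2];   // each entry computed exactly once
        return table[n];
    }
}
```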
10.3.1
Backtracking Algorithms
A backtracking algorithm usually does not have good performance, but in many cases it gives significant savings over a brute-force exhaustive search. For example, O(N²) is not good, but it is significantly better than an O(N⁵) algorithm.
Backtracking example: the Turnpike Reconstruction Problem.
10.5
10.5.1