B-Tree
By Mahmoud Ismail
CS 600.226: Data Structures, Professor: Greg Hager
JHU Spring 2010
© 2004 Goodrich, Tamassia

Agenda
- Review of 2-4 Tree
- B Tree
- Dictionary and Map

Multi-way Search Trees
- Each node may store multiple key-element pairs
- A node with d children (a d-node) stores d-1 key-element pairs
- Children have keys that fall either before the smallest parent key, after the largest parent key, or between two parent keys
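
The ordering rule above is what makes searching possible. The following is a minimal sketch, not taken from the slides: the `Node` class, the `search` function, and the sample keys are illustrative assumptions, and only keys (not key-element pairs) are stored to keep it short.

```python
from bisect import bisect_left

class Node:
    """Hypothetical d-node: d-1 sorted keys and d children (or none at the lowest level)."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

def search(node, key):
    """Return True if key is stored in the subtree rooted at node."""
    while node is not None:
        i = bisect_left(node.keys, key)        # first index with keys[i] >= key
        if i < len(node.keys) and node.keys[i] == key:
            return True
        # Child i holds the keys between keys[i-1] and keys[i]. With no
        # children we have reached an external position: the key is absent.
        node = node.children[i] if node.children else None
    return False

# Illustrative tree: root (10 15 24) with children (2 8), (12), (18), (27 32).
root = Node([10, 15, 24],
            [Node([2, 8]), Node([12]), Node([18]), Node([27, 32])])
print(search(root, 18))   # True
print(search(root, 13))   # False
```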
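
Example Multi-way Search Tree
[Figure: a multi-way search tree storing the keys 10, 15, 20, 22, 25, 27, 30, 40, 42, 45, 50, 55, 60, 64, 66, 70, 75, 80, 85, 90]
- There is an external node between each pair of keys, plus one before the first key and one after the last key
- With n keys: (n-1) + 1 + 1 = n+1 external nodes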

(2,4) Trees
- A (2,4) tree (also called a 2-4 tree or 2-3-4 tree) is a multi-way search tree with the following properties
  - Node-Size Property: every internal node has at most four children
  - Depth Property: all the external nodes have the same depth
- Depending on the number of children, an internal node of a (2,4) tree is called a 2-node, 3-node or 4-node
[Figure: a (2,4) tree with root (10 15 24) and children (2 8), (12), (18), (27 32)]

Simple Insertion (no overflow)
[Figure: Insert 15. Before: root (10) with children (5) and (12 14). After: root (10) with children (5) and (12 14 15).]

Overflow and Split
- We handle an overflow at a 5-node v with a split operation:
  - let v1 … v5 be the children of v and k1 … k4 be the keys of v
  - node v is replaced by nodes v' and v"
    - v' is a 3-node with keys k1 k2 and children v1 v2 v3
    - v" is a 2-node with key k4 and children v4 v5
  - key k3 is inserted into the parent u of v (a new root may be created)
[Figure: before the split, u = (15 24) with children (12), (18) and v = (27 30 32 35); after the split, u = (15 24 32) with children (12), (18), v' = (27 30) and v" = (35)]
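
The split rule can be sketched in a few lines. This is an illustration under assumptions, not code from the slides: the `Node` class and the `split` function are hypothetical names, and the caller is assumed to insert the returned middle key into the parent u.

```python
class Node:
    """Hypothetical node: sorted keys plus a list of children (empty at the lowest level)."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

def split(v):
    """Split an overflowing 5-node v into (v', k3, v'')."""
    assert len(v.keys) == 4, "only a node with four keys overflows"
    k1, k2, k3, k4 = v.keys
    v_prime = Node([k1, k2], v.children[:3])   # 3-node: keys k1 k2, children v1 v2 v3
    v_second = Node([k4], v.children[3:])      # 2-node: key k4, children v4 v5
    return v_prime, k3, v_second               # k3 goes up into the parent u

# The node v from the figure above.
left, up, right = split(Node([27, 30, 32, 35]))
print(left.keys, up, right.keys)   # [27, 30] 32 [35]
```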
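
Insertion with Overflow
[Figure: Insert 11. Before: root (10) with children (5) and (12 14 15). After the insertion the right child overflows to (11 12 14 15). Split: root (10 14) with children (5), (11 12) and (15).]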
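
Insert with Cascading Split
- The overflow may propagate to the parent node u
[Figure: Insert 11. Before: root (6 8 10) with children (5), (7), (9) and (12 14 15). After the insertion the last child overflows to (11 12 14 15). First split: root (6 8 10 14) with children (5), (7), (9), (11 12) and (15). Second split: new root (10) with children (6 8) and (14), where (6 8) has children (5), (7), (9) and (14) has children (11 12), (15).]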

Inserting into (2,4) Tree
1. Search for position in deepest internal node
2. Insert into position
3. If # elements > 3, do a split operation
   - Split node into 2 nodes
   - Push 1 element up to parent
     - Create new root if no parent
     - If parent overflows, split parent
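
The three steps above can be sketched as follows. This is a minimal illustration rather than a reference implementation: `Node`, `Tree24`, and the `parent` pointer are assumptions made here, keys are assumed distinct, and only keys (no elements) are stored.

```python
from bisect import bisect_left

class Node:
    """Hypothetical node with sorted keys, children, and a parent pointer."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []
        self.parent = None
        for c in self.children:
            c.parent = self

class Tree24:
    def __init__(self):
        self.root = Node([])

    def insert(self, key):
        # Step 1: search for the position in the deepest internal node.
        v = self.root
        while v.children:
            v = v.children[bisect_left(v.keys, key)]
        # Step 2: insert into position, keeping the keys sorted.
        v.keys.insert(bisect_left(v.keys, key), key)
        # Step 3: split while a node holds more than 3 elements.
        while len(v.keys) > 3:
            k1, k2, k3, k4 = v.keys
            left = Node([k1, k2], v.children[:3])
            right = Node([k4], v.children[3:])
            u = v.parent
            if u is None:                       # no parent: create a new root
                self.root = Node([k3], [left, right])
                return
            i = u.children.index(v)
            u.keys.insert(i, k3)                # push one element up to the parent
            u.children[i:i + 1] = [left, right]
            left.parent = right.parent = u
            v = u                               # the parent may overflow in turn

# Cascading example: root (6 8 10) with children (5), (7), (9), (12 14 15).
t = Tree24()
t.root = Node([6, 8, 10], [Node([5]), Node([7]), Node([9]), Node([12, 14, 15])])
t.insert(11)
print(t.root.keys, [c.keys for c in t.root.children])   # [10] [[6, 8], [14]]
```

Splitting on the way back up (bottom-up) mirrors the list above; other variants split full nodes on the way down instead.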

Analysis of Insertion
Algorithm insert(k, o)
1. We search for key k to locate the insertion node v
2. We add the new entry (k, o) at node v
3. while overflow(v)
       if isRoot(v)
           create a new empty root above v
       v ← split(v)
- Let T be a (2,4) tree with n items
  - Tree T has O(log n) height
  - Step 1 takes O(log n) time because we visit O(log n) nodes
  - Step 2 takes O(1) time
  - Step 3 takes O(log n) time because each split takes O(1) time and we perform O(log n) splits
- Thus, an insertion in a (2,4) tree takes O(log n) time
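
Deletion
- Simple Case: Delete item from a leaf node
[Figure: Remove 14. Before: root (6 8 10) with children (5), (7), (9) and (12 14 15). After: root (6 8 10) with children (5), (7), (9) and (12 15).]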
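
Removal with Swap
- Deletion from a non-leaf node: we replace the entry with its inorder successor (or, equivalently, with its inorder predecessor)
[Figure: Remove 10. Before: root (6 8 10) with children (5), (7), (9) and (12 14 15). Swap: 10 is replaced by its inorder successor 12, giving root (6 8 12) with children (5), (7), (9) and (14 15).]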

Underflow and Transfer
- To handle an underflow at node v with parent u, we consider two cases
- Case 2: an adjacent sibling w of v is a 3-node or a 4-node
  - Transfer operation:
    1. we move a child of w to v
    2. we move an item from u to v
    3. we move an item from w to u
  - After a transfer, no underflow occurs
[Figure: before the transfer, u = (4 9) with children (2), w = (6 8) and v = (empty); after the transfer, u = (4 8) with children (2), w = (6) and v = (9)]
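
The transfer steps can be sketched as below. The `Node` class and the `transfer(u, i)` signature are assumptions made for illustration; v is taken to be the child of u at index i, and the function simply reports failure when neither adjacent sibling has a spare item.

```python
class Node:
    """Hypothetical node: sorted keys plus a list of children (empty at the lowest level)."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

def transfer(u, i):
    """Refill the underflowing child v = u.children[i] from an adjacent sibling w
    that is a 3-node or 4-node. Returns False if no such sibling exists."""
    v = u.children[i]
    if i > 0 and len(u.children[i - 1].keys) > 1:          # w is the left sibling
        w = u.children[i - 1]
        if w.children:
            v.children.insert(0, w.children.pop())         # 1. child of w -> v
        v.keys.insert(0, u.keys[i - 1])                    # 2. item from u -> v
        u.keys[i - 1] = w.keys.pop()                       # 3. item from w -> u
        return True
    if i + 1 < len(u.children) and len(u.children[i + 1].keys) > 1:   # right sibling
        w = u.children[i + 1]
        if w.children:
            v.children.append(w.children.pop(0))
        v.keys.append(u.keys[i])
        u.keys[i] = w.keys.pop(0)
        return True
    return False

# The figure above: u = (4 9), w = (6 8), v empty.
u = Node([4, 9], [Node([2]), Node([6, 8]), Node([])])
transfer(u, 2)
print(u.keys, [c.keys for c in u.children])   # [4, 8] [[2], [6], [9]]
```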
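
Removal with Transfer
[Figure: Remove 9. Before: root (6 8 10) with children (5), (7), (9) and (12 14 15). After the removal the third child is empty. Transfer (~rotate): root (6 8 12) with children (5), (7), (10) and (14 15).]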

Underflow and Fusion
- Deleting an entry from a node v may cause an underflow, where node v becomes a 1-node with one child and no keys
- To handle an underflow at node v with parent u, we consider two cases
- Case 1: the adjacent siblings of v are 2-nodes
  - Fusion operation: we merge v with an adjacent sibling w and move an entry from u to the merged node v'
  - After a fusion, the underflow may propagate to the parent u
[Figure: before the fusion, u = (9 14) with children (2 5 7), w = (10) and v = (empty); after the fusion, u = (9) with children (2 5 7) and v' = (10 14)]
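
A fusion can be sketched in the same style. The `Node` class and `fuse(u, i)` are hypothetical names, and choosing the right sibling whenever one exists is an arbitrary choice made here; the slides only require some adjacent sibling.

```python
class Node:
    """Hypothetical node: sorted keys plus a list of children (empty at the lowest level)."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

def fuse(u, i):
    """Merge the underflowing child v = u.children[i] with an adjacent sibling w
    and move the separating entry from u into the merged node v'."""
    j = i if i + 1 < len(u.children) else i - 1   # index of the left node of the pair
    left, right = u.children[j], u.children[j + 1]
    merged = Node(left.keys + [u.keys[j]] + right.keys,
                  left.children + right.children)
    del u.keys[j]                                 # the entry leaves u ...
    u.children[j:j + 2] = [merged]                # ... and v' replaces v and w
    return merged                                 # u itself may now underflow

# The figure above: u = (9 14), w = (10), v empty.
u = Node([9, 14], [Node([2, 5, 7]), Node([10]), Node([])])
fuse(u, 2)
print(u.keys, [c.keys for c in u.children])   # [9] [[2, 5, 7], [10, 14]]
```

Because u loses one entry, the caller would check u for underflow next, which is how the cascade described above arises.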
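
Removal with Fusion
[Figure: Remove 7. Before: root (6 8 10) with children (5), (7), (9) and (12 14 15). After the removal the second child is empty. Fusion: root (6 10) with children (5), (8 9) and (12 14 15).]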

Removing from (2,4) Tree
1. Search for element
2. Remove element
3. If element's child is internal
   - Swap next larger element into hole (so we've removed element above an external)
4. If node has no elements
   - If an adjacent sibling has > 1 element
     - Perform transfer (kind of rotation)
   - Else
     - Perform fusion (can cascade upward)

Analysis of Deletion
- Let T be a (2,4) tree with n items
  - Tree T has O(log n) height
- In a deletion operation
  - We visit O(log n) nodes to locate the node from which to delete the entry
  - We handle an underflow with a series of O(log n) fusions, followed by at most one transfer
  - Each fusion and transfer takes O(1) time
- Thus, deleting an item from a (2,4) tree takes O(log n) time

(a,b) Trees
- Generalization of (2,4) trees
- Size property: internal node has at least a children and at most b children
  - 2 <= a <= (b+1)/2
- Depth property: all external nodes have same depth
- Height of (a,b) tree is Ω(log n / log b) and O(log n / log a)
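
To make the two height bounds concrete, here is a small sketch that computes them for given n, a and b. It rests on assumptions not stated on the slide: the root has at least 2 children, height h counts the levels of internal nodes, and a tree with n keys has n+1 external nodes, so that 2·a^(h-1) <= n+1 <= b^h. The function name is hypothetical.

```python
def ab_height_bounds(n, a, b):
    """Return (min_height, max_height) for an (a,b) tree with n >= 1 keys,
    assuming 2 * a**(h-1) <= n + 1 <= b**h (root has at least 2 children)."""
    ext = n + 1                       # number of external nodes
    lo = 0
    while b ** lo < ext:              # smallest h with b**h >= n+1
        lo += 1
    hi = 1
    while 2 * a ** hi <= ext:         # largest h with 2*a**(h-1) <= n+1
        hi += 1
    return lo, hi

print(ab_height_bounds(15, 2, 4))     # (2, 4)
```

For n = 15 and (a,b) = (2,4), the extremes are a tree of 4-nodes only (height 2) and a tree of 2-nodes only (height 4). Integer arithmetic is used instead of floating-point logarithms to avoid rounding at exact powers.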

External Memory Searching
- Memory Hierarchy
  - Registers
  - Cache
  - RAM
  - External Memory

Types of External Memory
- Hard disk
- Floppy disk
- Compact disc
- Tape
- Distributed/networked memory

Primary Motivation
- External memory access is much slower than internal memory access
  - orders of magnitude slower
  - need to minimize I/O complexity
  - can afford slightly more work on data in memory in exchange for lower I/O complexity

I/O Efficient Dictionaries
- Balanced tree structures
  - Typically O(log_2 n) transfers for query or update
  - Want to reduce height by constant factor as much as possible
  - Can be reduced to O(log_B n) = O(log_2 n / log_2 B)
    - B is number of nodes per block
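
A quick calculation shows how much the fan-out matters. The function name and the sample values of n and B below are illustrative assumptions, not figures from the slides.

```python
def levels(n, fanout):
    """Smallest h with fanout**h >= n, i.e. ceil(log_fanout(n)), using integers only."""
    h, reach = 0, 1
    while reach < n:
        reach *= fanout
        h += 1
    return h

n = 1_000_000
print(levels(n, 2))     # 20 levels for a binary balanced tree
print(levels(n, 100))   # 3 levels when each node branches 100 ways
```

With one block transfer per level, the second tree needs roughly log_2 B times fewer transfers per query, which is the constant-factor reduction described above.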

B+ Trees
- Choose a and b to be Θ(B)
- Data are stored at leaves
- All leaves are at the same depth

B+ Trees
- Non-leaf nodes (except root) have between B/2 and B children
- Root has between 2 and B children
- All leaves are at the same depth and have between L/2 and L elements (where L is an arbitrary number, usually = B)
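
Since data live only at the leaves, a search always descends to a leaf. The sketch below illustrates this under assumptions of its own: the class names are hypothetical, and separator key i in an internal node is taken to be the smallest key reachable through child i+1 (one common convention, not stated on the slides).

```python
from bisect import bisect_right

class Leaf:
    """Hypothetical leaf: holds the actual key/value data."""
    def __init__(self, keys, values):
        self.keys, self.values = keys, values

class Internal:
    """Hypothetical non-leaf node: separator keys only, len(children) == len(keys) + 1."""
    def __init__(self, keys, children):
        self.keys, self.children = keys, children

def bplus_search(node, key):
    """Follow separators down to a leaf, then look the key up there."""
    while isinstance(node, Internal):
        # Number of separators <= key selects the child to follow.
        node = node.children[bisect_right(node.keys, key)]
    if key in node.keys:
        return node.values[node.keys.index(key)]
    return None

root = Internal([10, 20], [Leaf([3, 7], ["c", "g"]),
                           Leaf([10, 15], ["j", "o"]),
                           Leaf([20, 25], ["t", "y"])])
print(bplus_search(root, 15))   # o
print(bplus_search(root, 12))   # None
```

Each loop iteration corresponds to reading one node, so the number of block transfers equals the height of the tree.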

B+ Tree Example

B+ Tree
- Best case h is O(log_B n)
- Worst case h is O(log_{B/2} n)
- I/O complexity for search is O(log_B n)