B-Tree
By Mahmoud Ismail
CS 600.226: Data Structures, Professor: Greg Hager
JHU Spring 2010
© 2004 Goodrich, Tamassia

Agenda
• Review of 2-4 Tree
• B Tree
• Dictionary and Map

Multi-way Search Trees
• Each node may store multiple key-element pairs
• A node with d children (a d-node) stores d-1 key-element pairs
• Children have keys that fall either before the smallest parent key, after the largest parent key, or between two parent keys
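
The ordering rule on this slide is what makes search work: the position of the search key among a node's keys selects the child to descend into. A minimal sketch in Python, assuming a hypothetical `Node` class with sorted `keys` and a `children` list that is left empty when the children are external nodes (neither name comes from the slides):

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    keys: List[int]                                        # d-1 sorted keys
    children: List["Node"] = field(default_factory=list)   # d children, or empty if external


def search(node: Optional[Node], key: int) -> bool:
    """Return True if key is stored somewhere below node."""
    while node is not None:
        # Index of the first key >= the search key.
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if not node.children:
            return False          # fell off into an external node
        # Child i holds the keys between keys[i-1] and keys[i].
        node = node.children[i]
    return False


# Hypothetical sample tree, not the one drawn in the slides.
root = Node([20, 50], [Node([10, 15]), Node([30, 40]), Node([60, 70, 80])])
print(search(root, 40), search(root, 45))  # True False
```

Binary search inside the node is a design choice of this sketch; a linear scan gives the same result for small d.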

Example Multi-way Search Tree
[Figure: a multi-way search tree storing the keys 10, 15, 20, 22, 25, 27, 30, 40, 42, 45, 50, 55, 60, 64, 66, 70, 75, 80, 85, 90]
• There is an external node between each pair of keys, plus one before the first key and one after the last: (n-1) + 1 + 1 = n+1 external nodes

(2,4) Trees
• A (2,4) tree (also called a 2-4 tree or 2-3-4 tree) is a multi-way search tree with the following properties
  - Node-Size Property: every internal node has at most four children
  - Depth Property: all the external nodes have the same depth
• Depending on the number of children, an internal node of a (2,4) tree is called a 2-node, 3-node or 4-node
[Figure: a (2,4) tree with root (10 15 24) and children (2 8), (12), (18), (27 32)]
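
The two properties can be checked mechanically. The sketch below is one possible checker, assuming a hypothetical `Node` class whose `children` list is empty when all children are external; it does not verify key ordering, only node size and depth.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    keys: List[int]
    children: List["Node"] = field(default_factory=list)  # empty = children are external


def external_depths(node: Node, depth: int = 0) -> Set[int]:
    """Collect the depths at which external nodes occur below node."""
    if not node.children:
        return {depth + 1}        # the external children sit one level down
    depths: Set[int] = set()
    for child in node.children:
        depths |= external_depths(child, depth + 1)
    return depths


def is_24_tree(root: Node) -> bool:
    stack = [root]
    while stack:
        node = stack.pop()
        d = len(node.keys) + 1    # a d-node stores d-1 keys
        if not 2 <= d <= 4:       # Node-Size Property
            return False
        if node.children and len(node.children) != d:
            return False
        stack.extend(node.children)
    return len(external_depths(root)) == 1   # Depth Property


# A tree with root (10 15 24) and children (2 8), (12), (18), (27 32).
tree = Node([10, 15, 24], [Node([2, 8]), Node([12]), Node([18]), Node([27, 32])])
print(is_24_tree(tree))  # True
```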

Simple Insertion (no overflow)
[Figure: Insert 15. Before: root (10) with children (5) and (12 14). After: root (10) with children (5) and (12 14 15).]

Overflow and Split
• We handle an overflow at a 5-node v with a split operation:
  - let v1 … v5 be the children of v and k1 … k4 be the keys of v
  - node v is replaced by nodes v' and v"
    - v' is a 3-node with keys k1 k2 and children v1 v2 v3
    - v" is a 2-node with key k4 and children v4 v5
  - key k3 is inserted into the parent u of v (a new root may be created)
[Figure: before the split, u = (15 24) with children (12), (18) and v = (27 30 32 35); after the split, u = (15 24 32) with children (12), (18), v' = (27 30) and v" = (35)]
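
The split operation can be sketched directly from the definition above. This is an illustrative Python version, assuming a hypothetical `Node` class with `keys` and `children` lists; `split` and `split_child` are names chosen for this example only.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    keys: List[int]
    children: List["Node"] = field(default_factory=list)  # empty = children are external


def split(v: Node) -> Tuple[Node, int, Node]:
    """Split an overflowing 5-node v (four keys) into v', k3 and v\"."""
    assert len(v.keys) == 4, "split expects a 5-node"
    k1, k2, k3, k4 = v.keys
    v1 = Node([k1, k2], v.children[:3])   # v': 3-node with children v1 v2 v3
    v2 = Node([k4], v.children[3:])       # v": 2-node with children v4 v5
    return v1, k3, v2


def split_child(u: Node, i: int) -> None:
    """Replace the i-th child of u by its two halves and insert k3 into u."""
    v1, k3, v2 = split(u.children[i])
    u.keys.insert(i, k3)
    u.children[i:i + 1] = [v1, v2]


# u = (15 24) with children (12), (18) and the overflowing v = (27 30 32 35).
u = Node([15, 24], [Node([12]), Node([18]), Node([27, 30, 32, 35])])
split_child(u, 2)
print(u.keys, [c.keys for c in u.children])
# [15, 24, 32] [[12], [18], [27, 30], [35]]
```

After `split_child` the parent `u` may itself hold four keys, in which case the caller has to split it too.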

Insertion with Overflow
[Figure: Insert 11. Before: root (10) with children (5) and (12 14 15). After the insertion: root (10) with children (5) and (11 12 14 15), which overflows. After the split: root (10 14) with children (5), (11 12) and (15).]

Insert with Cascading Split
• The overflow may propagate to the parent node u
[Figure: Insert 11. Before: root (6 8 10) with children (5), (7), (9) and (12 14 15). After the insertion the last child is (11 12 14 15). First split: root (6 8 10 14) with children (5), (7), (9), (11 12) and (15). Second split: new root (10) with children (6 8) and (14); (6 8) has children (5), (7), (9) and (14) has children (11 12), (15).]

Inserting into (2,4) Tree
1. Search for position in deepest internal node
2. Insert into position
3. If # elements > 3, do a split operation
   - Split node into 2 nodes
   - Push 1 element up to parent
     - Create new root if no parent
     - If parent overflows, split parent
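
The three steps above can be sketched as a single function. This is a minimal Python illustration, not the course's reference implementation: the `Node` class and the `insert` signature are assumptions, keys are plain integers, and duplicate keys are silently ignored.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    keys: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)  # empty = children are external


def insert(root: Node, key: int) -> Node:
    """Insert key and return the (possibly new) root."""
    # Step 1: search down to the deepest internal node, remembering the path.
    path: List[Tuple[Node, int]] = []
    node = root
    while True:
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return root                   # duplicate: nothing to do in this sketch
        path.append((node, i))
        if not node.children:
            break
        node = node.children[i]
    # Step 2: insert into position.
    node.keys.insert(i, key)
    # Step 3: split while the current node has more than 3 elements.
    while len(node.keys) > 3:
        path.pop()                        # drop the entry of the node being split
        k1, k2, k3, k4 = node.keys
        left = Node([k1, k2], node.children[:3])
        right = Node([k4], node.children[3:])
        if path:                          # push k3 up to the parent
            parent, j = path[-1]
            parent.keys.insert(j, k3)
            parent.children[j:j + 1] = [left, right]
            node = parent
        else:                             # no parent: create a new root
            root = node = Node([k3], [left, right])
    return root


# Root (6 8 10) with children (5), (7), (9), (12 14 15); inserting 11 cascades.
root = Node([6, 8, 10], [Node([5]), Node([7]), Node([9]), Node([12, 14, 15])])
root = insert(root, 11)
print(root.keys, [c.keys for c in root.children])  # [10] [[6, 8], [14]]
```

Keeping an explicit path avoids parent pointers in the nodes; a recursive version that returns the pushed-up key would work equally well.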

Analysis of Insertion
Algorithm insert(k, o)
1. We search for key k to locate the insertion node v
2. We add the new entry (k, o) at node v
3. while overflow(v)
       if isRoot(v)
           create a new empty root above v
       v ← split(v)
• Let T be a (2,4) tree with n items
  - Tree T has O(log n) height
  - Step 1 takes O(log n) time because we visit O(log n) nodes
  - Step 2 takes O(1) time
  - Step 3 takes O(log n) time because each split takes O(1) time and we perform O(log n) splits
• Thus, an insertion in a (2,4) tree takes O(log n) time
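
The O(log n) height that the analysis relies on follows from counting external nodes. A sketch of the usual argument, assuming a (2,4) tree with n items and height h in which every internal node has between 2 and 4 children and all n + 1 external nodes sit at depth h:

```latex
2^{h} \;\le\; n + 1 \;\le\; 4^{h}
\quad\Longrightarrow\quad
\tfrac{1}{2}\log_2 (n+1) \;\le\; h \;\le\; \log_2 (n+1)
```

The left inequality gives the upper bound on h and the right inequality gives the lower bound, so h is Θ(log n).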

Deletion
• Simple Case: Delete item from a leaf node
[Figure: Remove 14. Before: root (6 8 10) with children (5), (7), (9) and (12 14 15). After: root (6 8 10) with children (5), (7), (9) and (12 15).]

Removal with Swap
• Deletion of non leaf node: we replace the entry with its inorder successor (or, equivalently, with its inorder predecessor)
[Figure: Remove 10. Before: root (6 8 10) with children (5), (7), (9) and (12 14 15). After the swap: root (6 8 12) with children (5), (7), (9) and (14 15).]

Underflow and Transfer
• To handle an underflow at node v with parent u, we consider two cases
• Case 2: an adjacent sibling w of v is a 3-node or a 4-node
  - Transfer operation:
    1. we move a child of w to v
    2. we move an item from u to v
    3. we move an item from w to u
  - After a transfer, no underflow occurs
[Figure: before the transfer, u = (4 9) with children (2), w = (6 8) and the empty node v; after the transfer, u = (4 8) with children (2), w = (6) and v = (9)]
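
The three moves of a transfer can be sketched as follows. This is an illustrative Python version, assuming a hypothetical `Node` class and a `transfer(u, i)` helper that repairs the i-th child of u; preferring the left sibling over the right one is a choice of this sketch, not something the slides prescribe.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    keys: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)  # empty = children are external


def transfer(u: Node, i: int) -> bool:
    """Repair the underflow at v = u.children[i]; return False if no sibling can lend."""
    v = u.children[i]
    if i > 0 and len(u.children[i - 1].keys) > 1:
        w = u.children[i - 1]                         # left sibling is a 3- or 4-node
        if w.children:
            v.children.insert(0, w.children.pop())    # 1. move a child of w to v
        v.keys.insert(0, u.keys[i - 1])               # 2. move an item from u to v
        u.keys[i - 1] = w.keys.pop()                  # 3. move an item from w to u
        return True
    if i + 1 < len(u.children) and len(u.children[i + 1].keys) > 1:
        w = u.children[i + 1]                         # right sibling is a 3- or 4-node
        if w.children:
            v.children.append(w.children.pop(0))
        v.keys.append(u.keys[i])
        u.keys[i] = w.keys.pop(0)
        return True
    return False


# u = (4 9) with children (2), w = (6 8) and the empty node v.
u = Node([4, 9], [Node([2]), Node([6, 8]), Node([])])
transfer(u, 2)
print(u.keys, [c.keys for c in u.children])  # [4, 8] [[2], [6], [9]]
```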

Removal with Transfer
[Figure: Remove 9. Before: root (6 8 10) with children (5), (7), (9) and (12 14 15). After the removal the third child is empty. After the transfer (~rotate): root (6 8 12) with children (5), (7), (10) and (14 15).]

Underflow and Fusion
• Deleting an entry from a node v may cause an underflow, where node v becomes a 1-node with one child and no keys
• To handle an underflow at node v with parent u, we consider two cases
• Case 1: the adjacent siblings of v are 2-nodes
  - Fusion operation: we merge v with an adjacent sibling w and move an entry from u to the merged node v'
  - After a fusion, the underflow may propagate to the parent u
[Figure: before the fusion, u = (9 14) with children (2 5 7), w = (10) and the empty node v; after the fusion, u = (9) with children (2 5 7) and v' = (10 14)]
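
A fusion can be sketched in a few lines. The Python below is illustrative only: the `Node` class and the `fuse(u, i)` helper are assumptions, the sibling is assumed to be a 2-node as Case 1 requires, and merging with the right sibling when one exists is a choice made for this sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    keys: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)  # empty = children are external


def fuse(u: Node, i: int) -> bool:
    """Merge the underflowing v = u.children[i] with an adjacent sibling.

    Returns True if u has no keys left, i.e. the underflow propagates to u.
    """
    # j is the index of the left node of the pair being merged.
    j = i if i + 1 < len(u.children) else i - 1
    left, right = u.children[j], u.children[j + 1]
    # The entry of u that separated the pair moves down into the merged node v'.
    merged = Node(left.keys + [u.keys.pop(j)] + right.keys,
                  left.children + right.children)
    u.children[j:j + 2] = [merged]
    return not u.keys


# u = (9 14) with children (2 5 7), w = (10) and the empty node v.
u = Node([9, 14], [Node([2, 5, 7]), Node([10]), Node([])])
propagates = fuse(u, 2)
print(u.keys, [c.keys for c in u.children], propagates)
# [9] [[2, 5, 7], [10, 14]] False
```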

Removal with Fusion
[Figure: Remove 7. Before: root (6 8 10) with children (5), (7), (9) and (12 14 15). After the removal the second child is empty. After the fusion: root (6 10) with children (5), (8 9) and (12 14 15).]

Removing from (2,4) Tree
1. Search for element
2. Remove element
3. If element’s child is internal
   - Swap next larger element into hole (so we’ve removed element above an external)
4. If node has no elements
   - If an adjacent sibling has > 1 element
     - Perform transfer (kind of rotation)
   - Else
     - Perform fusion (can cascade upward)
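
Put together, the four steps give a complete removal routine. The following Python is a compact sketch under stated assumptions: the `Node` class and all function names are hypothetical, keys are integers, a left sibling is tried before a right one for transfers, and fusion merges with the right sibling when one exists.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    keys: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)  # empty = children are external


def fix_underflow(u: Node, i: int) -> None:
    """Step 4: repair the empty node v = u.children[i]."""
    v = u.children[i]
    if i > 0 and len(u.children[i - 1].keys) > 1:          # transfer from the left
        w = u.children[i - 1]
        if w.children:
            v.children.insert(0, w.children.pop())
        v.keys.insert(0, u.keys[i - 1])
        u.keys[i - 1] = w.keys.pop()
    elif i + 1 < len(u.children) and len(u.children[i + 1].keys) > 1:
        w = u.children[i + 1]                              # transfer from the right
        if w.children:
            v.children.append(w.children.pop(0))
        v.keys.append(u.keys[i])
        u.keys[i] = w.keys.pop(0)
    else:                                                  # fusion
        j = i if i + 1 < len(u.children) else i - 1
        left, right = u.children[j], u.children[j + 1]
        merged = Node(left.keys + [u.keys.pop(j)] + right.keys,
                      left.children + right.children)
        u.children[j:j + 2] = [merged]


def _remove(node: Node, key: int) -> None:
    i = bisect_left(node.keys, key)                        # step 1: search
    found = i < len(node.keys) and node.keys[i] == key
    if not node.children:
        if found:
            node.keys.pop(i)                               # step 2: remove
        return
    if found:                                              # step 3: swap in next larger
        succ = node.children[i + 1]
        while succ.children:
            succ = succ.children[0]
        node.keys[i] = succ.keys[0]
        key = succ.keys[0]                                 # now delete it from its leaf
        i += 1
    _remove(node.children[i], key)
    if not node.children[i].keys:                          # fusion may cascade upward
        fix_underflow(node, i)


def remove(root: Node, key: int) -> Node:
    """Remove key if present and return the (possibly new) root."""
    _remove(root, key)
    if not root.keys and root.children:                    # root emptied by a fusion
        root = root.children[0]
    return root


def example() -> Node:
    # Root (6 8 10) with children (5), (7), (9), (12 14 15).
    return Node([6, 8, 10], [Node([5]), Node([7]), Node([9]), Node([12, 14, 15])])


t = remove(example(), 7)   # underflow repaired by a fusion
print(t.keys, [c.keys for c in t.children])  # [6, 10] [[5], [8, 9], [12, 14, 15]]
```

Removing 9 from the same starting tree triggers a transfer instead, and removing 10 exercises the swap with the inorder successor.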

Analysis of Deletion
• Let T be a (2,4) tree with n items
  - Tree T has O(log n) height
• In a deletion operation
  - We visit O(log n) nodes to locate the node from which to delete the entry
  - We handle an underflow with a series of O(log n) fusions, followed by at most one transfer
  - Each fusion and transfer takes O(1) time
• Thus, deleting an item from a (2,4) tree takes O(log n) time

(a,b) Trees
• Generalization of (2,4) trees
• Size property: internal node has at least a children and at most b children
  - 2 <= a <= (b+1)/2
• Depth property: all external nodes have same depth
• Height of (a,b) tree is Ω(log n / log b) and O(log n / log a)
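
The asymptotic height bounds can be made concrete by counting the n + 1 external nodes at depth h: there are at most b^h of them and at least 2·a^(h-1), since the root has at least two children and every other internal node at least a. A small Python sketch of the resulting bounds, with `ab_height_bounds` being a name invented for this example:

```python
import math
from typing import Tuple


def ab_height_bounds(n: int, a: int, b: int) -> Tuple[float, float]:
    """Return (lower, upper) bounds on the height of an (a,b) tree with n items."""
    if not 2 <= a <= (b + 1) / 2:
        raise ValueError("need 2 <= a <= (b+1)/2")
    # n + 1 <= b**h          gives  h >= log(n + 1) / log(b)
    lower = math.log(n + 1) / math.log(b)
    # n + 1 >= 2 * a**(h-1)  gives  h <= log((n + 1) / 2) / log(a) + 1
    upper = math.log((n + 1) / 2) / math.log(a) + 1
    return lower, upper


lo, hi = ab_height_bounds(15, 2, 4)   # a (2,4) tree with 15 items
print(round(lo, 6), round(hi, 6))     # 2.0 4.0
```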

External Memory Searching
• Memory Hierarchy
  - Registers
  - Cache
  - RAM
  - External Memory

Types of External Memory
• Hard disk
• Floppy disk
• Compact disc
• Tape
• Distributed/networked memory

Primary Motivation
• External memory access much slower than internal memory access
  - orders of magnitude slower
  - need to minimize I/O complexity
  - can afford slightly more work on data in memory in exchange for lower I/O complexity

I/O Efficient Dictionaries
• Balanced tree structures
  - Typically O(log_2 n) transfers for query or update
  - Want to reduce height by constant factor as much as possible
  - Can be reduced to O(log_B n) = O(log_2 n / log_2 B)
    - B is number of nodes per block
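
To get a feel for the saving, one can compare how many levels a tree needs at fanout 2 versus fanout B. The Python sketch below uses integer arithmetic only; the value B = 1024 and the helper name `levels` are assumptions chosen for illustration.

```python
def levels(n: int, fanout: int) -> int:
    """Smallest h such that fanout**h >= n, i.e. the ceiling of log_fanout(n)."""
    h, capacity = 0, 1
    while capacity < n:
        capacity *= fanout
        h += 1
    return h


n = 10**9
print(levels(n, 2))      # 30 levels for a binary tree
print(levels(n, 1024))   # 3 levels with fanout B = 1024
```

The ratio of the two results is log_2 B = 10, which is the constant factor the slide refers to.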

B+ Trees
• Choose a and b to be Θ(B)
• Data are stored at leaves
• All leaves are at the same depth

B+ Trees
• Non-leaf nodes (except root) have between B/2 and B children
• Root has between 2 and B children
• All leaves are at the same depth and have between L/2 and L elements (where L is an arbitrary number, usually = B)

B+ Tree
• Example
[Figure: example B+ tree]

B+ Tree
• Best case h is O(log_B n)
• Worst case h is O(log_{B/2} n)
• I/O complexity for search is O(log_B n)
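
Because data live only in the leaves, a B+ tree search always walks one root-to-leaf path, reading one node per level. A minimal Python sketch that counts those reads, assuming hypothetical `Internal` and `Leaf` classes and the convention that router key `keys[i]` is the smallest key stored under `children[i + 1]` (the slides do not fix a convention):

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple, Union


@dataclass
class Leaf:
    keys: List[int]
    values: List[Any]          # the data are stored at the leaves


@dataclass
class Internal:
    keys: List[int]            # router keys only, no data
    children: List[Union["Internal", Leaf]]


def lookup(root: Union[Internal, Leaf], key: int) -> Tuple[Optional[Any], int]:
    """Return (value or None, number of nodes read)."""
    node, reads = root, 1
    while isinstance(node, Internal):
        # Keys equal to a router key live in the subtree to its right.
        node = node.children[bisect_right(node.keys, key)]
        reads += 1
    i = bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return node.values[i], reads
    return None, reads


# Hypothetical two-level tree.
root = Internal([20, 40], [
    Leaf([5, 10], ["a", "b"]),
    Leaf([20, 30], ["c", "d"]),
    Leaf([40, 50], ["e", "f"]),
])
print(lookup(root, 30))  # ('d', 2)
print(lookup(root, 25))  # (None, 2)
```

If each node occupies one disk block, the read count returned here is the number of block transfers for the search, which is the height bound quoted on the slide.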