
3.3 BALANCED SEARCH TREES

Algorithms, Fourth Edition
Robert Sedgewick and Kevin Wayne

Outline: 2-3 search trees, red-black BSTs, B-trees

Symbol table review

                                   | worst-case cost (after N inserts) | average case (after N random inserts)   |
implementation                     | search    | insert    | delete    | search hit  | insert      | delete      | ordered iteration? | key interface
sequential search (unordered list) | N         | N         | N         | N/2         | N           | N/2         | no                 | equals()
binary search (ordered array)      | lg N      | N         | N         | lg N        | N/2         | N/2         | yes                | compareTo()
BST                                | N         | N         | N         | 1.39 lg N   | 1.39 lg N   | ?           | yes                | compareTo()
goal                               | log N     | log N     | log N     | log N       | log N       | log N       | yes                | compareTo()

Challenge. Guarantee performance.
This lecture. 2-3 trees, left-leaning red-black BSTs (introduced to the world in COS 226, Fall 2007), B-trees.

Algorithms, 4th Edition · Robert Sedgewick and Kevin Wayne · Copyright 2002-2012 · March 4, 2012 1:33:02 PM

2-3 tree

Allow 1 or 2 keys per node.
• 2-node: one key, two children.
• 3-node: two keys, three children.

Symmetric order. Inorder traversal yields keys in ascending order.
Perfect balance. Every path from root to null link has same length.

[Figure: a 2-3 tree with root M (a 2-node), whose children are the 3-node E J and the 2-node R; the leaves include A C and S X. The three subtrees of E J hold the keys smaller than E, between E and J, and larger than J. A null link is marked at the bottom.]

2-3 tree demo

Local transformations in a 2-3 tree

Splitting a 4-node is a local transformation: constant number of operations.

[Figure: splitting a 4-node is a local transformation that preserves balance. Before: parent a e with the temporary 4-node b c d as its middle child. After: parent a c e with children b and d. The six subtrees (less than a, between a and b, between b and c, between c and d, between d and e, greater than e) are unchanged.]

Global properties in a 2-3 tree

Invariants. Maintains symmetric order and perfect balance.
Pf. Each transformation maintains symmetric order and perfect balance.

[Figure: splitting a temporary 4-node in a 2-3 tree (summary). Cases: the 4-node is the root; its parent is a 2-node (4-node is the left or right child); its parent is a 3-node (4-node is the left, middle, or right child).]

2-3 tree: performance

Perfect balance. Every path from root to null link has same length.

[Figure: typical 2-3 tree built from random keys.]

Tree height.
• Worst case: lg N. [all 2-nodes]
• Best case: log_3 N ≈ .631 lg N. [all 3-nodes]
• Between 12 and 20 for a million nodes.
• Between 18 and 30 for a billion nodes.

Guaranteed logarithmic performance for search and insert.

ST implementations: summary

                                   | worst-case cost (after N inserts) | average case (after N random inserts)   |
implementation                     | search    | insert    | delete    | search hit  | insert      | delete      | ordered iteration? | key interface
sequential search (unordered list) | N         | N         | N         | N/2         | N           | N/2         | no                 | equals()
binary search (ordered array)      | lg N      | N         | N         | lg N        | N/2         | N/2         | yes                | compareTo()
BST                                | N         | N         | N         | 1.39 lg N   | 1.39 lg N   | ?           | yes                | compareTo()
2-3 tree                           | c lg N    | c lg N    | c lg N    | c lg N      | c lg N      | c lg N      | yes                | compareTo()

The constants c depend upon the implementation.
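The height bounds quoted for 2-3 trees can be checked numerically. The following is a minimal sketch, not part of the original slides; the class and method names are made up for illustration. It evaluates lg N (all 2-nodes) and log_3 N (all 3-nodes) for a million and a billion keys.

```java
// Sketch: numeric check of the 2-3 tree height bounds (hypothetical helper class).
class TwoThreeHeightBounds {
    // Worst case: every node is a 2-node, so the height is about lg N.
    static double worstCaseHeight(double n) {
        return Math.log(n) / Math.log(2);
    }

    // Best case: every node is a 3-node, so the height is about log_3 N.
    static double bestCaseHeight(double n) {
        return Math.log(n) / Math.log(3);
    }

    public static void main(String[] args) {
        double[] sizes = { 1e6, 1e9 };
        for (double n : sizes) {
            System.out.printf("N = %.0e: height between %.1f and %.1f%n",
                    n, bestCaseHeight(n), worstCaseHeight(n));
        }
    }
}
```

Rounding the best case down and the worst case up should give the ranges stated on the slide.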


2-3 tree: implementation?

Direct implementation is complicated, because:
• Maintaining multiple node types is cumbersome.
• Need multiple compares to move down tree.
• Need to move back up the tree to split 4-nodes.
• Large number of cases for splitting.

Bottom line. Could do it, but there's a better way.
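To make the first two complications concrete, here is a rough sketch of search alone in a directly represented 2-3 tree. It is an illustration under assumptions, not the authors' code: the `Node` layout (a nullable second key and a `middle` link) and all names are hypothetical. Even without insertion, a 3-node needs a second compare and a third link.

```java
// Sketch of search in a directly represented 2-3 tree (hypothetical layout).
class TwoThreeSearchSketch {
    // One class stands in for both node types: key2 == null means 2-node.
    static class Node {
        String key1, key2;          // key2 is null in a 2-node
        Node left, middle, right;   // a 2-node uses only left and right

        Node(String key1, String key2) {
            this.key1 = key1;
            this.key2 = key2;
        }
    }

    static boolean contains(Node x, String key) {
        while (x != null) {
            int cmp1 = key.compareTo(x.key1);
            if (cmp1 == 0) return true;
            if (cmp1 < 0) { x = x.left; continue; }
            if (x.key2 == null) { x = x.right; continue; }  // 2-node: one compare suffices
            int cmp2 = key.compareTo(x.key2);                // 3-node: second compare needed
            if (cmp2 == 0) return true;
            x = (cmp2 < 0) ? x.middle : x.right;
        }
        return false;
    }

    // Example tree: root M; children E J and R; leaves A C, H, L, P, S X.
    static Node exampleTree() {
        Node root = new Node("M", null);
        Node ej = new Node("E", "J");
        Node r = new Node("R", null);
        root.left = ej;
        root.right = r;
        ej.left = new Node("A", "C");
        ej.middle = new Node("H", null);
        ej.right = new Node("L", null);
        r.left = new Node("P", null);
        r.right = new Node("S", "X");
        return root;
    }

    public static void main(String[] args) {
        Node root = exampleTree();
        System.out.println("contains H: " + contains(root, "H"));
        System.out.println("contains B: " + contains(root, "B"));
    }
}
```

Insertion would add the 4-node splitting cases on top of this, which is where most of the case analysis comes from.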


Left-leaning red-black BSTs (Guibas-Sedgewick 1979 and Sedgewick 2007)

1. Represent 2-3 tree as a BST.
2. Use "internal" left-leaning links as "glue" for 3-nodes.

[Figure: encoding a 3-node with two 2-nodes connected by a left-leaning red link. The 3-node a b becomes node b with red left child a (larger key is root); the subtrees are less than a, between a and b, and greater than b.]

[Figure: a 2-3 tree (root M; children E J and R; leaves A C, H, L, P, S X) and the corresponding red-black BST. Black links connect 2-nodes and 3-nodes; red links "glue" nodes within a 3-node.]

An equivalent definition

A BST such that:
• No node has two red links connected to it.
• Every path from root to null link has the same number of black links. ("perfect black balance")
• Red links lean left.
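The three conditions of the equivalent definition can be turned into a small checker. This is a sketch under assumptions, not code from the lecture: the class, the `Node` constructor, and `blackHeight` are hypothetical names, the root is additionally required to be black, and symmetric order of the keys is assumed rather than checked.

```java
// Sketch: checking the color conditions of a left-leaning red-black BST.
class LlrbInvariantCheck {
    static final boolean RED = true;
    static final boolean BLACK = false;

    static class Node {
        String key;
        Node left, right;
        boolean color;   // color of the link from the parent

        Node(String key, boolean color, Node left, Node right) {
            this.key = key; this.color = color; this.left = left; this.right = right;
        }
    }

    static boolean isRed(Node x) { return x != null && x.color == RED; }

    // Number of black links (excluding null links) on every path below the
    // parent link of x, or -1 if any color condition is violated.
    static int blackHeight(Node x) {
        if (x == null) return 0;
        if (isRed(x.right)) return -1;             // red links must lean left
        if (isRed(x) && isRed(x.left)) return -1;  // no node touches two red links
        int l = blackHeight(x.left);
        int r = blackHeight(x.right);
        if (l < 0 || r < 0 || l != r) return -1;   // perfect black balance
        return l + (isRed(x) ? 0 : 1);
    }

    static boolean isLLRB(Node root) {
        return !isRed(root) && blackHeight(root) >= 0;
    }

    static Node leaf(String key, boolean color) { return new Node(key, color, null, null); }

    // Red-black encoding of the 2-3 tree M / E J, R / A C, H, L, P, S X.
    static Node exampleTree() {
        Node c = new Node("C", BLACK, leaf("A", RED), null);
        Node e = new Node("E", RED, c, leaf("H", BLACK));
        Node j = new Node("J", BLACK, e, leaf("L", BLACK));
        Node x = new Node("X", BLACK, leaf("S", RED), null);
        Node r = new Node("R", BLACK, leaf("P", BLACK), x);
        return new Node("M", BLACK, j, r);
    }

    public static void main(String[] args) {
        System.out.println("example is LLRB: " + isLLRB(exampleTree()));
        Node rightLeaning = new Node("A", BLACK, null, leaf("B", RED));
        System.out.println("right-leaning is LLRB: " + isLLRB(rightLeaning));
    }
}
```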

Left-leaning red-black BSTs: 1-1 correspondence with 2-3 trees

Key property. 1-1 correspondence between 2-3 and LLRB.

[Figure: 1-1 correspondence between red-black and 2-3 trees. Shown are a red-black tree, the same tree drawn with horizontal red links, and the corresponding 2-3 tree (root M; children E J and R; leaves A C, H, L, P, S X).]

Search implementation for red-black BSTs

Observation. Search is the same as for elementary BST (ignore color), but runs faster because of better balance.

    public Value get(Key key)
    {
        Node x = root;
        while (x != null)
        {
            int cmp = key.compareTo(x.key);
            if      (cmp  < 0) x = x.left;
            else if (cmp  > 0) x = x.right;
            else if (cmp == 0) return x.val;
        }
        return null;
    }

Remark. Most other ops (e.g., ceiling, selection, iteration) are also identical.

Red-black BST representation

Each node is pointed to by precisely one link (from its parent) ⇒ can encode color of links in nodes.

    private static final boolean RED   = true;
    private static final boolean BLACK = false;

    private class Node
    {
        Key key;
        Value val;
        Node left, right;
        boolean color;   // color of parent link
    }

    private boolean isRed(Node x)
    {
        if (x == null) return false;   // null links are black
        return x.color == RED;
    }

[Figure: node representation for red-black trees. Each node stores its key, associated data, subtrees, number of nodes in this subtree, and color of link from parent to this node. In the example, h.left.color is RED and h.right.color is BLACK.]

Elementary red-black BST operations

Left rotation. Orient a (temporarily) right-leaning red link to lean left.

[Figure: rotate E left (before). h = E with red right child x = S; subtrees: less than E, between E and S, greater than S.]

    private Node rotateLeft(Node h)
    {
        assert isRed(h.right);
        Node x = h.right;
        h.right = x.left;
        x.left = h;
        x.color = h.color;
        h.color = RED;
        return x;
    }

Invariants. Maintains symmetric order and perfect black balance.

[Figure: rotate E left (after). x = S is now the root of the subtree, with h = E as its red left child; subtrees: less than E, between E and S, greater than S.]

Elementary red-black BST operations

Right rotation. Orient a left-leaning red link to (temporarily) lean right.

[Figure: rotate S right (before). h = S with red left child x = E; subtrees: less than E, between E and S, greater than S.]

    private Node rotateRight(Node h)
    {
        assert isRed(h.left);
        Node x = h.left;
        h.left = x.right;
        x.right = h;
        x.color = h.color;
        h.color = RED;
        return x;
    }

Invariants. Maintains symmetric order and perfect black balance.

[Figure: rotate S right (after). x = E is now the root of the subtree, with h = S as its red right child; subtrees: less than E, between E and S, greater than S.]

Elementary red-black BST operations

Color flip. Recolor to split a (temporary) 4-node.

[Figure: flip colors (before). h = E with red links to both children A and S; subtrees: less than A, between A and E, between E and S, greater than S.]

    private void flipColors(Node h)
    {
        assert !isRed(h);
        assert isRed(h.left);
        assert isRed(h.right);
        h.color = RED;
        h.left.color = BLACK;
        h.right.color = BLACK;
    }

Invariants. Maintains symmetric order and perfect black balance.
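The three elementary operations can be tried on a hand-built tree. The sketch below is illustrative only: the demo class, the non-generic `Node`, and the `inorder` helper are assumptions, and the `assert` statements of the original fragments are omitted for brevity. It rotates the subtree rooted at E left, rotates it back, and flips colors on a small 4-node, printing the inorder key sequence so that symmetric order can be seen to survive.

```java
// Sketch: exercising rotateLeft, rotateRight and flipColors on tiny trees.
class LlrbRotationsDemo {
    static final boolean RED = true;
    static final boolean BLACK = false;

    static class Node {
        String key;
        Node left, right;
        boolean color;   // color of the link from the parent

        Node(String key, boolean color) { this.key = key; this.color = color; }
    }

    static boolean isRed(Node x) { return x != null && x.color == RED; }

    static Node rotateLeft(Node h) {
        Node x = h.right;
        h.right = x.left;
        x.left = h;
        x.color = h.color;
        h.color = RED;
        return x;
    }

    static Node rotateRight(Node h) {
        Node x = h.left;
        h.left = x.right;
        x.right = h;
        x.color = h.color;
        h.color = RED;
        return x;
    }

    static void flipColors(Node h) {
        h.color = RED;
        h.left.color = BLACK;
        h.right.color = BLACK;
    }

    // Concatenates the keys in symmetric (inorder) order.
    static String inorder(Node x) {
        return x == null ? "" : inorder(x.left) + x.key + inorder(x.right);
    }

    // E (black) with black left child A and red right child S; S has children J and X.
    static Node example() {
        Node e = new Node("E", BLACK);
        e.left = new Node("A", BLACK);
        e.right = new Node("S", RED);
        e.right.left = new Node("J", BLACK);
        e.right.right = new Node("X", BLACK);
        return e;
    }

    public static void main(String[] args) {
        Node h = example();
        System.out.println("before:       root " + h.key + ", inorder " + inorder(h));
        h = rotateLeft(h);
        System.out.println("rotate left:  root " + h.key + ", inorder " + inorder(h));
        h = rotateRight(h);
        System.out.println("rotate right: root " + h.key + ", inorder " + inorder(h));
    }
}
```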

[Figure: flip colors (after). The link to h = E is now red and the links to its children A and S are black; subtrees: less than A, between A and E, between E and S, greater than S.]

Insertion in a LLRB tree: overview

Basic strategy. Maintain 1-1 correspondence with 2-3 trees by applying elementary red-black BST operations.

[Figure: insert C into the LLRB tree with keys E, A, R, S, shown next to the corresponding 2-3 tree. Annotations: add new node here; right link red so rotate left. The resulting tree has keys E, A, C, R, S.]

Insertion in a LLRB tree

Warmup 1. Insert into a tree with exactly 1 node.

[Figure: insert into a single 2-node (two cases). Left: search ends at this null link; red link to new node containing a converts 2-node to 3-node. Right: search ends at this null link; attached new node with red link; rotated left to make a legal 3-node.]

Insertion in a LLRB tree

Case 1. Insert into a 2-node at the bottom.
• Do standard BST insert; color new link red.
• If new red link is a right link, rotate left.

[Figure: insert into a 2-node at the bottom (insert C). Annotations: add new node here; right link red so rotate left.]

Insertion in a LLRB tree

Warmup 2. Insert into a tree with exactly 2 nodes.

[Figure: insert into a single 3-node (three cases).
Larger: search ends at this null link; attached new node with red link; colors flipped to black.
Smaller: search ends at this null link; attached new node with red link; rotated right; colors flipped to black.
Between: search ends at this null link; attached new node with red link; rotated left; rotated right; colors flipped to black.]

Insertion in a LLRB tree

Case 2. Insert into a 3-node at the bottom.
• Do standard BST insert; color new link red.
• Rotate to balance the 4-node (if needed).
• Flip colors to pass red link up one level.
• Rotate to make lean left (if needed).

[Figure: insert into a 3-node at the bottom (inserting H). Annotations: add new node here; two lefts in a row so rotate right; both children red so flip colors; right link red so rotate left.]

Insertion in a LLRB tree: passing red links up the tree

Case 2. Insert into a 3-node at the bottom.
• Do standard BST insert; color new link red.
• Rotate to balance the 4-node (if needed).
• Flip colors to pass red link up one level.
• Rotate to make lean left (if needed).
• Repeat case 1 or case 2 up the tree (if needed).

[Figure: passing a red link up the tree (inserting P). Annotations: add new node here; both children red so flip colors; right link red so rotate left; two lefts in a row so rotate right; both children red so flip colors.]

LLRB tree insertion demo

Insertion in a LLRB tree: Java implementation

Same code for both cases.
• Right child red, left child black: rotate left.
• Left child, left-left grandchild red: rotate right.
• Both children red: flip colors.

[Figure: passing a red link up a red-black tree, using left rotate, right rotate, and flip colors on node h.]

    private Node put(Node h, Key key, Value val)
    {
        // insert at bottom (and color red)
        if (h == null) return new Node(key, val, RED);
        int cmp = key.compareTo(h.key);
        if      (cmp  < 0) h.left  = put(h.left,  key, val);
        else if (cmp  > 0) h.right = put(h.right, key, val);
        else if (cmp == 0) h.val = val;

        if (isRed(h.right) && !isRed(h.left))     h = rotateLeft(h);    // lean left
        if (isRed(h.left)  && isRed(h.left.left)) h = rotateRight(h);   // balance 4-node
        if (isRed(h.left)  && isRed(h.right))     flipColors(h);        // split 4-node

        return h;
    }

Only a few extra lines of code provide near-perfect balance.

Insertion in a LLRB tree: visualization

[Figure: 255 insertions in ascending order.]

[Figure: 255 insertions in descending order.]

[Figure: 255 random insertions.]
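The fragments shown so far can be assembled into one runnable class. What follows is a sketch rather than the authors' full implementation: the class name `LlrbSketch`, the three-argument `Node` constructor, the public `put` wrapper that recolors the root black, and the `height` helper are assumptions added to make the code self-contained. The `main` method repeats the 255 ascending insertions from the visualization and prints the resulting height, counted in nodes.

```java
// Sketch of a left-leaning red-black BST assembled from the slide fragments.
class LlrbSketch<Key extends Comparable<Key>, Value> {
    private static final boolean RED = true;
    private static final boolean BLACK = false;

    private Node root;

    private class Node {
        Key key;
        Value val;
        Node left, right;
        boolean color;   // color of the link from the parent

        Node(Key key, Value val, boolean color) {
            this.key = key; this.val = val; this.color = color;
        }
    }

    private boolean isRed(Node x) { return x != null && x.color == RED; }

    private Node rotateLeft(Node h) {
        Node x = h.right;
        h.right = x.left;
        x.left = h;
        x.color = h.color;
        h.color = RED;
        return x;
    }

    private Node rotateRight(Node h) {
        Node x = h.left;
        h.left = x.right;
        x.right = h;
        x.color = h.color;
        h.color = RED;
        return x;
    }

    private void flipColors(Node h) {
        h.color = RED;
        h.left.color = BLACK;
        h.right.color = BLACK;
    }

    public void put(Key key, Value val) {
        root = put(root, key, val);
        root.color = BLACK;   // assumption: the root link is kept black
    }

    private Node put(Node h, Key key, Value val) {
        if (h == null) return new Node(key, val, RED);   // insert at bottom, color red
        int cmp = key.compareTo(h.key);
        if      (cmp < 0) h.left  = put(h.left, key, val);
        else if (cmp > 0) h.right = put(h.right, key, val);
        else              h.val = val;

        if (isRed(h.right) && !isRed(h.left))    h = rotateLeft(h);    // lean left
        if (isRed(h.left) && isRed(h.left.left)) h = rotateRight(h);   // balance 4-node
        if (isRed(h.left) && isRed(h.right))     flipColors(h);        // split 4-node
        return h;
    }

    public Value get(Key key) {
        Node x = root;
        while (x != null) {
            int cmp = key.compareTo(x.key);
            if      (cmp < 0) x = x.left;
            else if (cmp > 0) x = x.right;
            else              return x.val;
        }
        return null;
    }

    // Number of nodes on the longest path from the root to a null link.
    public int height() { return height(root); }

    private int height(Node x) {
        return x == null ? 0 : 1 + Math.max(height(x.left), height(x.right));
    }

    public static void main(String[] args) {
        LlrbSketch<Integer, String> st = new LlrbSketch<>();
        for (int i = 1; i <= 255; i++) st.put(i, "v" + i);
        System.out.println("height after 255 ascending inserts: " + st.height());
        System.out.println("get(42) = " + st.get(42));
    }
}
```

An ordinary BST fed the same ascending keys would degenerate into a path of 255 nodes, so the printed height gives a quick feel for what the three extra lines in `put` buy.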

Balance in LLRB trees

Proposition. Height of tree is ≤ 2 lg N in the worst case.
Pf.
• Every path from root to null link has same number of black links.
• Never two red links in-a-row.

Property. Height of tree is ~ 1.00 lg N in typical applications.

ST implementations: summary

                                   | worst-case cost (after N inserts) | average case (after N random inserts)   |
implementation                     | search    | insert    | delete    | search hit  | insert      | delete      | ordered iteration? | key interface
sequential search (unordered list) | N         | N         | N         | N/2         | N           | N/2         | no                 | equals()
binary search (ordered array)      | lg N      | N         | N         | lg N        | N/2         | N/2         | yes                | compareTo()
BST                                | N         | N         | N         | 1.39 lg N   | 1.39 lg N   | ?           | yes                | compareTo()
2-3 tree                           | c lg N    | c lg N    | c lg N    | c lg N      | c lg N      | c lg N      | yes                | compareTo()
red-black BST                      | 2 lg N    | 2 lg N    | 2 lg N    | 1.00 lg N * | 1.00 lg N * | 1.00 lg N * | yes                | compareTo()

* exact value of coefficient unknown but extremely close to 1

War story: why red-black?

Xerox PARC innovations. [1970s]
• Alto.
• GUI.
• Ethernet.
• Smalltalk.
• InterPress.
• Laser printing.
• Bitmapped display.
• WYSIWYG text editor.
• ...

[Figure: Xerox Alto.]

[Figure: first page of the paper "A Dichromatic Framework for Balanced Trees" by Leo J. Guibas (Xerox Palo Alto Research Center, Palo Alto, California, and Carnegie-Mellon University) and Robert Sedgewick (Program in Computer Science, Brown University, Providence, R.I.). Abstract: "In this paper we present a uniform framework for the implementation and study of balanced tree algorithms. We show how to imbed in this framework the best known balanced tree techniques and then use the framework to develop new algorithms which perform the update and rebalancing in one pass, on the way down towards a leaf. We conclude with a study of performance issues and concurrent updating."]

War story: red-black BSTs

Telephone company contracted with database provider to build real-time database to store customer information.

Database implementation.
• Red-black BST search and insert; Hibbard deletion.
• Exceeding height limit of 80 triggered error-recovery process. (A height limit of 80 allows for up to 2^40 keys.)

Extended telephone service outage.
• Main cause = height bound exceeded!
• Telephone company sues database provider.
• Legal testimony (expert witness): "If implemented properly, the height of a red-black BST with N keys is at most 2 lg N."

Hibbard deletion was the problem.

File system model

Page. Contiguous block of data (e.g., a file or 4,096-byte chunk).
Probe. First access to a page (e.g., from disk to memory).

[Figure labels: slow (probe); fast (access to data within a page).]

Property. Time required for a probe is much larger than time to access data within a page.
Cost model. Number of probes.
Goal. Access data using minimum number of probes.

B-trees (Bayer-McCreight, 1972)

B-tree. Generalize 2-3 trees by allowing up to M - 1 key-link pairs per node. (Choose M as large as possible so that M links fit in a page, e.g., M = 1024.)
• At least 2 key-link pairs at root.
• At least M / 2 key-link pairs in other nodes.
• External nodes contain client keys.
• Internal nodes contain copies of keys to guide search.

[Figure: anatomy of a B-tree set (M = 6). The root is the 2-node * K, where * is the sentinel key. Its children are the internal 3-nodes * D H and K Q U. The external nodes are * B C (external 3-node), D E F, H I J, K M N O P (external 5-node, full), Q R T, and U W X Y (external 4-node). Client keys (black) are in external nodes; each red key is a copy of min key in subtree; all nodes except the root are 3-, 4- or 5-nodes.]

Searching in a B-tree

• Start at root.
• Find interval for search key and take corresponding link.
• Search terminates in external node.

[Figure: searching in a B-tree set (M = 6), searching for E. At the root * K: follow this link because E is between * and K. At * D H: follow this link because E is between D and H. At D E F: search for E in this external node.]
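The three search steps can be sketched on the example tree from the figure. Everything structural here is an assumption made for illustration: the `Node` layout with parallel `keys` and `children` arrays, the empty string standing in for the sentinel key *, and the linear scan within a node. Pages and probes are not modeled.

```java
// Sketch: search in a B-tree set, using the M = 6 example tree.
class BTreeSearchSketch {
    // Assumption: the empty string plays the role of the sentinel key *,
    // since it compares smaller than every non-empty key.
    static final String SENTINEL = "";

    static class Node {
        String[] keys;     // sorted; an internal key is a copy of the min key in its subtree
        Node[] children;   // null for external nodes

        Node(String[] keys, Node[] children) {
            this.keys = keys;
            this.children = children;
        }

        boolean isExternal() { return children == null; }
    }

    static Node external(String... keys) { return new Node(keys, null); }

    static Node internal(String[] keys, Node... children) { return new Node(keys, children); }

    static boolean contains(Node x, String key) {
        if (key.equals(SENTINEL)) return false;   // the sentinel is not a client key
        while (!x.isExternal()) {
            // Find the interval: the last j with keys[j] <= key.
            int j = 0;
            while (j + 1 < x.keys.length && key.compareTo(x.keys[j + 1]) >= 0) j++;
            x = x.children[j];                     // take the corresponding link
        }
        for (String k : x.keys) {                  // search ends in an external node
            if (k.equals(key)) return true;
        }
        return false;
    }

    static Node exampleTree() {
        Node left = internal(new String[] { SENTINEL, "D", "H" },
                external(SENTINEL, "B", "C"),
                external("D", "E", "F"),
                external("H", "I", "J"));
        Node right = internal(new String[] { "K", "Q", "U" },
                external("K", "M", "N", "O", "P"),
                external("Q", "R", "T"),
                external("U", "W", "X", "Y"));
        return internal(new String[] { SENTINEL, "K" }, left, right);
    }

    public static void main(String[] args) {
        Node root = exampleTree();
        System.out.println("contains E: " + contains(root, "E"));
        System.out.println("contains G: " + contains(root, "G"));
    }
}
```

With M as large as 1024, a binary search within each node would be the more natural choice; the linear scan is used here only to keep the sketch short.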

Insertion in a B-tree

• Search for new key.
• Insert at bottom.
• Split nodes with M key-link pairs on the way up the tree.

[Figure: inserting a new key into a B-tree set (inserting A). Annotations: new key (A) causes overflow and split; new key (C) causes overflow and split; root split causes a new root to be created.]

Balance in B-tree

Proposition. A search or an insertion in a B-tree of order M with N keys requires between log_{M-1} N and log_{M/2} N probes.
Pf. All internal nodes (besides root) have between M / 2 and M - 1 links.

In practice. Number of probes is at most 4. (M = 1024; N = 62 billion; log_{M/2} N ≤ 4.)

Optimization. Always keep root page in memory.
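The "at most 4 probes" figure follows directly from the proposition. The sketch below, with hypothetical class and method names, evaluates both logarithmic bounds for M = 1024 and N = 62 billion.

```java
// Sketch: evaluating the B-tree probe bounds for M = 1024, N = 62 billion.
class BTreeProbeBounds {
    static double logBase(double base, double n) {
        return Math.log(n) / Math.log(base);
    }

    // Lower bound: every node is as full as possible (M - 1 links).
    static double minProbes(int m, double n) {
        return logBase(m - 1, n);
    }

    // Upper bound: every node is as empty as allowed (M / 2 links).
    static double maxProbes(int m, double n) {
        return logBase(m / 2.0, n);
    }

    public static void main(String[] args) {
        int m = 1024;
        double n = 62e9;
        System.out.printf("between %.2f and %.2f probes%n",
                minProbes(m, n), maxProbes(m, n));
    }
}
```

Both values fall below 4, which matches the slide; keeping the root page in memory saves one further probe.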

Building a large B-tree

[Figure: building a large B-tree. Labels: full page, about to split; external nodes (line segment of length proportional to number of keys in that node).]

Balanced trees in the wild

Red-black trees are widely used as system symbol tables.
• Java: java.util.TreeMap, java.util.TreeSet.
• C++ STL: map, multimap, multiset.
• Linux kernel: completely fair scheduler, linux/rbtree.h.

B-tree variants. B+ tree, B*tree, B# tree, ...

B-trees (and variants) are widely used for file systems and databases.
• Windows: HPFS.
• Mac: HFS, HFS+.
• Linux: ReiserFS, XFS, Ext3FS, JFS.
• Databases: ORACLE, DB2, INGRES, SQL, PostgreSQL.

Red-black BSTs in the wild

Common sense. Sixth sense. Together they're the FBI's newest team.