
Information Processing Letters 45 (1993) 41-50

A Systematic Analysis of Splaying


Berry Schoenmakers
CWI P.O. Box 94079 1090 GB Amsterdam The Netherlands
berry@cwi.nl

Abstract

In this paper we perform an amortized analysis of a functional program for splaying. We construct a potential function that yields the same bound for the amortized cost of splaying as given by D.D. Sleator and R.E. Tarjan, the inventors of splay trees. In addition, we show that this bound is minimal for the class of "sum of logs" potentials. Our approach also applies to the analysis of path reversal and pairing heaps.

Keywords

Analysis of algorithms, data structures, amortized complexity, potential function, functional programming, splay trees.

1 Splaying
Splaying is the central operation in a particular implementation of dictionaries. A dictionary is an abstract data type involving operations on subsets of an infinite, linearly ordered set, such as the integers. In [5], Sleator and Tarjan developed an efficient implementation of dictionaries, called splay trees. Splay trees are binary search trees, i.e. binary trees of integers whose inorder traversal is strictly increasing. There is no balance condition imposed on these trees whatsoever; it is the particular way splaying is defined that makes the data structure efficient. Operation a↑ ("splaying at a") is performed by "rotating" a to the root of a binary search tree while keeping the inorder traversal intact (see Figure 1). In cases (iii) and (iv), the triangle stands for the upper part of the tree that remains unchanged. The symmetrical counterparts of (ii)-(iv), which are obtained by interchanging the roles of left and right subtrees, are omitted. Transformations (iii) and (iv) are repeatedly applied until (i) or (ii) applies. It is assumed that a occurs in the tree. We translate this pictorial description of splaying into the following functional program, in which ⟨t, a, u⟩ denotes a nonempty binary tree with left subtree t, root a, and right subtree u (again, the symmetrical cases are omitted):



[Figure 1: Splaying at a (cf. [5, Figure 3]); the pictures of transformations (i)-(iv) are not reproduced here.]

(i)   a↑⟨t, a, u⟩ = ⟨t, a, u⟩
(ii)  a↑⟨⟨t, a, u⟩, b, v⟩ = ⟨t, a, ⟨u, b, v⟩⟩,  a < b
(iii) a↑⟨⟨x, b, v⟩, c, w⟩ = ⟨t, a, ⟨u, b, ⟨v, c, w⟩⟩⟩ where ⟨t, a, u⟩ = a↑x,  a < b < c
(iv)  a↑⟨⟨v, b, x⟩, c, w⟩ = ⟨⟨v, b, t⟩, a, ⟨u, c, w⟩⟩ where ⟨t, a, u⟩ = a↑x,  b < a < c

Clearly, the root of a↑x equals a, and the inorder traversals of x and a↑x are equal. If the depth of a is even, the program returns the same tree as the operation described by Figure 1. In case the depth of a is odd, the resulting trees may be quite different. However, as far as efficiency is concerned, the difference is not essential, since both versions of splaying can be analyzed in the same way.
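To make the program concrete, it can be transcribed into executable form. The following Python sketch is our own illustration, not part of the paper: the empty tree is represented by None, a nonempty tree by the tuple (left, key, right), and the symmetrical cases are written out.

```python
def splay(a, x):
    """Splay at key a, assumed to occur in the binary search tree x.
    A tree is None (empty) or a triple (left, key, right)."""
    t, b, u = x
    if a == b:                                    # case (i)
        return x
    if a < b:
        t1, b1, u1 = t
        if a == b1:                               # case (ii)
            return (t1, a, (u1, b, u))
        if a < b1:                                # case (iii): a < b1 < b
            t2, _, u2 = splay(a, t1)
            return (t2, a, (u2, b1, (u1, b, u)))
        t2, _, u2 = splay(a, u1)                  # case (iv): b1 < a < b
        return ((t1, b1, t2), a, (u2, b, u))
    # a > b: the symmetrical counterparts of (ii)-(iv)
    t1, b1, u1 = u
    if a == b1:                                   # mirror of (ii)
        return ((t, b, t1), a, u1)
    if a > b1:                                    # mirror of (iii)
        t2, _, u2 = splay(a, u1)
        return (((t, b, t1), b1, t2), a, u2)
    t2, _, u2 = splay(a, t1)                      # mirror of (iv)
    return ((t, b, t2), a, (u2, b1, u1))

def inorder(x):
    """Inorder traversal; splaying must leave this list unchanged."""
    return [] if x is None else inorder(x[0]) + [x[1]] + inorder(x[2])
```

Splaying at any present key moves that key to the root and leaves the inorder traversal intact, matching the two properties claimed for the program.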

2 Analysis
Evaluation of a↑ amounts to repeatedly unfolding either (iii) or (iv), followed by a single unfolding of either (i) or (ii). A useful cost measure is therefore given by

T.x = "number of unfoldings of (iii) and (iv) required for the evaluation of a↑x".

Given cost measure T, we want to derive a logarithmic bound for the amortized cost A of a↑, given by

A.x = T.x + Φ.(a↑x) − Φ.x,

where Φ is a potential function [6]. For the sake of brevity it is left implicit that T.x and A.x depend on a.
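The cost measure T can be computed directly by counting recursive unfoldings. The following Python sketch (our own illustration; trees are None or (left, key, right) triples) descends two levels per counted step, mirroring cases (iii) and (iv):

```python
def splay_cost(a, x):
    """T.x: the number of unfoldings of cases (iii) and (iv) (or their
    mirror images) performed while splaying at a; a is assumed to occur
    in the binary search tree x."""
    t, b, u = x
    if a == b:
        return 0                      # case (i): nothing counted
    child = t if a < b else u
    if child[1] == a:
        return 0                      # case (ii) or its mirror
    t1, b1, u1 = child
    # cases (iii)/(iv): one unfolding, then continue in the grandchild
    # subtree that contains a
    if a < b:
        grand = t1 if a < b1 else u1
    else:
        grand = u1 if a > b1 else t1
    return 1 + splay_cost(a, grand)
```

Once a potential Φ is fixed, the amortized cost A.x is obtained by adding the potential difference to this count.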


Setting out for an inductive derivation, we first calculate a recurrence relation for A. To this end, we take Φ of the form:

Φ.⟨⟩ = 0
Φ.⟨t, a, u⟩ = Φ.t + φ.t.u + Φ.u.

Note that a does not occur in the right-hand side of the definition of Φ.⟨t, a, u⟩; this reflects our decision to let Φ.x depend on the structure of x only. Moreover, φ.t.u will be defined to be symmetric in t and u, so that the symmetrical counterparts of (ii)-(iv) can be ignored in the sequel. Now, we calculate for case (iv):

  A.⟨⟨v, b, x⟩, c, w⟩
=    { definition of A }
  T.⟨⟨v, b, x⟩, c, w⟩ + Φ.(a↑⟨⟨v, b, x⟩, c, w⟩) − Φ.⟨⟨v, b, x⟩, c, w⟩
=    { definitions of a↑ and T }
  1 + T.x + Φ.⟨⟨v, b, t⟩, a, ⟨u, c, w⟩⟩ − Φ.⟨⟨v, b, x⟩, c, w⟩
=    { definition of A; a↑x = ⟨t, a, u⟩ }
  1 + A.x + Φ.x − Φ.⟨t, a, u⟩ + Φ.⟨⟨v, b, t⟩, a, ⟨u, c, w⟩⟩ − Φ.⟨⟨v, b, x⟩, c, w⟩
=    { definition of Φ }
  1 + A.x + Φ.x − Φ.t − φ.t.u − Φ.u
    + Φ.v + φ.v.t + Φ.t + φ.⟨v, b, t⟩.⟨u, c, w⟩ + Φ.u + φ.u.w + Φ.w
    − Φ.v − φ.v.x − Φ.x − φ.⟨v, b, x⟩.w − Φ.w
=    { simplifying }
  A.x + 1 + φ.v.t + φ.u.w − φ.v.x − φ.t.u + φ.⟨v, b, t⟩.⟨u, c, w⟩ − φ.⟨v, b, x⟩.w
=    { see below }
  A.x + 1 + φ.v.t + φ.u.w − φ.v.x − φ.t.u.

Let |x| denote one plus the size of x: |⟨⟩| = 1 and |⟨t, a, u⟩| = |t| + |u|. Then, for the last step, note that |t| + |u| = |x| because ⟨t, a, u⟩ = a↑x and |a↑x| = |x|. Hence, φ.⟨v, b, t⟩.⟨u, c, w⟩ = φ.⟨v, b, x⟩.w provided φ.t.u depends on |t| + |u| only. This proviso will be assumed in the sequel. Similar calculations for cases (ii) and (iii) then yield as recurrence relation for A:

A.⟨t, a, u⟩ = 0
A.⟨⟨t, a, u⟩, b, v⟩ = φ.u.v − φ.t.u
A.⟨⟨x, b, v⟩, c, w⟩ = A.x + 1 + φ.u.⟨v, c, w⟩ + φ.v.w − φ.x.v − φ.t.u
A.⟨⟨v, b, x⟩, c, w⟩ = A.x + 1 + φ.v.t + φ.u.w − φ.v.x − φ.t.u.

A logarithmic bound on A follows if we are able to define φ such that, for instance,

(1) A.x ≤ log_β |x|

with β > 1. Inspired by [5, Lemma 1], however, and in view of Theorem 1 (Section 3), we will prove the following stronger bound:

(2) A.x ≤ log_β (|x| / |x|_a).

Here |x|_a denotes one plus the size of a's subtree in x. In case (i), (2) evidently holds. In order that (2) follows by induction in the other cases, the following requirements are imposed on φ:

φ.u.v − φ.t.u ≤ log_β ((|t| + |u| + |v|) / (|t| + |u|))
1 + φ.u.⟨v, c, w⟩ + φ.v.w − φ.x.v − φ.t.u ≤ log_β ((|t| + |u| + |v| + |w|) / (|t| + |u|))
1 + φ.v.t + φ.u.w − φ.v.x − φ.t.u ≤ log_β ((|t| + |u| + |v| + |w|) / (|t| + |u|)).
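The quantities occurring in these requirements are directly executable. A small Python sketch (our own illustration; trees are None or (left, key, right) triples, and phi may be any symmetric function of the two subtrees):

```python
from math import log

def size(x):
    """|x|: one plus the number of nodes of x."""
    return 1 if x is None else size(x[0]) + size(x[2])

def subtree_size(x, a):
    """|x|_a: one plus the size of a's subtree in x (a assumed present)."""
    t, b, u = x
    if a == b:
        return size(x)
    return subtree_size(t if a < b else u, a)

def potential(x, phi):
    """Phi with Phi.<> = 0 and Phi.<t,a,u> = Phi.t + phi(t, u) + Phi.u."""
    if x is None:
        return 0.0
    t, _, u = x
    return potential(t, phi) + phi(t, u) + potential(u, phi)
```

Any symmetric phi depending on size(t) + size(u) only satisfies the proviso assumed above.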

The important observation is now that these requirements are not only sufficient but also necessary for (2) to hold. This is clear in case (ii), since in this case the requirement on φ is just a reformulation of (2). To see this for the other two requirements, we reason as follows. Consider case (iv) and take x = ⟨t, a, u⟩. Then A.x = 0 and |x| = |x|_a = |⟨⟨v, b, x⟩, c, w⟩|_a = |t| + |u|, and, as may be gathered from the above calculation, the last requirement on φ is then equivalent to (2) in this case. The same reasoning applies to case (iii). Next, to remove log_β from the above requirements, we define φ.t.u as the following function of |t| + |u|:

φ.t.u = α log_β (|t| + |u|),

with α ≠ 0. On account of the monotonicity of log_β for β > 1, the requirements on φ then reduce to the following requirements on α and β:

((|u| + |v|) / (|t| + |u|))^α ≤ 1 + |v| / (|t| + |u|)
β ((|u| + |v| + |w|) / (|t| + |u| + |v|))^α ((|v| + |w|) / (|t| + |u|))^α ≤ 1 + (|v| + |w|) / (|t| + |u|)
β ((|v| + |t|) / (|v| + |t| + |u|))^α ((|u| + |w|) / (|t| + |u|))^α ≤ 1 + (|v| + |w|) / (|t| + |u|).

The last requirement corresponds to case (iv), and results from the following calculation:

  A.⟨⟨v, b, x⟩, c, w⟩
=    { above recurrence relation }
  A.x + 1 + φ.v.t + φ.u.w − φ.v.x − φ.t.u
≤    { induction hypothesis (2) }
  log_β (|x| / |x|_a) + 1 + φ.v.t + φ.u.w − φ.v.x − φ.t.u
≤    { last requirement on φ; |x| = |t| + |u| and |x|_a = |⟨⟨v, b, x⟩, c, w⟩|_a }
  log_β (|⟨⟨v, b, x⟩, c, w⟩| / |⟨⟨v, b, x⟩, c, w⟩|_a).


[Figure 2: Maximum at α = 1/3; the plot of the upper bounds on β over 0 < α < 1/2 is not reproduced here.]

Summarizing, we have as sufficient and necessary constraints on α and β to guarantee (2):

(3) (∀ k, l, m :: ((l + m) / (k + l))^α ≤ 1 + m / (k + l))
(4) (∀ k, l, m, n :: β ((l + m + n) / (k + l + m))^α ((m + n) / (k + l))^α ≤ 1 + (m + n) / (k + l))
(5) (∀ k, l, m, n :: β ((k + m) / (k + l + m))^α ((l + n) / (k + l))^α ≤ 1 + (m + n) / (k + l)).

Under these constraints, we now maximize β so as to minimize bound (2). We distinguish three cases, using that β > 1 and α ≠ 0.

Case α < 0. Instantiation of (3) with l, m := 1, 1 yields that (2 / (k + 1))^α ≤ 1 + 1 / (k + 1) for all k. But this is false if α < 0.

Case 0 < α < 1/2. In the appendix it is proved that in this case:

(3) ≡ true (Lemma 1)
(4) ≡ β ≤ (1 − α)^(1−α) / (α^α (1 − 2α)^(1−2α)) (Lemma 2)
(5) ≡ β ≤ 4^α (Lemma 3);

so, taking the conjunction of the requirements on α and β, we obtain:

1 < β ≤ min( (1 − α)^(1−α) / (α^α (1 − 2α)^(1−2α)), 4^α ).

To maximize β, we determine the maximum of its upper bound over all α satisfying 0 < α < 1/2. This yields β = ∛4 (≈ 1.59) as maximal value, at α = 1/3 (see also Figure 2).

Case α ≥ 1/2. Instantiation of (4) with k, l, m := 1, 1, 1 yields that β ((n + 2) / 3)^α ((n + 1) / 2)^α ≤ (n + 3) / 2, for all n, which is equivalent to β ≤ 6^α (n + 3) / (2 (n² + 3n + 2)^α). This upper bound is decreasing in α, so to minimize it we take α = 1/2. Taking n → ∞, we then get β ≤ √(3/2).



Since √(3/2) < ∛4, we conclude from this case analysis that the maximal value for β is given by β = ∛4 (for α = 1/3).
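The case analysis can be replayed numerically: for α in (0, 1/2), take the smaller of the two upper bounds on β from Lemmas 2 and 3 and maximize over α. A Python sketch (our own check, not part of the paper; the grid resolution is an arbitrary choice):

```python
def beta_bound(alpha):
    """Smaller of the two upper bounds on beta, for 0 < alpha < 1/2:
    Lemma 2's (1-a)^(1-a) / (a^a (1-2a)^(1-2a)) and Lemma 3's 4^a."""
    lemma2 = ((1 - alpha) ** (1 - alpha)
              / (alpha ** alpha * (1 - 2 * alpha) ** (1 - 2 * alpha)))
    lemma3 = 4 ** alpha
    return min(lemma2, lemma3)

# grid search for the alpha that maximizes the admissible beta
best_alpha = max((a / 10000 for a in range(1, 5000)), key=beta_bound)
```

The grid maximum lands at α ≈ 1/3, where both bounds cross at the cube root of 4.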

3 Result
The analysis in the previous section constitutes a proof of the following theorem.

Theorem 1. Let potential Φ be of the form

Φ.⟨⟩ = 0
Φ.⟨t, a, u⟩ = Φ.t + α log_β |⟨t, a, u⟩| + Φ.u

with β > 1 and α ≠ 0. Then A satisfies

(∀ a, x :: A.x ≤ log_β (|x| / |x|_a)) ≡ (3) ∧ (4) ∧ (5),

and this bound on A is minimal for β = ∛4 (and α = 1/3).

In other words, (3/2) log₂ (|x| / |x|_a) is the minimal bound for A.x for the class of "sum of logs" potentials. A better bound for A.x of this form can only be obtained by using a different type of potential function. To obtain the bound of Sleator and Tarjan [5, Lemma 1], we define T′.x = 2 + 2 T.x. Then T′.x corresponds to the cost measure in [5], which counts the number of integer comparisons required for the evaluation of a↑x. Furthermore, we define A′.x = T′.x + Φ′.(a↑x) − Φ′.x, where Φ′.x = 2 Φ.x. Since A′.x = 2 + 2 A.x, Theorem 1 now yields

A′.x ≤ 2 + 3 log₂ (|x| / |x|_a) ≤ 2 + 3 log₂ |x|

as bound on the amortized number of comparisons used by splaying. The corresponding potential is given by

Φ′.⟨⟩ = 0
Φ′.⟨t, a, u⟩ = Φ′.t + log₂ |⟨t, a, u⟩| + Φ′.u.
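The bound on A′ can be tested exhaustively on small instances. The following self-contained Python sketch (our own check, not part of the paper) represents trees as None or (left, key, right) triples, splays at every key of every binary search tree on n keys, and compares A′.x against 2 + 3 log₂(|x|/|x|_a):

```python
from math import log2

def splay(a, x):
    """Splay at a (assumed present), also returning T.x, the number of
    unfoldings of cases (iii)/(iv) and their mirror images."""
    t, b, u = x
    if a == b:                                    # (i)
        return x, 0
    if a < b:
        t1, b1, u1 = t
        if a == b1:                               # (ii)
            return (t1, a, (u1, b, u)), 0
        if a < b1:                                # (iii)
            (t2, _, u2), c = splay(a, t1)
            return (t2, a, (u2, b1, (u1, b, u))), c + 1
        (t2, _, u2), c = splay(a, u1)             # (iv)
        return ((t1, b1, t2), a, (u2, b, u)), c + 1
    t1, b1, u1 = u                                # mirrors of (ii)-(iv)
    if a == b1:
        return ((t, b, t1), a, u1), 0
    if a > b1:
        (t2, _, u2), c = splay(a, u1)
        return (((t, b, t1), b1, t2), a, u2), c + 1
    (t2, _, u2), c = splay(a, t1)
    return ((t, b, t2), a, (u2, b1, u1)), c + 1

def size(x):                                      # |x|
    return 1 if x is None else size(x[0]) + size(x[2])

def size_at(x, a):                                # |x|_a
    t, b, u = x
    return size(x) if a == b else size_at(t if a < b else u, a)

def phi(x):                                       # Phi'.x, base-2 sum of logs
    if x is None:
        return 0.0
    t, _, u = x
    return phi(t) + log2(size(t) + size(u)) + phi(u)

def all_bsts(lo, hi):
    """All binary search trees over keys lo..hi."""
    if lo > hi:
        yield None
        return
    for r in range(lo, hi + 1):
        for lt in all_bsts(lo, r - 1):
            for rt in all_bsts(r + 1, hi):
                yield (lt, r, rt)

def bound_holds(n):
    for x in all_bsts(1, n):
        for a in range(1, n + 1):
            y, cost = splay(a, x)
            amortized = (2 + 2 * cost) + phi(y) - phi(x)   # A'.x
            if amortized > 2 + 3 * log2(size(x) / size_at(x, a)) + 1e-9:
                return False
    return True
```

For n = 6 this examines all 132 tree shapes, splaying at each of the six keys; no counterexample arises, as Theorem 1 guarantees.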

4 Concluding remarks
In a systematic way, we have analyzed a top-down version of splaying. We have shown that using (2) (or (1)) as induction hypothesis leads relatively straightforwardly to a "sum of logs" potential. The corresponding bound on the amortized costs matches the bound of Sleator and Tarjan. By carefully deriving requirements on φ that are sufficient and necessary for (2) to hold, we showed in addition that this bound is minimal for the class of "sum of logs" potentials.


Along the same lines, path reversal [2] and the two-pass variant of pairing [1] can be analyzed (see [4]). In all these analyses a "sum of logs" potential arises. As the authors of [2] note, this is easy to explain for splaying and pairing, since there is a clear connection between these operations (see [1, p. 121]). But they are at a loss to explain that such a potential can also be used to amortize the cost of path reversal, because they cannot discover a connection between splaying (and pairing) on the one hand, and path reversal on the other. Our analyses, however, show that "sum of logs" potentials can be derived systematically as one sets out to prove logarithmic bounds for the amortized costs of these operations. Another example of such a derivation is the analysis of top-down skew heaps in [3], in which an asymmetric variant of a "sum of logs" potential arises.

Acknowledgements

The referees are gratefully acknowledged for their useful comments, which improved the presentation.

References

1. Fredman, M.L., Sedgewick, R., Sleator, D.D., Tarjan, R.E. The pairing heap: a new form of self-adjusting heap. Algorithmica 1 (1986) 111-129.
2. Ginat, D., Sleator, D.D., Tarjan, R.E. A tight amortized bound for path reversal. Information Processing Letters 31 (1989) 3-5.
3. Kaldewaij, A., Schoenmakers, B. The derivation of a tighter bound for top-down skew heaps. Information Processing Letters 37 (1991) 265-271.
4. Schoenmakers, B. Data Structures and Amortized Complexity in a Functional Setting. Ph.D. thesis, Eindhoven University of Technology, Eindhoven, The Netherlands (1992).
5. Sleator, D.D., Tarjan, R.E. Self-adjusting binary search trees. Journal of the ACM 32 (1985) 652-686.
6. Tarjan, R.E. Amortized computational complexity. SIAM Journal on Algebraic and Discrete Methods 6 (1985) 306-318.

Appendix

Lemma 1. For α > 0,

(∀ k, l, m :: ((l + m) / (k + l))^α ≤ 1 + m / (k + l)) ≡ α ≤ 1.

Proof. By mutual implication:

  (∀ k, l, m :: ((l + m) / (k + l))^α ≤ 1 + m / (k + l))
⇐    { α > 0, and (l + m) / (k + l) ≤ (k + l + m) / (k + l) }
  (∀ k, l, m :: ((k + l + m) / (k + l))^α ≤ (k + l + m) / (k + l))
⇐    { algebra: the base is at least 1 }
  α ≤ 1;

  (∀ k, l, m :: ((l + m) / (k + l))^α ≤ 1 + m / (k + l))
⇒    { take k = 1 and l = 1 }
  (∀ m :: ((m + 1) / 2)^α ≤ (m + 2) / 2)
⇒    { take m → ∞ }
  α ≤ 1. □

Lemma 2. For 0 < α < 1/2,

(∀ k, l, m, n :: β ((l + m + n) / (k + l + m))^α ((m + n) / (k + l))^α ≤ 1 + (m + n) / (k + l)) ≡ β ≤ (1 − α)^(1−α) / (α^α (1 − 2α)^(1−2α)).

Proof. By mutual implication:

  (∀ k, l, m, n :: β ((l + m + n) / (k + l + m))^α ((m + n) / (k + l))^α ≤ 1 + (m + n) / (k + l))
⇐    { α > 0, and (l + m + n) / (k + l + m) ≤ (k + l + m + n) / (k + l) }
  (∀ k, l, m, n :: β ((k + l + m + n) / (k + l))^α ((m + n) / (k + l))^α ≤ 1 + (m + n) / (k + l))
≡    { p = (m + n) / (k + l) }
  (∀ p : p > 0 : β ((1 + p) p)^α ≤ 1 + p)
≡    { simplifying }
  (∀ p : p > 0 : β ≤ (1 + p)^(1−α) / p^α)
⇐    { minimize (1 + x)^(1−α) / x^α, using 0 < α < 1/2 (see below) }
  β ≤ (1 − α)^(1−α) / (α^α (1 − 2α)^(1−2α));

  (∀ k, l, m, n :: β ((l + m + n) / (k + l + m))^α ((m + n) / (k + l))^α ≤ 1 + (m + n) / (k + l))
⇒    { take k = m = 1 }
  (∀ l, n :: β ((l + n + 1) / (l + 2))^α ((n + 1) / (l + 1))^α ≤ 1 + (n + 1) / (l + 1))
⇒    { algebra }
  (∀ l, n :: β ≤ (1 + (n + 1) / (l + 1)) ((l + 2) / (l + n + 1))^α ((l + 1) / (n + 1))^α)
⇒    { take n/l → α / (1 − 2α) and l → ∞ (α / (1 − 2α) > 0, since 0 < α < 1/2) }
  β ≤ (1 + α / (1 − 2α))^(1−α) / (α / (1 − 2α))^α
≡    { 1 + α / (1 − 2α) = (1 − α) / (1 − 2α); simplifying }
  β ≤ (1 − α)^(1−α) / (α^α (1 − 2α)^(1−2α)).

To complete the proof we minimize f(x) = (1 + x)^(1−α) / x^α over all x > 0, for 0 < α < 1/2. Then f′(x) = 0 is equivalent to (1 − α) x = α (1 + x), and f turns out to be minimal at x = α / (1 − 2α), which is positive because 0 < α < 1/2. So the minimum of f equals (1 + α / (1 − 2α))^(1−α) / (α / (1 − 2α))^α, which in turn equals (1 − α)^(1−α) / (α^α (1 − 2α)^(1−2α)), as 1 + α / (1 − 2α) = (1 − α) / (1 − 2α). □

Lemma 3. For 0 < α < 1/2,

(∀ k, l, m, n :: β ((k + m) / (k + l + m))^α ((l + n) / (k + l))^α ≤ 1 + (m + n) / (k + l)) ≡ β ≤ 4^α.

Proof. By mutual implication:

  (∀ k, l, m, n :: β ((k + m) / (k + l + m))^α ((l + n) / (k + l))^α ≤ 1 + (m + n) / (k + l))
⇐    { α > 0, and (k + m) / (k + l + m) ≤ (k + m) / (k + l) }
  (∀ k, l, m, n :: β ((k + m) / (k + l))^α ((l + n) / (k + l))^α ≤ 1 + (m + n) / (k + l))
≡    { p = (k + m) / (k + l) and q = (l + n) / (k + l) }
  (∀ p, q : p > 0 ∧ q > 0 ∧ p + q ≥ 1 : β (p q)^α ≤ p + q)
≡    { algebra }
  (∀ p, q : p > 0 ∧ q > 0 ∧ p + q ≥ 1 : β ≤ (p + q) / (p q)^α)
⇐    { minimize (x + y) / (x y)^α, using 0 < α < 1/2 (see below) }
  β ≤ 4^α;

  (∀ k, l, m, n :: β ((k + m) / (k + l + m))^α ((l + n) / (k + l))^α ≤ 1 + (m + n) / (k + l))
⇒    { take l = k and m = n = 1 }
  (∀ k :: β ((k + 1) / (2k + 1))^α ((k + 1) / (2k))^α ≤ 1 + 2 / (2k))
⇒    { simplifying }
  (∀ k :: β ≤ (1 + 1/k) ((2k + 1) / (k + 1))^α ((2k) / (k + 1))^α)
⇒    { take k → ∞ }
  β ≤ 4^α.

We determine the minimum of (x + y) / (x y)^α over all positive x and y satisfying x + y ≥ 1, when 0 < α < 1/2. To this end, we first observe that this function takes on its minimal value only if x = y, because (x y)^α is maximized by taking x equal to y (α > 0), and this can be achieved without changing the value of x + y. We thus minimize 2 x^(1−2α) over all x ≥ 1/2. Since this function is increasing in x (α < 1/2), it attains its minimal value of 4^α at x = 1/2. □
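The two minimizations invoked in the proofs of Lemmas 2 and 3 can be confirmed numerically. A Python sketch (our own check; the grid bounds and step counts are arbitrary choices):

```python
def lemma2_min(alpha, steps=200000, hi=20.0):
    """Grid-minimize f(x) = (1+x)^(1-alpha) / x^alpha over 0 < x <= hi."""
    return min((1 + i * hi / steps) ** (1 - alpha) / (i * hi / steps) ** alpha
               for i in range(1, steps + 1))

def lemma2_closed(alpha):
    """Closed form: (1-alpha)^(1-alpha) / (alpha^alpha (1-2 alpha)^(1-2 alpha))."""
    return ((1 - alpha) ** (1 - alpha)
            / (alpha ** alpha * (1 - 2 * alpha) ** (1 - 2 * alpha)))

def lemma3_min(alpha, steps=100000, hi=20.0):
    """Grid-minimize 2 x^(1-2 alpha) over x >= 1/2 (the diagonal x = y)."""
    return min(2 * (0.5 + i * hi / steps) ** (1 - 2 * alpha)
               for i in range(steps + 1))
```

At α = 1/3 both closed forms coincide at the cube root of 4, matching the crossing point in Figure 2.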
