Keith Lindsay - A Three-Dimensional Cartesian Tree-Code and Applications To Vortex Sheet Roll-Up

A THREE-DIMENSIONAL CARTESIAN TREE-CODE AND APPLICATIONS TO VORTEX SHEET ROLL-UP
by Keith Lindsay
A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Mathematics) in The University of Michigan 1997
Doctoral Committee: Professor Robert Krasny, Chair; Assistant Professor Peter Smereka; Associate Professor Grétar Tryggvason; Professor Arthur Wasserman; Professor Michael Weinstein
© Keith Lindsay 1997 All Rights Reserved
This thesis is dedicated to the memory of Bruce Lindsay. I miss you and think of you often.
ACKNOWLEDGEMENTS
There are a few people I would like to thank for their support while I have worked on this thesis. I would first like to thank my advisor Robert Krasny. With his guidance, I have learned a great deal about fluid dynamics and numerical analysis. Without his assistance, this thesis would not have been possible. I am grateful for all that he has taught me and I look forward to working with him in the future. I would also like to thank the other members of my dissertation committee, Peter Smereka, Grétar Tryggvason, Arthur Wasserman, and Michael Weinstein for their thoughtful comments and suggestions. I extend a special thank you to Judy Florian for all of the support that she has given me. I love you very much.
TABLE OF CONTENTS
DEDICATION
ACKNOWLEDGEMENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF APPENDICES
CHAPTER
1. INTRODUCTION
    1.1 Overview
    1.2 Contributions of the Thesis
2. FLUID DYNAMICS
    2.1 Governing Equations
    2.2 Vortex Sheets
        2.2.1 Parametrization
        2.2.2 Desingularization
        2.2.3 Discretization
    2.3 Vortex Rings
        2.3.1 Formation
        2.3.2 Stability
        2.3.3 Interactions
3. FAST METHODS FOR PARTICLE SIMULATIONS
    3.1 Mesh Codes
    3.2 Tree Codes
    3.3 Particle-Cluster Interactions
    3.4 Tree Construction
    3.5 Recurrences for Taylor Coefficients
    3.6 Error Analysis of Particle-Cluster Interactions
    3.7 Full Description of the Algorithm
    3.8 Complexity Analysis
4. ALGORITHM VALIDATION AND PERFORMANCE
    4.1 Convergence of Vortex Method
    4.2 Selection of Runtime Parameters
    4.3 Algorithm Performance
5. APPLICATIONS
    5.1 Vortex Ring with Azimuthal Perturbation
    5.2 Elliptical Vortex Ring
    5.3 Colliding Vortex Rings
6. CONCLUSIONS
    6.1 Summary
    6.2 Directions for Future Work
APPENDICES
BIBLIOGRAPHY
LIST OF FIGURES
Figure
2.1 A vortex sheet modeling parallel shear flow.
2.2 Vortex lines and circulation. $\alpha_1$, $\alpha_2$: Lagrangian parameters, $y_0$: reference point, $y$: point on surface, $C$: curve for circulation integral.
2.3 Discretization of parameter space and a circular disk. $\alpha_1$, $\alpha_2$: Lagrangian parameters. $\alpha_1$ is a radial parameter and $\alpha_2$ is a parameter around the disk.
2.4 Particle insertion along a vortex line. Given data, new particle.
2.5 Vortex line insertion. Given data, new particle.
2.6 Propagating vortex ring.
2.7 Cylindrical coordinates and basis vectors.
2.8 Dispersion relation. $\mathrm{sign}(\sigma^2)|\sigma|$ vs. $k$. $R = 1$, $\delta = 0.18, 0.15, 0.12, 0.09, 0.06$. Going left to right, the peaks correspond to decreasing $\delta$.
2.9 Colliding vortex rings.
3.1 Particle-cluster interaction. $x$: target particle, $y_j$: particle in cluster; the cell and its center are also labeled.
3.2 Subdivision of space for random points. (a) Nested subdivision of space. (b) Associated tree structure.
3.3 Subdivision of space for points on a spiral. (a) Nested subdivision of space. (b) Associated tree structure.
3.4 Computing Taylor coefficients for two-dimensional example. Previous step, current step, future step.
4.1 Profile of rolling up vortex sheet. $t = 1$, $\delta = 0.10$.
4.2 Profile of rolling up vortex sheet. $t = 4$, $\delta = 0.10$, $\Delta t = 0.05$, $\Delta\alpha_1 = 0.15, 0.10, 0.05$, $\Delta\alpha_2 = 0.05$.
4.3 Execution time (sec.) vs. $N_0$. $p_{\max} = 6, 8, 10$.
4.4 Memory usage (MB) vs. $N_0$. $p_{\max} = 6, 8, 10$.
4.5 Execution time (sec.) vs. $N$. $p_{\max} = 8$. tol $= 10^{-2}, 10^{-3}, 10^{-4}$. Direct summation. Actual data (o), projected data (x). (a) Execution time, (b) Direct summation time / fast algorithm time.
4.6 Memory usage (MB) vs. $N$. $p_{\max} = 8$. Fast algorithm, direct summation. Actual data (o), projected data (x). (a) Memory usage, (b) Fast algorithm memory usage / direct summation memory usage.
4.7 Actual error vs. specified tolerance. $p_{\max} = 8$, $N_0 = 500$, $N = 6284, 12708, 25572, 38444, 51276$. Potential error bound, velocity error bound.
4.8 Execution time (sec.) vs. actual error. $p_{\max} = 8$, $N_0 = 500$, $N = 6284, 12708, 25572, 38444, 51276$. Connected lines are tol $= 10^{-2}, 10^{-3}, 10^{-4}$. Potential error bound, velocity error bound.
5.1 Variance of perturbed vortex sheet. = 0.10, = 0.10. $k$: wavenumber of perturbation, $t$: time.
5.2 Perturbed vortex sheet. $k = 5$. $\delta = 0.10$, $t = 0, 2, 4, 6$.
5.3 Perturbed vortex sheet. $k = 9$. $\delta = 0.10$, $t = 0, 2, 4, 6$.
5.4 Core of perturbed vortex sheet. $k = 5, 9$. $\delta = 0.10$, $t = 0, 2, 4, 6$.
5.5 Elliptical vortex sheet. $a = 0.8$. $\delta = 0.10$, $t = 0, 2, 4, 6$.
5.6 Elliptical vortex sheet. $a = 0.6$. $\delta = 0.10$, $t = 0, 2, 4, 6$.
5.7 Elliptical vortex sheet. $a = 0.5$. $\delta = 0.10$, $t = 0, 2, 4, 6$.
5.8 Vortex sheets modeling colliding disks. $\delta = 0.10$, $t = 0, 1, 2, 3, 4, 4.5$.
5.9 Cut-away of vortex sheets modeling colliding disks. $\delta = 0.10$, $t = 0, 1, 2, 3, 4, 4.5$.
5.10 Vorticity isosurfaces of colliding vortex rings, perspective view. $\delta = 0.10$, $t = 0, 1, 2, 3, 4, 4.5$.
5.11 Vorticity isosurfaces of colliding vortex rings, front view. $\delta = 0.10$, $t = 0, 1, 2, 3, 4, 4.5$.
5.12 Vorticity isosurfaces of colliding vortex rings, side view. $\delta = 0.10$, $t = 0, 1, 2, 3, 4, 4.5$.
5.13 Vorticity isosurfaces of colliding vortex rings, top view. $\delta = 0.10$, $t = 0, 1, 2, 3, 4, 4.5$.
LIST OF TABLES
Table
4.1 Machine characteristics.
4.2 Maximum point position differences for circular sheet. $t = 1$, $\delta = 0.10$, $e(\Delta t) = \max_i |x_i(\Delta t) - x_i(\Delta t/2)|$.
LIST OF APPENDICES
Appendix
A. Notation
B. Cylindrical Coordinate Identities
C. Details from Circular Filament Analysis
    C.1 Propagation Speed of Circular Filament
    C.2 Linearized Evolution Equations for Perturbation
CHAPTER 1
INTRODUCTION
1.1 Overview
This thesis presents an algorithm for the rapid computation of three-dimensional vortex sheet motion. A vortex sheet is a material surface in the fluid across which the tangential component of fluid velocity has a jump discontinuity. Vortex sheets are frequently used as an asymptotic model for parallel shear flow. In our study of vortex sheets, the governing equations are taken in a Lagrangian form. When these equations are discretized, a large system of ordinary differential equations results. Referring to the discretization elements as particles, this system of equations is an $N$-body problem, a collection of $N$ particles with pairwise interactions. In an $N$-body problem, it is necessary to evaluate sums of the form

\[ \sum_{j=1}^{N} K_\delta(x_i, x_j) \times w_j, \qquad i = 1, \ldots, N, \tag{1.1} \]

where $x_i$, $x_j$ are particle positions, $w_j$ is a vector-valued weight associated with the $j$th particle, and $\delta$ is a smoothing parameter. Computing the sums in (1.1) directly, which is referred to as direct summation, requires $O(N^2)$ operations. In our simulations, $N$ takes on values up to $10^6$, so it is not practical to perform direct summation. The algorithm presented in this thesis evaluates the above sums to a specified tolerance with $O(N \log N)$ operations. It extends the work of Draghicescu and Draghicescu [21], who studied two-dimensional vortex sheet dynamics, to the three-dimensional case. There are three main ingredients for the efficiency of the algorithm: particle-cluster interactions, a tree-based nested subdivision of space to construct particle clusters, and adaptive strategies. In this thesis, the algorithm is used to study vortex ring dynamics with a vortex sheet model.

The layout of the thesis is as follows. Chapter 2 gives an overview of the fluid dynamics relevant to our work. Vortex sheets are discussed and an overview of vortex rings is presented. Chapter 3 presents the new algorithm. It is described in detail and is related to previous work. Chapter 4 presents a validation of the algorithm, analyzing its convergence, accuracy, and speed-up. Results are presented for the test case of axisymmetric vortex ring roll-up. Chapter 5 presents simulations for perturbed vortex rings, elliptical vortex rings, and the collision and reconnection of two vortex rings, a configuration based on experiments performed by Schatzle [55]. Chapter 6 gives a summary and discusses possible extensions to the work. Appendix A contains a table of the notation, Appendix B lists identities related to cylindrical basis vectors, and Appendix C presents details from the circular filament analysis which is performed in Section 2.3.2.
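The direct summation of (1.1) described above can be sketched in a few lines. This is only an illustration under stated assumptions: the kernel is taken to be the smoothed Biot-Savart kernel defined later in (2.21), the product in (1.1) is taken to be a cross product, and the function names `smoothed_kernel` and `direct_sum` are hypothetical, not taken from the thesis code.

```python
import numpy as np

def smoothed_kernel(x, y, delta):
    """Assumed kernel: -(x - y) / (4 pi (|x - y|^2 + delta^2)^(3/2))."""
    r = x - y
    return -r / (4.0 * np.pi * (np.dot(r, r) + delta**2) ** 1.5)

def direct_sum(positions, weights, delta):
    """Evaluate sum_j K_delta(x_i, x_j) x w_j for every i.

    The double loop visits all N^2 pairs, which is the O(N^2)
    cost of direct summation. The j == i term vanishes because
    the kernel numerator is zero there (delta > 0 is assumed).
    """
    n = len(positions)
    velocity = np.zeros((n, 3))
    for i in range(n):
        for j in range(n):
            k_ij = smoothed_kernel(positions[i], positions[j], delta)
            velocity[i] += np.cross(k_ij, weights[j])
    return velocity
```

With two particles one unit apart and weights along the z-axis, the two induced velocities are equal and opposite, which is a quick sanity check on the sign convention.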
1.2 Contributions of the Thesis
The thesis makes three main contributions. First, the algorithm generalizes previously developed particle simulation algorithms. The main differences between the kernel $K_\delta$ used here and the ones previously used are that $K_\delta$ is not harmonic and it is a function of three variables. Second, we introduce new forms of adaptivity into the tree-based subdivision of space. This ensures that the algorithm's execution time will be small compared to direct summation for a variety of particle distributions. Third, we apply the algorithm to a three-dimensional smoothed vortex sheet model to study the dynamics of vortex rings. We show that the model allows vorticity isosurfaces to reconnect, even though the material surfaces do not.
CHAPTER 2
FLUID DYNAMICS
In this chapter, we present an overview of the fluid dynamics relevant to our work. In Section 2.1 we introduce the basic equations of fluid motion. Section 2.2 contains a discussion of vortex sheets, including their applications, how we parametrize them, the behavior they exhibit, and our numerical method for studying them. In Section 2.3 we introduce vortex rings as an application of vortex sheet roll-up, and describe some issues that we are interested in studying, such as stability and interactions.
2.1 Governing Equations
The motion of incompressible homogeneous (i.e. constant density) fluid is governed by the Navier-Stokes equations

\[ u_t + (u \cdot \nabla) u = -\nabla p + \nu \Delta u, \tag{2.1} \]
\[ \nabla \cdot u = 0, \tag{2.2} \]

where $u(x, t)$ is the fluid velocity at position $x$ and time $t$, $p(x, t)$ is the fluid pressure and $\nu$ is the viscosity. Equation (2.1) is the momentum equation, a statement of Newton's second law that mass times acceleration is equal to force. Equation (2.2) is the continuity equation, representing conservation of mass and incompressibility.

As described by Batchelor [6], in many flows the effect of viscosity is significant only in a small region of the fluid, for example in boundary layers or thin shear layers. Away from these regions, the fluid behaves as if it were inviscid. Furthermore, as demonstrated experimentally by Brown and Roshko for a turbulent mixing layer [9], the large scale features of the flow do not change for large Reynolds numbers, which may be considered the inverse of viscosity for our present purposes. So to understand the dynamics in these portions of the flow, it is useful to study the inviscid limit $\nu \to 0$ of the Navier-Stokes equations. This yields the Euler equations

\[ u_t + (u \cdot \nabla) u = -\nabla p, \tag{2.3} \]
\[ \nabla \cdot u = 0. \tag{2.4} \]
In this thesis, we are considering vortex sheets, a particular type of weak solution to the Euler equations. As mentioned in Chapter 1, a vortex sheet is a surface in the fluid across which the tangential component of fluid velocity has a jump discontinuity. When analyzing weak solutions of differential equations, one difficulty that may arise is a lack of uniqueness of solutions. Thus, one must choose from among the possible solutions the one that is physically significant. We view the vortex sheet as the zero viscosity limit of smooth solutions to the Navier-Stokes equations. Delort [17] proved that the two-dimensional Euler equations with vortex sheet initial data possess global weak solutions if the vorticity is of one sign. Majda [41] extended the proof to show that in the inviscid limit, solutions of the Navier-Stokes equations, with vortex sheet initial data having vorticity of one sign, converge to weak solutions of the Euler equations. It is not known if these results extend to more general vortex sheet configurations, much less to three dimensions. Uniqueness of solutions is also not known. Discussions of these and other analytical aspects of vortex sheets are given by Majda [40] and Caflisch [10]. In the next section, we describe vortex sheets in more detail.

The above forms of the Navier-Stokes and Euler equations, in terms of velocity and pressure, are known as primitive variable formulations. An alternative form is in terms of the vorticity, $\omega = \nabla \times u$, which measures rotation within the fluid. Taking the curl of the Euler equation (2.3), we obtain

\[ \omega_t + (u \cdot \nabla) \omega = (\omega \cdot \nabla) u. \tag{2.5} \]
One advantage of this form is that the pressure has been removed. To close the system of equations, the velocity is recovered from the vorticity via the Biot-Savart integral

\[ u(x, t) = -\frac{1}{4\pi} \int_{\mathbb{R}^3} \frac{(x - y) \times \omega(y, t)}{|x - y|^3} \, dy. \tag{2.6} \]

Equation (2.5) describes how the vorticity evolves in time and can be used together with (2.6) to form a numerical method to solve the Euler equations. Another evolution equation for the vorticity can be obtained in terms of the flow map $\Phi(x, t)$, which denotes the position of the fluid particle at time $t$ that was initially at position $x$ at time $t = 0$. The equations defining $\Phi$ are

\[ \frac{\partial \Phi}{\partial t}(x, t) = u(\Phi(x, t), t), \tag{2.7a} \]
\[ \Phi(x, 0) = x. \tag{2.7b} \]

The evolution equation for $\omega$ in terms of $\Phi$ is

\[ \omega(\Phi(x, t), t) = \nabla \Phi(x, t) \, \omega(x, 0). \tag{2.8} \]

Equations (2.6), (2.7) and (2.8) form a closed system which is the basis of the numerical method used in this thesis to study the Euler equations. In the remainder of this chapter we discuss vortex sheets and then present an overview of vortex rings.
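To make the flow map equations (2.7) concrete, the following minimal sketch integrates a single particle trajectory. It assumes classical fourth-order Runge-Kutta stepping (the time integrator named in Section 2.2.3) and replaces the Biot-Savart velocity with a hypothetical rigid-rotation test field, so it illustrates only the Lagrangian update and not the thesis code. The names `velocity` and `advance_flow_map` are placeholders.

```python
import numpy as np

def velocity(x, t):
    """Hypothetical test field: rigid rotation about the z-axis, u = (-y, x, 0)."""
    return np.array([-x[1], x[0], 0.0])

def advance_flow_map(x0, t_end, n_steps):
    """Integrate dPhi/dt = u(Phi, t), Phi(x0, 0) = x0, with classical RK4."""
    x = np.array(x0, dtype=float)
    dt = t_end / n_steps
    t = 0.0
    for _ in range(n_steps):
        k1 = velocity(x, t)
        k2 = velocity(x + 0.5 * dt * k1, t + 0.5 * dt)
        k3 = velocity(x + 0.5 * dt * k2, t + 0.5 * dt)
        k4 = velocity(x + dt * k3, t + dt)
        x = x + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        t += dt
    return x
```

For this test field the exact flow map is a rotation by angle $t$, so a particle started at $(1, 0, 0)$ should return close to its starting point at $t = 2\pi$.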
Figure 2.1: A vortex sheet modeling parallel shear flow.

2.2 Vortex Sheets
As mentioned above, a vortex sheet is a surface in the fluid across which the tangential component of the fluid velocity has a jump discontinuity. Away from the surface, the fluid is assumed to be irrotational, which means that the vorticity is zero. However, since the velocity has a jump discontinuity across the surface, the vorticity is a $\delta$-function there. One common application of vortex sheets is as a model for parallel shear flow in which the transition region between two streams of fluid is thin, as depicted in Figure 2.1. In this situation, the sheet evolves according to the velocity given by the Biot-Savart integral (2.6) and the sheet is called a free vortex sheet. Another application, described by Lamb [36], is to model the movement of a solid body through irrotational inviscid fluid by placing a vortex sheet on the body's boundary. In this case, the sheet is called a bound vortex sheet, since it is bound to the body's surface. Our application of vortex sheets, described in the next section, is a model of the formation process of a vortex ring. A method of generating a vortex ring is to place a solid circular disk in a fluid, give it an impulse along its axis and then dissolve the disk away. This process can be modeled by considering a bound vortex sheet on the solid disk. When the disk is dissolved away, a free vortex sheet remains in the fluid and rolls up into a vortex ring. It is this free sheet that is represented in our computations.
To compute the induced velocity of a vortex sheet, it is necessary to consider the Biot-Savart integral (2.6) in the case where the vorticity is a $\delta$-function on a surface. Before proceeding, we introduce some additional notation. Away from the vortex sheet, the fluid is irrotational, so a velocity potential $\phi$ exists. Thus, for $x$ not on the sheet, $u(x, t) = \nabla \phi(x, t)$. The limit of the fluid velocity exists as the sheet is approached from either side. Choosing an orientation for the sheet, let $u^+$ and $u^-$ denote the one-sided limits of $u$. Similarly, let $\phi^+$ and $\phi^-$ denote the one-sided limits of $\phi$. The jumps in $u$ and $\phi$ across the sheet are denoted $[u] = u^+ - u^-$ and $[\phi] = \phi^+ - \phi^-$ respectively. The jump in velocity is tangential to the sheet, so we have $n \cdot [u] = 0$, where $n$ denotes a unit vector normal to the sheet. One can show that the curl of a velocity field which has a tangential jump discontinuity across a surface and is otherwise irrotational is a surface $\delta$-function with vector-valued strength

\[ \omega = n \times [u] = n \times [\nabla \phi]. \tag{2.9} \]
Although it is a slight abuse of notation and terminology, we will refer to this vector-valued strength as the vorticity itself. A consequence of this relationship is that the vorticity is parallel to the surface and perpendicular to $[\nabla \phi]$, a result we will use later. The Biot-Savart integral is interpreted with this singular vorticity, leading to the following surface integral for the induced velocity:

\[ u(x, t) = \int_S K(x, y) \times \omega(y, t) \, dS_y, \tag{2.10} \]

where $S$ is the sheet, $x$ is a point not on the sheet,

\[ K(x, y) = -\frac{1}{4\pi} \frac{x - y}{|x - y|^3} \tag{2.11} \]

is the Biot-Savart kernel, $\omega(y, t)$ is given by (2.9) and $dS_y$ is the area element of $S$ at $y$. For $x$ on the sheet, the integral in (2.10) is interpreted as a principal value integral, because it diverges otherwise.
In the next subsection, we describe the Lagrangian parametrization of the vortex sheet which is the basis of our numerical method. Then we discuss the singular behavior that vortex sheets exhibit, which leads us to desingularize their motion in order to obtain a tractable model. Finally, the discretization of the equations is described.

2.2.1 Parametrization
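For computations, it is advantageous to use a Lagrangian parametrization of the vortex sheet. We do this by representing the sheet as a collection of vortex lines, parametrizing across them with circulation. The Lagrangian parametrization was presented by Caflisch [10] and Kaneda [28]. The sheet's position is denoted $y(\alpha_1, \alpha_2, t)$, where $\alpha_1$ and $\alpha_2$ are Lagrangian parameters. The induced velocity field at a point $x$ on the sheet is

\[ u(x, t) = \mathrm{PV} \iint K(x, y(\alpha_1, \alpha_2, t)) \times \omega(\alpha_1, \alpha_2, t) \left| \frac{\partial y}{\partial \alpha_1} \times \frac{\partial y}{\partial \alpha_2} \right| d\alpha_1 \, d\alpha_2, \tag{2.12} \]

where PV denotes the principal value integral. Caflisch [10] and Kaneda [28] showed that the jump $[\phi(y(\alpha_1, \alpha_2, t))]$ is independent of time. The demonstration was based on the fact that the fluid pressure is continuous across the vortex sheet, which follows from conservation of momentum. Thus, we may write

\[ J(\alpha_1, \alpha_2) = [\phi(y(\alpha_1, \alpha_2, t))]. \tag{2.13} \]

Then, using (2.9) and some algebraic manipulations, they derived the identity

\[ \omega(\alpha_1, \alpha_2, t) \left| \frac{\partial y}{\partial \alpha_1} \times \frac{\partial y}{\partial \alpha_2} \right| = \frac{\partial J}{\partial \alpha_1} \frac{\partial y}{\partial \alpha_2} - \frac{\partial J}{\partial \alpha_2} \frac{\partial y}{\partial \alpha_1}. \tag{2.14} \]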
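One way to fill in the algebra behind (2.14) is sketched below. It is a reconstruction under stated assumptions, not necessarily the route taken in the cited papers: the Lagrangian parameters are written $\alpha_1$, $\alpha_2$, the unit normal is taken as $n = (y_{\alpha_1} \times y_{\alpha_2}) / |y_{\alpha_1} \times y_{\alpha_2}|$ with $y_{\alpha_i} = \partial y / \partial \alpha_i$, and the chain rule applied on each side of the sheet gives $\partial J / \partial \alpha_i = [\nabla \phi] \cdot y_{\alpha_i}$.

```latex
\begin{align*}
\omega \, |y_{\alpha_1} \times y_{\alpha_2}|
  &= (y_{\alpha_1} \times y_{\alpha_2}) \times [\nabla \phi]
     && \text{by (2.9) and the choice of } n \\
  &= y_{\alpha_2} \, \bigl( y_{\alpha_1} \cdot [\nabla \phi] \bigr)
   - y_{\alpha_1} \, \bigl( y_{\alpha_2} \cdot [\nabla \phi] \bigr)
     && \text{using } (a \times b) \times c = b \, (a \cdot c) - a \, (b \cdot c) \\
  &= \frac{\partial J}{\partial \alpha_1} \frac{\partial y}{\partial \alpha_2}
   - \frac{\partial J}{\partial \alpha_2} \frac{\partial y}{\partial \alpha_1}.
\end{align*}
```

The last line uses only the chain-rule relation stated above, so the identity holds for any smooth parametrization of the sheet, before the specific choice of parameters is made.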
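The specific choice of $\alpha_1$ and $\alpha_2$ is made to simplify the right-hand side of this equation. We choose $\alpha_1$ to be the circulation between a fixed reference point on the sheet and other points on the sheet and $\alpha_2$ to be a parameter along curves of constant circulation, as shown in Figure 2.2.

Figure 2.2: Vortex lines and circulation. $\alpha_1$, $\alpha_2$: Lagrangian parameters, $y_0$: reference point, $y$: point on surface, $C$: curve for circulation integral.

We describe $\alpha_2$ first, in terms of vorticity. Vortex lines are integral curves of the vorticity. Geometrically, they are curves which are parallel to the vorticity field $\omega$. We choose $\alpha_2$ so that at time $t = 0$, $\alpha_2$ is a parameter along vortex lines, ensuring that $\partial y / \partial \alpha_2$ is parallel to $\omega(\alpha_1, \alpha_2)$. It follows that

\begin{align}
\frac{\partial J}{\partial \alpha_2} &= \frac{\partial}{\partial \alpha_2} \left( \phi^+(y, 0) - \phi^-(y, 0) \right) \tag{2.15} \\
&= \nabla \phi^+(y, 0) \cdot \frac{\partial y}{\partial \alpha_2} - \nabla \phi^-(y, 0) \cdot \frac{\partial y}{\partial \alpha_2} \tag{2.16} \\
&= [\nabla \phi(y, 0)] \cdot \frac{\partial y}{\partial \alpha_2} = 0, \tag{2.17}
\end{align}

where the last equality is due to the fact that $\partial y / \partial \alpha_2$ is parallel to $\omega$ and $[\nabla \phi]$ is perpendicular to $\omega$, as seen from (2.9). For such a choice of $\alpha_2$, the Biot-Savart integral (2.12) reduces to

\[ u(x, t) = \mathrm{PV} \iint K(x, y(\alpha_1, \alpha_2, t)) \times \frac{\partial y}{\partial \alpha_2}(\alpha_1, \alpha_2, t) \, \frac{\partial J}{\partial \alpha_1} \, d\alpha_1 \, d\alpha_2. \tag{2.18} \]

For our vortex ring application, the vortex lines are closed curves. We choose $\alpha_2$ to range from $0$ to $2\pi$, so the vortex lines are $2\pi$-periodic functions of $\alpha_2$. In the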
computations, $\alpha_2$ is chosen at $t = 0$ to be a linear rescaling of arclength. Note that this linear relationship does not hold for $t > 0$, because the vortex lines stretch non-uniformly as the sheet evolves. As mentioned above, $\alpha_1$ is chosen to be the circulation between a fixed reference point on the sheet and other points on the sheet. We fix a material point $y_0$ on the sheet. For any point $y$ on the sheet, $\alpha_1(y)$ is the circulation

\[ \alpha_1(y) = \oint_C u \cdot ds, \tag{2.19} \]

where $C$ is a closed curve meeting the sheet at $y_0$ and $y$, and $ds$ is a line element of arclength, as shown in Figure 2.2. Kelvin's circulation theorem states that the circulation around a set of vortex lines moving with the flow does not change in time, ensuring that $\alpha_1$ is a Lagrangian parameter. It follows from the definition that $\alpha_1 = J + c$, where $c$ is a constant which depends only on the reference point $y_0$. Thus, $\partial J / \partial \alpha_1 = 1$ and the Biot-Savart integral (2.18) reduces to

\[ u(x, t) = \mathrm{PV} \iint K(x, y(\alpha_1, \alpha_2, t)) \times \frac{\partial y}{\partial \alpha_2}(\alpha_1, \alpha_2, t) \, d\alpha_1 \, d\alpha_2. \tag{2.20} \]
This parametrization and the resulting form of the Biot-Savart integral is a generalization to three dimensions of the Birkhoff-Rott equation for the motion of a vortex sheet in two dimensions [8]. The circulation distribution for a vortex sheet depends on the initial condition of the specific problem being studied. We will describe it later when we discuss the application to vortex rings.

2.2.2 Desingularization
Vortex sheets exhibit behavior that smooth shear layers do not. For example, vortex sheet instabilities have arbitrarily large growth rates, the sheets form curvature singularities [46], and they roll up into infinite spirals [48]. These features make the study of vortex sheets difficult both theoretically and numerically.

As an example of the numerical difficulties, consider the motion of a flat vortex sheet. When a small amplitude perturbation is introduced to the sheet, it is amplified at a rate proportional to the spatial wavenumber of the perturbation. This is known as Kelvin-Helmholtz instability. In a numerical simulation, round-off error introduces a perturbation to the sheet whose wavenumber is inversely proportional to the spacing of the points representing the sheet. Thus, when the computational mesh is refined, the wavenumber of the round-off error perturbation increases, the perturbation is amplified more rapidly and the computations become inaccurate. One technique to overcome this, introduced by Krasny [34], is to filter the sheet's position at each time step. With this technique, it is possible to extend computations to longer times. However, the sheet still develops singularities in finite time. After the singularity forms, it is not possible to use the filter and round-off error grows, overwhelming the computations. Another technique, first proposed by Chorin and Bernard [15], is to desingularize the Biot-Savart kernel $K(x, y)$. We follow this approach, using a desingularization analogous to the one used by Krasny [33] for two-dimensional vortex sheet roll-up. Our smoothed three-dimensional Biot-Savart kernel is

\[ K_\delta(x, y) = -\frac{1}{4\pi} \frac{x - y}{(|x - y|^2 + \delta^2)^{3/2}}, \tag{2.21} \]

where $\delta > 0$ is the smoothing parameter. We replace the singular Biot-Savart integral (2.20) with

\[ u(x, t) = \iint K_\delta(x, y(\alpha_1, \alpha_2, t)) \times \frac{\partial y}{\partial \alpha_2}(\alpha_1, \alpha_2, t) \, d\alpha_1 \, d\alpha_2. \tag{2.22} \]
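The effect of the smoothing in (2.21) can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the closed-form bound $2 / (3\sqrt{3} \cdot 4\pi \delta^2)$ is not quoted from the thesis but follows from maximizing $r / (r^2 + \delta^2)^{3/2}$ over $r \ge 0$, which gives $r = \delta / \sqrt{2}$.

```python
import numpy as np

def kernel_singular(x, y):
    """Singular Biot-Savart kernel (2.11); undefined at x == y."""
    r = x - y
    return -r / (4.0 * np.pi * np.dot(r, r) ** 1.5)

def kernel_smoothed(x, y, delta):
    """Smoothed kernel (2.21); finite everywhere when delta > 0."""
    r = x - y
    return -r / (4.0 * np.pi * (np.dot(r, r) + delta**2) ** 1.5)

def smoothed_kernel_bound(delta):
    """Maximum of |K_delta| over all separations, attained at |x - y| = delta / sqrt(2)."""
    return 2.0 / (3.0 * np.sqrt(3.0) * 4.0 * np.pi * delta**2)

if __name__ == "__main__":
    x = np.array([0.3, -0.2, 0.5])
    y = np.array([0.1, 0.4, -0.1])
    for delta in (0.1, 0.01, 0.001):
        diff = np.linalg.norm(kernel_smoothed(x, y, delta) - kernel_singular(x, y))
        print(f"delta = {delta:g}: |K_delta - K| = {diff:.3e}")
```

At a fixed separation the difference shrinks as the smoothing parameter decreases, while at zero separation the smoothed kernel is exactly zero rather than singular.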
This kernel was first introduced by Rosenhead [51] in the study of vortex dynamics in the wake behind a cylinder. It is related to the Plummer potential which is used in astrophysics to model the distribution of matter in a galaxy. Note that as $\delta \to 0$, $K_\delta \to K$. The introduction of $\delta$ smooths the kernel and removes its singularity at the origin. This makes it unnecessary to treat the integral in (2.22) as a principal value integral. A consequence of the smoothing is that the kernel is no longer harmonic, which is one of the main reasons for developing the new fast computational method to be described in Chapter 3. Note that it is not possible to desingularize the kernel in such a way that the result is bounded and harmonic, which follows from the maximum principle.

The strategy for computing vortex sheet roll-up is to solve the smoothed equation for fixed $\delta > 0$ and to investigate the behavior of these solutions as $\delta \to 0$. This is analogous to finding weak solutions of the Euler equations by taking the zero viscosity limit of smooth solutions of the Navier-Stokes equations. This analogy is supported by the work of Tryggvason, Dahm and Sbieh [59] who performed computations for the $\delta \to 0$ limit of a two-dimensional vortex sheet and the $\nu \to 0$ limit of a corresponding Navier-Stokes computation. They found that the large scale features of a $\delta > 0$ computation agree well with the features of a $\nu > 0$ computation, and that the $\delta \to 0$ limit and the $\nu \to 0$ limit coincide. Liu and Xin [39] have shown that the $\delta \to 0$ limit of solutions to the two-dimensional vortex-blob equations is a weak solution of the Euler equations when the vorticity is of one sign. It is not known if such results hold in three dimensions.

2.2.3 Discretization
In this subsection, we describe how the sheet's position $y(\alpha_1, \alpha_2, t)$ and velocity (2.22) are discretized. The assumptions made about the parametrization are that $\alpha_1$ measures circulation across the vortex lines and $0 \le \alpha_2 \le 2\pi$ parametrizes along the vortex lines. We discretize the parameter space as shown in Figure 2.3.

Figure 2.3: Discretization of parameter space and a circular disk. $\alpha_1$, $\alpha_2$: Lagrangian parameters. $\alpha_1$ is a radial parameter and $\alpha_2$ is a parameter around the disk.

We first discretize $\alpha_1$ with a uniform grid. Each $\alpha_1$ value corresponds to a vortex line which is then discretized in $\alpha_2$ with a grid that is uniform with respect to arc-length in physical space (at $t = 0$). Note that there are more points on longer vortex lines, which leads to $\alpha_1$ and $\alpha_2$ being treated asymmetrically. This is done to ensure spatial resolution and accuracy of partial derivative computations along the vortex lines. With these points $x_i(t)$, we discretize the Biot-Savart integral (2.22) first in $\alpha_1$ with the trapezoid rule and then in $\alpha_2$, also with the trapezoid rule. The $\partial y / \partial \alpha_2$ term in the integrand is approximated with a 2nd order centered difference. This results in a system of ordinary differential equations

\[ \frac{dx_i}{dt} = \sum_{j=1}^{N} K_\delta(x_i, x_j) \times w_j, \tag{2.23} \]

where $x_i(t)$, $x_j(t)$ are points on the sheet and

\[ w_j = D_2(x_j) \, \Delta\alpha_1 \, \Delta\alpha_2 \tag{2.24} \]
particles for cubic interpolant new particle
Figure 2.4: Particle insertion along a vortex line. given data (), new particle (). is the product of the nite dierence D2 along a vortex line and the integration weights 1 and 2 . The integration weights are adjusted appropriately at the boundaries for the trapezoid rules. From here on, we refer to the xj as particles. The system of dierential equations (2.23) is solved with a 4th order Runge-Kutta method. Computing the right-hand side of (2.23) by direct summation requires O(N 2 ) operations, where N is the number of particles discretizing the vortex sheet. In Chapter 3, we present an algorithm which computes the sums in (2.23) more rapidly, to within a specied tolerance. As the sheet evolves, the vortex lines can individually stretch and can also separate from each other. This causes a loss of resolution which is overcome by inserting new particles along the lines and by inserting new lines. The rst case corresponds to rening in 2 for xed 1 and the second case corresponds to rening in 1 globally in 2 . The procedure for inserting a new point along a vortex line is depicted in Figure 2.4. The 2 coordinate of the new particle is set to be the average of the separated particles coordinates. The position of the new particle is computed with a cubic polynomial in 2 which interpolates the positions of the four particles surrounding the new particle, two on each side. The procedure for adding a new vortex line when adjacent lines become separated
" !
Figure 2.5: Vortex line insertion. given data (), new particle (). is analogous, and is depicted in Figure 2.5. The 1 coordinate of the new line is the average of the separated lines coordinates. The particle positions on the new line are generated as follows. The rst step is to select the 2 values where the particles will be placed. We do this by simply choosing the 2 values of an adjacent line. The reasoning behind this is that once renement along vortex lines takes place, the 2 values on the adjacent lines yield good spatial resolution along the new line. To compute a particle position for each of these 2 values, we rst generate a corresponding 2 particle position on each of the surrounding four vortex lines, two on each side. If one of these lines does not have a particle at that 2 value, then one is generated with a cubic interpolant as described above for particle insertion along a line. The 2 particle position on the new vortex line is then computed by interpolating these four particle positions with a cubic polynomial in 1 . In our computations we make a change of variable 1 = 1 (). This will ensure accuracy by placing more vortex lines in regions where the circulation is varying rapidly. Since 1 is a Lagrangian parameter, is one as well. With this change of
$ #
& %
vorticity velocity
propagation
Figure 2.6: Propagating vortex ring. variable, the smoothed velocity induced by the sheet is u(x, t) = K (x, y(, 2 , t)) y (, 2 , t) 1 () d d2 . 2 (2.25)
In the computations, $\alpha_1'(\beta)$ is computed analytically at $t = 0$. When new vortex lines are inserted, the values of $\alpha_1'(\beta)$ are obtained using a cubic interpolant of the values of $\alpha_1'(\beta)$ at the surrounding lines.
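The particle insertion step described in this section can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions, not the code used for the computations in this thesis: the function names `cubic_interpolate` and `insert_particle` are hypothetical, the cubic is evaluated in Lagrange form, and only interior segments of a line are handled (a closed vortex line would need its indices wrapped periodically).

```python
def cubic_interpolate(s, pts, s_new):
    """Evaluate at s_new the cubic through four (parameter, position) pairs.

    s   : four distinct parameter values along the vortex line
    pts : four 3D positions corresponding to s
    """
    result = [0.0, 0.0, 0.0]
    for i in range(4):
        # Lagrange basis polynomial for node i, evaluated at s_new.
        weight = 1.0
        for j in range(4):
            if j != i:
                weight *= (s_new - s[j]) / (s[i] - s[j])
        for c in range(3):
            result[c] += weight * pts[i][c]
    return result


def insert_particle(params, positions, i):
    """Insert a particle between indices i and i+1 of one vortex line.

    The new parameter value is the average of the two neighbours, and the
    position interpolates the four surrounding particles, two on each side.
    Assumes 1 <= i <= len(params) - 3 (interior segment, hypothetical layout).
    """
    s_new = 0.5 * (params[i] + params[i + 1])
    stencil = range(i - 1, i + 3)
    p_new = cubic_interpolate([params[m] for m in stencil],
                              [positions[m] for m in stencil], s_new)
    return s_new, p_new
```

Since a cubic interpolant reproduces cubic polynomials, a line whose coordinates are polynomials of degree at most three in the parameter is refined without interpolation error, which gives a simple check of the sketch.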
2.3 Vortex Rings
A vortex ring is a flow in which vorticity is concentrated and directed around a torus, as depicted in Figure 2.6. The vorticity distribution causes the fluid to rotate around the torus and the ring propagates. In this thesis, we use a desingularized vortex sheet model to investigate vortex ring dynamics. In particular, we are interested in the formation process, stability properties, and interactions between rings. We review each of these topics in the following subsections.
2.3.1 Formation
There are various methods of creating a vortex ring, each having advantages and disadvantages for the experimentalist or numerical analyst. One technique, described by Thomson and Newall [58], is to release a drop of colored liquid into a container of water. As the drop falls through the water, it rolls up around the edges and forms a descending vortex ring. This experiment can be performed with a simple apparatus, but it is difficult to simulate numerically due to the collision of the fluid boundaries and the subsequent change in topology.

Another method commonly used is to eject fluid from a circular nozzle. A shear layer separates at the opening and rolls up into a vortex ring. As described in Shariff and Leonard's review [56], this process can be modeled using slug flow or self-similar vortex sheet roll-up. Another model, presented by Nitsche and Krasny [47], involves the roll-up of an axisymmetric vortex sheet which is not assumed to be self-similar. In their numerical computations, the sheet was desingularized in a manner similar to the method described above and they modeled the shedding of circulation at the edge of the nozzle. Their results agreed well with experiments performed by Didden [20]. The vortex sheet model used in this thesis is an extension of their work to fully three-dimensional flow.

A simple model for vortex ring formation, described by Taylor [57], is to supply an impulse to a flat circular disk along its axis of symmetry and to then dissolve the disk. When the disk is given an impulse, the velocity field in the fluid is induced by a bound vortex sheet on the surface of the disk. When the disk is dissolved, the sheet remains in the fluid and rolls up into a vortex ring. This method is more of a thought exercise and is not practical for experiments, but it is the one on which our computations are based. One reason for selecting this flow for our computations
is the absence of solid boundaries. The circulation distribution on the initially flat sheet is given by

$$\alpha_1 = \sqrt{1 - r^2}, \tag{2.26}$$

where $r$ is the distance from the center of the disk. The velocity field induced by this circulation distribution is balanced so that the disk propagates. Note that in Cartesian coordinates, the circulation has a square-root singularity at the boundary of the disk. This implies that the jump in velocity becomes infinite at the edge. When the smoothing effect of $\delta$ is introduced, the singularity is removed and the balance in velocity is lost, resulting in the disk rolling up into a ring. This effect occurs in physical flow, although the smoothing is due to viscosity. In terms of the vortex sheet parametrization, the disk is given by

$$y(\alpha_1,\alpha_2) = \left(\sqrt{1-\alpha_1^2}\,\cos\alpha_2,\ \sqrt{1-\alpha_1^2}\,\sin\alpha_2,\ 0\right), \tag{2.27}$$

where $0 \le \alpha_1 \le 1$ and $0 \le \alpha_2 \le 2\pi$. For our reparametrization, we use $\alpha_1 = \cos\beta$, which yields

$$y(\beta,\alpha_2) = (\sin\beta\cos\alpha_2,\ \sin\beta\sin\alpha_2,\ 0), \tag{2.28}$$

where $0 \le \beta \le \pi/2$ and $0 \le \alpha_2 \le 2\pi$. The $\alpha_1'(\beta)$ term which arises in (2.25) is given by $\alpha_1'(\beta) = -\sin\beta$.

2.3.2 Stability
Since our model for vortex rings consists of a collection of circular vortex lines, we first consider the stability of a single circular vortex line, referred to as a vortex filament. Let $y(\theta,t)$ denote the position of a vortex filament in three dimensions, where $\theta$ is a Lagrangian parameter along the filament, $0 \le \theta \le 2\pi$ and $y(0,t) = y(2\pi,t)$. The filament evolves according to the equation

$$\frac{\partial y}{\partial t}(\theta,t) = \frac{1}{4\pi}\int_0^{2\pi} K_\delta\bigl(y(\theta,t), y(\tilde\theta,t)\bigr) \times \frac{\partial y}{\partial\tilde\theta}(\tilde\theta,t)\, d\tilde\theta, \tag{2.29}$$
where $K_\delta$ is given in (2.21). A propagating circular filament is a steady solution of (2.29) and we are interested in its stability properties. It is convenient to perform the analysis in cylindrical coordinates, so let $(r,\theta,z)$ be cylindrical coordinates, as shown in Figure 2.7. Also shown in the figure are the basis vectors $e_r(\theta)$, $e_\theta(\theta)$, and $e_z$ associated with the point $(r,\theta,z)$. Identities pertaining to this basis are listed in Appendix B. Suppose that the filament at time $t = 0$ is given by

$$y(\theta,0) = (R,\theta,0). \tag{2.30}$$

Substituting into (2.29), it can be shown that the filament propagates with velocity

$$U = \frac{e_z}{4\pi R}\int_0^{2\pi} \frac{1-\cos\theta}{\bigl(2(1-\cos\theta)+(\delta/R)^2\bigr)^{3/2}}\, d\theta. \tag{2.31}$$
The derivation is presented in Appendix C. It is worthwhile to point out the effect of the smoothing parameter $\delta$. If $\delta$ were equal to zero, the filament velocity would be

$$U = \frac{e_z}{8\pi R}\int_0^{2\pi} \frac{1}{\bigl(2(1-\cos\theta)\bigr)^{1/2}}\, d\theta, \tag{2.32}$$
which is a divergent integral, since $(2(1-\cos\theta))^{1/2} = 2|\sin(\theta/2)|$ vanishes linearly at both endpoints. The integrand is positive, so considering the integral as a principal value integral will not result in a finite value. The interpretation of this equation is that a circular vortex line propagates with infinite velocity. The problem is that in an actual fluid, even when the vorticity is concentrated into a small region, the vorticity distribution does not have line delta functions. The desingularization that we use is one approach to overcome this difficulty, and was first introduced by Rosenhead [51] in the study of vortex dynamics in the wake behind a cylinder.
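The dependence of the propagation speed on the smoothing parameter can be explored numerically. The following Python sketch approximates the integral in (2.31) with the periodic trapezoid rule, assuming the integrand is $(1-\cos\theta)/(2(1-\cos\theta)+(\delta/R)^2)^{3/2}$ with prefactor $1/(4\pi R)$. The function name `ring_speed` and the number of quadrature points are illustrative choices, not part of the thesis.

```python
import math


def ring_speed(R, delta, n=4000):
    """Approximate the speed of a smoothed circular filament of radius R.

    Uses the periodic trapezoid rule with n points on [0, 2*pi].
    Requires delta > 0 so that the integrand is bounded.
    """
    eps2 = (delta / R) ** 2
    h = 2.0 * math.pi / n
    total = 0.0
    for m in range(n):
        c = math.cos(m * h)
        total += (1.0 - c) / (2.0 * (1.0 - c) + eps2) ** 1.5
    return total * h / (4.0 * math.pi * R)


if __name__ == "__main__":
    # The speed grows without bound as delta decreases toward zero.
    for delta in (0.2, 0.1, 0.05, 0.025):
        print(delta, ring_speed(1.0, delta))
```

The integrand decreases pointwise as $\delta$ increases, so the computed speed increases as $\delta$ decreases, in keeping with the divergence of (2.32).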
Figure 2.7: Cylindrical coordinates and basis vectors.

Intuitively, the introduction of $\delta$ into the kernel spreads the vorticity associated with the vortex line over a region around the line with radius $\delta$. It is this effect that leads us to call the lines filaments. Another approach to overcoming this difficulty is to cut off the integral in a small neighborhood of the point $\theta = 0$, thereby removing the singularity. This technique was used by Crow [16] and Moore [45] in their study of the stability properties of the vortex pair trailing from an airplane wing.

We now analyze the linear stability of the propagating circular vortex filament

$$y(\theta,t) = (R,\theta,Ut), \tag{2.33}$$

where

$$U = \frac{1}{4\pi R}\int_0^{2\pi} \frac{1-\cos\theta}{\bigl(2(1-\cos\theta)+(\delta/R)^2\bigr)^{3/2}}\, d\theta. \tag{2.34}$$

We introduce a perturbation $p(\theta,t)$ to the solution $y(\theta,t)$, which we write in terms of the cylindrical basis at $(R,\theta,Ut)$,

$$p(\theta,t) = p_r(\theta,t)\,e_r(\theta) + p_\theta(\theta,t)\,e_\theta(\theta) + p_z(\theta,t)\,e_z. \tag{2.35}$$
We substitute $y(\theta,t) + p(\theta,t)$ into (2.29), and obtain a system of integro-differential equations for the scalars $p_r(\theta,t)$, $p_\theta(\theta,t)$, and $p_z(\theta,t)$. Then we linearize these equations about $p(\theta,t) = 0$, which is reasonable under the assumption that $|p(\theta,t)|$ is small. The resulting linearized equations (C.18) appear in Appendix C for reference. If the equations are written in the abstract form $\partial p/\partial t = L(p)$, then it can be shown that using the Fourier basis for $p(\theta)$ diagonalizes the operator $L$. So we may restrict attention to a single mode of the Fourier expansion for $p(\theta)$. Thus, we fix an integer $k$ and substitute the expression

$$p(\theta,t) = e^{ik\theta+\sigma t}\bigl(A_r e_r(\theta) + A_\theta e_\theta(\theta) + A_z e_z\bigr) \tag{2.36}$$

into (C.18) and after some simplifications obtain the system of linear equations

$$\begin{pmatrix} \sigma & 0 & -I_1 \\ 0 & \sigma & -iI_2 \\ -I_3 & 0 & \sigma \end{pmatrix} \begin{pmatrix} A_r \\ A_\theta \\ A_z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}, \tag{2.37}$$
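where the $I_j$ are the integrals

$$I_1 = \frac{1}{4\pi R^2}\int_0^{2\pi} \frac{\cos\theta\,(1-\cos k\theta) - k\sin\theta\sin k\theta}{\bigl(2(1-\cos\theta)+(\delta/R)^2\bigr)^{3/2}}\, d\theta, \tag{2.38}$$

$$I_2 = \frac{1}{4\pi R^2}\int_0^{2\pi} \frac{k(1-\cos\theta)\cos k\theta - \sin\theta\sin k\theta}{\bigl(2(1-\cos\theta)+(\delta/R)^2\bigr)^{3/2}}\, d\theta, \tag{2.39}$$

$$I_3 = \frac{1}{4\pi R^2}\int_0^{2\pi} \left[ \frac{k\sin\theta\sin k\theta + 2\cos k\theta - \cos\theta\,(1+\cos k\theta)}{\bigl(2(1-\cos\theta)+(\delta/R)^2\bigr)^{3/2}} - \frac{3(1-\cos\theta)^2(1+\cos k\theta)}{\bigl(2(1-\cos\theta)+(\delta/R)^2\bigr)^{5/2}} \right] d\theta. \tag{2.40}$$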
The unbalanced form of (2.37) is due to the fact that the integral which would have appeared in the (3, 2) entry of the matrix is zero. There are non-zero solutions of the form (2.36) only if the matrix in (2.37) is singular, that is, only if its determinant is zero,

$$0 = \sigma(\sigma^2 - I_1 I_3). \tag{2.41}$$

Thus, we have a solution only if

$$\sigma = 0 \quad \text{or} \quad \sigma^2 = I_1 I_3. \tag{2.42}$$
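The relation $\sigma^2 = I_1 I_3$ can be evaluated numerically. The sketch below is a hedged illustration in Python rather than the Maple computation used for the results in this section. It assumes the integrands of $I_1$ and $I_3$ read as $[\cos\theta(1-\cos k\theta) - k\sin\theta\sin k\theta]/Q^{3/2}$ and $[k\sin\theta\sin k\theta + 2\cos k\theta - \cos\theta(1+\cos k\theta)]/Q^{3/2} - 3(1-\cos\theta)^2(1+\cos k\theta)/Q^{5/2}$ with $Q = 2(1-\cos\theta)+(\delta/R)^2$. The function names and the quadrature rule are hypothetical choices.

```python
import math


def _trapezoid(f, n=4000):
    """Periodic trapezoid rule on [0, 2*pi]."""
    h = 2.0 * math.pi / n
    return h * sum(f(m * h) for m in range(n))


def integral_I1(k, R, delta):
    eps2 = (delta / R) ** 2

    def f(t):
        q = 2.0 * (1.0 - math.cos(t)) + eps2
        num = (math.cos(t) * (1.0 - math.cos(k * t))
               - k * math.sin(t) * math.sin(k * t))
        return num / q ** 1.5

    return _trapezoid(f) / (4.0 * math.pi * R ** 2)


def integral_I3(k, R, delta):
    eps2 = (delta / R) ** 2

    def f(t):
        q = 2.0 * (1.0 - math.cos(t)) + eps2
        num = (k * math.sin(t) * math.sin(k * t) + 2.0 * math.cos(k * t)
               - math.cos(t) * (1.0 + math.cos(k * t)))
        tail = 3.0 * (1.0 - math.cos(t)) ** 2 * (1.0 + math.cos(k * t))
        return num / q ** 1.5 - tail / q ** 2.5

    return _trapezoid(f) / (4.0 * math.pi * R ** 2)


def signed_growth_rate(k, R, delta):
    """Return sign(sigma^2) * |sigma| for wavenumber k."""
    s2 = integral_I1(k, R, delta) * integral_I3(k, R, delta)
    return math.copysign(math.sqrt(abs(s2)), s2)


if __name__ == "__main__":
    for k in range(0, 26):
        print(k, signed_growth_rate(k, 1.0, 0.12))
```

Under these assumptions $I_1$ vanishes identically for $k = 0$, and for $k = 1$ the integrand of $I_3$ is the exact derivative of a periodic function, so $\sigma^2 = 0$ for both modes.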
This relationship between $k$ and $\sigma$, for fixed $R$ and $\delta$, is called a dispersion relation. For each of these values of $\sigma$, there is a corresponding $(A_r, A_\theta, A_z)$ solution to (2.37). Note that the dispersion relation does not depend on $I_2$, though the solution $(A_r, A_\theta, A_z)$ does. The solution $y(\theta,t)$ is linearly stable or unstable with respect to the perturbation $p(\theta,t)$ according to whether the real part of $\sigma$ is negative or positive respectively, and if the real part of $\sigma$ is zero, then $y(\theta,t)$ is linearly neutrally stable with respect to the perturbation $p(\theta,t)$. From the definitions (2.38) and (2.40), we see that the product $I_1 I_3$ is real, so the stability of $y(\theta,t)$ depends upon the sign of $I_1 I_3$. If $I_1 I_3$ is negative, then $\sigma$ is purely imaginary and $y(\theta,t)$ is linearly neutrally stable. However, if $I_1 I_3$ is positive, then $\sigma = \pm (I_1 I_3)^{1/2}$ is real, so there exist solutions $p(\theta,t)$ which grow and solutions $p(\theta,t)$ which decay, and $y(\theta,t)$ is unstable in general.

An observation that can be made from the definitions of the integrals $I_1$ and $I_3$ is that for fixed $k$ and $\delta/R$, $\sigma$ depends linearly on $R^{-2}$. In particular, the sign of $\sigma^2$ will be independent of $R$. Thus, whether or not a filament is unstable with respect to a perturbation depends only on $k$ and $\delta/R$. Figure 2.8 contains a plot of $\mathrm{sign}(\sigma^2)|\sigma|$ as a function of $k$, for $R = 1$ and $\delta = 0.18, 0.15, 0.12, 0.09, 0.06$. The sign term multiplying $|\sigma|$ is chosen so that positive and negative values correspond to unstable and neutrally stable modes respectively. The
values of the integrals were computed numerically with Maple. For values of $k$ larger than those depicted, $\mathrm{sign}(\sigma^2)|\sigma|$ continues to decrease, leveling off at a value which depends upon $\delta$ and $R$.

Figure 2.8: Dispersion relation. $\mathrm{sign}(\sigma^2)|\sigma|$ vs. $k$. $R = 1$, $\delta = 0.18, 0.15, 0.12, 0.09, 0.06$. Going left to right, the peaks correspond to decreasing $\delta$.

For a given $R$ and $\delta$, $\sigma^2$ depends on $k$ in the following qualitative manner. For $k = 0$ and $1$, $\sigma^2 = 0$. As $k$ increases from 1, $\sigma^2$ first decreases and then increases, following a parabolic shaped curve. After reaching a local maximum, $\sigma^2$ then decreases, eventually leveling off. For some values of $\delta/R$, the value of $\sigma^2$ at the local maximum is positive, and for others it is not. Recall that the filament is unstable when the peak is positive and is neutrally stable otherwise. More extensive computations than those depicted in the figure do not reveal an obvious pattern for when the mode at this peak is unstable. Also, for some values, such as $\delta/R = 0.18$ and
0.15, there is more than one $k$ value for which $\sigma^2$ is positive. As $\delta$ decreases, the wavenumber where the peak is located increases, and more extensive computations suggest that the wavenumber grows like $O(\delta^{-1})$ as $\delta \to 0$.

The linear stability analysis assumes that the perturbation to the filament is small compared to $\delta$, the nominal size of the filament's core. However, the unstable modes, when they exist, have a wavenumber proportional to $\delta^{-1}$, which implies that these modes have spatial oscillations with wavelengths on the order of the core size. Thus, as these oscillations grow, they quickly leave the realm where the linear stability analysis is valid. So it is not clear how to interpret the results physically.

These results are qualitatively similar to those of Widnall and Sullivan's [62] study of vortex ring stability. They used a thin filament approximation as a model for the vortex ring and overcame the divergence of the Biot-Savart integral (2.29) by using an integral cut-off and an asymptotic matching procedure to choose the location of the cut-off. They found that for certain intervals of core sizes, there is a narrow band of modes which are unstable. Rings with core size between these intervals are neutrally stable. As the core size decreases, the band of unstable modes narrows. The wavenumber that the band is centered around grows like $a^{-1}$, where $a$ is the size of the core. They compared their theoretical predictions with experimental results and found fair agreement for the wavenumber of the unstable mode and good agreement for the amplification rate.

One obstacle to generalizing the vortex filament stability analysis to a vortex ring is that the core structure of the ring is not generally known. Thus, one needs to provide a model for the core structure. For instance, in their work mentioned above, Widnall and Sullivan [62] used a constant core radius model and a constant local volume model.
Widnall, Bliss, and Tsai [61] modeled the vorticity in the core both as being constant and as having a continuous quartic profile across the core, peaked at the center and zero at the boundary. These later models were better able to predict the wavenumber of the unstable mode than the model used in [62]. Saffman [52], using a vorticity distribution which includes viscous effects, was able to predict the unstable wavenumber found in the experiments of Krutzsch [35] and Maxworthy [42, 43].

Another model for the core vorticity distribution is a scaling of the third-order Gaussian $\exp(-r^3)$, which was used in simulations by Knio and Ghoniem [32]. In their study, they modeled the vortex ring as a collection of smooth vortex filaments, as we do. The principal differences between their model and ours are the initial placement of the filaments, the smooth kernel that is used, and the discretization. Their filaments are initialized to form a solid torus and the filament strengths are chosen to approximate the vorticity distribution. They smooth the Biot-Savart kernel by convolving it with a third-order Gaussian. Their computational results agree well with the analytical predictions of Widnall, Bliss and Tsai [61].

Another technique for generating a vorticity distribution in the ring's core is to numerically solve the differential equations for an exactly propagating ring. The technique was used by Lifschitz, Suters, and Beale [38] in their study of the stability of axisymmetric vortex rings with swirl. In their study, they compared growth rate predictions from short wavelength asymptotics with computations using a vortex filament model of the ring. In their computations, the Biot-Savart kernel was smoothed by convolving it with a sixth degree piecewise polynomial having compact support. Their computations agreed reasonably well with the analytical predictions, the computational growth rates being consistently 1/3 to 1/2 the predicted maximum growth rates.
Figure 2.9: Colliding vortex rings.

2.3.3 Interactions
The type of vortex ring interaction that we are interested in is the collision depicted in Figure 2.9, a configuration studied experimentally by Schatzle [55]. The resulting collision exhibits vortex ring merger and has been studied experimentally, theoretically and numerically. Near the collision, oppositely oriented vortex filaments collide and merge. Our interest in the ring configuration is to find out if the vortex sheet model for vortex rings can capture such complex dynamics, despite the various simplifying assumptions built into the model.

The regions of the rings which approach closely contain oppositely oriented vorticity. Saffman [53] proposed a model to describe the dynamics of vortex reconnection. As the rings meet, viscosity causes the opposite vorticity to cancel. This decrease in vorticity causes that region of each ring to stretch away from the point of contact, due to a local increase in pressure. Thus, the fluid is pushed away from the region of contact and it appears that the rings have connected. Saffman [53] modeled this process and the predictions for time scales and strain rates agreed reasonably well with
Schatzle's experiments [55].

Various researchers have studied the vortex reconnection problem numerically. Anderson and Greengard [1] used a Lagrangian method and discretized the rings as a collection of vortex filaments. They smoothed the Biot-Savart kernel by convolving it with a characteristic function and used a constant core vorticity model for the rings. They were able to compute the early stages of the ring merger and their results agree qualitatively with Schatzle's experiments. Numerical simulations performed by Aref and Zawadzki [4] and Winckelmans [63] reproduced the vortex ring collision and reconnection well. Using an Eulerian-Lagrangian vortex-in-cell code, they reproduced the ring merger and subsequent reconnection into two new rings, which begin to pinch off. Kida, Takaoka and Hussain [30, 31], using an Eulerian spectral method to study the vortex ring merger problem, were able to compute to later times in the sequence. However, in their computations, the rings remain connected after the reconnection, which conflicts with experimental observations of ring separation. This disparity was attributed to the fact that the experimental results are visualized with passive scalar transport, which is different from vorticity transport. The experiments do not necessarily show where the vorticity is large, since it may be amplified by the vortex stretching term in the Navier-Stokes equations.

One reason for interest in this ring configuration is related to singularity formation in solutions to the Euler equations. As the rings begin to collide, the oppositely oriented vortex filaments that approach each other begin to stretch. In an inviscid flow, this stretching intensifies the vorticity, which is a process that plays an important part in singularity formation.
For instance, interacting vortex tube computations by Pumir and Kerr [49] show significant distortion in the core of the colliding rings, and vortex filament computations by Pumir and Siggia [50] for other configurations indicate the possibility of singularity formation.
CHAPTER 3
FAST METHODS FOR PARTICLE SIMULATIONS
As mentioned previously, evaluating the sums in (2.23) by direct summation, a technique also referred to as the particle-particle (PP) method, requires $O(N^2)$ operations. For large values of $N$, the time required to perform these operations is prohibitive. Two approaches that have been developed in the past to overcome this difficulty are mesh and tree codes. They achieve their efficiency by computing approximations to the exact particle interactions. This is in contrast to algorithms such as the fast Fourier transform, which achieve efficiency by taking advantage of exact algebraic manipulations. Thus, performance is not the only issue to consider when examining mesh and tree codes, for the execution time typically depends on the desired accuracy. A brief description of mesh codes is given before proceeding to tree codes, the approach that this thesis follows.
3.1 Mesh Codes
Efficiency is gained in a mesh code by using the fact that elliptic equations can be solved rapidly on meshes. This is done either by using iterative methods such as successive overrelaxation, conjugate gradient, or multigrid, or direct methods based
on cyclic reduction or the fast Fourier transform. A comprehensive reference for this material is the book by Hockney and Eastwood [27]. Though mesh codes apply to more general settings, I will describe them as applied to problems in astrophysics. In this setting, the quantities in a simulation are star positions $x_i(t)$ and masses $m_i$. The acceleration of the $i$th star due to the gravitational influence of the other stars is given by

$$a_i = -G \sum_{j=1,\, j\ne i}^{N} m_j K(x_i, x_j), \tag{3.1}$$

where

$$K(x_i, x_j) = \frac{x_i - x_j}{|x_i - x_j|^3}. \tag{3.2}$$

Define the mass density $\rho$ and gravitational potential $\phi$ by

$$\rho(x) = \sum_{j=1}^{N} m_j\, \delta(x - x_j), \tag{3.3}$$

$$\phi(x) = -G \sum_{j=1}^{N} m_j\, \psi(x - x_j), \tag{3.4}$$

where $\psi(z) = |z|^{-1}$ and $\delta$ here denotes the Dirac delta function. Then from the identities

$$K(x_i, x_j) = -\nabla\psi(x_i - x_j), \tag{3.5}$$

$$\nabla^2 \psi(z) = -4\pi\,\delta(z), \tag{3.6}$$

it follows that

$$\nabla^2 \phi(x) = 4\pi G \rho(x), \tag{3.7}$$

$$a_i = -\nabla\phi(x_i). \tag{3.8}$$
The particle-mesh (PM) method superimposes a fixed mesh over the particles and uses the auxiliary functions $\rho$ and $\phi$ to compute $a_i$ as follows:
1. Assign a mass function $\rho$ to the mesh from the $x_j$ and $m_j$.
2. Solve a discretized form of the Poisson equation (3.7) on the mesh.
3. Use (3.8) to compute accelerations on the mesh.
4. Interpolate accelerations from the mesh to the star positions $x_j$.

There are various techniques for implementing each step mentioned above. The main drawback of this method is that the accuracy is determined by the mesh size. When the grid is refined to improve the accuracy, the execution time increases. An alternative to the PM method is the particle-particle/particle-mesh (P$^3$M) method, which combines the PP and PM methods. Interactions between nearby particles are computed with the PP method, and the rest of the interactions are computed with the PM method. So the functions being approximated with the mesh are smoother, resulting in a smaller error than the PM method produces with the same mesh. A full discussion of these methods is beyond the scope of this thesis and the interested reader is directed to Hockney and Eastwood's book [27] for more details.
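For comparison with the mesh-based approaches, the PP evaluation of the accelerations can be sketched in a few lines of Python. This is a minimal illustration assuming that (3.1) reads $a_i = -G\sum_{j\ne i} m_j (x_i - x_j)/|x_i - x_j|^3$. The function name `accelerations` is hypothetical, and the double loop makes the $O(N^2)$ operation count explicit.

```python
def accelerations(x, m, G=1.0):
    """Direct (particle-particle) summation of gravitational accelerations.

    x : list of 3D positions, m : list of masses.
    Returns a list of 3D acceleration vectors, one per particle.
    """
    n = len(x)
    a = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if j == i:
                continue  # a particle exerts no force on itself
            d = [x[i][c] - x[j][c] for c in range(3)]
            r3 = (d[0] ** 2 + d[1] ** 2 + d[2] ** 2) ** 1.5
            for c in range(3):
                a[i][c] -= G * m[j] * d[c] / r3
    return a
```

For two unit masses a distance 2 apart, each particle is accelerated toward the other with magnitude $G/4$.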
3.2 Tree Codes
There are two main ingredients for achieving efficiency in a tree code: particle-cluster interactions and a nested subdivision of space which is used to construct the particle clusters. A particle-cluster interaction is used to rapidly compute the influence of a particle cluster on a single target particle. This is done by approximating the cumulative influence of the particles in the cluster on the target particle with a simplified expression. Once a preprocessing phase is performed, the expression can be evaluated for multiple target particles with an operation count independent of
the number of particles in the cluster. We will see below that particle-cluster interactions are only performed for particles and clusters which are separated from each other. Thus, the approximation used is referred to as a far-field approximation. The nested subdivision of space, which is used to construct the particle clusters, has a natural tree structure. The objective behind the subdivision of space is to generate particle-cluster interactions in which the particle is far from the cluster, relative to the cluster's size. The combination of these two ingredients leads to an algorithm whose asymptotic operation count is $O(N \log N)$.

Two early examples of tree code algorithms are due to Appel [3] and Barnes and Hut [5], who used the algorithms for problems in astrophysics. In these algorithms, particle-cluster interactions were performed by approximating the cluster as a single particle located at the cluster's center of mass. A drawback of this approximation is that it has limited accuracy. The Fast Multipole Method of Greengard and Rokhlin [24, 25] overcame this obstacle by using a series expansion to approximate particle-cluster interactions to any specified tolerance. They also introduced cluster-cluster interactions by expanding the far-field approximation into a local near-field expansion for rapid evaluation at multiple target points. The series expansions used in [24, 25] are Laurent series in two space dimensions and spherical harmonic expansions in three dimensions. Van Dommelen and Rundensteiner [60] employed a similar series approach to study two-dimensional fluid flow around a cylinder which was modeled with point vortices and a random walk simulation of diffusion effects. They used a Laurent series to approximate particle-cluster interactions, but they did not use cluster-cluster interactions.
This simplifies the algorithm and results in smaller memory requirements, though for similar error tolerances, their ratio of improvement in execution time versus direct summation is less than Greengard and Rokhlin's two-dimensional results. Another tree code for two- and three-dimensional problems, due to Anderson [2], does not use series expansions. Instead, the approximations for particle-cluster and cluster-cluster interactions are based on the Poisson integral formula for the solution of Laplace's equation in the interior of a circle or sphere. For two-dimensional problems, the ratio of improvement in execution time for Anderson's algorithm is between Van Dommelen and Rundensteiner's and Greengard and Rokhlin's.

All of these expansions and approximations are appropriate when the interaction kernel is harmonic, such as the Newtonian potential of electrostatic and gravitational interactions, but they are unsuitable for non-harmonic kernels, such as the kernel $K_\delta$ under consideration in this thesis. This is because they rely on the harmonicity of the kernel to ensure convergence. An expansion using Cartesian Taylor series, an idea first proposed by Zhao [65], can be used to overcome this constraint. Zhao used Taylor series for simulations with the Newtonian potential, a harmonic function, in three dimensions. The motivation was to generalize Greengard and Rokhlin's [24] two-dimensional complex Taylor series expansion to the three-dimensional setting. The first application of this expansion to particle simulations with a non-harmonic kernel was by Draghicescu and Draghicescu [21], who computed the evolution of a desingularized vortex sheet in two space dimensions. An important contribution of their work is the introduction of recurrences to rapidly compute the expansion coefficients. One contribution of this thesis is to generalize this approach to the three-dimensional vortex blob kernel. Our algorithm and the algorithm of Draghicescu and Draghicescu are like Van Dommelen and Rundensteiner's [60], in that they do not use cluster-cluster interactions.
The reasoning for this is that converting a far-field expansion into a near-field expansion for Taylor series is a time consuming procedure, requiring $O(p^3)$ and $O(p^4)$ operations in two and three dimensions respectively, where $p$ is the order of the Taylor series being used. A recent development concerning this issue is a new version of the Fast Multipole Method for the Newtonian potential by Greengard and Rokhlin [26]. They speed up the far-field to near-field conversion by using an intermediate step of converting the expansion into plane wave expansions using Bessel functions. It is a matter for future work to see if this technique can be extended to non-harmonic kernels.

Salmon and Warren [54] discussed using Taylor series for the Newtonian potential and the non-harmonic Plummer potential, although their recurrences and expansions were used only for the Newtonian potential. Using low-order methods and error bounds, they were able to improve upon the performance of the previous low-order method of Barnes and Hut [5]. Winckelmans et al. [64], improving upon the error estimates of Salmon and Warren [54], were able to simulate the vortex wake behind an accelerated airfoil using a vortex method with a cut-off Gaussian smoothed Biot-Savart kernel.

Before going into more detail, we first briefly describe the overall structure of our tree code. The algorithm has two stages, the construction of the tree and the computation of the particle velocities. The tree construction involves the recursive subdivision of space to form the nested particle clusters, and the computation of cluster parameters which are used for particle-cluster interactions. Particle velocities are computed using a combination of particle-cluster interactions and particle-particle interactions. The decision for where these interactions are performed is based upon tolerance conditions and execution time considerations. The influence of a particle cluster on a target particle is computed with a particle-cluster interaction only when a tolerance condition is satisfied and when doing so takes less time than performing
individual particle-particle interactions. Otherwise, either the cluster acts on the target particle with particle-particle interactions, or the computation descends another level into the tree and considers interactions between the cluster's subclusters and the target particle. This process continues until either particle-cluster interactions are performed or the leaves of the tree are reached, in which case particle-particle interactions are performed. Other algorithms which have a structure similar to ours have been shown to require $O(N \log N)$ operations. We present numerical results in the next chapter which show that our algorithm's execution time also grows like $O(N \log N)$.

The layout for the rest of this chapter is as follows. Section 3.3 describes particle-cluster interactions where the approximation is based on Taylor series. The factors which determine the efficiency of such an approximation are discussed. This motivates the nested subdivision of space described in Section 3.4. Section 3.5 describes a method based on recurrences for computing the far-field expansion coefficients. Section 3.6 presents the error analysis upon which the adaptive order selection is based. Section 3.7 gives a full description of the algorithm. Section 3.8 discusses the execution time and memory requirements of the algorithm.
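The tree construction stage outlined above can be sketched as follows. This Python fragment is only an illustration under simplifying assumptions. Each cell is bisected at its geometric midpoint in all three coordinate directions, which is not the partial bisection used in this thesis. Empty subcells are discarded, and a `max_depth` guard is added so that coincident particles cannot cause unbounded recursion. The names `Cell`, `build_tree`, `leaves` and the parameter `n0` (standing for the threshold $N_0$) are hypothetical.

```python
class Cell:
    """A rectangular box together with the indices of the particles inside it."""

    def __init__(self, lo, hi, indices):
        self.lo, self.hi = lo, hi
        self.indices = indices
        self.children = []


def build_tree(points, n0, max_depth=20):
    """Enclose the particles in a root box and subdivide recursively."""
    lo = [min(p[c] for p in points) for c in range(3)]
    hi = [max(p[c] for p in points) for c in range(3)]
    root = Cell(lo, hi, list(range(len(points))))
    _subdivide(root, points, n0, max_depth)
    return root


def _subdivide(cell, points, n0, depth_left):
    # A cell with fewer than n0 particles is a leaf of the tree.
    if len(cell.indices) < n0 or depth_left == 0:
        return
    mid = [0.5 * (cell.lo[c] + cell.hi[c]) for c in range(3)]
    buckets = {}
    for i in cell.indices:
        key = tuple(points[i][c] >= mid[c] for c in range(3))
        buckets.setdefault(key, []).append(i)
    for key, idx in buckets.items():
        lo = [mid[c] if key[c] else cell.lo[c] for c in range(3)]
        hi = [cell.hi[c] if key[c] else mid[c] for c in range(3)]
        child = Cell(lo, hi, idx)
        cell.children.append(child)
        _subdivide(child, points, n0, depth_left - 1)


def leaves(cell):
    """Collect the leaf cells below (and including) the given cell."""
    if not cell.children:
        return [cell]
    found = []
    for child in cell.children:
        found.extend(leaves(child))
    return found
```

Because the subdivision does not depend on any target particle, a tree built this way can be reused for every velocity evaluation at a given time step.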
3.3 Particle-Cluster Interactions
Consider a particle-cluster interaction between a target particle $x$ and a collection of particles $y_j$, where $j = 1, \dots, N$. The $y_j$ are referred to as a particle cluster and the region of space containing them is referred to as a cell and is denoted $\tau$. This situation is depicted in Figure 3.1 in two space dimensions. When the particle-cluster interaction is performed, the cumulative influence of the $y_j$ on $x$ is replaced with a
truncated series expansion. From (2.23), the influence of the $y_j$ on $x$ is

$$\sum_{j=1}^{N} K_\delta(x, y_j) \times w_j. \tag{3.9}$$

Figure 3.1: Particle-cluster interaction. $x$: target particle, $y_j$: particle in cluster, $\tau$: cell, $\bar y$: center of $\tau$.

We expand $K_\delta(x, y_j)$ in a Taylor series in the second argument about a point $\bar y$, the center of $\tau$. For the moment, we will not specify where $\bar y$ is located except to say that two natural possibilities are: (1) the center of mass of the $y_j$, (2) the geometrical center of $\tau$ when it is a rectangular box. Using multi-index notation, we have

$$\begin{aligned}
\sum_{j=1}^{N} K_\delta(x, y_j) \times w_j
&= \sum_{j=1}^{N} K_\delta\bigl(x, \bar y + (y_j - \bar y)\bigr) \times w_j \\
&= \sum_{j=1}^{N} \sum_{k} \frac{1}{k!} D_y^k K_\delta(x, \bar y)\,(y_j - \bar y)^k \times w_j \\
&= \sum_{k} \frac{1}{k!} D_y^k K_\delta(x, \bar y) \times \sum_{j=1}^{N} (y_j - \bar y)^k w_j \\
&= \sum_{k} a_k(x, \bar y) \times b_k(\tau),
\end{aligned} \tag{3.10}$$

where

$$a_k(x, \bar y) = \frac{1}{k!} D_y^k K_\delta(x, \bar y), \qquad b_k(\tau) = \sum_{j=1}^{N} (y_j - \bar y)^k w_j. \tag{3.11}$$

The $a_k(x, \bar y)$ are the Taylor coefficients of $K_\delta(x, y)$ with respect to $y$ about $y = \bar y$. The $b_k(\tau)$ describe the distribution of particles in $\tau$ and are referred to as particle
moments. Note that the Taylor coefficients $a_k(x, \bar y)$ are independent of the particles $y_j$ in the cell $\tau$, and the particle moments $b_k(\tau)$ are independent of the target particle $x$. So in a certain sense, the expansion is a separation of variables. Once the particle moments are computed for one particle-cluster interaction, they can be stored and used for subsequent particle-cluster interactions with different target particles. In practice, the influence of the $y_j$ on $x$ is approximated by truncating the infinite series in (3.10), yielding

$$\sum_{|k| < p} a_k(x, \bar y) \times b_k(\tau), \tag{3.12}$$

where $|k| = k_1 + k_2 + k_3$ and $p$ is chosen to ensure that the error is less than a specified tolerance. The determination of $p$ is described in Section 3.6. Let

$$r = \max_j |y_j - \bar y| \tag{3.13}$$

be the radius of the cluster about $\bar y$, and

$$R = \bigl(|x - \bar y|^2 + \delta^2\bigr)^{1/2} \tag{3.14}$$
be the regularized distance from $x$ to the center of the cell. It is shown in Section 3.6 that the error incurred by using the truncation (3.12) is $O(h^p)$, where $h = r/R$ is the convergence factor of the expansion. Thus, the truncation is referred to as a $p$th order expansion for a particle-cluster interaction. A particle-cluster interaction is performed as follows:

1. Determine the minimum value of $p$ which ensures that the series truncation error is less than the specified tolerance.
2. If the particle moments $b_k(\tau)$ have not already been computed up to order $p$, then compute and store them.
3. Compute the Taylor coefficients $a_k(x, \bar y)$ for $|k| < p$.
4. Compute the sum in (3.12).

Step 1 is based on error bounds described in Section 3.6 and can be performed with $O(p_{\max})$ operations, where $p_{\max}$ is the largest admissible value for $p$. Step 2 requires $O(N p^3)$ operations if the particle moments have not already been computed. However, this is a one-time cost whose relative effect on the overall execution time diminishes as $\tau$ is used in more particle-cluster interactions. The exponent on $p$ in this operation count is three because we are in three space dimensions. Step 3 can be performed with $O(p^3)$ operations using a method based on recurrences that is described in Section 3.5. This is the best that can be expected, since there are $O(p^3)$ coefficients to compute. The sum in step 4 has $O(p^3)$ terms and can be computed with $O(p^3)$ operations. Adding these operation counts, we see that a $p$th order particle-cluster interaction requires $O(p^3)$ operations, assuming that the particle moments have been computed.

Using a particle-cluster interaction is not always advantageous. For instance, computing the influence of the $y_j$ on $x$ by direct summation, i.e. (3.9), may require fewer operations than computing a $p$th order expansion, where $p$ has been determined by accuracy constraints. In this situation, we do not use the expansion, opting either to use direct summation or to subdivide $\tau$ and consider particle-cluster interactions between $x$ and the resulting subcells. This decision is based on the following considerations. Direct summation requires $O(N)$ operations, where $N$ is the number of particles in $\tau$, and using the expansion requires $O(p^3)$ operations. So if direct summation requires fewer operations than the expansion, it may be loosely stated that either $N$ is small or $p$ is large. If $N$ is small, then there are no alternatives to direct
summation that will reduce the operation count. This is quantied by introducing a parameter N0 and using direct summation when N < N0 . If N N0 and p is such that using the expansion requires more operations than direct summation, then we subdivide and consider particle-cluster interactions with the resulting subcells. The motivation for this is that the expansions for the particle-cluster interactions with the subcells will require lower orders (smaller value of p) to satisfy the specied tolerance. This is because the convergence factors for the new expansions are at most 0.79h, where the 0.79 factor arises from the partial bisection algorithm described in the next section. So when the execution time required to perform direct summation is less than the time required to perform a particle-cluster interaction, direct summation is performed if N < N0 , and is subdivided if N N0 . The iterative application of this leads to the nested subdivision of space which is described in detail in the next section. In practice, the comparison between the time required to perform direct summation and the time required to perform a pth order particle-cluster expansion is done as follows. A stand-alone program was written which performs direct summation between a particle and a cluster. The program was run using clusters with varying numbers of particles N . The execution time was t with a linear function of N . This linear function is used to estimate how long it takes to perform direct summation with a cluster that has an arbitrary number of particles. A similar program was written which performs particle-cluster expansions and the execution time of this program as a function of p was determined. These execution times are stored and a table lookup is used to determine how long an expansion takes. The comparison between execution times is made between the linear function of N and the stored execution time for a pth order expansion.
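The decision rule described above can be sketched in C as follows. This is only an illustration under stated assumptions: the type `CostModel`, the function `choose_action`, and the field names are hypothetical and do not come from the thesis code; the linear fit and the timing table are assumed to have been measured beforehand.

```c
#include <assert.h>

/* The three ways a cell can be treated for a given target particle. */
typedef enum { USE_EXPANSION, USE_DIRECT_SUM, SUBDIVIDE } Action;

/* Hypothetical cost model (all names are illustrative assumptions). */
typedef struct {
    double a, b;             /* direct summation time ~ a + b * n (linear fit) */
    const double *t_expand;  /* t_expand[p]: measured time of a pth order expansion */
    int pmax;                /* largest admissible order */
} CostModel;

/* n: particles in the cell, n0: the parameter N0, p: order required by the tolerance. */
Action choose_action(const CostModel *m, int n, int n0, int p)
{
    assert(n >= 0 && p >= 0);
    double t_direct = m->a + m->b * (double)n;

    /* Use the expansion only if the order is admissible and it is not slower. */
    if (p <= m->pmax && m->t_expand[p] <= t_direct)
        return USE_EXPANSION;

    /* Otherwise small clusters are summed directly and large ones are subdivided. */
    return (n < n0) ? USE_DIRECT_SUM : SUBDIVIDE;
}
```

The order check comes first so that the timing table is never indexed beyond `pmax`.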
3.4 Tree Construction
The strategy of subdividing cells when using an expansion for a particle-cluster interaction is used to compute the velocity of every particle. A consequence of this is that every cell containing particles will be recursively subdivided until the resulting subcells have fewer than $N_0$ particles. This operation of subdividing the cells can be done independently of the target particles, so it is advantageous to do it once, at the beginning of the velocity computations. The resulting collection of cells admits a natural tree structure where nodes in the tree correspond to cells of the subdivision. A cell $\tau_2$ is a child of a cell $\tau_1$ if $\tau_2$ was obtained by subdividing $\tau_1$. The tree is constructed with the following recursive algorithm:

1. The collection of particles is enclosed with a rectangular box, which becomes the root cell of the tree and is denoted $\tau_0$. Set the current cell $\tau$ to $\tau_0$.
2. If the current cell $\tau$ contains fewer than $N_0$ particles then exit. The cell is a leaf of the tree.
3. Otherwise, subdivide $\tau$ into subcells and apply step 2 to each subcell. The resulting subcells become children of $\tau$ in the tree.

There are two aspects of this tree construction that need further explanation: the choice of $N_0$ and the method by which the cells are subdivided. The choice of $N_0$ affects the performance of the algorithm in two ways. If $N_0$ is too small, then the tree will have many levels, leading to a large memory requirement. However, if $N_0$ is too large, then the tree consists of cells having large spatial dimensions, and this increases the order p needed in the expansion for particle-cluster interactions, thereby increasing execution time. Computational experiments were performed on a test case to determine an appropriate value of $N_0$. These tests are described in the next chapter.

The subdivision of a cell $\tau$ is based upon $\tau$'s bounding box, the smallest rectangular box containing $\tau$'s particles whose sides are parallel to the coordinate axes. Let l be the longest edge of the bounding box. The bounding box is bisected in each direction in which its length is greater than $l/\sqrt{2}$, yielding either 2, 4, or 8 subcells, and the particles are partitioned according to which subcell they are contained in. The subcells which contain particles become children of $\tau$, and the subcells with no particles are discarded. The reason for bisecting the box only in the long directions is that bisecting in short directions does not significantly reduce the convergence factor of the particle-cluster expansion. This is because the convergence factor for the expansion is proportional to $r_\tau$, the radius of the cell. Note that when the subdivision process is applied recursively to the subcells, their bounding boxes depend only on their particles. Hence, the bounding boxes shrink to fit the particle distribution. The factor $1/\sqrt{2}$ was chosen to ensure that the child cell's aspect ratio, before shrinking, is closer to 1 than the parent cell's aspect ratio. Using a different factor was not found to improve the algorithm's performance significantly.

A byproduct of this algorithm for constructing the tree is that every cell in the tree has a bounding box computed for it. We select y, the base point of the Taylor series expansions, to be the center of the bounding box. Using this expansion point and shrinking the bounding boxes yields an expansion point which is close to the particles. A consequence of this is that small values of p can be used for the expansion, which reduces execution time.

Figures 3.2 and 3.3 depict the subdivisions and associated trees resulting from the application of this algorithm to a random collection of points and a sequence of points on a spiral. The rectangles shown are the bounding boxes for the points within them. The rectangle borders are drawn thinner for cells deeper in the tree. For these figures, $N_0$ was set to 20 for illustrative purposes, so cells with more than 20 particles were subdivided as described above. The spiral example demonstrates how the bounding boxes shrink to fit the particle distribution. Distributions like this occur in our computations, since we are dealing with two-dimensional surfaces embedded in $\mathbb{R}^3$.
3.5 Recurrences for Taylor Coefficients
For the algorithm to be computationally efficient, it is necessary to rapidly compute the $a_k(x, y)$, the Taylor coefficients of $K_\delta(x, y)$ defined in (3.11). A method for doing this is described here. Recall that the kernel $K_\delta$ is given by
\[ K_\delta(x, y) = -\frac{1}{4\pi}\,\frac{x - y}{(|x - y|^2 + \delta^2)^{3/2}}. \tag{3.15} \]
Our first observation is that the computation of the partial derivatives of $K_\delta$ can be sped up using the fact that
\[ K_\delta(x, y) = \nabla\phi_\delta(z), \qquad z = x - y, \qquad \text{where} \qquad \phi_\delta(z) = \frac{1}{4\pi}\,\frac{1}{(|z|^2 + \delta^2)^{1/2}}. \tag{3.16} \]
Note that $\phi_\delta$ is a regularized form of the fundamental solution to Laplace's equation in three dimensions. The Taylor coefficients of $K_\delta$ are
\[ a_k(x, y) = \frac{1}{k!}\, D_y^k K_\delta(x, y) = \frac{(-1)^{|k|}}{k!}\, D^k(\nabla\phi_\delta)(x - y). \tag{3.17} \]
Defining the quantities
\[ c_k(x, y) = \frac{1}{k!}\, D^k\phi_\delta(x - y), \tag{3.18} \]
Figure 3.2: Subdivision of space for random points. (a) Nested subdivision of space. (b) Associated tree structure.

Figure 3.3: Subdivision of space for points on a spiral. (a) Nested subdivision of space. (b) Associated tree structure.
we have the relationship
\[ a_k = (-1)^{|k|} \begin{pmatrix} (k_1 + 1)\, c_{k_1+1,k_2,k_3} \\ (k_2 + 1)\, c_{k_1,k_2+1,k_3} \\ (k_3 + 1)\, c_{k_1,k_2,k_3+1} \end{pmatrix}. \tag{3.19} \]
We compute the $c_k(x, y)$ with recurrences and then use (3.19) to compute the $a_k(x, y)$. Another relationship between $a_k(x, y)$ and $c_k(x, y)$ arises by considering them as functions of x with y fixed:
\[ a_k(x, y) = (-1)^{|k|}\, \nabla_x c_k(x, y). \tag{3.20} \]
This equation will be useful for the error analysis in the next section. To simplify the presentation of the recurrences for the $c_k(x, y)$, we first present recurrences for the Taylor coefficients of a one-dimensional analogue of $\phi_\delta$, $\phi_1(x) = (x^2 + \delta^2)^{-1/2}$.

Proposition 1 Fix $x_0 \in \mathbb{R}$ and let $c_k = \phi_1^{(k)}(x_0)/k!$ be the kth order Taylor coefficient of $\phi_1(x)$ at $x = x_0$. Then the $c_k$ satisfy the recurrence
\[ (x_0^2 + \delta^2)\, c_k + 2x_0\Big(1 - \frac{1}{2k}\Big) c_{k-1} + \Big(1 - \frac{1}{k}\Big) c_{k-2} = 0 \tag{3.21} \]
for $k > 0$, with the convention that $c_k = 0$ for $k < 0$.

Proof: First observe that $\phi_1$ satisfies the differential equation
\[ (x^2 + \delta^2)\,\phi_1'(x) + x\,\phi_1(x) = 0. \tag{3.22} \]
Let $k > 0$. Differentiating $(k - 1)$ times, using the Leibniz rule for differentiating a product, we obtain
\[ (x^2 + \delta^2)\,\phi_1^{(k)}(x) + (k-1)\,2x\,\phi_1^{(k-1)}(x) + (k-1)(k-2)\,\phi_1^{(k-2)}(x) + x\,\phi_1^{(k-1)}(x) + (k-1)\,\phi_1^{(k-2)}(x) = 0. \tag{3.23} \]
Substituting $x = x_0$ and grouping similar terms yields
\[ (x_0^2 + \delta^2)\,\phi_1^{(k)}(x_0) + 2x_0\Big(k - \frac{1}{2}\Big)\phi_1^{(k-1)}(x_0) + (k-1)^2\,\phi_1^{(k-2)}(x_0) = 0. \tag{3.24} \]
The result (3.21) is obtained on dividing by $k!$ and using the identities
\[ \frac{k - 1/2}{k!} = \frac{1 - 1/(2k)}{(k-1)!}, \qquad \frac{(k-1)^2}{k!} = \frac{1 - 1/k}{(k-2)!}. \tag{3.25} \]
The essential ingredient in the proof of Proposition 1 is the differential equation (3.22) that $\phi_1$ satisfies. The function $\phi_\delta(z)$ of (3.16), whose Taylor coefficients we require, satisfies three differential equations which are analogous to (3.22):
\begin{align*}
(|z|^2 + \delta^2)\,\frac{\partial\phi_\delta}{\partial z_1}(z) + z_1\,\phi_\delta(z) &= 0, \tag{3.26a}\\
(|z|^2 + \delta^2)\,\frac{\partial\phi_\delta}{\partial z_2}(z) + z_2\,\phi_\delta(z) &= 0, \tag{3.26b}\\
(|z|^2 + \delta^2)\,\frac{\partial\phi_\delta}{\partial z_3}(z) + z_3\,\phi_\delta(z) &= 0. \tag{3.26c}
\end{align*}
Following the proof of Proposition 1, we obtain the following result.

Proposition 2 Let $z \in \mathbb{R}^3$, $R = (|z|^2 + \delta^2)^{1/2}$ and $c_k = \frac{1}{k!} D^k\phi_\delta(z)$. Then
\begin{align*}
&R^2 c_{k_1,k_2,k_3} + 2z_1\Big(1 - \frac{1}{2k_1}\Big) c_{k_1-1,k_2,k_3} + \Big(1 - \frac{1}{k_1}\Big) c_{k_1-2,k_2,k_3} \\
&\qquad + 2z_2\, c_{k_1,k_2-1,k_3} + c_{k_1,k_2-2,k_3} + 2z_3\, c_{k_1,k_2,k_3-1} + c_{k_1,k_2,k_3-2} = 0, \tag{3.27a}\\
&R^2 c_{k_1,k_2,k_3} + 2z_2\Big(1 - \frac{1}{2k_2}\Big) c_{k_1,k_2-1,k_3} + \Big(1 - \frac{1}{k_2}\Big) c_{k_1,k_2-2,k_3} \\
&\qquad + 2z_1\, c_{k_1-1,k_2,k_3} + c_{k_1-2,k_2,k_3} + 2z_3\, c_{k_1,k_2,k_3-1} + c_{k_1,k_2,k_3-2} = 0, \tag{3.27b}\\
&R^2 c_{k_1,k_2,k_3} + 2z_3\Big(1 - \frac{1}{2k_3}\Big) c_{k_1,k_2,k_3-1} + \Big(1 - \frac{1}{k_3}\Big) c_{k_1,k_2,k_3-2} \\
&\qquad + 2z_1\, c_{k_1-1,k_2,k_3} + c_{k_1-2,k_2,k_3} + 2z_2\, c_{k_1,k_2-1,k_3} + c_{k_1,k_2-2,k_3} = 0, \tag{3.27c}
\end{align*}
where $k_1 > 0$ in (3.27a), $k_2 > 0$ in (3.27b), $k_3 > 0$ in (3.27c), with the convention that $c_k = 0$ when any of the indices are negative.

These recurrences are used to compute the $c_k$ for $|k| \le p$ with $O(p^3)$ operations. To demonstrate how this is done and to simplify the presentation, we first explain the process for a two-dimensional analogue, obtained by omitting the $z_3$ and $k_3$ dependence. We describe the generalization to the three-dimensional case afterward. So consider the two recurrences
\begin{align*}
R^2 c_{k_1,k_2} + 2z_1\Big(1 - \frac{1}{2k_1}\Big) c_{k_1-1,k_2} + \Big(1 - \frac{1}{k_1}\Big) c_{k_1-2,k_2} + 2z_2\, c_{k_1,k_2-1} + c_{k_1,k_2-2} &= 0, \tag{3.28a}\\
R^2 c_{k_1,k_2} + 2z_1\, c_{k_1-1,k_2} + c_{k_1-2,k_2} + 2z_2\Big(1 - \frac{1}{2k_2}\Big) c_{k_1,k_2-1} + \Big(1 - \frac{1}{k_2}\Big) c_{k_1,k_2-2} &= 0. \tag{3.28b}
\end{align*}
The coefficients are computed in the following 4 steps, depicted in Figure 3.4.

1. Compute $c_{0,0}$ from the definition.
2. Compute $c_{k,0}$ and $c_{0,k}$ for $k = 1 \ldots p$.
3. Compute $c_{k,1}$ and $c_{1,k}$ for $k = 1 \ldots p - 1$.
4. Compute the remaining $c_{k_1,k_2}$ for $k_1 + k_2 \le p$.

The coefficients obtained in step 4 are computed row-by-row. The computation is ordered this way to ensure that the coefficients needed for the recurrences are available. It also breaks the code into blocks which correspond to the cases when different coefficient indices arising in the recurrences are negative, making the code easier to understand. This is more of an issue in the three-dimensional case, where there are more index cases to consider. The computations for the three-dimensional case are performed in the following steps:
Figure 3.4: Computing Taylor coefficients for two-dimensional example, shown in four panels (Step 1, Step 2, Step 3, Step 4). Distinct markers indicate coefficients computed in a previous step, in the current step, and in a future step.
1. Compute $c_0$ from the definition.
2. Compute $c_k$ when two indices are 0 and the other is $\ge 1$.
3. Compute $c_k$ when one index is 0, one index is 1 and the other is $\ge 1$.
4. Compute $c_k$ when one index is 0 and the other two are both $\ge 2$.
5. Compute $c_k$ when two indices are 1 and the other is $\ge 1$.
6. Compute $c_k$ when one index is 1 and the other two are $\ge 2$.
7. Compute $c_k$ when all of the indices are at least 2.

As in the two-dimensional case, this ordering of the steps ensures that coefficients needed for the recurrences are available. Once the coefficients are computed, (3.19) is used to compute the $a_k(x, y)$ for $|k| < p$ with another $O(p^3)$ operations. Thus, the overall operation count for computing the $a_k(x, y)$ for a pth order particle-cluster interaction is $O(p^3)$.
3.6 Error Analysis of Particle-Cluster Interactions
In this section, we obtain a bound for the error due to the series truncation in a particle-cluster interaction. This bound is used in the algorithm to compute an order p to satisfy the specified tolerance. So consider a cell $\tau$ with particles $y_j$ acting on the target particle x. To simplify the analysis, we initially bound the error in the series truncation for the influence of a single particle $y_j$ in $\tau$ on x. The triangle inequality then ensures that the total error in the particle-cluster interaction is no greater than the sum of the individual errors. From (3.12), $y_j$'s computed influence on x is
\[ \sum_{|k|<p} a_k(x, y) \times (y_j - y)^k\, w_j. \tag{3.29} \]
Because the vector weight $w_j$ is independent of k, it factors out of the equation, so we restrict our attention to the expression
\[ \sum_{|k|<p} a_k(x, y)\,(y_j - y)^k. \tag{3.30} \]
To analyze the rate of convergence of this series as p increases, the quantities
\[ S_n = \sum_{|k|=n} a_k(x, y)\,(y_j - y)^k \tag{3.31} \]
are introduced, where the dependence of $S_n$ on x, y, and $y_j$ is not explicitly displayed for notational convenience. With the series truncation that we are using, $|k| < p$, the multi-dimensional series (3.30) has been reduced to a one-dimensional series
\[ \sum_{|k|<p} a_k(x, y)\,(y_j - y)^k = \sum_{n<p} S_n. \tag{3.32} \]
We estimate the error due to the truncation by showing that the magnitudes of the $S_n$ decrease geometrically and using a bound on the first omitted term $S_p$. The differential equation relating $a_k(x, y)$ and $c_k(x, y)$, (3.20), leads us to introduce the quantity
\[ T_n = \sum_{|k|=n} c_k(x, y)\,(y_j - y)^k, \tag{3.33} \]
which is related to $S_n$ by
\[ S_n = (-1)^n\, \nabla_x T_n. \tag{3.34} \]
From the recurrences for $c_k(x, y)$, (3.27), we derive a recurrence for $T_n$. From this, we derive an explicit expression for $T_n$ which involves Legendre polynomials. Then using (3.34), we derive an expression for $S_n$. This expression is used to estimate the error incurred by truncating the series (3.12).
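Proposition 3 With $T_n$, $c_k(x, y)$ and R defined as above,
\[ R^2 T_n + 2\alpha\Big(1 - \frac{1}{2n}\Big) T_{n-1} + \beta^2\Big(1 - \frac{1}{n}\Big) T_{n-2} = 0, \tag{3.35} \]
where
\[ \alpha = (x - y)\cdot(y_j - y), \qquad \beta = |y_j - y|. \tag{3.36} \]

Proof: Consider a term $R^2 c_k(x, y)(y_j - y)^k$ from the sum in (3.33) which makes up $R^2 T_n$. Letting $z = x - y$, we apply a linear combination of the identities (3.27a,b,c) with weights $k_1/n$, $k_2/n$ and $k_3/n$ respectively and solve for $R^2 c_k(x, y)(y_j - y)^k$. The result is a weighted sum of $c_k(x, y)$ over $|k| = n - 1$ and $n - 2$. The weight on a particular $c_k(x, y)$ is computed as follows. If $|k| = n - 1$, then the $c_k(x, y)$ arose from identities (3.27) being applied to the $R^2 c(x, y)$ terms whose indices are $(k_1 + 1, k_2, k_3)$, $(k_1, k_2 + 1, k_3)$, or $(k_1, k_2, k_3 + 1)$. The resulting weight multiplying $c_k(x, y)$ is then (using $k_1 + k_2 + k_3 = n - 1$)
\begin{align*}
&-\Big[\, 2z_1\Big(\frac{k_1+1}{n}\Big(1 - \frac{1}{2(k_1+1)}\Big) + \frac{k_2}{n} + \frac{k_3}{n}\Big)(y_j - y)^{(k_1+1,k_2,k_3)} \\
&\quad\; + 2z_2\Big(\frac{k_1}{n} + \frac{k_2+1}{n}\Big(1 - \frac{1}{2(k_2+1)}\Big) + \frac{k_3}{n}\Big)(y_j - y)^{(k_1,k_2+1,k_3)} \\
&\quad\; + 2z_3\Big(\frac{k_1}{n} + \frac{k_2}{n} + \frac{k_3+1}{n}\Big(1 - \frac{1}{2(k_3+1)}\Big)\Big)(y_j - y)^{(k_1,k_2,k_3+1)} \Big] \tag{3.37}\\
&= -\Big[\, 2z_1\Big(1 - \frac{1}{2n}\Big)(y_j - y)^{(k_1+1,k_2,k_3)} + 2z_2\Big(1 - \frac{1}{2n}\Big)(y_j - y)^{(k_1,k_2+1,k_3)} \\
&\quad\; + 2z_3\Big(1 - \frac{1}{2n}\Big)(y_j - y)^{(k_1,k_2,k_3+1)} \Big] \tag{3.38}\\
&= -2\,\{ z_1(y_{j,1} - y_1) + z_2(y_{j,2} - y_2) + z_3(y_{j,3} - y_3) \}\Big(1 - \frac{1}{2n}\Big)(y_j - y)^k \tag{3.39}\\
&= -2\, z\cdot(y_j - y)\Big(1 - \frac{1}{2n}\Big)(y_j - y)^k \tag{3.40}\\
&= -2\alpha\Big(1 - \frac{1}{2n}\Big)(y_j - y)^k. \tag{3.41}
\end{align*}
If $|k| = n - 2$, then the $c_k(x, y)$ arose from identities (3.27) being applied to the $R^2 c(x, y)$ terms whose indices are $(k_1 + 2, k_2, k_3)$, $(k_1, k_2 + 2, k_3)$, or $(k_1, k_2, k_3 + 2)$. The resulting weight multiplying $c_k(x, y)$ is then (using $k_1 + k_2 + k_3 = n - 2$)
\begin{align*}
&-\Big[\, \Big(\frac{k_1+2}{n}\Big(1 - \frac{1}{k_1+2}\Big) + \frac{k_2}{n} + \frac{k_3}{n}\Big)(y_j - y)^{(k_1+2,k_2,k_3)} \\
&\quad\; + \Big(\frac{k_1}{n} + \frac{k_2+2}{n}\Big(1 - \frac{1}{k_2+2}\Big) + \frac{k_3}{n}\Big)(y_j - y)^{(k_1,k_2+2,k_3)} \\
&\quad\; + \Big(\frac{k_1}{n} + \frac{k_2}{n} + \frac{k_3+2}{n}\Big(1 - \frac{1}{k_3+2}\Big)\Big)(y_j - y)^{(k_1,k_2,k_3+2)} \Big] \tag{3.42}\\
&= -\Big[\, \Big(1 - \frac{1}{n}\Big)(y_j - y)^{(k_1+2,k_2,k_3)} + \Big(1 - \frac{1}{n}\Big)(y_j - y)^{(k_1,k_2+2,k_3)} + \Big(1 - \frac{1}{n}\Big)(y_j - y)^{(k_1,k_2,k_3+2)} \Big] \tag{3.43}\\
&= -|y_j - y|^2\Big(1 - \frac{1}{n}\Big)(y_j - y)^k \tag{3.44}\\
&= -\beta^2\Big(1 - \frac{1}{n}\Big)(y_j - y)^k. \tag{3.45}
\end{align*}
Thus, when $R^2 T_n$ is expanded with identities (3.27), the result is
\begin{align*}
R^2 T_n &= -2\alpha\Big(1 - \frac{1}{2n}\Big)\sum_{|k|=n-1} c_k(x, y)(y_j - y)^k - \beta^2\Big(1 - \frac{1}{n}\Big)\sum_{|k|=n-2} c_k(x, y)(y_j - y)^k \tag{3.46}\\
&= -2\alpha\Big(1 - \frac{1}{2n}\Big) T_{n-1} - \beta^2\Big(1 - \frac{1}{n}\Big) T_{n-2}. \tag{3.47}
\end{align*}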
The recurrence (3.35) is related to the one satisfied by the Legendre polynomials,
\[ P_n(x) - 2x\Big(1 - \frac{1}{2n}\Big) P_{n-1}(x) + \Big(1 - \frac{1}{n}\Big) P_{n-2}(x) = 0, \tag{3.48} \]
for $n \ge 2$ with $P_0(x) = 1$ and $P_1(x) = x$ [22, Chapter 10]. This observation leads to the following explicit formula for $T_n$.

Proposition 4 With $\alpha$, $\beta$, and R defined as above,
\[ T_n = \frac{h^n}{4\pi R}\, P_n\Big(-\frac{\alpha}{\beta R}\Big), \tag{3.49} \]
where
\[ h = \frac{\beta}{R}. \tag{3.50} \]
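Proof: The proof works by showing that $T_n$ and the right-hand side of (3.49), which we refer to as $\tilde{T}_n$, satisfy the same two-term recurrence and have the same values for $n = 0, 1$. The recurrence for $T_n$ is given in Proposition 3. From the recurrence for $P_n$ in (3.48), we have that $P_n(-\frac{\alpha}{\beta R})$ is a solution of the recurrence
\[ f_n + 2\,\frac{\alpha}{\beta R}\Big(1 - \frac{1}{2n}\Big) f_{n-1} + \Big(1 - \frac{1}{n}\Big) f_{n-2} = 0. \tag{3.51} \]
Thus, $h^n P_n(-\frac{\alpha}{\beta R})$ is a solution of
\[ f_n + 2h\,\frac{\alpha}{\beta R}\Big(1 - \frac{1}{2n}\Big) f_{n-1} + h^2\Big(1 - \frac{1}{n}\Big) f_{n-2} = 0. \tag{3.52} \]
Multiplying this equation by $R^2$ and using $h = \beta/R$ yields
\[ R^2 f_n + 2\alpha\Big(1 - \frac{1}{2n}\Big) f_{n-1} + \beta^2\Big(1 - \frac{1}{n}\Big) f_{n-2} = 0, \tag{3.53} \]
which is the recurrence for $T_n$. Since $\tilde{T}_n$ is a multiple of $h^n P_n(-\frac{\alpha}{\beta R})$, it too satisfies the recurrence. So $T_n$ and $\tilde{T}_n$ satisfy the same two-term recurrence. From the definition of $T_n$, (3.33), and the recurrence that the $T_n$ satisfy, (3.35), the initial values for $T_n$ are
\begin{align*}
T_0 &= c_0(x, y) = \frac{1}{4\pi}\,(|x - y|^2 + \delta^2)^{-1/2} = \frac{1}{4\pi R}, \tag{3.54}\\
T_1 &= -\frac{\alpha}{R^2}\, T_0 = -\frac{\alpha}{4\pi R^3}. \tag{3.55}
\end{align*}
Using $h = \beta/R$, the initial values for $\tilde{T}_n$ are
\begin{align*}
\tilde{T}_0 &= \frac{h^0}{4\pi R}\, P_0\Big(-\frac{\alpha}{\beta R}\Big) = \frac{1}{4\pi R}, \tag{3.56}\\
\tilde{T}_1 &= \frac{h^1}{4\pi R}\, P_1\Big(-\frac{\alpha}{\beta R}\Big) = -\frac{\alpha}{4\pi R^3}. \tag{3.57}
\end{align*}
Thus, $T_n = \tilde{T}_n$ for all $n \ge 0$.

To obtain an expression for $S_n$, we take the gradient of $T_n$ with respect to x. When written out in terms of x, $y_j$, and y, we have from (3.49)
\[ T_n = \frac{|y_j - y|^n}{4\pi}\,\big(|x - y|^2 + \delta^2\big)^{-(n+1)/2}\, P_n\Big(-\frac{(x - y)\cdot(y_j - y)}{|y_j - y|\,(|x - y|^2 + \delta^2)^{1/2}}\Big). \tag{3.58} \]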
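Defining
\[ \gamma = -\frac{(x - y)\cdot(y_j - y)}{|y_j - y|\,(|x - y|^2 + \delta^2)^{1/2}}, \tag{3.59} \]
we have
\begin{align*}
S_n = \frac{(-1)^n\,|y_j - y|^n}{4\pi}\,\Big\{ &-(n + 1)(x - y)\,\big(|x - y|^2 + \delta^2\big)^{-(n+3)/2}\, P_n(\gamma) \\
&+ \big(|x - y|^2 + \delta^2\big)^{-(n+1)/2}\, P_n'(\gamma)\,\Big[ -\frac{y_j - y}{|y_j - y|\,(|x - y|^2 + \delta^2)^{1/2}} + \frac{(x - y)\cdot(y_j - y)}{|y_j - y|\,(|x - y|^2 + \delta^2)^{3/2}}\,(x - y) \Big] \Big\}. \tag{3.60}
\end{align*}
Recalling $R = (|x - y|^2 + \delta^2)^{1/2}$ and making some rearrangements, we obtain
\[ S_n = \frac{(-1)^n}{4\pi R^2}\Big(\frac{|y_j - y|}{R}\Big)^n \Big\{ -(n + 1)\,\frac{x - y}{R}\, P_n(\gamma) + P_n'(\gamma)\Big[ -\frac{y_j - y}{|y_j - y|} + \frac{(x - y)\,(x - y)\cdot(y_j - y)}{R^2\,|y_j - y|} \Big] \Big\}. \tag{3.61} \]
Each fraction inside the curly braces is at most 1 in magnitude, so we have
\[ |S_n| \le \frac{1}{4\pi R^2}\Big(\frac{|y_j - y|}{R}\Big)^n \big( (n + 1)\,|P_n(\gamma)| + 2\,|P_n'(\gamma)| \big). \tag{3.62} \]
Using the inequalities
\[ |P_n(\gamma)| \le 1, \qquad |P_n'(\gamma)| \le n(n + 1)/2, \tag{3.63} \]
which rely on $|\gamma| \le 1$, we have
\[ |S_n| \le \frac{(n + 1)^2}{4\pi R^2}\Big(\frac{|y_j - y|}{R}\Big)^n. \tag{3.64} \]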
So the terms of the series in (3.32) decay geometrically, which implies that the truncation error is roughly the magnitude of the first omitted term. Recall that when a pth order particle-cluster interaction is performed, the expansion for the influence of all of the particles is
\[ \sum_{k} a_k(x, y) \times b_k(\tau) = \sum_{n \ge 0}\; \sum_{|k|=n} a_k(x, y) \times b_k(\tau), \tag{3.65} \]
and this is truncated by retaining the terms for which $|k| < p$. The geometric decay of the terms ensures that the error incurred by using this truncation will be on the order of
\[ \sum_{|k|=p} a_k(x, y) \times b_k(\tau). \tag{3.66} \]
The bound (3.64) on $S_n$ for $n = p$ translates into the bound
\[ \Big| \sum_{|k|=p} a_k(x, y) \times b_k(\tau) \Big| \le \frac{(p + 1)^2}{4\pi R^{p+2}} \sum_{j=1}^{N_\tau} |w_j|\,|y_j - y|^p. \tag{3.67} \]
Define the quantities
\[ \mu_p(\tau) = (p + 1)^2 \sum_{j=1}^{N_\tau} |w_j|\,|y_j - y|^p. \tag{3.68} \]
Then the error incurred by using a pth order expansion to approximate a particle-cluster interaction is bounded by
\[ \text{error} < \frac{\mu_p(\tau)}{4\pi R^{p+2}}. \tag{3.69} \]
The $\mu_p(\tau)$ are computed during the construction of the tree. The first step in performing a particle-cluster expansion is to find the smallest p such that the expression in (3.69) is less than the specified tolerance. This can be done with $O(p_{max})$ operations, as stated in Section 3.3, where $p_{max}$ is the maximum admissible value of p.
When an algorithm using the error bound (3.69) to compute p was implemented, it was found that the actual error incurred in computing the velocity was typically three orders of magnitude smaller than the specified tolerance. Presumably, this is due to the repeated use of the triangle inequality in the analysis, which leads to overestimates. An alternative to bounding the error in the velocity is to bound the error in the velocity potential, which is achieved by using the identity (3.49), leading to the bound
\[ |T_n| \le \frac{1}{4\pi R}\Big(\frac{|y_j - y|}{R}\Big)^n, \tag{3.70} \]
a bound analogous to (3.64). The cumulative error bound corresponding to (3.69) is
\[ \text{error} < \frac{\mu_p(\tau)}{4\pi R^{p+1}}, \tag{3.71} \]
where $\mu_p(\tau)$ is now defined as
\[ \mu_p(\tau) = \sum_{j=1}^{N_\tau} |w_j|\,|y_j - y|^p. \tag{3.72} \]
When an algorithm using this error bound to compute p was implemented, the error in computing the velocity was still smaller than the specified tolerance, but only by one order of magnitude. This is the approach used for all of the runs described in Chapters 4 and 5.

For either of the error bounds, it is clear that the geometric decay rate of the truncation error depends linearly on $r_\tau$, the radius of the smallest sphere centered at y which encloses all the particles in the cell $\tau$. Thus, it is appropriate to subdivide cells so that the resulting subcells are as close to spheres as possible. As mentioned in Section 3.4, this is the motivation for the bisection technique used when cells are subdivided.
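The order selection based on the potential bound (3.71) can be sketched in C as follows. The function names and the input layout (precomputed weight magnitudes $|w_j|$ and distances $|y_j - y|$) are assumptions made for this example; they are not the thesis code.

```c
#include <assert.h>
#include <math.h>

static const double PI = 3.14159265358979323846;

/* mu[p] = sum_j |w_j| |y_j - y|^p for p = 0..pmax, as in (3.72).
   wnorm[j] holds |w_j| and dist[j] holds |y_j - y|; cost is O(pmax * n). */
void error_moments(const double *wnorm, const double *dist, int n,
                   int pmax, double *mu)
{
    for (int p = 0; p <= pmax; p++)
        mu[p] = 0.0;
    for (int j = 0; j < n; j++) {
        double term = wnorm[j];          /* |w_j| |y_j - y|^p, starting at p = 0 */
        for (int p = 0; p <= pmax; p++) {
            mu[p] += term;
            term *= dist[j];
        }
    }
}

/* Smallest p with mu[p] / (4 pi R^(p+1)) < tol, found in O(pmax) operations.
   Returns pmax + 1 if no admissible order satisfies the tolerance. */
int select_order(const double *mu, int pmax, double R, double tol)
{
    double Rpow = R;                     /* holds R^(p+1) */
    for (int p = 0; p <= pmax; p++) {
        if (mu[p] / (4.0 * PI * Rpow) < tol)
            return p;
        Rpow *= R;
    }
    return pmax + 1;
}
```

A return value of `pmax + 1` signals that the expansion cannot be used for this cell and target, in which case the caller would fall back to direct summation or to the child cells.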
3.7 Full Description of the Algorithm
The algorithm for computing all interactions has two stages: constructing the tree and computing the particle velocities with the aid of the tree. There are two parameters for the program: $p_{max}$, the maximum admissible order for expansions, and $N_0$, the maximum number of particles in an undivided cell. The tree is created with the recursive function create_tree, written here in pseudo-code, which accepts for input an array of particles associated with a cell $\tau$, and an integer N, the length of the array. The purpose of the function is to create and initialize a tree node for the particles which are passed to it. This includes computing the particles' bounding box, the cell's moments, and the $\mu_p(\tau)$. If there are more than $N_0$ particles, then the cell is subdivided and the function is called recursively. The function returns a pointer to the created tree node.

    function create_tree(particles, N)
    begin
        allocate memory for tree node τ being created
        compute particles' bounding box
        compute center of bounding box
        compute μ_p(τ) for p = 0 ... p_max
        if N > N0 then
            compute the directions to subdivide the cell
            partition the particles, yielding subarrays of particles
            call create_tree for each subarray of particles
            make each returned tree node a child of τ
        return τ
    end

Once the tree is created, the recursive function compute_influence is called for each target particle to compute the influence of all particles on it. The function accepts for input a target position x, a cell $\tau$, and a tolerance tol. The function returns the influence of the particles in the cell on the target position computed to the specified tolerance. It is initially called with the root cell of the tree $\tau_0$.

    function compute_influence(x, τ, tol)
    begin
        estimate t0, the time for direct summation with linear model
        compute minimum p to satisfy tolerance
        if p > p_max or time for pth order expansion > t0 then
            if τ has no children then
                compute and return influence using direct summation
            else
                call compute_influence for each child of τ
                return sum of returned influences
        else
            if τ's pth order particle moments have not been computed yet, then
                compute and store them
            compute the c_k
            compute the a_k from the c_k
            compute and return sum of expansion
    end

For each recursive call of compute_influence to itself, a local tolerance is required.
We want the cumulative errors from the child computations to be less than or equal to the tolerance passed to compute_influence. We achieve this by multiplying the parent's tolerance by the ratio of the child's weights and the parent's weights. That is, if $\tau$ is the child cell, and $P(\tau)$ is $\tau$'s parent, then
\[ \mathrm{tol}(\tau) = \frac{\sum_{y_j \in \tau} |w_j|}{\sum_{y_j \in P(\tau)} |w_j|}\; \mathrm{tol}(P(\tau)), \tag{3.73} \]
where $\mathrm{tol}(\tau)$ denotes the local tolerance for the cell $\tau$. Then it follows that the sum of the errors from the child computations is bounded by the tolerance specified for $P(\tau)$. Note that the sums in the ratio are the values of $\mu_0(\tau)$ and $\mu_0(P(\tau))$, which were computed in create_tree.
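The tolerance splitting in (3.73) amounts to a single scaling. The small C sketch below (the function name is an assumption for the example) makes the property explicit: since the children's summed weight magnitudes add up to the parent's, the child tolerances add up to the parent tolerance.

```c
#include <assert.h>
#include <math.h>

/* Local tolerance of a child cell from (3.73): the parent's tolerance scaled
   by the ratio of summed weight magnitudes (the zeroth-order quantities
   stored for the child and the parent). */
double child_tolerance(double parent_tol, double mu0_child, double mu0_parent)
{
    assert(mu0_parent > 0.0 && mu0_child >= 0.0);
    return parent_tol * mu0_child / mu0_parent;
}
```

For instance, a parent with total weight magnitude 8.0 and tolerance $10^{-4}$, whose children carry 1.5, 2.5 and 4.0, hands them tolerances that again sum to $10^{-4}$.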
3.8 Complexity Analysis
In this section, we describe the memory and time requirements of the algorithm described above. We first show that the memory required for the algorithm is $O(N)$. The algorithm consists of two stages, tree construction and velocity computation, and we analyze them separately for their time requirements. The number of operations required for constructing the tree is $O(N \log N)$. We break the velocity computations into two parts, particle-cluster interactions and particle-particle interactions. We present heuristic reasons for why these take $O(N \log N)$ and $O(N)$ operations respectively. The bounds obtained in this section should be considered as rough guides to how the algorithm performs in practice, as opposed to sharp estimates. We expect that the asymptotic behavior of the bounds matches the algorithm's asymptotic performance, but that the constants involved may be considerably off. In the next chapter, we present data from runs on test cases which demonstrate the algorithm's performance benefit over an algorithm which uses only direct summation. It is our position that those data are more significant than the asymptotic bounds obtained here, since we desire an actual execution time improvement, not just an asymptotically fast algorithm. With that in mind, we proceed with the analysis.

We now discuss the memory requirements for the algorithm, in terms of the parameters N, $N_0$, and $p_{max}$. The memory can be broken down into 2 categories: that required for the particles, and that required for the tree. It is clear that the particles require $O(N)$ words of memory. The data for a single cell that require more than $O(1)$ words are the cell moments and the $\mu_p(\tau)$, which require $O(p_{max}^3)$ and $O(p_{max})$ words respectively. So a single cell uses $O(p_{max}^3)$ words of memory. We bound the number of cells in the tree by first bounding the number of cells which are parents of leaf cells, and then using that to bound the size of the entire tree. A cell which is the parent of a leaf cell has at least $N_0$ particles. Since every particle is in exactly one such cell, there are at most $N/N_0$ parents of leaf cells. Going down the tree, we see that there are at most $8N/N_0$ leaf cells, since each cell has at most 8 children. Going up the tree, we see that there are at most $N/(2N_0)$ parents of parents of leaf cells, since each cell has at least 2 children, if it has children. Continuing up the tree, looking at parents of parents and so on, yields collections of cells with at most $N/(4N_0)$, $N/(8N_0)$, ... cells respectively. Thus, there are at most
\[ (8 + 1 + 1/2 + 1/4 + \ldots)\,N/N_0 = 10N/N_0 \tag{3.74} \]
cells in the tree. Thus, the memory required for the tree is $O(p_{max}^3 N/N_0)$. Note that since each leaf cell has fewer than $N_0$ particles in it, there must be more than $N/N_0$ of them. Thus, our bound has the correct asymptotic order.

The tree code algorithm presented here requires more memory than a direct summation program. In the next chapter, when the algorithm is validated, the amount of memory actually used is compared to the memory used by a direct summation program. It is found that the memory required by the tree code algorithm is 1.3 to 1.6 times that required by direct summation.

For certain particle distributions and tolerance values, the algorithm will perform poorly, taking more time than an algorithm which uses only direct summation. However, this behavior has not been observed in our tests. In order for the analysis to reflect the actual performance characteristics of the algorithm, we make a simplifying assumption about the particle distribution. The assumption is that when a cell is subdivided, there is an upper bound on the percentage of particles contained in a subcell. Mathematically, this is stated as
\[ \frac{N_\tau}{N_{P(\tau)}} \le C < 1, \tag{3.75} \]
where $\tau$ is an arbitrary cell, $P(\tau)$ is the parent of $\tau$, and C is a constant independent of $\tau$. Intuitively, this assumption is bounding how inhomogeneous the distribution of particles can be. In our computations, the maximum value of $N_\tau/N_{P(\tau)}$ was computed at each time step and found to be less than 0.5, so the assumption is justified.

For the operation count of the tree construction, we first obtain an upper bound on the number of levels in the tree. The root cell of the tree, $\tau_0$, has N particles. Inequality (3.75) gives an upper bound on the number of particles in a cell in terms of its parent. Applying it iteratively, we find that cells at the lth level have at most $C^l N$ particles. Now if a cell has fewer than $N_0$ particles, it is not subdivided. This will be guaranteed if $C^l N < N_0$, which is true if $l > \log(N_0/N)/\log C = \log(N/N_0)/\log C^{-1}$. Thus, there are at most $O(\log(N/N_0))$ levels in the tree. Consider a cell $\tau$ and let $T(\tau)$ be the total number of operations required to
construct the tree starting with $\tau$ and including all of $\tau$'s children. In the function create_tree, the steps that require more than $O(1)$ operations are computing the particles' bounding box and computing the $\mu_p(\tau)$ for $p = 0 \ldots p_{max}$, which require $O(N_\tau)$ and $O(p_{max} N_\tau)$ operations respectively. If $N_\tau \le N_0$, then no more operations are performed, so $T(\tau) = O(p_{max} N_\tau)$. If $N_\tau > N_0$, then the particles are partitioned and create_tree is called for each subcell. The partitioning consists of grouping the particles according to which octant they are located in with respect to the cell's center, a procedure that can be done in $O(N_\tau)$ operations. Then create_tree is called for each subcell, implying
\[ T(\tau) = O(p_{max} N_\tau) + \sum_{P(\sigma)=\tau} T(\sigma). \tag{3.76} \]
This equation, derived for $N_\tau > N_0$, is also true for $N_\tau \le N_0$, since the cell has no children and the sum is empty. Suppose $N_\tau > N_0$ and consider applying (3.76) to itself, expanding each $T(\sigma)$. The first term of the expansion of $T(\sigma)$ is $O(p_{max} N_\sigma)$. Since the $\sigma$ are the children of $\tau$, we have
\[ \sum_{P(\sigma)=\tau} N_\sigma = N_\tau, \tag{3.77} \]
an equality which requires $N_\tau > N_0$. Thus, the first terms of the expansions of the $T(\sigma)$ sum to $O(p_{max} N_\tau)$. So if $N_\tau > N_0$, then
\[ T(\tau) = 2\,O(p_{max} N_\tau) + \sum_{P(P(\sigma))=\tau} T(\sigma), \tag{3.78} \]
with the last sum potentially being empty. We repeat this process and apply (3.76) recursively to the $T(\sigma)$. However, it may be the case that not all children of $\tau$ have children, because some children of $\tau$ may have fewer than $N_0$ particles and thus are not subdivided. Thus, the inequality
\[ \sum_{P(P(\sigma))=\tau} N_\sigma \le N_\tau, \tag{3.79} \]
analogous to (3.77), may be strict. So we obtain in general
\[ T(\tau) \le O(l\, p_{max} N_\tau) + \sum_{P^{(l)}(\sigma)=\tau} T(\sigma), \qquad l = 1, 2, \ldots, \tag{3.80} \]
where $P^{(l)}$ is the parent function P composed with itself l times. The recursion stops when l is larger than the number of levels in the tree below $\tau$, since the sum in (3.80) is empty then. Thus, substituting $\tau = \tau_0$, and using the fact shown above that there are $O(\log(N/N_0))$ levels in the tree, we have
\[ T(\tau_0) \le O(p_{max} N \log(N/N_0)). \tag{3.81} \]
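The two counting bounds used in this section, the number of levels implied by assumption (3.75) and the cell count (3.74), are easy to evaluate for concrete parameters. The C sketch below is purely illustrative and its function names are assumptions.

```c
#include <assert.h>

/* Smallest l with C^l * N < N0. Under assumption (3.75), cells at level l
   hold at most C^l * N particles, so no cell at this level is subdivided. */
int max_levels(double n, double n0, double c)
{
    assert(n0 > 0.0 && c > 0.0 && c < 1.0);
    int levels = 0;
    while (n >= n0) {
        n *= c;
        levels++;
    }
    return levels;
}

/* Upper bound (3.74) on the number of cells in the tree. */
double max_cells(double n, double n0)
{
    return 10.0 * n / n0;
}
```

With $N = 1000$, $N_0 = 20$ and $C = 0.5$, the level bound is 6, consistent with $l > \log(50)/\log 2 \approx 5.64$, and the tree has at most 500 cells.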
For the computation of the particle velocities, we do not have an upper bound on the number of operations that are used. The main difficulties stem from the adaptive nature of the algorithm. There is no a priori bound on the ratio of a parent's cell size to a child's cell size. Thus, particle-cluster interactions become significantly more advantageous when a parent cell is subdivided and shrunk, as opposed to the gradual improvement that occurs when only subdividing is performed. Also, the cells on a given level of the tree may have very different sizes. This makes it difficult to consider them together, which is a natural technique. Earlier tree codes that are not as adaptive as the current one have been shown to take $O(N \log N)$ operations. We believe that our algorithm does as well, based on heuristic considerations and actual execution times.
The heuristic argument is as follows. We would show that $O(N \log N)$ operations are required by showing that each particle takes part in $O(\log N)$ particle-cluster interactions and $O(1)$ particle-particle interactions. Each particle takes part in $O(\log N)$ particle-cluster interactions because it takes part in $O(1)$ of them on each level of the tree and there are $O(\log N)$ levels in the tree. The reason that a particle only takes part in $O(1)$ particle-cluster interactions on a given level is that it does not interact with cells which are sufficiently far away relative to the cell's size, because the particle would have interacted with such cells' parents. One obtains an upper bound on the relative distance to a cell with which a particle-cluster interaction is performed, and the upper bound is independent of the level. If the cells are of the same size on the level, this implies an upper bound on the number of cells satisfying the condition. This is one point where the analysis is heuristic and not rigorous for our algorithm. Thus, there is an upper bound on the number of particle-cluster interactions on a level, which implies an $O(N \log N)$ operation count.
To bound the number of particle-particle interactions, consider how many particles interact with a leaf cell on a particle-particle basis. If one assumes that the number of particles which do so is proportional to the number of particles in the leaf cell, then it follows that
$$\sum_{\text{leaf cells}} c\,N_\tau^2 < c\,N_0 \sum_{\text{leaf cells}} N_\tau \le c\,N_0 N, \tag{3.82}$$
where $N_\tau$ is the number of particles in leaf cell $\tau$ and $c$ is the constant of proportionality. One can show that the particles which interact with a leaf cell on a particle-particle basis are contained in a sphere around the cell whose radius is proportional to the size of the cell. So if the particle density in the sphere is not too different from the density in the cell, then the assumption on the particle count is justified, and the operation count bound is achieved. So with these heuristic considerations and the rigorous bounds above, we have that the overall operation count for the algorithm is $O(N \log N)$ and the memory usage is $O(N)$.
CHAPTER 4
ALGORITHM VALIDATION AND PERFORMANCE
In this chapter, we present a validation of the algorithm. The algorithm has two different aspects: the vortex method, i.e. the discretization of the vortex sheet model, and the tree-code which is used to evaluate particle velocities. The topics in the first section are related to the vortex method. We demonstrate the fourth-order convergence of the Runge-Kutta method and present results showing convergence as the vortex sheet is refined. The other sections deal with the tree-code. We discuss the selection of the runtime parameters $N_0$ and $p_{\max}$ and demonstrate the algorithm's accuracy and execution time improvement over direct summation.
The algorithm was implemented in C [29], using double precision arithmetic, and runs were performed on a Silicon Graphics Power Challenge L, a Sun UltraSPARC 2, and a Sun SPARCstation 20. Relevant information about the machines is presented in Table 4.1. The computations which involved timing comparisons were performed on the Silicon Graphics machine.
Machine              RAM (MB)    CPU clock rate (MHz)
Power Challenge L    128         75
UltraSPARC 2         380         168
SPARCstation 20      32          150
Table 4.1: Machine characteristics.
4.1 Convergence of Vortex Method
The purpose of the first test case is to verify the fourth-order convergence of the Runge-Kutta method for the solution of the differential equations (2.23). These runs were performed with a program using direct summation. The initial condition was a flat circular vortex sheet of radius 1 with circulation distribution $\Gamma_1 = (1 - r^2)^{1/2}$. Such a sheet rolls up into a vortex ring as described in Chapter 2. The change of variable employed was $\Gamma_1 = \cos\alpha$, $0 \le \alpha \le \pi/2$, yielding $r = \sin\alpha$. The smoothing parameter $\delta$ was set to 0.1. The sheet was discretized with 64 circular vortex lines, uniformly spaced in $\alpha$. Each vortex line was discretized with $128(1 + r)$ particles, where $r$ is the radius of the vortex line, rounding the number of particles up to the nearest multiple of 8. The total number of particles discretizing the sheet was 13444. A profile of the sheet at time $t = 1$ is depicted in Figure 4.1.
The computations were performed with different time steps and the results from different runs were compared by computing the maximum distance between particle positions. This comparison was possible because no particles or lines were inserted in these particular runs. The values of $\Delta t$ used were 0.2, 0.1, 0.05, 0.025 and 0.0125. The position differences were computed for consecutive values of $\Delta t$ and are displayed in Table 4.2. The results are consistent with fourth-order accuracy.
Figure 4.1: Profile of rolling up vortex sheet. $t = 1$, $\delta = 0.10$.
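Δt       e(Δt)              e(Δt)/(Δt)^4
0.2      3.321 × 10^-3      2.075
0.1      1.885 × 10^-4      1.885
0.05     1.344 × 10^-5      2.151
0.025    9.241 × 10^-7      2.366
Table 4.2: Maximum point position differences for circular sheet. $t = 1$, $\delta = 0.10$, $e(\Delta t) = \max_i |x_i(\Delta t) - x_i(\Delta t/2)|$, where $x_i(\Delta t)$ denotes the position of particle $i$ at time $t$ computed with time step $\Delta t$.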
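As a quick check of the fourth-order claim, the observed order of convergence can be estimated from consecutive rows of Table 4.2. The sketch below uses only the tabulated values; the variable names are illustrative.

```python
import math

# Values taken from Table 4.2.
dt = [0.2, 0.1, 0.05, 0.025]
err = [3.321e-3, 1.885e-4, 1.344e-5, 9.241e-7]

# Scaled errors e(dt) / dt^4, the third column of the table.
ratios = [e / h**4 for e, h in zip(err, dt)]

# Observed order between consecutive time steps:
# log(e_i / e_{i+1}) / log(dt_i / dt_{i+1}).
orders = [
    math.log(err[i] / err[i + 1]) / math.log(dt[i] / dt[i + 1])
    for i in range(len(dt) - 1)
]

for h, r in zip(dt, ratios):
    print(f"dt = {h:<6} e/dt^4 = {r:.3f}")
print("observed orders:", [round(p, 2) for p in orders])
```

An observed order close to 4 for each pair of time steps, together with a roughly constant scaled error, is what fourth-order accuracy predicts.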
Recall from Section 2.2.3 that points and lines are inserted during a computation to maintain resolution as the vortex sheet is stretched. There are two parameters governing this process, denoted $\epsilon_1$ and $\epsilon_2$. When the distance between two adjacent vortex lines is greater than $\epsilon_1$, a new vortex line is inserted, and if two particles on a vortex line are separated by more than $\epsilon_2$, a new particle is inserted. If either of these parameters is too large, resolution is lost and the computations become inaccurate. Figure 4.2 depicts cross sections of an axisymmetric vortex sheet rolling up into a disk for different values of $\epsilon_1$. As above, the smoothing parameter $\delta$ is set to 0.1. The cross sections are shown for $t = 4$. The time step size $\Delta t$ was set to 0.05, based on the results of the previous section. The $\epsilon_1 = 0.05$ curve is smooth and no corners are discernible. The $\epsilon_1 = 0.10$ curve is not as resolved, but the point positions are in good agreement with the better resolved $\epsilon_1 = 0.05$ curve. The same can be said of the $\epsilon_1 = 0.15$ curve, but the loss of resolution in the core of the ring is considerable.
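The particle-insertion test along a single vortex line can be sketched as follows. This is only an illustration under stated assumptions: the new particle is placed at the midpoint of the two neighbors (the interpolation actually used is described in Section 2.2.3 and may differ), the line is treated as an open polyline, and only one pass is made. The function name `insert_particles` and the parameter `eps2` are hypothetical.

```python
import math

def insert_particles(line, eps2):
    """Return a refined copy of `line`, a list of (x, y, z) tuples.

    A midpoint is inserted between any two consecutive particles whose
    separation exceeds eps2. One pass only: a very long gap may still
    exceed eps2 afterwards.
    """
    if not line:
        return []
    refined = [line[0]]
    for p, q in zip(line, line[1:]):
        if math.dist(p, q) > eps2:
            # Assumption: linear midpoint; a higher-order interpolant
            # could be substituted here.
            refined.append(tuple(0.5 * (a + b) for a, b in zip(p, q)))
        refined.append(q)
    return refined

line = [(0.0, 0.0, 0.0), (0.04, 0.0, 0.0), (0.12, 0.0, 0.0)]
refined = insert_particles(line, eps2=0.05)
print(len(line), "->", len(refined), "particles")
```

The analogous test between adjacent vortex lines, governed by $\epsilon_1$, would compare distances between corresponding particles on neighboring lines.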
4.2 Selection of Runtime Parameters
In this section, we present the results of runs which were performed to select the runtime parameters $N_0$, the upper bound on the number of particles in an undivided cell, and $p_{\max}$, the maximum admissible value of $p$. As explained previously, if $N_0$ is small, then memory usage is large because the tree will have many levels. If $N_0$ is large, then fewer particle-cluster interactions will be possible since there will be fewer cells, resulting in a large execution time. Similarly, if $p_{\max}$ is large, then there will be a large memory requirement because cell moments require $O(p_{\max}^3)$ words of memory. If $p_{\max}$ is small, then fewer particle-cluster interactions will be possible since the tolerance conditions will be satisfied less often, resulting in an increase in execution time. To find values for these parameters which ensure good performance, runs were performed with $N_0$ and $p_{\max}$ taking on a range of values, $N_0 = 50, \ldots, 1000$ in increments of 50, and $p_{\max} = 6, 8, 10$. Runs were performed with different values of $N$ and different tolerances to ensure consistent results. The execution times of these runs are presented in Figure 4.3 as a function of $N_0$. Going up the page, $N$ increases, and going right across the page, tol, the requested tolerance, decreases. The different line patterns correspond to different values of $p_{\max}$, as described in the caption.
Figure 4.2: Profile of rolling up vortex sheet. $t = 4$, $\delta = 0.10$, $\Delta t = 0.05$, $\epsilon_1 = 0.15, 0.10, 0.05$, $\epsilon_2 = 0.05$.
A few observations can be made from this data. First, though there are trends in the execution time as a function of $N_0$, the timings never vary by more than a few percentage points for fixed $N$ and tolerance. The dependence on $p_{\max}$ is similar, although for the smaller tolerances, the difference in the execution time is larger. In particular, for the smallest tolerance, tol = $10^{-4}$, the $p_{\max} = 6$ times are 15 to 20 percent larger than the $p_{\max} = 8$ and 10 times, which are nearly identical. In the roll-up simulations in Chapter 5, we use tol = $10^{-3}$ for accuracy. For this tolerance, $p_{\max} = 8$ consistently has smaller execution times, so that is our choice of $p_{\max}$. We postpone selection of $N_0$ until after we discuss memory usage.
The memory used for these runs, in megabytes, is presented in Figure 4.4 as a function of $N_0$. As in Figure 4.3, $N$ increases going up the page. The amount of memory used by the algorithm is independent of the requested tolerance, so there is only one column. As $N_0$ increases, the memory usage decreases and levels out. As expected, for larger values of $p_{\max}$, the memory usage is larger. From execution time and memory considerations, we use the value 512 for $N_0$. This means that we are potentially performing particle-particle interactions with cells that contain 500 particles. Though this value may seem intuitively large, it is justified by the numerical data in Figures 4.3 and 4.4. If much smaller values of $N_0$ are used, then execution times as well as memory usage are larger.
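To make the cubic growth of moment storage with $p_{\max}$ concrete, one can count the Taylor terms of order at most $p$ in three dimensions. The sketch below assumes one stored moment per multi-index $(k_1, k_2, k_3)$ with $k_1 + k_2 + k_3 \le p$; the actual storage layout of the implementation is not given in this section, so the counts are illustrative only, and the function names are hypothetical.

```python
from math import comb

def moments_per_cell(p):
    """Number of multi-indices (k1, k2, k3) with k1 + k2 + k3 <= p."""
    return comb(p + 3, 3)

def moments_brute_force(p):
    """Same count by direct enumeration, as a cross-check."""
    return sum(
        1
        for k1 in range(p + 1)
        for k2 in range(p + 1 - k1)
        for k3 in range(p + 1 - k1 - k2)
    )

for p in (6, 8, 10):
    print(p, moments_per_cell(p), moments_brute_force(p))
```

Under this assumption, going from $p_{\max} = 6$ to $p_{\max} = 10$ more than triples the number of moments per cell, which is in line with the larger memory usage reported for larger $p_{\max}$.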
Figure 4.3: Execution time (sec.) vs. $N_0$. Curves for $p_{\max} = 6, 8, 10$ are distinguished by line pattern. Rows, bottom to top: $N$ = 12708, 25572, 38444, 51276. Columns, left to right: tol = 1.0e-2, 1.0e-3, 1.0e-4.
Figure 4.4: Memory usage (MB) vs. $N_0$. Curves for $p_{\max} = 6, 8, 10$ are distinguished by line pattern. Panels, bottom to top: $N$ = 12708, 25572, 38444, 51276.
4.3 Algorithm Performance
In this section, we compare the tree code's performance to direct summation. Execution time and memory usage as functions of $N$ are compared for different tolerances. These comparisons are based on evaluating the velocity at points on a surface which approximates a rolled-up vortex sheet; no time evolution is performed. In Figures 4.5 and 4.6, the independent variable $N$, the total number of particles, was made to vary by changing the refinement in the circulation parameter $\Gamma_1$.
Figure 4.5 displays the execution time. The different line patterns represent different requested tolerances, as described in the caption. Figure 4.5a presents the execution times in seconds and Figure 4.5b shows the ratio between the direct summation time and the tree code's time. In our roll-up computations in Chapter 5, we use tol = $10^{-3}$, which corresponds to the dashed line. With this tolerance, the new algorithm is faster than direct summation by a factor of 10 when there are 100,000 particles, and this factor increases with $N$. The factor of improvement appears to be increasing at a rate which is slightly less than linear. This is the expected behavior for an algorithm which requires $O(N \log N)$ operations, since $N^2/(N \log N) = N/\log N$.
Figure 4.6 displays the memory used by the programs. Figure 4.6a presents the usage in megabytes and Figure 4.6b shows the factor of increase, i.e. the ratio between the new algorithm's usage and the direct summation usage. As noted in Section 3.8, the new algorithm uses between 1.3 and 1.6 times the memory of the direct summation algorithm.
Figure 4.5: Execution time (sec.) vs. $N$. $p_{\max} = 8$. Line patterns distinguish tol = $10^{-2}$, $10^{-3}$ (dashed), $10^{-4}$ and direct summation. Actual data (o), projected data (x). (a) Execution time, (b) direct summation time / fast algorithm time.
Figure 4.6: Memory usage (MB) vs. $N$. $p_{\max} = 8$. Line patterns distinguish the fast algorithm and direct summation. Actual data (o), projected data (x). (a) Memory usage, (b) fast algorithm memory usage / direct summation memory usage.
The actual error in the computed value of the particle velocities, which is due to series truncation, is less than the specified tolerance. The disparity is due to the application of the triangle inequality in the error estimates in Section 3.6. Figure 4.7 displays the actual error as a function of the specified tolerance. Recall from Section 3.6 that there are two different error bounds, one on the velocity potential (3.71) and one on the velocity (3.69). The figure contains data for programs which determine $p$ using these bounds for different values of $N$, plotted with solid and dashed lines as described in the caption. It was stated in Section 3.6 that if the choice of $p$ is based on the velocity error bound, then the actual error in the velocity is several orders of magnitude smaller than the requested tolerance, which is clearly demonstrated by the figure. The actual error is also smaller when the potential error bound is used, but by a smaller margin. Note that the actual error is not sensitive to changes in $N$.
Figure 4.7: Actual error vs. specified tolerance. $p_{\max} = 8$, $N_0 = 500$, $N$ = 6284, 12708, 25572, 38444, 51276. Potential error bound (solid), velocity error bound (dashed).
Figure 4.8 depicts the execution time as a function of the actual error, using the two different error bounds. The plotted lines correspond to the requested tolerances tol = $10^{-2}$, $10^{-3}$, $10^{-4}$ for a fixed value of $N$, going up the plot as $N$ increases. A conclusion that can be drawn from the figure is that the potential error bound requires less time than the velocity error bound to obtain a given actual error for the same number of points. This observation, and the closer match between requested tolerance and actual error, are the reasons that we use the potential bound.
Figure 4.8: Execution time (sec.) vs. actual error. $p_{\max} = 8$, $N_0 = 500$, $N$ = 6284, 12708, 25572, 38444, 51276. Connected lines are tol = $10^{-2}$, $10^{-3}$, $10^{-4}$. Potential error bound (solid), velocity error bound (dashed).
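The $N/\log N$ growth of the speedup noted in this section can be illustrated with a small extrapolation. The sketch below is an idealization: it anchors the curve at the reported factor of 10 for 100,000 particles and assumes the ratio scales exactly as $N/\log N$, ignoring constants and lower-order terms. The function name `projected_speedup` and its parameters are hypothetical.

```python
import math

def projected_speedup(n, n_ref=100_000, speedup_ref=10.0):
    """Scale a reference speedup by the ratio of N / log N values.

    Assumes direct summation costs ~N^2 and the tree code ~N log N,
    so their ratio grows like N / log N.
    """
    return speedup_ref * (n / math.log(n)) / (n_ref / math.log(n_ref))

for n in (100_000, 200_000, 891_514):
    print(n, round(projected_speedup(n), 1))
```

Doubling $N$ raises the projected factor by somewhat less than 2, matching the "slightly less than linear" growth described above. For the 891514 particles reached in Section 5.3 the idealized curve gives a factor of about 75, the same order as the factor of 60 estimated there.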
CHAPTER 5
APPLICATIONS
In this chapter, the results of simulations performed using our algorithm are presented. In all of the computations here, unless mentioned otherwise, the requested tolerance was tol = $10^{-3}$, and the runtime parameters for the algorithm were $N_0 = 512$ and $p_{\max} = 8$. The smoothing factor was $\delta = 0.10$.
5.1 Vortex Ring with Azimuthal Perturbation
This section presents the results of simulations of a perturbed rolling-up vortex sheet. An azimuthal instability was introduced to the sheet by perturbing a flat circular disk. In polar coordinates, the perturbation is of the form
$$\mathbf{p}(r, \theta) = \rho\, r^2 \cos(k\theta)\, \mathbf{e}_z, \tag{5.1}$$
where $k$ is the perturbation wavenumber and $\rho$ is the magnitude of the perturbation. The $r^2$ factor is present to smooth the perturbation at the origin. The perturbation may also be considered as a function of $\alpha$ and $\theta$, its initial magnitude being proportional to $\sin^2\alpha$, since $r = \sin\alpha$.
After the sheet rolls up, the radius of the ring, i.e. the position of the core, is approximately 0.8, as seen in Figure 4.2. Recall from the linear stability analysis of Section 2.3.2 that the stability of a vortex filament with respect to a perturbation with wavenumber $k$ depends only on $k$ and $\delta/R$. For $\delta = 0.10$ and $R = 0.8$, $\delta/R = 0.125$. From Figure 2.8, a vortex filament with $\delta/R = 0.12$ has an unstable mode for $k = 9$. However, the presence of the rolls, which are larger than $\delta$, presumably has the effect of spreading the vorticity out away from the core. This is analogous to increasing $\delta$, which lowers the wavenumber of the unstable mode. With this in mind, simulations were performed with wavenumbers $k$ ranging from 4 to 11. The time step used was $\Delta t = 0.10$, and the point insertion parameters $\epsilon_1$ and $\epsilon_2$ were 0.075 and 0.05 respectively. The value of $\rho$, the magnitude of the perturbation at the edge of the disk, was 0.10.
Figure 5.1 shows a measure of the variance of the rings as a function of time. The quantity plotted was obtained as follows. Each value of $\alpha$ corresponds to a filament, which in our computations is perturbed from being circular. The average radius and $z$ position of the filament are computed. For each value of $\alpha$, we compute the $L^2$ distance from the filament to the circle whose radius and $z$ position are the averages just computed. The quantity plotted in Figure 5.1 is the $L^2$ norm of this distance as a function of the circulation parameter $\Gamma_1$. The figure shows that the perturbation for the $k = 4$ and 5 modes does not grow much. For the larger wavenumbers, the disturbance has more growth, peaking with the $k = 10$ perturbation.
To visualize the sheets, we plot the sheet positions for the $k = 5$ and 9 simulations. These two values of $k$ are representative of the behavior observed for other $k$ values. The position of the sheet for the wavenumber $k = 5$ at times $t = 0, 2, 4, 6$ is shown in Figure 5.2. One can see from these images that the sheet is rolling up smoothly, the perturbation having only a marginal effect on the evolution. This is in contrast to the images in Figure 5.3, which shows the vortex sheet for the $k = 9$ simulation. In this simulation, and the other high wavenumber simulations, the outer turns of the sheet are smooth, but the core is becoming highly distorted. A depiction of the ring's core, for $k = 5$ and 9, is presented in Figure 5.4. The curves plotted are the filaments that correspond to $\alpha > 0.8$. Initially, these filaments were near the outer portion of the disk. The distortion in the core for the $k = 9$ simulation as compared to the $k = 5$ simulation is clearly evident here. The bulging behavior of the sheet around the waves is consistent with the simulations of Knio and Ghoniem [32] and the experiments of Didden [19]. The bulges are also similar to the deformations found by Meiburg, Lasheras, and Martin [44] in their study of azimuthal perturbations to a jet, which was based upon experiments and numerical simulations.
It should be noted that the surfaces plotted in Figures 5.2 and 5.3 and the surface plots which appear later in this chapter are the surfaces formed by the material curves which coincided with the vortex lines of the sheets at $t = 0$. However, since we are using a smoothed Biot-Savart kernel, they are not the actual vortex lines for $t > 0$.
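The per-filament deviation that enters the variance measure of Figure 5.1 can be sketched as follows. The assumptions are labeled loudly: the ring axis is taken to be the $z$-axis, the filament is a list of equally weighted points, the distance from a point to the averaged circle is measured in the $(r, z)$ plane, and a discrete root-mean-square stands in for the $L^2$ norm. The function name `filament_deviation` and the test filament are hypothetical.

```python
import math

def filament_deviation(points):
    """RMS distance from a filament to its averaged circle.

    `points` is a list of (x, y, z) tuples. The averaged circle has the
    filament's mean cylindrical radius and mean z position.
    """
    radii = [math.hypot(x, y) for x, y, _ in points]
    heights = [z for _, _, z in points]
    r_mean = sum(radii) / len(radii)
    z_mean = sum(heights) / len(heights)
    squares = [
        (r - r_mean) ** 2 + (z - z_mean) ** 2
        for r, z in zip(radii, heights)
    ]
    return math.sqrt(sum(squares) / len(squares))

# Hypothetical test filament: a circle of radius 0.8 displaced in z by
# amp * cos(k * theta), mimicking the form of the perturbation (5.1).
n, k, amp, radius = 360, 9, 0.1, 0.8
ring = [
    (
        radius * math.cos(2 * math.pi * j / n),
        radius * math.sin(2 * math.pi * j / n),
        amp * math.cos(k * 2 * math.pi * j / n),
    )
    for j in range(n)
]
dev = filament_deviation(ring)
print(dev, amp / math.sqrt(2))
```

For a pure cosine displacement the deviation equals the amplitude divided by $\sqrt{2}$, which gives a convenient sanity check before applying the measure to computed filaments.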
5.2 Elliptical Vortex Ring
In this section, results from simulations of an elliptical vortex ring are presented. The computations are similar to those of Dhanak and de Bernardinis [18] and Fernandez et al. [23]. The model used for the formation of an elliptical vortex ring is to give an impulse to an elliptical disk and then to dissolve the disk away. As with a circular disk, a free vortex sheet remains and rolls up into a vortex ring. Following Dhanak and de Bernardinis [18], the circulation distribution for an elliptical disk is taken to be
$$\Gamma_1 = \sqrt{1 - \frac{x^2}{a^2} - \frac{y^2}{b^2}}, \tag{5.2}$$
where the disk is the region
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} \le 1. \tag{5.3}$$
The vortex filaments are ellipses with the same eccentricity as the elliptical disk. As before, the change of variable used is $\Gamma_1 = \cos\alpha$, leading to $\Gamma_1'(\alpha) = -\sin\alpha$. Simulations were performed for disks with different eccentricities, which were controlled by setting $b = 1$ and allowing $a < 1$ to vary. The ratio of the minor axis length to the major axis length is $a$ and the eccentricity is $\sqrt{1 - a^2}$. We present results for $a = 0.8, 0.6, 0.5$. The insertion parameters $\epsilon_1$ and $\epsilon_2$ were both set to 0.05. The time step used was $\Delta t = 0.05$.
Figure 5.1: Variance of perturbed vortex sheet. $\delta = 0.10$, perturbation magnitude $\rho = 0.10$. Panels: $k = 4, 5, \ldots, 11$. $k$: wavenumber of perturbation, $t$: time.
Figure 5.2: Perturbed vortex sheet. $k = 5$. $\delta = 0.10$, $t = 0, 2, 4, 6$.
Figure 5.3: Perturbed vortex sheet. $k = 9$. $\delta = 0.10$, $t = 0, 2, 4, 6$.
Figure 5.4: Core of perturbed vortex sheet. $k = 5, 9$. $\delta = 0.10$, $t = 0, 2, 4, 6$.
For values of $a$ close to 1, an elliptical ring may be considered as a small perturbation of a circular ring with wavenumber 2. From the linear stability analysis in Section 2.3.2, we expect the perturbation to oscillate with constant magnitude. This behavior is exhibited by the $a = 0.8$ computation, which is presented in Figure 5.5. Initially, the disk is narrower in the direction coming out of and to the right of the page. Thus, the filaments running along the front-right edge are stretched in comparison to the rest of the disk. This intensifies the vorticity, which is why the outer turns have wrapped up and around more along this edge and its opposite. However, the difference is not enough to disturb the core, which is rolling up smoothly.
The $a = 0.6$ and 0.5 computations are presented in Figures 5.6 and 5.7 respectively. The orientation of these disks is the same as for the $a = 0.8$ disk. In the regions where the fluid is moving most rapidly around the edge of the disk, the front-right and back-left, the fluid is forced up over the disk towards the center. As the fluid from either side approaches the center, it is forced up and away from the disk. This is the cause of the protruding spikes on the sheets. The presence of these structures makes it difficult to study the sheet's motion, because as the sheet stretches to form the peaks, additional filaments are inserted, which increases the execution time. The $a = 0.5$ computation was started with under 7500 particles and at time $t = 6$ it has 84,000 particles.
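For reference, the eccentricities corresponding to the three axis ratios used in this section follow directly from the relation between axis ratio and eccentricity for an ellipse with semi-axes $a < b$. The short sketch below evaluates it; the function name `eccentricity` is illustrative.

```python
import math

def eccentricity(a, b=1.0):
    """Eccentricity of an ellipse with semi-minor axis a and semi-major axis b."""
    if not 0.0 < a <= b:
        raise ValueError("expected 0 < a <= b")
    return math.sqrt(1.0 - (a / b) ** 2)

for a in (0.8, 0.6, 0.5):
    print(f"a = {a}: eccentricity = {eccentricity(a):.3f}")
```

The $a = 0.8$ disk is thus a moderate departure from a circle, while $a = 0.5$ is strongly elliptical, consistent with the increasingly distorted roll-up described above.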
5.3 Colliding Vortex Rings
In this section, results of a simulation of obliquely colliding vortex rings are presented. The configuration of vortex rings is based on experiments performed by Schatzle [55]. In our computations, the rings are inclined from horizontal by 30 degrees. The centers of the initial circular vortex sheets are located at $(\pm 1, 0, 0)$. An adaptive time-step procedure was used, with an initial $\Delta t = 0.10$, although the time steps never went below 0.07. The point insertion parameters $\epsilon_1$ and $\epsilon_2$ were 0.075 and 0.05 respectively.
Figure 5.8 shows the vortex sheets which represent the colliding vortex rings. Figure 5.9 shows a cut-away of the same view, enabling one to see the rolling up structure which is present. In the region where the rings have merged, the windings of the sheet are flattened up against each other and are being pushed down. Because of this stretching, a large number of filaments and particles are inserted into this region, even though the vorticity amplitude is relatively low, as shown by the vorticity isosurfaces in the next figures. At time $t = 0$, there were 14984 particles representing the disks, and at time $t = 4.5$, the latest time in our runs, there were 891514 particles. For this number of particles, we estimate that our fast algorithm is performing the computations 60 times faster than direct summation. Even with the fast algorithm, the computation took 32 hours to go from $t = 4$ to $t = 4.5$, so a direct summation algorithm would take months.
Figure 5.5: Elliptical vortex sheet. $a = 0.8$. $\delta = 0.10$, $t = 0, 2, 4, 6$.
Figures 5.10 through 5.13 show isosurfaces of the vorticity field, computed by differentiating the integral (2.25) and evaluating it for positions $x$ on a regular grid. The values chosen for the isosurfaces are one- and two-thirds of the maximum initial computed vorticity. Each figure shows the rings from a different viewpoint for the time sequence $t = 0, 1, 2, 3, 4, 4.5$. The first view is a perspective view with shading on the surfaces, and the others are orthogonal projections. As the rings approach, they initially pinch, and then they merge and this region flattens out. The connection region then begins to stretch out. This is in qualitative agreement with Schatzle's experiment and the computations of Anderson and Greengard [1]. In Schatzle's experiments, the connection region disconnects and there is another connection and subsequent disconnection which occurs at the bottom of the rings. Because of the stretching and reconnection of vorticity, it is an open question whether or not a vortex filament model can capture these later stages of the evolution.
Our simulations appear to have effectively captured the merger of the rings. However, due to the large computational time, we were not able to explore the parameter space. For instance, it would be of interest to know how the ring merger depends upon the angle of inclination. We are also interested in knowing what happens when $\delta \to 0$.
Figure 5.8: Vortex sheets modeling colliding disks. $\delta = 0.10$, $t = 0, 1, 2, 3, 4, 4.5$.
Figure 5.9: Cut-away of vortex sheets modeling colliding disks. $\delta = 0.10$, $t = 0, 1, 2, 3, 4, 4.5$.
Figure 5.10: Vorticity isosurfaces of colliding vortex rings, perspective view. $\delta = 0.10$, $t = 0, 1, 2, 3, 4, 4.5$.
Figure 5.11: Vorticity isosurfaces of colliding vortex rings, front view. $\delta = 0.10$, $t = 0, 1, 2, 3, 4, 4.5$.
Figure 5.12: Vorticity isosurfaces of colliding vortex rings, side view. $\delta = 0.10$, $t = 0, 1, 2, 3, 4, 4.5$.
Figure 5.13: Vorticity isosurfaces of colliding vortex rings, top view. $\delta = 0.10$, $t = 0, 1, 2, 3, 4, 4.5$.
CHAPTER 6
CONCLUSIONS
6.1 Summary
A new algorithm has been presented for rapidly computing three-dimensional vortex sheet motion. The main ingredients of the algorithm are the use of Taylor series for particle-cluster interactions and a nested subdivision of space to create the particle clusters. An important feature of the algorithm is the use of recurrences to compute the expansion coefficients for particle-cluster interactions. New features of the algorithm include its application to a non-harmonic three-dimensional kernel, its adaptive subdivision of space and its adaptive error control.
The majority of treecode algorithms previously developed for rapid computations in particle simulations have been restricted to applications where the particle interaction kernel is harmonic. Our algorithm overcomes this restriction by extending the Taylor series approach of Draghicescu and Draghicescu [21] to the three-dimensional vortex blob kernel $K_\delta$.
The subdivision of space, to obtain smaller particle clusters, takes into account the local particle distribution by using the particles' bounding box. When the previous algorithms subdivide cells, they do not take into account the particles' positions. Though there have been some algorithms which only subdivide when there are sufficiently many particles to warrant it, such as the adaptive multipole algorithm of Carrier, Greengard, and Rokhlin [11], even these algorithms have not taken the particles' positions within the cells into consideration when subdividing. The bounding boxes also provide series expansion points which yield good convergence. The order of the expansion used for particle-cluster interactions, $p$, is chosen adaptively and depends upon the selection of the expansion point, so good placement yields lower values of $p$, which improves the algorithm's performance.
The algorithm has been applied to study the dynamics of vortex rings which are modeled as rolled-up vortex sheets. With the fast algorithm we are able to perform simulations with $10^5$ to $10^6$ particles, which was not previously feasible. We performed simulations of perturbed vortex rings, elliptical vortex rings, and the collision of two vortex rings. In the simulations of the colliding vortex rings, the vorticity in the rings appears to reconnect, due to superposition, even though the model does not explicitly account for viscous effects.
The sheet motion is computed with a Lagrangian numerical method, computing the sheet's velocity with a smoothed version of the Biot-Savart integral (2.22). Discretization leads to a large system of differential equations which are solved with a Runge-Kutta method. At each time step of the computation, we use the new algorithm to compute the velocity of the discrete particles representing the sheet.
6.2 Directions for Future Work
There are a number of ways to extend this work, which fall into three categories, investigating further the dynamics of the vortex sheet model for vortex rings, enhancing the algorithm, and applying the algorithm to other systems of equations. The vortex sheet model for vortex ring formation appears to capture the process of vortex reconnection. It would be useful to understand this better, which would
require more extensive runs and an exploration of the parameter space. For instance, one issue is to determine how the dynamics depend on . It is also of interest to extend the simulations to later times to see how the model performs. In this thesis, the only ows that we have considered are vortex rings modeled as rolled up vortex sheets. More general uid ow problems can be studied using smoothed vortex lament models as introduced by Chorin [12] and other three-dimensional vortex methods as discussed by Leonard [37]. These numerical methods reconstruct the velocity eld from the vorticity eld using the Biot-Savart integral (2.6), which leads to an O(N 2 ) operation count, where N is the number of computational elements. Our algorithm can be used to speed these computations as was done for the vortex sheet problems we studied. The motivation for using an asymptotically fast algorithm is the O(N 2 ) operation count of direct summation. However, if N is not too large, then direct summation is feasible. So it would be advantageous to use a vortex method which is more ecient in terms of the number of discretizing particles that it uses. Though we insert points and lines when the sheet stretches, we do not remove any when they concentrate in a small region. If this could be done, then the execution time could be lowered. One possibility for doing this is the removal of vortex hairpins as described by Chorin [13, 14]. Another possibility is Lagrangian reparametrization. However, it is not clear that a rolling-up vortex sheet can be resolved with a small number of points, so these options may have only limited benet for vortex sheet motion. In terms of enhancing the algorithm, one direction to take is to use dierent expansions than Taylor series for particle-cluster interactions. Two possible classes are orthogonal polynomials and wavelets. An advantage of orthogonal polynomials is that fewer terms would be needed to satisfy error tolerances. An advantage of
wavelets is that the approximant can be taken to be globally continuous, as opposed to piecewise continuous as the current method yields. This may be advantageous when the system being modeled is unstable. The main aspect of the algorithm which needs to be generalized for these changes is the computation of the expansion coecients. If the coecients are not computed eciently, i.e. not in linear time with respect to the number of terms in the expansion, then the performance of the algorithm will be degraded. This is because coecient computation will then dominate the overall operation count for particle-cluster interactions. Another way in which the algorithm can be improved is to use a better cell dividing technique. The current technique subdivides cells by bisecting the cells bounding box. This approach does not use any information about the internal structure of the particles in the cell, such as how the particles are grouped. Thus, it may break up natural clusters which span the cells mid-planes. An approach which detects such internal structure could be benecial. In terms of studying dierent systems, the present algorithm can be used to study other systems which are modeled with vortex sheets or the algorithm can be generalized to study particle systems where the interaction kernel is dierent than K . One application that is of interest is the three-dimensional simulation of the wake behind an airplane, modeled as a vortex sheet. For systems with dierent kernels, certain aspects of the algorithm need to be modied, although the basic idea of particle-cluster interactions and the subdivision of space are independent of the kernel. The main aspect of the algorithm which would need to be generalized is the computation of the expansion coecients. However, a recurrence similar to (3.27) exists for the Taylor coecients of any function which satises a linear dierential equation with polynomial coecients. When such a dierential equation
is differentiated $n$ times and the Leibniz rule for differentiating a product is used, low order derivatives of the function do not appear because high order derivatives of the polynomial coefficients vanish. Thus, the Taylor coefficients will satisfy a short recurrence. For instance, consider the third-order Gaussian $\phi(r) = \exp(-r^3)$, which has been used as a convolution function to smooth the Biot-Savart kernel [7, 32]. The function satisfies the differential equation
\[ \phi'(r) + 3r^2 \phi(r) = 0, \tag{6.1} \]
and its Taylor coefficients $c_n = \phi^{(n)}(r)/n!$ satisfy the recurrence
\[ c_n + 3\,(r^2 c_{n-1} + 2r c_{n-2} + c_{n-3})/n = 0. \tag{6.2} \]
So the Taylor coefficients could be computed rapidly. Thus, we believe the algorithm can be extended to a wide class of systems.
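The recurrence (6.2) can be exercised with a short numerical experiment. The following Python sketch (the function names and the sample values of `r` and `h` are illustrative assumptions, not part of the thesis) generates the coefficients $c_n$ of $\exp(-r^3)$ from (6.2) in linear time and compares the truncated Taylor sum with a direct evaluation of the function at $r + h$.

```python
import math

def taylor_coeffs_third_order_gaussian(r, n_terms):
    """Taylor coefficients c_n = phi^(n)(r)/n! of phi(r) = exp(-r**3),
    generated with the three-term recurrence (6.2)."""
    c = [math.exp(-r ** 3)]                      # c_0 = phi(r)
    for n in range(1, n_terms):
        c1 = c[n - 1]
        c2 = c[n - 2] if n >= 2 else 0.0         # coefficients with negative index vanish
        c3 = c[n - 3] if n >= 3 else 0.0
        # (6.2): c_n = -3 (r^2 c_{n-1} + 2 r c_{n-2} + c_{n-3}) / n
        c.append(-3.0 * (r * r * c1 + 2.0 * r * c2 + c3) / n)
    return c

def taylor_eval(c, h):
    """Evaluate sum_n c_n h^n by Horner's rule."""
    total = 0.0
    for coeff in reversed(c):
        total = total * h + coeff
    return total

# Illustrative values (assumptions): expansion point r, offset h, 20 terms.
r, h = 0.7, 0.1
coeffs = taylor_coeffs_third_order_gaussian(r, 20)
approx = taylor_eval(coeffs, h)
exact = math.exp(-(r + h) ** 3)
print(abs(approx - exact))
```

Each coefficient costs a fixed number of operations, so the work grows linearly with the number of terms, which is the property the text asks of any replacement expansion.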
APPENDICES
APPENDIX A
Notation
$a_k(x, y)$            Taylor coefficients for particle-cluster interaction
$b_k(\tau)$            particle moments for cell $\tau$
$c_k(x, y)$            Taylor coefficients of the potential function for $K_\delta$
$c_k$                  Taylor coefficients of the one-dimensional analogue of the potential function
$h$                    convergence factor for particle-cluster interaction, error = $O(h^p)$
$k$                    wavenumber of vortex filament or ring perturbation
$K(x, y)$              Biot-Savart kernel
$K_\delta(x, y)$       smoothed Biot-Savart kernel
$N$                    total number of particles in simulation
$N_\tau$               number of particles in a cluster
$N_0$                  maximum $N_\tau$ in an unsplit cell
$p(x, t)$              fluid pressure
$p$                    order of the series truncation
$p_{\max}$             maximum admissible order
$P(\tau)$              parent of cell $\tau$
$r_\tau$               radius of cell $\tau$ about $y_\tau$
$R$                    smoothed distance from $y_\tau$ to target particle
$S_n$                  sum of order $n$ terms from particle-cluster expansion using $K_\delta$
$T_n$                  sum of order $n$ terms from particle-cluster expansion using the potential function for $K_\delta$
$u(x, t)$              fluid velocity
$w_i(t)$               product of finite differences in the second sheet parameter and integration weights in both sheet parameters
x(1 , 2 , t) position of vortex sheet xi (t) yj y

1, 2
discrete particle approximating position on vortex sheet
particles making up a cluster
center of cell $\tau$, expansion point for Taylor series
reparametrization of 1
smoothing parameter
insertion parameters in 1 , 2 directions respectively
circulation parameter across vortex lines in vortex sheet
parameter along a vortex line
fluid viscosity
magnitude of vortex ring perturbation
sum of absolute value of weights in a cell
cell containing a cluster of particles
velocity potential
jump in across vortex sheet
flow map
potential function for $K_\delta$
one-dimensional analogue of
vorticity
jump in across vortex sheet
1 2 p (x, t) J (1 , 2 ) (x, t) 1 (x, t) []
APPENDIX B
Cylindrical Coordinate Identities
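Change of Basis Formulas
\begin{align}
e_r(\tilde\theta) &= \cos(\tilde\theta - \theta)\, e_r(\theta) + \sin(\tilde\theta - \theta)\, e_\theta(\theta) \tag{B.1} \\
e_\theta(\tilde\theta) &= -\sin(\tilde\theta - \theta)\, e_r(\theta) + \cos(\tilde\theta - \theta)\, e_\theta(\theta) \tag{B.2}
\end{align}

Derivatives of Basis Vectors
\begin{align}
\frac{d}{d\theta} e_r(\theta) &= e_\theta(\theta) \tag{B.3} \\
\frac{d}{d\theta} e_\theta(\theta) &= -e_r(\theta) \tag{B.4}
\end{align}

Cross Product Relationships
\begin{align}
e_r \times e_\theta &= e_z, & e_\theta \times e_r &= -e_z \tag{B.5} \\
e_\theta \times e_z &= e_r, & e_z \times e_\theta &= -e_r \tag{B.6} \\
e_z \times e_r &= e_\theta, & e_r \times e_z &= -e_\theta \tag{B.7}
\end{align}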
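The identities (B.1)-(B.7) are easy to confirm numerically. The following Python sketch is one way to do so, assuming the standard Cartesian representation $e_r(\theta) = (\cos\theta, \sin\theta, 0)$, $e_\theta(\theta) = (-\sin\theta, \cos\theta, 0)$, $e_z = (0, 0, 1)$; the helper names and the two sample angles are illustrative assumptions.

```python
import math

def e_r(theta):
    return (math.cos(theta), math.sin(theta), 0.0)

def e_theta(theta):
    return (-math.sin(theta), math.cos(theta), 0.0)

E_Z = (0.0, 0.0, 1.0)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def combine(a, u, b, v):
    """Linear combination a*u + b*v of two 3-vectors."""
    return tuple(a * ui + b * vi for ui, vi in zip(u, v))

def central_diff(f, t, h=1e-5):
    """Second-order finite-difference derivative of a vector-valued function."""
    return tuple((p - m) / (2.0 * h) for p, m in zip(f(t + h), f(t - h)))

def close(u, v, tol=1e-12):
    return all(abs(ui - vi) <= tol for ui, vi in zip(u, v))

# Two arbitrary sample angles (assumptions for the check).
theta, theta_tilde = 0.4, 1.9
d = theta_tilde - theta

checks = {
    "B.1": close(e_r(theta_tilde),
                 combine(math.cos(d), e_r(theta), math.sin(d), e_theta(theta))),
    "B.2": close(e_theta(theta_tilde),
                 combine(-math.sin(d), e_r(theta), math.cos(d), e_theta(theta))),
    "B.3": close(central_diff(e_r, theta), e_theta(theta), tol=1e-8),
    "B.4": close(central_diff(e_theta, theta),
                 combine(-1.0, e_r(theta), 0.0, E_Z), tol=1e-8),
    "B.5": close(cross(e_r(theta), e_theta(theta)), E_Z),
    "B.6": close(cross(e_theta(theta), E_Z), e_r(theta)),
    "B.7": close(cross(E_Z, e_r(theta)), e_theta(theta)),
}
print(checks)
```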
APPENDIX C
Details from Circular Filament Analysis
This appendix contains some details of the analysis of a circular vortex filament from Section 2.3.2.
C.1 Propagation Speed of Circular Filament
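The initial conditions, in cylindrical coordinates, are
\[ y(\theta, 0) = (R, \theta, 0). \tag{C.1} \]
The evolution equation for $y(\theta, t)$ is
\[ \frac{\partial y}{\partial t}(\theta, t) = -\frac{1}{4\pi} \int_0^{2\pi} K_\delta(y(\theta, t), y(\tilde\theta, t)) \times \frac{\partial y}{\partial \tilde\theta}(\tilde\theta, t)\, d\tilde\theta, \tag{C.2} \]
where
\[ K_\delta(x, y) = \frac{x - y}{(|x - y|^2 + \delta^2)^{3/2}}. \tag{C.3} \]
To evaluate the integral in (C.2), we express the integrand in terms of the cylindrical basis at $(R, \theta, 0)$. This ensures that the basis elements are independent of $\tilde\theta$, the variable of integration, so that they can be factored out of the integral. The first expression to compute is $y(\theta, 0) - y(\tilde\theta, 0)$. Using (B.1),
\begin{align}
y(\theta, 0) - y(\tilde\theta, 0) &= R e_r(\theta) - R e_r(\tilde\theta) \tag{C.4} \\
&= R e_r(\theta) - R\big(\cos(\tilde\theta - \theta)\, e_r(\theta) + \sin(\tilde\theta - \theta)\, e_\theta(\theta)\big) \tag{C.5} \\
&= R(1 - \cos(\tilde\theta - \theta))\, e_r(\theta) - R \sin(\tilde\theta - \theta)\, e_\theta(\theta). \tag{C.6}
\end{align}
Thus, the denominator in $K_\delta(y(\theta, 0), y(\tilde\theta, 0))$ is
\begin{align}
\big(|y(\theta, 0) - y(\tilde\theta, 0)|^2 + \delta^2\big)^{3/2} &= \big(R^2 (1 - \cos(\tilde\theta - \theta))^2 + R^2 \sin^2(\tilde\theta - \theta) + \delta^2\big)^{3/2} \tag{C.7} \\
&= \big(2R^2 (1 - \cos(\tilde\theta - \theta)) + \delta^2\big)^{3/2} \tag{C.8} \\
&= R^3 \big(2(1 - \cos(\tilde\theta - \theta)) + (\delta/R)^2\big)^{3/2}. \tag{C.9}
\end{align}
The partial derivative term in the integrand is, using (B.3) and (B.2),
\begin{align}
\frac{\partial}{\partial \tilde\theta} R e_r(\tilde\theta) &= R e_\theta(\tilde\theta) \tag{C.10} \\
&= R\big(-\sin(\tilde\theta - \theta)\, e_r(\theta) + \cos(\tilde\theta - \theta)\, e_\theta(\theta)\big). \tag{C.11}
\end{align}
Dropping the $\theta$ dependence of the basis vectors, since all of the vectors are based at $\theta$, we have
\begin{align}
\big(y(\theta, 0) - y(\tilde\theta, 0)\big) \times \frac{\partial y}{\partial \tilde\theta}(\tilde\theta, 0) &= R^2 \big((1 - \cos(\tilde\theta - \theta))\, e_r - \sin(\tilde\theta - \theta)\, e_\theta\big) \times \big(-\sin(\tilde\theta - \theta)\, e_r + \cos(\tilde\theta - \theta)\, e_\theta\big) \tag{C.12} \\
&= R^2 \big((1 - \cos(\tilde\theta - \theta)) \cos(\tilde\theta - \theta) - \sin^2(\tilde\theta - \theta)\big)\, e_z \tag{C.13} \\
&= R^2 \big(\cos(\tilde\theta - \theta) - 1\big)\, e_z. \tag{C.14}
\end{align}
Thus, the velocity of the filament is
\begin{align}
U &= \frac{e_z}{4\pi R} \int_0^{2\pi} \frac{1 - \cos(\tilde\theta - \theta)}{\big(2(1 - \cos(\tilde\theta - \theta)) + (\delta/R)^2\big)^{3/2}}\, d\tilde\theta \tag{C.15} \\
&= \frac{e_z}{4\pi R} \int_0^{2\pi} \frac{1 - \cos\theta}{\big(2(1 - \cos\theta) + (\delta/R)^2\big)^{3/2}}\, d\theta, \tag{C.16}
\end{align}
as stated in Section 2.3.2.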
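As a numerical cross-check of this derivation, the following Python sketch evaluates (C.16) with the trapezoid rule and compares it with a direct discretization of (C.2) at the point $(R, 0, 0)$. It assumes unit circulation and the kernel form $K_\delta(x, y) = (x - y)/(|x - y|^2 + \delta^2)^{3/2}$ with the prefactor $-1/(4\pi)$ in front of the integral; the function names, the node count `n`, and the sample values of `R` and `delta` are illustrative choices, not taken from the thesis.

```python
import math

def speed_from_formula(R, delta, n=2000):
    """Trapezoid-rule approximation of the integral in (C.16)."""
    dtheta = 2.0 * math.pi / n
    total = 0.0
    for j in range(n):
        one_minus_cos = 1.0 - math.cos(j * dtheta)
        total += one_minus_cos / (2.0 * one_minus_cos + (delta / R) ** 2) ** 1.5
    return total * dtheta / (4.0 * math.pi * R)

def speed_from_biot_savart(R, delta, n=2000):
    """z-velocity at x = (R, 0, 0) from a direct discretization of (C.2)
    on the circle y(theta) = (R cos theta, R sin theta, 0)."""
    dtheta = 2.0 * math.pi / n
    x = (R, 0.0, 0.0)
    vz = 0.0
    for j in range(n):
        th = j * dtheta
        y = (R * math.cos(th), R * math.sin(th), 0.0)
        tangent = (-R * math.sin(th), R * math.cos(th), 0.0)   # dy/dtheta
        d = (x[0] - y[0], x[1] - y[1], x[2] - y[2])
        denom = (d[0] ** 2 + d[1] ** 2 + d[2] ** 2 + delta ** 2) ** 1.5
        # z-component of (x - y) x dy/dtheta; the other components vanish here
        cross_z = d[0] * tangent[1] - d[1] * tangent[0]
        vz += cross_z / denom
    return -vz * dtheta / (4.0 * math.pi)

# Illustrative parameters (assumptions): ring radius and smoothing parameter.
R, delta = 2.0, 0.1
u_formula = speed_from_formula(R, delta)
u_direct = speed_from_biot_savart(R, delta)
print(u_formula, u_direct)
```

Both routines use the same quadrature nodes, so the two values should agree to rounding error. Because the integrand is smooth and periodic, the trapezoid rule converges quickly once the node spacing resolves the scale $\delta/R$.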
C.2 Linearized Evolution Equations for Perturbation
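When a perturbation $p(\theta, t)$ is added to a circular vortex filament, the perturbation satisfies an integro-differential equation. As described in Section 2.3.2, we expand $p(\theta, t)$ in terms of the cylindrical basis $e_r(\theta)$, $e_\theta(\theta)$ and $e_z$, obtaining
\[ p(\theta, t) = p_r(\theta, t)\, e_r(\theta) + p_\theta(\theta, t)\, e_\theta(\theta) + p_z(\theta, t)\, e_z. \tag{C.17} \]
When the evolution equation for $p(\theta, t)$ is linearized about the steady solution $p(\theta, t) = 0$, the result is
\begin{align}
\frac{\partial p_r}{\partial t}(\theta) &= \frac{1}{4\pi R^2} \int_0^{2\pi} \frac{\sin(\tilde\theta - \theta)\, \frac{\partial p_z}{\partial \tilde\theta}(\tilde\theta) + \cos(\tilde\theta - \theta)\big(p_z(\theta) - p_z(\tilde\theta)\big)}{\big(2(1 - \cos(\tilde\theta - \theta)) + (\delta/R)^2\big)^{3/2}}\, d\tilde\theta, \tag{C.18a} \\
\frac{\partial p_\theta}{\partial t}(\theta) &= \frac{1}{4\pi R^2} \int_0^{2\pi} \frac{(1 - \cos(\tilde\theta - \theta))\, \frac{\partial p_z}{\partial \tilde\theta}(\tilde\theta) + \sin(\tilde\theta - \theta)\big(p_z(\theta) - p_z(\tilde\theta)\big)}{\big(2(1 - \cos(\tilde\theta - \theta)) + (\delta/R)^2\big)^{3/2}}\, d\tilde\theta, \tag{C.18b} \\
\frac{\partial p_z}{\partial t}(\theta) &= \frac{1}{4\pi R^2} \int_0^{2\pi} \Bigg[ \frac{p_r(\tilde\theta) - p_r(\theta) + (1 - \cos(\tilde\theta - \theta))\big(p_r(\theta) + p_r(\tilde\theta) + \frac{\partial p_\theta}{\partial \tilde\theta}(\tilde\theta)\big)}{\big(2(1 - \cos(\tilde\theta - \theta)) + (\delta/R)^2\big)^{3/2}} \notag \\
&\qquad - \frac{\sin(\tilde\theta - \theta)\big(p_\theta(\theta) - p_\theta(\tilde\theta) + \frac{\partial p_r}{\partial \tilde\theta}(\tilde\theta)\big)}{\big(2(1 - \cos(\tilde\theta - \theta)) + (\delta/R)^2\big)^{3/2}} \notag \\
&\qquad - 3\, \frac{(1 - \cos(\tilde\theta - \theta))^2 \big(p_r(\theta) + p_r(\tilde\theta)\big)}{\big(2(1 - \cos(\tilde\theta - \theta)) + (\delta/R)^2\big)^{5/2}} \notag \\
&\qquad + 3\, \frac{(1 - \cos(\tilde\theta - \theta)) \sin(\tilde\theta - \theta)\big(p_\theta(\theta) - p_\theta(\tilde\theta)\big)}{\big(2(1 - \cos(\tilde\theta - \theta)) + (\delta/R)^2\big)^{5/2}} \Bigg]\, d\tilde\theta. \tag{C.18c}
\end{align}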
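The structure of (C.18) can be traced to a first-order expansion of the integrand in (C.2). The following is a sketch of that step, not the author's own presentation. It assumes the integrand has the form $(x - y) \times \partial y/\partial\tilde\theta$ divided by $(|x - y|^2 + \delta^2)^{3/2}$, writes $\tilde\theta$ for the variable of integration, and introduces the shorthand $d_0$, $t_0$, $q$ and $D_0$, none of which appear in the thesis.

```latex
% Shorthand (assumed, for illustration only):
%   d_0 = y_0(\theta) - y_0(\tilde\theta)          unperturbed separation, see (C.6)
%   t_0 = \partial y_0 / \partial\tilde\theta       unperturbed tangent, see (C.11)
%   q   = p(\theta) - p(\tilde\theta)               difference of the perturbation
%   D_0 = |d_0|^2 + \delta^2 = R^2 ( 2(1 - \cos(\tilde\theta - \theta)) + (\delta/R)^2 )
\[
\frac{(d_0 + q) \times \big(t_0 + \partial_{\tilde\theta}\, p(\tilde\theta)\big)}
     {\big(|d_0 + q|^2 + \delta^2\big)^{3/2}}
= \frac{d_0 \times t_0}{D_0^{3/2}}
+ \frac{q \times t_0 + d_0 \times \partial_{\tilde\theta}\, p(\tilde\theta)}{D_0^{3/2}}
- 3\, \frac{(d_0 \cdot q)\,(d_0 \times t_0)}{D_0^{5/2}}
+ O(|p|^2).
\]
```

The zeroth-order term reproduces the steady translation (C.15). The middle term supplies the contributions with exponent 3/2. The last term supplies those with exponent 5/2, and since $d_0 \times t_0$ is parallel to $e_z$ by (C.14), it enters only the equation for $p_z$, consistent with (C.18c).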
BIBLIOGRAPHY
[1] C. Anderson and C. Greengard. The vortex ring merger problem at infinite Reynolds number. Comm. Pure Appl. Math., 42(8):1123-1139, 1989.
[2] C. R. Anderson. An implementation of the fast multipole method without multipoles. SIAM J. Sci. Statist. Comput., 13(4):923-947, 1992.
[3] A. Appel. An efficient program for many-body simulation. SIAM J. Sci. Statist. Comput., 6(1):85-103, 1985.
[4] H. Aref and I. Zawadzki. Linking of vortex rings. Nature, 354(6348):50-53, 1991.
[5] J. Barnes and P. Hut. A hierarchical O(N log N) force-calculation algorithm. Nature, 324(6096):446-449, 1986.
[6] G. K. Batchelor. An Introduction to Fluid Dynamics. Cambridge University Press, 1967.
[7] J. T. Beale and A. Majda. High order accurate vortex methods with explicit velocity kernels. J. Comput. Phys., 58(2):188-208, 1985.
[8] G. Birkhoff. Helmholtz and Taylor instability. In Proc. Sympos. Appl. Math., Vol. XIII, pages 55-76, 1962.
[9] G. L. Brown and A. Roshko. On density effects and large structure in turbulent mixing layers. J. Fluid Mech., 64:775-816, 1974.
[10] R. Caflisch. Mathematical analysis of vortex dynamics. In Mathematical aspects of vortex dynamics (Leesburg, VA, 1988), pages 1-24, 1989.
[11] J. Carrier, L. Greengard, and V. Rokhlin. A fast adaptive multipole algorithm for particle simulations. SIAM J. Sci. Statist. Comput., 9(4):669-686, 1988.
[12] A. J. Chorin. The evolution of a turbulent vortex. Comm. Math. Phys., 83(4):517-535, 1982.
[13] A. J. Chorin. Hairpin removal in vortex interactions. J. Comput. Phys., 91(1):1-21, 1990.
[14] A. J. Chorin. Hairpin removal in vortex interactions II. J. Comput. Phys., 107(1):1-9, 1993.
[15] A. J. Chorin and P. S. Bernard. Discretization of a vortex sheet, with an example of roll-up. J. Comput. Phys., 13(3):423-429, 1973.
[16] S. C. Crow. Stability theory for a pair of trailing vortices. AIAA J., 8(12):2172-2179, 1970.
[17] J. Delort. Existence de nappes de tourbillon en dimension deux. J. Amer. Math. Soc., 4(3):553-586, 1991.
[18] M. R. Dhanak and B. de Bernardinis. The evolution of an elliptic vortex ring. J. Fluid Mech., 109:189-216, 1981.
[19] N. Didden. Investigation of laminar, unstable vortex rings by means of laser-Doppler anemometry. Mitt. Max-Planck-Institut Strömungsforschung Aero. Versuch., 64, 1977.
[20] N. Didden. On the formation of vortex rings: rolling-up and production of circulation. Z. Angew. Math. Phys., 30:101-116, 1979.
[21] C. Draghicescu and M. Draghicescu. A fast algorithm for vortex blob interactions. J. Comput. Phys., 116(1):69-78, 1995.
[22] A. Erdélyi, W. Magnus, F. Oberhettinger, and F. Tricomi. Higher Transcendental Functions, volume II. McGraw-Hill, 1953.
[23] V. M. Fernandez, N. J. Zabusky, and V. M. Gryanik. Vortex intensification and collapse of the Lissajous-elliptic ring: single- and multi-filament Biot-Savart simulations and visiometrics. J. Fluid Mech., 299:289-331, 1995.
[24] L. Greengard and V. Rokhlin. A fast algorithm for particle simulations. J. Comput. Phys., 73(2):325-348, 1987.
[25] L. Greengard and V. Rokhlin. The rapid evaluation of potential fields in three dimensions. In C. Anderson and C. Greengard, editors, Vortex methods (Los Angeles, CA, 1987), number 1360 in Lecture Notes in Mathematics, pages 121-141. Springer-Verlag, 1988.
[26] L. Greengard and V. Rokhlin. A new version of the fast multipole method for the Laplace equation in three dimensions. Research Report 1115, Yale University Department of Computer Science, 1996.
[27] R. W. Hockney and J. W. Eastwood. Computer Simulations Using Particles. McGraw-Hill, New York, 1981.
[28] Y. Kaneda. A representation of the motion of a vortex sheet in a three-dimensional flow. Phys. Fluids A, 2(3):458-461, 1990.
[29] B. W. Kernighan and D. M. Ritchie. The C Programming Language. Prentice Hall, 2nd edition, 1988.
[30] S. Kida, M. Takaoka, and F. Hussain. Reconnection of two vortex rings. Phys. Fluids A, 1(4):630-632, 1989.
[31] S. Kida, M. Takaoka, and F. Hussain. Collision of two vortex rings. J. Fluid Mech., 230:583-646, 1991.
[32] O. Knio and A. Ghoniem. Numerical study of a three-dimensional vortex method. J. Comput. Phys., 86(1):75-106, 1990.
[33] R. Krasny. Desingularization of periodic vortex sheet roll-up. J. Comput. Phys., 65(2):292-313, 1986.
[34] R. Krasny. A study of singularity formation in a vortex sheet by the point-vortex approximation. J. Fluid Mech., 167:65-93, 1986.
[35] C. H. Krutzsch. Über eine experimentell beobachtete Erscheinung an Wirbelringen bei ihrer translatorischen Bewegung in wirklichen Flüssigkeiten. Ann. Phys., 35(5):497-523, 1939.
[36] H. Lamb. Hydrodynamics. Dover Publications, New York, 6th edition, 1945.
[37] A. Leonard. Computing three-dimensional incompressible flows with vortex elements. Ann. Rev. Fluid. Mech., 17:523-559, 1985.
[38] A. Lifschitz, W. Suters, and J. T. Beale. The onset of instability in exact vortex rings with swirl. J. Comput. Phys., 129(1):8-29, 1996.
[39] J. Liu and Z. Xin. Convergence of vortex methods for weak solutions to the 2-D Euler equations with vortex sheet data. Comm. Pure Appl. Math., 48(6):611-628, 1995.
[40] A. J. Majda. The interaction of nonlinear analysis and modern applied mathematics. In Proceedings of the International Congress of Mathematicians, Vol. I, II (Kyoto, 1990), pages 175-191, 1991.
[41] A. J. Majda. Remarks on weak solutions for vortex sheets with a distinguished sign. Indiana Univ. Math. J., 42(3):921-939, 1993.
[42] T. Maxworthy. The structure and stability of vortex rings. J. Fluid Mech., 51:15-32, 1972.
[43] T. Maxworthy. Some experimental studies of vortex rings. J. Fluid Mech., 81:465-495, 1977.
[44] E. Meiburg, J. C. Lasheras, and J. E. Martin. Experimental and numerical analysis of the three-dimensional evolution of an axisymmetric jet. In Turbulent Shear Flows 7 (Stanford University, USA, 1989), pages 195-208, 1991.
[45] D. W. Moore. Finite amplitude waves on aircraft trailing vortices. Aeronautical Quarterly, 23:307-314, 1972.
[46] D. W. Moore. The spontaneous appearance of a singularity in the shape of an evolving vortex sheet. Proc. Roy. Soc. London Ser. A, 365(1720):105-119, 1979.
[47] M. Nitsche and R. Krasny. A numerical study of vortex ring formation at the edge of a circular tube. J. Fluid Mech., 276:139-161, 1994.
[48] D. I. Pullin. The large-scale structure of unsteady self-similar rolled-up vortex sheets. J. Fluid Mech., 88(3):401-430, 1978.
[49] A. Pumir and R. M. Kerr. Numerical simulation of interacting vortex tubes. Phys. Rev. Lett., 58(16):1636-1639, 1987.
[50] A. Pumir and E. D. Siggia. Vortex dynamics and the existence of solutions to the Navier-Stokes equations. Phys. Fluids, 30(6):1606-1626, 1987.
[51] L. Rosenhead. The spread of vorticity in the wake behind a cylinder. Proc. Roy. Soc. Ser. A, 127:590-612, 1930.
[52] P. G. Saffman. The number of waves on unstable vortex rings. J. Fluid Mech., 84(4):625-639, 1978.
[53] P. G. Saffman. A model of vortex reconnection. J. Fluid Mech., 212:395-402, 1990.
[54] J. K. Salmon and M. S. Warren. Skeletons from the treecode closet. J. Comput. Phys., 111(1):136-155, 1994.
[55] P. R. Schatzle. An experimental study of fusion of vortex rings. PhD thesis, California Institute of Technology, 1987.
[56] K. Shariff and A. Leonard. Vortex rings. Ann. Rev. Fluid. Mech., 24:235-279, 1992.
[57] G. I. Taylor. Formation of a vortex ring by giving an impulse to a circular disk and then dissolving it away. J. Appl. Phys., 24(1):104, 1953.
[58] J. J. Thomson and H. F. Newall. On the formation of vortex rings by drops falling into liquids, and some allied phenomena. Proc. Roy. Soc. Ser. A, 39:417-436, 1885.
[59] G. Tryggvason, W. J. A. Dahm, and K. Sbeih. Fine structure of vortex sheet rollup by viscous and inviscid simulation. J. Fluids Eng., 113(1):31-36, 1991.
[60] L. van Dommelen and E. A. Rundensteiner. Fast, adaptive summation of point forces in the two-dimensional Poisson equation. J. Comput. Phys., 83(1):126-147, 1989.
[61] S. E. Widnall, D. B. Bliss, and C. Tsai. The instability of short waves on a vortex ring. J. Fluid Mech., 66:35-47, 1974.
[62] S. E. Widnall and J. P. Sullivan. On the stability of vortex rings. Proc. Roy. Soc. London Ser. A, 332:335-353, 1973.
[63] G. S. Winckelmans. Topics in vortex methods for the computation of three- and two-dimensional incompressible unsteady flows. PhD thesis, California Institute of Technology, 1989.
[64] G. S. Winckelmans, J. K. Salmon, A. Leonard, and M. S. Warren. Three-dimensional vortex particle and panel methods: fast tree-code solvers with active error control for arbitrary distributions/geometries. In Forum on Vortex Methods for Engineering Applications (Albuquerque, NM, 1995), pages 23-43, 1995.
[65] F. Zhao. An O(N) algorithm for three-dimensional N-body simulations. Master's thesis, Massachusetts Institute of Technology, 1987.
ABSTRACT
A THREE-DIMENSIONAL CARTESIAN TREE-CODE AND APPLICATIONS TO VORTEX SHEET ROLL-UP
by Keith Lindsay
Chair: Robert Krasny
An algorithm is presented for the rapid computation of vortex sheet motion in three-dimensional fluid flow. The equations governing vortex sheet motion, considered in Lagrangian form, are desingularized and discretized, resulting in a system of equations for the $N$ discretizing particles. Since the particles interact pairwise, evaluating the velocities by direct summation requires $O(N^2)$ operations, which becomes prohibitively expensive as $N$ increases. Based on measured execution times, the new algorithm computes the particle interactions with $O(N \log N)$ operations. The additional memory required by the algorithm is less than 60% of the memory used by a direct summation algorithm. The algorithm extends Draghicescu's algorithm from two to three space dimensions. The main ingredients are the replacement of particle-particle interactions with particle-cluster interactions which are based on Cartesian Taylor series expansions and the use of an adaptive tree-based subdivision of space to create the particle clusters. An important feature of the algorithm is the use of recurrences to compute the expansion coefficients. The recurrences are a generalization of those used by Draghicescu. The new features of the algorithm are its application to a non-harmonic three-dimensional kernel, its adaptive subdivision of space and adaptive error control.

The algorithm is used to study the dynamics of vortex rings which are modeled as rolling-up vortex sheets. An adaptive point insertion algorithm is used to ensure that the vortex sheets are accurately resolved as they stretch. The problems considered are azimuthal vortex ring instabilities, the evolution of an elliptical vortex ring, and the collision of two vortex rings. In the last problem, the vorticity in the rings appears to connect, due to superposition, even though the vortex sheet model does not explicitly account for viscous effects and the sheets themselves do not connect.

