FINITE STRIP METHOD FOR STRUCTURAL ANALYSIS ON PARALLEL COMPUTERS

PERFORMANCE OF THE FINITE STRIP METHOD FOR STRUCTURAL ANALYSIS ON A PARALLEL COMPUTER
HSIN-CHU CHEN† AND AI-FANG HE‡

Abstract. In this paper, we address the efficiency and parallelizability of the finite strip method for the analysis of certain structural problems. This method, first developed in the context of thin plate bending analysis, is a semi-analytical finite element process. It takes advantage of orthogonality properties of harmonic functions in the stiffness matrix formulation to yield a block diagonal stiffness matrix. The method not only simplifies the data manipulation in forming the matrix but also reduces the bandwidth of the assembled matrix, resulting in great savings in the solution process. Furthermore, the block diagonal structure of the matrix offers obvious advantages for solving the problem on parallel computers. Numerical experiments and the performance of this method on an Alliant FX/8 minisupercomputer are reported. Some parallel implementations of this method on the University of Illinois Cedar multiprocessor are presented.

1. Introduction. Recent developments in supercomputing technology and the availability of multiprocessors have encouraged researchers, on the one hand, to search for new parallel algorithms and, on the other, to reexamine existing methods with potential for efficient engineering and scientific computations. In this paper we address the efficiency and parallelizability of the finite strip method for the analysis of certain structural problems and outline its parallel implementations on the Alliant FX/8 and the University of Illinois Cedar multiprocessor. Numerical experiments and the performance of this method on an Alliant FX/8 minisupercomputer are reported.

The finite strip method, first developed and named by Cheung [Cheu68a] in the context of thin plate bending analysis, is a semi-analytical finite element process suitable for problems whose geometry and material properties do not vary along one or more directions [Zien77]. The method discretizes the problem domain into finite strips and approximates the true solution by a continuous harmonic function series, which satisfies certain boundary conditions in one direction, combined with piecewise interpolation polynomials in the others. It takes advantage of the orthogonality properties of the harmonic functions in the stiffness (mass) matrix formulation to yield a block diagonal system stiffness (mass) matrix. The method therefore not only simplifies the data manipulation in forming the matrix, but also reduces the bandwidth of the assembled matrix, resulting in considerable savings in the solution process. Furthermore, the block diagonal structure of the matrix offers obvious advantages for solving the problem in parallel on a multiprocessor machine.

This work was supported by the U.S. Department of Energy under Grant No. DOE DE-FG02-85ER25001 and the National Science Foundation under Grant No. NSF CCR-8717942.
† Center for Supercomputing Research and Development, University of Illinois at Urbana-Champaign, 305 Talbot Laboratory, 104 South Wright St., Urbana, IL 61801-2932, USA.
‡ Beijing Institute of Computer Applications and Simulation Technology, Beijing, PRC.
Fig. 2.1. The coordinate system and sign convention.
Although not as versatile as the finite element method, the finite strip method has been applied to a wide range of plate, folded plate, shell, and bridge deck problems [BeHi76, Cheu68a, Cheu68b, Cheu76, CuLo74, OñSu83a] on sequential computers due to its efficiency and simplicity. Because our main purpose is to exploit its potential for inducing large grain parallelism, we confine our discussion to rectangular Mindlin plate problems simply supported on two opposite sides. For other types of plate boundary conditions, the finite strip method may still be used [Cheu76]; the block diagonal structure of the matrix, however, cannot be expected in general.

2. The problem. In this section we briefly describe the mathematical modeling of the Mindlin plate problems [Mind51]. Let $\Omega$ be the domain in $\mathbb{R}^2$ and $\Gamma$ be its boundary. Let also the stress resultant, generalized strain, displacement, and surface loading vectors be denoted, respectively, by $\sigma$, $\varepsilon$, $\delta$, and $p$, where
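\[
\sigma = \begin{bmatrix} M_x \\ M_y \\ M_{xy} \\ Q_x \\ Q_y \end{bmatrix}, \quad
\varepsilon = \begin{bmatrix} \kappa_x \\ \kappa_y \\ \kappa_{xy} \\ \gamma_{xz} \\ \gamma_{yz} \end{bmatrix}, \quad
\delta = \begin{bmatrix} w \\ \theta_x \\ \theta_y \end{bmatrix}, \quad \text{and} \quad
p = \begin{bmatrix} q \\ m_x \\ m_y \end{bmatrix}.
\]

The subscripts $x$, $y$, and $z$ represent the directions in the Cartesian coordinate system. The sign convention for the displacements and external loadings is shown in Figure 2.1. The differential equations which govern the state of stress resultants, generalized strains, and displacements in an elastic plate can be expressed as:

1. Equilibrium equations: $L_1^T \sigma + p = 0$ in $\Omega$, subject to appropriate boundary conditions on $\Gamma$,
2. Stress-strain equations: $\sigma = D \varepsilon$, and
3. Strain-displacement equations: $\varepsilon = L_2 \delta$.

Here $D$ is the material property matrix of an elastic plate. $L_1$ and $L_2$ are the differential operators:

\[
L_1^T = \begin{bmatrix}
0 & 0 & 0 & \partial/\partial x & \partial/\partial y \\
\partial/\partial x & 0 & \partial/\partial y & -1 & 0 \\
0 & \partial/\partial y & \partial/\partial x & 0 & -1
\end{bmatrix} \tag{2.1}
\]

and

\[
L_2^T = \begin{bmatrix}
0 & 0 & 0 & \partial/\partial x & \partial/\partial y \\
-\partial/\partial x & 0 & -\partial/\partial y & -1 & 0 \\
0 & -\partial/\partial y & -\partial/\partial x & 0 & -1
\end{bmatrix} \tag{2.2}
\]

where the superscript $T$ denotes the transpose of a matrix. For orthotropic material, the matrix $D$ takes the form

\[
D = \begin{bmatrix}
D_x & D_1 & & & \\
D_1 & D_y & & & \\
 & & D_{xy} & & \\
 & & & \alpha G_x & \\
 & & & & \alpha G_y
\end{bmatrix} \tag{2.3}
\]

where $D_x$, $D_1$, ..., $G_y$ are the standard flexural and shear rigidities of plates and $\alpha$ is a modification coefficient to account for the deviation of shear strain distribution from uniformity [BeHi76] ($\alpha = 5/6$ for a rectangular cross section; see [TiGe72, p. 371]). The rest of the entries in $D$ are zero. If the material is isotropic, then the nonzero entries take the following values:

\[
D_x = D_y = \frac{E t^3}{12(1 - \nu^2)}, \quad
D_1 = \nu D_x, \quad
D_{xy} = \frac{1 - \nu}{2} D_x, \quad \text{and} \quad
G_x = G_y = \frac{E t}{2(1 + \nu)}, \tag{2.4}
\]

where $E$, $t$, and $\nu$ represent the material modulus, plate thickness, and Poisson's ratio, respectively. The total potential energy $\Pi$ of the plate due to the surface loading $p$ is [Mind51, OñSu83b]

\[
\Pi = \frac{1}{2} \int_\Omega \varepsilon^T \sigma \, d\Omega - \int_\Omega \delta^T p \, d\Omega. \tag{2.5}
\]

Using the stress-strain and strain-displacement relationships, (2.5) can be rewritten as

\[
\Pi = \frac{1}{2} \int_\Omega (L_2 \delta)^T D (L_2 \delta) \, d\Omega - \int_\Omega \delta^T p \, d\Omega. \tag{2.6}
\]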
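As a small illustration of (2.3) and (2.4), the isotropic material matrix can be assembled directly from the three material constants. The sketch below is not taken from the paper: the function name `isotropic_rigidity` and the argument name `alpha` (the shear modification coefficient, 5/6 for a rectangular cross section) are chosen here for illustration, and the sample values are the plate data quoted in the numerical experiments.

```python
def isotropic_rigidity(E, t, nu, alpha=5.0 / 6.0):
    """Return the 5x5 matrix D of (2.3) for an isotropic plate, using (2.4).

    alpha is the shear modification coefficient (an illustrative argument
    name); all entries not listed in (2.3) are zero.
    """
    Dx = E * t**3 / (12.0 * (1.0 - nu**2))  # flexural rigidity, D_x = D_y
    D1 = nu * Dx                            # coupling rigidity
    Dxy = 0.5 * (1.0 - nu) * Dx             # twisting rigidity
    G = E * t / (2.0 * (1.0 + nu))          # shear rigidity, G_x = G_y
    return [
        [Dx, D1, 0.0, 0.0, 0.0],
        [D1, Dx, 0.0, 0.0, 0.0],
        [0.0, 0.0, Dxy, 0.0, 0.0],
        [0.0, 0.0, 0.0, alpha * G, 0.0],
        [0.0, 0.0, 0.0, 0.0, alpha * G],
    ]


# Plate data from the numerical experiments: E = 432000 ksf, t = 0.5 ft, nu = 0.17.
D = isotropic_rigidity(432000.0, 0.5, 0.17)
```

Because $D$ couples only the two bending curvatures, the remaining rows act independently, which is what later allows each harmonic to be treated on its own.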
Fig. 3.2. A discretized plate.
3. The strip element for the Mindlin plates. We now outline the formulation of the finite strip method for the Mindlin plates using linear elements [BeHi76, OñSu83b]. Figure 3.2 shows a rectangular plate discretized into $n - 1$ finite strips. The plate is assumed to be simply supported on edges $y = 0$ and $y = L_y$. Shown in Figure 3.3 is the mid-plane of a typical linear strip plate element of constant thickness $t$, whose local coordinate system is denoted by $(x', y', z')$ where $x' = x - x_i$, $y' = y$, and $z' = z$. In the finite strip formulation, the displacements are approximated by linear combinations of products of continuous harmonic functions of $y$ and piecewise interpolation polynomials of $x$. Let $\Omega^{(e)}$ be the domain of the $e$th strip element and $i$ and $j$ be the two longitudinal edge (nodal line) numbers associated with the element as shown in Figure 3.3. Let $\delta^{(e)}(x, y)$ and $u^l_{(e)}$ be defined as:
\[
\delta^{(e)}(x, y) = \begin{bmatrix} w(x, y) & \theta_x(x, y) & \theta_y(x, y) \end{bmatrix}^T, \quad (x, y) \in \Omega^{(e)},
\]

and

\[
u^l_{(e)} = \begin{bmatrix} u^l_i \\ u^l_j \end{bmatrix}
= \begin{bmatrix} w^l_i & \theta^l_{xi} & \theta^l_{yi} & w^l_j & \theta^l_{xj} & \theta^l_{yj} \end{bmatrix}^T,
\]

where $w^l_i$ denotes the $l$th harmonic coefficient of $w_i(y)$, etc. For a linear strip element with $m$ harmonic terms specified, the approximation to $\delta^{(e)}$ is given [BeHi76, OñSu83a] by

\[
\delta^{(e)} = \sum_{l=1}^{m} F^l u^l_{(e)} \tag{3.7}
\]

with

\[
F^l = \begin{bmatrix}
N_i S_l & 0 & 0 & N_j S_l & 0 & 0 \\
0 & N_i S_l & 0 & 0 & N_j S_l & 0 \\
0 & 0 & N_i C_l & 0 & 0 & N_j C_l
\end{bmatrix}
\]

where $S_l$ and $C_l$ are the $l$th harmonic functions of $y$, and $N_i$ and $N_j$ are the linear shape functions of $x$, defined by

\[
S_l = \sin\frac{l \pi y}{L_y}, \quad C_l = \cos\frac{l \pi y}{L_y}, \quad
N_i = \frac{1 - r^{(e)}}{2}, \quad \text{and} \quad N_j = \frac{1 + r^{(e)}}{2},
\]

where $r^{(e)}$, ranging from $-1$ to $1$, is the natural coordinate in the $x$-direction of the $e$th element. Note that $r^{(e)} = -1 + 2\,\dfrac{x - x_i}{x_j - x_i}$ for the element shown in Figure 3.3. It should be observed that the approximation to the displacement vector in (3.7) satisfies the simply supported boundary conditions on edges $y = 0$ and $y = L_y$; i.e., $w$, $\theta_x$, $\partial w/\partial x$, $\partial \theta_x/\partial x$, and $\partial \theta_y/\partial y$ all vanish on these two edges.

Fig. 3.3. A typical plate strip element.

The surface loading on the $e$th element, $p^{(e)}$, can often be approximated by the sum of a harmonic series in the longitudinal direction as shown below:

\[
p^{(e)}(x, y) = \sum_{l=1}^{m} H^l(y)\, p^l_{(e)}(x) \tag{3.8}
\]

where

\[
H^l = \mathrm{diag}\,[\, S_l,\ S_l,\ C_l \,] \quad \text{and} \quad
p^l_{(e)} = \begin{bmatrix} q^l & m^l_x & m^l_y \end{bmatrix}^T_{(e)}.
\]
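The subscript $(e)$ outside the brackets indicates that every component of the vector is associated only with the $e$th element. To obtain the strip stiffness matrix and the force vector, (2.6) is now split into integrals over each individual strip element:

\[
\Pi = \sum_{e=1}^{N_s} \Pi^{(e)}
= \sum_{e=1}^{N_s} \left[ \frac{1}{2} \int_{\Omega^{(e)}} (L_2 \delta^{(e)})^T D (L_2 \delta^{(e)}) \, d\Omega^{(e)}
- \int_{\Omega^{(e)}} (\delta^{(e)})^T p^{(e)} \, d\Omega^{(e)} \right] \tag{3.9}
\]

where $N_s$ is the total number of strip elements. Substitution of (3.7) and (3.8) into (3.9) yields

\[
\Pi = \sum_{e=1}^{N_s} \left[ \frac{1}{2} \int_{\Omega^{(e)}} \Big( \sum_{l=1}^{m} B^l u^l_{(e)} \Big)^T D \Big( \sum_{l=1}^{m} B^l u^l_{(e)} \Big) d\Omega^{(e)}
- \int_{\Omega^{(e)}} \Big( \sum_{l=1}^{m} F^l u^l_{(e)} \Big)^T \Big( \sum_{l=1}^{m} H^l p^l_{(e)} \Big) d\Omega^{(e)} \right] \tag{3.10}
\]

where $B^l = L_2 F^l$ is a gradient matrix of order $5 \times 6$. By partitioning $B^l$ into $[\, B^l_i \;\; B^l_j \,]$, we can easily see that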
\[
B^l_i = \begin{bmatrix}
0 & -\dfrac{\partial N_i}{\partial x} S_l & 0 \\
0 & 0 & \Big(\dfrac{l\pi}{L_y}\Big) N_i S_l \\
0 & -\Big(\dfrac{l\pi}{L_y}\Big) N_i C_l & -\dfrac{\partial N_i}{\partial x} C_l \\
\dfrac{\partial N_i}{\partial x} S_l & -N_i S_l & 0 \\
\Big(\dfrac{l\pi}{L_y}\Big) N_i C_l & 0 & -N_i C_l
\end{bmatrix}
\]

and $B^l_j$ is the same as $B^l_i$ except that $N_i$ is replaced by $N_j$. Now let $u$ be the system displacement vector defined by

\[
u = \begin{bmatrix} u^1 \\ u^2 \\ \vdots \\ u^m \end{bmatrix}
\quad \text{where} \quad
u^l = \begin{bmatrix} u^l_1 \\ u^l_2 \\ \vdots \\ u^l_n \end{bmatrix}, \quad l = 1, \ldots, m,
\]
and the system force vector $f$ be consistently defined. The equilibrium algebraic equations are then obtained by minimizing the total potential energy $\Pi$ with respect to $u$; i.e.,

\[
\frac{\partial \Pi}{\partial u^l_i} = 0, \quad i = 1, \ldots, n \ \text{and} \ l = 1, \ldots, m. \tag{3.11}
\]

At this point, it is important to mention the following well-known orthogonality properties of the harmonic series:

\[
\int_0^L \sin\frac{l \pi y}{L} \sin\frac{m \pi y}{L} \, dy =
\begin{cases} L/2 & \text{for } l = m \\ 0 & \text{otherwise} \end{cases}
\]

and

\[
\int_0^L \cos\frac{l \pi y}{L} \cos\frac{m \pi y}{L} \, dy =
\begin{cases} L/2 & \text{for } l = m \\ 0 & \text{otherwise.} \end{cases}
\]

Following the standard finite element procedures and taking advantage of these orthogonality properties, we can generate a linear system of block diagonal structure depicted by:
\[
\begin{bmatrix}
K^{11} & & & \\
 & K^{22} & & \\
 & & \ddots & \\
 & & & K^{mm}
\end{bmatrix}
\begin{bmatrix} u^1 \\ u^2 \\ \vdots \\ u^m \end{bmatrix}
=
\begin{bmatrix} f^1 \\ f^2 \\ \vdots \\ f^m \end{bmatrix} \tag{3.12}
\]

where $K^{ll}$ is the stiffness matrix assembled from the strip stiffness matrix $K^{ll}_{(e)}$,

\[
K^{ll}_{(e)} = \int_{\Omega^{(e)}} (B^l)^T D B^l \, d\Omega^{(e)}, \quad l = 1, \ldots, m, \tag{3.13}
\]

and $f^l$ is the force vector assembled from the strip force vector $f^l_{(e)}$,

\[
f^l_{(e)} = \int_{\Omega^{(e)}} (F^l)^T H^l p^l_{(e)} \, d\Omega^{(e)}. \tag{3.14}
\]
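The step from the orthogonality properties to the block diagonal form can be made explicit. The following short derivation is a sketch of that reasoning, assuming the orthotropic form of $D$ in (2.3), in which only the first two rows are coupled; the cross-harmonic block $K^{lm}_{(e)}$ is notation introduced here for illustration.

```latex
% Cross-harmonic strip block coupling harmonics l and m:
K^{lm}_{(e)} = \int_{\Omega^{(e)}} (B^l)^T D \, B^m \, d\Omega^{(e)} .

% Rows 1, 2, and 4 of B^l carry the factor S_l; rows 3 and 5 carry C_l.
% D in (2.3) couples row 1 with row 2 and is diagonal otherwise, so every
% entry of the integrand is a function of x multiplied by either
% S_l S_m or C_l C_m. For l \neq m the orthogonality properties give
\int_0^{L_y} S_l S_m \, dy = \int_0^{L_y} C_l C_m \, dy = 0
\quad \Longrightarrow \quad K^{lm}_{(e)} = 0 ,

% while for l = m both integrals equal L_y / 2, leaving only K^{ll}_{(e)}.
% The same argument applied to (F^l)^T H^m shows that the force vector
% for harmonic l receives contributions only from p^l_{(e)}.
```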
For a plate discretized with $n$ nodal lines, $K^{ll}$ is a square matrix of order $3n$ for each $l$. However, the strip stiffness matrix $K^{ll}_{(e)}$ is of order 6 only. Once the system stiffness matrix and force vector are assembled, the remaining major work is to solve for the displacements from the algebraic linear system of (3.12).
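To make the order-6 strip matrix of (3.13) concrete, the sketch below evaluates it for one harmonic of a linear strip. It is an illustration under stated assumptions rather than the authors' code: the integration in $x$ uses one Gauss point at the strip midpoint (as in the reduced integration mentioned in the numerical experiments), the integration in $y$ uses the fact that $\sin^2$ and $\cos^2$ of the harmonic both integrate to $L_y/2$, and the row signs of the gradient matrix follow from $B^l = L_2 F^l$. The names `nodal_block` and `strip_stiffness` are hypothetical.

```python
import math


def nodal_block(dN, N, k):
    """x-dependent part of B^l for one nodal line (5 rows, 3 columns).

    dN is dN/dx, N is the shape function value, k = l*pi/Ly.
    The harmonic factor (S_l or C_l) of each row is integrated separately.
    """
    return [
        [0.0, -dN, 0.0],     # curvature in x           (factor S_l)
        [0.0, 0.0, k * N],   # curvature in y           (factor S_l)
        [0.0, -k * N, -dN],  # twisting curvature       (factor C_l)
        [dN, -N, 0.0],       # transverse shear, x-z    (factor S_l)
        [k * N, 0.0, -N],    # transverse shear, y-z    (factor C_l)
    ]


def strip_stiffness(D, a, Ly, l):
    """6x6 strip stiffness for harmonic l of a linear strip of width a."""
    k = l * math.pi / Ly
    Bi = nodal_block(-1.0 / a, 0.5, k)  # N_i = 1/2, dN_i/dx = -1/a at midpoint
    Bj = nodal_block(+1.0 / a, 0.5, k)  # N_j = 1/2, dN_j/dx = +1/a at midpoint
    B = [ri + rj for ri, rj in zip(Bi, Bj)]  # 5 x 6
    weight = a * Ly / 2.0  # (one-point Gauss weight in x) * (Ly/2 from y)
    return [
        [
            weight * sum(B[r][p] * D[r][s] * B[s][q]
                         for r in range(5) for s in range(5))
            for q in range(6)
        ]
        for p in range(6)
    ]


# Isotropic plate data from the numerical experiments; 64 strips across Lx = 48.
E, t, nu, alpha = 432000.0, 0.5, 0.17, 5.0 / 6.0
Dx = E * t**3 / (12.0 * (1.0 - nu**2))
G = E * t / (2.0 * (1.0 + nu))
D = [
    [Dx, nu * Dx, 0.0, 0.0, 0.0],
    [nu * Dx, Dx, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.5 * (1.0 - nu) * Dx, 0.0, 0.0],
    [0.0, 0.0, 0.0, alpha * G, 0.0],
    [0.0, 0.0, 0.0, 0.0, alpha * G],
]
K = strip_stiffness(D, a=48.0 / 64.0, Ly=32.0, l=1)
```

Assembling these 6 by 6 matrices along the naturally ordered nodal lines overlaps each one with its neighbor in a 3 by 3 block, which is where the block tridiagonal structure of each $K^{ll}$ comes from.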
4. Parallel implementations. Similar to the finite element method, the finite strip method consists basically of the following four computational procedures:

1. the generation of the strip stiffness matrix and strip force vector for each strip element;
2. the assemblage of the system stiffness matrix from the strip stiffness matrix and that of the system force vector from the strip force vector;
3. the solution process of the assembled linear system after the imposition of boundary conditions (if not imposed at the strip level); and
4. the calculation of displacements and stresses.
To address the parallel implementation of the finite strip method, we should emphasize the uncoupled structure of the system stiffness matrix depicted by (3.12), which results from the orthogonality properties among different harmonic terms. Clearly, this uncoupling leads to $m$ independent sets of simultaneous equations. Therefore, solving (3.12) is equivalent to solving

\[
K^{ll} u^l = f^l, \quad l = 1, \ldots, m,
\]

where each $K^{ll}$ is a block tridiagonal matrix with blocks of order only $3 \times 3$ for the ordering shown in Figure 3.2. In other words, each such matrix is narrowly banded, and the problem can therefore be solved very efficiently. Since there is no data dependency among these $m$ linear subsystems, not only can the generation of the strip stiffness matrix and strip force vector and the assemblage of the system stiffness matrix and system force vector for each harmonic term be performed independently, but all the linear subsystems can be solved in parallel as well. In a parallel computing environment with two levels of parallelism (including vectorization as the first level), such as the Alliant or the Cray X-MP/48, this special feature makes the finite strip method fully parallelizable when the number of harmonic terms matches the number of processors. Algorithm 4.1 outlines a simple parallel implementation of this method on the Alliant machine [Alli87].
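The independence of the $m$ subsystems can be illustrated with a short sketch in a present-day language. This is an analogue of the outer concurrent loop only, not a reproduction of the Alliant or Cedar codes: the scalar tridiagonal systems built by `make_system` are hypothetical stand-ins for the block tridiagonal $K^{ll} u^l = f^l$, and `solve_tridiagonal` is the standard Thomas algorithm without pivoting. In CPython a thread pool mainly shows the task structure; a process pool would be the natural substitute when real speedup is wanted.

```python
from concurrent.futures import ThreadPoolExecutor


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for one tridiagonal system (no pivoting).

    Row i reads lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1] = rhs[i];
    lower[0] and upper[-1] are unused.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def make_system(l, n):
    """Hypothetical stand-in for harmonic l: a diagonally dominant system."""
    return [-1.0] * n, [2.0 + l] * n, [-1.0] * n, [1.0] * n


def solve_harmonic(l, n=8):
    """Generate and solve the subsystem of one harmonic; no shared data."""
    return solve_tridiagonal(*make_system(l, n))


m = 4  # number of harmonic terms, one task per harmonic
with ThreadPoolExecutor(max_workers=m) as pool:
    solutions = list(pool.map(solve_harmonic, range(1, m + 1)))
```

Each task touches only its own matrix and right-hand side, so no locking is needed at this level; synchronization becomes a concern only when the work inside one harmonic is itself split among processors.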
Algorithm 4.1:

    CVD$L CNCALL                                        (an Alliant directive)
    DO l = 1, m                                         (concurrent)
       DO e = 1, Ns                                     (sequential)
          generate $K^{ll}_{(e)}$ and $f^l_{(e)}$       (vector)
          assemble $K^{ll}$ and $f^l$                   (vector)
       END DO
       solve $K^{ll} u^l = f^l$ using a vectorized solver
    END DO

The words concurrent and vector inside the parentheses in the algorithm indicate possible execution modes. For a machine with three levels of parallelism like the Cedar [DKLS86], each cluster can handle one or more subsystems with ease if the number of subsystems exceeds the number of clusters. In addition, all processors within each cluster can be used to generate the strip data, to assemble the system, and to solve a single subsystem by employing a parallel linear system solver. This parallel implementation is depicted by Algorithm 4.2, where SDOALL and CDOALL are two constructs in Cedar Fortran [Guzz87]. The SDOALL tells the compiler to spread tasks among clusters so that the tasks can be performed simultaneously. The CDOALL indicates that the iterations of the loop may be executed in parallel using different processors within each cluster task.

Algorithm 4.2:

    SDOALL l = 1, m                                     (concurrent)
       CDOALL e = 1, Ns                                 (concurrent with a lock)
          generate $K^{ll}_{(e)}$ and $f^l_{(e)}$       (vector)
          assemble $K^{ll}$ and $f^l$                   (critical region)
       END CDOALL
       solve $K^{ll} u^l = f^l$ using a vectorized parallel solver
    END SDOALL

It should be noted that some care must be taken when applying Algorithm 4.2 to avoid potential conflicts in memory access during the assemblage stage, because different elements may have some nodes in common, e.g., adjacent elements. For the plate shown in Figure 3.2 this can be accomplished easily by splitting the CDOALL loop into two CDOALL loops, one over the odd-numbered and one over the even-numbered elements, as shown by Algorithm 4.3, provided the elements are naturally ordered.

Algorithm 4.3:

    SDOALL l = 1, m                                     (concurrent)
       CDOALL e = 1, Ns, 2                              (concurrent)
          generate $K^{ll}_{(e)}$ and $f^l_{(e)}$       (vector)
          assemble $K^{ll}$ and $f^l$                   (vector)
       END CDOALL
       CDOALL e = 2, Ns, 2                              (concurrent)
          generate $K^{ll}_{(e)}$ and $f^l_{(e)}$       (vector)
          assemble $K^{ll}$ and $f^l$                   (vector)
       END CDOALL
       solve $K^{ll} u^l = f^l$ using a vectorized parallel solver
    END SDOALL

5. Numerical experiments. To demonstrate the effectiveness and parallelizability of the finite strip method, we consider the static analysis of a thin plate problem using the Mindlin theory. This plate is assumed to be 48 feet wide ($L_x = 48$) and 32 feet long ($L_y = 32$) and simply supported on all four of its edges. The thickness is 0.5 feet throughout the entire plate. The material of the plate is assumed to be isotropic with Young's modulus $E = 432000$ ksf and Poisson's ratio $\nu = 0.17$. Two loading cases are employed for the experiments: one is a 0.3 ksf uniformly distributed loading $q$ and the other is a 20 kip concentrated loading $P$ at $(x, y) = (12\ \text{ft}, 8\ \text{ft})$, both acting downward in the $z$-direction. In evaluating the strip stiffness matrix and force vector, reduced integration using one Gaussian point is used to overcome the shear locking behavior [OñSu83a].

Fig. 5.4. Comparison of solutions for the plate under the loading q. (Displacement w versus x; solid line: exact, Kirchhoff theory; dashed line: finite strip method, 64 Mindlin strips.)

In Figure 5.4, we compare the displacement $w$ along the central span, $y = 16$ ft, of the numerical solution using 64 Mindlin strip elements with that of the exact solution using the Kirchhoff thin plate theory for the plate subjected to the uniform loading $q$. Eight harmonic terms (four of them contribute nothing to the solution because of this special loading) are used. Figure 5.5 shows the comparison of the solutions for the same plate under the concentrated loading $P$, also using eight harmonic terms. The relative errors against the number of
strips for both loading cases, defined as $|(w_N - w_{128})/w_{128}|$ where $w_N$ denotes the vertical displacement at the center of the plate discretized with $N$ strips, are plotted in Figure 5.6. Here $w_{128}$ is used in place of the exact solution of the Mindlin theory because the exact solution is not available. It is clear from this figure that the error decreases as the number of strips increases.

Fig. 5.5. Comparison of solutions for the plate under the loading P. (Displacement w versus x; solid line: exact, Kirchhoff theory; dashed line: finite strip method, 64 Mindlin strips.)

The performance of this method on an Alliant FX/8 using various numbers of processors for the static analysis of the plate under the loading $P$ is shown in Table 5.1 and Figure 5.7. In Table 5.1, we compare the CPU time (all in seconds) consumed in the analysis, including the generation of the data, the assemblage of the system, the solution for the unknowns, and finally the calculation of the displacement. Table 5.2 and Figure 5.7 display the speedups obtained from Table 5.1 for three different discretizations: N = 16, 64, and 128. Here the speedup, $S(k)$, is defined to be the ratio of the CPU time spent using only one processor to that spent using $k$ processors. As seen from Figure 5.7 (or Table 5.2), very satisfactory speedups have been obtained for all cases. The speedups obtained for the plate discretized with 128 strips are 1.98, 3.88, and 7.42 using 2, 4, and 8 processors, respectively.

6. Conclusions. The efficiency and parallelizability of the finite strip method for the static analysis of certain Mindlin plates have been addressed, and some useful parallel implementations on the Alliant FX/8 minisupercomputer and the Cedar multiprocessor have been presented. The performance of this method on an Alliant FX/8 has also been tested via the analysis of a rectangular plate simply supported on all edges under two different loading cases. From the experiments performed, we have obtained speedups of 1.98, 3.88, and 7.42 using 2, 4, and 8 processors, respectively. These
speedups are satisfactory and very encouraging. They clearly demonstrate how well the finite strip method lends itself to parallel computation. In summary, we conclude that although the finite strip method is not as versatile as the finite element method, it is very suitable for multiprocessor and multicluster computers such as the Cray X-MP or Cray-2, the Alliant, and the Cedar, due to its potential for inducing large grain parallelism. This is especially true when the problem requires a large number of harmonic terms to yield accurate results.

7. Acknowledgments. This work was supported by the National Science Foundation under grants NSF-MIP-8410110 and NSF-CCR-8717942 and by the U.S. Department of Energy under grant DOE-DE-FG02-85ER25001.

Fig. 5.6. Relative errors (%) of the displacement w at the center of the plate. (Relative error in percent versus number of strip elements; solid line: concentrated loading; dashed line: uniform loading.)

Table 5.1. CPU time on the Alliant FX/8 in seconds for the plate under the loading P.

    No. of              No. of processors
    strips        1         2         4         8
      16        0.229     0.117     0.060     0.033
      24        0.342     0.174     0.089     0.048
      32        0.456     0.234     0.119     0.063
      40        0.570     0.287     0.147     0.079
      48        0.683     0.349     0.177     0.094
      64        0.909     0.461     0.237     0.125
     128        1.811     0.913     0.467     0.244

Fig. 5.7. Speedups on an Alliant FX/8 for the static analysis of the plate under the loading P. (Speedup versus number of processors; curves for 128, 64, and 32 strips.)

Table 5.2. Speedups based on the 1-CE CPU time on the Alliant FX/8.

    No. of              No. of processors
    strips        1         2         4         8
      16        1.00      1.96      3.82      6.94
      24        1.00      1.97      3.84      7.13
      32        1.00      1.95      3.83      7.24
      40        1.00      1.99      3.88      7.22
      48        1.00      1.96      3.86      7.27
      64        1.00      1.97      3.84      7.27
     128        1.00      1.98      3.88      7.42
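The entries of Table 5.2 follow from the CPU times of Table 5.1 through the definition $S(k) = T(1)/T(k)$. The short sketch below recomputes them from the tabulated times; the parallel efficiency $S(k)/k$ is a derived quantity added here for illustration and is not reported in the paper, and the names `cpu_time`, `speedups`, and `efficiencies` are chosen for this example.

```python
# CPU times in seconds from Table 5.1, keyed by number of strips;
# each tuple lists the times for 1, 2, 4, and 8 processors.
cpu_time = {
    16: (0.229, 0.117, 0.060, 0.033),
    24: (0.342, 0.174, 0.089, 0.048),
    32: (0.456, 0.234, 0.119, 0.063),
    40: (0.570, 0.287, 0.147, 0.079),
    48: (0.683, 0.349, 0.177, 0.094),
    64: (0.909, 0.461, 0.237, 0.125),
    128: (1.811, 0.913, 0.467, 0.244),
}
processors = (1, 2, 4, 8)


def speedups(times):
    """S(k) = T(1) / T(k) for each processor count."""
    return tuple(times[0] / t for t in times)


def efficiencies(times):
    """Parallel efficiency S(k) / k (derived here, not from the paper)."""
    return tuple(s / k for s, k in zip(speedups(times), processors))


for n_strips, times in cpu_time.items():
    s_row = "  ".join(f"{s:5.2f}" for s in speedups(times))
    e_row = "  ".join(f"{e:5.2f}" for e in efficiencies(times))
    print(f"{n_strips:4d}  speedup: {s_row}   efficiency: {e_row}")
```

For 128 strips this gives speedups that round to 1.98, 3.88, and 7.42, matching the last row of Table 5.2, with an efficiency of roughly 0.93 on eight processors.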
REFERENCES

[Alli87] Alliant Computer Systems Corporation, FX/FORTRAN Programmer's Handbook, Alliant Computer Systems Corporation, Acton, Massachusetts, 1987.
[BeHi76] P.R. Benson and E. Hinton, A thick finite strip solution for static, free vibration and stability problems, Int. J. for Numer. Meth. in Eng., 10 (1976), pp. 665-678.
[Cheu68a] Y.K. Cheung, The finite strip method in the analysis of elastic plates with two opposite simply supported ends, Proc. Inst. Civ. Eng., 40 (1968), pp. 1-7.
[Cheu68b] Y.K. Cheung, Finite strip method analysis of elastic slabs, ASCE J. of Mechanics Div., 94 (1968), pp. 1365-1378.
[Cheu76] Y.K. Cheung, Finite Strip Method in Structural Analysis, Pergamon Press, New York, 1976.
[CuLo74] A.R. Cusens and Y.C. Loo, Applications of the finite strip method in the analysis of concrete box bridges, Proc. Inst. Civ. Eng., 57-II (1974), pp. 251-273.
[DKLS86] E. Davidson, D. Kuck, D. Lawrie, and A. Sameh, Supercomputing tradeoffs and the Cedar system, CSRD Tech. Rept. 577, Center for Supercomputing Research and Development, University of Illinois at Urbana-Champaign, 1986.
[Guzz87] M.D. Guzzi, Cedar Fortran programming handbook, CSRD Tech. Rept. 601, Center for Supercomputing Research and Development, University of Illinois at Urbana-Champaign, 1987.
[Mind51] R.D. Mindlin, Influence of rotatory inertia and shear on flexural motions of isotropic, elastic plates, J. of Applied Mechanics, 18 (1951), pp. 31-38.
[OñSu83a] E. Oñate and B. Suarez, A unified approach for the analysis of bridges, plates and axisymmetric shells using the linear Mindlin strip element, Computers & Structures, 17 (1983), pp. 407-426.
[OñSu83b] E. Oñate and B. Suarez, A comparison of the linear, quadratic and cubic Mindlin strip elements for the analysis of thick and thin plates, Computers & Structures, 17 (1983), pp. 427-439.
[TiGe72] S.P. Timoshenko and J.M. Gere, Mechanics of Materials, Van Nostrand Co., New York, 1972.
[Zien77] O.C. Zienkiewicz, The Finite Element Method, 3rd ed., McGraw-Hill, London, 1977.
