AUTOMATIC ELEMENT REORDERING FOR FINITE ELEMENT ANALYSIS WITH FRONTAL SOLUTION SCHEMES
S. W. SLOAN AND M. F. RANDOLPH
SUMMARY This paper describes an element reordering algorithm which is suitable for use with a frontal solution package. The procedure is shown to generate efficient element numberings for a wide variety of test examples. In an effort to obtain an optimum elimination order, the algorithm first renumbers the nodes, and then uses this result to resequence the elements. This intermediate step is necessary because of the nature of the frontal solution procedure, which assembles variables on an element-by-element basis but eliminates them node by node. To renumber the nodes, a modified version of the King algorithm is used. In order to minimize the number of nodal numbering schemes that need to be considered, the starting nodes are selected automatically by using some concepts from graph theory. Once the optimum numbering sequence has been ascertained, the elements are then reordered in an ascending sequence of their lowest-numbered nodes. This ensures that the new elimination order is preserved as closely as possible. For meshes that are composed of a single type of high-order element, it is only necessary to consider the vertex nodes in the renumbering process. This follows from the fact that mesh numberings which are optimal for low-order elements are also optimal for high-order elements. Significant economies in the reordering strategy may thus be achieved. A computer implementation of the algorithm, written in FORTRAN IV, is given.
INTRODUCTION

In recent years, the frontal solution procedure has become increasingly popular as a means of solving the sparse symmetric matrix equations which often arise in finite element computations. A detailed discussion of the history and merits of the frontal approach has been given in a recent text by Irons and Ahmad,1 and will not be repeated here. It suffices to note that it is generally as efficient as the more traditional type of bandwidth solver, both in terms of arithmetic and storage requirements, and is particularly suited to high-order elements which have midside or interior nodes.

When employed in a finite element context, the efficiency of a frontal scheme is dependent upon the ordering of the elements, with the ordering of the nodes being immaterial. This is because the equations are assembled and factorized on an element-by-element basis. In contrast, the efficiency of bandwidth-oriented schemes is solely a function of the nodal ordering. In the simplest type of bandwidth algorithm, where all matrix entries inside the bandwidth are stored and operated on, the nodes are labelled in an attempt to procure a small bandwidth. Indeed, this is essential for an economical solution.
† Assistant Lecturer. 0029-5981/83/081153-29$02.90 © 1983 by John Wiley & Sons, Ltd.
Research Fellow.
For problems which have few elements, or an uncomplicated topology, it is relatively straightforward to label elements so that the frontal procedure will perform efficiently. In more complex examples, however, such as large water distribution networks and two- and three-dimensional finite element grids, this is frequently difficult. The need to number finite element meshes efficiently is particularly important in nonlinear computations, where it is often necessary to solve a system of linear algebraic equations a large number of times. Classes of problems for which efficient solutions are necessary include implicit dynamic calculations and large-scale plasticity analysis. In a more general sense, the increasing use of microcomputers, which have limited fast store and relatively slow central processing units, has provided additional impetus for deriving efficient solution schemes.
$$\cdots + \left[K_{ij} - \frac{K_{is}}{K_{ss}}K_{sj}\right]x_j + \cdots = f_i - \left[\frac{K_{is}}{K_{ss}}\right]f_s \tag{3}$$
Note that row $s$ is unaltered during the elimination of $x_s$ from the remaining equations, and that column $s$ of $[K]$ may be eliminated as soon as it is fully assembled. More generally, the Gaussian elimination procedure may be summarized by the equations

$$K_{ij}^* = K_{ij} - \frac{K_{is}K_{sj}}{K_{ss}}, \qquad f_i^* = f_i - \frac{K_{is}f_s}{K_{ss}}, \qquad i, j \ne s \tag{4}$$
If, for the moment, symmetry is ignored, it is a relatively simple task to determine the number of operations† that are required to eliminate a variable from an $n \times n$ set of equations. From an inspection of the left-hand side of equation (3) it is apparent that we need one division to form the multiplier $K_{is}/K_{ss}$, and $n - 1$ multiplications to compute all the starred coefficients for the $i$th row (noting that no multiplication is necessary to compute $K_{is}^*$ since it is identically zero). Hence the total number of operations that are required to eliminate $x_s$ from all the equations is

$$n(n - 1) \tag{5}$$

More generally, if the effect of the right-hand side is included, the number of operations is

$$(n - 1)(n + 1) \tag{6}$$
If the governing linear equations are symmetric, only the upper triangle of the coefficient matrix needs to be stored and operated on. This leads to significant savings in the total number of operations that are required to eliminate a variable from the left-hand side. Referring to equations (3) and (4), the total number of multiplications that are required to eliminate $x_s$ from all the equations is

$$\tfrac{1}{2}n(n + 1) - n$$

where the $-n$ term arises because none of the entries in row $s$ needs to be modified. The number of divisions that are required to compute the multiplier for each row $i$, $i \ne s$, is equal to $n - 1$. Hence, the total number of operations that are required to eliminate $x_s$ from the left-hand side of the equations is

$$\tfrac{1}{2}n(n + 1) - 1 \tag{7}$$

If the effect of the right-hand side is included, the number of operations is increased to

$$\tfrac{1}{2}(n^2 + 3n - 4) \tag{8}$$
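The four operation counts above can be checked by brute force. The following is a minimal sketch, not part of the published FORTRAN IV code: the function names `ops_full` and `ops_symmetric` are hypothetical, and each function simply counts one operation per multiplication or division, as in the text.

```python
def ops_full(n, rhs=False):
    """Operations needed to eliminate one variable from a full n x n system,
    ignoring symmetry."""
    ops = 0
    for _ in range(n - 1):      # every row except the pivot row s
        ops += 1                # one division forms the multiplier K_is / K_ss
        ops += n - 1            # one multiplication per column other than column s
        if rhs:
            ops += 1            # one further multiplication updates f_i
    return ops


def ops_symmetric(n, s=0, rhs=False):
    """Same count when only the upper triangle (j >= i) is stored and updated."""
    ops = 0
    for i in range(n):
        if i == s:
            continue            # row s is left unaltered
        ops += 1                # division for the multiplier of row i
        ops += sum(1 for j in range(i, n) if j != s)   # upper-triangle updates
        if rhs:
            ops += 1
    return ops


for n in range(2, 8):
    assert ops_full(n) == n * (n - 1)                               # equation (5)
    assert ops_full(n, rhs=True) == (n - 1) * (n + 1)               # equation (6)
    assert ops_symmetric(n) == n * (n + 1) // 2 - 1                 # equation (7)
    assert ops_symmetric(n, rhs=True) == (n * n + 3 * n - 4) // 2   # equation (8)
```

The symmetric count does not depend on which variable `s` is eliminated, which is why the pivot position is left as an optional argument.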
For a symmetric matrix of coefficients $[K]$, equation (7) may be used to estimate the number of operations that are required to decompose the left-hand side when using a frontal solution procedure. If, at some stage during the elimination process, the number of active variables in the front is $w$, then the number of operations that are necessary to eliminate a variable from the front is simply

$$\tfrac{1}{2}w(w + 1) - 1 \tag{9}$$

If $w_i$ denotes the value of $w$ just prior to the elimination of variable $x_i$, then the total number of operations that are required to decompose the $[K]$ matrix is

$$T = \sum_{i=1}^{n}\left[\tfrac{1}{2}w_i(w_i + 1) - 1\right] \tag{10}$$
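As an illustration, the total can be evaluated directly by summing the per-variable count $\tfrac{1}{2}w(w + 1) - 1$ over the front sizes recorded at each elimination. The sketch below is an assumption-laden example rather than the authors' code: the function names and the list of front sizes are hypothetical. Replacing every $w_i$ by the largest front size gives a simple upper bound on the same total.

```python
def decomposition_ops(front_sizes):
    """Total operations to decompose [K], given the number of active
    variables w_i just before each variable is eliminated."""
    return sum(w * (w + 1) // 2 - 1 for w in front_sizes)


def upper_bound_ops(n, max_front):
    """Bound obtained by using the maximum front size for all n eliminations."""
    return n * (max_front * max_front + max_front - 2) // 2


# Hypothetical front sizes for a system of eight variables
fronts = [3, 4, 4, 4, 3, 3, 2, 1]
total = decomposition_ops(fronts)
bound = upper_bound_ops(len(fronts), max(fronts))
print(total, bound)
```

Both divisions by two are exact, since $w(w + 1)$ is always even.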
In deriving equation (10), it has been assumed that no zeros occur inside the front during the elimination process. This assumption is a simplification, as the elimination of fully assembled equations may lead to the creation of zero rows/columns inside the upper triangle of the active frontal matrix. This is because, for the simplest form of frontal solver, the elimination of active variables may occur in any order (Figure 1). Although it is usual to try and fill these vacant locations with coefficients when assembling each new element, unfilled rows/columns may still persist inside the front. The occurrence of this phenomenon requires that a distinction
† Following the usual convention, an operation is defined as one multiplication or one division.
S. W. SLOAN AND M. F. RANDOLPH
[Figure 1. Global equations currently active prior to elimination of equation 3 (frontwidth = 4, number of active variables = 4) and after elimination of equation 3 (frontwidth = 4, number of active variables = 3)]
be made between the number of active variables in the front and the actual current frontwidth. The actual current frontwidth is always greater than, or equal to, the number of active variables. If, during the elimination process, unfilled zero rows/columns arise inside the upper triangle of coefficients, the number of operations given by equation (10) will be a lower bound. However, approximately half of the redundant operations introduced may be removed by checking for zero columns in the outer loop of the elimination routine. Since the current frontwidth is equal to the number of active variables when the latter is a maximum, an upper bound on the number of operations is given by

$$T = \frac{n}{2}(W^2 + W - 2) \tag{11}$$

where $W$ is the maximum frontwidth.

A REVIEW OF ELEMENT RESEQUENCING STRATEGIES

Although a wealth of literature exists on heuristic algorithms for minimizing the bandwidth of sparse symmetric matrices, very little attention has been focused on the development of schemes aimed at minimizing the frontwidth. Broadly speaking, there are two different methods of approaching this problem. Each of these will be considered in turn.
$$\sigma = \sum_{i=1}^{N} M_i^2 \tag{12}$$

where $M_i$ is the number of off-diagonal nodes (i.e. current number of active nodes minus one) during the elimination of node $i$, and $N$ is the total number of nodes. The approximate theoretical basis for this criterion may be seen from equation (10). If the average value of $w_i$ is relatively large, then we may write

$$T \approx \tfrac{1}{2}\sum_{i=1}^{n} w_i^2$$
The fundamental steps in King's algorithm may be summarized as follows:

1. Generate an adjacency list indicating the connectivity of all nodes in the grid. A node is said to be adjacent to another node if they share an element.
2. Select a starting node and relabel it as node one.
3. Generate a list of nodes which are currently active. Mark all references to these nodes in the adjacency list by negative values. Accumulate the ordering efficiency parameter, $\sigma$.
4. Examine each active node in the list and compute the increment in the number of active nodes if each of these nodes were to be eliminated. Note that a positive increment may occur only if the node is connected to positive entries in the adjacency list, and that the increment must be either a positive integer, zero or minus one.
5. Select and relabel the node which has the lowest increment in active nodes. In the case of a tie, choose the node which has been active the longest.
6. Delete the selected node from the list of active nodes, and add nodes which are adjacent to the selected node onto the list of active nodes (if they are not already active). In the adjacency list, mark all references to the newly added nodes by negative values.
7. Update $\sigma$ by adding the square of the number of active nodes in the list.
8. Repeat steps 4-7 until all nodes have been relabelled.
9. Repeat steps 2-8 until all starting nodes have been processed.
10. Accept the numbering strategy which yields the smallest ordering efficiency parameter.
Steps 4 and 5 may be viewed as a criterion for ensuring that the frontwidth grows by a minimum amount, and are the essence of the King procedure. Note, however, that the algorithm only reorders the nodes and not the elements. As noted by Cuthill,4 the King algorithm is quite efficient and, for finite element meshes with few nodes per element, will often furnish optimal or near-optimal frontwidths. A major disadvantage of the method, however, is that it is highly sensitive to the location of the starting node. For problems with a large number of nodes this is a severe drawback, since it is uneconomic to consider each node as a starting point.

Another heuristic method for resequencing finite element grids to reduce the maximum frontwidth has been given by Levy. This method is very similar to that of King, but is based on an expanded minimum front-growth principle. When searching for the next node to be relabelled at each stage, all nodes which have yet to be relabelled are considered. (Recall that in stage 4 of the King algorithm, nodes are considered for possible relabelling only if they are currently in the front.) Furthermore, the criterion for assessing the merit of each numbering scheme is based simply on the smallest maximum nodal frontwidth. Because of the extended search which is conducted before selecting the next node to be renumbered, the Levy algorithm is invariably slower than the King algorithm. It may, however, furnish maximum frontwidths which are closer to the true minimum. As with King's method, the starting nodes for the Levy scheme must be specified a priori. Due to the extra computation associated with the generation of each numbering sequence, this is again a serious limitation.

More recently Pina6 has described another method for optimizing finite element numberings. As with the Levy and King algorithms, Pina's strategy is based on a minimum front-growth principle, but includes an additional search procedure which attempts to look ahead. The search refinement incorporated by Pina6 is apparently useful in meshes comprised of elements with midside nodes. Later in this paper it will be shown that, for meshes which are comprised of a single type of high-order element, it is unnecessary to consider non-vertex nodes in the renumbering process. This is due to the fact that mesh numberings which are optimal for low-order elements are also optimal for high-order elements. As with the Levy and King techniques, Pina's method suffers from the disadvantage that the starting nodes must be specified by the user.

The indirect methods

When attempting to develop heuristic schemes for minimizing the frontwidth of a sparse set of matrix equations, it is fruitful to consider schemes which are aimed at minimizing the bandwidth. This is because the maximum frontwidth must always be less than, or equal to, the corresponding bandwidth (if the variables are eliminated in the same order). An illustration of this property is given in Figure 2 (taken from Cuthill4) for a simple grid of one-dimensional bar elements, each with 1 degree-of-freedom per node. The maximum bandwidth of a symmetric $n \times n$ matrix $[K]$ may be defined as
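[Figure 2. Grid of one-dimensional bar elements and the corresponding pattern of non-zero entries in the global stiffness matrix $[K]$: $B = \max\{b_i : 1 \le i \le 5\} = 4$, $W = \max\{w_i : 1 \le i \le 5\} = 3$]

$$B = \max\{b_i : 1 \le i \le n\} \tag{15}$$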
where $b_i$ is defined as the difference between $i + 1$ and the column index of the first non-zero entry in row $i$ of $[K]$. This definition of bandwidth differs from that given widely in the literature, since it includes the diagonal term. The wavefront (or number of active equations) for row $i$, $w_i$, may be defined as the number of active columns in row $i$. A column $j$ is said to be active if (i) $j \ge i$ and (ii) there is a non-zero element in that column with a row index $k$, such that $k \le i$. The maximum frontwidth, or wavefront, is then defined by

$$W = \max\{w_i : 1 \le i \le n\} \tag{16}$$
From an inspection of Figure 2, it is clear that the maximum frontwidth, $W$, can never exceed the bandwidth $B$. Thus, one method of reducing the maximum frontwidth is first to resequence the nodes to minimize the bandwidth, and then to relabel the elements so that the new order of elimination is preserved as closely as possible. This approach has been used by Akin and Pardue and, more recently, by Razzaque. The effectiveness of this strategy is obviously dependent on the performance of the bandwidth minimization procedure, and thus suffers from the disadvantage of being indirect.
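The two definitions can be evaluated directly from the pattern of non-zero entries. The sketch below is illustrative only: the function name is hypothetical, and the bar connectivity used (the chain 4-1-2-3-5) is an assumption, chosen because it is consistent with the distances and level structure that the paper quotes for Figure 2.

```python
def bandwidth_and_wavefront(n, edges):
    """Return (B, W) for a symmetric n x n pattern with non-zeros on the
    diagonal and at every pair (i, j) listed in edges (1-based numbering)."""
    nonzero = {(i, i) for i in range(1, n + 1)}
    for i, j in edges:
        nonzero.add((i, j))
        nonzero.add((j, i))

    # b_i = i + 1 - (column index of the first non-zero in row i)
    b = [i + 1 - min(j for (r, j) in nonzero if r == i)
         for i in range(1, n + 1)]

    # w_i = number of columns j >= i holding a non-zero in some row k <= i
    w = [len({j for (k, j) in nonzero if k <= i and j >= i})
         for i in range(1, n + 1)]
    return max(b), max(w)


# Assumed connectivity for the five-node bar grid
edges = [(1, 2), (1, 4), (2, 3), (3, 5)]
B, W = bandwidth_and_wavefront(5, edges)
print(B, W)   # 4 3
```

For this connectivity the computed values agree with those quoted for Figure 2, and $W \le B$ as the text requires.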
A NEW ELEMENT RESEQUENCING ALGORITHM

In this section a new element resequencing algorithm, which is suitable for use with a frontal solution programme, is described. The procedure first renumbers the nodes in an effort to obtain an optimum elimination sequence, and then uses this result to reorder the elements. This intermediate step is necessary because of the nature of the frontal method, where variables are assembled on an element-by-element basis but eliminated node by node. To renumber the nodes, a modified form of the King algorithm is used. The main disadvantage associated with this technique, namely the difficulty of knowing where to initiate the renumbering process, is overcome by selecting the starting nodes automatically. As a result, only a few node-numbering sequences need to be generated. After choosing the nodal numbering scheme which yields the lowest maximum frontwidth, the elements are reordered in an ascending sequence of their (new) lowest-numbered nodes. This preserves the optimum elimination order as closely as possible. After the elements have been renumbered, the new node numbering scheme may be discarded.
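The final step, reordering the elements by their lowest new node number, can be sketched in a few lines. This is a hedged illustration rather than the published RESEQ2 routine: the function name is hypothetical, the new node numbers are those obtained for the seven-triangle mesh of Appendix I when node five is the starting node, and the original element numbering is an assumption (one of the numberings consistent with the surviving entries of Table IV).

```python
def reorder_elements(elements, new_number):
    """elements: list of node tuples (old node labels) in current element order.
    new_number: dict mapping old node label -> new node number.
    Returns the old element numbers (1-based) listed in their new order."""
    lowest = [min(new_number[v] for v in nodes) for nodes in elements]
    # sorted() is stable, so elements sharing a lowest node keep their old order
    return sorted(range(1, len(elements) + 1), key=lambda e: lowest[e - 1])


# Assumed element numbering for the seven triangles of Appendix I
elements = [(1, 2, 3), (3, 4, 5), (4, 7, 8), (1, 4, 7),
            (1, 2, 6), (1, 6, 7), (1, 3, 4)]
# New node numbers from the elimination order 5, 3, 2, 1, 6, 4, 7, 8
new_number = {5: 1, 3: 2, 2: 3, 1: 4, 6: 5, 4: 6, 7: 7, 8: 8}

new_order = reorder_elements(elements, new_number)
print(new_order)   # [2, 1, 7, 5, 4, 6, 3]
```

Only the element order is returned; the node renumbering itself is not written back to the mesh, in line with the remark that it may be discarded.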
† Note that these edges should not be confused with the 'edges' of each finite element. In some instances they are equivalent, but not always, e.g. a single 4-noded quadrilateral generates a graph with six edges, four along its sides and two along its diagonals.
With reference to Figure 2, the distance between nodes one and three, for example, is two. The diameter, $D(G)$, of this graph is four, and the endpoints of the diameter are nodes four and five. Following George and Liu, a pseudo-diameter, $\delta(G)$, is defined by endpoints $i$ and $j$ for which $d(i, j)$ is close to $D(G)$. Nodes which define a pseudo-diameter are known as pseudo-peripheral nodes.

In bandwidth minimization algorithms which are based on graph theory (e.g. Cuthill and McKee; Gibbs, Poole and Stockmeyer) an important concept is the rooted level structure. A rooted level structure of a graph $G$ is defined as a partitioning of the set of nodes $N(G)$ into levels $l_1(r), l_2(r), \ldots, l_h(r)$ such that:

1. $l_1(r) = \{r\}$, where $r$ is the root (or starting) node of the level structure.
2. All nodes adjacent to nodes in level $l_i(r)$, $1 < i < h$, are in levels $l_{i-1}(r)$, $l_i(r)$ and $l_{i+1}(r)$.
3. All nodes adjacent to nodes in level $l_h(r)$ are in levels $l_{h-1}(r)$ and $l_h(r)$.

Following George and Liu, the overall level structure may be expressed as the set $L(r) = \{l_1(r), l_2(r), \ldots, l_h(r)\}$, where $h$ is the depth of the level structure rooted at node $r$, and is simply the total number of levels. The width of level $i$ is defined by $|l_i(r)|$ (i.e. the number of nodes on level $i$), and the width of the level structure is defined by

$$w(r) = \max\{|l_i(r)| : 1 \le i \le h\} \tag{18}$$

With reference to Figure 2, the level structure rooted at node one is given by $l_1(1) = \{1\}$, $l_2(1) = \{2, 4\}$, $l_3(1) = \{3\}$ and $l_4(1) = \{5\}$. For this example $h$ is equal to four and $w(1) = 2$.
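A rooted level structure is a breadth-first partition of the graph, so it can be generated level by level. The following minimal sketch assumes the graph is connected and is stored as a dictionary of adjacency lists; the function name and the bar connectivity (the chain 4-1-2-3-5, inferred from the distances quoted for Figure 2) are assumptions made for illustration.

```python
def level_structure(adj, root):
    """Return the rooted level structure L(root) as a list of levels."""
    levels = [[root]]
    seen = {root}
    while True:
        # Unvisited neighbours of the current deepest level form the next level
        nxt = sorted({u for v in levels[-1] for u in adj[v]} - seen)
        if not nxt:
            return levels
        seen.update(nxt)
        levels.append(nxt)


# Assumed adjacency for the five-node bar grid
adj = {1: [2, 4], 2: [1, 3], 3: [2, 5], 4: [1], 5: [3]}

levels = level_structure(adj, 1)
depth = len(levels)                            # h
width = max(len(level) for level in levels)    # w(r)
print(levels, depth, width)   # [[1], [2, 4], [3], [5]] 4 2
```

Rooting the structure at node four instead gives a depth of five, one more than the diameter of four quoted above.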
where $w(r)$ is the width of the level structure rooted at node $r$ and is defined by (18). It is intuitively apparent that pseudo-peripheral nodes should also make good starting nodes for a heuristic algorithm which attempts to minimize the maximum frontwidth. For instance, with the King scheme the front (of active nodes) will tend to propagate down the level structure in a manner such that the maximum level width provides an approximate measure of the maximum frontwidth. In the applications section of this paper, it is demonstrated that pseudo-peripheral nodes make excellent starting nodes for the King algorithm. An efficient method for locating a set of pseudo-peripheral nodes for a graph, $G$, is as follows:
1. Pick an arbitrary node, $r$, and generate the corresponding level structure $L(r) = \{l_1(r), l_2(r), \ldots, l_h(r)\}$. Store the depth, $h$, and width, $w(r)$, for this level structure.
2. Generate the level structures for each node in level $h$ of $L(r)$ (i.e. for all nodes which are at maximum distance from $r$). Select the node, $s$, which has the greatest level structure depth, $h_s$, and narrowest width $w(s)$. If $h_s > h$, set $r = s$, $h = h_s$, $w(r) = w(s)$ and go to step 1. If $h_s = h$ and $w(s) < w(r)$, set $r = s$, $w(r) = w(s)$ and go to step 1.
3. Store all entries on level $l_h(r)$ and the root node, $r$, to furnish the required set of pseudo-peripheral nodes.
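Steps 1-3 may be sketched as follows. This is an illustrative reading of the procedure, not the published DIAM and LEVEL subroutines: the function names are hypothetical, the graph is assumed to be connected, and ties in step 2 are resolved in favour of the lowest-numbered candidate, which is an assumption. The test graph is the eight-node graph of the hand-worked example in Appendix I.

```python
def level_structure(adj, root):
    """Breadth-first rooted level structure, returned as a list of levels."""
    levels, seen = [[root]], {root}
    while True:
        nxt = sorted({u for v in levels[-1] for u in adj[v]} - seen)
        if not nxt:
            return levels
        seen.update(nxt)
        levels.append(nxt)


def width(levels):
    return max(len(level) for level in levels)


def pseudo_peripheral_nodes(adj, start):
    r, levels = start, level_structure(adj, start)           # step 1
    while True:
        # Step 2: root a level structure at every node of the deepest level
        candidates = [(s, level_structure(adj, s)) for s in levels[-1]]
        # Greatest depth first, then narrowest width; min() keeps the first on ties
        s, s_levels = min(candidates, key=lambda c: (-len(c[1]), width(c[1])))
        deeper = len(s_levels) > len(levels)
        narrower = len(s_levels) == len(levels) and width(s_levels) < width(levels)
        if deeper or narrower:
            r, levels = s, s_levels
            continue
        return sorted(set(levels[-1]) | {r})                  # step 3


# Graph of the seven-triangle mesh in Appendix I
adj = {1: [2, 3, 4, 6, 7], 2: [1, 3, 6], 3: [1, 2, 4, 5], 4: [1, 3, 5, 7, 8],
       5: [3, 4], 6: [1, 2, 7], 7: [1, 4, 6, 8], 8: [4, 7]}

starts = pseudo_peripheral_nodes(adj, 1)
print(starts)   # [5, 6]
```

Each pass either deepens the level structure or narrows it at the same depth, so the loop terminates. With a different tie-breaking rule in step 2 the procedure can return another, equally valid, pair of endpoints.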
Steps 1-3 are a modified version of the algorithm given by Gibbs, Poole and Stockmeyer. Another algorithm for finding the endpoints of a pseudo-diameter has been given by George and Liu. Typically, the above procedure furnishes the required set of starting nodes after only two or three iterations. In many cases the pseudo-diameter calculated is actually a true diameter, but there is no guarantee of this.

Node renumbering algorithm

Although the frontal method assembles the equations on an element-by-element basis, the variables are eliminated node by node. Thus, as an intermediate step towards reordering the elements, it is first necessary to ascertain an efficient elimination order for the nodes. As described previously, there are essentially two different types of algorithms for renumbering nodes to reduce the maximum frontwidth. In the first of these, the nodes are renumbered using a minimum front-growth criterion, and are thus termed direct methods. In the second, the maximum frontwidth is reduced indirectly by minimizing the bandwidth, and uses the result that the maximum frontwidth must always be less than, or equal to, this quantity (providing the nodes are eliminated in the same order). If the Cuthill-McKee algorithm is used in the latter type of approach, it is interesting to note that an upper bound on the maximum frontwidth is given by

$$W \le 2w(r)$$

where $w(r)$ is the width of the level structure rooted at node $r$. In this paper, a direct numbering scheme is used. The fundamental steps are as follows:
1. Generate an adjacency list for each node $i$, $i \in N(G)$, noting that a node is said to be adjacent to another node if they share a common element. Compute and store the degree of each node $i$. This completely specifies the graph $G$.
2. Using the algorithm described in the previous subsection, compute a pseudo-diameter of the graph $G$, and assemble the associated set of pseudo-peripheral nodes. The latter constitute the set of possible starting nodes for steps 3-9.
3. Select a node from the list of pseudo-peripheral nodes and relabel it as node one. Assign node one an eliminated status. Using the adjacency list generated in step 1, mark all nodes which are adjacent to node one as currently active (i.e. in the current front). Store the number of active nodes.
4. Examine the nodes which are currently active, and calculate the increase/decrease in the number of active nodes if each of these nodes were to be eliminated. Note that the increment in the number of active nodes will be either a positive integer, zero, or minus one.
5. Select the node which, if eliminated, requires the smallest increase in the number of active nodes. In the case of a tie, select the node which has been active the longest. Relabel the chosen node, and assign it an eliminated status.
6. If the increment in active nodes = -1 go to step 7. Examine all nodes which are adjacent to the node just eliminated (relabelled) and mark them as being currently active. In this step, ignore nodes which already have an active or eliminated status.
7. Accumulate the number of active nodes. If the number of active nodes is greater than, or equal to, the maximum frontwidth from a previous scheme, abandon the current numbering strategy and go to step 3.
8. Repeat steps 4-7 until all nodes have been relabelled, and store the maximum frontwidth.
9. Repeat steps 3-8 until all entries in the pseudo-peripheral node list have been processed.

Incorporated in this algorithm is the minimum front-growth criterion due to King. For some problems, the minimum front-growth criterion suggested by Levy may furnish smaller fronts, but at the expense of additional computer time. To incorporate the Levy algorithm, step 4 needs to be modified as follows:
4. Examine all nodes which have yet to be relabelled, and calculate the increase/decrease in the number of active nodes if each of these nodes were to be eliminated.
Both of these criteria may be included in one computer program. If the governing equations are to be assembled/eliminated many times, such as in nonlinear finite element applications, it may be worth while to spend the extra time and employ the Levy algorithm. On the other hand, if the equations are to be assembled/eliminated once only, it may be preferable to use the King scheme. For the latter case, renumbering is justified only if the saving in cost for one solution is greater than the cost associated with renumbering.
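For a single starting node, the King criterion of steps 3-8 may be sketched as below. This is a simplified illustration and not the published RESEQ1 routine: the function name is hypothetical, the graph is assumed to be connected, the early abandonment test of step 7 and the loop over starting nodes in step 9 are omitted, and the frontwidth is counted just before each elimination, with the node being eliminated still in the front. The test graph is that of the hand-worked example in Appendix I.

```python
def king_renumber(adj, start):
    """Return (elimination order, maximum nodal frontwidth) for one start node."""
    eliminated = {start}
    order = [start]
    active = sorted(adj[start])       # front once the start node is eliminated
    max_front = len(active) + 1       # front just before the start node is eliminated

    def newcomers(v):
        # Nodes that must enter the front before v can be eliminated
        return [u for u in sorted(adj[v])
                if u not in eliminated and u not in active]

    while active:
        # Increment = newcomers minus the node leaving; 'active' is kept in
        # order of activation, so min() picks the longest-active node on ties
        v = min(active, key=lambda c: len(newcomers(c)) - 1)
        incoming = newcomers(v)
        max_front = max(max_front, len(active) + len(incoming))
        active.remove(v)
        active.extend(incoming)
        eliminated.add(v)
        order.append(v)
    return order, max_front


# Graph of the seven-triangle mesh in Appendix I
adj = {1: [2, 3, 4, 6, 7], 2: [1, 3, 6], 3: [1, 2, 4, 5], 4: [1, 3, 5, 7, 8],
       5: [3, 4], 6: [1, 2, 7], 7: [1, 4, 6, 8], 8: [4, 7]}

order, max_front = king_renumber(adj, 5)
print(order, max_front)   # [5, 3, 2, 1, 6, 4, 7, 8] 4
```

Starting from node six instead also gives a maximum frontwidth of four, in agreement with the hand-worked example.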
Applications
In this section, the performance of the algorithm is assessed by applying it to a series of test problems. The range of finite element grids considered is illustrated in Figures 3-16.
[Figure: example mesh of 16 bar elements, 16 nodes]
Examples 1-4 have been taken from Cuthill and McKee, and involve one-dimensional bar elements only. Examples 5 and 6, which involve meshes composed of 4-noded quadrilaterals and bars, are due to Grooms. Example 7 depicts a water distribution network as described by King, and examples 8 and 9 are continuum meshes taken from Akhras and Dhatt.14 The remaining five examples are due to the authors, and are typical of grids which may arise when applying the finite element method to continuum problems.
Table I illustrates the maximum frontwidths for these problems, before and after renumbering, together with a comparison of the results presented by Razzaque. In all cases where comparisons are available, the new algorithm yields the lowest, or equal lowest, maximum frontwidths. In examples 10 and 11, where 6-noded linear strain triangles are used to discretize
Table I. Results of frontwidth minimization algorithm for example problems

Example problem no.   Initial maximum frontwidth (nodes)   New maximum frontwidth (nodes)
 1                     3                                    -
 2                    45                                    -
 3                    19                                    -
 4                    41                                    -
 5                    35                                    -
 6                     8                                    7
 7                    11                                    6
 8                    23                                   18
 9                    30                                   14
10                    28                                   12†
11                    49                                   24†
12                    43                                   31‡
13                    23                                   11
14                    26                                   19

Entries shown as - and the column of results due to Razzaque are not legible in this copy.

† Meshes of 6-noded linear strain triangles: corner nodes only utilized in renumbering process, but frontwidth based on all nodes.
‡ Mesh of 15-noded cubic strain triangles: corner nodes only utilized in renumbering process, but frontwidth based on all nodes.
[Figures: example meshes composed of quadrilaterals and bars; one mesh has 78 elements and 62 nodes]
the domain, the elements have been renumbered by considering corner nodes only (i.e. by regarding each element as a 3-noded constant strain triangle). This procedure has also been used in example 12, for a grid of 15-noded cubic strain triangles. In general, for continuum meshes which are comprised of one type of high-order element, it is sufficient to consider the corner nodes only during the node and element relabelling process. This yields low maximum frontwidths and is inexpensive, since the number of corner nodes is often much less than the total number of nodes. For example, in the mesh of cubic strain triangles shown in Figure 14, only 32 nodes are involved in the node and element renumbering procedure, even though there are 413 nodes altogether.
[Figure 12. Example problem 10: 58 linear strain triangles, 141 nodes, 42 corner nodes]
Table II illustrates the computer times required to produce the element reorderings for each example. The statistics indicate the times required for assembling the nodal adjacency lists, locating the set of pseudo-peripheral starting nodes, and renumbering the nodes and elements. All of these results are for the IBM 370/165 installation at Cambridge, and were
[Figure 14. Example problem 12: 48 cubic strain triangles, 413 nodes, 32 corner nodes]
obtained using the internal clock of the machine for the optimising Q-compiler. The times quoted are accurate to the nearest one-hundredth of a second. In general, the selection of the starting nodes and the nodal renumbering procedure consume most of the time.

DISCUSSION

Generally speaking, the element reordering algorithm works best for meshes made up of triangular or one-dimensional elements. Indeed, for the cases presented with these types of
elements, it appears as though the automatic procedure yields numbering schemes which are quite close to the optimum. For example, Figure 12 illustrates a mesh which has been used by the soil mechanics group at Cambridge to study the behaviour of an unsupported tunnel. With this configuration of linear strain triangles, the best numbering scheme that could be produced by hand had a maximum frontwidth of 12 nodes.15 This is identical to the maximum
Table II. Timing statistics for example problems (IBM 370/165, Q-compiler); times in seconds

Example problem no.:  1     2     3     4     5     6     7     8     9     10    11    12    13    14
Total:                0.05  0.09  0.05  0.07  0.29  0.07  0.12  0.31  0.41  0.16  0.82  0.15  0.91  4.30

Component times (the column headings, which include node renumbering and element renumbering, cannot be matched to the rows that survive in this copy):
0.01  0.01  0.01  0.01  0.01  0.01  0.01  0.01  0.01  0.01  0.01  0.01  0.02  0.03
0.01  0.02  0.01  0.01  0.08  0.02  0.04  0.17  0.13  0.10
0.01  0.01  0.01  0.01  0.01  0.01  0.01  0.01  0.10  0.01  0.10  0.01  0.10  0.60
frontwidth achieved by the automatic algorithm. Similarly, Figure 13 shows an embankment mesh, also composed of linear strain triangles. For this problem, Britto has derived a numbering scheme which has a maximum frontwidth of 23 nodes. This is slightly less than that obtained from the automatic scheme, which resulted in a numbering with a maximum frontwidth of 24 nodes.

When applied to grids with quadrilateral elements, the algorithm furnishes numbering schemes which are further from the optimum. This is because these meshes generate graphs in which many of the nodes have a high degree, and the effect of the tie-breaker becomes more pronounced (i.e. step 5 in the previous section). In addition, their pseudo-peripheral nodes often yield rooted level structures which are wide in comparison with their depth. An example of this is a square grid of 4-noded rectangles, with an equal number of elements along each side. The numbering scheme described in this paper is most effective when the pseudo-peripheral nodes give level structures which are long and narrow. Notwithstanding this shortcoming, the algorithm presented furnishes numberings which are efficient for a broad range of meshes comprised of quadrilateral elements.

CONCLUSIONS

An element reordering algorithm has been described which is suitable for finite element programmes using the frontal solution scheme. The algorithm, which first relabels the nodes, and then the elements, has been shown to yield low maximum frontwidths for a wide variety of test meshes, including those which are composed of different types of elements. For grids of high-order elements of the same type, it has been demonstrated that efficient orderings may be obtained by considering the corner nodes only. This leads to significant savings in computational effort for these types of problems. Finally, for completeness, an illustrative FORTRAN IV implementation of the algorithm has been given.
ACKNOWLEDGEMENTS
The authors are indebted to several people for their assistance during the course of this work. The generosity of Dr. A. Britto in providing the data for some of the test examples, as well as continual discussion on the topic of renumbering finite element meshes, is gratefully acknowledged. Particular thanks are also due to Mr. M. J. Gunn and Mr. C. M. Szalwinski for their useful comments on the manuscript. The first author was supported by an External Research Studentship from Trinity College while undertaking the reported research.

APPENDIX I: HAND-WORKED EXAMPLE

Consider the mesh of seven constant strain triangles shown in Figure 17. The graph, $G$, for this example is defined by the set of nodes $N(G) = \{1, 2, \ldots, 8\}$ and the set of edge pairs $E(G) = \{\{1, 2\}, \{1, 3\}, \{1, 4\}, \{1, 6\}, \{1, 7\}, \{2, 3\}, \{2, 6\}, \{3, 4\}, \{3, 5\}, \{4, 5\}, \{4, 7\}, \{4, 8\}, \{6, 7\}, \{7, 8\}\}$. In this case, the unordered pairs which define $E(G)$ correspond to the edges which define the elements. Figure 18 illustrates the algorithm for finding the starting nodes as described in the subsection entitled 'Selection of starting nodes'. For this example, the computed pseudo-diameter, $\delta(G)$, is identical to the real diameter, $D(G)$, and is equal to three. The resulting set of starting nodes is $\{5, 6\}$.
[Figure 17. Mesh of seven constant strain triangles; maximum frontwidth = 6]
Before relabelling the elements, it is necessary to relabel the nodes. Consider the numbering scheme that results if node five is chosen as the starting node. Prior to the elimination of node five, the active nodes are $\{5, 3, 4\}$, and the number of active nodes is three. After elimination of node five, the active nodes are $\{3, 4\}$. If the King algorithm is employed, then the next node to be eliminated must be either node three or node four. In order to eliminate node three, two new nodes (nodes one and two) need to be brought into the front. On the other
[Figure 18. Selection of starting nodes. Iteration 1: root node = 1, $h = 3$, $w(1) = 5$, $l_h(1) = \{5, 8\}$. The details of iteration 2 are not legible in this copy.]
Table 111. Node renumbering algorithm Eliminated nodes Node to be eliminated Nodes in front Number of active nodes New node no.
Table IV. Element renumbering algorithm Nodal definition vector Lowest node New element no. no.
Element
2 1 6 4 3 4 2
2 1 7 5 4 6 3
hand, it node four is eliminated, three new nodes (nodes one, seven and eight) need to be activated. Therefore, the next node to be selected for elimination is node three, and the set of active nodes is (1, 2, 4). This process may be repeated for the remaining nodes in the graph and is summarized in Table 111. For this new nodal elimination order the maximum frontwidth is reduced from six to four. If node six is chosen to start the numbering scheme, the maximum frontwidth is again four. After renumbering the nodes, the new nodal definition vectors for each element are as shown in Table IV. After reordering the elements in ascending sequence of their lowestnumbered nodes, the maximum frontwidth for element-by-element assembly is four, and the elimination order implied by the nodal renumbering is approximately preserved. The actual order of elimination, however, depends on where t h e variables are inserted into the front during the assembly phase. APPENDIX 11: DESCRIPTION OF FORTRAN IV IMPLEMENTATION This appendix illustrates five subroutines, written in FORTRAN IV, which may be used to reorder the elements for a frontal solution package. The function of subroutine SETUP is to establish the adjacency list and degree of each node in the graph, G, of the finite element mesh. It is a modification of the code published by Collins,16 and includes the facility for generating an adjacency list for the corner nodes only of a grid. The starting points for the node renumbering algorithm are determined using subroutines DIAM and LEVEL, which employ the algorithm described in the subsection entitled 'Selection of starting nodes' to
compute a set of pseudo-peripheral nodes. Another algorithm for locating pseudo-peripheral nodes, together with its FORTRAN IV implementation, may be found in George and Liu.¹⁰ In subroutine RESEQ1, the nodes are renumbered using the minimum front-growth principle described in the subsection entitled 'Node renumbering algorithm'. This new elimination order is then employed to relabel the elements, in ascending sequence of their lowest-numbered nodes, in subroutine RESEQ2.

A glossary of the essential variable names, in alphabetical order, is as follows:

ALL      Logical variable which is used to ascertain whether all of the nodes in the finite element mesh are to be used in the reordering procedure. For meshes with one type of high-order element, it is necessary to consider the corner nodes only, and ALL is set to .FALSE. in the main driving routine. Otherwise, ALL is set to .TRUE.

IEN      Vector containing the initial element numbers. The address in this array indicates the new element numbers, e.g. IEN(6) = 1 means that the old number for new element six is one. Dimension equal to NET.

LEV      Vector containing level structure information. The level of node I is equal to LEV(I). Dimension equal to NODES.

MAXDEG   Control parameter indicating the maximum allowable degree of any node in the graph.

MAXNOD   Control parameter indicating the maximum allowable number of nodes for any element in the mesh.

MINMAX   New maximum frontwidth generated by the program. It should be set to a large value before entering subroutine RESEQ1. Note that for a mesh of high-order elements, where the corner nodes only are employed in the renumbering scheme, MINMAX does not represent the actual maximum frontwidth. Instead it represents the maximum frontwidth based on the corner nodes only. For grids of high-order elements in this case, the actual maximum frontwidth must be calculated in the normal manner, using the full list of nodes for each element and the new element numbering strategy (stored in array NEN).

NADJ     Vector containing the adjacency lists for all the nodes. Dimension equal to MAXDEG*NODES. The addresses of the nodes adjacent to node I are given by (I-1)*MAXDEG+1, (I-1)*MAXDEG+2, ..., (I-1)*MAXDEG+NDEG(I).

NDEG     Vector containing the degree of each node. Dimension equal to NODES. The degree of node I is equal to NDEG(I).

NEN      Vector containing the new element numbers. The address in this array indicates the old element number; e.g. NEN(1) = 6 means that the new number for old element one is six. Dimension equal to NET.

NET      Control parameter indicating the total number of elements in the mesh.

NEWNN    Vector containing the new node numbers generated for each starting node in subroutine RESEQ1. The address in this vector gives the old node number, e.g. NEWNN(1) = 6 means that the new number for old node one is six. Dimension equal to NODES.

NEWNUM   Vector containing the new node numbers which give the lowest maximum frontwidth in subroutine RESEQ1. The address in this vector gives the old node number, e.g. NEWNUM(1) = 6 means that the new number for old node one is six. Dimension equal to NODES. After the new element numbers have been computed in subroutine RESEQ2, the contents of this array may be ignored and the original node numbers used.

NODES    The number of nodes in the graph. In some cases this may differ from the total number of nodes in the finite element mesh; e.g. in a grid of one type of high-order element, NODES would be equal to the number of corner nodes. Note that if NODES is not equal to the total number of nodes, then the logical variable ALL must be set to .FALSE.

NPE      Vector containing the number of nodes for each element. For element I, the number of nodes is equal to NPE(I). Dimension equal to NET.

NPN      Vector containing the nodal definition vectors for all the elements. The addresses of nodes which define element I are given by (I-1)*MAXNOD+1, (I-1)*MAXNOD+2, ..., (I-1)*MAXNOD+NPE(I). In this array, it is assumed that the corner nodes are listed first. Dimension equal to MAXNOD*NET. For a mesh of high-order elements of a single type, where N corner nodes are used in the renumbering scheme, the corner nodes must be numbered from 1 to N. Figure 19 illustrates the form of the data required to assemble the NPN array.

NS       The number of pseudo-peripheral nodes which are to be used as starting points for the node renumbering algorithm.

NSTART   Vector of pseudo-peripheral nodes which are to be used as starting nodes for the node renumbering subroutine (RESEQ1). Maximum dimension of NODES.

NVN      The number of corner nodes for each element for the case where the mesh is composed of elements of the same type. Used if the reordering process involves the corner nodes only.

[Figure 19. Form of the data required to assemble the NPN array: for each element the corner nodes are listed first, followed by the midside nodes; the NPN array for element 1 occupies NPN(1), ..., NPN(8) and that for element 2 occupies NPN(9), ..., NPN(16)]
C
C     SUBPROGRAM SETUP
C
C     COMPUTE ADJACENCY LIST AND DEGREE FOR EACH NODE
C     - MODIFIED VERSION OF COLLINS ROUTINE
C     - USE ONLY CORNER NODES IF ALL=.FALSE.
C
C*********************************************************************
      DO 50 I=1,...
C
   50 CONTINUE
C
   60
C
C     SUBPROGRAM DIAM
C
C*********************************************************************
      DIMENSION NDEG(1),NADJ(1),LEV(1),NSTART(1)
      LOGICAL BETTER
C
C     BEGIN ITERATION
C     SELECT INITIAL ROOT NODE ARBITRARILY AND GENERATE ITS LEVEL
C     STRUCTURE
C
      IROOT=1
      ITER=0
   10 ITER=ITER+1
      CALL LEVEL(NDEG,LEV,IDEPTH,NADJ,IWIDTH,NODES,IROOT,MAXDEG)
C
C     CREATE LIST OF NODES WHICH ARE AT MAXIMUM DISTANCE FROM ROOT NODE
C
      LHW=0
      DO 20 I=1,NODES
      IF(LEV(I).NE.IDEPTH)GOTO 20
      LHW=LHW+1
      NSTART(LHW)=I
   20 CONTINUE
C
C     STORE ROOT ON END OF LIST
C
      NS=LHW+1
      NSTART(NS)=IROOT
C
C     LOOP OVER NODES AT MAXIMUM DISTANCE FROM ROOT NODE
C     GENERATE LEVEL STRUCTURE FOR EACH NODE
C     SET SWITCH IF A LEVEL STRUCTURE OF GREATER DEPTH OCCURS
C
      BETTER=.FALSE.
      DO 30 I=1,LHW
      NEND=NSTART(I)
      CALL LEVEL(NDEG,LEV,NDEPTH,NADJ,NWIDTH,NODES,NEND,MAXDEG)
      IF(NDEPTH.LT.IDEPTH)GOTO 30
      IF((NDEPTH.EQ.IDEPTH).AND.(NWIDTH.GE.IWIDTH))GOTO 30
      IROOT=NEND
      IDEPTH=NDEPTH
      IWIDTH=NWIDTH
      BETTER=.TRUE.
   30 CONTINUE
      IF(BETTER)GOTO 10
      RETURN
      END
      SUBROUTINE LEVEL(NDEG,LEV,LSD,NADJ,MLW,NODES,NROOT,MAXDEG)
C*********************************************************************
C
C     SUBPROGRAM LEVEL
C
C*********************************************************************
C
      LW=0
C
      DO 20 JJ=1,NCS
      NODE=NADJ(JSUB+JJ)
      IF(LEV(NODE).NE.L-1)GOTO 20
      LSD=L
      LW=LW+1
      LEV(I)=L
      KOUNT=KOUNT+1
      IF(KOUNT.EQ.NODES)GOTO 50
      GOTO 30
   20 CONTINUE
C
   30 CONTINUE
      IF(LW.GT.MLW)MLW=LW
C
      RETURN
C
      END
C
      DO 100 II=1,NS
      I=NSTART(II)
      DO 10 J=1,NODES
   10 NEWNN(J)=0
      NIF=NDEG(I)
      MAXFRT=NIF
      NEWNN(I)=1
C
C     NEGATE ALL NDEG ENTRIES FOR NODES WHICH ARE ADJACENT TO
C     STARTING NODE I
C
      NCN=NDEG(I)
      JSUB=(I-1)*MAXDEG
      DO 20 J=1,NCN
      N=NADJ(JSUB+J)
      NDEG(N)=-NDEG(N)
   20 CONTINUE
      NDEG(I)=-NDEG(I)
C
      DO 60 K=2,NODES
      MINNEW=10**10
      LMIN=10**10
C
C     LOOP OVER UNNUMBERED NODES
C     SKIP TO NEXT NODE IF OLD NODE IS ALREADY RENUMBERED
C     RESTRICT SEARCH TO ACTIVE NODES FOR KING SCHEME
C
      DO 30 L=1,NCN
      N=NADJ(LSUB+L)
      IF(NDEG(N).GT.0)NEW=NEW+1
      IF(NEWNN(N).EQ.0)GOTO 30
      IF(NEWNN(N).LT.MIN)MIN=NEWNN(N)
   30 CONTINUE
C
C     IN THE CASE OF A TIE, SELECT NODE WHICH HAS BEEN ACTIVE THE
C     LONGEST
C
      IF(NDEG(J).LT.0)NEW=NEW-1
      IF(NEW.GT.MINNEW)GOTO 40
      IF((NEW.EQ.MINNEW).AND.(MIN.GE.LMIN))GOTO 40
      MINNEW=NEW
      LMIN=MIN
      NEXT=J
   40 CONTINUE
C
C     RENUMBER NODE AND COMPUTE NUMBER OF ACTIVE NODES
C     ABANDON SCHEME IF NUMBER OF ACTIVE NODES EXCEEDS PREVIOUS
C     LOWEST MAXIMUM FRONTWIDTH
C
      NEWNN(NEXT)=K
      NIF=NIF+MINNEW
      IF(NIF.GT.MAXFRT)MAXFRT=NIF
      IF(MAXFRT.GE.MINMAX)GOTO 80
C
C     NEGATE ALL NDEG ENTRIES FOR NODES WHICH ARE ADJACENT TO NODE
C     JUST RENUMBERED
C
      IF(MINNEW.EQ.-1)GOTO 60
      NCN=IABS(NDEG(NEXT))
      JSUB=(NEXT-1)*MAXDEG
      DO 50 J=1,NCN
      N=NADJ(JSUB+J)
      IF(NDEG(N).GT.0)NDEG(N)=-NDEG(N)
   50 CONTINUE
C
   60 CONTINUE
C
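The minimum front-growth principle, together with the tie-breaker that favours the node which has been active the longest, can be sketched for the graph of Appendix I as follows. This is an illustrative Python sketch and not a transcription of RESEQ1: the function names are hypothetical, a connected graph is assumed, and the front is counted just before each elimination, as in the hand-worked example, which may differ from the bookkeeping of the listing.

```python
# Edge pairs E(G) of the hand-worked example in Appendix I.
EDGES = [(1, 2), (1, 3), (1, 4), (1, 6), (1, 7), (2, 3), (2, 6),
         (3, 4), (3, 5), (4, 5), (4, 7), (4, 8), (6, 7), (7, 8)]


def build_adjacency(edges):
    """Return {node: set of adjacent nodes} for an undirected graph."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def renumber_nodes(adj, start):
    """Renumber the nodes from `start`; return (new numbers, max frontwidth)."""
    new_number = {start: 1}
    active = set(adj[start])          # nodes in the front, not yet eliminated
    max_front = len(active) + 1       # front just before eliminating `start`
    for k in range(2, len(adj) + 1):
        if not active:
            raise ValueError("graph is not connected")
        best_key, best_node = None, None
        for node in sorted(active):   # candidates are restricted to active nodes
            # Number of new nodes brought into the front by eliminating `node`.
            growth = sum(1 for n in adj[node]
                         if n not in new_number and n not in active)
            # Lowest new number among renumbered neighbours: the smaller it
            # is, the longer the node has been active (tie-breaker).
            oldest = min(new_number[n] for n in adj[node] if n in new_number)
            key = (growth, oldest)
            if best_key is None or key < best_key:
                best_key, best_node = key, node
        max_front = max(max_front, len(active) + best_key[0])
        new_number[best_node] = k
        active.update(n for n in adj[best_node] if n not in new_number)
        active.discard(best_node)
    return new_number, max_front


def best_numbering(adj, starts):
    """Try each starting node and keep the lowest maximum frontwidth."""
    return min((renumber_nodes(adj, s) for s in starts), key=lambda r: r[1])


if __name__ == "__main__":
    ADJ = build_adjacency(EDGES)
    for s in (5, 6):
        print(s, renumber_nodes(ADJ, s))
```

Starting from node five, the sketch eliminates the old nodes in the order 5, 3, 2, 1, 6, 4, 7, 8 with a maximum frontwidth of four; starting from node six also gives four, as stated in Appendix I.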
      SUBROUTINE RESEQ2(NEWNUM,NPN,NEN,IEN,NPE,MAXNOD,NET,NODES,NVN,ALL)
C*********************************************************************
C
C     SUBPROGRAM RESEQ2
C
C     RESEQUENCE ELEMENT NUMBERS TO MINIMISE THE FRONTWIDTH
C     - REORDER THE ELEMENTS IN AN ASCENDING SEQUENCE OF THEIR
C       LOWEST NUMBERED NODES
C
C*********************************************************************
      KOUNT=0
C
C     LOOP OVER EACH NEW NODE NUMBER
C
      DO 40 I=1,NODES
   30 CONTINUE
   40 CONTINUE
   50 RETURN
      END
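The element resequencing step can be sketched in the same way. The following Python fragment is illustrative only and is not the RESEQ2 routine: the function names are hypothetical, the element numbering of the seven-triangle mesh is an assumption chosen to be consistent with the legible columns of Table IV, and the new node numbers are those obtained in Appendix I with node five as the starting node. Elements that share the same lowest node number keep their original relative order.

```python
# Assumed original element numbering (element 1 first); illustrative only.
TRIANGLES = [(1, 2, 3), (3, 4, 5), (4, 7, 8), (1, 4, 7),
             (1, 2, 6), (1, 6, 7), (1, 3, 4)]

# New node numbers with node five as the starting node: old node -> new node.
NEWNUM = {5: 1, 3: 2, 2: 3, 1: 4, 6: 5, 4: 6, 7: 7, 8: 8}


def reorder_elements(elements, new_number):
    """Order the elements by the lowest new node number they contain.

    Returns (lowest, nen, ien), where nen[old - 1] is the new element
    number and ien[new - 1] is the old element number.
    """
    lowest = [min(new_number[n] for n in elem) for elem in elements]
    # sorted() is stable, so ties keep their original relative order.
    order = sorted(range(len(elements)), key=lambda e: lowest[e])
    ien = [e + 1 for e in order]
    nen = [0] * len(elements)
    for new, old in enumerate(ien, start=1):
        nen[old - 1] = new
    return lowest, nen, ien


def max_frontwidth(elements, order):
    """Maximum number of active nodes when assembling elements in `order`."""
    last = {}
    for pos, e in enumerate(order):
        for n in elements[e - 1]:
            last[n] = pos             # position of the last element using n
    front, widest = set(), 0
    for pos, e in enumerate(order):
        front |= set(elements[e - 1])
        widest = max(widest, len(front))
        # A node is eliminated once its last element has been assembled.
        front -= {n for n in elements[e - 1] if last[n] == pos}
    return widest


if __name__ == "__main__":
    lowest, nen, ien = reorder_elements(TRIANGLES, NEWNUM)
    print(lowest, nen, ien)
    print(max_frontwidth(TRIANGLES, list(range(1, 8))), max_frontwidth(TRIANGLES, ien))
```

With the assumed element numbering, the lowest node numbers and new element numbers agree with Table IV, and the maximum frontwidth for element-by-element assembly falls from six to four.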
REFERENCES
1. I. P. King, An automatic reordering scheme for simultaneous equations derived from network systems, Int. J. num. Meth. Engng, 2, 523-533 (1970).
2. B. M. Irons and S. Ahmad, Techniques of Finite Elements, Ellis Horwood, Chichester, U.K., 1980.
3. E. Hinton and D. R. J. Owen, Finite Element Programming, Series in Computational Mathematics and Applications, vol. 1, Academic Press, London, 1977.
4. E. Cuthill, Several strategies for reducing the bandwidth of matrices, in Sparse Matrices and Their Applications (Rose, D. J. and Willoughby, R. A., Eds.), Plenum Press, New York, 1972.
5. R. Levy, Resequencing of the structural stiffness matrix to improve computational efficiency, Jet Propul. Lab. Q. Tech. Rev., 1, 61-70 (1971).
6. H. L. G. Pina, An algorithm for frontwidth reduction, Int. J. num. Meth. Engng, 17, 1539-1546 (1981).
7. J. E. Akin and R. M. Pardue, Element resequencing for frontal solutions, in The Mathematics of Finite Elements and Applications (MAFELAP 1975) (Whiteman, J. R., Ed.), Academic Press, London, 1975, pp. 535-541.
8. A. Razzaque, Automatic reduction of frontwidth for finite element analysis, Int. J. num. Meth. Engng, 15, 1315-1324 (1980).
9. E. Cuthill and J. McKee, Reducing the bandwidth of sparse symmetric matrices, Proc. A.C.M. Nat. Conf., Association for Computing Machinery, New York, 1969.
10. A. George and J. W. H. Liu, An implementation of a pseudo-peripheral node finder, A.C.M. Trans. Math. Software, 5, 284-295 (1979).
11. N. E. Gibbs, W. G. Poole and P. K. Stockmeyer, An algorithm for reducing the bandwidth and profile of a sparse matrix, SIAM J. Numer. Anal., 13, 236-250 (1976).
12. B. M. Irons, A frontal solution program for finite element analysis, Int. J. num. Meth. Engng, 2, 5-32 (1970).
13. H. R. Grooms, Algorithm for matrix bandwidth reduction, J. Struct. Div., A.S.C.E., 98(ST1), 203-214 (1972).
14. G. Akhras and G. Dhatt, An automatic node relabelling scheme for minimising a matrix or network bandwidth, Int. J. num. Meth. Engng, 10, 787-797 (1976).
15. A. M. Britto, private communication.
16. R. J. Collins, Bandwidth reduction by automatic renumbering, Int. J. num. Meth. Engng, 6, 345-356 (1973).