
Finite Elements in Analysis and Design 47 (2011) 1262-1279


A new parallel finite element algorithm for the stationary Navier-Stokes equations

Yueqiang Shang a,b,*, Yinnian He c, Do Wan Kim a, Xiaojun Zhou b

a Department of Mathematics, Inha University, Incheon, 402-751, Republic of Korea
b School of Mathematics and Computer Science, Guizhou Normal University, Guiyang, 550001, PR China
c Faculty of Science, Xi'an Jiaotong University, Xi'an, 710049, PR China

Article info

Article history: Received 20 September 2010; received in revised form 20 May 2011; accepted 1 June 2011; available online 2 July 2011.

Keywords: Navier-Stokes equations; Finite element; Parallel computing; Parallel algorithm; Two-grid method; Domain decomposition

Abstract

Based on two-grid discretization, a new parallel finite element algorithm for the stationary Navier-Stokes equations is proposed and analyzed. This algorithm first solves the Navier-Stokes equations on a coarse grid, and then corrects the resultant residual on a fine grid by solving local Navier-Stokes equations in parallel with homogeneous boundary conditions. An existing sequential Navier-Stokes solver can be used for each sub-domain problem, so the proposed parallel algorithm can be implemented on top of existing sequential software. Error bounds of the approximate solution are estimated. Moreover, the efficiency of the algorithm is demonstrated by numerical simulations of the lid-driven cavity flow, the backward-facing step flow, and the flow past a circular cylinder. © 2011 Elsevier B.V. All rights reserved.

* Corresponding author. Tel.: +82 32 860 8819.
E-mail addresses: shangyueqiang@sina.com, yueqiangshang@gmail.com (Y.Q. Shang), heyn@mail.xjtu.edu.cn (Y.N. He), dokim@inha.ac.kr (D.W. Kim), zxj0702@126.com (X.J. Zhou).
0168-874X/$ - see front matter © 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.finel.2011.06.001

1. Introduction

Computational fluid dynamics models are in general based on the solution of the Navier-Stokes equations and a discretization scheme for them, for instance, finite element methods and finite volume methods. To accurately capture the physical properties of the fluid flow being simulated, we usually need highly refined meshes on the entire flow domain, which can lead to a large-scale computation possibly beyond the capability of a single computer. Therefore, to utilize the computational power of modern high-performance parallel computers, much effort has been devoted to the development of efficient parallel computing methods for the Navier-Stokes equations and related flow problems (see, e.g., [1-7]).

Recently, local and parallel algorithms for the stationary Navier-Stokes equations were proposed and analyzed in [8-10], based on a new approach to local and parallel finite element computations [11,12] together with the fact that the global behavior of a solution to the Navier-Stokes equations is mostly dominated by the low frequency components while the local behavior is mainly affected by the high frequency components. Such algorithms were numerically compared in [13]. The key idea of these algorithms is to use the classical finite element discretization on a coarse grid to approximate the low frequencies, and then employ linearizations on local fine grids to correct the resultant residual of high frequencies. Theoretical analysis shows that these algorithms can yield the same order of convergence as the classical Galerkin finite element method if an appropriate ratio between the coarse mesh size and the fine mesh size is taken. However, in some cases of incompressible flows, even when the coarse grid size is suitably chosen, numerical computation showed that the finite element solutions obtained from these local and parallel algorithms are inaccurate, particularly for the pressure, when the overlapping size of the sub-domains is small.

The objective of this paper is to employ the local and parallel finite element computations approach of Xu and Zhou [11,12] to develop an efficient parallel finite element algorithm for the d-dimensional (d = 2, 3) stationary Navier-Stokes flows. This novel algorithm is based on a coarse grid finite element solution to the global Navier-Stokes equations and fine grid solutions to local Navier-Stokes equations defined on overlapped sub-domains. Here, the nonlinear problems are solved by means of linearization methods such as Newton and Picard iterations. Since existing sequential solvers are available for problems on sub-domains, our method can be easily implemented on top of existing sequential software.

It is worth mentioning that similar two-level or multi-level methods for the Navier-Stokes equations were proposed in [14-19] since the pioneering work of Xu [20]. The major difference between those methods and ours is that, in those methods, the coarse grid solution is used to linearize the nonlinear convection term on the finer grid(s), whereas our method employs a prediction-correction-type approach: the solution is first predicted on a coarse grid and then corrected by solving the residual equations on the fine grid in parallel.

Our method is also reminiscent of the nonlinear Galerkin methods (cf. [21-24]). However, there are several essential differences between the nonlinear Galerkin methods and our method. First, in the nonlinear Galerkin method only the velocity is separated into two parts, the small and large eddy components, while in our method both the velocity and the pressure are decomposed into low and high frequency components. Second, the coarse grid solution and the fine grid correction in our method are uncoupled in the computational process and are calculated sequentially, while in the nonlinear Galerkin methods such calculations are coupled together. Third, our method is a parallel computing method. One can expect that a global solver may yield a more accurate solution than our parallel solver. As mentioned before, however, the amount of storage required by the global solver often exceeds the capacity of modern computers.

The method proposed in this paper also differs from the classical two-level Schwarz methods (cf. [25,26,1]) in that the global coarse grid problem and the fine grid local problems need to be solved only once; moreover, in solving the local problems, our method requires no communication between processors. Another distinct feature of our method is that it is designed as a discretization scheme, in contrast to the methods in [29,27,28], where the two-level nonlinear methods were used as preconditioners. Moreover, in the present method, the coarse grid problem does not have to be coupled with the local fine grid problems.

The rest of the paper is organized as follows. In the next section, the Navier-Stokes equations and their mixed finite element approximations are provided. In Section 3, based on two-grid finite element discretization and domain decomposition, a new parallel algorithm is designed and analyzed. Numerical results on some benchmark problems such as the lid-driven cavity flow, the backward-facing step flow and the flow past a circular cylinder are given in Section 4. Finally, conclusions are drawn in Section 5.

2. Preliminaries

Let $\Omega$ be a bounded domain with Lipschitz-continuous boundary $\partial\Omega$ in $\mathbb{R}^d$ ($d = 2, 3$) that satisfies an additional condition stated in Assumption H0 below. As usual, for a nonnegative integer $k$, we denote by $H^k(\Omega)$ the Sobolev space of functions with square integrable distribution derivatives up to order $k$ in $\Omega$, equipped with the standard norm $\|\cdot\|_{k,\Omega}$, and by $H_0^1(\Omega)$ the closed subspace of $H^1(\Omega)$ consisting of functions with zero trace on $\partial\Omega$; see, e.g., [30,31]. Throughout this paper, we use the letter $c$ (with or without subscripts) to denote a generic positive constant which is independent of the mesh parameter and may take different values at different occurrences.

[Fig. 1. Triangulation (left) and decomposition (right) of the solution domain. The right panel labels the four sub-domains D1, D2, D3 and D4.]

2.1. The Navier-Stokes equations

We consider the following incompressible Navier-Stokes equations:

$$-\nu\Delta u + (u\cdot\nabla)u + \nabla p = f \quad \text{in } \Omega, \tag{2.1a}$$
$$\operatorname{div} u = 0 \quad \text{in } \Omega, \tag{2.1b}$$
$$u = 0 \quad \text{on } \partial\Omega, \tag{2.1c}$$

where $u = (u_1,\dots,u_d)^T$ is the velocity, $p$ the pressure, $f = (f_1,\dots,f_d)^T$ the prescribed body force and $\nu$ the kinematic viscosity. Given a characteristic length $L$ and a characteristic velocity $U$, the Reynolds number is defined as $Re = UL/\nu$.

To introduce the variational formulation of (2.1), we set

$$X = H_0^1(\Omega)^d, \quad Y = L^2(\Omega)^d, \quad M = L_0^2(\Omega) = \Big\{ q \in L^2(\Omega) : \int_\Omega q\,dx = 0 \Big\},$$

and define $a(\cdot,\cdot)$, $b(\cdot,\cdot,\cdot)$ and $d(\cdot,\cdot)$ as

$$a(u,v) = \nu(\nabla u,\nabla v), \quad d(v,q) = (\operatorname{div} v, q), \quad b(u,v,w) = \tfrac12((u\cdot\nabla)v,w) - \tfrac12((u\cdot\nabla)w,v), \quad \forall u,v,w \in X,\ q \in M,$$

where $(\cdot,\cdot)$ is the standard inner product of $L^2(\Omega)^l$ ($l = 1,2,3$). As mentioned above, a further assumption on $\Omega$ is needed:

H0. Assume that $\Omega$ is regular in the sense that the unique solution $(u,q) \in X \times M$ of the steady Stokes problem

$$-\nu\Delta u + \nabla q = g, \quad \operatorname{div} u = 0 \ \text{ in } \Omega, \quad u|_{\partial\Omega} = 0,$$

for prescribed $g \in Y$ exists and satisfies

$$\|u\|_{2,\Omega} + \|q\|_{1,\Omega} \le c\|g\|_{0,\Omega}.$$

The validity of Assumption H0 is known if $\partial\Omega$ is of class $C^2$, or if $\Omega$ is a two-dimensional convex polygon; see [32,33].

With the above notations, the variational formulation of (2.1) reads: find a pair $(u,p) \in X \times M$ such that

$$a(u,v) + b(u,u,v) - d(v,p) = (f,v), \quad \forall v \in X, \tag{2.2a}$$


$$d(u,q) = 0, \quad \forall q \in M. \tag{2.2b}$$

Defining

$$N := \sup_{u,v,w \in X,\ u,v,w \ne 0} \frac{|b(u,v,w)|}{\|\nabla u\|_{0,\Omega}\,\|\nabla v\|_{0,\Omega}\,\|\nabla w\|_{0,\Omega}},$$

we have the following existence and uniqueness results (cf. [34,35]).

Lemma 2.1. Given $f \in X'$ (the dual space of $X$), there exists at least one solution pair $(u,p) \in X \times M$ satisfying (2.2) and

$$\|\nabla u\|_{0,\Omega} \le \nu^{-1}\|f\|_{-1,\Omega}, \qquad \|f\|_{-1,\Omega} = \sup_{v \in X,\ v \ne 0} \frac{|(f,v)|}{\|\nabla v\|_{0,\Omega}}. \tag{2.3}$$

Moreover, if $\nu$ and $f$ satisfy the uniqueness condition

$$1 - \frac{N}{\nu^2}\|f\|_{-1,\Omega} > 0, \tag{2.4}$$

then the solution pair $(u,p)$ of problem (2.2) is unique.

2.2. Mixed finite element spaces

To describe the mixed finite element approximations of problem (2.2), let us assume $T^h(\Omega) = \{K\}$ to be a shape-regular triangulation (see, e.g., [35,31]) of $\Omega$ into triangles or quadrilaterals (if $d = 2$), or tetrahedrons or hexahedrons (if $d = 3$) with mesh-size function $h(x)$ whose value is the diameter $h_K$ of the element $K$ containing $x$, satisfying the following assumption:

A0. Triangulation. There exists $\gamma \ge 1$ such that

$$h_\Omega^\gamma \le c\,h(x), \quad \forall x \in \Omega, \tag{2.5}$$

where $h_\Omega = \max_{x \in \Omega} h(x)$ is the largest mesh size of $T^h(\Omega)$. Sometimes, we shall use $h$ instead of $h_\Omega$ for the mesh size on a domain that is clear from the context.

Let $X_h(\Omega) \subset H^1(\Omega)^d$ and $M_h(\Omega) \subset L^2(\Omega)$ be two finite element subspaces associated with a mesh $T^h(\Omega)$ and

$$X_h^0(\Omega) = X_h(\Omega) \cap H_0^1(\Omega)^d, \quad M_h^0(\Omega) = M_h(\Omega) \cap L_0^2(\Omega).$$

Given a sub-domain $G \subset \Omega$, we define $X_h(G)$, $M_h(G)$ and $T^h(G)$ to be the restriction of $X_h(\Omega)$, $M_h(\Omega)$ and $T^h(\Omega)$ to $G$, respectively, and set

$$X_0^h(G) = \{v \in X_h(\Omega) : \operatorname{supp} v \subset\subset G\}, \quad M_0^h(G) = \{q \in M_h(\Omega) : \operatorname{supp} q \subset\subset G\}.$$

We shall not restrict our attention to any specific mixed finite element space; rather we shall study a class of mixed finite element spaces satisfying the following assumptions (cf. [11,36-38]).

A1. Approximation. For each $(u,p) \in H^{t+1}(G)^d \times H^t(G)$ ($t \ge 1$), there exists an approximation $(\pi_h u, \rho_h p) \in X_h(G) \times M_h(G)$ such that

$$\|h^{-1}(u - \pi_h u)\|_{0,G} + \|u - \pi_h u\|_{1,G} \le c\,h_G^s\|u\|_{1+s,G}, \quad 0 \le s \le t, \tag{2.6}$$
$$\|h^{-1}(p - \rho_h p)\|_{-1,G} + \|p - \rho_h p\|_{0,G} \le c\,h_G^s\|p\|_{s,G}, \quad 0 \le s \le t. \tag{2.7}$$

A2. Inverse estimate. For any $(v,q) \in X_h(G) \times M_h(G)$, there hold

$$\|v\|_{1,G} \le c\|h^{-1}v\|_{0,G}, \quad \|q\|_{0,G} \le c\|h^{-1}q\|_{-1,G}. \tag{2.8}$$

A3. Superapproximation. For $G \subset \Omega$, let $\omega \in C_0^\infty(\Omega)$ with $\operatorname{supp}\omega \subset\subset G$. Then for any $(u,p) \in X_h(G) \times M_h(G)$, there is $(v,q) \in X_0^h(G) \times M_0^h(G)$ such that

$$\|h^{-1}(\omega u - v)\|_{1,G} \le c\|u\|_{1,G}, \quad \|h^{-1}(\omega p - q)\|_{0,G} \le c\|p\|_{0,G}. \tag{2.9}$$

A4. Inf-sup condition. There exists a constant $\beta > 0$ such that

$$\beta\|q\|_{0,G} \le \sup_{v \in X_h^0(G),\ v \ne 0} \frac{(\operatorname{div} v, q)}{\|\nabla v\|_{0,G}}, \quad \forall q \in M_h^0(G). \tag{2.10}$$

We refer to [39] for some examples satisfying Assumptions A1-A4. For instance, the MINI finite elements [40] and the P2-P0 finite elements [41] satisfy Assumptions A1-A4 when $t = 1$, while the Taylor-Hood elements [42] and the augmented P2-P1 elements [43,44] satisfy Assumptions A1-A4 when $t = 2$.

The mixed finite element approximation of problem (2.2) reads: find a pair $(u_h,p_h) \in X_h^0(\Omega) \times M_h^0(\Omega)$ such that

$$a(u_h,v) + b(u_h,u_h,v) - d(v,p_h) = (f,v), \quad \forall v \in X_h^0(\Omega), \tag{2.11a}$$
$$d(u_h,q) = 0, \quad \forall q \in M_h^0(\Omega). \tag{2.11b}$$

The following results on $(u_h,p_h)$ are classical (cf. [34,35]).

Lemma 2.2. Under Assumptions A0, A1 and A4, there exists a small $h_0 > 0$ such that for all $h \in (0,h_0]$, problem (2.11) admits a unique solution $(u_h,p_h)$. Moreover, if $(u,p) \in (H^{t+1}(\Omega)^d \cap H_0^1(\Omega)^d) \times (H^t(\Omega) \cap L_0^2(\Omega))$, then the following error estimate holds:

$$\|u - u_h\|_{1,\Omega} + \|p - p_h\|_{0,\Omega} \le c\,h^s(\|u\|_{s+1,\Omega} + \|p\|_{s,\Omega}), \quad 1 \le s \le t. \tag{2.12}$$

3. Parallel finite element algorithms

In this section, we first recall a parallel algorithm based on local finite element computations proposed in [9] for the steady Navier-Stokes equations, then analyze where it can be improved, and finally introduce our new parallel finite element algorithm based on two-grid discretization.

Let us first divide $\Omega$ into a number of disjoint sub-domains $D_1,\dots,D_m$, and then enlarge each $D_j$ to obtain $\Omega_j$ such that $D_j \subset\subset \Omega_j \subset \Omega$ ($j = 1,2,\dots,m$); here $D_j \subset\subset \Omega_j \subset \Omega$ means that $\operatorname{dist}(\partial D_j\setminus\partial\Omega,\ \partial\Omega_j\setminus\partial\Omega) > 0$. These $\Omega_j$'s form an overlapping decomposition of $\Omega$. Assume $T^H(\Omega)$ to be a shape-regular coarse grid with size $H \gg h$, $T^h(\Omega_j)$ a local shape-regular fine grid of sub-domain $\Omega_j$ and $T^h(\Omega)$ a global fine grid which coincides with the local fine grid in sub-domain $\Omega_j$. We are interested in obtaining an approximate solution in the sub-domains $D_j$ ($j = 1,2,\dots,m$) with an accuracy comparable to that of the classical finite element solution $(u_h,p_h)$ from $T^h(\Omega)$.

3.1. A parallel linearized algorithm

The parallel finite element algorithm based on two-grid discretization proposed in [9] for the stationary Navier-Stokes equations reads:

Algorithm 1. Parallel linearized finite element algorithm.

1. Find a global coarse grid solution $(u_H,p_H) \in X_H^0(\Omega) \times M_H^0(\Omega)$ such that

$$a(u_H,v) + b(u_H,u_H,v) - d(v,p_H) = (f,v), \quad \forall v \in X_H^0(\Omega),$$
$$d(u_H,q) = 0, \quad \forall q \in M_H^0(\Omega).$$

2. Find local fine grid corrections $(e_{h,j},\eta_{h,j}) \in X_h^0(\Omega_j) \times M_h^0(\Omega_j)$ ($j = 1,2,\dots,m$) in parallel:

$$a(e_{h,j},v) + b(e_{h,j},u_H,v) + b(u_H,e_{h,j},v) - d(v,\eta_{h,j}) = (R_j,v), \quad \forall v \in X_h^0(\Omega_j),$$
$$d(e_{h,j},q) = -d(u_H,q), \quad \forall q \in M_h^0(\Omega_j).$$

3. Set $(u^h,p^h) = (u_H,p_H) + (e_{h,j},\eta_{h,j})$ in $D_j$ ($j = 1,2,\dots,m$).

Here and hereafter,

$$(R_j,v) = (f,v) - a(u_H,v) - b(u_H,u_H,v) + d(v,p_H), \quad \forall v \in X_h^0(\Omega_j),\ j = 1,2,\dots,m. \tag{3.1}$$

Table 1. Errors of the solutions obtained from Algorithm 1.
(vel. error = $|||\nabla(u-u^h)|||_{0,\Omega}/\|\nabla u\|_{0,\Omega}$; pres. error = $|||p-p^h|||_{0,\Omega}/\|p\|_{0,\Omega}$)

h       H      CPU (s)   it_C   vel. error     pres. error   I(p^h)      Rate
1/27    1/18   2.393     3      0.00381917     14.1379       0.163686
1/64    1/32   8.476     3      0.000725397    2.25073       0.331027    2.12883
1/125   1/50   25.236    3      0.000190847    0.339697      0.0482335   2.82248
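As a small numerical aside on Table 1, the sketch below checks in exact arithmetic that the listed mesh pairs satisfy $(2H)^3 = h^2$ and estimates observed convergence orders from the reported relative velocity errors with the usual formula $\log(e_1/e_2)/\log(h_1/h_2)$. The function names are illustrative and not from the paper. The definition of the paper's "Rate" column is not stated in this excerpt, so the computed orders are not expected to reproduce that column.

```python
from fractions import Fraction
from math import log

# Mesh sizes as listed in Table 1 (analytical test case).
FINE_H = [Fraction(1, 27), Fraction(1, 64), Fraction(1, 125)]
COARSE_H = [Fraction(1, 18), Fraction(1, 32), Fraction(1, 50)]

# Relative velocity errors reported for Algorithm 1 in Table 1.
VELOCITY_ERRORS = [0.00381917, 0.000725397, 0.000190847]


def satisfies_mesh_relation(h, H):
    """Check the coarse/fine relation (2H)^3 = h^2 in exact arithmetic."""
    return (2 * H) ** 3 == h ** 2


def observed_orders(mesh_sizes, errors):
    """Observed order between consecutive refinements: log(e1/e2) / log(h1/h2)."""
    orders = []
    for k in range(1, len(errors)):
        ratio_e = errors[k - 1] / errors[k]
        ratio_h = float(mesh_sizes[k - 1] / mesh_sizes[k])
        orders.append(log(ratio_e) / log(ratio_h))
    return orders


if __name__ == "__main__":
    for h, H in zip(FINE_H, COARSE_H):
        print(f"h = {h}, H = {H}, (2H)^3 == h^2: {satisfies_mesh_relation(h, H)}")
    print("observed velocity orders:", observed_orders(FINE_H, VELOCITY_ERRORS))
```

The velocity orders come out close to 2, in line with the second-order behaviour expected for Taylor-Hood elements in the $H^1$-seminorm.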

Remark 3.1. Similar parallel linearized algorithms were also proposed for the stationary Navier-Stokes equations in [10,13]. They differ from Algorithm 1 in that they solve a different linearized problem on the fine grid; see [10,13] for details.

Defining piecewise norms

$$|||\nabla(u-u^h)|||_{0,\Omega} = \Big(\sum_{j=1}^m \|\nabla(u-u^h)\|_{0,D_j}^2\Big)^{1/2}, \quad |||p-p^h|||_{0,\Omega} = \Big(\sum_{j=1}^m \|p-p^h\|_{0,D_j}^2\Big)^{1/2},$$

we have the following error estimates (see [9]).

Theorem 3.1. Assume that $D_j \subset\subset \Omega_j \subset \Omega$ ($j = 1,2,\dots,m$), Assumptions A0-A4 and the conditions of Lemmas 2.1 and 2.2 hold, and $(u^h,p^h)$ is obtained from Algorithm 1. Then

$$|||\nabla(u_h-u^h)|||_{0,\Omega} + |||p_h-p^h|||_{0,\Omega} \le c\,H^{s+1}(\|u\|_{s+1,\Omega} + \|p\|_{s,\Omega}), \quad 1 \le s \le t.$$

Consequently,

$$|||\nabla(u-u^h)|||_{0,\Omega} + |||p-p^h|||_{0,\Omega} \le c\,(h^s + H^{s+1})(\|u\|_{s+1,\Omega} + \|p\|_{s,\Omega}), \quad 1 \le s \le t.$$

Theorem 3.1 shows that if the ratio of the coarse mesh size H to the fine mesh size h is suitably chosen, Algorithm 1 can yield the same order of convergence as the classical Galerkin finite element method and may provide asymptotically optimal errors for the approximate solution.

Table 2. Errors of the solutions obtained from Algorithm 2.
(vel. error = $|||\nabla(u-u^h)|||_{0,\Omega}/\|\nabla u\|_{0,\Omega}$; pres. error = $|||p-p^h|||_{0,\Omega}/\|p\|_{0,\Omega}$)

h       H      CPU (s)   it_C   it_F   vel. error     pres. error    I(p^h)        Rate
1/27    1/18   2.697     3      3      0.00381339     0.000680126    2.47622e-05
1/64    1/32   10.684    3      3      0.000720109    9.14859e-05    1.04331e-06   1.94062
1/125   1/50   29.382    3      2      0.000187746    3.08976e-05    2.03235e-07   1.99942

Table 3. Comparison of the two strategies.
(pres. error = $|||p-p^h|||_{0,\Omega}/\|p\|_{0,\Omega}$)

               Zero-restriction of pressure        Nonlinear corrections
               on artificial boundaries
h       H      pres. error    I(p^h)               pres. error   I(p^h)
1/27    1/18   0.000693688    1.48964e-05          14.1271       0.093903
1/64    1/32   9.11414e-05    7.93512e-07          2.22346       0.220091
1/125   1/50   3.65099e-05    1.10631e-06          0.336198      0.0338185

Table 4. Errors of the classical finite element solutions.
(vel. error = $\|\nabla(u-u_h)\|_{0,\Omega}/\|\nabla u\|_{0,\Omega}$; pres. error = $\|p-p_h\|_{0,\Omega}/\|p\|_{0,\Omega}$; the label of the sixth column is illegible in the source)

h       CPU (s)    it_F   vel. error     pres. error    (illegible)    Rate
1/27    3.878      3      0.00402224     0.00050581     2.96937e-11
1/64    24.354     3      0.000717938    8.98303e-05    2.78859e-14    1.99677
1/125   118.539    3      0.000188292    2.35458e-05    3.49921e-10    1.99931

[Fig. 2. $H^1$-error for the velocity (left) and $L^2$-error for the pressure (right): log(error) versus log(h) for Algorithm 1, Algorithm 2 and the classical FEM, with an $h^2$ reference line.]

However, detailed analysis and numerical tests showed that there is still room to improve the above algorithm. To begin with, let us consider the approximate pressure obtained from Algorithm 1 and set

$$I(p^h) := \sum_{j=1}^m \int_{D_j} p^h\,dx. \tag{3.2}$$

From problem (2.1), it is clear that the pressure is a function of $L^2(\Omega)$ which is defined up to an additive constant. This issue can be circumvented in one of two ways: the first is to look for a pressure with a vanishing average in $\Omega$, i.e., belonging to the space $L_0^2(\Omega)$; the second is to seek a pressure belonging to the quotient space $L^2(\Omega)/\mathbb{R}$. Obviously, Algorithm 1 adopts the first way to determine the pressure uniquely. From Algorithm 1, we can see that both the coarse grid approximation $p_H$ and the fine grid corrections $\eta_{h,j}$ ($j = 1,2,\dots,m$) have a vanishing average on their respective solution domains, i.e., both $\int_\Omega p_H\,dx = 0$ and $\int_{\Omega_j} \eta_{h,j}\,dx = 0$ ($j = 1,2,\dots,m$) are enforced. However, due to the overlapping of the sub-domains $\Omega_j$ ($j = 1,2,\dots,m$), Algorithm 1 cannot guarantee that the final result $p^h$ is really in $L_0^2(\Omega)$ or that $I(p^h)$ is small enough to ensure that $p^h$ is an acceptable approximation of the exact solution. In other words, Algorithm 1 cannot guarantee that, for $j = 1,2,\dots,m$, $\eta_{h,j} \in M_h^0(\Omega_j)$ at Step 2 is exactly the local correction of $p_H$ obtained at Step 1 in the subregion $D_j$; it may be the correction of another coarse grid approximation of the pressure. If this is the case, the approximate solution $p^h$ obtained from Algorithm 1 may be far away from the exact solution. Consequently, the accuracy of the approximate pressure obtained from Algorithm 1 depends not only on the coarse grid size H (or, equivalently, the coarse grid solution $p_H$), but also on whether the fine grid corrections $\eta_{h,j}$ ($j = 1,2,\dots,m$) at Step 2 are exactly the corrections of the coarse grid solution $p_H$ in the disjoint sub-domains.
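The mechanism described above can be illustrated with a one-dimensional toy computation, which is only a sketch and not the paper's experiment: two functions are normalized to have zero mean on overlapping intervals $\Omega_j$, yet the sum of their integrals over the disjoint pieces $D_j$ does not vanish. The shape function `x * x`, the overlap width `delta` and all function names are hypothetical choices made for illustration.

```python
def integrate(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over (a, b)."""
    width = (b - a) / n
    return width * sum(f(a + (k + 0.5) * width) for k in range(n))


def zero_mean_on(f, a, b):
    """Return f shifted by a constant so that its mean over (a, b) is zero."""
    mean = integrate(f, a, b) / (b - a)
    return lambda x: f(x) - mean


def mean_value_indicator(delta=0.1):
    """Toy analogue of I(p^h): sum over j of the integral of eta_j over D_j."""
    D = [(0.0, 0.5), (0.5, 1.0)]                       # disjoint sub-domains D_j
    Omega = [(0.0, 0.5 + delta), (0.5 - delta, 1.0)]   # overlapping sub-domains Omega_j
    base = lambda x: x * x                             # hypothetical local pressure shape
    etas = [zero_mean_on(base, a, b) for (a, b) in Omega]
    # Integrals over Omega_j vanish by construction ...
    local_integrals = [integrate(eta, a, b) for eta, (a, b) in zip(etas, Omega)]
    # ... but the integrals over D_j need not add up to zero.
    indicator = sum(integrate(eta, a, b) for eta, (a, b) in zip(etas, D))
    return local_integrals, indicator


if __name__ == "__main__":
    print(mean_value_indicator(delta=0.1))
    print(mean_value_indicator(delta=0.0))
```

With `delta = 0` the sub-domains do not overlap and the indicator vanishes; with a positive overlap it generally does not, even though each local function has zero mean on its own $\Omega_j$.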

[Fig. 3. Schematic diagram of the lid-driven cavity flow: a square cavity with side length L = 1, with $u_1 = 1$, $u_2 = 0$ on the lid and $u_1 = 0$, $u_2 = 0$ on the other three walls.]

[Fig. 4. Comparison of $u_1$-velocity profiles along the vertical centerline (top) and $u_2$-velocity profiles along the horizontal centerline (bottom) for lid-driven cavity flow at Re = 100: (a) 2 × 2 sub-domains; (b) 4 × 4 sub-domains. Curves: Present (H = 1/32, h = 1/64), Present (H = 1/64, h = 1/128), Classical FEM (h = 1/128), Ghia et al. (h = 1/128).]

3.2. New parallel finite element algorithm

Our new parallel finite element algorithm is motivated by the above analysis and observation. We only modify Step 2 of Algorithm 1 to calculate the corrections $(e_{h,j},\eta_{h,j})$ more precisely on the overlapped sub-domains $\Omega_j$ ($j = 1,2,\dots,m$). On one hand, unlike Algorithm 1, we confine the pressure correction $\eta_{h,j}$ to the space $L^2(\Omega_j)/\mathbb{R}$ by adding a homogeneous boundary condition on the artificial boundary $\partial\Omega_j\setminus\partial\Omega$ of the sub-domains $\Omega_j$ ($j = 1,2,\dots,m$) in the fine grid local correction problems. On the other hand, we solve a fully nonlinear correction problem by an iterative method such as Newton or Picard iteration (see, e.g., [45,46]) independently on the sub-domains $\Omega_j$ ($j = 1,2,\dots,m$). Specifically, we first approximate the low frequency components of the solution to the Navier-Stokes equations using a coarse grid on the entire domain, as done in Algorithm 1, and then use a fine grid to correct the resultant residual in parallel on a collection of overlapped sub-domains, where the local problems for these fine grid corrections are fully nonlinear, with homogeneous boundary conditions for the velocity on all boundaries of the overlapped sub-domains and homogeneous conditions for the pressure only on the artificial boundaries. All of these nonlinear correction problems are solved in parallel by an iterative method such as Newton or Picard iteration. Setting

$$M_h^{\Gamma_j}(\Omega_j) = \{q \in M_h(\Omega_j) : q|_{\Gamma_j} = 0\}, \quad \Gamma_j = \partial\Omega_j\setminus\partial\Omega, \tag{3.3}$$

our new algorithm with Newton iteration for the nonlinear correction problems reads:

Algorithm 2. New parallel finite element algorithm.

1. Find a global coarse grid solution $(u_H,p_H) \in X_H^0(\Omega) \times M_H^0(\Omega)$ such that

$$a(u_H,v) + b(u_H,u_H,v) - d(v,p_H) = (f,v), \quad \forall v \in X_H^0(\Omega),$$
$$d(u_H,q) = 0, \quad \forall q \in M_H^0(\Omega).$$

2. Find fine grid corrections $(e_{h,j},\eta_{h,j}) \in X_h^0(\Omega_j) \times M_h^{\Gamma_j}(\Omega_j)$ ($j = 1,2,\dots,m$) in parallel by the following iterative procedure:

$$a(e_{h,j}^n,v) + b(e_{h,j}^n,e_{h,j}^{n-1},v) + b(e_{h,j}^{n-1},e_{h,j}^n,v) - d(v,\eta_{h,j}^n) = b(e_{h,j}^{n-1},e_{h,j}^{n-1},v) + (R_j,v), \quad \forall v \in X_h^0(\Omega_j),$$
$$d(e_{h,j}^n,q) = -d(u_H,q), \quad \forall q \in M_h^{\Gamma_j}(\Omega_j), \tag{3.4}$$

for $n = 1,2,\dots$, where the initial guess is $e_{h,j}^0 = 0$ for $j = 1,2,\dots,m$.

3. Set $(u^h,p^h) = (u_H,p_H) + (e_{h,j},\eta_{h,j})$ in $D_j$ ($j = 1,2,\dots,m$).

Remark 3.2. In our new algorithm, we impose a zero restriction on the artificial boundaries of the sub-domains in the local correction problems. Similar boundary conditions were used in [47-49] for the incompressible Stokes and Navier-Stokes equations, respectively. Such a restriction does not lead to singular problems because the zero Dirichlet boundary condition for the pressure enforces a unique pressure solution.

Remark 3.3. Step 2 of the above new algorithm is the Newton iterative method applied to the following local residual problem:

$$a(e_{h,j},v) + b(e_{h,j},e_{h,j},v) - d(v,\eta_{h,j}) = (R_j,v), \quad \forall v \in X_h^0(\Omega_j),$$
$$d(e_{h,j},q) = -d(u_H,q), \quad \forall q \in M_h^{\Gamma_j}(\Omega_j). \tag{3.5}$$
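To make the structure of the Newton linearization in Step 2 concrete, here is a minimal algebraic sketch; it is not the paper's finite element implementation. It assumes a small made-up system $Ae + b(e,e) = r$ in which the matrix `A` stands in for $a(\cdot,\cdot)$, a three-index tensor `T` stands in for $b(\cdot,\cdot,\cdot)$ and `r` stands in for $R_j$; the pressure and the divergence constraint are omitted. A Picard variant, which freezes only the convecting field, is included for comparison. The iteration starts from zero and stops on a relative-increment test with tolerance $10^{-6}$.

```python
import numpy as np


def make_toy_problem(n=5):
    """Build a small algebraic stand-in for a(.,.), b(.,.,.) and the residual R_j."""
    A = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # SPD "viscous" matrix
    i, j, k = np.meshgrid(np.arange(n), np.arange(n), np.arange(n), indexing="ij")
    T = 0.1 * np.sin(i + 2 * j + 3 * k)                      # tensor of the quadratic term
    r = 0.1 * np.cos(np.arange(n))                           # right-hand side
    return A, T, r


def convect(T, u, v):
    """Algebraic analogue of b(u, v, .): entries sum_jk T[i,j,k] u[j] v[k]."""
    return np.einsum("ijk,j,k->i", T, u, v)


def residual(A, T, r, e):
    """Residual of the nonlinear problem A e + b(e, e) = r."""
    return A @ e + convect(T, e, e) - r


def solve_correction(A, T, r, method="newton", tol=1e-6, max_iter=50):
    """Iterate from e = 0 until the relative increment drops below tol."""
    e_prev = np.zeros_like(r)
    for it in range(1, max_iter + 1):
        # Matrix of v -> b(e_prev, v, .): contract T with e_prev in the first slot.
        B_right = np.einsum("ijk,j->ik", T, e_prev)
        if method == "newton":
            # Matrix of v -> b(v, e_prev, .): contract T with e_prev in the second slot.
            B_left = np.einsum("ijk,k->ij", T, e_prev)
            lhs = A + B_left + B_right
            rhs = r + convect(T, e_prev, e_prev)
        else:
            # Picard: only the convecting field is frozen at the previous iterate.
            lhs = A + B_right
            rhs = r
        e = np.linalg.solve(lhs, rhs)
        if np.linalg.norm(e - e_prev) < tol * np.linalg.norm(e):
            return e, it
        e_prev = e
    raise RuntimeError("iteration did not converge")


if __name__ == "__main__":
    A, T, r = make_toy_problem()
    for method in ("newton", "picard"):
        e, iterations = solve_correction(A, T, r, method=method)
        print(method, iterations, np.linalg.norm(residual(A, T, r, e)))
```

In this toy setting the data are deliberately small, so both iterations converge from the zero initial guess; for the actual local problems, convergence is tied to stability conditions on the data.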

[Fig. 5. Comparison of $u_1$-velocity profiles along the vertical centerline (top) and $u_2$-velocity profiles along the horizontal centerline (bottom) for lid-driven cavity flow at Re = 1000: (a) 2 × 2 sub-domains; (b) 4 × 4 sub-domains. Curves: Present (H = 1/32, h = 1/64), Present (H = 1/64, h = 1/128), Classical FEM (h = 1/128), Ghia et al. (h = 1/128).]


We can also employ other linearization methods to solve the nonlinear correction problem (3.5). For example, the Picard iterative method (see, e.g., [45,46]) applied to problem (3.5) reads:

$$a(e_{h,j}^n,v) + b(e_{h,j}^{n-1},e_{h,j}^n,v) - d(v,\eta_{h,j}^n) = (R_j,v), \quad \forall v \in X_h^0(\Omega_j),$$
$$d(e_{h,j}^n,q) = -d(u_H,q), \quad \forall q \in M_h^{\Gamma_j}(\Omega_j), \tag{3.6}$$

for $n = 1,2,\dots$.

Remark 3.4. As one of the referees pointed out, the corrections in the velocity and pressure fields can be viewed as approximations of the discretization errors between the solutions computed on the two different meshes (see (3.5) and (2.11), respectively). This is, in a way, related to the residual-type methods for a posteriori error estimation in finite element analysis (cf. [50-52]). We refer, for example, to [53-55] for such residual-type a posteriori error estimations for the steady Navier-Stokes equations, and to [56-59] for the unsteady Navier-Stokes equations. However, the main philosophy behind the present paper is that phenomena of different scales should be treated by different tools [11], which differs from that of a posteriori error estimation.

Remark 3.5. The approximation $(u^h,p^h)$ obtained from our algorithm is piecewise defined and is in general discontinuous. Where $\bar D_i \cap \bar D_j \ne \emptyset$ ($i \ne j$), on the interface we can simply take the average of the two sub-domain solutions as the solution (this strategy was used in our numerical experiments). To obtain a global continuous approximation, one can use an additional local fine grid problem to smooth the solution $(u^h,p^h)$ as done in [11].

For $j = 1,2,\dots,m$, defining

$$\|R_j\|_{-1,\Omega_j} = \sup_{v \in H_0^1(\Omega_j)^d,\ v \ne 0} \frac{|(R_j,v)_{\Omega_j}|}{\|\nabla v\|_{0,\Omega_j}}, \tag{3.7}$$

$$N_j = \sup_{u,v,w \in H_0^1(\Omega_j)^d,\ u,v,w \ne 0} \frac{|b(u,v,w)|}{\|\nabla u\|_{0,\Omega_j}\,\|\nabla v\|_{0,\Omega_j}\,\|\nabla w\|_{0,\Omega_j}}, \tag{3.8}$$

we have the following error estimate for our new parallel algorithm.

Theorem 3.2. Suppose that the conditions of Theorem 3.1 are valid and the following stability conditions hold:

$$\frac{25 N_j}{3\nu^2}\|R_j\|_{-1,\Omega_j} < 1, \quad j = 1,2,\dots,m. \tag{3.9}$$

Then the approximate solution $(u^h,p^h)$ obtained from Algorithm 2 has the following error estimate:

$$|||\nabla(u-u^h)|||_{0,\Omega} + |||p-p^h|||_{0,\Omega} \le c\,(h^s + H^{s+1})(\|u\|_{s+1,\Omega} + \|p\|_{s,\Omega}), \quad 1 \le s \le t.$$

Proof. From Lemmas 4.2 and 5.2 in [46], we obtain that, under the stability condition (3.9), the iterative procedure (3.4) is stable and convergent for all $j = 1,2,\dots,m$. By an argument similar to that used in the proof of Theorem 4.2 in [9] and Theorem 3.2 in [37], we can easily finish the proof. □

[Fig. 6. Comparison of $u_1$-velocity profiles along the vertical centerline (top) and $u_2$-velocity profiles along the horizontal centerline (bottom) for lid-driven cavity flow at Re = 5000: (a) 2 × 2 sub-domains; (b) 4 × 4 sub-domains. Curves: Present (H = 1/32, h = 1/64), Present (H = 1/64, h = 1/128), Classical FEM (h = 1/128), Ghia et al. (h = 1/256).]

Remark 3.6. A fully nonlinear problem on the coarse grid needs to be solved in both Algorithms 1 and 2. We usually solve this nonlinear Navier-Stokes problem using either the Newton method or the Picard method (see, e.g., [45,46]). From the definitions of $\|f\|_{-1,\Omega}$, $\|R_j\|_{-1,\Omega_j}$, $N$ and $N_j$ ($j = 1,2,\dots,m$) (see (2.3), (3.7), (2.4) and (3.8), respectively), we see that when the Newton iterative method (which needs the stability condition $25N\|f\|_{-1,\Omega}/(3\nu^2) < 1$; see [46]) is employed to solve the coarse grid problem, the stability conditions (3.9) are apparently valid. Therefore, no stricter conditions than those of Algorithm 1 are required for our new Algorithm 2. Throughout this paper, we assume that the nonlinear problems are uniquely solvable by the above mentioned iterative methods and that the corresponding conditions for these methods hold.

Comparing Algorithm 2 with Algorithm 1, we can see that the difference between the two algorithms lies in Step 2. First, unlike Algorithm 1, where the correction problems are linear, the local correction problems in our new algorithm are nonlinear. Second, our new algorithm applies a homogeneous boundary condition for the pressure on the artificial boundary $\partial\Omega_j\setminus\partial\Omega$ of the sub-domains $\Omega_j$ ($j = 1,2,\dots,m$) in the nonlinear correction problems. The homogeneous boundary condition for the pressure on the artificial boundaries of the overlapped sub-domains ensures that, in $D_j$ ($j = 1,2,\dots,m$), the computed results $\eta_{h,j}$ are exactly the corrections of $p_H$, and hence the final result $p^h$ is in $L_0^2(\Omega)$ or has a small value of $I(p^h)$.

From Algorithm 2 we can see that our new parallel algorithm is based on a global coarse grid nonlinear problem and local fine grid nonlinear problems. There is no communication between processors while the local correction problems are being solved. If we allow all processors to compute the coarse grid solution simultaneously, our algorithm only requires an existing sequential solver as the sub-problem solver and hence allows existing sequential PDE codes to run in a parallel environment with little investment in recoding: given an existing or black-box sequential Navier-Stokes solver, our algorithm only requires the application of the solver on the overlapped sub-domains and on a global coarse mesh. This is a very attractive feature of our algorithm.
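The control flow described above, one global coarse solve followed by independent local solves, can be sketched as a thin driver around an existing solver. This is only an illustration under stated assumptions: `solve_coarse` and `solve_local` are hypothetical wrappers supplied by the caller, not functions from the paper or from any particular library, and a thread pool is used purely to show the task structure. A production setting would more likely use separate processes or MPI ranks.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Hashable, Sequence, TypeVar

Coarse = TypeVar("Coarse")
Local = TypeVar("Local")


def two_grid_parallel_solve(
    solve_coarse: Callable[[], Coarse],
    solve_local: Callable[[Hashable, Coarse], Local],
    subdomains: Sequence[Hashable],
    max_workers: int = 4,
) -> Dict[Hashable, Local]:
    """Run one global coarse solve, then independent local corrections.

    solve_coarse and solve_local are hypothetical wrappers around an existing
    sequential solver; each local solve only receives the coarse solution.
    """
    coarse = solve_coarse()  # Step 1: global coarse grid problem
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Step 2: local fine grid corrections; no data is exchanged between tasks.
        futures = {j: pool.submit(solve_local, j, coarse) for j in subdomains}
        # Step 3 (assembly on D_j) is left to the caller; results are keyed by j.
        return {j: future.result() for j, future in futures.items()}


if __name__ == "__main__":
    # Stand-in "solvers": numbers instead of finite element functions.
    result = two_grid_parallel_solve(lambda: 10, lambda j, c: c + j, [1, 2, 3, 4])
    print(result)
```

Each local task depends only on the coarse solution and its own sub-domain index, which mirrors the absence of inter-processor communication in the local correction step.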

4. Numerical results In this section, we shall report some numerical results to demonstrate the efciency of our new parallel algorithm. The test cases include a simple problem with known analytical solution, the lid-driven cavity ow, the backward-facing step ow, and the

1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 0.5 0 y

Re=7500

1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 0.5 y

Re=7500

Present (H=1/32, h=1/64) Present (H=1/64, h=1/128) Ghia et al. (h=1/256)

Present (H=1/32, h=1/64) Present (H=1/64, h=1/128) Ghia et al. (h=1/256)

0.5 u1velocity Re=7500

0 0.5 u1velocity Re=7500

0.6 0.4 0.2 u2velocity

0.6 0.4 0.2 u2velocity 0 0.2 0.4 0.6 0.8 1 0.8 0

0 0.2 0.4 0.6 0.8 0


Present (H=1/32, h=1/64) Present (H=1/64, h=1/128) Ghia et al. (h=1/256)

Present (H=1/32, h=1/64) Present (H=1/64, h=1/128) Ghia et al. (h=1/256)

0.2

0.4 x

0.6

0.2

0.4 x

0.6

0.8

Fig. 7. Comparison of u1-velocity proles along the vertical centerline (top) and u2-velocity proles along the horizontal centerline (bottom) for lid-driven cavity ow at Re 7500: (a) 2 2 sub-domains; (b) 4 4 sub-domains.

1270

Y.Q. Shang et al. / Finite Elements in Analysis and Design 47 (2011) 12621279

flow past a circular cylinder. The routine UMFPACK [60] is used to solve the linear systems arising from each nonlinear iteration. In all the numerical experiments, the second-order Taylor-Hood elements are used for the finite element discretization.

4.1. Analytical solution

In this test case, $\Omega$ is the unit square $[0,1] \times [0,1]$ in $\mathbb{R}^2$. We set $f$ and the boundary conditions such that the exact solution of the stationary Navier-Stokes equations is given by

$$u_1 = \sin^2(\pi x)\,\sin(2\pi y), \qquad u_2 = -\sin(2\pi x)\,\sin^2(\pi y), \qquad p = \cos(\pi x).$$

The mesh consists of triangular elements obtained by dividing $\Omega$ (or $\Omega_j$, $j = 1,2,\ldots,m$) into sub-squares of equal size and then drawing the diagonal in each sub-square; see Fig. 1 (left). We divide $\Omega = [0,1] \times [0,1]$ into four disjoint sub-domains

$$D_1 = (0,\tfrac{1}{2}) \times (0,\tfrac{1}{2}), \quad D_2 = (0,\tfrac{1}{2}) \times (\tfrac{1}{2},1), \quad D_3 = (\tfrac{1}{2},1) \times (0,\tfrac{1}{2}), \quad D_4 = (\tfrac{1}{2},1) \times (\tfrac{1}{2},1),$$

and then extend each sub-domain $D_j$ ($j = 1,2,3,4$) outward by an extra layer of size $h$ to obtain $\Omega_j$ ($j = 1,2,3,4$); see Fig. 1 (right). These $\Omega_j$ form an overlapping decomposition of $\Omega$. We compute the finite element solutions on the sub-domains $\Omega_j$ ($j = 1,2,3,4$) independently by using Algorithms 1 and 2, respectively. The coarse grid nonlinear problem is solved by Newton's method, and convergence is declared when the relative $L^2$-error between successive velocity iterates is within a fixed tolerance of $10^{-6}$, i.e., when the following condition is satisfied:

$$\frac{\|u_H^{n+1} - u_H^{n}\|_{0,\Omega}}{\|u_H^{n+1}\|_{0,\Omega}} < 10^{-6}, \qquad (4.1)$$

where $u_H^{n+1}$ is the $(n+1)$-th iterate. In our new Algorithm 2, the stopping criterion for the local nonlinear correction problems on $\Omega_j$ ($j = 1,2,\ldots,m$) is

$$\frac{\|e_{h,j}^{n+1} - e_{h,j}^{n}\|_{0,\Omega_j}}{\|e_{h,j}^{n+1}\|_{0,\Omega_j}} < 10^{-6}. \qquad (4.2)$$
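The stopping tests (4.1) and (4.2) share one pattern: stop when the relative change between two successive iterates drops below $10^{-6}$. The following is a minimal sketch of that pattern, not the authors' implementation. The function names, the optional `weights` argument (a lumped-mass stand-in for the finite element $L^2$ norm) and the toy Newton step are assumptions introduced only for illustration.

```python
import math


def l2_norm(v, weights=None):
    """Discrete L2-type norm of a coefficient vector.

    `weights` is an optional list of positive nodal weights (a hypothetical
    lumped-mass stand-in); without it the plain Euclidean norm is used.
    """
    if weights is None:
        return math.sqrt(sum(x * x for x in v))
    return math.sqrt(sum(w * x * x for w, x in zip(weights, v)))


def relative_change(u_new, u_old, weights=None):
    """Quantity on the left-hand side of a test of type (4.1)/(4.2)."""
    diff = [a - b for a, b in zip(u_new, u_old)]
    denom = l2_norm(u_new, weights)
    if denom == 0.0:
        return 0.0 if l2_norm(diff, weights) == 0.0 else math.inf
    return l2_norm(diff, weights) / denom


def iterate(step, u0, tol=1e-6, max_iter=50, weights=None):
    """Apply `step` until the relative change is below `tol`.

    Returns (last iterate, number of steps taken, converged flag).
    """
    u_old = list(u0)
    for n in range(1, max_iter + 1):
        u_new = list(step(u_old))
        if relative_change(u_new, u_old, weights) < tol:
            return u_new, n, True
        u_old = u_new
    return u_old, max_iter, False


def newton_sqrt_step(u):
    """Toy nonlinear step: Newton's method for x^2 = 2 and x^2 = 3."""
    targets = (2.0, 3.0)
    return [0.5 * (x + t / x) for x, t in zip(u, targets)]


solution, iterations, converged = iterate(newton_sqrt_step, [1.0, 1.0])
```

In a finite element code the Euclidean norm would be replaced by the mass-matrix norm on $\Omega$ or $\Omega_j$, and `step` by one Newton or Picard solve; the control flow stays the same.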

We set $\nu = 0.1$ and compute the finite element solutions with fine meshes of size $h = n^{-3}$ ($n = 3,4,5$) and corresponding coarse meshes of size $H$ satisfying $2H^3 = h^2$. The numerical results are listed in Tables 1 and 2, respectively, where the CPU time is the maximum CPU time taken by the algorithms over the four overlapping sub-domains; it includes the mesh generation time, the time spent solving the problems on both the coarse and fine grids, and the error computation time. itC stands for the number of nonlinear iterations needed to satisfy the stopping criterion (4.1) for the coarse grid

Fig. 8. Comparison of $u_1$-velocity profiles along the vertical centerline (top) and $u_2$-velocity profiles along the horizontal centerline (bottom) for lid-driven cavity flow at $Re = 10\,000$: (a) 2 × 2 sub-domains; (b) 4 × 4 sub-domains. Legend: Present ($H = 1/32$, $h = 1/64$); Present ($H = 1/64$, $h = 1/128$); Ghia et al. ($h = 1/256$).


Fig. 9. Computed streamlines (top) and isobars (bottom) for lid-driven cavity flow at $Re = 1000$: (a) 2 × 2 sub-domains; (b) 4 × 4 sub-domains.

Fig. 10. Computed streamlines (top) and isobars (bottom) for lid-driven cavity flow at $Re = 5000$: (a) 2 × 2 sub-domains; (b) 4 × 4 sub-domains.


problem, while itF is the maximum number of iterations needed to satisfy the stopping criterion (4.2) for the fine grid local nonlinear correction problems in our new algorithm. $I_{p^h}$ is defined by (3.2). The convergence rates with respect to the mesh parameter $h$ are computed by the formula $\log(E_i/E_{i+1}) / \log(h_i/h_{i+1})$, where $E_i$ and $E_{i+1}$ are the relative errors

$$\frac{|||\nabla(u - u^h)|||_{0,\Omega} + |||p - p^h|||_{0,\Omega}}{\|\nabla u\|_{0,\Omega} + \|p\|_{0,\Omega}}$$

corresponding to the fine meshes of sizes $h_i$ and $h_{i+1}$, respectively. According to the mixed finite element spaces we choose and the relationship between the mesh sizes $H$ and $h$, i.e., $H = O(h^{2/3})$, from Theorems 3.1 and 3.2 we have

$$|||\nabla(u - u^h)|||_{0,\Omega} + |||p - p^h|||_{0,\Omega} \approx c\,h^{2}.$$

The results shown in Tables 1 and 2 support the above estimate for both Algorithm 1 and our new Algorithm 2; see Fig. 2. However, Table 1 shows that the pressure computed by Algorithm 1 is inaccurate. Although both the coarse grid solution $p_H$ and the local fine grid corrections $\eta_{h,j}$ ($j = 1,2,3,4$) have vanishing averages over $\Omega$ and $\Omega_j$ ($j = 1,2,3,4$), respectively, the accuracy of the pressure is very poor and the values of $I_{p^h}$ are far from zero; this is predicted by our analysis in Section 3.1. By contrast, Table 2 shows that, with a homogeneous condition on the artificial boundaries of the sub-domains for the pressure corrections and with several nonlinear iterations for the local correction problems, our new algorithm yields a reasonable approximate solution. To investigate the contributions of the modification strategies (i.e., the zero restriction of pressure on the artificial boundaries and the nonlinear version of the corrections) to the improvement of the pressure approximation, we computed the finite element solutions with each strategy separately. The numerical results listed in Table 3 show that the improvement in the pressure approximation results mainly from the zero restriction of pressure on the artificial boundaries, which confirms our analysis in Section 3.1. Comparing Table 1 with Table 2, we can see that our new algorithm performs much better than Algorithm 1. As for the CPU time, our new algorithm takes slightly more than Algorithm 1. However, compared with the classical finite element method, our new algorithm saves a large amount of computational time while attaining comparable accuracy; see Tables 2 and 4 and Fig. 2.

4.2. Lid-driven cavity flow

For this test case, we consider the 2D lid-driven cavity flow, a well-known benchmark problem that has been investigated numerically by many researchers (cf. [61-63]). The problem is defined in the unit square. With zero external force, the velocities are zero on all boundaries except the top one (the lid), where the driving horizontal velocity is set to unity; see Fig. 3. The Reynolds number for this problem is defined as $Re = UL/\nu$, where $U$ is the velocity of the top lid and $L$ is the length of the side wall. For the 2D lid-driven cavity flow problem, it is well documented that to ensure the convergence of the iterative method used for the nonlinear Navier-Stokes system so as to generate an

Fig. 11. Computed streamlines (top) and isobars (bottom) for lid-driven cavity flow at $Re = 7500$: (a) 2 × 2 sub-domains; (b) 4 × 4 sub-domains.


approximate solution, sufficiently fine meshes are necessary as the Reynolds number increases. For example, based on the velocity-pressure formulation of the Navier-Stokes equations, Layton et al. [64] reported that at $Re = 3200$, the classical finite element method combined with a continuation method failed to converge on a 31 × 31 grid. Using the classical finite element method,

Fig. 12. Computed streamlines (top) and isobars (bottom) for lid-driven cavity flow at $Re = 10\,000$: (a) 2 × 2 sub-domains; (b) 4 × 4 sub-domains.

Fig. 13. Schematic diagram of the backward-facing step flow. Labels in the diagram: inlet $u_1 = 24y(0.5 - y)$, $u_2 = 0$; walls $u_1 = u_2 = 0$; outlet $-p + \nu\,\partial u_1/\partial x = 0$, $u_2 = 0$; corner points $(0, \pm 0.5)$ and $(30, \pm 0.5)$.

Fig. 14. Comparison of $u_1$-velocity (left), $u_2$-velocity (middle) and pressure (right) at various downstream locations for backward-facing step flow at $Re = 800$. Legend: Present ($x = 7$); Present ($x = 15$); Gartling ($x = 7$); Gartling ($x = 15$).


Wang [65] was only able to compute the solution at Reynolds numbers up to $Re = 5000$ on an 81 × 81 uniform grid. Based on the stream function-vorticity formulation of the Navier-Stokes equations, using pseudo-time derivatives and a finite difference method, Erturk et al. [62] reported that they could not obtain a steady solution at $Re = 7500$ on a 129 × 129 grid, whereas on a finer 257 × 257 grid they were able to obtain steady solutions at Reynolds numbers up to 12 500.

In our algorithm, a nonlinear Navier-Stokes problem needs to be solved on both the coarse and fine grids. In view of the above remarks, to ensure that a coarse grid solution can be obtained at high Reynolds numbers, we combine our parallel method with the defect-correction method (cf. [64,65]), which can yield an approximate solution on a relatively coarse grid compared with the classical finite element method. The defect-correction method consists of an initial defect step followed by several correction

Fig. 15. Pressure (left) and shear stress (right) profiles along the upper and lower channel walls for backward-facing step flow at $Re = 800$. Legend: Lower wall; Upper wall.

Table 5
Comparison of the normalized (by the step height) length ($L_m$) of the main recirculation region downstream of the step, the separation location ($X_s$), the reattachment location ($X_r$) and the length $L_s = X_r - X_s$ of the second recirculation region on the upper wall for the backward-facing step flow at $Re = 800$.

Reference                      Lm      Xs      Xr      Ls = Xr - Xs
Gartling [66]                  12.20   9.70    20.96   11.26
Erturk [67]                    11.83   9.48    20.55   11.07
Barton [68]                    12.03   9.64    20.96   11.32
Keskar and Lyn [69]            12.19   9.71    20.96   11.25
Grigoriev and Dargush [70]     12.18   9.70    20.94   11.24
Present                        12.15   9.67    20.90   11.23
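The last column of Table 5 is defined as $L_s = X_r - X_s$, so the tabulated values can be cross-checked with a few lines of arithmetic. The sketch below transcribes the table and verifies that relation to two decimals; it also reports how far the present $L_m$ lies from Gartling's value. The dictionary layout and function names are illustrative assumptions, not part of the original work.

```python
# Values transcribed from Table 5: (Lm, Xs, Xr, Ls) for each reference.
TABLE_5 = {
    "Gartling [66]": (12.20, 9.70, 20.96, 11.26),
    "Erturk [67]": (11.83, 9.48, 20.55, 11.07),
    "Barton [68]": (12.03, 9.64, 20.96, 11.32),
    "Keskar and Lyn [69]": (12.19, 9.71, 20.96, 11.25),
    "Grigoriev and Dargush [70]": (12.18, 9.70, 20.94, 11.24),
    "Present": (12.15, 9.67, 20.90, 11.23),
}


def inconsistent_rows(rows, decimals=2):
    """Names of rows whose tabulated Ls differs from Xr - Xs after rounding."""
    return [
        name
        for name, (_, xs, xr, ls) in rows.items()
        if round(xr - xs, decimals) != round(ls, decimals)
    ]


def relative_deviation(value, reference):
    """Relative difference |value - reference| / |reference|."""
    return abs(value - reference) / abs(reference)


mismatches = inconsistent_rows(TABLE_5)
lm_deviation = relative_deviation(TABLE_5["Present"][0], TABLE_5["Gartling [66]"][0])
```

With the transcribed values, `mismatches` is empty and the present $L_m$ differs from Gartling's by less than half a percent.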

Fig. 16. Computed streamlines for backward-facing step flow at various Reynolds numbers (panels: $Re = 100$, 500, 800, 1000).


steps. In the defect step, an artificial viscosity parameter $\alpha = O(h)$ is added to the kinematic viscosity as a stability factor, and the system is then anti-diffused in the correction steps; see, for example, [64,65] for details. We compute the solutions by our new parallel algorithm on uniform meshes of sizes $H = 1/32$, $h = 1/64$ and $H = 1/64$, $h = 1/128$, respectively, and compare our computed results with those obtained by the classical finite element method and with those of Ghia et al. [61], whose computations were based on the vorticity-stream function formulation of the Navier-Stokes equations and used the coupled strongly implicit multigrid method. The nonlinear problems on both the coarse and fine grids are solved by the Picard iterative method combined with the defect-correction method, where the stability factor is chosen as $\alpha = 0.05h$ and three correction steps (within the defect-correction method) are used. The corresponding stopping criterion for the nonlinear iterations is that the relative $L^2$-error of two successive velocity iterates is within a fixed tolerance of $10^{-6}$. We compute an approximate solution at $Re = 100$, 1000, 5000, 7500 and 10 000 for the lid-driven cavity flow with 2 × 2 and 4 × 4 sub-domains, respectively, where the overlapping sub-domains are constructed by extending each disjoint sub-domain outward by an extra layer of size $h$. Figs. 4-8 plot the computed $u_1$ component of velocity along the vertical centerline and the $u_2$ component of velocity along the horizontal centerline, compared with those of Ghia et al. [61], where much finer 129 × 129 (for $Re = 100$, 1000) and 257 × 257 (for $Re = 5000$, 7500, 10 000) grids were used, and with those obtained by the classical finite element method on a uniform mesh of size $h = 1/128$. It is worth mentioning that at $Re = 7500$ and 10 000, the classical finite element method is not able to yield an approximate solution, since the iterations for the nonlinear system do not converge. From Figs. 4-8 we can see that the accuracy of the computed solutions is comparable to that of Ghia et al. [61] and of the classical finite element solutions. As expected, the results computed on grids of sizes $H = 1/64$, $h = 1/128$ are better than those for $H = 1/32$, $h = 1/64$. Figs. 9-12 depict the numerical streamlines and isobars computed by our new algorithm with $H = 1/64$, $h = 1/128$ and $\alpha = 0.08h$.

4.3. Backward-facing step flow

In this example, we consider the 2D backward-facing step flow, which is a significant test problem for validating the robustness of a Navier-Stokes solver. The literature offers many numerical and experimental studies on 2D steady incompressible flows over a backward-facing step. Flow features are known to depend on the Reynolds number, the boundary conditions and geometrical parameters such as the step height and the channel height.

Fig. 17. Computed isobars for backward-facing step flow at various Reynolds numbers (panels: $Re = 100$, 500, 800, 1000).

Fig. 18. Normalized length ($L_m$) of the main recirculation region downstream of the step (left) and the normalized length ($L_s$) of the second recirculation region on the upper wall (right) with respect to the Reynolds number for the backward-facing step flow. Legend: Present; Erturk.


The problem we consider here is defined on a long channel $[0,30] \times [-0.5,0.5]$, with no-slip conditions imposed on the top and bottom walls, as well as on the lower half of the left boundary. At the inlet boundary, a fully developed parabolic velocity profile $u_1 = 24y(0.5 - y)$ for $0 \le y \le 0.5$ is specified, which leads to a maximum inflow velocity of $u_{\max} = 1.5$ and an average inflow velocity of $u_{\mathrm{ave}} = 1.0$. The outlet boundary condition is set as $-p + \nu\,\partial u_1/\partial x = 0$. See Fig. 13 for the detailed geometry and boundary conditions. The Reynolds number for this problem is defined as $Re = U_{\mathrm{ave}} L/\nu$, where $U_{\mathrm{ave}} = 1$ is the average velocity at the inlet boundary and $L = 1$ is the channel height. An interesting feature of this problem is that the length of the recirculation zone downstream of the step is approximately proportional to the Reynolds number. We decompose the flow domain into 5 × 1 disjoint sub-domains of equal size, and then extend each sub-domain outward by an extra layer of size $h$. The quasi-uniform mesh sizes are set as $H = 1/32$, $h = 1/64$. First, we compute the approximate solution at $Re = 800$ by our new parallel algorithm. In Fig. 14, the computed velocity and pressure across the channel at $x = 7$ and $x = 15$ are compared with those of Gartling [66]. From Fig. 14 we can see that for the horizontal velocity and the pressure, our numerical results agree well with those of Gartling [66], while for the vertical velocity there is a very small difference at $x = 7$. Note that, because a different approach is used to determine the approximate pressure uniquely, our computed pressure is not the same as Gartling's [66]; there is a constant difference between them. For the sake of comparison, our pressure data presented in Fig. 14 were adjusted by making the computed pressure equal to that of Gartling [66] at the lower channel wall point $(x,y) = (7.0, -0.5)$. Fig. 15 shows the computed pressure and shear stress along the upper and lower channel walls, which are also in perfect agreement with those of Gartling [66]. In Table 5, we compare the normalized (by the step height) length ($L_m$) of the main recirculation region downstream of the step, the separation location ($X_s$), the reattachment location ($X_r$) and the length $L_s = X_r - X_s$ of the second recirculation region on the upper wall obtained by our new algorithm with those in the literature [66-70]. The good agreement indicates the accuracy of our new algorithm. Figs. 16 and 17 depict the computed streamlines and isobars at different Reynolds numbers, respectively, where the vertical $y$-scale is expanded so that the details are visible. Fig. 16 clearly shows that the length of the main recirculation region downstream of the step increases as the Reynolds number grows. At $Re = 500$, a second recirculation eddy forms on the upper wall, which becomes
Table 6
Comparison of the separation angle $\theta$ and wake length ($L_w$) for the flow past a circular cylinder at $Re = 10$, 20, 40.

Re   Reference                θ      Lw
10   Dennis and Chang [71]    29.6   0.265
     Ding et al. [72]         30.0   0.252
     Kim et al. [73]          29.5   0.281
     Present                  29.8   0.257
20   Dennis and Chang [71]    43.7   0.94
     Fornberg [74]            -      0.91
     Ding et al. [72]         44.1   0.93
     Kim et al. [73]          43.7   0.91
     Present                  43.7   0.937
40   Dennis and Chang [71]    53.8   2.345
     Fornberg [74]            -      2.24
     Ding et al. [72]         53.5   2.20
     Kim et al. [73]          55.1   2.187
     Present                  53.4   2.258

Fig. 19. Schematic diagram of the flow past a circular cylinder.

Fig. 20. Nonoverlapping (left) and overlapping (right) domain decomposition for the flow past a circular cylinder.

Fig. 21. The coarse grid for the flow past a circular cylinder: full (left) and zoom-in (right) view.


longer as the Reynolds number increases further. Fig. 18 depicts the normalized length ($L_m$) of the main recirculation region downstream of the step and the normalized length ($L_s$) of the second recirculation region on the upper wall as functions of the Reynolds number, compared with those of Erturk [67]. Considering the different grids, outflow locations and boundary conditions, the results are in good agreement.
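The inlet data quoted for this test case can be verified directly: for $u_1 = 24y(0.5 - y)$ on $0 \le y \le 0.5$, the maximum is 1.5 and the mean is 1.0, and $Re = U_{\mathrm{ave}} L/\nu$ then fixes the viscosity. The short sketch below checks these numbers with exact rational arithmetic. The function names are illustrative assumptions, and Simpson's rule is used only because it integrates a quadratic exactly.

```python
from fractions import Fraction


def inlet_u1(y):
    """Parabolic inlet profile u1 = 24 y (0.5 - y) on 0 <= y <= 0.5."""
    return 24 * y * (Fraction(1, 2) - y)


def mean_by_simpson(f, a, b):
    """Mean value of f on [a, b] by Simpson's rule (exact up to cubics)."""
    mid = (a + b) / 2
    return (f(a) + 4 * f(mid) + f(b)) / 6


def kinematic_viscosity(reynolds, u_ave=1, length=1):
    """Viscosity implied by Re = U_ave * L / nu."""
    return Fraction(u_ave * length, 1) / reynolds


y_low, y_high = Fraction(0), Fraction(1, 2)
u_max = inlet_u1((y_low + y_high) / 2)   # vertex of the parabola: 3/2
u_ave = mean_by_simpson(inlet_u1, y_low, y_high)   # mean inflow velocity: 1
nu_800 = kinematic_viscosity(800, u_ave)   # viscosity at Re = 800: 1/800
```

The values reproduce $u_{\max} = 1.5$ and $u_{\mathrm{ave}} = 1.0$ from the problem description, and give $\nu = 1/800$ for the $Re = 800$ computation.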

4.4. Flow past a circular cylinder

A circular cylinder of radius $r = 0.5$ resides in a rectangular domain $[-5,20] \times [-10,10]$, with the center of the cylinder located at the origin. A uniform flow with free-stream velocity $U_\infty$ coming from the left far field passes around the circular cylinder; see Fig. 19. A no-slip boundary condition is specified on the surface of the cylinder,
Fig. 22. Computed streamlines (left) and isobars (right) for the flow past a circular cylinder at $Re = 5$, 10, 20, 40 (from top to bottom).


while on the inflow boundary, on the outflow boundary and on the upper and lower wall boundaries, a potential flow velocity

$$u = (u_1, u_2) = U_\infty\left(1 - \frac{r^2}{x^2 + y^2} + \frac{2r^2y^2}{(x^2 + y^2)^2},\; -\frac{2r^2xy}{(x^2 + y^2)^2}\right)$$

is prescribed. The Reynolds number based on the free-stream velocity $U$ (here $U = 1$) and the cylinder diameter $D$ (here $D = 1$) is defined as $Re = UD/\nu$. It is well known that the stationary and symmetric flow past a circular cylinder becomes unstable for Reynolds numbers greater than 40, in which case the flow becomes periodic and unsymmetric. We decompose the domain into six disjoint sub-domains, and then enlarge each sub-domain by extending it outward by an extra layer of size 0.5; see Fig. 20. The mesh sizes are $H = 1/2$, $h = 1/4$, with local refinement around the cylinder; see Fig. 21 for the coarse grid, which has 5762 vertices. In Table 6, we tabulate the separation angle $\theta$ and the length of the wake behind the cylinder obtained by our new algorithm together with those in the literature [71-74]; good agreement is observed. The computed streamlines and isobars around the cylinder are plotted in Fig. 22.
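The far-field boundary data in this test case is the classical potential flow around a cylinder of radius $r$. A minimal sketch of how that velocity could be evaluated at boundary nodes is given below; the function name and default arguments are illustrative assumptions (with $U_\infty = 1$ and $r = 0.5$ taken from the problem description), and the code is not taken from the authors' software.

```python
import math


def potential_flow_velocity(x, y, u_inf=1.0, r=0.5):
    """Potential-flow velocity (u1, u2) around a cylinder centered at the origin.

    u1 = U (1 - r^2/(x^2+y^2) + 2 r^2 y^2/(x^2+y^2)^2)
    u2 = -2 U r^2 x y / (x^2+y^2)^2
    """
    rho2 = x * x + y * y
    if rho2 == 0.0:
        raise ValueError("velocity is undefined at the cylinder center")
    u1 = u_inf * (1.0 - r * r / rho2 + 2.0 * r * r * y * y / (rho2 * rho2))
    u2 = -u_inf * 2.0 * r * r * x * y / (rho2 * rho2)
    return u1, u2


def normal_component_on_cylinder(angle, r=0.5):
    """u . n at the surface point with polar angle `angle` (zero for potential flow)."""
    x, y = r * math.cos(angle), r * math.sin(angle)
    u1, u2 = potential_flow_velocity(x, y, r=r)
    return u1 * math.cos(angle) + u2 * math.sin(angle)


# Example evaluations: a corner of the domain [-5, 20] x [-10, 10]
# and the top of the cylinder.
u_corner = potential_flow_velocity(-5.0, 10.0)
u_top = potential_flow_velocity(0.0, 0.5)
```

At the domain corner the velocity is close to the free stream $(1, 0)$, and at the top of the cylinder the formula gives the familiar speed $2U_\infty$. The potential flow satisfies only the no-penetration condition on the cylinder, which is why it is prescribed on the outer boundaries while the no-slip condition is imposed on the cylinder surface itself.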

5. Conclusions

In this work we have proposed a new parallel finite element algorithm for the stationary Navier-Stokes equations. It is based on a coarse grid nonlinear problem and local fine grid nonlinear correction problems defined on overlapping sub-domains, and hence allows existing sequential PDE codes to run in a parallel environment without extensive recoding. Numerical simulations of the lid-driven cavity flow, the backward-facing step flow and the flow past a circular cylinder demonstrated the efficiency of the proposed algorithm.

Acknowledgments

The authors thank the editor and reviewers for their valuable comments and suggestions, which led to a large improvement of the paper. This work was supported by the National Research Foundation (NRF) Grant funded by the Korean Government (MEST) (No. 20100017532), the Natural Science Foundation of China (No. 11001061, 10971166), the National High Technology Research and Development Program of China (863 Program: 2009AA01A135) and the Ph.D. Research-Starting Foundation of Guizhou Normal University, China ([2010] Parallel Algorithms for Computational Fluid Dynamics Problems).

References
[1] A. Toselli, O. Widlund, Domain Decomposition Methods: Algorithms and Theory, Springer, Berlin, 2005.
[2] H. Elman, V.E. Howle, J. Shadid, et al., A taxonomy and comparison of parallel block multi-level preconditioners for the incompressible Navier-Stokes equations, J. Comput. Phys. 227 (2008) 1790-1808.
[3] S. Behara, S. Mittal, Parallel finite element computation of incompressible flows, Parallel Comput. 35 (2009) 195-212.
[4] C.A. Rivera, M. Heniche, R. Glowinski, P.A. Tanguy, Parallel finite element simulations of incompressible viscous fluid flow by domain decomposition with Lagrange multipliers, J. Comput. Phys. 229 (2010) 5123-5143.
[5] Y.Q. Shang, Y.N. He, Parallel finite element algorithms based on full domain partition for stationary Stokes equations, Appl. Math. Mech.-Engl. Ed. 31 (5) (2010) 643-650.
[6] Y.Q. Shang, Y.N. He, Parallel iterative finite element algorithms based on full domain partition for the stationary Navier-Stokes equations, Appl. Numer. Math. 60 (7) (2010) 719-737.
[7] Y.Q. Shang, A parallel two-level linearization method for incompressible flow problems, Appl. Math. Lett. 24 (2011) 364-369.
[8] Y.N. He, L.Q. Mei, Y.Q. Shang, J. Cui, Newton iterative parallel finite element algorithm for the steady Navier-Stokes equations, J. Sci. Comput. 44 (1) (2010) 92-106.
[9] Y.N. He, J.C. Xu, A.H. Zhou, Local and parallel finite element algorithms for the Navier-Stokes problem, J. Comput. Math. 24 (3) (2006) 227-238.
[10] F.Y. Ma, Y.C. Ma, W.F. Wo, Local and parallel finite element algorithms based on two-grid discretization for steady Navier-Stokes equations, Appl. Math. Mech.-Engl. Ed. 28 (1) (2007) 27-35.
[11] J.C. Xu, A.H. Zhou, Local and parallel finite element algorithms based on two-grid discretizations, Math. Comput. 69 (2000) 881-909.
[12] J.C. Xu, A.H. Zhou, Local and parallel finite element algorithms based on two-grid discretizations for nonlinear problems, Adv. Comput. Math. 14 (2001) 293-327.
[13] Y.Q. Shang, Y.N. He, Z.D. Luo, A comparison of three kinds of local and parallel finite element algorithms based on two-grid discretizations for the stationary Navier-Stokes equations, Comput. Fluids 40 (2011) 249-257.
[14] W. Layton, A two level discretization method for the Navier-Stokes equations, Comput. Math. Appl. 5 (26) (1993) 33-38.
[15] W. Layton, H.W.J. Lenferink, A multilevel mesh independence principle for the Navier-Stokes equations, SIAM J. Numer. Anal. 33 (1) (1996) 17-30.
[16] W. Layton, H.K. Lee, J. Peterson, Numerical solution of the stationary Navier-Stokes equations using a multilevel finite element method, SIAM J. Sci. Comput. 20 (1) (1998) 1-12.
[17] X.X. Dai, X.L. Cheng, A two-grid method based on Newton iteration for the Navier-Stokes equations, J. Comput. Appl. Math. 220 (2008) 566-573.
[18] Y.N. He, A.W. Wang, A simplified two-level method for the steady Navier-Stokes equations, Comput. Meth. Appl. Mech. Engrg. 197 (2008) 1568-1576.
[19] H. Abboud, V. Girault, T. Sayah, A second order accuracy for a full discretized time-dependent Navier-Stokes equations by a two-grid scheme, Numer. Math. 114 (2009) 189-231.
[20] J.C. Xu, Two-grid discretization techniques for linear and nonlinear PDEs, SIAM J. Numer. Anal. 33 (5) (1996) 1759-1777.
[21] M. Marion, R. Temam, Nonlinear Galerkin methods, SIAM J. Numer. Anal. 26 (5) (1989) 1139-1157.
[22] M. Marion, R. Temam, Nonlinear Galerkin methods: the finite element case, Numer. Math. 57 (1990) 1-22.
[23] A.A.O. Amni, M. Marion, Nonlinear Galerkin methods and mixed finite element: two-grid algorithms for the Navier-Stokes equations, Numer. Math. 68 (1994) 189-213.
[24] Z.D. Luo, J. Zhu, A nonlinear Galerkin mixed element method and a posteriori error estimator for the stationary Navier-Stokes equations, Appl. Math. Mech.-Engl. Ed. 23 (10) (2002) 1194-1206.
[25] B.F. Smith, P.E. Bjørstad, W. Gropp, Domain Decomposition: Parallel Multilevel Methods for Elliptic Partial Differential Equations, Cambridge University Press, Cambridge, 1996.
[26] A. Quarteroni, A. Valli, Domain Decomposition Methods for Partial Differential Equations, Oxford Science Publications, London, 1999.
[27] F.N. Hwang, X.C. Cai, A parallel nonlinear additive Schwarz preconditioned inexact Newton algorithm for incompressible Navier-Stokes equations, J. Comput. Phys. 204 (2005) 666-691.
[28] F.N. Hwang, X.C. Cai, A class of parallel two-level nonlinear Schwarz preconditioned inexact Newton algorithms, Comput. Meth. Appl. Mech. Engrg. 196 (2007) 1603-1611.
[29] X.C. Cai, D.E. Keyes, L. Marcinkowski, Nonlinear additive Schwarz preconditioners and applications in computational fluid dynamics, Int. J. Numer. Meth. Fluids 40 (2002) 1463-1470.
[30] R. Adams, Sobolev Spaces, Academic Press Inc, New York, 1975.
[31] P.G. Ciarlet, The Finite Element Method for Elliptic Problems, North-Holland, Amsterdam, 1978.
[32] J.G. Heywood, R. Rannacher, Finite element approximation of the nonstationary Navier-Stokes problem I: regularity of solutions and second-order error estimates for spatial discretization, SIAM J. Numer. Anal. 19 (2) (1982) 275-311.
[33] R.B. Kellogg, J.E. Osborn, A regularity result for the Stokes problem in a convex polygon, J. Funct. Anal. 21 (1976) 397-431.
[34] R. Temam, Navier-Stokes Equations: Theory and Numerical Analysis, North-Holland, Amsterdam, 1984.
[35] V. Girault, P.A. Raviart, Finite Element Methods for Navier-Stokes Equations: Theory and Algorithms, Springer-Verlag, Berlin Heidelberg, 1986.
[36] Y.N. He, J.C. Xu, A.H. Zhou, J. Li, Local and parallel finite element algorithms for the Stokes problem, Numer. Math. 109 (3) (2008) 415-434.
[37] Y.Q. Shang, Z.D. Luo, A parallel two-level finite element method for the Navier-Stokes equations, Appl. Math. Mech.-Engl. Ed. 31 (11) (2010) 1429-1438.
[38] Y.Q. Shang, K. Wang, Local and parallel finite element algorithms based on two-grid discretizations for the transient Stokes equations, Numer. Algor. 54 (2) (2010) 195-218.
[39] D.N. Arnold, X. Liu, Local error estimates for finite element discretizations of the Stokes equations, RAIRO M2AN 29 (1995) 367-389.
[40] D.N. Arnold, F. Brezzi, M. Fortin, A stable finite element for the Stokes equations, Calcolo 21 (1984) 337-344.
[41] M. Fortin, Calcul numérique des écoulements fluides de Bingham et des fluides Newtoniens incompressible par des méthodes d'éléments finis, Doctoral Thesis, Université de Paris VI, 1972.
[42] P. Hood, C. Taylor, A numerical solution of the Navier-Stokes equations using the finite element technique, Comput. Fluids 1 (1973) 73-100.
[43] M. Crouzeix, P.-A. Raviart, Conforming and nonconforming finite element methods for solving the stationary Stokes equations, RAIRO Anal. Numer. 7 (R-3) (1973) 33-76.
[44] L. Mansfield, Finite element subspaces with optimal rates of convergence for stationary Stokes problem, RAIRO Anal. Numer. 16 (1982) 49-66.
[45] H.C. Elman, D.J. Silvester, A.J. Wathen, Finite Elements and Fast Iterative Solvers: With Applications in Incompressible Fluid Dynamics, Oxford University Press, Oxford, 2005.
[46] Y.N. He, J. Li, Convergence of three iterative methods based on finite element discretization for the stationary Navier-Stokes equations, Comput. Meth. Appl. Mech. Engrg. 198 (2009) 1351-1359.
[47] A. Klawonn, L.F. Pavarino, Overlapping Schwarz methods for mixed linear elasticity and Stokes problems, Comput. Meth. Appl. Mech. Engrg. 165 (1998) 233-245.
[48] L.F. Pavarino, Indefinite overlapping Schwarz methods for time-dependent Stokes problems, Comput. Meth. Appl. Mech. Engrg. 187 (2000) 35-51.
[49] F.N. Hwang, Some parallel linear and nonlinear Schwarz methods with applications in computational fluid dynamics, Ph.D. Dissertation, University of Colorado, 2004.
[50] M. Ainsworth, J.T. Oden, A posteriori error estimation in finite element analysis, Comput. Meth. Appl. Mech. Engrg. 142 (1997) 1-88.
[51] M. Ainsworth, J.T. Oden, A Posteriori Error Estimation in Finite Element Analysis, John Wiley & Sons, 2000.
[52] T.J. Barth, H. Deconinck, Error Estimation and Adaptive Discretization Methods in Computational Fluid Dynamics, Lecture Notes in Computer Science and Engineering, vol. 25, Springer, 2003.
[53] H. Jin, S. Prudhomme, A posteriori error estimation of steady-state finite element solutions of the Navier-Stokes equations by a subdomain residual method, Comput. Meth. Appl. Mech. Engrg. 159 (1998) 19-48.
[54] L. Machiels, J. Peraire, A.T. Patera, A posteriori finite element output bounds for the incompressible Navier-Stokes equations: application to a natural convection problem, J. Comput. Phys. 172 (2001) 401-425.
[55] M. Farhloul, S. Nicaise, L. Paquet, A priori and a posteriori error estimations for the dual mixed finite element method of the Navier-Stokes problem, Numer. Meth. Part. Diff. Eq. 25 (4) (2009) 843-869.
[56] S. Prudhomme, J.T. Oden, A posteriori error estimation and error control for finite element approximations of the time-dependent Navier-Stokes equations, Finite Elem. Anal. Des. 33 (1999) 247-262.
[57] J. Cao, Application of a posteriori error estimation to finite element simulation of incompressible Navier-Stokes flow, Comput. Fluids 34 (2005) 972-990.
[58] J. Hoffman, C. Johnson, A new approach to computational turbulence modelling, Comput. Meth. Appl. Mech. Engrg. 195 (2006) 2865-2880.
[59] S. Berrone, M. Marro, Space-time adaptive simulations for unsteady Navier-Stokes problems, Comput. Fluids 38 (2009) 1132-1144.
[60] T.A. Davis, Available at: http://www.cise.ufl.edu/research/sparse/umfpack.
[61] U. Ghia, K. Ghia, C. Shin, High-Re solutions for incompressible flow using the Navier-Stokes equations and a multigrid method, J. Comput. Phys. 48 (1982) 387-411.
[62] E. Erturk, T. Corke, C. Gokcol, Numerical solutions of 2-D steady incompressible driven cavity flow at high Reynolds numbers, Int. J. Numer. Meth. Fluids 48 (2005) 747-774.
[63] E. Erturk, Discussions on driven cavity flow, Int. J. Numer. Meth. Fluids 60 (2009) 275-294.
[64] W. Layton, H. Lee, J. Peterson, A defect-correction method for the incompressible Navier-Stokes equations, Appl. Math. Comput. 129 (2002) 1-19.
[65] K. Wang, A new defect correction method for the Navier-Stokes equations at high Reynolds numbers, Appl. Math. Comput. 11 (216) (2010) 3252-3264.
[66] D.K. Gartling, A test problem for outflow boundary conditions-flow over a backward-facing step, Int. J. Numer. Meth. Fluids 11 (1990) 953-967.
[67] E. Erturk, Numerical solution of 2D steady incompressible flow over a backward-facing step, part I: high Reynolds number solutions, Comput. Fluids 37 (2008) 633-655.
[68] I.E. Barton, The entrance effect of laminar flow over a backward-facing step geometry, Int. J. Numer. Meth. Fluids 25 (1997) 633-644.
[69] J. Keskar, D.A. Lyn, Computations of a laminar backward-facing step flow at Re = 800 with a spectral domain decomposition method, Int. J. Numer. Meth. Fluids 29 (1999) 411-427.
[70] M.M. Grigoriev, G.F. Dargush, A poly-region boundary element method for incompressible viscous fluid flows, Int. J. Numer. Meth. Eng. 46 (1999) 1127-1158.
[71] S.C.R. Dennis, G.Z. Chang, Numerical solutions for steady flow past a circular cylinder at Reynolds numbers up to 100, J. Fluid Mech. 42 (1970) 471-489.
[72] H. Ding, C. Shu, K.S. Yeo, D. Xu, Simulation of incompressible viscous flows past a circular cylinder by hybrid FD scheme and meshless least square-based finite difference method, Comput. Meth. Appl. Mech. Engrg. 193 (2004) 727-744.
[73] Y. Kim, D.W. Kim, S. Jun, J.H. Lee, Meshfree point collocation method for the stream-vorticity formulation of 2D incompressible Navier-Stokes equations, Comput. Meth. Appl. Mech. Engrg. 196 (2007) 3095-3109.
[74] B. Fornberg, A numerical study of steady viscous flow past a circular cylinder, J. Fluid Mech. 98 (1980) 819-855.
