Numerical Solution of Partial Differential Equations Using Finite Element Methods

Aurelien Larcher

Contents

1 Weak formulation of elliptic PDEs
  1.1 Historical perspective
  1.2 Weak solution to the Dirichlet problem
    1.2.1 Formal passage from classical solution to weak solution
    1.2.2 Formal passage from weak solution to classical solution
    1.2.3 About the boundary conditions
  1.3 Weak and variational formulations
    1.3.1 Functional setting
    1.3.2 Determination of the spaces
  1.4 Abstract problem
  1.5 Well-posedness
  1.6 Exercises
2 Ritz and Galerkin methods for elliptic problems
  2.1 Approximate problem
  2.2 Ritz method for symmetric bilinear forms
    2.2.1 Variational formulation and minimization problem
    2.2.2 Well-posedness
    2.2.3 Convergence
    2.2.4 Method
  2.3 Galerkin method
    2.3.1 Formulation
    2.3.2 Convergence
    2.3.3 Method
  2.4 Boundary conditions
  2.5 Exercises
3 Finite Element spaces
  3.1 A preliminary example in one dimension of space
    3.1.1 Weak formulation
    3.1.2 Galerkin method
    3.1.3 Construction of the discrete space
    3.1.4 Transport of Finite Element contributions
    3.1.5 Generalization of the methodology
  3.2 Admissible mesh
  3.3 Definition of a Finite Element
  3.4 Transport of the Finite Element
  3.5 Method
  3.6 Exercises
4 Simplicial Lagrange Finite Elements
  4.1 Definitions
  4.2 Polynomial interpolation in one dimension
  4.3 Construction of the Finite Element space
    4.3.1 A nodal element
    4.3.2 Reference Finite Element
    4.3.3 Lagrange $P_k$ elements
  4.4 Extension to multiple dimensions
    4.4.1 Barycentric coordinates
    4.4.2 Affine transformation
  4.5 Local equation for Lagrange $P_1$ in one dimension
  4.6 Exercises
5 Error analysis
  5.1 Preliminary discussion on the Poisson problem
  5.2 Stability of the Lagrange interpolation operator
  5.3 A priori error estimate with Lagrange $P_1$
  5.4 Superconvergence
  5.5 Exercises
6 Time-dependent problems
  6.1 Time marching schemes
  6.2 A priori stability estimate
    6.2.1 Heat equation
7 Adaptive error control
  7.1 A posteriori estimates
  7.2 Residual-based error estimator for Poisson
  7.3 Dual weighted residual estimate
    7.3.1 Adjoint operator
    7.3.2 Duality-based a posteriori error estimate
  7.4 Method
  7.5 Exercises
8 Stabilized methods for advection dominated problems
  8.1 An advection-diffusion problem in one dimension
  8.2 Coercivity loss
  8.3 Stabilization of the Galerkin method
  8.4 Exercises
9 Iterative solvers and Multigrid
  9.1 Iterative methods
  9.2 Relaxation methods
    9.2.1 Jacobi, method of simultaneous displacements
    9.2.2 Gauss-Seidel, method of successive displacements
    9.2.3 Relaxation of Jacobi and Gauss-Seidel
    9.2.4 Parallelization of Gauss-Seidel
  9.3 Krylov-subspace methods
    9.3.1 Principle of descent methods: Steepest Gradient
    9.3.2 Conjugate Gradient
    9.3.3 Preconditioners
  9.4 Power method
  9.5 Multigrid methods
10 Mixed problems
  10.1 The Stokes equations
    10.1.1 Position of the problem
    10.1.2 Abstract weak formulation
    10.1.3 Well-posedness in the continuous setting
  10.2 The discrete Inf-Sup condition
    10.2.1 Results
    10.2.2 Commonly used pairs of approximation spaces
  10.3 Exercises
A Definitions
  A.1 Mapping
  A.2 Spaces
B Duality
  B.1 In finite dimension
C Function spaces
  C.1 Banach and Hilbert spaces
  C.2 Spaces of continuous functions
  C.3 Lebesgue spaces
  C.4 Hilbert-Sobolev spaces
  C.5 Sobolev spaces
D Inequalities
  D.1 Useful inequalities in normed vector spaces
E Tensor formulae
  E.1 Operators
    E.1.1 Tensor product
    E.1.2 Dot product (simple contraction)
    E.1.3 Double-dot product (double contraction)
    E.1.4 Gradient
    E.1.5 Divergence
    E.1.6 Curl (Rotational)
  E.2 Identities
    E.2.1 First order tensors
    E.2.2 Second order tensors
Bibliography

Introduction

This document is a collection of short lecture notes written for the course "The Finite Element Method" (SF2561) at KTH, Royal Institute of Technology, during Fall 2013, then updated for the course "Numerical Solution of Partial Differential Equations Using Finite Element Methods" (TMA4220) at NTNU during Fall 2018. It is not intended as a comprehensive and rigorous introduction to Finite Element Methods, but rather as an attempt to provide a self-consistent overview for students in Engineering without any prior knowledge of Numerical Analysis.

Content

The course goes through the basic theory of the Finite Element Method during the first six lectures, while the last three lectures are devoted to some applications.

1. Introduction to PDEs, weak solutions, variational formulations.
2. Ritz method for the approximation of solutions to elliptic PDEs.
3. Galerkin method and well-posedness.
4. Construction of Finite Element approximation spaces.
5. Polynomial approximation and error analysis.
6. Time-dependent problems.
7. Mesh generation and adaptive control.
8. Stabilized finite element methods.
9. Mixed problems.

The intent is to introduce the practical aspects of the methods without hiding the mathematical issues, but without necessarily exposing the details of the proofs. There are indeed two sides to the Finite Element Method: the Engineering approach and the Mathematical theory.
Although any reasonable implementation of a Finite Element Method is likely to compute an approximate solution, the real challenge is usually to understand the properties of the obtained solution, which can be summarized in four main questions:

1. Well-posedness: Is the solution to the approximate problem unique?
2. Consistency: Is the solution to the approximate problem close to the continuous solution (or at least "sufficiently close" in a sense to be determined)?
3. Stability: Is the solution to the approximate problem stable with respect to the data and "well-behaved"?
4. Maximum principle, physical properties: Does the discrete solution reproduce features of the physical solution, like satisfying physical bounds or energy/entropy inequalities?

Ultimately the goal of designing numerical schemes is to combine these properties so as to ensure the convergence of the method to the unique solution of the continuous problem (provided, hopefully, that it exists) defined by the mathematical model. In a way, the main message of the course is that studying the mathematical properties of the continuous problem is a guide towards deriving discrete counterparts (usually in terms of inequalities) and ensuring that numerical algorithms possess good properties.

Answering these questions requires some knowledge of the numerical analysis of PDEs, which will be introduced throughout the document in a didactic manner. Nonetheless, some technical details are left to the more serious and comprehensive works referenced in the bibliography.

Literature

At KTH the historical textbook, used mainly for the exercises, is Computational Differential Equations [6], which covers many examples from Engineering but is mainly limited to the Galerkin method and in particular to continuous Lagrange elements. The two essential books in the list are Theory and Practice of Finite Elements [4] and The Mathematical Theory of Finite Element Methods [2].
The first work provides an extensive coverage of Finite Elements from a theoretical standpoint (including non-conforming Galerkin, Petrov-Galerkin, and Discontinuous Galerkin methods): it makes explicit the theoretical foundations and abstract framework in the first Part, then studies applications in the second Part, and finally addresses more concrete questions about the implementation of the methods in a third Part. The Appendices are also quite valuable, as they provide a toolset of results to be used for the numerical analysis of PDEs. The second work is written in a more theoretical fashion, providing an introduction to the Finite Element method in the first six Chapters, and is suitable for a student with a good background in Mathematics. Section 2 about Ritz's method is based on the lecture notes [5], and Section 10.1 on the description of the Stokes problem in [7].

Two books listed in the bibliography are concerned not with Numerical Analysis but with the continuous setting. On the one hand, Functional Analysis, Sobolev Spaces and Partial Differential Equations [3] is an excellent introduction to Functional Analysis, but has a steep learning curve for readers without a solid background in Analysis. On the other hand, Mathematical Tools for the Study of the Incompressible Navier-Stokes Equations and Related Models [1], while retaining all the difficulties of the analysis, offers a truly didactic approach to PDEs for fluid problems in a clear and rigorous manner.

Chapter 1

Weak formulation of elliptic Partial Differential Equations

1.1 Historical perspective

The physics of phenomena encountered in engineering applications is often modelled under the form of boundary and initial value problems. They consist of relations describing the evolution of physical quantities involving partial derivatives of these quantities with respect to space and time; such relations are called Partial Differential Equations (PDEs).
Problems involving only variations in space, on a domain $\Omega \subset \mathbb{R}^d$, are called Boundary Value Problems (1.1), as they involve the description of the considered physical quantities at the frontier $\partial\Omega$ of the physical domain.

Find $u : \mathbb{R}^d \to \mathbb{R}^n$ such that:
$$A u(x) = f(x), \quad \forall x \in \Omega, \qquad (1.1)$$
supplemented with boundary conditions on $\partial\Omega$, with $A$ a differential operator, i.e. involving partial derivatives of $u$, like the first derivatives with respect to each axis,
$$\frac{\partial}{\partial x_i}, \qquad i = 1, \dots, d,$$
or more generally
$$\frac{\partial^{|\alpha|}}{\partial^{\alpha_1} x_1 \cdots \partial^{\alpha_i} x_i \cdots \partial^{\alpha_d} x_d}$$
given the multi-index $\alpha = (\alpha_1, \dots, \alpha_i, \dots, \alpha_d)$, where the module $|\alpha| = \alpha_1 + \cdots + \alpha_d$ of the multi-index is the order of the derivative.

Equations describing the evolution in time of physical quantities require the definition of an initial condition in time, and are therefore called Initial Value Problems. They consist of the coupling of an Ordinary Differential Equation (ODE) in time, a Cauchy Problem (1.2), with a boundary value problem in space.

Find $y : \mathbb{R} \to \mathbb{R}^n$ such that:
$$y'(t) = F(t, y(t)), \quad \forall t \in [0, T), \qquad (1.2)$$
with the initial condition at $t = 0$: $y(0) = y_0$, and $F : \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}^n$.

The term Finite Element Method denotes a family of approaches developed to compute an approximate solution to boundary and initial value problems.

Example 1.1.1 (Partial Differential Equations). A few usual mathematical models are listed below.

- Transfer of heat/mass by conduction/diffusion, "Fourier's Law":
$$-\kappa \Delta T = f$$
with $\kappa$ a constant conductivity/diffusivity, and for example $T$ a temperature.

- Unsteady heat equation:
$$\frac{\partial T}{\partial t} - \kappa \Delta T = f$$
with $\kappa$ a thermal diffusivity, and $T$ a temperature.

- Transport of a passive scalar field:
$$\frac{\partial c}{\partial t} + \beta \cdot \nabla c = f$$
with $\beta$ an advective vector field, and for example $c$ a concentration.

- Burgers equation in one dimension (Forsyth, 1906, and Burgers, 1948):
$$\frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} = \nu \frac{\partial^2 u}{\partial x^2}$$
with $\nu$ a viscosity, possibly zero in the inviscid case.

- Wave equation in one dimension (D'Alembert, 1747):
$$\frac{\partial^2 u}{\partial t^2} = c^2 \frac{\partial^2 u}{\partial x^2}$$
with $c$ a celerity.
- Euler equations, inviscid and incompressible case (Euler, 1757):
$$\frac{\partial u}{\partial t} + (u \cdot \nabla)u + \nabla p = f, \qquad \nabla \cdot u = 0.$$

- Navier-Stokes equations, homogeneous, incompressible and isothermal case (Navier, 1822, then the 19th century):
$$\frac{\partial u}{\partial t} + (u \cdot \nabla)u - \nabla \cdot \tau(u, p) = f, \qquad \nabla \cdot u = 0,$$
with $\tau(u, p) = \nu \nabla u - p\,\mathbb{I}$ for a Newtonian fluid.

The study of equations involving derivatives of the unknown has led to rethinking the concept of derivation: from the idea of variation, through the study of the Cauchy problem, to the generalization of the notion of derivative with the Theory of Distributions. The main motivation is the existence of discontinuous solutions, produced by certain classes of PDEs or by irregular data, for which the classical definition of the derivative is not suitable. The concept of weak derivative introduced by L. Schwartz in his Theory of Distributions was used to seek solutions to Partial Differential Equations like general elliptic and parabolic problems (J.-L. Lions) or Navier-Stokes (J. Leray). The idea is to replace the pointwise approach of the classical (strong) derivative,
$$\lim_{h \to 0} \frac{f(x+h) - f(x)}{h},$$
which requires regularity (differentiability, continuity) at any point in space and time, by derivation in the distributional sense, i.e. by considering the action of linear forms on smooth functions, such as
$$T : \varphi \mapsto \int f \varphi$$
with $\varphi$ a sufficiently regular function such that the integral is defined.

1.2 Weak solution to the Dirichlet problem

Let us consider the Poisson problem posed in a domain $\Omega$, an open bounded subset of $\mathbb{R}^d$, $d \geq 1$, supplemented with homogeneous Dirichlet boundary conditions:
$$-\Delta u(x) = f(x), \quad \forall x \in \Omega, \qquad (1.3a)$$
$$u(x) = 0, \quad \forall x \in \partial\Omega, \qquad (1.3b)$$
with $f \in C^0(\Omega)$ and the Laplace operator
$$\Delta = \sum_{i=1}^{d} \frac{\partial^2}{\partial x_i^2},$$
thus involving second order partial derivatives of the unknown $u$ with respect to the space coordinates.

Definition 1.2.1 (Classical solution).
A classical solution (or strong solution) of Problem (1.3) is a function $u \in C^2(\Omega)$ satisfying relations (1.3a) and (1.3b).

Problem (1.3) can be reformulated so as to look for a solution in the distributional sense by testing the equation against smooth functions. Reformulating the problem amounts to relaxing the pointwise regularity (i.e. continuity) required to ensure the existence of the classical derivative to the (weaker) existence of the distributional derivative, whose regularity is to be interpreted in terms of Lebesgue spaces: the obtained problem is a weak formulation, and a solution to this problem (i.e. in the distributional sense) is called a weak solution.

Three properties of the weak formulation should be studied: firstly, that a classical solution is a weak solution; secondly, that such a weak solution is indeed a classical solution provided that it is regular enough; and thirdly, that the well-posedness of this reformulated problem, i.e. existence and uniqueness of the solution, is ensured.

1.2.1 Formal passage from classical solution to weak solution

Let $u \in C^2(\Omega)$ be a classical solution to (1.3) and let us test Equation (1.3a) against any smooth function $\varphi \in C_c^\infty(\Omega)$:
$$-\int_\Omega \Delta u(x)\,\varphi(x)\,dx = \int_\Omega f(x)\,\varphi(x)\,dx.$$
Since $u \in C^2(\Omega)$, $\Delta u$ is well defined. Integrating by parts, the left-hand side reads:
$$-\int_\Omega \Delta u(x)\,\varphi(x)\,dx = -\int_{\partial\Omega} \nabla u(x) \cdot n\,\varphi(x)\,ds + \int_\Omega \nabla u(x) \cdot \nabla\varphi(x)\,dx.$$
For simplicity, we recall the one-dimensional case:
$$-\int_0^1 \frac{d^2 u}{dx^2}(x)\,\varphi(x)\,dx = -\left[\frac{du}{dx}(x)\,\varphi(x)\right]_0^1 + \int_0^1 \frac{du}{dx}(x)\,\frac{d\varphi}{dx}(x)\,dx.$$
Since $\varphi$ has compact support in $\Omega$, it vanishes on the boundary $\partial\Omega$; consequently the boundary integral is zero, and the distributional formulation reads
$$\int_\Omega \nabla u(x) \cdot \nabla\varphi(x)\,dx = \int_\Omega f(x)\,\varphi(x)\,dx, \quad \forall \varphi \in C_c^\infty(\Omega),$$
and we are led to look for a solution $u$ belonging to a function space such that the previous relation makes sense.
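As a quick sanity check, this integration by parts can be verified numerically for a manufactured solution. The following sketch (our own construction, not part of the notes) uses $u(x) = \sin(\pi x)$ on $(0,1)$, so that $f = -u'' = \pi^2 \sin(\pi x)$, together with a polynomial test function vanishing at the boundary; the quadrature helper is ours.

```python
import numpy as np

# Manufactured solution u(x) = sin(pi x) on (0, 1), so f = -u'' = pi^2 sin(pi x).
# For any test function phi with phi(0) = phi(1) = 0, integration by parts
# predicts:  int_0^1 u' phi' dx = int_0^1 f phi dx.
x = np.linspace(0.0, 1.0, 100001)

def trapezoid(y):
    """Composite trapezoidal rule on the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

u_prime = np.pi * np.cos(np.pi * x)       # u'(x)
f = np.pi**2 * np.sin(np.pi * x)          # f = -u''

phi = x * (1.0 - x)                       # smooth test function, phi(0) = phi(1) = 0
phi_prime = 1.0 - 2.0 * x

lhs = trapezoid(u_prime * phi_prime)      # int u' phi' dx
rhs = trapezoid(f * phi)                  # int f phi dx
print(abs(lhs - rhs))                     # both integrals equal 4/pi, so this is tiny
```

Both sides evaluate to $4/\pi$, up to quadrature error, illustrating that the boundary term really drops out for test functions of compact support.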
A weak formulation of Problem (1.3) consists in solving:

Find $u \in H$, given $f \in V'$, such that:
$$\int_\Omega \nabla u \cdot \nabla v\,dx = \int_\Omega f v\,dx, \quad \forall v \in V, \qquad (1.5)$$
in which $H$ and $V$ are function spaces yet to be defined, both satisfying regularity constraints, and $H$ in addition boundary condition constraints. The choice of the solution space $H$ and the test space $V$ is described in Section 1.3.

1.2.2 Formal passage from weak solution to classical solution

Provided that the weak solution to Problem (1.5) belongs to $C^2(\Omega)$, the second derivatives exist in the classical sense. Consequently the integration by parts can be performed the other way around, and the weak solution is indeed a classical solution.

1.2.3 About the boundary conditions

Boundary condition   Expression on $\partial\Omega$    Property
Dirichlet            $u = 0$                           "essential" boundary condition
Neumann              $\nabla u \cdot n = 0$            "natural" boundary condition

Essential boundary conditions are embedded in the function space, while natural boundary conditions appear in the weak formulation as linear forms.

1.3 Weak and variational formulations

1.3.1 Functional setting

Hilbert-Sobolev spaces $H^s$ (Section C.4) are a natural choice to "measure" functions involved in the weak formulations of PDEs, as the existence of the integrals relies on the fact that integrals of powers $|\cdot|^p$ of $u$ and of its weak derivatives $D^\alpha u$ exist for some $1 \leq p < +\infty$:
$$H^s(\Omega) = \{ u \in L^2(\Omega) : D^\alpha u \in L^2(\Omega), \ 1 \leq |\alpha| \leq s \}$$
with the Lebesgue space of square integrable functions on $\Omega$,
$$L^2(\Omega) = \Big\{ u : \int_\Omega |u(x)|^2\,dx < +\infty \Big\},$$
endowed with its natural scalar product
$$(u, v)_{L^2(\Omega)} = \int_\Omega u v\,dx.$$
Since Problem (1.5) involves first order derivatives through the relation
$$\int_\Omega \nabla u \cdot \nabla v\,dx = \int_\Omega f v\,dx,$$
we should consider a solution in $H^1(\Omega)$:
$$H^1(\Omega) = \{ u \in L^2(\Omega) : Du \in L^2(\Omega) \}$$
with $Du$ the weak derivative, i.e. a function of $L^2(\Omega)$ which identifies with the classical derivative (if it exists) "almost everywhere", endowed with the norm
$$\| \cdot \|_{H^1(\Omega)} = (\,\cdot\,,\,\cdot\,)_{H^1(\Omega)}^{1/2}$$
defined from the scalar product
$$(u, v)_{H^1(\Omega)} = \int_\Omega u v\,dx + \int_\Omega \nabla u \cdot \nabla v\,dx.$$
Moreover, the solution should satisfy the boundary condition of the strong form of the PDE problem. According to Section 1.2.3, the homogeneous Dirichlet condition is embedded in the function space of the solution: $u$ vanishing on the boundary $\partial\Omega$ means that we should seek $u$ in $H_0^1(\Omega)$.

1.3.2 Determination of the spaces

We will now establish that any weak solution "lives" in $H_0^1(\Omega)$ and that the natural space for test functions is the same space.

Choice of test space. In order to give sense to the solution in a Hilbert-Sobolev space, we need to choose the test function $\varphi$ itself in the same kind of space. Indeed, $C_c^\infty(\Omega)$ is not equipped with a topology which allows us to work properly. If we choose $\varphi \in H_0^1(\Omega)$ then, by definition, we can construct a sequence $(\varphi_n)_{n \in \mathbb{N}}$ of functions of $C_c^\infty(\Omega)$ converging to $\varphi$ in $H_0^1(\Omega)$:
$$\|\varphi_n - \varphi\|_{H^1(\Omega)} \to 0, \quad \text{as } n \to +\infty.$$
For the sake of completeness, we show that we can pass to the limit in the formulation, term by term for any partial derivative:
$$\int_\Omega \partial_i u\, \partial_i \varphi_n \to \int_\Omega \partial_i u\, \partial_i \varphi$$
since $\partial_i \varphi_n \to \partial_i \varphi$ in $L^2(\Omega)$. Consequently the weak formulation is satisfied for any $\varphi \in H_0^1(\Omega)$.

Choice of solution space. The determination of the function space is guided,
- firstly, by the regularity of the solution: if $u$ is a classical solution then it belongs to $C^2(\Omega)$, which implies that $u \in L^2(\Omega)$ and $\partial_i u \in L^2(\Omega)$, thus $u \in H^1(\Omega)$;
- secondly, by the boundary conditions: the space should satisfy the Dirichlet boundary condition on $\partial\Omega$.

This constraint is satisfied thanks to the following trace theorem for the solution to the Dirichlet problem: since $\operatorname{Ker}(\gamma) = H_0^1(\Omega)$, we conclude $u \in H_0^1(\Omega)$.

Lemma 1.3.1 (Trace Theorem). Let $\Omega$ be a bounded open subset of $\mathbb{R}^d$ with piecewise $C^1$ boundary; then there exists a linear application $\gamma : H^1(\Omega) \to L^2(\partial\Omega)$, continuous on $H^1(\Omega)$, such that $\gamma(u) = 0 \iff u \in \operatorname{Ker}(\gamma)$.
The regularity of the solution itself depends on the nature of the differential operators involved in the problem (e.g. up to which order should derivatives be controlled?), but also on the data of the problem: regularity of the domain and of the right-hand side.

The weak formulation of Problem (1.3) then reads:

Find $u \in H_0^1(\Omega)$ such that:
$$\int_\Omega \nabla u \cdot \nabla v\,dx = \int_\Omega f v\,dx, \quad \forall v \in H_0^1(\Omega). \qquad (1.7)$$

1.4 Abstract problem

The study of mathematical properties of PDE problems is usually performed on a general formulation called the abstract problem, which reads in our case:

Find $u \in V$ such that:
$$a(u, v) = L(v), \quad \forall v \in V, \qquad (1.8)$$
with $a(\cdot, \cdot)$ a continuous bilinear form on $V \times V$ and $L(\cdot)$ a continuous linear form on $V$.

Proposition 1.4.1 (Continuity). A bilinear form $a(\cdot, \cdot)$ is continuous on $V \times W$ if there exists a positive real constant $M$ such that
$$a(v, w) \leq M \|v\|_V \|w\|_W, \quad \forall (v, w) \in V \times W.$$

For example, for Problem (1.7) of the previous section, the bilinear form reads
$$a : V \times V \to \mathbb{R}, \qquad (u, v) \mapsto \int_\Omega \nabla u \cdot \nabla v\,dx,$$
and the linear form,
$$L : V \to \mathbb{R}, \qquad v \mapsto \int_\Omega f v\,dx.$$
The continuity of these two forms comes directly from the fact that they are respectively the inner product in $H_0^1(\Omega)$ and the $L^2$ inner product with $f \in L^2(\Omega)$: the Cauchy-Schwarz inequality directly gives a control of the image of the forms by the norms of their arguments.

Definition 1.4.2 (Topological dual space). The topological dual space $V'$ of a normed vector space $V$ is the vector space of continuous linear forms on $V$, equipped with the norm
$$\|f\|_{V'} = \sup_{x \in V,\, x \neq 0} \frac{|f(x)|}{\|x\|_V}.$$

In the following chapters, we consider the case of elliptic PDEs, like the Poisson problem, for which the bilinear form $a(\cdot, \cdot)$ is coercive.

Proposition 1.4.3 (Coercivity). A bilinear form $a(\cdot,\cdot)$ is said to be coercive on $V$ if there exists a positive real constant $\alpha$ such that for any $v \in V$,
$$a(v, v) \geq \alpha \|v\|_V^2.$$
This property is also known as $V$-ellipticity.
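For the Poisson bilinear form above, coercivity on $H_0^1((0,1))$ follows from the Poincare inequality $\|v\|_{L^2} \leq \tfrac{1}{\pi}\|v'\|_{L^2}$, which gives $a(v,v) \geq \alpha \|v\|_{H^1}^2$ with $\alpha = 1/(1 + 1/\pi^2)$. The following sketch (our own illustration, not from the notes; the sample functions $\sin(k\pi x)$ and the quadrature helper are ours) checks this inequality numerically.

```python
import numpy as np

# Coercivity check on Omega = (0, 1): a(v, v) = int (v')^2 dx should dominate
# alpha * ||v||_{H^1}^2 with alpha = 1 / (1 + 1/pi^2), by the Poincare
# inequality ||v||_{L^2} <= (1/pi) ||v'||_{L^2} for v in H^1_0.
x = np.linspace(0.0, 1.0, 10001)

def trapezoid(y):
    """Composite trapezoidal rule on the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

alpha = 1.0 / (1.0 + 1.0 / np.pi**2)
ok = True
for k in (1, 2, 5):                        # sample functions sin(k pi x) in H^1_0
    v = np.sin(k * np.pi * x)
    dv = k * np.pi * np.cos(k * np.pi * x)
    a_vv = trapezoid(dv**2)                # a(v, v)
    h1_sq = trapezoid(v**2) + trapezoid(dv**2)   # ||v||_{H^1}^2
    ok = ok and (a_vv + 1e-6 >= alpha * h1_sq)   # small slack for quadrature error
print(ok)
```

Note that $k = 1$ attains equality, which shows the constant $\alpha$ cannot be improved on this domain.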
1.5 Well-posedness

In the usual sense, a problem is well-posed if it admits a unique solution which is bounded in the $V$-norm by the data (forcing term, boundary conditions), which are independent of the solution and appear on the right-hand side of the equation. In the particular case of the Poisson problem, the bilinear form $a(\cdot, \cdot)$ is the natural scalar product in $H_0^1(\Omega)$, thus it defines a norm in $H_0^1(\Omega)$ (but only a seminorm in $H^1(\Omega)$, due to the lack of definiteness, not a norm!).

Theorem 1.5.1 (Riesz-Frechet). Let $H$ be a Hilbert space and $H'$ its topological dual. For every $\Phi \in H'$ there exists a unique representative $u \in H$ such that for any $v \in H$,
$$\Phi(v) = (u, v)_H,$$
and furthermore $\|u\|_H = \|\Phi\|_{H'}$.

This result directly ensures the existence and uniqueness of a weak solution as soon as $a(\cdot, \cdot)$ is a scalar product and $L$ is continuous for $\|\cdot\|_a$. If the bilinear form $a(\cdot, \cdot)$ is not symmetric, then Theorem 1.5.1 (Riesz-Frechet) does not apply.

Theorem 1.5.2 (Lax-Milgram). Let $H$ be a Hilbert space. Provided that $a(\cdot, \cdot)$ is a coercive continuous bilinear form on $H \times H$ and $L(\cdot)$ is a continuous linear form on $H$, Problem (1.8) admits a unique solution $u \in H$.

Now that we have derived a variational problem which admits a unique solution, with $V$ infinite dimensional (the unknown being a function defined at every point $x \in \Omega$), we need to construct an approximate problem which is also well-posed.

1.6 Exercises

Exercises for this section cover some preliminary notions introduced for the weak formulation of PDEs.

Exercise 1.6.1 (Based on Exercise 4 from [8]). Answer the following questions.

(a) Discuss whether the set $S = \{ v \in C_c^\infty((0,1)) : v(\tfrac12) = 1 \}$ is a vector space.

(b) For $V = H_0^1((0,1))$, show that $L(v) = \int_0^1 x v\,dx$ defines a linear functional. Recall the definition of the topological dual $V'$ and show that $L$ is continuous, i.e. $L \in V'$.

(c) For $V = \mathbb{R}$, discuss whether $(u, v)_V = |u||v|$ is an inner product on $V$.
(d) Does $|u|_{H^1(\Omega)} = \|\nabla u\|_{L^2(\Omega)}$, $u \in H^1(\Omega)$, define a norm in $H^1(\Omega)$? Explain why.

(e) Assess whether the function $f(x) = x^{3/4}$ is an element of the following spaces: $L^2((0,1))$, $H^1((0,1))$, $H^2((0,1))$.

(f) For $v = e^{-10x}$ and $\Omega = (0,1)$, is the relation $|v|_{H^1(\Omega)} = \|v\|_{H^1(\Omega)}$ satisfied?

Exercise 1.6.2. Let us consider the following problem posed on the domain $\Omega = (0,1)$, with $\kappa$ a real coefficient and $f \in L^2(\Omega)$:

Find $u \in H_0^1(\Omega)$ such that:
$$\int_\Omega \kappa \frac{du}{dx}\, v\,dx + \int_\Omega \frac{du}{dx}\frac{dv}{dx}\,dx + \int_\Omega u v\,dx = \int_\Omega f v\,dx, \quad \forall v \in H_0^1(\Omega). \qquad (1.9)$$

(a) Formulate the strong problem corresponding to weak formulation (1.9).

(b) Discuss the existence and uniqueness of the solution to Problem (1.9).

Exercise 1.6.3. Let us consider the biharmonic equation posed on the domain $\Omega = (0,1)$:
$$\Delta^2 u(x) = f(x), \quad \forall x \in \Omega, \qquad (1.10a)$$
with $f \in L^2(\Omega)$, satisfying the boundary conditions on $\partial\Omega$:
$$u(x) = u'(x) = 0, \quad \forall x \in \partial\Omega. \qquad (1.10b)$$

(a) For $f = 1$ give a solution to Problem (1.10).

(b) Derive a weak formulation (WF) of Problem (1.10).

(c) Specify the solution space and the test space.

(d) Show that there exists a unique solution $u$ to (WF) belonging to the chosen solution space.

Exercise 1.6.4. Let us consider the Helmholtz equation posed on the domain $\Omega = (0,1)$, given $\kappa$ a real coefficient:
$$-u''(x) + \kappa u(x) = f(x), \quad \forall x \in \Omega, \qquad (1.11a)$$
with $f \in L^2(\Omega)$, and
$$u(x) = 0, \quad \forall x \in \partial\Omega. \qquad (1.11b)$$

(a) Derive a weak formulation (WF) of Problem (1.11).

(b) Specify the solution and test spaces.

(c) What is the nature of the bilinear form for $\kappa = 1$?

(d) Prove that the problem is well-posed for $\kappa = 0$ and $\kappa > 0$.

(e) Comment on the difficulty posed by the case $\kappa < 0$.

(f) The boundary condition is now given by:
$$u(0) = u'(0) = 0, \qquad u'(1) = -1. \qquad (1.12)$$
Derive a weak formulation for Problem (1.11a) with boundary conditions (1.12) and show that it admits a unique solution. To prove the coercivity, the relation $u(x) = \int_0^x u'(t)\,dt$ (valid since $u(0) = 0$) can be used.

Chapter 2

Ritz and Galerkin methods for elliptic problems

In Section 1
we have reformulated the Dirichlet problem so as to seek weak solutions, and we showed its well-posedness. The problem being infinite dimensional, it is not computable. Question: can we construct an approximation to Problem (1.3) which is also well-posed?

2.1 Approximate problem

In the previous section we showed how a classical PDE problem such as Problem (1.3) can be reformulated as a weak problem. The abstract problem for this class of PDEs then reads:

Find $u \in V$ such that:
$$a(u, v) = L(v), \quad \forall v \in V, \qquad (2.1)$$
with $a(\cdot, \cdot)$ a coercive continuous bilinear form on $V \times V$ and $L(\cdot)$ a continuous linear form on $V$.

Since in the case of the Poisson problem the bilinear form is continuous, coercive and symmetric, the well-posedness follows directly from the Riesz-Frechet representation Theorem. If the bilinear form is still coercive but not symmetric, then we will see that the well-posedness is proven by the Lax-Milgram Theorem. But for the moment, let us focus on the symmetric case: we now want to construct an approximate solution $u_n$ to Problem (2.1), then prove that the solution to the obtained approximate problem exists and is unique.

2.2 Ritz method for symmetric bilinear forms

2.2.1 Variational formulation and minimization problem

The idea behind the Ritz method is to replace the solution space $V$ (which is infinite dimensional) by a finite dimensional subspace $V_n \subset V$, $\dim(V_n) = n$. Problem (2.2) is the approximate weak problem by the Ritz method:

Find $u_n \in V_n$, $V_n \subset V$, such that:
$$a(u_n, v_n) = L(v_n), \quad \forall v_n \in V_n, \qquad (2.2)$$
with $a(\cdot, \cdot)$ a coercive symmetric continuous bilinear form on $V \times V$ and $L(\cdot)$ a continuous linear form on $V$.

Provided that the bilinear form is symmetric, Problem (2.3) is the equivalent approximate variational problem under minimization form:

Find $u_n \in V_n$, $V_n \subset V$, such that:
$$J(u_n) \leq J(v_n), \quad \forall v_n \in V_n, \qquad (2.3)$$
with
$$J(v_n) = \frac12 a(v_n, v_n) - L(v_n).$$

Proposition 2.2.1 (Equivalence of weak and variational formulations).
Problems (2.2) and (2.3) are equivalent.

Before moving to the well-posedness of the approximate variational problem, some definitions are introduced to characterize the solutions of minimization problems; then the equivalence of formulations for the Poisson problem with homogeneous Dirichlet boundary conditions in one dimension of space is given as an example.

Definition 2.2.2 (Directional derivative). Let $V$ be a Hilbert space. For any $u \in V$ the relation
$$J'(u; w) = \lim_{\varepsilon \to 0} \frac{1}{\varepsilon}\big(J(u + \varepsilon w) - J(u)\big) \qquad (2.4)$$
defines $J'(\cdot\,; \cdot) : V \times V \to \mathbb{R}$, the derivative of the functional $J$ at $u$ in the direction $w$.

Definition 2.2.3 (Frechet derivative). Let $V$ be a Hilbert space. $J$ is Frechet-differentiable at $u$ if
$$J(u + v) = J(u) + L_u(v) + \varepsilon(v)\|v\|_V \qquad (2.5)$$
with $L_u$ a continuous linear form on $V$ and $\varepsilon(v) \to 0$ as $v \to 0$.

Proposition 2.2.4 (Optimality conditions). Let $V$ be a Hilbert space and $J$ a twice Frechet-differentiable functional. Then $u_0 \in V$ is a solution to
$$\inf_{v \in V} J(v) \qquad (2.6)$$
if the following conditions are satisfied:

1. $J'(u_0) = 0$ (Euler condition);

2. $(J''(u_0) w, w) \geq 0$ (Legendre condition).

Both conditions can be interpreted in terms of the simpler case of real functions: the first one requires that the first derivative vanishes, so that $u_0$ is an extremum, while the second condition is a convexity argument. Moreover, a sufficient condition is given by $(J''(u) w, w) > 0$ for any $u$ in a neighbourhood of $u_0$ (strong Legendre condition). The coercivity of the bilinear form $a(\cdot, \cdot)$ is an even stronger condition, equivalent to: there exists $\alpha > 0$ such that $(J''(u) w, w) \geq \alpha (w, w)$.

Example 2.2.5. Equivalence of weak and variational formulations for the Dirichlet problem posed on $\Omega = (0,1)$.

Let us derive the expression of $J'(u; w)$ defined by (2.4), given $\varepsilon > 0$ and $w \in V$. First, let us verify that if $u$ solves the minimization problem then it solves the corresponding weak problem:
$$J(u + \varepsilon w) = \frac12 \int_\Omega \big[(u + \varepsilon w)'\big]^2\,dx - \int_\Omega f (u + \varepsilon w)\,dx$$
$$= \frac12 \int_\Omega \big[(u')^2 + 2\varepsilon u' w' + \varepsilon^2 (w')^2\big]\,dx - \int_\Omega f u\,dx - \varepsilon \int_\Omega f w\,dx$$
$$= J(u) + \varepsilon \Big( \int_\Omega u' w'\,dx - \int_\Omega f w\,dx \Big) + \frac{\varepsilon^2}{2} \int_\Omega (w')^2\,dx.$$
Writing the derivative gives
$$\lim_{\varepsilon \to 0} \frac{1}{\varepsilon}\big(J(u + \varepsilon w) - J(u)\big) = \int_\Omega u' w'\,dx - \int_\Omega f w\,dx + \lim_{\varepsilon \to 0} \frac{\varepsilon}{2} \int_\Omega (w')^2\,dx,$$
so that the Euler condition holds if, for any $w \in V = H_0^1(\Omega)$,
$$J'(u; w) = \int_\Omega u' w'\,dx - \int_\Omega f w\,dx = 0.$$
In this case the functional $J$ is Frechet-differentiable, as $L_u$ is linear. Secondly, the other way around: considering that the weak formulation holds for the test function $\varepsilon w \in V$, then in the relation
$$J(u + \varepsilon w) = J(u) + \varepsilon \Big( \int_\Omega u' w'\,dx - \int_\Omega f w\,dx \Big) + \frac{\varepsilon^2}{2} \int_\Omega (w')^2\,dx$$
the second term of the right-hand side vanishes and the third term is non-negative, hence $J(u + \varepsilon w) \geq J(u)$, so that $u$ is a solution to the minimization problem.

The same result holds for the continuous problem in $V$ and the approximation in $V_n$, since the only requirement is to work in a Hilbert space. Actually, the following result for the Dirichlet problem, due to Stampacchia, characterizes the solution to the weak problem in terms of minimization.

Theorem 2.2.6 (Stampacchia). Let $a(\cdot, \cdot)$ be a bilinear coercive continuous form on a Hilbert space $H$, and let $K$ be a convex closed non-empty subset of $H$. Given $\phi \in H'$, there exists a unique $u \in K$ such that
$$a(u, v - u) \geq (\phi, v - u)_{H', H}, \quad \forall v \in K,$$
and if $a$ is symmetric then
$$u = \operatorname*{argmin}_{v \in K} \Big\{ \frac12 a(v, v) - (\phi, v)_{H', H} \Big\}.$$
The solution can be seen as satisfying a minimization of energy, also called the Dirichlet principle.

2.2.2 Well-posedness

Theorem 2.2.7 (Well-posedness). Let $V$ be a Hilbert space and $V_n$ a finite dimensional subspace of $V$, $\dim(V_n) = n$. Problem (2.2) admits a unique solution $u_n$.

Proof. Given that the weak formulation differs only by introducing finite dimensional subspaces, the proof could conclude directly with the Lax-Milgram Theorem.
Instead we show that there exists a unique solution to the equivalent minimization problem (2.3) by explicitly constructing an approximation $u_n \in V_n$, decomposed uniquely on a basis $(\varphi_1, \ldots, \varphi_n)$ of $V_n$:

$u_n = \sum_{j=1}^{n} u_j \varphi_j$

In practice this basis is not any basis but the one constructed to define the approximation space $V_n$: to one chosen approximation space will correspond one carefully constructed basis. In so doing, the constructive approach paves the way to the Finite Element Method and is thus chosen as a prequel to establishing the Galerkin method. Writing the minimization functional for $u_n$ reads:

$J(u_n) = \frac{1}{2} a(u_n, u_n) - L(u_n) = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} u_i u_j \, a(\varphi_j, \varphi_i) - \sum_{j=1}^{n} u_j \, L(\varphi_j)$

Collecting the entries by index $i$, the functional can be rewritten under algebraic form:

$J(\mathbf{u}) = \frac{1}{2} \mathbf{u}^t A \mathbf{u} - \mathbf{u}^t \mathbf{b}$

where $\mathbf{u}$ is the vector of algebraic unknowns, also called degrees of freedom,

$\mathbf{u}^t = (u_1, \ldots, u_n)$

and $A$, $\mathbf{b}$ are respectively the stiffness matrix and the load vector:

$A_{ij} = a(\varphi_j, \varphi_i) \, , \quad b_i = L(\varphi_i)$

Since $a(\cdot,\cdot)$ is symmetric and coercive, the matrix $A$ is symmetric positive definite, so the quadratic functional $J$ admits a unique minimizer, which solves $A\mathbf{u} = \mathbf{b}$. □

2.2.3 Convergence

Question: does the approximate solution $u_n$ converge to $u$ in $V$ as $n \to \infty$? The ingredients are similar to the Lax principle: stability and consistency imply convergence.

Lemma 2.2.9 (Estimate in the energy norm). Let $V$ be a Hilbert space and $V_n$ a finite dimensional subspace of $V$. We denote by $u \in V$, $u_n \in V_n$ respectively the solution to Problem (2.1) and the solution to approximate Problem (2.2). Let us define the energy norm $\| \cdot \|_a = a(\cdot,\cdot)^{1/2}$; then the following inequality holds:

$\|u - u_n\|_a \leq \|u - v_n\|_a \, , \quad \forall v_n \in V_n$

Proof. Using the coercivity and the continuity of the bilinear form, we have:

$\alpha \|u\|_V^2 \leq \|u\|_a^2 \leq M \|u\|_V^2$

hence $\|\cdot\|_a$ is a norm equivalent to $\|\cdot\|_V$, and $(V, \|\cdot\|_a)$ is a Hilbert space. Consequently,

$a(u - P_{V_n} u, v_n) = 0 \, , \quad \forall v_n \in V_n$

by definition of $P_{V_n}$ as the orthogonal projection of $u$ onto $V_n$ with respect to the scalar product defined by the bilinear form $a$.
$\|u - u_n\|_a^2 = a(u - u_n, u - v_n) + a(u - u_n, v_n - u_n) \, , \quad \forall v_n \in V_n$

Since the second term of the right-hand side vanishes due to the consistency of the approximation, we deduce $u_n = P_{V_n} u$; then $u_n$ minimizes the distance from $u$ to $V_n$:

$\|u - u_n\|_a^2 \leq \|u - v_n\|_a^2 \, , \quad \forall v_n \in V_n$

which means that the error estimate is optimal in the energy norm. □

Lemma 2.2.10 (Céa's Lemma). Let $V$ be a Hilbert space and $V_n$ a finite dimensional subspace of $V$. We denote by $u \in V$, $u_n \in V_n$ respectively the solution to Problem (2.1) and the solution to approximate Problem (2.2); then the following inequality holds:

$\|u - u_n\|_V \leq \sqrt{\frac{M}{\alpha}} \, \|u - v_n\|_V \, , \quad \forall v_n \in V_n$

with $M > 0$ the continuity constant and $\alpha > 0$ the coercivity constant.

Proof. Using the coercivity and continuity of the bilinear form, we bound the left-hand side of the estimate (2.2.9) from below and its right-hand side from above:

$\alpha \|u - u_n\|_V^2 \leq \|u - u_n\|_a^2 \leq \|u - v_n\|_a^2 \leq M \|u - v_n\|_V^2 \, , \quad \forall v_n \in V_n$

Consequently:

$\|u - u_n\|_V \leq \sqrt{\frac{M}{\alpha}} \, \|u - v_n\|_V \, , \quad \forall v_n \in V_n$

□

Lemma (2.2.10) gives a control on the discretization error $e_n = u - u_n$ which is quasi-optimal in the $V$-norm (i.e. the best-approximation error multiplied by a constant).

Lemma 2.2.11 (Stability). Any solution $u_n \in V_n$ to Problem (2.2) satisfies:

$\|u_n\|_V \leq \frac{\|L\|_{V'}}{\alpha}$

Proof. Direct, using the coercivity and the dual norm. First, the inequality

$\alpha \|u_n\|_V^2 \leq a(u_n, u_n) = L(u_n)$

holds for some $\alpha > 0$, and secondly, by definition of the dual norm,

$\|L\|_{V'} = \sup_{v \in V, v \neq 0} \frac{L(v)}{\|v\|_V}$

we have $L(u_n) \leq \|L\|_{V'} \|u_n\|_V$, so that $\alpha \|u_n\|_V \leq \|L\|_{V'}$. □

2.2.4 Method

Algorithm 2.2.12 (Ritz method). The following procedure applies:
1. Choose an approximation space $V_n$
2. Construct a basis $B = (\varphi_1, \ldots, \varphi_n)$
3. Assemble the stiffness matrix $A$ and the load vector $\mathbf{b}$
4. Solve $A\mathbf{u} = \mathbf{b}$ as a minimization problem

2.3 Galerkin method

2.3.1 Formulation

We use a similar approach as for the Ritz method, except that the abstract problem does not require the symmetry of the bilinear form. Therefore we cannot endow $V$ with a norm defined from the scalar product based on $a(\cdot,\cdot)$.
Problem (2.9) is the approximate weak problem by the Galerkin method:

Find $u_n \in V_n$, $V_n \subset V$, such that:
$a(u_n, v_n) = L(v_n) \, , \quad \forall v_n \in V_n$   (2.9)

with $a(\cdot,\cdot)$ a coercive continuous bilinear form on $V \times V$ and $L(\cdot)$ a continuous linear form on $V$.

2.3.2 Convergence

The following property is merely a consequence of the consistency, as the continuous solution $u$ is solution to the discrete problem (i.e. the bilinear form is the "same"), but it is quite useful to derive error estimates. Consequently, whenever needed we will refer to the following proposition:

Proposition 2.3.1 (Galerkin orthogonality). Let $u \in V$, $u_n \in V_n$ be respectively the solution to Problem (2.1) and the solution to approximate Problem (2.9); then:

$a(u - u_n, v_n) = 0 \, , \quad \forall v_n \in V_n$

Proof. Direct consequence of the consistency of the method. The approximate problem

$a(u_n, v_n) = L(v_n) \, , \quad \forall v_n \in V_n$   (2.10)

is obtained by replacing $V$ by a finite dimensional space $V_n$, but the relation

$a(u, v) = L(v) \, , \quad \forall v \in V$   (2.11)

from the weak formulation is otherwise the same: $a(\cdot,\cdot)$ and $L(\cdot)$ are unchanged. Since $V_n \subset V$, it is possible to choose $v \in V_n$ in Equation (2.11); then

$a(u_n, v_n) = a(u, v_n) \, , \quad \forall v_n \in V_n$   (2.12)

which gives directly the orthogonality property. □

Lemma 2.3.2 (Consistency). Let $V$ be a Hilbert space and $V_n$ a finite dimensional subspace of $V$. We denote by $u \in V$, $u_n \in V_n$ respectively the solution to Problem (2.1) and the solution to approximate Problem (2.9); then the following inequality holds:

$\|u - u_n\|_V \leq \frac{M}{\alpha} \|u - v_n\|_V \, , \quad \forall v_n \in V_n$

with $M > 0$ the continuity constant and $\alpha > 0$ the coercivity constant.

Proof. Using the coercivity:

$\alpha \|u - u_n\|_V^2 \leq a(u - u_n, u - u_n) = a(u - u_n, u - v_n) + a(u - u_n, v_n - u_n) = a(u - u_n, u - v_n) \leq M \|u - u_n\|_V \|u - v_n\|_V$

so that

$\|u - u_n\|_V \leq \frac{M}{\alpha} \|u - v_n\|_V$

□

The only difference with the symmetric case is that the constant $M/\alpha$ is the square of the constant $\sqrt{M/\alpha}$ of Céa's Lemma, due to the loss of the symmetry.

2.3.3 Method

Algorithm 2.3.3 (Galerkin method). The following procedure applies: 1.
Choose an approximation space $V_n$; 2. Construct a basis $B = (\varphi_1, \ldots, \varphi_n)$; 3. Assemble the stiffness matrix $A$ and the load vector $\mathbf{b}$; 4. Solve the linear system $A\mathbf{u} = \mathbf{b}$.

2.5 Exercises

[…] $J(u)$ and interpret.

(h) Let $\phi_i = \sin((2i-1)\pi x)$, $i \in \mathbb{N}$. Verify that these functions are infinitely differentiable and satisfy $\phi_i(0) = \phi_i(1) = 0$. Compute the coefficients

$a_{ij} = \int_0^1 \phi_j'(x) \, \phi_i'(x) \, dx \, , \quad b_i = \int_0^1 f(x) \, \phi_i(x) \, dx$

(i) Given a finite dimensional space $V_n = \mathrm{span}\{\phi_i\}_{1 \leq i \leq n}$, […]

Chapter 3 Finite Element spaces

Given a basis $\{\psi_j\}_{1 \leq j \leq N_{V_h}}$ of $V_h$, $N_{V_h} = \dim(V_h)$, the discrete solution is decomposed as

$u_h = \sum_{j=1}^{N_{V_h}} u_j \, \psi_j$

with $\{u_j\}$ a family of $N_{V_h}$ real numbers called global degrees of freedom and $\{\psi_j\}$ a family of $N_{V_h}$ elements of $V_h$ called global shape functions. Previously no assumption was made on the finite dimensional space $V_n$ aside from $V_n \subset V$. The change of notation to $V_h$ reflects that the discrete space $V_h$ will be characterized more carefully as an approximation space, by constructing the shape functions and by defining the degrees of freedom. To construct the Finite Element space $V_h$, three ingredients are introduced:

1. An admissible mesh $\mathcal{T}_h$, generated by a tessellation of the domain $\Omega$.
2. A reference Finite Element $(\hat{K}, \hat{P}, \hat{\Sigma})$, to construct a basis of $V_h$ and define the meaning of the $u_j$.
3. A mapping that generates a Finite Element $(K, P, \Sigma)$ for any cell in the mesh from the reference element $(\hat{K}, \hat{P}, \hat{\Sigma})$.

As a preliminary step, the approximation of the Poisson problem in one dimension by linear Lagrange Finite Elements is described, to give an overview of the methodology without hitting the technical difficulties. Concepts and notations for the discretization of the physical domain are then introduced. Once all the requirements are identified, a framework common to all Finite Element methods is set by stating the definition of a Finite Element. Finally, the generation of the Finite Element space from a reference Finite Element is described. Some examples of Finite Element spaces are listed at the end of the chapter.
3.1 A preliminary example in one dimension of space

For the sake of completeness, the steps performed to derive a Galerkin method for the Poisson Problem 1.3 on $\Omega = (0,1)$ are sketched below to recapitulate the methodology.

3.1.1 Weak formulation

A solution is sought in the distributional sense by testing the equation against smooth functions,

$-\int_\Omega u''(x) \, v(x) \, dx = \int_\Omega f(x) \, v(x) \, dx \, , \quad \forall v \in C_c^\infty(\Omega)$

then reporting derivatives on the test functions using the integration by parts

$-\int_\Omega u''(x) \, v(x) \, dx = -[u'(x) v(x)]_0^1 + \int_\Omega u'(x) \, v'(x) \, dx$

in which the boundary term vanishes. The weak formulation consists of finding $u \in V$ such that

$\int_\Omega u'(x) \, v'(x) \, dx = \int_\Omega f(x) \, v(x) \, dx \, , \quad \forall v \in V$

given $f \in L^2(\Omega)$. The choice of solution space and test space is guided by the equation and the data: in this case $V = H^1_0(\Omega)$, since $v$ and $v'$ should be controlled in $L^2(\Omega)$ and homogeneous Dirichlet boundary conditions are imposed.

3.1.2 Galerkin method

The approximate problem by a Galerkin method consists of seeking a discrete solution $u_h$ in a finite dimensional space $V_h \subset V$, such that

$\int_\Omega u_h'(x) \, v'(x) \, dx = \int_\Omega f(x) \, v(x) \, dx \, , \quad \forall v \in V_h$

and given a basis $\{\psi_j\}$ of $V_h$, any function $w_h \in V_h$ can be written as

$w_h = \sum_{j=1}^{N_{V_h}} w_j \, \psi_j$

with $N_{V_h} = \dim V_h$, and moreover

$w_h' = \sum_{j=1}^{N_{V_h}} w_j \, \psi_j'$

by simple application of the derivation to the linear combination. Bounds of the sums will be omitted to simplify the notation when there is no possible confusion.
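The resulting linear system can already be previewed numerically. The sketch below is a minimal hypothetical implementation, not taken from the notes: it assumes a uniform mesh of $(0,1)$, the usual hat-function basis, and a simple one-point quadrature for the load vector, and solves the Galerkin system for $-u'' = f$ with homogeneous Dirichlet conditions:

```python
import numpy as np

def assemble_p1(n_cells, f):
    """Assemble the stiffness matrix A and load vector b for -u'' = f on (0,1),
    u(0) = u(1) = 0, with P1 hat functions on a uniform mesh of n_cells cells."""
    h = 1.0 / n_cells
    n = n_cells - 1                        # one unknown per interior node
    A = np.zeros((n, n))
    b = np.zeros(n)
    for i in range(n):
        A[i, i] = 2.0 / h                  # int psi_i' psi_i' dx over both cells
        if i + 1 < n:
            A[i, i + 1] = A[i + 1, i] = -1.0 / h
        b[i] = h * f((i + 1) * h)          # one-point quadrature of int f psi_i dx
    return A, b

f = lambda x: np.pi**2 * np.sin(np.pi * x)     # manufactured: u(x) = sin(pi x)
A, b = assemble_p1(64, f)
u = np.linalg.solve(A, b)
nodes = np.linspace(0.0, 1.0, 65)[1:-1]
err = np.max(np.abs(u - np.sin(np.pi * nodes)))
```

With this manufactured right-hand side the nodal error decreases roughly like $O(h^2)$ under mesh refinement, consistent with the error analysis of Chapter 5.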
Inserting the Galerkin decomposition in the weak formulation and using the commutativity of the derivation with linear combinations,

$\int_\Omega \Big( \sum_j u_j \psi_j'(x) \Big) \Big( \sum_i v_i \psi_i'(x) \Big) \, dx = \int_\Omega f(x) \Big( \sum_i v_i \psi_i(x) \Big) \, dx$

and the integration can also commute with the linear combinations,

$\sum_i \sum_j v_i u_j \int_\Omega \psi_j'(x) \psi_i'(x) \, dx = \sum_i v_i \int_\Omega f(x) \psi_i(x) \, dx$

so that, up to some cosmetic reordering, the relation to the linear system of algebraic equations becomes evident,

$\mathbf{v}^t A \mathbf{u} = \mathbf{v}^t \mathbf{b}$

for any $\mathbf{v} = [v_i] \in \mathbb{R}^{N_{V_h}}$, with

$A_{ij} = \int_\Omega \psi_j'(x) \psi_i'(x) \, dx \, , \quad b_i = \int_\Omega f(x) \psi_i(x) \, dx$

and the discrete solution is represented by the solution vector $\mathbf{u} = [u_j] \in \mathbb{R}^{N_{V_h}}$. Computing the contributions $A_{ij}$ and $b_i$ is possible as soon as the basis $\{\psi_j\}$ is constructed; on each cell such a piecewise linear shape function is affine, of the form $b + ax$ for some $a, b \in \mathbb{R}$. The affine mapping to any subinterval is given by the change of coordinates

$T_K : \hat{x} \in \hat{K} \mapsto x = x_i + (x_{i+1} - x_i) \, \hat{x} \in K$

which is consistent with the definition of the global shape functions and reference shape functions, since $T_K(\hat{x}) = b_K + B_K \hat{x}$, with $b_K \in \mathbb{R}^d$ and $B_K \in \mathbb{R}^{d \times d}$. The matrix $B_K$ is the Jacobian of $T_K$, noted $J_{T_K}$. In one dimension of space $J_{T_K}$ contains only one entry, $\partial T_K / \partial \hat{x} = (x_{i+1} - x_i) = |K|$, and its inverse $J_{T_K}^{-1}$ is obviously $(x_{i+1} - x_i)^{-1} = |K|^{-1}$.

3.1.4 Transport of Finite Element contributions

In the case of linear Lagrange elements the transport of Finite Element contributions corresponds to the change of coordinates $T_K$. The change of variable $x = T_K(\hat{x})$ is sketched for the mass matrix and the stiffness matrix, given a cell $K = [x_i, x_{i+1}]$. Firstly, let us consider contributions to the mass matrix $M$,

$M_{ij} = \int_K \psi_j(x) \, \psi_i(x) \, dx$

which can be rewritten, given that $x = T_K(\hat{x})$ and $dx = (x_{i+1} - x_i) \, d\hat{x}$,

$M_{ij} = \int_{\hat{K}} \psi_j(T_K(\hat{x})) \, \psi_i(T_K(\hat{x})) \, (x_{i+1} - x_i) \, d\hat{x}$

Since $\psi_i \circ T_K = \hat{\psi}_i$, the contribution reduces to an integral of reference shape functions scaled by the measure of the cell:

$M_{ij} = |K| \int_{\hat{K}} \hat{\psi}_j(\hat{x}) \, \hat{\psi}_i(\hat{x}) \, d\hat{x}$

Secondly, for the stiffness matrix the chain rule gives $\partial_x \psi_i = \partial_{\hat{x}} \hat{\psi}_i \, (\partial_{\hat{x}} T_K)^{-1}$ with $(\partial_{\hat{x}} T_K)^{-1} = |K|^{-1}$, therefore

$K_{ij} = \int_{\hat{K}} \frac{1}{|K|} \partial_{\hat{x}} \hat{\psi}_j(\hat{x}) \, \frac{1}{|K|} \partial_{\hat{x}} \hat{\psi}_i(\hat{x}) \, |K| \, d\hat{x} = \frac{1}{|K|} \int_{\hat{K}} \hat{\psi}_j'(\hat{x}) \, \hat{\psi}_i'(\hat{x}) \, d\hat{x}$

3.3 Definition of a Finite Element

Now that an approximation space $V_h = \mathrm{span}\{\psi_j\} \subset V$ was introduced, this section describes how to build such a space by constructing an abstraction called a Finite Element.
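The transport rules above, the mass contribution scaling with $|K|$ and the stiffness contribution with $|K|^{-1}$, can be checked directly. A minimal sketch, in which the exact reference integrals for the linear shape functions $\hat{\psi}_0 = 1 - \hat{x}$, $\hat{\psi}_1 = \hat{x}$ are hard-coded:

```python
import numpy as np

# Reference matrices on Khat = [0,1]:
# Mhat_ij = int psihat_j psihat_i dxhat, Khat_ij = int psihat_j' psihat_i' dxhat
M_hat = np.array([[2.0, 1.0], [1.0, 2.0]]) / 6.0
K_hat = np.array([[1.0, -1.0], [-1.0, 1.0]])

def local_matrices(x_i, x_ip1):
    """Local mass/stiffness matrices on K = [x_i, x_{i+1}], obtained through
    the change of variable x = T_K(xhat) = x_i + (x_{i+1} - x_i) xhat."""
    jac = x_ip1 - x_i               # Jacobian of T_K, equal to |K|
    return jac * M_hat, K_hat / jac

M_K, K_K = local_matrices(0.25, 0.75)   # a cell of length |K| = 1/2
```

Two cheap sanity checks follow from the construction: since the shape functions sum to one, the entries of $M_K$ sum to $|K|$, and the rows of $K_K$ sum to zero (a constant function has zero stiffness).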
Definition 3.3.1 (Finite Element, [4] page 19, [2] page 69). A Finite Element consists of a triple $(K, P, \Sigma)$ such that:
• $K$ is a compact, connected subset of $\mathbb{R}^d$ with non-empty interior and regular boundary (typically Lipschitz continuous),
• $P$ is a finite dimensional vector space of functions $p : K \to \mathbb{R}$, the space of shape functions, with $\dim(P) = N_P$,
• $\Sigma$ is a set $\{\sigma_i\}_{1 \leq i \leq N_P}$ of linear forms which is a basis of $\mathcal{L}(P, \mathbb{R})$, the dual of $P$.

Practically, the definition requires first to consider the Finite Element on a cell $K$, which can be an interval ($d = 1$), a polygon ($d = 2$) or a polyhedron ($d = 3$) (example: triangle, quadrangle, tetrahedron, hexahedron); then an approximation space $P$ (example: polynomial space) and the local degrees of freedom $\Sigma$ are chosen (example: values at $N_P$ geometrical nodes, $\sigma_i(\psi_j) = \psi_j(\xi_i)$). The local shape functions $\{\psi_i\}$ are then constructed so as to ensure unisolvence. The dimension of $P$ is usually called the dimension of the Finite Element.

Proposition 3.3.2 (Determination of the local shape functions). Let $\{\sigma_i\}_{1 \leq i \leq N_P}$ be the set of local degrees of freedom; the local shape functions are defined as the basis $\{\psi_i\}_{1 \leq i \leq N_P}$ of $P$ such that

$\sigma_i(\psi_j) = \delta_{ij} \, , \quad \forall i, j \in \{1, \ldots, N_P\}$

Definition 3.3.3 (Unisolvence). A Finite Element is said to be unisolvent if for any vector $(\alpha_1, \ldots, \alpha_{N_P}) \in \mathbb{R}^{N_P}$ there exists a unique representant $p \in P$ such that

$\sigma_i(p) = \alpha_i \, , \quad \forall i \in \{1, \ldots, N_P\}$

The unisolvence property of a Finite Element is equivalent to constructing $\Sigma$ as a dual basis of $P$; thus any function $p \in P$ can be expressed as

$p = \sum_{j=1}^{N_P} \sigma_j(p) \, \psi_j$

the unique decomposition on $\{\psi_j\}$. It can be characterized by two properties, the second one reading

$\forall p \in P, \quad \left( \sigma_i(p) = 0, \ \forall i \in \{1, \ldots, N_P\} \right) \Rightarrow \left( p = 0 \right)$   (3.3b)

in which Property (3.3a) ensures that $\Sigma$ generates $\mathcal{L}(P, \mathbb{R})$ and Property (3.3b) that the $\{\sigma_i\}$ are linearly independent.
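Unisolvence can be tested numerically: picking any convenient basis $\{p_j\}$ of $P$, the element is unisolvent iff the matrix $G_{ij} = \sigma_i(p_j)$ is invertible, and the columns of $G^{-1}$ then give the coefficients of the nodal shape functions. A sketch for a quadratic Lagrange element on $[0,1]$ with point evaluations at $0$, $1/2$, $1$ (an element chosen here purely for illustration):

```python
import numpy as np

nodes = np.array([0.0, 0.5, 1.0])          # degrees of freedom: p -> p(node_i)
monomials = [lambda x: 1.0, lambda x: x, lambda x: x**2]   # a basis of P = P2

# Generalized Vandermonde matrix G[i, j] = sigma_i(p_j) = p_j(node_i)
G = np.array([[p(xi) for p in monomials] for xi in nodes])

# Unisolvence <=> G invertible; the columns of G^{-1} hold the monomial
# coefficients of the nodal shape functions, so that sigma_i(psi_j) = delta_ij
C = np.linalg.inv(G)

def shape(j, x):
    return sum(C[k, j] * monomials[k](x) for k in range(3))

check = np.array([[shape(j, xi) for j in range(3)] for xi in nodes])
```

Evaluating every shape function at every node (`check`) recovers the identity matrix, which is exactly the duality relation $\sigma_i(\psi_j) = \delta_{ij}$ of Proposition 3.3.2.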
Usually the unisolvence is part of the definition of a Finite Element, since the shape functions are chosen such that $\sigma_i(\psi_j) = \delta_{ij}$, which guarantees it by construction. The transport of the element relies on the map $\Phi_K : C^0(K) \to C^0(\hat{K})$ given by $v \mapsto v \circ T_K$. In that case $P = \mathrm{span}\{\hat{\psi}_i \circ T_K^{-1}\}$ and the degrees of freedom are defined by $\Sigma = \{\sigma_i : \sigma_i(v) = \hat{\sigma}_i(\Phi_K(v)) = \Phi_K(v)(\hat{\xi}_i) = v \circ T_K(\hat{\xi}_i)\}$. If $\xi_i = T_K(\hat{\xi}_i)$ then the definition is consistent.

3.5 Method

Algorithm 3.5.1 (Finite Element Method). Solving a problem by a Finite Element Method is defined by the following procedure:
1. Choose a reference Finite Element $(\hat{K}, \hat{P}, \hat{\Sigma})$.
2. Construct an admissible mesh $\mathcal{T}_h$ such that any cell $K \in \mathcal{T}_h$ is in bijection with the reference cell $\hat{K}$.
3. Define a mapping to transport the reference Finite Element defined on $\hat{K}$ onto any $K \in \mathcal{T}_h$, generating $(K, P, \Sigma)$.
4. Construct a basis for $V_h$ by collecting all the shape functions of the Finite Elements $\{(K, P, \Sigma)\}_{K \in \mathcal{T}_h}$ sharing the same degree of freedom.

3.6 Exercises

Exercise 3.6.1. Let us consider the Poisson problem posed on the domain $\Omega = (0,1)$:

$-u''(x) = f(x) \, , \quad \forall x \in \Omega$   (3.4a)

with $f \in L^2(\Omega)$, and satisfying the boundary condition on $\partial\Omega$

$u(x) = 0 \, , \quad \forall x \in \partial\Omega$   (3.4b)

The domain is discretized into a family of subintervals $[x_i, x_{i+1}]$, $i = 0, \ldots, N$, and Problem 3.4 is approximated by a linear Lagrange Finite Element method. The approximation space is the space of continuous piecewise linear functions $V_h = \mathrm{span}\{\varphi_i\}_{0 \leq i \leq N}$ with $\varphi_i|_K \in \mathbb{P}_1$ on each interval $K = [x_i, x_{i+1}]$.

4.4 Extension to multiple dimensions

4.4.1 Barycentric coordinates

Lagrange polynomials (4.2.1) directly give one-dimensional shape functions, while in higher dimensions they can be reformulated in terms of barycentric coordinates. High-order Lagrange bases can also be expressed as polynomials of barycentric coordinates.

Definition 4.4.1 (Barycentric coordinates). Let us consider $K$ a $d$-simplex with vertices $\{v_i\}_{0 \leq i \leq d}$. The barycentric coordinates are defined by

$\lambda_i(x) = 1 - \frac{(x - v_i) \cdot n_i}{(v_f - v_i) \cdot n_i}$

with $n_i$ the unit outward normal to the facet opposite to $v_i$, and $v_f$ a vertex belonging to this facet.
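In practice the barycentric coordinates of a point are computed by solving a small linear system, since $x = v_0 + \sum_{i \geq 1} \lambda_i (v_i - v_0)$ with $\lambda_0 = 1 - \sum_{i \geq 1} \lambda_i$. A sketch (function and variable names are illustrative):

```python
import numpy as np

def barycentric(vertices, x):
    """Barycentric coordinates of x in the d-simplex whose (d+1) vertices are
    given as rows: solve B lam = x - v0, where the columns of B are the edge
    vectors v_i - v0, then recover lam_0 = 1 - sum(lam)."""
    v = np.asarray(vertices, dtype=float)
    B = (v[1:] - v[0]).T
    lam = np.linalg.solve(B, np.asarray(x, dtype=float) - v[0])
    return np.concatenate([[1.0 - lam.sum()], lam])

tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]   # unit triangle
lam = barycentric(tri, (0.25, 0.25))         # -> (0.5, 0.25, 0.25)
inside = bool(np.all(lam >= 0.0))            # point-in-simplex test
```

The isobarycentre returns all coordinates equal to $(d+1)^{-1}$, and a negative $\lambda_i$ flags a point outside the simplex, matching the geometric interpretation in terms of signed sub-simplex measures.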
The geometric interpretation of barycentric coordinates is given by

$\lambda_i(x) = \frac{\mathrm{meas}(K_i(x))}{|K|}$

with $\mathrm{meas}(K_i(x))$ the signed measure of $K_i(x)$, the $d$-simplex constructed with the point $x$ and the facet opposite to $v_i$. In particular, points located within $K$ have non-negative $\lambda_i$, and the point $x_g$ satisfying equal weights $\lambda_i(x_g) = (d+1)^{-1}$ is the isobarycentre. In practice this property can be used to check if a point is inside a simplex: if the signed measure of one $K_i$ is negative, or if $\sum_i |K_i(x)| > |K|$, then the point is outside of the simplex.

Example 4.4.2 (Lagrange $\mathbb{P}_1$ 1D). In one dimension of space, barycentric coordinates on $K = [x_0, x_1]$ are

$\lambda_0(x) = 1 - \frac{x - x_0}{x_1 - x_0} = \frac{x_1 - x}{x_1 - x_0} = \frac{x - x_1}{x_0 - x_1} \, , \quad \lambda_1(x) = 1 - \frac{x_1 - x}{x_1 - x_0} = \frac{x - x_0}{x_1 - x_0}$

which are exactly the same expressions as the linear shape functions $\psi_0$ and $\psi_1$.

Example 4.4.3 (Lagrange $\mathbb{P}_1$ 2D). In two dimensions of space, barycentric coordinates on the unit triangle $\hat{K}$ depicted in Figure 4.1 are

$\lambda_0(\hat{x}, \hat{y}) = 1 - \frac{(\hat{x}, \hat{y}) \cdot (1,1)}{(1,0) \cdot (1,1)} = 1 - \hat{x} - \hat{y} \, , \quad \lambda_1(\hat{x}, \hat{y}) = 1 - \frac{(\hat{x} - 1, \hat{y}) \cdot (-1,0)}{(-1,0) \cdot (-1,0)} = \hat{x} \, , \quad \lambda_2(\hat{x}, \hat{y}) = 1 - \frac{(\hat{x}, \hat{y} - 1) \cdot (0,-1)}{(0,-1) \cdot (0,-1)} = \hat{y}$

which is the linear Lagrange basis on $\hat{K}$ (since the normal appears both in the numerator and the denominator, there was no need to normalize the vector).

Figure 4.1: Unit triangle with outward facet normals, with vertices $v_0 = (0,0)$, $v_1 = (1,0)$, $v_2 = (0,1)$.

We can verify easily that the shape functions sum to one.

4.4.2 Affine transformation

The affine mapping $T_K : \hat{K} \to K$ is described here for a triangle in $\mathbb{R}^2$. The shape of the reference triangle $\hat{K}$ is defined by the vectors $\hat{v}_1 - \hat{v}_0$ and $\hat{v}_2 - \hat{v}_0$, and in the same fashion the shape of any triangle $K$ is defined by the vectors $v_1 - v_0$ and $v_2 - v_0$. The affine mapping is a simple change of coordinates, but the detail is given below for the sake of completeness:

$T_K(\hat{v}_1) = v_0 + (v_1 - v_0) \, , \quad T_K(\hat{v}_2) = v_0 + (v_2 - v_0)$

and any point $x$ of $K$ can be expressed in terms of the relation

$x = v_0 + \lambda (v_1 - v_0) + \mu (v_2 - v_0)$

with given $\lambda$ and $\mu$.
The reference triangle is defined by the canonical basis of $\mathbb{R}^2$, $(\hat{v}_1 - \hat{v}_0, \hat{v}_2 - \hat{v}_0) = (e_x, e_y)$, so that the affine mapping $x = v_0 + B_K \hat{x}$ satisfies $B_K e_x = v_1 - v_0$ and $B_K e_y = v_2 - v_0$. The matrix $B_K$ is then the matrix of the corresponding change of basis, composed of the column vectors $v_i - v_0$, thus

$x = v_0 + \begin{pmatrix} v_{1,x} - v_{0,x} & v_{2,x} - v_{0,x} \\ v_{1,y} - v_{0,y} & v_{2,y} - v_{0,y} \end{pmatrix} \hat{x}$

Definition 4.4.4 (Affine mapping from the reference simplex in $\mathbb{R}^d$). The generalization of the affine mapping in $\mathbb{R}^d$, from the reference simplex $\hat{K}$ with vertices $\{\hat{v}_i\}_{0 \leq i \leq d}$ to $K$ with vertices $\{v_i\}_{0 \leq i \leq d}$, reads $T_K(\hat{x}) = v_0 + B_K \hat{x}$, with $B_K$ the $d \times d$ matrix whose columns are the vectors $v_i - v_0$.

Chapter 5 Error analysis

Two ingredients are available at this point: on the one hand, the consistency inequality (Céa's Lemma)

$\|u - u_h\|_V \leq C \, \|u - v_h\|_V \, , \quad \forall v_h \in V_h$

with $C > 0$ a constant real number; on the other hand, the pointwise interpolation inequality of Theorem (4.2.3) gives a control on the interpolation error $e_\pi = u - \mathcal{I}^k_h u$, i.e. the difference between any function and its interpolate by Lagrange polynomials of order $k$. For consistency with the notation introduced in Chapter 3 for the Lagrange interpolation operator in the Finite Element setting, $\mathcal{I}^k_h$ will stand for the Lagrange interpolation operator in this section.

Question: can we control the approximation error by bounding the right-hand side of the consistency inequality using interpolation properties?

5.1 Preliminary discussion on the Poisson problem

In the previous chapters, the weak problem derived for the Poisson problem with homogeneous Dirichlet boundary conditions and $f \in L^2(\Omega)$,

Find $u \in V = H^1_0(\Omega)$ such that:
$\int_\Omega \nabla u \cdot \nabla v \, dx = \int_\Omega f v \, dx \, , \quad \forall v \in V$

was approximated by a Galerkin problem,

Find $u_h \in V_h \subset H^1_0(\Omega)$ such that:
$\int_\Omega \nabla u_h \cdot \nabla v_h \, dx = \int_\Omega f v_h \, dx \, , \quad \forall v_h \in V_h$

by simply looking for a finite dimensional solution. In a first stage, we verified that the weak problem is well-posed for the chosen solution and test spaces, and that the exact solution $u$ and the discrete solution $u_h$ both satisfy a relation of the type

$\|u\|_{H^1(\Omega)} \leq C \, \|f\|_{L^2(\Omega)}$

with $C > 0$, which we denoted as an a priori estimate.
In a second stage, without making any further assumption on the approximation space $V_h$, a general result was introduced showing that the distance between the exact solution and the approximate solution, $\|u - u_h\|_V$, is bounded by the distance from $u$ to its best approximation in $V_h$; this is Céa's Lemma. For the sake of completeness, the corresponding estimate is derived below in the case of the Poisson problem with homogeneous Dirichlet boundary conditions.

$E = |u - u_h|^2_{H^1(\Omega)} = \int_\Omega |\nabla(u - u_h)|^2 \, dx = \int_\Omega \nabla(u - u_h) \cdot \nabla(u - u_h) \, dx$

For any $v_h \in V_h$,

$E = \int_\Omega \nabla(u - u_h) \cdot \nabla(u - v_h) \, dx + \int_\Omega \nabla(u - u_h) \cdot \nabla(v_h - u_h) \, dx$

and the second term vanishes by virtue of consistency (Galerkin orthogonality), so that Cauchy-Schwarz gives

$E = |u - u_h|^2_{H^1(\Omega)} \leq \left( \int_\Omega |\nabla(u - u_h)|^2 \, dx \right)^{1/2} \left( \int_\Omega |\nabla(u - v_h)|^2 \, dx \right)^{1/2}$

thus

$|u - u_h|_{H^1(\Omega)} \leq |u - v_h|_{H^1(\Omega)}$

As introduced, the relation

$|e_h|_{H^1(\Omega)} \leq |u - v_h|_{H^1(\Omega)}$   (5.1)

means that the approximation error $e_h = u - u_h$ is controlled as soon as we are able to bound the distance between $u$ and its best representant in $V_h$. In a third stage, we introduced Finite Element approximation spaces based on Lagrange polynomials (Definition 4.2.1), for instance the space of continuous piecewise linear functions

$V_h = \{ v \in C^0(\overline{\Omega}) \cap H^1_0(\Omega) : \forall K \in \mathcal{T}_h, \ v|_K \in \mathbb{P}_1(K) \}$   (5.3)

The local Lagrange interpolation operator, given here in one dimension,

$\mathcal{I}^k_K v = \sum_{j=0}^{k} v(x_j) \, \mathcal{L}_j$

with $\{x_j\}$ the Lagrange nodes, satisfies Inequality (4.2.3) controlling the interpolation error $e_\pi = v - \mathcal{I}^k_K v$ pointwise on $K$. Choosing $v_h = \mathcal{I}^k_h u$, Inequality (5.1) can be written

$|e_h|_{H^1(\Omega)} \leq |e_\pi|_{H^1(\Omega)}$   (5.2)

which means that the approximation error is bounded by the interpolation error. From pointwise polynomial interpolation estimates we need to derive inequalities in Sobolev and Lebesgue norms, so that the right-hand side of Equation (5.2) can be bounded by a function $\varepsilon$ of the mesh size $h_\mathcal{T}$, such that $\varepsilon(h_\mathcal{T}) \to 0$ as $h_\mathcal{T} \to 0$; the convergence of the approximation is then ensured. Inequalities of the form

$\|e_h\| \leq C \, \varepsilon(h_\mathcal{T})$

for some constant $C$ depending on the domain and the exact solution, are called a priori error estimates.
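The decay $\varepsilon(h_\mathcal{T}) \to 0$ can be observed before it is proved. The sketch below uses uniform meshes and a crude fixed sampling rule, both choices made here purely for illustration, to measure the $L^2$ interpolation error of the piecewise linear interpolant of a smooth function on successively refined meshes:

```python
import numpy as np

def interp_error_L2(n_cells, v):
    """Approximate ||v - I_h v||_{L2(0,1)} for the piecewise linear
    interpolant of v on a uniform mesh with n_cells cells."""
    x = np.linspace(0.0, 1.0, n_cells + 1)
    s = np.linspace(0.0, 1.0, 21)                 # sample points per cell
    xq = x[:-1, None] + (x[1:] - x[:-1])[:, None] * s[None, :]
    vq = v(xq)                                    # exact values
    lin = np.outer(v(x[:-1]), 1.0 - s) + np.outer(v(x[1:]), s)   # I_h v
    # mean of the squared error over uniformly spread samples ~ the integral
    return np.sqrt(np.mean((vq - lin) ** 2))

v = lambda t: np.sin(np.pi * t)
errs = [interp_error_L2(n, v) for n in (8, 16, 32)]
rates = [np.log2(errs[k] / errs[k + 1]) for k in range(len(errs) - 1)]
```

The measured rate approaches $2$ under refinement, anticipating the $O(h_\mathcal{T}^2)$ interpolation bound established in the following sections.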
Usually $\varepsilon(h_\mathcal{T}) \sim O(h_\mathcal{T}^r)$, with $r > 0$ the convergence rate of the method, which represents the expected accuracy. In this case sequences of approximate solutions $(u_h)_{h_\mathcal{T}}$ converge to the exact solution $u$, as fast as the function $\varepsilon$ allows, as $h_\mathcal{T}$ tends to zero. The goal of this section is to derive such error estimates after proving the required interpolation inequalities. In particular the space of continuous linear Lagrange elements will be considered; more general results will then be stated without proof.

5.2 Stability of the Lagrange interpolation operator

Before moving to interpolation inequalities, the stability of the interpolation operator should be proved: such an estimate shows that the interpolate of any function in $V$ is also in $V$. The exposé is restricted to the one-dimensional case and detailed so that anybody without prior experience in estimates should be able to follow the procedure. Interpolation properties need to be reformulated in terms of estimates in $L^2$ norms: first they are expressed elementwise, then on the entire domain by collecting contributions over the mesh. To give a better idea, let us introduce an ingredient used for subsequent estimates. Consider a function $v \in H^1(I)$ with $I$ an interval, and two points $\xi, x \in I$: writing

$v(x) = v(\xi) + \int_\xi^x v'(s) \, ds$

and bounding $\int_\xi^x |v'(s)| \, ds \leq |x - \xi|^{1/2} |v|_{H^1([\xi,x])}$ by Cauchy-Schwarz, we get

$|v(x)| \leq |v(\xi)| + |x - \xi|^{1/2} \, |v|_{H^1([\xi,x])}$   (5.4)

so that the right-hand side is bounded. Now let us consider that $\xi$ realizes the minimum of $|v|$ on $I$,

$\xi = \mathrm{argmin}_{s \in I} |v(s)|$

Immediately, if $|v(\xi)| = 0$ the estimate gives for any $x \in I$

$|v(x)| \leq |I|^{1/2} \, |v|_{H^1(I)}$   (5.5)

but otherwise we can bound $|v(\xi)|$ by the mean of $|v|$, since $\xi$ is a minimizer:

$|v(\xi)| \leq \frac{1}{|x - \xi|} \int_\xi^x |v(s)| \, ds \leq |x - \xi|^{-1/2} \, \|v\|_{L^2([\xi,x])}$

so that

$|v(x)| \leq |x - \xi|^{1/2} \, |v|_{H^1([\xi,x])} + |x - \xi|^{-1/2} \, \|v\|_{L^2([\xi,x])}$   (5.6)

Applying the same argument with the mean over the whole interval $I$, we obtain for any $x \in I$

$|v(x)| \leq |I|^{1/2} \, |v|_{H^1(I)} + |I|^{-1/2} \, \|v\|_{L^2(I)}$   (5.7)

Proposition 5.2.1 (H¹-stability of the Lagrange $\mathbb{P}_1$ interpolation operator).
There exists a constant $C > 0$ such that

$\|\mathcal{I}^1_h v\|_{H^1(\Omega)} \leq C \, \|v\|_{H^1(\Omega)}$   (5.8)

Proof. Since $\|\cdot\|^2_{H^1} = \|\cdot\|^2_{L^2} + |\cdot|^2_{H^1}$, both the $L^2$ norm of the interpolate and that of its derivative should be controlled. Since controlling the derivative of a function in $L^2$ gives a control on the function in $L^2$ (Poincaré Inequality), it is natural to start looking for an estimate of $|\cdot|_{H^1}$, then move to $\|\cdot\|_{L^2}$.

(i) Estimate in $|\cdot|_{H^1}$: Let us consider the restriction of the interpolation operator to any $K = [x_i, x_{i+1}] \in \mathcal{T}_h$,

$(\mathcal{I}^1_h v)'|_K = h_K^{-1} \left( v(x_{i+1}) - v(x_i) \right)$

which is constant over $K$. The elementwise $H^1$ semi-norm of the interpolate can be bounded using Inequality (5.4):

$|\mathcal{I}^1_h v|_{H^1(K)} = \left( \int_K h_K^{-2} |v(x_{i+1}) - v(x_i)|^2 \, dx \right)^{1/2} = h_K^{-1/2} \, |v(x_{i+1}) - v(x_i)| \leq h_K^{-1/2} \, h_K^{1/2} \, |v|_{H^1(K)} = |v|_{H^1(K)}$

Summing over $K$ and using the definition of $h_\mathcal{T}$, we get $|\mathcal{I}^1_h v|_{H^1(\Omega)} \leq |v|_{H^1(\Omega)}$.

(ii) Estimate in $\|\cdot\|_{L^2}$: In this case we do not need to prove an elementwise estimate first, as we already introduced a control of pointwise values in terms of the $H^1$ semi-norm and the $L^2$ norm: it boils down to estimating the maximum attained by the linear interpolate.

$\|\mathcal{I}^1_h v\|_{L^2(\Omega)} = \left( \int_\Omega |\mathcal{I}^1_h v|^2 \, dx \right)^{1/2} \leq \left( \int_\Omega \|\mathcal{I}^1_h v\|^2_{L^\infty(\Omega)} \, dx \right)^{1/2} \leq |\Omega|^{1/2} \, \|v\|_{L^\infty(\Omega)}$

the last bound being given by the $L^\infty$ stability of the linear interpolation: the function and its linear interpolate coincide at the Lagrange nodes, so that $\|\mathcal{I}^1_h v\|_{L^\infty(\Omega)} \leq \|v\|_{L^\infty(\Omega)}$. Therefore, using Inequality (5.7),

$\|\mathcal{I}^1_h v\|_{L^2(\Omega)} \leq |\Omega|^{1/2} \left( |\Omega|^{1/2} \, |v|_{H^1(\Omega)} + |\Omega|^{-1/2} \, \|v\|_{L^2(\Omega)} \right) = |\Omega| \, |v|_{H^1(\Omega)} + \|v\|_{L^2(\Omega)}$

and we can conclude using the Poincaré Inequality with constant $C^P_\Omega$:

$\|\mathcal{I}^1_h v\|_{L^2(\Omega)} \leq \left( |\Omega| + C^P_\Omega \right) |v|_{H^1(\Omega)}$

□

Inequality (5.8) is also called uniform continuity in zero of the interpolation operator, since $C$ should not depend on $\mathcal{T}_h$, but possibly depends on $\Omega$.

5.3 A priori error estimate with Lagrange $\mathbb{P}_1$

Proposition 5.3.1 (Interpolation inequalities for Lagrange $\mathbb{P}_1$). There exist two positive constants $C_1$ and $C_0$ such that for any $v \in H^2(\Omega)$

$|v - \mathcal{I}^1_h v|_{H^1(\Omega)} \leq C_1 \, h_\mathcal{T} \, |v|_{H^2(\Omega)}$   (5.9)

$\|v - \mathcal{I}^1_h v\|_{L^2(\Omega)} \leq C_0 \, h_\mathcal{T}^2 \, |v|_{H^2(\Omega)}$   (5.10)

with $h_\mathcal{T} = \max_{K \in \mathcal{T}_h}(h_K)$.

Proof.
The proof is sketched in one dimension, based on a decomposition of the error per element and on Rolle's Theorem (a special case of the Mean Value Theorem). The global interpolation error is then recovered by summing over the cells, given that

$\|e_\pi\|^2_{L^2(\Omega)} = \int_\Omega |e_\pi(x)|^2 \, dx = \sum_{K \in \mathcal{T}_h} \|e_\pi\|^2_{L^2(K)}$

This makes sense since the polynomial interpolation estimate is defined pointwise: it is a local property. In the same spirit as the stability of the interpolation operator of Proposition 5.2.1, which we proved first elementwise, the estimate is derived in the $H^1$ semi-norm, then in the $L^2$ norm. The main technical ingredient is given by Inequality (5.4): for a given function $w \in H^1(K)$,

$|w(x) - w(\xi)| \leq |x - \xi|^{1/2} \, |w|_{H^1([\xi,x])}$   (5.11)

with $\xi, x \in K$. If we choose $\xi$ such that $w(\xi)$ vanishes, and if we integrate the square of the expression over any $K \in \mathcal{T}_h$, then we get a control of $w$ in $L^2(K)$:

$\|w\|_{L^2(K)} \leq \left( \int_K |x - \xi| \, |w|^2_{H^1([\xi,x])} \, dx \right)^{1/2} \leq h_K \, |w|_{H^1(K)}$   (5.12)

(i) Estimate in $|\cdot|_{H^1}$: The elementwise estimate on any $K = [x_i, x_{i+1}]$ is obtained by taking $w = (v - \mathcal{I}^1_h v)'$ in Inequality (5.11). Given that $x_i$ and $x_{i+1}$ are Lagrange nodes, $(v - \mathcal{I}^1_h v)(x_i) = 0$ and $(v - \mathcal{I}^1_h v)(x_{i+1}) = 0$; then by virtue of Rolle's Theorem there exists $\xi \in [x_i, x_{i+1}]$ such that the derivative vanishes, $(v - \mathcal{I}^1_h v)'(\xi) = 0$. Moreover $|(v - \mathcal{I}^1_h v)'|_{H^1(K)} = |v'|_{H^1(K)}$, since $(v - \mathcal{I}^1_h v)'' = v'' - (\mathcal{I}^1_h v)''$ by linearity and $(\mathcal{I}^1_h v)''$ is identically zero, so we get directly

$\|(v - \mathcal{I}^1_h v)'\|_{L^2(K)} \leq h_K \, |v'|_{H^1(K)}$

in the same fashion as Inequality (5.12), which can be rewritten under the expected form

$|v - \mathcal{I}^1_h v|_{H^1(K)} \leq h_K \, |v|_{H^2(K)}$   (5.13)

(ii) Estimate in $\|\cdot\|_{L^2}$: The elementwise estimate on any $K = [x_i, x_{i+1}]$ is obtained by taking $w = v - \mathcal{I}^1_h v$ in Inequality (5.11). Given that $(v - \mathcal{I}^1_h v)(x_i) = 0$ and $(v - \mathcal{I}^1_h v)(x_{i+1}) = 0$, a similar argument as for the $H^1$ semi-norm can be used with $\xi = x_i$.
Inequality (5.12) with $w = v - \mathcal{I}^1_h v$ reads

$\|v - \mathcal{I}^1_h v\|_{L^2(K)} \leq h_K \, |v - \mathcal{I}^1_h v|_{H^1(K)}$

so that, using Inequality (5.13),

$\|v - \mathcal{I}^1_h v\|_{L^2(K)} \leq h_K^2 \, |v|_{H^2(K)}$

In both cases global estimates are obtained by summing over $K \in \mathcal{T}_h$ and factoring $h_\mathcal{T} = \max_{K \in \mathcal{T}_h}(h_K)$. □

Remark 5.3.2 (Convergence order in $H^1$ norm). Using the definition of the norm,

$\|v - \mathcal{I}^1_h v\|^2_{H^1(\Omega)} = \|v - \mathcal{I}^1_h v\|^2_{L^2(\Omega)} + |v - \mathcal{I}^1_h v|^2_{H^1(\Omega)}$

we get

$\|v - \mathcal{I}^1_h v\|^2_{H^1(\Omega)} \leq C_0^2 \, h_\mathcal{T}^4 \, |v|^2_{H^2(\Omega)} + C_1^2 \, h_\mathcal{T}^2 \, |v|^2_{H^2(\Omega)}$

$\|v - \mathcal{I}^1_h v\|_{H^1(\Omega)} \leq C \, h_\mathcal{T} \, (1 + h_\mathcal{T}^2)^{1/2} \, |v|_{H^2(\Omega)}$

with $C = \max(C_0, C_1)$. Thus we verify that the approximation is first order in the $H^1$ norm if $|v|_{H^2(\Omega)}$ is bounded.

A more general result, Proposition 5.3.3, can be proved for Lagrange $\mathbb{P}_k$ Finite Elements, which we verify is equivalent to Proposition 5.3.1 for $s = 1$, $k = 1$.

Proposition 5.3.3 (Interpolation inequality for Lagrange $\mathbb{P}_k$, [4]). Given $0 \leq s \leq k$, there exists a positive constant $C$ such that for any $v \in H^{s+1}(\Omega)$

$\|v - \mathcal{I}^k_h v\|_{L^2(\Omega)} + h_\mathcal{T} \, |v - \mathcal{I}^k_h v|_{H^1(\Omega)} \leq C \, h_\mathcal{T}^{s+1} \, |v|_{H^{s+1}(\Omega)}$

One important remark is that the convergence order depends on the regularity of the solution, since the order is $s + 1$ for a solution in $H^{s+1}(\Omega)$. Unfortunately the Lagrange $\mathbb{P}_1$ Finite Element is only $H^1$-conformal, so a priori it cannot represent functions of $H^2$ accurately. Given that the solution space is $H^1$, the interpolation inequality of Proposition 5.3.3 only applies with $s = 0$, so that the convergence order is only one in the $L^2$ norm in the general case, and we have only a weak convergence result in $H^1$. In the next section we show that the convergence rate for the Poisson problem approximated with Lagrange $\mathbb{P}_1$ is actually second order in the $L^2$ norm and first order in the $H^1$ semi-norm: this is one order more than expected from the interpolation inequalities.

5.4 Superconvergence

The following result shows that the convergence properties of the method are not only limited by interpolation inequalities.
Indeed, using a result by Aubin and Nitsche, we show that even if the approximation is not $H^2$-conformal, we can improve the error estimate by one order: the convergence in $L^2$ then becomes second order in $h_\mathcal{T}$. The idea behind this result is that if $u$ is the weak solution to the Poisson equation, then it is not only in $H^1_0(\Omega)$: since the differential operator involves second order derivatives, we say that $u$ is regularized by the ellipticity of the operator.

Theorem 5.4.1 (Superconvergence). Let $\Omega$ be a convex polygonal subset of $\mathbb{R}^d$, $d \geq 1$, $f \in L^2(\Omega)$, $u$ the solution to the Dirichlet Problem (1.3) and $u_h$ the approximate solution, $h_\mathcal{T} = \max_{K \in \mathcal{T}_h}(h_K)$; then

$\|u - u_h\|_{H^1(\Omega)} \leq C_1 \, h_\mathcal{T} \quad \text{and} \quad \|u - u_h\|_{L^2(\Omega)} \leq C_0 \, h_\mathcal{T}^2$

Proof. If $u \in H^1_0(\Omega)$ is the solution to the Poisson problem, then by elliptic regularity (the domain being convex), $u \in H^2(\Omega)$ and $\exists C_u > 0$ such that:

$\|u\|_{H^2(\Omega)} \leq C_u \, \|f\|_{L^2(\Omega)}$

Thus, replacing the $H^2$ semi-norm in the right-hand side of the error estimate, we have

$\|u - u_h\|_{H^1(\Omega)} \leq C_u \, h_\mathcal{T} \, \|f\|_{L^2(\Omega)}$   (5.14)

Let us introduce the following auxiliary problem:

$-\Delta \varphi(x) = e_h(x) \, , \quad x \in \Omega$   (5.15a)
$\varphi(x) = 0 \, , \quad x \in \partial\Omega$   (5.15b)

obtained by a duality argument: formally

$-( \Delta v , w ) = ( \nabla v , \nabla w ) = -( v , \Delta w )$

by integration by parts; since the adjoint operator of the Laplace operator is the Laplace operator itself, it is said to be self-adjoint. The motivation for introducing this equation is to control the approximation error in $L^2$:

$( e_h , e_h ) = ( e_h , -\Delta \varphi ) = ( \nabla e_h , \nabla \varphi )$

for $\varphi \in H^1_0(\Omega)$, called the dual solution, satisfying $-\Delta \varphi = e_h$.
Similarly to the Poisson equation, the weak formulation of Problem 5.15 reads:

Find $\varphi \in H^1_0(\Omega)$, given $e_h \in L^2(\Omega)$, such that:
$\int_\Omega \nabla \varphi \cdot \nabla \phi \, dx = \int_\Omega e_h \, \phi \, dx \, , \quad \forall \phi \in H^1_0(\Omega)$   (5.16)

Since $e_h$ is bounded in $L^2(\Omega)$, the same regularity result holds for the auxiliary Problem (5.15): $\exists C_\varphi > 0$ such that

$\|\varphi\|_{H^2(\Omega)} \leq C_\varphi \, \|e_h\|_{L^2(\Omega)}$

so we have, from the interpolation inequality for $\varphi$,

$\|\varphi - \varphi_h\|_{H^1(\Omega)} \leq C_\varphi \, h_\mathcal{T} \, \|e_h\|_{L^2(\Omega)}$   (5.17)

Let us try to bound the $L^2$ norm of the approximation error by noticing that it amounts to taking $\phi = e_h$ in (5.16):

$\|e_h\|^2_{L^2(\Omega)} = \int_\Omega |e_h|^2 \, dx = \int_\Omega \nabla \varphi \cdot \nabla e_h \, dx$

If we consider the approximation of Problem (5.16) by the Galerkin method, with $\varphi_h \in V_h$ its solution, then the Galerkin orthogonality reads:

$\int_\Omega \nabla \varphi_h \cdot \nabla e_h \, dx = 0$

Thus we can subtract and add this latter term to the previous expression:

$\|e_h\|^2_{L^2(\Omega)} = \int_\Omega \nabla(\varphi - \varphi_h) \cdot \nabla e_h \, dx + \int_\Omega \nabla \varphi_h \cdot \nabla e_h \, dx$

where the second term vanishes. First we use Cauchy-Schwarz and make the $H^1$ norms of the errors appear, since we control them by Equations (5.14) and (5.17):

$\|e_h\|^2_{L^2(\Omega)} \leq \|\varphi - \varphi_h\|_{H^1(\Omega)} \, \|e_h\|_{H^1(\Omega)}$

Replacing by the bounds from the interpolation inequalities we get:

$\|e_h\|^2_{L^2(\Omega)} \leq C_\varphi \, h_\mathcal{T} \, \|e_h\|_{L^2(\Omega)} \cdot C_u \, h_\mathcal{T} \, \|f\|_{L^2(\Omega)}$

so that

$\|e_h\|_{L^2(\Omega)} \leq C_u \, C_\varphi \, h_\mathcal{T}^2 \, \|f\|_{L^2(\Omega)}$

which concludes the proof: we have a second order error estimate in $L^2$. □

The conclusion of this result is that the observed convergence order may be different from the order suggested by the interpolation inequality. As seen in this example it may be improved if the differential operator has a regularizing effect, but it can also be influenced by other factors, like the regularity of the boundary or the discretization of the computational domain.

5.5 Exercises

Chapter 6 Time-dependent problems

The objective of this section is to introduce the a priori stability analysis of time-dependent problems on several examples, to obtain estimates similar to Lemma 2.2.11.

6.1 Time marching schemes

In this section, problems describing the evolution in time of an unknown $u$ will be considered.
Since the unknown depends on the space coordinates and on the time, such an evolution problem will be posed on a domain which is the open cylinder $Q = \Omega \times (0,T)$. The initial and boundary value problem for a partial differential equation will therefore take a form such as

$\partial_t u + \mathcal{A}(u) = f \quad \text{in } Q$

in the case of a Dirichlet problem which is first order in time, with $\mathcal{A}$ a differential operator in space. The differential operator and the right-hand side may depend on $u$, in which case the problem becomes non-linear. The equation can be recast under the form of a Cauchy problem, so that an evolution problem can be seen as the coupling between a partial differential equation in space and an ordinary differential equation in time. This suggests that two discretizations may be considered: a Finite Element discretization in space, which was discussed for elliptic problems in Chapter 2 and Chapter 3, and a time discretization. Similarly to the one-dimensional spatial case, the time discretization consists of solving the problem on a partition of $(0,T)$: let us define a family $\{t_n\}_{0 \leq n \leq N}$ of discrete times with $t_0 = 0$ and $t_N = T$. […]

6.2 A priori stability estimate

6.2.1 Heat equation

[…] For the heat equation, testing the weak formulation with the solution itself yields a constant $C(C_P) > 0$ depending on the Poincaré constant, such that

$\frac{d}{dt} \|u\|^2_{L^2(\Omega)} + \kappa \, |u|^2_{H^1(\Omega)} \leq C(C_P) \, \|f\|^2_{L^2(\Omega)}$   (6.12)

This inequality yields a control of the $L^2$ norm and the $H^1$ semi-norm of the solution at any time $t$ of the time interval $[0,T]$. Similarly to Equation (6.8), if we integrate over the time, we get

$\|u(T)\|^2_{L^2(\Omega)} + \kappa \int_0^T |u|^2_{H^1(\Omega)} \, dt \leq C(C_P) \int_0^T \|f\|^2_{L^2(\Omega)} \, dt + \|u(0)\|^2_{L^2(\Omega)}$

which, by defining

$\|u\|_{L^p(0,T;L^p(\Omega))} = \left( \int_0^T \|u\|^p_{L^p(\Omega)} \, dt \right)^{1/p}$   (6.13)

can be rewritten as

$\|u(T)\|^2_{L^2(\Omega)} + \kappa \, \|u\|^2_{L^2(0,T;H^1(\Omega))} \leq C(C_P) \, \|f\|^2_{L^2(0,T;L^2(\Omega))} + \|u(0)\|^2_{L^2(\Omega)}$

The solution is said to be bounded in $L^\infty(0,T;L^2(\Omega))$, i.e. $u(t) \in L^2(\Omega)$ for almost every $t \in [0,T]$, and it is also bounded in $L^2(0,T;H^1(\Omega))$ by the data (provided that $f \in L^2(0,T;L^2(\Omega))$, of course). Now, if we turn to the discrete case, the estimate is not different aside from the discrete time derivative. The term for the discrete time derivative in the case of backward Euler reads
(u — u*) u dx ot Jn with St the current time step, u and u* respectively the solution at the current and previous time stepping. 6.2. A PRIORI STABILITY ESTIMATE 69 H2 Take v = —tAu. T — \u 2 12 dx 12 dx — / (ptu — Am) t Au dx Jn / £V(dtu) ■ Vudx + / £|Au|: in Jn t / dt(Vu) • Vudx + f / |Au|: Jn Jn £ d 2 , ,2 ^"^Nh1^) + Nh2(^) ■(T)\lHn)+ f MlHn)At-\ I H|i(n) = ^KO)lHi(fi) 0 «^ 0 v j Using the previous control in H1 allows us to conclude. 0 1 fT, ,, T, o Chapter 7 Adaptive error control In Section 5. we derived a priori error estimates which give a control of the discretization error for any approximate solution. The order of convergence given by the exponent 0(h?f) is an indication on "how close" to the continuous solution any approximate solution is expected to be. Provided that we are able to compute an approximate solution u^, we want now to evaluate the "quality" of this solution in the sense of the residual of the equation: such an estimate is thus called a posteriori as it gives a quality measure of a computed solution. Question: How can we evaluate the quality of an approximate solution computed on a given mesh to improve the accuracy? 7.1 A posteriori estimates In Chapter 5 interpolation inequalities were established, such as \\u-ThU\\u2(n) < C0^|M|H2(n) in order to bound the right-and side of Cea's inequality \\u ~ uh\\m(n) < \\u ~ v\\m(n) for v G Vh- As such this provides directly a control on the H1 norm of the approximation error = u — Uh- the obtained estimation is called a priori error estimate. This type of result is important to indicate the optimal convergence order of sequence of discrete solutions to the exact solution as the mesh size tends to zero (which means that the dimension of the solution space tends to infinity). Since a priori error estimates depend on the exact solution to the problem they does not provide a quality measure of the discrete solution which is computable. 
Instead we would like to derive an error estimate which would help us determine whether the computed discrete solution is accurate. This type of estimate is called an a posteriori error estimate: it consists of computing error indicators depending on the discrete solution and the discretization, providing a quality measure of $u_h$.

As an introduction, let us consider the exact solution $u \in V$ and its approximation $u_h \in V_h$, and observe that there exists a direct relation between the approximation error $e_h = u - u_h \in V$ and the residual of the equation $\mathcal{R}(u_h) \in V'$,
$$\langle\, \mathcal{R}(u_h) \,,\, v \,\rangle = L(v) - a(u_h, v)\,,\quad \forall v \in V$$
so that by consistency of the Galerkin method
$$\langle\, \mathcal{R}(u_h) \,,\, v \,\rangle = a(u - u_h, v)$$
and in the case $v \in V_h$ the duality pairing vanishes. In case $a(\cdot,\cdot)$ is a coercive continuous bilinear form we can write:

1. Coercivity:
$$\alpha\, \|u - u_h\|^2_V \le a(u - u_h, u - u_h) \qquad (7.1)$$
$$a(u - u_h, u - u_h) = \langle\, \mathcal{R}(u_h) \,,\, u - u_h \,\rangle \le \|\mathcal{R}(u_h)\|_{V'}\, \|e_h\|_V \qquad (7.2)$$
2. Continuity:
$$\|\mathcal{R}(u_h)\|_{V'} = \sup_{v \in V} \frac{\langle\, \mathcal{R}(u_h) \,,\, v \,\rangle}{\|v\|_V} = \sup_{v \in V} \frac{a(u - u_h, v)}{\|v\|_V} \qquad (7.3)$$
$$\|\mathcal{R}(u_h)\|_{V'} \le M\, \|e_h\|_V \qquad (7.4)$$

Using only the coercivity (7.1) and the continuity (7.3) of the bilinear form, it follows from relations (7.2) and (7.4) that estimating $\mathcal{R}(u_h)$ is equivalent to estimating the approximation error in the norm of $V$. Given that $u_h$ is known, the quantity $\mathcal{R}(u_h)$ is computable and can be used to derive a posteriori error estimators.

Different strategies can be used for improving the accuracy:
• h-adaptivity: cells with the largest error indicators are refined, i.e. divided into smaller cells so that the mesh size is decreased: the mesh topology is changed as new cells are inserted.
• p-adaptivity: the polynomial order is increased on the cell, so that the exponent becomes larger in the error estimate.
• r-adaptivity: points are moved to get smaller cells where the error indicators are large: the mesh topology is left unchanged as the transformation is only geometric.
7.2 Residual-based error estimator for Poisson

Let $u$ and $u_h$ be respectively the solutions to Problem (1.7) and to the approximate Problem (4.1) by a Lagrange $\mathbb{P}_1$ discretization. The ideas of the previous section are therefore applied with $V = H^1_0(\Omega)$ and $V_h$ the space of continuous piecewise linear functions vanishing on $\partial\Omega$. The objective of this section is to exhibit a control of the $H^1$ seminorm of the error
$$|e_h|^2_{H^1(\Omega)} = \int_\Omega \nabla e_h \cdot \nabla e_h \,\mathrm{d}x$$
in terms of the residual of the equation $\mathcal{R}(u_h)$. By Galerkin orthogonality, $a(e_h, v_h) = 0$ for $v_h \in V_h$; in particular testing against $v_h = \mathcal{I}_h e_h$ is possible, so that
$$|e_h|^2_{H^1(\Omega)} = \int_\Omega \nabla e_h \cdot \nabla(e_h - \mathcal{I}_h e_h) \,\mathrm{d}x$$
The immediate motivation for introducing this test function is to be able to use interpolation inequalities. Following the ideas of the previous section,
$$|e_h|^2_{H^1(\Omega)} = \int_\Omega \nabla u \cdot \nabla(e_h - \mathcal{I}_h e_h) \,\mathrm{d}x - \int_\Omega \nabla u_h \cdot \nabla(e_h - \mathcal{I}_h e_h) \,\mathrm{d}x$$
so that by consistency of the Galerkin method
$$|e_h|^2_{H^1(\Omega)} = \int_\Omega f\,(e_h - \mathcal{I}_h e_h) \,\mathrm{d}x - \int_\Omega \nabla u_h \cdot \nabla(e_h - \mathcal{I}_h e_h) \,\mathrm{d}x \qquad (7.5)$$
with the first term being $L(v)$ and the second term $a(u_h, v)$, for $v = e_h - \mathcal{I}_h e_h$. Similarly to interpolation inequalities, we consider the expression per element $K \in \mathcal{T}_h$. The second term is integrated by parts,
$$\int_K \nabla u_h \cdot \nabla(e_h - \mathcal{I}_h e_h) \,\mathrm{d}x = \int_{\partial K} \nabla u_h \cdot \boldsymbol{n}\,(e_h - \mathcal{I}_h e_h) \,\mathrm{d}s - \int_K \Delta u_h\,(e_h - \mathcal{I}_h e_h) \,\mathrm{d}x$$
and combined with the first term to obtain the element residual
$$\mathcal{R}_K(u_h) = (f + \Delta u_h)|_K$$
Summing again over the domain yields
$$|e_h|^2_{H^1(\Omega)} = \sum_{K \in \mathcal{T}_h} \left[ \int_K \mathcal{R}_K(u_h)\,(e_h - \mathcal{I}_h e_h) \,\mathrm{d}x - \int_{\partial K} \nabla u_h \cdot \boldsymbol{n}\,(e_h - \mathcal{I}_h e_h) \,\mathrm{d}s \right]$$
with the first term being a volume integral involving the residual of the equation, and the second term being a surface integral involving the jump of the normal gradient of $u_h$ across the cell facets.
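The element contributions just derived can be computed directly from a discrete solution. The sketch below is an illustration in one dimension, not the notes' estimator: for $\mathbb{P}_1$ elements, $u_h'' = 0$ on each element, the cell residual reduces to $f$, and the facet term is the jump of $u_h'$ at the nodes. The midpoint quadrature and the particular facet scaling ($\frac{1}{2} h_K$ per shared facet) are modelling choices made here for simplicity.

```python
import numpy as np

def residual_indicators(x, uh, f):
    """Per-element residual indicators for -u'' = f with P1 elements in 1D.
    eta_K^2 = h_K^2 ||f||_{L2(K)}^2 + (h_K / 2) * sum of squared slope jumps
    at the element endpoints (illustrative scaling).
    x: node coordinates, uh: nodal values, f: callable."""
    n_el = len(x) - 1
    slopes = np.diff(uh) / np.diff(x)  # uh' is constant on each element
    eta2 = np.zeros(n_el)
    for k in range(n_el):
        h = x[k + 1] - x[k]
        xm = 0.5 * (x[k] + x[k + 1])
        cell = h**2 * f(xm)**2 * h  # midpoint quadrature of ||f||^2_{L2(K)}, times h^2
        jump_l = slopes[k] - slopes[k - 1] if k > 0 else 0.0
        jump_r = slopes[k + 1] - slopes[k] if k < n_el - 1 else 0.0
        eta2[k] = cell + 0.5 * h * (jump_l**2 + jump_r**2)
    return np.sqrt(eta2)
```

For a globally linear $u_h$ and $f = 0$ all indicators vanish, consistent with the exact solution being reproduced.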
Using first the Cauchy–Schwarz inequality,
$$|e_h|^2_{H^1(\Omega)} \le \|\mathcal{R}(u_h)\|_{L^2(\Omega)}\, \|e_h - \mathcal{I}_h e_h\|_{L^2(\Omega)}$$
then the interpolation inequality with constant $C_I$,
$$|e_h|^2_{H^1(\Omega)} \le C_I\, h_{\mathcal{T}}\, \|\mathcal{R}(u_h)\|_{L^2(\Omega)}\, |e_h|_{H^1(\Omega)}$$
Consequently, we conclude
$$|e_h|_{H^1(\Omega)} \le C_I\, h_{\mathcal{T}}\, \|\mathcal{R}(u_h)\|_{L^2(\Omega)}$$

7.3 Dual weighted residual estimate

7.3.1 Adjoint operator

Definition 7.3.1 (Adjoint operator). Let us define $\mathcal{A}^*$, the adjoint operator of $\mathcal{A}$, by:
$$(\, \mathcal{A}u \,,\, v \,) = (\, u \,,\, \mathcal{A}^* v \,)$$

Example 7.3.2 (Matrix of $\mathbb{M}_N(\mathbb{R})$). Let $\mathcal{A} = \mathsf{A}$ be a real square matrix of dimension $N \times N$ and $x, y \in \mathbb{R}^N$:
$$(\, \mathsf{A}x \,,\, y \,) = (\, x \,,\, \mathsf{A}^{\mathrm{T}} y \,) = (\, x \,,\, \mathcal{A}^* y \,)$$
with $(\,\cdot\,,\,\cdot\,)$ the scalar product of $\mathbb{R}^N$; then $\mathcal{A}^* = \mathsf{A}^{\mathrm{T}}$.

Example 7.3.3 (Weak derivative). Let $\mathcal{A} = \mathrm{D}_x$ and $u, v \in L^2(\Omega)$, with compact support in $\Omega$:
$$(\, \mathcal{A}u \,,\, v \,) = (\, \mathrm{D}_x u \,,\, v \,) = -(\, u \,,\, \mathrm{D}_x v \,) = (\, u \,,\, \mathcal{A}^* v \,)$$
with $(\,\cdot\,,\,\cdot\,)$ the scalar product of $L^2(\Omega)$; then $\mathcal{A}^* = -\mathrm{D}_x$.

Example 7.3.4 (Laplace operator). Let $\mathcal{A} = -\Delta$ and $u, v \in H^1_0(\Omega)$:
$$(\, \mathcal{A}u \,,\, v \,) = (\, -\Delta u \,,\, v \,) = (\, \nabla u \,,\, \nabla v \,) = (\, u \,,\, -\Delta v \,) = (\, u \,,\, \mathcal{A}^* v \,)$$
with $(\,\cdot\,,\,\cdot\,)$ the scalar product of $L^2(\Omega)$; then $\mathcal{A}^* = -\Delta$. The Laplace operator is said to be self-adjoint.

7.3.2 Duality-based a posteriori error estimate

We define the dual problem as seeking $\eta$ satisfying $\mathcal{A}^* \eta = e_h$, which gives a control on the discretization error, using the definition of the adjoint operator $\mathcal{A}^*$:
$$\|e_h\|^2_{L^2(\Omega)} = (\, e_h \,,\, e_h \,) = (\, e_h \,,\, \mathcal{A}^*\eta \,) = (\, \mathcal{A}e_h \,,\, \eta \,) = (\, \mathcal{A}u \,,\, \eta \,) - (\, \mathcal{A}u_h \,,\, \eta \,) = (\, f - \mathcal{A}u_h \,,\, \eta \,) = (\, \mathcal{R}(u_h) \,,\, \eta \,)$$
with $\mathcal{R}(u_h) = f - \mathcal{A}u_h$. Moreover, if the dual problem is stable then there exists a constant $S$ such that the dual solution $\eta$ is bounded:
$$\|\eta\|_{L^2(\Omega)} \le S\, \|e_h\|_{L^2(\Omega)}$$
with the stability factor $S$ satisfying
$$S = \max_{\theta \in L^2(\Omega)} \frac{\|\eta(\theta)\|_{L^2(\Omega)}}{\|\theta\|_{L^2(\Omega)}}$$
Thus we can obtain a bound of the form:
$$\|e_h\|_{L^2(\Omega)} \le S\, \|\mathcal{R}(u_h)\|_{L^2(\Omega)}$$
Combining this estimate with an interpolation inequality in $H^s$, we can bound the discretization error in terms of the residual and the stability factor. For instance, if we control the second derivatives of the dual solution, i.e. $s = 2$ with $\|\eta\|_{H^2(\Omega)} \le S\, \|e_h\|_{L^2(\Omega)}$, then
$$\|e_h\|^2_{L^2(\Omega)} \le C_I\, h_{\mathcal{T}}^2\, \|\mathcal{R}(u_h)\|_{L^2(\Omega)}\, \|\eta\|_{H^2(\Omega)}$$
Consequently,
$$\|e_h\|_{L^2(\Omega)} \le C_I\, S\, h_{\mathcal{T}}^2\, \|\mathcal{R}(u_h)\|_{L^2(\Omega)}$$

7.4 Method

Given a tolerance $\epsilon_{tol} > 0$ defining a quality criterion for the computed solution $u_h$, adapt the mesh such that it satisfies:
$$\epsilon_{\mathcal{T}} = \sum_{K \in \mathcal{T}_h} \epsilon_K \le \epsilon_{tol}$$

Algorithm 7.4.2 (Adaptive mesh strategy). The following procedure applies:
• Generate an initial coarse mesh $\mathcal{T}_h^0$.
• Perform adaptive iterations for levels $\ell = 0, \cdots, \ell_{max}$:
1. Solve the primal problem with solution $u_h^\ell \in V_h^\ell$.
2. Compute the residual of the equation $\mathcal{R}(u_h^\ell)$.
3. If dual weighted, solve the dual problem with solution $\eta \in W_h^\ell$.
4. Compute the error indicators $\epsilon_K$, $\forall K \in \mathcal{T}_h^\ell$.
5. If $\epsilon_{\mathcal{T}} > \epsilon_{tol}$: generate mesh $\mathcal{T}_h^{\ell+1}$ by refining the cells with the largest values of $\epsilon_K$. Else: terminate the adaptive iterations, $\ell_{max} = \ell$.

7.5 Exercises

Exercise 7.5.1 (Diffusion–Reaction problem on the unit interval). Consider the following one-dimensional problem:
$$-\mathrm{d}_x\big( a(x)\, \mathrm{d}_x u(x) \big) + c(x)\, u(x) = f(x)\,,\quad \forall x \in \Omega = [0,1]$$
with $a > 0$, $c > 0$, and supplemented with homogeneous Dirichlet boundary conditions
$$u(0) = u(1) = 0$$
1. Write the weak formulation of the given problem and its approximation by piecewise linear Lagrange elements.
2. Write the dual problem for the unknown $\eta$.
3. Obtain the following estimate:
$$\|e_h\|^2_{L^2(\Omega)} \le \|h^2\, \mathcal{R}(u_h)\|_{L^2(\Omega)}\, \|h^{-2}(\eta - \mathcal{I}_h \eta)\|_{L^2(\Omega)}$$
with the discretization error $e_h = u - u_h$, the equation residual $\mathcal{R}(u_h) = f + \mathrm{d}_x(a\, \mathrm{d}_x u_h) - c\, u_h$ and the Lagrange $\mathbb{P}_1$ interpolation operator $\mathcal{I}_h$. First you should test the dual equation against $e_h$, then write the expression element-wise to be able to define the residual.
4. Conclude that the a posteriori error estimate holds, with $C_I$ the interpolation constant and $S$ a stability factor that you will define.
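The adaptive procedure above can be sketched in a few lines for a one-dimensional mesh. This is an illustration, not the notes' code: it assumes a fixed-fraction marking strategy (refine the 30% of cells with the largest indicators, an arbitrary but common choice), and a caller-supplied `estimate` function playing the role of steps 1–4 of the algorithm.

```python
import numpy as np

def adapt_mesh(x, indicators, frac=0.3):
    """One h-adaptivity pass: bisect the fraction `frac` of elements
    carrying the largest error indicators (fixed-fraction marking)."""
    n_el = len(x) - 1
    n_mark = max(1, int(frac * n_el))
    marked = np.argsort(indicators)[-n_mark:]   # elements with largest indicators
    mids = 0.5 * (x[marked] + x[marked + 1])    # bisection points
    return np.sort(np.concatenate([x, mids]))

def adaptive_loop(x0, estimate, tol, max_level=10):
    """Skeleton of the adaptive strategy: estimate -> check -> refine, per level.
    `estimate(x)` must return per-element indicators on mesh x."""
    x = x0
    for level in range(max_level):
        eta = estimate(x)
        if eta.sum() <= tol:
            break
        x = adapt_mesh(x, eta)
    return x
```

With the crude surrogate indicator $\epsilon_K = h_K^2$, the loop refines until the indicator sum drops (or the level budget is exhausted).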
Chapter 8 Stabilized methods for advection dominated problems

8.1 An advection–diffusion problem in one dimension

Let us consider the following one-dimensional advection–diffusion problem:
$$-\mathrm{d}_x\big( \nu(x)\, \mathrm{d}_x u(x) \big) + \mathrm{d}_x u(x) = f(x)\,,\quad \forall x \in \Omega = (0,1)$$
with viscosity $\nu > 0$, and supplemented with boundary conditions:
$$u(0) = 1\,,\qquad u(1) = 0$$

8.2 Coercivity loss

8.3 Stabilization of the Galerkin method

Galerkin:
$$(\, \mathcal{A}u \,,\, v \,) = (\, f \,,\, v \,)$$
Galerkin least-squares:
$$(\, \mathcal{A}u \,,\, v + \delta\mathcal{A}v \,) = (\, f \,,\, v + \delta\mathcal{A}v \,)$$
i.e.
$$(\, \mathcal{A}u \,,\, v \,) + (\, \mathcal{A}u \,,\, \delta\mathcal{A}v \,) = (\, f \,,\, v \,) + (\, f \,,\, \delta\mathcal{A}v \,)$$
Streamline diffusion:
$$(\, \mathcal{A}u \,,\, v + \delta\mathcal{A}v \,) + (\, \nu_h \nabla u \,,\, \nabla v \,) = (\, f \,,\, v + \delta\mathcal{A}v \,)$$
Entropy viscosity:
$$(\, \mathcal{A}u \,,\, v \,) + (\, \nu_h \nabla u \,,\, \nabla v \,) = (\, f \,,\, v \,)$$

8.4 Exercises

Chapter 9 Iterative solvers and Multigrid

9.1 Iterative methods

As seen in the previous lecture, direct methods can theoretically compute the exact solution $x \in \mathbb{R}^N$ of a linear system of the form
$$\mathsf{A}x = b$$
with a real coefficient matrix $\mathsf{A} \in \mathbb{M}_N(\mathbb{R})$ and given data $b \in \mathbb{R}^N$, in a determined finite number of steps. As computing the inverse of the matrix is unrealistic, several methods were introduced based on factorizations of the type $\mathsf{A} = \mathsf{P}\mathsf{Q}$, where $\mathsf{P}$ and $\mathsf{Q}$ have a structure simplifying the resolution of the system: diagonal, banded, triangular. Methods like LU and Cholesky take advantage of the existence of a decomposition involving triangular matrices, while QR, for example, involves the construction of an orthogonal basis. All these methods prove to be quite expensive, hard to parallelize due to the sequential nature of the algorithms, and prone to error propagation. Iterative methods have been developed because:
• solving very large linear systems with direct methods is in practice not possible, due to the complexity in terms of computational operations and data;
• they take advantage of sparse systems, for which the structure of the matrix can result in a dramatic speed-up (this is the case for numerical schemes for PDEs);
• some systems, like PDE discretizations, are already formulated in an iterative fashion.
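The effect of adding artificial diffusion can be observed on the model problem of Section 8.1. The sketch below is illustrative, not the notes' formulation: it uses the finite difference scheme matching the $\mathbb{P}_1$ Galerkin discretization on a uniform mesh, and the classical one-dimensional choice $\nu_h = h/2$ for the added viscosity (which reproduces first-order upwinding).

```python
import numpy as np

def advection_diffusion_1d(n, nu, nu_art=0.0):
    """Central (Galerkin-equivalent) discretization of -nu u'' + u' = 0 on (0,1),
    u(0)=1, u(1)=0, with n interior unknowns and optional artificial viscosity."""
    h = 1.0 / (n + 1)
    nu_eff = nu + nu_art
    lower = -nu_eff / h**2 - 1.0 / (2 * h)   # coefficient of u_{i-1}
    main = 2 * nu_eff / h**2                 # coefficient of u_i
    upper = -nu_eff / h**2 + 1.0 / (2 * h)   # coefficient of u_{i+1}
    A = (np.diag(main * np.ones(n)) + np.diag(upper * np.ones(n - 1), 1)
         + np.diag(lower * np.ones(n - 1), -1))
    b = np.zeros(n)
    b[0] -= lower * 1.0                      # boundary value u(0) = 1
    return np.linalg.solve(A, b)
```

On a coarse mesh with small $\nu$ (cell Péclet number above one), the plain Galerkin solution overshoots the boundary data, while the stabilized solution with $\nu_h = h/2$ stays within the bounds $[0, 1]$.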
In this section, we briefly discuss the computational properties of iterative methods for solving linear systems. Computing the exact solution is no longer a requirement: instead, the algorithm is supposed to converge asymptotically to the exact solution, and it is stopped when the approximate solution is deemed close enough to the exact solution, in a sense to be defined. A parameter used as a stopping criterion triggers the completion of the algorithm.

The general idea of these methods is to introduce a splitting of the form:
$$\mathsf{A} = \mathsf{G} - \mathsf{H}$$
such that the solution $x$ satisfies:
$$\mathsf{G}x = b + \mathsf{H}x$$
Similarly to fixed-point methods, we can define a sequence of approximate solutions $(x^k)$ satisfying relations of the form:
$$\mathsf{G}x^{k+1} = b + \mathsf{H}x^k$$
with $\mathsf{G}$ invertible. The matrix being viewed as a linear mapping in $\mathbb{R}^N$, the counterpart of such approaches is given by the Brouwer Theorem in finite dimension, where a continuous mapping $f : Q \to Q$, with $Q$ a compact subset of $\mathbb{R}^N$, admits a fixed point $x^*$ satisfying $f(x^*) = x^*$, provided that $f$ is contracting. The methods introduced depend on the iteration defined by the splitting and raise several questions regarding the computational aspects:
1. How can the convergence be ensured?
2. How fast is the convergence?
3. How expensive is each iteration?
4. How does the algorithm behave with respect to numerical error?

The question of convergence is addressed by proving an estimate on error vectors, in terms of the iteration error $e^k = x^{k+1} - x^k$ or the global error $e^k = x^k - x$. A convergence rate $a$ means that there exists $C > 0$ such that $|e^{k+1}| \le C\, |e^k|^a$. For example, substituting $x^{k+1} = \mathsf{G}^{-1}b + \mathsf{G}^{-1}\mathsf{H}x^k$ in $e^k = x^{k+1} - x^k$ gives a relation between successive iteration errors:
$$e^k = \mathsf{G}^{-1}\mathsf{H}\, e^{k-1}$$
with $\mathsf{M} = \mathsf{G}^{-1}\mathsf{H}$ the iteration matrix, and recursively $e^k = (\mathsf{G}^{-1}\mathsf{H})^k\, e^0$. Convergence is then conditioned on the existence of a contraction factor $K < 1$ such that
$$\|e^k\|_\infty \le K\, \|e^{k-1}\|_\infty$$
ensuring the decrease of the error.
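The splitting iteration and its convergence criterion can be sketched directly. The code below is a minimal dense-matrix illustration (not the notes' code), using the Jacobi-type splitting $\mathsf{G} = \mathsf{D}$ as an example and checking $\rho(\mathsf{G}^{-1}\mathsf{H}) < 1$ numerically.

```python
import numpy as np

def splitting_iteration(A, b, G, x0, steps=50):
    """Fixed-point iteration G x^{k+1} = b + H x^k with H = G - A.
    Converges when the spectral radius of G^{-1} H is below one."""
    H = G - A
    x = x0.copy()
    for _ in range(steps):
        x = np.linalg.solve(G, b + H @ x)
    return x

def spectral_radius(A, G):
    """Spectral radius of the iteration matrix M = G^{-1} (G - A)."""
    M = np.linalg.solve(G, G - A)
    return max(abs(np.linalg.eigvals(M)))
```

For a diagonally dominant matrix with $\mathsf{G}$ its diagonal, the spectral radius is below one and the iteration converges to the solution of $\mathsf{A}x = b$.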
In terms of the matrix $\mathsf{M}$, this translates for the spectral radius $\rho(\mathsf{M})$ into the condition $\rho(\mathsf{M}) < 1$, since in that case $\lim_{k\to\infty} \mathsf{M}^k e^0 = 0_{\mathbb{R}^N}$. The smaller the spectral radius, the faster the convergence. Each method is described briefly and qualitatively, with just the necessary ingredients to discuss practical implementations.

9.2 Relaxation methods

Consider the relations, for each row $i = 1, \ldots, N$:
$$x_i = \frac{1}{a_{ii}} \Big( b_i - \sum_{j \ne i} a_{ij}\, x_j \Big) \qquad (9.1)$$
Let us introduce two methods based on constructing sequences of approximate solutions $(x^k)$, $k \ge 1$, given an initial guess $x^0 \in \mathbb{R}^N$, and then the associated relaxation methods.

9.2.1 Jacobi, method of simultaneous displacements

Algorithm: all components of the right-hand side of (9.1) are taken at the previous iterate,
$$x_i^{k+1} = \frac{1}{a_{ii}} \Big( b_i - \sum_{j \ne i} a_{ij}\, x_j^k \Big)$$
which corresponds to the splitting $\mathsf{A} = \mathsf{D} - \mathsf{R}$ with $\mathsf{D}$ the diagonal of $\mathsf{A}$, thus $x^{k+1} = \mathsf{D}^{-1}(b + \mathsf{R}x^k)$ and the iteration matrix is $\mathsf{M} = \mathsf{D}^{-1}\mathsf{R}$.

Convergence: the global error $e^k$ is controlled by
$$\|e^k\| \le K_J^k\, \|e^0\|$$

9.2.2 Gauss–Seidel, method of successive displacements

Algorithm: the splitting is $\mathsf{A} = \mathsf{L} - \mathsf{R}_0$, with $\mathsf{L} = \mathsf{D} + \mathsf{L}_0$ lower triangular and $\mathsf{R}_0$ strictly upper triangular, thus
$$x^{k+1} = \mathsf{D}^{-1}\big( b - \mathsf{L}_0 x^{k+1} + \mathsf{R}_0 x^k \big) \qquad\text{or}\qquad \mathsf{L}x^{k+1} = b + \mathsf{R}_0 x^k$$
Recast in the usual form:
$$x^{k+1} = \mathsf{L}^{-1}\big( b + \mathsf{R}_0 x^k \big)$$
and the iteration matrix is $\mathsf{M} = \mathsf{L}^{-1}\mathsf{R}_0$.

Convergence: the global error $e^k$ is controlled by $\|e^k\| \le K_{GS}^k\, \|e^0\|$. If the Jacobi contraction factor satisfies $K_J < 1$, then $K_{GS} < 1$. Expressing the iteration error gives directly that $\mathsf{M} = \mathsf{L}^{-1}\mathsf{R}_0$ is such that $\rho(\mathsf{M}) < 1$.

Implementation:
1. Parallelization component by component is not easily possible, since each row $i$ is serialized by the dependency on the $x_j^{k+1}$, $j < i$.
2. The memory requirement is only for storing one vector of $\mathbb{R}^N$ at each iteration.

9.2.3 Relaxation of Jacobi and Gauss–Seidel

Relaxation methods consist of adding a linear combination of the approximate solution at the previous iteration, to minimize the spectral radius of the iteration, using a relaxation parameter $\gamma \in (0,1)$.

1. Jacobi Over-Relaxation (JOR):
$$x_i^{k+1} = (1 - \gamma)\, x_i^k + \gamma\, \frac{1}{a_{ii}} \Big( b_i - \sum_{j \ne i} a_{ij}\, x_j^k \Big) \qquad (9.5)$$
which reads in matrix form $x^{k+1} = \mathsf{M}_\gamma x^k + \gamma\, \mathsf{D}^{-1} b$ with
$$\mathsf{M}_\gamma = (1 - \gamma)\, \mathsf{I} + \gamma\, \mathsf{D}^{-1}\mathsf{R}$$
2. Successive Over-Relaxation (SOR):
$$x_i^{k+1} = (1 - \gamma)\, x_i^k + \gamma\, \frac{1}{a_{ii}} \Big( b_i - \sum_{j < i} a_{ij}\, x_j^{k+1} - \sum_{j > i} a_{ij}\, x_j^k \Big) \qquad (9.6)$$
which reads in matrix form $x^{k+1} = \mathsf{M}_\gamma x^k + \gamma\, \mathsf{C} b$ with
$$\mathsf{M}_\gamma = \big( \mathsf{I} + \gamma\, \mathsf{D}^{-1}\mathsf{L}_0 \big)^{-1} \big[ (1 - \gamma)\, \mathsf{I} + \gamma\, \mathsf{D}^{-1}\mathsf{R}_0 \big] \qquad\text{and}\qquad \mathsf{C} = \big( \mathsf{I} + \gamma\, \mathsf{D}^{-1}\mathsf{L}_0 \big)^{-1} \mathsf{D}^{-1}$$
The relaxation parameter $\gamma$ cannot be known a priori and is usually determined by heuristics.

9.2.4 Parallelization of Gauss–Seidel

Overcoming the serialization in Gauss–Seidel is possible if the matrix is sparse, taking advantage of the fact that the components do not all possess connectivities with each other: such dependencies can be built from the sparsity pattern, and decoupled graphs identified.
1. Component dependency graph: generate a graph to reorder the entries such that dependencies are avoided.
2. Red–black coloring: a special case for two-dimensional problems.

9.3 Krylov-subspace methods

The idea of these methods is that the solution is decomposed on a sequence of orthogonal subspaces. If $\mathsf{A}$ is symmetric positive definite it induces the corresponding scalar product:
$$(\, x \,,\, y \,)_{\mathsf{A}} = (\, \mathsf{A}x \,,\, y \,) = y^{\mathrm{T}} \mathsf{A} x$$
with $(\,\cdot\,,\,\cdot\,)$ the canonical scalar product in $\mathbb{R}^N$. The vectors $(e_1, \ldots, e_N)$ are said to be $\mathsf{A}$-conjugate if $e_j^{\mathrm{T}} \mathsf{A} e_i = 0$ for $i \ne j$: they are orthogonal for the scalar product induced by $\mathsf{A}$.

9.3.1 Principle of descent methods: Steepest Gradient

Minimization of the residual:
$$x^* = \operatorname*{arg\,min}_x \mathcal{J}(x)\,,\qquad \mathcal{J}(x) = \frac{1}{2}\,(\, \mathsf{A}x \,,\, x \,) - (\, b \,,\, x \,)$$
Construct a sequence of solutions to approximate minimization problems: given $x^k$,
$$\mathcal{J}(x^{k+1}) \le \mathcal{J}(x^k) \qquad\text{where}\qquad x^{k+1} = x^k + \alpha_{k+1}\, e_{k+1}$$
with $\alpha_{k+1}$ a descent factor and $e_{k+1}$ a direction. For the Steepest Gradient:
1. take the direction given by $-\nabla\mathcal{J}(x^k) = b - \mathsf{A}x^k$, which is the residual $r_k = b - \mathsf{A}x^k$, thus $x^{k+1} = x^k + \alpha_{k+1} r_k$;
2. choose the descent factor $\alpha_{k+1}$ minimizing the functional $\mathcal{J}(x^k + \alpha_{k+1} r_k)$:
$$\alpha_{k+1} = \frac{r_k^{\mathrm{T}} r_k}{r_k^{\mathrm{T}} \mathsf{A} r_k}$$
The speed of convergence is bounded by $\mathcal{O}\big( 1 - \mathcal{C}(\mathsf{A})^{-1} \big)$, with $\mathcal{C}(\mathsf{A})$ the condition number of $\mathsf{A}$. The gradient direction may not be optimal; Conjugate Gradient methods improve the choice of $(e_k)$.
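The Steepest Gradient iteration above can be sketched as follows; this is a minimal illustration for a symmetric positive definite system, with illustrative naming and a residual-norm stopping test.

```python
import numpy as np

def steepest_descent(A, b, x0, tol=1e-10, max_iter=10_000):
    """Steepest gradient descent for SPD A: direction r_k = b - A x^k,
    step alpha_{k+1} = (r^T r) / (r^T A r)."""
    x = x0.copy()
    r = b - A @ x
    for _ in range(max_iter):
        rr = r @ r
        if np.sqrt(rr) < tol:
            break
        Ar = A @ r
        alpha = rr / (r @ Ar)
        x = x + alpha * r
        r = r - alpha * Ar      # residual update avoids recomputing b - A x
    return x
```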
9.3.2 Conjugate Gradient

The Conjugate Gradient (CG) method is a Krylov-subspace algorithm for symmetric positive definite matrices. Given $x^0$, $(x^k)$ is a sequence of solutions to approximate $k$-dimensional minimization problems. For the Conjugate Gradient:
1. take the direction $e_{k+1}$ such that $(e_1, \ldots, e_k, e_{k+1})$ is $\mathsf{A}$-conjugate, thus $x^{k+1} = x^k + \alpha_{k+1} e_{k+1}$;
2. choose the descent factor $\alpha_{k+1}$ minimizing the functional $\mathcal{J}(x^k + \alpha_{k+1} e_{k+1})$, which is given by
$$\alpha_{k+1} = \frac{e_{k+1}^{\mathrm{T}} b}{e_{k+1}^{\mathrm{T}} \mathsf{A} e_{k+1}}$$
with $e_{k+1}^{\mathrm{T}} b \ne 0$ (unless the exact solution is reached).

The construction of $(e_1, \ldots, e_{k+1})$ is done by orthogonalization of the residuals by Gram–Schmidt:
$$e_{k+1} = r_k - \frac{e_k^{\mathrm{T}} \mathsf{A} r_k}{e_k^{\mathrm{T}} \mathsf{A} e_k}\, e_k$$
so that
$$r_{k+1} = b - \mathsf{A}x^{k+1} = r_k - \alpha_{k+1}\, \mathsf{A}e_{k+1}$$
After $N$ steps, the $\mathsf{A}$-conjugate basis of $\mathbb{R}^N$ is complete and the exact solution is reached:
$$x^N = \sum_{k=1}^{N} \alpha_k\, e_k$$
For any $k$, the speed of convergence is bounded by $\mathcal{O}\Big( \big( \frac{\sqrt{\mathcal{C}(\mathsf{A})} - 1}{\sqrt{\mathcal{C}(\mathsf{A})} + 1} \big)^k \Big)$ in the norm induced by $\mathsf{A}$, with $\mathcal{C}(\mathsf{A})$ the condition number of $\mathsf{A}$. The Conjugate Gradient can therefore be seen as a direct method, but in practice:
• the iterative computation of the $\mathsf{A}$-conjugate basis suffers from the same issue of numerical error propagation as the QR factorization, leading to a loss of orthogonality;
• the convergence is slow, which makes it unrealistic to compute the exact solution for large systems;
so it is used as an iterative method.

Example algorithm, on the first steps:
1. Given $x^0 = 0$, set $r_0 = b - \mathsf{A}x^0$ and $e_1 = r_0$.
2. Take $x^1 = \alpha_1 e_1$; then $\alpha_1\, e_1^{\mathrm{T}} \mathsf{A} e_1 = e_1^{\mathrm{T}} b$, thus
$$\alpha_1 = \frac{r_0^{\mathrm{T}} b}{e_1^{\mathrm{T}} \mathsf{A} e_1}$$
3. Compute the residual: $r_1 = b - \mathsf{A}x^1$.
4. Compute the direction:
$$e_2 = r_1 - \frac{e_1^{\mathrm{T}} \mathsf{A} r_1}{e_1^{\mathrm{T}} \mathsf{A} e_1}\, e_1$$
5. Compute the factor:
$$\alpha_2 = \frac{e_2^{\mathrm{T}} b}{e_2^{\mathrm{T}} \mathsf{A} e_2}$$
6. Update the solution: $x^2 = x^1 + \alpha_2 e_2$.
7. ...

The algorithm iteration reads:
1. Compute the residual: $r_k = b - \mathsf{A}x^k$.
2. Compute the direction:
$$e_{k+1} = r_k - \frac{e_k^{\mathrm{T}} \mathsf{A} r_k}{e_k^{\mathrm{T}} \mathsf{A} e_k}\, e_k$$
3. Compute the factor:
$$\alpha_{k+1} = \frac{e_{k+1}^{\mathrm{T}} b}{e_{k+1}^{\mathrm{T}} \mathsf{A} e_{k+1}}$$
4. Update the solution: $x^{k+1} = x^k + \alpha_{k+1} e_{k+1}$.
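The iteration above can be transcribed directly. This sketch prioritizes clarity over the operation count (it recomputes the residual from $b - \mathsf{A}x^k$ at each step, as written), starts from $x^0 = 0$, and uses $e^{\mathrm{T}} r_k$ for the descent factor, which coincides with $e^{\mathrm{T}} b$ along $\mathsf{A}$-conjugate directions when $x^0 = 0$; function and variable names are illustrative.

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-12, max_iter=None):
    """Conjugate Gradient for SPD A: residual, A-conjugate direction by
    Gram-Schmidt against the previous direction, descent factor, update."""
    n = len(b)
    x = np.zeros(n)
    r = b.copy()            # r_0 = b - A x^0 with x^0 = 0
    e = r.copy()            # e_1 = r_0
    for _ in range(max_iter or n):
        Ae = A @ e
        alpha = (e @ r) / (e @ Ae)            # 3. descent factor
        x = x + alpha * e                     # 4. solution update
        r = b - A @ x                         # 1. residual
        if np.linalg.norm(r) < tol:
            break
        e = r - ((e @ (A @ r)) / (e @ Ae)) * e  # 2. direction (Gram-Schmidt)
    return x
```

In exact arithmetic the loop terminates in at most $N$ iterations, reflecting the direct-method interpretation of CG.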
This requires two matrix–vector multiplications per loop, $\mathsf{A}x^k$ then $\mathsf{A}e_{k+1}$. Using $r_{k+1} = r_k - \alpha_{k+1}\, \mathsf{A}e_{k+1}$ saves one matrix–vector multiplication. While the residual norm $\varrho_k = \|r_k\|_2$ is big:
1. Compute the projection:
$$\beta_k = \frac{r_k^{\mathrm{T}} r_k}{r_{k-1}^{\mathrm{T}} r_{k-1}}$$
2. Compute the direction: $e_{k+1} = r_k + \beta_k e_k$.
3. Compute the factor: $w = \mathsf{A}e_{k+1}$,
$$\alpha_{k+1} = \frac{r_k^{\mathrm{T}} r_k}{e_{k+1}^{\mathrm{T}} w}$$
4. Update the solution: $x^{k+1} = x^k + \alpha_{k+1} e_{k+1}$.
5. Update the residual: $r_{k+1} = r_k - \alpha_{k+1} w$.

9.3.3 Preconditioners

While seeing the Conjugate Gradient as a pure iterative method relieves one from concerns regarding the loss of orthogonality, the convergence is still slow as soon as the condition number of the matrix is bad. Preconditioning the system consists of finding a non-singular symmetric matrix $\mathsf{C}$ such that $\tilde{\mathsf{A}} = \mathsf{C}^{-1}\mathsf{A}\mathsf{C}^{-1}$, and the Conjugate Gradient is applied to
$$\tilde{\mathsf{A}}\tilde{x} = \tilde{b}$$
with $x = \mathsf{C}^{-1}\tilde{x}$ and $\tilde{b} = \mathsf{C}^{-1}b$. With:
• $\mathsf{M} = \mathsf{C}^2$
• $e_k = \mathsf{C}^{-1}\tilde{e}_k$
• $x^k = \mathsf{C}^{-1}\tilde{x}^k$
• $z_k = \mathsf{M}^{-1} r_k$
• $r_k = \mathsf{C}\tilde{r}_k = b - \mathsf{A}x^k$
and $\mathsf{M}$ is a symmetric positive definite matrix called the preconditioner. While the residual norm $\varrho_k = \|r_k\|_2$ is big:
1. Solve: $\mathsf{M}z_k = r_k$.
2. Compute the projection:
$$\beta_k = \frac{z_k^{\mathrm{T}} r_k}{z_{k-1}^{\mathrm{T}} r_{k-1}}$$
3. Compute the direction: $e_{k+1} = z_k + \beta_k e_k$.
4. Compute the factor:
$$\alpha_{k+1} = \frac{z_k^{\mathrm{T}} r_k}{e_{k+1}^{\mathrm{T}} \mathsf{A} e_{k+1}}$$
5. Update the solution: $x^{k+1} = x^k + \alpha_{k+1} e_{k+1}$.
6. Update the residual: $r_{k+1} = r_k - \alpha_{k+1}\, \mathsf{A}e_{k+1}$.

The linear system $\mathsf{M}z_k = r_k$ should be easy to solve, and preconditioning can lead to fast convergence, typically $\mathcal{O}(\sqrt{N})$. Since
$$\mathsf{M}z_k = b - \mathsf{A}x^k$$
an iterative relation appears:
$$x^{k+1} = x^k + \mathsf{M}^{-1}\big( b - \mathsf{A}x^k \big)$$
therefore iterative methods like Jacobi, Gauss–Seidel and relaxation methods can be used as preconditioners.

9.4 Power method

This method is used for finding the dominant eigenvalues of a matrix $\mathsf{A} \in \mathbb{M}_N(\mathbb{R})$ with eigenvectors $(v_i)$ and associated eigenvalues $(\lambda_i)$ ordered by decreasing modulus. The eigenvalues are either real or conjugate complex pairs.
Given a random vector $x^0$, construct a sequence of vectors $(x^k)$ such that $x^{k+1} = \mathsf{A}x^k$; then $\forall k \ge 0$,
$$x^k = \mathsf{A}^k x^0 = \sum_{i=0}^{N-1} \xi_i\, \lambda_i^k\, v_i$$
for some coefficients $(\xi_i)$. Assume that $\lambda_0$ is a dominant real eigenvalue and $\xi_0 \ne 0$; then
$$x^k = \lambda_0^k\, \big( \xi_0 v_0 + r_k \big)$$
with the residual defined as
$$r_k = \sum_{i=1}^{N-1} \xi_i \Big( \frac{\lambda_i}{\lambda_0} \Big)^k v_i$$
and $\lim_{k\to\infty} r_k = 0_{\mathbb{R}^N}$. In the limit,
$$x^{k+1} \approx \lambda_0\, x^k \approx \lambda_0^{k+1}\, \xi_0\, v_0$$
almost parallel to the first eigenvector.
• This method is a fast way to compute the spectral radius needed for the Jacobi method and for tuning relaxation parameters.
• The convergence is geometric and the speed depends on the ratio $|\lambda_1 / \lambda_0|$.
• If the matrix is symmetric, the convergence speed can be doubled.
• If $\lambda_0$ is very large or very small, then taking high powers leads to numerical issues: the algorithm requires a normalization.

9.5 Multigrid methods

TBD

Chapter 10 Mixed problems

This chapter is an opportunity to apply, step by step, the methodology developed throughout the course by studying the Stokes problem, and to give an overview of the difficulties arising in mixed problems.

Question: In the case of a problem involving a pair of unknowns $(u, p)$, is there a criterion to choose the approximation spaces?

10.1 The Stokes equations

10.1.1 Position of the problem

Let us consider the equations governing the velocity $u$ and pressure $p$ of an incompressible creeping flow, subject to gravity, in a domain $\Omega$, an open bounded subset of $\mathbb{R}^d$. As the flow is supposed to be sufficiently slow to neglect the advection compared to the diffusion, the momentum balance equation reduces to
$$-\nabla\cdot\boldsymbol{\tau} + \nabla p = \rho\boldsymbol{g} \quad \text{in } \Omega \qquad \text{(10.1a)}$$
supplemented with the incompressibility constraint
$$\nabla\cdot u = 0 \quad \text{in } \Omega \qquad \text{(10.1c)}$$
Dirichlet boundary conditions are enforced on $\partial\Omega_D$,
$$u = u_D \qquad \text{(10.1d)}$$
with $u_D$ a given boundary velocity, while Neumann boundary conditions are imposed on $\partial\Omega_N$,
$$\boldsymbol{\sigma}\cdot\boldsymbol{n} = \boldsymbol{\sigma}_N \qquad \text{(10.1e)}$$
with $\boldsymbol{\sigma}_N$ a surface force acting on $\partial\Omega_N$.
According to the method developed during the course, we would like first of all to derive a weak formulation by testing Equations (10.1a) and (10.1c) against smooth functions, so that we consider
$$-\int_\Omega \nabla\cdot\boldsymbol{\tau}\cdot v \,\mathrm{d}x + \int_\Omega \nabla p \cdot v \,\mathrm{d}x = \int_\Omega \rho\boldsymbol{g}\cdot v \,\mathrm{d}x \,,\quad \forall v \in V$$
and
$$\int_\Omega \nabla\cdot u\; q \,\mathrm{d}x = 0 \,,\quad \forall q \in M$$
Integrating by parts to report the derivatives onto the test functions:
$$\int_\Omega \nabla\cdot\boldsymbol{\tau}\cdot v \,\mathrm{d}x = \int_\Omega \nabla\cdot(\boldsymbol{\tau}\, v) \,\mathrm{d}x - \int_\Omega \boldsymbol{\tau} : \nabla v \,\mathrm{d}x$$
which uses the tensor identity, given in repeated-index form:
$$\partial_j(\tau_{ij})\, v_i = \partial_j(\tau_{ij} v_i) - \tau_{ij}\, \partial_j v_i$$
Owing to this relation,
$$\int_\Omega \nabla\cdot\boldsymbol{\tau}\cdot v \,\mathrm{d}x = \int_{\partial\Omega} \boldsymbol{\tau}\cdot\boldsymbol{n}\cdot v \,\mathrm{d}s - \int_\Omega \boldsymbol{\tau} : \nabla v \,\mathrm{d}x$$
and
$$\int_\Omega \nabla p \cdot v \,\mathrm{d}x = \int_{\partial\Omega} p\, \boldsymbol{n}\cdot v \,\mathrm{d}s - \int_\Omega p\, \nabla\cdot v \,\mathrm{d}x$$
the weak formulation of Problem (10.1) reads: Find $(u, p) \in W \times M$ such that:
$$\int_\Omega \boldsymbol{\tau} : \nabla v \,\mathrm{d}x - \int_\Omega p\, \nabla\cdot v \,\mathrm{d}x = \int_\Omega \rho\boldsymbol{g}\cdot v \,\mathrm{d}x + \int_{\partial\Omega_N} \boldsymbol{\sigma}_N \cdot v \,\mathrm{d}s \,,\quad \forall v \in V$$
$$\int_\Omega \nabla\cdot u\; q \,\mathrm{d}x = 0 \,,\quad \forall q \in M$$
In the case of a Newtonian fluid the stress tensor reads $\boldsymbol{\tau}(u) = \mu\,\big( \nabla u + \nabla u^{\mathrm{T}} \big)$, with $\mu$ the dynamic viscosity. We define $a(\cdot,\cdot)$ and $b(\cdot,\cdot)$ as the continuous bilinear forms
$$a : V \times V \to \mathbb{R}\,,\quad a(u, v) = \int_\Omega \boldsymbol{\tau}(u) : \nabla v \,\mathrm{d}x \qquad\text{and}\qquad b : V \times M \to \mathbb{R}\,,\quad b(v, q) = -\int_\Omega q\, \nabla\cdot v \,\mathrm{d}x$$
and $L(\cdot)$ as the continuous linear form
$$L : V \to \mathbb{R}\,,\quad v \mapsto \int_\Omega \rho\boldsymbol{g}\cdot v \,\mathrm{d}x + \int_{\partial\Omega_N} \boldsymbol{\sigma}_N \cdot v \,\mathrm{d}s$$

Choice of the functional spaces:
— Regularity: as in Chapter 1 we choose the test and solution spaces so that the integrals make sense. Owing to these requirements, $W$ and $V$ should be subspaces of $H^1(\Omega)^d$ and $M$ should be a subspace of $L^2(\Omega)$.
— Boundary conditions: the boundary condition on $\partial\Omega_N$ appears in the weak formulation through the linear form, so that the solution will satisfy the constraint $\boldsymbol{\sigma}\cdot\boldsymbol{n} = \boldsymbol{\sigma}_N$, while the Dirichlet boundary condition is included in the definition of the functional space $W$:
$$W = \{ v \in H^1(\Omega)^d : v = u_D \ \text{ on } \partial\Omega_D \}$$
By homogenizing the Dirichlet boundary condition, we can lift the solution $u$ so that the problem is rewritten to seek a velocity $u$ in $V$. The generalized Stokes problem then reads: Find $(u, p) \in V \times M$ such that:
$$a(u, v) + b(v, p) = L(v) \,,\quad \forall v \in V \qquad (10.2)$$
$$b(u, q) = \langle\, \Psi \,,\, q \,\rangle_{M',M} \,,\quad \forall q \in M$$
with $(V, M)$ a pair of Hilbert spaces to be determined, $a(\cdot,\cdot)$ a bilinear form continuous on $V \times V$, $L(\cdot)$ a linear form continuous on $V$, and $\Psi$ a given continuity constraint in $M'$.
10.1.3 Well-posedness in the continuous setting

Let us change the space in which the test functions are chosen to the space of divergence-free functions of $V$, so as to satisfy the continuity constraint:
$$V_0 = \{ v \in V : b(v, q) = 0 \,,\ \forall q \in M \}$$
The bilinear form $b$ is continuous on $V \times M$, i.e. $b(v, q) \le \|b\|\, \|v\|_V\, \|q\|_M$, thus $V_0$ is closed and $V = V_0 \oplus V_0^\perp$. The first relation of the Stokes problem then becomes:
$$a(u, v) + \underbrace{b(v, p)}_{0} = L(v) \,,\quad \forall v \in V_0$$
Therefore, the new abstract problem with solenoidal test functions reads: Find $(u, p) \in V \times M$ such that:
$$a(u, v) = L(v) \,,\quad \forall v \in V_0$$
$$b(u, q) = \langle\, \Psi \,,\, q \,\rangle_{M',M} \,,\quad \forall q \in M$$

Theorem 10.1.1 (Well-posedness of the constrained problem). Let us define the space
$$V_\Psi = \{ v \in V : b(v, q) = \langle\, \Psi \,,\, q \,\rangle_{M',M} \,,\ \forall q \in M \}$$
supposed non-empty, and consider $a(\cdot,\cdot)$ a bilinear form coercive on $V$. The problem: Find $u \in V_\Psi$ such that
$$a(u, v) = L(v) \,,\quad \forall v \in V_0$$
admits a unique solution.

Proof. The given problem satisfies the assumptions of the Lax–Milgram Theorem. □

We denote by $\mathcal{L}(V \times W; \mathbb{R})$ the space of bilinear forms continuous on $V \times W$, which is a Banach space for the operator norm
$$\|a\|_{V,W} = \sup_{v \in V,\, w \in W} \frac{a(v, w)}{\|v\|_V\, \|w\|_W}$$

Proposition 10.1.2 (Babuška–Nečas–Brezzi condition). The bilinear form $a \in \mathcal{L}(V \times W; \mathbb{R})$ satisfies the (BNB) condition if there exists $\beta > 0$ such that
$$\inf_{w \in W} \sup_{v \in V} \frac{a(v, w)}{\|v\|_V\, \|w\|_W} \ge \beta$$

Theorem 10.1.3 (Existence). If $V_\Psi$ is non-empty, $a(\cdot,\cdot)$ is a bilinear form coercive on $V$ with coercivity constant $\alpha$, and the bilinear form $b(\cdot,\cdot)$ satisfies Proposition (10.1.2), i.e.
$$\exists \beta > 0 \ :\ \inf_{q \in M} \sup_{v \in V} \frac{b(v, q)}{\|v\|_V\, \|q\|_M} \ge \beta$$
then Problem (10.1.3) admits solution pairs $(u, p) \in V \times M$ such that $u$ is unique, satisfying
$$\|u\|_V \le \frac{1}{\alpha}\, \|L\|_{V'} + \frac{1}{\beta}\Big( 1 + \frac{\|a\|_{V,V}}{\alpha} \Big)\, \|\Psi\|_{M'}$$
and any $p \in M$ can be written as $p = \bar{p} + p_0$, with $p_0 \in M_0$ and $\bar{p}$ satisfying
$$\|\bar{p}\|_M \le \Big( 1 + \frac{\|a\|_{V,V}}{\alpha} \Big) \Big( \frac{1}{\beta}\, \|L\|_{V'} + \frac{1}{\beta^2}\, \|a\|_{V,V}\, \|\Psi\|_{M'} \Big)$$
Indeed, $p$ playing the role of a potential, it is defined up to a constant.
Then we can interpret the space $M_0$ as the space of functions whose gradients vanish, which is the space of constants on $\Omega$, so that we seek $p$ in the quotient space $\bar{M} = M / M_0$, defined through the equivalence classes:
$$\forall\, p, q \in M \,,\quad p \sim q \iff p = q + c \,,\ c \in \mathbb{R}$$
Consequently,

Theorem 10.1.4 (De Rham, [4] page 492). The continuous linear forms on $W^{1,p}(\Omega)$ which vanish on $\ker(\nabla\cdot)$ are gradients of functions in $L^2_0(\Omega)$.

10.2 The discrete Inf-Sup condition

10.2.1 Results

Let us consider an approximation of Problem (10.2) by a Galerkin method: Find $(u_h, p_h) \in V_h \times M_h$ such that:
$$a(u_h, v_h) + b(v_h, p_h) = L(v_h) \,,\quad \forall v_h \in V_h$$
$$b(u_h, q_h) = \langle\, \Psi \,,\, q_h \,\rangle_{M',M} \,,\quad \forall q_h \in M_h$$
with $(V_h, M_h)$ a pair of approximation spaces to be chosen, and $\mathsf{B}_h$ the discrete divergence operator,
$$b(u_h, q_h) = (\, \mathsf{B}_h u_h \,,\, q_h \,)$$

Theorem 10.2.1 (Well-posedness). If $\Psi \in \mathrm{Im}(\mathsf{B}_h)$ then Problem (10.2.1) admits solutions $(u_h, p_h) \in V_h \times M_h$ such that $u_h$ is unique, and the pressure can be written as $p_h = \bar{p}_h + p_h^0$ with $p_h^0 \in \ker(\mathsf{B}_h^{\mathrm{T}})$ and $\bar{p}_h \in \ker(\mathsf{B}_h^{\mathrm{T}})^\perp$ unique.

Theorem 10.2.2 (Convergence, [7] page 21). Let $(u, p) \in V \times M$ be the solution of Problem (10.1) and $(u_h, p_h) \in V_h \times M_h$ the solution of the discrete Problem (10.2.1); we denote by $\alpha_h$ the coercivity constant of $a(\cdot,\cdot)$ on $V_{0,h}$ and by $\beta_h$ the constant of the discrete Inf-Sup condition. If $\Psi \in \mathrm{Im}(\mathsf{B}_h)$ then the following two consistency estimates hold:
$$\|u - u_h\|_{V_h} \le C_1 \inf_{v_h \in V_h} \|u - v_h\|_V + C_2 \inf_{q_h \in M_h} \|p - q_h\|_M$$
$$\|p - p_h\|_{M_h} \le C_3 \inf_{v_h \in V_h} \|u - v_h\|_V + C_4 \inf_{q_h \in M_h} \|p - q_h\|_M$$
with constants
$$C_1 = \Big( 1 + \frac{\|a\|_{V,V}}{\alpha_h} \Big)\Big( 1 + \frac{\|b\|_{V,M}}{\beta_h} \Big) \,,\qquad C_2 = \frac{\|b\|_{V,M}}{\alpha_h}$$
$$C_3 = \frac{\|a\|_{V,V}}{\beta_h}\, C_1 \,,\qquad C_4 = 1 + \frac{\|b\|_{V,M}}{\beta_h} + \frac{\|a\|_{V,V}}{\beta_h}\, C_2$$
The previous result shows that satisfying the discrete Inf-Sup condition is crucial to ensure the optimal convergence of the numerical scheme, i.e. that the discretization error decreases with the mesh size $h_{\mathcal{T}}$. Indeed, if the parameter $\beta_h$ is not bounded from below, then values tending to zero will degrade the consistency estimates.
10.2.2 Commonly used pairs of approximation spaces

Velocity space $V_h$ | Pressure space $M_h$ | Inf-Sup stable | Comment
$\mathbb{P}_1$ | $\mathbb{P}_1$ | No |
$\mathbb{P}_1$ | $\mathbb{P}_0$ | No | "Locking effect"
$\mathbb{P}_{k+1}$ | $\mathbb{P}_k$ | Yes | $k \ge 1$, "Taylor–Hood"

10.3 Exercises

Appendix A Definitions

A.1 Mapping

Definition A.1.1 (Mapping). Let $E$ and $F$ be two sets; a mapping
$$f : E \to F \,,\quad x \mapsto f(x)$$
is a relation which, to any element $x \in E$, associates an element $y = f(x) \in F$.

Definition A.1.2 (Linear mapping). Let $E$ and $F$ be two $\mathbb{K}$-vector spaces; the mapping $f : E \to F$ is linear if:
1. $\forall\, x, y \in E \,,\ f(x + y) = f(x) + f(y)$
2. $\forall\, \lambda \in \mathbb{K} \,,\ \forall\, x \in E \,,\ f(\lambda x) = \lambda f(x)$

A.2 Spaces

Definition A.2.1 (Vector space (on the left)). Let $(\mathbb{K}, +, \times)$ be a field, i.e. defined such that $(\mathbb{K}, +)$ is an Abelian additive group and $(\mathbb{K}\setminus\{0_\mathbb{K}\}, \times)$ is an Abelian multiplicative group, with $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$:
• "+" is commutative and associative, $0_\mathbb{K}$ is neutral for "+", and "+" admits an opposite;
• "×" is commutative and associative, $1_\mathbb{K}$ is neutral for "×", and "×" admits an inverse;
• "×" is distributive with respect to "+".
$(E, +, \cdot)$ is a vector space on $(\mathbb{K}, +, \times)$ if:
1. $(E, +)$ is an additive Abelian group (same properties as $(\mathbb{K}, +)$).
2. The operation $\cdot : \mathbb{K} \times E \to E$ satisfies:
• distributivity w.r.t. "$+_E$" on the left: $\lambda \cdot (u + v) = \lambda \cdot u + \lambda \cdot v$
• distributivity w.r.t. "$+_\mathbb{K}$" on the right: $(\lambda + \mu) \cdot u = \lambda \cdot u + \mu \cdot u$
• associativity w.r.t. "×": $(\lambda \times \mu) \cdot u = \lambda \cdot (\mu \cdot u)$
• $1_\mathbb{K}$ is a neutral element on the left: $1_\mathbb{K} \cdot u = u$

In short, the vector space structure allows writing any $u \in E$ as linear combinations of elements $\{v_i\}$ of $E$, called vectors, with elements $\{\lambda_i\}$ of $\mathbb{K}$, called scalars, as coefficients; both the multiplications for vectors and for scalars are distributive with respect to the additions. In this document we only consider real vector spaces, $\mathbb{K} = \mathbb{R}$.

Definition A.2.2 (Norm). Let $E$ be a $\mathbb{K}$-vector space; the application $\|\cdot\|_E : E \to \mathbb{R}_+$ is a norm if the following properties are satisfied:
1. Separation: $\forall\, x \in E \,,\ ( \|x\|_E = 0 ) \Rightarrow ( x = 0_E )$
2. Homogeneity: $\forall\, \lambda \in \mathbb{K} \,,\ \forall\, x \in E \,,\ \|\lambda x\|_E = |\lambda|\, \|x\|_E$
3.
Subadditivity: $\forall\, x, y \in E \,,\ \|x + y\|_E \le \|x\|_E + \|y\|_E$

Note A.2.3. The third property is usually called the triangle inequality.

Definition A.2.4 (Equivalent norms). Let $E$ be a $\mathbb{K}$-vector space; the norm $\|\cdot\|_{E,1}$ is said to be equivalent to $\|\cdot\|_{E,2}$ if there exist $C_1, C_2 > 0$ such that:
$$C_1\, \|u\|_{E,1} \le \|u\|_{E,2} \le C_2\, \|u\|_{E,1} \,,\quad \forall u \in E$$

Definition A.2.5 (Seminorm). Let $E$ be a $\mathbb{K}$-vector space; the application $|\cdot| : E \to \mathbb{R}_+$ is a seminorm if it satisfies properties (A.2.2).2 and (A.2.2).3.

Definition A.2.6 (Scalar product). Let $E$ be an $\mathbb{R}$-vector space; the bilinear mapping $(\,\cdot\,,\,\cdot\,) : E \times E \to \mathbb{R}$ is a scalar (or inner) product of $E$ if it satisfies the following three properties:
1. Symmetry: $\forall\, x, y \in E \,,\ (\, x \,,\, y \,) = (\, y \,,\, x \,)$
2. Positivity: $\forall\, x \in E \,,\ (\, x \,,\, x \,) \ge 0$
3. Definiteness: $\big( (\, x \,,\, x \,) = 0 \big) \iff \big( x = 0 \big)$

Appendix B Duality

B.1 In finite dimension

Definition B.1.1 (Dual space). Let $E$ be a finite-dimensional real vector space; its dual $E^*$ is the space of linear forms on $E$, denoted by $\mathcal{L}(E; \mathbb{R})$.

Definition B.1.2 (Dual basis). Let $E$ be a finite-dimensional real vector space, $\dim(E) = N$, and $\mathcal{B} = (e_1, \cdots, e_N)$ a basis of $E$. Let us denote, for any $i, j \in \{1, \ldots, N\}$,
$$e_i^* : E \to \mathbb{R} \,,\quad e_j \mapsto \delta_{ij}$$
the $i$-th coordinate form. The dual family of $\mathcal{B}$, $\mathcal{B}^* = (e_1^*, \cdots, e_N^*)$, is a basis of $E^*$. Thus we can write any element $u \in E$ as:
$$u = \sum_{i=1}^{N} e_i^*(u)\, e_i$$
Proving that $\mathcal{B}^*$ is a basis of $E^*$ requires showing that $\{e_i^*\}$ generates $E^*$ and that its elements are linearly independent. The corollary of the first condition is that $\dim(E^*) = N$.

Appendix C Function spaces

C.1 Banach and Hilbert spaces

Definition C.1.1 (Cauchy criterion). Let $(E, \|\cdot\|_E)$ be a normed vector space and $(v_n)_{n\in\mathbb{N}}$ a sequence of elements of $E$ which satisfies:
$$\forall\, \epsilon > 0 \,,\ \exists\, N \ \text{such that}\ \forall\, p, q \ge N \,,\ \|v_p - v_q\|_E < \epsilon$$
then $(v_n)_{n\in\mathbb{N}}$ is a Cauchy sequence in $E$.

Definition C.1.2 (Banach space). A Banach space $(E, \|\cdot\|_E)$ is a normed vector space which is complete with respect to the norm $\|\cdot\|_E$, i.e. Cauchy sequences converge in $E$.

Definition C.1.3 (Hilbert space).
Let $E$ be a $\mathbb{K}$-vector space and $(\,\cdot\,,\,\cdot\,)$ a sesquilinear form on the left (or a bilinear form if $\mathbb{K} = \mathbb{R}$),
$$(\, \lambda x \,,\, y \,) = \bar{\lambda}\, (\, x \,,\, y \,) \,,\qquad (\, x \,,\, \lambda y \,) = \lambda\, (\, x \,,\, y \,)$$
$$(\, x_1 + x_2 \,,\, y \,) = (\, x_1 \,,\, y \,) + (\, x_2 \,,\, y \,) \,,\qquad (\, x \,,\, y_1 + y_2 \,) = (\, x \,,\, y_1 \,) + (\, x \,,\, y_2 \,)$$
which is also positive definite on $E$,
$$\forall\, x \ne 0_E \,,\ (\, x \,,\, x \,) > 0$$
then $(E, (\,\cdot\,,\,\cdot\,))$ is a pre-Hilbertian space. Moreover, if $E$ is complete with respect to the norm defined by $(\,\cdot\,,\,\cdot\,)$, it is a Hilbert space.

Definition C.1.4 (Hilbertian norm). A norm derived from a scalar product satisfies
$$\left\| \frac{x + y}{2} \right\|^2 + \left\| \frac{x - y}{2} \right\|^2 = \frac{1}{2} \big( \|x\|^2 + \|y\|^2 \big)$$

Remark C.1.5. This is basically the parallelogram identity, stating that the sum of the squares of the lengths of the diagonals is equal to the sum of the squares of the lengths of all the sides. This equality is useful to check that a norm is generated from a scalar product. For Banach spaces like $L^p$, this becomes the Clarkson inequality.

Theorem C.1.6 (Projection on a convex subset). Let $H$ be a Hilbert space and $K \subset H$ a convex closed non-empty subset; $\forall\, x \in H$ there exists a unique $x_0 \in K$ such that
$$\|x - x_0\|_H = \inf_{y \in K} \|x - y\|_H$$
with $x_0$ the projection of $x$ onto $K$, denoted by $x_0 = P_K\, x$.

C.2 Spaces of continuous functions

$$\mathcal{C}^\infty_c(\Omega) = \{ u \in \mathcal{C}^\infty(\Omega) \ \text{with compact support in}\ \Omega \}$$

C.3 Lebesgue spaces

Remark C.3.1. Lebesgue spaces $L^p$, $1 \le p \le \infty$, are Banach spaces for the norm $\|\cdot\|_{L^p(\Omega)}$, and $L^2$ is a Hilbert space endowed with the scalar product
$$(\, u \,,\, v \,)_{L^2(\Omega)} = \int_\Omega u\, v \,\mathrm{d}x \qquad \text{(C.1)}$$

C.4 Hilbert Sobolev spaces

$$H^s(\Omega) = \{ u \in L^2(\Omega) : \mathrm{D}^\alpha u \in L^2(\Omega) \,,\ 1 \le |\alpha| \le s \}$$

C.5 Sobolev spaces

$$W^{m,p}(\Omega) = \{ u \in L^p(\Omega) : \mathrm{D}^\alpha u \in L^p(\Omega) \,,\ 1 \le |\alpha| \le m \}$$

Remark C.5.1. $H^s$ spaces are $W^{s,2}$ spaces.

Appendix D Inequalities

D.1 Useful inequalities in normed vector spaces

Lemma D.1.1 (Cauchy–Schwarz). Let $E$ be a $\mathbb{K}$-vector space; any positive sesquilinear form $(\,\cdot\,,\,\cdot\,)_E$ on $E$ satisfies the inequality:
$$|(\, u \,,\, v \,)_E| \le \|u\|_E\, \|v\|_E$$

Remark D.1.2. In particular any scalar product satisfies the Cauchy–Schwarz inequality. For example:
$$(\, u \,,\, v \,)_{L^2(\Omega)} = \int_\Omega u\, v \,\mathrm{d}x \le \|u\|_{L^2(\Omega)}\, \|v\|_{L^2(\Omega)}$$

Lemma D.1.3 (Young). Let $a, b \ge 0$ be two real numbers; then
$$ab \le \frac{1}{p} \Big( \frac{a}{\epsilon} \Big)^p + \frac{1}{q}\, (\epsilon b)^q \,,\qquad \frac{1}{p} + \frac{1}{q} = 1 \,,\ \epsilon > 0$$

Remark D.1.4.
In particular, the following inequality is commonly used for energy estimates:
$$a\,b \le \frac{1}{2}\left(\frac{a}{\varepsilon}\right)^2 + \frac{1}{2}\left(\varepsilon\, b\right)^2$$

Lemma D.1.5 (Generalized Hölder). Let $u \in L^p(\Omega)$, $v \in L^q(\Omega)$, with $1 \le p, q \le \infty$, then:
$$\|u\,v\|_{L^r(\Omega)} \le \|u\|_{L^p(\Omega)} \|v\|_{L^q(\Omega)} \qquad \text{with} \quad \frac{1}{r} = \frac{1}{p} + \frac{1}{q}$$

Lemma D.1.6 (Minkowski).
$$\|u + v\|_{L^p(\Omega)} \le \|u\|_{L^p(\Omega)} + \|v\|_{L^p(\Omega)}$$

Remark D.1.7. The previous result is basically the triangle inequality for the $L^p$-norm.

Lemma D.1.8 (Poincaré). Let $\Omega$ be an open bounded subset; for any $1 \le p < \infty$ there exists a real constant $c_p > 0$ such that $\forall u \in W_0^{1,p}(\Omega)$:
$$c_p\, \|u\|_{L^p(\Omega)} \le \|\nabla u\|_{L^p(\Omega)}$$

Remark D.1.9. As a corollary, useful for the Poisson problem that we address, $\|\nabla u\|_{L^2(\Omega)}$ defines a norm equivalent to $\|u\|_{H^1(\Omega)}$ on $H_0^1(\Omega)$.

Lemma D.1.10 (Clarkson). Let $1 < p < \infty$ and $u, v$ be two functions of $L^p(\Omega)$, then:
1. for $p \ge 2$,
$$\left\|\frac{u+v}{2}\right\|_{L^p(\Omega)}^p + \left\|\frac{u-v}{2}\right\|_{L^p(\Omega)}^p \le \frac{1}{2}\left(\|u\|_{L^p(\Omega)}^p + \|v\|_{L^p(\Omega)}^p\right)$$
2. for $p \le 2$, with $p' = p/(p-1)$,
$$\left\|\frac{u+v}{2}\right\|_{L^{p'}(\Omega)}^{p'} + \left\|\frac{u-v}{2}\right\|_{L^{p'}(\Omega)}^{p'} \le \left(\frac{1}{2}\|u\|_{L^p(\Omega)}^p + \frac{1}{2}\|v\|_{L^p(\Omega)}^p\right)^{1/(p-1)}$$

Remark D.1.11. These inequalities are basically parallelogram inequalities generalized to $L^p$ spaces.

Appendix E

Tensor formulae

This short chapter is a reminder of tensor notations and identities that are useful for usual partial differential equations.

E.1 Operators

A tensor of order $p$ denotes an element of $\mathbb{R}^{d_1 \times \cdots \times d_p}$, with $d_i$ the dimension of the $i$-th axis: a zero-order tensor is a scalar, a first-order tensor is a vector, and a second-order tensor is a matrix. The following operators are defined for second-order tensors.

E.1.1 Tensor product (order $p + q$)
$$(P \otimes Q)_{ijkl} = P_{ij}\, Q_{kl} \quad e_i \otimes e_j \otimes e_k \otimes e_l \tag{E.1}$$

E.1.2 Dot product (simple contraction) (order $p + q - 2$)
$$(P \cdot Q) = \mathrm{Tr}^{(p,p+1)}(P \otimes Q) \tag{E.2}$$
with $\mathrm{Tr}^{(p,p+1)}(\cdot)$ the trace operator with respect to indices $p$ and $p+1$.
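As an aside, this contraction lends itself to a quick numerical sanity check. The sketch below (assuming NumPy is available; it is not used elsewhere in these notes) spells out the index summation with `einsum` and verifies that, for second-order tensors, the simple contraction is the usual matrix product:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
P = rng.standard_normal((3, 3))
Q = rng.standard_normal((3, 3))

# Tensor product (E.1): (P ⊗ Q)_ijkl = P_ij Q_kl, a fourth-order tensor.
PQ = np.einsum("ij,kl->ijkl", P, Q)

# Simple contraction (E.2): trace over the adjacent index pair (p, p+1),
# here the 2nd and 3rd indices of P ⊗ Q, giving (P · Q)_ik = sum_j P_ij Q_jk.
dot = np.einsum("ijjk->ik", PQ)

# For second-order tensors this is exactly the matrix product.
assert np.allclose(dot, P @ Q)
```

The repeated subscript `j` in `"ijjk->ik"` is what performs the summation over the contracted index pair, mirroring the trace operator in (E.2).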
In index notation, it consists of a summation over indices $p$ and $p+1$ (contraction),
$$(P \cdot Q)_{ik} = \sum_j P_{ij}\, Q_{jk} \tag{E.3}$$
Since the summation occurs on a pair of indices, the order of the resulting tensor is reduced by two.

E.1.3 Double-dot product (double contraction) (order $p + q - 4$)
$$(P : Q) = \mathrm{Tr}^{(p,p+1)}\left(\mathrm{Tr}^{(p+1,p+2)}(P \otimes Q)\right) \tag{E.4}$$
with two contractions in this case, corresponding in index notation to summations over two pairs of indices,
$$(P : Q) = \sum_i \sum_j P_{ij}\, Q_{ij} \tag{E.5}$$
so that the order of the resulting tensor is reduced by four. If $P = Q$ this corresponds to the squared Frobenius norm.

E.1.4 Gradient

The gradient of a scalar field is the first-order tensor
$$\nabla f = [\partial_i f]_i \tag{E.6}$$
while the gradient of a vector field is the second-order tensor
$$\nabla v = [\partial_j v_i]_{ij} \tag{E.7}$$
such that the derivative is applied to the last index.

E.1.5 Divergence
$$\nabla \cdot (T) = \nabla T : G = \mathrm{Tr}^{(p,p+1)}(\nabla T) \tag{E.8}$$
with $G$ the metric tensor, whose entries are the scalar products of the chosen basis vectors. If the canonical basis is chosen, it is simply the identity matrix; therefore the divergence is simply the sum of the diagonal entries of the gradient.

E.1.6 Curl (rotational)
$$\nabla \wedge T = -\nabla T : H \tag{E.9}$$
with $H$ the orientation tensor, whose entries are the mixed products of the chosen basis vectors. The curl operator is also denoted by $\nabla \times$.

E.2 Identities

E.2.1 First-order tensors

• Gradient of a vector field:
$$\nabla(f v) = f\, \nabla v + v \otimes \nabla f \tag{E.10}$$
which corresponds to the expansion of the derivative of a product, as the index notation shows:
$$\partial_j (f v_i) = f\, \partial_j v_i + v_i\, \partial_j f \tag{E.11}$$
• Divergence of a vector field:
$$\nabla \cdot (f v) = f\, \nabla \cdot (v) + v \cdot \nabla f \tag{E.12}$$
which corresponds to the expansion of the derivative of a product, as the index notation shows:
$$\partial_i (f v_i) = f\, \partial_i v_i + v_i\, \partial_i f \tag{E.13}$$
• Identity of the advection operator:
$$(v \cdot \nabla) v = \frac{1}{2} \nabla (v \cdot v) + (\nabla \wedge v) \wedge v \tag{E.14}$$
• Identity for the Laplace operator:
$$\Delta v = \nabla(\nabla \cdot v) - \nabla \wedge (\nabla \wedge v) \tag{E.15}$$
• Divergence of a vector product:
$$\nabla \cdot (u \wedge v) = v \cdot (\nabla \wedge u) - u \cdot (\nabla \wedge v) \tag{E.16}$$
• Curl of a vector product:
$$\nabla \wedge (u \wedge v) = (\nabla \cdot v)\, u - (\nabla \cdot u)\, v + (v \cdot \nabla) u - (u \cdot \nabla) v \tag{E.17}$$

E.2.2 Second-order tensors

• Dyadic/scalar mixed product:
$$(u \otimes v) \cdot w = (v \cdot w)\, u \tag{E.18}$$
• Gradient of a tensor field applied to a vector field:
$$\nabla(T \cdot v) = T \cdot \nabla v + v \cdot \nabla T^{\mathrm{T}} \tag{E.19}$$
• Divergence of a tensor field:
$$\nabla \cdot (v \cdot T) = v \cdot \nabla \cdot (T) + T : \nabla v \tag{E.20}$$
which corresponds to the expansion of the derivative of a product, as the index notation shows:
$$\partial_j (v_i\, T_{ij}) = v_i\, \partial_j T_{ij} + T_{ij}\, \partial_j v_i \tag{E.21}$$
• Divergence of a dyadic product:
$$\nabla \cdot (u \otimes v) = (\nabla \cdot v)\, u + (v \cdot \nabla)\, u \tag{E.22}$$

Bibliography

[1] F. Boyer and P. Fabrie. Mathematical Tools for the Study of the Incompressible Navier-Stokes Equations and Related Models, volume 183 of Applied Mathematical Sciences. Springer, 2013.

[2] S. Brenner and R. Scott. The Mathematical Theory of Finite Element Methods, volume 15 of Texts in Applied Mathematics. Springer, 2008.

[3] H. Brezis. Functional Analysis, Sobolev Spaces and Partial Differential Equations. Springer, 2011.

[4] A. Ern and J.-L. Guermond. Theory and Practice of Finite Elements, volume 159 of Applied Mathematical Sciences. Springer, 2004.

[5] R. Herbin. Analyse Numérique des EDPs. Notes de cours de Master 2, Université de Provence, 2006.

[6] K. Eriksson, D. Estep, P. Hansbo, and C. Johnson. Computational Differential Equations. Cambridge University Press, 1996.

[7] J.-C. Latché. Méthodes d'Éléments Finis pour quelques problèmes elliptiques linéaires issus de la mécanique des fluides.
Notes de cours de Master 2, Université de Provence, 2002.

[8] A. T. Patera. Finite Element Methods for Elliptic Problems. Lecture Notes, MIT OpenCourseWare, 2003.