(Theory)
Claudio Canuto
Dipartimento di Matematica
Politecnico di Torino
10129 Torino Italy
claudio.canuto@polito.it
http://calvino.polito.it/ccanuto
Basic Concepts
1.1 Vectors
The inner product between two column vectors $\mathbf{a} = (a_i) \in \mathbb{R}^m$ and $\mathbf{b} = (b_i) \in \mathbb{R}^m$ will be denoted by
\[ \mathbf{a} \cdot \mathbf{b} := \sum_{i=1}^{m} a_i b_i . \]
\[ D^{\alpha} u \quad \text{or} \quad \frac{\partial^{|\alpha|} u}{\partial x_1^{\alpha_1} \cdots \partial x_m^{\alpha_m}} . \]
Other symbols may be preferred for indicating low order derivatives. For instance, the first order partial derivatives with respect to $x_i$ will also be denoted by
\[ D_i u \quad \text{or} \quad D_{x_i} u \quad \text{or} \quad u_{x_i} ; \]
and so on.
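We now introduce the most commonly used first order differential operators. The gradient $\nabla$ or grad is defined as
\[
\nabla u := \begin{pmatrix} \dfrac{\partial u}{\partial x_1} \\ \vdots \\ \dfrac{\partial u}{\partial x_m} \end{pmatrix}
= \begin{pmatrix} \dfrac{\partial}{\partial x_1} \\ \vdots \\ \dfrac{\partial}{\partial x_m} \end{pmatrix} u ;
\]
note that $\nabla$ acts on a scalar function and produces a column vector function, i.e., a vector field defined in $O$.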
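The divergence $\nabla \cdot$ or div acts on a vector-valued function $\mathbf{u} = (u_1, \dots, u_m)^T$ and produces a scalar function, according to the definition
\[ \nabla \cdot \mathbf{u} := \frac{\partial u_1}{\partial x_1} + \cdots + \frac{\partial u_m}{\partial x_m} . \]
The notation is coherent with the fact that $\nabla \cdot \mathbf{u}$ can be formally obtained as the inner product of the column vectors $\nabla$ and $\mathbf{u}$. Therefore, an equivalent notation for $\nabla \cdot \mathbf{u}$ is $\nabla^T \mathbf{u}$.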
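In dimension $m = 3$, the curl $\nabla \times$ or rot acts on a vector-valued function $\mathbf{u}$ and produces a vector-valued function, according to the definition
\[
\nabla \times \mathbf{u} = \left( \frac{\partial u_3}{\partial x_2} - \frac{\partial u_2}{\partial x_3},\;
\frac{\partial u_1}{\partial x_3} - \frac{\partial u_3}{\partial x_1},\;
\frac{\partial u_2}{\partial x_1} - \frac{\partial u_1}{\partial x_2} \right)^T ;
\]
the vector $\nabla \times \mathbf{u}$ can be formally obtained as the vector product of the column vectors $\nabla$ and $\mathbf{u}$.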
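In dimension $m = 2$, we can define the curl of a scalar function $u$ as the column vector in $\mathbb{R}^2$
\[ \nabla \times u = \left( \frac{\partial u}{\partial x_2},\; -\frac{\partial u}{\partial x_1} \right)^T ; \]
note that $\nabla \times u$ contains the first two components of the vector $\nabla \times \mathbf{U} \in \mathbb{R}^3$, where $\mathbf{U} = (0, 0, u)^T \in \mathbb{R}^3$ (the last component is obviously 0). Similarly, we can define the curl of a vector function $\mathbf{u} = (u_1, u_2)^T \in \mathbb{R}^2$ as the scalar
\[ \nabla \times \mathbf{u} = \frac{\partial u_2}{\partial x_1} - \frac{\partial u_1}{\partial x_2} , \]
which coincides with the third component of the vector $\nabla \times \mathbf{U} \in \mathbb{R}^3$, where $\mathbf{U} = (\mathbf{u}, 0)^T$ (the first two components are 0 since $\mathbf{u}$ does not depend on $x_3$).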
Perhaps the most popular second order differential operator is the Laplacian $\Delta$, defined as
\[ \Delta u := \frac{\partial^2 u}{\partial x_1^2} + \cdots + \frac{\partial^2 u}{\partial x_m^2} . \]
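As a quick numerical illustration of these definitions, the sketch below approximates the gradient and the divergence by central differences and checks that applying the divergence to the gradient reproduces the Laplacian. The function names, the step size `h` and the sample function `u_example` are assumptions made here for illustration only.

```python
import math

def grad(u, x, y, h=1e-4):
    """Central-difference approximation of the gradient of u at (x, y)."""
    return ((u(x + h, y) - u(x - h, y)) / (2.0 * h),
            (u(x, y + h) - u(x, y - h)) / (2.0 * h))

def div(v, x, y, h=1e-4):
    """Central-difference approximation of the divergence of v = (v1, v2)."""
    dv1_dx = (v(x + h, y)[0] - v(x - h, y)[0]) / (2.0 * h)
    dv2_dy = (v(x, y + h)[1] - v(x, y - h)[1]) / (2.0 * h)
    return dv1_dx + dv2_dy

def laplacian(u, x, y, h=1e-4):
    """Laplacian computed as the divergence of the gradient."""
    return div(lambda s, t: grad(u, s, t, h), x, y, h)

def u_example(x, y):
    # Hypothetical sample function; its exact Laplacian is 3 sin(x) exp(2y).
    return math.sin(x) * math.exp(2.0 * y)

approx = laplacian(u_example, 0.3, 0.1)
exact = 3.0 * math.sin(0.3) * math.exp(0.2)
print(abs(approx - exact))  # small discretization and rounding error
```

Finite differences are used only because they need no external library; the check is approximate by construction.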
\[ R(\mathbf{x}, u; D^{\alpha}) = 0 \qquad (1.2.1) \]
among the independent variable $\mathbf{x}$, the dependent variable $u = u(\mathbf{x})$ and certain partial derivatives $D^{\alpha}$ applied to $u$ or to some functions depending on $u$; the multi-integers $\alpha$ vary in some finite subset of $\mathbb{N}^m$. The equation is required to be satisfied in some open set $O \subset \mathbb{R}^m$. The order of the equation is the maximum order of the partial derivatives which appear in the relationship.
Examples of (first order) partial differential equations are
\[ L(\mathbf{x}, u; D^{\alpha}) = f, \qquad (1.2.5) \]
and, at each $\mathbf{x} \in O$, at least one coefficient $a_{\alpha}$ with $|\alpha| = N$ is not vanishing. For convenience, the left-hand side $L(\mathbf{x}, u; D^{\alpha})$ will be denoted by $Lu$.
If we restrict the sum in (1.2.6) to the indices with $|\alpha| = N$, we obtain the principal part $L^{(N)}$ of the operator $L$, i.e.,
\[ L^{(N)} u = \sum_{|\alpha| = N} a_{\alpha} D^{\alpha} u . \]
The transport equation (1.2.2) is an example of a linear first order equation. Examples of
linear second order equations (in two independent variables) are
where the coefficients $a_{\alpha}$ as well as $f$ may depend not only on $\mathbf{x}$ but also on $u$ and certain derivatives $D^{\beta} u$ of order $|\beta| < N$. An example is the inviscid Burgers equation (1.2.3), which can be written in the (formally) equivalent expression
\[ \frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} = 0 . \]
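Finally, a partial differential equation is semi-linear if it is quasi-linear and the coefficients $a_{\alpha}$ in (1.2.11) depend neither on $u$ nor on its derivatives (whereas $f$ may depend on them). Examples of semi-linear equations are the viscous Burgers equation
\[ \frac{\partial u}{\partial t} - \nu \frac{\partial^2 u}{\partial x^2} + \frac{\partial}{\partial x} \left( \frac{u^2}{2} \right) = 0 , \]
the Korteweg-de Vries equation
\[ \frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} + \frac{\partial^3 u}{\partial x^3} = 0 , \qquad (1.2.12) \]
and the ground-state equation
\[ -\Delta u = u^3 . \]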
We will now discuss in which sense a function $u$ defined in the open set $O \subset \mathbb{R}^m$ is a solution of the partial differential equation (1.2.1) therein. Indeed, we can give different meanings to the word solution. We go from the concept of classical solution to that of strong solution, and then to weaker and weaker definitions, which require a solution to be less and less regular (i.e., differentiable). One of the main achievements of twentieth-century mathematics has been the relaxation of the concept of solution of a partial differential equation; this has allowed differential problems to be formulated in the way best suited to their study by often sophisticated analytical tools, and to their numerical discretization by efficient methods.
Let us denote by $N$ the order of the partial differential equation (1.2.1). A classical solution is an $N$-times continuously differentiable function in $O$ (i.e., $u \in C^N(O)$) which, inserted with all its derivatives in the left-hand side of (1.2.1), makes the equation satisfied pointwise in $O$:
\[ R(\mathbf{x}, u; D^{\alpha}) = 0, \qquad \forall \mathbf{x} \in O. \qquad (1.2.13) \]
We want to formulate these conditions in an equivalent way, which subsequently will allow us to relax the concept of solution. To this end, we introduce the notion of test function, i.e., an infinitely differentiable function $\varphi$ defined and having compact support in $O$: this means that the closed set
\[ \operatorname{supp} \varphi = \text{closure of } \{ \mathbf{x} \in O : \varphi(\mathbf{x}) \neq 0 \} \]
is bounded and contained in $O$. Then, $\varphi$ vanishes with all its derivatives in a neighborhood of the boundary $\partial O$. The set of all test functions forms a vector space, which will be denoted by $\mathcal{D}(O)$. Note that any partial derivative of a test function is itself a test function.
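Example 1.2.1. It is often important to know that test functions with certain properties exist; for example, one often needs a test function that is positive in a small neighborhood of a given point $\mathbf{x}_0$ and zero outside that neighborhood. Such a function can be given explicitly:
\[
\varphi(\mathbf{x}) =
\begin{cases}
\exp\left( \dfrac{\varepsilon^2}{\|\mathbf{x} - \mathbf{x}_0\|^2 - \varepsilon^2} \right) & \text{if } \|\mathbf{x} - \mathbf{x}_0\| < \varepsilon, \\[1ex]
0 & \text{otherwise,}
\end{cases}
\]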
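A minimal sketch of such a bump function follows, assuming the standard form $\exp\big(\varepsilon^2 / (\|\mathbf{x} - \mathbf{x}_0\|^2 - \varepsilon^2)\big)$ inside the ball of radius $\varepsilon$ and zero outside. The name `bump` and the parameter name `eps` are illustrative choices, not taken from the text.

```python
import math

def bump(x, x0, eps):
    """Bump function centered at x0 with support in the closed ball of radius eps.

    x and x0 are sequences of coordinates of equal length.
    """
    r2 = sum((xi - ci) ** 2 for xi, ci in zip(x, x0))
    if r2 >= eps ** 2:
        return 0.0  # outside the ball (and on its boundary) the function vanishes
    # Inside the ball the exponent is negative and tends to -infinity
    # as the boundary is approached, so the function decays smoothly to 0.
    return math.exp(eps ** 2 / (r2 - eps ** 2))

center = (0.0, 0.0)
print(bump((0.0, 0.0), center, 0.5))   # value at the center, exp(-1)
print(bump((0.49, 0.0), center, 0.5))  # tiny but positive near the boundary
print(bump((0.6, 0.0), center, 0.5))   # zero outside the ball
```

The value at the center is $e^{-1}$ for every radius, which is a convenient sanity check.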
which holds, at least, if $g$ is continuously differentiable in $O$. Note that no boundary term appears, since a test function vanishes in a neighborhood of $\partial O$. While the left-hand side requires the partial derivative of $g$ with respect to $x_i$ to be defined and integrable on $O$, the right-hand side is defined under the milder condition that $g$ be integrable on $O$, only.
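The integration-by-parts identity with no boundary term can be checked numerically in one dimension. The sketch below assumes $O = (-1, 1)$, a bump test function $\varphi(x) = \exp(1/(x^2 - 1))$ and a hypothetical smooth function $g$; it compares $\int g' \varphi \, dx$ with $-\int g \varphi' \, dx$ using a midpoint rule. All names are illustrative.

```python
import math

def phi(x):
    """Bump test function supported in [-1, 1]."""
    return math.exp(1.0 / (x * x - 1.0)) if abs(x) < 1.0 else 0.0

def dphi(x):
    """Exact derivative of phi (chain rule); it also vanishes outside (-1, 1)."""
    if abs(x) >= 1.0:
        return 0.0
    return phi(x) * (-2.0 * x) / (x * x - 1.0) ** 2

def midpoint(f, a, b, n=4000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def g(x):
    # Hypothetical smooth function, chosen only for the check.
    return math.sin(2.0 * x) + x * x

def dg(x):
    return 2.0 * math.cos(2.0 * x) + 2.0 * x

lhs = midpoint(lambda x: dg(x) * phi(x), -1.0, 1.0)
rhs = -midpoint(lambda x: g(x) * dphi(x), -1.0, 1.0)
print(lhs, rhs)  # the two values agree up to quadrature error
```

Note that $g$ itself does not vanish at $x = \pm 1$; the boundary term disappears only because the test function does.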
To explain how (1.2.15) is used to manipulate (1.2.14), assume that the partial differential equation is written in the quasi-divergence form
\[
R(\mathbf{x}, u; D^{\alpha}) = \nabla \cdot \mathbf{g}(\mathbf{x}, u; D^{\alpha}) + g_0(\mathbf{x}, u; D^{\alpha})
= \sum_{i=1}^{m} \frac{\partial}{\partial x_i} g_i(\mathbf{x}, u; D^{\alpha}) + g_0(\mathbf{x}, u; D^{\alpha}),
\]
where each $g_i$ ($i = 0, 1, \dots, m$) only involves partial derivatives of order strictly less than $N$.
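Many partial differential equations which model fundamental phenomena of the physical world are precisely obtained in this form; conservation laws are an example. Then, applying (1.2.15), conditions (1.2.14) can be written as
\[
\int_O \left[ -\sum_{i=1}^{m} g_i(\mathbf{x}, u; D^{\alpha}) \frac{\partial \varphi}{\partial x_i}(\mathbf{x}) + g_0(\mathbf{x}, u; D^{\alpha}) \varphi(\mathbf{x}) \right] d\mathbf{x} = 0, \qquad \forall \varphi \in \mathcal{D}(O). \qquad (1.2.16)
\]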
In this formulation, $u$ need not be differentiable up to order $N$; it is enough for the functions $g_i$ to be defined and integrable on $O$. Any function $u$ for which this is true and which satisfies (1.2.16) is called a weak solution of the partial differential equation. Obviously, a classical solution is also a weak solution, whereas the converse need not be true.
Further integrations by parts in (1.2.16) may lead to even weaker definitions of solution.
Example 1.2.2. Consider the transport equation (1.2.2) and assume that the coefficients $a_i$ ($i = 1, \dots, m$) belong to $C^1(O)$. After a change of sign, the equation can be written as
\[ -\sum_{i=1}^{m} \frac{\partial}{\partial x_i} (a_i u) + \left( \sum_{i=1}^{m} \frac{\partial a_i}{\partial x_i} \right) u = 0 . \]
In this way, we allow the transport equation to have bounded, piecewise smooth but discontinuous weak solutions.
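Example 1.2.3. Recalling that $\Delta = \nabla \cdot \nabla$, the Poisson equation
\[ -\Delta u = f \]
in $O$ is written in weak form as
\[
\int_O \nabla u \cdot \nabla \varphi \, d\mathbf{x} = \int_O \sum_{i=1}^{m} \frac{\partial u}{\partial x_i} \frac{\partial \varphi}{\partial x_i} \, d\mathbf{x} = \int_O f \varphi \, d\mathbf{x}, \qquad \forall \varphi \in \mathcal{D}(O),
\]
provided $f$ and all the first order partial derivatives of $u$ exist and are integrable on $O$. A further integration by parts yields
\[ -\int_O u \, \Delta \varphi \, d\mathbf{x} = \int_O f \varphi \, d\mathbf{x}, \qquad \forall \varphi \in \mathcal{D}(O), \]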
Partial differential equations are usually supplemented by boundary and/or initial conditions, i.e., conditions that the solution has to satisfy on all or part of the boundary $\partial O$ of the region $O$ in which the equation is set; if $O$ is unbounded, the solution may be required to match a prescribed asymptotic behaviour at infinity. Indeed, in most cases, a partial differential equation admits infinitely many solutions; the conditions on $\partial O$ or at infinity, which often originate as part of the mathematical model describing the phenomenon of interest, allow us to select precisely one solution.
Example 1.2.4. Consider the simple transport equation in one space variable
\[ u_t + u_x = 0 . \]
It is easy to check that if $g = g(s)$ is any continuously differentiable function on the real line, then $u(x, t) = g(x - t)$ is a classical solution of the equation. Thus, if we set the equation in the half-plane $O = \{ (x, t) \in \mathbb{R}^2 : t > 0 \}$, so that $\partial O = \{ (x, 0) : x \in \mathbb{R} \}$ represents the space at the initial time $t = 0$, then $u$ is the unique solution of the initial value problem
\[
\begin{cases}
u_t + u_x = 0 & \text{in } O, \\
u = g & \text{on } \partial O.
\end{cases}
\]
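The claim that $u(x, t) = g(x - t)$ solves $u_t + u_x = 0$ can be checked numerically. The sketch below evaluates the residual $u_t + u_x$ by central differences for a hypothetical profile $g$; the function name `transport_residual` and the step `h` are assumptions made for this illustration.

```python
import math

def transport_residual(g, x, t, h=1e-5):
    """Central-difference value of u_t + u_x for u(x, t) = g(x - t)."""
    def u(xx, tt):
        return g(xx - tt)
    u_t = (u(x, t + h) - u(x, t - h)) / (2.0 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2.0 * h)
    return u_t + u_x

def profile(s):
    # Hypothetical smooth initial profile.
    return math.exp(-s * s)

for point in [(0.0, 0.5), (1.2, 0.3), (-0.7, 2.0)]:
    print(point, transport_residual(profile, *point))  # residuals close to zero
```

The two difference quotients cancel because shifting $x$ by $+h$ and $t$ by $+h$ have opposite effects on the argument $x - t$.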
$\mathbf{a} = (a_1, \dots, a_m)^T \neq \mathbf{0}$ and $a_0$ are the coefficients of the equation, whereas $f$ is a given function. An alternative formulation of the equation is the quasi-divergence form
\[ \nabla \cdot (\mathbf{a} u) + a_0 u = \sum_{i=1}^{m} \frac{\partial}{\partial x_i} (a_i u) + a_0 u = f . \]
If the coefficients $a_i$ ($i = 1, \dots, m$) are differentiable, the two formulations are equivalent by the differentiation rule of a product, up to a different definition of the zeroth-order coefficient $a_0$.
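We want to show that (1.3.1) is equivalent to a family of ordinary differential equations. We write $\mathbf{a} = \|\mathbf{a}\| \hat{\mathbf{a}}$, with $\hat{\mathbf{a}}$ having unitary Euclidean norm, and we denote by $\dfrac{\partial u}{\partial \hat{\mathbf{a}}} = \hat{\mathbf{a}} \cdot \nabla u$ the directional derivative of $u$ along $\hat{\mathbf{a}}$. Then, (1.3.1) becomes
\[ \|\mathbf{a}\| \frac{\partial u}{\partial \hat{\mathbf{a}}} + a_0 u = f , \]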
which is a family of ordinary differential equations in the directions of $\mathbf{a}$. To be more explicit, let us assume that the coefficients $a_i$ ($i = 1, \dots, m$) are bounded, continuously differentiable functions in the closure $\overline{O}$ of the region $O$ in which the equation is set. Let us introduce the characteristic curves of the equation, i.e., the curves $\mathbf{x} = \mathbf{x}(s)$ defined as the solutions of the autonomous ordinary differential system
\[ \frac{d\mathbf{x}}{ds} = \mathbf{a}(\mathbf{x}) . \qquad (1.3.2) \]
Here, $s$ is a real variable which parametrizes each curve. A classical result in the theory of ordinary differential equations (see, e.g., (??)) guarantees that, under the assumptions made on the coefficients, for each $\mathbf{x}_0 \in O$ there exists exactly one characteristic curve passing through $\mathbf{x}_0$; it is defined as the solution $\mathbf{x} = \mathbf{x}(s; \mathbf{x}_0)$ of the Cauchy problem
\[
\begin{cases}
\dfrac{d\mathbf{x}}{ds} = \mathbf{a}(\mathbf{x}), \\[1ex]
\mathbf{x}(0) = \mathbf{x}_0 .
\end{cases}
\]
The solution exists for positive and negative values of the parameter $s$, until $\mathbf{x}$ reaches the boundary $\partial O$. Note that the characteristics only depend on the principal part of the operator.
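Now let $u$ be a (classical) solution of (1.3.1), and let us consider its restriction $\tilde{u} = \tilde{u}(s) = u(\mathbf{x}(s))$ to a characteristic curve. By the chain rule and (1.3.2), one has
\[ \frac{d\tilde{u}}{ds} = \sum_{i=1}^{m} \frac{\partial u}{\partial x_i} \frac{dx_i}{ds} = \mathbf{a} \cdot \nabla u . \]
It follows that $\tilde{u}$ can be determined by solving the linear ordinary differential equation
\[ \frac{d\tilde{u}}{ds} + \tilde{a}_0 \tilde{u} = \tilde{f} \qquad (1.3.3) \]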
on each characteristic curve (again, the symbol $\tilde{\ }$ indicates restriction to the characteristic curve); solvability is guaranteed if, for instance, $a_0$ and $f$ are bounded and continuous in $O$. Furthermore, $\tilde{u}$ can be uniquely determined by prescribing its value at one point of each characteristic curve.

Figure 1.1: Decomposition of the boundary of a channel $O$ into inflow boundary $\partial O_-$, characteristic boundary $\partial O_0$ and outflow boundary $\partial O_+$
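The reduction to ordinary differential equations suggests a simple numerical procedure: integrate the characteristic system $d\mathbf{x}/ds = \mathbf{a}(\mathbf{x})$ together with the equation for the restricted solution, $du/ds = f - a_0 u$. The sketch below does this with a classical fourth-order Runge-Kutta step; the helper names and the rotating field used as an example are assumptions made here, not taken from the text.

```python
import math

def rk4_step(rhs, state, ds):
    """One classical Runge-Kutta step for d(state)/ds = rhs(state)."""
    k1 = rhs(state)
    k2 = rhs([s + 0.5 * ds * k for s, k in zip(state, k1)])
    k3 = rhs([s + 0.5 * ds * k for s, k in zip(state, k2)])
    k4 = rhs([s + ds * k for s, k in zip(state, k3)])
    return [s + ds * (a + 2.0 * b + 2.0 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def solve_along_characteristic(a, a0, f, x0, u0, s_end, n_steps=1000):
    """Follow the characteristic through x0 and integrate du/ds + a0 u = f on it."""
    def rhs(state):
        x, u = state[:-1], state[-1]
        return list(a(x)) + [f(x) - a0(x) * u]
    state = list(x0) + [u0]
    ds = s_end / n_steps
    for _ in range(n_steps):
        state = rk4_step(rhs, state, ds)
    return state[:-1], state[-1]

# Hypothetical data: rotating field a = (-x2, x1), a0 = 1, f = 0.
# The characteristic through (1, 0) is the unit circle and u decays like exp(-s).
x_end, u_end = solve_along_characteristic(
    a=lambda x: (-x[1], x[0]),
    a0=lambda x: 1.0,
    f=lambda x: 0.0,
    x0=(1.0, 0.0), u0=2.0, s_end=1.0)
print(x_end, u_end)
```

Any ODE integrator could replace the Runge-Kutta step; it is used here only to keep the example self-contained.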
A situation of particular interest is the following one. Let $\partial O$ be smooth enough so that the unit vector $\mathbf{n} = \mathbf{n}(\mathbf{x})$ normal to $\partial O$ exists at each point $\mathbf{x} \in \partial O$; we assume that $O$ is locally on one side of $\partial O$, and $\mathbf{n}$ is pointing outwards. Let us introduce the inflow boundary of $O$ as the set
\[ \partial O_- := \{ \mathbf{x} \in \partial O : \mathbf{a}(\mathbf{x}) \cdot \mathbf{n}(\mathbf{x}) < 0 \} . \]
The terminology comes from the fact that if $\mathbf{a}$ is the (Eulerian) velocity of fluid particles, then $\partial O_-$ is the portion of the boundary where the fluid is entering the region $O$. The sets $\partial O_+$ (outflow boundary) and $\partial O_0$ (characteristic boundary) are defined similarly, with $<$ replaced by $>$ and $=$, respectively.
Now, suppose that each point in $O$ is reached by a characteristic curve issuing from $\partial O_-$ (see Figure 1.1). Then, we can prescribe the value of $u$ at each point in $\partial O_-$ and uniquely solve the set of equations (1.3.3), getting $u$ at each point in $O$. In other words, given a function $g$ on $\partial O_-$, the boundary value problem
\[
\begin{cases}
\mathbf{a} \cdot \nabla u + a_0 u = f & \text{in } O, \\
u = g & \text{on } \partial O_-
\end{cases}
\qquad (1.3.5)
\]
admits a unique solution.
Before presenting an example, we anticipate that in Chapter ?? we shall see that this problem
is indeed solvable under weaker assumptions on the data (the domain, the coefficients of the
operator and the right-hand sides f and g).
Example 1.3.1. Consider the simple, constant coefficient equation
\[ u_t + a u_x = 0 \qquad (1.3.6) \]
in the variables $(x_1, x_2) = (x, t)$. Thus, $\mathbf{a} = (a, 1)^T$ and $a_0 = 0$. The characteristic curves are defined by the relations
\[ \frac{dx}{ds} = a, \qquad \frac{dt}{ds} = 1 . \]
Eliminating $s$, we get
\[ x - at = \text{constant}; \qquad (1.3.7) \]
in other words, the characteristics are straight lines in the plane $(x, t)$ having slope $1/a$ (see Fig. 1.2). The solution $u$ is constant along these lines. Thus, the equation models the propagation of a signal in the $x$-direction, with speed $a$: a signal issued at time $t_0$ from position $x_0$ is received at time $t_0 + \Delta t$ at position $x_0 + a \Delta t$ (see Fig. 1.3). Indeed,
At first, let us suppose that $O$ is the half-plane $\{ (x, t) : t > 0 \}$. Since $\mathbf{n} = (0, -1)^T$ on $\partial O = \{ (x, 0) : x \in \mathbb{R} \}$, we have $\mathbf{a} \cdot \mathbf{n} = -1$ therein, so that $\partial O_- = \partial O$, i.e., all the boundary is inflow. Thus, we prescribe the value $u_0$ of $u$ on $\partial O$, i.e., at the initial time $t = 0$. In this case, (1.3.5) reads as
\[
\begin{cases}
u_t + a u_x = 0, & x \in \mathbb{R}, \ t > 0, \\
u(x, 0) = u_0(x), & x \in \mathbb{R},
\end{cases}
\qquad (1.3.8)
\]
which is more properly called an initial value problem. Given a point $(\bar{x}, \bar{t}) \in O$, the characteristic line passing through it originates from the point $(x_0, 0) \in \partial O$ such that $\bar{x} - a\bar{t} = x_0$ (see (1.3.7) and Fig. 1.4). Since $u$ is constant on this line, we have $u(\bar{x}, \bar{t}) = u(x_0, 0) = u_0(x_0) = u_0(\bar{x} - a\bar{t})$.
Dropping the bars on $x$ and $t$, we get the explicit formula for the solution of the initial value problem (1.3.8)
\[ u(x, t) = u_0(x - at), \qquad \text{for all } (x, t) \in O. \qquad (1.3.9) \]
(note that the normal vector to $\partial O$ does not exist at the origin, yet this boundary point is an inflow point for the equation). We prescribe the value $u_0 = u_0(x)$ at time $t = 0$ and the value $g = g(t)$ at the left endpoint of the interval $(0, 1)$; no condition has to be prescribed at the right endpoint. Thus, we consider the initial-boundary value problem
\[
\begin{cases}
u_t + a u_x = 0, & 0 < x < 1, \ t > 0, \\
u(x, 0) = u_0(x), & 0 < x < 1, \\
u(0, t) = g(t), & t > 0.
\end{cases}
\qquad (1.3.10)
\]
In order to solve this problem, let us fix a point $(\bar{x}, \bar{t}) \in O$. If $\bar{x} \geq a\bar{t}$, then the characteristic passing through $(\bar{x}, \bar{t})$ meets $\partial O$ at the point $(x_0, 0)$, with $x_0 = \bar{x} - a\bar{t}$; hence, as above, $u(\bar{x}, \bar{t}) = u_0(\bar{x} - a\bar{t})$. On the other hand, if $\bar{x} < a\bar{t}$, then the characteristic passing through $(\bar{x}, \bar{t})$ meets $\partial O$ at the point $(0, t_0)$, with $t_0 = \bar{t} - \bar{x}/a$ (see Fig. 1.5); hence, $u(\bar{x}, \bar{t}) = u(0, t_0) = g(t_0) = g(\bar{t} - \bar{x}/a)$. We conclude that the solution of the initial-boundary value problem (1.3.10) is
\[
u(x, t) =
\begin{cases}
u_0(x - at) & \text{if } x \geq at, \\
g(t - x/a) & \text{if } x < at.
\end{cases}
\]
Note that if the data $u_0$ and $g$ do not match properly at the origin, a singularity propagates along the characteristic line $x = at$.
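The piecewise formula translates directly into code. The following sketch assumes $a > 0$ and takes the data $u_0$ and $g$ as plain Python callables; the function name `transport_ibvp` and the sample data are hypothetical. With mismatched data at the origin, the jump across the line $x = at$ is visible in the computed values.

```python
def transport_ibvp(u0, g, a, x, t):
    """Solution of u_t + a u_x = 0 on 0 < x < 1, t > 0 (a > 0 assumed),
    with u(x, 0) = u0(x) and u(0, t) = g(t)."""
    if a <= 0:
        raise ValueError("this sketch assumes a > 0")
    if x >= a * t:
        # The characteristic through (x, t) comes from the initial line t = 0.
        return u0(x - a * t)
    # Otherwise it comes from the inflow boundary x = 0.
    return g(t - x / a)

# Hypothetical mismatched data: u0 = 1 everywhere, g = 0 everywhere.
u0 = lambda x: 1.0
g = lambda t: 0.0
a, t = 2.0, 0.25            # the discontinuity sits at x = a * t = 0.5
print(transport_ibvp(u0, g, a, 0.51, t))  # just to the right of x = at
print(transport_ibvp(u0, g, a, 0.49, t))  # just to the left of x = at
```

For strictly negative $a$ the roles of the endpoints are exchanged, which this sketch deliberately does not handle.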
If the coefficient a is strictly negative, the boundary condition g is enforced at the right endpoint
of the interval (0, 1).
The concept of characteristic line introduced above is a particular case of the more general concept of characteristic manifold. An $(m - 1)$-dimensional manifold $\Gamma$ (a line in two dimensions, a surface in three dimensions, and so on) contained in $O$ is said to be non-characteristic for equation (1.3.1) whenever the following property holds: if one prescribes the value of $u$ on $\Gamma$, then $u$ is uniquely determined by the partial differential equation in a neighborhood of $\Gamma$. As a first step, one aims at determining the gradient of $u$ on $\Gamma$; then, if the manifold, the coefficients and the data are sufficiently smooth, one can differentiate the equation to get derivatives of $u$ on $\Gamma$ of higher and higher order; finally, the condition of real analyticity leads to the representation of $u$ in terms of its Taylor series in a neighborhood of each point in $\Gamma$ (Cauchy-Kowalewska Theorem).
Confining ourselves to the determination of the gradient of $u$ on $\Gamma$, we observe that the directional derivative of $u$ along any tangential vector to $\Gamma$ is uniquely determined by the prescribed value of $u$ therein. Therefore, the differential equation should allow us to express the derivative of $u$ along a non-tangential direction to $\Gamma$ in terms of the value of $u$ on the manifold. In other words, denoting by $\mathbf{n}$ the normal vector to $\Gamma$, one should have $\mathbf{a} \cdot \mathbf{n} \neq 0$ on $\Gamma$. This motivates the following
\[ \mathbf{a} \cdot \mathbf{n} = 0 \quad \text{on } \Gamma . \]
It is easy to check that characteristic curves lie on characteristic manifolds. Furthermore, the inflow boundary $\partial O_-$ of $O$ is obviously a non-characteristic manifold.
(the choice of the minus sign in front of the principal part will be motivated in the sequel). We actually consider the equation in the quasi-divergence form
\[
-\sum_{i,j=1}^{m} \frac{\partial}{\partial x_i} \left( a_{ij} \frac{\partial u}{\partial x_j} \right) + \sum_{i=1}^{m} \frac{\partial}{\partial x_i} (a_i u) + a_0 u = f. \qquad (1.4.2)
\]
As already mentioned above, often the equation is derived in this form; if not, we can transform (1.4.1) into (1.4.2) by an appropriate modification of lower order coefficients $a_i$ ($i = 0, 1, \dots, m$).
To simplify the notation, let us introduce the square matrix of order $m$
\[ A := (a_{ij})_{1 \leq i, j \leq m} . \qquad (1.4.3) \]
Since $u_{x_i x_j} = u_{x_j x_i}$ for any twice continuously differentiable function, it is not restrictive to assume that $a_{ij} = a_{ji}$ for all $i$ and $j$, i.e., to assume that the matrix $A$ is symmetric. Indeed, if the matrix is not symmetric, we write
\[ a_{ij} u_{x_i x_j} + a_{ji} u_{x_j x_i} = \frac{1}{2} (a_{ij} + a_{ji}) u_{x_i x_j} + \frac{1}{2} (a_{ij} + a_{ji}) u_{x_j x_i} , \]
i.e., we replace $A$ by $\frac{1}{2} (A + A^T)$. As before, $\mathbf{a} = (a_1, \dots, a_m)^T$ denotes the coefficients of the first order part. Then, (1.4.2) is compactly written as
\[ Lu = -\nabla \cdot (A \nabla u) + \nabla \cdot (\mathbf{a} u) + a_0 u = f , \qquad (1.4.4) \]
or, equivalently,
\[ Lu = -\nabla^T (A \nabla u) + \nabla^T (\mathbf{a} u) + a_0 u = f . \qquad (1.4.5) \]
A linear second order differential equation can be classified according to the structure of its
principal part. This classification is very important: indeed, the type of the equation influences
the kind of boundary and/or initial conditions which are admissible for the equation, the relevant
properties of the solution, as well as the techniques for solving the equation (analytically or
numerically).
The classification is accomplished by looking at the sign of the eigenvalues of the coefficient
matrix $A$ (recall that $A$ is symmetric, so all its eigenvalues are real). Note that since the coefficients may depend on $\mathbf{x}$, the type of the equation may vary from point to point. Let us consider $A = A(\mathbf{x})$ at a fixed point $\mathbf{x} \in O$.
Three situations are most commonly encountered in applications:
(i) all the eigenvalues of $A$ are nonzero, and they all have the same sign; in this case we say that the operator $L$ (or the equation (1.4.4)) is of elliptic type at $\mathbf{x}$;
(ii) precisely one eigenvalue of $A$ is zero, while the others all have the same sign; in this case we say that the operator $L$ is of parabolic type at $\mathbf{x}$;
(iii) all the eigenvalues of $A$ are nonzero, and precisely one eigenvalue has a sign different from that of the others; in this case we say that the operator $L$ is of hyperbolic type at $\mathbf{x}$.
In two independent variables, this classification is exhaustive (since, by assumption, $A$ cannot be the null matrix). The terminology comes from the fact that the level curves in the $(\xi_1, \xi_2)$-plane of the associated quadratic form
\[ Q(\boldsymbol{\xi}) = \boldsymbol{\xi}^T A \, \boldsymbol{\xi} , \qquad \boldsymbol{\xi} = (\xi_1, \xi_2)^T \]
are ellipses, or degenerate parabolae, or hyperbolae, depending on whether the operator $L$ is elliptic, or parabolic, or hyperbolic.
Example 1.4.1. The Poisson equation (1.2.7) is elliptic, the heat equation (1.2.8) is parabolic,
whereas the wave equation (1.2.9) is hyperbolic. Obviously, the type of each equation is the same
at all points in the plane.
Conversely, the Tricomi equation (1.2.10) is of variable type: it is elliptic in the upper half
plane, parabolic on the axis y = 0 and hyperbolic in the lower half plane.
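In two independent variables the classification rule can be sketched in a few lines, using the closed-form eigenvalues of a symmetric $2 \times 2$ matrix. The function names and the tolerance `tol` are assumptions made for this illustration; the Tricomi coefficient matrix is taken here as $\mathrm{diag}(y, 1)$, i.e., assuming the equation has the usual form $y u_{xx} + u_{yy} = 0$ (an overall sign does not change the type).

```python
import math

def eigenvalues_sym2(a11, a12, a22):
    """Eigenvalues (smaller, larger) of the symmetric matrix [[a11, a12], [a12, a22]]."""
    mean = 0.5 * (a11 + a22)
    radius = math.hypot(0.5 * (a11 - a22), a12)
    return mean - radius, mean + radius

def classify(a11, a12, a22, tol=1e-12):
    """Type of a second order operator in two variables from the signs of the eigenvalues."""
    lo, hi = eigenvalues_sym2(a11, a12, a22)
    if abs(lo) <= tol and abs(hi) <= tol:
        raise ValueError("the principal part vanishes: A is the null matrix")
    if abs(lo) <= tol or abs(hi) <= tol:
        return "parabolic"
    return "elliptic" if lo * hi > 0 else "hyperbolic"

def tricomi_type(y):
    # Assumed principal part: y * u_xx + u_yy, i.e. A = diag(y, 1).
    return classify(y, 0.0, 1.0)

for y in (2.0, 0.0, -2.0):
    print(y, tricomi_type(y))
```

Comparing against a tolerance rather than exact zero is a practical choice when the coefficients come from floating-point data.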
In three or more independent variables, other situations may occur. If $A$ has two or more zero eigenvalues and the remaining ones are of one sign, we say that the operator is ultra-parabolic. If two or more eigenvalues are of one sign, whereas two or more remaining ones are of the opposite sign, we say that $L$ is ultra-hyperbolic. We shall not consider these cases any further.
We now use the classification introduced above to reduce the general second order equation (1.4.4) to a canonical form. To this end, we shall make the simplifying assumption that the coefficients of the principal part are constant (otherwise, one can modify the arguments below by freezing the coefficients in a neighborhood of each point $\mathbf{x} \in O$).
Denote by $\lambda_i$ ($i = 1, \dots, m$) the eigenvalues of $A$, and let $\mathbf{w}_i$ be the corresponding eigenvectors, which form a complete set since $A$ is symmetric. Define the diagonal matrix $\Lambda := \mathrm{diag}(\lambda_1, \dots, \lambda_m)$, as well as the orthogonal matrix $S := (\mathbf{w}_1, \dots, \mathbf{w}_m)$. The eigenvalue-eigenvector relations, written as $AS = S\Lambda$, yield the diagonalization of $A$
\[ S^T A S = \Lambda . \qquad (1.4.6) \]
Now, let us fix a point $\bar{\mathbf{x}} \in O$ and let us make the change of independent variable
\[ \mathbf{y} = \bar{\mathbf{x}} + S^T (\mathbf{x} - \bar{\mathbf{x}}) . \]
Denoting by $\nabla_x$ the gradient in the $\mathbf{x}$-variable and defining $\nabla_y$ similarly, we have by the chain rule
\[ \nabla_x = S \nabla_y , \]
since $y_j = \bar{x}_j + \sum_{i=1}^{m} (S^T)_{ji} (x_i - \bar{x}_i) = \bar{x}_j + \sum_{i=1}^{m} s_{ij} (x_i - \bar{x}_i)$. Substituting into (1.4.5) gives
\[ Lu = -\nabla_y^T S^T A S \nabla_y u + \nabla_y^T S^T (\mathbf{a} u) + a_0 u = f ; \]
that is, using (1.4.6) and setting $\tilde{\mathbf{a}} := S^T \mathbf{a}$,
\[ Lu = -\nabla_y^T (\Lambda \nabla_y u) + \nabla_y^T (\tilde{\mathbf{a}} u) + a_0 u = f. \qquad (1.4.7) \]
In order to proceed, we consider the three main types of equations introduced above.
\[ Lu = -\Delta_z u + \nabla_z^T (\hat{\mathbf{a}} u) + a_0 u = f, \]
where $\Delta_z$ is the Laplacian in the $\mathbf{z}$-variable. We conclude that the Laplace operator is the canonical form of an elliptic operator.
and perform the same change of variable as in the parabolic case. Setting $\hat{\mathbf{a}} := D \tilde{\mathbf{a}}$, (1.4.7) becomes
\[ D_{tt}^2 u - \Delta_z u + \nabla_z^T (\hat{\mathbf{a}} u) + a_0 u = f. \]
We conclude that the wave operator (also termed the d'Alembert operator)
\[ \Box := D_{tt}^2 - \Delta_z \]
with $\lambda_1 > 0$ and $\lambda_2 < 0$. Let us define $a^2 := \lambda_1 / |\lambda_2| > 0$; setting $x = y_1$, $t = y_2$ and $g = f / |\lambda_2|$, we obtain
\[ D_{tt}^2 u - a^2 D_{xx}^2 u = g. \qquad (1.5.2) \]
The equation factorizes as
\[ (D_t + a D_x)(D_t - a D_x) u = g, \qquad (1.5.3) \]
which is equivalent to the first order hyperbolic system
\[
\begin{cases}
(D_t + a D_x) w = g, & (1.5.4) \\
(D_t - a D_x) u = w & (1.5.5)
\end{cases}
\]
(note that the $+$ and $-$ signs can be exchanged in these formulae). Recalling the results of Sect. 1.3, $u$ can be obtained by first integrating (1.5.4) along the characteristics $x - at = \text{constant}$, next integrating (1.5.5) along the characteristics $x + at = \text{constant}$. Actually, the lines of the family $x \pm at = \text{constant}$ are called the characteristics of equation (1.5.2). In order to uniquely determine the solution, one can prescribe a condition on $u$ for each characteristic line at each boundary point where it enters the region $O$. Let us detail two examples.
Example 1.5.1. At first, suppose that $O$ is the half-plane $\{ (x, t) : t > 0 \}$. Both characteristics enter $O$ at each point in $\partial O$; thus, we prescribe $u$ and a non-tangential derivative of $u$, such as the normal derivative $u_t$, therein. Precisely, we consider the initial value problem
\[
\begin{cases}
u_{tt} - a^2 u_{xx} = 0, & x \in \mathbb{R}, \ t > 0, \\
u(x, 0) = u_0(x), & x \in \mathbb{R}, \\
u_t(x, 0) = u_1(x), & x \in \mathbb{R},
\end{cases}
\qquad (1.5.6)
\]
(where, for simplicity, we have chosen $g \equiv 0$). Taking into account (1.5.4), (1.5.5) and noting that $w(x, 0) = (u_t - a u_x)(x, 0) = u_1(x) - a u_0'(x)$, we first integrate along the characteristics $x - at = \text{constant}$ to solve the initial value problem
\[
\begin{cases}
w_t + a w_x = 0, & x \in \mathbb{R}, \ t > 0, \\
w(x, 0) = u_1(x) - a u_0'(x), & x \in \mathbb{R};
\end{cases}
\]
we get $w(x, t) = u_1(x - at) - a u_0'(x - at)$. Next, we integrate along the characteristics $x + at = \text{constant}$ to solve the initial value problem
\[
\begin{cases}
u_t - a u_x = w, & x \in \mathbb{R}, \ t > 0, \\
u(x, 0) = u_0(x), & x \in \mathbb{R}.
\end{cases}
\]
We get
\[ u(x, t) = u_0(x + at) + \int_0^t w(x + at - as, s) \, ds; \]
substituting the expression of $w$ and making a change of variable in the integral leads to the final form of the solution:
\[ u(x, t) = \frac{1}{2} \left[ u_0(x - at) + u_0(x + at) \right] + \frac{1}{2a} \int_{x - at}^{x + at} u_1(s) \, ds . \qquad (1.5.7) \]
Setting $\phi(z) = \frac{1}{2} u_0(z) + \frac{1}{2a} \int_0^z u_1(s) \, ds$ and $\psi(z) = \frac{1}{2} u_0(z) + \frac{1}{2a} \int_z^0 u_1(s) \, ds$, we have
\[ u(x, t) = \phi(x + at) + \psi(x - at) . \]

Figure 1.6: The domain of dependence of a point $(x, t)$ (left) and the domain of influence of a point $(x_0, 0)$ (right)
Note that the solution is the superposition of two signals, traveling leftwards and rightwards, respectively, with speed $-a$ and $+a$; also note that $u$ at $(x, t)$ only depends on the initial data on the interval $[x - at, x + at]$. If we had considered our equation with a nonzero right-hand side $g$, then $u(x, t)$ would have depended on the values of $g$ in the triangle
\[ T = \{ (\xi, s) : 0 \leq s \leq t, \ x - a(t - s) \leq \xi \leq x + a(t - s) \} . \]
Indeed, adapting the computations above to the presence of the right-hand side yields
\[
u(x, t) = \frac{1}{2} \left[ u_0(x - at) + u_0(x + at) \right] + \frac{1}{2a} \int_{x - at}^{x + at} u_1(s) \, ds
+ \int_0^t ds \int_s^t g(x - a(2\tau - t - s), s) \, d\tau .
\]
We call the region $T$ the domain of dependence of the point $(x, t)$ (see Fig. 1.6, left).
Conversely, the initial values at a point $(x_0, 0)$ influence the solution in the angle
\[ A = \{ (x, t) : x_0 - at \leq x \leq x_0 + at \}; \]
this region is called the domain of influence of the point $(x_0, 0)$ (see Fig. 1.6, right).
This simple example shows that a second order hyperbolic equation describes the propagation
and composition of two signals moving at finite speed; the solution depends locally on the data of
the problem (the initial data u0 and u1 , the right-hand side g).
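Formula (1.5.7) for the homogeneous problem (with $g \equiv 0$) can be evaluated numerically. The sketch below approximates the integral of $u_1$ over $[x - at, x + at]$ by a midpoint rule and compares the result with two closed-form solutions; the function name `dalembert`, the quadrature resolution `n` and the sample data are assumptions made here for illustration.

```python
import math

def dalembert(u0, u1, a, x, t, n=2000):
    """Evaluate u(x, t) = (u0(x - at) + u0(x + at)) / 2 + (1 / 2a) * integral of u1
    over [x - at, x + at], using a composite midpoint rule with n cells."""
    lo, hi = x - a * t, x + a * t
    h = (hi - lo) / n
    integral = h * sum(u1(lo + (k + 0.5) * h) for k in range(n))
    return 0.5 * (u0(lo) + u0(hi)) + integral / (2.0 * a)

a, x, t = 1.5, 0.4, 0.8

# Case 1: u0 = sin, u1 = 0; the exact solution is sin(x) cos(at).
standing = dalembert(math.sin, lambda s: 0.0, a, x, t)
print(standing, math.sin(x) * math.cos(a * t))

# Case 2: u0 = 0, u1 = cos; the exact solution is cos(x) sin(at) / a.
kicked = dalembert(lambda s: 0.0, math.cos, a, x, t)
print(kicked, math.cos(x) * math.sin(a * t) / a)
```

Only values of the data in $[x - at, x + at]$ are ever sampled, which mirrors the domain of dependence described in the example.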
Example 1.5.2. Let us now consider our equation in the semi-infinite strip
\[ O = \{ (x, t) : 0 < x < 1, \ t > 0 \} . \]
At each point of the spatial boundary $\{ (0, t) : t > 0 \} \cup \{ (1, t) : t > 0 \}$, one characteristic is entering the domain and one is leaving it, see Fig. 1.7. Thus, one has to prescribe one boundary condition on $u$; this can be either the value of $u$ or the value of $u_x$ (which, up to the sign, is the normal derivative to $\partial O$ therein). For instance, we can consider the following initial-boundary value problem
\[
\begin{cases}
u_{tt} - a^2 u_{xx} = 0, & 0 < x < 1, \ t > 0, \\
u(0, t) = \gamma_0(t), & t > 0, \\
u_x(1, t) = \gamma_1(t), & t > 0, \\
u(x, 0) = u_0(x), & 0 < x < 1, \\
u_t(x, 0) = u_1(x), & 0 < x < 1.
\end{cases}
\qquad (1.5.8)
\]
In order to motivate the admissibility of the boundary conditions, let us fix a point $P_0 = (0, t_0)$ in $\partial O$ (see Fig. 1.8). If we prescribe $u$ at this point, say $u(0, t_0) = \gamma_0(t_0)$, it is convenient to exchange the signs in (1.5.4), (1.5.5); then, we use the boundary data to integrate $u$ along the characteristic line $x - at = -a t_0$ entering $O$ at $P_0$, i.e., we solve
\[
\begin{cases}
u_t + a u_x = w, \\
u(0, t_0) = \gamma_0(t_0),
\end{cases}
\]
with $w$ coming from the inside along the characteristic lines $x + at = \text{constant}$.
thus, $w(0, t_0) = u_t(0, t_0) - a \gamma_1(t_0)$ and we can integrate $w$ along the characteristic line $x - at = -a t_0$, i.e.,
\[
\begin{cases}
w_t + a w_x = 0, \\
w(0, t_0) = u_t(0, t_0) - a \gamma_1(t_0).
\end{cases}
\]
For an initial-boundary value problem, the domains of dependence and influence are defined in the obvious way.
\[ -\lambda_1 \frac{\partial^2 u}{\partial x^2} = f \]
in the sole space variable $y_1 = x$. The solution at each point $x$ depends on the values of the data $f$ at all points in the domain, as well as on the boundary data at all boundary points.
If a low order term is present in the equation, i.e., if the equation is
\[ -\lambda_1 \frac{\partial^2 u}{\partial x^2} + \lambda_2 \frac{\partial^2 u}{\partial t^2} + a_1 \frac{\partial u}{\partial x} + a_2 \frac{\partial u}{\partial t} = f \]
with $a_2 \neq 0$, the limit equation for $\lambda_2 \to 0$ is parabolic; the solution at each point $(x, t)$ in the domain depends on all the values of the data $f$ and the boundary data at all earlier times $t' < t$, as well as on the initial condition $u_0$ at $t = 0$. Propagation of signals takes place with infinite speed.
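Finally, we briefly deal with the concept of characteristic manifold. For a second order partial differential equation, a manifold $\Gamma \subset O$ is non-characteristic if the prescription of $u$ and $\nabla u$ on $\Gamma$ uniquely determines the Hessian of $u$ (i.e., the set of all its second order partial derivatives) therein, via the differential equation.
Suppose that the manifold is described by an implicit equation $\Phi(\mathbf{x}) = 0$, for a smooth $\Phi$. Fix a point $\bar{\mathbf{x}}$ on $\Gamma$ and let
\[ \mathbf{n} = \frac{\nabla \Phi(\bar{\mathbf{x}})}{\|\nabla \Phi(\bar{\mathbf{x}})\|} \]
be the normal vector to $\Gamma$ at $\bar{\mathbf{x}}$. Let us make a change of independent variable $\mathbf{y} = \bar{\mathbf{x}} + R^T (\mathbf{x} - \bar{\mathbf{x}})$, such that the last coordinate direction is along $\mathbf{n}$. Setting $\tilde{\Phi}(\mathbf{y}) = \Phi\big( \bar{\mathbf{x}} + (R^T)^{-1} (\mathbf{y} - \bar{\mathbf{x}}) \big)$, we have $R^{-1} \nabla_x \Phi = \nabla_y \tilde{\Phi}$, so we choose $R$ such that $R^{-1} \mathbf{n} = \mathbf{e}_m = (0, \dots, 0, 1)^T$. The differential equation in the new coordinates becomes
\[ -\nabla_y^T R^T A R \, \nabla_y u + \text{lower order terms} = f. \]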
We note that all second order derivatives of $u$ except $u_{y_m y_m}$ are determined at $\bar{\mathbf{x}}$ by the values of $u$ and $\nabla u$ on $\Gamma$. Thus, in order to get the value of $u_{y_m y_m}$, we must have
\[ (R^T A R)_{mm} = \mathbf{e}_m^T R^T A R \, \mathbf{e}_m = \mathbf{n}^T A \, \mathbf{n} \neq 0. \]
For instance, the characteristic manifolds of the wave equation (1.5.2) are defined by the equation $a^2 n_x^2 - n_t^2 = 0$ (where $\mathbf{n} = (n_x, n_t)^T$), i.e., they are precisely the lines $x \pm at = \text{constant}$. The characteristic manifolds for the heat equation (1.2.8) satisfy $n_x^2 = 0$, i.e., they are the lines $t = \text{constant}$.
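The condition $\mathbf{n}^T A \mathbf{n} = 0$ is easy to test for a given normal direction. The sketch below assumes the coefficient matrices $\mathrm{diag}(a^2, -1)$ for the wave equation in the variables $(x, t)$ and $\mathrm{diag}(1, 0)$ for a heat equation of the form $u_t - u_{xx} = f$; these matrices, the function names and the tolerance are assumptions made for illustration. Normals need not be normalized, since the condition is homogeneous in $\mathbf{n}$.

```python
def quadratic_form(A, n):
    """Value of n^T A n for a square matrix A (list of rows) and a vector n."""
    size = len(n)
    return sum(n[i] * A[i][j] * n[j] for i in range(size) for j in range(size))

def is_characteristic_normal(A, n, tol=1e-12):
    """True when a manifold with normal n is characteristic, i.e. n^T A n = 0."""
    return abs(quadratic_form(A, n)) <= tol

a = 3.0
wave = [[a * a, 0.0], [0.0, -1.0]]   # assumed principal part, variables (x, t)
heat = [[1.0, 0.0], [0.0, 0.0]]      # assumed principal part, variables (x, t)

# The lines x - at = const and x + at = const have normals (1, -a) and (1, a).
print(is_characteristic_normal(wave, (1.0, -a)), is_characteristic_normal(wave, (1.0, a)))
# The line t = const has normal (0, 1): non-characteristic for the wave
# equation, characteristic for the heat equation.
print(is_characteristic_normal(wave, (0.0, 1.0)), is_characteristic_normal(heat, (0.0, 1.0)))
```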
Finally, an elliptic equation has no (real) characteristic manifold. This means that, under appropriate regularity conditions, the Cauchy problem
\[
\begin{cases}
Lu = f & \text{in } O_\Gamma, \\
u = u_0 & \text{on } \Gamma, \\
\dfrac{\partial u}{\partial n} = u_1 & \text{on } \Gamma,
\end{cases}
\]
is always uniquely solvable in a neighborhood $O_\Gamma$ of any $(m - 1)$-dimensional manifold $\Gamma$. However, the Cauchy problem is not well-posed for an elliptic equation. This means that arbitrarily small changes in the data $u_0$ and $u_1$ may lead to arbitrarily large changes in the solution $u$, as the following example shows.
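\[ u(x, 0) = \frac{\sin nx}{n} , \qquad \frac{\partial u}{\partial n}(x, 0) = -\frac{\partial u}{\partial y}(x, 0) = 0 , \]
for a fixed $n > 0$ (thus, $u(x, y) = u_n(x, y)$). The solution can be found by the ansatz
\[ u(x, y) = \frac{\sin nx}{n} \, \hat{u}(y), \]
which reduces the problem to a second order ordinary differential equation with two initial conditions:
\[
\begin{cases}
\dfrac{1}{n} \hat{u}'' - n \hat{u} = 0, \\
\hat{u}(0) = 1, \\
\hat{u}'(0) = 0.
\end{cases}
\]
The result is
\[ \hat{u}(y) = \frac{e^{ny} + e^{-ny}}{2} = \cosh ny, \]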
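The instability can be made concrete with a few numbers. Assuming the solution found by the ansatz is $u_n(x, y) = \sin(nx) \cosh(ny) / n$, the sketch below compares the size of the Cauchy datum, $1/n$, with the size of the solution at a fixed height $y$, $\cosh(ny)/n$. The function names and the sample values of $n$ and $y$ are illustrative choices.

```python
import math

def cauchy_data_size(n):
    """Maximum over x of |u_n(x, 0)| = |sin(n x)| / n."""
    return 1.0 / n

def solution_size(n, y):
    """Maximum over x of |u_n(x, y)| = cosh(n y) / n at a fixed height y."""
    return math.cosh(n * y) / n

y = 1.0
for n in (1, 10, 50):
    # The data shrink like 1/n while the solution at height y grows exponentially.
    print(n, cauchy_data_size(n), solution_size(n, y))
```

As $n$ grows the data tend to zero uniformly, yet the solution at any fixed $y \neq 0$ is unbounded, which is the lack of continuous dependence on the data described above.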
1.6 Exercises
1.1. Consider the transport equation
\[ \frac{\partial u}{\partial t} + 2 \frac{\partial u}{\partial x} = 0 \]
in the half-plane $\{ (x, t) : t > 0 \}$, with the initial condition
\[
u(x, 0) = u_0(x) =
\begin{cases}
3 & \text{if } x < 0, \\
1 & \text{if } x > 0.
\end{cases}
\]
(ii) solve the initial value problem in $O = \mathbb{R} \times (0, +\infty)$ with the condition $u(x, 0) = u_0(x)$;
(iii) solve the initial-boundary value problem first for $x \in [0, 1]$ and then for $x \in [1, 2]$ with the further condition $u = g(t)$ on the inflow boundary.
(ii) Deduce that the characteristics are straight lines in the half plane $\{ (x, t) : t > 0 \}$.
(iii) Suppose the initial datum $u(x, 0) = u_0(x)$ is prescribed for every $x \in \mathbb{R}$; find the slope of the characteristics.
Theory of Distributions
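The theory of distributions was created by Laurent Schwartz in 1944; its main purpose is to extend the results which hold for integrable and differentiable functions to those functions that do not satisfy the necessary conditions of classical regularity.
where $\operatorname{supp} \varphi$ denotes the support of $\varphi$, i.e., the closure of the set of all points $\mathbf{x}$ in $O$ at which $\varphi$ does not vanish:
\[ \operatorname{supp} \varphi = \overline{\{ \mathbf{x} \in O \mid \varphi(\mathbf{x}) \neq 0 \}}; \]
it is easy to verify that $\mathcal{D}(O)$ is a linear space.
Let us now introduce the following notion of convergence in $\mathcal{D}(O)$:
(i) there exists a compact set $K \subset O$ which contains all the supports of $\varphi_n$ and $\varphi$;
(ii) for all multi-integers $\alpha \in \mathbb{N}^m$, the sequence $\{ D^{\alpha} \varphi_n \}_{n \geq 0}$ converges to $D^{\alpha} \varphi$ uniformly on $K$, i.e.,
\[ \| D^{\alpha} \varphi_n - D^{\alpha} \varphi \|_{\infty, K} \to 0 \quad \text{as } n \to \infty. \]
\[ T : \mathcal{D}(O) \to \mathbb{R} \]
such that if $\{ \varphi_n \}_{n \geq 0}$ converges to $\varphi$ in $\mathcal{D}(O)$ then $\{ T(\varphi_n) \}_{n \geq 0}$ also converges to $T(\varphi)$ in $\mathbb{R}$ when $n \to \infty$.
The set of all distributions on $O$ is a linear space denoted by $\mathcal{D}'(O)$. Moreover, the notation $\langle T, \varphi \rangle$ is often used instead of $T(\varphi)$ and it is called a duality form.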
Example 2.1.3. Let f be a real-valued and Riemann (or Lebesgue)-integrable function on O; let
us set Z
hTf , i := f (x )(x ) dx D(O)
O
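and let us verify that $T_f$ is a distribution. To do this, we have to check the properties of the previous definition; in particular:
(i) $T_f$ is certainly a linear form because it is real-valued and the integral is a linear operator;
(ii) suppose $\{\varphi_n\}_{n\ge 0} \subset \mathcal{D}(O)$ is a sequence such that $\|\varphi_n - \varphi\|_{\infty,K} \to 0$ when $n \to \infty$ for a certain $\varphi \in \mathcal{D}(O)$; then
\[ \langle T_f, \varphi_n\rangle - \langle T_f, \varphi\rangle = \langle T_f, \varphi_n - \varphi\rangle = \int_O f(x)[\varphi_n(x) - \varphi(x)]\,dx \]
and so
\[ |\langle T_f, \varphi_n\rangle - \langle T_f, \varphi\rangle| \le \int_K |f(x)|\,|\varphi_n(x) - \varphi(x)|\,dx \le \|\varphi_n - \varphi\|_{\infty,K} \int_K |f(x)|\,dx = G\,\|\varphi_n - \varphi\|_{\infty,K} \to 0 \quad (n \to \infty), \]
where $G = \int_K |f(x)|\,dx$ is a finite constant that comes from the hypothesis that $f$ is integrable on $K$. Thus we have $\langle T_f, \varphi_n\rangle \to \langle T_f, \varphi\rangle$ when $n \to \infty$.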
because the supports of all the $\varphi_n$'s and of $\varphi$ are contained in $K$, so the integral vanishes on $O \setminus K$. This implies that only a local integrability of $f$ on subdomains of $O$, and not on the whole set $O$, is needed to define the distribution $T_f$.
Throughout this chapter, we shall refer to this type of distribution as a function-like distribution.
Example 2.1.4 (The Dirac delta). Consider a point $x_0 \in O$; we now introduce the following form
\[ \langle \delta_{x_0}, \varphi\rangle := \varphi(x_0) \qquad \forall \varphi \in \mathcal{D}(O) \]
and we want to verify that it is a distribution in the sense of Definition 2.1.2.
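and so, if $x_0 \in K$:
\[ |\langle \delta_{x_0}, \varphi_n\rangle - \langle \delta_{x_0}, \varphi\rangle| = |\varphi_n(x_0) - \varphi(x_0)| \le \max_{x \in K} |\varphi_n(x) - \varphi(x)| \to 0 \quad (n \to \infty). \]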
[Figure 2.1: graph of $f_n$; the $x$-axis is marked at $-\frac{1}{2n}$ and $\frac{1}{2n}$.]
Such a distribution is called the Dirac delta on the point $x_0$; it is possible to show (see Exercise 2.1) that it is not a function-like distribution, i.e., there does not exist any function $f$ such that the action of $\delta_{x_0}$ on a test function $\varphi \in \mathcal{D}(O)$ can be expressed as the integral on $O$ of $f$ against $\varphi$.
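After introducing the notion of convergence in $\mathcal{D}(O)$, it would be useful to provide a similar tool for the space $\mathcal{D}'(O)$ too. This is accomplished by the following
Definition 2.1.5. Let $T, T_n \in \mathcal{D}'(O)$, $n \ge 0$; the sequence $\{T_n\}_{n\ge 0}$ is said to converge to $T$ in the sense of $\mathcal{D}'(O)$ if
\[ \langle T_n, \varphi\rangle \to \langle T, \varphi\rangle \quad \text{as } n \to \infty \]
for every $\varphi \in \mathcal{D}(O)$.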
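This definition leads us to an important characterization of the Dirac delta. Let us set $O = \mathbb{R}$ and $T = \delta_0$, $\langle T, \varphi\rangle = \varphi(0)$ for all $\varphi \in \mathcal{D}(\mathbb{R})$; then, for every $n > 0$, let us define the function (see Figure 2.1)
\[ f_n(x) = \begin{cases} n & \text{if } |x| \le \frac{1}{2n}, \\ 0 & \text{otherwise.} \end{cases} \]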
does not depend on $n$: every function $f_n$ has therefore the same unit area on $\mathbb{R}$. If we now consider the family of distributions $T_{f_n}$, we have:
\[ \langle T_{f_n}, \varphi\rangle = \int_{\mathbb{R}} f_n(x)\varphi(x)\,dx = n \int_{-1/(2n)}^{1/(2n)} \varphi(x)\,dx = n\,\frac{1}{n}\,\varphi(x_n) = \varphi(x_n), \]
where $x_n$ is a point in the interval $\left[-\frac{1}{2n}, \frac{1}{2n}\right]$ whose existence is guaranteed by the mean value theorem for integrals. It is clear that $x_n \to 0$ when $n \to \infty$; then using the continuity of $\varphi$ gives
\[ \langle T_{f_n}, \varphi\rangle \to \varphi(0) = \langle \delta_0, \varphi\rangle \quad (n \to \infty). \]
Since this argument holds for every $\varphi \in \mathcal{D}(\mathbb{R})$, we conclude that $T_{f_n} \to \delta_0$ in the sense of $\mathcal{D}'(\mathbb{R})$. This shows that, although the Dirac delta cannot be represented by a classical function, it can nevertheless be obtained as a limit of classical functions in the sense of Definition 2.1.5.
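In general, it is easy to check that any sequence $\{f_n\}$ of non-negative (or, more generally, uniformly bounded in $L^1(\mathbb{R})$) integrable functions satisfying $\int_{\mathbb{R}} f_n(x)\,dx = 1$ and $\operatorname{supp} f_n \subset B(0, r_n)$ with $r_n \to 0$ as $n \to \infty$, converges to $\delta_0$ in $\mathcal{D}'(\mathbb{R})$ as $n \to \infty$.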
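The convergence of the rectangular pulses to the Dirac delta can be observed numerically. The following is a minimal sketch, not part of the original notes: the test function `bump` (the classical $\exp(-1/(1-x^2))$ on $(-1,1)$) and the midpoint quadrature are choices made here for illustration only.

```python
import math

def bump(x):
    """Smooth test function supported in (-1, 1) (illustrative choice)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def pairing_with_fn(n, phi, samples=2000):
    """Midpoint-rule value of <T_{f_n}, phi> = n * integral of phi over [-1/(2n), 1/(2n)]."""
    a = -1.0 / (2 * n)
    h = (1.0 / n) / samples
    return n * h * sum(phi(a + (k + 0.5) * h) for k in range(samples))

phi0 = bump(0.0)  # <delta_0, phi> = phi(0)
ns = (1, 10, 100, 1000)
errors = [abs(pairing_with_fn(n, bump) - phi0) for n in ns]
for n, err in zip(ns, errors):
    print(n, err)
```

The printed errors shrink as $n$ grows, in agreement with $\langle T_{f_n}, \varphi\rangle \to \varphi(0)$.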
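Definition 2.1.6. A distribution $T$ is said to be of finite order if there exist $r \in \mathbb{N}$ and a constant $C_r > 0$ such that
\[ \forall \varphi \in \mathcal{D}(O), \qquad |\langle T, \varphi\rangle| \le C_r \max_{|\alpha| \le r} \|D^\alpha \varphi\|_{\infty,O}. \]
The smallest $r$ for which this condition holds is called the order of the distribution.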
so
\[ |\langle T_f, \varphi\rangle| \le C\,\|\varphi\|_{\infty,O}. \]
Definition 2.1.8. Let $T \in \mathcal{D}'(O)$; the support of $T$ is the smallest closed set $K \subset O$ such that
\[ \langle T, \varphi\rangle = 0 \qquad \forall \varphi \in \mathcal{D}(O) \text{ with } \operatorname{supp}\varphi \cap K = \emptyset. \]
This definition states that the support of a distribution $T$ is closely related to those of test functions. In more detail, the support $K$ of $T$ is the smallest closed set in $O$ that has the following property: every test function that vanishes in a neighborhood of $K$, i.e., such that its support does not intersect $K$, sees $T$ as zero.
For instance, if we take $T = \delta_{x_0}$, $x_0 \in O$, we find $\operatorname{supp}\delta_{x_0} = \{x_0\}$ because every test function $\varphi$ whose support does not contain $x_0$ is such that $\varphi(x_0) = 0$ and so $\langle \delta_{x_0}, \varphi\rangle = 0$.
As another example, let us consider an integrable function $f$ with a compact support in $O$; then $\operatorname{supp} T_f = \operatorname{supp} f$.
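Example 2.1.9. Consider an open set $O$ in $\mathbb{R}^m$ and let $\Sigma$ be a closed $(m-1)$-dimensional regular manifold contained in $O$; let $g$ be an integrable function defined on $\Sigma$. Then, the distribution $\delta_{\Sigma,g}$ defined as
\[ \langle \delta_{\Sigma,g}, \varphi\rangle = \int_\Sigma g(\sigma)\varphi(\sigma)\,d\sigma \qquad \forall \varphi \in \mathcal{D}(O), \]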
Definition 2.2.1. Let $\alpha \in \mathbb{N}^m$ and $T \in \mathcal{D}'(O)$; the partial derivative of $T$ of order $\alpha$ is the distribution $D^\alpha T$ whose action on a test function $\varphi \in \mathcal{D}(O)$ is defined as
\[ \langle D^\alpha T, \varphi\rangle = (-1)^{|\alpha|} \langle T, D^\alpha \varphi\rangle. \]
We can immediately observe that, in the sense of this definition, all distributions are infinitely differentiable, since the derivative is moved onto the test function, which is of class $C^\infty(O)$.
The following example will explain the reason for such a definition and where it comes from.
Example 2.2.2. Let $f \in C^1(O)$ and consider the distribution $T = T_f$; in order to calculate its derivative $D_i T_f$, we set $\alpha = (0, \dots, 0, 1, 0, \dots, 0)$ (where the only component of the multi-integer different from zero is the $i$-th) and then we apply Definition 2.2.1:
\[ \langle D_i T_f, \varphi\rangle = -\langle T_f, D_i \varphi\rangle = -\int_O f(x)\,\frac{\partial \varphi}{\partial x_i}(x)\,dx = \int_O \frac{\partial f}{\partial x_i}(x)\,\varphi(x)\,dx = \langle T_{D_i f}, \varphi\rangle \]
for all $\varphi \in \mathcal{D}(O)$; we recall that in applying the integration-by-parts formula no boundary term appears, since a test function vanishes in a neighborhood of $\partial O$.
We conclude that $D_i T_f = T_{D_i f}$, i.e., the partial derivative with respect to $x_i$ of the distribution based on the function $f$ is the distribution based on the function $D_i f$, which exists in the classical sense under the hypothesis $f \in C^1(O)$. As we have just seen, this result follows from the integration-by-parts formula and it allows us to calculate the derivatives of a function-like distribution in a somewhat classical way.
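which is not differentiable in the classical sense because of the singularity at the origin. Nevertheless, in the distributional sense we have:
\[ \begin{aligned} \langle (T_f)', \varphi\rangle &= -\langle T_f, \varphi'\rangle = -\int_{\mathbb{R}} f(x)\varphi'(x)\,dx \\ &= -\int_{-\infty}^{0} x\,\varphi'(x)\,dx - \int_{0}^{+\infty} 2x\,\varphi'(x)\,dx \\ &= \int_{-\infty}^{0} \varphi(x)\,dx + \int_{0}^{+\infty} 2\varphi(x)\,dx \\ &= \int_{\mathbb{R}} g(x)\varphi(x)\,dx = \langle T_g, \varphi\rangle \qquad \forall \varphi \in \mathcal{D}(\mathbb{R}) \end{aligned} \]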
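The identity $-\int f\varphi' = \int g\varphi$ can be checked numerically. This is a sketch under assumptions: the definitions of $f$ and $g$ are read off from the integrals above ($f(x) = x$ for $x<0$, $2x$ for $x \ge 0$; $g = 1$ for $x<0$, $2$ for $x>0$), and the shifted bump `phi` is a hypothetical test function chosen here so that the check is not trivially symmetric.

```python
import math

def bump(x):
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def dbump(x):
    # derivative of bump: bump(x) * d/dx[-1/(1 - x^2)] = bump(x) * (-2x) / (1 - x^2)^2
    return bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2 if abs(x) < 1.0 else 0.0

SHIFT = 0.3  # test function supported in (-0.7, 1.3), straddling the kink at 0

def phi(x):
    return bump(x - SHIFT)

def dphi(x):
    return dbump(x - SHIFT)

def f(x):
    return x if x < 0.0 else 2.0 * x   # inferred from the integrals in the text

def g(x):
    return 1.0 if x < 0.0 else 2.0     # classical derivative of f away from 0

def integrate(fn, a=-0.7, b=1.3, samples=20000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / samples
    return h * sum(fn(a + (k + 0.5) * h) for k in range(samples))

lhs = -integrate(lambda x: f(x) * dphi(x))   # <(T_f)', phi> = -<T_f, phi'>
rhs = integrate(lambda x: g(x) * phi(x))     # <T_g, phi>
print(lhs, rhs)
```

Both numbers agree up to quadrature error, since $f$ is continuous at the origin and therefore no Dirac term appears.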
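Example 2.2.6. Consider the Dirac delta $\delta_0 \in \mathcal{D}'(\mathbb{R})$ and let $\varphi \in \mathcal{D}(\mathbb{R})$; from Definition 2.2.1 one has
\[ \begin{aligned} \langle \delta_0', \varphi\rangle &= -\langle \delta_0, \varphi'\rangle = -\varphi'(0) \\ \langle \delta_0'', \varphi\rangle &= \langle \delta_0, \varphi''\rangle = \varphi''(0) \\ &\;\;\vdots \\ \langle \delta_0^{(k)}, \varphi\rangle &= \dots = (-1)^k \varphi^{(k)}(0), \qquad k \in \mathbb{N}. \end{aligned} \]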
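Example 2.2.7. The multidimensional counterpart of the general situation considered in Example 2.2.4 is as follows. Let $O$ be an open set in $\mathbb{R}^m$ and let $\Sigma$ be an $(m-1)$-dimensional regular manifold contained in $O$, which splits $O$ into $O_-$ and $O_+$, with $O_\pm$ open disjoint sets such that $\overline{O}_- \cap \overline{O}_+ = \Sigma$.
Let the function $f$ satisfy
\[ f(x) = \begin{cases} f_-(x) & \text{if } x \in O_-, \\ f_+(x) & \text{if } x \in O_+, \end{cases} \]
with $f_+ \in C^1(\overline{O}_+)$, $f_- \in C^1(\overline{O}_-)$. Then, for any $i = 1, \dots, m$, one has
\[ D_i(T_f) = T_{g_i} + \delta_{\Sigma,h_i} \tag{2.2.2} \]
where
\[ g_i(x) = \begin{cases} D_i f_-(x) & \text{if } x \in O_-, \\ D_i f_+(x) & \text{if } x \in O_+, \end{cases} \]
and
\[ h_i(s) = [\![f]\!]_s\, n_i(s), \qquad s \in \Sigma, \]
where $[\![f]\!]_s$ denotes the jump of $f$ at the point $s$ in going from $O_-$ to $O_+$, and $n_i$ is the $i$-th component of the unit normal vector to $\Sigma$ pointing from $O_-$ to $O_+$.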
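We prove the result in the particular case in which $f$ is the Heaviside function associated with the given partition of $O$, i.e.,
\[ H(x) = \begin{cases} 0 & \text{if } x \in O_-, \\ 1 & \text{if } x \in O_+. \end{cases} \]
Let us compute $D_i(T_H)$ in the sense of distributions. Using the divergence theorem (see (3.1.3)), we have
\[ \langle D_i(T_H), \varphi\rangle = -\int_O H(x)\,\frac{\partial \varphi}{\partial x_i}(x)\,dx = -\int_{O_+} \frac{\partial \varphi}{\partial x_i}(x)\,dx = \int_\Sigma \varphi\, n_i\,d\sigma = \langle \delta_{\Sigma,n_i}, \varphi\rangle \]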
as an operator into the space of distributions $\mathcal{D}'(O)$; in particular, we are interested in those functions $g : O \subseteq \mathbb{R}^m \to \mathbb{R}$ whose Laplacian is the Dirac delta $\delta_0$ on the origin.
\[ \Delta g = \delta_0 \quad \text{in } \mathcal{D}'(O) \]
Let us start with $m = 1$ (dimension 1); if we take the function $u(x) = |x|$, it is easy to verify that $u'(x) = \operatorname{sign}(x)$ and thus $u''(x) = 2\delta_0$, as immediately follows from (2.2.1). Hence, the function $g(x) = \frac{1}{2}u(x) = \frac{1}{2}|x|$ is a fundamental solution of the Laplacian on $\mathbb{R}$.
Let us now consider $m = 2$; in this case, it is convenient to use the polar coordinates defined by the transformation
since we can think of every function $u = u(x, y)$ as $u(x, y) = u(x(r, \theta), y(r, \theta)) = U(r, \theta)$, we have the relationship $u(x, y) = U(r, \theta)$ and consequently $\Delta_{(x,y)} u = \Delta_{(r,\theta)} U$, where $\Delta_{(x,y)}$ and $\Delta_{(r,\theta)}$ denote the Laplacian in cartesian and polar coordinates respectively, with (see Exercise 2.8)
\[ \Delta_{(r,\theta)} = \frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial \theta^2}. \tag{2.3.1} \]
If we take the function $u(x, y) = \log\sqrt{x^2 + y^2} = \log r$ and we set $(x, y) \neq (0, 0)$, we obtain from (2.3.1)
\[ \Delta u = \Delta_{(r,\theta)} \log r = -\frac{1}{r^2} + \frac{1}{r^2} = 0; \]
hence, $\log r$ is a harmonic function in the classical sense everywhere in the plane except at the origin.
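Let us now calculate $\Delta u$ in the sense of distributions; taking $\varphi \in \mathcal{D}(\mathbb{R}^2)$ we have
\[ \langle \Delta u, \varphi\rangle = \langle u, \Delta\varphi\rangle = \int_{\mathbb{R}^2} \log r\,\Delta\varphi\,dx\,dy = \lim_{\varepsilon \to 0^+} \int_{\mathbb{R}^2 \setminus B(0,\varepsilon)} \log r\,\Delta\varphi\,dx\,dy \]
where $B(0, \varepsilon)$ is the open ball of radius $\varepsilon > 0$ centered at the origin. Applying the integration-by-parts formula gives
\[ \begin{aligned} \langle \Delta u, \varphi\rangle &= \lim_{\varepsilon \to 0^+} \left[ -\int_{r>\varepsilon} \nabla \log r \cdot \nabla\varphi\,dx\,dy + \int_{r=\varepsilon} \log r\,\frac{\partial \varphi}{\partial n}\,d\sigma \right] \\ &= \lim_{\varepsilon \to 0^+} \left[ \int_{r>\varepsilon} \Delta \log r\;\varphi\,dx\,dy - \int_{r=\varepsilon} \frac{\partial \log r}{\partial n}\,\varphi\,d\sigma + \int_{r=\varepsilon} \log r\,\frac{\partial \varphi}{\partial n}\,d\sigma \right] \\ &= \lim_{\varepsilon \to 0^+} \left[ -\int_{r=\varepsilon} \frac{\partial \log r}{\partial n}\,\varphi\,d\sigma + \int_{r=\varepsilon} \log r\,\frac{\partial \varphi}{\partial n}\,d\sigma \right] \end{aligned} \]
where the result $\Delta \log r = 0$ away from the origin has been used.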
Since $B(0, \varepsilon)$ is a circle, the normal vector to its circumference $r = \varepsilon$ is a radial vector, which allows us to write
\[ \frac{\partial}{\partial n} \log r = -\frac{d}{dr} \log r = -\frac{1}{r} \]
2.3. STUDY OF THE LAPLACE OPERATOR IN D (O) 33
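where the minus sign depends only on the fact that $\partial_n \log r = \nabla \log r \cdot n$ should be negative because the two vectors $\nabla \log r$ and $n$ point in opposite directions. Thus
\[ -\int_{r=\varepsilon} \frac{\partial \log r}{\partial n}\,\varphi\,d\sigma = \frac{1}{\varepsilon} \int_{r=\varepsilon} \varphi\,d\sigma = 2\pi\,\frac{1}{2\pi\varepsilon} \int_{r=\varepsilon} \varphi\,d\sigma. \]
Note that $\frac{1}{2\pi\varepsilon} \int_{r=\varepsilon} \varphi\,d\sigma$ is the mean value that $\varphi$ takes along the circumference $r = \varepsilon$; since $\varphi$ is continuous, it follows
\[ \lim_{\varepsilon \to 0^+} 2\pi\,\frac{1}{2\pi\varepsilon} \int_{r=\varepsilon} \varphi\,d\sigma = 2\pi\,\varphi(0, 0). \]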
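Moreover
\[ \int_{r=\varepsilon} \log r\,\frac{\partial \varphi}{\partial n}\,d\sigma = \log\varepsilon \int_{r=\varepsilon} \frac{\partial \varphi}{\partial n}\,d\sigma \]
and one has
\[ \left| \int_{r=\varepsilon} \frac{\partial \varphi}{\partial n}\,d\sigma \right| \le \int_{r=\varepsilon} \left| \frac{\partial \varphi}{\partial n} \right| d\sigma = \int_{r=\varepsilon} |\nabla\varphi \cdot n|\,d\sigma \le \int_{r=\varepsilon} \|\nabla\varphi\|\,d\sigma \le \max_{(x,y)\in\mathbb{R}^2} \|\nabla\varphi\| \int_{r=\varepsilon} d\sigma = 2\pi\varepsilon \max_{(x,y)\in\mathbb{R}^2} \|\nabla\varphi\|; \]
in the third step, the Cauchy-Schwarz inequality has been used together with the fact that $\|n\| = 1$ (here $\|\cdot\|$ denotes the Euclidean norm in $\mathbb{R}^2$). Since
\[ 2\pi \max_{(x,y)\in\mathbb{R}^2} \|\nabla\varphi\| = M \]
is a finite constant, the boundary term is bounded in absolute value by $M\,\varepsilon\,|\log\varepsilon|$, which tends to $0$ as $\varepsilon \to 0^+$,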
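and finally
\[ \langle \Delta u, \varphi\rangle = 2\pi\,\varphi(0, 0) = 2\pi \langle \delta_0, \varphi\rangle \qquad \forall \varphi \in \mathcal{D}(\mathbb{R}^2) \]
that is
\[ \Delta u = 2\pi\,\delta_0 \quad \text{in } \mathcal{D}'(\mathbb{R}^2). \]
Hence, the function $g(x, y) = \frac{1}{2\pi} u(x, y) = \frac{1}{2\pi} \log\sqrt{x^2 + y^2}$ is a fundamental solution for the Laplacian on $\mathbb{R}^2$.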
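The two-dimensional result can be checked numerically for a radial test function. The sketch below is an illustration under assumptions: it uses the hypothetical choice $\varphi(r) = (1-r^2)^4$ for $r<1$ (zero outside), which is only $C^3$ rather than $C^\infty$ but is smooth enough for the quadrature; for radial $\varphi$ the pairing reduces to $2\pi\int_0^1 \log r\,\Delta\varphi(r)\,r\,dr$, and the expected value is $2\pi\varphi(0) = 2\pi$.

```python
import math

def laplacian_phi(r):
    """Laplacian of phi(r) = (1 - r^2)^4, i.e. (r phi')'/r, for r < 1; zero outside."""
    if r >= 1.0:
        return 0.0
    s = 1.0 - r * r
    return -16.0 * s ** 3 + 48.0 * r * r * s ** 2

def pairing(samples=20000):
    """Midpoint-rule value of the integral of log(r) * Laplacian(phi) over the plane."""
    h = 1.0 / samples
    total = 0.0
    for k in range(samples):
        r = (k + 0.5) * h
        total += math.log(r) * laplacian_phi(r) * r
    return 2.0 * math.pi * h * total

value = pairing()
print(value, 2.0 * math.pi)  # the two numbers should nearly coincide
```

The factor $r$ from the polar area element removes the logarithmic singularity at the origin, so a plain midpoint rule is adequate here.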
In three dimensions, with the aid of spherical coordinates, it can be found that the function
\[ u(x, y, z) = \frac{1}{\sqrt{x^2 + y^2 + z^2}} \]
is such that $\Delta u = -4\pi\,\delta_0$; it is therefore proportional to a fundamental solution on $\mathbb{R}^3$.
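In general, we have the following expressions for the fundamental solutions of the Laplacian:
\[ g(x) = \begin{cases} \dfrac{1}{2}\,r & m = 1 \\[1ex] \dfrac{1}{2\pi} \log r & m = 2 \\[1ex] -\dfrac{1}{4\pi}\,\dfrac{1}{r} & m = 3 \\[1ex] -\dfrac{1}{(m-2)\,\omega_m}\,\dfrac{1}{r^{m-2}} & m \ge 4 \end{cases} \qquad r = \|x\| = \sqrt{\sum_{i=1}^m x_i^2} \tag{2.3.2} \]
where $\omega_m = \dfrac{2\pi^{m/2}}{\Gamma(m/2)}$ is the surface area of the unit sphere in $\mathbb{R}^m$.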
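As an illustration, formula (2.3.2) can be transcribed directly into code. This is a minimal sketch: the function names are hypothetical, the signs follow the convention $\Delta g = \delta_0$ used in this section, and the finite-difference Laplacian is only a sanity check that $g$ is harmonic away from the origin.

```python
import math

def unit_sphere_area(m):
    """omega_m = 2 * pi^(m/2) / Gamma(m/2), surface area of the unit sphere in R^m."""
    return 2.0 * math.pi ** (m / 2.0) / math.gamma(m / 2.0)

def fundamental_solution(x):
    """Fundamental solution of the Laplacian at a point x of R^m, x != 0."""
    m = len(x)
    r = math.sqrt(sum(c * c for c in x))
    if r == 0.0:
        raise ValueError("the fundamental solution is singular at the origin")
    if m == 1:
        return 0.5 * r
    if m == 2:
        return math.log(r) / (2.0 * math.pi)
    return -1.0 / ((m - 2) * unit_sphere_area(m) * r ** (m - 2))

def fd_laplacian(g, x, h=1e-3):
    """Second-order central finite-difference approximation of the Laplacian of g at x."""
    total = 0.0
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        total += (g(xp) - 2.0 * g(x) + g(xm)) / (h * h)
    return total

print(unit_sphere_area(2), unit_sphere_area(3))  # 2*pi and 4*pi
print(fd_laplacian(fundamental_solution, (1.0, 0.5, 0.2)))  # close to 0
```

For $m = 3$ the general branch reproduces $-\frac{1}{4\pi r}$, since $\omega_3 = 4\pi$.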
It is obvious that adding any harmonic function to $g$, i.e., a function $v$ such that $\Delta v \equiv 0$, leads to another fundamental solution of the Laplacian. Actually, we are more interested in the existence rather than in the uniqueness of the fundamental solutions, since their importance is due to the fact that they provide a powerful tool for solving the following more general problem: find $u$ such that $\Delta u = f$ in $O$, where $f$ is a given bounded function with a compact support and integrable on $O$ (i.e., $f \in L^1(O)$).
Note that, given any function $g$ such that $\Delta g = \delta_0$, the new function
\[ G(x, y) := g(x - y) \]
has the following property: if we denote by $\Delta_x$ the Laplacian with respect to the variable $x$, then
\[ \Delta_x G = \delta_y \quad \text{in } \mathcal{D}'(\mathbb{R}^m), \]
because the singularity of $g$ has now been moved from the origin to the point $y$.
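Let us set
\[ u(x) := (f * g)(x) = \int_O g(x - y) f(y)\,dy = \int_O G(x, y) f(y)\,dy; \]
then, for every $\varphi \in \mathcal{D}(O)$,
\[ \langle \Delta u, \varphi\rangle = \langle u, \Delta\varphi\rangle = \int_O \int_O G(x, y) f(y)\,\Delta\varphi(x)\,dx\,dy = \int_O f(y) \left[ \int_O G(x, y)\,\Delta\varphi(x)\,dx \right] dy; \]
the inner integral equals $\langle \Delta_x G, \varphi\rangle = \langle \delta_y, \varphi\rangle = \varphi(y)$, and then
\[ \langle \Delta u, \varphi\rangle = \int_O f(y)\varphi(y)\,dy = \langle f, \varphi\rangle; \]
since this argument holds for every test function $\varphi \in \mathcal{D}(O)$, we conclude that such a $u$ is a solution of the elliptic equation $\Delta u = f$ in the sense of distributions.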
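In dimension $m = 1$ the construction can be tested numerically: with $g(x) = \frac{1}{2}|x|$, the convolution $u = f * g$ should satisfy $u'' = f$. The sketch below is an illustration only; the right-hand side $f(y) = 1 - y^2$ on $(-1, 1)$ and the quadrature parameters are assumptions made here.

```python
def f(y):
    """Hypothetical bounded, compactly supported right-hand side."""
    return 1.0 - y * y if abs(y) < 1.0 else 0.0

def u(x, samples=2000):
    """u(x) = integral over (-1, 1) of (1/2)|x - y| f(y) dy, by the midpoint rule."""
    h = 2.0 / samples
    total = 0.0
    for k in range(samples):
        y = -1.0 + (k + 0.5) * h
        total += 0.5 * abs(x - y) * f(y)
    return h * total

def second_difference(x, h=0.01):
    """Central finite-difference approximation of u''(x)."""
    return (u(x + h) - 2.0 * u(x) + u(x - h)) / (h * h)

x0 = 0.3
print(second_difference(x0), f(x0))  # both close to 0.91
```

The evaluation points $x_0$ and $x_0 \pm h$ are chosen on the quadrature grid so that the kink of $|x - y|$ falls between midpoints, which keeps the quadrature error smooth in $x$.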
2.4 Exercises
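2.1. Prove that the Dirac delta is not a function-like distribution, i.e., that there does not exist any integrable function $f : O \subseteq \mathbb{R}^m \to \mathbb{R}$ such that
\[ \langle \delta_{x_0}, \varphi\rangle = \int_O f(x)\varphi(x)\,dx, \qquad \forall \varphi \in \mathcal{D}(O). \]
2.2. Consider an open set $O$ in $\mathbb{R}^m$ and let $\Sigma$ be an $(m-1)$-dimensional regular manifold contained in $O$. Moreover, let $g$ be an integrable function defined on $\Sigma$; prove that the formula
\[ \langle T_{\Sigma,g}, \varphi\rangle = \int_\Sigma g(\sigma)\,\frac{\partial \varphi}{\partial n}(\sigma)\,d\sigma \]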
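2.5. Let $\Sigma$ be the straight line in the plane having equation $y = 2x$. Define then the distribution $\delta_\Sigma \in \mathcal{D}'(\mathbb{R}^2)$ such that
\[ \langle \delta_\Sigma, \varphi\rangle = \int_\Sigma \varphi(\sigma)\,d\sigma \]
for every test function $\varphi \in \mathcal{D}(\mathbb{R}^2)$.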
u
= in D (IR2 ).
x
v v
+ = in D (IR2 ).
x y
u
= 0 in D (IR2 ).
x
Which is the support of u?
2.8. Prove that the Laplacian in polar coordinates is given by equation (2.3.1).
Chapter 3
Sobolev Spaces
3.1 Motivation
In order to motivate the introduction of the Sobolev space $H^1(\Omega)$, let us consider the following Dirichlet boundary-value problem for a general second-order elliptic operator $Lu$:
\[ \begin{cases} Lu = -\nabla\cdot(A\nabla u) + \nabla\cdot(a u) + a_0 u = f & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega. \end{cases} \tag{3.1.1} \]
Here, $A$, $a$, $a_0$ and $f$ are known functions defined in $\Omega$; precisely, $A$ takes its values in the space of symmetric and positive-definite matrices of order $n$, $a$ is a vector-valued function, whereas $a_0$ and $f$ are scalar functions.
We aim at giving a weak (or integral, or variational) formulation of this problem, which
corresponds to the general form (1.2.16). At the beginning, we will proceed in a formal manner,
assuming that all mathematical operations are permitted; then, step by step, we will envisage a
set of assumptions on the data of the problem (the coefficients of the operator, the right-hand
side, the domain) which make the resulting formulation mathematically rigorous.
The starting point consists of multiplying the first equation in (3.1.1) by a test function $v$ and integrating over $\Omega$, to get
\[ -\int_\Omega \nabla\cdot(A\nabla u)\,v + \int_\Omega \nabla\cdot(a u)\,v + \int_\Omega a_0 u v = \int_\Omega f v. \tag{3.1.2} \]
Next, we perform an integration by parts in the first and second term on the left-hand side. Precisely, we invoke the divergence theorem
\[ \int_\Omega \nabla\cdot F = \int_{\partial\Omega} F \cdot n, \tag{3.1.3} \]
where $F$ is a vector field and $n$ is the unit vector which is normal to $\partial\Omega$ and pointing outwards, as well as the differentiation rule for a product
\[ \nabla\cdot(\beta v) = (\nabla\cdot\beta)\,v + \beta\cdot\nabla v, \tag{3.1.4} \]
where $\beta$ is a vector field and $v$ is a scalar function. Applying (3.1.3) and (3.1.4) to $F = \beta v$ with $\beta = A\nabla u$, we obtain
\[ \int_\Omega \nabla\cdot(A\nabla u)\,v + \int_\Omega (A\nabla u)\cdot\nabla v = \int_{\partial\Omega} n\cdot(A\nabla u)\,v; \]
Now, we observe that $u$ is required to vanish on $\partial\Omega$; therefore, from now on, we will require that our test functions $v$ vanish on $\partial\Omega$, too (note that functions in $\mathcal{D}(\Omega)$ do satisfy this condition). Then, (3.1.8) simplifies as
\[ \int_\Omega (A\nabla u)\cdot\nabla v - \int_\Omega u\,a\cdot\nabla v + \int_\Omega a_0 u v = \int_\Omega f v. \tag{3.1.9} \]
Note that this equation only involves first-order partial derivatives of $u$ and $v$.
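Next, we make assumptions on the functions appearing in (3.1.9), so that all integrals therein are guaranteed to be meaningful and finite. On the left-hand side, we have integrals of products of three functions, such as
\[ \int_\Omega a_{ij}\,\frac{\partial u}{\partial x_j}\,\frac{\partial v}{\partial x_i} \quad \text{or} \quad \int_\Omega a_i\,u\,\frac{\partial v}{\partial x_i} \quad \text{or} \quad \int_\Omega a_0 u v, \]
whereas the right-hand side is the integral of the product of two functions. Thus, we place ourselves in the framework of the Lebesgue integration theory, which, in particular, ensures that the product $\varphi\psi$ of two functions is integrable in $\Omega$, i.e., $\varphi\psi \in L^1(\Omega)$, if $\varphi \in L^p(\Omega)$ and $\psi \in L^{p'}(\Omega)$ with $p, p' \in [1, \infty]$ satisfying $\frac{1}{p} + \frac{1}{p'} = 1$; furthermore, the following Hölder inequality holds:
\[ \left| \int_\Omega \varphi\psi \right| \le \int_\Omega |\varphi|\,|\psi| \le \left( \int_\Omega |\varphi|^p \right)^{1/p} \left( \int_\Omega |\psi|^{p'} \right)^{1/p'} = \|\varphi\|_{L^p(\Omega)} \|\psi\|_{L^{p'}(\Omega)} \tag{3.1.10} \]
(if $p = \infty$, the term $\left(\int_\Omega |\varphi|^p\right)^{1/p}$ has to be replaced by $\operatorname{ess\,sup}_\Omega |\varphi|$, and similarly if $p' = \infty$). This result extends to the product of three functions, i.e., $\varphi\psi\chi \in L^1(\Omega)$ if $\varphi \in L^p(\Omega)$, $\psi \in L^{p'}(\Omega)$ and $\chi \in L^{p''}(\Omega)$ with $p, p', p'' \in [1, \infty]$ satisfying $\frac{1}{p} + \frac{1}{p'} + \frac{1}{p''} = 1$; in this case, one has
\[ \left| \int_\Omega \varphi\psi\chi \right| \le \|\varphi\|_{L^p(\Omega)} \|\psi\|_{L^{p'}(\Omega)} \|\chi\|_{L^{p''}(\Omega)}. \tag{3.1.11} \]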
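The Hölder inequality (3.1.10) can be illustrated with a quick numerical experiment on $\Omega = (0, 1)$. The sketch below uses hypothetical sample functions ($x^{-1/4}$, which lies in $L^3(0,1)$, and $\cos 5x$) and conjugate exponents $p = 3$, $p' = 3/2$; the midpoint sums are approximations of the integrals, not exact values.

```python
import math

SAMPLES = 10000
H = 1.0 / SAMPLES
NODES = [(k + 0.5) * H for k in range(SAMPLES)]  # midpoints of a uniform grid on (0, 1)

def lp_norm(fn, p):
    """Midpoint approximation of the L^p(0, 1) norm of fn."""
    return (H * sum(abs(fn(x)) ** p for x in NODES)) ** (1.0 / p)

def abs_product_integral(f1, f2):
    """Midpoint approximation of the integral of |f1 f2| over (0, 1)."""
    return H * sum(abs(f1(x) * f2(x)) for x in NODES)

def phi(x):
    return x ** -0.25          # unbounded near 0, but in L^3(0, 1)

def psi(x):
    return math.cos(5.0 * x)

p, p_conj = 3.0, 1.5           # 1/3 + 2/3 = 1
lhs = abs_product_integral(phi, psi)
rhs = lp_norm(phi, p) * lp_norm(psi, p_conj)
print(lhs, rhs)
```

Since the discrete sums themselves obey the discrete Hölder inequality, the printed left-hand side never exceeds the right-hand side, whatever the grid resolution.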
The structure of the integrals in (3.1.9) suggests working in a Hilbertian setting, i.e., assuming that $u$, $v$ and their first derivatives belong to $L^2(\Omega)$. More precisely, the previous results tell us that
3.2. THE SPACE H1 () 39
$\int_\Omega f v$ is well-defined if $f$ and $v \in L^2(\Omega)$; $\int_\Omega a_0 u v$ is well-defined if $a_0 \in L^\infty(\Omega)$ and $u, v \in L^2(\Omega)$; an integral of the form $\int_\Omega a_i\,u\,\frac{\partial v}{\partial x_i}$ is well-defined if $a_i \in L^\infty(\Omega)$, $u \in L^2(\Omega)$ and $\frac{\partial v}{\partial x_i} \in L^2(\Omega)$; finally, an integral of the form $\int_\Omega a_{ij}\,\frac{\partial u}{\partial x_j}\,\frac{\partial v}{\partial x_i}$ is well-defined if $a_{ij} \in L^\infty(\Omega)$, $\frac{\partial u}{\partial x_j}$ and $\frac{\partial v}{\partial x_i} \in L^2(\Omega)$. In conclusion, if we assume that
then $u$ and $v$ should belong to $L^2(\Omega)$ together with all their first-order partial derivatives. Such derivatives have to be considered in the sense of distributions, since $u$ and $v$ are merely $L^2$-integrable functions, and not classical differentiable functions.
This leads us to introduce the Sobolev space $H^1(\Omega)$ and, subsequently, its closed subspace $H^1_0(\Omega)$ of the functions vanishing on $\partial\Omega$. This will be the appropriate space for setting the weak formulation of problem (3.1.1) and for studying its well-posedness.
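Definition 3.2.1. $H^1(\Omega)$ is the subspace of $L^2(\Omega)$ of the functions whose first-order partial derivatives, in the distributional sense, belong to $L^2(\Omega)$, i.e.,
\[ H^1(\Omega) = \left\{ v \in L^2(\Omega) : \frac{\partial v}{\partial x_i} \in L^2(\Omega) \text{ for } 1 \le i \le n \right\} = \left\{ v \in L^2(\Omega) : \nabla v \in (L^2(\Omega))^n \right\}. \]
We point out that the requirement $\partial v/\partial x_i \in L^2(\Omega)$ means that there exists $g_i \in L^2(\Omega)$ such that $\partial T_v/\partial x_i = T_{g_i}$ in the sense of distributions, i.e.,
\[ \left\langle \frac{\partial}{\partial x_i} T_v, \varphi \right\rangle = -\left\langle T_v, \frac{\partial \varphi}{\partial x_i} \right\rangle = -\int_\Omega v\,\frac{\partial \varphi}{\partial x_i} = \int_\Omega g_i\,\varphi = \langle T_{g_i}, \varphi\rangle \qquad \forall \varphi \in \mathcal{D}(\Omega); \]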
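Proof. Let $\{v_k\}_{k\ge 0}$ be a Cauchy sequence in the $H^1(\Omega)$-norm, i.e., $\forall \varepsilon > 0$, $\exists k \in \mathbb{N}$ such that $\forall \ell, m > k$ one has $\|v_\ell - v_m\|_{H^1(\Omega)} < \varepsilon$. This immediately implies that each sequence $\{v_k\}_{k\ge 0}$, $\{\partial v_k/\partial x_i\}_{k\ge 0}$ for $i = 1, \dots, n$, is a Cauchy sequence in $L^2(\Omega)$. By the completeness of this space, there exist functions $v$ and $g_i$, $i = 1, \dots, n$, belonging to $L^2(\Omega)$, such that
\[ \lim_{k\to\infty} v_k = v, \qquad \lim_{k\to\infty} \frac{\partial v_k}{\partial x_i} = g_i, \quad i = 1, \dots, n, \]
in $L^2(\Omega)$. The property is proven if we prove that $\partial v/\partial x_i = g_i$ for $i = 1, \dots, n$. This follows from
\[ \begin{aligned} \left\langle \frac{\partial v}{\partial x_i}, \varphi \right\rangle &= -\int_\Omega v\,\frac{\partial \varphi}{\partial x_i} = -\int_\Omega \lim_{k\to\infty} v_k\,\frac{\partial \varphi}{\partial x_i} = -\lim_{k\to\infty} \int_\Omega v_k\,\frac{\partial \varphi}{\partial x_i} \\ &= \lim_{k\to\infty} \int_\Omega \frac{\partial v_k}{\partial x_i}\,\varphi = \int_\Omega \lim_{k\to\infty} \frac{\partial v_k}{\partial x_i}\,\varphi = \int_\Omega g_i\,\varphi = \langle g_i, \varphi\rangle \qquad \forall \varphi \in \mathcal{D}(\Omega). \end{aligned} \]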
The next property, which we state without proof, is important both from the theoretical and the constructive/numerical point of view; indeed, it guarantees that functions in $H^1(\Omega)$ can be approximated arbitrarily well by functions belonging to a sequence of finite dimensional subspaces.
Property 3.2.3. $H^1(\Omega)$ is separable, i.e., it contains a sequence $\{v_k\}_{k\ge 0}$ which is dense in it.
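Let us improve our knowledge of the space $H^1(\Omega)$ by observing that it contains classical differentiable functions. Indeed, if $\Omega$ is bounded, any function $v \in C^1(\overline{\Omega})$ belongs to $H^1(\Omega)$, and one has
\[ \|v\|_{H^1(\Omega)} \le |\Omega|^{1/2} \left( \|v\|^2_{C^0(\overline{\Omega})} + \sum_{i=1}^n \left\| \frac{\partial v}{\partial x_i} \right\|^2_{C^0(\overline{\Omega})} \right)^{1/2} \le (n + 1)\,|\Omega|^{1/2}\,\|v\|_{C^1(\overline{\Omega})}. \]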
So far, we have seen that sufficiently smooth functions, in a classical sense, belong to $H^1(\Omega)$. On the other hand, $H^1(\Omega)$ also contains piecewise smooth functions, provided they are globally continuous. The following result illustrates the situation.
Property 3.2.6. Let $\Omega$ be a bounded open set, which is divided into two open subsets $\Omega_-$ and $\Omega_+$ by a smooth $(n-1)$-dimensional manifold $\Sigma$. Given two functions $v_\pm \in C^1(\overline{\Omega}_\pm)$, the function $v$ defined as
\[ v(x) = \begin{cases} v_-(x) & \text{if } x \in \Omega_-, \\ v_+(x) & \text{if } x \in \Omega_+, \end{cases} \]
belongs to $H^1(\Omega)$ if and only if $v$ is continuous across $\Sigma$.
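Property 3.2.7. If $\Omega = I$ is a bounded interval, then $H^1(I) \subset C^{0,1/2}(\overline{I})$ with continuous injection, where $C^{0,1/2}(\overline{I})$ is the space of the Hölder continuous functions of exponent $1/2$ in $\overline{I}$.
Proof. Let us fix $v \in H^1(I)$ and let us set $g = v' \in L^2(I)$. Let us define the function
\[ w(x) = \int_{x_0}^{x} g(s)\,ds, \]
where $x_0$ is any fixed point in $\overline{I}$. Since $v' = w'$ in the distributional sense, there exists a constant $C$ such that $v(x) = w(x) + C$ in $I$. Thus, for any two points $x_1, x_2 \in \overline{I}$,
\[ v(x_1) - v(x_2) = w(x_1) - w(x_2) = \int_{x_2}^{x_1} g(s)\,ds, \]
whence, by the Cauchy-Schwarz inequality, $|v(x_1) - v(x_2)| \le |x_1 - x_2|^{1/2}\,\|g\|_{L^2(I)}$.
This precisely means that $v$ is Hölder continuous of exponent $1/2$ in $\overline{I}$, and that
\[ |v|_{C^{0,1/2}(\overline{I})} := \sup_{x_1, x_2 \in \overline{I}} \frac{|v(x_1) - v(x_2)|}{|x_1 - x_2|^{1/2}} \le \|v'\|_{L^2(I)}. \tag{3.2.1} \]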
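Estimate (3.2.1) can be tested on a concrete function. As a sketch, take the hypothetical example $v(x) = x^{3/4}$ on $I = (0, 1)$: its derivative $\frac{3}{4}x^{-1/4}$ is unbounded but square-integrable, with $\|v'\|^2_{L^2(I)} = \frac{9}{16}\int_0^1 x^{-1/2}\,dx = \frac{9}{8}$ (computed in closed form below), so $v \in H^1(I)$ although $v \notin C^1(\overline{I})$.

```python
import math

def v(x):
    """Sample H^1(0, 1) function with unbounded derivative at 0 (illustrative choice)."""
    return x ** 0.75

def holder_quotient_sup(points=201):
    """Largest value of |v(x1) - v(x2)| / |x1 - x2|^(1/2) over a uniform grid on [0, 1]."""
    xs = [k / (points - 1) for k in range(points)]
    best = 0.0
    for i, x1 in enumerate(xs):
        for x2 in xs[:i]:
            best = max(best, abs(v(x1) - v(x2)) / math.sqrt(x1 - x2))
    return best

seminorm = holder_quotient_sup()
derivative_norm = math.sqrt(9.0 / 8.0)  # ||v'||_{L^2(0,1)} in closed form
print(seminorm, derivative_norm)
```

The grid supremum stays below $\|v'\|_{L^2(I)} \approx 1.06$, consistent with (3.2.1).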
We will see later on (Thm. 3.8.2) that functions in $H^1(\Omega)$, although not necessarily continuous, belong to some space $L^p(\Omega)$ with $p > 2$ depending on $n$.
Another fundamental result is the following one. Let $\mathcal{D}(\overline{\Omega})$ be the space of the $C^\infty$-functions defined in $\overline{\Omega}$, whose support is compact and contained in $\overline{\Omega}$ (thus, they are allowed to be nonzero on $\partial\Omega$). Note that $\mathcal{D}(\overline{\Omega}) = C^\infty(\overline{\Omega})$ if $\Omega$ is bounded, whereas $\mathcal{D}(\overline{\Omega}) = \mathcal{D}(\Omega)$ iff $\Omega = \mathbb{R}^n$. The space $\mathcal{D}(\overline{\Omega})$ can equivalently be defined as the space of the restrictions to $\overline{\Omega}$ of the functions in $\mathcal{D}(\mathbb{R}^n)$.
Property 3.2.8. $\mathcal{D}(\overline{\Omega})$ is a dense subspace of $H^1(\Omega)$.
We will give some ideas of the proof, in some particular cases, later on. The property states that any function in $H^1(\Omega)$ can be approximated arbitrarily well by smooth classical functions. This will allow us to pass to the limit and extend results which are well-known for classical functions to analogous results for functions in $H^1(\Omega)$.
\[ H^m(\Omega) = \{ v \in L^2(\Omega) : D^\alpha v \in L^2(\Omega) \text{ for } |\alpha| \le m \} \]
3.4. THE SPACES HS (IRN ) 43
(we use here the convention that $D^0 v = v$). In this way, we obtain a separable Hilbert space which enjoys properties similar or equal to those seen for $H^1(\Omega)$: for instance, it contains $\mathcal{D}(\overline{\Omega})$ as a dense subspace and, if $\Omega$ is bounded, it contains $C^m(\overline{\Omega})$ but also those functions of $C^{m-1}(\overline{\Omega})$ which are piecewise $C^m$-differentiable. Furthermore, all functions in $H^m(\Omega)$ enjoy classical differentiability of some order $< m$ (which depends on the space dimension $n$); for instance, in dimension $n = 2$, the space $H^2(\Omega)$ is contained in $C^0(\overline{\Omega})$ with continuous inclusion (but not in $C^1(\overline{\Omega})$). The precise result will be given in Thm. 3.8.2.
We thus have a scale of function spaces, in which smoothness is measured in a weak, integral sense; each space is strictly contained in all the spaces of lower index, with continuous inclusion. Such a scale is the counterpart of the classical scale of spaces $C^m(\overline{\Omega})$, in which smoothness is measured in a strong, pointwise sense. Precisely, the two sequences of spaces satisfy
\[ \begin{aligned} \dots \subset H^{m+1}(\Omega) \subset H^m(\Omega) \subset H^{m-1}(\Omega) \subset \dots \subset H^1(\Omega) \subset H^0(\Omega) = L^2(\Omega) \\ \dots \subset C^{m+1}(\overline{\Omega}) \subset C^m(\overline{\Omega}) \subset C^{m-1}(\overline{\Omega}) \subset \dots \subset C^1(\overline{\Omega}) \subset C^0(\overline{\Omega}) \end{aligned} \]
and if $\Omega$ is bounded each space of the lower sequence is contained with continuous inclusion in the space above it in the upper sequence. Working in the Sobolev scale rather than in the classical one is more appropriate for handling the weak, or integral, formulation of an elliptic boundary value problem; in particular, the Sobolev scale consists of Hilbert spaces, whereas the classical scale consists merely of non-reflexive Banach spaces.
A further generalization comes from replacing $L^2(\Omega)$ by some $L^p(\Omega)$ with $p \in [1, +\infty]$ in the definition of the Sobolev space. Thus, we set
\[ W^{m,p}(\Omega) = \{ v \in L^p(\Omega) : D^\alpha v \in L^p(\Omega) \text{ for } |\alpha| \le m \} \]
equipped with the norm
\[ \|v\|_{W^{m,p}(\Omega)} = \left( \sum_{|\alpha| \le m} \|D^\alpha v\|^p_{L^p(\Omega)} \right)^{1/p}. \]
Such a space is a Banach space, which, like $L^p(\Omega)$, is reflexive if $1 < p < +\infty$ and is non-reflexive if $p = 1$ or $p = +\infty$. Note that $W^{m,2}(\Omega) = H^m(\Omega)$. Sobolev spaces of summability index $p \neq 2$ play a crucial role in studying nonlinear partial differential equations.
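and
\[ \|v\|_{H^1(\mathbb{R}^n)} = \left( \int_{\mathbb{R}^n} (1 + \|\xi\|^2)\,|\hat{v}(\xi)|^2\,d\xi \right)^{1/2}. \]
Proof. It is enough to prove that, for each $k = 1, \dots, n$, $\frac{\partial v}{\partial x_k} \in L^2(\mathbb{R}^n)$ in the distributional sense iff the function $\xi \mapsto \xi_k \hat{v}(\xi)$ belongs to $L^2(\mathbb{R}^n)$, with identical norm. Let us assume that $\frac{\partial v}{\partial x_k} = g_k \in L^2(\mathbb{R}^n)$; then, using (3.4.2) and (3.4.1), for all $\varphi \in \mathcal{D}(\mathbb{R}^n)$ one has on the one side
\[ \left\langle \frac{\partial v}{\partial x_k}, \varphi \right\rangle = -\int_{\mathbb{R}^n} v(x)\,\frac{\partial \varphi}{\partial x_k}(x)\,dx = -\int_{\mathbb{R}^n} \hat{v}(\xi)\,\overline{\widehat{\frac{\partial \varphi}{\partial x_k}}(\xi)}\,d\xi = \int_{\mathbb{R}^n} i\,\xi_k\,\hat{v}(\xi)\,\overline{\hat{\varphi}(\xi)}\,d\xi \]
By equating the last two expressions and by recalling that $\varphi$ is arbitrary, we get $i\,\xi_k \hat{v}(\xi) = \hat{g}_k(\xi)$ almost everywhere in $\mathbb{R}^n$, and therefore the function $\xi \mapsto \xi_k \hat{v}(\xi)$ belongs to $L^2(\mathbb{R}^n)$. Conversely, if this happens, one sets $\hat{g}_k(\xi) = i\,\xi_k \hat{v}(\xi)$, so that its inverse transform $g_k(x)$ belongs to $L^2(\mathbb{R}^n)$ and satisfies $g_k = \frac{\partial v}{\partial x_k}$.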
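The argument given in the proof shows that
\[ \text{for all } v \in H^1(\mathbb{R}^n), \qquad \widehat{\frac{\partial v}{\partial x_k}}(\xi) = i\,\xi_k\,\hat{v}(\xi) \qquad \forall \xi \in \mathbb{R}^n, \; k = 1, \dots, n. \tag{3.4.3} \]
\[ \widehat{(D^\alpha v)}(\xi) = i^{|\alpha|}\,\xi^\alpha\,\hat{v}(\xi), \]
which generalizes (3.4.3). Thus, any Sobolev space $H^m(\mathbb{R}^n)$ can be characterized as the subset of $L^2(\mathbb{R}^n)$ of those functions satisfying $\xi^\alpha \hat{v}(\xi) \in L^2(\mathbb{R}^n)$ for all $\alpha \in \mathbb{N}^n$ such that $|\alpha| \le m$; the norm in $H^m(\mathbb{R}^n)$ could be represented as
\[ \|v\|_{H^m(\mathbb{R}^n)} = \left( \int_{\mathbb{R}^n} \sum_{|\alpha| \le m} |\xi|^{2\alpha}\,|\hat{v}(\xi)|^2\,d\xi \right)^{1/2}, \]
where $|\xi| = (|\xi_1|, |\xi_2|, \dots, |\xi_n|)$. An equivalent but simpler expression of the norm is preferred, which can be derived by applying the following technical lemma, whose elementary proof is left to the reader.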
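Lemma 3.4.2. There exist constants $c, C > 0$ depending only on $n$ and $m$ such that
\[ c\,\left(1 + \|\xi\|^2\right)^m \le \sum_{|\alpha| \le m} |\xi|^{2\alpha} \le C\,\left(1 + \|\xi\|^2\right)^m, \qquad \forall \xi \in \mathbb{R}^n. \]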
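The two-sided bound of Lemma 3.4.2 can be explored numerically. The following sketch samples the ratio $\sum_{|\alpha|\le m} |\xi|^{2\alpha} / (1+\|\xi\|^2)^m$ on a grid for $n = 2$, $m = 2$; the grid and the helper names are assumptions made here, and a sampled ratio is of course only evidence for, not a proof of, the existence of $c$ and $C$.

```python
import itertools

def multi_indices(n, m):
    """All multi-integers alpha in N^n with |alpha| = alpha_1 + ... + alpha_n <= m."""
    return [a for a in itertools.product(range(m + 1), repeat=n) if sum(a) <= m]

def monomial_sum(xi, m):
    """Sum over |alpha| <= m of |xi|^(2 alpha) = prod_k |xi_k|^(2 alpha_k)."""
    total = 0.0
    for alpha in multi_indices(len(xi), m):
        term = 1.0
        for x, a in zip(xi, alpha):
            term *= abs(x) ** (2 * a)
        total += term
    return total

def ratio(xi, m):
    return monomial_sum(xi, m) / (1.0 + sum(x * x for x in xi)) ** m

grid = [-10.0 + 0.5 * j for j in range(41)]
ratios = [ratio((a, b), 2) for a in grid for b in grid]
lo, hi = min(ratios), max(ratios)
print(lo, hi)
```

On this grid the ratio stays between roughly $2/3$ and $1$, so here one may take, for instance, $c = 1/2$ and $C = 1$.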
The result allows us to characterize $H^m(\mathbb{R}^n)$ in an equivalent manner as the subset of $L^2(\mathbb{R}^n)$ of those functions such that $\left(1 + \|\xi\|^2\right)^{m/2} \hat{v}(\xi) \in L^2(\mathbb{R}^n)$, and to use the $L^2(\mathbb{R}^n)$-norm of this function as an equivalent norm in $H^m(\mathbb{R}^n)$.
At this point, a remarkable observation can be made, namely, that the latter characterization does not require the parameter $m$ to be an integer: any real value of $m$ is admissible. This leads us to extend the definition of Sobolev spaces given so far, to the case of real positive indices.
In this way, we obtain a continuous family of separable Hilbert spaces, which satisfy
The last relation shows that the Sobolev spaces $H^s(\mathbb{R}^n)$ of non-integer index can be viewed as a sort of interpolating spaces between consecutive Sobolev spaces of integer index. The concept can be made rigorous, within the so-called Theory of Space Interpolation.
One can further extend the definition of Sobolev space to the case of negative indices, by setting
\[ H^s(\mathbb{R}^n) = \left( H^{|s|}(\mathbb{R}^n) \right)' \quad \text{if } s < 0, \]
where $X'$ denotes the dual space of the Hilbert space $X$; equivalently, $H^s(\mathbb{R}^n)$ can be defined as the space of the distributions whose Fourier transform (defined in a suitable sense) makes the right-hand side of (3.4.4) finite.
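\[ 0 \le \chi_R(t) \le 1 \quad \forall t \in \mathbb{R}, \qquad \chi_R(t) \equiv 1 \text{ if } |t| \le R, \qquad \chi_R(t) \equiv 0 \text{ if } |t| \ge R + 1. \]
Then, given any $v \in H^1(\mathbb{R}^n)$, one can prove that the function $v_R(x) = \chi_R(\|x\|)\,v(x)$ belongs to $H^1(\mathbb{R}^n)$ and is supported in $B(0, R + 1)$; furthermore, $\|v - v_R\|_{H^1(\mathbb{R}^n)} \to 0$ as $R \to +\infty$.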
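Regularization. Given any $\varepsilon > 0$, let $\rho_\varepsilon(x)$ be any non-negative function in $\mathcal{D}(\mathbb{R}^n)$ satisfying
\[ \operatorname{supp}\rho_\varepsilon \subset B(0, \varepsilon), \qquad \int_{\mathbb{R}^n} \rho_\varepsilon(x)\,dx = 1. \]
An example of such a function is obtained by properly scaling the function given in Example 1.2.1. Note that as $\varepsilon \to 0$, $\rho_\varepsilon$ converges in $\mathcal{D}'(\mathbb{R}^n)$ to the distribution $\delta_0$.
Then, given any $v \in H^1(\mathbb{R}^n)$, one can prove that the convolution function
\[ v_\varepsilon(x) = (\rho_\varepsilon * v)(x) = \int_{\mathbb{R}^n} \rho_\varepsilon(x - y)\,v(y)\,dy \]
belongs to $H^1(\mathbb{R}^n)$ and is infinitely differentiable at every $x \in \mathbb{R}^n$; furthermore, $\|v - v_\varepsilon\|_{H^1(\mathbb{R}^n)} \to 0$ as $\varepsilon \to 0$.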
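The regularization step can be sketched in one dimension. The code below is an illustration under assumptions: the mollifier is the classical bump $\exp(-1/(1-t^2))$ rescaled to $(-\varepsilon, \varepsilon)$ and normalized numerically, the function being regularized is the hat function $\max(0, 1-|x|)$ (which belongs to $H^1(\mathbb{R})$), and the error is measured in the sup norm on a grid rather than in the $H^1$ norm.

```python
import math

def bump(t):
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0

def integrate(fn, a, b, samples=400):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / samples
    return h * sum(fn(a + (k + 0.5) * h) for k in range(samples))

BUMP_MASS = integrate(bump, -1.0, 1.0)  # numerical normalization constant

def rho(x, eps):
    """Mollifier supported in (-eps, eps) with (numerically) unit integral."""
    return bump(x / eps) / (eps * BUMP_MASS)

def hat(x):
    return max(0.0, 1.0 - abs(x))

def mollify(v, x, eps):
    """(rho_eps * v)(x): only the window [x - eps, x + eps] contributes."""
    return integrate(lambda y: rho(x - y, eps) * v(y), x - eps, x + eps)

def sup_error(eps, points=201):
    xs = [-2.0 + 4.0 * k / (points - 1) for k in range(points)]
    return max(abs(mollify(hat, x, eps) - hat(x)) for x in xs)

eps_values = (0.5, 0.1, 0.02)
errs = [sup_error(eps) for eps in eps_values]
for eps, err in zip(eps_values, errs):
    print(eps, err)
```

Because the hat function is Lipschitz with constant 1, the error at each point is at most $\varepsilon$, and the printed values decrease proportionally to $\varepsilon$.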
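Finally, we combine the two previous approximations by considering functions $v_{R,\varepsilon} = (v_R)_\varepsilon$ obtained by first truncating a function $v \in H^1(\mathbb{R}^n)$, and then regularizing the result. Since both $v_R$ and $\rho_\varepsilon$ are compactly supported, so is $v_{R,\varepsilon}$; precisely, $\operatorname{supp} v_{R,\varepsilon} \subset B(0, R + 1 + \varepsilon)$. Thus, $v_{R,\varepsilon}$ belongs to $\mathcal{D}(\mathbb{R}^n)$.
An appropriate choice of $\varepsilon = \varepsilon(R)$, such that $\varepsilon(R) \to 0$ as $R \to \infty$, shows that $v$ can be approximated in $H^1(\mathbb{R}^n)$ to any prescribed precision by a function $v_{R,\varepsilon}$ for a sufficiently large $R$.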
A number of properties of $H^1(\mathbb{R}^n_+)$, such as Property 3.2.8, can be obtained from the analogous properties of $H^1(\mathbb{R}^n)$ after introducing a suitable prolongation operator which extends the functions belonging to $H^1(\mathbb{R}^n_+)$ into functions belonging to $H^1(\mathbb{R}^n)$. Precisely, given any $v \in H^1(\mathbb{R}^n_+)$, let us set
\[ (Pv)(x) = v(x', |x_n|) \qquad \forall x = (x', x_n) \in \mathbb{R}^n. \]
Thus, $P$ realizes an extension of $v$ by an even reflection around the boundary $\partial\mathbb{R}^n_+$. It is easy to check that $Pv \in H^1(\mathbb{R}^n)$ iff $v \in H^1(\mathbb{R}^n_+)$, and that
\[ \|Pv\|_{H^1(\mathbb{R}^n)} \le 2\,\|v\|_{H^1(\mathbb{R}^n_+)} \qquad \forall v \in H^1(\mathbb{R}^n_+), \]
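Proof. Given any $\varphi \in \mathcal{D}(\overline{\mathbb{R}^n_+})$, let $A > 0$ be such that $\operatorname{supp}\varphi \subset \mathbb{R}^{n-1} \times [0, A]$. For any $x' \in \mathbb{R}^{n-1}$, one has by the fundamental theorem of integral calculus in one dimension,
\[ -\varphi^2(x', 0) = \varphi^2(x', A) - \varphi^2(x', 0) = \int_0^A \frac{\partial \varphi^2}{\partial x_n}(x', x_n)\,dx_n = 2 \int_0^A \varphi\,\frac{\partial \varphi}{\partial x_n}(x', x_n)\,dx_n; \]
Recalling Proposition 3.4.1, the first integral on the right-hand side equals
\[ \int_{\mathbb{R}^n_+} \left[ \varphi^2 + \sum_{k=1}^{n-1} \left( \frac{\partial \varphi}{\partial x_k} \right)^2 \right](x)\,dx; \]
on the other hand, the second integral on the right-hand side equals
\[ \int_{\mathbb{R}^n_+} \left( \frac{\partial \varphi}{\partial x_n} \right)^2(x)\,dx, \]
termed trace operator, such that $\gamma(\varphi) = \varphi|_{\partial\mathbb{R}^n_+}$ for all $\varphi \in \mathcal{D}(\overline{\mathbb{R}^n_+})$. It is surjective onto $H^{1/2}(\partial\mathbb{R}^n_+)$ and admits a continuous right-inverse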
Definition 3.6.1. A bounded open domain $\Omega$ is said to be of class $C^m$ (or simply a $C^m$-domain) for $m \ge 1$ if there exists a finite covering of $\overline{\Omega}$ by open bounded sets $A_i$, $i = 0, 1, \dots, I$, such that
i) $A_0 \subset \overline{A}_0 \subset \Omega$;
ii) for each $i = 1, \dots, I$, there exists a mapping $\Phi_i : A_i \to B(0, 1)$ with the following properties:
a) $\Phi_i$ is bijective;
b) $\Phi_i$ is of class $C^m$, with inverse $\Phi_i^{-1}$ also of class $C^m$;
Definition 3.6.2. A bounded open domain $\Omega$ is said to be a Lipschitz domain if there exists a finite covering of $\overline{\Omega}$ by open bounded sets $A_i$, $i = 0, 1, \dots, I$, such that
i) $A_0 \subset \overline{A}_0 \subset \Omega$;
ii) for each $i = 1, \dots, I$, there exists a mapping $\Phi_i : A_i \to B(0, 1)$ with the following properties:
a) $\Phi_i$ is bijective;
b) $\Phi_i$ is of class $C^1$, with inverse $\Phi_i^{-1}$ also of class $C^1$;
c) there exists a Lipschitz-continuous function $g_i : \mathbb{R}^{n-1} \to \mathbb{R}$ such that $g_i(0) = 0$, and
\[ \Phi_i(A_i \cap \partial\Omega) = \{ y = (y', y_n) \in B(0, 1) : y_n = g_i(y') \}. \]
Obviously, a $C^1$-domain is a particular case of Lipschitz domain, where one can take each $g_i \equiv 0$.
The concept of $C^1$-domain (of Lipschitz domain, resp.) can be equivalently expressed by saying that locally its boundary is the graph of a $C^1$-function (a Lipschitz-continuous function, resp.), such that the domain lies on one side of the graph.
One can prove that any convex domain is a Lipschitz domain.
Examples of domains which are neither $C^1$ nor Lipschitz are those containing cusp points, such as
\[ \Omega = \{ (x, y) \in \mathbb{R}^2 : x^2 + (y - 1)^2 < 1 \text{ and } (2x)^2 + (2y - 1)^2 > 1 \}, \]
i.e., the region between two circumferences (of radii $1$ and $\frac{1}{2}$) which are tangent at the origin. The origin is a cusp point, such that in none of its neighborhoods the boundary $\partial\Omega$ can be represented as the graph of a function.
The concept of partition of unity provides the tool which allows us to localize the study of a function defined in $\Omega$.
Definition 3.6.3. A partition of unity associated with the covering $\{A_i\}_{i=0,1,\dots,I}$ of $\overline{\Omega}$ is a set of nonnegative $C^\infty$-functions $\psi_i : \mathbb{R}^n \to \mathbb{R}$ such that
i) $\operatorname{supp}\psi_i \subset A_i$;
ii) $\displaystyle\sum_{i=0}^{I} \psi_i(x) = 1 \quad \forall x \in \overline{\Omega}$.
Property 3.6.4. Given any finite covering of $\overline{\Omega}$, there exists a partition of unity associated with it.
3.6. SOBOLEV SPACES ON BOUNDED DOMAINS 51
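Example 3.6.5. Let us exhibit a simple partition of unity associated with a covering of an interval of the real line. Let us start by considering the even function
\[ \eta(t) = \begin{cases} \exp\left( \dfrac{1}{t^2 - 1} \right) & |t| < 1, \\ 0 & |t| \ge 1; \end{cases} \]
\[ \psi_0(x) = \theta(x + \tfrac{3}{4}) + \theta(\tfrac{3}{4} - x) - 1, \qquad \psi_1(x) = \theta(x - \tfrac{3}{4}) + \theta(\tfrac{5}{4} - x) - 1, \qquad \psi_2(x) = \theta(x + \tfrac{5}{4}) + \theta(-\tfrac{3}{4} - x) - 1. \]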
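A partition of unity of the kind described in Example 3.6.5 can be built explicitly. The sketch below is a reconstruction under assumptions, not the construction of the notes: it uses a smooth step `theta` that rises from 0 at $t = -\frac{1}{4}$ to 1 at $t = \frac{1}{4}$ and satisfies $\theta(t) + \theta(-t) = 1$ (obtained here from the standard quotient $h(t)/(h(t)+h(1-t))$ with $h(t) = e^{-1/t}$), and it combines shifted copies of it into three functions supported in $(-1, 1)$, $(\frac{1}{2}, \frac{3}{2})$ and $(-\frac{3}{2}, -\frac{1}{2})$, whose sum is 1 on $[-1, 1]$.

```python
import math

def h(t):
    """Smooth function vanishing for t <= 0."""
    return math.exp(-1.0 / t) if t > 0.0 else 0.0

def unit_step(t):
    """Smooth step: 0 for t <= 0, 1 for t >= 1, with unit_step(t) + unit_step(1 - t) = 1."""
    return h(t) / (h(t) + h(1.0 - t))

def theta(t):
    """Smooth step rescaled to rise on [-1/4, 1/4]; theta(t) + theta(-t) = 1 (assumption)."""
    return unit_step(2.0 * t + 0.5)

def psi0(x):
    return theta(x + 0.75) + theta(0.75 - x) - 1.0   # supported in (-1, 1)

def psi1(x):
    return theta(x - 0.75) + theta(1.25 - x) - 1.0   # supported in (1/2, 3/2)

def psi2(x):
    return theta(x + 1.25) + theta(-0.75 - x) - 1.0  # supported in (-3/2, -1/2)

xs = [-1.0 + 2.0 * k / 400 for k in range(401)]      # grid on the interval [-1, 1]
sums = [psi0(x) + psi1(x) + psi2(x) for x in xs]
print(min(sums), max(sums))  # both equal to 1 up to rounding
```

Each function is a sum of two steps minus 1, so it equals 1 on a central plateau and decays smoothly to 0 over a transition layer of width $\frac{1}{2}$; neighbouring transitions overlap exactly, which is what makes the sum identically 1.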
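\[ v(x) = 1 \cdot v(x) = \left( \sum_{i=0}^{I} \psi_i(x) \right) v(x) = \sum_{i=0}^{I} \left( \psi_i(x)\,v(x) \right), \]
i.e., setting $v_i(x) = \psi_i(x)\,v(x)$, we express $v$ as $v = \sum_{i=0}^{I} v_i$, with $\operatorname{supp} v_i \subset A_i$ and $\|v_i\|_{H^1(\Omega)} \le C\,\|v\|_{H^1(\Omega)}$ for $i = 0, \dots, I$.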
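Now, $v_0$ vanishes in a neighborhood of $\partial\Omega$, hence, we can think of it as extended by zero outside $\Omega$, i.e., $v_0 \in H^1(\mathbb{R}^n)$. On the other hand, for $i = 1, \dots, I$, we define $\hat{v}_i = v_i \circ \Phi_i^{-1}$, a function which is supported in $B(0, 1) \cap \overline{\mathbb{R}^n_+}$, so that it can be extended by zero to a function $\hat{v}_i$ in $\mathbb{R}^n_+$, satisfying $\hat{v}_i \in H^1(\mathbb{R}^n_+)$ with $\|\hat{v}_i\|_{H^1(\mathbb{R}^n_+)} \le C\,\|v_i\|_{H^1(\Omega)}$.
Let $P : H^1(\mathbb{R}^n_+) \to H^1(\mathbb{R}^n)$ be the prolongation operator defined in Sect. 3.5. Then, the function
\[ \tilde{v}_i(x) = \begin{cases} v_i(x) & \text{on } A_i \cap \Omega, \\ \chi_i(x)\,(P\hat{v}_i)(\Phi_i(x)) & \text{on } A_i \cap \complement\Omega, \\ 0 & \text{on } \mathbb{R}^n \setminus A_i, \end{cases} \]
is an extension of $v_i$ which satisfies $\|\tilde{v}_i\|_{H^1(\mathbb{R}^n)} \le C\,\|v_i\|_{H^1(\Omega)}$. Finally, the global prolongation operator is defined as $Pv = v_0 + \sum_{i=1}^{I} \tilde{v}_i$. Thus, we have established the following result.
Property 3.6.6. Let $\Omega$ be a bounded Lipschitz domain. There exists a linear continuous operator $P : H^1(\Omega) \to H^1(\mathbb{R}^n)$ such that $Pv|_\Omega \equiv v$ for all $v \in H^1(\Omega)$.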
As for the case $\Omega = \mathbb{R}^n_+$, this property allows one to prove Property 3.2.8 for all Lipschitz domains.
Let us now consider the problem of defining the trace operator. To this end, let $\gamma_{\mathbb{R}^n_+} : H^1(\mathbb{R}^n_+) \to H^{1/2}(\partial\mathbb{R}^n_+)$ be the trace operator defined in Sect. 3.5. Then, the function $\gamma_{\mathbb{R}^n_+}(\hat{v}_i) \circ \Phi_i$
its image, which is a subspace of $L^2(\partial\Omega)$. On the other hand, $\gamma$ is clearly non-injective (many functions in $\Omega$ may have the same trace on $\partial\Omega$). Thus, let us introduce the subspace
We observe that the infimum above is actually a minimum. Indeed, given $\bar{x} \in X/X_0$, there exists a unique element $x^* \in \bar{x}$ such that $\|x^*\|_X = \|\bar{x}\|_{X/X_0}$. This result can be proven by taking any element $y \in \bar{x}$ and setting $x^* = y - \Pi y$, where $\Pi$ is the orthogonal projection operator from $X$ onto $X_0$; it is easily seen that $x^*$ is independent of the particular choice of $y$, and satisfies the conditions stated above. Equivalently, the linear continuous mapping $x \in X \mapsto \bar{x} \in X/X_0$ admits a continuous right-inverse $\bar{x} \in X/X_0 \mapsto x^* \in X$.
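We apply these results to the quotient space $H^1(\Omega)/H^1_0(\Omega)$, after observing that, by definition of kernel of a linear operator, $\gamma$ induces an algebraic isomorphism between $H^1(\Omega)/H^1_0(\Omega)$ and $H^{1/2}(\partial\Omega)$. Therefore, we can equip the space $H^{1/2}(\partial\Omega)$ with the quotient norm
where $\bar{v}$ is the unique equivalence class of all functions $v \in H^1(\Omega)$ satisfying $\gamma(v) = g$; equivalently, we have
\[ |\!\|g\|\!|_{H^{1/2}(\partial\Omega)} = \inf_{v \in H^1(\Omega),\; \gamma(v) = g} \|v\|_{H^1(\Omega)}, \tag{3.6.5} \]
or
\[ |\!\|g\|\!|_{H^{1/2}(\partial\Omega)} = \|v_g\|_{H^1(\Omega)}, \]
where $v = v_g$ is the element in $\bar{v}$ of smallest $H^1(\Omega)$-norm. One can prove (exercise) that $v_g$ is the unique solution of the elliptic problem
\[ \begin{cases} -\Delta v_g + v_g = 0 & \text{in } \Omega, \\ v_g = g & \text{on } \partial\Omega. \end{cases} \tag{3.6.6} \]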
3.7. THE SPACE H10 () AND THE POINCARE-FRIEDRICHS INEQUALITY 53
Thus, for the norm just introduced in $H^{1/2}(\partial\Omega)$, the mapping $g \mapsto v_g$ is a continuous right-inverse of the mapping $v \mapsto \gamma(v)$.
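The minimal-norm lifting can be illustrated in one space dimension, where the boundary of the interval consists of two points and the data reduce to two numbers. The following is only a sketch under these assumptions: the interval (0, 1), the data values `G0`, `G1`, the linear competitor `v_lin` and the quadrature helper `simpson` are illustrative choices, not part of the text. It compares the squared $H^1$-norm of the solution of $-v'' + v = 0$ with that of another lifting of the same data.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def h1_norm_sq(v, dv):
    """Squared H^1(0, 1)-norm computed from v and its derivative dv."""
    return simpson(lambda x: v(x) ** 2 + dv(x) ** 2, 0.0, 1.0)


# Boundary data at the two end points of (0, 1); the values are illustrative.
G0, G1 = 1.0, 0.0


def v_min(x):
    """Solution of -v'' + v = 0 with v(0) = G0 and v(1) = G1."""
    return (G0 * math.sinh(1 - x) + G1 * math.sinh(x)) / math.sinh(1)


def dv_min(x):
    """Derivative of v_min."""
    return (-G0 * math.cosh(1 - x) + G1 * math.cosh(x)) / math.sinh(1)


def v_lin(x):
    """Another lifting of the same data: the linear interpolant."""
    return G0 * (1 - x) + G1 * x


def dv_lin(x):
    """Derivative of v_lin (constant)."""
    return G1 - G0


norm_sq_min = h1_norm_sq(v_min, dv_min)
norm_sq_lin = h1_norm_sq(v_lin, dv_lin)

if __name__ == "__main__":
    print("minimal lifting:", norm_sq_min, " linear lifting:", norm_sq_lin)
```

For these data the squared norm of the minimal lifting equals $\coth(1)$, which is slightly smaller than the value $4/3$ obtained for the linear interpolant.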
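One can give an intrinsic definition of the space $H^{1/2}(\partial\Omega)$, as one of the fractional order Sobolev spaces $H^s(\partial\Omega)$, $0 < s < 1$, defined as
$$H^s(\partial\Omega) = \Big\{ g \in L^2(\partial\Omega) : |g|^2_{H^s(\partial\Omega)} = \int_{\partial\Omega}\int_{\partial\Omega} \frac{|g(x') - g(y')|^2}{\|x' - y'\|^{2s+n}}\, dx'\, dy' < +\infty \Big\} .$$
This norm is equivalent to the norm defined in (3.6.4). If $\Omega = \mathbb{R}^n_+$, this definition is also equivalent to the one given in Sect. 3.5 via the Fourier transform.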
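We summarize the results that can be proven about the trace operator $\gamma$.
Theorem 3.6.7. Let $\Omega$ be a bounded Lipschitz domain. There exists a linear continuous operator
$$\gamma : H^1(\Omega) \to H^{1/2}(\partial\Omega),$$
termed trace operator, such that $\gamma(\varphi) = \varphi|_{\partial\Omega}$ for all $\varphi \in C^\infty(\overline{\Omega})$. It is surjective onto $H^{1/2}(\partial\Omega)$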
and admits a continuous right-inverse
: H1/2 () H1 () ,
Remark 3.6.8. Let us show on a single example how the previous results can be extended to Sobolev spaces of higher order.
Consider a bounded $C^2$-domain $\Omega$, and take a function $v \in H^2(\Omega)$. Then, not only is its trace $\gamma(v)$ well-defined in $H^{1/2}(\partial\Omega)$, but also the traces $\gamma(\partial v/\partial x_i)$ of its first-order partial derivatives are well-defined in $H^{1/2}(\partial\Omega)$.
This is expressed by saying, on the one side, that $\gamma(v)$ belongs to the Sobolev space $H^{3/2}(\partial\Omega)$ (i.e., it is more regular, as a consequence of the fact that $v$ is more regular than just an $H^1(\Omega)$-function) and, on the other side, that the normal derivative $\partial v/\partial n = \sum_i \gamma(\partial v/\partial x_i)\, n_i$ is well-defined in $H^{1/2}(\partial\Omega)$.
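Thus, $H^1_0(\Omega)$ can be equivalently defined as the closure of $\mathcal{D}(\Omega)$ with respect to the topology of $H^1(\Omega)$; i.e., a function $v \in H^1(\Omega)$ belongs to $H^1_0(\Omega)$ if and only if there exists a sequence of functions $\varphi_n \in \mathcal{D}(\Omega)$ satisfying $\|v - \varphi_n\|_{H^1(\Omega)} \to 0$ as $n \to \infty$.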
As a closed subspace of the Hilbert space $H^1(\Omega)$, $H^1_0(\Omega)$ is itself a Hilbert space, for the same inner product as in $H^1(\Omega)$. On the other hand, a simpler, yet equivalent inner product can be defined in $H^1_0(\Omega)$ (equivalent meaning that it induces an equivalent norm). In order to motivate its definition, let us start from the observation that the norm in $H^1(\Omega)$, given in Definition 3.2.1, depends on both the $L^2(\Omega)$-norm of the function and the $L^2(\Omega)$-norm of its gradient. In $H^1(\Omega)$ there exist functions for which one of the two norms is much larger than the other one. For instance, highly oscillatory but bounded functions (such as $v_{\varepsilon,k}(x, y) = \varepsilon \sin kx \cos ky$ in the square $\Omega = (0, 2\pi)^2$) may be arbitrarily small while their gradients may be arbitrarily large. On the contrary, constant functions (such as $v_k(x, y) = k$) may be arbitrarily large while their gradients are identically zero.
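The claim about the oscillatory example can be checked by a direct computation. The following sketch assumes that the example is $v_{\varepsilon,k}(x, y) = \varepsilon \sin kx \cos ky$ on $\Omega = (0, 2\pi)^2$ with $k \geq 1$ an integer, and uses the fact that $\sin^2$ and $\cos^2$ have mean value $1/2$ over a full period.

```latex
\begin{align*}
\|v_{\varepsilon,k}\|_{L^2(\Omega)}^2
  &= \varepsilon^2 \int_0^{2\pi} \sin^2(kx)\,dx \int_0^{2\pi} \cos^2(ky)\,dy
   = \varepsilon^2 \pi^2 , \\
\|\nabla v_{\varepsilon,k}\|_{(L^2(\Omega))^2}^2
  &= \varepsilon^2 k^2 \int_\Omega \big( \cos^2(kx)\cos^2(ky) + \sin^2(kx)\sin^2(ky) \big)\,dx\,dy
   = 2\,\varepsilon^2 k^2 \pi^2 .
\end{align*}
```

Choosing for instance $\varepsilon = k^{-1/2}$ gives $\|v_{\varepsilon,k}\|_{L^2(\Omega)} = \pi k^{-1/2} \to 0$ and $\|\nabla v_{\varepsilon,k}\|_{(L^2(\Omega))^2} = \sqrt{2}\,\pi k^{1/2} \to \infty$ as $k \to \infty$.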
However, suppose $\Omega$ bounded and consider a function constrained to vanish on $\partial\Omega$: then, it can be large somewhere in the domain only if its gradient, too, is large somewhere. This intuitive concept can be made rigorous through an important inequality, known as the Poincaré-Friedrichs inequality, which we now state.
Proposition 3.7.2. Let $\Omega$ be a bounded domain. Then, there exists a constant $C_P > 0$ such that
$$\|v\|_{L^2(\Omega)} \le C_P\, \|\nabla v\|_{(L^2(\Omega))^n} \qquad \forall v \in H^1_0(\Omega). \tag{3.7.1}$$
Any constant for which this inequality holds is referred to as a Poincaré constant in the domain. There exists a minimal value $C_P(\Omega)$ of this constant, depending only on $\Omega$, which can be referred to as the Poincaré constant of the domain $\Omega$.
Proof. We follow the strategy of first proving the inequality for all functions in $\mathcal{D}(\Omega)$. Next, since $\mathcal{D}(\Omega)$ is dense in $H^1_0(\Omega)$ and both sides of the inequality depend continuously on the $H^1(\Omega)$-norm, we can pass to the limit and extend the inequality to all functions in $H^1_0(\Omega)$.
Since $\Omega$ is bounded (in particular, in the $x_n$-direction), there exist constants $a < b$ such that $\Omega \subset \mathbb{R}^{n-1} \times [a, b]$. Given any $\varphi \in \mathcal{D}(\Omega)$, let us extend it by zero outside $\Omega$; then, for any $x' \in \mathbb{R}^{n-1}$ and any $x_n \in [a, b]$, the fundamental theorem of Calculus yields
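$$\varphi(x', x_n) = \int_a^{x_n} \frac{\partial \varphi}{\partial x_n}(x', s)\, ds .$$
By the Cauchy-Schwarz inequality, we get
$$\begin{aligned} |\varphi(x', x_n)| &= \left| \int_a^{x_n} 1 \cdot \frac{\partial \varphi}{\partial x_n}(x', s)\, ds \right| \le \left( \int_a^{x_n} 1^2\, ds \right)^{1/2} \left( \int_a^{x_n} \left| \frac{\partial \varphi}{\partial x_n}(x', s) \right|^2 ds \right)^{1/2} \\ &\le (x_n - a)^{1/2} \left( \int_a^{b} \left| \frac{\partial \varphi}{\partial x_n}(x', s) \right|^2 ds \right)^{1/2} . \end{aligned}$$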
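Squaring both sides and integrating over $x' \in \mathbb{R}^{n-1}$ and $x_n \in [a, b]$ gives $\|\varphi\|^2_{L^2(\Omega)} \le \frac12 (b - a)^2\, \|\partial\varphi/\partial x_n\|^2_{L^2(\Omega)}$; thus, inequality (3.7.1) is established with $C_P = 2^{-1/2}(b - a)$ for any $\varphi \in \mathcal{D}(\Omega)$. The existence of a minimal value of the Poincaré constant will be proven in Chap. 6.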
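The constant $C_P = 2^{-1/2}(b - a)$ obtained in the proof can be compared numerically with the ratio $\|\varphi\|_{L^2}/\|\varphi'\|_{L^2}$ for a few functions vanishing at the end points of an interval. The sketch below is an illustration only: the interval (0, 1), the sine test functions (which belong to $H^1_0$ rather than to $\mathcal{D}$, so the inequality applies to them by density) and the helper names are assumptions made for this example.

```python
import math


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def l2_norm(f, a, b):
    """L^2(a, b)-norm of f by numerical quadrature."""
    return math.sqrt(simpson(lambda x: f(x) ** 2, a, b))


A, B = 0.0, 1.0
C_P = (B - A) / math.sqrt(2.0)  # constant produced by the proof


def ratio(k):
    """||phi|| / ||phi'|| in L^2(A, B) for phi(x) = sin(k*pi*(x - A)/(B - A))."""
    w = k * math.pi / (B - A)
    phi = lambda x: math.sin(w * (x - A))
    dphi = lambda x: w * math.cos(w * (x - A))
    return l2_norm(phi, A, B) / l2_norm(dphi, A, B)


ratios = {k: ratio(k) for k in (1, 2, 5)}

if __name__ == "__main__":
    for k, r in ratios.items():
        print(k, r, "<=", C_P)
```

For these functions the ratio equals $(b - a)/(k\pi)$, so the bound holds with room to spare; this is consistent with the remark that the constant given by the proof is not the minimal one.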
The assumptions made above in order to obtain the Poincaré-Friedrichs inequality ($\Omega$ bounded and functions vanishing on the whole of $\partial\Omega$) are just one of the possible sets of assumptions which guarantee that the inequality holds. Here are possible extensions:
• The proof clearly indicates that $\Omega$ need not be bounded, but just bounded in one direction, i.e., in the direction of a coordinate axis (possibly after a rigid rotation).
• Functions need not vanish on the whole of $\partial\Omega$, but just on a proper subset $\Gamma$ which has positive $(n-1)$-dimensional measure, provided any point in the domain can be connected to $\Gamma$ by a curve completely contained in the domain. In particular, if we introduce the closed subspace of $H^1(\Omega)$ of the functions vanishing on a subset $\Gamma \subset \partial\Omega$ of positive $(n-1)$-dimensional measure, i.e., if we set
• Let us introduce the closed subspace of $L^2(\Omega)$ of the zero-average functions, i.e., let us set
$$L^2_0(\Omega) = \Big\{ v \in L^2(\Omega) : \int_\Omega v = 0 \Big\} . \tag{3.7.3}$$
Note that zero-average functions cannot be strictly positive or strictly negative throughout the domain; hence, if in addition they are continuous, they necessarily vanish somewhere in the domain. So, it is not unexpected that one can prove (see Exercise 3.2) that the Poincaré-Friedrichs inequality holds in the closed subspace of $H^1(\Omega)$ given by $H^1(\Omega) \cap L^2_0(\Omega)$.
We are now ready to introduce, as announced, a new inner product in $H^1_0(\Omega)$ or, more generally, in any subspace of $H^1(\Omega)$ for which a Poincaré-Friedrichs inequality holds.
Proposition 3.7.3. Let $H_0$ denote any closed subspace of $H^1(\Omega)$ for which there exists a constant $C_P > 0$ such that
$$\|v\|_{L^2(\Omega)} \le C_P\, \|\nabla v\|_{(L^2(\Omega))^n} \qquad \forall v \in H_0 . \tag{3.7.4}$$
$$\frac{1}{\sqrt{1 + C_P^2}}\, \|v\|_{H^1(\Omega)} \le \|v\|_{H_0} \le \|v\|_{H^1(\Omega)} \qquad \forall v \in H_0 . \tag{3.7.6}$$
Proof. It is enough to prove the first inequality in (3.7.6), which immediately follows from (3.7.4).
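For completeness, the first inequality in (3.7.6) can be spelled out as follows. The sketch assumes that the norm of $H_0$ is $\|v\|_{H_0} = \|\nabla v\|_{(L^2(\Omega))^n}$, as the context suggests; the display defining it is not reproduced above.

```latex
\begin{align*}
\|v\|_{H^1(\Omega)}^2
  &= \|v\|_{L^2(\Omega)}^2 + \|\nabla v\|_{(L^2(\Omega))^n}^2 \\
  &\le (1 + C_P^2)\, \|\nabla v\|_{(L^2(\Omega))^n}^2
   = (1 + C_P^2)\, \|v\|_{H_0}^2
  \qquad \forall v \in H_0 .
\end{align*}
```

Taking square roots gives the left-hand bound; the right-hand bound is immediate, since the $H^1(\Omega)$-norm contains the gradient term.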
Theorem 3.8.1. (Rellich) Let $\Omega$ be a bounded domain in $\mathbb{R}^n$. Then, for any $m \ge 0$ the inclusion $H^m(\Omega) \subset H^k(\Omega)$ with $0 \le k < m$, is compact.
For instance, if $\{v_n\}_{n \ge 1}$ is a sequence of functions in $H^1(\Omega)$ satisfying $\|v_n\|_{H^1(\Omega)} \le C$ for some constant $C > 0$, then the theorem assures the existence of a subsequence $\{v_{n_j}\}_{j \ge 1}$ which is convergent in $L^2(\Omega)$.
The Sobolev imbedding theorem links Sobolev regularity to classical regularity, allowing one e.g. to see the weak solution of a variational problem as a classical solution as well. In essence, the theorem says that any function in $H^m(\Omega)$ for large enough $m$ has the property that all its derivatives of order up to a certain $k < m$ are classical derivatives (and not just distributional derivatives), and they are continuous in $\overline{\Omega}$. Furthermore, even if $m$ is not large enough, any function in $H^m(\Omega)$ is $p$-integrable for some $p > 2$ (and not just square-integrable), as soon as $m > 0$.
The precise statement is as follows.
Theorem 3.8.2. (Sobolev) Let $\Omega$ be any domain in $\mathbb{R}^n$. Let $m > 0$ be given, and denote by $[z]$ the largest integer $\le z$.
iii) If $m > n/2$, then $H^m(\Omega) \subset C^{k,\alpha}(\overline{\Omega}) \cap L^\infty(\Omega)$, for $k = [m - n/2]$ and $\alpha = m - n/2 - k$ if $m - n/2$ is not an integer, and for $k = m - n/2 - 1$ and arbitrary $\alpha < 1$ if $m - n/2$ is an integer.
All inclusions above are continuous. In addition, if $\Omega$ is bounded, the inclusions are compact.
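The bookkeeping of the indices in case iii) is easy to get wrong by hand, so a small helper may be useful. The sketch below is an illustration only: the function name `holder_indices` and the convention of returning `None` for "any exponent smaller than 1" are assumptions of this example, and `m` is expected to be an integer or a `Fraction`.

```python
from fractions import Fraction
import math


def holder_indices(m, n):
    """Return (k, alpha) with H^m contained in C^{k,alpha}, following case iii).

    alpha is returned as None when m - n/2 is an integer, meaning that any
    exponent alpha < 1 is admissible. Exact rational arithmetic is used, so m
    should be an int or a Fraction.
    """
    s = Fraction(m) - Fraction(n, 2)
    if s <= 0:
        raise ValueError("case iii) requires m > n/2")
    if s.denominator == 1:
        # m - n/2 is an integer: k = m - n/2 - 1, alpha arbitrary below 1
        return int(s) - 1, None
    k = math.floor(s)
    return k, s - k


if __name__ == "__main__":
    for m, n in [(1, 1), (2, 2), (2, 3), (4, 3)]:
        print((m, n), holder_indices(m, n))
```

For instance, `holder_indices(2, 3)` corresponds to the imbedding of $H^2$ into $C^{0,1/2}$ in dimension three, and `holder_indices(2, 2)` to the borderline case in dimension two, where every exponent smaller than 1 is allowed.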
Examples 3.8.3.
i) In dimension $n = 1$, the theorem gives $H^m(\Omega) \subset C^{m-1,1/2}(\overline{\Omega})$. This result, for $m = 1$, has been already proven in Sect. 3.2 (Property 3.2.7).
ii) In dimension $n = 2$, the theorem gives $H^1(\Omega) \subset L^p(\Omega)$ for all $p < \infty$ (but not contained in $L^\infty(\Omega)$, as already noted in Sect. 3.2), and $H^m(\Omega) \subset C^{m-2,\alpha}(\overline{\Omega}) \cap L^\infty(\Omega)$ for any integer $m \ge 2$
$$\|F\|_{X'} = \sup_{x \in X,\ x \ne 0} \frac{F(x)}{\|x\|_X} .$$
If $X$ is a Hilbert space, the Riesz Representation Theorem says that each $F \in X'$ can be written as $F(x) = (y, x)_X$ $\forall x \in X$, for a unique $y \in X$, which satisfies $\|y\|_X = \|F\|_{X'}$. Thus, $X'$ can be identified to $X$ via the isometry $F \mapsto y$.
If $X$ and $Y$ are Banach spaces satisfying $X \subseteq Y$ with continuous injection, i.e., $\|x\|_Y \le C \|x\|_X$ for all $x \in X$, then any $F_Y \in Y'$ defines an $F_X \in X'$ by setting $F_X(x) = F_Y(x)$ $\forall x \in X$; furthermore, $\|F_X\|_{X'} \le C \|F_Y\|_{Y'}$, i.e., the mapping $F_Y \in Y' \mapsto F_X \in X'$ is continuous. If, in addition, $X$ is dense in $Y$, then this mapping is injective, since $F_X = G_X$ means $F_Y(x) = G_Y(x)$ $\forall x \in X$, whence $F_Y(y) = G_Y(y)$ $\forall y \in Y$, by the uniqueness of the extension of a continuous map defined on a dense subset. In other words, we can identify $Y'$ to a subspace of $X'$, i.e., we have $Y' \subseteq X'$ with continuous injection and dense image.
A particularly important situation is the following one. We are given two Hilbert spaces $V$ and $H$, such that $V \subseteq H$ with continuous injection and dense image; furthermore, we identify $H'$ to $H$ via the Riesz Representation Theorem as above. Then, we have $H = H' \subseteq V'$, so that we can write the chain of inclusions
$$V \subseteq H \subseteq V', \tag{3.9.1}$$
where each inclusion is continuous and with dense image. The pair $(V, H)$ is often called a Gelfand pair; equivalently, the triple $(V, H, V')$ is called a Gelfand triple. Examples of Gelfand triples are
$$(H^1(\Omega), L^2(\Omega), (H^1(\Omega))') \quad \text{and} \quad (H^1_0(\Omega), L^2(\Omega), (H^1_0(\Omega))') ,$$
$$S : H^1(\Omega) \to (L^2(\Omega))^{n+1}, \qquad v \mapsto \left( v, \frac{\partial v}{\partial x_1}, \ldots, \frac{\partial v}{\partial x_n} \right) = Sv . \tag{3.9.2}$$
The mapping is trivially injective, since $Sv = 0$ implies that the first component of $Sv$, i.e., $v$ itself, is zero; furthermore, it is an isometry, since $\|Sv\|_{(L^2(\Omega))^{n+1}} = \|v\|_{H^1(\Omega)}$ for all $v \in H^1(\Omega)$. Thus, we can identify $H^1(\Omega)$ to the subspace $Z = S(H^1(\Omega))$ of $(L^2(\Omega))^{n+1}$; this subspace is closed, since $H^1(\Omega)$ is complete. Correspondingly, the dual of $H^1(\Omega)$ can be identified to the dual of $Z$, via the isometry $F \in (H^1(\Omega))' \mapsto F_Z \in Z'$ defined as $F_Z(w) = F(S^{-1}(w))$ $\forall w \in Z$.
We now invoke the Hahn-Banach Theorem, which guarantees that, given a linear continuous form $F_Z$ on a closed subspace $Z$ of a Banach space $X$, there exists a linear continuous form $\tilde F$ on $X$ which extends $F_Z$ and such that $\|F_Z\|_{Z'} = \|\tilde F\|_{X'}$. Thus, if we start from any $F \in (H^1(\Omega))'$, there exists $\tilde F \in ((L^2(\Omega))^{n+1})'$ such that
and $\|F\|_{(H^1(\Omega))'} = \|F_Z\|_{Z'} = \|\tilde F\|_{((L^2(\Omega))^{n+1})'}$. On the other hand, having identified $(L^2(\Omega))'$ to $L^2(\Omega)$, the space $((L^2(\Omega))^{n+1})'$ can be identified to $(L^2(\Omega))^{n+1}$, so that $\tilde F$ can be identified to an element $f = (f_0, f_1, \ldots, f_n) \in (L^2(\Omega))^{n+1}$ by the relation
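$$\tilde F(w) = (f, w)_{(L^2(\Omega))^{n+1}} = \sum_{i=0}^{n} (f_i, w_i)_{L^2(\Omega)} \qquad \forall w \in (L^2(\Omega))^{n+1}, \tag{3.9.4}$$
and $\|\tilde F\|_{((L^2(\Omega))^{n+1})'} = \|f\|_{(L^2(\Omega))^{n+1}} = \left( \sum_{i=0}^{n} \|f_i\|^2_{L^2(\Omega)} \right)^{1/2}$. Combining (3.9.3) and (3.9.4), we conclude that $F \in (H^1(\Omega))'$ can be represented as
$$F(v) = (f_0, v)_{L^2(\Omega)} + \sum_{i=1}^{n} \left( f_i, \frac{\partial v}{\partial x_i} \right)_{L^2(\Omega)} \qquad \forall v \in H^1(\Omega), \tag{3.9.5}$$
for suitable functions $f_0, f_1, \ldots, f_n \in L^2(\Omega)$ satisfying $\|F\|_{(H^1(\Omega))'} = \left( \sum_{i=0}^{n} \|f_i\|^2_{L^2(\Omega)} \right)^{1/2}$.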
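Since the extension $F_Z \mapsto \tilde F$ is not unique, the representation (3.9.5) is not unique as well. For instance, since
$$\int_\Omega \varphi\, \frac{\partial v}{\partial x_1}\, dx = - \int_\Omega \frac{\partial \varphi}{\partial x_1}\, v\, dx \qquad \text{if } \varphi \in \mathcal{D}(\Omega),$$
we can replace in (3.9.5) $f_1$ by $f_1 + \varphi$ and $f_0$ by $f_0 + \partial\varphi/\partial x_1$ without changing the right-hand side. If $f = (f_0, f_1, \ldots, f_n) \in (L^2(\Omega))^{n+1}$ denotes now any $(n+1)$-ple of functions for which (3.9.5) holds, then by the Cauchy-Schwarz inequality we have
$$|F(v)| \le \|f_0\|_{L^2(\Omega)} \|v\|_{L^2(\Omega)} + \sum_{i=1}^{n} \|f_i\|_{L^2(\Omega)} \left\| \frac{\partial v}{\partial x_i} \right\|_{L^2(\Omega)} \le \|f\|_{(L^2(\Omega))^{n+1}} \|v\|_{H^1(\Omega)},$$
i.e., $\|F\|_{(H^1(\Omega))'} \le \|f\|_{(L^2(\Omega))^{n+1}}$. On the other hand, the construction above shows that there exists $f \in (L^2(\Omega))^{n+1}$ for which the equality sign is attained.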
The following statement summarizes the results obtained so far.
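We now consider the dual space of $H^1_0(\Omega)$, which is usually denoted by $H^{-1}(\Omega)$. All previous considerations can be repeated, with the only change that the operator $S$ introduced in (3.9.2) is now restricted to $H^1_0(\Omega)$, and consequently its image is a closed subspace of $Z$, say $Z_0$. Then, given any $F \in H^{-1}(\Omega)$, we arrive as above at the representation formula
$$F(v) = (f_0, v)_{L^2(\Omega)} + \sum_{i=1}^{n} \left( f_i, \frac{\partial v}{\partial x_i} \right)_{L^2(\Omega)} \qquad \forall v \in H^1_0(\Omega), \tag{3.9.8}$$
$$\mathcal{D}(\Omega) \subset H^1_0(\Omega) \subset L^2(\Omega) \subset H^{-1}(\Omega) \subset \mathcal{D}'(\Omega) .$$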
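Recalling the expression of the partial derivatives of a distribution (Def. 2.2.1), the right-hand side of (3.9.8) can be written as
$$(f_0, \varphi)_{L^2(\Omega)} + \sum_{i=1}^{n} \left( f_i, \frac{\partial \varphi}{\partial x_i} \right)_{L^2(\Omega)} = \left\langle f_0 - \sum_{i=1}^{n} \frac{\partial f_i}{\partial x_i},\ \varphi \right\rangle \qquad \forall \varphi \in \mathcal{D}(\Omega),$$
i.e.,
$$F = f_0 - \sum_{i=1}^{n} \frac{\partial f_i}{\partial x_i} = f_0 - \nabla \cdot (f_1, \ldots, f_n) \qquad \text{in } \mathcal{D}'(\Omega). \tag{3.9.9}$$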
Thus, any $F \in H^{-1}(\Omega)$ is the sum of an $L^2(\Omega)$-function and the divergence (in the sense of distributions) of a vector of $L^2(\Omega)$-functions. The representation of its norm is analogous to the one in $(H^1(\Omega))'$.
We summarize the results as follows.
Theorem 3.9.2. Let us denote by $H^{-1}(\Omega)$ the dual space of $H^1_0(\Omega)$. Then, $H^{-1}(\Omega) \subset \mathcal{D}'(\Omega)$, and any $F \in H^{-1}(\Omega)$ can be represented, in a non-unique way, as in (3.9.9) for suitable $f_0, f_1, \ldots, f_n$ in $L^2(\Omega)$; equivalently, (3.9.8) holds. In addition, setting $R_0(F) = \{ f \in (L^2(\Omega))^{n+1} : \text{(3.9.8) holds} \}$, we have
$$\|F\|_{H^{-1}(\Omega)} = \min_{f \in R_0(F)} \|f\|_{(L^2(\Omega))^{n+1}} = \min_{f \in R_0(F)} \left( \sum_{i=0}^{n} \|f_i\|^2_{L^2(\Omega)} \right)^{1/2} . \tag{3.9.10}$$
Examples 3.9.3.
i) Let $\Omega = (a, b)$ be an interval in $\mathbb{R}$ and let $x_0$ be any point in $\Omega$. Since $H^1_0(\Omega) \subset H^1(\Omega) \subset C^0(\overline{\Omega})$, the linear form $v \mapsto F_{x_0}(v) = v(x_0)$ belongs to both $H^{-1}(\Omega)$ and $(H^1(\Omega))'$. As an element of $H^{-1}(\Omega)$, it coincides with the distribution $\delta_{x_0}$ (we simply say that $\delta_{x_0}$ belongs to $H^{-1}(\Omega)$); indeed, we can represent it as in (3.9.9) setting $F_{x_0} = f_0 - \frac{df_1}{dx}$, with $f_0 \equiv 0$ and $f_1(x) = -H(x - x_0)$, where $H$ is the Heaviside function. We refer to Exercise 3.3 for a representation of $F_{x_0}$, as an element of $(H^1(\Omega))'$, in the form (3.9.6).
ii) In dimension $n > 1$, the distribution $\delta_{x_0}$, with $x_0 \in \Omega$, belongs neither to $H^{-1}(\Omega)$ nor to $(H^1(\Omega))'$; indeed, neither $H^1(\Omega)$ nor $H^1_0(\Omega)$ are imbedded in $C^0(\overline{\Omega})$.
where (v) is the trace of v on , belongs to both H1 () and (H1 ()) , since the mapping
: H1 ( ) L2 () is continuous, as seen in 1
R Sect. 3.6. As an element of H (), F coincides
with the distribution such that h , i = | d for all D(). It can be represented in
the form (3.9.9) by setting f0 = f2 = = fn 0 and f1 = + , the characteristic function of
the set + . We refer to Exercise 3.4 for a representation of F , as an element of (H1 ()) , in the
form (3.9.6).
iii) Let $Lw = -\nabla \cdot (A \nabla w) + \mathbf{a} \cdot \nabla w + a_0 w$ be any second-order operator in $\Omega$; let us assume that all its coefficients belong to $L^\infty(\Omega)$. Then, for any $w \in H^1(\Omega)$, one has $A \nabla w \in (L^2(\Omega))^n$ as well as $\mathbf{a} \cdot \nabla w + a_0 w \in L^2(\Omega)$. Thus, according to Thm. 3.9.2, $Lw$ belongs to $H^{-1}(\Omega)$ and the mapping $w \in H^1(\Omega) \mapsto Lw \in H^{-1}(\Omega)$ is continuous. A particularly relevant case occurs when $Lw = -\Delta w$: by restricting $w$ to $H^1_0(\Omega)$, we obtain that the operator $-\Delta$ maps $H^1_0(\Omega)$ into its dual $H^{-1}(\Omega)$. Precisely, there exists a constant $C > 0$ depending on $\Omega$ such that
The Laplacian is actually an isomorphism between these two spaces, as discussed in the next chapter (see Property 4.3.7).
One can also consider $Lw$, for $w \in H^1(\Omega)$, as an element of $(H^1(\Omega))'$, by setting
$$F(v) = \int_\Omega (A \nabla w) \cdot \nabla v + \int_\Omega (\mathbf{a} \cdot \nabla w + a_0 w)\, v \qquad \forall v \in H^1(\Omega).$$
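iv) Let $\Omega$ be a bounded Lipschitz domain; recall that the trace operator $\gamma$ is continuous from $H^1(\Omega)$ to $L^2(\partial\Omega)$. Thus, given any $h \in L^2(\partial\Omega)$, the form $F_h$ defined by
$$v \mapsto F_h(v) = \int_{\partial\Omega} h\, \gamma(v)\, d\sigma \tag{3.9.12}$$
belongs to $(H^1(\Omega))'$. How can we represent $F_h$ according to (3.9.6)? To answer this question, let us consider the Neumann problem
$$\begin{cases} -\Delta w + w = 0 & \text{in } \Omega, \\ \dfrac{\partial w}{\partial n} = h & \text{on } \partial\Omega. \end{cases}$$
Anticipating the results of the next chapter, we can say that there exists a unique solution $w \in H^1(\Omega)$ of the variational formulation of this problem, which is given by
$$\int_\Omega \nabla w \cdot \nabla v + \int_\Omega w v = \int_{\partial\Omega} h\, \gamma(v) \qquad \forall v \in H^1(\Omega). \tag{3.9.13}$$
Then, the $(n+1)$-ple $f = \left( w, \frac{\partial w}{\partial x_1}, \ldots, \frac{\partial w}{\partial x_n} \right)$ provides a representation of the form $F_h$. It is not difficult to prove, indeed, that this is the representation which realizes the minimum in (3.9.7).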
Consider now the same mapping (3.9.12), but restrict it to the functions of $H^1_0(\Omega)$. Obviously, $F_h^0 = (F_h)|_{H^1_0(\Omega)}$ belongs to $H^{-1}(\Omega)$, but... it is nothing else than the null form, i.e., $F_h^0 = 0 \in H^{-1}(\Omega)$. Since, by (3.9.12),
$$0 = F_h^0(\varphi) = \int_{\partial\Omega} h\, \varphi|_{\partial\Omega} = \int_\Omega \nabla w \cdot \nabla \varphi + \int_\Omega w \varphi = \langle -\Delta w + w, \varphi \rangle \qquad \forall \varphi \in \mathcal{D}(\Omega),$$
the same $f$ as above does provide one of the possible representations of $F_h^0$ in the form (3.9.9), but surely it is not the one with minimal norm! Obviously, such a representation is provided by the $(n+1)$-ple $f = (0, 0, \ldots, 0)$.
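Going back to Example i), the representation of the point evaluation through (3.9.8) can be verified numerically. The sketch below is an illustration under explicit assumptions: it takes $f_0 \equiv 0$ and $f_1(x) = -H(x - x_0)$ (with the minus sign), the interval (0, 1), the point $x_0 = 0.3$ and one particular test function in $H^1_0$; all names in the code are hypothetical.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


A, B, X0 = 0.0, 1.0, 0.3  # interval end points and evaluation point (illustrative)


def v(x):
    """A test function vanishing at both end points."""
    return math.sin(math.pi * (x - A) / (B - A))


def dv(x):
    """Derivative of v."""
    return math.pi / (B - A) * math.cos(math.pi * (x - A) / (B - A))


def pairing(dv_func, x0, b):
    """(f0, v) + (f1, v') with f0 = 0 and f1 = -H(x - x0).

    Since f1 vanishes on (A, x0) and equals -1 on (x0, b), the pairing reduces
    to minus the integral of v' over (x0, b).
    """
    return -simpson(dv_func, x0, b)


value = pairing(dv, X0, B)

if __name__ == "__main__":
    print(value, v(X0))
```

The two printed numbers should agree up to quadrature error, since $-\int_{x_0}^{b} v' = v(x_0) - v(b) = v(x_0)$ for any $v$ vanishing at $b$.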
3.10 Exercises
3.1. Let us consider the function
u(x, y) = |x y|
on the set = [1, 2] [1, 2] IR2 , where is a real parameter.
(i) Find the values of for which u L2 () and those for which u H1 ().
(ii) For the values of that allow u to be in H1 (), calculate the Laplacian u and find the
space which it belongs to.
is a norm in $H^1(\Omega)$, which is equivalent to the standard norm $\|v\|_{H^1(\Omega)}$. Deduce from this the validity of a Poincaré-Friedrichs inequality in the space $H^1(\Omega) \cap L^2_0(\Omega)$, where $L^2_0(\Omega)$ is defined in (3.7.3).
3.3. Prove that the form $F_{x_0}$ defined in Example 3.9.3, i) can be represented as
$$F_{x_0}(v) = \int_a^b f_0(x)\, v(x)\, dx + \int_a^b f_1(x)\, v'(x)\, dx \qquad \forall v \in H^1(a, b),$$
where
$$f_0(x) = \begin{cases} 0 & \text{in } (a, x_0), \\ w(x) & \text{in } (x_0, b), \end{cases} \qquad f_1(x) = \begin{cases} 0 & \text{in } (a, x_0), \\ w'(x) & \text{in } (x_0, b), \end{cases}$$
and $w$ is the solution of the Neumann problem in $(x_0, b)$:
$$\begin{cases} -w'' + w = 0 & \text{in } (x_0, b), \\ w'(x_0) = -1, \quad w'(b) = 0 . \end{cases}$$
3.4. Adapt the arguments of the previous exercise in order to find a representation of the form $F$ defined in Example 3.9.3, ii), as an element of $(H^1(\Omega))'$.
Chapter 4
Elliptic Problems
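Note that all integrals which appear in this equation make sense if $u, v \in H^1(\Omega)$, thanks to the definition of this space and the trace properties established in Chapter 3. Hence, we introduce the bilinear form
$$a(u, v) := \int_\Omega (A \nabla u) \cdot \nabla v - \int_\Omega u\, \mathbf{a} \cdot \nabla v + \int_\Omega a_0\, u v + \int_{\partial\Omega_N} \mathbf{a} \cdot \mathbf{n}\, u v \tag{4.1.3}$$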
defined in $H^1(\Omega)$. We also introduce the closed subspace of $H^1(\Omega)$ of the functions vanishing on $\partial\Omega_D$, i.e., we set
$$H^1_{0,D}(\Omega) = \{ v \in H^1(\Omega) : v = 0 \text{ on } \partial\Omega_D \} ,$$
still defined in H1 (), we reduce the non-homogeneous problem for u to the following one for w:
This change of dependent variable is made in preparation of applying the abstract existence and
uniqueness theory developed in the subsequent section.
Obviously, if $\partial\Omega_D = \emptyset$, we simply consider Problem 4.1.2 with $H^1_{0,D}(\Omega) = H^1(\Omega)$.
4.1. WEAK FORMULATION OF ELLIPTIC BOUNDARY-VALUE PROBLEMS 65
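Examples 4.1.4. i) Let us consider the homogeneous Dirichlet problem for the Poisson equation:
$$\begin{cases} -\Delta u = f & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega. \end{cases} \tag{4.1.8}$$
It admits the following weak formulation: Find $u \in H^1_0(\Omega)$ such that
$$\int_\Omega \nabla u \cdot \nabla v = \int_\Omega f v \qquad \forall v \in H^1_0(\Omega). \tag{4.1.9}$$
We observe that, keeping in mind Theorem 3.9.2, the datum $f \in L^2(\Omega)$ can be replaced by a datum $F \in H^{-1}(\Omega)$, i.e., of the form
$$F = f_0 - \sum_{i=1}^{n} \frac{\partial f_i}{\partial x_i}, \qquad f_i \in L^2(\Omega).$$
In this case, (4.1.9) becomes
$$\int_\Omega \nabla u \cdot \nabla v = \int_\Omega f_0 v + \sum_{i=1}^{n} \int_\Omega f_i \frac{\partial v}{\partial x_i} \qquad \forall v \in H^1_0(\Omega). \tag{4.1.10}$$
ii) Let us consider the non-homogeneous Neumann problem for the Helmholtz equation:
$$\begin{cases} -\Delta u + a_0 u = f & \text{in } \Omega, \\ \dfrac{\partial u}{\partial n} = h & \text{on } \partial\Omega. \end{cases} \tag{4.1.11}$$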
The next section is devoted to the study of an abstract form of Problem 4.1.2 or 4.1.3, providing a set of assumptions which guarantee its solvability.
$$F : V \to \mathbb{R}, \qquad v \mapsto F(v) \tag{4.2.2}$$
In order to prove the solvability of this problem, we assume that the forms $a$ and $F$ are continuous. The latter condition is equivalent to $F \in V'$; concerning the form $a$, we give the following definition.
Definition 4.2.2. The bilinear form $a$ is said to be continuous in $V$ if there exists a constant $C > 0$ such that
$$|a(w, v)| \le C\, \|w\|_V \|v\|_V \qquad \forall w, v \in V . \tag{4.2.4}$$
$$\|a\| = \sup_{w \in V,\, v \in V} \frac{a(w, v)}{\|w\|_V \|v\|_V}, \tag{4.2.5}$$
$$A : V \to V' ,$$
defined as follows: for any $w \in V$, the mapping $v \in V \mapsto a(w, v) \in \mathbb{R}$ is linear and continuous; therefore, it is an element of $V'$, which we denote by $Aw$. In other words, $A$ is defined by the relations
$$(Aw)(v) = a(w, v) \qquad \forall w, v \in V . \tag{4.2.6}$$
$$Au = F .$$
4.2. THE LAX-MILGRAM THEOREM 67
It will be convenient to introduce the duality pairing between $V'$ and $V$, i.e., the bilinear form
$$\langle Aw, v \rangle = a(w, v) \qquad \forall w, v \in V ,$$
$$\langle Au, v \rangle = \langle F, v \rangle \qquad \forall v \in V .$$
Next, we introduce the crucial property of the bilinear form a, which will provide, together
with continuity, a sufficient condition for the solvability of Problem 4.2.1.
Definition 4.2.4. The bilinear form $a$ is said to be coercive in $V$ if there exists a constant $\alpha > 0$, such that
$$a(v, v) \ge \alpha\, \|v\|^2_V \qquad \forall v \in V . \tag{4.2.8}$$
Any $\alpha > 0$ satisfying (4.2.8) is termed a coercivity constant of the form $a$. The best coercivity constant is given by
$$\alpha = \inf_{v \in V} \frac{a(v, v)}{\|v\|^2_V} . \tag{4.2.9}$$
We are now ready to state the main result of the present theory.
Theorem 4.2.5. (Lax-Milgram) Assume that the bilinear form $a$ on the reflexive Banach space $V$ is continuous and coercive, with coercivity constant $\alpha$. Then, given any linear continuous form $F$ on $V$, Problem 4.2.1 admits one and only one solution, which satisfies
$$\|u\|_V \le \frac{1}{\alpha}\, \|F\|_{V'} . \tag{4.2.10}$$
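A finite-dimensional instance may help to fix ideas. The sketch below takes $V = \mathbb{R}^2$ with the Euclidean norm and a non-symmetric bilinear form $a(w, v) = v^T M w$; the matrix `M`, the right-hand side `f` and all helper names are illustrative assumptions. It solves the problem $a(u, v) = F(v)$ for all $v$ and checks the stability bound (4.2.10), where the dual norm of $F(v) = f \cdot v$ is the Euclidean norm of $f$.

```python
import math

# Non-symmetric matrix defining a(w, v) = v^T M w on V = R^2 (illustrative choice).
M = [[2.0, 1.0],
     [-1.0, 2.0]]
ALPHA = 2.0  # a(v, v) = 2 |v|^2, because the off-diagonal contributions cancel


def a(w, v):
    """Bilinear form a(w, v) = v^T M w."""
    return sum(v[i] * M[i][j] * w[j] for i in range(2) for j in range(2))


def norm(x):
    """Euclidean norm on R^2."""
    return math.sqrt(sum(t * t for t in x))


def solve(f):
    """Solve a(u, v) = f . v for all v, i.e. M u = f, by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    u0 = (f[0] * M[1][1] - M[0][1] * f[1]) / det
    u1 = (M[0][0] * f[1] - M[1][0] * f[0]) / det
    return [u0, u1]


f = [1.0, 3.0]
u = solve(f)

if __name__ == "__main__":
    print("u =", u, " |u| =", norm(u), " bound =", norm(f) / ALPHA)
```

The form is not symmetric, so the solution is not the minimizer of a quadratic energy; existence, uniqueness and the bound nevertheless hold, which is the point of the theorem.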
Remark 4.2.7. The Lax-Milgram Theorem can be viewed as a generalization of the Riesz Rep-
resentation Theorem in a Hilbert space. Indeed, assume that V is such a space and that a(w, v)
is a continuous and coercive bilinear form on V , which in addition is symmetric, i.e.,
Proof of Theorem 4.2.5. At first, let us remark that if we assume for the moment the existence of a solution, then its uniqueness and the bound (4.2.10) are immediate. Indeed, this inequality follows from coercivity after choosing $v = u$ in (4.2.3), since
dividing by $\|u\|_V$, we get the result. Uniqueness follows from (4.2.10): if $u_1$, $u_2$ are any two solutions of Problem 4.2.1, their difference satisfies
Step 2): $A$ is injective. This follows from the coercivity of $a$. Indeed, $Aw = 0 \in V'$ means $\langle Aw, v \rangle = a(w, v) = 0$ for all $v \in V$; taking $v = w$ we get $\alpha \|w\|^2_V \le a(w, w) = 0$, which implies $w = 0$.
Let us introduce the image of $A$ in $V'$, i.e., the subspace
Then, $A$ is an algebraic isomorphism between $V$ and $Z$, whose inverse will be denoted, as usual, by $A^{-1}$.
Step 3): $A^{-1} : Z \to V$ is continuous. This follows again from the coercivity of $a$. Indeed, given any $G \in Z$, let $w \in V$ be such that $Aw = G$. This means that $w$ satisfies
$$a(w, v) = G(v) \qquad \forall v \in V ;$$
applying the bound (4.2.10) to this problem, we get $\|A^{-1} G\|_V \le \frac{1}{\alpha} \|G\|_{V'}$, which is precisely the claim.
Step 4): $Z$ is a closed subspace of $V'$. This follows from the completeness of $V$. Indeed, let $F$ belong to the closure of $Z$ in $V'$. Then, there exists a sequence $\{G_n\}_{n \in \mathbb{N}}$ in $Z$ converging to $F$ in the norm of $V'$. Let us set $w_n = A^{-1} G_n \in V$. By the result of Step 3), we have
$$\|w_n - w_m\|_V \le \frac{1}{\alpha}\, \|G_n - G_m\|_{V'} \qquad \forall n, m \in \mathbb{N} .$$
This implies that $\{w_n\}_{n \in \mathbb{N}}$ is a Cauchy sequence in $V$, hence, it converges to some $w \in V$ since this space is complete. Then, the form $G = Aw$ belongs to $Z$ and, by Step 1), the sequence $\{G_n = A w_n\}_{n \in \mathbb{N}}$ converges to $G$ in $V'$. Since it also converges to $F$ by assumption, necessarily $F = G$ by the uniqueness of the limit, hence, $F$ belongs to $Z$.
Step 5): $Z$ coincides with $V'$. This follows from the reflexivity of $V$. By contradiction, assume that $Z$ is a proper subspace of $V'$. Then, according to the Hahn-Banach theorem, there exists a linear continuous form $W$ on $V'$ such that $W(G) = 0$ for all $G \in Z$, but $W(F) \ne 0$ for some $F \in V' \setminus Z$. In other words, $W \in V''$, and $\|W\|_{V''} > 0$.
Since $V$ is reflexive, $W$ can be identified with an element $w$ in $V$; precisely, there exists a unique $w \in V$ such that
$$\alpha \|w\|^2_V \le a(w, w) = 0 ,$$
which implies $\|w\|_V = 0$, i.e., $w = 0$. This contradicts the property $\|w\|_V > 0$ stated before, and the claim is proven.
The proof of the Lax-Milgram Theorem is then concluded.
$$J : V \to \mathbb{R}, \qquad v \mapsto J(v) = \tfrac12\, a(v, v) - F(v) , \tag{4.2.12}$$
Proof. The result is an immediate consequence of the bilinearity of a and the linearity of F .
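The expansion of $J(w + v)$ referred to in this proof can be written out in a few lines. The sketch assumes that $J(v) = \tfrac12 a(v, v) - F(v)$ and that $a$ is symmetric, as in this part of the section; $w$ and $v$ are arbitrary elements of $V$.

```latex
\begin{align*}
J(w + v) &= \tfrac12\, a(w + v,\, w + v) - F(w + v) \\
         &= \tfrac12\, a(w, w) + \tfrac12 \big( a(w, v) + a(v, w) \big) + \tfrac12\, a(v, v) - F(w) - F(v) \\
         &= J(w) + \big( a(w, v) - F(v) \big) + \tfrac12\, a(v, v) .
\end{align*}
```

Bilinearity and linearity give the second line; symmetry is used only in the last step, to merge the two mixed terms.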
The identity can be interpreted as follows: think of $v$ as an increment $v = \delta w$ given to $w$. Then, $\delta w \mapsto a(w, \delta w) - F(\delta w)$ represents the linear part of the increment of $J$, whereas $\delta w \mapsto \tfrac12 a(\delta w, \delta w)$ represents the quadratic part. Equivalently stated, (4.2.14) is nothing but the Taylor expansion of $J$ around $w$
h(HJ) v1 , v2 i = a(v1 , v2 ) v1 , v2 V .
On the other hand, using the symmetry of the bilinear form, we obtain the relation
J(v) 14 kvk2V ,
If $v \ne 0$, then $a(v, v) > 0$ by coercivity, hence $J(u + v) > J(u)$, i.e., $u$ is a strict minimizer of $J$.
Conversely, let $u$ be a solution of Problem 4.2.8. For any fixed $v \in V$, consider the quadratic function (parabola) $\varphi : \mathbb{R} \to \mathbb{R}$ defined by
$$\varphi(\theta) = J(u + \theta v) = J(u) + \theta \big( a(u, v) - F(v) \big) + \tfrac12\, \theta^2 a(v, v)$$
The first integral on the right-hand side is called the Dirichlet integral of $v$ in $\Omega$.
In an (extremely simplified) description of small deformations in linear Elasticity (such as in the membrane problem), $v$ represents an admissible displacement, constrained to vanish on the boundary of the body which occupies the domain $\Omega$. Then, the quantity
$$\frac12 \int_\Omega \|\nabla v\|^2_{\mathbb{R}^n}$$
represents the internal elastic energy associated with the displaced configuration, whereas the integral
$$\int_\Omega f v$$
represents the potential energy associated with the work of the external force of density $f$ when the displacement $v$ takes place. Thus, $J(v)$ represents the total energy of the configuration described by $v$. Eq. (4.2.13) translates the well-known physical principle that, among all admissible displacements, the one which corresponds to the equilibrium of the elastic body under the external forces is characterized by having the minimal total energy.
Lemma 4.3.1. Under the assumptions on the coefficients A, a, a0 stated at the beginning of
Sect. 4.1, one has
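where $C(A, \mathbf{a}, a_0)$ depends upon $\|A\|_{(L^\infty(\Omega))^{n \times n}}$, $\|\mathbf{a}\|_{(L^\infty(\Omega))^n}$, $\|a_0\|_{L^\infty(\Omega)}$ and $\|\mathbf{a} \cdot \mathbf{n}\|_{L^\infty(\partial\Omega_N)}$.
Proof. In the proof, $C$ will denote a constant, independent of $w$, $v$ and the coefficients of the operator, which may be different from place to place.
Using Hölder's inequality (3.1.11) with exponents $\infty$, $2$ and $2$, we easily bound each addend in the definition of $a(w, v)$; precisely:
$$\left| \int_\Omega (A \nabla w) \cdot \nabla v \right| \le C \|A\|_{(L^\infty(\Omega))^{n \times n}} \|\nabla w\|_{(L^2(\Omega))^n} \|\nabla v\|_{(L^2(\Omega))^n} \le C \|A\|_{(L^\infty(\Omega))^{n \times n}} \|w\|_{H^1(\Omega)} \|v\|_{H^1(\Omega)} ,$$
$$\left| \int_\Omega w\, \mathbf{a} \cdot \nabla v \right| \le \|\mathbf{a}\|_{(L^\infty(\Omega))^n} \|w\|_{L^2(\Omega)} \|\nabla v\|_{(L^2(\Omega))^n} \le \|\mathbf{a}\|_{(L^\infty(\Omega))^n} \|w\|_{H^1(\Omega)} \|v\|_{H^1(\Omega)} ,$$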
4.3. WELL POSEDNESS OF ELLIPTIC PROBLEMS 73
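$$\left| \int_\Omega a_0\, w v \right| \le \|a_0\|_{L^\infty(\Omega)} \|w\|_{L^2(\Omega)} \|v\|_{L^2(\Omega)} \le \|a_0\|_{L^\infty(\Omega)} \|w\|_{H^1(\Omega)} \|v\|_{H^1(\Omega)} ,$$
$$\left| \int_{\partial\Omega_N} \mathbf{a} \cdot \mathbf{n}\, w v \right| \le \|\mathbf{a} \cdot \mathbf{n}\|_{L^\infty(\partial\Omega_N)} \|w\|_{L^2(\partial\Omega_N)} \|v\|_{L^2(\partial\Omega_N)} \le \|\mathbf{a} \cdot \mathbf{n}\|_{L^\infty(\partial\Omega_N)} \|w\|_{L^2(\partial\Omega)} \|v\|_{L^2(\partial\Omega)} \le C \|\mathbf{a} \cdot \mathbf{n}\|_{L^\infty(\partial\Omega_N)} \|w\|_{H^1(\Omega)} \|v\|_{H^1(\Omega)} .$$
The last inequality follows from the continuity of the trace operator $\gamma : H^1(\Omega) \to L^2(\partial\Omega)$ (recall (3.6.1)).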
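Lemma 4.3.2. Let the assumptions on the data $f$, $g$, $h$ stated at the beginning of Sect. 4.1 be satisfied. In addition, suppose that the function $u_g$ is any lifting of the data $g$ inside $\Omega$ satisfying $\|u_g\|_{H^1(\Omega)} \le 2 \|g\|_{H^{1/2}(\partial\Omega)}$ (according to the definition (3.6.5) of the $H^{1/2}(\partial\Omega)$-norm of $g$). Then, one has
$$|F(v)| \le \left( \|f\|_{L^2(\Omega)} + \|h\|_{L^2(\partial\Omega_N)} \right) \|v\|_{H^1(\Omega)} \qquad \forall v \in H^1_{0,D}(\Omega), \tag{4.3.2}$$
$$|\tilde F(v)| \le \left( \|f\|_{L^2(\Omega)} + \|h\|_{L^2(\partial\Omega_N)} + 2\, C(A, \mathbf{a}, a_0)\, \|g\|_{H^{1/2}(\partial\Omega)} \right) \|v\|_{H^1(\Omega)} \qquad \forall v \in H^1_{0,D}(\Omega). \tag{4.3.3}$$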
Proof. The first inequality follows immediately from the Cauchy-Schwarz inequality. Concerning
the second one, we have
Next, let us define a set of assumptions on the coefficients of the operator $L$, which ensure that the bilinear form $a(w, v)$ is coercive in $H^1_{0,D}(\Omega)$ with respect to the $H^1(\Omega)$-norm. At first, let us observe that
$$a(v, v) = \int_\Omega \nabla v \cdot A \nabla v - \int_\Omega v\, \mathbf{a} \cdot \nabla v + \int_\Omega a_0 v^2 + \int_{\partial\Omega_N} \mathbf{a} \cdot \mathbf{n}\, v^2 \qquad \forall v \in H^1_{0,D}(\Omega). \tag{4.3.4}$$
The first integral on the right-hand side is non-negative, since the assumption that the operator is elliptic throughout the domain implies that $\xi \cdot A \xi > 0$ almost everywhere in $\Omega$, for any nonzero $\xi \in \mathbb{R}^n$. However, this assumption is not sufficient to yield a control on the $L^2$-norm of $\nabla v$; we need a stronger condition, expressed by the following definition.
Definition 4.3.3. The operator $L$ is said to be uniformly elliptic in $\Omega$ if there exists a constant $\alpha_0 > 0$ such that
The second integral on the right-hand side of (4.3.4) can be manipulated as follows. We first note that $v \nabla v = \nabla(\tfrac12 v^2)$, so that
$$\int_\Omega v\, \mathbf{a} \cdot \nabla v = \int_\Omega \mathbf{a} \cdot \nabla(\tfrac12 v^2) .$$
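the last integral is indeed an integral over $\partial\Omega_N$, since $v$ vanishes on $\partial\Omega_D$. Thus, substituting this expression and inequality (4.3.6) into (4.3.4) yields
$$a(v, v) \ge \alpha_0\, \|\nabla v\|^2_{(L^2(\Omega))^n} + \int_\Omega \left( \tfrac12 \nabla \cdot \mathbf{a} + a_0 \right) v^2 + \tfrac12 \int_{\partial\Omega_N} \mathbf{a} \cdot \mathbf{n}\, v^2 \qquad \forall v \in H^1_{0,D}(\Omega).$$
At this point, we make the assumptions that $\mathbf{a} \cdot \mathbf{n} \ge 0$ on $\partial\Omega_N$ and that there exists a constant $\mu$ such that $\tfrac12 \nabla \cdot \mathbf{a} + a_0 \ge \mu$ almost everywhere in $\Omega$. Then, the above inequality implies
Finally, we observe that if the Poincaré-Friedrichs inequality (3.7.4) holds for $H^1_{0,D}(\Omega)$, then it is enough to assume $\mu \ge 0$ to get coercivity; indeed, recalling the first inequality in (3.7.6), we obtain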
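Lemma 4.3.4. Assume that $L$ is uniformly elliptic in $\Omega$, i.e., (4.3.5) holds true. Furthermore, assume that $\nabla \cdot \mathbf{a} \in L^\infty(\Omega)$ and that there exists a constant $\mu \ge 0$ for which $\tfrac12 \nabla \cdot \mathbf{a} + a_0 \ge \mu$ almost everywhere in $\Omega$; let $\mu > 0$ whenever the Poincaré-Friedrichs inequality (3.7.4) does not hold for $H^1_{0,D}(\Omega)$. Finally, assume that $\mathbf{a} \cdot \mathbf{n} \ge 0$ on $\partial\Omega_N$. Then, the bilinear form $a(w, v)$ is coercive in $H^1_{0,D}(\Omega)$, with coercivity constant in the $H^1(\Omega)$-norm given by
$$\alpha = \begin{cases} \dfrac{\alpha_0}{1 + C_P^2} & \text{if the Poincaré-Friedrichs inequality holds in } H^1_{0,D}(\Omega), \\ \min(\alpha_0, \mu) & \text{if the inequality does not hold.} \end{cases} \tag{4.3.7}$$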
The three previous lemmas guarantee that the bilinear form $a$ and the linear form $F$ or $\tilde F$, which define Problems 4.1.2 or 4.1.3, satisfy the assumptions of the Lax-Milgram Theorem. Consequently, each of these Problems is well-posed. Recalling that the solution $u$ of Problem 4.1.1 is given by $u = u_0 + u_g$, hence in particular $\|u\|_{H^1(\Omega)} \le \|u_0\|_{H^1(\Omega)} + \|u_g\|_{H^1(\Omega)}$, we arrive at the following final result.
Theorem 4.3.5. Let the assumptions stated in Lemmas 4.3.1, 4.3.2 and 4.3.4 on the coefficients $A$, $\mathbf{a}$, $a_0$ and the data $f$, $g$, $h$ of the mixed Dirichlet/Neumann boundary-value problem (4.1.1) be satisfied. Then, the weak formulation of the problem, given by Problem 4.1.1, admits one and only one solution $u$, for which the following bound holds:
$$\|u\|_{H^1(\Omega)} \le C \left( \|f\|_{L^2(\Omega)} + \|g\|_{H^{1/2}(\partial\Omega)} + \|h\|_{L^2(\partial\Omega_N)} \right), \tag{4.3.8}$$
where the constant $C$ depends upon $\|A\|_{(L^\infty(\Omega))^{n \times n}}$, $\|\mathbf{a}\|_{(L^\infty(\Omega))^n}$, $\|a_0\|_{L^\infty(\Omega)}$ and $\|\mathbf{a} \cdot \mathbf{n}\|_{L^\infty(\partial\Omega_N)}$.
Note that a direct estimate on the norm $\|u\|_{H^1_0(\Omega)} = \|\nabla u\|_{(L^2(\Omega))^n}$ (recall Proposition 3.7.3) can be obtained from the relations
which give
$$\|u\|_{H^1_0(\Omega)} \le C_P\, \|f\|_{L^2(\Omega)} . \tag{4.3.10}$$
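The bound (4.3.10) can be checked on a one-dimensional model case with a known solution. The sketch below is an illustration under explicit assumptions: the problem $-u'' = f$ in (0, 1) with $u(0) = u(1) = 0$ and $f \equiv 1$, whose solution is $u(x) = x(1 - x)/2$; the test function $v(x) = \sin(\pi x)$; and the value $C_P = 2^{-1/2}$ obtained from the proof of the Poincaré-Friedrichs inequality on the unit interval. All names are hypothetical.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def f(x):
    """Right-hand side of -u'' = f (illustrative datum)."""
    return 1.0


def du(x):
    """Derivative of the exact solution u(x) = x (1 - x) / 2."""
    return 0.5 - x


def v(x):
    """A test function vanishing at both end points."""
    return math.sin(math.pi * x)


def dv(x):
    """Derivative of v."""
    return math.pi * math.cos(math.pi * x)


# Weak formulation tested against v: integral of u' v' equals integral of f v.
lhs = simpson(lambda x: du(x) * dv(x), 0.0, 1.0)
rhs = simpson(lambda x: f(x) * v(x), 0.0, 1.0)

# Norms entering the bound: ||u'||_{L^2} <= C_P ||f||_{L^2}.
grad_norm = math.sqrt(simpson(lambda x: du(x) ** 2, 0.0, 1.0))
f_norm = math.sqrt(simpson(lambda x: f(x) ** 2, 0.0, 1.0))
C_P = 1.0 / math.sqrt(2.0)

if __name__ == "__main__":
    print("weak form:", lhs, rhs)
    print("bound:", grad_norm, "<=", C_P * f_norm)
```

Here $\|u'\|_{L^2} = 1/\sqrt{12}$, well below $C_P \|f\|_{L^2} = 1/\sqrt{2}$, so the estimate holds but is far from sharp for this datum.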
A similar well-posedness result holds if the datum $f$ is replaced by the more general datum $F \in H^{-1}(\Omega)$, according to (4.1.10). In this case, recalling (3.9.7), one has
$$\|u\|_{H^1(\Omega)} \le (1 + C_P^2)\, \|F\|_{H^{-1}(\Omega)} \le (1 + C_P^2) \left( \sum_{i=0}^{n} \|f_i\|^2_{L^2(\Omega)} \right)^{1/2} . \tag{4.3.11}$$
The latter result can be stated in the form of the following fundamental property.
ii) Consider now the non-homogeneous Neumann problem for the Helmholtz equation, stated in (4.1.11). Assume that $a_0 \ge \mu$ almost everywhere in $\Omega$, for a suitable constant $\mu > 0$. Then, the problem admits one and only one variational solution $u \in H^1(\Omega)$ satisfying (4.1.12). The following bound holds for the $H^1(\Omega)$-norm of $u$:
$$\|u\|_{H^1(\Omega)} \le \frac{1}{\min(1, \mu)} \left( \|f\|_{L^2(\Omega)} + \|h\|_{L^2(\partial\Omega)} \right) . \tag{4.3.12}$$
iii) Finally, let us consider the mixed Dirichlet/Neumann problem for the convection-diffusion equation, stated in (4.1.13). Let us assume that the Dirichlet part $\partial\Omega_D$ of the boundary is non-empty, and that the velocity field $\mathbf{a}$ is solenoidal, i.e., $\nabla \cdot \mathbf{a} = 0$ in $\Omega$. Note that by the very definition of $\partial\Omega_D$ we have $\mathbf{a} \cdot \mathbf{n} \ge 0$ on $\partial\Omega_N$. Thus, the problem admits one and only one weak solution $u \in H^1(\Omega)$ satisfying $u = g$ on $\partial\Omega_D$ and (4.1.14). The following bound holds for the $H^1(\Omega)$-norm of $u$:
1+C 2
kukH1 () P kf kL2 () + kgkH1/2 () , (4.3.13)
This means that the equation is satisfied in the distributional sense, i.e., we have
From this relation, we can derive additional information on the solution. Indeed, we write
$$\nabla \cdot (A \nabla u) = \nabla \cdot (\mathbf{a} u) + a_0 u - f \qquad \text{in } \mathcal{D}'(\Omega),$$
and we observe that the right-hand side, which we write $(\nabla \cdot \mathbf{a}) u + \mathbf{a} \cdot \nabla u + a_0 u - f$, is a sum of functions belonging to $L^2(\Omega)$. Thus, the principal part of the operator, $L^{(2)} u = -\nabla \cdot (A \nabla u)$, is not just an element in $\mathcal{D}'(\Omega)$ (or in $H^{-1}(\Omega)$), but is an element of $L^2(\Omega)$. Therefore, if we define the domain of the operator $L^{(2)}$ as the space
we conclude that the solution $u$ is not just a function in $H^1(\Omega)$, but it satisfies
$$u \in D(L^{(2)}) . \tag{4.4.4}$$
The property $\nabla \cdot (A \nabla u) \in L^2(\Omega)$ (together with the property $\nabla \cdot (\mathbf{a} u) \in L^2(\Omega)$, already used above) allows us to write (4.4.1) in the equivalent form
$$-\int_\Omega \nabla \cdot (A \nabla u)\, \varphi + \int_\Omega \nabla \cdot (\mathbf{a} u)\, \varphi + \int_\Omega a_0 u\, \varphi = \int_\Omega f \varphi \qquad \forall \varphi \in \mathcal{D}(\Omega) ;$$
i.e., the equation is actually satisfied in a stronger sense than the distributional one, compare with (4.4.2).
Condition (4.4.4) has also an important consequence on the interpretation of the Neumann boundary conditions. Indeed, the following crucial property holds.
4.4. WHAT THE WEAK SOLUTION SATISFIES 77
v
Proposition 4.4.1. For any function v D(L(2) ), the conormal derivative n A
is well defined as
1/2 1/2
an element of the dual space H () = (H ()) . Precisely, the following formula holds:
Z Z
v
h , i = (Av) w + (Av) w H1/2 () , (4.4.6)
nA
Finally, we use the fact that $u$ is the solution of the weak formulation (4.1.1), so that the right-hand side equals $\int_{\partial\Omega_N} h v$. In other words, the conormal derivative of $u$ satisfies
Z
u
h , i = h H1/2 (), = 0 on N . (4.4.8)
nA N
this is precisely the way in which the Neumann boundary condition is satisfied by u.
We summarize the results obtained so far in the following theorem.
Theorem 4.4.2. Under the sole assumptions on the domain, the coefficients and the data for
which Theorem 4.3.5 holds, the weak solution of the boundary-value problem (4.1.1) is such that:
Proposition 4.5.1. Let $\Omega'$ be any open set contained with its closure in $\Omega$. Let $m \ge 0$ be a non-negative integer and assume that there exists an open set $\Omega''$, satisfying $\Omega' \subset \Omega'' \subseteq \Omega$, such that
Then,
$$u \in H^{m+2}(\Omega').$$
The result means that if the coefficients of the operator are sufficiently smooth, then there is a gain of two orders of Sobolev regularity between the data $f$ and the solution $u$, i.e.,
$$f \in H^m(\Omega'') \;\Longrightarrow\; u \in H^{m+2}(\Omega'). \tag{4.5.1}$$
This is a manifestation of the property that elliptic operators are regularizing operators.
A partial yet simple justification of the previous result can be provided for the model equation
$$-\Delta u = f \quad \text{in } \Omega = \mathbb{R}^n,$$
using the powerful tool of the Fourier transform. Indeed, using (3.4.3), if both $u$ and $f$ belong to $L^2(\mathbb{R}^n)$ this equation is equivalent to
Using the expression (3.4.4) for the Sobolev norms, one easily gets
$$\|u\|_{H^{m+2}(\mathbb{R}^n)} \le C\big(\|u\|_{L^2(\mathbb{R}^n)} + \|f\|_{H^m(\mathbb{R}^n)}\big), \tag{4.5.3}$$
which is precisely (4.5.1) in the particular situation of $\Omega$ being the full space.
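The passage from the model equation to the bound (4.5.3) can be sketched as follows. This is only an outline under stated assumptions: the Fourier convention is taken so that $\widehat{\partial_j u}(\xi) = i\xi_j\hat u(\xi)$, and the Sobolev norm is taken as $\|v\|_{H^s}^2 = \int (1+|\xi|^2)^s |\hat v(\xi)|^2\,d\xi$. The conventions in (3.4.3) and (3.4.4) may differ by constant factors, which would only change the constant $C$.

```latex
% Model equation in Fourier variables (assumed convention):
%   -\Delta u = f   <=>   |\xi|^2 \hat u(\xi) = \hat f(\xi).
% Elementary inequality, with s = |\xi|^2 >= 0 and p = m + 2 >= 1:
%   (1 + s)^p <= 2^{p-1} (1 + s^p).
\begin{align*}
(1+|\xi|^2)^{m+2}\,|\hat u|^2
  &\le 2^{m+1}\big(1 + |\xi|^{2m}\,|\xi|^4\big)\,|\hat u|^2 \\
  &=   2^{m+1}\big(|\hat u|^2 + |\xi|^{2m}\,|\hat f|^2\big) \\
  &\le 2^{m+1}\big(|\hat u|^2 + (1+|\xi|^2)^{m}\,|\hat f|^2\big).
\end{align*}
% Integrating over \mathbb{R}^n and taking square roots:
\[
\|u\|_{H^{m+2}(\mathbb{R}^n)}
  \le 2^{(m+1)/2}\big(\|u\|_{L^2(\mathbb{R}^n)} + \|f\|_{H^m(\mathbb{R}^n)}\big).
\]
```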
4.5. BACK TO CLASSICS: THE REGULARITY OF THE WEAK SOLUTION 79
Going back to the general situation, condition $u \in H^{m+2}(\Omega')$ with $m$ large enough implies classical regularity of $u$, thanks to the Sobolev Imbedding Theorem 3.8.2. Precisely, if $m > n/2 - 2$ then
$$u \in H^{m+2}(\Omega') \;\Longrightarrow\; u \in C^{k,\alpha}(\overline{\Omega'}),$$
with $k = [m + 2 - n/2]$ and $\alpha = m + 2 - n/2 - k$ if $m - n/2$ is not an integer, or $k = m + 1 - n/2$ and $\alpha < 1$ arbitrary if $m - n/2$ is an integer. In particular, if the coefficients and the data $f$ are infinitely differentiable in $\Omega$ (so that one can take $m$ arbitrarily large), then $u$ is infinitely differentiable in $\Omega$.
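The rule for $k$ and $\alpha$ stated above can be tabulated with a few lines of code. The following is a minimal sketch; the function name `holder_class` and the convention of returning `None` when any $\alpha < 1$ is admissible are choices made here for illustration, not part of the notes.

```python
from fractions import Fraction
from math import floor

def holder_class(m: int, n: int):
    """Return (k, alpha) such that H^{m+2} embeds in C^{k,alpha} in dimension n.

    alpha is None in the integer case, where any alpha < 1 is admissible.
    """
    s = Fraction(2 * (m + 2) - n, 2)   # s = m + 2 - n/2, kept exact
    if s <= 0:
        raise ValueError("the rule requires m > n/2 - 2")
    if s.denominator == 1:             # m - n/2 is an integer
        return int(s) - 1, None        # k = m + 1 - n/2, alpha < 1 arbitrary
    k = floor(s)                       # k = [m + 2 - n/2]
    return k, s - k

for n in (2, 3):
    for m in (0, 1, 2):
        print(n, m, holder_class(m, n))
```

In dimensions two and three the first component returned for $m = 0, 1, 2$ is $0, 1, 2$ respectively, in agreement with the minimal-regularity results detailed next.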
It is worthwhile detailing certain results of minimal regularity in dimension two and three.
Corollary 4.5.2. Let $n = 2$ or $3$, and let the assumptions of Proposition 4.5.1 hold. Then,
$m = 0$ implies $u \in C^0(\overline{\Omega'})$,
$m = 1$ implies $u \in C^1(\overline{\Omega'})$,
$m = 2$ implies $u \in C^2(\overline{\Omega'})$.
In the latter case, $u$ is a classical solution of the partial differential equation at each point of $\Omega'$.
The result can be extended to more general situations, for instance when $\Gamma_D$ and $\Gamma_N$ are both non-empty, but each connected component of $\partial\Omega$ is completely contained in one of these sets. In all cases, a result of classical regularity similar to Corollary 4.5.2 holds in the whole of $\overline{\Omega}$.
The question of assessing the regularity of the weak solution $u$ up to the boundary becomes quite delicate if $\partial\Omega$ is not a manifold of class $C^{m+1}$ (for instance, if it has corners), or if there is a transition between Dirichlet and Neumann boundary conditions at some point of $\partial\Omega$. We confine ourselves to the illustration of some of the possible situations in the case of a polygonal domain.
80 CHAPTER 4. ELLIPTIC PROBLEMS
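$$-\Delta u(0,0) = 1.$$
On the other hand, the boundary condition on the side $[0,1]\times\{0\}$ implies $\frac{\partial^2 u}{\partial x^2}(x,0) = 0$ for $0 < x < 1$; by continuity, we would also have $\frac{\partial^2 u}{\partial x^2}(0,0) = 0$. Similarly, using the boundary condition on the side $\{0\}\times[0,1]$ we would have $\frac{\partial^2 u}{\partial y^2}(0,0) = 0$, whence
$$-\Delta u(0,0) = 0,$$
$$f \in L^2(\Omega) = H^0(\Omega) \;\Longrightarrow\; u \in H^2(\Omega) \tag{4.5.5}$$
is true?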
We assume that $\Omega$ is a bounded polygonal domain in $\mathbb{R}^2$ with vertices $V_i$, $i = 1, \dots, I$. Each side $\Gamma_i$ (which could even be a curved side) carries a unique type of boundary condition, i.e., either $\Gamma_i \subset \Gamma_D$ or $\Gamma_i \subset \Gamma_N$.
Let us assume that the vertex $V_i$ is common to the sides $\Gamma_i$ and $\Gamma_{i+1}$ (setting $\Gamma_{I+1} = \Gamma_1$); let $\omega_i \in (0, 2\pi)$ be the measure of the angle at $V_i$, contained in $\Omega$. Note that the situation of a point $P$ internal to a side at which there is a change of type of boundary conditions can be included into the present setting by considering $P$ as an additional vertex of the polygon, with associated angle of measure $\pi$.
We also associate an angle $\phi_i$ to each side $\Gamma_i$, by setting $\phi_i = 0$ if $\Gamma_i \subset \Gamma_N$ and $\phi_i = \frac{\pi}{2}$ if $\Gamma_i \subset \Gamma_D$. (More generally, we could enforce as boundary condition the vanishing of an oblique derivative of $u$, i.e., $\frac{\partial u}{\partial \nu_i} = 0$ on $\Gamma_i$, where $\nu_i$ is a fixed unit vector which is not perpendicular to the normal vector $n_i$; in such a case, $\phi_i$ would be the measure of the angle between $\nu_i$ and $n_i$.)
With these notations at hand, we define the quantities
$$\lambda_{i,m} = \frac{\phi_i - \phi_{i+1} + m\pi}{\omega_i}, \qquad m \in \mathbb{Z}, \tag{4.5.6}$$
and, correspondingly, the functions $S_{i,m}$ which, in a polar coordinate system $(r, \theta)$ around the vertex $V_i$, take the form
$$S_{i,m}(r,\theta) = \frac{r^{\lambda_{i,m}}}{\sqrt{\omega_i}\,\lambda_{i,m}} \cos(\lambda_{i,m}\theta + \phi_{i+1})\,\eta_i(r,\theta);$$
here, each cut-off function $\eta_i$ is infinitely differentiable, takes the value 1 at $V_i$ and vanishes identically if $r$ is large enough.
The following result describes the structure of the solution of Problem (4.5.4).
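Theorem 4.5.4. Let $u \in H^1_{0,D}(\Omega)$ be the weak solution of Problem (4.5.4), with $f \in L^2(\Omega)$. Then, $u$ can be represented as
$$u = u_{\mathrm{reg}} + u_{\mathrm{sing}},$$
where $u_{\mathrm{reg}} \in H^2(\Omega)$, while
$$u_{\mathrm{sing}} = \sum_{i=1}^{I} \;\sum_{0 < \lambda_{i,m} < 1} c_{i,m}(f)\, S_{i,m}$$
If $\Omega$ is convex around $V_i$, i.e., if $0 < \omega_i < \pi$, then $\lambda_i > 1$, so that there is no $m \in \mathbb{Z}$ such that $0 < \lambda_{i,m} < 1$: the solution is $H^2$ in a neighborhood of $V_i$.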
This result is a particular case of the following property.
Obviously, the counter-example at the beginning of the present section shows that a similar result cannot hold for higher-order regularity: the convexity of the domain is not enough to ensure that $f \in H^2(\Omega)$ implies $u \in H^4(\Omega)$.
Going back to our discussion, if $\Omega$ is not convex around $V_i$, i.e., if $\pi < \omega_i < 2\pi$, then $\frac{1}{2} < \lambda_i < 1$, so that there is exactly one $m \in \mathbb{Z}$, namely $m = 1$, such that $0 < \lambda_{i,m} < 1$: the solution is not $H^2$ in any neighborhood of $V_i$. For instance, if $\lambda_i = \frac{2}{3}$ as in the re-entrant corner of an L-shaped domain, then one can prove that $u$ belongs at most to $H^{7/4}$ in a neighborhood of $V_i$.
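ii) Consecutive sides carrying different boundary conditions
Assume that $\Gamma_i \subset \Gamma_D$, whereas $\Gamma_{i+1} \subset \Gamma_N$. Then, $\phi_i - \phi_{i+1} = \frac{\pi}{2}$, so that
$$\lambda_{i,m} = \Big(m + \frac{1}{2}\Big)\frac{\pi}{\omega_i}.$$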
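The exponents $\lambda_{i,m}$ that fall in $(0,1)$, and hence produce a singular term, can be listed mechanically from (4.5.6). The sketch below is illustrative only: the function name `singular_exponents`, the labels `"D"`/`"N"` and the choice of expressing all angles in units of $\pi$ (so that exact rational arithmetic can be used) are assumptions made here.

```python
from fractions import Fraction

# Angle attached to a side, in units of pi, as in the text:
# 0 for a Neumann side, 1/2 for a Dirichlet side.
PHI = {"N": Fraction(0), "D": Fraction(1, 2)}

def singular_exponents(omega_over_pi, bc_i, bc_next, m_range=range(-4, 5)):
    """Return [(m, lambda_{i,m})] with 0 < lambda_{i,m} < 1 at one vertex.

    omega_over_pi: interior angle at the vertex divided by pi
    bc_i, bc_next: "D" or "N" on the sides Gamma_i and Gamma_{i+1}
    """
    omega = Fraction(omega_over_pi)
    found = []
    for m in m_range:
        # formula (4.5.6); the common factor pi cancels
        lam = (PHI[bc_i] - PHI[bc_next] + m) / omega
        if 0 < lam < 1:
            found.append((m, lam))
    return found

print(singular_exponents(Fraction(3, 2), "D", "D"))  # re-entrant corner, Dirichlet sides
print(singular_exponents(Fraction(1, 2), "D", "D"))  # right angle, Dirichlet sides
print(singular_exponents(1, "D", "N"))               # change of condition on a straight side
```

An empty list means that no singular function is attached to that vertex, so the solution is $H^2$ nearby.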
4.6 Exercises
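4.1. Consider the following problem in $\Omega = (0,1)^2 \subset \mathbb{R}^2$:
$$-\Delta u = f \ \text{in } \Omega, \qquad u = 0 \ \text{on } \Gamma_2 \cup \Gamma_3 \cup \Gamma_4, \qquad \frac{\partial u}{\partial n} + \alpha u = 0 \ \text{on } \Gamma_1,$$
where $\alpha \in \mathbb{R}$ and $\Gamma_1 = (0,1)\times\{0\}$, $\Gamma_2 = \{1\}\times(0,1)$, $\Gamma_3 = (0,1)\times\{1\}$, $\Gamma_4 = \{0\}\times(0,1)$.
(ii) Find the conditions on $\alpha$ which guarantee the coercivity of the associated bilinear form.
4.2. Setting up a suitable bilinear form in $V = H^1_0(\Omega)\times H^1_0(\Omega)$, use the Lax-Milgram Theorem to prove the existence and uniqueness of the solution of the following elliptic system:
$$-\Delta u_1 + u_1 + \frac{\partial u_2}{\partial x_1} = f_1 \ \text{in } \Omega, \qquad -\Delta u_2 + u_1 + u_2 = f_2 \ \text{in } \Omega, \qquad u_1 = u_2 = 0 \ \text{on } \partial\Omega.$$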
Under the name of Maximum Principle, we find several important theoretical properties of the
solutions of elliptic and parabolic problems, all related to the ordering relation between real
numbers. The Maximum Principle can be expressed in various forms, from the classical ones to
the more general statements derived from the weak, or variational, formulations of the problems.
In this chapter, we will confine ourselves to elliptic boundary value problems. The Maximum
Principle for parabolic problems will be briefly accounted for in Sect.
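Then, necessarily $\nabla u(\bar{\mathbf{x}}) = 0$, and furthermore the Hessian of $u$ in $\bar{\mathbf{x}}$ is nonnegative, which implies in particular
$$\frac{\partial^2 u}{\partial x_i^2}(\bar{\mathbf{x}}) \ge 0, \qquad i = 1, 2, \dots, n.$$
Thus
$$\Delta u(\bar{\mathbf{x}}) = \sum_{i=1}^{n} \frac{\partial^2 u}{\partial x_i^2}(\bar{\mathbf{x}}) \ge 0$$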
84 CHAPTER 5. THE MAXIMUM PRINCIPLE
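Proof. Suppose that the second inequality does not hold, and assume that there exists $\bar{\mathbf{x}} \in \Omega$ such that
$$u(\bar{\mathbf{x}}) = \max_{\overline{\Omega}} u > \max_{\partial\Omega} g;$$
we can then choose $\mu \in \mathbb{R}$ such that
$$\max_{\partial\Omega} g < \mu < u(\bar{\mathbf{x}}).$$
Define now $\Omega_\mu := \{\mathbf{x} \in \Omega : u(\mathbf{x}) > \mu\}$; this set is nonempty (since $\bar{\mathbf{x}} \in \Omega_\mu$), bounded (since $\Omega_\mu \subseteq \Omega$) and open (since $u$ is continuous); furthermore, $\partial\Omega_\mu = \{\mathbf{x} \in \Omega : u(\mathbf{x}) = \mu\}$. In $\Omega_\mu$, $u$ solves the Dirichlet problem
$$-\Delta u = 0 \ \text{in } \Omega_\mu, \qquad u = \mu \ \text{on } \partial\Omega_\mu.$$
By the uniqueness of the solution, we must then have $u \equiv \mu$ in $\Omega_\mu$; but, by assumption, we have $u(\bar{\mathbf{x}}) > \mu$, a contradiction. The first inequality in the thesis is proven similarly.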
5.2. VARIATIONAL RESULTS 85
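Remark 5.1.5. For harmonic functions, we can actually state more: a harmonic function cannot have strict maxima or minima inside its domain.
The result is an immediate consequence of the following property of harmonic functions, which we will derive in a moment: given a point $\mathbf{x} \in \Omega$ and any neighborhood $B_R(\mathbf{x}) = \{\mathbf{z} : \|\mathbf{z} - \mathbf{x}\| < R\}$ contained in $\Omega$, with boundary $\Gamma_R = \{\mathbf{z} : \|\mathbf{z} - \mathbf{x}\| = R\}$, we have the expression:
$$u(\mathbf{x}) = \frac{1}{|\Gamma_R|}\int_{\Gamma_R} u(\boldsymbol{\sigma})\,d\sigma, \tag{5.1.2}$$
where $|\Gamma_R|$ denotes the measure of $\Gamma_R$. As a consequence, $u$ cannot achieve, for instance, a local maximum value in $\mathbf{x}$, since in that case we would have $u(\boldsymbol{\sigma}) < u(\mathbf{x})$ for every $\boldsymbol{\sigma}$ which parametrizes $\Gamma_R$, and then
$$u(\mathbf{x}) = \frac{1}{|\Gamma_R|}\int_{\Gamma_R} u(\boldsymbol{\sigma})\,d\sigma < \frac{1}{|\Gamma_R|}\int_{\Gamma_R} u(\mathbf{x})\,d\sigma = u(\mathbf{x}),$$
a contradiction. For a local minimum value in $\mathbf{x}$, the reasoning is analogous.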
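We will derive (5.1.2) in dimension 2, the general case being similar. Consider the function $v(\mathbf{z}) = -\frac{1}{2\pi}\log\frac{r}{R}$, with $r = \|\mathbf{z} - \mathbf{x}\|$, which according to (2.3.2) satisfies
$$-\Delta v = \delta_{\mathbf{x}} \ \text{in } B_R(\mathbf{x}), \qquad v = 0 \ \text{on } \Gamma_R.$$
Then,
$$u(\mathbf{x}) = \langle \delta_{\mathbf{x}}, u\rangle = \langle -\Delta v, u\rangle = \int_{B_R(\mathbf{x})} \nabla v\cdot\nabla u\,d\mathbf{x} - \int_{\Gamma_R} \frac{\partial v}{\partial n}\,u\,d\sigma$$
$$= -\int_{B_R(\mathbf{x})} v\,\Delta u\,d\mathbf{x} - \int_{\Gamma_R} \frac{\partial v}{\partial n}\,u\,d\sigma + \int_{\Gamma_R} v\,\frac{\partial u}{\partial n}\,d\sigma = -\int_{\Gamma_R} \frac{\partial v}{\partial n}\,u\,d\sigma,$$
since $u$ is harmonic in $B_R(\mathbf{x})$ and $v$ vanishes on $\Gamma_R$. We conclude by observing that $\frac{\partial v}{\partial n} = -\frac{1}{2\pi R}$ on $\Gamma_R$.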
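The mean value property (5.1.2) is easy to check numerically in dimension 2. The sketch below is an illustration under assumptions made here: the two test functions, the centre and the radius are arbitrary choices, and the circle average is approximated by the trapezoidal rule in the angle.

```python
import math

def circle_mean(u, cx, cy, R, N=64):
    """Average of u over the circle of radius R centred at (cx, cy),
    computed with the N-point trapezoidal rule in the angle."""
    total = 0.0
    for j in range(N):
        t = 2.0 * math.pi * j / N
        total += u(cx + R * math.cos(t), cy + R * math.sin(t))
    return total / N

def u_harmonic(x, y):
    # x^2 - y^2 and exp(x) cos(y) are both harmonic in the plane
    return x * x - y * y + math.exp(x) * math.cos(y)

def u_not_harmonic(x, y):
    # Laplacian equals 4, so the mean value property must fail
    return x * x + y * y

cx, cy, R = 0.3, -0.2, 0.7
print(circle_mean(u_harmonic, cx, cy, R) - u_harmonic(cx, cy))
print(circle_mean(u_not_harmonic, cx, cy, R) - u_not_harmonic(cx, cy))
```

For the harmonic function the difference is at the level of rounding errors, whereas for $x^2 + y^2$ the circle average exceeds the central value by $R^2$.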
The functions $v^+$ and $v^-$ are said to be the positive part and the negative part of $v$, respectively. Note that the following decomposition holds true:
$$v = v^+ - v^-;$$
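$$Lu = -\nabla\cdot(A\nabla u) + \mathbf{a}\cdot\nabla u + a_0 u,$$
is continuous and coercive in $H^1_0(\Omega)$; furthermore, given $f \in L^2(\Omega)$ and $g \in H^{1/2}(\partial\Omega) \cap C^0(\partial\Omega)$, let us denote by $u$ the solution of the Dirichlet problem
$$Lu = f \ \text{in } \Omega, \qquad u = g \ \text{on } \partial\Omega,$$
that is
$$u \in H^1(\Omega), \quad u|_{\partial\Omega} = g, \qquad a(u, v) = (f, v) \quad \forall v \in H^1_0(\Omega).$$
Finally, let us set
$$m_g = \min_{\partial\Omega} g \quad \text{and} \quad M_g = \max_{\partial\Omega} g. \tag{5.2.2}$$
i) if $f - a_0 m_g \ge 0$ in $\Omega$, then $u \ge m_g$ in $\Omega$;
ii) if $f - a_0 M_g \le 0$ in $\Omega$, then $u \le M_g$ in $\Omega$.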
Proof. We prove i), since the proof of ii) is similar. Let us write $u$ in the form $u = (u - m_g) + m_g$ and let us substitute it in the variational formulation:
$$a(u - m_g, v) = (f - a_0 m_g, v).$$
$$(u - m_g)^- \in H^1_0(\Omega),$$
since the supports of $v^+$ and $v^-$ do not intersect in $\Omega$, except for possible sets of measure zero; consequently, equation (5.2.4) becomes
$$-a\big((u - m_g)^-, (u - m_g)^-\big) = \big(f - a_0 m_g, (u - m_g)^-\big),$$
which implies
$$\|(u - m_g)^-\|^2_{1,\Omega} \le 0$$
and finally $(u - m_g)^- = 0$. This means that $u - m_g \ge 0$ in $\Omega$, i.e., $u \ge m_g$ in $\Omega$.
Remark 5.2.3. Instead of $g \in C^0(\partial\Omega)$, one could make the weaker assumption that $g \in L^\infty(\partial\Omega)$. In that case, the theorem holds after replacing (5.2.2) by
Remark 5.2.4. By inspecting the proof, it is easily seen that the implications i) and ii) of the theorem hold if $m_g$ indicates any number $\le \min_{\partial\Omega} g$ and $M_g$ indicates any number $\ge \max_{\partial\Omega} g$.
In the examples below we shall show how Stampacchia's Theorem can be applied to some particular cases of elliptic problems.
Example 5.2.5. Let us consider the Dirichlet problem for the Laplace operator
$$-\Delta u = f \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega;$$
again a result already proved in the classical case (cf. Proposition 5.1.4).
Example 5.2.7. Let us consider the Dirichlet problem for the Helmholtz operator
$$-\Delta u + u = f \ \text{in } \Omega, \qquad u = g \ \text{on } \partial\Omega.$$
We have $a_0 = 1$. Assume at first that $f \ge 0$ in $\Omega$ and $m_g \le 0$: we conclude that $u \ge \min_{\partial\Omega} g$ in $\Omega$. On the other hand, let again $f \ge 0$ in $\Omega$ but now assume $m_g > 0$; then, the assumptions of Stampacchia's Theorem may not be satisfied and we can indeed have $u < \min_{\partial\Omega} g$ in $\Omega$, as shown in the forthcoming discussion of singular perturbation problems. However, in this case we can apply Remark 5.2.4 with $m_g = 0$ and get at least $u \ge 0$ in $\Omega$.
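A one-dimensional instance makes the second situation concrete. The example below is chosen here for illustration and is not taken from the notes: on $\Omega = (0,1)$ with $f = 0$ and $g = 1$ at both end points, the problem $-u'' + u = 0$ has the explicit solution $u(x) = \cosh(x - \tfrac12)/\cosh(\tfrac12)$, which dips below $\min g = 1$ in the interior while remaining positive.

```python
import math

def u(x):
    # Explicit solution of -u'' + u = 0 on (0, 1) with u(0) = u(1) = 1
    return math.cosh(x - 0.5) / math.cosh(0.5)

xs = [i / 100 for i in range(101)]
values = [u(x) for x in xs]

print("boundary values:", u(0.0), u(1.0))
print("interior minimum:", min(values))      # attained at x = 1/2

# Finite-difference check that the equation -u'' + u = 0 holds at x = 0.3
h = 1e-3
second_derivative = (u(0.3 + h) - 2.0 * u(0.3) + u(0.3 - h)) / h**2
print("residual at x = 0.3:", -second_derivative + u(0.3))
```

The interior minimum equals $1/\cosh(\tfrac12)$, roughly 0.887, so the bound $u \ge \min g$ fails while the weaker bound $u \ge 0$ given by Remark 5.2.4 holds.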
In conclusion, we have
$$u(x) = f + (g_0 - f)\,u_0(x) + (g_1 - f)\,u_1(x).$$
where > 0 is again a constant and a is a smooth vector field such that kak 1 in . The term
a u models the transport of the scalar quantity u (which may represent the temperature of a
fluid, or the concentration of a pollutant in a fluid) along the streamlines of the field a.
If is small, so that the term u may be neglected, the equation in reduces to a particular
case of the linear first order equation (1.3.1) considered in Sect. 1.3. We have seen that this
equation may be solved with the boundary condition assigned on the inflow boundary defined
as in (1.3.4). The solution u = u , far from the portion of the boundary 0 + , is close to
the solution u of the reduced first-order problem
a u = f in ,
u = g on .
(note that $x = 0$ is the inflow boundary point), getting $\tilde u(x) = g_0 + f x$. Next we make the change of variable $w = u - \tilde u$ and we observe that $w$ is the solution of
$$-\varepsilon w'' + w' = 0 \ \text{in } (0,1), \qquad w(0) = 0, \qquad w(1) = g_1 - g_0 - f.$$
By linearity we get
$$u(x) = g_0 + f x + (g_1 - g_0 - f)\,u_1(x),$$
where now $u_1$ is the solution of
$$-\varepsilon u_1'' + u_1' = 0 \ \text{in } (0,1), \qquad u_1(0) = 0, \qquad u_1(1) = 1.$$
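The boundary layer can be made visible by evaluating the solution explicitly. The sketch below assumes that the equation is $-\varepsilon u'' + u' = f$ with constant $f$ (the symbol $\varepsilon$ for the small diffusion coefficient is an assumption here); in that case $u_1(x) = (e^{x/\varepsilon} - 1)/(e^{1/\varepsilon} - 1)$, which is rewritten with decaying exponentials to avoid overflow.

```python
import math

def u1(x, eps):
    """Solution of -eps*u'' + u' = 0 on (0, 1) with u(0) = 0, u(1) = 1,
    written with decaying exponentials only (safe for small eps)."""
    return (math.exp((x - 1.0) / eps) * (1.0 - math.exp(-x / eps))
            / (1.0 - math.exp(-1.0 / eps)))

def u(x, eps, f, g0, g1):
    """u = g0 + f*x + (g1 - g0 - f) * u1, the representation obtained by linearity."""
    return g0 + f * x + (g1 - g0 - f) * u1(x, eps)

eps, f, g0, g1 = 0.01, 1.0, 0.0, 0.0
for x in (0.0, 0.5, 0.9, 0.99, 1.0):
    print(x, u(x, eps, f, g0, g1), g0 + f * x)   # full solution vs reduced solution
```

Away from the outflow point $x = 1$ the two columns agree to many digits; the difference is concentrated in a layer of width comparable to $\varepsilon$ near $x = 1$.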
6.1 Introduction
In this chapter, we introduce the concept of eigenvalue and eigenfunction of a uniformly elliptic,
self-adjoint operator with appropriate boundary conditions. They can be considered as the generalization
to an infinite dimensional Hilbert space $H$ of the concept of eigenvalue and eigenvector of
a symmetric positive-definite matrix in a Euclidean space Rn . The eigenfunctions of the operator
form an orthonormal basis in H, with respect to which the operator is diagonalized.
$$\frac{\partial^2 u}{\partial t^2} - \Delta u = f$$
in a bounded domain $\Omega \subset \mathbb{R}^n$, subject to homogeneous Dirichlet boundary conditions
$$u = 0.$$
In one dimension (n = 1), u can be interpreted as the (small) displacement of a guitar string in the
direction perpendicular to the plane of the guitar, under the effect of a density force; the boundary
condition forces the string to remain attached to the guitar. The two-dimensional analog is the
displacement of a drum membrane (or drum skin) in the direction perpendicular to the drum
surface. (For simplicity, we have set to 1 all physical constants.)
An important characteristic of the musical instrument is represented by the set of the modes
of free vibration of the guitar string or the drum skin. They correspond to those solutions of the
wave equation with no forcing term (f = 0), which are periodic in time. More precisely, they can
be defined as the solutions of the form
where $\omega \in \mathbb{R}$ is the frequency of pulsation, whereas $w$ determines the spatial shape of the pulsation. Differentiating and substituting into the wave equation, we get
$$e^{i\omega t}\big(-\omega^2 w(\mathbf{x}) - \Delta w(\mathbf{x})\big) = 0,$$
92 CHAPTER 6. SPECTRAL THEORY FOR ELLIPTIC SELF-ADJOINT PROBLEMS
$$-\Delta w = \omega^2 w \quad \text{in } \Omega;$$
on the other hand, the boundary condition for $u$ is equivalent to the boundary condition for $w$,
$$w = 0 \quad \text{on } \partial\Omega.$$
Using the functional machinery developed in Chap. 4, this problem can be equivalently written
in variational form as follows.
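Problem 6.1.2. Find all $\lambda \in \mathbb{R}$ and $w \in H^1_0(\Omega)$, $w \ne 0$, such that
$$\int_\Omega \nabla w\cdot\nabla v = \lambda \int_\Omega w\,v \qquad \forall v \in H^1_0(\Omega). \tag{6.1.2}$$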
The situation described so far is just an example, and it can be generalized in various ways,
e.g. by considering variable-coefficient elliptic operators or other types of boundary conditions. In
the next section, we will provide an abstract framework to study spectral problems formulated in
a variational form.
The solutions of this problem are referred to as the eigenvalues $\lambda$ and the eigenfunctions $w$ of the bilinear form $a$. Note that any eigenfunction is defined up to a multiplicative constant, i.e., if $w$ is an eigenfunction, then $\alpha w$ is also an eigenfunction, for any real $\alpha \ne 0$. Furthermore, given an eigenfunction $w$, the corresponding eigenvalue can be expressed as the Rayleigh quotient
$$\lambda = \frac{a(w, w)}{(w, w)}.$$
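As a concrete check of the Rayleigh quotient, one can take the one-dimensional model case. The sketch below assumes $a(w, v) = \int_0^L w'v'\,dx$ on $H^1_0(0, L)$, for which the functions $\sin(k\pi x/L)$ are eigenfunctions with eigenvalues $(k\pi/L)^2$; the function name and the midpoint quadrature are choices made here.

```python
import math

def rayleigh_quotient(w, dw, L, n=2000):
    """a(w, w) / (w, w) with a(w, v) = int_0^L w' v' dx, by the midpoint rule.

    w and dw are callables returning the function and its derivative.
    """
    h = L / n
    num = sum(dw((j + 0.5) * h) ** 2 for j in range(n)) * h
    den = sum(w((j + 0.5) * h) ** 2 for j in range(n)) * h
    return num / den

L, k = 2.0, 3
c = k * math.pi / L
print(rayleigh_quotient(lambda x: math.sin(c * x), lambda x: c * math.cos(c * x), L), c**2)

# A function vanishing at both ends that is not an eigenfunction: the quotient
# is 10 / L^2, slightly above the smallest eigenvalue (pi / L)^2.
print(rayleigh_quotient(lambda x: x * (L - x), lambda x: L - 2.0 * x, L), (math.pi / L) ** 2)
```

The second line illustrates that a generic admissible function gives a quotient no smaller than the first eigenvalue, which is the content of the sharp Poincaré constant discussed later in this chapter.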
Problem 6.2.1 can be solved by resorting to the spectral theory for a compact, self-adjoint and
positive operator in a Hilbert space. Let us recall the main result of this theory.
6.2. THE ABSTRACT VARIATIONAL EIGENVALUE PROBLEM 93
Theorem 6.2.2. Let $X$ be a separable Hilbert space, with inner product $(x, y)$. Let $T : X \to X$ be a linear operator, satisfying the following assumptions:
a) $T$ is compact, i.e., from every bounded sequence $\{x_n\}_{n\ge1}$, one can extract a subsequence $\{x_{n_j}\}_{j\ge1}$ such that $\{T x_{n_j}\}_{j\ge1}$ is convergent;
$$T x_k = \mu_k x_k, \qquad k = 1, 2, \dots \tag{6.2.2}$$
$$(x_k, x_\ell) = \delta_{k\ell}, \qquad k, \ell \ge 1. \tag{6.2.4}$$
iii) This system is indeed an orthonormal basis in $X$, i.e., every $x \in X$ can be uniquely represented as
$$x = \sum_{k=1}^{\infty} \xi_k x_k \quad \text{with } \xi_k = (x, x_k); \tag{6.2.5}$$
the coefficients $\xi_k$ are termed the (generalized) Fourier coefficients of $x$ with respect to the basis of eigen-elements of $T$. The convergence of the series has to be meant in the norm of $X$, i.e.,
$$\|x - x_N\| \to 0 \ \text{as } N \to \infty, \quad \text{where } x_N = \sum_{k=1}^{N} \xi_k x_k.$$
iv) The following representation of the norm in $X$, termed Parseval identity, holds:
$$\|x\|^2 = \sum_{k=1}^{\infty} |\xi_k|^2 \qquad \forall x \in X. \tag{6.2.6}$$
$$V(\mu) = \{z \in X : T z = \mu z\}$$
is a vector space, termed the eigenspace of $\mu$. As a consequence of (6.2.3), $V(\mu)$ has finite dimension, say $m \ge 1$ if we have for some $k$
Remark 6.2.4. Recall that for a linear operator, the property of compactness is stronger than the property of continuity, i.e., if $T$ is compact, then necessarily it is continuous. Indeed, by contradiction, if there were no constant $C > 0$ such that
$$\|T x\| \le C \|x\| \qquad \forall x \in X,$$
$$\|T x_n\| > n \|x_n\|;$$
Clearly, no converging subsequence can be extracted from $\{T y_n\}$, a contradiction with the property of compactness.
We are ready to introduce the operator $T : H \to H$ associated with our Problem 6.2.1.
Proposition 6.2.6. The operator T defined above satisfies assumptions a)-c) of Theorem 6.2.2.
Proof. Let us check compactness. Let $C > 0$ denote the continuity constant of the inclusion $V \subset H$, i.e., $\|v\|_H \le C\|v\|_V$ for all $v \in V$. Furthermore, let $F \in V'$ be the form associated with $f$, i.e., the form defined by $F(v) = (f, v)$ for all $v \in V$. Then, recalling (4.2.10) and the definition of dual norm (see Sect. 3.9), one has
$$\frac{1}{C}\|u\|_H \le \|u\|_V \le \frac{1}{\alpha}\|F\|_{V'} \le \frac{C}{\alpha}\|f\|_H.$$
This implies both $\|T f\|_H \le \frac{C^2}{\alpha}\|f\|_H$ (i.e., the continuity of $T$, which we have seen above to be a necessary condition for compactness), and
$$\|T f\|_V \le \frac{C}{\alpha}\|f\|_H.$$
Thus, if $\{f_n\}$ is a bounded sequence in $H$, then $\{T f_n\}$ is a bounded sequence in $V$ and since the inclusion $V \subset H$ is compact by assumption, we can extract a subsequence $\{T f_{n_j}\}$ converging in $H$. This proves that $T$ is compact.
In order to prove assumption b), consider any f and g in H, and set u = T f , w = T g. Using
the symmetry of the inner product in H and the definition of T , we have
The result just proven ensures that statements i)-iv) of Theorem 6.2.2 apply to our operator $T$. Let us rephrase them in the language of the variational setting.
We will denote by $w_k$, $k = 1, 2, \dots$, the eigenfunctions of $T$. Then, each $T w_k$ satisfies
$$a(T w_k, v) = (w_k, v) \qquad \forall v \in V;$$
Theorem 6.2.7. Under the assumptions on the Gelfand triple $(V, H, V')$ and the bilinear form $a(u, v)$ stated at the beginning of this section, the variational eigenvalue Problem 6.2.1 admits a sequence of real strictly positive eigenvalues $\lambda_k$ and corresponding eigenfunctions $w_k \in V$, $k = 1, 2, \dots$, with the following properties:
$$(w_k, w_\ell) = \delta_{k\ell}, \qquad k, \ell \ge 1. \tag{6.2.10}$$
iii) This system is indeed an orthonormal basis in $H$, i.e., every $v \in H$ can be uniquely represented as
$$v = \sum_{k=1}^{\infty} v_k w_k \quad \text{with } v_k = (v, w_k); \tag{6.2.11}$$
Corollary 6.2.8. The previous conclusions apply to the eigenvalue Problem 6.1.2 for the operator $-\Delta$ with Dirichlet boundary conditions in a bounded domain $\Omega$.
Proof. We know that $(H^1_0(\Omega), L^2(\Omega), H^{-1}(\Omega))$ is a Gelfand triple (see Sect. 3.9), and that the inclusion $H^1_0(\Omega) \subset L^2(\Omega)$ is compact if $\Omega$ is bounded (by Rellich Theorem 3.8.1).
We close this section by showing that the system of eigenfunctions $\{w_k\}_{k\ge1}$ provides a basis not only in $H$ but also in other functional spaces related to our problem, such as $V$ and $V'$; in addition, the norm of an element in such a space can be expressed in terms of a suitably weighted norm of the sequence of its (generalized) Fourier coefficients. Let us begin with the space $V$.
Setting $v = w_\ell$ in (6.2.8) and using (6.2.10), we obtain
$$a(w_k, w_\ell) = \lambda_k \delta_{k\ell}, \qquad k, \ell \ge 1. \tag{6.2.13}$$
This means that the eigenfunctions form an orthogonal system in $V$, for the inner product $(u, v)_a = a(u, v)$ associated with the bilinear form $a$ (which induces a norm, $\|v\|_a$, uniformly equivalent to the norm $\|v\|_V$, see Remark 4.2.7). The system is indeed a basis for $V$, as stated by the following result.
In this case, the series (6.2.11) converges also in $V$ and the following Parseval identity holds true:
$$a(v, v) = \|v\|_a^2 = \sum_{k=1}^{\infty} \lambda_k |v_k|^2. \tag{6.2.14}$$
Proof. For any fixed $N \ge 1$, let $V_N = \mathrm{span}\{w_k : 1 \le k \le N\}$ be the subspace of $V$ spanned by the first $N$ eigenfunctions.
Assume that $v \in V$. Define $v_N \in V_N$ as the truncation to $N$ terms of the series (6.2.11), i.e., $v_N = \sum_{k=1}^{N} v_k w_k$. It is easily seen, thanks to (6.2.13), that $v_N$ is the orthogonal projection of $v$ upon $V_N$ in the inner product $(u, v)_a$, i.e., it satisfies
$$(v - v_N, z)_a = 0 \qquad \forall z \in V_N.$$
which implies
$$\sum_{k=1}^{N} \lambda_k |v_k|^2 = \|v_N\|_a^2 \le \|v\|_a^2 < +\infty, \qquad \forall N \ge 1.$$
Thus, the series in the statement of the proposition is convergent, and the other properties follow immediately from this result.
Conversely, assume that $\sum_{k=1}^{\infty} \lambda_k |v_k|^2 < +\infty$. Then, setting $v_N = \sum_{k=1}^{N} v_k w_k$ for all $N \ge 1$, we have that the sequence $\{v_N\}$ is a Cauchy sequence in $V$, since
$$\|v_N - v_M\|_a^2 = \sum_{k=M+1}^{N} \lambda_k |v_k|^2 \to 0 \quad \text{if } N > M \to \infty.$$
Proposition 6.2.9 provides an example of the fundamental property that for an element $v \in H$ the condition of being more regular (in a suitable sense) is equivalent to a faster decay of its Fourier coefficients $v_k$ as $k \to \infty$, since the sequence $\{\lambda_k\}$ is diverging to $+\infty$ (recall (6.2.9)). Note in particular that from $v$ belonging to $H$ we can only infer that
$$v_k = o(1) \quad \text{as } k \to \infty,$$
since the series (6.2.12) is convergent; on the other hand, from $v$ belonging to $V$ we infer that
$$v_k = o\Big(\frac{1}{\sqrt{\lambda_k}}\Big) \quad \text{as } k \to \infty.$$
Another example of this property is as follows. Recall the definition (4.2.6) of the operator $A : V \to V'$ associated with the bilinear form $a$. Note that the relations (6.2.8) can be equivalently written as
$$A w_k = \lambda_k w_k, \qquad k = 1, 2, \dots.$$
Let us introduce the domain of the operator $A$ as the subspace of $V$
$$D(A) = \{v \in V : A v \in H\}. \tag{6.2.15}$$
One can prove (see the remark below) that if $v = \sum_{k=1}^{\infty} v_k w_k$ belongs to $D(A)$, then
$$A v = \sum_{k=1}^{\infty} v_k A w_k = \sum_{k=1}^{\infty} \lambda_k v_k w_k,$$
so that
$$\|A v\|_H^2 = \sum_{k=1}^{\infty} \lambda_k^2 |v_k|^2 < +\infty.$$
In this case,
$$v_k = o\Big(\frac{1}{\lambda_k}\Big) \quad \text{as } k \to \infty.$$
In the model situation of the Laplacian with Dirichlet boundary conditions, we have $H = L^2(\Omega)$, $V = H^1_0(\Omega)$ and $D(-\Delta) = H^1_0(\Omega) \cap H^2(\Omega)$ provided $\Omega$ has a $C^1$ boundary (Proposition 4.5.3) or is convex (Property 4.5.6). In one of these cases, the following equivalences hold:
$$v \in L^2(\Omega) \quad \text{if and only if} \quad \sum_{k=1}^{\infty} |v_k|^2 < +\infty, \tag{6.2.16}$$
$$v \in H^1_0(\Omega) \quad \text{if and only if} \quad \sum_{k=1}^{\infty} \lambda_k |v_k|^2 < +\infty, \tag{6.2.17}$$
$$v \in H^1_0(\Omega) \cap H^2(\Omega) \quad \text{if and only if} \quad \sum_{k=1}^{\infty} \lambda_k^2 |v_k|^2 < +\infty. \tag{6.2.18}$$
One can continue this sequence indefinitely, by considering domains of the successive powers $(-\Delta)^m$ of the Laplacian.
Remark 6.2.10. We can also expand in eigenfunction series any element $F$ of the dual space $V'$. Indeed, we define its (generalized) Fourier coefficients by setting
$$F_k = F(w_k), \qquad k = 1, 2, \dots;$$
then, for any $v = \sum_{k=1}^{\infty} v_k w_k \in V$, one has (at least formally)
$$F(v) = \sum_{k=1}^{\infty} v_k F(w_k) = \sum_{k=1}^{\infty} F_k v_k. \tag{6.2.19}$$
Since $F(v)$ is finite for all $v \in V$, one can prove using Proposition 6.2.9 that necessarily
$$\sum_{k=1}^{\infty} \frac{1}{\lambda_k} |F_k|^2 < +\infty,$$
and actually the square root of this expression is the dual norm of $F$, if $V$ is equipped with the norm $\|v\|_a$. In other words, one can prove that
$$F = \sum_{k=1}^{\infty} F_k w_k \in V' \quad \text{if and only if} \quad \sum_{k=1}^{\infty} \frac{1}{\lambda_k} |F_k|^2 < +\infty;$$
in this case, the series of $F$ is convergent in $V'$, and the action of $F$ on any $v \in V$ can be expressed as in (6.2.19).
This inequality is true if $C_P^2 \lambda_1 = 1$, since $\lambda_k/\lambda_1 \ge 1$ for any $k$ by (6.2.9). On the other hand, with such a choice the inequality becomes an equality for $v = w_1$, meaning precisely that
$$C_P = \frac{1}{\sqrt{\lambda_1}}$$
is the smallest admissible constant in (3.7.1).
Next, we explicitly compute the eigenvalues and eigenfunctions of Problem 6.1.2 in some relevant cases.
Following the same procedure illustrated above, one can easily compute (see Exercise 6.1) eigenvalues and eigenfunctions of other boundary-value problems for the second derivative operator, such as a mixed Dirichlet/Neumann boundary-value problem
$$\begin{cases} -w'' = \lambda w & \text{in } (0, L), \\ w(0) = 0, \\ w'(L) = 0, \end{cases} \qquad \text{or} \qquad \begin{cases} -w'' = \lambda w & \text{in } (0, L), \\ w'(0) = 0, \\ w(L) = 0. \end{cases} \tag{6.3.4}$$
Remark 6.3.2. Note that one cannot directly apply Theorem 6.2.7 to the variational formulation of the pure Neumann problem (6.3.5), i.e.,
$$\int_0^L w'v'\,dx = \lambda \int_0^L w\,v\,dx \qquad \forall v \in V = H^1(0, L), \tag{6.3.6}$$
since the bilinear form $a(w, v) = \int_0^L w'v'\,dx$ is not coercive in $V$: one has $a(w, w) = 0$ for all constant functions $w$. In other words, the problem admits the eigenvalue $\lambda = 0$ (with eigenfunction $w = \text{const}$), which is excluded by Theorem 6.2.7.
In order to apply the theorem, one uses the trick of adding the $L^2$-inner product on both sides of (6.3.6), i.e., one considers the modified problem
$$\int_0^L w'v'\,dx + \int_0^L w\,v\,dx = \mu \int_0^L w\,v\,dx \qquad \forall v \in V = H^1(0, L),$$
with $\mu = \lambda + 1$. Now the bilinear form on the left-hand side is precisely the inner product in $V$, so that this problem fulfils the assumptions of Theorem 6.2.7. The eigenfunctions $w_k$ of the two problems are the same, whereas the eigenvalues of (6.3.6) are given by $\lambda_k = \mu_k - 1$.
The same trick of shifting the eigenvalues applies whenever one has to solve a general eigenvalue Problem 6.2.1, in which the bilinear form $a(w, v)$ is not coercive in $V$, but is such that the shifted form $a(w, v) + \sigma(w, v)$ is indeed coercive for $\sigma > 0$ large enough.
i.e., a product of a function of the $x$-variable alone and a function of the $y$-variable alone. Substituting into the differential equation $-\Delta w = \lambda w$, we get
$$-\frac{\phi''(x)}{\phi(x)} - \frac{\psi''(y)}{\psi(y)} = \lambda. \tag{6.3.8}$$
This identity holds in $\Omega$, i.e., for all $x \in (0, L)$ and (independently) for all $y \in (0, L)$. Keeping $y$ fixed and varying $x$, we see that necessarily
$$-\frac{\phi''(x)}{\phi(x)} = \text{constant (say, } \mu\text{)},$$
6.3. CLASSICAL EXAMPLES. SEPARATION OF VARIABLES 101
$$\phi(0)\,\psi(y) = 0, \qquad 0 \le y \le L,$$
and since $\psi$ cannot be identically 0 (otherwise $w$ would be so), then necessarily $\phi(0) = 0$. Considering all other sides of the square, we find by similar arguments $\phi(L) = 0$, $\psi(0) = 0$ and $\psi(L) = 0$.
In conclusion, a function $w$ of the form (6.3.7) is an eigenfunction of our problem if and only if $\phi$ and $\psi$ satisfy
$$\begin{cases} -\phi'' = \mu\,\phi & \text{in } (0, L), \\ \phi(0) = \phi(L) = 0, \end{cases} \qquad \begin{cases} -\psi'' = \nu\,\psi & \text{in } (0, L), \\ \psi(0) = \psi(L) = 0, \end{cases}$$
i.e., if and only if both $\phi$ and $\psi$ are arbitrary eigenfunctions of the one-dimensional problem considered in the previous subsection. In this way, we find the eigenfunctions
$$w_{hk}(x, y) = \frac{2}{L} \sin\Big(h\frac{\pi}{L}x\Big) \sin\Big(k\frac{\pi}{L}y\Big), \qquad h, k = 1, 2, \dots, \tag{6.3.10}$$
with associated eigenvalues
$$\lambda_{hk} = \frac{\pi^2}{L^2}(h^2 + k^2), \qquad h, k = 1, 2, \dots, \tag{6.3.11}$$
A reasonable question at this point is whether there exist eigenfunctions other than those found so far. The answer is no, since one can prove that the set (6.3.10) is complete in $L^2(\Omega)$, so that any $w$ orthogonal to all $w_{hk}$ would necessarily be the zero function in $L^2(\Omega)$.
Note that the adopted labeling of eigenfunctions and eigenvalues, by two indices h and k,
differs from the single-index labeling used in Theorem 6.2.7. Yet, it is not difficult to see that the
set {(h, k) : h, k = 1, 2, . . . } can be numbered in such a way that the monotonicity conditions
(6.2.9) are fulfilled.
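One way to produce such a single-index numbering is simply to sort the pairs $(h, k)$ by the value of $h^2 + k^2$. The sketch below is illustrative; the function name `first_eigenvalues` and the way ties are broken (by $h$, then $k$) are choices made here, since the ordering of equal eigenvalues is arbitrary.

```python
import math

def first_eigenvalues(N, L=1.0):
    """First N eigenvalues (pi/L)^2 (h^2 + k^2), h, k >= 1, in nondecreasing
    order, each returned together with its index pair (h, k)."""
    # The N smallest values of h^2 + k^2 are attained with h, k <= N, because
    # the N pairs (h, 1), h = 1..N, already give values not exceeding N^2 + 1.
    pairs = [(h * h + k * k, h, k)
             for h in range(1, N + 1) for k in range(1, N + 1)]
    pairs.sort()                      # integer keys, so ties are detected exactly
    scale = (math.pi / L) ** 2
    return [(scale * s, h, k) for s, h, k in pairs[:N]]

for lam, h, k in first_eigenvalues(8, L=math.pi):
    print(h, k, lam)
```

With $L = \pi$ the printed eigenvalues are the integers $h^2 + k^2$; repeated values such as 5 (from $(1,2)$ and $(2,1)$) correspond to eigenspaces of dimension larger than one.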
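$$-\Theta''(\theta) = \mu\,\Theta(\theta) \quad \text{in } (0, 2\pi) \tag{6.3.14}$$
for some real constant $\mu$; on the other hand, $\Theta$ should be $2\pi$-periodic. Hence, all admissible $\Theta$ have the form
$$\Theta_m(\theta) = e^{im\theta} \quad \text{for arbitrary } m \in \mathbb{Z},$$
and the corresponding $\mu$ in (6.3.14) is given by
$$\mu_m = m^2.$$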
Figure 6.2: Plots of the Bessel functions of the first kind $J_0$, $J_1$ and $J_2$
where $\nu$ is an arbitrary real or complex number. The solutions of (6.3.15) with the boundary conditions described above can be related to certain solutions of the Bessel equation, precisely to the Bessel functions of the first kind $J_\nu(x)$, with $\nu$ a non-negative integer. The behavior of these functions for $\nu = 0, 1, 2$ is shown in Fig. 6.2 (taken from Wikipedia); it is apparent that near the origin they behave as we want $\rho_0$, $\rho_1$, $\rho_2$ to behave. Indeed, we have $J_\nu(0) = 0$ for all integer $\nu > 0$, whereas $J_0(0)$ is finite and non-zero.
Each function $J_\nu$ exhibits an oscillatory behavior around the horizontal axis as $x$ increases, with a monotonically increasing sequence of strictly positive, simple zeroes $x_{\nu,k}$, $k = 1, 2, \dots$. Thus, for any $m \in \mathbb{Z}$ and any integer $k \ge 1$, let us define
Then, using (6.3.16) with $x = x_{|m|,k}\,r$ and $\nu = |m|$, it is easily seen that $\rho_{m,k}$ is a solution of (6.3.15) if the constant $\lambda = \lambda_{m,k}$ is defined as
This means that we are able to expand the solution u in the eigenfunction series (6.2.11), where
the (generalized) Fourier coefficients of u are expressed in terms of those of the data f . The precise
result is as follows.
be the expansion of $f$ in the series of eigenfunctions of the bilinear form $a$. Then, the Fourier coefficients of the solution $u$ of Problem 6.4.1 are given by
$$u_k = \frac{1}{\lambda_k} f_k, \qquad k = 1, 2, \dots, \tag{6.4.2}$$
$$a(u, w_\ell) = a\Big(\sum_{k=1}^{\infty} u_k w_k, w_\ell\Big) = \sum_{k=1}^{\infty} u_k\,a(w_k, w_\ell) = (f, w_\ell),$$
where the second equality is justified by Proposition 6.2.9. Recalling (6.2.13), we obtain
$$\lambda_\ell u_\ell = f_\ell,$$
Assume that the right-hand side is the constant function $f = 1$, so that the solution is the parabola $u(x) = \frac{1}{2}x(L - x)$.
6.4. EXPANSION IN SERIES OF EIGENFUNCTIONS 105
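Let us first compute the Fourier coefficients of $f$ with respect to the eigenfunctions $w_k$ introduced in Sect. 6.3.1. Since
$$\int_0^L \sin\Big(k\frac{\pi}{L}x\Big)\,dx = \begin{cases} 0 & \text{for } k \text{ even}, \\[2pt] \dfrac{2L}{k\pi} & \text{for } k \text{ odd}, \end{cases} \tag{6.4.5}$$
we get
$$f_k = \begin{cases} 0 & \text{for } k \text{ even}, \\[2pt] \dfrac{2L}{k\pi}\sqrt{\dfrac{2}{L}} & \text{for } k \text{ odd}. \end{cases}$$
The Fourier coefficients of the solution are given by (6.4.2), i.e.,
$$u_k = \frac{1}{\lambda_k} f_k = \begin{cases} 0 & \text{for } k \text{ even}, \\[2pt] \dfrac{2L^3}{k^3\pi^3}\sqrt{\dfrac{2}{L}} & \text{for } k \text{ odd}, \end{cases}$$
so that the eigenfunction expansion of the solution is as follows:
$$u(x) = \sum_{k=1}^{\infty} u_k w_k(x) = \frac{4L^2}{\pi^3} \sum_{m=0}^{\infty} \frac{1}{(2m+1)^3} \sin\Big((2m+1)\frac{\pi}{L}x\Big).$$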
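The expansion can be compared numerically with the closed-form solution $u(x) = \tfrac12 x(L - x)$. The following sketch (function names chosen here for illustration) evaluates partial sums of the sine series and reports the maximum error on a uniform grid.

```python
import math

def u_exact(x, L):
    return 0.5 * x * (L - x)

def u_series(x, L, terms):
    """Partial sum of the sine series of u, keeping `terms` odd modes."""
    total = 0.0
    for m in range(terms):
        k = 2 * m + 1
        total += math.sin(k * math.pi * x / L) / k**3
    return 4.0 * L**2 / math.pi**3 * total

def max_error(L, terms, points=21):
    grid = [j * L / (points - 1) for j in range(points)]
    return max(abs(u_series(x, L, terms) - u_exact(x, L)) for x in grid)

L = 1.0
for terms in (1, 5, 50):
    print(terms, max_error(L, terms))
```

Because the coefficients decay like $k^{-3}$, a handful of modes already reproduces the parabola to several digits; this fast decay reflects the regularity of the solution, in line with the equivalences (6.2.16)-(6.2.18).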
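with $\Omega = (0, L)^2$. Assume as above that the right-hand side is the constant function $f = 1$; recall that this problem has been already considered in Sect. 4.5 while discussing the regularity of the solution of an elliptic problem in a domain with corners.
Let us first compute the Fourier coefficients of $f$ with respect to the eigenfunctions $w_{hk}$ introduced in Sect. 6.3.2. Recalling (6.4.5), we have
$$f_{hk} = \frac{2}{L} \int_0^L \sin\Big(h\frac{\pi}{L}x\Big)\,dx \int_0^L \sin\Big(k\frac{\pi}{L}y\Big)\,dy = \begin{cases} 0 & \text{for } h \text{ or } k \text{ even}, \\[2pt] \dfrac{8L}{\pi^2}\,\dfrac{1}{hk} & \text{for } h \text{ and } k \text{ odd}. \end{cases}$$
The Fourier coefficients of the solution are given by (6.4.2), i.e.,
$$u_{hk} = \frac{1}{\lambda_{hk}} f_{hk} = \begin{cases} 0 & \text{for } h \text{ or } k \text{ even}, \\[2pt] \dfrac{8L^3}{\pi^4}\,\dfrac{1}{hk(h^2 + k^2)} & \text{for } h \text{ and } k \text{ odd}, \end{cases}$$
so that the eigenfunction expansion of the solution is as follows:
$$u(x, y) = \sum_{h=1}^{\infty} \sum_{k=1}^{\infty} u_{hk} w_{hk}(x, y) = \frac{16L^2}{\pi^4} \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} \frac{\sin\big((2m+1)\frac{\pi}{L}x\big)\,\sin\big((2n+1)\frac{\pi}{L}y\big)}{(2m+1)(2n+1)\big((2m+1)^2 + (2n+1)^2\big)}.$$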
6.5 Exercises
6.1. Compute the eigenvalues and the eigenfunctions of the one-dimensional boundary-value problems (6.3.4) and (6.3.5).
Parabolic Problems
Parabolic problems describe propagation phenomena with infinite speed, the so-called diffusion
phenomena. Many problems in the applied sciences lead to this type of mathematical model: the
heat propagation through a rod or the motion of a viscous flow in a channel are just two important
examples, from Thermodynamics and Fluid Dynamics, in which parabolic equations describe the
time evolution of temperature and velocity, respectively.
Initial/boundary-value problems for a second-order linear parabolic operator can be studied in
a manner similar to what has been done in the previous chapters for elliptic problems. Precisely,
we first obtain a weak, or variational, formulation of the problem, by proceeding initially in a
formal manner and then making the functional assumptions fully rigorous. Next, we prove the
well-posedness of the weak formulation; in the present situation, the result will be a consequence
of an a-priori bound on any possible solution of the weak problem. At last, we interpret the weak
solution as a strong, or classical, one, provided suitable assumptions on the data are satisfied.
While the variational treatment of the spatial part of the operator follows the guidelines established in Chaps. 3 and 4, here the mathematical novelty is represented by the first-order time derivative. Its treatment will be based on a new result, which represents a generalization of the well-known integration-by-parts formula in one dimension.
Before starting, let us introduce some slightly new notation that is needed in order to take into account the time variable.
Let $\Omega$ be a bounded open set in $\mathbb{R}^n$ and suppose its boundary $\partial\Omega$ is smooth enough; furthermore, let $(0, T)$ be the time interval of interest. We introduce the cylinder in space and time
$$Q = \Omega \times (0, T) \subset \mathbb{R}^{n+1}.$$
Consider a generic real-valued function $v = v(\mathbf{x}, t)$ defined on $Q$; it is convenient to think of it as a function $v = v(t)$ defined in $\Omega$ for every fixed $t \in (0, T)$ (i.e., a function in $\Omega$ depending on $t$ as a parameter). In other words, for all $t \in (0, T)$, the function $v(t) : \Omega \to \mathbb{R}$ is defined as
$$(v(t))(\mathbf{x}) = v(\mathbf{x}, t) \qquad \forall \mathbf{x} \in \Omega.$$
108 CHAPTER 7. PARABOLIC PROBLEMS
it results
\[ \|v\|^2_{L^2(Q)} = \int_0^T \|v(t)\|^2_{L^2(\Omega)}\,dt . \]
Since the left-hand side is, by assumption, a finite number, it follows that we can think of $v$ as
a function $v : (0,T) \to L^2(\Omega)$ such that the further function $t \mapsto \|v(t)\|_{L^2(\Omega)}$ belongs to $L^2(0,T)$.
We write $v \in L^2(0,T;L^2(\Omega))$.
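The identity between the norm in $L^2(Q)$ and the time integral of the squared spatial norms can be checked numerically. The following is only a minimal sketch under stated assumptions: the domain $\Omega = (0,1)$, the final time $T = 1$, the sample function $v(x,t) = \sin(\pi x)\,e^{-t}$ and the helper names `slice_norm_sq` and `bochner_norm_sq` are illustrative choices, not taken from the text.

```python
import math

def slice_norm_sq(v, t, nx=400):
    """Midpoint-rule approximation of ||v(t)||^2 in L^2(0, 1) for a fixed t."""
    h = 1.0 / nx
    return sum(v((i + 0.5) * h, t) ** 2 for i in range(nx)) * h

def bochner_norm_sq(v, T=1.0, nt=400):
    """Midpoint-rule approximation of int_0^T ||v(t)||^2_{L^2(0,1)} dt."""
    k = T / nt
    return sum(slice_norm_sq(v, (j + 0.5) * k) for j in range(nt)) * k

def v(x, t):
    # Sample function (an assumption made for this illustration).
    return math.sin(math.pi * x) * math.exp(-t)

approx = bochner_norm_sq(v)
# Exact value: (int_0^1 sin^2(pi x) dx) * (int_0^1 exp(-2 t) dt).
exact = (1.0 - math.exp(-2.0)) / 4.0
print(approx, exact)
```

The computation mirrors the point of view adopted above: first the spatial norm of each slice $v(t)$ is evaluated, then the resulting scalar function of $t$ is integrated over $(0,T)$.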
Thus, we may regard functions in $L^2(Q)$ as functions of the variable $t$ taking values in the
space $L^2(\Omega)$, with the further property that the spatial norm has a certain degree of integrability
with respect to time. This can be generalized as follows.
$L^p(0,T;X)$
(with the obvious change when $p = \infty$). In addition, $L^p(0,T;X)$ is a Banach space if and only if
$X$ is so.
Hence, we can identify $L^2(Q)$ with the space $L^2(0,T;L^2(\Omega))$; in this and the next chapters, we
shall also use spaces like $L^2(0,T;H^1_0(\Omega))$ and $L^2(0,T;H^{-1}(\Omega))$. We also define $C^0([0,T];X)$ as
the space of all continuous functions $v : [0,T] \to X$, equipped with the norm
where $u = u(x,t)$ is the unknown temperature, at position $x$ and time $t$, of a conducting body
occupying the domain $\Omega$, while $f = f(x,t)$ and $u_0 = u_0(x)$ are two given functions representing the
heat exchange with the surrounding environment and the initial temperature, respectively. This
choice is motivated by the arguments of Sect. 1.4, where it has been shown that the heat operator
is the canonical form to which we can reduce any second-order parabolic operator. However, we
will also mention extensions of the subsequent results to more general situations.
7.1. VARIATIONAL FORMULATION 109
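As we know, the weakest way to give meaning to the heat equation is in the distributional
sense: we assume $u$ and $f$ to be locally integrable in $Q$ and we require $u$ to satisfy
\[ \frac{\partial u}{\partial t} - \Delta u = f \quad \text{in } \mathcal{D}'(Q) , \tag{7.1.2} \]
i.e.,
\[ -\int_Q u\,\frac{\partial \varphi}{\partial t}\,dx\,dt - \int_Q u\,\Delta\varphi\,dx\,dt = \int_Q f\,\varphi\,dx\,dt \qquad \forall \varphi \in \mathcal{D}(Q) . \tag{7.1.3} \]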
A more balanced formulation, suggested by the experience gained with elliptic problems, consists
of applying the Gauss theorem only once in the term involving the spatial operator, i.e., writing the
term
\[ -\int_Q u\,\Delta\varphi\,dx\,dt \quad \text{as} \quad \int_Q \nabla u \cdot \nabla\varphi\,dx\,dt . \]
Obviously, this requires more regularity on $u$: it is quite natural to assume $u \in L^2(0,T;H^1_0(\Omega))$,
since then also the Dirichlet boundary condition is rigorously defined a.e. (almost everywhere, i.e.,
outside a set of zero measure) in time. Furthermore, this assumption allows us to give a precise
meaning to the time derivative of $u$. Indeed, recalling the bound (3.9.11), one has
\[ \int_0^T \|\Delta u\|^2_{H^{-1}(\Omega)}\,dt \le C \int_0^T \|u\|^2_{H^1_0(\Omega)}\,dt < +\infty , \]
i.e.,
\[ u \in L^2(0,T;H^1_0(\Omega)) \;\Longrightarrow\; \Delta u \in L^2(0,T;H^{-1}(\Omega)) . \]
If we further assume that $f \in L^2(Q) = L^2(0,T;L^2(\Omega)) \subset L^2(0,T;H^{-1}(\Omega))$ (the latter inclusion
being a consequence of (3.9.1)), then we deduce from (7.1.2) that
\[ \frac{\partial u}{\partial t} = \Delta u + f \in L^2(0,T;H^{-1}(\Omega)) . \tag{7.1.4} \]
Thus, under the above assumption on $f$, we are led to look for a solution $u$ of problem (7.1.1)
in the space
\[ W(0,T;H^1_0(\Omega),H^{-1}(\Omega)) = \Big\{ w \in L^2(0,T;H^1_0(\Omega)) \; : \; \frac{\partial w}{\partial t} \in L^2(0,T;H^{-1}(\Omega)) \Big\} , \tag{7.1.5} \]
where the time derivative has to be understood, as usual, in the distributional sense. This is a
Hilbert space for the graph norm
\[ \|w\|_{W(0,T;H^1_0(\Omega),H^{-1}(\Omega))} = \Big( \|w\|^2_{L^2(0,T;H^1_0(\Omega))} + \Big\|\frac{\partial w}{\partial t}\Big\|^2_{L^2(0,T;H^{-1}(\Omega))} \Big)^{1/2} . \]
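In this case, all three addends of the heat equation are in $L^2(0,T;H^{-1}(\Omega))$, and the following
variational formulation of the equation can be given:
\[ \int_0^T {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial u}{\partial t} - \Delta u , v \Big\rangle_{H^1_0(\Omega)}\,dt = \int_0^T {}_{H^{-1}(\Omega)}\langle f , v \rangle_{H^1_0(\Omega)}\,dt \qquad \forall v \in L^2(0,T;H^1_0(\Omega)) . \tag{7.1.6} \]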
What about the initial condition? We have to give a correct meaning to it. This is precisely
what is provided by the following result, which in addition establishes a useful integration-by-parts
formula in time for functions in $W(0,T;H^1_0(\Omega),H^{-1}(\Omega))$.
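Proposition 7.1.1. Any function in $W(0,T;H^1_0(\Omega),H^{-1}(\Omega))$ is (up to a negligible set) continuous
from $[0,T]$ to $L^2(\Omega)$; in other words, the space $W(0,T;H^1_0(\Omega),H^{-1}(\Omega))$ is contained in
$C^0([0,T];L^2(\Omega))$ with continuous inclusion.
Furthermore, for any $w, v \in W(0,T;H^1_0(\Omega),H^{-1}(\Omega))$, the following identity holds:
\[ \frac{d}{dt}(w,v)_{L^2(\Omega)} = {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial w}{\partial t} , v \Big\rangle_{H^1_0(\Omega)} + {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial v}{\partial t} , w \Big\rangle_{H^1_0(\Omega)} \quad \text{in } \mathcal{D}'(0,T) ; \tag{7.1.7} \]
equivalently, for any $0 \le t_1 < t_2 \le T$, one has
\[ \int_{t_1}^{t_2} {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial w}{\partial t} , v \Big\rangle_{H^1_0(\Omega)}\,dt + \int_{t_1}^{t_2} {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial v}{\partial t} , w \Big\rangle_{H^1_0(\Omega)}\,dt = (w(t_2),v(t_2))_{L^2(\Omega)} - (w(t_1),v(t_1))_{L^2(\Omega)} . \tag{7.1.8} \]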
Proof. (We just provide the essential steps.) First, one proves that the set of all smooth enough
functions, e.g. those in $C^1(\overline{Q})$ vanishing on $\partial\Omega \times [0,T]$, is dense in $W(0,T;H^1_0(\Omega),H^{-1}(\Omega))$; we
skip the technical details.
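Next, we prove (7.1.7). Let $\{v_n\}_{n \ge 0}$ be any sequence of smooth functions converging to $v$ in
$W(0,T;H^1_0(\Omega),H^{-1}(\Omega))$. Then, for all $\varphi \in \mathcal{D}(0,T)$, one has
\[ {}_{\mathcal{D}'(0,T)}\Big\langle \frac{d}{dt}(w,v)_{L^2(\Omega)} , \varphi \Big\rangle_{\mathcal{D}(0,T)} = -\int_0^T (w(t),v(t))_{L^2(\Omega)}\,\varphi'(t)\,dt = -\lim_{n\to\infty} \int_0^T (w(t),v_n(t))_{L^2(\Omega)}\,\varphi'(t)\,dt . \]
Now,
\[ -\int_0^T (w(t),v_n(t))_{L^2(\Omega)}\,\varphi'(t)\,dt = -\int_0^T\!\!\int_\Omega w(x,t)\,v_n(x,t)\,\frac{d\varphi}{dt}(t)\,dx\,dt \]
\[ = -\int_0^T\!\!\int_\Omega w(x,t)\,\frac{\partial}{\partial t}\big[v_n(x,t)\varphi(t)\big]\,dx\,dt + \int_0^T\!\!\int_\Omega w(x,t)\,\varphi(t)\,\frac{\partial v_n}{\partial t}(x,t)\,dx\,dt . \]
By definition of $\frac{\partial w}{\partial t} \in L^2(0,T;H^{-1}(\Omega))$, observing that $v_n\varphi \in \mathcal{D}(Q)$, we can write
\[ -\int_0^T\!\!\int_\Omega w(x,t)\,\frac{\partial}{\partial t}\big[v_n(x,t)\varphi(t)\big]\,dx\,dt = \int_0^T {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial w}{\partial t} , v_n \Big\rangle_{H^1_0(\Omega)}\,\varphi(t)\,dt . \]
Now, we recall that, since $(H^1_0(\Omega), L^2(\Omega), H^{-1}(\Omega))$ form a Gelfand triple (see Sect. 3.9), the
$L^2(\Omega)$-inner product of a function $g \in L^2(\Omega)$ with a function $z \in H^1_0(\Omega)$ can be equivalently viewed
as the duality pairing between $H^{-1}(\Omega)$ and $H^1_0(\Omega)$, i.e.,
\[ (g,z)_{L^2(\Omega)} = {}_{H^{-1}(\Omega)}\langle g , z \rangle_{H^1_0(\Omega)} . \tag{7.1.9} \]
Therefore,
\[ \int_0^T\!\!\int_\Omega w(x,t)\,\varphi(t)\,\frac{\partial v_n}{\partial t}(x,t)\,dx\,dt = \int_0^T {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial v_n}{\partial t} , w \Big\rangle_{H^1_0(\Omega)}\,\varphi(t)\,dt . \]
Summarizing,
\[ -\int_0^T (w(t),v_n(t))_{L^2(\Omega)}\,\varphi'(t)\,dt = \int_0^T {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial w}{\partial t} , v_n \Big\rangle_{H^1_0(\Omega)}\,\varphi(t)\,dt + \int_0^T {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial v_n}{\partial t} , w \Big\rangle_{H^1_0(\Omega)}\,\varphi(t)\,dt . \]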
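Letting $n \to \infty$, we get
\[ {}_{\mathcal{D}'(0,T)}\Big\langle \frac{d}{dt}(w,v)_{L^2(\Omega)} , \varphi \Big\rangle_{\mathcal{D}(0,T)} = \int_0^T {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial w}{\partial t} , v \Big\rangle_{H^1_0(\Omega)}\,\varphi\,dt + \int_0^T {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial v}{\partial t} , w \Big\rangle_{H^1_0(\Omega)}\,\varphi\,dt \]
\[ = \int_0^T \Big[ {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial w}{\partial t} , v \Big\rangle_{H^1_0(\Omega)} + {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial v}{\partial t} , w \Big\rangle_{H^1_0(\Omega)} \Big]\,\varphi\,dt \]
\[ = {}_{\mathcal{D}'(0,T)}\Big\langle {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial w}{\partial t} , v \Big\rangle_{H^1_0(\Omega)} + {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial v}{\partial t} , w \Big\rangle_{H^1_0(\Omega)} , \varphi \Big\rangle_{\mathcal{D}(0,T)} . \]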
Thus, (7.1.7) is proven.
This relation indicates that the $L^1(0,T)$-function $t \mapsto (w(t),v(t))_{L^2(\Omega)}$ has first derivative, in
the distributional sense, which itself belongs to $L^1(0,T)$. From such a result one can easily deduce
(as in the proof of Property 3.2.7) that
\[ \text{the function } t \mapsto (w(t),v(t))_{L^2(\Omega)} \text{ is continuous (indeed, absolutely continuous) on } [0,T] . \tag{7.1.10} \]
In order to prove the continuity in $L^2(\Omega)$ of $w(t)$ at any point $t_0 \in [0,T]$, we use (7.1.10) once
with $v = w$ and then with the constant function $v = w(t_0)$. Using the identity
\[ \|w(t) - w(t_0)\|^2_{L^2(\Omega)} = (w(t),w(t))_{L^2(\Omega)} - 2\,(w(t),w(t_0))_{L^2(\Omega)} + (w(t_0),w(t_0))_{L^2(\Omega)} \]
Proposition 7.1.1 suggests the choice of the initial data $u_0$ in $L^2(\Omega)$, since $u(0)$ is well-defined
as a function in this space, if $u \in W(0,T;H^1_0(\Omega),H^{-1}(\Omega))$; thus, condition $u(0) = u_0$ has to be
understood as an equality between functions in $L^2(\Omega)$.
Remark 7.1.2. Proposition 7.1.1 is just a particular case of a more general result, due to
J.-L. Lions, which concerns Gelfand triples $(V, H, V')$ (where $V \subset H$ with continuous and dense
inclusion). Precisely, introducing the space
\[ W(0,T;V,V') = \Big\{ w \in L^2(0,T;V) \; : \; \frac{\partial w}{\partial t} \in L^2(0,T;V') \Big\} , \tag{7.1.11} \]
one has:
Proposition 7.1.3. The space $W(0,T;V,V')$ is contained in $C^0([0,T];H)$ with continuous
inclusion. Furthermore, for any $w, v \in W(0,T;V,V')$ and any $0 \le t_1 < t_2 \le T$, one has
\[ \int_{t_1}^{t_2} {}_{V'}\Big\langle \frac{\partial w}{\partial t} , v \Big\rangle_{V}\,dt + \int_{t_1}^{t_2} {}_{V'}\Big\langle \frac{\partial v}{\partial t} , w \Big\rangle_{V}\,dt = (w(t_2),v(t_2))_H - (w(t_1),v(t_1))_H . \tag{7.1.12} \]
Note that if one takes $H = V$ (hence, $V = H = V'$) the result tells us that if $w \in L^2(0,T;V)$
and $\frac{\partial w}{\partial t} \in L^2(0,T;V)$, then $w \in C^0([0,T];V)$.
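We are ready to state a variational formulation of Problem (7.1.1). For simplicity, in the
sequel we will set
\[ a(u,v) = (u,v)_{H^1_0(\Omega)} = \int_\Omega \nabla u \cdot \nabla v\,dx , \qquad (u,v) = (u,v)_{L^2(\Omega)} = \int_\Omega u\,v\,dx . \]
Problem 7.1.4. Given $f \in L^2(Q)$ and $u_0 \in L^2(\Omega)$, find $u \in W(0,T;H^1_0(\Omega),H^{-1}(\Omega))$ satisfying
$u(0) = u_0$ and such that
\[ \int_0^T {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial u}{\partial t} , v \Big\rangle_{H^1_0(\Omega)}\,dt + \int_0^T a(u,v)\,dt = \int_0^T (f,v)\,dt \qquad \forall v \in L^2(0,T;H^1_0(\Omega)) . \tag{7.1.13} \]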
Two alternative but entirely equivalent expressions of (7.1.13) can be given. They are:
\[ {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial u}{\partial t}(t) , v \Big\rangle_{H^1_0(\Omega)} + a(u(t),v) = (f(t),v) \qquad \forall v \in H^1_0(\Omega), \ \text{a.e. in } (0,T) \tag{7.1.14} \]
and
\[ \frac{d}{dt}(u(t),v) + a(u(t),v) = (f(t),v) \qquad \forall v \in H^1_0(\Omega), \ \text{a.e. in } (0,T) . \tag{7.1.15} \]
Expression (7.1.15) is particularly important both for proving the existence of a solution (as we
shall see later) and for the numerical approximation of the problem. The forms (7.1.13) and (7.1.14)
are equivalent, thanks to the fact that the set of all piecewise constant functions $v : [0,T] \to H^1_0(\Omega)$
is dense in $L^2(0,T;H^1_0(\Omega))$. On the other hand, (7.1.14) and (7.1.15) are equivalent, since
\[ \frac{d}{dt}(u(t),v) = {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial u}{\partial t}(t) , v \Big\rangle_{H^1_0(\Omega)} \quad \text{in } \mathcal{D}'(0,T) , \]
thanks to (7.1.7) with $v \in L^2(0,T;H^1_0(\Omega))$ constant in time.
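To illustrate how a formulation of this type is turned into a computation, here is a minimal sketch of one possible discretization in one space dimension. It is an assumption of this example, not a method described in the text: the spatial operator is replaced by centered finite differences on $\Omega = (0,1)$ and the time derivative by the backward Euler difference quotient. The names `thomas` and `backward_euler_heat`, the grid sizes and the test data $u_0(x) = \sin(\pi x)$, $f = 0$ are illustrative.

```python
import math

def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal linear system (lower and upper have length n - 1)."""
    n = len(diag)
    d, r = diag[:], rhs[:]
    for i in range(1, n):
        m = lower[i - 1] / d[i - 1]
        d[i] -= m * upper[i - 1]
        r[i] -= m * r[i - 1]
    x = [0.0] * n
    x[-1] = r[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (r[i] - upper[i] * x[i + 1]) / d[i]
    return x

def backward_euler_heat(u0, f, T, nx=50, nt=200):
    """March u_t - u_xx = f on (0, 1) with u = 0 at x = 0 and x = 1."""
    h, k = 1.0 / nx, T / nt
    xs = [(i + 1) * h for i in range(nx - 1)]  # interior nodes only
    u = [u0(x) for x in xs]
    r = k / (h * h)
    lower = [-r] * (nx - 2)
    upper = [-r] * (nx - 2)
    diag = [1.0 + 2.0 * r] * (nx - 1)
    for n in range(1, nt + 1):
        # Implicit step: (I + k A) u^n = u^(n-1) + k f(t_n).
        rhs = [u[i] + k * f(xs[i], n * k) for i in range(nx - 1)]
        u = thomas(lower, diag, upper, rhs)
    return xs, u

xs, uh = backward_euler_heat(lambda x: math.sin(math.pi * x), lambda x, t: 0.0, T=0.1)
exact = [math.exp(-math.pi ** 2 * 0.1) * math.sin(math.pi * x) for x in xs]
err = max(abs(a - b) for a, b in zip(uh, exact))
print(err)
```

For these data the exact solution is $e^{-\pi^2 t}\sin(\pi x)$, so the printed value measures the discretization error at $t = 0.1$.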
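The first integral can be expressed via Proposition 7.1.1 as follows. Set $t_1 = 0$, $t_2 = \tau$ and
$w = v = u$ in (7.1.8); this easily yields
\[ \int_0^\tau {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial u}{\partial t} , u \Big\rangle_{H^1_0(\Omega)}\,dt = \frac{1}{2}\big[(u(\tau),u(\tau)) - (u(0),u(0))\big] = \frac{1}{2}\|u(\tau)\|^2_{L^2(\Omega)} - \frac{1}{2}\|u_0\|^2_{L^2(\Omega)} . \]
On the other hand,
\[ \int_0^\tau a(u,u)\,dt = \int_0^\tau \|u\|^2_{H^1_0(\Omega)}\,dt . \]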
7.2. AN A PRIORI ESTIMATE 113
\[ (f,u) \le \frac{1}{2}\,C_P^2(\Omega)\,\|f\|^2_{L^2(\Omega)} + \frac{1}{2}\,\|u\|^2_{H^1_0(\Omega)} ; \]
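Now, if we neglect the second term on the left-hand side and use the trivial bound
\[ \int_0^\tau \|f\|^2_{L^2(\Omega)}\,dt \le \int_0^T \|f\|^2_{L^2(\Omega)}\,dt , \]
we get
\[ \|u(\tau)\|^2_{L^2(\Omega)} \le \|u_0\|^2_{L^2(\Omega)} + C_P^2(\Omega) \int_0^T \|f\|^2_{L^2(\Omega)}\,dt \qquad \text{for all } \tau \in [0,T] , \]
i.e.,
\[ \max_{\tau \in [0,T]} \|u(\tau)\|^2_{L^2(\Omega)} \le \|u_0\|^2_{L^2(\Omega)} + C_P^2(\Omega) \int_0^T \|f\|^2_{L^2(\Omega)}\,dt . \]
On the other hand, if we neglect the first term on the left-hand side of (7.2.2) and we choose
$\tau = T$, we have
\[ \int_0^T \|u\|^2_{H^1_0(\Omega)}\,dt \le \|u_0\|^2_{L^2(\Omega)} + C_P^2(\Omega) \int_0^T \|f\|^2_{L^2(\Omega)}\,dt . \]
Taking the square roots of both sides in the last two inequalities, and using the relation
$\sqrt{\alpha^2 + \beta^2} \le \alpha + \beta$ for all $\alpha, \beta \ge 0$, we end up with the following result.
Proposition 7.2.1. Any solution $u$ of Problem 7.1.4 satisfies the estimate
\[ \|u\|_{C^0([0,T];L^2(\Omega))} + \|u\|_{L^2(0,T;H^1_0(\Omega))} \le C \big( \|u_0\|_{L^2(\Omega)} + \|f\|_{L^2(0,T;L^2(\Omega))} \big) , \tag{7.2.3} \]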
(i) Suppose that two sets of data $\{f_1, u_{0,1}\}$ and $\{f_2, u_{0,2}\}$ are given and denote by $u_1$ and $u_2$
the solutions of the corresponding Problems 7.1.4. Due to the linearity of the equations,
$u_1 - u_2$ is the solution of the problem whose data are $f_1 - f_2$ and $u_{0,1} - u_{0,2}$, so that from
(7.2.3) we have
\[ \|u_1 - u_2\|_{C^0([0,T];L^2(\Omega))} + \|u_1 - u_2\|_{L^2(0,T;H^1_0(\Omega))} \le C \big( \|u_{0,1} - u_{0,2}\|_{L^2(\Omega)} + \|f_1 - f_2\|_{L^2(0,T;L^2(\Omega))} \big) , \tag{7.3.1} \]
which shows that as the data set $\{f_1, u_{0,1}\}$ approaches $\{f_2, u_{0,2}\}$, the first solution $u_1$ also
approaches the second solution $u_2$.
This means that the solution of the problem, if it exists, depends continuously on the data.
(ii) Consider a set of data $\{f, u_0\}$ and suppose that two solutions $u_1$ and $u_2$ arise; then we must
have
\[ \|u_1 - u_2\|_{C^0([0,T];L^2(\Omega))} + \|u_1 - u_2\|_{L^2(0,T;H^1_0(\Omega))} \le 0 \]
and thus $u_1 = u_2$. This means that the solution of the problem, if it exists, is unique.
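substituting into equation (7.1.15) and choosing any eigenfunction $w_m$ as a test function, gives
\[ \frac{d}{dt}\Big( \sum_{n=1}^{+\infty} u_n(t) w_n , w_m \Big) + a\Big( \sum_{n=1}^{+\infty} u_n(t) w_n , w_m \Big) = \Big( \sum_{n=1}^{+\infty} f_n(t) w_n , w_m \Big) , \qquad m = 1, 2, \ldots , \]
that is
\[ \sum_{n=1}^{+\infty} u_n'(t)\,(w_n,w_m) + \sum_{n=1}^{+\infty} u_n(t)\,a(w_n,w_m) = \sum_{n=1}^{+\infty} f_n(t)\,(w_n,w_m) , \qquad m = 1, 2, \ldots ; \]
recalling that
\[ (w_n,w_m) = \delta_{n,m} \qquad \text{and} \qquad a(w_n,w_m) = \lambda_n\,\delta_{n,m} , \]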
7.3. WELL-POSEDNESS OF THE PROBLEM 115
we obtain
\[ u_m'(t) + \lambda_m u_m(t) = f_m(t) , \qquad m = 1, 2, \ldots \tag{7.3.3} \]
which shows that every generalized Fourier coefficient $u_m(t)$ of the expansion of $u$ satisfies a
first-order linear ordinary differential equation.
Moreover, since $u(0) = u_0$, the following relationship must hold:
\[ \sum_{n=1}^{+\infty} u_n(0) w_n = \sum_{n=1}^{+\infty} u_{0,n} w_n , \]
thus
\[ u_m(0) = u_{0,m} , \qquad m = 1, 2, \ldots . \tag{7.3.4} \]
Finally, from (7.3.3) and (7.3.4) we have for every $m \ge 1$ the Cauchy problem
\[ \begin{cases} u_m'(t) + \lambda_m u_m(t) = f_m(t) , \\ u_m(0) = u_{0,m} , \end{cases} \tag{7.3.5} \]
whose solution is
\[ u_m(t) = e^{-\lambda_m t}\,u_{0,m} + \int_0^t e^{\lambda_m(\tau - t)} f_m(\tau)\,d\tau . \tag{7.3.6} \]
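Formula (7.3.6) can be evaluated directly once the coefficients of the data are known. The sketch below is only an illustration under stated assumptions: the integral is approximated by the composite Simpson rule, the eigenvalue $\lambda = \pi^2$ is the first Dirichlet eigenvalue of $-d^2/dx^2$ on the hypothetical domain $(0,1)$, and the function name `duhamel_coefficient` is introduced here for convenience. For a forcing coefficient that is constant in time the integral has a closed form, which is used as a check.

```python
import math

def duhamel_coefficient(lam, u0m, fm, t, n=200):
    """Evaluate u_m(t) from (7.3.6); the integral uses composite Simpson (n even)."""
    if t == 0.0:
        return u0m
    h = t / n

    def g(tau):
        # Integrand exp(lam * (tau - t)) * f_m(tau).
        return math.exp(lam * (tau - t)) * fm(tau)

    s = g(0.0) + g(t)
    s += 4.0 * sum(g((2 * j + 1) * h) for j in range(n // 2))
    s += 2.0 * sum(g(2 * j * h) for j in range(1, n // 2))
    return math.exp(-lam * t) * u0m + s * h / 3.0

lam = math.pi ** 2  # assumed eigenvalue (first Dirichlet mode on (0, 1))
t = 0.3
numeric = duhamel_coefficient(lam, 1.0, lambda tau: 2.0, t)
# Closed form for constant forcing c: exp(-lam t) u0 + c (1 - exp(-lam t)) / lam.
closed = math.exp(-lam * t) + 2.0 * (1.0 - math.exp(-lam * t)) / lam
print(numeric, closed)
```

As $t$ grows, the contribution of the initial datum decays like $e^{-\lambda_m t}$ and the coefficient approaches the stationary value $c/\lambda_m$.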
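It remains now to be checked that the series (7.3.2) converges to a function $u$ which solves
Problem 7.1.4. To do this, let us set
\[ u_N(t) := \sum_{n=1}^{N} u_n(t) w_n , \qquad f_N(t) := \sum_{n=1}^{N} f_n(t) w_n , \qquad u_{0,N} := \sum_{n=1}^{N} u_{0,n} w_n ; \]
it is easy to verify that $u_N$ satisfies the variational formulation of the problem
\[ \begin{cases} \dfrac{\partial u_N}{\partial t} - \Delta u_N = f_N & \text{in } Q , \\ u_N = 0 & \text{on } \partial\Omega \times (0,T) , \\ u_N = u_{0,N} & \text{on } \Omega \times \{0\} ; \end{cases} \]
then, if we consider a further set of data $\{f_M, u_{0,M}\}$ with the corresponding solution $u_M$ we have,
from the upper bound (7.2.3),
\[ \|u_N - u_M\|_{C^0([0,T];L^2(\Omega))} + \|u_N - u_M\|_{L^2(0,T;H^1_0(\Omega))} \le C \big( \|u_{0,N} - u_{0,M}\|_{L^2(\Omega)} + \|f_N - f_M\|_{L^2(0,T;L^2(\Omega))} \big) . \tag{7.3.7} \]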
Since $\{f_N\}_{N \ge 1}$ and $\{u_{0,N}\}_{N \ge 1}$ converge in their spaces (to $f$ and $u_0$, respectively), they are
Cauchy sequences; from (7.3.7) it follows that so is $\{u_N\}_{N \ge 1}$ both in $C^0([0,T];L^2(\Omega))$ and in
$L^2(0,T;H^1_0(\Omega))$. Using the completeness of these spaces allows us to conclude that when $N \to \infty$
the sequence $u_N$ converges to a function $u$ belonging to $L^2(0,T;H^1_0(\Omega)) \cap C^0([0,T];L^2(\Omega))$.
Passing to the limit in the distributional equations satisfied by $u_N$, i.e.,
\[ -\int_Q u_N\,\frac{\partial \varphi}{\partial t}\,dx\,dt - \int_Q u_N\,\Delta\varphi\,dx\,dt = \int_Q f_N\,\varphi\,dx\,dt \qquad \forall \varphi \in \mathcal{D}(Q) , \]
we obtain (7.1.2), from which one can easily deduce that $u$ solves Problem 7.1.4.
Summarizing, we have proven the following fundamental result.
Theorem 7.3.1. Problem 7.1.4 admits one and only one solution, for which the bound (7.2.3)
holds. Furthermore, the solution depends continuously on the data, as expressed by the bound
(7.3.1).
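\[ \hat{u}(\xi,t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} u(x,t)\,e^{-i\xi x}\,dx , \]
\[ \widehat{\frac{\partial^2 u}{\partial x^2}}(\xi,t) = -\xi^2\,\hat{u}(\xi,t) , \]
\[ \widehat{\frac{\partial u}{\partial t}}(\xi,t) = \frac{\partial \hat{u}}{\partial t}(\xi,t) , \]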
7.4. SOME FACTS ABOUT THE REGULARITY OF THE SOLUTION 117
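and then
\[ \begin{cases} \dfrac{\partial \hat{u}}{\partial t} + \xi^2\,\hat{u} = 0 , & \xi \in \mathbb{R}, \ t > 0 , \\ \hat{u}(\xi,0) = \hat{u}_0(\xi) . \end{cases} \tag{7.4.2} \]
Note that the condition $u(x,t) \to 0$ is implicit in the fact that $u(t)$ admits a Fourier transform,
i.e., $u(t) \in L^2(\mathbb{R})$, and so it need not be imposed separately in Problem (7.4.2).
Solving this Cauchy problem with respect to $t$ and regarding $\xi$ as a parameter yields
\[ \hat{u}(\xi,t) = \hat{u}_0(\xi)\,e^{-\xi^2 t} ; \]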
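\[ u(x,t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} \hat{u}(\xi,t)\,e^{i\xi x}\,d\xi = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} \hat{u}_0(\xi)\,e^{-\xi^2 t + i\xi x}\,d\xi \]
\[ = \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{-\xi^2 t + i\xi x} \int_{-\infty}^{+\infty} u_0(y)\,e^{-i\xi y}\,dy\,d\xi \]
\[ = \int_{-\infty}^{+\infty} \Big[ \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{-\xi^2 t + i\xi(x-y)}\,d\xi \Big] u_0(y)\,dy . \]
If we set
\[ K(x-y,t) := \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{-\xi^2 t + i\xi(x-y)}\,d\xi , \tag{7.4.3} \]
we can write the solution as
\[ u(x,t) = \int_{-\infty}^{+\infty} K(x-y,t)\,u_0(y)\,dy = (K * u_0)(x,t) , \]
where the convolution is intended with respect to the space variable only.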
The function $K$ is said to be the kernel of the heat equation; it is the solution of the parabolic
problem
\[ \begin{cases} \dfrac{\partial K}{\partial t} - \dfrac{\partial^2 K}{\partial x^2} = 0 & \text{in } \mathbb{R} \times (0,+\infty) , \\ K \to 0 & \text{for } |x| \to \infty , \\ K = \delta_0 & \text{for } t = 0 , \end{cases} \]
in the sense of distributions (for this reason it is sometimes called a fundamental solution of the
heat equation) and it can be expressed in a closed form, since from equation (7.4.3) it follows
\[ K(z,t) = \frac{e^{-\frac{z^2}{4t}}}{\sqrt{4\pi t}} , \]
so that we also have
\[ u(x,t) = \int_{-\infty}^{+\infty} \frac{e^{-\frac{(x-y)^2}{4t}}}{\sqrt{4\pi t}}\,u_0(y)\,dy . \]
From here we see that $u$ is infinitely differentiable with respect to $x$ even when $u_0$ is not, since
the kernel $K$ lies in $C^\infty$ for every $t > 0$.
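The smoothing effect of the convolution with $K$ can be observed on a discontinuous initial datum. The sketch below is an illustration under stated assumptions: the initial datum is the unit step $u_0(y) = 1$ for $y > 0$ and $0$ otherwise, for which the convolution equals $\frac{1}{2}\big(1 + \mathrm{erf}(x / (2\sqrt{t}))\big)$; the integral is truncated to a window of half-width $12\sqrt{t}$ around $x$ and approximated by the midpoint rule. The names `heat_kernel` and `solve_by_convolution` are introduced only for this example.

```python
import math

def heat_kernel(z, t):
    """Closed form K(z, t) = exp(-z^2 / (4 t)) / sqrt(4 pi t), for t > 0."""
    return math.exp(-z * z / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def solve_by_convolution(u0, x, t, half_width=12.0, n=4000):
    """Midpoint approximation of int K(x - y, t) u0(y) dy on a truncated window."""
    r = half_width * math.sqrt(t)
    h = 2.0 * r / n
    total = 0.0
    for i in range(n):
        y = x - r + (i + 0.5) * h
        total += heat_kernel(x - y, t) * u0(y)
    return total * h

def step(y):
    # Discontinuous initial datum (an assumption made for this illustration).
    return 1.0 if y > 0.0 else 0.0

x, t = 0.2, 0.05
numeric = solve_by_convolution(step, x, t)
exact = 0.5 * (1.0 + math.erf(x / (2.0 * math.sqrt(t))))
mass = solve_by_convolution(lambda y: 1.0, x, t)  # total mass of the kernel
print(numeric, exact, mass)
```

The last quantity approximates the integral of $K(\cdot,t)$ over the real line, which equals one for every $t > 0$.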
An alternative fashion for studying the regularity of the solution of a parabolic problem consists
of the following procedure:
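\[ w = 0 \ \text{ on } \partial\Omega \times (0,T) , \qquad w = f(0) + \Delta u_0 \ \text{ on } \Omega \times \{0\} , \]
admits a unique solution $w \in L^2(0,T;H^1_0(\Omega))$ with $\frac{\partial w}{\partial t} \in L^2(0,T;H^{-1}(\Omega))$. But it is clear that
one has $w = \frac{\partial u}{\partial t}$ (at least in the sense of distributions) and then $\frac{\partial u}{\partial t} \in L^2(0,T;H^1_0(\Omega))$ with
$\frac{\partial^2 u}{\partial t^2} \in L^2(0,T;H^{-1}(\Omega))$, i.e., by Proposition 7.1.1, $\frac{\partial u}{\partial t} \in C^0([0,T];L^2(\Omega))$.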
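As a second step, we write
\[ -\Delta u(t) = f(t) - \frac{\partial u}{\partial t}(t) \qquad \text{a.e. in } t , \tag{7.4.5} \]
which is an elliptic equation with respect to $x$, whose right-hand side lies in $L^2(\Omega)$ for almost
every $t \in (0,T)$. If $\Omega$ is smooth or convex, then the following relationship holds true:
\[ \|u(t)\|_{H^2(\Omega)} \le C\,\Big\| f(t) - \frac{\partial u}{\partial t}(t) \Big\|_{L^2(\Omega)} < +\infty , \qquad C > 0 , \]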
and assume that the same hypotheses of Theorem 5.2.2 hold on $L$. Then:
(i) if $f - a_0 m \ge 0$ in $Q$, then $u \ge m$ in $Q$;
(ii) if $f - a_0 M \le 0$ in $Q$, then $u \le M$ in $Q$.
7.6 Exercises
7.1. Solve in series of eigenfunctions the following parabolic problem:
u 2 u
2 = 0 in (0, 1) (0, T ), T > 0 ,
t x
u(0, t) = et o
0<tT ,
u(1, t) = 0
u(x, 0) = 0 x (0, 1) .
(Hint: do a suitable substitution to have a homogeneous Dirichlet condition on the boundary; for
example, consider u(x, t) = u(x, t) + et (1 x) and solve first for u).
Hyperbolic Problems
Hyperbolic problems arise in modeling transport phenomena with finite speed, especially those
involving wave motion. Maxwell's equations, which describe the propagation of electromagnetic
waves in the vacuum, as well as D'Alembert's equation for the pressure waves in a fluid like air or
water, and the so-called equation of a vibrating string, which gives the motion of an elastic wave
induced across a rope blocked at its ends, all go back to a hyperbolic mathematical model.
In this chapter we deal with the most general second-order linear hyperbolic operator in its
canonical form, as it has been presented in Chapter 1. We shall use the same notation which has
been introduced in the previous chapter and our explanation will be almost entirely parallel to
that of parabolic problems.
\[ \frac{\partial^2 u}{\partial t^2} - c^2 \Delta u = f , \tag{8.1.2} \]
where $c$ is the speed of the wave which is propagating across the domain $\Omega$.
However, it is possible to reduce such an equation to the form discussed in the text by a classical
procedure of Mathematical Physics which consists in rewriting the problem in a dimensionless
form.
For instance, suppose $u$ is a velocity field and let us denote by $U$ a characteristic (constant)
velocity such that $u = U\hat{u}$, where $\hat{u}$ is a dimensionless variable; analogously, let us set $x = L\hat{x}$ and
122 CHAPTER 8. HYPERBOLIC PROBLEMS
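$t = \tau\hat{t}$, $L$ and $\tau$ being a characteristic length of the spatial domain and a characteristic time,
respectively. Substituting into equation (8.1.2) gives
\[ \frac{U}{\tau^2} \frac{\partial^2 \hat{u}}{\partial \hat{t}^2} - \frac{c^2 U}{L^2}\,\hat{\Delta}\hat{u} = f , \]
that is
\[ \frac{\partial^2 \hat{u}}{\partial \hat{t}^2} - \frac{c^2 \tau^2}{L^2}\,\hat{\Delta}\hat{u} = \frac{\tau^2}{U}\,f . \]
Choosing $\tau = L/c$ and setting $\tau^2 f / U = \hat{f}$, we finally obtain
\[ \frac{\partial^2 \hat{u}}{\partial \hat{t}^2} - \hat{\Delta}\hat{u} = \hat{f} , \]
which is a dimensionless equation of the desired form, in the unknown $\hat{u}$.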
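\[ {}_{\mathcal{D}'(Q)}\Big\langle \frac{\partial^2 u}{\partial t^2} - \Delta u , \varphi \Big\rangle_{\mathcal{D}(Q)} = {}_{\mathcal{D}'(Q)}\langle f , \varphi \rangle_{\mathcal{D}(Q)} \qquad \forall \varphi \in \mathcal{D}(Q) \]
or, more explicitly,
\[ \int_0^T \Big( u , \frac{\partial^2 \varphi}{\partial t^2} \Big)\,dt + \int_0^T a(u,\varphi)\,dt = \int_0^T (f,\varphi)\,dt \qquad \forall \varphi \in \mathcal{D}(Q) , \tag{8.1.4} \]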
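On the other hand, if we assume $f \in L^2(Q) = L^2(0,T;L^2(\Omega))$ and $u \in L^2(0,T;H^1_0(\Omega))$, which
implies $\Delta u \in L^2(0,T;H^{-1}(\Omega))$, we obtain from (8.1.3)
\[ \frac{\partial^2 u}{\partial t^2} \in L^2(0,T;H^{-1}(\Omega)) , \]
so that we can express our equation in the following variational form:
\[ \int_0^T {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial^2 u}{\partial t^2} , v \Big\rangle_{H^1_0(\Omega)}\,dt + \int_0^T a(u,v)\,dt = \int_0^T (f,v)\,dt \qquad \forall v \in L^2(0,T;H^1_0(\Omega)) . \tag{8.1.5} \]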
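In order to give a precise meaning to the initial conditions, we make the further assumption
that
\[ \frac{\partial u}{\partial t} \in L^2(0,T;H^1_0(\Omega)) ; \]
in other words, all together the solution $u$ is required to satisfy $u \in L^2(0,T;H^1_0(\Omega))$ with
$\frac{\partial u}{\partial t} \in W(0,T;H^1_0(\Omega),H^{-1}(\Omega))$. As a consequence, Proposition 7.1.1 applied to the pair $\frac{\partial u}{\partial t}$, $\frac{\partial^2 u}{\partial t^2}$ yields
\[ \frac{\partial u}{\partial t} \in C^0([0,T];L^2(\Omega)) ; \]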
8.2. AN A PRIORI ESTIMATE 123
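on the other hand, recalling Remark 7.1.2 for the pair $u$, $\frac{\partial u}{\partial t}$, we get
\[ u \in C^0([0,T];H^1_0(\Omega)) . \]
Thus, the pointwise evaluation of both $u$ and $\frac{\partial u}{\partial t}$ makes sense in $[0,T]$. In particular, we have
\[ u(0) = u_0 \in H^1_0(\Omega) \qquad \text{and} \qquad \frac{\partial u}{\partial t}(0) = u_1 \in L^2(\Omega) , \]
giving the natural spaces in which one has to choose the initial data.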
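We are ready to state the initial/boundary value problem (8.1.1) in variational form.
Problem 8.1.2. Given $f \in L^2(Q)$, $u_0 \in H^1_0(\Omega)$ and $u_1 \in L^2(\Omega)$, find $u \in L^2(0,T;H^1_0(\Omega))$ with
$\frac{\partial u}{\partial t} \in W(0,T;H^1_0(\Omega),H^{-1}(\Omega))$ satisfying $u(0) = u_0$, $\frac{\partial u}{\partial t}(0) = u_1$ and such that
\[ \int_0^T {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial^2 u}{\partial t^2} , v \Big\rangle_{H^1_0(\Omega)}\,dt + \int_0^T a(u,v)\,dt = \int_0^T (f,v)\,dt \qquad \forall v \in L^2(0,T;H^1_0(\Omega)) . \tag{8.1.6} \]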
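\[ {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial^2 u}{\partial t^2}(t) , v \Big\rangle_{H^1_0(\Omega)} + a(u(t),v) = (f(t),v) \qquad \forall v \in H^1_0(\Omega), \ \text{a.e. in } (0,T) \tag{8.1.7} \]
and
\[ \frac{d^2}{dt^2}(u(t),v) + a(u(t),v) = (f(t),v) \qquad \forall v \in H^1_0(\Omega), \ \text{a.e. in } (0,T) . \tag{8.1.8} \]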
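Now take $t_1 = 0$, $t_2 = \tau$ and $w = v = \frac{\partial u}{\partial t}$ in (7.1.8) and use the initial condition to get
\[ \int_0^\tau {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial^2 u}{\partial t^2} , \frac{\partial u}{\partial t} \Big\rangle_{H^1_0(\Omega)}\,dt = \frac{1}{2}\Big[ \Big( \frac{\partial u}{\partial t}(\tau) , \frac{\partial u}{\partial t}(\tau) \Big) - \Big( \frac{\partial u}{\partial t}(0) , \frac{\partial u}{\partial t}(0) \Big) \Big] = \frac{1}{2}\Big\| \frac{\partial u}{\partial t}(\tau) \Big\|^2_{L^2(\Omega)} - \frac{1}{2}\|u_1\|^2_{L^2(\Omega)} . \]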
Furthermore, the following identity holds for our bilinear symmetric form $a(u,v)$:
\[ a\Big( u , \frac{\partial u}{\partial t} \Big) = \frac{1}{2} \frac{d}{dt}\,a(u,u) . \]
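\[ \frac{1}{\Delta t}\big[ a(u(t+\Delta t),u(t+\Delta t)) - a(u(t),u(t)) \big] \]
\[ = \frac{1}{\Delta t}\big[ a(u(t+\Delta t),u(t+\Delta t)) - a(u(t),u(t+\Delta t)) + a(u(t),u(t+\Delta t)) - a(u(t),u(t)) \big] \]
\[ = a\Big( \frac{u(t+\Delta t) - u(t)}{\Delta t} , u(t+\Delta t) \Big) + a\Big( u(t) , \frac{u(t+\Delta t) - u(t)}{\Delta t} \Big) , \]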
Finally, using the Cauchy-Schwartz inequality and the fact ab 12 a2 + b2 a, b > 0, we get
Z Z
u 1
u
f, dt kf (t)kL2 ()
(t)
dt
0 t 0 t
L2 ()
Z Z
1
u
2
2
kf (t)kL2 () dt +
(t)
dt
2 0 2 0
t
L2 ()
2
1
u ( )
1 1 1
ku1 k2L2 () + ku( )k2H1 () ku0 k2H1 ()
2 t L2 () 2 2 0 2 0
Z
1
u
2
kf k2L2 (0, ; L2 ()) +
(t)
dt ;
2 2
t
2
0 L ()
u
2
max
(t)
max ku(t)k2H1 ()
t[0,T ]
t
2 + t[0,T ] 0
L ()
Z T
1
u
2
kf k2L2 (0,T ; L2 ()) + ku0 k2H1 () + ku1 k2L2 () + max
(t)
dt .
0 t[0,T ]
t
L2 () 0
Remark 8.2.2. Observe that the right-hand side depends on $T$, which implies that the upper
bound becomes less and less tight as $T$ grows to infinity. Although this is not so serious a problem
from a theoretical viewpoint, it may be unfavourable in a numerical approach, where one would
like to control and confine as much as possible the error on the numerical solution.
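If we want to improve in this sense the a priori upper bound (8.2.2) we may, for instance,
control the norm of the solution $u$ by using a different norm of the forcing term $f$. In particular,
since $f \in L^2(0,T;L^2(\Omega))$ implies$^1$ $f \in L^1(0,T;L^2(\Omega))$, we have
\[ \int_0^\tau \Big( f , \frac{\partial u}{\partial t} \Big)\,dt \le \int_0^\tau \|f(t)\|_{L^2(\Omega)}\,\Big\| \frac{\partial u}{\partial t}(t) \Big\|_{L^2(\Omega)}\,dt \]
\[ \le \max_{t \in [0,\tau]} \Big\| \frac{\partial u}{\partial t}(t) \Big\|_{L^2(\Omega)} \int_0^\tau \|f(t)\|_{L^2(\Omega)}\,dt \]
\[ \le \Big\| \frac{\partial u}{\partial t} \Big\|_{C^0([0,\tau];L^2(\Omega))}\,\|f\|_{L^1(0,T;L^2(\Omega))} \]
\[ \le \frac{1}{2}\Big\| \frac{\partial u}{\partial t} \Big\|^2_{C^0([0,\tau];L^2(\Omega))} + \frac{1}{2}\|f\|^2_{L^1(0,T;L^2(\Omega))} , \]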
Theorem 8.3.1. Problem 8.1.2 admits one and only one solution, for which the bound (8.2.2)
holds. Furthermore, the solution depends continuously on the data in the norms which appear in
this bound.
Proof. (Sketch). Let us just detail the construction of the approximants of the solution by an
eigenfunction expansion; the rest of the arguments are similar to the parabolic case.
Let us denote, as usual, by $\{\lambda_n, w_n\}$ the eigenvalue-eigenfunction pairs of the Laplacian with
Dirichlet boundary conditions; then we can represent the data as
\[ f(t) = \sum_{n=1}^{+\infty} f_n(t) w_n , \qquad f_n(t) = (f(t),w_n) , \]
\[ u_0 = \sum_{n=1}^{+\infty} u_{0,n} w_n , \qquad u_{0,n} = (u_0,w_n) , \]
\[ u_1 = \sum_{n=1}^{+\infty} u_{1,n} w_n , \qquad u_{1,n} = (u_1,w_n) , \]
$^1$ Given a set $A$, we recall that the following relationship holds true: $L^2(A) \subset L^1(A)$, provided $A$ has a finite
Lebesgue measure; in this case, there exists a constant $C > 0$ such that $\|v\|_{L^1(A)} \le C\|v\|_{L^2(A)}$ $\forall v \in L^2(A)$, which
shows that the $L^1$-norm is weaker than the $L^2$-norm. Nevertheless, these two norms are not equivalent, since the converse
need not be true; consider, e.g., the function $v(x) = 1/\sqrt{x}$ on $A = [0,1]$.
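Recalling that
\[ (w_n,w_m) = \delta_{n,m} \qquad \text{and} \qquad a(w_n,w_m) = \lambda_n\,\delta_{n,m} \]
we obtain
\[ u_m''(t) + \lambda_m u_m(t) = f_m(t) , \qquad m = 1, 2, \ldots , \]
together with the conditions
\[ u(0) = \sum_{n=1}^{+\infty} u_n(0) w_n = u_0 = \sum_{n=1}^{+\infty} u_{0,n} w_n \;\Longrightarrow\; u_m(0) = u_{0,m} \quad \forall m \ge 1 , \]
\[ \frac{\partial u}{\partial t}(0) = \sum_{n=1}^{+\infty} u_n'(0) w_n = u_1 = \sum_{n=1}^{+\infty} u_{1,n} w_n \;\Longrightarrow\; u_m'(0) = u_{1,m} \quad \forall m \ge 1 ; \]
thus, each generalized Fourier coefficient of the solution $u$ is the solution of the following
initial-value problem for a linear second-order differential equation
\[ \begin{cases} u_m''(t) + \lambda_m u_m(t) = f_m(t) , \\ u_m(0) = u_{0,m} , \\ u_m'(0) = u_{1,m} , \end{cases} \qquad m = 1, 2, \ldots , \]
which can be rewritten in a canonical form as a first-order system by defining $v_m(t) = u_m'(t)$
\[ \begin{cases} u_m'(t) = v_m(t) , \\ v_m'(t) = -\lambda_m u_m(t) + f_m(t) , \\ u_m(0) = u_{0,m} , \\ v_m(0) = u_{1,m} , \end{cases} \qquad m = 1, 2, \ldots , \]
and then introducing the vector $\mathbf{v}_m(t) = (u_m(t), v_m(t))^T$:
\[ \begin{cases} \mathbf{v}_m'(t) = \begin{pmatrix} 0 & 1 \\ -\lambda_m & 0 \end{pmatrix} \mathbf{v}_m(t) + \begin{pmatrix} 0 \\ f_m(t) \end{pmatrix} , \\[2mm] \mathbf{v}_m(0) = \begin{pmatrix} u_{0,m} \\ u_{1,m} \end{pmatrix} , \end{cases} \qquad m = 1, 2, \ldots \]
The solution of this differential system can be written formally as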
8.4. QUALITATIVE PROPERTIES OF THE SOLUTION 127
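\[ \mathbf{v}_m(t) = e^{tA_m} \begin{pmatrix} u_{0,m} \\ u_{1,m} \end{pmatrix} + \int_0^t e^{(t-\tau)A_m} \begin{pmatrix} 0 \\ f_m(\tau) \end{pmatrix} d\tau , \tag{8.3.2} \]
where we have set $A_m = \begin{pmatrix} 0 & 1 \\ -\lambda_m & 0 \end{pmatrix}$. This matrix has the following eigenvalues: $\pm i\sqrt{\lambda_m}$, thus it
can be diagonalized using a nonsingular matrix $P \in \mathbb{C}^{2,2}$ such that
\[ A_m P = P \begin{pmatrix} i\sqrt{\lambda_m} & 0 \\ 0 & -i\sqrt{\lambda_m} \end{pmatrix} , \]
which also yields the explicit expression of the exponential matrix
\[ e^{tA_m} = P \begin{pmatrix} e^{i\sqrt{\lambda_m}\,t} & 0 \\ 0 & e^{-i\sqrt{\lambda_m}\,t} \end{pmatrix} P^{-1} . \]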
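Carrying out the product with $P$ and $P^{-1}$ leads to a real matrix whose entries are $\cos(\sqrt{\lambda_m}\,t)$ and $\sin(\sqrt{\lambda_m}\,t)$; this closed form is not written in the text and is stated here as an assumption, which the sketch below checks against a truncated power series of the matrix exponential. The value $\lambda = \pi^2$ and the helper names `matmul`, `expm_taylor` and `expm_closed` are illustrative choices.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm_taylor(a, t, terms=40):
    """Truncated power series sum_k (t a)^k / k! for a 2x2 matrix a."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        scaled = [[t * a[i][j] / k for j in range(2)] for i in range(2)]
        term = matmul(term, scaled)  # term now holds (t a)^k / k!
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def expm_closed(lam, t):
    """Assumed closed form of exp(t A) for A = [[0, 1], [-lam, 0]], lam > 0."""
    w = math.sqrt(lam)
    return [[math.cos(w * t), math.sin(w * t) / w],
            [-w * math.sin(w * t), math.cos(w * t)]]

lam, t = math.pi ** 2, 0.7
A = [[0.0, 1.0], [-lam, 0.0]]
series = expm_taylor(A, t)
closed = expm_closed(lam, t)
max_diff = max(abs(series[i][j] - closed[i][j]) for i in range(2) for j in range(2))
print(max_diff)
```

The oscillatory entries reflect the purely imaginary eigenvalues of $A_m$: unlike the parabolic case, no mode is damped as $t$ grows.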
converges in general only for $s = 0$ and $s = 1$, according to the fact that $u \in L^2(0,T;H^1_0(\Omega))$.
If $f$ is not zero, an analogous result can be achieved.
In order to have a smooth solution (in the Sobolev or classical scale) one has to assume
sufficiently smooth initial data and forcing term, plus of course the smoothness of the domain if
one is interested in the regularity up to the boundary.
Conservation properties
Let us consider an elastic membrane blocked at its boundary, which can oscillate from an initial
position u0 and with an initial speed u1 . The mathematical dimensionless model describing such
a phenomenon is
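\[ \begin{cases} \dfrac{\partial^2 u}{\partial t^2} - \Delta u = 0 & \text{in } \Omega \times (0,\infty) , \\ u = 0 & \text{on } \partial\Omega \times (0,\infty) , \\ u = u_0 , \quad \dfrac{\partial u}{\partial t} = u_1 & \text{on } \Omega \times \{0\} , \end{cases} \tag{8.4.1} \]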
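that is, a linear hyperbolic problem with, in particular, the forcing term $f$ equal to zero. Due to
this, the a priori upper bound derived from (8.2.1) can now be written as an equality:
\[ \frac{1}{2}\Big\| \frac{\partial u}{\partial t}(\tau) \Big\|^2_{L^2(\Omega)} + \frac{1}{2}\|u(\tau)\|^2_{H^1_0(\Omega)} = \frac{1}{2}\|u_1\|^2_{L^2(\Omega)} + \frac{1}{2}\|u_0\|^2_{H^1_0(\Omega)} , \qquad \forall \tau \ge 0 . \]
Thus, the expression
\[ E(t) = \frac{1}{2}\Big\| \frac{\partial u}{\partial t}(t) \Big\|^2_{L^2(\Omega)} + \frac{1}{2}\|u(t)\|^2_{H^1_0(\Omega)} \]
denotes a quantity which does not vary during the evolution in time, i.e.,
\[ \frac{d}{dt}E(t) \equiv 0 ; \]
in other words, the quantity $E(t)$ is conserved in time. Physically, it represents the membrane's
dimensionless total energy, and, more precisely,
\[ \frac{1}{2}\Big\| \frac{\partial u}{\partial t}(T) \Big\|^2_{0,\Omega} \ \text{ is the dimensionless kinetic energy} , \]
\[ \frac{1}{2}\,|u(T)|^2_{1,\Omega} \ \text{ is the dimensionless elastic energy} . \]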
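Instead, the following problem
\[ \begin{cases} \dfrac{\partial^2 u}{\partial t^2} + \alpha\,\dfrac{\partial u}{\partial t} - \Delta u = 0 & \text{in } \Omega \times (0,\infty) , \\ u = 0 & \text{on } \partial\Omega \times (0,\infty) , \\ u = u_0 , \quad \dfrac{\partial u}{\partial t} = u_1 & \text{on } \Omega \times \{0\} , \end{cases} \]
with $\alpha > 0$ models the motion of the same membrane but without neglecting friction forces,
proportional to its velocity, which are taken into account by the term $\alpha\,\frac{\partial u}{\partial t}$.
The a priori upper bound now reads
\[ \alpha \int_0^\tau \Big\| \frac{\partial u}{\partial t}(t) \Big\|^2_{L^2(\Omega)}\,dt + \frac{1}{2}\Big\| \frac{\partial u}{\partial t}(\tau) \Big\|^2_{L^2(\Omega)} + \frac{1}{2}\|u(\tau)\|^2_{H^1_0(\Omega)} = \frac{1}{2}\|u_1\|^2_{L^2(\Omega)} + \frac{1}{2}\|u_0\|^2_{H^1_0(\Omega)} , \qquad \forall \tau \ge 0 , \]
and we can see that, due to the dissipation term on the left-hand side
\[ \alpha \int_0^\tau \Big\| \frac{\partial u}{\partial t}(t) \Big\|^2_{L^2(\Omega)}\,dt > 0 \ \text{ and strictly increasing with } \tau , \]
one has
\[ \frac{d}{dt}E(t) < 0 , \]
i.e., the total energy of the system is not conserved.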
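The two behaviours of the energy can be observed on a single Fourier mode. The sketch below is an illustration under stated assumptions: the solution consists of one $L^2$-normalized eigenfunction with eigenvalue $\lambda = \pi^2$, so that $\|u(t)\|^2_{H^1_0(\Omega)} = \lambda\,u_m(t)^2$; the damping coefficient is small enough for the oscillatory regime; the names `damped_mode`, `energy` and `alpha` are introduced for this example. The coefficient solves $u_m'' + \alpha\,u_m' + \lambda\,u_m = 0$, whose exact solution is used.

```python
import math

def damped_mode(lam, alpha, u0, u1, t):
    """Exact (u, u') for u'' + alpha u' + lam u = 0, assuming alpha^2 < 4 lam."""
    nu = math.sqrt(lam - 0.25 * alpha * alpha)
    a, b = u0, (u1 + 0.5 * alpha * u0) / nu
    decay = math.exp(-0.5 * alpha * t)
    c, s = math.cos(nu * t), math.sin(nu * t)
    u = decay * (a * c + b * s)
    du = decay * ((b * nu - 0.5 * alpha * a) * c - (a * nu + 0.5 * alpha * b) * s)
    return u, du

def energy(lam, u, du):
    """Kinetic plus elastic energy of a single normalized mode."""
    return 0.5 * du * du + 0.5 * lam * u * u

lam, u0, u1 = math.pi ** 2, 1.0, 0.5
times = [0.0, 0.5, 1.0, 2.0]
conserved = [energy(lam, *damped_mode(lam, 0.0, u0, u1, t)) for t in times]
damped = [energy(lam, *damped_mode(lam, 0.4, u0, u1, t)) for t in times]
print(conserved)
print(damped)
```

With `alpha = 0.0` the printed energies coincide up to rounding, whereas with a positive damping coefficient they decrease from one sampled time to the next.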
Time reversibility
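Linear hyperbolicity allows us to reverse the time axis, and reconstruct the solution from knowing
its values at a final time, instead of at an initial time. In other words, the retrograde
boundary-value problem
\[ \begin{cases} \dfrac{\partial^2 w}{\partial t^2} - \Delta w = 0 & \text{in } Q , \\ w = 0 & \text{on } \partial\Omega \times (0,T) , \\ w = w_0 , \quad \dfrac{\partial w}{\partial t} = w_1 & \text{on } \Omega \times \{T\} , \end{cases} \tag{8.4.2} \]
is well posed, as the forward problem. Indeed, applying the change of variable $\tau = T - t$ and
setting $u(x,\tau) = w(x,T-\tau)$, we have
\[ \frac{\partial u}{\partial \tau}(x,\tau) = -\frac{\partial w}{\partial t}(x,T-\tau) , \]
\[ \frac{\partial^2 u}{\partial \tau^2}(x,\tau) = \frac{\partial^2 w}{\partial t^2}(x,T-\tau) , \]
\[ \Delta u(x,\tau) = \Delta w(x,T-\tau) \]
and moreover
\[ u(0) = w(T) = w_0 , \]
\[ \frac{\partial u}{\partial \tau}(0) = -\frac{\partial w}{\partial t}(T) = -w_1 , \]
so that $u$ satisfies the standard problem
\[ \begin{cases} \dfrac{\partial^2 u}{\partial \tau^2} - \Delta u = 0 & \text{in } Q , \\ u = 0 & \text{on } \partial\Omega \times (0,T) , \\ u = w_0 , \quad \dfrac{\partial u}{\partial \tau} = -w_1 & \text{on } \Omega \times \{0\} . \end{cases} \tag{8.4.3} \]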
Roughly speaking, we may say that w solves the problem (8.4.2) backwards in time.
This, however, no longer holds if we include the term $\alpha\,\frac{\partial w}{\partial t}$, since in dissipative phenomena the
time axis cannot be reversed: indeed, the previous transformation would give the term $-\alpha\,\frac{\partial u}{\partial \tau}$ in
(8.4.3), with the wrong sign!
The fact that a dissipative retrograde problem is not well posed is nothing but the mathematical
counterpart of the physical concept of entropy.
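The reversal argument for the undamped equation can be tested on a single Fourier mode. The sketch below is an illustration under stated assumptions: one mode with eigenvalue $\lambda = \pi^2$, the exact solution map of $u_m'' + \lambda\,u_m = 0$ acting on the pair (position, velocity), and the hypothetical helper name `flow`. The state is advanced up to time $T$, the sign of the velocity is changed as prescribed by the change of variable $\tau = T - t$, the same forward solver is applied again, and the initial state is recovered.

```python
import math

def flow(lam, t, state):
    """Exact solution map of u'' + lam u = 0 acting on state = (u, u')."""
    w = math.sqrt(lam)
    u, du = state
    c, s = math.cos(w * t), math.sin(w * t)
    return (c * u + s * du / w, -w * s * u + c * du)

lam, T = math.pi ** 2, 1.3
initial = (1.0, -0.5)
final = flow(lam, T, initial)

# Change of variable tau = T - t: same equation, velocity changes sign.
reversed_data = (final[0], -final[1])
back = flow(lam, T, reversed_data)
recovered = (back[0], -back[1])
print(initial, recovered)
```

The recovery relies on the absence of first-order time derivatives in the equation: only then does the transformed problem have the same form as the forward one.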
8.5 Exercises
8.1. [... to be added ...]