
October 12, 2011

Quantum Field Theory I


Ulrich Haisch
Rudolf Peierls Centre for Theoretical Physics, University of Oxford
OX1 3PN Oxford, United Kingdom
Abstract
This course deals with modern applications of quantum field theory, with emphasis on
the quantization of theories involving scalar and spinor fields.
Recommended Books and Resources
There is a vast array of quantum field theory texts, many of them with redeeming features.
Here I mention a few of them, mostly the ones that I used or looked at when preparing this
course. To a large extent, I will follow the first section of
M. Peskin and D. Schroeder, An Introduction to Quantum Field Theory
This is a very clear and comprehensive book, covering essentially everything in this course
as well as many advanced aspects of quantum field theory that go (far) beyond the scope of
this lecture.
S. Weinberg, The Quantum Theory of Fields: Volume 1, Foundations
This is the first in a three volume series by one of the masters of quantum field theory.
It takes a unique route through the subject, focussing initially on particles rather than fields.
Since it has a very particular viewpoint, it is difficult to digest, but certainly worth reading.
L. Ryder, Quantum Field Theory
This elementary text has a nice discussion of much of the material in this course. It is
good for a first reading.
A. Zee, Quantum Field Theory in a Nutshell
This is a charming book, where emphasis is placed on physical understanding and the
author isn't afraid to hide the ugly truth when necessary. It contains many gems.
By browsing the web, I also found interesting material. Nice introductions to quantum
field theory (of different length and viewpoint) have been written by C. Anastasiou and D.
Tong. The corresponding scripts can be found at:
http://www.phys.ethz.ch/babis/Teaching/QFTI/qft1.pdf
http://www.damtp.cam.ac.uk/user/tong/qft/qft.pdf
Other links to useful resources can be found on the web page of D. Tong:
http://www.damtp.cam.ac.uk/user/tong/qft.html
For completeness, I will also give relevant references at the end of each section of this
script. The interested reader can consult them for further details on the discussed topics.
Contents
1 Introduction
  1.1 Why QFT?
  1.2 Scales and Units
2 Elements of Classical Field Theory
  2.1 Dynamics of Fields
  2.2 Noether's Theorem
  2.3 Example: Electrodynamics
  2.4 Space-Time Symmetries
  2.5 Problems
3 Klein-Gordon Theory
  3.1 Klein-Gordon Field as Harmonic Oscillators
  3.2 Structure of Vacuum
  3.3 Particle States
  3.4 Two Real Klein-Gordon Fields
  3.5 Complex Klein-Gordon Field
  3.6 Heisenberg Picture
  3.7 Klein-Gordon Correlators
  3.8 Non-Relativistic Limit
  3.9 Problems
4 Interacting Fields
  4.1 Classification of Interactions
  4.2 Interaction Picture
  4.3 First Look at Scattering Processes
  4.4 Wick's Theorem
  4.5 Second Look at Scattering Processes
  4.6 Feynman Diagrams
  4.7 Third Look at Scattering Processes
  4.8 Yukawa Potential
  4.9 Connected and Amputated Feynman Diagrams
  4.10 From Correlation Functions to Scattering Matrix Elements
  4.11 Decay Widths and Cross Sections
  4.12 Problems
5 Dirac Theory
  5.1 Spinor Representation
  5.2 Discrete Symmetries of Dirac Theory
  5.3 Continuous Symmetries of Dirac Theory
  5.4 Solutions to Dirac Equation
  5.5 Quantization of Dirac Theory
  5.6 Problems
1 Introduction
As the term quantum field theory (QFT) suggests, QFT is the application of quantum mechanics (QM) to dynamical systems of fields, in the same sense that QM is concerned mainly with the quantization of dynamical systems of particles. QFT is not only a subject that is absolutely essential to understand the current state of elementary particle physics as well as modern aspects of cosmology, but also plays a crucial role in many active areas of research, ranging from atomic, nuclear, and condensed-matter physics to pure mathematics. Since the ultimate goal of this course is to gain a basic understanding of the fundamental laws of nature, we will in the following focus mainly on the physics of elementary particles and hence deal mostly with relativistic fields.
1.1 Why QFT?
The primary reason for introducing the concept of fields in classical physics is to construct laws of nature that are local. The old laws of Newton (Coulomb) involve "action at a distance". This means that the force felt by a planet (an electron) changes immediately if a distant star (proton) moves. The laws of Newton and Coulomb thus feature non-local interactions. The field theories of Einstein (general relativity) and Maxwell (electrodynamics) remedied the situation, with all interactions mediated in a local fashion by fields. The requirement of locality remains a strong motivation for studying QFTs. However, there are further good reasons to treat the quantum field (and not the particle) as fundamental (or as Steven Weinberg puts it in [1]: "Quantum fields are the basic ingredients of the universe, and particles are just bundles of energy and momentum made out of them.").
QM and Special Relativity
A first reason is that the combination of QM and special relativity implies that particle number is not conserved. Consider a particle of mass $m$ trapped in a box of size $L$. Heisenberg's uncertainty principle tells us that the uncertainty in the momentum of our particle is $\Delta p \geq \hbar/L$. In the relativistic limit, momentum and energy can be treated on an equal footing, and one has an uncertainty in the energy of order $\Delta E \geq \hbar c/L$. Yet, if $\Delta E = 2mc^2$, there is enough energy available to create a virtual particle-antiparticle pair from the vacuum ("Dirac sea"). This little exercise shows that when a particle with mass $m$ is localized within a distance $\lambda_{\rm Compton} = \hbar/(mc)$, talking about a single particle loses its sense. For distances smaller than this Compton wavelength there is a high probability that we will detect particle-antiparticle pairs swarming around the single particle that we initially put into the box. Notice that $\lambda_{\rm Compton}$ is always smaller than the de Broglie wavelength given by $\lambda_{\rm de\,Broglie} = \hbar/|\mathbf{p}|$.^1 If you like, the de Broglie wavelength is the distance at which the wavelike nature of particles becomes apparent, while the Compton wavelength is the distance at which the concept of a single pointlike particle breaks down and one has to start thinking about how to describe multiparticle states.

^1 Throughout this course we will use boldface type (ordinary italic type) to denote 3-vectors (4-vectors).
The presence of a multitude of particles and antiparticles at short distances (or high energies) tells us that any attempt to write down a relativistic version of the one-particle Schrödinger equation is doomed to fail. There is no mechanism in standard non-relativistic QM to deal with changes in the particle number. Indeed, any naive attempt at a relativistic one-particle Schrödinger equation meets serious problems: negative probabilities, infinite towers of negative energy states, or a breakdown of causality are the common issues that arise.
QM and Causality
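Let us have a closer look at the issue of breakdown of causality. Consider the amplitude
$$ A(t) = \langle \mathbf{y} |\, e^{-iEt/\hbar} \,| \mathbf{x} \rangle , \tag{1.1} $$
that describes the propagation of a free particle from the point $\mathbf{x}$ to $\mathbf{y}$. In non-relativistic QM one has $E = \mathbf{p}^2/(2m)$ and hence^2
$$
\begin{aligned}
A(t) &= \langle \mathbf{y} |\, \exp\!\left[ -i \left( \mathbf{p}^2/(2m) \right) t/\hbar \right] | \mathbf{x} \rangle \\
&= \int \frac{d^3p}{(2\pi\hbar)^3} \, \langle \mathbf{y} |\, \exp\!\left[ -i \left( \mathbf{p}^2/(2m) \right) t/\hbar \right] | \mathbf{p} \rangle \langle \mathbf{p} | \mathbf{x} \rangle \\
&= \int \frac{d^3p}{(2\pi\hbar)^3} \, \exp\!\left[ -i \left( \mathbf{p}^2/(2m) \right) t/\hbar \right] \exp\!\left[ i \mathbf{p} \cdot (\mathbf{y} - \mathbf{x})/\hbar \right] \\
&= \left( \frac{m}{2\pi i \hbar t} \right)^{3/2} \exp\!\left[ i m (\mathbf{y} - \mathbf{x})^2/(2\hbar t) \right] .
\end{aligned}
\tag{1.2}
$$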
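Here we have made use of the completeness relation $\int d^3p/(2\pi\hbar)^3 \, |\mathbf{p}\rangle\langle\mathbf{p}| = 1$ of the momentum eigenstates $|\mathbf{p}\rangle$ and a little bit of algebra. The expression (1.2) is non-zero for all $\mathbf{y}$ and $t$, indicating that a particle can propagate between any two points in an arbitrarily short time. In a relativistic theory, this conclusion would signal a violation of causality. One might hope that using the relativistic expression $E = \sqrt{\mathbf{p}^2 c^2 + m^2 c^4}$ for the energy would cure the problem, but it does not. In fact, in the relativistic case one has
$$
\begin{aligned}
A(t) &= \langle \mathbf{y} |\, \exp\!\left[ -\frac{it}{\hbar} \sqrt{\mathbf{p}^2 c^2 + m^2 c^4} \right] | \mathbf{x} \rangle \\
&= \int \frac{d^3p}{(2\pi\hbar)^3} \, \exp\!\left[ -\frac{it}{\hbar} \sqrt{\mathbf{p}^2 c^2 + m^2 c^4} \right] \exp\!\left[ i \mathbf{p} \cdot (\mathbf{y} - \mathbf{x})/\hbar \right] \\
&= \frac{1}{2\pi^2 \hbar^2 |\mathbf{y} - \mathbf{x}|} \int_0^\infty dp \; p \, \sin\!\left( p\,|\mathbf{y} - \mathbf{x}|/\hbar \right) \exp\!\left[ -\frac{it}{\hbar} \sqrt{p^2 c^2 + m^2 c^4} \right] .
\end{aligned}
\tag{1.3}
$$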
This integral can be evaluated explicitly in terms of Bessel functions, but for our purposes it is sufficient to consider its asymptotic behavior for $L^2 = |\mathbf{y} - \mathbf{x}|^2 \gg c^2 t^2$, i.e., separations well outside the light-cone. We use the method of stationary phase. The relevant phase function $pL - t\sqrt{p^2 c^2 + m^2 c^4}$ has a stationary point at $p = imcL/\sqrt{L^2 - c^2 t^2}$. Plugging this value into (1.3), we find that (up to a rational function of $L$ and $t$),
$$ A(t) \sim \exp\!\left[ -\frac{mc}{\hbar} \sqrt{L^2 - c^2 t^2} \right] . \tag{1.4} $$
This expression is small but non-zero outside the light-cone, so causality is again violated.

In both cases, the observed failure is telling us that we need a new formalism to preserve causality. This formalism is QFT. It solves the causality problem in a miraculous way. We will see later that in QFT the propagation of a particle across a space-like interval is indistinguishable from the propagation of an antiparticle in the opposite direction. When we ask whether an observation made at point $x$ can affect an observation made at point $y$, we will find that the amplitudes for particle and antiparticle propagation cancel in such a way that causality is preserved.

^2 The symbol $\mathbf{p}$ denotes the momentum operator, which in many QM books is indicated by a hat, $\hat{\mathbf{p}}$. To avoid clutter, I will not use the latter notation, but simply write $\mathbf{p}$.
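The stationary-phase estimate behind (1.4) can be checked numerically. The following is a minimal sketch, assuming illustrative values $m = c = \hbar = 1$, $L = 5$, $t = 3$ (chosen only so that the point lies outside the light-cone); the function names are hypothetical and not part of the notes. It verifies that the derivative of the phase function vanishes at the quoted stationary point and that the resulting exponent is real and negative.

```python
import math

def stationary_point(m, c, L, t):
    """Stationary point of f(p) = p*L - t*sqrt(p^2 c^2 + m^2 c^4) for L > c*t."""
    s = math.sqrt(L**2 - c**2 * t**2)   # real outside the light-cone
    p_star = 1j * m * c * L / s         # purely imaginary momentum
    # At p_star the square root has a negative argument, -m^2 c^6 t^2 / s^2;
    # we pick the +i branch explicitly instead of relying on cmath.sqrt.
    e_star = 1j * m * c**3 * t / s
    return p_star, e_star

def phase(p, e, L, t):
    """Value of the phase function f = p*L - t*E."""
    return p * L - t * e

def phase_derivative(p, e, L, t, c):
    """df/dp = L - t * p c^2 / E."""
    return L - t * p * c**2 / e

# Illustrative values (assumptions, natural units).
m, c, hbar = 1.0, 1.0, 1.0
L, t = 5.0, 3.0

p_star, e_star = stationary_point(m, c, L, t)
# Consistency: e_star^2 must equal p_star^2 c^2 + m^2 c^4.
mismatch = abs(e_star**2 - (p_star**2 * c**2 + m**2 * c**4))
# Exponent of the amplitude at the stationary point: i * f(p_star) / hbar.
exponent = 1j * phase(p_star, e_star, L, t) / hbar

print("df/dp at stationary point:", abs(phase_derivative(p_star, e_star, L, t, c)))
print("energy-shell mismatch    :", mismatch)
print("exponent in (1.4)        :", exponent.real)
print("suppression factor       :", math.exp(exponent.real))
```

The exponent comes out as $-(mc/\hbar)\sqrt{L^2 - c^2 t^2}$, so the amplitude is exponentially suppressed but not zero for space-like separations.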
What else is QFT good for?
Besides solving the causality problem, QFT also provides an elegant framework to describe transitions between states of different particle number and type. An example of such a physical process, exhaustively studied (from 1989 until 2000) at the Large Electron-Positron (LEP) collider in Geneva, is the production of a muon ($\mu^-$) and its antiparticle ($\mu^+$) out of the annihilation of an electron ($e^-$) and its antiparticle (the positron $e^+$):
$$ e^- + e^+ \to \mu^- + \mu^+ . \tag{1.5} $$
The experimental confirmation of the QFT predictions for processes such as (1.5), often to an unprecedented level of accuracy, is our real reason for studying QFT. But the power of QFT does not end here. In traditional QM the relation between spin and statistics has to be put in by hand. To agree with experiment, one should choose Bose statistics (no minus sign if one exchanges two identical particles) for integer spin particles, and Fermi statistics (minus sign if one exchanges two identical particles) for half-integer spin particles. On the other hand, in QFT the relationship between spin and statistics is a consequence of the framework, following from the commutation quantization conditions for boson fields and anticommutation quantization conditions for fermion fields.
1.2 Scales and Units
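There are three fundamental dimensionful constants in nature: the speed of light $c$, Planck's constant (divided by $2\pi$) $\hbar$, and Newton's constant $G_N$. Their dimensions are
$$
\begin{aligned}
[c] &= \text{length} \; \text{time}^{-1} , \\
[\hbar] &= \text{length}^{2} \; \text{mass} \; \text{time}^{-1} , \\
[G_N] &= \text{length}^{3} \; \text{mass}^{-1} \; \text{time}^{-2} .
\end{aligned}
\tag{1.6}
$$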
In order to avoid unnecessary clutter, we will work throughout this course in natural units, defined by $c = \hbar = 1$.^3 This allows us to express all dimensionful quantities in terms of a single scale which we choose to be mass or, equivalently, energy (since $E = mc^2$ has become $E = m$). Energies will be given in units of eV (the electron volt) or more often GeV $= 10^{9}$ eV or TeV $= 10^{12}$ eV, since we are typically dealing with high energies. To convert the unit of energy back to units of length or time, we have to insert the relevant powers of $c$ and $\hbar$. E.g., the length scale $\lambda$ associated to a mass $m$ is $\lambda = h/(mc)$. Remembering that
$$ hc \approx 1.24 \times 10^{-6} \; \text{eV m} , \tag{1.7} $$
one finds that the length scale corresponding to the electron with mass $m_e \approx 511$ keV is $\lambda_e \approx 2 \times 10^{-12}$ m.
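The conversion from a mass to a length via (1.7) is simple enough to script. The sketch below assumes only the value of $hc$ quoted in (1.7); the helper name `mass_to_length` and the selection of masses are illustrative choices, not part of the notes.

```python
# Value of h*c in eV*m, as quoted in (1.7).
HC_EV_M = 1.24e-6

def mass_to_length(mass_ev):
    """Length scale lambda = h/(m c) in metres for a mass given in eV."""
    return HC_EV_M / mass_ev

# A few illustrative masses/energies in eV.
masses_ev = {
    "electron": 511e3,
    "muon": 106e6,
    "top quark": 175e9,
    "LHC energy": 14e12,
}

for name, mass in masses_ev.items():
    print(f"{name:12s} {mass:10.3e} eV -> {mass_to_length(mass):.1e} m")
```

For the electron this reproduces the length scale of roughly $2 \times 10^{-12}$ m stated above.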
Throughout this course we will refer to the dimension of a quantity, meaning the mass dimension. Newton's constant, e.g., has $[G_N] = -2$ and defines a mass scale
$$ G_N = M_P^{-2} , \tag{1.8} $$
where $M_P \approx 10^{19}$ GeV is the Planck scale. This energy corresponds to a length scale $L_P \approx 10^{-35}$ m, the Planck length. The Planck length is believed to be the smallest length scale that makes sense: beyond this scale quantum gravity effects are likely to become important and it's no longer clear that the concept of space-time can be applied. The largest length scale we can talk of is the size of the cosmological horizon, roughly $10^{60} L_P$.
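As a rough cross-check of the orders of magnitude quoted for the Planck scale, one can restore the factors of $\hbar$ and $c$ in (1.8). The sketch below assumes the standard SI values of $\hbar$, $c$, $G_N$ and the electron volt (these numbers are not given in the notes), and uses $M_P c^2 = \sqrt{\hbar c^5/G_N}$ and $L_P = \sqrt{\hbar G_N/c^3}$.

```python
import math

# SI values (assumed inputs, not taken from the notes).
HBAR = 1.054571817e-34   # J s
C = 299792458.0          # m / s
G_N = 6.674e-11          # m^3 kg^-1 s^-2
EV = 1.602176634e-19     # J per eV

def planck_energy_gev():
    """Planck energy M_P c^2 = sqrt(hbar c^5 / G_N), converted to GeV."""
    return math.sqrt(HBAR * C**5 / G_N) / EV / 1e9

def planck_length_m():
    """Planck length L_P = sqrt(hbar G_N / c^3) in metres."""
    return math.sqrt(HBAR * G_N / C**3)

print(f"Planck energy: {planck_energy_gev():.2e} GeV")
print(f"Planck length: {planck_length_m():.2e} m")
```

Both results agree with the order-of-magnitude values $10^{19}$ GeV and $10^{-35}$ m used in the text.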
A number of masses relevant for particle physics and cosmology, together with the corresponding length scales, are shown in Table 1. Let me go through the list and spend some words on the most important quantities. After the size of the observable universe, the first scale we encounter is the cosmological constant ($\Lambda$), measured to be around $10^{-3}$ eV. Since nobody can really explain why the cosmological constant has this particular value, let's forget about it real quick and turn our attention to the masses of the known elementary particles. These range from less than 1 eV for the neutrinos ($\nu$s) to around 175 GeV for the top quark ($t$). The (in)famous Higgs boson ($h$), which is the only not yet observed ingredient of the standard model (SM) of elementary particle physics, is believed to weigh in at about 100 to 200 GeV. For scales around 1 TeV, i.e., the "terascale", the predictive power of the SM is expected to break down. This is precisely the energy regime that the Large Hadron Collider (LHC) at CERN in Geneva has started to explore, having a design center-of-mass energy of 14 TeV. Beyond the electroweak scale ($v$) of around 250 GeV, again nobody knows with certainty what is going on. One could find a plethora of new (elementary) particles or a "great desert". There are experimental hints that the coupling constants of electromagnetism and of the weak and strong forces unify at around $M_{\rm GUT} = 10^{16}$ GeV, i.e., the grand unification (GUT) scale. Everything is topped off at the Planck scale where a QFT description might no longer be possible and a quantum theory including the effects of gravity is needed to describe the physics of fundamental interactions. The most likely possibility for such a theory seems to be some kind of string theory. But many other ideas such as loop quantum gravity, Horava-Lifshitz gravity, etc. also exist. In fact, the "theory of everything" (TOE) could also be a QFT, but one in which the finite or infinite number of renormalized couplings do not run off to infinity with increasing energy, but hit a fixed point of the renormalization group equation. This possibility goes by the name of asymptotic safety. Don't worry if you haven't understood a single word of what I have mumbled about possible TOEs. All this is way too advanced to be covered in this course. I only mentioned it to make propaganda for the research of Joe Conlon (string theory), Andre Lukas (string theory), and John Wheater (quantum gravity), who work on such theories here in Oxford. Ask them if you want to know more about it.

^3 The whole point of units is that you can choose whatever units are most convenient!

Quantity                          Mass                    Length
Observable universe               $10^{-33}$ eV           $10^{27}$ m ($2 \times 10^{10}$ ly)
Cosmological constant ($\Lambda$) $10^{-3}$ eV            $10^{-3}$ m
Neutrinos ($\nu$s)                $\lesssim 1$ eV         $\gtrsim 10^{-6}$ m
Electron ($e^-$)                  511 keV                 $2 \times 10^{-12}$ m
Muon ($\mu^-$)                    106 MeV                 $10^{-14}$ m
Charm quark ($c$)                 1.3 GeV                 $10^{-15}$ m
Tau ($\tau^-$)                    1.78 GeV                $7 \times 10^{-16}$ m
Bottom quark ($b$)                4.6 GeV                 $3 \times 10^{-16}$ m
Top quark ($t$)                   175 GeV                 $7 \times 10^{-18}$ m
Higgs boson ($h$)                 [100, 200] GeV          [6, 12] $\times 10^{-18}$ m
Electroweak scale ($v$)           250 GeV                 $5 \times 10^{-18}$ m
LHC energy                        14 TeV                  $9 \times 10^{-20}$ m
GUT scale ($M_{\rm GUT}$)         $10^{16}$ GeV           $10^{-31}$ m
Planck scale ($M_P$)              $10^{19}$ GeV           $10^{-35}$ m

Table 1: An assortment of masses and corresponding length scales that appear in the context of particle physics and cosmology.
References
[1] S. Weinberg, "What is quantum field theory, and what did we think it was?", arXiv:hep-th/9702027.
[2] S. Weinberg, "The Search for Unity: Notes for a History of Quantum Field Theory", Daedalus, Vol. 106, No. 4, Discoveries and Interpretations: Studies in Contemporary Scholarship, Volume II (1977), 17 p.
[3] Chapter 1 of S. Weinberg, The Quantum Theory of Fields. Vol. 1: Foundations, Cambridge, UK, Univ. Pr. (1995), 609 p.
[4] F. Wilczek, Rev. Mod. Phys. 71, S85 (1999) [arXiv:hep-th/9803075].
2 Elements of Classical Field Theory
In this second section we will discuss various aspects of classical fields. We will cover only the bare minimum necessary before turning to the quantum theory, and will return to classical field theory at several later stages in this course when we need to introduce new concepts or ideas.
2.1 Dynamics of Fields
A field is a quantity defined at every space-time point $x = (t, \mathbf{x})$. While classical particle mechanics deals with a finite number of generalized coordinates $q_a(t)$, indexed by a label $a$, in field theory we are interested in the dynamics of fields
$$ \phi_a(t, \mathbf{x}) , \tag{2.1} $$
where both $a$ and $\mathbf{x}$ are considered as labels. We are hence dealing with an infinite number of degrees of freedom (dofs), at least one for each point $\mathbf{x}$ in space. Notice that the concept of position has been relegated from a dynamical variable in particle mechanics to a mere label in field theory.
Lagrangian and Action
The dynamics of the fields is governed by the Lagrangian. In all the systems we will study in this course, the Lagrangian is a function of the fields $\phi_a$ and their derivatives $\partial_\mu \phi_a$,^4 and given by
$$ L(t) = \int d^3x \, \mathcal{L}(\phi_a, \partial_\mu \phi_a) , \tag{2.2} $$
where the official name for $\mathcal{L}$ is Lagrangian density. Like everybody else we will, however, simply call it Lagrangian from now on. For any time interval $t \in [t_1, t_2]$, the action corresponding to (2.2) reads
$$ S = \int_{t_1}^{t_2} dt \int d^3x \, \mathcal{L} = \int d^4x \, \mathcal{L} . \tag{2.3} $$
Recall that in classical mechanics $L$ depends only on $q_a$ and $\dot{q}_a$, but not on the second time derivatives of the generalized coordinates. In field theory we similarly restrict to Lagrangians $\mathcal{L}$ depending on $\phi_a$ and $\dot{\phi}_a$. Furthermore, with an eye on Lorentz invariance, we will only consider Lagrangians depending on $\nabla \phi_a$ and not higher derivatives.

Notice that since we have set $\hbar = 1$, using the convention described in Section 1.2, the dimension of the action is $[S] = 0$. With (2.3) and $[d^4x] = -4$, it follows that the Lagrangian must necessarily have $[\mathcal{L}] = 4$. Other objects that we will use frequently to construct Lagrangians are derivatives, masses, couplings, and most importantly fields. The dimensions of the former two objects are $[\partial_\mu] = 1$ and $[m] = 1$, while the dimensions of the latter two quantities depend on the specific type of coupling and field one considers. We therefore postpone the discussion of the mass dimension of couplings and fields to the point when we meet the relevant building blocks.

^4 If there is no (or only little) room for confusion, we will often drop the arguments of functions and write $\phi_a = \phi_a(x)$ etc. to keep the notation short.
Principle of Least Action
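The dynamical behavior of fields can be determined by the principle of least action. This principle states that when a system evolves from one given configuration to another between times $t_1$ and $t_2$ it does so along the "path" in configuration space for which the action is an extremum (usually a minimum) and hence satisfies $\delta S = 0$. This condition can be rewritten, using partial integration, as follows
$$
\begin{aligned}
\delta S &= \int d^4x \left[ \frac{\partial \mathcal{L}}{\partial \phi_a} \, \delta\phi_a + \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)} \, \delta(\partial_\mu \phi_a) \right] \\
&= \int d^4x \left\{ \left[ \frac{\partial \mathcal{L}}{\partial \phi_a} - \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)} \right) \right] \delta\phi_a + \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)} \, \delta\phi_a \right) \right\} = 0 .
\end{aligned}
\tag{2.4}
$$
The last term is a total derivative and vanishes for any $\delta\phi_a$ that decays at spatial infinity and obeys $\delta\phi_a(t_1, \mathbf{x}) = \delta\phi_a(t_2, \mathbf{x}) = 0$. For all such paths, we obtain the Euler-Lagrange equations of motion (EOMs) for the fields $\phi_a$, namely
$$ \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)} \right) - \frac{\partial \mathcal{L}}{\partial \phi_a} = 0 . \tag{2.5} $$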
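The statement that solutions of the EOMs make the action stationary can be illustrated on a lattice. The sketch below is an illustration under stated assumptions, not part of the notes: it takes a free real scalar field in 1+1 dimensions with $\mathcal{L} = \frac{1}{2}(\partial_t\phi)^2 - \frac{1}{2}(\partial_x\phi)^2 - \frac{1}{2}m^2\phi^2$, a standing wave $\phi = \sin(kx)\cos(\omega t)$ on $x \in [0, \pi]$, $t \in [0, 1]$, and a variation $\delta\phi = \epsilon \sin(\pi t)\sin(x)$ that vanishes on the boundary. All function names and parameter values are hypothetical. The first-order change of the discretized action should be close to zero only if $\omega^2 = k^2 + m^2$.

```python
import math

def lattice_action(phi, dt, dx, m):
    """Discretized S = sum dt dx [ (d_t phi)^2/2 - (d_x phi)^2/2 - m^2 phi^2/2 ]."""
    nt, nx = len(phi), len(phi[0])
    total = 0.0
    for i in range(nt - 1):
        for j in range(nx - 1):
            dphi_dt = (phi[i + 1][j] - phi[i][j]) / dt   # forward difference in t
            dphi_dx = (phi[i][j + 1] - phi[i][j]) / dx   # forward difference in x
            total += 0.5 * (dphi_dt**2 - dphi_dx**2 - m**2 * phi[i][j]**2) * dt * dx
    return total

def first_variation(omega, k=1.0, m=1.0, n=100, eps=1e-3):
    """Estimate dS/d(eps) at eps = 0 for phi = sin(k x) cos(omega t) + eps * eta."""
    t_max, x_max = 1.0, math.pi
    dt, dx = t_max / n, x_max / n

    def field(sign):
        # eta = sin(pi t / t_max) * sin(x) vanishes at t = 0, t_max and x = 0, pi.
        return [[math.sin(k * j * dx) * math.cos(omega * i * dt)
                 + sign * eps * math.sin(math.pi * i * dt / t_max) * math.sin(j * dx)
                 for j in range(n + 1)]
                for i in range(n + 1)]

    s_plus = lattice_action(field(+1), dt, dx, m)
    s_minus = lattice_action(field(-1), dt, dx, m)
    return (s_plus - s_minus) / (2.0 * eps)   # symmetric difference in eps

on_shell = first_variation(omega=math.sqrt(2.0))   # omega^2 = k^2 + m^2
off_shell = first_variation(omega=1.0)             # violates the EOM
print("first variation, on-shell :", on_shell)
print("first variation, off-shell:", off_shell)
```

The on-shell variation is non-zero only through discretization errors, which shrink as the lattice is refined, whereas the off-shell variation stays of order one.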
Hamiltonian Formalism
The link between the Lagrangian formalism and the quantum theory goes via the path integral. While this is a powerful formalism, we will for the time being use canonical quantization, since it makes the transition to QM easier. For this we need the Hamiltonian formalism of field theory. We start by defining the momentum density $\pi_a(x)$ conjugate to $\phi_a(x)$,
$$ \pi_a = \frac{\partial \mathcal{L}}{\partial \dot{\phi}_a} . \tag{2.6} $$
In terms of $\pi_a$, $\dot{\phi}_a$, and $\mathcal{L}$ the Hamiltonian density is given by
$$ \mathcal{H} = \pi_a \dot{\phi}_a - \mathcal{L} , \tag{2.7} $$
where, as in classical mechanics, we have eliminated $\dot{\phi}_a$ in favor of $\pi_a$ everywhere in $\mathcal{H}$. The Hamiltonian then simply takes the form
$$ H = \int d^3x \, \mathcal{H} . \tag{2.8} $$
2.2 Noether's Theorem
The role of symmetries in field theory is possibly even more important than in particle mechanics. There are Lorentz symmetry, internal symmetries, gauge symmetries, supersymmetries, etc. We start here by recasting Noether's theorem in a field theoretic framework.
Currents and Charges
Noether's theorem states that every continuous symmetry of the Lagrangian gives rise to a conserved current $J^\mu(x)$, so that the EOMs (2.5) imply
$$ \partial_\mu J^\mu = 0 , \tag{2.9} $$
or in components $dJ^0/dt + \nabla \cdot \mathbf{J} = 0$. To every conserved current there exists also a conserved (global) charge $Q$, i.e., a physical quantity which keeps the same value at all times, defined as
$$ Q = \int_{\mathbb{R}^3} d^3x \, J^0 . \tag{2.10} $$
The latter statement is readily shown by taking the time derivative of $Q$,
$$ \frac{dQ}{dt} = \int_{\mathbb{R}^3} d^3x \, \frac{dJ^0}{dt} = -\int_{\mathbb{R}^3} d^3x \, \nabla \cdot \mathbf{J} , \tag{2.11} $$
which is zero if one assumes that $\mathbf{J}$ falls off sufficiently fast as $|\mathbf{x}| \to \infty$. Notice, however, that the existence of the conserved current $J^\mu$ is much stronger than the existence of the (global) charge $Q$, because it implies that charge is in fact conserved locally. To see this, we define the charge in a finite volume $V$ by
$$ Q_V = \int_V d^3x \, J^0 . \tag{2.12} $$
Repeating the above analysis, we find
$$ \frac{dQ_V}{dt} = -\int_V d^3x \, \nabla \cdot \mathbf{J} = -\oint_S d\mathbf{S} \cdot \mathbf{J} , \tag{2.13} $$
where $S$ denotes the surface bounding $V$, $d\mathbf{S}$ is a shorthand for $\mathbf{n} \, dS$ with $\mathbf{n}$ being the outward pointing unit normal vector of the boundary $S$, and we have used Gauss' theorem. In physical terms the result means that any charge leaving $V$ must be accounted for by a flow of the current 3-vector $\mathbf{J}$ out of the volume. This kind of local conservation law of charge holds in any local field theory.
Proof of Theorem
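In order to prove Noether's theorem, we'll consider infinitesimal transformations. This is always possible in the case of a continuous symmetry. We say that $\delta\phi_a$ is a symmetry of the theory, if the Lagrangian changes by a total derivative
$$ \delta\mathcal{L}(\phi_a) = \partial_\mu \mathcal{J}^\mu(\phi_a) , \tag{2.14} $$
for a set of functions $\mathcal{J}^\mu$. We then consider the transformation of $\mathcal{L}$ under an arbitrary change $\delta\phi_a$ of the field $\phi_a$. Glancing at (2.4) tells us that in this case
$$ \delta\mathcal{L} = \left[ \frac{\partial \mathcal{L}}{\partial \phi_a} - \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)} \right) \right] \delta\phi_a + \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)} \, \delta\phi_a \right) . \tag{2.15} $$
When the EOMs are satisfied, the term in square brackets vanishes so that we are simply left with the total derivative term. For a symmetry transformation satisfying (2.14), with the fields obeying the EOMs (2.5), the relation (2.15) hence takes the form
$$ \partial_\mu \mathcal{J}^\mu = \delta\mathcal{L} = \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)} \, \delta\phi_a \right) , \tag{2.16} $$
or simply $\partial_\mu J^\mu = 0$ with
$$ J^\mu = \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)} \, \delta\phi_a - \mathcal{J}^\mu , \tag{2.17} $$
which completes the proof. Notice that if the Lagrangian is invariant under the infinitesimal transformation $\delta\phi_a$, i.e., $\delta\mathcal{L} = 0$, then $\mathcal{J}^\mu = 0$ and $J^\mu$ contains only the first term on the right-hand side of (2.17).

We stress that our proof only goes through for continuous transformations for which there exists a choice of the transformation parameters resulting in a unit transformation, i.e., no transformation. An example is a Lorentz boost with some velocity $v$, where for $v = 0$ the coordinates $x$ remain unchanged. There are examples of symmetry transformations where this does not occur. E.g., a parity transformation does not have this property, and Noether's theorem is not applicable then.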
Energy-Momentum Tensor
Recall that in classical particle mechanics, spatial translation invariance gives rise to the conservation of momentum, while invariance under time translations is responsible for the conservation of energy. What happens in classical field theory? To figure it out, let's have a look at infinitesimal translations
$$ x^\nu \to x^\nu - \epsilon^\nu \quad \Longrightarrow \quad \phi_a(x) \to \phi_a(x + \epsilon) = \phi_a(x) + \epsilon^\nu \partial_\nu \phi_a(x) , \tag{2.18} $$
where the sign in the field transformation is plus, instead of minus, because we are doing an active, as opposed to passive, transformation. If the Lagrangian does not explicitly depend on $x$ but only through $\phi_a(x)$ (which will always be the case in the Lagrangians discussed in this course), the Lagrangian transforms under the infinitesimal translation as
$$ \mathcal{L} \to \mathcal{L} + \epsilon^\nu \partial_\nu \mathcal{L} . \tag{2.19} $$
Since the change in $\mathcal{L}$ is a total derivative, we can invoke Noether's theorem which gives us four conserved currents $T^\mu{}_\nu = (J^\mu)_\nu$, one for each of the translations $\epsilon^\nu$ ($\nu = 0, 1, 2, 3$). From (2.18) we readily read off the explicit expressions for $T^\mu{}_\nu$,
$$ T^\mu{}_\nu = \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)} \, \partial_\nu \phi_a - \delta^\mu_\nu \, \mathcal{L} . \tag{2.20} $$
This quantity is called the energy-momentum (or stress-energy) tensor. It has dimension $[T^\mu{}_\nu] = 4$ and satisfies
$$ \partial_\mu T^\mu{}_\nu = 0 . \tag{2.21} $$
The four conserved charges are ($\nu = 0, 1, 2, 3$)
$$ P^\nu = \int d^3x \, T^{0\nu} . \tag{2.22} $$
Specifically, the time component of $P^\nu$ is
$$ P^0 = \int d^3x \, T^{00} = \int d^3x \left( \pi_a \dot{\phi}_a - \mathcal{L} \right) , \tag{2.23} $$
which (looking at (2.7) and (2.8)) is nothing but the Hamiltonian $H$. We thus conclude that the charge $P^0$ is the total energy of the field configuration, and it is conserved. In field theory, energy conservation is thus a pure consequence of time translation symmetry, like it was in particle mechanics. Similarly, we can identify the charges $P^i$ ($i = 1, 2, 3$),
$$ P^i = \int d^3x \, T^{0i} = \int d^3x \, \pi_a \, \partial^i \phi_a , \tag{2.24} $$
as the momentum components of the field configuration in the three space directions, and they are of course also conserved.
2.3 Example: Electrodynamics
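As a simple application of the formalism we have developed so far in this section, let us try to derive Maxwell's equations using the field theory formulation. In terms of the electric and magnetic fields $\mathbf{E}$ and $\mathbf{B}$ and the charge density $\rho$ and 3-vector current $\mathbf{j}$, these equations take the well-known form
$$ \nabla \cdot \mathbf{B} = 0 , \tag{2.25} $$
$$ \nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = 0 , \tag{2.26} $$
$$ \nabla \cdot \mathbf{E} = \rho , \tag{2.27} $$
$$ \nabla \times \mathbf{B} - \frac{\partial \mathbf{E}}{\partial t} = \mathbf{j} . \tag{2.28} $$
The $\mathbf{E}$ and $\mathbf{B}$ fields are spatial 3-vectors and can be expressed in terms of the components of the 4-vector field $A^\mu = (\phi, \mathbf{A})$ by
$$ \mathbf{E} = -\nabla\phi - \frac{\partial \mathbf{A}}{\partial t} , \qquad \mathbf{B} = \nabla \times \mathbf{A} . \tag{2.29} $$
This definition ensures that the first two (homogeneous) Maxwell equations (2.25) and (2.26) are automatically satisfied,
$$ \nabla \cdot (\nabla \times \mathbf{A}) = \epsilon^{ijk} \partial_i \partial_j A^k = 0 , \tag{2.30} $$
$$ \nabla \times \left( -\nabla\phi - \frac{\partial \mathbf{A}}{\partial t} \right) + \frac{\partial}{\partial t} (\nabla \times \mathbf{A}) = -\nabla \times (\nabla\phi) = -\epsilon^{ijk} \partial_j \partial_k \phi = 0 . \tag{2.31} $$
Here $\epsilon^{ijk}$ is the fully antisymmetric Levi-Civita tensor with $\epsilon^{123} = \epsilon_{123} = +1$ and the indices $i, j, k = 1, 2, 3$ are summed over if they appear twice.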
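The vanishing of (2.30) and (2.31) rests on a single algebraic fact: contracting the antisymmetric $\epsilon^{ijk}$ with anything symmetric in two of its indices, such as $\partial_j \partial_k$, gives zero. A minimal sketch of this check follows; the helper names and the sample symmetric matrix (standing in for the second derivatives of a function at one point) are illustrative assumptions.

```python
def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices 1..3, with levi_civita(1, 2, 3) = +1."""
    return (i - j) * (j - k) * (k - i) // 2

def contract_with_symmetric(sym):
    """Return the vector eps^{ijk} S_{jk} for a 3x3 matrix S (summed over j, k)."""
    idx = (1, 2, 3)
    return [sum(levi_civita(i, j, k) * sym[j - 1][k - 1] for j in idx for k in idx)
            for i in idx]

def cross(a, b):
    """Cross product written as (a x b)^i = eps^{ijk} a^j b^k."""
    idx = (1, 2, 3)
    return [sum(levi_civita(i, j, k) * a[j - 1] * b[k - 1] for j in idx for k in idx)
            for i in idx]

# A symmetric matrix, playing the role of d_j d_k phi at some point.
hessian = [[2, 7, -1],
           [7, 5, 3],
           [-1, 3, 4]]

print(contract_with_symmetric(hessian))   # contraction with a symmetric matrix
print(cross([1, 0, 0], [0, 1, 0]))        # unit vectors x cross y
```

Because partial derivatives commute, the same cancellation applies to $\epsilon^{ijk}\partial_i\partial_j A^k$ and $\epsilon^{ijk}\partial_j\partial_k\phi$.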
The remaining two inhomogeneous Maxwell equations (2.27) and (2.28) follow from the Lagrangian
$$ \mathcal{L} = -\frac{1}{2} (\partial_\mu A_\nu)(\partial^\mu A^\nu) + \frac{1}{2} (\partial_\mu A^\mu)^2 - A_\mu J^\mu , \tag{2.32} $$
with $J^\mu = (\rho, \mathbf{j})$. From the rules presented in Section 2.1, we gather that the dimension of the vector field and current is $[A_\mu] = 1$ and $[J^\mu] = 3$, respectively. The funny minus sign of the first term on the right-hand side is required to ensure that the kinetic term $\frac{1}{2} \dot{A}_i^2$ is positive using the Minkowski metric. Notice also that the Lagrangian (2.32) has no kinetic term $\frac{1}{2} \dot{A}_0^2$ and hence $A_0$ is not dynamical. Why this is and necessarily has to be the case will only become fully clear if you attend the advanced QFT course. Yet, we can already get an idea what is going on by remembering that the photon (the quantum of electrodynamics) has only two polarization states, i.e., two physical dofs, while the massless vector field $A_\mu$ has obviously four dofs. The fact that the time component $A_0$ is not dynamical reduces the number of independent dofs in $A_\mu$ from four to three. But this is still one too many. The last unwanted dof can be gauged away using the gauge symmetry of the quantum version of electromagnetism aka quantum electrodynamics (QED).
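Enough said, let's do serious business and compute something. To see that the statement made before (2.32) is indeed correct, we first evaluate
$$ \frac{\partial \mathcal{L}}{\partial A_\nu} = -J^\nu , \qquad \frac{\partial \mathcal{L}}{\partial(\partial_\mu A_\nu)} = -\partial^\mu A^\nu + (\partial_\rho A^\rho) \, \eta^{\mu\nu} , \tag{2.33} $$
from which we derive the EOMs,
$$ 0 = \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu A_\nu)} \right) - \frac{\partial \mathcal{L}}{\partial A_\nu} = -\partial^2 A^\nu + \partial^\nu (\partial_\rho A^\rho) + J^\nu = -\partial_\mu \left( \partial^\mu A^\nu - \partial^\nu A^\mu \right) + J^\nu . \tag{2.34} $$
Introducing now the field-strength tensor
$$ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu , \tag{2.35} $$
we can write (2.32), up to a total derivative, and (2.34) quite compactly,
$$ \mathcal{L} = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} - A_\mu J^\mu , \tag{2.36} $$
$$ \partial_\mu F^{\mu\nu} = J^\nu . \tag{2.37} $$
Does this look familiar? I hope so. Notice that $[F_{\mu\nu}] = 2$. In order to see that (2.37) indeed captures the physics of (2.27) and (2.28), we compute the components of $F^{\mu\nu}$. We find
$$
\begin{aligned}
F^{0i} &= -F^{i0} = \partial^0 A^i - \partial^i A^0 = \left( \nabla\phi + \frac{\partial \mathbf{A}}{\partial t} \right)^i = -E^i , \\
F^{ij} &= -F^{ji} = \partial^i A^j - \partial^j A^i = -\epsilon^{ijk} B^k ,
\end{aligned}
\tag{2.38}
$$
while all other components are zero. With this in hand, we then obtain from $\partial_\mu F^{\mu 0} = \rho$ and $\partial_\mu F^{\mu 1} = j^1$,
$$
\begin{aligned}
\partial_\mu F^{\mu 0} &= \partial_0 F^{00} + \partial_i F^{i0} = \nabla \cdot \mathbf{E} = \rho , \\
\partial_\mu F^{\mu 1} &= \partial_0 F^{01} + \partial_i F^{i1} = -\frac{\partial E^1}{\partial t} + \frac{\partial B^3}{\partial x^2} - \frac{\partial B^2}{\partial x^3} = \left( \nabla \times \mathbf{B} - \frac{\partial \mathbf{E}}{\partial t} \right)^1 = j^1 .
\end{aligned}
\tag{2.39}
$$
Similar relations hold for the remaining components $i = 2, 3$. Taken together this proves the second inhomogeneous Maxwell equation (2.28).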
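The component assignments in (2.38) can be made concrete with a few lines of code. The sketch below builds $F^{\mu\nu}$ from given $\mathbf{E}$ and $\mathbf{B}$ using $F^{0i} = -E^i$ and $F^{ij} = -\epsilon^{ijk}B^k$, checks antisymmetry, and evaluates the invariant $F_{\mu\nu}F^{\mu\nu}$ with the metric ${\rm diag}(1, -1, -1, -1)$. The function names and the sample field values are illustrative assumptions; the identity $F_{\mu\nu}F^{\mu\nu} = 2(\mathbf{B}^2 - \mathbf{E}^2)$ used for comparison is a standard result not derived in the text.

```python
# Diagonal entries of the Minkowski metric, signature (+, -, -, -).
ETA = [1.0, -1.0, -1.0, -1.0]

def field_strength(E, B):
    """Return F^{mu nu} with F^{0i} = -E^i and F^{ij} = -eps^{ijk} B^k, as in (2.38)."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = -E[i]
        F[i + 1][0] = E[i]
    F[1][2], F[2][1] = -B[2], B[2]   # F^{12} = -B^3
    F[2][3], F[3][2] = -B[0], B[0]   # F^{23} = -B^1
    F[3][1], F[1][3] = -B[1], B[1]   # F^{31} = -B^2
    return F

def f_squared(F):
    """Invariant F_{mu nu} F^{mu nu}; indices are lowered with the diagonal metric."""
    return sum(ETA[m] * ETA[n] * F[m][n] ** 2 for m in range(4) for n in range(4))

# Sample field values (arbitrary illustrative numbers).
E = [1.0, 2.0, 3.0]
B = [4.0, 5.0, 6.0]
F = field_strength(E, B)

E2 = sum(e * e for e in E)
B2 = sum(b * b for b in B)
print("F_{mu nu} F^{mu nu} =", f_squared(F))
print("2 (B^2 - E^2)       =", 2.0 * (B2 - E2))
print("-F^2/4              =", -0.25 * f_squared(F), "vs (E^2 - B^2)/2 =", 0.5 * (E2 - B2))
```

In the absence of sources, the Lagrangian (2.36) therefore reduces to the familiar combination $\frac{1}{2}(\mathbf{E}^2 - \mathbf{B}^2)$.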
Let me also derive the energy-momentum tensor $T^{\mu\nu}$ of electrodynamics, ignoring for the moment the source term $A_\mu J^\mu$. Using (2.33) one finds
$$ T^{\mu\nu} = (\partial_\rho A^\rho)(\partial^\nu A^\mu) - (\partial^\mu A^\rho)(\partial^\nu A_\rho) + \frac{1}{4} \, \eta^{\mu\nu} F_{\rho\sigma} F^{\rho\sigma} . \tag{2.40} $$
Notice that the first term in (2.40) is not symmetric, which implies that $T^{\mu\nu} \neq T^{\nu\mu}$. In fact, this is not really surprising since the definition of the energy-momentum tensor (2.20) does not exhibit an explicit symmetry in the indices $\mu$ and $\nu$. Nevertheless, there is typically a way to massage the energy-momentum tensor of any theory into a symmetric form.^5 To learn how this can be done in the case under consideration is the objective of a homework problem.
2.4 Space-Time Symmetries
One of the main motivations to develop QFT is to reconcile QM with special relativity. We thus want to construct field theories in which space and time are placed on an equal footing and the theory is invariant under Lorentz transformations,
$$ x^\mu \to (x')^\mu = \Lambda^\mu{}_\nu \, x^\nu , \tag{2.41} $$
with
$$ \eta_{\mu\nu} \, \Lambda^\mu{}_\rho \, \Lambda^\nu{}_\sigma = \eta_{\rho\sigma} , \tag{2.42} $$
so that the distance $ds^2 = \eta_{\mu\nu} \, dx^\mu dx^\nu$ is preserved. Here $\eta_{\mu\nu} = {\rm diag}(1, -1, -1, -1)$ denotes the Minkowski metric. E.g., a rotation by the angle $\theta$ about the $z$-axis, and a boost by $v < 1$ along the $x$-axis are respectively described by the following Lorentz transformations
$$
\Lambda^\mu{}_\nu = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta & 0 \\ 0 & \sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} , \qquad
\Lambda^\mu{}_\nu = \begin{pmatrix} \gamma & -\gamma v & 0 & 0 \\ -\gamma v & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} , \tag{2.43}
$$
with $\gamma = 1/\sqrt{1 - v^2}$. The Lorentz transformations form a Lie group under matrix multiplication. You can learn more about this if you attend the lecture course on group theory held by Andre Lukas. Alternatively, you can study the group theory crash course written by Martin Bauer (a PhD student at Mainz University). It can be found on my Oxford homepage.

^5 One (but not the only) reason that you might want to have a symmetric energy-momentum tensor $T^{\mu\nu}$ is to make contact with general relativity, since such an object sits on the right-hand side of Einstein's field equations.
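The defining condition (2.42) reads $\Lambda^T \eta \Lambda = \eta$ in matrix form, which is easy to verify for the rotation and the boost in (2.43). The following is a minimal sketch; the helper names and the sample values $\theta = 0.3$ and $v = 0.6$ are illustrative assumptions.

```python
import math

# Minkowski metric with signature (+, -, -, -).
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def rotation_z(theta):
    """Rotation by the angle theta about the z-axis, as in (2.43)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]

def boost_x(v):
    """Boost with velocity v < 1 along the x-axis (units with c = 1), as in (2.43)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v, 0, 0], [-g * v, g, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def transpose(a):
    return [[a[j][i] for j in range(4)] for i in range(4)]

def preserves_metric(lam, tol=1e-12):
    """Check Lambda^T eta Lambda = eta, i.e. condition (2.42)."""
    lhs = matmul(transpose(lam), matmul(ETA, lam))
    return all(abs(lhs[i][j] - ETA[i][j]) < tol for i in range(4) for j in range(4))

rot = rotation_z(0.3)
boost = boost_x(0.6)
print(preserves_metric(rot), preserves_metric(boost))
# The product of two Lorentz transformations is again one (group property).
print(preserves_metric(matmul(boost, rot)))
```

The last line illustrates closure under matrix multiplication, one of the group properties mentioned above.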
The various fields belong to different representations of the Lorentz group. The simplest example is the scalar field $\phi$, which under the Lorentz transformation $x \to \Lambda x$,$^{6}$ transforms as

$$\phi(x) \to \phi'(x) = \phi(\Lambda^{-1} x) . \tag{2.44}$$

The inverse $\Lambda^{-1}$ appears in the argument because we are dealing with an active transformation, in which the field is truly shifted. To see why this means that the inverse appears, it suffices to consider a non-relativistic example such as a temperature field. Suppose we start with an initial field $\phi(\vec x)$ which has a hotspot at, say, $\vec x = (1, 0, 0)$. Let's now make a rotation $\vec x \to R \vec x$ about the z-axis so that the hotspot ends up at $\vec x = (0, 1, 0)$. If we want to express the new field $\phi'(\vec x)$ in terms of the old field $\phi(\vec x)$, we have to place ourselves at $\vec x = (0, 1, 0)$ and ask what the old field looked like at the point $R^{-1} \vec x = (1, 0, 0)$ we came from. This $R^{-1}$ is the origin of the $\Lambda^{-1}$ factor in the argument of the transformed field in (2.44).
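The hotspot argument can be tried out numerically. The following is only a minimal sketch: the Gaussian temperature profile and the helper names `rotate_z`, `old_field` and `new_field` are illustrative assumptions, not part of the lecture.

```python
import math

def rotate_z(point, angle):
    """Rotate a 3D point about the z-axis by `angle` (in radians)."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

def old_field(point):
    """Toy temperature field (assumed Gaussian) with a hotspot at (1, 0, 0)."""
    x, y, z = point
    return math.exp(-((x - 1.0) ** 2 + y ** 2 + z ** 2) / 0.1)

def new_field(point, angle):
    """Actively rotated field: phi'(x) = phi(R^{-1} x)."""
    return old_field(rotate_z(point, -angle))

angle = math.pi / 2
print(new_field((0.0, 1.0, 0.0), angle))  # at the rotated hotspot: close to 1
print(new_field((1.0, 0.0, 0.0), angle))  # at the old hotspot position: close to 0
```

Evaluating the old field at $R^{-1}\vec x$ is what moves the hotspot from $(1, 0, 0)$ to $(0, 1, 0)$; using $R\vec x$ in the argument instead would rotate it the wrong way.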
The Lagrangian formulation of field theory makes it especially easy to discuss Lorentz invariance, since an EOM is automatically Lorentz invariant if it follows from a Lagrangian that is a Lorentz scalar. This is an immediate consequence of the principle of least action. If a Lorentz transformation leaves the Lagrangian unchanged, the transformation of an extremum in the action will be another extremum. To give an example, let's look at the following Lagrangian

$$\mathcal{L} = \frac{1}{2} (\partial_\mu \phi)^2 - \frac{1}{2} m^2 \phi^2 , \tag{2.45}$$

where $\phi$ is a real scalar and, as we will see later, $m$ is the mass of $\phi$ (for now just think of $m$ as a parameter). Obviously, the dimension of the field is $[\phi] = 1$. You will show in a homework assignment that the EOM corresponding to (2.45) takes the form

$$\left( \partial_\mu \partial^\mu + m^2 \right) \phi = 0 . \tag{2.46}$$

This equation is the famous Klein-Gordon equation. The Laplacian in Minkowski space is sometimes denoted by $\Box$. In this notation, the Klein-Gordon equation reads $(\Box + m^2)\, \phi = 0$.
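Let us first check that a Lorentz transformation leaves the Lagrangian (2.45) and its action invariant. According to (2.44), the mass term transforms as $\frac{1}{2} m^2 \phi^2(x) \to \frac{1}{2} m^2 \phi^2(x')$ with $x' = \Lambda^{-1} x$. The transformation of $\partial_\mu \phi$ is

$$\partial_\mu \phi(x) \to \partial_\mu \big( \phi(x') \big) = (\Lambda^{-1})^\nu{}_\mu\, (\partial_\nu \phi)(x') . \tag{2.47}$$

Using (2.42) for $\Lambda^{-1}$, we thus find that the derivative term in the Klein-Gordon Lagrangian behaves as

$$\begin{aligned}
\frac{1}{2} \big( \partial_\mu \phi(x) \big)^2 &\to \frac{1}{2}\, \eta^{\mu\nu}\, (\Lambda^{-1})^\rho{}_\mu\, (\partial_\rho \phi)(x')\, (\Lambda^{-1})^\sigma{}_\nu\, (\partial_\sigma \phi)(x') \\
&= \frac{1}{2}\, \eta^{\rho\sigma}\, (\partial_\rho \phi)(x')\, (\partial_\sigma \phi)(x') = \frac{1}{2} \big( \partial_\mu \phi(x') \big)^2 ,
\end{aligned} \tag{2.48}$$

under the Lorentz transformation $\Lambda$. Putting things together, we find that the action of the Klein-Gordon theory is indeed Lorentz invariant,

$$S = \int d^4x\, \mathcal{L}(x) \to \int d^4x\, \mathcal{L}(x') = \int d^4x'\, \mathcal{L}(x') = S . \tag{2.49}$$

Notice that changing the integration variables from $d^4x$ to $d^4x'$ in principle introduces a Jacobian factor $\det(\Lambda)$. This factor is, however, equal to 1 for the Lorentz transformations connected to the identity, which are the ones we are dealing with.

$^{6}$ To shorten the notation we will often use matrix notation and drop the indices $\mu$, $\nu$, etc.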
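A similar calculation shows that, as promised, the EOM of the Klein-Gordon field is also invariant,

$$\begin{aligned}
\left( \partial^2 + m^2 \right) \phi(x) &\to \left( \partial^2 + m^2 \right) \phi(x') \\
&= \left[ (\Lambda^{-1})^\rho{}_\mu\, \partial_\rho\, (\Lambda^{-1})^{\sigma\mu}\, \partial_\sigma + m^2 \right] \phi(x') \\
&= \left( \eta^{\rho\sigma} \partial_\rho \partial_\sigma + m^2 \right) \phi(x') = 0 .
\end{aligned} \tag{2.50}$$

In the case of the Klein-Gordon theory, we hence conclude that the statements made before (2.45) are indeed correct.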
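The two properties of $\Lambda$ used in this argument, the defining condition (2.42) and $\det(\Lambda) = 1$, can be checked numerically for the rotation and the boost in (2.43). This is a small sketch; the sign conventions of the matrices and all function names are assumptions chosen for the illustration.

```python
import math

# Minkowski metric with signature (+, -, -, -).
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]

def boost_x(v):
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v, 0, 0], [-g * v, g, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def preserves_metric(lam, tol=1e-12):
    """Check Lambda eta Lambda^T = eta, the matrix form of (2.42)."""
    m = matmul(matmul(lam, ETA), transpose(lam))
    return all(abs(m[i][j] - ETA[i][j]) < tol for i in range(4) for j in range(4))

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

for name, lam in [("rotation", rotation_z(0.3)), ("boost", boost_x(0.6)),
                  ("product", matmul(boost_x(0.6), rotation_z(0.3)))]:
    print(name, preserves_metric(lam), round(det(lam), 12))
```

The product of a boost and a rotation passes the same checks, which is the group property in action.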
Representations of Lorentz Group
The transformation law (2.44) is the simplest possible transformation law for a field. In fact, it is the only possibility for a one-component field, i.e., a real scalar. Yet, it is also clear that in order to describe nature (think only about electromagnetism) we need multicomponent fields, which have more complicated transformation properties. The most familiar case is that of a vector field, such as the vector potential $A^\mu$, which we have already met in Section 2.3. In this case the quantity that is distributed in space-time also carries an orientation which must be rotated and/or boosted.

In fact, we will learn in this course that the Lorentz group has a variety of representations, corresponding to particles with integer (bosons) and half-integer spins (fermions) in QFT. These representations are normally constructed out of spinors. To start this general (and somewhat formal) discussion, let me examine the allowed possibilities for linear field transformations

$$\phi_a(x) \to \phi'_a(x') = M(\Lambda)_{ab}\, \phi_b(x) , \tag{2.51}$$

under (2.41). The first important point to notice is that the Lorentz transformations form a group. This means that two successive Lorentz transformations,

$$x \to x' = \Lambda x , \qquad x' \to x'' = \Lambda' x' , \tag{2.52}$$

can also be described in terms of a single one

$$x \to x'' = \Lambda'' x , \tag{2.53}$$

with

$$\Lambda'' = \Lambda' \Lambda . \tag{2.54}$$
What happens to (2.51) under this set of Lorentz transformations? For $x \to x'' = \Lambda'' x$, we have (in matrix notation)

$$\phi(x) \to \phi''(x'') = M(\Lambda'')\, \phi(x) . \tag{2.55}$$

On the other hand, for $x \to x' = \Lambda x \to x'' = \Lambda' x'$, we get

$$\phi(x) \to \phi''(x'') = M(\Lambda')\, \phi'(x') = M(\Lambda')\, M(\Lambda)\, \phi(x) . \tag{2.56}$$

In order for the last two equations to be consistent with each other, the field transformations $M$ must obviously fulfill

$$M(\Lambda' \Lambda) = M(\Lambda')\, M(\Lambda) . \tag{2.57}$$

In group theory terminology, this means that the matrices $M$ furnish a representation of the Lorentz group. Field Lorentz transformations are therefore not random, but they can be found if we find all (finite dimensional) representations of the Lorentz group.

So what do the common representations of the Lorentz group look like and how do we get all of them? While both questions will be answered in this lecture, I believe it is best to do it case-by-case whenever we meet a new type of (quantum) field. Since we already talked about the real scalar $\phi$ (Klein-Gordon field) and the vector $A^\mu$ (potential in electrodynamics), it nevertheless makes sense to give the representations for these two types of fields already at this point.
Since a scalar field by definition does not change under Lorentz transformations, $\phi(x) \to \phi'(x') = \phi(x)$, the scalar representation of the Lorentz group is simply

$$M(\Lambda) = 1 . \tag{2.58}$$

This was easy! The representation of the vector $A^\mu$ is also not difficult to figure out. Let me for the time being only state the result. One finds

$$M(\Lambda) = \Lambda , \tag{2.59}$$

which means that a vector field $A^\mu$ transforms under a Lorentz transformation as (restoring indices)

$$A^\mu(x) \to (A')^\mu(x') = \Lambda^\mu{}_\nu\, A^\nu(x) . \tag{2.60}$$

It is important to notice that the latter transformation property implies that any term built out of $A^\mu$ and $\partial_\mu$, where all Lorentz indices are contracted, is invariant under Lorentz transformations. As an exercise you are supposed to show this explicitly for terms like $A_\mu A^\mu$, etc.
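As a quick numerical illustration of this invariance (not a substitute for the exercise), one can boost a vector with (2.60) and compare the contraction $A_\mu A^\mu$ before and after. The sample components, the boost velocity and the helper names are arbitrary choices made for this sketch.

```python
import math

def boost_x(v):
    """Boost along the x-axis with velocity v (in units of c), as in (2.43)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v, 0, 0], [-g * v, g, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def apply(lam, vec):
    """Transform the components of a four-vector: A'^mu = Lambda^mu_nu A^nu."""
    return [sum(lam[i][j] * vec[j] for j in range(4)) for i in range(4)]

def minkowski_dot(a, b):
    """Contraction a_mu b^mu with metric signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

A = [2.0, 0.3, -1.1, 0.7]
B = [0.5, 1.5, 0.2, -0.4]
lam = boost_x(0.6)

# The individual components change, the contractions do not.
print(apply(lam, A))
print(minkowski_dot(A, A), minkowski_dot(apply(lam, A), apply(lam, A)))
print(minkowski_dot(A, B), minkowski_dot(apply(lam, A), apply(lam, B)))
```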
Angular Momentum
In classical particle mechanics, rotational invariance gives rise to conservation of angular momentum. What is the analogue in field theory? Moreover, we now have further Lorentz transformations, namely boosts. What conserved quantity do they correspond to? In order to address these questions, we first need the infinitesimal form of the Lorentz transformations

$$\Lambda^\mu{}_\nu = \delta^\mu{}_\nu + \omega^\mu{}_\nu , \tag{2.61}$$

where $\omega^\mu{}_\nu$ is infinitesimal. The condition (2.42) for $\Lambda$ to be a Lorentz transformation becomes in infinitesimal form

$$\eta^{\rho\sigma} \left( \delta^\mu{}_\rho + \omega^\mu{}_\rho \right) \left( \delta^\nu{}_\sigma + \omega^\nu{}_\sigma \right) = \eta^{\mu\nu} + O(\omega^2) , \tag{2.62}$$

which implies that $\omega^{\mu\nu}$ must be an antisymmetric matrix,

$$\omega^{\mu\nu} = -\omega^{\nu\mu} . \tag{2.63}$$

Notice that an antisymmetric $4 \times 4$ matrix has six independent parameters, which agrees with the number of different Lorentz transformations, i.e., three rotations and three boosts.
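Applying the infinitesimal Lorentz transformation to our real scalar field $\phi$, we have

$$\phi(x) \to \phi(x - \omega x) = \phi(x) - \omega^\mu{}_\nu\, x^\nu\, \partial_\mu \phi(x) , \tag{2.64}$$

where the minus sign arises from the factor $\Lambda^{-1}$ in (2.44). The variation of the field under an infinitesimal Lorentz transformation is hence given by

$$\delta\phi = -\omega^\mu{}_\nu\, x^\nu\, \partial_\mu \phi . \tag{2.65}$$

By the same line of reasoning, one shows that the variation of the Lagrangian is

$$\delta\mathcal{L} = -\omega^\mu{}_\nu\, x^\nu\, \partial_\mu \mathcal{L} = -\partial_\mu \left( \omega^\mu{}_\nu\, x^\nu \mathcal{L} \right) , \tag{2.66}$$

where in the last step we used the fact that $\omega^\mu{}_\mu = 0$ due to its antisymmetry. The Lagrangian changes by a total derivative, so we can apply Noether's theorem (2.17) with the total-derivative term $-\omega^\mu{}_\nu\, x^\nu \mathcal{L}$ to find the conserved current,

$$J^\mu = \frac{\partial\mathcal{L}}{\partial(\partial_\mu \phi)}\, \delta\phi + \omega^\mu{}_\nu\, x^\nu \mathcal{L} = -\omega^\rho{}_\nu \left[ \frac{\partial\mathcal{L}}{\partial(\partial_\mu \phi)}\, \partial_\rho \phi - \delta^\mu{}_\rho\, \mathcal{L} \right] x^\nu = -\omega^\rho{}_\nu\, T^\mu{}_\rho\, x^\nu . \tag{2.67}$$

Stripping off $\omega^\rho{}_\nu$, we obtain six different currents, which we write as

$$(\mathcal{J}^\mu)^{\rho\sigma} = x^\rho\, T^{\mu\sigma} - x^\sigma\, T^{\mu\rho} . \tag{2.68}$$

These currents satisfy

$$\partial_\mu (\mathcal{J}^\mu)^{\rho\sigma} = 0 , \tag{2.69}$$

and give (as usual) rise to six conserved charges. For $\rho, \sigma \neq 0$, the Lorentz transformation is a rotation and the three conserved charges give the total angular momentum of the field ($i, j = 1, 2, 3$):

$$Q^{ij} = \int d^3x \left( x^i\, T^{0j} - x^j\, T^{0i} \right) . \tag{2.70}$$

What about the boosts? In this case, the conserved charges are

$$Q^{0i} = \int d^3x \left( x^0\, T^{0i} - x^i\, T^{00} \right) . \tag{2.71}$$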
The fact that these are conserved tells us that

$$\begin{aligned}
0 = \frac{dQ^{0i}}{dt} &= \int d^3x\, T^{0i} + t \int d^3x\, \frac{dT^{0i}}{dt} - \frac{d}{dt} \int d^3x\, x^i\, T^{00} \\
&= P^i + t\, \frac{dP^i}{dt} - \frac{d}{dt} \int d^3x\, x^i\, T^{00} .
\end{aligned} \tag{2.72}$$

Yet, also the momentum $P^i$ is conserved, i.e., $dP^i/dt = 0$, and we conclude that

$$\frac{d}{dt} \int d^3x\, x^i\, T^{00} = \text{const.} \tag{2.73}$$

This is the statement that the center of energy of the field travels with a constant velocity. In a sense it's a field theoretical version of Newton's first law but, rather surprisingly, appearing here as a conservation law. Notice that after restoring the label $a$ our results for $(\mathcal{J}^\mu)^{\rho\sigma}$ etc. also apply in the case of multicomponent fields.
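Poincaré Invariance
We now require that a physical system possesses both space-time translation (2.18) and Lorentz transformation symmetry (2.41). The symmetry group that includes both transformations is called the Poincaré group. Notice that for any Poincaré-invariant theory the two charge conservation equations (2.21) and (2.69) should hold. This is only possible if the energy-momentum tensor $T^{\mu\nu}$ is symmetric. Indeed,

$$\begin{aligned}
0 = \partial_\mu (\mathcal{J}^\mu)^{\rho\sigma} &= \partial_\mu \left( x^\rho\, T^{\mu\sigma} - x^\sigma\, T^{\mu\rho} \right) \\
&= x^\rho\, \partial_\mu T^{\mu\sigma} + T^{\rho\sigma} - x^\sigma\, \partial_\mu T^{\mu\rho} - T^{\sigma\rho} \\
&= T^{\rho\sigma} - T^{\sigma\rho} \quad \Longrightarrow \quad T^{\rho\sigma} = T^{\sigma\rho} .
\end{aligned} \tag{2.74}$$

Since Maxwell's theory is Poincaré invariant, this general result tells us that the expression of the energy-momentum tensor in (2.40) can be made symmetric without changing the physics. The key to actually doing this lies in making use of the conservation law (2.21) in an appropriate way.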
2.5 Problems
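i) Suppose that a Lagrangian $\mathcal{L}$, not specified further, depends not only on $\phi$ and $\partial_\mu \phi$ but also on the second derivatives of the fields:$^{7}$

$$\mathcal{L} = \mathcal{L}(\phi, \partial_\mu \phi, \partial_\mu \partial_\nu \phi) . \tag{2.75}$$

For the case that the variations vanish at the endpoints and that $\delta(\partial_{\mu_1} \ldots \partial_{\mu_N} \phi) = \partial_{\mu_1} \ldots \partial_{\mu_N} (\delta\phi)$ holds, derive the Euler-Lagrange EOMs for such a theory.

Apply your result to obtain the EOMs for the field $\phi$ with Lagrangian

$$\mathcal{L} = \frac{1}{2} (\partial_t \phi)(\partial_x \phi) + \frac{\alpha}{6} (\partial_x \phi)^3 - \frac{\beta}{2} (\partial_x^2 \phi)^2 . \tag{2.76}$$

$^{7}$ For the sake of brevity, we have omitted the subscript $a$ labelling the different fields.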
ii) Let us study the dynamics of acoustic waves in an elastic medium (e.g. air), as described by the Lagrangian

$$\mathcal{L} = \frac{1}{2}\, \rho \left( \frac{\partial y}{\partial t} \right)^2 - \frac{1}{2}\, \rho\, v_{\rm sound}^2\, (\nabla y)^2 , \tag{2.77}$$

with $\rho$ the density of the medium and $v_{\rm sound}$ the speed of sound.
Find the Euler-Lagrange EOMs for the system and their solutions. What do they describe? Calculate the Hamiltonian $H$.
iii) Consider the Klein-Gordon Lagrangian (2.45). Derive the kinetic and potential energy ($T$ and $V$ with $L = T - V$) as well as the Euler-Lagrange EOMs for the field $\phi$. Write down the energy-momentum tensor $T^{\mu\nu}$ and show that it indeed satisfies $\partial_\mu T^{\mu\nu} = 0$. Give the expressions for the conserved energy $E$ and momentum $P^i$.
iv) Using (2.60) show that the terms $A_\mu A^\mu$, $(\partial_\mu A^\mu)^2$, and $(\partial_\mu A_\nu)(\partial^\mu A^\nu)$ are Lorentz invariant. What are the dimensions of these terms?
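v) We saw that in the case of electrodynamics in vacuum using (2.20) leads to an energy-momentum tensor $T^{\mu\nu}$ that is not symmetric. To remedy that, one can add to $T^{\mu\nu}$ a term of the form $\partial_\rho \Gamma^{\rho\mu\nu}$, where $\Gamma^{\rho\mu\nu}$ is antisymmetric in its first two indices, i.e., $\Gamma^{\rho\mu\nu} = -\Gamma^{\mu\rho\nu}$. Show that such an object is automatically divergenceless, i.e., it obeys $\partial_\mu \partial_\rho \Gamma^{\rho\mu\nu} = 0$. This feature implies that instead of $T^{\mu\nu}$ one can also use

$$\Theta^{\mu\nu} = T^{\mu\nu} + \partial_\rho \Gamma^{\rho\mu\nu} , \tag{2.78}$$

without changing the physics, since $\Theta^{\mu\nu}$ has the same globally conserved energy and momentum as $T^{\mu\nu}$.
Show that this construction, with

$$\Gamma^{\rho\mu\nu} = F^{\mu\rho} A^\nu , \tag{2.79}$$

leads to an energy-momentum tensor $\Theta^{\mu\nu}$ that is symmetric and yields the standard formulas for the electromagnetic energy and momentum densities:

$$\mathcal{E} = \frac{1}{2} \left( \vec E^{\,2} + \vec B^{\,2} \right) , \qquad \vec S = \vec E \times \vec B . \tag{2.80}$$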
References
[1] Chapter 4 of L. D. Landau and E. M. Lifshitz, The Classical Theory of Fields, Fourth Edition: Vol. 2 (Course of Theoretical Physics Series), Butterworth-Heinemann (1975), 481 p.
[2] Chapter 5 of B. Thidé, Electromagnetic Field Theory, revised and extended 2nd edition, http://www.plasma.uu.se/CED/Book/index.html
3 Klein-Gordon Theory
In QM, canonical quantization is a recipe that takes us from the Hamiltonian formalism of classical dynamics to the quantum theory. The recipe tells us to take the generalized coordinates $q_a$ and their conjugate momenta $p^a = \partial L / \partial \dot q_a$ and promote them to operators. The Poisson bracket structure of classical mechanics descends to the structure of commutation relations between operators, namely

$$\begin{aligned}
[q_a, q_b] &= [p^a, p^b] = 0 , \\
[q_a, p^b] &= i\, \delta_a^b ,
\end{aligned} \tag{3.1}$$

where $[a, b] = ab - ba$ is the usual commutator.

If one wants to construct a QFT, one can proceed in a similar fashion. The idea is to start with the classical field theory and then to quantize it, i.e., reinterpret the dynamical variables as operators that obey canonical commutation relations,$^{8}$

$$\begin{aligned}
[\phi_a(\vec x), \phi_b(\vec y)] &= [\pi^a(\vec x), \pi^b(\vec y)] = 0 , \\
[\phi_a(\vec x), \pi^b(\vec y)] &= i\, \delta^{(3)}(\vec x - \vec y)\, \delta_a^b .
\end{aligned} \tag{3.2}$$

Here $\phi_a(\vec x)$ are field operators and the Kronecker delta in (3.1) has been replaced by a delta function since the momentum conjugates $\pi^a(\vec x)$ are densities. Notice that for now, we are working in the Schrödinger picture, which means that the operators $\phi_a(\vec x)$ and $\pi^a(\vec x)$ depend only on the spatial coordinates but not on time. The time dependence sits in the states $|\psi\rangle$ which obey the usual Schrödinger equation

$$i\, \frac{d}{dt} |\psi\rangle = H |\psi\rangle . \tag{3.3}$$

While all this looks pretty much the same as good old QM, there is an important difference. The wavefunction $|\psi\rangle$ in QFT is a functional, i.e., a function of every possible configuration of the field $\phi_a$, and not a simple function.$^{9}$ So things are more complicated in QFT than in QM after all.

The Hamiltonian $H$, being a function of $\phi_a$ and $\pi^a$, also becomes an operator in QFT. In order to solve the theory, one task is to find the spectrum, i.e., the eigenvalues and eigenstates of $H$. This is usually very difficult, since there is an infinite number of dofs within QFT, at least one for each point $\vec x$ in space. However, for certain theories, called free theories, one can find a way to write the dynamics such that each dof evolves independently from all the others. Free field theories typically have Lagrangians which are quadratic in the fields, so that the EOMs are linear.

$^{8}$ This procedure is sometimes referred to as second quantization. We will not use this terminology here.

$^{9}$ In functional analysis, a functional is a map from a vector space to the field underlying the vector space, which is usually the real numbers. In other words, it is a function that takes a vector as its argument or input and returns a scalar. Commonly, the vector space is a space of functions, so the functional takes a function as its argument, and so it is sometimes referred to as a function of a function. The use of functionals goes back to the calculus of variations where one searches for a function which minimizes a certain functional. A particularly important application in physics is to search for a state of a system which minimizes the energy functional.
3.1 Klein-Gordon Field as Harmonic Oscillators
So far the discussion in this section was rather general. Let us be more specific and consider the simplest relativistic free theory as a practical example. It is provided by the classical Klein-Gordon equation (2.46). To exhibit the coordinates in which the dofs decouple from each other, we only have to Fourier transform the field $\phi$,

$$\phi(t, \vec x) = \int \frac{d^3p}{(2\pi)^3}\, e^{i \vec p \cdot \vec x}\, \phi(t, \vec p) . \tag{3.4}$$

In momentum space (2.46) simply reads

$$\left[ \frac{\partial^2}{\partial t^2} + \left( |\vec p|^2 + m^2 \right) \right] \phi(t, \vec p) = 0 , \tag{3.5}$$

which tells us that for each value of $\vec p$, the Fourier transform $\phi(t, \vec p)$ solves the equation of a harmonic oscillator with frequency

$$\omega_{\vec p} = \sqrt{|\vec p|^2 + m^2} . \tag{3.6}$$

We see that the most general solution of the classical Klein-Gordon equation is a linear superposition of simple harmonic oscillators, each vibrating at a different frequency with a different amplitude. In order to quantize the field $\phi$, we must hence only quantize this infinite number of harmonic oscillators (as Sidney Coleman once said [1]: "The career of a young theoretical physicist consists of treating the harmonic oscillator in ever-increasing levels of abstraction."). Let's recall how to do it in QM.
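The statement that each Fourier mode is a harmonic oscillator with frequency (3.6) can be checked with a simple finite-difference test. This is only a sketch; the sample momentum, the mass value and the function names are assumptions made for the illustration.

```python
import math

def omega(p, m):
    """Frequency (3.6) for a momentum vector p and mass m."""
    return math.sqrt(sum(c * c for c in p) + m * m)

def mode(t, p, m):
    """One classical solution of the mode equation (3.5): cos(omega_p t)."""
    return math.cos(omega(p, m) * t)

def residual(t, p, m, h=1e-4):
    """Finite-difference estimate of (d^2/dt^2 + |p|^2 + m^2) acting on the mode."""
    second = (mode(t + h, p, m) - 2.0 * mode(t, p, m) + mode(t - h, p, m)) / h ** 2
    return second + omega(p, m) ** 2 * mode(t, p, m)

p, m = (0.3, 0.4, 1.2), 1.0
print(omega(p, m))           # frequency of this mode
print(residual(0.7, p, m))   # vanishes up to discretization and rounding errors
```

Replacing the frequency inside `mode` by any other value makes the residual clearly non-zero, which is the dispersion relation at work.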
Harmonic Oscillator in QM
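Consider the QM Hamiltonian

$$H = \frac{1}{2}\, p^2 + \frac{1}{2}\, \omega^2 q^2 , \tag{3.7}$$

with the canonical commutation relations $[q, p] = i$. In order to find the spectrum of the system, we define annihilation and creation operators (also known as lowering and raising or ladder operators)

$$a = \sqrt{\frac{\omega}{2}}\, q + \frac{i}{\sqrt{2\omega}}\, p , \qquad a^\dagger = \sqrt{\frac{\omega}{2}}\, q - \frac{i}{\sqrt{2\omega}}\, p . \tag{3.8}$$

Expressing $q$ and $p$ through $a$ and $a^\dagger$ gives

$$q = \frac{1}{\sqrt{2\omega}} \left( a + a^\dagger \right) , \qquad p = -i \sqrt{\frac{\omega}{2}} \left( a - a^\dagger \right) . \tag{3.9}$$

The commutator of the operators introduced in (3.8) is readily computed. One finds $[a, a^\dagger] = 1$. Expressing the Hamiltonian (3.7) through $a$ and $a^\dagger$ gives

$$H = \frac{\omega}{2} \left( a a^\dagger + a^\dagger a \right) = \omega \left( a^\dagger a + \frac{1}{2} \right) . \tag{3.10}$$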
It is also easy to show that the commutator of $H$ with $a$ and $a^\dagger$ takes the form

$$[H, a] = -\omega\, a , \qquad [H, a^\dagger] = \omega\, a^\dagger . \tag{3.11}$$

These relations imply that if $|\psi\rangle$ is an eigenstate of $H$ with energy $E$, i.e., $H |\psi\rangle = E |\psi\rangle$, then we can construct other eigenstates by acting with the operators $a$ and $a^\dagger$ on $|\psi\rangle$:

$$H a |\psi\rangle = (E - \omega)\, a |\psi\rangle , \qquad H a^\dagger |\psi\rangle = (E + \omega)\, a^\dagger |\psi\rangle . \tag{3.12}$$

This feature explains why $a$ ($a^\dagger$) is called annihilation (creation) operator. From the latter equation it is also clear that the spectrum of (3.7) has a ladder structure, $\ldots, E - 2\omega, E - \omega, E, E + \omega, E + 2\omega, \ldots$. If the energy is bounded from below, there must be a ground state $|0\rangle$, which satisfies $a |0\rangle = 0$. This state has the ground state or zero-point energy, $H |0\rangle = \omega/2\, |0\rangle$. Excited states $|n\rangle$ are then created by the repeated action of $a^\dagger$,

$$(a^\dagger)^n |0\rangle = \sqrt{n (n-1) \ldots 1}\; |n\rangle = \sqrt{n!}\; |n\rangle , \tag{3.13}$$

and satisfy

$$H |n\rangle = \omega \left( N + \frac{1}{2} \right) |n\rangle = \omega \left( n + \frac{1}{2} \right) |n\rangle , \tag{3.14}$$

where $N = a^\dagger a$ is the number operator with $N |n\rangle = n |n\rangle$. The prefactor on the right-hand side of (3.13) is needed to guarantee that the states $|n\rangle$ are normalized to 1, i.e., $\langle n | n \rangle = 1$.
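The ladder algebra can be made concrete by writing $a$ as a matrix in the number basis $|0\rangle, \ldots, |{\rm dim}-1\rangle$. The sketch below assumes such a truncated basis (the true operators are infinite dimensional) and uses hypothetical helper names; it checks the first relation in (3.11) and the spectrum (3.14).

```python
import math

def lowering(dim):
    """Matrix of a in the basis |0>, ..., |dim-1>, using a|n> = sqrt(n)|n-1>."""
    return [[math.sqrt(col) if col == row + 1 else 0.0 for col in range(dim)]
            for row in range(dim)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def hamiltonian(dim, omega):
    """H = omega (a^dagger a + 1/2) as a matrix; a is real, so a^dagger = a^T."""
    a = lowering(dim)
    number = matmul(transpose(a), a)
    return [[omega * (number[i][j] + (0.5 if i == j else 0.0)) for j in range(dim)]
            for i in range(dim)]

def ladder_residual(dim, omega):
    """Largest entry of [H, a] + omega a, which should vanish by (3.11)."""
    a = lowering(dim)
    h = hamiltonian(dim, omega)
    ha, ah = matmul(h, a), matmul(a, h)
    return max(abs(ha[i][j] - ah[i][j] + omega * a[i][j])
               for i in range(dim) for j in range(dim))

dim, omega = 6, 2.0
print([hamiltonian(dim, omega)[n][n] for n in range(dim)])  # close to omega (n + 1/2)
print(ladder_residual(dim, omega))                          # zero up to rounding
```

The relation $[a, a^\dagger] = 1$ itself cannot hold exactly for finite matrices (its trace would have to vanish), so in a truncated basis it fails in the last diagonal entry.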
Quantization of Real Klein-Gordon Field
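If we treat each Fourier mode of the field $\phi$ as an independent harmonic oscillator, we can apply canonical quantization to the real Klein-Gordon theory, and in this way find the spectrum of the corresponding Hamiltonian. In analogy to (3.9), we write $\phi$ and $\pi$ as a linear sum of an infinite number of operators $a_{\vec p}$ and $a^\dagger_{\vec p}$, labelled by the 3-momentum $\vec p$,

$$\begin{aligned}
\phi(\vec x) &= \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{\sqrt{2 \omega_{\vec p}}} \left( a_{\vec p}\, e^{i \vec p \cdot \vec x} + a^\dagger_{\vec p}\, e^{-i \vec p \cdot \vec x} \right) , \\
\pi(\vec x) &= \int \frac{d^3p}{(2\pi)^3}\, (-i) \sqrt{\frac{\omega_{\vec p}}{2}} \left( a_{\vec p}\, e^{i \vec p \cdot \vec x} - a^\dagger_{\vec p}\, e^{-i \vec p \cdot \vec x} \right) .
\end{aligned} \tag{3.15}$$

The commutation relations (3.2) become

$$\begin{aligned}
[a_{\vec p}, a_{\vec q}] &= [a^\dagger_{\vec p}, a^\dagger_{\vec q}] = 0 , \\
[a_{\vec p}, a^\dagger_{\vec q}] &= (2\pi)^3\, \delta^{(3)}(\vec p - \vec q) .
\end{aligned} \tag{3.16}$$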
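Let us assume that the latter equations hold. It then follows that

$$\begin{aligned}
[\phi(\vec x), \pi(\vec y)] &= \int \frac{d^3p\, d^3q}{(2\pi)^6}\, \frac{(-i)}{2} \sqrt{\frac{\omega_{\vec q}}{\omega_{\vec p}}} \left( -[a_{\vec p}, a^\dagger_{\vec q}]\, e^{i \vec p \cdot \vec x - i \vec q \cdot \vec y} + [a^\dagger_{\vec p}, a_{\vec q}]\, e^{-i \vec p \cdot \vec x + i \vec q \cdot \vec y} \right) \\
&= \int \frac{d^3p\, d^3q}{(2\pi)^6}\, \frac{(-i)}{2} \sqrt{\frac{\omega_{\vec q}}{\omega_{\vec p}}}\, (2\pi)^3\, \delta^{(3)}(\vec p - \vec q) \left( -e^{i \vec p \cdot \vec x - i \vec q \cdot \vec y} - e^{-i \vec p \cdot \vec x + i \vec q \cdot \vec y} \right) \\
&= \int \frac{d^3p}{(2\pi)^3}\, \frac{i}{2} \left( e^{i \vec p \cdot (\vec x - \vec y)} + e^{-i \vec p \cdot (\vec x - \vec y)} \right) = i\, \delta^{(3)}(\vec x - \vec y) ,
\end{aligned} \tag{3.17}$$

where we have dropped terms $[a_{\vec p}, a_{\vec q}] = [a^\dagger_{\vec p}, a^\dagger_{\vec q}] = 0$ from the very beginning. To show that $[\phi(\vec x), \phi(\vec y)] = [\pi(\vec x), \pi(\vec y)] = 0$ is left as an exercise.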
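In terms of the ladder operators $a_{\vec p}$ and $a^\dagger_{\vec p}$ the Hamiltonian of the real Klein-Gordon theory takes the form

$$\begin{aligned}
H &= \frac{1}{2} \int d^3x \left[ \pi^2 + (\nabla\phi)^2 + m^2 \phi^2 \right] \\
&= \frac{1}{2} \int \frac{d^3x\, d^3p\, d^3q}{(2\pi)^6} \bigg[ -\frac{\sqrt{\omega_{\vec p}\, \omega_{\vec q}}}{2} \left( a_{\vec p}\, e^{i \vec p \cdot \vec x} - a^\dagger_{\vec p}\, e^{-i \vec p \cdot \vec x} \right) \left( a_{\vec q}\, e^{i \vec q \cdot \vec x} - a^\dagger_{\vec q}\, e^{-i \vec q \cdot \vec x} \right) \\
&\qquad + \frac{1}{2 \sqrt{\omega_{\vec p}\, \omega_{\vec q}}} \left( i \vec p\, a_{\vec p}\, e^{i \vec p \cdot \vec x} - i \vec p\, a^\dagger_{\vec p}\, e^{-i \vec p \cdot \vec x} \right) \cdot \left( i \vec q\, a_{\vec q}\, e^{i \vec q \cdot \vec x} - i \vec q\, a^\dagger_{\vec q}\, e^{-i \vec q \cdot \vec x} \right) \\
&\qquad + \frac{m^2}{2 \sqrt{\omega_{\vec p}\, \omega_{\vec q}}} \left( a_{\vec p}\, e^{i \vec p \cdot \vec x} + a^\dagger_{\vec p}\, e^{-i \vec p \cdot \vec x} \right) \left( a_{\vec q}\, e^{i \vec q \cdot \vec x} + a^\dagger_{\vec q}\, e^{-i \vec q \cdot \vec x} \right) \bigg] \\
&= \frac{1}{4} \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{\omega_{\vec p}} \left[ \left( -\omega_{\vec p}^2 + \vec p^{\,2} + m^2 \right) \left( a_{\vec p}\, a_{-\vec p} + a^\dagger_{\vec p}\, a^\dagger_{-\vec p} \right) + \left( \omega_{\vec p}^2 + \vec p^{\,2} + m^2 \right) \left( a_{\vec p}\, a^\dagger_{\vec p} + a^\dagger_{\vec p}\, a_{\vec p} \right) \right] ,
\end{aligned} \tag{3.18}$$

where we have first used the expressions for $\phi$ and $\pi$ given in (3.15) and then integrated over $d^3x$ to get delta functions $\delta^{(3)}(\vec p \pm \vec q)$, which, in turn, allow us to perform the $d^3q$ integral.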
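Inserting finally the expression (3.6) for the frequency, the first term in (3.18) vanishes and we are left with

$$\begin{aligned}
H &= \frac{1}{2} \int \frac{d^3p}{(2\pi)^3}\, \omega_{\vec p} \left( a_{\vec p}\, a^\dagger_{\vec p} + a^\dagger_{\vec p}\, a_{\vec p} \right) \\
&= \int \frac{d^3p}{(2\pi)^3}\, \omega_{\vec p} \left( a^\dagger_{\vec p}\, a_{\vec p} + \frac{1}{2}\, [a_{\vec p}, a^\dagger_{\vec p}] \right) \\
&= \int \frac{d^3p}{(2\pi)^3}\, \omega_{\vec p} \left( a^\dagger_{\vec p}\, a_{\vec p} + \frac{1}{2}\, (2\pi)^3\, \delta^{(3)}(0) \right) .
\end{aligned} \tag{3.19}$$

We see that the result contains a delta function, evaluated at zero where it has an infinite spike. This contribution arises from the infinite sum over all modes vibrating with the zero-point energy $\omega_{\vec p}/2$. Moreover, the integral over $\omega_{\vec p}$ diverges at large momenta $|\vec p|$. To better understand what is going on let us have a look at the ground state $|0\rangle$ where the former infinity first becomes apparent.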
3.2 Structure of Vacuum
As in the case of the harmonic oscillator in QM, we define the vacuum $|0\rangle$ through the condition that it is annihilated by the action of all $a_{\vec p}$,

$$a_{\vec p}\, |0\rangle = 0 , \qquad \forall\, \vec p . \tag{3.20}$$

With this definition the energy $E_0$ of the vacuum comes entirely from the second term in the last line of (3.19),

$$H |0\rangle = E_0 |0\rangle = \left[ \int \frac{d^3p}{(2\pi)^3}\, \frac{\omega_{\vec p}}{2}\, (2\pi)^3\, \delta^{(3)}(0) \right] |0\rangle = \infty\, |0\rangle . \tag{3.21}$$
In fact, the latter expression contains not only one but two infinities. The first arises because space is infinitely large. Infinities of this kind are often referred to as infrared (IR) divergences. To isolate this infinity, we put the theory into a box with sides of length $L$ and impose periodic boundary conditions (BCs) on the field. Then, taking the limit $L \to \infty$, we get

$$(2\pi)^3\, \delta^{(3)}(0) = \lim_{L \to \infty} \int_{-L/2}^{L/2} d^3x\, e^{i \vec p \cdot \vec x}\, \Big|_{\vec p = 0} = \lim_{L \to \infty} \int_{-L/2}^{L/2} d^3x = V , \tag{3.22}$$

where $V$ denotes the volume of the box. This result tells us that the delta function singularity arises because we try to compute the total energy $E_0$ of the system rather than its energy density $\mathcal{E}_0$. The energy density is simply calculated from $E_0$ by dividing by the volume $V$. One finds

$$\mathcal{E}_0 = \frac{E_0}{V} = \int \frac{d^3p}{(2\pi)^3}\, \frac{\omega_{\vec p}}{2} , \tag{3.23}$$

which is still divergent and resembles the sum of zero-point energies for each harmonic oscillator. Since $\mathcal{E}_0 \to \infty$ in the limit $|\vec p| \to \infty$, i.e., high frequencies (or short distances), this singularity is an ultraviolet (UV) divergence. This divergence arises because we want too much. We have assumed that our theory is valid to arbitrarily short distance scales, corresponding to arbitrarily high energies. Recalling the discussion of energy scales in Section 1.2, this assumption is clearly absurd. The integral should be cut off at high momentum, reflecting the fact that our theory presumably breaks down at some point (most likely far below the GUT or Planck scale).
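To get a feeling for how strongly (3.23) depends on where the integral is cut off, one can restrict it to $|\vec p| < \Lambda_{\rm UV}$ and evaluate it numerically. The sketch below assumes a sharp momentum cut-off and uses Simpson's rule; the function name and the step count are arbitrary choices. For $m = 0$ the integral can be done by hand and gives $\Lambda_{\rm UV}^4/(16\pi^2)$.

```python
import math

def energy_density(cutoff, m=0.0, steps=2000):
    """Simpson estimate of (3.23) restricted to |p| < cutoff (natural units).

    After the angular integration, d^3p/(2 pi)^3 * omega_p/2 becomes
    p^2 sqrt(p^2 + m^2) dp / (4 pi^2). `steps` must be even.
    """
    def integrand(p):
        return p * p * math.sqrt(p * p + m * m) / (4.0 * math.pi ** 2)

    h = cutoff / steps
    total = integrand(0.0) + integrand(cutoff)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0

print(energy_density(1.0), 1.0 / (16.0 * math.pi ** 2))    # massless case vs closed form
print(energy_density(200.0, m=1.0) / energy_density(100.0, m=1.0))  # close to 2^4
```

Doubling the cut-off multiplies the result by roughly 16 once the cut-off is far above the mass, which is the quartic sensitivity to the UV scale.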
Fortunately, the infinite energy shift in (3.19) is harmless if we want to measure the energy difference of the energy eigenstates from the vacuum. We can therefore "recalibrate" our energy levels (by an infinite constant), removing from the Hamiltonian operator the energy of the vacuum,

$$: H : \; = H - E_0 = H - \langle 0 | H | 0 \rangle . \tag{3.24}$$

With this definition one has $: H : |0\rangle = 0$. In fact, the difference between the latter Hamiltonian and the previous one is merely an ordering ambiguity in moving from the classical to the quantum theory. E.g., if we had defined our Hamiltonian to take the form

$$H = \frac{1}{2} \left( \omega q - i p \right) \left( \omega q + i p \right) , \tag{3.25}$$

which is classically the same as our original definition (3.7), then after quantization instead of (3.10), we would have gotten

$$H = \omega\, a^\dagger a . \tag{3.26}$$

This type of ordering ambiguity arises often in field theories. The method that we have used above to deal with it is called normal ordering. In practice, normal ordering works by placing all annihilation operators $a_{\vec p}$ in products of field operators to the right. Applied to the Hamiltonian of the real Klein-Gordon theory this prescription leads to

$$: H : \; = \int \frac{d^3p}{(2\pi)^3}\, \omega_{\vec p}\, a^\dagger_{\vec p}\, a_{\vec p} . \tag{3.27}$$

In the remainder of this section, we will normal order all operators in this manner (dropping the ": :" for simplicity).
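The ordering ambiguity can be seen explicitly for a single oscillator by comparing the symmetric form $\frac{\omega}{2}(a a^\dagger + a^\dagger a)$ from (3.10) with the normal-ordered form $\omega\, a^\dagger a$ from (3.26). The sketch below again assumes a number basis truncated to a few states, with helper names chosen only for this example.

```python
import math

def lowering(dim):
    """Matrix of a in the basis |0>, ..., |dim-1>, using a|n> = sqrt(n)|n-1>."""
    return [[math.sqrt(col) if col == row + 1 else 0.0 for col in range(dim)]
            for row in range(dim)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def ordering_gap(dim, omega):
    """Diagonal of H_sym - H_normal in the truncated basis, where
    H_sym = omega/2 (a a^dagger + a^dagger a) and H_normal = omega a^dagger a."""
    a = lowering(dim)
    ad = transpose(a)
    a_ad, ad_a = matmul(a, ad), matmul(ad, a)
    return [0.5 * omega * (a_ad[n][n] + ad_a[n][n]) - omega * ad_a[n][n]
            for n in range(dim)]

# Every state except the last is shifted by the same constant omega/2.
print(ordering_gap(6, 2.0))
```

The last entry deviates only because the basis was cut off; for all other states the two orderings differ by the state-independent zero-point energy $\omega/2$, which is exactly what normal ordering removes.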
Cosmological Constant
Above we concluded that as long as we are interested in the differences between energy levels the infinite total energy $E_0$ of the vacuum does not matter (which effectively means that $E_0$ has no effect on particle physics phenomenology). So is the value of $E_0$ unobservable then? No, in fact, not at all, since gravity is supposed to see all energy densities. In particular, the sum of all the zero-point energies should contribute to Einstein's equations,

$$R_{\mu\nu} - \frac{R}{2}\, g_{\mu\nu} + \Lambda\, g_{\mu\nu} = 8 \pi G_N\, T_{\mu\nu} , \tag{3.28}$$

in the form of a cosmological constant $\Lambda = E_0/V$. Here $R_{\mu\nu}$ is the Ricci curvature tensor, $R$ the scalar curvature (for their definitions please consult a text on general relativity), $g_{\mu\nu}$ is the metric tensor (not to be mixed up with the Minkowski metric $\eta_{\mu\nu}$), $G_N$ denotes Newton's constant, which we have already met in (1.8), and $T_{\mu\nu}$ is the energy-momentum tensor in its symmetric form. Unfortunately, I do not have time to explain (3.28) in detail. If you want to learn more about Einstein's equations, I suggest that you attend Andrew Steane's course on general relativity. In order to be able to follow this lecture, it is sufficient to know that these equations contain a term proportional to $E_0/V$.

An assortment of observations (cosmic microwave background, type-Ia supernovae, baryon acoustic oscillations, etc.) tells us that 74% of the energy density in the universe has the properties of a cosmological constant. This constant energy density filling space homogeneously is one form of dark energy. Another possibility for dark energy would be a scalar field such as quintessence, a dynamic quantity whose energy density can vary in space. The rest of the composition of today's cosmos is made up by dark matter, amounting to 22%, and visible matter (atoms, etc.), giving the missing 4%. Dark matter is dark in the sense that it is inferred to exist from gravitational effects on visible matter and background radiation, but is undetectable by emitted or scattered electromagnetic radiation. So in conclusion, fully 96% of the universe seems to be composed of stuff we've never seen directly on earth.
But our lack of understanding does not end there. In the last subsection, we have argued that integrating in (3.23) up to infinity is not the right thing to do, but that one should only consider modes up to a certain UV cut-off $\Lambda_{\rm UV}$, where one stops trusting the underlying theory. The resulting energy density $\mathcal{E}_0$ then scales like $\Lambda_{\rm UV}^4$. While it is not clear which precise value we should take for $\Lambda_{\rm UV}$, let us not be very ambitious and take a value for this scale up to which we truly believe that we understand the physics of fundamental interactions. The electron mass $m_e = 511\,{\rm keV}$ could be such a choice. In consequence,

$$\mathcal{E}_0^{\rm predicted} \sim (511\,{\rm keV})^4 \approx 7 \cdot 10^{22}\,{\rm eV}^4 , \tag{3.29}$$

where the superscript "predicted" should probably better read "guessed". Glancing at Table 1, we see that the observed value of $\mathcal{E}_0$ is

$$\mathcal{E}_0^{\rm observed} \sim (10^{-3}\,{\rm eV})^4 \approx 10^{-12}\,{\rm eV}^4 , \tag{3.30}$$

so it is clearly non-zero but unfortunately also roughly 34 orders of magnitude smaller than our prediction. Notice that the choice $\Lambda_{\rm UV} = m_e$ that led to (3.29) was, in fact, a conservative one, because other educated guesses such as $\Lambda_{\rm UV} = v, M_{\rm GUT}$, etc. would have led to a much bigger disagreement of up to 120 orders of magnitude for the choice $\Lambda_{\rm UV} = M_P$.
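The size of this mismatch is quick to reproduce. The snippet below is a back-of-the-envelope sketch: it compares the fourth power of a chosen cut-off with the fourth power of the observed scale of about $10^{-3}$ eV. The Planck mass value of roughly $1.22 \cdot 10^{19}$ GeV is an assumed input that is not quoted in the text above.

```python
import math

def orders_of_magnitude(cutoff_ev, observed_scale_ev=1e-3):
    """log10 of (cutoff^4 / observed_scale^4), with both scales given in eV."""
    return 4.0 * (math.log10(cutoff_ev) - math.log10(observed_scale_ev))

ELECTRON_MASS_EV = 511e3
PLANCK_MASS_EV = 1.22e28  # assumed value, about 1.22e19 GeV

print(ELECTRON_MASS_EV ** 4)                   # cut-off at the electron mass, in eV^4
print(orders_of_magnitude(ELECTRON_MASS_EV))   # mismatch for the electron-mass cut-off
print(orders_of_magnitude(PLANCK_MASS_EV))     # mismatch for a Planck-scale cut-off
```

Because the energy density scales with the fourth power of the cut-off, every factor of ten in $\Lambda_{\rm UV}$ costs four orders of magnitude in the comparison.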
From the point of view of QFT, the net cosmological constant is the sum of a number of apparently disparate contributions, including zero-point fluctuations of each field theory dof and potential energies from scalar fields, as well as a bare cosmological constant. There is no obstacle to imagining that all of the large and apparently unrelated contributions add together, with different signs, to produce a net cosmological constant consistent with the limit (3.30), other than the fact that it seems ridiculous. We know of no special symmetry which could enforce a vanishing vacuum energy while remaining consistent with the known laws of physics. This conundrum is the cosmological constant problem. While no widely accepted solution to this problem exists, there are many proposed ones ranging from the anthropic principle to the string-theory landscape. Don't worry if you have never even heard of any of them; it is not important at all for what follows.
Casimir Effect
Using the normal ordering prescription we can happily set $E_0 = 0$, while chanting the mantra that only energy differences can be measured. However, it should be possible to see that the vacuum energy is different if, for some reason, the fields vanish in some region of space or if some frequencies $\omega_{\vec p}$ do not contribute to the vacuum energy. Such a set-up can be realized by forcing the real Klein-Gordon field to satisfy appropriate BCs. Let us assume that $\phi$ vanishes on the planes with $x = 0$ and $x = L$,

$$\phi(0, y, z) = \phi(L, y, z) = 0 . \tag{3.31}$$

The presence of these BCs affects the Fourier decomposition of the field and, in particular, leads to a quantization of the momentum of the field between the planes ($k \in \mathbb{Z}^+$),

$$\vec p = \left( \frac{\pi k}{L}, p_y, p_z \right) . \tag{3.32}$$

For simplicity let us consider a massless real scalar field. In this case the ground-state energy per unit area $S$ between the planes is given by the following expression

$$\frac{E_0(L)}{S} = \sum_{k=1}^{\infty} \int \frac{d^2p_\perp}{(2\pi)^2}\, \frac{1}{2} \sqrt{\left( \frac{\pi k}{L} \right)^2 + p_\perp^2} . \tag{3.33}$$

Notice that we only integrate over the perpendicular directions $\vec p_\perp = (p_y, p_z)$, since the momentum $p_x$ is discretized. Consequently, the volume integral has to be replaced by a surface integral over the planes. In analogy to (3.22), this gives a factor $S/(2\pi)^2$ instead of $V/(2\pi)^3$.

Let us see if we are able to calculate (3.33). We first switch to polar coordinates,

$$\frac{E_0(L)}{S} = \sum_{k=1}^{\infty} \int_0^\infty \frac{dp_\perp\, p_\perp}{2\pi}\, \frac{1}{2} \sqrt{\left( \frac{\pi k}{L} \right)^2 + p_\perp^2} . \tag{3.34}$$
As it stands this integral is divergent in the limit $p_\perp \to \infty$. We can regulate this singularity in a number of different ways. One way to do it is to introduce a UV cut-off $a \ll L$, so that modes of momentum much bigger than $a^{-1}$ are removed. E.g., multiplying the integrand in (3.34) by the factor $\exp\left[ -a \left( (\pi k/L)^2 + p_\perp^2 \right)^{1/2} \right]$ would do the job, since the resulting expression has the property that as $a \to 0$, one regains the full, infinite result (3.33). The drawback of this method is that the new integral is quite difficult to perform (though doable), so let's see if we find an easier way.

The trick is to consider (3.33) not in $d = 4$ dimensions, but to work in fewer dimensions, say, $d = 4 - 2\epsilon$ with $\epsilon > 0$. While this looks very weird at first sight, let me mention that in general there exists a value of $\epsilon$ for which the integral is well-defined. We shall perform our calculation for such a value, and then try to analytically continue the result to $\epsilon = 0$. In $d = 4 - 2\epsilon$ dimensions the integral (3.34) takes the form

$$\frac{E_0(L)}{S} = \sum_{k=1}^{\infty} \int_0^\infty \frac{dp_\perp\, p_\perp^{1 - 2\epsilon}}{2\pi}\, \frac{1}{2} \sqrt{\left( \frac{\pi k}{L} \right)^2 + p_\perp^2} . \tag{3.35}$$
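To evaluate this expression, we first change variables $p_\perp \to (\pi k/L)\, l_\perp$. We then obtain

$$\begin{aligned}
\frac{E_0(L)}{S} &= \frac{1}{4\pi} \left( \frac{\pi}{L} \right)^{3 - 2\epsilon} \left( \sum_{k=1}^{\infty} k^{3 - 2\epsilon} \right) \int_0^\infty dl_\perp\, l_\perp^{1 - 2\epsilon} \sqrt{1 + l_\perp^2} \\
&= \frac{1}{8\pi} \left( \frac{\pi}{L} \right)^{3 - 2\epsilon} \zeta(2\epsilon - 3) \int_0^\infty dl_\perp^2\, (l_\perp^2)^{-\epsilon} \sqrt{1 + l_\perp^2} ,
\end{aligned} \tag{3.36}$$

where we have identified the infinite sum with a Riemann zeta function, employing

$$\sum_{k=1}^{\infty} \frac{1}{k^a} = \zeta(a) . \tag{3.37}$$

Performing now the change of variables $l_\perp^2 \to x/(1 - x)$, we arrive at

$$\int_0^\infty dl_\perp^2\, (l_\perp^2)^{-\epsilon} \sqrt{1 + l_\perp^2} = \int_0^1 dx\, x^{-\epsilon}\, (1 - x)^{\epsilon - 5/2} = B(1 - \epsilon, \epsilon - 3/2) , \tag{3.38}$$

where in the last step we have used the definition of the Euler beta function,

$$B(a, b) = \frac{\Gamma(a)\, \Gamma(b)}{\Gamma(a + b)} = \int_0^1 dx\, x^{a - 1}\, (1 - x)^{b - 1} . \tag{3.39}$$

Putting everything together, the final result in $d = 4 - 2\epsilon$ dimensions reads

$$\frac{E_0(L)}{S} = \frac{1}{8\pi} \left( \frac{\pi}{L} \right)^{3 - 2\epsilon} \zeta(2\epsilon - 3)\, B(1 - \epsilon, \epsilon - 3/2) . \tag{3.40}$$

Amazingly,$^{10}$ we can even take the limit $\epsilon \to 0$. Using $\Gamma(a + 1) = a\, \Gamma(a)$ with $\Gamma(1) = 1$ and $\Gamma(1/2) = \sqrt{\pi}$, and recalling that $\zeta(-3) = 1/120$, we arrive at the finite expression

$$\frac{E_0(L)}{S} = -\frac{\pi^2}{1440\, L^3} . \tag{3.41}$$

$^{10}$ Many subtleties have been swept under the carpet in this calculation. E.g., the dimensions of the expressions in (3.35) to (3.40) are wrong by $2\epsilon$. All cheats will become clear when the method of dimensional regularization is properly introduced.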
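The last step, going from (3.40) at $\epsilon = 0$ to the coefficient in (3.41), can be double-checked numerically. The sketch below obtains $\zeta(-3)$ from the standard relation $\zeta(-n) = -B_{n+1}/(n+1)$ for $n \geq 1$ between the zeta function at negative integers and the Bernoulli numbers, which is a textbook identity not derived in these notes, and evaluates the beta function via gamma functions as in (3.39).

```python
import math
from fractions import Fraction

def bernoulli(n):
    """Bernoulli number B_n (convention B_1 = -1/2) from the standard recursion."""
    b = [Fraction(1)]
    for m in range(1, n + 1):
        b.append(-sum(math.comb(m + 1, k) * b[k] for k in range(m)) / (m + 1))
    return b[n]

def zeta_negative_integer(n):
    """zeta(-n) = -B_{n+1} / (n + 1), valid for integer n >= 1."""
    return -bernoulli(n + 1) / (n + 1)

def beta(a, b):
    """Euler beta function written in terms of gamma functions, as in (3.39)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

# Coefficient of 1/L^3 in (3.40) at epsilon = 0.
coefficient = (math.pi ** 3 / (8.0 * math.pi)) * float(zeta_negative_integer(3)) * beta(1.0, -1.5)

print(zeta_negative_integer(3))               # exact rational value of zeta(-3)
print(coefficient, -math.pi ** 2 / 1440.0)    # the two numbers agree
```

The negative sign comes entirely from $B(1, -3/2) = \Gamma(-3/2)/\Gamma(-1/2) = -2/3$, since $\zeta(-3)$ is positive.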
This result implies that the vacuum energy depends on the distance between the two planes,
on which vanishes. Can we realize this in an experiment?
Remember that the electromagnetic field is zero inside a conductor. If we place two uncharged
conducting plates parallel to each other at a distance $L$, then we can reproduce the
BCs of the set-up that we have just studied. While the quantization of the electromagnetic
field is more complicated than that of the real Klein-Gordon field, which we have used to model the
effect, this difference becomes (almost) immaterial as far as the vacuum energy is concerned.
Our analysis leads to an amazing prediction: two electrically neutral metal plates attract
each other. This is known as the Casimir-Polder force, first predicted in 1948 [4]. Notice
that the energy of the vacuum gets smaller when the conducting plates are closer, as indicated
by the minus sign in (3.41). Therefore, there is an attractive force between them. This is an
effect that has by now been verified experimentally with great precision.$^{11}$ In our example,
the force per unit area (pressure or rather anti-pressure) between the two conductor plates is
given by
$$\mathcal{F} = -\frac{1}{S}\, \frac{\partial E_0(L)}{\partial L} = -\frac{\pi^2}{480\, L^4}\,. \tag{3.42}$$
In fact, the true Casimir-Polder force is twice as large as the latter result, due to the two
polarization states of the photon.
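To get a feeling for the size of the effect, one can restore the factors of $\hbar$ and $c$ that are set to one in the text. The sketch below assumes that (3.42) becomes $-\pi^2 \hbar c/(480 L^4)$ in SI units and that the photon result is obtained by the factor of two quoted above. The function name and the `polarizations` parameter are hypothetical.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J s
C = 2.99792458e8        # speed of light in m / s


def casimir_pressure(distance_m: float, polarizations: int = 1) -> float:
    """Force per unit area in pascal; a negative value means attraction.

    polarizations = 1 reproduces the scalar result (3.42),
    polarizations = 2 the photon case mentioned in the text.
    """
    return -polarizations * math.pi**2 * HBAR * C / (480.0 * distance_m**4)


scalar_pressure = casimir_pressure(1.0e-6)
photon_pressure = casimir_pressure(1.0e-6, polarizations=2)
print(scalar_pressure, photon_pressure)
```

For plates one micrometre apart the photon case gives a pressure of about 1.3 mPa, and the steep $1/L^4$ scaling means that halving the distance increases the pressure by a factor of 16.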
3.3 Particle States
After the discussion of the properties of the vacuum, we can now turn to the excitations of
$\phi$. It's easy to verify (and therefore left as an exercise) that, in full analogy to (3.11), the
Hamiltonian and the ladder operators of the real Klein-Gordon theory obey the following
commutation relations
$$[H, a_{\mathbf p}] = -\omega_{\mathbf p}\, a_{\mathbf p}\,, \qquad [H, a^\dagger_{\mathbf p}] = \omega_{\mathbf p}\, a^\dagger_{\mathbf p}\,. \tag{3.43}$$
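These relations imply that we can construct energy eigenstates by acting on the vacuum state
$|0\rangle$ with $a^\dagger_{\mathbf p}$ (remember that they also imply that $a_{\mathbf p}|0\rangle = 0$, $\forall\, \mathbf p$). We define
$$|\mathbf p\rangle = a^\dagger_{\mathbf p}\, |0\rangle\,. \tag{3.44}$$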
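This state has energy
$$H\, |\mathbf p\rangle = E_{\mathbf p}\, |\mathbf p\rangle = \omega_{\mathbf p}\, |\mathbf p\rangle\,, \tag{3.45}$$
with $\omega_{\mathbf p}$ given in (3.6), which is nothing but the relativistic energy of a particle with
3-momentum $\mathbf p$ and mass $m$. We thus interpret the state $|\mathbf p\rangle$ as the momentum eigenstate
of a single scalar particle of mass $m$.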
Let us check this interpretation by studying the other quantum numbers of $|\mathbf p\rangle$. We begin
with the total momentum $\mathbf P$ introduced in (2.24). Turning this expression into an operator,
we arrive, after normal ordering, at
$$\mathbf P = -\int d^3x\; \pi\, \nabla\phi = \int \frac{d^3p}{(2\pi)^3}\; \mathbf p\; a^\dagger_{\mathbf p}\, a_{\mathbf p}\,. \tag{3.46}$$
$^{11}$ The first experimental test of the Casimir-Polder force was conducted by Marcus Sparnaay in 1958, in a
delicate and difficult experiment with parallel plates. Due to the large experimental errors, his results could
neither prove the theoretical prediction right nor wrong.
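Acting with $\mathbf P$ on our state $|\mathbf p\rangle$ gives
$$\mathbf P\, |\mathbf p\rangle = \int \frac{d^3q}{(2\pi)^3}\; \mathbf q\; a^\dagger_{\mathbf q}\, a_{\mathbf q}\, a^\dagger_{\mathbf p}\, |0\rangle = \int \frac{d^3q}{(2\pi)^3}\; \mathbf q\; a^\dagger_{\mathbf q} \left[ (2\pi)^3 \delta^{(3)}(\mathbf p - \mathbf q) + a^\dagger_{\mathbf p}\, a_{\mathbf q} \right] |0\rangle = \mathbf p\, |\mathbf p\rangle\,, \tag{3.47}$$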
where we have employed the second line in (3.16) and used the fact that an annihilation
operator acting on the vacuum is zero. The latter result tells us that the state $|\mathbf p\rangle$ has
momentum $\mathbf p$. Another property of $|\mathbf p\rangle$ that we can study is its angular momentum. Again
we take the classical expression for the total angular momentum (2.67) and turn it into an
operator,
$$J^i = \epsilon^{ijk} \int d^3x\; (\mathcal{J}^0)^{jk}\,. \tag{3.48}$$
It is a good exercise to show that by acting with $J^i$ on the one-particle state with zero
momentum one gets
$$J^i\, |\mathbf p = 0\rangle = 0\,. \tag{3.49}$$
This result tells us that the particle carries no internal angular momentum. In other words,
quantizing the real Klein-Gordon field gives rise to a spin-zero particle, aka a scalar.
Multiparticle States
Acting multiple times with the creation operators on the vacuum we can create multiparticle
states. We interpret the state
$$|\mathbf p_1, \ldots, \mathbf p_n\rangle = a^\dagger_{\mathbf p_1} \cdots a^\dagger_{\mathbf p_n}\, |0\rangle\,, \tag{3.50}$$
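as an $n$-particle state. Since one has $[a^\dagger_{\mathbf p_i}, a^\dagger_{\mathbf p_j}] = 0$, the state (3.50) is symmetric under exchange
of any two particles. E.g.,
$$|\mathbf p, \mathbf q\rangle = a^\dagger_{\mathbf p}\, a^\dagger_{\mathbf q}\, |0\rangle = a^\dagger_{\mathbf q}\, a^\dagger_{\mathbf p}\, |0\rangle = |\mathbf q, \mathbf p\rangle\,. \tag{3.51}$$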
This means that the particles corresponding to the real Klein-Gordon theory are bosons. We
see that, as promised already in Section 1.1, the relationship between spin and statistics
is, in fact, a consequence of the QFT framework, following, in the case at hand, from the
commutation quantization conditions for boson fields (3.2).
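The full Hilbert space of our theory is spanned by acting on the vacuum with all possible
combinations of creation operators,
$$|0\rangle\,, \quad a^\dagger_{\mathbf p}\, |0\rangle\,, \quad a^\dagger_{\mathbf p}\, a^\dagger_{\mathbf q}\, |0\rangle\,, \quad a^\dagger_{\mathbf p}\, a^\dagger_{\mathbf q}\, a^\dagger_{\mathbf r}\, |0\rangle\,, \quad \ldots\,. \tag{3.52}$$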
This space is known as the Fock space and is simply the sum of the $n$-particle Hilbert spaces,
for all $n \geq 0$. Like in QM, there is also an operator which counts the number of particles in a
given state in the Fock space. It is the number operator
$$N = \int \frac{d^3p}{(2\pi)^3}\; a^\dagger_{\mathbf p}\, a_{\mathbf p}\,, \tag{3.53}$$
which satisfies $N\, |\mathbf p_1, \ldots, \mathbf p_n\rangle = n\, |\mathbf p_1, \ldots, \mathbf p_n\rangle$. Notice that the number operator commutes
with the Hamiltonian, i.e., $[N, H] = 0$, ensuring that particle number is conserved. This
means that we can place ourselves in the $n$-particle sector, and will remain there. This is a
property of free theories, but will no longer be true when we consider interactions. Interactions
create and destroy particles, taking us between the different sectors in the Fock space.
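The structure of the Fock space can be illustrated with a small toy model. The sketch below is not the continuum theory of the text: it assumes a finite set of discrete modes with $[a_i, a_j^\dagger] = \delta_{ij}$ instead of the continuum normalization $(2\pi)^3 \delta^{(3)}(\mathbf p - \mathbf q)$, and it represents a state as a dictionary from occupation numbers to amplitudes. All function names are hypothetical.

```python
import math
from collections import defaultdict

# A state is a dict: occupation tuple (n_0, ..., n_{K-1}) -> amplitude.
# Discrete modes with [a_i, a_j^dagger] = delta_ij are an assumption of this
# toy model; the text works with continuum modes.


def vacuum(num_modes):
    """The state |0> with no quanta in any mode."""
    return {(0,) * num_modes: 1.0}


def create(state, mode):
    """Apply a_mode^dagger: |n> -> sqrt(n + 1) |n + 1>."""
    out = defaultdict(float)
    for occ, amp in state.items():
        new = list(occ)
        new[mode] += 1
        out[tuple(new)] += math.sqrt(occ[mode] + 1) * amp
    return dict(out)


def annihilate(state, mode):
    """Apply a_mode: |n> -> sqrt(n) |n - 1>, so that a|0> = 0."""
    out = defaultdict(float)
    for occ, amp in state.items():
        if occ[mode] == 0:
            continue
        new = list(occ)
        new[mode] -= 1
        out[tuple(new)] += math.sqrt(occ[mode]) * amp
    return dict(out)


def number_operator(state):
    """Apply N = sum_i a_i^dagger a_i to a non-empty state."""
    out = defaultdict(float)
    for mode in range(len(next(iter(state)))):
        for occ, amp in create(annihilate(state, mode), mode).items():
            out[occ] += amp
    return dict(out)


vac = vacuum(3)
state_pq = create(create(vac, 2), 0)  # a_0^dagger a_2^dagger |0>
state_qp = create(create(vac, 0), 2)  # a_2^dagger a_0^dagger |0>
print(state_pq == state_qp, number_operator(state_pq))
```

The two orderings of the creation operators give the same state, which is the discrete analogue of (3.51), and the number operator returns the state multiplied by its particle number.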
Operator-Valued Distributions
We have referred to the states $|\mathbf p\rangle$ as particles. Yet, this name is somewhat misleading,
since these states are momentum eigenstates and therefore not localized in space. Recall that
in QM both the position and momentum eigenstates are not good elements of the Hilbert
space since they are not normalizable (they normalize to delta functions). Similarly, in QFT
neither the operators $\phi(\mathbf x)$, nor $a_{\mathbf p}$ and $a^\dagger_{\mathbf p}$ are good operators acting on the Fock space. This
is because these operators all produce states that are not normalizable:
$$\langle 0|\, \phi(\mathbf x)\, \phi(\mathbf x)\, |0\rangle = \delta^{(3)}(0)\,, \qquad \langle 0|\, a_{\mathbf p}\, a^\dagger_{\mathbf p}\, |0\rangle = (2\pi)^3\, \delta^{(3)}(0)\,. \tag{3.54}$$
This feature implies that they are operator-valued distributions and not functions. In the case
of $\phi(\mathbf x)$ one has that although the field operator has a well-defined vacuum expectation value
(VEV), $\langle 0|\phi(\mathbf x)|0\rangle = 0$, the fluctuations $\langle 0|\phi(\mathbf x)\phi(\mathbf x)|0\rangle$ of the operator at a fixed point are
infinite. We can construct well-defined operators by smearing these distributions over space.
E.g., we can create a wavepacket
$$|\varphi\rangle = \int \frac{d^3p}{(2\pi)^3}\; e^{-i \mathbf p \cdot \mathbf x}\, \varphi(\mathbf p)\, |\mathbf p\rangle\,, \tag{3.55}$$
which is partially localized in both position and momentum space. A typical state might be
described by the Gaussian $\varphi(\mathbf p) = \exp\left[-\mathbf p^2/(2m^2)\right]$.
Relativistic Normalization
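The vacuum $|0\rangle$ is normalized as $\langle 0|0\rangle = 1$. The one-particle states $|\mathbf p\rangle = a^\dagger_{\mathbf p}|0\rangle$ then satisfy
$$\langle \mathbf p|\mathbf q\rangle = \langle 0|\, a_{\mathbf p}\, a^\dagger_{\mathbf q}\, |0\rangle = \langle 0| \left[ (2\pi)^3 \delta^{(3)}(\mathbf p - \mathbf q) + a^\dagger_{\mathbf q}\, a_{\mathbf p} \right] |0\rangle = (2\pi)^3\, \delta^{(3)}(\mathbf p - \mathbf q)\,, \tag{3.56}$$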
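where we have made use of (3.16) and (3.20) to arrive at the final answer. Since the latter
expression depends on 3-momenta, an immediate question that arises is whether it is Lorentz
invariant. What could go wrong? Suppose we perform a Lorentz transformation
$$p^\mu \to (p')^\mu = \Lambda^\mu{}_\nu\, p^\nu\,, \tag{3.57}$$
such that $\mathbf p \to \mathbf p'$. In our QFT it would be preferable if the state $|\mathbf p\rangle$ changes under this Lorentz
transformation as
$$|\mathbf p\rangle \to |\mathbf p'\rangle = U(\Lambda)\, |\mathbf p\rangle\,, \tag{3.58}$$
with $U(\Lambda)$ being unitary, i.e., $U^\dagger(\Lambda)\, U(\Lambda) = U(\Lambda)\, U^\dagger(\Lambda) = 1$. In such a case the normalization
of $|\mathbf p\rangle$ would remain unchanged,
$$\langle \mathbf p|\mathbf p\rangle \to \langle \mathbf p'|\mathbf p'\rangle = \langle \mathbf p|\, U^\dagger(\Lambda)\, U(\Lambda)\, |\mathbf p\rangle = \langle \mathbf p|\mathbf p\rangle\,. \tag{3.59}$$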
In order to find out whether or not the original and the Lorentz-transformed state, $|\mathbf p\rangle$ and
$|\mathbf p'\rangle$, are related by a unitary transformation, we should look at an object which we know
is Lorentz invariant. One such object is the identity operator (which is really the projection
operator onto one-particle states). With the normalization (3.56) we know that it is given by
$$1 = \int \frac{d^3p}{(2\pi)^3}\; |\mathbf p\rangle \langle \mathbf p|\,. \tag{3.60}$$
This operator is Lorentz invariant, but it consists of two terms: the measure $\int d^3p$ and the
projector $|\mathbf p\rangle\langle \mathbf p|$. Are these two objects Lorentz invariant by themselves? In fact, they are not.
In order to prove this statement, we start with the measure $\int d^4p$, which is obviously Lorentz
invariant. The relativistic dispersion relation for a massive particle, i.e., $p^2 = m^2$, and hence
$p_0^2 = E_{\mathbf p}^2 = \mathbf p^2 + m^2$, is also Lorentz invariant. Solving for $p_0$, there are two branches of solutions,
namely $p_0 = \pm E_{\mathbf p}$. But the choice of branch is another Lorentz-invariant concept. Putting
everything together tells us that
$$\int d^4p\; \delta(p_0^2 - \mathbf p^2 - m^2)\, \Big|_{p_0 > 0} = \int \frac{d^3p}{2 p_0}\, \bigg|_{p_0 = E_{\mathbf p}} = \int \frac{d^3p}{2 E_{\mathbf p}}\,, \tag{3.61}$$
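is Lorentz invariant. From the latter result we can figure out everything else. E.g., the
Lorentz-invariant delta function for 3-momenta is
$$2 E_{\mathbf p}\, \delta^{(3)}(\mathbf p - \mathbf q)\,, \tag{3.62}$$
since
$$\int \frac{d^3p}{2 E_{\mathbf p}}\; 2 E_{\mathbf p}\, \delta^{(3)}(\mathbf p - \mathbf q) = 1\,. \tag{3.63}$$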
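The Lorentz invariance of the measure $d^3p/(2E_{\mathbf p})$ can also be checked numerically. The sketch below assumes a boost with velocity $\beta$ along the $z$-axis, under which the transverse momentum is unchanged, so that it suffices to verify $dp_z'/E' = dp_z/E$. The Jacobian is estimated by a finite difference; the function names and the chosen numbers are illustrative.

```python
import math


def energy(pz, pt, mass):
    """On-shell energy E_p = sqrt(p^2 + m^2) with transverse momentum pt."""
    return math.sqrt(pz**2 + pt**2 + mass**2)


def boost_pz(pz, pt, mass, beta):
    """Longitudinal momentum after a boost with velocity beta along z."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (pz + beta * energy(pz, pt, mass))


def jacobian(pz, pt, mass, beta, step=1e-6):
    """Central finite-difference estimate of d pz' / d pz at fixed pt."""
    upper = boost_pz(pz + step, pt, mass, beta)
    lower = boost_pz(pz - step, pt, mass, beta)
    return (upper - lower) / (2.0 * step)


pz, pt, mass, beta = 0.7, 0.3, 1.0, 0.6
e_before = energy(pz, pt, mass)
e_after = energy(boost_pz(pz, pt, mass, beta), pt, mass)

measure_before = 1.0 / e_before                         # d pz / E
measure_after = jacobian(pz, pt, mass, beta) / e_after  # d pz' / E'
print(measure_before, measure_after)
```

The two numbers agree up to the finite-difference error, which reflects the exact relation $dp_z'/dp_z = E'/E$ for a boost along $z$.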
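This finally tells us that the relativistically normalized momentum eigenstates are given by$^{12}$
$$|p\rangle = \sqrt{2 E_{\mathbf p}}\; |\mathbf p\rangle = \sqrt{2 E_{\mathbf p}}\; a^\dagger_{\mathbf p}\, |0\rangle\,, \tag{3.64}$$
and satisfy
$$\langle p|q\rangle = (2\pi)^3\, 2 E_{\mathbf p}\, \delta^{(3)}(\mathbf p - \mathbf q)\,. \tag{3.65}$$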
We can also express the identity operator in terms of the $|p\rangle$ states. One has
$$1 = \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2 E_{\mathbf p}}\; |p\rangle \langle p|\,. \tag{3.66}$$
We remark that some texts on QFT also define relativistically normalized annihilation (creation)
operators by $a(p) = \sqrt{2 E_{\mathbf p}}\; a_{\mathbf p}$ $\big( a^\dagger(p) = \sqrt{2 E_{\mathbf p}}\; a^\dagger_{\mathbf p} \big)$. In order to avoid (further) confusion,
we won't make use of this notation here.
$^{12}$ Our notation is rather subtle here, since the relativistically normalized momentum states $|p\rangle$ differ from
$|\mathbf p\rangle$ just by the fact that they are not set in boldface type.
3.4 Two Real Klein-Gordon Fields
Our task is to describe all known particles and their interactions. It is then interesting to
study the quantization of a system with more than one field. In order to keep things simple,
let us try to describe a system of two real Klein-Gordon fields $\phi_{1,2}$ which differ only in their
mass parameters ($m_1 \neq m_2$),
$$\mathcal{L} = \sum_{i=1,2} \left[ \frac{1}{2}\, (\partial_\mu \phi_i)^2 - \frac{1}{2}\, m_i^2\, \phi_i^2 \right]. \tag{3.67}$$
This Lagrangian leads to two independent Klein-Gordon equations,
$$(\partial^2 + m_i^2)\, \phi_i = 0\,. \tag{3.68}$$
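The Hamiltonian, the total momentum, and the number operator of the system are given by
$$H = H_1 + H_2\,, \qquad \mathbf P = \mathbf P_1 + \mathbf P_2\,, \qquad N = N_1 + N_2\,, \tag{3.69}$$
where
$$H_i = \int \frac{d^3p}{(2\pi)^3}\; \omega_{i,\mathbf p}\; a^\dagger_{i,\mathbf p}\, a_{i,\mathbf p}\,, \qquad \mathbf P_i = \int \frac{d^3p}{(2\pi)^3}\; \mathbf p\; a^\dagger_{i,\mathbf p}\, a_{i,\mathbf p}\,, \qquad N_i = \int \frac{d^3p}{(2\pi)^3}\; a^\dagger_{i,\mathbf p}\, a_{i,\mathbf p}\,, \tag{3.70}$$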
with $\omega_{i,\mathbf p} = (\mathbf p^2 + m_i^2)^{1/2}$. It should be clear that we can construct particle states in the same
fashion as we did with the Lagrangian of just a single real Klein-Gordon field. Products of $a^\dagger_{1,\mathbf p}$
operators acting on $|0\rangle$ create relativistic particles with mass $m_1$, while $a^\dagger_{2,\mathbf p}$ operators create
particles with mass $m_2$. E.g., the states
$$|S_1\rangle = a^\dagger_{1,\mathbf p}\, |0\rangle\,, \qquad |S_2\rangle = a^\dagger_{2,\mathbf p}\, |0\rangle\,, \tag{3.71}$$
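satisfy
$$H\, |S_i\rangle = \omega_{i,\mathbf p}\, |S_i\rangle\,, \qquad \mathbf P\, |S_i\rangle = \mathbf p\, |S_i\rangle\,, \qquad N\, |S_i\rangle = 1\, |S_i\rangle\,. \tag{3.72}$$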
These relations tell us that the states $|S_{1,2}\rangle$ are degenerate in the sense that they are
single-particle states with the same momentum $\mathbf p$. However, they can be distinguished by measuring
the energy of the particles as long as the masses $m_{1,2}$ are different (which we have assumed
for the time being).
Equal-Mass Case
Admittedly the case of two real Klein-Gordon fields with different masses $m_{1,2}$ is pretty boring.
Things get a little bit more interesting if we consider the special case $m_1 = m_2 = m$. Why?
Because in this case the system possesses an additional rotation symmetry in the space of fields
$\phi_{1,2}$. According to Noether's theorem this should lead to a new conserved charge. In order to
be able to identify the additional charge, we first write the Lagrangian (3.67) in a form that
exhibits the symmetry,
$$\mathcal{L} = \frac{1}{2}\, (\partial_\mu \Phi^T)(\partial^\mu \Phi) - \frac{1}{2}\, m^2\, \Phi^T \Phi\,. \tag{3.73}$$
Here we have introduced the field vector $\Phi = (\phi_1, \phi_2)^T$.
Obviously, the latter Lagrangian is invariant under the orthogonal transformations (O(2)
transformations or two-dimensional rotations),
$$\Phi \to \Phi' = R\, \Phi\,, \tag{3.74}$$
with $R^T = R^{-1}$. To calculate the conserved current, we again consider infinitesimal symmetry
transformations ($i, j = 1, 2$)
$$R_{ij} = \delta_{ij} + \theta_{ij} + O(\theta^2)\,. \tag{3.75}$$
The orthogonality of the matrix $R$,
$$\delta_{ij} + \theta_{ji} = R^T_{ij} = R^{-1}_{ij} = \delta_{ij} - \theta_{ij}\,, \tag{3.76}$$
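tells us that the matrix $\theta$ is antisymmetric. The infinitesimal transformation of the field $\phi_1$
under (3.74) is
$$\phi_1' = R_{1i}\, \phi_i = (\delta_{1i} + \theta_{1i})\, \phi_i = \phi_1 + \theta_{11}\, \phi_1 + \theta_{12}\, \phi_2 = \phi_1 + \theta_{12}\, \phi_2\,, \tag{3.77}$$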
which tells us that the variation of $\phi_1$ is
$$\delta\phi_1 = \theta_{12}\, \phi_2\,. \tag{3.78}$$
An analogous calculation gives
$$\delta\phi_2 = \theta_{21}\, \phi_1 = -\theta_{12}\, \phi_1\,. \tag{3.79}$$
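Knowing the variations $\delta\phi_{1,2}$ of the fields, the conserved current corresponding to (3.74) is
readily written down,
$$J^\mu = \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_i)}\, \delta\phi_i = \theta_{12} \left[ (\partial^\mu \phi_1)\, \phi_2 - (\partial^\mu \phi_2)\, \phi_1 \right], \tag{3.80}$$
so the conserved charge is
$$Q = \int d^3x \left( \dot\phi_1\, \phi_2 - \dot\phi_2\, \phi_1 \right). \tag{3.81}$$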
Substituting in the above expression the physical solutions (3.15) for the fields $\phi_{1,2}$, and
performing the integration over the space coordinates, one obtains (the actual computation is
part of an exercise)$^{13}$
$$Q = -i \int \frac{d^3p}{(2\pi)^3} \left( a_{1,\mathbf p}\, a^\dagger_{2,\mathbf p} - a_{2,\mathbf p}\, a^\dagger_{1,\mathbf p} \right), \tag{3.82}$$
which is a hermitian operator, i.e., it satisfies $Q^\dagger = Q$. There is an ambiguity worth noting
when applying Noether's theorem to find the conserved charge under the transformation (3.74).
Obviously, if $Q$ is conserved, then so is every other operator $c_1 Q + c_2$ with $c_{1,2}$ constant
numbers. The expression for $Q$ in (3.82) is therefore unique up to a multiplicative and an
additive constant. The ambiguity on the additive constant is removed when we remove the
contribution of the vacuum to the charge of particle states (as we have done for the energy).
The normal-ordered charge operator
$$:\!Q\!: \; = Q - \langle 0|Q|0\rangle\,, \tag{3.83}$$
is ambiguous only up to a multiplicative factor, which essentially denotes the units in which
we measure the charge of a state. Notice that we have already used this ambiguity in (3.81)
and simply ignored the factor $\theta_{12}$. In the following, we will use the normalization (3.83) of $Q$,
dropping as before the $:\,:$ to avoid unnecessary clutter.
$^{13}$ This expression has not been normal ordered.
So far so good. Next we would like to determine the spectrum of $Q$. This is most easily
done using the technique of ladder operators. We first define the following linear combinations,
$$a_{\pm,\mathbf p} = \frac{1}{\sqrt{2}} \left( a_{1,\mathbf p} \pm i\, a_{2,\mathbf p} \right), \tag{3.84}$$
of annihilation operators (an analogous definition holds for the hermitian conjugate operators).
It is left as a homework problem to show that these new operators satisfy the following
commutation relations
$$[Q, a_{\pm,\mathbf p}] = \mp\, a_{\pm,\mathbf p}\,, \qquad [Q, a^\dagger_{\pm,\mathbf p}] = \pm\, a^\dagger_{\pm,\mathbf p}\,. \tag{3.85}$$
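The latter relations imply that we can obtain states with charge $q \pm 1$ from a state $|S\rangle$ of
charge $q$, i.e., $Q\, |S\rangle = q\, |S\rangle$, by the action of $a^\dagger_{\pm,\mathbf p}$,
$$Q \left( a^\dagger_{\pm,\mathbf p}\, |S\rangle \right) = (q \pm 1) \left( a^\dagger_{\pm,\mathbf p}\, |S\rangle \right). \tag{3.86}$$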
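In other words the operators $a^\dagger_{\pm,\mathbf p}$ are ladder operators with respect to $Q$. Since $a^\dagger_{\pm,\mathbf p}$ are linear
combinations of $a^\dagger_{1,\mathbf p}$ and $a^\dagger_{2,\mathbf p}$, which are ladder operators for the Hamiltonian $H$ and the total
momentum operator $\mathbf P$, so are $a^\dagger_{\pm,\mathbf p}$.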
To find now all the common eigenstates of the charge operator $Q$, it is sufficient to start
from a single common eigenstate and then to act with $a^\dagger_{\pm,\mathbf p}$ on this state. It is not surprising
that the vacuum $|0\rangle$ is also an eigenstate of $Q$, namely the one with zero charge,$^{14}$
$$Q\, |0\rangle = 0\, |0\rangle = 0\,. \tag{3.87}$$
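Repeated application of the ladder operators,
$$|S_n^\pm\rangle = \prod_{i=1}^{n} a^\dagger_{\pm,\mathbf p_i}\, |0\rangle\,, \tag{3.88}$$
then creates $n$-particle states with positive ($|S_n^+\rangle$) and negative ($|S_n^-\rangle$) charge. Consequently,
one has
$$H\, |S_n^\pm\rangle = \left( \sum_{i=1}^{n} \omega_{\mathbf p_i} \right) |S_n^\pm\rangle\,, \qquad \mathbf P\, |S_n^\pm\rangle = \left( \sum_{i=1}^{n} \mathbf p_i \right) |S_n^\pm\rangle\,, \qquad N\, |S_n^\pm\rangle = n\, |S_n^\pm\rangle\,, \qquad Q\, |S_n^\pm\rangle = \pm n\, |S_n^\pm\rangle\,. \tag{3.89}$$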
$^{14}$ Notice that the normal ordering (3.83) of $Q$ plays an essential role here.
The main results of this subsection can be summarized as follows. The mass degeneracy of
the Klein-Gordon fields $\phi_{1,2}$ results in a new O(2) symmetry of the Lagrangian. This gives rise
to a new conserved quantity, the charge $Q$. A particle state is then characterized by its mass
(or equivalently its energy), its momentum, and its charge, which can be either positive or
negative. States with the same energy and momentum, but opposite charge, can be interpreted
as particles and antiparticles. Notice that for a single real Klein-Gordon field there is only a
single type of particle, since a real scalar particle is its own antiparticle.
3.5 Complex Klein-Gordon Field
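We can gain further insight into the theory by rewriting the Lagrangian (3.73) a little bit,
$$\mathcal{L} = (\partial_\mu \phi^\dagger)(\partial^\mu \phi) - m^2\, \phi^\dagger \phi\,, \tag{3.90}$$
where
$$\phi = \frac{1}{\sqrt{2}}\, (\phi_1 + i\, \phi_2)\,, \tag{3.91}$$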
denotes the complex Klein-Gordon field. We could now compute the Hamiltonian and momentum
operators directly in terms of $\phi$ and $\phi^\dagger$, arriving at the same expressions as in the
representation with two real fields (if you don't believe me you are free to check this yourself).
In order to compute the charge $Q$, we then need to identify the internal symmetry of the new
Lagrangian. In fact, it is easy to see that (3.90) is invariant under a field phase-redefinition,
aka a global U(1) transformation,
$$\phi \to \phi' = e^{i\alpha}\, \phi\,, \qquad \phi^\dagger \to (\phi^\dagger)' = e^{-i\alpha}\, \phi^\dagger\,. \tag{3.92}$$
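Notice that this transformation is the equivalent of the rotation symmetry transformation
(3.74) that we have found earlier, in the real field representation. We verify this by using the
explicit form of the matrix $R$ in terms of sine and cosine of the rotation angle $\alpha$,
$$\begin{pmatrix} \phi_1 \\ \phi_2 \end{pmatrix} \to \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} \phi_1 \\ \phi_2 \end{pmatrix}, \quad \Longrightarrow \quad \begin{pmatrix} \phi_1 \\ i\phi_2 \end{pmatrix} \to \begin{pmatrix} \cos\alpha & i\sin\alpha \\ i\sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} \phi_1 \\ i\phi_2 \end{pmatrix},$$
$$\Longrightarrow \quad \phi_1 + i\phi_2 \to e^{i\alpha}\, (\phi_1 + i\phi_2)\,, \quad \Longrightarrow \quad \phi \to e^{i\alpha}\, \phi\,. \tag{3.93}$$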
So why should we bother about the complex Klein-Gordon Lagrangian if (3.73) and (3.90)
are equivalent? The reason is that the complex field representation is more suggestive of the
fact that we have both particle and antiparticle states. To see this we rederive the expression
for the charge operator (3.82). The variations of the fields $\phi$ and $\phi^\dagger$ (treated as independent)
under (3.92) are
$$\delta\phi = i\alpha\, \phi\,, \qquad \delta\phi^\dagger = -i\alpha\, \phi^\dagger\,. \tag{3.94}$$
Now we can again use the machinery of Noether's theorem to calculate $Q$. I spare you the
details of this computation and simply quote the final result after normal ordering. One finds$^{15}$
$$Q = \int \frac{d^3p}{(2\pi)^3} \left( a^\dagger_{+,\mathbf p}\, a_{+,\mathbf p} - a^\dagger_{-,\mathbf p}\, a_{-,\mathbf p} \right) = N_+ - N_-\,, \tag{3.95}$$
$^{15}$ You can obtain this expression by simply reexpressing (3.82) in terms of $a_{\pm,\mathbf p}$ and $a^\dagger_{\pm,\mathbf p}$ using the inverse
of (3.84) and its hermitian conjugate analog.
where in the last step we have introduced the number operators
$$N_\pm = \int \frac{d^3p}{(2\pi)^3}\; a^\dagger_{\pm,\mathbf p}\, a_{\pm,\mathbf p}\,. \tag{3.96}$$
The expression (3.95) implies that $Q$ counts the number of antiparticles (created by $a^\dagger_{+,\mathbf p}$)
minus the number of particles (created by $a^\dagger_{-,\mathbf p}$). Since $[H, Q] = 0$, this difference is a conserved
quantity in our quantum theory. Of course, in a free field theory this isn't such a big deal
because both $N_+$ and $N_-$, i.e., the numbers of positively and negatively charged states, are
separately conserved. However, we will see soon that in interacting theories $Q$ survives as a
conserved quantity, while $N_\pm$ individually are not conserved.
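The step from the charge written in terms of $a_{1,2}$ to the form $N_+ - N_-$ is pure algebra and can be spelled out for a single momentum mode. The following worked equation is a sketch under stated assumptions: it takes $a_\pm = (a_1 \pm i a_2)/\sqrt{2}$, suppresses the momentum label, and uses $[a_i, a_j^\dagger] \propto \delta_{ij}$. With the opposite sign convention for $a_\pm$, the overall sign of $Q$ flips, which is part of the multiplicative ambiguity discussed in Section 3.4.

```latex
\begin{align*}
a_+^\dagger a_+ - a_-^\dagger a_-
  &= \tfrac{1}{2}\,(a_1^\dagger - i a_2^\dagger)(a_1 + i a_2)
   - \tfrac{1}{2}\,(a_1^\dagger + i a_2^\dagger)(a_1 - i a_2) \\
  &= i \left( a_1^\dagger a_2 - a_2^\dagger a_1 \right)
   = -i \,{:}\!\left( a_1 a_2^\dagger - a_2 a_1^\dagger \right)\!{:} \,, \\[4pt]
[\,N_+ - N_-,\; a_\pm^\dagger\,] &= \pm\, a_\pm^\dagger \,, \qquad
[\,N_+ - N_-,\; a_\pm\,] = \mp\, a_\pm \,.
\end{align*}
```

The first two lines use no commutators at all; only the last line relies on the canonical commutation relations, which also guarantee that $a_+$ and $a_-^\dagger$ commute.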
3.6 Heisenberg Picture
Although we started with a Lorentz-invariant Lagrangian, we slowly butchered it as we quantized
the theory, introducing a preferred time coordinate $t$. It's not at all obvious that the
theory is still Lorentz invariant after quantization. E.g., the various field operators $\phi(\mathbf x)$ we
met depend on space, but not on time. Yet, the one-particle states obey the Schrödinger
equation,
$$i\, \frac{d\, |\mathbf p(t)\rangle}{dt} = H\, |\mathbf p(t)\rangle\,, \tag{3.97}$$
which means that they evolve in time according to
$$|\mathbf p(t)\rangle = e^{-i E_{\mathbf p} t}\, |\mathbf p\rangle\,. \tag{3.98}$$
Things start to look better in the Heisenberg picture where the time dependence is assigned
to the operators $O$,
$$O_H = e^{iHt}\, O_S\, e^{-iHt}\,, \tag{3.99}$$
so that
$$\frac{dO_H}{dt} = \left( \frac{d}{dt}\, e^{iHt} \right) O_S\, e^{-iHt} + e^{iHt}\, O_S \left( \frac{d}{dt}\, e^{-iHt} \right) = iH\, e^{iHt}\, O_S\, e^{-iHt} + e^{iHt}\, O_S\, e^{-iHt}\, (-iH) = i\, [H, O_H]\,. \tag{3.100}$$
Here the subscripts $S$ and $H$ tell us whether the operator is in the Schrödinger or Heisenberg
picture. In QFT, we drop these subscripts and we will denote the picture by specifying whether
the fields depend on space $\phi(\mathbf x)$ (the Schrödinger picture) or space-time $\phi(t, \mathbf x) = \phi(x)$ (the
Heisenberg picture).
The operators in the two pictures agree at a fixed time, say, $t = 0$. The commutation
relations (3.2) become equal-time commutation relations in the Heisenberg picture. In the
case of the real Klein-Gordon theory (2.45),
$$[\phi(t, \mathbf x), \phi(t, \mathbf y)] = [\pi(t, \mathbf x), \pi(t, \mathbf y)] = 0\,, \qquad [\phi(t, \mathbf x), \pi(t, \mathbf y)] = i\, \delta^{(3)}(\mathbf x - \mathbf y)\,. \tag{3.101}$$
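Now that our operators depend on time, we can study how they evolve when the clock starts
ticking. For the field operator $\phi$, we have
$$\dot\phi(x) = i\, [H, \phi(x)] = \frac{i}{2} \left[ \int d^3y \left\{ \pi^2(y) + \big( \nabla\phi(y) \big)^2 + m^2 \phi^2(y) \right\}, \phi(x) \right] = i \int d^3y\; \pi(y)\, (-i)\, \delta^{(3)}(\mathbf x - \mathbf y) = \pi(x)\,. \tag{3.102}$$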
Similarly, we get for the conjugate operator $\pi$,
$$\dot\pi(x) = i\, [H, \pi(x)] = \frac{i}{2} \left[ \int d^3y \left\{ \pi^2(y) + \big( \nabla\phi(y) \big)^2 + m^2 \phi^2(y) \right\}, \pi(x) \right]$$
$$= \frac{i}{2} \int d^3y \left\{ \big( \nabla_y [\phi(y), \pi(x)] \big) \cdot \nabla\phi(y) + \nabla\phi(y) \cdot \nabla_y [\phi(y), \pi(x)] + 2i\, m^2\, \phi(y)\, \delta^{(3)}(\mathbf x - \mathbf y) \right\} = \left( \nabla^2 - m^2 \right) \phi(x)\,, \tag{3.103}$$
where we have included the subscript $y$ on $\nabla_y$ when there may be some confusion about
which argument the derivative is acting on. To reach the last line, we have simply integrated
by parts. Putting (3.102) and (3.103) together, we then find that $\phi$ satisfies (as one could have guessed)
the Klein-Gordon equation (2.46). Things start to look more relativistic.
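We can also write the Fourier expansion (3.15) of the field by using the definition of
Heisenberg operators (3.99). We first note that
$$(a_{\mathbf p})_H = e^{iHt}\, a_{\mathbf p}\, e^{-iHt} = \left( [e^{iHt}, a_{\mathbf p}] + a_{\mathbf p}\, e^{iHt} \right) e^{-iHt} = \left( e^{-i E_{\mathbf p} t}\, a_{\mathbf p}\, e^{iHt} - a_{\mathbf p}\, e^{iHt} + a_{\mathbf p}\, e^{iHt} \right) e^{-iHt} = e^{-i E_{\mathbf p} t}\, a_{\mathbf p}\,, \tag{3.104}$$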
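where we have applied repeatedly
$$H^n\, a_{\mathbf p} = a_{\mathbf p}\, (H - E_{\mathbf p})^n\,, \tag{3.105}$$
which holds for any $n$ and follows from the commutation relations (3.43), after expanding the
exponential in a power series (this step is actually not shown). A similar relation (with $-$
replaced by $+$) holds for $a^\dagger_{\mathbf p}$. In the case of $a^\dagger_{\mathbf p}$, we hence have
$$\left( a^\dagger_{\mathbf p} \right)_H = e^{iHt}\, a^\dagger_{\mathbf p}\, e^{-iHt} = e^{i E_{\mathbf p} t}\, a^\dagger_{\mathbf p}\,. \tag{3.106}$$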
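The relation (3.104) can be made concrete for a single mode. The sketch below assumes one harmonic-oscillator mode with normal-ordered Hamiltonian $H = \omega\, a^\dagger a$, truncated to a finite number of levels, in the basis where $H$ is diagonal. In that basis $e^{iHt}$ is a diagonal matrix of phases, so the Heisenberg-picture operator can be written down entry by entry. The function name and the numerical values are illustrative.

```python
import cmath
import math


def heisenberg_annihilator(omega, t, dim):
    """Matrix of e^{iHt} a e^{-iHt} for H = omega * a^dagger a on dim levels."""
    matrix = [[0j] * dim for _ in range(dim)]
    for n in range(1, dim):
        # The only nonzero entries of a are <n-1| a |n> = sqrt(n). Since H is
        # diagonal with eigenvalues omega * n, the evolution multiplies this
        # entry by exp(i omega (n-1) t) from the left and exp(-i omega n t)
        # from the right.
        left = cmath.exp(1j * omega * (n - 1) * t)
        right = cmath.exp(-1j * omega * n * t)
        matrix[n - 1][n] = left * math.sqrt(n) * right
    return matrix


omega, t, dim = 1.3, 0.7, 6
evolved = heisenberg_annihilator(omega, t, dim)
expected_phase = cmath.exp(-1j * omega * t)  # the phase in (3.104)
print(evolved[0][1], expected_phase)
```

Every entry picks up the same phase $e^{-i\omega t}$, independent of the level $n$, which is why the whole operator simply rotates as in (3.104).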
Using (3.104) and (3.106) then gives
$$\phi(x) = \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{\sqrt{2 E_{\mathbf p}}} \left( a_{\mathbf p}\, e^{-ipx} + a^\dagger_{\mathbf p}\, e^{ipx} \right), \tag{3.107}$$
which looks pretty much like (3.15) except that the exponentials are now written in terms
of 4-vectors, $px = E_{\mathbf p}\, t - \mathbf p \cdot \mathbf x$. Note also that the sign has flipped in the exponent due to
the Minkowski metric. It's a simple exercise to check that (3.107) indeed satisfies the
Klein-Gordon equation (2.45), and is therefore left as homework. For completeness let me also
give the result for the conjugate field in the Heisenberg picture. One finds
$$\pi(x) = \int \frac{d^3p}{(2\pi)^3}\, (-i)\, \sqrt{\frac{E_{\mathbf p}}{2}} \left( a_{\mathbf p}\, e^{-ipx} - a^\dagger_{\mathbf p}\, e^{ipx} \right), \tag{3.108}$$
as you might have guessed immediately from looking at (3.15) and (3.107).
The equation (3.107) makes explicit the dual particle and wave interpretations of the
quantum field $\phi$. On the one hand, $\phi$ is written as an operator, which creates and destroys
the particles that are the quanta of field excitation. On the other hand, $\phi$ is written as a
linear combination of solutions (the exponentials) of the Klein-Gordon equation. Both signs
of the time dependence, i.e., $\mp i p^0 t$ with $p^0 > 0$, appear in the exponential. If these were
single-particle wavefunctions, they would correspond to states of positive and negative energy.
Let us refer to them more generally as positive- and negative-frequency modes. The connection
between the particle-creation operators and the waveforms displayed here is always valid for
free quantum fields. A positive-frequency solution of the field equation has as its coefficient
the operator that destroys a particle in that single-particle wavefunction, while a
negative-frequency solution of the field equation (being the hermitian conjugate of a positive-frequency
solution) has as its coefficient the operator that creates a particle in that positive-energy
single-particle wavefunction. In this way, the fact that relativistic wave equations have both
positive- and negative-frequency solutions is reconciled with the requirement that a sensible
quantum theory should contain only positive excitation energies.
Causality
It looks like we are approaching something Lorentz invariant in the Heisenberg picture, where
the field operator $\phi$ satisfies the Klein-Gordon equation. Yet, there is still a hint of
non-Lorentz invariance because $\phi$ and $\pi$ satisfy the equal-time commutation relations (3.101). The
question that we thus have to address is, what happens for arbitrary space-time separations? In
particular, for our theory to be causal, we must require that all space-like separated operators
commute,
$$[O_1(x), O_2(y)] = 0\,, \qquad (x - y)^2 < 0\,. \tag{3.109}$$
This ensures that a measurement at $x$ cannot affect a measurement at $y$, when $x$ and $y$ are not
causally connected (outside the light-cone). A graphical representation of the latter equation
is given in Figure 3.1.
Does our theory satisfy the requirement (3.109)? To answer this question, we first define
$$\Delta(x - y) = [\phi(x), \phi(y)]\,. \tag{3.110}$$
[Figure 3.1: Picture of space-like separated operators $O_1(x)$ and $O_2(y)$. Axes: $x$ and $t$.]
While the objects of the right-hand side are operators, it is seen (after a short calculation)
that the left-hand side is simply a complex number,
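$$\Delta(x - y) = \left[ \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{\sqrt{2 E_{\mathbf p}}} \left( a_{\mathbf p}\, e^{-ipx} + a^\dagger_{\mathbf p}\, e^{ipx} \right), \int \frac{d^3q}{(2\pi)^3}\, \frac{1}{\sqrt{2 E_{\mathbf q}}} \left( a_{\mathbf q}\, e^{-iqy} + a^\dagger_{\mathbf q}\, e^{iqy} \right) \right]$$
$$= \int \frac{d^3p\; d^3q}{(2\pi)^6}\, \frac{1}{2 \sqrt{E_{\mathbf p} E_{\mathbf q}}} \left( [a_{\mathbf p}, a^\dagger_{\mathbf q}]\, e^{-ipx + iqy} + [a^\dagger_{\mathbf p}, a_{\mathbf q}]\, e^{ipx - iqy} \right)$$
$$= \int \frac{d^3p\; d^3q}{(2\pi)^6}\, \frac{1}{2 \sqrt{E_{\mathbf p} E_{\mathbf q}}}\, (2\pi)^3\, \delta^{(3)}(\mathbf p - \mathbf q) \left( e^{-ipx + iqy} - e^{ipx - iqy} \right)$$
$$= \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2 E_{\mathbf p}} \left( e^{-ip(x - y)} - e^{ip(x - y)} \right). \tag{3.111}$$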
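So what do we know about $\Delta(x - y)$? First of all, it is Lorentz invariant thanks to the
Lorentz-invariant measure $\int d^3p/(2 E_{\mathbf p})$ that we have introduced in (3.61). Second, it does
not vanish for time-like separations. E.g., taking $x = (t, 0, 0, 0)$ and $y = (0, 0, 0, 0)$ gives
$[\phi(x), \phi(y)] \sim \exp(-imt) - \exp(imt)$, where the $\exp(-imt)$ term arises from
$$\int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2 E_{\mathbf p}}\, e^{-i E_{\mathbf p} t} = \frac{4\pi}{(2\pi)^3} \int_0^\infty dp\; \frac{p^2}{2 \sqrt{p^2 + m^2}}\; e^{-it \sqrt{p^2 + m^2}}$$
$$= \frac{1}{4\pi^2} \int_m^\infty dE\; \sqrt{E^2 - m^2}\; e^{-iEt} = \frac{m}{8\pi t} \left[ Y_1(mt) + i J_1(mt) \right] \underset{t \to \infty}{\sim} e^{-imt}\,, \tag{3.112}$$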
where to arrive at the second line, we have simply changed variables $p \to (E^2 - m^2)^{1/2}$. In
order to obtain the final answer, one only needs to know that the Bessel functions of first and
second kind, $J_1(x)$ and $Y_1(x)$, behave like $J_1(x) \sim (\sin x - \cos x)/\sqrt{\pi x}$ and
$Y_1(x) \sim -(\sin x + \cos x)/\sqrt{\pi x}$ in the relevant limit $x \to \infty$. An analogous calculation
gives the $\exp(imt)$ term. Third, it vanishes
for space-like separations. This follows by realizing that $\Delta(x - y) = 0$ at equal times for all
$(x - y)^2 = -(\mathbf x - \mathbf y)^2 < 0$, which can be seen explicitly by writing
$$[\phi(t, \mathbf x), \phi(t, \mathbf y)] = \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2 \sqrt{\mathbf p^2 + m^2}} \left( e^{i \mathbf p \cdot (\mathbf x - \mathbf y)} - e^{-i \mathbf p \cdot (\mathbf x - \mathbf y)} \right) = 0\,. \tag{3.113}$$
Notice that in order to arrive at the final result, we have flipped the sign of $\mathbf p$ in the second
exponent. This obviously does not change the result since $\mathbf p$ is an integration variable and
$(\mathbf p^2 + m^2)^{1/2}$ is invariant under such a change. But since $\Delta(x - y)$ is Lorentz invariant, it can
only be a function of $(x - y)^2$ and must hence vanish for all $(x - y)^2 < 0$.
Taken together the above findings imply that the real Klein-Gordon theory is indeed causal
with commutators vanishing outside the light-cone. This property will continue to hold in the
interacting theory. Indeed, it is usually given as one of the axioms of local QFTs. Let me
mention, however, that the fact that $[\phi(x), \phi(y)]$ is a complex function, rather than an operator,
is a property of free fields only and does not hold in an interacting theory.
3.7 Klein-Gordon Correlators
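The causal structure of the real Klein-Gordon theory (2.45) can also be probed in a different
way. Let's create a particle at the space-time point $y$. What is the amplitude to find it at
point $x$? This question can be answered by calculating
$$D(x - y) = \langle 0|\, \phi(x)\, \phi(y)\, |0\rangle = \int \frac{d^3p\; d^3q}{(2\pi)^6}\, \frac{1}{2 \sqrt{E_{\mathbf p} E_{\mathbf q}}}\, \langle 0|\, a_{\mathbf p}\, a^\dagger_{\mathbf q}\, |0\rangle\; e^{-ipx + iqy}$$
$$= \int \frac{d^3p\; d^3q}{(2\pi)^6}\, \frac{1}{2 \sqrt{E_{\mathbf p} E_{\mathbf q}}}\, \langle 0|\, [a_{\mathbf p}, a^\dagger_{\mathbf q}]\, |0\rangle\; e^{-ipx + iqy}$$
$$= \int \frac{d^3p\; d^3q}{(2\pi)^6}\, \frac{1}{2 \sqrt{E_{\mathbf p} E_{\mathbf q}}}\, (2\pi)^3\, \delta^{(3)}(\mathbf p - \mathbf q)\; e^{-ipx + iqy} = \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2 E_{\mathbf p}}\; e^{-ip(x - y)}\,. \tag{3.114}$$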
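The function $D(x - y)$ is called propagator and is a Lorentz-invariant 3-momentum integral.
Let us now evaluate (3.114) for purely space-like separations, i.e., $x - y = (0, \mathbf r)$.$^{16}$ The
propagator is then
$$D(x - y) = \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2 E_{\mathbf p}}\; e^{i \mathbf p \cdot \mathbf r} = \frac{2\pi}{(2\pi)^3} \int_0^\infty dp\; \frac{p^2}{2 E_{\mathbf p}}\, \frac{e^{ipr} - e^{-ipr}}{ipr} = \frac{-i}{2 (2\pi)^2\, r} \int_{-\infty}^{\infty} dp\; \frac{p\; e^{ipr}}{\sqrt{p^2 + m^2}}\,. \tag{3.115}$$
$^{16}$ Notice that for purely time-like separations one would obtain the result (3.112).
[Figure 3.2: Branch cuts of the propagator $D(x - y)$ for a space-like transition. Axes: Re $p$ and Im $p$; branch points at $p = \pm im$.]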
Here we have first introduced spherical coordinates, then performed the integration over the
azimuthal and polar angles, and finally changed variables in the second term from $p \to -p$ in
order to combine the result into one term. The integrand in (3.115), considered as a complex
function of $p$, has branch cuts on the imaginary axis starting at $\pm im$. In order to evaluate the
integral we push the contour up to wrap around the upper branch cut. The chosen integration
contour is shown in Figure 3.2. Defining $\rho = -ip$, we then recast (3.115) into
$$D(x - y) = \frac{1}{4\pi^2 r} \int_m^\infty d\rho\; \frac{\rho\; e^{-\rho r}}{\sqrt{\rho^2 - m^2}} = \frac{1}{4\pi^2 r}\; m K_1(mr) \underset{r \to \infty}{\sim} e^{-mr}\,, \tag{3.116}$$
where the modified Bessel function $K_1(x)$ scales like $K_1(x) = \left[ \sqrt{\pi/(2x)} + O(x^{-3/2}) \right] e^{-x}$ in the
limit of $x \to \infty$. The latter equation tells us that the propagator $D(x - y)$ decays exponentially
quickly outside the light-cone but, nonetheless, it is non-vanishing. The quantum field appears
to leak out of the causal region. Yet, we have just seen in (3.113) that space-like measurements
commute and the theory is causal. How do we reconcile these two facts?
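The exponential fall-off in (3.116) can be reproduced numerically without any special-function library. The sketch below rescales $\rho = m s$ and substitutes $s = \cosh u$, which turns the integral in (3.116) into $\int_0^\infty du\, \cosh u\; e^{-x \cosh u}$ with $x = mr$, the standard integral representation of $K_1(x)$. The integral is evaluated with Simpson's rule and compared with the large-$x$ form quoted in the text, extended by the next term $3/(8x)$ of the asymptotic series. The cut-off `u_max`, the number of steps and the function names are assumptions of this sketch.

```python
import math


def spacelike_integral(x, u_max=8.0, steps=4000):
    """Simpson estimate of int_0^inf du cosh(u) exp(-x cosh(u)), i.e. K_1(x)."""
    h = u_max / steps  # steps must be even for Simpson's rule

    def integrand(u):
        c = math.cosh(u)
        return c * math.exp(-x * c)

    total = integrand(0.0) + integrand(u_max)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3.0


def asymptotic_k1(x):
    """Large-x form sqrt(pi / (2x)) exp(-x) (1 + 3 / (8x))."""
    return math.sqrt(math.pi / (2.0 * x)) * math.exp(-x) * (1.0 + 3.0 / (8.0 * x))


def propagator(r, mass=1.0):
    """D(x - y) = m K_1(m r) / (4 pi^2 r) for a space-like distance r, cf. (3.116)."""
    return mass * spacelike_integral(mass * r) / (4.0 * math.pi**2 * r)


for r in (1.0, 5.0, 10.0):
    print(r, propagator(r), spacelike_integral(r) / asymptotic_k1(r))
```

The propagator is positive and small but non-zero at every space-like distance, and the ratio to the asymptotic form approaches one as $mr$ grows.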
We get a first clue of how this puzzle is resolved by realizing that the relation (3.113),
expressed in terms of propagators, takes the form
$$\Delta(x - y) = [\phi(x), \phi(y)] = D(x - y) - D(y - x) = 0\,. \tag{3.117}$$
What is the physical meaning of this result? It simply means that for $(x - y)^2 < 0$, there is no
Lorentz-invariant way to order events. If a particle can travel in a space-like direction from $x$
to $y$, it can just as easily travel from $y$ to $x$.$^{17}$ In any measurement, the amplitudes for these
two possible events cancel, so that the underlying QFT is causal.
$^{17}$ When $x - y$ is space-like, a continuous Lorentz transformation can take $x - y$ to $y - x$.
Another way to think about the cancellation of the two contributions in (3.117) is in terms
of amplitudes of particles and antiparticles. Let us first consider the case of a complex scalar
field. If we look at the equation $[\phi(x), \phi^\dagger(y)] = 0$ outside the light-cone, the physical
interpretation of (3.117) (or better its analog) is that the amplitude for the particle to propagate
from $x$ to $y$ cancels the amplitude for the antiparticle to travel from $y$ to $x$. In fact, this
interpretation also applies (maybe in a less obvious way) to the case of the real scalar field,
because the particle is then its own antiparticle.
Green's Functions
In fact, the statements made after (3.117) can be put on mathematically solid grounds. Let's
see how this goes. We start by considering the amplitude
$$\langle 0|\, [\phi(x), \phi(y)]\, |0\rangle = \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2 E_{\mathbf p}} \left( e^{-ip(x - y)} - e^{ip(x - y)} \right), \tag{3.118}$$
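and assume for now that $x^0 > y^0$. In this case we can rewrite the 3-momentum integral on
the right-hand side of (3.118) as a 4-momentum integral,
$$\langle 0|\, [\phi(x), \phi(y)]\, |0\rangle = \int \frac{d^3p}{(2\pi)^3} \left\{ \frac{1}{2 E_{\mathbf p}}\; e^{-ip(x - y)}\, \Big|_{p^0 = E_{\mathbf p}} + \frac{1}{-2 E_{\mathbf p}}\; e^{-ip(x - y)}\, \Big|_{p^0 = -E_{\mathbf p}} \right\}$$
$$\underset{x^0 > y^0}{=} \int \frac{d^3p}{(2\pi)^3} \int \frac{dp^0}{2\pi i}\, \frac{-1}{p^2 - m^2}\; e^{-ip(x - y)} = \int \frac{d^4p}{(2\pi)^4}\, \frac{i}{p^2 - m^2}\; e^{-ip(x - y)}\,. \tag{3.119}$$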
Notice that this is the first time in this course that we have integrated over 4-momentum.
Until now, we integrated only over 3-momentum, with $p^0$ fixed by the mass-shell condition to
be $p^0 = E_{\mathbf p}$.
Barring possible typos, the calculation in (3.119) is certainly correct, but this fact might not
be obvious to everybody in the audience right away. So let me do some reverse engineering.
First notice that the denominator in the last line of (3.119) can be written as
$$p^2 - m^2 = (p^0)^2 - \mathbf p^2 - m^2 = (p^0)^2 - E_{\mathbf p}^2 = (p^0 - E_{\mathbf p})(p^0 + E_{\mathbf p})\,, \tag{3.120}$$
which implies that, for each value of $\mathbf p$, the denominator produces a pole in the integrand at
$p^0 = \pm E_{\mathbf p} = \pm(\mathbf p^2 + m^2)^{1/2}$. The 4-momentum integration is hence ill-defined and we need
a prescription for avoiding the singularities on the real $p^0$-axis. How do we have to choose
the integration contour in order to arrive at (3.119)? It is not difficult to see that in the
case $x^0 > y^0$ the contour has to be chosen as shown in Figure 3.3. Notice that closing the
contour in the lower half-plane, where $p^0 \to -i\infty$, ensures that the integrand vanishes since
$\exp\left( -i p^0 (x^0 - y^0) \right) \to 0$. The integral over $p^0$ then picks up the residues at $p^0 = \pm E_{\mathbf p}$, which
contribute $-2\pi i/(p^0 \pm E_{\mathbf p})\big|_{p^0 = \pm E_{\mathbf p}} = -2\pi i/(\pm 2 E_{\mathbf p})$, where the relative minus sign arises because we
take a clockwise contour. Combining these elements shows that the calculation that led to the
final result in (3.119) is in fact correct.
[Figure 3.3: Integration contour for the retarded Green's function $D_R(x - y)$. Axes: Re $p^0$ and Im $p^0$; poles at $p^0 = -E_{\mathbf p}$ and $p^0 = +E_{\mathbf p}$; the contour is closed at $p^0 \to -i\infty$.]
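In the following, we will call the last line of (3.119) together with the prescription for going
around the poles the retarded Green's function,
$$D_R(x - y) = \theta(x^0 - y^0)\, \langle 0|\, [\phi(x), \phi(y)]\, |0\rangle = \theta(x^0 - y^0) \left[ D(x - y) - D(y - x) \right], \tag{3.121}$$
where the Heaviside step function $\theta(x)$ is defined as $\theta(x) = 0$ for $x < 0$ and $\theta(x) = 1$ for $x > 0$.
It seldom matters what value is used for $\theta(0)$, since $\theta(x)$ is mostly used as a distribution (in
the half-maximum convention one has $\theta(0) = 1/2$). The name retarded Green's function is in
fact the correct one for $D_R(x - y)$, since this mathematical object obeys$^{18}$
$$\left( \partial^2 + m^2 \right) D_R(x - y) = \left( \partial^2 \theta(x^0 - y^0) \right) \langle 0|\, [\phi(x), \phi(y)]\, |0\rangle + 2 \left( \partial_\mu \theta(x^0 - y^0) \right) \left( \partial^\mu \langle 0|\, [\phi(x), \phi(y)]\, |0\rangle \right) + \theta(x^0 - y^0) \left( \partial^2 + m^2 \right) \langle 0|\, [\phi(x), \phi(y)]\, |0\rangle$$
$$= -\delta(x^0 - y^0)\, \langle 0|\, [\pi(x), \phi(y)]\, |0\rangle + 2\, \delta(x^0 - y^0)\, \langle 0|\, [\pi(x), \phi(y)]\, |0\rangle + 0$$
$$= -i\, \delta^{(4)}(x - y)\,, \tag{3.122}$$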
and vanishes for $x^0 < y^0$ by definition. Here all derivatives are understood with respect to
$x$. In order to obtain the second line we have used the two relations $\partial_x \theta(x) = \delta(x)$ and
$\left( \partial_x^2 \theta(x) \right) f(x) = -\delta(x) \left( \partial_x f(x) \right)$, the latter of which is shown easily by partial integration, and
paid tribute to the fact that $\phi(x)$ obeys the Klein-Gordon equation. The last line then follows
by employing the second equal-time commutation relation in (3.101).
The retarded Green's function is useful in classical field theory if we know the initial value of some field configuration and want to figure out what it evolves into in the presence of a source, meaning that we want to know the solution to the inhomogeneous Klein-Gordon equation, $(\partial^2+m^2)\,\phi(x) = J(x)$, for some fixed background function $J(x)$ acting as a static source. Similarly, one can define the advanced Green's function $D_A(x-y)$, which vanishes when $x^0 > y^0$ and is useful if we know the end point of a field configuration and want to figure out where it came from. The integration contour corresponding to the advanced Green's function is shown in Figure 3.4. You will get more familiar with the advanced Green's function in an exercise.

$^{18}$ Notice that the same result is obtained by applying the differential operator $(\partial^2+m^2)$ directly to the expression in the last line of (3.119).

Figure 3.4: Integration contour for the advanced Green's function $D_A(x-y)$ in the complex $p^0$-plane (axes Re $p^0$ and Im $p^0$, poles at $p^0 = -E_{\mathbf{p}}$ and $p^0 = +E_{\mathbf{p}}$, contour closed at $p^0 \to +i\infty$).
Feynman Propagator
In fact, the most important quantity in interacting field theory is neither the retarded nor the advanced Green's function but the Feynman propagator,
\[
D_F(x-y) = \langle 0|T\phi(x)\phi(y)|0\rangle = \theta(x^0-y^0)\,D(x-y) + \theta(y^0-x^0)\,D(y-x) , \tag{3.123}
\]
where $T$ stands for time ordering, i.e., placing all operators evaluated at later times to the left, so that, e.g.,
\[
T\phi(x)\phi(y) = \theta(x^0-y^0)\,\phi(x)\phi(y) + \theta(y^0-x^0)\,\phi(y)\phi(x) . \tag{3.124}
\]
Given the similarity of (3.118) and (3.123), it does not come as a surprise that the Feynman propagator can be written as
\[
D_F(x-y) = \int\!\frac{d^4p}{(2\pi)^4}\,\frac{i}{p^2-m^2}\,e^{-ip(x-y)} . \tag{3.125}
\]
Again we distinguish the cases $x^0 > y^0$ and $y^0 > x^0$. In the former case, we perform the $p^0$ integration following the contour shown in Figure 3.5, which encloses the pole at $p^0 = +E_{\mathbf{p}}$ with residuum $-2\pi i/(2E_{\mathbf{p}})$, where the minus sign arises again since the path has a clockwise orientation. Consequently, one obtains
\[
D_F(x-y) = \int\!\frac{d^3p}{(2\pi)^4}\,\frac{-2\pi i}{2E_{\mathbf{p}}}\; i\,e^{-iE_{\mathbf{p}}(x^0-y^0)+i\mathbf{p}\cdot(\mathbf{x}-\mathbf{y})}
= \int\!\frac{d^3p}{(2\pi)^3}\,\frac{1}{2E_{\mathbf{p}}}\,e^{-ip(x-y)} = D(x-y) . \tag{3.126}
\]
In contrast, in the case $y^0 > x^0$ one finds
\[
\begin{aligned}
D_F(x-y) &= \int\!\frac{d^3p}{(2\pi)^4}\,\frac{2\pi i}{(-2E_{\mathbf{p}})}\; i\,e^{iE_{\mathbf{p}}(x^0-y^0)+i\mathbf{p}\cdot(\mathbf{x}-\mathbf{y})} \\
&= \int\!\frac{d^3p}{(2\pi)^3}\,\frac{1}{2E_{\mathbf{p}}}\,e^{-iE_{\mathbf{p}}(y^0-x^0)-i\mathbf{p}\cdot(\mathbf{y}-\mathbf{x})} \\
&= \int\!\frac{d^3p}{(2\pi)^3}\,\frac{1}{2E_{\mathbf{p}}}\,e^{-ip(y-x)} = D(y-x) ,
\end{aligned}
\tag{3.127}
\]
where the integration contour is chosen as in Figure 3.5, but the path is closed in the upper half-plane, so that it encloses the pole at $p^0 = -E_{\mathbf{p}}$ (due to the counter-clockwise orientation of the half-circle the residuum does not pick up a minus sign). To go from the second line in (3.127) to the third, we have flipped the sign of $\mathbf{p}$, which is valid since we integrate over $d^3p$ and all other quantities depend only on $\mathbf{p}^2$. Taken together, the latter two relations prove the equality of (3.123) and (3.125).

Figure 3.5: Integration contour for the Feynman propagator $D_F(x-y)$ for $x^0 > y^0$ in the complex $p^0$-plane (axes Re $p^0$ and Im $p^0$, poles at $p^0 = -E_{\mathbf{p}}$ and $p^0 = +E_{\mathbf{p}}$, contour closed at $p^0 \to -i\infty$). In the case $y^0 > x^0$, the integration contour is closed in the upper half-plane.
Like $D_R(x-y)$ and $D_A(x-y)$, the Feynman propagator is also a Green's function of the Klein-Gordon equation,
\[
(\partial^2+m^2)\,D_F(x-y) = \int\!\frac{d^4p}{(2\pi)^4}\,\frac{i}{p^2-m^2}\,(-p^2+m^2)\,e^{-ip(x-y)}
= -i\int\!\frac{d^4p}{(2\pi)^4}\,e^{-ip(x-y)} = -i\,\delta^{(4)}(x-y) . \tag{3.128}
\]
Notice that instead of specifying the contour, we may write the Feynman propagator as
\[
D_F(x-y) = \int\!\frac{d^4p}{(2\pi)^4}\,\frac{i}{p^2-m^2+i\epsilon}\,e^{-ip(x-y)} , \tag{3.129}
\]
with $\epsilon > 0$ and infinitesimal. As shown in Figure 3.6, this has the effect of shifting the poles slightly off the real $p^0$-axis, so that the integration along this axis is equivalent to the integration contour displayed in Figure 3.5. This way of writing $D_F(x-y)$ is, for obvious reasons, called the $i\epsilon$ prescription.

Figure 3.6: Schematic picture of the $i\epsilon$ prescription for $x^0 > y^0$ in the complex $p^0$-plane (axes Re $p^0$ and Im $p^0$, poles shifted to $p^0 = -E_{\mathbf{p}} + i\epsilon$ and $p^0 = +E_{\mathbf{p}} - i\epsilon$, contour closed at $p^0 \to -i\infty$). In the case $y^0 > x^0$, the integration contour is closed in the upper half-plane.
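The statement that the $i\epsilon$ prescription reproduces the contour of Figure 3.5 can be checked numerically. The following is a minimal sketch, not part of the original notes: it keeps only the $p^0$ integral at fixed spatial momentum, uses a small but finite $\epsilon$, and compares the result with the residue-theorem value $e^{-iE_\epsilon|t|}/(2E_\epsilon)$, where $E_\epsilon = \sqrt{E^2 - i\epsilon}$. The function names and the values of `energy`, `eps`, `cutoff` and `step` are illustrative assumptions.

```python
import numpy as np

def p0_integral(t, energy=1.0, eps=0.05, cutoff=400.0, step=0.002):
    """Integrate i/(p0^2 - E^2 + i*eps) * exp(-i*p0*t) over real p0, divided by 2*pi."""
    p0 = np.arange(-cutoff, cutoff + step, step)
    integrand = 1j / (p0**2 - energy**2 + 1j * eps) * np.exp(-1j * p0 * t)
    # Trapezoidal rule written out by hand to stay independent of the NumPy version.
    total = (integrand.sum() - 0.5 * (integrand[0] + integrand[-1])) * step
    return total / (2.0 * np.pi)

def residue_result(t, energy=1.0, eps=0.05):
    """Value expected from closing the contour: exp(-i*E_eps*|t|) / (2*E_eps)."""
    e_eps = np.sqrt(energy**2 - 1j * eps)  # principal root: Re > 0, Im < 0
    return np.exp(-1j * e_eps * abs(t)) / (2.0 * e_eps)

# t plays the role of x^0 - y^0; both signs are covered by the same formula.
for t in (1.5, -1.5):
    print(t, p0_integral(t), residue_result(t))
```

For $\epsilon \to 0$ the right-hand side tends to $e^{-iE_{\mathbf{p}}|x^0-y^0|}/(2E_{\mathbf{p}})$, which is the $p^0$-integrated factor appearing in (3.126) for $x^0 > y^0$ and in (3.127) for $y^0 > x^0$.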
3.8 Non-Relativistic Limit
In order to study the non-relativistic limit of our theory, we return to the classical complex Klein-Gordon field (for reasons that will become clear later on). We decompose it as$^{19}$
\[
\varphi(x) = e^{-imt}\,\tilde\varphi(x) , \tag{3.130}
\]
to single out the large kinematical part of the momentum of $\varphi$. In terms of the new field $\tilde\varphi$, the Klein-Gordon equation reads
\[
(\partial^2+m^2)\,\varphi = (\partial_t^2 - \nabla^2 + m^2)\,e^{-imt}\,\tilde\varphi
= e^{-imt}\,\big(\ddot{\tilde\varphi} - 2im\,\dot{\tilde\varphi} - \nabla^2\tilde\varphi\big) = 0 , \tag{3.131}
\]
where the explicit $m^2$ term cancelled against the term generated by the two time derivatives acting on the exponential. The non-relativistic limit is $m \gg |\mathbf{p}|$, which after a Fourier transform is equivalent to saying that $|\ddot{\tilde\varphi}| \ll m\,|\dot{\tilde\varphi}|$. We are hence allowed to neglect the $\ddot{\tilde\varphi}$ term in (3.131), so that the Klein-Gordon equation in the limit $m \to \infty$ becomes
\[
i\,\frac{d}{dt}\,\tilde\varphi = -\frac{1}{2m}\,\nabla^2\tilde\varphi . \tag{3.132}
\]
This looks very similar to the Schrödinger equation for a non-relativistic free particle of mass $m$, except that it does not have any probability interpretation. It is simply a classical field evolving through an equation that is first order in time derivatives.

$^{19}$ The exponential factor removes the large frequency part from the $x$-dependence in $\varphi$. Consequently, the $x$-dependence of $\tilde\varphi$ is only governed by the small residual momentum, and derivatives acting on $\tilde\varphi$ are suppressed by powers of $1/m$. This way of decomposing a field is often the starting point for the construction of an effective field theory that entails the physics of the full theory in the kinematical limit $m \gg |\mathbf{p}|$. The most well-known example of such a theory in particle physics is heavy quark effective theory.
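It is also worthwhile to consider the Lagrangian of the complex scalar field itself and to investigate what happens to (3.90) in the non-relativistic limit. We again take the limit $|\ddot{\tilde\varphi}| \ll m\,|\dot{\tilde\varphi}|$, and obtain after a straightforward calculation (where in the last step we have divided by $2m$),
\[
\mathcal{L} = i\,\tilde\varphi^{*}\,\dot{\tilde\varphi} - \frac{1}{2m}\,(\nabla\tilde\varphi^{*})\cdot(\nabla\tilde\varphi) . \tag{3.133}
\]
This Lagrangian has a conserved current related to its invariance under the global phase transformation $\tilde\varphi \to e^{i\alpha}\tilde\varphi$. Employing Noether's theorem (2.17), we find that the conserved current takes the form
\[
J^\mu = \Big(\tilde\varphi^{*}\tilde\varphi \,,\; \frac{i}{2m}\,\big[\tilde\varphi\,\nabla\tilde\varphi^{*} - \tilde\varphi^{*}\,\nabla\tilde\varphi\big]\Big) . \tag{3.134}
\]
To get the Hamiltonian we compute the conjugate momentum
\[
\pi = \frac{\partial\mathcal{L}}{\partial\dot{\tilde\varphi}} = i\,\tilde\varphi^{*} , \tag{3.135}
\]
which does not contain a time derivative. This looks a little disconcerting, but it is fully consistent for a theory which is first order in time derivatives. In order to determine the full trajectory of the field, we only need to specify initial conditions for $\tilde\varphi$ and $\tilde\varphi^{*}$ at some point in time, say $t = 0$ (knowing the time derivatives on the initial slice is not necessary).
Since the Lagrangian (3.133) already contains a term $p\,\dot q$ (and not the usual $\tfrac{1}{2}\,p\,\dot q$), the time derivatives drop out when one computes the Hamiltonian,
\[
\mathcal{H} = \frac{1}{2m}\,(\nabla\tilde\varphi^{*})\cdot(\nabla\tilde\varphi) . \tag{3.136}
\]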
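In order to quantize the system, we impose in the Schrödinger picture,
\[
[\tilde\varphi(\mathbf{x}),\tilde\varphi(\mathbf{y})] = [\tilde\varphi^\dagger(\mathbf{x}),\tilde\varphi^\dagger(\mathbf{y})] = 0 , \qquad
[\tilde\varphi(\mathbf{x}),\tilde\varphi^\dagger(\mathbf{y})] = \delta^{(3)}(\mathbf{x}-\mathbf{y}) , \tag{3.137}
\]
and expand the field into its Fourier components,
\[
\tilde\varphi(\mathbf{x}) = \int\!\frac{d^3p}{(2\pi)^3}\,a_{\mathbf{p}}\,e^{i\mathbf{p}\cdot\mathbf{x}} . \tag{3.138}
\]
Inserting this into the commutation relations (3.137) leads to
\[
[a_{\mathbf{p}}, a^\dagger_{\mathbf{q}}] = (2\pi)^3\,\delta^{(3)}(\mathbf{p}-\mathbf{q}) , \tag{3.139}
\]
where the trivial expressions have been skipped. As usual the vacuum satisfies $a_{\mathbf{p}}|0\rangle = 0$, and the excitations are $a^\dagger_{\mathbf{p}_1}\cdots a^\dagger_{\mathbf{p}_n}|0\rangle$. The one-particle states $|\mathbf{p}\rangle = a^\dagger_{\mathbf{p}}|0\rangle$ have energy
\[
H\,|\mathbf{p}\rangle = \frac{\mathbf{p}^2}{2m}\,|\mathbf{p}\rangle , \tag{3.140}
\]
which is the non-relativistic dispersion relation. From the above, we conclude that quantizing the first-order Lagrangian (3.133) gives rise to non-relativistic particles of mass $m$.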
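As a quick plausibility check of (3.140), one can compare the relativistic energy with the rest mass subtracted, $E_{\mathbf{p}} - m = \sqrt{\mathbf{p}^2+m^2} - m$, against $\mathbf{p}^2/(2m)$; subtracting $m$ is what the factor $e^{-imt}$ in (3.130) does. The snippet below is only an illustrative sketch in natural units with hypothetical function names; it shows that the relative deviation shrinks roughly like $\mathbf{p}^2/(4m^2)$ as $|\mathbf{p}|/m \to 0$.

```python
import math

def relativistic_kinetic_energy(p, m):
    """E_p - m with E_p = sqrt(p^2 + m^2), in natural units."""
    return math.sqrt(p * p + m * m) - m

def nonrelativistic_energy(p, m):
    """Non-relativistic dispersion relation p^2 / (2m)."""
    return p * p / (2.0 * m)

def relative_deviation(p, m):
    """Relative difference between the two expressions above."""
    exact = relativistic_kinetic_energy(p, m)
    return abs(exact - nonrelativistic_energy(p, m)) / exact

# The deviation falls off as |p|/m is made smaller.
for ratio in (0.5, 0.1, 0.01):
    print(ratio, relative_deviation(ratio, 1.0))
```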
Some comments seem to be in order. Notice that we have a complex field but only a single type of particle. The antiparticle is not in the spectrum. The existence of antiparticles is a consequence of relativity. A related fact is that the conserved charge $Q = \int d^3x\, {:}\tilde\varphi^\dagger\tilde\varphi{:}$ is the particle number. This remains conserved even if we include interactions in the Lagrangian of the form $(\tilde\varphi^\dagger\tilde\varphi)^2$ etc., which are invariant under a global phase rotation. So in non-relativistic theories, particle number is conserved. It is only with relativity, and the appearance of antiparticles, that particle number can change. Finally, there is no non-relativistic limit of a real scalar field. In the relativistic theory, the particles are their own antiparticles, and there is no way to construct a multiparticle theory that conserves particle number.
Recovering QM
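In QM, we talk about the position and momentum operators $\mathbf{X}$ and $\mathbf{P}$. On the other hand, as we saw below (2.1), in QFT position is relegated to a label. How do we get back to good old QM? We already have the operator for the total momentum of the field, namely (3.46). When acting on a single-particle state, it gives $\mathbf{P}\,|\mathbf{p}\rangle = \mathbf{p}\,|\mathbf{p}\rangle$. It is also not too difficult to write down the position operator $\mathbf{X}$. Let's do it in the non-relativistic limit. In this case the operator
\[
\tilde\varphi^\dagger(\mathbf{x}) = \int\!\frac{d^3p}{(2\pi)^3}\,a^\dagger_{\mathbf{p}}\,e^{-i\mathbf{p}\cdot\mathbf{x}} , \tag{3.141}
\]
creates a particle localized with a delta function at $\mathbf{x}$. We hence write $|\mathbf{x}\rangle = \tilde\varphi^\dagger(\mathbf{x})|0\rangle$. It is now natural to define the position operator $\mathbf{X}$ as
\[
\mathbf{X} = \int\! d^3x\; \mathbf{x}\;\tilde\varphi^\dagger(\mathbf{x})\,\tilde\varphi(\mathbf{x}) , \tag{3.142}
\]
since it has the sought property,
\[
\begin{aligned}
\mathbf{X}\,|\mathbf{x}\rangle &= \int\! d^3y\; \mathbf{y}\;\tilde\varphi^\dagger(\mathbf{y})\,\tilde\varphi(\mathbf{y})\,\tilde\varphi^\dagger(\mathbf{x})\,|0\rangle \\
&= \int\! d^3y\; \mathbf{y}\;\tilde\varphi^\dagger(\mathbf{y})\,\big[\delta^{(3)}(\mathbf{y}-\mathbf{x}) + \tilde\varphi^\dagger(\mathbf{x})\,\tilde\varphi(\mathbf{y})\big]\,|0\rangle = \mathbf{x}\,|\mathbf{x}\rangle .
\end{aligned}
\tag{3.143}
\]
We can now construct a state $|\psi\rangle$ by taking a superposition of the one-particle states $|\mathbf{x}\rangle$,
\[
|\psi\rangle = \int\! d^3x\; \psi(\mathbf{x})\,|\mathbf{x}\rangle . \tag{3.144}
\]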
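Notice that the weight function $\psi(\mathbf{x})$ is what we would usually call the Schrödinger wavefunction (in the position representation). Let's make sure that it indeed has the right properties. First, it is clear that as far as $\mathbf{X}$ is concerned it behaves correctly, namely
\[
\mathbf{X}\,|\psi\rangle = \int\! d^3x\; \mathbf{x}\,\psi(\mathbf{x})\,|\mathbf{x}\rangle . \tag{3.145}
\]
What about the momentum operator $\mathbf{P}$? A straightforward calculation gives
\[
\begin{aligned}
\mathbf{P}\,|\psi\rangle &= \int\!\frac{d^3x\,d^3p}{(2\pi)^3}\;\mathbf{p}\,a^\dagger_{\mathbf{p}}\,a_{\mathbf{p}}\,\psi(\mathbf{x})\,\tilde\varphi^\dagger(\mathbf{x})\,|0\rangle
= \int\!\frac{d^3x\,d^3p}{(2\pi)^3}\;\mathbf{p}\,a^\dagger_{\mathbf{p}}\,e^{-i\mathbf{p}\cdot\mathbf{x}}\,\psi(\mathbf{x})\,|0\rangle \\
&= \int\!\frac{d^3x\,d^3p}{(2\pi)^3}\;a^\dagger_{\mathbf{p}}\,\big(i\nabla e^{-i\mathbf{p}\cdot\mathbf{x}}\big)\,\psi(\mathbf{x})\,|0\rangle
= \int\!\frac{d^3x\,d^3p}{(2\pi)^3}\;e^{-i\mathbf{p}\cdot\mathbf{x}}\,\big(-i\nabla\psi(\mathbf{x})\big)\,a^\dagger_{\mathbf{p}}\,|0\rangle \\
&= \int\! d^3x\;\big(-i\nabla\psi(\mathbf{x})\big)\,|\mathbf{x}\rangle .
\end{aligned}
\tag{3.146}
\]
This tells us that $\mathbf{P}$ acts as the familiar derivative on wavefunctions $\psi(\mathbf{x})$. To obtain the final result in (3.146), we have used in a first step the relationship $[a_{\mathbf{p}},\tilde\varphi^\dagger(\mathbf{x})] = e^{-i\mathbf{p}\cdot\mathbf{x}}$, which can be easily checked, and later on integrated by parts. We learn that when acting on one-particle states, the operators $\mathbf{X}$ and $\mathbf{P}$ act as position and momentum operators in QM, with $[X^i, P^j]\,|\psi\rangle = i\,\delta^{ij}\,|\psi\rangle$ and $i, j = 1, 2, 3$.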
But what about dynamics? In particular, how does our wavefunction $\psi(\mathbf{x})$ change with time? To address this question, we first express the Hamiltonian corresponding to the density (3.136) through ladder operators,
\[
H = \int\! d^3x\;\frac{1}{2m}\,(\nabla\tilde\varphi^\dagger)\cdot(\nabla\tilde\varphi) = \int\!\frac{d^3p}{(2\pi)^3}\,\frac{\mathbf{p}^2}{2m}\,a^\dagger_{\mathbf{p}}\,a_{\mathbf{p}} , \tag{3.147}
\]
which implies that
\[
i\,\frac{d}{dt}\,\psi = -\frac{1}{2m}\,\nabla^2\psi , \tag{3.148}
\]
which formally looks exactly like the time evolution of the original field given in (3.132). Yet this time, it is really the Schrödinger equation, complete with the usual probabilistic interpretation for the wavefunction $\psi$ (and not just a first-order differential equation). Note in particular that the conserved charge arising from the current (3.134) is $Q = \int d^3x\,|\psi(\mathbf{x})|^2$, which is the total probability.
Historically, the fact that the equation for the classical field (3.132) and the one-particle wavefunction (3.148) coincide caused some confusion. It was thought that perhaps one is quantizing the wavefunction itself, and the resulting name "second quantization" is still sometimes used today to mean QFT. However, it is important to stress that, despite the name, nothing is quantized twice. One simply quantizes a classical field once. Nonetheless, it is good to know that, if one treats the one-particle Schrödinger equation as a quantum field, then it will give the correct generalization to multiparticle states.
3.9 Problems
i) Consider the Klein-Gordon equation with the mass term set equal to zero and a dilatation transformation with parameter $\alpha$,
\[
x^\mu \to x'^\mu = e^{\alpha}\,x^\mu , \qquad \phi(x) \to \phi'(x') = \phi(x)\,e^{-d_\phi\alpha} . \tag{3.149}
\]
Show that this transformation is a global symmetry of $\mathcal{L}$ if one chooses the scaling dimension $d_\phi$ in an appropriate way. Compute the associated Noether current and verify that it is conserved. Is the symmetry preserved if you add a quartic term $\lambda\,\phi^4/(4!)$ to the Lagrangian? What happens if you add a mass term $m^2\phi^2$?
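ii) Consider the Lagrangian
\[
\mathcal{L} = \frac{1}{2}\,\big(\partial_\mu\Phi\,\partial^\mu\Phi - M^2\Phi^2\big) + \frac{1}{2}\,\big(\partial_\mu\phi\,\partial^\mu\phi - m^2\phi^2\big) - \frac{\lambda}{2}\,\Phi\,\phi^2 , \tag{3.150}
\]
with $m \ll M$ and derive the EOMs for the fields $\phi$ and $\Phi$.
Express the heavy field through the light field and insert it back into the Lagrangian. Expand your result in $1/M^2$. What has changed compared to the original Lagrangian? Up to which energy scale would you trust the predictions of this effective Lagrangian?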
iii) Show that the first relation in (3.2) is satisfied if $[a_{\mathbf{p}}, a_{\mathbf{q}}] = [a^\dagger_{\mathbf{p}}, a^\dagger_{\mathbf{q}}] = 0$ holds. Prove the commutation relations (3.43). Calculate $J^i\,|\mathbf{p} = 0\rangle$ with $J^i$ defined in (3.48). Show that the number operator $N$ defined in (3.53) commutes with the Hamiltonian $H$ of (3.27) and satisfies $N\,|\mathbf{p}_1,\ldots,\mathbf{p}_n\rangle = n\,|\mathbf{p}_1,\ldots,\mathbf{p}_n\rangle$, where $|\mathbf{p}_1,\ldots,\mathbf{p}_n\rangle$ denotes the $n$-particle state introduced in (3.50).
iv) Consider a real scalar field $\phi(t,x)$ living on a two-dimensional space-time and defined on an interval $x \in [0, L]$ with Dirichlet BCs $\phi(t,0) = \phi(t,L) = 0$. Show that the (classical) positive- and negative-frequency solutions to the Klein-Gordon equation that also satisfy the BCs have the form
\[
\phi^{(\pm)}_n(t,x) = \frac{1}{\sqrt{\omega_n L}}\;e^{\pm i\omega_n t}\,\sin(k_n x) . \tag{3.151}
\]
Give the expression for $k_n$ in terms of $L$. How is $\omega_n$ related to $k_n$? We now quantize the field $\phi(t,x)$, keeping in mind that momentum here is discretized, i.e.,
\[
\phi(t,x) = \sum_{n=1}^{\infty}\Big(\phi^{(-)}_n(t,x)\,a_n + \phi^{(+)}_n(t,x)\,a^\dagger_n\Big) , \tag{3.152}
\]
with the ladder operators satisfying $[a_n, a_m] = [a^\dagger_n, a^\dagger_m] = 0$ and $[a_n, a^\dagger_m] = \delta_{mn}$. Compute the VEV $\langle 0|\mathcal{H}|0\rangle$ of the Hamiltonian density
\[
\mathcal{H} = \frac{1}{2}\,\Big(\dot\phi^2 + (\partial_x\phi)^2 + m^2\phi^2\Big) . \tag{3.153}
\]
Integrate your result over the interval $[0, L]$ and show that the total vacuum energy is
\[
E_0(L) = \frac{1}{2}\sum_{n=1}^{\infty}\omega_n . \tag{3.154}
\]
Since this quantity is infinite, we need some form of regularization in order to handle the divergence. Let us introduce an exponentially damping function $\exp(-\delta\,\omega_n)$ with $\delta > 0$ in the sum, and consider for simplicity the case of a massless field. Prove that in this case the vacuum energy can be written as
\[
E_0(L,\delta) = \frac{\pi}{8L}\,\sinh^{-2}\!\Big(\frac{\delta\pi}{2L}\Big) . \tag{3.155}
\]
Take the limit $\delta \to 0$ and determine the vacuum energy for the case when no BCs are imposed. With all this at hand, calculate the Casimir force.
v) Derive the charge operator $Q$ for the U(1) invariant Lagrangian (3.90) using infinitesimal transformations. Show that the result expressed through creation and annihilation operators takes the form of (3.95). Prove that $Q$ satisfies (3.85). Verify that the charge is conserved (via $[H, Q] = 0$). Are the number operators defined in (3.96) conserved as well? What happens if you add an interaction term
\[
\Delta\mathcal{L} = -\frac{\lambda}{4!}\,(\varphi^{*}\varphi)^2 , \tag{3.156}
\]
to the Lagrangian? What does this result imply for the case in which particles are their own antiparticles?
vi) Consider a theory with two complex scalar fields $\varphi_1$ and $\varphi_2$. Write down all possible terms of the Lagrangian which are Lorentz invariant and renormalizable, i.e., have mass dimension of four and couplings with non-negative mass dimensions. Which terms survive if there are additional discrete symmetries
\[
(\varphi_1, \varphi_2) \to (-\varphi_1, \varphi_2) , \tag{3.157}
\]
and
\[
(\varphi_1, \varphi_2) \to (\varphi_1, -\varphi_2) , \tag{3.158}
\]
under which the Lagrangian remains invariant?
Assume further that both fields have the same mass, $m_1 = m_2$, and that all dimensionless couplings are identical. You can now rewrite the Lagrangian in a more economic way if you introduce the scalar doublet
\[
\Phi = \begin{pmatrix} \varphi_1 \\ \varphi_2 \end{pmatrix} , \tag{3.159}
\]
and its hermitian conjugate. The theory at hand has four conserved global charges. One charge follows from the U(1) invariance,
\[
\Phi \to e^{i\alpha}\,\Phi , \qquad \delta\Phi = i\alpha\,\Phi , \tag{3.160}
\]
of the theory, which is already present in the case of a single complex scalar. The other three charges correspond to the mixing of the scalar fields under an SU(2) transformation,
\[
\Phi \to e^{i\alpha^j\tau^j}\,\Phi , \qquad \delta\Phi = i\alpha^j\tau^j\,\Phi , \tag{3.161}
\]
where the indices $j = 1, 2, 3$ are summed over and $\tau^j = \sigma^j/2$ with $\sigma^j$ being the usual Pauli matrices.
Compute the four conserved charges using Noether's theorem. For the SU(2) charges you should find
\[
Q^j = \int\! d^3x\; i\,\Big(\varphi^{*}_a\,(\tau^j)_{ab}\,\dot\varphi_b - \dot\varphi^{*}_a\,(\tau^j)_{ab}\,\varphi_b\Big) , \tag{3.162}
\]
where $a, b = 1, 2$ are field labels.
Show further that the latter charges fulfill the SU(2) commutation relation,
\[
[Q^j, Q^k] = i\,\epsilon^{jkl}\,Q^l . \tag{3.163}
\]
What symmetries survive if you allow for different masses and dimensionless couplings?
vii) Consider the Lagrangian for a free complex scalar field (3.90), which is invariant under global U(1) transformations (3.92). Is this Lagrangian invariant if the global symmetry gets promoted to a local symmetry, i.e., $\alpha \to e\,\alpha(x)$, where $e$ is just a universal constant and $\alpha(x)$ a function of space-time?
If you now add a vector field $A_\mu$ to the Lagrangian with a coupling
\[
\mathcal{L}_A = i g\,\Big[(\partial_\mu\varphi^{*})\,(A^\mu\varphi) - (A_\mu\varphi^{*})\,(\partial^\mu\varphi)\Big] + g^2\,(A_\mu\varphi^{*})\,(A^\mu\varphi) , \tag{3.164}
\]
how do the vector field and the coupling constant have to transform under the local U(1), if the Lagrangian $\mathcal{L} + \mathcal{L}_A$ should remain invariant under phase redefinitions?
Compute the Noether current for this local symmetry. Add a kinetic term for the vector field to the Lagrangian,
\[
\mathcal{L}'_A = -\frac{1}{4}\,F_{\mu\nu}F^{\mu\nu} , \tag{3.165}
\]
and derive the EOMs for the field $A_\mu$ considering the full Lagrangian $\mathcal{L}_\varphi + \mathcal{L}_A + \mathcal{L}'_A$.
viii) Compute the advanced Green's function $D_A(x-y)$ for the Klein-Gordon equation using the integration contour in Figure 3.4. Recall which initial conditions one assumes in electrodynamics and why they lead to the use of the retarded Green's function. Can you imagine physical BCs in which the advanced propagator would be the right choice? Prove and explain in this context the following two relations
\[
D_R(x) = D_A(-x) , \qquad D_F(x) = D_F(-x) . \tag{3.166}
\]
ix) Explicitly perform the steps that lead to the Lagrangian (3.133). Show that the corresponding EOM (vary with respect to $\tilde\varphi^{*}$) is the Schrödinger equation. The Lagrangian has a global U(1) symmetry, $\tilde\varphi \to e^{i\alpha}\tilde\varphi$. Verify the correctness of the expression for the Noether current (3.134) and discuss the physical meaning of the conserved charge. Based on your findings, give a reason why there is no non-relativistic limit for a real scalar field.
x) Prove that the position operator $\mathbf{X}$ given in (3.142) satisfies $\mathbf{X}\,|\mathbf{x}\rangle = \mathbf{x}\,|\mathbf{x}\rangle$. Furthermore, show that (3.144), (3.146), and (3.148) are correct.
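For problem iv), the closed form (3.155) can be cross-checked numerically before attempting the analytic proof. The sketch below is not part of the original problem sheet: it assumes the massless frequencies $\omega_n = n\pi/L$ implied by the Dirichlet BCs, names the regulator `delta`, and compares a truncated version of the damped sum with the $\sinh^{-2}$ expression. Function names and the cutoff `n_max` are illustrative choices.

```python
import math

def regulated_vacuum_energy_sum(L, delta, n_max=20000):
    """(1/2) * sum_n omega_n * exp(-delta * omega_n), assuming omega_n = n*pi/L."""
    total = 0.0
    for n in range(1, n_max + 1):
        omega_n = n * math.pi / L
        total += 0.5 * omega_n * math.exp(-delta * omega_n)
    return total

def regulated_vacuum_energy_closed(L, delta):
    """Closed form quoted in (3.155): pi/(8L) * sinh(delta*pi/(2L))**(-2)."""
    return math.pi / (8.0 * L) / math.sinh(delta * math.pi / (2.0 * L)) ** 2

# Illustrative parameter values; the truncation is harmless once delta*n_max*pi/L >> 1.
L, delta = 1.0, 0.01
print(regulated_vacuum_energy_sum(L, delta), regulated_vacuum_energy_closed(L, delta))
```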
References
[1] S. R. Coleman, "Physics 253: Quantum Field Theory", course given at Harvard University, 1975 and 1976, http://www.damtp.cam.ac.uk/user/tong/qft/col1.pdf, http://www.damtp.cam.ac.uk/user/tong/qft/col2.pdf, http://www.physics.harvard.edu/about/Phys253.html
[2] N. Straumann, "The history of the cosmological constant problem", arXiv:gr-qc/0208027.
[3] S. M. Carroll, "The Cosmological Constant", Living Rev. Relativity 3, 1 (2001), http://relativity.livingreviews.org/Articles/lrr-2001-1
[4] H. B. G. Casimir and D. Polder, "The Influence of retardation on the London-van der Waals forces", Phys. Rev. 73, 360 (1948).
[5] K. A. Milton, "The Casimir effect: Recent controversies and progress", J. Phys. A 37, R209 (2004) [arXiv:hep-th/0406024].
4 Interacting Fields
Often in QM, we are interested in particles moving in some fixed background potential $V(\mathbf{x})$. This can be easily incorporated into field theory by working with a Lagrangian with explicit $\mathbf{x}$ dependence. E.g., in the case of our non-relativistic complex scalar field discussed in Section 3.8, we could simply add a term
\[
\Delta\mathcal{L} = -V(\mathbf{x})\,\tilde\varphi^{*}(x)\,\tilde\varphi(x) , \tag{4.1}
\]
to the Lagrangian (3.133). Since this interaction does not respect translational symmetry, we won't have the associated conserved energy-momentum tensor. While such Lagrangians are useful in condensed matter physics, we rarely (or never) come across them in high-energy physics, where all equations obey translational (and Lorentz) invariance.
One can of course also consider interactions between particles. Obviously, these are only important for $n$-particle states with $n \geq 2$. We therefore expect them to arise from additions to the Lagrangian (3.133) of the form
\[
\Delta\mathcal{L} = \tilde\varphi^{*}(x)\,\tilde\varphi^{*}(x)\,\tilde\varphi(x)\,\tilde\varphi(x) , \tag{4.2}
\]
which, in QFT, is an operator that destroys two particles before creating two new ones. Such terms in the Lagrangian will indeed lead to inter-particle forces, both in the non-relativistic and relativistic setting. In the following, we will explore these types of interactions in detail for relativistic theories.
4.1 Classification of Interactions
The free QFTs we have discussed so far are special. We can determine their spectrum, but they are dull since, as their name suggests, nothing happens. They have particle excitations, but these do not interact.
To make things more interesting (i.e., more complicated) let us include interactions in our theory. These will take the form of higher-order terms in the Lagrangian. We start by asking what kind of small perturbations we can add to the theory. E.g., let us consider the Lagrangian for a real scalar field (2.45) and add the infinite tower of additional terms
\[
\Delta\mathcal{L} = \sum_{n=3}^{\infty}\mathcal{L}_n , \qquad \mathcal{L}_n = -\frac{\lambda_n}{n!}\,\phi^n , \tag{4.3}
\]
to it. Here the coefficients $\lambda_n$ are called coupling constants. The first question that we have to address is which restrictions the coupling constants have to satisfy in order for the additional terms to be small perturbations. Naively one would think that one simply has to require that $\lambda_n \ll 1$. But this turns out to be not quite right. In order to see why the naive guess is not correct, we perform a dimensional analysis. Applying the rules gathered in Section 2.1, we find that the dimensions of the coupling constants are
\[
[\lambda_n] = 4 - n . \tag{4.4}
\]
This result makes clear why we cannot simply say $\lambda_n \ll 1$, because this statement is only sensible for dimensionless quantities, but not for dimensionful ones.
The interaction terms in (4.3) fall into three different categories. First, dimension-three operators with $[\lambda_3] = 1$. For such terms, we can define a dimensionless parameter $\lambda_3/E$, where $E$ has dimension of mass and represents the energy scale of the process of interest. This means that $\mathcal{L}_3 = -\lambda_3\phi^3/(3!)$ is a small perturbation at high energies, i.e., $E \gg \lambda_3$, but a big one at low energies, i.e., $E \ll \lambda_3$. Such terms are called relevant, because they are most relevant at low energies which, after all, is where most of the physics that we experience lies. In a relativistic QFT, we have $E > m$, which means that we can always make this sort of perturbation small by taking $\lambda_3 \ll m$. Second, terms of dimension four with $[\lambda_4] = 0$, e.g., $\mathcal{L}_4 = -\lambda_4\phi^4/(4!)$. Such terms are small if $\lambda_4 \ll 1$ and are called marginal. Third, operators with dimension higher than four, having $[\lambda_n] < 0$. In this case the appropriate dimensionless parameter is $\lambda_n E^{n-4}$, and terms $\mathcal{L}_n = -\lambda_n\phi^n/(n!)$ with $n \geq 5$ are small (large) at low (high) energies. Such contributions are called irrelevant, since in daily life, meaning $\lambda_n E^{n-4} \ll 1$, these operators do not matter.
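The bookkeeping behind (4.4) is simple enough to spell out in a few lines. The following sketch, with hypothetical function names, assumes four space-time dimensions and a $\phi^n$ interaction, and labels each coupling as relevant, marginal or irrelevant according to the sign of its mass dimension.

```python
def coupling_dimension(n):
    """Mass dimension [lambda_n] = 4 - n of the coupling multiplying phi^n (four dimensions)."""
    return 4 - n

def classify(n):
    """Label the phi^n interaction by the sign of the coupling's mass dimension."""
    dim = coupling_dimension(n)
    if dim > 0:
        return "relevant"
    if dim == 0:
        return "marginal"
    return "irrelevant"

for n in range(3, 8):
    print(n, coupling_dimension(n), classify(n))
```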
As we will see later, it is typically impossible to avoid high-energy processes in QFT. We have already seen a glimpse of this feature when we were discussing the structure of the vacuum in Section 3.2, which involved the calculation of an integral over infinitely large frequencies of a harmonic oscillator. We hence might expect problems with irrelevant operators that become important at high energies. Indeed, these operators lead to non-renormalizable QFTs in which one cannot make sense of the infinities at arbitrarily high energies. This does not mean that these theories are useless; it just means that they become incomplete at some energy scale and need to be embedded into an appropriate complete theory, aka a UV completion. Let me also add that the above naive assignment of relevant, marginal, and irrelevant operators is not always carved in stone, since quantum corrections can sometimes change the character of an operator.
Low-Energy Description
In typical applications of QFT only the relevant and marginal couplings are important. This is due to the fact that the irrelevant couplings become small at low energies, as we have seen above. In practice this saves us, since instead of considering the infinite number of interaction terms in (4.3), only a handful are actually needed. E.g., in the case of the real scalar field described earlier, we only have to take into account two operators, namely $\mathcal{L}_3 = -\lambda_3\phi^3/(3!)$ and $\mathcal{L}_4 = -\lambda_4\phi^4/(4!)$, in the low-energy limit.
Let us have a closer look at this issue. Suppose that some day we discover the true superduper theory, aka the TOE, that describes the world at very high energy scales, say the GUT scale, or, if you wish, even the Planck scale. Whatever this scale is, let's call it $\Lambda$. Since it is an energy scale, we obviously have $[\Lambda] = 1$. What we want to understand are the laws of physics at energy scales $E$ that we can probe directly in a laboratory, which, given today's standards, means $E \ll \Lambda$. Let us further suppose that at energies of order $E$, the laws of physics are described by a real scalar field.$^{20}$ This scalar field will have some complicated interaction terms (4.3), where the precise form is dictated by all the stuff that is going on in the TOE. Can we get an idea about the interactions? Well, we can write our dimensionful coupling constants $\lambda_n$ in terms of dimensionless couplings $g_n$, multiplied by a suitable power of the relevant scale $\Lambda$,
\[
\lambda_n = \frac{g_n}{\Lambda^{n-4}} . \tag{4.5}
\]
The exact values of the dimensionless couplings $g_n$ depend on the details of the TOE,$^{21}$ so we have to do some guesswork. Since the couplings $g_n$ are dimensionless, 1 looks like a pretty good and somehow natural guess. Since we are not completely sure, let's say $g_n = O(1)$. This means that in a laboratory with $E \ll \Lambda$ the interaction terms $\mathcal{L}_n = -\lambda_n\phi^n/(n!)$ of (4.3) will be suppressed by powers of $(E/\Lambda)^{n-4}$ if $n \geq 5$. Given the LHC energy of around 1 TeV, this is a suppression by many orders of magnitude. E.g., for $\Lambda = M_P$ one has $E/\Lambda = 10^{-16}$. It is this simple argument based on dimensional analysis that ensures that we need to focus only on the first few terms in the interaction, namely those that are relevant and marginal. It also means that if we only have access to low-energy experiments, it is going to be very difficult to figure out the precise nature of the TOE, because its effects are highly diluted except for the relevant and marginal interactions. Some people therefore call the superduper theory that everybody is looking for not TOE, but TOENAIL, which stands for "theory of everything not accessible in laboratories". The discussion given above is a poor man's version of the ideas of effective field theory and Wilson's renormalization group, about which you can learn much more by asking Matthias Neubert.

$^{20}$ Of course, we know that this assumption is plain wrong, since the SM is a non-abelian gauge theory with chiral fermions, but the same argument applies in that case.
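To get a feeling for the numbers, the suppression factor $g_n\,(E/\Lambda)^{n-4}$ implied by (4.5) can be tabulated directly. The snippet below is an illustrative sketch only: the Planck mass value of $1.22 \times 10^{19}$ GeV and the choice $g_n = 1$ are assumptions made for the example, and the function name is hypothetical.

```python
def suppression(energy_gev, scale_gev, n, g_n=1.0):
    """Dimensionless size g_n * (E/Lambda)^(n-4) of the phi^n interaction at energy E."""
    return g_n * (energy_gev / scale_gev) ** (n - 4)

E_LHC = 1.0e3        # GeV, "around 1 TeV" as quoted in the text
M_PLANCK = 1.22e19   # GeV, assumed value for the Planck mass

print("E/Lambda =", E_LHC / M_PLANCK)
for n in (5, 6, 8):
    print(n, suppression(E_LHC, M_PLANCK, n))
```

With these inputs the ratio $E/\Lambda$ comes out at roughly $10^{-16}$, in line with the estimate above, and each additional power of $\phi$ costs another sixteen orders of magnitude.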
Weakly Coupled Theories
In this course we will only deal with weakly coupled QFTs, i.e., theories that can be truly considered as small perturbations of the free field theory at all energies. We will look in more detail at two specific examples.
The first example of a weakly coupled QFT we will study is the $\phi^4$ theory,
\[
\mathcal{L} = \frac{1}{2}\,(\partial_\mu\phi)^2 - \frac{1}{2}\,m^2\phi^2 - \frac{\lambda}{4!}\,\phi^4 , \tag{4.6}
\]
where $\phi$ is our well-known real scalar field. For (4.6) to be weakly coupled we have to require $\lambda \ll 1$. We can get a hint of what the effects of the additional $\phi^4$ term will be. Expanding it in terms of ladder operators, we find terms like
\[
a^\dagger_{\mathbf{p}}\,a^\dagger_{\mathbf{p}}\,a^\dagger_{\mathbf{p}}\,a^\dagger_{\mathbf{p}} , \qquad a^\dagger_{\mathbf{p}}\,a^\dagger_{\mathbf{p}}\,a^\dagger_{\mathbf{p}}\,a_{\mathbf{p}} , \tag{4.7}
\]
etc., which create and destroy particles. This signals that the $\phi^4$ Lagrangian (4.6) describes a theory in which particle number is not conserved. In fact, it is not too difficult to check that the number operator $N$ does not commute with the Hamiltonian, i.e., $[H, N] \neq 0$.
The second example we will look at is a scalar Yukawa theory. Its Lagrangian is given by
\[
\mathcal{L} = (\partial_\mu\varphi^{*})(\partial^\mu\varphi) + \frac{1}{2}\,(\partial_\mu\phi)^2 - M^2\,\varphi^{*}\varphi - \frac{1}{2}\,m^2\phi^2 - g\,\varphi^{*}\varphi\,\phi , \tag{4.8}
\]
with $g \ll M, m$. This theory couples a complex scalar $\varphi$ to a real scalar $\phi$. In this theory the individual particle numbers for $\varphi$ and $\phi$ are not conserved. Yet, the Lagrangian (4.8) is invariant under global phase rotations of $\varphi$, which ensures that there will be a conserved charge $Q$ obeying $[H, Q] = 0$. In fact, we have met this charge already in (3.95). In consequence, in the scalar Yukawa theory the number of $\varphi$ particles minus the number of $\varphi$ antiparticles is conserved. Notice also that the potential in (4.8) has a stable minimum at $\varphi = \phi = 0$, but it is unbounded from below if $-g\phi$ becomes too large. This means that we should not mess too much with the scalar Yukawa theory.

$^{21}$ If we knew the precise structure of the TOE we could, in fact, calculate the couplings $g_n$.
4.2 Interaction Picture
In QM, there is a useful viewpoint called the interaction picture, which allows one to deal with small perturbations to a well-understood Hamiltonian. Let me briefly recall how this works. In the Schrödinger picture, the states evolve as $i\,d/dt\,|\psi\rangle_S = H\,|\psi\rangle_S$, while the operators $O_S$ are time independent. In contrast, in the Heisenberg picture the states do not evolve with time, but the operators change with time, namely one has $|\psi\rangle_H = e^{iHt}\,|\psi\rangle_S$ and $O_H = e^{iHt}\,O_S\,e^{-iHt}$. The interaction picture is a hybrid of the two. We split the Hamiltonian as
\[
H = H_0 + H_{\rm int} , \tag{4.9}
\]
where in the interaction picture the time dependence of operators $O_I$ is governed by $H_0$, while the time dependence of the states $|\psi\rangle_I$ is governed by $H_{\rm int}$. While this split is arbitrary, things are easiest if one is able to solve the Hamiltonian $H_0$, e.g., if $H_0$ is the Hamiltonian of a free theory. From what I have said so far, it follows that
\[
|\psi\rangle_I = e^{iH_0t}\,|\psi\rangle_S , \qquad O_I = e^{iH_0t}\,O_S\,e^{-iH_0t} . \tag{4.10}
\]
Since the Hamiltonian is itself an operator, the latter equation also applies to the interaction Hamiltonian $H_{\rm int}$. In consequence, one has
\[
H_I = (H_{\rm int})_I = e^{iH_0t}\,H_{\rm int}\,e^{-iH_0t} . \tag{4.11}
\]
The Schrödinger equation in the interaction picture is readily derived starting from the Schrödinger picture,
\[
\begin{aligned}
i\,\frac{d}{dt}\,|\psi\rangle_S = H\,|\psi\rangle_S
\;&\Longrightarrow\; i\,\frac{d}{dt}\Big(e^{-iH_0t}\,|\psi\rangle_I\Big) = (H_0 + H_{\rm int})\,e^{-iH_0t}\,|\psi\rangle_I , \\
&\Longrightarrow\; i\,\frac{d}{dt}\,|\psi\rangle_I = e^{iH_0t}\,H_{\rm int}\,e^{-iH_0t}\,|\psi\rangle_I , \\
&\Longrightarrow\; i\,\frac{d}{dt}\,|\psi\rangle_I = H_I\,|\psi\rangle_I .
\end{aligned}
\tag{4.12}
\]
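The chain of implications in (4.12) can be tested on a toy system. The sketch below is an illustration, not something taken from the lecture: it picks two arbitrary $2\times 2$ Hermitian matrices for $H_0$ and $H_{\rm int}$, builds $|\psi(t)\rangle_I = e^{iH_0t}e^{-iHt}|\psi(0)\rangle$, and checks by a central finite difference that $i\,d/dt\,|\psi\rangle_I = H_I(t)\,|\psi\rangle_I$. All names and numerical values are assumptions made for the example.

```python
import numpy as np

def expm_hermitian(h, t):
    """Return exp(-i h t) for a Hermitian matrix h via its eigen-decomposition."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(-1j * w * t)) @ v.conj().T

# Toy two-level system, chosen purely for illustration.
H0 = np.array([[1.0, 0.0], [0.0, -1.0]])
H_INT = 0.3 * np.array([[0.0, 1.0], [1.0, 0.0]])
H = H0 + H_INT

def state_interaction(psi0, t):
    """|psi(t)>_I = exp(+i H0 t) exp(-i H t) |psi(0)>, cf. (4.10)."""
    return expm_hermitian(H0, -t) @ expm_hermitian(H, t) @ psi0

def h_interaction(t):
    """H_I(t) = exp(+i H0 t) H_int exp(-i H0 t), cf. (4.11)."""
    return expm_hermitian(H0, -t) @ H_INT @ expm_hermitian(H0, t)

psi0 = np.array([1.0, 0.0], dtype=complex)
t, dt = 0.7, 1e-5
lhs = 1j * (state_interaction(psi0, t + dt) - state_interaction(psi0, t - dt)) / (2 * dt)
rhs = h_interaction(t) @ state_interaction(psi0, t)
print(np.max(np.abs(lhs - rhs)))  # small, limited by the finite-difference step
```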
Dyson's Formula
In order to solve the system described by the Hamiltonian (4.9), we have to find a solution to the Schrödinger equation in the interaction picture (4.12). Let us write the solution as
\[
|\psi(t)\rangle_I = U(t,t_0)\,|\psi(t_0)\rangle_I , \tag{4.13}
\]
where $U(t,t_0)$ is a unitary time-evolution operator satisfying $U(t,t) = 1$, $U(t_1,t_2)\,U(t_2,t_3) = U(t_1,t_3)$, and $U(t_1,t_3)\,[U(t_2,t_3)]^\dagger = U(t_1,t_2)$. Inserting (4.13) into the last line of (4.12) gives
\[
i\,\frac{d}{dt}\,U(t,t_0) = H_I(t)\,U(t,t_0) . \tag{4.14}
\]
If $H_I$ were a function, the solution to the differential equation (4.14) would read
\[
U(t,t_0) \stackrel{?}{=} \exp\Big(-i\int_{t_0}^{t}\! dt'\,H_I(t')\Big) . \tag{4.15}
\]
Yet, $H_I$ is not a function but an operator, and this causes ordering issues. Let's have a closer look at the exponential to understand where the trouble comes from. The exponential is defined through its power expansion,
\[
\exp\Big(-i\int_{t_0}^{t}\! dt'\,H_I(t')\Big) = 1 - i\int_{t_0}^{t}\! dt'\,H_I(t') + \frac{(-i)^2}{2}\,\Big(\int_{t_0}^{t}\! dt'\,H_I(t')\Big)^2 + \ldots . \tag{4.16}
\]
When we differentiate this with respect to $t$, the third term on the right-hand side gives
\[
-\frac{1}{2}\,\Big(\int_{t_0}^{t}\! dt'\,H_I(t')\Big)\,H_I(t) - \frac{1}{2}\,H_I(t)\,\Big(\int_{t_0}^{t}\! dt'\,H_I(t')\Big) . \tag{4.17}
\]
The second term of this expression looks good since it is part of $H_I(t)\,U(t,t_0)$ appearing on the right-hand side of (4.14), but the first term is no good, because the $H_I(t)$ sits on the wrong side of the integral, and we cannot commute it through, given that $[H_I(t'), H_I(t)] \neq 0$ when $t \neq t'$. So what is the correct expression for $U(t,t_0)$ then?
The correct answer is provided by Dysons formula,
22
which reads
U(t, t
0
) = T exp
_
i
_
t
t
0
dt

H
I
(t

)
_
. (4.18)
Here T denotes time ordering as dened in (3.124). It is easy to prove the latter statement.
We start by expanding out (4.18), which leads to
U(t, t
0
) = 1 i
_
t
t
0
dt

H
I
(t

)
+
(i)
2
2
_
_
t
t
0
dt

_
t
t

dt

H
I
(t

) H
I
(t

) +
_
t
t
0
dt

_
t

t
0
dt

H
I
(t

) H
I
(t

)
_
+. . . .
(4.19)
The two terms in the last line of (4.19) are actually the same, since
\[ \begin{aligned} \int_{t_0}^{t} dt' \int_{t'}^{t} dt''\, H_I(t'')\, H_I(t') & = \int_{t_0}^{t} dt'' \int_{t_0}^{t''} dt'\, H_I(t'')\, H_I(t') \\ & = \int_{t_0}^{t} dt' \int_{t_0}^{t'} dt''\, H_I(t')\, H_I(t'') \,, \end{aligned} \tag{4.20} \]
where the range of integration in the first expression is over $t'' \geq t'$, while in the second expression one integrates over $t' \leq t''$, which is, of course, the same thing. The final expression is simply obtained by relabelling $t''$ and $t'$. In fact, it is not too difficult to show that one has
\[ \int_{t_0}^{t} dt_1 \int_{t_0}^{t_1} dt_2 \ldots \int_{t_0}^{t_{n-1}} dt_n\, H_I(t_1) \ldots H_I(t_n) = \frac{1}{n!} \int_{t_0}^{t} dt_1 \ldots dt_n\, T\big(H_I(t_1) \ldots H_I(t_n)\big) \,. \tag{4.21} \]

^22 Essentially figured out by Paul Dirac, but in its compact notation due to Freeman Dyson.
Putting things together, this means that the power expansion of (4.18) takes the form
\[ U(t,t_0) = 1 - i\int_{t_0}^{t} dt'\, H_I(t') + (-i)^2 \int_{t_0}^{t} dt' \int_{t_0}^{t'} dt''\, H_I(t')\, H_I(t'') + \ldots \,. \tag{4.22} \]
The proof of Dyson's formula is straightforward. First, observe that under the $T$ operation, all operators commute, since their order is already fixed by time ordering. Thus,
\[ \begin{aligned} i\frac{d}{dt}\, U(t,t_0) & = i\frac{d}{dt}\left[T \exp\left(-i\int_{t_0}^{t} dt'\, H_I(t')\right)\right] = T\left[H_I(t) \exp\left(-i\int_{t_0}^{t} dt'\, H_I(t')\right)\right] \\ & = H_I(t)\, T \exp\left(-i\int_{t_0}^{t} dt'\, H_I(t')\right) = H_I(t)\, U(t,t_0) \,. \end{aligned} \tag{4.23} \]
Notice that $t$, being the upper limit of the integral, is the latest time, so that the factor $H_I(t)$ can be pulled out to the left.
Before moving on, I have to say that Dyson's formula is rather formal. In practice, it turns out to be very difficult to compute the time-ordered exponential in (4.18). The power of (4.18) comes from the expansion (4.22), which is valid when $H_I$ is a small perturbation to $H_0$.
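The ordering problem can be made concrete with a small numerical experiment. The following is a minimal sketch under stated assumptions: the two-level interaction $H_I(t) = \cos t\,\sigma_x + \sin t\,\sigma_z$ is a hypothetical toy choice (it does not appear in these notes), and all function names are illustrative. The time-ordered exponential is approximated by a product of short-time evolutions with later times standing to the left, and compared with the naive guess (4.15).

```python
import math

# 2x2 complex matrices stored as nested lists; Pauli matrices sigma_x and sigma_z.
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
SIGMA_X = [[0.0, 1.0], [1.0, 0.0]]
SIGMA_Z = [[1.0, 0.0], [0.0, -1.0]]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def exp_minus_i(cx, cz):
    """exp(-i (cx sigma_x + cz sigma_z)) = cos(r) 1 - i sin(r) (cx sigma_x + cz sigma_z) / r."""
    r = math.hypot(cx, cz)
    scale = math.sin(r) / r if r > 0.0 else 1.0
    return [[math.cos(r) * IDENTITY[i][j]
             - 1j * scale * (cx * SIGMA_X[i][j] + cz * SIGMA_Z[i][j])
             for j in range(2)] for i in range(2)]


def time_ordered(t_final, steps):
    """Product of short-time evolutions, later times standing to the left."""
    dt = t_final / steps
    u = IDENTITY
    for k in range(steps):
        t_mid = (k + 0.5) * dt
        # Toy interaction H_I(t) = cos(t) sigma_x + sin(t) sigma_z (an assumption).
        step = exp_minus_i(math.cos(t_mid) * dt, math.sin(t_mid) * dt)
        u = matmul(step, u)
    return u


def naive_exponential(t_final):
    """The guess (4.15): exponentiate the integral of H_I without time ordering."""
    # Integral of cos is sin(t); integral of sin is 1 - cos(t), both from 0 to t.
    return exp_minus_i(math.sin(t_final), 1.0 - math.cos(t_final))


def distance(a, b):
    """Frobenius norm of a - b."""
    return math.sqrt(sum(abs(a[i][j] - b[i][j]) ** 2
                         for i in range(2) for j in range(2)))


u_ordered = time_ordered(2.0, 2000)
print(distance(u_ordered, time_ordered(2.0, 1000)))  # small: the product has converged
print(distance(u_ordered, naive_exponential(2.0)))   # not small: (4.15) fails
```

For this toy Hamiltonian the two results differ at order one, because $[H_I(t'), H_I(t)] \neq 0$; for a commuting $H_I$ they would coincide.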
4.3 First Look at Scattering Processes
Let us now try to apply the interaction picture to QFT, starting with an easy example, namely the interaction Hamiltonian of the Yukawa theory,
\[ H_{\rm int} = g \int d^3x\, \psi^\dagger \psi\, \phi \,. \tag{4.24} \]
Unlike the free theories discussed in Section 2, this interaction does not conserve the particle number of the individual fields, allowing particles of one type to morph into others. In order to see why this is the case, we look at the evolution of the state, i.e., $|\psi(t)\rangle = U(t,t_0)\,|\psi(t_0)\rangle$, in the interaction picture. If $g \ll M, m$, where $M$ and $m$ are the masses of $\psi$ and $\phi$, respectively, the perturbation (4.24) is small and we can approximate the full time-evolution operator $U(t,t_0)$ in (4.18) by (4.22). Notice that (4.22) is, in fact, an expansion in powers of $H_{\rm int}$. The interaction Hamiltonian $H_{\rm int}$ contains ladder operators for each type of particle. In particular, glancing at (3.107) tells us that the field $\phi$ contains the operators $a^\dagger$ and $a$ that create or destroy $\phi$ particles.^23 Let's call these particles mesons ($M$). On the other hand, from the discussion in Sections 3.4 and 3.5, it follows that the field $\psi$ contains the operators $a_+^\dagger$ and $a_-$, which implies that it creates $\psi$ antiparticles and destroys $\psi$ particles. We will call these particles nucleons ($N$).^24 Finally, the action of $\psi^\dagger$ is to create nucleons through $a_-^\dagger$ and to destroy antinucleons via $a_+$.

^23 The additional subscript $p$ of the ladder operators $a^\dagger$ and $a$ etc. is dropped hereafter in the text.
While the individual particle number is not conserved, it is important to emphasize that $Q = N_+ - N_-$ as defined in (3.95) is conserved not only in the free theory, but also in the presence of $H_{\rm int}$. At first order in $H_{\rm int}$, one will have terms of the form $a_+^\dagger a_-^\dagger a$, which destroy a meson and create a nucleon-antinucleon pair, $M \to N\bar N$. At second order in $H_{\rm int}$, we have more complicated processes. E.g., the combination of ladder operators $(a_+^\dagger a_-^\dagger a)(a_+ a_- a^\dagger)$ gives rise to the scattering process $N\bar N \to M \to N\bar N$. The rest of this section is devoted to calculating the quantum amplitudes for such processes to occur.
In order to calculate the amplitude, we have to make an important, but slightly dodgy, assumption. We require that the initial state $|i\rangle$ at $t \to -\infty$ (final state $|f\rangle$ at $t \to +\infty$) is an eigenstate of the free theory described by the Hamiltonian $H_0$. At some level, this sounds like a reasonable approximation. If at $t \to \pm\infty$ the particles are well separated, they do not feel the effects of each other. Moreover, we intuitively expect that the states $|i\rangle$ and $|f\rangle$ are eigenstates of the individual number operators $N$ and $N_\pm$. These operators commute with $H_0$, but not with $H_{\rm int}$. As the particles approach each other, they interact briefly, before departing again, each going its own way. The amplitude to go from $|i\rangle$ to $|f\rangle$ is given by
\[ \lim_{t_\pm \to \pm\infty} \langle f|\, U(t_+, t_-)\, |i\rangle = \langle f|\, S\, |i\rangle \,, \tag{4.25} \]
where the unitary operator $S$ is known as the S-matrix. Needless to say, the S in S-matrix stands for scattering.
There are a number of reasons why the assumption of non-interacting initial and final states $|i\rangle$ and $|f\rangle$ is shaky. First, one cannot describe bound states. E.g., naively this formalism cannot deal with the scattering of an $e^-$ and a proton ($p$) which collide, bind, and leave as a hydrogen atom. It is possible to circumvent this objection, since it turns out that bound states show up as poles in the S-matrix. Second, and more importantly, a single particle, a long way from its neighbors, is never alone in field theory. This is true even in classical electrodynamics, where the electron sources the electromagnetic field from which it can never escape. In QED, a related fact is that there is a cloud of virtual photons surrounding the electron. This line of thought gets us into the issues of renormalization and you will hear more on this later. For the time being, let me simply use the assumption of non-interacting asymptotic states. After developing the basics of scattering theory, we will revisit the latter problem.
Example: Meson Decay
Let us consider the relativistically normalized initial and final states,
\[ |i\rangle = \sqrt{2E_{p_1}}\; a_{p_1}^\dagger |0\rangle \,, \qquad |f\rangle = \sqrt{4E_{q_1}E_{q_2}}\; a_{+,q_1}^\dagger a_{-,q_2}^\dagger |0\rangle \,. \tag{4.26} \]

^24 Of course, in reality nucleons are spin-1/2 particles, and do not arise from the quantization of a scalar field. Our scalar Yukawa theory is therefore only a toy model for nucleons interacting with mesons.

The initial state contains a meson with momentum $p_1$, while the final state contains a nucleon-antinucleon pair with momenta $q_1$ and $q_2$. To leading order in the interaction $H_{\rm int}$ of (4.24), the amplitude for the process $M \to N\bar N$ is given by
\[ \langle f|\, S\, |i\rangle = -ig\, \langle f| \int d^4x\, \psi_I^\dagger(x)\, \psi_I(x)\, \phi_I(x)\, |i\rangle \,. \tag{4.27} \]
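Let us calculate this matrix element step by step. We first express $\phi_I$ in terms of $a^\dagger$ and $a$ using (3.107). Notice that it is correct to apply the latter equation, since the $\phi_I$ field in (4.27) is in the interaction picture, which is the same as the Heisenberg picture of the free theory. The annihilation operator in (3.107) will turn $|i\rangle$ into something proportional to $|0\rangle$, while the piece containing a creation operator will turn $|i\rangle$ into a two-meson state. A two-meson state has, however, no overlap with $\langle f|$, which is an $N\bar N$ state, and the ladder operators appearing in $\psi_I^\dagger$ and $\psi_I$ cannot change this situation. So we have
\[ \begin{aligned} \langle f|\, S\, |i\rangle & = -ig\, \langle f| \int d^4x\, \psi_I^\dagger(x)\, \psi_I(x) \int \frac{d^3k}{(2\pi)^3}\, \frac{\sqrt{2E_{p_1}}}{\sqrt{2E_k}}\; a_k\, a_{p_1}^\dagger\, e^{-ikx}\, |0\rangle \\ & = -ig\, \langle f| \int d^4x\, \psi_I^\dagger(x)\, \psi_I(x) \int \frac{d^3k}{(2\pi)^3}\, \frac{\sqrt{2E_{p_1}}}{\sqrt{2E_k}} \left[(2\pi)^3\, \delta^{(3)}(\vec p_1 - \vec k) + a_{p_1}^\dagger a_k\right] e^{-ikx}\, |0\rangle \\ & = -ig\, \langle f| \int d^4x\, \psi_I^\dagger(x)\, \psi_I(x)\, e^{-ip_1 x}\, |0\rangle \,. \end{aligned} \tag{4.28} \]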
Now we do the same for $\psi_I$ and $\psi_I^\dagger$. To get a non-zero overlap with our nucleon-antinucleon final state, we have to pick up the creation operators $a_+^\dagger$ and $a_-^\dagger$ from the Fourier expansion of the field operators. Altogether we then have
\[ \begin{aligned} \langle f|\, S\, |i\rangle & = -ig\, \langle 0| \int \frac{d^4x\, d^3k_1\, d^3k_2}{(2\pi)^6}\, \frac{\sqrt{4E_{q_1}E_{q_2}}}{\sqrt{4E_{k_1}E_{k_2}}}\; a_{-,q_2}\, a_{+,q_1}\, a_{+,k_1}^\dagger\, a_{-,k_2}^\dagger\, |0\rangle\; e^{i(k_1+k_2-p_1)x} \\ & = -ig \int \frac{d^4x\, d^3k_1\, d^3k_2}{(2\pi)^6}\, \frac{\sqrt{4E_{q_1}E_{q_2}}}{\sqrt{4E_{k_1}E_{k_2}}}\, (2\pi)^6\, \delta^{(3)}(\vec q_1 - \vec k_1)\, \delta^{(3)}(\vec q_2 - \vec k_2)\; e^{i(k_1+k_2-p_1)x} \\ & = -ig\, (2\pi)^4\, \delta^{(4)}(q_1 + q_2 - p_1) \,, \end{aligned} \tag{4.29} \]
where we have repeatedly made use of the commutation relations of the ladder operators as given in (3.16) and ignored contributions where annihilation operators act on the vacuum, since these vanish by definition. We have drawn first blood: the result in (4.29) is our first QFT amplitude.
Notice that the delta function constrains the possible $M \to N\bar N$ decays. In particular, the decay can only happen at all if the mass of the meson is larger than or equal to the mass of the nucleon-antinucleon state, i.e., $m \geq 2M$. In order to see this, we simply boost our reference frame so that the meson is at rest, $p_1 = (m, 0, 0, 0)$. This is always possible. Momentum conservation, as imposed by the delta function, then implies that the nucleon and antinucleon are produced back-to-back, $\vec q_1 = -\vec q_2$, and that $m = 2\left(M^2 + |\vec q_{1,2}|^2\right)^{1/2} \geq 2M$.
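The threshold condition can be turned into a short numerical check. The sketch below assumes the meson rest frame used above and solves $m = 2\,(M^2 + |\vec q|^2)^{1/2}$ for $|\vec q|$; the function name and the sample masses are illustrative choices, not taken from the notes.

```python
import math


def decay_momentum(meson_mass, nucleon_mass):
    """Momentum |q| of each decay product in the meson rest frame.

    Returns None when the decay M -> N Nbar is kinematically forbidden (m < 2M).
    """
    if meson_mass < 2.0 * nucleon_mass:
        return None
    return math.sqrt(meson_mass ** 2 / 4.0 - nucleon_mass ** 2)


q = decay_momentum(5.0, 1.5)
# Energy conservation: the two nucleon energies must add up to the meson mass.
total_energy = 2.0 * math.sqrt(1.5 ** 2 + q ** 2)
print(q, total_energy)
print(decay_momentum(2.0, 1.5))  # below threshold
```

At threshold, $m = 2M$, the function returns zero: the pair is produced at rest.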
4.4 Wick's Theorem
Using Dyson's formulas (4.18) and (4.22), we want to compute matrix elements such as
\[ \langle f|\, T\big(H_I(x_1) \ldots H_I(x_n)\big)\, |i\rangle \,, \tag{4.30} \]
where $|i\rangle$ and $|f\rangle$ are assumed to be asymptotically free states. The ordering of the operators $H_I$ is fixed by time ordering. However, since the interaction Hamiltonian contains certain creation and annihilation operators, it would be convenient if we could move all annihilation operators to the right, where they can start eliminating particles in $|i\rangle$. Recall that this is the definition of normal ordering given in (3.27). Wick's theorem tells us how to go from time-ordered products to normal-ordered products. Before stating Wick's theorem in its full generality, let's keep it simple and try to rederive something that we know already. This is always a good idea.
Case of Two Fields
The simplest matrix element of the form (4.30) is
\[ \langle 0|\, T \phi_I(x)\, \phi_I(y)\, |0\rangle \,. \tag{4.31} \]
We already calculated this object in Section 3.7 and gave it the name Feynman propagator. What we want to do now is to rewrite it in such a way that it is easy to evaluate and to generalize the obtained result to the case with more than two fields. We start by decomposing the real scalar field in the interaction picture as
\[ \phi_I(x) = \phi_I^+(x) + \phi_I^-(x) \,, \tag{4.32} \]
with^25
\[ \phi_I^+(x) = \int \frac{d^3k}{(2\pi)^3}\, \frac{1}{\sqrt{2E_k}}\; a_k\, e^{-ikx} \,, \qquad \phi_I^-(x) = \int \frac{d^3k}{(2\pi)^3}\, \frac{1}{\sqrt{2E_k}}\; a_k^\dagger\, e^{ikx} \,. \tag{4.33} \]
This decomposition can be done for any free field. It is useful since
\[ \phi_I^+(x)\, |0\rangle = 0 \,, \qquad \langle 0|\, \phi_I^-(x) = 0 \,. \tag{4.34} \]
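Now we consider the case $x^0 > y^0$ and compute the time-ordered product of the two scalar fields,
\[ \begin{aligned} T \phi_I(x)\, \phi_I(y) & = \phi_I(x)\, \phi_I(y) = \big(\phi_I^+(x) + \phi_I^-(x)\big)\big(\phi_I^+(y) + \phi_I^-(y)\big) \\ & = \phi_I^+(x)\, \phi_I^+(y) + \phi_I^-(x)\, \phi_I^+(y) + \phi_I^-(y)\, \phi_I^+(x) + \phi_I^-(x)\, \phi_I^-(y) + \big[\phi_I^+(x), \phi_I^-(y)\big] \,, \end{aligned} \tag{4.35} \]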
where we have normal ordered the last line, i.e., brought all $\phi_I^+$'s to the right. To get rid of the $\phi_I^+(x)\, \phi_I^-(y)$ term, we have added the commutator $\big[\phi_I^+(x), \phi_I^-(y)\big]$. In the case $x^0 < y^0$, we find, repeating the above exercise,
\[ T \phi_I(x)\, \phi_I(y) = {:}\phi_I(x)\, \phi_I(y){:} + \big[\phi_I^+(y), \phi_I^-(x)\big] \,, \tag{4.36} \]
where we have made use of the fact that the first four terms in the last line of (4.35) are simply the normal-ordered product of the two fields, ${:}\phi_I(x)\, \phi_I(y){:}$.

^25 The superscripts $\pm$ do not make much sense, but I just follow Pauli and Heisenberg here. If you have to, complain to them.
In order to combine the results (4.35) and (4.36) into one equation, we define the contraction of two fields,
\[ \wick{\c\phi_I(x)\, \c\phi_I(y)} = \begin{cases} \big[\phi_I^+(x), \phi_I^-(y)\big] \,, & x^0 > y^0 \,, \\[1mm] \big[\phi_I^+(y), \phi_I^-(x)\big] \,, & y^0 > x^0 \,. \end{cases} \tag{4.37} \]
This definition implies that the contraction of two $\phi_I$ fields is nothing but the Feynman propagator:
\[ \wick{\c\phi_I(x)\, \c\phi_I(y)} = D_F(x - y) \,. \tag{4.38} \]
For a string of field operators $\phi_I$, the contraction of a pair of fields means replacing the contracted operators with the Feynman propagator, leaving all other operators untouched. Equipped with the definition (4.37), the relation between time-ordered and normal-ordered products of two fields can now be simply written as
\[ T \phi_I(x)\, \phi_I(y) = {:}\phi_I(x)\, \phi_I(y){:} + \wick{\c\phi_I(x)\, \c\phi_I(y)} \,. \tag{4.39} \]
Let me emphasize that while both $T \phi_I(x)\, \phi_I(y)$ and ${:}\phi_I(x)\, \phi_I(y){:}$ are operators, their difference is a complex function, namely the Feynman propagator or the contraction of two $\phi_I$ fields.
The formalism of contractions is also straightforwardly extended to our complex scalar field $\psi_I$. One has
\[ T \psi_I(x)\, \psi_I^\dagger(y) = {:}\psi_I(x)\, \psi_I^\dagger(y){:} + \wick{\c\psi_I(x)\, \c\psi_I^\dagger(y)} \,, \tag{4.40} \]
prompting us to define the contraction in this case as
\[ \wick{\c\psi_I(x)\, \c\psi_I^\dagger(y)} = D_F(x - y) \,, \qquad \wick{\c\psi_I(x)\, \c\psi_I(y)} = \wick{\c\psi_I^\dagger(x)\, \c\psi_I^\dagger(y)} = 0 \,. \tag{4.41} \]
For convenience and brevity, I will from here on often drop the subscript $I$ whenever I calculate matrix elements of the form (4.30). There is, however, little room for confusion, since contractions will always involve interaction-picture fields.
Strings of Fields
With all this new notation at hand, the generalization to arbitrarily many fields is also easy to write down:
\[ T\big(\phi(x_1) \ldots \phi(x_n)\big) = {:}\big(\phi(x_1) \ldots \phi(x_n) + \text{all possible contractions}\big){:} \,. \tag{4.42} \]
This identity is known as Wick's theorem. Notice that for $n = 2$ the latter equation is equivalent to (4.39). Before proving Wick's theorem, let me tell you what the phrase "all possible contractions" means by giving a simple example.
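For $n = 4$ we have, writing $\phi_i$ instead of $\phi(x_i)$ for brevity,
\[ \begin{aligned} T\big(\phi_1 \phi_2 \phi_3 \phi_4\big) = {:}\Big( & \phi_1 \phi_2 \phi_3 \phi_4 + \wick{\c\phi_1 \c\phi_2 \phi_3 \phi_4} + \wick{\c\phi_1 \phi_2 \c\phi_3 \phi_4} + \wick{\c\phi_1 \phi_2 \phi_3 \c\phi_4} \\ & + \wick{\phi_1 \c\phi_2 \c\phi_3 \phi_4} + \wick{\phi_1 \c\phi_2 \phi_3 \c\phi_4} + \wick{\phi_1 \phi_2 \c\phi_3 \c\phi_4} \\ & + \wick{\c1\phi_1 \c1\phi_2 \c2\phi_3 \c2\phi_4} + \wick{\c1\phi_1 \c2\phi_2 \c1\phi_3 \c2\phi_4} + \wick{\c1\phi_1 \c2\phi_2 \c2\phi_3 \c1\phi_4} \Big){:} \,. \end{aligned} \tag{4.43} \]
When the contracted field operators are not adjacent, we still define the contraction to give a factor of $D_F$. E.g.,
\[ {:}\wick{\phi_1 \c\phi_2 \phi_3 \c\phi_4}{:} = D_F(x_2 - x_4)\; {:}\phi_1 \phi_3{:} \,. \tag{4.44} \]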
Since the VEV of any normal-ordered operator vanishes, i.e., $\langle 0|\, {:}O{:}\, |0\rangle = 0$, sandwiching any term of (4.43) in which there remain uncontracted field operators between the vacuum $|0\rangle$ gives zero. This means that only the three fully contracted terms in the last line of that equation survive, and they are all complex functions. We therefore have
\[ \begin{aligned} \langle 0|\, T\big(\phi_1 \phi_2 \phi_3 \phi_4\big)\, |0\rangle = {} & D_F(x_1 - x_2)\, D_F(x_3 - x_4) \\ & + D_F(x_1 - x_3)\, D_F(x_2 - x_4) \\ & + D_F(x_1 - x_4)\, D_F(x_2 - x_3) \,, \end{aligned} \tag{4.45} \]
which is a rather simple result and has, as we will see in the next section, a nice pictorial interpretation.
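The phrase "all possible contractions" is easy to make explicit on a computer. The following is a small sketch in which fields are represented by integer labels only; the function names `pairings` and `contractions` are illustrative. It lists the complete pairings that survive in the vacuum expectation value (4.45) and counts all terms, contracted or not, on the right-hand side of (4.43).

```python
def pairings(points):
    """Yield every way to contract all points in pairs (complete contractions)."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def contractions(points):
    """Yield every set of contractions, including partial ones and the empty set."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    # Either the first field stays uncontracted ...
    yield from contractions(rest)
    # ... or it is contracted with one of the later fields.
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in contractions(remaining):
            yield [(first, partner)] + tail


for term in pairings([1, 2, 3, 4]):
    print(term)  # the three products of propagators in (4.45)
print(sum(1 for _ in contractions([1, 2, 3, 4])))  # number of terms in (4.43)
```

Each printed pair $(i, j)$ stands for one factor $D_F(x_i - x_j)$.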
Proof of Wick's Theorem
We would still like to prove Wick's theorem. Naturally this is done by induction. We have already proved the case $n = 2$. So let's assume that (4.42) is valid for $n - 1$ and try to show that the latter equation also holds for $n$ field operators. Without loss of generality we can assume that $x_1^0 > \ldots > x_n^0$, since if this is not the case we simply relabel the points in an appropriate way. Such a relabeling leaves both sides of (4.42) unchanged. Then applying Wick's theorem to the string $\phi_2 \ldots \phi_n$, we arrive at
\[ \begin{aligned} T\big(\phi_1 \ldots \phi_n\big) & = \phi_1 \ldots \phi_n \\ & = \phi_1\, {:}\big(\phi_2 \ldots \phi_n + \text{all contractions not involving } \phi_1\big){:} \\ & = \big(\phi_1^+ + \phi_1^-\big)\, {:}\big(\phi_2 \ldots \phi_n + \text{all contractions not involving } \phi_1\big){:} \,. \end{aligned} \tag{4.46} \]
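We now want to move the $\phi_1^\pm$'s into the ${:}(\ldots){:}$. For $\phi_1^-$ this is easy, since after moving it in, it is already on the left-hand side and thus the resulting term is normal ordered. The term with $\phi_1^+$ is more complicated because we have to bring it into normal order by commuting $\phi_1^+$ to the right. E.g., consider the term without contractions,
\[ \begin{aligned} \phi_1^+\, {:}\phi_2 \ldots \phi_n{:} & = {:}\phi_2 \ldots \phi_n{:}\, \phi_1^+ + \big[\phi_1^+, {:}\phi_2 \ldots \phi_n{:}\big] \\ & = {:}\phi_1^+ \phi_2 \ldots \phi_n{:} + {:}\Big(\big[\phi_1^+, \phi_2^-\big]\, \phi_3 \ldots \phi_n + \phi_2\, \big[\phi_1^+, \phi_3^-\big]\, \phi_4 \ldots \phi_n + \ldots\Big){:} \\ & = {:}\Big(\phi_1^+ \phi_2 \ldots \phi_n + \wick{\c\phi_1 \c\phi_2}\, \phi_3 \ldots \phi_n + \wick{\c\phi_1 \phi_2 \c\phi_3}\, \phi_4 \ldots \phi_n + \ldots\Big){:} \,. \end{aligned} \tag{4.47} \]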
Here we first used the fact that the commutator of a single operator and a string of operators can be written as a sum of all possible strings of operators with two adjacent operators put into a commutator. The simplest relation of this type reads $[\phi_1, \phi_2 \phi_3] = [\phi_1, \phi_2]\, \phi_3 + \phi_2\, [\phi_1, \phi_3]$ and is easy to prove. In the last step we then realized that under the assumption $x_1^0 > \ldots > x_n^0$ all commutators of two operators are equivalent to a contraction of the relevant fields.
The first term in the last line of (4.47) combines with the $\phi_1^-$ term of (4.46) to give ${:}\phi_1 \ldots \phi_n{:}$, meaning that we have derived the first term on the right-hand side of Wick's theorem as well as all terms involving only one contraction of $\phi_1$ with another field in (4.42). It is not too difficult to understand that repeating the above exercise (4.47) with all the remaining terms in (4.46) will then give all possible contractions of all the fields, including those of $\phi_1$. Hence the induction step is complete and Wick's theorem is proved.
4.5 Second Look at Scattering Processes
In order to see the real power of Wick's theorem, let's put it to work and try to calculate $NN \to NN$ scattering in the Yukawa theory (4.24). We first write down the expressions for the initial and final states,
\[ \begin{aligned} |i\rangle & = \sqrt{4E_{p_1}E_{p_2}}\; a_{+,p_1}^\dagger a_{+,p_2}^\dagger |0\rangle = |p_1, p_2\rangle \,, \\ |f\rangle & = \sqrt{4E_{q_1}E_{q_2}}\; a_{+,q_1}^\dagger a_{+,q_2}^\dagger |0\rangle = |q_1, q_2\rangle \,. \end{aligned} \tag{4.48} \]
We now look at the expansion of $\langle f|\, S\, |i\rangle$ in powers of the coupling constant $g$. In order to isolate the interesting part of the S-matrix, i.e., the part due to interactions, we define the T-matrix by
\[ S = 1 + iT \,, \tag{4.49} \]
where the 1 describes the situation where nothing happens. The leading contribution to $iT$ occurs at second order in the interaction (4.24). We find
\[ \frac{(-ig)^2}{2} \int d^4x\, d^4y\; T\big(\psi^\dagger(x)\, \psi(x)\, \phi(x)\, \psi^\dagger(y)\, \psi(y)\, \phi(y)\big) \,. \tag{4.50} \]
Applying Wick's theorem to the time-ordered product entering this expression, we get (besides others) a term
\[ D_F(x - y)\; {:}\psi^\dagger(x)\, \psi(x)\, \psi^\dagger(y)\, \psi(y){:} \,, \tag{4.51} \]
which features a contraction of the two $\phi$ fields. This term will contribute to the scattering, because the operator ${:}\psi^\dagger(x)\, \psi(x)\, \psi^\dagger(y)\, \psi(y){:}$ destroys the two nucleons in the initial state and generates those appearing in the final state. In fact, (4.51) is the only contribution to the process $NN \to NN$, since any other ordering of the field operators would lead to a vanishing matrix element. The matrix element of the normal-ordered operator in (4.51) is readily computed:
\[ \begin{aligned} \langle q_1, q_2|\, {:}\psi^\dagger(x)\, \psi(x)\, \psi^\dagger(y)\, \psi(y){:}\, |p_1, p_2\rangle & = \langle q_1, q_2|\, \psi^\dagger(x)\, \psi^\dagger(y)\, |0\rangle \langle 0|\, \psi(x)\, \psi(y)\, |p_1, p_2\rangle \\ & = \left(e^{i(q_1 x + q_2 y)} + e^{i(q_1 y + q_2 x)}\right) \left(e^{-i(p_1 x + p_2 y)} + e^{-i(p_1 y + p_2 x)}\right) \\ & = e^{i[(q_1 - p_1)x + (q_2 - p_2)y]} + e^{i[(q_2 - p_1)x + (q_1 - p_2)y]} + (x \leftrightarrow y) \,, \end{aligned} \tag{4.52} \]
where, in going to the second line, we have used the fact that for relativistically normalized states, $\langle 0|\, \psi(x)\, |p\rangle = e^{-ipx}$. Putting things together, the matrix element (4.50) takes the form
\[ \frac{(-ig)^2}{2} \int d^4x\, d^4y\, \frac{d^4k}{(2\pi)^4} \left\{ e^{i[(q_1 - p_1)x + (q_2 - p_2)y]} + e^{i[(q_2 - p_1)x + (q_1 - p_2)y]} + (x \leftrightarrow y) \right\} \frac{i\, e^{ik(x - y)}}{k^2 - m^2 + i\epsilon} \,, \tag{4.53} \]
where the term in curly brackets arises from (4.52), while the final factor stems from the expression for the propagator (3.129). The $(x \leftrightarrow y)$ terms double up with the others to cancel the factor of $1/2$ in the prefactor $(-ig)^2/2$, while the $x$ and $y$ integrals give delta functions. One arrives at
\[ \begin{aligned} (-ig)^2 \int \frac{d^4k}{(2\pi)^4}\, \frac{i\, (2\pi)^8}{k^2 - m^2 + i\epsilon} \Big[ & \delta^{(4)}(q_1 - p_1 + k)\, \delta^{(4)}(q_2 - p_2 - k) \\ & + \delta^{(4)}(q_2 - p_1 + k)\, \delta^{(4)}(q_1 - p_2 - k) \Big] \,. \end{aligned} \tag{4.54} \]
Finally, we perform the $k$ integration using the delta functions. We obtain
\[ i(-ig)^2 \left[ \frac{1}{(p_1 - q_1)^2 - m^2 + i\epsilon} + \frac{1}{(p_1 - q_2)^2 - m^2 + i\epsilon} \right] (2\pi)^4\, \delta^{(4)}(p_1 + p_2 - q_1 - q_2) \,, \tag{4.55} \]
where the delta function, as in (4.29), imposes momentum conservation. Let me note that, in fact, we can drop the $i\epsilon$ in the propagators, since the denominators cannot become zero. In order to see this, we go to the center-of-mass (CM) frame, where $\vec p_1 = -\vec p_2$ and, by energy-momentum conservation, $|\vec p_1| = |\vec q_1|$. This ensures that the 4-momentum of the meson is $k = (0, \vec p_1 - \vec q_1)$, and in consequence $k^2 < 0$. We will shortly see another, much simpler way to reproduce the result (4.55) using Feynman diagrams. This will also shed light on the physical interpretation.
Notice that the above calculation is also relevant for the scattering processes $\bar N \bar N \to \bar N \bar N$ and $N \bar N \to N \bar N$. Both reactions arise from the term (4.51) in Wick's theorem. However, we will never find a term that contributes to $NN \to \bar N \bar N$ or $\bar N \bar N \to NN$, because these transitions would violate the conservation of the charge $Q$ introduced in (3.95).
4.6 Feynman Diagrams
As the above example demonstrates, actually computing scattering amplitudes using Wick's theorem is (still) rather tedious. There's a much better way, which starts by drawing pretty pictures. These pictures represent the expansion of $\langle f|\, S\, |i\rangle$, and we will learn how to associate mathematical expressions with those pictures. The pictures, you probably already guessed it, are the famous Feynman diagrams. The Feynman-diagram approach turns out to be a powerful tool to calculate QFT amplitudes (or as Schwinger puts it in [1]: "Like the silicon chips of more recent years, the Feynman diagram was bringing computation to the masses.").
We again start simple and consider the case of four fields, all at different space-time points, which we have already worked out in (4.45). Let us represent each of the points $x_1$ to $x_4$ by a point and the propagators $D_F(x_1 - x_2)$ etc. by a line joining the relevant points. Then the right-hand side of (4.45) can be represented as a sum of three Feynman diagrams,
\[ \langle 0|\, T\big(\phi_1 \phi_2 \phi_3 \phi_4\big)\, |0\rangle = [\text{diagram: lines 1-2 and 3-4}] + [\text{diagram: lines 1-3 and 2-4}] + [\text{diagram: lines 1-4 and 2-3}] \,. \tag{4.56} \]
While this matrix element is not a measurable quantity, the pictures suggest a physical interpretation. Two particles are generated at two points and then each propagates to one of the other points, where they are both annihilated. This can happen in three possible ways, corresponding to the three graphs shown. The total amplitude for this process is the sum of the three Feynman diagrams.
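Things get more interesting if one considers expressions like (4.56) that contain field operators evaluated at the same space-time point. So let us have a look at the expansion of the propagator (4.31) of the real scalar field,
\[ \langle 0|\, T\left\{ \phi(x)\, \phi(y) + \phi(x)\, \phi(y) \left[-i \int dt\, H_I(t)\right] + \ldots \right\} |0\rangle \,, \tag{4.57} \]
in the presence of the interaction term $H_I = \lambda/(4!)\, \phi^4$ of the $\phi^4$ theory (4.6). The first term gives the free-field result, $\langle 0|\, T \phi(x)\, \phi(y)\, |0\rangle = D_F(x - y)$, while the second term takes the form
\[ \begin{aligned} & \langle 0|\, T\left\{ \phi(x)\, \phi(y)\, (-i) \int dt \int d^3z\, \frac{\lambda}{4!}\, \phi^4(z) \right\} |0\rangle \\ & \qquad = \langle 0|\, T\left\{ \phi(x)\, \phi(y)\, \frac{-i\lambda}{4!} \int d^4z\; \phi(z)\, \phi(z)\, \phi(z)\, \phi(z) \right\} |0\rangle \,. \end{aligned} \tag{4.58} \]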
Now let's apply Wick's theorem (4.42) to (4.58). We get one term for each possible way to contract the six different $\phi$'s with each other in pairs. There are 15 such possibilities, but fortunately only two of these possibilities are really different. If we contract $\phi(x)$ and $\phi(y)$, there are 3 possible ways to contract the remaining $\phi(z)$'s. The other possibility is to contract $\phi(x)$ with $\phi(z)$ (four choices), $\phi(y)$ with $\phi(z)$ (three choices), and $\phi(z)$ with $\phi(z)$ (one choice). There are 12 possible ways to do this, all giving the same result. In consequence, we have
\[ \begin{aligned} & \langle 0|\, T\left\{ \phi(x)\, \phi(y)\, (-i) \int dt \int d^3z\, \frac{\lambda}{4!}\, \phi^4(z) \right\} |0\rangle \\ & \qquad = 3\, \frac{-i\lambda}{4!}\; D_F(x - y) \int d^4z\; D_F(z - z)\, D_F(z - z) \\ & \qquad \quad + 12\, \frac{-i\lambda}{4!} \int d^4z\; D_F(x - z)\, D_F(y - z)\, D_F(z - z) \,. \end{aligned} \tag{4.59} \]
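The counting of 3 + 12 = 15 contractions can be verified by brute force. In the sketch below the four fields at $z$ are given distinct labels `z1` to `z4` so that individual contractions can be told apart; the labels and function names are illustrative only.

```python
def pairings(points):
    """Yield every way to contract all points in pairs."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def classify_contractions():
    """Sort the contractions of phi(x) phi(y) phi(z)^4 by diagram type."""
    fields = ["x", "y", "z1", "z2", "z3", "z4"]
    counts = {"x-y contracted": 0, "x and y contracted with z": 0}
    for pairing in pairings(fields):
        if ("x", "y") in pairing:
            counts["x-y contracted"] += 1             # first term of (4.59)
        else:
            counts["x and y contracted with z"] += 1  # second term of (4.59)
    return counts


print(classify_contractions())
```

The two counts reproduce the prefactors 3 and 12 of the two terms in (4.59).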
We can understand the latter expression better if we represent each term as a Feynman graph. Again we draw each propagator as a line and each point as a dot. This time, however, we have to distinguish between the external points $x$ and $y$ and the internal point $z$, which is associated with a factor $-i\lambda \int d^4z$. Neglecting the overall factors, we see that the expression (4.59) is equal to the sum of the following two diagrams:
\[ [\text{diagram: line } x\text{-}y \text{ next to a disconnected figure-eight loop at } z] + [\text{diagram: line } x\text{-}z\text{-}y \text{ with a loop attached at } z] \,. \tag{4.60} \]
We refer to the lines in these diagrams as propagators, since they represent the propagation amplitudes $D_F(x - y)$ etc. Internal points where four lines meet are called vertices. Since $D_F(x - y)$ is the amplitude for a free Klein-Gordon particle to propagate between $x$ and $y$, the diagrams actually interpret the analytic formula as a process of creation, propagation, and annihilation which takes place in space-time.
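Let's now move to a more complicated contraction that arises at order $\lambda^3$ in the $\phi^4$ interaction ($\phi_x = \phi(x)$ etc.):
\[ \begin{aligned} & \langle 0|\, \phi_x \phi_y\, \frac{1}{3!} \left(\frac{-i\lambda}{4!}\right)^3 \int d^4z\; \phi_z \phi_z \phi_z \phi_z \int d^4w\; \phi_w \phi_w \phi_w \phi_w \int d^4u\; \phi_u \phi_u \phi_u \phi_u\, |0\rangle \\ & \qquad = \frac{1}{3!} \left(\frac{-i\lambda}{4!}\right)^3 \int d^4z\, d^4w\, d^4u\; D_F(x - z)\, D_F(z - z)\, D_F(z - w) \\ & \qquad \qquad \times D_F^2(w - u)\, D_F(u - u)\, D_F(w - y) \,, \end{aligned} \tag{4.61} \]
where the fields on the left-hand side are understood to be contracted in the pattern given by the propagators on the right-hand side.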
The number of different contractions that give this result is large. One has
\[ 3! \times 4 \cdot 3 \times 4 \cdot 3 \cdot 2 \times 4 \cdot 3 \times 1/2 \,, \tag{4.62} \]
which means a total number of 10 368 possibilities. Here the factor $3!$ arises from the interchange of the vertices $z$, $w$, and $u$, while the first $4 \cdot 3$ factor describes the placement of the contractions into the $z$ vertex. The factor $4 \cdot 3 \cdot 2$ characterizes the placement of the contractions into the $w$ vertex, whereas the second $4 \cdot 3$ factor is associated with the placement of the contractions into the $u$ vertex. Finally, the factor of $1/2$ is due to the interchange of the $w$-$u$ contractions. The product in (4.62) is roughly 1/13 of the total number of 135 135 contractions of 14 different field operators. The particular contraction (4.61) can be represented by the following "cactus" diagram:
\[ [\text{cactus diagram with external points } x, y \text{ and vertices } z, w, u] \,. \tag{4.63} \]
It is conventional, for obvious reasons, to let this one diagram represent the sum of all 10 368
identical terms.
In practical applications one always draws the Feynman diagrams first, using them as a mnemonic device to write down the analytic expression. If this is done, one still has to figure out the multiplicative overall factor. Of course, one can do this as we have done it above by associating a factor $\int d^4z\, (-i\lambda/(4!))$ with each vertex, putting in the $1/n!$ factor from the Taylor expansion, and then doing the combinatorics by writing out the product of fields as in (4.61) and counting. Yet, typically the $1/n!$ factor from the Taylor series will cancel the $n!$ factor arising from interchanging the vertices, so that one can simply forget about these factors. Furthermore, the generic vertex has four different lines coming from four different places, so that the various contractions into the operator $\phi\phi\phi\phi$ generate a factor of $4!$ (as in the case of the $w$ vertex in the above example). This factor of $4!$ cancels the denominator of $-i\lambda/(4!)$. It is therefore conventional to associate the expression $\int d^4z\, (-i\lambda)$ with each vertex.
Applying this scheme to the Feynman graph in (4.63) gives a multiplicative factor that is too large by a factor of $S = 8 = 2 \cdot 2 \cdot 2$, which is called the symmetry factor of the diagram. Two factors of 2 come from lines that start and end on the same vertex, since the diagram is symmetric under the interchange of the ends of such lines ($z$ and $u$ in our case). The other factor of 2 comes from the two propagators connecting $w$ and $u$, since the graph is symmetric under the interchange of these two lines. A third type of symmetry (not arising in the case at hand) is the equivalence of two vertices. In order to arrive at the correct overall factor, one has to divide by the symmetry factor, which is in general the number of possibilities to change parts of the diagram without changing the result of the Feynman graph.
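The numbers quoted for the cactus diagram can be cross-checked with a few lines of arithmetic. The sketch below only uses quantities stated in the text: the product (4.62), the number of complete pairings of 14 field operators, and the conventional weight of $3!\,(4!)^3$ that is implicitly assumed when a plain factor $-i\lambda$ is assigned to each of the three vertices. The variable and function names are illustrative.

```python
import math


def complete_pairings(n_fields):
    """Number of ways to contract an even number of fields in pairs: (n-1)!!."""
    count = 1
    for k in range(n_fields - 1, 0, -2):
        count *= k
    return count


# Product (4.62): vertex interchange, placements into z, w, u, and the
# interchange of the two w-u contractions.
equivalent_contractions = math.factorial(3) * (4 * 3) * (4 * 3 * 2) * (4 * 3) // 2

# Weight assumed by the simplified vertex rule: 3! from the Taylor series
# and 4! per vertex, for three vertices.
naive_weight = math.factorial(3) * math.factorial(4) ** 3

symmetry_factor = naive_weight // equivalent_contractions

print(equivalent_contractions)  # 10368
print(complete_pairings(14))    # 135135
print(symmetry_factor)          # 8
```

The ratio of the naive weight to the true number of equivalent contractions is exactly the symmetry factor $S = 8$ discussed above.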
Most people never need to evaluate Feynman graphs with a symmetry factor larger than 2, so there is no need to worry too much about these technicalities. But for completeness let me give some examples of nontrivial symmetry factors. Here they are (dropping the labels $x$ and $y$ at the external points):
\[ \begin{aligned} & [\text{diagram}]:\ S = 2 \,, \qquad && [\text{diagram}]:\ S = 2 \cdot 2 \cdot 2 = 8 \,, \\ & [\text{diagram}]:\ S = 3! = 6 \,, \qquad && [\text{diagram}]:\ S = 3! \cdot 2 = 12 \,. \end{aligned} \tag{4.64} \]
Clearly, if you are in doubt about the symmetry factor you can always determine it by counting equivalent contractions, as we did above.
We are now ready to summarize the rules needed to find the analytic expression for each piece of a given Feynman diagram in the $\phi^4$ theory:
1. For each propagator (a line joining $x$ and $y$) one has $D_F(x - y)$.
2. For each vertex (a point $z$ where four lines meet) one has $(-i\lambda) \int d^4z$.
3. For each external point $x$ one has $1$.
4. Divide by the symmetry factor.
Since these rules are written in terms of space-time points $x$, $y$, $z$, etc., they are called position-space Feynman rules. One way to interpret these rules is to think of the factor $(-i\lambda)$ as the amplitude for the emission and/or absorption of particles at a vertex. The integral $\int d^4z$ tells us that we have to sum over all points where this process can occur. This is nothing but the superposition principle of QM: when a process can happen in different ways, we add the amplitudes for each possibility. Furthermore, in order to calculate each individual amplitude, the Feynman rules tell us to multiply the amplitudes (propagators and vertices) for each independent part of the process.
The above Feynman rules are given in position space. Yet, in actual calculations it is (often) more convenient to work in momentum space by introducing the Fourier transform of the propagator (3.129). To such a propagator one has to assign a 4-momentum $p$, indicating in general the direction of the momentum with an arrow (since $D_F(x - y) = D_F(y - x)$, the direction of $p$ is arbitrary). For a vertex with incoming momenta $p_1$, $p_2$, $p_3$ and outgoing momentum $p_4$, the $z$-dependent factors are then given by
\[ \int d^4z\; e^{-i(p_1 + p_2 + p_3 - p_4)z} = (2\pi)^4\, \delta^{(4)}(p_1 + p_2 + p_3 - p_4) \,. \tag{4.65} \]
In other words, momentum is conserved at each vertex. The delta functions from the vertices can now be used to perform some of the momentum integrals from the propagators. We are left with the following momentum-space Feynman rules:
1. For each propagator (a line carrying momentum $p$) one has $\dfrac{i}{p^2 - m^2 + i\epsilon}$.
2. For each vertex one has $-i\lambda$.
3. For each external point $x$ (attached to a line carrying momentum $p$) one has $e^{-ipx}$.
4. Impose momentum conservation at each vertex.
5. Integrate over each undetermined momentum, $\displaystyle \int \frac{d^4l}{(2\pi)^4}$.
6. Divide by the symmetry factor.
Again, we can interpret each factor as the amplitude for that part of the process, with the integrations coming from the superposition principle. The exponential factor for an external point is just the amplitude for a particle at that point to have the needed momentum, or, depending on the direction of the arrow, for a particle with a certain momentum to be found at the specific point.
4.7 Third Look at Scattering Processes
Let us now apply the things that we have learned to the case of $NN \to NN$ scattering. At order $g^2$ we have to consider the two diagrams shown in Figure 4.1. Employing the relevant momentum-space Feynman rules, it is readily seen that the analytic expression for the sum of the displayed graphs agrees with the final result (4.55) of the calculation that we performed earlier in Section 4.5. In fact, there is a nice physical interpretation of the graphs. We talk, rather loosely, of the nucleons exchanging a meson which, in the first diagram, has momentum $k = p_1 - q_1 = q_2 - p_2$. This meson does not satisfy the usual energy dispersion relation, because $k^2 \neq m^2$, where $m$ is the mass of the meson. The meson is called a virtual particle and is said to be off-shell (or, sometimes, off mass-shell). Heuristically, it can't live long enough for its energy to be measured to great accuracy. In contrast, the momenta on the external nucleon legs satisfy $p_1^2 = p_2^2 = q_1^2 = q_2^2 = M^2$, which means that the nucleons, having mass $M$, are on-shell. Similar considerations apply to the second diagram. It is important to notice that the appearance of the two diagrams above ensures that the particles satisfy Bose statistics.
The diagrams describing the scattering of a nucleon and an antinucleon, $N\bar N \to N\bar N$, are a little bit different from the ones for $NN \to NN$. At lowest order, the corresponding graphs are shown in Figure 4.2. It is a simple matter to write down the amplitude using the relevant Feynman rules,
\[ i(-ig)^2 \left[ \frac{1}{(p_1 + p_2)^2 - m^2 + i\epsilon} + \frac{1}{(p_1 - q_1)^2 - m^2 + i\epsilon} \right] (2\pi)^4\, \delta^{(4)}(p_1 + p_2 - q_1 - q_2) \,. \tag{4.66} \]
Notice that in the CM frame, $\vec p_1 = -\vec p_2$, the denominator of the first term in the square bracket is $4\,(M^2 + \vec p_1^{\,2}) - m^2$. If $m < 2M$, then this term never vanishes and we may drop the $i\epsilon$. In contrast, if $m > 2M$, then the amplitude corresponding to the first diagram diverges at some value of $\vec p_1$. In this case it turns out that we may also neglect the $i\epsilon$ term, although for a different reason: the meson is unstable when $m > 2M$ and thus has a finite width $\Gamma$. When correctly treated, this instability adds a finite imaginary piece $i\Gamma$ to the denominator, which makes the application of the $i\epsilon$ prescription unnecessary. Nonetheless, the increase in the scattering amplitude which we see in the first diagram when $4\,(M^2 + \vec p_1^{\,2}) = m^2$ is what allows us to discover new particles. These appear as a resonance (a peak or bump) in the cross section (roughly the amplitude squared).

Figure 4.1: Feynman diagrams contributing to $NN \to NN$ scattering at order $g^2$. [Two diagrams: incoming nucleons with momenta $p_1$, $p_2$ and outgoing nucleons with momenta $q_1$, $q_2$ exchange a meson $M$.]
We see that the amplitudes (4.55) and (4.66) (and in general all processes that include the exchange of just a single particle) depend on the same combinations of momenta in the denominators. There are standard names for various sums and differences of momenta, which are known as Mandelstam variables. They are
\[ \begin{aligned} s & = (p_1 + p_2)^2 = (q_1 + q_2)^2 \,, \\ t & = (p_1 - q_1)^2 = (p_2 - q_2)^2 \,, \\ u & = (p_1 - q_2)^2 = (p_2 - q_1)^2 \,, \end{aligned} \tag{4.67} \]
where, as in the explicit examples above, $p_1$ and $p_2$ are the momenta of the two initial-state particles, and $q_1$ and $q_2$ are the momenta of the two final-state particles. In order to get a feel for what these variables mean, let us assume (for simplicity) that all four particles are the same. In the CM frame, the initial two particles have the following 4-momenta:
\[ p_1 = (E, 0, 0, p) \,, \qquad p_2 = (E, 0, 0, -p) \,. \tag{4.68} \]
The particles then scatter at some angle $\theta$ and leave with momenta
\[ q_1 = (E, 0, p \sin\theta, p \cos\theta) \,, \qquad q_2 = (E, 0, -p \sin\theta, -p \cos\theta) \,. \tag{4.69} \]
Then from the definitions (4.67), we have that
\[ s = 4E^2 \,, \qquad t = -2p^2\, (1 - \cos\theta) \,, \qquad u = -2p^2\, (1 + \cos\theta) \,. \tag{4.70} \]
We see that the variable $s$ measures the total center-of-mass energy of the collision, while the variables $t$ and $u$ are measures of the energy exchanged between particles (they are basically equivalent, just with the outgoing particles swapped around). Now the amplitudes that involve exchange of a single particle can be written simply in terms of the Mandelstam variables. E.g., for nucleon-nucleon scattering, the amplitude (4.55) is proportional to^26
\[ \mathcal{A}(NN \to NN) \propto \frac{1}{t - m^2} + \frac{1}{u - m^2} \,, \tag{4.71} \]
while in the case of nucleon-antinucleon scattering one finds
\[ \mathcal{A}(N\bar N \to N\bar N) \propto \frac{1}{s - m^2} + \frac{1}{t - m^2} \,. \tag{4.72} \]

^26 Here and in the following we simply drop all $i\epsilon$ terms.

Figure 4.2: Feynman diagrams contributing to $N\bar N \to N\bar N$ scattering at order $g^2$. [Two diagrams: incoming $N$, $\bar N$ with momenta $p_1$, $p_2$ and outgoing $N$, $\bar N$ with momenta $q_1$, $q_2$, connected by a meson $M$.]
We say that the rst case involves t- and u-channel diagrams. On the other hand, the nucleon-
antinucleon scattering is said to involve s- and t-channel exchange.
Note finally that there is a relationship between the Mandelstam variables. In the cases of $NN \to NN$ and $N\bar{N} \to N\bar{N}$ scattering, which involve external particles with the same mass, one has
$$
s + t + u = 4M^2 \,.
\tag{4.73}
$$
When the masses of the external particles are different this becomes $s + t + u = \sum_{i=1}^{4} m_i^2$, where $m_i$ denotes the individual masses of the initial- and final-state particles.
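
The relations (4.70) and (4.73) are easy to check numerically. The following is a minimal sketch, assuming the CM-frame kinematics of (4.68) and (4.69) with four particles of equal mass $M$; the helper names and the numerical values are illustrative and not part of the lecture.

```python
import math

def minkowski_square(k):
    """Return k^2 = E^2 - |k|^2 for a 4-vector k = (E, kx, ky, kz)."""
    E, kx, ky, kz = k
    return E * E - kx * kx - ky * ky - kz * kz

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def mandelstam(p1, p2, q1, q2):
    """Mandelstam variables s, t, u as defined in (4.67)."""
    s = minkowski_square(add(p1, p2))
    t = minkowski_square(sub(p1, q1))
    u = minkowski_square(sub(p1, q2))
    return s, t, u

def cm_momenta(M, p, theta):
    """CM-frame momenta (4.68) and (4.69) for four particles of equal mass M."""
    E = math.sqrt(M * M + p * p)
    p1 = (E, 0.0, 0.0, p)
    p2 = (E, 0.0, 0.0, -p)
    q1 = (E, 0.0, p * math.sin(theta), p * math.cos(theta))
    q2 = (E, 0.0, -p * math.sin(theta), -p * math.cos(theta))
    return p1, p2, q1, q2

M, p, theta = 0.94, 0.3, 0.7  # illustrative values in arbitrary units
s, t, u = mandelstam(*cm_momenta(M, p, theta))
# The last number printed is s + t + u - 4 M^2, which should vanish up to rounding.
print(s, t, u, s + t + u - 4 * M * M)
```

Because $t$ and $u$ are computed directly from the 4-vectors, the same functions can be reused for other frames or for unequal masses, where the sum rule becomes $s + t + u = \sum_i m_i^2$.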
Let us now consider the case of meson-meson scattering, $MM \to MM$. The simplest diagram we can draw that describes this process is shown in Figure 4.3. It has a single loop, and momentum conservation at each vertex is no longer sufficient to determine every momentum passing through the diagram. Assigning the single undetermined momentum $l$ to the right-hand propagator, all other momenta are fixed by the kinematics (the actual momentum assignments are not displayed in the figure). The amplitude corresponding to the displayed diagram is
$$
\begin{aligned}
i \, (-ig)^4 \int \frac{d^4 l}{(2\pi)^4} \; & \frac{1}{l^2 - M^2} \, \frac{1}{(l + q_1)^2 - M^2} \\
\times \; & \frac{1}{(l - p_1 + q_1)^2 - M^2} \, \frac{1}{(l - q_2)^2 - M^2} \; (2\pi)^4 \, \delta^{(4)}(p_1 + p_2 - q_1 - q_2) \,.
\end{aligned}
\tag{4.74}
$$
While an explicit calculation of this loop integral is beyond the scope of this lecture, notice that for large $l$ this integral goes as $\int d^4 l / l^8$, which means that it is UV finite (the integral is also IR finite since all propagators are massive). In general, however, loop integrals can have both UV ($l^2 \to \infty$) and IR ($l^2 \to 0$) singularities.
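
The large-$l$ estimate above is an instance of naive power counting. As a minimal sketch (the function names are illustrative, and the criterion only looks at the overall scaling of the integrand, not at possible subdivergences in multi-loop diagrams), one can count powers of the loop momentum for a diagram built from scalar propagators:

```python
def superficial_degree(dimension, loops, scalar_propagators):
    """Net power of loop momentum at large l: each loop integral contributes
    `dimension` powers, each scalar propagator removes two."""
    return dimension * loops - 2 * scalar_propagators

def is_uv_finite_by_power_counting(dimension, loops, scalar_propagators):
    # A negative degree means the integrand falls off fast enough at large l.
    return superficial_degree(dimension, loops, scalar_propagators) < 0

# One-loop diagram of (4.74): four dimensions, one loop, four propagators.
print(superficial_degree(4, 1, 4))              # -4, i.e. d^4l / l^8
print(is_uv_finite_by_power_counting(4, 1, 4))  # True
```

For comparison, a one-loop diagram with only two scalar propagators in four dimensions has degree zero, which signals a logarithmic UV divergence.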
The delta function follows from the conservation of 4-momentum which, in turn, follows from space-time translational invariance. It is common to all S-matrix elements. We will define the amplitude $\mathcal{A}(f \leftarrow i)$ by stripping off this momentum-conserving delta function,
$$
\langle f | S - 1 | i \rangle = i \, \langle f | T | i \rangle = i \, (2\pi)^4 \, \delta^{(4)}(p_f - p_i) \, \mathcal{A}(f \leftarrow i) \,,
\tag{4.75}
$$
where $p_f$ ($p_i$) is the sum of the final (initial) 4-momenta, and the factor of $i$ out front is a convention which is there to match non-relativistic QM.

Figure 4.3: Lowest-order contribution to $MM \to MM$ scattering. The momentum assignments are not explicitly shown.

4.8 Yukawa Potential
So far we have calculated the quantum amplitudes for various scattering processes. But these quantities are a little bit abstract. In order to make contact with experiment, let me show in the following how to translate the amplitude (4.55) for nucleon-nucleon scattering into something familiar from Newtonian mechanics, namely a potential, or force, between the particles.
We start by asking a simple question in classical field theory that will turn out to be relevant for the calculation of the quantum process. Suppose that we have a fixed delta-function source for our real scalar field $\phi$ that persists for all times. What is the profile of $\phi(\mathbf{x})$? In order to answer this question, we have to solve the static Klein-Gordon equation,
$$
\left( -\nabla^2 + m^2 \right) \phi(\mathbf{x}) = \delta^{(3)}(\mathbf{x}) \,.
\tag{4.76}
$$
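We can solve this equation by going to momentum space, $\phi(\mathbf{x}) = \int d^3p/(2\pi)^3 \, e^{i \mathbf{p} \cdot \mathbf{x}} \, \tilde{\phi}(\mathbf{p})$. After this Fourier transformation the relation (4.76) takes the form $(\mathbf{p}^2 + m^2) \, \tilde{\phi}(\mathbf{p}) = 1$, which means that we can write the field as
$$
\phi(\mathbf{x}) = \int \frac{d^3p}{(2\pi)^3} \, \frac{e^{i \mathbf{p} \cdot \mathbf{x}}}{\mathbf{p}^2 + m^2} \,.
\tag{4.77}
$$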
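Let us compute this integral explicitly. Changing to polar coordinates, and writing $\mathbf{p} \cdot \mathbf{x} = p r \cos\theta$, we get
$$
\begin{aligned}
\phi(\mathbf{x}) &= \frac{1}{(2\pi)^2} \int_0^\infty dp \, \frac{p^2}{p^2 + m^2} \, \frac{2 \sin(pr)}{pr}
= \frac{1}{(2\pi)^2 \, r} \int_{-\infty}^{\infty} dp \, \frac{p \sin(pr)}{p^2 + m^2} \\
&= \frac{1}{2\pi r} \, \mathrm{Re} \left[ \int_{-\infty}^{\infty} \frac{dp}{2\pi i} \, \frac{p \, e^{ipr}}{p^2 + m^2} \right] .
\end{aligned}
\tag{4.78}
$$
We evaluate the last integral by closing the contour in the upper half plane, $p \to +i\infty$, picking up the pole at $p = +im$. This gives
$$
\phi(\mathbf{x}) = \frac{1}{4\pi r} \, e^{-mr} \,.
\tag{4.79}
$$
We see that the field dies off exponentially quickly beyond distances of order $1/m$, i.e., the Compton wavelength of the meson.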
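
As a cross-check of (4.79), one can verify numerically that the profile solves the static Klein-Gordon equation (4.76) away from the origin and carries unit flux through a small sphere around the source. The following is a rough finite-difference sketch; the step sizes, the sample radii and the value of $m$ are arbitrary choices made for illustration.

```python
import math

def yukawa_profile(r, m):
    """phi(r) = exp(-m r) / (4 pi r), the profile (4.79)."""
    return math.exp(-m * r) / (4.0 * math.pi * r)

def radial_laplacian(f, r, h=1e-4):
    """Laplacian of a spherically symmetric f, (1/r) d^2(r f)/dr^2, by central differences."""
    g = lambda s: s * f(s)
    return (g(r + h) - 2.0 * g(r) + g(r - h)) / (h * h * r)

def klein_gordon_residual(r, m):
    """(-nabla^2 + m^2) phi at radius r > 0; should vanish away from the source."""
    phi = lambda s: yukawa_profile(s, m)
    return -radial_laplacian(phi, r) + m * m * phi(r)

def outward_flux(r, m, h=1e-6):
    """-4 pi r^2 dphi/dr; tends to 1, the strength of the delta source, as r -> 0."""
    dphi = (yukawa_profile(r + h, m) - yukawa_profile(r - h, m)) / (2.0 * h)
    return -4.0 * math.pi * r * r * dphi

m = 0.7  # illustrative meson mass in arbitrary units
for r in (0.5, 1.0, 2.0):
    print(r, klein_gordon_residual(r, m))  # close to zero
print(outward_flux(1e-3, m))               # close to one
```

The flux test is the numerical counterpart of integrating (4.76) over a small ball: the $m^2 \phi$ term drops out as the radius shrinks, leaving the surface integral of $-\nabla \phi$ equal to one.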
It is now interesting to ask how the profile of the field $\phi$ (the meson) and the force between the particles (the nucleons) are related. Recall that we face a similar problem in electrostatics, where a charged particle acts as a delta-function source for the gauge potential $A_0$, with $A^\mu = (\phi, \mathbf{A})$. In this case one has $-\nabla^2 A_0 = \delta^{(3)}(\mathbf{x})$, which is solved by $A_0 = 1/(4\pi r)$. The profile of $A_0$ then acts as the potential energy for another charged (test) particle moving in this background. Is such an interpretation also possible in the case of $\phi$? Or, phrased slightly differently, is there a classical limit of the scalar Yukawa theory where the nucleons act as delta-function sources for the meson field, creating the profile (4.79)? And, if so, is this profile then felt as a static potential? The answer is essentially yes, at least in the limit $M \gg m$. But the correct way to describe the potential felt by the nucleons is not to talk about classical fields at all, but instead to work directly with the quantum amplitudes.
Let us see explicitly how this goes. We first compare the result of the first diagram in Figure 4.1 to the corresponding amplitude in non-relativistic QM which describes the interaction of two particles through a potential. In order for the comparison to be meaningful, we have to take the non-relativistic limit of (4.55). We work in the CM frame with $\mathbf{p} = \mathbf{p}_1 = -\mathbf{p}_2$ and $\mathbf{q} = \mathbf{q}_1 = -\mathbf{q}_2$, with $|\mathbf{p}| = |\mathbf{q}|$ for elastic scattering. In the non-relativistic limit one has $|\mathbf{p}| \ll M$, which by momentum conservation implies $|\mathbf{q}| \ll M$. It is easy to check that in this limit the first term in (4.55) turns into
$$
\frac{i g^2}{(\mathbf{p} - \mathbf{q})^2 + m^2} \,.
\tag{4.80}
$$
We should now compare this result to the scattering amplitude in QM. In order to do this, we consider two particles separated by a distance $\mathbf{x}$, interacting through a potential $V(\mathbf{x})$. The amplitude for the particles to scatter from $\mathbf{p}$ into $\mathbf{q}$ can be computed in perturbation theory, using techniques familiar from non-relativistic QM. In Born approximation, i.e., to leading order in the perturbative expansion, the sought amplitude is given by
$$
\langle \mathbf{q} | V(\mathbf{x}) | \mathbf{p} \rangle = -i \int d^3x \; V(\mathbf{x}) \, e^{-i (\mathbf{p} - \mathbf{q}) \cdot \mathbf{x}} \,.
\tag{4.81}
$$
Taking into account that there is a relative factor of $(2M)^2$ that arises in comparing the QFT amplitude to $\langle \mathbf{q} | V(\mathbf{x}) | \mathbf{p} \rangle$, which can be traced to the relativistic normalization of the states $|p_1, p_2\rangle$,$^{27}$ we find after equating (4.80) and (4.81) the following relation
$$
\int d^3x \; V(\mathbf{x}) \, e^{-i (\mathbf{p} - \mathbf{q}) \cdot \mathbf{x}} = \frac{-\lambda^2}{(\mathbf{p} - \mathbf{q})^2 + m^2} \,.
\tag{4.82}
$$

$^{27}$ Notice that this factor is also necessary to get the dimensions of the potential to work out correctly.

Here we have introduced the dimensionless parameter $\lambda = g/(2M)$. The latter equation is trivially inverted, giving
$$
V(\mathbf{x}) = -\lambda^2 \int \frac{d^3p}{(2\pi)^3} \, \frac{e^{i \mathbf{p} \cdot \mathbf{x}}}{\mathbf{p}^2 + m^2} = -\frac{\lambda^2}{4\pi r} \, e^{-mr} \,,
\tag{4.83}
$$
where in the last step we have used the results (4.77) through (4.79). The potential $V(\mathbf{x})$ is the famous Yukawa potential. The force has a range $1/m$ and the minus sign in (4.83) tells us that the potential is attractive. Hideki Yukawa made this potential the basis for his theory of the nuclear force and worked backwards from the range of the force (of about 1 fm) to predict the mass (of about 200 MeV) of the required boson, the pion [2]. It is important to realize that QFT has given us an entirely new perspective on the nature of forces between particles. Rather than being a fundamental concept, the force arises from the virtual exchange of other particles, in this case the meson.
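
Yukawa's estimate amounts to converting a length into a mass with the help of $\hbar c \approx 197.327$ MeV fm, i.e., $m = \hbar c / r$ once the factors of $\hbar$ and $c$ that are set to one in natural units are restored. A small sketch of this arithmetic (the function names are illustrative):

```python
HBAR_C_MEV_FM = 197.327  # hbar * c in MeV fm (rounded)

def mediator_mass_mev(range_fm):
    """Mass m = hbar c / r of the exchanged particle for a force of range r."""
    return HBAR_C_MEV_FM / range_fm

def force_range_fm(mass_mev):
    """Range r = hbar c / m of the force mediated by a particle of mass m."""
    return HBAR_C_MEV_FM / mass_mev

print(mediator_mass_mev(1.0))   # about 200 MeV for a range of 1 fm
print(force_range_fm(139.57))   # about 1.4 fm for the charged-pion mass
```

The second line runs the argument backwards: inserting the measured charged-pion mass of roughly 139.57 MeV gives a range of about 1.4 fm, the same order of magnitude as the 1 fm used in the estimate.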
4.9 Connected and Amputated Feynman Diagrams
We have explained in some detail how to compute scattering amplitudes by drawing all Feynman diagrams and by writing down the corresponding analytic expressions using the Feynman rules. In fact, there are a couple of caveats about which Feynman diagrams one should draw and calculate. Both of these caveats are related to the assumption made so far that the initial and final states are eigenstates of the free theory which, as we have mentioned before, is not correct.
The two caveats are as follows. First, we consider only connected Feynman diagrams, where every part of the diagram is connected to at least one external line. We shall see shortly that this is related to the fact that the vacuum $|0\rangle$ of the free theory is not the true vacuum $|\Omega\rangle$ of the interacting theory. An example of a disconnected diagram (or piece) is shown on the left-hand side of Figure 4.4. Second, we do not consider diagrams with loops on external lines, so-called unamputated graphs. An example of such a diagram is depicted on the right-hand side of the same figure. These diagrams are related to the fact that the one-particle states of the free theory are not the same as the one-particle states of the interacting theory. In particular, correctly dealing with these diagrams will account for the fact that particles in interacting QFTs are always surrounded by a swarm of virtual particles. We will refer to diagrams in which all loops on external legs have been removed as amputated graphs.
Vacuum of the Interacting Theory
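We start out by discussing the properties of the vacuum $|\Omega\rangle$ of the interacting theory. We normalize the state $|\Omega\rangle$ as $\langle \Omega | \Omega \rangle = 1$ and denote its energy by $E_0$, i.e., $H |\Omega\rangle = E_0 |\Omega\rangle$. Since $|\Omega\rangle$ is the ground state of $H$, we can isolate it by the following procedure. Imagine starting with the vacuum $|0\rangle$ of the free theory (i.e., $H_0 |0\rangle = 0$) and evolving it with $H$,
$$
e^{-iHt} \, |0\rangle = \sum_n e^{-i E_n t} \, |n\rangle \langle n | 0 \rangle \,,
\tag{4.84}
$$
where $E_n$ ($|n\rangle$) are the eigenvalues (eigenstates) of $H$. We must assume that $|\Omega\rangle$ and $|0\rangle$ have some overlap, i.e., $\langle \Omega | 0 \rangle \neq 0$. If this were not the case, the interaction term $H_I$ would not be a small perturbation compared to $H_0$. Under this assumption, we can rewrite (4.84) as follows
$$
e^{-iHt} \, |0\rangle = e^{-i E_0 t} \, |\Omega\rangle \langle \Omega | 0 \rangle + \sum_{n \neq 0} e^{-i E_n t} \, |n\rangle \langle n | 0 \rangle \,,
\tag{4.85}
$$
where $E_0 = \langle \Omega | H | \Omega \rangle$. Since $E_n > E_0$ for all $n \neq 0$, we can get rid of the second term in (4.85) by sending $t$ to infinity in a slightly imaginary direction, $t \to \infty \, (1 - i\epsilon)$.$^{28}$ It follows that
$$
|\Omega\rangle = \lim_{t \to \infty (1 - i\epsilon)} \left( e^{-i E_0 t} \, \langle \Omega | 0 \rangle \right)^{-1} e^{-iHt} \, |0\rangle \,.
\tag{4.86}
$$

Figure 4.4: Example of a disconnected (left-hand side) and an unamputated (right-hand side) Feynman diagram in $\phi^4$ theory.
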
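
The projection onto the ground state in (4.86) can be illustrated with a toy spectrum. In the sketch below the three energy levels and the overlaps $\langle n | 0 \rangle$ are hypothetical numbers chosen only for illustration; the point is that at $t = T (1 - i\epsilon)$ every component with $E_n > E_0$ is suppressed relative to the ground-state component by $e^{-(E_n - E_0) \epsilon T}$.

```python
import cmath

def relative_components(energies, overlaps, T, eps):
    """Components of exp(-i H t)|0> in the eigenbasis of H at t = T (1 - i eps),
    divided by the ground-state component (index 0)."""
    t = T * (1 - 1j * eps)
    comps = [cmath.exp(-1j * E * t) * c for E, c in zip(energies, overlaps)]
    return [c / comps[0] for c in comps]

# Hypothetical three-level spectrum E_0 < E_1 < E_2 with overlaps <n|0>.
energies = [0.5, 1.5, 3.0]
overlaps = [0.6, 0.7, 0.4]
for T in (10.0, 100.0, 400.0):
    ratios = relative_components(energies, overlaps, T, eps=0.05)
    print(T, [abs(c) for c in ratios])
```

As $T$ grows, the printed magnitudes of the excited components shrink towards zero while the ground-state entry stays at one, which is the content of dropping the second term in (4.85).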
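Since $t$ is very large we can shift it by a small amount, let's say $t_0$, so that
$$
\begin{aligned}
|\Omega\rangle &= \lim_{t \to \infty (1 - i\epsilon)} \left( e^{-i E_0 (t + t_0)} \, \langle \Omega | 0 \rangle \right)^{-1} e^{-iH(t + t_0)} \, |0\rangle \\
&= \lim_{t \to \infty (1 - i\epsilon)} \left( e^{-i E_0 (t_0 - (-t))} \, \langle \Omega | 0 \rangle \right)^{-1} e^{-iH(t_0 - (-t))} \, e^{-i H_0 (-t - t_0)} \, |0\rangle \\
&= \lim_{t \to \infty (1 - i\epsilon)} \left( e^{-i E_0 (t_0 - (-t))} \, \langle \Omega | 0 \rangle \right)^{-1} U(t_0, -t) \, |0\rangle \,.
\end{aligned}
\tag{4.87}
$$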
Here we have used in the second line that $H_0 |0\rangle = 0$ and employed in the third line the relation $U(t, t') = \exp\left[ i H_0 (t - t_0) \right] \exp\left[ -i H (t - t') \right] \exp\left[ -i H_0 (t' - t_0) \right]$, which follows from (4.14). We see that (ignoring the prefactor) we can get the ket $|\Omega\rangle$ from $|0\rangle$ by simply evolving from $-t$ to $t_0$ with the time-evolution operator $U$. Similarly, we find for the bra $\langle \Omega |$ the expression
$$
\langle \Omega | = \lim_{t \to \infty (1 - i\epsilon)} \langle 0 | \, U(t, t_0) \left( e^{-i E_0 (t - t_0)} \, \langle 0 | \Omega \rangle \right)^{-1} .
\tag{4.88}
$$
Correlation Functions
There are many questions we want to ask in QFT that are not directly related to scattering experiments. E.g., we might want to compute the viscosity of the quark-gluon plasma, or understand the response of a condensed matter system to an experimental probe, or figure out the non-Gaussianity of density perturbations arising in the cosmic microwave background from novel models of inflation. All of these questions are answered in the framework of QFT by computing elementary objects known as correlation functions. In the following we will define correlation functions, explain how to compute them using Feynman diagrams, and then relate them back to scattering amplitudes.

$^{28}$ Since the boundary conditions of the Feynman propagator $D_F(x - y)$ correspond to an integration contour that is slightly rotated away from the $\mathrm{Re} \, p^0$-axis, the contribution of the imaginary piece of $t$ does not alter the final result.

In order to keep the following discussion as simple as possible, we will work in the real Klein-Gordon theory. We start by defining the $n$-point correlation (or Green's) function
$$
G^{(n)}(x_1, \ldots, x_n) = \langle \Omega | \, T \left( \phi_H(x_1) \ldots \phi_H(x_n) \right) | \Omega \rangle \,,
\tag{4.89}
$$
where $\phi_H$ denotes the field in the Heisenberg picture of the full theory, rather than the interaction picture that we have been dealing with so far. The first question that one can ask is how to compute $G^{(n)}$ in terms of matrix elements evaluated on $|0\rangle$, the vacuum of the free theory. Let me first state the result and then prove it. The result reads
$$
G^{(n)}(x_1, \ldots, x_n) = \lim_{t \to \infty (1 - i\epsilon)} \frac{\langle 0 | \, T \left\{ \phi_I(x_1) \ldots \phi_I(x_n) \exp\left[ -i \int_{-t}^{t} dt' \, H_I(t') \right] \right\} | 0 \rangle}{\langle 0 | \, T \left\{ \exp\left[ -i \int_{-t}^{t} dt' \, H_I(t') \right] \right\} | 0 \rangle} \,.
\tag{4.90}
$$
Notice that both the numerator and denominator appearing on the right-hand side of the latter equation can be calculated using the methods developed for S-matrix elements, namely Feynman diagrams (or alternatively Dyson's formula and Wick's theorem) after expanding the exponentials into a Taylor series.
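After stating the result (4.90), we still have to prove it. Without loss of generality we assume that $x_1^0 > \ldots > x_n^0 > t_0$. If this is not the case we simply relabel the points in an appropriate way. Such a relabeling leaves both sides of (4.90) unchanged. We then have
$$
\begin{aligned}
G^{(n)}(x_1, \ldots, x_n) &= \langle \Omega | \, \phi_H(x_1) \ldots \phi_H(x_n) \, | \Omega \rangle \\
&= \lim_{t \to \infty (1 - i\epsilon)} \left( e^{-i E_0 (t - t_0)} \, \langle 0 | \Omega \rangle \right)^{-1} \langle 0 | \, U(t, t_0) \\
&\quad \times \left[ U(x_1^0, t_0) \right]^\dagger \phi_I(x_1) \, U(x_1^0, t_0) \left[ U(x_2^0, t_0) \right]^\dagger \phi_I(x_2) \, U(x_2^0, t_0) \ldots \\
&\quad \times \left[ U(x_n^0, t_0) \right]^\dagger \phi_I(x_n) \, U(x_n^0, t_0) \, U(t_0, -t) \, | 0 \rangle \left( e^{-i E_0 (t_0 - (-t))} \, \langle \Omega | 0 \rangle \right)^{-1} \\
&= \lim_{t \to \infty (1 - i\epsilon)} \left( e^{-i E_0 (2t)} \, | \langle 0 | \Omega \rangle |^2 \right)^{-1} \\
&\quad \times \langle 0 | \, U(t, x_1^0) \, \phi_I(x_1) \, U(x_1^0, x_2^0) \ldots U(x_{n-1}^0, x_n^0) \, \phi_I(x_n) \, U(x_n^0, -t) \, | 0 \rangle \\
&= \lim_{t \to \infty (1 - i\epsilon)} \frac{\langle 0 | \, U(t, x_1^0) \, \phi_I(x_1) \, U(x_1^0, x_2^0) \ldots U(x_{n-1}^0, x_n^0) \, \phi_I(x_n) \, U(x_n^0, -t) \, | 0 \rangle}{\langle 0 | \, U(t, -t) \, | 0 \rangle} \,.
\end{aligned}
\tag{4.91}
$$
Here we have first used (4.87) and (4.88) and rewritten all Heisenberg fields $\phi_H$ in terms of interaction-picture fields,
$$
\phi_H(x) = \left[ U(x^0, t_0) \right]^\dagger \phi_I(x) \, U(x^0, t_0) \,.
\tag{4.92}
$$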
Remember that $U$ satisfies $U(t_1, t_2) \, U(t_2, t_3) = U(t_1, t_3)$ and $U(t_1, t_3) \left[ U(t_2, t_3) \right]^\dagger = U(t_1, t_2)$. In order to arrive at the last line, we have finally employed
$$
1 = \langle \Omega | \Omega \rangle = \left( e^{-i E_0 (2t)} \, | \langle 0 | \Omega \rangle |^2 \right)^{-1} \langle 0 | \, U(t, -t) \, | 0 \rangle \,.
\tag{4.93}
$$
The proof of (4.90) is complete after noticing that all fields in (4.91) are in time order and that, under the time-ordering symbol, the product of $U$ operators in the numerator combines to $U(t, -t) = T \exp\left[ -i \int_{-t}^{t} dt' \, H_I(t') \right]$. Hence the last line in (4.91) is nothing but the right-hand side of (4.90).
Exponentiation of Bubble Diagrams
By means of (4.90) we can now (in principle) calculate any $n$-point correlation function. But what is the physical interpretation of this equation? We first express the denominator of (4.90) in terms of Feynman diagrams,
$$
\lim_{t \to \infty (1 - i\epsilon)} \langle 0 | \, U(t, -t) \, | 0 \rangle = 1 + [\text{vacuum-bubble diagrams}] + \ldots \,.
\tag{4.94}
$$
The disconnected Feynman diagrams appearing on the right-hand side of this relation are called vacuum bubbles. What is the value of the first non-trivial graph? Restoring the position label and the integration momenta,
$$
[\text{vacuum-bubble diagram with loop momenta } l_1 \text{ and } l_2]
\tag{4.95}
$$
it is readily seen that momentum conservation requires $l_1 = l_2$, so that the diagram is proportional to $(2\pi)^4 \, \delta^{(4)}(0)$. This factor is also easily derived in position space, where one has
$$
\int d^4z \; (\text{const.}) \propto 2t \cdot V \,.
\tag{4.96}
$$
This result just tells us that the space-time process (4.95) can happen at any place in space, and at any time between $-t$ and $t$. Every disconnected diagram will have one such $(2\pi)^4 \, \delta^{(4)}(0) = 2t \cdot V$ factor, where $V$ denotes the volume of space.
In fact, the contributions to $G^{(n)}$ from disconnected diagrams can be shown to exponentiate. To prove this statement, known as the linked-cluster theorem, we first label the various possible disconnected pieces:
$$
V_i \in \left\{ [\text{list of all distinct vacuum-bubble diagrams}] \right\} .
\tag{4.97}
$$
Now we assume that a given Feynman diagram has $n_i$ pieces of the form $V_i$ for each $i$, in addition to its one piece that is connected. If we also denote the value of $V_i$ by $v_i$, the value of a single Feynman graph is
$$
(\text{value of connected piece}) \times \prod_i \frac{(v_i)^{n_i}}{(n_i)!} \,,
\tag{4.98}
$$
where $1/(n_i)!$ is the symmetry factor associated with interchanging the $n_i$ copies of the piece $V_i$. The value of the sum of all diagrams is then given by
$$
\sum_{\text{all connected diagrams}} \; \sum_{\text{all } \{n_i\}} (\text{value of connected piece}) \times \prod_i \frac{(v_i)^{n_i}}{(n_i)!} \,,
\tag{4.99}
$$
where "all $\{n_i\}$" means all ordered sets $\{n_1, n_2, \ldots\}$ of non-negative integers. The sum of the connected diagrams factors out of this expression, giving
$$
\left[ \sum_{\text{all connected diagrams}} (\text{value of connected piece}) \right] \times \sum_{\text{all } \{n_i\}} \left[ \prod_i \frac{(v_i)^{n_i}}{(n_i)!} \right] .
\tag{4.100}
$$
In fact, not only the connected pieces factorize, but also the disconnected ones. One has
$$
\sum_{\text{all } \{n_i\}} \left[ \prod_i \frac{(v_i)^{n_i}}{(n_i)!} \right] = \prod_i \left[ \sum_{n_i = 0}^{\infty} \frac{(v_i)^{n_i}}{(n_i)!} \right] = \prod_i \exp(v_i) = \exp\left( \sum_i v_i \right) .
\tag{4.101}
$$
We see that the combinatoric factors (as well as the symmetry factors) associated with each diagram are such that the whole series of disconnected pieces sums to an exponential. Taken together, (4.99) through (4.101) imply that the sum of all diagrams is equal to the sum of all connected diagrams multiplied by the exponential of the sum of all disconnected pieces.
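
The combinatorial identity (4.101) can be checked numerically by brute force. The sketch below sums over all occupation numbers $\{n_i\}$ up to a cutoff and compares the result with the exponential of the sum; the three values $v_i$ are arbitrary illustrative numbers and do not correspond to actual bubble diagrams.

```python
import itertools
import math

def truncated_multiplicity_sum(values, n_max):
    """Sum over all sets {n_i} with 0 <= n_i <= n_max of prod_i v_i^{n_i} / n_i!."""
    total = 0.0
    for counts in itertools.product(range(n_max + 1), repeat=len(values)):
        term = 1.0
        for v, n in zip(values, counts):
            term *= v ** n / math.factorial(n)
        total += term
    return total

values = [0.3, -0.2, 0.5]  # hypothetical values v_i of three disconnected pieces
lhs = truncated_multiplicity_sum(values, n_max=12)
rhs = math.exp(sum(values))
print(lhs, rhs)  # the two numbers agree up to the truncation error
```

The agreement reflects the fact that the sum over the ordered sets $\{n_i\}$ factorizes into independent sums over each $n_i$, each of which is the Taylor series of $\exp(v_i)$.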
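Applying our findings concerning the exponentiation of bubble diagrams to (4.94), we arrive at the following pictorial identity
$$
\lim_{t \to \infty (1 - i\epsilon)} \langle 0 | \, T \exp\left[ -i \int_{-t}^{t} dt' \, H_I(t') \right] | 0 \rangle = \exp\left( [\text{sum of all vacuum-bubble diagrams}] \right) .
\tag{4.102}
$$
The exponentiation of disconnected diagrams is also relevant in the case of the numerator of the right-hand side of (4.90). Let us consider the two-point correlation function $G^{(2)}$ for simplicity. In this case the numerator takes the form
$$
\begin{aligned}
\lim_{t \to \infty (1 - i\epsilon)} & \langle 0 | \, T \left\{ \phi_I(x) \, \phi_I(y) \exp\left[ -i \int_{-t}^{t} dt' \, H_I(t') \right] \right\} | 0 \rangle \\
&= \left( [\text{sum of connected diagrams joining } x \text{ and } y] \right) \times \exp\left( [\text{sum of all vacuum-bubble diagrams}] \right) .
\end{aligned}
\tag{4.103}
$$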
Combining now (4.102) and (4.103), it follows that the exponentials involving the sum of disconnected diagrams cancel between the numerator and denominator in the formula for the correlation functions. In the case of the two-point function, the final form of (4.90) is thus
$$
G^{(2)}(x, y) = [\text{sum of all connected diagrams with external points } x \text{ and } y] \,.
\tag{4.104}
$$
The generalization to higher correlation functions is straightforward and reads
$$
G^{(n)}(x_1, \ldots, x_n) = \langle \Omega | \, T \left( \phi_H(x_1) \ldots \phi_H(x_n) \right) | \Omega \rangle = \left( \text{sum of all connected graphs with } n \text{ external points} \right) .
\tag{4.105}
$$
The disconnected diagrams exponentiate, factor, and cancel as before. It is important to remember that by "disconnected" we mean "disconnected from all external points". In higher correlation functions, diagrams can also be disconnected in another sense. Consider, e.g., the four-point function
$$
G^{(4)}(x_1, x_2, x_3, x_4) = [\text{diagrams with four external points, including ones whose external points are only pairwise connected}] + \ldots \,.
\tag{4.106}
$$
In many of the diagrams contributing to (4.106), external points are disconnected from each other. Such diagrams neither exponentiate nor factor; they contribute to the amplitude just as the fully connected diagrams do, in which any point can be reached from any other by traveling along the lines.
Energy Density of Vacuum
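An immediate consequence of the linked-cluster theorem is that all vacuum bubbles cancel when calculating correlation functions. Does this mean that the disconnected diagrams have no physical meaning at all? The place to look for the answer to this question is (4.91) and (4.93). Taken together these two equations imply that
$$
\begin{aligned}
\lim_{t \to \infty (1 - i\epsilon)} & \langle 0 | \, T \left\{ \phi_I(x_1) \ldots \phi_I(x_n) \exp\left[ -i \int_{-t}^{t} dt' \, H_I(t') \right] \right\} | 0 \rangle \\
&= \langle \Omega | \, T \left( \phi_H(x_1) \ldots \phi_H(x_n) \right) | \Omega \rangle \, \lim_{t \to \infty (1 - i\epsilon)} \left( e^{-i E_0 (2t)} \, | \langle 0 | \Omega \rangle |^2 \right) .
\end{aligned}
\tag{4.107}
$$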
Looking only at the $t$-dependent parts on both sides, it follows that
$$
\exp\left( \sum_i v_i \right) \propto \exp\left[ -i E_0 (2t) \right] .
\tag{4.108}
$$
The sum of all vacuum bubbles is therefore related to the difference in the ground-state zero-point energies of the interacting and the free theory, the latter of which was defined to be zero. Because each bubble graph $V_i$ contains a single factor of $(2\pi)^4 \, \delta^{(4)}(0) = 2t \cdot V$, one explicitly finds that the energy density of the ground state of the (interacting) $\phi^4$ theory reads
$$
\mathcal{E}_0 = \frac{E_0}{V} = i \, [\text{sum of all vacuum-bubble diagrams}] \left( (2\pi)^4 \, \delta^{(4)}(0) \right)^{-1} .
\tag{4.109}
$$
Notice that the IR divergence arising from the infinite extent of the space-time volume (which we first met in Section 3.2 and then again in (4.96)) has been removed in $\mathcal{E}_0$, leaving behind a highly UV-divergent expression that reflects our ignorance about the physics governing the high-energy regime.
One-Particle States in Interacting Theory
We now have an extremely beautiful formula (4.105) for computing an extremely abstract quantity: the $n$-point correlation function. Our next task is to relate these objects back to S-matrix elements (4.25) (or equivalently T-matrix elements (4.49)), which will allow us to compute quantities that can actually be measured, namely decay rates and cross sections.
In order to achieve this goal, we still have to learn how to deal with diagrams involving loops on the external lines. Let us first try to understand the problem with such graphs, looking at a specific example. We consider the following Feynman diagram
$$
\begin{aligned}
& [\text{diagram with external momenta } p_1, p_2, q_1, q_2, \text{ internal momentum } p_3 \text{ and a loop with momentum } l \text{ on the leg } p_1] \\
&\quad = \frac{1}{2} \int \frac{d^4 p_3}{(2\pi)^4} \, \frac{i}{p_3^2 - m^2} \int \frac{d^4 l}{(2\pi)^4} \, \frac{i}{l^2 - m^2} \; (-i\lambda) \, (2\pi)^4 \, \delta^{(4)}(p_2 + p_3 - q_1 - q_2) \\
&\quad \phantom{=} \times (-i\lambda) \, (2\pi)^4 \, \delta^{(4)}(p_1 - p_3) \,,
\end{aligned}
\tag{4.110}
$$
appearing in $\phi^4$ theory. We can integrate over $p_3$ using the second delta function. It tells us to evaluate
$$
\left. \frac{1}{p_3^2 - m^2} \right|_{p_3 = p_1} = \frac{1}{p_1^2 - m^2} = \frac{1}{0} \,.
\tag{4.111}
$$
We get an infinity, since $p_1$, being the momentum of an external particle, is on-shell, i.e., $p_1^2 = m^2$. This is not good! Clearly, diagrams like (4.110) should not contribute to the
S-matrix elements. In fact, this is physically reasonable, since the external-leg corrections,
$$
[\text{sum of diagrams with loop corrections on a single external leg}] + \ldots \,,
\tag{4.112}
$$
represent the evolution of the one-particle state of the free theory into the one-particle state of the interacting theory, in the same way that the vacuum-bubble diagrams (4.97) represent the evolution of $|0\rangle$ into $|\Omega\rangle$. Since these corrections have nothing to do with the scattering process itself, it is plausible that one should exclude them from the calculation of the S-matrix.
For a generic Feynman diagram with external legs, we define amputation in the following way. Starting from the tip of each external leg, find the last point at which the diagram can be cut by removing a single propagator, such that this operation separates the leg from the rest of the diagram. Cut there. Let me give a non-trivial example of a diagram that appears at $O(\lambda^{10})$, if one wants to compute scattering in $\phi^4$ theory. Here it is:
$$
[\text{unamputated diagram}] \; \xrightarrow{\text{amputation}} \; [\text{amputated diagram}]
\tag{4.113}
$$
So far we have learnt about the problem with external-leg corrections (they become infinite for on-shell external states, as implied by (4.111)) and gave a simple prescription for how to solve the issue, i.e., by simply removing these corrections by amputation. A practitioner or an experimental physicist might be happy at this point, but as theorists we want more. So let's have a closer look at the connection between $G^{(n)}$ and $S$.
4.10 From Correlation Functions to Scattering Matrix Elements
Before we start, let me warn you that this subsection will be more abstract than the preceding ones. Its main theme will be the singularities of Feynman diagrams viewed as analytic functions of their external momenta. Yet, we will see rather soon that this apparently esoteric subject is full of physical implications, and that it illuminates the relation between Feynman diagrams and the general principles of QFT.
Källén-Lehmann Spectral Representation
We already know that in the free theory the matrix element $\langle 0 | T \phi(x) \phi(y) | 0 \rangle$ has a simple physical interpretation. It gives the amplitude for a particle to propagate from $y$ to $x$. To what extent does this carry over to the interacting theory? In order to answer this question, we will have a look at the two-point correlation function (4.104). Our analysis of $G^{(2)}$ will rely only on general principles of special relativity and QM, but will depend neither on the nature of the interactions nor on an expansion in perturbation theory. Yet, to simplify matters, we will restrict our consideration to the case of the real scalar field $\phi$. Similar results can be obtained for correlation functions of fields with spin.
We begin by studying the excited states of the interacting theory, with the corresponding energies being defined relative to the ground-state energy $E_0$. Let $|\lambda_0\rangle$ be an excited eigenstate of the full Hamiltonian with vanishing total 3-momentum, i.e., $\mathbf{P} \, |\lambda_0\rangle = 0$. That $|\lambda_0\rangle$ can be an eigenstate of both $H$ and $\mathbf{P}$ follows from the fact that $[H, \mathbf{P}] = 0$. Such a state can consist of an arbitrary number of particles or it can even be a bound state. The simultaneous eigenvalues of $H - E_0$ and $\mathbf{P}$ can be combined into a 4-vector $p_0^\mu = (m_\lambda, \mathbf{0})$, where $m_\lambda$ denotes the mass of the particular zero-momentum state. Being the generator of space-time translations, $P^\mu = (H - E_0, \mathbf{P})$ transforms as a contravariant 4-vector under boosts, i.e., $U^{-1}(\Lambda) \, P^\mu \, U(\Lambda) = \Lambda^\mu{}_\nu \, P^\nu$, where $U(\Lambda)$ is the unitary operator that implements the Lorentz boost $\Lambda$. This implies that by boosting $|\lambda_0\rangle$ one can generate a new state $|\lambda_{\mathbf{p}}\rangle$, which can have any 3-momentum $\mathbf{p}$ and is an eigenstate of $H - E_0$ with energy $E_{\mathbf{p}}(\lambda) = (|\mathbf{p}|^2 + m_\lambda^2)^{1/2}$. Or the other way round, any eigenstate with non-vanishing 3-momentum can be boosted to a zero-momentum eigenstate. You are kindly asked to prove this statement explicitly. The sets of eigenvalues $p^\mu = (E - E_0, \mathbf{p})$ are thus organized into hyperboloids, as is shown in Figure 4.5. The lowest-lying isolated hyperboloid corresponds to the one-particle states of the interacting theory, whereas the other ones correspond to bound states that may or may not be present. Above a certain threshold value of $m_\lambda$, a continuum of multiparticle states starts.
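
The statement that boosting a state at rest moves its eigenvalue along a hyperboloid can be illustrated at the level of the 4-vectors themselves. The following sketch boosts $p_0^\mu = (m_\lambda, \mathbf{0})$ along the $z$-axis; the mass value and the rapidities are hypothetical, and the code only acts on eigenvalues, not on the states $|\lambda_{\mathbf{p}}\rangle$.

```python
import math

def boost_along_z(k, rapidity):
    """Apply a Lorentz boost along the z-axis to the 4-vector k = (E, kx, ky, kz)."""
    E, kx, ky, kz = k
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return (ch * E + sh * kz, kx, ky, sh * E + ch * kz)

def invariant_mass(k):
    """sqrt(E^2 - |k|^2), which is unchanged by boosts."""
    E, kx, ky, kz = k
    return math.sqrt(E * E - kx * kx - ky * ky - kz * kz)

m_lambda = 1.3                       # hypothetical mass of the state at rest
at_rest = (m_lambda, 0.0, 0.0, 0.0)  # eigenvalue p_0 = (m_lambda, 0)
for eta in (0.0, 0.5, 2.0):
    boosted = boost_along_z(at_rest, eta)
    E, pz = boosted[0], boosted[3]
    # Compare the boosted energy with (|p|^2 + m_lambda^2)^(1/2).
    print(eta, pz, E, math.sqrt(pz * pz + m_lambda ** 2), invariant_mass(boosted))
```

Every boosted vector satisfies $E^2 - |\mathbf{p}|^2 = m_\lambda^2$, so varying the rapidity traces out one branch of the hyperboloid labelled by $m_\lambda$, and boosting with the opposite rapidity brings the vector back to rest.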
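From the above it follows that the states $|\lambda_{\mathbf{p}}\rangle$ form a complete set of states in the interacting theory, in the same way the states $|\mathbf{p}\rangle$ do in the free theory. In turn, the completeness relation of the one-particle states in the free theory (3.66) is replaced by
$$
1 = |\Omega\rangle \langle \Omega | + \sum_\lambda \int \frac{d^3p}{(2\pi)^3} \, \frac{1}{2 E_{\mathbf{p}}(\lambda)} \, |\lambda_{\mathbf{p}}\rangle \langle \lambda_{\mathbf{p}} | \,,
\tag{4.114}
$$
where the first term corresponds to the ground state and the second one to all excited states.
We now insert (4.114) into the two-point function $G^{(2)}(x, y) = \langle \Omega | T \phi(x) \phi(y) | \Omega \rangle$.$^{29}$ In the case $x^0 > y^0$, we obtain
$$
\begin{aligned}
\langle \Omega | \phi(x) \phi(y) | \Omega \rangle &= \langle \Omega | \phi(x) | \Omega \rangle \langle \Omega | \phi(y) | \Omega \rangle \\
&\quad + \sum_\lambda \int \frac{d^3p}{(2\pi)^3} \, \frac{1}{2 E_{\mathbf{p}}(\lambda)} \, \langle \Omega | \phi(x) | \lambda_{\mathbf{p}} \rangle \langle \lambda_{\mathbf{p}} | \phi(y) | \Omega \rangle \,.
\end{aligned}
\tag{4.115}
$$

$^{29}$ For the sake of brevity, the labels $H$ indicating Heisenberg fields will be dropped hereafter, whenever we discuss the properties of correlation functions.
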
Figure 4.5: The eigenvalues of $P^\mu = (H, \mathbf{P})$ are hyperboloids in the $|\mathbf{P}|$-$H$ plane. For a typical theory the states consist of one or more particles of mass $m$. In consequence, there is a hyperboloid of one-particle states and a continuum of hyperboloids of two-, three-particle states, and so on. There may also be one or more bound-state hyperboloids below the threshold for creation of two free particles. (Labels in the figure: one particle at rest, one particle in motion, bound state, multiparticle continuum.)

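In the absence of preferred directions in the universe, the vacuum $|\Omega\rangle$ should be invariant under space-time translations and Lorentz transformations, i.e., $e^{-i P \cdot x} \, |\Omega\rangle = |\Omega\rangle$ and $U(\Lambda) \, |\Omega\rangle = |\Omega\rangle$. As part of an exercise you will show that this implies that
$$
\langle \Omega | \phi(x) | \Omega \rangle = \langle \Omega | \phi(0) | \Omega \rangle = v \,,
\tag{4.116}
$$
where $v$ denotes the VEV of the field $\phi(x)$, usually taken to be zero. If $v \neq 0$ then one should reformulate the theory using the shifted field $\tilde{\phi}(x) = \phi(x) - v$, which by definition has vanishing VEV. By an appropriate choice of the degrees of freedom of the interacting theory one can hence always get rid of the first term in (4.115). The matrix elements entering the second term can be manipulated as follows
$$
\begin{aligned}
\langle \Omega | \phi(x) | \lambda_{\mathbf{p}} \rangle &= \langle \Omega | \, e^{i P \cdot x} \, \phi(0) \, e^{-i P \cdot x} \, | \lambda_{\mathbf{p}} \rangle = e^{-i p \cdot x} \, \langle \Omega | \phi(0) | \lambda_{\mathbf{p}} \rangle \Big|_{p^0 = E_{\mathbf{p}}(\lambda)} \\
&= e^{-i p \cdot x} \, \langle \Omega | \, U^{-1}(\Lambda) \, U(\Lambda) \, \phi(0) \, U^{-1}(\Lambda) \, U(\Lambda) \, | \lambda_{\mathbf{p}} \rangle \Big|_{p^0 = E_{\mathbf{p}}(\lambda)} \\
&= e^{-i p \cdot x} \, \langle \Omega | \phi(0) | \lambda_0 \rangle \Big|_{p^0 = E_{\mathbf{p}}(\lambda)} \,,
\end{aligned}
\tag{4.117}
$$
where $U(\Lambda)$ implements a boost from $\mathbf{p}$ to $\mathbf{0}$. In order to arrive at the final expression, we have made use of the fact that $|\Omega\rangle$ and $\phi(0)$ are Lorentz invariant.$^{30}$

$^{30}$ For a field with spin we would need to keep track of its non-trivial transformation properties under the Lorentz group.
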
Figure 4.6: The spectral density $\rho(s)$ for a typical interacting theory. The one-particle states contribute a delta function at $m^2$, i.e., the square of the physical mass of the particle. Multiparticle states form a continuous spectrum starting at $(2m)^2$. There may also be bound states below the two-particle threshold.

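Leaving out the VEV and using (4.117), the two-point correlation function (4.115) then takes the form ($x^0 > y^0$)
$$
\begin{aligned}
\langle \Omega | \phi(x) \phi(y) | \Omega \rangle &= \sum_\lambda | \langle \Omega | \phi(0) | \lambda_0 \rangle |^2 \int \frac{d^3p}{(2\pi)^3} \, \frac{e^{-i p \cdot (x - y)}}{2 E_{\mathbf{p}}(\lambda)} \bigg|_{p^0 = E_{\mathbf{p}}(\lambda)} \\
&= \sum_\lambda | \langle \Omega | \phi(0) | \lambda_0 \rangle |^2 \int \frac{d^4p}{(2\pi)^4} \, \frac{i}{p^2 - m_\lambda^2 + i\epsilon} \, e^{-i p \cdot (x - y)} \,,
\end{aligned}
\tag{4.118}
$$
where to arrive at the final result we have introduced an integration over $p^0$ employing (3.120). The integral in the last line of (4.118) is the Feynman propagator $D_F(x - y; m_\lambda^2)$ belonging to a particle with mass $m_\lambda$. We see that the particle interpretation has in fact changed in the interacting theory from free particles to "dressed" particles (quasi-particles), so the particles we are dealing with here are not the particles that we know from the free theory.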
An expression analogous to the one in (4.118) holds in the case $x^0 < y^0$. Combining both cases one arrives at the Källén-Lehmann spectral representation of the two-point correlation function
$$
G^{(2)}(x, y) = \int_0^\infty \frac{ds}{2\pi} \, \rho(s) \, D_F(x - y; s) \,,
\tag{4.119}
$$
where $\rho(s)$ depends on the squared invariant mass $s$. This spectral density function is positive definite and given by
$$
\rho(s) = \sum_\lambda 2\pi \, \delta(s - m_\lambda^2) \, | \langle \Omega | \phi(0) | \lambda_0 \rangle |^2 \,.
\tag{4.120}
$$
Figure 4.7: Analytic structure in the complex $p^2$-plane of the Fourier transform of the two-point correlation function for a typical interacting theory. The one-particle states lead to an isolated pole at $p^2 = m^2$. States of two or more free particles give a branch cut, while possible bound states show up as additional poles below $(2m)^2$.

The spectral density for a typical theory is plotted in Figure 4.6. We see that the one-particle states of the interacting theory correspond to an isolated delta function in the spectral density,
$$
\rho(s) = 2\pi \, \delta(s - m^2) \, Z + \left( \text{nothing else until } s \gtrsim (2m)^2 \right) .
\tag{4.121}
$$
The factor
$$
Z = | \langle \Omega | \phi(0) | \lambda_0 \rangle |^2 \,,
\tag{4.122}
$$
is called the field-strength renormalization. It is the probability for $\phi(0)$ to create a one-particle state out of the vacuum $|\Omega\rangle$, and $m$ denotes the physical mass of the associated particle, being the energy eigenvalue in its rest frame. Notice that this physical mass is in general not equal to the bare mass parameter occurring in the Lagrangian of the $\phi^4$ theory (4.6). To make the distinction between physical and bare quantities manifest, we will hereafter indicate bare quantities by a subscript 0. It is important to realize that only the physical mass $m$ is directly observable, while the bare mass $m_0$ is not.
In momentum space the spectral decomposition (4.119) reads
$$
\begin{aligned}
\tilde{G}^{(2)}(p^2) &= \int d^4x \; e^{i p \cdot x} \, G^{(2)}(x, 0) = \int_0^\infty \frac{ds}{2\pi} \, \rho(s) \, \frac{i}{p^2 - s + i\epsilon} \\
&= \frac{i Z}{p^2 - m^2 + i\epsilon} + \int_{(2m)^2}^\infty \frac{ds}{2\pi} \, \rho(s) \, \frac{i}{p^2 - s + i\epsilon} \,.
\end{aligned}
\tag{4.123}
$$
The analytic structure of this function in the complex $p^2$-plane is depicted in Figure 4.7. The first term gives an isolated simple pole at $p^2 = m^2$, while the second term contributes a branch cut beginning at $p^2 = (2m)^2$. If there are any two-particle bound states, these will appear as additional delta functions in the spectral density $\rho(s)$ and thus as additional poles of (4.123) below the cut.
Let us compare the results we have obtained in this subsection to those found in Section 3.7 for the free theory. The Fourier transform of the Feynman propagator (i.e., the two-point correlation function in the theory of a free scalar field) reads
$$
\tilde{D}_F(p^2) = \int d^4x \; e^{i p \cdot x} \, D_F(x) = \int d^4x \; e^{i p \cdot x} \, \langle 0 | T \phi(x) \phi(0) | 0 \rangle = \frac{i}{p^2 - m_0^2 + i\epsilon} \,,
\tag{4.124}
$$
where $D_F(x)$ is, for $x^0 > 0$, the amplitude for a particle to propagate from $0$ to $x$. The relation (4.123) implies that the two-point correlation function of the most general theory of an interacting real scalar field takes a very similar form. The general expression is essentially a sum of scalar propagation amplitudes for states generated from the vacuum by the field $\phi(0)$. There are however two important differences between (4.123) and (4.124). First, (4.123) contains the field renormalization factor $Z$, which is one in the case of free fields. The latter statement is easily shown explicitly by evaluating the matrix elements $\langle 0 | \phi(0) | \mathbf{p} \rangle$ and is thus left as an exercise. Second, (4.123) contains contributions from multiparticle intermediate states with a continuous mass spectrum. In the free field theory, $\phi(0)$ can create only a single particle from $|0\rangle$. Notice that the generation of multiparticle states is the reason why the factor $Z$ in general differs from unity in the interacting theory.
Lehmann-Symanzik-Zimmermann Reduction Formula
So far we have seen that the Fourier transform of the two-point correlation function (4.123),
considered as an analytic function of $p^2$, has a simple pole at the square of the physical mass
of the one-particle states, while multiparticle intermediate states give weaker branch-cut
singularities. In the following we will find that this rather formal observation generalizes to
higher-point correlation functions and plays a crucial role in the derivation of a general
relation between Green's functions and S-matrix elements. This relation was first derived
by Harry Lehmann, Kurt Symanzik, and Wolfhart Zimmermann [3] and is today known as
the LSZ reduction formula. Combining the LSZ reduction formula with our Feynman rules
for computing correlation functions (4.105) will then give us a master formula for S-matrix
elements in terms of Feynman diagrams. For simplicity, we will again carry out the whole
analysis for the case of a real scalar field.
In the following we would like to use the single-particle pole structure of $\tilde G^{(2)}(p^2)$ in the vicinity
of $p^2 \approx m^2$ to obtain the asymptotic "in" and "out" states of the theory and in particular
their matrix elements,
$$
{}_{\rm out}\langle q_1, \ldots, q_n | p_A, p_B \rangle_{\rm in} = \langle q_1, \ldots, q_n | S | p_A, p_B \rangle \,. \tag{4.125}
$$
These matrix elements are plane-wave amplitudes that describe the scattering of an initial
two-particle momentum state $|p_A, p_B\rangle_{\rm in}$, constructed in the far past ($t = t_-$), into an
$n$-particle momentum state $|q_1, \ldots, q_n\rangle_{\rm out}$, which represents the final-state particles in the far
future ($t = t_+$).$^{31}$
The basic idea to derive the desired master formula is as follows. In order to calculate
the S-matrix element for a $2 \to n$ scattering process, we start with the correlation function
involving $(n + 2)$ Heisenberg fields. If we Fourier-transform this function with respect to the
coordinate of any one of these fields, we will find a pole of the form (4.123) in the corresponding
Fourier-transformed variable. We will argue that the one-particle states associated with these
poles are in fact asymptotic states, i.e., states given by the limit of well-separated wave packets
as they become concentrated around definite momenta. Taking the limit in which all $(n + 2)$
external particles go on-shell, we can then interpret the coefficient of the multiple pole as an
S-matrix element.
$^{31}$ Because human-built detectors are in general not able to resolve positions down to the de Broglie
wavelengths of the particles, it is correct to work with plane-wave states in the Heisenberg picture rather
than wave packets to describe the collision.
We first study the Fourier transform of the $(n + 2)$-point correlation function with respect
to one argument $x$,
$$
\int d^4x\, e^{ipx}\, \langle\Omega| T\left(\phi_x \phi_1 \ldots \phi_{n+1}\right) |\Omega\rangle \,. \tag{4.126}
$$
Here the shorthands $\phi_x = \phi(x)$, $\phi_1 = \phi(y_1)$, etc. have been used and all $\phi$'s are Heisenberg
fields. We would now like to identify poles in the variable $p^0$. To do this, we divide the integral
over $x^0$ into three regions,
$$
\int_{-\infty}^{\infty} dx^0 = \int_{-\infty}^{t_-} dx^0 + \int_{t_-}^{t_+} dx^0 + \int_{t_+}^{\infty} dx^0 \,, \tag{4.127}
$$
where $t_- < \min y_i^0$ and $t_+ > \max y_i^0$ with $i = 1, \ldots, n + 1$. In the region $x^0 \in [t_-, t_+]$ the
result of the integral is an analytic function of $p^0$ without poles, since the region is bounded
and the integrand depends on $p^0$ through the analytic function $\exp(ip^0x^0)$. In the other two
regions the integrand still has no poles, but the integration intervals are unbounded. Therefore
singularities in $p^0$ may develop upon integration.
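Consider the third region, i.e., $x^0 \in [t_+, \infty[$. In this case $x^0$ is the latest time, so $\phi_x$ stands
first in the time-ordered product. In order to determine the pole structure of (4.126), we
insert the completeness relation (4.114), assuming that the field has a vanishing VEV.$^{32}$
The integral over the third region then becomes
$$
\int_{t_+}^{\infty} dx^0 \int d^3x\, e^{i(p^0x^0 - \mathbf{p}\cdot\mathbf{x})}\,
\sum_\lambda \int \frac{d^3k}{(2\pi)^3}\, \frac{1}{2E_{\mathbf{k}}(\lambda)}\,
\langle\Omega|\phi(x)|\lambda_{\mathbf{k}}\rangle \langle\lambda_{\mathbf{k}}| T\left(\phi_1 \ldots \phi_{n+1}\right) |\Omega\rangle \,. \tag{4.128}
$$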
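Using (4.117) and including a damping factor $\exp(-\epsilon x^0)$ with infinitesimal $\epsilon$ to ensure that
the integral is well-defined,$^{33}$ the above integral takes the form
$$
\sum_\lambda \int_{t_+}^{\infty} dx^0 \int \frac{d^3k}{(2\pi)^3}\, \frac{1}{2E_{\mathbf{k}}(\lambda)}\,
e^{i(p^0 - k^0 + i\epsilon)x^0}\, \langle\Omega|\phi(0)|\lambda_0\rangle\, (2\pi)^3 \delta^{(3)}(\mathbf{p} - \mathbf{k})\,
\langle\lambda_{\mathbf{k}}| T\left(\phi_1 \ldots \phi_{n+1}\right) |\Omega\rangle \Big|_{k^0 = E_{\mathbf{k}}(\lambda)}
$$
$$
= \sum_\lambda \frac{1}{2E_{\mathbf{p}}(\lambda)}\,
\frac{i\, e^{i(p^0 - E_{\mathbf{p}}(\lambda) + i\epsilon)\, t_+}}{p^0 - E_{\mathbf{p}}(\lambda) + i\epsilon}\,
\langle\Omega|\phi(0)|\lambda_0\rangle \langle\lambda_{\mathbf{p}}| T\left(\phi_1 \ldots \phi_{n+1}\right) |\Omega\rangle \,. \tag{4.129}
$$
$^{32}$ If this is not the case, we reformulate the theory in terms of the shifted field (the field with its VEV
subtracted).
$^{33}$ This regularization is equivalent to the $i\epsilon$ prescription used in (3.129) and the tilted time-axis
prescription introduced in (4.86).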
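Here we have used $\int d^3x\, \exp(-i(\mathbf{p} - \mathbf{k})\cdot\mathbf{x}) = (2\pi)^3 \delta^{(3)}(\mathbf{p} - \mathbf{k})$. The expression (4.129) has
the same residue at $p^0 = E_{\mathbf{p}}(\lambda) - i\epsilon$ as the term
$i/(p^2 - m_\lambda^2 + i\epsilon) = i/\big[(p^0)^2 - (E_{\mathbf{p}}(\lambda))^2 + i\epsilon\big]$
appearing in the two-point correlation function (4.118). As before, this singularity will be either
a single pole or a branch cut, depending on whether the rest energy $m_\lambda$ is isolated or not.
The one-particle state in the far future corresponds to an isolated pole at the on-shell energy
$p^0 = E_{\mathbf{p}}$. In this case, (4.129) gives
$$
\int d^4x\, e^{ipx}\, \langle\Omega| T\left(\phi_x \phi_1 \ldots \phi_{n+1}\right) |\Omega\rangle
\;\underset{p^0 \to E_{\mathbf{p}}}{\sim}\;
\frac{i\sqrt{Z}}{p^2 - m^2 + i\epsilon}\; {}_{\rm out}\langle \mathbf{p}| T\left(\phi_1 \ldots \phi_{n+1}\right) |\Omega\rangle \,. \tag{4.130}
$$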
In order to obtain this result we have identified the matrix element $\langle\Omega|\phi(0)|\lambda_0\rangle$ appearing in
(4.129) with $Z^{1/2}$ using (4.122), absorbing the left-over phase into the definition of $|\lambda_0\rangle$. We
have furthermore used the notation $|\mathbf{p}\rangle_{\rm out} = |\lambda_{\mathbf{p}}\rangle_{\text{one-particle}}$ for a one-particle eigenstate with
momentum $\mathbf{p}$ that is created at asymptotically large times in the future.
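The time integral behind the pole in (4.129) can be checked numerically. The following is a minimal sketch, not part of the original derivation: it keeps the damping parameter $\epsilon$ finite, cuts the integration off at a large hypothetical time `t_max`, and compares a trapezoidal estimate of $\int_{t_+}^{\infty} dx^0\, e^{i(p^0 - E + i\epsilon)x^0}$ with the closed form $i\,e^{i(p^0 - E + i\epsilon)t_+}/(p^0 - E + i\epsilon)$. All function and parameter names are illustrative assumptions.

```python
import cmath

def closed_form(p0, energy, eps, t_plus):
    """i * exp(i*(p0 - E + i*eps)*t_plus) / (p0 - E + i*eps), as in (4.129)."""
    a = complex(p0 - energy, eps)
    return 1j * cmath.exp(1j * a * t_plus) / a

def numeric_integral(p0, energy, eps, t_plus, t_max=400.0, steps=200_000):
    """Trapezoidal estimate of the damped time integral from t_plus to t_max."""
    a = complex(p0 - energy, eps)
    h = (t_max - t_plus) / steps
    total = 0.5 * (cmath.exp(1j * a * t_plus) + cmath.exp(1j * a * t_max))
    for n in range(1, steps):
        total += cmath.exp(1j * a * (t_plus + n * h))
    return total * h

if __name__ == "__main__":
    # Off the pole the two results agree; on the pole the magnitude grows like 1/eps.
    print(numeric_integral(1.3, 1.0, 0.05, 2.0), closed_form(1.3, 1.0, 0.05, 2.0))
    print(abs(closed_form(1.0, 1.0, 0.05, 0.0)))
```

The second printed number illustrates why the unbounded integration region produces a singularity: at $p^0 = E$ the result scales like $1/\epsilon$ and diverges as the damping is removed.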
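In order to evaluate the contribution from the first region, i.e., $x^0 \in\, ]-\infty, t_-]$, one puts $\phi_x$
last in the time-ordered product. Performing steps similar to the ones for the third integration
region (the actual calculation is left as an exercise), one finds that the one-particle state in
the far past corresponds to an isolated pole at the on-shell energy $p^0 = -E_{\mathbf{p}}$,
$$
\int d^4x\, e^{ipx}\, \langle\Omega| T\left(\phi_x \phi_1 \ldots \phi_{n+1}\right) |\Omega\rangle
\;\underset{p^0 \to -E_{\mathbf{p}}}{\sim}\;
\frac{i\sqrt{Z}}{p^2 - m^2 + i\epsilon}\; \langle\Omega| T\left(\phi_1 \ldots \phi_{n+1}\right) |{-\mathbf{p}}\rangle_{\rm in} \,, \tag{4.131}
$$
where $|{-\mathbf{p}}\rangle_{\rm in} = |\lambda_{-\mathbf{p}}\rangle_{\text{one-particle}}$ denotes the one-particle eigenstate with momentum $-\mathbf{p}$ which
is constructed at asymptotically large times in the past.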
We now want to repeat the same exercise for the remaining field coordinates $y_1$, etc. In
the asymptotic treatment of multiparticle states it is, however, better to use normalized wave
packets. In that case $x$ is constrained to lie within a small band about the trajectory of
a particle with momentum $\mathbf{p}$, with the spatial extent of the band being determined by the
wave packet. In this way the particles do not interfere and can effectively be considered
free at asymptotic times, unlike plane-wave states. Instead of a simple Fourier transform
$\int d^4x\, \exp(ipx)$, we should hence have used
$$
\int \frac{d^3q}{(2\pi)^3} \int d^4x\, e^{ip^0x^0}\, e^{-i\mathbf{q}\cdot\mathbf{x}}\, \varphi(\mathbf{q}) \,, \tag{4.132}
$$
in (4.126), where $\varphi(\mathbf{q})$ is a function that is peaked around $\mathbf{p}$, and at the end taken the limit
of a sharply peaked wave packet $\varphi(\mathbf{q}) \to (2\pi)^3 \delta^{(3)}(\mathbf{q} - \mathbf{p})$.
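With this modification the right-hand side in (4.129) would turn into
$$
\sum_\lambda \int \frac{d^3q}{(2\pi)^3}\, \varphi(\mathbf{q})\, \frac{1}{2E_{\mathbf{q}}(\lambda)}\,
\frac{i}{p^0 - E_{\mathbf{q}}(\lambda) + i\epsilon}\,
\langle\Omega|\phi(0)|\lambda_0\rangle \langle\lambda_{\mathbf{q}}| T\left(\phi_1 \ldots \phi_{n+1}\right) |\Omega\rangle
$$
$$
\underset{p^0 \to E_{\mathbf{p}}}{\sim}\;
\int \frac{d^3q}{(2\pi)^3}\, \varphi(\mathbf{q})\, \frac{i\sqrt{Z}}{\tilde p^2 - m^2 + i\epsilon}\;
{}_{\rm out}\langle \mathbf{q}| T\left(\phi_1 \ldots \phi_{n+1}\right) |\Omega\rangle \,, \tag{4.133}
$$
where $\tilde p = (p^0, \mathbf{q})$. We see that the one-particle singularity is now a branch cut, whose length
is the width in momentum space of the wave packet $\varphi(\mathbf{q})$. It follows that if the width of
$\varphi(\mathbf{q})$ is taken to zero, the branch cut sharpens up to a pole. In this limit (4.133) reduces to
the simple form (4.130). The same line of reasoning applies to the pole structure that appears
in the far past. In this case one recovers (4.131).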
The procedure described above can be generalized to the $(n + 2)$-particle case we are
interested in by integrating each of the coordinates against a wave packet. Let me spare you
the gory details of the actual calculation and only tell you about the final result. It turns out
that by smearing each coordinate one can extract the leading singularities, which are
products of poles in the separate energy variables. The physics behind this factorization is
that an $(n+2)$-particle asymptotic state is created/annihilated by $(n+2)$ field operators that
are constrained to lie in distant wave packets and therefore are effectively localized. Under
these conditions an $(n+2)$-particle excitation in the continuum can be represented by $(n+2)$
distinct (i.e., independent) one-particle excitations of the ground state.
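At the end one arrives at
$$
\tilde G^{(n+2)}(p_A, p_B, q_1, \ldots, q_n)
= \Bigg( \prod_{i=A,B} \int d^4x_i\, e^{-ip_ix_i} \Bigg) \Bigg( \prod_{j=1}^{n} \int d^4y_j\, e^{iq_jy_j} \Bigg)
\langle\Omega| T\left(\phi_A \phi_B \phi_1 \ldots \phi_n\right) |\Omega\rangle
$$
$$
\underset{\substack{p_i^0 \to E_{\mathbf{p}_i} \\ q_j^0 \to E_{\mathbf{q}_j}}}{\sim}\;
\Bigg( \prod_{i=A,B} \frac{i\sqrt{Z}}{p_i^2 - m^2 + i\epsilon} \Bigg)
\Bigg( \prod_{j=1}^{n} \frac{i\sqrt{Z}}{q_j^2 - m^2 + i\epsilon} \Bigg)\;
{}_{\rm out}\langle q_1, \ldots, q_n | p_A, p_B \rangle_{\rm in}
$$
$$
= \Bigg( \prod_{i=A,B} \frac{i\sqrt{Z}}{p_i^2 - m^2 + i\epsilon} \Bigg)
\Bigg( \prod_{j=1}^{n} \frac{i\sqrt{Z}}{q_j^2 - m^2 + i\epsilon} \Bigg)\;
\langle q_1, \ldots, q_n | S | p_A, p_B \rangle \,, \tag{4.134}
$$
where the use of $\exp(-ip_ix_i)$ ensures that the particles in the "in" state have positive energy.
The latter relation is the famous LSZ reduction formula. It implies that the S-matrix element
involving two particles in the "in" state and $n$ particles in the "out" state can be obtained
from the corresponding Fourier-transformed $(n + 2)$-point correlation function by extracting
the leading singularities in the energies $p_i^0$ and $q_j^0$, which coincide with the situations where
the external particles become on-shell.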
Diagrammatic Master Formula
Our final goal is to reformulate the above procedure in the language of Feynman diagrams.
For concreteness, we will first analyze the relation between the diagrammatic expansion of
the scalar field four-point function and the S-matrix element and then generalize this result
to the case of $2 \to n$ scattering. We will consider explicitly the fully connected Feynman
diagrams contributing to the Fourier-transformed correlation functions. By a similar analysis,
it is straightforward to show that disconnected diagrams should be disregarded, because they
do not have the singularity structure with a product of four [$(n + 2)$] poles appearing on the
right-hand side of the LSZ reduction formula (4.134).
The exact four-point correlation function is shown in Figure 4.8. In this figure we have
indicated explicitly the diagrammatic corrections on each external leg. The light gray blob in
the centre of the diagram represents the sum of all amputated four-point graphs,
$$
(\text{amp.}) = [\text{sum of all amputated four-point diagrams}] \,, \tag{4.135}
$$
while the dark gray circles indicate the two-point Green's function, also known as the full propagator.

Figure 4.8: Structure of the exact four-point correlation function in scalar field theory. The external
legs carry the momenta $p_A$, $p_B$, $q_1$ and $q_2$.

The full propagator can be written as a Dyson series,
$$
(\text{full propagator}) = (\text{free propagator}) + (\text{free})\,(\text{1PI})\,(\text{free})
+ (\text{free})\,(\text{1PI})\,(\text{free})\,(\text{1PI})\,(\text{free}) + \ldots \,, \tag{4.136}
$$
where
$$
(\text{1PI}) = -i\Sigma(p^2) = [\text{sum of all 1PI self-energy diagrams}] \,, \tag{4.137}
$$
is the collection of all one-particle irreducible (1PI) self-energy diagrams. Diagrams are called
1PI if they cannot be split in two by removing a single line. The Dyson series (4.136) is in
fact a geometric series, which can be summed up according to
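$$
(\text{full propagator}) = \frac{i}{p^2 - m_0^2 + i\epsilon}
+ \frac{i}{p^2 - m_0^2 + i\epsilon} \left( -i\Sigma(p^2) \right) \frac{i}{p^2 - m_0^2 + i\epsilon} + \ldots
= \frac{i}{p^2 - m_0^2 - \Sigma(p^2) + i\epsilon} \,. \tag{4.138}
$$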
We see that the full propagator has a simple pole located at the physical mass $m$, which is
shifted away from the bare mass $m_0$ by the self-energy:
$$
\left[ p^2 - m_0^2 - \Sigma(p^2) \right] \Big|_{p^2 = m^2} = 0 \,, \quad \Longrightarrow \quad m^2 = m_0^2 + \Sigma(m^2) \,. \tag{4.139}
$$
Notice that our sign convention for the 1PI self-energy $\Sigma(p^2)$ implies that a positive
contribution to $\Sigma(p^2)$ corresponds to a positive shift of the scalar particle mass.
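The resummation of the Dyson series in (4.138) can be illustrated numerically. The sketch below is only an illustration under simplifying assumptions: the self-energy is replaced by a constant, purely hypothetical number `sigma` (a real $\Sigma(p^2)$ depends on $p^2$ and on the theory), the $i\epsilon$ is dropped because $p^2$ is chosen away from the pole, and all function names are made up for this example. The series converges when $|\Sigma| < |p^2 - m_0^2|$.

```python
def free_propagator(p2, m0_sq):
    """Free propagator i / (p^2 - m0^2), with the i*epsilon omitted away from the pole."""
    return 1j / (p2 - m0_sq)

def dyson_partial_sum(p2, m0_sq, sigma, n_terms):
    """Sum the first n_terms of D + D(-i Sigma)D + D(-i Sigma)D(-i Sigma)D + ..."""
    d = free_propagator(p2, m0_sq)
    term = d
    total = 0j
    for _ in range(n_terms):
        total += term
        term = term * (-1j * sigma) * d  # attach one more 1PI insertion and propagator
    return total

def resummed_propagator(p2, m0_sq, sigma):
    """Closed form i / (p^2 - m0^2 - Sigma) of the geometric series."""
    return 1j / (p2 - m0_sq - sigma)

if __name__ == "__main__":
    for n in (1, 2, 5, 20, 60):
        print(n, dyson_partial_sum(2.0, 1.0, 0.3, n))
    print("resummed:", resummed_propagator(2.0, 1.0, 0.3))
```

With these sample values each additional 1PI insertion multiplies the previous term by $\Sigma/(p^2 - m_0^2) = 0.3$, so the partial sums approach the resummed result geometrically.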
Close to its simple pole at $p^2 \approx m^2$ the denominator of the full propagator (4.138) can be
expanded in the following way
$$
p^2 - m_0^2 - \Sigma(p^2) = \left( p^2 - m^2 \right) \left( 1 - \Sigma'(m^2) \right) + O\!\left( (p^2 - m^2)^2 \right) \,, \tag{4.140}
$$
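where $\Sigma'(m^2)$ stands for $d\Sigma(p^2)/d(p^2)\big|_{p^2 = m^2}$. This implies that, just like in the Källén-Lehmann
spectral representation (4.123), the full propagator has a single-particle pole of the form
$$
(\text{full propagator}) \;\underset{p^0 \to E_{\mathbf{p}}}{\sim}\; \frac{iZ}{p^2 - m^2 + i\epsilon} + (\text{regular terms}) \,, \tag{4.141}
$$
with
$$
Z = \frac{1}{1 - \Sigma'(m^2)} \,. \tag{4.142}
$$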
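The relations (4.139) and (4.142) can be made concrete with a toy calculation. The following sketch assumes a purely hypothetical linear self-energy $\Sigma(p^2) = a + b\,p^2$ with made-up coefficients; it is not derived from any specific theory. It solves $m^2 = m_0^2 + \Sigma(m^2)$ by fixed-point iteration (which converges when $|\Sigma'| < 1$), evaluates $Z$ from a numerical derivative, and compares $Z$ with the residue of the full propagator close to the pole.

```python
def toy_self_energy(p2, a=0.2, b=0.1):
    """Hypothetical linear self-energy Sigma(p^2) = a + b * p^2 (illustration only)."""
    return a + b * p2

def physical_mass_sq(m0_sq, self_energy, iterations=200):
    """Solve m^2 = m0^2 + Sigma(m^2), eq. (4.139), by fixed-point iteration."""
    m_sq = m0_sq
    for _ in range(iterations):
        m_sq = m0_sq + self_energy(m_sq)
    return m_sq

def field_strength(m_sq, self_energy, h=1e-6):
    """Z = 1 / (1 - Sigma'(m^2)), eq. (4.142), using a central finite difference."""
    slope = (self_energy(m_sq + h) - self_energy(m_sq - h)) / (2.0 * h)
    return 1.0 / (1.0 - slope)

def residue_estimate(m0_sq, m_sq, self_energy, delta=1e-4):
    """(p^2 - m^2) / (p^2 - m0^2 - Sigma(p^2)) evaluated slightly off the pole."""
    p2 = m_sq + delta
    return (p2 - m_sq) / (p2 - m0_sq - self_energy(p2))

if __name__ == "__main__":
    m_sq = physical_mass_sq(1.0, toy_self_energy)
    print("m^2 =", m_sq)
    print("Z   =", field_strength(m_sq, toy_self_energy))
    print("residue / i =", residue_estimate(1.0, m_sq, toy_self_energy))
```

For the linear toy model the expansion (4.140) is exact, so the residue estimate reproduces $Z = 1/(1 - b)$ up to rounding; for a general $\Sigma(p^2)$ the agreement only holds in the limit $p^2 \to m^2$.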
As a result, the sum of all fully connected $2 \to 2$ diagrams contains a product of four poles
$$
\frac{iZ}{p_A^2 - m^2 + i\epsilon}\; \frac{iZ}{p_B^2 - m^2 + i\epsilon}\; \frac{iZ}{q_1^2 - m^2 + i\epsilon}\; \frac{iZ}{q_2^2 - m^2 + i\epsilon} \,, \tag{4.143}
$$
multiplying the amputated four-point diagrams. This is exactly the singularity on the
right-hand side of the LSZ reduction formula (4.134). Comparing the coefficients of the product of
poles, we conclude that the S-matrix element of the process $\phi(p_A)\phi(p_B) \to \phi(q_1)\phi(q_2)$ can be
expressed through
$$
\langle q_1, q_2 | S | p_A, p_B \rangle = \left( \sqrt{Z} \right)^4 \times (\text{amp.})\big[ p_A, p_B \to q_1, q_2 \big] \,, \tag{4.144}
$$
where the light gray blob (amp.) represents the sum of amputated four-point diagrams with all
external momenta being on-shell. This is the sought diagrammatic master formula for the case
of $2 \to 2$ scattering of scalar fields.
An identical analysis can be applied to the Fourier-transformed $(n + 2)$-point correlator.
In this case the relation between the S-matrix element and the Feynman graphs reads$^{34}$
$$
\langle q_1, \ldots, q_n | S | p_A, p_B \rangle = \left( \sqrt{Z} \right)^{n+2} \times (\text{amp.})\big[ p_A, p_B \to q_1, \ldots, q_n \big] \,. \tag{4.145}
$$
Notice that the renormalization factors $Z^{1/2}$ are irrelevant for calculations at the leading
order of perturbation theory, but are important in the calculation of higher-order corrections.
This completes the derivation of the connection between scattering matrix elements and fully
connected amputated Feynman diagrams.
$^{34}$ If the external particles are of different species, each has its own renormalization factor $Z^{1/2}$. Furthermore,
if the particles have spin, there will be additional polarization factors on the right-hand side of the equation.
4.11 Decay Widths and Cross Sections
As in ordinary QM, in QFT the probabilities for things to happen are the (modulus) squares
of the quantum amplitudes. In this subsection we will compute these probabilities, known as
decay widths and cross sections. One small subtlety here is that any T-matrix element (4.75)
comes with a factor of $(2\pi)^4 \delta^{(4)}(p_f - p_i)$, so that we end up with the square of a delta function.
As we will see in a moment, this subtlety is a result of the fact that we are working in an
infinite space.
Fermi's Golden Rule
In order to start the discussion, let me derive something familiar, namely Fermi's golden rule,
using Dyson's formula (4.18). For two energy eigenstates $|m\rangle$ and $|n\rangle$ with $E_m \neq E_n$, one has
in Born approximation
$$
\langle n| U(t) |m\rangle = -i\, \langle n| \int_0^t dt'\, H_I(t') |m\rangle
= -i\, \langle n|H_{\rm int}|m\rangle \int_0^t dt'\, e^{i\omega t'}
= -\langle n|H_{\rm int}|m\rangle\, \frac{e^{i\omega t} - 1}{\omega} \,, \tag{4.146}
$$
where $\omega = E_n - E_m$ and we have used the equality (4.11) to express $H_I$ in
terms of $H_{\rm int}$. The probability $P_{m\to n}(t)$ for the transition from $|m\rangle$ to $|n\rangle$ to happen in the
time $t$ is thus given by
$$
P_{m\to n}(t) = \big| \langle n| U(t) |m\rangle \big|^2 = 2\, \big| \langle n|H_{\rm int}|m\rangle \big|^2\, \frac{1 - \cos(\omega t)}{\omega^2} \,. \tag{4.147}
$$
The function $(1 - \cos(\omega t))/\omega^2$ is visualized in Figure 4.9. The $\omega$-dependence indicates that
most transitions occur in a region between energy eigenstates separated by $\Delta E = 2\pi/t$, i.e.,
the half-width of the function. Looking at the figure one furthermore observes that as $t \to \infty$,
the function shown in the plot approaches a delta function. In order to find the normalization,
we evaluate
$$
\int_{-\infty}^{\infty} d\omega\, \frac{1 - \cos(\omega t)}{\omega^2} = \pi t \,. \tag{4.148}
$$
This implies that
$$
\frac{1 - \cos(\omega t)}{\omega^2} \;\underset{t \to \infty}{\longrightarrow}\; \pi t\, \delta(\omega) \,. \tag{4.149}
$$
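The normalization (4.148) is easy to cross-check numerically. The sketch below is an illustration with hypothetical function names and cutoff values: it integrates $(1 - \cos\omega t)/\omega^2$ with a midpoint rule up to a finite cutoff `omega_max` and adds the estimate $1/\omega_{\rm max}$ per side for the tail, where the oscillating cosine averages out.

```python
import math

def kernel(omega, t):
    """(1 - cos(omega*t)) / omega^2, written as 2 sin^2(omega*t/2) / omega^2 for stability."""
    if omega == 0.0:
        return 0.5 * t * t  # limiting value at omega = 0
    s = math.sin(0.5 * omega * t)
    return 2.0 * s * s / (omega * omega)

def integrated_kernel(t, omega_max=2000.0, steps=400_000):
    """Midpoint-rule estimate of the integral over all omega, using the symmetry in omega."""
    h = omega_max / steps
    half = sum(kernel((n + 0.5) * h, t) for n in range(steps)) * h
    # Tail beyond omega_max: the cosine term averages out, leaving about 1/omega_max per side.
    return 2.0 * (half + 1.0 / omega_max)

if __name__ == "__main__":
    t = 3.0
    print("numerical:", integrated_kernel(t))
    print("pi * t   :", math.pi * t)
    print("first zero at omega = 2*pi/t:", kernel(2.0 * math.pi / t, t))
```

The peak height $t^2/2$ grows quadratically while the width $2\pi/t$ shrinks, so the area grows only linearly in $t$, in line with the factor $\pi t$ multiplying the delta function in (4.149).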
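Consider now a distribution of final states with density $\rho(E_n)$. In this case one has to integrate
over $E_n$ and obtains
$$
P_{m\to n}(t) = \int dE_n\, \rho(E_n)\, \big| \langle n| U(t) |m\rangle \big|^2
= \int dE_n\, \rho(E_n)\; 2\, \big| \langle n|H_{\rm int}|m\rangle \big|^2\, \frac{1 - \cos(\omega t)}{\omega^2}
\;\underset{t \to \infty}{\longrightarrow}\; 2\pi t\, \big| \langle n|H_{\rm int}|m\rangle \big|^2\, \rho(E_m) \,. \tag{4.150}
$$

Figure 4.9: Graphical representation of $(1 - \cos(\omega t))/\omega^2$ appearing in $P_{m\to n}(t)$.
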
It follows that the probability for the transition per unit time for states around the same
energy $E_m \approx E_n = E$ takes the form
$$
\dot P_{m\to n}(t) = 2\pi\, \big| \langle n|H_{\rm int}|m\rangle \big|^2\, \rho(E) \,. \tag{4.151}
$$
This result is known as Fermi's golden rule.
In the above derivation, we were rather careful with taking the limit $t \to \infty$. Suppose we
were a little bit sloppier, and first chose to compute the amplitude for the initial state $|m\rangle$ at
$t \to -\infty$ to evolve into the final state $|n\rangle$ at $t \to +\infty$. Then we would get
$$
-i\, \langle n| \int_{-\infty}^{\infty} dt'\, H_I(t') |m\rangle = -i\, \langle n|H_{\rm int}|m\rangle\; 2\pi\, \delta(\omega) \,. \tag{4.152}
$$
Now when squaring the amplitude, we find $P_{m\to n}(t) = |\langle n|H_{\rm int}|m\rangle|^2\, (2\pi)^2\, [\delta(\omega)]^2$. Tracking
through the previous computation, we realize that the extra infinity arises because $P_{m\to n}(t)$
is the probability for the transition to happen in infinite time. We thus can write the delta
functions as $(2\pi)^2 [\delta(\omega)]^2 = 2\pi\, \delta(\omega)\, t$, where $t$ is a shorthand for $t \to \infty$. The reason that we
have stressed this point is that the T-matrix element in (4.75) has been computed in the
same way as (4.152), which means that we have to reinterpret the square of the delta function
arising from $|\langle f|T|i\rangle|^2$ as a space-time volume factor.
Decay Rates
We would now like to calculate the probability for a single-particle initial state $|i\rangle$ of momentum
$p_i$ and rest mass $m$ to decay into the final state $|f\rangle$ consisting of $n$ particles with total
momentum $p_f = \sum_{j=1}^{n} q_j$. This quantity is given by the ratio
$$
P_n = \frac{|\langle f|S|i\rangle|^2}{\langle i|i\rangle \langle f|f\rangle} \,. \tag{4.153}
$$
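The states $|i\rangle$ and $|f\rangle$ obey the relativistic normalization formula (3.64),
$$
\langle i|i\rangle = (2\pi)^3\, 2E_{\mathbf{p}_i}\, \delta^{(3)}(0) = 2E_{\mathbf{p}_i} V \,, \tag{4.154}
$$
where we have replaced the delta function $(2\pi)^3 \delta^{(3)}(0)$ by the volume $V$ of the space. Similarly, one
has for the final state
$$
\langle f|f\rangle = \prod_{j=1}^{n} 2E_{\mathbf{q}_j} V \,. \tag{4.155}
$$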
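If the initial-state particle is at rest, i.e., $E_{\mathbf{p}_i} = m$ and $\mathbf{p}_i = 0$, we get using (4.75) for the
$i \to f$ decay probability
$$
P_n = \frac{1}{2mV} \prod_{j=1}^{n} \frac{1}{2E_{\mathbf{q}_j} V}\, \Big[ (2\pi)^4 \delta^{(4)}(p_f - p_i) \Big]^2\, |\mathcal{M}(i \to f)|^2
= \frac{1}{2mV}\, (2\pi)^4 \delta^{(4)}(p_f - p_i)\, |\mathcal{M}(i \to f)|^2\; V t \prod_{j=1}^{n} \frac{1}{2E_{\mathbf{q}_j} V} \,. \tag{4.156}
$$
Notice that in order to arrive at the second expression we have replaced one of the delta functions
$(2\pi)^4 \delta^{(4)}(0)$ by the space-time volume $V t$.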
We can now divide out $t$ to get the transition probability per unit time. After integrating
over all possible momenta of the final-state particles, i.e., $V \int d^3q_j/(2\pi)^3$, we then obtain in
terms of the relativistically-invariant $n$-body phase-space element$^{35}$
$$
d\Pi_n = (2\pi)^4 \delta^{(4)}(p_f - p_i) \prod_{j=1}^{n} \frac{d^3q_j}{(2\pi)^3}\, \frac{1}{2E_{\mathbf{q}_j}} \,, \tag{4.157}
$$
the following expression for the partial decay width into the considered $n$-particle final state
$$
\Gamma_n = \frac{1}{2m} \int d\Pi_n\, |\mathcal{M}(i \to f)|^2 \,. \tag{4.158}
$$
Notice that the factors of the spatial volume $V$ in the measure $V \int d^3q_j/(2\pi)^3$ have cancelled
those in (4.156), while the factors $1/(2E_{\mathbf{q}_j})$ in (4.156) have conspired with the 3-momentum
integrals in $V \int d^3q_j/(2\pi)^3$ to produce Lorentz-invariant measures (3.61). In consequence, the
density of final states (4.157) is a Lorentz-invariant quantity.
After summation over all possible $n$-particle final states, one finally finds the so-called total
decay width
$$
\Gamma = \frac{1}{2m} \sum_n \int d\Pi_n\, |\mathcal{M}(i \to f)|^2 \,, \tag{4.159}
$$
with $d\Pi_n$ corresponding to a given final state. The total decay width is equal to the reciprocal
of the (mean) lifetime $\tau = 1/\Gamma$ of the decaying particle. If the decaying particle is not at rest, the
decay rate becomes $\Gamma\, m/E_{\mathbf{p}_i}$. This leads to an increased lifetime $\tau E_{\mathbf{p}_i}/m = \tau/\sqrt{1 - v^2} = \gamma\tau$,
where $v$ is the velocity of the decaying particle. Of course, this is a well-known effect related
to time dilation. For example, taking the muon lifetime at rest to be about 2.2 $\mu$s, the
lifetime of a cosmic-ray-produced muon traveling at 98% of the speed of light is about five
times longer.
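The time-dilation estimate for the muon can be reproduced with a few lines of code. This is a minimal sketch in natural units ($c = 1$); the function names are hypothetical and the rest lifetime of 2.2 microseconds is an approximate input value.

```python
import math

def lorentz_gamma(v):
    """Lorentz factor gamma = 1 / sqrt(1 - v^2) for a velocity v in units of c."""
    return 1.0 / math.sqrt(1.0 - v * v)

def dilated_lifetime(tau_rest, v):
    """Lifetime gamma * tau of a particle moving with velocity v (units of c)."""
    return lorentz_gamma(v) * tau_rest

def moving_decay_rate(width_rest, v):
    """Decay rate Gamma * m / E = Gamma / gamma of a moving particle."""
    return width_rest / lorentz_gamma(v)

if __name__ == "__main__":
    tau_muon = 2.2e-6  # approximate muon lifetime at rest in seconds
    print("gamma at v = 0.98:", lorentz_gamma(0.98))
    print("dilated lifetime [s]:", dilated_lifetime(tau_muon, 0.98))
```

For $v = 0.98$ the Lorentz factor is close to 5, which is the factor quoted in the text.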
$^{35}$ This object is in some textbooks denoted by $d{\rm PS}_n$.
In terms of the partial and total decay widths, (4.158) and (4.159), the branching ratio (or
branching fraction) for the $n$-particle decay $i \to f$ reads
$$
\mathcal{B}(i \to f) = \frac{\Gamma_n}{\Gamma} \,. \tag{4.160}
$$
Needless to say, $\mathcal{B}(i \to f) \in [0, 1]$ and $\sum_n \mathcal{B}(i \to f) = 1$.
Cross Sections
Consider now a beam of particles of type B hitting a target at rest consisting of particles of
type A. The case of two colliding particle beams like $e^+e^-$ (LEP), $p\bar{p}$ (Tevatron) or $pp$ (LHC)
can be obtained from this by an appropriate Lorentz boost. Let's start by assuming constant
densities $\rho_A$ and $\rho_B$ in the target and the beam over their whole extents $\ell_A$ and $\ell_B$. The
number of scattering events will then be proportional to
$$
(\rho_A \ell_A)\, (\rho_B \ell_B)\, O \,, \tag{4.161}
$$
where $O$ denotes the cross-sectional overlap area common to both the beam and the target.
The experimental set-up is illustrated in Figure 4.10. The ratio
$$
\sigma = \frac{\#\,\text{scattering events}}{(O \rho_A \ell_A)\, (O \rho_B \ell_B)/O} = \frac{\#\,\text{scattering events}}{N_A N_B/O} \,, \tag{4.162}
$$
defines the cross section as the effective area of a chunk taken out of the beam by each
particle in the target. The quantities $N_A$ and $N_B$ are the numbers of A and B particles that
are relevant for scattering, i.e., the particles that at some point in time belong to the overlap
between target and beam. Notice that all of this can be equally well formulated in terms of
time-related quantities like the scattering rate and the incoming particle flux. Simply replace
the number (#) of scattering events by the number of scattering events per second and
$\ell_B \rho_B$ by the flux $v_B \rho_B$ of beam particles.
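The counting behind (4.161) and (4.162) amounts to simple arithmetic, which the following sketch spells out. All numerical values (densities, extents, overlap area, cross section) are hypothetical and chosen only so that the result is easy to follow; the function names are likewise assumptions of this example.

```python
def particles_in_overlap(density, extent, overlap_area):
    """Number of particles N = rho * l * O inside the overlap region."""
    return density * extent * overlap_area

def expected_events(sigma, n_a, n_b, overlap_area):
    """Number of scattering events sigma * N_A * N_B / O, obtained by inverting (4.162)."""
    return sigma * n_a * n_b / overlap_area

if __name__ == "__main__":
    overlap = 1e-6                                     # overlap area O in m^2 (hypothetical)
    n_a = particles_in_overlap(1e28, 0.01, overlap)    # target: rho_A in m^-3, l_A in m
    n_b = particles_in_overlap(1e14, 0.1, overlap)     # beam:   rho_B in m^-3, l_B in m
    sigma = 1e-31                                      # cross section in m^2 (hypothetical)
    print("N_A =", n_a, " N_B =", n_b)
    print("events =", expected_events(sigma, n_a, n_b, overlap))
```

Since $N_A N_B/O = \rho_A \ell_A\, \rho_B \ell_B\, O$, the same number follows directly from the densities and extents, which is the proportionality stated in (4.161).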
In reality $\rho_A$ and $\rho_B$ are not constant, since the colliding particles are described by wave
packets and both target and beam have a density profile. However, the range of the interaction
between the colliding particles is much smaller than the width of the individual wave packets
perpendicular to the beam, which in turn is much smaller than the actual diameter of the beam.
Therefore, to very good approximation $\rho_A$ and $\rho_B$ can be considered as locally constant on
QM (i.e., interaction) length scales, whereas the density profiles inside the target and beam
can be incorporated properly by averaging over the overlap region
$$
\ell_A \ell_B \int d^2x_\perp\, \rho_A(x_\perp)\, \rho_B(x_\perp) = N_A N_B/O \,. \tag{4.163}
$$
Here $x_\perp$ is the spatial coordinate perpendicular to the beam. From this it follows that
$$
\#\,\text{scattering events} = \sigma\, N_A N_B/O \,, \tag{4.164}
$$
where $\sigma$ can be calculated for effectively constant values of $\rho_A$ and $\rho_B$ corresponding to
approximately plane-wave initial states. By the way, we do not have to restrict ourselves to the total
number of scattering events. In a similar way we can study the cross section for scattering into
the region $d^3q_1 \ldots d^3q_n$ around the $n$-particle final-state momentum point $\mathbf{q}_1, \ldots, \mathbf{q}_n$. This is
actually what detectors usually do,$^{36}$ since they detect particles with energy and momentum
in certain finite bins, which are given by the detector resolution. These bins cannot resolve
the momentum spread of any of the wave packets, so in the final state we should use plane
waves as well.

Figure 4.10: Incident beam of particles with density $\rho_B$, extent $\ell_B$, and velocity $v_B$
hitting a target of density $\rho_A$ and extent $\ell_A$. The overlap area of the beam and target
is denoted by $O$.

Calculating cross sections therefore amounts to computing transition probabilities in
momentum space. These transition probabilities are universal in the sense that they are
independent of details of the experiment, like the properties of the beams, the targets or the
preparation of the initial-state particles. Consider an initial state consisting of one target
and one beam particle in the momentum state $|i\rangle = |p_A, p_B\rangle$ scattering into a final state
$|f\rangle = |q_1, \ldots, q_n\rangle$. In analogy with the calculation that led to (4.159), the corresponding
differential transition probability per unit time and flux is given by
$$
d\sigma = \frac{1}{F}\, \frac{d\Pi_n}{4 E_{\mathbf{p}_A} E_{\mathbf{p}_B} V}\, |\mathcal{M}(i \to f)|^2 \,, \tag{4.165}
$$
which is usually referred to as the differential cross section. In the latter expression $F$ stands
for the flux associated with the incoming beam of particles. In the CM frame of the collision
this flux reads
$$
F = \frac{|v_{\rm rel}|}{V} = \frac{|v_A - v_B|}{V} = \frac{|p_A/E_A - p_B/E_B|}{V} = \frac{p_{\rm CM}\, E_{\rm CM}}{E_A E_B V} \,, \tag{4.166}
$$
where $E_{\rm CM} = E_A + E_B$ is the total CM energy and $p_{\rm CM}$ is the momentum of either of the
particles in the CM frame. To find this result we have used that the 4-momentum of a massive
particle reads $p_0^\mu = (m, \mathbf{0})$ in its rest frame, and becomes
$p^\mu = \big( \gamma\, (E_0 + \mathbf{v}\cdot\mathbf{p}_0),\; \gamma\, (\mathbf{p}_0 + E_0 \mathbf{v}) \big) = m\gamma\, (1, \mathbf{v})$
upon a boost with velocity $\mathbf{v}$. In the CM frame we thus find the expression
$$
\sigma_{\rm CM} = \int \frac{d\Pi_n}{4\, |E_A p_B - E_B p_A|}\, |\mathcal{M}(i \to f)|^2 \,, \tag{4.167}
$$
for the total cross section. The flux factor $\tfrac{1}{4}\, |E_A p_B - E_B p_A|^{-1}$ is not Lorentz invariant,
but invariant under boosts along the beam direction, as expected for a cross-sectional area
perpendicular to the beam.
$^{36}$ Provided that the particle positions cannot be resolved at the level of the de Broglie wavelengths of the
particles, which typically is the case in human-built detectors.
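The equivalence of the different forms of the flux factor in (4.166) and (4.167), and its invariance under boosts along the beam axis, can be checked numerically. The sketch below uses hypothetical sample masses and momenta, treats momenta as signed components along the beam axis, and uses function names that are assumptions of this example.

```python
import math

def energy(p, m):
    """Relativistic energy sqrt(p^2 + m^2) for a momentum component p along the beam."""
    return math.sqrt(p * p + m * m)

def flux_factors(p_cm, m_a, m_b):
    """Return E_A*E_B*|v_A - v_B|, |E_A p_B - E_B p_A| and p_CM*E_CM in the CM frame."""
    e_a, e_b = energy(p_cm, m_a), energy(p_cm, m_b)
    p_a, p_b = p_cm, -p_cm  # back-to-back momenta along the beam axis
    from_velocities = e_a * e_b * abs(p_a / e_a - p_b / e_b)
    from_momenta = abs(e_a * p_b - e_b * p_a)
    from_cm = p_cm * (e_a + e_b)
    return from_velocities, from_momenta, from_cm

def boosted_flux_factor(p_cm, m_a, m_b, beta):
    """|E_A p_B - E_B p_A| after boosting both particles with velocity beta along the beam."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    e_a, e_b = energy(p_cm, m_a), energy(p_cm, m_b)
    p_a, p_b = p_cm, -p_cm
    e_a2, p_a2 = gamma * (e_a + beta * p_a), gamma * (p_a + beta * e_a)
    e_b2, p_b2 = gamma * (e_b + beta * p_b), gamma * (p_b + beta * e_b)
    return abs(e_a2 * p_b2 - e_b2 * p_a2)

if __name__ == "__main__":
    print(flux_factors(3.0, 1.0, 2.0))
    print(boosted_flux_factor(3.0, 1.0, 2.0, 0.6))
```

All four printed numbers coincide up to rounding, reflecting that $E_A p_B - E_B p_A$ is unchanged by a longitudinal boost even though it is not a full Lorentz invariant.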
Notice that the expression for $d\sigma$ as given in (4.165) is also valid for identical particles in the
final state. Finding a set of particles in the required momentum bin effectively identifies the
particles. However, when integrating $d\sigma$ to obtain the total cross section $\sigma_{\rm CM}$ for the scattering
into the $n$ particles one has to restrict this integration to inequivalent configurations. E.g.,
the total cross section for the $2 \to 2$ reaction $N\bar{N} \to MM$ in the scalar Yukawa theory is
obtained as $\sigma_{\rm CM} = 1/2 \int d\sigma$.
4.12 Problems
i) Draw the Feynman diagrams that contribute to $N\bar{N} \to MM$ at $O(g^2)$ in the Yukawa
theory. Write down the corresponding amplitude and express your result through the
Mandelstam variables. Can the amplitude develop a pole?
Calculate the leading-order contribution (4.74) to $MM \to MM$ scattering in the limit
of small external momenta, i.e., $p_1^2 \ll M^2$ etc. To do so, first relate the $d$-dimensional
integral
$$
\int \frac{d^dl}{(2\pi)^d}\, \frac{1}{l^2 - M^2} = -\frac{i}{(4\pi)^{d/2}}\, \Gamma(1 - d/2)\, (M^2)^{d/2 - 1} \,, \tag{4.168}
$$
to the Feynman integral appearing in (4.74), then set $d = 4 - 2\epsilon$, and finally take the limit
$\epsilon \to 0$. Can you recover the qualitative behavior of your result in the effective theory
obtained after integrating out the nucleon fields? Think in terms of higher-dimensional
operators.
ii) Calculate the Born-level potential for $N\bar{N} \to N\bar{N}$ scattering in the Yukawa theory.
Compare your result to (4.83). What does your finding tell you about the nature of
scalar interactions?
Compute the leading-order potential for $\phi\phi$ scattering in $\phi^4$ theory. What is the
physical meaning of your result?
iii) Consider the anharmonic oscillator with Hamiltonian
$$
H = \frac{1}{2}\, p^2 + \frac{1}{2}\, \omega^2 x^2 + \frac{\lambda}{3!}\, x^3 \,. \tag{4.169}
$$
The goal of this exercise is to calculate matrix elements of the form
$$
\langle\Omega| x^n(0) |\Omega\rangle \,, \tag{4.170}
$$
where $n = 1, 2, \ldots$ and $|\Omega\rangle$ denotes the ground state of the perturbed Hamiltonian (we
write $x^n(0)$ rather than $x^n$ because the time associated with these operators is important)
using Feynman diagrams and comparing the obtained results with those following from
standard time-independent perturbation theory.
There are two types of vertices in the possible Feynman diagrams, namely external and
internal vertices. External vertices have a single line and correspond to the $x(0)$ factors
entering the matrix elements (4.170). They are labeled by the time at which the operators are
evaluated, $t = 0$ in our case. Internal vertices have three lines, corresponding to the
perturbation $\lambda/(3!)\, x^3$ in (4.169), and are labeled by a parameter $t$. Each internal vertex
has a different parameter. The Feynman diagrams are constructed using the following
Feynman rules:
$$
\begin{aligned}
(\text{external vertex at time } 0) &= 1 \,, \\
(\text{internal vertex at time } t) &= \frac{(-i\lambda)}{3!} \int dt \,, \\
(\text{propagator from } s \text{ to } t) &= D(s, t) = \frac{1}{2\omega}\, e^{-i\omega|s - t|} \,.
\end{aligned} \tag{4.171}
$$
Here the limits of the $t$-integration are $\pm\infty\, (1 - i\epsilon)$. We need the $i\epsilon$ because without it,
the Feynman integrals would not converge. Yet, the final results of the integrals turn
out to be independent of $\epsilon$, so that in practice we could omit the $i\epsilon$ from our notation.
Draw the relevant Feynman diagrams that contribute at $O(1)$ and $O(\lambda)$ to the matrix
elements (4.170) with $n = 1, 2, 3$. Determine the corresponding weight factors and
calculate the graphs using ($a > 0$)
$$
\int dt\, e^{-ia|t|} = \frac{2}{ia} \,. \tag{4.172}
$$
Try to reproduce your results using standard time-independent perturbation theory.
Proceed to compute the $O(\lambda^2)$ correction to $\langle\Omega| x^2(0) |\Omega\rangle$. Remember to employ (4.105).
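The Feynman integrals appearing in this case are of the form ($a, b, c > 0$)
$$
\int ds \int dt\, e^{-ia|s|}\, e^{-ib|t|}\, e^{-ic|s - t|}
= -\left[ \frac{2}{(a + b)(b + c)} + \frac{2}{(a + b)(a + c)} + \frac{2}{(a + c)(b + c)} \right] \,. \tag{4.173}
$$
The overall sign follows from the two factors of $1/i$ generated by the two time integrations. This
result can be easily derived from (4.172). If you want, verify that you get the same
answer for the $O(\lambda^2)$ correction to $\langle\Omega| x^2(0) |\Omega\rangle$ using standard QM perturbation theory.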
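iv) Consider a Lorentz transformation $\Lambda$ that boosts $p_0^\mu = (m_\lambda, \mathbf{0})$ to $(\Lambda p_0)^\mu = (E_{\mathbf{p}}(\lambda), \mathbf{p})$.
Show that $|\lambda_{\mathbf{p}}\rangle = U(\Lambda)|\lambda_0\rangle$ satisfies $P^\mu |\lambda_{\mathbf{p}}\rangle = (\Lambda p_0)^\mu |\lambda_{\mathbf{p}}\rangle$ where $P^\mu = (H - E_0, \mathbf{P})$.
The ground state $|\Omega\rangle$ of any interacting theory has to be Poincaré invariant, since the
vacuum ought not to have a preferred direction. Show that $\langle\Omega|\phi(x)|\Omega\rangle = v$ for any $x$
if $\langle\Omega|\phi(0)|\Omega\rangle = v$.
Compute the spectral density function $\rho(s)$ and the field renormalization factor $Z$ of the
free scalar theory by explicitly calculating $\langle 0|\phi(0)|p\rangle$.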
v) Evaluate the leading singularity of the Fourier-transformed $(n + 2)$-point correlation
function (4.126) arising from the first integration region in (4.127). You should find the
result given in (4.131).
vi) In Section 4.10 we learnt that the two-point correlation function of the $\phi^4$ theory viewed
as an analytic function of the momentum $p^2$ has a branch-cut singularity associated with
multiparticle intermediate states. This finding should not come as a surprise to those
familiar with non-relativistic scattering theory, where the amplitudes considered as a
function of energy have branch cuts on the positive real axis. The imaginary part of the
scattering amplitude appears as a discontinuity across this branch cut. By the optical
theorem the imaginary part of the forward-scattering amplitude is then proportional to
the total cross section. In this exercise we will derive the QFT version of the optical
theorem for a $2 \to 2$ scattering process.
Derive an equation for the product $T^\dagger T$ involving the T-matrix starting from the unitarity
of the S-matrix. What is the physical reason for the unitarity of the S-matrix?
Calculate the matrix element of this relation between the two-particle initial and final
states $|p_1, p_2\rangle$ and $|q_1, q_2\rangle$. In order to compute the matrix element of $T^\dagger T$ insert a
complete set of states $|k_i\rangle$ with $i = 1, \ldots, \infty$. Give a pictorial representation of the
resulting identity.
Set $p_1 = q_1$ and $p_2 = q_2$ and relate the matrix element of $T^\dagger T$ to a total cross section.
Use this result to derive the standard form of the optical theorem, i.e.,
$$
{\rm Im}\, \mathcal{M}(p_1, p_2 \to p_1, p_2) = 2 E_{\rm CM}\, p_{\rm CM}\; \sigma(p_1, p_2 \to \text{anything}) \,. \tag{4.174}
$$
Here $E_{\rm CM}$ is the total CM energy, $p_{\rm CM}$ is the momentum of either of the particles in the
CM frame, and $\sigma(p_1, p_2 \to \text{anything})$ is the total cross section for the production of all
final states.
vii) The generalized optical theorem
2Im/(a b) =

f
_
d
f
/(a f) /

(b f) . (4.175)
is true not only for S-matrix elements, but for any amplitude that we can dene in terms
of Feynman diagrams. Here a and b denote asymptotic states, the sum f runs over all
possible sets of nal states, and d
f
is the corresponding phase-space element (4.157).
In this exercise we will learn that the optical theorem can also be used to deal with
unstable particles, which never appear in asymptotic states.
Recall that the exact two-point function of a scalar eld takes the form (4.138). Use
the diagrammatic master formula (4.157) to derive a relation between the amplitude
/(p p) describing 1 1 scattering and the quantity i(p
2
) that is the sum of all
1PI insertions into the boson propagator (4.137).
103
The latter relation can be used to study the imaginary part of $\Sigma(p^2)$. In order to do so, we change the definition of the physical mass of the $\phi$-particle from (4.139) into
$$ m^2 = m_0^2 + \operatorname{Re} \Sigma(m^2) . \tag{4.176} $$
Assume now that the full propagator (4.138) appears in the $s$ channel of a $2 \to 2$ Feynman diagram. Compute the cross section for the process in the vicinity of the resonance. Neglect all overall factors.
Compare your result to the relativistic Breit-Wigner formula for the cross section in the region of a resonance,
$$ \sigma \propto \left| \frac{1}{s - m^2 + i m \Gamma} \right|^2 . \tag{4.177} $$
Here $m$ is the mass of the resonance and $\Gamma$ is its width. Identify the width with the imaginary part of $\Sigma(p^2)$ assuming that the resonance is narrow, i.e., $\Gamma \ll m$. What does this mean for the lifetime of the particle? Calculate $\operatorname{Im} \Sigma(m^2)$, and hence $\Gamma$, using the optical theorem (4.175). You should recover the result (4.159).
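The shape of the Breit-Wigner formula (4.177) can be explored numerically. The following is only a minimal sketch: the function name `breit_wigner` and the sample values of the mass and width are illustrative assumptions, and all overall factors are dropped as in the exercise.

```python
def breit_wigner(s, m, gamma):
    """Line shape |1 / (s - m^2 + i m Gamma)|^2 of (4.177), overall factors dropped."""
    return 1.0 / abs(complex(s - m**2, m * gamma)) ** 2


# Hypothetical sample values, chosen only for illustration.
m, gamma = 10.0, 0.1

peak = breit_wigner(m**2, m, gamma)                    # evaluated at s = m^2
upper_half = breit_wigner(m**2 + m * gamma, m, gamma)  # evaluated at s = m^2 + m*Gamma
lower_half = breit_wigner(m**2 - m * gamma, m, gamma)  # evaluated at s = m^2 - m*Gamma
```

In this sketch the line shape falls to half of its peak value at $s = m^2 \pm m\Gamma$, so the full width at half maximum in the variable $s$ is $2 m \Gamma$.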
viii) Derive the explicit form of the 2-body phase-space element $\int d\Pi_2$ from (4.157). Using your result calculate the angular differential cross section $(d\sigma/d\Omega)_{\rm CM}$ for a generic $2 \to 2$ process in the CM frame. The solid angle $d\Omega$ is given by $d\varphi \, d\cos\theta$ where $\theta \in [0, \pi]$ is the polar scattering angle (with respect to the beam axis) and $\varphi \in [0, 2\pi[$ the azimuthal scattering angle (around the beam axis). Consider also the case where the external particles all have the same mass. In this case you should obtain
$$ \left( \frac{d\sigma}{d\Omega} \right)_{\rm CM} = \frac{|\mathcal{M}(p_1, p_2 \to q_1, q_2)|^2}{64 \pi^2 s} , \tag{4.178} $$
where $\mathcal{M}(p_1, p_2 \to q_1, q_2)$ represents the relevant scattering matrix element.
ix) The interactions of pions at low energy can be described by a phenomenological model called the linear sigma model,
$$ H = \int d^3x \left[ \frac{1}{2} \Pi_i^2 + \frac{1}{2} (\nabla \Phi_i)^2 + V(\Phi^2) \right] . \tag{4.179} $$
Here $\Phi_i$ with $i = 1, \ldots, N$ are real scalar fields and $\Pi_i$ denotes the conjugate momentum derived from $\Phi_i$. The potential is given by
$$ V(\Phi^2) = \frac{1}{2} m^2 (\Phi_i)^2 + \frac{\lambda}{4} \left( \Phi_i^2 \right)^2 . \tag{4.180} $$
Note that for $m^2 > 0$ and $\lambda = 0$ the above Hamiltonian just consists of $N$ copies of the Klein-Gordon Hamiltonian. If one now assumes $\lambda$ to be a small perturbation, one can calculate scattering amplitudes in a series expansion in $\lambda$.
Show that the propagator of the $\Phi_i$ fields is
$$ \langle 0 | T \, \Phi_i(x) \Phi_j(y) | 0 \rangle = \delta_{ij} D_F(x - y) , \tag{4.181} $$
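where $D_F(x - y)$ is the standard Klein-Gordon propagator with mass $m$. Show furthermore that there is one type of vertex, with four external legs carrying the indices $i$, $j$, $k$, $l$, given by
$$ -2i\lambda \left( \delta_{ij} \delta_{kl} + \delta_{il} \delta_{jk} + \delta_{ik} \delta_{jl} \right) . \tag{4.182} $$
A vertex involving two $\Phi_1$ and two $\Phi_2$ thus has the value $-2i\lambda$, while a vertex where four fields of the same type attach receives a factor $-6i\lambda$.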
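The index structure of the vertex (4.182) is easy to tabulate. The snippet below is a small sketch; the function names `delta` and `vertex` and the sample coupling are illustrative assumptions rather than notation used in these notes.

```python
def delta(a, b):
    """Kronecker delta."""
    return 1 if a == b else 0


def vertex(i, j, k, l, lam):
    """Four-point vertex -2i*lambda*(d_ij d_kl + d_il d_jk + d_ik d_jl) of (4.182)."""
    structure = (delta(i, j) * delta(k, l)
                 + delta(i, l) * delta(j, k)
                 + delta(i, k) * delta(j, l))
    return -2j * lam * structure


lam = 0.5  # hypothetical value of the coupling

mixed = vertex(1, 1, 2, 2, lam)   # two fields of type 1 and two of type 2
same = vertex(1, 1, 1, 1, lam)    # four fields of the same type
three_types = vertex(1, 2, 3, 3, lam)
```

Only one of the three Kronecker structures survives when the legs pair up into two different field types, all three contribute for four identical fields, and the vertex vanishes when an index appears an odd number of times.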
Compute the cross sections for $\Phi_1 \Phi_2 \to \Phi_1 \Phi_2$, $\Phi_1 \Phi_1 \to \Phi_2 \Phi_2$, and $\Phi_1 \Phi_1 \to \Phi_1 \Phi_1$ scattering to first order in $\lambda$. Work in the CM frame.
Now consider the case $m^2 < 0$. In this case, the potential has a local maximum rather than a minimum at $\Phi_i = 0$. Since the potential is symmetric under $SO(N)$ rotations of $\Phi = (\Phi_1, \ldots, \Phi_N)$, we can choose to write the fields close to the new minimum as
$$ \Phi(x) = \left( \pi_1(x), \ldots, \pi_{N-1}(x), v + \sigma(x) \right)^T , \tag{4.183} $$
where $v$ is a constant chosen to minimize the potential (4.180), $\sigma(x)$ is a small deviation, and $\pi_i(x)$ denote the remaining fields, called pions. Show that with such a potential we have a theory of one massive sigma field and $(N - 1)$ massless pion fields, interacting through cubic and quartic potential terms. Assign Feynman rules to the propagators of $\sigma(x)\sigma(y)$ and $\pi_i(x)\pi_j(y)$ (4.184) and to the cubic and quartic vertices (4.185).
[Diagrams for (4.184) and (4.185) not reproduced: propagator lines and vertices with pion indices $i$, $j$, $k$, $l$.]
References
[1] J. Schwinger, Renormalization Theory of Quantum Electrodynamics: An Individual
View, in The Birth of Particle Physics, Cambridge University Press (1983), 329 p.
[2] H. Yukawa, Proc. Phys. Math. Soc. Jap. 17, 48 (1935).
[3] H. Lehmann, K. Symanzik and W. Zimmermann, Nuovo Cim. 1, 205 (1955).
5 Dirac Theory
We have seen that quantization of scalar fields gives rise to spin-zero particles. But most particles in nature have an intrinsic angular momentum or spin. These arise naturally in field theory by considering fields which themselves transform non-trivially under the Lorentz group. In this section we will describe the Dirac equation, whose quantization gives rise to fermionic spin-1/2 particles. In order to motivate the Dirac equation, we will start by studying the appropriate representation of the Lorentz group.
We already know from Section 2.4 that if one considers infinitesimal Lorentz transformations (2.61), the matrices $\omega^{\mu\nu}$ (2.63) entering the transformations have to be antisymmetric. Such an object has six independent parameters, which agrees with the number of transformations of the Lorentz group, i.e., three rotations and three boosts. In the following it will turn out to be useful to introduce a basis of these six $4 \times 4$ antisymmetric matrices. We call our matrices $(\mathcal{M}^{\rho\sigma})^{\mu\nu}$ with $\rho, \sigma, \mu, \nu = 0, 1, 2, 3$ and write the basis of six matrices as
$$ (\mathcal{M}^{\rho\sigma})^{\mu\nu} = \eta^{\rho\mu} \eta^{\sigma\nu} - \eta^{\sigma\mu} \eta^{\rho\nu} , \tag{5.1} $$
where the indices $\rho$ and $\sigma$ denote which basis element we are dealing with, while $\mu$ and $\nu$ belong to the $4 \times 4$ matrices. Notice that $(\mathcal{M}^{\rho\sigma})^{\mu\nu}$ is antisymmetric in both $\rho, \sigma$ and $\mu, \nu$. If we use these matrices in practical applications (e.g., if we want to multiply them together or act on some field) we will typically need to lower one index,
$$ (\mathcal{M}^{\rho\sigma})^{\mu}{}_{\nu} = \eta^{\rho\mu} \delta^{\sigma}_{\nu} - \eta^{\sigma\mu} \delta^{\rho}_{\nu} . \tag{5.2} $$
Since we lowered the index with the Minkowski metric, we pick up various minus signs, which means that when written in this form, the matrices are no longer necessarily antisymmetric. E.g., one has
$$ (\mathcal{M}^{01})^{\mu}{}_{\nu} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} , \qquad (\mathcal{M}^{12})^{\mu}{}_{\nu} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} . \tag{5.3} $$
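The matrix $\mathcal{M}^{01}$, which is real and symmetric, generates boosts in the $x$ direction, while $\mathcal{M}^{12}$ is real and antisymmetric and generates rotations in the $xy$ plane. In terms of $\mathcal{M}^{\rho\sigma}$, we can now write any infinitesimal $\omega^{\mu}{}_{\nu}$ as
$$ \omega^{\mu}{}_{\nu} = \frac{1}{2} \Omega_{\rho\sigma} (\mathcal{M}^{\rho\sigma})^{\mu}{}_{\nu} , \tag{5.4} $$
where the matrix $\Omega_{\rho\sigma}$ contains six numbers and is antisymmetric in $\rho$ and $\sigma$. This matrix parametrizes the Lorentz transformation we are doing. The basis of the six matrices $\mathcal{M}^{\rho\sigma}$ forms the generators of the Lorentz transformations. The generators obey the Lorentz Lie algebra relations,
$$ [\mathcal{M}^{\rho\sigma}, \mathcal{M}^{\tau\nu}] = \eta^{\sigma\tau} \mathcal{M}^{\rho\nu} - \eta^{\rho\tau} \mathcal{M}^{\sigma\nu} + \eta^{\rho\nu} \mathcal{M}^{\sigma\tau} - \eta^{\sigma\nu} \mathcal{M}^{\rho\tau} . \tag{5.5} $$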
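The generators with one lowered index, $(\mathcal{M}^{\rho\sigma})^{\mu}{}_{\nu} = \eta^{\rho\mu} \delta^{\sigma}_{\nu} - \eta^{\sigma\mu} \delta^{\rho}_{\nu}$, and the algebra (5.5) can be checked by brute force. The sketch below assumes the metric signature $(+,-,-,-)$; all function names are illustrative and matrices are stored as nested lists, with the row index $\mu$ and the column index $\nu$.

```python
from itertools import product

# Minkowski metric with signature (+, -, -, -).
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def generator(rho, sigma):
    """(M^{rho sigma})^mu_nu = eta^{rho mu} delta^sigma_nu - eta^{sigma mu} delta^rho_nu."""
    return [[ETA[rho][mu] * (sigma == nu) - ETA[sigma][mu] * (rho == nu)
             for nu in range(4)] for mu in range(4)]


def matmul(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def combine(terms):
    """Linear combination sum_k c_k A_k for a list of (c_k, A_k) pairs."""
    return [[sum(c * a[i][j] for c, a in terms) for j in range(4)]
            for i in range(4)]


def commutator(a, b):
    return combine([(1, matmul(a, b)), (-1, matmul(b, a))])


def lorentz_algebra_holds():
    """Check the commutation relations (5.5) for all index combinations."""
    for r, s, t, n in product(range(4), repeat=4):
        lhs = commutator(generator(r, s), generator(t, n))
        rhs = combine([(ETA[s][t], generator(r, n)), (-ETA[r][t], generator(s, n)),
                       (ETA[r][n], generator(s, t)), (-ETA[s][n], generator(r, t))])
        if lhs != rhs:
            return False
    return True
```

Since all entries are integers the comparison is exact. Calling `generator(0, 1)` and `generator(1, 2)` returns the symmetric boost generator and the antisymmetric rotation generator, respectively.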
Here the matrix indices have been suppressed. A finite Lorentz transformation can be constructed from (5.4) by building the exponential
$$ \Lambda = \exp \left( \frac{1}{2} \Omega_{\rho\sigma} \mathcal{M}^{\rho\sigma} \right) . \tag{5.6} $$
Let me stress again what each of these objects is: the $\mathcal{M}^{\rho\sigma}$ are six $4 \times 4$ basis elements of the Lorentz Lie algebra, while the $\Omega_{\rho\sigma}$ are six numbers telling us what kind of Lorentz transformation we are doing.
5.1 Spinor Representation
We now want to find other matrices which satisfy the Lorentz algebra commutation relations (5.5). In the following, we will construct the spinor representation of the Lorentz group using a trick due to Dirac. We start by defining something which, at first sight, has nothing to do with the Lorentz group. It is the Clifford algebra (or Dirac algebra)
$$ \{ \gamma^\mu, \gamma^\nu \} = 2 \eta^{\mu\nu} \, 1 , \tag{5.7} $$
where $\{a, b\} = ab + ba$ is the usual anticommutator, $\gamma^\mu$ denotes a set of four matrices (the Dirac matrices), and $1$ is the $n \times n$ unit matrix with $n$ being the dimensionality of the representation. The relation (5.7) implies that we have to look for matrices $\gamma^\mu$ that satisfy
$$ \gamma^\mu \gamma^\nu = -\gamma^\nu \gamma^\mu \ \text{ when } \mu \neq \nu , \qquad (\gamma^0)^2 = 1 , \tag{5.8} $$
and
$$ (\gamma^i)^2 = -1 , \tag{5.9} $$
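for $i = 1, 2, 3$. It is not difficult to convince oneself that the simplest representation of the Clifford algebra (for four-dimensional Minkowski space) is in terms of $4 \times 4$ matrices. In fact, there are many $4 \times 4$ matrices $\gamma^\mu$ that obey (5.7).^37 E.g., we may take the so-called Weyl or chiral representation,
$$ \gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \qquad \gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix} , \tag{5.10} $$
where each element is a $2 \times 2$ matrix itself and $\sigma^i$ denotes the Pauli matrices
$$ \sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \qquad \sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} , \qquad \sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} . \tag{5.11} $$
The latter matrices themselves satisfy $\{ \sigma^i, \sigma^j \} = 2 \delta^{ij}$. Using these properties one easily shows that (5.10) indeed satisfies (5.7). This is left as a homework problem.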
^37 One can construct any other representation of the Clifford algebra from a specific one by taking $\gamma^\mu \to M \gamma^\mu M^{-1}$ for any invertible matrix $M$. However, up to this equivalence, it turns out that there is a unique irreducible representation of the Clifford algebra, and the matrices (5.10) provide an example.
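A quick numerical cross-check of the Clifford algebra (5.7) for the Weyl representation may be useful alongside the analytic homework. The sketch below writes the four $4 \times 4$ matrices $\gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $\gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix}$ out explicitly; the helper names are illustrative assumptions.

```python
# Weyl (chiral) representation, written out as explicit 4x4 matrices.
GAMMA = [
    [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]],
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],
    [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],
]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ETA = [1, -1, -1, -1]  # diagonal entries of the Minkowski metric


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def combine(terms):
    """Linear combination sum_k c_k A_k for a list of (c_k, A_k) pairs."""
    return [[sum(c * a[i][j] for c, a in terms) for j in range(4)]
            for i in range(4)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


def satisfies_clifford(gammas):
    """Check {gamma^mu, gamma^nu} = 2 eta^{mu nu} 1 for all mu, nu."""
    for mu in range(4):
        for nu in range(4):
            anticommutator = combine([(1, matmul(gammas[mu], gammas[nu])),
                                      (1, matmul(gammas[nu], gammas[mu]))])
            expected = combine([(2 * ETA[mu] * (mu == nu), IDENTITY)])
            if not close(anticommutator, expected):
                return False
    return True


weyl_ok = satisfies_clifford(GAMMA)
```

The same function can be applied to any other candidate set of four matrices, for instance one obtained from a similarity transformation of the Weyl matrices.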
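So what is the connection between the Clifford algebra and the Lorentz group? In order to answer the question, we consider the commutator of two Dirac matrices $\gamma^\mu$,
$$ S^{\mu\nu} = \frac{1}{4} [\gamma^\mu, \gamma^\nu] . \tag{5.12} $$
In our representation, the $0i$ and $ij$ components of $S^{\mu\nu}$ are given explicitly by
$$ S^{0i} = \frac{1}{2} \begin{pmatrix} -\sigma^i & 0 \\ 0 & \sigma^i \end{pmatrix} , \tag{5.13} $$
and
$$ S^{ij} = -\frac{i}{2} \epsilon^{ijk} \begin{pmatrix} \sigma^k & 0 \\ 0 & \sigma^k \end{pmatrix} . \tag{5.14} $$
It is straightforward to show, and thus part of an exercise, that these matrices (irrespective of their representation) satisfy
$$ [S^{\mu\nu}, \gamma^\rho] = \gamma^\mu \eta^{\nu\rho} - \gamma^\nu \eta^{\rho\mu} , \tag{5.15} $$
and
$$ [S^{\mu\nu}, S^{\rho\sigma}] = \eta^{\nu\rho} S^{\mu\sigma} - \eta^{\mu\rho} S^{\nu\sigma} + \eta^{\mu\sigma} S^{\nu\rho} - \eta^{\nu\sigma} S^{\mu\rho} . \tag{5.16} $$
The latter equality tells us that the matrices $S^{\mu\nu}$ form a representation of the Lorentz algebra (5.5). We now also understand the physical meaning of (5.13) and (5.14). The former object induces a Lorentz boost, while the latter generates a three-dimensional rotation.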
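The claim that the matrices $S^{\mu\nu} = \frac{1}{4} [\gamma^\mu, \gamma^\nu]$ obey $[S^{\mu\nu}, \gamma^\rho] = \gamma^\mu \eta^{\nu\rho} - \gamma^\nu \eta^{\rho\mu}$ and the Lorentz algebra can be tested numerically in the Weyl representation. This is a sketch with illustrative helper names, not a substitute for the representation-independent proof asked for in the exercise.

```python
from itertools import product

# Weyl (chiral) representation of the Dirac matrices.
GAMMA = [
    [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]],
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],
    [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],
]


def eta(mu, nu):
    """Minkowski metric with signature (+, -, -, -)."""
    if mu != nu:
        return 0
    return 1 if mu == 0 else -1


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def combine(terms):
    return [[sum(c * a[i][j] for c, a in terms) for j in range(4)]
            for i in range(4)]


def commutator(a, b):
    return combine([(1, matmul(a, b)), (-1, matmul(b, a))])


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


def spin_generator(mu, nu):
    """S^{mu nu} = [gamma^mu, gamma^nu] / 4."""
    return combine([(0.25, commutator(GAMMA[mu], GAMMA[nu]))])


def gamma_commutator_holds():
    """[S^{mu nu}, gamma^rho] = gamma^mu eta^{nu rho} - gamma^nu eta^{rho mu}."""
    return all(
        close(commutator(spin_generator(m, n), GAMMA[r]),
              combine([(eta(n, r), GAMMA[m]), (-eta(r, m), GAMMA[n])]))
        for m, n, r in product(range(4), repeat=3))


def lorentz_algebra_holds():
    """[S^{mu nu}, S^{rho sigma}] closes as in the vector representation."""
    S = spin_generator
    return all(
        close(commutator(S(m, n), S(r, s)),
              combine([(eta(n, r), S(m, s)), (-eta(m, r), S(n, s)),
                       (eta(m, s), S(n, r)), (-eta(n, s), S(m, r))]))
        for m, n, r, s in product(range(4), repeat=4))


boost_01 = spin_generator(0, 1)      # real and symmetric
rotation_12 = spin_generator(1, 2)   # purely imaginary and diagonal
```

In this representation `boost_01` is real and hermitian while `rotation_12` is anti-hermitian, in line with the statement that boosts are not represented unitarily.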
Dirac Spinors
The $S^{\mu\nu}$ are $4 \times 4$ matrices, because the $\gamma^\mu$ are. So far we haven't given an index to the rows and columns of these matrices. Let's call the indices $\alpha$ and $\beta$. We furthermore need a field that the $(S^{\mu\nu})^{\alpha}{}_{\beta}$ act upon. The sought field has to have four complex components labelled by $\alpha$ and we call it $\psi^\alpha(x)$. This object is the famous Dirac spinor. Under Lorentz transformations, we have
$$ \psi^\alpha(x) \to S(\Lambda)^{\alpha}{}_{\beta} \, \psi^\beta(x') , \tag{5.17} $$
where $x' = \Lambda^{-1} x$ and the full Lorentz transformation $S(\Lambda)$ takes the form
$$ S(\Lambda) = \exp \left( \frac{1}{2} \Omega_{\rho\sigma} S^{\rho\sigma} \right) , \tag{5.18} $$
and the expression for $\Lambda$ is given in (5.6). Although the bases of generators $S^{\rho\sigma}$ and $\mathcal{M}^{\rho\sigma}$ are different, we use the same six numbers $\Omega_{\rho\sigma}$ in both $S(\Lambda)$ and $\Lambda$. This ensures that we are doing the same Lorentz transformation on $\psi$ and $x$.
Both $S(\Lambda)$ and $\Lambda$ are $4 \times 4$ matrices. So how can we be sure that the spinor representation (5.18) is something new, and isn't equivalent to the familiar vector representation (2.60)?
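In order to convince ourselves that the two representations are truly different, we look at rotations. If we write the rotation parameters as $\Omega_{ij} = -\epsilon_{ijk} \varphi^k$, then (5.6) and (5.18) become
$$ \Lambda = \exp \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & \varphi^3 & -\varphi^2 \\ 0 & -\varphi^3 & 0 & \varphi^1 \\ 0 & \varphi^2 & -\varphi^1 & 0 \end{pmatrix} , \qquad S(\Lambda) = \begin{pmatrix} e^{i \vec{\varphi} \cdot \vec{\sigma}/2} & 0 \\ 0 & e^{i \vec{\varphi} \cdot \vec{\sigma}/2} \end{pmatrix} , \tag{5.19} $$
where in order to arrive at the right-hand sides one has to remember that $\Omega_{12} = -\Omega_{21} = -\varphi^3$, etc. We now consider a rotation by $2\pi$ around the $z$-axis, which means to take $\vec{\varphi} = (0, 0, 2\pi)$. It follows that
$$ \Lambda = \exp \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 2\pi & 0 \\ 0 & -2\pi & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} = 1 , \qquad S(\Lambda) = \begin{pmatrix} e^{i \pi \sigma^3} & 0 \\ 0 & e^{i \pi \sigma^3} \end{pmatrix} = -1 . \tag{5.20} $$
This implies that under a $2\pi$ rotation a vector and a spinor transform as follows
$$ A^\mu(x) \to A^\mu(x) , \qquad \psi^\alpha(x) \to -\psi^\alpha(x) . \tag{5.21} $$
The latter relation tells us that spinors have the unintuitive property that a $2\pi$ rotation does not return them to their initial state, but a $4\pi$ rotation does. So $S(\Lambda)$ definitely differs from the vector representation $\Lambda$.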
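The different behaviour of vectors and spinors under a full turn can be reproduced by exponentiating the generators numerically. The sketch below assumes the conventions $\Lambda = \exp(\frac{1}{2} \Omega_{\rho\sigma} \mathcal{M}^{\rho\sigma})$, $S(\Lambda) = \exp(\frac{1}{2} \Omega_{\rho\sigma} S^{\rho\sigma})$ and $\Omega_{ij} = -\epsilon_{ijk} \varphi^k$, uses the Weyl representation, and evaluates the matrix exponential with a truncated Taylor series; all names are illustrative.

```python
import math

ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
GAMMA = [
    [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]],
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],
    [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],
]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def combine(terms):
    return [[sum(c * a[i][j] for c, a in terms) for j in range(4)]
            for i in range(4)]


def close(a, b, tol=1e-9):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


def expm(a, terms=60):
    """Matrix exponential via a truncated Taylor series (fine for small norms)."""
    result, term = IDENTITY, IDENTITY
    for n in range(1, terms):
        term = combine([(1.0 / n, matmul(term, a))])
        result = combine([(1, result), (1, term)])
    return result


def vector_generator(rho, sigma):
    """(M^{rho sigma})^mu_nu of the vector representation."""
    return [[ETA[rho][mu] * (sigma == nu) - ETA[sigma][mu] * (rho == nu)
             for nu in range(4)] for mu in range(4)]


def spinor_generator(rho, sigma):
    """S^{rho sigma} = [gamma^rho, gamma^sigma] / 4."""
    return combine([(0.25, matmul(GAMMA[rho], GAMMA[sigma])),
                    (-0.25, matmul(GAMMA[sigma], GAMMA[rho]))])


def rotation_parameters(phi):
    """Omega_{ij} = -eps_{ijk} phi^k, all other components zero."""
    p1, p2, p3 = phi
    omega = [[0.0] * 4 for _ in range(4)]
    omega[1][2], omega[2][1] = -p3, p3
    omega[2][3], omega[3][2] = -p1, p1
    omega[3][1], omega[1][3] = -p2, p2
    return omega


def finite_transformation(omega, generator):
    exponent = combine([(0.5 * omega[r][s], generator(r, s))
                        for r in range(4) for s in range(4)])
    return expm(exponent)


full_turn = rotation_parameters((0.0, 0.0, 2 * math.pi))
double_turn = rotation_parameters((0.0, 0.0, 4 * math.pi))

vector_2pi = finite_transformation(full_turn, vector_generator)
spinor_2pi = finite_transformation(full_turn, spinor_generator)
spinor_4pi = finite_transformation(double_turn, spinor_generator)
```

Within the numerical tolerance `vector_2pi` is the unit matrix, `spinor_2pi` is minus the unit matrix, and `spinor_4pi` is the unit matrix again.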
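For later convenience let me also give explicitly the analogs of (5.19) for Lorentz boosts. Writing the boost parameter as $\Omega_{i0} = -\Omega_{0i} = \chi_i$, one finds
$$ \Lambda = \exp \left[ - \begin{pmatrix} 0 & \chi_1 & \chi_2 & \chi_3 \\ \chi_1 & 0 & 0 & 0 \\ \chi_2 & 0 & 0 & 0 \\ \chi_3 & 0 & 0 & 0 \end{pmatrix} \right] , \qquad S(\Lambda) = \begin{pmatrix} e^{\vec{\chi} \cdot \vec{\sigma}/2} & 0 \\ 0 & e^{-\vec{\chi} \cdot \vec{\sigma}/2} \end{pmatrix} . \tag{5.22} $$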
Another important question to ask is whether or not $S(\Lambda)$ is a unitary representation of the Lorentz group.^38 From (5.18), we infer that $S(\Lambda)$ is unitary if $S^{\mu\nu}$ is anti-hermitian, i.e., $(S^{\mu\nu})^\dagger = -S^{\mu\nu}$. But we have
$$ (S^{\mu\nu})^\dagger = -\frac{1}{4} [(\gamma^\mu)^\dagger, (\gamma^\nu)^\dagger] , \tag{5.23} $$
which can be anti-hermitian if all $\gamma^\mu$ are hermitian or all are anti-hermitian. However, we can never arrange for this to happen since (5.8) and (5.9) imply that $S^{\mu\nu}$ has both real and imaginary eigenvalues (indeed $(S^{0i})^2 = 1/4$, while $(S^{ij})^2 = -1/4$ for $i \neq j$), and an anti-hermitian matrix ought only to have imaginary ones. E.g., in the Weyl representation (5.10), we have the property
$$ (\gamma^0)^\dagger = \gamma^0 , \qquad (\gamma^i)^\dagger = -\gamma^i . \tag{5.24} $$
^38 Notice that using the Weyl representation the relations (5.13) and (5.14) already tell us explicitly that rotations are unitary while boosts are not. This observation is also true for the vector representation.
In fact, the Lorentz group, being non-compact, has no non-trivial finite-dimensional representations that are unitary. But this does not matter to us, since our spinor is not a QM wavefunction, but a classical field.
Dirac Action
With the new field $\psi$ at hand we now want to construct Lorentz-invariant EOMs involving it. In order to do this we try to write down a Lorentz-invariant action that is bi-linear in $\psi$. We consider the product
$$ \psi^\dagger(x) \psi(x) = (\psi^*)^T(x) \, \psi(x) , \tag{5.25} $$
where $\psi^\dagger(x)$ is the usual adjoint of a multi-component object. Under a Lorentz transformation $\Lambda$, one has
$$ \psi^\dagger(x) \psi(x) \to \psi^\dagger(x') \, S^\dagger(\Lambda) S(\Lambda) \, \psi(x') , \tag{5.26} $$
which is not Lorentz invariant since $S(\Lambda)$ is not unitary, i.e., $S^\dagger(\Lambda) S(\Lambda) \neq 1$. This means that $\psi^\dagger \psi$ is not a Lorentz scalar and thus not the right building block for constructing the action.
Yet, it is easy to see what went wrong and to correct for it. From (5.24) we find that for $\mu = 0, 1, 2, 3$, one has
$$ (\gamma^\mu)^\dagger = \gamma^0 \gamma^\mu \gamma^0 , \tag{5.27} $$
which in turn implies that
$$ (S^{\mu\nu})^\dagger = -\gamma^0 S^{\mu\nu} \gamma^0 , \tag{5.28} $$
and
$$ S^\dagger(\Lambda) = \gamma^0 S^{-1}(\Lambda) \gamma^0 . \tag{5.29} $$
This suggests that instead of $\psi^\dagger$ we should better use
$$ \bar{\psi}(x) = \psi^\dagger(x) \gamma^0 , \tag{5.30} $$
as a building block in our Dirac action. This object is called the adjoint Dirac spinor.
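The difference between $\psi^\dagger \psi$ and $\bar{\psi} \psi = \psi^\dagger \gamma^0 \psi$ under a boost can be made concrete with numbers. The sketch below assumes the Weyl representation and a boost along $x$ with $S(\Lambda) = \exp(-\chi S^{01})$; the rapidity, the sample spinor and all helper names are illustrative assumptions.

```python
GAMMA = [
    [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]],
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],
    [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],
]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def combine(terms):
    return [[sum(c * a[i][j] for c, a in terms) for j in range(4)]
            for i in range(4)]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]


def close(a, b, tol=1e-9):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


def expm(a, terms=60):
    """Matrix exponential via a truncated Taylor series."""
    result, term = IDENTITY, IDENTITY
    for n in range(1, terms):
        term = combine([(1.0 / n, matmul(term, a))])
        result = combine([(1, result), (1, term)])
    return result


def apply(matrix, spinor):
    return [sum(matrix[i][j] * spinor[j] for j in range(4)) for i in range(4)]


def dagger_product(spinor):
    """psi^dagger psi."""
    return sum(abs(c) ** 2 for c in spinor)


def bar_product(spinor):
    """psi-bar psi = psi^dagger gamma^0 psi."""
    return sum(complex(spinor[i]).conjugate() * GAMMA[0][i][j] * spinor[j]
               for i in range(4) for j in range(4))


chi = 0.7  # hypothetical rapidity
s01 = combine([(0.25, matmul(GAMMA[0], GAMMA[1])),
               (-0.25, matmul(GAMMA[1], GAMMA[0]))])
exponent = combine([(-chi, s01)])  # (1/2) Omega_{rho sigma} S^{rho sigma} with Omega_{01} = -chi
boost = expm(exponent)
boost_inverse = expm(combine([(-1, exponent)]))

psi = [1, 0.5j, 2, -1]  # arbitrary sample spinor
psi_boosted = apply(boost, psi)
```

For this sample spinor `bar_product` gives the same value before and after the boost, whereas `dagger_product` changes, and `dagger(boost)` agrees with $\gamma^0 S^{-1} \gamma^0$.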
Equipped with $\psi$ and $\bar{\psi}$ let us now see what kind of Lorentz covariant objects we can form out of them. We first consider $\bar{\psi} \psi$. It is a simple exercise to show that this object transforms under a Lorentz transformation as
$$ \bar{\psi}(x) \psi(x) \to \bar{\psi}(x') \psi(x') , \tag{5.31} $$
which tells us that it is a Lorentz scalar. A term $\int d^4x \, \bar{\psi}(x) \psi(x)$ is thus Lorentz invariant since $\det(\Lambda) = 1$. Next we consider $\bar{\psi} \gamma^\mu \psi$. This term has the following transformation property
$$ \bar{\psi}(x) \gamma^\mu \psi(x) \to \Lambda^{\mu}{}_{\nu} \, \bar{\psi}(x') \gamma^\nu \psi(x') , \tag{5.32} $$
under Lorentz transformations. This claim is proven as part of an exercise. From (5.32) we infer that $\bar{\psi} \gamma^\mu \psi$ is a Lorentz vector. This means that we can treat the index $\mu$ on the $\gamma^\mu$ matrices as a true vector index. In particular, we can form Lorentz scalars by contracting it with other Lorentz indices. As a result terms like $\int d^4x \, \bar{\psi}(x) \slashed{A}(x) \psi(x)$ and $\int d^4x \, \bar{\psi}(x) \slashed{\partial} \psi(x)$ are Lorentz invariant. Here we have introduced the shorthand notation $\slashed{a} = \gamma_\mu a^\mu$ for any
contravariant vector $a^\mu$. Finally, we consider $\bar{\psi} \sigma^{\mu\nu} \psi$ with $\sigma^{\mu\nu} = i/2 \, [\gamma^\mu, \gamma^\nu]$. Not surprisingly this object behaves like a Lorentz tensor
$$ \bar{\psi}(x) \sigma^{\mu\nu} \psi(x) \to \Lambda^{\mu}{}_{\rho} \Lambda^{\nu}{}_{\sigma} \, \bar{\psi}(x') \sigma^{\rho\sigma} \psi(x') . \tag{5.33} $$
This result is again easy to derive by considering separately the properties of $\psi$, $\bar{\psi}$, and $\gamma^\mu$ under Lorentz transformations. From (5.33) it follows that terms like $\int d^4x \, \bar{\psi}(x) \sigma^{\mu\nu} \psi(x) F_{\mu\nu}(x)$, where all indices are contracted, are Lorentz invariant.
We are now equipped with the three bi-linears $\bar{\psi} \psi$, $\bar{\psi} \gamma^\mu \psi$, and $\bar{\psi} \sigma^{\mu\nu} \psi$, each of which transforms covariantly under the Lorentz group. We can try to build a Lorentz-invariant action from these. In fact, we need only the first two terms. We write
$$ S = \int d^4x \, \bar{\psi}(x) \left( i \slashed{\partial} - m \right) \psi(x) . \tag{5.34} $$
This is the Dirac action we were looking for. Since $[S] = 0$, $[d^4x] = -4$, $[\partial_\mu] = 1$, and $[m] = 1$, we can read off the mass dimension of the spinor field and its adjoint. We have $[\psi] = [\bar{\psi}] = 3/2$. The factor of $i$ is there to make the action (5.34) real. Upon complex conjugation, it cancels a minus sign that comes from integration by parts. As we will see soon, after quantization the Dirac theory describes particles and antiparticles of mass $|m|$ and spin 1/2. Notice that the Lagrangian is of first order, rather than the second-order Lagrangians we were working with for scalar fields. Also, the mass parameter appears in the Lagrangian as $m$, which can be positive or negative.
Dirac Equation
The EOMs for $\psi$ and $\bar{\psi}$ follow from (5.34) by varying independently with respect to $\bar{\psi}$ and $\psi$, respectively. In the first case, we obtain^39
$$ (i \slashed{\partial} - m) \psi = 0 . \tag{5.35} $$
This is the Dirac equation. In the second case it follows that
$$ \bar{\psi} \, (i \overleftarrow{\slashed{\partial}} + m) = 0 , \tag{5.36} $$
which is the hermitian-conjugate form of (5.35). Here the derivative acts to the left. Both (5.35) and (5.36) are first order in derivatives, yet miraculously Lorentz invariant. As a homework assignment you are asked to show this explicitly. In contrast, in the case of a scalar field a first-order EOM would necessarily break Lorentz invariance, because one would always need to introduce a privileged vector that saturates the open index of $\partial_\mu$. The $\gamma^\mu$ matrices provide this index in the case of the Dirac equation.
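It is also important to realize that the Dirac equation mixes up different components of $\psi$ through $\gamma^\mu$. However, each individual component itself solves the Klein-Gordon equation (2.46). In order to see this, we compute
$$ 0 = (i \gamma^\mu \partial_\mu + m)(i \gamma^\nu \partial_\nu - m) \psi = -\left( \gamma^\mu \gamma^\nu \partial_\mu \partial_\nu + m^2 \right) \psi = -\left( \frac{1}{2} \{ \gamma^\mu, \gamma^\nu \} \partial_\mu \partial_\nu + m^2 \right) \psi = -\left( \partial_\mu \partial^\mu + m^2 \right) \psi . \tag{5.37} $$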
^39 Hereafter we will often drop the coordinate $x$ in $\psi(x)$ etc.
The final expression contains no $\gamma^\mu$ matrices, and so applies to each component $\psi^\alpha$ of the spinor field separately.
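In momentum space the same computation becomes a statement about matrices: acting on a plane wave $e^{-i p \cdot x}$ the operator $i \partial_\mu$ is replaced by $p_\mu$, so that $(\slashed{p} + m)(\slashed{p} - m) = (p^2 - m^2) \, 1$. The sketch below checks this in the Weyl representation; the mass, the momentum and the helper names are illustrative assumptions.

```python
import math

GAMMA = [
    [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]],
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],
    [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],
]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO = [[0] * 4 for _ in range(4)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def combine(terms):
    return [[sum(c * a[i][j] for c, a in terms) for j in range(4)]
            for i in range(4)]


def close(a, b, tol=1e-10):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


def slash(p):
    """gamma^mu p_mu = gamma^0 p^0 - gamma^i p^i for contravariant p = (p0, p1, p2, p3)."""
    return combine([(p[0], GAMMA[0]), (-p[1], GAMMA[1]),
                    (-p[2], GAMMA[2]), (-p[3], GAMMA[3])])


m = 1.3                        # hypothetical mass
momentum = (0.4, -0.7, 1.1)    # hypothetical three-momentum
energy = math.sqrt(m**2 + sum(k * k for k in momentum))  # on-shell energy

p_slash = slash((energy,) + momentum)
p_squared = energy**2 - sum(k * k for k in momentum)

# (p-slash)^2 is proportional to the unit matrix ...
square_is_scalar = close(matmul(p_slash, p_slash), combine([(p_squared, IDENTITY)]))
# ... so (p-slash + m)(p-slash - m) vanishes for on-shell momenta.
on_shell_product = matmul(combine([(1, p_slash), (m, IDENTITY)]),
                          combine([(1, p_slash), (-m, IDENTITY)]))
```

Because the product is proportional to the unit matrix, the on-shell condition $p^2 = m^2$ holds for every spinor component separately.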
Chiral Spinors
We have seen that in the chiral representation (5.10) both the spinor rotations (5.19) and boosts (5.22) are block diagonal. This means that the Dirac representation is reducible. It decomposes into two irreducible representations, acting only on two-component spinors $\psi_{L,R}$, which in the chiral representation are defined by
$$ \psi = \begin{pmatrix} \psi_L \\ \psi_R \end{pmatrix} . \tag{5.38} $$
The two-component objects $\psi_{L,R}$ are called chiral spinors and the labels $L$, $R$ stand for left- and right-handed chirality. They transform in the same way under rotations, but oppositely under boosts:
$$ \psi_{L,R} \to e^{i \vec{\varphi} \cdot \vec{\sigma}/2} \, \psi_{L,R} , \qquad \psi_{L,R} \to e^{\pm \vec{\chi} \cdot \vec{\sigma}/2} \, \psi_{L,R} . \tag{5.39} $$
In group theory language $\psi_L$ is in the $(1/2, 0)$ representation of the Lorentz group, $\psi_R$ is in the $(0, 1/2)$ representation, and $\psi$ belongs to $(1/2, 0) \oplus (0, 1/2)$.^40 Strictly speaking, the Dirac spinor is a representation of the double cover of $SO^+(1,3) \cong SL(2, \mathbb{C})/\mathbb{Z}_2$. Here $SO^+(1,3)$ denotes the proper, orthochronous or restricted Lorentz group, which consists of those Lorentz transformations that preserve the orientation of space and direction of time, while $SL(2, \mathbb{C})$ is the complex special linear group and $\mathbb{Z}_2$ is the two-element cyclic group. The fact that the Lorentz group is doubly connected is the source of the rotation-by-$4\pi$ property (5.21) of spinors.
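The relations (5.38) and (5.39) correspond to the chiral representation. But what happens if we choose a different representation $\gamma^\mu$ of the Clifford algebra, where the Lorentz group matrices $S(\Lambda)$ are not block diagonal? Is there an invariant way to define chiral spinors? We can do this by defining the fifth Dirac matrix
$$ \gamma^5 = i \gamma^0 \gamma^1 \gamma^2 \gamma^3 = -\frac{i}{4!} \, \epsilon_{\mu\nu\rho\sigma} \gamma^\mu \gamma^\nu \gamma^\rho \gamma^\sigma , \tag{5.40} $$
where $\epsilon_{\mu\nu\rho\sigma}$ is the totally antisymmetric Levi-Civita tensor with $\epsilon^{0123} = -\epsilon_{0123} = 1$. The $\gamma^5$ matrix has the following properties, all of which can be verified using (5.40) and the anticommutation relations (5.7):
$$ (\gamma^5)^\dagger = \gamma^5 , \qquad (\gamma^5)^2 = 1 , \qquad \{ \gamma^5, \gamma^\mu \} = 0 . \tag{5.41} $$
The reason that this Dirac matrix is called $\gamma^5$ is that the set of matrices $\gamma^M = (\gamma^\mu, i \gamma^5)$ satisfies the five-dimensional Clifford algebra, i.e., $\{ \gamma^M, \gamma^N \} = 2 \eta^{MN}$ where $M, N = 0, 1, 2, 3, 4$. It is also not difficult to check that
$$ [S^{\mu\nu}, \gamma^5] = 0 , \tag{5.42} $$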
^40 Using this terminology, scalars belong to the $(0, 0)$ representation, vectors are in the $(1/2, 1/2)$ representation, and the electromagnetic field-strength tensor transforms as $(1, 0) \oplus (0, 1)$ under the Lorentz group.
which means that $\gamma^5$ is a scalar^41 under rotations and boosts. The latter relation also tells us that eigenvectors of $\gamma^5$ whose eigenvalues are different transform without mixing, and as a result the Dirac representation must be reducible. This criterion for reducibility is Schur's lemma. It follows that
$$ P_{L,R} = \frac{1 \mp \gamma^5}{2} , \tag{5.43} $$
form Lorentz-invariant projection operators. They satisfy (please show this)
$$ P_{L,R}^2 = P_{L,R} , \qquad P_{L,R} P_{R,L} = 0 , \qquad P_L + P_R = 1 . \tag{5.44} $$
One can also check easily that for the Weyl representation (5.10), one has explicitly
$$ \gamma^5 = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} . \tag{5.45} $$
We see that $P_{L,R}$ project onto left- and right-handed spinors, i.e., $\psi_{L,R} = P_{L,R} \psi$.
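The properties of $\gamma^5 = i \gamma^0 \gamma^1 \gamma^2 \gamma^3$ and of the projectors $P_{L,R} = (1 \mp \gamma^5)/2$ lend themselves to a numerical check in the Weyl representation. The sketch below also evaluates the Levi-Civita form $-\frac{i}{4!} \epsilon_{\mu\nu\rho\sigma} \gamma^\mu \gamma^\nu \gamma^\rho \gamma^\sigma$ under the assumed convention $\epsilon_{0123} = -1$; all helper names are illustrative.

```python
from itertools import permutations

GAMMA = [
    [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]],
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],
    [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],
]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO = [[0] * 4 for _ in range(4)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def combine(terms):
    return [[sum(c * a[i][j] for c, a in terms) for j in range(4)]
            for i in range(4)]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


def product4(indices):
    """gamma^a gamma^b gamma^c gamma^d for an index tuple (a, b, c, d)."""
    a, b, c, d = indices
    return matmul(matmul(GAMMA[a], GAMMA[b]), matmul(GAMMA[c], GAMMA[d]))


def permutation_sign(perm):
    inversions = sum(1 for i in range(4) for j in range(i + 1, 4) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


GAMMA5 = combine([(1j, product4((0, 1, 2, 3)))])

# Levi-Civita form with eps_{0123} = -1, i.e. eps_{perm} = -sign(perm).
epsilon_sum = combine([(-permutation_sign(p), product4(p))
                       for p in permutations(range(4))])
gamma5_from_epsilon = combine([(-1j / 24, epsilon_sum)])

P_L = combine([(0.5, IDENTITY), (-0.5, GAMMA5)])
P_R = combine([(0.5, IDENTITY), (0.5, GAMMA5)])


def commutes_with_generators():
    """[S^{mu nu}, gamma^5] = 0 for S^{mu nu} = [gamma^mu, gamma^nu] / 4."""
    for mu in range(4):
        for nu in range(4):
            s = combine([(0.25, matmul(GAMMA[mu], GAMMA[nu])),
                         (-0.25, matmul(GAMMA[nu], GAMMA[mu]))])
            if not close(matmul(s, GAMMA5), matmul(GAMMA5, s)):
                return False
    return True
```

In this representation `GAMMA5` comes out diagonal with entries $(-1, -1, 1, 1)$, so `P_L` keeps the upper two components of a Dirac spinor and `P_R` the lower two.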
Weyl Equations
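The Dirac Lagrangian can be written in terms of the chiral fields (5.38) as
$$ \mathcal{L} = \bar{\psi} (i \slashed{\partial} - m) \psi = i \left( \bar{\psi}_L \slashed{\partial} \psi_L + \bar{\psi}_R \slashed{\partial} \psi_R \right) - m \left( \bar{\psi}_L \psi_R + \bar{\psi}_R \psi_L \right) , \tag{5.46} $$
where $\bar{\psi}_{L,R} = \bar{\psi} P_{R,L}$. After a slight change of notation,
$$ \sigma^\mu = (1, \vec{\sigma}) , \qquad \bar{\sigma}^\mu = (1, -\vec{\sigma}) , \tag{5.47} $$
and multiplying with $\bar{\psi} = \bar{\psi}_L + \bar{\psi}_R$ from the left the corresponding EOMs read
$$ i \left( \psi_L^\dagger \bar{\sigma}^\mu \partial_\mu \psi_L + \psi_R^\dagger \sigma^\mu \partial_\mu \psi_R \right) - m \left( \psi_L^\dagger \psi_R + \psi_R^\dagger \psi_L \right) = 0 . \tag{5.48} $$
We see that a massive fermion requires both components $\psi_L$ and $\psi_R$, since they are coupled via the mass term, which is chirality flipping. The kinetic term on the other hand is chirality conserving. This means that a massless fermion can be described by a single Weyl spinor $\psi_L$ or $\psi_R$ alone. The corresponding Euler-Lagrange equations go by the name of Weyl equations:
$$ i \bar{\sigma}^\mu \partial_\mu \psi_L = 0 , \qquad i \sigma^\mu \partial_\mu \psi_R = 0 . \tag{5.49} $$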
In many practical applications it is overwhelmingly convenient to employ two-component Weyl spinor notation, rather than the four-component Dirac spinors. This is due to the fact that the Lagrangian of the SM and essentially all of its extensions violate parity, i.e., the left- and right-handed fermionic components couple differently to the electroweak gauge group. If one uses four-component spinor notation, then there are a lot of clumsy left- and right-handed projection operators. This is not the case if one employs the two-component Weyl fermion notation, which treats fermionic dofs with different gauge quantum numbers separately from the start, as nature intended for us to do. Plenty of details on and many useful techniques to deal with Weyl fermions can be found in [1], which I highly recommend for further reading.
^41 In fact, we will see soon that it is a pseudo-scalar and not a scalar.
Dofs Counting
At this point a couple of comments about the counting of dofs seem to be in order. In classical mechanics, the number of dofs of a system is equal to the dimension of the configuration space or, equivalently, half the dimension of the phase space. In field theory we have an infinite number of dofs, but it makes sense to count the number of dofs per spatial point, which at least should be finite. E.g., in this sense a real scalar field $\phi$ has a single dof. At the quantum level, this translates to the fact that it gives rise to a single type of particle. A classical complex scalar field, on the other hand, has two dofs, corresponding to the particle and its antiparticle in the QFT.
But what about a Dirac spinor $\psi$? One might think that there are eight dofs, since $\psi$ has four complex components. But this is wrong! Crucially, and in contrast to the scalar field, the EOM of $\psi$ is first order rather than second order. In particular, for the Dirac theory, the momentum conjugate to the spinor $\psi$ is given by
$$ \pi = \frac{\partial \mathcal{L}}{\partial \dot{\psi}} = i \psi^\dagger , \tag{5.50} $$
which is not proportional to the time derivative of $\psi$. The phase space of a spinor is hence parameterized by $\psi$ and $\psi^\dagger$, while for a scalar it is parameterized by $\phi$ and $\pi = \dot{\phi}$. So the phase space of the Dirac spinor $\psi$ has eight real dimensions and correspondingly the number of real dofs is four. We will learn soon that, in the QFT, this counting manifests itself as two dofs (i.e., spin up and down) for the particle, and another two for the antiparticle. A similar counting for the Weyl fermion tells us that it has two dofs.
Majorana Fermions
Our spinor $\psi$ is a complex object. It has to be, since the representation $S(\Lambda)$ is typically also complex. This means that if we were to try to make $\psi$ real, e.g., by imposing $\psi = \psi^*$, then it would not stay real once we make a Lorentz transformation. However, there is a way to impose a reality condition on $\psi$. In order to motivate this possibility, it's simplest to look at a novel basis for the Clifford algebra (5.7), known as the Majorana basis
$$ \gamma^0 = \begin{pmatrix} 0 & \sigma^2 \\ \sigma^2 & 0 \end{pmatrix} , \quad \gamma^1 = \begin{pmatrix} i \sigma^3 & 0 \\ 0 & i \sigma^3 \end{pmatrix} , \quad \gamma^2 = \begin{pmatrix} 0 & -\sigma^2 \\ \sigma^2 & 0 \end{pmatrix} , \quad \gamma^3 = \begin{pmatrix} -i \sigma^1 & 0 \\ 0 & -i \sigma^1 \end{pmatrix} . \tag{5.51} $$
What is special about these matrices is that they are all pure imaginary, i.e., $(\gamma^\mu)^* = -\gamma^\mu$. This implies that the generators (5.12), and hence the full Lorentz transformations (5.18), are real. In the specific basis (5.51), we can therefore work with a real spinor simply by imposing the condition
$$ \psi = \psi^* , \tag{5.52} $$
which is preserved under Lorentz transformations. Such spinors are called Majorana spinors.
Can this procedure be generalized to an arbitrary basis of Dirac matrices? We only ask that the basis satisfies (5.24). We then define the charge conjugate of a Dirac spinor $\psi$ as
$$ \psi^c = C \psi^* , \tag{5.53} $$
where $C$ is a $4 \times 4$ matrix obeying
$$ C^\dagger C = 1 , \qquad C^\dagger \gamma^\mu C = -(\gamma^\mu)^* . \tag{5.54} $$
The first relation tells us that charge conjugation can be described by a unitary operator. Let us first check that (5.53) is a sensible definition, meaning that $\psi^c$ transforms nicely under Lorentz transformations. One has
$$ \psi^c(x) \to C S^*(\Lambda) \psi^*(x') = S(\Lambda) C \psi^*(x') = S(\Lambda) \psi^c(x') . \tag{5.55} $$
Here we made use of the properties (5.54) to commute the matrix $C$ past $S^*(\Lambda)$ to the right. Comparing the latter result to (5.17), we see that $\psi$ and $\psi^c$ transform in the same way under the Lorentz group. In fact, not only does $\psi^c$ transform nicely under rotations and boosts, but it satisfies the Dirac equation, if $\psi$ does. This follows from
$$ (i \slashed{\partial} - m) \psi = 0 \ \Longrightarrow \ (-i \slashed{\partial}^* - m) \psi^* = 0 \ \Longrightarrow \ C (-i \slashed{\partial}^* - m) \psi^* = (i \slashed{\partial} - m) \psi^c = 0 , \tag{5.56} $$
where $\slashed{\partial}^* = (\gamma^\mu)^* \partial_\mu$ and we have again employed (5.54). Finally, we can now impose the Lorentz-invariant reality condition on the Dirac spinor, to yield a Majorana spinor,
$$ \psi = \psi^c . \tag{5.57} $$
After quantization, the Majorana spinor gives rise to a Majorana fermion that is its own antiparticle. This is exactly the same as in the case of scalar fields, where we have seen that a real scalar field gives rise to a spin-zero boson that is its own antiparticle.
So what does the matrix $C$ look like? This, of course, depends a lot on the basis. In the Majorana basis (5.51), where all the Dirac matrices are purely imaginary, one simply has $C = 1$, and in consequence the condition (5.57) turns into (5.52). In the chiral basis (5.10), on the other hand, only $\gamma^2$ is imaginary, and we may take^42
$$ C = i \gamma^2 . \tag{5.58} $$
It is also interesting to see how the Majorana condition (5.57) looks in terms of the decomposition into left- and right-handed Weyl spinors. Plugging in the various definitions, we find $\psi_R = -i \sigma^2 \psi_L^*$ and $\psi_L = i \sigma^2 \psi_R^*$. In other words, a Majorana spinor can be written in terms of chiral spinors as
$$ \psi = \begin{pmatrix} i \sigma^2 \psi_R^* \\ \psi_R \end{pmatrix} . \tag{5.59} $$
Notice that it is not possible to impose the Majorana condition, $\psi = \psi^c$, at the same time as the Weyl condition, $\psi_L = 0$ or $\psi_R = 0$. Instead the Majorana condition relates left- and right-handed spinors via (5.59). In an exercise you will learn more about Majorana fermions. So let's move on.
^42 Be aware, in many texts an extra factor of $\gamma^0$ is absorbed into the definition of $C$.
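Both statements about charge conjugation can be tested numerically: that the Majorana basis consists of purely imaginary matrices obeying the Clifford algebra, and that $C = i \gamma^2$ in the chiral basis satisfies $C^\dagger C = 1$ and $C^\dagger \gamma^\mu C = -(\gamma^\mu)^*$. The explicit $4 \times 4$ matrices below are written out under the assumption that the Majorana basis reads $\gamma^0 = \begin{pmatrix} 0 & \sigma^2 \\ \sigma^2 & 0 \end{pmatrix}$, $\gamma^1 = \begin{pmatrix} i\sigma^3 & 0 \\ 0 & i\sigma^3 \end{pmatrix}$, $\gamma^2 = \begin{pmatrix} 0 & -\sigma^2 \\ \sigma^2 & 0 \end{pmatrix}$, $\gamma^3 = \begin{pmatrix} -i\sigma^1 & 0 \\ 0 & -i\sigma^1 \end{pmatrix}$; helper names are illustrative.

```python
CHIRAL = [
    [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]],
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],
    [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],
]
MAJORANA = [
    [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, -1j, 0, 0], [1j, 0, 0, 0]],
    [[1j, 0, 0, 0], [0, -1j, 0, 0], [0, 0, 1j, 0], [0, 0, 0, -1j]],
    [[0, 0, 0, 1j], [0, 0, -1j, 0], [0, -1j, 0, 0], [1j, 0, 0, 0]],
    [[0, -1j, 0, 0], [-1j, 0, 0, 0], [0, 0, 0, -1j], [0, 0, -1j, 0]],
]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ETA = [1, -1, -1, -1]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def combine(terms):
    return [[sum(c * a[i][j] for c, a in terms) for j in range(4)]
            for i in range(4)]


def conjugate(a):
    return [[complex(x).conjugate() for x in row] for row in a]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


def satisfies_clifford(gammas):
    return all(
        close(combine([(1, matmul(gammas[m], gammas[n])), (1, matmul(gammas[n], gammas[m]))]),
              combine([(2 * ETA[m] * (m == n), IDENTITY)]))
        for m in range(4) for n in range(4))


def purely_imaginary(a):
    return all(complex(x).real == 0 for row in a for x in row)


majorana_is_clifford = satisfies_clifford(MAJORANA)
majorana_is_imaginary = all(purely_imaginary(g) for g in MAJORANA)

# Charge conjugation matrix in the chiral basis.
C = combine([(1j, CHIRAL[2])])
c_is_unitary = close(matmul(dagger(C), C), IDENTITY)
c_conjugates_gammas = all(
    close(matmul(matmul(dagger(C), g), C), combine([(-1, conjugate(g))]))
    for g in CHIRAL)
```

In the Majorana basis the choice $C = 1$ passes the same test, because there $-(\gamma^\mu)^* = \gamma^\mu$.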
5.2 Discrete Symmetries of Dirac Theory
In addition to the continuous Lorentz transformations we have considered so far, there are two other space-time operations that are potential symmetries of any QFT, namely parity and time reversal. Parity, denoted by $P$, sends
$$ x = (t, \vec{x}) \xrightarrow{P} (t, -\vec{x}) = x_P , \tag{5.60} $$
reversing the handedness of space. Time reversal, denoted by $T$, sends
$$ x = (t, \vec{x}) \xrightarrow{T} (-t, \vec{x}) = x_T , \tag{5.61} $$
interchanging the forward and backward light-cone. Since parity has an important role to play in the SM and, in particular, the theory of the electroweak interactions, let's first have a look at the action of $P$ on spinors and bi-linears constructed from them.
Parity
In order to understand what happens to a spinor under parity, we consider how rotations and boosts act on Weyl spinors. In the chiral representation, the corresponding transformation properties have already been spelled out in (5.39). We also know that under parity rotations do not flip sign, while boosts do, since $P$ acting on a particle should reverse its momentum, but not its spin. This tells us that parity exchanges right- and left-handed spinors,
$$ \psi_{L,R}(x) \xrightarrow{P} \psi_{R,L}(x_P) . \tag{5.62} $$
Using this knowledge and the fact that changing the parity twice is the identity, i.e., $P^2 = 1$, we see that the action of parity on $\psi$ can be described in the Weyl basis by
$$ P = \gamma^0 . \tag{5.63} $$
This $4 \times 4$ matrix satisfies
$$ P^\dagger P = 1 , \qquad P^\dagger \gamma^\mu P = (\gamma^\mu)^\dagger , \tag{5.64} $$
so also parity can be implemented by a unitary operator. Our spinor $\psi$ transforms under $P$ as
$$ \psi(x) \xrightarrow{P} P \psi(x_P) . \tag{5.65} $$
Notice that if $\psi(x)$ satisfies the Dirac equation (5.35), so does the parity-transformed spinor $P \psi(x_P)$, since one has
$$ (i \gamma^0 \partial_t + i \gamma^i \partial_i - m) P \psi(t, -\vec{x}) = P (i \gamma^0 \partial_t - i \gamma^i \partial_i - m) \psi(t, -\vec{x}) = 0 . \tag{5.66} $$
Here the extra minus sign from passing $P$ through $\gamma^i$ is compensated by the derivative acting on $-\vec{x}$ instead of $\vec{x}$.
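The matrix identities behind the parity transformation can be confirmed numerically in the Weyl representation: $P = \gamma^0$ is unitary, conjugation with $P$ flips the sign of the spatial $\gamma^i$, and it swaps the chirality projectors $(1 \mp \gamma^5)/2$. The snippet is a sketch with illustrative helper names.

```python
GAMMA = [
    [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]],
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],
    [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],
]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
SIGN = [1, -1, -1, -1]  # the factor (-1)^mu: +1 for mu = 0, -1 for mu = 1, 2, 3


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def combine(terms):
    return [[sum(c * a[i][j] for c, a in terms) for j in range(4)]
            for i in range(4)]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


def conjugate_with(p, matrix):
    """Return P^dagger M P."""
    return matmul(matmul(dagger(p), matrix), p)


P = GAMMA[0]
GAMMA5 = combine([(1j, matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3])))])
P_L = combine([(0.5, IDENTITY), (-0.5, GAMMA5)])
P_R = combine([(0.5, IDENTITY), (0.5, GAMMA5)])

parity_is_unitary = close(conjugate_with(P, IDENTITY), IDENTITY)
gives_hermitian_conjugate = all(close(conjugate_with(P, g), dagger(g)) for g in GAMMA)
flips_spatial_gammas = all(
    close(conjugate_with(P, GAMMA[mu]), combine([(SIGN[mu], GAMMA[mu])]))
    for mu in range(4))
swaps_chirality = close(conjugate_with(P, P_L), P_R) and close(conjugate_with(P, P_R), P_L)
```

The last check is the matrix version of the statement that parity exchanges left- and right-handed spinors.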
Let me now consider how the covariant interaction terms we have constructed before transform under $P$. We start with $\bar{\psi} \psi$. Obviously, one has
$$ \bar{\psi}(x) \psi(x) \xrightarrow{P} \bar{\psi}(x_P) \psi(x_P) , \tag{5.67} $$
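given that $(\gamma^0)^2 = 1$ and $(\gamma^0)^\dagger = \gamma^0$. This is the transformation of a scalar. In the case of the vector $\bar{\psi} \gamma^\mu \psi$, we find instead
$$ \bar{\psi}(x) \gamma^\mu \psi(x) \xrightarrow{P} (-1)^\mu \, \bar{\psi}(x_P) \gamma^\mu \psi(x_P) , \tag{5.68} $$
where $(-1)^\mu = 1$ for $\mu = 0$ and $(-1)^\mu = -1$ for $\mu = 1, 2, 3$. Notice that the factor $(-1)^\mu$ arises from the combination of (5.24) and (5.27). The latter transformation property tells us that $\bar{\psi} \gamma^\mu \psi$ transforms as a vector, with the spatial part changing sign. You can also check easily that $\bar{\psi} \sigma^{\mu\nu} \psi$ transforms as a tensor, namely
$$ \bar{\psi}(x) \sigma^{\mu\nu} \psi(x) \xrightarrow{P} (-1)^\mu (-1)^\nu \, \bar{\psi}(x_P) \sigma^{\mu\nu} \psi(x_P) . \tag{5.69} $$
Using $\gamma^5$, we can form two more Lorentz-covariant objects, i.e., $\bar{\psi} \gamma^5 \psi$ and $\bar{\psi} \gamma^\mu \gamma^5 \psi$. How do these transform under parity? In the first case, we obtain
$$ \bar{\psi}(x) \gamma^5 \psi(x) \xrightarrow{P} -\bar{\psi}(x_P) \gamma^5 \psi(x_P) , \tag{5.70} $$
where we have used the last relation in (5.41) and $(\gamma^0)^2 = 1$. In the second case, a straightforward calculation gives
$$ \bar{\psi}(x) \gamma^\mu \gamma^5 \psi(x) \xrightarrow{P} -(-1)^\mu \, \bar{\psi}(x_P) \gamma^\mu \gamma^5 \psi(x_P) . \tag{5.71} $$
The minus signs in (5.70) and (5.71) earn the objects $\bar{\psi} \gamma^5 \psi$ and $\bar{\psi} \gamma^\mu \gamma^5 \psi$ the names pseudo-scalar and pseudo-vector (or axial-vector). To summarize, we have the following spinor bi-linears,
$$ \begin{aligned} \bar{\psi} \psi &: \ \text{scalar} , \\ \bar{\psi} \gamma^\mu \psi &: \ \text{vector} , \\ \bar{\psi} \sigma^{\mu\nu} \psi &: \ \text{tensor} , \\ \bar{\psi} \gamma^5 \psi &: \ \text{pseudo-scalar} , \\ \bar{\psi} \gamma^\mu \gamma^5 \psi &: \ \text{pseudo-vector} . \end{aligned} \tag{5.72} $$
The total number of bi-linears is $(1 + 4 + (4 \cdot 3)/2 + 4 + 1) = 16$, which is all we could hope for from a 4-component object.
We are now equipped with new terms involving $\gamma^5$ that we can start to add to our Lagrangian to construct new theories. Typically such terms will break parity invariance of the theory, although this is not always true. E.g., the term $\phi \bar{\psi} \gamma^5 \psi$ does not break parity if $\phi$ is itself a pseudo-scalar. Nature makes use of these parity-violating interactions by using $\gamma^5$ in the electroweak force. A theory which treats $\psi_{L,R}$ on an equal footing is called a vector-like theory. In contrast, a theory in which $\psi_{L,R}$ appear differently is called a chiral theory.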
Time Reversal
Another obvious question that we should address is how our building blocks in (5.72) transform under $T$. In order to answer this question, we first have to understand how the time-reversal symmetry is correctly implemented in a QM context. The implementation turns out to be more subtle than in the case of $C$ and $P$, since the relevant operator is in the case of $T$ not unitary but anti-unitary, i.e., $T^\dagger = T^{-1}$ with $\langle T \psi | T \psi' \rangle = \langle \psi | \psi' \rangle^*$, where $|\psi\rangle$ and $|\psi'\rangle$ denote arbitrary multiparticle quantum states. A straightforward way to realize that the operator implementing $T$ must be anti-unitary is to consider the behavior of the Schrödinger equation for a free particle under time reversal. In classical mechanics, a free particle has a time-reversal invariant motion, and it is reasonable that we would like to retain this property in QM as well. But the operator $\partial_t$ is $T$-odd while $\nabla^2$ is $T$-even. This is impossible to reconcile with the Schrödinger equation unless time reversal changes $i \to -i$ and $\psi \to \psi^*$. The operator thus has to be anti-unitary. Note that the anti-unitarity of $T$ implies that it does not have meaningful eigenvalues, contrary to what happens in the case of $C$ and $P$. As there is no quantum number associated with time reversal, no conservation law exists when the action is invariant under time reversal. Just as for parity, we define the time-reversal transformation in QFT by its action on states. We require that $T$ should reverse the particle momentum and its spin.
It is not too difficult to figure out that the transformation of the Dirac spinor under time reversal involves in the chiral basis the matrix
$$ T = i \gamma^1 \gamma^3 , \tag{5.73} $$
which satisfies
$$ T^\dagger T = 1 , \qquad T^\dagger \gamma^\mu T = (-1)^\mu (\gamma^\mu)^* . \tag{5.74} $$
The transformation itself takes the following form
$$ \psi(x) \xrightarrow{T} T \psi^*(x_T) . \tag{5.75} $$
The first thing to notice is that if $\psi(x)$ obeys the Dirac equation, the same is true for $T \psi^*(x_T)$. This follows, because
$$ (i \gamma^0 \partial_t + i \gamma^i \partial_i - m) T \psi^*(-t, \vec{x}) = T \left( i (\gamma^0)^* \partial_t - i (\gamma^i)^* \partial_i - m \right) \psi^*(-t, \vec{x}) = 0 . \tag{5.76} $$
Notice that the minus sign between the $(\gamma^0)^*$ and $(\gamma^i)^*$ terms is compensated by the derivative $\partial_t$ acting on $-t$ rather than $t$, and that the final result follows after complex conjugation, which sends $i \to -i$.
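The algebraic input of this argument, namely that $T = i \gamma^1 \gamma^3$ is unitary and satisfies $T^\dagger \gamma^\mu T = (-1)^\mu (\gamma^\mu)^*$ in the chiral basis, can be verified numerically. The snippet is a sketch with illustrative helper names; $(-1)^\mu$ denotes $+1$ for $\mu = 0$ and $-1$ for $\mu = 1, 2, 3$.

```python
GAMMA = [
    [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]],
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],
    [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],
]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
SIGN = [1, -1, -1, -1]  # the factor (-1)^mu


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def combine(terms):
    return [[sum(c * a[i][j] for c, a in terms) for j in range(4)]
            for i in range(4)]


def conjugate(a):
    return [[complex(x).conjugate() for x in row] for row in a]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


T = combine([(1j, matmul(GAMMA[1], GAMMA[3]))])

time_reversal_is_unitary = close(matmul(dagger(T), T), IDENTITY)
conjugates_gammas = all(
    close(matmul(matmul(dagger(T), GAMMA[mu]), T),
          combine([(SIGN[mu], conjugate(GAMMA[mu]))]))
    for mu in range(4))
```

Since only $\gamma^2$ is imaginary in the chiral basis, the check says that $T$ commutes with $\gamma^0$ and $\gamma^2$ and anticommutes with $\gamma^1$ and $\gamma^3$.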
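We are now ready to consider the transformation properties of the building blocks (5.72). I simply quote the results without proof, leaving the actual derivations to you as a useful exercise. For the scalar $\bar{\psi} \psi$ one finds that
$$ \bar{\psi}(x) \psi(x) \xrightarrow{T} \bar{\psi}(x_T) \psi(x_T) . \tag{5.77} $$
In the case of the vector $\bar{\psi} \gamma^\mu \psi$, one has instead
$$ \bar{\psi}(x) \gamma^\mu \psi(x) \xrightarrow{T} (-1)^\mu \, \bar{\psi}(x_T) \gamma^\mu \psi(x_T) . \tag{5.78} $$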
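This is exactly the transformation property we want for vectors, since it leaves $\bar{\psi} \slashed{\partial} \psi$ and $\bar{\psi} \slashed{A} \psi$ invariant under time reversal. Notice that the minus sign appearing for the space-components in (5.78) is cancelled by those appearing in the transformation of the derivative $\partial_\mu$ and the electromagnetic field $A_\mu$, respectively. One furthermore shows that the tensor $\bar{\psi} \sigma^{\mu\nu} \psi$ behaves like
$$ \bar{\psi}(x) \sigma^{\mu\nu} \psi(x) \xrightarrow{T} -(-1)^\mu (-1)^\nu \, \bar{\psi}(x_T) \sigma^{\mu\nu} \psi(x_T) , \tag{5.79} $$
under time reversal. We finally want to know the transformation properties of the covariants involving $\gamma^5$. For the pseudo-scalar $\bar{\psi} \gamma^5 \psi$, one obtains
$$ \bar{\psi}(x) \gamma^5 \psi(x) \xrightarrow{T} -\bar{\psi}(x_T) \gamma^5 \psi(x_T) , \tag{5.80} $$
while the action of $T$ on the pseudo-vector $\bar{\psi} \gamma^\mu \gamma^5 \psi$ is given by
$$ \bar{\psi}(x) \gamma^\mu \gamma^5 \psi(x) \xrightarrow{T} (-1)^\mu \, \bar{\psi}(x_T) \gamma^\mu \gamma^5 \psi(x_T) . \tag{5.81} $$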
Charge Conjugation
The last of the three discrete symmetries is the particle-antiparticle symmetry C, which we
already met at the end of Section 5.1 when discussing the properties of Majorana fermions.
In physical terms, charge conjugation is conventionally defined to take a fermion with a given
spin orientation into an antifermion with the same spin orientation. As we have seen in (5.56),
this transformation is a symmetry of the Dirac equation.
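Once again we want to know how C acts on fermion bi-linears. I again quote the relevant
results without giving the details of their derivation. The scalar $\bar\psi\psi$ transforms under C as
$$ \bar\psi(x)\psi(x) \;\xrightarrow{C}\; +\bar\psi(x)\psi(x) . \tag{5.82} $$
In the case of the vector $\bar\psi\gamma^\mu\psi$, one has instead
$$ \bar\psi(x)\gamma^\mu\psi(x) \;\xrightarrow{C}\; -\bar\psi(x)\gamma^\mu\psi(x) . \tag{5.83} $$
Under C the tensor $\bar\psi\sigma^{\mu\nu}\psi$ behaves like
$$ \bar\psi(x)\sigma^{\mu\nu}\psi(x) \;\xrightarrow{C}\; -\bar\psi(x)\sigma^{\mu\nu}\psi(x) . \tag{5.84} $$
For the pseudo-scalar $\bar\psi\gamma^5\psi$, one arrives at
$$ \bar\psi(x)\gamma^5\psi(x) \;\xrightarrow{C}\; +\bar\psi(x)\gamma^5\psi(x) , \tag{5.85} $$
while the action of C on the pseudo-vector $\bar\psi\gamma^\mu\gamma^5\psi$ reads
$$ \bar\psi(x)\gamma^\mu\gamma^5\psi(x) \;\xrightarrow{C}\; +\bar\psi(x)\gamma^\mu\gamma^5\psi(x) . \tag{5.86} $$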
CP and CPT Symmetry
We saw that the free Dirac equation (5.35) is invariant under P, T, and C separately. Yet,
we can build more general QFTs that violate any of these discrete symmetries by adding
appropriate perturbations to the Dirac Lagrangian. These additional terms must transform as a
Lorentz scalar. The various fermionic bi-linears that can be used to construct such terms are
shown in Table 1. The last line of this table tells us that all Lorentz-scalar combinations of
$\bar\psi$ and $\psi$ are invariant under the combined symmetry CPT. Actually, it is quite generally true
that one cannot build a Lorentz-invariant QFT with a hermitian Hamiltonian that violates
CPT. More precisely, one can prove the following three statements [2]: first, an interacting
theory that violates CPT invariance necessarily violates Lorentz invariance, second, CPT
invariance is not sufficient for out-of-cone Lorentz invariance, and third, theories that violate
CPT by having different particle and antiparticle masses must be non-local. This implies that
any study of CPT violation also includes Lorentz violation. Several experimental searches for
such violations have been performed during the last few years. A detailed list of the results of
these experimental searches is given in [3]. So far no evidence for either CPT or
Lorentz violation has been found. The consequences of CPT invariance are far-reaching.
The most celebrated ones are the equality of the masses and of the total decay widths (or lifetimes) of
particles and antiparticles. Both statements are easy to prove. Try it!
What about the other discrete symmetries in nature? Are they conserved? Although P is
conserved in electromagnetism, strong interactions, and gravity, it turns out to be violated in
electroweak interactions. The SM incorporates parity violation by expressing the electroweak
interaction as a chiral gauge interaction. Only the left-handed components of particles and
right-handed components of antiparticles participate in the electroweak interactions in the SM.
This implies that P is not a symmetry of our universe, unless a hidden mirror sector exists
in which parity is violated in the opposite way (a left-right symmetry). It was suggested
several times and in different contexts that parity might not be conserved, but in the absence
of compelling evidence these suggestions were not taken seriously. A careful review by Tsung
Dao Lee and Chen Ning Yang [4] showed that while P conservation had been verified in decays
by the strong or electromagnetic interactions, it was untested in the electroweak interaction.
They proposed several possible direct experimental tests. These proposals were almost ignored, but Lee
was able to convince his colleague Chien-Shiung Wu to look for P violation. In 1957, Wu's
group conducted an ingenious experiment showing that in the case of the $\beta$-decay of $^{60}$Co,
nature knows left from right [5]. The discovery of P violation immediately explained the
outstanding puzzle related to the decay of charged kaons.
So P is broken in nature, what about CP then? The first thing to notice in this respect is
that a symmetry of a QM system can be restored if another symmetry can be found such that
the combined symmetry remains unbroken. This rather subtle point about the structure of
Hilbert space was realized shortly after the discovery of P violation, and it was proposed that
charge conjugation was the desired symmetry to restore order. As a result, the CP symmetry
was proposed in 1957 by Lev Landau as the true symmetry between matter and antimatter.
In other words, a process in which all particles are exchanged with their antiparticles was
assumed to be equivalent to the mirror image of the original process. The discovery of CP
violation in 1964 in the decays of neutral kaons [6], which resulted in the Nobel Prize in Physics
in 1980 for its discoverers James Cronin and Val Fitch, shocked particle physics and opened
the door to questions still at the core of particle physics and cosmology today. CP violation is
incorporated in the SM by including a complex phase in the matrix describing quark mixing.
In such a scheme a necessary condition for CP violation can then be shown to be the presence
of at least three generations of quarks. This possibility was suggested by Makoto Kobayashi
and Toshihide Maskawa in a seminal paper [7] in 1973, which earned them one half of the
Nobel Prize in Physics in 2008.

$$
\begin{array}{l|ccccc|ccc}
\text{Symmetry} & \bar\psi\psi & \bar\psi\gamma^\mu\psi & \bar\psi\sigma^{\mu\nu}\psi & \bar\psi\gamma^5\psi & \bar\psi\gamma^\mu\gamma^5\psi & \partial_\mu & A_\mu & F_{\mu\nu} \\
\hline
P & +1 & (-1)^\mu & (-1)^\mu(-1)^\nu & -1 & -(-1)^\mu & (-1)^\mu & (-1)^\mu & (-1)^\mu(-1)^\nu \\
T & +1 & (-1)^\mu & -(-1)^\mu(-1)^\nu & -1 & (-1)^\mu & -(-1)^\mu & (-1)^\mu & -(-1)^\mu(-1)^\nu \\
C & +1 & -1 & -1 & +1 & +1 & +1 & -1 & -1 \\
CP & +1 & -(-1)^\mu & -(-1)^\mu(-1)^\nu & -1 & -(-1)^\mu & (-1)^\mu & -(-1)^\mu & -(-1)^\mu(-1)^\nu \\
CPT & +1 & -1 & +1 & +1 & -1 & -1 & -1 & +1
\end{array}
$$
Table 1: Transformation properties of fermion bi-linears as well as $\partial_\mu$, $A_\mu$, and $F_{\mu\nu}$
under the discrete P, T, and C symmetries and the combinations CP and CPT.
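The internal consistency of the rows of Table 1 can be checked mechanically. The sketch below is illustrative only: it assumes the standard sign assignments for P, T, and C (in particular $A_\mu \to -A_\mu$ under C), and it encodes every entry as a hypothetical pair `(sign, k)`, where `k = 1` marks the presence of the index-dependent factor $(-1)^\mu$ or $(-1)^\mu(-1)^\nu$. Since that factor squares to one, combining symmetries multiplies the signs and adds the `k` values modulo 2.

```python
# Columns of Table 1 (labels are illustrative)
columns = ["scalar", "vector", "tensor", "pseudo-scalar",
           "pseudo-vector", "d_mu", "A_mu", "F_munu"]

# Each entry is (sign, k): overall sign and whether the index-dependent
# factor (-1)^mu, or (-1)^mu (-1)^nu for two-index objects, is present.
P = [(+1, 0), (+1, 1), (+1, 1), (-1, 0), (-1, 1), (+1, 1), (+1, 1), (+1, 1)]
T = [(+1, 0), (+1, 1), (-1, 1), (-1, 0), (+1, 1), (-1, 1), (+1, 1), (-1, 1)]
C = [(+1, 0), (-1, 0), (-1, 0), (+1, 0), (+1, 0), (+1, 0), (-1, 0), (-1, 0)]


def combine(*rows):
    """Multiply transformation factors column by column."""
    combined = []
    for entries in zip(*rows):
        sign, k = 1, 0
        for s, kk in entries:
            sign *= s
            k = (k + kk) % 2  # the index-dependent factor squares to 1
        combined.append((sign, k))
    return combined


CP = combine(C, P)
CPT = combine(C, P, T)

for name, entry in zip(columns, CPT):
    print(f"{name:14s} CPT sign = {entry[0]:+d}, index factor present = {bool(entry[1])}")
```

Under these assumptions every CPT entry is a pure sign without any index-dependent factor, so a Lorentz scalar in which all indices are contracted picks up an overall factor of $+1$.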
The past decade has seen tremendous progress in the study of CP violation. In particular,
the so-called B factories (BaBar and Belle) have collected and analyzed an impressive amount
of experimental data that led to the confirmation of the Kobayashi-Maskawa (KM) mechanism
of CP violation. Yet, the dynamical origin of CP violation remains a mystery
that awaits unraveling. Another unsolved theoretical question in this context is why
the universe is made entirely of matter, rather than consisting of equal parts of matter and
antimatter. It can be demonstrated that, to create an imbalance in matter and antimatter
from an initial condition of balance, three necessary conditions [8] must be satisfied, one of
which is the existence of CP violation. The other two are baryon-number violation and the
presence of interactions out of thermal equilibrium. These conditions were first formulated
in 1967 by Andrei Sakharov. The SM contains only two sources that can break the CP
symmetry. The first of these involves the aforementioned KM phase, but can account for
only a small portion of the needed CP violation. The second of these resides in the quantum
chromodynamics (QCD) Lagrangian and goes by the name of the $\theta$ parameter. It has not been
found experimentally. The fact that one would expect the $\theta$ parameter to lead to either no
CP violation or CP violation that is far too large is the essence of the strong CP problem [9]. There
are several proposed solutions to this problem. The most well-known is based on an
idea originally due to Roberto Peccei and Helen Quinn [10], involving new scalar particles called
axions. Oops! Looks like I am getting carried away here. Let's get focused and return to the
discussion of the Dirac theory.
5.3 Continuous Symmetries of Dirac Theory
Besides the discrete P, T, and C symmetries, the Dirac action (5.34) enjoys a number of con-
tinuous symmetries. In the following we will discuss space-time translations, Lorentz transfor-
mations, the internal vector and axial-vector symmetry, and compute the associated conserved
currents.
Space-Time Translations
Under infinitesimal space-time translations (2.18), the Dirac spinor transforms as
$$ \delta\psi = \epsilon^\mu\partial_\mu\psi . \tag{5.87} $$
Given that the Dirac Lagrangian depends on $\partial_\mu\psi$ but not on $\partial_\mu\bar\psi$, we can use the standard
formula (2.20) to obtain the energy-momentum tensor
$$ T^{\mu\nu} = i\bar\psi\gamma^\mu\partial^\nu\psi - \eta^{\mu\nu}\mathcal{L} . \tag{5.88} $$
Since a current is conserved only when the EOMs are obeyed, we do not lose anything by
imposing the Euler-Lagrange equation already on $T^{\mu\nu}$. In the case of a scalar field this does
not really buy us anything, because the EOMs are second order in derivatives, while the
energy-momentum tensor is first order. However, for a spinor field the EOMs are first order
(5.35). This means that we can ignore the second term in (5.88), leaving us with
$$ T^{\mu\nu} = i\bar\psi\gamma^\mu\partial^\nu\psi . \tag{5.89} $$
It follows that the total energy is given by
$$ E = \int d^3x\, T^{00} = \int d^3x\, i\bar\psi\gamma^0\dot\psi = \int d^3x\, \psi^\dagger\gamma^0\big(-i\gamma^i\partial_i + m\big)\psi , \tag{5.90} $$
where in order to obtain the final expression we have employed the Dirac equation. The
components of the total momentum are given by
$$ P^i = \int d^3x\, T^{0i} = \int d^3x\, i\psi^\dagger\partial^i\psi . \tag{5.91} $$
Both the total energy and momentum are of course conserved.
Lorentz Transformations
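Under a Lorentz transformation, the Dirac spinor transforms as (5.17) which, in infinitesimal
form, reads
$$ \delta\psi^\alpha = -\omega^\mu{}_\nu\, x^\nu\partial_\mu\psi^\alpha + \frac{1}{2}\,\Omega_{\rho\sigma}\,(S^{\rho\sigma})^\alpha{}_\beta\,\psi^\beta . \tag{5.92} $$
From (5.6) it follows that $\omega^\mu{}_\nu = \frac{1}{2}\,\Omega_{\rho\sigma}(\mathcal{M}^{\rho\sigma})^\mu{}_\nu$, where the generators of the Lorentz group
$(\mathcal{M}^{\rho\sigma})^\mu{}_\nu$ take the form (5.2). After direct substitution, this tells us that $\omega^{\mu\nu} = \Omega^{\mu\nu}$, and as a
result (5.92) becomes
$$ \delta\psi^\alpha = \omega^{\mu\nu}\Big[ -x_\nu\partial_\mu\psi^\alpha + \frac{1}{2}\,(S_{\mu\nu})^\alpha{}_\beta\,\psi^\beta \Big] . \tag{5.93} $$
The conserved current arising from Lorentz transformations now follows from the same
calculation we saw for the scalar field (2.67). Yet, there are two small differences. First, we are
allowed to neglect terms proportional to $\mathcal{L}$ in the computation and, second, we pick up an
extra piece in the current from the second term in (5.92). At the end one has
$$ (\mathcal{J}^\mu)^{\rho\sigma} = x^\rho T^{\mu\sigma} - x^\sigma T^{\mu\rho} - i\bar\psi\gamma^\mu S^{\rho\sigma}\psi . \tag{5.94} $$
After quantization, when $(\mathcal{J}^\mu)^{\rho\sigma}$ is turned into an operator, this extra term will be responsible
for providing the single-particle states with internal angular momentum, telling us that the
quantization of a Dirac spinor gives rise to a particle carrying spin 1/2.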
Vector Symmetry
The Dirac Lagrangian is invariant under global phase rotations of the spinor, i.e.,
$$ \psi \to e^{i\alpha}\psi . \tag{5.95} $$
This symmetry gives rise to the conserved current
$$ j_V^\mu = \bar\psi\gamma^\mu\psi , \tag{5.96} $$
where the index V stands for vector, reflecting the fact that the left- and right-handed spinors
$\psi_{L,R}$ transform in the same way under phase rotations. It is straightforward to check using
(5.35) and (5.36) that $j_V^\mu$ is indeed conserved under the EOMs,
$$ \partial_\mu j_V^\mu = (\partial_\mu\bar\psi)\gamma^\mu\psi + \bar\psi\gamma^\mu(\partial_\mu\psi) = im\bar\psi\psi - im\bar\psi\psi = 0 . \tag{5.97} $$
The conserved quantity arising from the vector symmetry is
$$ Q = \int d^3x\, j_V^0 = \int d^3x\, \bar\psi\gamma^0\psi = \int d^3x\, \psi^\dagger\psi . \tag{5.98} $$
We will see shortly that this has the interpretation of electric charge, or particle number, for
fermions.
Axial-Vector Symmetry
In the case of massless fermions, the Dirac Lagrangian possesses an extra internal symmetry,
which rotates left- and right-handed fermions in opposite directions,
$$ \psi \to e^{i\alpha\gamma^5}\psi , \qquad \bar\psi \to \bar\psi\, e^{i\alpha\gamma^5} . \tag{5.99} $$
Here the second transformation follows from the first by noticing that $\exp(-i\alpha\gamma^5)\,\gamma^0 =
\gamma^0\exp(i\alpha\gamma^5)$ as a consequence of the anti-commutation relation in (5.41). Invariance under
the global phase rotation (5.99) leads to the conserved current
$$ j_A^\mu = \bar\psi\gamma^\mu\gamma^5\psi , \tag{5.100} $$
where the subscript A stands for axial-vector. This current is only conserved if the mass
parameter m in the Dirac action (5.34) is equal to zero. Indeed, with the full Dirac Lagrangian
we may compute
$$ \partial_\mu j_A^\mu = (\partial_\mu\bar\psi)\gamma^\mu\gamma^5\psi + \bar\psi\gamma^\mu\gamma^5(\partial_\mu\psi) = 2im\,\bar\psi\gamma^5\psi , \tag{5.101} $$
which is non-vanishing only if $m \neq 0$. However, in the quantum theory things become more
interesting for the axial-vector current. When the theory is coupled to gauge fields, the axial
transformation remains a symmetry of the classical Lagrangian. But the symmetry does not
survive the quantization process [11-13]. It is the prototypical example of an anomaly: a
symmetry of the classical theory that is not preserved at the quantum level. In fact, the axial
anomaly has important physical implications. It not only determines the neutral pion
decay $\pi^0 \to 2\gamma$, but also provides an indirect way to determine the number of color dofs. For
further reading, I recommend the recent review article [14].
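The role of the mass term in breaking the axial symmetry can be made concrete with a few matrix checks. The following is a minimal sketch, assuming NumPy, the chiral-basis gamma matrices, and the convention $\gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3$ (an assumption, since the definition lies outside this section). The helper `axial` and the test angle `alpha` are illustrative names. Because $\bar\psi \to \bar\psi\,e^{i\alpha\gamma^5}$, a bilinear $\bar\psi\,\Gamma\,\psi$ is invariant if $e^{i\alpha\gamma^5}\,\Gamma\,e^{i\alpha\gamma^5} = \Gamma$.

```python
import numpy as np

# Chiral-basis gamma matrices (assumed convention)
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
g = [np.block([[Z2, I2], [I2, Z2]])]
g += [np.block([[Z2, s], [-s, Z2]]) for s in (s1, s2, s3)]
g5 = 1j * g[0] @ g[1] @ g[2] @ g[3]
I4 = np.eye(4, dtype=complex)


def axial(alpha):
    """exp(i alpha gamma^5), using (gamma^5)^2 = 1."""
    return np.cos(alpha) * I4 + 1j * np.sin(alpha) * g5


alpha = 0.3  # arbitrary test angle

# exp(-i alpha gamma^5) gamma^0 = gamma^0 exp(+i alpha gamma^5)
identity_ok = np.allclose(axial(-alpha) @ g[0], g[0] @ axial(alpha))

# Kinetic structure: psi-bar gamma^mu psi is invariant
kinetic_invariant = all(
    np.allclose(axial(alpha) @ gm @ axial(alpha), gm) for gm in g
)

# Mass structure: psi-bar psi picks up exp(2 i alpha gamma^5), not invariant
mass_invariant = np.allclose(axial(alpha) @ axial(alpha), I4)

print(identity_ok, kinetic_invariant, mass_invariant)
```

The kinetic term passes the check for any angle, while the mass term fails it for generic `alpha`, in line with the statement that the axial current is conserved only for $m = 0$.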
5.4 Solutions to Dirac Equation
In order to get some feeling for the physics of the Dirac equation (5.35), we now discuss its
plane-wave solutions. The fact that the Dirac field obeys the Klein-Gordon equation tells
us that it can be written as a linear combination of plane waves. We make the ansatz
$$ \psi(x) = u(p)\, e^{-ip\cdot x} , \tag{5.102} $$
where $u(p)$ is a 4-component spinor that is independent of $x$, but does depend on the
3-momentum $\mathbf{p}$ (see footnote 43). Notice that (5.102) is a positive frequency solution, because $\psi \propto \exp(-iEt)$.
Inserting the above ansatz, the Dirac equation takes the form
$$ (\slashed{p} - m)\, u(p) = \begin{pmatrix} -m & p\cdot\sigma \\ p\cdot\bar\sigma & -m \end{pmatrix} u(p) = 0 , \tag{5.103} $$
where we have used the notation (5.47). In order to find the solution to this equation, we
write $u(p) = (u_1, u_2)^T$. In terms of the two-component spinors $u_{1,2}$ the relation (5.103) reads
$$ (p\cdot\sigma)\, u_2 = m\, u_1 , \qquad (p\cdot\bar\sigma)\, u_1 = m\, u_2 , \tag{5.104} $$
where $p\cdot\sigma = p_\mu\sigma^\mu$ and $p\cdot\bar\sigma = p_\mu\bar\sigma^\mu$. However, these equations are not independent of each
other, since
$$ (p\cdot\sigma)(p\cdot\bar\sigma) = p_0^2 - p_i p_j\,\sigma^i\sigma^j = p_0^2 - p_i p_j\,\delta^{ij} = p_\mu p^\mu = m^2 . \tag{5.105} $$
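We conclude that any spinor of the form
$$ u(p) = N \begin{pmatrix} (p\cdot\sigma)\,\xi' \\ m\,\xi' \end{pmatrix} , \tag{5.106} $$
with constant $N$ and an arbitrary 2-component spinor $\xi'$ is a solution to (5.103). In order to make this more symmetric, we choose
$N = 1/m$ and $\xi' = \sqrt{p\cdot\bar\sigma}\,\xi$. Then $(p\cdot\sigma)\sqrt{p\cdot\bar\sigma}\,\xi = m\sqrt{p\cdot\sigma}\,\xi$, so that $u_1 = \sqrt{p\cdot\sigma}\,\xi$, and putting things
together one obtains
$$ u(p) = \begin{pmatrix} \sqrt{p\cdot\sigma}\,\xi \\ \sqrt{p\cdot\bar\sigma}\,\xi \end{pmatrix} , \tag{5.107} $$
where $\xi$ is a 2-component spinor that can be chosen to satisfy $\xi^\dagger\xi = 1$. Here it is understood
that in taking the square root of a matrix, we take the positive root of each eigenvalue.

Footnote 43: In an abuse of notation we denote hereafter the 4-component Dirac spinors by $u(p)$ and not $u(\mathbf{p})$, etc.

Further solutions to the Dirac equation follow from the ansatz
$$ \psi(x) = v(p)\, e^{+ip\cdot x} . \tag{5.108} $$
These solutions oscillate in time as $\exp(+iEt)$ and are therefore called negative frequency
solutions. Realize however that both (5.102) and (5.108) are solutions to the classical field
equations and both have positive total energy (5.90). The Dirac equation (5.35) requires that
the 4-component spinor $v(p)$ satisfies
$$ (\slashed{p} + m)\, v(p) = \begin{pmatrix} m & p\cdot\sigma \\ p\cdot\bar\sigma & m \end{pmatrix} v(p) = 0 . \tag{5.109} $$
Following the line of reasoning that led to (5.106), it is easy to show that the latter equation
is solved by
$$ v(p) = \begin{pmatrix} \sqrt{p\cdot\sigma}\,\eta \\ -\sqrt{p\cdot\bar\sigma}\,\eta \end{pmatrix} , \tag{5.110} $$
for some constant 2-component spinor $\eta$ taken to be normalized as $\eta^\dagger\eta = 1$.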
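The plane-wave spinors can be verified numerically for a generic momentum. The following is a minimal sketch, assuming NumPy, the chiral basis, and the conventions $p\cdot\sigma = E - \mathbf{p}\cdot\boldsymbol{\sigma}$ and $p\cdot\bar\sigma = E + \mathbf{p}\cdot\boldsymbol{\sigma}$. The helper `sqrt_hermitian` and the test values of `m` and `p3` are illustrative choices, not part of the original text.

```python
import numpy as np

# Pauli matrices
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)


def sqrt_hermitian(M):
    """Positive square root of a positive-definite Hermitian matrix."""
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.sqrt(w)) @ V.conj().T


m = 1.0
p3 = np.array([0.3, -0.4, 1.2])      # arbitrary test 3-momentum
E = np.sqrt(m**2 + p3 @ p3)          # on-shell energy

p_dot_vec_sigma = sum(pi * si for pi, si in zip(p3, sigma))
p_sigma = E * I2 - p_dot_vec_sigma       # p . sigma
p_sigmabar = E * I2 + p_dot_vec_sigma    # p . sigma-bar

# p-slash in the chiral basis
pslash = np.block([[Z2, p_sigma], [p_sigmabar, Z2]])

xi = np.array([1, 0], dtype=complex)
eta = np.array([0, 1], dtype=complex)

# Spinors of the form (5.107) and (5.110)
u = np.concatenate([sqrt_hermitian(p_sigma) @ xi,
                    sqrt_hermitian(p_sigmabar) @ xi])
v = np.concatenate([sqrt_hermitian(p_sigma) @ eta,
                    -sqrt_hermitian(p_sigmabar) @ eta])

# Residuals of (p-slash - m) u = 0 and (p-slash + m) v = 0
res_u = np.linalg.norm((pslash - m * np.eye(4)) @ u)
res_v = np.linalg.norm((pslash + m * np.eye(4)) @ v)
print(res_u, res_v)
```

Both residuals vanish up to floating-point rounding. The eigen-decomposition in `sqrt_hermitian` implements the prescription of taking the positive root of each eigenvalue.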
Spin-Up and Spin-Down Solutions
In order to make contact with QM, consider the positive frequency solution with mass m and
vanishing 3-momentum $\mathbf{p} = 0$, i.e., the rest frame of the associated particle. In this case the
solution to (5.103) takes the form
$$ u(p) = \sqrt{m} \begin{pmatrix} \xi \\ \xi \end{pmatrix} , \tag{5.111} $$
where $\xi$ is an arbitrary 2-component spinor. We can interpret the spinor $\xi$ by looking at
the rotation generator (5.19). We see that $\xi$ transforms under rotations as an ordinary
2-component spinor of the rotation group, and therefore determines the spin orientation of the
Dirac solution in the usual way. E.g., when $\xi^T = (1, 0)$, the corresponding field has spin up
along the z-axis. After quantization, this will become the spin of the associated particle (see footnote 44).

Footnote 44: In the rest of this section, we will indulge in an abuse of terminology and refer to the classical solutions
to the Dirac equation as particles, even though they have no such interpretation before quantization.
Starting from (5.111), we now consider the particle with spin $\xi^T = (1, 0)$ and boost it along
the z-direction with $p^\mu = (E, 0, 0, p_z)$. The solution (5.107) to the Dirac equation becomes
$$ u(p) = \begin{pmatrix} \sqrt{E - p_z}\begin{pmatrix} 1 \\ 0 \end{pmatrix} \\[2mm] \sqrt{E + p_z}\begin{pmatrix} 1 \\ 0 \end{pmatrix} \end{pmatrix} = \sqrt{m} \begin{pmatrix} e^{-y/2}\begin{pmatrix} 1 \\ 0 \end{pmatrix} \\[2mm] e^{y/2}\begin{pmatrix} 1 \\ 0 \end{pmatrix} \end{pmatrix} , \tag{5.112} $$
where in the last step we have introduced the rapidity
$$ y = \frac{1}{2}\ln\left(\frac{E + p_z}{E - p_z}\right) , \tag{5.113} $$
which is related to $E$ and $p_z$ via
$$ E = \frac{m}{2}\left(e^{y} + e^{-y}\right) , \qquad p_z = \frac{m}{2}\left(e^{y} - e^{-y}\right) . \tag{5.114} $$
Notice that rapidities are, unlike speeds at relativistic velocities, additive quantities. This
feature explains why in particle physics rapidities are often used instead of velocities. For
large boosts, i.e., $E \gg m$ or equivalently $y \gg 1$, the result (5.112) turns into
$$ u(p) \to \sqrt{2E} \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} . \tag{5.115} $$
In the same limit, one obtains for a particle with spin $\xi^T = (0, 1)$ the expression
$$ u(p) \to \sqrt{2E} \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} . \tag{5.116} $$
This implies that in the limit $y \to \infty$ the states degenerate into the 2-component spinors of a
massless particle. We now also understand the reason for the factor of $\sqrt{m}$ in (5.111). It is
necessary to keep the spinor expressions finite in the massless limit.
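The relations between rapidity, energy, and momentum, as well as the additivity of rapidities, are easy to check with plain arithmetic. The sketch below uses only the Python standard library. The function name `rapidity` and the numerical values of `m`, `y`, `beta1`, and `beta2` are arbitrary illustrative choices, and the relativistic velocity-addition formula for collinear boosts is taken as given.

```python
import math


def rapidity(E, pz):
    """Rapidity y = (1/2) ln((E + pz) / (E - pz)), as in (5.113)."""
    return 0.5 * math.log((E + pz) / (E - pz))


m = 0.5   # test mass
y = 3.0   # test rapidity

# (5.114): E = m cosh(y), pz = m sinh(y)
E = m * math.cosh(y)
pz = m * math.sinh(y)
y_back = rapidity(E, pz)

# Upper and lower components of the boosted spin-up spinor (5.112)
upper = math.sqrt(E - pz)   # expected: sqrt(m) * exp(-y/2)
lower = math.sqrt(E + pz)   # expected: sqrt(m) * exp(+y/2)
ratio = upper / lower       # equals exp(-y), vanishing for large boosts

# Additivity: collinear velocities combine non-linearly, rapidities add.
beta1, beta2 = 0.6, 0.7
beta12 = (beta1 + beta2) / (1 + beta1 * beta2)
y_sum = math.atanh(beta1) + math.atanh(beta2)
y_combined = math.atanh(beta12)

print(y_back, ratio, y_sum, y_combined)
```

The ratio of upper to lower components falls off like $e^{-y}$, which is the quantitative content of the statement that the boosted spinor approaches (5.115).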
Helicity
The solutions (5.115) and (5.116) are the eigenstates of the helicity operator
$$ h = \frac{i}{2}\,\epsilon_{ijk}\,\hat{p}^i S^{jk} = \frac{1}{2}\,\hat{p}_i \begin{pmatrix} \sigma^i & 0 \\ 0 & \sigma^i \end{pmatrix} , \tag{5.117} $$
where $S^{ij}$ is the rotation generator (5.14). The massless field in (5.115) has helicity $+1/2$ and is
said to be right-handed, while the one in (5.116) has helicity $-1/2$ and is called left-handed.
Notice that the helicity of a massive particle depends on the frame of reference, since one can
always boost to a frame in which its momentum is in the opposite direction, but its spin is
unchanged. For a massless particle, which travels at the speed of light, one cannot perform such
a boost. This also explains the origin of the notation $\psi_{L,R}$ for Weyl spinors. The solutions
of the Weyl equations (5.49) are states of definite helicity, corresponding to left- and
right-handed particles, respectively. The Lorentz invariance of helicity (for a massless particle) is
manifest in the notation of Weyl spinors, since $\psi_L$ and $\psi_R$ live in different representations of
the Lorentz group.
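The helicity assignments can be confirmed by acting with $h$ on the two limiting spinors. The following is a minimal sketch, assuming NumPy, the chiral basis, motion along the positive z-axis, and the convention $S^{jk} = \frac{1}{4}[\gamma^j, \gamma^k]$ for the rotation generators (an assumption here, since (5.14) lies outside this section). The names `eps`, `h_generators`, and `h_block` are illustrative.

```python
import numpy as np

# Chiral-basis gamma matrices (assumed convention)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
Z2 = np.zeros((2, 2), dtype=complex)
gamma_space = [np.block([[Z2, s], [-s, Z2]]) for s in sigma]


def eps(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


p_hat = np.array([0.0, 0.0, 1.0])  # unit momentum along +z

# First form of (5.117): h = (i/2) eps_ijk p_hat^i S^jk, S^jk = [g^j, g^k]/4
h_generators = np.zeros((4, 4), dtype=complex)
for i in range(3):
    for j in range(3):
        for k in range(3):
            S_jk = 0.25 * (gamma_space[j] @ gamma_space[k]
                           - gamma_space[k] @ gamma_space[j])
            h_generators += 0.5j * eps(i, j, k) * p_hat[i] * S_jk

# Second form of (5.117): block-diagonal (1/2) p_hat . sigma
p_sigma = sum(p * s for p, s in zip(p_hat, sigma))
h_block = 0.5 * np.block([[p_sigma, Z2], [Z2, p_sigma]])

u_right = np.array([0, 0, 1, 0], dtype=complex)  # direction of (5.115)
u_left = np.array([0, 1, 0, 0], dtype=complex)   # direction of (5.116)

forms_agree = np.allclose(h_generators, h_block)
right_ok = np.allclose(h_block @ u_right, +0.5 * u_right)
left_ok = np.allclose(h_block @ u_left, -0.5 * u_left)
print(forms_agree, right_ok, left_ok)
```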
Spinor Products
There are a number of identities that will be very useful in the following section, regarding
the (inner) products of the spinors $u(p)$ and $v(p)$. For convenience, we define a basis $\xi_r$ and
$\eta_r$ with $r = 1, 2$ for the 2-component spinors such that
$$ \xi_r^\dagger\xi_s = \delta_{rs} , \qquad \eta_r^\dagger\eta_s = \delta_{rs} . \tag{5.118} $$
E.g., one can take
$$ \xi_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} , \qquad \xi_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix} , \tag{5.119} $$
and similarly for $\eta_r$. Let us first look at the positive frequency solutions $u(p)$. We can take the
inner product of 4-component spinors in two different ways, i.e., $u_r^\dagger(p)\,u_s(p)$ or $\bar u_r(p)\,u_s(p)$.
Of course, only the latter object is Lorentz invariant, but it will turn out that the former is
needed when we quantize the theory. So let me state both. One has
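$$
\begin{aligned}
u_r^\dagger(p)\,u_s(p) &= \left(\xi_r^\dagger\sqrt{p\cdot\sigma}\,,\; \xi_r^\dagger\sqrt{p\cdot\bar\sigma}\right) \begin{pmatrix} \sqrt{p\cdot\sigma}\,\xi_s \\ \sqrt{p\cdot\bar\sigma}\,\xi_s \end{pmatrix} \\
&= \xi_r^\dagger\,(p\cdot\sigma)\,\xi_s + \xi_r^\dagger\,(p\cdot\bar\sigma)\,\xi_s = 2\,\xi_r^\dagger\, p_0\,\xi_s = 2p_0\,\delta_{rs} ,
\end{aligned}
\tag{5.120}
$$
while the Lorentz-invariant inner product is
$$
\begin{aligned}
\bar u_r(p)\,u_s(p) &= \left(\xi_r^\dagger\sqrt{p\cdot\sigma}\,,\; \xi_r^\dagger\sqrt{p\cdot\bar\sigma}\right) \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} \sqrt{p\cdot\sigma}\,\xi_s \\ \sqrt{p\cdot\bar\sigma}\,\xi_s \end{pmatrix} \\
&= \xi_r^\dagger\sqrt{(p\cdot\sigma)(p\cdot\bar\sigma)}\,\xi_s + \xi_r^\dagger\sqrt{(p\cdot\bar\sigma)(p\cdot\sigma)}\,\xi_s = 2m\,\delta_{rs} .
\end{aligned}
\tag{5.121}
$$
Here we have used (5.105) in order to arrive at the final expression. For the negative frequency
solutions $v(p)$, one derives in an analogous way
$$ v_r^\dagger(p)\,v_s(p) = 2p_0\,\delta_{rs} , \qquad \bar v_r(p)\,v_s(p) = -2m\,\delta_{rs} . \tag{5.122} $$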
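We can also compute the Lorentz-invariant inner product between $u_r(p)$ and $v_s(p)$. We find
$$
\begin{aligned}
\bar u_r(p)\,v_s(p) &= \left(\xi_r^\dagger\sqrt{p\cdot\sigma}\,,\; \xi_r^\dagger\sqrt{p\cdot\bar\sigma}\right) \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} \sqrt{p\cdot\sigma}\,\eta_s \\ -\sqrt{p\cdot\bar\sigma}\,\eta_s \end{pmatrix} \\
&= \xi_r^\dagger\sqrt{(p\cdot\bar\sigma)(p\cdot\sigma)}\,\eta_s - \xi_r^\dagger\sqrt{(p\cdot\sigma)(p\cdot\bar\sigma)}\,\eta_s = 0 ,
\end{aligned}
\tag{5.123}
$$
and similarly $\bar v_r(p)\,u_s(p) = 0$. The solutions $u(p)$ and $v(p)$ are thus orthogonal to each
other. Let us furthermore calculate $u_r^\dagger(p)\,v_s(-p)$ and $v_r^\dagger(p)\,u_s(-p)$ (see footnote 45). Defining $p'^\mu = (p^0, -\mathbf{p})$,
one has in the first case
$$
\begin{aligned}
u_r^\dagger(p)\,v_s(p') &= \left(\xi_r^\dagger\sqrt{p\cdot\sigma}\,,\; \xi_r^\dagger\sqrt{p\cdot\bar\sigma}\right) \begin{pmatrix} \sqrt{p'\cdot\sigma}\,\eta_s \\ -\sqrt{p'\cdot\bar\sigma}\,\eta_s \end{pmatrix} \\
&= \xi_r^\dagger\sqrt{(p\cdot\sigma)(p'\cdot\sigma)}\,\eta_s - \xi_r^\dagger\sqrt{(p\cdot\bar\sigma)(p'\cdot\bar\sigma)}\,\eta_s .
\end{aligned}
\tag{5.124}
$$
Here the term under the first square root is given by $(p\cdot\sigma)(p'\cdot\sigma) = (p_0 - p^i\sigma^i)(p_0 + p^i\sigma^i) =
p_0^2 - \mathbf{p}^2 = m^2$. The same result holds for $(p\cdot\bar\sigma)(p'\cdot\bar\sigma)$. This means that the two terms in the
last line of (5.124) cancel, leaving us with
$$ u_r^\dagger(p)\,v_s(-p) = v_r^\dagger(p)\,u_s(-p) = 0 . \tag{5.125} $$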
Spin Sums
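In evaluating Feynman diagrams, we will often wish to sum over the polarization states of a
fermion. We can derive the relevant spin sums (or completeness relations) by simple
calculations. We start by computing
$$
\begin{aligned}
\sum_{r=1,2} u_r(p)\,\bar u_r(p) &= \sum_{r=1,2} \begin{pmatrix} \sqrt{p\cdot\sigma}\,\xi_r \\ \sqrt{p\cdot\bar\sigma}\,\xi_r \end{pmatrix} \left(\xi_r^\dagger\sqrt{p\cdot\sigma}\,,\; \xi_r^\dagger\sqrt{p\cdot\bar\sigma}\right) \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \\
&= \begin{pmatrix} \sqrt{(p\cdot\sigma)(p\cdot\bar\sigma)} & p\cdot\sigma \\ p\cdot\bar\sigma & \sqrt{(p\cdot\bar\sigma)(p\cdot\sigma)} \end{pmatrix} = \begin{pmatrix} m & p\cdot\sigma \\ p\cdot\bar\sigma & m \end{pmatrix} = \slashed{p} + m .
\end{aligned}
\tag{5.126}
$$
Notice that the two spinors appearing on the left-hand side of (5.126) are not contracted, so that the result is a $4\times 4$ matrix. In
the derivation of the latter equation, we have used that
$$ \sum_{r=1,2} \xi_r\,\xi_r^\dagger = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} . \tag{5.127} $$
Similarly, one derives
$$ \sum_{r=1,2} v_r(p)\,\bar v_r(p) = \slashed{p} - m . \tag{5.128} $$
Again, it is crucial that
$$ \sum_{r=1,2} \eta_r\,\eta_r^\dagger = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} . \tag{5.129} $$

Footnote 45: Our notation is such that with $u(-p)$ we in fact mean $u(-\mathbf{p})$, etc.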
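The spinor products and spin sums can be cross-checked numerically for a generic on-shell momentum. The following is a minimal sketch, assuming NumPy, the chiral basis, and the spinor forms $u = (\sqrt{p\cdot\sigma}\,\xi, \sqrt{p\cdot\bar\sigma}\,\xi)$ and $v = (\sqrt{p\cdot\sigma}\,\eta, -\sqrt{p\cdot\bar\sigma}\,\eta)$. The helper names `sqrt_hermitian` and `bar`, and the test momentum, are illustrative.

```python
import numpy as np

sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
g0 = np.block([[Z2, I2], [I2, Z2]])


def sqrt_hermitian(M):
    """Positive square root of a positive-definite Hermitian matrix."""
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.sqrt(w)) @ V.conj().T


def bar(w):
    """Dirac adjoint w-bar = w^dagger gamma^0, stored as a 1D array."""
    return w.conj() @ g0


m = 1.0
p3 = np.array([0.3, -0.4, 1.2])   # arbitrary test 3-momentum
E = np.sqrt(m**2 + p3 @ p3)
pv = sum(pi * si for pi, si in zip(p3, sigma))
root_s = sqrt_hermitian(E * I2 - pv)      # sqrt(p . sigma)
root_sb = sqrt_hermitian(E * I2 + pv)     # sqrt(p . sigma-bar)
pslash = np.block([[Z2, E * I2 - pv], [E * I2 + pv, Z2]])

basis = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)]
us = [np.concatenate([root_s @ x, root_sb @ x]) for x in basis]
vs = [np.concatenate([root_s @ x, -root_sb @ x]) for x in basis]

# Inner products, arranged as 2x2 matrices in the spin labels (r, s)
udag_u = np.array([[a.conj() @ b for b in us] for a in us])
ubar_u = np.array([[bar(a) @ b for b in us] for a in us])
vbar_v = np.array([[bar(a) @ b for b in vs] for a in vs])
ubar_v = np.array([[bar(a) @ b for b in vs] for a in us])

# Spin sums as 4x4 matrices (outer products, spinors not contracted)
sum_u = sum(np.outer(w, bar(w)) for w in us)
sum_v = sum(np.outer(w, bar(w)) for w in vs)

print(np.allclose(sum_u, pslash + m * np.eye(4)),
      np.allclose(sum_v, pslash - m * np.eye(4)))
```

Under these assumptions the arrays reproduce $2p_0\,\delta_{rs}$, $\pm 2m\,\delta_{rs}$, and the vanishing mixed product, and the two outer-product sums agree with $\slashed{p} \pm m$.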
5.5 Quantization of Dirac Theory
We are now ready to construct the quantum version of the free Dirac field, starting from the
relevant action (5.34). We will first proceed naively and treat $\psi$ as we have done in the case
of the scalar field. Yet, we will see pretty fast that things go wrong, and we will have to
reconsider how to quantize the Dirac theory. Walking down this blind alley will, however, allow
us to better understand the relation between spin and statistics. So in the end, it will be
quite a useful detour.
Little Detour
We start in the usual way by calculating the momentum conjugate to $\psi$. In fact, we already
did this in (5.50), and know that $\pi = i\psi^\dagger$, which does not involve the time derivative of $\psi$.
This makes perfect sense, because the Dirac equation is first order in time, so that we need
only specify $\psi$ and $\psi^\dagger$ on an initial time slice to determine the full evolution.
In order to quantize the theory we then proceed in analogy with the Klein-Gordon field, and
promote $\psi$ and $\psi^\dagger$ to operators, satisfying the following canonical (equal time) commutation
relations
$$
\begin{aligned}
&[\psi_\alpha(\mathbf{x}), \psi_\beta(\mathbf{y})] = [\psi_\alpha^\dagger(\mathbf{x}), \psi_\beta^\dagger(\mathbf{y})] = 0 , \\
&[\psi_\alpha(\mathbf{x}), \psi_\beta^\dagger(\mathbf{y})] = \delta^{(3)}(\mathbf{x} - \mathbf{y})\,\delta_{\alpha\beta} ,
\end{aligned}
\tag{5.130}
$$
where $\alpha$ and $\beta$ denote spinor indices. This already looks peculiar. If $\psi$ were real-valued the
left-hand side would be antisymmetric under exchange of $\mathbf{x}$ and $\mathbf{y}$, while the right-hand side
is symmetric. But $\psi$ is complex, so we do not have a contradiction yet. In fact, we will soon
learn that much worse problems arise when we impose commutation relations on the Dirac
field. But it is instructive to see how far we can get, in order to better understand the relation
between spin and statistics. So let's press on.
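Since we are dealing with a free theory, where any classical solution is a sum of plane
waves, we may write the quantum operators in the Schrödinger picture as
$$
\begin{aligned}
\psi(\mathbf{x}) &= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{\sqrt{2E_{\mathbf{p}}}} \left[ a^r_{\mathbf{p}}\, u_r(p)\, e^{i\mathbf{p}\cdot\mathbf{x}} + b^{r\dagger}_{\mathbf{p}}\, v_r(p)\, e^{-i\mathbf{p}\cdot\mathbf{x}} \right] , \\
\psi^\dagger(\mathbf{x}) &= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{\sqrt{2E_{\mathbf{p}}}} \left[ a^{r\dagger}_{\mathbf{p}}\, u_r^\dagger(p)\, e^{-i\mathbf{p}\cdot\mathbf{x}} + b^{r}_{\mathbf{p}}\, v_r^\dagger(p)\, e^{i\mathbf{p}\cdot\mathbf{x}} \right] ,
\end{aligned}
\tag{5.131}
$$
where the operators $a^{r\dagger}_{\mathbf{p}}$ and $b^{r\dagger}_{\mathbf{p}}$ create particles associated to the positive frequency solutions
$u_r(p)\exp(-ip\cdot x)$ and negative frequency solutions $v_r(p)\exp(ip\cdot x)$, respectively. The $a^r_{\mathbf{p}}$ and $b^r_{\mathbf{p}}$
are the corresponding annihilation operators. As in the case of scalar fields, the commutation
relations of the fields (5.130) lead to commutation relations for the ladder operators. The
non-vanishing commutators are
$$
\begin{aligned}
&[a^r_{\mathbf{p}}, a^{s\dagger}_{\mathbf{q}}] = (2\pi)^3\,\delta^{(3)}(\mathbf{p} - \mathbf{q})\,\delta^{rs} , \\
&[b^r_{\mathbf{p}}, b^{s\dagger}_{\mathbf{q}}] = -(2\pi)^3\,\delta^{(3)}(\mathbf{p} - \mathbf{q})\,\delta^{rs} .
\end{aligned}
\tag{5.132}
$$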
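Notice that the commutator $[b^r_{\mathbf{p}}, b^{s\dagger}_{\mathbf{q}}]$ has a strange minus sign on the right-hand side. It is not
obvious that this sign causes trouble, but we should be aware of it. With the commutation
relations (5.132) at hand, it is straightforward to show that the relations (5.130) hold. One
has
$$
\begin{aligned}
[\psi(\mathbf{x}), \psi^\dagger(\mathbf{y})] &= \sum_{r,s=1,2} \int \frac{d^3p\, d^3q}{(2\pi)^6}\, \frac{1}{\sqrt{4E_{\mathbf{p}}E_{\mathbf{q}}}} \Big( [a^r_{\mathbf{p}}, a^{s\dagger}_{\mathbf{q}}]\, u_r(p)\,u_s^\dagger(q)\, e^{i(\mathbf{p}\cdot\mathbf{x} - \mathbf{q}\cdot\mathbf{y})} \\
&\hspace{4.5cm} + [b^{r\dagger}_{\mathbf{p}}, b^{s}_{\mathbf{q}}]\, v_r(p)\,v_s^\dagger(q)\, e^{-i(\mathbf{p}\cdot\mathbf{x} - \mathbf{q}\cdot\mathbf{y})} \Big) \\
&= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2E_{\mathbf{p}}} \Big( u_r(p)\,\bar u_r(p)\,\gamma^0\, e^{i\mathbf{p}\cdot(\mathbf{x} - \mathbf{y})} + v_r(p)\,\bar v_r(p)\,\gamma^0\, e^{-i\mathbf{p}\cdot(\mathbf{x} - \mathbf{y})} \Big) .
\end{aligned}
\tag{5.133}
$$
In order to simplify this further, we now employ the completeness relations (5.126) and (5.128).
It follows that
$$
\begin{aligned}
[\psi(\mathbf{x}), \psi^\dagger(\mathbf{y})] &= \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2E_{\mathbf{p}}} \Big( (\slashed{p} + m)\,\gamma^0\, e^{i\mathbf{p}\cdot(\mathbf{x} - \mathbf{y})} + (\slashed{p} - m)\,\gamma^0\, e^{-i\mathbf{p}\cdot(\mathbf{x} - \mathbf{y})} \Big) \\
&= \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2E_{\mathbf{p}}} \Big( \big(p_0\gamma^0 + p_i\gamma^i + m\big)\gamma^0 + \big(p_0\gamma^0 - p_i\gamma^i - m\big)\gamma^0 \Big)\, e^{i\mathbf{p}\cdot(\mathbf{x} - \mathbf{y})} \\
&= \int \frac{d^3p}{(2\pi)^3}\, e^{i\mathbf{p}\cdot(\mathbf{x} - \mathbf{y})} = \delta^{(3)}(\mathbf{x} - \mathbf{y}) ,
\end{aligned}
\tag{5.134}
$$
as promised. Notice that to obtain the second line we have changed the integration variable from $\mathbf{p}$
to $-\mathbf{p}$ in the second term. We also see that the minus sign in the second
relation of (5.132) is crucial here, since it is necessary so that the terms $p_i\gamma^i$ cancel
in the final expression. It is also easy to show that the first commutation relation in (5.130)
is satisfied once the equations (5.132) are imposed. I leave it to the reader to perform the
explicit computation.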
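Equipped with (5.132), we can find the explicit form of the Dirac Hamiltonian in terms of
ladder operators. The Hamiltonian can be simply read off from (5.90), since $E = H = \int d^3x\, \mathcal{H}$.
Hence, we have
$$ \mathcal{H} = \bar\psi\,\big(-i\gamma^i\partial_i + m\big)\,\psi , \tag{5.135} $$
as a starting point, which we would like to turn into an operator. We first look at
$$ \big(-i\gamma^i\partial_i + m\big)\psi = \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{\sqrt{2E_{\mathbf{p}}}} \left[ a^r_{\mathbf{p}}\, \big(-p_i\gamma^i + m\big)\, u_r(p)\, e^{i\mathbf{p}\cdot\mathbf{x}} + b^{r\dagger}_{\mathbf{p}}\, \big(p_i\gamma^i + m\big)\, v_r(p)\, e^{-i\mathbf{p}\cdot\mathbf{x}} \right] . \tag{5.136} $$
In order to find this result it is important to notice that $\mathbf{p}\cdot\mathbf{x} = -p_i x^i$, which explains the
additional minus sign of the $p_i\gamma^i$ terms. Using now (5.103) and (5.109) to replace the $p_i\gamma^i$
terms leads to
$$ \big(-i\gamma^i\partial_i + m\big)\psi = \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3}\, \sqrt{\frac{E_{\mathbf{p}}}{2}}\; \gamma^0 \left[ a^r_{\mathbf{p}}\, u_r(p)\, e^{i\mathbf{p}\cdot\mathbf{x}} - b^{r\dagger}_{\mathbf{p}}\, v_r(p)\, e^{-i\mathbf{p}\cdot\mathbf{x}} \right] . \tag{5.137} $$
We now use this expression to write the Hamiltonian as
$$
\begin{aligned}
H &= \sum_{r,s=1,2} \int \frac{d^3x\, d^3p\, d^3q}{(2\pi)^6}\, \sqrt{\frac{E_{\mathbf{p}}}{4E_{\mathbf{q}}}} \left[ a^{s\dagger}_{\mathbf{q}}\, u_s^\dagger(q)\, e^{-i\mathbf{q}\cdot\mathbf{x}} + b^{s}_{\mathbf{q}}\, v_s^\dagger(q)\, e^{i\mathbf{q}\cdot\mathbf{x}} \right] \left[ a^r_{\mathbf{p}}\, u_r(p)\, e^{i\mathbf{p}\cdot\mathbf{x}} - b^{r\dagger}_{\mathbf{p}}\, v_r(p)\, e^{-i\mathbf{p}\cdot\mathbf{x}} \right] \\
&= \sum_{r,s=1,2} \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2} \Big[ a^{s\dagger}_{\mathbf{p}} a^{r}_{\mathbf{p}}\, \big(u_s^\dagger(p)\,u_r(p)\big) - b^{s}_{\mathbf{p}} b^{r\dagger}_{\mathbf{p}}\, \big(v_s^\dagger(p)\,v_r(p)\big) \\
&\hspace{3.2cm} - a^{s\dagger}_{\mathbf{p}} b^{r\dagger}_{-\mathbf{p}}\, \big(u_s^\dagger(p)\,v_r(-p)\big) + b^{s}_{\mathbf{p}} a^{r}_{-\mathbf{p}}\, \big(v_s^\dagger(p)\,u_r(-p)\big) \Big] ,
\end{aligned}
\tag{5.138}
$$
where in the last two terms we have changed $\mathbf{p}$ to $-\mathbf{p}$. Now is the right time to employ the
formulas in (5.120), (5.122), and (5.125), which allow us to get rid of the spinor products. We
arrive at the simple result
$$
\begin{aligned}
H &= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3}\, E_{\mathbf{p}} \left( a^{r\dagger}_{\mathbf{p}} a^{r}_{\mathbf{p}} - b^{r}_{\mathbf{p}} b^{r\dagger}_{\mathbf{p}} \right) \\
&= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3}\, E_{\mathbf{p}} \left( a^{r\dagger}_{\mathbf{p}} a^{r}_{\mathbf{p}} - b^{r\dagger}_{\mathbf{p}} b^{r}_{\mathbf{p}} + (2\pi)^3\,\delta^{(3)}(0) \right) .
\end{aligned}
\tag{5.139}
$$
The delta-function term should be familiar to you by now. It is easily dealt with by normal
ordering. However, the term $-b^{r\dagger}_{\mathbf{p}} b^{r}_{\mathbf{p}}$ is a complete mess, since it implies that the Hamiltonian
is not bounded below, meaning that our quantum theory makes no sense. Taken seriously it
would tell us that we could tumble to states of lower and lower energy by continually producing
particles by the action of $b^{r\dagger}_{\mathbf{p}}$. Since the above calculation was a little subtle, you might think
that it's possible to rescue the theory to get the minus signs to work out right. You can play
around with different things, but you'll always find this minus sign cropping up somewhere.
And, in fact, it's telling us something important that we missed.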
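The step that trades the spatial momentum terms for $\pm E_{\mathbf{p}}\gamma^0$ is where the relative minus sign between the $a$ and $b^\dagger$ contributions originates, so it is worth checking on explicit spinors. The following is a minimal sketch, assuming NumPy, the chiral basis, and upper-index momentum components $p^i = -p_i$, so that $-p_i\gamma^i = \gamma^i p^i$. The helper `sqrt_hermitian` and the test momentum are illustrative.

```python
import numpy as np

sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
I4 = np.eye(4, dtype=complex)
g0 = np.block([[Z2, I2], [I2, Z2]])
g_space = [np.block([[Z2, s], [-s, Z2]]) for s in sigma]


def sqrt_hermitian(M):
    """Positive square root of a positive-definite Hermitian matrix."""
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.sqrt(w)) @ V.conj().T


m = 1.0
p3 = np.array([0.3, -0.4, 1.2])   # upper-index components p^i (test values)
E = np.sqrt(m**2 + p3 @ p3)
pv = sum(pi * si for pi, si in zip(p3, sigma))
root_s = sqrt_hermitian(E * I2 - pv)     # sqrt(p . sigma)
root_sb = sqrt_hermitian(E * I2 + pv)    # sqrt(p . sigma-bar)

xi = np.array([1, 0], dtype=complex)
u = np.concatenate([root_s @ xi, root_sb @ xi])
v = np.concatenate([root_s @ xi, -root_sb @ xi])

# gamma^i p^i with upper-index momentum components
p_gamma = sum(pi * gi for pi, gi in zip(p3, g_space))

# (gamma^i p^i + m) u = +E gamma^0 u   and   (-gamma^i p^i + m) v = -E gamma^0 v
u_step_ok = np.allclose((p_gamma + m * I4) @ u, E * g0 @ u)
v_step_ok = np.allclose((-p_gamma + m * I4) @ v, -E * g0 @ v)
print(u_step_ok, v_step_ok)
```

The opposite signs of the two right-hand sides are what produce the relative minus sign in (5.137) and, eventually, the problematic term in (5.139).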
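Further insight into the structure of the Dirac theory can be gained by investigating the
causality of the theory. To do this we should calculate $[\psi(x), \psi^\dagger(y)]$, or more conveniently
$[\psi(x), \bar\psi(y)]$, at non-equal times and hope to find that this commutator is zero outside the
light-cone. We start this exercise by switching to the Heisenberg picture, thereby restoring the
time-dependence of $\psi$ and $\bar\psi$. From (3.104) and (3.106), we infer that
$$ (a^r_{\mathbf{p}})_H = e^{iHt}\, a^r_{\mathbf{p}}\, e^{-iHt} = e^{-iE_{\mathbf{p}}t}\, a^r_{\mathbf{p}} , \qquad (a^{r\dagger}_{\mathbf{p}})_H = e^{iHt}\, a^{r\dagger}_{\mathbf{p}}\, e^{-iHt} = e^{iE_{\mathbf{p}}t}\, a^{r\dagger}_{\mathbf{p}} , \tag{5.140} $$
while
$$ (b^r_{\mathbf{p}})_H = e^{iHt}\, b^r_{\mathbf{p}}\, e^{-iHt} = e^{-iE_{\mathbf{p}}t}\, b^r_{\mathbf{p}} , \qquad (b^{r\dagger}_{\mathbf{p}})_H = e^{iHt}\, b^{r\dagger}_{\mathbf{p}}\, e^{-iHt} = e^{iE_{\mathbf{p}}t}\, b^{r\dagger}_{\mathbf{p}} . \tag{5.141} $$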
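It immediately follows that
$$
\begin{aligned}
\psi(x) &= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{\sqrt{2E_{\mathbf{p}}}} \left[ a^r_{\mathbf{p}}\, u_r(p)\, e^{-ip\cdot x} + b^{r\dagger}_{\mathbf{p}}\, v_r(p)\, e^{ip\cdot x} \right] , \\
\bar\psi(x) &= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{\sqrt{2E_{\mathbf{p}}}} \left[ a^{r\dagger}_{\mathbf{p}}\, \bar u_r(p)\, e^{ip\cdot x} + b^{r}_{\mathbf{p}}\, \bar v_r(p)\, e^{-ip\cdot x} \right] .
\end{aligned}
\tag{5.142}
$$
We can now compute the commutator. One has
$$
\begin{aligned}
[\psi_\alpha(x), \bar\psi_\beta(y)] &= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2E_{\mathbf{p}}} \Big[ \big(u_r(p)\,\bar u_r(p)\big)_{\alpha\beta}\, e^{-ip\cdot(x - y)} + \big(v_r(p)\,\bar v_r(p)\big)_{\alpha\beta}\, e^{ip\cdot(x - y)} \Big] \\
&= \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2E_{\mathbf{p}}} \Big[ (\slashed{p} + m)_{\alpha\beta}\, e^{-ip\cdot(x - y)} + (\slashed{p} - m)_{\alpha\beta}\, e^{ip\cdot(x - y)} \Big] \\
&= \big(i\slashed{\partial}_x + m\big)_{\alpha\beta} \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2E_{\mathbf{p}}} \Big[ e^{-ip\cdot(x - y)} - e^{ip\cdot(x - y)} \Big] .
\end{aligned}
\tag{5.143}
$$
Looking back at (3.110) and (3.111), we see that this means that
$$ [\psi_\alpha(x), \bar\psi_\beta(y)] = \big(i\slashed{\partial}_x + m\big)_{\alpha\beta}\, \Delta(x - y) . \tag{5.144} $$
This expression vanishes outside the light-cone, because the commutator of the real scalar field
$\Delta(x - y) = [\phi(x), \phi(y)]$ does. As a result the quantum version of the Dirac theory is causal.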
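Although there is no problem with causality, it is worthwhile to stare at the commutator
in (5.144) a bit longer. If $|0\rangle$ is the vacuum state of the theory,
$$ a^r_{\mathbf{p}}\,|0\rangle = b^r_{\mathbf{p}}\,|0\rangle = 0 , \tag{5.145} $$
for all $r$ and $\mathbf{p}$, then
$$
\begin{aligned}
[\psi_\alpha(x), \bar\psi_\beta(y)] &= \langle 0|\,[\psi_\alpha(x), \bar\psi_\beta(y)]\,|0\rangle \\
&= \langle 0|\,\psi_\alpha(x)\,\bar\psi_\beta(y)\,|0\rangle - \langle 0|\,\bar\psi_\beta(y)\,\psi_\alpha(x)\,|0\rangle .
\end{aligned}
\tag{5.146}
$$
It is important to realize now that the first (second) matrix element receives contributions only
from terms containing the $u_r(p)$ ($v_r(p)$) spinors. Explicitly, one has in the first case
$$ \langle 0|\,\psi_\alpha(x)\,\bar\psi_\beta(y)\,|0\rangle = \sum_{r,s=1,2} \int \frac{d^3p\, d^3q}{(2\pi)^6}\, \frac{1}{\sqrt{4E_{\mathbf{p}}E_{\mathbf{q}}}}\, \big(u_r(p)\,\bar u_s(q)\big)_{\alpha\beta}\, e^{-i(p\cdot x - q\cdot y)}\, \langle 0|\,a^r_{\mathbf{p}}\, a^{s\dagger}_{\mathbf{q}}\,|0\rangle , \tag{5.147} $$
and a similar expression holds in the second case.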
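It is now crucial to ask the following questions. Can we say something about the matrix
elements $\langle 0|\,a^r_{\mathbf{p}}\, a^{s\dagger}_{\mathbf{q}}\,|0\rangle$ based on the classical symmetries of the Dirac theory? In particular, how
does Lorentz invariance constrain the form of the relevant matrix elements? For the ground
state $|0\rangle$ to be invariant under translations, we must have $\exp(-iP\cdot x)\,|0\rangle = |0\rangle$. In analogy
to (5.140) the action of $\exp(iP\cdot x)$ on the ladder operators can be shown to lead to
$$ e^{iP\cdot x}\, a^r_{\mathbf{p}}\, e^{-iP\cdot x} = e^{-ip\cdot x}\, a^r_{\mathbf{p}} , \qquad e^{iP\cdot x}\, a^{r\dagger}_{\mathbf{p}}\, e^{-iP\cdot x} = e^{ip\cdot x}\, a^{r\dagger}_{\mathbf{p}} . \tag{5.148} $$
Analogous expressions hold in the case of $b^r_{\mathbf{p}}$ and $b^{r\dagger}_{\mathbf{p}}$. Therefore,
$$ \langle 0|\,a^r_{\mathbf{p}}\, a^{s\dagger}_{\mathbf{q}}\,|0\rangle = \langle 0|\,a^r_{\mathbf{p}}\, a^{s\dagger}_{\mathbf{q}}\, e^{-iP\cdot x}\,|0\rangle = e^{-i(p - q)\cdot x}\, \langle 0|\,e^{-iP\cdot x}\, a^r_{\mathbf{p}}\, a^{s\dagger}_{\mathbf{q}}\,|0\rangle = e^{-i(p - q)\cdot x}\, \langle 0|\,a^r_{\mathbf{p}}\, a^{s\dagger}_{\mathbf{q}}\,|0\rangle . \tag{5.149} $$
This implies that the matrix element can only be non-zero if $\mathbf{p} = \mathbf{q}$. Similarly, it can be shown
that rotational invariance of $|0\rangle$ requires that $r = s$, which should be intuitively clear. From
these considerations, one concludes that the matrix element can be written as
$$ \langle 0|\,a^r_{\mathbf{p}}\, a^{s\dagger}_{\mathbf{q}}\,|0\rangle = (2\pi)^3\,\delta^{(3)}(\mathbf{p} - \mathbf{q})\,\delta^{rs}\, A(p) , \tag{5.150} $$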
where $A(p)$ is an arbitrary function that is so far undetermined. Inserting the latter result
into (5.147) gives
$$
\begin{aligned}
\langle 0|\,\psi_\alpha(x)\,\bar\psi_\beta(y)\,|0\rangle &= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2E_{\mathbf{p}}}\, \big(u_r(p)\,\bar u_r(p)\big)_{\alpha\beta}\, e^{-ip\cdot(x - y)}\, A(p) \\
&= \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2E_{\mathbf{p}}}\, (\slashed{p} + m)_{\alpha\beta}\, e^{-ip\cdot(x - y)}\, A(p) .
\end{aligned}
\tag{5.151}
$$
For this expression to be invariant under boosts, we have to require that $A(p)$ is a
Lorentz scalar, i.e., $A(p) = A(p^2)$. In fact, since $p^2 = m^2$ it follows that $A$ has to be a
positive constant. The positivity of $A$ is the result of the positivity of the norm of states in
any self-respecting Hilbert space. Hence,
$$ \langle 0|\,\psi_\alpha(x)\,\bar\psi_\beta(y)\,|0\rangle = A\, \big(i\slashed{\partial}_x + m\big)_{\alpha\beta} \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2E_{\mathbf{p}}}\, e^{-ip\cdot(x - y)} . \tag{5.152} $$
In a similar fashion, we can also calculate the second matrix element in (5.146). The final
result reads
$$ \langle 0|\,\bar\psi_\beta(y)\,\psi_\alpha(x)\,|0\rangle = -B\, \big(i\slashed{\partial}_x + m\big)_{\alpha\beta} \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2E_{\mathbf{p}}}\, e^{ip\cdot(x - y)} , \tag{5.153} $$
where $B$ is another positive constant. The minus sign is important. It arises from the
completeness relation (5.128) of the $v_r(p)$ spinors and the sign of $x$ in the exponential. From
(5.152) and (5.153) we see that the two terms in the last line of (5.146) would indeed cancel
if $A = -B$. Yet, this is impossible since $A$ and $B$ must both be positive.
So how to resolve this apparent contradiction? Setting $A = B = 1$, it follows from (5.152)
and (5.153) that (outside the light-cone)
$$ \langle 0|\,\psi_\alpha(x)\,\bar\psi_\beta(y)\,|0\rangle = -\langle 0|\,\bar\psi_\beta(y)\,\psi_\alpha(x)\,|0\rangle , \tag{5.154} $$
which means that the spinor fields anticommute at space-like separation. This suggests that
postulating the commutation relations (5.130) for the spinor fields was the mistake that led
to the negative energy problem in (5.139).
Fermionic Quantization
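The key piece of physics that we obviously missed before is that spin-1/2 particles are fermions,
meaning that they obey Fermi-Dirac statistics with the quantum state picking up a minus sign
upon the interchange of any two particles as indicated by (5.154). This fact is embedded into
the structure of relativistic QFT: the spin-statistics theorem tells us that integer spin fields
must be quantized as bosons, while half-integer spin fields must be quantized as fermions. Any
attempt to do otherwise will lead to an inconsistency.
All inconsistencies are removed by postulating the equal-time anticommutation relations
for the Dirac field,
$$
\begin{aligned}
&\{\psi_\alpha(\mathbf{x}), \psi_\beta(\mathbf{y})\} = \{\psi_\alpha^\dagger(\mathbf{x}), \psi_\beta^\dagger(\mathbf{y})\} = 0 , \\
&\{\psi_\alpha(\mathbf{x}), \psi_\beta^\dagger(\mathbf{y})\} = \delta^{(3)}(\mathbf{x} - \mathbf{y})\,\delta_{\alpha\beta} ,
\end{aligned}
\tag{5.155}
$$
instead of (5.130). In this case we still have the expansions (5.131) and (5.142) in terms of the
ladder operators $a^r_{\mathbf{p}}$, $a^{r\dagger}_{\mathbf{p}}$, $b^r_{\mathbf{p}}$, and $b^{r\dagger}_{\mathbf{p}}$, but the line of reasoning that led to (5.132) now tells
us that
$$
\begin{aligned}
&\{a^r_{\mathbf{p}}, a^{s\dagger}_{\mathbf{q}}\} = (2\pi)^3\,\delta^{(3)}(\mathbf{p} - \mathbf{q})\,\delta^{rs} , \\
&\{b^r_{\mathbf{p}}, b^{s\dagger}_{\mathbf{q}}\} = (2\pi)^3\,\delta^{(3)}(\mathbf{p} - \mathbf{q})\,\delta^{rs} ,
\end{aligned}
\tag{5.156}
$$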
while all other anticommutators vanish identically. Using these anticommutator relations, we
can now compute the Hamiltonian again, nding
H =

r=1,2
_
d
3
p
(2)
3
E
p
_
a
r
p
a
r
p
b
r
p
b
r
p
_
=

r=1,2
_
d
3
p
(2)
3
E
p
_
a
r
p
a
r
p
+b
r
p
b
r
p
(2)
3

(3)
(0)
_
.
(5.157)
We see that the anticommutators have saved us from the indignity of an unbounded Hamiltonian.
Notice that when normal ordering, we now throw away a negative infinite contribution
proportional to $(2\pi)^3\delta^{(3)}(0)$ and not a positive one as in the case of the scalar field (3.19). In
principle, the negative contribution from fermionic fields could (partially) cancel the positive
contribution arising from bosonic fields. So one could hope that if there is a symmetry relating
fermions and bosons to each other, a so-called supersymmetry, the cosmological constant
problem might be solvable. In fact, it can be shown that supersymmetry solves the cosmological
constant problem "halfway", but does not render a complete solution. If you want to
figure out what "halfway" actually means, I recommend having a look at the excellent review
[15] and the relevant references therein.
For completeness let me also quote the expression for the momentum operator. Inserting
the expansions (5.131) into (5.91), one finds after a straightforward calculation and normal
ordering the following result
$$\vec P = \sum_{r=1,2}\int\!\frac{d^3p}{(2\pi)^3}\;\vec p\left(a^{r\dagger}_{\vec p}a^{r}_{\vec p} + b^{r\dagger}_{\vec p}b^{r}_{\vec p}\right). \tag{5.158}$$
Fermi-Dirac Statistics
Although the ladder operators now obey anticommutation relations, the Hamiltonian (5.157)
has nice commutation relations with them. You can check easily that
$$[H, a^{r}_{\vec p}] = -E_{\vec p}\,a^{r}_{\vec p}\,, \qquad [H, a^{r\dagger}_{\vec p}] = E_{\vec p}\,a^{r\dagger}_{\vec p}\,, \tag{5.159}$$
and likewise in the case of $b^{r}_{\vec p}$ and $b^{r\dagger}_{\vec p}$. As in the scalar case (3.43), this implies that we can
again construct a tower of energy eigenstates by acting on $|0\rangle$ with $a^{r\dagger}_{\vec p}$ and $b^{r\dagger}_{\vec p}$ to create
particles and antiparticles. E.g., we have the one-particle state
$$|\vec p, r\rangle = a^{r\dagger}_{\vec p}|0\rangle\,, \tag{5.160}$$
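with momentum $\vec p$ and spin quantum number $r$. The two-particle state
$$|\vec p_1, r_1; \vec p_2, r_2\rangle = a^{r_1\dagger}_{\vec p_1}a^{r_2\dagger}_{\vec p_2}|0\rangle\,, \tag{5.161}$$
obeys
$$|\vec p_1, r_1; \vec p_2, r_2\rangle = a^{r_1\dagger}_{\vec p_1}a^{r_2\dagger}_{\vec p_2}|0\rangle = -a^{r_2\dagger}_{\vec p_2}a^{r_1\dagger}_{\vec p_1}|0\rangle = -|\vec p_2, r_2; \vec p_1, r_1\rangle\,, \tag{5.162}$$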
due to (5.156). This confirms that the particles do satisfy Fermi-Dirac statistics as anticipated.
In particular, we have Pauli's exclusion principle, $|\vec p, r; \vec p, r\rangle = 0$ for all $\vec p$ and $r$. Finally, if one
wants to be sure about the spin of the particle, one could act with the angular momentum
operator $J^i = \epsilon^{ijk}\int d^3x\,(\mathcal{J}^0)^{jk}$ constructed from (5.94) to confirm that a stationary particle
$|\vec p = 0, r\rangle$ does indeed carry intrinsic angular momentum 1/2. This exercise, which is left to
the reader, will show that in the case of $|\vec p = 0, r\rangle$ only the third term in (5.94) will give a
non-vanishing contribution to the internal angular momentum.
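The algebra behind (5.156), (5.159), and (5.162) can be checked numerically on a toy system. The following Python sketch is an illustration and not part of the derivation above: it replaces the continuum of momentum modes by two discrete modes, represents the ladder operators by $4\times 4$ matrices via the Jordan-Wigner construction, and uses arbitrarily chosen energies `E`. All names (`a1`, `a2`, `anticomm`, `comm`, `vac`) are hypothetical.

```python
import numpy as np

# Single-mode building blocks in the basis {|0>, |1>} (illustrative toy model).
I2 = np.eye(2)
Z = np.diag([1.0, -1.0])              # parity operator (-1)^n
a = np.array([[0.0, 1.0],
              [0.0, 0.0]])            # annihilation operator: a|1> = |0>

# Two fermionic modes built with the Jordan-Wigner construction.
a1 = np.kron(a, I2)
a2 = np.kron(Z, a)
ops = [a1, a2]


def anticomm(x, y):
    """Anticommutator {x, y}."""
    return x @ y + y @ x


def comm(x, y):
    """Commutator [x, y]."""
    return x @ y - y @ x


# Vacuum |0> = |n1 = 0, n2 = 0>.
vac = np.zeros(4)
vac[0] = 1.0

# Toy Hamiltonian H = sum_i E_i a_i^dagger a_i with arbitrary energies.
E = [1.3, 2.7]
H = sum(Ei * op.T @ op for Ei, op in zip(E, ops))

# Two-particle states created in either order differ by a sign, cf. (5.162).
state_12 = a1.T @ a2.T @ vac
state_21 = a2.T @ a1.T @ vac

if __name__ == "__main__":
    print("{a1, a1^dagger} = 1:", np.allclose(anticomm(a1, a1.T), np.eye(4)))
    print("{a1, a2^dagger} = 0:", np.allclose(anticomm(a1, a2.T), 0))
    print("[H, a1^dagger] = E1 a1^dagger:", np.allclose(comm(H, a1.T), E[0] * a1.T))
    print("antisymmetry:", np.allclose(state_12, -state_21))
    print("Pauli exclusion:", np.allclose(a1.T @ a1.T @ vac, 0))
```

The matrices are real, so the transpose `.T` plays the role of the Hermitian conjugate. The factor `Z` in `a2` is what turns two commuting single-mode operators into anticommuting ones.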
Dirac's Hole Interpretation
Before discussing the propagator of the Dirac field, a historical remark seems to be in order.
Dirac originally viewed his equation (5.35) as a relativistic version of the Schrödinger equation,
considering $\psi$ as the wavefunction of a single particle with spin 1/2 (a fact which is put in by
hand in Dirac's theory). In order to reinforce this interpretation, he wrote (5.35) as
$$i\,\frac{\partial\psi}{\partial t} = \left(-i\,\vec\alpha\cdot\vec\nabla + m\beta\right)\psi \equiv H\psi\,, \tag{5.163}$$
with $\vec\alpha = \gamma^0\vec\gamma$ and $\beta = \gamma^0$. The operator $H$ appearing in the above equation is then
understood as the one-particle Hamiltonian. Notice that this viewpoint is quite different from
the one we held so far, where $\psi$ is a classical field that gets quantized. In Dirac's view, the
Hamiltonian is defined by (5.163), while for us the Hamiltonian is given by the field operator
(5.157). But for the moment let's stick to (5.163) and see where it led Dirac and where it leads us.
With the interpretation of $\psi$ as a single-particle wavefunction, the plane-wave solutions
(5.102) and (5.108) are thought of as energy eigenstates, satisfying
$$i\,\frac{\partial\psi}{\partial t} = i\,\frac{\partial}{\partial t}\,u(p)\,e^{-ipx} = E_{\vec p}\,u(p)\,e^{-ipx} = E_{\vec p}\,\psi\,, \tag{5.164}$$
and an analogous relation for $\psi = v(p)\,e^{ipx}$ with $E_{\vec p}$ replaced by $-E_{\vec p}$. The plane-wave solutions
thus look like positive and negative energy solutions. The spectrum is again unbounded from
below, because there are states $v(p)$ with arbitrarily low energy $-E_{\vec p}$. At first glance this is
disastrous, just like the unbounded field theory Hamiltonian of (5.139). Paul Dirac's ingenious
solution to this problem was to turn to the Pauli exclusion principle. In 1930, Dirac proposed
that in the true vacuum of the universe, all the negative energy states are filled, so that only
the positive energy states are accessible. The filled negative energy states are referred to as the
Dirac sea. Although you might worry about the infinite negative charge of the vacuum, Dirac
argued that only charge differences would be observable (a trick reminiscent of the normal
ordering prescription we use for field operators).
Having avoided the problem with the anomalous negative-energy quantum states by
introducing an infinite sea comprised of occupied negative energy states, Dirac realized that
his theory made a shocking prediction. Suppose that a negative energy state is excited to a
positive energy state, leaving behind a hole in the Dirac sea. The hole would have all the
properties of the electron, except it would carry positive charge. After flirting with the idea
that it may be the proton,$^{46}$ Dirac concluded that the hole is a new particle, the positron. It
took only a couple of years before the positron was discovered experimentally in 1932 by Carl
Anderson, with all the physical properties predicted for the Dirac hole.
Although Dirac's physical insight led him to the right answer, we now understand that
the interpretation of the Dirac equation as a single-particle wavefunction is not really correct.
E.g., Dirac's argument for antimatter relies crucially on the particles being fermions while, as
we have seen already in this course, antiparticles exist for both fermions and bosons. What
we really learn from Dirac's analysis is that there is no consistent way to interpret the Dirac
equation as a single-particle wavefunction. It is instead to be thought of as a classical field
which has only positive energy solutions, since the Hamiltonian (5.90) is positive definite.
Quantization of this field then gives rise to both particle and antiparticle excitations and
makes the vacuum the state in which no particles exist instead of an infinite sea of particles.
This picture is much more convincing, especially since it recaptures all the valid predictions of
the Dirac sea, such as electron-positron annihilation. On the other hand, the field formulation
does not eliminate all the difficulties raised by the Dirac sea. In particular, the problem of the
vacuum possessing infinite energy is still present.
Feynman Propagator
We now look at the anticommutator of the fields $\psi_\alpha(x)$ and $\bar\psi_\beta(y)$. Dropping the indices $\alpha$ and
$\beta$ from here on, we simply write
$$iS(x - y) = \{\psi(x), \bar\psi(y)\}\,. \tag{5.165}$$
$^{46}$ Robert Oppenheimer pointed out that an electron and its hole would be able to annihilate each other,
releasing energy on the order of the electron's rest energy in the form of energetic photons. If holes were
protons, stable atoms would thus not exist, which is clearly in contradiction with observations. Hermann Weyl
also noted that a hole should have the same mass as an electron, whereas the proton is about 2000 times
heavier.
Inserting the expansions (5.142), we essentially only have to repeat the calculation that led
to (5.143), to obtain
$$iS(x - y) = (i\slashed{\partial}_x + m)\,\big(D(x - y) - D(y - x)\big)\,, \tag{5.166}$$
where $D(x - y)$ is the propagator (3.114) of the real scalar field. The object $iS(x - y)$ is called
the fermionic propagator.
Some comments seem to be in order here. For space-like separated points $(x-y)^2 < 0$, we
have already seen in (3.117) that $D(x - y) - D(y - x) = 0$. In the bosonic theory, we made
a big deal out of this, since it ensured that $[\phi(x), \phi(y)] = 0$ for $(x - y)^2 < 0$, which we took
as a proof of causality. However, in the case of fermions we now have $\{\psi(x), \bar\psi(y)\} = 0$ for
$(x-y)^2 < 0$. What happened to causality? The best that we can say is that all our observables
are bi-linear in $\psi$ and $\bar\psi$, e.g., the Hamiltonian operator (5.157) or the momentum operator
(5.158). These objects still commute outside the light-cone. The theory remains causal as long
as individual fermionic operators are not observable. If you think this is a weak argument,
remember that no one has ever seen a physical device come back to minus itself when you
rotate it by $2\pi$! Notice furthermore that the propagator satisfies $(i\slashed{\partial}_x - m)S(x - y) = 0$, since
$(\partial_x^2 + m^2)D(x - y) = 0$ using the on-shell condition $p^2 = m^2$.
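The last statement rests on the operator identity $(i\slashed{\partial} - m)(i\slashed{\partial} + m) = -(\partial^2 + m^2)$, which in momentum space reads $(\slashed{p} - m)(\slashed{p} + m) = (p^2 - m^2)\,1$. The Python sketch below checks this numerically. It is only an illustration: the chiral representation of the Dirac matrices, the metric signature $(+,-,-,-)$, and the test values of `m`, `p_vec`, and `p_off` are assumptions chosen for the example, and the helper names (`slash`, `minkowski_sq`) are hypothetical.

```python
import numpy as np

# Pauli matrices and the chiral (Weyl) representation of the Dirac matrices.
s0 = np.eye(2, dtype=complex)
sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]
Z2 = np.zeros((2, 2), dtype=complex)

gamma = [np.block([[Z2, s0], [s0, Z2]])] + \
        [np.block([[Z2, s], [-s, Z2]]) for s in sig]
eta = np.diag([1.0, -1.0, -1.0, -1.0])    # metric (+,-,-,-)
I4 = np.eye(4)


def slash(p):
    """p-slash = gamma^mu p_mu for contravariant components p = (p^0, p^1, p^2, p^3)."""
    p_lower = eta @ np.asarray(p, dtype=float)
    return sum(g * pl for g, pl in zip(gamma, p_lower))


def minkowski_sq(p):
    """Invariant square p^2 = p^mu p_mu."""
    return p @ eta @ p


m = 0.7                                    # arbitrary test mass
p_vec = np.array([0.3, -1.1, 0.4])         # arbitrary 3-momentum
E = np.sqrt(p_vec @ p_vec + m**2)
p_on = np.concatenate(([E], p_vec))        # on-shell momentum, p^2 = m^2
p_off = np.array([2.0, 0.3, -1.1, 0.4])    # generic off-shell momentum

product_off = (slash(p_off) - m * I4) @ (slash(p_off) + m * I4)
product_on = (slash(p_on) - m * I4) @ (slash(p_on) + m * I4)

if __name__ == "__main__":
    print("off-shell: equals (p^2 - m^2) 1:",
          np.allclose(product_off, (minkowski_sq(p_off) - m**2) * I4))
    print("on-shell: vanishes:", np.allclose(product_on, 0))
```

On the mass shell the product vanishes, which is the momentum-space version of the statement that $S(x-y)$, built from $(i\slashed{\partial}_x + m)$ acting on solutions of the Klein-Gordon equation, is annihilated by the Dirac operator.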
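By a similar calculation to that above, we can determine the VEVs of the bi-linears,
$$\langle 0|\psi(x)\bar\psi(y)|0\rangle = \int\!\frac{d^3p}{(2\pi)^3}\,\frac{1}{2E_{\vec p}}\,(\slashed{p} + m)\,e^{-ip(x-y)}\,, \qquad \langle 0|\bar\psi(y)\psi(x)|0\rangle = \int\!\frac{d^3p}{(2\pi)^3}\,\frac{1}{2E_{\vec p}}\,(\slashed{p} - m)\,e^{ip(x-y)}\,, \tag{5.167}$$
which allows us to define the fermionic Feynman propagator $S_F(x-y)$, which is a $4\times 4$ matrix,
as the following time-ordered product
$$S_F(x-y) = \langle 0|T\psi(x)\bar\psi(y)|0\rangle = \theta(x^0 - y^0)\,\langle 0|\psi(x)\bar\psi(y)|0\rangle - \theta(y^0 - x^0)\,\langle 0|\bar\psi(y)\psi(x)|0\rangle\,, \tag{5.168}$$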
where the minus sign in front of the second term is crucial in the QFT of fermions. When
$(x - y)^2 < 0$, there is no invariant way to determine whether $x^0 > y^0$ or $x^0 < y^0$. In this case
the minus sign is necessary to make the two definitions agree since $\{\psi(x), \bar\psi(y)\} = 0$ outside
the light-cone.
In full analogy to the scalar case, there is also a 4-momentum integral representation for
the Feynman propagator. It reads
$$S_F(x - y) = i\int\!\frac{d^4p}{(2\pi)^4}\,e^{-ip(x-y)}\,\frac{\slashed{p} + m}{p^2 - m^2 + i\epsilon}\,, \tag{5.169}$$
and satisfies
$$(i\slashed{\partial}_x - m)\,S_F(x - y) = i\,\delta^{(4)}(x - y)\,, \tag{5.170}$$
which means that $S_F(x - y)$ is a Green's function of the Dirac operator.
The minus sign that we see in (5.168) also occurs for any string of operators inside any
time-ordered product. While bosonic operators commute inside $T$, fermionic operators
anticommute. We have this same behavior for normal-ordered products as well, with fermionic
operators receiving a minus sign when their order is changed. With the understanding that
all fermionic operators anticommute inside time- or normal-ordered products, Wick's theorem
proceeds just as in the bosonic case, which has been outlined in great detail in Section 4.4. In
the fermionic case, we define the contraction as
$$\overbracket{\psi(x)\bar\psi(y)} = T\psi(x)\bar\psi(y) - {:}\psi(x)\bar\psi(y){:} = S_F(x - y)\,. \tag{5.171}$$
Yukawa Theory
Based on the experience gained in Section 4.6, it is now straightforward to work out the
Feynman rules needed to calculate fermion correlation functions. Let us for definiteness consider
the case of the Yukawa theory,
$$\mathcal{L} = \frac{1}{2}(\partial_\mu\phi)^2 - \frac{1}{2}\mu^2\phi^2 + \bar\psi\,(i\slashed{\partial} - m)\,\psi - \lambda\,\bar\psi\psi\phi\,, \tag{5.172}$$
which describes the interaction of a scalar field $\phi$ with mass $\mu$ and a Dirac field $\psi$ with mass
$m$. Couplings of this type appear in the SM, between fermions and the Higgs boson, and
give mass to the fermionic dofs after electroweak symmetry breaking. In that context, the
fermions can be charged leptons (possibly neutrinos) or quarks. If you wish, (5.172) is thus the
proper version of the scalar Yukawa theory of (4.8). Notice, however, that there is an important
difference following from the dimensions of the involved fields. We still have $[\phi] = 1$, but the
kinetic term of the fermion requires that $[\psi] = 3/2$. Thus, unlike in the case with only scalars,
the coupling $\lambda$ is dimensionless, i.e., $[\lambda] = 0$.
In order to get a grip on the Feynman rules, let us study $\psi\psi \to \psi\psi$ scattering. This
is pretty much the same calculation we have already performed in Section 4.5. The only
minor modification is that now the particles that scatter have spin, while the nucleons $N$ we
considered earlier on are scalars. In analogy to (4.48) we write the initial and final states as
$$|i\rangle = \sqrt{4E_{\vec p_1}E_{\vec p_2}}\;a^{r_1\dagger}_{\vec p_1}a^{r_2\dagger}_{\vec p_2}|0\rangle = |\vec p_1, r_1; \vec p_2, r_2\rangle\,, \qquad |f\rangle = \sqrt{4E_{\vec q_1}E_{\vec q_2}}\;a^{s_1\dagger}_{\vec q_1}a^{s_2\dagger}_{\vec q_2}|0\rangle = |\vec q_1, s_1; \vec q_2, s_2\rangle\,. \tag{5.173}$$
Notice that for these states one has to be careful when one takes the adjoint since the fermionic
creation operators anticommute. E.g., the final-state bra is
$$\langle f| = \sqrt{4E_{\vec q_1}E_{\vec q_2}}\;\langle 0|\,a^{s_2}_{\vec q_2}a^{s_1}_{\vec q_1}\,. \tag{5.174}$$
To get a contribution to the scattering of two fermions, we have to calculate the $O(\lambda^2)$
corrections to the T-matrix element $i\langle f|T|i\rangle$. The relevant contribution to $iT$ takes the form
$$\frac{(-i\lambda)^2}{2}\int d^4x\,d^4y\;T\Big(\bar\psi(x)\psi(x)\phi(x)\;\bar\psi(y)\psi(y)\phi(y)\Big)\,, \tag{5.175}$$
where all fields are interaction-picture fields. Just like in the case of the bosonic calculation, the
contribution to $\psi\psi \to \psi\psi$ scattering comes from the term where the two $\phi$ fields are contracted,
$$D_F(x - y)\;{:}\bar\psi(x)\psi(x)\bar\psi(y)\psi(y){:}\,. \tag{5.176}$$
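[Diagrams: t-channel and u-channel scalar exchange, with incoming momenta $p_1$, $p_2$ and outgoing momenta $q_1$, $q_2$.]
Figure 5.1: Feynman diagrams contributing to $\psi\psi \to \psi\psi$ scattering at order $\lambda^2$.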
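We can now study the action of the fermionic operators on $|i\rangle$. By expanding the $\psi$
operators, but not the $\bar\psi$ fields, we find
$${:}\bar\psi(x)\psi(x)\bar\psi(y)\psi(y){:}\;a^{r_1\dagger}_{\vec p_1}a^{r_2\dagger}_{\vec p_2}|0\rangle = -\int\!\frac{d^3k_1\,d^3k_2}{(2\pi)^6}\,\big[\bar\psi(x)\,u^{t_1}(k_1)\big]\big[\bar\psi(y)\,u^{t_2}(k_2)\big]\,\frac{e^{-i(k_1x + k_2y)}}{\sqrt{4E_{\vec k_1}E_{\vec k_2}}}\;a^{t_1}_{\vec k_1}a^{t_2}_{\vec k_2}\,a^{r_1\dagger}_{\vec p_1}a^{r_2\dagger}_{\vec p_2}|0\rangle\,. \tag{5.177}$$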
Here the $b^{t_1\dagger}_{\vec k_1}$ and $b^{t_2\dagger}_{\vec k_2}$ terms in the expansion of $\psi$ have been ignored since they do not
contribute to the considered process at $O(\lambda^2)$, and the brackets indicate how the spinor indices are
contracted. Notice finally that the overall minus sign arises from moving $\psi(x)$ past $\bar\psi(y)$. By
anticommuting the annihilation operators past the creation operators and performing the
momentum integrations using the delta functions, we then get for the right-hand side of (5.177)
the following expression
$$\frac{1}{\sqrt{4E_{\vec p_1}E_{\vec p_2}}}\Big(\big[\bar\psi(x)\,u^{r_1}(p_1)\big]\big[\bar\psi(y)\,u^{r_2}(p_2)\big]\,e^{-i(p_1x + p_2y)} - \big[\bar\psi(x)\,u^{r_2}(p_2)\big]\big[\bar\psi(y)\,u^{r_1}(p_1)\big]\,e^{-i(p_1y + p_2x)}\Big)|0\rangle\,. \tag{5.178}$$
Note the minus sign between the two individual terms. We now let this expression act on $\langle f|$
from the right. Let us first have a look at what happens to the first term in (5.178). Ignoring
prefactors and exponentials, we have
$$\langle 0|\,a^{s_2}_{\vec q_2}a^{s_1}_{\vec q_1}\big[\bar\psi(x)\,u^{r_1}(p_1)\big]\big[\bar\psi(y)\,u^{r_2}(p_2)\big]|0\rangle = \frac{e^{i(q_1x + q_2y)}}{\sqrt{4E_{\vec q_1}E_{\vec q_2}}}\,\big(\bar u^{s_1}(q_1)u^{r_1}(p_1)\big)\big(\bar u^{s_2}(q_2)u^{r_2}(p_2)\big) - \frac{e^{i(q_1y + q_2x)}}{\sqrt{4E_{\vec q_1}E_{\vec q_2}}}\,\big(\bar u^{s_1}(q_1)u^{r_2}(p_2)\big)\big(\bar u^{s_2}(q_2)u^{r_1}(p_1)\big)\,. \tag{5.179}$$
In fact, the second term in (5.178) can be shown to give the same result up to a sign. Both
terms thus add, which cancels the factor of 1/2 in (5.175). Furthermore, the square roots of
energies in (5.179) cancel against the relativistic normalizations of the states (5.173). Putting
everything together and including the Feynman propagator of the $\phi$ field, we end up with
$$(-i\lambda)^2\int\!\frac{d^4x\,d^4y\,d^4k}{(2\pi)^4}\,\frac{i\,e^{ik(x-y)}}{k^2 - \mu^2 + i\epsilon}\,\Big(\big(\bar u^{s_1}(q_1)u^{r_1}(p_1)\big)\big(\bar u^{s_2}(q_2)u^{r_2}(p_2)\big)\,e^{i[(q_1 - p_1)x + (q_2 - p_2)y]} - \big(\bar u^{s_1}(q_1)u^{r_2}(p_2)\big)\big(\bar u^{s_2}(q_2)u^{r_1}(p_1)\big)\,e^{i[(q_2 - p_1)x + (q_1 - p_2)y]}\Big)\,. \tag{5.180}$$
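Performing the integrations over $x$ and $y$ and suppressing a factor $i(2\pi)^4$, which will end up
in $i\langle f|T|i\rangle = i(2\pi)^4\delta^{(4)}(p_1 + p_2 - q_1 - q_2)\,\mathcal{A}(\psi\psi \to \psi\psi)$, this becomes
$$(-i\lambda)^2\int\!\frac{d^4k}{k^2 - \mu^2 + i\epsilon}\,\Big(\big(\bar u^{s_1}(q_1)u^{r_1}(p_1)\big)\big(\bar u^{s_2}(q_2)u^{r_2}(p_2)\big)\,\delta^{(4)}(q_1 - p_1 + k)\,\delta^{(4)}(q_2 - p_2 - k) - \big(\bar u^{s_1}(q_1)u^{r_2}(p_2)\big)\big(\bar u^{s_2}(q_2)u^{r_1}(p_1)\big)\,\delta^{(4)}(q_2 - p_1 + k)\,\delta^{(4)}(q_1 - p_2 - k)\Big)\,, \tag{5.181}$$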
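from which we can immediately read off the result for the scattering amplitude
$$\mathcal{A}(\psi\psi \to \psi\psi) = (-i\lambda)^2\left(\frac{\big(\bar u^{s_1}(q_1)u^{r_1}(p_1)\big)\big(\bar u^{s_2}(q_2)u^{r_2}(p_2)\big)}{(p_1 - q_1)^2 - \mu^2 + i\epsilon} - \frac{\big(\bar u^{s_1}(q_1)u^{r_2}(p_2)\big)\big(\bar u^{s_2}(q_2)u^{r_1}(p_1)\big)}{(p_1 - q_2)^2 - \mu^2 + i\epsilon}\right). \tag{5.182}$$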
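The relative minus sign in (5.182) makes the amplitude antisymmetric under the exchange of the two final-state (or the two initial-state) fermions. The Python sketch below evaluates an amplitude of the form (5.182) numerically and checks this. It is a minimal illustration under stated assumptions: the chiral representation of the Dirac matrices, the spinor normalization $\bar u u = 2m$, centre-of-mass kinematics, the dropped $i\epsilon$, and all numerical values (`m`, `mu`, `lam`, `k`, `theta`) are choices made for the example, and the function names are hypothetical.

```python
import numpy as np

s0 = np.eye(2, dtype=complex)
sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]
Z2 = np.zeros((2, 2), dtype=complex)
gamma0 = np.block([[Z2, s0], [s0, Z2]])     # chiral (Weyl) representation


def four_momentum(p_vec, m):
    """On-shell 4-momentum (E, p) for a particle of mass m."""
    p_vec = np.asarray(p_vec, dtype=float)
    return np.concatenate(([np.sqrt(p_vec @ p_vec + m**2)], p_vec))


def mink_sq(p):
    """Invariant square with metric (+,-,-,-)."""
    return p[0]**2 - p[1:] @ p[1:]


def u_spinor(p_vec, m, xi):
    """u(p) = (sqrt(p.sigma) xi, sqrt(p.sigmabar) xi), using
    sqrt(p.sigma) = (p.sigma + m) / sqrt(2 (E + m)), valid for m > 0."""
    p = four_momentum(p_vec, m)
    p_sigma_vec = sum(pi * s for pi, s in zip(p[1:], sig))
    norm = np.sqrt(2.0 * (p[0] + m))
    upper = ((p[0] + m) * s0 - p_sigma_vec) @ xi / norm
    lower = ((p[0] + m) * s0 + p_sigma_vec) @ xi / norm
    return np.concatenate((upper, lower))


def ubar(u):
    """Dirac adjoint u^dagger gamma^0."""
    return u.conj() @ gamma0


def amplitude(p1, r1, p2, r2, q1, s1, q2, s2, m, mu, lam):
    """Tree-level amplitude of the form (5.182); the i*epsilon is dropped."""
    up1, up2 = u_spinor(p1, m, r1), u_spinor(p2, m, r2)
    ubq1, ubq2 = ubar(u_spinor(q1, m, s1)), ubar(u_spinor(q2, m, s2))
    P1, Q1, Q2 = four_momentum(p1, m), four_momentum(q1, m), four_momentum(q2, m)
    direct = (ubq1 @ up1) * (ubq2 @ up2) / (mink_sq(P1 - Q1) - mu**2)
    crossed = (ubq1 @ up2) * (ubq2 @ up1) / (mink_sq(P1 - Q2) - mu**2)
    return (-1j * lam)**2 * (direct - crossed)


# Example kinematics in the centre-of-mass frame (arbitrary numbers).
m, mu, lam, k, theta = 1.0, 0.5, 0.8, 0.9, 0.7
p1 = np.array([0.0, 0.0, k])
p2 = -p1
q1 = k * np.array([np.sin(theta), 0.0, np.cos(theta)])
q2 = -q1
up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)

A = amplitude(p1, up, p2, down, q1, up, q2, down, m, mu, lam)
A_final_swapped = amplitude(p1, up, p2, down, q2, down, q1, up, m, mu, lam)
A_initial_swapped = amplitude(p2, down, p1, up, q1, up, q2, down, m, mu, lam)

if __name__ == "__main__":
    print("A =", A)
    print("antisymmetric under final-state exchange:", np.isclose(A_final_swapped, -A))
    print("antisymmetric under initial-state exchange:", np.isclose(A_initial_swapped, -A))
```

The final-state antisymmetry holds algebraically for any momenta, while the initial-state antisymmetry additionally uses momentum conservation, $p_1 + p_2 = q_1 + q_2$, which the centre-of-mass kinematics above satisfies.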
Honestly, the derivation of the expression for $\mathcal{A}(\psi\psi \to \psi\psi)$ was a bit tedious. Can it be
done more easily? Yes, it can! Of course, the trick is again to use Feynman diagrams and
rules. The lowest-order Feynman graphs for the scattering of two fermions into two fermions
are shown in Figure 5.1. Staring at those diagrams as well as (5.182), it is, in fact, easy to
guess the Feynman rules that reproduce the final result for the scattering amplitude.
The relevant momentum-space Feynman rules involving fermions and antifermions turn out
to be:
1. For each fermion propagator carrying momentum $p$ one has $\dfrac{i\,(\slashed{p} + m)}{p^2 - m^2 + i\epsilon}$.
2. For each vertex one has $-i\lambda$.
3. For each external fermion with momentum $p$ one has $u^s(p)$ (initial state), $\bar u^s(p)$ (final state).
4. For each external antifermion with momentum $p$ one has $\bar v^s(p)$ (initial state), $v^s(p)$ (final state).
5. Impose momentum conservation at each vertex.
6. Integrate over each undetermined momentum, $\displaystyle\int\!\frac{d^4l}{(2\pi)^4}$.
7. Figure out the overall sign of the diagram.
The Feynman rule for the propagator of the scalar field (indicated by a dashed line) has
already been given in Section 4.6 and external scalar legs just give a trivial factor of 1.
Several comments regarding the above rules are in order. First, the direction of the
momentum on a fermion line is significant. On external lines, the direction of the momentum is
always ingoing (outgoing) for initial-state (final-state) particles. This follows from the
expansion of the operators $\psi$ and $\bar\psi$, where the annihilation (creation) operators are multiplied by
$\exp(-ipx)$ ($\exp(ipx)$) as can be seen from (5.142). On internal lines, represented by
propagators, the momentum must be assigned in the direction of the particle-number flow (for
electrons, this is the direction of the negative charge flow). It is conventional to draw arrows
on fermion lines to represent the direction of the particle-number flow. The momentum
assigned to a fermion then flows in the direction of this arrow, while in the case of an antifermion
particle-number and momentum flow are opposite to each other. Hence an additional arrow,
identifying the momentum flow, has been drawn next to the antifermion line.
Second, in the case of the Yukawa theory the $1/n!$ factor from the Taylor expansion of the
time-ordered exponential is always cancelled by the $n!$ ways of interchanging the vertices to
obtain the same contraction. In the case at hand there is thus no need for symmetry factors,
given that the fields in the interaction term $\bar\psi\psi\phi$ cannot replace each other in a contraction.
Third, the Dirac indices contract together along fermion lines. This happened in the case
of $\psi\psi \to \psi\psi$ scattering (5.182), but will also happen in more complicated diagrams, e.g., along
a fermion line carrying the momenta $p_1$, $p_2$, $p_3$, and $p_4$,
$$\bar u(p_4)\,\frac{i\,(\slashed{p}_3 + m)}{p_3^2 - m^2}\,\frac{i\,(\slashed{p}_2 + m)}{p_2^2 - m^2}\,u(p_1)\,. \tag{5.183}$$
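Fourth and finally, we should understand how to determine the correct overall sign of the
diagrams. Let us return to the case of fermion-fermion scattering (5.182). Here the t-channel
diagram has a plus sign, while the u-channel contribution receives a minus sign. Where
does the relative minus sign between the two graphs come from? Let us look at the Wick
contractions. For the contractions corresponding to the t-channel diagram in Figure 5.1, we
have
$$\langle 0|\,a^{s_2}_{\vec q_2}a^{s_1}_{\vec q_1}\;\bar\psi_x\psi_x\,\bar\psi_y\psi_y\;a^{r_1\dagger}_{\vec p_1}a^{r_2\dagger}_{\vec p_2}|0\rangle\,, \tag{5.184}$$
with $\bar\psi_x$ contracted with $a^{s_2}_{\vec q_2}$, $\bar\psi_y$ with $a^{s_1}_{\vec q_1}$, $\psi_y$ with $a^{r_1\dagger}_{\vec p_1}$, and $\psi_x$ with $a^{r_2\dagger}_{\vec p_2}$.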
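This contraction can be untangled by moving $\bar\psi_y = \bar\psi(y)$ two spaces to the left, and so one picks
up a factor $(-1)^2 = 1$. On the other hand, the contraction corresponding to the u-channel
diagram in Figure 5.1 reads
$$\langle 0|\,a^{s_2}_{\vec q_2}a^{s_1}_{\vec q_1}\;\bar\psi_x\psi_x\,\bar\psi_y\psi_y\;a^{r_1\dagger}_{\vec p_1}a^{r_2\dagger}_{\vec p_2}|0\rangle\,, \tag{5.185}$$
now with $\bar\psi_x$ contracted with $a^{s_1}_{\vec q_1}$, $\bar\psi_y$ with $a^{s_2}_{\vec q_2}$, $\psi_y$ with $a^{r_1\dagger}_{\vec p_1}$, and $\psi_x$ with $a^{r_2\dagger}_{\vec p_2}$.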
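Here we only have to move $\bar\psi_y$ one space to the left, giving a factor of $-1$. The relative minus
sign between the two diagrams is a reflection of the Fermi-Dirac statistics. In more complicated
graphs the overall sign can be determined most easily by noting that $(\bar\psi\psi)_x = \bar\psi(x)\psi(x)$, as
well as any other pair of fermions, commutes with any operator. Thus, e.g., contracting $\psi_x$ with
$\bar\psi_z$, $\psi_z$ with $\bar\psi_y$, and $\psi_y$ with $\bar\psi_w$ gives
$$\ldots(\bar\psi\psi)_x(\bar\psi\psi)_y(\bar\psi\psi)_z(\bar\psi\psi)_w\ldots = \ldots(+1)\,(\bar\psi\psi)_x(\bar\psi\psi)_z(\bar\psi\psi)_y(\bar\psi\psi)_w\ldots = \ldots S_F(x - z)\,S_F(z - y)\,S_F(y - w)\ldots\,, \tag{5.186}$$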
with $S_F(x - y)$ given in (5.169). Notice that in the case of the simplest closed fermion loop in
the Yukawa theory the latter prescription leads to
$$\text{(fermion loop)} = \overbracket{(\bar\psi\psi)_x(\bar\psi\psi)_y} = (-1)\,\mathrm{tr}\Big[\overbracket{\psi_y\bar\psi_x}\;\overbracket{\psi_x\bar\psi_y}\Big] = (-1)\,\mathrm{tr}\big[S_F(y - x)\,S_F(x - y)\big]\,. \tag{5.187}$$
Due to the cyclic property of the trace, changing the ordering of $S_F(y - x)$ and $S_F(x - y)$ of
course gives the same result. The result (5.187) extends straightforwardly to all closed fermion
lines. A fermion loop hence always gives a factor of $-1$ and the trace of the product of fermion
propagators that make up the loop. Equipped with the Feynman rules for the Yukawa theory,
we can now calculate the cross sections for some simple scattering processes. This is quite a
good exercise. You should try it!
5.6 Problems
i) Show explicitly that the Weyl representation (5.10) satisfies the Clifford algebra (5.7).
Derive the properties (5.15) and (5.16) of the matrices $S^{\mu\nu}$ introduced in (5.12).
Show that the term

,

, and

transforms as in (5.30), (5.31), and (5.32), i.e.,


it is a Lorentz scalar, vector, and tensor, respectively. It might be advantageous to look
at infinitesimal transformations and consider separately the transformation properties
of ,

, and

under the action of (5.6).


Calculate the transformation properties of (5.35) and (5.36) under Lorentz transformations.
You should find that these EOMs are form invariant.
Verify that the fifth Dirac matrix $\gamma_5$ defined as in (5.40) satisfies (5.41) and (5.42). Prove
that the chiral projectors $P_{L,R}$ introduced in (5.43) obey the relations (5.44).
ii) Prove the Gordon identity
$$\bar u(p')\,\gamma^\mu\,u(p) = \bar u(p')\left[\frac{(p' + p)^\mu}{2m} + \frac{i\sigma^{\mu\nu}q_\nu}{2m}\right]u(p)\,. \tag{5.188}$$
Here $q^\mu = (p' - p)^\mu$.
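iii) Use (5.7) to show that the following identities involving contractions of 4-dimensional
Dirac matrices are correct:
$$\gamma^\mu\gamma_\mu = 4\,, \qquad \gamma^\mu\gamma^\nu\gamma_\mu = -2\gamma^\nu\,, \qquad \gamma^\mu\gamma^\nu\gamma^\rho\gamma_\mu = 4\eta^{\nu\rho}\,, \qquad \gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma\gamma_\mu = -2\gamma^\sigma\gamma^\rho\gamma^\nu\,. \tag{5.189}$$
Employ the anticommutation relation (5.7) in combination with the cyclic property of
the trace to prove the following identities:
$$\begin{aligned}
\mathrm{tr}\,(1) &= 4\,,\\
\mathrm{tr}\,(\text{any odd number of Dirac matrices}) &= 0\,,\\
\mathrm{tr}\,(\gamma^\mu\gamma^\nu) &= 4\eta^{\mu\nu}\,,\\
\mathrm{tr}\,(\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma) &= 4\left(\eta^{\mu\nu}\eta^{\rho\sigma} - \eta^{\mu\rho}\eta^{\nu\sigma} + \eta^{\mu\sigma}\eta^{\nu\rho}\right),\\
\mathrm{tr}\,(\gamma_5) &= 0\,,\\
\mathrm{tr}\,(\gamma^\mu\gamma^\nu\gamma_5) &= 0\,,\\
\mathrm{tr}\,(\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma\gamma_5) &= -4i\,\epsilon^{\mu\nu\rho\sigma}\,.
\end{aligned} \tag{5.190}$$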
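Before attempting the proofs, the identities can be sanity-checked numerically in one explicit representation. The Python sketch below does this for the contraction identities and for those trace identities whose sign does not depend on the convention chosen for $\epsilon^{\mu\nu\rho\sigma}$. It is only an illustration and no substitute for the representation-independent proof asked for: the chiral representation, the metric $\eta = \mathrm{diag}(1,-1,-1,-1)$, the definition $\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3$, and all function names are assumptions of the example.

```python
import itertools
import numpy as np

s0 = np.eye(2, dtype=complex)
sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]
Z2 = np.zeros((2, 2), dtype=complex)

gamma = [np.block([[Z2, s0], [s0, Z2]])] + \
        [np.block([[Z2, s], [-s, Z2]]) for s in sig]
eta = np.diag([1.0, -1.0, -1.0, -1.0])
I4 = np.eye(4, dtype=complex)
gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]
gamma_low = [eta[mu, mu] * gamma[mu] for mu in range(4)]   # gamma_mu = eta_{mu nu} gamma^nu


def contract(*inner):
    """Return gamma^mu (product of the inner matrices) gamma_mu, summed over mu."""
    middle = I4
    for mat in inner:
        middle = middle @ mat
    return sum(gamma[mu] @ middle @ gamma_low[mu] for mu in range(4))


def contraction_deviation():
    """Largest violation of the four contraction identities."""
    dev = np.abs(contract() - 4 * I4).max()
    for nu, rho, sg in itertools.product(range(4), repeat=3):
        g_nu, g_rho, g_sg = gamma[nu], gamma[rho], gamma[sg]
        dev = max(dev,
                  np.abs(contract(g_nu) + 2 * g_nu).max(),
                  np.abs(contract(g_nu, g_rho) - 4 * eta[nu, rho] * I4).max(),
                  np.abs(contract(g_nu, g_rho, g_sg) + 2 * g_sg @ g_rho @ g_nu).max())
    return dev


def trace_deviation():
    """Largest violation of the trace identities that are free of epsilon conventions."""
    dev = max(abs(np.trace(I4) - 4), abs(np.trace(gamma5)))
    for mu, nu, rho, sg in itertools.product(range(4), repeat=4):
        g = [gamma[mu], gamma[nu], gamma[rho], gamma[sg]]
        four = 4 * (eta[mu, nu] * eta[rho, sg] - eta[mu, rho] * eta[nu, sg]
                    + eta[mu, sg] * eta[nu, rho])
        dev = max(dev,
                  abs(np.trace(g[0])),                               # one matrix
                  abs(np.trace(g[0] @ g[1] @ g[2])),                 # three matrices
                  abs(np.trace(g[0] @ g[1]) - 4 * eta[mu, nu]),
                  abs(np.trace(g[0] @ g[1] @ g[2] @ g[3]) - four),
                  abs(np.trace(g[0] @ g[1] @ gamma5)))
    return dev


if __name__ == "__main__":
    print("max deviation, contractions:", contraction_deviation())
    print("max deviation, traces:", trace_deviation())
```

The trace with four Dirac matrices and $\gamma_5$ is left out on purpose, since its overall sign depends on the convention for $\epsilon^{0123}$.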
iv) Products of Dirac bi-linears obey relations known as Fierz identities. The simplest of
these formulas reads
$$(\bar u_1\gamma^\mu P_L u_2)\,(\bar u_3\gamma_\mu P_L u_4) = -(\bar u_1\gamma^\mu P_L u_4)\,(\bar u_3\gamma_\mu P_L u_2)\,, \tag{5.191}$$
where $u_i$ with $i = 1, \ldots, 4$ are 4-component Dirac spinors (the momentum dependence
has been dropped here for simplicity) and $P_L$ is the left-handed projector introduced in
(5.43). In fact, there are similar rearrangement formulas for any product
$$\big(\bar u_1\Gamma^A u_2\big)\big(\bar u_3\Gamma^B u_4\big)\,. \tag{5.192}$$
Here $\Gamma^A$ and $\Gamma^B$ are any of the 16 combinations of Dirac matrices listed in (5.72). The
goal of this exercise is to derive these Fierz identities.
To begin with, normalize the 16 matrices $\Gamma^A$ such that
$$\mathrm{tr}\big[\Gamma^A\Gamma^B\big] = 4\,\delta^{AB}\,. \tag{5.193}$$
This gives $\Gamma^A = \{1, \gamma^0, i\gamma^j, \ldots\}$. Write down all elements of this set.
The general form of the Fierz identity is
$$\big(\bar u_1\Gamma^A u_2\big)\big(\bar u_3\Gamma^B u_4\big) = \sum_{M,N} C^{AB}_{MN}\,\big(\bar u_1\Gamma^M u_4\big)\big(\bar u_3\Gamma^N u_2\big)\,, \tag{5.194}$$
with unknown coefficients $C^{AB}_{MN}$. Using the completeness of the set $\Gamma^A$, show that
$$C^{AB}_{MN} = \frac{1}{16}\,\mathrm{tr}\big[\Gamma^M\Gamma^A\Gamma^N\Gamma^B\big]\,. \tag{5.195}$$
Employing (5.194) and (5.195) prove (5.191). In addition work out the explicit Fierz
transformation of the product $(\bar u_1 u_2)(\bar u_3 u_4)$.
v) In Section 4.8 we saw that the Yukawa potential for $NN \to NN$ scattering is attractive.
Repeat the calculation for $\psi\psi \to \psi\psi$, $\psi\bar\psi \to \psi\bar\psi$, and $\bar\psi\bar\psi \to \bar\psi\bar\psi$ scattering in the
non-relativistic limit. You might want to use (5.120) to (5.122).
If you understood how to calculate the Yukawa potential, the derivation of the Coulomb
potential, which encodes the interactions of electrons/positrons and the photon field
$A_\mu$ in the non-relativistic limit, is also not difficult. Consider again the three different
cases of particle-particle, particle-antiparticle, and antiparticle-antiparticle scattering.
The QED Feynman rules for the photon ($\gamma$) propagator and vertex between electron
($Q = -1$) and positron ($Q = 1$) are given by
=
ig

p
2
+i
,
p

= iQe

e
e
The propagator of a tensor boson, such as the graviton (G), i.e., the force carrier of
gravity, looks like
=
1
2
_
(g

)(g

) + (g

)(g

)
_
i
p
2
+i
.
p

G
Can you derive from this result the orientation of the gravitational force?
vi) The Furry theorem states that the sum of all Feynman graphs in QED with an odd
number of external photons (off or on the photon mass shell) and no other external lines
vanishes. In order to prove Furry's theorem, consider ($n = 0, 1, \ldots$)
$$\langle\Omega|\,T\,j_V^{\mu_1}(x_1)\cdots j_V^{\mu_{2n+1}}(x_{2n+1})\,|\Omega\rangle\,, \tag{5.196}$$
and show by invoking symmetry arguments that a matrix element of this form vanishes.
Here $|\Omega\rangle$ denotes the true vacuum of the interacting theory and $j_V^\mu(x)$ is the vector
current introduced in (5.96).
vii) The goal of this exercise is to introduce the spinor method and to derive some identities
that will be very useful to calculate scattering amplitudes in the high-energy limit, where
the involved particles can be treated as massless.
Derive an explicit solution for the Dirac equation
$$\slashed{p}\,u(p) = 0\,, \tag{5.197}$$
of a massless fermion. To do so, write out $\slashed{p}$ using the basis
$$\gamma^0 = \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}, \qquad \gamma^i = \begin{pmatrix} 0 & \sigma^i\\ -\sigma^i & 0 \end{pmatrix}, \qquad \gamma_5 = \begin{pmatrix} -1 & 0\\ 0 & 1 \end{pmatrix}, \tag{5.198}$$
of Dirac matrices. To keep the notation compact you might want to introduce
$$p^\pm = p^0 \pm p^3\,, \qquad e^{\pm i\varphi_p} = \frac{p^1 \pm ip^2}{\sqrt{(p^1)^2 + (p^2)^2}}\,. \tag{5.199}$$
Use the projection operators $P_{L,R}$ in the basis (5.198) to decompose the original solution
into two helicity solutions
$$u_\pm(p) = P_{R,L}\,u(p)\,. \tag{5.200}$$
Give the explicit form of $u_\pm$ and $\bar u_\pm$. Show that $u_\pm^\dagger u_\pm = 2p^0$, which fixes the
normalization of the spinors. Relate $u_+$ ($\bar u_+$) with $(\bar u_-)^T$ ($(u_-)^T$) using $\gamma^0$ and $\gamma^2$. What is the
physics behind these relations? So far we have only talked about the positive-energy
solutions $u$. How do the negative-energy solutions $v$ fit into the picture? In particular,
how are $u_\pm$ ($\bar u_\pm$) and $v_\mp$ ($\bar v_\mp$) related in the case of a massless fermion?
Consider now a set of massless momenta $p_i$ with $i = 1, 2, \ldots, n$. We introduce a bra and
ket notation with the spinor labelled by the index $i$ corresponding to the momentum $p_i$,
$$|i^\pm\rangle = u_\pm(p_i)\,, \qquad \langle i^\pm| = \bar u_\pm(p_i)\,. \tag{5.201}$$
The basic spinor products are defined as
$$\langle ij\rangle = \langle i^-|j^+\rangle\,, \qquad [ij] = \langle i^+|j^-\rangle\,. \tag{5.202}$$
What happens to $\langle i^\pm|j^\pm\rangle$? Show the antisymmetry of the spinor products, i.e.,
$$\langle ij\rangle = -\langle ji\rangle\,, \qquad [ij] = -[ji]\,, \tag{5.203}$$
by using either the explicit expressions for $u_\pm$ and $\bar u_\pm$ you have derived earlier or the
charge conjugation properties of the spinors.
For the case when both energies are positive, i.e., $p_i^0 > 0$ and $p_j^0 > 0$, derive analytic
expressions for the spinor products (5.202). Express your result through
$$s_{ij} = (p_i + p_j)^2 = 2\,p_i\cdot p_j\,, \tag{5.204}$$
and
$$\cos\varphi_{ij} = \frac{p_i^1p_j^+ - p_j^1p_i^+}{\sqrt{|s_{ij}|\,p_i^+p_j^+}}\,, \qquad \sin\varphi_{ij} = \frac{p_i^2p_j^+ - p_j^2p_i^+}{\sqrt{|s_{ij}|\,p_i^+p_j^+}}\,. \tag{5.205}$$
So what is the connection between spinor products and Lorentz products of momenta?
Use your explicit result to show that the two types of spinor products are related by
complex conjugation,
$$\langle ij\rangle^* = [ji]\,. \tag{5.206}$$
Since spinor products should have simple properties under crossing symmetry, one defines
the spinor product $\langle ij\rangle$ for negative energies by analytic continuation from the
positive-energy case, but with $p_{i,j}$ replaced by $-p_{i,j}$ if $p^0_{i,j} < 0$. The spinor product $[ij]$ is then
defined through the identity
$$\langle ij\rangle[ji] = \mathrm{tr}\big[P_L\,\slashed{p}_i\,\slashed{p}_j\big] = s_{ij}\,. \tag{5.207}$$
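For positive energies the relation (5.207), together with (5.206), can be cross-checked numerically before deriving the analytic expressions. The Python sketch below builds massless helicity spinors in the basis (5.198) and verifies $\langle ij\rangle[ji] = 2\,p_i\cdot p_j$. It is a hedged illustration, not the solution of the exercise: the spinors are obtained from a numerical eigendecomposition, so their overall phases are arbitrary and do not follow the conventions (5.199) and (5.205). Only phase-independent statements are therefore checked, and all function names and the test momenta are hypothetical.

```python
import numpy as np

s0 = np.eye(2, dtype=complex)
sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]
Z2 = np.zeros((2, 2), dtype=complex)
gamma0 = np.block([[Z2, s0], [s0, Z2]])


def helicity_spinors(p_vec):
    """Massless u_+(p) and u_-(p) in the basis (5.198), normalized to u^dagger u = 2 p^0."""
    p_vec = np.asarray(p_vec, dtype=float)
    energy = np.linalg.norm(p_vec)
    p_hat_sigma = sum(pi * s for pi, s in zip(p_vec / energy, sig))
    _, vecs = np.linalg.eigh(p_hat_sigma)        # eigenvalues in ascending order: -1, +1
    chi_minus, chi_plus = vecs[:, 0], vecs[:, 1]
    zero = np.zeros(2, dtype=complex)
    u_plus = np.sqrt(2 * energy) * np.concatenate((zero, chi_plus))    # P_R keeps the lower block
    u_minus = np.sqrt(2 * energy) * np.concatenate((chi_minus, zero))  # P_L keeps the upper block
    return u_plus, u_minus


def ubar(u):
    """Dirac adjoint u^dagger gamma^0."""
    return u.conj() @ gamma0


def angle(p_i, p_j):
    """<ij> = ubar_-(p_i) u_+(p_j)."""
    return ubar(helicity_spinors(p_i)[1]) @ helicity_spinors(p_j)[0]


def square(p_i, p_j):
    """[ij] = ubar_+(p_i) u_-(p_j)."""
    return ubar(helicity_spinors(p_i)[0]) @ helicity_spinors(p_j)[1]


def s_inv(p_i, p_j):
    """s_ij = 2 p_i . p_j for massless momenta with positive energy."""
    p_i, p_j = np.asarray(p_i, dtype=float), np.asarray(p_j, dtype=float)
    return 2 * (np.linalg.norm(p_i) * np.linalg.norm(p_j) - p_i @ p_j)


# Two arbitrary massless momenta (3-vectors; the energy is their length).
k1 = np.array([0.3, -1.2, 0.8])
k2 = np.array([-0.5, 0.4, 2.1])

if __name__ == "__main__":
    print("<12>[21] =", angle(k1, k2) * square(k2, k1))
    print("s_12     =", s_inv(k1, k2))
```

Because $[ji]$ is the complex conjugate of $\langle ij\rangle$ when both are built from the same spinors, the product is real and non-negative, and the arbitrary phases drop out.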
Consider now the spinor string
$$[i|\gamma^\mu|j\rangle = \bar u_+(p_i)\,\gamma^\mu\,u_+(p_j)\,, \tag{5.208}$$
a quantity that naturally appears as the current describing the emission of a vector
boson from a right-handed massless fermion line. Notice that the helicity labels on the
spinors can always be suppressed in favor of angle or square brackets as in the spinor
products. So one has $\langle i| = \langle i^-|$, $[i| = \langle i^+|$, $|i\rangle = |i^+\rangle$, and $|i] = |i^-\rangle$. Show the charge
conjugation property of the current
$$[i|\gamma^\mu|j\rangle = \langle j|\gamma^\mu|i]\,. \tag{5.209}$$
Prove that
$$|i\rangle[i| = P_R\,\slashed{p}_i\,, \qquad |i]\langle i| = P_L\,\slashed{p}_i\,, \tag{5.210}$$
and use these projection operators to show the correctness of the Gordon identity
$$[i|\gamma^\mu|i\rangle = \langle i|\gamma^\mu|i] = 2p_i^\mu\,. \tag{5.211}$$
Show by the use of (5.208) that
$$|i\rangle\langle j| - |j\rangle\langle i| = \langle ji\rangle\,P_R\,, \tag{5.212}$$
holds and derive from this relation the Schouten identity
$$\langle ij\rangle\langle kl\rangle = \langle ik\rangle\langle jl\rangle + \langle il\rangle\langle kj\rangle\,. \tag{5.213}$$
The same identity applies when angle brackets are replaced by square brackets. In
explicit calculations (5.208) is a powerful tool since its application can lead to enormous
algebraic simplifications.
Use the Fierz transformation
$$(\gamma^\mu P_R)_{ij}\,(\gamma_\mu P_L)_{kl} = 2\,(P_L)_{il}\,(P_R)_{kj}\,, \tag{5.214}$$
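to show the simple relation
$$\langle i|\gamma^\mu|j]\,[k|\gamma_\mu|l\rangle = 2\,\langle il\rangle[kj]\,, \tag{5.215}$$
as well as the Fierz identity
$$\gamma^\mu\,[i|\gamma_\mu|j\rangle = 2\,\big(|i]\langle j| + |j\rangle[i|\big)\,. \tag{5.216}$$
A similar relation holds for $\gamma^\mu\,\langle i|\gamma_\mu|j]$.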
viii) The goal of this exercise is to calculate the squared tree-level matrix elements of the
processes $d\bar u \to e^-\bar\nu_e$ and $d\bar u \to e^-\bar\nu_e g$ using the spinor formalism developed above.
The first process $d\bar u \to e^-\bar\nu_e$ describes the production of a massive $W^-$ boson from
the collision of a down ($d$) and an antiup quark ($\bar u$) and the subsequent decay of the
$W^-$ boson into an electron ($e^-$) and an electron antineutrino ($\bar\nu_e$). Draw the relevant
Feynman diagram and write down the corresponding amplitude in spinor notation using
the Feynman rules:
=
ig

p
2
M
2
W
+i
,
p

W
= i
g
w

2
V
ud

P
L
,
u
d
W
= i
g
w

P
L
.

e
e

W
Here $M_W$ denotes the mass of the $W^\pm$ boson, $g_w$ is the weak gauge coupling, and
$V_{ud} \approx 1$ is the complex 11 element of the Cabibbo-Kobayashi-Maskawa (CKM) matrix,
which describes quark mixing in the SM. Notice that the $W^\pm$ boson only couples to the
left-handed component of the quark and lepton fields. The deeper significance of this
property will become clear once you learn more about the SM of particle physics.
Simplify your result for the amplitude using the charge conjugation property of the
current (5.209) and the Fierz identity (5.215). Calculate the squared matrix element
and express your result in terms of scalar products of momenta.
The second process is similar to the first one, but more complicated since it involves
the emission of an additional gluon ($g$) from one of the external quark legs. Draw the
possible Feynman graphs for $d\bar u \to e^-\bar\nu_e g$ at tree level and write down the amplitude.
In addition to the Feynman rules given already you will need:
= i g
s
T
a

.
q
q
g
=

(p) (initial state) ,


p
=

(p) (nal state) .


p

Here $g_s$ is the coupling constant of QCD and $T^a$ with $a = 1, \ldots, 8$ are the generators of
the associated gauge group, i.e., $SU(3)_c$. The symbol $\epsilon_\mu(p)$ stands for the polarization
vector of the initial- or final-state gluon.
In order to calculate the squared matrix element for $d\bar u \to e^-\bar\nu_e g$, we also need to
introduce a spinor representation for the polarization vector for gluons with definite
helicity $a = \pm$,

(p, ) =
[p[

[)

2p)
,

(p, ) =
p[

[]

2[p]
, (5.217)
where p is the gluon momentum and is an auxiliary massless vector, called the refer-
ence momentum, reecting the freedom of on-shell gauge transformations. The objects
introduced in (5.217) have the following properties. Since p / [p

) = 0, the polarization
vector

(p, ) is transverse to p, i.e.,

(p, ) p = 0 , (5.218)
for any choice of with p ,= 0. Complex conjugation acts on the polarization vectors
like
_

(p, )
_

(p, ) , (5.219)
and they are normalized as follows

(p, )
_

(p, )
_

= 1 ,

(p, )
_

(p, )
_

= 0 . (5.220)
They also fulfill a completeness relation, which reads

a=

(p, ) (
a

(p, ))

+
p

p
. (5.221)
Equipped with the definition and the properties of the polarization vectors you can now
actually calculate the matrix element for $W^- + g$ production. Consider the case of the
emission of a positive and negative helicity gluon separately and keep the gauge vector
arbitrary. Use the charge conjugation, Schouten, and Fierz identities, (5.209), (5.213),
and (5.215), as well as the projection operators (5.210), to reduce both amplitudes to
combinations of basic spinor products (5.202). Also employ momentum conservation,
$\sum_{i=1}^{n} p_i^\mu = 0$, which leads to the identity
$$\sum_{i=1,\; i\neq j,k}^{n} [ji]\,\langle ik\rangle = 0\,. \tag{5.222}$$
It is important that your final result for the helicity amplitudes is independent of the
choice of the gauge vector. Could you have obtained your results far more simply by a
specific choice of the gauge vector?
Square the $d\bar u \to e^-\bar\nu_e g$ amplitude and simplify your answer as much as possible. In the
fundamental representation the generators $T^a$ fulfill $\mathrm{tr}\big[T^aT^b\big] = T_F\,\delta^{ab}$ with $T_F = 1/2$
and $T^aT^a = C_F$ where $C_F = (N_c^2 - 1)/(2N_c) = 4/3$ for $N_c = 3$ corresponding to the
QCD gauge group $SU(3)_c$.
ix) Heavy quark effective theory (HQET) is an effective field theory designed to systematically
exploit the simplifications of the interactions of QCD in the heavy-quark limit for
the case of hadrons containing a single heavy quark such as the $B$ and $D$ meson. The
first goal of this exercise is to derive the interactions of a heavy quark with the light dofs
starting from the Lagrangian
$$\mathcal{L} = \bar Q\,(i\slashed{D} - m_Q)\,Q\,, \tag{5.223}$$
where $Q$ is a Dirac spinor representing the heavy quark of mass $m_Q$ and
$$D_\mu \equiv \partial_\mu - ig_s\,T^aG^a_\mu\,, \tag{5.224}$$
is the covariant derivative, which describes the minimal coupling of quarks to a gluon.
It depends on the QCD coupling constant $g_s$, the gluon fields $G^a_\mu$ with $a = 1, \ldots, 8$, and
the generators $T^a$ of $SU(3)_c$. The obtained effective description will not only allow us to
show that the Lagrangian (5.223) has a spin-flavor symmetry in the limit $m_Q \to \infty$, but
also provides a systematic and rigorous way to obtain corrections to the infinite mass
limit.
To warm up, solve the free Dirac equation for a heavy quark at rest. Use the decomposition
$$Q(x) = e^{-im_Qt}\,Q(0)\,, \tag{5.225}$$
and plug it into (5.35). What do you observe?
The heavy-quark momentum $p^\mu$ can always be decomposed as
$$p^\mu = m_Q v^\mu + k^\mu , \tag{5.226}$$
where $v^\mu$ is the 4-velocity of the heavy hadron. Once $m_Q v^\mu$, the large kinematical part of
the momentum, is singled out, the remaining component $k^\mu$ is determined by soft QCD
bound-state interactions, and thus $k^2 \ll m_Q^2$. In order to work in an arbitrary frame
one defines
$$P_\pm = \frac{1 \pm \slashed{v}}{2} , \tag{5.227}$$
with $\slashed{v} \slashed{v} = v^2 = 1$. Show that $P_\pm$ are projection operators and find the explicit form of
them in the rest frame.
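A numerical cross-check of the projector properties of (5.227) can be useful alongside the algebraic proof. The sketch below assumes the Dirac representation of the $\gamma$ matrices and the metric signature $(+,-,-,-)$; both are assumptions made for illustration, and the sample velocity is arbitrary apart from satisfying $v^2 = 1$.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def gamma_spatial(s):
    """Dirac representation: gamma^i = [[0, sigma^i], [-sigma^i, 0]]."""
    g = [[0] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            g[i][j + 2] = s[i][j]
            g[i + 2][j] = -s[i][j]
    return g

GAMMA = [[[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]]
GAMMA += [gamma_spatial(s) for s in SIGMA]

def slash(a):
    """a-slash = a^0 gamma^0 - a^i gamma^i for contravariant components a^mu."""
    signs = [1, -1, -1, -1]
    return [[sum(signs[m] * a[m] * GAMMA[m][i][j] for m in range(4))
             for j in range(4)] for i in range(4)]

def projector(v, sign):
    """P_(+/-) = (1 +/- v-slash) / 2."""
    vs = slash(v)
    return [[(I4[i][j] + sign * vs[i][j]) / 2 for j in range(4)] for i in range(4)]

def max_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(4) for j in range(4))

# Boosted 4-velocity with v^2 = 1 (beta = 0.6 along the direction (1, 2, 2)/3).
v = (1.25, 0.25, 0.5, 0.5)
P_plus, P_minus = projector(v, +1), projector(v, -1)
zero = [[0] * 4 for _ in range(4)]
total = [[P_plus[i][j] + P_minus[i][j] for j in range(4)] for i in range(4)]

checks = {
    "P+ P+ = P+": max_diff(matmul(P_plus, P_plus), P_plus),
    "P- P- = P-": max_diff(matmul(P_minus, P_minus), P_minus),
    "P+ P- = 0": max_diff(matmul(P_plus, P_minus), zero),
    "P+ + P- = 1": max_diff(total, I4),
}
print(checks)

# Rest frame, v = (1, 0, 0, 0): P+ keeps the upper two components in this representation.
rest_plus = projector((1, 0, 0, 0), +1)
print(rest_plus)
```

In a different representation of the $\gamma$ matrices the rest-frame projectors are not diagonal, so the statement about upper and lower components is specific to the assumed Dirac representation.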
Remove the large-frequency part of the $x$-dependence in $Q(x)$ resulting from the large
momentum $m_Q v^\mu$ by plugging
$$\begin{aligned}
Q(x) &= e^{-i m_Q v \cdot x} \, \tilde{Q}(x) \\
&= e^{-i m_Q v \cdot x} \left[ P_+ \tilde{Q}(x) + P_- \tilde{Q}(x) \right] \\
&= e^{-i m_Q v \cdot x} \left[ h_v(x) + H_v(x) \right] ,
\end{aligned} \tag{5.228}$$
into the Lagrangian (5.223). Notice that (5.228) is the covariant generalization of
decomposing $Q(x)$ into upper and lower components. Why?
To decouple the simplified Dirac equation multiply it by the projection operators and
use
$$P_\pm \slashed{a} = \slashed{a}_\perp P_\mp \pm v \cdot a \, P_\pm , \tag{5.229}$$
where $a_\perp^\mu = a^\mu - v \cdot a \, v^\mu$ for any 4-vector $a^\mu$. From the two resulting equations derive a
relation between $H_v(x)$ and $h_v(x)$ valid up to terms of $O(1/m_Q^2)$.
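The commutation rule (5.229) follows from $\slashed{v} \slashed{a} = 2 \, v \cdot a - \slashed{a} \slashed{v}$ and $\slashed{v} P_\mp = \mp P_\mp$. As a sanity check, the sketch below evaluates both sides for one arbitrary real 4-vector $a^\mu$ and a boosted velocity. The Dirac representation, the metric $(+,-,-,-)$, and the sample vectors are assumptions chosen for illustration.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def gamma_spatial(s):
    """Dirac representation: gamma^i = [[0, sigma^i], [-sigma^i, 0]]."""
    g = [[0] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            g[i][j + 2] = s[i][j]
            g[i + 2][j] = -s[i][j]
    return g

GAMMA = [[[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]]
GAMMA += [gamma_spatial(s) for s in SIGMA]

def dot(a, b):
    """Minkowski product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def slash(a):
    signs = [1, -1, -1, -1]
    return [[sum(signs[m] * a[m] * GAMMA[m][i][j] for m in range(4))
             for j in range(4)] for i in range(4)]

def projector(v, sign):
    vs = slash(v)
    return [[(I4[i][j] + sign * vs[i][j]) / 2 for j in range(4)] for i in range(4)]

def max_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(4) for j in range(4))

v = (1.25, 0.25, 0.5, 0.5)       # v^2 = 1
a = (0.3, -1.1, 0.7, 2.0)        # arbitrary test vector
va = dot(v, a)
a_perp = tuple(a[m] - va * v[m] for m in range(4))  # a_perp^mu = a^mu - (v.a) v^mu

residual = {}
for sign in (+1, -1):
    lhs = matmul(projector(v, sign), slash(a))
    first = matmul(slash(a_perp), projector(v, -sign))
    same = projector(v, sign)
    rhs = [[first[i][j] + sign * va * same[i][j] for j in range(4)]
           for i in range(4)]
    residual[sign] = max_diff(lhs, rhs)
print(residual)  # both entries numerically zero
```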
Employ the relation between $H_v(x)$ and $h_v(x)$ to eliminate the field $H_v(x)$ from the
system of equations. Using
$$[D_\mu, D_\nu] = -i g_s G_{\mu\nu} , \tag{5.230}$$
find the final form of the EOM of the heavy-quark field $h_v(x)$. In (5.230), we have
introduced the QCD field strength tensor $G_{\mu\nu} = G^a_{\mu\nu} T^a$. In its explicit form the field
strength is given by $G^a_{\mu\nu} \equiv \partial_\mu G^a_\nu - \partial_\nu G^a_\mu + g_s f^{abc} G^b_\mu G^c_\nu$ with $[T^a, T^b] = i f^{abc} T^c$, where
$f^{abc}$ are the fully antisymmetric structure constants.
Write down the Lagrangian that leads to the EOM for $h_v(x)$ including $O(1/m_Q)$ terms.
Discuss the spin and flavor properties of the leading term and the power corrections in
the $1/m_Q$ expansion. By going to the heavy-quark rest frame determine the physical
meaning of the two $O(1/m_Q)$ corrections. Explain the appearance of the spin and flavor
symmetry (and its breaking) in physical terms. Compare your findings for heavy-light
meson systems with the physics of the hydrogen atom. Point out similarities/differences.
Derive the Feynman rules for the heavy-quark propagator, once starting from the HQET
Lagrangian and once by expanding the propagator of the free Dirac theory. Give also
the Feynman rule for the interaction of the heavy quark with the gluon.
The masses of the vector and pseudoscalar $B$ and $D$ mesons are experimentally determined
to be $M_{B^*} = 5.33$ GeV, $M_B = 5.28$ GeV and $M_{D^*} = 2.00$ GeV, $M_D = 1.86$ GeV,
respectively. These numbers imply that
$$M_{B^*}^2 - M_B^2 = 0.53 \ \mathrm{GeV}^2 , \qquad M_{D^*}^2 - M_D^2 = 0.54 \ \mathrm{GeV}^2 , \tag{5.231}$$
which suggests that the difference between the square of a heavy-light vector meson mass
and the square of a heavy-light pseudoscalar meson mass is a constant. Can you explain
this behavior qualitatively using the heavy-quark symmetries you have derived above?
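The arithmetic behind (5.231) is easy to reproduce. The short sketch below uses only the four masses quoted above; the dictionary keys and the helper name are illustrative choices.

```python
# Meson masses in GeV as quoted in the text (vector states carry a star).
masses = {"B*": 5.33, "B": 5.28, "D*": 2.00, "D": 1.86}

def mass_squared_splitting(vector, pseudoscalar):
    """Return M_V^2 - M_P^2 in GeV^2."""
    return masses[vector] ** 2 - masses[pseudoscalar] ** 2

split_B = mass_squared_splitting("B*", "B")
split_D = mass_squared_splitting("D*", "D")
print(round(split_B, 2), round(split_D, 2))  # 0.53 0.54
```

Writing $M_V^2 - M_P^2 = (M_V - M_P)(M_V + M_P)$ shows that a roughly constant difference of squares corresponds to a mass splitting $M_V - M_P$ that falls roughly like the inverse of the meson mass.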
References
[1] S. P. Martin, arXiv:hep-ph/9709356.
[2] O. W. Greenberg, Phys. Rev. Lett. 89, 231602 (2002) [arXiv:hep-ph/0201258].
[3] V. A. Kostelecky and N. Russell, arXiv:0801.0287 [hep-ph].
[4] T. D. Lee and C. N. Yang, Phys. Rev. 104 (1956) 254.
[5] C. S. Wu, E. Ambler, R. W. Hayward, D. D. Hoppes and R. P. Hudson, Phys. Rev. 105,
1413 (1957).
[6] J. H. Christenson, J. W. Cronin, V. L. Fitch and R. Turlay, Phys. Rev. Lett. 13, 138
(1964).
[7] M. Kobayashi and T. Maskawa, Prog. Theor. Phys. 49, 652 (1973).
[8] A. D. Sakharov, Pisma Zh. Eksp. Teor. Fiz. 5 (1967) 32 [JETP Lett. 5 (1967) 24] [Sov.
Phys. Usp. 34 (1991) 392] [Usp. Fiz. Nauk 161 (1991) 61].
[9] M. Dine, arXiv:hep-ph/0011376.
[10] R. D. Peccei and H. R. Quinn, Phys. Rev. Lett. 38, 1440 (1977).
[11] S. L. Adler, Phys. Rev. 177, 2426 (1969).
[12] J. S. Bell and R. Jackiw, Nuovo Cim. A 60, 47 (1969).
152
[13] S. L. Adler and W. A. Bardeen, Phys. Rev. 182, 1517 (1969).
[14] S. L. Adler, arXiv:hep-th/0405040.
[15] S. M. Carroll, The Cosmological Constant, Living Rev. Relativity 3, 1 (2001),
http://relativity.livingreviews.org/Articles/lrr-2001-1
153