
Nuclear Physics B 786 (2007) 1–25
Higher-spin Chern–Simons theories in odd dimensions

Johan Engquist, Olaf Hohm
Institute for Theoretical Physics and Spinoza Institute, Utrecht University, 3508 TD Utrecht, The Netherlands
Received 4 June 2007; received in revised form 20 June 2007; accepted 21 June 2007
Available online 28 June 2007
Abstract
We construct consistent bosonic higher-spin gauge theories in odd dimensions $D > 3$ based on Chern–Simons forms. The gauge groups are infinite-dimensional higher-spin extensions of the anti-de Sitter groups $SO(D-1,2)$. We propose an invariant tensor on these algebras, which is required for the definition of the Chern–Simons action. The latter contains the purely gravitational Chern–Simons theories constructed by Chamseddine, and so the entire theory describes a consistent coupling of higher-spin fields to a particular form of Lovelock gravity. It contains topological as well as non-topological phases. Focusing on $D = 5$, we consider as an example of the latter an $AdS_4 \times S^1$ Kaluza–Klein background. By solving the higher-spin torsion constraints in the case of a spin-3 field, we verify explicitly that the equations of motion reduce in the linearization to the compensator form of the Fronsdal equations on $AdS_4$.
© 2007 Elsevier B.V. All rights reserved.
1. Introduction
The construction of theories describing consistently interacting higher-spin fields is of great interest for several reasons. For one thing, string theory contains an infinite tower of massive higher-spin states, and it is an old idea that these hint at a spontaneously broken phase of a theory with a huge hidden gauge symmetry, thus extending the geometrical framework of Einstein's theory [1–6]. However, the actual formulation of higher-spin theories is usually precluded by the interaction problem. The latter refers to the apparent impossibility of introducing interactions into a free higher-spin (HS) theory in such a way that the number of dynamical degrees of freedom is unaltered [7,8]. For instance, naively coupling free massless HS fields to gravity violates the HS gauge symmetry and thus renders the theory inconsistent [9].
* Corresponding author.
E-mail addresses: j.engquist@phys.uu.nl (J. Engquist), o.hohm@phys.uu.nl (O. Hohm).

0550-3213/$ – see front matter © 2007 Elsevier B.V. All rights reserved.
doi:10.1016/j.nuclphysb.2007.06.015
In a series of papers Vasiliev has, however, begun to find a route avoiding these no-go theorems, i.e., to consistently couple HS fields to gravity, by relaxing the following assumptions. First, the theory is assumed to have a non-vanishing negative cosmological constant, leading to anti-de Sitter (AdS) instead of Minkowski space as the ground state, and to depend on this cosmological constant in a non-polynomial way. The latter excludes a flat space limit, in accordance with standard S-matrix arguments [10]. Second, it will necessarily contain an infinite number of massless fields carrying arbitrarily high spin, whose couplings can be of arbitrary power in the derivatives (see [11] for a review and references therein).
The formulation of the associated HS theory is based on a gauging of an infinite-dimensional
HS algebra, in the same way that gravity and supergravity theories can be viewed as resulting
from a gauging of a (super-)AdS algebra. However, theories which are constructed along these
lines (as, e.g., in the approach of MacDowell–Mansouri [12]) are not true gauge theories in
that the gauge symmetry is not manifest and, moreover, (super-)torsion constraints have to be
imposed by hand. For instance, in supergravity invariance under local supersymmetry is by no
means manifest and has to be checked explicitly. In addition, so-called extra fields appear in
HS theories, which are unphysical and have to be expressed in terms of the physical fields by
imposing further constraints. In total, the program of Vasiliev consists of finding a non-linear HS
theory [11], which
(i) is still invariant under (a deformation of) the HS symmetry, and
(ii) yields in the linearization the required free field equations.
Of course, both requirements are related since once it is proven that the free field equations are of 2nd order, the HS symmetry, i.e., (i), fixes the field equations uniquely to the so-called Fronsdal form. In the approach of [13,14] this requirement is implemented through the condition that the extra fields decouple in the free limit (for reasons we will explain below). However, these conditions have no natural interpretation from the point of view of the HS gauge symmetry. In turn, the consistency of the resulting HS action can only be checked up to some order, as has been done in $D = 4$ and $D = 5$ for cubic couplings [13,14]. But there are even reasons to expect that this consistency will not extend to all orders [14]. In fact, to date a fully consistent action describing interactions of propagating HS fields is not known.
An approach that is instead followed in order to describe consistent HS interactions at the level of the equations of motion is given by the so-called unfolded formulation [15–17]. The latter is a surprisingly concise way to keep the HS invariance manifest. However, in this approach there is not only an infinite number of physical HS fields, but each of these fields comes with an infinite number of auxiliary fields, which, roughly speaking, parametrize all spacetime derivatives of the physical field. This in turn complicates the analysis of the physical content, and it would clearly be desirable to have a conventional action principle that extends the Einstein–Hilbert action in the same way as supergravity does for spin-3/2 fields.
Concerning the problem of finding a consistent HS action, it should be noted that one example does exist: the Chern–Simons action in $D = 3$ constructed by Blencowe based on a HS algebra [18]. (See also [19,20] and [21,22] in a related context.) As the Chern–Simons theory is a true gauge theory, the resulting HS theory is consistent by construction and naturally extends the Einstein–Hilbert action (which in $D = 3$ also has an interpretation as a Chern–Simons action [23]). It is, however, only of limited use since it is topological and does not give rise to propagating degrees of freedom. On the other hand, gauge-invariant Chern–Simons actions exist in all odd dimensions, and even though they are topological in any dimension in the sense that they do not depend on a metric, they are not devoid of local dynamics in $D > 3$. In fact, it has been shown by Chamseddine [24,25] that the Chern–Simons actions based on the $AdS_D$ algebras $so(D-1,2)$ are equivalent to a particular type of Lovelock gravity with propagating torsion and are thus far from dynamically trivial. So one might wonder what happens if one defines a Chern–Simons action based on a HS extension of $so(D-1,2)$. This paper is devoted to the analysis of this question.
The organization of the paper is as follows. In Section 2 we briefly review the known free HS theories on Minkowski and AdS, and we introduce the HS Lie algebras which will later on serve as gauge algebras. The general construction of Chern–Simons actions in odd dimensions will be reviewed in Section 3.1, together with the realization of Lovelock gravity as a Chern–Simons gauge theory. In Section 3.2 we construct an invariant tensor of the HS algebra, which in turn allows a consistent extension of Chern–Simons gravity to include an infinite tower of HS fields. The constructed theory is then linearized around the non-topological Kaluza–Klein background $AdS_4 \times S^1$ in Section 4. Focusing on the spin-3 mode, we show that the equations of motion reduce to the correct free equations on $AdS_4$. We conclude in Section 5, while technical details concerning Young tableaux, the symmetric invariant and the spin-3 Riemann tensor are relegated to Appendices A–C.
2. Higher-spin theories and their gauge algebras
In this section we first review free HS theories on Minkowski and AdS backgrounds, and
then introduce the infinite-dimensional HS Lie algebras, which are the starting point for the
construction of interacting HS theories. The results hold in general odd dimensions, though for
concreteness we will often specialize to $D = 5$.
2.1. Free higher-spin actions
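Bosonic fields of arbitrary spin $s$ are described by symmetric rank-$s$ tensors $h_{\mu_1 \ldots \mu_s}$. In the
massless case they are subject to the gauge symmetry
\begin{equation}
\delta_\epsilon h_{\mu_1 \ldots \mu_s} = \partial_{(\mu_1} \epsilon_{\mu_2 \ldots \mu_s)} , \tag{2.1}
\end{equation}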
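parametrized by a symmetric transformation parameter $\epsilon$ of rank $s - 1$. An action for a free field of spin $s$ on Minkowski and (anti-)de Sitter backgrounds has been given by Fronsdal [28,29]. For a spin-3 field $h_{\mu\nu\rho}$, which is the case we will later on examine in more detail, it is of the form
\begin{equation}
S = -\frac{1}{2} \int d^D x \, \Big( \nabla_\mu h_{\nu\rho\sigma} \nabla^\mu h^{\nu\rho\sigma} - 3 \, \nabla_\mu h^{\mu\rho\sigma} \nabla^\nu h_{\nu\rho\sigma} + 3 \cdot 2 \, \nabla_\mu h_\nu \nabla_\rho h^{\rho\mu\nu} - 3 \, \nabla_\mu h_\nu \nabla^\mu h^\nu - \frac{3}{2} \, \nabla_\mu h^\mu \nabla_\nu h^\nu + L_m \Big) . \tag{2.2}
\end{equation}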
Here $\nabla_\mu$ denotes the AdS-covariant derivative, or a partial derivative in the case of a Minkowski background. In the flat case the additional term $L_m$ vanishes, while on AdS the HS gauge symmetry requires a mass-like term proportional to the cosmological constant. The latter then leads to the equations of motion
\begin{equation}
\mathcal{F}^{\rm AdS}_{\mu\nu\rho} \equiv \Box h_{\mu\nu\rho} - 3 \nabla_{(\mu} \nabla^\sigma h_{\nu\rho)\sigma} + 3 \nabla_{(\mu} \nabla_\nu h_{\rho)} - \frac{1}{L^2} \Big( (D-3) h_{\mu\nu\rho} + 2 \cdot 3 \, g_{(\mu\nu} h_{\rho)} \Big) = 0 , \tag{2.3}
\end{equation}
which defines the so-called Fronsdal operator $\mathcal{F}$. Here $h_\mu$ denotes the trace of $h_{\mu\nu\rho}$ in the AdS metric and $L$ is the AdS radius, related to the cosmological constant by $L = 1/\sqrt{|\Lambda|}$. Let us finally note that the given action or field equations are invariant under the gauge variations only if the transformation parameter is traceless, and that for spin $s > 3$ a double-tracelessness condition has to be imposed on the fields in order to give rise to the correct number of spin-$s$ degrees of freedom [30].
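As a small illustration of what "the correct number of spin-$s$ degrees of freedom" means, the following sketch counts the components of a symmetric traceless rank-$s$ tensor of the little group $SO(D-2)$, which is the standard counting for a massless symmetric field. The function name `massless_dof` is hypothetical and introduced only for this example.

```python
from math import comb

def massless_dof(spin, dim):
    """Propagating components of a massless symmetric spin-s field in `dim` dimensions.

    Assumption of this sketch: the count equals the dimension of the symmetric
    traceless rank-s tensor representation of the little group SO(dim - 2).
    """
    n = dim - 2
    symmetric = comb(spin + n - 1, spin)                      # symmetric rank-s tensors
    traces = comb(spin + n - 3, spin - 2) if spin >= 2 else 0  # their traces (rank s-2)
    return symmetric - traces
```

For $D = 5$ this gives five components for the graviton and seven for a spin-3 field, while in $D = 3$ the count vanishes for all $s \geq 2$, in line with the topological nature of the three-dimensional theories.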
The difficulty in promoting these HS theories to interacting theories via coupling to gravity or electrodynamics is due to the fact that the presence of generic covariant derivatives in (2.2) violates the HS gauge symmetry. This in turn implies that the unphysical degrees of freedom are no longer eliminated and the theory becomes inconsistent. Despite these negative results, Vasiliev has pioneered an approach towards a consistent coupling of HS fields to gravity, which is based on the introduction of an infinite-dimensional HS algebra [31]. The latter requires a frame-like formulation of HS fields, which mimics the vielbein formulation of general relativity rather than the metric-like formulation used in (2.2) [32]. More specifically, a spin-3 field, for instance, is described by $e_\mu{}^{ab}$, which is symmetric in the frame indices $a, b$, together with an analogue of the spin connection, $\omega_\mu{}^{ab,c}$. A closer inspection has, however, revealed that consistency of the HS algebra requires further fields, the so-called extra fields. These issues will be dealt with in later sections, but for the moment we just note that the resulting algebra will be derived from the enveloping algebra of the $AdS_D$ algebra $so(D-1,2)$, to which we turn in the next section.
2.2. A higher-spin extension of $so(D-1,2)$
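The starting point for the construction of an infinite-dimensional HS algebra is the AdS symmetry group $SO(D-1,2)$. The Lie algebra of the latter is spanned by the anti-Hermitian generators $M_{AB} = -M_{BA}$, $A = 0, 1, \ldots, D-1, D+1$, obeying the commutation relations
\begin{equation}
[M_{AB}, M_{CD}] = \eta_{BC} M_{AD} - \eta_{AC} M_{BD} - \eta_{BD} M_{AC} + \eta_{AD} M_{BC} \equiv f_{AB,CD}{}^{EF} M_{EF} , \tag{2.4}
\end{equation}
where
\begin{equation}
\eta_{AB} = {\rm diag}(-1, 1, 1, 1, 1, -1) , \qquad f_{AB,CD}{}^{EF} = 4 \, \delta^{[E}{}_{[A} \, \eta_{B][C} \, \delta^{F]}{}_{D]} . \tag{2.5}
\end{equation}
In the Lorentz basis, the $so(D-1,2)$ commutation relations read
\begin{equation}
[M_{ab}, M_{cd}] = \eta_{bc} M_{ad} - \eta_{ac} M_{bd} - \eta_{bd} M_{ac} + \eta_{ad} M_{bc} , \qquad
[M_{ab}, P_c] = -2 \eta_{c[a} P_{b]} , \qquad
[P_a, P_b] = M_{ab} . \tag{2.6}
\end{equation}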
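The relations (2.4) and (2.6) can be checked numerically in the vector representation. The sketch below is only an illustration under stated assumptions: it takes the signature $\eta_{AB} = {\rm diag}(-1,1,1,1,1,-1)$ for $D = 5$, uses the matrices $(M_{AB})^C{}_D = \delta^C_A \eta_{BD} - \delta^C_B \eta_{AD}$, and identifies $P_a$ with $M_{a,D+1}$ (stored as the last index). All function names are hypothetical.

```python
from itertools import combinations

# Assumed metric signature for D = 5: eta = diag(-1, 1, 1, 1, 1, -1).
ETA = [-1, 1, 1, 1, 1, -1]
N = len(ETA)

def eta(a, b):
    return ETA[a] if a == b else 0

def gen(a, b):
    """Matrix (M_ab)^c_d = delta^c_a eta_bd - delta^c_b eta_ad (vector representation)."""
    m = [[0] * N for _ in range(N)]
    m[a][b] += ETA[b]
    m[b][a] -= ETA[a]
    return m

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(N)) for j in range(N)] for i in range(N)]

def lincomb(terms):
    """Linear combination of matrices given as (coefficient, matrix) pairs."""
    out = [[0] * N for _ in range(N)]
    for coeff, mat in terms:
        for i in range(N):
            for j in range(N):
                out[i][j] += coeff * mat[i][j]
    return out

def commutator(x, y):
    return lincomb([(1, matmul(x, y)), (-1, matmul(y, x))])

def check_so_algebra():
    """True if every [M_AB, M_CD] reproduces the right-hand side of (2.4)."""
    for a, b in combinations(range(N), 2):
        for c, d in combinations(range(N), 2):
            rhs = lincomb([(eta(b, c), gen(a, d)), (-eta(a, c), gen(b, d)),
                           (-eta(b, d), gen(a, c)), (eta(a, d), gen(b, c))])
            if commutator(gen(a, b), gen(c, d)) != rhs:
                return False
    return True

def check_lorentz_basis():
    """True if P_a = M_{a,last} obeys [P_a, P_b] = M_ab and [M_ab, P_c] = eta_bc P_a - eta_ac P_b."""
    last = N - 1
    for a in range(last):
        for b in range(last):
            if commutator(gen(a, last), gen(b, last)) != gen(a, b):
                return False
            for c in range(last):
                rhs = lincomb([(eta(b, c), gen(a, last)), (-eta(a, c), gen(b, last))])
                if commutator(gen(a, b), gen(c, last)) != rhs:
                    return False
    return True
```

Both checks use exact integer arithmetic, so they test the index structure and signs rather than numerical accuracy. Note that $[P_a, P_b] = +M_{ab}$ relies on the extra direction being time-like.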
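To define a Lie algebraic HS extension of $so(D-1,2)$ it is convenient [33,34] to introduce a set of bosonic vector oscillators $y^i_A$, where $i = 1, 2$ is an $sp(2)$ doublet index, obeying an associative non-commutative star product
\begin{equation}
[y^i_A, y^j_B]_\star = 2 \epsilon^{ij} \eta_{AB} , \qquad y^i_A \star y^j_B = y^i_A y^j_B + \epsilon^{ij} \eta_{AB} , \tag{2.7}
\end{equation}
where $\epsilon^{ij} = -\epsilon^{ji}$ denotes the invariant $sp(2)$ tensor, and we have introduced the bracket $[U, V]_\star = U \star V - V \star U$. The star product of general functions $f(y)$ and $g(y)$ can be defined by the Moyal–Weyl formula
\begin{equation}
f(y) \star g(y) = \exp\left( \epsilon^{ij} \eta_{AB} \frac{\partial}{\partial y^i_A} \frac{\partial}{\partial z^j_B} \right) f(y) g(z) \Big|_{z = y} , \tag{2.8}
\end{equation}
which reduces for linear functions to (2.7).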

Given the oscillators we can construct the generators of the commuting (Howe dual) algebras $so(D-1,2)$ and $sp(2)$ as the bilinears
\begin{equation}
M_{AB} = \frac{1}{2} y^i_A y_{iB} , \qquad K_{ij} = \frac{1}{2} y_i^A y_{jA} , \tag{2.9}
\end{equation}
from which it indeed follows that $[K_{ij}, M_{AB}]_\star = 0$. The construction of the HS Lie algebra $ho(D-1,2)$ is based on the enveloping algebra of $so(D-1,2)$. It is defined in terms of the oscillators as [33,34]
\begin{equation}
ho(D-1,2) = \left\{ T(y) : \; T^\dagger = -T , \; [K_{ij}, T]_\star = 0 \right\} , \tag{2.10}
\end{equation}
where $T(y)$ are arbitrary polynomials in the oscillators $y^i_A$, and the last condition singles out the $sp(2)$ singlets. This algebra is sometimes (perhaps misleadingly) referred to as the off-shell HS algebra since it is generated by trace-full generators. On the other hand, starting from this algebra one may construct the corresponding on-shell algebra where the generators are made traceless by factoring out the ideal in $ho(D-1,2)$ spanned by elements of the form $K_{ij} \star X^{ij}$ [11,33,34]. In the formulation of HS theories utilizing the approach of unfolded dynamics [15,35], where an action principle is not needed, the on-shell algebra has been mostly used. It is only recently [34] that the importance of the off-shell algebra has been emphasized. In contrast, in an ordinary action formulation of HS theories, such as the one in this paper, we believe that an algebra with trace-full generators is crucial. In the remainder of the paper we will avoid the term off-shell algebra.
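The oscillator construction of (2.7)–(2.9) can be made concrete with a short polynomial implementation of the Moyal–Weyl product. The following is a sketch under explicit assumptions: the signature is taken as ${\rm diag}(-1,1,1,1,1,-1)$, $\epsilon^{12} = 1$, and the index lowering in (2.9) is fixed by the choice $M_{AB} = \frac{1}{2}(y^2_A y^1_B - y^1_A y^2_B)$, which is the sign for which the star commutator reproduces (2.4). The $sp(2)$ generators are built with upper indices as $K^{ij} = \frac{1}{2} \eta^{AB} y^i_A y^j_B$. All helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

ETA = [-1, 1, 1, 1, 1, -1]          # assumed signature, as for so(4,2)
N = len(ETA)
EPS = {(0, 1): 1, (1, 0): -1}       # epsilon^{ij}; sp(2) indices 1,2 are stored as 0,1

def y(i, a):
    """The oscillator y^i_a as a polynomial {exponent tuple: coefficient}."""
    exps = [0] * (2 * N)
    exps[i * N + a] = 1
    return {tuple(exps): Fraction(1)}

def add(f, g, c=1):
    """Return f + c*g with vanishing coefficients removed."""
    out = dict(f)
    for key, val in g.items():
        out[key] = out.get(key, 0) + c * val
    return {k: v for k, v in out.items() if v != 0}

def mul(f, g):
    """Ordinary (commutative) product of two polynomials."""
    out = {}
    for ka, va in f.items():
        for kb, vb in g.items():
            key = tuple(p + q for p, q in zip(ka, kb))
            out[key] = out.get(key, 0) + va * vb
    return {k: v for k, v in out.items() if v != 0}

def diff(f, v):
    """Partial derivative with respect to the variable with flat index v."""
    out = {}
    for key, val in f.items():
        if key[v]:
            new = list(key)
            new[v] -= 1
            out[tuple(new)] = val * key[v]
    return out

def star(f, g):
    """Moyal-Weyl product (2.8): sum over powers of the bidifferential operator."""
    total, pairs, order = {}, [(f, g)], 0
    while pairs:
        for a, b in pairs:
            total = add(total, mul(a, b), Fraction(1, factorial(order)))
        nxt = []
        for a, b in pairs:
            for (i, j), e in EPS.items():
                for c in range(N):
                    da, db = diff(a, i * N + c), diff(b, j * N + c)
                    if da and db:
                        nxt.append((add({}, da, e * ETA[c]), db))
        pairs, order = nxt, order + 1
    return total

def star_comm(f, g):
    return add(star(f, g), star(g, f), -1)

def M(a, b):
    """Assumed convention M_ab = (y^2_a y^1_b - y^1_a y^2_b) / 2."""
    return add({}, add(mul(y(1, a), y(0, b)), mul(y(0, a), y(1, b)), -1), Fraction(1, 2))

def K(i, j):
    """sp(2) generator K^{ij} = eta^{ab} y^i_a y^j_b / 2 (upper sp(2) indices)."""
    total = {}
    for a in range(N):
        total = add(total, mul(y(i, a), y(j, a)), Fraction(ETA[a], 2))
    return total

def check_so_from_oscillators():
    """True if [M_AB, M_CD]_star matches the right-hand side of (2.4)."""
    for a, b in combinations(range(N), 2):
        for c, d in combinations(range(N), 2):
            rhs = {}
            for coeff, (p, q) in [(ETA[b] * (b == c), (a, d)), (-ETA[a] * (a == c), (b, d)),
                                  (-ETA[b] * (b == d), (a, c)), (ETA[a] * (a == d), (b, c))]:
                rhs = add(rhs, M(p, q), coeff)
            if star_comm(M(a, b), M(c, d)) != rhs:
                return False
    return True

def check_howe_duality():
    """True if all [K^{ij}, M_AB]_star vanish."""
    return all(star_comm(K(i, j), M(a, b)) == {}
               for i in range(2) for j in range(2)
               for a, b in combinations(range(N), 2))
```

The polynomials are stored with exact rational coefficients, so the two checks confirm the algebra identically rather than approximately. The loop in `star` terminates because every application of the bidifferential operator lowers the polynomial degree.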
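The polynomial $T(y)$ appearing in (2.10) admits a level decomposition into monomials $T_\ell(y)$ (we associate the generators $T_\ell$ at level $\ell$ with spins $s = \ell + 2 = 2, 3, 4, \ldots$)
\begin{equation}
T(y) = \sum_{\ell = 0}^{\infty} T_\ell(y) , \qquad T_\ell(\lambda y) = \lambda^{2\ell + 2} \, T_\ell(y) . \tag{2.11}
\end{equation}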
The definition in terms of vector oscillators implies in particular that the algebra does not contain the full enveloping algebra spanned by polynomials in $M_{AB}$. Elements which vanish identically in the vector oscillator formulation belong to a certain ideal $\mathcal{I} \subset \mathcal{U}[so(D-1,2)]$. For instance, the anti-symmetric part $M_{[AB} M_{C]D}$ vanishes due to the $sp(2)$ identity $\epsilon_{[ij} \epsilon_{k]l} = 0$. This in turn implies that the generators of the HS algebra (2.10) are in specific Young tableaux. In other words, the $T_\ell(y)$ have an expansion in terms of $GL(D+1)$ tensors [34]
\begin{align}
T_{A(s-1),B(s-1)} &\equiv P_{(s-1,s-1)} \big( M_{A_1 B_1} \cdots M_{A_{s-1} B_{s-1}} \big) \tag{2.12} \\
&= \frac{1}{2^{s-1}} \, P_{(s-1,s-1)} \big( y^{i_1}_{A_1} y_{i_1 B_1} \cdots y^{i_{s-1}}_{A_{s-1}} y_{i_{s-1} B_{s-1}} \big) , \tag{2.13}
\end{align}
where the $so(D-1,2)$ generators appear at level $\ell = 0$.$^1$ We have introduced a notation in which $T_{A(n),B(n)} \equiv T_{A_1 \cdots A_n, B_1 \cdots B_n}$, each set of indices being totally symmetrized. $P_{(s-1,s-1)}$ is a Young projector which imposes the symmetry of the two-row $GL(D+1)$ Young tableau (see Appendix A for details)
[Eq. (2.14): two-row Young tableau with $s-1$ boxes in each row.]
$^1$ There exists a further restriction of $ho(D-1,2)$ to a minimal algebra containing only even spins $s = 2, 4, 6, \ldots$ [33,34].
Later on we will need the generators in a $GL(D)$ Lorentz basis. Splitting (2.14) accordingly for spin $s$, we find $s$ generators, which schematically are in the tableaux
[Eq. (2.15): two-row Young tableaux with $s-1$ boxes in the first row and $t = 0, 1, \ldots, s-1$ boxes in the second row.]
More specifically, the generators $T_{A_1 \cdots A_{s-1}, B_1 \cdots B_{s-1}}$ split into the series of generators $T_{a_1 \cdots a_{s-1}}$ and $T_{a_1 \cdots a_{s-1}, b_1 \cdots b_t}$ for $1 \leq t \leq s-1$. The gauge fields $e_\mu{}^{a_1 \cdots a_{s-1}}$ corresponding to the first generator will later be identified with the physical spin-$s$ field, while the fields for the remaining $s-1$ generators are referred to in the literature as the auxiliary ($t = 1$) and extra fields ($t > 1$). However, for us this distinction between auxiliary and extra fields will be redundant, and we will therefore henceforth refer to all fields with $t > 0$ as auxiliary.
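The splitting of a two-row $GL(D+1)$ tableau into $s$ tableaux of $GL(D)$ can be cross-checked by counting components. The sketch below uses the standard hook-content formula for the dimension of a $GL(n)$ tensor with a given Young symmetry; the function names `gl_dim` and `branching_matches` are hypothetical and only serve this illustration.

```python
from fractions import Fraction

def gl_dim(shape, n):
    """Dimension of the GL(n) irrep with Young diagram `shape` (hook-content formula)."""
    width = shape[0] if shape else 0
    cols = [sum(1 for r in shape if r > j) for j in range(width)]   # column lengths
    dim = Fraction(1)
    for i, row in enumerate(shape):
        for j in range(row):
            hook = (row - j) + (cols[j] - i) - 1      # arm + leg + 1
            dim *= Fraction(n + j - i, hook)          # content of the box is j - i
    return int(dim)

def branching_matches(spin, d):
    """Compare the (s-1, s-1) tableau of GL(d+1) with the sum over (s-1, t) of GL(d)."""
    total = gl_dim((spin - 1, spin - 1), d + 1)
    pieces = sum(gl_dim((spin - 1, t), d) for t in range(spin))
    return total == pieces
```

For $D = 5$ and spin 3, the 105 components of $T_{AB,CD}$ split as $15 + 40 + 50$ into $T_{ab}$, $T_{ab,c}$ and $T_{ab,cd}$, and for spin 2 the 15 generators of $so(4,2)$ split as $5 + 10$ into $P_a$ and $M_{ab}$.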
The complete set of commutation relations of the $ho(D-1,2)$ algebra is not known in closed form. Luckily, for the linearized spin-$s$ analysis, to be treated in Section 4 for an expansion around a spin-2 solution, it is sufficient to specify the spin-2–spin-$s$ commutation relations, which are entirely fixed by the representation theory of the AdS subalgebra $so(D-1,2)$:
\begin{equation}
[M_{AB}, T_{C(s-1),D(s-1)}]_\star = -4 (s-1) \, P_{(s-1,s-1)} \big( \eta_{A C_{s-1}} T_{B C(s-2), D(s-1)} \big) . \tag{2.16}
\end{equation}
Let us finally mention that when commuting a spin-$s$ generator with a spin-$s'$ generator we obtain a sequence of generators with spins
\begin{equation}
s + s' - 2 , \quad s + s' - 4 , \quad \ldots , \quad |s - s'| + 2 . \tag{2.17}
\end{equation}
Notice that only the $s = 2$ subsector is closed.
3. Chern–Simons theories in odd dimensions
In this section we will introduce the formulation of Chern–Simons theories [24,25] in general odd dimensions (for a review see [42]). The theory is specified once we give the algebra and the relevant invariant tensor. We will specialize to AdS Lovelock gravity, with a focus on $D = 5$, though the results directly extend to all odd dimensions (for the explicit formulas see [25]). Although there is no non-trivial propagation around the vacuum solution $AdS_5$, the theory interestingly also admits a simple $AdS_4$ solution [25] around which the graviton propagates. In Section 4 we will analyze the linearized HS dynamics around this solution. Finally, in this section we will propose an invariant tensor for the full HS algebra.
3.1. Lovelock gravity as $so(D-1,2)$ gauge theory
In any odd dimension $D = 2n - 1$ a gauge-invariant Chern–Simons action can be defined, which is based on the invariant $2n$-form $\langle F^n \rangle$ constructed out of the field strength $F$, with $\langle \cdots \rangle$ denoting an invariant symmetric tensor of degree $n$. More specifically, this expression is a total derivative and thus gives rise to a dynamically non-trivial theory only on the boundary, i.e., in one dimension less. The action can be written in closed form as
\begin{equation}
S_{\rm CS} = \int_{M_{2n}} \langle F^n \rangle = n \int_{M_{2n-1}} \int_0^1 dt \, \big\langle A \, (t \, dA + t^2 A^2)^{n-1} \big\rangle , \tag{3.1}
\end{equation}
where $M_{2n-1} = \partial M_{2n}$, and we left the wedge products implicit. The resulting Chern–Simons form in $D = 2n - 1$ is by construction gauge-invariant up to total derivatives. Explicitly, one has under arbitrary variations
\begin{equation}
\delta S_{\rm CS} = n \int_{M_{2n-1}} \langle \delta A \, F^{n-1} \rangle , \tag{3.2}
\end{equation}
i.e., gauge invariance under $\delta A = D\epsilon$ follows by the Bianchi identity.
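The numerical coefficients of the Chern–Simons form follow from the $t$-integral in the closed-form expression. As a sketch, assuming the normalization $S_{\rm CS} = n \int \int_0^1 dt \, \langle A (t\,dA + t^2 A^2)^{n-1} \rangle$ and using that the two-forms $dA$ and $A^2$ can be reordered freely inside the symmetric bracket, the coefficient of $\langle A \, (dA)^{n-1-k} (A^2)^k \rangle$ is $n \binom{n-1}{k} / (n+k)$. The function name is hypothetical.

```python
from fractions import Fraction
from math import comb

def cs_coefficients(n):
    """Coefficients of <A (dA)^(n-1-k) (A^2)^k>, k = 0..n-1, in the Chern-Simons form.

    Assumes the normalization n * int_0^1 dt <A (t dA + t^2 A^2)^(n-1)>; the power
    of t multiplying the k-th term is n-1+k, which integrates to 1/(n+k).
    """
    return [Fraction(n * comb(n - 1, k), n + k) for k in range(n)]
```

For $n = 3$ this returns $1$, $3/2$ and $3/5$, the coefficients appearing in the five-dimensional action (3.25), and for $n = 2$ it gives the familiar $1$ and $2/3$ of the three-dimensional Chern–Simons form.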

For definiteness we focus on $D = 5$. The gauge field $A$ then takes values in the Lie algebra of the group $SO(4,2)$. Specifically, we write in an $SO(4,1)$ covariant manner $A = e^a P_a + \frac{1}{2} \omega^{ab} M_{ab}$ in the basis above and define the invariant tensor to be
\begin{equation}
\langle M_{AB} M_{CD} M_{EF} \rangle = \epsilon_{ABCDEF} . \tag{3.3}
\end{equation}
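That the Levi-Civita symbol defines a symmetric and invariant trilinear form on $so(4,2)$ can be verified by brute force. The sketch below assumes the signature ${\rm diag}(-1,1,1,1,1,-1)$ and the commutation relations (2.4), and checks the invariance condition $\langle [X, P], Q, R \rangle + \langle P, [X, Q], R \rangle + \langle P, Q, [X, R] \rangle = 0$ for all basis generators. All names are hypothetical.

```python
from itertools import combinations, combinations_with_replacement

ETA = [-1, 1, 1, 1, 1, -1]                      # assumed so(4,2) signature
N = len(ETA)
PAIRS = list(combinations(range(N), 2))         # independent generators M_AB, A < B

def eta(a, b):
    return ETA[a] if a == b else 0

def levi_civita(indices):
    """Sign of the permutation `indices` of (0, ..., n-1), or 0 if an index repeats."""
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0
    sign = 1
    for i in range(len(idx)):
        while idx[i] != i:
            j = idx[i]
            idx[i], idx[j] = idx[j], idx[i]
            sign = -sign
    return sign

def bracket(g, x):
    """[M_g, M_x] from (2.4), returned as a list of (coefficient, index pair)."""
    (a, b), (c, d) = g, x
    return [(eta(b, c), (a, d)), (-eta(a, c), (b, d)),
            (-eta(b, d), (a, c)), (eta(a, d), (b, c))]

def trilinear(x, y, z):
    """<X, Y, Z> for linear combinations of generators, with <M_AB M_CD M_EF> = eps_ABCDEF."""
    return sum(cx * cy * cz * levi_civita(px + py + pz)
               for cx, px in x for cy, py in y for cz, pz in z)

def is_symmetric():
    """True if the form is unchanged under exchange of any two generators."""
    return all(levi_civita(p + q + r) == levi_civita(q + p + r) == levi_civita(p + r + q)
               for p in PAIRS for q in PAIRS for r in PAIRS)

def is_invariant():
    """True if the adjoint action of every generator annihilates the trilinear form."""
    for g in PAIRS:
        for p, q, r in combinations_with_replacement(PAIRS, 3):
            P, Q, R = [(1, p)], [(1, q)], [(1, r)]
            total = (trilinear(bracket(g, p), Q, R) + trilinear(P, bracket(g, q), R)
                     + trilinear(P, Q, bracket(g, r)))
            if total != 0:
                return False
    return True
```

Because the form is symmetric, it suffices to test invariance on unordered triples of generators, which keeps the check fast.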
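Note that, as required, this tensor is symmetric in the sense that it stays invariant under exchange of $M_{AB}$ with $M_{CD}$, etc. The $SO(4,1)$ covariant field strength tensors in $F = \frac{1}{2} \mathcal{R}^{ab} M_{ab} + T^a P_a$ read
\begin{equation}
\mathcal{R}_{\mu\nu}{}^{ab} = R_{\mu\nu}{}^{ab} + e_\mu{}^a e_\nu{}^b - e_\nu{}^a e_\mu{}^b , \qquad
T_{\mu\nu}{}^a = \partial_\mu e_\nu{}^a - \partial_\nu e_\mu{}^a + \omega_\mu{}^{ab} e_{\nu b} - \omega_\nu{}^{ab} e_{\mu b} , \tag{3.4}
\end{equation}
containing the Riemann tensor
\begin{equation}
R_{\mu\nu}{}^{ab} = \partial_\mu \omega_\nu{}^{ab} - \partial_\nu \omega_\mu{}^{ab} + \omega_\mu{}^a{}_c \, \omega_\nu{}^{cb} - \omega_\nu{}^a{}_c \, \omega_\mu{}^{cb} , \tag{3.5}
\end{equation}
and the torsion tensor. The resulting Chern–Simons action can be written as [25]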

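\begin{equation}
S = 3 \int_{M_5} \epsilon_{a_1 \ldots a_5} \Big( e^{a_1} R^{a_2 a_3} R^{a_4 a_5} + \frac{2}{3} \, e^{a_1} e^{a_2} e^{a_3} R^{a_4 a_5} + \frac{1}{5} \, e^{a_1} e^{a_2} e^{a_3} e^{a_4} e^{a_5} \Big) . \tag{3.6}
\end{equation}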
We see that the action is the Einstein–Hilbert action with a cosmological constant (which we have normalized to unit magnitude), extended by a $D = 5$ Lovelock term. To be more precise, it describes a theory with dynamical torsion. However, it is still consistent with the field equations to impose vanishing torsion in order to express the spin connection in terms of the vielbein. This in turn reduces the dynamical degrees of freedom to those of the metric, for which the Einstein equations read
\begin{equation}
R_{\mu\nu} - \frac{1}{2} R \, g_{\mu\nu} - 3 g_{\mu\nu} = \frac{1}{32} (RR)_{\mu\nu} , \tag{3.7}
\end{equation}
where we have introduced the abbreviation (RR) = R R .

As it stands, (3.6) seems to be a purely conventional type of Lovelock gravity, which is usually assumed to propagate the same number of degrees of freedom as Einstein gravity (five in $D = 5$). However, in this case the topological origin (3.1) actually gives rise to a somewhat unconventional behavior: expanding (3.6) around the $AdS_5$ solution, one infers that the quadratic term vanishes identically. In other words, a propagator around $AdS_5$ does not exist. This can be most easily understood from the general form of the equations of motion for the Chern–Simons action (3.1), which can be read off from (3.2):
\begin{equation}
g_{ABC} \, F^B \wedge F^C = 0 , \tag{3.8}
\end{equation}
where $g_{ABC}$ denotes the invariant tensor and $A, B, \ldots$ are the adjoint indices for a generic gauge group, which will later on be specialized to $ho(4,2)$. Since for an expansion around $AdS_5$ the curvature tensor in (3.4) vanishes in the background, there are no linear terms in (3.8) and thus no quadratic terms in the action. However, this should not be interpreted in the sense that the theory is devoid of local dynamics altogether, as is sometimes assumed of topological actions like (3.1) in the literature. Indeed, the propagator around generic backgrounds does not vanish. Moreover, a careful Hamiltonian analysis of the dynamical content in [26,27] has shown that, apart from degenerate sectors (like the maximally symmetric $AdS_5$ background), the theory consistently propagates a number of degrees of freedom depending on the dimension of the gauge group. In particular, the Lovelock-type gravity theory above has the expected five degrees of freedom.$^2$
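Let us also stress that the degenerate sectors are only a measure-zero subspace within phase space [26,27], and that even around such degenerate backgrounds some degrees of freedom can propagate, albeit fewer. One example has been given already in [25]: it is effectively an $AdS_4$ solution and reads ($\mu, \alpha = 0, 1, 2, 3$)
\begin{equation}
e_\mu{}^\alpha = \frac{\delta_\mu{}^\alpha}{1 - \frac{1}{4} x^\nu x_\nu} , \qquad
\omega_\mu{}^{\alpha\beta} = \frac{\delta_\mu{}^\alpha x^\beta - \delta_\mu{}^\beta x^\alpha}{2 \left( 1 - \frac{1}{4} x^\nu x_\nu \right)} , \qquad
e_4{}^4 = {\rm const} , \tag{3.9}
\end{equation}
which has vanishing torsion, $T^a = 0$, and satisfies
\begin{equation}
R^{\alpha\beta} + e^\alpha e^\beta = 0 , \qquad R^{\alpha 4} + e^\alpha e^4 \neq 0 . \tag{3.10}
\end{equation}
By expanding around this solution, it has been shown that it propagates in particular a four-dimensional graviton [25].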
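The vanishing of the torsion on this background can be checked numerically. The sketch below is hedged in two ways: it assumes the conformally flat form $e_\mu{}^\alpha = \delta_\mu{}^\alpha / (1 - \frac{1}{4} x^2)$ and $\omega_\mu{}^{\alpha\beta} = (\delta_\mu{}^\alpha x^\beta - \delta_\mu{}^\beta x^\alpha) / (2 (1 - \frac{1}{4} x^2))$ with four-dimensional signature $(-,+,+,+)$, and it only covers the four-dimensional part (the constant component $e_4{}^4$ is trivially torsion-free). Derivatives are taken by central finite differences; all names are hypothetical.

```python
ETA4 = [-1.0, 1.0, 1.0, 1.0]   # assumed four-dimensional Minkowski signature

def conformal_factor(x):
    """1 / (1 - x.x / 4) with x.x = eta_{mu nu} x^mu x^nu."""
    return 1.0 / (1.0 - 0.25 * sum(ETA4[m] * x[m] ** 2 for m in range(4)))

def vielbein(x):
    """e[mu][a] = delta_mu^a / (1 - x.x / 4)."""
    w = conformal_factor(x)
    return [[w if mu == a else 0.0 for a in range(4)] for mu in range(4)]

def spin_connection(x):
    """om[mu][a][b] = (delta_mu^a x^b - delta_mu^b x^a) / (2 (1 - x.x / 4))."""
    w = conformal_factor(x)
    return [[[0.5 * w * ((mu == a) * x[b] - (mu == b) * x[a]) for b in range(4)]
             for a in range(4)] for mu in range(4)]

def max_torsion(x, h=1e-5):
    """Largest |T_{mu nu}^a| at the point x, with T defined as in (3.4)."""
    e, om = vielbein(x), spin_connection(x)

    def d_e(mu, nu, a):
        # central finite difference for partial_mu e_nu^a
        xp, xm = list(x), list(x)
        xp[mu] += h
        xm[mu] -= h
        return (vielbein(xp)[nu][a] - vielbein(xm)[nu][a]) / (2.0 * h)

    worst = 0.0
    for mu in range(4):
        for nu in range(4):
            for a in range(4):
                t = d_e(mu, nu, a) - d_e(nu, mu, a)
                # omega_mu^{ab} e_{nu b} - omega_nu^{ab} e_{mu b}, lowering b with eta
                t += sum((om[mu][a][b] * e[nu][b] - om[nu][a][b] * e[mu][b]) * ETA4[b]
                         for b in range(4))
                worst = max(worst, abs(t))
    return worst
```

Evaluating `max_torsion` at a generic point inside the coordinate patch returns a value at the level of the finite-difference error, consistent with $T^a = 0$.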
3.2. Invariant tensor of the higher-spin algebra
In order to construct the Chern–Simons action (3.1) based on $ho(D-1,2)$, which extends standard Chern–Simons gravity, we have to find a completely symmetric tensor of degree $n = (D+1)/2$, which is invariant under the adjoint action of the HS algebra $ho(D-1,2)$, and which reduces to the standard invariant (3.3) for the AdS subalgebra $so(D-1,2)$. Below we will propose a formula for the invariant tensor. However, while the vector oscillator formulation described in Section 2.2 was required in order to establish existence and consistency of the HS algebra, it turns out not to be sufficient for the definition of symmetric invariants of $ho(D-1,2)$. Instead, we will introduce a new star product, known as the BCH (Baker–Campbell–Hausdorff) star product or the Gutt star product [37–39].
Let us first briefly comment on the reasons why the formulation in terms of vector oscillators is incapable of reproducing the symmetric tensor (3.3). This is simply due to the fact that the oscillators automatically eliminate the totally anti-symmetric part in the star product, $M_{[AB} M_{CD} M_{EF]} = 0$, since it involves an anti-symmetrization over more than two $sp(2)$ indices. On the other hand, this exclusion guaranteed the appearance of generators entirely being in definite $(s-1, s-1)$ Young tableaux, or in other words, eliminated the ideals spanned by generators not in these Young tableaux. Here in contrast, by requiring an invariant tensor generalizing (3.3), we are, roughly speaking, assigning a non-zero value to certain parts in the ideal $\mathcal{I}$. Put differently, instead of using the invariance of the ideal, $[ho(D-1,2), \mathcal{I}] \subset \mathcal{I}$, to set it to zero, we set it to constants, reducing in particular to (3.3).
$^2$ To be more precise, this counting applies only in case of vanishing torsion. Otherwise, there are additional degrees of freedom [27].
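To start with, we have to define a non-commutative star product directly in terms of the $M_{AB}$ (here viewed as commuting coordinates), whose star commutator then yields the required $so(D-1,2)$ algebra. This is the BCH star product, which is given by
\begin{equation}
F(M) \star G(M) = \exp\left[ M_{AB} \, \Lambda^{AB}(\partial_N, \partial_{N'}) \right] F(N) G(N') \Big|_{N = M, \, N' = M} , \tag{3.11}
\end{equation}
where $\partial_N$ is a short-hand notation for $\partial / \partial N_{AB}$ and where $\Lambda^{AB} = -\Lambda^{BA}$ is defined through the relation
\begin{equation}
\exp Q \star \exp Q' = \exp\left( Q + Q' + \Lambda^{AB}(Q, Q') M_{AB} \right) , \tag{3.12}
\end{equation}
with $Q = Q^{AB} M_{AB}$ and $Q' = Q'^{AB} M_{AB}$ for some anti-symmetric tensors $Q^{AB}$ and $Q'^{AB}$. It defines an associative product on the enveloping algebra [37]. By using the BCH formula
\begin{equation}
\exp Q \star \exp Q' = \exp\Big( Q + Q' + \frac{1}{2} [Q, Q']_\star + \frac{1}{12} \big( [Q, [Q, Q']_\star]_\star + [Q', [Q', Q]_\star]_\star \big) + \cdots \Big) , \tag{3.13}
\end{equation}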
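we find the first few terms in the expansion to be
\begin{align}
\Lambda^{AB} &= \frac{1}{2} f_{CD,EF}{}^{AB} Q^{CD} Q'^{EF} - \frac{1}{12} f_{CD,EF}{}^{GH} f_{GH,IJ}{}^{AB} Q^{CD} Q'^{EF} \big( Q^{IJ} - Q'^{IJ} \big) + \cdots \nonumber \\
&= 2 (QQ')^{[AB]} - \frac{2}{3} [Q, Q']^{[A|C|} \big( Q_C{}^{B]} - Q'_C{}^{B]} \big) + \cdots , \tag{3.14}
\end{align}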
where $(QQ')^{AB} = Q^{AC} Q'_C{}^B$, $[Q, Q']^{AB} = (QQ')^{AB} - (Q'Q)^{AB}$ and where $f_{AB,CD}{}^{EF}$ are the
structure constants defined in (2.5). The first terms in the product (3.11) consequently become
F (M) G(M) = F (M)G(M) + 2MAB AC F C B G

AC B
2
+ MAB AC BD F CD G 2
F CG
3
AC B
+ AC BD GCD F 2
G C F + .
(3.15)
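The truncated BCH series (3.13) that underlies these expansions can be sanity-checked with small matrices. The following sketch compares $\exp(tQ)\exp(tQ')$ with the exponential of the series truncated after the double commutators, for two nilpotent $2 \times 2$ matrices chosen purely for illustration. If the truncation is correct, the error should scale like $t^4$, whereas keeping only $Q + Q'$ leaves an error of order $t^2$. All names are hypothetical.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(a, b, c=1.0):
    """Return a + c*b for square matrices."""
    n = len(a)
    return [[a[i][j] + c * b[i][j] for j in range(n)] for i in range(n)]

def comm(a, b):
    return add(matmul(a, b), matmul(b, a), -1.0)

def expm(a, terms=30):
    """Matrix exponential by its Taylor series (adequate for small matrices)."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    power = [row[:] for row in result]
    for k in range(1, terms):
        power = [[x / k for x in row] for row in matmul(power, a)]
        result = add(result, power)
    return result

def bch3(q, p):
    """Q + Q' + [Q,Q']/2 + ([Q,[Q,Q']] + [Q',[Q',Q]])/12, as in (3.13)."""
    qp = comm(q, p)
    z = add(add(q, p), qp, 0.5)
    third = add(comm(q, qp), comm(p, comm(p, q)))
    return add(z, third, 1.0 / 12.0)

def truncation_error(t, third_order=True):
    """Max entry of exp(tQ) exp(tQ') - exp(Z) for illustrative 2x2 matrices."""
    q = [[0.0, t], [0.0, 0.0]]
    p = [[0.0, 0.0], [t, 0.0]]
    exact = matmul(expm(q), expm(p))
    approx = expm(bch3(q, p) if third_order else add(q, p))
    return max(abs(exact[i][j] - approx[i][j]) for i in range(2) for j in range(2))
```

Halving $t$ should reduce `truncation_error(t)` by roughly a factor of sixteen, which is a quick way to confirm that no term up to third order is missing or has the wrong coefficient.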
The definition of the HS generators in (2.12) extends immediately. However, whereas the realization in terms of the vector oscillators automatically imposes the Young tableau symmetries $(s-1, s-1)$, here we need to Young project explicitly.$^3$ Hence, all elements of the enveloping algebra which belong to other Young tableaux are modded out.
$^3$ Note that under the projector $P_{(s-1,s-1)}$ the use of the star product or the point-wise (classical) product is immaterial. For instance, for spin 3 we have $T_{AB,CD} = P_{(2,2)}(M_{AC} \star M_{BD}) = P_{(2,2)}(M_{AC} M_{BD})$.
The star products between a spin-2 generator and a spin-$s$ generator $T_{C(s-1),D(s-1)}$ read
\begin{align}
M_{AB} \star M_{CD} &= M_{AB} M_{CD} - 2 \eta_{[C[A} M_{B]D]} , \tag{3.16} \\
M_{AB} \star T_{C(s-1),D(s-1)} &= M_{AB} T_{C(s-1),D(s-1)} - 2 (s-1) \, P_{(s-1,s-1)} \big( \eta_{A C_{s-1}} T_{B C(s-2), D(s-1)} \big) + \text{double contractions} , \tag{3.17}
\end{align}
where $P_{(s-1,s-1)}$ is a Young projector. The commutation relations in (2.4) and (2.16) follow readily by defining the bracket $[U, V]_\star = U \star V - V \star U$, since we know that $[M_{AB}, F(M)]_\star = 4 M_{C[A} \partial^C{}_{B]} F(M)$; see Eq. (B.2) in Appendix B.
Let us now proceed with defining the symmetric invariant tensors of the HS algebra. Given an element $F(M)$ of the enveloping algebra $\mathcal{U}[so(D-1,2)]$, we define the operation tr given by evaluation at $M_{AB} = 0$:
\begin{equation}
{\rm tr} \, F(M) := F(0) . \tag{3.18}
\end{equation}
However, although the analogue of this operation for the vector oscillators described in Section 2.2 constitutes a proper (super) trace [6,40,41], it is easy to realize that the bilinear ${\rm tr}(F(M) \star G(M))$ vanishes identically in our case (see also the comments in footnote 4). To obtain a sensible non-zero trace, we need to insert $GL(D+1)$-invariant differential operators into the trace (3.18); cf. the results in Ref. [36]. A natural $GL(D+1)$-invariant differential operator is constructed out of $n = (D+1)/2$ derivatives contracted with the totally anti-symmetric tensor $\epsilon_{A_1 \cdots A_{D+1}}$. We propose the following sequence of traces ${\rm Tr}_k$, for $k = 1, 2, 3, \ldots$
\begin{align}
{\rm Tr}_k \, F(M) &= {\rm tr} \big( \Delta^k F(M) \big) , \tag{3.19} \\
\Delta &= \epsilon_{A_1 \cdots A_n B_1 \cdots B_n} \frac{\partial}{\partial M_{A_1 B_1}} \cdots \frac{\partial}{\partial M_{A_n B_n}} , \tag{3.20}
\end{align}
with tr as in (3.18).$^4$ These traces are cyclic,
\begin{equation}
{\rm Tr}_k (F \star G) = {\rm Tr}_k (G \star F) , \tag{3.21}
\end{equation}
for generic elements $F(M)$ and $G(M)$ of the enveloping algebra, which will be proven in Appendix B.
We now define the symmetric trilinear for three generators (2.12) of the HS algebra $ho(4,2)$ of spins $s$, $s'$ and $s''$ to be
\begin{equation}
\langle T_s, T_{s'}, T_{s''} \rangle := \sum_{k=1}^{\infty} \alpha_k \, {\rm Tr}_k \big( \{ T_s, T_{s'} \}_\star \star T_{s''} \big) , \tag{3.22}
\end{equation}
where $\{ T_s, T_{s'} \}_\star = T_s \star T_{s'} + T_{s'} \star T_s$ and $\alpha_k$ are arbitrary coefficients. This definition generalizes directly to an $n$-form for $D = 2n - 1$. The total symmetry of (3.22) follows from (3.21) and the associativity of the BCH star product.
$^4$ The oscillator algebra based on (2.8) admits a natural graded (super) trace ${\rm tr}_y f(y) = f(0)$, such that ${\rm tr}_y (f(y) \star g(y)) = {\rm tr}_y (g(y) \star f(y))$ [6,40,41]. Using this trace we can construct the anti-symmetric invariants of the $ho(D-1,2)$ algebra. However, for the reasons explained above, even dressing this trace with derivative operators analogous to (3.20) cannot give rise to a non-vanishing symmetric combination.
At this stage we have to note that, strictly speaking, the cyclicity (3.21) is not sufficient to prove invariance of (3.22), since the commutator with respect to the BCH star product potentially contains ideal terms. However, for the linearization in the case of a spin-3 field to be analyzed below, one can check explicitly that the tensor is invariant to that order. So we expect (3.22) to be invariant under the adjoint action of the full $ho(4,2)$, which, furthermore, might fix the free coefficients $\alpha_k$.
The definition (3.22) will reproduce the symmetric spin-2 trilinear (3.3) provided we choose the first coefficient to be $\alpha_1 = 1/12$. Further, it follows that the invariant with two spin-2 generators and one spin-$s$ generator vanishes for $s > 2$ once the symmetries imposed by the Young projector of the spin-$s$ generator are taken into account,
\begin{equation}
\langle M_{AB}, M_{CD}, T_{E(s-1),F(s-1)} \rangle = 0 . \tag{3.23}
\end{equation}
This relation guarantees that the equations of motion for the HS fields (see (3.8)) will not contain a term depending only on the spacetime curvature, which in turn implies that the spin-2 field does not provide a source for the HS fields. Put differently, it is consistent with the field equations to set the HS fields to zero.
Only the first trace ${\rm Tr}_1$ enters the linearized spin-3 analysis which we will focus on below. The relevant invariant in $D = 5$ is given by
\begin{equation}
\langle M_{AB}, T_{CD,EF}, T_{GH,IJ} \rangle = 2 P_{(2,2)} P_{(2,2)} \big( \epsilon_{ABCEGI} \, \eta_{DH} \eta_{FJ} \big) , \tag{3.24}
\end{equation}
where the projectors impose the symmetries of the two spin-3 generators. We note that, up to an overall constant, the invariant (3.24) is the only possible term which is consistent with the imposed Young symmetries.
Up to now we have established the existence of a HS Lie algebra and an associated symmetric invariant tensor. This in turn is sufficient to define a consistent HS Chern–Simons action, which in, say, $D = 5$ is given by
\begin{equation}
S = \int_{M_5} \Big\langle W \, dW \, dW + \frac{3}{2} \, dW \, W \, W \, W + \frac{3}{5} \, W \, W \, W \, W \, W \Big\rangle . \tag{3.25}
\end{equation}
Here $W$ denotes the gauge field taking values in $ho(4,2)$. It contains by construction the Lovelock gravity discussed in Section 3.1, corresponding to the subalgebra $so(4,2)$. Note that all the complexity of this theory is encoded in the infinite-dimensional Lie algebra $ho(4,2)$ and the symmetric tensor. By virtue of the consistency of $ho(4,2)$ and the existence of the trilinear tensor, this action is by construction invariant under an exact HS symmetry at the full non-linear level, i.e., it satisfies requirement (i) in the introduction. However, due to the fact that the Lie brackets of $ho(4,2)$ are not known explicitly, at this stage the action (3.25) cannot be rewritten in a closed form in terms of the physical HS fields. Fortunately a linearized analysis can be performed, and in the next section we will show that one indeed recovers the correct free field limit, thus proving that (3.25) also satisfies condition (ii).
4. Dynamical analysis
In this section we will discuss some aspects of the dynamical content of the constructed HS theory. As it stands, the HS action (3.25) describes a theory with propagating gravitational torsion, so we expect also the HS torsions (which will be defined below) to propagate. Since the dynamics of this kind of theory is much less understood, we take here a pragmatic point of view, i.e., we impose vanishing torsion, which is compatible with the equations of motion though it is not enforced by them. For simplicity our focus will be on the first non-trivial case, viz. spin-3, which we believe exhibits generic features present for arbitrary spin.
4.1. Linearization and constraints for spin-3
We first note that, as in the purely gravitational case, an expansion around $AdS_5$ does not give rise to a non-trivial propagator. This can be seen by inspecting the equations of motion (3.8). Up to first order they are of the form
\begin{equation}
g_{ABC} \, R^B_{\rm AdS} \wedge R^C_{\rm HS} = 0 , \tag{4.1}
\end{equation}
where $R_{\rm HS}$ denotes the linearized HS contribution. As the AdS-covariant field strength vanishes in the AdS background, $R_{\rm AdS} = 0$, the equations are identically satisfied at first order and do not lead to any perturbative dynamics.
Instead we will first keep the discussion generic and later focus on an expansion around the $AdS_4 \times S^1$ solution discussed in Section 3.1. For this we have to know the HS algebra explicitly. Fortunately, for an expansion around a given background geometry, only the commutators between spin-2 and spin-$s$ generators enter, while the mutual interactions between the different HS fields are not relevant. The spin-3 generator is given by $T_{AB,CD}$, corresponding to the $(2,2)$ Young tableau (two rows of two boxes each), and it closes according to (2.16) with the spin-2 generator as$^5$
\begin{align}
[M_{AB}, T_{CD,EF}]_\star &= -2 \big( \eta_{AC} T_{BD,EF} + \eta_{AD} T_{CB,EF} + \eta_{AE} T_{CD,BF} + \eta_{AF} T_{CD,EB} \big) \nonumber \\
&= -8 \, \eta_{A\{C} T_{|B|D,EF\}} . \tag{4.2}
\end{align}
Here curly brackets denote (2, 2) Young projection, while in the following they also indicate
symmetrization according to the Hook tableau, etc. (see Appendix A). In a GL(5) covariant basis,
the spin-3 generators are given by Tab = Tab,66 , Tab,c = Tab,c6 and Tab,cd , and their algebras read
[Mab , Tcd ]= 4a c
T|b|d
,

[Mab , Tcd,e ] = 4a c
T|b|d,e
+ 4a c
Tde
,b ,

[Mab , Tcd,ef ]= 8a c
T|b|d,ef
,

[Pa , Tbc,d ] = 3a b Tcd
Tad,bc ,
[Pa , Tbc ] = 2Tbc,a ,

[Pa , Tbc,de ] = 8a b Tcd,e
.
(4.3)
Here we take the brackets [T, T] to be vanishing, even though in the full HS algebra they close into spin-4 generators. However, in the linearization these spin-4 fields decouple, and, indeed, this truncation defines a consistent Lie algebra.
Next we linearize the HS gauge field as6
$$W_\mu = e_\mu{}^{a} P_a + \frac{1}{2}\,\omega_\mu{}^{ab} M_{ab} + \frac{1}{2}\, e_\mu{}^{ab}\, T_{ab} + \frac{1}{3}\,\omega_\mu{}^{ab,c}\, T_{ab,c} + \frac{1}{12}\,\omega_\mu{}^{ab,cd}\, T_{ab,cd}, \tag{4.4}$$
where $e_\mu{}^{a}$ and $\omega_\mu{}^{ab}$ are vielbein and spin connection of the background geometry. Moreover, we consistently omitted contributions from all fields with spin s > 3. $e_\mu{}^{ab}$ will later be identified with the spin-3 field, while $\omega_\mu{}^{ab,c}$ and $\omega_\mu{}^{ab,cd}$ are auxiliary fields that have to be eliminated by means of constraints.
5 In the sequel we will drop the subscript ⋆ on the commutators.
6 The unit-strength normalizations follow from the Hook length formula [43].
It will turn out that these constraints are analogous to the torsion constraint of general relativity. As the torsion tensor appears as part of the field strength in (3.4), we will determine the required constraints in the HS case by computing the non-Abelian field strength based on the algebra (4.3). We find
F = W W + [W , W ]
1 ab
= T a Pa + R
Mab
2

1
1
1
+ T ab Tab + T ab,c Tab,c + R ab,cd Tab,cd + O 2 .
2
3
12
(4.5)
Here $T_{\mu\nu}{}^{a}$ denotes the background torsion, which we assume to vanish, while $R_{\mu\nu}{}^{ab}$ is the AdS-covariant background curvature tensor. The linearized HS field strengths read
e ab D
e ab + ab,c ec ab,c ec ,
T ab = D
ab,c D
ab,c + ab,cd ed ab,cd ed + 3e ab e c
3e ab e c
,
T ab,c = D
ab,cd D
ab,cd + 4 ab,c e d
4 ab,c e d
,
R ab,cd = D
(4.6)
where $D_\mu$ denotes the background Lorentz covariant derivative, which reads on the different fields
e ab = e ab + 2
a c e |c|b
,
D
ab,c = ab,c + 2
a d |d|b,c
2
a d bc
,d ,
D
ab,cd = ab,cd + 4
a e |e|b,cd
.
D
(4.7)
Before we turn to the constraints let us discuss the spin-3 symmetries, under which the field
strengths above stay invariant. Under a non-Abelian gauge transformation W = D = +
[W , ], with Lie algebra valued transformation parameter given in the spin-3 case by

1 ab
1 ab
1 ab,c
1 ab,cd
a
Tab,cd ,
= Pa + Mab + Tab + Tab,c +
(4.8)
2
2
3
12
we find the following variations (ignoring background diffeomorphisms and Lorentz transformations)
ab ab,c ec ,
e ab = D
ab,c 3 ab e c
ab,cd ed ,
ab,c = D
ab,cd 4 ab,c e d
.
ab,cd = D
(4.9)
Note that the gauge transformations whose parameters carry the index structures ab,c and ab,cd, corresponding to the auxiliary fields, act as Stückelberg shift symmetries.
Next we are going to discuss the constraints. We will see that imposing the conditions7
$$T_{\mu\nu}{}^{ab} = 0, \qquad T_{\mu\nu}{}^{ab,c} = 0, \tag{4.10}$$
7 Note that the first constraint allows to identify the background diffeomorphisms with the gauge transformations
generated by a in the sense that the latter read on e ab , up to local Lorentz and Stckelberg transformations, e ab =
e ab , where denotes the Lie derivative with respect to the vector field = e a a .
allows one to express $\omega_\mu{}^{ab,c}$ in terms of the physical spin-3 field $e_\mu{}^{ab}$ and its first derivative and $\omega_\mu{}^{ab,cd}$ in terms of $\omega_\mu{}^{ab,c}$ and its first derivatives. In turn, $\omega_\mu{}^{ab,cd}$ is a function of $e_\mu{}^{ab}$ and its first and second derivatives. The latter can be inserted into the third of Eq. (4.6), which then yields
the HS generalization of the Riemannian curvature tensor. Therefore the spin-3 curvature tensor
will be of third order in the derivatives of the spin-3 field. This procedure can be generalized to
arbitrary spin-s fields, whose curvature tensor will thus contain the sth derivative of the physical
spin-s field. (For traceless tensors in D = 4 spinorial form this analysis has been done in [44],
while a cohomological analysis in D dimensions can be found in [11,45].) This corresponds
to the hierarchy of de Wit-Freedman connections found in the metric-like formulation [30]. Since the equations of motion will necessarily impose conditions on the HS Riemann tensor, this implies that the field equations are in the linearization already of higher derivative order. So at first sight we seem to have little chance to recover the required 2nd-order Frønsdal equations. However, in flat space it has been shown that the Riemann tensor is a curl (Damour-Deser identity [49]) and that it can therefore be locally integrated, giving rise to the Frønsdal equations in the so-called compensator formulation [34,50]. Here we will prove that this generalizes to AdS.
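The derivative counting behind this hierarchy can be summarized schematically. In the following sketch all index contractions, projections and numerical factors are suppressed; the symbol $\sim$ only records the number of derivatives acting on the physical field, and the spin-2 line is added here for comparison.

```latex
\begin{aligned}
\text{spin 2:}\quad & T^{a} = 0 \;\Rightarrow\; \omega^{ab} \sim \partial e^{a},
  \qquad R^{ab} \sim \partial \omega^{ab} \sim \partial^{2} e^{a}, \\
\text{spin 3:}\quad & T^{ab} = 0 \;\Rightarrow\; \omega^{ab,c} \sim \partial e^{ab}, \\
 & T^{ab,c} = 0 \;\Rightarrow\; \omega^{ab,cd} \sim \partial \omega^{ab,c} \sim \partial^{2} e^{ab}, \\
 & R^{ab,cd} \sim \partial \omega^{ab,cd} \sim \partial^{3} e^{ab}.
\end{aligned}
```

Each additional auxiliary connection thus costs one derivative, which is why a spin-s curvature built in this way is of order s in derivatives.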
Let us now turn to the constraints. From the first of Eq. (4.10) we conclude
d bc,a a bc,d = 1ad,bc ,
(4.11)
where the curved index on ab,c has been converted into a flat index by means of the background
vielbein, and we have introduced a HS generalization of the coefficients of anholonomity,

e cd D
e cd .
1 ab,cd = ea eb D
(4.12)
By permuting the indices in (4.11), one finds the expression

1
bc,d = ea 1 a(b,c)d 1 ad,bc + 1 d(b,c)a + bc,d ,
2
where

1
bc,d = ea abc,d + bda,c + cda,b + dbc,a .
4
(4.13)
(4.14)
To understand the significance of ab,c , we first note that a priori (4.13) lives in the Young
tableaux
(4.15)
It follows from (4.14) that is in the window tableau, i.e., (1 P(2,2) ) = 0. In the following
we will have to treat as an independent field. One can easily check that (4.13) solves (4.11) for
arbitrary , by using the window property of the latter. In fact, we will see that the inclusion of
this auxiliary field is necessary in order for the composite connection bc,d (e, ) to reproduce
the correct transformation behavior in (4.9).
From now on we will specify the geometry to AdS, since this is the case we are interested in later on.8
8 Note, however, that the analysis performed here holds in an arbitrary dimension, i.e., it applies in particular to AdS5 as well as the AdS4 geometry we will discuss below.
Specifically, the background-covariant derivative $D_\mu$ reduces to the AdS-covariant derivative $\nabla_\mu$, characterized by
$$[\nabla_\mu, \nabla_\nu]\, V_\rho = \frac{1}{L^{2}}\,\big(g_{\nu\rho} V_\mu - g_{\mu\rho} V_\nu\big), \tag{4.16}$$
with the AdS metric $g_{\mu\nu}$ of radius L, which in our conventions is L = 1. Applying the first equation of (4.9) to (4.13), one can verify by use of (4.16) that the composite connection (4.13) transforms exactly as required by the second equation of (4.9), if one defines
, = ,
3 g
, .
(4.17)
In particular one sees that this transformation rule is consistent with the window symmetry
of ab,c .
The second torsion constraint in (4.10) can now be solved in a similar fashion. For our purposes it will, however, be sufficient to perform this analysis in a gauge-fixed formulation (for AdS backgrounds). This will effectively reduce the field content to the completely symmetric spin-3 field, given in a metric-like formulation by
$$h_{\mu\nu\rho} := e_{(\mu}{}^{a}\, e_{\nu}{}^{b}\, e_{\rho)ab}. \tag{4.18}$$
Specifically we use the Stückelberg shift symmetry in (4.9) whose parameter has the index structure ab,c to gauge the hooked part of $e_{\mu ab}$ to zero (see (A.3) in Appendix A). However, this gauge-fixing will be
violated by a generic spin-3 transformation, and so one has to add a compensating shift transformation with parameter ab,c = c ab
. Under this residual gauge symmetry only the completely
symmetric part of e transforms, namely as
h = ( ) ,
(4.19)
as required in the free limit (see Section 2.1). Furthermore, from (4.17) we infer, that also ab,c
is subject to a Stckelberg shift symmetry with transformation parameter in . Therefore it can
be gauged away completely, which in turn requires a compensating transformation with
, =
3 g
.
(4.20)
In total, after gauge-fixing the spin-3 connections will depend only on the completely symmetric
part of eab .
To solve the second torsion constraint we derive from (4.6) for ab,cd in flat indices:
a bc,de e bc,da = 2 ea bc,d ,
(4.21)
where

cd,e + 3e cd e e
( ) .
2 ab cd,e = ea eb D
(4.22)
We find the solution

1
ab,cd = ef 2 f a cd,b + 2 f b cd,a + 2 f c ab,d + 2 f d ab,c .
2
(4.23)
To verify that this is a solution it is not sufficient to use the symmetries of the 2 ab cd,e , but
instead the explicit expression given by (4.22) and (4.13) together with the AdS relation (4.16) is
required.
4.2. Spin-3 field equations

Let us now turn to the equations of motion. We specialize to the AdS4 background discussed in Section 3.1. Moreover, we set all components of $e_\mu{}^{ab}$ which have a leg in the fifth dimension to zero. In other words, we are not considering the dynamics of Kaluza-Klein scalars and vectors, etc., in order to simplify the analysis. Though in the full non-linear theory this would most likely not be a consistent Kaluza-Klein truncation, in the linearization this is justified since the different fields decouple.
Using the explicit form of the invariant tensor in (3.24) we see that, after imposing the constraint, the only non-trivial part of the equations of motion (4.1) requires the free index to take values in the Hook tableau. Moreover, we have seen in Eq. (3.23) that setting the background spin-3 field to zero is consistent with its equations of motion, which we implicitly assumed already in the expansion (4.4).
Specifically, by use of (3.24) we have

abcde Rab R d f , e h
0=P

1
(4.24)
abcde Rab R d f , e h + abhde Rab R d f , e c ,
2
where we used in the second equation the projector in (ch, f ). Specifying now to flat AdS4
indices a = (, 4),9 and using (3.10) this implies
=
e R , + ( ) = 0.
This yields in components by use of the identity
belling the indices
(4.25)
e
e]
3!e
e[
R , , R , + R , g + ( ) = 0.
and after rela(4.26)
Taking the , trace implies that the double trace of the Riemann tensor vanishes. We prove in
Appendix C that the final equation is equivalent to the condition that any single trace of the Riemann tensor, i.e., the spin-3 analogue of the Ricci tensor, vanishes. It turns out that a convenient
choice is the following:
R , = 0.
(4.27)
Next we are going to analyze this equation in more detail. By inserting (4.13) into (4.23) and
using (4.6) one finds the explicit expressions (in curved indices)
K + g ( h h ) + g ( h h ) ( ) = 0,
(4.28)
where we defined
K = h h h + ( ) h
(D 3)h g( h ) 3g h .
(4.29)
9 Indices μ, ν, ... denote D-dimensional spacetime indices. We hope it will not cause any confusion that we specify them in this section to curved AdS4 indices.
Here we left the spacetime dimension generic, though in our case it is D = 4. As outlined above, we are going to show that these 3rd-order differential equations can locally be integrated to effectively give rise to sensible 2nd-order field equations. We first note that, in contrast to Minkowski
space, on AdS a vanishing curl cannot locally be integrated to a gradient, since the covariant
derivatives do not commute. However, for K symmetric in , a condition like
[ K] = 2g[ ] 2g [ ] ,
(4.30)
with symmetric , can be solved by K = , as follows after reinsertion from (4.16).

Comparing with (4.28) we see that the equations of motion have almost this form, except that
the derived like this are not symmetric. If the latter are symmetrized by hand, additional
terms have to be added to the ansatz for K , which in turns implies that it is no longer a pure
gradient. Moreover, an integration constant has to be carefully taken into account. Altogether one
finds that
K = + g (h ) + g (h )
(4.31)
solves (4.28), where

= h h h + 2g ,
(4.32)
and is the integration constant. Then (4.31) can be rewritten by use of the explicit expression
in (4.29) as
AdS
= ( ) 4g( ) ,
F
(4.33)
where $F^{\mathrm{AdS}}$ is the AdS Frønsdal operator defined in Section 2.1. To understand the significance of the integration constant we note that Eq. (4.28) is by construction spin-3 invariant. However, by locally integrating, this invariance would be lost unless a non-trivial transformation behavior is assigned to the integration constant. In fact, (4.33) is only invariant if
= .
(4.34)
This shift symmetry can now be used to set the integration constant to zero, such that (4.33) reduces to the Frønsdal equation on AdS, the latter being invariant under all traceless spin-3 transformations. Thus we correctly recovered the required free spin-3 equations. The formulation (4.33), with its invariance under tracefull transformations and the appearance of the so-called compensator, in fact coincides completely with the construction of Francia and Sagnotti [46-48].
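The role of the trace of the gauge parameter can be illustrated numerically in the simpler flat-space setting. The sketch below is not the AdS operator of Section 2.1: it assumes a flat metric, works in momentum space (derivatives replaced by a commuting covector k), and uses Francia-Sagnotti-style unnormalized symmetrizations rather than the unit-strength convention of the main text. All function names are hypothetical. In these conventions the variation of the spin-3 Frønsdal operator under a tracefull transformation is three times the trace of the parameter multiplied by k k k, which is the term a compensator has to absorb.

```python
import numpy as np

def sym_grad(k, L):
    """(dL)_{mnr} = k_m L_{nr} + k_n L_{rm} + k_r L_{mn} for a symmetric L."""
    return (np.einsum('m,nr->mnr', k, L)
            + np.einsum('n,rm->mnr', k, L)
            + np.einsum('r,mn->mnr', k, L))

def fronsdal(k, h, eta_inv):
    """Flat-space spin-3 Fronsdal operator in momentum space (assumed conventions)."""
    k_up = eta_inv @ k                              # k^a
    box = k @ k_up                                  # k_a k^a
    div = np.einsum('a,anr->nr', k_up, h)           # divergence of h
    trace = np.einsum('ab,abr->r', eta_inv, h)      # trace of h
    grad2_trace = (np.einsum('m,n,r->mnr', k, k, trace)
                   + np.einsum('n,r,m->mnr', k, k, trace)
                   + np.einsum('r,m,n->mnr', k, k, trace))
    return box * h - sym_grad(k, div) + grad2_trace

def gauge_variation_of_fronsdal(k, eps, eta_inv):
    """Fronsdal operator evaluated on the pure-gauge field h = d(eps)."""
    return fronsdal(k, sym_grad(k, eps), eta_inv)

# Minkowski metric diag(-1, 1, 1, 1); numerically equal to its inverse.
eta_inv = np.diag([-1.0, 1.0, 1.0, 1.0])
rng = np.random.default_rng(1)
k = rng.normal(size=4)
A = rng.normal(size=(4, 4))
eps = A + A.T                                       # symmetric, tracefull parameter
eps_trace = np.einsum('ab,ab->', eta_inv, eps)

variation = gauge_variation_of_fronsdal(k, eps, eta_inv)
expected = 3.0 * eps_trace * np.einsum('m,n,r->mnr', k, k, k)
print(np.allclose(variation, expected))
```

For a traceless parameter the right-hand side vanishes, which is the familiar invariance of the uncompensated operator.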
5. Conclusions and outlook
By virtue of the Yang-Mills gauge invariance of Chern-Simons actions in any odd dimension,
the HS theories constructed in this paper provide a consistent coupling to gravity in the sense
that the free HS symmetry h1 ...s = (1 2 ...s ) gets deformed to an exact symmetry of
the full non-linear theory. In other words, condition (i) raised in the introduction is satisfied by
construction. Moreover, contrary to what is sometimes implicitly assumed, these topological
actions do possess propagating degrees of freedom for D > 3. By linearizing around the AdS4
solution found in [25], we verified explicitly that this is the case especially in the presence of HS
fields. We recovered the correct free field equations in the first non-trivial case of a spin-3 field.
For this we showed that on the subsector of vanishing spin-3 torsion the field equations, though
being 3rd-order differential equations, can locally be integrated to 2nd-order equations, which in
turn coincide with the Frønsdal equations in the formulation of [46,47].
We would like to stress that this is in contrast to previous attempts to construct consistent HS actions [13,14]: In order to guarantee free field equations of 2nd-order, they impose the additional condition that the extra fields (in our case the spin-3 connection $\omega_\mu{}^{ab,cd}$), which are generically of higher-derivative order, do not enter the free action. Here we do not have this freedom, since the action is completely determined by gauge invariance, i.e., the extra fields inevitably enter the free theory. That we get nevertheless the correct Frønsdal equations, or in other words, that the higher-derivatives are gauge artefacts that can be eliminated, is due to the curl-like structure of the HS Riemann tensor, which in flat space is known as the Damour-Deser identity [49,50] (see also [51,52]). Since we verified here an analogous behaviour on AdS for spin-3, this pattern will most likely extend to all HS fields, and therefore requirement (ii) for consistent HS theories is satisfied.
Let us also stress that in this approach it is very natural, if not necessary, to start with a HS algebra based on tracefull generators, since then the appearance of the compensator in the integration leading from (4.28) to (4.33) has a very natural interpretation in that it compensates for the non-invariance of the pure Frønsdal operator under tracefull HS transformations. Moreover, starting with traceless generators would imply in particular that the HS Riemann tensor is already traceless and consequently the field equations in the form (4.27) would be identically satisfied and not lead to any dynamics. Instead, the dynamics could possibly be encoded, via the Bianchi identities, in the lower-rank torsion-like tensors, for which, however, the distinction between constraint equations and dynamical equations would be less straightforward. (See also the discussion about the so-called $\sigma_-$-cohomology in [11] and references therein.)
Finally we note that, compared to the unfolded formulation of HS theories advertised in
the literature so far, the more conventional action principle presented here has the advantage of
admitting already a class of exact solutions. In fact, by virtue of the relation (3.23) we concluded
that any solution of the purely gravitational theory, as for instance black holes [53] and pp-waves
[54], can be lifted to an exact solution of the full theory, simply by setting all HS fields to zero.
Accordingly, this theory allows the analysis of HS dynamics on more complicated backgrounds
(and then, in principle, also of the back reaction of the geometry). This is in contrast to the
unfolded dynamics, for which even in case that all HS fields vanish, the construction of solutions
is a highly non-trivial problem. Indeed, apart from AdS, exact solutions have been found only
recently [55,56].
Many things are left to be done. First, we have analyzed the dynamical content only in case of trivial HS torsion. However, viewed as a 1st-order formulation, the theory does not imply vanishing torsion (though the latter provides a particular solution). So either one imposes the torsion constraints by hand, in order to express the de Wit-Freedman connections in terms of the physical HS fields, or one treats the torsions as carrying additional degrees of freedom. In the former case it is not clear that this is consistent with the HS gauge symmetry: Although we have seen in Section 4.1 that in case of a linearization around an AdS geometry the composite connections transform in the same way as the fundamental connections (and so imposing the constraints does not violate the HS invariance), it is not clear whether this is consistent in general. In case it is not consistent, this would mean that there are additional degrees of freedom associated to the torsion, which necessarily need to be taken into account. Apart from that we should point out that due to the way the torsion tensor enters the Chern-Simons theory, there does not exist a 1.5-order formalism.
One of the main difficulties in analyzing the non-linear dynamics of the constructed HS theory
in more detail is due to the fact that the infinite-dimensional HS algebras are poorly understood.
Though these algebras are well defined through the oscillator realization described in Section 2.2, the structure constants, for instance, are not known in general. Further research in this direction is required for a detailed analysis of the interactions.
Once the dynamical content is known, it remains to be seen how the different fields organize into HS multiplets. We first notice that the basic field content of the Chern-Simons theory in D = 5 does not fit into multiplets of ho(4, 2), since the latter requires in particular a massless scalar [6,57]. This in turn is the reason that the construction of HS actions à la MacDowell-Mansouri entirely based on a HS gauge field are not believed to be consistent to all orders [14]. However, our case is different, since there are no propagating HS modes around AdS5 and so there is no reason to expect that the D = 5 field content should organize in multiplets.10 Rather we found that the non-trivial HS dynamics takes place on backgrounds which are not maximally symmetric, such as the AdS4 solution. However, on this background there will most likely be scalar and other excitations which are the Kaluza-Klein modes originating from the off-diagonal components of the various fields. Due to the HS invariance of the full theory, these modes almost by construction will organize into multiplets of ho(3, 2), and it would be very interesting to see how this happens. In some sense the theory seems to prevent itself from becoming inconsistent exactly by not having standard dynamics around its most symmetric solution.
Let us finally note that the Chern-Simons theory in D = 11 based on (two copies of) the superalgebra osp(1|32) has been proposed as the non-perturbative definition of M-theory [61]. As the latter should cover in particular the infinite towers of massive HS states described by 10-dimensional string theory, it is very tempting to conjecture that osp(1|32) has to be enhanced to a HS extension, thus giving rise to Chern-Simons actions of the type considered here. In fact, recently it has been argued that the three-dimensional Chern-Simons theory based on a HS algebra is related to M-theory for non-critical strings in D = 2 via the background AdS2 × S1 [62]. Similarly to the AdS4 × S1 solution discussed for the D = 5 theory here, one might hope to identify a non-topological 10-dimensional phase, which permits flat Minkowski space and gives rise to massive HS states via spontaneous symmetry breaking.
Acknowledgements
For useful comments and discussions we would like to thank D. Francia, M. Henneaux,
C. Iazeolla, M.A. Vasiliev and especially P. Sundell.
This work has been supported by the European Union RTN network MRTN-CT-2004-005104 "Constituents, Fundamental Forces and Symmetries of the Universe" and the INTAS contract 03-51-6346 "Strings, branes and higher-spin fields". O.H. is supported by the Stichting FOM.
Appendix A. Young tableaux and projectors
Here we give a brief review of the technique of Young tableaux used in the main text. As we
are exclusively working with tracefull tensors, these encode the irreducible representations of
GL(m), as opposed to SO(m) groups. For tensors with AdSD indices we have m = D + 1, while
for the corresponding Lorentz tensors m = D.
10 A similar argument has been employed for supergravity in [58].
A Young tableau consists of a certain number of rows of boxes, where the number of boxes does not increase from top to bottom, as for instance the tableau with rows of 5, 3, 3 and 1 boxes. (A.1)
It describes the symmetries of an irreducible GL(m) tensor. For the example (A.1) it has the structure $T_{a_1\cdots a_5,\,b_1\cdots b_3,\,c_1\cdots c_3,\,d}$. As a matter of convention we choose the symmetric basis, which means that the corresponding tensors are completely symmetric in all row indices. Specifically, the tensor T above is completely symmetric in the sets of indices $\{a_i\}$, $\{b_i\}$ and $\{c_i\}$, respectively. For irreducibility the tensors have to satisfy the additional condition that symmetrisation of all indices in a certain row with any index corresponding to a box below that row gives zero. For instance, in the example this implies
$$T_{a_1\cdots a_5,(b_1\cdots b_3,c_1)\cdots c_3,d} = T_{a_1\cdots a_5,b_1\cdots b_3,(c_1\cdots c_3,d)} = 0, \quad \text{etc.}, \tag{A.2}$$
where ordinary brackets denote complete symmetrization of strength one, as, e.g., $T_{(ab)} := \frac{1}{2}(T_{ab} + T_{ba})$. Note that, accordingly, a tensor $T_{a,b}$ in the tableau consisting of a single column of two boxes is anti-symmetric, while in general no specific anti-symmetrization properties can be derived from the Young tableau.11 Moreover, one may check that for a tensor in the window tableau (two rows of two boxes), Eq. (A.2) implies the exchange property $T_{ab,cd} = T_{cd,ab}$.
The language of Young tableaux is efficient in order to determine the decomposition of tensor products into irreducible representations. Specifically, in the tensor product of the vector representation (a single box) with any Young tableau, the irreducible parts are obtained by adding a box to the given tableau in all possible ways. For instance, the spin-3 frame field $e_\mu{}^{ab}$ is a priori in the tensor product
(single box) ⊗ (row of two boxes) = (row of three boxes) ⊕ (Hook tableau with rows of 2 and 1 boxes), (A.3)
i.e., it contains the completely symmetric (physical) part and the so-called Hook diagram.
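The counting behind (A.3) can be checked with the hook length formula. The following is a minimal sketch with hypothetical helper names; it assumes the standard hook-content expression for the dimension of a GL(m) representation labelled by a Young tableau. As a side remark, the hook-length products of the tableaux (2), (2,1) and (2,2) are 2, 3 and 12, which coincide with the inverse prefactors 1/2, 1/3 and 1/12 appearing in the expansion (4.4).

```python
from fractions import Fraction

def hook_lengths(shape):
    """Hook length of every box of a (non-empty) Young tableau given by its row lengths."""
    cols = [sum(1 for r in shape if r > j) for j in range(shape[0])]
    return [[(shape[i] - j - 1) + (cols[j] - i - 1) + 1 for j in range(shape[i])]
            for i in range(len(shape))]

def hook_product(shape):
    """Product of all hook lengths of the tableau."""
    product = 1
    for row in hook_lengths(shape):
        for h in row:
            product *= h
    return product

def gl_dimension(shape, m):
    """Dimension of the GL(m) irrep: product over boxes of (m + column - row) / hook."""
    dim = Fraction(1)
    for i, row in enumerate(hook_lengths(shape)):
        for j, h in enumerate(row):
            dim *= Fraction(m + j - i, h)
    return int(dim)

m = 5  # Lorentz indices for D = 5
lhs = gl_dimension((1,), m) * gl_dimension((2,), m)
rhs = gl_dimension((3,), m) + gl_dimension((2, 1), m)
print(lhs, rhs)
```

The two printed numbers agree, as required for the decomposition of the frame field into its completely symmetric and Hook parts.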
Finally we give the projectors onto the Hook and window diagrams, which we need in the main text, explicitly. The Hook projector reads on a general tensor with no a priori symmetries
$$(P_{(2,1)} X)_{abc} \equiv X_{\{abc\}} = \frac{1}{3}\big(2X_{(ab)c} - X_{(bc)a} - X_{(ca)b}\big). \tag{A.4}$$
Similarly,
$$(P_{(2,2)} X)_{abcd} \equiv X_{\{abcd\}} = \frac{1}{6}\big(2X_{(ab)(cd)} + 2X_{(cd)(ab)} - X_{(cb)(ad)} - X_{(ad)(cb)} - X_{(ac)(bd)} - X_{(bd)(ac)}\big). \tag{A.5}$$
Analogous formulas hold in case of different index orderings, as, e.g., Hook projection according to indices (ab, c) on a tensor $X_{cab}$,
$$(P_{(2,1)} X)_{cab} = \frac{1}{3}\big(2X_{c(ab)} - X_{a(bc)} - X_{b(ca)}\big). \tag{A.6}$$
11 It is, however, possible to start with a different convention, in which the anti-symmetrization properties, i.e., the symmetries in a column of boxes, are specified. In Appendix C we have to relate these two.
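The projector properties of (A.4) and (A.5) can be verified numerically. The sketch below uses hypothetical function names and assumes that the relative signs in (A.4) and (A.5) are all minus except between the first two terms of (A.5); with this choice both maps are idempotent and their images satisfy the Young conditions stated above.

```python
import numpy as np

def sym_pair(X, i, j):
    """Symmetrize X in the index slots i and j with strength one."""
    return 0.5 * (X + np.swapaxes(X, i, j))

def hook_projector(X):
    """Hook projection (A.4) of a rank-3 tensor with no a priori symmetries."""
    S = sym_pair(X, 0, 1)                            # S[a,b,c] = X_{(ab)c}
    return (2.0 * S
            - np.einsum('bca->abc', S)               # X_{(bc)a}
            - np.einsum('cab->abc', S)) / 3.0        # X_{(ca)b}

def window_projector(X):
    """Window projection (A.5) of a rank-4 tensor with no a priori symmetries."""
    S = sym_pair(sym_pair(X, 0, 1), 2, 3)            # S[a,b,c,d] = X_{(ab)(cd)}
    U = S + np.einsum('cdab->abcd', S)               # X_{(ab)(cd)} + X_{(cd)(ab)}
    return (2.0 * U
            - np.einsum('cbad->abcd', U)             # X_{(cb)(ad)} + X_{(ad)(cb)}
            - np.einsum('acbd->abcd', U)) / 6.0      # X_{(ac)(bd)} + X_{(bd)(ac)}

rng = np.random.default_rng(0)
H = hook_projector(rng.normal(size=(4, 4, 4)))
W = window_projector(rng.normal(size=(4, 4, 4, 4)))

print(np.allclose(hook_projector(H), H))             # idempotency of (A.4)
print(np.allclose(window_projector(W), W))           # idempotency of (A.5)
print(np.allclose(W, np.einsum('cdab->abcd', W)))    # exchange property T_{ab,cd} = T_{cd,ab}
```

The random test tensors carry four-dimensional indices only for convenience; the same checks go through for any index range.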
Appendix B. Proof of cyclicity of the trace

In this appendix we prove the assertion made in Section 3.2 that the traces $\mathrm{Tr}_k$ in (3.21) are cyclic in a general odd dimension D = 2n − 1.
For a generic element F(M) in the enveloping algebra U[so(D − 1, 2)] the star product with $M_{AB}$ can be computed by use of (3.15),
MAB can be computed by use of (3.15),

2
MAB F = MAB + 2MC[A C B] + MCD [A C B] D MC[A B] D D C F ,
3

2
F MAB = MAB 2MC[A C B] + MCD [A C B] D MC[A B] D D C F ,
(B.1)
3
which implies

MAB , F (M) = 4MC[A C B] F (M).
(B.2)
This equation encodes the transformation of F(M) under $M_{AB}$. In a more mathematical language this states that the BCH star product is strongly so(D − 1, 2)-invariant [36,37].
In order to prove the cyclicity of the trace we first show for a generic monomial F =
F A1 B1 A B MA1 B1 MA B of degree

Tr1 [MAB , F ] = tr [MAB , F ] = 0.
(B.3)
To see this, we apply to (B.2), whose explicit evaluation gives

tr [MAB , F ]
n

n
tr A1 B1 Ar Br MC[A| Ar+1 Br+1 An Bn C |B] F
= 4 A1 An B1 Bn
r
r =0

= tr [MAB , F ] + 4n A1 An B1 Bn tr A1 B1 MC[A| A2 B2 An Bn C |B] F .
(B.4)
The first term vanishes, which follows from (B.2) and the fact that tr sets M = 0. The second
term can potentially be non-zero when = n. In this case it reduces to (after dropping a constant multiplicative factor) A1 An B1 Bn1 [A| FA1 B1 An1 Bn1 An |B] , which vanishes identically.
To see this we use F A1 B1 A B = F A1 B1 Bm Am A B , the symmetry under exchange of any
pair (Am , Bm ) and (Am , Bm ) and finally the fact that anti-symmetrization in 2n + 1 indices
vanishes identically for so(2n),
[A1 An B1 Bn1 A FA1 B1 An1 Bn1 An B] = 0.
(B.5)
Furthermore, one can show that for k 2

Trk [MAB , F ] = tr k [MAB , F ] = 0,
(B.6)
by using identities similar to (B.5), which proves that Trk (F MAB ) is cyclic. By using induction, the proof extends directly to Trk (F G ) for an arbitrary monomial G . For this

we expand G = m Gm M m , use that M m = (M)m + m <m cm M m together with as
sociativity, and finally apply (B.6) several times. For instance, when = 2 we find G2 =
GA1 B1 A2 B2 MA1 B1 MA2 B2 = GA1 B1 A2 B2 (MA1 B1 MA2 B2 2B1 A2 MA1 B2 ). The cyclicity of the
last term follows from the analysis above, and by repeatedly using (B.6) we have that
GA1 B1 A2 B2 Trk (F MA1 B1 MA2 B2 ) = GA1 B1 A2 B2 Trk (MA2 B2 F MA1 B1 )
= GA1 B1 A2 B2 Trk (MA1 B1 MA2 B2 F ).
(B.7)
Appendix C. The spin-3 Riemann tensor

Here we summarize some relations for the spin-3 Riemann tensor, most notably the Bianchi
identities. (On flat space, a very clear discussion of the spin-3 geometry in metric-like formulation can be found in [49], while aspects of a frame-like formulation are given in [32,59,60].) For
the proof of the Bianchi identities it will be convenient to work in form language, for which the
tensors in (4.6) read
ab + ab,c ec ,
ab,c + 3e ab ec
+ ab,cd ed ,
T ab,c = D
T ab = De
ab,cd + 4 ab,c ed
.
R ab,cd = D
(C.1)
After solving the torsion constraints the Bianchi identity follows by application of D to the second torsion tensor,

ab,cd + 4 ab,c ed
ed = R ab,cd ed ,
ab,c = D
0 = DT
(C.2)
where we used the first torsion constraint, T ab = 0, and the relation
2 ab,c = Rad d b,c + Rbd a d ,c + Rcd ab, d ,
D
(C.3)
evaluated for the AdS case (4.16). In components the Bianchi identity (C.2) reads
R[ ] , = 0,
(C.4)
where we converted all indices into curved ones.

These identities can now be used to prove that all traces of the Riemann tensor are algebraically related, or in other words, as in the spin-2 case there is a unique Ricci tensor. First of all,
the symmetries of the fiber indices according to the window Young tableau imply R a(b,cd) = 0,
which in turn shows that
R ab, c c = 2R a c ,bc = 2R b c ,ac ,
(C.5)
i.e. there is a unique trace in the fiber indices. By virtue of the Bianchi identity (C.4) the trace in
the fiber indices can then be related to the trace between one spacetime and one fiber index:
1
R , = R , .

2
(C.6)
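The uniqueness of the fiber-index trace expressed by (C.5) is a purely algebraic consequence of the window symmetry and can be checked numerically. The sketch below makes several assumptions: the function names are hypothetical, a random tensor projected with (A.5) stands in for the fiber indices of the Riemann tensor at fixed form indices, and indices are contracted with a flat metric diag(−1, 1, 1, 1). In the symmetric convention used here the condition $R_{a(b,cd)} = 0$ fixes the relative factor between the two traces to −2.

```python
import numpy as np

def sym_pair(X, i, j):
    """Symmetrize X in the index slots i and j with strength one."""
    return 0.5 * (X + np.swapaxes(X, i, j))

def window_projector(X):
    """Window projection (A.5), with the signs assumed as described above."""
    S = sym_pair(sym_pair(X, 0, 1), 2, 3)
    U = S + np.einsum('cdab->abcd', S)
    return (2.0 * U - np.einsum('cbad->abcd', U) - np.einsum('acbd->abcd', U)) / 6.0

def fiber_traces(R, eta_inv):
    """Return the traces R_{ab,c}^c and R_{a}^{c}{}_{,bc} of a window tensor."""
    trace_row = np.einsum('abcd,cd->ab', R, eta_inv)     # trace inside the second row
    trace_mixed = np.einsum('acbd,cd->ab', R, eta_inv)   # trace between the two rows
    return trace_row, trace_mixed

eta_inv = np.diag([-1.0, 1.0, 1.0, 1.0])
rng = np.random.default_rng(2)
R = window_projector(rng.normal(size=(4, 4, 4, 4)))
trace_row, trace_mixed = fiber_traces(R, eta_inv)
print(np.allclose(trace_row, -2.0 * trace_mixed))
```

The mixed trace also comes out symmetric in its two free indices, in line with the second equality in (C.5).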
We are now in a position to rigorously derive the field equations used in the main text. First
contracting (4.26) with g yields
(D 2)R , = 0,
(C.7)
where we used (C.4). This in turn implies that the double traces of the Riemann tensor appearing
in (4.26) can be set to zero. The remaining terms can be simplified by making repeated use of the
Bianchi identity (C.4) and the symmetries of the window tableau:

0 = R , R , + ( )
= R , R , + ( )
= 4R , + R , + R ,
= 3R , .
(C.8)
Here we used in the second line the Bianchi identity in , , , in the third line the window
symmetry of the second term in , , and finally the same symmetry in the fourth line. These
are the final equations of motion, which basically state that the spin-3 Ricci tensor vanishes.
In order to clarify the information contained in this equation, we decompose R , into its
irreducible parts. A priori it can take values in the Young tableaux
(C.9)
The origin of these different structures is that in the frame-like formulation the Riemann tensor
necessarily appears in a mixed basis in the sense that the anti-symmetric 2-form indices are on a
different footing as the frame indices. To compare with the completely symmetric or completely
anti-symmetric basis used in the metric-like formulation in [30] and [49], respectively, we have
to impose these symmetries, i.e., we define
(a)
R; ; =R,
,

(s)
R, = R( ( ), ) .
(C.10)
Since these are in definite Young tableaux (namely both in

, depending on the chosen
conventions for symmetrisation or anti-symmetrisation properties), it is easily seen that there is
a unique trace. Explicitly one finds
3
R(a)
;; = 8 R , ,
R(s)
, = R( ), .
2
With these relations it follows that this trace of R(s) is in
(C.11)
, while its algebraically related trace
R(a)
is in
. Similarly, the trace of
takes values in
, but interpreted in the anti-symmetric
basis. To summarize, taking the trace in the fiber indices of the Riemann tensor in the mixed basis
corresponds to the Ricci tensor in the completely anti-symmetric basis, while a trace between
spacetime and fiber index corresponds to the Ricci tensor in the completely symmetric basis.
References
[1] D.J. Gross, High-energy symmetries of string theory, Phys. Rev. Lett. 60 (1988) 1229.
[2] J. Isberg, U. Lindstrom, B. Sundborg, G. Theodoridis, Classical and quantized tensionless strings, Nucl. Phys. B 411
(1994) 122, hep-th/9307108.
[3] B. Sundborg, Stringy gravity, interacting tensionless strings and massless higher spins, Nucl. Phys. B (Proc.
Suppl.) 102 (2001) 113, hep-th/0103247.
[4] G. Bonelli, On the tensionless limit of bosonic strings, infinite symmetries and higher spins, Nucl. Phys. B 669
(2003) 159, hep-th/0305155;
G. Bonelli, On the covariant quantization of tensionless bosonic strings in AdS spacetime, JHEP 0311 (2003) 028,
hep-th/0309222.
[5] A. Sagnotti, M. Tsulaia, On higher spins and the tensionless limit of string theory, Nucl. Phys. B 682 (2004) 83,
hep-th/0311257.
[6] J. Engquist, P. Sundell, Brane partons and singleton strings, Nucl. Phys. B 752 (2006) 206, hep-th/0508124.
[7] D. Sorokin, Introduction to the classical theory of higher spins, AIP Conf. Proc. 767 (2005) 172, hep-th/0405069.
[8] S. Weinberg, E. Witten, Limits on massless particles, Phys. Lett. B 96 (1980) 59.
[9] C. Aragone, S. Deser, Consistency problems of spin-2 gravity coupling, Nuovo Cimento B 57 (1980) 33.
[10] S.R. Coleman, J. Mandula, All possible symmetries of the S-matrix, Phys. Rev. 159 (1967) 1251.
[11] X. Bekaert, S. Cnockaert, C. Iazeolla, M.A. Vasiliev, Nonlinear higher spin theories in various dimensions, Lectures given at Workshop on Higher Spin Gauge Theories, Brussels, Belgium, 12-14 May 2004, hep-th/0503128.
[12] S.W. MacDowell, F. Mansouri, Unified geometric theory of gravity and supergravity, Phys. Rev. Lett. 38 (1977) 739;
S.W. MacDowell, F. Mansouri, Unified geometric theory of gravity and supergravity, Phys. Rev. Lett. 38 (1977) 1376, Erratum.
[13] E.S. Fradkin, M.A. Vasiliev, Cubic interaction in extended theories of massless higher spin fields, Nucl. Phys. B 291 (1987) 141.
[14] M.A. Vasiliev, Cubic interactions of bosonic higher spin gauge fields in AdS5, Nucl. Phys. B 616 (2001) 106;
M.A. Vasiliev, Cubic interactions of bosonic higher spin gauge fields in AdS5, Nucl. Phys. B 652 (2003) 407, hep-th/0106200, Erratum.
[15] M.A. Vasiliev, Actions, charges and off-shell fields in the unfolded dynamics approach, Int. J. Geom. Methods Mod. Phys. 3 (2006) 37, hep-th/0504090.
[16] M.A. Vasiliev, Unfolded representation for relativistic equations in (2 + 1) anti-de Sitter space, Class. Quantum Grav. 11 (1994) 649.
[17] M.A. Vasiliev, Triangle identity and free differential algebra of massless higher spins, Nucl. Phys. B 324 (1989) 503.
[18] M.P. Blencowe, A consistent interacting massless higher spin field theory in D = (2 + 1), Class. Quantum Grav. 6 (1989) 443.
[19] E.S. Fradkin, V.Y. Linetsky, A superconformal theory of massless higher spin fields in D = (2 + 1), Mod. Phys. Lett. A 4 (1989) 731;
E.S. Fradkin, V.Y. Linetsky, A superconformal theory of massless higher spin fields in D = (2 + 1), Ann. Phys. 198 (1990) 293.
[20] E. Bergshoeff, M.P. Blencowe, K.S. Stelle, Area preserving diffeomorphisms and higher spin algebra, Commun. Math. Phys. 128 (1990) 213.
[21] O. Hohm, On the infinite-dimensional spin-2 symmetries in Kaluza-Klein theories, Phys. Rev. D 73 (2006) 044003, hep-th/0511165.
[22] O. Hohm, Gauged diffeomorphisms and hidden symmetries in Kaluza-Klein theories, Class. Quantum Grav. 24 (2007) 2825, hep-th/0611347.
[23] E. Witten, (2 + 1)-dimensional gravity as an exactly soluble system, Nucl. Phys. B 311 (1988) 46.
[24] A.H. Chamseddine, Topological gauge theory of gravity in five dimensions and all odd dimensions, Phys. Lett. B 233 (1989) 291.
[25] A.H. Chamseddine, Topological gravity and supergravity in various dimensions, Nucl. Phys. B 346 (1990) 213.
[26] M. Banados, L.J. Garay, M. Henneaux, The local degrees of freedom of higher dimensional pure Chern-Simons theories, Phys. Rev. D 53 (1996) 593, hep-th/9506187.
[27] M. Banados, L.J. Garay, M. Henneaux, The dynamical structure of higher dimensional Chern-Simons theory, Nucl. Phys. B 476 (1996) 611, hep-th/9605159.
[28] C. Fronsdal, Massless fields with integer spin, Phys. Rev. D 18 (1978) 3624.
[29] C. Fronsdal, Singletons and massless, integral spin fields on de Sitter space, Phys. Rev. D 20 (1979) 848.
[30] B. de Wit, D.Z. Freedman, Systematics of higher-spin gauge fields, Phys. Rev. D 21 (1980) 358.
[31] E.S. Fradkin, M.A. Vasiliev, Candidate to the role of higher spin symmetry, Ann. Phys. 177 (1987) 63.
[32] M.A. Vasiliev, Gauge form of description of massless fields with arbitrary spin, Yad. Fiz. 32 (1980) 855 (in Russian).
[33] M.A. Vasiliev, Nonlinear equations for symmetric massless higher spin fields in (A)dS(d), Phys. Lett. B 567 (2003) 139, hep-th/0304049.
[34] A. Sagnotti, E. Sezgin, P. Sundell, On higher spins with a strong Sp(2, R) condition, hep-th/0501156.
[35] E. Sezgin, P. Sundell, Doubletons and 5D higher spin gauge theory, JHEP 0109 (2001) 036, hep-th/0105001.
[36] P. Bieliavsky, M. Bordemann, S. Gutt, S. Waldmann, Traces for star products on the dual of a Lie algebra, math/0202126.
[37] S. Gutt, An explicit product on the cotangent bundle of a Lie group, Lett. Math. Phys. 7 (1983) 249.
[38] J. Madore, S. Schraml, P. Schupp, J. Wess, Gauge theory on noncommutative spaces, Eur. Phys. J. C 16 (2000) 161, hep-th/0001203.
[39] B. Jurco, S. Schraml, P. Schupp, J. Wess, Enveloping algebra valued gauge transformations for non-Abelian gauge
groups on non-commutative spaces, Eur. Phys. J. C 17 (2000) 521, hep-th/0006246.
[40] M.A. Vasiliev, Extended higher spin superalgebras and their realizations in terms of quantum operators, Fortschr.
Phys. 36 (1988) 33.
[41] G. Pinczon, R. Ushirobira, Supertrace and superquadratic Lie structure on the Weyl algebra, and applications to
formal inverse Weyl transform, Lett. Math. Phys. 74 (2005) 263.
[42] J. Zanelli, (Super-)gravities beyond 4 dimensions, hep-th/0206169.
[43] J. Fuchs, C. Schweigert, Symmetries, Lie Algebras and Representations: A Graduate Course for Physicists, Cambridge Univ. Press, 1997.
[44] M.A. Vasiliev, Free massless fields of arbitrary spin in the de Sitter space and initial data for a higher spin superalgebra, Fortschr. Phys. 35 (1987) 741;
M.A. Vasiliev, Free massless fields of arbitrary spin in the de Sitter space and initial data for a higher spin superalgebra, Yad. Fiz. 45 (1987) 1784.
[45] V.E. Lopatin, M.A. Vasiliev, Free massless bosonic fields of arbitrary spin in d-dimensional de Sitter space, Mod.
Phys. Lett. A 3 (1988) 257.
[46] D. Francia, A. Sagnotti, Free geometric equations for higher spins, Phys. Lett. B 543 (2002) 303, hep-th/0207002.
[47] D. Francia, A. Sagnotti, On the geometry of higher-spin gauge fields, Class. Quantum Grav. 20 (2003) S473, hep-th/0212185.
[48] D. Francia, A. Sagnotti, Higher-spin geometry and string theory, J. Phys. Conf. Ser. 33 (2006) 57, hep-th/0601199.
[49] T. Damour, S. Deser, Geometry of spin-3 gauge theories, Ann. Poincaré Phys. Theor. 47 (1987) 277.
[50] X. Bekaert, N. Boulanger, On geometric equations and duality for free higher spins, Phys. Lett. B 561 (2003) 183,
hep-th/0301243.
[51] X. Bekaert, N. Boulanger, Tensor gauge fields in arbitrary representations of GL(D, R). II: Quadratic actions, Commun. Math. Phys. 271 (2007) 723, hep-th/0606198.
[52] I. Bandos, X. Bekaert, J.A. de Azcarraga, D. Sorokin, M. Tsulaia, Dynamics of higher spin fields and tensorial
space, JHEP 0505 (2005) 031, hep-th/0501113.
[53] M. Banados, Constant curvature black holes, Phys. Rev. D 57 (1998) 1068, gr-qc/9703040.
[54] J.D. Edelstein, M. Hassaine, R. Troncoso, J. Zanelli, Lie-algebra expansions, Chern-Simons theories and the Einstein-Hilbert Lagrangian, Phys. Lett. B 640 (2006) 278, hep-th/0605174.
[55] E. Sezgin, P. Sundell, An exact solution of 4D higher-spin gauge theory, Nucl. Phys. B 762 (2007) 1, hep-th/0508158.
[56] V.E. Didenko, A.S. Matveev, M.A. Vasiliev, BTZ black hole as solution of 3d higher spin gauge theory, hep-th/0612161.
[57] S.E. Konstein, M.A. Vasiliev, Massless representations and admissibility condition for higher spin superalgebras, Nucl. Phys. B 312 (1989) 402.
[58] O. Hohm, H. Samtleben, Effective actions for massive Kaluza-Klein states on AdS3 × S3 × S3, JHEP 0505 (2005) 027, hep-th/0503088.
[59] C. Aragone, H. La Roche, Massless second order tetradic spin-3 fields and higher helicity bosons, Nuovo Cimento
A 72 (1982) 149.
[60] C. Aragone, S. Deser, Z. Yang, Massive higher spin from dimensional reduction of gauge fields, Ann. Phys. 179
(1987) 76.
[61] P. Horava, M-theory as a holographic field theory, Phys. Rev. D 59 (1999) 046004, hep-th/9712130.
[62] P. Horava, C.A. Keeler, Strings on AdS2 and the high-energy limit of noncritical M-Theory, arXiv: 0704.2230
[hep-th].
Two-loop fermionic corrections to massive Bhabha scattering
Stefano Actis a,*, Michał Czakon b,c, Janusz Gluza d, Tord Riemann a
a Deutsches Elektronen-Synchrotron, DESY, Platanenallee 6, D-15738 Zeuthen, Germany
b Institut für Theoretische Physik und Astrophysik, Universität Würzburg, Am Hubland, D-97074 Würzburg, Germany
c Institute of Nuclear Physics, NCSR DEMOKRITOS, 15310 Athens, Greece
d Institute of Physics, University of Silesia, Uniwersytecka 4, PL-40007 Katowice, Poland
Received 9 May 2007; accepted 27 June 2007
Available online 4 July 2007
We evaluate the two-loop corrections to Bhabha scattering from fermion loops in the context of pure
quantum electrodynamics. The differential cross section is expressed by a small number of master integrals
with exact dependence on the fermion masses $m_e$, $m_f$ and the Mandelstam invariants $s$, $t$, $u$. We determine
the limit of fixed scattering angle and high energy, assuming the hierarchy of scales $m_e^2 \ll m_f^2 \ll s, t, u$. The
numerical result is combined with the available non-fermionic contributions. As a by-product, we provide
an independent check of the known electron-loop contributions.
1. Introduction
Bhabha scattering is one of the processes at $e^+e^-$ colliders with the highest experimental precision and represents an important monitoring process. A notable example is its expected role in the luminosity determination at the future International Linear Collider (ILC), by measuring small-angle Bhabha-scattering events at center-of-mass energies ranging from about 100 GeV (Giga-Z collider option) to several TeV. Moreover, the large-angle region is relevant at colliders operating at 1-10 GeV. For some applications a full two-loop calculation of the QED contributions is mandatory.¹
E-mail address: stefano.actis@desy.de (S. Actis).
doi:10.1016/j.nuclphysb.2007.06.023
S. Actis et al. / Nuclear Physics B 786 (2007) 26-51
A large class of QED two-loop corrections was determined in the seminal work of [2]. Later, the complete two-loop corrections in the limit of zero electron mass were obtained in [3] thanks to the fundamental results of [4,5]. However, this result cannot be immediately applied, since the available Monte Carlo programs (see, e.g., [6-13]) employ a small, but non-vanishing electron mass. The $\alpha^2 \ln(s/m_e^2)$ terms due to double boxes were derived from [3] by the authors of [14], and the close-to-complete two-loop result in the ultra-relativistic limit was finally obtained in [15,16]. Note that the diagrams with fermion loops have not been covered by this approach. The virtual and real components involving electron loops could be added exactly in [17,18]. The non-approximated analytical expressions for all two-loop corrections, except for double-box diagrams and for those with loops from heavier-fermion generations, can be found in [19]. For a comprehensive investigation of the full set of the massive two-loop QED corrections, including double-box diagrams, we refer to [20-24]. The evaluation of the contributions from massive non-planar double-box diagrams remains open so far.
In order to add another piece to the complete two-loop prediction for the Bhabha-scattering cross section in QED, we evaluate here the so-far lacking diagrams containing heavy-fermion loops. The cross-section correction is expressed by a small number of scalar master integrals, where the exact dependence on the masses of the fermions and the Mandelstam variables $s$, $t$ and $u$ is retained. In a next step, we assume a hierarchy of scales, $m_e^2 \ll m_f^2 \ll s, t, u$, where $m_e$ is the electron mass and $m_f$ is the mass of a heavier fermion. We derive explicit results neglecting terms suppressed by positive powers of $m_e^2/m_f^2$, $m_e^2/x$ and $m_f^2/x$, where $x = s, t, u$. This high-energy approximation describes the influence of muons and $\tau$ leptons and proves well-suited for practical applications. In addition, we provide an independent cross-check of the exact analytical results of [17] (we used the files provided at [25] for comparison) for $m_f = m_e$.
The article is organized as follows. In Section 2 we introduce our notations and outline the
calculation and in Section 3 we discuss the solution for each class of diagrams. In Section 4
we reproduce the complete result for the corrections from heavier fermions in analytic form and
perform the numerical analysis. Section 5 contains the summary, and additional material on the
master integrals is collected in Appendix A.
2. Expansion of the cross section
We consider the Bhabha-scattering process,
$$
e^-(p_1) + e^+(p_2) \to e^-(p_3) + e^+(p_4), \tag{2.1}
$$
and introduce the Mandelstam invariants $s$, $t$ and $u$,
$$
s = (p_1 + p_2)^2 = 4E^2, \tag{2.2}
$$
$$
t = (p_1 - p_3)^2 = -4\left(E^2 - m_e^2\right)\sin^2\frac{\theta}{2}, \tag{2.3}
$$
$$
u = (p_1 - p_4)^2 = -4\left(E^2 - m_e^2\right)\cos^2\frac{\theta}{2}, \tag{2.4}
$$
where $m_e$ is the electron mass, $E$ is the incoming-particle energy in the center-of-mass frame and $\theta$ is the scattering angle. In addition, $s + t + u = 4m_e^2$.
¹ Note that leading two-loop effects in the electroweak Standard Model were already incorporated in [1].
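As a small numerical illustration of Eqs. (2.1)-(2.4), the following sketch evaluates the invariants for a given beam energy and scattering angle and checks the sum rule $s + t + u = 4m_e^2$. The function name `mandelstam` and the numerical value of the electron mass are assumptions made for this example; they are not taken from the paper.

```python
import math

M_E = 0.000510999  # electron mass in GeV (assumed numerical value)

def mandelstam(E, theta, m=M_E):
    """Return (s, t, u) in GeV^2 for beam energy E (GeV) and c.m. angle theta (rad)."""
    p2 = E * E - m * m                          # squared beam three-momentum
    s = 4.0 * E * E                             # Eq. (2.2)
    t = -4.0 * p2 * math.sin(theta / 2.0) ** 2  # Eq. (2.3)
    u = -4.0 * p2 * math.cos(theta / 2.0) ** 2  # Eq. (2.4)
    return s, t, u

s, t, u = mandelstam(E=5.0, theta=math.radians(90.0))
# The last number printed should vanish up to rounding errors.
print(s, t, u, s + t + u - 4.0 * M_E ** 2)
```

In the high-energy region considered later, $m_e^2$ is negligible against $s$, $|t|$ and $|u|$, so that $u \approx -s - t$.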
[Figure 1: diagram classes 1a, 1b, 1c]
Fig. 1. Classes of Bhabha-scattering one-loop diagrams. A thin fermion line represents an electron, a thick one can be any fermion. The full set of graphs can be obtained through proper permutations. We refer to [26] for the reproduction of the full set of graphs.
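In the kinematical region $m_e^2 \ll s, t, u$ the leading-order (LO) differential cross section with respect to the solid angle $\Omega$ reads as
$$
\frac{d\sigma^{\rm LO}}{d\Omega} = \frac{\alpha^2}{s}\left\{ \frac{1}{s^2}\left[\frac{s^2}{2} + t^2 + st\right] + \frac{1}{t^2}\left[\frac{t^2}{2} + s^2 + st\right] + \frac{1}{st}\,(s + t)^2 \right\}, \tag{2.5}
$$
where $\alpha$ is the fine-structure constant. At higher orders in perturbation theory we write an expansion in $\alpha$,
$$
\frac{d\sigma}{d\Omega} = \frac{d\sigma^{\rm LO}}{d\Omega} + \left(\frac{\alpha}{\pi}\right)\frac{d\sigma^{\rm NLO}}{d\Omega} + \left(\frac{\alpha}{\pi}\right)^2\frac{d\sigma^{\rm NNLO}}{d\Omega} + O\left(\alpha^5\right). \tag{2.6}
$$
Here $d\sigma^{\rm NLO}$ and $d\sigma^{\rm NNLO}$ summarize the next-to-leading order (NLO) and next-to-next-to-leading order (NNLO) corrections to the differential cross section. In the following it will be understood that we consider only components generated by diagrams containing one or two fermion loops.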
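The leading-order formula of Eq. (2.5) can be evaluated directly. The sketch below, written under the massless-electron assumption $u = -s - t$, also compares it with the familiar textbook form $\frac{\alpha^2}{2s}\left[\frac{t^2+u^2}{s^2} + \frac{s^2+u^2}{t^2} + \frac{2u^2}{st}\right]$, which should agree with it. The function names, the value of $\alpha$ and the GeV$^{-2}$ to nanobarn conversion constant are assumptions of this example.

```python
ALPHA = 1.0 / 137.035999    # fine-structure constant (assumed value)
GEV2_TO_NB = 0.3893794e6    # (hbar c)^2 in nb GeV^2 (assumed conversion constant)

def dsigma_lo(s, t, alpha=ALPHA):
    """Eq. (2.5): LO dsigma/dOmega in GeV^-2 for m_e^2 << s, |t|, |u|."""
    s_channel = (0.5 * s * s + t * t + s * t) / (s * s)
    t_channel = (0.5 * t * t + s * s + s * t) / (t * t)
    interference = (s + t) ** 2 / (s * t)
    return alpha ** 2 / s * (s_channel + t_channel + interference)

def dsigma_lo_textbook(s, t, alpha=ALPHA):
    """Same quantity written with u = -s - t, used here only as a cross-check."""
    u = -s - t
    return alpha ** 2 / (2.0 * s) * (
        (t * t + u * u) / (s * s) + (s * s + u * u) / (t * t) + 2.0 * u * u / (s * t)
    )

s, t = 100.0, -50.0   # sqrt(s) = 10 GeV, theta = 90 degrees
print(dsigma_lo(s, t) * GEV2_TO_NB, dsigma_lo_textbook(s, t) * GEV2_TO_NB)
```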
2.1. NLO differential cross section
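The NLO term follows from the interference of the one-loop vacuum-polarization diagrams of class 1a (see Fig. 1) with the tree-level amplitude,
$$
\frac{d\sigma^{\rm NLO}}{d\Omega} = \frac{d\sigma^{1a\times{\rm tree}}}{d\Omega}
= \frac{\alpha^2}{s}\left\{ \frac{1}{s^2}\left[\frac{s^2}{2} + t^2 + st\right] 2\sum_f Q_f^2\,{\rm Re}\left[\Pi_f^{(1)}(s)\right]
+ \frac{1}{t^2}\left[\frac{t^2}{2} + s^2 + st\right] 2\sum_f Q_f^2\,{\rm Re}\left[\Pi_f^{(1)}(t)\right]
+ \frac{1}{st}\,(s + t)^2 \sum_f Q_f^2\,{\rm Re}\left[\Pi_f^{(1)}(s) + \Pi_f^{(1)}(t)\right] \right\}. \tag{2.7}
$$
Here $\Pi_f^{(1)}(x)$ is the renormalized one-loop vacuum-polarization function and the sum over $f$ runs over the massive fermions, e.g., the electron ($f = e$), the muon ($f = \mu$), the $\tau$ lepton ($f = \tau$). $Q_f$ is the electric-charge quantum number, $Q_f = -1$ for leptons.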
In this paper we will focus on asymptotic expansions in the high-energy limit. In order to fix our normalizations explicitly, we reproduce here the exact result for $\Pi_f^{(1)}(x)$ in dimensional regularization. Adding $\Pi_f^{(1){\rm ct}}(x)$, the counterterm contribution in the on-mass-shell scheme (see the following discussion in Section 2.3), to $\Pi_f^{(1){\rm un}}(x)$, the unrenormalized one-loop vacuum-polarization function, we get
$$
\Pi_f^{(1)}(x) = \Pi_f^{(1){\rm un}}(x) + \Pi_f^{(1){\rm ct}}(x), \tag{2.8}
$$
$$
\Pi_f^{(1){\rm un}}(x) = \frac{1}{2(D-1)}\left[ 2(D-2)\,\frac{1}{x}\,A_0(m_f) - \left(D - 2 + 4\,\frac{m_f^2}{x}\right) B_0(x, m_f) \right], \tag{2.9}
$$
$$
\Pi_f^{(1){\rm ct}}(x) = F_\epsilon \left(\frac{m_e^2}{m_f^2}\right)^{\epsilon} \frac{1}{3}\left(\frac{1}{\epsilon} + \frac{\epsilon}{2}\,\zeta_2\right), \tag{2.10}
$$
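where $\epsilon = (4 - D)/2$ and $D$ is the number of space-time dimensions. The normalization factor is
$$
F_\epsilon = \left(\frac{m_e^2\,\pi\,e^{\gamma_E}}{\mu^2}\right)^{-\epsilon}, \tag{2.11}
$$
$\mu$ is the 't Hooft mass unit and $\gamma_E$ is the Euler-Mascheroni constant. Standard one-loop integrals appearing in Eq. (2.8) are defined by
$$
A_0(m) = \frac{\mu^{4-D}}{i\pi^2}\int d^Dk\,\frac{1}{k^2 - m^2}, \tag{2.12}
$$
$$
B_0\left(p^2, m\right) = \frac{\mu^{4-D}}{i\pi^2}\int d^Dk\,\frac{1}{(k^2 - m^2)\left[(k + p)^2 - m^2\right]}. \tag{2.13}
$$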
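Note that master integrals with $l$ lines and an internal scale $m$ were derived in [22,26] setting $m = 1$. For the present computation we introduce a scaling by a factor $m_f^{D-2l}$ and we get
$$
A_0(m_f) = F_\epsilon \left(\frac{m_e^2}{m_f^2}\right)^{\epsilon} m_f^2\,{\rm T1l1m}, \tag{2.14}
$$
$$
B_0(x, m_f) = F_\epsilon \left(\frac{m_e^2}{m_f^2}\right)^{\epsilon} {\rm SE2l2m}[x]. \tag{2.15}
$$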
In the small-mass limit, $A_0$ vanishes (the result for T1l1m can be read in Eq. (4) of [22]), and the one-loop self-energy² reads as
$$
{\rm SE2l2m}[x] = \frac{1}{\epsilon} + 2 + L_f(x) + \epsilon\left(4 - \frac{\zeta_2}{2} + 2L_f(x) + \frac{1}{2}L_f^2(x)\right), \tag{2.16}
$$
where we introduced the short-hand notation for logarithmic functions (in our conventions the logarithm has a cut along the negative real axis),
$$
L_f(x) = \ln\left(-\frac{m_f^2}{x + i\delta}\right), \qquad \delta \to 0^+. \tag{2.17}
$$
² Here, the argument $x$ of SE2l2m[x] is one of the relativistic invariants $s$, $t$, $u$. This deviates from earlier conventions, where we denoted by $x$ the dimensionless conformal transform of $s$, $t$, $u$. This remark applies also to master integrals in Appendix A.
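The $\epsilon$-expansion of the one-loop self-energy can be cross-checked numerically. The sketch below assumes that, in the small-mass limit and with the normalization of Eqs. (2.11) and (2.15), the integral reduces to the standard massless bubble $e^{\gamma_E\epsilon}\,\Gamma(\epsilon)\Gamma^2(1-\epsilon)/\Gamma(2-2\epsilon)\;e^{\epsilon L_f}$; this closed form is an assumption of the example and is not quoted in the text. For a real value of $L_f$ (space-like $x$), the difference from the truncated series $1/\epsilon + 2 + L_f + \epsilon\,(4 - \zeta_2/2 + 2L_f + L_f^2/2)$ should scale like $\epsilon^2$.

```python
import math

ZETA2 = math.pi ** 2 / 6.0
EULER_GAMMA = 0.5772156649015329

def bubble_closed_form(eps, L):
    """Massless one-loop bubble in the normalization of Eq. (2.15) (assumed closed form)."""
    gammas = math.gamma(eps) * math.gamma(1.0 - eps) ** 2 / math.gamma(2.0 - 2.0 * eps)
    return math.exp(EULER_GAMMA * eps) * gammas * math.exp(eps * L)

def bubble_expanded(eps, L):
    """Laurent series in eps, truncated after the O(eps) term."""
    return 1.0 / eps + 2.0 + L + eps * (4.0 - ZETA2 / 2.0 + 2.0 * L + 0.5 * L * L)

for eps in (1e-2, 1e-3):
    # The remainder should shrink by roughly a factor 100 between the two values.
    print(eps, bubble_closed_form(eps, 2.0) - bubble_expanded(eps, 2.0))
```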
[Figure 2: diagram classes 2a, 2b, 2c, 2d, 2e]
Fig. 2. Classes of Bhabha-scattering two-loop diagrams containing at least one fermion loop. We use the conventions of Fig. 1. Note that class 2a contains three topologically different subclasses. We refer to [26] for the reproduction of the full set of graphs.
Finally, neglecting $O(m_f^2/x)$ terms, $\Pi_f^{(1)}(x)$ reads as
$$
\Pi_f^{(1)}(x) = -\frac{F_\epsilon}{3}\left(\frac{m_e^2}{m_f^2}\right)^{\epsilon}\left[\frac{5}{3} + L_f(x) + \epsilon\left(\frac{28}{9} - \zeta_2 + \frac{5}{3}L_f(x) + \frac{1}{2}L_f^2(x)\right)\right]. \tag{2.18}
$$
Note that the $O(\epsilon)$ term in Eq. (2.18) is not required for the NLO computation, but it will become relevant at NNLO. Here $\Pi_f^{(1)}(x)$ will be combined with infrared-divergent graphs showing single poles in the $\epsilon$ plane for $\epsilon = 0$. The exact result for $\Pi_f^{(1)}(x)$ is available at [26].
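For orientation, the high-energy vacuum-polarization function can be evaluated numerically at $\epsilon = 0$. The sketch below assumes the limit $\Pi_f^{(1)}(x) = -\frac{1}{3}\left[\frac{5}{3} + L_f(x)\right]$ with $L_f(x) = \ln\left(-m_f^2/(x + i\delta)\right)$, so that for time-like $x > 0$ the logarithm acquires the imaginary part $+i\pi$. The function names and the lepton masses are assumptions of this example.

```python
import math

# Lepton masses in GeV (assumed numerical values).
LEPTON_MASSES = {"e": 0.000510999, "mu": 0.1056584, "tau": 1.77686}

def log_f(x, m_f):
    """L_f(x) of Eq. (2.17), with the +i*delta prescription applied analytically."""
    if x > 0.0:
        # -m^2/(x + i*delta) lies just above the cut on the negative real axis.
        return complex(math.log(m_f ** 2 / x), math.pi)
    return complex(math.log(m_f ** 2 / -x), 0.0)

def pi1(x, m_f):
    """High-energy one-loop vacuum polarization at eps = 0 (assumed form, see lead-in)."""
    return -(5.0 / 3.0 + log_f(x, m_f)) / 3.0

def sum_re_pi1(x, masses=LEPTON_MASSES):
    """Sum over leptons of Q_f^2 Re[Pi_f^(1)(x)] with Q_f = -1."""
    return sum(pi1(x, m).real for m in masses.values())

s = 91.1876 ** 2   # GeV^2
print(pi1(s, LEPTON_MASSES["e"]), sum_re_pi1(s))
```

The approximation only makes sense for $m_f^2 \ll |x|$, so the $\tau$ entry should not be used at low energies.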
2.2. Outline of the NNLO computation
At NNLO we have to consider:
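• The interference of the two-loop diagrams of classes 2a-2e (see Fig. 2) with the tree-level amplitude;
• The interference of the one-loop vacuum-polarization diagrams of class 1a with the full set of graphs of classes 1a-1c (see Fig. 1).
The complete result can be organized as
$$
\frac{d\sigma^{\rm NNLO}}{d\Omega} = \underbrace{\sum_{i=a,\ldots,e}\frac{d\sigma^{2i\times{\rm tree}}}{d\Omega}}_{\text{2-loop}\,\times\,\text{tree}} + \underbrace{\sum_{i=a,\ldots,c}\frac{d\sigma^{1a\times 1i}}{d\Omega}}_{\text{1-loop}\,\times\,\text{1-loop}}. \tag{2.19}
$$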
In order to compute the NNLO differential cross section we use the following reduction strategy:
• The generation of all the diagrams is simple and has been made with the computer-algebra systems GraphShot [27] and qgraf/DIANA [28-30]. We spin-sum the squared matrix elements and take the traces over Dirac indices in $D$ dimensions using the computer-algebra system FORM [31]. The resulting expressions are combinations of algebraic coefficients depending on $s$, $t$, $u$, $m_e$, $m_f$ and $\epsilon$, and two-loop integrals with scalar products containing the loop momenta in the numerators. An example showing the complexity of the result (two-loop box diagram of class 2e, see Fig. 2) can be found at [26].
• We reduce the loop integrals to a set of master integrals by means of the IdSolver implementation [32] of the Laporta algorithm [33,34]. The complete list of massive Bhabha-scattering master integrals can be found in [22].
Next, we evaluate the master integrals:
• Integrals arising from graphs of classes 1a-1c (Fig. 1), 2a-2c (Fig. 2) and 2d-2e (Fig. 2, with electron loops) have been computed exactly through the method of differential equations in the external kinematic variables and expressed through harmonic polylogarithms [35] or generalized harmonic polylogarithms [36,37]. Here we agree perfectly with the work of [17,25]. Non-approximated results for the various components of the differential cross section are collected in a Mathematica [38] file at [26].
• Integrals generated by the diagrams of classes 2d-2e (Fig. 2, with heavy-fermion loops) are computed through a method based on asymptotic expansions of Mellin-Barnes representations. We derived appropriate Mellin-Barnes representations [39,40] for each master integral and performed an analytic continuation in $\epsilon$ from a range where the integral is regular to the origin of the $\epsilon$ plane [4,5]. This is done by an automatic procedure implemented in the package MB.m [41]. To proceed further, we assume a hierarchy of scales, $m_e^2 \ll m_f^2 \ll s, t, u$, where $f \neq e$. After identifying the leading contributions in the fermion masses (in the same spirit as in [42]), we express the integrals by series over residues, and the latter are summed up analytically in terms of polylogarithms by means of the package XSUMMER [43]. Asymptotic expansions for the master integrals with two different masses were given in [44]. These, and a few lacking expansions of simpler masters needed here, have been collected in Appendix A. We refer for a detailed discussion to [24], where the technique was employed to derive approximated results for the massive Bhabha-scattering planar box master integrals. All the mass-expanded masters may also be found in a Mathematica file at [26].
2.3. Renormalization
In the following we will always deal with ultraviolet-renormalized quantities. After regularizing the theory using dimensional regularization [45,46], we perform renormalization in the on-mass-shell scheme. Here we relate all free parameters to physical observables:
• The electric charge coincides with the value of the electromagnetic coupling, as measured in Thomson scattering, at all orders in perturbation theory;
• The squared fermion masses are identified with the real parts of the poles of the Dyson-resummed propagators;
• Finally, field-renormalization constants are chosen in order to cancel external wave-function corrections.
Counterterm-dependent Feynman rules are shown in Fig. 3. Note that the presence of infrared divergences at NNLO requires computing the one-loop counterterms including $O(\epsilon)$ terms.

ie2i Zi p 2 g p p
i

ie2i Zff
(p
/ m) Zmi m
i
ie2i+1 Qf Zff
Fig. 3. Counterterm-dependent Feynman rules relevant for Bhabha scattering for $i = 1$ (one loop) and $i = 2$ (two loops). Note that in the on-mass-shell scheme $e^2 = 4\pi\alpha$ at all orders in perturbation theory.
2.3.1. One-loop counterterms

The one-loop counterterms read as

F 2 m2e 1
Q
+
Z1 =
2 ,
f
2
12 2
m2f
f

2

F
3
3
1
1
2 me
= Zm
=
Q
+
4
+

8
+
,
Zff
2

2
16 2 f m2f
1
1
= Zff
,
Zff
(2.20)
(2.21)
(2.22)
where the last equation follows from the U(1) QED Ward identity. In the ultrarelativistic limit, the
one-loop fermion-mass counterterm is not needed, since it is always multiplied by the fermion
mass. Note however that the same counterterm is relevant for the exact computation.
2.3.2. Two-loop counterterms
At the two-loop level we get
Z2 =

F 2 4 m2e 2 1 15
Q
+
,
f

2
128 4
m2f
(2.23)
Z2 ee

2 2

F 2
947
5
1
1
2 me
=
Qf
+
162 +
.
36
2 12
128 4 2
m2f
(2.24)
f =e
The result for Z2 ee is obtained including just fermion-loop diagrams and neglecting $O(m_e^2/m_f^2)$ terms for $f \neq e$. The expression for Z2 (as well as the one-loop counterterms of Eqs. (2.20)-(2.22)), instead, is exact, since it follows from the single-scale diagrams of classes 2a-2b of Fig. 2. Finally, we observe that the two-loop counterterm with two fermion lines is not required, since the use of an on-mass-shell renormalization removes external wave-function factors.
3. Two-loop corrections
In this section we show our approximated results for all the components of the NNLO differential cross section of Eq. (2.6). Our short-hand notation for logarithmic functions can be found in Eq. (2.17). In addition, we define two combinations of the Mandelstam invariants:
$$
v_1(x, y; \epsilon) = x^2 + 2y^2 + 2xy - \epsilon\,x^2, \tag{3.1}
$$
$$
v_2(x, y; \epsilon) = (x + y)^2 - \epsilon\left(x^2 + y^2 + xy\right), \tag{3.2}
$$
where $x\,(y) = s, t, u$. Note that for $\epsilon = 0$ these functions are proportional to the kinematical factors appearing in the Born cross section of Eq. (2.5) and the NLO corrections of Eq. (2.7). Moreover, we introduce a compact notation which will prove useful in discussing box corrections in Section 3.3 and the complete NNLO differential cross section in Section 4,
$$
L(R_f) = \ln\left(\frac{m_e^2}{m_f^2}\right). \tag{3.3}
$$
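The statement that $v_1$ and $v_2$ reduce to the Born kinematical factors at $\epsilon = 0$ can be checked in a few lines. The sketch below assumes the forms $v_1(x,y;\epsilon) = x^2 + 2y^2 + 2xy - \epsilon x^2$ and $v_2(x,y;\epsilon) = (x+y)^2 - \epsilon(x^2 + y^2 + xy)$, and verifies that the curly bracket of the Born cross section, $\frac{1}{s^2}\left[\frac{s^2}{2}+t^2+st\right] + \frac{1}{t^2}\left[\frac{t^2}{2}+s^2+st\right] + \frac{(s+t)^2}{st}$, equals $\frac{v_1(s,t;0)}{2s^2} + \frac{v_1(t,s;0)}{2t^2} + \frac{v_2(s,t;0)}{st}$. All function names are chosen for this example only.

```python
def v1(x, y, eps=0.0):
    """Kinematical combination v1(x, y; eps) (assumed form, see lead-in)."""
    return x * x + 2.0 * y * y + 2.0 * x * y - eps * x * x

def v2(x, y, eps=0.0):
    """Kinematical combination v2(x, y; eps) (assumed form, see lead-in)."""
    return (x + y) ** 2 - eps * (x * x + y * y + x * y)

def born_bracket(s, t):
    """Curly bracket of the Born cross section, written out explicitly."""
    return ((0.5 * s * s + t * t + s * t) / s ** 2
            + (0.5 * t * t + s * s + s * t) / t ** 2
            + (s + t) ** 2 / (s * t))

def born_bracket_from_v(s, t):
    """Same bracket expressed through v1 and v2 at eps = 0."""
    return v1(s, t) / (2.0 * s * s) + v1(t, s) / (2.0 * t * t) + v2(s, t) / (s * t)

s, t = 250000.0, -171.3   # sqrt(s) = 500 GeV, small-angle example values
print(born_bracket(s, t), born_bracket_from_v(s, t))
```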
3.1. Vacuum-polarization corrections
The interference of the vacuum-polarization diagrams of classes 2a and 2b with the tree-level amplitude can be written as
$$
\frac{d\sigma^{2i\times{\rm tree}}}{d\Omega} = \frac{\alpha^2}{s}\left\{ \frac{1}{s^2}\,v_1(s, t; 0)\,A^{2i}(s) + \frac{1}{t^2}\,v_1(t, s; 0)\,A^{2i}(t) + \frac{1}{st}\,v_2(s, t; 0)\left[A^{2i}(s) + A^{2i}(t)\right] \right\}, \qquad i = a, b. \tag{3.4}
$$
Here we introduced the auxiliary functions $A^{2a}(x)$ and $A^{2b}(x)$, which are expressed through the renormalized one- and two-loop vacuum-polarization functions $\Pi_f^{(1)}(x)$ (see Eq. (2.18)) and $\Pi_f^{(2)}(x)$,
$$
A^{2a}(x) = \sum_f Q_f^4\,{\rm Re}\left[\Pi_f^{(2)}(x)\right], \tag{3.5}
$$
$$
A^{2b}(x) = \sum_{f_1, f_2} Q_{f_1}^2 Q_{f_2}^2\,{\rm Re}\left[\Pi_{f_1}^{(1)}(x)\,\Pi_{f_2}^{(1)}(x)\right], \tag{3.6}
$$
where the result for $\Pi_f^{(2)}(x)$ in the small fermion-mass limit reads as
$$
\Pi_f^{(2)}(x) = -\frac{5}{24} + \zeta_3 - \frac{1}{4}\,L_f(x). \tag{3.7}
$$
Note that $O(\epsilon)$ terms in Eq. (3.4) coming from the kinematical coefficients of Eq. (3.1) can be safely neglected, since both $\Pi_f^{(1)}(x)$ and $\Pi_f^{(2)}(x)$ are infrared-finite quantities.
3.2. Vertex corrections

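The contribution of reducible (irreducible) vertex corrections to the NNLO differential cross section can be readily derived from diagrams of classes 2c (2d) in Fig. 2,
$$
\frac{d\sigma^{2i\times{\rm tree}}}{d\Omega} = 2\,\frac{\alpha^2}{s}\left\{ \frac{1}{s^2}\left[v_1(s, t; \epsilon)\,A_V^{2i}(s) + s^2 A_M^{2i}(s)\right] + \frac{1}{t^2}\left[v_1(t, s; \epsilon)\,A_V^{2i}(t) + t^2 A_M^{2i}(t)\right]
+ \frac{1}{st}\left[ v_2(s, t; \epsilon)\left(A_V^{2i}(s) + A_V^{2i}(t)\right) + \frac{3}{2}\left(s^2 A_M^{2i}(s) + t^2 A_M^{2i}(t)\right) + 2st\left(A_M^{2i}(s) + A_M^{2i}(t)\right) \right] \right\}, \qquad i = c, d. \tag{3.8}
$$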
3.2.1. Reducible diagrams

The auxiliary functions $A_V^{2c}(x)$ and $A_M^{2c}(x)$ are given by the product of the renormalized one-loop vacuum-polarization function $\Pi_f^{(1)}(x)$ (expanded in Eq. (2.18) including $O(\epsilon)$ terms) and the renormalized one-loop vector and magnetic vertex form factors $F_V^{(1)}(x)$ and $F_M^{(1)}(x)$,
$$
A_I^{2c}(x) = \sum_f Q_f^2\,{\rm Re}\left[F_I^{(1)}(x)\,\Pi_f^{(1)}(x)\right], \qquad I = V, M. \tag{3.9}
$$
The asymptotic expansion of $F_V^{(1)}(x)$ is given by

FV1 (x) =
1
3
1
1 + Le (x) 1 + 2 Le (x) L2e (x),
2
2
4
4
(3.10)
whereas $F_M^{(1)}(x)$ vanishes when we neglect the electron mass, $F_M^{(1)}(x) = 0$. The renormalized one-loop vertex develops an infrared divergence, which shows up as a single pole in the $\epsilon$ plane for $\epsilon = 0$. Therefore, when computing the cross section, we sum the squared matrix element over the spins and evaluate the traces over Dirac indices in $D = 4 - 2\epsilon$ dimensions. The needed kinematical structures include $O(\epsilon)$ terms (see Eq. (3.1)).
3.2.2. Irreducible diagrams
The renormalized two-loop vertex diagrams of class 2d are free of infrared divergences. Therefore, we can neglect $O(\epsilon)$ terms in the kinematical coefficients of Eq. (3.1) appearing in Eq. (3.8), setting $v_a(x, y; \epsilon) = v_a(x, y; 0)$, for $a = 1, 2$. The auxiliary functions $A_V^{2d}(x)$ and $A_M^{2d}(x)$ contain the renormalized two-loop vector and magnetic vertex form factors (see [47-49] for a detailed discussion),
$$
A_I^{2d}(x) = \sum_f Q_f^2\,{\rm Re}\left[F_{I,f}^{(2)}(x)\right], \qquad I = V, M. \tag{3.11}
$$
For the case with an electron loop, $F_{I,e}^{(2)}(x)$, the exact results in terms of harmonic polylogarithms can be readily expanded in the high-energy limit. For the vector term we get
$$
F_{V,e}^{(2)}(x) = \frac{1}{4}\left(\frac{383}{27} - \zeta_2\right) + \frac{1}{6}\left(\frac{265}{36} + \zeta_2\right)L_e(x) + \frac{19}{72}\,L_e^2(x) + \frac{1}{36}\,L_e^3(x). \tag{3.12}
$$
For $F_{V,f}^{(2)}(x)$, $f \neq e$, we perform an asymptotic expansion of the master integrals arising in the computation (see Table V in [22]) and we fully agree with the result of [50],
$$
F_{V,f}^{(2)}(x) = \frac{1}{6}\left(\frac{3355}{216} + \frac{19}{6}\,\zeta_2 - 2\zeta_3\right) + \frac{1}{6}\left(\frac{265}{36} + \zeta_2\right)L_f(x) + \frac{19}{72}\,L_f^2(x) + \frac{1}{36}\,L_f^3(x). \tag{3.13}
$$
Since collinear logarithms are absent, the logarithmic structure of Eqs. (3.12) and (3.13) is the same.
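The remark on the common logarithmic structure can be illustrated numerically: if both form factors are evaluated at the same value of the logarithm, they should differ only by a constant. The sketch below assumes the coefficients and signs spelled out in the code comments (constants $\frac{1}{4}(\frac{383}{27} - \zeta_2)$ and $\frac{1}{6}(\frac{3355}{216} + \frac{19}{6}\zeta_2 - 2\zeta_3)$, common terms $\frac{1}{6}(\frac{265}{36} + \zeta_2)L + \frac{19}{72}L^2 + \frac{1}{36}L^3$); these signs are an assumption of the example and should be checked against the original publication before any quantitative use.

```python
import math

ZETA2 = math.pi ** 2 / 6.0
ZETA3 = 1.2020569031595943

def log_terms(L):
    """Logarithmic part assumed common to both form factors."""
    return (265.0 / 36.0 + ZETA2) / 6.0 * L + 19.0 / 72.0 * L ** 2 + L ** 3 / 36.0

def fv2_electron(L):
    """Two-loop vector form factor, electron loop; L stands for L_e(x)."""
    return 0.25 * (383.0 / 27.0 - ZETA2) + log_terms(L)

def fv2_heavy(L):
    """Two-loop vector form factor, heavy-fermion loop; L stands for L_f(x)."""
    return (3355.0 / 216.0 + 19.0 / 6.0 * ZETA2 - 2.0 * ZETA3) / 6.0 + log_terms(L)

for L in (-5.0, -20.0):
    # The difference is the same constant for every value of L.
    print(L, fv2_heavy(L) - fv2_electron(L))
```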
3.3. Box corrections
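The contribution of the renormalized two-loop box diagrams of class 2e is given by
$$
\frac{d\sigma^{2e\times{\rm tree}}}{d\Omega} = \frac{\alpha^2}{2s}\left[ \frac{1}{s}\,A_1^{2e\times{\rm tree}}(s, t) + \frac{1}{t}\,A_2^{2e\times{\rm tree}}(s, t) \right]. \tag{3.14}
$$
Here the auxiliary functions can be conveniently expressed through three independent form factors $B_{I,f}^{(2)}(x, y)$, where $I = A, B, C$,
$$
A_1^{2e\times{\rm tree}}(s, t) = F_\epsilon^2 \sum_f Q_f^2\,{\rm Re}\left[ B_{A,f}^{(2)}(s, t) + B_{B,f}^{(2)}(t, s) + B_{C,f}^{(2)}(u, t) - B_{B,f}^{(2)}(u, s) \right], \tag{3.15}
$$
$$
A_2^{2e\times{\rm tree}}(s, t) = F_\epsilon^2 \sum_f Q_f^2\,{\rm Re}\left[ B_{B,f}^{(2)}(s, t) + B_{A,f}^{(2)}(t, s) - B_{B,f}^{(2)}(u, t) + B_{C,f}^{(2)}(u, s) \right]. \tag{3.16}
$$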
3.3.1. Electron loops

For the case with an electron loop, $B_{I,e}^{(2)}(x, y)$, we get exact results in terms of harmonic polylogarithms and generalized harmonic polylogarithms. An asymptotic expansion in the limit $m_e^2 \ll s, t, u$ leads to

5
2 17
1 2 x2
1 x2
(2)
BA,e (x, y) =
+ 2x + y
+ Le (y) Le (x) +
+ 202
3 y
3
3 y
3 3

41
1
23
+2
2 Le (x) 2
+ 82 Le (y) L2e (y) + 8Le (x)Le (y)
9
3
6

y
y
5 3
2
2
ln 1 +
Le (y) + 4Le (x)Le (y) 62 + ln
3
x
x

y
y
242
x
2 34
y
Li2
+ 2 Li3
+
+ 72 +
Le (x)
2 ln
x
x
x
3
3 3
9

5
1
+ 62 Le (y) + 13L2e (x) 16L2e (y) + 34Le (x)Le (y)

4
3
3

y
y
1 3
ln 1 +
+ 2 Le (x) L3e (y) + 3Le (x)L2e (y) 2 62 + ln2
3
x
x

y
y
y
130
y
2 17
4 ln
Li2
+ 4 Li3
+
+ 112 +
Le (x)
x
x
x
3
3 3
9

5
5
1
6(1 + 22 )Le (y) + L2e (x) L2e (y) + 4Le (x)Le (y) + L3e (x)
3
2
3

y
y
ln 1 +
+ 3Le (x)L2e (y) L3e (y) 62 + ln2
x
x

y
y
y
2 ln
(3.17)
Li2
+ 2 Li3
,
x
x
x
(2)
BB,e
(x, y) =

17
1 2 x2
1 x2 4
5
2 + 2x + y
202
+ Le (y) Le (x) +
3
y
3
3 y 3
3

1
23 2
56
2 Le (x) 4
+ 82 Le (y)
L (y) 20Le (x)Le (y)
+4
9
3
3 e

y
5 3
2
2 y
ln 1 +
2 Le (y) 4Le (x)Le (y) 2 62 + ln
3
x
x
36

y
x
2 34
y
y
272
4 ln
Li2
+ 4 Li3
+
+ 72 +
Le (x)
x
x
x
3
3 3
9

5
1
4
+ 62 Le (y) + 13L2e (x) + 40Le (x)Le (y) 16L2e (y)
3
3

1 3
y
3
2
2 y
+ 2 Le (x) Le (y) + 3Le (x)Le (y) 2 62 + ln
ln 1 +
3
x
x

y
y
2 17
y
y
130
4 ln
Li2
+ 4 Li3
+
+ 112 +
Le (x)
x
x
x
3
3 3
9

5
5
1
6(1 + 22 )Le (y) + L2e (x) L2e (y) + 4Le (x)Le (y) + L3e (x)
3
2
3

y
y
L3e (y) + 3Le (x)L2e (y) 62 + ln2
ln 1 +
x
x

y
y
y
2 ln
(3.18)
Li2
+ 2 Li3
,
x
x
x
(2)

1 2 x2 5
2
5
+ Le (y) Le (x) + (x + y) + Le (y) Le (x)
3 y 3
3
3

2
1 x 2 17
1
41
+
+ 202 2
2 Le (x) + 2
+ 82 Le (y)
3 y 3 3
9
3
5
23
+ L2e (y) 8Le (x)Le (y) + L3e (y) 4Le (x)L2e (y)
6
3

y
y
y
y
2 y
ln 1 +
+ 2 ln
Li2
2 Li3
.
+ 62 + ln
x
x
x
x
x
(3.19)
BC,e (x, y) =
3.3.2. Heavy-fermion loops

The list of master integrals used here was given in Table V of [22]. In Appendix A we collect the explicit analytic expressions for them in the ultra-relativistic limit. At variance with the electron-loop case, it is not possible to compute them exactly by means of a basis containing harmonic polylogarithms and generalized harmonic polylogarithms. Therefore, we use the high-energy asymptotic expansion discussed in Section 2.2. The results, expressed by the logarithms of the fermion masses $L(R_f)$ (see Eq. (3.3)), are:

1 2 x2
5
(2)
(x, y) =
+ 2x + y
L(Rf ) + Le (y) Le (x)
BA,f
3 y
3

2
1x
131
25
+
2
102 23 2
62 L(Rf )
3 y
27
9

7 2
1 3
82
4
+ L (Rf ) L (Rf ) +
22 L(Rf ) Le (x)
6
3
9
3

1
1
23
2 + 82 L(Rf ) Le (y)
2L(Rf ) L2e (y)
3
2
6

5 3
2
+ 4 2 L(Rf ) Le (x)Le (y) 4
L (y) Le (x)Le (y)
12 e
37

y
y
y
y
y
ln 1 +
2 ln
Li2
2 Li3
62 + ln
x
x
x
x
x

25
x
262
+
2
92 43 4
32 L(Rf )
3
27
9

7 2
2 3
121 10
10
L(Rf ) Le (x) 2
+ 122
+ L (Rf ) L (Rf ) + 2
3
3
9
3
3

13
16
2L(Rf ) L2e (x)
2L(Rf ) L2e (y)
2L(Rf ) Le (y) +
3
3

17
2
2L(Rf ) Le (x)Le (y) + L3e (x) + 6Le (x)L2e (y) 2L3e (y)
+2
3
3

y
y
y
y
y
2
ln 1 +
4 ln
Li2
+ 4 Li3
2 62 + ln
x
x
x
x
x

y
25
131
7 2
+
2
72 23 2
32 L(Rf ) + L (Rf )
3
27
9
6

1 3
130 10
L (Rf ) +
L(Rf ) Le (x) 6 + 122 3L(Rf ) Le (y)
3
9
3

5
25
2
L(Rf ) Le (x)
L(Rf ) L2e (y)
+
3
6

10
1
L(Rf ) Le (x)Le (y) + L3e (x) L3e (y) + 3Le (x)L2e (y)
+2
3
3

y
y
y
y
y
2
ln 1 +
2 ln
Li2
+ 2 Li3
,
62 + ln
x
x
x
x
x
(3.20)

5
1 2 x2
(2)
BB,f (x, y) =
2 + 2x + y
L(Rf ) + Le (y) Le (x)
3
y
3

50
7
2 x 2 262
202 43
122 L(Rf ) + L2 (Rf )
+
3 y 27
9
6

1 3
112
10
2
22 L(Rf ) Le (x) + 162
L (Rf ) +
3
9
3
3

23
2L(Rf ) L2e (y) + 2 5 2L(Rf ) Le (x)Le (y)
+ L(Rf ) Le (y)
6

y
5 3
2
2 y
L (y) Le (x)Le (y) 62 + ln
ln 1 +
4
12 e
x
x

y
y
y
2 ln
Li2
+ 2 Li3
x
x
x

2x 262
25
7
+
92 43 2
32 L(Rf ) + L2 (Rf )
3 27
9
6

136 13
10
1 3
L(Rf ) Le (x)
+ 122 2L(Rf ) Le (y)
L (Rf ) +
3
9
3
3

13
8
L(Rf ) L2e (x)
L(Rf ) L2e (y)
+
6
3

20
+
2L(Rf ) Le (x)Le (y)
3

1 3
y
2
3
2 y
+ Le (x) + 3Le (x)Le (y) Le (y) 62 + ln
ln 1 +
3
x
x

y
2y
131
y
y
2 ln
Li2
+ 2 Li3
+
72 23
x
x
x
3
27

25
7 2
1 3
65 5
32 L(Rf ) + L (Rf ) L (Rf ) +

L(Rf ) Le (x)
9
12
6
9
3

1
1 5
6 + 122 3L(Rf ) Le (y) +
L(Rf ) L2e (x)
2
2 3

1 25
10
1
L(Rf ) L2e (y) +

L(Rf ) Le (x)Le (y) + L3e (x)
2 6
3
6

1 3
3
1
y
y
Le (y) + Le (x)L2e (y) 32 + ln2
ln 1 +
2
2
2
x
x

y
y
y
ln
(3.21)
Li2
+ Li3
,
x
x
x
(2)

1 2 x2 5
2
5
L(Rf ) + Le (y) Le (x) + (x + y) L(Rf )
3 y 3
3
3

2 x2
25
131
+ Le (y) Le (x) +
+ 102 + 23 +
62 L(Rf )
3 y
27
9

7 2
1 3
41
2
1
L (Rf ) + L (Rf )
2 L(Rf ) Le (x) +
+ 82
12
6
9
3
3

1
23
L(Rf ) Le (y) 2 2 L(Rf ) Le (x)Le (y) +
L(Rf ) L2e (y)
2
12

5 3
1 2 y
y
2
+ Le (y) 2Le (x)Le (y) + 32 + ln
ln 1 +
6
2
x
x

y
y
y
+ ln
(3.22)
Li2
Li3
.
x
x
x
BC,f (x, y) =
In order to study the numerical effects of massive leptons in two-loop box diagrams we consider the interference of the box diagram of class 2e (see Fig. 2) with the s-channel tree-level amplitude,
$$
B_{2e,f} = \frac{\alpha^2}{4s^2}\,{\rm Re}\left[B_{A,f}^{(2)}(s, t)\right], \tag{3.23}
$$
where $B_{A,f}^{(2)}$ can be found in Eq. (3.17) for electron loops, and in Eq. (3.20) for $f \neq e$ loops. In Table 1 (Table 2) we show numerical values for the finite part of $B_{2e,f}$ at values of $\sqrt{s}$ typical for meson factories, Giga-Z, ILC, and at two selected small and wide scattering angles, $\theta = 3^\circ$ ($\theta = 90^\circ$).
For comparison we show in Table 3 the real part of the vertex function, see Eq. (3.13). We notice that the contributions of the box diagrams with heavier fermions are not strongly suppressed, and are comparable to those coming from the electron-loop boxes. This is different with respect
Table 1
Numerical values for the finite part of $B_{2e,f}$ of Eq. (3.23) in nanobarns at a scattering angle $\theta = 3^\circ$. The first two entries for the $\tau$ lepton are not shown since here the high-energy approximation is not justified (the same consideration applies to the top quark)
$B_{2e,f}$ [nb] / $\sqrt{s}$ [GeV]

10
91
500
e [see Eq. (3.17)]
$\mu$ [see Eq. (3.20)]
$\tau$ [see Eq. (3.20)]
188 758
1635.62
5200.08
1686.88
284.711
130.579
39.5554
Table 2
Numerical values for the finite part of $B_{2e,f}$ of Eq. (3.23) in nanobarns at a scattering angle $\theta = 90^\circ$. The first two entries for the top quark are not shown since here the high-energy approximation is not justified
$B_{2e,f}$ [nb] / $\sqrt{s}$ [GeV]

10
91
500
e [see Eq. (3.17)]
$\mu$ [see Eq. (3.20)]
$\tau$ [see Eq. (3.20)]
t [see Eq. (3.20)]
143.162
61.3875
10.0105
3.23102
1.79381
0.935319
Table 3
The real part for the vertex form factor, see Eqs. (3.12) and (3.13)
$\sqrt{s}$ [GeV]
10
91
e
124.237
4.8036
254.293
29.1057
2.08719
0.160582
0.0995184
0.0639576
0.00256757
500
400.574
70.1032
13.4901
to the self-energy and vertex corrections and may be traced back to the logarithmic structure of the terms in Eqs. (3.20)-(3.22), where terms of order $L_e^3(x)$ appear (note that the two-loop box master integrals of Eqs. (A.7) and (A.8) of Appendix A show a dependence on $L_e^3(x)$, in contrast to the vertex and self-energy masters with heavy-fermion loops). After assembling the box diagrams we see a remaining dependence on $\ln^2(s/m_e^2)$. This is a collinear mass singularity, coming from the external legs of the diagrams, which leads to the fact that the two-loop box corrections from heavier fermions are not numerically suppressed compared to the electron-loop contributions. One may verify this easily by evaluating the singularity structure of the corresponding massless box diagram, where only a scale $M$ due to the internal loop exists, and observing there some $1/\epsilon^2$ terms which are absent in the corresponding self-energy and vertex diagrams.
3.4. Products of one-loop corrections
Finally, we consider the simpler components generated by the interference of one-loop diagrams among themselves. We start with the interference of diagrams of class 1a,

d 1a1a 2 1
1
v1 (s, t; 0)A1a1a (s, s) + 2 v1 (t, s; 0)A1a1a (t, t)
=
d
2s s 2
t

1
+ v2 (s, t; 0) A1a1a (s, t) + A1a1a (t, s) .
(3.24)
st
Here the auxiliary function A1a1a (x, y) contains the product of the renormalized one-loop
vacuum-polarization function $\Pi_f^{(1)}(x)$ (see Eq. (2.18)) with its complex conjugate,
A1a1a (x, y)

Q2f1 Q2f2 f(1)
(x) f(1)
(y) .
1
2
(3.25)
f1 ,f2
The interference of diagrams of class 1a with those of class 1b gives

d 1a1b
2 1
1a1b
v1 (s, t; )AV
(s, s) + s 2 A1a1b
(s, s)
=2
M
2
d
s s

1
+ 2 v1 (t, s; )A1a1b
(t, t) + t 2 A1a1b
(t, t)
V
M
t

1a1b

1
+
(s, t) + A1a1b
(t, s)
v2 (s, t; ) AV
V
st

3
1a1b
1a1b
+ s 2 AM
(s, t) + t 2 AM
(t, s)
2

1a1b
+ 2st A1a1b
(s,
t)
+
A
(t,
s)
.
M
M
(3.26)
The auxiliary function A1a1b (x, y) is given by the product of $F_V^{(1)}(x)$ and $F_M^{(1)}(x)$, the renormalized one-loop vector (see Eq. (3.10)) and magnetic (vanishing in the high-energy limit)
form factors for the QED vertex, and the complex-conjugate renormalized one-loop vacuum-polarization
function $\Pi_f^{(1)}(x)$ (see Eq. (2.18)),
A1a1b
(x, y)
I
(1)
(1)
Q2f Re FI (x) f (y) ,
I = V, M.
(3.27)
Finally, the interference of diagrams of class 1a with those of class 1c gives

1
d 1a1c 2 1 1a1c
(s, t) + A1a1c
(s,
t)
.
=
A1
d
4s s
t 2
(3.28)
(s, t) and A1a1c

(s, t) take the form
Here the auxiliary functions A1a1c
1
2

(1)
(1)
(1)
A1a1c
(s, t) = F
Q2f Re BA (s, t) + BB (t, s) + BC (u, t)
1
f

(1)
(1)
BB (u, s) f (s) ,

(1)
(1)
(1)
(s, t) = F
Q2f Re BB (s, t) + BA (t, s) BB (u, t)
A1a1c
2
f

(1)
(1)
+ BC (u, s) f (t) ,
(s, t) = F
A1a1c
1
(3.29)

(1)
(1)
(1)
Q2f Re BA (s, t) + BB (t, s) + BC (u, t)

(1)
(1)
BB (u, s) f (s) ,
(3.30)
A1a1c
(s, t) = F
2

(1)
(1)
(1)
Q2f Re BB (s, t) + BA (t, s) BB (u, t)

(1)
(1)
+ BC (u, s) f (t) .
(3.31)
(1)
f (x) is given in Eq. (2.18), and the new functions, in the small mass limit, read as

4 x2
x2
(1)
BA (x, y) =
+ 2x + y Le (x) +
162 + 4Le (x) + 2L2e (y) 4Le (x)Le (y)
y
y

+ 2x 102 + Le (x) + Le (y) L2e (x) + L2e (y) 2Le (x)Le (y)

+ y 102 + 2Le (x) + 2Le (y) L2e (x) + L2e (y) 2Le (x)Le (y) ,
(3.32)
(1)

4 x2
x2
2 + 2x + y Le (x) + 4
82 + L2e (y) 2Le (x)Le (y)

y
y

+ 2x 102 Le (x) + Le (y) L2e (x) + L2e (y) 2Le (x)Le (y)

+ y 102 + 2Le (x) + 2Le (y) L2e (x) + L2e (y) 2Le (x)Le (y) ,
(3.33)

4 x2
x2
Le (x) + 2
82 2Le (x) L2e (y) + 2Le (x)Le (y)
y
y
4(x + y)Le (x).
(3.34)
BB (x, y) =
(1)
BC (x, y) =
For the computation of the non-fermionic corrections these functions are needed up to first order
in $\epsilon$, since they are combined with the real emission. However, this higher-order expansion is not
relevant here.
4. The net fermionic NNLO differential cross section
In this section we use the results of Section 3 and derive an explicit expression for the NNLO
differential cross section of Eq. (2.19).
Note that the full set of two-loop fermionic virtual corrections to Bhabha scattering represents
an infrared-divergent quantity. In order to obtain a finite quantity, we take into account the real
emission of soft photons3 from the external legs of one-loop fermionic diagrams (class 1a, Fig. 1).
The exact result is available in the literature, see, e.g., Eq. (25) and Appendix A in [18]. Here we
show the high-energy approximation relevant for our computation. We consider events involving
a single soft photon carrying energy in the final state,
$$e^-(p_1) + e^+(p_2) \to e^-(p_3) + e^+(p_4) + \gamma(k), \qquad (4.1)$$
and compute one-loop purely-fermionic corrections. Obviously, these real corrections factorize
and their structure is completely equivalent to the tree-level ones. In complete analogy with Eq.
(2.6) we write
LO 2 NLO

d
d
d
=
+
+ O 5 ,
(4.2)
d
d
d
3 The energy carried by a soft photon in the final state is small with respect to the center-of-mass energy E introduced
in Eq. (2.2).
where
dLO
d
dNLO
d

1
1
2 1
v
(s,
t;
)
+
v
(t,
s;
)
+
(s,
t;
)
F , s, t, m2e ,
v
1
1
2
2
2
s 2s
st
2t
(4.3)
(1) 1
(1)
2 1
2
v
(s,
t;
)
Q
Re
(s)
+
v
(t,
s;
)
Q2f Re f (t)
=
1
1
f
f
2
2
s s
t
f
f

1
Q2f Re f(1) (s) + f(1) (t) F , s, t, m2e .
+ v2 (s, t; )
(4.4)
st
f
$\Pi_f^{(1)}(x)$ can be read in Eq. (2.18) and, at variance with Eqs. (2.5)–(2.7), the kinematical factors
introduced in Eq. (3.1) need to be expanded up to $O(\epsilon)$, since the real-emission factor shows an
infrared divergence,

t
s
2
t
F , s, t, m2e = F ln
ln
1
+
1
+
ln

s
s
m2e

t
t
s
s
2
2
ln 1 +
+ 2 ln
2 ln + ln
+ ln
s
s
m2e
m2e
s

2
t
t
+ 4 ln
ln
ln 1 +
1
s
s
s

t
t
ln2 1 +
42 + ln2
s
s

t
t
2 Li2
(4.5)
+ 2 Li2 1 +
.
s
s
Summing the virtual contributions of Eq. (2.19) to the real-photon emission of Eq. (4.4) we
write the NNLO fermionic corrections to Bhabha scattering through the sum of electron-loop
contributions ($d\sigma^{\mathrm{NNLO},e}$) and components arising from heavier fermion loops,

d NNLO,f
d NNLO dNLO d NNLO,e 2 d NNLO,f
Qf
Q4f
+
=
+
+
d
d
d
d
d
2
f =e

f1 ,f2 =e
Q2f1 Q2f2
f =e
d NNLO,2f
d
(4.6)
The double summation over the fermion species arises from the loop-by-loop terms of Eqs. (3.6)
and (3.24). Here we do not include the case $f_1 = f_2 = e$, which is incorporated in $d\sigma^{\mathrm{NNLO},e}$.
Note also the term proportional to $Q_f^4$, coming from Eq. (3.5). The result for electron loops can
be found in Eq. (46) of [18]. For heavier fermion loops we introduce $x = -t/s$ and get:

4
5
2 (1 x + x 2 )2
d NNLO,f
s
)
+
4
=
ln
+
ln(R
f
3
d
2s
6
x2
m2e

1
3
3 x
+ ln(x) 2
+
,
2x 2 2
x
(4.7)

d NNLO,2f 2 (1 x + x 2 )2 2 s
ln
+ ln(Rf1 ) ln(Rf2 )
=
d
s
3x 2
m2e

5
s
10
5
ln(R
ln(R
+ ln
)
+
ln(R
)
)
+
ln(R
)
f1
f2
f1
f2
3
3
3
m2e

1
7 x
2 2
1
4
+ ln2 (x) 2
+
+
5 + 4x 2x 2
3
3x 6 3
3 x
x

1
s
1 x
10
1
+ ln(x) ln(Rf1 ) + ln(Rf2 )
+ 2 ln
+
,
3
m2e
3x 2 2x 2 6
(4.8)

2
d NNLO,f
2 NNLO,f2
2
NNLO,f2
,
=
1
+ 2
ln
d
s
s
2
1NNLO,f =
(4.9)

1 3 s
s
55
(1 x + x 2 )2
3
2
+
ln
ln
ln(Rf )
(R
)
+
ln
f
3
6
3x 2
m2e
m2e

589 37
s
+
ln(Rf ) ln2 (Rf )
+ ln(1 x) ln(x) + ln
18
3
m2e

2 ln(Rf ) ln(x) ln(1 x) 8 Li2 (x)
19 2
4795 409
ln(Rf ) +
ln (Rf )
108
18
6

40
Li2 (x)
ln2 (Rf ) ln(x) ln(1 x) 8 ln(Rf ) Li2 (x) +
3

2
11 23
16 2
4
1
s
2
+
ln
+
x
+
x
+
(x)
2
+ ln
2
2
2
3x
2
3
3
me
3x
3x

5
x
2 2
5
11
2
17
2
11

+ x + ln2 (1 x) 2 +
+ x x2
+
12x 4 12 3
6x 2
6
3
3x

55
1 5
4 2
4
83 65
2
+ x x + ln(x)
+
+ ln(x) ln(1 x)
3
6
3x 2 3x 2 3
9x 2 9x

1
10 2
10
31
10 2
31
85
10 + x x
x + x + ln(1 x) 2 +
18
9
3
6x
6
3
3x

2
1
11 x x
1
1
31
1
1
+ ln3 (x) 2 +
+
+ ln3 (1 x) 2 +
3
12x
6
6
3
3
x
3x
3x

x2
4
x2
1
1
4
2
+ ln (x) ln(1 x) 2 +
+x
+x
3
3
3x 3
3
3x

1
2
55
46
1
7
x
+ ln2 (x)
+ ln(x) ln2 (1 x) 2 + +
2
3
x 4 2
9x
x
18x

10 2
5
x
2 2
1
17
14 4
x x + ln(Rf ) 2 +

+ x
+
3
9
9
12x 4 12 3
3x

10
10
29 9 29
2
11
+ x + x 2 + ln(Rf ) 2 +
+ ln2 (1 x)
2
9x 2
9
9
6x
9x
3x
+

5 11
37
2 2
1 25
10
+ x x
+ x
+ ln(x) ln(1 x) 2 +
2
6
3
18x 2
9
9x

20 2
2
4
1 5
4 2
+ x + ln(Rf )
+ x x
9
3
3x 2 3x 2 3

589
1753 701 925
56
+ ln(x)
+
+
x x2
36
108
27
54x 2 108x

4
19
2
+ Li2 (x) 2 +
7 + 3x x 2
3x
3
x

37
56 47 67
4 1
10 2
2
+ ln(Rf )
+
x + x + 2 2 +
6
18
9
x 6
9x 2 9x
3x

10
161 56 161
56
56
2
x + 2x
x + x2
+ ln(1 x)
2
3
54x
9
54
27
27x

10
31
20
10 31
10 2
2
+ ln(Rf ) 2 +
+ x x + 2 2 +
18x
3
18
9
3x
9x
x

32 20
4
7
5
2
+ x 2x 2 + Li3 (x)
+ 3 x + x2
3
3
3
3
3x 2 3x

2
1
1
13
43
19
+ S1,2 (x) 2 + x + x 2 + 2
3
x
3
x
9x 2 18x

311
2
4
98 2
11 23
16 2
+
x x + ln(Rf ) 2 +
+
x+ x
18
9
3x
2
3
3
3x

3
11
4
+ 3 2 + 5 + x 2x 2 ,
x
3
3x
2
2NNLO,f =
(4.10)

8 (1 x + x 2 )2
8
s
s
2
ln
+
ln
+ ln(Rf )
3
3
x2
m2e
m2e

5
ln(Rf ) 1 + ln(1 x)
ln(1 x) + ln(x) ln(Rf ) +
3

s
7
2
1
5
2 2
4
2
+ 4 ln
+ 3 x + x + ln (x)
ln(x)
2
2
2
3x
3
3
x
me
3x
3x

1
2
1
1
+1 x
+ 1 x ln(x) ln(1 x)
3
3
3x 2 x

1
29
16
23
10
ln(x)
(4.11)
+ 13 x + x 2 .
3
3
3
3x 2 3x
In order to have compact results we used
$$S_{n,p}(y) = \frac{(-1)^{n+p-1}}{(n-1)!\,p!} \int_0^1 \frac{dx}{x}\, \ln^{n-1}(x)\, \ln^{p}(1 - xy). \qquad (4.12)$$
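As a cross-check of the definition in Eq. (4.12), the generalized (Nielsen) polylogarithm can be evaluated numerically. The following is a minimal sketch, not the implementation used for the results of this paper: it assumes the standard integral representation $S_{n,p}(y) = \frac{(-1)^{n+p-1}}{(n-1)!\,p!}\int_0^1 \frac{dx}{x}\ln^{n-1}(x)\ln^p(1-xy)$, substitutes $x = e^{-u}$, and applies a composite Simpson rule. The function name `nielsen_S` and the parameters `u_max` and `steps` are our own illustrative choices, and the routine is restricted to real $y < 1$, where the integrand stays smooth.

```python
import math

def nielsen_S(n, p, y, u_max=50.0, steps=20000):
    """Numerically evaluate the Nielsen polylogarithm S_{n,p}(y) for real y < 1.

    With x = exp(-u) the defining integral becomes
        int_0^inf du (-u)^(n-1) * ln^p(1 - y*exp(-u)),
    which is truncated at u_max and integrated with composite Simpson.
    """
    if n < 1 or p < 1:
        raise ValueError("n and p must be positive integers")
    if not y < 1.0:
        raise ValueError("this sketch only covers real y < 1")
    if steps % 2:
        raise ValueError("steps must be even for Simpson's rule")

    def integrand(u):
        return (-u) ** (n - 1) * math.log1p(-y * math.exp(-u)) ** p

    h = u_max / steps
    total = integrand(0.0) + integrand(u_max)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    integral = total * h / 3.0

    prefactor = (-1) ** (n + p - 1) / (math.factorial(n - 1) * math.factorial(p))
    return prefactor * integral

# Special cases: S_{1,1}(y) = Li_2(y) and S_{2,1}(y) = Li_3(y).
li2_half = sum(0.5 ** k / k ** 2 for k in range(1, 200))
li3_half = sum(0.5 ** k / k ** 3 for k in range(1, 200))
print(nielsen_S(1, 1, 0.5), li2_half)
print(nielsen_S(2, 1, 0.5), li3_half)
```

The comparison with the power series of $\mathrm{Li}_2$ and $\mathrm{Li}_3$ only tests the special cases $p = 1$; the function $S_{1,2}(x)$ entering Eq. (4.10) is obtained by calling `nielsen_S(1, 2, x)`.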
In Table 4 (Table 5) we show numerical values for the NNLO corrections to the differential
cross section for a scattering angle $\theta = 3^\circ$ ($\theta = 90^\circ$). In both tables the soft-photon energy is set to $E/10$. Finally, in
Table 4
Numerical values for the NNLO corrections to the differential cross section with respect to the solid angle. Results are
expressed in nanobarns for a scattering angle $\theta = 3^\circ$. Empty entries are related to cases where the high-energy approximation cannot be applied
d/d [nb] | s [GeV]

10
91
500
LO QED [Eq. (2.5)]
LO Zfitter [51,52]
NNLO (e) [Eq. (4.6)]
NNLO (e + ) [Eq. (4.6)]
NNLO (e + + ) [Eq. (4.6)]
NNLO photonic [14,16]
440873
440875
1397.35
1394.74
9564.09
5323.91
5331.5
35.8374
43.1888
251.661
176.349
176.283
1.88151
2.41643
2.55179
12.7943
Table 5
Numerical values for the NNLO corrections to the differential cross section with respect to the solid angle. Results are
expressed in nanobarns for a scattering angle $\theta = 90^\circ$. Empty entries are related to cases where the high-energy approximation cannot be applied
d/d [nb] | s [GeV]

10
91
500
LO QED [Eq. (2.5)]
LO Zfitter [51,52]
NNLO (e) [Eq. (4.6)]
NNLO (e + ) [Eq. (4.6)]
NNLO (e + + ) [Eq. (4.6)]
NNLO (e + + + t) [Eq. (4.6)]
NNLO photonic [14,16]
0.466409
0.468499
0.00453987
0.00570942
0.00586082
0.00563228
0.127292
0.0000919387
0.000122796
0.000135449
0.0358755
0.000655126
0.000186564
0.0000854731
$4.28105 \times 10^{-6}$
$5.90469 \times 10^{-6}$
$6.7059 \times 10^{-6}$
$6.6927 \times 10^{-6}$
0.0000284063
Fig. 4. Ratio of the fermionic NNLO corrections to the differential cross section with respect to the tree-level result for
$\sqrt{s} = 10$ GeV and $\sqrt{s} = 500$ GeV. A solid line represents the electron-loop contributions, a dotted one the sum of
electron- and muon-loop ones, and a dashed one includes also $\tau$ leptons.
Fig. 4 we plot the ratio of the two-loop fermionic corrections to the tree-level cross section,
2 NNLO
+ dNLO
d
(4.13)
d LO
for $\sqrt{s} = 10$ GeV and $\sqrt{s} = 500$ GeV (see also Fig. 5).
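The order of magnitude of the tree-level entries in Tables 4 and 5 can be checked independently. The sketch below is not taken from Eq. (2.5) itself: it assumes the standard massless, photon-exchange-only Bhabha formula $d\sigma/d\Omega = (\alpha^2/s)\,[(1 - x + x^2)/x]^2$ with $x = \sin^2(\theta/2)$, which equals $-t/s$ for massless kinematics. The function name `lo_bhabha_nb`, the value of $\alpha$ at zero momentum transfer, and the conversion constant $(\hbar c)^2$ are our own inputs.

```python
import math

ALPHA = 1.0 / 137.035999     # fine-structure constant (assumed input value)
GEV2_TO_NB = 389379.3721     # (hbar c)^2 in nb GeV^2, converts GeV^-2 to nb

def lo_bhabha_nb(sqrt_s_gev, theta_deg):
    """Massless tree-level QED Bhabha d(sigma)/d(Omega) in nanobarns."""
    s = sqrt_s_gev ** 2
    # x = -t/s = sin^2(theta/2) when the electron mass is neglected
    x = math.sin(math.radians(theta_deg) / 2.0) ** 2
    shape = ((1.0 - x + x * x) / x) ** 2
    return ALPHA ** 2 / s * shape * GEV2_TO_NB

for sqrt_s in (10.0, 91.0, 500.0):
    print(sqrt_s, lo_bhabha_nb(sqrt_s, 3.0), lo_bhabha_nb(sqrt_s, 90.0))
```

Under these assumptions the printed values should agree with the "LO QED" rows of Tables 4 and 5 at the sub-percent level; small differences are expected from the choice of numerical constants.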

It is clear from the tables, that although there is no decoupling of the heavier fermions (as
indeed there should not, since the typical scale of the process is large compared to all the masses),
R( s ) =
Fig. 5. Same as Fig. 4, including the photonic contributions of [2,14,16] (dash-dotted lines).
the electron loop contributions dominate in the fermionic part and the latter is still substantially
smaller than the pure photonic corrections.
5. Summary
In this article, we completed the computation of the virtual two-loop QED fermionic corrections to Bhabha scattering. Based on the kinematics of the targeted phenomenological applications, we considered the limit $m_e^2 \ll m_f^2 \ll s, t, u$.
The fermionic double box contributions with two different mass scales have been derived
for the first time here. Their numerical importance is comparable to the two-loop self-energies
and vertices. We note, however, a qualitative difference. Due to the structure of the collinear
singularities of the graphs, the contributions of the heavier fermions are not suppressed.
A numerical estimation of differential cross sections shows that the net fermionic two-loop
effects may be neglected for applications at LEP 1 and LEP 2, but have to be taken into account
for precision calculations when a level of $10^{-4}$ has to be reached, as is anticipated for the Giga-Z
option of the ILC project.
Completing the NNLO program for Bhabha scattering still requires several ingredients. First,
let us mention the contributions from the five light quark flavors. Here, an approach based on
dispersion relations à la [53] should be suitable. On the other hand, the heavy top quark might
be considered decoupling in a large part of the interesting kinematical regions. Furthermore, an
implementation of the loop-by-loop corrections with pentagon diagrams has to be done.
Finally, light fermionic pair emission diagrams need to be considered. As known from the
form-factor case, they are responsible for the cancellation of the leading part of the logarithmic
sensitivity on the masses.
Exact and approximated results are made publicly available at [26]. The combination of our
result with the photonic two-loop corrections of [16] and with electron loop corrections of [17,
25] proves well-suited for phenomenological purposes, e.g., a precise luminosity determination
at a future International Linear Collider.
Note added
We would like to thank T. Becher and K. Melnikov for drawing our attention to a problem with
a first version of our result, which led us to discover incorrectly expanded integrals (Eqs. (A.7),
(A.8) and (A.15) in Appendix A) appearing in the evaluation of the two-loop box form factors
(Eqs. (3.20), (3.21) and (3.22) of Section 3.3). After correction, Eq. (4.10) agrees with the result
published in the meantime in [55].
Acknowledgements
We would like to thank A. Arbuzov, R. Bonciani, A. Ferroglia and A. Penin for useful communications, and S. Moch and A. Mitov for interesting discussions.
Work supported in part by Sonderforschungsbereich/Transregio TRR 9 of DFG "Computergestützte Theoretische Teilchenphysik", by the Sofja Kovalevskaja Award of the Alexander von
Humboldt Foundation sponsored by the German Federal Ministry of Education and Research,
by the ToK Program "ALGOTOOLS" (MTKD-CD-2004-014319), by the Polish State Committee for Scientific Research (KBN, research projects in 2004–2006), and by the European
Community's Marie-Curie Research Training Networks MRTN-CT-2006-035505 "HEPTOOLS"
and MRTN-CT-2006-035482 "FLAVIAnet".
Appendix A. Mass-expanded master integrals
The list of master integrals required by our computation can already be found in Table V
of [22]. The eight most difficult masters, those involving two different mass scales, have been
derived in [44]. Because they are a substantial part of the present study we reproduce them here:
SE3l2M1m[on shell]

5
3 2 3
1 2
1
= M 2 m4 R
+
(R)
+
+
L(R)
L
2
2
2
2 2 4 8

3 7
45 5
11 1
+ R2
L(R) + R
+ 2 L(R) + L2 (R)
18 3
16 4
3
4

1
3 8
1
,
L3 (R) + R 2 + L(R) L2 (R)
2
4 9
2
SE3l2M1md[on shell]

1
1
1
4
=m
+
1 + 2L(R) + (1 + 2 ) + L(R) + L2 (R)
2
2 2 2

2 3
1
2
+ (3 + 32 23 ) + (1 + 2 )L(R) + L (R) + L (R)
6
3

3 1
7
3
+ R + L(R) +
L(R) + L2 (R)
4 2
8
4

1
5
1
1 2
5
2
+ R + L(R) + + L(R) + L (R)
.
36 6
72 18
4
In the following we set Lm (x) = ln(m2 /x) and LM (x) = ln(M 2 /x),

5
1
1
4
+
V4l2M1m[x] = m
+ 19 32 L2m (x)
2 2 2 2

M2
2 + 42 43 2Lm (x) + 2LM (x) 42 LM (x)
+
x

1 3
2
2
+ 2Lm (x)LM (x) LM (x) Lm (x)LM (x) + LM (x) ,
3
(A.1)
(A.2)
(A.3)
V4l2M1md[x] =

m4 1
1
1
1
+
(x)
+ 2 2 + Lm (x) + L2m (x)
1
+
L
m
2
2

2
4
m
2

2
M 1 1
LM (x) 1 + 32 + Lm (x) + LM (x)
+
x

1 2
Lm (x)LM (x) LM (x) ,
2
V4l2M2m[x] = m
(A.4)

1 5
1
1
2
+
+ Lm (x) + (19 + 2 ) + 5 Lm (x) + Lm (x) ,
2
2 2 2
(A.5)

m4
123 62 LM (x) L3M (x) ,

(A.6)
6x

m4 1
1
1 2
B5l2M2m[x, y] =
L
(x)
+
+
2L
(x)
+
(x)
+
L
(x)L
(y)
L
m
2
m
m
m
x

2 m
2
1
22 23 + 4Lm (x) + L2m (x) + L3m (x) 42 Lm (y)
3
1
2
+ 2Lm (x)Lm (y) + Lm (x)Lm (y) L3m (y)
6

1 2
1 2
y
32 + Lm (x) Lm (x)Lm (y) + Lm (y) ln 1 +
2
2
x

y
y
Lm (x) Lm (y) Li2
(A.7)
+ Li3
,
x
x
V4l2M2md[x] =
B5l2M2md[x, y]

m4 1
Lm (x)Lm (y) + Lm (x)L(R) 23 + 2 Lm (x) + 42 Lm (y)

=
xy
1
1
2Lm (x)L2m (y) + L3m (y) 22 L(R) + 2Lm (x)Lm (y)L(R) L3 (R)
6
6

1 2
1 2
y
+ 32 + Lm (x) Lm (x)Lm (y) + Lm (y) ln 1 +
2
2
x

y
y
+ Lm (x) Lm (y) Li2
Li3
.
x
x
(A.8)
We list also the other expanded masters, including the correct normalizations. Note that, compared to the conventions employed in [22] and in Eq. (2.16), all integrals are rescaled by a factor
mL(D2l) , where L is the number of loops, D = 4 2 and l is the number of internal lines.
Expansions are performed up to the order required by our computation. For example, we include
$O(m^2)$ terms in SE2l2m[x] (see Eq. (A.10)) since the reduction procedure generates coefficients containing $1/m^2$. The same consideration applies to $O(\epsilon)$ terms, which are included as
long as the reduction brings inverse powers of in the coefficient functions. Since in the following no ambiguities arise, in we drop the subscript f and we set L(x) = ln(m2 /x),

2 1 1
2
2 3
2
T1l1m = m
(A.9)
+1+ 1+
+ 1+
,

2
2
3
SE2l2m[x]

m2
2
1
2 1
=m
+ 2 + L(x) + 2
1 L(x) + 4 + 2L(x) + L2 (x)

x
2
2

2
m
7
1
+
2 + 22 L2 (x) + 2 8 2 3 + 4L(x) 2 L(x) + L2 (x)
x
3
2

1 3
m2
1 3
+ L (x) +
(A.10)
,
2 + 2 + 43 + 2 L(x) L (x)
6
x
3
SE2l0m[x]

2
1
1
= m2
+ 2 + L(x) + 4 + 2L(x) + L2 (x)

2
2

7
1
1
+ 2 8 2 3 + 4L(x) 2 L(x) + L2 (x) + L3 (x) ,
3
2
6

1 2
1 3
m2
42 + L (x) 53 + 2 L(x) L (x) ,
V3l1m[x] =
x
2
6
SE3l1m[on shell]

12 1
5
55 25
11
11 5
+
+

+
= m2
+
+
2
2
3
8
2
16
4
3
2 2 4

55
303
949 55
+ 2
+ 2 + 3 +
4 ,
32
8
6
8
SE3l2m[x]

2 12 1
1
x 13 1
x
2
= m
+
3
+ L(x)
+ 5 + 2 L (x) 2
2
2
4m2
m 8

16
+ 3 + 32 + 3 4L(x) + 22 L(x) 3L2 (x) L3 (x)
3

x 115 2 13
1
,
2
+ L(x) + L2 (x)
16
4
4
2
m
SE3l2md[x]

1
m2
1 2
1 2
1
= m4
+
(x)
+
L(x)
L
2 + L2 (x)
2
2 2
2
2
x
2

11 3
8
1
+ + 2 + 3 5L(x) + 2 L(x) 2L2 (x) L3 (x)
2
2
3
2

2
m
+
.
63 4L(x) 22 L(x) + L2 (x) + L3 (x)
x
(A.11)
(A.12)
(A.13)
(A.14)
(A.15)
Finally, the mass expanded one-loop box master integral B4l2m[x,y] can be collected from
Eqs. (4.70)(4.75) of [54]:
B4l2m[x, y]

x
x
m2 2
2
L(y) ln
+ 2L (y) 2L(y) ln
+ 43 92 L(y)
=
xy
y
y

2 3
x
x
x
1 3 x
2
+ L (y) + 52 ln
L (y) ln
+ ln
62 ln 1 +
3
y
y
3
y
y

x
x
x
x
x
+ 2 ln
ln
ln 1 +
ln2
ln 1 +
y
y
y
y
y

x
x
x
+ 2 ln
Li2 1 +
+ 2 Li3
.
y
y
y
(A.16)
References
[1] D. Bardin, W. Hollik, T. Riemann, Z. Phys. C 49 (1991) 485–490.
[2] A. Arbuzov, E. Kuraev, B. Shaikhatdenov, Mod. Phys. Lett. A 13 (1998) 2305–2316, hep-ph/9806215.
[3] Z. Bern, L. Dixon, A. Ghinculov, Phys. Rev. D 63 (2001) 053007, hep-ph/0010075.
[4] V. Smirnov, Phys. Lett. B 460 (1999) 397–404, hep-ph/9905323.
[5] J. Tausk, Phys. Lett. B 469 (1999) 225–234, hep-ph/9909506.
[6] F. Berends, R. Kleiss, Nucl. Phys. B 228 (1983) 537.
[7] F. Berends, R. Kleiss, W. Hollik, Nucl. Phys. B 304 (1988) 712.
[8] S. Jadach, W. Placzek, E. Richter-Was, B. Ward, Z. Was, Comput. Phys. Commun. 102 (1997) 229–251.
[9] S. Jadach, W. Placzek, B. Ward, Phys. Lett. B 390 (1997) 298–308, hep-ph/9608412.
[10] A. Arbuzov, G. Fedotovich, E. Kuraev, N. Merenkov, V. Rushai, L. Trentadue, JHEP 9710 (1997) 001, hep-ph/9702262.
[11] A. Arbuzov, G. Fedotovich, F. Ignatov, E. Kuraev, A. Sibidanov, Eur. Phys. J. C 46 (2006) 689–703, hep-ph/0504233.
[12] C. Carloni Calame, C. Lunardini, G. Montagna, O. Nicrosini, F. Piccinini, Nucl. Phys. B 584 (2000) 459–479, hep-ph/0003268.
[13] G. Balossini, C. Carloni Calame, G. Montagna, O. Nicrosini, F. Piccinini, Nucl. Phys. B 758 (2006) 227–253, hep-ph/0607181.
[14] N. Glover, B. Tausk, J. van der Bij, Phys. Lett. B 516 (2001) 33–38, hep-ph/0106052.
[15] A. Penin, Phys. Rev. Lett. 95 (2005) 010408, hep-ph/0501120.
[16] A. Penin, Nucl. Phys. B 734 (2006) 185–202, hep-ph/0508127.
[17] R. Bonciani, A. Ferroglia, P. Mastrolia, E. Remiddi, J. van der Bij, Nucl. Phys. B 701 (2004) 121–179, hep-ph/0405275.
[18] R. Bonciani, A. Ferroglia, P. Mastrolia, E. Remiddi, J. van der Bij, Nucl. Phys. B 716 (2005) 280–302, hep-ph/0411321.
[19] R. Bonciani, A. Ferroglia, Phys. Rev. D 72 (2005) 056004, hep-ph/0507047.
[20] V.A. Smirnov, Phys. Lett. B 524 (2002) 129, hep-ph/0111160.
[21] G. Heinrich, V.A. Smirnov, Phys. Lett. B 598 (2004) 55, hep-ph/0406053.
[22] M. Czakon, J. Gluza, T. Riemann, Phys. Rev. D 71 (2005) 073009, hep-ph/0412164.
[23] M. Czakon, J. Gluza, K. Kajda, T. Riemann, Nucl. Phys. B (Proc. Suppl.) 157 (2006) 16–20, hep-ph/0602102.
[24] M. Czakon, J. Gluza, T. Riemann, Nucl. Phys. B 751 (2006) 1–17, hep-ph/0604101.
[25] R. Bonciani, A. Ferroglia, Two-loop QED Bhabha scattering, http://pheno.physik.uni-freiburg.de/~bhabha/.
[26] S. Actis, M. Czakon, J. Gluza, T. Riemann, Two-loop QED Bhabha scattering, http://www-zeuthen.desy.de/theory/research/bhabha/bhabha.html/.
[27] S. Actis, A. Ferroglia, G. Passarino, M. Passera, C. Sturm, S. Uccirati, GraphShot, a FORM package for automatic generation and manipulation of one and two loop Feynman diagrams, unpublished.
[28] P. Nogueira, J. Comput. Phys. 105 (1993) 279.
[29] P. Nogueira, An introduction to QGRAF 2.0, ftp://gtae2.ist.utl.pt/pub/qgraf/.
[30] M. Tentyukov, J. Fleischer, Comput. Phys. Commun. 132 (2000) 124–141, hep-ph/9904258.
[31] J. Vermaseren, New features of FORM, math-ph/0010025.
[32] M. Czakon, DiaGen/IdSolver, unpublished.
[33] S. Laporta, E. Remiddi, Phys. Lett. B 379 (1996) 283–291, hep-ph/9602417.
[34] S. Laporta, Int. J. Mod. Phys. A 15 (2000) 5087–5159, hep-ph/0102033.
[35] E. Remiddi, J. Vermaseren, Int. J. Mod. Phys. A 15 (2000) 725–754, hep-ph/9905237.
[36] T. Gehrmann, E. Remiddi, Nucl. Phys. B 580 (2000) 485–518, hep-ph/9912329.
[37] T. Gehrmann, E. Remiddi, Nucl. Phys. B 601 (2001) 248–286, hep-ph/0008287.
[38] S. Wolfram, The Mathematica Book, Wolfram Media/Cambridge Univ. Press, 2003.
[39] N. Usyukina, Teor. Mat. Fiz. 22 (1975) 300–306.
[40] E. Boos, A. Davydychev, Theor. Math. Phys. 89 (1991) 1052–1063.
[41] M. Czakon, Comput. Phys. Commun. 175 (2006) 559–571, hep-ph/0511200.
[42] M. Roth, A. Denner, Nucl. Phys. B 479 (1996) 495–514, hep-ph/9605420.
[43] S. Moch, P. Uwer, Comput. Phys. Commun. 174 (2006) 759–770, math-ph/0508008.
[44] S. Actis, M. Czakon, J. Gluza, T. Riemann, Nucl. Phys. B (Proc. Suppl.) 160 (2006) 91–100, hep-ph/0609051.
[45] G. 't Hooft, M. Veltman, Nucl. Phys. B 44 (1972) 189–213.
[46] C. Bollini, J. Giambiagi, Nuovo Cimento B 12 (1972) 20–25.
[47] R. Bonciani, P. Mastrolia, E. Remiddi, Nucl. Phys. B 661 (2003) 289–343, hep-ph/0301170.
[48] P. Mastrolia, E. Remiddi, Nucl. Phys. B 664 (2003) 341–356, hep-ph/0302162.
[49] R. Bonciani, P. Mastrolia, E. Remiddi, Nucl. Phys. B 676 (2004) 399–452, hep-ph/0307295.
[50] G. Burgers, Phys. Lett. B 164 (1985) 167.
[51] D. Bardin, P. Christova, M. Jack, L. Kalinovskaya, A. Olchevski, S. Riemann, T. Riemann, Comput. Phys. Commun. 133 (2001) 229–395, hep-ph/9908433.
[52] A. Arbuzov, M. Awramik, M. Czakon, A. Freitas, M. Grünewald, K. Mönig, S. Riemann, T. Riemann, Comput. Phys. Commun. 174 (2006) 728–758, hep-ph/0507146.
[53] B. Kniehl, M. Krawczyk, J. Kühn, R. Stuart, Phys. Lett. B 209 (1988) 337.
[54] J. Fleischer, J. Gluza, A. Lorca, T. Riemann, Eur. J. Phys. 48 (2006) 35–52, hep-ph/0606210.
[55] T. Becher, K. Melnikov, arXiv:0704.3582 [hep-ph].
Invariant see-saw models and sequential dominance

S.F. King
School of Physics and Astronomy, University of Southampton, Southampton, SO17 1BJ, UK
Received 19 April 2007; accepted 27 June 2007
Abstract
We propose an invariant see-saw (ISS) approach to model building, based on the observation that see-saw
models of neutrino mass and mixing fall into basis invariant classes labelled by the Casas–Ibarra R-matrix,
which we prove to be invariant not only under basis transformations but also non-unitary right-handed
neutrino transformations S. According to the ISS approach, given any see-saw model in some particular
basis one may determine the invariant R-matrix and hence the invariant class to which that model belongs.
The formulation of see-saw models in terms of invariant classes puts them on a firmer theoretical footing,
and allows different see-saw models in the same class to be related more easily, while their relation to the
R-matrix makes them more easily identifiable in phenomenological studies. To illustrate the ISS approach
we show that sequential dominance (SD) models form basis invariant classes in which the R-matrix is
approximately related to a permutation of the unit matrix, and quite accurately so in the case of constrained
sequential dominance (CSD) and tri-bimaximal mixing. Using the ISS approach we discuss examples of
models in which the mixing naturally arises (at least in part) from the charged lepton or right-handed
neutrino sectors and show that they are in the same invariant class as SD models. We also discuss the
application of our results to flavour-dependent leptogenesis where we show that the case of a real R-matrix
is approximately realized in SD, and accurately realized in CSD.
1. Introduction
The discovery and subsequent study of neutrino masses and mixing [1] remains the greatest advance in physics over the past decade. The latest experimental data [2] is consistent
with (approximate) tri-bimaximal mixing [3] corresponding to $\sin\theta_{23} \approx 1/\sqrt{2}$, $\sin\theta_{12} \approx 1/\sqrt{3}$,
$\sin\theta_{13} \approx 0$ [3]. How to incorporate small neutrino masses and large mixings into some new theory
of flavour beyond the Standard Model has been the topic of intense theoretical activity [4]
over the same period.
E-mail address: sfk@hep.phys.soton.ac.uk.
doi:10.1016/j.nuclphysb.2007.06.024
S.F. King / Nuclear Physics B 786 (2007) 52–83
One particularly attractive mechanism is the see-saw mechanism [5], based on a simple extension of the (possibly Supersymmetric) Standard Model involving more than one right-handed
neutrino $\nu_R$, coupling to left-handed lepton doublets $L$ with a matrix of typical Yukawa couplings
$Y^{\nu}_{LR}$ (where typical means in the same ball park as the charged lepton Yukawa couplings
$Y^{E}_{LR}$ of $L$ to right-handed charged leptons $E_R$) and having large (compared to the weak scale)
Majorana masses $M_{RR}$. From these high energy inputs one may derive the low energy effective
neutrino mass matrix from the see-saw formula $m^{\nu}_{LL} = v_u^2\, Y^{\nu}_{LR} M_{RR}^{-1} Y^{\nu T}_{LR}$ where $v_u$ is a Higgs
vacuum expectation value (VEV). From $m^{\nu}_{LL}$ and $Y^{E}_{LR}$ one may then obtain the low energy
charged lepton masses $m_e$, $m_\mu$, $m_\tau$ and neutrino masses $m_i$ from the eigenvalues of the matrices,
and $V_{MNS} = V_{E_L} V_{\nu_L}^{\dagger}$, where $V_{E_L}$ and $V_{\nu_L}$ diagonalize $Y^{E}_{LR}$ and $m^{\nu}_{LL}$ from the left. After non-physical
phases are removed, the lepton mixing matrix $V_{MNS}$ can be compared to experiment.
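The algebraic structure of the see-saw formula can be illustrated with a small numerical sketch. It is a toy example under stated assumptions and not part of the analysis of this paper: two generations, real matrices, and arbitrary illustrative numbers for the Yukawa matrix `Y`, the Majorana mass matrix `M` and the transformation `S`. The helper names are our own, and the rule $Y \to Y S^{-1}$, $M \to S^{-T} M S^{-1}$ for a right-handed neutrino redefinition is one possible convention. The check shows that $Y M^{-1} Y^{T}$ is unchanged for any invertible `S`, including non-orthogonal ones.

```python
def transpose(a):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def seesaw_mass(Y, M, vu):
    """Effective light mass matrix vu^2 * Y * M^{-1} * Y^T."""
    core = matmul(matmul(Y, inverse2(M)), transpose(Y))
    return [[vu * vu * entry for entry in row] for row in core]

# Hypothetical inputs (GeV units for M and vu), chosen only for illustration.
Y = [[0.10, 0.02], [0.12, 0.05]]
M = [[1.0e14, 2.0e12], [2.0e12, 5.0e14]]
S = [[2.0, 0.3], [-0.7, 1.5]]          # invertible but not orthogonal

S_inv = inverse2(S)
Y_new = matmul(Y, S_inv)
M_new = matmul(matmul(transpose(S_inv), M), S_inv)

m_old = seesaw_mass(Y, M, 174.0)
m_new = seesaw_mass(Y_new, M_new, 174.0)
print(m_old)
print(m_new)
```

Both printed matrices should coincide up to rounding, since $Y S^{-1}\,(S M^{-1} S^{T})\,S^{-T} Y^{T} = Y M^{-1} Y^{T}$.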
There has been much theoretical effort devoted to understanding the origin and pattern of the
high energy see-saw matrices $Y^{\nu}_{LR}$, $Y^{E}_{LR}$ and $M_{RR}$ which can lead to agreement with low energy
data, via the see-saw mechanism [4]. This problem is often considered together with the
analogous one of the quark Yukawa matrices $Y^{U}_{LR}$, $Y^{D}_{LR}$, and is referred to as the flavour problem.
Although the flavour problem has been around for many years, the recent neutrino data
provide additional challenges and constraints which have provided new insights into the problem, and a renewed impetus to attack it, resulting in an explosion of recent theoretical work in
this direction. While it is impossible to review all the different models that have been proposed,
the different approaches may be classified as either "kinematical" or "dynamical". In both the
"kinematical" and "dynamical" approaches the goal is to guess or derive the high energy
input quark Yukawa matrices $Y^{U}_{LR}$, $Y^{D}_{LR}$ and lepton see-saw matrices $Y^{\nu}_{LR}$, $Y^{E}_{LR}$ and $M_{RR}$. However,
as has long been emphasized by Jarlskog [6] such matrices are not physical, since their appearance changes depending on the particular basis of underlying fields in which one chooses to work, and so
working in a particular basis is meaningless.
This paper starts from the simple observation that not all choices of see-saw matrices $Y^{\nu}_{LR}$,
$Y^{E}_{LR}$ and $M_{RR}$ which are consistent with a given set of low energy lepton parameters $m_e$, $m_\mu$, $m_\tau$,
$m_i$ and $V_{MNS}$, are related to each other under a change of basis. This is in contrast to the quark
sector where all choices of Yukawa matrices $Y^{U}_{LR}$, $Y^{D}_{LR}$ consistent with a given set of low energy
quark parameters $m_u$, $m_c$, $m_t$, $m_d$, $m_s$, $m_b$ and $V_{CKM}$, are related to each other under a change
of basis. It is also in contrast to the effective lepton sector, where all choices of effective lepton
matrices $m^{\nu}_{LL}$ and $Y^{E}_{LR}$ which are consistent with a given set of low energy lepton parameters $m_e$,
$m_\mu$, $m_\tau$, $m_i$ and $V_{MNS}$, are related to each other under a change of basis. This observation implies
that sets of see-saw matrices fall into invariant classes of models, $\{Y^{\nu}_{LR}, Y^{E}_{LR}, M_{RR}\} \in C(R)$,
where each different class $C(R)$ is labelled by some continuous parameters $R$, where members
of $C(R)$ are consistent with the same low energy lepton observables $m_e$, $m_\mu$, $m_\tau$, $m_i$ and $V_{MNS}$,
for all $R$. The set of all see-saw matrices within a particular invariant class $C(R_1)$ are related to
each other under a change of basis, but are not related to those in a different class $C(R_2)$.
It is well known amongst the phenomenological community that the R-matrix of Casas and
Ibarra [7] may be used to parameterize choices of see-saw matrices $Y^{\nu}_{LR}$, $Y^{E}_{LR}$, $M_{RR}$ consistent
with a given set of low energy lepton parameters $m_e$, $m_\mu$, $m_\tau$, $m_i$ and $V_{MNS}$. Although it was
appreciated by Casas and Ibarra that the R-matrix parameterization may be used in different
lepton bases [7], this feature is rarely or never used in phenomenological analyses where people
invariably work in the flavour basis where $Y^{E}_{LR}$, $M_{RR}$ are both diagonal. On the other hand,
the R-matrix is largely ignored by the theoretical community who are concerned with guessing
or deriving the see-saw matrices in a particular basis, which in general will not correspond to the
flavour basis, so the R-matrix is not regarded as relevant.
In the present paper we show that the R-matrix is a basis invariant quantity, then propose
using it in the context of model building to label the invariant class C(R) of see-saw models to
which a particular model example belongs. Given a particular see-saw model there are several
reasons why it is worth determining the invariant class C(R) to which it belongs, i.e. finding the
invariant R-matrix associated with the particular see-saw model:
1. It puts the theory on a firmer theoretical foundation, since invariant quantities are always
preferred to basis dependent ones [6].
2. Given the R-matrix one may immediately generate an infinite set of equivalent see-saw models filling out the invariant class C(R) by applying lepton basis changes. This applies both to
the "kinematical" and the "dynamical" approaches. So for any particular model (infinitely)
many other models come for free.
3. It may turn out that a particular model under consideration corresponds to the same R-matrix
as another model, i.e. the two models are in the same invariant class, in which case the two
models should essentially be regarded as the same model.
4. For a given (class of) models, with R specified one may immediately make contact with phenomenological analyses which have been performed in the literature which are relevant to
testing the (class of) models.
In this paper we shall illustrate the power of such an invariant see-saw (ISS) approach by
discussing the case of sequential dominance (SD) [8]. SD is motivated by two considerations:
• To account for a neutrino mass hierarchy $m_1 \ll m_2 \ll m_3$ and large atmospheric and solar
mixing angles in a natural way, without any tunings or cancellations. Although the (2, 3)
mass hierarchy in the neutrino sector is not that strong, $m_2/m_3 \sim 0.2$, we would still like to
have a natural explanation for the smallness of this hierarchy, just as we would like to have
an explanation for the smallness of the Cabibbo angle which has a similar value.
• To disentangle the question of the neutrino masses and the mixing angles, and so enable
some explanation for tri-bimaximal neutrino mixing which involves elements in the MNS
matrix having values equal to square roots of simple rational numbers such as 1/2 or 1/3.
This would not be possible if the neutrino masses played a part in the calculation of the solar
and atmospheric mixing angles.
In SD, a natural neutrino mass hierarchy, $m_2/m_3 \sim 0.2$, results from having one of the right-handed neutrinos give the dominant contribution to the see-saw mechanism, while a second right-handed neutrino gives the leading sub-dominant contribution, leading to a neutrino mass matrix
with naturally small determinant [8].1 In a basis where the right-handed neutrino mass matrix is
diagonal, the atmospheric and solar neutrino mixing angles are determined in terms of ratios of
Yukawa couplings involving the dominant and subdominant right-handed neutrinos, respectively.
If these Yukawa couplings are related in a certain way, then it is possible for tri-bimaximal
neutrino mixing to emerge in a simple and natural way, independently of the neutrino mass
eigenvalues. This is known as constrained sequential dominance (CSD) [10], and can readily
arise from vacuum alignment in flavour models [10–12]. In such unified flavour models there
are corrections to tri-bimaximal mixing from charged lepton corrections, resulting in testable
predictions and sum rules for lepton mixing angles [10,13].
1 For alternative approaches involving a small determinant see [9].
Although well motivated on physical grounds, SD appears to be restricted to a particular
basis, namely that in which the right-handed neutrino and charged lepton mass matrices are both
diagonal, although in particular model realizations there are typically small off-diagonal elements
in both these mass matrices which must be taken into account. This might lead one to conclude
that the notion of SD is quite limited, and furthermore that it is not physical since physical
quantities should be basis independent. However, following the ISS approach advocated above,
we will determine the invariant classes C(R) to which SD models belong, by finding the invariant
R-matrix associated with each of the SD types, and hence show that SD can be formulated in a
basis invariant way. In particular tri-bimaximal neutrino mixing from constrained SD is shown
to have an easily identifiable form in which the R-matrix is related to the unit matrix, where
this form is preserved under charged lepton or right-handed neutrino basis changes, though the
former gives observable corrections to the MNS parameters. Having done this we shall then
reap the benefits mentioned above. Namely we shall show how certain models that have been
proposed in the literature are equivalent to SD under a basis change, for example models where
the mixing is completely or in part originating from the right-handed neutrino or charged lepton
sectors [12,15]. We shall also discuss phenomenological analyses based on choices of R-matrix
parameters that are seen to be relevant for SD.
In detail, the material discussed in this paper is structured as follows. In Section 2 we discuss
the ISS approach to model building that we advocate. We first review the well known result
that all pairs of quark Yukawa matrices $Y^U_{LR}$, $Y^D_{LR}$ consistent with given physical parameters are
related by basis transformations [6], and then show that a similar result holds for the effective
lepton matrices $m^\nu_{LL}$ and $Y^E_{LR}$. We then show that a similar result does not apply to the see-saw
mechanism, which leads to the notion of invariant classes of see-saw models, which may be
parameterized by the R-matrix of Casas and Ibarra. We show how the R-matrix may be obtained
and prove its invariance under basis transformations. We also propose a short-cut to obtaining
the R-matrix, using a non-unitary S-matrix transformation of right-handed neutrinos, which is
useful when right-handed neutrino mass eigenvalues are not required. In Section 3 we discuss SD
models as a prime example of the ISS approach. We first discuss this in a two family example,
where a convenient vector notation for SD is introduced, and a relation between the R-matrix
angle and the angle between these vectors is established. We then go on to the full three family
case where we discuss the form of the R-matrix for all the types of SD, and provide a systematic
discussion of the R-matrix in the two right-handed neutrino limit in each case. Having established
the relation between SD and the R-matrix, this then defines the invariant classes of see-saw
models to which SD models belong, and hence allows the full set of models in these classes
to be constructed by basis transformations. In Section 4 we discuss the physical applications of
these results to invariant classes of SD models. The particular forms of R-matrix associated with
CSD and tri-bimaximal neutrino mixing are identified. SD is shown to be in the same invariant
class as some models where the mixing completely or partly originates from the right-handed
neutrino or charged lepton sectors [12,14,15]. We also discuss phenomenological analyses based
on choices of R-matrix parameters that are seen to be relevant for SD. For example we discuss
the application of our results to flavour-dependent leptogenesis [23], and show that the case of
the real R-matrix may be (approximately) realized in SD. Section 5 concludes the paper.
Finally we would like to mention some earlier works in which the relation between SD and the
R-matrix has been discussed [16–19], and to distinguish what was done in those
papers from what is done in the present paper. The R-matrix was first applied to SD in [16],
where the approximate unit matrix type structure of the R-matrix (or a permutation of it) corresponded to the sequential dominance of the three right-handed neutrinos. The Yukawa vector
notation, used in this paper, was first introduced in [16], however the MNS vector notation is
introduced here. Some of these features were subsequently further discussed in [17,18]. Recently,
an analysis with a unit R-matrix was performed in [19]. All of these papers presented results in
the flavour basis in which the right-handed neutrino mass matrix and the charged lepton mass
matrix were both diagonal. The basis-invariance noticed by Casas and Ibarra was not discussed
in any of the papers in [16–19], and the invariance of the R-matrix under non-unitary right-handed neutrino transformations is new. Other new results include a precise relation between
the angle between the Yukawa vectors and the R-matrix angle in two right-handed neutrino
limiting cases. These limiting cases are discussed in detail, for all the different mass orderings of
right-handed neutrinos. The fact that, for such limiting cases, SD is an automatic consequence of
a particular texture zero and a small $\theta_{13}$ is also discussed. Tri-bimaximal mixing and CSD are also
discussed, corresponding to the R-matrix taking a very precise (permutation of the) unit matrix
form (rather than an approximate such form). The effect of charged lepton corrections is shown
to give corrections to the PMNS angles, but not to the R-matrix, which retains its precise form.
The fact that SD corresponds to approximately real R-matrix angles, as can be seen by considering the modular surfaces of the R-matrix angles close to zero or $\pi/2$, is also a new result, and its
application to recent flavour-dependent leptogenesis analyses is discussed.
Finally we emphasize that the ISS model building approach we advocate here, whereby any
proposed see-saw model should be expressed in terms of the invariant R-matrix, represents a
new strategy that can be applied to all see-saw models, not just the ones which satisfy SD. To
illustrate the usefulness of the approach, we discuss examples of models in which the lepton
mixing originates either from the charged lepton sector, or partly from the right-handed neutrino
sector, and show that they have the same invariant R-matrix as SD, and therefore are equivalent
to it under a change of basis, which we subsequently prove. The idea in this paper, then, is to
use the R-matrix more actively in model building (rather than in phenomenology, where it has
been used extensively by many authors), with the hope that model builders will express their see-saw models in terms of the R-matrix (which is not normally done). The essential point of this paper is to
emphasise that the R-matrix is invariant under basis transformations, since this feature, although
clearly known by the inventors, is not so well used. It is precisely this invariance that means that
the R-matrix can and should be more widely used as a model building tool, to classify and relate
models.
2. The ISS model building approach
2.1. Quark sector
In the quark sector the Dirac mass matrices of the up and down quarks are given by
$m^U_{LR} = Y^U_{LR} v_u$ and $m^D_{LR} = Y^D_{LR} v_d$, where $v_u = \langle H_u^0 \rangle$ and $v_d = \langle H_d^0 \rangle$, and the Lagrangian is
of the form $\mathcal{L} = -\bar{\psi}_L Y_{LR} H \psi_R + \mathrm{H.c.}$ The change from flavour basis to mass eigenstate basis
can be performed with the unitary diagonalization matrices $V_{U_L}$, $V_{U_R}$ and $V_{D_L}$, $V_{D_R}$ by
$$ V_{U_L} m^U_{LR} V_{U_R}^\dagger = \mathrm{diag}(m_u, m_c, m_t), \qquad V_{D_L} m^D_{LR} V_{D_R}^\dagger = \mathrm{diag}(m_d, m_s, m_b). \tag{1} $$
The CKM mixing matrix is then obtained from
$$ V_{CKM} = V_{U_L} V_{D_L}^\dagger \tag{2} $$
where quark phase rotations which leave the quark masses real and positive may be used to
remove five of the phases leaving one physical phase in the CKM matrix VCKM . The Standard
Model quark sector clearly respects the symmetry
$$ G_{quark} = U_Q(3) \times U_{U_R}(3) \times U_{D_R}(3) \tag{3} $$
corresponding to quark doublet, right-handed up quark and right-handed down quark rotations,
which change the quark basis and the form of the Yukawa matrices, but leave the physics (quark
masses and mixings) unchanged. In the quark sector it is well known that the only physical
quantities are basis independent invariants formed from the mass matrices, the so-called Jarlskog
invariants [6], rather than the mass matrices themselves, since any pair of quark mass matrices
which lead to the correct physics may be related to any other pair which lead to the same physics,
by a change of basis, up to quark phases, using the symmetry Gquark .
This can be proved, for example, by showing that any two pairs of quark mass matrices can
be related by a change of basis, using the symmetry Gquark , to a common basis in which the up
quark mass matrix is diagonal, and the down quark mass matrix is equal, up to quark phases, to
the CKM matrix multiplied by a diagonal matrix of down quark masses,
$$ m^{\prime U}_{LR} = \mathrm{diag}(m_u, m_c, m_t), \qquad m^{\prime D}_{LR} = V_{CKM}\, \mathrm{diag}(m_d, m_s, m_b). \tag{4} $$
Since any two pairs of mass matrices $(m^U_{LR})_1$, $(m^D_{LR})_1$ and $(m^U_{LR})_2$, $(m^D_{LR})_2$ may be related to
$m^{\prime U}_{LR}$, $m^{\prime D}_{LR}$ in Eq. (4) by a change of basis, it follows that all choices of quark mass matrices
which lead to the same physics can be related to each other, up to quark phases, using the symmetry $G_{quark}$. This implies that the quark mass matrices $m^U_{LR}$, $m^D_{LR}$ are not physical quantities
since they are basis dependent, i.e. not invariant under the symmetry $G_{quark}$. It is possible to
define $G_{quark}$ invariant combinations consisting of determinants and traces of products of the
combinations $S^U_{LL} = m^U_{LR} (m^U_{LR})^\dagger$ and $S^D_{LL} = m^D_{LR} (m^D_{LR})^\dagger$; for example the determinant of the
commutator, $\det[S^U_{LL}, S^D_{LL}]$, is an invariant [6].
2.2. Effective lepton sector

From the point of view of low energy neutrino experiments, Majorana neutrino masses arise
from the effective operator $\mathcal{L}_{eff} = -\frac{1}{2} H_u L^T \kappa H_u L + \mathrm{H.c.}$, where $L$ are the lepton doublets, $H_u$
are Higgs doublets, and $\kappa$ is a matrix of effective (dimensional) couplings. In our convention the
effective Majorana masses are given by the Lagrangian $\mathcal{L} = -\bar{\nu}_L m^\nu_{LL} \nu^c + \mathrm{H.c.}$, where $m^\nu_{LL} = \kappa v_u^2$. The rotation to the mass eigenstate basis can be performed with the unitary diagonalization
matrices $V_{E_L}$, $V_{E_R}$ and $V_{\nu_L}$ by
$$ V_{E_L} m^E_{LR} V_{E_R}^\dagger = \mathrm{diag}(m_e, m_\mu, m_\tau), \qquad V_{\nu_L} m^\nu_{LL} V_{\nu_L}^T = \mathrm{diag}(m_1, m_2, m_3). \tag{5} $$
The lepton mixing matrix is then obtained from

$$ V_{MNS} = V_{E_L} V_{\nu_L}^\dagger \tag{6} $$
where charged lepton phase rotations which leave the charged lepton masses real and positive
may be used to remove three of the phases leaving three physical phases in the MNS matrix
$V_{MNS}$.
The effective lepton sector clearly respects the symmetry
$$ G^{eff}_{lepton} = U_L(3) \times U_{E_R}(3) \tag{7} $$
corresponding to lepton doublet and right-handed charged lepton rotations, which change the
lepton basis and the form of the effective lepton matrices, but leave the physics (lepton masses
and mixings) unchanged. The physically measurable low energy lepton parameters are the three
charged lepton masses $m_e$, $m_\mu$, $m_\tau$, the three neutrino masses $m_{1,2,3} > 0$ and the lepton mixing
parameters contained in $V_{MNS}$.
As in the quark sector, any pair of effective lepton matrices $m^E_{LR}$, $m^\nu_{LL}$ which lead to a given
low energy physics may be related to any other pair which lead to the same physics, by a change
of basis, using the symmetry $G^{eff}_{lepton}$. This is easily proved (analogous to the quark sector) by
transforming to a common basis in which the charged lepton mass matrix is diagonal, and the
effective Majorana neutrino mass matrix is specified in terms of the lepton mixing matrix $V_{MNS} = V_{E_L} V_{\nu_L}^\dagger$ and the physical neutrino masses $m_i$,
$$ m^{\prime E}_{LR} = \mathrm{diag}(m_e, m_\mu, m_\tau), \qquad m^{\prime\nu}_{LL} = V_{MNS}\, \mathrm{diag}(m_1, m_2, m_3)\, V_{MNS}^T \tag{8} $$
where Eq. (8), often called the flavour basis, is analogous to Eq. (4). Then, as in the quark case,
we can argue that since any two pairs of matrices $(m^E_{LR})_1$, $(m^\nu_{LL})_1$ and $(m^E_{LR})_2$, $(m^\nu_{LL})_2$ can be
rotated to the flavour basis, they can therefore be rotated into each other, using the symmetry
$G^{eff}_{lepton}$, analogous to the quark sector result. $m^E_{LR}$ and $m^\nu_{LL}$ are clearly basis dependent, but invariants
under $G^{eff}_{lepton}$ can be constructed using $S^E_{LL} = m^E_{LR} (m^E_{LR})^\dagger$ and $S^\nu_{LL} = m^\nu_{LL} (m^\nu_{LL})^\dagger$; for example
the determinant of the commutator, $\det[S^E_{LL}, S^\nu_{LL}]$, is invariant.
2.3. See-saw sector
The starting point of the see-saw mechanism is the Lagrangian,
$$ \mathcal{L}_{see\text{-}saw} = -Y^E_{LR} H_d \bar{L} E_R - Y^\nu_{LR} H_u \bar{L} \nu_R + \frac{1}{2} \nu_R^T M_{RR} \nu_R + \mathrm{H.c.}, \tag{9} $$
where all indices have been suppressed, and we have introduced two Higgs doublets $H_u$, $H_d$ as in
the Supersymmetric Standard Model.2 It is common to call Eq. (9) the see-saw Lagrangian. After
integrating out the right-handed neutrinos it leads to an effective low energy leptonic Lagrangian
of the type discussed in the previous subsection, where the effective Majorana mass matrix is given
by the (type I) see-saw formula:
$$ m^\nu_{LL} = v_u^2\, Y^\nu_{LR} M_{RR}^{-1} Y^{\nu T}_{LR}. \tag{10} $$
The effective low energy matrices are diagonalised by unitary transformations $V_{E_L}$, $V_{E_R}$ and $V_{\nu_L}$
as in Eq. (5), and the lepton mixing matrix is as in Eq. (6).
The lepton symmetry of the see-saw Lagrangian in Eq. (9) is:
$$ G_{lepton} = U_L(3) \times U_{E_R}(3) \times U_{\nu_R}(3) \tag{11} $$
corresponding to lepton doublet, right-handed charged lepton and right-handed neutrino rotations, which change the lepton basis and the form of the see-saw matrices, but leave the
physics (lepton masses and mixings) unchanged. Using these symmetries we can ask the question whether all sets of see-saw matrices $Y^E_{LR}$, $Y^\nu_{LR}$ and $M_{RR}$ which lead to a given set of low
energy physical lepton parameters are equivalent to each other by a change of basis. Analogous
to the quark sector, we may attempt to relate all sets of see-saw matrices to a common set of see-saw matrices in which the charged lepton mass matrix is diagonal, and the right-handed neutrino
Majorana mass matrix is also diagonal,
$$ v_d Y^{\prime E}_{LR} = \mathrm{diag}(m_e, m_\mu, m_\tau), \qquad M'_{RR} = \mathrm{diag}(M_1, M_2, M_3), \qquad Y^{\prime\nu}_{LR} = V_{E_L} Y^\nu_{LR} V_{\nu_R}^T \tag{12} $$
where unitary $V_{\nu_R}$ is defined by $V_{\nu_R} M_{RR} V_{\nu_R}^T = M'_{RR}$ and $M_i > 0$.
2 In the case of the Standard Model one of the two Higgs doublets is equal to the charge conjugate of the other,
$H_d \equiv H_u^c$.
We refer to the basis of Eq. (12) as the see-saw flavour basis in analogy to Eq. (8). The
difference between Eqs. (4), (8) and (12) is that here $Y^{\prime\nu}_{LR}$ is not uniquely specified since it is
diagonalized by left-handed rotations which are not simply related to the lepton mixing matrix,
and in addition its eigenvalues are not simply related to physical neutrino masses. Therefore,
unlike the quark sector, or the effective lepton case, there is not a unique common basis. Therefore, we conclude that two sets of see-saw matrices $(Y^E_{LR})_1$, $(Y^\nu_{LR})_1$, $(M_{RR})_1$ and $(Y^E_{LR})_2$,
$(Y^\nu_{LR})_2$, $(M_{RR})_2$ which give the same physical right-handed neutrino masses, light effective neutrino masses, charged lepton masses and lepton mixings cannot in general be transformed into each other
under the lepton see-saw symmetry $G_{lepton}$ corresponding to basis changes.
We note parenthetically that although the see-saw formula is not a basis invariant, by taking
its determinant one can obtain the invariant mass formula [20]:
$$ m_1 m_2 m_3 = \frac{m_{D1}^2\, m_{D2}^2\, m_{D3}^2}{M_1 M_2 M_3} \tag{13} $$
where $m_i$ are the physical light left-handed neutrino masses, $M_i$ are the heavy right-handed
neutrino masses, and $m_{Di}$ are the eigenvalues of the Dirac neutrino mass matrix $m^\nu_{LR} = v_u Y^\nu_{LR}$.
The product of squared Dirac mass eigenvalues is clearly an invariant since it is given
by $\det(m^\nu_{LR} m^{\nu\dagger}_{LR})$. Although Eq. (13) should have useful see-saw model building applications
with respect to neutrino masses, it clearly does not shed any light on the question of neutrino
mixing.
2.4. Invariant classes of see-saw models and the R-matrix
We have seen that, in contrast to the case of the effective lepton or quark sector, not all
choices of see-saw matrices $Y^\nu_{LR}$, $Y^E_{LR}$ and $M_{RR}$ which are consistent with a given set of
low energy lepton parameters $m_e$, $m_\mu$, $m_\tau$, $m_i$ and $V_{MNS}$, are related to each other under a
change of basis. This implies that sets of see-saw matrices fall into invariant classes of models,
$\{Y^\nu_{LR}, Y^E_{LR}, M_{RR}\} \in C(R)$, where each different class $C(R)$ is labelled by some continuous parameters R, and where members of $C(R)$ are consistent with the same low energy lepton observables
$m_e$, $m_\mu$, $m_\tau$, $m_i$ and $V_{MNS}$, for all R. The set of all see-saw matrices within a particular invariant
class C(R1 ) are related to each other under a change of basis, but are not related to those in a different class C(R2 ). In this subsection we show that the R-matrix of Casas and Ibarra [7], which
is well known in phenomenological applications, is a basis invariant quantity. We then propose
using it in the context of model building to label the invariant class C(R) of see-saw models to
which a particular model example belongs.
Following [7], we first derive the R-matrix in the see-saw flavour basis in Eq. (12), by constraining $Y^{\prime\nu}_{LR}$ to give $m^{\prime\nu}_{LL}$ in the basis in Eq. (8) using the see-saw mechanism in Eq. (10):
$$ v_u^2\, Y^{\prime\nu}_{LR}\, \mathrm{diag}(M_1, M_2, M_3)^{-1}\, Y^{\prime\nu T}_{LR} = V_{MNS}\, \mathrm{diag}(m_1, m_2, m_3)\, V_{MNS}^T. \tag{14} $$
In order to solve Eq. (14) for the neutrino Yukawa matrix $Y^{\prime\nu}_{LR}$ we can try to write both sides of
the equation in the form $AA^T = BB^T$ and then take the square root of the equation, which fixes $A$ only up to multiplication of $B$ on the right by a complex orthogonal matrix, to give,
$$ v_u Y^{\prime\nu}_{LR}\, \mathrm{diag}(M_1, M_2, M_3)^{-1/2} = V_{MNS}\, \mathrm{diag}(m_1, m_2, m_3)^{1/2} R^T \tag{15} $$
where R is the Casas-Ibarra complex orthogonal matrix, $R^T R = I$, where $I$ is the unit matrix.
It is often used in phenomenological analyses to parameterize $Y^{\prime\nu}_{LR}$ in the see-saw flavour basis,
since R determines $Y^{\prime\nu}_{LR}$ in terms of physical parameters from Eq. (15).
In the above discussion the R-matrix was derived in the see-saw flavour basis. However one
can repeat the above derivation starting from a general charged lepton basis in which
$Y^E_{LR}$ and $Y^\nu_{LR}$ (unprimed matrices) are in general not diagonal (but retaining for the moment a
diagonal right-handed neutrino mass matrix) leading to:
$$ v_u Y^\nu_{LR}\, \mathrm{diag}(M_1, M_2, M_3)^{-1/2} = V_{\nu_L}^\dagger\, \mathrm{diag}(m_1, m_2, m_3)^{1/2} R^T \tag{16} $$
where $V_{\nu_L}$ is the matrix that diagonalizes $m^\nu_{LL}$ in this basis, as in Eq. (5). Comparing Eq. (16) to
Eq. (15) the only change is to the left-most factors on the two sides of the equation, where in the see-saw flavour basis
it happens that $V_{\nu_L}^\dagger = V_{MNS}$. The fact that the same R-matrix appears in Eq. (16) as in Eq. (15)
follows from the fact that $Y^{\prime\nu}_{LR} = V_{E_L} Y^\nu_{LR}$, where $V_{E_L}$ diagonalizes the charged lepton mass
matrix as in Eq. (5). Therefore by multiplying both sides of Eq. (16) from the left by $V_{E_L}$, and
comparing the resulting equation to Eq. (15), where the MNS matrix is given by Eq. (6), we find
the non-trivial result that the same R-matrix must appear in both Eqs. (15) and (16). We conclude
that the R-matrix is invariant under a change of charged lepton basis.
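The argument above can be checked numerically in a two family toy setting. The following is only a sketch under stated assumptions: it sets $v_u = 1$, uses illustrative masses and angles that are not taken from the paper, assumes the reading of Eqs. (15) and (16) with $V_{\nu_L}^\dagger$ on the right-hand side, and uses hypothetical helper names (`rotation`, `extract_R`, and so on). It builds the Yukawa matrix from a chosen complex orthogonal R, applies an arbitrary unitary charged lepton rotation, and recovers the same R in the new basis.

```python
import cmath

def matmul(a, b):
    """Matrix product for matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def dagger(a):
    return [[complex(x).conjugate() for x in row] for row in zip(*a)]

def diag(values, power=1.0):
    x, y = values
    return [[x ** power, 0.0], [0.0, y ** power]]

def rotation(angle):
    """[[c, s], [-s, c]]: unitary for a real angle, complex orthogonal otherwise."""
    c, s = cmath.cos(angle), cmath.sin(angle)
    return [[c, s], [-s, c]]

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

def seesaw(Y, M):
    """Eq. (10) with v_u = 1 and a diagonal M_RR = diag(M)."""
    return matmul(matmul(Y, diag(M, -1.0)), transpose(Y))

def extract_R(Y, V_nuL, m, M):
    """Solve Eq. (16) for R: R^T = diag(m)^(-1/2) V_nuL Y diag(M)^(-1/2)."""
    R_T = matmul(matmul(matmul(diag(m, -0.5), V_nuL), Y), diag(M, -0.5))
    return transpose(R_T)

# Illustrative inputs only (arbitrary units, v_u = 1).
m = (0.2, 1.0)                    # light neutrino masses m2, m3
M = (1.0, 10.0)                   # heavy right-handed neutrino masses
V_MNS = rotation(cmath.pi / 4)    # maximal (2,3) mixing, no phases
R = rotation(1.2 + 0.3j)          # a complex orthogonal R-matrix

# Eq. (15): Y' = V_MNS diag(m)^(1/2) R^T diag(M)^(1/2) in the see-saw flavour basis.
Y_flavour = matmul(matmul(matmul(V_MNS, diag(m, 0.5)), transpose(R)), diag(M, 0.5))
m_LL_flavour = seesaw(Y_flavour, M)
m_LL_target = matmul(matmul(V_MNS, diag(m)), transpose(V_MNS))

# Change the charged lepton basis with an arbitrary unitary V_EL: Y = V_EL^dagger Y'.
a, phi = 0.7, 0.4
V_EL = [[cmath.cos(a), cmath.sin(a) * cmath.exp(1j * phi)],
        [-cmath.sin(a) * cmath.exp(-1j * phi), cmath.cos(a)]]
Y_general = matmul(dagger(V_EL), Y_flavour)
V_nuL = matmul(dagger(V_MNS), V_EL)   # follows from V_MNS = V_EL V_nuL^dagger, Eq. (6)
m_LL_diag = matmul(matmul(V_nuL, seesaw(Y_general, M)), transpose(V_nuL))

R_flavour = extract_R(Y_flavour, dagger(V_MNS), m, M)
R_general = extract_R(Y_general, V_nuL, m, M)
print(max_diff(R_flavour, R), max_diff(R_general, R))
```

In this sketch $V_{\nu_L}$ is constructed from the known $V_{MNS}$ and $V_{E_L}$ rather than obtained by diagonalizing $m^\nu_{LL}$; in a real model one would first diagonalize the mass matrix in the given basis.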
We now prove that the R-matrix is invariant under a change of right-handed neutrino basis,
so that the right-handed neutrino mass matrix is no longer diagonal. The main observation is that according
to Eq. (16) the R-matrix parameterizes only the combination on the left-hand side of that equation, and this
combination is clearly invariant under $U_{\nu_R}(3)$, which also preserves the right-handed neutrino
masses. Under $\nu_R \to V_{\nu_R} \nu_R$, Eq. (16) thus becomes,
$$ v_u Y^\nu_{LR} V_{\nu_R}^T\, \mathrm{diag}(M_1, M_2, M_3)^{-1/2} = V_{\nu_L}^\dagger\, \mathrm{diag}(m_1, m_2, m_3)^{1/2} R^T \tag{17} $$
with R again invariant. The invariance of the R-matrix, together with Eq. (17), suggests the
following ISS model building strategy. In some particular given basis where the see-saw matrices
$Y^\nu_{LR}$, $Y^E_{LR}$ and $M_{RR}$ are not diagonal, Eq. (17) may be used to determine the R-matrix in terms
of the masses $m_i$, $M_i$, the matrix $V_{\nu_L}$ which diagonalises $m^\nu_{LL}$ in this basis, as in Eq. (5), and $V_{\nu_R}$
as defined below Eq. (12). Since the R-matrix is invariant under a change of basis, as we have
shown, it may then be used to label the invariant class of models to which the particular see-saw
matrices belong, $\{Y^\nu_{LR}, Y^E_{LR}, M_{RR}\} \in C(R)$.
Finally we show that the R-matrix is also invariant under non-unitary right-handed neutrino
transformations, namely $\nu_R \to S \nu_R$, where S is non-singular, which results in:
$$ Y^\nu_{LR} \to Y^\nu_{LR} S^{-1}, \qquad M_{RR} \to (S^T)^{-1} M_{RR} S^{-1}, \qquad M_{RR}^{-1} \to S M_{RR}^{-1} S^T. \tag{18} $$
The transformations in Eq. (18) leave the effective low energy neutrino mass matrix $m^\nu_{LL}$ invariant, which follows from the see-saw mechanism in Eq. (10). However the right-handed neutrino
masses will change, since S is non-unitary. By a suitable choice of S, $M_{RR}$ can be transformed
into a diagonal form,
$$ (S^T)^{-1} M_{RR} S^{-1} = \mathrm{diag}(\tilde{M}_1, \tilde{M}_2, \tilde{M}_3), \tag{19} $$
where we emphasize that the choice of S is not unique, and $\tilde{M}_i$ are not the eigenvalues of $M_{RR}$.
For example, without loss of generality, S can always be chosen so that $\tilde{M}_i$ are all equal to unity
in some units. Allowing non-unitary S-matrix transformations, one can derive a similar result to
Eq. (17),
$$ v_u Y^\nu_{LR} S^{-1}\, \mathrm{diag}(\tilde{M}_1, \tilde{M}_2, \tilde{M}_3)^{-1/2} = V_{\nu_L}^\dagger\, \mathrm{diag}(m_1, m_2, m_3)^{1/2} R^T \tag{20} $$
where S and $\tilde{M}_i$ are defined in Eq. (19), and $V_{\nu_L}$ is as before since $m^\nu_{LL}$ is invariant under S-matrix transformations. R is once again invariant, which essentially follows from the invariance
of the combination on the left-hand side of Eq. (20) under S-matrix transformations. For a given
non-diagonal set of see-saw matrices $Y^\nu_{LR}$, $Y^E_{LR}$ and $M_{RR}$, Eq. (20) can sometimes be used as
a short-cut to determining the invariant R-matrix, instead of Eq. (17). Since the R-matrix is
invariant under S-matrix transformations, as we have shown, it may then be used to label the invariant class of
models to which the particular see-saw matrices belong, $\{Y^\nu_{LR}, Y^E_{LR}, M_{RR}\} \in C(R)$, as before.
The S-matrix approach may be especially useful in low energy applications where the right-handed neutrino masses are not required.
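The transformation rules for a non-unitary S can be illustrated with a small numerical sketch. It assumes a real, symmetric, positive definite $2 \times 2$ $M_{RR}$ (so that a Cholesky factor exists), sets $v_u = 1$, and uses made-up input numbers and hypothetical helper names. It checks only two statements from the text: that a suitable S brings $M_{RR}$ to the unit matrix, and that the see-saw combination $Y M_{RR}^{-1} Y^T$ is unchanged when Y and $M_{RR}$ are transformed together.

```python
import math

def matmul(a, b):
    """Matrix product for matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def inverse2(a):
    """Inverse of a non-singular 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

def seesaw(Y, M_RR):
    """m_LL = Y M_RR^(-1) Y^T, i.e. Eq. (10) with v_u = 1."""
    return matmul(matmul(Y, inverse2(M_RR)), transpose(Y))

def cholesky2(M_RR):
    """Lower-triangular L with L L^T = M_RR (real positive definite 2x2 only)."""
    l11 = math.sqrt(M_RR[0][0])
    l21 = M_RR[1][0] / l11
    l22 = math.sqrt(M_RR[1][1] - l21 * l21)
    return [[l11, 0.0], [l21, l22]]

# Illustrative, non-diagonal inputs (not taken from the paper).
Y = [[0.3, 1.1], [0.5, -0.8]]
M_RR = [[2.0, 0.6], [0.6, 5.0]]

# Choose S = L^T, which is non-unitary; then (S^T)^(-1) M_RR S^(-1) is the unit matrix.
S = transpose(cholesky2(M_RR))
S_inv = inverse2(S)
Y_new = matmul(Y, S_inv)                                   # Y -> Y S^(-1)
M_new = matmul(matmul(transpose(S_inv), M_RR), S_inv)      # M -> (S^T)^(-1) M S^(-1)

print(M_new)
print(max_diff(seesaw(Y, M_RR), seesaw(Y_new, M_new)))
```

With the transformed mass matrix equal to the unit matrix, the see-saw formula reduces to $m^\nu_{LL} = Y_{new} Y_{new}^T$, which is why this choice of S is convenient when the heavy masses themselves are not needed.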
Note that the use of non-unitary transformations is familiar from the study of non-minimal
Kähler potentials in supersymmetric scenarios, which require such field transformations in order
to put the Kähler potential into canonical form, as has been recently studied [21]. However none
of these works explicitly addresses the invariance of the R-matrix for non-unitary right-handed
neutrino transformations. Similarly it was not proved either by the originators of the R-matrix,
Casas and Ibarra [7], or in [16–19]. The discussion of the S-matrix as applied to the R-matrix in
this section is therefore original and (as we shall see in Section 4.1.3) can be quite useful.
3. ISS approach to SD
3.1. Two family SD in the see-saw flavour basis
In this section we shall show that sequential dominance (SD) models [8] correspond to particular invariant classes of see-saw models characterized by particular forms of the R-matrix.
SD provides a good example of the invariant see-saw (ISS) approach, since SD is sometimes
criticized as being only valid in a special basis, namely the see-saw flavour basis. Defining SD
in terms of the R-matrix renders the SD approach basis independent which overcomes this criticism, and brings with it all the benefits already mentioned previously, some of which will be
explored further in the next section on applications. We shall begin by discussing the dominance
mechanism in a simple two family example, first in the see-saw flavour basis, then in terms of the
R-matrix which defines a basis independent formulation of SD. We then extend this discussion
to include three families, then take the two right-handed neutrino limit of these models.
To review the basic idea of SD, then, it is instructive to begin by discussing a simple $2 \times 2$
example applicable to the atmospheric mixing in the (2, 3) sector, in the see-saw flavour basis,
i.e. the diagonal charged lepton and right-handed neutrino Majorana mass basis, where we can
write,
$$ M_{RR} = \begin{pmatrix} M_A & 0 \\ 0 & M_B \end{pmatrix}, \qquad m^\nu_{LR} = \begin{pmatrix} A_2 & B_2 \\ A_3 & B_3 \end{pmatrix} \tag{21} $$
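where $m^\nu_{LR} = Y^\nu_{LR} v_u$. It is sufficient for the toy model to ignore phases, and suppose that $A_i$, $B_i$
are real. The see-saw formula in Eq. (10), $m^\nu_{LL} = m^\nu_{LR} M_{RR}^{-1} m^{\nu T}_{LR}$, gives:
$$ m^\nu_{LL} = \begin{pmatrix} \dfrac{A_2^2}{M_A} + \dfrac{B_2^2}{M_B} & \dfrac{A_2 A_3}{M_A} + \dfrac{B_2 B_3}{M_B} \\[2ex] \dfrac{A_2 A_3}{M_A} + \dfrac{B_2 B_3}{M_B} & \dfrac{A_3^2}{M_A} + \dfrac{B_3^2}{M_B} \end{pmatrix}. \tag{22} $$
The mass matrix in Eq. (22) is diagonalized to give two neutrino mass eigenvalues $m_2$, $m_3$ by
rotating through an angle $\theta_{23}$ given by,
$$ \tan 2\theta_{23} = \frac{2\left(\dfrac{A_2 A_3}{M_A} + \dfrac{B_2 B_3}{M_B}\right)}{\left(\dfrac{A_3^2}{M_A} + \dfrac{B_3^2}{M_B}\right) - \left(\dfrac{A_2^2}{M_A} + \dfrac{B_2^2}{M_B}\right)}. \tag{23} $$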
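The determinant of the neutrino mass matrix $m^\nu_{LL}$ in Eq. (22) is
$$ \det m^\nu_{LL} = \frac{(A_2 B_3 - A_3 B_2)^2}{M_A M_B} = m_2 m_3 \tag{24} $$
and the trace of the neutrino mass matrix $m^\nu_{LL}$ in Eq. (22) is
$$ \mathrm{Tr}\, m^\nu_{LL} = \frac{A_2^2}{M_A} + \frac{A_3^2}{M_A} + \frac{B_2^2}{M_B} + \frac{B_3^2}{M_B} = m_2 + m_3 \approx m_3, \tag{25} $$
where the last approximation assumes a neutrino mass hierarchy $m_3 \gg m_2$. $m_2$ is then approximately determined from the trace and determinant of the mass matrix as,
$$ m_2 = \frac{\det m^\nu_{LL}}{m_3} \approx \frac{\det m^\nu_{LL}}{\mathrm{Tr}\, m^\nu_{LL}} = \frac{\dfrac{(A_2 B_3 - A_3 B_2)^2}{M_A M_B}}{\dfrac{A_2^2 + A_3^2}{M_A} + \dfrac{B_2^2 + B_3^2}{M_B}}. \tag{26} $$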
The basic assumption of SD is that one of the right-handed neutrinos plays the dominant
role in the see-saw mechanism. Without loss of generality we shall assume that the right-handed
neutrino of mass MA dominates the see-saw mechanism:
$$ \frac{|A_i A_j|}{M_A} \gg \frac{|B_i B_j|}{M_B}. \tag{27} $$
Assuming the dominance approximation in Eq. (27), the determinant and trace of the mass matrix in Eq. (22) imply that the neutrino mass spectrum then consists of one neutrino with mass
$m_3 \approx (A_2^2 + A_3^2)/M_A$ and one naturally light neutrino $m_2 \ll m_3$ determined from Eq. (26), since
the determinant of Eq. (22) is naturally small, and vanishes in the extreme limit of the dominance approximation when only one right-handed neutrino contributes [8]. Under the dominance
approximation in Eq. (27), the atmospheric angle from Eq. (23) is $\tan\theta_{23} \approx A_2/A_3$ [8] which
can be large or maximal providing $A_2 \sim A_3$. Collecting together these results, the dominance
approximation in Eq. (27) leads to,
$$ m_3 \approx \frac{A_2^2 + A_3^2}{M_A}, \qquad m_2 \approx \frac{(A_2 B_3 - A_3 B_2)^2}{(A_2^2 + A_3^2) M_B}, \qquad \tan\theta_{23} \approx \frac{A_2}{A_3}. \tag{28} $$
Therefore, assuming the dominance of a single right-handed neutrino, Eq. (28) shows that $m_3$ is
determined approximately by the right-handed neutrino with mass $M_A$, $m_2$ is determined approximately by the right-handed neutrino with mass $M_B$, and $\tan\theta_{23}$ is determined approximately by
a simple ratio of Yukawa couplings, independently of the neutrino mass hierarchy. Note that
right-handed neutrino dominance allows the origin of the large mixing angle to be decoupled
from the neutrino mass hierarchy, allowing both features to co-emerge in a very natural way.
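As a rough numerical illustration of how well the dominance approximation works, the sketch below compares the exact eigenvalues and mixing angle of the two family mass matrix with the approximate expressions of Eq. (28). The input couplings and masses are made-up values chosen only so that the dominance condition of Eq. (27) holds; the function names are hypothetical.

```python
import math

def exact_two_family(A2, A3, B2, B3, MA, MB):
    """Exact m2, m3 and tan(theta23) from the 2x2 matrix of Eq. (22)."""
    a = A2 * A2 / MA + B2 * B2 / MB          # (2,2) element
    b = A2 * A3 / MA + B2 * B3 / MB          # off-diagonal element
    d = A3 * A3 / MA + B3 * B3 / MB          # (3,3) element
    trace, det = a + d, a * d - b * b
    root = math.sqrt(trace * trace - 4.0 * det)
    m2, m3 = 0.5 * (trace - root), 0.5 * (trace + root)
    theta23 = 0.5 * math.atan2(2.0 * b, d - a)   # Eq. (23), quadrant-safe
    return m2, m3, math.tan(theta23)

def sd_approximation(A2, A3, B2, B3, MA, MB):
    """Approximate m2, m3 and tan(theta23) from Eq. (28)."""
    m3 = (A2 * A2 + A3 * A3) / MA
    m2 = (A2 * B3 - A3 * B2) ** 2 / ((A2 * A2 + A3 * A3) * MB)
    return m2, m3, A2 / A3

# Illustrative inputs: |A_i A_j| / M_A is about ten times |B_i B_j| / M_B.
params = dict(A2=1.0, A3=1.0, B2=0.3, B3=-0.2, MA=1.0, MB=1.0)
exact = exact_two_family(**params)
approx = sd_approximation(**params)

for name, e, a in zip(("m2", "m3", "tan(theta23)"), exact, approx):
    print(f"{name}: exact = {e:.4f}, SD approximation = {a:.4f}")
```

For these inputs the approximate masses and mixing angle agree with the exact ones at the few per cent level, and the product $m_2 m_3$ reproduces the determinant formula of Eq. (24).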
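The above results can be expressed more compactly by introducing the column vector notation
[16],
$$ v_A = \begin{pmatrix} A_2 \\ A_3 \end{pmatrix} M_A^{-1/2}, \qquad v_B = \begin{pmatrix} B_2 \\ B_3 \end{pmatrix} M_B^{-1/2}. \tag{29} $$
Then the see-saw formula in Eq. (10), $m^\nu_{LL} = m^\nu_{LR} M_{RR}^{-1} m^{\nu T}_{LR}$, gives:
$$ m^\nu_{LL} = v_A v_A^T + v_B v_B^T. \tag{30} $$
The determinant of the neutrino mass matrix $m^\nu_{LL}$ is
$$ \det m^\nu_{LL} = |v_A \times v_B|^2 = m_2 m_3 \tag{31} $$
and the trace of the neutrino mass matrix $m^\nu_{LL}$ is
$$ \mathrm{Tr}\, m^\nu_{LL} = |v_A|^2 + |v_B|^2 = m_2 + m_3 \approx m_3. \tag{32} $$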
$m_2$ is then approximately determined from the trace and determinant of the mass matrix as,
$$ m_2 = \frac{\det m^\nu_{LL}}{m_3} \approx \frac{\det m^\nu_{LL}}{\mathrm{Tr}\, m^\nu_{LL}} = \frac{|v_A \times v_B|^2}{|v_A|^2 + |v_B|^2}. \tag{33} $$
To arrange for a hierarchy $m_2/m_3 \sim 1/5$, we require the determinant to be small compared to
the square of the trace. This may be achieved using the dominance condition in Eq. (27) that the
right-handed neutrino of mass $M_A$ gives the dominant contribution to the see-saw mechanism,
which in vector notation implies:
$$ |v_A|^2 \gg |v_B|^2. \tag{34} $$
We shall see in the next section that the dominance approximation leads to the vectors vA and
vB being approximately orthogonal and that there is a precise correlation between the degree of
orthogonality of these two vectors and the degree of dominance. Here we give two examples
which illustrate that the dominance condition only applies when the two vectors vA and vB are
sufficiently orthogonal:
If $A_2 = A_3$ and $B_2 = -B_3$, corresponding to the two vectors $v_A$ and $v_B$ being exactly orthogonal, then Eq. (28) gives,
$$ m_3 \approx \frac{2A_2^2}{M_A}, \qquad m_2 \approx \frac{2B_2^2}{M_B}, \qquad \tan\theta_{23} \approx 1 \tag{35} $$
and the required hierarchy $m_2/m_3 \sim 1/5$ then implies that,
$$ \frac{A_2^2}{M_A} \approx 5\, \frac{B_2^2}{M_B} \tag{36} $$
which satisfies the dominance condition in Eq. (27).
Now suppose that the two vectors are at 45° to each other, such as given by $A_2 = A_3$ and
$B_3 = 0$; then Eq. (28) becomes,
$$ m_3 \approx \frac{2A_2^2}{M_A}, \qquad m_2 \approx \frac{B_2^2}{2M_B}, \qquad \tan\theta_{23} \approx 1 \tag{37} $$
and the required hierarchy $m_2/m_3 \sim 1/5$ then implies that,
$$ \frac{A_2^2}{M_A} \approx \frac{5}{4}\, \frac{B_2^2}{M_B} \tag{38} $$
which only marginally satisfies the dominance condition in Eq. (27). If the vectors are more
closely aligned than about 45° then the dominance condition will not be satisfied.
3.2. Two family SD and the R-matrix
According to the ISS approach, we should formulate SD in terms of the invariant R-matrix.
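From Eq. (15), we have for the two family toy model in Eq. (21), dropping primes, and assuming
$M_A < M_B$:
$$ m^\nu_{LR} \begin{pmatrix} M_A & 0 \\ 0 & M_B \end{pmatrix}^{-1/2} = V^{2\times 2}_{MNS} \begin{pmatrix} m_2 & 0 \\ 0 & m_3 \end{pmatrix}^{1/2} R^T_{2\times 2}. \tag{39} $$
The MNS matrix is parameterized by the atmospheric angle $\theta_{23}$, and the R-matrix may be parameterized here by an angle $\theta$, ignoring phases,
$$ V^{2\times 2}_{MNS} = \begin{pmatrix} c_{23} & s_{23} \\ -s_{23} & c_{23} \end{pmatrix}, \qquad R^T_{2\times 2} = \begin{pmatrix} c & -s \\ s & c \end{pmatrix}, \tag{40} $$
where $c = \cos\theta$, $s = \sin\theta$. Each choice of $\theta$ specifies a particular solution to the see-saw formula
for the combination $m^\nu_{LR} M_{RR}^{-1/2}$ on the left-hand side of Eq. (39).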
Using Eqs. (39), (40), we find the following expressions for the vectors introduced in
Eq. (29),
$$ v_A = c\, m_2^{1/2} \begin{pmatrix} c_{23} \\ -s_{23} \end{pmatrix} + s\, m_3^{1/2} \begin{pmatrix} s_{23} \\ c_{23} \end{pmatrix}, \qquad v_B = -s\, m_2^{1/2} \begin{pmatrix} c_{23} \\ -s_{23} \end{pmatrix} + c\, m_3^{1/2} \begin{pmatrix} s_{23} \\ c_{23} \end{pmatrix}. \tag{41} $$
The single right-handed neutrino dominance approximation in Eq. (34) is then seen from Eq. (41)
to correspond to values of $\theta \approx \pi/2$ since for hierarchical neutrinos $m_3 \gg m_2$. An interesting
special limiting case is provided by the choice $\theta = \pi/2$, which corresponds to an R-matrix with
an off-diagonal structure,
$$ R = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \tag{42} $$
In this limiting case Eq. (41) shows that the vector $v_B$ is exactly orthogonal to $v_A$. This example
was discussed in the last section where it was shown to lead to Eq. (35), where the dominant right-handed neutrino dominates the see-saw mechanism by a factor of 5 according to Eq. (36). For
small deviations from $\theta = \pi/2$, Eq. (41) shows that the vector $v_B$ is approximately orthogonal
to $v_A$, and as the angle $\theta$ is decreased, the vectors $v_B$ and $v_A$ become less orthogonal.
There is a precise correlation between the angle between the two vectors vA and vB and the
degree of dominance, parameterized by the angle . To see this we first write Eq. (41) in a more
compact form as,
1/2
1/2
vA = cm2 + sm3 ,
1/2
1/2
vB = sm2 + cm3 ,
(43)
1/2
where mj
is defined by comparing Eq. (43) to Eq. (41), as

22 1/2
1/2
m ,
mj = VMNS
ij j
1/2
i.e. the mj
1/2
(44)
1/2
22
is the j th column of VMNS
times mj . We see that mi
1/2
. mj
1/2
= ij mi , and |m2
1/2
m3 |2 = m2 m3 . The angle AB between the two vectors vA and vB is then given by,
cos AB =
(m3 m2 ) sin 2
2|vA ||vB |
(45)
where the magnitudes of the vectors is given by,

|vA |2 = c2 m2 + s 2 m3 ,
|vB |2 = s 2 m2 + c2 m3 .
(46)
From Eqs. (45), (46) it is seen that the angle $\theta$ simultaneously parameterizes the angle between
the two column vectors and their ratio of magnitudes, which quantifies the precise degree of
dominance. In particular, when $\theta \approx \pi/2$, then $\theta_{AB} \approx \pi/2$ and
$|v_A|^2/|v_B|^2 \approx m_3/m_2 \approx 5$, corresponding to $|v_A|^2 \gg |v_B|^2$ as in Eq. (34). Once an angle
$\theta$ and the right-handed neutrino masses have been chosen, and the vectors $v_B$ and $v_A$ thereby
specified, we can invert Eq. (43) to express the neutrino mass eigenstates in terms of the
different see-saw contributions,
\[ \mathbf{m}_2^{1/2} = c\,v_A - s\,v_B, \qquad \mathbf{m}_3^{1/2} = s\,v_A + c\,v_B. \tag{47} \]
With values of $\theta \approx \pi/2$, corresponding to single right-handed neutrino dominance, Eq. (47)
clearly shows that the mass eigenstate $m_3$ mainly results from the see-saw contribution of the
right-handed neutrino of mass $M_A$, and the mass eigenstate $m_2$ mainly results from the see-saw
contribution of the right-handed neutrino of mass $M_B$. However Eq. (47) should be interpreted
with care since it is only meaningful once Eq. (41) has first been used.
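The geometrical relations in Eqs. (43)-(47) can be checked numerically. The following is a minimal sketch, assuming a real 2x2 rotation as a stand-in for the two-family mixing matrix and reading the lost signs in Eq. (43) as $v_B = -s\,m_2^{1/2} + c\,m_3^{1/2}$; the function names and the input values (mixing angle 0.6, $m_3/m_2 = 5$) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def two_family_vectors(theta, theta12, m2, m3):
    """Build v_A, v_B of Eq. (43) from a real 2x2 mixing matrix (illustrative assumption)."""
    c12, s12 = np.cos(theta12), np.sin(theta12)
    V = np.array([[c12, s12], [-s12, c12]])   # stand-in for the two-family MNS matrix
    mvec2 = V[:, 0] * np.sqrt(m2)             # Eq. (44): a column of V times m_j^{1/2}
    mvec3 = V[:, 1] * np.sqrt(m3)
    c, s = np.cos(theta), np.sin(theta)
    v_A = c * mvec2 + s * mvec3               # Eq. (43)
    v_B = -s * mvec2 + c * mvec3
    return v_A, v_B, mvec2, mvec3

def cos_theta_AB(theta, m2, m3):
    """Right-hand side of Eq. (45), with the magnitudes of Eq. (46)."""
    c, s = np.cos(theta), np.sin(theta)
    norm_A = np.sqrt(c**2 * m2 + s**2 * m3)
    norm_B = np.sqrt(s**2 * m2 + c**2 * m3)
    return (m3 - m2) * np.sin(2 * theta) / (2 * norm_A * norm_B)

m2, m3 = 1.0, 5.0                             # only the ratio m3/m2 matters here
for theta in (0.1, np.pi / 4, 1.4):
    v_A, v_B, _, _ = two_family_vectors(theta, 0.6, m2, m3)
    direct = v_A @ v_B / (np.linalg.norm(v_A) * np.linalg.norm(v_B))
    ratio = (v_A @ v_A) / (v_B @ v_B)
    print(f"theta={theta:.2f}  cos(theta_AB)={direct:.4f}  |vA|^2/|vB|^2={ratio:.3f}")
```

Scanning `theta` in this way shows the behaviour described in the text: the two vectors become orthogonal and the ratio of magnitudes approaches $m_3/m_2$ or $m_2/m_3$ as the angle approaches $\pi/2$ or zero.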
It is also observed from Eqs. (45), (46) that when $\theta \approx 0$, then $\theta_{AB} \approx \pi/2$ and
$|v_A|^2/|v_B|^2 \approx m_2/m_3 \approx 1/5$, corresponding to $|v_B|^2 \gg |v_A|^2$. This corresponds to another
type of dominance in which the heavier right-handed neutrino of mass $M_B$ dominates the see-saw
mechanism. So far we have been assuming that the lighter right-handed neutrino of mass $M_A$
dominates the see-saw mechanism, but now we see that there is an alternative case in which the
heavier right-handed neutrino of mass $M_B$ is the dominant one. In this case dominance is achieved
for $\theta \approx 0$, and in the limit $\theta = 0$ the R-matrix is the unit matrix,
\[ R = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{48} \]
The dominance approximation is thus seen to be valid over a large range of angles $\theta$ centered
on either zero or $\pi/2$, corresponding to a large range of angles $\theta_{AB}$ in the range $\pi/4$ to $\pi/2$. Of
course there is no precise value of $\theta$ at which the dominance approximation breaks down, and
the parametrization shows that there is a continuum of theories which interpolate between those
which have dominance of one right-handed neutrino and those which do not, in varying degrees.
This analysis shows that the idea of single right-handed neutrino dominance is quite generic and
it is quite likely to be relevant to some approximation in practice.
The above discussion illustrates that there are two types of dominance, one in which the lighter
right-handed neutrino dominates, corresponding to an R-matrix with $\theta \approx \pi/2$ like Eq. (42), and
one in which the heavier right-handed neutrino dominates, corresponding to an R-matrix with
$\theta \approx 0$ like Eq. (48). In practice, in dealing with the second type of dominance, it is convenient
to continue to identify the heavier dominant right-handed neutrino by the label A and to rewrite
Eq. (21) in this case as:
\[ M_{RR} = \begin{pmatrix} M_B & 0 \\ 0 & M_A \end{pmatrix}, \qquad m_{LR} = \begin{pmatrix} B_2 & A_2 \\ B_3 & A_3 \end{pmatrix}, \tag{49} \]
where here $M_B < M_A$. Thus, when the heavier right-handed neutrino dominates, we shall perform
a trivial relabelling $A \leftrightarrow B$ so that without loss of generality the right-handed neutrino of
mass $M_A$ always dominates. Clearly in this second case, using Eq. (49), all the results in this
section from Eq. (39) onwards follow as before but with a trivial relabelling $A \leftrightarrow B$. We
emphasize again that the advantage of dominance is that the determinant of the neutrino mass matrix
is naturally small, and also that the mixing angle is independent of the neutrino mass hierarchy,
both features following from the fact that with $\theta \approx \pi/2$, $v_A \approx \mathbf{m}_3^{1/2}$ (the same result being also
true for the case $\theta \approx 0$ after relabelling $A \leftrightarrow B$).
3.3. Three family SD in the see-saw flavour basis
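It is straightforward to extend two family SD in the see-saw flavour basis to the case of three
families. Eq. (21) becomes,
\[ M_{RR} = \begin{pmatrix} M_A & 0 & 0 \\ 0 & M_B & 0 \\ 0 & 0 & M_C \end{pmatrix}, \qquad m_{LR} = \begin{pmatrix} A_1 & B_1 & C_1 \\ A_2 & B_2 & C_2 \\ A_3 & B_3 & C_3 \end{pmatrix}. \tag{50} \]
The column vector notation is trivially extended to three families [16],
\[ v_A = \begin{pmatrix} A_1 \\ A_2 \\ A_3 \end{pmatrix} M_A^{-1/2}, \qquad v_B = \begin{pmatrix} B_1 \\ B_2 \\ B_3 \end{pmatrix} M_B^{-1/2}, \qquad v_C = \begin{pmatrix} C_1 \\ C_2 \\ C_3 \end{pmatrix} M_C^{-1/2}. \tag{51} \]
Then the see-saw formula in Eq. (10), $m_{LL} = m_{LR} M_{RR}^{-1} m_{LR}^T$, gives:
\[ m_{LL} = v_A v_A^T + v_B v_B^T + v_C v_C^T. \tag{52} \]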
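The decomposition in Eq. (52) is a purely algebraic identity and is easy to confirm numerically. The sketch below assumes a diagonal right-handed neutrino mass matrix as in Eq. (50) and uses randomly generated complex couplings; the helper names `seesaw` and `column_vectors` are hypothetical and introduced only for illustration.

```python
import numpy as np

def seesaw(m_LR, M_diag):
    """See-saw formula m_LL = m_LR M_RR^{-1} m_LR^T for diagonal M_RR."""
    return m_LR @ np.diag(1.0 / np.asarray(M_diag, dtype=float)) @ m_LR.T

def column_vectors(m_LR, M_diag):
    """Columns of m_LR rescaled by M_k^{-1/2}, i.e. v_A, v_B, v_C of Eq. (51)."""
    return m_LR / np.sqrt(np.asarray(M_diag, dtype=float))

rng = np.random.default_rng(0)
m_LR = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))   # arbitrary complex couplings
M_diag = [1.0e10, 1.0e12, 1.0e14]                               # illustrative masses only

v = column_vectors(m_LR, M_diag)
# Note the plain transpose (no complex conjugation) in v v^T.
m_LL_from_vectors = sum(np.outer(v[:, k], v[:, k]) for k in range(3))
print(np.allclose(seesaw(m_LR, M_diag), m_LL_from_vectors))
```

Because the outer products involve the transpose rather than the Hermitian conjugate, the resulting $m_{LL}$ is complex symmetric, as expected for a Majorana mass matrix.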
We assume the sequential dominance condition
\[ \frac{|A_i A_j|}{M_A} \gg \frac{|B_i B_j|}{M_B} \gg \frac{|C_i C_j|}{M_C}, \tag{53} \]
where $i, j = 1, \ldots, 3$. In vector notation this implies:
\[ |v_A|^2 \gg |v_B|^2 \gg |v_C|^2. \tag{54} \]
We also assume:
\[ |A_1| \ll |A_{2,3}|. \tag{55} \]
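Then approximate results for the masses and mixings are given by [8], writing
$A_\alpha = |A_\alpha| e^{i\phi_{A_\alpha}}$, $B_\alpha = |B_\alpha| e^{i\phi_{B_\alpha}}$, $C_\alpha = |C_\alpha| e^{i\phi_{C_\alpha}}$:
\[ \tan\theta_{23} \approx \frac{|A_2|}{|A_3|}, \tag{56a} \]
\[ \tan\theta_{12} \approx \frac{|B_1|}{c_{23}|B_2|\cos\tilde{\phi}_2 - s_{23}|B_3|\cos\tilde{\phi}_3}, \tag{56b} \]
\[ \theta_{13} \approx e^{i(\tilde{\phi} + \phi_{B_1} - \phi_{A_2})} \frac{|B_1|(A_2^* B_2 + A_3^* B_3)}{[|A_2|^2 + |A_3|^2]^{3/2}} \frac{M_A}{M_B} + \frac{e^{i(\tilde{\phi} + \phi_{A_1} - \phi_{A_2})} |A_1|}{\sqrt{|A_2|^2 + |A_3|^2}}, \tag{56c} \]
and for the masses
\[ m_3 \approx \frac{(|A_2|^2 + |A_3|^2)\, v^2}{M_A}, \tag{57a} \]
\[ m_2 \approx \frac{|B_1|^2 v^2}{s_{12}^2 M_B}, \tag{57b} \]
\[ m_1 \approx O\!\left(|C|^2 v^2 / M_C\right). \tag{57c} \]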
The MNS phase $\delta$ is fixed by the requirement that we have already imposed in Eq. (56b) that $\theta_{12}$
is real,
\[ c_{23}|B_2|\sin\tilde{\phi}_2 \approx s_{23}|B_3|\sin\tilde{\phi}_3, \tag{58} \]
where
\[ \tilde{\phi}_2 \equiv \phi_{B_2} - \phi_{B_1} - \tilde{\phi} + \delta, \qquad \tilde{\phi}_3 \equiv \phi_{B_3} - \phi_{B_1} + \phi_{A_2} - \phi_{A_3} - \tilde{\phi} + \delta. \tag{59} \]
The phase $\tilde{\phi}$ is fixed by the requirement (not yet imposed in Eq. (56c)) that the angle $\theta_{13}$ is real.
In general this condition is rather complicated since the expression for $\theta_{13}$ is a sum of two terms.
However if, for example, $A_1 = 0$ then $\tilde{\phi}$ is fixed by:
\[ \tilde{\phi} \approx \phi_{A_2} - \phi_{B_1} - \zeta, \tag{60} \]
where
\[ \zeta = \arg\left(A_2^* B_2 + A_3^* B_3\right). \tag{61} \]
Eq. (61) may be expressed as
\[ \tan\zeta \approx \frac{|B_2| s_{23} s_2 + |B_3| c_{23} s_3}{|B_2| s_{23} c_2 + |B_3| c_{23} c_3}. \tag{62} \]
Inserting $\tilde{\phi}$ in Eq. (60) into Eqs. (58), (59), we obtain a relation which can be expressed as
\[ \tan(\zeta + \delta) \approx \frac{|B_2| c_{23} s_2 - |B_3| s_{23} s_3}{-|B_2| c_{23} c_2 + |B_3| s_{23} c_3}. \tag{63} \]
In Eqs. (62), (63) we have written $s_i = \sin\eta_i$, $c_i = \cos\eta_i$, where we have defined
\[ \eta_2 \equiv \phi_{B_2} - \phi_{A_2}, \qquad \eta_3 \equiv \phi_{B_3} - \phi_{A_3}, \tag{64} \]
which are invariant under a charged lepton phase transformation. The reason why the see-saw
parameters only involve two invariant phases, rather than the usual six, is the sequential
dominance assumption, which effectively decouples one of the right-handed neutrinos, thereby
removing three phases, together with the further assumption (in this case) of $A_1 = 0$, which
removes another phase.
3.4. Three family SD and the R-matrix
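We now discuss the R-matrix for this case. From Eq. (15), we have for the three right-handed
neutrino model, dropping primes and assuming $M_A < M_B < M_C$:
\[ m_{LR} \begin{pmatrix} M_A & 0 & 0 \\ 0 & M_B & 0 \\ 0 & 0 & M_C \end{pmatrix}^{-1/2} = V^{MNS} \begin{pmatrix} m_1 & 0 & 0 \\ 0 & m_2 & 0 \\ 0 & 0 & m_3 \end{pmatrix}^{1/2} R^T. \tag{65} \]
Eq. (65) yields the following expressions for the column vectors introduced in Eq. (51),
\[ (\, v_A \;\; v_B \;\; v_C \,) = \left(\, \mathbf{m}_1^{1/2} \;\; \mathbf{m}_2^{1/2} \;\; \mathbf{m}_3^{1/2} \,\right) R^T, \tag{66} \]
where the MNS column vectors $\mathbf{m}_j^{1/2}$ are defined as:
\[ \left(\mathbf{m}_j^{1/2}\right)_i = V^{MNS}_{ij}\, m_j^{1/2}, \tag{67} \]
i.e. the column vector $\mathbf{m}_j^{1/2}$ is equal to the $j$th column of $V^{MNS}$ times $m_j^{1/2}$.

The MNS matrix is given by,
\[ V^{MNS} = \begin{pmatrix} c_{12}c_{13} & s_{12}c_{13} & s_{13}e^{-i\delta} \\ -s_{12}c_{23} - c_{12}s_{23}s_{13}e^{i\delta} & c_{12}c_{23} - s_{12}s_{23}s_{13}e^{i\delta} & s_{23}c_{13} \\ s_{12}s_{23} - c_{12}c_{23}s_{13}e^{i\delta} & -c_{12}s_{23} - s_{12}c_{23}s_{13}e^{i\delta} & c_{23}c_{13} \end{pmatrix} P_0, \tag{68} \]
where
\[ P_0 = \begin{pmatrix} e^{i\beta_1} & 0 & 0 \\ 0 & e^{i\beta_2} & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{69} \]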
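The R-matrix is a complex orthogonal $3 \times 3$ matrix which can be parameterized in terms of three
complex angles $\theta_i$ as $R = \mathrm{diag}(\pm 1, \pm 1, \pm 1) R_1 R_2 R_3$ where $R_i^T$ take the form of Eq. (40):
\[ R_1^T = \begin{pmatrix} 1 & 0 & 0 \\ 0 & c_1 & -s_1 \\ 0 & s_1 & c_1 \end{pmatrix}, \qquad R_2^T = \begin{pmatrix} c_2 & 0 & -s_2 \\ 0 & 1 & 0 \\ s_2 & 0 & c_2 \end{pmatrix}, \qquad R_3^T = \begin{pmatrix} c_3 & -s_3 & 0 \\ s_3 & c_3 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \tag{70} \]
\[ R_3^T R_2^T R_1^T = \begin{pmatrix} c_2 c_3 & -c_1 s_3 - s_1 s_2 c_3 & s_1 s_3 - c_1 s_2 c_3 \\ c_2 s_3 & c_1 c_3 - s_1 s_2 s_3 & -s_1 c_3 - c_1 s_2 s_3 \\ s_2 & s_1 c_2 & c_1 c_2 \end{pmatrix}, \tag{71} \]
where we have written $s_i = \sin\theta_i$, $c_i = \cos\theta_i$.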
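The product in Eq. (71) can be cross-checked against the elementary rotations of Eq. (70). The following sketch assumes the sign convention in which the minus signs sit above the diagonal in $R_1^T$, $R_2^T$ and $R_3^T$ (the convention implied by Eqs. (83) and (95)); the function names are hypothetical. It also confirms that the product remains orthogonal, in the sense $R^T R = 1$ with a plain transpose, for complex angles.

```python
import numpy as np

def rotations_T(t1, t2, t3):
    """Elementary transposed rotations R_1^T, R_2^T, R_3^T (sign convention assumed)."""
    c1, s1 = np.cos(t1), np.sin(t1)
    c2, s2 = np.cos(t2), np.sin(t2)
    c3, s3 = np.cos(t3), np.sin(t3)
    R1T = np.array([[1, 0, 0], [0, c1, -s1], [0, s1, c1]], dtype=complex)
    R2T = np.array([[c2, 0, -s2], [0, 1, 0], [s2, 0, c2]], dtype=complex)
    R3T = np.array([[c3, -s3, 0], [s3, c3, 0], [0, 0, 1]], dtype=complex)
    return R1T, R2T, R3T

def product_explicit(t1, t2, t3):
    """Closed form of R_3^T R_2^T R_1^T, written out entry by entry."""
    c1, s1 = np.cos(t1), np.sin(t1)
    c2, s2 = np.cos(t2), np.sin(t2)
    c3, s3 = np.cos(t3), np.sin(t3)
    return np.array([
        [c2 * c3, -c1 * s3 - s1 * s2 * c3, s1 * s3 - c1 * s2 * c3],
        [c2 * s3, c1 * c3 - s1 * s2 * s3, -s1 * c3 - c1 * s2 * s3],
        [s2, s1 * c2, c1 * c2],
    ], dtype=complex)

angles = (0.3 + 0.1j, -0.2 + 0.4j, 1.1 - 0.3j)   # arbitrary complex angles
R1T, R2T, R3T = rotations_T(*angles)
RT = R3T @ R2T @ R1T
print(np.allclose(RT, product_explicit(*angles)))   # closed form agrees with the product
print(np.allclose(RT @ RT.T, np.eye(3)))            # complex orthogonality
```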

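Although the R-matrix is rather complicated, it is clear from Eq. (66) that SD occurs for
values of angles $\theta_i$ which correspond to the following approximate forms for the moduli of the
elements of $R^T$:
\[ R^T_{ABC} \approx \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \qquad (\, v_A \;\; v_B \;\; v_C \,) \approx \left(\, \mathbf{m}_3^{1/2} \;\; \mathbf{m}_2^{1/2} \;\; \mathbf{m}_1^{1/2} \,\right), \tag{72} \]
\[ R^T_{ACB} \approx \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}, \qquad (\, v_A \;\; v_C \;\; v_B \,) \approx \left(\, \mathbf{m}_3^{1/2} \;\; \mathbf{m}_1^{1/2} \;\; \mathbf{m}_2^{1/2} \,\right), \tag{73} \]
\[ R^T_{BAC} \approx \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}, \qquad (\, v_B \;\; v_A \;\; v_C \,) \approx \left(\, \mathbf{m}_2^{1/2} \;\; \mathbf{m}_3^{1/2} \;\; \mathbf{m}_1^{1/2} \,\right), \tag{74} \]
\[ R^T_{BCA} \approx \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad (\, v_B \;\; v_C \;\; v_A \,) \approx \left(\, \mathbf{m}_2^{1/2} \;\; \mathbf{m}_1^{1/2} \;\; \mathbf{m}_3^{1/2} \,\right), \tag{75} \]
\[ R^T_{CAB} \approx \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \qquad (\, v_C \;\; v_A \;\; v_B \,) \approx \left(\, \mathbf{m}_1^{1/2} \;\; \mathbf{m}_3^{1/2} \;\; \mathbf{m}_2^{1/2} \,\right), \tag{76} \]
\[ R^T_{CBA} \approx \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad (\, v_C \;\; v_B \;\; v_A \,) \approx \left(\, \mathbf{m}_1^{1/2} \;\; \mathbf{m}_2^{1/2} \;\; \mathbf{m}_3^{1/2} \,\right). \tag{77} \]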
As discussed in the previous section, without loss of generality we have assumed that the dominant right-handed neutrino is labelled by A, the leading subdominant right-handed neutrino is
labelled by B, and the subsubdominant right-handed neutrino is labelled by C, and we have relabelled the right-handed neutrinos where appropriate according to this convention. The possible
forms of the neutrino Dirac mass matrix mLR corresponding to the above types of SD are then
given by
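\[ m_{LR} = (A, B, C) \quad \text{or} \quad m_{LR} = (A, C, B) \quad \text{for } M_1 = M_A, \tag{78} \]
\[ m_{LR} = (B, A, C) \quad \text{or} \quad m_{LR} = (B, C, A) \quad \text{for } M_1 = M_B, \tag{79} \]
\[ m_{LR} = (C, A, B) \quad \text{or} \quad m_{LR} = (C, B, A) \quad \text{for } M_1 = M_C, \tag{80} \]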
where we have ordered the columns in each case according to MRR = diag(M1 , M2 , M3 ) where
M1 < M2 < M3 , consistent with Eq. (65).
Clearly the different types of SD correspond to the moduli of the R-matrix elements taking
values close to either zero or unity, so that each of the vectors $v_A$, $v_B$, $v_C$ is approximately equal
to a particular vector $\mathbf{m}_i^{1/2}$. Considering the modular surfaces of $\sin\theta_i$ and $\cos\theta_i$, this corresponds
to the angles $\theta_i$ being approximately real and taking values close to either zero or $\pi/2$, which
is a generalization of the situation in the two family example discussed previously. Note that
SD therefore implies that the R-matrix is approximately real. Since there has been some recent
interest in the case of the real R-matrix in the context of flavour-dependent leptogenesis, we shall
return to this point later in Section 4.
3.5. SD in the two right-handed neutrino approximation

In this subsection we consider the two right-handed neutrino limit of SD. We shall suppose
that we have SD but not exact tri-bimaximal mixing. In this case R takes approximately the
forms discussed in the previous section. For definiteness we will consider the type of SD
corresponding to R being close to the unit matrix. The other kinds of SD are discussed in Appendix A.
The two right-handed neutrino approximation corresponds to the limit in which the right-handed
neutrino labelled by C decouples from the see-saw mechanism, where this limit also
corresponds to $m_1 = 0$. In this limit of SD we shall see that the model reduces to the two
right-handed neutrino model with SD introduced in [8]. For example, let us consider the case of R
being approximately equal to the unit matrix, corresponding to the type of SD given in Eq. (77).
In the C decoupling limit this corresponds to:
\[ (\, 0 \;\; v_B \;\; v_A \,) = \left(\, 0 \;\; \mathbf{m}_2^{1/2} \;\; \mathbf{m}_3^{1/2} \,\right) R^T_{0BA}. \tag{81} \]
This limit corresponds to $s_2 = s_3 = 0$, with only $s_1 \neq 0$, giving:
\[ R^T_{0BA} = R_3^T\big|_{s_3 = 0}\, R_2^T\big|_{s_2 = 0}\, R_1^T = \begin{pmatrix} 1 & 0 & 0 \\ 0 & c_1 & -s_1 \\ 0 & s_1 & c_1 \end{pmatrix}. \tag{82} \]
This results in:
\[ v_B = c_1 \mathbf{m}_2^{1/2} + s_1 \mathbf{m}_3^{1/2}, \qquad v_A = -s_1 \mathbf{m}_2^{1/2} + c_1 \mathbf{m}_3^{1/2}, \tag{83} \]
similar to Eq. (43) in the two family model, except that here the vectors have three components.
SD here corresponds to a small angle $\theta_1 \approx 0$ (for both real and imaginary components). A zero
value $\theta_1 = 0$ implies that $A_1 \propto \theta_{13}$, as discussed. However a non-zero angle $\theta_1$ allows for example
a zero value of $A_1 = 0$ consistent with a non-zero value of $\theta_{13}$. For example $A_1 = 0$ implies from
Eq. (83),
\[ \tan\theta_1 \approx e^{-i(\delta + \beta_2)} \left(\frac{m_3}{m_2}\right)^{1/2} \frac{\tan\theta_{13}}{s_{12}}. \tag{84} \]
This result shows that, with a texture zero $A_1 = 0$, small $\theta_{13}$ implies also small $\theta_1$. This is a
remarkable result: in general having a small value of $A_1$ combined with small $\theta_{13}$ in the two
right-handed neutrino limit implies also small (but non-zero in general) $\theta_1$, corresponding to SD.
In the two right-handed neutrino limit it is impossible to have a texture zero $A_1$ without SD.
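The relation in Eq. (84) follows from setting the first component of $v_A$ in Eq. (83) to zero, and this can be verified numerically. The sketch below assumes the standard parameterization of the MNS matrix in Eq. (68) with a diagonal Majorana phase matrix; the phase names `beta1`, `beta2`, the function names, and all numerical inputs are illustrative assumptions.

```python
import numpy as np

def mns(t12, t23, t13, delta, beta1, beta2):
    """MNS matrix in the standard parameterization times diag(e^{i beta1}, e^{i beta2}, 1)."""
    c12, s12 = np.cos(t12), np.sin(t12)
    c23, s23 = np.cos(t23), np.sin(t23)
    c13, s13 = np.cos(t13), np.sin(t13)
    ep, em = np.exp(1j * delta), np.exp(-1j * delta)
    V = np.array([
        [c12 * c13, s12 * c13, s13 * em],
        [-s12 * c23 - c12 * s23 * s13 * ep, c12 * c23 - s12 * s23 * s13 * ep, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * ep, -c12 * s23 - s12 * c23 * s13 * ep, c23 * c13],
    ])
    return V @ np.diag([np.exp(1j * beta1), np.exp(1j * beta2), 1.0])

def theta1_for_texture_zero(t12, t13, delta, beta2, m2, m3):
    """Complex angle theta_1 from Eq. (84), i.e. the value that enforces A_1 = 0."""
    tan_t1 = np.exp(-1j * (delta + beta2)) * np.sqrt(m3 / m2) * np.tan(t13) / np.sin(t12)
    return np.arctan(tan_t1)

# Illustrative inputs only (angles in radians, masses in arbitrary units).
t12, t23, t13, delta, beta1, beta2 = 0.6, 0.8, 0.1, 0.7, 0.2, 1.1
m2, m3 = 1.0, 5.0
V = mns(t12, t23, t13, delta, beta1, beta2)
mvec2, mvec3 = V[:, 1] * np.sqrt(m2), V[:, 2] * np.sqrt(m3)

theta1 = theta1_for_texture_zero(t12, t13, delta, beta2, m2, m3)
v_A = -np.sin(theta1) * mvec2 + np.cos(theta1) * mvec3   # Eq. (83)
print(abs(v_A[0]), abs(theta1))   # first component vanishes; |theta_1| is small when theta_13 is small
```

In this parameterization the relation holds exactly, since $V_{13}/V_{12} = e^{-i(\delta+\beta_2)} \tan\theta_{13}/s_{12}$.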
A similar analysis follows for the other types of SD, where the right-handed neutrino labelled
by C in these cases can be decoupled in a similar way. In each case it is necessary to allow
the remaining dominant and subdominant right-handed neutrinos to mix, in order to allow for
the most general kind of SD, and we identify the single remaining mixing angle in each case.
The other cases are discussed in Appendix A.
The above discussion and Appendix A show how an effective two right-handed neutrino
model arises as a limiting case of the three right-handed neutrino model in which the right-handed
neutrino labelled by C is decoupled. In this decoupling limit the remaining two right-handed
neutrino system is parameterized in each case by a single non-trivial complex angle, where the
nature of the angle and the values of the other fixed angles of the R-matrix depend on the type
of three right-handed neutrino SD. In particular the limiting cases all lead to relations similar to
Eq. (43) which is repeated below:
\[ v_A = c\,\mathbf{m}_2^{1/2} + s\,\mathbf{m}_3^{1/2}, \qquad v_B = -s\,\mathbf{m}_2^{1/2} + c\,\mathbf{m}_3^{1/2}, \tag{85} \]
where the main difference is that the vectors here have three components. For each type of SD
it is straightforward to relate the angle $\theta$ to either $\theta_1$ or $\theta_2$ using the results given in Eqs. (83),
(A.3), (A.6), (A.9), (A.12), (A.15). The following discussion will be based on the angle $\theta$ defined
in Eq. (85), assuming that this identification has been made.
Eq. (85) again leads to a similar geometrical relation between the R-matrix angle $\theta$ and the
angle between the two vectors $v_A$ and $v_B$ as in Eq. (45), where the magnitudes of the two vectors
are as in Eq. (46). These results follow from the unitarity of $V_{MNS}$ (since recall that $\mathbf{m}_i^{1/2}$ is
proportional to the $i$th column of $V_{MNS}$) which gives:
\[ \langle \mathbf{m}_i^{1/2} | \mathbf{m}_j^{1/2} \rangle = \delta_{ij} m_j \tag{86} \]
and hence:
\[ \langle v_A | v_B \rangle = -c^* s\, m_2 + s^* c\, m_3, \qquad \langle v_A | v_A \rangle = c^* c\, m_2 + s^* s\, m_3, \qquad \langle v_B | v_B \rangle = s^* s\, m_2 + c^* c\, m_3. \tag{87} \]
In the case of tri-bimaximal mixing $s = 0$ or $c = 0$ and hence $\langle v_A | v_B \rangle = 0$, i.e. orthogonality
of the dominant and subdominant columns of the Yukawa matrix, as in Eq. (91). However, as
the previous discussion shows, away from the tri-bimaximal limit these limits are in general too
strong, and so we must in general consider $s, c \neq 0$, with SD corresponding to either $s \approx 0$, or
$c \approx 0$, which implies the R-matrix angle $\theta$ takes approximately real values close to zero or $\pi/2$.
We also remark that it is trivial to generalize the result in Eq. (84) to all the other types of SD. In
other words a texture zero $A_1 = 0$ directly implies SD, for each of the types of SD.
It is possible to regard the two right-handed neutrino model as a complete model in its own
right, not as a limiting case of a three right-handed neutrino model. This is not so well motivated
as the limiting cases discussed here. However in such a case one may take Eq. (85) as the starting
point for the exploration of the parameter space. This has been discussed fully elsewhere [22],
so we shall not pursue this point further here. However the results in this subsection should be
useful in relating a three right-handed neutrino analysis to the two right-handed neutrino limit,
and in particular to the SD regions of parameter space of this limit.
4. Applications of basis independent SD
In this section we first discuss the application of these new ideas to flavour models, then
discuss the implications for approaches based on the R-matrix, including flavour-dependent leptogenesis which has recently been studied in the literature.
4.1. Examples of models in the same invariant class as SD
The usual application of SD to flavour models in the literature is in the see-saw flavour basis
corresponding to diagonal mass matrices of charged leptons and right-handed neutrinos, or small
perturbations away from the diagonal structures. This severely restricts the applicability of SD,
and may even lead one to believe that SD is an artefact of that particular basis, or could be
transformed away by going to another basis, or even that it is meaningless since all see-saw
models are related to each other by a change of basis. We have shown explicitly in this paper that
none of these statements is true. We have shown how the different types of SD may be formulated
in a basis independent way in terms of the R-matrix, since, as we have also shown, each choice of
R-matrix labels an infinite equivalence class of see-saw models related to each other by changes
of lepton basis. These results open the door for new applications of SD away from the usual
diagonal basis of charged leptons and right-handed neutrinos. In this subsection we illustrate the
possibilities by highlighting some existing models in the literature which are now seen to be SD
in disguise, i.e. are in the same invariant class as SD.
4.1.1. Tri-bimaximal neutrino mixing and CSD: Charged lepton corrections
In this subsection, we first discuss CSD and tri-bimaximal neutrino mixing in terms of the
R-matrix. We shall show that the R-matrix elements take quite precise values equal to either
zero or plus or minus unity (we shall discuss how precise) in this case, which are unaffected by
charged lepton corrections, according to Section 2.4 in which a change of charged lepton basis
leaves the R-matrix invariant. However the MNS matrix is subject to observable deviations from
tri-bimaximal mixing due to charged lepton corrections. The lesson from this is that the charged
lepton corrections can result in a change of the invariant class of see-saw model, not due to a
change in R but due to a change in the physical parameters.
In the notation of Eq. (50), tri-bimaximal mixing [3] corresponds to the choice [10]:
\[ |A_1| = 0, \tag{88} \]
\[ |A_2| = |A_3|, \tag{89} \]
\[ |B_1| = |B_2| = |B_3|, \tag{90} \]
\[ A^\dagger B = 0. \tag{91} \]
This is called constrained SD (CSD) [10]. Note that there is no constraint imposed on the couplings Ci since these describe the right-handed neutrino which is approximately decoupled from
the see-saw mechanism.
In terms of the R-matrix SD corresponds to the special case that the R-matrix elements are
approximately equal to zero or plus or minus unity. We now show that the accurate limit of SD,
in which the elements of R are zero or plus or minus unity very accurately, corresponds to CSD
and tri-bimaximal mixing. We shall consider the case of the R-matrix approximately equal to the
unit matrix (the other cases follow similarly). In this case we can write Eq. (77) explicitly as:
\[ \left(\, C_i M_C^{-1/2} \;\; B_i M_B^{-1/2} \;\; A_i M_A^{-1/2} \,\right) = \left(\, V_{i1} m_1^{1/2} \;\; V_{i2} m_2^{1/2} \;\; V_{i3} m_3^{1/2} \,\right) R^T_{CBA}, \tag{92} \]
where we have written $V_{ij} = V^{MNS}_{ij}$. If we take $R^T_{CBA} = \mathrm{diag}(\pm 1, \pm 1, \pm 1)$ precisely, then Eq. (92)
implies for example that $A_i \propto V_{i3}$, so that $A_1 = 0$ would imply that $\theta_{13} = 0$ (cf. the general case
from Eq. (56c) where $\theta_{13}$ involves a contribution from a term which is independent of $A_1$). We
further note that for $R^T_{CBA} = \mathrm{diag}(\pm 1, \pm 1, \pm 1)$ and for tri-bimaximal mixing angles $\theta_{13} = 0$,
$\sin\theta_{23} = 1/\sqrt{2}$, $\sin\theta_{12} = 1/\sqrt{3}$, the Dirac matrix takes a very special form:
\[ \begin{pmatrix} A_1 \\ A_2 \\ A_3 \end{pmatrix} \propto \begin{pmatrix} 0 \\ s_{23} \\ c_{23} \end{pmatrix} \propto \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}, \tag{93} \]
\[ \begin{pmatrix} B_1 \\ B_2 \\ B_3 \end{pmatrix} \propto \begin{pmatrix} s_{12} \\ c_{12}c_{23} \\ -c_{12}s_{23} \end{pmatrix} \propto \begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix}, \tag{94} \]
ignoring the irrelevant couplings $C_i$. These satisfy the CSD conditions for the Yukawa couplings
discussed in Eqs. (88)-(91) [10]. We conclude that with R precisely equal to the unit matrix
tri-bimaximal mixing implies and is implied by CSD. Of course this is not the only way to achieve
tri-bimaximal mixing, which could be achieved via any other choice of R-matrix, corresponding
to other choices of Yukawa couplings, but this choice of Yukawa couplings appears to be the
simplest, and could arise for example from vacuum alignment in flavour models [10,12]. Indeed the
simplicity of the Yukawa couplings in this case provides a powerful motivation for SD. Similar
forms of the Yukawa matrices of the CSD form for tri-bimaximal mixing emerge from the other
types of SD in Eqs. (73)-(77) when the R matrices take the exact forms shown there (with the
elements being precisely $0, \pm 1$) rather than just the approximate forms.
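The statement that a unit R-matrix together with tri-bimaximal angles reproduces the CSD conditions can be illustrated numerically. This is a minimal sketch, assuming $\theta_{13} = 0$, vanishing phases, and the sign conventions of the MNS matrix in Eq. (68); the helper names and the mass values are hypothetical.

```python
import numpy as np

# Tri-bimaximal angles: sin(theta_12) = 1/sqrt(3), sin(theta_23) = 1/sqrt(2), theta_13 = 0.
s12, s23 = 1 / np.sqrt(3), 1 / np.sqrt(2)
c12, c23 = np.sqrt(1 - s12**2), np.sqrt(1 - s23**2)
V_tb = np.array([
    [c12, s12, 0.0],
    [-s12 * c23, c12 * c23, s23],
    [s12 * s23, -c12 * s23, c23],
])

def csd_columns(V, m2, m3):
    """For R equal to the unit matrix, A and B are proportional to columns 3 and 2 of V."""
    return V[:, 2] * np.sqrt(m3), V[:, 1] * np.sqrt(m2)

def satisfies_csd(A, B, tol=1e-12):
    """Check the four CSD conditions of Eqs. (88)-(91)."""
    return bool(
        abs(A[0]) < tol
        and abs(abs(A[1]) - abs(A[2])) < tol
        and abs(abs(B[0]) - abs(B[1])) < tol
        and abs(abs(B[1]) - abs(B[2])) < tol
        and abs(np.vdot(A, B)) < tol        # A^dagger B = 0
    )

A, B = csd_columns(V_tb, m2=1.0, m3=5.0)
print(A / A[1], B / B[0], satisfies_csd(A, B))
```

The normalized columns come out proportional to $(0, 1, 1)$ and $(1, 1, -1)$, matching Eqs. (93) and (94).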
In realistic models [10,12] it is typically the case that CSD arises through vacuum alignment
in some theory basis, in which the charged lepton mass matrix is not precisely diagonal,
resulting in charged lepton corrections to tri-bimaximal mixing. In the theory basis there is,
to good approximation, tri-bimaximal neutrino mixing, and the neutrino Dirac mass matrix is
parameterized in terms of a unit R-matrix (or one of the other exact forms in Eqs. (72)-(77))
as we have just seen. However, if, in some basis, the R-matrix is equal to the unit matrix, for
example, then this will be true in all bases, as we showed in Section 2.4. In the presence of
charged lepton corrections the MNS matrix will deviate from the tri-bimaximal form, but the
R-matrix will remain equal to the unit matrix. In going from the theory basis to the see-saw flavour
basis in which the charged lepton mass matrix is diagonal, both sides of Eq. (17) must be left
multiplied by a matrix $V_{E_L}$ which diagonalizes the charged leptons, resulting in $V'_{MNS}$ appearing
on the right-hand side which is not of the precise tri-bimaximal form, even though R is precisely
equal to the unit matrix in both the original basis and the primed basis. Interestingly, the neutrino
mass matrix in the primed basis will retain the property that its columns are proportional to the
columns of the MNS matrix, albeit that the MNS matrix is not precisely of the tri-bimaximal
form.
We have seen that tri-bimaximal neutrino mixing from CSD corresponds to the R-matrix
taking one of the forms in Eqs. (72)-(77) rather precisely. One may ask how accurately
these forms should be achieved in realistic models. In practice, tri-bimaximal neutrino mixing relies
on the conditions in Eqs. (88)-(91) being satisfied, which leads to tri-bimaximal mixing up to
corrections of order $m_2/m_3$. The conditions on the couplings $C_i$ are less constrained since
they only give corrections to the mixing angles of order $m_1/m_3$, which may be quite small. We
have already examined the limit where the right-handed neutrino labelled by C decouples and in
this limit the corrections to tri-bimaximal neutrino mixing of order $m_2/m_3$ can be described by
a single small angle as discussed in Section 3.5. For example, in the case of R being close to
the unit matrix, R is described by $\theta_2 = \theta_3 = 0$ with small values of $\theta_1 \approx 0$ parameterizing
the corrections of order $m_2/m_3$, according to Eq. (83). If we relax the decoupling of C then we
can also account for corrections of order $m_1/m_3$ to the R-matrix, described by non-zero values
of $\theta_2 \approx 0$ and $\theta_3 \approx 0$, which corresponds to:
\[ v_C = c_3 c_2\, \mathbf{m}_1^{1/2} + s_3 c_2\, \mathbf{m}_2^{1/2} + s_2\, \mathbf{m}_3^{1/2}. \tag{95} \]
We conclude that the case of CSD and tri-bimaximal neutrino mixing corresponds to the
R-matrix taking quite exactly (up to corrections of order $m_1/m_3$, $m_2/m_3$) one of the forms in
Eqs. (72)-(77). If the forms of the R-matrix deviate by more than this, but still resemble those
forms to some degree, then we merely have SD not CSD, and exact tri-bimaximal neutrino mixing
is lost. In the case of CSD, the presence of charged lepton mixing corrections will give observable
corrections to tri-bimaximal mixing in the MNS matrix, resulting in testable predictions and sum
rules for lepton mixing angles [10,13]. However, these corrections leave the R-matrix unchanged
from the precise forms just described. These precise forms of the R-matrix therefore represent
the basis-independent signature of CSD and tri-bimaximal neutrino (rather than MNS) mixing
which can be identified in phenomenological analyses based on the R-matrix.
4.1.2. Lepton mixing from the charged lepton sector
We now discuss a class of models which account for lepton mixing purely as arising from
the charged lepton sector. Such models have been discussed in [15], and we show here that they
are in the same invariant class as SD models, i.e. are SD models in disguise. The starting point
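of these models is to assume that there is no mixing coming from the neutrino sector. The mass
matrices are then written as:
\[ m^E_{LR} = \begin{pmatrix} p & d & a \\ q & e & b \\ r & f & c \end{pmatrix}, \qquad v_u Y_{LR} = \begin{pmatrix} C_1 & 0 & 0 \\ C_2 & B_2 & 0 \\ C_3 & B_3 & A_3 \end{pmatrix}, \qquad M_{RR} = \begin{pmatrix} M_C & 0 & 0 \\ 0 & M_B & 0 \\ 0 & 0 & M_A \end{pmatrix}, \tag{96} \]
and the following conditions are assumed:
\[ \frac{|A_3 A_3|}{M_A} \gg \frac{|B_i B_j|}{M_B} \gg \frac{|C_i C_j|}{M_C}, \tag{97} \]
which is the usual SD condition in Eq. (53), and leads to $m_{LL} \approx \mathrm{diag}(m_1, m_2, m_3)$. We also
assume the new conditions:
\[ |a|, |b|, |c| \gg |d|, |e|, |f| \gg |p|, |q|, |r|, \tag{98} \]
\[ |d|, |e| \ll |f|. \tag{99} \]
The charged lepton masses are given by:
\[ m_\tau \approx \left(|a|^2 + |b|^2 + |c|^2\right)^{1/2}, \tag{100a} \]
\[ m_\mu \approx \left(|d|^2 + |e|^2 + |f|^2 - \frac{|d^* a + e^* b + f^* c|^2}{m_\tau^2}\right)^{1/2}, \tag{100b} \]
\[ m_e \approx O\!\left(|p|, |q|, |r|\right). \tag{100c} \]
In leading order in $|d|/|f|$ and $|e|/|f|$, the mixing angles are given by [15]:
\[ \tan(\theta_{12}) \approx \frac{|a|}{|b|}, \tag{101a} \]
\[ \tan(\theta_{23}) \approx \frac{s_{12}|a| + c_{12}|b|}{|c|}, \tag{101b} \]
\[ \tan(\theta_{13}) \approx O\!\left(\frac{|e|, |d|}{|f|}\right). \tag{101c} \]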
According to the ISS approach, we should begin by calculating the R-matrix in the basis
defined in Eq. (96), in order to determine the invariant class C(R) to which this model belongs.
For this purpose we shall use the results in Section 2.4, and in particular Eq. (16) which is valid
for a general charged lepton basis, but a diagonal right-handed neutrino mass basis. Here $V_L$,
being the matrix that diagonalizes $m_{LL}$ in this basis, is actually equal to the unit matrix, since by
construction there is no mixing coming from the neutrino sector. Thus the R-matrix is determined
from Eq. (16) as:
\[ R^T = \mathrm{diag}(m_1, m_2, m_3)^{-1/2}\; v_u Y_{LR}\; \mathrm{diag}(M_C, M_B, M_A)^{-1/2}, \tag{102} \]
where $Y_{LR}$ is as in Eq. (96). By explicit multiplication, using the conditions for a neutrino mass
hierarchy in Eq. (97), it is easy to see that R is approximately equal to the unit matrix. It is also
easy to see that if $Y_{LR}$ were taken to be diagonal, then R would be exactly equal to the unit
matrix. We already saw in Section 3.4 that a unit R-matrix defines a particular invariant class
of models to which SD belongs, where the dominant (subdominant) right-handed neutrino is the
heaviest (intermediate) one. Therefore we conclude that the charged lepton mixing model here is
in the same invariant class as SD.
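The claim that Eq. (102) gives an R-matrix close to the unit matrix can be illustrated with a quick numerical experiment. The sketch below assumes real couplings, takes the light masses $m_i$ to be the eigenvalues of $m_{LL}$, and uses made-up values chosen only to satisfy the hierarchy in Eq. (97); `r_matrix_T` is a hypothetical helper, not a function from the paper.

```python
import numpy as np

def r_matrix_T(m_LR, M_diag):
    """R^T = diag(m)^{-1/2} m_LR diag(M)^{-1/2}, with m_i the eigenvalues of m_LL.

    Assumes real couplings and V_L close to the unit matrix, as in the model of Eq. (96).
    """
    M = np.asarray(M_diag, dtype=float)
    m_LL = m_LR @ np.diag(1.0 / M) @ m_LR.T
    masses = np.linalg.eigvalsh(m_LL)            # ascending: m1 < m2 < m3
    return np.diag(masses ** -0.5) @ m_LR @ np.diag(M ** -0.5)

# Illustrative numbers with |A3 A3|/M_A >> |B_i B_j|/M_B >> |C_i C_j|/M_C.
C, B, A3 = 1.0, 1.0e3, 1.0e6
M_C, M_B, M_A = 1.0e8, 1.0e10, 1.0e12
m_LR = np.array([[C, 0.0, 0.0],
                 [C, B, 0.0],
                 [C, B, A3]])

print(np.round(r_matrix_T(m_LR, [M_C, M_B, M_A]), 3))                     # close to the unit matrix
print(np.round(r_matrix_T(np.diag([C, B, A3]), [M_C, M_B, M_A]), 12))     # exactly the unit matrix
```

With these inputs the off-diagonal entries are of order $10^{-2}$, reflecting the size of the hierarchy between successive see-saw contributions.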
We can check this result explicitly by rotating the above models to the usual SD models by a
change of charged lepton basis, using the symmetry $U_L(3) \times U_{E_R}(3)$. This results in a change
of mass matrices from the ones in Eq. (96) to the ones in the see-saw flavour basis in which the
charged lepton mass matrix is diagonal, $m'^E_{LR} = \mathrm{diag}(m_e, m_\mu, m_\tau)$, given by:
\[ m'^E_{LR} = V_{E_L} m^E_{LR} V_{E_R}^\dagger, \qquad m'_{LR} = V_{E_L} m_{LR}, \qquad m'_{LL} = V_{E_L} m_{LL} V_{E_L}^T. \tag{103} \]
In the unprimed basis $m_{LL} \approx \mathrm{diag}(m_1, m_2, m_3)$, and by comparing Eq. (8) to Eq. (103) we identify:
\[ V_{E_L} \approx V^{MNS}. \tag{104} \]
Then using Eq. (104) with Eq. (103) we have,
\[ m'_{LR} \approx V^{MNS} m_{LR}. \tag{105} \]
Using Eq. (105) with Eq. (96), and the MNS matrix in Eq. (68), immediately leads to the SD
form in Eq. (50) of the neutrino mass matrix, satisfying the usual conditions in Eqs. (53), (55),
with the right-handed neutrino mass ordering of the form in Eq. (77). By a reordering of the
right-handed neutrino masses in Eq. (96) we could similarly arrive at any of the types of SD in
Eqs. (72)-(77) in the primed basis in which the charged lepton mass matrix is diagonal as in
Eq. (12).
Alternatively, we could start from one of the sequential right-handed neutrino dominance
types, in the primed basis, then rotate to the unprimed basis in which the mixing is coming
from the charged lepton sector. Starting from the primed basis, in which the charged lepton
mass matrix is diagonal, rotating to the unprimed basis leads to $m_{LL} \approx \mathrm{diag}(m_1, m_2, m_3)$, and a
charged lepton mass matrix given by:
\[ m^E_{LR} \approx V^{MNS\,\dagger}\, \mathrm{diag}(m_e, m_\mu, m_\tau). \tag{106} \]
For example, for tri-bimaximal mixing, Eq. (106) gives:
\[ m^E_{LR} \approx \begin{pmatrix} \sqrt{\tfrac{2}{3}}\, m_e & -\tfrac{1}{\sqrt{6}}\, m_\mu & \tfrac{1}{\sqrt{6}}\, m_\tau \\ \tfrac{1}{\sqrt{3}}\, m_e & \tfrac{1}{\sqrt{3}}\, m_\mu & -\tfrac{1}{\sqrt{3}}\, m_\tau \\ 0 & \tfrac{1}{\sqrt{2}}\, m_\mu & \tfrac{1}{\sqrt{2}}\, m_\tau \end{pmatrix}, \tag{107} \]
which is of the form in Eq. (96), for the case of tri-bimaximal lepton mixing.
We conclude that the class of models proposed in [15], where all the mixing arises from the
charged lepton sector, are in the same invariant class as SD, where all mixing arises from the
neutrino sector. The two types of model are in the same invariant class since they correspond to
the same approximately unit R-matrix. In the basis in which there is no mixing coming from the
neutrino sector, $V_L$ is equal to the unit matrix, while in the basis in which all the mixing is
coming from the neutrino sector $V_L = V_{MNS}^\dagger$, with R being the same in both bases.
4.1.3. Non-diagonal right-handed neutrino models
We now consider an example of a see-saw model in which some of the mixing arises from
the right-handed neutrino sector. Specifically we consider the flavour model of tri-bimaximal
neutrino mixing based on SU(3) or its discrete subgroup $\Delta(27)$ [12]. We shall show that this
model is in the same invariant class as CSD models, i.e. is CSD in disguise. This will also
provide an example of how the S-matrix may be used as a short-cut to finding the R-matrix, and
also the neutrino mass matrix itself.
In the model under consideration the neutrino mass matrices are of the leading order form:
\[ v_u Y_{LR} = \begin{pmatrix} 0 & B & C_1 \\ A & B + A & C_2 \\ -A & B - A & C_3 \end{pmatrix}, \qquad M_{RR} = \begin{pmatrix} M_A & M_A & 0 \\ M_A & M_A + M_B & 0 \\ 0 & 0 & M_C \end{pmatrix}, \tag{108} \]
where $M_A < M_B < M_C$ and the couplings $A$, $B$, $C_i$ satisfy the conditions in Eq. (53). However
it is not at all clear that the model corresponds to SD since the right-handed neutrino mass matrix
is not diagonal. Moreover it is not clear that tri-bimaximal neutrino mixing results from Eq. (108)
since it does not satisfy the CSD conditions in Eqs. (88)-(91).
However, using the S-matrix transformations in Eq. (18), with
\[ S^{-1} = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \tag{109} \]
results in:
\[ M_{RR} \to \begin{pmatrix} M_A & 0 & 0 \\ 0 & M_B & 0 \\ 0 & 0 & M_C \end{pmatrix}, \qquad v_u Y_{LR} \to \begin{pmatrix} 0 & B & C_1 \\ A & B & C_2 \\ -A & B & C_3 \end{pmatrix}, \tag{110} \]
where the transformed mass matrices satisfy the CSD conditions in Eqs. (88)-(91). The
transformed theory (not strictly a basis transformation since S is not unitary) has the same R-matrix
as the original theory, according to Eq. (20), even though the right-handed neutrino masses are
different (note that in Eq. (110) $M_{A,B,C}$ are not the eigenvalues).
Having made this S-matrix transformation, we can calculate the neutrino mass matrix and
the R-matrix in the transformed basis, since both quantities are invariant under S as shown in
Section 2.4. In fact it is manifestly clear from Eq. (110) that the transformed theory satisfies the
CSD conditions and leads to tri-bimaximal neutrino mixing. The R-matrix may be obtained from
Eq. (20),
\[ R^T = \mathrm{diag}(m_1, m_2, m_3)^{-1/2}\; V_L\; v_u Y_{LR}\; S^{-1}\; \mathrm{diag}(M_A, M_B, M_C)^{-1/2}, \tag{111} \]
where in this case $V_L = V_{MNS}^\dagger$ (ignoring small charged lepton corrections). In this case Eq. (111),
with the tri-bimaximal MNS matrix, leads to an R-matrix of the form in Eq. (72). We thus see
that the original theory is in the same invariant class as CSD since it corresponds to the same
R-matrix, in this case that given in Eq. (72).
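The S-matrix step from Eq. (108) to Eq. (110) can be reproduced with a few lines of linear algebra. The sketch below assumes the transformation acts as $Y \to Y S^{-1}$ and $M_{RR} \to (S^{-1})^T M_{RR} S^{-1}$, which is the form under which the see-saw mass matrix is manifestly unchanged (Eq. (18) itself is not reproduced in this section, so this is an assumption); the numerical values of the couplings and masses are arbitrary.

```python
import numpy as np

def seesaw_general(Y, M):
    """m_LL = Y M^{-1} Y^T for a general (non-diagonal) right-handed mass matrix."""
    return Y @ np.linalg.inv(M) @ Y.T

# Arbitrary illustrative values for the couplings and mass parameters.
A, B = 0.9 + 0.2j, 0.3 - 0.1j
C1, C2, C3 = 0.01, 0.02j, -0.015
M_A, M_B, M_C = 1.0, 20.0, 500.0

Y = np.array([[0, B, C1], [A, B + A, C2], [-A, B - A, C3]], dtype=complex)      # Eq. (108)
M = np.array([[M_A, M_A, 0], [M_A, M_A + M_B, 0], [0, 0, M_C]], dtype=complex)
S_inv = np.array([[1, -1, 0], [0, 1, 0], [0, 0, 1]], dtype=complex)             # Eq. (109)

Y_new = Y @ S_inv                  # assumed transformation law
M_new = S_inv.T @ M @ S_inv

print(np.allclose(M_new, np.diag([M_A, M_B, M_C])))                             # diagonal, Eq. (110)
print(np.allclose(Y_new, [[0, B, C1], [A, B, C2], [-A, B, C3]]))                # CSD form, Eq. (110)
print(np.allclose(seesaw_general(Y, M), seesaw_general(Y_new, M_new)))          # m_LL unchanged
```

The first two columns of the transformed Yukawa matrix, $(0, A, -A)$ and $(B, B, B)$, satisfy the CSD conditions of Eqs. (88)-(91) by inspection.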
4.2. SD phenomenology and the R-matrix
In this paper we have formulated SD in terms of the R-matrix in order to show its basis-independence, using the fact that the R-matrix labels distinct equivalence classes of see-saw
models, and each choice of R-matrix generates a continuously infinite class of models related to
each other by basis transformations. However this identification has additional practical benefits
since the R-matrix has been extensively used in phenomenological analyses, so it is useful to be
able to identify sequential dominance with particular points in R-matrix parameter space. In this
subsection we discuss some recent examples of this.
4.2.1. Lepton flavour violation
A recent phenomenological analysis of lepton flavour violation identified a particularly interesting region of parameter space in which the R-matrix is equal to or close to the unit matrix [19].
From our results here we see that the case of R being exactly equal to the unit matrix corresponds to CSD and tri-bimaximal mixing, of the kind where the heaviest right-handed neutrino
is the dominant one, and the second heaviest is the leading sub-dominant one.
4.2.2. Two right-handed neutrino model
Another example of phenomenological analyses which have relied heavily on the R-matrix are
the recent analyses of the two right-handed neutrino model [22]. We have already shown how this
can emerge from the three right-handed neutrino model by decoupling the right-handed neutrino
labelled by C. Although in general the remaining two right-handed neutrinos in the analysis
in [22] do not satisfy the SD condition (or strictly the single right-handed neutrino dominance
condition, since such models automatically satisfy at least the SD condition that one of the
right-handed neutrinos is decoupled) it is in fact satisfied in much of the parameter space considered,
namely where the R angle $\theta$ is close to zero or $\pi/2$, how close being a matter discussed
earlier in this paper. Moreover having a particular texture zero, as is assumed over some regions
of the analysis in [22], automatically implies SD, as we also saw earlier in Eq. (84).
4.2.3. Flavour-dependent leptogenesis
One of the main phenomenological applications of the R-matrix is to leptogenesis. It is
particularly convenient here since, for example, when it is used in the calculation of the flavour-independent asymmetry parameter $\varepsilon_1$ it clearly shows that the MNS parameters cancel out.
However, recently there has been some activity related to the flavour-dependence of leptogenesis [23], and here the MNS parameters do not cancel out of the expressions for the separate
flavour-dependent asymmetries $\varepsilon_\alpha$, where $\alpha = e, \mu, \tau$. Nevertheless, the R-matrix continues
to be of interest in recent phenomenological analyses of flavour-dependent leptogenesis [24],
with the flavour-dependent asymmetry parameter being given by:

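$$\varepsilon_\alpha = -\frac{3 M_1}{16\pi v_u^2}\,\frac{\mathrm{Im}\left(\sum_{\beta,\rho} m_\beta^{1/2} m_\rho^{3/2}\, U^*_{\alpha\beta} U_{\alpha\rho}\, R_{1\beta} R_{1\rho}\right)}{\sum_\beta m_\beta |R_{1\beta}|^2}, \qquad (112)$$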
where we have written U = VMNS . Since Eq. (112) only involves basis invariant quantities, it is
manifest that the asymmetry parameter $\varepsilon_\alpha$ will take a unique value for all see-saw models which
belong to a particular invariant class C(R), i.e. $\varepsilon_\alpha$ is basis invariant, as it should be. The use of a
real R-matrix to permit a link between leptogenesis and the MNS phases has been explored [24],
since it then follows from Eq. (112) that

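$$\varepsilon_\alpha \propto -\sum_\beta \sum_{\rho>\beta} m_\beta^{1/2} m_\rho^{1/2} (m_\rho - m_\beta)\, R_{1\beta} R_{1\rho}\, \mathrm{Im}\left(U^*_{\alpha\beta} U_{\alpha\rho}\right), \qquad (113)$$
which clearly shows that $\varepsilon_\alpha$ only depends on the MNS phases for the case of R real. A rather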
nice application of our results is that approximately real R is an automatic consequence of SD,
as we now discuss.
The case of R being real has been identified with reality of the right-handed rotations used
to diagonalize the Dirac matrix, and thus with the notion of there being no CP violation in the
right-handed neutrino sector. However, this looks like quite a strong requirement. For example
one way to achieve this would be to have an SO(3) family symmetry in the right-handed neutrino
sector, which is broken spontaneously by real flavon vacuum expectation values, which is a rather
precise requirement and not at all generic. This raises the question of whether there is a more
natural, better motivated way to guarantee a real R-matrix. Our formulation of
SD in terms of the R-matrix shows that the SD cases actually correspond to the R-matrix being
approximately real. As discussed previously, this follows by considering the modular surfaces of
$\sin\theta_i$ and $\cos\theta_i$, where we saw that SD corresponds to the R angles $\theta_i$ being approximately real
and taking values close to either zero or $\pi/2$. Thus SD is a very nice way of motivating a real
R-matrix, where R takes values approximately as given in Eqs. (72)-(77).
The case of CSD and exact tri-bimaximal mixing, corresponding to the R-matrix taking
quite precisely (rather than just approximately) one of the forms in Eqs. (72)-(77), leads to
zero leptogenesis asymmetry parameters. For example, when R is precisely equal to the unit
matrix, Eq. (113) shows that the asymmetry parameters are all equal to zero [24], and similarly
for the other exact forms in Eqs. (72)-(77). Interestingly, this result also applies to the case of
tri-bimaximal neutrino mixing with charged lepton corrections, as discussed in [10], which give significant corrections to the total lepton mixing, resulting in deviations
from tri-bimaximal lepton mixing. This might seem paradoxical since physically if there is no
exact tri-bimaximal lepton mixing, then one might also expect that the asymmetry parameters
are also not exactly zero. However the point is that, as already mentioned, if in some basis the
R-matrix is equal to the unit matrix, then this will be true in all bases, as we showed in Section 2.4. In the presence of charged lepton corrections the MNS matrix will deviate from the
tri-bimaximal form, but the R-matrix will remain equal to the unit matrix, and leptogenesis will
remain zero.
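The vanishing of the asymmetries for R equal to the unit matrix can be checked numerically. The following is a minimal sketch, assuming that Eq. (112) has the standard form built from the first-row elements $R_{1\beta}$; the function names, the mass values and the choice $M_1 = v_u = 1$ are illustrative assumptions, and the overall sign and normalization do not affect the check.

```python
import numpy as np

def flavour_asymmetries(m, U, R, M1=1.0, vu=1.0):
    """Evaluate an Eq. (112)-type expression for eps_alpha (alpha = 0, 1, 2).

    m : light neutrino masses, U : 3x3 mixing matrix, R : 3x3 R-matrix.
    M1 and vu are set to 1 by default purely for illustration.
    """
    m = np.asarray(m, dtype=float)
    U = np.asarray(U, dtype=complex)
    R1 = np.asarray(R, dtype=complex)[0]        # first row R_{1 beta}
    a = np.sqrt(m) * R1                          # m_beta^{1/2} R_{1 beta}
    b = m ** 1.5 * R1                            # m_rho^{3/2} R_{1 rho}
    # The double sum over beta and rho factorizes into two single sums.
    numerator = (np.conj(U) @ a) * (U @ b)
    denominator = np.sum(m * np.abs(R1) ** 2)
    return -3.0 * M1 / (16.0 * np.pi * vu ** 2) * numerator.imag / denominator

def random_unitary(seed=0):
    """A generic unitary matrix, standing in for an arbitrary mixing matrix."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
    q, _ = np.linalg.qr(z)
    return q

def complex_rotation_12(z):
    """Complex orthogonal rotation in the 1-2 plane (hypothetical example of R)."""
    c, s = np.cos(z), np.sin(z)
    return np.array([[c, s, 0], [-s, c, 0], [0, 0, 1]], dtype=complex)

masses = [1.0, 2.0, 3.0]                         # arbitrary illustrative values
eps_unit = flavour_asymmetries(masses, random_unitary(), np.eye(3))
eps_complex = flavour_asymmetries(masses, np.eye(3), complex_rotation_12(0.3 + 0.2j))
print(eps_unit)      # vanishes up to rounding for any unitary U
print(eps_complex)   # non-zero once R has a complex angle
```

With R equal to the unit matrix only the $\beta = \rho = 1$ term survives, and it is proportional to $|U_{\alpha 1}|^2$, which is real whatever charged lepton corrections do to U.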
Does this mean that the asymmetry parameters of leptogenesis are always equal to zero for
tri-bimaximal neutrino mixing arising from CSD? In practice, tri-bimaximal neutrino mixing
in realistic models is achieved by using vacuum alignment for the dominant and leading sub-dominant right-handed neutrinos, such that the conditions in Eqs. (88)-(91) are satisfied. As
already discussed, there are expected to be small deviations from these precise forms parameterized by small angles which represent corrections of order $m_1/m_3$ and $m_2/m_3$. In particular
there are no conditions imposed on the couplings $C_i$ since the associated right-handed neutrino
is assumed to play a negligible role in the see-saw mechanism and gives corrections of order
$m_1/m_3$.
If the almost decoupled right-handed neutrino labelled by C is the heaviest, or the intermediate
mass right-handed neutrino, then it will also play no important role in leptogenesis, since the
asymmetry parameters are determined up to corrections of order m1 /m3 by the dominant and
sub-dominant couplings Ai , Bi [25]. However if the almost decoupled right-handed neutrino
is the lightest M1 = MC , then it is unavoidable that the couplings Ci must be involved in the
calculation of the asymmetry parameters, since the asymmetry parameters are given in this case
by [25]:

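$$\varepsilon_\alpha = -\frac{3 M_1}{16\pi}\left[\frac{\mathrm{Im}\left[C^*_\alpha A_\alpha (C^\dagger A)\right]}{M_A (C^\dagger C)} + \frac{\mathrm{Im}\left[C^*_\alpha B_\alpha (C^\dagger B)\right]}{M_B (C^\dagger C)}\right]. \qquad (114)$$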
In this case, there are no constraints on the couplings $C_i$ from CSD and in particular $C^\dagger A$ and
$C^\dagger B$ are both non-zero, in contrast to the other cases which would involve $A^\dagger B = 0$ due to the
CSD relation in Eq. (91). In the case that the R-matrix is precisely equal to the unit matrix, or one
of the other related forms, then the column vectors A, B, C are each associated with a column of
the MNS matrix, and so we would have $C^\dagger A = C^\dagger B = 0$ by unitarity, giving zero values of the
asymmetry parameters in this case, in agreement with the general argument previously for the
case of R being equal to the unit matrix. However, since the couplings $C_i$ are unconstrained, this
implies that the R-matrix is not precisely equal to the unit matrix, but has important corrections
parameterized by non-zero values of the R angles $\theta_i$ as discussed in Eq. (95). In [25] a particular
example of this type was studied in detail.
5. Conclusion
We have proposed an ISS approach to model building, based on the observation that see-saw
models of neutrino mass and mixing fall into basis invariant classes labelled by the Casas-Ibarra
R-matrix. We have proved that the R-matrix is invariant not just under basis transformations but
also non-unitary right-handed neutrino transformations S. According to the ISS approach, given
any see-saw model in some particular basis one may determine the invariant R matrix and hence
the invariant class to which that model belongs. The formulation of see-saw models in terms of
invariant classes puts them on a firmer theoretical footing, and allows different see-saw models
in the same class to be related more easily, while their relation to the R-matrix makes them more
easily identifiable in phenomenological studies.
We have systematically studied SD as a prime example of the ISS approach. We considered a
simple two family example, before proceeding to the three family case. A very convenient vector
notation was introduced [16] in which the invariant combination $v_u Y_{LR} M_{RR}^{-1/2}$ on the left-hand
side of Eq. (15) was expressed in terms of three Yukawa vectors consisting of the columns of
the Yukawa matrix normalized by the inverse square roots of right-handed neutrino masses as in
Eq. (51). These three Yukawa vectors are then related to the MNS vectors, introduced in this
paper for the first time, consisting of columns of the MNS matrix normalized by square roots of
neutrino masses, as in Eq. (66). This gives a very nice physical interpretation of the R-matrix,
as that matrix which controls the misalignment of the Yukawa vectors and the MNS vectors.
SD corresponds to the Yukawa vectors and MNS vectors being approximately aligned, up to
permutations. CSD and tri-bimaximal mixing correspond to the Yukawa vectors and MNS
vectors being more accurately aligned, up to permutations. This interpretation can be extended
to any right-handed neutrino or charged lepton basis provided one uses Eq. (17), since the left-hand side is invariant under right-handed neutrino transformations, and on the right-hand side
MNS mixing is replaced by neutrino mixing.
We have given a precise relation between the angle between the Yukawa vectors and
R-matrix angle in two right-handed neutrino limiting cases. These limiting cases are discussed
in detail, for all the different mass orderings of right-handed neutrinos. For such limiting cases,
SD is shown to be an automatic consequence of a particular texture zero and a small $\theta_{13}$. Tri-bimaximal
mixing and CSD, as corresponding to the R-matrix taking a very
precise (permutation of the) unit matrix form (rather than an approximate such form), are also discussed, as is the fact that charged lepton corrections give corrections
to the PMNS angles, but not to the R-matrix, which retains its precise form. The fact that SD
corresponds to approximately real R-matrix angles, as can be seen by considering the modular
surfaces of the R-matrix angles close to zero or $\pi/2$, is also discussed, with application to recent
flavour-dependent leptogenesis analyses.
The R-matrix provides a beautiful basis invariant formulation of SD and CSD. This means
that SD is physically meaningful, e.g. not all classes of see-saw models correspond to SD, and
also SD cannot be transformed away by a change of basis, since the R-matrix is invariant under
a basis change. The basis independence of SD also makes it more widely applicable to a larger
range of models than is usually considered in the literature. We considered particular models
in which the mixing naturally arises (at least in part) from the charged lepton or right-handed
neutrino sectors, and showed that these models share the same R-matrix as SD, and are hence
in the same invariant class, i.e. they are just SD in disguise. Also the connection of SD to the
R-matrix makes it easier to identify in phenomenological studies.
In summary, the ISS approach amounts to the following procedure. Starting from a particular
see-saw model in a particular basis, one should determine the associated R-matrix, using either
the standard approach involving the right-handed neutrino mass eigenvalues as in Eq. (17), or
using the S matrix short-cut in Eq. (20), useful when right-handed neutrino mass eigenvalues
are not required. Having determined the invariant class C(R) to which it belongs, the particular
model should properly be regarded as one member of an infinite number of other models related
by basis transformations, and it can then easily be seen if any particular model is already present
in the literature in a different guise. This also allows any given model to make contact with
general phenomenological analyses based on the R-matrix. Although the ISS approach has been
applied here to SD models, more generally it should prove to be a valuable model building tool
in classifying and studying the myriad see-saw models that have been proposed in the literature.
Acknowledgements
I would like to thank Stefan Antusch, Michal Malinsky, Graham Ross and Ivo Varzielias for
helpful discussions, and the CERN Theory Division for its hospitality and a Scientific Associateship. The author acknowledges support from the EU network MRTN 2004-503369.
Appendix A. Two right-handed neutrino limit of sequential dominance
In this appendix we discuss the two right-handed neutrino limit of sequential dominance for
the other cases not included in Section 3.5.
The type of dominance in Eq. (72) in the two right-handed neutrino limit corresponds to:

1/2 T
(A.1)
( vA vB 0 ) = 0 m1/2
m3 RAB0
2
where
T
RAB0
R3T s =1 R2T
3
R1T s =1
1
0
c2
s2
1
0
0
0
s2
c2

(A.2)
where now $s_3 = 1$, $c_3 = 0$, $s_1 = 1$, $c_1 = 0$ with $s_2, c_2 \neq 0$. This results in:

1/2
1/2
1/2
1/2
vA = c2 m2 + s2 m3 ,
vB = s2 m2 c2 m3
(A.3)
SD here corresponds to $s_2 \approx 1$, $c_2 \approx 0$.

1/2 T
(A.4)
( vA 0 vB ) = 0 m1/2
m3 RA0B
2
where
T
RA0B
R3T s =1 R2T
3
R1T s =0
1
0
c2
s2
1
0
0
0
s2
c2

(A.5)

1/2
1/2
vA = c2 m2 + s2 m3 ,
1/2
1/2
vB = s2 m2 + c2 m3
(A.6)

1/2 T
(A.7)
( vB vA 0 ) = 0 m1/2
m3 RBA0
2
where
T
RBA0
R3T s =1 R2T
3
R1T s =1
1
0
c2
s2
0
s2
c2
1
0
0

(A.8)
where now $s_{1,3} = 1$, $c_{1,3} = 0$ with $s_2, c_2 \neq 0$. This results in:

1/2
1/2
vB = c2 m2 + s2 m3 ,
1/2
1/2
vA = s2 m2 + c2 m3
(A.9)

1/2 T
(A.10)
( vB 0 vA ) = 0 m1/2
m3 RB0A
2
where
T
RB0A
R3T s =1 R2T
3
R1T s =0
3
0
c2
s2
1
0
0
0
s2
c2

(A.11)

1/2
1/2
vB = c2 m2 + s2 m3 ,
1/2
1/2
vA = s2 m2 + c2 m3
(A.12)

1/2 T
(A.13)
( 0 vA vB ) = 0 m1/2
m3 R0AB
2
where

1 0
0
=
= 0 c1 s1
0 s1 c1
where now $s_{2,3} = 0$, $c_{2,3} = 1$ with $s_1, c_1 \neq 0$. This results in:
T
R0AB
R3T s =0
3
1/2
R2T s =0 R1T
2
(A.14)
1/2
vA = c1 m2 + s1 m3 ,
1/2
1/2
vB = s1 m2 + c1 m3
(A.15)
References
[1] For a review see e.g. A. Strumia, F. Vissani, hep-ph/0606054;
R.N. Mohapatra, et al., hep-ph/0510213.
[2] For a recent review see e.g. J.W.F. Valle, hep-ph/0608101.
[3] P.F. Harrison, D.H. Perkins, W.G. Scott, Phys. Lett. B 530 (2002) 167, hep-ph/0202074;
P.F. Harrison, W.G. Scott, Phys. Lett. B 535 (2002) 163, hep-ph/0203209;
P.F. Harrison, W.G. Scott, Phys. Lett. B 557 (2003) 76, hep-ph/0302025;
An earlier related ansatz was proposed by: L. Wolfenstein, Phys. Rev. D 18 (1978) 958.
[4] For a review see e.g. S.F. King, Rep. Prog. Phys. 67 (2004) 107, hep-ph/0310204;
G. Altarelli, F. Feruglio, Springer Tracts Mod. Phys. 190 (2003) 169, hep-ph/0206077;
G. Altarelli, hep-ph/0610164;
R.N. Mohapatra, A.Y. Smirnov, hep-ph/0603118.
[5] P. Minkowski, Phys. Lett. B 67 (1977) 421;
M. Gell-Mann, P. Ramond, R. Slansky, in: Sanibel Talk, CALT-68-709, February 1979, and in: Supergravity, North-Holland, Amsterdam, 1979;
T. Yanagida, in: Proceedings of the Workshop on Unified Theory and Baryon Number of the Universe, KEK, Japan,
1979;
S.L. Glashow, Cargese Lectures (1979);
R.N. Mohapatra, G. Senjanovic, Phys. Rev. Lett. 44 (1980) 912;
J. Schechter, J.W. Valle, Phys. Rev. D 25 (1982) 774.
[6] C. Jarlskog, Phys. Rev. Lett. 55 (1985) 1039;
C. Jarlskog, Z. Phys. C 29 (1985) 491;
For a recent review see: C. Jarlskog, Phys. Scr. T 127 (2006) 64, hep-ph/0606050.
[7] J.A. Casas, A. Ibarra, Nucl. Phys. B 618 (2001) 171, hep-ph/0103065.
[8] S.F. King, Phys. Lett. B 439 (1998) 350, hep-ph/9806440;
S.F. King, Nucl. Phys. B 562 (1999) 57, hep-ph/9904210;
S.F. King, Nucl. Phys. B 576 (2000) 85, hep-ph/9912492;
S.F. King, JHEP 0209 (2002) 011, hep-ph/0204360;
S.F. King, Phys. Rev. D 67 (2003) 113010, hep-ph/0211228.
[9] A.Y. Smirnov, Phys. Rev. D 48 (1993) 3264, hep-ph/9304205;

K.S. Babu, S.M. Barr, Phys. Lett. B 381 (1996) 202, hep-ph/9511446;
G. Altarelli, F. Feruglio, JHEP 9811 (1998) 021, hep-ph/9809596;
G. Altarelli, F. Feruglio, I. Masina, Phys. Lett. B 472 (2000) 382, hep-ph/9907532.
[10] S.F. King, JHEP 0508 (2005) 105, hep-ph/0506297.
[11] S.F. King, G.G. Ross, Phys. Lett. B 520 (2001) 243, hep-ph/0108112;
S.F. King, G.G. Ross, Phys. Lett. B 574 (2003) 239, hep-ph/0307190;
G. Altarelli, F. Feruglio, hep-ph/0504165;
I. de Medeiros Varzielas, S.F. King, G.G. Ross, hep-ph/0512313;
S.F. King, M. Malinsky, hep-ph/0608021;
G. Altarelli, F. Feruglio, Y. Lin, hep-ph/0610165.
[12] I. de Medeiros Varzielas, G.G. Ross, Nucl. Phys. B 733 (2006) 31, hep-ph/0507176;
I. de Medeiros Varzielas, S.F. King, G.G. Ross, hep-ph/0607045.
[13] S. Antusch, S.F. King, Phys. Lett. B 631 (2005) 42, hep-ph/0508044.
[14] C.H. Albright, S.M. Barr, Phys. Rev. D 58 (1998) 013002, hep-ph/9712488;
C.H. Albright, K.S. Babu, S.M. Barr, Phys. Rev. Lett. 81 (1998) 1167, hep-ph/9802314;
G. Altarelli, F. Feruglio, I. Masina, Nucl. Phys. B 689 (2004) 157, hep-ph/0402155.
[15] S. Antusch, S.F. King, Phys. Lett. B 591 (2004) 104, hep-ph/0403053;
S. Antusch, S.F. King, Nucl. Phys. B 705 (2005) 239, hep-ph/0402121.
[16] S. Lavignac, I. Masina, C.A. Savoy, Nucl. Phys. B 633 (2002) 139, hep-ph/0202086.
[17] I. Masina, hep-ph/0210125;
I. Masina, Phys. Lett. B 633 (2006) 134, hep-ph/0508031.
[18] G.C. Branco, R. Gonzalez Felipe, F.R. Joaquim, I. Masina, M.N. Rebelo, C.A. Savoy, Phys. Rev. D 67 (2003)
073025, hep-ph/0211001.
[19] S. Antusch, E. Arganda, M.J. Herrero, A.M. Teixeira, JHEP 0611 (2006) 090, hep-ph/0607263.
[20] T. Yanagida, Prog. Theor. Phys. 64 (1980) 1103.
[21] H.K. Dreiner, H. Murayama, M. Thormeier, Nucl. Phys. B 729 (2005) 278, hep-ph/0312012;
J.R. Espinosa, A. Ibarra, JHEP 0408 (2004) 010, hep-ph/0405095;
S.F. King, I.N.R. Peddie, G.G. Ross, L. Velasco-Sevilla, O. Vives, JHEP 0507 (2005) 049, hep-ph/0407012;
S.F. King, I.N.R. Peddie, Phys. Lett. B 586 (2004) 83, hep-ph/0312237.
[22] A. Ibarra, JHEP 0601 (2006) 064, hep-ph/0511136;
A. Ibarra, G.G. Ross, Phys. Lett. B 591 (2004) 285, hep-ph/0312138.
[23] R. Barbieri, P. Creminelli, A. Strumia, N. Tetradis, Nucl. Phys. B 575 (2000) 61, hep-ph/9911315;
A. Abada, S. Davidson, F.X. Josse-Michaux, M. Losada, A. Riotto, JCAP 0604 (2006) 004, hep-ph/0601083;
E. Nardi, Y. Nir, E. Roulet, J. Racker, JHEP 0601 (2006) 164, hep-ph/0601084.
[24] A. Abada, S. Davidson, A. Ibarra, F.X. Josse-Michaux, M. Losada, A. Riotto, hep-ph/0605281;
S. Pascoli, S.T. Petcov, A. Riotto, hep-ph/0609125;
G.C. Branco, R.G. Felipe, F.R. Joaquim, hep-ph/0609297.
[25] S. Antusch, S.F. King, A. Riotto, hep-ph/0609038.
Center vortices as rigid strings

P.V. Buividovich a, M.I. Polikarpov b
a Belarusian State University, Nezalezhnasti av. 4, 220080 Minsk, Belarus
b ITEP, B. Cheremushkinskaya str. 25, 117218 Moscow, Russia
Received 30 May 2007; received in revised form 20 June 2007; accepted 29 June 2007
Abstract
It is shown that the action associated with center vortices in SU(2) lattice gauge theory is strongly correlated with extrinsic and internal curvatures of the vortex surface and that this correlation persists in the
continuum limit. Thus a good approximation for the effective vortex action is the action of rigid strings,
which can reproduce some of the observed geometric properties of center vortices. It is conjectured that
rigidity may be induced by some fields localized on vortices, and a model-independent test of localization
is performed. Monopoles detected in the Abelian projection are discussed as natural candidates for such
two-dimensional fields.
PACS: 12.38.Aw; 12.38.Gc; 11.25.-w
1. Introduction
Yang-Mills theory is often believed to be equivalent to some string theory. However, up to
now there is no way to detect thin strings behind the physical chromoelectric string of finite thickness, which gives rise to the linear potential between test colour charges and which is usually seen
in numerical simulations [1,2]. On the other hand, closed magnetic strings, or center vortices,
can be directly detected [3] and seem to be thin [4]. A model-independent argument in favor
of physically thin center vortices in continuum pure Yang-Mills theory was proposed recently
in [5]. It is known that the full QCD string tension is reproduced with sufficiently good precision
if one considers only the contribution due to topologically nontrivial winding of center vortices
E-mail addresses: buividovich@tut.by (P.V. Buividovich), polykarp@itep.ru (M.I. Polikarpov).

doi:10.1016/j.nuclphysb.2007.06.026
P.V. Buividovich, M.I. Polikarpov / Nuclear Physics B 786 (2007) 84-94
Fig. 1. Average size of center vortices as a function of their area ($\beta = 2.60$, $28^4$ lattice).
and the Wilson loop [3,6]. Moreover, in [3,4] it was demonstrated that the area of center vortices in
SU(2) LGT scales in physical units of length. These facts imply that center vortices are not just
lattice artifacts, but rather correspond to some physically significant objects. Center vortices are
usually detected using the maximal center gauge and are seen as closed self-avoiding surfaces,
which occupy only a small fraction of lattice plaquettes [3,4]. Typically one observes a large percolating vortex, which extends through the whole lattice, and a number of small satellite vortices,
whose size typically does not exceed several lattice spacings [7].
As the total physical area and size of center vortices remain finite in the continuum limit,
there should also be a continuous description of vortex geometry, which is characterized by
some finite Hausdorff dimension $d_H$. The percolating vortex by definition has Hausdorff dimension
equal to 4. In order to define the dimensionality of small satellite vortices in SU(2) LGT, their
size L was measured as a function of their area S. The size of a vortex was defined as the
maximal distance between points which belong to the vortex. The average size of small vortices as
a function of their area (in lattice units) is plotted in Fig. 1. A fit of the form $L = \mathrm{const} \cdot S^{1/d}$
(solid curve in Fig. 1) gives $d = 1.9 \pm 0.1$. For small vortex areas ($S \lesssim 30$) this number simply
reflects the self-avoidance of vortex surfaces, but for larger values of the area this fit indicates that small
vortices tend to be smooth surfaces.
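A fit of this form amounts to a straight-line fit in log-log coordinates. The sketch below illustrates the idea on synthetic data only; the function name and the numbers are hypothetical and are not the lattice data of Fig. 1.

```python
import numpy as np

def fit_dimension(area, size):
    """Fit log L = log(const) + (1/d) log S by least squares; return (d, const)."""
    slope, intercept = np.polyfit(np.log(area), np.log(size), 1)
    return 1.0 / slope, np.exp(intercept)

# Synthetic, noise-free example generated with d = 1.9 (illustrative values only).
area = np.arange(30.0, 200.0, 10.0)
size = 1.3 * area ** (1.0 / 1.9)

d_fit, const_fit = fit_dimension(area, size)
print(d_fit, const_fit)
```

On real data one would also propagate the statistical errors of the measured sizes into the uncertainty of d, which this sketch does not attempt.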
How can one describe the properties of such surfaces? A necessary condition for physical scaling of the area of random surfaces is the cancellation between their entropy, which for self-avoiding
surfaces grows linearly with the area of the surface in lattice units, and the bare string tension, which
therefore should be finite in lattice units [8]. A remarkable result of [4] is that the excess of action associated with center vortices is indeed proportional to the area of vortices in lattice units.
However, for the simplest model of random surfaces with the Nambu-Goto action such a naive balance between action and entropy does not lead to physical surfaces because of the branched polymer
problem [8]; therefore one has to consider a more complicated model.
A well-studied model which can probably describe smooth surfaces even in four and three
dimensions is the model of rigid strings [8-12]. The model was first analyzed in [9,10], where it
was shown that, depending on the form of the $\beta$-function of the model, either the branched polymer phase or the phase of smooth surfaces can be observed. Namely, if the $\beta$-function has no
nontrivial fixed points, the model reduces to the usual Nambu-Goto string flawed by the branched
polymer problem, but if a nontrivial fixed point exists, in the vicinity of this point the model should
describe smooth surfaces with finite Hausdorff dimension [9]. Numerical simulations indeed confirmed the existence of the phase of smooth surfaces in this model [8,12,13]. Nontrivial scaling
in the vicinity of the phase transition was observed in [12], but in more recent simulations of
a related model of tethered surfaces a first-order phase transition was found [13]. However, even
if the model of rigid strings has no continuum limit, it can still serve as an effective model, for
which the UV cutoff is set by some other fields in the theory. It is known, for instance, that rigidity
terms arise in the effective string action after integrating over worldsheet fermions [8,14,15] or
some four-dimensional massive fields coupled to the string worldsheet [11,16,17].
The aim of this paper is to fit the observed properties of center vortices within the model of rigid
strings. This model was first conjectured to describe center vortices in [18], where the regularized action of vortex configurations was studied in the continuum theory, and in [19], where
the model was found to describe the deconfinement transition with good precision. In this paper the correlation between the excess of action associated with vortices and the geometry of
vortex surfaces will be studied systematically on the basis of the results of lattice simulations. In Section 2 it will be shown that the effective vortex action can indeed be approximated by the action
of rigid strings and that the coupling constants before extrinsic and internal curvatures are finite in the continuum limit. In Section 3 this action is discussed as an effective action induced
by some two-dimensional fields localized on the surfaces of vortices and a model-independent
check of localization is performed. Monopoles in the Abelian projection of the theory [4,20,21]
are then discussed as appropriate candidates for such localized fields in Section 4. The dependence of the bare string tension on the lattice spacing can be explained if one takes into account
the contribution of the percolating monopole cluster. Interaction between monopoles, which can be
approximated by the Yukawa potential with physical mass [22], can also partially account for
surface rigidity [11,16,17]. Finally, implications and possible extensions of the obtained results are
discussed.
2. Effective action as the function of vortex geometry
In order to measure the correlation between geometric properties of the vortex surface and the action density, center vortices were detected in SU(2) LGT by imposing the direct maximal central
gauge (DMC) [4]. A simulated annealing procedure was used to locate true minima. According to
the conventional procedure, closed vortex surfaces were constructed from plaquettes which are
dual to negative projected plaquettes. In this work the same set of lattice configurations as in [4]
was used. The lattices used were of size $28^4$ for $\beta = 2.60$ and $\beta = 2.55$, $24^4$ for $\beta = 2.50$,
$\beta = 2.45$ and $\beta = 2.40$, and $16^4$ for $\beta = 2.35$. The lattice spacing was fixed by setting the value of
the QCD string tension to $\sqrt{\sigma} = 440$ MeV.
Local geometry of vortex surfaces was characterized by the two simplest local invariants:
internal and extrinsic curvatures. Internal curvature in lattice units for a hypercubic lattice was
defined as $a^2 R_s = 4 - n_s$, where $n_s$ is the number of neighbors of the site s and a is the lattice
spacing [8]. Extrinsic curvature for smooth surfaces can be written as $K = \Delta x^\mu \Delta x^\mu$, where
$\Delta f = \frac{1}{\sqrt{g}} \partial_a (\sqrt{g}\, g^{ab} \partial_b f)$ is the two-dimensional Laplacian on the surface and $g_{ab} = \partial_a x^\mu \partial_b x^\mu$ is
the induced metric on the surface. In order to define extrinsic curvature on a hypercubic
lattice, the two-dimensional lattice Laplacian on the surface was defined as $a^2 \Delta f_s = n_s f_s - \sum_{s'} f_{s'}$, where $s'$
are the sites adjacent to s. Extrinsic curvature in lattice units is then $a^2 K_s = \Delta x_s^\mu \Delta x_s^\mu$ [8]. The
shapes of vortex surfaces which correspond to the values of extrinsic curvature $a^2 K_s = 0, 1, 2, 3$
are shown in Fig. 4. The shape which corresponds to $a^2 K_s = 4$ is essentially four-dimensional
and is not shown. It can be thought of as a lattice site where four orthogonal plaquettes join.
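These lattice definitions are simple to evaluate once the neighbors of a site on the surface are known. The sketch below assumes that each neighbor is specified by its unit offset from the site, so that the lattice Laplacian of the coordinates reduces to minus the sum of the offsets; the function name and the example configurations are illustrative assumptions.

```python
import numpy as np

def site_curvatures(neighbour_offsets):
    """Return (a^2 R_s, a^2 K_s) for a surface site.

    neighbour_offsets : unit lattice vectors x_{s'} - x_s (in lattice units)
    pointing from the site to its neighbours on the surface.
    """
    e = np.asarray(neighbour_offsets, dtype=int)
    n_s = len(e)
    internal = 4 - n_s                      # a^2 R_s = 4 - n_s
    lap_x = -e.sum(axis=0)                  # n_s x_s - sum_{s'} x_{s'}
    extrinsic = int(lap_x @ lap_x)          # squared norm of the lattice Laplacian of x
    return internal, extrinsic

x, y, z, t = np.eye(4, dtype=int)           # unit vectors of the hypercubic lattice

flat = site_curvatures([x, -x, y, -y])      # site inside a flat sheet
fold = site_curvatures([x, -x, y, z])       # site on a right-angle fold along the x axis
corner = site_curvatures([x, y, z])         # corner of a cube, three plaquettes join
four = site_curvatures([x, y, z, t])        # four orthogonal plaquettes join
print(flat, fold, corner, four)
```

For the cube corner this gives $a^2 K_s = 3$ and for four orthogonal links $a^2 K_s = 4$, in line with the description in the text.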
Fig. 2. Average excess of action per site as the function of extrinsic curvature and lattice spacing.
Fig. 3. Average excess of action per site as the function of internal curvature and lattice spacing.
The excess of action associated with some lattice site on the vortex was defined by averaging
the excess of action over all lattice plaquettes which are dual to the vortex plaquettes which
contain this site. The excess of action on a plaquette was defined as $S_p = \beta (1 - \frac{1}{2} \mathrm{Tr}\, U_p) - \beta \langle 1 - \frac{1}{2} \mathrm{Tr}\, U_p \rangle = \frac{\beta}{2} (\langle \mathrm{Tr}\, U_p \rangle - \mathrm{Tr}\, U_p)$. The average excess of action per site as a function of
lattice spacing and extrinsic and internal curvatures (in lattice units) is plotted in Figs. 2 and 3,
respectively.
It can be seen that the excess of action increases with both extrinsic and internal curvature,
therefore the action should in general depend on both internal and extrinsic curvatures. Standard
dimensional analysis shows that only the terms linear in extrinsic and internal curvature are
relevant in the continuum limit. It should also be noted that because center vortices are defined
on a hypercubic lattice, typical values of curvature diverge as $a^{-2}$ in the continuum limit ($a^2 K_s$
and $a^2 R_s$ are finite integer numbers, therefore $K_s \sim a^{-2}$ and $R_s \sim a^{-2}$). A simple estimation
shows that the number of points on the surface with $K \neq 0$ or $R \neq 0$ scales to zero as $a^2$, which
ensures the smoothness of surfaces and the finiteness of the total contribution of bent plaquettes
to the physical action. Finite values of $K_s$ or $R_s$ for physical surfaces in the continuum limit can
in principle be defined by averaging the curvature over physically small regions, whose area is
nevertheless very large in lattice units.
A peculiar feature of the dependence of the action excess on extrinsic curvature is the distinct
peak near $a^2 K_s = 3$. Such a value of extrinsic curvature corresponds to lattice sites where three
plaquettes join (see Fig. 4). It was found that when this peak is neglected, the resulting function
appears to be almost linear in $a^2 K_s$ (Fig. 5). Presumably the peak at $a^2 K_s = 3$ is a lattice artifact
which corresponds to elementary lattice cubes. In order to obtain better fits the excess of action
at $a^2 K_s = 3$ was replaced by the average between $a^2 K_s = 2$ and $a^2 K_s = 4$ (solid line in Fig. 5),
which yields an almost linear dependence. The bare string tension in lattice units and the coefficient before extrinsic curvature were obtained as the intercept and the slope of the linear fits of the data
Fig. 4. Shapes of vortex surfaces which correspond to the values of extrinsic curvature $a^2 K_s = 0, 1, 2, 3$. Lattice sites
with which these values of extrinsic curvature are associated are marked with solid circles.
Fig. 5. Excess of action per site as a function of extrinsic curvature at different lattice spacings (from a = 0.14 fm,
$\beta = 2.35$ for the uppermost curve to a = 0.06 fm, $\beta = 2.60$ for the lowest curve). Solid lines are plotted using the value
at $a^2 K_s = 3$ which was replaced by an average between $a^2 K_s = 2$ and $a^2 K_s = 4$.
in Fig. 5. One could in principle try to fit the excess of action per plaquette by some polynomial
in $a^2 K_s$, but in any case dimensional analysis shows that only the term linear in $a^2 K_s$ would survive
in the continuum limit. Nevertheless, the fitting method affects the numerical values of the coefficient before extrinsic curvature. In general, increasing the number of degrees of freedom in
the fitting functions changes the value and the uncertainty of this coefficient; however, these values
agree within the error range when extrapolated to the continuum limit. For instance, in order to check
the stability of the fits the data plotted in Fig. 5 were fitted by first- and second-order polynomials
in $a^2 K_s$. The values of the bare string tension $\sigma_0(a)$ (the constant term in the fits) obtained from both
fits agree with very good precision. At finite lattice spacings the discrepancies in the values of the coefficient before extrinsic curvature
are larger, but extrapolation to the continuum limit gives the consistent values $0.065 \pm 0.006$
(linear fit) and $0.048 \pm 0.014$ (quadratic fit). The coefficient before $(a^2 K_s)^2$ contains very
large errors and is close to zero. This coefficient can only be important in the continuum limit if
it contains divergences of order $a^{-2}$, which is not likely.
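The fitting procedure described here (replace the point at $a^2 K_s = 3$ by the average of its neighbours, then compare first- and second-order polynomial fits) can be sketched as follows. The data values are synthetic and purely hypothetical, chosen so that the underlying dependence is exactly linear with an artificial bump at $a^2 K_s = 3$; the function names are likewise assumptions.

```python
import numpy as np

def smooth_peak(excess):
    """Replace the value at a^2 K_s = 3 by the mean of the values at 2 and 4."""
    v = np.array(excess, dtype=float)
    v[3] = 0.5 * (v[2] + v[4])
    return v

def fit_action_excess(k, excess, order=1):
    """Polynomial fit in a^2 K_s; return (constant term, linear coefficient)."""
    coeffs = np.polyfit(k, excess, order)    # highest power first
    return coeffs[-1], coeffs[-2]

k = np.arange(5.0)                            # a^2 K_s = 0, 1, 2, 3, 4
raw = 0.20 + 0.07 * k                         # hypothetical linear dependence
raw[3] += 0.10                                # artificial peak at a^2 K_s = 3

data = smooth_peak(raw)
intercept_lin, slope_lin = fit_action_excess(k, data, order=1)
intercept_quad, slope_quad = fit_action_excess(k, data, order=2)
print(intercept_lin, slope_lin)
print(intercept_quad, slope_quad)
```

In this noise-free toy case both fits return the same constant term and slope; with real data the two fits differ at finite lattice spacing, as discussed in the text.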
Fig. 6. Bare string tension in lattice units $\sigma_0(a)$.
In order to extract the term linear in internal curvature from the data plotted in Fig. 3 the
excess of action was fitted by a third-order polynomial in $a^2 R_s = 4 - n_s$. The coefficient before $a^2 R_s$ in this polynomial was assumed to be the coefficient before internal curvature in the
physical action.
Finally, after omitting all terms which become irrelevant in the continuum limit, for sufficiently small lattice spacings a one can write the action associated with center vortices in the
following form:

W [S] = d 2 g 0 (a)2UV + (a)R + (a)K ,
(1)
S
where $\sqrt{g} = \sqrt{\det g_{ab}}$ is the invariant element of area on the surface, $g_{ab} = \partial_a X^\mu \partial_b X^\mu$ is the
induced metric on the surface and $\Lambda_{UV} = a^{-1}$ is the UV cutoff scale of the theory.
The coefficients 0 (a), (a) and (a) as the functions of lattice spacing are plotted on Figs. 6,
8 and 7, respectively. Extrapolation to continuum limit gives the following values:
0 (0) = 0.192 0.006,
(0) = 0.066 0.003,
(0) = 0.08 0.02.
(2)
Thus the coefficients before internal and extrinsic curvature are finite in the continuum limit and therefore the dependence on
internal and extrinsic curvatures is physical. The quadratic divergence in the bare string tension is
crucial for the existence of smooth physical surfaces, as explained above, and should be compensated by a similar divergence in the entropy of random surfaces. It is interesting to note that the
value of the bare string tension in lattice units in Eq. (2), which is obtained after taking curvature into account, is smaller than the value of the action excess $S_p = 0.540 \pm 0.004$ obtained in [4]. The fact
that after proper redistribution of the action excess among operators with appropriate dimensions
the string tension is strongly decreased indicates that the terms with extrinsic curvature play an
important role in the dynamics of center vortices.
The action (1) corresponds to the model of rigid strings [8-10]. While the surface entropy is
canceled by the divergent bare string tension, the branched polymer problem is circumvented due
to the third term. It is interesting to note that as a consequence of the Gauss-Bonnet identity
$\int_S d^2\xi \sqrt{g}\, R = 2\pi\chi[S]$ ($\chi[S]$ is the Euler characteristic of the surface), the coefficient before
internal curvature is proportional to the string coupling constant, which is therefore also finite in
the continuum limit.
Fig. 7. Coupling constant before extrinsic curvature $\gamma(a)$.
Fig. 8. Coupling constant before internal curvature $\kappa(a)$.
3. Fields localized on center vortices

From the point of view of field theory the quadratically divergent term $\int_S d^2\xi \sqrt{g}\, \sigma_0(a)\, \Lambda_{UV}^2$ in
the vortex effective action indicates that some fields are localized on center vortices and become effectively two-dimensional. Quadratic divergence in the effective action (1) corresponds
to vacuum oscillations of these two-dimensional fields [23,24]. Localization on center vortices
was also directly observed in lattice simulations; for instance, it was found that in the maximal
Abelian gauge almost all trajectories of Abelian monopoles belong to center vortices [21]. There
also exist classical string solutions where monopoles are localized on string worldsheets [25,26].
In [27,28] it was shown numerically that eigenfunctions of covariant four-dimensional Laplace
and Dirac operators are also localized near center vortices.
If only two-dimensional fields and their interactions are responsible for the excess of action
on the vortices, total excess of action on the vortex W [S] should be treated as the effective string
action Weff [S] obtained by integrating over the fields at fixed geometry:

$$\exp\left(-W_{\mathrm{eff}}[S]\right) = \int \mathcal{D}\phi\, \exp\left(-\int_S d^2\xi \sqrt{g}\, L[\phi]\right), \tag{3}$$
where $\mathcal{D}\phi$ is the covariantly defined path integral measure and $L[\phi]$ is the Lagrangian of
the field $\phi$. Such an effective action will necessarily contain a quadratic divergence of the form
$\int_S d^2\xi \sqrt{g}\, \Lambda_{UV}^2$ due to vacuum oscillations of these two-dimensional fields plus some geometry-dependent terms. For instance, if the fields are four-dimensional Dirac fermions, the effective
action (3) depends on the extrinsic curvature of the surface [8,14,15].
In order to check whether some two-dimensional fields propagate along the vortex or not, one
can consider the points on the vortex which are very distant in terms of internal geometry on the
vortex but are very close in four-dimensional space. If the relevant fields propagate only along the
vortex, correlation between excess of action in such points should be much less than between plaquettes separated by the same distance along the vortex. In order to check this numerically center
vortices were represented as graphs, each node of the graph corresponding to some lattice plaquette. The correlation between action densities was measured for the points which are separated
by only one lattice spacing in four-dimensional space but no less than 6 spacings along the vortex ($d_4 < 2$, $d_2 \geq 6$). The standard breadth-first search algorithm for unweighted graphs was used
to measure the distances on vortex. In order to reduce anisotropy the sites which belong to the
vortex were linked not only along lattice links, but also along the diagonals of plaquettes. As the
distance measured was used only for lower-bound estimates of distances, there was no necessity
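to use much more time-consuming search algorithms for weighted graphs such as Dijkstra's algorithm. For comparison the correlation between neighboring vortex plaquettes ($d_4 < 2$, $d_2 < 2$)
was also measured. Correlation between plaquette variables $\mathrm{Tr}\, U_p$ and $\mathrm{Tr}\, U_{p'}$ was characterized
by the correlation coefficient $\rho[\mathrm{Tr}\, U_p, \mathrm{Tr}\, U_{p'}]$:
$$\rho[\mathrm{Tr}\, U_p, \mathrm{Tr}\, U_{p'}] = \frac{\langle \mathrm{Tr}\, U_p\, \mathrm{Tr}\, U_{p'} \rangle - \langle \mathrm{Tr}\, U_p \rangle^2}{\langle (\mathrm{Tr}\, U_p)^2 \rangle - \langle \mathrm{Tr}\, U_p \rangle^2}. \tag{4}$$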
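The selection of plaquette pairs described above can be illustrated by a minimal Python sketch. It is not the code used for the measurements: the function names, the dictionary-based graph representation and the toy "hairpin" chain of plaquettes are assumptions made for illustration only, and the toy graph links only nearest neighbours along the chain. The sketch measures distances along the vortex by breadth-first search, keeps the pairs with $d_4 < 2$ and $d_2 \geq 6$, and evaluates a correlation coefficient of the form (4) from a list of paired samples.

```python
import math
from collections import deque


def bfs_distances(adjacency, source):
    """Number of graph steps from `source` to every reachable node (unweighted BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def select_pairs(positions, adjacency, d4_max=2.0, d2_min=6):
    """Pairs of nodes close in four-dimensional space but distant along the graph."""
    pairs = []
    nodes = sorted(positions)
    for a in nodes:
        along_vortex = bfs_distances(adjacency, a)
        for b in nodes:
            if b <= a:
                continue
            d4 = math.dist(positions[a], positions[b])  # Euclidean distance in 4D
            d2 = along_vortex.get(b, math.inf)          # steps along the vortex graph
            if d4 < d4_max and d2 >= d2_min:
                pairs.append((a, b))
    return pairs


def correlation_coefficient(samples):
    """(<x y> - <x>^2) / (<x^2> - <x>^2) for a list of (x, y) samples."""
    n = len(samples)
    mean_xy = sum(x * y for x, y in samples) / n
    mean = sum(x + y for x, y in samples) / (2 * n)
    mean_sq = sum(x * x + y * y for x, y in samples) / (2 * n)
    return (mean_xy - mean ** 2) / (mean_sq - mean ** 2)


# Toy "hairpin": a chain of eight plaquettes that folds back on itself (hypothetical data).
positions = {
    0: (0, 0, 0, 0), 1: (1, 0, 0, 0), 2: (2, 0, 0, 0), 3: (3, 0, 0, 0),
    4: (3, 1, 0, 0), 5: (2, 1, 0, 0), 6: (1, 1, 0, 0), 7: (0, 1, 0, 0),
}
adjacency = {i: [j for j in (i - 1, i + 1) if j in positions] for i in positions}

pairs = select_pairs(positions, adjacency)
```

In this toy configuration only the pairs near the two ends of the hairpin survive the cuts. Because breadth-first search on an unweighted graph counts steps rather than lengths, the resulting $d_2$ is suitable only as a lower-bound estimate, as noted in the text.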
The results of these measurements are plotted on Fig. 9. It can be seen that the correlation
in four-dimensional space is notably smaller than along the surface of the vortex, and therefore
the fields which are responsible for the excess of action on vortices are more likely to propagate
along their surfaces. Unfortunately due to insufficient statistics it was not possible to measure the
correlation lengths which correspond to propagation along the vortex and in four-dimensional
space. The latter can be estimated as the inverse mass of the lowest glueball (1.5 GeV), which
is comparable with the inverse lattice spacing. A measurement which is somewhat similar to the
one described above was performed in [4], where the average excess of action on plaquettes very
close to center vortices was shown to be zero.
4. Abelian monopoles and center vortices
Up to now the only way to see directly the content of the conjectured two-dimensional field
theory is to impose maximal Abelian gauge and to look at the trajectories of Abelian monopoles
[20,23,24,29], which populate densely the surfaces of center vortices. In real simulations about
95% of monopole trajectories belong to center vortices [20,21]. It is natural to conjecture that
two-dimensional field theory living on vortex surfaces should describe these monopoles upon
first quantization.
Fig. 9. Correlation between neighbouring plaquettes with $d_4 < 2$, $d_2 \geq 6$ and $d_4 < 2$, $d_2 < 2$.
Another nontrivial fact which supports the statement about the role of monopoles in the dynamics of center vortices is the dependence of the bare string tension in lattice units on the lattice
spacing (see Fig. 6), which can be approximated by a linear function $\sigma_0(a) = A + Ba$ with
good precision. Besides the quadratic divergence, this also yields an $a^{-1}$ UV divergence in the
bare string tension in physical units: $a^{-2} \sigma_0(a) = a^{-2} A + a^{-1} B$, where $A$ is given by (2) and
$B = 2.2 \pm 0.1\ \mathrm{fm}^{-1}$. Such a divergence usually corresponds to the self-energy of one-dimensional
objects [4,8,23,24]. If the density of one-dimensional objects per unit of vortex area $\rho_{1D}$ scales
in physical units and is constant in the continuum limit, while the bare mass of these objects
is UV divergent and is close to the critical value $m_{bare} = a^{-1} \ln 4$ [8], for $\rho_{1D}$ one then obtains the following estimate: $\rho_{1D} = B / \ln 4 = 1.5(8)\ \mathrm{fm}^{-1}$. Taking into account that the density
of vortices is $\rho_v = 24\ \mathrm{fm}^{-2}$, it is easy to estimate the density of one-dimensional objects in
four-dimensional space as $\rho_{1D}\, \rho_v = 37.(9)\ \mathrm{fm}^{-3}$, which is in good agreement with the density
of percolating Abelian monopoles $\rho_m = 31.(1)\ \mathrm{fm}^{-3}$ [29]. The difference may be explained by
incomplete detection of Abelian monopoles and by the curved geometry of vortex surfaces.
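The arithmetic behind this estimate can be retraced with a short Python sketch. The numerical inputs are those quoted in the text; the variable names and the simple linear propagation of the uncertainty of the slope are assumptions of this illustration and not part of the original analysis.

```python
import math

# Inputs quoted in the text.
B = 2.2                # fm^-1, slope of the linear fit sigma_0(a) = A + B a
B_ERR = 0.1            # fm^-1, quoted uncertainty of the slope
RHO_V = 24.0           # fm^-2, density of vortices
M_CRIT = math.log(4)   # critical bare mass in lattice units (m_bare * a = ln 4)

# Matching the a^-1 divergence a^-1 * B to rho_1D * m_bare gives rho_1D = B / ln 4.
rho_1d = B / M_CRIT
rho_1d_err = B_ERR / M_CRIT

# Density in four-dimensional space: objects per vortex area times vortex area per volume.
rho_4d = rho_1d * RHO_V
rho_4d_err = rho_1d_err * RHO_V

print(f"rho_1D         = {rho_1d:.2f} +/- {rho_1d_err:.2f} fm^-1")
print(f"rho_1D * rho_v = {rho_4d:.1f} +/- {rho_4d_err:.1f} fm^-3")
```

Without intermediate rounding the product comes out close to 38 fm$^{-3}$, which agrees with the quoted 37.(9) fm$^{-3}$ within the rounding of $\rho_{1D}$.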
Geometric properties of monopole trajectories in SU(2) LGT were studied in [29]. It was
found that the properties of monopole trajectories at hadronic scale are not described by random walks with $d_H = 2$, as in the case of free scalar particles, but rather by smooth random
walks. Namely, the correlation between the directions of tangent vectors to monopole paths is
characterized by a correlation length $l_c^{-1} \approx 300$ MeV, which remains finite in the continuum
limit. Smooth random walks are in the same universality class as the trajectories of spinning
particles [8], therefore the fields associated with monopoles living on center vortices are more
likely to have nonzero spin. The simplest physical model which leads to smooth random walks
is the propagation of massive Dirac fermions in Euclidean space [8,30]. If the monopoles are
assumed to be Dirac fermions, the mass of the fermions can be roughly estimated as 1.5 GeV
from the measurements of monopole current correlators in [29]. A general smooth random walk
corresponds to a random walk of the tangent vector on the three-sphere $S^3$ and therefore includes
components of all spins, but it is not clear how such model can be incorporated into the models of random surfaces. On the other hand, random surfaces with four-dimensional fermions are
well studied. It is known that the effective action of fermionic strings includes rigidity terms
[14,15,20]. As in the model of fermionic strings worldsheet fermions are massless Dirac fermions, it is natural to conjecture that the effective action induced by massive worldsheet fermions
also induces string rigidity. In general, one can expect that for massive fermions the effective
action, besides the local terms of the form (1), should also contain nonlocal terms of the form
$\int d^2\xi_1 \sqrt{g(\xi_1)} \int d^2\xi_2 \sqrt{g(\xi_2)}\, K(\xi_1)\, \Delta^{-1}(\xi_1, \xi_2)\, K(\xi_2)$, where $\Delta^{-1}(\xi_1, \xi_2)$ is the kernel of the inverse Laplace operator on the surface of the vortex. Unfortunately, at presently available lattices
it is very difficult to trace such terms in the effective action.
The effective Lagrangian governing the dynamics of monopoles was obtained in [22] using the
inverse Monte Carlo method. It was found that the effective monopole action, besides the usual
kinetic term, contains Yukawa interaction with physical mass as well as four-point and six-point
interactions. Higher-order interaction terms were found to be very small. Taking these results
together, a reasonable conjecture is that at hadronic scale monopoles behave as massive Dirac
fermions living on center vortices. Yukawa-type interaction may be induced if these fermions are
coupled to some four-dimensional massive field. Coupling to massive four-dimensional fields
also leads to rigidity terms in the effective action [11,16,17], and thus such interaction can be
also absorbed in the effective action (1).
5. Conclusions
In this paper the relation between local vortex geometry and the action density was studied.
Direct measurements of action imply that a reasonable approximation for the effective action of
center vortices is the action of rigid strings [9,10]. This action can reproduce observed smoothness of vortex surfaces and presumably the physical scaling of vortex area. The latter possibility
depends crucially on the form of the nonperturbative $\beta$-function of the model of rigid strings and at
present can only be checked numerically [12]. Unfortunately the values of the parameters
$\sigma_0$, $\kappa$ and $\gamma$ (2), obtained by extrapolation to the continuum limit, cannot be compared with the
corresponding critical values obtained from independent simulations, because most numerical
investigations of the model of rigid strings dealt with three-dimensional case [12,13]. As the
existence of the continuum limit of the model of rigid strings has not been proven exactly, it is
not completely clear whether the effective vortex action (1) can be used at all values of lattice
spacing.
It turns out that a large fraction of the action associated with center vortices comes from rigidity term, therefore the dependence of action on local vortex geometry should be crucial for the dynamics of vortices. In [4] it was conjectured that this dependence arises due to Abelian monopoles
localized on vortices. The considerations of Section 4 support this conjecture, although this problem requires more accurate analytic treatment. For instance, it could be extremely interesting to
construct two-dimensional field theory which describes monopoles localized on center vortices.
Presumably such theory should be fermionic. In fact the action of rigid strings arises naturally as
a low-energy effective action for vortices of finite thickness [11,16–18]; however, recent lattice
measurements [4] indicate that center vortices in SU(2) LGT are genuinely thin. Fermionic fields
localized on vortex worldsheets provide a natural explanation of the rigidity of such physically
thin vortices.
An important property of center vortices which is not captured by the action (1) is the
existence of a single percolating vortex. In the case of random walks percolating trajectory corresponds to condensate which emerges due to tachyonic instability of perturbative vacuum. In
order to describe condensation one should use the concepts of Euclidean interacting quantum
field theory instead of random walks which describe the states of only one particle. Condensate
then corresponds to nonzero background field, as in the Higgs model (for a related discussion see,
e.g., [23,24]). It is not clear whether this picture remains valid for the theory of random surfaces,
because the required nonperturbative apparatus of string field theory is barely developed.
Acknowledgements
The authors are grateful to E.T. Akhmedov, F.V. Gubarev, A.S. Gorsky, and especially to V.I.
Zakharov, for illuminating discussions and critical remarks and to F.V. Gubarev, A.V. Kovalenko
and P.Yu. Boyko for lattice configurations and source codes. P.V. Buividovich is grateful to all
members of the ITEP lattice group for their kind hospitality. M.I. Polikarpov was partially supported by grants RFBR-05-02-16306a, RFBR-0402-16079a, RFBR-0602-04010-NNIOa and EU
Integrated Infrastructure Initiative Hadron Physics (I3HP) under contract RII3-CT-2004-506078.
References
[1] G.S. Bali, K. Schilling, C. Schlichter, Phys. Rev. D 51 (1995) 5165, hep-lat/9409005.
[2] P.Y. Boyko, F.V. Gubarev, S.M. Morozov, arXiv: 0704.1203 [hep-lat].
[3] L. Del Debbio, M. Faber, J. Giedt, J. Greensite, S. Olejnik, Phys. Rev. D 58 (1998) 094501, hep-lat/9801027.
[4] F.V. Gubarev, A.V. Kovalenko, M.I. Polikarpov, S.N. Syritsyn, V.I. Zakharov, Phys. Lett. B 574 (2003) 136, hep-lat/0212003.
[5] P.V. Buividovich, M.I. Polikarpov, arXiv: 0704.3367 [hep-ph].
[6] V.G. Bornyakov, D.A. Komarov, M.I. Polikarpov, Phys. Lett. B 497 (2001) 151, hep-lat/0009035.
[7] P.Y. Boyko, V.G. Bornyakov, E. Ilgenfritz, A.V. Kovalenko, B.V. Martemyanov, M. Müller-Preussker, M.I. Polikarpov, A.I. Veselov, Nucl. Phys. B 756 (2006) 71, hep-lat/0607003.
[8] J. Ambjørn, hep-th/9411179.
[9] A.M. Polyakov, Nucl. Phys. B 268 (1986) 406.
[10] H. Kleinert, Phys. Lett. B 174 (1986) 335.
[11] H. Kleinert, Phys. Lett. B 211 (1988) 151.
[12] J. Ambjørn, A. Irbäck, J. Jurkiewicz, B. Petersson, Nucl. Phys. B 393 (1993) 571, hep-lat/9207008.
[13] H. Koibuchi, T. Kuwahata, Phys. Rev. E 72 (2005) 026124, cond-mat/0506787.
[14] A.R. Kavalov, I.K. Kostov, A.G. Sedrakyan, Phys. Lett. B 175 (1986) 331.
[15] P.B. Wiegmann, Nucl. Phys. B 323 (1989) 330.
[16] E.T. Akhmedov, M.N. Chernodub, M.I. Polikarpov, M.A. Zubkov, Phys. Rev. D 53 (1996) 2087, hep-th/9505070.
[17] P. Orland, Nucl. Phys. B 428 (1994) 221, hep-th/9404140.
[18] M. Engelhardt, H. Reinhardt, Nucl. Phys. B 567 (2000) 249, hep-th/9907139.
[19] M. Engelhardt, H. Reinhardt, Nucl. Phys. B 585 (2000) 591, hep-lat/9912003.
[20] J. Ambjørn, J. Giedt, J. Greensite, JHEP 0002 (2000) 033, hep-lat/9907021.
[21] A.V. Kovalenko, M.I. Polikarpov, S.N. Syritsyn, V.I. Zakharov, Phys. Rev. D 71 (2005) 054511, hep-lat/0402017.
[22] S. Kato, N. Nakamura, T. Suzuki, S. Kitahara, Nucl. Phys. B 520 (1998) 323.
[23] V.I. Zakharov, From confining fields on the lattice to higher dimensions in the continuum, in: Lectures given at Infrared QCD in Rio: Propagators, Condensates and Topological Effects (IRQCD, 2006), Rio de Janeiro, Brazil, 2006, hep-ph/0612342.
[24] V.I. Zakharov, hep-ph/0309178.
[25] A. Gorsky, M. Shifman, A. Yung, Phys. Rev. D 71 (2005) 045010, hep-th/0412082.
[26] M.N. Chernodub, R. Feldmann, E. Ilgenfritz, A. Schiller, Phys. Rev. D 71 (2005) 074502, hep-lat/0502009.
[27] J. Greensite, A.V. Kovalenko, S. Olejnik, M.I. Polikarpov, S.N. Syritsyn, V.I. Zakharov, Phys. Rev. D 74 (2006) 094507, hep-lat/0606008.
[28] A.V. Kovalenko, S.M. Morozov, M.I. Polikarpov, V.I. Zakharov, Phys. Lett. B 648 (2007) 383, hep-lat/0512036.
[29] V.G. Bornyakov, P.Y. Boyko, M.I. Polikarpov, V.I. Zakharov, Nucl. Phys. B 672 (2003) 222, hep-lat/0305021.
[30] M.S. Plyushchay, Phys. Lett. B 235 (1990) 47.
Resonant CP violation due to heavy neutrinos at the LHC
Simon Bray a, Jae Sik Lee b,*, Apostolos Pilaftsis a
a School of Physics and Astronomy, University of Manchester, Manchester M13 9PL, United Kingdom
b Center for Theoretical Physics, School of Physics, Seoul National University, Seoul 151-747, South Korea
Received 6 March 2007; received in revised form 29 June 2007; accepted 4 July 2007
Abstract
The observed light neutrinos may be related to the existence of new heavy neutrinos in the spectrum
of the SM. If a pair of heavy neutrinos has nearly degenerate masses, then CP violation from the interference between tree-level and self-energy graphs can be resonantly enhanced. We explore the possibility
of observing CP asymmetries due to this mechanism at the LHC. We consider a pair of heavy neutrinos
$N_{1,2}$ with masses in the range 100–500 GeV and a mass-splitting $\Delta m_N = m_{N_2} - m_{N_1}$ comparable to
their widths $\Gamma_{N_{1,2}}$. We find that for $\Delta m_N \sim \Gamma_{N_{1,2}}$, the resulting CP asymmetries can be very large or even
maximal and could therefore potentially be observed at the LHC.
1. Introduction
The observation of neutrino oscillations has established that the observed neutrinos are not
massless and so the Standard Model (SM) must be extended in order to accommodate these [1].
In order to explain why the neutrinos are so much lighter than any of the other fermions, it is
common to postulate that in addition to the three observed light neutrinos, there also exist partner
heavy neutrinos. In order to avoid very stringent constraints due to their non-observation at the
Large Electron–Positron (LEP) collider, these must have masses greater than about 100 GeV [2].
However, if they exist with masses greater than this, but less than about 500 GeV, they fall into
the category of particles that could be produced for the first time at the Large Hadron Collider
(LHC) [3–6]. Complementary to such a search, heavy neutrinos could also be observed at a future
linear collider. Several studies have been conducted into their signals at the International Linear
Collider (ILC) [7,8] as well as possible alternatives such as an $e^-\gamma$ collider [9].
E-mail address: jslee@muon.kaist.ac.kr (J.S. Lee).
doi:10.1016/j.nuclphysb.2007.07.002
S. Bray et al. / Nuclear Physics B 786 (2007) 95–118
If neutrinos with such masses do exist, then in general they should break lepton-flavour conservation and, if they are Majorana, lepton-number (L) conservation as well. Furthermore, since
their couplings can be complex, they can also contribute to CP violation. A scenario of particular
interest is if two or more of the heavy neutrinos are quasi-degenerate in mass [10]. In this case,
CP violation can be resonantly enhanced such that for appropriate couplings, CP asymmetries
can be large or even maximal [11]. By introducing flavour symmetries in the singlet neutrino
sector, models can be built where quasi-degenerate heavy neutrinos with masses of order the
electroweak scale appear naturally [12].1 These mostly focus on models of resonant leptogenesis, which can be used to explain the Baryon Asymmetry of the Universe (BAU). Observation
of heavy neutrinos at the LHC, and in particular, measurements of CP asymmetries due to them
would be a way of testing such models.
Since it is not known at this time whether neutrinos are Dirac or Majorana particles, both
possibilities need to be considered. For the former, the collider signatures at the LHC are lepton-flavour-violating (LFV) processes. For the latter, in addition to these, lepton-number-violating
(LNV) processes could also be observed. Either of these types of processes should be virtually
background free since they are forbidden in the SM which is both lepton-number-conserving
(LNC) and lepton-flavour-conserving (LFC). Higher order processes with light neutrinos in the
final state could in principle fake the signals, but these can be excluded by suitable kinematical
cuts, e.g., by vetoing on missing transverse momentum [5,6].
CP violation could show up in asymmetries between possible signal final states and their CP-conjugates. Although the initial $pp$ state is not a CP-eigenstate, true CP-violating observables
can be constructed, either by taking into account the theoretically calculable difference expected
due to the Parton Distribution Functions (PDFs) [14,15], or by considering appropriate ratios of
different processes such that this factor drops out.
Observation of heavy neutrinos at the LHC would be a major discovery. Less direct evidence
could come from them contributing to LFV decays, e.g., $\mu \to e\gamma$, $\mu \to e$ conversion in nuclei,
or (if Majorana) neutrinoless double beta decay. The non-observation of such processes, along
with the excellent agreement of electroweak data to the SM, places limits on the strength of their
couplings.
This paper is organised as follows: In Section 2, we describe extensions of the SM which
include heavy neutrinos and discuss the experimental constraints on them. Two specific models
are considered, one predicting heavy Majorana neutrinos, the other Dirac. Section 3 is a short
discussion of the signatures of such particles at the LHC, in particular, we classify them according
to whether or not they are LNV. Next, in Section 4, we present the formalism for describing the
propagator of a system of two coupled quasi-degenerate heavy neutrinos. This is based on the
field-theoretic resummation approach developed in [16] for describing resonant transitions
involving the mixing of intermediate fermionic states.
Assuming the signals are due entirely to two nearly degenerate heavy (Dirac or Majorana)
neutrinos, the $2 \times 2$ propagator matrices for such a pair are then used in Section 5 to derive
expressions for the CP asymmetries between them. We give example scenarios where these
asymmetries are large and discuss their compatibility with both experimental and theoretical
constraints. In Section 6, we use these scenarios in order to produce numerical estimates of
the possible level of leptonic CP violation observable at the LHC. We define a number of CP-violating observables and plot these, along with the cross sections they are derived from. Finally,
Section 7 contains our conclusions.
1 For recent studies within supersymmetric models, see [13].
2. Heavy neutrino extensions of the Standard Model
2.1. Heavy Majorana neutrino model
We first describe the SM, minimally extended to include right-handed neutrinos. Assuming
the Higgs sector is not extended, the Lagrangian describing the neutrino masses and mixings
reads

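$$\mathcal{L}_m = -\frac{1}{2} \left( \bar{\nu}_L^0,\ (\bar{\nu}_R^0)^c \right) \begin{pmatrix} 0 & m_D \\ m_D^T & m_M \end{pmatrix} \begin{pmatrix} (\nu_L^0)^c \\ \nu_R^0 \end{pmatrix} + \mathrm{h.c.}, \tag{2.1}$$
where $\nu_L^0$ and $\nu_R^0$ denote column vectors of the left- and right-handed neutrino fields in the weak
basis, and the notation $\psi^c \equiv C \bar{\psi}^T$ represents the charge conjugate fields. Although it is commonly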
assumed that there is one right-handed neutrino per generation (as required in SO(10) Grand
Unified Theories (GUTs) [17]), this need not be so in the bottom-up approach considered
here. In fact, as will be shown in Section 5, in the context of searches for CP violation, it is
phenomenologically more interesting if the model contains at least four right-handed states. In
order to maintain generality, we will consider adding nR right-handed states, where nR can be
any positive integer. The elements of the complex matrices mD and mM give rise to Dirac and
Majorana mass terms for the neutrinos, respectively. The only constraint on their structure is that
mM must be symmetric.
The Majorana mass-eigenstate neutrinos are related to the weak-eigenstates through

$$\begin{pmatrix} \nu_L \\ N_L \end{pmatrix} = U^T \begin{pmatrix} \nu_L^0 \\ (\nu_R^0)^c \end{pmatrix}. \tag{2.2}$$
The states represented by $\nu$ are the three observed light neutrinos, whereas $N$ represents extra
heavy neutrinos (of which there will be as many as right-handed weak-eigenstates). $U$ is a $(3 +
n_R) \times (3 + n_R)$ unitary matrix chosen such that
$$U^T \begin{pmatrix} 0 & m_D \\ m_D^T & m_M \end{pmatrix} U = \mathrm{diag}(m_1, \ldots, m_{3+n_R}), \tag{2.3}$$
where $m_1, \ldots, m_{3+n_R}$ are the physical neutrino masses.
Since mD is derived from the Higgs mechanism, it is most natural to assume that its elements
should be of order the vacuum expectation value of the Higgs field. By contrast, mM is unrelated
to any SM observables and so could be as large as the GUT scale. This observation leads to
the popular seesaw mechanism by which the extreme smallness of the light neutrino masses
is explained through the large hierarchy in these scales [18]. For a recent discussion within
the context of GUT neutrino models, see [19]. Unfortunately, generic seesaw scenarios are not
phenomenologically very interesting since the heavy neutrinos are predicted to be extremely
heavy (of order the GUT scale) and also have their couplings to SM particles highly suppressed.
More interesting scenarios for collider physics can be formed by introducing approximate flavour
symmetries that impose structure on the mass matrices mD and mM [3,12,20,21]. This can then
allow the heavy neutrino couplings to be completely independent of the light neutrino masses. In
such theories it is possible to have heavy neutrinos with masses of order 100 GeV and significant
couplings to SM particles, without being in contradiction with light neutrino data.
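The tension between light heavy neutrinos and sizeable couplings in a generic seesaw can be made concrete with a one-generation toy calculation. The Python sketch below is an illustration only: it assumes real, positive mass parameters, one light and one heavy state, and hypothetical input values, none of which are taken from the models considered in this paper.

```python
import math


def seesaw_masses(m_d, m_m):
    """Absolute eigenvalues of [[0, m_d], [m_d, m_m]] for real, positive inputs (GeV)."""
    heavy = 0.5 * (m_m + math.sqrt(m_m ** 2 + 4.0 * m_d ** 2))
    # The product of the two absolute eigenvalues is m_d^2; dividing avoids
    # the catastrophic cancellation in (sqrt(...) - m_m) / 2.
    light = m_d ** 2 / heavy
    return light, heavy


def mixing_angle(m_d, m_m):
    """Light-heavy mixing angle, defined by tan(2 theta) = 2 m_d / m_m."""
    return 0.5 * math.atan2(2.0 * m_d, m_m)


# Generic seesaw: Dirac mass at the electroweak scale, Majorana mass at a GUT-like scale.
light_gut, heavy_gut = seesaw_masses(m_d=100.0, m_m=1.0e15)
theta_gut = mixing_angle(100.0, 1.0e15)

# Heavy neutrino at 100 GeV: a light mass of 0.1 eV = 1e-10 GeV requires a tiny Dirac mass.
light_ew, heavy_ew = seesaw_masses(m_d=1.0e-4, m_m=100.0)
theta_ew = mixing_angle(1.0e-4, 100.0)
```

In both toy cases the mixing angle, which controls the coupling of the heavy state to SM particles, is far too small for collider production. This is the sense in which additional structure in $m_D$ and $m_M$ is needed to decouple the heavy neutrino couplings from the light neutrino masses.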
Writing the Lagrangian for neutrino interactions in terms of the mass-eigenstates gives [3]

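$$\mathcal{L}_{W^\mp} = -\frac{g}{\sqrt{2}}\, W^-_\mu\, \bar{l}_i \gamma^\mu P_L B_{ij} \begin{pmatrix} \nu \\ N \end{pmatrix}_j + \mathrm{h.c.}, \tag{2.4}$$
$$\mathcal{L}_{G^\mp} = -\frac{g}{\sqrt{2} M_W}\, G^-\, \bar{l}_i \left[ m_{l_i} P_L - m_j P_R \right] B_{ij} \begin{pmatrix} \nu \\ N \end{pmatrix}_j + \mathrm{h.c.}, \tag{2.5}$$
$$\mathcal{L}_Z = -\frac{g}{4 \cos\theta_w}\, Z_\mu\, \overline{(\nu\ N)}_i \gamma^\mu \left[ P_L C_{ij} - P_R C^*_{ij} \right] \begin{pmatrix} \nu \\ N \end{pmatrix}_j, \tag{2.6}$$
$$\mathcal{L}_H = -\frac{g}{4 M_W}\, H\, \overline{(\nu\ N)}_i \left[ (m_i P_L + m_j P_R) C_{ij} + (m_j P_L + m_i P_R) C^*_{ij} \right] \begin{pmatrix} \nu \\ N \end{pmatrix}_j, \tag{2.7}$$
$$\mathcal{L}_{G^0} = -\frac{ig}{4 M_W}\, G^0\, \overline{(\nu\ N)}_i \left[ (m_i P_L - m_j P_R) C_{ij} + (m_j P_L - m_i P_R) C^*_{ij} \right] \begin{pmatrix} \nu \\ N \end{pmatrix}_j. \tag{2.8}$$
The matrices $B$ and $C$ in the above are given by
$$B_{ij} = \sum_{k=1}^{3} V^L_{ki} U^*_{kj}, \qquad C_{ij} = \sum_{k=1}^{3} U_{ki} U^*_{kj} = \sum_{k=1}^{3+n_R} C_{ik} C^*_{jk}, \tag{2.9}$$
where $V^L$ is a $3 \times 3$ unitary matrix relating the weak to mass-eigenstates of the left-handed
charged leptons. Without loss of generality, we assume that there is no mixing in the charged
leptons, i.e., $V^L$ is the unit matrix. This allows $B$ to be written as $B_{li}$ with $l = e, \mu, \tau$ and so
$$C_{ij} = \sum_{l=1}^{3} B^*_{li} B_{lj}. \tag{2.10}$$
From (2.3), the neutrino couplings have to satisfy the constraints
$$\sum_{i=1}^{3+n_R} m_i B_{li} B_{l'i} = 0. \tag{2.11}$$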
These are important when it comes to considering the viability of coupling scenarios which give
rise to CP violation, as is done in Section 5.
2.2. Heavy Dirac neutrino model
An alternative model in which the heavy neutrinos are Dirac particles can be constructed by
adding left-handed singlets $S_L^0$ to the theory in addition to the right-handed neutrino fields $\nu_R^0$.
These have no couplings to SM particles and only enter the theory through their mixings with the
other neutrinos. For simplicity, we shall assume that the same number of right-handed neutrinos
and left-handed singlets are included. This model can be obtained as the low energy limit of
GUTs based on SO(10) [22] or E6 [20,23] gauge groups.
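To obtain a theory with Dirac neutrinos, $B - L$ conservation is imposed as a global symmetry.
The Lagrangian for neutrino masses is then given by
$$\mathcal{L}_{\mathrm{mass}} = -\frac{1}{2} \left( \bar{\nu}_L^0,\ (\bar{\nu}_R^0)^c,\ \bar{S}_L^0 \right) \begin{pmatrix} 0 & m_D & 0 \\ m_D^T & 0 & M^T \\ 0 & M & 0 \end{pmatrix} \begin{pmatrix} (\nu_L^0)^c \\ \nu_R^0 \\ (S_L^0)^c \end{pmatrix} + \mathrm{h.c.} \tag{2.12}$$
As in the previous model, $m_D$ and $M$ are complex matrices. This mass matrix can be diagonalised
through the rotations
$$\begin{pmatrix} \nu_L \\ S_L \end{pmatrix} = U_L^T \begin{pmatrix} \nu_L^0 \\ S_L^0 \end{pmatrix}, \qquad \nu_R = U_R\, \nu_R^0, \tag{2.13}$$
where $U_L$ is a $(3 + n_R) \times (3 + n_R)$ and $U_R$ an $n_R \times n_R$ unitary matrix. If these are chosen
appropriately, the Lagrangian given in (2.12) can then be expressed as
$$\mathcal{L}_{\mathrm{mass}} = -\frac{1}{2} \left( \bar{\nu}_L,\ (\bar{\nu}_R)^c,\ \bar{S}_L \right) \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & M_N \\ 0 & M_N & 0 \end{pmatrix} \begin{pmatrix} (\nu_L)^c \\ \nu_R \\ (S_L)^c \end{pmatrix} + \mathrm{h.c.}, \tag{2.14}$$
with $M_N$ diagonal. With the identifications $S_L \equiv N_L$ and $\nu_R \equiv N_R$, this is then a mass term for
three massless neutrinos $\nu$ (see footnote 2) and $n_R$ massive Dirac neutrinos $N$.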
The three weak-eigenstates $\nu_L^0$ in this theory are related to the mass-eigenstates through a
$(3 + n_R) \times (3 + n_R)$ unitary matrix, just as in the previous theory without singlets. Hence,
the Lagrangian for their interactions with $W^\pm$ and $G^\pm$ bosons is given by (2.4) and (2.5), just
with $U_L$ replacing $U$ in the definition of $B_{lj}$. However, since the neutrinos are Dirac particles,
the Lagrangian for their couplings to the $Z$, $H$ and $G^0$ bosons differs from the corresponding
one for Majorana particles [cf. (2.6)–(2.8)]. It only contains terms proportional to $C_{ij}$, not $C^*_{ij}$,
and is given by
$$\mathcal{L}_Z = -\frac{g}{2 \cos\theta_w}\, Z_\mu\, \overline{(\nu\ N)}_i \gamma^\mu P_L C_{ij} \begin{pmatrix} \nu \\ N \end{pmatrix}_j, \tag{2.15}$$
$$\mathcal{L}_H = -\frac{g}{2 M_W}\, H\, \overline{(\nu\ N)}_i (m_i P_L + m_j P_R) C_{ij} \begin{pmatrix} \nu \\ N \end{pmatrix}_j, \tag{2.16}$$
$$\mathcal{L}_{G^0} = -\frac{ig}{2 M_W}\, G^0\, \overline{(\nu\ N)}_i (m_i P_L - m_j P_R) C_{ij} \begin{pmatrix} \nu \\ N \end{pmatrix}_j, \tag{2.17}$$
where again $U_L$ replaces $U$ in the definition of $C_{ij}$.
Dirac neutrinos can be considered as the limit of two degenerate Majorana neutrinos, say $N_i$
and $N_j$, whose couplings are related through $B_{li} = iB_{lj}$. It is easy to see then that for these,
Eq. (2.11) is automatically satisfied and hence will not act as a constraint for Dirac neutrinos.
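The cancellation invoked here can be checked numerically. The following Python sketch evaluates the heavy-neutrino contribution to the left-hand side of Eq. (2.11) for a pair of states whose couplings obey $B_{li} = iB_{lj}$; the light neutrinos are massless in this model and therefore drop out of the sum. The function name and the coupling and mass values are hypothetical and chosen purely for illustration.

```python
def majorana_constraint(masses, couplings):
    """Sum over i of m_i * B_li * B_l'i for every flavour pair (l, l')."""
    flavours = sorted(couplings)
    return {
        (l, lp): sum(m * couplings[l][i] * couplings[lp][i] for i, m in enumerate(masses))
        for l in flavours
        for lp in flavours
    }


# Hypothetical couplings of one heavy state; its partner is fixed by B_li = i * B_lj.
b_e, b_mu = 0.05 + 0.01j, 0.02 - 0.03j
couplings = {"e": [1j * b_e, b_e], "mu": [1j * b_mu, b_mu]}

# Exactly degenerate pair (Dirac limit): every entry of the constraint vanishes.
degenerate = majorana_constraint([250.0, 250.0], couplings)

# Small mass splitting: the residual is proportional to the splitting.
split = majorana_constraint([250.0, 260.0], couplings)
```

For the degenerate pair each term $m B_{li} B_{l'i}$ is cancelled by $m (iB_{li})(iB_{l'i}) = -m B_{li} B_{l'i}$ from its partner, while a mass splitting leaves a residual equal to the splitting times $B_{lj} B_{l'j}$.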
2 Although this has already been ruled out, the model can be made compatible with experiment by adding small
Majorana mass terms for the singlets, e.g., $\bar{S}_L^0 (S_L^0)^c$ [23]. After diagonalisation, these translate to small Majorana
masses for the light neutrinos. However, this will have no effect on any collider observables, since these masses are tiny
compared to the energy scales involved.
2.3. Constraints on the heavy neutrino couplings

In both of the models described above, the weak-eigenstate neutrinos are related to the mass-eigenstates through
$$\nu^0_{Ll} = \sum_{i=1}^{3+n_R} B_{li} \begin{pmatrix} \nu_L \\ N_L \end{pmatrix}_i, \tag{2.18}$$
where $l = e, \mu, \tau$. To make the notation clearer, $B$ is split up into two parts relating to the light
and heavy states
$$\nu^0_{Ll} = \sum_{i=1}^{3} B_{l\nu_i} \nu_{Li} + \sum_{i=1}^{n_R} B_{lN_i} N_{Li}. \tag{2.19}$$
We can then define the parameter $\Omega_{ll'}$ as [8]
$$\Omega_{ll'} \equiv \delta_{ll'} - \sum_{i=1}^{3} B_{l\nu_i} B^*_{l'\nu_i} = \sum_{i=1}^{n_R} B_{lN_i} B^*_{l'N_i}, \tag{2.20}$$
which is a generalisation of the Langacker–London parameters $(s_L^{\nu_{e,\mu,\tau}})^2$ [24], with the identification $\Omega_{ll} = (s_L^{\nu_l})^2$. The $3 \times 3$ matrix $B_{l\nu}$ is, to a good approximation, the Pontecorvo–Maki–Nakagawa–Sakata
(PMNS) matrix [25], giving the mixing of three left-handed neutrinos. Any
deviation from unitarity of the PMNS matrix is given by $\Omega_{ll'}$, and would constitute evidence for
new physics, such as heavy neutrinos.
Constraints on $\Omega_{ll'}$ come from LEP and low-energy electroweak data [24,26–29]. Tree-level
processes with light neutrinos in the final state can be used as a probe by looking for a reduction
of the couplings of the light neutrinos from their SM values. A global analysis of such processes
gives the upper limits [28]
$$\Omega_{ee} \leq 0.012, \qquad \Omega_{\mu\mu} \leq 0.0096, \qquad \Omega_{\tau\tau} \leq 0.016, \tag{2.21}$$
at 90% confidence level. These are mostly model independent and depend only weakly on the
heavy neutrino masses.
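As an illustration of how a given coupling scenario can be confronted with these limits, the following Python sketch evaluates $\Omega_{ll'}$ from the heavy-neutrino couplings according to (2.20) and compares the diagonal entries with (2.21). The function names and the example couplings are hypothetical and do not correspond to any scenario discussed in this paper.

```python
# Upper limits on the diagonal entries from Eq. (2.21), 90% confidence level.
LIMITS = {"e": 0.012, "mu": 0.0096, "tau": 0.016}


def omega(b_heavy):
    """Omega_{ll'} = sum_i B_{l N_i} * conj(B_{l' N_i}) over the heavy states."""
    flavours = sorted(b_heavy)
    return {
        (l, lp): sum(
            complex(b) * complex(bp).conjugate()
            for b, bp in zip(b_heavy[l], b_heavy[lp])
        )
        for l in flavours
        for lp in flavours
    }


def violated_limits(om):
    """Flavours whose diagonal entry Omega_ll exceeds the corresponding limit."""
    return [l for l in LIMITS if om[(l, l)].real > LIMITS[l]]


# Hypothetical couplings of two heavy neutrinos to each charged-lepton flavour.
b_heavy = {
    "e": [0.05, 0.05j],
    "mu": [0.1, 0.0],
    "tau": [0.0, 0.0],
}

om = omega(b_heavy)
excluded = violated_limits(om)
```

In this toy example the electron couplings are allowed, whereas the muon coupling of the first heavy state is slightly too large. The off-diagonal entries, which enter the LFV limits, are computed by the same function.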
LFV decays such as $\tau \to \mu\gamma$, $\mu \to e\gamma$, $\tau \to \mu\mu\mu$, $\mu \to eee$, $\mu \to e$ conversion in nuclei and $Z \to l^+ l'^-$
also constrain the couplings. Heavy neutrinos contribute in loops, so the limits obtained
from these depend on the heavy neutrino masses and Yukawa couplings [27]. For $m_N \gg M_W$
and $m_D \ll M_W$, the limits derived including recent analyses of BaBar data [30], are
$$|\Omega_{e\mu}| \leq 0.0001, \qquad |\Omega_{e\tau}| \leq 0.02, \qquad |\Omega_{\mu\tau}| \leq 0.02. \tag{2.22}$$
Since the modulus lies outside of the sums, the applicability of these limits to the individual
couplings is limited as there can be cancellations between the contributions from different heavy
neutrinos. Furthermore, lepton-flavour violation is a very general signature of beyond the SM
physics. Contributions from SUSY particles for example, could also create cancellations reducing the sum.
Attempts to set limits from neutrinoless double beta decay experiments run into similar problems. Non-observation translates to a bound for Majorana neutrinos of [31]
$$\left| \sum_i \frac{B_{eN_i}^2}{m_{N_i}} \right| < 5 \times 10^{-8}\ \mathrm{GeV}^{-1}. \tag{2.23}$$
Fig. 1. Feynman diagrams for the parton-level subprocesses relevant to heavy neutrino production at the LHC.
Although this would appear to severely constrain heavy Majorana neutrino couplings to the electron, this is again a sum in which large cancellations can occur between contributions from
different particles. In particular, if heavy neutrinos are pseudo-Dirac, this constraint can be
avoided, as there is an extra suppression factor $(m_{N_2} - m_{N_1})/(m_{N_2} + m_{N_1})$.
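The size of this cancellation can be illustrated with a short Python sketch that evaluates the left-hand side of (2.23) for a single Majorana neutrino and for a pseudo-Dirac pair. The pair is assumed to have electron couplings related by a relative factor of $i$, as in the Dirac limit of Section 2.2; the coupling strength, the masses and the function name are hypothetical.

```python
BOUND = 5e-8  # GeV^-1, right-hand side of Eq. (2.23)


def double_beta_sum(couplings, masses):
    """|sum_i B_eNi^2 / m_Ni| for the given electron couplings and masses (GeV)."""
    return abs(sum(b * b / m for b, m in zip(couplings, masses)))


# Hypothetical electron coupling and a pair of masses split by 1 MeV.
b, m1, m2 = 0.05, 250.0, 250.001

# A single heavy Majorana neutrino with this coupling.
single = double_beta_sum([b], [m1])

# Pseudo-Dirac pair: the second coupling carries a relative factor of i.
pair = double_beta_sum([b, 1j * b], [m1, m2])

# The pair result equals the sum of the individual magnitudes times this factor.
suppression = (m2 - m1) / (m2 + m1)
unsuppressed = b * b / m1 + b * b / m2
```

With these inputs the single-neutrino value lies well above the bound, while the pseudo-Dirac pair lies far below it, and `pair` agrees with `unsuppressed * suppression` up to rounding.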
3. LHC signals
At the LHC, the dominant production mechanisms for heavy neutrinos if they have masses
in the 100–500 GeV range will be $q\bar{q}' \to (W^\pm)^* \to l^\pm N$, as shown in Fig. 1 [3–6]. Due to the
enhanced contribution from the valence quarks, $W^+$ bosons will be produced more copiously
than their charge conjugates, hence the process with an intermediate $W^+$ will give the larger
signal. Since the Feynman graphs in Fig. 1 have a $W$ boson in the s-channel, the signal cross
section falls dramatically as the mass of the heavy neutrino is increased. Hence, even though the
LHC centre-of-mass-system (cms) energy will be 14 TeV, there is only an observable signal for
neutrinos below about 400–500 GeV at most.
By far the cleanest signals come from the heavy neutrinos decaying as $N \to l^\pm W^\mp \to l^\pm jj$,
with $l = e, \mu$ and $j$ representing a hadronic jet. All the decay products can then be detected,
allowing the reconstruction of the invariant mass, and more importantly, the observation of
lepton-flavour and lepton-number violation. Given this decay chain, signals that conserve $L$ will
be of the form $pp \to l^\pm l'^\mp W^\pm X$, where here $X$ represents the beam remnants and the $W$ boson is assumed to decay hadronically. In addition to these, for Majorana neutrinos only, LNV
processes of the form $pp \to l^\pm l'^\pm W^\mp X$ are also possible. If observed, this would unravel the
Dirac or Majorana nature of the heavy neutrinos.
In order to suppress the SM background, signals for heavy neutrinos must be LFV (which includes any LNV processes). Since both lepton-flavour and lepton-number violation are forbidden
in the SM, the backgrounds to these processes require extra light neutrinos in the final state. The
main source of this type of background will come from three W bosons. If two of these decay
leptonically and the third hadronically, this can mimic the signal apart from the additional light
neutrinos. Recent analyses of this process have concluded that such a background can be made
negligible after cuts [5,6]. In particular, a missing-$p_T$ cut is very effective, since this should have
no effect on the signal.
4. The resummed heavy neutrino propagator
CP violation may originate from self-energy, vertex or higher order quantum corrections. In
general these are small effects, since electroweak loop corrections themselves are small. However, if two or more of the heavy neutrinos are nearly degenerate in mass, then CP violation
from self-energy corrections (often termed $\varepsilon$-type CP violation [10]) can be resonantly enhanced [11]. In fact, in the limit of degenerate heavy neutrinos, finite-order perturbation theory
breaks down. A well defined field-theoretic formalism is based on a resummation of the self-energy graphs [11,16]. This approach is manifestly gauge invariant within the Pinch Technique
(PT) framework [32,33] and maintains other field-theoretic properties, such as unitarity and CPT
invariance. Our formalism involves the absorptive part of the heavy neutrino self-energy, which
is computed here at the one-loop level. An important point regarding this formalism is that both
the diagonal and off-diagonal elements of the self-energy must be inserted into the heavy neutrino propagator matrix. This is crucial, since for small mass-splittings, the off-diagonal elements
play a major role.
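Following this approach, the propagator for a system of two heavy neutrinos is given by³

\[
S(\slashed{p}) = \begin{pmatrix}
\slashed{p} - m_1 + i\,\mathrm{Im}\,\Sigma_{11}(\slashed{p}) & i\,\mathrm{Im}\,\Sigma_{12}(\slashed{p}) \\
i\,\mathrm{Im}\,\Sigma_{21}(\slashed{p}) & \slashed{p} - m_2 + i\,\mathrm{Im}\,\Sigma_{22}(\slashed{p})
\end{pmatrix}^{-1}, \tag{4.1}
\]

where $\mathrm{Im}\,\Sigma_{ij}(\slashed{p})$ is the absorptive part of the heavy neutrino self-energy.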
4.1. Majorana neutrinos
For Majorana neutrinos, $\mathrm{Im}\,\Sigma_{ij}(\slashed{p})$ is of the form

\[
\mathrm{Im}\,\Sigma_{ij}(\slashed{p}) = A_{ij}(s)\,\slashed{p} P_L + A^*_{ij}(s)\,\slashed{p} P_R, \tag{4.2}
\]

where $s = p^2$ and $A(s)$ is Hermitian (i.e., $A^*_{ij} = A_{ji}$).

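Writing the propagator as

\[
S_M(\slashed{p}) = D_M(s)\,\slashed{p} P_L + E_M(s)\,\slashed{p} P_R + F_M(s) P_L + G_M(s) P_R, \tag{4.3}
\]

the matrices $D_M$, $E_M$, $F_M$ and $G_M$ are given by

\[
D_M = \frac{1}{Z_M}\begin{pmatrix}
X_{22}\hat{A}_{11} + s|A_{12}|^2 \hat{A}_{22} & -i(sA_{21}Y + m_1 m_2 A_{12}) \\
-i(sA_{12}Y + m_1 m_2 A_{21}) & X_{11}\hat{A}_{22} + s|A_{12}|^2 \hat{A}_{11}
\end{pmatrix}, \tag{4.4}
\]

\[
E_M = \frac{1}{Z_M}\begin{pmatrix}
X_{22}\hat{A}_{11} + s|A_{12}|^2 \hat{A}_{22} & -i(sA_{12}Y + m_1 m_2 A_{21}) \\
-i(sA_{21}Y + m_1 m_2 A_{12}) & X_{11}\hat{A}_{22} + s|A_{12}|^2 \hat{A}_{11}
\end{pmatrix}, \tag{4.5}
\]

\[
F_M = \frac{1}{Z_M}\begin{pmatrix}
X_{22} m_1 - s m_2 A_{12}^2 & -is(m_1 A_{21}\hat{A}_{22} + m_2 A_{12}\hat{A}_{11}) \\
-is(m_1 A_{21}\hat{A}_{22} + m_2 A_{12}\hat{A}_{11}) & X_{11} m_2 - s m_1 A_{21}^2
\end{pmatrix}, \tag{4.6}
\]

\[
G_M = \frac{1}{Z_M}\begin{pmatrix}
X_{22} m_1 - s m_2 A_{21}^2 & -is(m_1 A_{12}\hat{A}_{22} + m_2 A_{21}\hat{A}_{11}) \\
-is(m_1 A_{12}\hat{A}_{22} + m_2 A_{21}\hat{A}_{11}) & X_{11} m_2 - s m_1 A_{12}^2
\end{pmatrix}, \tag{4.7}
\]

where

\[
\begin{aligned}
\hat{A}_{ii} &= 1 + iA_{ii}, \qquad X_{ii} = s\hat{A}_{ii}^2 - m_i^2, \qquad Y = |A_{12}|^2 + \hat{A}_{11}\hat{A}_{22}, \\
Z_M &= X_{11}X_{22} + s m_1 m_2 \left(A_{12}^2 + A_{21}^2\right) + s^2 |A_{12}|^2 \left(2\hat{A}_{11}\hat{A}_{22} + |A_{12}|^2\right).
\end{aligned} \tag{4.8}
\]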
By inspection, it can be seen that $E_M$ is related to $D_M$, and $G_M$ to $F_M$ through

\[
E_M[A^*] = D_M[A], \quad E_M = D_M^T, \quad G_M[A^*] = F_M[A], \quad F_M = F_M^T, \quad G_M = G_M^T. \tag{4.9}
\]
³ In all that follows the light neutrino masses have been neglected. Hence, to simplify the notation we use $m_i \equiv m_{N_i}$.
Also, we shall use $B_{li} \equiv B_{lN_i}$; this should not create any confusion since we are concerned hereafter only with the
couplings of the heavy neutrinos.
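Defining the matrices $M$ and $\hat{A}$ as

\[
M = \begin{pmatrix} m_1 & 0 \\ 0 & m_2 \end{pmatrix}, \qquad
\hat{A} = \begin{pmatrix} 1 + iA_{11} & iA_{12} \\ iA_{21} & 1 + iA_{22} \end{pmatrix}, \tag{4.10}
\]

Eq. (4.1) can be written

\[
S_M^{-1}(\slashed{p}) = \hat{A}\,\slashed{p} P_L + \hat{A}^T \slashed{p} P_R - M. \tag{4.11}
\]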
Combining this with (4.3), the property $S S^{-1} = 1$ gives the further equalities

\[
sE_M\hat{A} - F_M M = 1, \qquad G_M\hat{A} = D_M M, \qquad sD_M\hat{A}^T - G_M M = 1, \qquad F_M\hat{A}^T = E_M M. \tag{4.12}
\]

These relations, which can be directly verified for the matrices given in (4.4)-(4.7), are important for checking that CPT invariance is preserved in the theory. More details will be given in
Section 5.
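The consistency relations (4.12) can be checked numerically. The sketch below is one possible way to do so in plain Python, assuming the closed forms (4.4)-(4.8) with a Hermitian matrix $A$ and writing $\hat{A} = 1 + iA$ as in (4.10). The helper names, the test masses and the entries of `A` are illustrative assumptions.

```python
# 2x2 complex matrices are represented as nested lists (hypothetical helpers).
def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def majorana_blocks(s, m1, m2, A):
    """Return (D_M, E_M, F_M, G_M) following Eqs. (4.4)-(4.8) for Hermitian A."""
    A11, A12, A21, A22 = A[0][0], A[0][1], A[1][0], A[1][1]
    Ah11, Ah22 = 1 + 1j * A11, 1 + 1j * A22
    n12 = abs(A12) ** 2
    X11 = s * Ah11 ** 2 - m1 ** 2
    X22 = s * Ah22 ** 2 - m2 ** 2
    Y = n12 + Ah11 * Ah22
    Z = (X11 * X22 + s * m1 * m2 * (A12 ** 2 + A21 ** 2)
         + s ** 2 * n12 * (2 * Ah11 * Ah22 + n12))
    D = [[X22 * Ah11 + s * n12 * Ah22, -1j * (s * A21 * Y + m1 * m2 * A12)],
         [-1j * (s * A12 * Y + m1 * m2 * A21), X11 * Ah22 + s * n12 * Ah11]]
    E = transpose(D)
    f12 = -1j * s * (m1 * A21 * Ah22 + m2 * A12 * Ah11)
    F = [[X22 * m1 - s * m2 * A12 ** 2, f12], [f12, X11 * m2 - s * m1 * A21 ** 2]]
    g12 = -1j * s * (m1 * A12 * Ah22 + m2 * A21 * Ah11)
    G = [[X22 * m1 - s * m2 * A21 ** 2, g12], [g12, X11 * m2 - s * m1 * A12 ** 2]]
    return tuple(scale(1 / Z, blk) for blk in (D, E, F, G))

def max_abs(a):
    return max(abs(a[i][j]) for i in range(2) for j in range(2))

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

# Illustrative inputs (assumptions): masses in GeV, small Hermitian A.
m1, m2 = 300.0, 301.0
s = 300.0 ** 2
A = [[0.004, 0.001 + 0.002j], [0.001 - 0.002j, 0.005]]
Ahat = [[1 + 1j * A[0][0], 1j * A[0][1]], [1j * A[1][0], 1 + 1j * A[1][1]]]
M = [[m1, 0.0], [0.0, m2]]

D, E, F, G = majorana_blocks(s, m1, m2, A)
dev1 = max_abs(sub(sub(scale(s, mul(E, Ahat)), mul(F, M)), IDENTITY))             # sE A^ - F M = 1
dev2 = max_abs(sub(mul(G, Ahat), mul(D, M)))                                      # G A^ = D M
dev3 = max_abs(sub(sub(scale(s, mul(D, transpose(Ahat))), mul(G, M)), IDENTITY))  # sD A^T - G M = 1
dev4 = max_abs(sub(mul(F, transpose(Ahat)), mul(E, M)))                           # F A^T = E M
print(dev1, dev2, dev3, dev4)
```

All four deviations should vanish up to floating-point rounding. A non-zero result would point to a transcription error in one of the matrix entries.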
4.2. Dirac neutrinos
For Dirac neutrinos, the absorptive part of their self-energy has only the left-handed component, viz.

\[
\mathrm{Im}\,\Sigma_{ij}(\slashed{p}) = A_{ij}(s)\,\slashed{p} P_L. \tag{4.13}
\]
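Writing the propagator in the same form as that in (4.3), i.e.,

\[
S_D(\slashed{p}) = D_D(s)\,\slashed{p} P_L + E_D(s)\,\slashed{p} P_R + F_D(s) P_L + G_D(s) P_R, \tag{4.14}
\]

the matrices $D_D$, $E_D$, $F_D$ and $G_D$ are given by the expressions

\[
D_D = \frac{1}{Z_D}\begin{pmatrix}
(s\hat{A}_{22} - m_2^2)\hat{A}_{11} + s|A_{12}|^2 & -iA_{12} m_1 m_2 \\
-iA_{21} m_1 m_2 & (s\hat{A}_{11} - m_1^2)\hat{A}_{22} + s|A_{12}|^2
\end{pmatrix}, \tag{4.15}
\]

\[
E_D = \frac{1}{Z_D}\begin{pmatrix}
s\hat{A}_{22} - m_2^2 & -isA_{12} \\
-isA_{21} & s\hat{A}_{11} - m_1^2
\end{pmatrix}, \tag{4.16}
\]

\[
F_D = \frac{1}{Z_D}\begin{pmatrix}
(s\hat{A}_{22} - m_2^2) m_1 & -is m_2 A_{12} \\
-is m_1 A_{21} & (s\hat{A}_{11} - m_1^2) m_2
\end{pmatrix}, \tag{4.17}
\]

\[
G_D = \frac{1}{Z_D}\begin{pmatrix}
(s\hat{A}_{22} - m_2^2) m_1 & -is m_1 A_{12} \\
-is m_2 A_{21} & (s\hat{A}_{11} - m_1^2) m_2
\end{pmatrix}, \tag{4.18}
\]

with

\[
Z_D = \left(s\hat{A}_{11} - m_1^2\right)\left(s\hat{A}_{22} - m_2^2\right) + s^2 |A_{12}|^2, \tag{4.19}
\]

and $\hat{A}_{ii}$ defined in (4.8).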

Fig. 2. Feynman graphs contributing to the one-loop self-energy of heavy neutrinos. For Dirac neutrinos, only the LNC
graphs exist.

The inverse propagator can be expressed as

\[
S_D^{-1}(\slashed{p}) = \hat{A}\,\slashed{p} P_L + \slashed{p} P_R - M, \tag{4.20}
\]

so the equivalent of the relations in (4.12) for Dirac neutrinos are

\[
sE_D\hat{A} - F_D M = 1, \qquad G_D\hat{A} = D_D M, \qquad sD_D - G_D M = 1, \qquad F_D = E_D M. \tag{4.21}
\]
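As a short worked step (a sketch that uses only the relations above, with $\hat{A} = 1 + iA$ the matrix of (4.10)), the Dirac blocks follow from (4.21) in closed form. Substituting $F_D = E_D M$ into $sE_D\hat{A} - F_D M = 1$ gives

\[
E_D \left(s\hat{A} - M^2\right) = 1
\quad\Longrightarrow\quad
E_D = \left(s\hat{A} - M^2\right)^{-1}
= \frac{1}{Z_D}\begin{pmatrix}
s\hat{A}_{22} - m_2^2 & -isA_{12} \\
-isA_{21} & s\hat{A}_{11} - m_1^2
\end{pmatrix},
\]

where $Z_D = \det(s\hat{A} - M^2) = (s\hat{A}_{11} - m_1^2)(s\hat{A}_{22} - m_2^2) + s^2 A_{12}A_{21}$, which agrees with (4.19) because $A_{12}A_{21} = |A_{12}|^2$ for Hermitian $A$. The remaining blocks then follow as $F_D = E_D M$, $G_D = M E_D$ and $D_D = (1 + G_D M)/s$.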
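Calculating the absorptive part of the self-energy at one-loop level in the Feynman-'t Hooft
gauge⁴ (for which the Feynman graphs are given in Fig. 2), the matrix $A(s)$, which is the same
for both Dirac and Majorana neutrinos, is given by

\[
\begin{aligned}
A_{ij}(s) = \frac{g^2 C_{ij}}{128\pi s^2 M_W^2} \Big[\, & 4M_W^2 \left(s - M_W^2\right)^2 \theta\!\left(s - M_W^2\right) + 2M_Z^2 \left(s - M_Z^2\right)^2 \theta\!\left(s - M_Z^2\right) \\
& + m_i m_j \Big( 2\left(s - M_W^2\right)^2 \theta\!\left(s - M_W^2\right) + \left(s - M_Z^2\right)^2 \theta\!\left(s - M_Z^2\right) \\
& \qquad\qquad + \left(s - M_H^2\right)^2 \theta\!\left(s - M_H^2\right) \Big) \Big],
\end{aligned} \tag{4.22}
\]

with $\theta$ the step function. The tree-level heavy neutrino widths can then be obtained through $\Gamma_{N_i} = m_i A_{ii}(m_i^2)$ for Dirac
neutrinos and $\Gamma_{N_i} = 2m_i A_{ii}(m_i^2)$ for Majorana neutrinos.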
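To get a feel for the size of $A_{ij}(s)$ and of the resulting widths, the following minimal sketch evaluates (4.22) numerically. It assumes the step-function thresholds and the $1/(128\pi)$ normalisation as written above. The electroweak inputs `G2`, `MW` and `MZ` are assumed illustrative values; only $M_H = 120$ GeV is the value quoted later in the text. The function names are hypothetical.

```python
import math

# Assumed illustrative inputs (GeV); G2 stands for the SU(2) coupling squared, g^2.
G2 = 0.426
MW, MZ, MH = 80.4, 91.19, 120.0

def threshold(s, mass):
    """(s - M^2)^2 * theta(s - M^2): zero below the decay threshold."""
    return (s - mass ** 2) ** 2 if s > mass ** 2 else 0.0

def absorptive_A(s, mi, mj, Cij):
    """A_ij(s) following Eq. (4.22) for a given coupling factor C_ij."""
    w, z, h = threshold(s, MW), threshold(s, MZ), threshold(s, MH)
    bracket = 4 * MW ** 2 * w + 2 * MZ ** 2 * z + mi * mj * (2 * w + z + h)
    return G2 * Cij * bracket / (128 * math.pi * s ** 2 * MW ** 2)

def width(m, Cii, majorana=True):
    """Tree-level width: m * A_ii(m^2) for Dirac, 2 * m * A_ii(m^2) for Majorana."""
    factor = 2.0 if majorana else 1.0
    return factor * m * absorptive_A(m * m, m, m, Cii)

# Example (assumption): m_N = 300 GeV and C_ii = 3 * 0.05^2.
gamma_majorana = width(300.0, 0.0075, majorana=True)
gamma_dirac = width(300.0, 0.0075, majorana=False)
print(gamma_majorana, gamma_dirac)
```

By construction the Majorana width is twice the Dirac one for the same mass and couplings, and $A_{ij}$ vanishes below the lowest threshold.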
5. CP asymmetries in $lW \to l'W$ type processes
For the LHC signals described in Section 3, the heavy neutrino propagator is coupled to a
charged lepton and $W$ boson at each end. The asymmetries between such processes and their CP-conjugates will thus be the same as for the $2 \to 2$ processes $lW \to l'W$. This can be understood
since the fermion line containing the heavy neutrino is the same in the Feynman graphs for both
the $2 \to 3$ signals and the corresponding $2 \to 2$ processes. The fact that one of the charged
leptons changes from being a final state particle to an initial state particle will not much affect
the size of the CP asymmetries.
As a result of CPT invariance, if all possible final states $X$ are summed over, then [16]

\[
\sigma(l^+ W^- \to N \to X) = \sigma(l^- W^+ \to N \to X). \tag{5.1}
\]

However, it is possible for the asymmetry between the cross sections for producing any particular
final state and its CP-conjugate to be large.
⁴ We note that the PT result for fermion self-energies coincides with that obtained in the Feynman-'t Hooft gauge
[32,33].

Considering LNC processes first, the CP-violating difference between $\sigma(q\bar{q}' \to l^+ l'^- W^+)$
and $\sigma(\bar{q}q' \to l^- l'^+ W^-)$ will thus be proportional to that between $\sigma(l^- W^+ \to l'^- W^+)$ and
$\sigma(l^+ W^- \to l'^+ W^-)$. The only relevant parts of the cross sections are those that involve the
couplings of the heavy neutrinos; any pair of CP-conjugate processes will be otherwise identical.
Therefore, this asymmetry will be proportional to the factors

\[
\Delta^{\rm LNC}_{\rm CP}\Big|_{\rm Dirac} = \left| B_{l'} E_D(s) B_l^\dagger \right|^2 - \left| B_l E_D(s) B_{l'}^\dagger \right|^2, \tag{5.2}
\]

\[
\Delta^{\rm LNC}_{\rm CP}\Big|_{\rm Majorana} = \left| B_{l'} E_M(s) B_l^\dagger \right|^2 - \left| B_l E_M(s) B_{l'}^\dagger \right|^2, \tag{5.3}
\]
depending on whether the heavy neutrino is a Dirac or Majorana particle. In the above, $B_l = (B_{l1}, B_{l2})$,
with $N_1$ and $N_2$ being the two nearly degenerate heavy neutrinos involved. As a
direct consequence of CPT invariance, these vanish if $l$ and $l'$, the two charged leptons that the
heavy neutrinos couple to, are the same. Since $E[B^*] = E^T[B]$ for both Dirac and Majorana
neutrinos, the two terms in $\Delta^{\rm LNC}_{\rm CP}$ transform into each other under $B \to B^*$, hence confirming
that they represent CP-conjugates. Using the expression for $E_D$ given in (4.16), $\Delta^{\rm LNC}_{\rm CP}|_{\rm Dirac}$ is
given by

\[
\begin{aligned}
\Delta^{\rm LNC}_{\rm CP}\Big|_{\rm Dirac} = \frac{4s\left(s - m_2^2\right)}{|Z_D|^2} \Big\{ & A_{11}\,\mathrm{Im}\!\left[B^*_{l1} B_{l'1} B_{l2} B^*_{l'2}\right]
- |B_{l1}|^2\,\mathrm{Im}\!\left[A_{12} B_{l'1} B^*_{l'2}\right] \\
& + |B_{l'1}|^2\,\mathrm{Im}\!\left[A_{12} B_{l1} B^*_{l2}\right] \Big\} + (1 \leftrightarrow 2).
\end{aligned} \tag{5.4}
\]
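The closed form (5.4) can be cross-checked against the defining difference (5.2). The sketch below does this numerically in plain Python, assuming $E_D$ as in (4.16), $Z_D$ as in (4.19), a Hermitian $A$, and the reading of (5.2) as $|B_{l'} E_D B_l^\dagger|^2 - |B_l E_D B_{l'}^\dagger|^2$. Function names, couplings and the entries of `A` are illustrative assumptions.

```python
import math

def dirac_E(s, m1, m2, A):
    """E_D of Eq. (4.16) and Z_D of Eq. (4.19) for a Hermitian 2x2 matrix A."""
    e1 = s * (1 + 1j * A[1][1]) - m2 ** 2
    e2 = s * (1 + 1j * A[0][0]) - m1 ** 2
    Z = e1 * e2 + s ** 2 * abs(A[0][1]) ** 2
    E = [[e1 / Z, -1j * s * A[0][1] / Z],
         [-1j * s * A[1][0] / Z, e2 / Z]]
    return E, Z

def bilinear(row, E, col):
    """row . E . col^dagger for two-component coupling vectors."""
    return sum(row[i] * E[i][j] * complex(col[j]).conjugate()
               for i in range(2) for j in range(2))

def delta_direct(s, m1, m2, A, Bl, Blp):
    """Difference of squared amplitudes as in Eq. (5.2)."""
    E, _ = dirac_E(s, m1, m2, A)
    return abs(bilinear(Blp, E, Bl)) ** 2 - abs(bilinear(Bl, E, Blp)) ** 2

def delta_formula(s, m1, m2, A, Bl, Blp):
    """Closed form of Eq. (5.4), including the (1 <-> 2) term."""
    _, Z = dirac_E(s, m1, m2, A)

    def half(i, j, mj):
        quad = (complex(Bl[i]).conjugate() * Blp[i] * Bl[j] * complex(Blp[j]).conjugate()).imag
        t_l = abs(Bl[i]) ** 2 * (A[i][j] * Blp[i] * complex(Blp[j]).conjugate()).imag
        t_lp = abs(Blp[i]) ** 2 * (A[i][j] * Bl[i] * complex(Bl[j]).conjugate()).imag
        return 4 * s * (s - mj ** 2) * (complex(A[i][i]).real * quad - t_l + t_lp)

    return (half(0, 1, m2) + half(1, 0, m1)) / abs(Z) ** 2

# Illustrative inputs (assumptions): masses in GeV, s at the midpoint of the two poles.
m1, m2 = 300.0, 300.2
s = 0.5 * (m1 ** 2 + m2 ** 2)
A = [[4e-4, 1e-4 + 2e-4j], [1e-4 - 2e-4j, 5e-4]]
Bl = [0.05, 0.05j]
Blp = [0.05, 0.03 - 0.02j]

direct = delta_direct(s, m1, m2, A, Bl, Blp)
closed = delta_formula(s, m1, m2, A, Bl, Blp)
same_flavour = delta_direct(s, m1, m2, A, Bl, Bl)
print(direct, closed, same_flavour)
```

The two evaluations should agree to rounding accuracy, and the asymmetry vanishes identically when both charged leptons carry the same couplings, in line with the CPT argument above.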
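The full expression for $\Delta^{\rm LNC}_{\rm CP}|_{\rm Majorana}$ is rather more complicated; it is thus pragmatic to work
with an approximation when performing analytic calculations. The elements of $A$ are very small
(of order the heavy neutrino widths divided by their masses), so terms above first order in them
can be dropped to a very good approximation. Doing this, the matrix $E_M$ is approximated as

\[
E_M \approx \frac{1}{Z_M}\begin{pmatrix}
\left(s - m_2^2\right)\hat{A}_{11} + 2isA_{22} & -i(sA_{12} + m_1 m_2 A_{21}) \\
-i(sA_{21} + m_1 m_2 A_{12}) & \left(s - m_1^2\right)\hat{A}_{22} + 2isA_{11}
\end{pmatrix}. \tag{5.5}
\]

Using this approximation, $\Delta^{\rm LNC}_{\rm CP}|_{\rm Majorana}$ is given by

\[
\begin{aligned}
\Delta^{\rm LNC}_{\rm CP}\Big|_{\rm Majorana} = \frac{4\left(s - m_2^2\right)}{|Z_M|^2} \Big\{ & \left(s + m_1^2\right) A_{11}\,\mathrm{Im}\!\left[B^*_{l1} B_{l'1} B_{l2} B^*_{l'2}\right] \\
& - |B_{l1}|^2 \left( s\,\mathrm{Im}\!\left[A_{12} B_{l'1} B^*_{l'2}\right] + m_1 m_2\,\mathrm{Im}\!\left[A_{21} B_{l'1} B^*_{l'2}\right] \right) \\
& + |B_{l'1}|^2 \left( s\,\mathrm{Im}\!\left[A_{12} B_{l1} B^*_{l2}\right] + m_1 m_2\,\mathrm{Im}\!\left[A_{21} B_{l1} B^*_{l2}\right] \right) \Big\} + (1 \leftrightarrow 2).
\end{aligned} \tag{5.6}
\]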
The propagator for Dirac neutrinos simply contains a subset of the terms in the corresponding
one for Majorana neutrinos; the same is thus also true of the CP-violating expressions (5.4) and
(5.6). The extra terms that appear for Majorana neutrinos all have an $m^2$ mass dependence. These
are due to interference between graphs without a chirality flip in them and graphs with a double
chirality flip, the latter not appearing for Dirac neutrinos. There will also be higher order (in
the elements of $A$) terms for Majorana neutrinos that have been neglected in our approximation
for $E_M$.
An analogous expression can be derived for the LNV signals (assuming neutrinos are Majorana particles so such processes are allowed), which are of the form $\sigma(q\bar{q}' \to l^\pm l'^\pm W^\mp)$.
The asymmetries between these signals are related to those between $\sigma(l^- W^+ \to l'^+ W^-)$ and
$\sigma(l^+ W^- \to l'^- W^+)$, which are proportional to

\[
\Delta^{\rm LNV}_{\rm CP} = \left| B^*_{l'} G_M(s) B_l^\dagger \right|^2 - \left| B_{l'} F_M(s) B_l^T \right|^2, \tag{5.7}
\]

where only the contributions from the resonant s-channel diagrams are included. $F_M(s)$ and
$G_M(s)$ are symmetric, so both terms in this expression are individually invariant under $l \leftrightarrow l'$.
Also, since $G_M[B^*] = F_M[B]$, it can again be confirmed that these terms represent CP-conjugates, since they transform into each other under $B \to B^*$.
In order to make the expressions manageable, we will continue to work with an approximation
of the heavy Majorana neutrino propagator that drops higher order terms in the elements of the
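matrix $A$. With this approximation, the matrices $F_M$ and $G_M$ are given by

\[
F_M \approx \frac{1}{Z_M}\begin{pmatrix}
m_1\left(s(1 + 2iA_{22}) - m_2^2\right) & -is(m_1 A_{21} + m_2 A_{12}) \\
-is(m_1 A_{21} + m_2 A_{12}) & m_2\left(s(1 + 2iA_{11}) - m_1^2\right)
\end{pmatrix}, \tag{5.8}
\]

\[
G_M \approx \frac{1}{Z_M}\begin{pmatrix}
m_1\left(s(1 + 2iA_{22}) - m_2^2\right) & -is(m_1 A_{12} + m_2 A_{21}) \\
-is(m_1 A_{12} + m_2 A_{21}) & m_2\left(s(1 + 2iA_{11}) - m_1^2\right)
\end{pmatrix}. \tag{5.9}
\]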
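Using these expressions, the CP asymmetry for LNV processes of the type considered is

\[
\begin{aligned}
\Delta^{\rm LNV}_{\rm CP} = \frac{4s m_1\left(s - m_2^2\right)}{|Z_M|^2} \Big\{ & 2m_2 A_{11}\,\mathrm{Im}\!\left[B^*_{l1} B^*_{l'1} B_{l2} B_{l'2}\right] \\
& + |B_{l1}|^2 \left( m_1\,\mathrm{Im}\!\left[A_{12} B_{l'1} B^*_{l'2}\right] + m_2\,\mathrm{Im}\!\left[A_{21} B_{l'1} B^*_{l'2}\right] \right) \\
& + |B_{l'1}|^2 \left( m_1\,\mathrm{Im}\!\left[A_{12} B_{l1} B^*_{l2}\right] + m_2\,\mathrm{Im}\!\left[A_{21} B_{l1} B^*_{l2}\right] \right) \Big\} + (1 \leftrightarrow 2).
\end{aligned} \tag{5.10}
\]

Since all first order graphs contributing to LNV processes have a single chirality flip in the
propagator, all the terms in the above expression are proportional to $m^2$.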
5.1. Theoretical constraints (for Majorana neutrinos)
For three heavy Majorana neutrinos, $B_{lN}$ is a $3 \times 3$ matrix. Ignoring the light neutrino masses,
the constraints in (2.11) thus leave four of the heavy neutrino couplings as free parameters. For
example, $B_{l1}$ ($l = e, \mu, \tau$) and $B_{e2}$ can be chosen; Eq. (2.11) is then satisfied by

\[
B_{e3} = i\sqrt{\frac{m_1 B_{e1}^2 + m_2 B_{e2}^2}{m_3}}, \qquad B_{li} = \frac{B_{l1} B_{ei}}{B_{e1}}. \tag{5.11}
\]
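In order to see how this affects the expressions for the $\Delta_{\rm CP}$'s, write the matrix $A$ as

\[
A_{ij} = C_{ij}\left( a(s) + b(s)\,\frac{m_i m_j}{M_W^2} \right), \tag{5.12}
\]

where $a$ and $b$ are dimensionless real functions. Then, using (2.10),

\[
\mathrm{Im}\!\left[A_{ij} B_{l1} B^*_{l2}\right] = \left( a + b\,\frac{m_i m_j}{M_W^2} \right) \sum_{l'} \mathrm{Im}\!\left[B^*_{l'i} B_{l'j} B_{l1} B^*_{l2}\right], \tag{5.13}
\]

which, after the application of (5.11), becomes

\[
\mathrm{Im}\!\left[A_{ij} B_{l1} B^*_{l2}\right] = \left( a + b\,\frac{m_i m_j}{M_W^2} \right) C_{11}\,\frac{|B_{l1}|^2}{|B_{e1}|^4}\,\mathrm{Im}\!\left[B^*_{ei} B_{ej} B_{e1} B^*_{e2}\right]. \tag{5.14}
\]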
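Inserting this into (5.6), $\Delta^{\rm LNC}_{\rm CP}|_{\rm Majorana}$ can be shown to vanish, while in (5.10), all terms proportional to $a$ cancel out and we are left with

\[
\Delta^{\rm LNV}_{\rm CP} = \frac{8 b s\, m_1 m_2 (m_1 - m_2)\, |B_{l1}|^2 |B_{l'1}|^2\, C_{11}}{M_W^2 |Z_M|^2}
\left[ m_1\left(s - m_2^2\right) + m_2\left(s - m_1^2\right) \frac{|B_{e2}|^2}{|B_{e1}|^2} \right]
\mathrm{Im}\!\left[ \frac{B_{e2}^2}{B_{e1}^2} \right], \tag{5.15}
\]

where

\[
b = \frac{g^2 \left( 2\left(s - M_W^2\right)^2 \theta\!\left(s - M_W^2\right) + \left(s - M_Z^2\right)^2 \theta\!\left(s - M_Z^2\right) + \left(s - M_H^2\right)^2 \theta\!\left(s - M_H^2\right) \right)}{128\pi s^2}. \tag{5.16}
\]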
For neutrino masses accessible to colliders this is always very small, so no observable CP violation is possible for the model considered with three heavy Majorana neutrinos. This result
holds as long as only two of them are close enough in mass to have significant mixing in the
propagator. The case of all three neutrinos being nearly degenerate is more involved, so whether
large CP asymmetries can occur in this case may be studied elsewhere.
If at least four heavy neutrinos exist, Eq. (2.11) can be satisfied for any choice of the couplings of the two quasi-degenerate neutrinos. Scenarios which result in large CP asymmetries are
thus possible, and it is in the context of such a theory that our numerical results for Majorana
neutrinos are to be considered. As mentioned in Section 2.2, Eq. (2.11) is automatically satisfied for the model with Dirac neutrinos. As such, even with just two heavy neutrinos, large CP
asymmetries can result here.
5.2. Scenarios with large CP asymmetries
For the purposes of our numerical calculations, we assume the couplings of the nearly degenerate neutrinos can be chosen independently for both Dirac and Majorana neutrinos. Although
this implies at least four heavy neutrinos in the Majorana case, we also assume that only two
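of these are close in mass, such that we can use our $2 \times 2$ propagators given in Section 4. In
order to determine scenarios which would result in large CP asymmetries, we consider just the
kinematic point $s = \bar{s} = \frac{1}{2}(m_1^2 + m_2^2)$. To further simplify the expressions, we will also set the
heavy neutrino couplings to all have a common magnitude $|B|$.
Introducing the notation $\bar{\Delta}_{\rm CP} \equiv \Delta_{\rm CP}(\bar{s})$, $\bar{Z} \equiv Z(\bar{s})$ and $\bar{A}_{ij} \equiv A_{ij}(\bar{s})$, the CP asymmetry for
Dirac neutrinos is given by

\[
\begin{aligned}
\bar{\Delta}^{\rm LNC}_{\rm CP}\Big|_{\rm Dirac} = \frac{m_1^4 - m_2^4}{|\bar{Z}_D|^2} \Big\{ & \left(\bar{A}_{11} + \bar{A}_{22}\right) \mathrm{Im}\!\left[B^*_{l1} B_{l'1} B_{l2} B^*_{l'2}\right]
- \left(|B_{l1}|^2 + |B_{l2}|^2\right) \mathrm{Im}\!\left[\bar{A}_{12} B_{l'1} B^*_{l'2}\right] \\
& + \left(|B_{l'1}|^2 + |B_{l'2}|^2\right) \mathrm{Im}\!\left[\bar{A}_{12} B_{l1} B^*_{l2}\right] \Big\}.
\end{aligned} \tag{5.17}
\]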
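The prefactor in (5.17) follows from (5.4) in one step, which can be sketched as follows using only the definition of $\bar{s}$. At the midpoint,

\[
\bar{s} - m_2^2 = \tfrac{1}{2}\left(m_1^2 - m_2^2\right) = -\left(\bar{s} - m_1^2\right),
\qquad
4\bar{s}\left(\bar{s} - m_2^2\right) = \left(m_1^2 + m_2^2\right)\left(m_1^2 - m_2^2\right) = m_1^4 - m_2^4 .
\]

In the $(1 \leftrightarrow 2)$ term of (5.4) the kinematic factor therefore changes sign, and so does each imaginary part, since exchanging the indices complex-conjugates the products of couplings (with $A_{21} = A_{12}^*$). The two sign changes compensate, so both halves add with the common factor $\tfrac{1}{2}(m_1^4 - m_2^4)$ each, giving the sums $A_{11} + A_{22}$, $|B_{l1}|^2 + |B_{l2}|^2$ and $|B_{l'1}|^2 + |B_{l'2}|^2$ of (5.17).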
One way to obtain a large value for this expression is to maximise $\mathrm{Im}[B^*_{l1} B_{l'1} B_{l2} B^*_{l'2}]$, which can
be done by having one of the four couplings imaginary and the rest real. The couplings to the
third charged lepton (i.e., not $l$ or $l'$) can then be chosen such that $C_{12}$ is either real or imaginary,
and so either $\mathrm{Im}[\bar{A}_{12} B_{l'1} B^*_{l'2}]$ or $\mathrm{Im}[\bar{A}_{12} B_{l1} B^*_{l2}]$ will be zero. If we now impose the condition
that all the couplings have the same magnitude $|B|$, Eq. (5.17) becomes

\[
\bar{\Delta}^{\rm LNC}_{\rm CP}\Big|_{\rm Dirac} = |B|^4 \left( \bar{A}_{11} + \bar{A}_{22} - 2|\bar{A}_{12}| \right) \frac{m_1^4 - m_2^4}{|\bar{Z}_D|^2}. \tag{5.18}
\]

Although there is a partial cancellation between the elements of $\bar{A}$, the off-diagonal element will
only be about one third of the magnitude of the diagonal elements (which are approximately
equal). This is because in $\bar{A}_{12}$, the contribution from the couplings to the third charged lepton
cancels exactly with the contribution from either $l$ or $l'$. The equality $\bar{A}_{11} = \bar{A}_{22} = 3|\bar{A}_{12}|$ is
not exact, however, since there are small differences due to the mass-splitting of the two heavy
neutrinos.
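For Majorana neutrinos, the expression for $\bar{\Delta}^{\rm LNC}_{\rm CP}|_{\rm Majorana}$ is

\[
\begin{aligned}
\bar{\Delta}^{\rm LNC}_{\rm CP}\Big|_{\rm Majorana} = \frac{m_1^2 - m_2^2}{|\bar{Z}_M|^2} \Big\{ & \left[ \left(3m_1^2 + m_2^2\right)\bar{A}_{11} + \left(m_1^2 + 3m_2^2\right)\bar{A}_{22} \right] \mathrm{Im}\!\left[B^*_{l1} B_{l'1} B_{l2} B^*_{l'2}\right] \\
& - \left(|B_{l1}|^2 + |B_{l2}|^2\right) \left( \left(m_1^2 + m_2^2\right) \mathrm{Im}\!\left[\bar{A}_{12} B_{l'1} B^*_{l'2}\right] + 2m_1 m_2\,\mathrm{Im}\!\left[\bar{A}_{21} B_{l'1} B^*_{l'2}\right] \right) \\
& + \left(|B_{l'1}|^2 + |B_{l'2}|^2\right) \left( \left(m_1^2 + m_2^2\right) \mathrm{Im}\!\left[\bar{A}_{12} B_{l1} B^*_{l2}\right] + 2m_1 m_2\,\mathrm{Im}\!\left[\bar{A}_{21} B_{l1} B^*_{l2}\right] \right) \Big\}.
\end{aligned} \tag{5.19}
\]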
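If $\bar{A}_{12}$ is imaginary, then $\mathrm{Im}[\bar{A}_{12} B_{l1} B^*_{l2}] = -\mathrm{Im}[\bar{A}_{21} B_{l1} B^*_{l2}]$. Under the same assumption for the
$B$-couplings, we find

\[
\bar{\Delta}^{\rm LNC}_{\rm CP}\Big|_{\rm Majorana} = \frac{|m_1^2 - m_2^2|\,|B|^4}{|\bar{Z}_M|^2} \left[ \left(3m_1^2 + m_2^2\right)\bar{A}_{11} + \left(m_1^2 + 3m_2^2\right)\bar{A}_{22} \right] + O\!\left((m_1 - m_2)^2\right). \tag{5.20}
\]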
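For LNV processes, $\bar{\Delta}^{\rm LNV}_{\rm CP}$ is given by

\[
\begin{aligned}
\bar{\Delta}^{\rm LNV}_{\rm CP} = \frac{m_1^4 - m_2^4}{|\bar{Z}_M|^2} \Big\{ & 2m_1 m_2 \left(\bar{A}_{11} + \bar{A}_{22}\right) \mathrm{Im}\!\left[B^*_{l1} B^*_{l'1} B_{l2} B_{l'2}\right] \\
& + \left(m_1^2 |B_{l1}|^2 + m_2^2 |B_{l2}|^2\right) \mathrm{Im}\!\left[\bar{A}_{12} B_{l'1} B^*_{l'2}\right]
+ m_1 m_2 \left(|B_{l1}|^2 + |B_{l2}|^2\right) \mathrm{Im}\!\left[\bar{A}_{21} B_{l'1} B^*_{l'2}\right] \\
& + \left(m_1^2 |B_{l'1}|^2 + m_2^2 |B_{l'2}|^2\right) \mathrm{Im}\!\left[\bar{A}_{12} B_{l1} B^*_{l2}\right]
+ m_1 m_2 \left(|B_{l'1}|^2 + |B_{l'2}|^2\right) \mathrm{Im}\!\left[\bar{A}_{21} B_{l1} B^*_{l2}\right] \Big\}.
\end{aligned} \tag{5.21}
\]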
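As long as $l \neq l'$, then the same assumptions for the couplings lead to

\[
\bar{\Delta}^{\rm LNV}_{\rm CP} = \frac{2m_1 m_2 |B|^4}{|\bar{Z}_M|^2} \left(\bar{A}_{11} + \bar{A}_{22}\right) \left(m_1^4 - m_2^4\right) + O\!\left((m_1 - m_2)^2\right). \tag{5.22}
\]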
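If $l = l'$, then

\[
\begin{aligned}
\bar{\Delta}^{\rm LNV}_{\rm CP} = \frac{m_1^4 - m_2^4}{|\bar{Z}_M|^2} \Big\{ & 2m_1 m_2 \left(\bar{A}_{11} + \bar{A}_{22}\right) \mathrm{Im}\!\left[\left(B^*_{l1} B_{l2}\right)^2\right] \\
& + 2\left[ \left(m_1^2 |B_{l1}|^2 + m_2^2 |B_{l2}|^2\right) \mathrm{Im}\!\left[\bar{A}_{12} B_{l1} B^*_{l2}\right]
+ m_1 m_2 \left(|B_{l1}|^2 + |B_{l2}|^2\right) \mathrm{Im}\!\left[\bar{A}_{21} B_{l1} B^*_{l2}\right] \right] \Big\}.
\end{aligned} \tag{5.23}
\]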
Table 1
Properties of the cross sections and CP asymmetries considered in Section 6

Observable(s)                               Heavy neutrino type   CP violation in (S2)   CP violation in (S3)   K-dependence
$\sigma(pp \to e^\pm \mu^\mp W^\pm X)$      Dirac                 No                     Yes                    No
$\sigma(pp \to e^\pm \mu^\mp W^\pm X)$      Majorana              Yes                    Yes                    No
$\sigma(pp \to e^\mp \mu^\pm W^\pm X)$      Dirac                 No                     Yes                    No
$\sigma(pp \to e^\mp \mu^\pm W^\pm X)$      Majorana              Yes                    Yes                    No
$\sigma(pp \to e^\pm e^\pm W^\mp X)$        Majorana              Yes                    No                     No
$\sigma(pp \to \mu^\pm \mu^\pm W^\mp X)$    Majorana              No                     No                     No
$\sigma(pp \to e^\pm \mu^\pm W^\mp X)$      Majorana              Yes                    Yes                    No
$A_{\rm CP}$(LNC)                           Dirac                 No                     Yes                    No
$A_{\rm CP}$(LNC)                           Majorana              Yes                    Yes                    No
$R_{\rm CP}$(LNC)                           Dirac                 No                     Yes                    No
$R_{\rm CP}$(LNC)                           Majorana              Yes                    Yes                    No
$A_{\rm CP}$(LNV1)                          Majorana              Yes                    No                     Yes
$A_{\rm CP}$(LNV2)                          Majorana              Yes                    Yes                    Yes
$R_{\rm CP}$(LNV)                           Majorana              Yes                    Yes                    No

The easiest way to get a large value for this is to have $B_{l1} B^*_{l2}$ imaginary. $\mathrm{Im}[(B^*_{l1} B_{l2})^2]$ is then
zero, but both $\mathrm{Im}[\bar{A}_{12} B_{l1} B^*_{l2}]$ and $\mathrm{Im}[\bar{A}_{21} B_{l1} B^*_{l2}]$ will be large (and of the same sign), as long as
$C_{12}$ has a significant real component. The asymmetry is then

\[
\bar{\Delta}^{\rm LNV}_{\rm CP} = \frac{2|B|^4}{|\bar{Z}_M|^2} \left(m_1^4 - m_2^4\right) (m_1 + m_2)^2\, \mathrm{Re}\!\left[\bar{A}_{12}\right]. \tag{5.24}
\]
In all the above coupling scenarios, if the heavy neutrino mass-splitting $\Delta m_N \equiv m_2 - m_1$ is
of order their widths, then the CP asymmetries can be of order the cross sections involved. In the
resonant region, these have terms proportional to either $\Delta m_N^2$, $\Gamma_N^2$ or $\Delta m_N \Gamma_N$ (all over $|\bar{Z}_{D/M}|^2$).
The CP asymmetries are themselves proportional to $\Delta m_N \Gamma_N$ (over $|\bar{Z}_{D/M}|^2$), so for $\Delta m_N \sim \Gamma_N$,
CP violation can be resonant. This still requires particular coupling scenarios such as we have
described, but since we are interested in investigating the possibility of large CP asymmetries at
the LHC, we will use these optimised conditions for our numerical studies.
6. Numerical results
In this section we give our numerical results for the cross sections of the various signals of
the type discussed in Section 3; for clarity, we list these in Table 1. As mentioned in Section 3,
in all signals considered, the $W$ bosons are assumed to decay hadronically. We do not consider
$\tau$ leptons as possible final state particles since these decay before reaching the detector and so
would require a more involved analysis. Since we are primarily interested in CP asymmetries
between the signal processes, we also plot these, which we define later in this section and also
include in Table 1. For the results shown, we have used the CTEQ6M PDFs [14], in which
we have used $Q = m_N$. Setting $Q$ equal to the invariant mass of the $W$ instead, for example,
increases the cross sections slightly for relatively low heavy neutrino masses.
The basic assumption required for our calculations to be valid is that the two quasi-degenerate
heavy neutrinos are the only new physics particles that make an appreciable contribution to the
processes considered. All others are assumed to either not couple to relevant particles, or have
masses above about 1 TeV. Our calculations depend weakly on the mass of the Higgs boson, for which we
have used $M_H = 120$ GeV, and we have globally applied the cuts $p_T > 15$ GeV and $|\eta| < 2.5$
for all final state particles. Note that these kinematical cuts are similar to those chosen in [5].
We consider the following three scenarios:

(S1): $B_{lN} = 0.05$ for $l = e, \mu, \tau$ and $N = 1, 2$.
(S2): As (S1), except $B_{e2} = 0.05i$.
(S3): As (S1), except $B_{\mu 2} = 0.05i$ and $B_{\tau 2} = -0.05$.
The first scenario, where all couplings are real, is the CP-conserving limit. This enables us to
separate true CP asymmetries from the asymmetry due to the PDFs. The second two are chosen
specifically as examples of resonant CP violation, as guided by our analysis in Section 5.2. Scenario (S2) gives rise to large asymmetries for Majorana neutrinos, but not for Dirac neutrinos,
whereas in scenario (S3) large CP asymmetries occur for both Dirac and Majorana neutrinos, but
only if the two final state charged leptons are of different flavour.
Including just the two heavy neutrinos that are directly involved in the signal processes, we
have $\sum_i |B_{lN_i}|^2 = 0.005$ (for all $l$). This leaves room for couplings to other heavy neutrinos
without invalidating the experimental limits in (2.21). This is especially important for Majorana
neutrinos, since as noted in Section 5.1, it is necessary then to have at least four heavy neutrinos
to avoid theoretical constraints on their couplings. In the same context, a degree of cancellation
between the different loop contributions might be necessary for the scenarios (S1)-(S3) to satisfy
the bounds in (2.22), especially those derived from the non-observation of $\mu \to e\gamma$.
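The coupling structure of the three scenarios can be tabulated with a few lines of code. This is a minimal sketch, assuming the overlap $C_{ij} = \sum_l B^*_{li} B_{lj}$ (the form suggested by (5.13)) and reading (S3) as $B_{\mu 2} = 0.05i$, $B_{\tau 2} = -0.05$. The function and dictionary names are hypothetical.

```python
def coupling_overlap(B):
    """C_ij = sum_l conj(B_li) * B_lj for B given as {lepton: (B_l1, B_l2)}."""
    return [[sum(complex(b[i]).conjugate() * b[j] for b in B.values())
             for j in range(2)] for i in range(2)]

# Scenario definitions as read from (S1)-(S3).
S1 = {"e": (0.05, 0.05), "mu": (0.05, 0.05), "tau": (0.05, 0.05)}
S2 = dict(S1, e=(0.05, 0.05j))
S3 = dict(S1, mu=(0.05, 0.05j), tau=(0.05, -0.05))

C1, C2, C3 = coupling_overlap(S1), coupling_overlap(S2), coupling_overlap(S3)

# Sum over the two heavy neutrinos of |B_lN|^2, per charged lepton.
row_sums = {name: {l: sum(abs(x) ** 2 for x in b) for l, b in S.items()}
            for name, S in (("S1", S1), ("S2", S2), ("S3", S3))}

ratio_S3 = abs(C3[0][1]) / C3[0][0].real  # |C_12| / C_11 in scenario (S3)
print(C1[0][1], C2[0][1], C3[0][1], ratio_S3)
```

Under these assumptions $C_{12}$ is real in (S1), has both real and imaginary parts in (S2), and is purely imaginary in (S3), where the $e$ and $\tau$ contributions cancel and $|C_{12}| = C_{11}/3$, matching the "one third" estimate of Section 5.2.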
We start our investigation of CP violation by constructing the following CP asymmetry

\[
A_{\rm CP} = \frac{\sigma(pp \to (W^+)^* \to S_i) - K\,\sigma(pp \to (W^-)^* \to \bar{S}_i)}{\sigma(pp \to (W^+)^* \to S_i) + K\,\sigma(pp \to (W^-)^* \to \bar{S}_i)}, \tag{6.1}
\]

where $S_i$ can stand for any of the signal final states considered and $\bar{S}_i$ its CP-conjugate. The
function $K$ takes account of the different PDFs involved in producing $W^+$ and $W^-$ bosons,
such that $A_{\rm CP} = 0$ if CP is conserved. It has to be calculated theoretically and is defined as

\[
K = \left. \frac{\sigma(pp \to (W^+)^* \to S_i)}{\sigma(pp \to (W^-)^* \to \bar{S}_i)} \right|_{\text{CP-conserving limit}}, \tag{6.2}
\]
where again $S_i$ can be any of the possible final states considered and $\bar{S}_i$ its CP-conjugate. An
important point is that although $A_{\rm CP}$ is in general different for different $S_i$, $K$ is, to a very good
approximation, universal whichever signal (of the type considered) is used to define it. Therefore,
we calculate the cross sections in the CP-conserving limit with all couplings real, as for example
in scenario (S1). $K$ is independent of the magnitudes of the couplings and mass-splitting (for
$\Delta m_N \ll m_N$), so any other CP-conserving scenario would give the same value. It is plotted as a
function of $m_{N_1}$ (the mass of the lighter of the two heavy neutrinos) in Fig. 3 along with the signal
cross sections in (S1). The value of $K$ can be obtained by taking the ratio of either of the two
pairs of CP-conjugate signals shown. In fact, in this special scenario, where all heavy neutrino
couplings are equal, all additional signals of the type considered have the same cross section as
one of the four shown. However, there is an element of theoretical uncertainty predominantly
coming from the different choices of PDFs used. For instance, using the MRST2004 PDFs
[15] instead, we find values for K that differ by less than 1%. They are also insensitive to the
factorisation scale Q that is used.
Another set of CP-violating observables can be constructed by considering ratios of different
processes, such that the asymmetries due to the different PDFs cancel out. Defining the ratios
$R_+$ and $R_-$ as

\[
R_+ = \frac{\sigma(pp \to (W^+)^* \to S_i)}{\sigma(pp \to (W^+)^* \to S_j)}, \qquad
R_- = \frac{\sigma(pp \to (W^-)^* \to \bar{S}_i)}{\sigma(pp \to (W^-)^* \to \bar{S}_j)}, \tag{6.3}
\]

CP-violating observables can be constructed from

\[
R_{\rm CP} = \frac{R_+ - R_-}{R_+ + R_-}. \tag{6.4}
\]

Fig. 3. Left plot: Signal cross sections for scenario (S1). Right plot: The function K as defined in (6.2). All additional
signals of the type considered have the same cross section as one of the four shown.
Fig. 4. Cross sections for LNC signals in scenario (S2). In this scenario, no CP violation is present for Dirac neutrinos, so
plots are not shown for these. In this and all other plots, the vertical dotted line represents the value of the heavy neutrino
widths.
This second method has the advantage that it does not depend on K. However, it is more complicated to construct and analyse since two different processes (plus their CP-conjugates) are
required for each asymmetry.
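The two constructions can be written down compactly. The following minimal sketch implements (6.1) and (6.3)-(6.4) as plain functions of cross sections; the function names and the numerical cross sections (in arbitrary units) are made-up illustrative inputs, not results from this analysis.

```python
def a_cp(sigma_wplus, sigma_wminus_conj, K):
    """A_CP of Eq. (6.1): sigma(W+* -> S_i) against K * sigma(W-* -> conjugate S_i)."""
    return (sigma_wplus - K * sigma_wminus_conj) / (sigma_wplus + K * sigma_wminus_conj)

def r_cp(sig_i_plus, sig_j_plus, sig_i_minus, sig_j_minus):
    """R_CP of Eqs. (6.3)-(6.4), built from the ratios R_+ and R_-."""
    r_plus = sig_i_plus / sig_j_plus
    r_minus = sig_i_minus / sig_j_minus
    return (r_plus - r_minus) / (r_plus + r_minus)

# CP-conserving toy input (assumption): every W+ cross section is K times its W- conjugate.
K = 1.5
conserved_a = a_cp(K * 2.0, 2.0, K)
conserved_r = r_cp(K * 2.0, K * 4.0, 2.0, 4.0)

# CP-violating toy input (assumption): the conjugate channel is suppressed.
violating_a = a_cp(K * 2.0, 0.5, K)
print(conserved_a, conserved_r, violating_a)
```

In the CP-conserving toy case both observables vanish, and $K$ drops out of `r_cp` altogether, which is the advantage of the ratio method noted above.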
6.1. LNC processes

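We consider four distinct LNC signals; these are:

(1) $pp \to e^+ \mu^- W^+ X$,
(2) $pp \to e^- \mu^+ W^- X$,
(3) $pp \to e^+ \mu^- W^- X$,
(4) $pp \to e^- \mu^+ W^+ X$.     (6.5)

Signals (1) and (2) are CP-conjugates of each other and likewise (3) and (4). Furthermore, the
cross sections for (3) and (4) are related to those for (1) and (2) through

\[
\sigma\!\left(pp \to e^\mp \mu^\pm W^+ X\right) = K\,\sigma\!\left(pp \to e^\mp \mu^\pm W^- X\right), \tag{6.6}
\]

which holds even when CP is not conserved. Using this relation, $A_{\rm CP}$ as defined in (6.1) can be
expressed without involving $K$, viz.

\[
A_{\rm CP}({\rm LNC}) = \frac{\sigma(pp \to e^+ \mu^- W^\pm X) - \sigma(pp \to e^- \mu^+ W^\pm X)}{\sigma(pp \to e^+ \mu^- W^\pm X) + \sigma(pp \to e^- \mu^+ W^\pm X)}. \tag{6.7}
\]

This has also the advantage that it is not necessary to distinguish between the two $W$ boson
charges experimentally. $R_{\rm CP}$, given in (6.4), can also be re-expressed using (6.6), and can thus
be written as

\[
R_{\rm CP}({\rm LNC}) =
\frac{\dfrac{\sigma(pp \to e^+ \mu^- W^\pm X)}{\sigma(pp \to e^- \mu^+ W^\pm X)} - \dfrac{\sigma(pp \to e^- \mu^+ W^\pm X)}{\sigma(pp \to e^+ \mu^- W^\pm X)}}
{\dfrac{\sigma(pp \to e^+ \mu^- W^\pm X)}{\sigma(pp \to e^- \mu^+ W^\pm X)} + \dfrac{\sigma(pp \to e^- \mu^+ W^\pm X)}{\sigma(pp \to e^+ \mu^- W^\pm X)}}, \tag{6.8}
\]

where again, the two charges of $W$ boson are summed over. However, the observable $R_{\rm CP}$(LNC)
turns out to be closely related to $A_{\rm CP}$(LNC).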
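The relation between the two LNC observables can be made explicit with a short worked step, assuming only the forms (6.7) and (6.8). Writing $r$ for the ratio of the two charge-summed cross sections,

\[
r \equiv \frac{\sigma(pp \to e^+ \mu^- W^\pm X)}{\sigma(pp \to e^- \mu^+ W^\pm X)},
\qquad
A_{\rm CP}({\rm LNC}) = \frac{r - 1}{r + 1},
\qquad
R_{\rm CP}({\rm LNC}) = \frac{r - r^{-1}}{r + r^{-1}} = \frac{r^2 - 1}{r^2 + 1},
\]

and eliminating $r = (1 + A_{\rm CP})/(1 - A_{\rm CP})$ gives

\[
R_{\rm CP}({\rm LNC}) = \frac{2 A_{\rm CP}({\rm LNC})}{1 + A_{\rm CP}({\rm LNC})^2}.
\]

Hence $|R_{\rm CP}({\rm LNC})| \geq |A_{\rm CP}({\rm LNC})|$, with equality only for vanishing or maximal asymmetry.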
The cross sections for the signal processes given in (6.5) are shown in Fig. 4 for scenario (S2)
and Figs. 5 and 6 for scenario (S3). Since scenario (S2) only results in CP violation for Majorana
neutrinos, the results for Dirac neutrinos are not shown. We have plotted the signals as functions of
$m_{N_1}$ and $\Delta m_N$. In the latter plots, the point $\Delta m_N = \Gamma_N$ is marked by a vertical dotted line.⁵ The
CP asymmetries are close to their maxima at this point, which we use for our plots against $m_{N_1}$.
The CP-violating observables given in (6.7) and (6.8) are shown as functions of $\Delta m_N$ in Fig. 7.
Again the point $\Delta m_N = \Gamma_N$ is marked by vertical dotted lines (one each for Dirac and Majorana
neutrinos).
6.2. LNV processes
We now analyse the LNV processes:

(1) $pp \to e^+ e^+ W^- X$,
(2) $pp \to e^- e^- W^+ X$,
(3) $pp \to e^+ \mu^+ W^- X$,
(4) $pp \to e^- \mu^- W^+ X$.     (6.9)

We have not included the di-muon channel processes here since for the two scenarios (S2) and
(S3), there is no CP asymmetry between these. However, interchanging $B_{e2} \leftrightarrow B_{\mu 2}$ in these
scenarios would give asymmetries in the di-muon channel but not the di-electron channel, which
is just as likely.
5 In the scenarios considered, the widths of the two heavy neutrinos are almost equal, as all couplings have the same
magnitude.
Fig. 5. Cross sections for LNC signals in scenario (S3) with Majorana neutrinos.
Fig. 6. As Fig. 5, but with Dirac neutrinos.
Fig. 7. The CP-violating observables ACP (LNC) and RCP (LNC) as defined in (6.7) and (6.8).
Fig. 8. Cross sections for LNV signals in scenario (S2).
Fig. 9. Cross sections for LNV signals in scenario (S3). CP asymmetries are only present for this scenario between signals
with different flavour final state charged leptons.
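As before, $A_{\rm CP}$ and $R_{\rm CP}$ can be defined as follows:

\[
A_{\rm CP}({\rm LNV1}) = \frac{\sigma(pp \to e^+ e^+ W^- X) - K\,\sigma(pp \to e^- e^- W^+ X)}{\sigma(pp \to e^+ e^+ W^- X) + K\,\sigma(pp \to e^- e^- W^+ X)}, \tag{6.10}
\]

\[
A_{\rm CP}({\rm LNV2}) = \frac{\sigma(pp \to e^+ \mu^+ W^- X) - K\,\sigma(pp \to e^- \mu^- W^+ X)}{\sigma(pp \to e^+ \mu^+ W^- X) + K\,\sigma(pp \to e^- \mu^- W^+ X)}, \tag{6.11}
\]

\[
R_{\rm CP}({\rm LNV}) =
\frac{\dfrac{\sigma(pp \to e^+ e^+ W^- X)}{\sigma(pp \to e^+ \mu^+ W^- X)} - \dfrac{\sigma(pp \to e^- e^- W^+ X)}{\sigma(pp \to e^- \mu^- W^+ X)}}
{\dfrac{\sigma(pp \to e^+ e^+ W^- X)}{\sigma(pp \to e^+ \mu^+ W^- X)} + \dfrac{\sigma(pp \to e^- e^- W^+ X)}{\sigma(pp \to e^- \mu^- W^+ X)}}. \tag{6.12}
\]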
Fig. 10. The CP-violating observables $A_{\rm CP}$(LNV1), $A_{\rm CP}$(LNV2) and $R_{\rm CP}$(LNV) as defined in (6.10)-(6.12). In scenario
(S3), $A_{\rm CP}({\rm LNV1}) = 0$ and so $R_{\rm CP}({\rm LNV}) = -A_{\rm CP}({\rm LNV2})$.

The signal cross sections given in (6.9) are plotted in the same manner as the LNC signals were.
These are shown in Fig. 8 for scenario (S2) and Fig. 9 for scenario (S3). Similarly, the CP-violating observables given in (6.10)-(6.12) are shown in Fig. 10. The cross sections for both the
LNC and LNV signals fall very rapidly with increasing heavy neutrino mass. For $|B_{lN}| = 0.05$
and an integrated luminosity of 100 fb$^{-1}$, to produce at least 10 signal events in any single
channel would typically require $m_N \lesssim 300$-$400$ GeV. This could be enough to not just discover
heavy neutrinos, but also observe resonant CP violation as well. As can be seen from the plots in
Figs. 7 and 10, the CP asymmetries in scenarios (S2) and (S3) are very large for $\Delta m_N \sim \Gamma_N$.
With this condition, and taking $m_{N_1} = 300$ GeV for example, scenario (S2) would result in about
20 signal events for $pp \to e^+ e^+ W^- X$ without a single $pp \to e^- e^- W^+ X$ event likely to be
observed.
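The event-count statements translate into cross sections through $N = \sigma \times \int\!\mathcal{L}\,dt$. As a minimal sketch of that arithmetic (the function names are hypothetical, and detection efficiencies are ignored as a simplifying assumption):

```python
def expected_events(sigma_fb, luminosity_inv_fb=100.0):
    """Expected number of signal events for a cross section in fb and a luminosity in fb^-1."""
    return sigma_fb * luminosity_inv_fb

def min_cross_section_fb(n_events, luminosity_inv_fb=100.0):
    """Smallest cross section (fb) that yields n_events at the given luminosity."""
    return n_events / luminosity_inv_fb

threshold_fb = min_cross_section_fb(10)  # cross section needed for 10 events at 100 fb^-1
twenty_events_fb = min_cross_section_fb(20)
print(threshold_fb, twenty_events_fb, expected_events(twenty_events_fb))
```

At 100 fb$^{-1}$ the 10-event criterion therefore corresponds to a signal cross section of 0.1 fb per channel, and the roughly 20 events quoted for scenario (S2) to about 0.2 fb.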
In the limit $\Delta m_N = 0$, the asymmetries vanish. This is as expected from Section 5.2, since
there it is shown that for couplings of equal magnitude, the asymmetries are proportional to the
mass-splitting of the heavy neutrinos. For the two CP-violating scenarios considered, the CP
asymmetries are larger for Majorana neutrinos than for Dirac ones. However, more examples
would need to be calculated to determine if this is a general trend. Since they are constructed out
of the asymmetries between two independent processes and their CP-conjugates, it should come
as no surprise that the $R_{\rm CP}$ asymmetries are larger than the $A_{\rm CP}$ ones. The one exception to this
is $R_{\rm CP}$(LNV) in scenario (S3). This is because there is no asymmetry between one of the two
pairs of CP-conjugate processes it is constructed from. For $\Delta m_N = \Gamma_N$, both $A_{\rm CP}$ and $R_{\rm CP}$ are
independent of the heavy neutrino mass scale, although this is not obvious from our plots since
$K$ does depend on $m_N$.
7. Conclusions
Based on the resummation formalism developed in [11,16], we have calculated the $2 \times 2$ heavy
neutrino propagator matrix for both Dirac and Majorana neutrinos. Our results apply to scenarios
for which the two heavy neutrinos that are nearly degenerate in mass dominate the production
cross sections. The formalism involves the absorptive part of the heavy neutrino self-energy,
which is given to one-loop, again for both Dirac and Majorana neutrinos.
Assuming a two heavy neutrino mixing system, we have given numerical estimates for the
production cross sections of LFV and LNV heavy neutrino signal processes at the LHC. These
are of the form $pp \to l l' W X$, with the final state particle charges (not including the beam remnants $X$) adding up to $\pm 1$. For Dirac neutrinos, the leptons have to have opposite charges and so,
in order to avoid large backgrounds, $l$ and $l'$ should be different. For Majorana neutrinos, LNV
signals of the form $pp \to l^\pm l'^\pm W^\mp X$ are also allowed, where $l$ and $l'$ can (but do not have to)
be equal. The SM background to either of these types of signals should be negligible after cuts.
For the magnitudes of couplings we used ($|B_{lN}| = 0.05$), heavy neutrinos will be observable at
the LHC if they have masses between about 100 GeV and about 400 GeV. If the heavy neutrinos
are much lighter than 100 GeV, the LEP data put severe limits on the B-couplings, leading to
unobservable signal cross sections.
We have plotted the signal cross sections in three different possible scenarios for the couplings of the heavy neutrinos. In the first, all the couplings are real. This is the CP-conserving
limit which is needed to calculate the asymmetry due to the different PDFs for producing W +
and W bosons. The second and third scenarios are chosen as examples in which large CP asymmetries exist in a number of channels. CP-violating observables constructed from the signal cross
sections are also plotted. For a mass-splitting of the two neutrinos of order their widths, these
asymmetries can be very large or even maximal, giving rise to resonant CP violation. It should
be noted that these scenarios require additional flavour symmetries in the heavy neutrino sector
in order to be possible. Experimental constraints, in particular those from the lack of observation
of $\mu \to e\gamma$, require contributions from additional particles of new physics (either further heavy
neutrinos or something else) to be satisfied. These extra particles would need to cause quite large
cancellations in the rate of this process without making a significant contribution to the signals
of interest. Also, for Majorana neutrinos, there are theoretical constraints that require at least two
additional heavy neutrinos (four in total) to be present in the theory in order to be satisfied. Nevertheless, our results demonstrate that very large CP asymmetries are possible at the LHC. The
couplings used have not been motivated in any fashion, but neither are they unique. In particular
it should be kept in mind that an interchange of charged lepton labels gives scenarios which are
no more or less likely.
Among the different processes we have been studying here, the most realistic modes to
look for large CP asymmetries at the LHC are the di-muon or di-electron channels. These
processes are LNV and hence are only allowed for Majorana neutrinos. For these channels, it
is possible to have the heavy neutrino couplings to either the electron or muon heavily suppressed, hence satisfying the experimental constraints. It may then be possible to have resonant
CP violation in these channels without resorting to additional flavour symmetries or cancellations.
The analysis presented here could be extended to other colliders, in particular to the ILC.
Here the signals e+ e l W can be considered which are CP-conjugates of each other. The
ILC should be a cleaner experimental environment and depending on its cms energy, may well
be able to produce heavy neutrinos of considerably larger mass. However, the ILC would suffer
compared to the LHC in the search for heavy neutrinos in the sense that the SM background
cannot so easily be suppressed. Linked to this, evidence of L violation and hence whether or
not neutrinos are Majorana particles is far harder to obtain. Also, an observable signal requires a
significant coupling of the heavy neutrino to the electron, a requirement not shared by the LHC.
In summary, resonant CP violation due to electroweak-scale heavy neutrinos is an interesting
possibility that might well be observed for the first time at the LHC. If realised, this could, in
principle, give an explanation for the BAU through resonant leptogenesis. The observation of
electroweak-scale heavy neutrinos may not directly unravel the detailed structure of the light
neutrino mass matrix, but it will naturally point towards scenarios based on some approximate
lepton-number or flavour symmetry (see also our comment in Section 2.1). This approximate
lepton-number symmetry may be imprinted into the relative decay rates of the heavy neutrinos
into the different charged-lepton flavours $e$, $\mu$ and $\tau$, from which an estimate of the large elements of the Dirac mass matrix $m_D$ could, in principle, be obtained. Possible new signatures
or constraints from low-energy experiments, including the low-energy neutrino oscillation data,
may offer valuable information to restrict the form of $m_D$ and the light-to-heavy neutrino-mixing
matrix $m_D/m_M$. A detailed analysis of possible correlations between resonant CP violation due
to heavy neutrinos which we have been studying here and other observables may be the subject
of a future communication.
Note added
Shortly after communicating our paper, we became aware of a new analysis of the background
contributing to the heavy neutrino signals [34]. According to this analysis, previous studies have
grossly underestimated the background by a factor of 30, because they did not take into account high-order pile-up processes of jets which cannot be reduced by various kinematic cuts.
As a consequence, those authors find that the LHC will be capable of only probing relatively
light heavy neutrinos of masses up to 175 GeV, thus restricting the results of our study accordingly. Nonetheless, it should be noticed that the background of high multiplicity jet processes
stems predominantly from colour non-singlet states, unlike the 2 distinct jets in the signal which
originate from decays of the colourless heavy neutrinos or W bosons. Hence, the signal and
background processes are expected to show a different topology of hadronic activities in the central region, which could be used to eliminate the contribution of the high-order pile-up processes
of jets, with the hope to extend the reach of the LHC to heavier Majorana neutrinos. Such an
analysis lies beyond the scope of this paper.
Acknowledgements
We thank Mrinal Dasgupta and Jeff Forshaw for a discussion of hadronization and colour interference effects. The work of S.B. has been funded by the PPARC studentship PPA/S/S/2003/03666.
The work of J.S.L. was supported in part by the Korea Research Foundation (KRF) and
the Korean Federation of Science and Technology Societies Grant and in part by the KRF grant:
KRF-2005-084-C00001 funded by the Korea Government (MOEHRD, Basic Research Promotion Fund). The work of A.P. was supported in part by the PPARC grants: PP/D0000157/1 and
PP/C504286/1.
References
[1] Super-Kamiokande Collaboration, Y. Fukuda, et al., Phys. Rev. Lett. 81 (1998) 1562, hep-ex/9807003;
CHOOZ Collaboration, M. Apollonio, et al., Phys. Lett. B 466 (1999) 415, hep-ex/9907037;
SNO Collaboration, Q.R. Ahmad, et al., Phys. Rev. Lett. 89 (2002) 011301, nucl-ex/0204008;
K2K Collaboration, M.H. Ahn, et al., Phys. Rev. Lett. 90 (2003) 041801, hep-ex/0212007;
KamLAND Collaboration, K. Eguchi, et al., Phys. Rev. Lett. 90 (2003) 021802, hep-ex/0212021.
[2] L3 Collaboration, P. Achard, et al., Phys. Lett. B 517 (2001) 67, hep-ex/0107014.
[3] A. Pilaftsis, Z. Phys. C 55 (1992) 275, hep-ph/9901206.
[4] A. Datta, M. Guchait, A. Pilaftsis, Phys. Rev. D 50 (1994) 3195, hep-ph/9311257;
F.M.L. Almeida Jr., Y.A. Coutinho, J.A. Martins Simoes, M.A.B. do Vale, Phys. Rev. D 62 (2000) 075004, hep-ph/0002024;
O. Panella, M. Cannoni, C. Carimalo, Y.N. Srivastava, Phys. Rev. D 65 (2002) 035005, hep-ph/0107308.
[5] T. Han, B. Zhang, Phys. Rev. Lett. 97 (2006) 171804, hep-ph/0604064.
[6] F. del Aguila, J.A. Aguilar-Saavedra, R. Pittau, J. Phys. Conf. Ser. 53 (2006) 506, hep-ph/0606198.
[7] J. Gluza, M. Zralek, Phys. Rev. D 55 (1997) 7030, hep-ph/9612227;
G. Cvetic, C.S. Kim, C.W. Kim, Phys. Rev. Lett. 82 (1999) 4761, hep-ph/9812525;
J.F.M.L. Almeida, Y.A. Coutinho, J.A. Martins Simoes, M.A.B. do Vale, Phys. Rev. D 63 (2001) 075005, hep-ph/0008201.
[8] F. del Aguila, J.A. Aguilar-Saavedra, A. Martinez de la Ossa, D. Meloni, Phys. Lett. B 613 (2005) 170, hep-ph/0502189.
[9] J. Peressutti, O.A. Sampayo, J.I. Aranda, Phys. Rev. D 64 (2001) 073007, hep-ph/0105162;
S. Bray, J.S. Lee, A. Pilaftsis, Phys. Lett. B 628 (2005) 250, hep-ph/0508077.
[10] M. Flanz, E.A. Paschos, U. Sarkar, Phys. Lett. B 345 (1995) 248, hep-ph/9411366;
M. Flanz, E.A. Paschos, U. Sarkar, J. Weiss, Phys. Lett. B 389 (1996) 693, hep-ph/9607310;
L. Covi, E. Roulet, F. Vissani, Phys. Lett. B 384 (1996) 169, hep-ph/9605319.
[11] A. Pilaftsis, Phys. Rev. D 56 (1997) 5431, hep-ph/9707235.
[12] A. Pilaftsis, T.E.J. Underwood, Nucl. Phys. B 692 (2004) 303, hep-ph/0309342;
A. Pilaftsis, T.E.J. Underwood, Phys. Rev. D 72 (2005) 113001, hep-ph/0506107.
[13] B. Garbrecht, C. Pallis, A. Pilaftsis, JHEP 0612 (2006) 038, hep-ph/0605264;
G.C. Branco, A.J. Buras, S. Jager, S. Uhlig, A. Weiler, hep-ph/0609067.
[14] J. Pumplin, et al., JHEP 0207 (2002) 012, hep-ph/0201195.
[15] A.D. Martin, R.G. Roberts, W.J. Stirling, R.S. Thorne, Eur. Phys. J. C 28 (2003) 455, hep-ph/0211080.
[16] A. Pilaftsis, Nucl. Phys. B 504 (1997) 61, hep-ph/9702393.
[17] H. Fritzsch, P. Minkowski, Ann. Phys. 93 (1975) 193.
[18] P. Minkowski, Phys. Lett. B 67 (1977) 421;
M. Gell-Mann, P. Ramond, R. Slansky, in: P. van Nieuwenhuizen, D. Friedman (Eds.), Supergravity, North-Holland,
Amsterdam, 1979, p. 315;
T. Yanagida, in: O. Sawada, A. Sugamoto (Eds.), Proceedings of the Workshop on the Unified Theories and Baryon
Number of the Universe, KEK, Tsukuba, 1979;
R.N. Mohapatra, G. Senjanovic, Phys. Rev. Lett. 44 (1980) 912.
[19] J. Ellis, M.E. Gomez, S. Lola, hep-ph/0612292.
[20] E. Witten, Nucl. Phys. B 268 (1986) 79;
R.N. Mohapatra, J.W.F. Valle, Phys. Rev. D 34 (1986) 1642.
[21] J. Gluza, Acta Phys. Pol. B 33 (2002) 1735, hep-ph/0201002;
G. Altarelli, F. Feruglio, New J. Phys. 6 (2004) 106, hep-ph/0405048.
[22] D. Wyler, L. Wolfenstein, Nucl. Phys. B 218 (1983) 205.
[23] S. Nandi, U. Sarkar, Phys. Rev. Lett. 56 (1986) 564;
J.W.F. Valle, Prog. Part. Nucl. Phys. 26 (1991) 91.
[24] P. Langacker, D. London, Phys. Rev. D 38 (1988) 886.
[25] B. Pontecorvo, Sov. Phys. JETP 6 (1957) 429;
B. Pontecorvo, Sov. Phys. JETP 7 (1958) 172;
Z. Maki, M. Nakagawa, S. Sakata, Prog. Theor. Phys. 28 (1962) 870.
[26] T.P. Cheng, L.-F. Li, Phys. Rev. Lett. 45 (1980) 1908;
J.G. Korner, A. Pilaftsis, K. Schilcher, Phys. Lett. B 300 (1993) 381, hep-ph/9301290;
J. Bernabeu, J.G. Korner, A. Pilaftsis, K. Schilcher, Phys. Rev. Lett. 71 (1993) 2695, hep-ph/9307295;
C.P. Burgess, S. Godfrey, H. Konig, D. London, I. Maksymyk, Phys. Rev. D 49 (1994) 6115, hep-ph/9312291;
E. Nardi, E. Roulet, D. Tommasini, Phys. Lett. B 327 (1994) 319, hep-ph/9402224;
G. Bhattacharya, P. Kalyniak, I. Melo, Phys. Rev. D 51 (1995) 3569, hep-ph/9503248;
F. Deppisch, T.S. Kosmas, J.W.F. Valle, Nucl. Phys. B 752 (2006) 80, hep-ph/0512360.
[27] A. Ilakovac, A. Pilaftsis, Nucl. Phys. B 437 (1995) 491, hep-ph/9403398.
[28] S. Bergmann, A. Kagan, Nucl. Phys. B 538 (1999) 368, hep-ph/9803305.
[29] J.I. Illana, T. Riemann, Phys. Rev. D 63 (2001) 053004, hep-ph/0010193;
G. Cvetic, C. Dib, C.S. Kim, J.D. Kim, Phys. Rev. D 66 (2002) 034008, hep-ph/0202212.
[30] BaBar Collaboration, B. Aubert, et al., Phys. Rev. Lett. 95 (2005) 041802, hep-ex/0502032;
BaBar Collaboration, B. Aubert, et al., Phys. Rev. Lett. 96 (2006) 041801, hep-ex/0508012.
[31] G. Belanger, F. Boudjema, D. London, H. Nadeau, Phys. Rev. D 53 (1996) 6292, hep-ph/9508317.
[32] J.M. Cornwall, J. Papavassiliou, Phys. Rev. D 40 (1989) 3474;
J. Papavassiliou, Phys. Rev. D 41 (1990) 3179;
D. Binosi, J. Papavassiliou, Phys. Rev. D 66 (2002) 111901, hep-ph/0208189;
D. Binosi, J. Papavassiliou, J. Phys. G 30 (2004) 203, hep-ph/0301096.
[33] J. Papavassiliou, A. Pilaftsis, Phys. Rev. Lett. 75 (1995) 3060, hep-ph/9506417;
J. Papavassiliou, A. Pilaftsis, Phys. Rev. D 53 (1996) 2128, hep-ph/9507246;
J. Papavassiliou, A. Pilaftsis, Phys. Rev. D 54 (1996) 5315, hep-ph/9605385.
[34] F. del Aguila, J.A. Aguilar-Saavedra, R. Pittau, hep-ph/0703261.
Moduli space of torsional manifolds

Melanie Becker a, Li-Sheng Tseng b,c,*, Shing-Tung Yau c
a George P. and Cynthia W. Mitchell Institute for Fundamental Physics, Texas A&M University,
College Station, TX 77843, USA

b Jefferson Physical Laboratory, Harvard University, Cambridge, MA 02138, USA
c Department of Mathematics, Harvard University, Cambridge, MA 02138, USA
Received 26 January 2007; received in revised form 2 July 2007; accepted 9 July 2007
Abstract
We characterize the geometric moduli of non-Kähler manifolds with torsion. Heterotic supersymmetric
flux compactifications require that the six-dimensional internal manifold be balanced, the gauge bundle be
Hermitian Yang-Mills, and also the anomaly cancellation be satisfied. We perform the linearized variation
of these constraints to derive the defining equations for the local moduli. We explicitly determine the metric
deformations of the smooth flux solution corresponding to a torus bundle over K3.
1. Introduction
Ever since the discovery of Calabi-Yau compactifications [1], string theorists have tried to
make the connection to the Minimal Supersymmetric Standard Model (MSSM) and grand unified
theories (GUT). This turned out to be a difficult problem, as exotic particles often appear
along the way. These are particles that play no role in the current version of the MSSM.¹ Recently,
Refs. [2,3] have made a rather interesting proposal for three-generation models without exotics in the
context of Calabi-Yau compactifications of the heterotic string.²
* Corresponding author at: Department of Mathematics, Harvard University, Cambridge, MA 02138, USA.
E-mail address: tseng@math.harvard.edu (L.-S. Tseng).

1 It is, of course, possible that additional particles not known at present might be discovered, leading to an extension
of the MSSM.
2 String duality implies that in principle one could get realistic models in the context of type II theories. A concrete
proposal has been made recently in terms of a D3-brane in the presence of a dP8 singularity [4]. Alternatively, one could
use intersecting D-brane models. For a review see [5].
doi:10.1016/j.nuclphysb.2007.07.006
M. Becker et al. / Nuclear Physics B 786 (2007) 119-134
Even though these models have some rather interesting features, it is not possible to predict
with them the values of the coupling constants of the Standard Model, because compactifications
on conventional Calabi-Yau manifolds lead to unfixed moduli, and therefore additional
massless scalars. This issue can only be addressed in the context of flux compactifications, which
are known to lift the moduli [6,7].
If flux compactifications are considered in the context of the heterotic theory, the resulting
internal geometry is a non-Kähler manifold with torsion [8-10]. Simple examples of such compactifications were constructed in [11,12] in the orbifold limit and a smooth compactification was
constructed in [13,14] in terms of a $T^2$ bundle over K3. See [15-18] for some related works. It
would be extremely exciting to construct a torsional manifold with all the features of the MSSM.
At present, we are not yet at such a stage. Many properties of Calabi-Yau manifolds are not shared
by non-Kähler manifolds with torsion, so that well-known aspects of Calabi-Yau manifolds need
to be rederived for these manifolds.
One of the important open questions is to understand how to characterize the scalar massless fields, in other words, the moduli space of heterotic flux compactifications. We investigate
this question by analyzing the local moduli space emerging in such compactifications from a
spacetime approach. A massless scalar field in the effective four-dimensional theory emerges for
each independent modulus of the background geometry. Thus, the dimension of the moduli space
corresponds to the number of massless scalar fields in the theory. In our analysis, we restrict to
supersymmetric deformations, as we expect the analysis of the supersymmetry constraints to be
easier than the analysis of the equations of motion. While the latter equations are corrected by $R^2$
terms, the form of the supersymmetry transformations is not modified to $R^2$ order, as long as the
heterotic anomaly cancellation condition is imposed [19]. That a solution of both the supersymmetry constraints and the modified Bianchi identity is also a solution to the equations of motion
has been shown in [20,21].
Unlike the Calabi-Yau case, the supersymmetry constraint equations in general couple the various
fields non-linearly, and thus the analysis even at the linearized variation level is non-trivial.
As an example of our general analysis, we shall give the description of the scalar metric moduli
for the smooth solution of a $T^2$ bundle over K3 presented in [13,14]. It is an interesting question
to understand whether the massless moduli found in our approach are lifted by higher order
terms in the low energy effective action. For conventional Calabi-Yau compactifications it is
known that moduli fields appearing in the leading order equations will remain massless even if
higher order corrections are taken into account [22,23]. In our case, such an analysis has not been
performed yet from the spacetime point of view, though the question can be answered from the
world-sheet approach recently developed in [18]. In this work, a gauged linear sigma model was
constructed which in the IR flows to an interacting conformal field theory. The analysis of the
linear model indicates that massless fields emerging at leading order in $\alpha'$ will remain massless,
even if corrections to the spacetime action are taken into account.
This paper is organized as follows. In Section 2, we perform the linear variation of the supersymmetry constraints. In Section 3, we analyze the variation of the $T^2$ bundle over K3 solution
and discuss its local moduli space. In Section 4, conclusions and future directions are presented.
In Appendix A, we clarify some of the mathematical notation that we used.
2. Determining equations for the moduli fields
The non-Kähler manifolds with torsion $M$ that we are interested in are complex manifolds
described in terms of a Hermitian form which is related to the metric

$$J = i\,g_{a\bar b}\,dz^a\wedge d\bar z^{\bar b}, \tag{2.1}$$

and a nowhere-vanishing holomorphic three-form

$$d\Omega = 0, \tag{2.2}$$

satisfying $J\wedge\Omega = 0$. The geometry can be deformed by either deforming the Hermitian form
or deforming the complex structure of $M$. We are interested in deformations that preserve the
supersymmetry constraints as well as the anomaly cancellation condition.
$N = 1$ supersymmetry for heterotic flux compactifications to four spacetime dimensions imposes three conditions: the internal geometry has to be conformally balanced, the gauge bundle
satisfies the Hermitian Yang-Mills equation, and the $H$-flux satisfies the anomaly cancellation
condition. Explicitly, they are [13,14]

$$d\big(\|\Omega\|_J\,J\wedge J\big) = 0, \tag{2.3}$$

$$F^{(2,0)} = F^{(0,2)} = 0,\qquad F_{mn}J^{mn} = 0, \tag{2.4}$$

$$2i\,\partial\bar\partial J = \frac{\alpha'}{4}\big[\mathrm{tr}(R\wedge R) - \mathrm{tr}(F\wedge F)\big]. \tag{2.5}$$

Above, we have replaced the two standard background fields, the three-form $H$ and the dilaton field $\phi$, with the required supersymmetric relations

$$H = i(\bar\partial - \partial)J, \tag{2.6}$$

$$\|\Omega\|_J = e^{-2(\phi+\phi_0)}. \tag{2.7}$$

Doing so allows us to consider the constraint equations solely in terms of the geometrical data
$(J, \Omega)$ and the gauge bundle.
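As a quick consistency check (a sketch that uses only $d = \partial + \bar\partial$, $\partial^2 = \bar\partial^2 = 0$ and $\partial\bar\partial = -\bar\partial\partial$), the relation (2.6) shows that the left-hand side of (2.5) is simply $dH$, so that (2.5) is the heterotic Bianchi identity written in terms of $J$:

```latex
\begin{align*}
dH &= (\partial + \bar\partial)\, i(\bar\partial - \partial) J
    = i\big(\partial\bar\partial - \bar\partial\partial\big) J
    = 2i\,\partial\bar\partial J .
\end{align*}
```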
Deformations of the metric that are of pure type, i.e. (0, 2) or (2, 0), describe deformations of
the complex structure

$$\Omega_{ab}{}^{\bar d}\,\delta g_{\bar d\bar c}\,dz^a\wedge dz^b\wedge d\bar z^{\bar c}, \tag{2.8}$$

while deformations of mixed type, i.e. of type (1, 1), describe deformations of the Hermitian
form

$$i\,\delta g_{a\bar b}\,dz^a\wedge d\bar z^{\bar b}. \tag{2.9}$$

We analyze below the linear variation of the three constraint equations (2.3)-(2.5) with respect to a background solution. For simplicity, we shall keep the complex structure of the
six-dimensional internal geometry fixed. For the moduli space of Calabi-Yau compactifications,
it turns out that the Kähler and complex structure deformations decouple from one another [24].
It would be interesting to determine whether some decoupling still persists in the non-Kähler
case and more generally how the Hermitian and complex structure deformations are coupled. We
will leave this more general analysis for future work.
2.1. Conformally balanced condition
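We consider the linear variation of the conformally balanced condition (2.3). We shall vary
the metric or Hermitian form $J_{a\bar b} = i g_{a\bar b}$ while holding fixed the complex structure. Let

$$J'_{a\bar b} = J_{a\bar b} + \delta J_{a\bar b}, \tag{2.10}$$

then we have to first order in $\delta J$,

$$J'\wedge J' = J\wedge J + 2J\wedge\delta J, \tag{2.11}$$

$$\|\Omega\|^2_{J'} = \frac{|g_{a\bar b}|}{|g'_{a\bar b}|}\,\|\Omega\|^2_J = \frac{|g_{a\bar b}|}{|g_{a\bar b}|\,(1 + g^{c\bar d}\delta g_{c\bar d})}\,\|\Omega\|^2_J = \big(1 - g^{c\bar d}\delta g_{c\bar d}\big)\|\Omega\|^2_J. \tag{2.12}$$

Note that (2.7) with (2.12) imply the dilaton variation

$$\delta\phi = \frac14\,g^{a\bar b}\delta g_{a\bar b} = \frac18\,J^{mn}\delta J_{mn}. \tag{2.13}$$

The linear variation of the conformally balanced condition can be written as

$$d\big(\|\Omega\|_{J'}\,J'\wedge J'\big) = d\big(\|\Omega\|_J\,J\wedge J + 2\,\delta\rho\big) = 0, \tag{2.14}$$

where $\delta\rho$ is a four-form given by

$$\delta\rho = \|\Omega\|_J\Big[J\wedge\delta J - \frac18\,(J\wedge J)\,J^{mn}\delta J_{mn}\Big]. \tag{2.15}$$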
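The first-order expansion in (2.12) and the resulting dilaton variation (2.13) can be checked numerically. The sketch below is illustrative only: the random Hermitian matrices, the function names and the step sizes are assumptions made for this example, and the identification $\|\Omega\|_J = e^{-2\phi}$ (constant $\phi_0$ dropped) follows (2.7).

```python
import numpy as np


def norm_sq_ratio(g, dg):
    """Exact ratio |g| / |g + dg|, i.e. ||Omega||^2_{J'} / ||Omega||^2_J."""
    return (np.linalg.det(g) / np.linalg.det(g + dg)).real


def linearized_ratio(g, dg):
    """First-order ratio 1 - g^{c dbar} dg_{c dbar}, as in (2.12)."""
    return 1.0 - np.trace(np.linalg.solve(g, dg)).real


def dilaton_shift_exact(g, dg):
    """delta phi from ||Omega||_J = exp(-2 phi): -(1/4) log of the squared-norm ratio."""
    return -0.25 * np.log(norm_sq_ratio(g, dg))


def dilaton_shift_linear(g, dg):
    """delta phi = (1/4) g^{a bbar} dg_{a bbar}, as in (2.13)."""
    return 0.25 * np.trace(np.linalg.solve(g, dg)).real


# Hypothetical test data: a random positive-definite Hermitian 3x3 metric
# and a random Hermitian perturbation direction.
rng = np.random.default_rng(0)
a = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
g = a @ a.conj().T + 3.0 * np.eye(3)
b = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
h = 0.5 * (b + b.conj().T)

errors = []
for eps in (1e-2, 1e-3):
    dg = eps * h
    # Difference between the exact and the linearized squared-norm ratio.
    errors.append(abs(norm_sq_ratio(g, dg) - linearized_ratio(g, dg)))
    print(eps, errors[-1],
          abs(dilaton_shift_exact(g, dg) - dilaton_shift_linear(g, dg)))
```

The discrepancies are of second order in the perturbation, which is what one expects if (2.12) and (2.13) hold to first order in $\delta g$.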
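We can invert (2.15) and express $\delta J$ in terms of the four-form $\delta\rho$ of (2.15). To do this, we note that any (2, 2)-form, $\omega_4$, can be Lefschetz decomposed as follows

$$\omega_4 = L\Lambda\,\omega_4 - \frac14\,L^2\Lambda^2\,\omega_4, \tag{2.16}$$

where the Lefschetz operator $L$ and its adjoint $\Lambda$ have the following action on exterior forms

$$L:\ \omega\to J\wedge\omega,\qquad \Lambda:\ \omega\to J\,\lrcorner\,\omega. \tag{2.17}$$

Comparing (2.15) with (2.16), we find the relation

$$\delta J_{mn} = \frac{1}{2\|\Omega\|_J}\,\delta\rho_{mnrs}\,J^{rs}. \tag{2.18}$$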
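The inversion can be checked directly. The following is a short sketch, not taken from the text: it writes $\delta\rho$ for the four-form of (2.15) and assumes the standard conventions on a complex threefold, namely $\Lambda J = 3$ (equivalently $J^{mn}J_{mn} = 6$), $[\Lambda, L] = (3-k)$ on $k$-forms, and $(\Lambda\omega_4)_{mn} = \tfrac12 J^{rs}(\omega_4)_{rsmn}$ up to the index-ordering convention. Split $\delta J = \delta J_0 + c\,J$ with $\Lambda\,\delta J_0 = 0$, so that $J^{mn}\delta J_{mn} = 6c$.

```latex
\begin{align*}
\frac{\delta\rho}{\|\Omega\|_J}
  &= J\wedge\delta J_0 + c\,J\wedge J - \frac{6c}{8}\,J\wedge J
   = L\,\delta J_0 + \frac{c}{4}\,L J ,\\
\Lambda L\,\delta J_0 &= L\Lambda\,\delta J_0 + (3-2)\,\delta J_0 = \delta J_0 ,\\
\Lambda L J &= L\Lambda J + (3-2)\,J = 3J + J = 4J ,\\
\Lambda\,\frac{\delta\rho}{\|\Omega\|_J} &= \delta J_0 + c\,J = \delta J .
\end{align*}
```

In components the last line is the statement of (2.18), under the assumed normalization of $\Lambda$.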
From the linear variation of Eq. (2.14), we observe that the allowed deformations (i.e. those which
preserve the conformally balanced condition) satisfy $d(\delta\rho) = 0$. Eq. (2.18) implies that any variation of the Hermitian metric can be expressed in terms of a variation by a closed (2, 2)-form.
Equivalently, we can also express the linear variation condition directly for the Hermitian metric
as

$$d^\dagger\Big[\|\Omega\|_J\Big(\delta J - \frac14\,J\,J^{mn}\delta J_{mn}\Big)\Big] = 0, \tag{2.19}$$
where $\star J = \tfrac12\,J\wedge J$.
Note that $\delta J$ variations that are equivalent to a coordinate transformation (i.e. a diffeomorphism) are physically unobservable and must therefore be quotiented out. Under an infinitesimal
coordinate transformation

$$y'^m = y^m + v^m(y), \tag{2.20}$$

the variation of a $p$-form $\omega_p$ is given by the Lie derivative

$$\delta\omega_p = \mathcal{L}_v\,\omega_p = i_v(d\omega_p) + d(i_v\,\omega_p), \tag{2.21}$$

where $v = v^m\partial_m$ is a vector field and $i_v$ denotes the interior product. For the conformally balanced four-form, a coordinate transformation results in

$$\mathcal{L}_v\big(\|\Omega\|_J\,J\wedge J\big) = d\big(i_v(\|\Omega\|_J\,J\wedge J)\big). \tag{2.22}$$

We can thus identify, as physically not relevant, variations that are exterior derivatives of a
non-primitive three-form

$$\delta\rho \sim d\big(\|\Omega\|_J\,\beta\wedge J\big), \tag{2.23}$$

where $\beta_m = v^n J_{nm}$. Using (2.18), this corresponds to deformations of the Hermitian form

$$\delta J \sim \frac{1}{\|\Omega\|_J}\,\Lambda\,d\big(\|\Omega\|_J\,\beta\wedge J\big). \tag{2.24}$$
Let us now interpret the content of the above variation formulas. By the identification
of (2.18), variations of the Hermitian metric that preserve the conformally balanced condition
can be parametrized by closed (2, 2)-forms. Moreover, modding out by diffeomorphisms results
in the cohomology³

$$\frac{\ker(d)\cap\Lambda^{2,2}}{d(\beta\wedge J)}. \tag{2.25}$$

Thus, the space of conformally balanced metrics is equivalent to the space of closed (2, 2)-forms
modded out by those which are exterior derivatives of a non-primitive three-form. But notice
that exact forms which are exterior derivatives of a primitive three-form are not quotiented out.
Hence, if there exists such a primitive three-form, $\omega_3^0$, then the space of balanced metrics is
infinite-dimensional. This is because $d(f\,\omega_3^0)$, where $f$ is any real function, would be closed but
not modded out.
The cohomology of (2.25) can also be expressed directly in terms of (1, 1)-forms. From (2.19),
every co-closed (1, 1)-form defines a metric deformation preserving the conformally balanced
condition. To see this explicitly, we note that any (1, 1)-form can be Lefschetz decomposed as
follows

$$C_{mn} = (C_0)_{mn} + \frac16\,J_{mn}J^{rs}C_{rs} \equiv (C_0)_{mn} + \frac13\,J_{mn}\tilde C, \tag{2.26}$$

where $C_0$ denotes the primitive part and $\tilde C = \frac12 J^{rs}C_{rs}$ encodes the non-primitivity of $C_{mn}$. We
can therefore re-express (2.19) as

$$0 = d^\dagger\Big[\|\Omega\|_J\Big(\delta J - \frac12\,J\,\delta\tilde J\Big)\Big] = d^\dagger\Big[\|\Omega\|_J\Big(\delta J_0 - \frac16\,J\,\delta\tilde J\Big)\Big] = d^\dagger C, \tag{2.27}$$

where we have defined a new (1, 1)-form $C = C_0 + \frac13 J\,\tilde C$ with $C_0 = \|\Omega\|_J\,\delta J_0$ and $\tilde C = -\frac12\|\Omega\|_J\,\delta\tilde J$.

3 Note that complex structures are also defined up to diffeomorphism. So any diffeomorphism generated by a real
vector field will keep the complex structure in the same equivalence class.

Furthermore, variations associated with diffeomorphisms can be written as

J d J J ,
(2.28)
so that we have

1

J J J d ( J ),
2
(2.29)
= J n . Eqs. (2.27) and (2.28) together imply the cohomology

where m
m n
J
ker(d ) 1,1
.
d ( J )
(2.30)
Therefore, the local moduli space can also be described as spanning all co-closed (1, 1)-forms
modulo those which are $d^\dagger$ of non-primitive three-forms. This space is isomorphic to that
of (2.25) and is in general infinite-dimensional. We have, however, yet to consider the two
other supersymmetry constraints. Imposing them, especially the anomaly cancellation condition,
will greatly reduce the number of allowed deformations and render the moduli space finite-dimensional. This can be seen clearly in the $T^2$ bundle over K3 example discussed in the next
section.
Finally, let us point out that if we had taken into consideration variations of the complex
structure, then a $\delta J$ variation will in general include also a (2, 0) and a (0, 2) part. Nevertheless,
$J + \delta J$ must still be a (1, 1)-form with respect to the deformed complex structure as is required
by supersymmetry.
2.2. Hermitian Yang-Mills condition
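Any variation of the Hermitian gauge connection with the complex structure held fixed
will preserve the holomorphic condition $F^{(2,0)} = F^{(0,2)} = 0$. As for the primitivity condition
$F_{mn}J^{mn} = 0$, we shall vary its equivalent form

$$0 = \delta(F\wedge J\wedge J) = \delta F\wedge J^2 + 2F\wedge J\wedge\delta J. \tag{2.31}$$

The Hermitian field strength $F$ can be written as

$$F_{\bar a b}{}^{\alpha}{}_{\beta} = \bar\partial_{\bar a}A_b{}^{\alpha}{}_{\beta} = \bar\partial_{\bar a}\big(h^{\alpha\bar\gamma}\,\partial_b h_{\beta\bar\gamma}\big) = \bar\partial_{\bar a}\big(\bar h^{-1}\partial_b\bar h\big)^{\alpha}{}_{\beta}, \tag{2.32}$$

where $\alpha$, $\beta$, $\gamma$ are gauge indices and $\bar h = h^{T}$ is the transpose of the Hermitian metric on the
gauge bundle. Under the variation, $\bar h' = \bar h + \delta\bar h$, the gauge field varies as

$$\begin{aligned}
\delta A = A' - A &= \bar h^{-1}\partial(\delta\bar h) + \delta(\bar h^{-1})\,\partial\bar h\\
&= \partial(\bar h^{-1}\delta\bar h) - \partial(\bar h^{-1})\,\delta\bar h - \bar h^{-1}\delta\bar h\,\bar h^{-1}\partial\bar h\\
&= \partial(\bar h^{-1}\delta\bar h) + A\,\bar h^{-1}\delta\bar h - \bar h^{-1}\delta\bar h\,A\\
&\equiv D^A(\bar h^{-1}\delta\bar h).
\end{aligned} \tag{2.33}$$

This implies that the field strength varies as $\delta F = \bar\partial\big(D^A(\bar h^{-1}\delta\bar h)\big)$. Inserting into (2.31), we obtain

$$0 = \bar\partial\big(D^A(\bar h^{-1}\delta\bar h)\big)\wedge J^2 + 2F\wedge J\wedge\delta J. \tag{2.34}$$

This gives the constraint relation between the variations of the Hermitian form and the gauge
field. The pair $(\delta J, \delta\bar h)$ will be further constrained when inserted into the anomaly cancellation
condition as we now show.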
2.3. Anomaly cancellation condition
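We can write the variation of the anomaly cancellation equation as

$$2i\,\partial\bar\partial\,\delta J = \frac{\alpha'}{4}\,\delta\Big(\mathrm{tr}\big[R(g)\wedge R(g)\big] - \mathrm{tr}\big[F(h)\wedge F(h)\big]\Big). \tag{2.35}$$

The left-hand side is a $\partial\bar\partial$ of a (1, 1)-form, so we should write the variation of the right-hand
side of the equation similarly. With the curvature defined using the Hermitian connection, we
can write the variation using the Bott-Chern form [25,26]. For two Hermitian metrics $(g_1, g_0)$
that are smoothly connected by a path parameterized by a parameter $t\in[0,1]$, the difference of
the first Pontryagin classes is given by the Bott-Chern form

$$\mathrm{tr}[R_1\wedge R_1] - \mathrm{tr}[R_0\wedge R_0] = 2i\,\partial\bar\partial\,BC_2(g_1, g_0), \tag{2.36}$$

where

$$BC_2(g_1, g_0) = 2i\int_0^1 \mathrm{tr}\big[R_t\,\bar g_t^{-1}\dot{\bar g}_t\big]\,dt, \tag{2.37}$$

and $\bar g = g_{\bar a b}$ denotes the transpose of the Hermitian metric, the dot denotes the derivative with
respect to $t$, and the tr in (2.37) traces over only the holomorphic indices.⁴ We now use the
Bott-Chern formula to obtain the variation. Let

$$g_t = g + t\,(g' - g) = g + t\,\delta g, \tag{2.38}$$

where $t\in[0,1]$ and in particular $g_0 = g$ and $g_1 = g'$. Then to first order in $\delta g$, we have

$$\delta\,\mathrm{tr}[R\wedge R] = 2\,\mathrm{tr}[\delta R\wedge R] = -4\,\partial\bar\partial\,\mathrm{tr}\big[R\,\bar g^{-1}\delta\bar g\big], \tag{2.39}$$

where the trace can be more simply written in components as

$$\mathrm{tr}\big[R\,\bar g^{-1}\delta\bar g\big]_{a\bar b} = -i\,R_{a\bar b}{}^{c\bar d}\,\delta J_{c\bar d}. \tag{2.40}$$

With (2.39), the linear variation of the anomaly equation (2.35) becomes

$$2i\,\partial\bar\partial\,\delta J = -\alpha'\,\partial\bar\partial\Big(\mathrm{tr}\big[R\,\bar g^{-1}\delta\bar g\big] - \mathrm{tr}\big[F\,\bar h^{-1}\delta\bar h\big]\Big). \tag{2.41}$$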
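By factoring out the $2i\,\partial\bar\partial$ derivatives, the anomaly condition can be equivalently expressed as

$$\delta J - i\,\frac{\alpha'}{2}\Big(\mathrm{tr}\big[R\,\bar g^{-1}\delta\bar g\big] - \mathrm{tr}\big[F\,\bar h^{-1}\delta\bar h\big]\Big) = \gamma, \tag{2.42}$$

where $\gamma$ is a $\partial\bar\partial$-closed (1, 1)-form.
Note that for the special case where either the gauge bundle is trivial (i.e. $F = 0$) or $\delta\bar h = 0$,
there is a simple relationship between $\delta J$ and $\gamma$. The anomaly variation with (2.40) inserted
into (2.42) becomes

$$\delta J_{a\bar b} - \frac{\alpha'}{2}\,R_{a\bar b}{}^{c\bar d}\,\delta J_{c\bar d} = \gamma_{a\bar b}. \tag{2.43}$$

Grouping the two Hermitian indices $(a\bar b)$ as a single index, we can solve for $\delta J$ by inverting the
above equation and obtain

$$\delta J = (1 - M)^{-1}\gamma, \tag{2.44}$$

where the curvature is encoded in the matrix $M_{a\bar b}{}^{c\bar d} = \frac{\alpha'}{2}R_{a\bar b}{}^{c\bar d}$. As long as $(1 - M)$ is invertible,
we see that $\delta J$ is parametrized by the space of $\partial\bar\partial$-closed (1, 1)-forms $\gamma$. Modding out by diffeomorphism equivalence, we can obtain a cohomology associated with the anomaly equation of
the form

$$\frac{\ker(\partial\bar\partial)\cap\Lambda^{1,1}}{\gamma_{\rm diff}}, \tag{2.45}$$

where

$$\gamma_{\rm diff} = (1 - M)\,\delta J_{\rm diff} = \frac{1}{\|\Omega\|_J}\,(1 - M)\,\Lambda\,d\big(\|\Omega\|_J\,\beta\wedge J\big), \tag{2.46}$$

and $\delta J_{\rm diff}$ is the variation of the Hermitian form corresponding to diffeomorphism given in (2.24).

4 Note that the Bott-Chern form is defined only up to $\partial$ and $\bar\partial$ exact terms.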
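To summarize, we list the three linear variation conditions with complex structure fixed.

$$d\Big\{\|\Omega\|_J\Big[2J\wedge\delta J - \frac14\,(J\wedge J)\,J^{mn}\delta J_{mn}\Big]\Big\} = 0, \tag{2.47}$$

$$\bar\partial\big(D^A(\bar h^{-1}\delta\bar h)\big)\wedge J^2 + 2F\wedge J\wedge\delta J = 0, \tag{2.48}$$

$$\partial\bar\partial\Big\{\delta J - i\,\frac{\alpha'}{2}\Big(\mathrm{tr}\big[R\,\bar g^{-1}\delta\bar g\big] - \mathrm{tr}\big[F\,\bar h^{-1}\delta\bar h\big]\Big)\Big\} = 0. \tag{2.49}$$

In the next section, we will write down explicit deformations that satisfy the above equations for
the $T^2$ bundle over K3 flux background.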
3. $T^2$ bundle over K3 solution
The metric of the $T^2$ bundle over K3 solution [13,14] has the form

$$ds^2 = e^{2\phi}\,ds^2_{K3} + (dx + \alpha_1)^2 + (dy + \alpha_2)^2 = e^{2\phi}\,ds^2_{K3} + |dz_3 + \alpha|^2, \tag{3.1}$$

where $\theta = dz_3 + \alpha$ is a (1, 0)-form and $\alpha = \alpha_1 + i\alpha_2$. The twisting of the $T^2$ is encoded in the
two-form defined on the base K3,

$$\omega = \omega_1 + i\omega_2 = d\theta = \omega_S^{(2,0)} + \omega_A^{(1,1)}, \tag{3.2}$$

which is required to be primitive

$$\omega\wedge J_{K3} = 0, \tag{3.3}$$

and obeys the quantization condition

$$\frac{\omega_i}{2\pi\sqrt{\alpha'}}\in H^2(K3,\mathbb{Z}). \tag{3.4}$$
With this metric ansatz, the anomaly cancellation equation reduces to a highly non-linear second-order differential equation for the dilaton $\phi$. Importantly, a necessary condition for the existence
of a solution for $\phi$ is that the background satisfies the topological condition

1
2
2 JK3 JK3
S + A
tr F F = 24.
+
(3.5)
2!
16 2
K3
K3
If this condition is satisfied, then the analysis of Fu and Yau [13] guarantees the existence of a
smooth solution for $\phi$ that solves the differential equation of anomaly cancellation.
3.1. Equations for the moduli
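For expressing the constraint equations of the allowed deformations, we first write down more
explicitly the Hermitian metric. Note that the conventions we follow here are that $J_{a\bar b} = i g_{a\bar b}$ and
$ds^2 = 2g_{a\bar b}\,dz^a d\bar z^{\bar b}$. The Hermitian two-form can be expressed simply as

$$J = e^{2\phi}J_{K3} + \frac{i}{2}\,\theta\wedge\bar\theta, \tag{3.6}$$

and we write the corresponding metric as

$$g_{a\bar b} = \frac12\begin{pmatrix} 2g + BB^{*} & B\\ B^{*} & 1\end{pmatrix}, \tag{3.7}$$

where $g_{i\bar j} = e^{2\phi}g_{K3}$ is the base K3 metric with the $e^{2\phi}$ warp factor included, $B = (B_1, B_2)$ is a
column vector with entries locally given by $\alpha = B_1 dz_1 + B_2 dz_2$, and $B^{*} = \bar B^{T}$.
An allowed deformation of the conformally balanced condition must satisfy the requirement
that the four-form (2.15)

$$\begin{aligned}
\delta\rho &= \|\Omega\|_J\Big[J\wedge\delta J - \frac18\,(J\wedge J)\,J^{mn}\delta J_{mn}\Big]\\
&= J_{K3}\wedge\delta J + \frac{i}{2}\,e^{-2\phi}\,\theta\wedge\bar\theta\wedge\delta J - \frac18\big(e^{2\phi}J_{K3}\wedge J_{K3} + i\,J_{K3}\wedge\theta\wedge\bar\theta\big)J^{mn}\delta J_{mn},
\end{aligned} \tag{3.8}$$

is d-closed.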
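As for the anomaly condition, we shall work with the constraint given in the form of (2.49)
(with trivial gauge bundle)

$$\partial\bar\partial\Big(\delta J - i\,\frac{\alpha'}{2}\,\mathrm{tr}\big[R\,\bar g^{-1}\delta\bar g\big]\Big) = 0. \tag{3.9}$$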
2
The curvature term can be written out explicitly as

tr R g 1 g = i R j i Ji j + R j 3 J3j + R 3i Ji 3 + R 33 J33 ,
(3.10)
where
1
1
R j i = g 1 R g 1 B
,
B g
2

1

1 g 1 B
B g
B,
R j 3 = g 1 R B + g 1 B
2
(3.11)
(3.12)
1
1
B g
,
R 3i = B g 1 R B g 1 + B g 1 B
2

1
B g 1 B + B g 1 B
R 33 = B g 1 R B + B g 1 B
2

1
,
B g B B g 1 B
(3.13)
(3.14)
g 1 g ) is the curvature tensor of K3 with respect to the g metric. Note that the R ba
and R = (
are two-forms with components only on the coordinates of K3.
Below, we shall analyze the infinitesimal deformations of the $T^2$ bundle over K3 model with
trivial gauge bundle. For this type of model, the topological constraint (3.5) is satisfied purely by
the curvature of the $T^2$ twist. (See Section 5.2 in [14] for explicit examples.) We shall discuss
the variation of the three components of the metric (the dilaton conformal factor, the K3 base,
and the $T^2$ bundle) separately below. We will show that the moduli given below satisfy both
the conformally balanced and anomaly cancellation conditions. For the trivial bundle case, the
Hermitian Yang-Mills condition does not place any constraint on the deformations. Finally, we
will also discuss the variation of the complex structure in this model.
3.2. Deformation of the dilaton

The dilaton is associated to the warp factor of the K3 base. Thus, varying the dilaton corresponds to varying the local scale of the K3. The deformation of the Hermitian form due to the
variation of the dilaton is

$$\delta J = 2e^{2\phi}\,\delta\phi\,J_{K3}, \tag{3.15}$$

where $\delta\phi$ depends only on the K3 coordinates. This is consistent with the dilaton variation
condition $\delta\phi = (1/8)J^{mn}\delta J_{mn}$ of Eq. (2.13). As for the conformally balanced condition, it in fact
does not place any constraint on the dilaton. The metric variation (3.15) when inserted into (3.8)
gives the four-form

$$\delta\rho = e^{2\phi}\,\delta\phi\,J_{K3}\wedge J_{K3}, \tag{3.16}$$

which is indeed d-closed for any real function $\delta\phi$ on the base K3. Since the space of real functions
is infinite-dimensional, the dimensionality of the deformation space is also infinite if only the
conformally balanced condition is considered.
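The step from (3.8) to (3.16) is a short computation. As a sketch, write $\theta = dz_3 + \alpha$ for the (1, 0)-form of (3.1) and $\delta\rho$ for the four-form of (3.8), assume $\phi_0 = 0$ so that $\|\Omega\|_J = e^{-2\phi}$, and use $J^{mn}\delta J_{mn} = 8\,\delta\phi$ for the variation (3.15), which is the content of (2.13):

```latex
\begin{align*}
\delta\rho
 &= J_{K3}\wedge\big(2e^{2\phi}\delta\phi\,J_{K3}\big)
  + \frac{i}{2}\,e^{-2\phi}\,\theta\wedge\bar\theta\wedge\big(2e^{2\phi}\delta\phi\,J_{K3}\big)
  - \frac{1}{8}\big(e^{2\phi}J_{K3}\wedge J_{K3} + i\,J_{K3}\wedge\theta\wedge\bar\theta\big)\,8\,\delta\phi \\
 &= 2e^{2\phi}\delta\phi\,J_{K3}\wedge J_{K3} + i\,\delta\phi\,\theta\wedge\bar\theta\wedge J_{K3}
  - e^{2\phi}\delta\phi\,J_{K3}\wedge J_{K3} - i\,\delta\phi\,J_{K3}\wedge\theta\wedge\bar\theta \\
 &= e^{2\phi}\,\delta\phi\,J_{K3}\wedge J_{K3} .
\end{align*}
```

The result is a function on K3 times a four-form on the four-dimensional base, so its exterior derivative vanishes for degree reasons.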
Imposing anomaly cancellation condition will however make the deformation space finite.
Anomaly cancellation (3.9) imposes the condition

2e2 JK3 i

2
1
+ 4
= 0,
tr B B gK3
e
2
(3.17)
where we have used (3.11). The analysis of Fu and Yau [13] guarantees only a one-parameter
family of solutions parametrized by the normalization

$$A = \Big(\int_{K3} e^{-8\phi}\,\frac{J_{K3}\wedge J_{K3}}{2!}\Big)^{1/4}, \tag{3.18}$$

as long as the topological condition (3.5) is satisfied and also $A \ll 1$. (See [14] for a discussion of the physical implications of the $A \ll 1$ bound.) The variation of the dilaton can thus be
parametrized by the value of $A$.⁵
3.3. Deformations of the K3 metric
The metric moduli of the K3 are associated with deformations of the Hermitian form $J_{K3}$
such that the curvatures of the $T^2$ bundle, $\omega_i$ for $i = 1, 2$, remain primitive (3.3). This implies
that the allowed variation of $J_{K3}$ satisfies

$$\delta\omega_i\wedge J_{K3} + \omega_i\wedge\delta J_{K3} = 0,\qquad i = 1, 2. \tag{3.19}$$

Hence, of the 20 possible $h^{1,1}$ Kähler deformations of K3, only the subset that satisfies (3.19) is
allowed.
First, consider the case where $\delta\omega_i = 0$. We then have the condition

$$\omega_i\wedge\delta J_{K3} = 0, \tag{3.20}$$

which must be satisfied locally at every point on K3. With the curvature form $\omega$ containing a
(1, 1) part, (3.20) is a very strong condition that in general can only be satisfied by a variation
proportional to the Hermitian form, $\delta J_{K3}\sim J_{K3}$. But this would then be the modulus identified
above as associated with the dilaton (3.15).
More generally, we can have $\delta\omega_i = i\,\partial\bar\partial f_i$, where $f_i$ for $i = 1, 2$ are functions on the base K3.
This form of $\delta\omega_i$ is required so that the variation does not change the $H^2(K3)$ integral class
of $\omega_i$ as required by the quantization condition (3.4). Let $\delta J_{K3} = \beta\in H^{1,1}(K3)$ and not proportional
to $J_{K3}$, then the variation (3.19) corresponds to
i JK3
0 = i + i f
JK3 JK3

= fi fi
.
2
(3.21)
K3
Here, we have replaced i = fi JK3 J
noting that the exterior product of two (1, 1)-forms
2
on the base must be a function times the volume form of the K3. Now, the sufficient condition
that a solution for fi exists is that
JK3 JK3
fi
= i = 0.
(3.22)
2
K3
K3
But this is related to the requirement that the intersection numbers are zero. The intersection
numbers of K3 are defined to be
dI J = I J ,
(3.23)
K3
5 Rigorously, one should be able to show that there does not exist a dilaton variation that satisfies (3.17) and leaves the
normalization A unchanged. Regardless, the finite-dimensionality of the deformation space is ensured if one assumes the
elliptic condition required by Fu and Yau [13] to solve the anomaly cancellation equation for .
where I , I = 1, . . . , 22, denotes a basis of H 2 (K3, Z). The matrix dI J is the metric of the even
self-dual lattice with Lorentzian signature (3, 19) given by

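$$ \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \oplus \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \oplus \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \oplus (-E_8) \oplus (-E_8), \tag{3.24} $$
where
$$ E_8 = \begin{pmatrix}
 2 &  0 & -1 &  0 &  0 &  0 &  0 &  0 \\
 0 &  2 &  0 & -1 &  0 &  0 &  0 &  0 \\
-1 &  0 &  2 & -1 &  0 &  0 &  0 &  0 \\
 0 & -1 & -1 &  2 & -1 &  0 &  0 &  0 \\
 0 &  0 &  0 & -1 &  2 & -1 &  0 &  0 \\
 0 &  0 &  0 &  0 & -1 &  2 & -1 &  0 \\
 0 &  0 &  0 &  0 &  0 & -1 &  2 & -1 \\
 0 &  0 &  0 &  0 &  0 &  0 & -1 &  2
\end{pmatrix} \tag{3.25} $$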
is the Cartan matrix of E8 Lie algebra. Thus we see that a variation of JK3 = is allowed as
long as the intersection numbers of with i are zero. This implies at least that = 1 , 2 .
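The lattice in (3.24) can be checked numerically. The following is a minimal sketch, assuming the block ordering of three hyperbolic planes followed by two copies of $-E_8$ and the $E_8$ node ordering of (3.25); the helper name `block_diag` is ours and purely illustrative. It builds the 22 × 22 intersection matrix and reads off its signature, determinant and evenness.

```python
import numpy as np

# Cartan matrix of E8, in the node ordering assumed for Eq. (3.25).
E8 = np.array([
    [ 2,  0, -1,  0,  0,  0,  0,  0],
    [ 0,  2,  0, -1,  0,  0,  0,  0],
    [-1,  0,  2, -1,  0,  0,  0,  0],
    [ 0, -1, -1,  2, -1,  0,  0,  0],
    [ 0,  0,  0, -1,  2, -1,  0,  0],
    [ 0,  0,  0,  0, -1,  2, -1,  0],
    [ 0,  0,  0,  0,  0, -1,  2, -1],
    [ 0,  0,  0,  0,  0,  0, -1,  2],
])
# Hyperbolic plane.
H = np.array([[0, 1], [1, 0]])


def block_diag(blocks):
    """Place square integer blocks along the diagonal of a zero matrix."""
    n = sum(b.shape[0] for b in blocks)
    out = np.zeros((n, n), dtype=int)
    pos = 0
    for b in blocks:
        k = b.shape[0]
        out[pos:pos + k, pos:pos + k] = b
        pos += k
    return out


d = block_diag([H, H, H, -E8, -E8])

eigenvalues = np.linalg.eigvalsh(d)
n_pos = int((eigenvalues > 1e-9).sum())   # number of positive directions
n_neg = int((eigenvalues < -1e-9).sum())  # number of negative directions
det = round(np.linalg.det(d))             # +-1 for a unimodular (self-dual) lattice
is_even = all(d[i, i] % 2 == 0 for i in range(d.shape[0]))

print(n_pos, n_neg, det, is_even)
```

The signature (3, 19), unit determinant and even diagonal are the properties quoted in the text for the K3 lattice.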
The above variations of the Kähler form on the K3 require the metric variations
i
J = e2 + ( + ) + 2e2 JK3 ,
2

i
= + JK3 ( + ) + e2 JK3 JK3 ,
2
(3.26)
where = i(f1 + if2 ). One can check that the above is closed when (3.21) is satisfied. We
note that the additional variation of the dilaton in (3.26) is needed in order to satisfy the anomaly
condition. With it, the analysis of Fu and Yau [13] then guarantees the existence of a solution
for for each consistent pair (, ). Therefore, J variations in (3.26) satisfying (3.22) are
indeed moduli.
3.4. Deformation of the T 2 bundle
We now consider the variation of the size of the T 2 bundle. This is an allowed variation of the
conformally balanced condition since the metric variation
i
J = ,
2
results in the closed four-form

i
1
= e2 JK3 JK3 + JK3 ,
4
4
(3.27)
(3.28)
where is a constant infinitesimal parameter. But we must also check the anomaly condition.
The variation of the curvature term can be calculated using (3.11)(3.14) and we obtain

1

B g 1 .
tr R g 1 g = tr B
2
The anomaly condition (3.9) therefore becomes
(3.29)

1
0 = i J i tr R g g
2

1

= tr B
B g 1
2
2

2

J

1
B g 1 ,
= 2 K3 + tr B
2
2
2
(3.30)
but this cannot hold true. To see this, we can integrate the last line over the base K3. The first term
gives a positive contribution while the second term integrates to zero. Here, we have used the fact
B g 1 ] in the second term is well-defined and has dependence only
that the two-form tr[B
on the base K3 as was shown in [13] (see Lemma 10 on page 11). Thus, the size of the torus
cannot be continuously varied as it is fixed by the anomaly condition.
With the size of the torus fixed, it is evident that there cannot be any overall radial modulus, $\delta J \propto J$, for this model, as has also been noted previously in [21,27,28]. Actually, it is true in general that anomaly cancellation forbids an overall constant radial modulus for any heterotic compactification with non-zero H-flux. The reason is simply that $\mathrm{tr}[R \wedge R]$ is invariant under constant scaling of the metric, since the Riemann tensor, $R_{mn}{}^{p}{}_{q}$, is scale invariant. However, $dH = 2i\,\partial\bar\partial J$ depends on J and cannot be scale invariant. Hence, the overall scale is not a modulus.
To summarize, the $T^2$ bundle over K3 model has a dilaton modulus and also moduli associated with the Kähler moduli of the base K3. The number of moduli in particular depends on the curvature of the $T^2$ twist, $\omega$. The size of the $T^2$ is, however, fixed, and hence there is no overall radial modulus in the model.
3.5. Fixing the complex structure
We have mostly taken the complex structure to be fixed in analyzing the moduli. But for
the T 2 bundle over K3 solution, the complex structures are rather transparent and we can describe how they can be fixed. To begin, the complex structures are simply those on the K3 plus
that on the T 2 . For the T 2 , its complex structure determines the integral first Chern class quantization condition (3.4) for 1 and 2 . For an arbitrary torus complex structure = 1 + i2 , the
quantization conditions depend on and takes the form

1
1
1
2 Z,
1 2 Z,
(3.31)
2
2
2 2
where H2 (K3, Z) is any two-cycle on K3. Therefore, fixing = 1 + i2 effectively

fixes . And even if we were to allow to vary infinitesimally, the complex structure integrability
condition = 1 + i2 (2,0) (K3) (1,1) (K3) and the topological condition (3.5) must be
imposed. All together, these strong conditions generically fix the T 2 complex structure moduli.
Note also that the condition $H^{(1,1)}(K3, \mathbb{Z}) = H^{(1,1)}(K3) \cap H^{2}(K3, \mathbb{Z})$ also strongly constrains the complex structure of the K3, since the dimension of $H^{(1,1)}(K3, \mathbb{Z})$ does vary with the complex structure of K3 (see footnote 6).
6 That the $T^2$ complex structures are fixed has also been noted from the gauged linear sigma model point of view in [18].
The complex structures of K3 can also be fixed if the $T^2$ twist contains a (2, 0) self-dual part, $\omega^{(2,0)} = k\,\Omega_{K3}$, which up to a constant k must be proportional to the holomorphic (2, 0)-form of K3. The above-mentioned quantization condition for the (2, 0) part then takes the form (for $\tau = i$)
k
(3.32)
K3 Z,
2
which defines the periods of the holomorphic (2, 0)-form on the K3. These periods specify
the complex structures chosen on K3, and the quantization condition thus fixes the complex
structures on K3.
4. Conclusions and open questions
In this paper, we have derived the defining equations for the local moduli of supersymmetric heterotic flux compactifications. The defining equations were derived by performing a linear
variation of the supersymmetry constraints obeyed by such compactifications. We further analyzed the corresponding geometric moduli spaces and discussed the particular example of a T 2
bundle over K3 in detail. This $T^2$ bundle over K3 solution is special in that it is dual to M- or F-theory on K3 × K3. Notice that under infinitesimal deformations, the manifold K3 × K3 remains K3 × K3. Thus, the corresponding heterotic $T^2$ bundle over K3 dual must also be locally unique; that is, it remains a $T^2$ bundle over K3 under infinitesimal variation.
In much of our analysis, we have set the gauge bundle to be trivial. For the $T^2$ bundle over K3 case, the non-trivial, non-U(1) bundles are simply the stable bundles on K3 lifted to the six-dimensional space. The moduli space then corresponds to the space of stable bundles on K3. The dimension of this moduli space M is given by the Mukai formula [29]
$$ \dim M = 2 r\, c_2(E) - (r - 1)\, c_1^2(E) - 2 r^2 + 2, \tag{4.1} $$
where r is the rank of the bundle (i.e. the dimension of the fiber), and $(c_1(E), c_2(E))$ are the first and second Chern numbers of the gauge bundle E. It would be interesting to understand the moduli space of stable gauge bundles in general.
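As a quick illustration of (4.1), the sketch below evaluates the Mukai formula for given Chern numbers. The function name `mukai_dim` and the sample values (rank 2, $c_1^2 = 0$, $c_2 = 24$) are illustrative assumptions of ours, not values taken from the text, and the signs are those of the formula as written above.

```python
def mukai_dim(rank, c1_squared, c2):
    """Dimension of the moduli space of stable bundles on K3, Eq. (4.1).

    rank       -- rank r of the bundle
    c1_squared -- the Chern number c_1(E)^2
    c2         -- the Chern number c_2(E)
    """
    return 2 * rank * c2 - (rank - 1) * c1_squared - 2 * rank ** 2 + 2


# Illustrative input only: a rank-2 bundle with c_1^2 = 0 and c_2 = 24.
example = mukai_dim(2, 0, 24)
print(example)  # 2*2*24 - 0 - 8 + 2 = 90
```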
There are a number of interesting open questions. First, in our analysis we have kept for
simplicity the complex structure fixed. It is well known that for Calabi–Yau compactifications the moduli space is a direct product of complex structure and Kähler structure deformations. For non-Kähler manifolds with torsion, this likely is not the case and it would be interesting to allow for a simultaneous variation of the complex structure and the Hermitian form.
It would be interesting to analyze the geometry of the moduli space and to determine if powerful tools such as the well-known special geometry of Calabi–Yau compactifications [30] can be derived in this case.
Furthermore, counting techniques for moduli fields need to be developed and we expect that
the number of moduli can be characterized in terms of an index or some topological invariants
of the manifold.
Finally, it would be interesting to analyze the moduli space from the world-sheet approach
using the recently constructed gauged linear sigma model [18]. Moduli fields will correspond to
the marginal deformations of the IR conformal field theory.
Acknowledgements
We would like to thank A. Adams, K. Becker, S. Giddings, J. Lapan, J. Sparks, E. Sharpe,
A. Subotic, V. Tosatti, D. Waldram, M.-T. Wang, P. Yi, and especially J.-X. Fu for helpful discussions. We thank the 2006 Simons Workshop at YITP Stony Brook for hospitality where part
of this work was done. M. Becker would like to thank members of the Harvard Physics Department for their warm hospitality during the final stages of this work. The work of M. Becker is
supported by NSF grants PHY-0505757, PHY-0555575 and the University of Texas A&M. The
work of L.-S. Tseng is supported in part by NSF grant DMS-0306600 and Harvard University.
The work of S.-T. Yau is supported in part by NSF grants DMS-0306600, DMS-0354737, and
DMS-0628341.
Appendix A
We summarize our notation and conventions.
Our index conventions are as follows: m, n, p, q, ... denote real six-dimensional coordinates; a, b, c, ... and $\bar a, \bar b, \bar c, \ldots$ denote six-dimensional complex coordinates; and i, j, k, ... and $\bar\imath, \bar\jmath, \bar k, \ldots$ denote four-dimensional complex coordinates on the base K3.
The gauge field $A_m$ and field strength $F_{mn}$ take values in the SO(32) or $E_8 \times E_8$ Lie algebra, with the generators being anti-Hermitian.
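The Riemann tensor is defined as follows:
$$ R_{mn}{}^{p}{}_{q} = \partial_m \Gamma_n{}^{p}{}_{q} - \partial_n \Gamma_m{}^{p}{}_{q} + \Gamma_m{}^{p}{}_{r}\, \Gamma_n{}^{r}{}_{q} - \Gamma_n{}^{p}{}_{r}\, \Gamma_m{}^{r}{}_{q}. $$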
With a Hermitian metric g with components ga b , we write the Hermitian curvature two1 ] where g is the transposed of g with components g .
g 1 g]
= [(g)g
form as R = [
ba
Explicitly, in components, we write
cd

dc
c
Rab
.
= a (b gd d )g
d = a g gdd
We follow the convention standard in the mathematics literature for the Hodge star operator. For example, $(\star H)_{mnp} = \frac{1}{3!} H_{rst}\, \epsilon^{rst}{}_{mnp}$, with $\epsilon_{mnprst}$ being the Levi-Civita tensor.
We use the definition for 2J :
J3
.
3!
For a vector field, v = v m m , the interior product acting on a p-form with components
m1 m2 ...mp is just
= 2J
(iv )m2 m3 ...mp = v m1 m1 m2 ...mp .

Given a Hermitian form J , the adjoint of the Lefschetz operator acting on a p-form with
components m1 m2 ...mp is
()m3 m4 ...mp =
1 m1 m2
J
m1 m2 m3 m4 ...mp .
2!
References
[1] P. Candelas, G.T. Horowitz, A. Strominger, E. Witten, Vacuum configurations for superstrings, Nucl. Phys. B 258
(1985) 46.
[2] V. Braun, Y.H. He, B.A. Ovrut, T. Pantev, A heterotic standard model, Phys. Lett. B 618 (2005) 252, hep-th/0501070.
[3] V. Bouchard, R. Donagi, An SU(5) heterotic standard model, Phys. Lett. B 633 (2006) 783, hep-th/0512149.
[4] H. Verlinde, M. Wijnholt, Building the standard model on a D3-brane, hep-th/0508089.
[5] R. Blumenhagen, M. Cvetic, P. Langacker, G. Shiu, Toward realistic intersecting D-brane models, Annu. Rev. Nucl.
Part. Sci. 55 (2005) 71, hep-th/0502005.
[6] M.R. Douglas, S. Kachru, Flux compactification, hep-th/0610102.
[7] M. Graña, Flux compactifications in string theory: A comprehensive review, Phys. Rep. 423 (2006) 91, hep-th/0509003.
[8] C.M. Hull, Superstring compactifications with torsion and spacetime supersymmetry, in: R. D'Auria, D. Fre (Eds.), 1st Torino Meeting on Superunification and Extra Dimensions, September 1985, Torino, Italy, World Scientific, Singapore, 1986, p. 347.
[9] A. Strominger, Superstrings with torsion, Nucl. Phys. B 274 (1986) 253.
[10] B. de Wit, D.J. Smit, N.D. Hari Dass, Residual supersymmetry of compactified d = 10 supergravity, Nucl. Phys.
B 283 (1987) 165.
[11] K. Dasgupta, G. Rajesh, S. Sethi, M theory, orientifolds and G-flux, JHEP 9908 (1999) 023, hep-th/9908088;
K. Becker, K. Dasgupta, Heterotic strings with torsion, JHEP 0211 (2002) 006, hep-th/0209077.
[12] K. Becker, M. Becker, K. Dasgupta, P.S. Green, Compactifications of heterotic theory on non-Kähler complex manifolds I, JHEP 0304 (2003) 007, hep-th/0301161;
K. Becker, M. Becker, P.S. Green, K. Dasgupta, E. Sharpe, Compactifications of heterotic strings on non-Kähler complex manifolds II, Nucl. Phys. B 678 (2004) 19, hep-th/0310058.
[13] J.-X. Fu, S.-T. Yau, The theory of superstring with flux on non-Kähler manifolds and the complex Monge–Ampère equation, hep-th/0604063.
[14] K. Becker, M. Becker, J.-X. Fu, L.-S. Tseng, S.-T. Yau, Anomaly cancellation and smooth non-Kähler solutions in heterotic string theory, Nucl. Phys. B 751 (2006) 108, hep-th/0604137.
[15] M. Cyrier, J.M. Lapan, Towards the massless spectrum of non-Kaehler heterotic compactifications, hep-th/0605131.
[16] T. Kimura, P. Yi, Comments on heterotic flux compactifications, JHEP 0607 (2006) 030, hep-th/0605247.
[17] S. Kim, P. Yi, A heterotic flux background and calibrated five-branes, JHEP 0611 (2006) 040, hep-th/0607091.
[18] A. Adams, M. Ernebjerg, J.M. Lapan, Linear models for flux vacua, hep-th/0611084.
[19] E.A. Bergshoeff, M. de Roo, The quartic effective action of the heterotic string and supersymmetry, Nucl. Phys.
B 328 (1989) 439.
[20] J.P. Gauntlett, D. Martelli, S. Pakis, D. Waldram, Commun. Math. Phys. 247 (2004) 421, hep-th/0205050.
[21] G. Lopes Cardoso, G. Curio, G. Dall'Agata, D. Lüst, BPS action and superpotential for heterotic string compactifications with fluxes, JHEP 0310 (2003) 004, hep-th/0306088.
[22] D.J. Gross, E. Witten, Superstring modifications of Einstein's equations, Nucl. Phys. B 277 (1986) 1.
[23] D. Nemeschansky, A. Sen, Conformal invariance of supersymmetric sigma models on Calabi–Yau manifolds, Phys. Lett. B 178 (1986) 365.
[24] P. Candelas, Yukawa couplings between (2, 1) forms, Nucl. Phys. B 298 (1988) 458;
P. Candelas, X. de la Ossa, Moduli space of Calabi–Yau manifolds, Nucl. Phys. B 355 (1991) 455.
[25] R. Bott, S.S. Chern, Hermitian vector bundles and the equidistribution of the zeroes of their holomorphic sections,
Acta Math. 114 (1965) 71.
[26] C.M. Hull, Actions for (2, 1) sigma models and strings, Nucl. Phys. B 509 (1998) 252, hep-th/9702067.
[27] K. Becker, M. Becker, K. Dasgupta, S. Prokushkin, Properties of heterotic vacua from superpotentials, Nucl. Phys.
B 666 (2003) 144, hep-th/0304001.
[28] K. Becker, L.-S. Tseng, Heterotic flux compactifications and their moduli, Nucl. Phys. B 741 (2006) 162, hep-th/0509131.
[29] S. Mukai, Moduli of vector bundles on K3 surfaces, and symplectic manifolds, Sugaku Expositions 1 (1988) 139.
[30] A. Strominger, Special geometry, Commun. Math. Phys. 133 (1990) 163.
Supersymmetric N = 1 Spin(10) gauge theory with two spinors via a-maximization
Teruhiko Kawano, Futoshi Yagi
Department of Physics, University of Tokyo, Hongo, Tokyo 113-0033, Japan
Received 11 June 2007; accepted 6 July 2007
Abstract
We give a detailed analysis of the superconformal fixed points of four-dimensional N = 1 supersymmetric Spin(10) gauge theory with two spinors and vectors by using the a-maximization procedure.
1. Introduction
In the previous paper [1], we studied four-dimensional N = 1 supersymmetric Spin(10) gauge theory with a single chiral superfield $\Psi$ in the spinor representation and $N_Q$ chiral superfields $Q^i$ ($i = 1, \ldots, N_Q$) in the vector representation and with no superpotential at the superconformal infrared (IR) fixed point. This theory is believed to have a non-trivial IR fixed point for $7 \le N_Q \le 21$, where the dual description is available [2,3].
At the IR fixed points, since the conformal dimension D(O) of a gauge invariant chiral primary operator O can be determined by the superconformal $U(1)_R$ charge R(O) [4] as
$$ D(O) = \frac{3}{2} R(O), \tag{1.1} $$
the $U(1)_R$ symmetry in the superconformal algebra plays an important role.
Unitarity requires the conformal dimension of a gauge invariant Lorentz scalar to satisfy $D(O) \ge 1$, where the equality is satisfied if and only if O is free [5]. With (1.1), this bound reads
$$ R(O) \ge \frac{2}{3}. \tag{1.2} $$
E-mail address: yagi@hep-th.phys.s.u-tokyo.ac.jp (F. Yagi).
doi:10.1016/j.nuclphysb.2007.07.007
T. Kawano, F. Yagi / Nuclear Physics B 786 (2007) 135–151
However, a gauge invariant chiral primary operator sometimes appears to violate the inequality (1.2) when we naively assume that the global symmetry at the IR fixed point is the same as that in the ultraviolet (UV) region. It has been argued in [6–8] that such an operator O decouples from the remaining interacting system to become free at the IR fixed point, where a new global U(1) symmetry which transforms only O is enhanced and the true $U(1)_R$ charge of O becomes 2/3.
The superconformal $U(1)_R$ symmetry can be expressed as a linear combination of anomaly-free U(1) symmetries as
$$ U(1)_R = U(1)_\lambda + \sum_i x_i\, U(1)_i, \tag{1.3} $$
where the global U(1) symmetries under which the gaugino has no charge are denoted by $U(1)_i$ (i = 1, 2, ...), and an anomaly-free U(1) symmetry which transforms the gaugino with charge 1 is denoted by $U(1)_\lambda$. In order to determine the superconformal $U(1)_R$ symmetry, we have to determine the coefficients $x_i$ in (1.3). In fact, we may use a-maximization [9] for this purpose. Following this method, we regard the $x_i$ in (1.3) as variables to be determined and construct the trial a-function (see footnote 1)
$$ a_0(x_1, x_2, \ldots) = 3\, \mathrm{Tr}\, R^3 - \mathrm{Tr}\, R. \tag{1.4} $$
Each term on the right-hand side of (1.4) represents a 't Hooft anomaly [10], where the charge R is the $U(1)_R$ charge given in terms of $x_i$ in (1.3), which is not necessarily the superconformal $U(1)_R$ charge at the IR fixed point. If there are no accidental symmetries at the IR fixed point, the 't Hooft anomalies can be evaluated in the UV by using the 't Hooft anomaly matching condition for asymptotically free theories. Then, a-maximization tells us that the local maximum of the function (1.4) gives the $x_i$ for the superconformal $U(1)_R$ symmetry in (1.3).
However, as mentioned above, the function (1.4) does not make sense in the range of $x_i$ where gauge invariant chiral primary operators seem to violate the unitarity bound (1.2). It was proposed in [7] that, in the range where operators $O_i$ seem to violate the unitarity bound (1.2), the trial a-function should be modified into
$$ a(x_1, x_2, \ldots) = a_0 + \sum_i \left[ -a_{O_i}\big(R(O_i)\big) + a_{O_i}(2/3) \right]. \tag{1.5} $$
The function $a_{O_i}$ represents the contribution from the operator $O_i$ to the trial a-function and can be evaluated as
$$ a_{O_i}\big(R(O_i)\big) = d_{O_i} \left[ 3\big(R(O_i) - 1\big)^3 - \big(R(O_i) - 1\big) \right], \tag{1.6} $$
where $d_{O_i}$ is the number of components of the operator $O_i$, and $R(O_i)$ is the $U(1)_R$ charge of $O_i$, as given in (1.3). The term $a_{O_i}(2/3)$ is obtained by substituting the value $R(O_i) = 2/3$ of free fields into (1.6), which gives $2 d_{O_i}/9$. The prescription (1.5) can be interpreted as subtracting the contribution which is evaluated under the assumption that the operator $O_i$ is interacting and adding the contribution of the operator as a free field. Thus, by dividing the range of $x_i$ according to which operators hit the unitarity bound and by modifying the trial a-function as in (1.5) for each range, we obtain the trial a-function in the whole range of the variables $x_i$ [1,7]. The superconformal $U(1)_R$ symmetry can then be identified by the local maximum of this function.
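The prescription (1.5) and (1.6) is simple enough to state as code. The following is a minimal sketch in exact rational arithmetic; the function names `a_contrib` and `unitarity_correction` are ours, and the signs are those implied by the statement that a free field with R = 2/3 contributes $2 d_O/9$.

```python
from fractions import Fraction


def a_contrib(r_charge, d=1):
    """Contribution d * [3 (R - 1)^3 - (R - 1)] of an operator with d components, Eq. (1.6)."""
    r = Fraction(r_charge)
    return d * (3 * (r - 1) ** 3 - (r - 1))


def unitarity_correction(r_charge, d=1):
    """Term added to a_0 for an operator treated as free, Eq. (1.5): -a_O(R) + a_O(2/3)."""
    return -a_contrib(r_charge, d) + a_contrib(Fraction(2, 3), d)


# A free field (R = 2/3) with one component contributes 2/9.
free_value = a_contrib(Fraction(2, 3))

# The correction vanishes exactly at the unitarity bound, so the modified
# trial a-function is continuous where an operator starts to decouple.
at_bound = unitarity_correction(Fraction(2, 3))

# Sample value below the bound (R = 1/2 is an arbitrary illustrative choice).
below_bound = unitarity_correction(Fraction(1, 2))

print(free_value, at_bound, below_bound)
```

Since the derivative of (1.6) with respect to R, $9(R-1)^2 - 1$, also vanishes at R = 2/3, the correction switches on smoothly, which is why the piecewise function can be maximized without special treatment at the boundaries between ranges.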
1 We omit the overall factor 3/32 of the trial a-function in this paper, which does not affect the calculation of the
U (1)R charges.
By using the method discussed above, we showed in the previous paper [1] that the meson operator $M^{ij} = Q^i Q^j$ hits the unitarity bound and becomes free for $N_Q = 7, 8, 9$. We also analyzed the IR fixed point by using the electric–magnetic duality and found that the decoupling of the meson operator can be seen more clearly in the magnetic theory. In the magnetic theory, since the meson operator is described by elementary fields, we do not need the prescription (1.5). We thus confirmed the validity of the prescription (1.5) in the theory.
The magnetic theory is SU($N_Q - 5$) gauge theory with $N_Q$ antifundamentals $\bar q_i$, a single fundamental q, a symmetric tensor s, and singlets $M^{ij}$ and $Y^i$, and its superpotential is given by
$$ W_{\rm mag} = M^{ij}\, \bar q_i\, s\, \bar q_j + Y^i\, q\, \bar q_i + \det s, \tag{1.7} $$
where $M^{ij}$ and $Y^i$ correspond to the gauge invariant operators $M^{ij} = Q^i Q^j$ and $Y^i = Q^i \Psi^2$ in the electric theory, respectively. When $M^{ij}$ hits the unitarity bound, it decouples from the interacting system, and thus the interaction $M^{ij} \bar q_i s \bar q_j$ in (1.7) becomes irrelevant at the IR fixed point. This can be checked by evaluating the $U(1)_R$ charge of this term. Thus, we may identify the remaining interacting system with the theory without the term $M^{ij} \bar q_i s \bar q_j$ in the superpotential, so that we can construct the trial a-function of this interacting system together with the free meson without the prescription (1.5); the resulting function is actually identical to (1.5).
We further discussed that, since the interaction $M^{ij} \bar q_i s \bar q_j$ in (1.7) vanishes at the IR fixed point, we do not have the F-term condition for $M^{ij}$, and new massless degrees of freedom corresponding to $\bar q_i s \bar q_j$ appear there. The dual of the magnetic theory without the interaction term $M^{ij} \bar q_i s \bar q_j$ is given by the original electric theory but with the superpotential
$$ W = N_{ij}\, Q^i Q^j, \tag{1.8} $$
where $N_{ij}$ are additional singlets and correspond to $\bar q_i s \bar q_j$. We found that the IR fixed point of the original theory is identical to that of this theory. This renormalization group flow can be seen in the original electric theory by introducing auxiliary fields $M^{ij}$ and the Lagrange multipliers $N_{ij}$ to give the superpotential
$$ W = N_{ij} \left( Q^i Q^j - M^{ij} \right). $$
We can see that it is the same theory as the original one by integrating out $M^{ij}$ and $N_{ij}$. The equations of motion give the constraints
$$ M^{ij} = Q^i Q^j, \qquad N_{ij} = 0. \tag{1.9} $$
When $M^{ij}$ hits the unitarity bound, the interaction $N_{ij} M^{ij}$ becomes irrelevant, giving rise to the superpotential (1.8) at the IR fixed point, where the constraints (1.9) do not exist. In this way, we find that a-maximization and the electric–magnetic duality reveal the rich dynamics at the IR fixed point.
In this paper, we extend the analysis to the theory with two spinors and $N_Q$ vectors and show that the meson operator $M^{ij} = Q^i Q^j$ decouples from the interacting system to become free for $N_Q = 6, 7$.
This paper is organized as follows: In Section 2, we briefly review the electric–magnetic duality in the theory with two spinors, especially the matching of gauge invariant operators. In Section 3, we study which operators become free by using a-maximization for both the electric and the magnetic theory. Section 4 is devoted to summary and discussion. In the appendices, we discuss the gauge invariant operators in both the electric and the magnetic theory.
2. The electric–magnetic duality

We study four-dimensional N = 1 supersymmetric Spin(10) gauge theory with two chiral superfields $\Psi_I$ (I = 1, 2) in the spinor representation and $N_Q$ chiral superfields $Q^i$ ($i = 1, \ldots, N_Q$) in the vector representation and with no superpotential. From the 1-loop beta function, we find that it is asymptotically free for $N_Q \le 19$. It is believed that the theory has a non-trivial superconformal IR fixed point for $6 \le N_Q \le 19$, where the magnetic dual description exists [11].
This theory has the anomaly-free global symmetry $SU(N_Q) \times SU(2) \times U(1)_F \times U(1)_\lambda$, and the fields $Q^i$ and $\Psi_I$ have charges $(N_Q, 1, -4, 1)$ and $(1, 2, N_Q, -1)$, respectively, under the symmetry. Here, $U(1)_F$ is a global symmetry under which the gaugino has no charge, while $U(1)_\lambda$ transforms it with charge 1. If there are no accidental symmetries at the IR fixed point, the $U(1)_R$ symmetry in the superconformal algebra should be given as a linear combination of these U(1) symmetries as
$$ U(1)_R = x\, U(1)_F + U(1)_\lambda \tag{2.1} $$
with some real number x. Thus, the $U(1)_R$ charges of the matter fields can be expressed as
$$ R(Q) = -4x + 1, \qquad R(\Psi) = N_Q x - 1. \tag{2.2} $$
We will determine the value of x in the next section by using a-maximization.
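The charge assignment in (2.2) can be cross-checked against anomaly freedom and the asymptotic-freedom bound quoted above. The sketch below assumes the signs $R(Q) = -4x + 1$ and $R(\Psi) = N_Q x - 1$, and uses the standard Dynkin indices of Spin(10) in the normalization where the vector has index 1 (adjoint 8, spinor 2); these group-theory inputs and all function names are ours, not taken from the text.

```python
from fractions import Fraction

# Dynkin indices of Spin(10), normalized so that the vector (10) has index 1.
T_ADJ, T_VEC, T_SPINOR = 8, 1, 2
N_SPINORS = 2  # two spinor flavours


def r_charges(x, n_q):
    """U(1)_R charges of Q and Psi as functions of x, assumed form of Eq. (2.2)."""
    x = Fraction(x)
    return {"Q": -4 * x + 1, "Psi": n_q * x - 1}


def r_gauge_anomaly(x, n_q):
    """U(1)_R-Spin(10)^2 anomaly coefficient.

    The gaugino carries R = 1 and each matter fermion carries R - 1.
    """
    r = r_charges(x, n_q)
    return (T_ADJ
            + n_q * T_VEC * (r["Q"] - 1)
            + N_SPINORS * T_SPINOR * (r["Psi"] - 1))


def one_loop_b(n_q):
    """One-loop beta-function coefficient; positive means asymptotically free."""
    return 3 * T_ADJ - n_q * T_VEC - N_SPINORS * T_SPINOR


# The anomaly cancels for every x and every N_Q, as required of U(1)_R in (2.1).
anomaly_free = all(
    r_gauge_anomaly(Fraction(k, 7), n_q) == 0
    for k in range(-3, 4)
    for n_q in range(6, 20)
)
print(anomaly_free, one_loop_b(19), one_loop_b(20))
```

With these inputs the coefficient is $20 - N_Q$, which is positive exactly for $N_Q \le 19$, in agreement with the range stated in the text.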

As explained in the introduction, in order to construct the trial a-function in the whole range
of x, we need to know gauge invariant chiral primary operators at the IR fixed point. As discussed
in Appendix A, the gauge invariant generators of the classical chiral ring of this theory are given
by
M ij = Qai Qaj ,
YXi = IT C(2 X )I J a J Qai ,
C i1 i3 = IT C(2 )I J a1 a3 J Qa1 i1 Qa3 i3 ,
i i5
BX1
= IT C(2 X )I J a1 a5 J Qa1 i1 Qa5 i5 ,
F i1 i7 = IT C(2 )I J a1 a7 J Qa1 i1 Qa7 i7 ,

i i9
E2 X1
G = IT C(2 X )I J a J KT C(2 X )KL a L ,

H i1 i4 = IT C(2 X )I J a1 a5 J KT C(2 X )KL a1 L Qa2 i2 Qa5 i5 ,
D0 i1 i6 = a1 a10 Qa1 i1 Qa6 i6 W a7 a8 W a9 a10 ,
D1 i1 i8 = a1 a10 Qa1 i1 Qa8 i8 W a9 a10 ,
D2 i1 i10 = a1 a10 Qa1 i1 Qa10 i10 ,
S = Tr W W .
(2.3)
Here, a and $a_1, a_2, \ldots$ are the indices of the gauge group Spin(10), and the matrix C is its charge conjugation matrix. The matrices $\sigma_X$ (X = 1, 2, 3) are the Pauli matrices for the flavor of the spinors. Taking account of the number of antisymmetrized indices of the SU($N_Q$) global symmetry, we see that whether each operator exists depends on $N_Q$. For example, the operator $D_2^{i_1 \cdots i_{10}}$ exists only for $N_Q \ge 10$.
Now, we turn to the magnetic theory, which is believed to be equivalent to the original electric theory in the IR region. The magnetic theory is given by $SU(N_Q - 3) \times Sp(1)$ gauge theory with
Table 1
The matter contents of the magnetic theory
SU(NQ 3)
a, b, . . .
Sp(1)
, , . . .
SU(NQ )
i, j, . . .
SU(2)
I, J, . . .
U (1)F
qai
2 NQ 3
q a I
Q
N 3
Q
a
qX
2NQ NQ 3
s ab
22
t I
M ij
i
YX
4NQ
NQ 3
1
1
1
2
1
1
1
22
2
2
1
3
2NQ
8
2NQ 4
U (1)
N 6
1
NQ 3
NQ 2
NQ 3
3NQ 10
NQ 3
N 23
Q
2N
N 4
Q
2
2
1
the matter content given by Table 1 and with the superpotential

a
+ I J q a I s ab qb J
Wmag = M ij qai s ab qbj + YXi qai qX
a J
+ (X 2 )I J q a I qX
t .
(2.4)
This theory has the same anomaly-free global symmetry as the electric theory. Thus, the U (1)R
symmetry should also be expressed as (2.1) with the same value of x as in the electric theory.
There exist gauge invariant operators in this theory which correspond to those of the electric
theory. They are fundamental singlets M ij and YXi and the following composite operators:
(C)i1 iNQ 3
a1 aNQ 3
(B)Xi1 iNQ 5
(F )i1 iNQ 7
qa1 i1 qaNQ 3 iNQ 3 ,
a1 aNQ 3
a1 aNQ 3
q I
aN
Q 6
qa1 i1 qaNQ 5 iNQ 5 q I

aN
Q 4
(2 X )I J q J
aN
Q 3
qa1 i1 qaNQ 7 iNQ 7

(2 X )I J q J
aN
Q 5
q aNQ 4 (2 X )KL q L
aN
K
Q 3
aNQ 9
(E2 )Xi1 iNQ 9 a1 aNQ 3 XY Z (s qi1 )a1 (s qiNQ 9 )

aNQ 8 aNQ 7
(sw )
aNQ 6 aNQ 5 NQ 4 NQ 3
qY
qZ
,
(sw )
G t I (2 )I J t J ,
(H )i1 iNQ 4 I J
a1 aNQ 3
qa1 i1 qaNQ 4 iNQ 4 q I

aN
Q 3
t J ,
aNQ 6 aNQ 5 aNQ 4 aNQ 3

qX
qY
qZ
,
aNQ 8
(D0 )i1 iNQ 6 XY Z a1 aNQ 3 (s qi1 )a1 (s qiNQ 6 )
(D1 )i1 iNQ 8 a1 aNQ 3 XY Z (s qi1 )a1 (s qiNQ 8 )

a

aN 5 aN 4 aN 3
a
sw NQ 7 NQ 6 qX Q qY Q qZ Q ,
a
(D2 )i1 iNQ 10 a1 aNQ 3 XY Z (s qi1 )a1 (s qiNQ 10 ) NQ 10

a
aN 5 aN 4 aN 3
a
a
a
(sw ) NQ 9 NQ 8 sw NQ 7 NQ 6 qX Q qY Q qZ Q ,
S Tr w w ,
S Tr w w .
(2.5)
Here, $w_\alpha$ and $\tilde w_\alpha$ are the field strengths of the SU($N_Q - 3$) and Sp(1) gauge groups, respectively (see footnote 2), and the operation $\ast$ represents the Hodge duality with respect to the flavor SU($N_Q$) indices. The magnetic theory has two kinds of glueball superfields corresponding to the two gauge group factors. We can check that every operator has the same charges as the corresponding operator of the electric theory.
Furthermore, it seems that more gauge invariant generators exist in the magnetic theory than
in the electric one. They are given by
U0 = det s,
U1XY = a1 aNQ 3 b1 bNQ 3 s a1 b1 s
aNQ 4 bNQ 4 aNQ 3 bNQ 3

qX
qY
,
U2XY = XX1 X2 Y Y1 Y2 a1 aNQ 3 b1 bNQ 3

s a1 b1 s
aNQ 5 bNQ 5 aNQ 4 aNQ 3 bNQ 4 bNQ 3

qX1
qX2
qY1
qY2
,
U3 = X1 X2 X3 Y1 Y2 Y3 a1 aNQ 3 b1 bNQ 3
aNQ 6 bNQ 6 aNQ 5 aNQ 4 aNQ 3 bNQ 5 bNQ 4 bNQ 3
q X1
qX2
qX3
qY1
qY2
qY3
,
a
a
N
4
N
3
a
(E0 )Xi1 iNQ 5 = XY Z a1 aNQ 3 (s qi1 )a1 (s qiNQ 5 ) NQ 5 qY Q qZ Q ,
a
(E1 )Xi1 iNQ 7 = a1 aNQ 3 XY Z (s qi1 )a1 (s qiNQ 7 ) NQ 7
s a1 b1 s
a

aN 4 aN 3
a
sw NQ 6 NQ 5 qY Q qZ Q ,
aNQ 4 aNQ 3
qX
,
(I0 )Xi1 iNQ 4 = a1 aNQ 3 (s qi1 )a1 (s qiNQ 4 )
aNQ 6
(I1 )Xi1 iNQ 6 = a1 aNQ 3 (s qi1 )a1 (s qiNQ 6 )
sw
aN
a
Q 5 NQ 4
aNQ 3
qX
aNQ 8
(I2 )Xi1 iNQ 8 = a1 aNQ 3 (s qi1 ) (s qiNQ 8 )

a
aN 3
a
a
a
(sw ) NQ 7 NQ 6 sw NQ 5 NQ 4 qX Q ,

a
a
a
(J1 )i1 iNQ 5 = a1 aNQ 3 (s qi1 )a1 (s qiNQ 5 ) NQ 5 sw NQ 4 NQ 3 ,
a1
(J2 )i1 iNQ 7 = a1 aNQ 3 (s qi1 )a1 (s qiNQ 7 ) NQ 7

a
a
a
a
(sw ) NQ 6 NQ 5 sw NQ 4 NQ 3 .
(2.6)
In spite of our best efforts, we have not succeeded in showing that these operators decompose or vanish in the classical chiral ring, as discussed in Appendix B.
The discrepancy makes it difficult for us to understand what happens at the IR fixed point.
Though these two theories might actually not be equivalent to each other at the IR fixed point, it
is not plausible that all the other non-trivial checks discussed in [11] are only accidental. Thus, in
this paper, we assume that the classical chiral ring is deformed by the quantum effects and that the
quantum chiral rings of both the theories are identical. However, it is still unclear what is indeed
happening quantum-mechanically at the IR fixed point. This issue affects the construction of the
trial a-function. Therefore, we will consider both the functions in the electric and the magnetic
theory and compare the results. In the next section, we will see that both the functions have the
identical local maximum.
2 The index of the field strength w and w
is that of Lorentz spinors, which would not cause any confusion.
3. a-Maximization
In this section, we study Spin(10) gauge theory with two spinors and NQ vectors at the superconformal IR fixed point both in the electric and the magnetic theory by using a-maximization.
We calculate the local maximum of the trial a-function defined in the whole range of the parameter x and determine which operators become free at the IR fixed point.
3.1. a-Maximization in the electric theory
We begin with the electric theory. As the result depends on $N_Q$, we first analyze the case $N_Q = 6$. Taking account of the number of antisymmetrized indices of the global symmetry SU($N_Q$), we find that the gauge invariant operators in this case are M, Y, C, B, G, H, $D_0$, and S in (2.3). Since the $U(1)_R$ charge of the glueball superfield S is always 2 and never hits the unitarity bound, we can concentrate on the other seven operators. Using (2.2), the $U(1)_R$ charges R(O) of the gauge invariant operators can be written in terms of x as
$$ R(M) = -8x + 2, \quad R(Y) = 8x - 1, \quad R(C) = 1, \quad R(B) = -8x + 3, $$
$$ R(G) = 24x - 4, \quad R(H) = 8x, \quad R(D_0) = -24x + 8. \tag{3.1} $$
By solving R(O) < 2/3 for each operator, we find the ranges of x given in Fig. 1. Since the operator C does not hit the unitarity bound for any x, it does not appear in the figure.
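The thresholds that separate the ranges of x follow directly from the field content of each operator. The sketch below is an illustration under stated assumptions: the content table (numbers of spinors, vectors and field strengths $W_\alpha$ per operator) is read off from (2.3), $W_\alpha$ is taken to carry R-charge 1, and the matter charges are taken as $R(Q) = -4x + 1$, $R(\Psi) = N_Q x - 1$. The names `CONTENT`, `r_linear` and `threshold` are ours.

```python
from fractions import Fraction

N_Q = 6

# Operator -> (number of spinors Psi, number of vectors Q, number of W_alpha).
CONTENT = {
    "M": (0, 2, 0),
    "Y": (2, 1, 0),
    "C": (2, 3, 0),
    "B": (2, 5, 0),
    "G": (4, 0, 0),
    "H": (4, 4, 0),
    "D0": (0, 6, 2),
}


def r_linear(name, n_q=N_Q):
    """Return (slope, intercept) such that R(O) = slope * x + intercept."""
    n_psi, n_vec, n_w = CONTENT[name]
    slope = n_psi * n_q - 4 * n_vec
    intercept = -n_psi + n_vec + n_w
    return Fraction(slope), Fraction(intercept)


def threshold(name, n_q=N_Q):
    """Value of x at which R(O) = 2/3, or None if R(O) does not depend on x."""
    slope, intercept = r_linear(name, n_q)
    if slope == 0:
        return None
    return (Fraction(2, 3) - intercept) / slope


thresholds = {name: threshold(name) for name in CONTENT}
for name, x0 in thresholds.items():
    print(name, r_linear(name), x0)
```

Operators with negative slope (M, B, $D_0$) violate the bound above their threshold, those with positive slope (Y, G, H) below it, and C has constant charge 1, so it never does.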
Now, we construct the trial a-function in the whole range of the parameter x. The trial a-function in the region where no operators hit the unitarity bound is given by
$$ a_0(x) = 90 + 32\, F\big(R(\Psi)\big) + 10 N_Q\, F\big(R(Q)\big), $$
where $F(y) = 3(y - 1)^3 - (y - 1)$. The first term of this function is the contribution from the gaugino. The $U(1)_R$ charges $R(\Psi)$ and R(Q) may be rewritten in terms of x as in (2.2). We modify this function as in (1.5) for each range according to which operators hit the unitarity bound. Writing each term in the summation of (1.5) as $f_O(x) = -a_O(R(O)) + a_O(2/3)$, the trial a-function for the whole range of x is given by
$$ a(x) = \begin{cases}
a_0(x) + f_Y(x) + f_G(x) + f_H(x) & (x \le \tfrac{1}{12}), \\
a_0(x) + f_Y(x) + f_G(x) & (\tfrac{1}{12} \le x \le \tfrac{1}{6}), \\
a_0(x) + f_M(x) + f_Y(x) + f_G(x) & (\tfrac{1}{6} \le x \le \tfrac{7}{36}), \\
a_0(x) + f_M(x) + f_Y(x) & (\tfrac{7}{36} \le x \le \tfrac{5}{24}), \\
a_0(x) + f_M(x) & (\tfrac{5}{24} \le x \le \tfrac{7}{24}), \\
a_0(x) + f_M(x) + f_B(x) & (\tfrac{7}{24} \le x \le \tfrac{11}{36}), \\
a_0(x) + f_M(x) + f_B(x) + f_{D_0}(x) & (\tfrac{11}{36} \le x).
\end{cases} \tag{3.2} $$
Fig. 1. The ranges of x where each operator hits the unitarity bound for $N_Q = 6$.
More explicitly, the function $f_O$ is given by
$$ f_O(x) = d_O \left[ -3\big(R(O) - 1\big)^3 + \big(R(O) - 1\big) + \frac{2}{9} \right], \tag{3.3} $$
where $d_O$ is the number of components of the operator O and is given by
$$ d_M = \frac{N_Q (N_Q + 1)}{2}, \quad d_Y = 3 N_Q, \quad d_G = 1, \quad d_H = \frac{N_Q!}{4!\,(N_Q - 4)!}, \quad d_B = \frac{3\, N_Q!}{5!\,(N_Q - 5)!}, \quad d_{D_0} = \frac{N_Q!}{6!\,(N_Q - 6)!}. \tag{3.4} $$
The $U(1)_R$ charge $R(O_i)$ for each operator $O_i$ is given in (3.1). We find that the function (3.2)
has a unique local maximum at
\[
x = \frac{18N_Q + 6 - \sqrt{-4N_Q^3 + 143N_Q^2 - 928N_Q + 1824}}{6(N_Q^2 + 8N_Q - 12)},
\tag{3.5}
\]
or equivalently, substituting this into (2.2),
\[
\begin{aligned}
R(Q) &= \frac{3N_Q^2 - 12N_Q - 48 + 2\sqrt{-4N_Q^3 + 143N_Q^2 - 928N_Q + 1824}}{3(N_Q^2 + 8N_Q - 12)}, \\
R(\Psi) &= \frac{12N_Q^2 - 42N_Q + 72 - N_Q\sqrt{-4N_Q^3 + 143N_Q^2 - 928N_Q + 1824}}{6(N_Q^2 + 8N_Q - 12)},
\end{aligned}
\tag{3.6}
\]
which is in the range where only the meson operator $M^{ij}$ hits the unitarity bound. Thus, we find
that the meson operator $M^{ij}$ decouples from the interacting system to become free at the IR fixed
point for $N_Q = 6$.
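The maximization for $N_Q = 6$ can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes $R(Q) = 1 - 4x$ and $R(\Psi) = N_Q x - 1$, which are inferred from $R(M) = 2R(Q)$ and $R(G) = 4R(\Psi)$ in (3.1) rather than quoted from (2.2), and it assumes the sign conventions $f_O = d_O[-F(R(O)) + 2/9]$ and a minus sign in front of the square root in (3.5). The helper names (`trial_a`, `closed_form_x`, `grid_argmax`) are hypothetical.

```python
import math

N_Q = 6
BOUND = 2.0 / 3.0  # unitarity bound for Lorentz scalars


def F(y):
    """F(y) = 3 (y - 1)^3 - (y - 1)."""
    return 3.0 * (y - 1.0) ** 3 - (y - 1.0)


def a0(x):
    """Trial a-function when no operator hits the bound."""
    r_q = 1.0 - 4.0 * x      # assumption: R(Q) inferred from R(M) = 2 R(Q)
    r_psi = N_Q * x - 1.0    # assumption: R(Psi) inferred from R(G) = 4 R(Psi)
    return 90.0 + 32.0 * F(r_psi) + 10.0 * N_Q * F(r_q)


# operator: (slope, intercept, d_O), taken from (3.1) and (3.4) at N_Q = 6
OPERATORS = {
    "M": (-8.0, 2.0, N_Q * (N_Q + 1) // 2),
    "G": (24.0, -4.0, 1),
    "Y": (8.0, -1.0, 3 * N_Q),
    "H": (8.0, 0.0, math.comb(N_Q, 4)),
    "B": (-8.0, 3.0, 3 * math.comb(N_Q, 5)),
    "D0": (-24.0, 8.0, math.comb(N_Q, 6)),
}


def trial_a(x):
    """Piecewise trial a-function: add f_O(x) for every operator below the bound."""
    total = a0(x)
    for slope, intercept, d_o in OPERATORS.values():
        r = slope * x + intercept
        if r < BOUND:
            total += d_o * (-F(r) + 2.0 / 9.0)
    return total


def closed_form_x(n_q):
    """Closed-form location of the local maximum, as in (3.5)."""
    disc = -4 * n_q**3 + 143 * n_q**2 - 928 * n_q + 1824
    return (18 * n_q + 6 - math.sqrt(disc)) / (6 * (n_q**2 + 8 * n_q - 12))


def grid_argmax(f, lo, hi, n=20001):
    """Return the grid point in [lo, hi] where f is largest."""
    xs = [lo + (hi - lo) * k / (n - 1) for k in range(n)]
    return max(xs, key=f)


x_closed = closed_form_x(N_Q)
# Window in which only M is below the bound.
x_grid = grid_argmax(trial_a, 5.0 / 24.0, 7.0 / 24.0)
print(x_closed, x_grid, 2.0 * (1.0 - 4.0 * x_closed))
```

Under these assumptions the grid maximum and the closed form agree at $x \approx 0.2101$, which lies inside $5/24 \le x \le 7/24$, and $R(M) = 2R(Q)$ evaluates to a value below $2/3$ there, in line with the statement that only $M^{ij}$ hits the bound.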
Also in the case of NQ = 7, we find that M ij hits the unitarity bound and the U (1)R charges
are given by (3.6) in the same way as for NQ = 6, though the ranges of x are different from
Fig. 1.
We go on to the case $N_Q = 8$. The ranges of $x$ are divided as in Fig. 2. In this case, we encounter a
subtlety: we do not understand how to deal with the situation where gauge invariant Lorentz
spinors like $D_1$ hit the unitarity bound.3 The best we can do at this stage is to neglect
them, assuming that such operators are massive in this case, as in the previous paper [1]. Even if
they are actually massless, our analysis in the region where they do not hit the unitarity bound,
which is $x \le 1/4$ for this case, is still valid. We find that the trial a-function has a unique local
maximum at
\[
x = \frac{12N_Q - \sqrt{2900 - N_Q^2}}{6(N_Q^2 - 20)},
\tag{3.7}
\]
or equivalently,
\[
\begin{aligned}
R(Q) &= \frac{3N_Q^2 - 24N_Q - 60 + 2\sqrt{2900 - N_Q^2}}{3(N_Q^2 - 20)}, \\
R(\Psi) &= \frac{6N_Q^2 + 120 - N_Q\sqrt{2900 - N_Q^2}}{6(N_Q^2 - 20)}.
\end{aligned}
\tag{3.8}
\]
This is in the range where no operators hit the unitarity bounds. Though the ranges of $x$ depend
on $N_Q$, we obtain a similar result for $9 \le N_Q \le 19$.
3 The unitarity bound for gauge invariant Lorentz spinors is $R(O) \ge 1$ [5].
Fig. 2. The ranges of $x$ where each operator hits the unitarity bound for $N_Q = 8$.
In summary, we find that for $N_Q = 6, 7$, the $U(1)_R$ charges are given by (3.6) and the meson
operator $M^{ij}$ becomes free, while for $8 \le N_Q \le 19$, the $U(1)_R$ charges are given by (3.8) and
no operators become free.
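The closed form (3.7) can be checked against the stationarity condition of $a_0(x)$ for each $8 \le N_Q \le 19$. This is a sketch under the same assumptions as before, namely $R(Q) = 1 - 4x$ and $R(\Psi) = N_Q x - 1$ (inferred, not quoted from (2.2)) and a minus sign in front of the square root in (3.7). It only tests that $a_0$ is stationary with negative curvature at that point; it does not test which operators hit the unitarity bound. The function names are illustrative.

```python
import math


def dF(y):
    """First derivative of F(y) = 3 (y - 1)^3 - (y - 1)."""
    return 9.0 * (y - 1.0) ** 2 - 1.0


def d2F(y):
    """Second derivative of F."""
    return 18.0 * (y - 1.0)


def da0(x, n_q):
    """d a_0 / dx by the chain rule, with R(Psi) = n_q x - 1 and R(Q) = 1 - 4x (assumed)."""
    return 32.0 * n_q * dF(n_q * x - 1.0) - 40.0 * n_q * dF(1.0 - 4.0 * x)


def d2a0(x, n_q):
    """d^2 a_0 / dx^2 under the same assumptions."""
    return 32.0 * n_q**2 * d2F(n_q * x - 1.0) + 160.0 * n_q * d2F(1.0 - 4.0 * x)


def x_star(n_q):
    """Closed-form stationary point, as in (3.7)."""
    return (12 * n_q - math.sqrt(2900 - n_q**2)) / (6 * (n_q**2 - 20))


results = {}
for n_q in range(8, 20):
    x = x_star(n_q)
    results[n_q] = {
        "x": x,
        "R_Q": 1.0 - 4.0 * x,
        "R_Psi": n_q * x - 1.0,
        "slope": da0(x, n_q),
        "curvature": d2a0(x, n_q),
    }
    print(n_q, results[n_q])
```

For $N_Q = 8$ this gives $x \approx 0.1619$, below the value $x = 1/4$ at which the Lorentz spinor $D_1$ would become relevant.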
3.2. a-Maximization in the magnetic theory
We next study the magnetic theory. Though we expect the same results as in the electric
theory, this is non-trivial because of the extra operators (2.6). The trial a-function of the magnetic
theory is different from that of the electric theory in the region where such operators hit the
unitarity bounds.
We begin with the case NQ = 6 and compare with the result of the previous subsection. The
gauge invariant operators are U0 , U1 , U2 , E0 , I0 , I1 , and J1 in (2.6), which exist only in the
magnetic theory, as well as M, Y , C, B, G, H , D0 , and S in (2.5), which have the counterpart
in the electric theory. The charge of these operators can be written with x of (2.1) by using the
charges of U (1)F and U (1) for each field given in Table 1. They are given by (3.1) and also by
\[
\begin{aligned}
R(U_0) &= 24x - 2, & R(U_1) &= 4, & R(U_2) &= -24x + 10, & R(E_0) &= -8x + 5, \\
R(I_0) &= 8x + 2, & R(I_1) &= 3, & R(J_1) &= 16x.
\end{aligned}
\tag{3.9}
\]
We thus find that the ranges of $x$ where each operator hits the unitarity bound are given by Fig. 3.
Since the operators $C$, $U_1$ and $I_1$ do not hit the unitarity bounds for any value of $x$, they
do not appear in Fig. 3. The bold arrows correspond to the operators which exist only in the
magnetic theory. The dotted arrows correspond to the Lorentz spinor operators, which we ignore
as in the previous subsection.
Fig. 3. The ranges of $x$ where each operator hits the unitarity bound for $N_Q = 6$ in the magnetic theory.
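As shown in Fig. 3, we find that the trial a-function is given by
\[
a(x) =
\begin{cases}
a_0(x) + f_Y(x) + f_G(x) + f_{U_0}(x) + f_H(x) + f_{I_0}(x) & (x \le -\tfrac{1}{6}), \\
a_0(x) + f_Y(x) + f_G(x) + f_{U_0}(x) + f_H(x) & (-\tfrac{1}{6} \le x \le \tfrac{1}{12}), \\
a_0(x) + f_Y(x) + f_G(x) + f_{U_0}(x) & (\tfrac{1}{12} \le x \le \tfrac{1}{9}), \\
a_0(x) + f_Y(x) + f_G(x) & (\tfrac{1}{9} \le x \le \tfrac{1}{6}), \\
a_0(x) + f_M(x) + f_Y(x) + f_G(x) & (\tfrac{1}{6} \le x \le \tfrac{7}{36}), \\
a_0(x) + f_M(x) + f_Y(x) & (\tfrac{7}{36} \le x \le \tfrac{5}{24}), \\
a_0(x) + f_M(x) & (\tfrac{5}{24} \le x \le \tfrac{7}{24}), \\
a_0(x) + f_M(x) + f_B(x) & (\tfrac{7}{24} \le x \le \tfrac{11}{36}), \\
a_0(x) + f_M(x) + f_B(x) + f_{D_0}(x) & (\tfrac{11}{36} \le x \le \tfrac{7}{18}), \\
a_0(x) + f_M(x) + f_B(x) + f_{D_0}(x) + f_{U_2}(x) & (\tfrac{7}{18} \le x \le \tfrac{13}{24}), \\
a_0(x) + f_M(x) + f_B(x) + f_{D_0}(x) + f_{U_2}(x) + f_{E_0}(x) & (\tfrac{13}{24} \le x),
\end{cases}
\tag{3.10}
\]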
where $f_O(x)$ is given by (3.3). The numbers of the components $d_O$ which appear in (3.3) are
given by (3.4) and also by
\[
d_{U_0} = 1, \qquad d_{U_2} = 6, \qquad
d_{E_0} = \frac{3\,N_Q!}{5!\,(N_Q - 5)!}, \qquad
d_{I_0} = \frac{3\,N_Q!}{4!\,(N_Q - 4)!}.
\tag{3.11}
\]
For the range $1/9 \le x \le 7/18$, where the operators which exist only in the magnetic theory do
not hit the unitarity bound, the trial a-function (3.10) has the same shape as that of the electric
theory. As the trial a-function (3.2) of the electric theory has a local maximum in this range, this
function also has a local maximum at the same value of $x$. We can also check that there is
no other local maximum throughout the whole range of $x$, though the function itself is different
from that in the electric theory. Also in the case of $N_Q = 7$, we obtain the same result as in
the electric theory.
In the case of $N_Q = 8$, the ranges of $x$ are given in Fig. 4. We find that the trial a-function
has the same shape as that of the electric theory for $1/12 \le x \le 23/96$, which includes the local
maximum given by (3.7). Since we can verify that there is no local maximum outside this range,
we find that the trial a-function has a unique local maximum, and no operators become free.
Though the ranges of $x$ depend on $N_Q$, we find the same result as in the electric theory also
for $9 \le N_Q \le 19$.
Fig. 4. The ranges of $x$ where each operator hits the unitarity bound for $N_Q = 8$ in the magnetic theory.
Thus, we obtain the same results for the values of the $U(1)_R$ charges in spite of the discrepancy between the gauge invariant operators of the two theories.
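The comparison between the two trial a-functions for $N_Q = 6$ can be illustrated numerically. The sketch below assumes the charges of (3.1) and (3.9) with the signs $R(U_0) = 24x - 2$, $R(U_2) = -24x + 10$, $R(E_0) = -8x + 5$ and $R(I_0) = 8x + 2$, together with $R(Q) = 1 - 4x$ and $R(\Psi) = N_Q x - 1$ (inferred, not quoted from (2.2)). Lorentz spinor operators are ignored, as in the text, and all names are hypothetical.

```python
import math

N_Q = 6
BOUND = 2.0 / 3.0


def F(y):
    """F(y) = 3 (y - 1)^3 - (y - 1)."""
    return 3.0 * (y - 1.0) ** 3 - (y - 1.0)


def a0(x):
    # Assumed forms: R(Psi) = N_Q x - 1 and R(Q) = 1 - 4x.
    return 90.0 + 32.0 * F(N_Q * x - 1.0) + 10.0 * N_Q * F(1.0 - 4.0 * x)


# operator: (slope, intercept, d_O); operators with electric counterparts
ELECTRIC = {
    "M": (-8.0, 2.0, N_Q * (N_Q + 1) // 2),
    "G": (24.0, -4.0, 1),
    "Y": (8.0, -1.0, 3 * N_Q),
    "H": (8.0, 0.0, math.comb(N_Q, 4)),
    "B": (-8.0, 3.0, 3 * math.comb(N_Q, 5)),
    "D0": (-24.0, 8.0, math.comb(N_Q, 6)),
}
# Lorentz scalar operators that exist only in the magnetic theory, (3.9) and (3.11)
MAGNETIC_ONLY = {
    "U0": (24.0, -2.0, 1),
    "U2": (-24.0, 10.0, 6),
    "E0": (-8.0, 5.0, 3 * math.comb(N_Q, 5)),
    "I0": (8.0, 2.0, 3 * math.comb(N_Q, 4)),
}


def trial_a(x, operators):
    """Add f_O(x) of (3.3) for every operator whose charge is below the bound."""
    total = a0(x)
    for slope, intercept, d_o in operators.values():
        r = slope * x + intercept
        if r < BOUND:
            total += d_o * (-F(r) + 2.0 / 9.0)
    return total


def a_electric(x):
    return trial_a(x, ELECTRIC)


def a_magnetic(x):
    return trial_a(x, {**ELECTRIC, **MAGNETIC_ONLY})


lo, hi = 1.0 / 9.0, 7.0 / 18.0
interior = [lo + (hi - lo) * k / 1000 for k in range(1, 1000)]
max_gap = max(abs(a_magnetic(x) - a_electric(x)) for x in interior)
gap_small_x = abs(a_magnetic(0.05) - a_electric(0.05))   # U0 contributes here
gap_large_x = abs(a_magnetic(0.45) - a_electric(0.45))   # U2 contributes here
print(max_gap, gap_small_x, gap_large_x)
```

Inside $1/9 < x < 7/18$ the two functions coincide, so the local maximum found in the electric theory is also a local maximum of the magnetic trial a-function; outside that window the extra operators make the functions differ.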
4. Summary and discussion
We have studied the Spin(10) gauge theory with $N_Q$ vectors and two spinors. We found that the
meson operator $M^{ij} = Q^i Q^j$ decouples from the interacting system to become free for
$N_Q = 6, 7$.
We have discussed the renormalization group flow for the single-spinor case in the paper [1].
In particular, for $N_Q = 7, 8, 9$, we have seen that the two electric theories flow into the same theory
at the IR fixed point. In the present case, since the magnetic theory flows into the theory without
the term $M^{ij} q_i s q_j$ in the IR, the electric theory flows into the same theory as that with $N_{ij} Q^i Q^j$
in the superpotential, as discussed for the single-spinor case in the introduction (see [1] for more
details).
Acknowledgements
We are indebted to Yutaka Ookouchi for collaboration at the early stages of this work. The
research of T.K. was supported in part by the Grants-in-Aid (#16740133) and (#19540268) from
the MEXT of Japan. The research of F.Y. was supported in part by JSPS Research Fellowships
for Young Scientists.
Appendix A. Gauge invariant operator of the electric theory
In this appendix, we explain how to obtain the gauge invariant generators (2.3) of the classical chiral ring of the electric theory. In order to deal with the operators including the Spin(10)
spinors $\Psi_I$, let us recall that the product of the spinors can be decomposed into antisymmetric
tensor representations as
\[
16 \otimes 16 = [1] + [3] + [5]_+,
\]
where $[n]$ represents the rank-$n$ antisymmetric tensor, and the rank-5 tensor is self-dual. They can
be explicitly expressed as $\Psi_I^T C \Gamma^{a_1 \cdots a_n} \Psi_J$. These are symmetric under the exchange of $I$ and $J$
for $n = 1, 5$ and antisymmetric for $n = 3$. All gauge invariant operators can be obtained by contracting the Spin(10) gauge indices $a_i$ $(= 1, \ldots, 10)$ of the antisymmetric tensors $\Psi_I^T C \Gamma^{a_1 \cdots a_n} \Psi_J$
$(n = 1, 3, 5)$, the vectors $Q^{ai}$, the field strength $W_\alpha^{a_1 a_2}$, and the antisymmetric invariant tensors
$\epsilon_{a_1 \cdots a_{10}}$. However, many of the operators constructed in this way are decomposed into the product of other gauge invariant operators or vanish up to $\bar{D}^2$-exact terms. In order to identify
the independent gauge invariant operators, we discuss the constraints among the chiral fields $Q$,
$\Psi$, $W_\alpha$, and the invariant tensors $\epsilon_{a_1 \cdots a_{10}}$.
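The decomposition of the spinor product into antisymmetric tensors, and the stated symmetry under the exchange of the two spinors, can be checked by counting dimensions. The short sketch below uses only binomial coefficients; the variable names are illustrative.

```python
from math import comb

DIM_SPINOR = 16

rank1 = comb(10, 1)               # vector, [1]
rank3 = comb(10, 3)               # rank-3 antisymmetric tensor, [3]
rank5_selfdual = comb(10, 5) // 2  # self-dual half of the rank-5 tensor, [5]_+

total = rank1 + rank3 + rank5_selfdual
symmetric_part = DIM_SPINOR * (DIM_SPINOR + 1) // 2       # symmetric product of two 16s
antisymmetric_part = DIM_SPINOR * (DIM_SPINOR - 1) // 2   # antisymmetric product

print(rank1, rank3, rank5_selfdual, total)
```

The ranks 1 and 5 together fill the symmetric product and rank 3 fills the antisymmetric product, which matches the statement that the bilinears are symmetric for $n = 1, 5$ and antisymmetric for $n = 3$.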
Since the invariant tensor $\epsilon_{a_1 \cdots a_{10}}$ satisfies4
\[
\epsilon^{a_1 \cdots a_{10}} \epsilon_{b_1 \cdots b_{10}} = \delta^{a_1}_{[b_1} \cdots \delta^{a_{10}}_{b_{10}]},
\tag{A.1}
\]
we can see that a pair of the invariant tensors can annihilate. Therefore, all the gauge invariant
operators can be reduced to those with at most one invariant tensor $\epsilon_{a_1 \cdots a_{10}}$.
It follows from (A.1) that
\[
\epsilon^{a_1 \cdots a_{10}} \Psi_I^T C \Gamma_{b_1 \cdots b_n} \Psi_J \propto
\delta^{[a_1}_{b_1} \cdots \delta^{a_n}_{b_n} \Psi_I^T C \Gamma^{a_{n+1} \cdots a_{10}]} \Psi_J.
\tag{A.2}
\]
If we introduce the antisymmetric tensors of rank 7 and 9, as seen from (A.2), we do not need
operators with both the invariant tensor $\epsilon_{a_1 \cdots a_{10}}$ and the antisymmetric tensor. Thus, we find
that all the invariants are classified into operators containing no spinors with at most one of the
invariant tensors $\epsilon_{a_1 \cdots a_{10}}$ and those with spinors and none of the invariant tensors $\epsilon_{a_1 \cdots a_{10}}$.
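We begin with the operators in the former class. A constraint between the field strength $W_\alpha$
and other fields $q$ is given by
\[
W_\alpha^A (T^A)^a{}_b\, q^b \propto \bar{D}^2 \left( e^{-V} D_\alpha e^{V} q \right)^a \sim 0,
\tag{A.3}
\]
where $q$ is a field in a representation of Spin(10) and $T^A$ is the generator in the representation.
For example, when it is the field strength $W_\alpha$, we obtain that $\{W_\alpha, W_\beta\} \sim 0$. Thus, operators with
more than two of the field strength in this class vanish by the anticommutativity of them. Taking
account of (A.3), we find that all the operators in this class are given by
\[
\begin{aligned}
M^{ij} &= Q^{ai} Q^{aj}, \\
S &= \mathrm{Tr}\, W^\alpha W_\alpha, \\
D_0^{i_1 \cdots i_6} &= \epsilon_{a_1 \cdots a_{10}} Q^{a_1 i_1} \cdots Q^{a_6 i_6} W^{\alpha\, a_7 a_8} W_\alpha^{a_9 a_{10}}, \\
D_{1\alpha}^{i_1 \cdots i_8} &= \epsilon_{a_1 \cdots a_{10}} Q^{a_1 i_1} \cdots Q^{a_8 i_8} W_\alpha^{a_9 a_{10}}, \\
D_2^{i_1 \cdots i_{10}} &= \epsilon_{a_1 \cdots a_{10}} Q^{a_1 i_1} \cdots Q^{a_{10} i_{10}}.
\end{aligned}
\tag{A.4}
\]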
We go on to the latter class. We first consider the operators without the field strength. Most of
the constraints on the spinors can be obtained from the Fierz identities. After repeated use of the
Fierz identities and lengthy calculations, we find that the product
IT C a1 ai c1 cn J KT C b1 bj c1 cn L
(A.5)
can in general be given by a linear combination of the products

IT C(2 X )I J a J KT C(2 X )KL a L ,
IT C(2 X )I J a1 a4 b J KT C(2 X )KL b L ,
(A.6)
and those where two antisymmetric tensors are not at all contracted with each other. The sum of
the ranks of the two antisymmetric tensors in the third contribution is always less than that of the
original product (A.5). By using this fact, it turns out that the third contribution is decomposed
4 The brackets [ ] denote the antisymmetrization of the indices.
into other invariant operators. When we use the products of the antisymmetric tensors, they are
thus given by (A.6). Therefore, we can see that all the operators with no field strength in this
class contain at most two of the antisymmetric tensors. More explicitly, they are given by
YXi = IT C(2 X )I J a J Qai ,
C i1 i3 = IT C(2 )I J a1 a3 J Qa1 i1 Qa3 i3 ,
i i5
BX1
F i1 i7 = IT C(2 )I J a1 a7 J Qa1 i1 Qa7 i7 ,

i i9
E2 X1
G = IT C(2 X )I J a J KT C(2 X )KL a L ,

H i1 i4 = IT C(2 X )I J a1 a5 J KT C(2 X )KL a1 L Qa2 i2 Qa5 i5 .
(A.7)
We next consider operators with the spinors and the field strength. The field strength W in the
operators of this class only connect to another one W or the antisymmetric tensors due to (A.3)
and {W , W } 0. By using the identity
[a1 a2 am ]
[a1 a2 a3 am ]
a1 am bc = a1 am bc + [b
c] bc
and the relation (A.3) for the spinor representation, we obtain

0 Wbc IT C a1 am bc J + 2W [a1 b IT C a2 am ]b J 2W [a1 a2 IT C a3 am ] J .
By decomposing this equation into the symmetric and the antisymmetric part under the exchange
of the flavor indices I and J , we obtain the equations
Wbc IT C a1 am bc J 2W [a1 a2 IT C a3 am ] J ,
W [a1 b IT C a2 am ]b J 0.
(A.8)
We can see from the first equation of (A.8) that the rank of the antisymmetric tensor connected to
the field strength with two indices can be reduced by four. By using the second equation of (A.8),
we find that the operators including the field strength contracted with two antisymmetric tensors,
IT C a1 am1 b J W bc KT C a1 am1 c L ,
IT C a1 am1 b J W bc W cd KT C a1 an1 d L ,
can be reorganized into the operator where the two antisymmetric tensors are directly contracted.
Similarly to the previous discussion leading to (A.7), such products of the antisymmetric tensors
can be rewritten, and if not vanish, the field strength is in turn connected to the antisymmetric
tensor with the two indices or is decomposed with another field strength into the glueball S. Thus,
we find that operators with the spinors and the field strength finally vanish according to (A.3) or
are decomposed into the product of the glueball superfield S and operators with the spinors.
To summarize, the operators in (A.4) and (A.7) are the gauge invariant generators of the
classical chiral ring of the electric theory, as listed in (2.3).
Appendix B. Gauge invariant operators of the magnetic theory
In this appendix, we only discuss the outline on how to obtain the gauge invariant generators
of the classical chiral ring of the magnetic theory. Similarly to the case of the electric theory, an
identity about the antisymmetric invariant tensors

group is given by
a1 aNQ 3
aN
a1 aNQ 3
, b1 bNQ 3 of SU(NQ 3) gauge
a1
b1 bNQ 3 = [b
bN Q3 ] .
1
(B.1)
Thus, all the gauge invariant operators can be classified into operators with none of the antisymmetric invariant tensors, those with the invariant tensors with the lower indices, and those with
the invariant tensors with the upper indices.
We first consider the operators without the antisymmetric invariant tensors. Eq. (A.3) is also
valid for the field strength w and w of SU(NQ 3) and Sp(1), respectively. Taking (A.3) into
account together with the F -term conditions, we can verify that operators without the invariant
tensors are given by the gauge singlets M, Y , and the composites
G t I (2 )I J t J ,
S Tr w w ,
S Tr w w .
(B.2)
Here, we also have used

s ab wb c s cb wb a ,
(B.3)
s ab .
which follows from (A.3) for the symmetric tensors

We next consider operators including the invariant tensors a1 aNQ 3 . It turns out that all the
operators in this class are given by the contraction of the invariant tensor a1 aNQ 3 with the four
operators
a1
,
(s q)
a1 i ,
qX
n a1 b1
b1 bNQ 3
sw
(sw )a1 a2 ,
(n = 0, 1, 2),
(B.4)
which are supposed so that the indices a1 , a2 of the third in (B.4) are contracted with those of
a1 aNQ 3 , while the indices b2 , . . . , bNQ 3 of the other invariant tensor b1 bNQ 3 in the fourth
are contracted with another of (B.4). Taking account of the index X = 1, 2, 3 of the field qX and
the index = 1, 2 of the field strength w , we notice that at most three of the first in (B.4) and
two of the third can be contracted with the same invariant tensor. Therefore, all the operators
with the single a1 aNQ 3 are given by
Dn ,
En ,
In ,
Jm
(n = 0, 1, 2, m = 1, 2),
(B.5)
in (2.5) and (2.6). Note that the operator

aNQ 3
(J0 )i1 iNQ 3 = a1 aNQ 3 (s qi1 )a1 (s qiNQ 3 )
can be decomposed into the product of the operator C in (2.5) and U0 in (2.6).
We turn to the gauge invariant operators with more than one a1 aNQ 3 and find that all the
independent gauge invariant operators are given by
U0 = det s,
U1XY = a1 aNQ 3 b1 bNQ 3 s a1 b1 s
aNQ 4 bNQ 4 aNQ 3 bNQ 3

qX
qY
,
U2XY = XX1 X2 Y Y1 Y2 a1 aNQ 3 b1 bNQ 3

s a1 b1 s
aNQ 5 bNQ 5 aNQ 4 aNQ 3 bNQ 4 bNQ 3

qX1
qX2
qY1
qY2
,
U3 = X1 X2 X3 Y1 Y2 Y3 a1 aNQ 3 b1 bNQ 3
s a1 b1 s
aNQ 6 bNQ 6 aNQ 5 aNQ 4 aNQ 3 bNQ 5 bNQ 4 bNQ 3

q X1
qX2
qX3
qY1
qY2
qY3
.
(B.6)
Let us begin with one invariant tensor a1 aNQ 3 and all the symmetric tensor s ab contracted
with it,
a1 ak ak+1 aNQ 3 s a1 b1 s ak bk Tak+1 aNQ 3 b1 bk ,
in an operator of this class. The indices ak+1 , . . . , aNQ 3 are supposed to be contracted with
those of the first in (B.4) or those of the field strength w in the third. As the indices b1 , . . . , bk
in T are antisymmetric, by using (B.1), we can rewrite it as
Tak+1 aNQ 3 b1 bk Tak+1 aNQ 3 d1 dk d1 dk ek+1 eNQ 3
b1 bk ek+1 eNQ 3
(B.7)
On the other hand, since we are considering the operators with more than one invariant tensors,
the operators have another invariant tensor c1 cNQ 3 other than those included in (B.7). Then,
we apply (B.1) again to
b1 bk ek+1 eNQ 3
in (B.7) and c1 cNQ 3 . We thus obtain
bk
Tak+1 aNQ 3 b1 bk c1 cNQ 3 Tak+1 aNQ 3 d1 dk d1 dk [ck+1 cNQ 3 cb11c
.
k]
(B.8)
After this procedure, other s ab besides those in (B.8) may connect to the original a1al al+1aNQ 3 ,
upon the use of (B.1). Then, we can use (B.1) for all the symmetric tensors s ab contracted with
the tensor a1 al al+1 aNQ 3 to annihilate the other d1 dk ck+1 cNQ 3 in (B.8) and the appearing
invariant tensor of the upper indices. If the resulting operator does not vanish, we obtain the
following form
a1 al al+1 aNQ 3 s a1 b1 s al bl q al+1 q
aNQ 3
b1 bl bl+1 bNQ 3 ,
(B.9)
where the remaining indices bk+1 bNQ 3 are contracted with those in (B.4). Again, we apply (B.1) to all symmetric tensors contracted with b1 bk bk+1 bNQ 3 in (B.9) to eliminate the
original invariant tensor a1 aNQ 3 and the newly appearing invariant tensor. We find that all the
a a
gauge invariant operators with more than one 1 NQ 3 except for (B.6) vanish or are decomposed into the gauge invariant operators.
a a
We next consider operators including the invariant tensors 1 NQ 3 with the upper indices.
It turns out that all the operators in this class are given by the contraction of the invariant tensor
a a
1 NQ 3 with the five operators5
qa1 i ,
I J q a1 I t J ,
q a1 (I q b1 ||J )
b1 bNQ 3
q a1 (I q b ||J q |b|KL) ,
,
q a1 (I q a2 ||J )
(B.10)
where q aI J is related to q aX in Table 1 as

a
(X 2 )I J ,
q aI J qX
and thus, it is symmetric under exchange of the indices I and J . The indices a1 and a2 of the
a a
b b
fifth operator in (B.10) are contracted with those of 1 NQ 3 , while 1 NQ 3 in the fourth is
contracted with the operators in (B.10).
Taking account of the indices of the local Sp(1) and those of the global SU(2), we find that at
a a
most four q can be contracted with 1 NQ 3 . The numbers of the second, the third, the fourth,
a a
and the fifth operators in (B.10) contracted with the invariant tensor 1 NQ 3 are limited from
5 The parentheses ( ) denote the symmetrization of the indices, while [ ] does the antisymmetrization.
this fact. Further, if two q from the second, the third, and the fourth are contracted with the
invariant tensor, the symmetric part of the global SU(2) indices of them can be rewritten in terms
of the fifth and some other parts. In fact, when the indices of the global SU(2) of these two q
are symmetric, the local Sp(1) indices of those q should be antisymmetric. Then, by using the
relation for the invariant tensor of Sp(1),
1 2
1 2 1 2 = [
,
1 2 ]
we can see that

1 a a a
= 1 2 NQ 3 q a1 1 (I q a2 |2 |J ) 1 2 1 2 ,
2
and it gives rise to the fifth. This is always possible when more than two q from the second, the
a a
third, and the fourth are contracted with the invariant tensor 1 NQ 3 of SU(NQ 3), because

the global SU(2) indices of two q of them must take the same value, thus symmetric. Thus,
we find that the total number of the second, the third, and the fourth contracted with the same
a a
invariant tensor 1 NQ 3 should be less than three.
When four of q are contracted with the invariant tensor, each two of them take the same
value of the SU(2) indices, respectively, and can be rewritten in terms of two copies of the fifth
and some other parts. Thus, when one of the fifth is contracted with the invariant tensor, the
total number of the second, the third, and the fourth contracted with the same invariant tensor
a a
1 NQ 3 should be less than two.
Wrapping up these facts, together with the F -term conditions, (A.3), and (B.1), we can verify
a a
that all the operators with the single 1 NQ 3 are given by
a1 a2 aNQ 3
a1
1 (I
(C)i1 iNQ 3
|2 |J )
a1 aNQ 3
(B)Xi1 iNQ 5
(F )i1 iNQ 7
a2
a1 aNQ 3
qa1 i1 qaNQ 5 iNQ 5 q aN
a1 aNQ 3
q aN
I
(2 X )I J q aN
Q 4
Q 3

(2 X )I J q aN
Q 6
(H )i1 iNQ 4 I J
a1 aNQ 3
Q 5
q aN
K
Q 4
(2 X )KL q aN
L
Q 3
qa1 i1 qaNQ 4 iNQ 4 q aN

I
Q 3
t J .
,
(B.11)
a a
We go on to the operators with more than two 1 NQ 3 and skip those with two here. The
latter will be explained later. We will see that all the operators in these classes do not give the
independent gauge invariant operators. Since the only fourth operators in (B.10) can connect with
a a
a a
two invariant tensors 1 NQ 3 , the operators with more than two 1 NQ 3 should include at
a1 aNQ 3
which are contracted with two of the fourth. Further, all the remaining indices
least one
a a
of the same 1 NQ 3 must be contracted with the first operator in (B.10), as

a a
b b
1 NQ 3 qa1 qaNQ 5 q aNQ 4 (I q b1 ||J ) 1 NQ 3

c c
q aNQ 3 (K q c1 ||L) 1 NQ 3 ,
(B.12)
as we can see from the previous discussion. Here, we apply the identity
[a1 aNQ 3 b1 ]b2 bNQ 3
a1 aNQ 3
b1 bNQ 3
= 0,
(B.13)
and
in (B.12). Ignoring the terms decomposed into the products of gauge
to
invariant operators, we find that the resulting operators are given by

NQ 5

1 a1 ak1 b1 ak+1 aNQ 3
qa1 qak1 qak+1 qaNQ 5 q aNQ 4 (I q b1 ||J )

2
k=1

a b b

c c
q aNQ 3 (K q c1 ||L) 1 NQ 3 k 2 NQ 3 qak .
(B.14)
If the resulting operator is not decomposed into gauge invariant operators, the last factor
a b b
c c
k 2 NQ 3 qak in (B.14) are connected with the invariant tensor 1 NQ 3 via other operac1 cNQ 3
in (B.14) is contracted with two
tors. This happens only when the invariant tensor
of the fourth in (B.10). This is the same situation we previously have seen for the invariant tena a
sor 1 NQ 3 in (B.12), and thus, we can repeat the same procedure to show that the resulting
operator is decomposed into gauge invariant operators.
a a
We now turn to the operators with two 1 NQ 3 . As discussed previously, the only fourth
operator in (B.10) can be used to connect the two invariant tensors. In particular, they are connected by at most two of the operator. By using the identity (B.13), we can see that the invariant
tensors connected by two of the fourth in (B.10) can be reduced to those by one. Thus, we only
have to consider the latter operators. If either of the invariant tensors does not have the fifth operator of (B.10), we can use the identity (B.13) If both of them have the fifth operators, a closer
examination is needed on the symmetry of the global SU(2) indices of q s. Taking account of this
point and the identity (B.13), we can verify that they are also decomposed into gauge invariant
operators.
To summarize, the singlets M, Y , and the operators listed in (B.2), (B.5), (B.6), and (B.11)
are the gauge invariant generators of the classical chiral ring of the magnetic theory.
As discussed in Section 2, all the gauge invariant generators (2.3) of the classical chiral ring
in the electric theory have counterparts (2.5) in the magnetic theory. However, the extra
gauge invariant operators (2.6) seem to exist in the magnetic theory. If the electric–magnetic
duality is true for this model, this discrepancy should disappear at the quantum level.
References
[1] T. Kawano, Y. Ookouchi, Y. Tachikawa, F. Yagi, Pouliot type duality via a-maximization, Nucl. Phys. B 735
(2006) 1, hep-th/0509230.
[2] P. Pouliot, M.J. Strassler, Duality and dynamical supersymmetry breaking in Spin(10) with a spinor, Phys. Lett.
B 375 (1996) 175, hep-th/9602031.
[3] T. Kawano, Duality of N = 1 supersymmetric SO(10) gauge theory with matter in the spinorial representation,
Prog. Theor. Phys. 95 (1996) 963, hep-th/9602035.
[4] M. Flato, C. Fronsdal, Representations of conformal supersymmetry, Lett. Math. Phys. 8 (1984) 159.
[5] G. Mack, All unitary ray representations of the conformal group SU(2, 2) with positive energy, Commun. Math.
Phys. 55 (1977) 1.
[6] N. Seiberg, Electric–magnetic duality in supersymmetric non-Abelian gauge theories, Nucl. Phys. B 435 (1995)
129, hep-th/9411149.
[7] D. Kutasov, A. Parnachev, D.A. Sahakyan, Central charges and U(1)_R symmetries in N = 1 super Yang–Mills,
JHEP 0311 (2003) 013, hep-th/0308071.
[8] D. Kutasov, A. Schwimmer, On duality in supersymmetric Yang–Mills theory, Phys. Lett. B 354 (1995) 315, hep-th/9505004.
[9] K. Intriligator, B. Wecht, The exact superconformal R-symmetry maximizes a, Nucl. Phys. B 667 (2003) 183,
hep-th/0304128.
[10] G. 't Hooft, Naturalness, chiral symmetry, and spontaneous chiral symmetry breaking, in: G. 't Hooft, et al. (Eds.),
Recent Developments in Gauge Theories, Plenum Press, New York, 1980, p. 135.
[11] M. Berkooz, P.L. Cho, P. Kraus, M.J. Strassler, Dual descriptions of SO(10) SUSY gauge theories with arbitrary
numbers of spinors and vectors, Phys. Rev. D 56 (1997) 7166, hep-th/9705003.
Multijet production at low $x_{\rm Bj}$ in deep inelastic scattering at HERA
ZEUS Collaboration
S. Chekanov 1 , M. Derrick, S. Magill, B. Musgrave, D. Nicholass 2 ,
J. Repond, R. Yoshida
Argonne National Laboratory, Argonne, IL 60439-4815, USA 3
M.C.K. Mattingly
Andrews University, Berrien Springs, Michigan 49104-0380, USA
M. Jechow, N. Pavel, A.G. Yagües Molina

Institut für Physik der Humboldt-Universität zu Berlin, Berlin, Germany
S. Antonelli, P. Antonioli, G. Bari, M. Basile, L. Bellagamba, M. Bindi,

D. Boscherini, A. Bruni, G. Bruni, L. Cifarelli, F. Cindolo, A. Contin,
M. Corradi 4 , S. De Pasquale, G. Iacobucci, A. Margotti, R. Nania,
A. Polini, G. Sartorelli, A. Zichichi
University and INFN Bologna, Bologna, Italy 5
D. Bartsch, I. Brock, S. Goers 6 , H. Hartmann, E. Hilger, H.-P. Jakob,

M. Jüngst, O.M. Kind 7 , A.E. Nuncio-Quiroz, E. Paul 8 , R. Renner 6 ,
U. Samson, V. Schönberg, R. Shehzadi, M. Wlasenko
Physikalisches Institut der Universität Bonn, Bonn, Germany 9
N.H. Brook, G.P. Heath, J.D. Morris, T. Namsoo

H.H. Wills Physics Laboratory, University of Bristol, Bristol, United Kingdom 10
doi:10.1016/j.nuclphysb.2007.05.027
RAPID COMMUNICATION
M. Capua, S. Fazio, A. Mastroberardino, M. Schioppa, G. Susinno,

E. Tassi
Calabria University, Physics Department and INFN, Cosenza, Italy 5
J.Y. Kim 11 , K.J. Ma 12

Chonnam National University, Kwangju, South Korea 13
Z.A. Ibrahim, B. Kamaluddin, W.A.T. Wan Abdullah

Jabatan Fizik, Universiti Malaya, 50603 Kuala Lumpur, Malaysia 14
Y. Ning, Z. Ren, F. Sciulli

Nevis Laboratories, Columbia University, Irvington on Hudson, New York 10027 15
J. Chwastowski, A. Eskreys, J. Figiel, A. Galas, M. Gil, K. Olkiewicz,

P. Stopa, L. Zawiejski
The Henryk Niewodniczanski Institute of Nuclear Physics, Polish Academy of Sciences, Cracow, Poland 16
L. Adamczyk, T. Bołd, I. Grabowska-Bołd, D. Kisielewska, J. Łukasik,

M. Przybycien, L. Suszycki
Faculty of Physics and Applied Computer Science, AGH-University of Science and Technology, Cracow, Poland 17
A. Kotanski 18 , W. Sominski 19
Department of Physics, Jagellonian University, Cracow, Poland
V. Adler 20 , U. Behrens, I. Bloch, C. Blohm, A. Bonato, K. Borras,

R. Ciesielski, N. Coppola, A. Dossanov, V. Drugakov, J. Fourletova,
A. Geiser, D. Gladkov, P. Göttlicher 21 , J. Grebenyuk, I. Gregor, T. Haas,
W. Hain, C. Horn 22 , A. Hüttmann, B. Kahle, I.I. Katkov, U. Klein 23 ,
U. Kötz, H. Kowalski, E. Lobodzinska, B. Löhr, R. Mankel,
I.-A. Melzer-Pellmann, S. Miglioranzi, A. Montanari, D. Notz,
L. Rinaldi, P. Roloff, I. Rubinsky, R. Santamarta, U. Schneekloth,
A. Spiridonov 24 , H. Stadie, D. Szuba 25 , J. Szuba 26 , T. Theedt, G. Wolf,
K. Wrona, C. Youngman, W. Zeuner
Deutsches Elektronen-Synchrotron DESY, Hamburg, Germany
W. Lohmann, S. Schlenstedt
Deutsches Elektronen-Synchrotron DESY, Zeuthen, Germany
G. Barbagli, E. Gallo , P.G. Pelfer

University and INFN, Florence, Italy 5
A. Bamberger, D. Dobur, F. Karstens, N.N. Vlasov 27

Fakultät für Physik der Universität Freiburg i.Br., Freiburg i.Br., Germany 9
P.J. Bussey, A.T. Doyle, W. Dunne, J. Ferrando, M. Forrest, D.H. Saxon,

I.O. Skillicorn
Department of Physics and Astronomy, University of Glasgow, Glasgow, United Kingdom 10
I. Gialas 28 , K. Papageorgiu
Department of Engineering in Management and Finance, University of Aegean, Greece
T. Gosau, U. Holm, R. Klanner, E. Lohrmann, H. Salehi, P. Schleper,

T. Schörner-Sadenius, J. Sztuk, K. Wichmann, K. Wick
Hamburg University, Institute of Experimental Physics, Hamburg, Germany 9
C. Foudas, C. Fry, K.R. Long, A.D. Tapper

Imperial College London, High Energy Nuclear Physics Group, London, United Kingdom 10
M. Kataoka 29 , T. Matsumoto, K. Nagano, K. Tokushuku 30 , S. Yamada,

Y. Yamazaki
Institute of Particle and Nuclear Studies, KEK, Tsukuba, Japan 31
A.N. Barakbaev, E.G. Boos, N.S. Pokrovskiy, B.O. Zhautykov

Institute of Physics and Technology of Ministry of Education and Science of Kazakhstan, Almaty, Kazakhstan
V. Aushev 1
Institute for Nuclear Research, National Academy of Sciences, Kiev and Kiev National University, Kiev, Ukraine
D. Son
Kyungpook National University, Center for High Energy Physics, Daegu, South Korea 13
J. de Favereau, K. Piotrzkowski
Institut de Physique Nucléaire, Université Catholique de Louvain, Louvain-la-Neuve, Belgium 32
F. Barreiro, C. Glasman 33 , M. Jimenez, L. Labarga, J. del Peso, E. Ron,

M. Soares, J. Terrón, M. Zambrana
Departamento de Física Teórica, Universidad Autónoma de Madrid, Madrid, Spain 34
F. Corriveau, C. Liu, R. Walsh, C. Zhou

Department of Physics, McGill University, Montréal, Québec, Canada H3A 2T8 35
T. Tsurugai
Meiji Gakuin University, Faculty of General Education, Yokohama, Japan 31
A. Antonov, B.A. Dolgoshein, V. Sosnovtsev, A. Stifutkin, S. Suchkov

Moscow Engineering Physics Institute, Moscow, Russia 36
R.K. Dementiev, P.F. Ermolov, L.K. Gladilin, L.A. Khein,

I.A. Korzhavina, V.A. Kuzmin, B.B. Levchenko 37 , O.Yu. Lukina,
A.S. Proskuryakov, L.M. Shcheglova, D.S. Zotkin, S.A. Zotkin
Moscow State University, Institute of Nuclear Physics, Moscow, Russia 38
I. Abt, C. Büttner, A. Caldwell, D. Kollar, W.B. Schmidke, J. Sutiak

Max-Planck-Institut für Physik, München, Germany
G. Grigorescu, A. Keramidas, E. Koffeman, P. Kooijman, A. Pellegrino,

H. Tiecke, M. Vázquez 29 , L. Wiggers
NIKHEF and University of Amsterdam, Amsterdam, The Netherlands 39
N. Brümmer, B. Bylsma, L.S. Durkin, A. Lee, T.Y. Ling

Physics Department, Ohio State University, Columbus, OH 43210, USA 3
P.D. Allfrey, M.A. Bell, A.M. Cooper-Sarkar, A. Cottrell,

R.C.E. Devenish, B. Foster, K. Korcsak-Gorzo, S. Patel, V. Roberfroid 40 ,
A. Robertson, P.B. Straub, C. Uribe-Estrada, R. Walczak
Department of Physics, University of Oxford, Oxford, United Kingdom 10
P. Bellan, A. Bertolin, R. Brugnera, R. Carlin, F. Dal Corso, S. Dusini,

A. Garfagnini, S. Limentani, A. Longhin, L. Stanco, M. Turcato
Dipartimento di Fisica dell'Università and INFN, Padova, Italy 5
B.Y. Oh, A. Raval, J. Ukleja 41 , J.J. Whitmore 42

Department of Physics, Pennsylvania State University, University Park, PA 16802, USA 15
Y. Iga
Polytechnic University, Sagamihara, Japan 31
G. D'Agostini, G. Marini, A. Nigro

Dipartimento di Fisica, Università 'La Sapienza' and INFN, Rome, Italy
J.E. Cole, J.C. Hart

Rutherford Appleton Laboratory, Chilton, Didcot, Oxon, United Kingdom 10
H. Abramowicz 43 , A. Gabareen, R. Ingbir, S. Kananov, A. Levy

Raymond and Beverly Sackler Faculty of Exact Sciences, School of Physics, Tel-Aviv University, Tel-Aviv, Israel 44
M. Kuze, J. Maeda
Department of Physics, Tokyo Institute of Technology, Tokyo, Japan 31
R. Hori, S. Kagawa 45 , N. Okazaki, S. Shimizu, T. Tawara

Department of Physics, University of Tokyo, Tokyo, Japan 31
R. Hamatsu, H. Kaji 46 , S. Kitamura 47 , O. Ota, Y.D. Ri

Tokyo Metropolitan University, Department of Physics, Tokyo, Japan 31
M.I. Ferrero, V. Monaco, R. Sacchi, A. Solano

Università di Torino and INFN, Torino, Italy 5
ZEUS Collaboration / Nuclear Physics B 786 (2007) 152–180
M. Arneodo, M. Ruspa
Università del Piemonte Orientale, Novara, and INFN, Torino, Italy 5
S. Fourletov, J.F. Martin

Department of Physics, University of Toronto, Toronto, Ontario, Canada M5S 1A7 35
S.K. Boutle 28 , J.M. Butterworth, C. Gwenlan 48 , T.W. Jones,

J.H. Loizides, M.R. Sutton 48 , M. Wing
Physics and Astronomy Department, University College London, London, United Kingdom 10
B. Brzozowska, J. Ciborowski 49 , G. Grzelak, P. Kulinski, P. Łużniak 50 ,

J. Malka 50 , R.J. Nowak, J.M. Pawlak, T. Tymieniecka, A. Ukleja,
A.F. Zarnecki
Warsaw University, Institute of Experimental Physics, Warsaw, Poland
M. Adamus, P. Plucinski 51
Institute for Nuclear Studies, Warsaw, Poland
Y. Eisenberg, I. Giller, D. Hochman, U. Karshon, M. Rosin

Department of Particle Physics, Weizmann Institute, Rehovot, Israel 52
E. Brownson, T. Danielson, A. Everett, D. Kira, D.D. Reeder 8 , P. Ryan,

A.A. Savin, W.H. Smith, H. Wolfe
Department of Physics, University of Wisconsin, Madison, WI 53706, USA 3
S. Bhadra, C.D. Catterall, Y. Cui, G. Hartner, S. Menary, U. Noor,

J. Standage, J. Whyte
Department of Physics, York University, Ontario, Canada M3J 1P3 35
Received 12 May 2007; accepted 23 May 2007

Abstract
Inclusive dijet and trijet production in deep inelastic ep scattering has been measured for
$10 < Q^2 < 100\ \mathrm{GeV}^2$ and low Bjorken x, $10^{-4} < x_{Bj} < 10^{-2}$. The data were taken at the
HERA ep collider with centre-of-mass energy $\sqrt{s} = 318$ GeV using the ZEUS detector and correspond
to an integrated luminosity of $82\ \mathrm{pb}^{-1}$. Jets were identified in the hadronic centre-of-mass (HCM)
frame using the $k_T$ cluster algorithm in the longitudinally invariant inclusive mode. Measurements of
dijet and trijet differential cross sections are presented as functions of $Q^2$, $x_{Bj}$, jet transverse
energy, and jet pseudorapidity. As a further examination of low-$x_{Bj}$ dynamics, multi-differential cross
sections as functions of the jet correlations in transverse momenta, azimuthal angles, and pseudorapidity
are also presented. Calculations at $O(\alpha_s^3)$ generally describe the trijet data well and improve the
description of the dijet data compared to the calculation at $O(\alpha_s^2)$.
E-mail address: gallo@mail.desy.de (E. Gallo).

1 Supported by DESY, Germany.
2 Also affiliated with University College London, UK.
3 Supported by the US Department of Energy.
4 Also at University of Hamburg, Germany, Alexander von Humboldt Fellow.
5 Supported by the Italian National Institute for Nuclear Physics (INFN).
6 Self-employed.
7 Now at Humboldt University, Berlin, Germany.
8 Retired.
9 Supported by the German Federal Ministry for Education and Research (BMBF), under contract numbers
HZ1GUA 2, HZ1GUB 0, HZ1PDA 5, HZ1VFA 5.
10 Supported by the Particle Physics and Astronomy Research Council, UK.
11 Supported by Chonnam National University in 2005.
12 Supported by a scholarship of the World Laboratory Björn Wiik Research Project.
13 Supported by the Korean Ministry of Education and Korea Science and Engineering Foundation.
14 Supported by the Malaysian Ministry of Science, Technology and Innovation/Akademi Sains Malaysia grant SAGA
66-02-03-0048.
15 Supported by the US National Science Foundation. Any opinion, findings and conclusions or recommendations
expressed in this material are those of the authors and do not necessarily reflect the views of the National Science
Foundation.
16 Supported by the Polish State Committee for Scientific Research, grant No. 620/E-77/SPB/DESY/P-03/DZ
117/2003-2005 and grant No. 1P03B07427/2004-2006.
17 Supported by the Polish Ministry of Science and Higher Education as a scientific project (2006-2008).
18 Supported by the research grant No. 1 P03B 04529 (2005-2008).
19 This work was supported in part by the Marie Curie Actions Transfer of Knowledge project COCOS (contract
MTKD-CT-2004-517186).
20 Now at Univ. Libre de Bruxelles, Belgium.
21 Now at DESY group FEB, Hamburg, Germany.
22 Now at Stanford Linear Accelerator Center, Stanford, USA.
23 Now at University of Liverpool, UK.
24 Also at Institut of Theoretical and Experimental Physics, Moscow, Russia.
25 Also at INP, Cracow, Poland.
1. Introduction
Multijet production in deep inelastic ep scattering (DIS) at HERA has been used to test the
predictions of perturbative QCD (pQCD) over a large range of negative four-momentum transfer
squared, $Q^2$, and to determine the strong coupling constant $\alpha_s$ [1,2]. At leading order (LO) in
$\alpha_s$, dijet production in neutral current DIS proceeds via the boson-gluon-fusion ($V^* g \to q\bar{q}$ with
$V = \gamma, Z^0$) and QCD-Compton ($V^* q \to qg$) processes. Events with three jets can be seen as
dijet processes with an additional gluon radiation or with a gluon splitting into a quark-antiquark
pair and are directly sensitive to $O(\alpha_s^2)$ QCD effects. The higher sensitivity to $\alpha_s$ and the large
number of degrees of freedom of the trijet final state provide a good testing ground for the pQCD
predictions. In particular, multijet production in DIS is an ideal environment for investigating
different approaches to parton dynamics at low Bjorken-x, $x_{Bj}$ [3]. An understanding of this regime
is of particular relevance in view of the startup of the LHC, where many of the Standard Model
processes such as the production of electroweak gauge bosons or the Higgs particle involve the
collision of partons with a low fraction of the proton momentum.
26 On leave of absence from FPACS, AGH-UST, Cracow, Poland.
27 Partly supported by Moscow State University, Russia.
28 Also affiliated with DESY.
29 Now at CERN, Geneva, Switzerland.
30 Also at University of Tokyo, Japan.
31 Supported by the Japanese Ministry of Education, Culture, Sports, Science and Technology (MEXT) and its grants
for Scientific Research.
32 Supported by FNRS and its associated funds (IISN and FRIA) and by an Inter-University Attraction Poles Programme
subsidised by the Belgian Federal Science Policy Office.
33 Ramón y Cajal Fellow.
34 Supported by the Spanish Ministry of Education and Science through funds provided by CICYT.
35 Supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).
36 Partially supported by the German Federal Ministry for Education and Research (BMBF).
37 Partly supported by Russian Foundation for Basic Research grant No. 05-02-39028-NSFC-a.
38 Supported by RF Presidential grant No. 8122.2006.2 for the leading scientific schools and by the Russian Ministry
of Education and Science through its grant Research on High Energy Physics.
39 Supported by the Netherlands Foundation for Research on Matter (FOM).
40 EU Marie Curie Fellow.
41 Partially supported by Warsaw University, Poland.
42 This material was based on work supported by the National Science Foundation, while working at the Foundation.
43 Also at Max Planck Institute, Munich, Germany, Alexander von Humboldt Research Award.
44 Supported by the German-Israeli Foundation and the Israel Science Foundation.
45 Now at KEK, Tsukuba, Japan.
46 Now at Nagoya University, Japan.
47 Department of Radiological Science.
48 PPARC Advanced fellow.
49 Also at Łódź University, Poland.
50 Łódź University, Poland.
51 Supported by the Polish Ministry for Education and Science grant No. 1 P03B 14129.
52 Supported in part by the MINERVA Gesellschaft für Forschung GmbH, the Israel Science Foundation (grant
No. 293/02-11.2) and the US-Israel Binational Science Foundation.
Deceased.
In the usual collinear QCD factorisation approach, the cross sections are obtained as the convolution of perturbative matrix elements and parton densities evolved according to the DGLAP
evolution equations [4]. These equations resum to all orders the terms proportional to $\alpha_s \ln Q^2$
and the double logarithms $\ln Q^2 \cdot \ln 1/x$, where x is the fraction of the proton momentum carried
by a parton, which is equal to $x_{Bj}$ in the quark-parton model. In the DGLAP approach, the parton
participating in the hard scattering is the result of a partonic cascade ordered in transverse
momentum, $p_T$. The partonic cascade starts from a low-$p_T$ and high-x parton from the incoming
proton and ends up, after consecutive branching, in the high-$p_T$ and low-x parton entering in
the hard scattering. This approximation has been tested extensively at HERA and was found to
describe well the inclusive cross sections [5,6] and jet production [1,2,7,8]. At low $x_{Bj}$, where
the phase space for parton emissions increases, terms proportional to $\alpha_s \ln 1/x$ may become large
and spoil the accuracy of the DGLAP approach. In this region the transverse momenta and angular
correlations between partons produced in the hard scatter may be sensitive to effects beyond
DGLAP dynamics. The information about cross sections, transverse energy, $E_T$, and angular
correlations between the two leading jets in multijet production therefore provides an important
testing ground for studying the parton dynamics in the region of small $x_{Bj}$.
In this analysis, correlations for both azimuthal and polar angles, and correlations in jet
transverse energy and momenta for dijet and trijet production in the hadronic ($\gamma^* p$) centre-of-mass
(HCM) frame are measured with high statistical precision in the kinematic region restricted to
$10 < Q^2 < 100\ \mathrm{GeV}^2$ and $10^{-4} < x_{Bj} < 10^{-2}$. The results are compared with pQCD calculations
at next-to-leading order (NLO). A similar study of inclusive dijet production was performed by
the H1 Collaboration [9].
2. Experimental set-up
The data used in this analysis were collected during the 1998-2000 running period, when
HERA operated with protons of energy $E_p = 920$ GeV and electrons or positrons (footnote 53) of energy
$E_e = 27.5$ GeV, and correspond to an integrated luminosity of $81.7 \pm 1.8\ \mathrm{pb}^{-1}$. A detailed
description of the ZEUS detector can be found elsewhere [10,11]. A brief outline of the components
that are most relevant for this analysis is given below.
Charged particles are measured in the central tracking detector (CTD) [12], which operates in
a magnetic field of 1.43 T provided by a thin superconducting solenoid. The CTD consists of 72
cylindrical drift chamber layers, organised in nine superlayers covering the polar-angle (footnote 54) region
$15^\circ < \theta < 164^\circ$. The transverse momentum resolution for full-length tracks can be parameterised
as $\sigma(p_T)/p_T = 0.0058\,p_T \oplus 0.0065 \oplus 0.0014/p_T$, with $p_T$ in GeV. The tracking system was
used to measure the interaction vertex with a typical resolution along (transverse to) the beam
direction of 0.4 (0.1) cm and also to cross-check the energy scale of the calorimeter.
The high-resolution uranium-scintillator calorimeter (CAL) [13] covers 99.7% of the total
solid angle and consists of three parts: the forward (FCAL), the barrel (BCAL) and the rear
(RCAL) calorimeters. Each part is subdivided transversely into towers and longitudinally into
one electromagnetic section and either one (in RCAL) or two (in BCAL and FCAL) hadronic
sections. The smallest subdivision of the calorimeter is called a cell. Under test-beam conditions,
the CAL single-particle relative energy resolutions were $\sigma(E)/E = 0.18/\sqrt{E}$ for electrons and
$\sigma(E)/E = 0.35/\sqrt{E}$ for hadrons, with E in GeV.
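The two resolution parameterisations quoted in this section can be evaluated numerically. The following is a minimal sketch, assuming that the three terms of the CTD parameterisation are combined in quadrature (the usual meaning of the $\oplus$ symbol); the function names are illustrative and not part of any ZEUS software.

```python
import math


def ctd_relative_pt_resolution(pt_gev: float) -> float:
    """sigma(pT)/pT for full-length CTD tracks.

    Assumption: the three quoted terms are added in quadrature.
    """
    terms = (0.0058 * pt_gev, 0.0065, 0.0014 / pt_gev)
    return math.sqrt(sum(t * t for t in terms))


def cal_relative_energy_resolution(e_gev: float, particle: str) -> float:
    """sigma(E)/E under test-beam conditions; particle is 'electron' or 'hadron'."""
    stochastic_term = {"electron": 0.18, "hadron": 0.35}[particle]
    return stochastic_term / math.sqrt(e_gev)


if __name__ == "__main__":
    for pt in (1.0, 10.0, 50.0):
        print(f"pT = {pt:5.1f} GeV: sigma(pT)/pT = {ctd_relative_pt_resolution(pt):.4f}")
    for energy in (10.0, 100.0):
        res_e = cal_relative_energy_resolution(energy, "electron")
        res_h = cal_relative_energy_resolution(energy, "hadron")
        print(f"E = {energy:5.1f} GeV: electrons {res_e:.3f}, hadrons {res_h:.3f}")
```

At high $p_T$ the first term dominates the tracking resolution, while the calorimeter resolution improves with energy as $1/\sqrt{E}$.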

53 In the following, the term electron denotes generically both the electron ($e^-$) and the positron ($e^+$).
54 The ZEUS coordinate system is a right-handed Cartesian system, with the Z axis pointing in the proton beam
direction, referred to as the forward direction, and the X axis pointing left towards the centre of HERA. The
coordinate origin is at the nominal interaction point.
The luminosity was measured from the rate of the bremsstrahlung process $ep \to e\gamma p$. The
resulting small-angle energetic photons were measured by the luminosity monitor [14], a
lead-scintillator calorimeter placed in the HERA tunnel at $Z = -107$ m.
3. Kinematics and event selection
A three-level trigger system was used to select events online [11,15]. Neutral current DIS
events were selected by requiring that a scattered electron candidate with an energy of more than
4 GeV was measured in the CAL. The variable $x_{Bj}$, the inelasticity y, and $Q^2$ were reconstructed
offline using the electron (subscript e) [16] and Jacquet-Blondel (JB) [17] methods. For each
event, the reconstruction of the hadronic final state was performed using a combination of track
and CAL information, excluding the cells and the track associated with the scattered electron.
The selected tracks and CAL clusters were treated as massless energy flow objects (EFOs) [18].
The offline selection of DIS events was similar to that used in the previous ZEUS measurement
[1] and was based on the following requirements:
• $E'_e > 10$ GeV, where $E'_e$ is the scattered electron energy after correction for energy loss from
the inactive material in the detector;
• $y_e < 0.6$ and $y_{JB} > 0.1$, to ensure a kinematic region with good reconstruction;
• $40 < \delta < 60$ GeV, where $\delta = \sum_i (E_i - P_{Z,i})$ and $E_i$ and $P_{Z,i}$ are the energy and
z-momentum of each final-state object. The lower cut removed background from photoproduction
and events with large initial-state QED radiation, while the upper cut removed
cosmic-ray background;
• $|Z_{vtx}| < 50$ cm, where $Z_{vtx}$ is the Z position of the reconstructed primary vertex, to select
events consistent with ep collisions.
The kinematic range of the analysis is
$10 < Q^2 < 100\ \mathrm{GeV}^2$, $10^{-4} < x_{Bj} < 10^{-2}$ and $0.1 < y < 0.6$.
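The offline requirements and the kinematic range listed above can be collected into a single predicate. The sketch below is illustrative only: the `DisEvent` container and its field names are hypothetical, and the reconstruction of the input quantities (electron and Jacquet-Blondel methods) is assumed to have been done elsewhere.

```python
from dataclasses import dataclass


@dataclass
class DisEvent:
    """Hypothetical container for already-reconstructed event quantities."""
    e_scattered: float  # corrected scattered-electron energy E'_e (GeV)
    y_e: float          # inelasticity from the electron method
    y_jb: float         # inelasticity from the Jacquet-Blondel method
    delta: float        # sum_i (E_i - P_Z,i) over final-state objects (GeV)
    z_vtx: float        # Z position of the primary vertex (cm)
    q2: float           # Q^2 (GeV^2)
    x_bj: float         # Bjorken x


def passes_dis_selection(ev: DisEvent) -> bool:
    """Apply the offline DIS requirements quoted in the text."""
    return (
        ev.e_scattered > 10.0
        and ev.y_e < 0.6
        and ev.y_jb > 0.1
        and 40.0 < ev.delta < 60.0
        and abs(ev.z_vtx) < 50.0
    )


def in_kinematic_range(ev: DisEvent) -> bool:
    """Q^2 and x_Bj range of the analysis; the y range is covered by the y_e and y_JB cuts."""
    return 10.0 < ev.q2 < 100.0 and 1e-4 < ev.x_bj < 1e-2


if __name__ == "__main__":
    event = DisEvent(e_scattered=18.0, y_e=0.3, y_jb=0.28, delta=52.0,
                     z_vtx=-12.0, q2=25.0, x_bj=8e-4)
    print(passes_dis_selection(event) and in_kinematic_range(event))
```

Keeping the selection and the kinematic range as separate functions makes it straightforward to vary individual cuts, as is done later for the systematic studies.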
Jets were reconstructed using the $k_T$ cluster algorithm [19] in the longitudinally invariant
inclusive mode [20]. The jet search was conducted in the HCM frame, which is equivalent to the
Breit frame [21] apart from a longitudinal boost.
The jet phase space is defined by selection cuts on the jet pseudorapidity, $\eta^{jet}_{LAB}$, in the
laboratory frame and on the jet transverse energy, $E^{jet}_{T,HCM}$, in the HCM frame:
$-1.0 < \eta^{jet1,2(,3)}_{LAB} < 2.5$ and $E^{jet1}_{T,HCM} > 7$ GeV, $E^{jet2(,3)}_{T,HCM} > 5$ GeV,
where jet 1, 2(, 3) refers to the two (three) jets with the highest transverse energy in the HCM
frame for a given event. The dijet and trijet samples are inclusive in that they contain at least two
or three jets passing the selection criteria, respectively.
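The jet phase-space definition can be expressed as a short selection function. This is a sketch under stated assumptions: jets are represented as hypothetical `(et_hcm, eta_lab)` tuples, the jets are ordered in $E^{jet}_{T,HCM}$ before the cuts are applied, and the lower pseudorapidity bound is taken to be $-1.0$.

```python
from typing import List, Tuple

Jet = Tuple[float, float]  # (E_T in the HCM frame in GeV, pseudorapidity in the lab frame)


def passes_jet_selection(jets: List[Jet], n_required: int) -> bool:
    """Check the dijet (n_required=2) or trijet (n_required=3) phase space.

    Assumption: the cuts are applied to the n_required jets with the
    highest E_T,HCM in the event.
    """
    leading = sorted(jets, key=lambda jet: jet[0], reverse=True)[:n_required]
    if len(leading) < n_required:
        return False
    if any(not (-1.0 < eta < 2.5) for _, eta in leading):
        return False
    if leading[0][0] <= 7.0:  # hardest jet: E_T,HCM > 7 GeV
        return False
    return all(et > 5.0 for et, _ in leading[1:])  # remaining jets: E_T,HCM > 5 GeV


if __name__ == "__main__":
    event_jets = [(6.0, 1.0), (8.0, 0.5), (5.5, 2.0)]
    print("dijet:", passes_jet_selection(event_jets, 2))
    print("trijet:", passes_jet_selection(event_jets, 3))
```

Because both samples are inclusive, an event passing the trijet selection also passes the dijet selection.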
4. Monte Carlo simulation
Monte Carlo (MC) simulations were used to correct the data for detector effects, inefficiencies
of the event selection and the jet reconstruction, as well as for QED effects. Neutral current DIS
events were generated using the ARIADNE 4.10 program [22] and the LEPTO 6.5 program [23]
interfaced to HERACLES 4.5.2 [24] via DJANGO 6.2.4 [25]. The HERACLES program includes
QED effects up to $O(\alpha_{EM}^2)$. In the case of ARIADNE, events were generated using the
colour-dipole model [26], whereas for LEPTO, the matrix-elements plus parton-shower model was used.
The CTEQ5L parameterisations of the proton parton density functions (PDFs) [27] were used in
the generation of DIS events for ARIADNE, and the CTEQ4D PDFs [27] were used for LEPTO.
For hadronisation the Lund string model [28], as implemented in JETSET 7.4 [29,30], was used.
The ZEUS detector response was simulated with a program based on GEANT 3.13 [31]. The
generated events were passed through the detector simulation, subjected to the same trigger
requirements as the data, and processed by the same reconstruction and offline programs.
The measured distributions of the global kinematic variables are well described by both the
ARIADNE and LEPTO MC models after reweighting in $Q^2$ [1]. The LEPTO simulation gives a
better overall description of the jet variables, but ARIADNE provides a better description of dijets
with small azimuthal separation. Therefore, for this analysis, the events generated with the
ARIADNE program were used to determine the acceptance corrections. The events generated with
LEPTO were used to estimate the uncertainty associated with the treatment of the parton shower.
5. NLO QCD calculations

The NLO calculations were carried out in the $\overline{\mathrm{MS}}$ scheme for five massless quark flavors
with the program NLOJET [32]. The NLOJET program allows a computation of the dijet (trijet)
production cross sections to next-to-leading order, i.e. including all terms up to $O(\alpha_s^2)$ ($O(\alpha_s^3)$).
In certain regions of the jet phase space, where the two hardest jets are not balanced in transverse
momentum, NLOJET can be used to calculate the cross sections for dijet production at $O(\alpha_s^3)$. It
was checked that the LO and NLO calculations from NLOJET agree with those of DISENT [33]
at the 1 to 2% level for the dijet cross sections [34,35].
For comparison with the data, the CTEQ6M [36] PDFs were used, and the renormalisation and
factorisation scales were both chosen to be $(\bar{E}^2_{T,HCM} + Q^2)/4$, where for dijets (trijets) $\bar{E}_{T,HCM}$
is the average $E_{T,HCM}$ of the two (three) highest $E_{T,HCM}$ jets in a given event. The choice of
renormalisation scale matches that used in the previous ZEUS multijet analysis [1]. The strong
coupling constant was set to the value used for the CTEQ6 PDFs, $\alpha_s(M_Z) = 0.118$, and evolved
according to the two-loop solution of the renormalisation group equation.
The NLO QCD predictions were corrected for hadronisation effects using a bin-by-bin procedure.
Hadronisation correction factors were defined for each bin as the ratio of the hadron- to
parton-level cross sections and were calculated using the LEPTO MC program, which, at the
parton level, gives a better agreement with NLOJET than ARIADNE. The correction factors $C_{had}$
were typically in the range 0.8 to 0.9 for most of the phase space.
The theoretical uncertainty was estimated by varying the renormalisation scale up and down
by a factor of two. The uncertainties in the proton PDFs were estimated in the previous ZEUS
multijets analysis [1] by repeating NLOJET calculations using 40 additional sets from CTEQ6M,
which resulted in a 2.5% contribution to the theoretical uncertainty and was therefore neglected.
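The scale choice and its variation can be illustrated with a few lines of code. This is a sketch, not the NLOJET implementation: it assumes that the quoted expression is the square of the scale, $\mu^2$, and that "a factor of two" refers to $\mu$ itself (so $\mu^2$ changes by a factor of four). Both function names are hypothetical.

```python
import math
from typing import Dict, Sequence


def scale_squared(jet_ets_hcm: Sequence[float], q2: float) -> float:
    """mu^2 = (mean E_T,HCM^2 + Q^2) / 4, with the mean taken over the 2 (3) leading jets."""
    mean_et = sum(jet_ets_hcm) / len(jet_ets_hcm)
    return (mean_et ** 2 + q2) / 4.0


def varied_scales_squared(mu2: float, factor: float = 2.0) -> Dict[str, float]:
    """Central value of mu^2 and the values obtained by scaling mu by 1/factor and factor.

    Assumption: the variation is applied to mu, not to mu^2.
    """
    mu = math.sqrt(mu2)
    return {
        "down": (mu / factor) ** 2,
        "central": mu2,
        "up": (mu * factor) ** 2,
    }


if __name__ == "__main__":
    mu2 = scale_squared([8.0, 6.0], q2=51.0)  # dijet event with mean E_T = 7 GeV
    print(mu2, varied_scales_squared(mu2))
```

The spread of the predictions evaluated at the three scales would then define the renormalisation-scale uncertainty band shown in the figures.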
6. Acceptance corrections
The ARIADNE MC was used to correct the data for detector effects. The jet transverse energies
were corrected for energy losses from inactive material in the detector. Typical jet energy
correction factors were 1 to 1.2, depending on the transverse energy of the detector-level jet and
the jet pseudorapidity.
The measured cross sections were corrected to the hadron level using a bin-by-bin procedure.
These corrections account for trigger efficiency, acceptance, and migration. Typical efficiencies
and purities were about 50% for the differential cross sections, with correction factors typically
between 1 and 1.5. For the double-differential cross sections, the efficiencies and purities were
typically 20 to 50%, with correction factors between 1 and 2.
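The text does not define efficiency, purity and correction factor explicitly. The sketch below uses the definitions commonly adopted for bin-by-bin unfolding, which are an assumption here: efficiency is the fraction of events generated in a bin that are also reconstructed in it, purity is the fraction of events reconstructed in a bin that were generated in it, and the correction factor is the ratio of generated to reconstructed events in the bin.

```python
from typing import Dict, List, Optional, Sequence


def bin_by_bin_factors(
    gen_bins: Sequence[Optional[int]],
    rec_bins: Sequence[Optional[int]],
    n_bins: int,
) -> List[Dict[str, float]]:
    """Per-bin efficiency, purity and correction factor from MC events.

    gen_bins[i] / rec_bins[i] hold the hadron-level / detector-level bin index of
    MC event i, or None if the event fails the selection at that level.
    """
    n_gen = [0] * n_bins
    n_rec = [0] * n_bins
    n_both = [0] * n_bins
    for gen, rec in zip(gen_bins, rec_bins):
        if gen is not None:
            n_gen[gen] += 1
        if rec is not None:
            n_rec[rec] += 1
        if gen is not None and gen == rec:
            n_both[gen] += 1

    nan = float("nan")
    factors = []
    for i in range(n_bins):
        factors.append({
            "efficiency": n_both[i] / n_gen[i] if n_gen[i] else nan,
            "purity": n_both[i] / n_rec[i] if n_rec[i] else nan,
            "correction": n_gen[i] / n_rec[i] if n_rec[i] else nan,
        })
    return factors


if __name__ == "__main__":
    gen = [0, 0, 0, 0, 1, 1, None]
    rec = [0, 0, 1, None, 1, 1, 1]
    for i, f in enumerate(bin_by_bin_factors(gen, rec, n_bins=2)):
        print(i, f)
```

With these definitions the correction factor equals purity divided by efficiency, so efficiencies and purities of similar size lead to correction factors close to one.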
The cross sections were corrected to the QED Born level by applying an additional correction
obtained from a special sample of the LEPTO MC with the radiative QED effects turned off. The
QED radiative effects were typically 2 to 4%.
7. Systematic uncertainties
A detailed study of the sources contributing to the systematic uncertainties of the measurements
has been performed [37,38]. The main sources contributing to the systematic uncertainties
are listed below:
• the data were corrected using LEPTO instead of ARIADNE;
• the jet energies in the data were scaled up and down by 3% for jets with transverse energy
less than 10 GeV and 1% for jets with transverse energy above 10 GeV, according to the
estimated jet energy scale uncertainty [39];
• the cut on $E^{jet}_{T,HCM}$ for each jet was raised and lowered by 1 GeV, corresponding to the $E_T$
resolution;
• the upper and lower cuts on $\eta^{jet1,2(,3)}_{LAB}$ were each changed by $\pm 0.1$, corresponding to the $\eta$
resolution;
• the uncertainties due to the selection cuts were estimated by varying the cuts within the
resolution of each variable.
The largest systematic uncertainties came from the uncertainty of the jet energy scale, which
produced a systematic uncertainty of 5 to 10%. For the trijet sample, altering the cut on $E^{jet3}_{T,HCM}$
also produced a systematic uncertainty of 5 to 10%. The other significant systematic uncertainty
arose from the choice of LEPTO instead of ARIADNE for correcting detector effects. This
systematic uncertainty was also typically 5 to 10%. The other systematic uncertainties were smaller
than or similar to the statistical uncertainties.
The systematic uncertainties not associated with the absolute energy scale of the jets were
added in quadrature to the statistical uncertainties and are shown as error bars in the figures.
The uncertainty due to the absolute energy scale of the jets is shown separately as a shaded
band in each figure, due to the large bin-to-bin correlation. In addition, there is an overall
normalisation uncertainty of 2.2% from the luminosity determination, which is not included in the
figures.
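The quadrature sum used for the outer error bars can be reproduced from the tabulated uncertainties. The sketch below is illustrative: the function name is hypothetical, and the example values are the statistical and systematic uncertainties of the first $Q^2$ bin of the dijet cross section in Table 1. The jet energy scale and luminosity uncertainties are deliberately left out, as in the figures.

```python
import math
from typing import Tuple


def total_error_bar(stat: float, syst_up: float, syst_down: float) -> Tuple[float, float]:
    """Combine the statistical and (asymmetric) systematic uncertainties in quadrature.

    Returns the (upward, downward) lengths of the outer error bar.
    """
    return math.hypot(stat, syst_up), math.hypot(stat, syst_down)


if __name__ == "__main__":
    # First Q^2 bin of Table 1: stat 0.8, syst +3.7 / -4.4 (all in pb/GeV^2).
    up, down = total_error_bar(stat=0.8, syst_up=3.7, syst_down=4.4)
    print(f"+{up:.2f} / -{down:.2f} pb/GeV^2")
```

Since the systematic term dominates in this bin, the combined error bar is only slightly larger than the systematic uncertainty alone.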
8. Results
8.1. Single-differential cross sections $d\sigma/dQ^2$, $d\sigma/dx_{Bj}$ and trijet to dijet cross section ratios
The single-differential cross sections $d\sigma/dQ^2$ and $d\sigma/dx_{Bj}$ for dijet and trijet production are
presented in Figs. 1(a) and (c), and Tables 1-4. The ratios $\sigma_{trijet}/\sigma_{dijet}$ of the trijet cross section
to the dijet cross section, as functions of $Q^2$ and of $x_{Bj}$, are presented in Figs. 1(b) and (d),
respectively. The ratio $\sigma_{trijet}/\sigma_{dijet}$ is almost $Q^2$ independent, as shown in Fig. 1(b), and falls
steeply with increasing $x_{Bj}$, as shown in Fig. 1(d). In the cross section ratios, the experimental and
theoretical uncertainties partially cancel, providing a possibility to test the pQCD calculations
more precisely than can be done with the individual cross sections. Both the cross sections and
the cross section ratios are well described by the NLOJET calculations.
Fig. 1. Inclusive dijet and trijet cross sections as functions of (a) $Q^2$ and (c) $x_{Bj}$. Figures (b) and (d) show the ratios of
the trijet to dijet cross sections. The bin-averaged differential cross sections are plotted at the bin centers. The inner error
bars represent the statistical uncertainties. The outer error bars represent the quadratic sum of statistical and systematic
uncertainties not associated with the jet energy scale. The shaded band indicates the jet energy scale uncertainty. The
predictions of perturbative QCD at NLO, corrected for hadronisation effects and using the CTEQ6 parameterisations of
the proton PDFs, are compared to data. The lower parts of the plots show the relative difference between the data and
the corresponding theoretical prediction. The hatched band represents the renormalisation-scale uncertainty of the QCD
calculation.
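The behaviour of the ratio can be checked directly from the central values in Tables 1 to 4. The following sketch only divides the tabulated cross sections bin by bin; it does not propagate uncertainties, because the text states that these partially cancel in the ratio and the correlations are not given. The variable names are illustrative.

```python
# Central values from Tables 1 and 3 (pb/GeV^2), bins in Q^2 (GeV^2).
Q2_BINS = ["10-15", "15-20", "20-30", "30-50", "50-100"]
DIJET_Q2 = [66.0, 41.4, 26.2, 14.0, 5.82]
TRIJET_Q2 = [7.9, 4.40, 3.19, 1.68, 0.719]

# Central values from Tables 2 and 4 (pb, x 10^4), bins in x_Bj (x 10^-4).
X_BINS = ["1.7-3.0", "3.0-5.0", "5.0-10.0", "10.0-25.0", "25.0-100.0"]
DIJET_X = [85.3, 113.8, 83.1, 29.5, 2.31]
TRIJET_X = [14.7, 15.9, 9.6, 3.35, 0.192]


def ratios(trijet, dijet):
    """Bin-by-bin ratio sigma_trijet / sigma_dijet."""
    return [t / d for t, d in zip(trijet, dijet)]


if __name__ == "__main__":
    for label, r in zip(Q2_BINS, ratios(TRIJET_Q2, DIJET_Q2)):
        print(f"Q^2  {label:>10}: {r:.3f}")
    for label, r in zip(X_BINS, ratios(TRIJET_X, DIJET_X)):
        print(f"x_Bj {label:>10}: {r:.3f}")
```

The ratio stays between about 0.11 and 0.12 across the $Q^2$ bins, while it decreases by roughly a factor of two from the lowest to the highest $x_{Bj}$ bin.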
Table 1
The inclusive dijet cross sections as functions of $Q^2$. Included are the statistical, systematic, and jet energy scale
uncertainties in columns 3, 4, and 5, respectively. Column 6 shows the correction factor from QED radiative effects applied
to the measured cross sections, and column 7 shows the hadronization correction applied to the NLOJET calculations
shown in the figures
$Q^2$ (GeV$^2$)   $d\sigma/dQ^2$ (pb/GeV$^2$)   $\delta_{stat}$ (pb/GeV$^2$)   $\delta_{syst}$ (pb/GeV$^2$)   $\delta_{ES}$ (pb/GeV$^2$)   $C_{QED}$   $C_{had}$
10-15             66.0                          0.8                            +3.7/-4.4                      +5.7/-5.9                    0.984       0.866
15-20             41.4                          0.6                            +2.0/-2.4                      +3.5/-3.6                    0.968       0.870
20-30             26.2                          0.3                            +1.0/-0.8                      +2.2/-2.0                    0.965       0.876
30-50             14.0                          0.1                            +0.4/-0.3                      +1.0/-1.1                    0.955       0.884
50-100            5.82                          0.06                           +0.17/-0.16                    +0.38/-0.38                  0.952       0.887
Table 2
The inclusive dijet cross sections as functions of $x_{Bj}$. Other details as in the caption to Table 1
$x_{Bj}$ ($\times 10^{-4}$)   $d\sigma/dx_{Bj}$ (pb, $\times 10^4$)   $\delta_{stat}$ (pb, $\times 10^4$)   $\delta_{syst}$ (pb, $\times 10^4$)   $\delta_{ES}$ (pb, $\times 10^4$)   $C_{QED}$   $C_{had}$
1.7-3.0                       85.3                                    1.7                                   +5.6/-6.8                             +7.0/-6.3                           0.987       0.910
3.0-5.0                       113.8                                   1.5                                   +5.9/-6.2                             +8.8/-8.9                           0.975       0.887
5.0-10.0                      83.1                                    0.8                                   +3.3/-3.7                             +6.9/-7.1                           0.969       0.876
10.0-25.0                     29.5                                    0.3                                   +0.8/-0.8                             +2.2/-2.2                           0.958       0.876
25.0-100.0                    2.31                                    0.03                                  +0.08/-0.07                           +0.17/-0.17                         0.948       0.862
Table 3
The inclusive trijet cross sections as functions of $Q^2$. Other details as in the caption to Table 1
$Q^2$ (GeV$^2$)   $d\sigma/dQ^2$ (pb/GeV$^2$)   $\delta_{stat}$ (pb/GeV$^2$)   $\delta_{syst}$ (pb/GeV$^2$)   $\delta_{ES}$ (pb/GeV$^2$)   $C_{QED}$   $C_{had}$
10-15             7.9                           0.2                            +1.1/-1.3                      +1.0/-1.0                    0.991       0.759
15-20             4.40                          0.17                           +0.46/-0.66                    +0.45/-0.52                  0.946       0.776
20-30             3.19                          0.11                           +0.27/-0.37                    +0.38/-0.38                  0.969       0.786
30-50             1.68                          0.06                           +0.13/-0.11                    +0.20/-0.19                  0.949       0.794
50-100            0.719                         0.024                          +0.044/-0.027                  +0.077/-0.070                0.956       0.795
Table 4
The inclusive trijet cross sections as functions of $x_{Bj}$. Other details as in the caption to Table 1
$x_{Bj}$ ($\times 10^{-4}$)   $d\sigma/dx_{Bj}$ (pb, $\times 10^4$)   $\delta_{stat}$ (pb, $\times 10^4$)   $\delta_{syst}$ (pb, $\times 10^4$)   $\delta_{ES}$ (pb, $\times 10^4$)   $C_{QED}$   $C_{had}$
1.7-3.0                       14.7                                    0.7                                   +1.5/-3.3                             +1.5/-1.9                           1.00        0.811
3.0-5.0                       15.9                                    0.5                                   +2.0/-2.3                             +1.9/-1.8                           0.968       0.796
5.0-10.0                      9.6                                     0.3                                   +0.9/-0.9                             +1.1/-1.1                           0.961       0.780
10.0-25.0                     3.35                                    0.10                                  +0.21/-0.19                           +0.40/-0.37                         0.954       0.785
25.0-100.0                    0.192                                   0.013                                 +0.032/-0.020                         +0.023/-0.022                       0.95        0.739
Table 5
The bin edges used for the measurements of the jet correlations presented. For the trijet sample, the first two bins in
$|\Delta\phi^{jet1,2}_{HCM}|$ are combined
Variable                                                     Bin 1          Bin 2              Bin 3               Bin 4
$\Delta E^{jet1,2}_{T,HCM}$                                  0-4 GeV        4-10 GeV           10-18 GeV           18-100 GeV
$|\Sigma \vec{p}^{\,jet1,2}_{T,HCM}|$                        0-4 GeV        4-10 GeV           10-16 GeV           16-100 GeV
$|\Delta \vec{p}^{\,jet1,2}_{T,HCM}|/(2E^{jet1}_{T,HCM})$    0-0.5          0.5-0.7            0.7-0.85            0.85-1
$|\Delta\phi^{jet1,2}_{HCM}|$                                0-$\pi/4$      $\pi/4$-$\pi/2$    $\pi/2$-$3\pi/4$    $3\pi/4$-$\pi$
8.2. Transverse energy and pseudorapidity dependencies of cross sections

The single-differential cross sections $d\sigma/dE^{jet}_{T,HCM}$ for two (three) jet events are presented
in Fig. 2. The measured cross sections are well described by the NLOJET calculations over the
whole range in $E^{jet}_{T,HCM}$ considered.
The single-differential cross sections $d\sigma/d\eta^{jet}_{LAB}$ for dijet and trijet production are presented
in Figs. 3(a) and (c). For this figure, the two (three) jets with highest $E^{jet}_{T,HCM}$ were ordered in
$\eta^{jet}_{LAB}$. Also shown are the measurements of the single-differential cross sections $d\sigma/d|\Delta\eta^{jet1,2}_{HCM}|$,
where $|\Delta\eta^{jet1,2}_{HCM}|$ is the absolute difference in pseudorapidity of the two jets with highest $E^{jet}_{T,HCM}$
(see Figs. 3(b) and (d)). The NLOJET predictions describe the measurements well.
8.3. Jet transverse energy and momentum correlations
Correlations in transverse energy of the jets have been investigated by measuring the
double-differential cross sections $d^2\sigma/dx_{Bj}\,d\Delta E^{jet1,2}_{T,HCM}$, where $\Delta E^{jet1,2}_{T,HCM}$ is the difference in transverse
energy between the two jets with the highest $E^{jet}_{T,HCM}$. The measurement was performed in $x_{Bj}$
bins, which are defined in Table 2, for dijet and trijet production. Figs. 4 and 5 show the cross
sections $d^2\sigma/dx_{Bj}\,d\Delta E^{jet1,2}_{T,HCM}$ for all bins in $x_{Bj}$ for the dijet and trijet samples, respectively.
Fig. 2. Inclusive dijet (a) and trijet (b) cross sections as functions of $E^{jet}_{T,HCM}$ with the jets ordered in $E^{jet}_{T,HCM}$. The cross
sections of the second and third jet were scaled for readability. Other details as in the caption to Fig. 1.
The NLOJET calculations at $O(\alpha_s^2)$ do not describe the high-$\Delta E^{jet1,2}_{T,HCM}$ tail of the dijet sample
at low $x_{Bj}$, where the calculations fall below the data. Since these calculations give the
lowest-order non-trivial contribution to the cross section in the region $\Delta E^{jet1,2}_{T,HCM} > 0$, they are affected
by large uncertainties from the higher-order terms in $\alpha_s$. A higher-order calculation for the dijet
sample is possible with NLOJET if the region $\Delta E^{jet1,2}_{T,HCM}$ near zero is avoided. NLOJET calculations
at $O(\alpha_s^3)$ for the dijet sample have been obtained for the region $\Delta E^{jet1,2}_{T,HCM} > 4$ GeV and are
compared to the data in Fig. 4. With the inclusion of the next term in the perturbative series in
$\alpha_s$, the NLOJET calculations describe the data within the theoretical uncertainties. The NLOJET
calculations at $O(\alpha_s^3)$ for trijet production are consistent with the measurements.
As a refinement to the studies of the correlations between the transverse energies of the jets,
further correlations of the jet transverse momenta have been investigated. The correlations in jet
transverse momenta were examined by measuring two sets of double-differential cross sections:
$d^2\sigma/dx_{Bj}\,d|\Sigma \vec{p}^{\,jet1,2}_{T,HCM}|$ and $d^2\sigma/dx_{Bj}\,d(|\Delta \vec{p}^{\,jet1,2}_{T,HCM}|/(2E^{jet1}_{T,HCM}))$. The variable $|\Sigma \vec{p}^{\,jet1,2}_{T,HCM}|$ is
the transverse component of the vector sum of the jet momenta of the two jets with the highest
$E^{jet}_{T,HCM}$. For events with only two jets $|\Sigma \vec{p}^{\,jet1,2}_{T,HCM}| = 0$, and additional QCD radiation increases
this value. The variable $|\Delta \vec{p}^{\,jet1,2}_{T,HCM}|/(2E^{jet1}_{T,HCM})$ is the magnitude of the vector difference of the
transverse momenta of the two jets with the highest $E^{jet}_{T,HCM}$ scaled by twice the transverse
energy of the hardest jet. For events with only two jets $|\Delta \vec{p}^{\,jet1,2}_{T,HCM}|/(2E^{jet1}_{T,HCM}) = 1$, and additional
QCD radiation decreases this value. Figs. 6-9 show the cross sections $d^2\sigma/dx_{Bj}\,d|\Sigma \vec{p}^{\,jet1,2}_{T,HCM}|$
and the cross sections $d^2\sigma/dx_{Bj}\,d(|\Delta \vec{p}^{\,jet1,2}_{T,HCM}|/(2E^{jet1}_{T,HCM}))$ in bins of $x_{Bj}$ for the dijet and trijet
samples.
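The three correlation variables defined in this subsection can be computed from the transverse momenta of the two leading jets. This is a minimal sketch: jets are represented as hypothetical `(et, px, py)` tuples in the HCM frame, with the first jet being the one of higher $E_T$, and the function name is illustrative.

```python
import math
from typing import Dict, Tuple

Jet = Tuple[float, float, float]  # (E_T, p_x, p_y) in the HCM frame, in GeV


def jet_correlations(jet1: Jet, jet2: Jet) -> Dict[str, float]:
    """Correlation variables for the two jets with the highest E_T,HCM (jet1 is the harder)."""
    et1, px1, py1 = jet1
    et2, px2, py2 = jet2
    return {
        # Difference in transverse energy between the two jets.
        "delta_et": et1 - et2,
        # Magnitude of the vector sum of the transverse momenta.
        "sum_pt": math.hypot(px1 + px2, py1 + py2),
        # Magnitude of the vector difference, scaled by twice E_T of the hardest jet.
        "delta_pt_scaled": math.hypot(px1 - px2, py1 - py2) / (2.0 * et1),
    }


if __name__ == "__main__":
    # Two balanced, back-to-back massless jets: sum_pt = 0 and delta_pt_scaled = 1.
    print(jet_correlations((10.0, 10.0, 0.0), (10.0, -10.0, 0.0)))
    # A softer second jet, as expected when additional radiation is present.
    print(jet_correlations((10.0, 10.0, 0.0), (6.0, -6.0, 0.0)))
```

The first example reproduces the two-jet limits quoted in the text; in the second, the imbalance moves $|\Sigma \vec{p}_T|$ away from zero and the scaled difference below one.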
At low $x_{Bj}$, the NLOJET calculations at $O(\alpha_s^2)$ underestimate the dijet cross sections at high
values of $|\Sigma \vec{p}^{\,jet1,2}_{T,HCM}|$ and low values of $|\Delta \vec{p}^{\,jet1,2}_{T,HCM}|/(2E^{jet1}_{T,HCM})$. The description of the data by
the NLOJET calculations at $O(\alpha_s^2)$ improves at higher values of $x_{Bj}$. A higher-order calculation
with NLOJET at $O(\alpha_s^3)$ for the dijet sample has been obtained for the region $|\Sigma \vec{p}^{\,jet1,2}_{T,HCM}| > 4$ GeV,
which is compared to the data in Fig. 6; and for the region $|\Delta \vec{p}^{\,jet1,2}_{T,HCM}|/(2E^{jet1}_{T,HCM}) < 0.85$,
which is compared to the data in Fig. 8. With the inclusion of the next term in the perturbative
series in $\alpha_s$, the NLOJET calculations describe the data well. The NLOJET calculations at
$O(\alpha_s^3)$ for trijet production are consistent with the measurements.
Fig. 3. The inclusive dijet (a) and trijet (c) cross sections as functions of $\eta^{jet}_{LAB}$ with the jets ordered in $\eta^{jet}_{LAB}$:
$\eta^{jet1}_{LAB} > \eta^{jet2}_{LAB} > \eta^{jet3}_{LAB}$. The cross sections of the second and third jet were scaled for readability. Figures (b) and (d)
show the dijet and trijet cross sections as functions of $|\Delta\eta^{jet1,2}_{HCM}|$ between the two jets with highest $E^{jet}_{T,HCM}$. Other details
as in the caption to Fig. 1.
Fig. 4. Dijet cross sections as functions of $\Delta E^{jet1,2}_{T,HCM}$. The NLOJET calculations at $O(\alpha_s^2)$ ($O(\alpha_s^3)$) are shown as dashed
(solid) lines. The lower parts of the plots show the relative difference between the data and the $O(\alpha_s^3)$ predictions. The
boundaries for the bins in $\Delta E^{jet1,2}_{T,HCM}$ are given in Table 5. Other details as in the caption to Fig. 1.
Fig. 5. Trijet cross sections as functions of $\Delta E^{jet1,2}_{T,HCM}$. The measurements are compared to NLOJET calculations at
$O(\alpha_s^3)$. The boundaries for the bins in $\Delta E^{jet1,2}_{T,HCM}$ are given in Table 5. Other details as in the caption to Fig. 1.
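Fig. 6. Dijet cross sections as functions of $|\Sigma \vec{p}^{\,jet1,2}_{T,HCM}|$. The NLOJET calculations at $O(\alpha_s^2)$ ($O(\alpha_s^3)$) are shown as dashed
(solid) lines. The lower parts of the plots show the relative difference between the data and the $O(\alpha_s^3)$ predictions. The
boundaries for the bins in $|\Sigma \vec{p}^{\,jet1,2}_{T,HCM}|$ are given in Table 5. Other details as in the caption to Fig. 1.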
Fig. 7. Trijet cross sections as functions of $|\Sigma \vec{p}^{\,jet1,2}_{T,HCM}|$. The measurements are compared to NLOJET calculations at
$O(\alpha_s^3)$. The boundaries for the bins in $|\Sigma \vec{p}^{\,jet1,2}_{T,HCM}|$ are given in Table 5. Other details as in the caption to Fig. 1.
Fig. 8. Dijet cross sections as functions of $|\Delta \vec{p}^{\,jet1,2}_{T,HCM}|/(2E^{jet1}_{T,HCM})$. The NLOJET calculations at $O(\alpha_s^2)$ ($O(\alpha_s^3)$) are
shown as dashed (solid) lines. The lower parts of the plots show the relative difference between the data and the $O(\alpha_s^3)$
predictions. The boundaries for the bins in $|\Delta \vec{p}^{\,jet1,2}_{T,HCM}|/(2E^{jet1}_{T,HCM})$ are given in Table 5. Other details as in the caption
to Fig. 1.
Fig. 9. Trijet cross sections as functions of $|\Delta \vec{p}^{\,jet1,2}_{T,HCM}|/(2E^{jet1}_{T,HCM})$. The measurements are compared to NLOJET
calculations at $O(\alpha_s^3)$. The boundaries for the bins in $|\Delta \vec{p}^{\,jet1,2}_{T,HCM}|/(2E^{jet1}_{T,HCM})$ are given in Table 5. Other details as in
the caption to Fig. 1.
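Fig. 10. Dijet cross sections as functions of $|\Delta\phi^{jet1,2}_{HCM}|$. The NLOJET calculations at $O(\alpha_s^2)$ ($O(\alpha_s^3)$) are shown as dashed
(solid) lines. The lower parts of the plots show the relative difference between the data and the $O(\alpha_s^3)$ predictions. The
boundaries for the bins in $|\Delta\phi^{jet1,2}_{HCM}|$ are given in Table 5. Other details as in the caption to Fig. 1.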
Fig. 11. Trijet cross sections as functions of $|\Delta\phi^{jet1,2}_{HCM}|$. The measurements are compared to NLOJET calculations at
$O(\alpha_s^3)$. The boundaries for the bins in $|\Delta\phi^{jet1,2}_{HCM}|$ are given in Table 5. Other details as in the caption to Fig. 1.
Fig. 12. The dijet and trijet cross sections for events with $|\Delta\phi^{jet1,2}_{HCM}| < 2\pi/3$ as functions of $x_{Bj}$ in two different $Q^2$-bins.
The NLOJET calculations at $O(\alpha_s^2)$ ($O(\alpha_s^3)$) are shown as dashed (solid) lines. The trijet measurements are compared
to NLOJET calculations at $O(\alpha_s^3)$. The lower parts of the plots in (a) and (b) show the relative difference between the
data and the $O(\alpha_s^3)$ predictions. Other details as in the caption to Fig. 1.
8.4. Azimuthal distributions of the jets

Measurements of the double-differential cross section $d^2\sigma/dx_{\rm Bj}\,d|\Delta\phi_{\rm HCM}^{\rm jet1,2}|$, where $|\Delta\phi_{\rm HCM}^{\rm jet1,2}|$
is the azimuthal separation of the two jets with the largest $E_{T,\rm HCM}^{\rm jet}$, for dijet and trijet production
are shown in Figs. 10 and 11 for all bins in $x_{\rm Bj}$. For both dijet and trijet production the cross
section falls with $|\Delta\phi_{\rm HCM}^{\rm jet1,2}|$. The NLOJET calculations at $O(\alpha_s^2)$ for dijet production decrease more
rapidly with $|\Delta\phi_{\rm HCM}^{\rm jet1,2}|$ than the data and the calculations disagree with the data at low $|\Delta\phi_{\rm HCM}^{\rm jet1,2}|$.
A higher-order NLOJET calculation at $O(\alpha_s^3)$ for the dijet sample has been obtained for the
region $|\Delta\phi_{\rm HCM}^{\rm jet1,2}| < 3\pi/4$ and describes the data well. The measurements for trijet production are
reasonably well described by the NLOJET calculations at $O(\alpha_s^3)$.
A further investigation has been performed by measuring the cross section $d^2\sigma/dQ^2 dx_{\rm Bj}$
for dijet (trijet) events with $|\Delta\phi_{\rm HCM}^{\rm jet1,2}| < 2\pi/3$ as a function of $x_{\rm Bj}$. For the two-jet final states,
the presence of two leading jets with $|\Delta\phi_{\rm HCM}^{\rm jet1,2}| < 2\pi/3$ can indicate another high-$E_T$ jet or set
of high-$E_T$ jets outside the measured range. These cross sections are presented in Fig. 12.
The NLOJET calculations at $O(\alpha_s^2)$ for dijet production underestimate the data, the difference
increasing towards low $x_{\rm Bj}$. The NLOJET calculations at $O(\alpha_s^3)$ are up to about one order of
magnitude larger than the $O(\alpha_s^2)$ calculations and are consistent with the data, demonstrating the
importance of the higher-order terms in the description of the data, especially at low $x_{\rm Bj}$. The
NLOJET calculations at $O(\alpha_s^3)$ describe the trijet data within the renormalisation-scale uncertainties.
9. Summary
Dijet and trijet production in deep inelastic ep scattering has been measured in the phase
space region $10 < Q^2 < 100$ GeV$^2$ and $10^{-4} < x_{\rm Bj} < 10^{-2}$ using an integrated luminosity of
82 pb$^{-1}$ collected by the ZEUS experiment. The high statistics have made possible detailed
studies of multijet production at low $x_{\rm Bj}$. The dependence of dijet and trijet production on the
kinematic variables $Q^2$ and $x_{\rm Bj}$ and on the jet variables $E_{T,\rm HCM}^{\rm jet}$ and $\eta_{\rm LAB}^{\rm jet}$ is well described by
perturbative QCD calculations which include NLO corrections. To investigate possible deviations
with respect to the collinear factorisation approximation used in the standard pQCD approach,
measurements of the correlations between the two jets with highest $E_{T,\rm HCM}^{\rm jet}$ have been made. At
low $x_{\rm Bj}$, measurements of dijet production with low azimuthal separation are reproduced by the
perturbative QCD calculations provided that higher-order terms ($O(\alpha_s^3)$) are accounted for. Such
terms increase the predictions of pQCD calculations by up to one order of magnitude when the
two jets with the highest $E_{T,\rm HCM}^{\rm jet1,2}$ are not balanced in transverse momentum. This demonstrates
the importance of higher-order corrections in the low-$x_{\rm Bj}$ region.
Acknowledgements
It is a pleasure to thank the DESY Directorate for their strong support and encouragement.
The remarkable achievements of the HERA machine group were essential for the successful
completion of this work and are greatly appreciated. The design, construction and installation of
the ZEUS detector has been made possible by the efforts of many people who are not listed as
authors. It is also a pleasure to thank Zoltan Nagy for useful discussions about NLOJET.
References
[1] ZEUS Collaboration, S. Chekanov, et al., Eur. Phys. J. C 44 (2005) 183.
[2] H1 Collaboration, C. Adloff, et al., Phys. Lett. B 515 (2001) 17.
[3] J.D. Bjorken, Phys. Rev. 179 (1969) 1547.
[4] V.N. Gribov, L.N. Lipatov, Sov. J. Nucl. Phys. 15 (1972) 438;
    G. Altarelli, G. Parisi, Nucl. Phys. B 126 (1977) 298;
    L.N. Lipatov, Sov. J. Nucl. Phys. 20 (1975) 94;
    Yu.L. Dokshitzer, Sov. Phys. JETP 46 (1977) 641.
[5] H1 Collaboration, C. Adloff, et al., Eur. Phys. J. C 21 (2001) 33.
[6] ZEUS Collaboration, S. Chekanov, et al., Phys. Rev. D 67 (2003) 012007.
[7] ZEUS Collaboration, S. Chekanov, et al., Eur. Phys. J. C 23 (2002) 13;
    ZEUS Collaboration, S. Chekanov, et al., Phys. Lett. B 547 (2002) 164;
    ZEUS Collaboration, S. Chekanov, et al., Eur. Phys. J. C 21 (2001) 443.
[8] H1 Collaboration, C. Adloff, et al., Eur. Phys. J. C 19 (2001) 289;
    H1 Collaboration, C. Adloff, et al., Phys. Lett. B 515 (2001) 17;
    H1 Collaboration, S. Aid, et al., Nucl. Phys. B 470 (1996) 3.
[9] H1 Collaboration, A. Aktas, et al., Eur. Phys. J. C 33 (2004) 477.
[10] ZEUS Collaboration, M. Derrick, et al., Phys. Lett. B 293 (1992) 465.
[11] ZEUS Collaboration, in: U. Holm (Ed.), The ZEUS Detector. Status Report, unpublished, DESY (1993), available
    on http://www-zeus.desy.de/bluebook/bluebook.html.
[12] N. Harnew, et al., Nucl. Instrum. Methods A 279 (1989) 290;
    B. Foster, et al., Nucl. Phys. B (Proc. Suppl.) B 32 (1993) 181;
    B. Foster, et al., Nucl. Instrum. Methods A 338 (1994) 254.
[13] M. Derrick, et al., Nucl. Instrum. Methods A 309 (1991) 77;
    A. Andresen, et al., Nucl. Instrum. Methods A 309 (1991) 101;
    A. Caldwell, et al., Nucl. Instrum. Methods A 321 (1992) 356;
    A. Bernstein, et al., Nucl. Instrum. Methods A 336 (1993) 23.
[14] J. Andruszków, et al., Preprint DESY-92-066, DESY, 1992;
    ZEUS Collaboration, M. Derrick, et al., Z. Phys. C 63 (1994) 391;
    J. Andruszków, et al., Acta Phys. Pol. B 32 (2001) 2025.
[15] W.H. Smith, K. Tokushuku, L.W. Wiggers, in: C. Verkerk, W. Wojcik (Eds.), Proceedings of the Computing in
    High Energy Physics (CHEP 92), Geneva, Switzerland, 1992, p. 222, also in preprint DESY 92-150B.
[16] K.C. Höger, in: W. Buchmüller, G. Ingelman (Eds.), Proceedings of the Workshop on Physics at HERA, vol. 1,
    DESY, Hamburg, Germany, 1992, p. 43.
[17] F. Jacquet, A. Blondel, in: U. Amaldi (Ed.), Proceedings of the Study for an ep Facility for Europe, Hamburg,
    Germany, 1979, p. 391, also in preprint DESY 79/48.
[18] G.M. Briskin, Ph.D. Thesis, Tel Aviv University, 1998, DESY-THESIS-1998-036.
[19] S. Catani, et al., Nucl. Phys. B 406 (1993) 187.
[20] S.D. Ellis, D.E. Soper, Phys. Rev. D 48 (1993) 3160.
[21] R.P. Feynman, Photon-Hadron Interactions, Benjamin, New York, 1972;
    K.H. Streng, T.F. Walsh, P.M. Zerwas, Z. Phys. C (1979) 237.
[22] L. Lönnblad, Comput. Phys. Commun. 71 (1992) 15.
[23] G. Ingelman, A. Edin, J. Rathsman, Comput. Phys. Commun. 101 (1997) 108.
[24] A. Kwiatkowski, H. Spiesberger, H.-J. Möhring, Comput. Phys. Commun. 69 (1992) 155;
    A. Kwiatkowski, H. Spiesberger, H.-J. Möhring, in: Proceedings of the Workshop Physics at HERA, DESY, Hamburg, 1991.
[25] K. Charchula, G.A. Schuler, H. Spiesberger, Comput. Phys. Commun. 81 (1994) 381.
[26] G. Gustafson, U. Pettersson, Nucl. Phys. B 306 (1988) 746.
[27] H.L. Lai, et al., Phys. Rev. D 55 (1997) 1280.
[28] B. Andersson, et al., Phys. Rep. 97 (1983) 31.
[29] M. Bengtsson, T. Sjöstrand, Comput. Phys. Commun. 46 (1987) 43.
[30] T. Sjöstrand, Comput. Phys. Commun. 82 (1994) 74.
[31] R. Brun, et al., GEANT 3, Technical Report CERN-DD/EE/84-1, CERN, 1987.
[32] Z. Nagy, Z. Trocsanyi, Phys. Rev. Lett. 87 (2001) 082001.
[33] S. Catani, M.H. Seymour, Nucl. Phys. B 485 (1997) 291.
[34] N. Krumnack, Ph.D. Thesis, University of Hamburg, 2004.
[35] L. Li, Ph.D. Thesis, University of Wisconsin-Madison, 2004.
[36] J. Pumplin, et al., JHEP 0207 (2002) 012.
[37] T. Gosau, Ph.D. Thesis, University of Hamburg, 2007, in preparation.
[38] T. Danielson, Ph.D. Thesis, University of Wisconsin-Madison, 2007, in preparation.
[39] M. Wing, on behalf of ZEUS Collaboration, in: R. Zhu (Ed.), Proceedings of the 10th International Conference on
    Calorimetry in High Energy Physics, Pasadena, USA, 2002, p. 767, hep-ex/0206036.
Measurement of (anti)deuteron and (anti)proton production in DIS at HERA
ZEUS Collaboration
S. Chekanov 1 , M. Derrick, S. Magill, B. Musgrave, D. Nicholass 2 ,
J. Repond, R. Yoshida
Argonne National Laboratory, Argonne, IL 60439-4815, USA 3
M.C.K. Mattingly
Andrews University, Berrien Springs, MI 49104-0380, USA
M. Jechow, N. Pavel †, A.G. Yagües Molina
Institut für Physik der Humboldt-Universität zu Berlin, Berlin, Germany
S. Antonelli, P. Antonioli, G. Bari, M. Basile, L. Bellagamba, M. Bindi,

D. Boscherini, A. Bruni, G. Bruni, L. Cifarelli, F. Cindolo, A. Contin,
M. Corradi, S. De Pasquale, G. Iacobucci, A. Margotti, R. Nania,
A. Polini, G. Sartorelli, A. Zichichi
University and INFN Bologna, Bologna, Italy 4
D. Bartsch, I. Brock, S. Goers 5 , H. Hartmann, E. Hilger, H.-P. Jakob,

M. Jüngst, O.M. Kind 6 , A.E. Nuncio-Quiroz, E. Paul 7 , R. Renner 8 ,
U. Samson, V. Schönberg, R. Shehzadi, M. Wlasenko
Physikalisches Institut der Universität Bonn, Bonn, Germany 9
N.H. Brook, G.P. Heath, J.D. Morris, T. Namsoo

H.H. Wills Physics Laboratory, University of Bristol, Bristol, United Kingdom 10
M. Capua, S. Fazio, A. Mastroberardino, M. Schioppa, G. Susinno,

E. Tassi
Calabria University, Physics Department and INFN, Cosenza, Italy 4
doi:10.1016/j.nuclphysb.2007.06.022
J.Y. Kim 11 , K.J. Ma 12

Chonnam National University, Kwangju, South Korea 13
Z.A. Ibrahim, B. Kamaluddin, W.A.T. Wan Abdullah

Jabatan Fizik, Universiti Malaya, 50603 Kuala Lumpur, Malaysia 14
Y. Ning, Z. Ren, F. Sciulli

Nevis Laboratories, Columbia University, Irvington on Hudson, NY 10027, USA 15
J. Chwastowski, A. Eskreys, J. Figiel, A. Galas, M. Gil, K. Olkiewicz,

P. Stopa, L. Zawiejski
The Henryk Niewodniczanski Institute of Nuclear Physics, Polish Academy of Sciences, Cracow, Poland 16
L. Adamczyk, T. Bołd, I. Grabowska-Bołd, D. Kisielewska, J. Łukasik,
M. Przybycień, L. Suszycki
Faculty of Physics and Applied Computer Science, AGH-University of Science and Technology, Cracow, Poland 17
A. Kotański 18 , W. Słomiński 19
Department of Physics, Jagellonian University, Cracow, Poland
V. Adler 20 , U. Behrens, I. Bloch, C. Blohm, A. Bonato, K. Borras,

R. Ciesielski, N. Coppola, A. Dossanov, V. Drugakov, J. Fourletova,
A. Geiser, D. Gladkov, P. Göttlicher 21 , J. Grebenyuk, I. Gregor, T. Haas,
W. Hain, C. Horn 22 , A. Hüttmann, B. Kahle, I.I. Katkov, U. Klein 23 ,
U. Kötz, H. Kowalski, E. Lobodzinska, B. Löhr, R. Mankel,
I.-A. Melzer-Pellmann, S. Miglioranzi, A. Montanari, D. Notz,
L. Rinaldi, P. Roloff, I. Rubinsky, R. Santamarta, U. Schneekloth,
A. Spiridonov 24 , H. Stadie, D. Szuba 25 , J. Szuba 26 , T. Theedt, G. Wolf,
K. Wrona, C. Youngman, W. Zeuner
Deutsches Elektronen-Synchrotron DESY, Hamburg, Germany
W. Lohmann, S. Schlenstedt
Deutsches Elektronen-Synchrotron DESY, Zeuthen, Germany
G. Barbagli, E. Gallo *, P.G. Pelfer

University and INFN, Florence, Italy 4
A. Bamberger, D. Dobur, F. Karstens, N.N. Vlasov 27
Fakultät für Physik der Universität Freiburg i.Br., Freiburg i.Br., Germany 9
P.J. Bussey, A.T. Doyle, W. Dunne, J. Ferrando, M. Forrest, D.H. Saxon,

I.O. Skillicorn
Department of Physics and Astronomy, University of Glasgow, Glasgow, United Kingdom 10
I. Gialas 28 , K. Papageorgiu
Department of Engineering in Management and Finance, University of Aegean, Greece
T. Gosau, U. Holm, R. Klanner, E. Lohrmann, H. Salehi, P. Schleper,

T. Schörner-Sadenius, J. Sztuk, K. Wichmann, K. Wick
Hamburg University, Institute of Experimental Physics, Hamburg, Germany 9
C. Foudas, C. Fry, K.R. Long, A.D. Tapper

Imperial College London, High Energy Nuclear Physics Group, London, United Kingdom 10
M. Kataoka 29 , T. Matsumoto, K. Nagano, K. Tokushuku 30 , S. Yamada,

Y. Yamazaki
Institute of Particle and Nuclear Studies, KEK, Tsukuba, Japan 31
A.N. Barakbaev, E.G. Boos, N.S. Pokrovskiy, B.O. Zhautykov

Institute of Physics and Technology of Ministry of Education and Science of Kazakhstan, Almaty, Kazakhstan
V. Aushev 1
Institute for Nuclear Research, National Academy of Sciences, Kiev,
and Kiev National University, Kiev, Ukraine
D. Son
Kyungpook National University, Center for High Energy Physics, Daegu, South Korea 13
J. de Favereau, K. Piotrzkowski
Institut de Physique Nucléaire, Université Catholique de Louvain, Louvain-la-Neuve, Belgium 32
F. Barreiro, C. Glasman 33 , M. Jimenez, L. Labarga, J. del Peso, E. Ron,

M. Soares, J. Terrón, M. Zambrana
Departamento de Física Teórica, Universidad Autónoma de Madrid, Madrid, Spain 34
F. Corriveau, C. Liu, R. Walsh, C. Zhou
Department of Physics, McGill University, Montréal, Québec, Canada H3A 2T8 35
T. Tsurugai
Meiji Gakuin University, Faculty of General Education, Yokohama, Japan 31
A. Antonov, B.A. Dolgoshein, V. Sosnovtsev, A. Stifutkin, S. Suchkov

Moscow Engineering Physics Institute, Moscow, Russia 36
R.K. Dementiev, P.F. Ermolov, L.K. Gladilin, L.A. Khein,

I.A. Korzhavina, V.A. Kuzmin, B.B. Levchenko 37 , O.Yu. Lukina,
A.S. Proskuryakov, L.M. Shcheglova, D.S. Zotkin, S.A. Zotkin
Moscow State University, Institute of Nuclear Physics, Moscow, Russia 38
I. Abt, C. Büttner, A. Caldwell, D. Kollar, W.B. Schmidke, J. Sutiak
Max-Planck-Institut für Physik, München, Germany
G. Grigorescu, A. Keramidas, E. Koffeman, P. Kooijman, A. Pellegrino,

H. Tiecke, M. Vázquez 29 , L. Wiggers
NIKHEF and University of Amsterdam, Amsterdam, Netherlands 39
N. Brümmer, B. Bylsma, L.S. Durkin, A. Lee, T.Y. Ling

Physics Department, Ohio State University, Columbus, OH 43210, USA 3
P.D. Allfrey, M.A. Bell, A.M. Cooper-Sarkar, A. Cottrell,

R.C.E. Devenish, B. Foster, K. Korcsak-Gorzo, S. Patel, V. Roberfroid 40 ,
A. Robertson, P.B. Straub, C. Uribe-Estrada, R. Walczak
Department of Physics, University of Oxford, Oxford, United Kingdom 10
P. Bellan, A. Bertolin, R. Brugnera, R. Carlin, F. Dal Corso, S. Dusini,

A. Garfagnini, S. Limentani, A. Longhin, L. Stanco, M. Turcato
Dipartimento di Fisica dell'Università and INFN, Padova, Italy 4
B.Y. Oh, A. Raval, J. Ukleja 41 , J.J. Whitmore 42

Department of Physics, Pennsylvania State University, University Park, PA 16802, USA 15
Y. Iga
Polytechnic University, Sagamihara, Japan 31
G. D'Agostini, G. Marini, A. Nigro
Dipartimento di Fisica, Università 'La Sapienza' and INFN, Rome, Italy 4
J.E. Cole, J.C. Hart

Rutherford Appleton Laboratory, Chilton, Didcot, Oxon, United Kingdom 10
H. Abramowicz 43 , A. Gabareen, R. Ingbir, S. Kananov, A. Levy

Raymond and Beverly Sackler Faculty of Exact Sciences, School of Physics, Tel-Aviv University, Tel-Aviv, Israel 44
M. Kuze, J. Maeda
Department of Physics, Tokyo Institute of Technology, Tokyo, Japan 31
R. Hori, S. Kagawa 45 , N. Okazaki, S. Shimizu, T. Tawara

Department of Physics, University of Tokyo, Tokyo, Japan 31
R. Hamatsu, H. Kaji 46 , S. Kitamura 47 , O. Ota, Y.D. Ri

Tokyo Metropolitan University, Department of Physics, Tokyo, Japan 31
M.I. Ferrero, V. Monaco, R. Sacchi, A. Solano

Università di Torino and INFN, Torino, Italy 4
M. Arneodo, M. Ruspa
Università del Piemonte Orientale, Novara, and INFN, Torino, Italy 4
S. Fourletov, J.F. Martin

Department of Physics, University of Toronto, Toronto, Ontario, Canada M5S 1A7 35
S.K. Boutle 28 , J.M. Butterworth, C. Gwenlan 48 , T.W. Jones,

J.H. Loizides, M.R. Sutton 48 , M. Wing
Physics and Astronomy Department, University College London, London, United Kingdom 10
B. Brzozowska, J. Ciborowski 49 , G. Grzelak, P. Kulinski, P. Łużniak 50 ,
J. Malka 50 , R.J. Nowak, J.M. Pawlak, T. Tymieniecka, A. Ukleja,
A.F. Żarnecki
Warsaw University, Institute of Experimental Physics, Warsaw, Poland
M. Adamus, P. Plucinski 51
Institute for Nuclear Studies, Warsaw, Poland
Y. Eisenberg, I. Giller, D. Hochman, U. Karshon, M. Rosin

Department of Particle Physics, Weizmann Institute, Rehovot, Israel 52
E. Brownson, T. Danielson, A. Everett, D. Kira, D.D. Reeder 7 , P. Ryan,

A.A. Savin, W.H. Smith, H. Wolfe
Department of Physics, University of Wisconsin, Madison, WI 53706, USA 3
S. Bhadra, C.D. Catterall, Y. Cui, G. Hartner, S. Menary, U. Noor,

J. Standage, J. Whyte
Department of Physics, York University, Ontario, Canada M3J 1P3 35
Received 29 May 2007; received in revised form 7 June 2007; accepted 7 June 2007
Abstract
The first observation of (anti)deuterons in deep inelastic scattering at HERA has been made with the
ZEUS detector at a centre-of-mass energy of 300-318 GeV using an integrated luminosity of 120 pb$^{-1}$. The
measurement was performed in the central rapidity region for transverse momentum per unit of mass in the
range $0.3 < p_T/M < 0.7$. The particle rates have been extracted and interpreted in terms of the coalescence
model. The (anti)deuteron production yield is smaller than the (anti)proton yield by approximately three
orders of magnitude, consistent with the world measurements.
* E-mail address: gallo@mail.desy.de (E. Gallo).

1 Supported by DESY, Germany.
2 Also affiliated with University College London, UK.
3 Supported by the US Department of Energy.
4 Supported by the Italian National Institute for Nuclear Physics (INFN).
5 Now with TÜV Nord, Germany.
6 Now at Humboldt University, Berlin, Germany.
7 Retired.
8 Self-employed.
9 Supported by the German Federal Ministry for Education and Research (BMBF), under contract Nos. HZ1GUA 2,
HZ1GUB 0, HZ1PDA 5, HZ1VFA 5.
10 Supported by the Particle Physics and Astronomy Research Council, UK.
11 Supported by Chonnam National University in 2005.
12 Supported by a scholarship of the World Laboratory Björn Wiik Research Project.
13 Supported by the Korean Ministry of Education and Korea Science and Engineering Foundation.
14 Supported by the Malaysian Ministry of Science, Technology and Innovation/Akademi Sains Malaysia grant SAGA
66-02-03-0048.
15 Supported by the US National Science Foundation. Any opinion, findings and conclusions or recommendations
expressed in this material are those of the authors and do not necessarily reflect the views of the National Science
Foundation.
16 Supported by the Polish State Committee for Scientific Research, grant Nos. 620/E-77/SPB/DESY/P-03/DZ
117/2003-2005 and 1P03B07427/2004-2006.
17 Supported by the Polish Ministry of Science and Higher Education as a scientific project (2006-2008).
18 Supported by the research grant No. 1 P03B 04529 (2005-2008).
19 This work was supported in part by the Marie Curie Actions Transfer of Knowledge project COCOS (contract
MTKD-CT-2004-517186).
20 Now at University Libre de Bruxelles, Belgium.
21 Now at DESY group FEB, Hamburg, Germany.
22 Now at Stanford Linear Accelerator Center, Stanford, USA.
23 Now at University of Liverpool, UK.
24 Also at Institute of Theoretical and Experimental Physics, Moscow, Russia.
25 Also at INP, Cracow, Poland.
26 On leave of absence from FPACS, AGH-UST, Cracow, Poland.
27 Partly supported by Moscow State University, Russia.
28 Also affiliated with DESY.
29 Now at CERN, Geneva, Switzerland.
30 Also at University of Tokyo, Japan.
31 Supported by the Japanese Ministry of Education, Culture, Sports, Science and Technology (MEXT) and its grants
for Scientific Research.
32 Supported by FNRS and its associated funds (IISN and FRIA) and by an Inter-University Attraction Poles Programme
subsidised by the Belgian Federal Science Policy Office.
33 Ramón y Cajal Fellow.
34 Supported by the Spanish Ministry of Education and Science through funds provided by CICYT.
35 Supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).
36 Partially supported by the German Federal Ministry for Education and Research (BMBF).
37 Partly supported by Russian Foundation for Basic Research grant No. 05-02-39028-NSFC-a.
38 Supported by RF Presidential grant No. 8122.2006.2 for the leading scientific schools and by the Russian Ministry
of Education and Science through its grant Research on High Energy Physics.
39 Supported by The Netherlands Foundation for Research on Matter (FOM).
40 EU Marie Curie Fellow.
41 Partially supported by Warsaw University, Poland.
42 This material was based on work supported by the National Science Foundation, while working at the Foundation.
43 Also at Max Planck Institute, Munich, Germany, Alexander von Humboldt Research Award.
44 Supported by the German-Israeli Foundation and the Israel Science Foundation.
45 Now at KEK, Tsukuba, Japan.
46 Now at Nagoya University, Japan.
47 Department of Radiological Science.
48 PPARC Advanced fellow.
49 Also at Łódź University, Poland.
50 Łódź University, Poland.
51 Supported by the Polish Ministry for Education and Science grant No. 1 P03B 14129.
52 Supported in part by the MINERVA Gesellschaft für Forschung GmbH, the Israel Science Foundation (grant
No. 293/02-11.2) and the US-Israel Binational Science Foundation.
† Deceased.
1. Introduction
Light stable nuclei, such as deuterons (d) and tritons (t), are loosely bound states whose
production mechanism in high-energy collisions is poorly understood. Most measurements of
light stable nuclei have been performed for antideuterons ($\bar{d}$). A selection of d from primary
interactions is more difficult as it requires separation of such states from particles produced by
interactions of colliding beams with residual gas in the beam pipe and by secondary interactions
in detector material. The first observation of $\bar{d}$ [1] was followed by a number of experiments on
antideuteron production. The production rate of $\bar{d}$ in $e^+e^- \to q\bar{q}$ collisions [2-5] is significantly
lower than that measured in $\Upsilon(1S)$ and $\Upsilon(2S)$ decays [2,5]. The $\bar{d}$ rate in $e^+e^- \to q\bar{q}$ is also
lower than that in proton-nucleus (pA) [6,7], proton-proton (pp) [8] and photon-proton ($\gamma p$)
collisions at HERA [9], but higher than that in nucleus-nucleus collisions [10,11]. For heavy-ion
collisions, the coalescence model [12] was proposed to explain the production of $d(\bar{d})$.
This paper presents the results of the first measurement of d and $\bar{d}$ in the central rapidity
region of deep inelastic ep scattering (DIS). The analysis was performed for exchanged photon
virtuality, $Q^2$, above 1 GeV$^2$.
2. Coalescence model for (anti)deuteron formation
According to the coalescence model [12] developed for heavy-ion collisions, the production
rate of d is determined by the overlap between the wave-function of a proton (p) and a neutron (n)
with the wave-function of a d. In this case, the d cross section is the product of single-particle
cross sections for protons and neutrons, with a coefficient of proportionality reflecting the spatial
size of the fragmentation region emitting the particles. The same approach applies for $\bar{d}$ production.
This model was also used to describe $d(\bar{d})$ production in pp [8], $\gamma p$ [9] and $e^+e^-$ [2,4]
interactions.
Assuming that all baryons are uncorrelated and the invariant differential cross section for
neutrons is equal to that for protons, the invariant differential cross section for deuteron formation
can be parameterised as

\[
\frac{E_d}{\sigma_{\rm tot}} \frac{d^3\sigma_d}{dp_d^3}
  = B_2 \left( \frac{E_p}{\sigma_{\rm tot}} \frac{d^3\sigma_p}{dp_p^3} \right)^{2},
\]
where $E_{d(p)}$ and $\sigma_{d(p)}$ are the energy and the production cross section of the d(p), respectively,
$p_d$ ($p_p$) is the momentum of the d(p) and $\sigma_{\rm tot}$ is the total ep cross section for the considered
kinematic range. The coalescence parameter, $B_2$, is inversely proportional to the volume of the
fragmentation region emitting the particles. The same relation holds for $\bar{d}$ and $\bar{p}$. If $B_2$ is the same
for particles and antiparticles, then the production ratio $\bar{d}/d$ is equal to $(\bar{p}/p)^2$. The coalescence
parameter can be obtained from
\[
B_2 = \left( \frac{E_d}{\sigma_{\rm tot}} \frac{d^3\sigma_d}{dp_d^3} \right)
      \left( \frac{E_p}{\sigma_{\rm tot}} \frac{d^3\sigma_p}{dp_p^3} \right)^{-2}
    = M_p^4 M_d^{-2} R^2(d/p)
      \left( \frac{\gamma_d}{\sigma_{\rm tot}} \frac{d^3\sigma_d}{d(p_d/M_d)^3} \right)^{-1},
\]
where $M_{d(p)}$ is the mass of the d(p), $\gamma_d = E_d/M_d$, and $R(d/p)$ is the ratio of the number of d to p
expressed as a function of $p_T/M_{d(p)}$, with $p_T$ being the transverse momentum [9].
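The step from the first to the second form of $B_2$ is not spelled out in the text. The lines below are one way to check it, under the assumption (not stated explicitly in the source) that $R(d/p)$ is the ratio of the d and p spectra taken at the same momentum per unit mass, so that $\gamma_d = \gamma_p$. The shorthands $f_x$ and $g_x$ are introduced here for illustration only.

```latex
\begin{align*}
f_x &\equiv \frac{E_x}{\sigma_{\rm tot}} \frac{d^3\sigma_x}{dp_x^3}
      = \frac{\gamma_x M_x}{\sigma_{\rm tot}} \, \frac{1}{M_x^{3}} \,
        \frac{d^3\sigma_x}{d(p_x/M_x)^3}
      = M_x^{-2}\, g_x ,
  \qquad
  g_x \equiv \frac{\gamma_x}{\sigma_{\rm tot}} \frac{d^3\sigma_x}{d(p_x/M_x)^3}, \\
B_2 &= \frac{f_d}{f_p^{2}}
      = M_p^{4} M_d^{-2}\, \frac{g_d}{g_p^{2}}
      = M_p^{4} M_d^{-2} \left( \frac{g_d}{g_p} \right)^{2} g_d^{-1}
      = M_p^{4} M_d^{-2}\, R^{2}(d/p)\, g_d^{-1},
  \qquad R(d/p) \equiv \frac{g_d}{g_p}.
\end{align*}
```

With these definitions the mass factors $M_p^4 M_d^{-2}$ come purely from rewriting $E\,d^3\sigma/dp^3$ in terms of $p/M$.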
RAPID COMMUNICATION
189
3. Experimental set-up
A detailed description of the ZEUS detector can be found elsewhere [13]. A brief outline of
the components that are most relevant for this analysis is given below.
Charged particles are tracked in the central tracking detector (CTD) [14], which operates
in a magnetic field of 1.43 T provided by a thin superconducting solenoid. The CTD consists of 72 cylindrical drift chamber layers, organised in nine superlayers covering the polar-angle$^{53}$ region $15^\circ < \theta < 164^\circ$. The transverse-momentum resolution for full-length tracks is
$\sigma(p_T)/p_T = 0.0058\,p_T \oplus 0.0065 \oplus 0.0014/p_T$, with $p_T$ in GeV. To estimate the ionisation
energy loss per unit length, dE/dx, of particles in the CTD [15], the truncated mean of the
anode-wire pulse heights was calculated, which removes the lowest 10% and at least the highest
30% depending on the number of saturated hits. The measured dE/dx values were corrected
by normalising to the average dE/dx for tracks around the region of minimum ionisation for
pions with momentum p satisfying 0.3 < p < 0.4 GeV. Henceforth, dE/dx is quoted in units
of minimum ionising particles (mips). The resolution of the dE/dx measurement for full-length
tracks is about 9%.
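The truncated-mean estimate described above can be sketched as follows. This is a minimal illustration, not the ZEUS implementation: it uses fixed 10% and 30% truncation fractions and ignores the dependence of the upper truncation on the number of saturated hits. The function names and the `mip_reference` argument are hypothetical.

```python
def truncated_mean(pulse_heights, low_percent=10, high_percent=30):
    """Mean pulse height after dropping the lowest and highest fractions.

    Assumption: fixed truncation percentages; the source states that the
    upper fraction is "at least" 30% and grows with the number of
    saturated hits, which is not modelled here.
    """
    values = sorted(pulse_heights)
    n = len(values)
    n_low = n * low_percent // 100      # hits dropped at the low end
    n_high = n * high_percent // 100    # hits dropped at the high end
    kept = values[n_low:n - n_high]
    if not kept:
        raise ValueError("too few hits for the requested truncation")
    return sum(kept) / len(kept)


def to_mips(truncated, mip_reference):
    """Express a truncated mean in units of minimum ionising particles.

    mip_reference is the average truncated mean for pions with
    0.3 < p < 0.4 GeV (a hypothetical calibration input).
    """
    return truncated / mip_reference
```

For ten hits with pulse heights 1 to 10, one hit is dropped at the low end and three at the high end, so the mean is taken over the values 2 to 7.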
The high-resolution uranium-scintillator calorimeter (CAL) [16] consists of three parts: the
forward (FCAL), the barrel (BCAL) and the rear (RCAL) calorimeters. Each part is subdivided
transversely into towers and longitudinally into one electromagnetic section (EMC) and either
one (in RCAL) or two (in BCAL and FCAL) hadronic sections (HAC). The smallest subdivision
of the calorimeter is called a cell. The CAL energy resolutions, as measured under test-beam
conditions, are $\sigma(E)/E = 0.18/\sqrt{E}$ for electrons and $\sigma(E)/E = 0.35/\sqrt{E}$ for hadrons, with E
in GeV. A presampler [17] mounted in front of the calorimeter and a scintillator-strip detector
(SRTD) [18] were used to correct the energy of the scattered electron.$^{54}$ The position of electrons
scattered close to the electron beam direction is determined by the SRTD detector.
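For orientation, the quoted tracking and calorimeter resolutions can be evaluated numerically. The sketch below assumes that the three terms of the CTD parameterisation are combined in quadrature and that E and $p_T$ are given in GeV. The function names are illustrative only.

```python
import math


def quadrature_sum(*terms):
    """Combine independent contributions in quadrature."""
    return math.sqrt(sum(t * t for t in terms))


def ctd_pt_resolution(pt):
    """Relative resolution sigma(pT)/pT for full-length CTD tracks, pT in GeV.

    Assumption: the three quoted terms are added in quadrature.
    """
    return quadrature_sum(0.0058 * pt, 0.0065, 0.0014 / pt)


def cal_energy_resolution(energy, hadron=False):
    """Relative CAL resolution sigma(E)/E under test-beam conditions, E in GeV."""
    coefficient = 0.35 if hadron else 0.18
    return coefficient / math.sqrt(energy)
```

At $p_T$ = 1 GeV the three tracking terms give a relative resolution a little below 1%, and the hadronic calorimeter resolution at the same energy is roughly twice the electromagnetic one.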
The inactive material between the interaction region and the CTD, relevant for this analysis,
consists of the central beam pipe, made of aluminum with a wall thickness of 1.5 mm and an inner
diameter of 135 mm. The CTD inner wall, with a diameter of 324 mm, consists of two aluminum
skins, each 0.7 mm thick, separated by an 8.6 mm gap filled with polyurethane foam with a nominal density of 0.05 g/cm$^3$.
The luminosity was measured using the bremsstrahlung process $ep \to e\gamma p$ with the luminosity monitor [19], a lead-scintillator calorimeter placed in the HERA tunnel at Z = -107 m.
4. Monte Carlo simulation
To study the detector response, the ARIADNE 4.12 Monte Carlo (MC) model [20] for the
description of inclusive DIS events was used. The ARIADNE program uses the Lund string
model [21] for hadronisation, as implemented in PYTHIA 6.2 [22-24]. In its original version,
this MC does not include a mechanism for the production of d or other light stable nuclei. To
determine reconstruction efficiencies, a second ARIADNE sample was generated in which d's
were included at the generator level by combining p and n with similar momenta.
53 The ZEUS coordinate system is a right-handed Cartesian system, with the Z axis pointing in the proton beam direction, referred to as the forward direction, and the X axis pointing left towards the centre of HERA. The coordinate
origin is at the nominal interaction point.
54 Henceforth the term electron is used to refer both to electrons and positrons.
The ARIADNE events were passed through a full simulation of the detector using the
GEANT 3.13 [25] program. The GEANT simulation uses the GHEISHA model [26] to simulate
hadronic interactions in the material. The GEANT program cannot be used for $\bar{d}$ as this particle
is not included in the particle table.
5. Event sample
5.1. DIS event selection
The data sample corresponds to an integrated luminosity of 120.3 pb$^{-1}$ taken between 1996
and 2000 with the ZEUS detector at HERA. This sample consists of 38.6 pb$^{-1}$ of $e^+p$ data taken
at a centre-of-mass energy of 300 GeV, 65.0 pb$^{-1}$ taken at 318 GeV and 16.7 pb$^{-1}$ of $e^-p$ data
taken at 318 GeV.
The search was performed using DIS events with exchanged-photon virtuality $Q^2 > 1$ GeV$^2$.
The event selection was similar to that used in a previous ZEUS publication [27]. A three-level
trigger [13] was used to select events online. At the third-level trigger, an electron with an energy
greater than 4 GeV was required. Data below $Q^2 \approx 20$ GeV$^2$ were prescaled to reduce trigger
rates.
The Bjorken scaling variable, $x_{\rm Bj}$, and $Q^2$ were reconstructed using the electron method (denoted by the subscript e), which uses measurements of the energy and angle of the scattered electron. The scattered-electron candidate was identified from the pattern of energy deposits in the
CAL [28]. In addition, the inelasticity was reconstructed using the Jacquet-Blondel method [29],
$y_{\rm JB}$, or the electron method, $y_e$.
For the final DIS sample, the following requirements were imposed:
- $Q_e^2 > 1$ GeV$^2$;
- the impact point of the scattered electron on the RCAL outside the (X, Y) region ($\pm 12$, $\pm 6$) cm centred on the beamline;
- $E_e > 8.5$ GeV, where $E_e$ is the energy of the scattered electron measured in the CAL and
  corrected for energy losses;
- $35 < \delta < 65$ GeV, where $\delta = \sum_i E_i (1 - \cos\theta_i)$, $E_i$ is the energy of the ith calorimeter cell,
  $\theta_i$ is its polar angle and the sum runs over all cells;
- $y_e < 0.95$ and $y_{\rm JB} > 0.01$;
- at least three tracks fitted to the primary vertex to ensure a good reconstruction of the primary
  vertex and to reduce contributions from non-ep events;
- $|Z_{\rm vtx}| < 40$ cm and $\sqrt{X_{\rm vtx}^2 + Y_{\rm vtx}^2} < 1$ cm, where $Z_{\rm vtx}$, $X_{\rm vtx}$ and $Y_{\rm vtx}$ are the coordinates of
  the vertex position determined from the tracks.
The average $Q^2$ of the selected sample was about 10 GeV$^2$.
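The requirements listed above amount to a set of independent cuts on reconstructed event quantities. The sketch below shows one way to express them, assuming a hypothetical `DisEvent` record whose field names are invented for illustration. The RCAL box cut is read as excluding $|X| < 12$ cm and $|Y| < 6$ cm, which is an interpretation of the quoted region.

```python
import math
from dataclasses import dataclass, replace
from typing import Sequence, Tuple


def compute_delta(cells: Sequence[Tuple[float, float]]) -> float:
    """delta = sum_i E_i (1 - cos(theta_i)) over (energy in GeV, polar angle in rad)."""
    return sum(energy * (1.0 - math.cos(theta)) for energy, theta in cells)


@dataclass(frozen=True)
class DisEvent:
    """Hypothetical container for the reconstructed quantities used in the cuts."""
    q2_e: float             # GeV^2, electron method
    electron_x: float       # cm, impact point on the RCAL
    electron_y: float       # cm
    electron_energy: float  # GeV, corrected for energy losses
    delta: float            # GeV
    y_e: float
    y_jb: float
    n_vertex_tracks: int
    x_vtx: float            # cm
    y_vtx: float            # cm
    z_vtx: float            # cm


def passes_dis_selection(ev: DisEvent) -> bool:
    """Apply the final DIS requirements listed in the text."""
    # Assumed reading of the box cut: reject |X| < 12 cm together with |Y| < 6 cm.
    inside_box = abs(ev.electron_x) < 12.0 and abs(ev.electron_y) < 6.0
    return (
        ev.q2_e > 1.0
        and not inside_box
        and ev.electron_energy > 8.5
        and 35.0 < ev.delta < 65.0
        and ev.y_e < 0.95
        and ev.y_jb > 0.01
        and ev.n_vertex_tracks >= 3
        and abs(ev.z_vtx) < 40.0
        and math.hypot(ev.x_vtx, ev.y_vtx) < 1.0
    )
```

Keeping every cut in a single boolean expression makes it easy to compare the code line by line with the list of requirements.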

5.2. Track selection and the dE/dx measurement
The present analysis is based on charged tracks measured in the CTD. The tracks were required to have:
- at least 40 CTD hits, with at least 8 of them for the dE/dx measurement;
- the transverse momentum $p_T \geq 0.15$ GeV.
These cuts selected a region where the CTD track acceptance, as well as the resolutions in
momentum and the dE/dx, were high.
To identify particles originating from ep collisions, the following additional variables were
reconstructed for each track:
- the distance, $\Delta Z$, of the Z-component of the track helix to $Z_{\rm vtx}$;
- the distance of closest approach (DCA) of the track to the beam-spot location in the transverse plane. The beam-spot position is determined from the average primary-vertex distributions in X and Y for each data-taking period. The DCA is assigned a positive (negative)
  value if the beam spot lies left (right) of the particle path.
Fig. 1 shows the dE/dx distribution as a function of the track momentum for positive and
negative tracks. The events were selected by requiring at least one track with dE/dx > 2.5 mips.
To reduce the fraction of tracks coming from non-ep collisions, the tracks were required to have
$|\Delta Z| < 1$ cm and |DCA| < 0.5 cm. After such a selection, clear bands corresponding to charged
kaons, protons and deuterons were observed. The requirement dE/dx > 2.5 mips enhances the
fraction of events with at least one particle with a mass larger than the pion mass and leads to the
discontinuity near dE/dx = 2.5 mips seen in Fig. 1. The lines show the most probable energy
loss calculated from the Bethe-Bloch formula [30]. The dE/dx bands for $K^-$ and $\bar{p}$ are slightly
shifted with respect to the Bethe-Bloch expectations due to the geometrical structure of the CTD
drift cells, which leads to a different response to negative and positive tracks.
Fig. 2 shows the reconstructed masses, M, for different particle species. The masses were
calculated from the measured track momentum and energy loss using the Bethe-Bloch formula.
The mass distributions were fitted with asymmetric$^{55}$ Gaussian functions. The relative width
obtained was 11% (7%) for the left (right) part of the function.
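The fit function mentioned above can be written down directly. The sketch below is only an illustration of the functional form. It assumes that the quoted relative widths are fractions of the peak position, which the text does not state explicitly, and the parameter names are hypothetical.

```python
import math


def asymmetric_gaussian(x, amplitude, mean, sigma_left, sigma_right):
    """Gaussian with different widths below and above the peak position."""
    sigma = sigma_left if x < mean else sigma_right
    return amplitude * math.exp(-0.5 * ((x - mean) / sigma) ** 2)


def proton_mass_shape(mass, amplitude=1.0, peak=0.938):
    """Example shape using the quoted 11% (left) and 7% (right) relative widths.

    Assumption: the widths are taken relative to the peak position (in GeV).
    """
    return asymmetric_gaussian(mass, amplitude, peak, 0.11 * peak, 0.07 * peak)
```

The function is continuous at the peak, and one width to the left gives the same value as one width to the right.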
The number of $p(\bar{p})$ candidates in the mass region $0.7(0.6) < M < 1.5$ GeV was $1.61 \times 10^5$
($1.66 \times 10^5$). Due to a shift in the dE/dx for negative tracks, the lower mass cut for $\bar{p}$ was at
0.6 GeV. The numbers of d and $\bar{d}$ in the mass window $1.5 < M < 2.5$ GeV were 309 and 62,
respectively. The number of p migrating to the d mass region was estimated to be less than 1%
of the total number of d candidates. A similar estimate was obtained for antiparticles. A small
number of triton candidates was observed in the mass window $2.5 < M < 3.5$ GeV. However,
due to low statistics, it was difficult to establish a peak inside this mass window; therefore, no
conclusive statement on the origin of the tracks in the region $2.5 < M < 3.5$ GeV was possible.
The observed $p(\bar{p})$ and $d(\bar{d})$ candidates were required to be in the central rapidity region,
$|y| < 0.4$, and to have $0.3 < p_T/M < 0.7$. This determines the kinematic range used for the
cross-section calculations.
5.3. Identification of particles produced in ep collisions
The observed p(p̄) and d(d̄) candidates selected after the dE/dx mass cuts can originate
from secondary interactions in the inactive material between the interaction point and the central
tracking detector.
In order to select p(p̄) and d(d̄) originating from ep collisions, both DCA and Z cuts were
removed and a statistical background subtraction based on the DCA distribution was performed.
55 An asymmetric Gaussian has different widths for the left and right parts of the function.
Fig. 1. The dE/dx distributions as a function of the track momentum for (a) positive and (b) negative tracks. The DIS
events were accepted by requiring at least one track with dE/dx > 2.5 mips (denoted by the dashed lines), |Z| < 1 cm
and |DCA| < 0.5 cm. The lines show the most-probable energy loss calculated using the Bethe–Bloch formula for different particle species.
The Z distributions for p(p̄) and d(d̄) after the mass cuts are shown in Fig. 3. Clear peaks at
Z = 0 are observed. To optimise the signal-over-background ratio for the DCA distribution, all
candidates were selected using the |Z| < 2(1) cm restriction for p, p̄ (d, d̄).
Fig. 2. The mass spectra for (a) positive and (b) negative particles. Tracks are selected as for Fig. 1. The mass distribution was calculated from the track momenta and the dE/dx. The arrows indicate the cuts applied for the selection of
candidates.
Fig. 4 shows the DCA distributions for p(p̄) and d(d̄) candidates. The distributions show
peaks at zero due to tracks originating from the primary vertex. The number of particles
originating from primary ep collisions was determined using the side-band background subtraction. A linear fit to the DCA distribution on either side of the peak region in the range
2 < |DCA| < 4 cm was performed. Then, the expected number of background events in the signal
region of |DCA| < 1.5(0.5) cm for p, p̄ (d, d̄) candidates was subtracted.
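The side-band subtraction described above can be sketched as follows. This is a simplified illustration rather than the analysis code: the helper names `fit_line` and `sideband_subtract`, the toy histogram and the use of a single unweighted straight line through both side-bands are assumptions made for the example.

```python
def fit_line(xs, ys):
    """Unweighted least-squares fit y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def sideband_subtract(centres, counts, signal_cut, sb_low=2.0, sb_high=4.0):
    """Return (signal_yield, expected_background) for a binned DCA histogram."""
    # Fit the background shape in the side-bands sb_low < |DCA| < sb_high.
    side = [(x, n) for x, n in zip(centres, counts) if sb_low < abs(x) < sb_high]
    a, b = fit_line([x for x, _ in side], [n for _, n in side])
    # Extrapolate the fitted line into the signal region and subtract it.
    in_signal = [(x, n) for x, n in zip(centres, counts) if abs(x) < signal_cut]
    background = sum(a + b * x for x, _ in in_signal)
    total = sum(n for _, n in in_signal)
    return total - background, background

# Toy histogram (hypothetical): flat background of 10 entries per 0.5 cm bin
# plus a peak near DCA = 0.
centres = [-3.75 + 0.5 * i for i in range(16)]  # bin centres in cm
counts = [10.0] * 16
counts[7] += 40.0  # bin centred at -0.25 cm
counts[8] += 60.0  # bin centred at +0.25 cm

signal, background = sideband_subtract(centres, counts, signal_cut=0.5)
```

With the signal region |DCA| < 0.5 cm used for d(d̄), the toy example recovers the 100 injected signal entries on top of an expected background of 20.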
The number of p(p̄) obtained after the DCA side-band background subtraction was 1.52 × 10^5
(1.62 × 10^5). The numbers of d and d̄ particles were 177 ± 17 and 53 ± 7, respectively. The difference in the observed numbers of p and p̄ can be explained by different dE/dx efficiencies and
the mass cuts for positive and negative tracks. Such a difference in the efficiencies for particles
and antiparticles cannot explain the difference in the observed numbers of d and d̄.
Fig. 5 shows the distributions for several DIS kinematic variables: Q2e, xe, Ee and δ. In
addition, rapidity (y) distributions for the selected candidates are shown. The numbers of p(p̄)
and d(d̄) candidates were calculated in each bin from the DCA distributions after the side-band
background subtraction. The distributions for d̄ are consistent with those for p and p̄,
while the d sample shows some deviations for the Ee variable and, consequently, for the δ variable.
6. Studies of background processes
The following two background sources for heavy stable charged particles were considered:
• interactions of the proton (or electron) beam with residual gas in the beam pipe, termed
beam-gas interactions;
• secondary interactions of particles in inactive material between the interaction point and the
central tracking detector.
Fig. 3. The distributions of Z, the distance of the Z-component of the track helix to Zvtx for: (a)–(b) particles and
(c)–(d) antiparticles, as indicated in the figure. The p, p̄, d and d̄ candidates were identified using the dE/dx mass cuts
(see text). The arrows indicate the applied cuts.
6.1. Beam-gas interactions
The contribution from proton-gas interactions is significantly reduced after the ZEUS three-level trigger which requires a scattered electron in the CAL. In addition, the requirement to accept
only events with more than three tracks fitted to the primary vertex significantly diminishes the
contribution from both electron-gas and proton-gas events. The remaining fraction of beam-gas
interactions can be assessed by studying the Zvtx distribution.
Fig. 6 shows the Zvtx distributions for events with at least one p(p̄) or d(d̄) candidate. The
distributions were reconstructed in the signal region |Z| < 2(1) cm and |DCA| < 1.5(0.5) cm
for p, p̄ (d, d̄) candidates without the background subtraction. Fig. 6 shows that there is essentially no beam-gas background for d̄ events. A small background for d at positive Zvtx is expected
from the DIS MC generated for inclusive DIS events in which deuterons are produced by secondary
interactions in the material in front of the CTD. This background is expected to have a flat DCA
distribution and, therefore, is subtracted by the procedure described in Section 5.3.
The Zvtx distributions were fitted using a Gaussian function with a first-order polynomial
for the background description. The extracted Gaussian widths are fully consistent with those
obtained for inclusive DIS events without the d preselection.
Fig. 4. The distributions of the distance of closest approach, DCA, for: (a)–(b) particles and (c)–(d) antiparticles. The
DCA distributions are shown after the cut |Z| < 2(1) cm as discussed in the text. The arrows indicate the signal region for the
side-band background subtraction. The dashed lines show the fitted background level.
To further study the Zvtx distribution, a special event selection was performed for non-colliding electron and proton bunches. Since the requirement to detect an electron with energy
Ee ≥ 8.5 GeV significantly reduces the rate of such background events, this requirement was not
applied. All other tracking cuts were the same as in the d(d̄) selection. The requirement to accept
events with at least three tracks fitted to the primary vertex rejects most of the beam-gas events
(about 95% of the total number of triggered events). As expected, the remaining events show
clear peaks at zero for the Z and DCA distributions, but the reconstructed Zvtx distribution did
not show a peak at zero.
The enhancement at large Zvtx for d, which was found to be consistent with that originating
from secondary interactions, could partially be due to electron-gas interactions. If one assumes
that the background seen in Fig. 6(b) is due to non-ep interactions, then the contribution from
beam-gas interactions does not exceed 17% of the total number of events with a deuteron.
6.2. Secondary interactions on inactive material
A pure sample of DIS events will still contain deuterons produced by secondary interactions
of particles in material. The aim of the side-band background subtraction discussed in Section 5.3
was to remove such a background contribution, assuming that the background processes do not
create a residual peak at Z = 0 and DCA = 0. Several checks of this assumption are discussed
below.
Fig. 5. The distributions of the number of events with at least one d(d̄) and p(p̄)
candidate normalised to unity as a
function of: (a)–(d) DIS kinematic variables and (e) rapidity y. The points for d and d̄ are slightly shifted horizontally
for clarity.
The DCA and Z distributions were investigated using an MC simulation of inclusive DIS
events without d(d̄) production at the generator level. Deuterons from secondary interactions
were selected as for the data. The reconstructed DCA and Z for d did not show a peak at
zero. A more detailed study of the DCA and Z distributions was possible for p not originating
from an ep collision at the MC generator level, since in this case the available MC statistics is
significantly higher than for the d case. After the track-quality cuts, no peak at zero was observed
in the DCA and Z distributions.
If a deuteron is produced by secondary interactions of the particles from the DIS event in
the surrounding matter, the secondary d will not point precisely back to the interaction point,
and both DCA and Z distributions will be wider than in the case of d̄ and p̄. Therefore, the DCA
and Z distributions were fitted with double-Gaussian distributions to establish the width of the
distributions. It was found that the observed deuteron DCA and Z widths were consistent with
the corresponding widths for p and p̄.
One possible source for d is the reaction N + N → d + π, where one of the nucleons N
originates from an ep collision, while the other one originates from the detector material in front
of the CTD. For low initial nucleon momenta, the DCA of the d track is in general large and
it does not form an important background; at high initial nucleon momenta, however, the DCA
can become small enough that misidentification could become important (see footnote 56). Since the processes
N + N → d + π can lead to an additional charged pion, this source of background deuterons
can be studied by comparing the average charged multiplicity of tracks for d and d̄ events. In
addition, the distance of closest approach, DCA12, between the d track and other non-primary
tracks in the same event should have an enhancement at zero. The study indicated that the average
number of tracks for d events is smaller than that for d̄ events. The rejection of events with
|DCA12| < 2 cm did not lead to a statistically significant reduction in the number of the observed
d events.
56 Note that the cross section for the reaction N + N → d + π decreases rapidly with increasing energy.
Fig. 6. The Zvtx distributions for: (a)–(b) particles and (c)–(d) antiparticles, as indicated in the figure. The solid lines
show the fit using a Gaussian distribution with a first-order polynomial function for the background description. The
dashed line shows the fitted background. The arrows indicate the cuts applied for the final selection.
Secondary deuterons may also be produced in pickup (p + n → d) reactions by primary
p(n) interacting in the surrounding material. These deuterons, peaking in the direction of the
primary p(n), point approximately to the interaction point and are therefore a potentially dangerous source of background. Experimental data on the pickup reactions at the relevant energy
are scarce and therefore only a rough estimate of the size of this background is possible. From
the extrapolation of data on 154Sm [31] and C [32,33] targets using the K. Kikuchi theory [34]
to allow for the change of material, the estimated d background from the pickup reaction was
in the range 1–10% of the total number of observed d events, depending on the extrapolation
input.
The angular distributions of d from pickup reactions have also been investigated in several
experiments [32,35,36] for various targets and for a range of p/M similar to the present analysis.
In all cases, the angular distribution of d observed in these experiments would lead to a much
wider DCA than that shown in Fig. 4(b).
7. Detector corrections
In this analysis, all measurements are based on event ratios; therefore, the detector corrections
due to DIS event selection and trigger efficiency were found to be small and thus are not discussed
here. The detector corrections for the tracking efficiency and the efficiency of the dE/dx cuts
are described below.
7.1. Tracking efficiency
The efficiency due to the track reconstruction, ε, was estimated separately for p(p̄) and d
using the ARIADNE MC model (with d included at the generator level). The obtained efficiencies
are about 0.95 for p and d and 0.90 for p̄.
The method cannot be applied to d̄, which are not treated in the GEANT simulation. Therefore,
the tracking efficiency for d̄ was modelled as ε(d̄) = ε(d) ε(p̄)/ε(p). In the expression above,
the hit reconstruction efficiency is described by the first term, ε(d), while the absorption losses
(including annihilation) of d̄ and p̄ are assumed to be similar. This modelling assumes that the
cross sections of annihilation in the detector material are the same for d̄ and p̄, since the inelastic
nuclear cross section of p̄ is much larger than that of n̄ for the momentum region less than
0.4 GeV [37]. The use of the geometrical model discussed in [11,37] and the model in which
the p̄ and n̄ inelastic absorption cross sections are added linearly [4,37] to obtain the inelastic
nuclear cross section of d̄ reduces ε(d̄) by 1% and 5%, respectively.
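As a small worked example of the efficiency model above, the sketch below evaluates ε(d̄) = ε(d) ε(p̄)/ε(p) with the approximate efficiencies quoted in the text. The function name `antideuteron_efficiency` is hypothetical, and reading the 1% and 5% reductions as relative changes is an assumption of this example.

```python
def antideuteron_efficiency(eff_d, eff_p, eff_pbar):
    """Model epsilon(dbar) = epsilon(d) * epsilon(pbar) / epsilon(p)."""
    return eff_d * eff_pbar / eff_p

# Approximate tracking efficiencies quoted in the text.
eff_dbar = antideuteron_efficiency(eff_d=0.95, eff_p=0.95, eff_pbar=0.90)

# Alternative absorption models, read here as relative reductions
# of 1% (geometrical model) and 5% (linear model); an interpretation.
eff_dbar_geometrical = eff_dbar * (1.0 - 0.01)
eff_dbar_linear = eff_dbar * (1.0 - 0.05)
```

With these inputs the model gives an antideuteron tracking efficiency of about 0.90, i.e. the antiproton value, because ε(d) and ε(p) are taken to be equal.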
7.2. Efficiency of the dE/dx cuts
Another important contribution to the efficiency comes from the dE/dx threshold cuts and the
mass cuts. The inefficiency due to the dE/dx requirements was estimated separately for positive
and negative tracks using Λ → pπ (+ c.c.) decays. In this approach, protons were identified
from the Λ peak and then the proton dE/dx selection efficiency was reconstructed as the ratio
of the events without and with the dE/dx requirement. These efficiencies were determined as
a function of p/M. The efficiency for each pT/M bin was corrected by reweighting the p/M
distributions using ARIADNE. The average efficiency of the dE/dx cuts for d(d̄) is 0.7 for
pT/M < 0.5. For larger momenta, the efficiency decreases due to the dE/dx > 2.5 mips cut.
The signal extraction is not possible for pT/M > 0.7 due to a very small efficiency. For the
low-momentum region pT/M < 0.5, the efficiencies for negative tracks tend to be larger than
for positive tracks. The dE/dx efficiency for p(p̄) is higher by 15% than that for d(d̄).
Alternatively, the overall tracking and the dE/dx efficiency was calculated using the ARIADNE MC model; consistent results with the approach discussed above were found.
8. Systematic uncertainties
The systematic uncertainties were evaluated by changing the selection and the analysis procedure. Only the largest contribution of each cut variation for the final invariant cross section is
given below. The following sources of systematic uncertainties were studied:
• efficiency of the track reconstruction and selection. The systematic uncertainty on the tracking efficiency for p, p̄, d was 2%. This systematic uncertainty was found after variations
of the track-quality cuts. For d̄, the systematic uncertainty, 5%, includes both the effect
of track-quality-cut variations and the reduction in ε(d̄) when the linear model for the d̄
absorption was used (see Section 7.1);
• efficiency due to the dE/dx selection. This systematic uncertainty was estimated by varying
the cut dE/dx > 2.5 mips within the dE/dx resolution and by using the MC simulation.
This systematic uncertainty was 5%. For the lowest pT/M bin, the uncertainty was 10%;
• variations in the particle yields associated with the signal extraction:
  – the numbers of d(d̄) were reconstructed using a Gaussian fit to the DCA distribution with
  a first-order polynomial for the background description;
  – the region used to determine the background for the side-band background subtraction
  was reduced to 1.5 < |DCA| < 3.5 cm;
  – the DCA cut for the side-band background subtraction was varied within its resolution of
  0.1 cm;
  – for the side-band background subtraction, the background shape was taken from the MC
  (without d at the generator level);
  – the cut on Z was varied by 0.2 cm.
These variations lowered the production yields by 5.0% for p, 2.2% for p̄, 26.0% for d and
6.1% for d̄. The largest effect originates from the conservative treatment of the shape of the
DCA background. The upper systematic error was below 1% for p, p̄ and d̄, and 11% for d;
• the background contribution under the Zvtx peak for d events was assumed to be due to
beam-gas interactions and, therefore, it was subtracted (4% contribution for p, p̄, d̄ and
17% contribution for d);
• the correction for Λ decays applied for the p(p̄) sample was changed by 10% (see Section 9.1). The size of this uncertainty, which is similar to that in other publications [4,9], was
determined by the uncertainty on the strangeness suppression factor in the ARIADNE model;
• variations of the DIS-selection cuts. The cut on the energy of the scattered electron was
increased to 10 GeV, and the lower cut on the δ distribution was tightened to 40 GeV. The
cut on Zvtx was varied by 5 cm. The cut on the number of primary tracks was increased
from three to four. These variations led to changes of +3.3/−4.1% for p, +3.6/−4.4% for p̄,
+3.7/−8.5% for d and +5.7/−13.3% for d̄. Variations of the cuts on ye and yJB distributions showed a negligible
effect.
The overall systematic uncertainty was determined by adding the above uncertainties in
quadrature. The largest experimental uncertainty was due to the uncertainties on the tracking
efficiency and the signal extraction.
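The combination in quadrature can be sketched as below for asymmetric uncertainties. The function name `combine_in_quadrature` is hypothetical and the input list is an illustrative subset of the proton uncertainties quoted above (tracking, dE/dx selection, signal extraction with the upper shift neglected, DIS-selection cuts); it is not intended to reproduce the totals in the tables.

```python
import math

def combine_in_quadrature(variations):
    """Combine (up, down) percentage shifts into total upper and lower errors."""
    up = math.sqrt(sum(u ** 2 for u, _ in variations))
    down = math.sqrt(sum(d ** 2 for _, d in variations))
    return up, down

# Illustrative subset of the proton systematic uncertainties, in percent:
# (upward shift, downward shift) for each source.
proton_sources = [
    (2.0, 2.0),   # tracking efficiency
    (5.0, 5.0),   # dE/dx selection
    (0.0, 5.0),   # signal extraction (upper shift below 1%, neglected here)
    (3.3, 4.1),   # DIS-selection cuts
]

total_up, total_down = combine_in_quadrature(proton_sources)
```

For this subset the combined uncertainty is roughly +6.3% and −8.4%, dominated by the dE/dx selection and the signal extraction.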
9. Results
9.1. Production cross sections and B2
For each particle type i, the invariant differential cross section can be calculated from the
rapidity range Δy and the transverse momentum pT,i of a corresponding particle through

$$\frac{\gamma_i}{\sigma_{\rm tot}}\,\frac{d^3\sigma_i}{d(p_i/M_i)^3} = \frac{1}{N_{\rm DIS}}\,\frac{1}{2\pi\,(p_{T,i}/M_i)\,\Delta y}\,\frac{N_i}{\Delta(p_{T,i}/M_i)},$$

where the subscript i denotes a p(p̄) or a d(d̄), Ni is the particle yield in each pT,i/Mi bin
after the correction for the tracking efficiencies and the particle selection and NDIS = 2.59 × 10^7
is the number of DIS events used in the analysis. For the present measurement, Δy = 0.8 and
the bin sizes are Δ(pT,i/Mi) = 0.1. For comparisons with other experiments, the p(p̄) rate
was corrected for the decay products of Λ. A correction factor of 0.79 was estimated from the
ARIADNE simulation which gives an adequate description of K0S and Λ production [38].
Fig. 7. The invariant differential cross sections for p(p̄) and d(d̄) produced in DIS ep collisions as a function of pT/M.
The inner error bars show the statistical uncertainties, the outer ones show statistical and systematic uncertainties added
in quadrature. For clarity, the points for particles and antiparticles are slightly shifted horizontally with respect to the
corresponding pT/M.
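The cross-section formula above can be turned into a short numerical sketch. The constants are those quoted in the text; the function name `invariant_cross_section`, the use of the bin centre for pT/M and the corrected yield of 100 particles are assumptions introduced only for illustration.

```python
import math

N_DIS = 2.59e7    # number of DIS events quoted in the text
DELTA_Y = 0.8     # rapidity range, corresponding to |y| < 0.4
BIN_WIDTH = 0.1   # bin size in pT/M

def invariant_cross_section(n_corrected, pt_over_m):
    """(gamma/sigma_tot) d^3sigma/d(p/M)^3 for one pT/M bin.

    n_corrected is the efficiency-corrected yield in the bin and
    pt_over_m is the representative pT/M value (here the bin centre).
    """
    phase_space = 2.0 * math.pi * pt_over_m * DELTA_Y * BIN_WIDTH
    return n_corrected / (N_DIS * phase_space)

# Hypothetical corrected yield of 100 particles in the 0.3-0.4 bin.
value = invariant_cross_section(100.0, 0.35)
```

A corrected yield of order 100 in the lowest bin gives a value of order 10^-5, the scale at which the deuteron cross sections are tabulated.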
The invariant differential cross sections as a function of pT/M for p(p̄) and d(d̄) are shown
in Fig. 7 and given in Tables 1 and 2. The d(d̄) invariant cross section is smaller by approximately three orders of magnitude than that of p(p̄).
These cross sections were used to extract the
coalescence parameter B2 as discussed in Section 2. The parameter B2 is shown in Fig. 8 and
listed in Tables 3 and 4. For d, B2 tends to be higher than for d̄, especially at low pT/M. The
value of B2 for d̄ is in agreement with the measurements in photoproduction [9], but larger than
that observed in e+e− annihilation at the Z resonance [4]. The measured B2 is also significantly
larger than that observed in heavy-ion collisions [11].
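As a numerical cross-check, B2 can be recomputed from the tabulated cross sections. The sketch assumes the standard coalescence relation, in which the invariant deuteron cross section equals B2 times the square of the invariant proton cross section at the same p/M; rewritten in the p/M variables used here this gives B2 = (Mp^4/Md^2) Xd/Xp^2, with X the quantity tabulated in Table 1. The exact definition is given in Section 2 of the paper and is not reproduced here, so this form, the function name `coalescence_b2` and the rounded mass values are assumptions of the example.

```python
M_P = 0.938272   # proton mass in GeV
M_D = 1.875613   # deuteron mass in GeV

def coalescence_b2(xsec_d, xsec_p):
    """B2 in GeV^2 from invariant cross sections taken at the same pT/M."""
    return (M_P ** 4 / M_D ** 2) * xsec_d / xsec_p ** 2

# Central values from Table 1: (proton, deuteron) per pT/M bin.
table1 = {
    "0.3-0.4": (1.33e-2, 3.29e-5),
    "0.4-0.5": (1.34e-2, 1.37e-5),
    "0.5-0.6": (0.88e-2, 1.16e-5),
}

b2 = {name: coalescence_b2(xd, xp) for name, (xp, xd) in table1.items()}
```

Under this assumption the three bins give B2 of about 4.1, 1.7 and 3.3 in units of 10^-2 GeV^2, which agrees with the central values listed for d in Table 3 to within rounding.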
The events containing at least one p(p̄) or d(d̄) were analysed in the Breit frame [39]. The
number of events with p(p̄) in the current region of the Breit frame was about 2.5% of the total
number of observed events with p(p̄). In this region, neither d nor d̄ was found. Since the current
region of the Breit frame is analogous to a single hemisphere of e+e−, the observation of d(d̄)
reported in this paper is not in contradiction with the low d rate observed in e+e− [2–4].
9.2. Production ratios
The detector-corrected d/p and d̄/p̄ ratios as a function of pT/M are shown in Fig. 9(a)
and listed in Tables 3 and 4. For the antiparticle ratio, there is a good agreement with the H1
published data for photoproduction [9], as well as with pp data [8]. A similar d̄/p̄ ratio was also
observed in hadronic Υ(1S) and Υ(2S) decays [2].
Table 1
The measured invariant cross sections for the production of p and d in DIS as a function of pT/M. The statistical and
systematic uncertainties are also listed
pT/M       (γp/σtot) d3σp/d(pp/Mp)3 (10^-2)    (γd/σtot) d3σd/d(pd/Md)3 (10^-5)
0.3–0.4    1.33 ± 0.01 +0.19/−0.21             3.29 ± 0.43 +0.50/−1.24
0.4–0.5    1.34 ± 0.01 +0.16/−0.18             1.37 ± 0.26 +0.17/−0.51
0.5–0.6    0.88 ± 0.01 +0.10/−0.12             1.16 ± 0.28 +0.14/−0.42
0.6–0.7    0.38 ± 0.01 +0.04/−0.05             –
Table 2
The measured invariant cross sections for the production of p̄ and d̄ in DIS as a function of pT/M. The statistical and
systematic uncertainties are also listed
pT/M       (γp̄/σtot) d3σp̄/d(pp̄/Mp̄)3 (10^-2)    (γd̄/σtot) d3σd̄/d(pd̄/Md̄)3 (10^-5)
0.3–0.4    1.59 ± 0.01 +0.16/−0.19             0.77 ± 0.15 +0.09/−0.14
0.4–0.5    1.21 ± 0.01 +0.07/−0.09             0.45 ± 0.11 +0.03/−0.07
0.5–0.6    0.86 ± 0.01 +0.05/−0.07             0.60 ± 0.19 +0.05/−0.09
0.6–0.7    0.35 ± 0.01 +0.02/−0.03             –
Fig. 8. The pT/M dependence of the parameter B2 for d and d̄ produced in DIS ep collisions and in photoproduction [9].
The inner error bars show the statistical uncertainties, the outer ones show statistical and systematic uncertainties added
in quadrature. For clarity, the points for particles and antiparticles are slightly shifted horizontally with respect to the
corresponding pT/M.
The d̄/d and p̄/p ratios as a function of pT/M are shown in Fig. 9(b) and listed in Table 5.
The p̄/p ratio is consistent with unity, as expected from hadronisation of quark and gluon jets.
Table 3
The measured d-to-p production ratio and the parameter B2 for d as a function of pT/M. The last row of the table
shows the data in the full measured phase space. The statistical and systematic uncertainties are also listed
pT/M       R(d/p) (10^-3)               B2(d) (10^-2 GeV2)
0.3–0.4    2.48 ± 0.33 +0.55/−1.00      4.11 ± 0.54 +1.47/−1.97
0.4–0.5    1.02 ± 0.19 +0.19/−0.40      1.68 ± 0.32 +0.50/−0.74
0.5–0.6    1.32 ± 0.32 +0.24/−0.51      3.31 ± 0.80 +0.99/−1.45
0.6–0.7    –                            –
0.3–0.7    1.88 ± 0.20 +0.40/−0.75      3.32 ± 0.34 +1.13/−1.55
Table 4
The measured d̄-to-p̄ production ratio and the parameter B2 for d̄ as a function of pT/M. The last row of the table
shows the data in the full measured phase space. The statistical and systematic uncertainties are also listed
pT/M       R(d̄/p̄) (10^-3)               B2(d̄) (10^-2 GeV2)
0.3–0.4    0.48 ± 0.09 +0.08/−0.10      0.67 ± 0.13 +0.18/−0.19
0.4–0.5    0.37 ± 0.09 +0.04/−0.06      0.67 ± 0.17 +0.12/−0.13
0.5–0.6    0.70 ± 0.22 +0.08/−0.12      1.80 ± 0.57 +0.31/−0.36
0.6–0.7    –                            –
0.3–0.7    0.49 ± 0.07 +0.07/−0.09      0.89 ± 0.14 +0.19/−0.20
The dominant uncertainty on the ratio is due to systematic effects associated with the track selection and reconstruction.
The production rate of d is higher than that of d̄, especially at low pT. Under the assumption
that secondary interactions do not produce an enhancement at DCA = 0 for the d case, the result
would indicate that the relation between d̄/d and (p̄/p)^2 expected from the coalescence model
does not hold in the central fragmentation region of ep DIS collisions.

For collisions involving incoming baryon beams, there are several models [40,41] that predict
baryon–antibaryon production asymmetry in the central rapidity region. A p–p̄ asymmetry in
proton-induced reactions is predicted to be as high as 7% [41]. Given the experimental uncertainty, this measurement is not sensitive to the expected small p–p̄ asymmetry.
In heavy-ion collisions, the d̄ to d production ratio is expected to be smaller than unity [42].
A recent measurement at RHIC [11] indicated a lower production rate of d̄ compared to that
of d. The average value of the ratio d̄/d = 0.47 ± 0.03 was compatible with the square of the
p̄/p = 0.73 ± 0.01 ratio. Assuming the same size of the production volume for baryons and
antibaryons, this RHIC result is consistent with the coalescence model. A similar conclusion
was obtained earlier in fixed-target pp [8] and pA [7] experiments. For e+e− collisions, the d̄
yield is compatible with that of d within the large uncertainties [4,5].
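The comparison between d̄/d and (p̄/p)^2 discussed above can be made explicit with a few lines of arithmetic. The sketch uses central values only (the full-range ratios of Table 5 and the RHIC values quoted in the text) and ignores all uncertainties; the function name `coalescence_ratio_check` is hypothetical.

```python
def coalescence_ratio_check(dbar_over_d, pbar_over_p):
    """Return (measured dbar/d, expected (pbar/p)^2, measured - expected)."""
    expected = pbar_over_p ** 2
    return dbar_over_d, expected, dbar_over_d - expected

# Central values only; uncertainties are deliberately ignored here.
zeus = coalescence_ratio_check(0.31, 1.05)   # full range 0.3 < pT/M < 0.7
rhic = coalescence_ratio_check(0.47, 0.73)   # values quoted from [11]
```

For the RHIC values the expectation (0.73)^2 ≈ 0.53 lies close to the measured 0.47, whereas for the present data the expectation (1.05)^2 ≈ 1.10 is far above the measured 0.31, which is the discrepancy described in the text.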
10. Summary
The first observation of d(d̄) in ep collisions in the DIS regime at HERA is presented. The
production rate of d(d̄) is smaller than that for p(p̄) by three orders of magnitude, which is in
broad agreement with other experiments.
Fig. 9. (a) d/p and d̄/p̄ production ratios as a function of pT/M compared to the H1 photoproduction results [9].
(b) The d̄/d and p̄/p production ratios as a function of pT/M. The inner error bars show the statistical uncertainties,
the outer ones show statistical and systematic uncertainties added in quadrature. The points in (a) are slightly shifted
horizontally for clarity.
Table 5
The measured p̄-to-p and d̄-to-d production ratios as a function of pT/M. The last row of the table shows the data in
the full measured phase space. The statistical and systematic uncertainties are also listed
pT/M       R(p̄/p)                       R(d̄/d)
0.3–0.4    1.19 ± 0.01 +0.20/−0.19      0.23 ± 0.05 +0.09/−0.05
0.4–0.5    0.90 ± 0.01 +0.10/−0.09      0.33 ± 0.10 +0.12/−0.07
0.5–0.6    0.97 ± 0.01 +0.11/−0.10      0.52 ± 0.21 +0.19/−0.10
0.6–0.7    0.92 ± 0.03 +0.10/−0.09      –
0.3–0.7    1.05 ± 0.01 +0.15/−0.14      0.31 ± 0.05 +0.11/−0.06
The production of d(d̄) was studied in terms of the coalescence model. The coalescence parameter is in agreement with the measurements in photoproduction at HERA. However, it is larger
than that measured in e+e− annihilation at the Z resonance.
The production rate of p is consistent with that of p̄ in the kinematic range 0.3 <
pT/M < 0.7. Due to significant uncertainties, it is not possible to test models that predict a
small baryon–antibaryon asymmetry in the central fragmentation region.
For the same kinematic region, the production rate of d is higher than that for d̄. If the observed d are solely attributed to deuterons produced in primary ep collisions, the results would
indicate that the coalescence model with the same source volume for d and d̄ cannot fully explain
the production of d(d̄) in DIS.
Acknowledgements
We thank the DESY Directorate for their strong support and encouragement. The remarkable
achievements of the HERA machine group were essential for the successful completion of this
work and are greatly appreciated. We are grateful for the support of the DESY computing and
network services. The design, construction and installation of the ZEUS detector have been made
possible owing to the ingenuity and effort of many people from DESY and home institutes who
are not listed as authors. We thank Prof. D. Heinz and Prof. T. Sloan for the useful discussion of
this topic.
References
[1] T. Massam, et al., Nuovo Cimento 39 (1965) 10.
[2] ARGUS Collaboration, H. Albrecht, et al., Phys. Lett. B 157 (1985) 326;
ARGUS Collaboration, H. Albrecht, et al., Phys. Lett. B 236 (1990) 102.
[3] OPAL Collaboration, R. Akers, et al., Z. Phys. C 67 (1995) 203.
[4] ALEPH Collaboration, S. Schael, et al., Phys. Lett. B 639 (2006) 16.
[5] CLEO Collaboration, D.M. Asner, et al., Phys. Rev. D 75 (2007) 012009.
[6] IHEP-CERN Collaboration, F. Binon, et al., Phys. Lett. B 30 (1969) 510;
Yu.M. Antipov, et al., Phys. Lett. B 34 (1971) 164.
[7] J.W. Cronin, et al., Phys. Rev. D 11 (1975) 3105.
[8] B. Alper, et al., Phys. Lett. B 46 (1973) 265;
British–Scandinavian Collaboration, W.M. Gibson, et al., Nuovo Cimento Lett. 21 (1978) 189;
V.V. Abramov, et al., Sov. J. Nucl. Phys. 45 (1987) 845.
[9] H1 Collaboration, A. Aktas, et al., Eur. Phys. J. C 36 (2004) 413.
[10] M. Aoki, et al., Phys. Rev. Lett. 69 (1992) 2345;
NA52 (NEWMASS) Collaboration, G. Appelquist, et al., Phys. Lett. B 376 (1996) 245;
STAR Collaboration, C. Alper, et al., Phys. Rev. Lett. 87 (2001) 262301;
E802 Collaboration, L. Ahle, et al., Phys. Rev. C 57 (1998) 1416;
NA44 Collaboration, I.G. Bearden, et al., Nucl. Phys. A 661 (1999) 387;
NA44 Collaboration, I.G. Bearden, et al., Eur. Phys. J. C 23 (2002) 237.
[11] PHENIX Collaboration, S.S. Adler, et al., Phys. Rev. Lett. 94 (2005) 122302.
[12] S.T. Butler, C.A. Pearson, Phys. Rev. 129 (1963) 836.
[13] ZEUS Collaboration, in: U. Holm (Ed.), The ZEUS Detector, Status Report (unpublished), DESY, 1993, available
on http://www-zeus.desy.de/bluebook/bluebook.html.
[14] N. Harnew, et al., Nucl. Instrum. Methods A 279 (1989) 290;
B. Foster, et al., Nucl. Phys. B (Proc. Suppl.) 32 (1993) 181;
B. Foster, et al., Nucl. Instrum. Methods A 338 (1994) 254.
[15] ZEUS Collaboration, J. Breitweg, et al., Phys. Lett. B 481 (2000) 213;
ZEUS Collaboration, J. Breitweg, et al., Eur. Phys. J. C 18 (2001) 625;
D. Bartsch, PhD thesis (unpublished), Universität Bonn, Bonn, Germany, 2007.
[16] M. Derrick, et al., Nucl. Instrum. Methods A 309 (1991) 77;
A. Andresen, et al., Nucl. Instrum. Methods A 309 (1991) 101;
A. Caldwell, et al., Nucl. Instrum. Methods A 321 (1992) 356;
A. Bernstein, et al., Nucl. Instrum. Methods A 336 (1993) 23.
[17] A. Bamberger, et al., Nucl. Instrum. Methods A 382 (1996) 419;
S. Magill, S. Chekanov, in: B. Aubert, et al. (Eds.), Proceedings of the IX International Conference on Calorimetry
Annecy, 9–14 October 2000, in: Frascati Physics Series, vol. 21, Annecy, France, 2001, p. 625.
[18] A. Bamberger, et al., Nucl. Instrum. Methods A 401 (1997) 63.
[19] J. Andruszków, et al., Preprint DESY-92-066, DESY, 1992;
ZEUS Collaboration, M. Derrick, et al., Z. Phys. C 63 (1994) 391;
J. Andruszków, et al., Acta Phys. Pol. B 32 (2001) 2025.
[20] L. Lönnblad, Comput. Phys. Commun. 71 (1992) 15.
[21] B. Andersson, et al., Phys. Rep. 97 (1983) 31.
[22] M. Bengtsson, T. Sjöstrand, Comput. Phys. Commun. 46 (1987) 43.
[23] T. Sjöstrand, Comput. Phys. Commun. 82 (1994) 74.
[24] T. Sjöstrand, et al., Comput. Phys. Commun. 135 (2001) 238.
[25] R. Brun, et al., GEANT3, Technical Report CERN-DD/EE/84-1, CERN, 1987.
[26] H. Fesefeldt, The simulation of hadronic showers: Physics and applications (unpublished), PITHA-85-02.
[27] ZEUS Collaboration, S. Chekanov, et al., Phys. Lett. B 591 (2004) 7.
[28] H. Abramowicz, A. Caldwell, R. Sinkus, Nucl. Instrum. Methods A 365 (1995) 508.
[29] F. Jacquet, A. Blondel, in: U. Amaldi (Ed.), Proceedings of the Study for an ep Facility for Europe, Hamburg,
Germany, 1979, p. 391. Also in preprint DESY 79/48.
[30] Particle Data Group, W.-M. Yao, et al., J. Phys. G 33 (2006) 1.
[31] N. Blasi, et al., Nucl. Phys. A 624 (1997) 433.
[32] J. Franz, et al., Nucl. Phys. A 472 (1987) 733.
[33] G.R. Smith, et al., Phys. Rev. C 30 (1984) 593.
[34] K. Kikuchi, Prog. Theor. Phys. 18 (1957) 503.
[35] P.G. Roos, et al., Nucl. Phys. A 255 (1975) 187.
[36] B. Fagerstrom, et al., Phys. Scr. 13 (1976) 10.
[37] A.A. Moiseev, J.F. Ormes, Astropart. Phys. 6 (1997) 379.
[38] ZEUS Collaboration, S. Chekanov, et al., Eur. Phys. J. C 51 (2007) 1.
[39] R.P. Feynman, Photon–Hadron Interactions, Benjamin, New York, 1972;
K.H. Streng, T.F. Walsh, P.M. Zerwas, Z. Phys. C 2 (1979) 237.
[40] G.T. Garvey, B.Z. Kopeliovich, B. Povh, Comments Mod. Phys. A 2 (2001) 47;
S. Chekanov, Eur. Phys. J. C 44 (2005) 367;
F. Bopp, Yu.M. Shabelski, Phys. At. Nucl. 68 (2005) 2093;
F. Bopp, Yu.M. Shabelski, Eur. Phys. J. A 28 (2006) 237.
[41] B. Kopeliovich, B. Povh, Z. Phys. C 75 (1997) 693;
B. Kopeliovich, B. Povh, Phys. Lett. B 446 (1999) 321.
[42] S. Leupold, U.W. Heinz, Phys. Rev. C 50 (1994) 1110.
Nuclear Physics B 786 [PM] (2007) 207–266
Tau functions in combinatorial Bethe ansatz

Atsuo Kuniba a,*, Reiho Sakamoto b, Yasuhiko Yamada c
a Institute of Physics, Graduate School of Arts and Sciences, University of Tokyo, Komaba, Tokyo 153-8902, Japan
b Department of Physics, Graduate School of Science, University of Tokyo, Hongo, Tokyo 113-0033, Japan
c Department of Mathematics, Faculty of Science, Kobe University, Hyogo 657-8501, Japan
Received 18 April 2007; accepted 6 June 2007

Abstract
We introduce ultradiscrete tau functions associated with rigged configurations for A_n^(1). They satisfy an
ultradiscrete version of the Hirota bilinear equation and play a role analogous to a corner transfer matrix
for the box–ball system. As an application, we establish a piecewise linear formula for the Kerov–Kirillov–Reshetikhin
bijection in the combinatorial Bethe ansatz. They also lead to general N-soliton solutions of
the box–ball system.
1. Introduction
The Bethe ansatz and the corner transfer matrix are methods of primary importance in
analysing solvable lattice models [1]. The Bethe ansatz produces eigenvectors of row transfer
matrices from solutions of the Bethe equation [2]. The corner transfer matrix method determines
the one-point function from the one-dimensional sums [1]. See [3–5] and [6,7] for some typical
applications. Interestingly, both of these approaches are known to admit combinatorial versions,
which have brought fruitful insights and applications into representation theory as well [8].
The combinatorial Bethe ansatz was initiated by Kerov, Kirillov and Reshetikhin (KKR)
[9,10]. They invented the object called rigged configuration, which serves as a combinatorial
substitute for the solutions of the Bethe equation. By the KKR bijection, they are in one-to-one
correspondence with the Littlewood–Richardson tableaux, or equivalently, highest paths which
E-mail addresses: atsuo@gokutan.c.u-tokyo.ac.jp (A. Kuniba), reiho@monet.phys.s.u-tokyo.ac.jp (R. Sakamoto), yamaday@math.kobe-u.ac.jp (Y. Yamada).
doi:10.1016/j.nuclphysb.2007.06.007
are the combinatorial analogues of the Bethe eigenvectors. As for the corner transfer matrix
method, decisive progress came with the advent of the crystal base theory [11,12], where the
one-dimensional sums are formulated as generating functions of the energy of affine crystals over
paths.
Guided by a number of relevant results [13–18], these streams have merged into the so-called
X = M conjecture [19,20] for general affine Lie algebra. Here X is the one-dimensional sum
in the corner transfer matrix method. For type $A_n^{(1)}$, it coincides essentially with the
Kostka–Foulkes polynomial [21] for the case treated in [9,10]. On the other hand, M is the fermionic
formula (2.10) in the Bethe ansatz, which is a generating function of the charge function $c(\mu, r)$
(2.9). By now, the X = M conjecture has been studied extensively and solved in several cases
[22–25].
During these developments, it was realized that not only the Bethe ansatz or the corner transfer matrix, but also the solvable lattice models themselves admit decent combinatorial versions.
In fact, vertex models with the quantum group symmetry $U_q(A_n^{(1)})$ turned out to be the soliton
cellular automata at q = 0 [26,27] that had been known as the box–ball systems [28,29]. Row
transfer matrices in the former tend to commuting time evolutions in the latter. The finding has
led to a systematic generalization of such automata [30–32], which possess fascinating features
as ultradiscrete integrable systems [33]. (See the explanation under (5.10) for the ultradiscretization.) Thus it is a natural endeavor to study these automata by the combinatorial versions of the
Bethe ansatz and the corner transfer matrix.
As for the Bethe ansatz, this has been done in [34,35], which yielded the inverse scattering
formalism of the box–ball systems. It turned out that rigged configurations are action-angle variables, which provide the conserved quantities or linearize the commuting time evolutions. The
KKR bijection is the direct/inverse scattering (Gelfand–Levitan) map. In particular, the mysterious combinatorial algorithm in the bijection is identified with a crystal theoretical vertex
operator.
Then what about the corner transfer matrix? This is the issue that we are going to address
in this paper. From a naive point of view, one is tempted to regard the number of balls in a quadrant of the two-dimensional time evolution pattern of the box–ball system as its candidate. We
introduce such a quantity $\rho_i(p)$ (4.1) for a path $p$. On the other hand, the combinatorial analogue
of the corner transfer matrix in the crystal base theory is the energy of affine crystals [12,17],
which is denoted by $E_i(p)$ in (4.12). Our Proposition 4.6 asserts that indeed $\rho_i(p) = E_i(p)$. One of
the main results in this paper is Theorem 6.12, which states $\tau_i(p) = \rho_i(p) = E_i(p)$. Here $\tau_i(p)$
is the piecewise linear function on the rigged configuration $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$
for $p$:

$$\tau_i(p) = \max_{\nu \subseteq \mu} \bigl\{ -c(\nu, s) - |\nu^{(i)}| \bigr\},$$
$$c(\nu, s) = \frac{1}{2} \sum_{a,b} C_{a,b} \min\bigl(\nu^{(a)}, \nu^{(b)}\bigr) - \min\bigl(\lambda, \nu^{(1)}\bigr) + \sum_{a} |s^{(a)}|,$$

where $(C_{ab})_{1 \le a,b \le n}$ is the Cartan matrix of $A_n$ and $\lambda = \mu^{(0)}$ for the full path. $c(\nu, s)$ is the charge function appearing in the
fermionic formula, and the max extends over all the subsets $(\nu^{(a)}, s^{(a)}) \subseteq (\mu^{(a)}, r^{(a)})$ of the
rigged configuration. See (2.19), (2.20), (2.24) and Section 2.1 for a precise account. In short, $\tau_i$
is an ultradiscretization of a single summand in the fermionic formula with respect to the subsets
of the rigged configuration.
An origin of this curious quantity goes back to Sato's theory of soliton equations [36]. In fact,
$\tau_i$ arises as an ultradiscretization of the well known tau function for the KP hierarchy [37] under
a special choice of parameters adapted to the rigged configuration. Using this fact, we show
that $\tau_i$ satisfies an ultradiscrete version of the Hirota bilinear equation, which actually serves
as a characterization of $\tau_i$ up to a boundary condition. We call $\tau_i$ the ultradiscrete tau function.
It serves as an analogue of a corner transfer matrix in the box–ball system and bilinearizes the
dynamics. These features are summarized in Table 1.

Table 1
                          Main combinatorial object   Role in box–ball system   Description of dynamics
  Bethe ansatz            Rigged configuration        Action-angle variable     Linear
  Corner transfer matrix  Energy in affine crystal    Tau function              Bilinear
As the main consequences of Theorem 6.12, we derive a piecewise linear formula for the KKR
bijection (Theorem 2.1), the solution of the initial value problem (Theorem 7.6) and the general
N-soliton solution (7.21), (7.37), (7.42) for the box–ball system. Note that the quantities $\rho_i = E_i$
arise from the corner transfer matrix and crystals, whereas $\tau_i$ is an explicit formula originating
in the Bethe ansatz. Therefore our Theorem 6.12, i.e., $\rho_i = E_i = \tau_i$, provides another connection
of the two methods analogous to the X = M conjecture.
The layout of the paper is as follows. In Section 2, $\tau_i$ is introduced in (2.18)–(2.20) as a
piecewise linear function on rigged configurations. It is actually a member of the family $\tau_i^{(a)}$
(2.22) which obeys the recursion relation (2.23). It reflects the nested structure
$sl_{n+1} \supset sl_n \supset \cdots \supset sl_2$, which will be utilized extensively. The piecewise linear formula for the KKR bijection
is stated in Theorem 2.1.
In Section 3, we give the definition and the basic properties of the box–ball system.
In Section 4, we introduce $\rho_i$ and $E_i$. $\rho_i$ in (4.1) is the number of balls in the SW quadrant in
the time evolution pattern of the box–ball system. $E_i$ is defined by (4.12) and (4.11), which is a
sum of local energy functions in the affine crystal. They are analogues of the corner transfer matrix
[1] in complementary viewpoints; $\rho_i$ originates in the box–ball system and $E_i$ in the crystal base
theory. They are identified in Proposition 4.6.
The piecewise linear formula for the KKR bijection (Theorem 2.1) is a consequence of the
further identification $\tau_i = \rho_i = E_i$ in Theorem 6.12. Sections 5 and 6 are devoted to a proof of
this fact. In Section 5, $\tau_i$ is shown to emerge as an ultradiscretization of the tau functions of the
KP hierarchy (Lemma 5.3) and satisfy the Hirota type bilinear equation (Proposition 5.1). The
key to these results is the special choice of the parameters (5.5)–(5.9). It assures the positivity,
which is vital in the ultradiscretization (Lemma 5.2). The content of this section is a refinement
of the earlier analysis [26].
In Section 6, $\tau_i = \rho_i$ for $A_n^{(1)}$ is proved on the asymptotic states by induction on the rank n
(Proposition 6.1 and its reduction in Proposition 6.4). From the assumption $\tau_i = \rho_i = E_i$ for
$A_{n-1}^{(1)}$, the scattering data is expressed in terms of tau functions (Lemma 6.6). Then we take
advantage of the vertex operator formulation of the KKR bijection [34,35] to make the induction proceed. Combined with the results in Section 5, the agreement on the asymptotic states is
enough to establish the claim $\tau_i = \rho_i$ everywhere.
In Section 7, Theorem 2.1 and Theorem 6.12 are generalized to arbitrary (non-highest) states.
As an application, we present the solution of the initial value problem of the box–ball system in
Theorem 7.6. Our tau functions are parametrized by the conserved quantities that specify solitons. We rewrite them in several forms in (7.21), (7.37) and (7.42). They yield general N-soliton
solutions of the box–ball system. Among others, our ultradiscrete tau functions are most elegantly presented in (7.42) in terms of affine crystals in the principal picture.
Appendix A summarizes the rudiments of the crystal base theory. Appendix B illustrates the
graphical rule [17] for obtaining the combinatorial R, the winding and the non-winding numbers
relevant to the energy function. Appendix C recalls the combinatorial algorithm for the KKR
bijection. Appendix D is the crystal theoretical reformulation of the KKR map due to [34,35].
Appendix E is an exposition of the inverse scattering formalism of the box–ball system which
supplements Section 3.
2. Ultradiscrete tau function
2.1. Preliminary
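We summarize the basic notation used throughout the paper. For a multiset $\lambda = (\lambda_1, \ldots, \lambda_k)$,
we use the symbols
$$|\lambda| = \lambda_1 + \cdots + \lambda_k, \qquad \ell(\lambda) = k, \tag{2.1}$$
$$\lambda_{[N]} = (\lambda_1, \ldots, \lambda_N) \quad (0 \le N \le k), \tag{2.2}$$
where $\lambda_{[0]} = \emptyset$. Given two multisets $\lambda = (\lambda_1, \ldots, \lambda_k)$ and $\mu = (\mu_1, \ldots, \mu_m)$, we use the notation:
$$\min(\lambda, \mu) = \sum_{i=1}^{k} \sum_{j=1}^{m} \min(\lambda_i, \mu_j), \tag{2.3}$$
$$\lambda \subseteq \mu \;\overset{\mathrm{def}}{\Longleftrightarrow}\; \{\lambda_1, \ldots, \lambda_k\} \subseteq \{\mu_1, \ldots, \mu_m\}, \tag{2.4}$$
where $\subseteq$ accounts for the multiplicity as well. For example, $\emptyset, (1, 1), (1, 3, 1) \subseteq (1, 2, 1, 3)$ but
$(2, 2) \not\subseteq (1, 2, 1, 3)$.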
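The notation (2.1)–(2.4) can be spelled out in a few lines of Python. This is only an illustrative sketch: multisets are represented as tuples, and the function names `size`, `truncate`, `min_pair` and `is_submultiset` are hypothetical, not taken from the paper.

```python
from collections import Counter

def size(lam):
    """|lambda| in (2.1): the sum of the components."""
    return sum(lam)

def truncate(lam, N):
    """lambda_[N] in (2.2): the first N components."""
    return tuple(lam[:N])

def min_pair(lam, mu):
    """min(lambda, mu) in (2.3): sum of min(lambda_i, mu_j) over all pairs."""
    return sum(min(a, b) for a in lam for b in mu)

def is_submultiset(lam, mu):
    """lambda ⊆ mu in (2.4), taking multiplicities into account."""
    need, have = Counter(lam), Counter(mu)
    return all(have[v] >= c for v, c in need.items())
```

With these helpers, the examples under (2.4) read `is_submultiset((1, 3, 1), (1, 2, 1, 3))` (true) and `is_submultiset((2, 2), (1, 2, 1, 3))` (false).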
2.2. Rigged configurations
Consider the data of the form
$$\bigl(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)})\bigr), \tag{2.5}$$
where $\mu^{(a)} = (\mu^{(a)}_1, \ldots, \mu^{(a)}_{l_a}) \in (\mathbb{Z}_{\ge 1})^{l_a}$ and $r^{(a)} = (r^{(a)}_1, \ldots, r^{(a)}_{l_a}) \in (\mathbb{Z}_{\ge 0})^{l_a}$ for some $l_a \ge 0$.
Apart from $\mu^{(0)}$, each $(\mu^{(a)}, r^{(a)})$ is to be understood as a multiset of the pairs
$(\mu^{(a)}_1, r^{(a)}_1), \ldots, (\mu^{(a)}_{l_a}, r^{(a)}_{l_a})$ whose ordering does not matter. The data (2.5) is called a rigged configuration for
$A_n^{(1)}$ if
$$0 \le r^{(a)}_i \le p^{(a)}_{\mu^{(a)}_i} \quad \text{for any pair } (\mu^{(a)}_i, r^{(a)}_i). \tag{2.6}$$
Here $p^{(a)}_j$ is called the vacancy number and defined by
$$p^{(a)}_j = E^{(a-1)}_j - 2E^{(a)}_j + E^{(a+1)}_j \quad (1 \le a \le n), \tag{2.7}$$
$$E^{(a)}_j = \sum_{k=1}^{l_a} \min\bigl(j, \mu^{(a)}_k\bigr) \quad (0 \le a \le n), \qquad E^{(n+1)}_j = 0. \tag{2.8}$$
The array $(\mu^{(0)}, \ldots, \mu^{(n)})$ is called a configuration and the nonnegative integers $r^{(a)}_i$ are called
rigging. Note that $p^{(a)}_j$ and $E^{(a)}_j$ depend only on the configuration. In particular $E^{(a)}_\infty = |\mu^{(a)}|$. It
is customary to arrange $\mu^{(a)}$ as $\mu^{(a)}_1 \ge \cdots \ge \mu^{(a)}_{l_a}$ and regard the rigged configuration
as an n-tuple of Young diagrams $\mu^{(1)}, \ldots, \mu^{(n)}$ where the row of length $\mu^{(a)}_i$ is assigned with
the rigging $r^{(a)}_i$ subject to the condition (2.6). In this convention, we identify all the diagrams
obtained by reordering the rows of equal length with different rigging. In what follows we do not
assume $\mu^{(a)}_1 \ge \cdots \ge \mu^{(a)}_{l_a}$ unless explicitly mentioned.
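As a quick check of the definitions (2.6)–(2.8), the following Python sketch computes the vacancy numbers and tests the rigging condition. The data layout (a list `mu = [mu0, mu1, ..., mun]` of tuples and a matching list `r` with a dummy entry at index 0) and the function names are assumptions made for illustration; the sample data is the rigged configuration quoted in Example 2.4.

```python
def E(mu_a, j):
    """E_j^{(a)} in (2.8) for a single partition mu^{(a)}."""
    return sum(min(j, part) for part in mu_a)

def vacancy(mu, a, j):
    """Vacancy number p_j^{(a)} in (2.7); E_j^{(n+1)} is taken to be 0."""
    n = len(mu) - 1
    upper = E(mu[a + 1], j) if a + 1 <= n else 0
    return E(mu[a - 1], j) - 2 * E(mu[a], j) + upper

def is_rigged_configuration(mu, r):
    """Check condition (2.6) for every pair (mu_i^{(a)}, r_i^{(a)})."""
    n = len(mu) - 1
    return all(0 <= rig <= vacancy(mu, a, part)
               for a in range(1, n + 1)
               for part, rig in zip(mu[a], r[a]))

# Data of Example 2.4: mu^{(0)} = (1^14), n = 3.
MU = [(1,) * 14, (4, 3, 2), (3, 1), (1,)]
R = [None, (0, 2, 3), (1, 0), (0,)]
```

For this data the vacancy numbers of the rows of $\mu^{(1)}$ come out as 0, 2, 5, so the riggings 0, 2, 3 satisfy (2.6).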
For a multiset $\lambda$ with positive components, let $\mathrm{RC}(\lambda)$ denote the set of rigged configurations
(2.5) with $\mu^{(0)} = \lambda$. Set
$$c(\mu, r) = \frac{1}{2} \sum_{a,b} C_{ab} \min\bigl(\mu^{(a)}, \mu^{(b)}\bigr) - \min\bigl(\mu^{(0)}, \mu^{(1)}\bigr) + \sum_{a} |r^{(a)}|, \tag{2.9}$$
where $(C_{ab})_{1 \le a,b \le n}$ is the Cartan matrix of $A_n$. The fermionic formula [9,10] is obtained as the
generating function:
$$M(\lambda) = \sum q^{c(\mu, r)}, \tag{2.10}$$
where the sum extends over all the rigged configurations $(\lambda, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)})) \in \mathrm{RC}(\lambda)$
with prescribed values for $|\mu^{(1)}|, \ldots, |\mu^{(n)}|$. The sum (2.10) is arranged as
$M(\lambda) = \sum_{\mu} q^{c(\mu, 0)} \sum_{r} q^{\sum_{a,i} r^{(a)}_i}$, where the sum over the rigging $r$ under the condition (2.6) yields a
product of q-binomial coefficients as is well known.
2.3. Crystals
We recapitulate basic facts on the $A_n^{(1)}$ crystal $B_l$. For a general background see Appendix A.
$B_l$ is the crystal base of the l-fold symmetric tensor representation. As a set it is given by
$$B_l = \bigl\{ x = (x_1, \ldots, x_{n+1}) \in (\mathbb{Z}_{\ge 0})^{n+1} \mid x_1 + \cdots + x_{n+1} = l \bigr\}. \tag{2.11}$$
The Kashiwara operators act as $\tilde{e}_i(x) = x'$, $\tilde{f}_i(x) = x''$ with $x'_j = x_j + \delta_{i,j} - \delta_{i,j+1}$ and
$x''_j = x_j - \delta_{i,j} + \delta_{i,j+1}$. Here indices are in $\mathbb{Z}_{n+1}$ and $x'$ and $x''$ are to be understood as 0 unless
they belong to $(\mathbb{Z}_{\ge 0})^{n+1}$. The combinatorial R: $\mathrm{Aff}(B_l) \otimes \mathrm{Aff}(B_m) \to \mathrm{Aff}(B_m) \otimes \mathrm{Aff}(B_l)$ has
the form $R: x[d] \otimes y[e] \mapsto \tilde{y}[e - H(x \otimes y)] \otimes \tilde{x}[d + H(x \otimes y)]$, which is described by the
piecewise linear formula [26,38]:
$$\tilde{x}_i = x_i + Q_i(x \otimes y) - Q_{i-1}(x \otimes y), \qquad \tilde{y}_i = y_i + Q_{i-1}(x \otimes y) - Q_i(x \otimes y), \tag{2.12}$$
$$Q_i(x \otimes y) = \min\Bigl\{ \sum_{j=1}^{k-1} x_{i+j} + \sum_{j=k+1}^{n+1} y_{i+j} \;\Big|\; 1 \le k \le n+1 \Bigr\}, \tag{2.13}$$
$$H(x \otimes y) = \min(l, m) - Q_0(x \otimes y). \tag{2.14}$$
The energy function H here is normalized so that $0 \le H \le \min(l, m)$ and coincides with the
winding number [17]. In general $\min(l, m) - Q_i$ is the ith winding number that counts the
lines crossing $x_i$ and $x_{i+1}$ (Appendix B).
The element $x = (x_1, \ldots, x_{n+1})$ is also denoted by a row shape semistandard tableau of length
l containing the letter i $x_i$ times and $x[d] \in \mathrm{Aff}(B_l)$ by the tableau with index d. For example in
$A_3^{(1)}$, the following stand for the same relation under R:
$$(1, 2, 0, 1)[5] \otimes (1, 0, 1, 0)[9] \simeq (0, 1, 0, 1)[8] \otimes (2, 1, 1, 0)[6],$$
$$1224_{\,5} \otimes 13_{\,9} \simeq 24_{\,8} \otimes 1123_{\,6}. \tag{2.15}$$
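The piecewise linear formula (2.12)–(2.14) translates directly into code. The following Python sketch is illustrative only: elements of $B_l$ are tuples $(x_1, \ldots, x_{n+1})$, and the names `Q` and `combinatorial_R` are hypothetical. It reproduces the relation (2.15).

```python
def Q(i, x, y):
    """i-th non-winding number Q_i(x ⊗ y) of (2.13); indices live in Z_{n+1}."""
    N = len(x)                                          # N = n + 1
    xs = [x[(i + j - 1) % N] for j in range(1, N + 1)]  # x_{i+1}, ..., x_{i+N}
    ys = [y[(i + j - 1) % N] for j in range(1, N + 1)]  # y_{i+1}, ..., y_{i+N}
    return min(sum(xs[:k - 1]) + sum(ys[k:]) for k in range(1, N + 1))

def combinatorial_R(x, d, y, e):
    """Map x[d] ⊗ y[e] to (y~, e - H), (x~, d + H) following (2.12) and (2.14)."""
    N = len(x)
    q = [Q(i, x, y) for i in range(N)]                  # Q_0, ..., Q_n; Q_{n+1} = Q_0
    x_new = tuple(x[t] + q[(t + 1) % N] - q[t] for t in range(N))
    y_new = tuple(y[t] + q[t] - q[(t + 1) % N] for t in range(N))
    H = min(sum(x), sum(y)) - q[0]
    return (y_new, e - H), (x_new, d + H)
```

For the input of (2.15), `combinatorial_R((1, 2, 0, 1), 5, (1, 0, 1, 0), 9)` gives `((0, 1, 0, 1), 8)` and `((2, 1, 1, 0), 6)`, with $H = 1$.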
To save the space we use the notation:
$$a^l = a \cdots a \in B_l, \qquad u_l = 1^l = 1 \cdots 1 \in B_l.$$
Setting
$$B_l^{\ge a+1} = \bigl\{ (x_1, \ldots, x_{n+1}) \in B_l \mid x_1 = \cdots = x_a = 0 \bigr\} \quad (0 \le a \le n), \tag{2.16}$$
we have
$$B_l = B_l^{\ge 1} \supset B_l^{\ge 2} \supset \cdots \supset B_l^{\ge n+1} = \bigl\{ (n+1)^l \bigr\} \tag{2.17}$$
as sets. We will need to consider the crystals not only for $A_n^{(1)}$ but also for the nested family
$A_0^{(1)}, A_1^{(1)}, \ldots, A_{n-1}^{(1)}$. In such a circumstance we realize the crystal $B_l$ for $A_{n-a}^{(1)}$ ($0 \le a \le n$) on
the set $B_l^{\ge a+1}$ with the Kashiwara operators $\tilde{e}_i, \tilde{f}_i$ ($a \le i \le n$). In this convention the highest
element with respect to $A_{n-a}$ is $(a + 1)^l \in B_l^{\ge a+1}$.
Let
$$P_+(\lambda) = \bigl\{ p \in B_{\lambda_1} \otimes \cdots \otimes B_{\lambda_L} \mid \tilde{e}_i p = 0,\ 1 \le i \le n \bigr\}$$
be the set of highest elements (paths) with respect to $A_n$. The bijection [9,10] between
$\mathrm{RC}(\lambda)$ and the Littlewood–Richardson tableaux is translated to the one between $\mathrm{RC}(\lambda)$ and
$P_+(\lambda)$. We call the resulting map the KKR bijection. See Appendix C for an exposition of
the algorithm and Appendix D for the recent reformulation as the crystal theoretical vertex operator [34,35]. In particular, there is a nested structure with respect to the rank in
the sense that if $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ is a rigged configuration for $A_n^{(1)}$, so is
$(\mu^{(a)}, (\mu^{(a+1)}, r^{(a+1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ for $A_{n-a}^{(1)}$. Moreover, the KKR bijection sends the latter
to a highest path in $B^{\ge a+1}_{\mu^{(a)}_1} \otimes \cdots \otimes B^{\ge a+1}_{\mu^{(a)}_{l_a}}$.
2.4. Piecewise linear formula for KKR bijection

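We use the notation defined in Section 2.1. Given a rigged configuration $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}),
\ldots, (\mu^{(n)}, r^{(n)}))$, we introduce the ultradiscrete tau functions $\tau_0(\lambda), \tau_1(\lambda), \ldots, \tau_{n+1}(\lambda)$ for
$\lambda \subseteq \mu^{(0)}$ as follows:
$$\tau_0(\lambda) = \tau_{n+1}(\lambda) - |\lambda|, \tag{2.18}$$
$$\tau_d(\lambda) = \max_{\nu \subseteq \mu} \bigl\{ -c(\nu, s) - |\nu^{(d)}| \bigr\} \quad \bigl(1 \le d \le n + 1,\ |\nu^{(n+1)}| = 0\bigr), \tag{2.19}$$
$$-c(\nu, s) = \min\bigl(\lambda, \nu^{(1)}\bigr) + \min\bigl(\nu^{(1)}, \nu^{(2)}\bigr) + \cdots + \min\bigl(\nu^{(n-1)}, \nu^{(n)}\bigr)
 - \min\bigl(\nu^{(1)}, \nu^{(1)}\bigr) - \min\bigl(\nu^{(2)}, \nu^{(2)}\bigr) - \cdots - \min\bigl(\nu^{(n)}, \nu^{(n)}\bigr)
 - |s^{(1)}| - \cdots - |s^{(n)}|. \tag{2.20}$$
In (2.19), max is taken over $\nu = (\nu^{(1)}, \ldots, \nu^{(n)})$, where the components are independently
chosen under the condition $\nu^{(1)} \subseteq \mu^{(1)}, \ldots, \nu^{(n)} \subseteq \mu^{(n)}$. The array $s = (s^{(1)}, \ldots, s^{(n)})$ denotes the set of the riggings $s^{(1)} \subseteq r^{(1)}, \ldots, s^{(n)} \subseteq r^{(n)}$ that are paired with the chosen
$\nu^{(1)}, \ldots, \nu^{(n)}$ as $\{(\nu^{(a)}_i, s^{(a)}_i)\} \subseteq \{(\mu^{(a)}_i, r^{(a)}_i)\}$. The quantity $c(\nu, s)$ in (2.20) is obtained
from $c(\mu, r)$ (2.9) by replacing $(\mu, r) = (\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ with $(\nu, s) =
(\lambda, (\nu^{(1)}, s^{(1)}), \ldots, (\nu^{(n)}, s^{(n)}))$. Apart from $\nu^{(a)} \subseteq \mu^{(a)}$, there is no further constraint on
$|\nu^{(1)}|, \ldots, |\nu^{(n)}|$ and it is not required that the data $(\lambda, (\nu^{(1)}, s^{(1)}), \ldots, (\nu^{(n)}, s^{(n)}))$ be a rigged
configuration for $A_n^{(1)}$. Since the max (2.19) includes the trivial case $\nu^{(a)} = \emptyset$, the quantities
$\tau_1(\lambda), \ldots, \tau_{n+1}(\lambda)$ are nonnegative integers. Note that $\tau_{n+1}(\mu^{(0)}) = \max\{-c(\nu, s)\}$ in (2.19)
may be viewed as an ultradiscretization of the single summand $q^{c(\mu, r)}$ in the fermionic formula (2.10) with respect to the subsets $(\nu, s) \subseteq (\mu, r)$. See also (5.11).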
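Since the maximum in (2.19) runs over finitely many sub-multisets, the tau functions can be evaluated by brute force for small examples. The sketch below is a naive enumeration written for illustration, not the method of the paper; the names `tau`, `tau0` and `sub_multisets` are hypothetical, and `mu`, `r` hold $(\mu^{(1)}, \ldots, \mu^{(n)})$ and $(r^{(1)}, \ldots, r^{(n)})$ as lists of tuples. The sample data is the rigged configuration of Example 2.4.

```python
from itertools import combinations, product

def min_pair(lam, mu):
    """min(lambda, mu) of (2.3)."""
    return sum(min(a, b) for a in lam for b in mu)

def sub_multisets(parts, rigs):
    """Yield (nu, |s|) for every choice of rows together with their riggings."""
    pairs = list(zip(parts, rigs))
    for size in range(len(pairs) + 1):
        for chosen in combinations(pairs, size):
            yield tuple(p for p, _ in chosen), sum(s for _, s in chosen)

def tau(d, lam, mu, r):
    """tau_d(lambda) of (2.19)-(2.20) for 1 <= d <= n+1, by exhaustive search."""
    n = len(mu)
    choices = [list(sub_multisets(mu[a], r[a])) for a in range(n)]
    best = None
    for pick in product(*choices):
        nus = [tuple(lam)] + [nu for nu, _ in pick]     # nus[a] = nu^{(a)}, nus[0] = lambda
        value = sum(min_pair(nus[a], nus[a + 1]) for a in range(n))
        value -= sum(min_pair(nu, nu) + s for nu, s in pick)
        if d <= n:                                      # |nu^{(n+1)}| = 0 by convention
            value -= sum(nus[d])
        best = value if best is None else max(best, value)
    return best

def tau0(lam, mu, r):
    """tau_0(lambda) of (2.18)."""
    return tau(len(mu) + 1, lam, mu, r) - sum(lam)

MU = [(4, 3, 2), (3, 1), (1,)]
R = [(0, 2, 3), (1, 0), (0,)]
```

For $\lambda = (1^{14})$ this gives $(\tau_1, \ldots, \tau_4) = (10, 15, 18, 19)$, in agreement with the last column of the table in Example 2.4. The search is exponential in the number of rows, so it is only meant for checking small cases.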
Theorem 2.1. Let the image of the rigged configuration $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ under
the KKR bijection be the highest path $p_1 \otimes \cdots \otimes p_L \in P_+(\mu^{(0)})$. Then $p_k = (x_1, \ldots, x_{n+1}) \in
B_{\mu^{(0)}_k}$ is expressed as
$$x_d = \tau_{k,d} - \tau_{k-1,d} - \tau_{k,d-1} + \tau_{k-1,d-1}, \tag{2.21}$$
where $\tau_{k,d} = \tau_d\bigl((\mu^{(0)}_1, \ldots, \mu^{(0)}_k)\bigr)$.

Note that (2.18) ensures $x_1 + \cdots + x_{n+1} = \mu^{(0)}_k$.
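Formula (2.21) is a second difference of the tau table, which is easy to check numerically. The following Python sketch takes the values $\tau_{k,1}, \ldots, \tau_{k,4}$ tabulated in Example 2.4 (with the row $k = 0$ set to zero, as in Lemma 2.3) and recovers the path. The function name `path_from_tau` and the data layout are illustrative assumptions.

```python
# Rows (tau_{k,1}, ..., tau_{k,4}) for k = 0, ..., 14, copied from Example 2.4.
TAU = [
    (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0),
    (0, 1, 1, 1), (0, 2, 2, 2), (0, 3, 3, 3), (1, 4, 4, 4), (2, 5, 6, 6),
    (3, 7, 8, 8), (4, 9, 10, 10), (6, 11, 12, 13), (8, 13, 15, 16),
    (10, 15, 18, 19),
]
LAM = (1,) * 14          # mu^{(0)} = (1^14)

def path_from_tau(table, lam):
    """Recover p_k = (x_1, ..., x_{n+1}) from (2.21), using (2.18) for tau_{k,0}."""
    full = [(row[-1] - sum(lam[:k]),) + tuple(row) for k, row in enumerate(table)]
    width = len(table[0])
    return [tuple(full[k][d] - full[k - 1][d] - full[k][d - 1] + full[k - 1][d - 1]
                  for d in range(1, width + 1))
            for k in range(1, len(table))]

def as_word(path):
    """Write each box as its tableau letters, e.g. (0, 1, 0, 0) -> '2'."""
    return "".join(str(i + 1) * x for box in path for i, x in enumerate(box))
```

Here `as_word(path_from_tau(TAU, LAM))` returns `'11112221322433'`, the highest path of Example 2.4.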

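Due to the nested structure of the KKR bijection with respect to the rank [34], Theorem 2.1
is also stated as a family of relations corresponding to $sl_{n+1} \supset sl_n \supset \cdots \supset sl_2$. To do so, we
introduce the family of ultradiscrete tau functions $\{\tau^{(a)}_d(\lambda) \mid 0 \le a \le n - 1,\ a \le d \le n + 1,\
\lambda \subseteq \mu^{(a)}\}$ by $\tau^{(a)}_a(\lambda) = \tau^{(a)}_{n+1}(\lambda) - |\lambda|$ and
$$\tau^{(a)}_d(\lambda) = \max \bigl\{ \min\bigl(\lambda, \nu^{(a+1)}\bigr) + \min\bigl(\nu^{(a+1)}, \nu^{(a+2)}\bigr) + \cdots + \min\bigl(\nu^{(n-1)}, \nu^{(n)}\bigr)
 - \min\bigl(\nu^{(a+1)}, \nu^{(a+1)}\bigr) - \min\bigl(\nu^{(a+2)}, \nu^{(a+2)}\bigr) - \cdots - \min\bigl(\nu^{(n)}, \nu^{(n)}\bigr)
 - |s^{(a+1)}| - |s^{(a+2)}| - \cdots - |s^{(n)}| - |\nu^{(d)}| \bigr\} \quad (a + 1 \le d \le n + 1), \tag{2.22}$$
where $|\nu^{(n+1)}| = 0$ as before. The max is taken over the independent choices $\nu^{(a+1)} \subseteq
\mu^{(a+1)}, \ldots, \nu^{(n)} \subseteq \mu^{(n)}$. The subsets of the riggings $s^{(a+1)} \subseteq r^{(a+1)}, \ldots, s^{(n)} \subseteq r^{(n)}$ are those
paired with the chosen $\nu^{(a+1)}, \ldots, \nu^{(n)}$ as before. The previously introduced tau function $\tau_d(\lambda)$
(2.19) is equal to $\tau^{(0)}_d(\lambda)$. Now Theorem 2.1 is rephrased as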
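Theorem 2.2. Given a rigged configuration $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ and $0 \le a \le
n - 1$, let the image of $(\mu^{(a)}, (\mu^{(a+1)}, r^{(a+1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ under the KKR bijection be the
$A_{n-a}$ highest path $p_1 \otimes \cdots \otimes p_{l_a} \in B^{\ge a+1}_{\mu^{(a)}_1} \otimes \cdots \otimes B^{\ge a+1}_{\mu^{(a)}_{l_a}}$. Then $p_k = (x_{a+1}, x_{a+2}, \ldots, x_{n+1})$
is expressed as
$$x_d = \tau^{(a)}_{k,d} - \tau^{(a)}_{k-1,d} - \tau^{(a)}_{k,d-1} + \tau^{(a)}_{k-1,d-1},$$
where $\tau^{(a)}_{k,d} = \tau^{(a)}_d\bigl((\mu^{(a)}_1, \ldots, \mu^{(a)}_k)\bigr)$.

Again, $x_{a+1} + \cdots + x_{n+1} = \mu^{(a)}_k$ is evident by the construction. For a proof of Theorem 2.1, see
Section 4.4.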
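The tau functions (2.22) are the solution of the recursion relation with respect to the rank:
$$\tau^{(a)}_d(\lambda) = \max_{\nu \subseteq \mu^{(a+1)}} \bigl\{ \min(\lambda, \nu) - \min(\nu, \nu) - |s| + \tau^{(a+1)}_d(\nu) \bigr\} \tag{2.23}$$
for $0 \le a \le n - 1$, $a + 1 \le d \le n + 1$ with the convention $\tau^{(a)}_a(\lambda) = \tau^{(a)}_{n+1}(\lambda) - |\lambda|$ and the initial
condition $\tau^{(n)}_{n+1}(\lambda) = 0$, $\tau^{(n)}_n(\lambda) = -|\lambda|$. The rigging $s$ is the subset of $r^{(a+1)}$ paired with the
chosen $\nu$.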
Lemma 2.3. $\tau^{(a)}_d(\emptyset) = 0$ for any $0 \le a \le n - 1$ and $a + 1 \le d \le n + 1$.

Proof. It suffices to prove the a = 0 case. When $\lambda = \emptyset$, (2.19) becomes
$\tau_d(\emptyset) = -\min \{ c(\nu) + |s^{(1)}| + \cdots + |s^{(n)}| + |\nu^{(d)}| \}$, where $c(\nu)$ is given by (see (2.9))
$$c(\nu) = \frac{1}{2} \sum_{a,b} C_{a,b} \min\bigl(\nu^{(a)}, \nu^{(b)}\bigr) = \frac{1}{2} \sum_{a,b} C_{a,b} \sum_{i,j} \min(i, j)\, m^{(a)}_i m^{(b)}_j,$$
where $m^{(a)}_i$ is the number of k such that $\nu^{(a)}_k = i$. This is a positive definite quadratic form whose
minimum is 0 at $m^{(a)}_j = 0$. The other part $|s^{(1)}| + \cdots + |s^{(n)}| + |\nu^{(d)}|$ appearing in $\tau_d(\emptyset)$ also
attains the minimum 0 simultaneously at this point. □
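Let the image of the rigged configuration $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ under the KKR
bijection be the highest path $p_1 \otimes \cdots \otimes p_L \in P_+(\mu^{(0)}) \subseteq B_{\mu^{(0)}_1} \otimes \cdots \otimes B_{\mu^{(0)}_L}$. In what follows
we will also write
$$\tau_i(\lambda) = \tau_{k,i} = \tau_i(p_1 \otimes \cdots \otimes p_k) \quad \text{for } \lambda = \mu^{(0)}_{[k]} = \bigl(\mu^{(0)}_1, \ldots, \mu^{(0)}_k\bigr) \quad (1 \le k \le L). \tag{2.24}$$
Concerning the notation $\tau_i(p_1 \otimes \cdots \otimes p_k)$, a remark is in order. Any highest path $p_1 \otimes \cdots \otimes p_k$
can be extended to a longer one $p_1 \otimes \cdots \otimes p_k \otimes p_{k+1} \otimes \cdots \otimes p_L$ in which $p_{k+1} \otimes \cdots \otimes p_L$
is not unique. Suppose that $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ and $(\mu'^{(0)}, (\mu'^{(1)}, r'^{(1)}), \ldots,
(\mu'^{(n)}, r'^{(n)}))$ are two rigged configurations corresponding to such extensions of $p_1 \otimes \cdots \otimes p_k$,
and let $\tau_i(p_1 \otimes \cdots \otimes p_k)$ and $\tau'_i(p_1 \otimes \cdots \otimes p_k)$ be the associated tau functions in the sense of
(2.24). Then $\tau_i(p_1 \otimes \cdots \otimes p_k) = \tau'_i(p_1 \otimes \cdots \otimes p_k)$ will be guaranteed by Theorem 4.9. Note
however that they are different as the piecewise linear expressions as in (2.18)–(2.20). For this
reason, we will always mention the rigged configurations relevant to $\tau_i(p_1 \otimes \cdots \otimes p_k)$.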
Example 2.4. Consider the highest path $p = 11112221322433 \in B_1^{\otimes L}$ of length L = 14, where
we have omitted the symbol $\otimes$. The corresponding rigged configuration is depicted in Example C.2. Thus we set
$$\mu^{(0)} = (1^{14}), \quad \mu^{(1)} = (4, 3, 2), \quad \mu^{(2)} = (3, 1), \quad \mu^{(3)} = (1),$$
$$r^{(1)} = (0, 2, 3), \quad r^{(2)} = (1, 0), \quad r^{(3)} = (0).$$
The associated tau function $\tau_{k,i}$ takes the following values.

  k             1   2   3   4   5   6   7   8   9  10  11  12  13  14
  $\tau_{k,1}$  0   0   0   0   0   0   0   1   2   3   4   6   8  10
  $\tau_{k,2}$  0   0   0   0   1   2   3   4   5   7   9  11  13  15
  $\tau_{k,3}$  0   0   0   0   1   2   3   4   6   8  10  12  15  18
  $\tau_{k,4}$  0   0   0   0   1   2   3   4   6   8  10  13  16  19

The choices of the subsets $\nu = (\nu^{(1)}, \nu^{(2)}, \nu^{(3)})$ that attain these values for $\tau_{k,4} = \max\{\cdots\}$ in
(2.19) are as follows:

  k       1, 2, 3   4      5, 6, 7   8      9, 10   11     12, 13, 14
  $\nu$   A         A, B   B         B, C   C       C, D   D

Here A, B, C, D $= (\nu^{(1)}, \nu^{(2)}, \nu^{(3)})$ are given by
  A = $(\emptyset, \emptyset, \emptyset)$,
  B = $((4), \emptyset, \emptyset)$, $((4), (1), \emptyset)$, $((4), (1), (1))$,
  C = $((4, 2), (3), \emptyset)$, $((4, 2), (1), \emptyset)$, $((4, 2), (3), (1))$, $((4, 2), (1), (1))$, $((4, 2), (3, 1), (1))$,
  D = $((4, 3, 2), (3, 1), (1)) = (\mu^{(1)}, \mu^{(2)}, \mu^{(3)})$.
The case k = 0 enforces the choice $\nu^{(a)} = \emptyset$ in agreement with Lemma 2.3. In the other extreme
case k = L, the full choice $\nu = \mu$ is the consequence of the general result in Remark 6.14. In
general the maximum attaining $\nu$ for $\tau_i(\lambda) = \max\{\cdots\}$ gradually grows with $\lambda$. The above p
will be investigated further in Examples E.1 and E.4.
3. Box–ball system
3.1. Conventional formulation
Consider the tensor product $B_{\lambda_1} \otimes B_{\lambda_2} \otimes \cdots \otimes B_{\lambda_L}$. Its elements are called states. We regard
each component $(x_1, \ldots, x_{n+1}) \in B_l$ as a capacity l box containing $x_i$ balls with color i for $2 \le
i \le n + 1$. On the other hand the letter 1 is to be interpreted as a vacancy. Thus $x_1$ represents the
empty space in the box. A state represents an array of boxes with capacity $\lambda_1, \ldots, \lambda_L$ containing
balls of colors 2, 3, ..., n + 1.
We define the time evolution $T_l(p) = p'_1 \otimes \cdots \otimes p'_L$ of a state $p = p_1 \otimes \cdots \otimes p_L$ by
$$u_l[0] \otimes p_1[0] \otimes \cdots \otimes p_L[0] \simeq p'_1[-d_1] \otimes \cdots \otimes p'_L[-d_L] \otimes v_l[d_1 + \cdots + d_L] \tag{3.1}$$
under the isomorphism $\mathrm{Aff}(B_l) \otimes (\mathrm{Aff}(B_{\lambda_1}) \otimes \cdots \otimes \mathrm{Aff}(B_{\lambda_L})) \simeq (\mathrm{Aff}(B_{\lambda_1}) \otimes \cdots \otimes \mathrm{Aff}(B_{\lambda_L})) \otimes
\mathrm{Aff}(B_l)$. Here $v_l \in B_l$ and $d_i$ are uniquely determined by (2.12)–(2.14). We set
$$E_l(p) = e_1 + \cdots + e_L, \qquad e_j = \min(\lambda_j, l) - d_j, \tag{3.2}$$
which has the property $E_l(p \otimes u_k) = E_l(p)$ for any k and l.

It is known [26,27,31] that $T_l$ is weight preserving, the commutativity $T_l T_k = T_k T_l$ is valid and
$E_l(p)$ is a conserved quantity, i.e., $E_l(T_k(p)) = E_l(p)$ for any k and l, provided that $p_j = u_{\lambda_j}$ for
$L' \le j \le L$ with sufficiently large $L - L'$. The proof of these facts is based on the Yang–Baxter
equation of the combinatorial R (Proposition A.1) and the property:
$$v_l = u_l \quad \text{if } p_j = u_{\lambda_j} \text{ for } L' \le j \le L \text{ with sufficiently large } L - L'. \tag{3.3}$$
$T_l$ stabilizes for $l \gg 1$, which will be denoted by $T_\infty$.
Since each $d_j$ is the winding number (2.14), $E_l(p)$ is the sum of the non-winding numbers $e_j$.
In particular for $l = \infty$, $e_j$ is equal to the number of balls $x_2 + \cdots + x_{n+1}$ in the jth box $p_j =
(x_1, x_2, \ldots, x_{n+1}) \in B_{\lambda_j}$. Therefore we find
$$E_\infty(p) = \text{number of balls contained in } p. \tag{3.4}$$
In the terminology of solvable lattice models, $E_l$ is the energy associated with a row transfer
matrix. It should not be confused with another energy $E_i$ (4.12) relevant to the corner transfer
matrix. Their relation is given in Proposition 4.8. The conserved quantity $E_l$ will be evaluated
explicitly for highest states in Proposition 6.15 and for general states in Proposition 7.7.
Example 3.1. The time evolution of the top row p under $T_\infty$, i.e., $T_\infty(p)$, $T_\infty^2(p)$, $T_\infty^3(p)$ are
listed downward. The frame of the semistandard tableaux and the symbol $\otimes$ are omitted.

  11  122  2  1333  1  1  4  1  1  1  1  1  1  1  1  1  1
  11  111  1  1222  3  3  3  4  1  1  1  1  1  1  1  1  1
  11  111  1  1111  2  2  2  3  4  3  3  1  1  1  1  1  1
  11  111  1  1111  1  1  1  2  3  2  2  4  3  3  1  1  1

The conserved quantities are given by $E_1(p) = 3$, $E_2(p) = 5$ and $E_l(p) = 7$ for $l \ge 3$.
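The definition (3.1)–(3.2) can be traced step by step by passing a carrier $u_l$ through the state with the combinatorial R of (2.12)–(2.13). The Python sketch below is a minimal illustration under that reading, using $e_j = \min(\lambda_j, l) - d_j = Q_0$ at each step; the names `Q`, `box` and `evolve` are hypothetical. It reproduces the conserved quantities quoted in Example 3.1.

```python
def Q(i, x, y):
    """i-th non-winding number Q_i(x ⊗ y) of (2.13)."""
    N = len(x)
    xs = [x[(i + j - 1) % N] for j in range(1, N + 1)]
    ys = [y[(i + j - 1) % N] for j in range(1, N + 1)]
    return min(sum(xs[:k - 1]) + sum(ys[k:]) for k in range(1, N + 1))

def box(word, n=3):
    """Turn a tableau word such as '122' into (x_1, ..., x_{n+1})."""
    return tuple(word.count(str(c)) for c in range(1, n + 2))

def evolve(l, state):
    """Return (T_l(p), v_l, E_l(p)) by sending the carrier u_l through the state."""
    N = len(state[0])
    carrier = (l,) + (0,) * (N - 1)                     # u_l
    new_state, energy = [], 0
    for y in state:
        q = [Q(i, carrier, y) for i in range(N)]
        energy += q[0]                                  # e_j = min(lambda_j, l) - d_j
        new_state.append(tuple(y[t] + q[t] - q[(t + 1) % N] for t in range(N)))
        carrier = tuple(carrier[t] + q[(t + 1) % N] - q[t] for t in range(N))
    return new_state, carrier, energy

P = [box(w) for w in "11 122 2 1333 1 1 4 1 1 1 1 1 1 1 1 1 1".split()]
```

For the state p of Example 3.1, `evolve(l, P)[2]` returns 3, 5, 7 for l = 1, 2, 3, and a large carrier such as l = 10 reproduces the second row of the example with the carrier returning to $u_l$.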
The time evolution $T_\infty$ can be calculated by a simple prescription [26]. We introduce a map
$L_i$ ($2 \le i \le n + 1$) by
$$L_i : \mathbb{Z}_{\ge 0} \times B_l \to B_l \times \mathbb{Z}_{\ge 0}, \qquad (m, y) \mapsto (y', m'), \tag{3.5}$$
where $m'$ and $y' = (y'_1, \ldots, y'_{n+1})$ are determined from m and $y = (y_1, \ldots, y_{n+1})$ by
$$m' = y_i + (m - y_1)_+, \qquad y'_j = \begin{cases} y_i + (y_1 - m)_+ & \text{if } j = 1, \\ \min(m, y_1) & \text{if } j = i, \\ y_j & \text{otherwise}, \end{cases} \tag{3.6}$$
where $(m)_+ = \max(m, 0)$. $L_i$ may be viewed as the interaction of the box $B_l$ with the carrier that
contains m balls of color i. The carrier drops as many balls as possible into the empty space $y_1$
and picks away all the color i balls that were originally in the box. Using $L_i$, we introduce the
operators $K_i$ ($2 \le i \le n + 1$) that send a state to another as follows:
$$K_i(p_1 \otimes p_2 \otimes \cdots) = p'_1 \otimes p'_2 \otimes \cdots, \qquad L_i(m_j, p_j) = (p'_j, m_{j+1}) \text{ for } j \ge 0 \ (m_0 = 0).$$
The latter relation is applied successively for j = 0, 1, 2, ..., determining all the $p'_j$'s. In other
words the operator $K_i$ attaches an empty carrier to the left of the state and sends it to the right,
by which the color i balls are moved to the right according to the local interaction rule $L_i$.
Proposition 3.2 (See [26]). The time evolution $T_\infty$ admits the factorization:
$$T_\infty = K_2 K_3 \cdots K_{n+1}.$$
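The carrier rule (3.6) and the factorization of Proposition 3.2 can be sketched in a few lines of Python. The names `L_step`, `K`, `T_inf` and `boxes` are illustrative assumptions, and the state is assumed to end with enough empty boxes for every carrier to finish empty. The sample state is the one of Example 3.1.

```python
def L_step(i, m, y):
    """Local rule (3.6): carrier with m balls of color i meets the box y."""
    y = list(y)
    y1, yi = y[0], y[i - 1]
    m_out = yi + max(m - y1, 0)        # balls of color i leaving with the carrier
    y[0] = yi + max(y1 - m, 0)         # new empty space
    y[i - 1] = min(m, y1)              # balls of color i dropped into the box
    return tuple(y), m_out

def K(i, state):
    """Operator K_i: send an initially empty carrier of color i through the state."""
    m, out = 0, []
    for b in state:
        b, m = L_step(i, m, b)
        out.append(b)
    return out

def T_inf(state):
    """T_infinity = K_2 K_3 ... K_{n+1}; K_{n+1} acts first."""
    for i in range(len(state[0]), 1, -1):
        state = K(i, state)
    return state

def boxes(words, n=3):
    """Parse a string such as '11 122 2' into a list of elements of B_l."""
    return [tuple(w.count(str(c)) for c in range(1, n + 2)) for w in words.split()]

P = boxes("11 122 2 1333 1 1 4 1 1 1 1 1 1 1 1 1 1")
```

Applying `T_inf` repeatedly to `P` gives back the rows listed in Example 3.1, while `K(4, P)` and `K(3, K(4, P))` give the intermediate states of Example 3.3.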
Example 3.3. For p in Example 3.1, $K_4(p)$, $K_3 K_4(p)$ and $K_2 K_3 K_4(p) = T_\infty(p)$ are given below p.

  11  122  2  1333  1  1  4  1  1  1  1  1  1  1  1  1  1
  11  122  2  1333  1  1  1  4  1  1  1  1  1  1  1  1  1
  11  122  2  1111  3  3  3  4  1  1  1  1  1  1  1  1  1
  11  111  1  1222  3  3  3  4  1  1  1  1  1  1  1  1  1

Remark 3.4. Suppose $p_j = u_{\lambda_j}$ for $1 \le j \le k$ in a state $p = p_1 \otimes \cdots \otimes p_L$. Then Proposition 3.2
tells that in the state $T_\infty(p) = p'_1 \otimes \cdots \otimes p'_L$, $p'_j = u_{\lambda_j}$ is valid for $1 \le j \le k + 1$.
3.2. Bethe ansatz
Highest states in $B_{\lambda_1} \otimes \cdots \otimes B_{\lambda_L}$ are in one to one correspondence with rigged configurations
$(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ with $\mu^{(0)} = (\lambda_1, \ldots, \lambda_L)$ by the KKR bijection. Suppose L
is sufficiently large. If a state $p = p_1 \otimes \cdots \otimes p_L$ is highest and $p_k = u_{\lambda_k}$ for $k \gg 1$, so is its time
evolution $T_l(p)$. Thus the box–ball system induces the time evolution on the associated rigged
configurations. For such states, $E^{(0)}_j$ (2.8) and the vacancy number $p^{(1)}_j$ are sufficiently large, and
one can increase the color 1 rigging $r^{(1)}_i$ without violating the condition (2.6).
Proposition 3.5 (See [34], Proposition 2.6). Let $p = p_1 \otimes \cdots \otimes p_L \in P_+(\mu^{(0)})$ be the image
of the rigged configuration $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ under the KKR bijection. Assume
that $v_l = u_l$ in (3.1) and set $r'^{(1)}_i = r^{(1)}_i + \min(l, \mu^{(1)}_i)$.
Then $(\mu^{(0)}, (\mu^{(1)}, r'^{(1)}), (\mu^{(2)}, r^{(2)}), \ldots, (\mu^{(n)}, r^{(n)}))$ is a rigged configuration and corresponds to the highest state $T_l(p) \in P_+(\mu^{(0)})$.
This is proved from the definition of the time evolution (3.1) and Lemma C.3. The time evolution
$T_l$ in this paper corresponds to the a = 1 case of $T^{(a)}_l$ considered in [34]. In this sense
the rigged configurations are the action-angle variables of the box–ball system which linearize
the original nonlinear dynamics (3.1). Moreover it is clear that all the $T_l(p)$ are the same if
$l \ge \max \mu^{(1)}$.
Example 3.6. The rigged configuration corresponding to $T_\infty^t(p)$ (t = 0, 1, 2, 3) in Example 3.1
(apart from $\mu^{(0)}$).
[Figure: Young diagrams of $(\mu^{(1)}, r^{(1)})$, $(\mu^{(2)}, r^{(2)})$ and $(\mu^{(3)}, r^{(3)})$. For $\mu^{(1)}$ the vacancy numbers are 12, 12, 13 and the riggings are 3t, 3t, 3 + t; the riggings of $\mu^{(2)}$ are 0, 0 and that of $\mu^{(3)}$ is 0.]
The length of each row is $\mu^{(a)}_i$ and the numbers on its right and left are the rigging $r^{(a)}_i$ and the
vacancy number $p^{(a)}_{\mu^{(a)}_i}$, respectively. (Vacancy numbers are exhibited here for a check of (2.6).)
The Bethe ansatz produces transfer matrix eigenvectors from solutions to Bethe equations. The
KKR bijection is its combinatorial version in the sense that the former is replaced by highest
states and the latter by rigged configurations. Thus we see that the combinatorial Bethe ansatz
provides a linearization scheme, or equivalently, an inverse scattering method of the box–ball
system [34]. See Appendix E for a further exposition combined with the vertex operator formalism of the KKR bijection.
4. Corner transfer matrix

4.1. Number of balls in the SW quadrant
Let $p = p_1 \otimes \cdots \otimes p_L$ be a state and write its time evolution as $T_\infty^t(p_1 \otimes \cdots \otimes p_L) = p^t_1 \otimes \cdots \otimes
p^t_L$, with $p^t_j = (x^t_{j,1}, x^t_{j,2}, \ldots, x^t_{j,n+1}) \in B_{\lambda_j}$. We do not assume that p is highest. For
$0 \le k \le L$ and $1 \le d \le n + 1$, we define the function $\rho_{k,d}(p) \in \mathbb{Z}_{\ge 0}$ by ($\rho_{0,d}(p) = 0$)
$$\rho_{k,d}(p) = \sum_{j=1}^{k} \bigl(x^0_{j,2} + \cdots + x^0_{j,d}\bigr) + \sum_{t \ge 1} \sum_{j=1}^{k} \bigl(x^t_{j,2} + \cdots + x^t_{j,n+1}\bigr). \tag{4.1}$$
Here the second term is finite due to Remark 3.4. In fact the double sum may well be replaced
by $\sum_{t=1}^{k-1} \sum_{j=t+1}^{k}$ only where the nonzero contributions are contained. This region is depicted as
the SW quadrant of the time evolution pattern like Example 3.1.
The first term in (4.1) is the number of balls of color 2, 3, ..., d contained in the top row,
which is the truncation $p_1 \otimes \cdots \otimes p_k$ of the state p. The second term counts the balls of all
colors 2, ..., n + 1 within the hatched domain. By the definition, $\rho_{k,n+1}$ is the total number of
balls within $p_1 \otimes \cdots \otimes p_k$ and the SW quadrant beneath it. Thus
$$\rho_{k,1}(p) = \rho_{k,n+1}\bigl(T_\infty(p)\bigr) \tag{4.2}$$
holds. Note that $\rho_{k,d}(p)$ is independent of $p_{k+1}, p_{k+2}, \ldots, p_L$. In this regard, we will also use
the notation
$$\rho_d(p_1 \otimes \cdots \otimes p_k) = \rho_{k,d}(p). \tag{4.3}$$
From Remark 3.4 it follows that
$$\rho_d(u_l \otimes p_1 \otimes \cdots \otimes p_k) = \rho_d(p_1 \otimes \cdots \otimes p_k) \tag{4.4}$$
for any l.
The above picture reminds us of Baxter's corner transfer matrix (CTM) in solvable lattice
models [1]. In fact $\rho_{k,d}$ serves as its ultradiscrete analogue adapted to the box–ball system as we
will see below.
Example 4.1. For p in Example 3.1, $\rho_{k,d}(p)$ takes the following values.

  k             1   2   3   4   5   6   7   8
  $\rho_{k,1}$  0   0   0   3   5   7   9  12
  $\rho_{k,2}$  0   2   3   6   8  10  12  15
  $\rho_{k,3}$  0   2   3   9  11  13  15  18
  $\rho_{k,4}$  0   2   3   9  11  13  16  19
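The table can be reproduced by counting balls directly as in (4.1). The Python sketch below hard-codes the first eight columns of the four rows of Example 3.1; rows with $t \ge 4$ are omitted on the assumption, justified by Remark 3.4, that they contain no balls in these columns. The names `ROWS`, `balls` and `rho` are illustrative.

```python
# First eight boxes of p, T(p), T^2(p), T^3(p) from Example 3.1 (n = 3).
ROWS = [w.split() for w in (
    "11 122 2 1333 1 1 4 1",
    "11 111 1 1222 3 3 3 4",
    "11 111 1 1111 2 2 2 3",
    "11 111 1 1111 1 1 1 2",
)]

def balls(word, dmax):
    """Number of letters (balls) of color 2, ..., dmax in one box."""
    return sum(1 for c in word if 2 <= int(c) <= dmax)

def rho(rows, k, d, n=3):
    """rho_{k,d} of (4.1): colors <= d in the top row plus all balls below it."""
    top = sum(balls(w, d) for w in rows[0][:k])
    below = sum(balls(w, n + 1) for row in rows[1:] for w in row[:k])
    return top + below

TABLE = [[rho(ROWS, k, d) for k in range(1, 9)] for d in range(1, 5)]
```

`TABLE[d - 1]` then holds the row $\rho_{k,d}$ ($k = 1, \ldots, 8$) of the table above.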
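4.2. Bilinearization of box–ball system

By the definition, the kth component $p_k = (x_1, \ldots, x_{n+1}) \in B_{\lambda_k}$ in a state $p = p_1 \otimes \cdots \otimes p_L$
is expressed as
$$x_d = \rho_{k,d} - \rho_{k-1,d} - \rho_{k,d-1} + \rho_{k-1,d-1} \quad (1 \le d \le n + 1), \tag{4.5}$$
where $\rho_{k,d} = \rho_{k,d}(p)$ for $1 \le d \le n + 1$ and the extra one $\rho_{k,0}(p)$ is specified by
$$\rho_{k,0}(p) = \rho_{k,n+1}(p) - (\lambda_1 + \cdots + \lambda_k) \tag{4.6}$$
so as to satisfy $x_1 + \cdots + x_{n+1} = \lambda_k$. The formula (4.5) may be viewed, in a certain
sense, as an ultradiscrete analogue of the Baxter formula (Eq. (13.1.12) in [1]): $\langle \sigma_1 \rangle =
\mathrm{Tr}(SABCD)/\mathrm{Tr}(ABCD)$ for the one point function in terms of CTMs.
We use the notation
$$\bar{\rho}_{k,d} = \rho_{k,d}\bigl(T_\infty(p)\bigr). \tag{4.7}$$
Thus (4.2) reads
$$\rho_{k,1} = \bar{\rho}_{k,n+1}. \tag{4.8}$$
Proposition 4.2. For $2 \le d \le n + 1$ the following relation holds:
$$\bar{\rho}_{k,d-1} + \rho_{k-1,d} = \max\bigl(\bar{\rho}_{k,d} + \rho_{k-1,d-1},\ \bar{\rho}_{k-1,d-1} + \rho_{k,d} - \lambda_k\bigr). \tag{4.9}$$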
A similar fact has been shown in [26].

Proof. In the time evolution $T_\infty = K_2 K_3 \cdots K_{n+1}$ (Proposition 3.2), let us calculate the effect of
the operator $K_d$ on the kth box $p_k = (x_1, \ldots, x_{n+1}) \in B_{\lambda_k}$ in $K_{d+1} \cdots K_{n+1}(p)$. In the following,
the fact that color d balls are touched only by $K_d$ is taken into account. Suppose that the carrier
contains m and $m'$ balls with color d just before and after the interaction $L_d$ (3.5). In (3.6) we
are to set
$$m' = \sum_{j=1}^{k} (\rho_{j,d} - \rho_{j-1,d} - \rho_{j,d-1} + \rho_{j-1,d-1}) - (\rho \to \bar{\rho})
 = (\rho_{k,d} - \rho_{k,d-1}) - (\bar{\rho}_{k,d} - \bar{\rho}_{k,d-1}),$$
$$m = m'|_{k \to k-1}, \qquad y_d = x_d = \rho_{k,d} - \rho_{k-1,d} - \rho_{k,d-1} + \rho_{k-1,d-1},$$
where we have used (4.5). As for the empty space $y_1$ concerning $L_d$ in (3.5), we show that it is
given by
$$y_1 = \lambda_k + \bar{\rho}_{k,d} - \bar{\rho}_{k-1,d} - \rho_{k,d} + \rho_{k-1,d} \quad (2 \le d \le n + 1) \tag{4.10}$$
by induction on d in the decreasing order d = n + 1, n, ..., 2. In so doing, the bilinear relation (4.9) will be established simultaneously.
For d = n + 1, (4.10) coincides with $x_1$ in (4.5) by (4.6) and (4.8), hence it is correct. Then
the relation $m' = y_d + (m - y_1)_+$ (3.6) leads to (4.9). The new empty space is determined from
$y'_1 = y_d + (y_1 - m)_+ = m' + y_1 - m$ and is equal to
$$\lambda_k + \bar{\rho}_{k,d-1} - \bar{\rho}_{k-1,d-1} - \rho_{k,d-1} + \rho_{k-1,d-1}.$$
This coincides with (4.10) with d replaced by d − 1, making the induction proceed.
The relation (4.9) is an ultradiscrete analogue of the Hirota bilinear equation. In view of
(4.8), it determines $\rho_{k-1,1}, \rho_{k-1,2}, \ldots, \rho_{k-1,n+1}$ successively from $\{\bar{\rho}_{k-1,d}, \bar{\rho}_{k,d}, \rho_{k,d} \mid 1 \le d \le
n + 1\}$. Thus all the $\rho_{k,d}(T_\infty^t(p))$ are fixed uniquely from the data at sufficiently large t and k.
Then the local states are specified by (4.5). In this sense the ultradiscrete CTM $\rho_d$ achieves a
bilinearization of the dynamics of the box–ball system.
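The bilinear relation (4.9) and the boundary relation (4.8) can be verified numerically on the state of Example 3.1. The sketch below hard-codes the first eight columns of the rows $p, T_\infty(p), T_\infty^2(p), T_\infty^3(p)$ and computes $\bar{\rho}$ from the rows starting at $T_\infty(p)$; it assumes, via Remark 3.4, that $T_\infty^4(p)$ has no balls in these columns, so it is valid only for $k \le 8$. All names are illustrative.

```python
ROWS = [w.split() for w in (
    "11 122 2 1333 1 1 4 1",     # p
    "11 111 1 1222 3 3 3 4",     # T(p)
    "11 111 1 1111 2 2 2 3",     # T^2(p)
    "11 111 1 1111 1 1 1 2",     # T^3(p)
)]
LAM = [len(w) for w in ROWS[0]]  # box capacities lambda_1, ..., lambda_8

def balls(word, dmax):
    return sum(1 for c in word if 2 <= int(c) <= dmax)

def rho(rows, k, d, n=3):
    """rho_{k,d} of (4.1) for the state rows[0] with time evolution rows[1:]."""
    top = sum(balls(w, d) for w in rows[0][:k])
    below = sum(balls(w, n + 1) for row in rows[1:] for w in row[:k])
    return top + below

def hirota_holds(k, d):
    """Check (4.9) at (k, d); rho-bar is rho evaluated on T(p), i.e. on ROWS[1:]."""
    r = lambda kk, dd: rho(ROWS, kk, dd)
    rb = lambda kk, dd: rho(ROWS[1:], kk, dd)
    lhs = rb(k, d - 1) + r(k - 1, d)
    rhs = max(rb(k, d) + r(k - 1, d - 1),
              rb(k - 1, d - 1) + r(k, d) - LAM[k - 1])
    return lhs == rhs
```

For instance, at k = 6, d = 3 both sides of (4.9) equal 16, with the maximum attained by the second argument.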
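4.3. Relation to energy function
Let $p \in B_{\lambda_1} \otimes \cdots \otimes B_{\lambda_L}$ be any element which is not necessarily highest. For $1 \le k \le L$, we
introduce the sum:
$$\bar{E}_i(p_1 \otimes \cdots \otimes p_k) = \sum_{1 \le j < m \le k} Q_i\bigl(p_j \otimes p^{(j+1)}_m\bigr) \quad (1 \le i \le n + 1), \tag{4.11}$$
where $Q_i$ is the ith non-winding number (2.13) with the convention $Q_{n+1} = Q_0$. The element
$p^{(j+1)}_m$ is defined by sending $p_m$ to the left by applying the combinatorial R successively as
$$p_j \otimes p_{j+1} \otimes \cdots \otimes p_{m-1} \otimes p_m \simeq p_j \otimes p_{j+1} \otimes \cdots \otimes p^{(m-1)}_m \otimes p'_{m-1}
 \simeq \cdots \simeq p_j \otimes p_{j+1} \otimes p^{(j+2)}_m \otimes \cdots \otimes p'_{m-1}
 \simeq p_j \otimes p^{(j+1)}_m \otimes p'_{j+1} \otimes \cdots \otimes p'_{m-1}.$$
We understand that (4.11) is 0 for k = 0, 1. Using $\bar{E}_i$ we define the ith energy $E_i$ by
$$E_i(p_1 \otimes \cdots \otimes p_k) = \bar{E}_i(u_\infty \otimes p_1 \otimes \cdots \otimes p_k) \quad (1 \le i \le n + 1), \tag{4.12}$$
where $u_\infty$ actually means $u_l$ with sufficiently large l. $E_i$ does not depend on such l. In fact, from
the graphical rule in Appendix B, we find $Q_i(u_l \otimes x) = x_2 + x_3 + \cdots + x_i$ if $x = (x_1, \ldots, x_{n+1})$
and $x_1 + \cdots + x_{n+1} \le l$ is satisfied. Thus writing $p^{(1)}_j = (x_{j,1}, \ldots, x_{j,n+1})$, (4.12) is split into the
boundary and the bulk parts as
$$E_i(p_1 \otimes \cdots \otimes p_k) = \sum_{j=1}^{k} (x_{j,2} + \cdots + x_{j,i}) + \bar{E}_i(p_1 \otimes \cdots \otimes p_k). \tag{4.13}$$
In particular, one has $E_i(p_1 \otimes \cdots \otimes p_k) = \bar{E}_i(p_1 \otimes \cdots \otimes p_k)$ if $p_1 \otimes \cdots \otimes p_k$ is highest. We warn
that the quantity usually called energy [17,20] is $E_{n+1}$ or $\bar{E}_{n+1}$ up to an additive constant. In
what follows, whenever the notation $u_\infty$ is used, it should be understood as $u_l$ with sufficiently
large l and the relevant quantity is independent of such l.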
To the relation x y y x with e = Qi (x y), we assign the diagram
x
e

y
(4.14)
221
where the suppressed i is to be mentioned nearby if necessary.

Let ((x1 , x2 , . . . , xn+1 )) = (x2 , x3 , . . . , x1 ) be the Dynkin diagram automorphism acting on
Bl decreasing the tableau letters cyclically by one. We extend it naturally to the tensor product
by (p1 pk ) = (p1 ) (pk ). Since the combinatorial R commutes with , the
ith non-winding number has the properties similar to the i = 0 case. In particular, under the
YangBaxter relation
a
d
e
b
=
the equalities a + b = e + f and b + c = d + e hold. In fact, suppose the figure corresponds to

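$\mathrm{Aff}(B_k) \otimes \mathrm{Aff}(B_l) \otimes \mathrm{Aff}(B_m) \simeq \mathrm{Aff}(B_m) \otimes \mathrm{Aff}(B_l) \otimes \mathrm{Aff}(B_k)$ for some $k$, $l$ and $m$. If $i = 0$ for instance, the associated non-winding number $Q_0$ is related to $H$ via (2.14), therefore by setting $a' = \min(k, l) - a$, $b' = \min(k, m) - b$ and $c' = \min(l, m) - c$, the left-hand side represents the following relation under the combinatorial R:
$$x[\zeta_1] \otimes y[\zeta_2] \otimes z[\zeta_3] \simeq y'[\zeta_2 - a'] \otimes x'[\zeta_1 + a'] \otimes z[\zeta_3] \simeq y'[\zeta_2 - a'] \otimes z'[\zeta_3 - b'] \otimes x''[\zeta_1 + a' + b'] \simeq z''[\zeta_3 - b' - c'] \otimes y''[\zeta_2 - a' + c'] \otimes x''[\zeta_1 + a' + b'].$$
Similarly, by setting $d' = \min(l, m) - d$, $e' = \min(k, m) - e$ and $f' = \min(k, l) - f$, the same element is transformed along the right-hand side as
$$x[\zeta_1] \otimes y[\zeta_2] \otimes z[\zeta_3] \simeq x[\zeta_1] \otimes z'[\zeta_3 - d'] \otimes y'[\zeta_2 + d'] \simeq z''[\zeta_3 - d' - e'] \otimes x'[\zeta_1 + e'] \otimes y'[\zeta_2 + d'] \simeq z''[\zeta_3 - d' - e'] \otimes y''[\zeta_2 + d' - f'] \otimes x''[\zeta_1 + e' + f'].$$
Since the Yang-Baxter relation is valid among the affine crystals, we obtain not only the coincidence of the resulting elements $x''$, $y''$ and $z''$ on the two sides but also $b' + c' = d' + e'$, $a' - c' = f' - d'$ and $a' + b' = e' + f'$, which are equivalent to the two relations $b + c = d + e$ and $a + b = e + f$. Note that $a + b + c \ne e + f + d$ in general.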
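Remark 4.3. The energy is invariant under any reordering of $p_1 \otimes \cdots \otimes p_k$ by the combinatorial R. Namely, $E_i(p_1 \otimes \cdots \otimes p_k) = E_i(p'_1 \otimes \cdots \otimes p'_k)$ and $\bar{E}_i(p_1 \otimes \cdots \otimes p_k) = \bar{E}_i(p'_1 \otimes \cdots \otimes p'_k)$ hold if $p_1 \otimes \cdots \otimes p_k \simeq p'_1 \otimes \cdots \otimes p'_k$ by the combinatorial R. For $i = n + 1$ this is essentially Proposition 3.9 in [20], and the general $i$ case follows from the symmetry under $\sigma$.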
Let us consider a particular diagram involving $p_1 \otimes \cdots \otimes p_k$, which is illustrated for $k = 2, 3, 4$. The general case is similar.
[Diagrams: the half-twist diagrams for $p_1 \otimes p_2$, $p_1 \otimes p_2 \otimes p_3$ and $p_1 \otimes p_2 \otimes p_3 \otimes p_4$.]
Incidentally, diagrams of this kind are known as the half twist in the construction of link invariants [39].
Lemma 4.4. The energy Ei (p1 pk ) is the sum of the non-winding numbers Qi (as e in
(4.14)) attached to all the vertices of the corresponding diagram for p1 pk as above.
Proof. For k = 2 it is obvious. We use the definition (4.12) and illustrate the induction step along
the one from k = 3 to k = 4.
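[Diagram: the induction step from $k = 3$ to $k = 4$, with vertices labeled $d_1, d_2, d_3$ and $e_1, e_2, e_3$ on the line of $p_4$.]
By the induction assumption, the sum of three is equal to $\sum_{1 \le j < m \le 3} Q_i\bigl(p_j \otimes p_m^{(j+1)}\bigr)$. Thus we are to verify $e_1 + e_2 + e_3 = \sum_{1 \le j < 4} Q_i\bigl(p_j \otimes p_4^{(j+1)}\bigr)$. But the Yang-Baxter equation shown above tells that $e_1 + e_2 + e_3 = d_1 + d_2 + d_3$, and furthermore, $d_3 = Q_i\bigl(p_3 \otimes p_4^{(4)}\bigr)$, $d_2 = Q_i\bigl(p_2 \otimes p_4^{(3)}\bigr)$, $d_1 = Q_i\bigl(p_1 \otimes p_4^{(2)}\bigr)$. □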
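Lemma 4.5. $\rho_i(p_1 \otimes \cdots \otimes p_k) - \rho_i(T_\infty(p_1 \otimes \cdots \otimes p_k)) = e_1 + \cdots + e_k$, where the $e_j$'s are the $i$th non-winding numbers specified by the following diagram:
[Diagram: the line from $u_\infty$ crosses the lines of $p_1, p_2, \ldots, p_k$ in turn, with vertices labeled $e_1, e_2, \ldots, e_k$.]
Proof. In terms of the notation in (4.1), the difference of $\rho_i$ is evaluated as
$$\sum_{j=1}^{k} \bigl(x^0_{j,2} + \cdots + x^0_{j,i}\bigr) + \sum_{j=1}^{k} \bigl(x^1_{j,i+1} + \cdots + x^1_{j,n+1}\bigr).$$
By using the graphical rule [17] explained in Appendix B, it is easy to show that the non-winding number $Q_i$ (2.13) is given by $e_j = (x^0_{j,2} + \cdots + x^0_{j,i}) + (x^1_{j,i+1} + \cdots + x^1_{j,n+1})$. □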
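The main result in this subsection is the following, which identifies the ultradiscrete CTM $\rho_i$ (4.3) with the energy $E_i$ that originates in the crystal theory.
Proposition 4.6. $\rho_i(p_1 \otimes \cdots \otimes p_k) = E_i(p_1 \otimes \cdots \otimes p_k)$ holds for any $k$ and $1 \le i \le n + 1$.
Proof. For $T_\infty^t(p)$ with sufficiently large $t$, its leftmost $k$ components become $u_{\lambda_1} \otimes \cdots \otimes u_{\lambda_k}$ due to Remark 3.4. In this case both $\rho_i$ and $E_i$ are obviously zero. Therefore it suffices to show
$$\rho_i(p_1 \otimes \cdots \otimes p_k) - \rho_i(p'_1 \otimes \cdots \otimes p'_k) = E_i(p_1 \otimes \cdots \otimes p_k) - E_i(p'_1 \otimes \cdots \otimes p'_k),$$
where $p'_1 \otimes \cdots \otimes p'_k = T_\infty(p_1 \otimes \cdots \otimes p_k)$. We illustrate the proof for $k = 3$. From Lemma 4.5, we are to show $E_i(p_1 \otimes p_2 \otimes p_3) = E_i(p'_1 \otimes p'_2 \otimes p'_3) + e_1 + e_2 + e_3$. Recall that $p'_1 \otimes \cdots \otimes p'_k$ is determined by carrying $u_\infty$ by the combinatorial R through $p_1 \otimes \cdots \otimes p_k$ to the right as $u_\infty \otimes p_1 \otimes \cdots \otimes p_k \simeq p'_1 \otimes \cdots \otimes p'_k \otimes (\cdots)$. Combining this with Lemma 4.4, one can depict the two sides as follows: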
[Diagrams: left, the diagram for $E_i(p_1 \otimes p_2 \otimes p_3)$; right, the diagram for $E_i(p'_1 \otimes p'_2 \otimes p'_3) + e_1 + e_2 + e_3$.]
We are to check a + b + c = a + b + c + e1 + e2 + e3 . From Remark 4.3, we may assume

1 2 3 without loss of generality. Then the above equality is a consequence of the separate
ones e1 = 0, a = a + e2 and b + c = b + c + e3 . To see them, note that u b um ()
for any b Bm under the combinatorial R. Moreover Qi (um uj ) = 0 for any m, j . Thus
e1 = Qi (u u1 ) = 0 indeed. The other relations can also be seen by appropriately deforming
the leftmost line from $u_\infty$ in the right diagram with the aid of the Yang-Baxter equation:
[Diagrams: the line from $u_\infty$ deformed through the lines of $p_1$, $p_2$, $p_3$ by the Yang-Baxter equation.]
Comparing the lines for p2 in the left diagram here and the previous one, we find e2 + e2 + a =
e2 + a + d. Similarly, the lines for p3 in the right diagram here and the previous one lead to
e3 + e3 + b + c = e3 + b + c + d . The proof is finished by noting d = d = 0 because of
Qi (um uj ) = 0 for any m, j . 2
As a corollary of Proposition 4.6 and (4.4), one has
$$E_i(u_l \otimes p) = E_i(p), \tag{4.15}$$
which can also be verified by an argument similar to the above proof.
Remark 4.7. Although both $\rho_i$ and $E_i$ admit decompositions into the boundary and the bulk parts as in (4.1) and (4.13), these parts are not equal separately in general. Proposition 4.6 has also been proved by Mark Shimozono by using the technique known as katabolism (private communication).
The energy $E_{n+1}$ (4.12) and the row transfer matrix energy $E_l$ (3.2) are related by
Proposition 4.8.
$$E_{n+1}(p) - E_{n+1}(T_l(p)) = E_l(p).$$
For $l = \infty$ this coincides with Lemma 4.5 with $i = n + 1$.
Proof. We illustrate the proof for p = p1 pk with k = 3. Consider the diagrams:
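[Diagrams: the line from $u_l$ moved through the lines of $p_1$, $p_2$, $p_3$ by the Yang-Baxter relation; the vertices carry the labels $e_i$, $d_i$ $(i = 1, 2, 3)$ and $a$, $b$, $c$.]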
Here the numbers above the vertices signify the (n + 1)th non-winding number as in (4.14)
with i = n + 1, and we have applied the YangBaxter relation to the line from ul . According to
Lemma 4.4, En+1 (ul p) is equal to the sum of all the numbers in the left diagram. Similarly,
En+1 (Tl (p)) is obtained from the right diagram as En+1 (Tl (p)) = d1 + d2 + d3 + a + b + c. The
YangBaxter equation tells that ei + di = ei + di for i = 1, 2, 3. Using these facts and (4.15), we
obtain $E_{n+1}(p) - E_{n+1}(T_l(p)) = E_{n+1}(u_l \otimes p) - E_{n+1}(T_l(p)) = e_1 + e_2 + e_3$, which coincides with $E_l(p)$ in (3.2). □
4.4. Proof of Theorem 2.1
Theorem 2.1 is a simple corollary of (4.5) and
Theorem 4.9. For any rigged configuration $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ and the corresponding highest state $p = p_1 \otimes \cdots \otimes p_L$ under the KKR bijection, the associated ultradiscrete tau function (2.19) and the ultradiscrete CTM (4.1), (4.3) coincide. Namely
$$\tau_i(p_1 \otimes \cdots \otimes p_k) = \rho_i(p_1 \otimes \cdots \otimes p_k) \qquad (1 \le i \le n + 1,\ 1 \le k \le L). \tag{4.16}$$
Proof. Consider the embedding of p into P+ ((0) ) B1L as p = p 1L . The corresponding

rigged configuration is obtained from that of p by just changing (0) into (0) (1L ). It is easily
seen that k,i and k,i for p are the same as those for p as long as 1 k L. Thus we understand
them as associated with p rather than p.
Our proof is based on Propositions 5.1 and 6.1, which will be established in Sections 5 and
6, respectively. Proposition 5.1 states that i satisfies the same bilinear equation (4.9) as i .
Combined with (4.8), it determines k1,1 , k1,2 , . . . , k1,n+1 successively in this order from
{k1,i , k,i , k,i | 1 i n + 1}. Namely, the tau function on the NW corner in
k

k1
k1
is fixed from those on the NE, SW and SE. Like and ,

the tau functions and are associated
with p and T (p ), respectively (see the beginning of Section 5), and the above diagram can be
extended to a two-dimensional square lattice with the indicated coordinates. The square at (k, t)
t (p )))n+1 .
is associated with (k,i (T
i=1
Consider the rectangular region on the lattice 0 t t0 , 1 k L + L , where the tau
functions for p constitutes the top line t = 0 of it. They are uniquely determined from the right
t (p )))n+1 | 0 t t }, and the bottom boundary t = t ,
boundary k = L + L , i.e., {(L+L ,i (T
0
0
i=1
t0
n+1

i.e., {(k,i (T (p )))i=1 | 1 k L + L }. The coincidence of i and i on these boundaries will
be proved in Proposition 6.1 by taking t0 and L sufficiently large. 2
5. Bilinear relation for $\tau_i$
Let k,d be the ultradiscrete tau function specified in Theorem 2.1 and (2.18)(2.20). We
define k,d to be k,d with |s (1) | replaced by |s (1) | + | (1) | in (2.20). In view of Proposition 3.5,
this corresponds to the rigged configuration that has undergone the time evolution T once.
Proposition 5.1. The substitution k,d = k,d and k,d = k,d solves the bilinear equation (4.9).
This section is devoted to the proof of Proposition 5.1 by a refinement of the approach in [26].
We invoke the free fermion construction of tau functions associated with gl() [37]. For l Z,
set

(a) (a) (a)
H (x)
,
g|l, g = exp
ci pi qi
l (x) = l|e
(5.1)
(a,i)
where the notation is the same as Eq. (2.3) in [37] except that l there is denoted by l here
(a)
for distinction from (2.19). (pi here is not the vacancy number (2.7).) The operators (k) =

j are the free fermions. They obey the anti-commutation rej
j Z j k , (k) =
j Z j k
lations [i , j ]+ = [i , j ]+ = 0 and [i , j ]+ = ij , hence (k)2 = (k)2 = 0. |l is the

charge l vacuum of the Fock space. H (x) = i1 xi j Z j j+i is the Hamiltonian with in(a)
(a)
(a)
finitely many time variables x = (x1 , x2 , . . .). In (5.1), we associate each triple (ci , pi , qi )
(a) (a)
with the data (i , ri ) in the rigged configuration (, ((1) , r (1) ), . . . , ((n) , r (n) )). The sum
extends over all the colors 1 a n and the rows 1 i ((a) ). The tau function (5.1) is an
N -soliton solution of the KP hierarchy with N = ((1) ) + + ((n) ).

H (x)
H (x) = e(x,k) (k) and
The time evolution of the free fermion is given
by e i (k)e
(x,k)
=e
(k) with (x, k) = i1 xi k . Consequently,
eH (x) (k)eH (x)
eH (x) (p) (q)eH (x) =
q
(p) (q)
p
for x = ( 1 ) := ( 1 , 12 2 , 13 3 , . . .). For zk := (11 ) + + (k1 ), the tau function is

expanded as

l (zk ) =
(5.2)
l (zk ) ,
=( (1) ,..., (n) )
l (zk ) =

=

(a) (a)
c i qi
(a,i)
(a)
(a,i)<(b,j ) (pi
k
(a) l
pi
(a)
qi
(b)
(a)
j qi
(a)
j =1 j pi
(b)
(a)
pj )(qj qi )
(a)
(a,i),(b,j ) (pi
(b)
qj )
(5.3)
(5.4)
(1)
(1)
where the sum
, (n) (n) independently.
In (5.3),
(5.2) extends over the subsets , . . . (a)

(a) . In (5.4),
the product (a,i) runs over the rows of the
selected
subset
(a,i)<(b,j ) runs

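over the pairs of such indices, whereas $\prod_{(a,i),(b,j)}$ simply means the double product. The factor (5.4) is the Cauchy determinant of the free fermion up to an overall power of $p_i^{(a)}$ and $q_i^{(a)}$. It is derived by using the formulas:
$$\langle l|\psi(p)\psi^*(q)|l\rangle = \sum_{j \le l-1} p^{j} q^{-j} = \left(\frac{p}{q}\right)^{l} \frac{q}{p - q},$$
$$\langle l|\psi(p_1) \cdots \psi(p_m)\psi^*(q_m) \cdots \psi^*(q_1)|l\rangle = \prod_{i=1}^{m} q_i \left(\frac{p_i}{q_i}\right)^{l} \frac{\prod_{i<j}(p_i - p_j)(q_j - q_i)}{\prod_{i,j=1}^{m}(p_i - q_j)}.$$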
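The product on the right-hand side of the second formula is the classical Cauchy determinant identity $\det[1/(p_i - q_j)] = \prod_{i<j}(p_i - p_j)(q_j - q_i) / \prod_{i,j}(p_i - q_j)$. The following minimal sketch checks this identity in exact rational arithmetic for sample values. It does not reproduce the fermionic computation itself, the prefactors depending on $l$ are left out, and the helper names `det` and `cauchy_product` are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def det(matrix):
    """Determinant by the Leibniz permutation expansion (adequate for small sizes)."""
    n = len(matrix)
    total = Fraction(0)
    for perm in permutations(range(n)):
        inversions = sum(
            1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b]
        )
        term = Fraction(-1) ** inversions
        for row, col in enumerate(perm):
            term *= matrix[row][col]
        total += term
    return total


def cauchy_product(p, q):
    """prod_{i<j} (p_i - p_j)(q_j - q_i) / prod_{i,j} (p_i - q_j)."""
    m = len(p)
    numerator = Fraction(1)
    for i in range(m):
        for j in range(i + 1, m):
            numerator *= (p[i] - p[j]) * (q[j] - q[i])
    denominator = Fraction(1)
    for i in range(m):
        for j in range(m):
            denominator *= p[i] - q[j]
    return numerator / denominator


# Sample parameters (illustrative only); all p_i - q_j must be nonzero.
p = [Fraction(2), Fraction(5), Fraction(11)]
q = [Fraction(1), Fraction(3), Fraction(7)]
cauchy_matrix = [[1 / (pi - qj) for qj in q] for pi in p]
assert det(cauchy_matrix) == cauchy_product(p, q)
```

Exact fractions are used so that the comparison is an equality rather than a floating-point approximation.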
Now we make a special choice of the parameters that further reflects the rigged configuration
(, ((1) , r (1) ), . . . , ((n) , r (n) )). Fixing d {2, . . . , n + 1} we set

(a)
(a)
i
i
(a)
(a)
(a+1)
=
+ i exp
,
qi =
,

(a)
(a) exp 2(a)
i +ri
if a {1, d},
(a) (a)
i

ci q i =
(a)
(a)
+r
(a)
otherwise,
i exp i i

j
j = (1) + j exp
,

(a)
pi
(a)
(a)
i exp
(a)
(5.5)
(5.6)
(5.7)
where 1 a n and > 0. Here (1) , . . . , (n+1) and i , i

(hence distinct) parameters such that
, j are -independent generic
(1) > > (d1) > (d) = 0 > (d+1) > > (n+1) ,
(5.8)
(a)
(a)
i
> 0,
(a)
i
> 0,
> 0.
(5.9)
Lemma 5.2. Set q = e1/ . In the limit q 0, the summand l (zk ) (5.3) of the tau function has
the following behavior:

(1)
(d)
(d1) || (d) |)
l (zk ) = q c(,s)+| |+| |l(|
+ O(q) , > 0,

(d)
(d1) || (d) |)
l zk + (1)1 = q c(,s)+| |l(|
(5.10)
+ O(q) , > 0,
where and are independent of . c(, s) is defined by (2.20) with = (1 , . . . , k ).
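We denote by $A \overset{\mathrm{UD}}{\simeq} a$ the relation $a = \lim_{\varepsilon \to +0} \varepsilon \log A$ under the ultradiscretization. With $q = e^{-1/\varepsilon}$, it means that $A = A_0 q^{-a} + {}$higher order terms in $q$ for some leading coefficient $A_0$ $(\ne 0)$. ($A_0$ still can depend on $\varepsilon$ as long as $\varepsilon \log A_0 \to 0$, although it is not needed in our case.) We let the relation $A \overset{\mathrm{UD}}{\simeq} B$ mean $\lim_{\varepsilon \to +0} \varepsilon \log A = \lim_{\varepsilon \to +0} \varepsilon \log B$.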
(a)
(a)
(a)
(a)
Proof. Let {(i , si )} be the subset of the rigged configuration {(i , ri )} corresponding to
= ( (1) , . . . , (n) ) as in (2.20). We investigate the leading power of the constituent factors in
(5.3). From (5.5)(5.7) we find
(i)

(a,i)
(ii)

(a,i)
(iii)
(iv)
(a) (a) UD
c i qi
(a) l
pi
(a)
qi
a=1
p (d) l
i
(d1)
qi
k
(a)

j qi
(a)
(a,i) j =1 j pi
n

(a) (a) (1) (d)
+ s ,
(a)
pi

UD
l (d1) (d) ,
k

(1)
i j =1 j pi
(b) (b)
qj
pj
(a)
qi

UD
min , (1) ,
n

(a)
pi
(a) 2
pj
a=1 i<j
(a,i)<(b,j )
UD
n

min (a) , (a) + (a) ,
a=1
(v)
(a)
pi
(b) 1
qj
(vi)
(a)
pi
(a1) 1 UD
qj
a=2 i,j
(a,i),(b,j )
(1) q (a)
i
(a)
(1)
pi
(a,i)
n

1
(1)
(1) pi
n

min (a) , (a1) ,
a=2

UD
(1) ,
where | (n+1) | = 0 and the notation (2.3) is used. The contributions (i)(v) sum up to c(, s)
| (1) | | (d) | + l(| (d1) | | (d) |). This verifies the leading power of l (zk ) in (5.10). Similarly,
the one for l (zk + ( (1)1 )) is derived by including the contribution from (vi).
The remaining task is to check the positivity and -independence of the leading coefficients
and . We first illustrate them along l (zk ) . In the right-hand side of (5.3), we show the
positivity individually for the constituent factors (i), (ii), (iii) and = (iv) (v) considered in

(a)
the above. The leading coefficient from (i) is (a,i) i by (5.6), which is positive due to (5.9).
The leading coefficient from (ii) is 1 if l = 0. If l = 1, it is given by

(d1) (d+1)
i
(d)
(d1)
i
j j

1an
a=d1,d
(a+1)
(a)
( (a) )
,
where the products on i and j extend over the selected rows in (d1) and (d) , respectively. The
symbol ( (a) ) denotes the length of (a) as defined in (2.1). This is positive thanks to (5.8) and
(5.9). The leading coefficient from (iii) with a fixed j is equal to the one from

i
(1) (2)
(1)
j q j + i q i
(1)
(1) (a+1)
.
(1) (a)
(a,i)
a2
It is positive by (5.8) and (5.9). The leading coefficients from (iv) and (v) are respectively equal
to those in
n

(a)
(a)
(a)
(a)
j q j i q i
2
a=1 i<j

( (a) )( (b) )
(a) (b) (b+1) (a+1)
,
1a<bn
n

(a)
(a)
(a1)
i q i j
(a1)
j
a=2 i,j
1
(a) (b+1)
( (a) )( (b) )
1a,bn
a=b+1
In view
of (5.8) and (5.9), the coefficients are both positive apart from the same sign factor

(a)
(b)
(1) 1a<bn ( )( ) . Thus the leading coefficient from the product (iv) (v) is positive.
For l (zk + ( (1)1 )), the leading positivity > 0 is proved similarly. The only necessary
modification is to include the contribution from (vi):
n
(1) (a+1) )( (a) )
a=1 (
,
(1) n
(1) (a) )( (a) )
a=2 (
i i
which is again positive due to (5.8) and (5.9). Finally, and are -independent as they are
rational functions of the parameters appearing in (5.8) and (5.9) only. 2
Lemma 5.3.
UD
0 (zk ) k,d ,

UD

0 zk + (1)1 k,d ,
UD
1 (zk ) k,d1 ,

UD

1 zk + (1)1 k,d1 .
Proof. For example we consider the UD limit

lim log 0 zk +
+0
(1)1

c(,s)+| (d) |
q
= lim log
,
+0
(5.11)
where q = e1/ and (5.10) has been substituted. Lemma 5.2 furthermore tells that there is no
cancellation in the -sum here because of > 0. Therefore the limit tends to max {c(, s)
| (d) |} = k,d . See the definitions of k () (2.19) and k,d (2.21). The other limits are confirmed
similarly. 2
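The role of positivity in this limit can be illustrated numerically. The sketch below evaluates $\varepsilon \log \sum_j A_j e^{a_j/\varepsilon}$ for small $\varepsilon$: with positive coefficients it approaches $\max_j a_j$, whereas a cancellation of the leading terms lowers the limit. The function name `ud_log` and the sample data are hypothetical and are not taken from the tau function (5.3).

```python
import math


def ud_log(terms, eps):
    """Return eps * log(sum_j A_j * exp(a_j / eps)) for terms = [(A_j, a_j), ...].

    The largest exponent is factored out first to avoid overflow;
    the total sum is assumed to be positive.
    """
    top = max(a for _, a in terms)
    scaled = sum(A * math.exp((a - top) / eps) for A, a in terms)
    return top + eps * math.log(scaled)


# Positive leading coefficients: the limit is the maximum exponent 4.5.
positive = [(0.3, 2.0), (5.0, -1.0), (1.7, 4.5)]
for eps in (1.0, 0.1, 0.01):
    print(eps, ud_log(positive, eps))

# Leading terms that cancel: the limit drops from 4.5 to 2.0.
cancelling = [(1.0, 4.5), (-1.0, 4.5), (1.0, 2.0)]
print(ud_log(cancelling, 0.01))
```

This is why the positivity of the leading coefficients established in Lemma 5.2 is needed before the sum over subsets can be replaced by a maximum.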
Proof of Proposition 5.1. It is well known that l satisfies the bilinear equation:

1
1 0 z + 1 + 1 1 z + 1

+ 1 1 0 z + 1 + 1 1 z + 1

+ 1 1 0 z + 1 + 1 1 z + 1 = 0.
This is derived by setting x = z + ( 1 ) + ( 1 ) + ( 1 ), x = z and (l, l ) = (0, 1) in
Eq. (2.4)l,l in p. 956 of [37]. Setting

1

,
= k ,
= ,
x = zk1 = 11 + + k1
= (1) ,
we get

k 0 zk1 + (1)1 1 (zk )

= (1) 0 (zk )1 zk1 + (1)1 + k ek / 0 zk + (1)1 1 (zk1 ),
where k (1) has been evaluated by (5.7). In view of (5.8) and (5.9), the coefficients k , (1)
and k here are all positive and -independent. Moreover from Lemma 5.2, there is no cancellation of the leading terms coming from the two terms on the right-hand side. Therefore by taking
the UD limit lim+0 log() of the two sides and applying Lemma 5.3, we obtain
k1,d + k,d1 = max(k,d + k1,d1 , k,d + k1,d1 k ).
(5.12)
This coincides with (4.9) with $\rho$ replaced by $\tau$. Note that the range $2 \le d \le n + 1$ for both also matches. This completes the proof of Proposition 5.1. □
Let us compare the results in this section with the similar ones in Section IV of [26]. In [26],
the tau function is supposed to fulfill the periodicity k,d = k,d+n+1 in the present notation. This
(a) (a)
led to a reduction condition (Proposition 4.4 in [26]) on each pair of the parameters (pi , qi )
in (5.1), restricting the class of tau functions captured in the UD limit. In our approach, reduction
conditions are bypassed by the special choice of the parameters (5.5)(5.9) depending on the d
that enters the bilinear equation (5.12) to be proved. As will turn out in Section 7.3, the ultradiscrete tau functions derived here cover all the solutions of the box-ball system.
6. Asymptotic coincidence of $\tau_i$ and $\rho_i$
6.1. Statement and its reduction
In this section we prove
L

Proposition 6.1. Given a highest path p with length L, set p = p 1 1 and k0 = L + L .
Then the equalities (1 i n + 1)
t
t
0 (p ) =
0
k,i T
(6.1)
1 k k0 ,
k,i T (p )
t
t
k0 ,i T (p ) = k0 ,i T (p )
(6.2)
0 t t0
hold if t0 1 in (6.1), and if furthermore k0 Lt0 in (6.2).
Combined with Proposition 5.1, it establishes Theorem 4.9 and thereby completes the proof of
Theorem 2.1. Let ((0) , ((1) , r (1) ), . . . , ((n) , r (n) )) be the rigged configuration corresponding
t
(1)
(1)
to T0 (p ). Without loss of generality we assume 1 2 . Moreover from the condition
t0 1 and Proposition 3.5, we assume
(1)
(1)
(1)
1 r1 r2 r3 ,
ri(1) rj(1)
(1)
if (1)
i < j
(6.3)
throughout this section. From Remark 3.4 and k0 Lt0 , the state T0 (p ) takes the form:
t
a1
t0
T
(p ) = u1
b1

uL 1 1 ( ) 1 1,

(6.4)
where = (1 , . . . , L ) are the numbers such that p B1 BL and p B1L is a highest

path.
Lemma 6.2. Under the same condition as Proposition 6.1, the following relation holds:
t
t+1
t
t+1
k0 ,i T
(p ) k0 ,i T
(p ) = k0 ,i T
(p ) k0 ,i T
(p ) (0 t t0 1).
Proof. Suppose ((0) , ((1) , r (1) ), . . . , ((n) , r (n) )) is the rigged configuration for T0 (p ). From
t (p ))
the definition (4.1) and the assumed situation (6.4), it is easily seen that k0 ,i (T
t+1

k0 ,i (T (p )) is the number of balls with colors 2, . . . , n + 1 contained in p, which is
t (p ))
t+1
equal to |(1) |. To calculate k0 ,i (T
k0 ,i (T (p )), we apply the formula (2.23).
t (p )) is obtained by replacing there with (1k0 L ) and r (1) with r (1) (t t)(1)
k0 ,i (T
0
i
i
i
by Proposition 3.5. Then the max contains k0 only via min( (1k0 L ), ), hence one can
t (p )) =
let it be achieved at = (1) by taking k0 sufficiently large. Consequently k0 ,i (T
(1)
min( (1k0 L ), (1) ) min((1) , (1) ) |r (1) | + (t0 t)|(1) | + i ((1) ) for any 0 t t0
t (p ))
t+1
(1)
as long as k0 Lt0 . Therefore k0 ,i (T
k0 ,i (T (p )) = | | in agreement with
t

t+1

k0 ,i (T (p )) k0 ,i (T (p )). 2
t
By Lemma 6.2, (6.2) is attributed to t = t0 case. Thus the proof of Proposition 6.1 reduces to
showing (6.1), on which we shall concentrate from now on.
Lemma 6.3. For 1 k L, k,i (T0 (p )) = k,i (T0 (p )) = 0. For L < k k0 , the following
relations hold:
t
0 (p ) =
k,i T
(6.5)
kL,i (p),
t
0
k,i T (p ) = kL,i (p),

(6.6)
t
where p B1L is defined in (6.4).

Proof. For k,i , the assertion is obvious from (6.4) and the definition (4.1). As for k,i ,
t
we use the expression (2.23) for T0 (p ) which corresponds to the rigged configuration
(0)
(1)
(1)
(n)
(n)
( , ( , r ), . . . , ( , r )).

(1)
i ( ) = max min(, ) min(, ) |s| + i () , (0) .
(6.7)
(1)

(1)
(1)
According to (6.4), we have (0) = (1L ). From Proposition 3.5, we know that ri = i t0 +
(1)
(1)
ri , where ri is the rigging for p . (This is also equal to the rigging for p in Proposition 6.1
although this fact is not used below.) Thus t0 enters (6.7) only via |s| = ||t0 + |s |, where |s |
is t0 -independent. Fixing = (1 , . . . , k ) with 1 k L and taking t0 sufficiently large, we
t
(1)
see that the maximum (6.7) forces the choice = . This yields k,i (T0 (p )) = i () = 0 for
1 k L, where the latter equality is due to Lemma 2.3.
The maximum can be different from 0 for L < k k0 , where we are allowed to take k so large
up to k0 depending on t0 . This corresponds to the situation (6.6), which will be considered in the
sequel. To compute the right-hand side of (6.6) by (6.7), we need to know the rigged configuration
t
for p.
In view of (6.4), it is obtained from the one ((0) , ((1) , r (1) ), . . . , ((n) , r (n) )) for T0 (p )

(1)
(1)
(1)
(1)
L
by replacing (0) with (1L ) and the rigging ri with ri = ri j =1 min(j , i ). See
Lemma C.3. This amounts to changing |s| in (6.7) to |s| min(, ) in the notation (2.3). Thus
by setting = (1kL ), we get

= max min 1kL , min(, ) |s| min(, ) + i(1) ()
kL,i (p)
(1)

(1)
= max min 1kL , min(, ) |s| + i () .
(1)
Since (1kL ) = (1 , . . . , k ), this is nothing but the expression of k,i (T0 (p ))

by (6.7). 2
(0)
(0)
Thanks to Lemma 6.3, we may assume = in (6.4) without loss of generality.

k
To summarize so far, we have reduced Proposition 6.1 to (6.1) for p such that p B1 0 .

Resetting the meaning of p, p , p,
L, L and k0 , we restate it as
Proposition 6.4. Let $p \in B_1^{\otimes L}$ be a highest path and $((1^L), (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ be its rigged configuration with $\mu^{(1)}_1 \le \mu^{(1)}_2 \le \cdots$. If $L$ is sufficiently large and the condition (6.3) is satisfied, the equality
$$\tau_{k,i}(p) = \rho_{k,i}(p) \tag{6.8}$$
is valid for $1 \le k \le L$, $2 \le i \le n + 1$.
A highest path $p \in B_1^{\otimes L}$ satisfying the assumption of Proposition 6.4 will be called an asymptotic state. We have excluded the $i = 1$ case since it is contained as the $i = n + 1$ case of $T_\infty(p)$, which is also an asymptotic state. See (2.19), Proposition 3.5 and (4.2). The remainder of this section is devoted to the proof of Proposition 6.4. Our strategy is to express both sides of (6.8) in terms of the quantities associated with the smaller algebra $A^{(1)}_{n-1}$ and to invoke induction with respect to $n$. Note that the induction allows us to use Theorem 2.2 with $1 \le a \le n - 1$ and Theorem 4.9 for $A^{(1)}_{n-1}$.
6.2. Precise description of asymptotic states
The KKR bijection from rigged configurations to highest paths is known to be equivalent to the vertex operator construction [34,35]. Here we utilize the notions in the latter formalism such as scattering data and normal ordering explained in Appendix D. In particular, we remark that a scattering data $b_1[d_1] \otimes \cdots \otimes b_N[d_N] \in \mathrm{Aff}(B^{\ge 2}_{\mu_1}) \otimes \cdots \otimes \mathrm{Aff}(B^{\ge 2}_{\mu_N})$ for an asymptotic state is normal ordered if and only if $\mu_1 \le \cdots \le \mu_N$.
Lemma 6.5. For an asymptotic state p, denote any successive tensor product components of the
normal ordered scattering data by

d2
d1
(6.9)
Let the semistandard tableaux A and B be

2
2 a1 a2 alA n + 1,
2
2 b1 b2 blB n + 1.
= a1 a2 . . . alA BlA ,
= b1 b2 . . . blB BlB ,
Then locally p has the form:

d1 d2

11blB b2 b1 11 1 alA a2 a1 11 .
(6.10)
Proof. Since p is an asymptotic state, we have lB lA . We divide the proof into two cases.
Case 1. Assume lB < lA . From the definition of the modes of scattering data (D.3), we have
lB d1 d2 for asymptotic states. Therefore, the calculation of the vertex operator goes as (see
around (E.2) for the explanation of B )
d1 d2

B (11 1 alA a2 a1 )
d1 d2 lB

= blB b2 b1 11 1 TlB (alA a2 a1 )
d1 d2

= blB b2 b1 11 1 alA a2 a1 ,
where $T_l$ is the time evolution of the box-ball system with a carrier of capacity $l$ (3.1).
Case 2. Next, consider the case lB = lA = l. Let the energy function be H = H (B A). Applying
the definition of the mode (D.3) to (6.9), we have d2 = l +rB +h and d1 = l +rA +h+H (B A),
where rA , rB are the riggings for A, B and h denotes the last term in (D.3) for d2 here. Since the
asymptotic state satisfies the condition (6.3), we have rB rA , leading to H d1 d2 (=: ). If
l, the proof is the same as Case 1. Therefore assume H < l in the following. Calculating
the action of B , we arrive at the following situation:
1
1 1
al a2
a1
al a2
1
B
B
bl
bl1 bl+1
a1
where B = 1 b1 b2 bl . The diagram says that B al a ( ) under the combinatorial R. Let us show that a = bl . For the purpose, we first claim bl < al . In fact, suppose
bl al on the contrary. We construct the pairs for B A according to the graphical rule in
Appendix B to compute H = H (B A). We know that there are H winding pairs irrespective of
the ways of making pairs. Since bi is weakly increasing with respect to i, we see that more than
+ 1(> H ) is satisfy bi al . On the other hand, al is the largest letter in A, therefore all the
letters in B greater than al have to constitute winding pairs, and we have seen that the number of
these winding pairs is greater than H . This is a contradiction. Therefore we obtain bl < al .
We have seen that bl < al , and we know that bl is the largest number in B . When
we construct the pairs for B al , this fact means that bl and al form an unwinding pair.
Therefore, the action of the combinatorial R is given by B al a ( ) with a = bl . By
continuing the same argument, we arrive at (6.10). 2
In what follows we use the notation explained in Section 2.1.
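Lemma 6.6. Suppose that Proposition 6.4 is true for $A^{(1)}_{n-1}$. For a rigged configuration $((1^L), (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ with $\mu^{(1)} = (\mu_1, \ldots, \mu_N)$ and $r^{(1)} = (r_1, \ldots, r_N)$, assume $\mu_1 \le \cdots \le \mu_N$. Then the corresponding scattering data $b_1[d_1] \otimes \cdots \otimes b_N[d_N] \in \mathrm{Aff}(B^{\ge 2}_{\mu_1}) \otimes \cdots \otimes \mathrm{Aff}(B^{\ge 2}_{\mu_N})$ is given by
$$b_M = (x_2, \ldots, x_{n+1}), \qquad x_i = \tau^{(1)}_{M,i} - \tau^{(1)}_{M-1,i} - \tau^{(1)}_{M,i-1} + \tau^{(1)}_{M-1,i-1}, \tag{6.11}$$
$$d_M = |\mu_{[M]}| + r_M + \tau^{(1)}_{M-1,n+1} - \tau^{(1)}_{M,n+1}. \tag{6.12}$$
This lemma is shown without assuming that the scattering data $b_1[d_1] \otimes \cdots \otimes b_N[d_N]$ is normal ordered.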
Proof. From the arguments in Section 6.1, the assumption makes Theorem 2.1 for $A^{(1)}_{n-1}$ valid. Then (6.11) is a corollary of Theorem 2.2 with $a = 1$. According to the definition (D.3), the mode $d_M$ is given by
$$d_M = \mu_M + r_M + \sum_{1 \le j < M} H\bigl(b_j \otimes b_M^{(j+1)}\bigr). \tag{6.13}$$
On the other hand, combining (2.14) and (4.13) with $i = n + 1$, we have $E_{n+1}(b_1 \otimes \cdots \otimes b_M) = \sum_{1 \le j < m \le M} \bigl(\min(\mu_j, \mu_m) - H(b_j \otimes b_m^{(j+1)})\bigr)$. (Since $b_1 \otimes \cdots \otimes b_M$ is $A_{n-1}$-highest, the first term in (4.13) vanishes.) We know $E_{n+1}(b_1 \otimes \cdots \otimes b_M) = \rho_{n+1}(b_1 \otimes \cdots \otimes b_M)$ by Proposition 4.6. Moreover, since Proposition 6.4 for $A^{(1)}_{n-1}$ is assumed, we are allowed to use Theorem 4.9 to set $\rho_{n+1}(b_1 \otimes \cdots \otimes b_M) = \tau^{(1)}_{M,n+1}$. Consequently, $\tau^{(1)}_{M,n+1}$ is expressed as
$$\tau^{(1)}_{M,n+1} = \sum_{1 \le j < m \le M} \Bigl(\min(\mu_j, \mu_m) - H\bigl(b_j \otimes b_m^{(j+1)}\bigr)\Bigr). \tag{6.14}$$
The formula (6.12) is a corollary of (6.13), (6.14) and the condition $\mu_1 \le \cdots \le \mu_N$.
Given a rigged configuration $((1^L), (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ with $\mu^{(1)} = (\mu_1, \ldots, \mu_N)$ and $r^{(1)} = (r_1, \ldots, r_N)$, we introduce the numbers
$$k_{M,i} = \min(\mu_{[M]}, \mu_{[M]}) - \min(\mu_{[M-1]}, \mu_{[M-1]}) + r_M + \tau^{(1)}_{M-1,i} - \tau^{(1)}_{M,i} \tag{6.15}$$
for $1 \le M \le N$, $1 \le i \le n + 1$.
Lemma 6.7. Suppose that Proposition 6.4 is true for $A^{(1)}_{n-1}$. Let $((1^L), (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ be a rigged configuration for an asymptotic state. Set $\mu^{(1)} = (\mu_1, \ldots, \mu_N)$ with $\mu_1 \le \cdots \le \mu_N$. Then the following relations are valid:
$$k_{M,n+1} \le k_{M,n} \le \cdots \le k_{M,2} \le k_{M,1} \qquad (1 \le M \le N), \tag{6.16}$$
$$k_{M,1} \le k_{M+1,n+1} \qquad (1 \le M \le N - 1), \tag{6.17}$$
$$k_{M,1} - k_{M,n+1} = \mu_M \qquad (1 \le M \le N). \tag{6.18}$$
Proof. By the assumption we may use Lemma 6.6. The scattering data $b_1[d_1] \otimes \cdots \otimes b_N[d_N]$ considered there should be understood as a normal ordered one here because we deal with an asymptotic state and assume $\mu_1 \le \cdots \le \mu_N$. See the remark before Lemma 6.5. From the definition (6.15), $k_{M,i-1} - k_{M,i} = \tau^{(1)}_{M-1,i-1} - \tau^{(1)}_{M,i-1} - \tau^{(1)}_{M-1,i} + \tau^{(1)}_{M,i}$ for $2 \le i \le n + 1$. This is equal to $x_i$ in (6.11), hence nonnegative, proving (6.16). Summing this over $2 \le i \le n + 1$ we get (6.18). Comparing (6.12) and (6.15), we have $k_{M,n+1} = d_M + |\mu_{[M-1]}|$. Therefore $k_{M+1,n+1} - k_{M,1} = k_{M+1,n+1} - k_{M,n+1} - \mu_M = d_{M+1} - d_M$. Since the $d_i$'s are the modes of normal ordered scattering data, this is nonnegative, showing (6.17). □
Now we are ready to determine the precise form of asymptotic states from the associated
rigged configurations.
Lemma 6.8. Suppose that Proposition 6.4 is true for $A^{(1)}_{n-1}$. For an asymptotic state $p$, let $((1^L), (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ be its rigged configuration and $\mu^{(1)} = (\mu_1, \ldots, \mu_N)$ with $\mu_1 \le \cdots \le \mu_N$. Then $p = p_1 \otimes \cdots \otimes p_L \in B_1^{\otimes L}$ is given by
$$p_k = \begin{cases} i & k_{M,i} < k \le k_{M,i-1} \quad (2 \le i \le n + 1,\ 1 \le M \le N), \\ 1 & k_{M,1} < k \le k_{M+1,n+1} \quad (0 \le M \le N), \end{cases} \tag{6.19}$$
where $k_{0,1} = 0$, $k_{N+1,n+1} = L$. Namely $p$ has the form:
$$1 1 \cdots 1 1\,(b_1)\,1 1 \cdots 1 1\,(b_M)\,1 1 \cdots 1 1\,(b_{M+1})\,1 1 \cdots 1 1\,(b_N)\,1 1 \cdots 1 1, \tag{6.20}$$
where the segment $(b_M) \in B_1^{\otimes \mu_M}$ (soliton) looks as
$$\underbrace{n{+}1, \ldots, n{+}1}_{k_{M,n+1} < k \le k_{M,n}},\ \underbrace{n, \ldots, n}_{k_{M,n} < k \le k_{M,n-1}},\ \ldots,\ \underbrace{i, \ldots, i}_{k_{M,i} < k \le k_{M,i-1}},\ \ldots,\ \underbrace{2, \ldots, 2}_{k_{M,2} < k \le k_{M,1}}. \tag{6.21}$$
Note that Lemma 6.7 guarantees that the regions of $k$ appearing in (6.19) form a disjoint union decomposition of $1 \le k \le L$.
Proof. By the assumption we may use Lemmas 6.6 and 6.7. In particular we use the notation $x_i$ and $d_M$ in Lemma 6.6. Lemma 6.5 tells that $p$ indeed has the form (6.20). The segment $(b_M)$ has the left end at $k = d_M + |\mu_{[M-1]}| + 1$ and is arranged as $x_{n+1}$ letters $n + 1$, followed by $x_n$ letters $n$, and so on down to $x_2$ letters $2$, with $x_i$ specified by (6.11). From the proof of Lemma 6.7, we find that $k = k_{M,n+1} + 1$ and $x_i = k_{M,i-1} - k_{M,i}$. Therefore it looks as (6.21). □
6.3. Evaluation of $\tau_i$ and $\rho_i$ on asymptotic states

First we evaluate the tau function $\tau_{k,i}$ of asymptotic states in terms of $\tau^{(1)}_i$.

Lemma 6.9. Suppose that Proposition 6.4 is true for $A^{(1)}_{n-1}$. If $((1^L), (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ is a rigged configuration for an asymptotic state with $\mu^{(1)} = (\mu_1, \ldots, \mu_N)$, $r^{(1)} = (r_1, \ldots, r_N)$ $(\mu_1 \le \cdots \le \mu_N)$, the associated tau function is given by
$$\tau_{k,i} = Mk - \min(\mu_{[M]}, \mu_{[M]}) - |r_{[M]}| + \tau^{(1)}_{M,i} \qquad (k_{M,i} < k \le k_{M+1,i}), \tag{6.22}$$
where $0 \le M \le N$, $1 \le i \le n + 1$ and $k_{0,i} = 0$, $k_{N+1,i} = L$.

Proof. From (2.23) we know
$$\tau_{k,i} = \max_{\nu \subseteq \mu^{(1)}} \bigl\{ \ell(\nu) k - \min(\nu, \nu) - |s| + \tau^{(1)}_i(\nu) \bigr\}. \tag{6.23}$$
Since $s$ is the rigging attached to $\nu$, and $\nu$ runs over the subsets of $\mu^{(1)}$, whose rigging $r^{(1)}$ satisfies the asymptotic condition (6.3), the choice of $\nu$ that attains the maximum must be of the form $\nu = \mu_{[M]}$ for some $0 \le M \le N$. (We interpret $\mu_{[0]} = \emptyset$.) In terms of the notation (2.24), we have $\tau^{(1)}_i(\mu_{[M]}) = \tau^{(1)}_{M,i}$. In (6.23), the quantities in $\{\ \}$ at $\nu = \mu_{[M-1]}$ and $\nu = \mu_{[M]}$ become equal if and only if
$$Mk - \min(\mu_{[M]}, \mu_{[M]}) - |r_{[M]}| + \tau^{(1)}_{M,i} = (M \to M - 1). \tag{6.24}$$
This yields $k = k_{M,i}$ (6.15). Comparing the $k$-dependence $(M - 1)k$ and $Mk$, we conclude that $\nu = \mu_{[M]}$ gives a larger value than $\nu = \mu_{[M-1]}$ if $k_{M,i} < k$. Moreover we may use Lemma 6.7 by the assumption and therefore know that $\cdots < k_{M,i} < k_{M+1,i} < \cdots$. Thus we conclude that the maximum in (6.23) is attained at $\nu = \mu_{[M]}$ for $k_{M,i} < k \le k_{M+1,i}$, where $\tau_{k,i}$ is equal to the left-hand side of (6.24). □
Next we evaluate $\rho_{k,i}$ for asymptotic states.
Lemma 6.10. Under the same assumption as Lemma 6.9, $\rho_{k,i}$ for the asymptotic state is given by
$$\rho_{k,i} = \tau_{k,i} \qquad (1 \le k \le L,\ 2 \le i \le n + 1), \tag{6.25}$$
where the right-hand side is specified by (6.22).

Proof. By the assumption we may use Lemma 6.8, which specifies the concrete form of the asymptotic state as in (6.21). To evaluate $\rho_{k,i}(p)$ (4.1), we count only the balls of colors $2, 3, \ldots, i$ in $p$ itself and those of any color in $\{2, \ldots, n + 1\}$ in the subsequent states $T_\infty^{t}(p)$ with $t \ge 1$. From Proposition 3.5 and (6.15), the positions $k_{M,i}$ in (6.21) change as $k_{M,i} \to k_{M,i} + \mu_M$ under the time evolution. Due to $\mu_1 \le \cdots \le \mu_N$, there is no collision among the segments (solitons) $(b_M)$ in (6.20) under the time evolution. In view of these facts, the counting for $\rho_{k,i}$ within the region $k_{M,i} < k \le k_{M+1,i}$ is done as
$$\rho_{k,i} = \sum_{M'=1}^{M} (k - k_{M',i}). \tag{6.26}$$
From (6.15) and Lemma 2.3, this coincides with the right-hand side of (6.22).
Example 6.11. The following figure helps to understand the counting (6.26). Consider an asymptotic state in which the $M$th soliton is $(b_M) = 44332$. Its time evolution takes the form:
[Figure: time evolution of the soliton 44332 in the $(k, t)$ plane, with the position $k = k_{M,3}$ marked and a frame enclosing the counted balls.]
Here we have omitted $\otimes$, letters 1 and the other solitons for simplicity. Then the contribution to $\rho_{k,3}$ from the $M$th soliton comes from the balls within the frame, and their number is equal to $k - k_{M,3}$.
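The counting in this example can be reproduced by a direct simulation. The sketch below is illustrative only: it assumes that the time evolution $T_\infty$ acts by the standard box-ball rule (colors are processed from the largest to the smallest, and each ball is moved once to the nearest empty box on its right), which is not restated in this section, and it uses the counting described in the proof of Lemma 6.10. The function names `evolve` and `rho` and the sample offset are hypothetical.

```python
def evolve(state):
    """One time step on a list of letters: 1 is an empty box, 2, 3, ... are balls."""
    s = list(state)
    s += [1] * sum(1 for x in s if x != 1)  # spare empty boxes on the right
    for color in range(max(s, default=1), 1, -1):  # largest color moves first
        for pos in [j for j, x in enumerate(s) if x == color]:
            dest = next(j for j in range(pos + 1, len(s)) if s[j] == 1)
            s[pos], s[dest] = 1, color
    return s


def rho(state, k, i):
    """Balls of colors 2..i in the first k sites at t = 0,
    plus balls of any color in the first k sites at every t >= 1."""
    s = list(state)
    total = sum(1 for x in s[:k] if 2 <= x <= i)
    while any(x != 1 for x in s[:k]):
        s = evolve(s)
        total += sum(1 for x in s[:k] if x != 1)
    return total


# A single soliton 44332 placed after `offset` empty boxes (sites are 1-indexed).
offset = 3
state = [1] * offset + [4, 4, 3, 3, 2] + [1] * 4
# k_{M,i}: color i occupies the sites k_{M,i} < k <= k_{M,i-1}, as in (6.21).
k_M = {4: offset, 3: offset + 2, 2: offset + 4}
for i, k_Mi in k_M.items():
    for k in range(k_Mi + 1, len(state) + 1):
        assert rho(state, k, i) == k - k_Mi
```

For instance, with $i = 3$ and $k = 12$ the count is 3 balls at $t = 0$ and 4 balls at $t = 1$, in agreement with $k - k_{M,3} = 12 - 5 = 7$.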
Proof of Proposition 6.4. Due to Lemma 6.10 and induction on $n$, it now suffices to show the $n = 1$ case of Proposition 6.4 to complete its proof. It is from Lemma 6.6 onward that we relied on the $n - 1$ case. But when $n = 1$, all the subsequent assertions are easily derived by only using Lemma 6.5 and the definitions of the scattering data and normal ordering in Appendix D. In particular, all the formulas are valid by setting $\tau^{(1)}_{M,2} = 0$ and $\tau^{(1)}_{M,1} = -|\mu_{[M]}|$ in agreement with the definition under (2.23). Thus (6.11) becomes $b_M = (x_2)$ with $x_2 = \mu_M$, and (6.12) reads $d_M = |\mu_{[M]}| + r_M$. The definition (6.15) reads $k_{M,2} = k_{M,1} - \mu_M = \min(\mu_{[M]}, \mu_{[M]}) - \min(\mu_{[M-1]}, \mu_{[M-1]}) + r_M$. Using the fact that $r_M \le r_{M+1}$ for normal ordered scattering data, one can directly verify the properties (6.16)-(6.21). By using them Lemma 6.9 is shown for $n = 1$, and (6.22) reads $\tau_{k,2} = \tau_{k,1} + |\mu_{[M]}| = Mk - \min(\mu_{[M]}, \mu_{[M]}) - |r_{[M]}|$. Finally (6.25) can be checked by substituting the above $k_{M,2}$ into (6.26) with $i = 2$. This proves the $n = 1$ case of Proposition 6.4, therefore it is established for any $n$. □
Summary of proofs. We have finished proving Proposition 6.4. From the arguments in Section 6.1, it leads to Proposition 6.1. Combined with Proposition 5.1, Proposition 6.1 proves
Theorem 4.9 as explained in Section 4.4. Combined with (4.5), Theorem 4.9 proves Theorem 2.1.
In the course of these proofs, we have identified three basic quantities by Proposition 4.6 and Theorem 4.9: the tau function $\tau_i$ (2.19), which is a piecewise linear function on the rigged configuration, the CTM for the box-ball system $\rho_i$ (4.1), and the energy $E_i$ (4.12). We rephrase this as
Theorem 6.12. For any rigged configuration $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ and the corresponding highest path $p_1 \otimes \cdots \otimes p_L \in P_+(\mu^{(0)})$, the equality
$$\tau_i(p_1 \otimes \cdots \otimes p_k) = \rho_i(p_1 \otimes \cdots \otimes p_k) = E_i(p_1 \otimes \cdots \otimes p_k) \tag{6.27}$$
is valid for $1 \le i \le n+1$ and $1 \le k \le L$.
Note that the second equality (Proposition 4.6) has been shown even for non-highest states.
The generalization of the first equality to them will be done in Theorem 7.4. Before closing the
section we include a few immediate consequences.
Corollary 6.13. For k = L, Theorem 6.12 becomes

i (p1 pL ) = i (p1 pL ) = Ei (p1 pL ) = c(, r) (i) , (6.28)
where c(, r) is the value of (2.20) at the full choice ( (a) , s (a) ) = ((a) , r (a) ), and we
employ the convention |(n+1) | = 0 as in (2.19).
Proof. For i = n + 1, the equality En+1 (p1 pL ) = c(, r) is a consequence of the
known relation between the charge of rigged configurations and the energy of paths [10,22]. For
i general, we find from (4.1) that n+1 (p1 pL ) i (p1 pL ) is the number of balls
with colors i + 1, i + 2, . . . , n + 1 in p1 pL . By the definition of the KKR bijection, it is
equal to |(i) |. 2
Remark 6.14. Corollary 6.13 tells that if $\lambda = \mu^{(0)}$, the max (2.19) is attained at the full choice $(\nu^{(a)}, s^{(a)}) = (\mu^{(a)}, r^{(a)})$. In particular, (2.23) leads to
$$\tau^{(a)}_{n+1}(\mu^{(a)}) = \min(\mu^{(a)}, \mu^{(a+1)}) - \min(\mu^{(a+1)}, \mu^{(a+1)}) - |r^{(a+1)}| + \tau^{(a+1)}_{n+1}(\mu^{(a+1)}).$$
Now we are able to evaluate the conserved quantity El (3.2) for highest states in terms of the
rigged configurations.
Proposition 6.15. Let $p \in P_+(\mu^{(0)})$ be the highest state corresponding to the rigged configuration $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$. Then its row transfer matrix energy $E_l(p)$ (3.2) is given by $E_l(p) = \sum_j \min(l, \mu^{(1)}_j)$, which is $E^{(1)}_l$ in (2.8).
Proof. Combining Proposition 4.8 and Theorem 6.12, we have
$$E_l(p) = E_{n+1}(p) - E_{n+1}(T_l(p)) = \tau_{n+1}(\mu^{(0)}) - \tilde\tau_{n+1}(\mu^{(0)}).$$
Here, by Proposition 3.5, $\tilde\tau_{n+1}(\mu^{(0)})$ is obtained from $\tau_{n+1}(\mu^{(0)})$ by replacing the rigging $r^{(1)}_i$ with $\tilde r^{(1)}_i = r^{(1)}_i + \min(l, \mu^{(1)}_i)$. This amounts to changing $|s|$ in (2.23) (with $a = 0$, $d = n+1$) into $|s| + \sum_j \min(l, \nu_j)$. On the other hand, from Remark 6.14 we know that the max in (2.23) for $\lambda = \mu^{(0)}$ is attained at $\nu = \mu^{(1)}$. Therefore the difference $\tau_{n+1}(\mu^{(0)}) - \tilde\tau_{n+1}(\mu^{(0)})$ is equal to $\sum_j \min(l, \mu^{(1)}_j)$. □
Proposition 6.15 will be extended to non-highest states in Proposition 7.7.
7. N-soliton solutions of the box-ball system
As an application of Theorem 2.1, we present the solution of the initial value problem and the N-soliton solutions of the box-ball system. To cope with arbitrary states, not necessarily highest, we first introduce in Section 7.1 an extension of the rigged configurations for such states, which we expect is equivalent to those studied in [23,40]. We naturally extend the domain of the tau function to them. Generalizations of Theorems 2.1, 4.9 and 6.12 to arbitrary (non-highest) states are presented in Section 7.2. Based on these results, we give the solution of the initial value problem in Section 7.3. In Section 7.4 we derive several formulas for our tau functions in terms of the parameters that specify solitons. Together with (7.13), they yield the N-soliton solution of the box-ball system. Our approach provides the general solution, which accommodates an arbitrary number and arbitrary kinds of solitons. A class of special solutions has been constructed earlier in [26].
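7.1. $\tau_i$ for non-highest states
For $\mu^{(0)} = (\mu^{(0)}_1, \ldots, \mu^{(0)}_L) \in (\mathbb{Z}_{\ge 1})^L$, let $p \in B_{\mu^{(0)}_1} \otimes \cdots \otimes B_{\mu^{(0)}_L}$ be an arbitrary element, not necessarily highest. Set
$$\tilde p = p_{\rm vac} \otimes p, \tag{7.1}$$
$$p_{\rm vac} = (12 \ldots n)^{\otimes M_n} \otimes \cdots \otimes (12)^{\otimes M_2} \otimes 1^{\otimes M_1}, \tag{7.2}$$
where $(12 \ldots n)$ for example means $1 \otimes 2 \otimes \cdots \otimes n \in B_1^{\otimes n}$. The rigged configuration for $p_{\rm vac}$ is given by
$$rc_{\rm vac} = \bigl((1^{L_0}), ((1^{L_1}), (0^{L_1})), \ldots, ((1^{L_n}), (0^{L_n}))\bigr), \tag{7.3}$$
$$L_a = \sum_{b=1}^{n} (b - \min(a, b)) M_b = \sum_{b=a+1}^{n} (b - a) M_b \quad (0 \le a \le n). \tag{7.4}$$
Thus $L_n = 0$ and $((1^{L_n}), (0^{L_n}))$ actually means $(\emptyset, \emptyset)$. The vacancy number $p^{(a)}_j$ (2.7) for the configuration $((1^{L_0}), (1^{L_1}), \ldots, (1^{L_n}))$ of $rc_{\rm vac}$ is calculated as
$$\delta_{a,1} L_0 - \sum_{b=1}^{n} C_{a,b} L_b = M_a \tag{7.5}$$
for any $j \ge 1$. In (7.1), one can always make the state $\tilde p$ highest by taking $M_1, \ldots, M_n$ sufficiently large. In fact, the choice
$$M_a > m_{a+1} \quad (1 \le a \le n) \tag{7.6}$$
suffices, where $m_a$ denotes the total number of the letter $a$ contained in the tableau representation of $p$.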
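The vacuum data (7.2)-(7.5) are simple enough to check mechanically. The following is a minimal sketch, not part of the original construction: the function names are ours, letters are represented as plain integers, and the Cartan matrix of $A_n$ is hard-coded through its tridiagonal form.

```python
from typing import List


def vacuum_lengths(M: List[int]) -> List[int]:
    """Return [L_0, ..., L_n] from M = [M_1, ..., M_n] following (7.4)."""
    n = len(M)
    return [sum((b - a) * M[b - 1] for b in range(a + 1, n + 1))
            for a in range(n + 1)]


def vacuum_path(M: List[int]) -> List[int]:
    """Letters of p_vac in (7.2): (12...n)^{M_n} ... (12)^{M_2} 1^{M_1}."""
    n = len(M)
    path: List[int] = []
    for b in range(n, 0, -1):
        path.extend(list(range(1, b + 1)) * M[b - 1])
    return path


def vacancy_numbers(M: List[int]) -> List[int]:
    """Left-hand side of (7.5) for a = 1, ..., n (Cartan matrix of A_n)."""
    n = len(M)
    L = vacuum_lengths(M) + [0]  # pad with L_{n+1} = 0
    result = []
    for a in range(1, n + 1):
        # sum_b C_{a,b} L_b over 1 <= b <= n; the b = a - 1 term exists only for a > 1
        cartan_sum = 2 * L[a] - (L[a - 1] if a > 1 else 0) - L[a + 1]
        result.append((L[0] if a == 1 else 0) - cartan_sum)
    return result
```

For $(M_1, M_2, M_3) = (1, 1, 2)$ this gives $(L_0, L_1, L_2, L_3) = (9, 5, 2, 0)$ and the path 123123121, and the left-hand side of (7.5) reproduces $(M_1, M_2, M_3)$.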
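Let $(\tilde\mu, \tilde r) = (\tilde\mu^{(0)}, (\tilde\mu^{(1)}, \tilde r^{(1)}), \ldots, (\tilde\mu^{(n)}, \tilde r^{(n)}))$ be the rigged configuration for the highest state $\tilde p$. By the definition of the KKR bijection, it contains $rc_{\rm vac}$ (7.3) for $p_{\rm vac}$. By this we mean that $(\tilde\mu, \tilde r)$ can be depicted as follows ($n = 3$):
[Figure: the diagrams $\tilde\mu^{(0)}, \tilde\mu^{(1)}, \tilde\mu^{(2)}, \tilde\mu^{(3)}$, each showing the part $\mu^{(a)}$ together with a single column whose length is marked $L_0$, $L_1$, $L_2$ respectively.]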
Recall that (0) is not limited to a partition, therefore it is not necessarily a Young diagram.
(a) (a)
Neither (a) has been depicted so. As mentioned after (2.5), any reordering of {( i , ri )} for
each a should be understood as the same rigged configuration.
From the above rigged configuration $(\tilde\mu^{(0)}, (\tilde\mu^{(1)}, \tilde r^{(1)}), \ldots, (\tilde\mu^{(n)}, \tilde r^{(n)}))$, we extract the data $(\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)})$ by
$$\tilde\mu^{(a)} = (\mu^{(a)}_i)_{i=1}^{l_a} \cup (1^{L_a}), \qquad \mu^{(a)} = (\mu^{(a)}_i)_{i=1}^{l_a}, \tag{7.7}$$
$$\tilde r^{(a)} = (r^{(a)}_i + M_a)_{i=1}^{l_a} \cup (0^{L_a}), \qquad r^{(a)} = (r^{(a)}_i)_{i=1}^{l_a} \tag{7.8}$$
for $1 \le a \le n$, where $l_a = \ell(\mu^{(a)})$. The shift $M_a$ in defining $r^{(a)}_i$ by (7.8) has been introduced on account of (7.5) and the algorithm for the KKR bijection, especially Lemma C.3. As a result, $(\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)})$ become independent of $M_1, \ldots, M_n$ once these are sufficiently large.
Therefore the data $(\mu, r) = (\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ is determined unambiguously from $p \in B_{\mu^{(0)}_1} \otimes \cdots \otimes B_{\mu^{(0)}_L}$ by the prescription (7.1)-(7.8). We call $(\mu, r)$ the unrestricted rigged configuration for $p$, which we expect is equivalent to the one studied in [23,40]. For highest states, it coincides with the rigged configuration under the KKR bijection, but in general $(\mu^{(0)}, \mu^{(1)}, \ldots, \mu^{(n)})$ is not necessarily a configuration. The vacancy number $p^{(a)}_j$ (2.7) can become negative. The rigging $r^{(a)} \in \mathbb{Z}^{l_a}$ is no longer limited to the range (2.6) but obeys the relaxed condition $r^{(a)}_i \le p^{(a)}_{\mu^{(a)}_i}$ with some non-positive lower bound. We associate the tau function $\tau^{(a)}_d(\lambda)$ ($\lambda \subseteq \mu^{(a)}$) to an unrestricted rigged configuration $(\mu, r)$ by the same formula as (2.22). For $\lambda = \mu^{(0)}_{[k]}$ and $p = p_1 \otimes \cdots \otimes p_L$, we will also use the notation $\tau_i(\lambda) = \tau_{k,i} = \tau_i(p_1 \otimes \cdots \otimes p_k)$ as in (2.24).
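Example 7.1. Take $n = 3$ and consider the non-highest state $p$ and the highest state $\tilde p$ as
$$p = 344 \otimes 2 \otimes 13 \otimes 24 \in B_3 \otimes B_1 \otimes B_2 \otimes B_2, \qquad \tilde p = p_{\rm vac} \otimes p, \qquad p_{\rm vac} = 123123121 \in B_1^{\otimes 9},$$
where we have omitted $\otimes$ in $p_{\rm vac}$. The rigged configuration $(\tilde\mu, \tilde r)$ for $\tilde p$ is
[Figure: the rigged configuration $(\tilde\mu, \tilde r)$ drawn as diagrams with riggings.]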
We have
$$(M_1, M_2, M_3) = (1, 1, 2), \qquad (L_0, L_1, L_2, L_3) = (9, 5, 2, 0)$$
according to (7.4). Thus the definitions (7.7) and (7.8) yield the unrestricted rigged configuration $(\mu, r)$ depicted as
[Figure: the unrestricted rigged configuration $(\mu, r)$ drawn as diagrams with riggings.]
Since $p^{(3)}_3 = -2$, this is not a configuration.

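7.2. $\tau_i = \rho_i$ for non-highest states
Lemma 7.2. For any element $p \in B_{\mu^{(0)}_1} \otimes \cdots \otimes B_{\mu^{(0)}_L}$, let $p_{\rm vac}$, $L_a$, $(\tilde\mu, \tilde r)$ and $(\mu, r) = (\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ be as in (7.2)-(7.8). For a fixed $\lambda \subseteq \mu^{(0)}$, the tau function $\tilde\tau_i$ associated with the rigged configuration $(\tilde\mu, \tilde r)$ is decomposed as
$$\tilde\tau_i\bigl(\lambda \cup (1^{L_0})\bigr) = \tau_i(p_{\rm vac}) + L_1 \ell(\lambda) + \tau_i(\lambda), \tag{7.9}$$
$$\tau_i(p_{\rm vac}) = L_0 L_1 - \frac{1}{2} \sum_{1 \le a,b \le n} C_{a,b} L_a L_b - L_i \tag{7.10}$$
for sufficiently large $M_1, M_2, \ldots, M_n$. Here $\tau_i(p_{\rm vac})$ is the tau function for the rigged configuration $rc_{\rm vac}$ (7.3). The last term in the right-hand side of (7.9) is the tau function (2.19) associated with the unrestricted rigged configuration $(\mu, r)$.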
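Proof. Let us write down the left-hand side of (7.9) according to (2.19) and (2.20) as
$$\tilde\tau_i\bigl(\lambda \cup (1^{L_0})\bigr) = \max\Bigl\{ \min\bigl(\lambda \cup (1^{L_0}), \tilde\nu^{(1)}\bigr) - \frac{1}{2} \sum_{a,b} C_{a,b} \min\bigl(\tilde\nu^{(a)}, \tilde\nu^{(b)}\bigr) - \sum_a |\tilde s^{(a)}| - |\tilde\nu^{(i)}| \Bigr\}. \tag{7.11}$$
For $M_1, M_2, \ldots, M_n$ sufficiently large, one has $L_0 \gg L_1 \gg \cdots \gg L_{n-1} \gg 1$. In such a circumstance, one can show that the max can be limited to those $\tilde\nu^{(a)} \subseteq \tilde\mu^{(a)}$ that contain the $(1^{L_a})$ part entirely. Accordingly, we set
$$\tilde\nu^{(a)} = \nu^{(a)} \cup (1^{L_a}), \quad \nu^{(a)} \subseteq \mu^{(a)}, \qquad |\tilde s^{(a)}| = |s^{(a)}| + M_a \ell(\nu^{(a)}), \quad s^{(a)} \subseteq r^{(a)},$$
taking (7.7) and (7.8) into account. Substituting these forms into (7.11) and using the formula (7.5) and
$$\min\bigl(\nu^{(a)} \cup (1^{L_a}), \nu^{(b)} \cup (1^{L_b})\bigr) = L_a L_b + L_a \ell(\nu^{(b)}) + L_b \ell(\nu^{(a)}) + \min(\nu^{(a)}, \nu^{(b)}),$$
we obtain (7.9). The expression (7.10) is derived by means of (6.28). □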
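A decomposition parallel to (7.9) takes place also for $\rho_i$.
Lemma 7.3. Under the same setting as Lemma 7.2, set $p = p_1 \otimes \cdots \otimes p_L$ and take $\lambda = \mu^{(0)}_{[k]}$ in the notation (2.2), hence $\ell(\lambda) = k$. Then for $M_1, M_2, \ldots, M_n$ sufficiently large, the following relation is valid:
$$\rho_i(p_{\rm vac} \otimes p_1 \otimes \cdots \otimes p_k) = \rho_i(p_{\rm vac}) + L_1 k + \rho_i(p_1 \otimes \cdots \otimes p_k). \tag{7.12}$$
Proof. In view of $p_{\rm vac} \in B_1^{\otimes L_0}$, the time evolution of $p_{\rm vac} \otimes p_1 \otimes \cdots \otimes p_k$ under $T_\infty$ looks as follows ($n = 3$).
[Figure: time evolution pattern, showing a trapezoid in the bottom left, a hatched region and a parallelogram at the bottom.]
On the top row, the length $L_0$ part is $p_{\rm vac}$ and the length $k$ part is $p_1 \otimes \cdots \otimes p_k$. By the definition (4.1), $\rho_i(p_{\rm vac} \otimes p_1 \otimes \cdots \otimes p_k)$ is the number of balls with colors $2, \ldots, i$ on the top row and all the balls in the SW quadrant beneath it.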
For M1 , . . . , Mn sufficiently large, one has L0 M1 1. Moreover from the time evolution rule in Proposition 3.2, the left segment within pvac with length L0 M1 undergoes just
a translation to the right by one lattice unit under T . Thus this segment and the hatched region containing the balls are entirely separated by the strip 11 11 of empty boxes with width
M1 1. Therefore i (pvac p1 pk ) is decomposed into the contributions from pvac
(trapezoid in the bottom left), p1 pk (hatched region) and the parallelogram in the bottom. By the definition, the first two are equal to i (pvac ) and i (p1 pk ), respectively.
The last one yields L1 k because there are L1 balls in total in the left segment in pvac with length
L0 M1 . 2
Now we give the generalization of Theorems 4.9 and 6.12 to arbitrary (non-highest) states.
Theorem 7.4. For any state $p = p_1 \otimes \cdots \otimes p_L \in B_{\mu^{(0)}_1} \otimes \cdots \otimes B_{\mu^{(0)}_L}$, let $(\mu, r) = (\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ be the unrestricted rigged configuration, and let $\tau_i$ be the associated tau function. Then the equality (6.27), namely, $\tau_i(p_1 \otimes \cdots \otimes p_k) = \rho_i(p_1 \otimes \cdots \otimes p_k) = E_i(p_1 \otimes \cdots \otimes p_k)$, holds for $1 \le k \le L$.
Proof. The equality $\rho_i = E_i$ has already been shown in Proposition 4.6 for any state, so we only need to show $\tau_i = \rho_i$. Since $p_{\rm vac} \otimes p_1 \otimes \cdots \otimes p_k$ is a highest state associated with the rigged configuration $(\mu^{(0)}_{[k]} \cup (1^{L_0}), (\tilde\mu^{(1)}, \tilde r^{(1)}), \ldots, (\tilde\mu^{(n)}, \tilde r^{(n)}))$, Theorem 4.9 tells that (7.12) is equal to (7.9) with $\lambda = \mu^{(0)}_{[k]}$. Moreover it also tells that $\rho_i(p_{\rm vac}) = \tau_i(p_{\rm vac})$. □
Combining Theorem 7.4 with (4.5), we obtain a generalization of Theorem 2.1 to arbitrary states.
Corollary 7.5. For any element $p \in B_{\mu^{(0)}_1} \otimes \cdots \otimes B_{\mu^{(0)}_L}$, let $(\mu, r) = (\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ be the unrestricted rigged configuration. Then $p_k = (x_1, \ldots, x_{n+1}) \in B_{\mu^{(0)}_k}$ is expressed as
$$x_d = \tau_{k,d} - \tau_{k-1,d} - \tau_{k,d-1} + \tau_{k-1,d-1}$$
in terms of the tau function $\tau_{k,d} = \tau_d((\mu^{(0)}_1, \ldots, \mu^{(0)}_k))$ associated with $(\mu, r)$.
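The formula for $x_d$ is a mixed second difference of the table $(\tau_{k,d})$. The sketch below only illustrates this difference operator. The table is a stand-in built by double cumulative sums of a given path, not the tau function computed from a rigged configuration, and the function names are ours.

```python
from itertools import accumulate
from typing import List, Sequence, Tuple


def mixed_difference(tau: Sequence[Sequence[int]], k: int) -> Tuple[int, ...]:
    """Return (x_1, ..., x_{n+1}) at site k from a table tau[k][d], 0 <= d <= n+1."""
    row, prev = tau[k], tau[k - 1]
    return tuple(row[d] - prev[d] - row[d - 1] + prev[d - 1]
                 for d in range(1, len(row)))


def table_from_path(path: Sequence[Sequence[int]]) -> List[List[int]]:
    """Stand-in table: entry [k][d] is the sum of x_{k',d'} over k' <= k, d' <= d."""
    width = len(path[0]) + 1
    table = [[0] * width]
    for x in path:
        partial = [0] + list(accumulate(x))
        table.append([t + s for t, s in zip(table[-1], partial)])
    return table
```

Applied to the state $344 \otimes 2 \otimes 13 \otimes 24$ of Example 7.1, written as the occupation vectors (0,0,1,2), (0,1,0,0), (1,0,1,0), (0,1,0,1), the difference recovers each local state from the stand-in table. Adding any function of $k$ alone or of $d$ alone to the table does not change the result.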

7.3. N-soliton solution
To simplify the notation we write $\lambda$ in place of $\mu^{(0)}$ in this subsection. We shall exclusively treat the states $p = p_1 \otimes \cdots \otimes p_L \in B_{\lambda_1} \otimes \cdots \otimes B_{\lambda_L}$ such that $L$ is formally infinite and the boundary condition $p_k = u_{\lambda_k}$ is satisfied for $k \gg 1$. Under such a setting, the right-hand side of the inequality (7.6) is still finite, therefore all the arguments in Sections 7.1 and 7.2 remain valid. Our solution of the initial value problem of the box-ball system is formulated as
Theorem 7.6. For any initial state $p = p_1 \otimes p_2 \otimes \cdots \in B_{\lambda_1} \otimes B_{\lambda_2} \otimes \cdots$, let $(\mu, r) = (\lambda, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ be its unrestricted rigged configuration. Then the state after the time evolution $p'_1 \otimes p'_2 \otimes \cdots = T_{l_1} T_{l_2} \cdots T_{l_t}(p)$ is expressed as $p'_k = (x_1, \ldots, x_{n+1}) \in B_{\lambda_k}$ with
$$x_d = \tau_{k,d} - \tau_{k-1,d} - \tau_{k,d-1} + \tau_{k-1,d-1}. \tag{7.13}$$
Here $\tau_{k,d} = \tau_d((\lambda_1, \ldots, \lambda_k))$ is the tau function (2.18)-(2.20) associated with $(\lambda, (\mu^{(1)}, \tilde r^{(1)}), (\mu^{(2)}, r^{(2)}), \ldots, (\mu^{(n)}, r^{(n)}))$, where $\tilde r^{(1)}_i = r^{(1)}_i + \sum_{j=1}^{t} \min(l_j, \mu^{(1)}_i)$.
Proof. This is a consequence of Corollary 7.5 and Proposition 3.5. □
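In Theorem 7.6 the whole time dependence sits in the first rigging, which moves linearly. A minimal sketch of this update, with names of our own choosing, reads:

```python
from typing import List, Sequence


def evolve_riggings(mu1: Sequence[int], r1: Sequence[int],
                    ls: Sequence[int]) -> List[int]:
    """Shift each rigging r_i by sum_j min(l_j, mu_i), as in Theorem 7.6.

    mu1: parts of mu^(1); r1: riggings r^(1); ls: the indices l_1, ..., l_t
    of the applied time evolutions.
    """
    return [r + sum(min(l, m) for l in ls) for m, r in zip(mu1, r1)]
```

Since the shift is a sum over the applied $T_{l_j}$, the result does not depend on their order, and applying two batches of evolutions one after the other is the same as applying them at once.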
Let us evaluate the conserved quantity $E_l$ (3.2) in terms of the data $(\lambda, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$.
Proposition 7.7. For any state $p = p_1 \otimes p_2 \otimes \cdots \in B_{\lambda_1} \otimes B_{\lambda_2} \otimes \cdots$, let $(\mu, r) = (\lambda, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ be its unrestricted rigged configuration. Then the row transfer matrix energy $E_l(p)$ (3.2) is given by $E_l(p) = \sum_j \min(l, \mu^{(1)}_j)$.
When $p$ is highest, this reduces to Proposition 6.15.
Proof. Let $\tilde p$ be the highest state (7.1) and let $(\tilde\mu, \tilde r)$ be the corresponding rigged configuration. Proposition 6.15 tells that
$$E_l(\tilde p) = \sum_j \min(l, \tilde\mu^{(1)}_j) = \sum_j \min\bigl(l, (\mu^{(1)} \cup (1^{L_1}))_j\bigr) = L_1 + \sum_j \min(l, \mu^{(1)}_j),$$
where we have substituted (7.7) into $\tilde\mu^{(1)}_j$. On the other hand, due to $M_1 \gg 1$ in (7.2) and the property (3.3), $E_l(\tilde p)$ is decomposed as $E_l(\tilde p) = E_l(p_{\rm vac}) + E_l(p)$. It is easy to check $E_l(p_{\rm vac}) = L_1$ by counting the non-winding number using the graphical rule in Appendix B. □
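Proposition 7.7 expresses the energies $E_l$ through the parts of $\mu^{(1)}$, and the relation can be inverted: the number of parts equal to $l$ is the second difference $2E_l - E_{l-1} - E_{l+1}$. The following sketch, with function names of our own and with $E_0 = 0$ understood, illustrates both directions.

```python
from typing import List, Sequence


def energies(amplitudes: Sequence[int], lmax: int) -> List[int]:
    """Return [E_0, E_1, ..., E_lmax] with E_l = sum_j min(l, mu_j)."""
    return [sum(min(l, a) for a in amplitudes) for l in range(lmax + 1)]


def amplitudes_from_energies(E: Sequence[int]) -> List[int]:
    """Recover the amplitudes from E_0, ..., E_lmax, assuming lmax exceeds the largest one."""
    found: List[int] = []
    for l in range(1, len(E) - 1):
        multiplicity = 2 * E[l] - E[l - 1] - E[l + 1]  # number of parts equal to l
        found.extend([l] * multiplicity)
    return sorted(found, reverse=True)
```

For the amplitudes (3, 1, 1) one finds $E_1 = 3$, $E_2 = 4$ and $E_l = 5$ for $l \ge 3$, and the second differences return two parts of size 1 and one part of size 3.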
Following [26,27,31], we call those states $p$ of the box-ball system such that $E_l(p) = \sum_{j=1}^{l_1} \min(l, \mu^{(1)}_j)$ $l_1$-soliton states with amplitudes $\mu^{(1)}_1, \ldots, \mu^{(1)}_{l_1}$. Thus Proposition 7.7 tells that any state of the box-ball system is an $l_1$-soliton state for some $l_1$. Moreover, Theorem 7.6 asserts that in the unrestricted rigged configuration $(\lambda, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$, the $A^{(1)}_{n-1}$ part $(\mu^{(1)}, (\mu^{(2)}, r^{(2)}), \ldots, (\mu^{(n)}, r^{(n)}))$ is the conserved quantity, among which $\mu^{(1)}$ provides the list of amplitudes of solitons. In the remainder of this section we set
$$l_1 = N, \qquad \mu^{(1)} = (\mu_1, \ldots, \mu_N), \qquad r^{(1)} = (r_1, \ldots, r_N),$$
and rewrite the tau function in terms of the parameters that specify solitons. These parameters are equivalent to the conserved quantity $(\mu, (\mu^{(2)}, r^{(2)}), \ldots, (\mu^{(n)}, r^{(n)}))$ as we will see shortly. The result yields the general N-soliton solution of the box-ball system, which supplements the special solution in [26].
(1)
From [27,30,31], it is known that N -soliton states in the An boxball system are labelled
2
2
2
2
(1)
with the An1 affine crystal Aff(B1 ) Aff(BN ). The classical part B1 BN
parametrizes the internal degrees of freedom of solitons. The affine part is incorporated in the
integers r1 , . . . , rN , and specifies the positions of the solitons. Thus we start with any such data
b1 bN B2
B2
,
N
1
(r1 , . . . , rN ) ZN ,
(7.14)
where we call each bi a soliton. Let (, ((2) , r (2) ), . . . , ((n) , r (n) )) be the unrestricted rigged
configuration for b1 bN . Without loss of generality we assume
1 N ,
ri rj
if i = j
(1)
and
i < j.
(1)
(7.15)
For any , let us express the An1 tau function i () associated with (, ((2) , r (2) ),
. . . , ((n) , r (n) )) in terms of b1 , . . . , bN . We parametrize (1 , . . . , N ) as = (j1 , . . . , jM )
in terms of the subset J = {j1 < < jM } {1, 2, . . . , N}. From the array of N solitons

2
2
b1 bN we extract an element in Bj1 BjM by sending the corresponding components to the left by the combinatorial R as follows:
2
2
2
2
(1)
(M)
B1 BN Bj1 BjM ( ),
b 1 bN
bj1 bjM ( ).
(7.16)
A caution is necessary about this notation. Consider for instance N = 3, M = 2 cases

(2)
b1 b2 b3 b1 b3 ()
(1)
b2
(2)
b3
for J = {1, 3},
()
for J = {2, 3}.
Obviously, the elements represented by the same symbol b3(2) in the two lines are not equal in
()
general. In this way, bj is uniquely determined only by further specifying J except = 1. In
what follows we will always take it for granted that J has been prescribed.
(1)
From Theorem 7.4 for An1 , we know
(1)
(1)
(M)
i () = Ei bj1 bjM
(2 i n + 1).
(1)
Applying the formula (4.13) for An1 to the right-hand side we get
(1)
(1)
(1)
(1)
(1)
(M)
i () =
bj,3 + bj,4 + + bj,i + Ei bj1 bjM
(2 i n + 1),
j J
(1)
(1)
(1)
where bj = (bj,2 , . . . , bj,n+1 ) is the representation in terms of the number of tableau letters
(1)
as in (2.11). The case i = 1 needs an independent derivation. We recall the definition 1 () =

(1)
n+1 () || given just before (2.22). Substituting the above formula with i = n + 1 to this, we
find the result is unified into the single formula
(1)
(1)
(1)
i(1) () = ||
bj,i+1 + + bj,n+1 + bj,2
j J
+ Ei
(1)
(M)
bj1 bjM
(1 i n + 1),
(7.17)
under the convention

(1)
(1)
(M)
(M)
E1 bj1 bjM = En+1

bj1 bjM .
(1)
This is natural in view of the mod n structure of the indices in An1 . Similarly, the sum in (7.17)
(1)
(1)
may well be written as bj,i+1 + + bj,n+2 .

(1)
Now we are ready to express the An tau function (2.23) associated with (, (, r), ((2) , r (2) ),
. . . , ((n) , r (n) )):

(1)
k,i = max min([k] , ) min(, ) |s| + i () (k 1, 1 i n + 1)
(7.18)
in terms of the solitons b1 bN and their positions r1 , . . . , rN . We parametrize by J =

{j1 , . . . , jM } {1, . . . , N } as before, and introduce the functions:
(1)
(1)
(1)
k,i (j ) = min([k] , j ) rj bj,i+1 + + bj,n+1 + bj,2 (j J ),
(7.19)
i (J ) = 2
(1)
(M)
min(l , m ) Ei bj1 bjM ,
(7.20)
l,mJ
l<m

where min([k] , j ) = km=1 min(m , j ) according to (2.3). (To simplify the formula,
min(l , m ) has been
(7.17) into (7.18) and noting that
kept as it is despite (7.15).) Substituting

min(, ) || = 2 l,mJ,l<m min(l , m ) and |s| = j J rj , we find that k,i is expressed as

k,i = max
(7.21)
k,i (j ) i (J )
J {1,...,N }
j J
for k 1, 1 i n + 1. We introduce k,0 = k,n+1 |[k] | according to (2.18). Then by Theorem 7.6, the local states are specified by (7.13) and the time evolution Tl is given by changing rj
to rj + min(l, j ), i.e., k,i (j ) into k,i (j ) min(l, j ).
Using the formula (7.21), it is easy to evaluate the local state (7.13) explicitly for k 1
if k = 1 in this region and the condition (6.3) (without the super script (1) in the present
notation) is satisfied. It yields the asymptotic state of the boxball system well after the collisions
of solitons. Omitting the derivation similar to Lemma 6.8, we give the final result:
w

11 11(b1 )11 11(bM ) 11 11(bM+1 )11 11(bN )11 11 ,
(7.22)
2
where 1 B1 and the symbol has been suppressed. For each bM = (x2 , . . . , xn+1 ) BM ,
(bM ) B1 M stands for the array

xn+1
x2
xn

n + 1n + 1nn22.
In (7.22), the interval of adjacent solitons is given by w = rM+1 rM + , where is a constant

independent of r1 , . . . , rN . Therefore if M < M+1 , we have w 1 due to rM rM+1 . In case
M = M+1 , we have
w = rM+1 rM + H (bM bM+1 ) H (bM bM+1 )
(7.23)
(1)
because of (7.15). Here H (bM bM+1 ) is the energy (2.14) for An1 crystals. It is known
(cf. [27,31]) that H (bM bM+1 ) is the minimum distance until which the solitons of the same
amplitude can get close. Therefore (7.23) is consistent with the fact that the tau function (7.21)
constructed from the data (7.14) covers all the N -soliton solutions.
Our formula (7.21) possesses a structure
analogous to the well known tau function of the KP
hierarchy [37]. For each J , the sum j J k,i (j ) is the superposition of individual solitons,
whereas the quantity i (J ) reflects a multi-body effect. A characteristic feature in k,i (j ) (7.19)
(1)
is that it contains bj in (7.16) rather than bj that appears in the asymptotic state (7.22). As for
i (J ), using the definition (4.11), it is factorized into the two-body function as

()
(+1)
i (J ) =
(7.24)
,
S i b j b j
1<M
Si (b c) = 2 min(l, m) Qi (b c)
(+1)
Here bj
()
2
b c Bl

2
.
Bm
is determined by sending bj in (7.16) to the left by the combinatorial R as
(7.25)

()
(1)
()
(1)
()
(1)
()
(1)
bj1 bj bj bj1 bj bj
(+1)
bj1 bj bj
()
( ).
Si in (7.25) is equal to min(l, m) plus the ith winding number min(l, m) Qi (b c). For i =
n + 1, it has been identified as the two-body phase shift of the solitons labelled with b and c
[27,31]. Thus i (J ) can be regarded as a generalization of it to the multi-body phase shift for an
arbitrary color i.
7.4. Alternative forms of N -soliton solution
We retain the notation in the previous subsection. The N -soliton solution (7.21) has been
expressed in terms of the parameters in (7.14). Here we rewrite it further in terms of the scattering
data (Appendices D and E):
2
2
Aff b
,
b1 [d1 ] bN [dN ] Aff b
(7.26)
N
1

(k+1)
, b0 = 2l (l 1).
H bk b j
dj = rj +
(7.27)
0k<j
Our task is essentially to switch from the position (rigging) rj to the mode dj . See (2.16) for
the symbol 2l . The mode dj here is a natural generalization of the one defined by (D.3). In
fact, when b1 bN is a highest element with respect to An1 , one has bj(1) = 2j and
(1)
H (b0 b1 ) = j , hence (7.27) reduces to (D.3). The mode is transformed according to (A.1)
under the combinatorial R. The affinization of (7.16) reads
2
2
2
2
Aff B1 Aff BN Aff Bj1 Aff BjM ( ),
(1) (1)
(M) (M)
(7.28)
b1 [d1 ] bN [dN ] bj1 dj1 bjM djM ( ).
()
()
For the notation dj , the same caution as for bj is necessary as mentioned under (7.16). Ap(1)
(1)
(M)
(M)
plying the definition (7.27) to bj1 [dj1 ] bjM [djM ] in the above, we find

()
(+1)
()
H b j b j
,
dj = rj +
0<
(0)
where the notation is the same as (7.24) and we have employed the convention j0 = 0 and b0 =
2
(1)
(1)
b0 . The element b0 Bl in (7.27) is the An1 analogue of u appearing in (4.12) for An . By
(1)
using (4.11), (4.12) and (2.14) for An1 crystals, this can be rewritten as
(1)
(1)
()
()
(1)
dj = rj En+1 bj1 bj + En+1 bj1 bj1

+
min(j , j ),
0
where min(0 , j ) = j . Taking the sum over and using (4.13), we get
M

=1
()
dj =

j J
rj +

j J
(1)
(1)
(M)
bj1 bjM
bj,2 En+1
1M
min(j , j ),
(7.29)

(1)
(1)
where we have used bj,2 + + bj,n+1 = j . On the other hand, from Corollary 7.5 we deduce
M

()
()
(1)
(1)
bj ,2 + + bj ,i = i () 1 ()
=1
(1)
(1)
= i () n+1 () + ||
(1 i n + 1),
(1)
(1)
where = (j1 , . . . , jM ) as in the previous subsection. Since i = Ei for An1 by Theorem 7.4, the right-hand side here is evaluated by using (4.13), leading to
M

(1)
(1)
()
()
(1)
(M)
bj ,2 + + bj ,i =
bj,i+1 + + bj,n+1 + Ei bj1 bjM
j J
=1
(1)

En+1
bj1 bj(M)
+ ||.
M
(7.30)
From (7.29) and (7.30), the quantity appearing in (7.21) is rewritten as

k,i (j ) i (J ) = min([k] , ) +
j J
()
j
()
= d j
+ j +
M

()
()
()
j + bj ,2 + + bj ,i ,
=1
min(j , j ) = rj +
1<

(+1)
,
Sn+1 bj bj
0<
where Sn+1 is defined in (7.25). Thus we obtain

k,i =
max
J {1,...,N }

M
()
()
()
min([k] , j ) j + bj ,2 + + bj ,i
(1 i n + 1),
=1
(7.31)
where the max extends over all the subsets J = {j1 , . . . , jM } {1, . . . , N }. Compared with
(7.21), the expression (7.31) is formally free from the multi-body effect. It has been absorbed
()
into the quantity j , which is a shifted mode.
The formula (7.31) is most naturally presented in terms of the principal picture of affine crystals rather than the conventional homogeneous one. To explain it, let us make a short digression on the principal picture in this paragraph. Recall that an element in the affine $A^{(1)}_n$ crystal ${\rm Aff}(B_l)$ is parametrized as $(x_1, \ldots, x_{n+1})[d]$, where $d \in \mathbb{Z}$ and $x_i \in \mathbb{Z}_{\ge 0}$ are to satisfy $x_1 + \cdots + x_{n+1} = l$. See (2.15). We naturally extend $x_i$ to $i \in \mathbb{Z}$ by $x_{i+n+1} = x_i$. Instead of $(x_1, \ldots, x_{n+1})[d]$, the element is also parametrized as $x_i = \eta_{i-1} - \eta_i$ and $d = \eta_0$ in terms of an infinite sequence $\eta = (\eta_i)_{i \in \mathbb{Z}}$ such that
$$\eta_i \in \mathbb{Z}, \qquad \eta_{i-1} \ge \eta_i, \qquad \eta_i = \eta_{i+n+1} + l \quad \text{for all } i \in \mathbb{Z}. \tag{7.32}$$
The correspondence between $(x_1, \ldots, x_{n+1})[d]$ and $\eta$ is bijective. In fact, $\eta_i = d - x_1 - x_2 - \cdots - x_i$ for $i \ge 0$ and $\eta_i = d + x_0 + x_{-1} + \cdots + x_{i+1}$ for $i < 0$. We set ${\rm Aff}_p(B_l) = \{\eta = (\eta_i)_{i \in \mathbb{Z}} \mid (7.32)\}$ and call the crystal structure induced on it the principal picture. Explicitly, it is given as follows:
$$e_j(\eta) = \bigl(\eta_i - \delta^{(n+1)}_{i,j}\bigr), \qquad f_j(\eta) = \bigl(\eta_i + \delta^{(n+1)}_{i,j}\bigr) \quad \text{for } \eta = (\eta_i),$$
where $\delta^{(n+1)}_{i,j} = 1$ if $i \equiv j \bmod n+1$ and 0 otherwise. If the right-hand sides break the condition $\eta_{i-1} \ge \eta_i$ in (7.32), they are to be understood as 0. The combinatorial R is especially simple in
the principal picture:

R : Affp (Bl ) Affp (Bm ) Affp (Bm ) Affp (Bl ),
(i ) (i ) (i Si ) (i + Si ).
(7.33)
is defined to be the color i two-body phase shift Si (b c) (7.25)

Here Si = Si+n+1 = Si (
(1)
for An with b c Bl Bm , where b and c are specified by b = (i1 i )n+1
i=1 and c =

(i1
i )n+1
.
From
(2.13),
S
reads
explicitly
as
i
i=1

Si ( ) = 2 min(l, m) i + i+n+1

min {i+k
i+k1 }
(7.34)
1kn+1
for Affp (Bl ) Affp (Bm ). Observe the compatibility between (7.33) and (2.12). Actually
for i = 0, the rule (7.33) on 0 , 0 disagrees with the changes of d, d in (A.1) under the above
mentioned identification 0 = d, 0 = d , which renders, however, no problem being merely the
discrepancy in the normalizations of the energy function. By Affp we mean the crystal structure
including the convention specified in (7.33). is a generalized phase variable of solitons.
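The passage between the homogeneous parametrization $(x_1, \ldots, x_{n+1})[d]$ and the principal picture (7.32) can be made concrete as follows. This is a minimal sketch under stated assumptions: we write `eta` for the sequence in (7.32) (the symbol name is our choice), represent it as a Python function of $i \in \mathbb{Z}$, and use `None` for the element 0. The helper names are ours.

```python
from typing import Callable, Optional, Sequence, Tuple

Eta = Callable[[int], int]


def to_principal(x: Sequence[int], d: int) -> Eta:
    """eta_i = d - x_1 - ... - x_i for i >= 0, continued to all i by periodicity of x."""
    m, l = len(x), sum(x)  # m = n + 1

    def eta(i: int) -> int:
        q, r = divmod(i, m)  # i = q*m + r with 0 <= r < m, also for negative i
        return d - q * l - sum(x[:r])

    return eta


def from_principal(eta: Eta, m: int) -> Tuple[Tuple[int, ...], int]:
    """Recover (x_1, ..., x_{n+1}) and d from x_i = eta_{i-1} - eta_i, d = eta_0."""
    return tuple(eta(i - 1) - eta(i) for i in range(1, m + 1)), eta(0)


def shift(eta: Eta, j: int, m: int, sign: int) -> Optional[Eta]:
    """Add sign to every eta_i with i = j mod m; None if eta_{i-1} >= eta_i fails."""
    def new(i: int) -> int:
        return eta(i) + (sign if (i - j) % m == 0 else 0)

    # by periodicity it suffices to test one period of the inequality
    if any(new(i - 1) < new(i) for i in range(m)):
        return None
    return new


def f_principal(eta: Eta, j: int, m: int) -> Optional[Eta]:
    return shift(eta, j, m, +1)


def e_principal(eta: Eta, j: int, m: int) -> Optional[Eta]:
    return shift(eta, j, m, -1)
```

For example, $(2, 0, 1)[5]$ with $n = 2$ corresponds to $\eta_{-1}, \ldots, \eta_3 = 6, 5, 3, 3, 2$. Acting with $f_1$ gives $(1, 1, 1)[5]$, $f_2$ gives 0 because $x_2 = 0$, and $f_0$ gives $(3, 0, 0)[6]$, in agreement with $f_0$ raising the mode $d$ by one.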
Back to our N -soliton solution, we restart with the principal picture of the scattering data
(7.26):
2
2
1 N Affp b
(7.35)
Affp b
.
N
1
Accordingly, (7.28) reads
2
2
2
2
Affp B1 Affp BN Affp Bj1 Affp BjM ( ),
1 N
(1)
(M)
j1 jM ( ),
(7.36)
()
where, again, the notation j is unambiguous only combined with J = {j1 , . . . , jM } as cau()
()
()
()
()
tioned after (7.16). We set j = (j ,i )iZ and identify j ,1 with j in (7.31). j

2
()
()
()
2
()
Affp (Bj ) corresponds to (bj ,2 , . . . , bj ,n+1 )[j ] Aff(Bj ). Therefore we have j ,i =

()
()
()
j bj ,2 bj ,i for 1 i n + 1. In this way (7.31) is simplified to

k,i =
max
J {1,...,N }
M

()
min([k] , j ) j ,i
(1 i n + 1),
(7.37)
=1
where the max extends over all the subsets J = {j1 , . . . , jM } {1, . . . , N } as in (7.31). Note that
j()
= j()
+ j is consistent with the time evolution rule in Proposition 3.5 and k,1 (p) =
,1
,n+1
k,n+1 (T (p)) indicated by (4.2).
Finally we present an operator formalism that formally leads to (7.37) via the ultradiscretization. Let q be an indeterminate. Let A be the algebra over C[q, q 1 ] generated by the sym2
2
bols ( ), ( ) ( Affp (Bl )) that satisfy the commutation relations ( Affp (Bl ),
2
Affp (Bm )):
( ) ( ) = ( ) ( ).
Here
are related to
.
(7.38)
by the combinatorial R (7.33), (7.34):

(7.39)
(The commutation relation of ( ) ( ) and ( ) ( ) are not needed in the sequel.) We

equip A with the time evolution Tl (l Z1 ):

Tl ( )Tl1 = Tl ( ) ,
Tl ( )Tl1 = Tl ( ) ,
2

Tl ( ) = i + min(l, m) for = (i ) Affp Bm
(7.40)
.
Tl is an automorphism of A since it commutes with the combinatorial R, i.e., Tl ( ) Tl ( )
Tl ( ) Tl ( ) holds under (7.39). Obviously, Tl Tm = Tm Tl is valid.
For i Z, let the bracket i : A C[q, q 1 ] be the linear form on A characterized by the
following properties:

X ( ) i = Xi ,
( )X i = q i Xi for = (i )iZ ,
1i = 1,
(7.41)
where X denotes an arbitrary element in A. We shall write Tlk XTlk i simply as Tlk Xi for
2
2
2
any k Z. As an example, let Affp (Ba ) Affp (Bb ) Affp (Bc ). Then one has

Tl ( ) + ( ) () + () () + () i

= Tl ( ) () () i + Tl ( ) () () i + Tl ( ) () () i

+ Tl ( ) () () i + Tl ( ) () () i + Tl ( ) () () i

+ Tl ( ) () () i + Tl ( ) () () i .
We need the following reordering of by the combinatorial R:
(1) (1) (2) (2) (1) .
See (7.36). As cautioned after (7.16), there are two elements (2) and (2) that are relevant to
under the choices J = {2, 3} and {1, 3}, respectively. In terms of these elements, the above
bracket is evaluated as
(1)
(1)
1 + q min(l,a)+i + q min(l,b)+i + q min(l,c)+i
(2)
(1)
+ q min(l,a)+min(l,b)+i +i + q min(l,a)+min(l,c)+i + i + q min(l,b)+min(l,c)+i
(2)
+i
+ q min(l,a)+min(l,b)+min(l,c)+i +i +i .
From the commutation relation (7.38), the characterization of the bracket (7.41) and the definition (7.36), it follows that the tau function (7.37) associated to the scattering data 1 N
(7.35) comes out as the ultradiscretization:
k

1
k,i = lim log

Tj (1 ) + (1 ) (N ) + (N )
+0
j =1
(1 i n + 1),
(7.42)
where is related to q by q = e1/ . The bracket is expanded into 2N terms as in the

above example (N = 3). In each of them, the list of the positions of specifies the subset
J = {j1 , . . . , jM } {1, . . . , N} for the relevant contribution in (7.37). The timeevolution of the
tau function k,i (Tl (p)) is obtained from (7.42) by further inserting the product kj =1 T1
of the
j
automorphism (7.40).
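The mechanism behind (7.42) is the elementary limit $\lim_{\epsilon \to +0} \epsilon \log \sum_a e^{A_a/\epsilon} = \max_a A_a$. The sketch below illustrates only this generic limit, together with a toy commuting case in which a product of $N$ binomials expands into $2^N$ terms labelled by subsets $J$. It does not implement the algebra $A$ or the bracket, and the sign conventions relating $q$ and $\epsilon$ in (7.42) are not reproduced here. All names are ours.

```python
import math
from itertools import combinations
from typing import List, Sequence


def ultradiscretize(exponents: Sequence[float], eps: float) -> float:
    """eps * log(sum_a exp(A_a / eps)), evaluated stably by factoring out the max."""
    top = max(exponents)
    return top + eps * math.log(sum(math.exp((a - top) / eps) for a in exponents))


def subset_exponents(a: Sequence[float]) -> List[float]:
    """Exponents of the 2^N terms in the expansion of prod_j (1 + exp(a_j / eps))."""
    return [sum(c) for size in range(len(a) + 1) for c in combinations(a, size)]
```

As `eps` decreases, `ultradiscretize` approaches the largest exponent. For the toy product the limit is $\max_J \sum_{j \in J} a_j = \sum_j \max(0, a_j)$, which mirrors how each of the $2^N$ terms of the bracket corresponds to a subset $J$ in (7.37).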
Unlike the tau function (5.1) for the KP hierarchy, A is not the Clifford algebra and it is not
known to us whether the Laurent polynomial

k

1
Tj (1 ) + (1 ) (N ) + (N )
j =1
satisfies any sort of bilinear relations. However, the formula (7.42) is a most intrinsic way to
present our ultradiscrete tau function. It synthesizes the principal features in the theories of solitons and crystal basis, i.e., the free-fermion like structure and the combinatorial R.
8. Summary
In this paper we have introduced the ultradiscrete tau function and explored several of its properties related to the KKR bijection and the box-ball systems.
In Section 2, $\tau_i$ is introduced in (2.18)-(2.20) as a piecewise linear function on rigged configurations. The piecewise linear formula for the KKR bijection is stated in Theorem 2.1. After a brief exposition on the box-ball system in Section 3, we have furthermore introduced $\rho_i$ and $E_i$ in Section 4. $\rho_i$ in (4.1) is the number of balls in the SW quadrant in the time evolution pattern of the box-ball system. $E_i$ defined by (4.12) and (4.11) is a sum of local energy functions in the affine crystal. The fact $\rho_i = E_i$ has been shown in Proposition 4.6. The two quantities provide analogues of the corner transfer matrix [1] from complementary viewpoints: $\rho_i$ from the box-ball system and $E_i$ from the crystal base theory. Theorem 2.1 is a consequence of the further identification $\tau_i = \rho_i = E_i$ in Theorem 6.12. Sections 5 and 6 are devoted to a proof of this fact. In Section 5, $\tau_i$ is shown to emerge as an ultradiscretization of the tau functions of the KP hierarchy (Lemma 5.3) and to satisfy the Hirota type bilinear equation (Proposition 5.1). In Section 6, $\tau_i = \rho_i$ is proved on the asymptotic states by induction on the rank (Proposition 6.1 and its reduction in Proposition 6.4). These properties are enough to establish the claim $\tau_i = \rho_i$ everywhere. Section 7 gives the generalization of Theorems 2.1 and 6.12 to arbitrary (non-highest) states. As an application, the solution of the initial value problem in the box-ball system is given in Theorem 7.6. We have also included the formulas (7.21), (7.37) and (7.42) for general N-soliton solutions. Curiously, they are most elegantly presented in terms of affine crystals in the principal picture introduced in Section 7.4.
Acknowledgements
The authors thank Masato Okado, Anne Schilling, Mark Shimozono and Taichiro Takagi for
useful discussion. Y.Y. is supported by Grants-in-Aid for Scientific No. 17340047. R.S. is grateful to Miki Wadati for warm encouragement during the study. He is a research fellow of the Japan
Society for the Promotion of Science.
Appendix A. Crystals and combinatorial R
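The crystals $B_l$ used in the main text are crystal bases of irreducible finite-dimensional representations of a quantum affine algebra $U_q(\mathfrak{g})$. Let us recall basic facts on them following [11,12]. Let $P$ be the weight lattice, $\{\alpha_i\}_{0 \le i \le n}$ the simple roots, and $\{\Lambda_i\}_{0 \le i \le n}$ the fundamental weights of $\mathfrak{g}$. A crystal $B$ is a finite set with weight decomposition $B = \bigsqcup_{\lambda \in P} B_\lambda$. The Kashiwara operators $e_i, f_i$ ($i = 0, 1, \ldots, n$) act on $B$ as $e_i : B_\lambda \to B_{\lambda + \alpha_i} \sqcup \{0\}$, $f_i : B_\lambda \to B_{\lambda - \alpha_i} \sqcup \{0\}$. In particular, these operators are nilpotent. By definition, we have $f_i b = b'$ if and only if $b = e_i b'$. For any $b \in B$, set $\varepsilon_i(b) = \max\{m \ge 0 \mid e_i^m b \ne 0\}$ and $\varphi_i(b) = \max\{m \ge 0 \mid f_i^m b \ne 0\}$. Then the weight ${\rm wt}\, b$ of $b$ is given by ${\rm wt}\, b = \sum_{i=0}^{n} (\varphi_i(b) - \varepsilon_i(b)) \Lambda_i$.
For two crystals $B$ and $B'$, one can define the tensor product $B \otimes B' = \{b \otimes b' \mid b \in B, b' \in B'\}$. The operators $e_i, f_i$ act on $B \otimes B'$ by
$$e_i(b \otimes b') = \begin{cases} e_i b \otimes b' & \text{if } \varphi_i(b) \ge \varepsilon_i(b'), \\ b \otimes e_i b' & \text{if } \varphi_i(b) < \varepsilon_i(b'), \end{cases}$$
$$f_i(b \otimes b') = \begin{cases} f_i b \otimes b' & \text{if } \varphi_i(b) > \varepsilon_i(b'), \\ b \otimes f_i b' & \text{if } \varphi_i(b) \le \varepsilon_i(b'). \end{cases}$$
Here $0 \otimes b'$ and $b \otimes 0$ should be understood as 0. For the crystals we are considering, there exists a unique isomorphism $B \otimes B' \simeq B' \otimes B$, i.e., a unique map which commutes with the action of Kashiwara operators. In particular, it preserves the weight.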
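The tensor product rule can be tried out on the crystals $B_l$ of the main text. The sketch below assumes the usual realization of $B_l$, which is not restated in this appendix: elements are tuples $(x_1, \ldots, x_{n+1})$ of non-negative integers with sum $l$, $f_i$ moves one unit from $x_i$ to $x_{i+1}$, $e_i$ does the reverse, indices are read modulo $n+1$, and hence $\varepsilon_i(x) = x_{i+1}$, $\varphi_i(x) = x_i$. The element 0 is represented by `None`, and all function names are ours.

```python
from typing import List, Optional, Tuple

Elem = Tuple[int, ...]


def elements(l: int, m: int) -> List[Elem]:
    """All (x_1, ..., x_m) of non-negative integers with sum l (m = n + 1)."""
    if m == 1:
        return [(l,)]
    return [(k,) + rest for k in range(l + 1) for rest in elements(l - k, m - 1)]


def eps(i: int, x: Elem) -> int:
    return x[i % len(x)]            # x_{i+1}


def phi(i: int, x: Elem) -> int:
    return x[(i - 1) % len(x)]      # x_i, with x_0 = x_{n+1}


def f(i: int, x: Elem) -> Optional[Elem]:
    if phi(i, x) == 0:
        return None
    y = list(x)
    y[(i - 1) % len(x)] -= 1
    y[i % len(x)] += 1
    return tuple(y)


def e(i: int, x: Elem) -> Optional[Elem]:
    if eps(i, x) == 0:
        return None
    y = list(x)
    y[i % len(x)] -= 1
    y[(i - 1) % len(x)] += 1
    return tuple(y)


def f_tensor(i: int, b: Elem, c: Elem) -> Optional[Tuple[Elem, Elem]]:
    """f_i on b (x) c: act on the left factor iff phi_i(b) > eps_i(c)."""
    if phi(i, b) > eps(i, c):
        nb = f(i, b)
        return None if nb is None else (nb, c)
    nc = f(i, c)
    return None if nc is None else (b, nc)


def e_tensor(i: int, b: Elem, c: Elem) -> Optional[Tuple[Elem, Elem]]:
    """e_i on b (x) c: act on the left factor iff phi_i(b) >= eps_i(c)."""
    if phi(i, b) >= eps(i, c):
        nb = e(i, b)
        return None if nb is None else (nb, c)
    nc = e(i, c)
    return None if nc is None else (b, nc)
```

With these definitions one can check on small cases such as $B_2 \otimes B_1$ for $n = 2$ that $e_i f_i (b \otimes c) = b \otimes c$ whenever $f_i(b \otimes c) \ne 0$, which is the compatibility required of the two case distinctions above.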
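For a crystal $B$ we define its affinization $\mathrm{Aff}(B) = \{b[d] \mid d \in \mathbb{Z}, b \in B\}$ by $e_i(b[d]) = (e_i b)[d - \delta_{i0}]$ and $f_i(b[d]) = (f_i b)[d + \delta_{i0}]$. ($b[d]$ here corresponds to $T^{-d}\,\mathrm{af}(b)$ in [12].) The
crystal isomorphism $B \otimes B' \simeq B' \otimes B$ is lifted up to a map $\mathrm{Aff}(B) \otimes \mathrm{Aff}(B') \to \mathrm{Aff}(B') \otimes \mathrm{Aff}(B)$
called the combinatorial R. It has the following form:
\[
R : \mathrm{Aff}(B) \otimes \mathrm{Aff}(B') \to \mathrm{Aff}(B') \otimes \mathrm{Aff}(B), \qquad
b[d] \otimes b'[d'] \mapsto \tilde{b}'[d' - H(b \otimes b')] \otimes \tilde{b}[d + H(b \otimes b')], \tag{A.1}
\]
where $b \otimes b' \mapsto \tilde{b}' \otimes \tilde{b}$ under the isomorphism $B \otimes B' \simeq B' \otimes B$. $H(b \otimes b')$ is called the energy
function and is determined up to an additive constant by
\[
H(e_i(b \otimes b')) = \begin{cases}
H(b \otimes b') + 1 & \text{if } i = 0,\ \varphi_0(b) \ge \varepsilon_0(b'),\ \varphi_0(\tilde{b}') \ge \varepsilon_0(\tilde{b}), \\
H(b \otimes b') - 1 & \text{if } i = 0,\ \varphi_0(b) < \varepsilon_0(b'),\ \varphi_0(\tilde{b}') < \varepsilon_0(\tilde{b}), \\
H(b \otimes b') & \text{otherwise.}
\end{cases}
\]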
Proposition A.1 (Yang–Baxter equation). The following equation holds on $\mathrm{Aff}(B) \otimes \mathrm{Aff}(B') \otimes \mathrm{Aff}(B'')$:
\[
(R \otimes 1)(1 \otimes R)(R \otimes 1) = (1 \otimes R)(R \otimes 1)(1 \otimes R).
\]
We often write the map $R$ simply as $\simeq$. The combinatorial R is naturally restricted to $B \otimes B'$.
In the main text we are concerned with the crystal $B_l$ corresponding to the $l$-fold symmetric
tensor representation. We normalize the energy function so that
\[
\max\{H(b \otimes c) \mid b \otimes c \in B_l \otimes B_m\} = \min(l, m).
\]
Under this convention one has $\min\{H(b \otimes c) \mid b \otimes c \in B_l \otimes B_m\} = 0$. When $l = m$, the combinatorial R becomes the identity map on $B_l \otimes B_l$ but still acts non-trivially on the affinization as $R(x[d] \otimes y[e]) = x[e - H(x \otimes y)] \otimes y[d + H(x \otimes y)]$.
Appendix B. Graphical rule for combinatorial R
Following [17], we introduce a graphical rule to calculate the combinatorial R for $A_n^{(1)}$ and the
energy function given by (2.12) and (2.14). Given the two elements
\[
x = (x_1, x_2, \ldots, x_{n+1}) \in B_k, \qquad y = (y_1, y_2, \ldots, y_{n+1}) \in B_l,
\]
we draw the following diagram to represent the tensor product $x \otimes y$.

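[Diagram: two columns with $n + 1$ rows each; row $i$ carries $x_i$ dots in the left column and $y_i$ dots in the right column.]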

The combinatorial R and the energy function $H$ for $B_k \otimes B_l$ (with $k \ge l$) are calculated by the
following rule.
(1) Pick any dot, say $\bullet_a$, in the right column and connect it with a dot $\bullet'_a$ in the left column by
a line. The partner $\bullet'_a$ is chosen from the dots which are in the lowest row among all dots
whose positions are higher than that of $\bullet_a$. If there is no such dot, we return to the bottom
and the partner $\bullet'_a$ is chosen from the dots in the lowest row among all dots. In the latter case,
we call such a pair or line winding.
(2) Repeat the procedure (1) for the remaining unconnected dots $(l - 1)$ times.
(3) Action of the combinatorial R is obtained by moving all unpaired dots in the left column to
the right horizontally. We do not touch the paired dots during this move.
(4) The energy function H is given by the number of winding pairs.
It is known that the results for the combinatorial R and the energy functions are not affected
by the order of making pairs [17, Propositions 3.15 and 3.17]. For more properties, including that
the above definition indeed satisfies the axiom, see [17].
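Example B.1. The diagram for $1233 \otimes 124$ is
[Diagram: the dots for the letters 1, 2, 4 in the right column are connected with the dots for the letters 3, 1, 3 in the left column, respectively; the line from the letter 1 is winding.]
By moving the unpaired dot (letter 2) in the left column to the right, we obtain
\[
1233 \otimes 124 \simeq 133 \otimes 1224.
\]
Since we have one winding pair, the energy function is $H(1233 \otimes 124) = 1$.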
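The graphical rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: elements of $B_k$ are represented as tuples of letters (a hypothetical encoding, with letter $i$ standing for a dot in row $i$ and row 1 at the top), and the function name `graphical_rule` is ours, not the paper's.

```python
def graphical_rule(x, y):
    """Dot-pairing rule for x (x) y with x in B_k, y in B_l and k >= l.

    x, y: tuples of letters; letter i means one dot in row i (row 1 on top).
    Returns (left, right, H) with x (x) y ~ left (x) right and
    H = number of winding pairs.
    """
    if len(x) < len(y):
        raise ValueError("the rule as stated needs k >= l")
    free = sorted(x)                 # unconnected dots of the left column
    winding = 0
    for a in sorted(y):              # the pairing order does not matter [17]
        above = [b for b in free if b < a]   # left dots strictly higher than a
        if above:
            partner = max(above)     # lowest row among the higher dots
        else:
            partner = max(free)      # return to the bottom: a winding pair
            winding += 1
        free.remove(partner)
    left = sorted(x)
    for b in free:                   # unpaired dots leave the left column ...
        left.remove(b)
    right = sorted(list(y) + free)   # ... and are moved to the right column
    return tuple(left), tuple(right), winding


print(graphical_rule((1, 2, 3, 3), (1, 2, 4)))
```

For the input of Example B.1 the call returns the pair $133 \otimes 1224$ together with one winding pair.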
For $i \in \mathbb{Z}_{n+1}$, the number of connecting lines that cross the horizontal level of the border between
$x_i$ and $x_{i+1}$ is called the $i$th winding number. The energy function $H$ is the $(n + 1)$th winding
number. The quantity $\min(l, k) - (i\text{th winding number})$ is called the $i$th non-winding number.
It is known that $Q_i(x \otimes y)$ in (2.13) gives the $i$th non-winding number. By the definition, the
winding numbers for $x \otimes y$ and $\tilde{y} \otimes \tilde{x}$ are the same if $x \otimes y \simeq \tilde{y} \otimes \tilde{x}$ by the combinatorial R.
Appendix C. KKR bijection
There are two different ways to define the Kerov–Kirillov–Reshetikhin (KKR) bijection.
One is the original combinatorial algorithm [9,10] explained here, and the other is an
algebraic version [34,35] which will be treated in Appendix D. Although the two definitions are
known to be equivalent, they complement each other in some aspects. In fact, we use both
definitions case by case in the main text.
C.1. Definition
The KKR bijection provides a one-to-one correspondence between the set of rigged configurations and that of highest paths. For a given $A_n^{(1)}$ rigged configuration
\[
\mathrm{RC} = \bigl(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)})\bigr), \tag{C.1}
\]
we define the KKR procedure $\mathrm{RC} \mapsto p \in B_{\mu_1^{(0)}} \otimes B_{\mu_2^{(0)}} \otimes \cdots \otimes B_{\mu_N^{(0)}}$, which gives a highest
path $p$. See Section 2.2 for definitions of rigged configurations, vacancy numbers, $E_j^{(a)}$ and riggings. The data $\mu^{(0)}$ is called the quantum space.
Definition C.1. For a given RC, the image (or path) $p$ of the KKR bijection is obtained by the
following procedure.
Step 1. For each row of the quantum space $\mu^{(0)}$, we assign the numbers from 1 to $N$ arbitrarily,
and reorder it as
\[
\mu^{(0)} = \bigl\{\mu_N^{(0)}, \ldots, \mu_2^{(0)}, \mu_1^{(0)}\bigr\}. \tag{C.2}
\]
Take the row $\mu_1^{(0)}$.
Step 2. We name each box of the row $\mu_1^{(0)}$ as
\[
\mu_1^{(0)} = \boxed{\alpha_{l_1}^{(0)}}\ \cdots\ \boxed{\alpha_2^{(0)}}\ \boxed{\alpha_1^{(0)}}. \tag{C.3}
\]
Corresponding to the row $\mu_1^{(0)}$, let $p_1$ be the array of $l_1$ empty boxes:
\[
p_1 = \boxed{\phantom{j}}\ \boxed{\phantom{j}}\ \cdots\ \boxed{\phantom{j}}. \tag{C.4}
\]
Starting from the box $\alpha_1^{(0)}$, we recursively choose $\alpha_1^{(i)} \in \mu^{(i)}$ by the following Rule 1:
Rule 1. Assume we have already chosen $\alpha_1^{(i-1)} \in \mu^{(i-1)}$. Let $g^{(i)}$ be the set of all the rows of
$\mu^{(i)}$ whose lengths $w$ satisfy
\[
w \ge \mathrm{col}\bigl(\alpha_1^{(i-1)}\bigr),
\]
where the right-hand side means the number of columns in $\mu^{(i-1)}$ that are not located to the right
of the box $\alpha_1^{(i-1)}$.
Let $g_s^{(i)}$ ($\subset g^{(i)}$) be the set of all the singular rows (:= rows whose corresponding vacancy
number and rigging are equal) in the set $g^{(i)}$. If $g_s^{(i)} \ne \emptyset$, then choose one of the shortest rows of
$g_s^{(i)}$, and denote its rightmost box by $\alpha_1^{(i)}$. If $g_s^{(i)} = \emptyset$, then we take $\alpha_1^{(i)} = \cdots = \alpha_1^{(n)} = \emptyset$.
Step 3. From RC, remove the boxes $\alpha_1^{(0)}, \alpha_1^{(1)}, \ldots, \alpha_1^{(j_1 - 1)}$ chosen above, where $j_1 - 1$ is the maximum $k$ such that $\alpha_1^{(k)} \ne \emptyset$. After the removal, construct a new RC by
Rule 2. Calculate the vacancy numbers $p_i^{(a)} = E_i^{(a-1)} - 2E_i^{(a)} + E_i^{(a+1)}$ along the configuration
after the removal. For those rows shortened by the removal, set the new riggings equal to
their vacancy numbers. For the other rows, keep the original riggings before Step 3.
Put the letter $j_1$ into the leftmost empty box of $p_1$ as
\[
p_1 = \boxed{j_1}\ \boxed{\phantom{j}}\ \cdots\ \boxed{\phantom{j}}. \tag{C.5}
\]
Step 4. Repeat Step 2 and Step 3 for the rest of the boxes $\alpha_2^{(0)}, \alpha_3^{(0)}, \ldots, \alpha_{l_1}^{(0)}$ in this order. Put the
letters $j_k$ into the empty boxes of $p_1$ from left to right.
Step 5. Repeat Step 1 to Step 4 for the rest of the rows $\mu_2^{(0)}, \mu_3^{(0)}, \ldots, \mu_N^{(0)}$ in this order. Then we
obtain $p_k$ from $\mu_k^{(0)}$, which we identify with the tableau representation of the element in $B_{\mu_k^{(0)}}$.
The image of the KKR bijection is given by $p = p_N \otimes \cdots \otimes p_2 \otimes p_1$.
The above procedure gives a map from rigged configurations to highest paths. Its inverse also
admits a similar description. See Theorem 2 of [9].
C.2. Example of the KKR bijection
Let us illustrate a typical example of the KKR bijection. For later convenience, we treat the
single column type quantum space. The procedure for general quantum space is quite similar.
Example C.2. We show that the following rigged configuration corresponds to a path $p = 11112221322433$.
[Figure: the rigged configuration $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), (\mu^{(2)}, r^{(2)}), (\mu^{(3)}, r^{(3)}))$ with $\mu^{(0)} = (1^{14})$; the boxes to be removed are marked.]
In the above diagram, we have specified the boxes to be removed by Step 3 with a mark.
Note that the marked boxes are the rightmost boxes of the shortest possible singular rows, and
their column coordinates are increasing from the left to the right. We can remove three boxes at
a time, thus the resulting part of the path is 3. Similarly we can proceed as
[Figure: the successive rigged configurations with quantum spaces $(1^{13})$, $(1^{12})$, $(1^{11})$, $(1^{10})$, $(1^{9})$, $(1^{8})$, $(1^{7})$ and $(1^{4})$.]
By removing all the boxes, we end up with
\[
p = 1 \otimes 1 \otimes 1 \otimes 1 \otimes 2 \otimes 2 \otimes 2 \otimes 1 \otimes 3 \otimes 2 \otimes 2 \otimes 4 \otimes 3 \otimes 3.
\]
The following lemma is useful.
Lemma C.3. Let $p \in P_+(\mu^{(0)})$ and $q \in P_+(\nu^{(0)})$ be the highest paths corresponding to the
rigged configurations $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ and $(\nu^{(0)}, (\nu^{(1)}, s^{(1)}), \ldots, (\nu^{(n)}, s^{(n)}))$,
respectively. Then the rigged configuration for the highest path $p \otimes q$ is given by
\[
\bigl(\mu^{(0)} \cup \nu^{(0)}, (\mu^{(1)}, r^{(1)}) \cup (\nu^{(1)}, s'^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}) \cup (\nu^{(n)}, s'^{(n)})\bigr). \tag{C.6}
\]
Here $(\nu^{(a)}, s'^{(a)}) = \{(\nu_i^{(a)}, s_i'^{(a)})\}$ and the rigging $s'^{(a)} = (s_i'^{(a)})$ is given by
\[
s_i'^{(a)} = s_i^{(a)} + p^{(a)}_{\nu_i^{(a)}},
\]
where $p_j^{(a)}$ is the vacancy number (2.7) for $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$.
Proof. Let $q_j^{(a)}$ be the vacancy number for $(\nu^{(0)}, (\nu^{(1)}, s^{(1)}), \ldots, (\nu^{(n)}, s^{(n)}))$. Then the vacancy
number $p_j'^{(a)}$ for (C.6) reads $p_j'^{(a)} = p_j^{(a)} + q_j^{(a)}$. Therefore the co-rigging (:= vacancy number $-$
rigging) of the row $(\nu_i^{(a)}, s_i'^{(a)})$ in (C.6) is $p_j'^{(a)} - s_i'^{(a)} = q_j^{(a)} - s_i^{(a)}$ with $j = \nu_i^{(a)}$, which is nothing but the co-rigging of the same row in $(\nu^{(0)}, (\nu^{(1)}, s^{(1)}), \ldots, (\nu^{(n)}, s^{(n)}))$. Recall that the KKR
procedure (Definition C.1) consults co-riggings to decide boxes to be removed from a rigged
configuration. Therefore the above coincidence of the co-riggings means that the KKR procedure
on (C.6) gives the path $q$ when the part $\nu^{(0)}$ is firstly removed from $\mu^{(0)} \cup \nu^{(0)}$. Moreover at this
stage, the remaining rigged configuration is exactly $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$. □
Appendix D. Vertex operator formalism of the KKR bijection
Here we give a crystal theoretic reformulation of the KKR bijection based on [34,35]. The
central notions are scattering data, normal ordering and the vertex operator. For illustrative examples, see Appendix E.
D.1. Scattering data and normal ordering
We call elements of affine crystals $b_1[d_1] \otimes \cdots \otimes b_m[d_m] \in \mathrm{Aff}(B_{l_1}) \otimes \cdots \otimes \mathrm{Aff}(B_{l_m})$ scattering data. The number $d_i$ is called the $i$th mode. By using the combinatorial R, scattering data can be reordered and the modes are changed accordingly. Given a scattering data
$s \in \mathrm{Aff}(B_{l_1}) \otimes \cdots \otimes \mathrm{Aff}(B_{l_m})$, define $S_m$ to be the set of such reorderings as
\[
S_m = \Bigl\{ s' \in \bigsqcup_{\sigma \in \mathfrak{S}_m} \mathrm{Aff}(B_{l_{\sigma(1)}}) \otimes \cdots \otimes \mathrm{Aff}(B_{l_{\sigma(m)}}) \Bigm| s' \simeq s \Bigr\},
\]
where $\bigsqcup$ means the disjoint union over all the distinct permutations of $(l_1, \ldots, l_m)$. For instance,
if $s = 234_{7} \otimes 223_{2}$ (a subscript denotes the mode), we have
\[
S_2 = \{ 234_{7} \otimes 223_{2},\ 234_{0} \otimes 223_{9} \}.
\]
Note that in this case, the union over $\sigma$ is trivial as $(l_1, l_2) = (l_2, l_1) = (3, 3)$, but $S_2$ contains two
distinct elements since the combinatorial R is nontrivial as remarked at the end of Appendix A.
For $i = 2, \ldots, m$, let $S_{i-1}$ be the subset of $S_i$ having the maximal $i$th mode. Then we have
\[
\emptyset \ne S_1 \subseteq S_2 \subseteq \cdots \subseteq S_m. \tag{D.1}
\]
In the above example, we have $S_1 = \{ 234_{0} \otimes 223_{9} \}$. We call the elements of $S_1$ normal
ordered forms of $s$. In general the normal ordered form $b_1[d_1] \otimes \cdots \otimes b_m[d_m]$ is not unique
but the mode sequence $d_1, \ldots, d_m$ is unique by the definition and satisfies $d_1 \le \cdots \le d_m$. Any
element of $S_1$ is denoted by $:s:$.
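The normal ordering procedure (D.1) can be tried out with the following Python sketch. It rests on several assumptions that are ours rather than the paper's: elements of $B_l$ are sorted tuples of letters, a scattering datum is a tuple of pairs `(element, mode)`, the combinatorial R for $k \ge l$ is taken from the graphical rule of Appendix B, and for $k < l$ it is obtained by brute-force inversion, using that the isomorphism is unique (so exchanging twice gives the identity) and that the winding number is the same on both sides.

```python
from collections import Counter
from itertools import combinations


def pair_dots(x, y):
    """Graphical rule of Appendix B, valid for len(x) >= len(y)."""
    free, winding = sorted(x), 0
    for a in sorted(y):
        above = [b for b in free if b < a]
        partner = max(above) if above else max(free)
        winding += 0 if above else 1
        free.remove(partner)
    left = sorted((Counter(x) - Counter(free)).elements())
    return tuple(left), tuple(sorted(list(y) + free)), winding


def comb_R(x, y):
    """x (x) y -> (y~, x~, H) for arbitrary lengths (brute force if k < l)."""
    if len(x) >= len(y):
        return pair_dots(x, y)
    total = Counter(x) + Counter(y)
    for xt in set(combinations(sorted(total.elements()), len(x))):
        yt = tuple(sorted((total - Counter(xt)).elements()))
        left, right, H = pair_dots(yt, xt)
        if (left, right) == (tuple(sorted(x)), tuple(sorted(y))):
            return yt, xt, H
    raise ValueError("no preimage found")


def affine_R(u, v):
    """(A.1): x[d] (x) y[e] -> y~[e - H] (x) x~[d + H]."""
    (x, d), (y, e) = u, v
    yt, xt, H = comb_R(x, y)
    return (yt, e - H), (xt, d + H)


def reorderings(s):
    """The set S_m: everything reachable from s by adjacent applications of R."""
    seen, todo = {s}, [s]
    while todo:
        cur = todo.pop()
        for i in range(len(cur) - 1):
            nxt = cur[:i] + affine_R(cur[i], cur[i + 1]) + cur[i + 2:]
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen


def normal_ordered(s):
    """The set S_1: keep the maximal m-th mode, then the (m-1)-th, and so on."""
    S = reorderings(s)
    for i in range(len(s) - 1, 0, -1):
        top = max(t[i][1] for t in S)
        S = {t for t in S if t[i][1] == top}
    return S


s = (((2, 3, 4), 7), ((2, 2, 3), 2))
print(sorted(reorderings(s)))
print(sorted(normal_ordered(s)))
```

The brute-force inversion is chosen only to avoid stating a mirrored graphical rule that the text does not give; it is adequate for the small examples of Appendix E.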
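D.2. Maps $C^{(1)}, \ldots, C^{(n)}$
Let $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ be an $A_n^{(1)}$ rigged configuration. Pick the color $a$ part
$(\mu^{(a)}, r^{(a)})$. Here we simply write it as $(\mu, r)$. Namely $\mu = (\mu_1, \ldots, \mu_m)$ is an array of positive
integers and $r = (r_i)$, where $r_i$ is the rigging attached to the $i$th row in $\mu$ of length $\mu_i$. For
$1 \le a \le n$, let $B_l = B_l^{\ge a+1}$ be the $A_{n-a}^{(1)}$ crystal in the sense explained around (2.17). Define the
map $C^{(a)}$ among the $A_{n-a}^{(1)}$ crystals by
\[
C^{(a)} : B_{\mu_1} \otimes \cdots \otimes B_{\mu_m} \to\ :\!\mathrm{Aff}(B_{\mu_1}) \otimes \cdots \otimes \mathrm{Aff}(B_{\mu_m})\!: \quad (1 \le a \le n), \qquad
b_1 \otimes \cdots \otimes b_m \mapsto\ :\!b_1[d_1] \otimes \cdots \otimes b_m[d_m]\!:, \tag{D.2}
\]
\[
d_i = r_i + \mu_i + \sum_{1 \le k \le i-1} H\bigl(b_k \otimes b_i^{(k+1)}\bigr). \tag{D.3}
\]
Here $b_i^{(j)} \in B_{\mu_i}$ ($j \le i$) is defined by bringing $b_i$ to the left by the combinatorial R as
\[
(b_j \otimes \cdots \otimes b_{i-1}) \otimes b_i \simeq b_i^{(j)} \otimes (\cdots)
\]
under the isomorphism $(B_{\mu_j} \otimes \cdots \otimes B_{\mu_{i-1}}) \otimes B_{\mu_i} \simeq B_{\mu_i} \otimes (B_{\mu_j} \otimes \cdots \otimes B_{\mu_{i-1}})$. Note that the
choice (D.3) is compatible with (A.1).
The map $C^{(n)}$ involves the $A_0^{(1)}$ crystal $B_l^{\ge n+1} = \{(n + 1)^l\}$. See (2.16) for the notation $a^l$. The
following suffices to define $C^{(n)}$:
\[
(n + 1)^l \otimes (n + 1)^m \simeq (n + 1)^m \otimes (n + 1)^l, \qquad H\bigl((n + 1)^l \otimes (n + 1)^m\bigr) = \min(l, m).
\]
Since the normal ordering in (D.2) is not unique, $C^{(a)}$ is actually multi-valued in general.
Here we mean by $C^{(a)}(\cdot)$ to pick any one of the normal ordered forms. $C^{(a)}$ is an operator that
transforms elements of classical $A_{n-a}^{(1)}$ crystals to normal ordered scattering data by assigning the
modes.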
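As a minimal sketch of the mode assignment (D.3), before normal ordering, one may proceed as follows in Python. The encoding is an assumption of ours: each $b_i$ is a sorted tuple of letters, so that the row length is `len(b)`, and the combinatorial R is computed from the graphical rule of Appendix B, with a brute-force inversion when the left factor is shorter. The helper names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_dots(x, y):
    """Graphical rule of Appendix B, valid for len(x) >= len(y)."""
    free, winding = sorted(x), 0
    for a in sorted(y):
        above = [b for b in free if b < a]
        partner = max(above) if above else max(free)
        winding += 0 if above else 1
        free.remove(partner)
    left = sorted((Counter(x) - Counter(free)).elements())
    return tuple(left), tuple(sorted(list(y) + free)), winding


def comb_R(x, y):
    """x (x) y -> (y~, x~, H); brute-force inverse of the rule if len(x) < len(y)."""
    if len(x) >= len(y):
        return pair_dots(x, y)
    total = Counter(x) + Counter(y)
    for xt in set(combinations(sorted(total.elements()), len(x))):
        yt = tuple(sorted((total - Counter(xt)).elements()))
        left, right, H = pair_dots(yt, xt)
        if (left, right) == (tuple(sorted(x)), tuple(sorted(y))):
            return yt, xt, H
    raise ValueError("no preimage found")


def modes(bs, riggings):
    """d_i = r_i + (row length) + sum over k < i of H(b_k (x) b_i^{(k+1)})."""
    ds = []
    for i, (b, r) in enumerate(zip(bs, riggings)):
        d, cur = r + len(b), b
        for k in range(i - 1, -1, -1):       # bring b_i to the left past b_k
            cur, _, H = comb_R(bs[k], cur)   # cur is now b_i^{(k)}
            d += H
        ds.append(d)
    return ds


# Data of Appendix E at t = 5: rows of length 2, 3, 4 with riggings 23, 22, 20.
print(modes([(2, 2), (2, 2, 3), (2, 3, 3, 4)], [23, 22, 20]))
```

The resulting modes still have to be normal ordered as in (D.1) to obtain $C^{(a)}$.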
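D.3. Maps $\Phi^{(1)}, \ldots, \Phi^{(n)}$
Pick the color $a$ and $a - 1$ parts of the configuration and denote them simply by $\mu^{(a)} = (\mu_1, \ldots, \mu_m)$ and $\mu^{(a-1)} = (\lambda_1, \ldots, \lambda_k)$. Set $B_l = B_l^{\ge a+1}$ and $\bar{B}_l = B_l^{\ge a}$. We define the map
$\Phi^{(a)}$ from the normal ordered scattering data in $A_{n-a}^{(1)}$ affine crystals to classical $A_{n-a+1}^{(1)}$ crystals:
\[
\Phi^{(a)} :\ :\!\mathrm{Aff}(B_{\mu_1}) \otimes \cdots \otimes \mathrm{Aff}(B_{\mu_m})\!: \to \bar{B}_{\lambda_1} \otimes \cdots \otimes \bar{B}_{\lambda_k} \quad (1 \le a \le n), \qquad
b_1[d_1] \otimes \cdots \otimes b_m[d_m] \mapsto c_1 \otimes \cdots \otimes c_k. \tag{D.4}
\]
From (D.3) and the fact that $b_1[d_1] \otimes \cdots \otimes b_m[d_m]$ is normal ordered, we have $0 \le d_1 \le \cdots \le d_m$.
Then the image $c_1 \otimes \cdots \otimes c_k$ is determined by the following relation under the isomorphism of
$A_{n-a+1}^{(1)}$ crystals: (We write $T_a^{d} = a^{\otimes d} \in (\bar{B}_1)^{\otimes d}$ for short.)
\[
\bigl(T_a^{d_1} \otimes b_1 \otimes T_a^{d_2 - d_1} \otimes b_2 \otimes \cdots \otimes T_a^{d_m - d_{m-1}} \otimes b_m\bigr) \otimes \bigl(a^{\lambda_1} \otimes a^{\lambda_2} \otimes \cdots \otimes a^{\lambda_k}\bigr)
\simeq (c_1 \otimes \cdots \otimes c_k) \otimes \text{tail}. \tag{D.5}
\]
Here we are regarding $b_i \in B_{\mu_i} = B_{\mu_i}^{\ge a+1}$ as an element of $\bar{B}_{\mu_i} = B_{\mu_i}^{\ge a}$ by the natural embedding
(2.17) as sets. The tail part has the same structure as $(T_a^{d_1} \otimes b_1 \otimes T_a^{d_2 - d_1} \otimes \cdots \otimes b_m)$ on the
left-hand side. In the actual use, it turns out to be $(T_a^{d_1} \otimes a^{\mu_1} \otimes T_a^{d_2 - d_1} \otimes \cdots \otimes a^{\mu_m})$ containing
the letter $a$ only. (This fact will not be used.)
To obtain $c_1 \otimes \cdots \otimes c_k$ using (D.5), one applies the $A_{n-a+1}^{(1)}$ combinatorial R many times
to carry $(T_a^{d_1} \otimes b_1 \otimes \cdots \otimes T_a^{d_m - d_{m-1}} \otimes b_m)$ through $(a^{\lambda_1} \otimes a^{\lambda_2} \otimes \cdots \otimes a^{\lambda_k})$ to the right. The
procedure is depicted as
[Diagram: the lines $b_m$, $T_a^{d_m - d_{m-1}}$, $\ldots$, $T_a^{d_1}$, $b_1$ cross the vertical lines entering as $a^{\lambda_1}, a^{\lambda_2}, \ldots, a^{\lambda_k}$ and leaving as $c_1, c_2, \ldots, c_k$; the crossing lines leave as $a^{\mu_m}, \ldots, a^{\mu_1}$.]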
D.4. Vertex operator formalism

Define the $A_0^{(1)}$ crystal element
\[
p^{(n)} = (n + 1)^{\mu_1^{(n)}} \otimes \cdots \otimes (n + 1)^{\mu_{l_n}^{(n)}}. \tag{D.6}
\]
Theorem D.1. The image $p$ of the rigged configuration $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ under
the KKR bijection is given by
\[
p = \Phi^{(1)} C^{(1)} \Phi^{(2)} C^{(2)} \cdots \Phi^{(n)} C^{(n)} \bigl(p^{(n)}\bigr). \tag{D.7}
\]
This is announced in [34] and proved in [35]. The theorem asserts that the right-hand side is
independent of the choices of the possibly non-unique normal ordered forms when applying the
maps $C^{(1)}, \ldots, C^{(n)}$.
Set
\[
p^{(a)} = \Phi^{(a+1)} C^{(a+1)} \cdots \Phi^{(n)} C^{(n)} \bigl(p^{(n)}\bigr) \quad (0 \le a \le n - 1), \tag{D.8}
\]
which belongs to the $A_{n-a}^{(1)}$ crystal $B^{\ge a+1}_{\mu_1^{(a)}} \otimes \cdots \otimes B^{\ge a+1}_{\mu_{l_a}^{(a)}}$.
Thus $p$ in (D.7) is $p^{(0)}$.
Corollary D.2. For $0 \le a \le n - 1$, $p^{(a)}$ coincides with the image of the truncated rigged configuration $(\mu^{(a)}, (\mu^{(a+1)}, r^{(a+1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ under the KKR bijection.
By the construction, $C^{(a)}(p^{(a)})$ is a normal ordered scattering data which is an element of an
$A_{n-a}^{(1)}$ affine crystal. Then the map $\Phi^{(a)}$ produces an $A_{n-a+1}^{(1)}$ highest path by injecting the scattering data $C^{(a)}(p^{(a)})$ into the vacuum state $a^{\mu_1^{(a-1)}} \otimes \cdots \otimes a^{\mu_{l_{a-1}}^{(a-1)}}$. We call $\Phi^{(a)}$ a vertex operator
in this sense. The construction (D.8) involves the family of scattering data and vertex operators
for crystals of $A_0^{(1)}, A_1^{(1)}, \ldots, A_n^{(1)}$. It can be regarded as a crystal theoretical formulation of
the nested Bethe ansatz due to Schultz [41].
Appendix E. Inverse scattering formalism of box-ball system

This appendix is an exposition of the inverse scattering formalism of the box-ball system
mentioned in Section 3.2. We illustrate the calculations of scattering data, normal ordering and
vertex operators explained in Appendix D along several examples.
E.1. Time evolution, scattering data and normal ordering
Example E.1. Consider the rigged configuration in Example C.2. We put many $1 \in B_1$ on
both sides of the corresponding path $p = 11112221322433$, and consider its time evolution under
$T$ of the box-ball system. See Section 3.1 for the definition of $T$.
t = 0:
1111222211111133211143111111111111111111111111111111
t = 1:
1111111122221111133211431111111111111111111111111111
t = 2:
1111111111112222111133214311111111111111111111111111
t = 3:
1111111111111111222211133243111111111111111111111111
t = 4:
1111111111111111111122221132433111111111111111111111
t = 5:
1111111111111111111111112221322433111111111111111111
t = 6:
1111111111111111111111111112211322433211111111111111
t = 7:
1111111111111111111111111111122111322143321111111111
t = 8:
1111111111111111111111111111111221111322114332111111
t = 9:
1111111111111111111111111111111112211111322111433211
Here the length of the paths is 52, and the $t = 5$ state contains the original path as $1^{\otimes 20} \otimes p \otimes 1^{\otimes 18}$.
The following rigged configurations correspond to the above paths at each time.
[Rigged configuration: $\mu^{(0)} = (1^{52})$; the rows of $\mu^{(1)}$ of lengths 4, 3 and 2 have vacancy numbers 38, 40 and 43 and riggings $0 + 4t$, $7 + 3t$ and $13 + 2t$, respectively; the parts $\mu^{(2)}$ and $\mu^{(3)}$ carry no $t$ dependence.]
The linear dependence of the rigging on $t$ is in agreement with Proposition 3.5. The following is
the list of all the normal ordered scattering data corresponding to each time $t$ of the above paths.
$t = 0, 1, 2, 3, 4$: $2222_{4+4t} \otimes 233_{11+3t} \otimes 34_{16+2t}$
$t = 5$: $2222_{24} \otimes 233_{26} \otimes 34_{26}$, $\quad 2222_{24} \otimes 23_{26} \otimes 334_{26}$
$t = 6$: $22_{27} \otimes 2223_{29} \otimes 334_{29}$, $\quad 22_{27} \otimes 223_{29} \otimes 2334_{29}$
$t = 7, 8, 9$: $22_{15+2t} \otimes 223_{11+3t} \otimes 2334_{5+4t}$
Compare this list with the above time evolution pattern. Each tensor product component of the
scattering data corresponds to a soliton in the path. When the modes of the scattering data are well
separated, the normal ordering is unique, and the corresponding path consists of well separated
solitons that contain the tableau letters in the scattering data (in the reverse order). $t \ne 5, 6$ are
such cases. From the viewpoint of the scattering data, collisions of solitons happen when the
modes get close and the normal ordering becomes non-unique. $t = 5, 6$ are such cases. See also
Example 2.4 for the tau functions at $t = 5$, where $\tau_{k,i}$ there is relevant to $\tau_{k+20,i}$ here.
Let us illustrate the derivation of the normal ordered scattering data at $t = 5$. At $t = 5$, the riggings
of $\mu^{(1)}$ attached to the rows of length 2, 3 and 4 are $r_1 = 23$, $r_2 = 22$ and $r_3 = 20$, respectively.
By Theorem D.1 and (D.8), we know that $p = \Phi^{(1)} C^{(1)}(p^{(1)})$, where $C^{(1)}(p^{(1)})$ is the normal
ordered scattering data. It is constructed from the $A_2$-highest path $p^{(1)}$ containing the letters 2, 3
and 4. According to Corollary D.2, $p^{(1)}$ is the image of the KKR bijection of the following part
of the original rigged configuration:
[Rigged configuration: the parts $\mu^{(1)}$, $(\mu^{(2)}, r^{(2)})$ and $(\mu^{(3)}, r^{(3)})$ of the configuration above.]
Here $\mu^{(1)}$ plays the role of the quantum space, and the KKR bijection is of $A_2^{(1)}$ type with letters 2,
3 and 4. For example, if we can remove only a box from $\mu^{(1)}$, then we have the letter 2 as a part
of the path, whereas if boxes are removed from $\mu^{(1)}$, $\mu^{(2)}$ and $\mu^{(3)}$, the letter is 4. Removing the
rows of $\mu^{(1)}$ from the top, we obtain the $A_2$ highest path:
\[
p^{(1)} = b_1 \otimes b_2 \otimes b_3 = 22 \otimes 223 \otimes 2334. \tag{E.1}
\]
Assigning the modes to this according to (D.2) and (D.3), we get
\[
b_1[d_1] \otimes b_2[d_2] \otimes b_3[d_3] = 22_{25} \otimes 223_{26} \otimes 2334_{25}.
\]
To derive the mode $d_3 = 25$, for instance, we calculate $\sum_{1 \le k \le 2} H(b_k \otimes b_3^{(k+1)})$ in (D.3) as
\[
22 \otimes 223 \overset{0}{\otimes} 2334 \simeq 22 \overset{1}{\otimes} 2223 \otimes 334,
\]
where $a \overset{H}{\otimes} b$ signifies the value of the energy function $H(a \otimes b) = H$. Since in (D.3), we have
$r_3 = 20$ and $\mu_3 = 4$, the mode is $d_3 = 20 + 4 + 0 + 1 = 25$.
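To find the normal ordered scattering data $C^{(1)}(p^{(1)})$, we follow the procedure (D.1) and list
the following sets:
\[
S_3 = \bigl\{ 22_{25} \otimes 223_{26} \otimes 2334_{25},\ 22_{25} \otimes 2223_{25} \otimes 334_{26},\ 222_{25} \otimes 23_{26} \otimes 2334_{25},\ 222_{25} \otimes 2233_{25} \otimes 34_{26},\ 2222_{24} \otimes 23_{26} \otimes 334_{26},\ 2222_{24} \otimes 233_{26} \otimes 34_{26} \bigr\},
\]
\[
S_2 = \bigl\{ 22_{25} \otimes 2223_{25} \otimes 334_{26},\ 222_{25} \otimes 2233_{25} \otimes 34_{26},\ 2222_{24} \otimes 23_{26} \otimes 334_{26},\ 2222_{24} \otimes 233_{26} \otimes 34_{26} \bigr\},
\]
\[
S_1 = \bigl\{ 2222_{24} \otimes 23_{26} \otimes 334_{26},\ 2222_{24} \otimes 233_{26} \otimes 34_{26} \bigr\}.
\]
Both elements in $S_1$ serve as the normal ordered scattering data, in agreement with the previous list at $t = 5$.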
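Example E.2. Here is a more intriguing example.
[Rigged configuration: $\mu^{(0)} = (1^{52})$; the rows of $\mu^{(1)}$ of lengths 4, 3 and 2 have vacancy numbers 38, 40 and 43 and riggings $0 + 4t$, $5 + 3t$ and $10 + 2t$, respectively; the parts $\mu^{(2)}$ and $\mu^{(3)}$ carry no $t$ dependence.]
The normal ordered scattering data are listed below.
$t = 0, 1, 2, 3$: $2222_{4+4t} \otimes 233_{9+3t} \otimes 34_{13+2t}$
$t = 4$: $2222_{20} \otimes 233_{21} \otimes 34_{21}$, $\quad 2222_{20} \otimes 23_{21} \otimes 334_{21}$, $\quad 222_{20} \otimes 2233_{21} \otimes 34_{21}$, $\quad 222_{20} \otimes 23_{21} \otimes 2334_{21}$, $\quad 22_{20} \otimes 2223_{21} \otimes 334_{21}$, $\quad 22_{20} \otimes 223_{21} \otimes 2334_{21}$
$t = 5, 6, 7, 8, 9$: $22_{12+2t} \otimes 223_{9+3t} \otimes 2334_{5+4t}$
At $t = 4$, all 6 reorderings are simultaneously normal ordered. In a sense three solitons collide
all together at $t = 4$. Compare this with the following time evolution pattern.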
t = 0:
1111222211113321143111111111111111111111111111111111
t = 1:
1111111122221113321431111111111111111111111111111111
t = 2:
1111111111112222113324311111111111111111111111111111
t = 3:
1111111111111111222213243311111111111111111111111111
t = 4:
1111111111111111111122132243321111111111111111111111
t = 5:
1111111111111111111111221132214332111111111111111111
t = 6:
1111111111111111111111112211132211433211111111111111
t = 7:
1111111111111111111111111122111132211143321111111111
t = 8:
1111111111111111111111111111221111132211114332111111
t = 9:
1111111111111111111111111111112211111132211111433211
E.2. Vertex operator construction of paths from scattering data

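Here we illustrate the action of the vertex operators $\Phi^{(1)}, \ldots, \Phi^{(n)}$ introduced in Appendix D.3 (D.4). It is convenient to use the vertex type diagram to express the action of the
combinatorial R. For example the following successive actions of the combinatorial R
\[
a \otimes b \otimes c \simeq b' \otimes a' \otimes c \simeq b' \otimes c' \otimes a'',
\]
will be depicted by the diagram:
[Diagram: a horizontal line entering as $a$ and leaving as $a''$ crosses the vertical lines $b \to b'$ and $c \to c'$.]
Given a path $p$ and an element $b \in B_l$, one can carry $b$ through $p$ to the right by successively
applying the combinatorial R as
\[
b \otimes p \simeq p' \otimes b', \qquad p, p' \in B_{k_1} \otimes B_{k_2} \otimes \cdots \otimes B_{k_N}, \tag{E.2}
\]
under the isomorphism $B_l \otimes (B_{k_1} \otimes \cdots \otimes B_{k_N}) \simeq (B_{k_1} \otimes \cdots \otimes B_{k_N}) \otimes B_l$. As the result we get
$b' \in B_l$ and another path $p'$. Actually, only the situation $b' = u_l$ (highest element of $B_l$) will
be encountered in our case, and the relation (E.2) will be denoted by $\Phi_b(p) = p'$. This is an
elementary vertex operator. The previous ones $\Phi^{(1)}, \ldots, \Phi^{(n)}$ defined by (D.5) are compositions
of $\Phi_b$ with several $b$.
For example, to calculate $\Phi_{2334}(1^{\otimes 5})$, the relevant diagram is
[Diagram: the carrier 2334 passes through $1 \otimes 1 \otimes 1 \otimes 1 \otimes 1$ and becomes successively 1233, 1123, 1112, 1111, 1111, while the outgoing letters are 4, 3, 3, 2, 1.]
Therefore we obtain $\Phi_{2334}(1^{\otimes 5}) = 43321$. Note that $\Phi_b$ has created one soliton labeled by the
letters in $b$.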
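The elementary vertex operator just described can be sketched in Python for the special case of a path of single letters (all $k_i = 1$). Under this assumption the graphical rule of Appendix B reduces to one pairing per step; the carrier is a sorted tuple of letters and the function names are ours.

```python
def carry(b, c):
    """One step b (x) c ~ c' (x) b' for a carrier b in B_l and a letter c in B_1.

    The single dot c of the right column is paired with the lowest dot of b
    lying strictly above it, or with the lowest dot of b if there is none.
    The partner stays on the left (c'); all other dots join c on the right (b').
    """
    free = sorted(b)
    above = [x for x in free if x < c]
    out = max(above) if above else max(free)
    free.remove(out)
    return out, tuple(sorted(free + [c]))


def vertex_op(b, path):
    """Carry b through a path of single letters; returns (new path, final carrier)."""
    out = []
    for c in path:
        letter, b = carry(b, c)
        out.append(letter)
    return out, b


print(vertex_op((2, 3, 3, 4), [1, 1, 1, 1, 1]))
```

Running it on the carrier 2334 and the path $1^{\otimes 5}$ reproduces the intermediate carriers 1233, 1123, 1112, 1111 of the diagram above.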
In general, if $b_1[d_1] \otimes \cdots \otimes b_m[d_m]$ is a normal ordered scattering data, $\Phi^{(1)}$ defined by (D.5)
is realized as the following composition of elementary vertex operators:
\[
\Phi^{(1)} = T_1^{d_1} \Phi_{b_1} T_1^{d_2 - d_1} \Phi_{b_2} \cdots T_1^{d_m - d_{m-1}} \Phi_{b_m}, \tag{E.3}
\]
where $f g(p) = f(g(p))$. The superscript (1) corresponds to that of $\Phi^{(1)}$. Note that for $a = 1$,
the effect of $T_a^{d}$ in (D.5) is described by $T_1^{d} = (\Phi_1)^d$.
In what follows we illustrate Theorem D.1 and Corollary D.2.
Example E.3. Take a path $p = 11112221322433$, which we have already considered in Example
C.2 and Example E.1. From the $t = 5$ case of Example E.1, both sides of
\[
2222_{4} \otimes 23_{6} \otimes 334_{6} \simeq 2222_{4} \otimes 233_{6} \otimes 34_{6} \tag{E.4}
\]
serve as the normal ordered scattering data $C^{(1)}(p^{(1)})$. According to Theorem D.1 and (D.8), the
original path $p$ is reconstructed as $p = \Phi^{(1)} C^{(1)}(p^{(1)})$. This $\Phi^{(1)}$ is realized, according to (E.3)
and (E.4), as the following compositions of elementary vertex operators:
\[
p = T_1^{4} \Phi_{2222} T_1^{2} \Phi_{23} \Phi_{334}\bigl(1^{\otimes 14}\bigr) = T_1^{4} \Phi_{2222} T_1^{2} \Phi_{233} \Phi_{34}\bigl(1^{\otimes 14}\bigr).
\]
It is easy to check $p = 11112221322433$ from these formulas.
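The check mentioned above can be carried out mechanically. The following Python sketch composes elementary vertex operators as in (E.3), applied to a vacuum of single letters 1. It assumes the tuple encoding of tableaux and the single-letter pairing step of the graphical rule of Appendix B; the names `carry`, `vertex_op` and `phi_1` are hypothetical.

```python
def carry(b, c):
    """One step b (x) c ~ c' (x) b' for a carrier b in B_l and a letter c in B_1."""
    free = sorted(b)
    above = [x for x in free if x < c]
    out = max(above) if above else max(free)   # partner of the dot c
    free.remove(out)
    return out, tuple(sorted(free + [c]))


def vertex_op(b, path):
    """Elementary vertex operator Phi_b on a path of single letters."""
    out = []
    for c in path:
        letter, b = carry(b, c)
        out.append(letter)
    return out


def phi_1(scattering, length):
    """(E.3) applied to the vacuum 1^{(x) length}.

    scattering: list of (b_i, d_i) with d_1 <= ... <= d_m (normal ordered).
    The operators act from right to left: Phi_{b_m} first, T_1^{d_1} last.
    """
    path = [1] * length
    lower = [0] + [d for _, d in scattering[:-1]]
    for (b, d), d_prev in zip(reversed(scattering), reversed(lower)):
        path = vertex_op(b, path)              # Phi_{b_i}
        for _ in range(d - d_prev):            # T_1^{d_i - d_{i-1}} = (Phi_1)^{...}
            path = vertex_op((1,), path)
    return path


data = [((2, 2, 2, 2), 4), ((2, 3), 6), ((3, 3, 4), 6)]
print("".join(map(str, phi_1(data, 14))))
```

Both normal ordered forms in (E.4) can be fed to `phi_1`; the theorem asserts that the result does not depend on this choice.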
Let us illustrate Corollary D.2, which reflects the nested structure of the KKR bijection. For
$\Phi^{(a)}$ (D.5) with general $a$, the formula (E.3) is replaced by
\[
\Phi^{(a)} = (\Phi_a)^{d_1} \Phi_{b_1} (\Phi_a)^{d_2 - d_1} \cdots (\Phi_a)^{d_m - d_{m-1}} \Phi_{b_m}. \tag{E.5}
\]
Example E.4. We consider the same example as above. In the rigged configuration (see Example
E.1), first look at the rightmost two diagrams which form an $A_1^{(1)}$ rigged configuration:
[Rigged configuration: the parts $\mu^{(2)}$ and $(\mu^{(3)}, r^{(3)})$, the single row of $\mu^{(3)}$ having rigging 0.]
From $\mu^{(3)}$, we set $p^{(3)} = 4$ according to (D.6) and obtain the scattering data $C^{(3)}(p^{(3)}) = 4_{1}$,
which is obviously normal ordered. From (E.5), the $A_1$ highest path $p^{(2)} = \Phi^{(3)} C^{(3)}(p^{(3)})$ with
letters 3 and 4 is constructed as
\[
p^{(2)} = \Phi_{3} \Phi_{4} (3 \otimes 333) = 3 \otimes 334.
\]
Taking the rigging attached to $\mu^{(2)}$ into account, we obtain the normal ordered scattering data
$C^{(2)}(p^{(2)}) = 3_{1} \otimes 334_{4}$.
Next we look at the following parts
[Rigged configuration: the parts $\mu^{(1)}$, $(\mu^{(2)}, r^{(2)})$ and $(\mu^{(3)}, r^{(3)})$.]
Then the $A_2$ highest path $p^{(1)} = \Phi^{(2)} C^{(2)}(p^{(2)})$ with letters 2, 3 and 4 is calculated along (E.5)
as
\[
p^{(1)} = \Phi_{2} \Phi_{3} (\Phi_{2})^{3} \Phi_{334} (22 \otimes 222 \otimes 2222) = 22 \otimes 223 \otimes 2334.
\]
As a result, we have reproduced (E.1), which was the starting point of the previous Example E.3.
Summarizing, the path $p = 11112221322433$ has been obtained as $p = \Phi^{(1)} C^{(1)} \Phi^{(2)} C^{(2)} \Phi^{(3)} C^{(3)}(p^{(3)})$.
References
[1] R.J. Baxter, Exactly Solved Models in Statistical Mechanics, Academic Press, London, 1982.
[2] H.A. Bethe, Zur Theorie der Metalle, I. Eigenwerte und Eigenfunktionen der linearen Atomkette, Z. Phys. 71 (1931)
205–231.
[3] M. Gaudin, La fonction d'onde de Bethe, Masson, Paris, 1983.
[4] V.E. Korepin, N.M. Bogoliubov, A.G. Izergin, Quantum Inverse Scattering Method and Correlation Functions,
Cambridge Univ. Press, 1997.
[5] M. Takahashi, Thermodynamics of One-Dimensional Solvable Models, Cambridge Univ. Press, 1999.
[6] G.E. Andrews, R.J. Baxter, P.J. Forrester, Eight vertex SOS model and generalized Rogers–Ramanujan-type identities, J. Stat. Phys. 35 (1984) 193–266.
[7] E. Date, M. Jimbo, A. Kuniba, T. Miwa, M. Okado, Exactly solvable SOS models: Local height probabilities and
theta function identities, Nucl. Phys. B 290 (1987) 231–273;
E. Date, M. Jimbo, A. Kuniba, T. Miwa, M. Okado, Proof of the star–triangle relation and combinatorial identities,
Adv. Stud. Pure Math. 16 (1988) 17–122.
[8] Combinatorial Aspect of Integrable Systems, A. Kuniba, M. Okado (Eds.), MSJ Memoirs 17 (2007).
[9] S.V. Kerov, A.N. Kirillov, N.Yu. Reshetikhin, Combinatorics, the Bethe ansatz and representations of the symmetric
group, J. Sov. Math. 41 (1988) 916–924.
[10] A.N. Kirillov, N.Yu. Reshetikhin, The Bethe ansatz and the combinatorics of Young tableaux, J. Sov. Math. 41
(1988) 925–955.
[11] M. Kashiwara, On crystal bases of the q-analogue of universal enveloping algebras, Duke Math. J. 63 (1991) 465–516.
[12] S.-J. Kang, M. Kashiwara, K.C. Misra, T. Miwa, T. Nakashima, A. Nakayashiki, Affine crystals and vertex models,
Int. J. Mod. Phys. A 7 (Suppl. 1A) (1992) 449–484.
[13] A. Berkovich, B.M. McCoy, A. Schilling, Rogers–Schur–Ramanujan type identities for the M(p, p') minimal models of conformal field theory, Commun. Math. Phys. 191 (1998) 325–395.
[14] S. Dasmahapatra, R. Kedem, T.R. Klassen, B.M. McCoy, E. Melzer, Quasi-particles, conformal field theory, and
q-series, Int. J. Mod. Phys. B 7 (1993) 3617–3648.
[15] B.L. Feigin, A.V. Stoyanovsky, Quasi-particle models for the representations of Lie algebras and geometry of flag
manifold, hep-th/9308079.
[16] O. Foda, T.A. Welsh, Melzer's identities revisited, Contemp. Math. 248 (1999) 207–234.
[17] A. Nakayashiki, Y. Yamada, Kostka polynomials and energy functions in solvable lattice models, Selecta Math.
New Ser. 3 (1997) 547–599.
[18] S.O. Warnaar, Fermionic solution of the Andrews–Baxter–Forrester model I: unification of TBA and CTM methods,
J. Stat. Phys. 82 (1996) 657–685.
[19] G. Hatayama, A. Kuniba, M. Okado, T. Takagi, Y. Yamada, Remarks on fermionic formula, Contemp. Math. 248
(1999) 243–291.
[20] G. Hatayama, A. Kuniba, M. Okado, T. Takagi, Z. Tsuboi, Paths, crystals and fermionic formulae, Prog. Math.
Phys. 23 (2002) 205–272.
[21] I. Macdonald, Symmetric functions and Hall polynomials, second edition, Oxford Univ. Press, New York, 1995.
[22] A.N. Kirillov, A. Schilling, M. Shimozono, A bijection between Littlewood–Richardson tableaux and rigged configurations, Selecta Math. 8 (2002) 67–135.
[23] A. Schilling, X = M Theorem: Fermionic formulas and rigged configurations under review, Combinatorial Aspect
in Integrable Systems, MSJ Memoirs 17 (2007) 75–104.
[24] A. Schilling, M. Shimozono, X = M for symmetric powers, J. Alg. 295 (2006) 562–610.
[25] M. Okado, A. Schilling, M. Shimozono, A crystal to rigged configuration bijection for nonexceptional affine algebras, in: N. Jing (Ed.), Algebraic Combinatorics and Quantum Groups, World Scientific, 2003, pp. 85–124.
[26] G. Hatayama, K. Hikami, R. Inoue, A. Kuniba, T. Takagi, T. Tokihiro, The $A_M^{(1)}$ automata related to crystals of
symmetric tensors, J. Math. Phys. 42 (2001) 274–308.
[27] K. Fukuda, M. Okado, Y. Yamada, Energy functions in box-ball systems, Int. J. Mod. Phys. A 15 (2000) 1379–1392.
[28] D. Takahashi, On some soliton systems defined by using boxes and balls, in: Proceedings of the International
Symposium on Nonlinear Theory and Its Applications NOLTA '93, 1993, pp. 555–558.
[29] D. Takahashi, J. Satsuma, A soliton cellular automaton, J. Phys. Soc. Jpn. 59 (1990) 3514–3519.
[30] G. Hatayama, A. Kuniba, T. Takagi, Soliton cellular automata associated with crystal bases, Nucl. Phys. B 577
(2000) 619–645.
[31] G. Hatayama, A. Kuniba, M. Okado, T. Takagi, Y. Yamada, Scattering rules in soliton cellular automata associated
with crystal bases, Contemp. Math. 297 (2002) 151–182.
[32] A. Kuniba, M. Okado, Y. Yamada, Box-ball system with reflecting end, J. Nonlin. Math. Phys. 12 (2005) 475–507.
[33] T. Tokihiro, D. Takahashi, J. Matsukidaira, J. Satsuma, From soliton equations to integrable cellular automata
through a limiting procedure, Phys. Rev. Lett. 76 (1996) 3247–3250.
[34] A. Kuniba, M. Okado, R. Sakamoto, T. Takagi, Y. Yamada, Crystal interpretation of Kerov–Kirillov–Reshetikhin
bijection, Nucl. Phys. B 740 (2006) 299–327.
[35] R. Sakamoto, Crystal interpretation of Kerov–Kirillov–Reshetikhin bijection II. Proof for $\mathfrak{sl}_n$ case, math.QA/
0601697, J. Alg. Comb., in press.
[36] M. Sato, Y. Sato, Soliton equations as dynamical systems on infinite dimensional Grassmann manifold, Nonlinear
PDE in Applied Science, US–Japan Seminar, Tokyo, 1982, Lecture Notes Numer. Appl. Anal. 5 (1982) 259–271.
[37] M. Jimbo, T. Miwa, Solitons and infinite dimensional Lie algebras, Publ. RIMS. Kyoto Univ. 19 (1983) 943–1001.
[38] Y. Yamada, A birational representation of Weyl group, combinatorial R-matrix and discrete Toda equation, in: A.N.
Kirillov, N. Liskova (Eds.), Physics and Combinatorics 2000, World Scientific, 2001, pp. 305–319.
[39] J.S. Birman, Braids, Links, and Mapping Class Groups, Princeton Univ. Press, 1974.
[40] L. Deka, A. Schilling, New fermionic formula for unrestricted Kostka polynomials, J. Comb. Theor. Ser. A 113
(2006) 1435–1461.
[41] C.L. Schultz, Eigenvectors of the multicomponent generalization of the six-vertex model, Physica A 122 (1983)
71–88.
Tautological relations in Hodge field theory

A. Losev a , S. Shadrin b, , I. Shneiberg c
a Institute for Theoretical and Experimental Physics, Bolshaya Cheremushkinskaya 25, Moscow 117218, Russia
b Department of Mathematics, University of Zurich, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland
c Department of Algebra, Faculty of Mechanics and Mathematics, Moscow State University, Leninskie Gory, GSP,
Moscow 119899, Russia

Received 17 April 2007; accepted 4 July 2007
Abstract
We propose a Hodge field theory construction that captures algebraic properties of the reduction of
Zwiebach invariants to Gromov–Witten invariants. It generalizes the Barannikov–Kontsevich construction
to the case of higher genera correlators with gravitational descendants. We prove the main theorem stating
that algebraically defined Hodge field theory correlators satisfy all tautological relations. From this perspective the statement that the Barannikov–Kontsevich construction provides a solution of the WDVV equation
appears as the simplest particular case of our theorem. It also generalizes the particular cases of other low-genera tautological relations proven in our earlier works; we replace the old technical proofs by a novel
conceptual proof.
1. Introduction
In this paper we present an attempt to formalize what may be called a string field theory (SFT)
for (closed) topological strings with Hodge property.
From the very first days of string theory it was considered as a kind of generalization of the
perturbative expansion of the quantum field theory in the (functional) integral representation. The
space of graphs with g loops with metrics on edges (Schwinger proper times) was generalized
to the moduli space of Riemann surfaces. Indeed, the latter space really looks like a principal
$U(1)^n$ bundle over the former space near the points of maximal degeneracy (i.e., where the
maximal number of handles are pinched).
E-mail addresses: losev@itep.ru (A. Losev), sergey.shadrin@math.unizh.ch (S. Shadrin), shneiberg@mtu-net.ru (I. Shneiberg).
doi:10.1016/j.nuclphysb.2007.07.003
A. Losev et al. / Nuclear Physics B 786 [PM] (2007) 267–296
A natural question is whether there are special string theories that degenerate exactly to quantum field theories (perhaps of a special kind). If so, such theories should enjoy both
the finiteness of string theory and the (functional) integral description of quantum field theory.
One of the first attempts to construct a theory of this type was made by Zwiebach in [1]. He
divided the moduli space into two regions: the internal piece and the boundary. He observed that
surfaces representing the boundary region may be constructed from those representing the internal piece by gluing them with the help of cylinders (with flat metric). Therefore, he proposed
to take integrals over the moduli spaces in two steps: first, to take an integral over the internal
pieces, which would produce vertices, and then to take an integral over metrics on cylinders, which would exactly reproduce the integral over the Schwinger parameters on graphs in the QFT
prescription.
In this approach, he arrived at an infinite number of vertices of different internal genera and
with different numbers of external legs. However, he observed that such vertices satisfy quadratic
relations that were a quantum version of some infinity-structure. At that time the community of
theoretical physicists seemed not to be impressed by a Lagrangian with an infinite number of
(almost uncomputable$^1$) vertices.
The next attempt was made by Witten [2]. He assumed that in topological string theories there
may be a limit in the space of two-dimensional theories such that the measure of integration
goes to the vicinity of the points of maximal degeneration. In the type B theories such a limit
seems to be the large volume limit of the target space; this motivated Witten's Chern–Simons-like representation for the topological string theory. This approach was further developed by
Bershadsky et al. in [3]. We note that the tropical limit of Gromov–Witten theory [4] (type A
topological strings) seems to realize the same QFT degeneration of string theory. Indeed, the
tropical limit of a Riemann surface mapped to a toric variety is represented by the graph mapped
to the moment map domain.
In the development of topological string theory it became clear that the proper object is not
just a measure on the moduli space of complex structures of Riemann surfaces, but rather a
differential form on this space. In the original formulation these differential forms were assigned
to the tensor algebra of cohomology of some complex; such objects are called Gromov–Witten
invariants. We say that Gromov–Witten invariants are QFT-like if the differential forms of non-zero degree have support only in a vicinity of the points of maximal degeneration.
We generalized the definition of Gromov–Witten invariants in [5] by lifting it from the cohomology of a complex to the full complex. Such a generalization involved an enlargement of the
moduli space from the Deligne–Mumford space to the Kimura–Stasheff–Voronov space [6], and we
called the resulting objects Zwiebach invariants (in fact, some pieces of this construction appeared earlier in [1]
and [7]). The complex of states involved in the definition of Zwiebach invariants is a bicomplex
due to the action of the second differential. The second differential represents the substitution of
a special vector field corresponding to the constant rotation of the phase of the local coordinate
at a marked point into differential forms on the Kimura–Stasheff–Voronov space.
Once we have some Zwiebach invariants, it is possible to produce new Zwiebach invariants
by contraction of an acyclic Hodge sub-bicomplex. In fact, it is one of the main properties of
Zwiebach invariants. Consider a sub-bicomplex, where these two differentials act freely. We call
it a Hodge contractible bicomplex. The operation of contraction of a Hodge contractible bicomplex turns Zwiebach invariants into induced Zwiebach invariants on the coset with respect to
the contractible bicomplex. Induced Zwiebach invariants are differential forms whose support is a
union of the support of the initial Zwiebach invariants and small neighbourhoods of the points
of maximal degeneration. This procedure is a generalization from intervals to cylinders of the
procedure of induction of $L_\infty$-structures, see, e.g., [8,9].
¹ Note that the computation of an integral over a subspace with a boundary is harder than that over a compact space.
This way we can obtain QFT-like Gromov–Witten invariants. We just should start with
Zwiebach invariants that have (in some suitable sense) no support inside the Kimura–Stasheff–Voronov
spaces. In fact, it is even enough to consider a weaker condition, motivated by applications. That is, usually people consider the integrals of Gromov–Witten invariants only over the
tautological classes in the moduli space of curves. So, we call a set of Zwiebach invariants vertex-like if the integral over the Kimura–Stasheff–Voronov spaces of any of their non-zero components
multiplied by the pullback (from the Deligne–Mumford space) of any tautological class vanishes.
Consider vertex-like Zwiebach invariants. Assume we contract a Hodge contractible bicomplex down to cohomology. We obtain differential forms on the Deligne–Mumford space, such
that the integral of the product of any such form of non-zero degree with any tautological class
vanishes on the interior of the moduli space. Integrals of such forms over the moduli spaces turn
out to be sums over graphs (corresponding to degenerate Riemann surfaces). They resemble
Feynman diagrams, and the generating function for the integrals over moduli spaces resembles the diagrammatic expansion of perturbative quantum field theory.
In this paper, we do not construct examples of vertex-like Zwiebach invariants (we are going
to do this explicitly in a future publication, as well as the corresponding theory for the spaces
introduced in [10,11]). Rather we conjecture that they exist and study the consequences of this
assumption. We call the emerging construction the Hodge field theory, and now we will explain
it in some detail.
First of all, degree zero parts of vertex-like Zwiebach invariants induce the structure of a homotopy cyclic Hodge algebra on the target complex [5]. We recall that a cyclic Hodge algebra is
just a Hodge dGBV-algebra with one additional axiom (the 1/12-axiom, see below).
In fact, this structure is interesting by itself, without any reference to Zwiebach invariants. It
first appeared in the paper of Barannikov and Kontsevich [12]; it captures the properties of
polyvector fields on Calabi–Yau manifolds. More examples of dGBV-algebras are studied in [13]
and [14]. It is possible to understand the structure of a dGBV-algebra as a natural generalization of
the algebraic structure studied in [15].
In the Hodge field theory construction we consider only a particular case, where we obtain
the axioms of a cyclic Hodge algebra itself, not up to homotopy. We are aware that demanding the existence of vertex-like Zwiebach invariants simultaneously with the vanishing of the homotopy
piece of the cyclic Hodge algebra conditions may be too restrictive, while considering only those
relations that lead to the axioms of a cyclic Hodge algebra may be too weak; nevertheless, we proceed.
In the Hodge field theory construction we define graph expressions for the analogues of
Gromov–Witten invariants multiplied by tautological classes using only cyclic Hodge algebra
data. We call them Hodge field theory correlators. The corresponding action of the Hodge field
theory is written down explicitly in Section 6.2.
Our main result is the proof that the Hodge field theory correlators satisfy all universal equations that follow from relations among tautological classes in the cohomology.
The first result of this kind is due to Barannikov and Kontsevich. They noticed that there
is a solution of the WDVV equation that is associated to a dGBV-algebra (this solution is the
critical value of the BCOV action [3], see [12, Appendix] and [5, Appendix]). Later, we reproved
this in [5]. Then, in [5,16–18] we proved some other low-genera universal equations. Here we
generalize all these results and put all calculations done before in a proper framework.
In particular, the main problem for us was to define a graph expression in tensors of a cyclic
Hodge algebra that corresponds to the full Gromov–Witten potential with descendants. The first
steps were done in [16,17], where we introduced the definition of descendants at one point in
Hodge field theory (mostly for combinatorial reasons). But then we observed that it is a part of a
natural definition of the potential with descendants in cyclic Hodge algebras that appears as a special
case of degeneration of vertex-like Zwiebach invariants multiplied by tautological classes.
In this paper, we present and study this construction. We prove in a completely algebraic
way that Hodge field theory correlators satisfy the same equations as a Gromov–Witten potential: string, dilaton, and the whole system of PDEs coming from tautological relations in the
cohomology of the moduli space of curves (see also [18] for some preliminary results). In what
follows we will not only present the proof but also do our best to relate algebraic definitions
and statements on Hodge field theory to analogous constructions and theorems in the theory of
Zwiebach invariants.
1.1. Organization of the paper
In Section 2 we recall all necessary facts about the axiomatic Gromov–Witten theory. In Section 3 we define Zwiebach invariants and explain the motivation to consider the sums over graphs
in cyclic Hodge algebras. In Section 4 we define cyclic Hodge algebras and the corresponding
descendant potential. In Section 5 we state the main properties of the descendant potential in
cyclic Hodge algebras, and the rest of the paper is devoted to the proofs.
2. Gromov–Witten theory
In this section we recall what Gromov–Witten theory is and explain its basic properties that
we are going to reproduce in the Hodge field theory construction.
2.1. Gromov–Witten invariants
Let us fix a finite dimensional vector space $H_0$ over $\mathbb{C}$ together with the choice of a homogeneous basis $H_0 = \langle e_1, \dots, e_s \rangle$ and a non-degenerate scalar product $\eta_{ij} = (\cdot,\cdot)$ on it. Let $e_1$ be a
distinguished even element of the basis.
Consider the moduli spaces of curves $\overline{M}_{g,n}$. On each $\overline{M}_{g,n}$ we take a differential form $\Omega_{g,n}$
of mixed degree with values in $H_0^{\otimes n}$. The whole system of forms $\{\Omega_{g,n}\}$ is called Gromov–Witten
invariants, if it satisfies the axioms [19,20]:
1. There are two actions of the symmetric group $S_n$ on $\Omega_{g,n}$. First, we can relabel the marked
points on curves in $\overline{M}_{g,n}$; second, we can interchange the factors in the tensor product $H_0^{\otimes n}$.
We require that $\Omega_{g,n}$ is equivariant with respect to these two actions of $S_n$. In other words,
one can think that each copy of $H_0$ in the tensor product is assigned to a specific marked
point on curves in $\overline{M}_{g,n}$.
2. The forms must be closed, $d\Omega_{g,n} = 0$.
3. Consider the mapping $\pi\colon \overline{M}_{g,n+1} \to \overline{M}_{g,n}$ forgetting the last marked point. Then the correspondence between $\Omega_{g,n}$ and $\Omega_{g,n+1}$ is given by the formula
\[ \pi^*\Omega_{g,n} = (\Omega_{g,n+1}, e_1). \tag{1} \]
The meaning of the right-hand side is the following. We want to turn an $H_0^{\otimes(n+1)}$-valued form
into an $H_0^{\otimes n}$-valued one. So, we take the copy of $H_0$ corresponding to the last marked point
and contract it with the vector $e_1$ using the scalar product.
4. Consider an irreducible boundary divisor in $\overline{M}_{g,n}$, whose generic point is represented by
a two-component curve. It is the image of a natural mapping $\sigma\colon \overline{M}_{g_1,n_1+1} \times \overline{M}_{g_2,n_2+1} \to \overline{M}_{g,n}$, where $g = g_1 + g_2$ and $n = n_1 + n_2$. We require that
\[ \sigma^*\Omega_{g,n} = (\Omega_{g_1,n_1+1} \wedge \Omega_{g_2,n_2+1}, \eta^{-1}). \tag{2} \]
Here on the right-hand side we contract with a scalar product the two copies of $H_0$ that
correspond to the node.
In the same way, consider the divisor of genus $g-1$ curves with one self-intersection. It is
the image of a natural mapping $\rho\colon \overline{M}_{g-1,n+2} \to \overline{M}_{g,n}$. In this case, we require that
\[ \rho^*\Omega_{g,n} = (\Omega_{g-1,n+2}, \eta^{-1}). \tag{3} \]
As before, we contract the two copies of $H_0$ corresponding to the node.
5. We also assume that $(\Omega_{0,3}, e_1 \otimes e_i \otimes e_j) = (e_i, e_j) = \eta_{ij}$.
2.2. Gromov–Witten potential
Let us associate to each $e_i$ the set of formal variables $T_{n,i}$, $n = 0, 1, 2, \dots$. By $F_g$ denote the
formal power series in these variables defined as
\[ F_g := \sum_n \frac{1}{n!} \sum_{a_1,\dots,a_n \geq 0} \int_{\overline{M}_{g,n}} \Big( \Omega_{g,n} \prod_{i=1}^n \psi_i^{a_i},\ \bigotimes_{i=1}^n \sum_{j=1}^s e_j T_{a_i,j} \Big). \tag{4} \]
The first sum is taken over $n \geq 3$ for $g = 0$, $n \geq 1$ for $g = 1$, and $n \geq 0$ for $g \geq 2$. On the right-hand side, we contract each copy of $H_0$ with the factor of the tensor product associated to the
same marked point.
The formal power series $F := \exp(\sum_{g \geq 0} \hbar^{g-1} F_g)$ is called the Gromov–Witten potential associated to the system of Gromov–Witten invariants $\{\Omega_{g,n}\}$. The coefficients of $F_g$, $g \geq 0$, are called
correlators and denoted by
\[ \langle \tau_{a_1,i_1} \cdots \tau_{a_n,i_n} \rangle_g := \int_{\overline{M}_{g,n}} \Big( \Omega_{g,n} \prod_{j=1}^n \psi_j^{a_j},\ \bigotimes_{j=1}^n e_{i_j} \Big). \tag{5} \]
Vectors $e_{i_1}, \dots, e_{i_n}$ are called primary fields.

The main properties of GW potentials come from the geometry of the moduli space of curves.
First, one can prove that the coefficients of $F$ satisfy the string and dilaton equations:
\[ \Big\langle \tau_{0,1} \prod_{j=1}^n \tau_{a_j,i_j} \Big\rangle_g = \sum_{j=1}^n \Big\langle \tau_{a_j-1,i_j} \prod_{k \neq j} \tau_{a_k,i_k} \Big\rangle_g, \tag{6} \]
\[ \Big\langle \tau_{1,1} \prod_{j=1}^n \tau_{a_j,i_j} \Big\rangle_g = (2g-2+n) \Big\langle \prod_{j=1}^n \tau_{a_j,i_j} \Big\rangle_g. \tag{7} \]
The string equation is a corollary of the fact that $\pi^*\psi_j = \psi_j - D_j$; here $\pi\colon \overline{M}_{g,n+1} \to \overline{M}_{g,n}$
is the projection forgetting the last marked point and $D_j$ is the divisor in $\overline{M}_{g,n+1}$ whose generic
point is represented by a two-component curve with one node such that one component has
genus 0 and contains exactly two marked points, the $j$th and the $(n+1)$th ones. It is assumed
that $\sum_{j=1}^n a_j > 0$.
The dilaton equation is a corollary of the fact that, in the same notations, $\pi_*\psi_{n+1} = 2g-2+n$.
Of course, we assume that $2g-2+n > 0$.
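As a sanity check of the string and dilaton equations in the simplest situation, the following Python sketch verifies them for the Gromov–Witten theory of a point in genus 0. It assumes the standard closed formula $\int_{\overline{M}_{0,n}} \psi_1^{a_1} \cdots \psi_n^{a_n} = (n-3)!/(a_1! \cdots a_n!)$ for $a_1 + \dots + a_n = n-3$, which is not derived in this paper; the function names are hypothetical and serve only as an illustration.

```python
from itertools import product
from math import factorial


def genus0_intersection(exps):
    """<tau_{a_1} ... tau_{a_n}>_0 for the point, via the assumed closed formula."""
    n = len(exps)
    if n < 3 or sum(exps) != n - 3:
        return 0  # unstable case or wrong dimension
    denominator = 1
    for a in exps:
        denominator *= factorial(a)
    return factorial(n - 3) // denominator


def string_equation_holds(exps):
    """Compare both sides of the string equation (6) in genus 0."""
    lhs = genus0_intersection((0,) + tuple(exps))
    rhs = sum(
        genus0_intersection(tuple(exps[:j]) + (a - 1,) + tuple(exps[j + 1:]))
        for j, a in enumerate(exps)
        if a > 0
    )
    return lhs == rhs


def dilaton_equation_holds(exps):
    """Compare both sides of the dilaton equation (7) in genus 0."""
    n = len(exps)
    lhs = genus0_intersection((1,) + tuple(exps))
    rhs = (2 * 0 - 2 + n) * genus0_intersection(tuple(exps))
    return lhs == rhs


# Brute-force check over small exponent tuples.
for n in range(3, 6):
    for exps in product(range(4), repeat=n):
        assert dilaton_equation_holds(exps)
        if sum(exps) > 0:
            assert string_equation_holds(exps)
```

The restrictions in the loop mirror the assumptions in the text: the string equation is checked only when the sum of the exponents is positive, and the dilaton equation only when $2g-2+n > 0$, i.e. $n \geq 3$ in genus 0.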
Second, any relation in the cohomology of $\overline{M}_{g,n}$ among natural $\psi$-$\kappa$-strata gives a relation
for the correlators. Let us explain this in more detail.
2.3. Tautological relations
2.3.1. Stable dual graphs
The moduli space of curves $\overline{M}_{g,n}$ [21] has a natural stratification by the topological type of
stable curves. We can combine natural strata with $\psi$-classes at marked points and at nodes and
$\kappa$-classes on the moduli spaces of irreducible components. These objects are called $\psi$-$\kappa$-strata.
A convenient way to describe a $\psi$-$\kappa$-stratum in $\overline{M}_{g,n}$ is the language of stable dual graphs.
Take a generic curve in the stratum. To each irreducible component we associate a vertex marked
by its genus. To each node we associate an edge connecting the corresponding vertices (or a
loop, if it is a double point of an irreducible curve). If there is a marked point on a component,
then we add a leaf at the corresponding vertex, and we label leaves in the same way as marked
points. If we multiply a stratum by some $\psi$-classes, then we just mark the corresponding leaves
or half-edges (in the case when we add $\psi$-classes at nodes) by the corresponding powers of $\psi$.
Also we mark each vertex by the $\kappa$-class associated to it.
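The dictionary between strata and graphs can be encoded in a small data structure. The following Python sketch is only an illustration: the class and method names are hypothetical, the graph is assumed to be connected, and the $\psi$- and $\kappa$-decorations are omitted. It uses the standard stability condition $2g(v) - 2 + \mathrm{val}(v) > 0$ at each vertex and the standard formula expressing the genus of the curve as the sum of the vertex genera plus the first Betti number of the graph.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    genus: int                                   # genus of the irreducible component
    leaves: list = field(default_factory=list)   # labels of marked points on it


@dataclass
class StableDualGraph:
    vertices: list   # list of Vertex
    edges: list      # pairs (u, v) of vertex indices; u == v encodes a loop

    def valence(self, i):
        """Number of half-edges and leaves attached to vertex i."""
        half_edges = sum((u == i) + (v == i) for u, v in self.edges)
        return half_edges + len(self.vertices[i].leaves)

    def is_stable(self):
        """Every component must satisfy 2g - 2 + valence > 0."""
        return all(
            2 * vertex.genus - 2 + self.valence(i) > 0
            for i, vertex in enumerate(self.vertices)
        )

    def genus(self):
        """Total genus; assumes the graph is connected."""
        first_betti = len(self.edges) - len(self.vertices) + 1
        return sum(vertex.genus for vertex in self.vertices) + first_betti

    def codimension(self):
        """Complex codimension of the stratum: one for each node."""
        return len(self.edges)


# A boundary divisor of genus 1 with three marked points: two components joined at one node.
divisor = StableDualGraph(
    vertices=[Vertex(0, [1, 2]), Vertex(1, [3])],
    edges=[(0, 1)],
)
# An irreducible genus 0 curve with one self-intersection and one marked point.
loop = StableDualGraph(vertices=[Vertex(0, [1])], edges=[(0, 0)])
```

Both sample graphs are stable and have total genus 1; each has one edge and therefore describes a stratum of complex codimension one.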
Let us remark that by $\kappa$-classes on $\overline{M}_{g,n}$ we mean the classes
\[ \kappa_{k_1,\dots,k_l} := \pi_*\Big( \prod_{j=1}^l \psi_{n+j}^{k_j+1} \Big), \tag{8} \]
where $\pi\colon \overline{M}_{g,n+l} \to \overline{M}_{g,n}$ is the projection forgetting the last $l$ marked points. It is just another
additive basis in the ring generated by the ordinary $\kappa$-classes ($\kappa_k$, $k \geq 1$, in our notations). The
basic properties of these classes are stated in [22].
2.3.2. Integrals over $\psi$-$\kappa$-strata
Using the properties of GW invariants, one can express the integral of $\Omega_{g,n}$ over a $\psi$-$\kappa$-stratum $S$ in terms of correlators.
Consider a special case, when $S$ is represented by a two-vertex graph with no $\psi$- and $\kappa$-classes. Then, according to axiom 4, the integral of $\Omega_{g,n}$ is the product of integrals of $\Omega_{g_1,n_1+1}$
and $\Omega_{g_2,n_2+1}$ over the moduli spaces corresponding to the vertices, contracted by the scalar
product:
\[ \int_S \Big( \Omega_{g,n},\ \bigotimes_{j=1}^n e_{i_j} \Big) = \int_{\overline{M}_{g_1,n_1+1}} \Big( \Omega_{g_1,n_1+1},\ \bigotimes_{j \in J_1} e_{i_j} \otimes e_{i'} \Big)\, \eta^{i'i''} \int_{\overline{M}_{g_2,n_2+1}} \Big( \Omega_{g_2,n_2+1},\ \bigotimes_{j \in J_2} e_{i_j} \otimes e_{i''} \Big). \tag{9} \]
Here we assume that the genus of one component of a generic curve in $S$ is $g_1$ and $n_1$ marked
points with labels $j \in J_1$, $|J_1| = n_1$, are on this component. The other component has genus $g_2$
and $n_2$ marked points with labels $j \in J_2$, $|J_2| = n_2$, lie on it. Of course, $g_1 + g_2 = g$, $n_1 + n_2 = n$.
Now consider a special case, when $S$ is represented by a one-vertex graph with $\psi$- and $\kappa$-classes. Let us assign a vector in the basis of $H_0$ to each leaf (to each marked point). Then,
according to axiom 3 the integral
\[ \int_{\overline{M}_{g,n}} \Big( \Omega_{g,n} \prod_{j=1}^n \psi_j^{a_j}\, \kappa_{b_1,\dots,b_k},\ \bigotimes_{j=1}^n e_{i_j} \Big) \tag{10} \]
is equal to
\[ \int_{\overline{M}_{g,n+k}} \Big( \Omega_{g,n+k} \prod_{j=1}^n \psi_j^{a_j} \prod_{j=1}^k \psi_{n+j}^{b_j+1},\ \bigotimes_{j=1}^n e_{i_j} \otimes e_1^{\otimes k} \Big). \tag{11} \]
Combining these two special cases one can obtain an expression in correlators that corresponds to an arbitrary $\psi$-$\kappa$-stratum.
2.3.3. Relations for correlators
Suppose that we have a linear combination $L$ of $\psi$-$\kappa$-strata that is equal to
0 in the cohomology of $\overline{M}_{g,n}$ (a tautological relation). Since $d\Omega_{g,n} = 0$, the integral of $(\Omega_{g,n}, \bigotimes_{j=1}^n e_{i_j})$ over $L$
is equal to zero, for an arbitrary choice of primary fields. This gives an equation for correlators.
Usually, one considers also the pull-backs of $L$ to $\overline{M}_{g,n+n'}$, $n' \geq 0$, multiplied by arbitrary
monomials of $\psi$-classes. Of course, they are also represented as vanishing linear combinations
of $\psi$-$\kappa$-strata. This gives a system of PDEs for the formal power series $F_g$, $g \geq 0$. For the
detailed description of the correspondence between tautological relations and universal PDEs for
GW potentials see, e.g., [23] or [22].
There are 8 basic tautological relations known at the moment: WDVV, Getzler, Belorousski–Pandharipande, and topological recursion relations in $\overline{M}_{0,4}$, $\overline{M}_{1,1}$, $\overline{M}_{2,1}$, $\overline{M}_{2,2}$, $\overline{M}_{3,1}$ [23–26].
3. Zwiebach invariants
In Gromov–Witten theory (and also in topological string theory) the Gromov–Witten invariants are usually a structure on the cohomology of a target manifold (the space $H_0$) or on the
cohomology of a complex of some other geometric origin. We have introduced the notion of
Zwiebach invariants in [5] in order to formalize in a convenient way what physicists mean by
topological conformal quantum field theory at the level of a complex rather than at the level of
the cohomology.
The very general principles of homological algebra imply that algebraic structures on the
cohomology are often induced by some fundamental structures on a full complex (the standard
example is the induction of the infinity-structures from differential graded algebraic structures).
Such induction usually can be represented as a sum over trees with vertices corresponding to
fundamental operations and edges corresponding to the homotopy that contracts the complex to
its cohomology.
We would like to stress that Gromov–Witten invariants also can be considered as an induced
structure on the cohomology of a complex. In this case, the fundamental structure on the whole
complex is determined by Zwiebach invariants.
We are able to associate some structure on a bicomplex with a special compactification of the
moduli space of curves (the Kimura–Stasheff–Voronov compactification). So, complexes are replaced
by bicomplexes, where the second differential reflects the rotation of attached cylinders (or circles). This is an appearance of the string nature of the problem.
As an induced structure we indeed obtain a Gromov–Witten-type theory that, under some
additional assumptions, can be presented in terms of a sum over graphs. Below we explain the
whole construction, following [5] and with some additional details.
3.1. Kimura–Stasheff–Voronov spaces
We recall the construction of the Kimura–Stasheff–Voronov compactification $\overline{K}_{g,n}$ of the
moduli space of curves of genus $g$ with $n$ marked points. It is a real blow-up of $\overline{M}_{g,n}$; we just
remember the relative angles at double points. We can also choose an angle of the tangent vector
at each marked point; this way we get the principal $U(1)^n$-bundle over $\overline{K}_{g,n}$. We denote the total
space of this bundle by $S_{g,n}$.
There are also the standard mappings between different spaces $S_{g,n}$. First, one can consider
the projection $\pi\colon S_{g,n+1} \to S_{g,n}$ forgetting the last marked point. Suppose that under the projection we have to contract a sphere that contains the points $x_i$, $x_{n+1}$, and a node. Denote the
natural coordinates on the circles corresponding to $x_i$ and a node on a curve in $S_{g,n+1}$ by $\phi_i$
and $\theta$. Let $\tilde\phi_i$ be a coordinate on the circle corresponding to $x_i$ in $S_{g,n}$. Then $\tilde\phi_i = \phi_i + \theta$ under
the projection $\pi$. In the same way, if we contract a sphere that contains two nodes and $x_{n+1}$, then
$\tilde\theta = \theta_1 + \theta_2$, where $\theta_1$ and $\theta_2$ are the coordinates on the circles corresponding to the two nodes
of a curve in $S_{g,n+1}$ and $\tilde\theta$ is a coordinate on the circle at the resulting node in $S_{g,n}$.
In the same way, when we consider the mappings $\sigma\colon S_{g_1,n_1+1} \times S_{g_2,n_2+1} \to S_{g,n}$ representing
the natural boundary components of $S_{g,n}$, we have $\theta = \phi_{n_1+1} + \phi_{n_2+1}$, where $\phi_{n_1+1}$ and $\phi_{n_2+1}$
are the coordinates on the circles corresponding to the points that are glued by $\sigma$ into the node and
$\theta$ is the coordinate on the circle at the node. For the mapping $\rho\colon S_{g-1,n+2} \to S_{g,n}$ we also have
$\theta = \phi_{n+1} + \phi_{n+2}$ with the same notations.
3.2. Zwiebach invariants
Let us fix a Hodge bicomplex $H$ with two differentials denoted by $Q$ and $G_-$ and with an
even scalar product $\eta = (\cdot,\cdot)$ invariant under the differentials:
\[ (Qv, w) = \pm(v, Qw), \qquad (G_-v, w) = \pm(v, G_-w). \tag{12} \]
The Hodge property means that
\[ H = H_0 \oplus \bigoplus_\alpha \langle e_\alpha, Qe_\alpha, G_-e_\alpha, QG_-e_\alpha \rangle, \tag{13} \]
where $QH_0 = G_-H_0 = 0$ and $H_0$ is orthogonal to $H_4$, the sum of the four-dimensional blocks.
Below we consider the action of $Q$ and $G_-$ on $H^{\otimes n}$. We denote by $Q^{(k)}$ and $G_-^{(k)}$ the action
of $Q$ and $G_-$, respectively, on the $k$th component of the tensor product.
On each $S_{g,n}$ we take a differential form $C_{g,n}$ of mixed degree with values in $H^{\otimes n}$. The
whole system of forms $\{C_{g,n}\}$ is called Zwiebach invariants, if it satisfies the axioms:
1. $C_{g,n}$ is $S_n$-equivariant.
2. $(d + Q)C_{g,n} = 0$, $Q = \sum_{i=1}^n Q^{(i)}$.
3. $(G_-^{(k)} + \iota_k)C_{g,n} = 0$ for all $1 \leq k \leq n$ (we denote by $\iota_k$ the substitution of the vector field
generating the action on $S_{g,n}$ of the $k$th copy of $U(1)$); $C_{g,n}$ is invariant under the action
of $U(1)^n$.
4. $\pi^*C_{g,n} = (C_{g,n+1}, e_1)$, where $\pi\colon S_{g,n+1} \to S_{g,n}$ is the mapping forgetting the last marked
point.
5. $\sigma^*C_{g,n} = (C_{g_1,n_1+1} \wedge C_{g_2,n_2+1}, \eta^{-1})$, where $\sigma\colon S_{g_1,n_1+1} \times S_{g_2,n_2+1} \to S_{g,n}$ represents
the boundary component. In the same way, $\rho^*C_{g,n} = (C_{g-1,n+2}, \eta^{-1})$ for the mapping
$\rho\colon S_{g-1,n+2} \to S_{g,n}$.
6. $(C_{0,3}, e_1 \otimes v \otimes w) = ((\mathrm{Id} + d\phi_2\, G_-)v, (\mathrm{Id} + d\phi_3\, G_-)w)$, where $\phi_2$ and $\phi_3$ are the coordinates
on the circles at the corresponding points.
Zwiebach invariants on the bicomplex with zero differentials determine Gromov–Witten invariants. Indeed, in this case the factorization property implies that $\{C_{g,n}\}$ is lifted from the
blowdown of Kimura–Stasheff–Voronov spaces, i.e., it is determined by a set of forms on
Deligne–Mumford spaces. Then it is easy to check that this system of forms satisfies all axioms
of Gromov–Witten invariants.
3.3. Induced Zwiebach invariants
Induced Zwiebach invariants are obtained by the contraction of $H_4$. We denote by $G_+$ the
contraction operator. This means that $G_+H_0 = 0$, $\Pi = \{Q, G_+\}$ is the projection to $H_4$ along $H_0$,
$\{G_+, G_-\} = 0$, and $(G_+v, w) = \pm(v, G_+w)$.
We construct an induced Zwiebach form $C^{ind}_{g,n}$ on a homotopy equivalent modification $\tilde{S}_{g,n}$ of
the space $S_{g,n}$. At each boundary component $\gamma$ we glue the cylinder $\gamma \times [0, +\infty]$ such that $\gamma$ in
$S_{g,n}$ is identified with $\gamma \times \{0\}$ in the cylinder.
So, we have the mappings $\sigma\colon S_{g_1,n_1+1} \times S_{g_2,n_2+1} \times [0, +\infty] \to \tilde{S}_{g,n}$ and $\rho\colon S_{g-1,n+2} \times
[0, +\infty] \to \tilde{S}_{g,n}$ representing the boundary components with glued cylinders. We take a form
$C_{g,n}$, restrict it to $H_0^{\otimes n}$, and extend it to the glued cylinder by the rule that
\[ \sigma^*C^{ind}_{g,n} = \big( C^{ind}_{g_1,n_1+1} \wedge C^{ind}_{g_2,n_2+1},\ [e^{-t\Pi - dt \cdot G_+}] \big) \tag{14} \]
in the first case and
\[ \rho^*C^{ind}_{g,n} = \big( C^{ind}_{g-1,n+2},\ [e^{-t\Pi - dt \cdot G_+}] \big) \tag{15} \]
in the second case, where $[e^{-t\Pi - dt \cdot G_+}]$ is the bivector obtained from the operator $e^{-t\Pi - dt \cdot G_+}$, and
$t$ is the coordinate on $[0, +\infty]$. This determines $C^{ind}_{g,n}$ completely.
Now it is a straightforward calculation to check that the forms $C^{ind}_{g,n}$ are $(d + Q)$-closed and
satisfy the factorization property when restricted to the strata $\gamma \times \{+\infty\}$.
3.4. Induced Gromov–Witten theory
The induced Zwiebach invariants determine Gromov–Witten invariants. The correlators of the
corresponding Gromov–Witten potential are given by the integrals over the fundamental
cycles of $\overline{K}_{g,n}$ (we just forget the circles at marked points in $\tilde{S}_{g,n}$) of the forms $C^{ind}_{g,n} \prod_{i=1}^n \psi_i^{a_i}$.
In fact, the fundamental class of $\overline{K}_{g,n}$ is represented as a sum over all irreducible boundary
strata in $\overline{M}_{g,n}$. Indeed, a boundary stratum in $\overline{M}_{g,n}$ has real codimension equal to the doubled number of the nodes of its generic curve. But then we add in $\overline{K}_{g,n}$ a real two-dimensional
cylinder for each node. A simple explicit calculation allows one to express the integral over the component of the fundamental cycle of $\overline{K}_{g,n}$ corresponding to $\gamma$. It splits into the integrals of the
initial Zwiebach invariants (multiplied by $\psi$-classes) over the moduli spaces corresponding to
the irreducible components of curves in $\gamma$; they are contracted with the bivectors $[G_-G_+]$ (obtained from the operator $G_-G_+$ via the scalar product) corresponding to the nodes according to
the topology of curves in $\gamma$.
So, we represent the correlators of the induced Gromov–Witten theory as sums over graphs.
Then one can observe that $C_{0,3}$ determines a multiplication on $H$. Topology of the spaces $S_{0,4}$
and $S_{1,1}$ implies that the whole algebraic structure that we obtain on $H$ is the structure of a cyclic
Hodge algebra up to $Q$-homotopy, see [5]. Let us assume that the initial system of Zwiebach
invariants is simple enough, i.e., it induces the explicit structure of a cyclic Hodge algebra on $H$
and only the integrals of the zero-degree parts of the initial Zwiebach invariants (multiplied by
$\psi$-classes) are non-vanishing on fundamental cycles. In this case, the induced Gromov–Witten
potential can be described in very simple algebraic terms. This is the motivation for the definition of
the Hodge field theory construction given in the next section.
4. Construction of correlators in Hodge field theory
In this section, we describe in a very formal algebraic way the sum over graphs obtained as an
expression for the Gromov–Witten potential induced from Zwiebach invariants in the previous
section.
4.1. Cyclic Hodge algebras
In this section, we recall the definition of cyclic Hodge dGBV-algebras [5,16,17,20] (cyclic
Hodge algebras, for short). A supercommutative associative $\mathbb{C}$-algebra $H$ with unit is called a
cyclic Hodge algebra, if there are two odd linear operators $Q, G_-\colon H \to H$ and an even linear
function $\int\colon H \to \mathbb{C}$ called integral. They must satisfy the following axioms:
1. $(H, Q, G_-)$ is a bicomplex:
\[ Q^2 = G_-^2 = QG_- + G_-Q = 0. \tag{16} \]
2. $H = H_0 \oplus H_4$, where $QH_0 = G_-H_0 = 0$ and $H_4$ is represented as a direct sum of subspaces
of dimension 4 generated by $e_\alpha, Qe_\alpha, G_-e_\alpha, QG_-e_\alpha$ for some vectors $e_\alpha \in H_4$, i.e.,
\[ H = H_0 \oplus \bigoplus_\alpha \langle e_\alpha, Qe_\alpha, G_-e_\alpha, QG_-e_\alpha \rangle \tag{17} \]
(Hodge decomposition).
3. $Q$ is an operator of the first order, it satisfies the Leibniz rule:
\[ Q(ab) = Q(a)b + (-1)^{\tilde a} aQ(b) \tag{18} \]
(here and below we denote by $\tilde a$ the parity of $a \in H$).
4. $G_-$ is an operator of the second order, it satisfies the 7-term relation:
\[ G_-(abc) = G_-(ab)c + (-1)^{\tilde b(\tilde a+1)} bG_-(ac) + (-1)^{\tilde a} aG_-(bc) - G_-(a)bc - (-1)^{\tilde a} aG_-(b)c - (-1)^{\tilde a+\tilde b} abG_-(c). \tag{19} \]
5. $G_-$ satisfies the property called the 1/12-axiom:
\[ \mathrm{str}(G_- \circ a\cdot) = (1/12)\, \mathrm{str}(G_-(a)\cdot) \tag{20} \]
(here $a\cdot$ and $G_-(a)\cdot$ are the operators of multiplication by $a$ and $G_-(a)$, respectively, str
means supertrace).
Define an operator $G_+\colon H \to H$ related to the particular choice of Hodge decomposition. We
put $G_+H_0 = 0$, and on each subspace $\langle e_\alpha, Qe_\alpha, G_-e_\alpha, QG_-e_\alpha \rangle$ we define $G_+$ as
\[ G_+e_\alpha = G_+G_-e_\alpha = 0, \qquad G_+Qe_\alpha = e_\alpha, \qquad G_+QG_-e_\alpha = G_-e_\alpha. \tag{21} \]
We see that $[G_-, G_+] = 0$; $\Pi_4 = [Q, G_+]$ is the projection to $H_4$ along $H_0$; $\Pi_0 = \mathrm{Id} - \Pi_4$ is the
projection to $H_0$ along $H_4$.
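The identities for $G_+$ can be checked by hand on a single four-dimensional block of the Hodge decomposition. The following Python sketch does this with explicit $4 \times 4$ matrices in the basis $e, Qe, G_-e, QG_-e$; it assumes that the brackets of the odd operators $Q$, $G_-$, $G_+$ are graded commutators, i.e. anticommutators, and all names in the code are hypothetical.

```python
# Basis order of one four-dimensional block: 0 = e, 1 = Qe, 2 = G_-e, 3 = QG_-e.
# m[i][j] is the coefficient of basis vector i in the image of basis vector j.

def zeros():
    return [[0] * 4 for _ in range(4)]


def operator(images):
    """Build a matrix from {source index: (target index, coefficient)}."""
    m = zeros()
    for source, (target, coefficient) in images.items():
        m[target][source] = coefficient
    return m


def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(4)] for i in range(4)]


def anticommutator(a, b):
    return add(mul(a, b), mul(b, a))


ZERO = zeros()
IDENTITY = [[int(i == j) for j in range(4)] for i in range(4)]

# Q: e -> Qe, G_-e -> QG_-e.
Q = operator({0: (1, 1), 2: (3, 1)})
# G_-: e -> G_-e, Qe -> G_-Qe = -QG_-e (because QG_- + G_-Q = 0).
G_MINUS = operator({0: (2, 1), 1: (3, -1)})
# G_+ as in (21): Qe -> e, QG_-e -> G_-e.
G_PLUS = operator({1: (0, 1), 3: (2, 1)})

# Axiom 1 on the block.
assert mul(Q, Q) == ZERO
assert mul(G_MINUS, G_MINUS) == ZERO
assert anticommutator(Q, G_MINUS) == ZERO
# [G_-, G_+] = 0 and [Q, G_+] acts as the identity on the block.
assert anticommutator(G_MINUS, G_PLUS) == ZERO
assert anticommutator(Q, G_PLUS) == IDENTITY
```

On $H_0$ all three operators vanish, so the last identity is exactly the statement that $[Q, G_+]$ is the projection to $H_4$ along $H_0$.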
Consider the integral $\int\colon H \to \mathbb{C}$. We require that
\[ \int Q(a)b = (-1)^{\tilde a+1} \int aQ(b), \qquad \int G_-(a)b = (-1)^{\tilde a} \int aG_-(b), \qquad \int G_+(a)b = (-1)^{\tilde a} \int aG_+(b). \tag{22} \]
These properties imply that $\int G_-G_+(a)b = \int aG_-G_+(b)$, $\int \Pi_4(a)b = \int a\Pi_4(b)$, and
$\int \Pi_0(a)b = \int a\Pi_0(b)$.
We can define a scalar product on $H$ as $(a, b) = \int ab$. We suppose that this scalar product is
non-degenerate. Using the scalar product we may turn any operator $A\colon H \to H$ into the bivector
that we denote by $[A]$.
4.2. Tensor expressions in terms of graphs
Here we explain a way to encode some tensor expressions over an arbitrary vector space in
terms of graphs.
Consider an arbitrary graph (we allow graphs to have leaves and we require vertices to be
at least of degree 3, the definition of graph that we use can be found in [20]). We associate a
symmetric n-form to each internal vertex of degree n, a symmetric bivector to each edge, and
a vector to each leaf. Then we can substitute the tensor product of all vectors in leaves and
bivectors in edges into the product of n-forms in vertices, distributing the components of tensors
in the same way as the corresponding edges and leaves are attached to vertices in the graph. This
way we get a number.
Let us study an example:

(23)
We assign a 5-form x to the left vertex of this graph and a 3-form y to the right vertex. Then the
number that we get from this graph is x(a, b, c, v, w) y(v, w, d).
Note that vectors, bivectors and n-forms used in this construction can depend on some variables. Then what we get is not a number, but a function.
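The contraction rule for such a graph can be sketched in a few lines of Python. This is a toy model under explicit assumptions: the vector space is two-dimensional, the symmetric forms at the vertices are products of pairings with a fixed covector (a hypothetical choice made only so that the forms are multilinear and symmetric), and each of the two edges carries a bivector stored as a matrix of coefficients in a fixed basis. None of these choices comes from the paper.

```python
from itertools import product

DIM = 2
OMEGA = [1, 2]  # hypothetical covector defining the toy symmetric forms


def basis(p):
    """The p-th basis vector of the toy space."""
    return [1 if i == p else 0 for i in range(DIM)]


def pairing(vector):
    return sum(w * x for w, x in zip(OMEGA, vector))


def form(*vectors):
    """Toy symmetric multilinear form: the product of the pairings with OMEGA."""
    result = 1
    for vector in vectors:
        result *= pairing(vector)
    return result


def graph_number(a, b, c, d, bivector_v, bivector_w):
    """Contract a 5-form x (leaves a, b, c) with a 3-form y (leaf d) along two edges.

    bivector_v[p][q] is the coefficient of basis(p) (x) basis(q); the first
    component of each bivector is fed into x and the second into y.
    """
    total = 0
    for p, q, r, s in product(range(DIM), repeat=4):
        coefficient = bivector_v[p][q] * bivector_w[r][s]
        if coefficient:
            x_value = form(a, b, c, basis(p), basis(r))
            y_value = form(basis(q), basis(s), d)
            total += coefficient * x_value * y_value
    return total


IDENTITY_BIVECTOR = [[1, 0], [0, 1]]
e0 = basis(0)
value = graph_number(e0, e0, e0, e0, IDENTITY_BIVECTOR, IDENTITY_BIVECTOR)
```

If the vectors or bivectors depend on formal variables, the same contraction produces a function of these variables instead of a number.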
4.3. Usage of graphs in cyclic Hodge algebras
Consider a cyclic Hodge algebra H . There are some standard tensors over H , which we
associate to elements of graphs below. Here we introduce the notations for these tensors.
We always assign the form
\[ (a_1, \dots, a_n) \mapsto \int a_1 \cdots a_n \tag{24} \]
to a vertex of degree $n$.
There is a collection of bivectors that will be assigned below to edges: $[G_-G_+]$, $[\Pi_0]$, $[\mathrm{Id}]$,
$[QG_+]$, $[G_+Q]$, $[G_+]$, and $[G_-]$. In pictures, edges with these bivectors will be denoted by
,
(25)
respectively. Note that an empty edge corresponding to the bivector [Id] can usually be contracted
(if it is not a loop).
The vectors that we will put at leaves depend on some variables. Let $\{e_1, \dots, e_s\}$ be a homogeneous basis of $H_0$. In particular, we assume that $e_1$ is the unit of $H$. To each vector $e_i$ we
associate formal variables $T_{n,i}$, $n \geq 0$, of the same parity as $e_i$. Then we will put at a leaf one of
the vectors $E_n = \sum_{i=1}^s e_i T_{n,i}$, $n \geq 0$, and we will mark such a leaf by the number $n$. In our pictures,
an empty leaf is the same as the leaf marked by 0.
4.3.1. Remark
There is a subtlety related to the fact that $H$ is a $\mathbb{Z}_2$-graded space. In order to give an honest
definition we must do the following. Suppose we consider a graph of genus $g$. We can choose
$g$ edges in such a way that the graph being cut at these edges turns into a tree. To each of these
edges we have already assigned a bivector $[A]$ for some operator $A\colon H \to H$. Now we have to
put the bivector $[JA]$ instead of the bivector $[A]$, where $J$ is an operator defined by the formula
$J\colon a \mapsto (-1)^{\tilde a} a$.
In particular, consider the following graph (this is also an example to the notations given
above):
.
(26)
An empty loop corresponds to the bivector $[\mathrm{Id}]$. An empty leaf corresponds to the vector $E_0$.
A trivalent vertex corresponds to the 3-form given by the formula $(a, b, c) \mapsto \int abc$.
If we ignore this remark, then what we get is just the trace of the operator $a \mapsto E_0 \cdot a$. But
using this remark we get the supertrace of this operator.
In fact, this subtlety will play no role in this paper. It affects only some signs in calculations
and all these signs will be hidden in lemmas quoted from [5,16]. So, one can just ignore this
remark.
4.4. Correlators
We are going to define the potential using correlators. Let
\[ \langle \tau_{k_1}(V_1) \cdots \tau_{k_n}(V_n) \rangle_g \tag{27} \]
be the sum over graphs of genus $g$ with $n$ leaves marked by $\tau_{k_i}(V_i)$, $i = 1, \dots, n$, where
$V_1, \dots, V_n$ are vectors in $H$, and $\tau_{k_i}$ are just formal symbols. The index of each internal vertex of these graphs is $\geq 3$; we associate to it the symmetric form (24). There are two possible
types of edges: edges marked by $[G_-G_+]$ (thick black dots in pictures, heavy edges in the text)
and edges marked by $[\mathrm{Id}]$ (empty edges). Since an empty edge connecting two different vertices
can be contracted, we assume that all empty edges are loops.
Consider a vertex of such a graph. Let us describe all possible half-edges adjacent to this vertex.
There are $2g$, $g \geq 0$, half-edges coming from $g$ empty loops; $m$ half-edges coming from heavy
edges of the graph, and $l$ leaves marked $\tau_{k_{a_1}}(V_{a_1}), \dots, \tau_{k_{a_l}}(V_{a_l})$. Then we say that the type of this vertex is $(g, m; k_{a_1}, \dots, k_{a_l})$. We denote the type of a vertex $v$ by $(g(v), m(v); k_{a_1(v)}, \dots, k_{a_{l(v)}(v)})$.
Consider a graph $\Gamma$ in the sum determining the correlator
\[ \langle \tau_{k_1}(V_1) \cdots \tau_{k_n}(V_n) \rangle_g. \tag{28} \]
We associate to $\Gamma$ a number: we contract according to the graph structure all tensors corresponding to its vertices, edges, and leaves (for leaves, we take vectors $V_1, \dots, V_n$). Let us denote this
number by $T(\Gamma)$.
Also we weight each graph by a coefficient which is the product of two combinatorial constants. The first factor is equal to
\[ V(\Gamma) = \frac{\prod_{v \in \mathrm{Vert}(\Gamma)} 2^{g(v)} g(v)!}{|\mathrm{aut}(\Gamma)|}. \tag{29} \]
Here $|\mathrm{aut}(\Gamma)|$ is the order of the automorphism group of the labeled graph $\Gamma$, and $\mathrm{Vert}(\Gamma)$ is the set
of internal vertices of $\Gamma$. In other words, we can label each vertex $v$ by $g(v)$, delete all empty
loops, and then we get a graph with the order of the automorphism group equal to $1/V(\Gamma)$.
The second factor is equal to
\[ P(\Gamma) = \prod_{v \in \mathrm{Vert}(\Gamma)} \int_{\overline{M}_{g(v),m(v)+l(v)}} \psi_1^{k_{a_1(v)}} \cdots \psi_{l(v)}^{k_{a_{l(v)}(v)}}. \tag{30} \]
The integrals used in this formula can be calculated with the help of the Witten–Kontsevich
theorem [27–33].
So, the whole contribution of the graph $\Gamma$ to the correlator is equal to $V(\Gamma)P(\Gamma)T(\Gamma)$. One
can check that the non-trivial contribution
to the correlator $\langle \tau_{k_1}(V_1) \cdots \tau_{k_n}(V_n) \rangle_g$ is given only
by graphs that have exactly $3g - 3 + n - \sum_{i=1}^n k_i$ heavy edges.
The geometric meaning here is very clear. The number T ( ) comes from the integral of the
induced GromovWitten invariants of degree zero, while the coefficient V ( )P ( ) is exactly
the combinatorial interpretation of the intersection number of 1k1 nkn with the stratum whose
dual graph is obtained from by the procedure described after the definition of V ( ).
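The heavy-edge count stated above is easy to tabulate. The following Python sketch is an illustration only and is not part of the original text; the function name `heavy_edge_count` is hypothetical and encodes nothing beyond the formula $3g - 3 + n - \sum k_i$.

```python
def heavy_edge_count(g, ks):
    """Number of heavy edges required of a contributing graph.

    g  -- genus of the correlator
    ks -- list of the indices k_1, ..., k_n carried by the leaves
    Returns 3g - 3 + n - sum(k_i). A negative value means that no graph
    has the required number of heavy edges, so nothing can contribute.
    """
    n = len(ks)
    return 3 * g - 3 + n - sum(ks)


if __name__ == "__main__":
    # Genus 0, four leaves with all k_i = 0: exactly one heavy edge.
    print(heavy_edge_count(0, [0, 0, 0, 0]))
```

For instance, in genus 0 with three leaves and all $k_i = 0$ the count is zero, so only the one-vertex graph can contribute.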
4.5. Potential
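We fix a cyclic Hodge algebra and consider the formal power series $F = F(T_{n,i})$ defined as
\[ F = \exp\Bigl( \sum_{g=0}^{\infty} \hbar^{g-1} F_g \Bigr) = \exp\Bigl( \sum_{g=0}^{\infty} \hbar^{g-1} \sum_{n} \frac{1}{n!} \sum_{a_1, \dots, a_n \in \mathbb{Z}_{\geq 0}} \langle \tau_{a_1}(E_{a_1}) \cdots \tau_{a_n}(E_{a_n}) \rangle_g \Bigr). \tag{31} \]
Abusing notation, we allow the leaves to be marked by $\tau_a(E_a)$, $E_a$, or $a$; all these variants are possible and denote the same thing.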
4.6. Trivial example
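For example, consider the trivial cyclic Hodge algebra: $H = H_0 = \langle e_1 \rangle$, $Q = G_- = 0$, $\int e_1 = 1$. Then $E_a = e_1 t_a$, and the correlator $\langle \tau_{a_1}(E_{a_1}) \cdots \tau_{a_n}(E_{a_n}) \rangle_g$ consists just of one graph with one vertex, $g$ empty loops, and $n$ leaves marked by $\tau_{a_1}, \dots, \tau_{a_n}$. The explicit value of the coefficient of this graph is, by definition,
\[ \langle \tau_{a_1} \cdots \tau_{a_n} \rangle_g := \int_{\overline{\mathcal{M}}_{g,n}} \psi_1^{a_1} \cdots \psi_n^{a_n}. \tag{32} \]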
So, in the case of the trivial cyclic Hodge algebra we obtain exactly the Gromov-Witten potential of
the point (i.e., (g,n , e1n ) 1) that we denote below by F pt .
4.6.1. Remark about notations
Abusing notation, we use the same symbol $\langle\ \rangle_g$ for the correlators in GW theory and in Hodge field theory. We hope that this does not lead to confusion. For instance, $\langle \tau_{a_1}(E_{a_1}) \cdots \tau_{a_n}(E_{a_n}) \rangle_g$ in the trivial example above is the correlator of the trivial Hodge field theory, while $\langle \tau_{a_1} \cdots \tau_{a_n} \rangle_g$ is the correlator of the trivial GW theory.
5. String, dilaton, and tautological relations
In this section, we prove that the potential (31) satisfies the same string and dilaton equations
as GW potentials.
5.1. String equation
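Theorem 1. If $\sum_{j=1}^n a_j > 0$, we have:
\[ \Bigl\langle \tau_0(e_1) \prod_{j=1}^n \tau_{a_j}(e_{i_j}) \Bigr\rangle_g = \sum_{j=1}^n \Bigl\langle \tau_{a_j - 1}(e_{i_j}) \prod_{k \neq j} \tau_{a_k}(e_{i_k}) \Bigr\rangle_g. \tag{33} \]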
Proof. Consider a graph $\Gamma$ contributing to the correlator on the left-hand side of the string equation. The special leaf that we are going to remove is marked by $\tau_0(e_1)$ and is attached to a vertex $v$ of genus $g_v$ (i.e., with $g_v$ attached empty loops) with $l_v$ more attached leaves labeled by indices in $I_v$, $|I_v| = l_v$, and $m_v$ attached half-edges coming from heavy edges and loops.
Let us remove the leaf $\tau_0(e_1)$ and change the label of one of the leaves attached to the same vertex from $\tau_{a_j}(e_{i_j})$ to $\tau_{a_j - 1}(e_{i_j})$. This way we obtain a graph $\Gamma_j$ contributing to the $j$th summand of the right-hand side of (33). We take the sum of these graphs over $j \in I_v$. Of course, we skip the summands where $a_j = 0$.
Note that this sum is not empty (if $\Gamma$ gives a non-zero contribution to the left-hand side of (33)). Indeed, if it is empty, this means that $a_j = 0$ for all $j \in I_v$. Therefore, since we expect that the contribution to $P(\Gamma)$ of the vertex $v$ on the left-hand side is non-zero, it follows that $g_v = 0$ and $m_v + l_v = 2$. So, there are three possible local pictures:
[Figure, Eq. (34): the three possible local pictures at the vertex $v$]
The first picture can be replaced with the bivector $[G_-G_+G_-G_+]$, which is equal to zero. Therefore, $T(\Gamma)$ is also equal to zero. In the second case, we also get 0 since $G_-G_+(e_1 \cdot e_j) = G_-G_+(E_0) = 0$. The third picture is possible only when it is the whole graph, and this is in contradiction with the assumption that $\sum_{j=1}^n a_j > 0$.

Note also that $T(\Gamma) = T(\Gamma_j)$ and $V(\Gamma) = V(\Gamma_j)$ for all $j$. Indeed, we have just removed the leaf with the unit of the algebra, so this cannot change anything in the contraction of tensors. Therefore, $T(\Gamma) = T(\Gamma_j)$. Also, both the leaf $\tau_0(e_1)$ and the vertex $v$ are fixed points of any automorphism of $\Gamma$. The same holds for the vertex corresponding to $v$ in $\Gamma_j$. Therefore, the automorphism groups are isomorphic for both graphs. Since we make no changes to empty loops, it follows that $V(\Gamma) = V(\Gamma_j)$.
Let us prove that $P(\Gamma) = \sum_{j \in I_v} P(\Gamma_j)$. Indeed, the vertices of $\Gamma$ and $\Gamma_j$ are in a natural one-to-one correspondence. Moreover, the local pictures for all of them except for $v$ and its image in $\Gamma_j$ are the same. Therefore, the corresponding intersection numbers contributing to $P(\Gamma)$ and $P(\Gamma_j)$ are the same. The unique difference appears when we take the intersection numbers corresponding to $v$ and its images in $\Gamma_j$, $j \in I_v$. But then we can apply the string equation (6) of the GW theory of the point (32), and we see that
\[ \int_{\overline{\mathcal{M}}_{g_v,\,k_v+l_v+1}} \prod_{j \in I_v} \psi_{o(j)}^{a_j} = \sum_{j \in I_v} \int_{\overline{\mathcal{M}}_{g_v,\,k_v+l_v}} \psi_{o(j)}^{a_j - 1} \prod_{k \in I_v,\ k \neq j} \psi_{o(k)}^{a_k} \tag{35} \]
(here $o \colon I_v \to \{1, \dots, l\}$ is an arbitrary one-to-one mapping, and $k_v$ stands for the number of heavy half-edges at $v$, denoted by $m_v$ above). This implies that $P(\Gamma) = \sum_{j \in I_v} P(\Gamma_j)$.
So, we have $V(\Gamma)P(\Gamma)T(\Gamma) = \sum_{j \in I_v} V(\Gamma_j)P(\Gamma_j)T(\Gamma_j)$. In order to complete the proof of (33), we should just notice that when we write down this expression for all graphs contributing to the left-hand side of (33), we use each graph contributing to the right-hand side of (33) exactly once. □
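In the trivial example of Section 4.6 the string equation can be checked numerically. The sketch below is an illustration only and is not part of the original argument; it assumes the standard closed formula $\langle \tau_{a_1} \cdots \tau_{a_n} \rangle_0 = (n-3)!/(a_1! \cdots a_n!)$ for genus-zero intersection numbers of the point, and the helper names `genus0`, `string_lhs`, `string_rhs` are hypothetical.

```python
from fractions import Fraction
from math import factorial


def genus0(a):
    """<tau_{a_1} ... tau_{a_n}>_0 for the point.

    Equals (n-3)! / (a_1! ... a_n!) when n >= 3 and sum(a) == n - 3,
    and vanishes otherwise (unstable or wrong dimension).
    """
    n = len(a)
    if n < 3 or sum(a) != n - 3:
        return Fraction(0)
    denom = 1
    for x in a:
        denom *= factorial(x)
    return Fraction(factorial(n - 3), denom)


def string_lhs(a):
    """Left-hand side of the string equation: insert one extra tau_0."""
    return genus0((0,) + tuple(a))


def string_rhs(a):
    """Right-hand side: lower one index a_j by 1, skipping a_j = 0."""
    a = tuple(a)
    total = Fraction(0)
    for j, aj in enumerate(a):
        if aj > 0:
            total += genus0(a[:j] + (aj - 1,) + a[j + 1:])
    return total


if __name__ == "__main__":
    for a in [(1, 0, 0), (1, 1, 0, 0), (2, 1, 0, 0, 0)]:
        print(a, string_lhs(a), string_rhs(a))
```

Both sides agree on every printed example, as the string equation predicts for $\sum a_j > 0$.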
5.2. Dilaton equation
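Theorem 2. If $2g - 2 + n > 0$, we have:
\[ \Bigl\langle \tau_1(e_1) \prod_{j=1}^n \tau_{a_j}(e_{i_j}) \Bigr\rangle_g = (2g - 2 + n) \Bigl\langle \prod_{j=1}^n \tau_{a_j}(e_{i_j}) \Bigr\rangle_g. \tag{36} \]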
Proof. Consider a graph $\Gamma$ contributing to the correlator on the left-hand side of (36). The special leaf that we are going to remove is marked by $\tau_1(e_1)$ and is attached to a vertex $v$ of genus $g_v$ (i.e., with $g_v$ attached empty loops) with $l_v$ more attached leaves labeled by indices in $I_v$, $|I_v| = l_v$, and $m_v$ attached half-edges coming from heavy edges and loops.
Let us remove the leaf $\tau_1(e_1)$. We obtain a graph $\Gamma'$ contributing to the right-hand side of (36). Let us prove this. Indeed, if we remove a leaf and do not get a proper graph, it follows that we have a trivalent vertex. Since the contribution of this vertex to $P(\Gamma)$ should be non-zero, it follows that the unique possible local picture is
[Figure, Eq. (37): the unique possible local picture]
But this picture is the whole graph, and it is in contradiction with the condition $2g - 2 + n > 0$.
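The same argument as in the proof of the string equation shows that $T(\Gamma) = T(\Gamma')$ and $V(\Gamma) = V(\Gamma')$. Also, the contribution to $P(\Gamma)$ and $P(\Gamma')$ of all vertices except for the changed one is the same. The change of the intersection number corresponding to the vertex $v$ is captured by the dilaton equation (6) of the trivial GW theory (32):
\[ \int_{\overline{\mathcal{M}}_{g_v,\,k_v+l_v+1}} \psi_{l+1} \prod_{j \in I_v} \psi_{o(j)}^{a_j} = (2g_v - 2 + k_v + l_v) \int_{\overline{\mathcal{M}}_{g_v,\,k_v+l_v}} \prod_{j \in I_v} \psi_{o(j)}^{a_j} \tag{38} \]
(again, $o \colon I_v \to \{1, \dots, l\}$ is an arbitrary one-to-one mapping, and $k_v$ stands for the number of heavy half-edges at $v$, denoted by $m_v$ above). This implies that $P(\Gamma) = (2g_v - 2 + k_v + l_v)P(\Gamma')$, and, therefore,
\[ V(\Gamma)P(\Gamma)T(\Gamma) = (2g_v - 2 + k_v + l_v)\, V(\Gamma')P(\Gamma')T(\Gamma'). \tag{39} \]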
Let us write down the last equation for all graphs $\Gamma$ contributing to the left-hand side of (36). Observe that any graph $\Gamma'$ contributing to the right-hand side occurs $|\mathrm{Vert}(\Gamma')|$ times, since the leaf $\tau_1(e_1)$ could be attached to any of its vertices. Therefore, any graph $\Gamma'$ contributing to the right-hand side of (36) appears in these equations with the coefficient
\[ \sum_{v \in \mathrm{Vert}(\Gamma')} (2g_v - 2 + k_v + l_v) = 2g - 2 + n. \tag{40} \]
This completes the proof. □
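The counting identity (40) is an Euler characteristic computation and can be checked on any concrete connected graph. The Python sketch below is an illustration only; the encoding of a graph by three lists (`loops`, `leaves`, `heavy_edges`) and the function names are assumptions made for this example and do not come from the paper.

```python
def total_genus(loops, heavy_edges):
    """Genus of a connected graph.

    loops       -- loops[v] is the number g_v of empty loops at vertex v
    heavy_edges -- list of pairs (u, w); u == w encodes a heavy loop
    The genus is the sum of all g_v plus the first Betti number
    (#edges - #vertices + 1) of the connected heavy-edge skeleton.
    """
    return sum(loops) + len(heavy_edges) - len(loops) + 1


def vertex_sum(loops, leaves, heavy_edges):
    """Left-hand side of (40): sum over vertices of 2 g_v - 2 + k_v + l_v."""
    half_edges = [0] * len(loops)
    for u, w in heavy_edges:
        half_edges[u] += 1
        half_edges[w] += 1  # a heavy loop contributes two half-edges
    return sum(
        2 * g_v - 2 + k_v + l_v
        for g_v, k_v, l_v in zip(loops, half_edges, leaves)
    )


if __name__ == "__main__":
    # Three vertices in a chain, one empty loop at vertex 0,
    # one heavy loop at vertex 2, and three leaves in total.
    loops = [1, 0, 0]
    leaves = [1, 2, 0]
    heavy_edges = [(0, 1), (1, 2), (2, 2)]
    g = total_genus(loops, heavy_edges)
    n = sum(leaves)
    print(vertex_sum(loops, leaves, heavy_edges), 2 * g - 2 + n)
```

The sketch assumes that the graph is connected; for a disconnected graph the genus formula used here would not apply.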
5.3. Tautological equations

As we have explained in Section 2.3.3, any linear relation $L$ among $\psi$-$\kappa$-strata in the cohomology of the moduli space of curves gives rise to a family of universal relations for the correlators of a Gromov-Witten theory.
Theorem 3 (Main Theorem). The system of universal relations coming from a tautological relation in the cohomology of the moduli space of curves holds for the correlators $\langle \prod_{j=1}^n \tau_{a_j}(e_{i_j}) \rangle_g$ of cyclic Hodge algebra.
Note that some special cases of this theorem were proved in [5,16,17]. Our argument below is
a natural generalization of the technique introduced in these papers. Also we are able now to give
an explanation why we have managed to perform all our calculations there, see Remark 8.6.1.
Let us give here a brief account of the proof of this theorem. First, the definition of correlators of the Hodge field theory can be extended to the intersection with an arbitrary tautological class $\alpha$ of degree $K$ in the space $\overline{\mathcal{M}}_{g,n}$, not only a monomial in $\psi$-classes. In that case, we do the following. Again, we consider the sum over all graphs $\Gamma$ with $3g - 3 + n - K$ heavy edges, and the number $T(\Gamma)$ is defined as above. Instead of the coefficient $V(\Gamma)P(\Gamma)$ we use the intersection number of $\alpha$ with the stratum whose dual graph is obtained from $\Gamma$ by the procedure described right after the definition of $V(\Gamma)$. Namely, a vertex with $g$ loops is replaced by a vertex marked by $g$.
This definition is very natural from the point of view of Zwiebach invariants. However, we know from Gromov-Witten theory that this extension of the notion of correlator is unnecessary. Indeed, all integrals with arbitrary tautological classes can be expressed in terms of the integrals with only $\psi$-classes via some universal formulas.
The main question is whether these universal formulas also work in Hodge field theory. Actually, the main result that more or less immediately proves the theorem is the positive answer to this question.
5.3.1. Organization of the proof
The rest of the paper is devoted to the proof of the Main Theorem, and here we give an overview of it.
In Section 6, we study the structure of graphs that can appear in formulas for the correlators of Hodge field theory. We prove that if $T(\Gamma) \neq 0$ and there is at least one heavy edge in $\Gamma$, then all vertices have genus $\leq 1$, i.e., there is at most one empty loop at any vertex. This basically means that in calculations we'll have to deal only with genera 0 and 1. Also this allows us to write down the action of a Hodge field theory.
In Section 7, we prove the main technical result (Main Lemma). Informally, it states that $Q = -\psi G_-$ when we apply these two operators to the correlators of a Hodge field theory. In order to prove it, we look at a small piece (consisting just of one heavy edge and one or two vertices that are attached to it) in one of the graphs of a correlator. Of course, in the correlator we can vary this small piece in an arbitrary way, such that the rest of the graph remains the same. So, when we consider the sum of all these small pieces, it is also a correlator of the Hodge field theory. Thus we reduce the proof to a special case of the whole statement. But since the genus of a vertex is $\leq 1$, it now appears to be a low-genus statement that can be proved by a straightforward calculation.
In Section 8, we present the proof of the Main Theorem. Consider a $\psi$-$\kappa$-stratum $\alpha$ whose stable dual graph has $k \geq 1$ edges. There is a universal expression of the integral over $\alpha$ coming from Gromov-Witten theory. It includes $k$ entries of the scalar product restricted to $H_0$. In terms of graphs, it means that we are to introduce new edges with the bivector $[\Pi_0]$ on them, and there are $k$ such edges in our expression. A direct corollary of the Main Lemma is that we can always replace $[\Pi_0]$ by $[\mathrm{Id}] - \psi[G_-G_+] - [G_-G_+]\psi$.
In Sections 8.2-8.4, we show that when we replace $[\Pi_0]$ by $[\mathrm{Id}] - \psi[G_-G_+] - [G_-G_+]\psi$ at all edges corresponding to the scalar product restricted to $H_0$, we obtain a new expression for the integral over $\alpha$ that again contains only heavy edges and empty loops, as any ordinary correlator. The main problem now is to understand the combinatorial coefficient of a graph obtained this way.
Since we have a sum over graphs with heavy edges and empty loops, it is natural to identify again these graphs with the corresponding strata in the moduli space of curves. Then we can calculate the intersection index of the stratum corresponding to a graph $\Gamma$ and the initial class $\alpha$. Roughly speaking, the main thing that we have to do is to decide about each node in $\alpha$ (represented initially by $[\Pi_0]$), whether we have this node in the stratum corresponding to $\Gamma$. If yes, then we have an excessive intersection (so, we must put $-\psi$ on one of the half-edges of the corresponding edge), and we keep this edge in $\Gamma$ (so, we replace $[\Pi_0]$ with $-\psi[G_-G_+]$ or $-[G_-G_+]\psi$). If no, then we do not have this edge in $\Gamma$, so we contract $[\Pi_0]$, i.e., replace it with $[\mathrm{Id}]$.
So, the procedure that we used to get rid of the scalar product is the same as the procedure of the intersection of $\alpha$ with strata of the complementary dimension. This means (Section 8.5) that the universal formula coming from Gromov-Witten theory is equivalent to the natural formula for the correlator with $\alpha$ coming from Zwiebach theory. The tautological relation is a sum of classes $\alpha$ equal to zero. So, while the universal formula coming from Gromov-Witten theory gives (in the case of a vanishing class) a non-trivial expression in correlators, the natural formula coming from Zwiebach theory gives identically zero. This proves our theorem, see Section 8.6.
6. Vanishing of the BV structure
In this section, we recall several useful lemmas from [16,34]. In particular, these lemmas give some strong restrictions on graphs that can give a non-zero contribution to the correlators defined above.
6.1. Lemmas
Lemma 1. (See [16,34].) The following vectors and bivectors are equal to zero:
[Figure, Eq. (41): pictures of the vanishing vectors and bivectors]
Let us also recall another lemma from [16] that is very useful in calculations.
Lemma 2. (See [16].) For any vectors $V_0, V_1, \dots, V_k$, $k \geq 2$,
[Figures, Eqs. (42) and (43): two pictorial identities]
Both lemmas are just simple corollaries of the axioms of cyclic Hodge algebra.
6.2. Structure of graphs

Consider a graph studied in Section 4.4. It can have leaves, empty and heavy loops, and heavy edges. Consider a vertex of such a graph. Let us assume that there are $A$ empty loops, $B$ heavy loops, $C$ heavy edges going to the other vertices of the graph, and $D$ leaves attached to this vertex:
[Figure, Eq. (44): a vertex with $A$ empty loops, $B$ heavy loops, $C$ heavy edges, and $D$ leaves]
This picture can be considered as a $(C + D)$-form. Let us denote it by $(A, B, C, D)$.

Lemma 3. If $A \geq 2$ and $B + C \geq 1$, then $(A, B, C, D) = 0$.
In other words, if there are at least two empty loops at a vertex, then there should not be any heavy loops or edges attached to this vertex. Otherwise the contribution of the whole graph vanishes. This implies
Corollary 1. In the definition of correlators one should consider only graphs of one of the following two types:
(1) One-vertex graphs with no heavy edges (loops).
(2) Arbitrary graphs with at most one empty loop at each vertex.
The contribution of all other graphs vanishes.
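The criterion of Lemma 3 is a simple test on the data $(A, B, C, D)$ of a vertex. The following Python sketch is an illustration only; the function name `vanishes_by_lemma_3` is hypothetical.

```python
def vanishes_by_lemma_3(A, B, C, D):
    """Return True if Lemma 3 forces the form (A, B, C, D) to vanish.

    A -- number of empty loops at the vertex
    B -- number of heavy loops at the vertex
    C -- number of heavy edges going to other vertices
    D -- number of leaves (it plays no role in the criterion)
    A return value of False only means that Lemma 3 does not apply;
    the form may still vanish for other reasons.
    """
    return A >= 2 and B + C >= 1


if __name__ == "__main__":
    print(vanishes_by_lemma_3(2, 0, 1, 1))  # two empty loops and a heavy edge
    print(vanishes_by_lemma_3(1, 3, 2, 0))  # at most one empty loop
```

In these terms, the two types of graphs in Corollary 1 are exactly those for which the test returns False at every vertex.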
This corollary dramatically simplifies all our calculations with graphs given below. Also we can now write down the action of the Hodge field theory.
Let $F_g^0(v_0, v_1, v_2, \dots)$, $v_i \in H \otimes \mathbb{C}[[\{T_{n,i}\}]]$, be the dimension zero part of the potential of the Hodge field theory, namely,
\[ F_g^0 := \sum_{n} \frac{1}{n!} \sum_{a_1 + \cdots + a_n = 3g - 3 + n} \langle \tau_{a_1}(v_{a_1}) \cdots \tau_{a_n}(v_{a_n}) \rangle_g. \tag{45} \]
The first sum is taken over $n \geq 0$ such that $2g - 2 + n > 0$. So, it is exactly the generating function for the vertices of our graph expressions. Then the action of the Hodge field theory is equal to
\[ A(v) := F_0^0(E_0 + G_-v, E_1, E_2, \dots) + \hbar F_1^0(E_0 + G_-v, E_1, E_2, \dots) + \sum_{g \geq 2} \hbar^g F_g^0(E_0, E_1, E_2, \dots) - \frac{1}{2} \int Qv \cdot G_-v. \tag{46} \]
If we put $T_{n,i} = 0$ for $n \geq 1$, then we immediately obtain the BCOV-type action discussed in [12, Appendix] and [5, Appendix]. Similar actions were also studied in [35] and [36].
6.3. Proof of Lemma 3

We consider the form $(A, B, C, D)$ and we assume that $A \geq 2$.
First, let us study the case when $C \geq 1$. In this case, our $(C + D)$-form can be represented as a contraction via the bivector $[\mathrm{Id}]$ of two forms, $(A - 2, B, C - 1, D + 1)$ and $(2, 0, 1, 1)$. Let us prove that the last one is equal to zero. Indeed, this two-form is obtained by applying $G_-G_+$ to one of the arguments of the two-form represented by the picture
[Figure, Eq. (47): a vertex with two empty loops and two leaves]
According to Lemma 1, the two-form (47) is equal to zero. Therefore, $(2, 0, 1, 1) = 0$ and the whole form $(A, B, C, D)$ is also equal to zero.
Now consider the case when $B \geq 1$. In this case, our $(C + D)$-form can be represented as a contraction via the bivector $[\mathrm{Id}]$ of two forms, $(A - 2, B - 1, C, D + 1)$ and $(2, 1, 0, 1)$. Let us prove that the last one is equal to zero. Indeed,
[Figure, Eq. (48): a chain of four equalities of pictures starting from $(2, 1, 0, 1)$, with coefficients $\frac{1}{2}$]
Here the first equality is the definition of $(2, 1, 0, 1)$, the second one is just an equivalent redrawing, the third equality is an application of Lemma 2, and the fourth one is again an equivalent redrawing. The last picture contains the bivector (47), which is equal to zero according to Lemma 1. Therefore, the whole picture is equal to zero, and $(2, 1, 0, 1) = 0$. So, the whole form $(A, B, C, D)$ is equal to zero also in this case. This proves the lemma.
7. Main Lemma
7.1. Statement
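The main technical tool that we use in the proof of Theorem 3 is the lemma that we prove in this section.
Lemma 4 (Main Lemma). For any $v_1, \dots, v_n \in H$, $a_1, \dots, a_n \geq 0$,
\[ \sum_{i=1}^n \langle \tau_{a_1}(v_1) \cdots \tau_{a_{i-1}}(v_{i-1})\, \tau_{a_i}(Q(v_i))\, \tau_{a_{i+1}}(v_{i+1}) \cdots \tau_{a_n}(v_n) \rangle_g = - \sum_{i=1}^n \langle \tau_{a_1}(v_1) \cdots \tau_{a_{i-1}}(v_{i-1})\, \tau_{a_i+1}(G_-(v_i))\, \tau_{a_{i+1}}(v_{i+1}) \cdots \tau_{a_n}(v_n) \rangle_g. \tag{49} \]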
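A simple corollary of this lemma is the following:

Lemma 5. For any $w \in H$, $v_1, \dots, v_n \in H_0$,
\[ \langle \tau_{a_0}(Qw)\, \tau_{a_1}(v_1) \cdots \tau_{a_n}(v_n) \rangle_g + \langle \tau_{a_0+1}(G_-w)\, \tau_{a_1}(v_1) \cdots \tau_{a_n}(v_n) \rangle_g = 0. \tag{50} \]
In other words, we can state informally that $Q + \psi G_- = 0$.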

7.2. Special cases
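The proof of the lemma can be reduced to a small number of special cases. We consider correlators whose graphs have only one heavy edge.
7.2.1. The first case is the following: Let $\sum_{i=1}^n a_i = n - 4$. We prove that for any $v_1, \dots, v_n \in H$,
\[ \sum_{i=1}^n \langle \tau_{a_1}(v_1) \cdots \tau_{a_{i-1}}(v_{i-1})\, \tau_{a_i}(Q(v_i))\, \tau_{a_{i+1}}(v_{i+1}) \cdots \tau_{a_n}(v_n) \rangle_0 = - \sum_{i=1}^n \langle \tau_{a_1}(v_1) \cdots \tau_{a_{i-1}}(v_{i-1})\, \tau_{a_i+1}(G_-(v_i))\, \tau_{a_{i+1}}(v_{i+1}) \cdots \tau_{a_n}(v_n) \rangle_0. \tag{51} \]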
First, we see that according to the definition of the correlator, the left-hand side of Eq. (51) is the sum over graphs with two vertices and with $[G_-G_+]$ on the unique edge that connects the vertices. For each $I \sqcup J = \{1, \dots, n\}$ we can consider the corresponding distribution of leaves between the vertices (to be precise, let us assume that $1 \in I$). Then the coefficient of such a graph is $\langle \tau_0 \prod_{i \in I} \tau_{a_i} \rangle_0 \langle \tau_0 \prod_{j \in J} \tau_{a_j} \rangle_0$, and we take the sum over all possible positions of $Q$ at the leaves.
Using the Leibniz rule for $Q$ and the property that $[Q, G_-G_+] = -G_-$, we see that this sum is equal to the sum over graphs with two vertices and with $[-G_-]$ on the unique edge that connects the vertices. For each $I \sqcup J = \{1, \dots, n\}$, $|I|, |J| \geq 2$, we consider the corresponding distribution of leaves between the vertices. Then the coefficient of such a graph is still $\langle \tau_0 \prod_{i \in I} \tau_{a_i} \rangle_0 \langle \tau_0 \prod_{j \in J} \tau_{a_j} \rangle_0$, and the underlying tensor expression can be written (after we multiply the whole sum by $-1$) as
\[ \Bigl( v_1,\ \prod_{i \in I \setminus \{1\}} v_i \cdot G_-\Bigl( \prod_{j \in J} v_j \Bigr) \Bigr). \tag{52} \]
Let us recall that the 7-term relation for $G_-$ implies that
\[ G_-\Bigl( \prod_{j \in J} v_j \Bigr) = \sum_{i,j \in J,\ i<j} G_-(v_i v_j) \prod_{k \in J \setminus \{i,j\}} v_k - (|J| - 2) \sum_{j \in J} G_-(v_j) \prod_{i \in J \setminus \{j\}} v_i. \tag{53} \]
Using this, we can rewrite the whole sum over graphs as

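\[ \sum_{1<i<j} \Bigl( v_1,\ \prod_{k \neq 1,i,j} v_k \cdot G_-(v_i v_j) \Bigr) \sum_{I \sqcup J \sqcup \{i,j\} = \{2,\dots,n\}} \Bigl\langle \tau_{a_1} \tau_0 \prod_{k \in I} \tau_{a_k} \Bigr\rangle_0 \Bigl\langle \tau_{a_i} \tau_{a_j} \tau_0 \prod_{k \in J} \tau_{a_k} \Bigr\rangle_0 - \sum_{i \neq 1} \Bigl( v_1,\ \prod_{j \neq 1,i} v_j \cdot G_-(v_i) \Bigr) \sum_{I \sqcup J \sqcup \{i\} = \{2,\dots,n\}} (|J| - 1) \Bigl\langle \tau_{a_1} \tau_0 \prod_{j \in I} \tau_{a_j} \Bigr\rangle_0 \Bigl\langle \tau_{a_i} \tau_0 \prod_{j \in J} \tau_{a_j} \Bigr\rangle_0. \tag{54} \]
Using that
\[ \sum_{I \sqcup J \sqcup \{i,j\} = \{2,\dots,n\}} \Bigl\langle \tau_{a_1} \tau_0 \prod_{k \in I} \tau_{a_k} \Bigr\rangle_0 \Bigl\langle \tau_{a_i} \tau_{a_j} \tau_0 \prod_{k \in J} \tau_{a_k} \Bigr\rangle_0 = \Bigl\langle \tau_{a_1+1} \prod_{k \neq 1} \tau_{a_k} \Bigr\rangle_0, \tag{55} \]
Eq. (53), and the fact that
\[ (n - 3) \Bigl\langle \tau_{a_1+1} \prod_{j \neq 1} \tau_{a_j} \Bigr\rangle_0 - \sum_{I \sqcup J \sqcup \{i\} = \{2,\dots,n\}} (|J| - 1) \Bigl\langle \tau_{a_1} \tau_0 \prod_{j \in I} \tau_{a_j} \Bigr\rangle_0 \Bigl\langle \tau_{a_i} \tau_0 \prod_{j \in J} \tau_{a_j} \Bigr\rangle_0 = \Bigl\langle \tau_{a_i+1} \prod_{j \neq i} \tau_{a_j} \Bigr\rangle_0, \tag{56} \]
we can rewrite expression (54) as
\[ \sum_{i=1}^n \Bigl( G_-(v_i),\ \prod_{j \neq i} v_j \Bigr) \Bigl\langle \tau_{a_i+1} \prod_{j \neq i} \tau_{a_j} \Bigr\rangle_0. \tag{57} \]
The last formula coincides by definition with the right-hand side of Eq. (51) multiplied by $-1$. This proves the first special case.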
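The two genus-zero identities (55) and (56) used above can be checked numerically for small $n$. The Python sketch below is an illustration only and is not part of the original proof; it assumes the standard closed formula $(n-3)!/(a_1! \cdots a_n!)$ for genus-zero intersection numbers of the point, stores the paper's leaf 1 at position 0 of a tuple, and uses hypothetical helper names.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def genus0(a):
    """<tau_{a_1} ... tau_{a_n}>_0 for the point (0 if unstable or wrong dimension)."""
    n = len(a)
    if n < 3 or sum(a) != n - 3:
        return Fraction(0)
    denom = 1
    for x in a:
        denom *= factorial(x)
    return Fraction(factorial(n - 3), denom)


def splittings(indices):
    """All pairs (I, J) of disjoint tuples whose union is `indices`."""
    indices = tuple(indices)
    for r in range(len(indices) + 1):
        for I in combinations(indices, r):
            yield I, tuple(x for x in indices if x not in I)


def check_55(a, i, j):
    """Eq. (55) for fixed positions i < j (both different from 0)."""
    rest = [k for k in range(1, len(a)) if k not in (i, j)]
    lhs = sum(
        genus0((a[0], 0) + tuple(a[k] for k in I))
        * genus0((a[i], a[j], 0) + tuple(a[k] for k in J))
        for I, J in splittings(rest)
    )
    return lhs == genus0((a[0] + 1,) + tuple(a[1:]))


def check_56(a, i):
    """Eq. (56) for a fixed position i different from 0."""
    n = len(a)
    rest = [k for k in range(1, n) if k != i]
    correction = sum(
        (len(J) - 1)
        * genus0((a[0], 0) + tuple(a[k] for k in I))
        * genus0((a[i], 0) + tuple(a[k] for k in J))
        for I, J in splittings(rest)
    )
    lhs = (n - 3) * genus0((a[0] + 1,) + tuple(a[1:])) - correction
    rhs = genus0(tuple(a[:i]) + (a[i] + 1,) + tuple(a[i + 1:]))
    return lhs == rhs


def compositions(total, parts):
    """All tuples of `parts` non-negative integers summing to `total`."""
    if parts == 0:
        if total == 0:
            yield ()
        return
    for first in range(total + 1):
        for tail in compositions(total - first, parts - 1):
            yield (first,) + tail


if __name__ == "__main__":
    ok = all(
        check_56(a, i) and all(check_55(a, i, j) for j in range(i + 1, n))
        for n in range(4, 7)
        for a in compositions(n - 4, n)
        for i in range(1, n)
    )
    print(ok)
```

The check runs over all index tuples with $\sum a_i = n - 4$, which is the dimension condition of this special case.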
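7.2.2. The second case is in genus 1. Let $\sum_{i=1}^n a_i = n - 1$. We prove that for any $v_1, \dots, v_n \in H$,
\[ \sum_{i=1}^n \langle \tau_{a_1}(v_1) \cdots \tau_{a_{i-1}}(v_{i-1})\, \tau_{a_i}(Q(v_i))\, \tau_{a_{i+1}}(v_{i+1}) \cdots \tau_{a_n}(v_n) \rangle_1 = - \sum_{i=1}^n \langle \tau_{a_1}(v_1) \cdots \tau_{a_{i-1}}(v_{i-1})\, \tau_{a_i+1}(G_-(v_i))\, \tau_{a_{i+1}}(v_{i+1}) \cdots \tau_{a_n}(v_n) \rangle_1. \tag{58} \]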
According to the definition of the correlator, the left-hand side of Eq. (58) is the sum over graphs of two possible types. The first type includes graphs with two vertices and two edges. The first edge is heavy and connects the vertices; the second edge is an empty loop attached to the first vertex. For each $I \sqcup J = \{1, \dots, n\}$, $|J| \geq 2$, we can consider the corresponding distribution of leaves between the vertices (we assume that leaves with indices in $I$ are at the first vertex). Then the coefficient of such a graph is $\langle \tau_0 \prod_{i \in I} \tau_{a_i} \rangle_1 \langle \tau_0 \prod_{j \in J} \tau_{a_j} \rangle_0$. The second type includes graphs with one vertex and one heavy loop. All leaves are attached to this vertex, and the coefficient of such a graph is $\langle \tau_0^2 \prod_{i=1}^n \tau_{a_i} \rangle_0$. For both types of graphs, we take the sum over all possible positions of $Q$ at the leaves.
Using the Leibniz rule for $Q$ and the property that $[Q, G_-G_+] = -G_-$, we get the same graphs as before, but there is no $Q$, and instead of $[G_-G_+]$ we have $[-G_-]$ on the corresponding edge. Using Lemma 2, we move $G_-$ in graphs of the first type to the leaves marked by indices in $J$. Using the 1/12-axiom and Lemma 2, we move $G_-$ in graphs of the second type to all leaves.
This way we get graphs of the same type in both cases. We get graphs with one vertex, one empty loop attached to it, all leaves also attached to this vertex, and $G_-$ on one of the leaves. One can easily check that the coefficient of the graphs with $G_-$ at the $i$th leaf is equal to
\[ \sum_{I \sqcup J \sqcup \{i\} = \{1,\dots,n\}} \Bigl\langle \tau_0 \prod_{k \in I} \tau_{a_k} \Bigr\rangle_1 \Bigl\langle \tau_0 \tau_{a_i} \prod_{k \in J} \tau_{a_k} \Bigr\rangle_0 + \frac{1}{24} \Bigl\langle \tau_0^2 \prod_{k=1}^n \tau_{a_k} \Bigr\rangle_0 = \Bigl\langle \tau_{a_i+1} \prod_{k \neq i} \tau_{a_k} \Bigr\rangle_1. \tag{59} \]
It is exactly the unique graph contributing to the $i$th summand of the right-hand side of Eq. (58), and the coefficient is right. This proves that special case.
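The genus-one identity (59) can also be checked numerically. The sketch below is an illustration only; it assumes the standard values $\langle \tau_0^3 \rangle_0 = 1$ and $\langle \tau_1 \rangle_1 = 1/24$ and computes all other genus 0 and genus 1 intersection numbers of the point from the string and dilaton equations. The helper names are hypothetical, and genera $g \geq 2$ are deliberately not handled.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import combinations


@lru_cache(maxsize=None)
def _corr_sorted(g, a):
    """<tau_{a_1} ... tau_{a_n}>_g for g in {0, 1}; `a` is a sorted tuple."""
    n = len(a)
    if 2 * g - 2 + n <= 0 or sum(a) != 3 * g - 3 + n:
        return Fraction(0)  # unstable, or wrong dimension
    if (g, a) == (0, (0, 0, 0)):
        return Fraction(1)
    if (g, a) == (1, (1,)):
        return Fraction(1, 24)
    if a[0] == 0:
        # String equation: remove one tau_0 and lower one other index.
        rest = a[1:]
        total = Fraction(0)
        for j, aj in enumerate(rest):
            if aj > 0:
                lowered = rest[:j] + (aj - 1,) + rest[j + 1:]
                total += _corr_sorted(g, tuple(sorted(lowered)))
        return total
    if g == 1 and a[0] == 1:
        # Dilaton equation: remove one tau_1.
        rest = a[1:]
        return (2 * g - 2 + len(rest)) * _corr_sorted(g, rest)
    raise NotImplementedError("this sketch only handles genus 0 and 1")


def corr(g, a):
    return _corr_sorted(g, tuple(sorted(a)))


def check_59(a, i):
    """Eq. (59) for the leaf at position i of the tuple a."""
    rest = [k for k in range(len(a)) if k != i]
    lhs = Fraction(1, 24) * corr(0, (0, 0) + tuple(a))
    for r in range(len(rest) + 1):
        for I in combinations(rest, r):
            J = [k for k in rest if k not in I]
            lhs += corr(1, (0,) + tuple(a[k] for k in I)) * corr(
                0, (0, a[i]) + tuple(a[k] for k in J)
            )
    rhs = corr(1, tuple(a[:i]) + (a[i] + 1,) + tuple(a[i + 1:]))
    return lhs == rhs


def compositions(total, parts):
    """All tuples of `parts` non-negative integers summing to `total`."""
    if parts == 0:
        if total == 0:
            yield ()
        return
    for first in range(total + 1):
        for tail in compositions(total - first, parts - 1):
            yield (first,) + tail


if __name__ == "__main__":
    print(all(
        check_59(a, i)
        for n in range(1, 6)
        for a in compositions(n - 1, n)
        for i in range(n)
    ))
```

The loop runs over all index tuples with $\sum a_i = n - 1$, the dimension condition of this special case.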
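7.2.3. Consider $g \geq 2$. Let $\sum_{i=1}^n a_i = 3g + n - 4$. In this case, the statement that for any $v_1, \dots, v_n \in H$,
\[ \sum_{i=1}^n \langle \tau_{a_1}(v_1) \cdots \tau_{a_{i-1}}(v_{i-1})\, \tau_{a_i}(Q(v_i))\, \tau_{a_{i+1}}(v_{i+1}) \cdots \tau_{a_n}(v_n) \rangle_g = - \sum_{i=1}^n \langle \tau_{a_1}(v_1) \cdots \tau_{a_{i-1}}(v_{i-1})\, \tau_{a_i+1}(G_-(v_i))\, \tau_{a_{i+1}}(v_{i+1}) \cdots \tau_{a_n}(v_n) \rangle_g, \tag{60} \]
is immediately reduced to $0 = 0$; it is a simple corollary of Lemma 1.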

7.3. Proof of Main Lemma
Consider the left-hand side of Eq. (49). As usual, using the Leibniz rule for $Q$ and the property that $[Q, G_-G_+] = -G_-$, we can remove all $Q$, but then we must change one of $[G_-G_+]$ on edges to $[-G_-]$. Let us cut out the pieces of graphs that include these edges with $[-G_-]$, all empty loops, leaves and halves of heavy edges attached to the ends of this special edge.
Since we consider the sum over all possible graphs contributing to correlators, these small pieces can be gathered into groups according to the type of the rest of the initial graph. Each group forms exactly one of the special cases studied above. So, we know that $G_-$ should jump either to one of the leaves or to one of the heavy edges attached to the ends of its edge. In the first case, we get exactly the graphs in the right-hand side of Eq. (49); in the second case, we get zero. One can easily check that we get the right coefficients for the graphs in the right-hand side of Eq. (49). This proves the lemma.
8. Proof of Theorem 3
8.1. Equivalence of expression in graphs
Consider the expression in correlators corresponding to a $\psi$-$\kappa$-stratum as it is described in Section 2.3.2. To each vertex of the corresponding stable dual graph we assign the sum of graphs that forms a correlator in the sense of Section 4.4. The leaves of these graphs corresponding to the edges of the stable dual graph (nodes) are connected in these pictures by edges with $[\Pi_0]$ (the restriction of the scalar product to $H_0$). We call the edges with $[\Pi_0]$ white edges and mark them in pictures by thick white points, see (25).
The axioms of cyclic Hodge algebra imply a system of linear equations for the graphs of this type. In particular, it turned out that by manipulating these linear equations we can always get rid of white edges in the sum of pictures corresponding to a stable dual graph, see [5,16,17]. However, previously it was just an experimental fact. Now we can show how it works in general.
Numerous examples of the correspondence between stable dual graphs and graph expressions in cyclic Hodge algebras, and also of the linear relations implied by the axioms of cyclic Hodge algebra, are given in [5,16,17].
Below, we explain how one can represent the expression in correlators corresponding to a $\psi$-$\kappa$-stratum in terms of graphs with only empty and heavy edges and with no white edges. The only tools that we need are Lemmas 4 and 5 proved above.
8.2. The simplest example
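Consider a stable dual graph with two vertices and one edge connecting them:
[Figure, Eq. (61): a stable dual graph with two vertices and one edge]
The corresponding expression in correlators is
\[ \Bigl\langle \tau_{a_0}(e_{j_1}) \prod_{i=1}^{n_1} \tau_{a_i}(e_*) \prod_{i=1}^{l_1} \tau_{u_i}(e_1) \Bigr\rangle_{g_1} \eta^{j_1 j_2} \Bigl\langle \tau_{b_0}(e_{j_2}) \prod_{i=1}^{n_2} \tau_{b_i}(e_*) \prod_{i=1}^{l_2} \tau_{v_i}(e_1) \Bigr\rangle_{g_2} \tag{62} \]
(here we denote by $e_*$ an arbitrary choice of $e_i \in H_0$). It is convenient for us to rewrite this expression as
\[ \Bigl\langle \tau_{a_0}(x_{\alpha_1}) \prod_{i=1}^{n_1} \tau_{a_i}(e_*) \prod_{i=1}^{l_1} \tau_{u_i}(e_1) \Bigr\rangle_{g_1} [\Pi_0]^{\alpha_1 \alpha_2} \Bigl\langle \tau_{b_0}(x_{\alpha_2}) \prod_{i=1}^{n_2} \tau_{b_i}(e_*) \prod_{i=1}^{l_2} \tau_{v_i}(e_1) \Bigr\rangle_{g_2}, \tag{63} \]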
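where $\{x_\alpha\}$ is the basis of the whole $H$. Using the fact that $\Pi_0 = \mathrm{Id} - QG_+ - G_+Q$ and applying Lemma 5, we obtain
\[ \Bigl\langle \tau_{a_0}(x_{\alpha_1}) \prod_{i=1}^{n_1} \tau_{a_i}(e_*) \prod_{i=1}^{l_1} \tau_{u_i}(e_1) \Bigr\rangle_{g_1} [\Pi_0]^{\alpha_1 \alpha_2} \Bigl\langle \tau_{b_0}(x_{\alpha_2}) \prod_{i=1}^{n_2} \tau_{b_i}(e_*) \prod_{i=1}^{l_2} \tau_{v_i}(e_1) \Bigr\rangle_{g_2} \]
\[ = \Bigl\langle \tau_{a_0}(x_{\alpha_1}) \prod_{i=1}^{n_1} \tau_{a_i}(e_*) \prod_{i=1}^{l_1} \tau_{u_i}(e_1) \Bigr\rangle_{g_1} [\mathrm{Id}]^{\alpha_1 \alpha_2} \Bigl\langle \tau_{b_0}(x_{\alpha_2}) \prod_{i=1}^{n_2} \tau_{b_i}(e_*) \prod_{i=1}^{l_2} \tau_{v_i}(e_1) \Bigr\rangle_{g_2} \]
\[ - \Bigl\langle \tau_{a_0+1}(x_{\alpha_1}) \prod_{i=1}^{n_1} \tau_{a_i}(e_*) \prod_{i=1}^{l_1} \tau_{u_i}(e_1) \Bigr\rangle_{g_1} [G_-G_+]^{\alpha_1 \alpha_2} \Bigl\langle \tau_{b_0}(x_{\alpha_2}) \prod_{i=1}^{n_2} \tau_{b_i}(e_*) \prod_{i=1}^{l_2} \tau_{v_i}(e_1) \Bigr\rangle_{g_2} \]
\[ - \Bigl\langle \tau_{a_0}(x_{\alpha_1}) \prod_{i=1}^{n_1} \tau_{a_i}(e_*) \prod_{i=1}^{l_1} \tau_{u_i}(e_1) \Bigr\rangle_{g_1} [G_-G_+]^{\alpha_1 \alpha_2} \Bigl\langle \tau_{b_0+1}(x_{\alpha_2}) \prod_{i=1}^{n_2} \tau_{b_i}(e_*) \prod_{i=1}^{l_2} \tau_{v_i}(e_1) \Bigr\rangle_{g_2}. \tag{64} \]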
In all three summands of the right-hand side we still have two correlators, whose leaves corresponding to the nodes are connected by some special edges. But now the connecting edge is either marked by $[\mathrm{Id}]$ (an empty edge) or by $[G_-G_+]$ (an ordinary heavy edge). So, this way we get rid of the white edge in this case.
Informally, in terms of pictures, we can describe Eq. (64) as
[Figure, Eq. (65): pictorial form of Eq. (64)]
When we put $\psi$, we mean that we add one more $\psi$-class at the node at the corresponding branch of the curve. Dashed circles denote correlators.
8.3. Example with two nodes
Now we consider an example of a stratum whose generic point is represented by a three-component curve. Again, we allow arbitrary $\psi$-classes at marked points and at the two branches at nodes.
We perform the same calculation as above, but now we explain it in terms of informal pictures from the very beginning. So, the first step is the same as above:
[Figure, Eq. (66): the first step in pictures]
Then we apply Lemma 4 to each of the summands in the right-hand side:
[Figures, Eqs. (67)-(69): three pictorial identities, one for each summand]
We take the sum of these three expressions, and we see that all pictures where we have edges with $[G_-]$ and $[G_+]$ are cancelled. So, we get an expression for the sum of graphs representing the initial stratum in terms of graphs with only empty and heavy edges.
8.4. General case
The general argument is exactly the same as in the second example. In fact, this gives a procedure for writing an expression in graphs with only empty and heavy edges (and no white edges) starting from a stable dual graph. Let us describe this procedure.
Take a stable dual graph corresponding to a $\psi$-$\kappa$-stratum of dimension $k$ in $\overline{\mathcal{M}}_{g,n}$. First, we are to decorate it a little bit. For each edge, we either leave it untouched, or substitute it with an arrow (in two possible ways). At the pointing end of the arrow, we increase the number of $\psi$-classes by 1. Each of these graphs we weight with the inverse order of its automorphism group (automorphisms must preserve all decorations) multiplied by $(-1)^{\mathrm{arr}}$, where $\mathrm{arr}$ is the number of arrows.
Consider a decorated dual graph. To each of its vertices we associate the corresponding correlator of cyclic Hodge algebra (we add new leaves in order to represent $\kappa$-classes). Then we connect the leaves corresponding to the nodes either by empty edges (if the corresponding edge of the decorated graph is untouched) or by heavy edges (if the corresponding edge of the dual graph is decorated by an arrow).
It is obvious that the number of heavy edges in the final graphs is equal to $k$.
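The decoration step of this procedure can be enumerated mechanically. The Python sketch below is an illustration only; it lists the decorations of a dual graph with a given number of edges together with the sign $(-1)^{\mathrm{arr}}$, and it deliberately leaves out the automorphism weights. The function name `decorations` and the encoding of an arrow by the index 0 or 1 of the half-edge it points to are assumptions made for this example.

```python
from itertools import product


def decorations(num_edges):
    """Yield (decoration, sign) pairs for a dual graph with num_edges edges.

    Each entry of `decoration` is
      None   -- the edge is left untouched (it becomes an empty edge), or
      0 / 1  -- the edge is replaced by an arrow pointing to its first /
                second half-edge (it becomes a heavy edge, with one more
                psi-class at the pointing end).
    The sign is (-1)^arr, where arr is the number of arrows.
    Automorphism weights are not included in this sketch.
    """
    for decoration in product((None, 0, 1), repeat=num_edges):
        arrows = sum(1 for d in decoration if d is not None)
        yield decoration, (-1) ** arrows


if __name__ == "__main__":
    for decoration, sign in decorations(1):
        print(decoration, sign)
```

For a single edge this reproduces the three summands of Eq. (64): one empty edge with sign $+1$ and two heavy edges with sign $-1$.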
8.5. Coefficients
We can simplify the resulting graphs obtained in the previous subsection. First, we can contract empty edges (as much as possible; it is forbidden to contract loops). Second, we can remove leaves added for the needs of $\kappa$-classes. Indeed, each such leaf is equipped with a unit of $H$, so it does not affect the contraction of tensors corresponding to a graph. Moreover, when we remove all leaves corresponding to $\kappa$-classes, we still have a graph with at least trivalent vertices. Otherwise, this graph is equal to zero, cf. arguments in the proofs of string and dilaton equations.
So, we obtain final graphs that have the same number of heavy edges as the dimension of the initial $\psi$-$\kappa$-stratum, the same number of leaves as the initial dual graph, and some number of empty loops, at most one at each vertex. The exceptional case is when $k = 0$; in this case we obtain only one graph, with one vertex, $n$ leaves, and $g$ empty loops.
In the first case, let us turn a graph like this into a stable dual graph. Just replace its vertices with no empty loops by vertices of genus zero, vertices with empty loops by vertices of genus one, heavy edges are edges, and leaves are leaves. There are no $\psi$- or $\kappa$-classes. It is obvious that the codimension of the stratum corresponding to this dual graph is $k$. Indeed, in this case it is just the number of nodes.
So, to each $\psi$-$\kappa$-stratum $X$ of dimension $k > 0$ we associate a linear combination $\sum_i c_i Y_i$ of strata of codimension $k$ with no $\psi$- or $\kappa$-classes, whose curves have irreducible components of genus 0 and 1 only.
Proposition 1. We have $c_i = X \cdot Y_i$.
Proof. We prove it in two steps. First, consider a one-vertex stable dual graph with no edges (just a correlator). In this case, the intersection number $X \cdot Y_i$ is just by definition $c_i = V(\Gamma_i)P(\Gamma_i)$, where $\Gamma_i$ is the cyclic Hodge algebra graph that turns into $Y_i$ via the procedure described above.
Then, consider a stable dual graph with one edge. It is the intersection of the one-vertex stable dual graph with an irreducible component of the boundary. For a given $Y_i$, this component of the boundary either intersects it transversally, or we have an excessive intersection. In the first case, the corresponding node is not represented in $Y_i$. This means that in $\Gamma_i$ it should be an empty edge. In the second case, this node is one of the nodes of $Y_i$, so it should be a heavy edge of $\Gamma_i$. Also, it is an excessive intersection, so we are to add the sum of $\psi$-classes with the negative sign at the marked points (half-edges) corresponding to the node, see [37, Appendix].
Exactly the same argument works for an arbitrary number of nodes; we just extend it by induction. □
In the case of $k = 0$, we get just one final graph with coefficient $c$.
Proposition 2. If $k = 0$, the coefficient of the final graph is equal to the number of points in the initial $\psi$-$\kappa$-stratum.
Proof. If $k = 0$, this means that each of the vertices of the initial stable dual graph also has dimension 0, and the corresponding correlator of cyclic Hodge algebra is represented by one one-vertex graph with no heavy edges. Also this means that each edge of the initial stable dual graph is replaced in the algorithm above by an empty edge. So, we can think that we just work with the correlators of the Gromov-Witten theory of the point. In this case the proposition becomes obvious. □
Now we deduce Theorem 3 from these propositions.
8.6. Proof of Theorem 3
Consider the system of subalgebras
\[ RH_1^*(\overline{\mathcal{M}}_{g,n}) \subset RH^*(\overline{\mathcal{M}}_{g,n}) \tag{70} \]
of the cohomological tautological algebras of $\overline{\mathcal{M}}_{g,n}$ generated by strata with no $\psi$- or $\kappa$-classes and with irreducible curves of genus 0 and 1 only.
Let $L$ be a linear combination of $\psi$-$\kappa$-strata of dimension $k$ in $\overline{\mathcal{M}}_{g,n}$. Then the expression in correlators of cyclic Hodge algebras corresponding to $L$ is equivalent to a sum of some graphs with coefficients equal to the intersection of $L$ with classes in $RH_1^k(\overline{\mathcal{M}}_{g,n})$.
So, if the class of $L$ is equal to zero, then the corresponding equation (and also the whole system of equations that we described in Section 2.3.3) for correlators of cyclic Hodge algebra is valid. The theorem is proved.
8.6.1. Remark
Evidently, $RH_1^*(\overline{\mathcal{M}}_{g,n})$ is a module over $RH^*(\overline{\mathcal{M}}_{g,n})$. Also it is obvious that $RH_1^*(\overline{\mathcal{M}}_{g,n})$ is closed under pull-backs and push-forwards via the forgetful morphisms. This explains why it was enough to make only one check in the simplest case in order to get the system of equations in [5,17] (cf. an argument in the last section in [5]).
8.7. An interpretation of Propositions 1 and 2
From the point of view of the theory of Zwiebach invariants, both propositions look very natural. Indeed, we try to give a graph expression for the integral of an induced Gromov-Witten form multiplied by a tautological class $X$. Since we know that we are able to integrate only degree zero parts of induced Gromov-Witten invariants, we should just take the sum over all graphs that correspond to the strata of complementary dimension in $RH_1^*(\overline{\mathcal{M}}_{g,n})$. The coefficients are to be the intersection numbers of these strata with $X$.
On the other hand, we know that in any Gromov-Witten theory it is enough to fix the integrals of Gromov-Witten invariants multiplied by $\psi$-classes. Then the integrals of Gromov-Witten invariants multiplied by arbitrary tautological classes are expressed by universal formulas. We can try to use these universal formulas also in Hodge field theory. They are exactly our expressions with white edges.
So, we have two different natural ways to express in terms of graphs the integrals of induced Gromov-Witten invariants multiplied by tautological classes. Propositions 1 and 2 state that these two different expressions coincide.
Acknowledgements
A.L. was supported by the Russian Federal Agency of Atomic Energy and by the grants
INTAS-03-51-6346, NSh-8065.2006.2, NWO-RFBR-047.011.2004.026 (RFBR-05-02-89000-NWO-a), and RFBR-07-02-01161-a.
S.S. was supported by the grant SNSF-200021-115907/1. S.S. is grateful to the participants of
the Moduli Spaces program at the Mittag-Leffler Institute (Djursholm, Sweden) for the fruitful
discussions of the preliminary versions of the results of this paper. The remarks of C. Faber,
O. Tommasi, and D. Zvonkine were especially helpful.
I.S. was supported by the grant RFBR-06-01-00037.
References
[1] B. Zwiebach, Closed string field theory: Quantum action and the Batalin–Vilkovisky master equation, Nucl. Phys. B 390 (1) (1993) 33–152.
[2] E. Witten, Chern–Simons gauge theory as a string theory, in: The Floer Memorial Volume, in: Progress in Mathematics, vol. 133, Birkhäuser, Basel, 1995, pp. 637–678.
[3] M. Bershadsky, S. Cecotti, H. Ooguri, C. Vafa, Kodaira–Spencer theory of gravity and exact results for quantum string amplitudes, Commun. Math. Phys. 165 (2) (1994) 311–427.
[4] G. Mikhalkin, Enumerative tropical algebraic geometry in $\mathbb{R}^2$, J. Amer. Math. Soc. 18 (2) (2005) 313–377.
[5] A. Losev, S. Shadrin, From Zwiebach invariants to Getzler relation, Commun. Math. Phys. 271 (3) (2007) 649–679.
[6] T. Kimura, J. Stasheff, A. Voronov, On operad structures of moduli spaces and string theory, Commun. Math. Phys. 171 (1) (1995) 1–25.
[7] E. Getzler, Batalin–Vilkovisky algebras and two-dimensional topological field theories, Commun. Math. Phys. 159 (1994) 265–285.
[8] F. Schaetz, BVF-complex and higher homotopy structures, math.QA/0611912.
[9] P. Mnev, Notes on simplicial BF theory, hep-th/0610326.
[10] A. Losev, Y. Manin, New moduli spaces of pointed curves and pencils of flat connections, Michigan Math. J. 48 (2000) 443–472.
[11] A. Losev, Yu. Manin, Extended modular operad, in: Frobenius Manifolds, in: Aspects of Mathematics, vol. E36, Vieweg, Wiesbaden, 2004, pp. 181–211.
[12] S. Barannikov, M. Kontsevich, Frobenius manifolds and formality of Lie algebras of polyvector fields, Int. Math. Res. Not. 4 (1998) 201–215.
[13] S.A. Merkulov, Formality of canonical symplectic complexes and Frobenius manifolds, Int. Math. Res. Not. 14 (1998) 727–733.
[14] Yu.I. Manin, Three constructions of Frobenius manifolds: A comparative study, Asian J. Math. 3 (1) (1999) 179–220.
[15] A. Losev, Hodge strings and elements of K. Saito's theory of primitive form, in: Topological Field Theory, Primitive Forms and Related Topics, Kyoto, 1996, in: Progress in Mathematics, vol. 160, Birkhäuser Boston, Boston, MA, 1998, pp. 305–335.
[16] S. Shadrin, A definition of descendants at one point in graph calculus, math.QA/0507106.
[17] S. Shadrin, I. Shneiberg, Belorousski–Pandharipande relation in dGBV algebras, J. Geom. Phys. 57 (2) (2007) 597–615.
[18] I. Shneiberg, Topological recursion relations in $\overline{M}_{2,2}$, Funct. Anal. Appl. (2007), in press.
[19] M. Kontsevich, Y. Manin, Gromov–Witten classes, quantum cohomology, and enumerative geometry, Commun. Math. Phys. 164 (3) (1994) 525–562.
[20] Yu. Manin, Frobenius Manifolds, Quantum Cohomology, and Moduli Spaces, American Mathematical Society Colloquium Publications, vol. 47, Amer. Math. Soc., Providence, RI, 1999.
[21] J. Harris, I. Morrison, Moduli of Curves, Graduate Texts in Mathematics, vol. 187, Springer-Verlag, New York, 1998.
[22] C. Faber, S. Shadrin, D. Zvonkine, Tautological relations and the r-spin Witten conjecture, math.AG/0612510.
[23] E. Getzler, Topological recursion relations in genus 2, in: Integrable Systems and Algebraic Geometry, Kobe/Kyoto, 1997, World Scientific, River Edge, NJ, 1998, pp. 73–106.
[24] E. Getzler, Intersection theory on $\overline{M}_{1,4}$ and elliptic Gromov–Witten invariants, J. Amer. Math. Soc. 10 (4) (1997) 973–998.
[25] P. Belorousski, R. Pandharipande, A descendent relation in genus 2, Ann. Sc. Norm. Super. Pisa Cl. Sci. (4) 29 (1) (2000) 171–191.
[26] T. Kimura, X. Liu, A genus-3 topological recursion relation, Commun. Math. Phys. 262 (3) (2006) 645–661.
[27] E. Witten, Two-dimensional Gravity and Intersection Theory on Moduli Space, Surveys in Differential Geometry, vol. 1, Lehigh Univ., Bethlehem, PA, 1991, pp. 243–310.
[28] M. Kontsevich, Intersection theory on the moduli space of curves and the matrix Airy function, Commun. Math. Phys. 147 (1) (1992) 1–23.
[29] A. Okounkov, R. Pandharipande, Gromov–Witten theory, Hurwitz numbers, and matrix models, I, math.AG/0101147.
[30] M. Mirzakhani, Weil–Petersson volumes and intersection theory on the moduli space of curves, J. Amer. Math. Soc. 20 (1) (2007) 1–23.
[31] M. Kazarian, S. Lando, An algebro-geometric proof of Witten's conjecture, math.AG/0601760.
[32] Y.-S. Kim, K. Liu, A simple proof of Witten conjecture through localization, math.AG/0508384.
[33] L. Chen, Y. Li, K. Liu, Localization, Hurwitz numbers and the Witten conjecture, math.AG/0609263.
[34] U. Tillmann, Vanishing of the Batalin–Vilkovisky algebra structure for TCFTs, Commun. Math. Phys. 205 (2) (1999) 283–286.
[35] R. Dijkgraaf, Chiral deformations of conformal field theories, Nucl. Phys. B 493 (3) (1997) 588–612.
[36] A. Gerasimov, S. Shatashvili, Towards integrability of topological strings I: Three-forms on Calabi–Yau manifolds, JHEP 0411 (2004) 074.
[37] T. Graber, R. Pandharipande, Constructions of non-tautological classes on moduli spaces of curves, Michigan Math. J. 51 (1) (2003) 93–109.
The orbifolds of permutation-type as physical string systems at multiples of c = 26:
III. The spectra of c = 52 strings
M.B. Halpern
Department of Physics, University of California and Theoretical Physics Group,
Lawrence Berkeley National Laboratory, University of California, Berkeley, CA 94720, USA
Received 17 April 2007; accepted 8 August 2007
Available online 14 August 2007
Abstract
In the second paper of this series, I obtained the twisted BRST systems and extended physical-state conditions of all twisted open and closed c = 52 strings. In this paper, I supplement the extended physical-state
conditions with the explicit form of the extended (twisted) Virasoro generators of all c = 52 strings, which
allows us to discuss the physical spectra of these systems. Surprisingly, all the c = 52 spectra admit an
equivalent description in terms of generically-unconventional Virasoro generators at c = 26. This description strongly supports our prior conjecture that the c = 52 strings are free of negative-norm states, and
moreover shows that the spectra of some of the simpler cases are equivalent to those of ordinary untwisted
open and closed c = 26 strings.
1. Introduction
Opening another chapter in the orbifold program [1–15], this is the third in a series of papers which considers the critical orbifolds of permutation-type as candidates for new physical
string systems at higher central charge. In the first paper [16] of this series, we found that the
twisted sectors of these orbifolds are governed by new, extended (permutation-twisted) world-sheet gravities, which indicate that the free-bosonic orbifold-string systems of permutation-type
can be free of negative-norm states at critical central charge c = 26K. Correspondingly-extended
world-sheet permutation supergravities are expected in the twisted sectors of the superstring
orbifolds of permutation-type, where superconformal matter lives at higher multiples of critical
superstring central charges.
E-mail address: halpern@physics.berkeley.edu.
doi:10.1016/j.nuclphysb.2007.08.001
M.B. Halpern / Nuclear Physics B 786 [PM] (2007) 297–312
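In the second paper [17] of the series, we found the corresponding twisted BRST systems for
all sectors of the free-bosonic orbifolds which couple to the simple case of $\mathbb{Z}_2$-twisted permutation gravity, i.e. for all the twisted strings with c = 52 matter. The new BRST systems also
implied the following extended physical-state conditions for the physical states $\{|\chi\rangle\}$ of each of
the c = 52 strings:
$$\left(\hat L_u\!\left(\left(m+\frac{u}{2}\right)\ge 0\right) - \frac{17}{8}\,\delta_{m+\frac{u}{2},0}\right)|\chi\rangle = 0,\qquad m\in\mathbb{Z},\quad u=0,1, \tag{1.1a}$$
$$\left[\hat L_u\!\left(m+\frac{u}{2}\right),\hat L_v\!\left(n+\frac{v}{2}\right)\right] = \left(m-n+\frac{u-v}{2}\right)\hat L_{u+v}\!\left(m+n+\frac{u+v}{2}\right) + \frac{52}{12}\left(m+\frac{u}{2}\right)\left(\left(m+\frac{u}{2}\right)^2-1\right)\delta_{m+n+\frac{u+v}{2},0}. \tag{1.1b}$$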
The algebra in Eq. (1.1b) is called an order-two orbifold Virasoro algebra (or extended, twisted
Virasoro algebra) and general orbifold Virasoro algebras [1,9,12,16–18] are known to govern all
the twisted sectors of the orbifolds of permutation-type at higher multiples of c = 26.
The set of all c = 52 orbifold-strings is a very large class of fractional-moded free-bosonic
string systems, including e.g. the twisted open-string sectors of the orientation orbifolds, the
twisted closed-string sectors of the generalized $\mathbb{Z}_2$-permutation orbifolds and many others (see
Refs. [16,17] and Section 2). Starting from the extended physical-state conditions (1.1) (and a
right-mover copy of (1.1) on the same $\{|\chi\rangle\}$ for the twisted closed-string sectors) this paper
begins the concrete study of the physical spectrum of each c = 52 string.
As the prerequisite for this analysis, I first provide in Section 2 the explicit form (in terms of
twisted matter fields) of the extended Virasoro generators $\{\hat L_u(m+\frac{u}{2}),\ u=0,1\}$ of all c = 52
strings. This construction allows us to begin the study of the general c = 52 string spectra in
Section 3. The same subject is further considered in Section 4, where I point out that all the
c = 52 spectra admit an equivalent description in terms of generically-unconventional Virasoro
generators at c = 26. This description allows us to see clearly a number of spectral regularities
which are only glimpsed in Section 3, including strong further evidence that the critical orbifolds
of permutation-type can be free of negative-norm states. Moreover, although the generic c = 52
spectrum is apparently new, we are able to show that some of the simpler spectra are equivalent
to those of ordinary untwisted open and closed critical strings at c = 26.
Based on these results, the discussion in Section 5 raises some interesting questions about
these theories at the interacting level, and speculates on the form of the extended physical-state
conditions for more general orbifold-strings of permutation-type. I will return to both of these
subjects in succeeding papers of the series.
2. The twisted Virasoro generators of c = 52 strings
As emphasized in Ref. [17], the universal form of the twisted BRST systems and the extended
physical-state conditions (1.1) are consequences of their origin in $\mathbb{Z}_2$-twisted permutation gravity,
which governs all twisted c = 52 matter.
There are however many distinct c = 52 strings, including the twisted open-string sectors of
the orientation orbifolds [12,13,15–17]
$$\frac{U(1)^{26}}{H_-},\qquad H_- = \mathbb{Z}_2(\text{w.s.})\times H, \tag{2.1}$$
and the twisted closed-string sectors of the generalized $\mathbb{Z}_2$-permutation orbifolds [15–17]
$$\frac{U(1)^{26}\times U(1)^{26}}{H_+},\qquad H_+ = \mathbb{Z}_2(\text{perm})\times H', \tag{2.2}$$
as well as the generalized open-string $\mathbb{Z}_2$-permutation orbifolds and their T-duals [15–17].
For the orientation orbifolds in Eq. (2.1), I remind the reader that $H_-$ is any automorphism group of
the untwisted closed string $U(1)^{26}$ which includes world-sheet orientation-reversing automorphisms. Indeed the twisted open-string orientation-orbifold sectors correspond to the orientation-reversing automorphisms, which have the form $\tau_-\times\omega$, $\omega\in H$, where the basic automorphism $\tau_-$
exchanges the left- and right-movers of the closed string and $\omega$ is an extra automorphism
which acts uniformly on the left- and right-movers of the closed string. Similarly, the automorphism group $H_+$ of the generalized $\mathbb{Z}_2$-permutation orbifolds in (2.2) is generated by elements
of the form $\tau_+\times\omega$, $\omega\in H'$, where the basic automorphism $\tau_+$ exchanges the two copies of the
closed string and the extra automorphism $\omega$ again acts uniformly on the left- and right-movers
of each closed string. In both cases, the extra automorphisms in $\tau_\mp\times\omega$ may or may not form
a group (see the examples at the end of this section).
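The spectra of different c = 52 strings are characterized by their extended (twisted) Virasoro
generators, all of which can in fact be written in the following unified form:
$$\hat L_u\!\left(m+\frac{u}{2}\right) = \frac{1}{4}\sum_{r,\mu,\nu}\sum_{v=0}^{1}\sum_{p\in\mathbb{Z}} \mathcal{G}^{n(r)\mu;-n(r),\nu}(\sigma)\, :\!\hat J_{n(r)\mu v}\!\left(p+\frac{n(r)}{\rho(\sigma)}+\frac{v}{2}\right)\hat J_{-n(r),\nu,u-v}\!\left(m-p-\frac{n(r)}{\rho(\sigma)}+\frac{u-v}{2}\right)\!:_M + \delta_{m+\frac{u}{2},0}\,\hat\Delta_0(\sigma), \tag{2.3a}$$
$$\left[\hat J_{n(r)\mu u}\!\left(m+\frac{n(r)}{\rho(\sigma)}+\frac{u}{2}\right),\hat J_{n(s)\nu v}\!\left(n+\frac{n(s)}{\rho(\sigma)}+\frac{v}{2}\right)\right] = 2\left(m+\frac{n(r)}{\rho(\sigma)}+\frac{u}{2}\right)\delta_{n(r)+n(s),0\ \mathrm{mod}\ \rho(\sigma)}\,\delta_{m+n+\frac{n(r)+n(s)}{\rho(\sigma)}+\frac{u+v}{2},0}\,\mathcal{G}_{n(r),\mu;-n(r),\nu}(\sigma), \tag{2.3b}$$
$$\left[\hat L_u\!\left(m+\frac{u}{2}\right),\hat J_{n(r)\mu v}\!\left(n+\frac{n(r)}{\rho(\sigma)}+\frac{v}{2}\right)\right] = -\left(n+\frac{n(r)}{\rho(\sigma)}+\frac{v}{2}\right)\hat J_{n(r),\mu,u+v}\!\left(m+n+\frac{n(r)}{\rho(\sigma)}+\frac{u+v}{2}\right), \tag{2.3c}$$
$$\hat\Delta_0(\sigma) = \frac{13}{8} + \frac{1}{2}\sum_r \dim[\bar n(r)]\left(\frac{1}{2}-\frac{\bar n(r)}{\rho(\sigma)}\right)\left(\frac{\bar n(r)}{\rho(\sigma)}-\theta\!\left(\frac{\bar n(r)}{\rho(\sigma)}\ge\frac{1}{2}\right)\right), \tag{2.3d}$$
$$\sum_r \dim[\bar n(r)] = 26. \tag{2.3e}$$
Each set of extended Virasoro generators in Eq. (2.3a) satisfies the order-two orbifold Virasoro
algebra (1.1b) at c = 52, and the current algebras in Eq. (2.3b) are of the type called doubly-twisted in the orbifold program.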
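For those unfamiliar with the program, I first give a short summary of the standard notation in
the result (2.3), followed by the derivation of the result. As in the extended Virasoro generators
themselves, the indices $u, v$ with fundamental range $\bar u, \bar v \in \{0, 1\}$ describe the twist of the basic
permutations $\tau_\mp$ in each $H_\mp$. For each extra automorphism $\omega(\sigma)$, the spectral
indices $\{n(r)\}$ and the degeneracy indices $\{\mu(n(r))\}$ of each twisted sector $\sigma$ are determined
by the so-called $H$-eigenvalue problem [3,5,6] of $\omega(\sigma)$,
$$\omega(\sigma)_a{}^b\, U^\dagger(\sigma)_b{}^{n(r)\mu} = U^\dagger(\sigma)_a{}^{n(r)\mu}\, e^{-2\pi i\frac{n(r)}{\rho(\sigma)}},\qquad \omega(\sigma)\in H \text{ or } H', \tag{2.4a}$$
$$\omega(\sigma)_a{}^c\,\omega(\sigma)_b{}^d\, G_{cd} = G_{ab},\qquad G_{ab} = G^{ab} = \begin{pmatrix} -1 & 0\\ 0 & \mathbb{1}\end{pmatrix}, \tag{2.4b}$$
$$a, b = 0, 1, \ldots, 25,\qquad \bar n(r)\in\{0, 1, \ldots, \rho(\sigma)-1\}, \tag{2.4c}$$
where $G$ is the untwisted target-space metric of $U(1)^{26}$. The quantity $\rho(\sigma)$ is the order of $\omega(\sigma)$
and all indices $\{n(r)\}$ are periodic modulo $\rho(\sigma)$, with $\{\bar n(r)\}$ the pullback to the fundamental
region and $\dim[\bar n(r)]$ the size of the subspace $\bar n(r)$.
The index $r$ is summed once over the fundamental region in Eqs. (2.3a), (2.3d) and (2.3e). The twisted metric $\mathcal{G}_{\bullet}(\sigma)$ and its inverse $\mathcal{G}^{\bullet}(\sigma)$
are defined in terms of the unitary eigenvectors $U(\sigma)$ of the $H$-eigenvalue problem
$$\mathcal{G}_{n(r)\mu;n(s)\nu}(\sigma) = \chi_{n(r)\mu}\,\chi_{n(s)\nu}\, U(\sigma)_{n(r)\mu}{}^a\, U(\sigma)_{n(s)\nu}{}^b\, G_{ab} \tag{2.5a}$$
$$\qquad = \delta_{n(r)+n(s),0\ \mathrm{mod}\ \rho(\sigma)}\,\mathcal{G}_{n(r)\mu;-n(r),\nu}(\sigma), \tag{2.5b}$$
$$\mathcal{G}^{n(r)\mu;n(s)\nu}(\sigma) = \chi_{n(r)\mu}^{-1}\,\chi_{n(s)\nu}^{-1}\, G^{ab}\, U^\dagger(\sigma)_a{}^{n(r)\mu}\, U^\dagger(\sigma)_b{}^{n(s)\nu} \tag{2.5c}$$
$$\qquad = \delta_{n(r)+n(s),0\ \mathrm{mod}\ \rho(\sigma)}\,\mathcal{G}^{n(r)\mu;-n(r),\nu}(\sigma), \tag{2.5d}$$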
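where $G$ is again the untwisted metric and the $\chi$'s are essentially-arbitrary normalization constants. Finally, the standard mode normal-ordering in Eq. (2.3a) is:
$$:\!\hat J_{n(r)\mu u}\!\left(m+\frac{n(r)}{\rho(\sigma)}+\frac{u}{2}\right)\hat J_{n(s)\nu v}\!\left(n+\frac{n(s)}{\rho(\sigma)}+\frac{v}{2}\right)\!:_M$$
$$\quad = \theta\!\left(\left(m+\frac{n(r)}{\rho(\sigma)}+\frac{u}{2}\right)\ge 0\right)\hat J_{n(s)\nu v}\!\left(n+\frac{n(s)}{\rho(\sigma)}+\frac{v}{2}\right)\hat J_{n(r)\mu u}\!\left(m+\frac{n(r)}{\rho(\sigma)}+\frac{u}{2}\right)$$
$$\qquad + \theta\!\left(\left(m+\frac{n(r)}{\rho(\sigma)}+\frac{u}{2}\right)< 0\right)\hat J_{n(r)\mu u}\!\left(m+\frac{n(r)}{\rho(\sigma)}+\frac{u}{2}\right)\hat J_{n(s)\nu v}\!\left(n+\frac{n(s)}{\rho(\sigma)}+\frac{v}{2}\right). \tag{2.6}$$
It follows that the quantity $\hat\Delta_0(\sigma)$ in Eqs. (2.3a) and (2.3d)
$$\hat J_{n(r)\mu u}\!\left(\left(m+\frac{n(r)}{\rho(\sigma)}+\frac{u}{2}\right)\ge 0\right)|0\rangle_\sigma = 0, \tag{2.7a}$$
$$\hat L_u\!\left(\left(m+\frac{u}{2}\right)\ge 0\right)|0\rangle_\sigma = \hat\Delta_0(\sigma)\,\delta_{m+\frac{u}{2},0}\,|0\rangle_\sigma \tag{2.7b}$$
is the conformal weight of the scalar twist-field state $|0\rangle_\sigma$ of sector $\sigma$.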
I comment briefly on the derivation of the unified form (2.3) of the c = 52 extended Virasoro generators. Essentially this result was given for the twisted open-string sectors of the
non-Abelian orientation orbifolds in Sections 3.4, 3.5 of Ref. [12], and that result is easily reduced for our Abelian case $U(1)^{26}/H_-$ in Eq. (2.1). With a right-mover copy of the extended
Virasoro generators (and $u \to j = 0, 1$), the result also holds for the twisted closed-string sectors
of the generalized $\mathbb{Z}_2$-permutation orbifolds $(U(1)^{26}\times U(1)^{26})/H_+$ in Eq. (2.2). This follows
by the substitution
$$G \to \mathcal{G},\qquad \frac{u}{2} \to \frac{n(r)}{\rho(\sigma)}+\frac{u}{2} \tag{2.8}$$
into the known results for the ordinary $\mathbb{Z}_2$-permutation orbifolds with trivial $H'$ (see Ref. [9]
and Section 4.2 of Ref. [16]). Finally, a single copy of the unified form (2.3) holds as well
for each twisted sector of the generalized open-string $\mathbb{Z}_2$-permutation orbifolds $(U(1)^{26}\times U(1)^{26})_{\text{open}}/H_+$ and all possible T-dualizations of each of these sectors. This conclusion follows because the left-mover extended Virasoro generators of the closed-string orbifolds for each
$H_+$ are the input data for the construction of the corresponding open-string orbifolds [14],
and the twisted-current form of each set of extended Virasoro generators is independent of T-dualization [15]. The branes, quasi-canonical algebra and non-commutative geometry of the
twisted open strings [13–17] depend of course on the particular T-dualization, but these will
not be needed here.
In what follows I will consider each twisted c = 52 string separately, but the reader may find
it helpful to bear in mind the complete sector structure of these orbifold-string systems as labeled
by the elements of the automorphism groups $H_\mp$. Given a particular extra automorphism $\omega_n \in H$
or $H'$ of order $n$, one may list the following low-order examples:
$$(1;\ \tau_\mp), \tag{2.9a}$$
$$(1;\ \tau_\mp\omega_2), \tag{2.9b}$$
$$\left(1, \omega_3, \omega_3^2;\ \tau_\mp, \tau_\mp\omega_3, \tau_\mp\omega_3^2\right), \tag{2.9c}$$
$$\left(1, \omega_4^2;\ \tau_\mp\omega_4, \tau_\mp\omega_4^3\right), \tag{2.9d}$$
$$\left(1, \omega_6^2, \omega_6^4;\ \tau_\mp\omega_6, \tau_\mp\omega_6^3, \tau_\mp\omega_6^5\right). \tag{2.9e}$$
For the generalized $\mathbb{Z}_2$-permutation orbifolds ($\tau_+$) all of these sectors are twisted closed strings
at c = 52, while all the sectors of the generalized open-string $\mathbb{Z}_2$-permutation orbifolds ($\tau_+$)
and their T-dualizations are twisted open strings at c = 52. For the orientation orbifolds ($\tau_-$)
the sectors before the semicolon are twisted closed strings at c = 26 (which form an ordinary
space-time orbifold) while the sectors after the semicolon are twisted open strings at c = 52.
More generally, orientation orbifolds always contain an equal number of twisted open and closed
strings. In all cases, the twisting is of course trivial for sectors corresponding to the unit element.
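The low-order sector lists in Eq. (2.9) can be reproduced mechanically. The following is a minimal sketch, assuming that each list consists of the powers of a single generator (the basic automorphism times an extra automorphism of order n), so that an element is recorded as a pair (power of the basic automorphism mod 2, power of the extra automorphism mod n). The function name `sectors` and this encoding are illustrative and not taken from the paper.

```python
from math import gcd


def sectors(n):
    """Powers of g = (basic automorphism) x (extra automorphism of order n).

    Each power g^k is encoded as (k mod 2, k mod n). Returns two sorted lists of
    exponents of the extra automorphism: those without the basic automorphism
    (before the semicolon) and those with it (after the semicolon).
    """
    order = 2 * n // gcd(2, n)  # order of g, i.e. lcm(2, n)
    powers = [(k % 2, k % n) for k in range(order)]
    before = sorted(w for t, w in powers if t == 0)
    after = sorted(w for t, w in powers if t == 1)
    return before, after


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 6):
        before, after = sectors(n)
        print(n, before, after)
```

For n = 4 this gives exponents 0, 2 before the semicolon and 1, 3 after it, and for every n the two lists have equal length, which is the counting statement made above for the orientation orbifolds.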
3. First discussion of the c = 52 string spectra
To frame this discussion, I remind [1] the reader that the Virasoro primary states of our
orbifold CFTs are defined by the integral Virasoro subalgebra (generated by $\{\hat L_0(m)\}$) of the
extended Virasoro algebra. Then the extended physical-state conditions (1.1a) tell us that all the
physical states $\{|\chi\rangle\}$ of each c = 52 orbifold-string are Virasoro primary
$$\hat L_0(m>0)|\chi\rangle = 0 \tag{3.1}$$
but only a small subset of these primary states are selected by the rest of the physical-state
conditions:
$$\left(\hat L_0(0)-\frac{17}{8}\right)|\chi\rangle = \hat L_1\!\left(\left(m+\frac{1}{2}\right)>0\right)|\chi\rangle = 0. \tag{3.2}$$
In what follows, I will refer to the $\hat L_0(0)$ condition in Eq. (3.2) as the spectral condition, since it
will determine the allowed values of momentum-squared for each c = 52 string.
The space of physical states of each orbifold-string is then much smaller than the space of
states of the underlying orbifold conformal field theory. For the experts, I remark in particular
that the extended physical-state conditions generically disallow the characteristic sequence [19]
of Virasoro primary states known as the principle-primary states [1,9]. This follows first by the
spectral condition (which fixes the conformal weight), and second because the physical-state
condition $\{\hat L_u((m+\frac{u}{2})>0)\simeq 0\}$ is stronger than the principle-primary state condition [1,9]
$$\hat L_u\!\left(m+\frac{u}{2}\right)|\text{p.p.s.}\rangle = 0,\qquad u = 0, 1,\quad m>0, \tag{3.3}$$
which does not extend to $m = 0$.
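I turn now to concretize the spectral condition of each twisted c = 52 string, using the explicit
form (2.3) of its extended Virasoro generators. For this, recall [12,15] first that these generators contain in general two kinds of commuting zero modes (dimensionless momenta), namely
$\{\hat J_{0\mu 0}(0)\}$ and $\{\hat J_{\rho(\sigma)/2,\mu,1}(0)\}$, where the latter is relevant only when the order $\rho(\sigma)$ of $\omega(\sigma)$ is
even. In what follows, I often refer to these zero modes collectively as $\{\hat J(0)\}$. It is then natural
to define the momentum-squared operator $\hat P^2$ as follows:
$$\hat L_0(0) = \frac{1}{4}\left(-\hat P^2 + \hat R(\sigma)\right) + \hat\Delta_0(\sigma), \tag{3.4a}$$
$$\hat P^2 \equiv -\left\{\mathcal{G}^{0\mu;0\nu}(\sigma)\,\hat J_{0\mu 0}(0)\hat J_{0\nu 0}(0) + \mathcal{G}^{\frac{\rho(\sigma)}{2},\mu;-\frac{\rho(\sigma)}{2},\nu}(\sigma)\,\hat J_{\rho(\sigma)/2,\mu,1}(0)\hat J_{-\rho(\sigma)/2,\nu,-1}(0)\right\}, \tag{3.4b}$$
$$\hat R(\sigma) \equiv \sum_{r,\mu,\nu}\sum_{u}{\sum_{p\in\mathbb{Z}}}'\,\mathcal{G}^{n(r)\mu;-n(r),\nu}(\sigma)\, :\!\hat J_{n(r)\mu u}\!\left(p+\frac{n(r)}{\rho(\sigma)}+\frac{u}{2}\right)\hat J_{-n(r),\nu,-u}\!\left(-p-\frac{n(r)}{\rho(\sigma)}-\frac{u}{2}\right)\!:_M. \tag{3.4c}$$
Here the primed sum in the level-number operator $\hat R(\sigma)$ indicates omission of the zero modes.
With this decomposition, the spectral condition in Eq. (3.2) takes the simple form:
$$\hat P^2|\chi\rangle = \left(\hat P_{(0)}^2 + \hat R(\sigma)\right)|\chi\rangle, \tag{3.5a}$$
$$\hat P_{(0)}^2 \equiv 2\left(\hat\delta_0(\sigma)-1\right), \tag{3.5b}$$
$$\hat\delta_0(\sigma) = \sum_r \dim[\bar n(r)]\left(\frac{1}{2}-\frac{\bar n(r)}{\rho(\sigma)}\right)\left(\frac{\bar n(r)}{\rho(\sigma)}-\theta\!\left(\frac{\bar n(r)}{\rho(\sigma)}\ge\frac{1}{2}\right)\right)\ \ge\ 0. \tag{3.5c}$$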
Although I will continue the discussion primarily in this form, in fact Eqs. (3.4a) and (3.5a) hold
only for the twisted open-string sectors of the orbifolds. For the twisted closed-string sectors,
we also have right-mover copies of the extended Virasoro generators (2.3), and a corresponding
right-mover copy of the extended physical-state conditions (1.1) on the same $\{|\chi\rangle\}$. For simplicity
I will limit the discussion of these sectors here to the case of decompactified zero modes, for
which it is appropriate to equate the left and right movers
$$\hat J^R(0) = \hat J^L(0) = \frac{1}{\sqrt{2}}\,\hat J(0),\qquad \hat R^R(\sigma) = \hat R^L(\sigma) \tag{3.6}$$
where the last equality is level-matching in each twisted sector. Keeping the same definition of
the operator $\hat P^2$ in Eq. (3.4b), the correct closed-string c = 52 spectral condition is then obtained
by the substitution
$$\hat P^2 \to \frac{1}{2}\hat P^2 \tag{3.7}$$
in both Eqs. (3.4a) and (3.5a). These identifications, and hence $\hat P_{(0)}^2 \to 2\hat P_{(0)}^2$, can be used at any
point in the discussion below to obtain the corresponding closed-string results.
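Returning to the open-string case, one simple solution of the extended physical-state conditions is the ground state $|0, \hat J(0)\rangle_\sigma$ of twisted sector $\sigma$:
$$\hat R(\sigma)\,|0,\hat J(0)\rangle_\sigma = \hat L_u\!\left(\left(m+\frac{u}{2}\right)>0\right)|0,\hat J(0)\rangle_\sigma = 0, \tag{3.8a}$$
$$\hat P^2\,|0,\hat J(0)\rangle_\sigma = \hat P_{(0)}^2\,|0,\hat J(0)\rangle_\sigma,\qquad \hat P_{(0)}^2 = -2 + 2\hat\delta_0(\sigma). \tag{3.8b}$$
This is the momentum-boosted twist-field state (see Eq. (2.7)) of that sector, with ground-state
mass-squared $\hat P_{(0)}^2$. Moreover Eq. (3.4c) and the commutator (2.3c) give the increments
$$\Delta(\hat P^2) = \Delta(\hat R(\sigma)) = 4\left|m+\frac{n(r)}{\rho(\sigma)}+\frac{u}{2}\right| \tag{3.9}$$
obtained by adding the negatively-moded current
$$\hat J_{n(r)\mu u}\!\left(\left(m+\frac{n(r)}{\rho(\sigma)}+\frac{u}{2}\right)<0\right)$$
to any previous state. The precise content of these excited levels must of course be determined
from the remainder of the extended physical-state conditions.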
I continue this discussion with some specific examples of c = 52 strings, beginning with the
simplest twisted open-string orientation-orbifold sectors [12,13,15–17]:
$$\omega = 1:\quad \rho = 1,\ \bar n = 0,\ U = 1,\ \mathcal{G} = G,\quad \hat J_{0au}\!\left(m+\frac{u}{2}\right), \tag{3.10a}$$
$$\Delta(\hat P^2) = 4\left|m+\frac{u}{2}\right|\quad (u = 0 \text{ is DD},\ u = 1 \text{ is ND}), \tag{3.10b}$$
$$\omega = -1:\quad \rho = 2,\ \bar n = 1,\ U = 1,\ \mathcal{G} = G,\quad \hat J_{1au}\!\left(m+\frac{u+1}{2}\right), \tag{3.11a}$$
$$\Delta(\hat P^2) = 4\left|m+\frac{u+1}{2}\right|\quad (u = 0 \text{ is DN},\ u = 1 \text{ is NN}). \tag{3.11b}$$
In these cases, the extra automorphisms act uniformly on the labels $a = 0, \ldots, 25$ and $G$ is the
untwisted target space metric in Eq. (2.4b). Although both twisted strings have (26 + 26) = 52
matter degrees of freedom, note that each example has only one of the two types of zero modes
$\{\hat J(0)\}$: 26 DD zero modes $\{\hat J_{0a0}(0)\}$ for $\omega = 1$ and 26 NN zero modes $\{\hat J_{\rho/2,a,1}(0)\}$ for $\omega = -1$.
In both cases, the momentum-squared (3.4b) has the schematic form
$$\hat P^2 = \eta^{ab}\hat J_a(0)\hat J_b(0),\qquad \eta = \begin{pmatrix} 1 & 0\\ 0 & -\mathbb{1}\end{pmatrix}, \tag{3.12}$$
where $\eta = -G$ is the standard (west-coast) 26-dimensional target-space metric. Then we compute from Eqs. (3.5b) and (3.5c) that both strings share the same tachyonic ground-state mass-squared
$$\hat\delta_0(\sigma) = 0,\qquad \hat\Delta_0(\sigma) = \frac{13}{8},\qquad \hat P_{(0)}^2 = -2 \tag{3.13}$$
and the first excited state of each is massless:
$$\hat P^2\left\{\begin{matrix}\hat J_{0a1}\!\left(-\frac{1}{2}\right)\\[2pt] \hat J_{1a0}\!\left(-\frac{1}{2}\right)\end{matrix}\right\}|0,\hat J(0)\rangle_\sigma = 0\quad\text{for}\quad \omega = \left\{\begin{matrix}1,\\ -1.\end{matrix}\right. \tag{3.14}$$
For this level, I have checked that the $\hat L_1(\frac{1}{2})\simeq 0$ gauge eliminates the longitudinal parts of the
26-dimensional photons, and moreover the $\hat L_1(\frac{1}{2})$ and $\hat L_0(1)$ gauges together eliminate the
negative-norm states at the next level:
$$\left(\hat J\!\left(-\tfrac{1}{2}\right)^2 + \hat J(-1)\right)|0,\hat J(0)\rangle_\sigma,\qquad \hat P^2 = 2. \tag{3.15}$$
Since the increments $\Delta(\hat P^2)$ in Eqs. (3.10b) and (3.11b) are even integers, we are led to suspect
that the spectra of these two twisted c = 52 strings are nothing but the spectrum of an ordinary
open c = 26 string in disguise.$^1$ I will return to this question in the following section.
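The even-integer pattern of the increments can be tabulated directly. The following is a minimal sketch, assuming the increments are 4|m + (u + eps)/2| with eps = 0 for the first example and eps = 1 for the second, and assuming for comparison that an ordinary integer-moded open string has increments 2N in the same normalization. The names `c52_increments`, `c26_increments`, `eps` and `bound` are illustrative only.

```python
from fractions import Fraction


def c52_increments(eps, bound):
    """Distinct increments 4|m + (u + eps)/2| <= bound from negatively-moded currents.

    eps = 0 models the uniformly-acting trivial extra automorphism,
    eps = 1 the uniformly-acting sign flip; u = 0, 1 is the basic twist index.
    """
    found = set()
    for u in (0, 1):
        for m in range(-bound, 1):
            x = Fraction(2 * m + u + eps, 2)  # the mode number m + (u + eps)/2
            if x < 0 and 4 * abs(x) <= bound:
                found.add(4 * abs(x))
    return sorted(found)


def c26_increments(bound):
    """Increments 2N <= bound of an integer-moded string (comparison assumption)."""
    return [Fraction(2 * n) for n in range(1, bound // 2 + 1)]


if __name__ == "__main__":
    print(c52_increments(0, 12))
    print(c52_increments(1, 12))
    print(c26_increments(12))
```

In this sketch the two twisted sectors and the integer-moded comparison give the same list of increments; the roles of the two values of u are simply exchanged between the two sectors.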
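A larger subset of twisted c = 52 strings is the following. For a particular twisted sector $\sigma$,
suppose that $\omega = \pm 1$ acts uniformly on a set of $d$ labels $a = 0, 1, \ldots, d-1$, $d \ge 4$, while a
non-trivial element $\omega(\text{perm})$ of some permutation group acts non-trivially on the other $26 - d$
spatial labels. Then Eqs. (2.4), (2.5) and standard results [3,5–7,9] in the orbifold program give
the following explicit form of the extended Virasoro generators (2.3) in this sector:
$$\hat L_u\!\left(m+\frac{u}{2}\right) = \delta_{m+\frac{u}{2},0}\,\hat\Delta_0(\sigma) + \frac{1}{4}G_{(d)}^{ab}\sum_{v}\sum_{p} :\!\hat J_{\epsilon a v}\!\left(p+\frac{v+\epsilon}{2}\right)\hat J_{-\epsilon,b,u-v}\!\left(m-p+\frac{u-v-\epsilon}{2}\right)\!:_M$$
$$\qquad + \frac{1}{4}\sum_{v}\sum_{j}\frac{1}{f_j(\sigma)}\sum_{\hat j=0}^{f_j(\sigma)-1}\sum_{p} :\!\hat J_{\hat j j v}\!\left(p+\frac{\hat j}{f_j(\sigma)}+\frac{v}{2}\right)\hat J_{-\hat j,j,u-v}\!\left(m-p-\frac{\hat j}{f_j(\sigma)}+\frac{u-v}{2}\right)\!:_M, \tag{3.16a}$$
$$\hat\delta_0(\sigma) = \sum_{j}\sum_{\hat j=0}^{f_j(\sigma)-1}\left(\frac{1}{2}-\frac{\hat j}{f_j(\sigma)}\right)\left(\frac{\hat j}{f_j(\sigma)}-\theta\!\left(\frac{\hat j}{f_j(\sigma)}\ge\frac{1}{2}\right)\right)\ \ge\ 0, \tag{3.16b}$$
$$\sum_j f_j(\sigma) = 26-d,\qquad 4\le d\le 26. \tag{3.16c}$$
$^1$ The spectra of these two c = 52 strings look even more familiar in terms of the dimensionful momenta $k \equiv \hat J(0)/\sqrt{2\alpha'_0}$, where $\alpha'_0$ is the conventional open-string Regge slope.
Here $\epsilon = 0$ or 1 for $\omega = 1$ or $-1$, $G_{(d)}$ is the restriction of the flat target-space metric (2.4b) to the
first $d$ labels, $f_j(\sigma)$ is the size of the $j$th cycle in $\omega(\text{perm})$, and the previous cases with $\hat\delta_0(\sigma) = 0$
are included when $d = 26$. The half-integer moded currents in the second term of (3.16) satisfy
the twisted current algebra (2.3b) with $\mathcal{G} \to G_{(d)}$. For the permutation-twisted currents in the last
term of (3.16), I have used the standard relation $(n(r)/\rho(\sigma)) = (\hat j/f_j(\sigma))$ and (the inverse of)
the twisted metric [3,5–7,9]
$$\mathcal{G}_{\hat j j;\hat l l}(\sigma) = \delta_{jl}\, f_j(\sigma)\,\delta_{\hat j+\hat l,0\ \mathrm{mod}\ f_j(\sigma)} \tag{3.17}$$
which also determines the twisted current algebra (2.3b) for these currents. Using Eq. (3.16b),
we see that the non-trivial element of $\mathbb{Z}_2$ on two labels also gives $\hat\delta_0(\sigma) = 0$ and a $\hat P_{(0)}^2 = -2$
ground state, but a non-trivial element of $\mathbb{Z}_3$ on three labels gives a slightly-raised ground state
$$\hat\delta_0(\sigma) = \frac{1}{9},\qquad \hat\Delta_0(\sigma) = \frac{121}{72},\qquad \hat P_{(0)}^2 = -\frac{16}{9}, \tag{3.18}$$
and no photons.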
Given the cycle-structure $\{f_j(\sigma)\}$ of any extra automorphism $\omega(\text{perm})$ (see e.g. Eq. (3.4) of
Ref. [16]), it is straightforward to evaluate the sum in Eq. (3.16b). As an illustration, one finds
the simple tachyonic ground-state mass-squares
$$5\le d\le 23,\ \ 26-d \text{ prime}:\qquad \hat P_{(0)}^2 = -\frac{1}{12}\left(d-2+\frac{1}{26-d}\right) \tag{3.19}$$
in twisted sectors which correspond to the action of any non-trivial element of the cyclic group
$\mathbb{Z}_\lambda$ of prime order $\lambda$ on $3\le(\lambda = 26-d)\le 21$ spatial labels. The result (3.19) includes Eq. (3.18)
when $d = 23$, but does not extend to the cases $d = 26, 24$ with $\hat P_{(0)}^2 = -2$ discussed above.
I remind the reader that this result applies only to the open orbifold-strings, while twice these values of $\hat P_{(0)}^2$
are obtained for the closed-string versions.
Further analysis of the c = 52 strings, including the larger subset of examples (3.16), is
found in the following section.
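The ground-state quantities quoted in this section can be recomputed from a cycle structure with exact rational arithmetic. The following is a minimal sketch, assuming that each cycle of length f contributes the sum over j = 0, ..., f-1 of (1/2 - j/f)(j/f - theta(j/f >= 1/2)) to the quantity of Eq. (3.16b), that the conformal weight of Eq. (2.3d) is 13/8 plus half of that sum, and that the ground-state mass-squared of Eq. (3.5b) is twice the sum minus 2. The function names `delta0_hat` and `ground_state` are illustrative only.

```python
from fractions import Fraction as F


def delta0_hat(cycles):
    """Ground-state shift for a list of cycle lengths f_j.

    Labels on which the extra automorphism acts as +1 or -1 contribute nothing
    and are therefore not listed in `cycles`.
    """
    total = F(0)
    for f in cycles:
        for j in range(f):
            x = F(j, f)
            theta = 1 if x >= F(1, 2) else 0
            total += (F(1, 2) - x) * (x - theta)
    return total


def ground_state(cycles):
    """Return (shift, conformal weight, ground-state mass-squared) for open strings."""
    d0 = delta0_hat(cycles)
    return d0, F(13, 8) + d0 / 2, 2 * (d0 - 1)


if __name__ == "__main__":
    print(ground_state([2]))  # non-trivial element of Z_2 on two labels
    print(ground_state([3]))  # non-trivial element of Z_3 on three labels
    for lam in (3, 5, 7, 11, 13, 17, 19):
        d = 26 - lam
        print(d, ground_state([lam])[2], -F(1, 12) * (d - 2 + F(1, 26 - d)))
```

Under these assumptions the single three-cycle reproduces the values of Eq. (3.18), and the loop compares the direct sum with the closed form of Eq. (3.19) for each odd prime cycle length.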
4. Equivalent c = 26 description of the c = 52 spectra
In fact, there exists an entirely equivalent description of all the c = 52 string spectra in terms
of generically-unconventional Virasoro generators at c = 26.
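To obtain the c = 26 description, I first define the relabeled (unhatted) operators
$$J_{n(r)\mu}\!\left(2m+u+\frac{2n(r)}{\rho(\sigma)}\right) \equiv \hat J_{n(r)\mu u}\!\left(m+\frac{n(r)}{\rho(\sigma)}+\frac{u}{2}\right),\qquad u = 0, 1, \tag{4.1a}$$
$$L(2m+u) \equiv 2\hat L_u\!\left(m+\frac{u}{2}\right) - \frac{13}{4}\,\delta_{m+\frac{u}{2},0} \tag{4.1b}$$
in terms of the hatted operators above. This one-to-one map is recognized as a modest generalization of (the inverse of) the order-two orbifold-induction procedure of Borisov, Halpern and
Schweigert [1]. Since $M \equiv 2m+u$, $u = 0, 1$ covers the integers once, we then find from (2.3)
the explicit form of the c = 26 generators:
$$L(M) = \hat\delta_0(\sigma)\,\delta_{M,0} + \frac{1}{2}\sum_{r,\mu,\nu}\mathcal{G}^{n(r)\mu;-n(r),\nu}(\sigma)\sum_{Q\in\mathbb{Z}} :\!J_{n(r)\mu}\!\left(Q+\frac{2n(r)}{\rho(\sigma)}\right)J_{-n(r),\nu}\!\left(M-Q-\frac{2n(r)}{\rho(\sigma)}\right)\!:_M, \tag{4.2a}$$
$$\hat\delta_0(\sigma) = \sum_r \dim[\bar n(r)]\left(\frac{1}{2}-\frac{\bar n(r)}{\rho(\sigma)}\right)\left(\frac{\bar n(r)}{\rho(\sigma)}-\theta\!\left(\frac{\bar n(r)}{\rho(\sigma)}\ge\frac{1}{2}\right)\right), \tag{4.2b}$$
$$\left[L(M), L(N)\right] = (M-N)L(M+N) + \frac{26}{12}M\left(M^2-1\right)\delta_{M+N,0}, \tag{4.2c}$$
$$\left[L(M), J_{n(r)\mu}\!\left(N+\frac{2n(r)}{\rho(\sigma)}\right)\right] = -\left(N+\frac{2n(r)}{\rho(\sigma)}\right)J_{n(r)\mu}\!\left(M+N+\frac{2n(r)}{\rho(\sigma)}\right), \tag{4.2d}$$
$$\left[J_{n(r)\mu}\!\left(M+\frac{2n(r)}{\rho(\sigma)}\right), J_{n(s)\nu}\!\left(N+\frac{2n(s)}{\rho(\sigma)}\right)\right] = \left(M+\frac{2n(r)}{\rho(\sigma)}\right)\delta_{n(r)+n(s),0\ \mathrm{mod}\ \rho(\sigma)}\,\delta_{M+N+\frac{2(n(r)+n(s))}{\rho(\sigma)},0}\,\mathcal{G}_{n(r)\mu;-n(r),\nu}(\sigma). \tag{4.2e}$$
The expression (4.2b) for $\hat\delta_0(\sigma)$ is the same as above, and the mode-normal ordering in Eq. (4.2a)
$$:\!J_{n(r)\mu}\!\left(M+\frac{2n(r)}{\rho(\sigma)}\right)J_{n(s)\nu}\!\left(N+\frac{2n(s)}{\rho(\sigma)}\right)\!:_M$$
$$\quad = \theta\!\left(\left(M+\frac{2n(r)}{\rho(\sigma)}\right)\ge 0\right)J_{n(s)\nu}\!\left(N+\frac{2n(s)}{\rho(\sigma)}\right)J_{n(r)\mu}\!\left(M+\frac{2n(r)}{\rho(\sigma)}\right)$$
$$\qquad + \theta\!\left(\left(M+\frac{2n(r)}{\rho(\sigma)}\right)<0\right)J_{n(r)\mu}\!\left(M+\frac{2n(r)}{\rho(\sigma)}\right)J_{n(s)\nu}\!\left(N+\frac{2n(s)}{\rho(\sigma)}\right) \tag{4.3}$$
follows from the c = 52 ordering (2.6) because the map (4.1) preserves the sign of all arguments.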
I emphasize that the c = 26 Virasoro generators in Eq. (4.2) are generically-unconventional
because the twisted matter is now summed over the fractions $\{2n/\rho\}$ instead of the conventional
orbifold-fractions $\{n/\rho\}$. This distortion of the extra twist is the price we must pay in order to
unwind the basic twist associated to the basic permutations of $H_\mp$.
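The map (4.1) also tells us that the c = 52 momenta $\{\hat J(0)\}$ and the c = 26 momenta $\{J(0)\}$
are identical, and we may record
$$J(0) = \hat J(0):\qquad J_{0\mu}(0) = \hat J_{0\mu 0}(0),\qquad J_{\rho(\sigma)/2,\mu}(0) = \hat J_{\rho(\sigma)/2,\mu,1}(0), \tag{4.4a}$$
$$P^2 = \hat P^2 = -\left\{\mathcal{G}^{0\mu;0\nu}(\sigma)\,J_{0\mu}(0)J_{0\nu}(0) + \mathcal{G}^{\frac{\rho(\sigma)}{2},\mu;-\frac{\rho(\sigma)}{2},\nu}(\sigma)\,J_{\rho(\sigma)/2,\mu}(0)J_{-\rho(\sigma)/2,\nu}(0)\right\}, \tag{4.4b}$$
where the c = 52 form of $\hat P^2$ was given in Eq. (3.4b). Similarly, the level-number operator
$R(\sigma)$ in the decomposition of $L(0)$ is the same
$$L(0) = \frac{1}{2}\left(-P^2 + R(\sigma)\right) + \hat\delta_0(\sigma), \tag{4.5a}$$
$$R(\sigma) = \hat R(\sigma) = \sum_{r,\mu,\nu}{\sum_{Q\in\mathbb{Z}}}'\,\mathcal{G}^{n(r)\mu;-n(r),\nu}(\sigma)\, :\!J_{n(r)\mu}\!\left(Q+\frac{2n(r)}{\rho(\sigma)}\right)J_{-n(r),\nu}\!\left(-Q-\frac{2n(r)}{\rho(\sigma)}\right)\!:_M \tag{4.5b}$$
where the c = 52 form of $\hat R(\sigma)$ was given in Eq. (3.4c).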
By itself, the inverse orbifold-induction procedure (4.1) is only a relabeling of the operators
of the permutation-orbifold CFTs. The central point of this discussion however is that for the
orbifold-string theories, restricted by the extended physical-state conditions (1.1), the map
also gives us a completely equivalent c = 26 description of the physical spectrum of each c = 52
orbifold-string. Indeed, it is easily checked that both components $u = 0, 1$ of the c = 52 extended
physical-state condition (1.1a) map directly onto the simpler and in fact conventional physical-state condition
$$L(M\ge 0)|\chi\rangle = \delta_{M,0}|\chi\rangle \tag{4.6}$$
in the 26-dimensional description! A right-mover copy of Eq. (4.6) on the same physical states
$\{|\chi\rangle\}$ is similarly obtained in the equivalent c = 26 description of the closed orbifold-strings.
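The arithmetic behind this map can be checked with exact fractions. The following is a minimal sketch, assuming the relabeling L(M) = 2 Lhat(M/2) - (13/4) delta_{M,0} of Eq. (4.1b), the central term (52/12) x (x^2 - 1) of Eq. (1.1b) at argument x = M/2, and the intercept 17/8 of Eq. (1.1a). Taking the commutator of L(M) with L(-M), the hatted algebra contributes four times its central term, and rewriting the hatted zero mode in terms of L(0) contributes an extra (13/2) M. The function names are illustrative only.

```python
from fractions import Fraction as F


def central_52(x):
    """Central term of the order-two orbifold Virasoro algebra at argument x."""
    return F(52, 12) * x * (x * x - 1)


def central_26(M):
    """Central term of an ordinary Virasoro algebra at c = 26."""
    return F(26, 12) * M * (M * M - 1)


def mapped_central(M):
    """Central term of [L(M), L(-M)] induced by L(M) = 2*Lhat(M/2) - (13/4)*delta."""
    x = F(M, 2)
    # 4 * (hatted central term) plus the shift from 2*M*Lhat_0(0) = M*(L(0) + 13/4)
    return 4 * central_52(x) + F(13, 2) * M


def mapped_intercept():
    """Image of the hatted intercept 17/8 under the same relabeling."""
    return 2 * F(17, 8) - F(13, 4)


if __name__ == "__main__":
    print(all(mapped_central(M) == central_26(M) for M in range(-20, 21)))
    print(mapped_intercept())
```

Under these assumptions the induced central term agrees with the c = 26 form for every integer M tested, and the intercept maps to the conventional value 1.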
I emphasize that the physical states $\{|\chi\rangle\}$ of the 26-dimensional description (4.6) are exactly
the original physical states (1.1a) of the c = 52 string. Indeed, each physical state $|\chi\rangle$ can be
regarded as invariant under the map, or each can now be rewritten in 26-dimensional form. In
further detail, Eqs. (4.5) and (4.6) give the same spectral condition $P^2 = P_{(0)}^2 + R(\sigma)$ on physical states, the same
physical ground state$^2$
$$|0, J(0)\rangle_\sigma = |0, \hat J(0)\rangle_\sigma,\qquad P_{(0)}^2 = \hat P_{(0)}^2 = -2 + 2\hat\delta_0(\sigma) \tag{4.7}$$
and each negatively-moded hatted current in any physical state can be replaced according to
Eq. (4.1a) by the corresponding unhatted current mode. Note finally that the commutator (4.2d)
and the decomposition (4.5a) give the 26-dimensional increment
$$\Delta P^2 = \Delta R(\sigma) = 2\left|M+\frac{2n(r)}{\rho(\sigma)}\right| \tag{4.8}$$
which results from the addition of $J_{n(r)\mu}\!\left(\left(M+\frac{2n(r)}{\rho(\sigma)}\right)<0\right)$ to any previous state. With $M = 2m+u$, these are recognized as the same increments (3.9) obtained in the c = 52 description.
As simple examples, consider the larger subset (3.16) of c = 52 stringswhose equivalent
c = 26 physical state condition (4.6) now involves the following subset of the c = 26 Virasoro
generators (4.2):
2 Although it is not directly relevant in either description of the c = 52 strings, one notes that the conformal weight of
the scalar twist-field state |0 of sector has now shifted from 0 ( ) to 0 ( ) in the c = 26 description.
308

1
L(M) = M,0 0 ( ) + Gab
:Ja (Q + )J,b (M Q ):M
2 (d)
QZ

1 1
2j
2j
:Jjj Q +
Jj,j M Q
:M ,
+
2
fj ( )
fj ( )
fj ( )
j
j=0 QZ
(4.9a)
fj ( )1

1
2j
2j
2j
1 2
>1
,
0 ( ) =
(4.9b)
4
fj ( )
fj ( )
fj ( )
j
j=0

fj ( ) = 26 d, 4 d 26.
a, b = 0, . . . , d 1,
(4.9c)
fj ( )1
Recall for the larger subset that = 0, 1 corresponds in the symmetric theory to the action of
the extra automorphism = 1 on the first d 4 labels {a}, while fj ( ) is the length of the
j th cycle of the extra permutation (perm) which acts on the remaining 26 d spatial labels.
Shifting the dummy integer Q by the integer , we note that the second term in Eq. (4.9a) is a set
of ordinary Virasoro generators for d untwisted bosons with the ordinary current algebra

Ja (Q), Jb (P ) = G(d)
(4.10)
ab QQ+P ,0
for both values of . The currents in the third term satisfy the twisted current algebra (4.2e) with
the permutation-twisted metric (3.17), and the value of 0 ( ) in Eq. (4.9b) is only a slightly-rewritten form of that given in Eq. (3.16b).
We are now in a position to confirm our suspicions in the previous section about the simplest
orbifold-strings, described earlier at c = 52 by the extended Virasoro generators:

1
u
v+
uv
L u m +
:Jav p +
= Gab
J,b,uv m p +
:M
2
4
2
2
v
13
(4.11)
u , u = 0, 1, = 0, 1.
8 m+ 2 ,0
These are now equivalently described by the choice d = 26 in Eq. (4.9), in which case only
the second (ordinary) term of Eq. (4.9a) is non-zero, and then the equivalent physical-state
condition (4.6) verifies that the physical spectrum of each of these particular twisted c = 52
strings is indeed equivalent to that of an ordinary untwisted c = 26 string! These cases include the
open-string orientation-orbifold sectors corresponding to ( = 1) in Eq. (3.10) and their
T-duals, as well as the twisted closed-string sectors of the generalized Z2 -permutation orbifolds
corresponding to + ( = 1).
Additionally, consider the following special cases of the extended Virasoro generators (3.16)
at c = 52

u
Lu m +
2

13 1 ab
v+
uv
u
:Jav p +
J,b,uv m p +
:M
= m+ 2 ,0 + G(24)
8
4
2
2
v p
+

1
1
j + v
u v j
+
:Jjv p +
Jj,uv m p +
:M
8 v
2
2
p
j=0
(4.12)
which result when the extra automorphism in the symmetric theory acts as = 1 on the first
d = 24 labels and the non-trivial element of a Z2 on the remaining 2 spatial labels. I have noted
in Section 3 that 0 ( ) = 0 for these cases as well, and indeed the equivalent c = 26 description
(4.6) and (4.9) at d = 24 now shows that the open and closed orbifold-strings of this type also
have the spectrum of ordinary untwisted c = 26 strings. The common thread for the orbifold-strings in Eqs. (4.10) and (4.12) is that they are at most half-integer moded, so that the shift
{n/} {2n/} gives integer moding in the c = 26 description.
Beyond these simple cases, the c = 52 strings are apparently new, with 0 ( ) ≠ 0, unfamiliar
ground-state mass-squares, and fractional moding (and increments) in either description.
5. Conclusions
We have discussed the physical spectrum of the general c = 52 orbifold-string, as well as an
equivalent but unconventionally-twisted c = 26 description of the twisted c = 52 matter. The
equivalent c = 26 description holds only for the orbifold-string theories, restricted by the extended physical-state conditions (1.1), and not in the larger Hilbert space of the underlying
orbifold conformal field theories.
In general we have found that the spectra of these orbifold-string systems are unfamiliar. One
simple and unexpected conclusion however is that, as string theories restricted by the extended
physical-state conditions, the single twisted c = 52 sector of each of the simplest orbifolds of
permutation-type (see Eq. (2.9))
(5.1a)
(1; ),
(1; 2 ),
22 = 1
(5.1b)
have the same physical spectra as ordinary untwisted c = 26 strings. No such equivalence is
found of course in the half-integer moded Hilbert space of the full orbifold CFTs. The list in
Eq. (5.1) includes the simplest orientation orbifolds (with ) and their T-duals, as well as the
simplest generalized Z2 -permutation orbifolds (with + ).
For the simplest orientation orbifolds in particular, the string theories in Eq. (5.1) consist of
an ordinary unoriented closed string (the unit element) at c = 26 and a c = 52 twisted open string
whose physical spectrum is equivalent to that of an ordinary untwisted c = 26 critical open string.
Since both the closed- and open-string spectra of these simple orientation orbifolds are equivalent
to those of the archetypal orientifold (without Chan-Paton factors), we are led to suspect that
orientation orbifolds include orientifolds. I will return in the next paper of this series to consider
this question at the interacting level, where we will also be able to ask about the decoupling
of null physical states. Following that, I will consider in a succeeding paper the corresponding
situation and modular invariance for the simplest permutation orbifold-string systems.
More generally, we have seen that there are many other orientation orbifolds, open-string
Z2 -permutation orbifolds and generalized Z2 -permutation orbifolds whose c = 52 spectra show
fractional moding in both the c = 52 and the c = 26 descriptions. These include in particular the
orbifolds in Eq. (2.9) when the order n of the extra automorphism is greater than two.
There is more to say about no-ghost theorems for the general twisted c = 52 string. The
original intuition [16] was that the doubled gauges u = 0, 1 of the extended physical state condition (1.1) could remove the doubled set of negative-norm states (time-like modes) of the c = 52
strings, which are also associated with u = 0, 1. For the simplest c = 52 strings in Eq. (5.1), this
intuition is certainly borne out [20,21]. More generally, the equivalent c = 26 description of each
spectrum shows that both aspects of the doubling are indeed eliminated at the same time, leaving
us with the conventional physical state condition (4.6) and only a single set of time-like modes.
This is clearly visible in the set of examples (4.9), where the only time-like modes (a = 0) are
included in the second term. For the general c = 52 string, the reader should bear in mind that
the twisted metric G in Eq. (4.2) is only a unitary transformation (2.5) of the untwisted metric G
with a single time-like direction. Although not yet a proof, and illustrated here only for c = 52,
I consider this a stronger form of the original arguments [16] that all the critical orbifolds of
permutation-type should be free of negative-norm states.
The next question I wish to address is the following: I have emphasized that the equivalent
c = 26 Virasoro generators (4.2) are generically-unconventional, being summed over the matter-field fractions {2n/} instead of the conventional orbifold fractions {n/}, but are they actually
new Virasoro generators? I do not know the answer to this question in general, but at least some
of them can in fact be re-expressed by further mode-relabeling in terms of more familiar Virasoro
generators. As examples, consider the special case of the larger subset (4.9) when (perm) is
one of the elements of order λ of each cyclic group Z_λ. (These are the particular, single-cycle
elements of Z_λ with f_0(σ) = λ.) When λ is odd, one finds that the first and third terms of (4.9) can
in fact be re-expressed in terms of the conventional Virasoro generators associated to a twisted
sector of an ordinary cyclic permutation orbifold U(1)^λ/Z_λ [1]

1
1
j
j
L (M) =
:Jj Q +
Jj M Q
:M
2
QZ
j=0

1
1
+ M,0
(5.2)
, c = = 2l + 1,
24
where I have relabeled the currents Jj Jj0 . To obtain this result from (4.9), one needs the fact
that {2j/λ} ≃ {j/λ} modulo the integers when λ is odd. This observation is consistent with the
ground-state mass-squares for prime λ in Eq. (3.19). When λ is even, I have also checked that the
first and third terms of (4.9) can be re-expressed as the sum of two identical commuting Virasoro
generators of this type
L (M) = L (M) + L (M),
2
c = = 2l,
(5.3)
each of which is associated to a twisted sector of U(1)^{λ/2}/Z_{λ/2}. This result is also obtained by
relabeling the modes modulo the integers, and provides us with another way to understand that
the ground-state mass-squared is unshifted when (perm) is the non-trivial element of a Z2 .
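The mode-relabeling statements above can be checked numerically. The following Python sketch is illustrative only and not part of the original analysis: it assumes that the order of the cyclic group is denoted λ (written `lam` in the code) and compares the fractional parts of 2j/λ and j/λ for j = 0, ..., λ − 1. The helper names are hypothetical.

```python
from fractions import Fraction

def mode_fractions(lam, factor=1):
    """Sorted fractional parts of factor*j/lam for j = 0..lam-1 (exact arithmetic)."""
    return sorted(Fraction(factor * j, lam) % 1 for j in range(lam))

def doubling_is_relabeling(lam):
    """True when {2j/lam} and {j/lam} agree modulo the integers as multisets."""
    return mode_fractions(lam, 2) == mode_fractions(lam, 1)

def doubling_gives_two_copies(lam):
    """For even lam: True when {2j/lam} is two copies of {j/(lam/2)} modulo the integers."""
    half = mode_fractions(lam // 2, 1)
    return mode_fractions(lam, 2) == sorted(half + half)

if __name__ == "__main__":
    for lam in range(1, 9):
        if lam % 2 == 1:
            print(lam, "odd: relabeling", doubling_is_relabeling(lam))
        else:
            print(lam, "even: two copies", doubling_gives_two_copies(lam))
```

For odd λ the two multisets coincide, so the doubling only permutes the mode fractions. For even λ each fraction of the order-λ/2 set appears twice, which is consistent with the two identical commuting generators in Eq. (5.3) and, at λ = 2, with purely integer moding.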
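My final remark is a conjecture, that the extended physical-state conditions for the twisted
strings at $\hat c = 26\lambda$, $\lambda$ prime, will in fact read

$$\Big(\hat L_j\big((m+\tfrac{j}{\lambda}) \ge 0\big) - \hat a_\lambda\,\delta_{m+\frac{j}{\lambda},0}\Big)|\chi\rangle = 0, \qquad j = 0, 1, \ldots, \lambda-1,$$ (5.4a)

$$\hat a_\lambda \equiv \frac{13\lambda^2-1}{12\lambda},$$ (5.4b)

$$\Big[\hat L_j\big(m+\tfrac{j}{\lambda}\big), \hat L_l\big(n+\tfrac{l}{\lambda}\big)\Big] = \Big(m-n+\tfrac{j-l}{\lambda}\Big)\hat L_{j+l}\big(m+n+\tfrac{j+l}{\lambda}\big) + \frac{26\lambda}{12}\Big(m+\tfrac{j}{\lambda}\Big)\Big(\big(m+\tfrac{j}{\lambda}\big)^2-1\Big)\delta_{m+n+\frac{j+l}{\lambda},0}$$ (5.4c)

where Eq. (5.4c) is an orbifold Virasoro algebra [1,9,18] of order $\lambda$. This form includes the correct
generators $\{\hat L_j\}$ corresponding to the classical extended Polyakov constraints of Ref. [16], and
includes the correct value $\hat a_2 = 17/8$ studied here for the c = 52 strings. I obtained the system
(5.4) by requiring (as we now know for $\lambda = 2$) that it map by the inverse of the order-$\lambda$ orbifold-induction procedure [1] to the conventional physical-state condition (4.6) with $\hat a_1 = 1$ at c = 26.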
One way to test this conjecture would be the construction of the corresponding twisted BRST
systems [17] for these higher values of c.
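As a quick arithmetic check of the conjectured intercept, the sketch below assumes that Eq. (5.4b) reads a_λ = (13λ² − 1)/(12λ), a form that is consistent with both values quoted in the text, and evaluates it exactly. The function name is illustrative and not taken from the paper.

```python
from fractions import Fraction

def intercept(lam: int) -> Fraction:
    """Conjectured intercept (13*lam**2 - 1) / (12*lam), assumed from Eq. (5.4b)."""
    if lam < 1:
        raise ValueError("lam must be a positive integer")
    return Fraction(13 * lam**2 - 1, 12 * lam)

if __name__ == "__main__":
    # lam = 1 is the ordinary c = 26 string, lam = 2 the c = 52 strings
    for lam in (1, 2, 3, 5):
        print(lam, intercept(lam))
```

The values at λ = 1 and λ = 2 reproduce the intercepts 1 and 17/8 discussed above. Values at higher λ follow only from the assumed formula and remain as conjectural as Eq. (5.4) itself.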
Extensions to include winding number and twisted B fields at c = 52 are also deferred to
another time and place.
Acknowledgements
For helpful information, discussions and encouragement, I thank L. Alvarez-Gaumé, K. Bardakci, I. Brunner, J. de Boer, D. Fairlie, O. Ganor, E. Gimon, C. Helfgott, E. Kiritsis, R. Littlejohn, S. Mandelstam, J. McGreevy, N. Obers, A. Petkou, E. Rabinovici, V. Schomerus,
K. Schoutens, C. Schweigert and E. Witten. This work was supported in part by the Director,
Office of Energy Research, Office of High Energy and Nuclear Physics, Division of High Energy
Physics of the US Department of Energy under Contract DE-AC02-05CH11231 and in part by
the National Science Foundation under grant PHY00-98840.
References
[1] L. Borisov, M.B. Halpern, C. Schweigert, Systematic approach to cyclic orbifolds, Int. J. Mod. Phys. A 13 (1998)
125, hep-th/9701061.
[2] J. Evslin, M.B. Halpern, J.E. Wang, General Virasoro construction on orbifold affine algebra, Int. J. Mod. Phys.
A 14 (1999) 4985, hep-th/9904105.
[3] J. de Boer, J. Evslin, M.B. Halpern, J.E. Wang, New duality transformations in orbifold theory, Int. J. Mod. Phys.
A 15 (2000) 1297, hep-th/9908187.
[4] J. Evslin, M.B. Halpern, J.E. Wang, Cyclic coset orbifolds, Int. J. Mod. Phys. A 15 (2000) 3829, hep-th/9912084.
[5] M.B. Halpern, J.E. Wang, More about all current-algebraic orbifolds, Int. J. Mod. Phys. A 16 (2001) 97,
hep-th/0005187.
[6] J. de Boer, M.B. Halpern, N.A. Obers, The operator algebra and twisted KZ equations of WZW orbifolds,
JHEP 0110 (2001) 011, hep-th/0105305.
[7] M.B. Halpern, N.A. Obers, Two large examples in orbifold theory: Abelian orbifolds and the charge conjugation
orbifold on su(n), Int. J. Mod. Phys. A 17 (2002) 3897, hep-th/0203056.
[8] M.B. Halpern, F. Wagner, The general coset orbifold action, Int. J. Mod. Phys. A 18 (2003) 19, hep-th/0205143.
[9] M.B. Halpern, C. Helfgott, Extended operator algebra and reducibility in the WZW permutation orbifolds, Int. J.
Mod. Phys. A 18 (2003) 1773, hep-th/0208087.
[10] O. Ganor, M.B. Halpern, C. Helfgott, N.A. Obers, The outer-automorphic WZW orbifolds on so(2n), including five
triality orbifolds on so(8), JHEP 0212 (2002) 019, hep-th/0211003.
[11] J. de Boer, M.B. Halpern, C. Helfgott, Twisted Einstein tensors and orbifold geometry, Int. J. Mod. Phys. A 18
(2003) 3489, hep-th/0212275.
[12] M.B. Halpern, C. Helfgott, Twisted open strings from closed strings: The WZW orientation orbifolds, Int. J. Mod.
Phys. A 19 (2004) 2233, hep-th/0306014.
[13] M.B. Halpern, C. Helfgott, On the target-space geometry of the open-string orientation-orbifold sectors, Ann.
Phys. 310 (2004) 302, hep-th/0309101.
[14] M.B. Halpern, C. Helfgott, A basic class of twisted open WZW strings, Int. J. Mod. Phys. A 19 (2004) 3481,
hep-th/0402108.
[15] M.B. Halpern, C. Helfgott, The general twisted open WZW string, Int. J. Mod. Phys. A 20 (2005) 923,
hep-th/0406003.
[16] M.B. Halpern, The orbifolds of permutation-type as physical string systems at multiples of c = 26: I. Extended
actions and new twisted world-sheet gravities, JHEP 0706 (2007) 068, hep-th/0703044.
[17] M.B. Halpern, The orbifolds of permutation-type as physical string systems at multiples of c = 26: II. The twisted
BRST systems of c = 52 matter, hep-th/0703208, Int. J. Mod. Phys. A, in press.
[18] R. Dijkgraaf, E. Verlinde, H. Verlinde, Matrix string theory, Nucl. Phys. B 500 (1997) 43, hep-th/9703030.
[19] A. Klemm, M.G. Schmidt, Orbifolds by cyclic permutations of tensor-product conformal field theories, Phys. Lett.
B 245 (1990) 53.
[20] P. Goddard, C.B. Thorn, Compatibility of the dual pomeron with unitarity and the absence of ghosts in the dual
resonance model, Phys. Lett. B 40 (1972) 378.
[21] S. Mandelstam, Dual resonance models, Phys. Rep. 13 (1974) 259.

