Nucl.Phys.B v.786

Johan Engquist, Olaf Hohm

Institute for Theoretical Physics and Spinoza Institute, Utrecht University, 3508 TD Utrecht, The Netherlands

Received 4 June 2007; received in revised form 20 June 2007; accepted 21 June 2007

Available online 28 June 2007

Abstract

We construct consistent bosonic higher-spin gauge theories in odd dimensions $D > 3$ based on Chern–Simons forms. The gauge groups are infinite-dimensional higher-spin extensions of the anti-de Sitter groups $SO(D-1, 2)$. We propose an invariant tensor on these algebras, which is required for the definition of the Chern–Simons action. The latter contains the purely gravitational Chern–Simons theories constructed by Chamseddine, and so the entire theory describes a consistent coupling of higher-spin fields to a particular form of Lovelock gravity. It contains topological as well as non-topological phases. Focusing on $D = 5$ we consider as an example for the latter an $\mathrm{AdS}_4 \times S^1$ Kaluza–Klein background. By solving the higher-spin torsion constraints in the case of a spin-3 field, we verify explicitly that the equations of motion reduce in the linearization to the compensator form of the Frønsdal equations on $\mathrm{AdS}_4$.

© 2007 Elsevier B.V. All rights reserved.

1. Introduction

The construction of theories describing consistently interacting higher-spin fields is of great interest for several reasons. For one thing, string theory contains an infinite tower of massive higher-spin states, and it is an old idea that these hint at a spontaneously broken phase of a theory with a huge hidden gauge symmetry, thus extending the geometrical framework of Einstein's theory [1–6]. However, the actual formulation of higher-spin theories is usually precluded by the interaction problem. The latter refers to the apparent impossibility of introducing interactions into a free higher-spin (HS) theory in such a way that the number of dynamical degrees of freedom is unaltered [7,8]. For instance, naively coupling free massless HS fields to gravity violates the HS gauge symmetry and thus renders the theory inconsistent [9].

* Corresponding author.

0550-3213/$ – see front matter © 2007 Elsevier B.V. All rights reserved.

doi:10.1016/j.nuclphysb.2007.06.015

In a series of papers Vasiliev has, however, begun to find a route avoiding these no-go theorems, i.e., to consistently couple HS fields to gravity, by relaxing the following assumptions. First, the theory is assumed to have a non-vanishing negative cosmological constant (leading to anti-de Sitter (AdS) instead of Minkowski space as the ground state) and to depend on this cosmological constant in a non-polynomial way. The latter excludes a flat space limit, in accordance with standard S-matrix arguments [10]. Second, it will necessarily contain an infinite number of massless fields carrying arbitrarily high spin, whose couplings can be of arbitrary power in the derivatives (see [11] for a review and references therein).

The formulation of the associated HS theory is based on a gauging of an infinite-dimensional

HS algebra, in the same way that gravity and supergravity theories can be viewed as resulting

from a gauging of a (super-)AdS algebra. However, theories which are constructed along these

lines (as, e.g., in the approach of MacDowell–Mansouri [12]) are not true gauge theories in

that the gauge symmetry is not manifest and, moreover, (super-)torsion constraints have to be

imposed by hand. For instance, in supergravity invariance under local supersymmetry is by no

means manifest and has to be checked explicitly. In addition, so-called extra fields appear in

HS theories, which are unphysical and have to be expressed in terms of the physical fields by

imposing further constraints. In total, the program of Vasiliev consists of finding a non-linear HS

theory [11], which

(i) is still invariant under (a deformation of) the HS symmetry, and

(ii) yields in the linearization the required free field equations.

Of course, both requirements are related, since once it is proven that the free field equations are of 2nd order, the HS symmetry, i.e., (i), fixes the field equations uniquely to the so-called Frønsdal form. In the approach of [13,14] this requirement is implemented through the condition that the extra fields decouple in the free limit (for reasons we will explain below). However, these conditions have no natural interpretation from the point of view of the HS gauge symmetry. In turn, the consistency of the resulting HS action can only be checked up to some order, as has been done in $D = 4$ and $D = 5$ for cubic couplings [13,14]. There are even reasons to expect that this consistency will not extend to all orders [14]. In fact, to date a fully consistent action describing interactions of propagating HS fields is not known.

An approach which is instead followed in order to describe consistent HS interactions at the level of the equations of motion is given by the so-called unfolded formulation [15–17]. The latter is a surprisingly concise way to keep the HS invariance manifest. However, in this approach there is not only an infinite number of physical HS fields, but each of these fields has an infinite number of auxiliary fields, which, roughly speaking, parametrize all spacetime derivatives of the physical field. This in turn complicates the analysis of the physical content, and it would clearly be desirable to have a conventional action principle that extends the Einstein–Hilbert action in the same way as supergravity does for spin-3/2 fields.

Concerning the problem of finding a consistent HS action, it should be noted that one example does exist: the Chern–Simons action in $D = 3$ constructed by Blencowe based on a HS algebra [18]. (See also [19,20] and [21,22] in a related context.) As the Chern–Simons theory is a true gauge theory, the resulting HS theory is consistent by construction and naturally extends the Einstein–Hilbert action (which in $D = 3$ also has an interpretation as a Chern–Simons action [23]). It is, however, only of limited use since it is topological and does not give rise to propagating degrees of freedom. On the other hand, gauge-invariant Chern–Simons actions exist in all odd dimensions, and even though they are topological in any dimension in the sense that they do not depend on a metric, they are not devoid of local dynamics in $D > 3$. In fact, it has been shown by Chamseddine [24,25] that the Chern–Simons actions based on the $\mathrm{AdS}_D$ algebras $so(D-1, 2)$ are equivalent to a particular type of Lovelock gravity with propagating torsion and are thus far from dynamically trivial. So one might wonder what happens if one defines a Chern–Simons action based on a HS extension of $so(D-1, 2)$. This paper is devoted to the analysis of this question.

The organization of the paper is as follows. In Section 2 we briefly review the known free HS theories on Minkowski and AdS, and we introduce the HS Lie algebras which will later on serve as gauge algebras. The general construction of Chern–Simons actions in odd dimensions will be reviewed in Section 3.1, together with the realization of Lovelock gravity as a Chern–Simons gauge theory. In Section 3.2 we construct an invariant tensor of the HS algebra, which in turn allows a consistent extension of Chern–Simons gravity to include an infinite tower of HS fields. The constructed theory is then linearized around the non-topological Kaluza–Klein background $\mathrm{AdS}_4 \times S^1$ in Section 4. Focusing on the spin-3 mode, we show that the equations of motion reduce to the correct free equations on $\mathrm{AdS}_4$. We conclude in Section 5, while technical details concerning Young tableaux, the symmetric invariant and the spin-3 Riemann tensor are relegated to Appendices A–C.

2. Higher-spin theories and their gauge algebras

In this section we first review free HS theories on Minkowski and AdS backgrounds, and

then introduce the infinite-dimensional HS Lie algebras, which are the starting point for the

construction of interacting HS theories. The results hold in general odd dimensions, though for concreteness we will often specialize to $D = 5$.

2.1. Free higher-spin actions

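Bosonic fields of arbitrary spin $s$ are described by symmetric rank-$s$ tensors $h_{\mu_1 \ldots \mu_s}$. In the massless case they are subject to the gauge symmetry

$$\delta_\epsilon h_{\mu_1 \ldots \mu_s} = \partial_{(\mu_1} \epsilon_{\mu_2 \ldots \mu_s)} . \tag{2.1}$$

A free action for massless fields of spin $s$ on Minkowski and (anti-)de Sitter backgrounds has been given by Frønsdal [28,29]. For a spin-3 field $h_{\mu\nu\rho}$, which is the case we will later on examine in more detail, it is of the form

$$S = \frac{1}{2} \int d^D x \, \Big( \nabla_\mu h_{\nu\rho\sigma} \nabla^\mu h^{\nu\rho\sigma} - 3\, \nabla_\mu h^{\mu\rho\sigma} \nabla^\nu h_{\nu\rho\sigma} + 3 \cdot 2\, \nabla_\mu h^{\mu\rho\sigma} \nabla_\rho h_\sigma - 3\, \nabla_\mu h_\nu \nabla^\mu h^\nu - \frac{3}{2}\, \nabla_\mu h^\mu \nabla_\nu h^\nu + \mathcal{L}_m \Big) . \tag{2.2}$$

Here $\nabla_\mu$ denotes the AdS-covariant derivative or a partial derivative in case of a Minkowski background. In the flat case the additional term $\mathcal{L}_m$ vanishes, while on AdS the HS gauge symmetry requires a mass-like term proportional to the cosmological constant. The action then leads to the equations of motion

$$\mathcal{F}^{\rm AdS}_{\mu\nu\rho} \equiv \Box h_{\mu\nu\rho} - 3\, \nabla_{(\mu} \nabla^\sigma h_{\nu\rho)\sigma} + 3\, \nabla_{(\mu} \nabla_\nu h_{\rho)} - \frac{1}{L^2} \Big[ (D-3)\, h_{\mu\nu\rho} + 2 \cdot 3\, g_{(\mu\nu} h_{\rho)} \Big] = 0 , \tag{2.3}$$

which defines the so-called Frønsdal operator $\mathcal{F}$. Here $h_\mu$ denotes the trace of $h_{\mu\nu\rho}$ in the AdS metric and $L$ is the AdS radius, related to the cosmological constant by $L = 1/\sqrt{\Lambda}$. Let us finally note that the given action or field equations are invariant under the gauge variations only if the transformation parameter $\epsilon$ is traceless, and that for spin $s > 3$ a double-tracelessness condition has to be imposed on the fields in order to give rise to the correct number of spin-$s$ degrees of freedom [30].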

The difficulty in promoting these HS theories to interacting theories via coupling to gravity or electrodynamics is due to the fact that the presence of generic covariant derivatives in (2.2) violates the HS gauge symmetry. This in turn implies that the unphysical degrees of freedom are no longer eliminated and the theory becomes inconsistent. Despite these negative results, Vasiliev has pioneered an approach towards a consistent coupling of HS fields to gravity, which is based on the introduction of an infinite-dimensional HS algebra [31]. The latter requires a frame-like formulation of HS fields, which mimics the vielbein formulation of general relativity rather than the metric-like formulation used in (2.2) [32]. More specifically, a spin-3 field, for instance, is described by $e_\mu{}^{ab}$, being symmetric in the frame indices $a, b$, together with an analogue of the spin connection $\omega_\mu{}^{ab,c}$. A closer inspection has, however, revealed that consistency of the HS algebra requires further fields, which are the so-called extra fields. These issues will be dealt with in later sections, but for the moment we just note that the resulting algebra will be derived from the enveloping algebra of the $\mathrm{AdS}_D$ algebra $so(D-1, 2)$, to which we turn in the next section.

2.2. A higher-spin extension of so(D − 1, 2)

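The starting point for the construction of an infinite-dimensional HS algebra is the AdS symmetry group $SO(D-1, 2)$. The Lie algebra of the latter is spanned by the anti-Hermitian generators $M_{AB} = -M_{BA}$, $A = 0, 1, \ldots, D-1, D+1$, obeying the commutation relations

$$[M_{AB}, M_{CD}] = \eta_{BC} M_{AD} - \eta_{AC} M_{BD} - \eta_{BD} M_{AC} + \eta_{AD} M_{BC} \equiv f_{AB,CD}{}^{EF} M_{EF} , \tag{2.4}$$

where

$$\eta_{AB} = \mathrm{diag}(-1, 1, 1, 1, 1, -1), \qquad f_{AB,CD}{}^{EF} = 4\, \delta^{[E}_{[A}\, \eta_{B][C}\, \delta^{F]}_{D]} . \tag{2.5}$$

In a Lorentz-covariant basis $(M_{ab}, P_a)$ these relations read

$$[M_{ab}, M_{cd}] = \eta_{bc} M_{ad} - \eta_{ac} M_{bd} - \eta_{bd} M_{ac} + \eta_{ad} M_{bc}, \qquad [M_{ab}, P_c] = -2\, \eta_{c[a} P_{b]}, \qquad [P_a, P_b] = M_{ab} . \tag{2.6}$$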
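The relations (2.4) and the last relation in (2.6) can be checked numerically in the vector representation. The following is a minimal sketch, not taken from the paper: it assumes $D = 5$, the metric $\eta = \mathrm{diag}(-1,1,1,1,1,-1)$ with the six index values relabelled $0, \ldots, 5$, the representation matrices $(M_{AB})^C{}_D = \delta^C_A \eta_{BD} - \delta^C_B \eta_{AD}$, and the identification $P_a = M_{a5}$. All function names are illustrative.

```python
from itertools import product

# Assumed metric for D = 5: eta = diag(-1, 1, 1, 1, 1, -1), indices relabelled 0..5.
ETA = [-1, 1, 1, 1, 1, -1]
N = len(ETA)


def eta(a, b):
    """Diagonal metric eta_AB."""
    return ETA[a] if a == b else 0


def gen(a, b):
    """Assumed vector representation: (M_AB)^C_D = delta^C_A eta_BD - delta^C_B eta_AD."""
    return [[(c == a) * eta(b, d) - (c == b) * eta(a, d) for d in range(N)]
            for c in range(N)]


def matmul(x, y):
    """Product of two N x N matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def comb(*terms):
    """Linear combination of (coefficient, matrix) pairs."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(N)]
            for i in range(N)]


def commutator(x, y):
    return comb((1, matmul(x, y)), (-1, matmul(y, x)))


def check_algebra():
    """Return True if the commutation relations (2.4) hold for every index choice."""
    for a, b, c, d in product(range(N), repeat=4):
        lhs = commutator(gen(a, b), gen(c, d))
        rhs = comb((eta(b, c), gen(a, d)), (-eta(a, c), gen(b, d)),
                   (-eta(b, d), gen(a, c)), (eta(a, d), gen(b, c)))
        if lhs != rhs:
            return False
    return True
```

With these assumptions, `check_algebra()` tests all index combinations of (2.4), and `commutator(gen(0, 5), gen(1, 5))` can be compared with `gen(0, 1)` to test $[P_a, P_b] = M_{ab}$.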

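The HS algebra is constructed from a set of bosonic vector oscillators $y_A^i$, where $i = 1, 2$ is an $sp(2)$ doublet index, obeying an associative non-commutative star product

$$[y_A^i, y_B^j]_\star = 2\, \epsilon^{ij} \eta_{AB}, \qquad y_A^i \star y_B^j = y_A^i y_B^j + \epsilon^{ij} \eta_{AB}, \tag{2.7}$$

where $\epsilon^{ij} = -\epsilon^{ji}$ denotes the invariant $sp(2)$ tensor, and we have introduced the bracket $[U, V]_\star = U \star V - V \star U$. The star product of general functions $f(y)$ and $g(y)$ can be defined by the Moyal–Weyl formula

$$f(y) \star g(y) = \exp\Big( \epsilon^{ij} \eta_{AB} \frac{\partial}{\partial y_A^i} \frac{\partial}{\partial z_B^j} \Big) f(y)\, g(z) \Big|_{z=y} . \tag{2.8}$$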
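As a short worked check (added here for illustration, with the index placement $\epsilon^{ij}\eta_{AB}\,\partial/\partial y_A^i\,\partial/\partial z_B^j$ in the exponent taken as an assumption), applying the Moyal–Weyl formula to two linear functions terminates after the first-order term and reproduces (2.7):

```latex
\begin{align}
y_A^i \star y_B^j
  &= \Big( 1 + \epsilon^{kl} \eta_{CD}\,
       \frac{\partial}{\partial y_C^k} \frac{\partial}{\partial z_D^l} \Big)\,
     y_A^i\, z_B^j \,\Big|_{z=y}
   = y_A^i y_B^j + \epsilon^{ij} \eta_{AB} , \\
[y_A^i, y_B^j]_\star
  &= \epsilon^{ij} \eta_{AB} - \epsilon^{ji} \eta_{BA}
   = 2\, \epsilon^{ij} \eta_{AB} .
\end{align}
```

All higher terms of the exponential vanish because each function is differentiated at most once.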

Given the oscillators we can construct the generators of the commuting (Howe dual) algebras $so(D-1, 2)$ and $sp(2)$ as the bilinears

$$M_{AB} = \frac{1}{2}\, y_A^i\, y_{iB}, \qquad K_{ij} = \frac{1}{2}\, y_i^A\, y_{jA}, \tag{2.9}$$

from which it indeed follows that $[K_{ij}, M_{AB}]_\star = 0$. The construction of the HS Lie algebra $ho(D-1, 2)$ is based on the enveloping algebra of $so(D-1, 2)$. It is defined in terms of the oscillators as [33,34]

$$ho(D-1, 2) = \big\{ T(y) : \ T^\dagger = -T, \ [K_{ij}, T]_\star = 0 \big\}, \tag{2.10}$$

where $T(y)$ are arbitrary polynomials in the oscillators $y_A^i$, and the last condition singles out the $sp(2)$ singlets. This algebra is sometimes (perhaps misleadingly) referred to as the off-shell HS algebra since it is generated by trace-full generators. On the other hand, starting from this algebra one may construct the corresponding on-shell algebra, where the generators are made traceless by factoring out the ideal in $ho(D-1, 2)$ spanned by elements of the form $K_{ij} \star X^{ij}$ [11,33,34]. In the formulation of HS theories utilizing the approach of unfolded dynamics [15,35], where an action principle is not needed, the on-shell algebra has been mostly used. It is only recently [34] that the importance of the off-shell algebra has been emphasized. In contrast, in an ordinary action formulation of HS theories, such as the one in this paper, we believe that an algebra with trace-full generators is crucial. In the remainder of the paper we will avoid the term off-shell algebra.

The polynomial $T(y)$ appearing in (2.10) admits a level decomposition into monomials $T_\ell(y)$ (we associate the generators $T_\ell$ at level $\ell$ with spins $s = \ell + 2 = 2, 3, 4, \ldots$)

$$T(y) = \sum_{\ell=0}^{\infty} T_\ell(y), \qquad T_\ell(\lambda y) = \lambda^{2\ell+2}\, T_\ell(y). \tag{2.11}$$

The definition in terms of vector oscillators implies in particular that the algebra does not contain the full enveloping algebra spanned by polynomials in $M_{AB}$. Elements which vanish identically in the vector oscillator formulation belong to a certain ideal $\mathcal{I} \subset \mathcal{U}[so(D-1, 2)]$. For instance, the anti-symmetric part $M_{[AB} M_{C]D}$ vanishes due to the $sp(2)$ identity $\epsilon_{[ij} \epsilon_{k]l} = 0$. This in turn implies that the generators of the HS algebra (2.10) are in specific Young tableaux. In other words, the $T_\ell(y)$ have an expansion in terms of $GL(D+1)$ tensors [34]

$$T_{A(s-1),B(s-1)} \equiv P_{(s-1,s-1)} \big( M_{A_1 B_1} \cdots M_{A_{s-1} B_{s-1}} \big) \tag{2.12}$$

$$= \frac{1}{2^{s-1}}\, P_{(s-1,s-1)} \big( y^{i_1}_{A_1} y_{i_1 B_1} \cdots y^{i_{s-1}}_{A_{s-1}} y_{i_{s-1} B_{s-1}} \big), \tag{2.13}$$

where the $so(D-1, 2)$ generators appear at level 0.$^1$ We have introduced a notation in which $T_{A(n),B(n)} \equiv T_{A_1 \cdots A_n, B_1 \cdots B_n}$, each set of indices being totally symmetrized. $P_{(s-1,s-1)}$ is a Young projector which imposes the symmetry of the two-row $GL(D+1)$ Young tableau (see Appendix A)

[Young tableau: two rows of $s-1$ boxes each] (2.14)

1 There exists a further restriction of $ho(D-1, 2)$ to a minimal algebra containing only even spins $s = 2, 4, 6, \ldots$ [33,34].

Later on we will need the generators in a $GL(D)$ Lorentz basis. Splitting (2.14) accordingly for spin $s$, we find $s$ generators, which schematically are in the tableaux

[Young tableaux: a first row of $s-1$ boxes together with a second row of $t = 0, 1, \ldots, s-1$ boxes] (2.15)

More specifically, the generator $T_{A_1 \cdots A_{s-1}, B_1 \cdots B_{s-1}}$ splits into the series of generators $T_{a_1 \cdots a_{s-1}}$ and $T_{a_1 \cdots a_{s-1}, b_1 \cdots b_t}$ for $1 \le t \le s-1$. The gauge fields $e_\mu{}^{a_1 \cdots a_{s-1}}$ corresponding to the first generator will later be identified with the physical spin-$s$ field, while the fields for the remaining $s-1$ generators are referred to in the literature as the auxiliary ($t = 1$) and extra fields ($t > 1$). However, for us this distinction between auxiliary and extra fields will be redundant, and we will therefore henceforth refer to all fields with $t > 0$ as auxiliary.
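The splitting of the two-row $GL(D+1)$ tableau into $s$ tableaux of $GL(D)$ can be cross-checked by counting components. The sketch below is an illustration rather than part of the original construction: it uses the standard hook-content formula for the dimension of a $GL(n)$ tensor with given Young symmetry, and the function names `gl_dim` and `lorentz_split` are hypothetical.

```python
def gl_dim(shape, n):
    """Dimension of the GL(n) tensor with Young diagram `shape` (hook-content formula)."""
    shape = [r for r in shape if r > 0]
    num, den = 1, 1
    for i, row in enumerate(shape):
        for j in range(row):
            col_len = sum(1 for r in shape if r > j)      # number of boxes in column j
            hook = (row - j - 1) + (col_len - i - 1) + 1  # arm + leg + 1
            num *= n + j - i                              # n + content of the box
            den *= hook
    return num // den


def lorentz_split(s):
    """Tableaux (s-1, t), t = 0..s-1, into which the (s-1, s-1) tableau is assumed to split."""
    return [(s - 1, t) for t in range(s)]
```

For spin 3 and $D = 5$, `gl_dim((2, 2), 6)` can be compared with the sum of `gl_dim(shape, 5)` over `lorentz_split(3)`, i.e. over the generators $T_{ab}$, $T_{ab,c}$ and $T_{ab,cd}$. For spin 2 the same count reproduces the split of $M_{AB}$ into $P_a$ and $M_{ab}$.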

The complete set of commutation relations of the $ho(D-1, 2)$ algebra is not known in closed form. Luckily, for the linearized spin-$s$ analysis, to be treated in Section 4 for an expansion around a spin-2 solution, it is sufficient to specify the commutation relations between spin-2 and spin-$s$ generators, which are entirely fixed by the representation theory of the AdS subalgebra $so(D-1, 2)$,

$$[M_{AB}, T_{C(s-1),D(s-1)}]_\star = -4(s-1)\, P_{(s-1,s-1)} \big( \eta_{A C_{s-1}} T_{B C(s-2),D(s-1)} \big), \tag{2.16}$$

where anti-symmetrization in $AB$ is understood. Let us finally mention that when commuting a spin-$s$ generator with a spin-$s'$ generator we obtain a sequence of generators with spins

$$s + s' - 2, \quad s + s' - 4, \quad \ldots, \quad |s - s'| + 2. \tag{2.17}$$
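The spin content (2.17) is easy to tabulate. The helper below is a small illustrative sketch; the name `commutator_spins` is not from the paper.

```python
def commutator_spins(s1, s2):
    """Spins appearing in the commutator of a spin-s1 and a spin-s2 generator, following (2.17)."""
    return list(range(s1 + s2 - 2, abs(s1 - s2) + 1, -2))
```

For example, `commutator_spins(2, s)` contains only spin $s$, consistent with (2.16), while two spin-3 generators close into spins 4 and 2.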

3. Chern–Simons theories in odd dimensions

In this section we will introduce the formulation of Chern–Simons theories [24,25] in general odd dimensions (for a review see [42]). The theory is specified once we give the algebra and the relevant invariant tensor. We will specialize to AdS Lovelock gravity, with a focus on $D = 5$, though the results directly extend to all odd dimensions (for the explicit formulas see [25]). Although there is no non-trivial propagation around the vacuum solution $\mathrm{AdS}_5$, interestingly, the theory also admits a simple $\mathrm{AdS}_4$ solution [25] around which the graviton propagates. In Section 4 we will analyze the linearized HS dynamics around this solution. Finally, in this section we will propose an invariant tensor for the full HS algebra.

3.1. Lovelock gravity as so(D − 1, 2) gauge theory

In any odd dimension $D = 2n - 1$ a gauge-invariant Chern–Simons action can be defined, which is based on the invariant $2n$-form $\langle F^n \rangle$ constructed out of the field strength $F$, with $\langle \cdots \rangle$ denoting an invariant symmetric tensor of degree $n$. More specifically, this expression is a total derivative and thus gives rise to a dynamically non-trivial theory only on the boundary, i.e., in $D = 2n - 1$ dimensions,

$$S_{\rm CS} = \int_{M_{2n}} \langle F^n \rangle = n \int_{M_{2n-1}} \int_0^1 dt\, \big\langle A\, (t\, dA + t^2 A^2)^{n-1} \big\rangle , \tag{3.1}$$

where $M_{2n-1} = \partial M_{2n}$, and we left the wedge products implicit. The resulting Chern–Simons form in $D = 2n - 1$ is by construction gauge-invariant up to total derivatives. Explicitly, one has under arbitrary variations

$$\delta S_{\rm CS} = n \int_{M_{2n-1}} \langle \delta A\, F^{n-1} \rangle . \tag{3.2}$$
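The numerical coefficients of the Chern–Simons form follow from the $t$-integral in (3.1). The sketch below evaluates them under two stated assumptions: the overall factor in front of the $t$-integral is $n$, so that the leading term $\langle A\, (dA)^{n-1} \rangle$ has coefficient 1 as in (3.25), and the 2-forms $dA$ and $A^2$ commute inside the symmetric invariant tensor, so that the binomial expansion applies. The function name `cs_coefficients` is illustrative.

```python
from fractions import Fraction
from math import comb


def cs_coefficients(n):
    """Coefficients of <A (dA)^(n-1-j) (A^2)^j>, j = 0..n-1.

    Each term of the binomial expansion of (t dA + t^2 A^2)^(n-1) carries
    t^(n-1+j), whose integral over [0, 1] is 1/(n+j); the assumed overall
    factor n is included.
    """
    return [Fraction(n * comb(n - 1, j), n + j) for j in range(n)]
```

For $n = 3$ this gives the coefficients $1$, $3/2$ and $3/5$ of the five-dimensional action (3.25), and for $n = 2$ the familiar $1$ and $2/3$ of the three-dimensional Chern–Simons form.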

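For definiteness we focus on $D = 5$. The gauge field $A$ then takes values in the Lie algebra of the group $SO(4, 2)$. Specifically, we write in an $SO(4, 1)$ covariant manner $A = e^a P_a + \frac{1}{2} \omega^{ab} M_{ab}$ in the basis above and define the invariant tensor to be

$$\langle M_{AB}\, M_{CD}\, M_{EF} \rangle = \epsilon_{ABCDEF} . \tag{3.3}$$

Note that, as required, this tensor is symmetric in the sense that it stays invariant under exchange of $M_{AB}$ with $M_{CD}$, etc. The $SO(4, 1)$ covariant field strength tensors in $F = \frac{1}{2} \mathcal{R}^{ab} M_{ab} + T^a P_a$ read

$$\mathcal{R}_{\mu\nu}{}^{ab} = R_{\mu\nu}{}^{ab} + e_\mu{}^a e_\nu{}^b - e_\nu{}^a e_\mu{}^b, \qquad T_{\mu\nu}{}^a = \partial_\mu e_\nu{}^a - \partial_\nu e_\mu{}^a + \omega_\mu{}^{ab} e_{\nu b} - \omega_\nu{}^{ab} e_{\mu b}, \tag{3.4}$$

which contain the Riemann tensor

$$R_{\mu\nu}{}^{ab} = \partial_\mu \omega_\nu{}^{ab} - \partial_\nu \omega_\mu{}^{ab} + \omega_\mu{}^a{}_c\, \omega_\nu{}^{cb} - \omega_\nu{}^a{}_c\, \omega_\mu{}^{cb}, \tag{3.5}$$

and the torsion tensor. The resulting Chern–Simons action can be written as [25]

$$S = 3 \int_{M_5} \epsilon_{a_1 \ldots a_5} \Big( e^{a_1} R^{a_2 a_3} R^{a_4 a_5} + \frac{2}{3}\, e^{a_1} e^{a_2} e^{a_3} R^{a_4 a_5} + \frac{1}{5}\, e^{a_1} e^{a_2} e^{a_3} e^{a_4} e^{a_5} \Big) . \tag{3.6}$$

We see that the action is the Einstein–Hilbert action with a cosmological constant (which we have set to $\Lambda = 1$), extended by a $D = 5$ Lovelock term. To be more precise, it describes a theory with dynamical torsion. However, it is still consistent with the field equations to impose vanishing torsion in order to express the spin connection in terms of the vielbein. This in turn reduces the dynamical degrees of freedom to those of the metric, for which the Einstein equations read

$$R_{\mu\nu} - \frac{1}{2} R\, g_{\mu\nu} - 3\, g_{\mu\nu} = \frac{1}{32}\, (RR)_{\mu\nu} , \tag{3.7}$$

where $(RR)_{\mu\nu}$ stands for terms quadratic in the Riemann tensor.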

As it stands, (3.6) seems to be a purely conventional type of Lovelock gravity, which is usually assumed to propagate the same number of degrees of freedom as Einstein gravity (five in $D = 5$). However, in this case the topological origin (3.1) actually gives rise to a somewhat unconventional behavior: expanding (3.6) around the $\mathrm{AdS}_5$ solution one infers that the quadratic term vanishes identically. In other words, a propagator around $\mathrm{AdS}_5$ does not exist. This can be most easily understood from the general form of the equations of motion for the Chern–Simons action (3.1), which can be read off from (3.2),

$$g_{ABC}\, F^B F^C = 0, \tag{3.8}$$

where $g_{ABC}$ denotes the invariant tensor and $A, B, \ldots$ are the adjoint indices for a generic gauge group, which will later be specialized to $ho(4, 2)$. Since for an expansion around $\mathrm{AdS}_5$ the curvature tensor in (3.4) vanishes in the background, there are no linear terms in (3.8) and thus no quadratic terms in the action. However, this should not be interpreted in the sense that the theory is devoid of local dynamics altogether, as is sometimes assumed of topological actions like (3.1) in the literature. Indeed, the propagator around generic backgrounds does not vanish. Moreover, a careful Hamiltonian analysis of the dynamical content in [26,27] has shown that, apart from degenerate sectors (like the maximally symmetric $\mathrm{AdS}_5$ background), the theory consistently propagates a number of degrees of freedom depending on the dimension of the gauge group. In particular, the Lovelock-type gravity theory above has the expected five degrees of freedom.$^2$

Let us also stress that the degenerate sectors are only a measure-zero subspace within phase space [26,27], and that even around such degenerate backgrounds some degrees of freedom can propagate, albeit fewer. One example has been given already in [25]: it is effectively an $\mathrm{AdS}_4$ solution and reads ($\mu, \nu = 0, 1, 2, 3$)

$$e_\mu{}^a = \frac{\delta_\mu^a}{1 - \frac{1}{4} x^\nu x_\nu}, \qquad \omega_\mu{}^{ab} = \frac{\delta_\mu^a x^b - \delta_\mu^b x^a}{2 \big( 1 - \frac{1}{4} x^\nu x_\nu \big)}, \qquad e_4{}^4 = \mathrm{const}, \tag{3.9}$$

for which

$$R^{ab} + e^a e^b = 0, \qquad R^{a4} + e^a e^4 \neq 0. \tag{3.10}$$

By expanding around this solution, it has been shown that it propagates in particular a four-dimensional graviton [25].

3.2. Invariant tensor of the higher-spin algebra

In order to construct the Chern–Simons action (3.1) based on $ho(D-1, 2)$, which extends standard Chern–Simons gravity, we have to find a completely symmetric tensor of degree $(D+1)/2$, which is invariant under the adjoint action of the HS algebra $ho(D-1, 2)$, and which reduces to the standard invariant (3.3) for the AdS subalgebra $so(D-1, 2)$. Below we will propose a formula for the invariant tensor. However, while the vector oscillator formulation described in Section 2.2 was required in order to establish existence and consistency of the HS algebra, it turns out not to be sufficient for the definition of symmetric invariants of $ho(D-1, 2)$. Instead, we will introduce a new star product, known as the BCH (Baker–Campbell–Hausdorff) star product or the Gutt star product [37–39].

Let us first briefly comment on the reasons why the formulation in terms of vector oscillators is incapable of reproducing the symmetric tensor (3.3). This is simply due to the fact that the oscillators automatically eliminate the totally anti-symmetric part in the star product, $M_{[AB} \star M_{CD} \star M_{EF]} = 0$, since it involves an anti-symmetrization over more than two $sp(2)$ indices. On the other hand, this exclusion guaranteed the appearance of generators entirely in definite $(s-1, s-1)$ Young tableaux, or in other words, eliminated the ideals spanned by generators not in these Young tableaux. Here, in contrast, by requiring an invariant tensor generalizing (3.3), we are, roughly speaking, assigning a non-zero value to certain parts of the ideal $\mathcal{I}$. Put differently, instead of using the invariance of the ideal, $[ho(D-1, 2), \mathcal{I}]_\star \subset \mathcal{I}$, to set it to zero, we set it to constants, reducing in particular to (3.3).

2 To be more precise, this counting applies only in case of vanishing torsion. Otherwise, there are additional degrees of freedom [27].

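To start with, we have to define a non-commutative star product directly in terms of the $M_{AB}$ (here viewed as commuting coordinates), whose star commutator then yields the required $so(D-1, 2)$ algebra. This is the BCH star product, which is given by

$$F(M) \star G(M) = \exp\Big( M_{AB}\, \Lambda^{AB}(\partial_N, \partial_{N'}) \Big) F(N)\, G(N') \Big|_{N=M,\, N'=M} , \tag{3.11}$$

where $\partial_N$ is a short-hand notation for $\partial / \partial N_{AB}$ and where $\Lambda^{AB} = -\Lambda^{BA}$ is defined through the relation

$$\exp Q \star \exp Q' = \exp\big( Q + Q' + \Lambda^{AB}(Q, Q')\, M_{AB} \big), \tag{3.12}$$

with $Q = Q^{AB} M_{AB}$ and $Q' = Q'^{AB} M_{AB}$ for some anti-symmetric tensors $Q^{AB}$ and $Q'^{AB}$. It defines an associative product on the enveloping algebra [37]. By using the BCH formula

$$\exp Q \star \exp Q' = \exp\Big( Q + Q' + \frac{1}{2} [Q, Q']_\star + \frac{1}{12} \big( [Q, [Q, Q']_\star]_\star + [Q', [Q', Q]_\star]_\star \big) + \cdots \Big), \tag{3.13}$$

we find the first few terms in the expansion to be

$$\Lambda^{AB} = \frac{1}{2}\, f_{CD,EF}{}^{AB}\, Q^{CD} Q'^{EF} - \frac{1}{12}\, f_{CD,EF}{}^{GH} f_{GH,IJ}{}^{AB}\, Q^{CD} Q'^{EF} \big( Q^{IJ} - Q'^{IJ} \big) + \cdots$$

$$= 2\, (QQ')^{[AB]} - \frac{2}{3}\, [Q, Q']^{[A|C|} \big( Q_C{}^{B]} - Q'_C{}^{B]} \big) + \cdots , \tag{3.14}$$

where $(QQ')^{AB} = Q^{AC} Q'_C{}^{B}$, $[Q, Q']^{AB} = (QQ')^{AB} - (Q'Q)^{AB}$ and where $f_{AB,CD}{}^{EF}$ are the structure constants defined in (2.5). The first terms in the product (3.11) consequently become

$$F(M) \star G(M) = F(M)\, G(M) + 2\, M_{AB}\, \partial^{AC} F\, \partial_C{}^{B} G + \frac{2}{3}\, M_{AB} \Big( \partial^{AC} \partial^{BD} F\, \partial_{CD} G + \partial_D{}^{C} \partial_C{}^{B} F\, \partial^{AD} G \Big) + \frac{2}{3}\, M_{AB} \Big( \partial^{AC} \partial^{BD} G\, \partial_{CD} F + \partial^{AD} F\, \partial_D{}^{C} \partial_C{}^{B} G \Big) + \cdots . \tag{3.15}$$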
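The truncation of the BCH series used in (3.13) can be tested exactly on a simple nilpotent example. The sketch below is an illustration under stated assumptions and does not use the $so(D-1,2)$ generators themselves: it takes strictly upper-triangular $4 \times 4$ matrices, for which every product of four factors vanishes, so that the series terminates after the double commutators and the matrix exponential is a cubic polynomial. All names are hypothetical.

```python
from fractions import Fraction

DIM = 4  # strictly upper-triangular 4x4 matrices: any product of four factors vanishes


def upper(a, b, c, d, e, f):
    """Strictly upper-triangular 4x4 matrix with the given entries."""
    rows = [[0, a, b, c], [0, 0, d, e], [0, 0, 0, f], [0, 0, 0, 0]]
    return [[Fraction(v) for v in row] for row in rows]


def identity():
    return [[Fraction(int(i == j)) for j in range(DIM)] for i in range(DIM)]


def lin(*terms):
    """Linear combination of (coefficient, matrix) pairs."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(DIM)]
            for i in range(DIM)]


def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(DIM)) for j in range(DIM)]
            for i in range(DIM)]


def comm(x, y):
    return lin((1, mul(x, y)), (-1, mul(y, x)))


def exp_nilpotent(x):
    """exp(x) = 1 + x + x^2/2 + x^3/6, exact because x^4 = 0."""
    x2 = mul(x, x)
    x3 = mul(x2, x)
    return lin((1, identity()), (1, x), (Fraction(1, 2), x2), (Fraction(1, 6), x3))


def bch_third_order(x, y):
    """Exponent on the right-hand side of (3.13), truncated after the double commutators."""
    c = comm(x, y)
    return lin((1, x), (1, y), (Fraction(1, 2), c),
               (Fraction(1, 12), comm(x, c)), (Fraction(1, 12), comm(y, comm(y, x))))
```

For any two such matrices `X` and `Y`, `mul(exp_nilpotent(X), exp_nilpotent(Y))` should coincide with `exp_nilpotent(bch_third_order(X, Y))`. Exact rational arithmetic is used so that the comparison is not affected by rounding.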

The definition of the HS generators in (2.12) extends immediately. However, whereas the realization in terms of the vector oscillators automatically imposes the Young tableau symmetries $(s-1, s-1)$, here we need to Young project explicitly.$^3$ Hence, all elements of the enveloping algebra which belong to other Young tableaux are modded out.

3 Note that under the projector $P_{(s-1,s-1)}$ the use of the star product or the point-wise (classical) product is immaterial. For instance, for spin 3 we have $T_{AB,CD} = P_{(2,2)}(M_{AC} \star M_{BD}) = P_{(2,2)}(M_{AC} M_{BD})$.

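The star products of a spin-2 generator with a spin-2 generator and with a spin-$s$ generator $T_{C(s-1),D(s-1)}$ read

$$M_{AB} \star M_{CD} = M_{AB} M_{CD} - 2\, \eta_{[C|[A} M_{B]|D]}, \tag{3.16}$$

$$M_{AB} \star T_{C(s-1),D(s-1)} = M_{AB}\, T_{C(s-1),D(s-1)} - 2(s-1)\, P_{(s-1,s-1)} \big( \eta_{A C_{s-1}} T_{B C(s-2),D(s-1)} \big) + \text{double contractions}, \tag{3.17}$$

where $P_{(s-1,s-1)}$ is a Young projector and anti-symmetrization in $AB$ is understood. The commutation relations in (2.4) and (2.16) follow readily by defining the bracket $[U, V]_\star = U \star V - V \star U$, since we know that $[M_{AB}, F(M)]_\star = 4\, M_{C[A}\, \partial^{C}{}_{B]} F(M)$; see Eq. (B.2) in Appendix B.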

Let us now proceed with defining the symmetric invariant tensors of the HS algebra. Given an element $F(M)$ of the enveloping algebra $\mathcal{U}[so(D-1, 2)]$, we define the operation tr given by evaluation at $M_{AB} = 0$,

$$\mathrm{tr}\, F(M) := F(0). \tag{3.18}$$

However, although the analogue of this operation for the vector oscillators described in Section 2.2 constitutes a proper (super)trace [6,40,41], it is easy to see that the bilinear $\mathrm{tr}(F(M) \star G(M))$ vanishes identically in our case (see also the comments in footnote 4). To obtain a sensible non-zero trace, we need to insert $GL(D+1)$-invariant differential operators into the trace (3.18); cf. the results in Ref. [36]. A natural $GL(D+1)$-invariant differential operator is constructed out of $n = (D+1)/2$ derivatives contracted with the totally anti-symmetric tensor $\epsilon_{A_1 \cdots A_{D+1}}$. We propose the following sequence of traces $\mathrm{Tr}_k$, for $k = 1, 2, 3, \ldots$,

$$\mathrm{Tr}_k\, F(M) = \mathrm{tr}\big[ \Delta^k F(M) \big], \tag{3.19}$$

$$\Delta = \epsilon^{A_1 \cdots A_n B_1 \cdots B_n}\, \frac{\partial}{\partial M_{A_1 B_1}} \cdots \frac{\partial}{\partial M_{A_n B_n}}, \tag{3.20}$$

with tr as in (3.18).$^4$ These traces are cyclic,

$$\mathrm{Tr}_k(F \star G) = \mathrm{Tr}_k(G \star F), \tag{3.21}$$

for generic elements $F(M)$ and $G(M)$ of the enveloping algebra, which will be proven in Appendix B.

We now define the symmetric trilinear for three generators (2.12) of the HS algebra $ho(4, 2)$ of spins $s$, $s'$ and $s''$ to be

$$\langle T_s, T_{s'}, T_{s''} \rangle := \sum_{k=1}^{\infty} \alpha_k\, \mathrm{Tr}_k \big( \{ T_s, T_{s'} \}_\star \star T_{s''} \big), \tag{3.22}$$

a definition that extends directly to an $n$-form for $D = 2n - 1$. The total symmetry of (3.22) follows from (3.21) and the associativity of the BCH star product.

4 The oscillator algebra based on (2.8) admits a natural graded (super)trace $\mathrm{tr}_y f(y) = f(0)$, such that $\mathrm{tr}_y(f(y) \star g(y)) = \mathrm{tr}_y(g(y) \star f(y))$ [6,40,41]. Using this trace we can construct the anti-symmetric invariants of the $ho(D-1, 2)$ algebra. However, for the reasons explained above, even dressing this trace with derivative operators analogous to (3.20) cannot give rise to a non-vanishing symmetric combination.

At this stage we have to note that, strictly speaking, the cyclicity (3.21) is not sufficient to prove invariance of (3.22), since the commutator with respect to the BCH star product potentially contains ideal terms. However, for the linearization in case of a spin-3 field to be analyzed below, one can check explicitly that the tensor is invariant to that order. So we expect (3.22) to be invariant under the adjoint action of the full $ho(4, 2)$, which, furthermore, might fix the free coefficients $\alpha_k$.

The definition (3.22) will reproduce the symmetric spin-2 trilinear (3.3) provided we choose the first coefficient to be $\alpha_1 = 1/12$. Further, it follows that the invariant of two spin-2 generators and one spin-$s$ generator vanishes for $s > 2$ once the symmetries imposed by the Young projector of the spin-$s$ generator are taken into account,

$$\langle M_{AB}, M_{CD}, T_{E(s-1),F(s-1)} \rangle = 0. \tag{3.23}$$

This relation guarantees that the equations of motion for the HS fields (see (3.8)) will not contain a term depending only on the spacetime curvature, which in turn implies that the spin-2 field does not provide a source for the HS fields. Put differently, it is consistent with the field equations to set the HS fields to zero.

Only the first trace $\mathrm{Tr}_1$ enters the linearized spin-3 analysis which we will focus on below. The relevant invariant in $D = 5$ is given by

$$\langle M_{AB}, T_{CD,EF}, T_{GH,IJ} \rangle = 2\, P_{(2,2)} P_{(2,2)} \big( \epsilon_{ABCEGI}\, \eta_{DH}\, \eta_{FJ} \big), \tag{3.24}$$

where the projectors impose the symmetries of the two spin-3 generators. We note that, up to an overall constant, the invariant (3.24) is the only possible term which is consistent with the imposed Young symmetries.

Up to now we have established the existence of a HS Lie algebra and an associated symmetric invariant tensor. This in turn is sufficient to define a consistent HS Chern–Simons action, which in, say, $D = 5$ is given by

$$S = \int_{M_5} \Big\langle W\, dW\, dW + \frac{3}{2}\, dW\, W\, W\, W + \frac{3}{5}\, W\, W\, W\, W\, W \Big\rangle . \tag{3.25}$$

Here $W$ denotes the gauge field taking values in $ho(4, 2)$. It contains by construction the Lovelock gravity discussed in Section 3.1, corresponding to the subalgebra $so(4, 2)$. Note that all the complexity of this theory is encoded in the infinite-dimensional Lie algebra $ho(4, 2)$ and the symmetric tensor. By virtue of the consistency of $ho(4, 2)$ and the existence of the trilinear tensor, this action is by construction invariant under an exact HS symmetry at the full non-linear level, i.e., it satisfies requirement (i) in the introduction. However, due to the fact that the Lie brackets of $ho(4, 2)$ are not known explicitly, at this stage the action (3.25) cannot be rewritten in a closed form in terms of the physical HS fields. Fortunately a linearized analysis can be performed, and in the next section we will show that one indeed recovers the correct free field limit, thus proving that (3.25) also satisfies condition (ii).

4. Dynamical analysis

In this section we will discuss some aspects of the dynamical content of the constructed HS theory. As it stands, the HS action (3.25) describes a theory with propagating gravitational torsion, so we expect also the HS torsions (which will be defined below) to propagate. Since the dynamics of this kind of theory is much less understood, we take here a pragmatic point of view, i.e., we impose vanishing torsion, which is compatible with the equations of motion though it is not enforced by them. For simplicity our focus will be on the first non-trivial case, viz. spin-3, which we believe exhibits generic features present for arbitrary spin.

4.1. Linearization and constraints for spin-3

We first note that, as in the purely gravitational case, an expansion around $\mathrm{AdS}_5$ does not give rise to a non-trivial propagator. This can be seen by inspecting the equations of motion (3.8). Up to first order they are of the form

$$g_{ABC}\, R^B_{\rm AdS}\, R^C_{\rm HS} = 0, \tag{4.1}$$

where $R_{\rm HS}$ denotes the linearized HS contribution. As the AdS-covariant field strength vanishes in the AdS background, $R_{\rm AdS} = 0$, the equations are identically satisfied at first order and do not lead to any perturbative dynamics.

Instead we will first keep the discussion generic and later focus on an expansion around the $\mathrm{AdS}_4 \times S^1$ solution discussed in Section 3.1. For this we have to know the HS algebra explicitly. Fortunately, for an expansion around a given background geometry, only the commutators between spin-2 and spin-$s$ generators enter, while the mutual interactions between the different HS fields are not relevant. The spin-3 generator is given by $T_{AB,CD}$, corresponding to the $(2,2)$ Young tableau, and its commutator with the spin-2 generators reads (anti-symmetrization in $AB$ understood)$^5$

$$[M_{AB}, T_{CD,EF}]_\star = -2 \big( \eta_{AC} T_{BD,EF} + \eta_{AD} T_{CB,EF} + \eta_{AE} T_{CD,BF} + \eta_{AF} T_{CD,EB} \big) = -8\, \eta_{A\{C} T_{|B|D,EF\}} . \tag{4.2}$$

Here curly brackets denote $(2,2)$ Young projection, while in the following they also indicate symmetrization according to the Hook tableau, etc. (see Appendix A). In a GL(5) covariant basis, the spin-3 generators are given by $T_{ab} = T_{ab,66}$, $T_{ab,c} = T_{ab,c6}$ and $T_{ab,cd}$, and their algebras read

$$
\begin{aligned}
[M_{ab}, T_{cd}] &= 4\,\eta_{a\{c}\,T_{|b|d\}}, \\
[M_{ab}, T_{cd,e}] &= 4\,\eta_{a\{c}\,T_{|b|d,e\}} + 4\,\eta_{a\{c}\,T_{de\},b}, \\
[M_{ab}, T_{cd,ef}] &= 8\,\eta_{a\{c}\,T_{|b|d,ef\}}, \\
[P_a, T_{bc,d}] &= 3\,\eta_{a\{b}\,T_{cd\}} - T_{ad,bc}, \\
[P_a, T_{bc,de}] &= 8\,\eta_{a\{b}\,T_{cd,e\}}.
\end{aligned}
\tag{4.3}
$$

Here we take the brackets $[T, T]$ to be vanishing, even though in the full HS algebra they close into spin-4 generators. However, in the linearization these spin-4 fields decouple, and, indeed, this truncation defines a consistent Lie algebra.
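As a consistency check on the GL(5) decomposition of the spin-3 generator, one can count independent components. The sketch below assumes the standard hook-content formula for the dimension of a GL(m) irreducible tensor, with m = 6 for AdS$_5$ indices and m = 5 for Lorentz indices as in Appendix A. The function name `gl_dim` and the encoding of a tableau by its row lengths are illustrative choices, not notation from the text.

```python
def gl_dim(rows, m):
    """Dimension of the GL(m) irrep labelled by a Young tableau.

    rows: non-increasing row lengths, e.g. (2, 2) for the window tableau.
    Hook-content formula: product over boxes of (m + content) / hook length.
    """
    # Column lengths of the tableau.
    cols = [sum(1 for r in rows if r > j) for j in range(rows[0])]
    num, den = 1, 1
    for i, r in enumerate(rows):
        for j in range(r):
            arm = r - j - 1        # boxes to the right in the same row
            leg = cols[j] - i - 1  # boxes below in the same column
            num *= m + j - i       # m plus the content of the box
            den *= arm + leg + 1   # hook length of the box
    return num // den


# Window generator T_{AB,CD} with GL(6) indices.
window_gl6 = gl_dim((2, 2), 6)

# GL(5) pieces T_{ab}, T_{ab,c}, T_{ab,cd}.
pieces_gl5 = [gl_dim((2,), 5), gl_dim((2, 1), 5), gl_dim((2, 2), 5)]

print(window_gl6, pieces_gl5, sum(pieces_gl5))
```

The component count of the window tableau for m = 6 equals the sum of the three m = 5 pieces, so under these assumptions $T_{ab}$, $T_{ab,c}$ and $T_{ab,cd}$ exhaust the components of $T_{AB,CD}$.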

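Next we linearize the HS gauge field as$^6$

$$ W_\mu = \bar e_\mu{}^{a} P_a + \frac{1}{2}\,\bar\omega_\mu{}^{ab} M_{ab} + \frac{1}{2}\, e_\mu{}^{ab}\, T_{ab} + \frac{1}{3}\,\omega_\mu{}^{ab,c}\, T_{ab,c} + \frac{1}{12}\,\omega_\mu{}^{ab,cd}\, T_{ab,cd}, \tag{4.4} $$

where $\bar e_\mu{}^{a}$ and $\bar\omega_\mu{}^{ab}$ are vielbein and spin connection of the background geometry. Moreover, we consistently omitted contributions from all fields with spin $s > 3$. $e_\mu{}^{ab}$ will later be identified with the spin-3 field, while $\omega_\mu{}^{ab,c}$ and $\omega_\mu{}^{ab,cd}$ are auxiliary fields that have to be eliminated by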

5 In the sequel we will drop the subscript $\star$ on the commutators.

6 The unit-strength normalizations follow from the Hook length formula [43].


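means of constraints. It will turn out that these constraints are analogous to the torsion constraint of general relativity. As the torsion tensor appears as part of the field strength in (3.4), we will determine the required constraints in the HS case by computing the non-Abelian field strength based on the algebra (4.3). We find

$$
\begin{aligned}
F_{\mu\nu} &= \partial_\mu W_\nu - \partial_\nu W_\mu + [W_\mu, W_\nu] \\
&= \bar T_{\mu\nu}{}^{a} P_a + \frac{1}{2}\,\bar R_{\mu\nu}{}^{ab} M_{ab}
 + \frac{1}{2}\, T_{\mu\nu}{}^{ab}\, T_{ab} + \frac{1}{3}\, T_{\mu\nu}{}^{ab,c}\, T_{ab,c} + \frac{1}{12}\, R_{\mu\nu}{}^{ab,cd}\, T_{ab,cd} + \text{(terms of second order in the fluctuations)}.
\end{aligned}
\tag{4.5}
$$

Here $\bar T_{\mu\nu}{}^{a}$ denotes the background torsion, which we assume to vanish, while $\bar R_{\mu\nu}{}^{ab}$ is the AdS-covariant background curvature tensor. The linearized HS field strengths read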

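$$
\begin{aligned}
T_{\mu\nu}{}^{ab} &= \bar D_\mu e_\nu{}^{ab} - \bar D_\nu e_\mu{}^{ab} + \omega_\mu{}^{ab,c}\,\bar e_{\nu c} - \omega_\nu{}^{ab,c}\,\bar e_{\mu c}, \\
T_{\mu\nu}{}^{ab,c} &= \bar D_\mu \omega_\nu{}^{ab,c} - \bar D_\nu \omega_\mu{}^{ab,c} + \omega_\mu{}^{ab,cd}\,\bar e_{\nu d} - \omega_\nu{}^{ab,cd}\,\bar e_{\mu d} + 3\, e_\mu{}^{\{ab}\,\bar e_\nu{}^{c\}} - 3\, e_\nu{}^{\{ab}\,\bar e_\mu{}^{c\}}, \\
R_{\mu\nu}{}^{ab,cd} &= \bar D_\mu \omega_\nu{}^{ab,cd} - \bar D_\nu \omega_\mu{}^{ab,cd} + 4\,\omega_\mu{}^{\{ab,c}\,\bar e_\nu{}^{d\}} - 4\,\omega_\nu{}^{\{ab,c}\,\bar e_\mu{}^{d\}},
\end{aligned}
\tag{4.6}
$$

where $\bar D_\mu$ denotes the background Lorentz covariant derivative, which reads on the different fields

$$
\begin{aligned}
\bar D_\mu e_\nu{}^{ab} &= \partial_\mu e_\nu{}^{ab} + 2\,\bar\omega_\mu{}^{\{a}{}_{c}\, e_\nu{}^{|c|b\}}, \\
\bar D_\mu \omega_\nu{}^{ab,c} &= \partial_\mu \omega_\nu{}^{ab,c} + 2\,\bar\omega_\mu{}^{\{a}{}_{d}\,\omega_\nu{}^{|d|b,c\}} - 2\,\bar\omega_\mu{}^{\{a}{}_{d}\,\omega_\nu{}^{bc\},d}, \\
\bar D_\mu \omega_\nu{}^{ab,cd} &= \partial_\mu \omega_\nu{}^{ab,cd} + 4\,\bar\omega_\mu{}^{\{a}{}_{e}\,\omega_\nu{}^{|e|b,cd\}}.
\end{aligned}
\tag{4.7}
$$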

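Before we turn to the constraints let us discuss the spin-3 symmetries, under which the field strengths above stay invariant. Under a non-Abelian gauge transformation $\delta W_\mu = D_\mu \epsilon = \partial_\mu \epsilon + [W_\mu, \epsilon]$, with Lie algebra valued transformation parameter given in the spin-3 case by

$$ \epsilon = \xi^{a} P_a + \frac{1}{2}\,\Lambda^{ab} M_{ab} + \frac{1}{2}\,\xi^{ab}\, T_{ab} + \frac{1}{3}\,\xi^{ab,c}\, T_{ab,c} + \frac{1}{12}\,\xi^{ab,cd}\, T_{ab,cd}, \tag{4.8} $$

we find the following variations (ignoring background diffeomorphisms and Lorentz transformations)

$$
\begin{aligned}
\delta e_\mu{}^{ab} &= \bar D_\mu \xi^{ab} - \xi^{ab,c}\,\bar e_{\mu c}, \\
\delta \omega_\mu{}^{ab,c} &= \bar D_\mu \xi^{ab,c} - 3\,\xi^{\{ab}\,\bar e_\mu{}^{c\}} - \xi^{ab,cd}\,\bar e_{\mu d}, \\
\delta \omega_\mu{}^{ab,cd} &= \bar D_\mu \xi^{ab,cd} - 4\,\xi^{\{ab,c}\,\bar e_\mu{}^{d\}}.
\end{aligned}
\tag{4.9}
$$

Note that the gauge transformations with parameters $\xi^{ab,c}$ and $\xi^{ab,cd}$ corresponding to the auxiliary fields act as Stückelberg shift symmetries.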

Next we are going to discuss the constraints. We will see that imposing the conditions$^7$

$$ T_{\mu\nu}{}^{ab} = 0, \qquad T_{\mu\nu}{}^{ab,c} = 0, \tag{4.10} $$

7 Note that the first constraint allows one to identify the background diffeomorphisms with the gauge transformations generated by $\xi^{a}$ in the sense that the latter read on $e_\mu{}^{ab}$, up to local Lorentz and Stückelberg transformations, $\delta_\xi e_\mu{}^{ab} = \mathcal{L}_\xi e_\mu{}^{ab}$, where $\mathcal{L}_\xi$ denotes the Lie derivative with respect to the vector field $\xi^\mu = \bar e^{\mu}{}_{a}\,\xi^{a}$.


allows one to express $\omega_\mu{}^{ab,c}$ in terms of the physical spin-3 field $e_\mu{}^{ab}$ and its first derivative and $\omega_\mu{}^{ab,cd}$ in terms of $\omega_\mu{}^{ab,c}$ and its first derivatives. In turn, $\omega_\mu{}^{ab,cd}$ is a function of $e_\mu{}^{ab}$ and its first and second derivatives. The latter can be inserted into the third of Eq. (4.6), which then yields the HS generalization of the Riemannian curvature tensor. Therefore the spin-3 curvature tensor will be of third order in the derivatives of the spin-3 field. This procedure can be generalized to arbitrary spin-$s$ fields, whose curvature tensor will thus contain the $s$th derivative of the physical spin-$s$ field. (For traceless tensors in $D = 4$ spinorial form this analysis has been done in [44], while a cohomological analysis in $D$ dimensions can be found in [11,45].) This corresponds to the hierarchy of de Wit-Freedman connections found in the metric-like formulation [30]. Since the equations of motion will necessarily impose conditions on the HS Riemann tensor, this implies that the field equations are in the linearization already of higher derivative order. So at first sight we seem to have little chance to recover the required 2nd-order Frønsdal equations. However, in flat space it has been shown that the Riemann tensor is a curl (Damour-Deser identity [49]) and that it can therefore be locally integrated, giving rise to the Frønsdal equations in the so-called compensator formulation [34,50]. Here we will prove that this generalizes to AdS.
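The derivative counting described above can be summarized schematically. The sketch below suppresses all index structure, numerical factors and lower-derivative terms involving the background vielbein; the symbol $\sim$ only tracks the highest number of derivatives acting on the physical field and is an assumption of this illustration.

```latex
\begin{align}
T^{ab} = 0 \quad &\Longrightarrow \quad \omega^{ab,c} \sim \partial e^{ab}, \\
T^{ab,c} = 0 \quad &\Longrightarrow \quad \omega^{ab,cd} \sim \partial\,\omega^{ab,c} \sim \partial^2 e^{ab}, \\
R^{ab,cd} \;&\sim\; \partial\,\omega^{ab,cd} \;\sim\; \partial^3 e^{ab}.
\end{align}
```

For a spin-$s$ field the same pattern involves $s-1$ auxiliary connections, each adding one derivative, so that the curvature contains $\partial^{s}$ acting on the physical field.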

Let us now turn to the constraints. From the first of Eq. (4.10) we conclude

$$ \omega_{d|bc,a} - \omega_{a|bc,d} = \Delta_{ad,bc}, \tag{4.11} $$

where the curved index on $\omega_\mu{}^{ab,c}$ has been converted into a flat index by means of the background vielbein, and we have introduced a HS generalization of the coefficients of anholonomity,

$$ \Delta_{ab,}{}^{cd} = \bar e_a{}^{\mu}\,\bar e_b{}^{\nu}\left(\bar D_\mu e_\nu{}^{cd} - \bar D_\nu e_\mu{}^{cd}\right). \tag{4.12} $$

By permuting the indices in (4.11), one finds the expression

1

bc,d = ea 1 a(b,c)d 1 ad,bc + 1 d(b,c)a + bc,d ,

2

where

1

bc,d = ea abc,d + bda,c + cda,b + dbc,a .

4

(4.13)

(4.14)

To understand the significance of ab,c , we first note that a priori (4.13) lives in the Young

tableaux

(4.15)

It follows from (4.14) that is in the window tableau, i.e., (1 P(2,2) ) = 0. In the following

we will have to treat as an independent field. One can easily check that (4.13) solves (4.11) for

arbitrary , by using the window property of the latter. In fact, we will see that the inclusion of

this auxiliary field is necessary in order for the composite connection bc,d (e, ) to reproduce

the correct transformation behavior in (4.9).
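The solution of the first torsion constraint can be checked numerically. The sketch below makes several assumptions that are not taken verbatim from the text: it writes $\Delta_{ad,bc}$ for the generalized anholonomity coefficients of (4.12), antisymmetric in the first and symmetric in the second index pair; it sets the auxiliary window field to zero; and it takes the particular solution in the form $\omega_{a|bc,d} = \frac{1}{2}(\Delta_{a(b,c)d} - \Delta_{ad,bc} + \Delta_{d(b,c)a})$, with relative signs fixed by requiring (4.11) and the Hook property. The array layout and function names are illustrative.

```python
import numpy as np


def random_anholonomity(n, seed=0):
    """Random Delta[a, d, b, c]: antisymmetric in (a, d), symmetric in (b, c)."""
    rng = np.random.default_rng(seed)
    R = rng.normal(size=(n, n, n, n))
    D = R - np.einsum('dabc->adbc', R)      # antisymmetrize first pair
    return D + np.einsum('adcb->adbc', D)   # symmetrize second pair


def composite_connection(D):
    """omega[a, b, c, d] = omega_{a|bc,d} built from Delta (auxiliary field = 0)."""
    # Delta_{a(b,c)d}: first pair (a, b), second pair (c, d), symmetrized in b, c.
    t1 = 0.5 * (D + np.einsum('acbd->abcd', D))
    # Delta_{ad,bc}
    t2 = np.einsum('adbc->abcd', D)
    # Delta_{d(b,c)a}: first pair (d, b), second pair (c, a), symmetrized in b, c.
    t3 = 0.5 * (np.einsum('dbca->abcd', D) + np.einsum('dcba->abcd', D))
    return 0.5 * (t1 - t2 + t3)


D = random_anholonomity(4)
omega = composite_connection(D)

# Constraint: omega_{d|bc,a} - omega_{a|bc,d} = Delta_{ad,bc}
lhs = np.einsum('dbca->abcd', omega) - omega
rhs = np.einsum('adbc->abcd', D)
constraint_ok = np.allclose(lhs, rhs)

# Hook property in (bc, d): symmetrization over b, c, d vanishes.
cyclic = omega + np.einsum('acdb->abcd', omega) + np.einsum('adbc->abcd', omega)
hook_ok = np.allclose(cyclic, 0.0)

print(constraint_ok, hook_ok)
```

Both checks hold for arbitrary $\Delta$ with the stated symmetries, which reflects that the constraint is solved algebraically, in analogy with the solution for the spin connection in terms of the vielbein in general relativity.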

From now on we will specify the geometry to AdS, since this is the case we are interested in later on.$^8$ Specifically, the background-covariant derivative $\bar D_\mu$ reduces to the AdS-covariant derivative $\nabla_\mu$, characterized by

$$ [\nabla_\mu, \nabla_\nu] V_\rho = \frac{1}{L^2}\left(g_{\nu\rho} V_\mu - g_{\mu\rho} V_\nu\right), \tag{4.16} $$

with the AdS metric $g_{\mu\nu}$ of radius $L$, which in our conventions is $L = 1$.

8 Note, however, that the analysis performed here holds in an arbitrary dimension, i.e., it applies in particular to AdS$_5$ as well as the AdS$_4$ geometry we will discuss below.

Applying the first equation of (4.9) to (4.13), one can verify by use of (4.16) that the composite connection $\omega_\mu{}^{ab,c}$ transforms exactly as required by the second equation of (4.9), if one defines

, = ,

3 g

, .

(4.17)

In particular one sees that this transformation rule is consistent with the window symmetry

of ab,c .

The second torsion constraint in (4.10) can now be solved in a similar fashion. For our purposes it will, however, be sufficient to perform this analysis in a gauge-fixed formulation (for AdS backgrounds). This will effectively reduce the field content to the completely symmetric spin-3 field, given in a metric-like formulation by

$$ h_{\mu\nu\rho} := \bar e_{(\mu}{}^{a}\,\bar e_{\nu}{}^{b}\, e_{\rho)ab}. \tag{4.18} $$

Specifically we use the Stückelberg shift symmetry in (4.9) parametrized by $\xi^{ab,c}$ to gauge the hooked part of $e_{\mu|ab}$ to zero (see (A.3) in Appendix A). However, this gauge-fixing will be violated by a generic spin-3 transformation, and so one has to add a compensating shift transformation with parameter $\xi_{ab,c} = \nabla_{\{c}\,\xi_{ab\}}$. Under this residual gauge symmetry only the completely symmetric part of $e$ transforms, namely as

$$ \delta h_{\mu\nu\rho} = \nabla_{(\mu}\,\xi_{\nu\rho)}, \tag{4.19} $$

as required in the free limit (see Section 2.1). Furthermore, from (4.17) we infer, that also ab,c

is subject to a Stückelberg shift symmetry with transformation parameter in . Therefore it can

be gauged away completely, which in turn requires a compensating transformation with

, =

3 g

.

(4.20)

In total, after gauge-fixing the spin-3 connections will depend only on the completely symmetric

part of eab .

To solve the second torsion constraint we derive from (4.6) for ab,cd in flat indices:

a bc,de e bc,da = 2 ea bc,d ,

(4.21)

where

cd,e + 3e cd e e

( ) .

2 ab cd,e = ea eb D

(4.22)

1

ab,cd = ef 2 f a cd,b + 2 f b cd,a + 2 f c ab,d + 2 f d ab,c .

2

(4.23)

To verify that this is a solution it is not sufficient to use the symmetries of the 2 ab cd,e , but

instead the explicit expression given by (4.22) and (4.13) together with the AdS relation (4.16) is

required.


Let us now turn to the equations of motion. We specify to the AdS$_4$ background discussed in Section 3.1. Moreover, we set all components of $e_\mu{}^{ab}$ which have a leg in the fifth dimension to zero. In other words, we are not considering the dynamics of Kaluza-Klein scalars and vectors, etc., in order to simplify the analysis. Though in the full non-linear theory this would most likely not be a consistent Kaluza-Klein truncation, in the linearization this is justified since the different fields decouple.

Using the explicit form of the invariant tensor in (3.24) we see that, after imposing the constraint, the only non-trivial part of the equations of motion (4.1) requires the free index to take values in the Hook tableau. Moreover, we have seen in Eq. (3.23) that setting the background spin-3 field to zero is consistent with its equations of motion, which we implicitly assumed already in the expansion (4.4).

Specifically, by use of (3.24) we have

abcde Rab R d f , e h

0=P

1

(4.24)

abcde Rab R d f , e h + abhde Rab R d f , e c ,

2

where we used in the second equation the projector in (ch, f ). Specifying now to flat AdS4

indices a = (, 4),9 and using (3.10) this implies

=

e R , + ( ) = 0.

This yields in components by use of the identity

belling the indices

(4.25)

e

e]

3!e

e[

R , , R , + R , g + ( ) = 0.

Taking the , trace implies that the double trace of the Riemann tensor vanishes. We prove in

Appendix C that the final equation is equivalent to the condition that any single trace of the Riemann tensor, i.e., the spin-3 analogue of the Ricci tensor, vanishes. It turns out that a convenient

choice is the following:

R , = 0.

(4.27)

Next we are going to analyze this equation in more detail. By inserting (4.13) into (4.23) and

using (4.6) one finds the explicit expressions (in curved indices)

K + g ( h h ) + g ( h h ) ( ) = 0,

(4.28)

where we defined

K = h h h + ( ) h

(D 3)h g( h ) 3g h .

(4.29)

Here we left the spacetime dimension generic, though in our case it is D = 4. As outlined above,

we are going to show that these 3rd-order differential equations can locally be integrated to give

9 Indices $\mu, \nu, \ldots$ denote $D$-dimensional spacetime indices. We hope it will not cause any confusion that we specify them in this section to curved AdS$_4$ indices.


effectively rise to sensible 2nd-order field equations. We first note that, in contrast to Minkowski

space, on AdS a vanishing curl cannot locally be integrated to a gradient, since the covariant

derivatives do not commute. However, for K symmetric in , a condition like

[ K] = 2g[ ] 2g [ ] ,

(4.30)

Comparing with (4.28) we see that the equations of motion have almost this form, except that

the derived like this are not symmetric. If the latter are symmetrized by hand, additional

terms have to be added to the ansatz for K , which in turn implies that it is no longer a pure

gradient. Moreover, an integration constant has to be carefully taken into account. Altogether one

finds that

K = + g (h ) + g (h )

(4.31)

= h h h + 2g ,

(4.32)

and is the integration constant. Then (4.31) can be rewritten by use of the explicit expression

in (4.29) as

AdS

= ( ) 4g( ) ,

F

(4.33)

where $F^{\rm AdS}$ is the AdS Frønsdal operator defined in Section 2.1. To understand the significance

of we note that Eq. (4.28) is by construction spin-3 invariant. However, by locally integrating, this invariance would be lost, if not a non-trivial transformation behavior is assigned to the

integration constant . In fact, (4.33) is only invariant if

= .

(4.34)

This shift symmetry can now be used to set the integration constant to zero, such that (4.33) reduces to the Frønsdal equation on AdS, the latter being invariant under all traceless spin-3 transformations. Thus we correctly recovered the required free spin-3 equations. The formulation (4.33) with its invariance under trace-full transformations and the appearance of the so-called compensator in fact coincides completely with the construction of Francia and Sagnotti [46-48].

5. Conclusions and outlook

By virtue of the Yang-Mills gauge invariance of Chern-Simons actions in any odd dimension, the HS theories constructed in this paper provide a consistent coupling to gravity in the sense that the free HS symmetry $\delta h_{\mu_1 \ldots \mu_s} = \partial_{(\mu_1}\xi_{\mu_2 \ldots \mu_s)}$ gets deformed to an exact symmetry of the full non-linear theory. In other words, condition (i) raised in the introduction is satisfied by construction. Moreover, contrary to what is sometimes implicitly assumed, these topological actions do possess propagating degrees of freedom for $D > 3$. By linearizing around the AdS$_4$ solution found in [25], we verified explicitly that this is the case especially in the presence of HS fields. We recovered the correct free field equations in the first non-trivial case of a spin-3 field. For this we showed that on the subsector of vanishing spin-3 torsion the field equations, though being 3rd-order differential equations, can locally be integrated to 2nd-order equations, which in turn coincide with the Frønsdal equations in the formulation of [46,47].

We would like to stress that this is in contrast to previous attempts to construct consistent HS

actions [13,14]: In order to guarantee free field equations of 2nd-order, they impose the additional

condition that the extra fields (in our case the spin-3 connection ab,cd ), which are generically

of higher-derivative order, do not enter the free action. Here we do not have this freedom, since

the action is completely determined by gauge invariance, i.e., the extra fields inevitably enter the

free theory. That we get nevertheless the correct Frønsdal equations, or in other words, that the

higher-derivatives are gauge artefacts that can be eliminated, is due to the curl-like structure of

the HS Riemann tensor, which in flat space is known as the Damour-Deser identity [49,50] (see

also [51,52]). Since we verified here an analogous behaviour on AdS for spin-3, this pattern will

most likely extend to all HS fields, and therefore requirement (ii) for consistent HS theories is

satisfied.

Let us also stress that in this approach it is very natural, if not necessary, to start with a HS

algebra based on tracefull generators, since then the appearance of the compensator in the integration leading from (4.28) to (4.33) has a very natural interpretation in that it compensates

for the non-invariance of the pure Frønsdal operator under tracefull HS transformations. Moreover, starting with traceless generators would imply in particular that the HS Riemann tensor is

already traceless and consequently the field equations in the form (4.27) would be identically

satisfied and not lead to any dynamics. Instead, the dynamics could possibly be encoded, via

the Bianchi identities, in the lower-rank torsion-like tensors, for which, however, the distinction

between constraint equations and dynamical equations would be less straightforward. (See also

the discussion about the so-called $\sigma_-$-cohomology in [11] and references therein.)

Finally we note that, compared to the unfolded formulation of HS theories advertised in

the literature so far, the more conventional action principle presented here has the advantage of

admitting already a class of exact solutions. In fact, by virtue of the relation (3.23) we concluded

that any solution of the purely gravitational theory, as for instance black holes [53] and pp-waves

[54], can be lifted to an exact solution of the full theory, simply by setting all HS fields to zero.

Accordingly, this theory allows the analysis of HS dynamics on more complicated backgrounds

(and then, in principle, also of the back reaction of the geometry). This is in contrast to the

unfolded dynamics, for which even in case that all HS fields vanish, the construction of solutions

is a highly non-trivial problem. Indeed, apart from AdS, exact solutions have been found only

recently [55,56].

Many things are left to be done. First, we have analyzed the dynamical content only in case

of trivial HS torsion. However, viewed as a 1st-order formulation, the theory does not imply vanishing torsion (though the latter provides a particular solution). So either one imposes the torsion constraints by hand, in order to express the de Wit-Freedman connections in terms of the physical HS fields, or one treats the torsions as carrying additional degrees of freedom. In the former case it is not clear that this is consistent with the HS gauge symmetry: Although we have seen in Section 4.1 that in case of a linearization around an AdS geometry the composite connections transform in the same way as the fundamental connections (and so imposing the constraints does not violate the HS invariance), it is not clear whether this is consistent in general. In case it is not consistent, this would mean that there are additional degrees of freedom associated to the torsion, which necessarily need to be taken into account. Apart from that we should point out that due to the way the torsion tensor enters the Chern-Simons theory, there does not exist a 1.5-order formalism.

One of the main difficulties in analyzing the non-linear dynamics of the constructed HS theory

in more detail is due to the fact that the infinite-dimensional HS algebras are poorly understood.

Though these algebras are well defined through the oscillator realization described in Section 2.2,


the structure constants, for instance, are not known in general. Further research into this direction

is required for a detailed analysis of the interactions.

Once the dynamical content is known, it remains to be seen how the different fields organize

into HS multiplets. We first notice that the basic field content of the Chern-Simons theory in $D = 5$ does not fit into multiplets of ho(4, 2), since the latter requires in particular a massless scalar [6,57]. This in turn is the reason that the construction of HS actions à la MacDowell-Mansouri entirely based on a HS gauge field is not believed to be consistent to all orders [14]. However, our case is different, since there are no propagating HS modes around AdS$_5$ and so there is no reason to expect that the $D = 5$ field content should organize in multiplets.$^{10}$ Rather we found that the non-trivial HS dynamics takes place on backgrounds which are not maximally symmetric, such as the AdS$_4$ solution. However, on this background there will most likely be scalar and other excitations which are the Kaluza-Klein modes originating from the off-diagonal components of the various fields. Due to the HS invariance of the full theory, these modes almost

by construction will organize into multiplets of ho(3, 2), and it would be very interesting to see

how this happens. In some sense the theory seems to prevent itself from becoming inconsistent

exactly by not having standard dynamics around its most symmetric solution.

Let us finally note that the Chern-Simons theory in $D = 11$ based on (two copies of) the superalgebra osp(1|32) has been proposed as the non-perturbative definition of M-theory [61]. As the latter should cover in particular the infinite towers of massive HS states described by 10-dimensional string theory, it is very tempting to conjecture that osp(1|32) has to be enhanced to a HS extension, thus giving rise to Chern-Simons actions of the type considered here. In fact, recently it has been argued that the three-dimensional Chern-Simons theory based on a HS algebra is related to M-theory for non-critical strings in $D = 2$ via the background AdS$_2 \times S^1$ [62]. Similarly to the AdS$_4 \times S^1$ solution discussed for the $D = 5$ theory here, one might hope to identify a non-topological 10-dimensional phase, which permits flat Minkowski space and gives rise to massive HS states via spontaneous symmetry breaking.

Acknowledgements

For useful comments and discussions we would like to thank D. Francia, M. Henneaux,

C. Iazeolla, M.A. Vasiliev and especially P. Sundell.

This work has been supported by the European Union RTN network MRTN-CT-2004-005104 "Constituents, Fundamental Forces and Symmetries of the Universe" and the INTAS contract 03-51-6346 "Strings, branes and higher-spin fields". O.H. is supported by the stichting FOM.

Appendix A. Young tableaux and projectors

Here we give a brief review of the technique of Young tableaux used in the main text. As we

are exclusively working with tracefull tensors, these encode the irreducible representations of

GL(m), as opposed to SO(m) groups. For tensors with AdSD indices we have m = D + 1, while

for the corresponding Lorentz tensors m = D.


A Young tableau consists of a certain number of rows of boxes, where the number of boxes does not increase from top to bottom, as for instance the tableau with rows of lengths

$$ (5, 3, 3, 1). \tag{A.1} $$

It describes the symmetries of an irreducible GL(m) tensor. For the example (A.1) it has the structure $T_{a_1 \cdots a_5,\, b_1 \cdots b_3,\, c_1 \cdots c_3,\, d}$. As a matter of convention we choose the symmetric basis, which means that the corresponding tensors are completely symmetric in all row indices. Specifically, the tensor $T$ above is completely symmetric in the sets of indices $\{a_i\}$, $\{b_i\}$ and $\{c_i\}$, respectively. For irreducibility the tensors have to satisfy the additional condition that symmetrisation of all indices in a certain row with any index corresponding to a box below that row gives zero. For instance, in the example this implies

$$ T_{a_1 \cdots a_5,\,(b_1 b_2 b_3,\, c_1) c_2 c_3,\, d} = T_{a_1 \cdots a_5,\, b_1 b_2 b_3,\,(c_1 c_2 c_3,\, d)} = 0, \quad \text{etc.}, \tag{A.2} $$

where ordinary brackets denote complete symmetrization of strength one, as, e.g., $T_{(ab)} := \frac{1}{2}(T_{ab} + T_{ba})$.

specific anti-symmetrization properties can be derived from the Young tableau.11 Moreover, one

may check that for a tensor in the window tableau , Eq. (A.2) implies the exchange property

$T_{ab,cd} = T_{cd,ab}$.

The language of Young tableaux is efficient in order to determine the decomposition of tensor products into irreducible representations. Specifically, in the tensor product of the vector representation (a single box) with any Young tableau, the irreducible parts are obtained by adding one box to the given tableau in all possible ways. For instance, the spin-3 frame field $e_\mu{}^{ab}$ is a priori in the tensor product (with tableaux denoted by their row lengths)

$$ (1) \otimes (2) = (3) \oplus (2,1), \tag{A.3} $$

i.e., it contains the completely symmetric (physical) part and the so-called Hook diagram.

Finally we give the projectors onto the Hook and window diagrams, which we need in the main text, explicitly. The Hook projector reads on a general tensor with no a priori symmetries

$$ (P_{(2,1)} X)_{abc} \equiv X_{\{ab,c\}} = \frac{1}{3}\left(2X_{(ab)c} - X_{(bc)a} - X_{(ca)b}\right). \tag{A.4} $$

Similarly,

$$ (P_{(2,2)} X)_{abcd} \equiv X_{\{ab,cd\}} = \frac{1}{6}\left(2X_{(ab)(cd)} + 2X_{(cd)(ab)} - X_{(cb)(ad)} - X_{(ad)(cb)} - X_{(ac)(bd)} - X_{(bd)(ac)}\right). \tag{A.5} $$

11 It is, however, possible to start with a different convention, in which the anti-symmetrization properties, i.e., the

symmetries in a column of boxes, are specified. In Appendix C we have to relate these two.


Analogous formulas hold in case of different index orderings, as, e.g., Hook projection according to indices $(ab, c)$ on a tensor $X_{cab}$,

$$ (P_{(2,1)} X)_{cab} = \frac{1}{3}\left(2X_{c(ab)} - X_{a(bc)} - X_{b(ca)}\right). \tag{A.6} $$
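The projector properties of (A.4) and (A.5) can be verified numerically. The following sketch assumes the relative signs as fixed by two requirements: the projectors annihilate completely symmetric tensors and they are idempotent. The function names and the use of NumPy arrays with index order $(a,b,c)$ and $(a,b,c,d)$ are illustrative choices.

```python
import itertools
import numpy as np


def hook_projector(X):
    """(A.4): (P X)_{abc} = (2 X_{(ab)c} - X_{(bc)a} - X_{(ca)b}) / 3."""
    S = 0.5 * (X + np.einsum('bac->abc', X))       # S_{abc} = X_{(ab)c}
    return (2 * S
            - np.einsum('bca->abc', S)              # X_{(bc)a}
            - np.einsum('cab->abc', S)) / 3.0       # X_{(ca)b}


def window_projector(X):
    """(A.5): window projection of a rank-4 tensor."""
    S = 0.25 * (X + np.einsum('bacd->abcd', X)
                + np.einsum('abdc->abcd', X)
                + np.einsum('badc->abcd', X))       # S_{abcd} = X_{(ab)(cd)}
    return (2 * S + 2 * np.einsum('cdab->abcd', S)
            - np.einsum('cbad->abcd', S) - np.einsum('adcb->abcd', S)
            - np.einsum('acbd->abcd', S) - np.einsum('bdac->abcd', S)) / 6.0


def symmetrize(X):
    """Complete symmetrization of strength one over all indices."""
    perms = list(itertools.permutations(range(X.ndim)))
    return sum(np.transpose(X, p) for p in perms) / len(perms)


rng = np.random.default_rng(1)
X3 = rng.normal(size=(4, 4, 4))
X4 = rng.normal(size=(3, 3, 3, 3))

H = hook_projector(X3)
W = window_projector(X4)

print(np.allclose(hook_projector(H), H))               # idempotent
print(np.allclose(hook_projector(symmetrize(X3)), 0))  # kills symmetric part
print(np.allclose(window_projector(W), W))             # idempotent
print(np.allclose(W, np.einsum('cdab->abcd', W)))      # exchange property
```

The last check illustrates the exchange property $T_{ab,cd} = T_{cd,ab}$ of window tensors mentioned above.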

In this appendix we prove the assertion made in Section 3.2 that the traces $\mathrm{Tr}_k$ in (3.21) are cyclic in a general odd dimension $D = 2n - 1$.

For a generic element $F(M)$ in the enveloping algebra $U[so(D-1, 2)]$ the star product with $M_{AB}$ can be computed by use of (3.15),

2

MAB F = MAB + 2MC[A C B] + MCD [A C B] D MC[A B] D D C F ,

3

2

F MAB = MAB 2MC[A C B] + MCD [A C B] D MC[A B] D D C F ,

(B.1)

3

which implies

MAB , F (M) = 4MC[A C B] F (M).

(B.2)

This equation encodes the transformation of $F(M)$ under $M_{AB}$. In a more mathematical language this states that the BCH star product is strongly $so(D-1, 2)$-invariant [36,37].

In order to prove the cyclicity of the trace we first show for a generic monomial F =

F A1 B1 A B MA1 B1 MA B of degree

Tr1 [MAB , F ] = tr [MAB , F ] = 0.

(B.3)

To see this, we apply to (B.2), whose explicit evaluation gives

tr [MAB , F ]

n

n

tr A1 B1 Ar Br MC[A| Ar+1 Br+1 An Bn C |B] F

= 4 A1 An B1 Bn

r

r =0

= tr [MAB , F ] + 4n A1 An B1 Bn tr A1 B1 MC[A| A2 B2 An Bn C |B] F .

(B.4)

The first term vanishes, which follows from (B.2) and the fact that tr sets M = 0. The second

term can potentially be non-zero when = n. In this case it reduces to (after dropping a constant multiplicative factor) A1 An B1 Bn1 [A| FA1 B1 An1 Bn1 An |B] , which vanishes identically.

To see this we use F A1 B1 A B = F A1 B1 Bm Am A B , the symmetry under exchange of any

pair (Am , Bm ) and (Am , Bm ) and finally the fact that anti-symmetrization in 2n + 1 indices

vanishes identically for so(2n),

[A1 An B1 Bn1 A FA1 B1 An1 Bn1 An B] = 0.

(B.5)

Trk [MAB , F ] = tr k [MAB , F ] = 0,

(B.6)

by using identities similar to (B.5), which proves that Trk (F MAB ) is cyclic. By using induction, the proof extends directly to Trk (F G ) for an arbitrary monomial G . For this


we expand G = m Gm M m , use that M m = (M)m + m <m cm M m together with as

sociativity, and finally apply (B.6) several times. For instance, when = 2 we find G2 =

GA1 B1 A2 B2 MA1 B1 MA2 B2 = GA1 B1 A2 B2 (MA1 B1 MA2 B2 2B1 A2 MA1 B2 ). The cyclicity of the

last term follows from the analysis above, and by repeatedly using (B.6) we have that

GA1 B1 A2 B2 Trk (F MA1 B1 MA2 B2 ) = GA1 B1 A2 B2 Trk (MA2 B2 F MA1 B1 )

= GA1 B1 A2 B2 Trk (MA1 B1 MA2 B2 F ).

(B.7)

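Here we summarize some relations for the spin-3 Riemann tensor, most notably the Bianchi identities. (On flat space, a very clear discussion of the spin-3 geometry in metric-like formulation can be found in [49], while aspects of a frame-like formulation are given in [32,59,60].) For the proof of the Bianchi identities it will be convenient to work in form language, for which the tensors in (4.6) read

$$
\begin{aligned}
T^{ab} &= \bar D e^{ab} + \omega^{ab,c} \wedge \bar e_c, \\
T^{ab,c} &= \bar D \omega^{ab,c} + 3\, e^{\{ab} \wedge \bar e^{c\}} + \omega^{ab,cd} \wedge \bar e_d, \\
R^{ab,cd} &= \bar D \omega^{ab,cd} + 4\, \omega^{\{ab,c} \wedge \bar e^{d\}}.
\end{aligned}
\tag{C.1}
$$

After solving the torsion constraints the Bianchi identity follows by application of $\bar D$ to the second torsion tensor,

$$ 0 = \bar D T^{ab,c} = \left(\bar D \omega^{ab,cd} + 4\, \omega^{\{ab,c} \wedge \bar e^{d\}}\right) \wedge \bar e_d = R^{ab,cd} \wedge \bar e_d, \tag{C.2} $$

where we used the first torsion constraint, $T^{ab} = 0$, and the relation

$$ \bar D^2 \omega^{ab,c} = \bar R^{a}{}_{d} \wedge \omega^{db,c} + \bar R^{b}{}_{d} \wedge \omega^{ad,c} + \bar R^{c}{}_{d} \wedge \omega^{ab,d}, \tag{C.3} $$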

evaluated for the AdS case (4.16). In components the Bianchi identity (C.2) reads

R[ ] , = 0,

(C.4)

These identities can now be used to prove that all traces of the Riemann tensor are algebraically related, or in other words, as in the spin-2 case there is a unique Ricci tensor. First of all,

the symmetries of the fiber indices according to the window Young tableau imply R a(b,cd) = 0,

which in turn shows that

R ab, c c = 2R a c ,bc = 2R b c ,ac ,

(C.5)

i.e. there is a unique trace in the fiber indices. By virtue of the Bianchi identity (C.4) the trace in

the fiber indices can then be related to the trace between one spacetime and one fiber index:

1

R , = R , .

2

(C.6)

We are now in a position to rigorously derive the field equations used in the main text. First

contracting (4.26) with g yields

(D 2)R , = 0,

(C.7)

where we used (C.4). This in turn implies that the double traces of the Riemann tensor appearing

in (4.26) can be set to zero. The remaining terms can be simplified by making repeated use of the


0 = R , R , + ( )

= R , R , + ( )

= 4R , + R , + R ,

= 3R , .

(C.8)

Here we used in the second line the Bianchi identity in , , , in the third line the window

symmetry of the second term in , , and finally the same symmetry in the fourth line. These

are the final equations of motion, which basically state that the spin-3 Ricci tensor vanishes.

In order to clarify the information contained in this equation, we decompose R , into its

irreducible parts. A priori it can take values in the Young tableaux

(C.9)

The origin of these different structures is that in the frame-like formulation the Riemann tensor

necessarily appears in a mixed basis in the sense that the anti-symmetric 2-form indices are on a

different footing as the frame indices. To compare with the completely symmetric or completely

anti-symmetric basis used in the metric-like formulation in [30] and [49], respectively, we have

to impose these symmetries, i.e., we define

(a)

R; ; =R,

,

(s)

R, = R( ( ), ) .

(C.10)

, depending on the chosen

conventions for symmetrisation or anti-symmetrisation properties), it is easily seen that there is

a unique trace. Explicitly one finds

3

R(a)

;; = 8 R , ,

R(s)

, = R( ), .

2

(C.11)

, while its algebraically related trace

R(a)

is in

. Similarly, the trace of

takes values in

, but interpreted in the anti-symmetric

basis. To summarize, taking the trace in the fiber indices of the Riemann tensor in the mixed basis

corresponds to the Ricci tensor in the completely anti-symmetric basis, while a trace between

spacetime and fiber index corresponds to the Ricci tensor in the completely symmetric basis.

References

[1] D.J. Gross, High-energy symmetries of string theory, Phys. Rev. Lett. 60 (1988) 1229.

[2] J. Isberg, U. Lindstrom, B. Sundborg, G. Theodoridis, Classical and quantized tensionless strings, Nucl. Phys. B 411

(1994) 122, hep-th/9307108.

[3] B. Sundborg, Stringy gravity, interacting tensionless strings and massless higher spins, Nucl. Phys. B (Proc.

Suppl.) 102 (2001) 113, hep-th/0103247.

[4] G. Bonelli, On the tensionless limit of bosonic strings, infinite symmetries and higher spins, Nucl. Phys. B 669

(2003) 159, hep-th/0305155;

G. Bonelli, On the covariant quantization of tensionless bosonic strings in AdS spacetime, JHEP 0311 (2003) 028,

hep-th/0309222.

[5] A. Sagnotti, M. Tsulaia, On higher spins and the tensionless limit of string theory, Nucl. Phys. B 682 (2004) 83,

hep-th/0311257.

[6] J. Engquist, P. Sundell, Brane partons and singleton strings, Nucl. Phys. B 752 (2006) 206, hep-th/0508124.

[7] D. Sorokin, Introduction to the classical theory of higher spins, AIP Conf. Proc. 767 (2005) 172, hep-th/0405069.

[8] S. Weinberg, E. Witten, Limits on massless particles, Phys. Lett. B 96 (1980) 59.

[9] C. Aragone, S. Deser, Consistency problems of spin-2 gravity coupling, Nuovo Cimento B 57 (1980) 33.

[10] S.R. Coleman, J. Mandula, All possible symmetries of the S-matrix, Phys. Rev. 159 (1967) 1251.

[11] X. Bekaert, S. Cnockaert, C. Iazeolla, M.A. Vasiliev, Nonlinear higher spin theories in various dimensions, Lectures given at Workshop on Higher Spin Gauge Theories, Brussels, Belgium, 12–14 May 2004, hep-th/0503128.

[12] S.W. MacDowell, F. Mansouri, Unified geometric theory of gravity and supergravity, Phys. Rev. Lett. 38 (1977) 739;

S.W. MacDowell, F. Mansouri, Unified geometric theory of gravity and supergravity, Phys. Rev. Lett. 38 (1977) 1376, Erratum.

[13] E.S. Fradkin, M.A. Vasiliev, Cubic interaction in extended theories of massless higher spin fields, Nucl. Phys. B 291 (1987) 141.

[14] M.A. Vasiliev, Cubic interactions of bosonic higher spin gauge fields in AdS5, Nucl. Phys. B 616 (2001) 106;

M.A. Vasiliev, Cubic interactions of bosonic higher spin gauge fields in AdS5, Nucl. Phys. B 652 (2003) 407, hep-th/0106200, Erratum.

[15] M.A. Vasiliev, Actions, charges and off-shell fields in the unfolded dynamics approach, Int. J. Geom. Methods Mod. Phys. 3 (2006) 37, hep-th/0504090.

[16] M.A. Vasiliev, Unfolded representation for relativistic equations in (2 + 1) anti-de Sitter space, Class. Quantum Grav. 11 (1994) 649.

[17] M.A. Vasiliev, Triangle identity and free differential algebra of massless higher spins, Nucl. Phys. B 324 (1989) 503.

[18] M.P. Blencowe, A consistent interacting massless higher spin field theory in D = (2 + 1), Class. Quantum Grav. 6 (1989) 443.

[19] E.S. Fradkin, V.Y. Linetsky, A superconformal theory of massless higher spin fields in D = (2 + 1), Mod. Phys. Lett. A 4 (1989) 731;

E.S. Fradkin, V.Y. Linetsky, A superconformal theory of massless higher spin fields in D = (2 + 1), Ann. Phys. 198 (1990) 293.

[20] E. Bergshoeff, M.P. Blencowe, K.S. Stelle, Area preserving diffeomorphisms and higher spin algebra, Commun. Math. Phys. 128 (1990) 213.

[21] O. Hohm, On the infinite-dimensional spin-2 symmetries in Kaluza–Klein theories, Phys. Rev. D 73 (2006) 044003, hep-th/0511165.

[22] O. Hohm, Gauged diffeomorphisms and hidden symmetries in Kaluza–Klein theories, Class. Quantum Grav. 24 (2007) 2825, hep-th/0611347.

[23] E. Witten, (2 + 1)-dimensional gravity as an exactly soluble system, Nucl. Phys. B 311 (1988) 46.

[24] A.H. Chamseddine, Topological gauge theory of gravity in five dimensions and all odd dimensions, Phys. Lett. B 233 (1989) 291.

[25] A.H. Chamseddine, Topological gravity and supergravity in various dimensions, Nucl. Phys. B 346 (1990) 213.

[26] M. Banados, L.J. Garay, M. Henneaux, The local degrees of freedom of higher dimensional pure Chern–Simons theories, Phys. Rev. D 53 (1996) 593, hep-th/9506187.

[27] M. Banados, L.J. Garay, M. Henneaux, The dynamical structure of higher dimensional Chern–Simons theory, Nucl. Phys. B 476 (1996) 611, hep-th/9605159.

[28] C. Fronsdal, Massless fields with integer spin, Phys. Rev. D 18 (1978) 3624.

[29] C. Fronsdal, Singletons and massless, integral spin fields on de Sitter space, Phys. Rev. D 20 (1979) 848.

[30] B. de Wit, D.Z. Freedman, Systematics of higher-spin gauge fields, Phys. Rev. D 21 (1980) 358.

[31] E.S. Fradkin, M.A. Vasiliev, Candidate to the role of higher spin symmetry, Ann. Phys. 177 (1987) 63.

[32] M.A. Vasiliev, Gauge form of description of massless fields with arbitrary spin, Yad. Fiz. 32 (1980) 855 (in Russian).

[33] M.A. Vasiliev, Nonlinear equations for symmetric massless higher spin fields in (A)dS(d), Phys. Lett. B 567 (2003) 139, hep-th/0304049.

[34] A. Sagnotti, E. Sezgin, P. Sundell, On higher spins with a strong Sp(2, R) condition, hep-th/0501156.

[35] E. Sezgin, P. Sundell, Doubletons and 5D higher spin gauge theory, JHEP 0109 (2001) 036, hep-th/0105001.

[36] P. Bieliavsky, M. Bordemann, S. Gutt, S. Waldmann, Traces for star products on the dual of a Lie algebra, math/0202126.

[37] S. Gutt, An explicit product on the cotangent bundle of a Lie group, Lett. Math. Phys. 7 (1983) 249.

[38] J. Madore, S. Schraml, P. Schupp, J. Wess, Gauge theory on noncommutative spaces, Eur. Phys. J. C 16 (2000) 161, hep-th/0001203.

[39] B. Jurco, S. Schraml, P. Schupp, J. Wess, Enveloping algebra valued gauge transformations for non-Abelian gauge

groups on non-commutative spaces, Eur. Phys. J. C 17 (2000) 521, hep-th/0006246.

[40] M.A. Vasiliev, Extended higher spin superalgebras and their realizations in terms of quantum operators, Fortschr.

Phys. 36 (1988) 33.

[41] G. Pinczon, R. Ushirobira, Supertrace and superquadratic Lie structure on the Weyl algebra, and applications to

formal inverse Weyl transform, Lett. Math. Phys. 74 (2005) 263.

[42] J. Zanelli, (Super-)gravities beyond 4 dimensions, hep-th/0206169.

[43] J. Fuchs, C. Schweigert, Symmetries, Lie Algebras and Representations: A Graduate Course for Physicists, Cambridge Univ. Press, 1997.

[44] M.A. Vasiliev, Free massless fields of arbitrary spin in the de Sitter space and initial data for a higher spin superalgebra, Fortschr. Phys. 35 (1987) 741;

M.A. Vasiliev, Free massless fields of arbitrary spin in the de Sitter space and initial data for a higher spin superalgebra, Yad. Fiz. 45 (1987) 1784.

[45] V.E. Lopatin, M.A. Vasiliev, Free massless bosonic fields of arbitrary spin in d-dimensional de Sitter space, Mod.

Phys. Lett. A 3 (1988) 257.

[46] D. Francia, A. Sagnotti, Free geometric equations for higher spins, Phys. Lett. B 543 (2002) 303, hep-th/0207002.

[47] D. Francia, A. Sagnotti, On the geometry of higher-spin gauge fields, Class. Quantum Grav. 20 (2003) S473, hep-th/0212185.

[48] D. Francia, A. Sagnotti, Higher-spin geometry and string theory, J. Phys. Conf. Ser. 33 (2006) 57, hep-th/0601199.

[49] T. Damour, S. Deser, Geometry of spin-3 gauge theories, Ann. Poincaré Phys. Theor. 47 (1987) 277.

[50] X. Bekaert, N. Boulanger, On geometric equations and duality for free higher spins, Phys. Lett. B 561 (2003) 183,

hep-th/0301243.

[51] X. Bekaert, N. Boulanger, Tensor gauge fields in arbitrary representations of GL(D, R). II: Quadratic actions, Commun. Math. Phys. 271 (2007) 723, hep-th/0606198.

[52] I. Bandos, X. Bekaert, J.A. de Azcarraga, D. Sorokin, M. Tsulaia, Dynamics of higher spin fields and tensorial

space, JHEP 0505 (2005) 031, hep-th/0501113.

[53] M. Banados, Constant curvature black holes, Phys. Rev. D 57 (1998) 1068, gr-qc/9703040.

[54] J.D. Edelstein, M. Hassaine, R. Troncoso, J. Zanelli, Lie-algebra expansions, Chern–Simons theories and the Einstein–Hilbert Lagrangian, Phys. Lett. B 640 (2006) 278, hep-th/0605174.

[55] E. Sezgin, P. Sundell, An exact solution of 4D higher-spin gauge theory, Nucl. Phys. B 762 (2007) 1, hep-th/0508158.

[56] V.E. Didenko, A.S. Matveev, M.A. Vasiliev, BTZ black hole as solution of 3d higher spin gauge theory, hep-th/0612161.

[57] S.E. Konstein, M.A. Vasiliev, Massless representations and admissibility condition for higher spin superalgebras,

Nucl. Phys. B 312 (1989) 402.

[58] O. Hohm, H. Samtleben, Effective actions for massive Kaluza–Klein states on AdS3 × S3 × S3, JHEP 0505 (2005) 027, hep-th/0503088.

[59] C. Aragone, H. La Roche, Massless second order tetradic spin-3 fields and higher helicity bosons, Nuovo Cimento

A 72 (1982) 149.

[60] C. Aragone, S. Deser, Z. Yang, Massive higher spin from dimensional reduction of gauge fields, Ann. Phys. 179

(1987) 76.

[61] P. Horava, M-theory as a holographic field theory, Phys. Rev. D 59 (1999) 046004, hep-th/9712130.

[62] P. Horava, C.A. Keeler, Strings on AdS2 and the high-energy limit of noncritical M-Theory, arXiv:0704.2230 [hep-th].

Two-loop fermionic corrections to massive Bhabha scattering

Stefano Actis a,*, Michał Czakon b,c, Janusz Gluza d, Tord Riemann a

a Deutsches Elektronen-Synchrotron, DESY, Platanenallee 6, D-15738 Zeuthen, Germany

b Institut für Theoretische Physik und Astrophysik, Universität Würzburg, Am Hubland, D-97074 Würzburg, Germany

c Institute of Nuclear Physics, NCSR DEMOKRITOS, 15310 Athens, Greece

d Institute of Physics, University of Silesia, Uniwersytecka 4, PL-40007 Katowice, Poland

Available online 4 July 2007

Abstract

We evaluate the two-loop corrections to Bhabha scattering from fermion loops in the context of pure quantum electrodynamics. The differential cross section is expressed by a small number of master integrals with exact dependence on the fermion masses $m_e$, $m_f$ and the Mandelstam invariants s, t, u. We determine the limit of fixed scattering angle and high energy, assuming the hierarchy of scales $m_e^2 \ll m_f^2 \ll s, t, u$. The numerical result is combined with the available non-fermionic contributions. As a by-product, we provide an independent check of the known electron-loop contributions.

© 2007 Elsevier B.V. All rights reserved.

1. Introduction

Bhabha scattering is one of the processes at $e^+e^-$ colliders with the highest experimental precision and represents an important monitoring process. A notable example is its expected role for the luminosity determination at the future International Linear Collider (ILC) by measuring small-angle Bhabha-scattering events at center-of-mass energies ranging from about 100 GeV (Giga-Z collider option) to several TeV. Moreover, the large-angle region is relevant at colliders operating at 1–10 GeV. For some applications a full two-loop calculation of the QED contributions is mandatory.¹

* Corresponding author.

0550-3213/$ - see front matter © 2007 Elsevier B.V. All rights reserved.

doi:10.1016/j.nuclphysb.2007.06.023

A large class of QED two-loop corrections was determined in the seminal work of [2]. Later, the complete two-loop corrections in the limit of zero electron mass were obtained in [3] thanks to the fundamental results of [4,5]. However, this result cannot be immediately applied, since the available Monte Carlo programs (see, e.g., [6–13]) employ a small, but non-vanishing electron mass. The $\alpha^2 \ln(s/m_e^2)$ terms due to double boxes were derived from [3] by the authors of [14], and the close-to-complete two-loop result in the ultra-relativistic limit was finally obtained in [15,16]. Note that the diagrams with fermion loops have not been covered by this approach. The virtual and real components involving electron loops could be added exactly in [17,18]. The non-approximated analytical expressions for all two-loop corrections, except for double-box diagrams and for those with loops from heavier-fermion generations, can be found in [19]. For a comprehensive investigation of the full set of the massive two-loop QED corrections, including double-box diagrams, we refer to [20–24]. The evaluation of the contributions from massive non-planar double-box diagrams remains open so far.

In order to add another piece to the complete two-loop prediction for the Bhabha-scattering cross section in QED, we evaluate here the so-far lacking diagrams containing heavy-fermion loops. The cross-section correction is expressed by a small number of scalar master integrals, where the exact dependence on the masses of the fermions and the Mandelstam variables s, t and u is retained. In a next step, we assume a hierarchy of scales, $m_e^2 \ll m_f^2 \ll s, t, u$, where $m_e$ is the electron mass and $m_f$ is the mass of a heavier fermion. We derive explicit results neglecting terms suppressed by positive powers of $m_e^2/m_f^2$, $m_e^2/x$ and $m_f^2/x$, where $x = s, t, u$. This high-energy approximation describes the influence of muons and τ leptons and proves well-suited for practical applications. In addition, we provide an independent cross-check of the exact analytical results of [17] (we used the files provided at [25] for comparison) for $m_f = m_e$.

The article is organized as follows. In Section 2 we introduce our notations and outline the

calculation and in Section 3 we discuss the solution for each class of diagrams. In Section 4

we reproduce the complete result for the corrections from heavier fermions in analytic form and

perform the numerical analysis. Section 5 contains the summary, and additional material on the

master integrals is collected in Appendix A.

2. Expansion of the cross section

We consider the Bhabha-scattering process,

$$e^-(p_1) + e^+(p_2) \to e^-(p_3) + e^+(p_4), \tag{2.1}$$

and introduce the Mandelstam invariants s, t and u,

$$s = (p_1 + p_2)^2 = 4E^2, \tag{2.2}$$

$$t = (p_1 - p_3)^2 = -4\left(E^2 - m_e^2\right)\sin^2\frac{\theta}{2}, \tag{2.3}$$

$$u = (p_1 - p_4)^2 = -4\left(E^2 - m_e^2\right)\cos^2\frac{\theta}{2}, \tag{2.4}$$

where $m_e$ is the electron mass, E is the incoming-particle energy in the center-of-mass frame and $\theta$ is the scattering angle. In addition, $s + t + u = 4m_e^2$.

¹ Note that leading two-loop effects in the electroweak Standard Model were already incorporated in [1].
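As an illustration of Eqs. (2.2)–(2.4), the following minimal Python sketch computes the invariants from the beam energy and the scattering angle and checks the constraint $s + t + u = 4m_e^2$. The function name `mandelstam` and the numerical value used for the electron mass are assumptions made for this example only.

```python
import math

M_E = 0.000510999  # electron mass in GeV (assumed value, not taken from the text)

def mandelstam(E, theta, m=M_E):
    """Return (s, t, u) for beam energy E [GeV] and scattering angle theta [rad]."""
    p2 = E * E - m * m                           # squared c.m. three-momentum
    s = 4.0 * E * E                              # Eq. (2.2)
    t = -4.0 * p2 * math.sin(theta / 2.0) ** 2   # Eq. (2.3)
    u = -4.0 * p2 * math.cos(theta / 2.0) ** 2   # Eq. (2.4)
    return s, t, u

# Example: sqrt(s) = 10 GeV at a scattering angle of 90 degrees.
s, t, u = mandelstam(5.0, math.radians(90.0))
print(s, t, u, s + t + u, 4.0 * M_E ** 2)
```

The sum of the three invariants printed in the last line should agree with $4m_e^2$ up to rounding.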

[Figure 1: diagram classes 1a, 1b and 1c]

Fig. 1. Classes of Bhabha-scattering one-loop diagrams. A thin fermion line represents an electron, a thick one can be any fermion. The full set of graphs can be obtained through proper permutations. We refer to [26] for the reproduction of the full set of graphs.

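In the kinematical region $m_e^2 \ll s, t, u$ the leading-order (LO) differential cross section with respect to the solid angle $\Omega$ reads as

$$\frac{d\sigma^{\mathrm{LO}}}{d\Omega} = \frac{\alpha^2}{s}\left\{\frac{1}{s^2}\left[\frac{s^2}{2} + t^2 + st\right] + \frac{1}{t^2}\left[\frac{t^2}{2} + s^2 + st\right] + \frac{1}{st}(s + t)^2\right\}, \tag{2.5}$$

where $\alpha$ is the fine-structure constant. At higher orders in perturbation theory we write an expansion in $\alpha$,

$$\frac{d\sigma}{d\Omega} = \frac{d\sigma^{\mathrm{LO}}}{d\Omega} + \left(\frac{\alpha}{\pi}\right)\frac{d\sigma^{\mathrm{NLO}}}{d\Omega} + \left(\frac{\alpha}{\pi}\right)^2\frac{d\sigma^{\mathrm{NNLO}}}{d\Omega} + \mathcal{O}\left(\alpha^5\right). \tag{2.6}$$

Here $d\sigma^{\mathrm{NLO}}$ and $d\sigma^{\mathrm{NNLO}}$ summarize the next-to-leading order (NLO) and next-to-next-to-leading order (NNLO) corrections to the differential cross section. In the following it will be understood that we consider only components generated by diagrams containing one or two fermion loops.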
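The LO expression of Eq. (2.5) is easy to evaluate numerically. The sketch below implements it and compares it with the familiar closed form of the massless Bhabha cross section in terms of the scattering angle, $\alpha^2 (3 + \cos^2\theta)^2 / [4s(1 - \cos\theta)^2]$. The function names and the numerical constants (fine-structure constant, unit conversion) are assumptions introduced for this illustration.

```python
import math

ALPHA = 1.0 / 137.035999     # fine-structure constant (assumed value)
GEV2_TO_NB = 0.3893794e6     # conversion 1 GeV^-2 -> nanobarn (assumed value)

def dsigma_lo(s, t, alpha=ALPHA):
    """LO cross section dsigma/dOmega in GeV^-2, massless limit, cf. Eq. (2.5)."""
    term_s = (0.5 * s * s + t * t + s * t) / (s * s)    # s-channel squared
    term_t = (0.5 * t * t + s * s + s * t) / (t * t)    # t-channel squared
    term_st = (s + t) ** 2 / (s * t)                    # s-t interference
    return alpha ** 2 / s * (term_s + term_t + term_st)

def dsigma_lo_angle(s, cos_theta, alpha=ALPHA):
    """Same quantity written in terms of the scattering angle, for comparison."""
    return alpha ** 2 / (4.0 * s) * ((3.0 + cos_theta ** 2) / (1.0 - cos_theta)) ** 2

s = 10.0 ** 2                        # sqrt(s) = 10 GeV
for deg in (3.0, 90.0):
    c = math.cos(math.radians(deg))
    t = -0.5 * s * (1.0 - c)         # massless kinematics, t = -s (1 - cos(theta)) / 2
    print(deg, dsigma_lo(s, t) * GEV2_TO_NB, dsigma_lo_angle(s, c) * GEV2_TO_NB)
```

Both columns should coincide, since the curly bracket of Eq. (2.5) equals $[(x^2 + x + 1)/x]^2$ with $x = t/s$.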

2.1. NLO differential cross section

The NLO term follows from the interference of the one-loop vacuum-polarization diagrams of class 1a (see Fig. 1) with the tree-level amplitude,

$$\frac{d\sigma^{\mathrm{NLO}}}{d\Omega} = \frac{d\sigma^{1a\times\mathrm{tree}}}{d\Omega} = \frac{\alpha^2}{s}\Bigg\{\frac{1}{s^2}\left[\frac{s^2}{2} + t^2 + st\right] 2\sum_f Q_f^2\,\mathrm{Re}\,\Pi_f^{(1)}(s) + \frac{1}{t^2}\left[\frac{t^2}{2} + s^2 + st\right] 2\sum_f Q_f^2\,\mathrm{Re}\,\Pi_f^{(1)}(t) + \frac{1}{st}(s + t)^2\sum_f Q_f^2\,\mathrm{Re}\left[\Pi_f^{(1)}(s) + \Pi_f^{(1)}(t)\right]\Bigg\}. \tag{2.7}$$

Here $\Pi_f^{(1)}(x)$ is the renormalized one-loop vacuum-polarization function and the sum over f runs over the massive fermions, e.g., the electron (f = e), the muon (f = μ), the τ lepton (f = τ). $Q_f$ is the electric-charge quantum number, $Q_f = -1$ for leptons.

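In this paper we will focus on asymptotic expansions in the high-energy limit. In order to fix our normalizations explicitly, we reproduce here the exact result for $\Pi_f^{(1)}(x)$ in dimensional regularization. Adding the counterterm contribution $\Pi_f^{(1)\mathrm{ct}}$ to the unrenormalized one-loop vacuum-polarization function $\Pi_f^{(1)\mathrm{un}}$, we get

$$\Pi_f^{(1)}(x) = \Pi_f^{(1)\mathrm{un}}(x) + \Pi_f^{(1)\mathrm{ct}}(x), \tag{2.8}$$

$$\Pi_f^{(1)\mathrm{un}}(x) = \frac{1}{2(D - 1)}\left\{\frac{1}{x}\,2(D - 2)\,A_0(m_f) - \left[D - 2 + 4\,\frac{m_f^2}{x}\right]B_0(x, m_f)\right\}, \tag{2.9}$$

$$\Pi_f^{(1)\mathrm{ct}}(x) = F_\epsilon\left(\frac{m_e^2}{m_f^2}\right)^{\epsilon}\frac{1}{3}\left(\frac{1}{\epsilon} + \frac{\zeta_2}{2}\,\epsilon\right), \tag{2.10}$$

where $\epsilon = (4 - D)/2$ and D is the number of spacetime dimensions. The normalization factor is

$$F_\epsilon = \left(\frac{m_e^2\,\pi\,e^{\gamma_E}}{\mu^2}\right)^{-\epsilon}, \tag{2.11}$$

$\mu$ is the 't Hooft mass unit and $\gamma_E$ is the Euler–Mascheroni constant. Standard one-loop integrals appearing in Eq. (2.8) are defined by

$$A_0(m) = \frac{\mu^{4-D}}{i\pi^2}\int d^Dk\,\frac{1}{k^2 - m^2}, \tag{2.12}$$

$$B_0\left(p^2, m\right) = \frac{\mu^{4-D}}{i\pi^2}\int d^Dk\,\frac{1}{\left(k^2 - m^2\right)\left[(k + p)^2 - m^2\right]}. \tag{2.13}$$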

Note that master integrals with l lines and an internal scale m were derived in [22,26] setting m = 1. For the present computation we introduce a scaling by a factor $m_f^{D-2l}$ and we get

$$A_0(m_f) = F_\epsilon\left(\frac{m_e^2}{m_f^2}\right)^{\epsilon} m_f^2\;\mathrm{T1l1m}, \tag{2.14}$$

$$B_0(x, m_f) = F_\epsilon\left(\frac{m_e^2}{m_f^2}\right)^{\epsilon}\mathrm{SE2l2m}[x]. \tag{2.15}$$

In the small-mass limit, $A_0$ vanishes (the result for T1l1m can be read in Eq. (4) of [22]), and the one-loop self-energy² reads as

$$\mathrm{SE2l2m}[x] = \frac{1}{\epsilon} + 2 + L_f(x) + \epsilon\left[4 - \frac{\zeta_2}{2} + 2L_f(x) + \frac{1}{2}L_f^2(x)\right], \tag{2.16}$$

where we introduced the short-hand notation for logarithmic functions (in our conventions the logarithm has a cut along the negative real axis),

$$L_f(x) = \ln\left(-\frac{m_f^2}{x + i\delta}\right), \qquad \delta \to 0^+. \tag{2.17}$$
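The $i\delta$ prescription of Eq. (2.17) fixes the imaginary part of the logarithm for time-like invariants. A minimal Python sketch, assuming real x and using the hypothetical helper name `L_f`, makes the convention explicit and compares it with a direct evaluation at a small but finite $\delta$:

```python
import cmath
import math

def L_f(x, m_f):
    """L_f(x) = ln(-m_f^2 / (x + i*delta)) in the limit delta -> 0+, for real x != 0."""
    if x > 0.0:
        # Time-like x: -m_f^2/(x + i*delta) sits just above the negative real axis,
        # so the logarithm picks up an imaginary part +pi.
        return complex(math.log(m_f ** 2 / x), math.pi)
    # Space-like x < 0: the argument is positive and the logarithm is real.
    return complex(math.log(m_f ** 2 / -x), 0.0)

m_mu = 0.105658                      # muon mass in GeV (assumed value)
for x in (100.0, -50.0):
    direct = cmath.log(-m_mu ** 2 / complex(x, 1e-20))   # finite small delta
    print(x, L_f(x, m_mu), direct)
```

For x = s > 0 the imaginary part is $+\pi$, while for x = t, u < 0 the function is real.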

² Here, the argument x of SE2l2m[x] is one of the relativistic invariants s, t, u. This deviates from earlier conventions, where we denoted by x the dimensionless conformal transform of s, t, u. This remark applies also to master integrals in Appendix A.

[Figure 2: diagram classes 2a, 2b, 2c, 2d and 2e]

Fig. 2. Classes of Bhabha-scattering two-loop diagrams containing at least one fermion loop. We use the conventions of Fig. 1. Note that class 2a contains three topologically different subclasses. We refer to [26] for the reproduction of the full set of graphs.

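In this limit the renormalized one-loop vacuum-polarization function becomes

$$\Pi_f^{(1)}(x) = -\frac{F_\epsilon}{3}\left(\frac{m_e^2}{m_f^2}\right)^{\epsilon}\left\{\frac{5}{3} + L_f(x) + \epsilon\left[\frac{28}{9} - \zeta_2 + \frac{5}{3}L_f(x) + \frac{1}{2}L_f^2(x)\right]\right\}. \tag{2.18}$$

Note that the $\mathcal{O}(\epsilon)$ term in Eq. (2.18) is not required for the NLO computation, but it will become relevant at NNLO. Here $\Pi_f^{(1)}(x)$ will be combined with infrared-divergent graphs showing single poles in the $\epsilon$ plane for $\epsilon = 0$. The exact result for $\Pi_f^{(1)}(x)$ is available at [26].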
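The small-mass expansion of the renormalized vacuum-polarization function can be cross-checked with a few lines of series arithmetic. The sketch below assumes the structure described in this subsection: the $A_0$ term is dropped, the unrenormalized function reduces to $-(D-2)/[2(D-1)]$ times the self-energy expansion, and the counterterm is $(1/3)(1/\epsilon + \zeta_2\epsilon/2)$, all with the common factor $F_\epsilon (m_e^2/m_f^2)^\epsilon$ stripped. The dictionary representation of Laurent series and all function names are choices made for this illustration.

```python
ZETA2 = 1.6449340668482264   # zeta(2) = pi^2 / 6

def mul(a, b, max_power=1):
    """Multiply two Laurent series in epsilon stored as {power: coefficient}."""
    out = {}
    for i, ai in a.items():
        for j, bj in b.items():
            if i + j <= max_power:
                out[i + j] = out.get(i + j, 0.0) + ai * bj
    return out

def renormalized_pi(L):
    """Unrenormalized part plus counterterm, through O(epsilon), for a given L_f value."""
    # -(D-2)/(2(D-1)) = -(1-eps)/(3-2eps) = -1/3 + eps/9 + 2 eps^2/27 + ...
    prefactor = {0: -1.0 / 3.0, 1: 1.0 / 9.0, 2: 2.0 / 27.0}
    # One-loop self-energy: 1/eps + 2 + L + eps (4 - zeta2/2 + 2 L + L^2/2)
    se = {-1: 1.0, 0: 2.0 + L, 1: 4.0 - ZETA2 / 2.0 + 2.0 * L + 0.5 * L * L}
    unren = mul(prefactor, se)
    ct = {-1: 1.0 / 3.0, 1: ZETA2 / 6.0}        # (1/3)(1/eps + zeta2 eps / 2)
    return {p: unren.get(p, 0.0) + ct.get(p, 0.0) for p in (-1, 0, 1)}

def expanded_pi(L):
    """Expected result: -(1/3){5/3 + L + eps [28/9 - zeta2 + 5 L/3 + L^2/2]}."""
    return {-1: 0.0,
            0: -(5.0 / 3.0 + L) / 3.0,
            1: -(28.0 / 9.0 - ZETA2 + 5.0 * L / 3.0 + 0.5 * L * L) / 3.0}

L = complex(-9.1, 3.141592653589793)   # sample value of L_f(s) for a time-like s
print(renormalized_pi(L))
print(expanded_pi(L))
```

The pole cancels between the two contributions, and the finite and $\mathcal{O}(\epsilon)$ coefficients of the two printed series should agree.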

2.2. Outline of the NNLO computation

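At NNLO we have to consider:

- The interference of the two-loop diagrams of classes 2a–2e (see Fig. 2) with the tree-level amplitude;
- The interference of the one-loop vacuum-polarization diagrams of class 1a with the full set of graphs of classes 1a–1c (see Fig. 1).

The complete result can be organized as

$$\frac{d\sigma^{\mathrm{NNLO}}}{d\Omega} = \underbrace{\sum_{i=a,\ldots,e}\frac{d\sigma^{2i\times\mathrm{tree}}}{d\Omega}}_{\text{2-loop}\,\times\,\text{tree}} + \underbrace{\sum_{i=a,\ldots,c}\frac{d\sigma^{1a\times 1i}}{d\Omega}}_{\text{1-loop}\,\times\,\text{1-loop}}. \tag{2.19}$$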

In order to compute the NNLO differential cross section we use the following reduction strategy:

- The generation of all the diagrams is simple and has been made with the computer-algebra systems GraphShot [27] and qgraf/DIANA [28–30]. We spin-sum the squared matrix elements and take the traces over Dirac indices in D dimensions using the computer-algebra system FORM [31]. The resulting expressions are combinations of algebraic coefficients depending on s, t, u, $m_e$, $m_f$ and $\epsilon$ and two-loop integrals with scalar products containing the loop momenta in the numerators. An example showing the complexity of the result (two-loop box diagram of class 2e, see Fig. 2) can be found at [26].
- We reduce the loop integrals to a set of master integrals by means of the IdSolver implementation [32] of the Laporta algorithm [33,34]. The complete list of massive Bhabha-scattering master integrals can be found in [22].

Next, we evaluate the master integrals:

- Integrals arising from graphs of classes 1a–1c (Fig. 1), 2a–2c (Fig. 2) and 2d–2e (Fig. 2, with electron loops) have been computed exactly through the method of differential equations in the external kinematic variables and expressed through harmonic polylogarithms [35] or generalized harmonic polylogarithms [36,37]. Here we agree perfectly with the work of [17,25]. Non-approximated results for the various components of the differential cross section are collected in a Mathematica [38] file at [26].
- Integrals generated by the diagrams of classes 2d–2e (Fig. 2, with heavy-fermion loops) are computed through a method based on asymptotic expansions of Mellin–Barnes representations. We derived appropriate Mellin–Barnes representations [39,40] for each master integral and performed an analytic continuation in $\epsilon$ from a range where the integral is regular to the origin of the $\epsilon$ plane [4,5]. This is done by an automatic procedure implemented in the package MB.m [41]. To proceed further, we assume a hierarchy of scales, $m_e^2 \ll m_f^2 \ll s, t, u$, where $f \neq e$. After identifying the leading contributions in the fermion masses (in the same spirit as in [42]), we express the integrals by series over residues, and the latter are summed up analytically in terms of polylogarithms by means of the package XSUMMER [43]. Asymptotic expansions for the master integrals with two different masses were given in [44]. These, together with a few missing expansions of simpler masters needed here, are collected in Appendix A. We refer for a detailed discussion to [24], where the technique was employed to derive approximated results for the massive Bhabha-scattering planar box master integrals. All the mass-expanded masters may also be found in a Mathematica file at [26].
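The mass-expanded results are expressed through logarithms and the classical polylogarithms $\mathrm{Li}_2$ and $\mathrm{Li}_3$. For quick numerical checks these functions can be evaluated from their defining power series. The sketch below is only such an illustration, restricted to real arguments with $|z| < 1$; it is not the summation procedure of XSUMMER, and the function name `polylog` is an assumption of this example.

```python
import math

def polylog(n, z, tol=1e-16, max_terms=100000):
    """Li_n(z) from the power series sum_{k>=1} z^k / k^n, valid for |z| < 1."""
    if abs(z) >= 1.0:
        raise ValueError("this series sketch only covers |z| < 1")
    total, power = 0.0, 1.0
    for k in range(1, max_terms + 1):
        power *= z                   # z^k
        term = power / k ** n
        total += term
        if abs(term) < tol:          # terms decrease geometrically for |z| < 1
            break
    return total

# Known closed forms at z = 1/2 serve as a check.
ZETA3 = 1.2020569031595943
ln2 = math.log(2.0)
print(polylog(2, 0.5), math.pi ** 2 / 12.0 - ln2 ** 2 / 2.0)
print(polylog(3, 0.5), 7.0 * ZETA3 / 8.0 - math.pi ** 2 * ln2 / 12.0 + ln2 ** 3 / 6.0)
```

Arguments outside the unit disk would require the usual inversion relations, which are not covered here.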

2.3. Renormalization

In the following we will always deal with ultraviolet-renormalized quantities. After regularizing the theory using dimensional regularization [45,46], we perform renormalization in the on-mass-shell scheme. Here we relate all free parameters to physical observables:

- The electric charge coincides with the value of the electromagnetic coupling, as measured in Thomson scattering, at all orders in perturbation theory;
- The squared fermion masses are identified with the real parts of the poles of the Dyson-resummed propagators;
- Finally, field-renormalization constants are chosen in order to cancel external wave-function corrections.

Counterterm-dependent Feynman rules are shown in Fig. 3. Note that the presence of infrared divergences at NNLO requires computing one-loop counterterms including $\mathcal{O}(\epsilon)$ terms.

[Figure 3: counterterm insertions for the photon propagator, the fermion propagator and the fermion-photon vertex]

Fig. 3. Counterterm-dependent Feynman rules relevant for Bhabha scattering for i = 1 (one loop) and i = 2 (two loops). Note that in the on-mass-shell scheme $e^2 = 4\pi\alpha$ at all orders in perturbation theory.

The one-loop counterterms read as

F 2 m2e 1

Q

+

Z1 =

2 ,

f

2

12 2

m2f

f

2

F

3

3

1

1

2 me

= Zm

=

Q

+

4

+

8

+

,

Zff

2

2

16 2 f m2f

1

1

= Zff

,

Zff

(2.20)

(2.21)

(2.22)

where the last equation follows from the U(1) QED Ward identity. In the ultrarelativistic limit, the

one-loop fermion-mass counterterm is not needed, since it is always multiplied by the fermion

mass. Note however that the same counterterm is relevant for the exact computation.

2.3.2. Two-loop counterterms

At the two-loop level we get

Z2 =

F 2 4 m2e 2 1 15

Q

+

,

f

2

128 4

m2f

(2.23)

Z2 ee

2 2

F 2

947

5

1

1

2 me

=

Qf

+

162 +

.

36

2 12

128 4 2

m2f

(2.24)

f =e

The result of Eq. (2.24) is obtained including just fermion-loop diagrams and neglecting $\mathcal{O}(m_e^2/m_f^2)$ terms for $f \neq e$. The expression of Eq. (2.23) (as well as the one-loop counterterms of Eqs. (2.20)–(2.22)), instead, is exact, since it follows from the single-scale diagrams of classes 2a–2b of Fig. 2. Finally, we observe that the two-loop counterterm with two fermion lines is not required, since the use of an on-mass-shell renormalization removes external wave-function factors.

3. Two-loop corrections

In this section we show our approximated results for all the components of the NNLO differential cross section of Eq. (2.6). Our short-hand notation for logarithmic functions can be found in Eq. (2.17). In addition, we define the kinematical functions

$$v_1(x, y; \epsilon) = x^2 + 2y^2 + 2xy - \epsilon\,x^2, \tag{3.1}$$

$$v_2(x, y; \epsilon) = (x + y)^2 - \epsilon\left(x^2 + y^2 + xy\right), \tag{3.2}$$

where $x(y) = s, t, u$. Note that for $\epsilon = 0$ these functions are proportional to the kinematical factors appearing in the Born cross section of Eq. (2.5) and the NLO corrections of Eq. (2.7). Moreover, we introduce a compact notation which will prove useful in discussing box corrections in Section 3.3 and the complete NNLO differential cross section in Section 4,

$$L(R_f) = \ln\left(\frac{m_e^2}{m_f^2}\right). \tag{3.3}$$

3.1. Vacuum-polarization corrections

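The interference of the vacuum-polarization diagrams of classes 2a and 2b with the tree-level amplitude can be written as

$$\frac{d\sigma^{2i\times\mathrm{tree}}}{d\Omega} = \frac{\alpha^2}{s}\left\{\frac{1}{s^2}\,v_1(s, t; 0)\,A^{2i}(s) + \frac{1}{t^2}\,v_1(t, s; 0)\,A^{2i}(t) + \frac{1}{st}\,v_2(s, t; 0)\left[A^{2i}(s) + A^{2i}(t)\right]\right\}, \qquad i = a, b. \tag{3.4}$$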
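The relation between the kinematical functions $v_1$, $v_2$ at $\epsilon = 0$ and the Born-level factors can be verified directly. In the following sketch (function names are assumptions of this example) a constant value a is inserted for both $A(s)$ and $A(t)$; the bracket of the vacuum-polarization interference then reduces to 2a times the bracket of the LO cross section.

```python
def v1(x, y, eps=0.0):
    """v1(x, y; eps) = x^2 + 2 y^2 + 2 x y - eps x^2."""
    return x * x + 2.0 * y * y + 2.0 * x * y - eps * x * x

def v2(x, y, eps=0.0):
    """v2(x, y; eps) = (x + y)^2 - eps (x^2 + y^2 + x y)."""
    return (x + y) ** 2 - eps * (x * x + y * y + x * y)

def born_bracket(s, t):
    """Curly bracket of the LO cross section, without the factor alpha^2 / s."""
    return ((0.5 * s * s + t * t + s * t) / s ** 2
            + (0.5 * t * t + s * s + s * t) / t ** 2
            + (s + t) ** 2 / (s * t))

def vacpol_bracket(s, t, A_s, A_t):
    """Curly bracket of the 2i x tree interference for given values A(s), A(t)."""
    return (v1(s, t) * A_s / s ** 2
            + v1(t, s) * A_t / t ** 2
            + v2(s, t) * (A_s + A_t) / (s * t))

s, t, a = 100.0, -30.0, 0.7
print(vacpol_bracket(s, t, a, a), 2.0 * a * born_bracket(s, t))
```

The two printed numbers should coincide, which reflects $v_1(s,t;0) = 2[s^2/2 + t^2 + st]$ and $v_2(s,t;0) = (s+t)^2$.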

Here we introduced the auxiliary functions $A^{2a}(x)$ and $A^{2b}(x)$, which are expressed through the renormalized one- and two-loop vacuum-polarization functions $\Pi_f^{(1)}(x)$ (see Eq. (2.18)) and $\Pi_f^{(2)}(x)$,

$$A^{2a}(x) = \sum_f Q_f^4\,\mathrm{Re}\,\Pi_f^{(2)}(x), \tag{3.5}$$

$$A^{2b}(x) = \sum_{f_1, f_2} Q_{f_1}^2 Q_{f_2}^2\,\mathrm{Re}\left[\Pi_{f_1}^{(1)}(x)\,\Pi_{f_2}^{(1)}(x)\right], \tag{3.6}$$

where the result for $\Pi_f^{(2)}(x)$ in the small fermion-mass limit reads as

$$\Pi_f^{(2)}(x) = -\frac{5}{24} + \zeta_3 - \frac{1}{4}L_f(x). \tag{3.7}$$

Note that $\mathcal{O}(\epsilon)$ terms in Eq. (3.4) coming from the kinematical coefficients of Eq. (3.1) can be safely neglected, since both $\Pi_f^{(1)}(x)$ and $\Pi_f^{(2)}(x)$ are infrared-finite quantities.

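3.2. Vertex corrections

The contribution of reducible (irreducible) vertex corrections to the NNLO differential cross section can be readily derived from diagrams of classes 2c (2d) in Fig. 2,

$$\frac{d\sigma^{2i\times\mathrm{tree}}}{d\Omega} = 2\,\frac{\alpha^2}{s}\Bigg\{\frac{1}{s^2}\left[v_1(s, t; \epsilon)\,A_V^{2i}(s) + s^2 A_M^{2i}(s)\right] + \frac{1}{t^2}\left[v_1(t, s; \epsilon)\,A_V^{2i}(t) + t^2 A_M^{2i}(t)\right] + \frac{1}{st}\left[v_2(s, t; \epsilon)\left(A_V^{2i}(s) + A_V^{2i}(t)\right) + \frac{3}{2}\left(s^2 A_M^{2i}(s) + t^2 A_M^{2i}(t)\right) + 2st\left(A_M^{2i}(s) + A_M^{2i}(t)\right)\right]\Bigg\}, \qquad i = c, d. \tag{3.8}$$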

3.2.1. Reducible diagrams

The auxiliary functions $A_V^{2c}(x)$ and $A_M^{2c}(x)$ are given by the product of the renormalized one-loop vacuum-polarization function $\Pi_f^{(1)}(x)$ (expanded in Eq. (2.18) including $\mathcal{O}(\epsilon)$ terms) and the renormalized one-loop vector and magnetic vertex form factors $F_V^{(1)}(x)$ and $F_M^{(1)}(x)$,

$$A_I^{2c}(x) = \sum_f Q_f^2\,\mathrm{Re}\left[F_I^{(1)}(x)\,\Pi_f^{(1)}(x)\right], \qquad I = V, M. \tag{3.9}$$

In the high-energy limit the vector form factor reads

$$F_V^{(1)}(x) = -\frac{1}{2\epsilon}\left[1 + L_e(x)\right] - 1 + \frac{1}{2}\zeta_2 - \frac{3}{4}L_e(x) - \frac{1}{4}L_e^2(x), \tag{3.10}$$

whereas $F_M^{(1)}(x)$ vanishes when we neglect the electron mass, $F_M^{(1)}(x) = 0$. The renormalized one-loop vertex develops an infrared divergence, which shows up as a single pole in the $\epsilon$ plane for $\epsilon = 0$. Therefore, when computing the cross section, we sum the squared matrix element over the spins and we evaluate the traces over Dirac indices in $D = 4 - 2\epsilon$ dimensions. The needed kinematical structures include $\mathcal{O}(\epsilon)$ terms (see Eq. (3.1)).

3.2.2. Irreducible diagrams

The renormalized two-loop vertex diagrams of class 2d are free of infrared divergences. Therefore, we can neglect $\mathcal{O}(\epsilon)$ terms in the kinematical coefficients of Eq. (3.1) appearing in Eq. (3.8), setting $v_a(x, y; \epsilon) = v_a(x, y; 0)$, for a = 1, 2. The auxiliary functions $A_V^{2d}(x)$ and $A_M^{2d}(x)$ contain the renormalized two-loop vector and magnetic vertex form factors (see [47–49] for a detailed discussion),

$$A_I^{2d}(x) = \sum_f Q_f^2\,\mathrm{Re}\,F_{I,f}^{(2)}(x), \qquad I = V, M. \tag{3.11}$$

For the case with an electron loop, $F_{I,e}^{(2)}(x)$, the exact results in terms of harmonic polylogarithms can be readily expanded in the high-energy limit. For the vector term we get

$$F_{V,e}^{(2)}(x) = \frac{1}{4}\left(\frac{383}{27} - \zeta_2\right) + \frac{1}{6}\left(\frac{265}{36} + \zeta_2\right)L_e(x) + \frac{19}{72}L_e^2(x) + \frac{1}{36}L_e^3(x). \tag{3.12}$$

For $F_{V,f}^{(2)}(x)$, $f \neq e$, we perform an asymptotic expansion of the master integrals arising in the computation (see Table V in [22]) and we fully agree with the result of [50],

$$F_{V,f}^{(2)}(x) = \frac{1}{6}\left(\frac{3355}{216} + \frac{19}{6}\zeta_2 - 2\zeta_3\right) + \frac{1}{6}\left(\frac{265}{36} + \zeta_2\right)L_f(x) + \frac{19}{72}L_f^2(x) + \frac{1}{36}L_f^3(x). \tag{3.13}$$

Since collinear logarithms are absent, the logarithmic structure of Eqs. (3.12) and (3.13) is obviously the same.
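The two vector form factors of Eqs. (3.12) and (3.13) can be evaluated numerically for a time-like invariant, where the logarithm acquires an imaginary part $+i\pi$. The sketch below uses assumed values for the lepton masses and hypothetical function names. At $\sqrt{s} = 10$ GeV the real parts come out at about −124.24 for the electron loop and −4.80 for the muon loop, whose magnitudes agree with the corresponding entries 124.237 and 4.8036 listed in Table 3.

```python
import math

ZETA2 = math.pi ** 2 / 6.0
ZETA3 = 1.2020569031595943

def log_timelike(x, m):
    """ln(-m^2 / (x + i*delta)) for x > 0 and delta -> 0+."""
    return complex(math.log(m * m / x), math.pi)

def f2_vector_electron(x, m_e):
    """Two-loop vector form factor with an electron loop, high-energy limit."""
    l = log_timelike(x, m_e)
    return (0.25 * (383.0 / 27.0 - ZETA2)
            + (265.0 / 36.0 + ZETA2) / 6.0 * l
            + 19.0 / 72.0 * l ** 2
            + l ** 3 / 36.0)

def f2_vector_heavy(x, m_f):
    """Two-loop vector form factor with a heavier-fermion loop, high-energy limit."""
    l = log_timelike(x, m_f)
    return ((3355.0 / 216.0 + 19.0 / 6.0 * ZETA2 - 2.0 * ZETA3) / 6.0
            + (265.0 / 36.0 + ZETA2) / 6.0 * l
            + 19.0 / 72.0 * l ** 2
            + l ** 3 / 36.0)

M_E, M_MU = 0.000510999, 0.105658   # GeV, assumed values
s = 10.0 ** 2                        # sqrt(s) = 10 GeV
print(f2_vector_electron(s, M_E).real, f2_vector_heavy(s, M_MU).real)
```

The two functions differ only in their constant terms, in line with the remark on the common logarithmic structure.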

3.3. Box corrections

The contribution of the renormalized two-loop box diagrams of class 2e is given by

$$\frac{d\sigma^{2e\times\mathrm{tree}}}{d\Omega} = \frac{\alpha^2}{2s}\left[\frac{1}{s}A_1^{2e\times\mathrm{tree}}(s, t) + \frac{1}{t}A_2^{2e\times\mathrm{tree}}(s, t)\right]. \tag{3.14}$$

Here the auxiliary functions can be conveniently expressed through three independent form factors $B_{I,f}^{(2)}(x, y)$, where I = A, B, C,

$$A_1^{2e\times\mathrm{tree}}(s, t) = F_\epsilon^2\sum_f Q_f^2\,\mathrm{Re}\left[B_{A,f}^{(2)}(s, t) + B_{B,f}^{(2)}(t, s) + B_{C,f}^{(2)}(u, t) - B_{B,f}^{(2)}(u, s)\right], \tag{3.15}$$

$$A_2^{2e\times\mathrm{tree}}(s, t) = F_\epsilon^2\sum_f Q_f^2\,\mathrm{Re}\left[B_{B,f}^{(2)}(s, t) + B_{A,f}^{(2)}(t, s) - B_{B,f}^{(2)}(u, t) + B_{C,f}^{(2)}(u, s)\right]. \tag{3.16}$$

For the case with an electron loop, $B_{I,e}^{(2)}(x, y)$, we get exact results in terms of harmonic polylogarithms and generalized harmonic polylogarithms. An asymptotic expansion in the limit $m_e^2 \ll s, t, u$ leads to

5

2 17

1 2 x2

1 x2

(2)

BA,e (x, y) =

+ 2x + y

+ Le (y) Le (x) +

+ 202

3 y

3

3 y

3 3

41

1

23

+2

2 Le (x) 2

+ 82 Le (y) L2e (y) + 8Le (x)Le (y)

9

3

6

y

y

5 3

2

2

ln 1 +

Le (y) + 4Le (x)Le (y) 62 + ln

3

x

x

y

y

242

x

2 34

y

Li2

+ 2 Li3

+

+ 72 +

Le (x)

2 ln

x

x

x

3

3 3

9

5

1

4

3

3

y

y

1 3

ln 1 +

+ 2 Le (x) L3e (y) + 3Le (x)L2e (y) 2 62 + ln2

3

x

x

y

y

y

130

y

2 17

4 ln

Li2

+ 4 Li3

+

+ 112 +

Le (x)

x

x

x

3

3 3

9

5

5

1

6(1 + 22 )Le (y) + L2e (x) L2e (y) + 4Le (x)Le (y) + L3e (x)

3

2

3

y

y

ln 1 +

+ 3Le (x)L2e (y) L3e (y) 62 + ln2

x

x

y

y

y

2 ln

(3.17)

Li2

+ 2 Li3

,

x

x

x

(2)

BB,e

(x, y) =

17

1 2 x2

1 x2 4

5

2 + 2x + y

202

+ Le (y) Le (x) +

3

y

3

3 y 3

3

1

23 2

56

2 Le (x) 4

+ 82 Le (y)

L (y) 20Le (x)Le (y)

+4

9

3

3 e

y

5 3

2

2 y

ln 1 +

2 Le (y) 4Le (x)Le (y) 2 62 + ln

3

x

x

36

y

x

2 34

y

y

272

4 ln

Li2

+ 4 Li3

+

+ 72 +

Le (x)

x

x

x

3

3 3

9

5

1

4

+ 62 Le (y) + 13L2e (x) + 40Le (x)Le (y) 16L2e (y)

3

3

1 3

y

3

2

2 y

+ 2 Le (x) Le (y) + 3Le (x)Le (y) 2 62 + ln

ln 1 +

3

x

x

y

y

2 17

y

y

130

4 ln

Li2

+ 4 Li3

+

+ 112 +

Le (x)

x

x

x

3

3 3

9

5

5

1

6(1 + 22 )Le (y) + L2e (x) L2e (y) + 4Le (x)Le (y) + L3e (x)

3

2

3

y

y

L3e (y) + 3Le (x)L2e (y) 62 + ln2

ln 1 +

x

x

y

y

y

2 ln

(3.18)

Li2

+ 2 Li3

,

x

x

x

(2)

1 2 x2 5

2

5

+ Le (y) Le (x) + (x + y) + Le (y) Le (x)

3 y 3

3

3

2

1 x 2 17

1

41

+

+ 202 2

2 Le (x) + 2

+ 82 Le (y)

3 y 3 3

9

3

5

23

+ L2e (y) 8Le (x)Le (y) + L3e (y) 4Le (x)L2e (y)

6

3

y

y

y

y

2 y

ln 1 +

+ 2 ln

Li2

2 Li3

.

+ 62 + ln

x

x

x

x

x

(3.19)

BC,e (x, y) =

The list of master integrals used here was given in Table V of [22]. In Appendix A we collect the explicit analytic expressions for them in the ultra-relativistic limit. At variance with the electron-loop case, it is not possible to compute them exactly by means of a basis containing harmonic polylogarithms and generalized harmonic polylogarithms. Therefore, we use the high-energy asymptotic expansion discussed in Section 2.2. The results, expressed by the logarithms of the fermion masses $L(R_f)$ (see Eq. (3.3)), are:

1 2 x2

5

(2)

(x, y) =

+ 2x + y

L(Rf ) + Le (y) Le (x)

BA,f

3 y

3

2

1x

131

25

+

2

102 23 2

62 L(Rf )

3 y

27

9

7 2

1 3

82

4

+ L (Rf ) L (Rf ) +

22 L(Rf ) Le (x)

6

3

9

3

1

1

23

2 + 82 L(Rf ) Le (y)

2L(Rf ) L2e (y)

3

2

6

5 3

2

+ 4 2 L(Rf ) Le (x)Le (y) 4

L (y) Le (x)Le (y)

12 e

37

y

y

y

y

y

ln 1 +

2 ln

Li2

2 Li3

62 + ln

x

x

x

x

x

25

x

262

+

2

92 43 4

32 L(Rf )

3

27

9

7 2

2 3

121 10

10

L(Rf ) Le (x) 2

+ 122

+ L (Rf ) L (Rf ) + 2

3

3

9

3

3

13

16

2L(Rf ) L2e (x)

2L(Rf ) L2e (y)

2L(Rf ) Le (y) +

3

3

17

2

2L(Rf ) Le (x)Le (y) + L3e (x) + 6Le (x)L2e (y) 2L3e (y)

+2

3

3

y

y

y

y

y

2

ln 1 +

4 ln

Li2

+ 4 Li3

2 62 + ln

x

x

x

x

x

y

25

131

7 2

+

2

72 23 2

32 L(Rf ) + L (Rf )

3

27

9

6

1 3

130 10

L (Rf ) +

L(Rf ) Le (x) 6 + 122 3L(Rf ) Le (y)

3

9

3

5

25

2

L(Rf ) Le (x)

L(Rf ) L2e (y)

+

3

6

10

1

L(Rf ) Le (x)Le (y) + L3e (x) L3e (y) + 3Le (x)L2e (y)

+2

3

3

y

y

y

y

y

2

ln 1 +

2 ln

Li2

+ 2 Li3

,

62 + ln

x

x

x

x

x

(3.20)

5

1 2 x2

(2)

BB,f (x, y) =

2 + 2x + y

L(Rf ) + Le (y) Le (x)

3

y

3

50

7

2 x 2 262

202 43

122 L(Rf ) + L2 (Rf )

+

3 y 27

9

6

1 3

112

10

2

22 L(Rf ) Le (x) + 162

L (Rf ) +

3

9

3

3

23

2L(Rf ) L2e (y) + 2 5 2L(Rf ) Le (x)Le (y)

+ L(Rf ) Le (y)

6

y

5 3

2

2 y

L (y) Le (x)Le (y) 62 + ln

ln 1 +

4

12 e

x

x

y

y

y

2 ln

Li2

+ 2 Li3

x

x

x

2x 262

25

7

+

92 43 2

32 L(Rf ) + L2 (Rf )

3 27

9

6

136 13

10

1 3

L(Rf ) Le (x)

+ 122 2L(Rf ) Le (y)

L (Rf ) +

3

9

3

3

13

8

L(Rf ) L2e (x)

L(Rf ) L2e (y)

+

6

3

38

20

+

2L(Rf ) Le (x)Le (y)

3

1 3

y

2

3

2 y

+ Le (x) + 3Le (x)Le (y) Le (y) 62 + ln

ln 1 +

3

x

x

y

2y

131

y

y

2 ln

Li2

+ 2 Li3

+

72 23

x

x

x

3

27

25

7 2

1 3

65 5

L(Rf ) Le (x)

9

12

6

9

3

1

1 5

6 + 122 3L(Rf ) Le (y) +

L(Rf ) L2e (x)

2

2 3

1 25

10

1

L(Rf ) Le (x)Le (y) + L3e (x)

2 6

3

6

1 3

3

1

y

y

Le (y) + Le (x)L2e (y) 32 + ln2

ln 1 +

2

2

2

x

x

y

y

y

ln

(3.21)

Li2

+ Li3

,

x

x

x

(2)

1 2 x2 5

2

5

L(Rf ) + Le (y) Le (x) + (x + y) L(Rf )

3 y 3

3

3

2 x2

25

131

+ Le (y) Le (x) +

+ 102 + 23 +

62 L(Rf )

3 y

27

9

7 2

1 3

41

2

1

L (Rf ) + L (Rf )

2 L(Rf ) Le (x) +

+ 82

12

6

9

3

3

1

23

L(Rf ) Le (y) 2 2 L(Rf ) Le (x)Le (y) +

L(Rf ) L2e (y)

2

12

5 3

1 2 y

y

2

+ Le (y) 2Le (x)Le (y) + 32 + ln

ln 1 +

6

2

x

x

y

y

y

+ ln

(3.22)

Li2

Li3

.

x

x

x

BC,f (x, y) =

In order to study the numerical effects of massive leptons in two-loop box diagrams we consider the interference of the box diagram of class 2e (see Fig. 2) with the s-channel tree-level amplitude,

$$B_{2e,f} = \frac{\alpha^2}{4s^2}\,\mathrm{Re}\,B_{A,f}^{(2)}(s, t), \tag{3.23}$$

where $B_{A,f}^{(2)}$ can be found in Eq. (3.17) for electron loops, and in Eq. (3.20) for $f \neq e$ loops. In Table 1 (Table 2) we show numerical values for the finite part of $B_{2e,f}$ at values of $\sqrt{s}$ typical for meson factories, Giga-Z, ILC, and at two selected small and wide scattering angles, $\theta = 3^\circ$ ($\theta = 90^\circ$).

For comparison we show in Table 3 the real part of the vertex function, see Eq. (3.13). We notice that the contributions of the box diagrams with heavier fermions are not strongly suppressed, and are comparable to those coming from the electron-loop boxes. This is different with respect

Table 1

Numerical values for the finite part of $B_{2e,f}$ of Eq. (3.23) in nanobarns at a scattering angle $\theta = 3^\circ$. The first two entries for the τ lepton are not shown since here the high-energy approximation is not justified (the same consideration applies to the top quark)

$\sqrt{s}$ [GeV]

10

91

500

e [see Eq. (3.17)]

μ [see Eq. (3.20)]

τ [see Eq. (3.20)]

188 758

1635.62

5200.08

1686.88

284.711

130.579

39.5554

Table 2

Numerical values for the finite part of B2e,f of Eq. (3.23) in nanobarns at a scattering angle $\theta = 90^\circ$. The first two entries for the top quark are not shown since here the high-energy approximation is not justified

10

91

500

e [see Eq. (3.17)]

μ [see Eq. (3.20)]

τ [see Eq. (3.20)]

t [see Eq. (3.20)]

143.162

61.3875

10.0105

3.23102

1.79381

0.935319

Table 3

The real part for the vertex form factor, see Eqs. (3.12) and (3.13)

$\sqrt{s}$ [GeV]

10

91

e

124.237

4.8036

254.293

29.1057

2.08719

0.160582

0.0995184

0.0639576

0.00256757

500

400.574

70.1032

13.4901

to the self-energy and vertex corrections and may be traced back to the logarithmic structure of the terms in Eqs. (3.20)–(3.22), where terms of order $L_e^3(x)$ appear (note that the two-loop box master integrals of Eqs. (A.7) and (A.8) of Appendix A show a dependence on $L_e^3(x)$, in contrast to the vertex and self-energy masters with heavy fermion loops). After assembling the box diagrams we see a remaining dependence on $\ln^2(s/m_e^2)$. This is a collinear mass singularity, coming from the external legs of the diagrams, which leads to the fact that the two-loop box corrections from heavier fermions are not numerically suppressed compared to the electron-loop contributions. One may check this easily by evaluating the singularity structure of the corresponding massless box diagram, where only a scale M due to the internal loop exists, and seeing there some $1/\epsilon^2$ terms which are absent in the corresponding self-energy and vertex diagrams.

3.4. Products of one-loop corrections

Finally, we consider the simpler components generated by the interference of one-loop diagrams among themselves. We start with the interference of diagrams of class 1a,

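$$\frac{d\sigma^{1a\times1a}}{d\Omega} = \frac{\alpha^2}{2s}\left\{ \frac{1}{s^2}\, v_1(s,t;0)\, A^{1a\times1a}(s,s) + \frac{1}{t^2}\, v_1(t,s;0)\, A^{1a\times1a}(t,t) + \frac{1}{st}\, v_2(s,t;0) \left[ A^{1a\times1a}(s,t) + A^{1a\times1a}(t,s) \right] \right\}. \qquad (3.24)$$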


Here the auxiliary function $A^{1a\times1a}(x, y)$ contains the product of the renormalized one-loop vacuum-polarization function $\Pi_f^{(1)}(x)$ (see Eq. (2.18)) with its complex conjugate,

$$A^{1a\times1a}(x, y) = \sum_{f_1, f_2} Q_{f_1}^2\, Q_{f_2}^2\, \Pi_{f_1}^{(1)}(x)\, \Pi_{f_2}^{(1)*}(y). \qquad (3.25)$$

d 1a1b

2 1

1a1b

v1 (s, t; )AV

(s, s) + s 2 A1a1b

(s, s)

=2

M

2

d

s s

1

+ 2 v1 (t, s; )A1a1b

(t, t) + t 2 A1a1b

(t, t)

V

M

t

1a1b

1

+

(s, t) + A1a1b

(t, s)

v2 (s, t; ) AV

V

st

3

1a1b

1a1b

+ s 2 AM

(s, t) + t 2 AM

(t, s)

2

1a1b

+ 2st A1a1b

(s,

t)

+

A

(t,

s)

.

M

M

(3.26)

The auxiliary function $A_I^{1a\times1b}(x, y)$ is given by the product of $F_V^{(1)}(x)$ and $F_M^{(1)}(x)$, the renormalized one-loop vector (see Eq. (3.10)) and magnetic (vanishing in the high-energy limit) form factors for the QED vertex, and the complex-conjugate renormalized one-loop vacuum-polarization function $\Pi_f^{(1)}(x)$ (see Eq. (2.18)),

A1a1b

(x, y)

I

(1)

(1)

Q2f Re FI (x) f (y) ,

I = V, M.

(3.27)

1

d 1a1c 2 1 1a1c

(s, t) + A1a1c

(s,

t)

.

=

A1

d

4s s

t 2

(3.28)

(s, t) take the form

Here the auxiliary functions A1a1c

1

2

(1)

(1)

(1)

A1a1c

(s, t) = F

Q2f Re BA (s, t) + BB (t, s) + BC (u, t)

1

f

(1)

(1)

BB (u, s) f (s) ,

(1)

(1)

(1)

(s, t) = F

Q2f Re BB (s, t) + BA (t, s) BB (u, t)

A1a1c

2

f

(1)

(1)

+ BC (u, s) f (t) ,

(s, t) = F

A1a1c

1

(3.29)

(1)

(1)

(1)

Q2f Re BA (s, t) + BB (t, s) + BC (u, t)

(1)

(1)

BB (u, s) f (s) ,

(3.30)

A1a1c

(s, t) = F

2


(1)

(1)

(1)

Q2f Re BB (s, t) + BA (t, s) BB (u, t)

(1)

(1)

+ BC (u, s) f (t) .

(3.31)

$\Pi_f^{(1)}(x)$ is given in Eq. (2.18), and the new functions, in the small-mass limit, read

4 x2

x2

(1)

BA (x, y) =

+ 2x + y Le (x) +

162 + 4Le (x) + 2L2e (y) 4Le (x)Le (y)

y

y

+ 2x 102 + Le (x) + Le (y) L2e (x) + L2e (y) 2Le (x)Le (y)

+ y 102 + 2Le (x) + 2Le (y) L2e (x) + L2e (y) 2Le (x)Le (y) ,

(3.32)

(1)

4 x2

x2

2 + 2x + y Le (x) + 4

82 + L2e (y) 2Le (x)Le (y)

y

y

+ 2x 102 Le (x) + Le (y) L2e (x) + L2e (y) 2Le (x)Le (y)

+ y 102 + 2Le (x) + 2Le (y) L2e (x) + L2e (y) 2Le (x)Le (y) ,

(3.33)

4 x2

x2

Le (x) + 2

82 2Le (x) L2e (y) + 2Le (x)Le (y)

y

y

4(x + y)Le (x).

(3.34)

BB (x, y) =

(1)

BC (x, y) =

For the computation of the non-fermionic corrections these functions are needed up to first order in $\epsilon$, since they are combined with the real emission. However, this higher-order expansion is not relevant here.

4. The net fermionic NNLO differential cross section

In this section we use the results of Section 3 and derive an explicit expression for the NNLO

differential cross section of Eq. (2.19).

Note that the full set of two-loop fermionic virtual corrections to Bhabha scattering represents

an infrared-divergent quantity. In order to obtain a finite quantity, we take into account the real

emission of soft photons3 from the external legs of one-loop fermionic diagrams (class 1a, Fig. 1).

The exact result is available in the literature, see, e.g., Eq. (25) and Appendix A in [18]. Here we

show the high-energy approximation relevant for our computation. We consider events involving

a single soft photon carrying energy $\omega$ in the final state,

$$e^-(p_1) + e^+(p_2) \to e^-(p_3) + e^+(p_4) + \gamma(k), \qquad (4.1)$$

and compute one-loop purely-fermionic corrections. Obviously, these real corrections factorize

and their structure is completely equivalent to the tree-level ones. In complete analogy with Eq.

(2.6) we write

LO 2 NLO

d

d

d

=

+

+ O 5 ,

(4.2)

d

d

d

3 The energy $\omega$ carried by a soft photon in the final state is small with respect to the center-of-mass energy E introduced in Eq. (2.2).


where

dLO

d

dNLO

d

1

1

2 1

v

(s,

t;

)

+

v

(t,

s;

)

+

(s,

t;

)

F , s, t, m2e ,

v

1

1

2

2

2

s 2s

st

2t

(4.3)

(1) 1

(1)

2 1

2

v

(s,

t;

)

Q

Re

(s)

+

v

(t,

s;

)

Q2f Re f (t)

=

1

1

f

f

2

2

s s

t

f

f

1

Q2f Re f(1) (s) + f(1) (t) F , s, t, m2e .

+ v2 (s, t; )

(4.4)

st

f

$\Pi_f^{(1)}(x)$ can be read off from Eq. (2.18) and, at variance with Eqs. (2.5)–(2.7), the kinematical factors introduced in Eq. (3.1) need to be expanded up to $O(\epsilon)$, since the real-emission factor shows an infrared divergence,

t

s

2

t

F , s, t, m2e = F ln

ln

1

+

1

+

ln

s

s

m2e

t

t

s

s

2

2

ln 1 +

+ 2 ln

2 ln + ln

+ ln

s

s

m2e

m2e

s

2

t

t

+ 4 ln

ln

ln 1 +

1

s

s

s

t

t

ln2 1 +

42 + ln2

s

s

t

t

2 Li2

(4.5)

+ 2 Li2 1 +

.

s

s

Summing the virtual contributions of Eq. (2.19) and the real-photon emission of Eq. (4.4), we write the NNLO fermionic corrections to Bhabha scattering as the sum of electron-loop contributions ($d\sigma^{\mathrm{NNLO},e}$) and components arising from heavier fermion loops,

d NNLO,f

d NNLO dNLO d NNLO,e 2 d NNLO,f

Qf

Q4f

+

=

+

+

d

d

d

d

d

2

f =e

f1 ,f2 =e

Q2f1 Q2f2

f =e

d NNLO,2f

d

(4.6)

The double summation over the fermion species arises from the loop-by-loop terms of Eqs. (3.6) and (3.24). Here we do not include the case $f_1 = f_2 = e$, which is incorporated in $d\sigma^{\mathrm{NNLO},e}$. Note also the term proportional to $Q_f^4$, coming from Eq. (3.5). The result for electron loops can be found in Eq. (46) of [18]. For heavier fermion loops we introduce $x = -t/s$ and get:

4

5

2 (1 x + x 2 )2

d NNLO,f

s

)

+

4

=

ln

+

ln(R

f

3

d

2s

6

x2

m2e

1

3

3 x

+ ln(x) 2

+

,

2x 2 2

x

(4.7)


d NNLO,2f 2 (1 x + x 2 )2 2 s

ln

+ ln(Rf1 ) ln(Rf2 )

=

d

s

3x 2

m2e

5

s

10

5

ln(R

ln(R

+ ln

)

+

ln(R

)

)

+

ln(R

)

f1

f2

f1

f2

3

3

3

m2e

1

7 x

2 2

1

4

+ ln2 (x) 2

+

+

5 + 4x 2x 2

3

3x 6 3

3 x

x

1

s

1 x

10

1

+ ln(x) ln(Rf1 ) + ln(Rf2 )

+ 2 ln

+

,

3

m2e

3x 2 2x 2 6

(4.8)

2

d NNLO,f

2 NNLO,f2

2

NNLO,f2

,

=

1

+ 2

ln

d

s

s

2

1NNLO,f =

(4.9)

1 3 s

s

55

(1 x + x 2 )2

3

2

+

ln

ln

ln(Rf )

(R

)

+

ln

f

3

6

3x 2

m2e

m2e

589 37

s

+

ln(Rf ) ln2 (Rf )

+ ln(1 x) ln(x) + ln

18

3

m2e

2 ln(Rf ) ln(x) ln(1 x) 8 Li2 (x)

19 2

4795 409

ln(Rf ) +

ln (Rf )

108

18

6

40

Li2 (x)

ln2 (Rf ) ln(x) ln(1 x) 8 ln(Rf ) Li2 (x) +

3

2

11 23

16 2

4

1

s

2

+

ln

+

x

+

x

+

(x)

2

+ ln

2

2

2

3x

2

3

3

me

3x

3x

5

x

2 2

5

11

2

17

2

11

+ x + ln2 (1 x) 2 +

+ x x2

+

12x 4 12 3

6x 2

6

3

3x

55

1 5

4 2

4

83 65

2

+ x x + ln(x)

+

+ ln(x) ln(1 x)

3

6

3x 2 3x 2 3

9x 2 9x

1

10 2

10

31

10 2

31

85

10 + x x

x + x + ln(1 x) 2 +

18

9

3

6x

6

3

3x

2

1

11 x x

1

1

31

1

1

+ ln3 (x) 2 +

+

+ ln3 (1 x) 2 +

3

12x

6

6

3

3

x

3x

3x

x2

4

x2

1

1

4

2

+ ln (x) ln(1 x) 2 +

+x

+x

3

3

3x 3

3

3x

1

2

55

46

1

7

x

+ ln2 (x)

+ ln(x) ln2 (1 x) 2 + +

2

3

x 4 2

9x

x

18x

10 2

5

x

2 2

1

17

14 4

x x + ln(Rf ) 2 +

+ x

+

3

9

9

12x 4 12 3

3x

10

10

29 9 29

2

11

+ x + x 2 + ln(Rf ) 2 +

+ ln2 (1 x)

2

9x 2

9

9

6x

9x

3x

+


5 11

37

2 2

1 25

10

+ x x

+ x

+ ln(x) ln(1 x) 2 +

2

6

3

18x 2

9

9x

20 2

2

4

1 5

4 2

+ x + ln(Rf )

+ x x

9

3

3x 2 3x 2 3

589

1753 701 925

56

+ ln(x)

+

+

x x2

36

108

27

54x 2 108x

4

19

2

+ Li2 (x) 2 +

7 + 3x x 2

3x

3

x

37

56 47 67

4 1

10 2

2

+ ln(Rf )

+

x + x + 2 2 +

6

18

9

x 6

9x 2 9x

3x

10

161 56 161

56

56

2

x + 2x

x + x2

+ ln(1 x)

2

3

54x

9

54

27

27x

10

31

20

10 31

10 2

2

+ ln(Rf ) 2 +

+ x x + 2 2 +

18x

3

18

9

3x

9x

x

32 20

4

7

5

2

+ x 2x 2 + Li3 (x)

+ 3 x + x2

3

3

3

3

3x 2 3x

2

1

1

13

43

19

+ S1,2 (x) 2 + x + x 2 + 2

3

x

3

x

9x 2 18x

311

2

4

98 2

11 23

16 2

+

x x + ln(Rf ) 2 +

+

x+ x

18

9

3x

2

3

3

3x

3

11

4

+ 3 2 + 5 + x 2x 2 ,

x

3

3x

2

2NNLO,f =

(4.10)

8 (1 x + x 2 )2

8

s

s

2

ln

+

ln

+ ln(Rf )

3

3

x2

m2e

m2e

5

ln(Rf ) 1 + ln(1 x)

ln(1 x) + ln(x) ln(Rf ) +

3

s

7

2

1

5

2 2

4

2

+ 4 ln

+ 3 x + x + ln (x)

ln(x)

2

2

2

3x

3

3

x

me

3x

3x

1

2

1

1

+1 x

+ 1 x ln(x) ln(1 x)

3

3

3x 2 x

1

29

16

23

10

ln(x)

(4.11)

+ 13 x + x 2 .

3

3

3

3x 2 3x

$$S_{n,p}(y) = \frac{(-1)^{n+p-1}}{(n-1)!\,p!} \int_0^1 \frac{dx}{x}\, \ln^{n-1}(x)\, \ln^{p}(1 - xy). \qquad (4.12)$$
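The Nielsen polylogarithm $S_{n,p}(y)$ is conventionally defined as $\frac{(-1)^{n+p-1}}{(n-1)!\,p!}\int_0^1 \frac{dx}{x}\ln^{n-1}(x)\ln^p(1-xy)$. The sketch below evaluates it numerically for $y < 1$ and checks it against the polylogarithm series, using the known special cases $S_{1,1} = \mathrm{Li}_2$ and $S_{2,1} = \mathrm{Li}_3$. The substitution $x = e^{-u}$, the midpoint rule, and the function names are choices made here for illustration, not the evaluation method used for the results in the text.

```python
import math

def nielsen_s(n, p, y, upper=60.0, steps=60000):
    """Midpoint-rule estimate of the Nielsen polylogarithm S_{n,p}(y), y < 1.

    With x = exp(-u): dx/x = -du and ln(x) = -u, so the integral over
    x in (0, 1) becomes an integral over u in (0, infinity), cut at `upper`.
    """
    if n < 1 or p < 1:
        raise ValueError("n and p must be positive integers")
    if y >= 1.0:
        raise ValueError("this sketch only handles y < 1")
    prefactor = (-1) ** (n + p - 1) / (math.factorial(n - 1) * math.factorial(p))
    h = upper / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        total += (-u) ** (n - 1) * math.log1p(-y * math.exp(-u)) ** p
    return prefactor * total * h

def polylog_series(order, y, terms=200):
    """Li_order(y) from its defining power series, valid for |y| < 1."""
    return sum(y ** k / k ** order for k in range(1, terms + 1))

print(nielsen_s(1, 1, 0.5), polylog_series(2, 0.5))  # S_{1,1} against Li_2
print(nielsen_s(2, 1, 0.5), polylog_series(3, 0.5))  # S_{2,1} against Li_3
print(nielsen_s(1, 2, 0.5))                          # S_{1,2}, which enters Eq. (4.10)
```

The exponential substitution removes the logarithmic endpoint behaviour at $x = 0$, so a plain midpoint rule is adequate here.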

In Table 4 (Table 5) we show numerical values for the NNLO corrections to the differential cross section for a scattering angle $\theta = 3^\circ$ ($\theta = 90^\circ$). In both tables we set $\omega = E/10$. Finally, in


Table 4

Numerical values for the NNLO corrections to the differential cross section with respect to the solid angle. Results are expressed in nanobarns for a scattering angle $\theta = 3^\circ$. Empty entries are related to cases where the high-energy approximation cannot be applied

√s [GeV]                          10          91          500
LO QED [Eq. (2.5)]                440873      5323.91     176.349
LO Zfitter [51,52]                440875      5331.5      176.283
NNLO (e) [Eq. (4.6)]              1397.35     35.8374     1.88151
NNLO (e + μ) [Eq. (4.6)]          1394.74     43.1888     2.41643
NNLO (e + μ + τ) [Eq. (4.6)]                              2.55179
NNLO photonic [14,16]             9564.09     251.661     12.7943

Table 5

Numerical values for the NNLO corrections to the differential cross section with respect to the solid angle. Results are expressed in nanobarns for a scattering angle $\theta = 90^\circ$. Empty entries are related to cases where the high-energy approximation cannot be applied

√s [GeV]                              10            91              500
LO QED [Eq. (2.5)]                    0.466409      0.00563228      0.000186564
LO Zfitter [51,52]                    0.468499      0.127292        0.0000854731
NNLO (e) [Eq. (4.6)]                  0.00453987    0.0000919387    4.28105 × 10^-6
NNLO (e + μ) [Eq. (4.6)]              0.00570942    0.000122796     5.90469 × 10^-6
NNLO (e + μ + τ) [Eq. (4.6)]          0.00586082    0.000135449     6.7059 × 10^-6
NNLO (e + μ + τ + t) [Eq. (4.6)]                                    6.6927 × 10^-6
NNLO photonic [14,16]                 0.0358755     0.000655126     0.0000284063

Fig. 4. Ratio of the fermionic NNLO corrections to the differential cross section with respect to the tree-level result for $\sqrt{s} = 10$ GeV and $\sqrt{s} = 500$ GeV. A solid line represents the electron-loop contributions, a dotted one the sum of electron- and muon-loop ones, and a dashed one also includes $\tau$ leptons.

Fig. 4 we plot the ratio of the two-loop fermionic corrections to the tree-level cross section,

2 NNLO

+ dNLO

d

(4.13)

d LO

It is clear from the tables that, although there is no decoupling of the heavier fermions (as indeed there should not be, since the typical scale of the process is large compared to all the masses),

R( s ) =


Fig. 5. Same as Fig. 4, including the photonic contributions of [2,14,16] (dash-dotted lines).

the electron loop contributions dominate in the fermionic part and the latter is still substantially

smaller than the pure photonic corrections.
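As a rough cross-check on the scale of the tree-level entries in the tables, the massless QED Bhabha cross section can be evaluated directly. The sketch below uses the standard high-energy formula $d\sigma/d\Omega = (\alpha^2/s)\,[(1 - x + x^2)/x]^2$ with $x = -t/s = \sin^2(\theta/2)$. This is a textbook expression assumed here rather than copied from Eq. (2.5), and the numerical constants are nominal values chosen for illustration.

```python
import math

ALPHA = 1.0 / 137.035999    # fine-structure constant (assumed nominal value)
GEV2_TO_NB = 0.3893794e6    # (hbar c)^2 in nb GeV^2 (assumed nominal value)

def lo_bhabha_nb(sqrt_s_gev, theta_deg):
    """Massless tree-level QED Bhabha d(sigma)/d(Omega) in nanobarns."""
    s = sqrt_s_gev ** 2
    x = math.sin(math.radians(theta_deg) / 2.0) ** 2   # x = -t/s
    return ALPHA ** 2 / s * ((1.0 - x + x * x) / x) ** 2 * GEV2_TO_NB

for theta in (3.0, 90.0):
    for sqrt_s in (10.0, 91.0, 500.0):
        print(f"theta={theta:5.1f} deg  sqrt(s)={sqrt_s:6.1f} GeV  "
              f"dsigma/dOmega = {lo_bhabha_nb(sqrt_s, theta):.6g} nb")
```

With these inputs the values land within a fraction of a percent of the LO QED rows of Tables 4 and 5. The small residual difference presumably reflects the input constants used there.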

5. Summary

In this article, we completed the computation of the virtual two-loop QED fermionic corrections to Bhabha scattering. Based on the kinematics of the targeted phenomenological applications, we considered the limit $m_e^2 \ll m_f^2 \ll s, t, u$.

The fermionic double box contributions with two different mass scales have been derived

for the first time here. Their numerical importance is comparable to the two-loop self-energies

and vertices. We note, however, a qualitative difference. Due to the structure of the collinear

singularities of the graphs, the contributions of the heavier fermions are not suppressed.

A numerical estimation of differential cross sections shows that the net fermionic two-loop effects may be neglected for applications at LEP 1 and LEP 2, but have to be taken into account for precision calculations when a level of $10^{-4}$ has to be reached, as is anticipated for the Giga-Z option of the ILC project.

Completing the NNLO program for Bhabha scattering still requires several ingredients. First, let us mention the contributions from the five light quark flavors. Here, an approach based on dispersion relations à la [53] should be suitable. On the other hand, the heavy top quark might be considered decoupling in a large part of the interesting kinematical regions. Furthermore, an implementation of the loop-by-loop corrections with pentagon diagrams has to be done.

Finally, light fermionic pair emission diagrams need to be considered. As known from the

form-factor case, they are responsible for the cancellation of the leading part of the logarithmic

sensitivity on the masses.

Exact and approximated results are made publicly available at [26]. The combination of our

result with the photonic two-loop corrections of [16] and with electron loop corrections of [17,

25] proves well-suited for phenomenological purposes, e.g., a precise luminosity determination

at a future International Linear Collider.

Note added

We would like to thank T. Becher and K. Melnikov for drawing our attention to a problem with a first version of our result, which led us to discover incorrectly expanded integrals (Eqs. (A.7), (A.8) and (A.15) in Appendix A) appearing in the evaluation of the two-loop box form factors (Eqs. (3.20), (3.21) and (3.22) of Section 3.3). After correction, Eq. (4.10) agrees with the result published in the meantime in [55].

Acknowledgements

We would like to thank A. Arbuzov, R. Bonciani, A. Ferroglia and A. Penin for useful communications, and S. Moch and A. Mitov for interesting discussions.

Work supported in part by Sonderforschungsbereich/Transregio TRR 9 of DFG Computergestützte Theoretische Teilchenphysik, by the Sofja Kovalevskaja Award of the Alexander von Humboldt Foundation sponsored by the German Federal Ministry of Education and Research, by the ToK Program ALGOTOOLS (MTKD-CD-2004-014319), by the Polish State Committee for Scientific Research (KBN, research projects in 2004–2006), and by the European Community's Marie-Curie Research Training Networks MRTN-CT-2006-035505 HEPTOOLS and MRTN-CT-2006-035482 FLAVIAnet.

Appendix A. Mass-expanded master integrals

The list of master integrals required by our computation can already be found in Table V

of [22]. The eight most difficult masters, those involving two different mass scales, have been

derived in [44]. Because they are a substantial part of the present study we reproduce them here:

SE3l2M1m[on shell]

5

3 2 3

1 2

1

= M 2 m4 R

+

(R)

+

+

L(R)

L

2

2

2

2 2 4 8

3 7

45 5

11 1

+ R2

L(R) + R

+ 2 L(R) + L2 (R)

18 3

16 4

3

4

1

3 8

1

,

L3 (R) + R 2 + L(R) L2 (R)

2

4 9

2

SE3l2M1md[on shell]

1

1

1

4

=m

+

1 + 2L(R) + (1 + 2 ) + L(R) + L2 (R)

2

2 2 2

2 3

1

2

+ (3 + 32 23 ) + (1 + 2 )L(R) + L (R) + L (R)

6

3

3 1

7

3

+ R + L(R) +

L(R) + L2 (R)

4 2

8

4

1

5

1

1 2

5

2

+ R + L(R) + + L(R) + L (R)

.

36 6

72 18

4

In the following we set Lm (x) = ln(m2 /x) and LM (x) = ln(M 2 /x),

5

1

1

4

+

V4l2M1m[x] = m

+ 19 32 L2m (x)

2 2 2 2

M2

2 + 42 43 2Lm (x) + 2LM (x) 42 LM (x)

+

x

1 3

2

2

+ 2Lm (x)LM (x) LM (x) Lm (x)LM (x) + LM (x) ,

3

(A.1)

(A.2)

(A.3)


V4l2M1md[x] =

m4 1

1

1

1

+

(x)

+ 2 2 + Lm (x) + L2m (x)

1

+

L

m

2

2

2

4

m

2

2

M 1 1

LM (x) 1 + 32 + Lm (x) + LM (x)

+

x

1 2

Lm (x)LM (x) LM (x) ,

2

V4l2M2m[x] = m

(A.4)

1 5

1

1

2

+

+ Lm (x) + (19 + 2 ) + 5 Lm (x) + Lm (x) ,

2

2 2 2

(A.5)

m4

(A.6)

6x

m4 1

1

1 2

B5l2M2m[x, y] =

L

(x)

+

+

2L

(x)

+

(x)

+

L

(x)L

(y)

L

m

2

m

m

m

x

2 m

2

1

22 23 + 4Lm (x) + L2m (x) + L3m (x) 42 Lm (y)

3

1

2

+ 2Lm (x)Lm (y) + Lm (x)Lm (y) L3m (y)

6

1 2

1 2

y

32 + Lm (x) Lm (x)Lm (y) + Lm (y) ln 1 +

2

2

x

y

y

Lm (x) Lm (y) Li2

(A.7)

+ Li3

,

x

x

V4l2M2md[x] =

B5l2M2md[x, y]

m4 1

=

xy

1

1

2Lm (x)L2m (y) + L3m (y) 22 L(R) + 2Lm (x)Lm (y)L(R) L3 (R)

6

6

1 2

1 2

y

+ 32 + Lm (x) Lm (x)Lm (y) + Lm (y) ln 1 +

2

2

x

y

y

+ Lm (x) Lm (y) Li2

Li3

.

x

x

(A.8)

We list also the other expanded masters, including the correct normalizations. Note that, compared to the conventions employed in [22] and in Eq. (2.16), all integrals are rescaled by a factor

mL(D2l) , where L is the number of loops, D = 4 2 and l is the number of internal lines.

Expansions are performed up to the order required by our computation. For example, we include $O(m^2)$ terms in SE2l2m[x] (see Eq. (A.10)) since the reduction procedure generates coefficients containing $1/m^2$. The same consideration applies to $O(\epsilon)$ terms, which are included as long as the reduction brings inverse powers of $\epsilon$ in the coefficient functions. Since in the following no ambiguities arise, we drop the subscript f and we set $L(x) = \ln(m^2/x)$,

2 1 1

2

2 3

2

T1l1m = m

(A.9)

+1+ 1+

+ 1+

,

2

2

3


SE2l2m[x]

m2

2

1

2 1

=m

+ 2 + L(x) + 2

1 L(x) + 4 + 2L(x) + L2 (x)

x

2

2

2

m

7

1

+

2 + 22 L2 (x) + 2 8 2 3 + 4L(x) 2 L(x) + L2 (x)

x

3

2

1 3

m2

1 3

+ L (x) +

(A.10)

,

2 + 2 + 43 + 2 L(x) L (x)

6

x

3

SE2l0m[x]

2

1

1

= m2

+ 2 + L(x) + 4 + 2L(x) + L2 (x)

2

2

7

1

1

+ 2 8 2 3 + 4L(x) 2 L(x) + L2 (x) + L3 (x) ,

3

2

6

1 2

1 3

m2

42 + L (x) 53 + 2 L(x) L (x) ,

V3l1m[x] =

x

2

6

SE3l1m[on shell]

12 1

5

55 25

11

11 5

+

+

+

= m2

+

+

2

2

3

8

2

16

4

3

2 2 4

55

303

949 55

+ 2

+ 2 + 3 +

4 ,

32

8

6

8

SE3l2m[x]

2 12 1

1

x 13 1

x

2

= m

+

3

+ L(x)

+ 5 + 2 L (x) 2

2

2

4m2

m 8

16

+ 3 + 32 + 3 4L(x) + 22 L(x) 3L2 (x) L3 (x)

3

x 115 2 13

1

,

2

+ L(x) + L2 (x)

16

4

4

2

m

SE3l2md[x]

1

m2

1 2

1 2

1

= m4

+

(x)

+

L(x)

L

2 + L2 (x)

2

2 2

2

2

x

2

11 3

8

1

+ + 2 + 3 5L(x) + 2 L(x) 2L2 (x) L3 (x)

2

2

3

2

2

m

+

.

63 4L(x) 22 L(x) + L2 (x) + L3 (x)

x

(A.11)

(A.12)

(A.13)

(A.14)

(A.15)

Finally, the mass-expanded one-loop box master integral B4l2m[x,y] can be collected from Eqs. (4.70)–(4.75) of [54]:


B4l2m[x, y]

x

x

m2 2

2

L(y) ln

+ 2L (y) 2L(y) ln

+ 43 92 L(y)

=

xy

y

y

2 3

x

x

x

1 3 x

2

+ L (y) + 52 ln

L (y) ln

+ ln

62 ln 1 +

3

y

y

3

y

y

x

x

x

x

x

+ 2 ln

ln

ln 1 +

ln2

ln 1 +

y

y

y

y

y

x

x

x

+ 2 ln

Li2 1 +

+ 2 Li3

.

y

y

y

(A.16)

References

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]

[13]

[14]

[15]

[16]

[17]

[18]

[19]

[20]

[21]

[22]

[23]

[24]

[25]

[26]

[27]

[28]

[29]

[30]

[31]

[32]

A. Arbuzov, E. Kuraev, B. Shaikhatdenov, Mod. Phys. Lett. A 13 (1998) 2305–2316, hep-ph/9806215.

Z. Bern, L. Dixon, A. Ghinculov, Phys. Rev. D 63 (2001) 053007, hep-ph/0010075.

V. Smirnov, Phys. Lett. B 460 (1999) 397–404, hep-ph/9905323.

J. Tausk, Phys. Lett. B 469 (1999) 225–234, hep-ph/9909506.

F. Berends, R. Kleiss, Nucl. Phys. B 228 (1983) 537.

F. Berends, R. Kleiss, W. Hollik, Nucl. Phys. B 304 (1988) 712.

S. Jadach, W. Placzek, E. Richter-Was, B. Ward, Z. Was, Comput. Phys. Commun. 102 (1997) 229–251.

S. Jadach, W. Placzek, B. Ward, Phys. Lett. B 390 (1997) 298–308, hep-ph/9608412.

A. Arbuzov, G. Fedotovich, E. Kuraev, N. Merenkov, V. Rushai, L. Trentadue, JHEP 9710 (1997) 001, hep-ph/9702262.

A. Arbuzov, G. Fedotovich, F. Ignatov, E. Kuraev, A. Sibidanov, Eur. Phys. J. C 46 (2006) 689–703, hep-ph/0504233.

C. Carloni Calame, C. Lunardini, G. Montagna, O. Nicrosini, F. Piccinini, Nucl. Phys. B 584 (2000) 459–479, hep-ph/0003268.

G. Balossini, C. Carloni Calame, G. Montagna, O. Nicrosini, F. Piccinini, Nucl. Phys. B 758 (2006) 227–253, hep-ph/0607181.

N. Glover, B. Tausk, J. van der Bij, Phys. Lett. B 516 (2001) 33–38, hep-ph/0106052.

A. Penin, Phys. Rev. Lett. 95 (2005) 010408, hep-ph/0501120.

A. Penin, Nucl. Phys. B 734 (2006) 185–202, hep-ph/0508127.

R. Bonciani, A. Ferroglia, P. Mastrolia, E. Remiddi, J. van der Bij, Nucl. Phys. B 701 (2004) 121–179, hep-ph/0405275.

R. Bonciani, A. Ferroglia, P. Mastrolia, E. Remiddi, J. van der Bij, Nucl. Phys. B 716 (2005) 280–302, hep-ph/0411321.

R. Bonciani, A. Ferroglia, Phys. Rev. D 72 (2005) 056004, hep-ph/0507047.

V.A. Smirnov, Phys. Lett. B 524 (2002) 129, hep-ph/0111160.

G. Heinrich, V.A. Smirnov, Phys. Lett. B 598 (2004) 55, hep-ph/0406053.

M. Czakon, J. Gluza, T. Riemann, Phys. Rev. D 71 (2005) 073009, hep-ph/0412164.

M. Czakon, J. Gluza, K. Kajda, T. Riemann, Nucl. Phys. B (Proc. Suppl.) 157 (2006) 16–20, hep-ph/0602102.

M. Czakon, J. Gluza, T. Riemann, Nucl. Phys. B 751 (2006) 1–17, hep-ph/0604101.

R. Bonciani, A. Ferroglia, Two-loop QED Bhabha scattering, http://pheno.physik.uni-freiburg.de/~bhabha/.

S. Actis, M. Czakon, J. Gluza, T. Riemann, Two-loop QED Bhabha scattering, http://www-zeuthen.desy.de/theory/research/bhabha/bhabha.html/.

S. Actis, A. Ferroglia, G. Passarino, M. Passera, C. Sturm, S. Uccirati, GraphShot, a FORM package for automatic generation and manipulation of one and two loop Feynman diagrams, unpublished.

P. Nogueira, J. Comput. Phys. 105 (1993) 279.

P. Nogueira, An introduction to QGRAF 2.0, ftp://gtae2.ist.utl.pt/pub/qgraf/.

M. Tentyukov, J. Fleischer, Comput. Phys. Commun. 132 (2000) 124–141, hep-ph/9904258.

J. Vermaseren, New features of FORM, math-ph/0010025.

M. Czakon, DiaGen/IdSolver, unpublished.

[33]

[34]

[35]

[36]

[37]

[38]

[39]

[40]

[41]

[42]

[43]

[44]

[45]

[46]

[47]

[48]

[49]

[50]

[51]

[52]

[53]

[54]

[55]


S. Laporta, Int. J. Mod. Phys. A 15 (2000) 5087–5159, hep-ph/0102033.

E. Remiddi, J. Vermaseren, Int. J. Mod. Phys. A 15 (2000) 725–754, hep-ph/9905237.

T. Gehrmann, E. Remiddi, Nucl. Phys. B 580 (2000) 485–518, hep-ph/9912329.

T. Gehrmann, E. Remiddi, Nucl. Phys. B 601 (2001) 248–286, hep-ph/0008287.

S. Wolfram, The Mathematica Book, Wolfram Media/Cambridge Univ. Press, 2003.

N. Usyukina, Teor. Mat. Fiz. 22 (1975) 300–306.

E. Boos, A. Davydychev, Theor. Math. Phys. 89 (1991) 1052–1063.

M. Czakon, Comput. Phys. Commun. 175 (2006) 559–571, hep-ph/0511200.

M. Roth, A. Denner, Nucl. Phys. B 479 (1996) 495–514, hep-ph/9605420.

S. Moch, P. Uwer, Comput. Phys. Commun. 174 (2006) 759–770, math-ph/0508008.

S. Actis, M. Czakon, J. Gluza, T. Riemann, Nucl. Phys. B (Proc. Suppl.) 160 (2006) 91–100, hep-ph/0609051.

G. 't Hooft, M. Veltman, Nucl. Phys. B 44 (1972) 189–213.

C. Bollini, J. Giambiagi, Nuovo Cimento B 12 (1972) 20–25.

R. Bonciani, P. Mastrolia, E. Remiddi, Nucl. Phys. B 661 (2003) 289–343, hep-ph/0301170.

P. Mastrolia, E. Remiddi, Nucl. Phys. B 664 (2003) 341–356, hep-ph/0302162.

R. Bonciani, P. Mastrolia, E. Remiddi, Nucl. Phys. B 676 (2004) 399–452, hep-ph/0307295.

G. Burgers, Phys. Lett. B 164 (1985) 167.

D. Bardin, P. Christova, M. Jack, L. Kalinovskaya, A. Olchevski, S. Riemann, T. Riemann, Comput. Phys. Commun. 133 (2001) 229–395, hep-ph/9908433.

A. Arbuzov, M. Awramik, M. Czakon, A. Freitas, M. Grünewald, K. Mönig, S. Riemann, T. Riemann, Comput. Phys. Commun. 174 (2006) 728–758, hep-ph/0507146.

B. Kniehl, M. Krawczyk, J. Kühn, R. Stuart, Phys. Lett. B 209 (1988) 337.

J. Fleischer, J. Gluza, A. Lorca, T. Riemann, Eur. Phys. J. C 48 (2006) 35–52, hep-ph/0606210.

T. Becher, K. Melnikov, arXiv: 0704.3582 [hep-ph].

S.F. King

School of Physics and Astronomy, University of Southampton, Southampton, SO17 1BJ, UK

Received 19 April 2007; accepted 27 June 2007

Available online 4 July 2007

Abstract

We propose an invariant see-saw (ISS) approach to model building, based on the observation that see-saw

models of neutrino mass and mixing fall into basis invariant classes labelled by the Casas-Ibarra R-matrix,

which we prove to be invariant not only under basis transformations but also non-unitary right-handed

neutrino transformations S. According to the ISS approach, given any see-saw model in some particular

basis one may determine the invariant R-matrix and hence the invariant class to which that model belongs.

The formulation of see-saw models in terms of invariant classes puts them on a firmer theoretical footing,

and allows different see-saw models in the same class to be related more easily, while their relation to the

R-matrix makes them more easily identifiable in phenomenological studies. To illustrate the ISS approach

we show that sequential dominance (SD) models form basis invariant classes in which the R-matrix is

approximately related to a permutation of the unit matrix, and quite accurately so in the case of constrained

sequential dominance (CSD) and tri-bimaximal mixing. Using the ISS approach we discuss examples of

models in which the mixing naturally arises (at least in part) from the charged lepton or right-handed

neutrino sectors and show that they are in the same invariant class as SD models. We also discuss the

application of our results to flavour-dependent leptogenesis where we show that the case of a real R-matrix

is approximately realized in SD, and accurately realized in CSD.

© 2007 Elsevier B.V. All rights reserved.

1. Introduction

The discovery and subsequent study of neutrino masses and mixing [1] remains the greatest advance in physics over the past decade. The latest experimental data [2] is consistent with $\sin\theta_{13} \approx 0$ [3]. How to incorporate small neutrino masses and large mixings into some new theory of flavour beyond the Standard Model has been the topic of intense theoretical activity [4] over the same period.

E-mail address: sfk@hep.phys.soton.ac.uk.

0550-3213/$ – see front matter © 2007 Elsevier B.V. All rights reserved.

doi:10.1016/j.nuclphysb.2007.06.024

One particularly attractive mechanism is the see-saw mechanism [5], based on a simple extension of the (possibly Supersymmetric) Standard Model involving more than one right-handed neutrino $\nu_R$, coupling to left-handed lepton doublets $L$ with a matrix of typical Yukawa couplings $Y^\nu_{LR}$ (where typical means in the same ball park as the charged lepton Yukawa couplings $Y^E_{LR}$ of $L$ to right-handed charged leptons $E_R$) and having large (compared to the weak scale) Majorana masses $M_{RR}$. From these high energy inputs one may derive the low energy effective neutrino mass matrix from the see-saw formula $m_{LL} = v_u^2\, Y^\nu_{LR} M_{RR}^{-1} Y^{\nu T}_{LR}$, where $v_u$ is a Higgs vacuum expectation value (VEV). From $m_{LL}$ and $Y^E_{LR}$ one may then obtain the low energy charged lepton masses $m_e$, $m_\mu$, $m_\tau$ and neutrino masses $m_i$ from the eigenvalues of the matrices, and $V_{MNS} = V_{E_L} V_{\nu_L}^\dagger$, where $V_{E_L}$ and $V_{\nu_L}$ diagonalize $Y^E_{LR}$ and $m_{LL}$ from the left. After non-physical phases are removed, the lepton mixing matrix $V_{MNS}$ can be compared to experiment.
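The see-saw formula $m_{LL} = v_u^2\, Y_{LR} M_{RR}^{-1} Y_{LR}^T$ and its behaviour under a change of right-handed neutrino basis can be illustrated numerically. The sketch below is a toy example under stated assumptions: all matrices are real, there are two right-handed neutrinos, the basis change is an orthogonal rotation, and every numerical value (Yukawa entries, Majorana masses, $v_u = 174$ GeV) is hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose a matrix given as nested lists."""
    return [list(col) for col in zip(*a)]

def seesaw(yukawa, m_rr_inv, v_u):
    """Return m_LL = v_u^2 * Y * M_RR^{-1} * Y^T for real inputs."""
    core = matmul(matmul(yukawa, m_rr_inv), transpose(yukawa))
    return [[v_u ** 2 * entry for entry in row] for row in core]

V_U = 174.0                                      # GeV, illustrative VEV
Y_NU = [[0.0, 0.1], [0.5, 0.1], [0.5, -0.1]]     # hypothetical 3x2 Yukawa matrix
M_RR_INV = [[1.0e-15, 0.0], [0.0, 1.0e-14]]      # inverse of diag(1e15, 1e14) GeV

# Rotate the right-handed neutrinos: Y -> Y O^T, M^{-1} -> O M^{-1} O^T.
c, s = math.cos(0.3), math.sin(0.3)
ROT = [[c, s], [-s, c]]
Y_ROT = matmul(Y_NU, transpose(ROT))
M_INV_ROT = matmul(matmul(ROT, M_RR_INV), transpose(ROT))

m_ll = seesaw(Y_NU, M_RR_INV, V_U)
m_ll_rot = seesaw(Y_ROT, M_INV_ROT, V_U)
max_diff = max(abs(m_ll[i][j] - m_ll_rot[i][j])
               for i in range(3) for j in range(3))
print("largest entry of m_LL [GeV]:", max(abs(x) for row in m_ll for x in row))
print("largest change after rotating the basis:", max_diff)
```

The input matrices look different after the rotation, yet the low energy mass matrix is unchanged up to rounding. This is the sense in which the appearance of the high energy matrices depends on the basis while the physics does not.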

There has been much theoretical effort devoted to understanding the origin and pattern of the high energy see-saw matrices $Y^\nu_{LR}$, $Y^E_{LR}$ and $M_{RR}$ which can lead to agreement with low energy data, via the see-saw mechanism [4]. This problem is often considered together with the analogous one of the quark Yukawa matrices $Y^U_{LR}$, $Y^D_{LR}$, and is referred to as the flavour problem. Although the flavour problem has been around for many years, the recent neutrino data provides additional challenges and constraints which have provided new insights into the problem, and a renewed impetus to attack it, resulting in an explosion of recent theoretical work in this direction. While it is impossible to review all the different models that have been proposed, the different approaches may be classified as either kinematical or dynamical. In both the kinematical and dynamical approaches the goal is to guess or derive the high energy input quark Yukawa matrices $Y^U_{LR}$, $Y^D_{LR}$ and lepton see-saw matrices $Y^\nu_{LR}$, $Y^E_{LR}$ and $M_{RR}$. However, as has long been emphasized by Jarlskog [6], such matrices are not physical, since their appearance changes depending on the particular basis of underlying fields one chooses to work in, and so working in a particular basis is meaningless.

This paper starts from the simple observation that not all choices of see-saw matrices $Y^\nu_{LR}$, $Y^E_{LR}$ and $M_{RR}$ which are consistent with a given set of low energy lepton parameters $m_e$, $m_\mu$, $m_\tau$, $m_i$ and $V_{MNS}$, are related to each other under a change of basis. This is in contrast to the quark sector where all choices of Yukawa matrices $Y^U_{LR}$, $Y^D_{LR}$ consistent with a given set of low energy quark parameters $m_u$, $m_c$, $m_t$, $m_d$, $m_s$, $m_b$ and $V_{CKM}$, are related to each other under a change of basis. It is also in contrast to the effective lepton sector, where all choices of effective lepton matrices $m_{LL}$ and $Y^E_{LR}$ which are consistent with a given set of low energy lepton parameters $m_e$, $m_\mu$, $m_\tau$, $m_i$ and $V_{MNS}$, are related to each other under a change of basis. This observation implies that sets of see-saw matrices fall into invariant classes of models, $\{Y^\nu_{LR}, Y^E_{LR}, M_{RR}\} \in C(R)$, where each different class $C(R)$ is labelled by some continuous parameters $R$, where members of $C(R)$ are consistent with the same low energy lepton observables $m_e$, $m_\mu$, $m_\tau$, $m_i$ and $V_{MNS}$, for all $R$. The set of all see-saw matrices within a particular invariant class $C(R_1)$ are related to each other under a change of basis, but are not related to those in a different class $C(R_2)$.

It is well known amongst the phenomenological community that the R-matrix of Casas and Ibarra [7] may be used to parameterize choices of see-saw matrices $Y^\nu_{LR}$, $Y^E_{LR}$, $M_{RR}$ consistent with a given set of low energy lepton parameters $m_e$, $m_\mu$, $m_\tau$, $m_i$ and $V_{MNS}$. Although it was appreciated by Casas and Ibarra that the R-matrix parameterization may be used in different lepton bases [7], this feature is rarely or never used in phenomenological analyses where people invariably work in the flavour basis where $Y^E_{LR}$, $M_{RR}$ are both diagonal. On the other hand, the R-matrix is largely ignored by the theoretical community who are concerned with guessing or deriving the see-saw matrices in a particular basis, which in general will not correspond to the flavour basis, so the R-matrix is not regarded as relevant.

In the present paper we show that the R-matrix is a basis invariant quantity, then propose

using it in the context of model building to label the invariant class C(R) of see-saw models to

which a particular model example belongs. Given a particular see-saw model there are several

reasons why it is worth determining the invariant class C(R) to which it belongs, i.e. finding the

invariant R-matrix associated with the particular see-saw model:

1. It puts the theory on a firmer theoretical foundation, since invariant quantities are always preferred to basis dependent ones [6].

2. Given the R-matrix one may immediately generate an infinite set of equivalent see-saw models filling out the invariant class C(R) by applying lepton basis changes. This applies both to

the kinematical and the dynamical approaches. So for any particular model (infinitely)

many other models come for free.

3. It may turn out that a particular model under consideration corresponds to the same R-matrix

as another model, i.e. the two models are in the same invariant class, in which case the two

models should essentially be regarded as the same model.

4. For given (class of) models, with R specified one may immediately make contact with phenomenological analyses which have been performed in the literature which are relevant to

testing the (class of) models.

In this paper we shall illustrate the power of such an invariant see-saw (ISS) approach by

discussing the case of sequential dominance (SD) [8]. SD is motivated by two considerations:

• To account for a neutrino mass hierarchy $m_1 \ll m_2 \ll m_3$ and large atmospheric and solar
mixing angles in a natural way, without any tunings or cancellations. Although the (2, 3)
mass hierarchy in the neutrino sector is not that strong, $m_2/m_3 \sim 0.2$, we would still like to
have a natural explanation for the smallness of this hierarchy, just as we would like to have
an explanation for the smallness of the Cabibbo angle which has a similar value.

• To disentangle the question of the neutrino masses and the mixing angles, and so enable
some explanation for tri-bimaximal neutrino mixing which involves elements in the MNS
matrix having values equal to square roots of simple rational numbers such as 1/2 or 1/3.
This would not be possible if the neutrino masses played a part in the calculation of the solar
and atmospheric mixing angles.

In SD, a natural neutrino mass hierarchy, $m_2/m_3 \sim 0.2$, results from having one of the right-handed
neutrinos give the dominant contribution to the see-saw mechanism, while a second right-handed
neutrino gives the leading sub-dominant contribution, leading to a neutrino mass matrix
with naturally small determinant [8].¹ In a basis where the right-handed neutrino mass matrix is
diagonal, the atmospheric and solar neutrino mixing angles are determined in terms of ratios of
Yukawa couplings involving the dominant and subdominant right-handed neutrinos, respectively.
If these Yukawa couplings are related in a certain way, then it is possible for tri-bimaximal
neutrino mixing to emerge in a simple and natural way, independently of the neutrino mass
eigenvalues. This is known as constrained sequential dominance (CSD) [10], and can readily
arise from vacuum alignment in flavour models [10-12]. In such unified flavour models there
are corrections to tri-bimaximal mixing from charged lepton corrections, resulting in testable
predictions and sum rules for lepton mixing angles [10,13].

¹ For alternative approaches involving a small determinant see [9].

Although well motivated on physical grounds, SD appears to be restricted to a particular

basis, namely that in which the right-handed neutrino and charged lepton mass matrices are both

diagonal, although in particular model realizations there are typically small off-diagonal elements

in both these mass matrices which must be taken into account. This might lead one to conclude

that the notion of SD is quite limited, and furthermore that it is not physical since physical

quantities should be basis independent. However, following the ISS approach advocated above,

we will determine the invariant classes C(R) to which SD models belong, by finding the invariant

R-matrix associated with each of the SD types, and hence show that SD can be formulated in a

basis invariant way. In particular tri-bimaximal neutrino mixing from constrained SD is shown

to have an easily identifiable form in which the R-matrix is related to the unit matrix, where

this form is preserved under charged lepton or right-handed neutrino basis changes, though the

former gives observable corrections to the MNS parameters. Having done this we shall then

reap the benefits mentioned above. Namely we shall show how certain models that have been

proposed in the literature are equivalent to SD under a basis change, for example models where

the mixing is completely or in part originating from the right-handed neutrino or charged lepton

sectors [12,15]. We shall also discuss phenomenological analyses based on choices of R-matrix

parameters that are seen to be relevant for SD.

In detail, the material discussed in this paper is structured as follows. In Section 2 we discuss

the ISS approach to model building that we advocate. We first review the well known result
that all pairs of quark Yukawa matrices $Y^U_{LR}$, $Y^D_{LR}$ consistent with given physical parameters are
related by basis transformations [6], and then show that a similar result holds for the effective
lepton matrices $m_{LL}$ and $Y^E_{LR}$. We then show that a similar result does not apply to the see-saw
mechanism, which leads to the notion of invariant classes of see-saw models, which may be

parameterized by the R-matrix of Casas and Ibarra. We show how the R-matrix may be obtained

and prove its invariance under basis transformations. We also propose a short-cut to obtaining

the R-matrix, using a non-unitary S-matrix transformation of right-handed neutrinos, which is

useful when right-handed neutrino mass eigenvalues are not required. In Section 3 we discuss SD

models as a prime example of the ISS approach. We first discuss this in a two family example,

where a convenient vector notation for SD is introduced, and a relation between the R-matrix

angle and the angle between these vectors is established. We then go on to the full three family

case where we discuss the form of the R-matrix for all the types of SD, and provide a systematic

discussion of the R-matrix in the two right-handed neutrino limit in each case. Having established

the relation between SD and the R-matrix, this then defines the invariant classes of see-saw

models to which SD models belong, and hence allows the full set of models in these classes

to be constructed by basis transformations. In Section 4 we discuss the physical applications of

these results to invariant classes of SD models. The particular forms of R-matrix associated with

CSD and tri-bimaximal neutrino mixing are identified. SD is shown to be in the same invariant

class as some models where the mixing completely or partly originates from the right-handed

neutrino or charged lepton sectors [12,14,15]. We also discuss phenomenological analyses based

on choices of R-matrix parameters that are seen to be relevant for SD. For example we discuss

the application of our results to flavour-dependent leptogenesis [23], and show that the case of

the real R-matrix may be (approximately) realized in SD. Section 5 concludes the paper.

Finally we would like to mention some earlier works where the relation between SD and the

R-matrix has been mentioned before [16-19], and to distinguish what was done in the previous
papers from what is done in the present paper. The R-matrix was first applied to SD in [16],

where the approximate unit matrix type structure of the R-matrix (or a permutation of it) corresponded to the sequential dominance of the three right-handed neutrinos. The Yukawa vector

notation, used in this paper, was first introduced in [16], however the MNS vector notation is

introduced here. Some of these features were subsequently further discussed in [17,18]. Recently,

an analysis with a unit R-matrix was performed in [19]. All of these papers presented results in

the flavour basis in which the right-handed neutrino mass matrix and the charged lepton mass

matrix were both diagonal. The basis-invariance noticed by Casas and Ibarra was not discussed

in any of the papers in [16-19], and the invariance of the R-matrix under non-unitary right-handed neutrino transformations is new. Other new results include a precise relation between

the angle between the Yukawa vectors and the R-matrix angle in two right-handed neutrino

limiting cases. These limiting cases are discussed in detail, for all the different mass orderings of

right-handed neutrinos. The fact that, for such limiting cases, SD is an automatic consequence of

a particular texture zero and a small $\theta_{13}$ is also discussed. Tri-bimaximal mixing and CSD are also

discussed, corresponding to the R-matrix taking a very precise (permutation of the) unit matrix

form (rather than an approximate such form). The effect of charged lepton corrections is shown

to give corrections to the PMNS angles, but not to the R-matrix, which retains its precise form.

The fact that SD corresponds to approximately real R-matrix angles, as can be seen by considering the modular surfaces of the R-matrix angles close to zero or $\pi/2$, is also a new result, and its

application to recent flavour-dependent leptogenesis analyses is discussed.

Finally we emphasize that the ISS model building approach we advocate here, whereby any

proposed see-saw model should be expressed in terms of the invariant R-matrix, represents a

new strategy that can be applied to all see-saw models, not just the ones which satisfy SD. To

illustrate the usefulness of the approach, we discuss examples of models in which the lepton

mixing originates either from the charged lepton sector, or partly from the right-handed neutrino

sector, and show that they have the same invariant R-matrix as SD, and therefore are equivalent

to it under a change of basis, which we subsequently prove. The idea in this paper, then, is to

use the R-matrix more actively in model building (rather than in phenomenology, where it has

been used extensively by many authors), with the hope that model builders will express their see-saw models in terms of the R-matrix (which is not normally done). The essential point of this paper is to

emphasise that the R-matrix is invariant under basis transformations, since this feature, although

clearly known by the inventors, is not so well used. It is precisely this invariance that means that

the R-matrix can and should be more widely used as a model building tool, to classify and relate

models.

2. The ISS model building approach

2.1. Quark sector

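In the quark sector the Dirac mass matrices of the up and down quarks are given by
$m^U_{LR} = Y^U_{LR} v_u$ and $m^D_{LR} = Y^D_{LR} v_d$, where $v_u = \langle H_u^0 \rangle$ and $v_d = \langle H_d^0 \rangle$, and the Lagrangian is
of the form $\mathcal{L} = -\bar{\psi}_L Y_{LR} H \psi_R + \mathrm{H.c.}$ The change from flavour basis to mass eigenstate basis
can be performed with the unitary diagonalization matrices $V_{U_L}$, $V_{U_R}$ and $V_{D_L}$, $V_{D_R}$ by

$$V_{U_L} m^U_{LR} V^\dagger_{U_R} = \mathrm{diag}(m_u, m_c, m_t), \qquad V_{D_L} m^D_{LR} V^\dagger_{D_R} = \mathrm{diag}(m_d, m_s, m_b). \tag{1}$$

The CKM matrix is then given by

$$V_{CKM} = V_{U_L} V^\dagger_{D_L}, \tag{2}$$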

where quark phase rotations which leave the quark masses real and positive may be used to

remove five of the phases leaving one physical phase in the CKM matrix VCKM . The Standard

Model quark sector clearly respects the symmetry

$$G_{quark} = U_Q(3) \times U_{U_R}(3) \times U_{D_R}(3) \tag{3}$$

corresponding to quark doublet, right-handed up quark and right-handed down quark rotations,

which change the quark basis and the form of the Yukawa matrices, but leave the physics (quark

masses and mixings) unchanged. In the quark sector it is well known that the only physical

quantities are basis independent invariants formed from the mass matrices, the so-called Jarlskog

invariants [6], rather than the mass matrices themselves, since any pair of quark mass matrices

which lead to the correct physics may be related to any other pair which lead to the same physics,

by a change of basis, up to quark phases, using the symmetry Gquark .

This can be proved, for example, by showing that any two pairs of quark mass matrices can

be related by a change of basis, using the symmetry Gquark , to a common basis in which the up

quark mass matrix is diagonal, and the down quark mass matrix is equal, up to quark phases, to

the CKM matrix multiplied by a diagonal matrix of down quark masses,

$$m'^U_{LR} = \mathrm{diag}(m_u, m_c, m_t), \qquad m'^D_{LR} = V_{CKM}\, \mathrm{diag}(m_d, m_s, m_b). \tag{4}$$

Since any two pairs of mass matrices $(m^U_{LR})_1$, $(m^D_{LR})_1$ and $(m^U_{LR})_2$, $(m^D_{LR})_2$ may be related to
$m'^U_{LR}$, $m'^D_{LR}$ in Eq. (4) by a change of basis, it follows that all choices of quark mass matrices
which lead to the same physics can be related to each other, up to quark phases, using the symmetry
$G_{quark}$. This implies that the quark mass matrices $m^U_{LR}$, $m^D_{LR}$ are not physical quantities
since they are basis dependent, i.e. not invariant under the symmetry $G_{quark}$. It is possible to
define $G_{quark}$ invariant combinations consisting of determinants and traces of products of the
combinations $S^U_{LL} = m^U_{LR} (m^U_{LR})^\dagger$ and $S^D_{LL} = m^D_{LR} (m^D_{LR})^\dagger$, for example the determinant of the
commutator $\det[S^U_{LL}, S^D_{LL}]$ is an invariant [6].

2.2. Effective lepton sector

From the point of view of low energy neutrino experiments, Majorana neutrino masses arise
from the effective operator: $\mathcal{L}_{eff} = \frac{1}{2} H_u L^T \kappa H_u L + \mathrm{H.c.}$ where $L$ are the lepton doublets, $H_u$
are Higgs doublets, and $\kappa$ is a matrix of effective (dimensional) couplings. In our convention the
effective Majorana masses are given by the Lagrangian $\mathcal{L} = -\bar{\nu}_L m_{LL} \nu^c + \mathrm{H.c.}$, where $m_{LL} = v_u^2 \kappa$.
The rotation to the mass eigenstate basis can be performed with the unitary diagonalization
matrices $V_{E_L}$, $V_{E_R}$ and $V_{\nu_L}$ by

$$V_{E_L} m^E_{LR} V^\dagger_{E_R} = \mathrm{diag}(m_e, m_\mu, m_\tau), \qquad V_{\nu_L} m_{LL} V^T_{\nu_L} = \mathrm{diag}(m_1, m_2, m_3). \tag{5}$$

The lepton mixing matrix is then given by

$$V_{MNS} = V_{E_L} V^\dagger_{\nu_L}, \tag{6}$$

where charged lepton phase rotations which leave the charged lepton masses real and positive
may be used to remove three of the phases leaving three physical phases in the MNS matrix
$V_{MNS}$.

The effective lepton sector clearly respects the symmetry

$$G^{eff}_{lepton} = U_L(3) \times U_{E_R}(3) \tag{7}$$

corresponding to lepton doublet and right-handed charged lepton rotations, which change the

lepton basis and the form of the effective lepton matrices, but leave the physics (lepton masses

and mixings) unchanged. The physically measurable low energy lepton parameters are the three
charged lepton masses $m_e$, $m_\mu$, $m_\tau$, the three neutrino masses $m_{1,2,3} > 0$ and the lepton mixing
parameters contained in $V_{MNS}$.

Any pair of effective lepton matrices $m^E_{LR}$, $m_{LL}$ which lead to a given
low energy physics may be related to any other pair which lead to the same physics, by a change
of basis, using the symmetry $G^{eff}_{lepton}$. This is easily proved (analogous to the quark sector) by
transforming to a common basis in which the charged lepton mass matrix is diagonal, and the
effective Majorana neutrino mass matrix is specified in terms of the lepton mixing matrix
$V_{MNS} = V_{E_L} V^\dagger_{\nu_L}$ and the physical neutrino masses $m_i$,

$$m'^E_{LR} = \mathrm{diag}(m_e, m_\mu, m_\tau), \qquad m'_{LL} = V_{MNS}\, \mathrm{diag}(m_1, m_2, m_3)\, V^T_{MNS}, \tag{8}$$

where Eq. (8), often called the flavour basis, is analogous to Eq. (4). Then, as in the quark case,
we can argue that since any two pairs of matrices $(m^E_{LR})_1$, $(m_{LL})_1$ and $(m^E_{LR})_2$, $(m_{LL})_2$ can be
rotated to the flavour basis then they can therefore be rotated into each other, using the symmetry
$G^{eff}_{lepton}$, analogous to the quark sector result. $m^E_{LR}$, $m_{LL}$ are clearly basis dependent, but invariants
under $G^{eff}_{lepton}$ can be constructed using $S^E_{LL} = m^E_{LR} (m^E_{LR})^\dagger$ and $S^\nu_{LL} = m_{LL} (m_{LL})^\dagger$, for example
the determinant of the commutator $\det[S^E_{LL}, S^\nu_{LL}]$ is invariant.

2.3. See-saw sector

The starting point of the see-saw mechanism is the Lagrangian,

$$\mathcal{L}_{\text{see-saw}} = -Y^E_{LR} H_d \bar{L} E_R - Y^\nu_{LR} H_u \bar{L} \nu_R + \frac{1}{2} \nu_R^T M_{RR} \nu_R + \mathrm{H.c.}, \tag{9}$$

where all indices have been suppressed, and we have introduced two Higgs doublets $H_u$, $H_d$ as in
the Supersymmetric Standard Model.² It is common to call Eq. (9) the see-saw Lagrangian. After
integrating out the right-handed neutrinos it leads to an effective low energy leptonic Lagrangian
of the type discussed in the previous subsection, where the effective Majorana mass matrix is given
by the (type I) see-saw formula:

$$m_{LL} = v_u^2\, Y^\nu_{LR} M^{-1}_{RR} Y^{\nu T}_{LR}. \tag{10}$$

The effective low energy matrices are diagonalised by unitary transformations $V_{E_L}$, $V_{E_R}$ and $V_{\nu_L}$
as in Eq. (5), and the lepton mixing matrix is as in Eq. (6).

The lepton symmetry of the see-saw Lagrangian in Eq. (9) is:

$$G_{lepton} = U_L(3) \times U_{E_R}(3) \times U_{\nu_R}(3) \tag{11}$$

corresponding to lepton doublet, right-handed charged lepton and right-handed neutrino rotations,
which change the lepton basis and the form of the see-saw matrices, but leave the
physics (lepton masses and mixings) unchanged. Using these symmetries we can ask the question
whether all sets of see-saw matrices $Y^\nu_{LR}$, $Y^E_{LR}$ and $M_{RR}$ which lead to a given set of low
energy physical lepton parameters are equivalent to each other by a change of basis. Analogous
to the quark sector, we may attempt to relate all sets of see-saw matrices to a common set of see-saw
matrices in which the charged lepton mass matrix is diagonal, and the right-handed neutrino
Majorana mass matrix is also diagonal,

$$v_d Y'^E_{LR} = \mathrm{diag}(m_e, m_\mu, m_\tau), \qquad M'_{RR} = \mathrm{diag}(M_1, M_2, M_3), \qquad Y'^\nu_{LR} = V_{E_L} Y^\nu_{LR} V^\dagger_{\nu_R}, \tag{12}$$

where unitary $V_{\nu_R}$ is defined by $V_{\nu_R} M_{RR} V^T_{\nu_R} = M'_{RR}$ and $M_i > 0$.

² In the case of the Standard Model one of the two Higgs doublets is equal to the charge conjugate of the other,
$H_d \equiv H_u^c$.

We refer to the basis of Eq. (12) as the see-saw flavour basis in analogy to Eq. (8). The
difference between Eqs. (4), (8) and (12) is that here $Y'^\nu_{LR}$ is not uniquely specified since it is
diagonalized by left-handed rotations which are not simply related to the lepton mixing matrix,
and in addition its eigenvalues are not simply related to physical neutrino masses. Therefore,
unlike the quark sector, or the effective lepton case, there is not a unique common basis. We
conclude that any two sets of see-saw matrices $(Y^E_{LR})_1$, $(Y^\nu_{LR})_1$, $(M_{RR})_1$ and $(Y^E_{LR})_2$,
$(Y^\nu_{LR})_2$, $(M_{RR})_2$ which give the same physical right-handed neutrino masses, light effective
neutrino masses, charged lepton masses and lepton mixings, cannot in general be transformed into each other
under the lepton see-saw symmetry $G_{lepton}$ corresponding to basis changes.

We note parenthetically that although the see-saw formula is not a basis invariant, by taking
its determinant one can obtain the invariant mass formula [20]:

$$m_1 m_2 m_3 = \frac{m_{D1}^2\, m_{D2}^2\, m_{D3}^2}{M_1 M_2 M_3}, \tag{13}$$

where $m_i$ are the physical light left-handed neutrino masses, $M_i$ are the heavy right-handed
neutrino masses, and $m_{Di}$ are the eigenvalues of the Dirac neutrino mass matrix $m_{LR} = v_u Y^\nu_{LR}$.
The product of squared Dirac mass eigenvalues is clearly an invariant since it is given
by $\det(m_{LR} m^\dagger_{LR})$. Although Eq. (13) should have useful see-saw model building applications
with respect to neutrino masses, it clearly does not shed any light on the question of neutrino
mixing.

2.4. Invariant classes of see-saw models and the R-matrix

We have seen that, in contrast to the case of the effective lepton or quark sector, not all
choices of see-saw matrices $Y^\nu_{LR}$, $Y^E_{LR}$ and $M_{RR}$ which are consistent with a given set of
low energy lepton parameters $m_e$, $m_\mu$, $m_\tau$, $m_i$ and $V_{MNS}$ are related to each other under a
change of basis. This implies that sets of see-saw matrices fall into invariant classes of models,
$\{Y^\nu_{LR}, Y^E_{LR}, M_{RR}\} \in C(R)$, where each different class $C(R)$ is labelled by some continuous
parameters $R$, where members of $C(R)$ are consistent with the same low energy lepton observables
$m_e$, $m_\mu$, $m_\tau$, $m_i$ and $V_{MNS}$, for all $R$. The set of all see-saw matrices within a particular invariant
class $C(R_1)$ are related to each other under a change of basis, but are not related to those in a
different class $C(R_2)$. In this subsection we show that the R-matrix of Casas and Ibarra [7], which
is well known in phenomenological applications, is a basis invariant quantity. We then propose
using it in the context of model building to label the invariant class $C(R)$ of see-saw models to
which a particular model example belongs.

Following [7], we first derive the R-matrix in the see-saw flavour basis in Eq. (12), by constraining
$Y'^\nu_{LR}$ to give $m'_{LL}$ in the basis in Eq. (8) using the see-saw mechanism in Eq. (10):

$$v_u^2\, Y'^\nu_{LR}\, \mathrm{diag}(M_1, M_2, M_3)^{-1}\, Y'^{\nu T}_{LR} = V_{MNS}\, \mathrm{diag}(m_1, m_2, m_3)\, V^T_{MNS}. \tag{14}$$

In order to solve Eq. (14) for the neutrino Yukawa matrix $Y'^\nu_{LR}$, we write
the equation in the form $AA^T = BB^T$ and then take the positive square root of the equation to give,

$$v_u Y'^\nu_{LR}\, \mathrm{diag}(M_1, M_2, M_3)^{-1/2} = V_{MNS}\, \mathrm{diag}(m_1, m_2, m_3)^{1/2} R^T, \tag{15}$$

where $R$ is a complex orthogonal matrix, $R^T R = I$, where $I$ is the unit matrix.
It is often used in phenomenological analyses to parameterize $Y'^\nu_{LR}$ in the see-saw flavour basis,
since $R$ determines $Y'^\nu_{LR}$ in terms of physical parameters from Eq. (15).
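As a minimal numerical sketch of Eq. (15), restricted to two real families, one can build the Dirac mass matrix $m_{LR} = v_u Y'^\nu_{LR}$ from a chosen R-matrix angle and check that the see-saw formula returns the same light mass matrix for every angle. The restriction to real 2 x 2 matrices, the sign convention of the rotation, the function names and all numerical inputs are assumptions made here for illustration and are not taken from the paper.

```python
import math

def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    """Transpose a 2x2 matrix."""
    return [[a[j][i] for j in range(2)] for i in range(2)]

def rotation(angle):
    """Real orthogonal matrix [[c, s], [-s, c]] (sign convention assumed here)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s], [-s, c]]

def diag_sqrt(values):
    """Diagonal matrix holding the square roots of two positive numbers."""
    return [[math.sqrt(values[0]), 0.0], [0.0, math.sqrt(values[1])]]

def dirac_matrix(light, heavy, theta23, theta):
    """Solve Eq. (15) for m_LR: V diag(m)^(1/2) R^T diag(M)^(1/2)."""
    v_mns = rotation(theta23)
    r_matrix = rotation(theta)
    left = matmul(v_mns, diag_sqrt(light))
    return matmul(matmul(left, transpose(r_matrix)), diag_sqrt(heavy))

def seesaw(m_lr, heavy):
    """Type I see-saw formula m_LR M^-1 m_LR^T for a diagonal heavy mass matrix."""
    inv_heavy = [[1.0 / heavy[0], 0.0], [0.0, 1.0 / heavy[1]]]
    return matmul(matmul(m_lr, inv_heavy), transpose(m_lr))

light = (1.0, 5.0)      # hypothetical light masses, arbitrary units
heavy = (2.0, 300.0)    # hypothetical heavy masses, arbitrary units
theta23 = math.pi / 4   # maximal mixing, chosen only for illustration

dirac_a = dirac_matrix(light, heavy, theta23, 0.3)
dirac_b = dirac_matrix(light, heavy, theta23, 1.2)
m_ll_a = seesaw(dirac_a, heavy)
m_ll_b = seesaw(dirac_b, heavy)
```

The two angles give different Dirac matrices but the same light mass matrix, which is the sense in which the R-matrix labels distinct see-saw solutions with identical low energy physics.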

In the above discussion the R-matrix was derived in the see-saw flavour basis. However one
can repeat the above derivation starting from a general charged lepton basis in which
$Y^E_{LR}$ and $Y^\nu_{LR}$ (unprimed matrices) are in general not diagonal (but retaining for the moment a
diagonal right-handed neutrino mass matrix), leading to:

$$v_u Y^\nu_{LR}\, \mathrm{diag}(M_1, M_2, M_3)^{-1/2} = V^\dagger_{\nu_L}\, \mathrm{diag}(m_1, m_2, m_3)^{1/2} R^T, \tag{16}$$

where $V_{\nu_L}$ diagonalises $m_{LL}$ in this basis, as in Eq. (5). Comparing Eq. (16) to
Eq. (15), the only change is that $V_{MNS}$ is replaced by $V^\dagger_{\nu_L}$ and the primed Yukawa matrix by the
unprimed one; in the see-saw flavour basis it happens that $V^\dagger_{\nu_L} = V_{MNS}$. The fact that the same
R-matrix appears in Eq. (16) as in Eq. (15) follows from the fact that $Y'^\nu_{LR} = V_{E_L} Y^\nu_{LR}$, where
$V_{E_L}$ diagonalizes the charged lepton mass matrix as in Eq. (5). Therefore by multiplying both
sides of Eq. (16) on the left by $V_{E_L}$, and comparing the resulting equation to Eq. (15), where the
MNS matrix is given by Eq. (6), we find the non-trivial result that the same R-matrix must appear
in both Eqs. (15) and (16). We conclude that the R-matrix is invariant under a change of charged
lepton basis.

We now prove that the R-matrix is invariant under a change of right-handed neutrino basis,
so that the right-handed neutrino mass matrix is no longer diagonal. The main observation is that
according to Eq. (16) the R-matrix parameterizes only the combination on its left-hand side, and this
combination is clearly invariant under $U_{\nu_R}(3)$, which also preserves the right-handed neutrino
masses. Under $\nu_R \to V_{\nu_R} \nu_R$, Eq. (16) thus becomes,

$$v_u Y^\nu_{LR} V^\dagger_{\nu_R}\, \mathrm{diag}(M_1, M_2, M_3)^{-1/2} = V^\dagger_{\nu_L}\, \mathrm{diag}(m_1, m_2, m_3)^{1/2} R^T, \tag{17}$$

with $R$ again invariant. The invariance of the R-matrix, together with Eq. (17), suggests the
following ISS model building strategy. In some particular given basis where the see-saw matrices
$Y^\nu_{LR}$, $Y^E_{LR}$ and $M_{RR}$ are not diagonal, Eq. (17) may be used to determine the R-matrix in terms
of the masses $m_i$, $M_i$, the matrix $V_{\nu_L}$ which diagonalises $m_{LL}$ in this basis, as in Eq. (5), and $V_{\nu_R}$
as defined below Eq. (12). Since the R-matrix is invariant under a change of basis, as we have
shown, it may then be used to label the invariant class of models to which the particular see-saw
matrices belong, $\{Y^\nu_{LR}, Y^E_{LR}, M_{RR}\} \in C(R)$.

Finally we show that the R-matrix is also invariant under non-unitary right-handed neutrino
transformations, namely $\nu_R \to S \nu_R$, where $S$ is non-singular, which results in:

$$Y^\nu_{LR} \to Y^\nu_{LR} S^{-1}, \qquad M_{RR} \to (S^T)^{-1} M_{RR} S^{-1}, \qquad M^{-1}_{RR} \to S M^{-1}_{RR} S^T. \tag{18}$$

The transformations in Eq. (18) leave the effective low energy neutrino mass matrix $m_{LL}$ invariant,
which follows from the see-saw mechanism in Eq. (10). However the right-handed neutrino
masses will change, since $S$ is non-unitary. By a suitable choice of $S$, $M_{RR}$ can be transformed
into a diagonal form,

$$(S^T)^{-1} M_{RR} S^{-1} = \mathrm{diag}(\tilde{M}_1, \tilde{M}_2, \tilde{M}_3), \tag{19}$$

where we emphasize that the choice of $S$ is not unique, and $\tilde{M}_i$ are not the eigenvalues of $M_{RR}$.
For example, without loss of generality, $S$ can always be chosen so that $\tilde{M}_i$ are all equal to unity
in some units. Allowing non-unitary S-matrix transformations, one can derive a similar result to
Eq. (17),

$$v_u Y^\nu_{LR} S^{-1}\, \mathrm{diag}(\tilde{M}_1, \tilde{M}_2, \tilde{M}_3)^{-1/2} = V^\dagger_{\nu_L}\, \mathrm{diag}(m_1, m_2, m_3)^{1/2} R^T, \tag{20}$$

where $S$ and $\tilde{M}_i$ are defined in Eq. (19), and $V_{\nu_L}$ is as before since $m_{LL}$ is invariant under S-matrix
transformations. $R$ is once again invariant, which essentially follows from the invariance
of the combination on the left-hand side of Eq. (20) under S-matrix transformations. For a given
non-diagonal set of see-saw matrices $Y^\nu_{LR}$, $Y^E_{LR}$ and $M_{RR}$, Eq. (20) can sometimes be used as
a short-cut to determining the invariant R-matrix, instead of Eq. (17). Since the R-matrix is
invariant under the S-matrix, as we have shown, it may then be used to label the invariant class of
models to which the particular see-saw matrices belong, $\{Y^\nu_{LR}, Y^E_{LR}, M_{RR}\} \in C(R)$, as before.
The S-matrix approach may be especially useful in low energy applications where the right-handed
neutrino masses are not required.
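The statement that the transformations in Eq. (18) leave $m_{LL}$ unchanged while altering the right-handed neutrino mass matrix can be checked with a small numerical sketch. It assumes two real families and units in which $v_u = 1$; the matrices and function names below are hypothetical illustrations, not values from the paper.

```python
def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    """Transpose a 2x2 matrix."""
    return [[a[j][i] for j in range(2)] for i in range(2)]

def determinant(a):
    """Determinant of a 2x2 matrix."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def inverse(a):
    """Inverse of a non-singular 2x2 matrix."""
    det = determinant(a)
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def seesaw(yukawa, m_rr):
    """Y M^-1 Y^T, i.e. the see-saw formula in units where v_u = 1."""
    return matmul(matmul(yukawa, inverse(m_rr)), transpose(yukawa))

def s_transform(yukawa, m_rr, s_matrix):
    """Apply Eq. (18): Y -> Y S^-1 and M -> (S^T)^-1 M S^-1."""
    s_inv = inverse(s_matrix)
    new_yukawa = matmul(yukawa, s_inv)
    new_m_rr = matmul(matmul(transpose(s_inv), m_rr), s_inv)
    return new_yukawa, new_m_rr

yukawa = [[0.2, 0.5], [0.7, -0.1]]     # hypothetical Yukawa matrix
m_rr = [[2.0, 0.3], [0.3, 5.0]]        # hypothetical symmetric Majorana matrix
s_matrix = [[1.5, 0.4], [-0.2, 0.8]]   # non-singular and not orthogonal

new_yukawa, new_m_rr = s_transform(yukawa, m_rr, s_matrix)
m_ll_before = seesaw(yukawa, m_rr)
m_ll_after = seesaw(new_yukawa, new_m_rr)
```

Here `m_ll_before` and `m_ll_after` agree to rounding error, whereas the determinant of `new_m_rr` differs from that of `m_rr` by a factor of one over the squared determinant of `s_matrix`, reflecting the change in the heavy masses.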

Note that the use of non-unitary transformations is familiar from the study of non-minimal

Kähler potentials in supersymmetric scenarios, which require such field transformations in order
to put the Kähler potential into canonical form, as has been recently studied [21]. However none
of these works explicitly addresses the invariance of the R-matrix for non-unitary right-handed
neutrino transformations. Similarly it was not proved either by the originators of the R-matrix,
Casas and Ibarra [7], or in [16-19]. The discussion of the S-matrix as applied to the R-matrix in

this section is therefore original and (as we shall see in Section 4.1.3) can be quite useful.

3. ISS approach to SD

3.1. Two family SD in the see-saw flavour basis

In this section we shall show that sequential dominance (SD) models [8] correspond to particular invariant classes of see-saw models characterized by particular forms of the R-matrix.

SD provides a good example of the invariant see-saw (ISS) approach, since SD is sometimes

criticized as being only valid in a special basis, namely the see-saw flavour basis. Defining SD

in terms of the R-matrix renders the SD approach basis independent which overcomes this criticism, and brings with it all the benefits already mentioned previously, some of which will be

explored further in the next section on applications. We shall begin by discussing the dominance

mechanism in a simple two family example, first in the see-saw flavour basis, then in terms of the

R-matrix which defines a basis independent formulation of SD. We then extend this discussion

to include three families, then take the two right-handed neutrino limit of these models.

To review the basic idea of SD, then, it is instructive to begin by discussing a simple $2 \times 2$

example applicable to the atmospheric mixing in the (2, 3) sector, in the see-saw flavour basis,

i.e. the diagonal charged lepton and right-handed neutrino Majorana mass basis, where we can

write,

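$$M_{RR} = \begin{pmatrix} M_A & 0 \\ 0 & M_B \end{pmatrix}, \qquad m_{LR} = \begin{pmatrix} A_2 & B_2 \\ A_3 & B_3 \end{pmatrix}, \tag{21}$$

where $m_{LR} = Y^\nu_{LR} v_u$. It is sufficient for the toy model to ignore phases, and suppose that $A_i$, $B_i$
are real. The see-saw formula in Eq. (10), $m_{LL} = m_{LR} M^{-1}_{RR} m^T_{LR}$, gives:

$$m_{LL} = \begin{pmatrix} \dfrac{A_2^2}{M_A} + \dfrac{B_2^2}{M_B} & \dfrac{A_2 A_3}{M_A} + \dfrac{B_2 B_3}{M_B} \\[2ex] \dfrac{A_2 A_3}{M_A} + \dfrac{B_2 B_3}{M_B} & \dfrac{A_3^2}{M_A} + \dfrac{B_3^2}{M_B} \end{pmatrix}. \tag{22}$$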

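The mass matrix in Eq. (22) is diagonalized to give two neutrino mass eigenvalues $m_2$, $m_3$ by
rotating through an angle $\theta_{23}$ given by,

$$\tan 2\theta_{23} = \frac{2\left(\dfrac{A_2 A_3}{M_A} + \dfrac{B_2 B_3}{M_B}\right)}{\left(\dfrac{A_3^2}{M_A} + \dfrac{B_3^2}{M_B}\right) - \left(\dfrac{A_2^2}{M_A} + \dfrac{B_2^2}{M_B}\right)}. \tag{23}$$

The determinant of the neutrino mass matrix $m_{LL}$ in Eq. (22) is

$$\det m_{LL} = \frac{(A_2 B_3 - A_3 B_2)^2}{M_A M_B} = m_2 m_3 \tag{24}$$

and the trace of the neutrino mass matrix $m_{LL}$ in Eq. (22) is

$$\mathrm{Tr}\, m_{LL} = \frac{A_2^2}{M_A} + \frac{A_3^2}{M_A} + \frac{B_2^2}{M_B} + \frac{B_3^2}{M_B} = m_2 + m_3 \approx m_3, \tag{25}$$

where the last approximation assumes a neutrino mass hierarchy $m_3 \gg m_2$. $m_2$ is then approximately
determined from the trace and determinant of the mass matrix as,

$$m_2 = \frac{\det m_{LL}}{m_3} \approx \frac{\det m_{LL}}{\mathrm{Tr}\, m_{LL}} = \frac{\dfrac{(A_2 B_3 - A_3 B_2)^2}{M_A M_B}}{\dfrac{(A_2^2 + A_3^2)}{M_A} + \dfrac{(B_2^2 + B_3^2)}{M_B}}. \tag{26}$$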

The basic assumption of SD is that one of the right-handed neutrinos plays the dominant
role in the see-saw mechanism. Without loss of generality we shall assume that the right-handed
neutrino of mass $M_A$ dominates the see-saw mechanism:

$$\frac{|A_i A_j|}{M_A} \gg \frac{|B_i B_j|}{M_B}. \tag{27}$$

Assuming the dominance approximation in Eq. (27), the determinant and trace of the mass matrix
in Eq. (22) imply that the neutrino mass spectrum then consists of one neutrino with mass
$m_3 \approx (A_2^2 + A_3^2)/M_A$ and one naturally light neutrino $m_2 \ll m_3$ determined from Eq. (26), since
the determinant of Eq. (22) is naturally small, and vanishes in the extreme limit of the dominance
approximation when only one right-handed neutrino contributes [8]. Under the dominance
approximation in Eq. (27), the atmospheric angle from Eq. (23) is $\tan\theta_{23} \approx A_2/A_3$ [8] which
can be large or maximal providing $A_2 \approx A_3$. Collecting together these results, the dominance
approximation in Eq. (27) leads to,

$$m_3 \approx \frac{(A_2^2 + A_3^2)}{M_A}, \qquad m_2 \approx \frac{(A_2 B_3 - A_3 B_2)^2}{(A_2^2 + A_3^2) M_B}, \qquad \tan\theta_{23} \approx \frac{A_2}{A_3}. \tag{28}$$

Therefore, assuming the dominance of a single right-handed neutrino, Eq. (28) shows that $m_3$ is
determined approximately by the right-handed neutrino with mass $M_A$, $m_2$ is determined approximately
by the right-handed neutrino with mass $M_B$, and $\tan\theta_{23}$ is determined approximately by
a simple ratio of Yukawa couplings, independently of the neutrino mass hierarchy. Note that
right-handed neutrino dominance allows the origin of the large mixing angle to be decoupled
from the neutrino mass hierarchy, allowing both features to co-emerge in a very natural way.

The above results can be expressed more compactly by introducing the column vector notation
[16],

$$v_A = M_A^{-1/2} \begin{pmatrix} A_2 \\ A_3 \end{pmatrix}, \qquad v_B = M_B^{-1/2} \begin{pmatrix} B_2 \\ B_3 \end{pmatrix}. \tag{29}$$

Then the see-saw formula in Eq. (10), $m_{LL} = m_{LR} M^{-1}_{RR} m^T_{LR}$, gives:

$$m_{LL} = v_A v_A^T + v_B v_B^T. \tag{30}$$

The determinant and trace of $m_{LL}$ are then

$$\det m_{LL} = |v_A \wedge v_B|^2 = m_2 m_3, \tag{31}$$

$$\mathrm{Tr}\, m_{LL} = |v_A|^2 + |v_B|^2 = m_2 + m_3 \approx m_3. \tag{32}$$

$m_2$ is then approximately determined from the trace and determinant of the mass matrix as,

$$m_2 = \frac{\det m_{LL}}{m_3} \approx \frac{\det m_{LL}}{\mathrm{Tr}\, m_{LL}} = \frac{|v_A \wedge v_B|^2}{|v_A|^2 + |v_B|^2}. \tag{33}$$

To arrange for a hierarchy $m_2/m_3 \sim 1/5$, we require the determinant to be small compared to
the square of the trace. This may be achieved using the dominance condition in Eq. (27) that the
right-handed neutrino of mass $M_A$ gives the dominant contribution to the see-saw mechanism,
which in vector notation implies:

$$|v_A|^2 \gg |v_B|^2. \tag{34}$$

We shall see in the next section that the dominance approximation leads to the vectors $v_A$ and
$v_B$ being approximately orthogonal and that there is a precise correlation between the degree of
orthogonality of these two vectors and the degree of dominance. Here we give two examples
which illustrate that the dominance condition only applies when the two vectors $v_A$ and $v_B$ are
sufficiently orthogonal:

• If $A_2 = A_3$ and $B_2 = -B_3$, corresponding to the two vectors $v_A$ and $v_B$ being exactly orthogonal,
then Eq. (28) gives,

$$m_3 \approx \frac{2A_2^2}{M_A}, \qquad m_2 \approx \frac{2B_2^2}{M_B}, \qquad \tan\theta_{23} \approx 1, \tag{35}$$

so that a hierarchy $m_2/m_3 \sim 1/5$ requires

$$\frac{A_2^2}{M_A} \approx 5\, \frac{B_2^2}{M_B}, \tag{36}$$

which satisfies the dominance condition in Eq. (27).

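• Now suppose that the two vectors are at 45° to each other, such as given by $A_2 = A_3$ and
$B_3 = 0$; then Eq. (28) becomes,

$$m_3 \approx \frac{2A_2^2}{M_A}, \qquad m_2 \approx \frac{B_2^2}{2M_B}, \qquad \tan\theta_{23} \approx 1, \tag{37}$$

so that a hierarchy $m_2/m_3 \sim 1/5$ requires

$$\frac{A_2^2}{M_A} \approx \frac{5}{4}\, \frac{B_2^2}{M_B}, \tag{38}$$

which only marginally satisfies the dominance condition in Eq. (27). If the vectors are more
closely aligned than about 45° then the dominance condition will not be satisfied.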
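The two examples above can be checked numerically with a short sketch that compares the exact eigenvalues of the two family mass matrix in Eq. (22) with the dominance estimates of Eq. (28). The function names and the specific parameter values (chosen so that the estimated ratio $m_2/m_3$ equals 1/5 in both cases) are illustrative assumptions rather than inputs from the paper.

```python
import math

def exact_masses(a2, a3, b2, b3, m_a, m_b):
    """Exact eigenvalues (m2, m3) of the symmetric matrix in Eq. (22)."""
    m11 = a2 * a2 / m_a + b2 * b2 / m_b
    m22 = a3 * a3 / m_a + b3 * b3 / m_b
    m12 = a2 * a3 / m_a + b2 * b3 / m_b
    trace = m11 + m22
    det = m11 * m22 - m12 * m12
    root = math.sqrt(trace * trace - 4.0 * det)
    return (trace - root) / 2.0, (trace + root) / 2.0

def sd_estimates(a2, a3, b2, b3, m_a, m_b):
    """Dominance estimates (m2, m3, tan theta23) of Eq. (28)."""
    norm_a = a2 * a2 + a3 * a3
    m3 = norm_a / m_a
    m2 = (a2 * b3 - a3 * b2) ** 2 / (norm_a * m_b)
    return m2, m3, a2 / a3

# Hypothetical inputs in arbitrary units.
orthogonal = dict(a2=1.0, a3=1.0, b2=1.0, b3=-1.0, m_a=1.0, m_b=5.0)
at_45_degrees = dict(a2=1.0, a3=1.0, b2=1.0, b3=0.0, m_a=1.0, m_b=1.25)

for label, case in (("orthogonal", orthogonal), ("45 degrees", at_45_degrees)):
    exact_m2, exact_m3 = exact_masses(**case)
    est_m2, est_m3, tan23 = sd_estimates(**case)
    print(label, exact_m2, exact_m3, est_m2, est_m3, tan23)
```

With these inputs the estimates coincide with the exact eigenvalues in the orthogonal case, whereas in the 45° case they deviate at roughly the twenty percent level, in line with the remark that the dominance condition is then only marginally satisfied.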

3.2. Two family SD and the R-matrix

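According to the ISS approach, we should formulate SD in terms of the invariant R-matrix.
From Eq. (15), we have for the two family toy model in Eq. (21), dropping primes, and assuming
$M_A < M_B$:

$$m_{LR} \begin{pmatrix} M_A^{-1/2} & 0 \\ 0 & M_B^{-1/2} \end{pmatrix} = V^{2\times 2}_{MNS} \begin{pmatrix} m_2^{1/2} & 0 \\ 0 & m_3^{1/2} \end{pmatrix} R^T_{2\times 2}. \tag{39}$$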

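The MNS matrix is parameterized by the atmospheric angle $\theta_{23}$, and the R-matrix may be
parameterized here by an angle $\theta$, ignoring phases,

$$V^{2\times 2}_{MNS} = \begin{pmatrix} c_{23} & s_{23} \\ -s_{23} & c_{23} \end{pmatrix}, \qquad R^T_{2\times 2} = \begin{pmatrix} c & -s \\ s & c \end{pmatrix}, \tag{40}$$

where $c = \cos\theta$, $s = \sin\theta$. Each choice of $\theta$ specifies a particular solution to the see-saw formula
for the combination $m_{LR} M^{-1/2}_{RR}$ on the left-hand side of Eq. (39).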

Using Eqs. (39), (40), we find the following expressions for the vectors introduced in Eq. (29),

\[ v_A = c\, m_2^{1/2} \begin{pmatrix} c_{23} \\ -s_{23} \end{pmatrix} + s\, m_3^{1/2} \begin{pmatrix} s_{23} \\ c_{23} \end{pmatrix}, \qquad v_B = -s\, m_2^{1/2} \begin{pmatrix} c_{23} \\ -s_{23} \end{pmatrix} + c\, m_3^{1/2} \begin{pmatrix} s_{23} \\ c_{23} \end{pmatrix}. \tag{41} \]

The single right-handed neutrino dominance approximation in Eq. (34) is then seen from Eq. (41) to correspond to values of $\theta \approx \pi/2$, since for hierarchical neutrinos $m_3 \gg m_2$. An interesting special limiting case is provided by the choice $\theta = \pi/2$, which corresponds to an R-matrix with an off-diagonal structure,

\[ R = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \tag{42} \]

In this limiting case Eq. (41) shows that the vector $v_B$ is exactly orthogonal to $v_A$. This example was discussed in the last section where it was shown to lead to Eq. (35), where the dominant right-handed neutrino dominates the see-saw mechanism by a factor of 5 according to Eq. (36). For small deviations from $\theta = \pi/2$, Eq. (41) shows that the vector $v_B$ is approximately orthogonal to $v_A$, and as the angle $\theta$ is decreased, the vectors $v_B$ and $v_A$ become less orthogonal.


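There is a precise correlation between the angle between the two vectors $v_A$ and $v_B$ and the degree of dominance, parameterized by the angle $\theta$. To see this we first write Eq. (41) in a more compact form as,

\[ v_A = c\,\mathbf{m}_2^{1/2} + s\,\mathbf{m}_3^{1/2}, \qquad v_B = -s\,\mathbf{m}_2^{1/2} + c\,\mathbf{m}_3^{1/2}, \tag{43} \]

where

\[ \left(\mathbf{m}_j^{1/2}\right)_i = \left(V^{2\times 2}_{MNS}\right)_{ij} m_j^{1/2}, \tag{44} \]

i.e. $\mathbf{m}_j^{1/2}$ is the $j$th column of $V^{2\times 2}_{MNS}$ times $m_j^{1/2}$. We see that $\mathbf{m}_i^{1/2} \cdot \mathbf{m}_j^{1/2} = \delta_{ij} m_i$, and $|\mathbf{m}_2^{1/2} \wedge \mathbf{m}_3^{1/2}|^2 = m_2 m_3$. The angle $\theta_{AB}$ between the two vectors $v_A$ and $v_B$ is then given by,

\[ \cos\theta_{AB} = \frac{(m_3 - m_2)\sin 2\theta}{2|v_A||v_B|}, \tag{45} \]

where

\[ |v_A|^2 = c^2 m_2 + s^2 m_3, \qquad |v_B|^2 = s^2 m_2 + c^2 m_3. \tag{46} \]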

From Eqs. (45), (46) it is seen that the angle $\theta$ simultaneously parameterizes the angle between the two column vectors and their ratio of magnitudes, which quantifies the precise degree of dominance. In particular, when $\theta \approx \pi/2$, then $\theta_{AB} \approx \pi/2$ and $|v_A|^2/|v_B|^2 \approx m_3/m_2 \approx 5$, corresponding to $|v_A|^2 \gg |v_B|^2$ as in Eq. (34). Once an angle $\theta$ and the right-handed neutrino masses have been chosen, and the vectors $v_B$ and $v_A$ thereby specified, we can invert Eq. (43) to express the neutrino mass eigenstates in terms of the different see-saw contributions,

\[ \mathbf{m}_2^{1/2} = c\,v_A - s\,v_B, \qquad \mathbf{m}_3^{1/2} = s\,v_A + c\,v_B. \tag{47} \]

With values of $\theta \approx \pi/2$, corresponding to single right-handed neutrino dominance, Eq. (47) clearly shows that the mass eigenstate $m_3$ mainly results from the see-saw contribution of the right-handed neutrino of mass $M_A$, and the mass eigenstate $m_2$ mainly results from the see-saw contribution of the right-handed neutrino of mass $M_B$. However Eq. (47) should be interpreted with care since it is only meaningful once Eq. (41) has first been used.
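The relations in Eqs. (43)-(47) can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes real parameters, the sign conventions $v_A = c\,\mathbf{m}_2^{1/2} + s\,\mathbf{m}_3^{1/2}$, $v_B = -s\,\mathbf{m}_2^{1/2} + c\,\mathbf{m}_3^{1/2}$, and purely illustrative values $m_2 = 1$, $m_3 = 5$, $\theta_{23} = \pi/4$. The function names are hypothetical.

```python
import math

def two_family_vectors(theta, theta23, m2, m3):
    """Return (vA, vB) of Eq. (43) as real 2-component vectors."""
    c, s = math.cos(theta), math.sin(theta)
    c23, s23 = math.cos(theta23), math.sin(theta23)
    # Columns of the 2x2 mixing matrix times sqrt(m_j), as in Eq. (44).
    col2 = (c23 * math.sqrt(m2), -s23 * math.sqrt(m2))
    col3 = (s23 * math.sqrt(m3), c23 * math.sqrt(m3))
    vA = tuple(c * a + s * b for a, b in zip(col2, col3))
    vB = tuple(-s * a + c * b for a, b in zip(col2, col3))
    return vA, vB

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cos_theta_AB(theta, m2, m3):
    """Right-hand side of Eq. (45), with the norms taken from Eq. (46)."""
    c, s = math.cos(theta), math.sin(theta)
    norm_A = math.sqrt(c * c * m2 + s * s * m3)
    norm_B = math.sqrt(s * s * m2 + c * c * m3)
    return (m3 - m2) * math.sin(2 * theta) / (2 * norm_A * norm_B)

m2, m3 = 1.0, 5.0            # illustrative hierarchy with m3/m2 = 5
theta23 = math.pi / 4        # illustrative maximal atmospheric angle
for theta in (math.pi / 2, 1.3, math.pi / 4, 0.2, 0.0):
    vA, vB = two_family_vectors(theta, theta23, m2, m3)
    direct = dot(vA, vB) / math.sqrt(dot(vA, vA) * dot(vB, vB))
    ratio = dot(vA, vA) / dot(vB, vB)
    print(f"theta={theta:.3f}  cos(theta_AB)={direct:+.4f}  "
          f"Eq.(45)={cos_theta_AB(theta, m2, m3):+.4f}  |vA|^2/|vB|^2={ratio:.3f}")
```

The two cosine columns should agree for every $\theta$, and the ratio of squared norms interpolates between $m_3/m_2$ at $\theta = \pi/2$ and $m_2/m_3$ at $\theta = 0$.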

It is also observed from Eqs. (45), (46) that when $\theta \approx 0$, then $\theta_{AB} \approx \pi/2$ and $|v_A|^2/|v_B|^2 \approx m_2/m_3 \approx 1/5$, corresponding to $|v_B|^2 \gg |v_A|^2$. This corresponds to another type of dominance in which the heavier right-handed neutrino of mass $M_B$ dominates the see-saw mechanism. So far we have been assuming that the lighter right-handed neutrino of mass $M_A$ dominates the see-saw mechanism, but now we see that there is an alternative case in which the heavier right-handed neutrino of mass $M_B$ is the dominant one. In this case the dominance of the right-handed neutrino of mass $M_B$ is achieved for $\theta \approx 0$, and then the R-matrix is the unit matrix,

\[ R = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{48} \]

The dominance approximation is thus seen to be valid over a large range of angles $\theta$ centered on either zero or $\pi/2$, corresponding to a large range of angles $\theta_{AB}$ in the range $\pi/4$ to $\pi/2$. Of course there is no precise value of $\theta$ at which the dominance approximation breaks down, and the parametrization shows that there is a continuum of theories which interpolate between those which have dominance of one right-handed neutrino and those which do not, in varying degrees. This analysis shows that the idea of single right-handed neutrino dominance is quite generic and it is quite likely to be relevant to some approximation in practice.

The above discussion illustrates that there are two types of dominance, one in which the lighter right-handed neutrino dominates, corresponding to an R-matrix with $\theta \approx \pi/2$ like Eq. (42), and one in which the heavier right-handed neutrino dominates, corresponding to an R-matrix with $\theta \approx 0$ like Eq. (48). In practice, in dealing with the second type of dominance, it is convenient to continue to identify the heavier dominant right-handed neutrino by the label A and to rewrite Eq. (21) in this case as:

\[ M_{RR} = \begin{pmatrix} M_B & 0 \\ 0 & M_A \end{pmatrix}, \qquad m_{LR} = \begin{pmatrix} B_2 & A_2 \\ B_3 & A_3 \end{pmatrix}, \tag{49} \]

where here $M_B < M_A$. Thus, when the heavier right-handed neutrino dominates, we shall perform a trivial relabelling $A \leftrightarrow B$ so that without loss of generality the right-handed neutrino of mass $M_A$ always dominates. Clearly in this second case, using Eq. (49), all the results in this section from Eq. (39) onwards follow as before but with a trivial relabelling $A \leftrightarrow B$. We emphasize again that the advantage of dominance is that the determinant of the neutrino mass matrix is naturally small, and also that the mixing angle is independent of the neutrino mass hierarchy, both features following from the fact that with $\theta \approx \pi/2$, $v_A \approx \mathbf{m}_3^{1/2}$ (the same result also being true for the case $\theta \approx 0$ after relabelling $A \leftrightarrow B$).

3.3. Three family SD in the see-saw flavour basis

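It is straightforward to extend two family SD in the see-saw flavour basis to the case of three families. Eq. (21) becomes,

\[ M_{RR} = \begin{pmatrix} M_A & 0 & 0 \\ 0 & M_B & 0 \\ 0 & 0 & M_C \end{pmatrix}, \qquad m_{LR} = \begin{pmatrix} A_1 & B_1 & C_1 \\ A_2 & B_2 & C_2 \\ A_3 & B_3 & C_3 \end{pmatrix}. \tag{50} \]

The column vector notation is trivially extended to three families [16],

\[ v_A = \begin{pmatrix} A_1 \\ A_2 \\ A_3 \end{pmatrix} M_A^{-1/2}, \qquad v_B = \begin{pmatrix} B_1 \\ B_2 \\ B_3 \end{pmatrix} M_B^{-1/2}, \qquad v_C = \begin{pmatrix} C_1 \\ C_2 \\ C_3 \end{pmatrix} M_C^{-1/2}. \tag{51} \]

Then the see-saw formula in Eq. (10), $m_{LL} = m_{LR} M_{RR}^{-1} m_{LR}^T$, gives:

\[ m_{LL} = v_A v_A^T + v_B v_B^T + v_C v_C^T. \tag{52} \]

The SD condition is

\[ \frac{|A_i A_j|}{M_A} \gg \frac{|B_i B_j|}{M_B} \gg \frac{|C_i C_j|}{M_C}, \tag{53} \]

or, in terms of the column vectors,

\[ |v_A|^2 \gg |v_B|^2 \gg |v_C|^2. \tag{54} \]

We also assume:

\[ |A_1| \ll |A_{2,3}|. \tag{55} \]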
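As a quick illustration of Eqs. (50)-(54), the sketch below compares the see-saw formula with a diagonal $M_{RR}$ against the sum of outer products $v v^T$, and prints the squared norms $|v_A|^2$, $|v_B|^2$, $|v_C|^2$. The numerical couplings and masses are hypothetical values chosen only so that the hierarchy of Eq. (54) holds; the helper names are likewise assumptions.

```python
def seesaw_mass_matrix(m_LR, M_diag):
    """m_LL = m_LR diag(M)^-1 m_LR^T for a diagonal right-handed mass matrix."""
    n = len(m_LR)
    return [[sum(m_LR[i][k] * m_LR[j][k] / M_diag[k] for k in range(len(M_diag)))
             for j in range(n)] for i in range(n)]

def outer_product_sum(m_LR, M_diag):
    """Sum over right-handed neutrinos of v v^T, with v = column / sqrt(M)."""
    n = len(m_LR)
    total = [[0.0] * n for _ in range(n)]
    for k, M in enumerate(M_diag):
        v = [m_LR[i][k] / M ** 0.5 for i in range(n)]
        for i in range(n):
            for j in range(n):
                total[i][j] += v[i] * v[j]
    return total

# Hypothetical couplings; the columns are A, B, C as in Eq. (50).
m_LR = [[0.0, 0.2, 0.01],
        [1.0, 0.2, 0.02],
        [1.0, -0.2, 0.03]]
M_diag = [1.0, 2.0, 10.0]    # M_A < M_B < M_C in arbitrary units

m_LL = seesaw_mass_matrix(m_LR, M_diag)
m_LL_sum = outer_product_sum(m_LR, M_diag)
norms = [sum(m_LR[i][k] ** 2 for i in range(3)) / M_diag[k] for k in range(3)]
print("|vA|^2, |vB|^2, |vC|^2 =", norms)
```

The two matrices `m_LL` and `m_LL_sum` agree entry by entry, which is just Eq. (52) written out.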

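Then approximate results for the masses and mixings are given by [8], writing $A_\alpha = |A_\alpha| e^{i\phi_{A_\alpha}}$, $B_\alpha = |B_\alpha| e^{i\phi_{B_\alpha}}$, $C_\alpha = |C_\alpha| e^{i\phi_{C_\alpha}}$:

\[ \tan\theta_{23} \approx \frac{|A_2|}{|A_3|}, \tag{56a} \]

\[ \tan\theta_{12} \approx \frac{|B_1|}{c_{23}|B_2|\cos\tilde\phi_2 - s_{23}|B_3|\cos\tilde\phi_3}, \tag{56b} \]

\[ \theta_{13} \approx e^{i(\tilde\phi + \phi_{B_1} - \phi_{A_2})} \frac{|B_1| (A_2^* B_2 + A_3^* B_3)}{[|A_2|^2 + |A_3|^2]^{3/2}} \frac{M_A}{M_B} + \frac{e^{i(\tilde\phi + \phi_{A_1} - \phi_{A_2})} |A_1|}{\sqrt{|A_2|^2 + |A_3|^2}}, \tag{56c} \]

\[ m_3 \approx \frac{(|A_2|^2 + |A_3|^2) v^2}{M_A}, \tag{57a} \]

\[ m_2 \approx \frac{|B_1|^2 v^2}{s_{12}^2 M_B}, \tag{57b} \]

\[ m_1 \approx O\left(|C|^2 v^2 / M_C\right). \tag{57c} \]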

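The MNS phase $\delta$ is fixed by the requirement that we have already imposed in Eq. (56b) that $\theta_{12}$ is real,

\[ c_{23}|B_2|\sin\tilde\phi_2 \approx s_{23}|B_3|\sin\tilde\phi_3, \tag{58} \]

where

\[ \tilde\phi_2 \equiv \phi_{B_2} - \phi_{B_1} - \tilde\phi + \delta, \qquad \tilde\phi_3 \equiv \phi_{B_3} - \phi_{B_1} + \phi_{A_2} - \phi_{A_3} - \tilde\phi + \delta. \tag{59} \]

The phase $\tilde\phi$ is fixed by the requirement (not yet imposed in Eq. (56c)) that the angle $\theta_{13}$ is real. In general this condition is rather complicated since the expression for $\theta_{13}$ is a sum of two terms. However if, for example, $A_1 = 0$ then $\tilde\phi$ is fixed by:

\[ \tilde\phi \approx \phi_{A_2} - \phi_{B_1} - \zeta \tag{60} \]

where

\[ \zeta = \arg\left(A_2^* B_2 + A_3^* B_3\right). \tag{61} \]

Eq. (61) may be expressed as

\[ \tan\zeta \approx \frac{|B_2| s_{23} s_2 + |B_3| c_{23} s_3}{|B_2| s_{23} c_2 + |B_3| c_{23} c_3}. \tag{62} \]

Inserting $\tilde\phi$ in Eq. (60) into Eqs. (58), (59), we obtain a relation which can be expressed as

\[ \tan(\zeta + \delta) \approx \frac{|B_2| c_{23} s_2 - |B_3| s_{23} s_3}{-|B_2| c_{23} c_2 + |B_3| s_{23} c_3}. \tag{63} \]

In Eqs. (62), (63) we have written $s_i = \sin\eta_i$, $c_i = \cos\eta_i$, where we have defined

\[ \eta_2 \equiv \phi_{B_2} - \phi_{A_2}, \qquad \eta_3 \equiv \phi_{B_3} - \phi_{A_3}, \tag{64} \]

which are invariant under a charged lepton phase transformation. The reason why the see-saw parameters only involve two invariant phases rather than the usual six is the sequential dominance assumption, which effectively decouples one of the right-handed neutrinos, thereby removing three phases, together with the further assumption (in this case) of $A_1 = 0$, which removes another phase.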
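The step from Eqs. (58)-(61) to the relations for $\tan\zeta$ and $\tan(\zeta+\delta)$ can be spelled out as follows. This is a sketch under the assumptions that $\tan\theta_{23} \approx |A_2|/|A_3|$ (so that $|A_2| \propto s_{23}$ and $|A_3| \propto c_{23}$), that the phases are written as $\phi_{A_i}$, $\phi_{B_i}$, and that $\eta_i \equiv \phi_{B_i} - \phi_{A_i}$ with $s_i = \sin\eta_i$, $c_i = \cos\eta_i$.

```latex
% Step 1: the argument zeta of A_2^* B_2 + A_3^* B_3, using |A_2| \propto s_{23}, |A_3| \propto c_{23}.
A_2^* B_2 + A_3^* B_3 \;\propto\; s_{23}|B_2| e^{i\eta_2} + c_{23}|B_3| e^{i\eta_3}
\quad\Longrightarrow\quad
\tan\zeta \approx \frac{|B_2| s_{23} s_2 + |B_3| c_{23} s_3}{|B_2| s_{23} c_2 + |B_3| c_{23} c_3}.

% Step 2: insert \tilde\phi \approx \phi_{A_2} - \phi_{B_1} - \zeta into the definitions of \tilde\phi_{2,3}.
\tilde\phi_2 \approx \eta_2 + \zeta + \delta, \qquad
\tilde\phi_3 \approx \eta_3 + \zeta + \delta.

% Step 3: impose c_{23}|B_2| \sin\tilde\phi_2 \approx s_{23}|B_3| \sin\tilde\phi_3 and expand the sines.
\sin(\zeta+\delta)\left(c_{23}|B_2| c_2 - s_{23}|B_3| c_3\right)
\approx \cos(\zeta+\delta)\left(s_{23}|B_3| s_3 - c_{23}|B_2| s_2\right)
\quad\Longrightarrow\quad
\tan(\zeta+\delta) \approx \frac{|B_2| c_{23} s_2 - |B_3| s_{23} s_3}{-|B_2| c_{23} c_2 + |B_3| s_{23} c_3}.
```

Both results depend on the couplings only through $|B_2|$, $|B_3|$, $\theta_{23}$ and the two phases $\eta_2$, $\eta_3$.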

3.4. Three family SD and the R-matrix

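We now discuss the R-matrix for this case. From Eq. (15), we have for the three right-handed neutrino model, dropping primes and assuming $M_A < M_B < M_C$:

\[ m_{LR} \begin{pmatrix} M_A & 0 & 0 \\ 0 & M_B & 0 \\ 0 & 0 & M_C \end{pmatrix}^{-1/2} = V^{MNS} \begin{pmatrix} m_1 & 0 & 0 \\ 0 & m_2 & 0 \\ 0 & 0 & m_3 \end{pmatrix}^{1/2} R^T. \tag{65} \]

Eq. (65) yields the following expressions for the column vectors introduced in Eq. (51),

\[ \left( v_A \;\; v_B \;\; v_C \right) = \left( \mathbf{m}_1^{1/2} \;\; \mathbf{m}_2^{1/2} \;\; \mathbf{m}_3^{1/2} \right) R^T, \tag{66} \]

where

\[ \left(\mathbf{m}_j^{1/2}\right)_i = V^{MNS}_{ij}\, m_j^{1/2}. \tag{67} \]

The MNS matrix is given by,

\[ V^{MNS} = \begin{pmatrix} c_{12}c_{13} & s_{12}c_{13} & s_{13}e^{-i\delta} \\ -s_{12}c_{23} - c_{12}s_{23}s_{13}e^{i\delta} & c_{12}c_{23} - s_{12}s_{23}s_{13}e^{i\delta} & s_{23}c_{13} \\ s_{12}s_{23} - c_{12}c_{23}s_{13}e^{i\delta} & -c_{12}s_{23} - s_{12}c_{23}s_{13}e^{i\delta} & c_{23}c_{13} \end{pmatrix} P_0, \tag{68} \]

where

\[ P_0 = \begin{pmatrix} e^{i\beta_1} & 0 & 0 \\ 0 & e^{i\beta_2} & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{69} \]

The R-matrix is a complex orthogonal $3 \times 3$ matrix which can be parameterized in terms of three complex angles $\theta_i$ as $R = \mathrm{diag}(\pm 1, \pm 1, \pm 1) R_1 R_2 R_3$ where $R_i^T$ take the form of Eq. (40):

\[ R_1^T = \begin{pmatrix} 1 & 0 & 0 \\ 0 & c_1 & -s_1 \\ 0 & s_1 & c_1 \end{pmatrix}, \qquad R_2^T = \begin{pmatrix} c_2 & 0 & -s_2 \\ 0 & 1 & 0 \\ s_2 & 0 & c_2 \end{pmatrix}, \qquad R_3^T = \begin{pmatrix} c_3 & -s_3 & 0 \\ s_3 & c_3 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \tag{70} \]

so that, omitting the sign matrix,

\[ R = \begin{pmatrix} c_2 c_3 & c_2 s_3 & s_2 \\ -c_1 s_3 - s_1 s_2 c_3 & c_1 c_3 - s_1 s_2 s_3 & s_1 c_2 \\ s_1 s_3 - c_1 s_2 c_3 & -s_1 c_3 - c_1 s_2 s_3 & c_1 c_2 \end{pmatrix}. \tag{71} \]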
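The product form and the explicit matrix of the R-matrix parameterization can be cross-checked numerically. The sketch below assumes the sign convention $R_1^T = \begin{pmatrix} 1 & 0 & 0 \\ 0 & c_1 & -s_1 \\ 0 & s_1 & c_1 \end{pmatrix}$ (and analogously for $R_2^T$, $R_3^T$), omits the overall sign matrix, and uses arbitrary complex test angles; all function names are hypothetical.

```python
import cmath

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(X):
    return [[X[j][i] for j in range(3)] for i in range(3)]

def rotation_transposes(t1, t2, t3):
    """The three factors R_i^T for (possibly complex) angles t1, t2, t3."""
    c1, s1 = cmath.cos(t1), cmath.sin(t1)
    c2, s2 = cmath.cos(t2), cmath.sin(t2)
    c3, s3 = cmath.cos(t3), cmath.sin(t3)
    R1T = [[1, 0, 0], [0, c1, -s1], [0, s1, c1]]
    R2T = [[c2, 0, -s2], [0, 1, 0], [s2, 0, c2]]
    R3T = [[c3, -s3, 0], [s3, c3, 0], [0, 0, 1]]
    return R1T, R2T, R3T

def r_matrix_from_factors(t1, t2, t3):
    """R = R1 R2 R3, each R_i being the transpose of R_i^T."""
    R1T, R2T, R3T = rotation_transposes(t1, t2, t3)
    return matmul(matmul(transpose(R1T), transpose(R2T)), transpose(R3T))

def r_matrix_closed_form(t1, t2, t3):
    """Explicit entries of R for the same angles."""
    c1, s1 = cmath.cos(t1), cmath.sin(t1)
    c2, s2 = cmath.cos(t2), cmath.sin(t2)
    c3, s3 = cmath.cos(t3), cmath.sin(t3)
    return [[c2 * c3, c2 * s3, s2],
            [-c1 * s3 - s1 * s2 * c3, c1 * c3 - s1 * s2 * s3, s1 * c2],
            [s1 * s3 - c1 * s2 * c3, -s1 * c3 - c1 * s2 * s3, c1 * c2]]

def max_abs_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(3) for j in range(3))

angles = (0.3 + 0.1j, -0.7 + 0.2j, 1.1 - 0.4j)   # arbitrary complex test angles
R = r_matrix_from_factors(*angles)
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print("product vs closed form:", max_abs_diff(R, r_matrix_closed_form(*angles)))
print("R R^T vs identity:     ", max_abs_diff(matmul(R, transpose(R)), identity))
```

Both printed differences should be at the level of floating-point rounding, confirming that $R$ is complex orthogonal ($R R^T = 1$) rather than unitary.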


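Although the R-matrix is rather complicated, it is clear from Eq. (66) that SD occurs for values of angles $\theta_i$ which correspond to the following approximate forms for the moduli of the elements of $R^T$:

\[ |R^T_{ABC}| \approx \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}: \qquad \left( v_A \;\; v_B \;\; v_C \right) \approx \left( \mathbf{m}_3^{1/2} \;\; \mathbf{m}_2^{1/2} \;\; \mathbf{m}_1^{1/2} \right), \tag{72} \]

\[ |R^T_{ACB}| \approx \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}: \qquad \left( v_A \;\; v_C \;\; v_B \right) \approx \left( \mathbf{m}_3^{1/2} \;\; \mathbf{m}_1^{1/2} \;\; \mathbf{m}_2^{1/2} \right), \tag{73} \]

\[ |R^T_{BAC}| \approx \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}: \qquad \left( v_B \;\; v_A \;\; v_C \right) \approx \left( \mathbf{m}_2^{1/2} \;\; \mathbf{m}_3^{1/2} \;\; \mathbf{m}_1^{1/2} \right), \tag{74} \]

\[ |R^T_{BCA}| \approx \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}: \qquad \left( v_B \;\; v_C \;\; v_A \right) \approx \left( \mathbf{m}_2^{1/2} \;\; \mathbf{m}_1^{1/2} \;\; \mathbf{m}_3^{1/2} \right), \tag{75} \]

\[ |R^T_{CAB}| \approx \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}: \qquad \left( v_C \;\; v_A \;\; v_B \right) \approx \left( \mathbf{m}_1^{1/2} \;\; \mathbf{m}_3^{1/2} \;\; \mathbf{m}_2^{1/2} \right), \tag{76} \]

\[ |R^T_{CBA}| \approx \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}: \qquad \left( v_C \;\; v_B \;\; v_A \right) \approx \left( \mathbf{m}_1^{1/2} \;\; \mathbf{m}_2^{1/2} \;\; \mathbf{m}_3^{1/2} \right). \tag{77} \]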

As discussed in the previous section, without loss of generality we have assumed that the dominant right-handed neutrino is labelled by A, the leading subdominant right-handed neutrino is labelled by B, and the subsubdominant right-handed neutrino is labelled by C, and we have relabelled the right-handed neutrinos where appropriate according to this convention. The possible forms of the neutrino Dirac mass matrix $m_{LR}$ corresponding to the above types of SD are then given by

\[ m_{LR} = (A, B, C) \quad \text{or} \quad m_{LR} = (A, C, B) \quad \text{for } M_1 = M_A, \tag{78} \]

\[ m_{LR} = (B, A, C) \quad \text{or} \quad m_{LR} = (B, C, A) \quad \text{for } M_1 = M_B, \tag{79} \]

\[ m_{LR} = (C, A, B) \quad \text{or} \quad m_{LR} = (C, B, A) \quad \text{for } M_1 = M_C, \tag{80} \]

where we have ordered the columns in each case according to $M_{RR} = \mathrm{diag}(M_1, M_2, M_3)$ where $M_1 < M_2 < M_3$, consistent with Eq. (65).

Clearly the different types of SD correspond to the moduli of the R-matrix elements taking values close to either zero or unity, so that each of the vectors $v_A$, $v_B$, $v_C$ is approximately equal to a particular vector $\mathbf{m}_i^{1/2}$. Considering the modular surfaces of $\sin\theta_i$ and $\cos\theta_i$, this corresponds to the angles $\theta_i$ being approximately real and taking values close to either zero or $\pi/2$, which is a generalization of the situation in the two family example discussed previously. Note that SD therefore implies that the R-matrix is approximately real. Since there has been some recent interest in the case of the real R-matrix in the context of flavour-dependent leptogenesis, we shall return to this point later in Section 4.
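The six limiting patterns of $|R^T|$ follow mechanically from the mass ordering of the right-handed neutrinos. The sketch below is illustrative only: it assumes the convention that the columns of $R^T$ are ordered by increasing right-handed neutrino mass and that A, B, C are associated with $\mathbf{m}_3^{1/2}$, $\mathbf{m}_2^{1/2}$, $\mathbf{m}_1^{1/2}$ respectively; `sd_pattern` is a hypothetical helper name.

```python
def sd_pattern(order):
    """Limiting 0/1 pattern of |R^T| for a mass ordering such as 'ABC'.

    `order` lists the right-handed neutrinos from lightest to heaviest.
    Column k of R^T selects which m_i^(1/2) the k-th lightest neutrino
    reproduces: A -> m3 (row 2), B -> m2 (row 1), C -> m1 (row 0).
    """
    row_of = {"A": 2, "B": 1, "C": 0}
    pattern = [[0, 0, 0] for _ in range(3)]
    for column, label in enumerate(order):
        pattern[row_of[label]][column] = 1
    return pattern

for order in ("ABC", "ACB", "BAC", "BCA", "CAB", "CBA"):
    print(order, sd_pattern(order))
```

With this convention the ordering CBA (dominant neutrino heaviest) gives the unit matrix, while ABC (dominant neutrino lightest) gives the anti-diagonal pattern.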


In this subsection we consider the two right-handed neutrino limit of SD. We shall suppose that we have SD but not exact tri-bimaximal mixing. In this case R takes the approximate forms discussed in the previous section. For definiteness we will consider the type of SD corresponding to R being close to the unit matrix. The other kinds of SD are discussed in Appendix A.

The two right-handed neutrino approximation corresponds to the limit in which the right-handed neutrino labelled by C decouples from the see-saw mechanism, where this limit also corresponds to $m_1 = 0$. In this limit of SD we shall see that the model reduces to the two right-handed neutrino model with SD introduced in [8]. For example, let us consider the case of R being approximately equal to the unit matrix, corresponding to the type of SD given in Eq. (77). In the C decoupling limit this corresponds to:

\[ \left( 0 \;\; v_B \;\; v_A \right) = \left( 0 \;\; \mathbf{m}_2^{1/2} \;\; \mathbf{m}_3^{1/2} \right) R^T_{0BA}. \tag{81} \]

This limit corresponds to $s_2 = s_3 = 0$, with only $s_1 \neq 0$, giving:

\[ R^T_{0BA} = R_3^T\big|_{s_3=0}\, R_2^T\big|_{s_2=0}\, R_1^T = \begin{pmatrix} 1 & 0 & 0 \\ 0 & c_1 & -s_1 \\ 0 & s_1 & c_1 \end{pmatrix}, \tag{82} \]

so that

\[ v_B = c_1 \mathbf{m}_2^{1/2} + s_1 \mathbf{m}_3^{1/2}, \qquad v_A = -s_1 \mathbf{m}_2^{1/2} + c_1 \mathbf{m}_3^{1/2}, \tag{83} \]

similar to Eq. (43) in the two family model, except that here the vectors have three components. SD here corresponds to a small angle $\theta_1 \approx 0$ (for both real and imaginary components). A zero value $\theta_1 = 0$ implies that $A_1 \propto \theta_{13}$, as discussed. However a non-zero angle $\theta_1$ allows, for example, a zero value $A_1 = 0$ consistent with a non-zero value of $\theta_{13}$. For example $A_1 = 0$ implies from Eq. (83),

\[ \tan\theta_1 \approx e^{-i(\delta + \beta_2)}\, \frac{\tan\theta_{13}}{s_{12}} \left(\frac{m_3}{m_2}\right)^{1/2}. \tag{84} \]

This result shows that, with a texture zero $A_1 = 0$, small $\theta_{13}$ implies also small $\theta_1$. This is a remarkable result: in general having a small value of $A_1$ combined with small $\theta_{13}$ in the two right-handed neutrino limit implies also small (but non-zero in general) $\theta_1$, corresponding to SD. In the two right-handed neutrino limit it is impossible to have a texture zero $A_1$ without SD.
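The link between the texture zero $A_1 = 0$ and the angle $\theta_1$ can be verified numerically. The sketch below assumes a PDG-like phase convention, $V_{12} = s_{12}c_{13}e^{i\beta_2}$ and $V_{13} = s_{13}e^{-i\delta}$ (the name $\beta_2$ for the Majorana phase is an assumption), takes $v_A = -s_1 \mathbf{m}_2^{1/2} + c_1 \mathbf{m}_3^{1/2}$, and uses illustrative parameter values only.

```python
import cmath
import math

def theta1_for_texture_zero(theta12, theta13, delta, beta2, m2, m3):
    """Complex angle theta_1 that solves A_1 = 0 in the two right-handed neutrino limit."""
    tan_theta1 = (cmath.exp(-1j * (delta + beta2)) * math.tan(theta13)
                  / math.sin(theta12) * math.sqrt(m3 / m2))
    return cmath.atan(tan_theta1)

def first_component_of_vA(theta1, theta12, theta13, delta, beta2, m2, m3):
    """First entry of v_A = -s1 m_2^(1/2) + c1 m_3^(1/2); proportional to A_1."""
    V12 = math.sin(theta12) * math.cos(theta13) * cmath.exp(1j * beta2)
    V13 = math.sin(theta13) * cmath.exp(-1j * delta)
    return (-cmath.sin(theta1) * V12 * math.sqrt(m2)
            + cmath.cos(theta1) * V13 * math.sqrt(m3))

# Illustrative inputs: tri-bimaximal solar angle, small theta13, m3/m2 = 5.
params = dict(theta12=math.asin(1 / math.sqrt(3)), theta13=0.05,
              delta=0.7, beta2=0.4, m2=1.0, m3=5.0)
theta1 = theta1_for_texture_zero(**params)
print("|theta_1| =", abs(theta1))
print("|A_1| proportional to", abs(first_component_of_vA(theta1, **params)))
```

For these inputs $|\theta_1|$ comes out small (of order $\theta_{13}\sqrt{m_3/m_2}/s_{12}$) and the first component of $v_A$ vanishes to rounding accuracy.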

A similar analysis follows for the other types of SD, where the right-handed neutrino labelled

by C in these cases can be decoupled in a similar way. In each case it is necessary to allow

the remaining dominant and subdominant right-handed neutrinos to mix, in order to allow for

the most general kind of SD, and we identify the single remaining mixing angle in each case.

The other cases are discussed in Appendix A.

The above discussion and Appendix A show how an effective two right-handed neutrino model arises as a limiting case of the three right-handed neutrino model in which the right-handed neutrino labelled by C is decoupled. In this decoupling limit the remaining two right-handed neutrino system is parameterized in each case by a single non-trivial complex angle, where the nature of the angle and the values of the other fixed angles of the R-matrix depend on the type of three right-handed neutrino SD. In particular the limiting cases all lead to relations similar to

\[ v_A = c\,\mathbf{m}_2^{1/2} + s\,\mathbf{m}_3^{1/2}, \qquad v_B = -s\,\mathbf{m}_2^{1/2} + c\,\mathbf{m}_3^{1/2}, \tag{85} \]

where the main difference is that the vectors here have three components. For each type of SD it is straightforward to relate the angle $\theta$ to either $\theta_1$ or $\theta_2$ using the results given in Eqs. (83), (A.3), (A.6), (A.9), (A.12), (A.15). The following discussion will be based on the angle $\theta$ defined in Eq. (85), assuming that this identification has been made.

Eq. (85) again leads to a similar geometrical relation between the R-matrix angle $\theta$ and the angle between the two vectors $v_A$ and $v_B$ as in Eq. (45), where the magnitudes of the two vectors are as in Eq. (46). These results follow from the unitarity of $V^{MNS}$ (since recall that $\mathbf{m}_i^{1/2}$ is proportional to the $i$th column of $V^{MNS}$) which gives:

\[ \langle \mathbf{m}_i^{1/2} | \mathbf{m}_j^{1/2} \rangle = \delta_{ij}\, m_j \tag{86} \]

and hence:

\[ \langle v_A | v_B \rangle = -c^* s\, m_2 + s^* c\, m_3, \qquad \langle v_A | v_A \rangle = c^* c\, m_2 + s^* s\, m_3, \qquad \langle v_B | v_B \rangle = s^* s\, m_2 + c^* c\, m_3. \tag{87} \]

In the case of tri-bimaximal mixing $s = 0$ or $c = 0$ and hence $\langle v_A | v_B \rangle = 0$, i.e. orthogonality of the dominant and subdominant columns of the Yukawa matrix, as in Eq. (91). However, as the previous discussion shows, away from the tri-bimaximal limit these limits are in general too strong, and so we must in general consider $s, c \neq 0$, with SD corresponding to either $s \approx 0$ or $c \approx 0$, which implies that the R-matrix angle $\theta$ takes approximately real values close to zero or $\pi/2$. We also remark that it is trivial to generalize the result in Eq. (84) to all the other types of SD. In other words a texture zero $A_1 = 0$ directly implies SD, for each of the types of SD.

It is possible to regard the two right-handed neutrino model as a complete model in its own

right, not as a limiting case of a three right-handed neutrino model. This is not so well motivated

as the limiting cases discussed here. However in such a case one may take Eq. (85) as the starting

point for the exploration of the parameter space. This has been discussed fully elsewhere [22],

so we shall not pursue this point further here. However the results in this subsection should be

useful in relating a three right-handed neutrino analysis to the two right-handed neutrino limit,

and in particular to the SD regions of parameter space of this limit.

4. Applications of basis independent SD

In this section we first discuss the application of these new ideas to flavour models, then

discuss the implications for approaches based on the R-matrix, including flavour-dependent leptogenesis which has recently been studied in the literature.

4.1. Examples of models in the same invariant class as SD

The usual application of SD to flavour models in the literature is in the see-saw flavour basis

corresponding to diagonal mass matrices of charged leptons and right-handed neutrinos, or small

perturbations away from the diagonal structures. This severely restricts the applicability of SD,


and may even lead one to believe that SD is an artefact of that particular basis, or could be

transformed away by going to another basis, or even that it is meaningless since all see-saw

models are related to each other by a change of basis. We have shown explicitly in this paper that

none of these statements is true. We have shown how the different types of SD may be formulated

in a basis independent way in terms of the R-matrix, since, as we have also shown, each choice of

R-matrix labels an infinite equivalence class of see-saw models related to each other by changes

of lepton basis. These results open the door for new applications of SD away from the usual

diagonal basis of charged leptons and right-handed neutrinos. In this subsection we illustrate the

possibilities by highlighting some existing models in the literature which are now seen to be SD

in disguise, i.e. are in the same invariant class as SD.

4.1.1. Tri-bimaximal neutrino mixing and CSD: Charged lepton corrections

In this subsection, we first discuss CSD and tri-bimaximal neutrino mixing in terms of the

R-matrix. We shall show that the R-matrix elements take quite precise values equal to either

zero or plus or minus unity (we shall discuss how precise) in this case, which are unaffected by

charged lepton corrections, according to Section 2.4 in which a change of charged lepton basis

leaves the R-matrix invariant. However the MNS matrix is subject to observable deviations from

tri-bimaximal mixing due to charged lepton corrections. The lesson from this is that the charged

lepton corrections can result in a change of the invariant class of see-saw model, not due to a

change in R but due to a change in the physical parameters.

In the notation of Eq. (50), tri-bimaximal mixing [3] corresponds to the choice [10]:

\[ |A_1| = 0, \tag{88} \]

\[ |A_2| = |A_3|, \tag{89} \]

\[ |B_1| = |B_2| = |B_3|, \tag{90} \]

\[ A^\dagger B = 0. \tag{91} \]

This is called constrained SD (CSD) [10]. Note that there is no constraint imposed on the couplings $C_i$ since these describe the right-handed neutrino which is approximately decoupled from the see-saw mechanism.

In terms of the R-matrix SD corresponds to the special case that the R-matrix elements are

approximately equal to zero or plus or minus unity. We now show that the accurate limit of SD,

in which the elements of R are zero or plus or minus unity very accurately, corresponds to CSD

and tri-bimaximal mixing. We shall consider the case of the R-matrix approximately equal to the

unit matrix (the other cases follow similarly). In this case we can write Eq. (77) explicitly as:

\[ \left( C_i M_C^{-1/2} \;\; B_i M_B^{-1/2} \;\; A_i M_A^{-1/2} \right) = \left( V_{i1} m_1^{1/2} \;\; V_{i2} m_2^{1/2} \;\; V_{i3} m_3^{1/2} \right) R^T_{CBA} \tag{92} \]

where we have written $V_{ij} = V^{MNS}_{ij}$. If we take $R^T_{CBA} = \mathrm{diag}(1, 1, 1)$ precisely, then Eq. (92) implies for example that $A_i \propto V_{i3}$, so that $A_1 = 0$ would imply that $\theta_{13} = 0$ (cf. the general case from Eq. (56c) where $\theta_{13}$ involves a contribution from a term which is independent of $A_1$). We further note that for $R^T_{CBA} = \mathrm{diag}(1, 1, 1)$ and for tri-bimaximal mixing angles $\theta_{13} = 0$, $\sin\theta_{23} = 1/\sqrt{2}$, $\sin\theta_{12} = 1/\sqrt{3}$, Eq. (92) gives

\[ \begin{pmatrix} A_1 \\ A_2 \\ A_3 \end{pmatrix} \propto \begin{pmatrix} 0 \\ s_{23} \\ c_{23} \end{pmatrix} \propto \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}, \tag{93} \]

\[ \begin{pmatrix} B_1 \\ B_2 \\ B_3 \end{pmatrix} \propto \begin{pmatrix} s_{12} \\ c_{12} c_{23} \\ -c_{12} s_{23} \end{pmatrix} \propto \begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix}, \tag{94} \]

ignoring the irrelevant couplings $C_i$. These satisfy the CSD conditions for the Yukawa couplings discussed in Eqs. (88)-(91) [10]. We conclude that with R precisely equal to the unit matrix tri-bimaximal mixing implies and is implied by CSD. Of course this is not the only way to achieve tri-bimaximal mixing, which could be achieved via any other choice of R-matrix, corresponding to other choices of Yukawa couplings, but this choice of Yukawa couplings appears to be the simplest, and could arise for example from vacuum alignment in flavour models [10,12]. Indeed the simplicity of the Yukawa couplings in this case provides a powerful motivation for SD. Similar forms of the Yukawa matrices of the CSD form for tri-bimaximal mixing emerge from the other types of SD in Eqs. (73)-(77) when the R matrices take the exact forms shown there (with the elements being precisely $0, \pm 1$) rather than just the approximate forms.
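The statement that a unit R-matrix with tri-bimaximal angles reproduces the CSD conditions can be checked directly. The sketch below assumes that A and B are proportional to the third and second columns of the MNS matrix at $\theta_{13} = 0$ (with all phases set to zero), and reads the CSD conditions as $|A_1| = 0$, $|A_2| = |A_3|$, $|B_1| = |B_2| = |B_3|$ and $A^\dagger B = 0$; the function name is hypothetical.

```python
import math

def tribimaximal_yukawa_directions():
    """Directions of A and B for R = 1 and tri-bimaximal angles (phases ignored)."""
    s12, c12 = 1 / math.sqrt(3), math.sqrt(2 / 3)
    s23 = c23 = 1 / math.sqrt(2)
    A = (0.0, s23, c23)                  # third MNS column at theta13 = 0
    B = (s12, c12 * c23, -c12 * s23)     # second MNS column at theta13 = 0
    return A, B

A, B = tribimaximal_yukawa_directions()
tol = 1e-12
csd = {
    "|A1| = 0": abs(A[0]) < tol,
    "|A2| = |A3|": abs(abs(A[1]) - abs(A[2])) < tol,
    "|B1| = |B2| = |B3|": (abs(abs(B[0]) - abs(B[1])) < tol
                           and abs(abs(B[1]) - abs(B[2])) < tol),
    "A.B = 0": abs(sum(a * b for a, b in zip(A, B))) < tol,
}
for condition, holds in csd.items():
    print(f"{condition}: {holds}")
```

All four conditions evaluate to true, reflecting $A \propto (0, 1, 1)$ and $B \propto (1, 1, -1)$.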

In realistic models [10,12] it is typically the case that CSD arises through vacuum alignment in some theory basis, in which the charged lepton mass matrix is not precisely diagonal, resulting in charged lepton corrections to tri-bimaximal mixing. In the theory basis there is, to good approximation, tri-bimaximal neutrino mixing, and the neutrino Dirac mass matrix is parameterized in terms of a unit R-matrix (or one of the other exact forms in Eqs. (72)-(77)) as we have just seen. However, if, in some basis, the R-matrix is equal to the unit matrix, for example, then this will be true in all bases, as we showed in Section 2.4. In the presence of charged lepton corrections the MNS matrix will deviate from the tri-bimaximal form, but the R-matrix will remain equal to the unit matrix. In going from the theory basis to the see-saw flavour basis in which the charged lepton mass matrix is diagonal, both sides of Eq. (17) must be left multiplied by a matrix $V_{E_L}$ which diagonalizes the charged leptons, resulting in $V_{MNS}$ appearing on the right-hand side which is not of the precise tri-bimaximal form, even though R is precisely equal to the unit matrix in both the original basis and the primed basis. Interestingly, the neutrino mass matrix in the primed basis will retain the property that its columns are proportional to the columns of the MNS matrix, albeit that the MNS matrix is not precisely of the tri-bimaximal form.

We have seen that tri-bimaximal neutrino mixing from CSD corresponds to the R-matrix taking one of the forms in Eqs. (72)-(77) rather precisely. One may ask how accurately these forms should be achieved in realistic models. In practice, tri-bimaximal neutrino mixing relies on the conditions in Eqs. (88)-(91) being satisfied, which leads to tri-bimaximal mixing up to corrections of order $m_2/m_3$. The couplings $C_i$ are less constrained since they only give corrections to the mixing angles of order $m_1/m_3$, which may be quite small. We have already examined the limit where the right-handed neutrino labelled by C decouples and in this limit the corrections to tri-bimaximal neutrino mixing of order $m_2/m_3$ can be described by a single small angle as discussed in Section 3.5. For example, in the case of R being close to the unit matrix, R is described by $\theta_2 = \theta_3 = 0$ with small values of $\theta_1 \approx 0$ parameterizing the corrections of order $m_2/m_3$, according to Eq. (83). If we relax the decoupling of C then we can also account for corrections of order $m_1/m_3$ to the R-matrix, described by non-zero values of $\theta_2 \approx 0$ and $\theta_3 \approx 0$, which corresponds to:

\[ v_C = c_3 c_2\, \mathbf{m}_1^{1/2} + s_3 c_2\, \mathbf{m}_2^{1/2} + s_2\, \mathbf{m}_3^{1/2}. \tag{95} \]

We conclude that the case of CSD and tri-bimaximal neutrino mixing corresponds to the R-matrix taking quite exactly (up to corrections of order $m_1/m_3$, $m_2/m_3$) one of the forms in Eqs. (72)-(77). If the forms of the R-matrix deviate by more than this, but still resemble those forms to some degree, then we merely have SD not CSD, and exact tri-bimaximal neutrino mixing is lost. In the case of CSD, the presence of charged lepton mixing corrections will give observable corrections to tri-bimaximal mixing in the MNS matrix, resulting in testable predictions and sum rules for lepton mixing angles [10,13]; however these corrections leave the R-matrix unchanged from the precise forms just described. These precise forms of the R-matrix therefore represent the basis-independent signature of CSD and tri-bimaximal neutrino (rather than MNS) mixing which can be identified in phenomenological analyses based on the R-matrix.

4.1.2. Lepton mixing from the charged lepton sector

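We now discuss a class of models which account for lepton mixing purely as arising from the charged lepton sector. Such models have been discussed in [15], and we show here that they are in the same invariant class as SD models, i.e. are SD models in disguise. The starting point of these models is to assume that there is no mixing coming from the neutrino sector. The mass matrices are then written as:

\[ m^E_{LR} = \begin{pmatrix} p & d & a \\ q & e & b \\ r & f & c \end{pmatrix}, \qquad v_u Y^\nu_{LR} = \begin{pmatrix} C_1 & 0 & 0 \\ C_2 & B_2 & 0 \\ C_3 & B_3 & A_3 \end{pmatrix}, \qquad M_{RR} = \begin{pmatrix} M_C & 0 & 0 \\ 0 & M_B & 0 \\ 0 & 0 & M_A \end{pmatrix}, \tag{96} \]

and the following conditions are assumed:

\[ \frac{|A_3 A_3|}{M_A} \gg \frac{|B_i B_j|}{M_B} \gg \frac{|C_i C_j|}{M_C}, \tag{97} \]

which is the usual SD condition in Eq. (53), and leads to $m_{LL} \approx \mathrm{diag}(m_1, m_2, m_3)$. We also assume the new conditions:

\[ |a|, |b|, |c| \gg |d|, |e|, |f| \gg |p|, |q|, |r|, \tag{98} \]

\[ |d|, |e| \ll |f|. \tag{99} \]

The charged lepton masses are then given by

\[ m_\tau \approx \left( |a|^2 + |b|^2 + |c|^2 \right)^{1/2}, \tag{100a} \]

\[ m_\mu \approx \left( |d|^2 + |e|^2 + |f|^2 - \frac{|d^* a + e^* b + f^* c|^2}{m_\tau^2} \right)^{1/2}, \tag{100b} \]

\[ m_e \approx O\left( |p|, |q|, |r| \right). \tag{100c} \]

In leading order in $|d|/|f|$ and $|e|/|f|$, the mixing angles are given by [15]:

\[ \tan(\theta_{12}) \approx \frac{|a|}{|b|}, \tag{101a} \]

\[ \tan(\theta_{23}) \approx \frac{s_{12}|a| + c_{12}|b|}{|c|}, \tag{101b} \]

\[ \tan(\theta_{13}) \approx O\left( \frac{|e|, |d|}{|f|} \right). \tag{101c} \]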

According to the ISS approach, we should begin by calculating the R-matrix in the basis defined in Eq. (96), in order to determine the invariant class C(R) to which this model belongs. For this purpose we shall use the results in Section 2.4, and in particular Eq. (16) which is valid for a general charged lepton basis, but a diagonal right-handed neutrino mass basis. Here $V_L$, being the matrix that diagonalizes $m_{LL}$ in this basis, is actually equal to the unit matrix, since by construction there is no mixing coming from the neutrino sector. Thus the R-matrix is determined from Eq. (16) as:

\[ R^T \approx \mathrm{diag}(m_1, m_2, m_3)^{-1/2}\; v_u Y^\nu_{LR}\; \mathrm{diag}(M_C, M_B, M_A)^{-1/2} \tag{102} \]

where $Y^\nu_{LR}$ is as in Eq. (96). By explicit multiplication, using the conditions for a neutrino mass hierarchy in Eq. (97), it is easy to see that R is approximately equal to the unit matrix. It is also easy to see that if $Y^\nu_{LR}$ were taken to be diagonal, then R would be exactly equal to the unit matrix. We already saw in Section 3.4 that a unit R-matrix defines a particular invariant class of models to which SD belongs, where the dominant (subdominant) right-handed neutrino is the heaviest (intermediate) one. Therefore we conclude that the charged lepton mixing model here is in the same invariant class as SD.

We can check this result explicitly by rotating the above models to the usual SD models by a change of charged lepton basis, using the symmetry $U_L(3) \times U_{E_R}(3)$. This results in a change of mass matrices from the ones in Eq. (96) to the ones in the see-saw flavour basis in which the charged lepton mass matrix is diagonal, $m'^E_{LR} = \mathrm{diag}(m_e, m_\mu, m_\tau)$, given by:

\[ m'^E_{LR} = V_{E_L}\, m^E_{LR}\, V^\dagger_{E_R}. \tag{103} \]

In the unprimed basis $m_{LL} \approx \mathrm{diag}(m_1, m_2, m_3)$, and by comparing Eq. (8) to Eq. (103) we identify:

\[ V_{E_L} \approx V^{MNS}. \]

(104)

(105)

Using Eq. (105) with Eq. (96), and the MNS matrix in Eq. (68), immediately leads to the SD

form in Eq. (50) of the neutrino mass matrix, satisfying the usual conditions in Eqs. (53), (55),

with the right-handed neutrino mass ordering of the form in Eq. (77). By a reordering of the

right-handed neutrino masses in Eq. (96) we could similarly arrive at any of the types of SD in

Eqs. (72)(77) in the primed basis in which the charged lepton mass matrix is diagonal as in

Eq. (12).

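Alternatively, we could start from one of the sequential right-handed neutrino dominance types, in the primed basis, then rotate to the unprimed basis in which the mixing is coming from the charged lepton sector. Starting from the primed basis, in which the charged lepton mass matrix is diagonal, rotating to the unprimed basis leads to $m_{LL} \approx \mathrm{diag}(m_1, m_2, m_3)$, and a charged lepton mass matrix given by:

\[ m^E_{LR} \approx V^{MNS\dagger}\, \mathrm{diag}(m_e, m_\mu, m_\tau). \tag{106} \]

For example,

\[ m^E_{LR} \approx \begin{pmatrix} \sqrt{\frac{2}{3}}\, m_e & -\frac{1}{\sqrt{6}}\, m_\mu & \frac{1}{\sqrt{6}}\, m_\tau \\ \frac{1}{\sqrt{3}}\, m_e & \frac{1}{\sqrt{3}}\, m_\mu & -\frac{1}{\sqrt{3}}\, m_\tau \\ 0 & \frac{1}{\sqrt{2}}\, m_\mu & \frac{1}{\sqrt{2}}\, m_\tau \end{pmatrix} \tag{107} \]

which is of the form in Eq. (96), for the case of tri-bimaximal lepton mixing.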
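As a consistency check of the leading-order angle formulas in Eqs. (101a), (101b) against the tri-bimaximal charged lepton matrix, the sketch below reads the third column of $m^E_{LR}$ as $(a, b, c) = (1/\sqrt{6}, -1/\sqrt{3}, 1/\sqrt{2})\, m_\tau$. These entries and their signs are an assumption about the reconstructed matrix, and only moduli enter the formulas; the function name is hypothetical.

```python
import math

def mixing_from_third_column(a, b, c):
    """Leading-order theta12, theta23 from tan(theta12) = |a|/|b| and
    tan(theta23) = (s12 |a| + c12 |b|) / |c|."""
    theta12 = math.atan2(abs(a), abs(b))
    s12, c12 = math.sin(theta12), math.cos(theta12)
    theta23 = math.atan2(s12 * abs(a) + c12 * abs(b), abs(c))
    return theta12, theta23

# Assumed third column of the tri-bimaximal charged lepton matrix, in units of m_tau.
a, b, c = 1 / math.sqrt(6), -1 / math.sqrt(3), 1 / math.sqrt(2)
theta12, theta23 = mixing_from_third_column(a, b, c)
print("sin^2(theta12) =", math.sin(theta12) ** 2)
print("sin^2(theta23) =", math.sin(theta23) ** 2)
print("|a|^2 + |b|^2 + |c|^2 =", a * a + b * b + c * c)
```

The output reproduces the tri-bimaximal values $\sin^2\theta_{12} = 1/3$ and $\sin^2\theta_{23} = 1/2$, and the column is normalised to $m_\tau$ as required by Eq. (100a).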

We conclude that the class of models proposed in [15], where all the mixing arises from the charged lepton sector, are in the same invariant class as SD, where all mixing arises from the neutrino sector. The two types of model are in the same invariant class since they correspond to the same approximately unit R-matrix. In the basis in which there is no mixing coming from the neutrino sector, $V_L$ is equal to the unit matrix, while in the basis in which all the mixing is coming from the neutrino sector $V_L = V_{MNS}$, with R being the same in both bases.

4.1.3. Non-diagonal right-handed neutrino models

We now consider an example of a see-saw model in which some of the mixing arises from the right-handed neutrino sector. Specifically we consider the flavour model of tri-bimaximal neutrino mixing based on SU(3) or its discrete subgroup $\Delta(27)$ [12]. We shall show that this model is in the same invariant class as CSD models, i.e. is CSD in disguise. This will also provide an example of how the S-matrix may be used as a short-cut to finding the R-matrix, and also the neutrino mass matrix itself.

In the model under consideration the neutrino mass matrices are of the leading order form:

\[ v_u Y^\nu_{LR} = \begin{pmatrix} 0 & B & C_1 \\ A & B + A & C_2 \\ -A & B - A & C_3 \end{pmatrix}, \qquad M_{RR} = \begin{pmatrix} M_A & M_A & 0 \\ M_A & M_A + M_B & 0 \\ 0 & 0 & M_C \end{pmatrix}, \tag{108} \]

where $M_A < M_B < M_C$ and the couplings $A$, $B$, $C_i$ satisfy the conditions in Eq. (53). However it is not at all clear that the model corresponds to SD since the right-handed neutrino mass matrix is not diagonal. Moreover it is not clear that tri-bimaximal neutrino mixing results from Eq. (108) since it does not satisfy the CSD conditions in Eqs. (88)-(91).

However, using the S-matrix transformations in Eq. (18), with

\[ S^{-1} = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \tag{109} \]

results in:

\[ M_{RR} \to \begin{pmatrix} M_A & 0 & 0 \\ 0 & M_B & 0 \\ 0 & 0 & M_C \end{pmatrix}, \qquad v_u Y^\nu_{LR} \to \begin{pmatrix} 0 & B & C_1 \\ A & B & C_2 \\ -A & B & C_3 \end{pmatrix}, \tag{110} \]

where the transformed mass matrices satisfy the CSD conditions in Eqs. (88)-(91). The transformed theory (not strictly a basis transformation since S is not unitary) has the same R-matrix as the original theory, according to Eq. (20), even though the right-handed neutrino masses are different (note that in Eq. (110) $M_{A,B,C}$ are not the eigenvalues).
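The effect of the S-matrix transformation can be verified with exact arithmetic. The sketch below assumes that the transformation acts as $Y \to Y S^{-1}$ and $M_{RR} \to (S^{-1})^T M_{RR} S^{-1}$ with $S^{-1} = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$, and uses hypothetical integer values for the couplings and masses. It checks that $M_{RR}$ becomes diagonal, that the Yukawa columns take the CSD form, and that the see-saw mass matrix $Y M_{RR}^{-1} Y^T$ is unchanged.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inverse(X):
    """Gauss-Jordan inverse using exact Fraction arithmetic."""
    n = len(X)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(X)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def seesaw(Y, M):
    """m_LL = Y M^-1 Y^T."""
    return matmul(matmul(Y, inverse(M)), transpose(Y))

# Hypothetical integer couplings and masses, for illustration only.
A, B, C1, C2, C3 = 3, 2, 1, 1, 1
MA, MB, MC = 2, 30, 500
Y = [[0, B, C1], [A, B + A, C2], [-A, B - A, C3]]
M = [[MA, MA, 0], [MA, MA + MB, 0], [0, 0, MC]]
S_inv = [[1, -1, 0], [0, 1, 0], [0, 0, 1]]

Y_new = matmul(Y, S_inv)                              # transformed Yukawa matrix
M_new = matmul(matmul(transpose(S_inv), M), S_inv)    # transformed M_RR
print("Y' =", Y_new)
print("M' =", M_new)
print("m_LL invariant:", seesaw(Y, M) == seesaw(Y_new, M_new))
```

Because S is not unitary, the diagonal entries of the transformed $M_{RR}$ are not the eigenvalues of the original matrix, yet the light neutrino mass matrix is exactly the same in both descriptions.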

Having made this S-matrix transformation, we can calculate the neutrino mass matrix and the R-matrix in the transformed basis, since both quantities are invariant under S as shown in Section 2.4. In fact it is manifestly clear from Eq. (110) that the transformed theory satisfies the CSD conditions and leads to tri-bimaximal neutrino mixing. The R-matrix may be obtained from Eq. (20),

\[ R^T = \mathrm{diag}(m_1, m_2, m_3)^{-1/2}\; V_L^\dagger\; v_u Y^\nu_{LR}\; S^{-1}\; \mathrm{diag}(M_A, M_B, M_C)^{-1/2} \tag{111} \]

where in this case $V_L = V_{MNS}$ (ignoring small charged lepton corrections). In this case Eq. (111), with the tri-bimaximal MNS matrix, leads to an R-matrix of the form in Eq. (72). We thus see that the original theory is in the same invariant class as CSD since it corresponds to the same R-matrix, in this case that given in Eq. (72).

4.2. SD phenomenology and the R-matrix

In this paper we have formulated SD in terms of the R-matrix in order to show its basis-independence, using the fact that the R-matrix labels distinct equivalence classes of see-saw

models, and each choice of R-matrix generates a continuously infinite class of models related to

each other by basis transformations. However this identification has additional practical benefits

since the R-matrix has been extensively used in phenomenological analyses, so it is useful to be

able to identify sequential dominance with particular points in R-matrix parameter space. In this

subsection we discuss some recent examples of this.

4.2.1. Lepton flavour violation

A recent phenomenological analysis of lepton flavour violation identified a particularly interesting region of parameter space in which the R-matrix is equal to or close to the unit matrix [19].

From our results here we see that the case of R being exactly equal to the unit matrix corresponds to CSD and tri-bimaximal mixing, of the kind where the heaviest right-handed neutrino

is the dominant one, and the second heaviest is the leading sub-dominant one.

4.2.2. Two right-handed neutrino model

Another example of phenomenological analyses which have relied heavily on the R-matrix are

the recent analyses of the two right-handed neutrino model [22]. We have already shown how this

can emerge from the three right-handed neutrino model by decoupling the right-handed neutrino

labelled by C. Although in general the remaining two right-handed neutrinos in the analysis

in [22] do not satisfy the SD condition (or strictly the single right-handed neutrino dominance

condition, since such models automatically satisfy at least the SD condition that one of the right-handed neutrinos is decoupled) it is in fact satisfied in much of the parameter space considered,

namely where the R angle is close to zero or $\pi/2$, how close being a matter discussed

earlier in this paper. Moreover having a particular texture zero, as is assumed over some regions

of the analysis in [22], automatically implies SD, as we also saw earlier in Eq. (84).

4.2.3. Flavour-dependent leptogenesis

One of the main phenomenological applications of the R-matrix is to leptogenesis. It is

particularly convenient here since, for example, when it is used in the calculation of the flavour-independent asymmetry parameter $\epsilon_1$ it clearly shows that the MNS parameters cancel out.

However recently there has been some activity related to the flavour-dependence of leptogenesis [23], and here the MNS parameters do not cancel out of the expressions for the separate

flavour-dependent asymmetries $\epsilon_\alpha$, where $\alpha = e, \mu, \tau$. Nevertheless, the R-matrix continues

to be of interest in recent phenomenological analyses of flavour-dependent leptogenesis [24], in which

$$\epsilon_\alpha = -\frac{3 M_1}{16\pi v_u^2}\, \frac{\mathrm{Im}\left(\sum_{\beta,\rho} m_\beta^{1/2} m_\rho^{3/2}\, U^*_{\alpha\beta} U_{\alpha\rho}\, R_{1\beta} R_{1\rho}\right)}{\sum_\beta m_\beta |R_{1\beta}|^2}, \tag{112}$$

where we have written $U = V_{MNS}$. Since Eq. (112) only involves basis invariant quantities, it is

manifest that the asymmetry parameter will take a unique value for all see-saw models which

belong to a particular invariant class C(R), i.e. $\epsilon_\alpha$ is basis invariant, as it should be. The use of a

real R-matrix to permit a link between leptogenesis and the MNS phases has been explored [24],

since it then follows from Eq. (112) that

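$$\epsilon_\alpha \propto \sum_{\beta} \sum_{\rho > \beta} \sqrt{m_\beta m_\rho}\,(m_\rho - m_\beta)\, R_{1\beta} R_{1\rho}\, \mathrm{Im}\left(U^*_{\alpha\beta} U_{\alpha\rho}\right) \tag{113}$$

which clearly shows that $\epsilon_\alpha$ only depends on the MNS phases for the case of R real. A rather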

nice application of our results is that approximately real R is an automatic consequence of SD,

as we now discuss.
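The reduction of Eq. (112) to Eq. (113) for real R can be checked numerically. The following is a minimal sketch, not the author's calculation: it assumes the form $\epsilon_\alpha = -\frac{3M_1}{16\pi v_u^2}\,\mathrm{Im}(\sum_{\beta\rho} m_\beta^{1/2} m_\rho^{3/2} U^*_{\alpha\beta}U_{\alpha\rho}R_{1\beta}R_{1\rho})/\sum_\beta m_\beta|R_{1\beta}|^2$ for Eq. (112), and the masses, phases, units and function names are all hypothetical illustration values.

```python
import cmath
import math

# Tri-bimaximal matrix (rows e, mu, tau); sign conventions differ between papers.
TB = [
    [2.0 / math.sqrt(6.0), 1.0 / math.sqrt(3.0), 0.0],
    [-1.0 / math.sqrt(6.0), 1.0 / math.sqrt(3.0), 1.0 / math.sqrt(2.0)],
    [1.0 / math.sqrt(6.0), -1.0 / math.sqrt(3.0), 1.0 / math.sqrt(2.0)],
]
PHASES = [0.0, 0.7, 1.9]          # hypothetical phases attached to the columns
U = [[TB[a][b] * cmath.exp(1j * PHASES[b]) for b in range(3)] for a in range(3)]
M_LIGHT = [0.001, 0.009, 0.05]    # hypothetical light masses (arbitrary units)
M1, VU = 1.0, 1.0                 # arbitrary units; only ratios matter here


def eps_full(alpha, r1, m=M_LIGHT, u=U):
    """Assumed form of Eq. (112), valid for complex R; r1 is the first row of R."""
    num = 0j
    for b in range(3):
        for r in range(3):
            num += (math.sqrt(m[b]) * m[r] ** 1.5
                    * u[alpha][b].conjugate() * u[alpha][r] * r1[b] * r1[r])
    den = sum(m[b] * abs(r1[b]) ** 2 for b in range(3))
    return -3.0 * M1 / (16.0 * math.pi * VU ** 2) * num.imag / den


def eps_real_r(alpha, r1, m=M_LIGHT, u=U):
    """Reduced sum of Eq. (113), valid only when every entry of r1 is real."""
    total = 0.0
    for b in range(3):
        for r in range(b + 1, 3):
            total += (math.sqrt(m[b] * m[r]) * (m[r] - m[b]) * r1[b] * r1[r]
                      * (u[alpha][b].conjugate() * u[alpha][r]).imag)
    den = sum(m[b] * r1[b] ** 2 for b in range(3))
    return -3.0 * M1 / (16.0 * math.pi * VU ** 2) * total / den


R1_REAL = [0.8, 0.6, 0.0]   # first row of a real orthogonal R
R1_UNIT = [1.0, 0.0, 0.0]   # first row of the unit matrix
```

Under these assumptions `eps_full(a, R1_REAL)` and `eps_real_r(a, R1_REAL)` agree for each flavour index `a`, while `eps_full(a, R1_UNIT)` vanishes, which is the statement that R equal to the unit matrix gives zero asymmetry parameters.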

The case of R being real has been identified with reality of the right-handed rotations used

to diagonalize the Dirac matrix, and thus with the notion of there being no CP violation in the

right-handed neutrino sector. However, this looks like quite a strong requirement. For example

one way to achieve this would be to have an SO(3) family symmetry in the right-handed neutrino

sector, which is broken spontaneously by real flavon vacuum expectation values, which is a rather

precise requirement and not at all generic. This leads to the question of whether there is any more

natural and better motivated way to guarantee a real R-matrix. Our formulation of

SD in terms of the R matrix shows that the SD cases actually correspond to the R-matrix being

approximately real. As discussed previously, this follows by considering the modular surfaces of

$\sin\theta_i$ and $\cos\theta_i$, where we saw that SD corresponds to the R angles $\theta_i$ being approximately real

and taking values close to either zero or $\pi/2$. Thus SD is a very nice way of motivating a real

R-matrix, where R takes values approximately as given in Eqs. (72)–(77).

The case of CSD and exact tri-bimaximal mixing, corresponding to the R-matrix taking

quite precisely (rather than just approximately) one of the forms in Eqs. (72)–(77) leads to

zero leptogenesis asymmetry parameters. For example when R is precisely equal to the unit

matrix Eq. (113) shows that the asymmetry parameters are all equal to zero [24]. The same holds

for the other exact forms in Eqs. (72)–(77). Interestingly this result also applies to the case of

tri-bimaximal neutrino mixing, with charged lepton corrections to tri-bimaximal mixing as discussed in [10] giving significant corrections to the total lepton mixing, resulting in deviations

from tri-bimaximal lepton mixing. This might seem paradoxical since physically if there is no

exact tri-bimaximal lepton mixing, then one might also expect that the asymmetry parameters

are also not exactly zero. However the point is that, as already mentioned, if in some basis the

R-matrix is equal to the unit matrix, then this will be true in all bases, as we showed in Section 2.4. In the presence of charged lepton corrections the MNS matrix will deviate from the

tri-bimaximal form, but the R-matrix will remain equal to the unit matrix, and leptogenesis will

remain zero.

Does this mean that the asymmetry parameters of leptogenesis are always equal to zero for

tri-bimaximal neutrino mixing arising from CSD? In practice, tri-bimaximal neutrino mixing

in realistic models is achieved by using vacuum alignment for the dominant and leading sub-dominant right-handed neutrinos, such that the conditions in Eqs. (88)–(91) are satisfied. As

already discussed, there are expected to be small deviations from these precise forms parameterized by small angles which represent corrections of order $m_1/m_3$ and $m_2/m_3$. In particular

there are no conditions imposed on the couplings $C_i$ since the associated right-handed neutrino

is assumed to play a negligible role in the see-saw mechanism and gives corrections of order

m1 /m3 .

If the almost decoupled right-handed neutrino labelled by C is the heaviest, or the intermediate

mass right-handed neutrino, then it will also play no important role in leptogenesis, since the

asymmetry parameters are determined up to corrections of order m1 /m3 by the dominant and

sub-dominant couplings $A_i$, $B_i$ [25]. However if the almost decoupled right-handed neutrino

is the lightest, $M_1 = M_C$, then it is unavoidable that the couplings $C_i$ must be involved in the

calculation of the asymmetry parameters, since the asymmetry parameters are given in this case

by [25]:

$$\epsilon_\alpha = -\frac{3 M_1}{16\pi} \left[ \frac{\mathrm{Im}\left[C^*_\alpha A_\alpha\, (C^\dagger A)\right]}{M_A\, (C^\dagger C)} + \frac{\mathrm{Im}\left[C^*_\alpha B_\alpha\, (C^\dagger B)\right]}{M_B\, (C^\dagger C)} \right]. \tag{114}$$

In this case, there are no constraints on the couplings $C_i$ from CSD and in particular $C^\dagger A$ and

$C^\dagger B$ are both non-zero, in contrast to the other cases which would involve $A^\dagger B = 0$ due to the

CSD relation in Eq. (91). In the case that the R-matrix is precisely equal to the unit matrix, or one

of the other related forms, then the column vectors A, B, C are each associated with a column of

the MNS matrix, and so we would have $C^\dagger A = C^\dagger B = 0$ by unitarity, giving zero values of the

asymmetry parameters in this case, in agreement with the general argument previously for the

case of R being equal to the unit matrix. However, since the couplings Ci are unconstrained, this

implies that the R-matrix is not precisely equal to the unit matrix, but has important corrections

parameterized by non-zero values of the R angles $\theta_i$ as discussed in Eq. (95). In [25] a particular

example of this type was studied in detail.
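The unitarity argument above can be illustrated numerically. This is a minimal sketch under stated assumptions: it takes Eq. (114) in the form $\epsilon_\alpha = -\frac{3M_1}{16\pi}\left[\mathrm{Im}[C^*_\alpha A_\alpha (C^\dagger A)]/(M_A C^\dagger C) + \mathrm{Im}[C^*_\alpha B_\alpha (C^\dagger B)]/(M_B C^\dagger C)\right]$, uses tri-bimaximal columns for the exactly aligned case, and all couplings, masses and names are hypothetical.

```python
import cmath
import math


def hdot(x, y):
    """Hermitian inner product x^dagger y of two 3-component vectors."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))


def eps_light_c(alpha, a_vec, b_vec, c_vec, m_a, m_b, m_1):
    """Assumed form of Eq. (114) for the case M_1 = M_C (C lightest)."""
    cc = hdot(c_vec, c_vec).real
    term_a = (c_vec[alpha].conjugate() * a_vec[alpha] * hdot(c_vec, a_vec)).imag / (m_a * cc)
    term_b = (c_vec[alpha].conjugate() * b_vec[alpha] * hdot(c_vec, b_vec)).imag / (m_b * cc)
    return -3.0 * m_1 / (16.0 * math.pi) * (term_a + term_b)


def scaled(col, coupling):
    """Column vector multiplied by an overall (possibly complex) coupling."""
    return [coupling * x for x in col]


# Columns of a tri-bimaximal matrix (one common sign convention).
COL1 = [2.0 / math.sqrt(6.0), -1.0 / math.sqrt(6.0), 1.0 / math.sqrt(6.0)]
COL2 = [1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0), -1.0 / math.sqrt(3.0)]
COL3 = [0.0, 1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)]

A = scaled(COL3, 0.3 * cmath.exp(0.4j))   # dominant, hypothetical coupling
B = scaled(COL2, 0.2 + 0j)                # sub-dominant, hypothetical coupling
C_ALIGNED = scaled(COL1, 0.1 + 0j)        # exactly aligned with a mixing-matrix column
# Unconstrained C: the aligned vector plus a hypothetical complex perturbation.
C_GENERIC = [C_ALIGNED[0], C_ALIGNED[1] + 0.05 * cmath.exp(1.0j), C_ALIGNED[2]]
M_A, M_B, M_1 = 100.0, 10.0, 1.0          # hypothetical masses, arbitrary units
```

With `C_ALIGNED` the products $C^\dagger A$ and $C^\dagger B$ vanish and every `eps_light_c(alpha, A, B, C_ALIGNED, M_A, M_B, M_1)` is zero to rounding; with `C_GENERIC` they do not vanish and a non-zero asymmetry appears.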

5. Conclusion

We have proposed an ISS approach to model building, based on the observation that see-saw

models of neutrino mass and mixing fall into basis invariant classes labelled by the Casas-Ibarra

R-matrix. We have proved that the R-matrix is invariant not just under basis transformations but

also non-unitary right-handed neutrino transformations S. According to the ISS approach, given

any see-saw model in some particular basis one may determine the invariant R matrix and hence

the invariant class to which that model belongs. The formulation of see-saw models in terms of

invariant classes puts them on a firmer theoretical footing, and allows different see-saw models

in the same class to be related more easily, while their relation to the R-matrix makes them more

easily identifiable in phenomenological studies.

We have systematically studied SD as a prime example of the ISS approach. We considered a

simple two family example, before proceeding to the three family case. A very convenient vector

notation was introduced [16] in which the invariant combination $v_u Y_{LR} M_{RR}^{-1/2}$ on the left-hand

side of Eq. (15) was expressed in terms of three Yukawa vectors consisting of the columns of

the Yukawa matrix normalized by the inverse square roots of right-handed neutrino masses as in

Eq. (51). These three Yukawa vectors are then related to the MNS vectors, introduced in this

paper for the first time, consisting of columns of the MNS matrix normalized by square roots of

neutrino masses, as in Eq. (66). This gives a very nice physical interpretation of the R-matrix,

as that matrix which controls the misalignment of the Yukawa vectors and the MNS vectors.

SD corresponds to the Yukawa vectors and MNS vectors being approximately aligned, up to

permutations. CSD and tri-bimaximal mixing correspond to the Yukawa vectors and MNS

vectors being more accurately aligned, up to permutations. This interpretation can be extended

to any right-handed neutrino or charged lepton basis provided one uses Eq. (17), since the left-hand side is invariant under right-handed neutrino transformations, and on the right-hand side

MNS mixing is replaced by neutrino mixing.

We have given a precise relation between the angle between the Yukawa vectors and the

R-matrix angle in two right-handed neutrino limiting cases. These limiting cases are discussed

in detail, for all the different mass orderings of right-handed neutrinos. For such limiting cases,

SD is shown to be an automatic consequence of a particular texture zero and a small $\theta_{13}$. We

have also discussed tri-bimaximal mixing and CSD as corresponding to the R-matrix taking a very

precise (permutation of the) unit matrix form (rather than an approximate such form), and have shown that charged lepton corrections give corrections

to the PMNS angles, but not to the R-matrix, which retains its precise form. The fact that SD

corresponds to approximately real R-matrix angles, as can be seen by considering the modular

surfaces of the R-matrix angles close to zero or $\pi/2$, is also discussed, with application to recent

flavour-dependent leptogenesis analyses.

The R-matrix provides a beautiful basis invariant formulation of SD and CSD. This means

that SD is physically meaningful, e.g. not all classes of see-saw models correspond to SD, and

also SD cannot be transformed away by a change of basis, since the R-matrix is invariant under

a basis change. The basis independence of SD also makes it more widely applicable to a larger

range of models than is usually considered in the literature. We considered particular models

in which the mixing naturally arises (at least in part) from the charged lepton or right-handed

neutrino sectors, and showed that these models share the same R-matrix as SD, and are hence

in the same invariant class, i.e. they are just SD in disguise. Also the connection of SD to the

R-matrix makes it easier to identify in phenomenological studies.

In summary, the ISS approach amounts to the following procedure. Starting from a particular

see-saw model in a particular basis, one should determine the associated R-matrix, using either

the standard approach involving the right-handed neutrino mass eigenvalues as in Eq. (17), or

using the S matrix short-cut in Eq. (20), useful when right-handed neutrino mass eigenvalues

are not required. Having determined the invariant class C(R) to which it belongs, the particular

model should properly be regarded as one member of an infinite number of other models related

by basis transformations, and it can then easily be seen if any particular model is already present

in the literature in a different guise. This also allows any given model to make contact with

general phenomenological analyses based on the R-matrix. Although the ISS approach has been

applied here to SD models, more generally it should prove to be a valuable model building tool

in classifying and studying the myriad see-saw models that have been proposed in the literature.

Acknowledgements

I would like to thank Stefan Antusch, Michal Malinsky, Graham Ross and Ivo Varzielas for

helpful discussions, and the CERN Theory Division for its hospitality and a Scientific Associateship. The author acknowledges support from the EU network MRTN 2004-503369.

Appendix A. Two right-handed neutrino limit of sequential dominance

In this appendix we discuss the two right-handed neutrino limit of sequential dominance for

the other cases not included in Section 3.5.

The type of dominance in Eq. (72) in the two right-handed neutrino limit corresponds to:

$$(v_A \;\; v_B \;\; 0) = (0 \;\; m_2^{1/2} \;\; m_3^{1/2})\, R^T_{AB0} \tag{A.1}$$

where

$$R^T_{AB0} = R_3^T\big|_{s_3=1}\, R_2^T\, R_1^T\big|_{s_1=1} \tag{A.2}$$

$$v_A = c_2\, m_2^{1/2} + s_2\, m_3^{1/2}, \qquad v_B = s_2\, m_2^{1/2} - c_2\, m_3^{1/2} \tag{A.3}$$

similar to Eq. (43) in the two family model, except that here the vectors have three components.

SD here corresponds to $s_2 \approx 1$, $c_2 \approx 0$.

The type of dominance in Eq. (73) in the two right-handed neutrino limit corresponds to:

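$$(v_A \;\; 0 \;\; v_B) = (0 \;\; m_2^{1/2} \;\; m_3^{1/2})\, R^T_{A0B} \tag{A.4}$$

where

$$R^T_{A0B} = R_3^T\big|_{s_3=1}\, R_2^T\, R_1^T\big|_{s_1=0} = \begin{pmatrix} 0 & -1 & 0 \\ c_2 & 0 & -s_2 \\ s_2 & 0 & c_2 \end{pmatrix} \tag{A.5}$$

$$v_A = c_2\, m_2^{1/2} + s_2\, m_3^{1/2}, \qquad v_B = -s_2\, m_2^{1/2} + c_2\, m_3^{1/2} \tag{A.6}$$

similar to Eq. (43) in the two family model, except that here the vectors have three components.

SD here corresponds to $s_2 \approx 1$, $c_2 \approx 0$.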

The type of dominance in Eq. (74) in the two right-handed neutrino limit corresponds to:

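$$(v_B \;\; v_A \;\; 0) = (0 \;\; m_2^{1/2} \;\; m_3^{1/2})\, R^T_{BA0} \tag{A.7}$$

where

$$R^T_{BA0} = R_3^T\big|_{s_3=1}\, R_2^T\, R_1^T\big|_{s_1=1} = \begin{pmatrix} 0 & 0 & 1 \\ c_2 & -s_2 & 0 \\ s_2 & c_2 & 0 \end{pmatrix} \tag{A.8}$$

$$v_B = c_2\, m_2^{1/2} + s_2\, m_3^{1/2}, \qquad v_A = -s_2\, m_2^{1/2} + c_2\, m_3^{1/2} \tag{A.9}$$

similar to Eq. (43) in the two family model, except that here the vectors have three components.

SD here corresponds to $s_2 \approx 0$, $c_2 \approx 1$.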

The type of dominance in Eq. (75) in the two right-handed neutrino limit corresponds to:

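$$(v_B \;\; 0 \;\; v_A) = (0 \;\; m_2^{1/2} \;\; m_3^{1/2})\, R^T_{B0A} \tag{A.10}$$

where

$$R^T_{B0A} = R_3^T\big|_{s_3=1}\, R_2^T\, R_1^T\big|_{s_1=0} = \begin{pmatrix} 0 & -1 & 0 \\ c_2 & 0 & -s_2 \\ s_2 & 0 & c_2 \end{pmatrix} \tag{A.11}$$

$$v_B = c_2\, m_2^{1/2} + s_2\, m_3^{1/2}, \qquad v_A = -s_2\, m_2^{1/2} + c_2\, m_3^{1/2} \tag{A.12}$$

similar to Eq. (43) in the two family model, except that here the vectors have three components.

SD here corresponds to $s_2 \approx 0$, $c_2 \approx 1$.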

The type of dominance in Eq. (76) in the two right-handed neutrino limit corresponds to:

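$$(0 \;\; v_A \;\; v_B) = (0 \;\; m_2^{1/2} \;\; m_3^{1/2})\, R^T_{0AB} \tag{A.13}$$

where

$$R^T_{0AB} = R_3^T\big|_{s_3=0}\, R_2^T\big|_{s_2=0}\, R_1^T = \begin{pmatrix} 1 & 0 & 0 \\ 0 & c_1 & -s_1 \\ 0 & s_1 & c_1 \end{pmatrix} \tag{A.14}$$

where now $s_{2,3} = 0$, $c_{2,3} = 1$ with $s_1, c_1 \neq 0$. This results in:

$$v_A = c_1\, m_2^{1/2} + s_1\, m_3^{1/2}, \qquad v_B = -s_1\, m_2^{1/2} + c_1\, m_3^{1/2} \tag{A.15}$$

similar to Eq. (43) in the two family model, except that here the vectors have three components.

SD here corresponds to $s_1 \approx 1$, $c_1 \approx 0$.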
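As an illustration of the last case, the relation $(0\;\; v_A\;\; v_B) = (0\;\; m_2^{1/2}\;\; m_3^{1/2})\,R^T$ can be evaluated directly. The sketch below assumes $R^T$ is the rotation with rows $(1,0,0)$, $(0,c_1,-s_1)$, $(0,s_1,c_1)$, restricts the angle to real values for simplicity, and uses hypothetical three-component MNS vectors; none of the names or numbers come from the paper.

```python
import math


def rt_0ab(theta1):
    """Assumed R^T for the 0AB case with s_{2,3} = 0 and a real angle theta1."""
    c1, s1 = math.cos(theta1), math.sin(theta1)
    return [[1.0, 0.0, 0.0],
            [0.0, c1, -s1],
            [0.0, s1, c1]]


def yukawa_vectors(mns_vectors, rt):
    """Return the three vectors (mu_1 mu_2 mu_3) R^T; mns_vectors[i] is the vector mu_i."""
    return [[sum(mns_vectors[i][k] * rt[i][j] for i in range(3)) for k in range(3)]
            for j in range(3)]


# Hypothetical MNS vectors: the first is set to zero in the two right-handed
# neutrino limit; the other two are mixing-matrix columns times sqrt(mass).
MU = [
    [0.0, 0.0, 0.0],
    [math.sqrt(0.009) * x for x in (1 / math.sqrt(3), 1 / math.sqrt(3), -1 / math.sqrt(3))],
    [math.sqrt(0.05) * x for x in (0.0, 1 / math.sqrt(2), 1 / math.sqrt(2))],
]

zero_vec, v_a, v_b = yukawa_vectors(MU, rt_0ab(math.pi / 2))
```

For $\theta_1 = \pi/2$ (so $s_1 = 1$, $c_1 = 0$) the vector `v_a` coincides with the third MNS vector and `v_b` with minus the second, which is the aligned situation that SD approximates.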

References

[1] For a review see e.g. A. Strumia, F. Vissani, hep-ph/0606054;

R.N. Mohapatra, et al., hep-ph/0510213.

[2] For a recent review see e.g. J.W.F. Valle, hep-ph/0608101.

[3] P.F. Harrison, D.H. Perkins, W.G. Scott, Phys. Lett. B 530 (2002) 167, hep-ph/0202074;

P.F. Harrison, W.G. Scott, Phys. Lett. B 535 (2002) 163, hep-ph/0203209;

P.F. Harrison, W.G. Scott, Phys. Lett. B 557 (2003) 76, hep-ph/0302025;

An earlier related ansatz was proposed by: L. Wolfenstein, Phys. Rev. D 18 (1978) 958.

[4] For a review see e.g. S.F. King, Rep. Prog. Phys. 67 (2004) 107, hep-ph/0310204;

G. Altarelli, F. Feruglio, Springer Tracts Mod. Phys. 190 (2003) 169, hep-ph/0206077;

G. Altarelli, hep-ph/0610164;

R.N. Mohapatra, A.Y. Smirnov, hep-ph/0603118.

[5] P. Minkowski, Phys. Lett. B 67 (1977) 421;

M. Gell-Mann, P. Ramond, R. Slansky, in: Sanibel Talk, CALT-68-709, February 1979, and in: Supergravity, North-Holland, Amsterdam, 1979;

T. Yanagida, in: Proceedings of the Workshop on Unified Theory and Baryon Number of the Universe, KEK, Japan,

1979;

S.L. Glashow, Cargese Lectures (1979);

R.N. Mohapatra, G. Senjanovic, Phys. Rev. Lett. 44 (1980) 912;

J. Schechter, J.W. Valle, Phys. Rev. D 25 (1982) 774.

[6] C. Jarlskog, Phys. Rev. Lett. 55 (1985) 1039;

C. Jarlskog, Z. Phys. C 29 (1985) 491;

For a recent review see: C. Jarlskog, Phys. Scr. T 127 (2006) 64, hep-ph/0606050.

[7] J.A. Casas, A. Ibarra, Nucl. Phys. B 618 (2001) 171, hep-ph/0103065.

[8] S.F. King, Phys. Lett. B 439 (1998) 350, hep-ph/9806440;

S.F. King, Nucl. Phys. B 562 (1999) 57, hep-ph/9904210;

S.F. King, Nucl. Phys. B 576 (2000) 85, hep-ph/9912492;

S.F. King, JHEP 0209 (2002) 011, hep-ph/0204360;

S.F. King, Phys. Rev. D 67 (2003) 113010, hep-ph/0211228.

[9] K.S. Babu, S.M. Barr, Phys. Lett. B 381 (1996) 202, hep-ph/9511446;

G. Altarelli, F. Feruglio, JHEP 9811 (1998) 021, hep-ph/9809596;

G. Altarelli, F. Feruglio, I. Masina, Phys. Lett. B 472 (2000) 382, hep-ph/9907532.

[10] S.F. King, JHEP 0508 (2005) 105, hep-ph/0506297.

[11] S.F. King, G.G. Ross, Phys. Lett. B 520 (2001) 243, hep-ph/0108112;

S.F. King, G.G. Ross, Phys. Lett. B 574 (2003) 239, hep-ph/0307190;

G. Altarelli, F. Feruglio, hep-ph/0504165;

I. de Medeiros Varzielas, S.F. King, G.G. Ross, hep-ph/0512313;

S.F. King, M. Malinsky, hep-ph/0608021;

G. Altarelli, F. Feruglio, Y. Lin, hep-ph/0610165.

[12] I. de Medeiros Varzielas, G.G. Ross, Nucl. Phys. B 733 (2006) 31, hep-ph/0507176;

I. de Medeiros Varzielas, S.F. King, G.G. Ross, hep-ph/0607045.

[13] S. Antusch, S.F. King, Phys. Lett. B 631 (2005) 42, hep-ph/0508044.

[14] C.H. Albright, S.M. Barr, Phys. Rev. D 58 (1998) 013002, hep-ph/9712488;

C.H. Albright, K.S. Babu, S.M. Barr, Phys. Rev. Lett. 81 (1998) 1167, hep-ph/9802314;

G. Altarelli, F. Feruglio, I. Masina, Nucl. Phys. B 689 (2004) 157, hep-ph/0402155.

[15] S. Antusch, S.F. King, Phys. Lett. B 591 (2004) 104, hep-ph/0403053;

S. Antusch, S.F. King, Nucl. Phys. B 705 (2005) 239, hep-ph/0402121.

[16] S. Lavignac, I. Masina, C.A. Savoy, Nucl. Phys. B 633 (2002) 139, hep-ph/0202086.

[17] I. Masina, hep-ph/0210125;

I. Masina, Phys. Lett. B 633 (2006) 134, hep-ph/0508031.

[18] G.C. Branco, R. Gonzalez Felipe, F.R. Joaquim, I. Masina, M.N. Rebelo, C.A. Savoy, Phys. Rev. D 67 (2003)

073025, hep-ph/0211001.

[19] S. Antusch, E. Arganda, M.J. Herrero, A.M. Teixeira, JHEP 0611 (2006) 090, hep-ph/0607263.

[20] T. Yanagida, Prog. Theor. Phys. 64 (1980) 1103.

[21] H.K. Dreiner, H. Murayama, M. Thormeier, Nucl. Phys. B 729 (2005) 278, hep-ph/0312012;

J.R. Espinosa, A. Ibarra, JHEP 0408 (2004) 010, hep-ph/0405095;

S.F. King, I.N.R. Peddie, G.G. Ross, L. Velasco-Sevilla, O. Vives, JHEP 0507 (2005) 049, hep-ph/0407012;

S.F. King, I.N.R. Peddie, Phys. Lett. B 586 (2004) 83, hep-ph/0312237.

[22] A. Ibarra, JHEP 0601 (2006) 064, hep-ph/0511136;

A. Ibarra, G.G. Ross, Phys. Lett. B 591 (2004) 285, hep-ph/0312138.

[23] R. Barbieri, P. Creminelli, A. Strumia, N. Tetradis, Nucl. Phys. B 575 (2000) 61, hep-ph/9911315;

A. Abada, S. Davidson, F.X. Josse-Michaux, M. Losada, A. Riotto, JCAP 0604 (2006) 004, hep-ph/0601083;

E. Nardi, Y. Nir, E. Roulet, J. Racker, JHEP 0601 (2006) 164, hep-ph/0601084.

[24] A. Abada, S. Davidson, A. Ibarra, F.X. Josse-Michaux, M. Losada, A. Riotto, hep-ph/0605281;

S. Pascoli, S.T. Petcov, A. Riotto, hep-ph/0609125;

G.C. Branco, R.G. Felipe, F.R. Joaquim, hep-ph/0609297.

[25] S. Antusch, S.F. King, A. Riotto, hep-ph/0609038.

P.V. Buividovich a,*, M.I. Polikarpov b

a Belarusian State University, Nezalezhnasti av. 4, 220080 Minsk, Belarus

b ITEP, B. Cheremushkinskaya str. 25, 117218 Moscow, Russia

Received 30 May 2007; received in revised form 20 June 2007; accepted 29 June 2007

Available online 25 July 2007

Abstract

It is shown that the action associated with center vortices in SU(2) lattice gauge theory is strongly correlated with extrinsic and internal curvatures of the vortex surface and that this correlation persists in the

continuum limit. Thus a good approximation for the effective vortex action is the action of rigid strings,

which can reproduce some of the observed geometric properties of center vortices. It is conjectured that

rigidity may be induced by some fields localized on vortices, and a model-independent test of localization

is performed. Monopoles detected in the Abelian projection are discussed as natural candidates for such

two-dimensional fields.

© 2007 Elsevier B.V. All rights reserved.

PACS: 12.38.Aw; 12.38.Gc; 11.25.-w

1. Introduction

Yang-Mills theory is often believed to be equivalent to some string theory; however, up to

now there is no way to detect thin strings behind the physical chromoelectric string of finite thickness, which gives rise to the linear potential between test colour charges and which is usually seen

in numerical simulations [1,2]. On the other hand, closed magnetic strings, or center vortices,

can be directly detected [3] and seem to be thin [4]. A model-independent argument in favor

of physically thin center vortices in continuum pure Yang-Mills theory was proposed recently

in [5]. It is known that the full QCD string tension is reproduced with sufficiently good precision

if one considers only the contribution due to topologically nontrivial winding of center vortices

* Corresponding author.

0550-3213/$ see front matter © 2007 Elsevier B.V. All rights reserved.

doi:10.1016/j.nuclphysb.2007.06.026

Fig. 1. Average size of center vortices as a function of their area ($\beta = 2.60$, $28^4$ lattice).

and Wilson loop [3,6]. Moreover, in [3,4] it was demonstrated that the area of center vortices in

SU(2) LGT scales in physical units of length. These facts imply that center vortices are not just

lattice artifacts, but rather correspond to some physically significant objects. Center vortices are

usually detected using the maximal center gauge and are seen as closed self-avoiding surfaces,

which occupy only a small fraction of lattice plaquettes [3,4]. Typically one observes a large percolating vortex, which extends through the whole lattice, and a number of small satellite vortices,

whose size typically does not exceed several lattice spacings [7].

As the total physical area and size of center vortices remain finite in the continuum limit,

there should also be a continuous description of vortex geometry, which is characterized by

some finite Hausdorff dimension $d_H$. The percolating vortex by definition has Hausdorff dimension

equal to 4. In order to define the dimensionality of small satellite vortices in SU(2) LGT, their

size L was measured as a function of their area S. The size of a vortex was defined as the

maximal distance between points which belong to the vortex. The average size of small vortices as

a function of their area (in lattice units) is plotted in Fig. 1. A fit of the form $L = \mathrm{const} \cdot S^{1/d}$

(solid curve in Fig. 1) gives $d = 1.9 \pm 0.1$. For small vortex areas ($S \lesssim 30$) this number simply

reflects the self-avoidance of vortex surfaces, but for larger values of the area this fit indicates that small

vortices tend to be smooth surfaces.
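A fit of the form $L = \mathrm{const} \cdot S^{1/d}$ can be done as a straight-line fit in log-log variables. The following is a minimal sketch of that technique, assuming an unweighted least-squares fit; the paper does not state its fitting procedure, and the data points below are synthetic, generated only to exercise the code.

```python
import math


def fit_power_law(areas, sizes):
    """Unweighted least-squares fit of log L = log C + (1/d) log S.

    Returns the pair (C, d).
    """
    xs = [math.log(s) for s in areas]
    ys = [math.log(l) for l in sizes]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    intercept = y_mean - slope * x_mean
    return math.exp(intercept), 1.0 / slope


# Synthetic example: sizes generated from L = 1.3 * S**(1/2), not lattice data.
AREAS = [10.0, 20.0, 40.0, 80.0, 160.0]
SIZES = [1.3 * s ** 0.5 for s in AREAS]
CONST, DIM = fit_power_law(AREAS, SIZES)
```

On real measurements one would also propagate the statistical errors of each point, which this sketch omits.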

How can one describe the properties of such surfaces? A necessary condition for physical scaling of area of random surfaces is the cancelation between their entropy, which for self-avoiding

surfaces grows linearly with the area of surface in lattice units, and the bare string tension, which

therefore should be finite in lattice units [8]. A remarkable result of [4] is that the excess of action associated with center vortices is indeed proportional to the area of vortices in lattice units.

However, for the simplest model of random surfaces with Nambu-Goto action such naive balance between action and entropy does not lead to physical surfaces because of the branched polymer

problem [8], therefore one has to consider some more complicated model.

A well-studied model which can probably describe smooth surfaces even in four and three

dimensions is the model of rigid strings [8-12]. The model was first analyzed in [9,10], where it

was shown that depending on the form of the $\beta$-function of the model either the branched polymer phase or the phase of smooth surfaces can be observed. Namely, if the $\beta$-function has no

nontrivial fixed points, the model reduces to the usual Nambu-Goto string flawed by the branched

polymer problem, but if a nontrivial fixed point exists, in the vicinity of this point the model should

describe smooth surfaces with finite Hausdorff dimension [9]. Numerical simulations indeed confirmed the existence of the phase of smooth surfaces in this model [8,12,13]. Nontrivial scaling

in the vicinity of the phase transition was observed in [12], but in more recent simulations of

a related model of tethered surfaces a first-order phase transition was found [13]. However, even

if the model of rigid strings has no continuum limit, it can still serve as an effective model, for

which UV cutoff is set by some other fields in the theory. It is known, for instance, that rigidity

terms arise in the effective string action after integrating over worldsheet fermions [8,14,15] or

some four-dimensional massive fields coupled to the string worldsheet [11,16,17].

The aim of this paper is to fit observed properties of center vortices in the model of rigid

strings. This model was first conjectured to describe center vortices in [18], where the regularized action of vortex configurations was studied in the continuum theory, and in [19], where

the model was found to describe the deconfinement transition with good precision. In this paper the correlation between the excess of action associated with vortices and the geometry of

vortex surfaces will be studied systematically on the basis of the results of lattice simulations. In Section 2 it will be shown that the effective vortex action can indeed be approximated by the action

of rigid strings and that the coupling constants before extrinsic and internal curvatures are finite in the continuum limit. In Section 3 this action is discussed as an effective action induced

by some two-dimensional fields localized on the surfaces of vortices and a model-independent

check of localization is performed. Monopoles in the Abelian projection of the theory [4,20,21]

are then discussed as appropriate candidates for such localized fields in Section 4. The dependence of the bare string tension on the lattice spacing can be explained if one takes into account

the contribution of percolating monopole cluster. Interaction between monopoles, which can be

approximated by the Yukawa potential with physical mass [22], can also partially account for

surface rigidity [11,16,17]. Finally, implications and possible extensions of obtained results are

discussed.

2. Effective action as the function of vortex geometry

In order to measure the correlation between geometric properties of the vortex surface and the action density, center vortices were detected in SU(2) LGT by imposing the direct maximal center

gauge (DMC) [4]. A simulated annealing procedure was used to locate true minima. According to

the conventional procedure, closed vortex surfaces were constructed from plaquettes which are

dual to negative projected plaquettes. In this work the same set of lattice configurations as in [4]

was used. The lattices used were of the size $28^4$ for $\beta = 2.60$ and $\beta = 2.55$, $24^4$ for $\beta = 2.50$,

$\beta = 2.45$ and $\beta = 2.40$, and $16^4$ for $\beta = 2.35$. Lattice spacing was fixed by setting the value of

QCD string tension to $\sqrt{\sigma} = 440$ MeV.

Local geometry of vortex surfaces was characterized by the two simplest local invariants:

internal and extrinsic curvatures. Internal curvature in lattice units for a hypercubic lattice was

defined as $a^2 R_s = 4 - n_s$, where $n_s$ is the number of neighbors of the site s and a is the lattice

spacing [8]. Extrinsic curvature for smooth surfaces can be written as $K = \Delta x^\mu \Delta x^\mu$, where

$\Delta$ is the Laplace operator for the induced metric on the surface. In order to define extrinsic curvature on a hypercubic

lattice, the two-dimensional lattice Laplacian on the surface was defined as $a^2 \Delta f_s = n_s f_s - \sum_{s'} f_{s'}$, where $s'$

are the sites adjacent to s. Extrinsic curvature in lattice units is then $a^2 K_s = \Delta x_s^\mu \Delta x_s^\mu$ [8]. The

shapes of vortex surfaces which correspond to the values of extrinsic curvature $a^2 K_s = 0, 1, 2, 3$

are shown in Fig. 4. The shape which corresponds to $a^2 K_s = 4$ is essentially four-dimensional

and is not shown. It can be thought of as a lattice site where four orthogonal plaquettes join.
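The two lattice definitions can be made concrete with a few lines of code. This is a minimal sketch, assuming that the neighbors of a site are the sites joined to it by links of the vortex surface and that positions are measured in units of the lattice spacing, so that $a^2 \Delta x_s$ is minus the sum of the offsets to the neighbors; the function name and the example configurations are hypothetical.

```python
def lattice_curvatures(neighbor_offsets):
    """Return (a^2 R_s, a^2 K_s) for one surface site.

    neighbor_offsets: integer 4-vectors pointing from the site s to each of its
    neighbors on the surface, in units of the lattice spacing.
    """
    n_s = len(neighbor_offsets)
    internal = 4 - n_s
    # a^2 * (Laplacian of x) at s, in units of a: n_s * x_s - sum over neighbors of x_s'.
    laplacian_x = [-sum(off[mu] for off in neighbor_offsets) for mu in range(4)]
    extrinsic = sum(component * component for component in laplacian_x)
    return internal, extrinsic


# Flat site: four neighbors in one plane.
FLAT = [(1, 0, 0, 0), (-1, 0, 0, 0), (0, 1, 0, 0), (0, -1, 0, 0)]
# Site on a right-angle fold between two planes.
FOLD = [(1, 0, 0, 0), (-1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)]
# Corner of an elementary cube: three plaquettes join.
CUBE_CORNER = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)]
# Four mutually orthogonal plaquettes joining at one site.
FOUR_DIM = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
```

With these inputs `lattice_curvatures` gives $a^2 K_s = 0, 2, 3, 4$ for the four configurations in turn, and $a^2 R_s = 1$ only for the cube corner, consistent with the integer values quoted in the text.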


Fig. 2. Average excess of action per site as the function of extrinsic curvature and lattice spacing.

Fig. 3. Average excess of action per site as the function of internal curvature and lattice spacing.

The excess of action associated with some lattice site on the vortex was defined by averaging

the excess of action over all lattice plaquettes which are dual to the vortex plaquettes which

contain this site. The excess of action on a plaquette was defined as $S_p = \beta\,(1 - \tfrac{1}{2}\mathrm{Tr}\,U_p) - \beta\,\langle 1 - \tfrac{1}{2}\mathrm{Tr}\,U_p \rangle = \tfrac{\beta}{2}\,(\langle \mathrm{Tr}\,U_p \rangle - \mathrm{Tr}\,U_p)$. Average excess of action per site as a function of

lattice spacing and extrinsic and internal curvatures (in lattice units) is plotted in Figs. 2 and 3,

respectively.

It can be seen that the excess of action increases with both extrinsic and internal curvature,

therefore the action should in general depend on both internal and extrinsic curvatures. Standard

dimensional analysis shows that only the terms linear in extrinsic and internal curvature are

relevant in the continuum limit. It should also be noted that because center vortices are defined

on a hypercubic lattice, typical values of curvature diverge as $a^{-2}$ in the continuum limit ($a^2 K_s$

and $a^2 R_s$ are finite integer numbers, therefore $K_s \sim a^{-2}$ and $R_s \sim a^{-2}$). A simple estimation

shows that the number of points on the surface with $K \neq 0$ or $R \neq 0$ scales to zero as $a^2$, which

ensures the smoothness of surfaces and the finiteness of the total contribution of bent plaquettes

to the physical action. Finite values of Ks or Rs for physical surfaces in the continuum limit can

be in principle defined by averaging the curvature over physically small regions, whose area is

nevertheless very large in lattice units.

A peculiar feature of the dependence of action excess on extrinsic curvature is the distinct

peak near $a^2 K_s = 3$. Such a value of extrinsic curvature corresponds to lattice sites where three

plaquettes join (see Fig. 4). It was found that when this peak is neglected, the resulting function

appears to be almost linear in $a^2 K_s$ (Fig. 5). Presumably the peak at $a^2 K_s = 3$ is a lattice artifact

which corresponds to elementary lattice cubes. In order to obtain better fits the excess of action

at $a^2 K_s = 3$ was replaced by the average between $a^2 K_s = 2$ and $a^2 K_s = 4$ (solid line in Fig. 5),

which yields an almost linear dependence. The bare string tension in lattice units and the coefficient before extrinsic curvature were obtained as the intercept and the slope of the linear fits of the data

Fig. 4. Shapes of vortex surfaces which correspond to the values of extrinsic curvature $a^2 K_s = 0, 1, 2, 3$. Lattice sites

with which these values of extrinsic curvature are associated are marked with solid circles.

Fig. 5. Excess of action per site as a function of extrinsic curvature at different lattice spacings (from a = 0.14 fm,

$\beta = 2.35$ for the uppermost curve to a = 0.06 fm, $\beta = 2.60$ for the lowest curve). Solid lines are plotted using the value

at $a^2 K_s = 3$ which was replaced by an average between $a^2 K_s = 2$ and $a^2 K_s = 4$.

in Fig. 5. One could in principle try to fit the excess of action per plaquette by some polynomial
in $a^2 K_s$, but dimensional analysis shows that only the term linear in $a^2 K_s$ survives
in the continuum limit. Nevertheless, the fitting method affects the numerical value of the coefficient of the extrinsic curvature. In general, increasing the number of degrees of freedom in
the fitting functions changes the value and the uncertainty of this coefficient; however, the values
agree within errors when extrapolated to the continuum limit. For instance, in order to check
the stability of the fits, the data plotted in Fig. 5 were fitted by first- and second-order polynomials
in $a^2 K_s$. The values of the bare string tension $\sigma_0(a)$ (the constant term in the fits) obtained from both
fits agree with very good precision. At finite lattice spacings the discrepancies in the values of the extrinsic-curvature coefficient
are larger, but extrapolation to the continuum limit gives consistent values: $0.065 \pm 0.006$
(linear fit) and $0.048 \pm 0.014$ (quadratic fit). The coefficient of $(a^2 K_s)^2$ has very
large errors and is close to zero. This coefficient can only be important in the continuum limit if
it contains divergences of order $a^{-2}$, which is not likely.
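The fitting procedure described above can be sketched in a few lines. This is only a minimal illustration, assuming an unweighted least-squares fit; the data values and the helper names `replace_peak` and `linear_fit` are hypothetical and are neither the measured values nor the authors' code.

```python
def replace_peak(values, index):
    """Return a copy of values with values[index] replaced by the
    mean of its two neighbours (used here to smooth out the peak)."""
    smoothed = list(values)
    smoothed[index] = 0.5 * (values[index - 1] + values[index + 1])
    return smoothed


def linear_fit(xs, ys):
    """Unweighted least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical excess of action at a^2 K_s = 0..4 with an artificial bump at 3.
curvature = [0, 1, 2, 3, 4]
excess = [0.20, 0.27, 0.34, 0.60, 0.48]

intercept, slope = linear_fit(curvature, replace_peak(excess, 3))
```

In such a fit the intercept plays the role of the bare string tension in lattice units and the slope that of the extrinsic-curvature coefficient; a realistic analysis would also weight the points by their statistical errors.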


In order to extract the term linear in the internal curvature from the data plotted in Fig. 3, the
excess of action was fitted by a third-order polynomial in $a^2 R_s = 4 - n_s$. The coefficient of $a^2 R_s$ in this polynomial was taken to be the coefficient of the internal curvature in the
physical action.

Finally, after omitting all terms which become irrelevant in the continuum limit, for sufficiently small lattice spacings $a$ one can write the action associated with center vortices in the
following form:

\[ W[S] = \int_S d^2\xi \, \sqrt{g} \, \left( \sigma_0(a)\, \Lambda_{UV}^2 + (a)\, R + (a)\, K \right), \tag{1} \]

where $\sqrt{g} = \sqrt{\det g_{ab}}$ is the invariant element of area on the surface, $g_{ab} = \partial_a X^\mu \partial_b X^\mu$ is the
induced metric on the surface and $\Lambda_{UV} = a^{-1}$.
The coefficient $\sigma_0(a)$ and the coefficients of $R$ and $K$ as functions of the lattice spacing are plotted in Figs. 6,
8 and 7, respectively. Extrapolation to the continuum limit gives the following values:

\[ \sigma_0(0) = 0.192 \pm 0.006, \tag{2} \]

Thus the coefficients of the internal and extrinsic curvature terms are finite in the continuum limit, and therefore the dependence on
internal and extrinsic curvatures is physical. The quadratic divergence in the bare string tension is
crucial for the existence of smooth physical surfaces, as explained above, and should be compensated by a similar divergence in the entropy of random surfaces. It is interesting to note that the
value of the bare string tension in lattice units (2), which is obtained after taking curvature into account, is smaller than the value of the action excess $S_p = 0.540 \pm 0.004$ obtained in [4]. The fact
that the string tension is strongly decreased after a proper redistribution of the action excess among operators with appropriate dimensions
indicates that the terms with extrinsic curvature play an
important role in the dynamics of center vortices.

The action (1) corresponds to the model of rigid strings [8–10]. While the surface entropy is
canceled by the divergent bare string tension, the branched polymer problem is circumvented due
to the third term. It is interesting to note that, as a consequence of the Gauss–Bonnet identity
$\int_S d^2\xi \, \sqrt{g} \, R = 2\pi\chi[S]$ ($\chi[S]$ is the Euler characteristic of the surface), the coefficient of the
internal curvature is proportional to the string coupling constant, which is therefore also finite in
the continuum limit.
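On a lattice the Gauss–Bonnet identity becomes a simple counting statement, which can be checked directly. The sketch below assumes that $n_s$ counts the vortex plaquettes meeting at a site and that each site carries a deficit of $4 - n_s$ in units of $\pi/2$; the closed surface used here (the boundary of a single lattice cube) and all function names are illustrative only.

```python
from collections import Counter


def euler_characteristic(plaquettes):
    """V - E + F for a closed surface given as 4-tuples of site labels,
    each tuple listing the corners of one plaquette in cyclic order."""
    sites = set()
    links = set()
    for plaq in plaquettes:
        sites.update(plaq)
        for i in range(4):
            links.add(frozenset((plaq[i], plaq[(i + 1) % 4])))
    return len(sites) - len(links) + len(plaquettes)


def total_deficit(plaquettes):
    """Sum over sites of (4 - n_s), n_s = number of plaquettes at the site."""
    n_s = Counter(site for plaq in plaquettes for site in plaq)
    return sum(4 - count for count in n_s.values())


# Surface of one lattice cube; site label = x + 2*y + 4*z with x, y, z in {0, 1}.
cube = [
    (0, 1, 3, 2), (4, 5, 7, 6),   # z = 0 and z = 1 faces
    (0, 1, 5, 4), (2, 3, 7, 6),   # y = 0 and y = 1 faces
    (0, 2, 6, 4), (1, 3, 7, 5),   # x = 0 and x = 1 faces
]

chi = euler_characteristic(cube)
deficit = total_deficit(cube)
```

For the cube every site has $n_s = 3$, so the summed deficit is 8, i.e. $4\chi$ with $\chi = 2$; multiplying by $\pi/2$ gives a total deficit angle of $4\pi = 2\pi\chi$, in line with the identity quoted above.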


From the point of view of field theory, the quadratically divergent term $\int_S d^2\xi \, \sqrt{g} \, \sigma_0(a) \Lambda_{UV}^2$ in
the vortex effective action indicates that some fields are localized on center vortices and become effectively two-dimensional. The quadratic divergence in the effective action (1) corresponds
to vacuum oscillations of these two-dimensional fields [23,24]. Localization on center vortices
was also directly observed in lattice simulations: for instance, it was found that in the maximal
Abelian gauge almost all trajectories of Abelian monopoles belong to center vortices [21]. There
also exist classical string solutions where monopoles are localized on string worldsheets [25,26].
In [27,28] it was shown numerically that eigenfunctions of the covariant four-dimensional Laplace
and Dirac operators are also localized near center vortices.

If only two-dimensional fields and their interactions are responsible for the excess of action
on the vortices, the total excess of action on the vortex $W[S]$ should be treated as the effective string
action $W_{eff}[S]$ obtained by integrating over the fields at fixed geometry:

\[ \exp\left( -W_{eff}[S] \right) = \int \mathcal{D}\phi \, \exp\left( -\int_S d^2\xi \, \sqrt{g} \, L[\phi] \right), \tag{3} \]

where $\mathcal{D}\phi$ is the covariantly defined path integral measure and $L[\phi]$ is the Lagrangian of
the field $\phi$. Such an effective action will necessarily contain a quadratic divergence of the form
$\int d^2\xi \, \sqrt{g} \, \Lambda_{UV}^2$ due to vacuum oscillations of these two-dimensional fields, plus some geometry-dependent terms. For instance, if the fields are four-dimensional Dirac fermions, the effective
action (3) depends on the extrinsic curvature of the surface [8,14,15].

In order to check whether some two-dimensional fields propagate along the vortex or not, one
can consider points on the vortex which are very distant in terms of the internal geometry of the
vortex but very close in four-dimensional space. If the relevant fields propagate only along the
vortex, the correlation between the excess of action at such points should be much smaller than between plaquettes separated by the same distance along the vortex. In order to check this numerically, center
vortices were represented as graphs, each node of the graph corresponding to some lattice plaquette. The correlation between action densities was measured for points which are separated
by only one lattice spacing in four-dimensional space but by no less than 6 spacings along the vortex ($d_4 < 2$, $d_2 \ge 6$). The standard breadth-first search algorithm for unweighted graphs was used
to measure the distances on the vortex. In order to reduce anisotropy, the sites which belong to the
vortex were linked not only along lattice links but also along the diagonals of plaquettes. As the
measured distance was used only for lower-bound estimates, there was no need
to use much more time-consuming search algorithms for weighted graphs, such as Dijkstra's algorithm. For comparison, the correlation between neighboring vortex plaquettes ($d_4 < 2$, $d_2 < 2$)
was also measured. The correlation between the plaquette variables $\mathrm{Tr}\,U_p$ and $\mathrm{Tr}\,U_{p'}$ was characterized
by the correlation coefficient $[\mathrm{Tr}\,U_p, \mathrm{Tr}\,U_{p'}]$:

\[ [\mathrm{Tr}\,U_p, \mathrm{Tr}\,U_{p'}] = \frac{\langle \mathrm{Tr}\,U_p \, \mathrm{Tr}\,U_{p'} \rangle - \langle \mathrm{Tr}\,U_p \rangle^2}{\langle (\mathrm{Tr}\,U_p)^2 \rangle - \langle \mathrm{Tr}\,U_p \rangle^2}. \tag{4} \]
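The selection of plaquette pairs and the correlation coefficient (4) can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the adjacency-dictionary representation and the names `bfs_distances`, `far_along_surface` and `correlation` are assumptions, and the estimator pools both samples to obtain a common mean and variance.

```python
from collections import deque


def bfs_distances(adjacency, source, max_depth=None):
    """Breadth-first search on an unweighted graph.

    adjacency maps each node to an iterable of neighbouring nodes.
    Returns {node: number of edges from source}. If max_depth is given,
    nodes farther than max_depth are not reached.
    """
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if max_depth is not None and distances[node] >= max_depth:
            continue  # do not expand beyond the requested depth
        for neighbour in adjacency[node]:
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


def far_along_surface(adjacency, a, b, min_steps=6):
    """True if b is at least min_steps edges away from a along the graph."""
    near = bfs_distances(adjacency, a, max_depth=min_steps - 1)
    return b not in near


def correlation(xs, ys):
    """Correlation coefficient of paired samples that share one mean
    and one variance, both estimated from the pooled data."""
    n = len(xs)
    mean = (sum(xs) + sum(ys)) / (2 * n)
    mean_sq = (sum(x * x for x in xs) + sum(y * y for y in ys)) / (2 * n)
    mean_xy = sum(x * y for x, y in zip(xs, ys)) / n
    return (mean_xy - mean ** 2) / (mean_sq - mean ** 2)


# Toy graph: a chain of 8 nodes, 0 - 1 - ... - 7.
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 7] for i in range(8)}
```

Limiting the search depth keeps the cost per pair small, which is sufficient here because only a lower bound on the distance along the vortex is needed.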

The results of these measurements are plotted in Fig. 9. It can be seen that the correlation
in four-dimensional space is notably smaller than along the surface of the vortex, and therefore
the fields which are responsible for the excess of action on vortices are more likely to propagate
along their surfaces. Unfortunately, due to insufficient statistics it was not possible to measure the
correlation lengths which correspond to propagation along the vortex and in four-dimensional
space. The latter can be estimated as the inverse mass of the lowest glueball (1.5 GeV), which
is comparable with the inverse lattice spacing. A measurement somewhat similar to the
one described above was performed in [4], where the average excess of action on plaquettes very
close to center vortices was shown to be zero.

4. Abelian monopoles and center vortices

Up to now the only way to see directly the content of the conjectured two-dimensional field

theory is to impose maximal Abelian gauge and to look at the trajectories of Abelian monopoles

[20,23,24,29], which populate densely the surfaces of center vortices. In real simulations about

95% of monopole trajectories belong to center vortices [20,21]. It is natural to conjecture that

two-dimensional field theory living on vortex surfaces should describe these monopoles upon

first quantization.


Fig. 9. Correlation between neighbouring plaquettes with $d_4 < 2$, $d_2 \ge 6$ and with $d_4 < 2$, $d_2 < 2$.

Another nontrivial fact which supports the statement about the role of monopoles in the dynamics of center vortices is the dependence of the bare string tension in lattice units on the lattice
spacing (see Fig. 6), which can be approximated by a linear function $\sigma_0(a) = A + Ba$ with
good precision. Besides the quadratic divergence, this also yields an $a^{-1} \sim \Lambda_{UV}$ divergence in the
bare string tension in physical units: $a^{-2}\sigma_0(a) = a^{-2}A + a^{-1}B$, where $A$ is given by (2) and
$B = 2.2 \pm 0.1$ fm$^{-1}$. Such a divergence usually corresponds to the self-energy of one-dimensional
objects [4,8,23,24]. If the density of one-dimensional objects per unit of vortex area $\rho_{1D}$ scales
in physical units and is constant in the continuum limit, while the bare mass of these objects
is UV divergent and is close to the critical value $m_{bare} = a^{-1}\ln 4$ [8], one obtains the estimate $\rho_{1D} = B/\ln 4 = 1.5(8)$ fm$^{-1}$. Taking into account that the density
of vortices is $\rho_v = 24$ fm$^{-2}$, it is easy to estimate the density of one-dimensional objects in
four-dimensional space as $\rho_{1D}\,\rho_v = 37.(9)$ fm$^{-3}$, which is in good agreement with the density
of percolating Abelian monopoles $\rho_m = 31.(1)$ fm$^{-3}$ [29]. The difference may be explained by
incomplete detection of Abelian monopoles and by the curved geometry of vortex surfaces.
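The arithmetic behind this estimate is short enough to spell out. The sketch below simply evaluates $B/\ln 4$ and multiplies by the vortex density, using the numbers quoted in the text; the function and variable names are illustrative.

```python
import math


def line_density_on_vortex(b_coefficient):
    """Density of one-dimensional objects per unit vortex area (fm^-1),
    estimated as B / ln 4."""
    return b_coefficient / math.log(4)


def density_in_4d(rho_1d, rho_vortex):
    """Density in four-dimensional space (fm^-3): product of the density
    on the vortex (fm^-1) and the vortex density (fm^-2)."""
    return rho_1d * rho_vortex


B = 2.2        # fm^-1, slope of the bare string tension quoted in the text
RHO_V = 24.0   # fm^-2, vortex density quoted in the text

rho_1d = line_density_on_vortex(B)
rho_4d = density_in_4d(rho_1d, RHO_V)
```

Evaluated without intermediate rounding this gives about 1.59 fm$^{-1}$ and 38.1 fm$^{-3}$; using the truncated value 1.58 fm$^{-1}$ reproduces the 37.9 fm$^{-3}$ quoted in the text.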

Geometric properties of monopole trajectories in SU(2) LGT were studied in [29]. It was
found that the properties of monopole trajectories at the hadronic scale are not described by random walks with $d_H = 2$, as in the case of free scalar particles, but rather by smooth random
walks. Namely, the correlation between the directions of tangent vectors to monopole paths is
characterized by a correlation length $l_c^{-1} \sim 300$ MeV, which remains finite in the continuum
limit. Smooth random walks are in the same universality class as the trajectories of spinning
particles [8]; therefore the fields associated with monopoles living on center vortices are more
likely to have nonzero spin. The simplest physical model which leads to smooth random walks
is the propagation of massive Dirac fermions in Euclidean space [8,30]. If the monopoles are
assumed to be Dirac fermions, the mass of the fermions can be roughly estimated as 1.5 GeV
from the measurements of monopole current correlators in [29]. A general smooth random walk
corresponds to a random walk of the tangent vector on the three-sphere $S^3$ and therefore includes
components of all spins, but it is not clear how such a model can be incorporated into the models of random surfaces. On the other hand, random surfaces with four-dimensional fermions are
well studied. It is known that the effective action of fermionic strings includes rigidity terms
[14,15,20]. As the worldsheet fermions in the model of fermionic strings are massless Dirac fermions, it is natural to conjecture that the effective action induced by massive worldsheet fermions


also induces string rigidity. In general, one can expect that for massive fermions the effective
action, besides the local terms of the form (1), should also contain nonlocal terms of the form
$\int d^2\xi_1 \sqrt{g(\xi_1)} \int d^2\xi_2 \sqrt{g(\xi_2)} \, K(\xi_1) \, \Delta^{-1}(\xi_1, \xi_2) \, K(\xi_2)$, where $\Delta^{-1}(\xi_1, \xi_2)$ is the kernel of the inverse Laplace operator on the surface of the vortex. Unfortunately, on presently available lattices
it is very difficult to trace such terms in the effective action.

The effective Lagrangian governing the dynamics of monopoles was obtained in [22] using the

inverse Monte Carlo method. It was found that the effective monopole action, besides the usual

kinetic term, contains Yukawa interaction with physical mass as well as four-point and six-point

interactions. Higher-order interaction terms were found to be very small. Taking these results

together, a reasonable conjecture is that at hadronic scale monopoles behave as massive Dirac

fermions living on center vortices. Yukawa-type interaction may be induced if these fermions are

coupled to some four-dimensional massive field. Coupling to massive four-dimensional fields

also leads to rigidity terms in the effective action [11,16,17], and thus such interaction can be

also absorbed in the effective action (1).

5. Conclusions

In this paper the relation between local vortex geometry and the action density was studied.
Direct measurements of the action imply that a reasonable approximation for the effective action of
center vortices is the action of rigid strings [9,10]. This action can reproduce the observed smoothness of vortex surfaces and presumably the physical scaling of the vortex area. The latter possibility
depends crucially on the form of the nonperturbative $\beta$-function of the model of rigid strings and at
present can only be checked numerically [12]. Unfortunately, the values of $\sigma_0$ and of the two
curvature coefficients (2), obtained by extrapolation to the continuum limit, cannot be compared with the
corresponding critical values obtained from independent simulations, because most numerical
investigations of the model of rigid strings dealt with the three-dimensional case [12,13]. As the
existence of the continuum limit of the model of rigid strings has not been proven exactly, it is
not completely clear whether the effective vortex action (1) can be used at all values of the lattice
spacing.

It turns out that a large fraction of the action associated with center vortices comes from the rigidity term; therefore the dependence of the action on local vortex geometry should be crucial for the dynamics of vortices. In [4] it was conjectured that this dependence arises due to Abelian monopoles
localized on vortices. The considerations of Section 4 support this conjecture, although this problem requires a more accurate analytic treatment. For instance, it would be extremely interesting to
construct a two-dimensional field theory which describes monopoles localized on center vortices.
Presumably such a theory should be fermionic. In fact, the action of rigid strings arises naturally as
a low-energy effective action for vortices of finite thickness [11,16–18]; however, recent lattice
measurements [4] indicate that center vortices in SU(2) LGT are genuinely thin. Fermionic fields
localized on vortex worldsheets provide a natural explanation of the rigidity of such physically
thin vortices.

An important property of center vortices which is not captured by the action (1) is the
existence of a single percolating vortex. In the case of random walks, a percolating trajectory corresponds to a condensate which emerges due to a tachyonic instability of the perturbative vacuum. In
order to describe condensation one should use the concepts of Euclidean interacting quantum
field theory instead of random walks, which describe the states of only one particle. The condensate
then corresponds to a nonzero background field, as in the Higgs model (for a related discussion see,


e.g., [23,24]). It is not clear whether this picture remains valid for the theory of random surfaces,
because the required nonperturbative apparatus of string field theory is hardly developed.

Acknowledgements

The authors are grateful to E.T. Akhmedov, F.V. Gubarev, A.S. Gorsky, and especially to V.I.
Zakharov, for illuminating discussions and critical remarks, and to F.V. Gubarev, A.V. Kovalenko
and P.Yu. Boyko for lattice configurations and source codes. P.V. Buividovich is grateful to all
members of the ITEP lattice group for their kind hospitality. M.I. Polikarpov was partially supported by grants RFBR-05-02-16306a, RFBR-04-02-16079a, RFBR-06-02-04010-NNIOa and by the EU
Integrated Infrastructure Initiative Hadron Physics (I3HP) under contract RII3-CT-2004-506078.

References

[1] P.Y. Boyko, F.V. Gubarev, S.M. Morozov, arXiv: 0704.1203 [hep-lat].
[2] L. Del Debbio, M. Faber, J. Giedt, J. Greensite, S. Olejnik, Phys. Rev. D 58 (1998) 094501, hep-lat/9801027.
[3] F.V. Gubarev, A.V. Kovalenko, M.I. Polikarpov, S.N. Syritsyn, V.I. Zakharov, Phys. Lett. B 574 (2003) 136, hep-lat/0212003.
[4] P.V. Buividovich, M.I. Polikarpov, arXiv: 0704.3367 [hep-ph].
[5] V.G. Bornyakov, D.A. Komarov, M.I. Polikarpov, Phys. Lett. B 497 (2001) 151, hep-lat/0009035.
[6] P.Y. Boyko, V.G. Bornyakov, E. Ilgenfritz, A.V. Kovalenko, B.V. Martemyanov, M. Müller-Preussker, M.I. Polikarpov, A.I. Veselov, Nucl. Phys. B 756 (2006) 71, hep-lat/0607003.
[7] J. Ambjørn, hep-th/9411179.
[8] A.M. Polyakov, Nucl. Phys. B 268 (1986) 406.
[9] H. Kleinert, Phys. Lett. B 174 (1986) 335.
[10] H. Kleinert, Phys. Lett. B 211 (1988) 151.
[11]
[12] J. Ambjørn, A. Irbäck, J. Jurkiewicz, B. Petersson, Nucl. Phys. B 393 (1993) 571, hep-lat/9207008.
[13] H. Koibuchi, T. Kuwahata, Phys. Rev. E 72 (2005) 026124, cond-mat/0506787.
[14] A.R. Kavalov, I.K. Kostov, A.G. Sedrakyan, Phys. Lett. B 175 (1986) 331.
[15] P.B. Wiegmann, Nucl. Phys. B 323 (1989) 330.
[16] E.T. Akhmedov, M.N. Chernodub, M.I. Polikarpov, M.A. Zubkov, Phys. Rev. D 53 (1996) 2087, hep-th/9505070.
[17] P. Orland, Nucl. Phys. B 428 (1994) 221, hep-th/9404140.
[18] M. Engelhardt, H. Reinhardt, Nucl. Phys. B 567 (2000) 249, hep-th/9907139.
[19] M. Engelhardt, H. Reinhardt, Nucl. Phys. B 585 (2000) 591, hep-lat/9912003.
[20] J. Ambjørn, J. Giedt, J. Greensite, JHEP 0002 (2000) 033, hep-lat/9907021.
[21] A.V. Kovalenko, M.I. Polikarpov, S.N. Syritsyn, V.I. Zakharov, Phys. Rev. D 71 (2005) 054511, hep-lat/0402017.
[22] S. Kato, N. Nakamura, T. Suzuki, S. Kitahara, Nucl. Phys. B 520 (1998) 323.
[23] V.I. Zakharov, From confining fields on the lattice to higher dimensions in the continuum, in: Lectures given at Infrared QCD in Rio: Propagators, Condensates and Topological Effects (IRQCD, 2006), Rio de Janeiro, Brazil, 2006, hep-ph/0612342.
[24] V.I. Zakharov, hep-ph/0309178.
[25] A. Gorsky, M. Shifman, A. Yung, Phys. Rev. D 71 (2005) 045010, hep-th/0412082.
[26] M.N. Chernodub, R. Feldmann, E. Ilgenfritz, A. Schiller, Phys. Rev. D 71 (2005) 074502, hep-lat/0502009.
[27] J. Greensite, A.V. Kovalenko, S. Olejnik, M.I. Polikarpov, S.N. Syritsyn, V.I. Zakharov, Phys. Rev. D 74 (2006) 094507, hep-lat/0606008.
[28] A.V. Kovalenko, S.M. Morozov, M.I. Polikarpov, V.I. Zakharov, Phys. Lett. B 648 (2007) 383, hep-lat/0512036.
[29] V.G. Bornyakov, P.Y. Boyko, M.I. Polikarpov, V.I. Zakharov, Nucl. Phys. B 672 (2003) 222, hep-lat/0305021.
[30] M.S. Plyushchay, Phys. Lett. B 235 (1990) 47.

at the LHC

Simon Bray a, Jae Sik Lee b,*, Apostolos Pilaftsis a

a School of Physics and Astronomy, University of Manchester, Manchester M13 9PL, United Kingdom

b Center for Theoretical Physics, School of Physics, Seoul National University, Seoul 151-747, South Korea

Received 6 March 2007; received in revised form 29 June 2007; accepted 4 July 2007

Available online 17 July 2007

Abstract

The observed light neutrinos may be related to the existence of new heavy neutrinos in the spectrum
of the SM. If a pair of heavy neutrinos has nearly degenerate masses, then CP violation from the interference between tree-level and self-energy graphs can be resonantly enhanced. We explore the possibility
of observing CP asymmetries due to this mechanism at the LHC. We consider a pair of heavy neutrinos
$N_{1,2}$ with masses ranging from 100 to 500 GeV and a mass splitting $\Delta m_N = m_{N_2} - m_{N_1}$ comparable to
their widths $\Gamma_{N_{1,2}}$. We find that for $\Delta m_N \sim \Gamma_{N_{1,2}}$, the resulting CP asymmetries can be very large or even
maximal and therefore could potentially be observed at the LHC.
© 2007 Elsevier B.V. All rights reserved.

1. Introduction

The observation of neutrino oscillations has established that the observed neutrinos are not
massless, and so the Standard Model (SM) must be extended in order to accommodate their masses [1].
In order to explain why the neutrinos are so much lighter than any of the other fermions, it is
common to postulate that, in addition to the three observed light neutrinos, there also exist partner
heavy neutrinos. In order to avoid very stringent constraints due to their non-observation at the
Large Electron–Positron (LEP) collider, these must have masses greater than about 100 GeV [2].
However, if they exist with masses greater than this, but less than about 500 GeV, they fall into
the category of particles that could be produced for the first time at the Large Hadron Collider

* Corresponding author.

0550-3213/$ – see front matter © 2007 Elsevier B.V. All rights reserved.

doi:10.1016/j.nuclphysb.2007.07.002


(LHC) [3–6]. Complementary to such a search, heavy neutrinos could also be observed at a future
linear collider. Several studies have been conducted into their signals at the International Linear
Collider (ILC) [7,8] as well as at possible alternatives such as an $e\gamma$ collider [9].

If neutrinos with such masses do exist, then in general they should break lepton-flavour conservation and, if they are Majorana, lepton-number ($L$) conservation as well. Furthermore, since
their couplings can be complex, they can also contribute to CP violation. A scenario of particular
interest is if two or more of the heavy neutrinos are quasi-degenerate in mass [10]. In this case,
CP violation can be resonantly enhanced such that, for appropriate couplings, CP asymmetries
can be large or even maximal [11]. By introducing flavour symmetries in the singlet neutrino
sector, models can be built where quasi-degenerate heavy neutrinos with masses of order the
electroweak scale appear naturally [12].$^1$ These mostly focus on models of resonant leptogenesis, which can be used to explain the Baryon Asymmetry of the Universe (BAU). Observation
of heavy neutrinos at the LHC, and in particular measurements of CP asymmetries due to them,
would be a way of testing such models.

Since it is not known at this time whether neutrinos are Dirac or Majorana particles, both
possibilities need to be considered. For the former, the collider signatures at the LHC are lepton-flavour-violating (LFV) processes. For the latter, in addition to these, lepton-number-violating
(LNV) processes could also be observed. Either of these types of processes should be virtually
background free, since they are forbidden in the SM, which is both lepton-number-conserving
(LNC) and lepton-flavour-conserving (LFC). Higher-order processes with light neutrinos in the
final state could in principle fake the signals, but these can be excluded by suitable kinematical
cuts, e.g., by vetoing on missing transverse momentum [5,6].

CP violation could show up in asymmetries between possible signal final states and their CP-conjugates. Although the initial $pp$ state is not a CP-eigenstate, true CP-violating observables
can be constructed, either by taking into account the theoretically calculable difference expected
due to the Parton Distribution Functions (PDFs) [14,15], or by considering appropriate ratios of
different processes such that this factor drops out.

Observation of heavy neutrinos at the LHC would be a major discovery. Less direct evidence
could come from their contribution to LFV processes, e.g., $\mu \to e\gamma$, $\mu \to e$ conversion in nuclei,
or (if Majorana) neutrinoless double beta decay. The non-observation of such processes, along
with the excellent agreement of electroweak data with the SM, places limits on the strength of their
couplings.

This paper is organised as follows. In Section 2, we describe extensions of the SM which
include heavy neutrinos and discuss the experimental constraints on them. Two specific models
are considered, one predicting heavy Majorana neutrinos, the other Dirac. Section 3 is a short
discussion of the signatures of such particles at the LHC; in particular, we classify them according
to whether or not they are LNV. Next, in Section 4, we present the formalism for describing the
propagator of a system of two coupled quasi-degenerate heavy neutrinos. This is based on the
field-theoretic resummation approach developed in [16] for describing resonant transitions
involving the mixing of intermediate fermionic states.

Assuming the signals are due entirely to two nearly degenerate heavy (Dirac or Majorana)
neutrinos, the $2 \times 2$ propagator matrices for such a pair are then used in Section 5 to derive
expressions for the CP asymmetries between them. We give example scenarios where these
asymmetries are large and discuss their compatibility with both experimental and theoretical

1 For recent studies within supersymmetric models, see [13].


the possible level of leptonic CP violation observable at the LHC. We define a number of CP-violating observables and plot these, along with the cross sections they are derived from. Finally,
Section 7 contains our conclusions.

2. Heavy neutrino extensions of the Standard Model

2.1. Heavy Majorana neutrino model

We first describe the SM, minimally extended to include right-handed neutrinos. Assuming
the Higgs sector is not extended, the Lagrangian describing the neutrino masses and mixings
reads

\[ \mathcal{L}_m = -\frac{1}{2} \begin{pmatrix} \bar\nu_L^0 & (\bar\nu_R^0)^c \end{pmatrix} \begin{pmatrix} 0 & m_D \\ m_D^T & m_M \end{pmatrix} \begin{pmatrix} (\nu_L^0)^c \\ \nu_R^0 \end{pmatrix} + \mathrm{h.c.}, \tag{2.1} \]

where $\nu_L^0$ and $\nu_R^0$ denote column vectors of the left- and right-handed neutrino fields in the weak
basis, and the notation $\psi^c \equiv C\bar\psi^T$ represents the charge conjugate fields. Although it is commonly
assumed that there is one right-handed neutrino per generation (as required in SO(10) Grand
Unified Theories (GUTs) [17]), this need not be so in the bottom-up approach considered
here. In fact, as will be shown in Section 5, in the context of searches for CP violation, it is
phenomenologically more interesting if the model contains at least four right-handed states. In
order to maintain generality, we will consider adding $n_R$ right-handed states, where $n_R$ can be
any positive integer. The elements of the complex matrices $m_D$ and $m_M$ give rise to Dirac and
Majorana mass terms for the neutrinos, respectively. The only constraint on their structure is that
$m_M$ must be symmetric.

The Majorana mass-eigenstate neutrinos are related to the weak-eigenstates through

\[ \begin{pmatrix} \nu_L^0 \\ (\nu_R^0)^c \end{pmatrix} = U^T \begin{pmatrix} \nu_L \\ N_L \end{pmatrix}. \tag{2.2} \]

The states represented by $\nu$ are the three observed light neutrinos, whereas $N$ represents extra
heavy neutrinos (of which there will be as many as right-handed weak-eigenstates). $U$ is a
$(3 + n_R) \times (3 + n_R)$ unitary matrix chosen such that

\[ U^T \begin{pmatrix} 0 & m_D \\ m_D^T & m_M \end{pmatrix} U = \mathrm{diag}(m_1, \ldots, m_{3+n_R}), \tag{2.3} \]

where $m_1, \ldots, m_{3+n_R}$ are the physical neutrino masses.

Since $m_D$ is derived from the Higgs mechanism, it is most natural to assume that its elements
should be of order the vacuum expectation value of the Higgs field. By contrast, $m_M$ is unrelated
to any SM observables and so could be as large as the GUT scale. This observation leads to
the popular seesaw mechanism, by which the extreme smallness of the light neutrino masses
is explained through the large hierarchy between these scales [18]. For a recent discussion within
the context of GUT neutrino models, see [19]. Unfortunately, generic seesaw scenarios are not
phenomenologically very interesting, since the heavy neutrinos are predicted to be extremely
heavy (of order the GUT scale) and also have their couplings to SM particles highly suppressed.
More interesting scenarios for collider physics can be formed by introducing approximate flavour
symmetries that impose structure on the mass matrices $m_D$ and $m_M$ [3,12,20,21]. This can then


allow the heavy neutrino couplings to be completely independent of the light neutrino masses. In

such theories it is possible to have heavy neutrinos with masses of order 100 GeV and significant

couplings to SM particles, without being in contradiction with light neutrino data.
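The generic seesaw suppression mentioned above can be made concrete with a one-generation toy version of the mass matrix in (2.1), in which the Dirac and Majorana masses are single real numbers. The sketch below is only an illustration under that assumption; the function name and the input values (a 100 GeV Dirac mass and a $10^{14}$ GeV Majorana mass) are hypothetical and not taken from the text.

```python
import math


def seesaw_masses(m_dirac, m_majorana):
    """Absolute eigenvalues of [[0, m_D], [m_D, m_M]] for real inputs
    with m_M >= 0.

    The heavy eigenvalue is computed directly; the light one follows from
    the determinant, |m_light * m_heavy| = m_D**2, which avoids the
    numerical cancellation in (m_M - sqrt(...)) / 2.
    """
    root = math.sqrt(m_majorana ** 2 + 4.0 * m_dirac ** 2)
    m_heavy = 0.5 * (m_majorana + root)
    m_light = m_dirac ** 2 / m_heavy
    return m_light, m_heavy


# Hypothetical inputs in GeV: electroweak-scale m_D, very heavy m_M.
m_light, m_heavy = seesaw_masses(100.0, 1.0e14)
```

With these inputs the light mass is about $10^{-10}$ GeV (0.1 eV) while the heavy state sits at $10^{14}$ GeV, far beyond collider reach, which is why models with additional flavour structure are of more interest for the LHC.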

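Writing the Lagrangian for neutrino interactions in terms of the mass-eigenstates gives [3]

\[ \mathcal{L}_W = -\frac{g}{\sqrt{2}} \, W^-_\mu \, \bar{l}_i \gamma^\mu P_L B_{ij} \begin{pmatrix} \nu \\ N \end{pmatrix}_j + \mathrm{h.c.}, \tag{2.4} \]

\[ \mathcal{L}_G = -\frac{g}{\sqrt{2} M_W} \, G^- \, \bar{l}_i \left[ m_{l_i} P_L - m_j P_R \right] B_{ij} \begin{pmatrix} \nu \\ N \end{pmatrix}_j + \mathrm{h.c.}, \tag{2.5} \]

\[ \mathcal{L}_Z = -\frac{g}{4\cos\theta_w} \, Z_\mu \, (\bar\nu \;\; \bar N)_i \, \gamma^\mu \left[ P_L C_{ij} - P_R C^*_{ij} \right] \begin{pmatrix} \nu \\ N \end{pmatrix}_j, \tag{2.6} \]

\[ \mathcal{L}_H = -\frac{g}{4 M_W} \, H \, (\bar\nu \;\; \bar N)_i \left[ (m_i P_L + m_j P_R) C_{ij} + (m_j P_L + m_i P_R) C^*_{ij} \right] \begin{pmatrix} \nu \\ N \end{pmatrix}_j, \tag{2.7} \]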

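\[ \mathcal{L}_{G^0} = \frac{ig}{4 M_W} \, G^0 \, (\bar\nu \;\; \bar N)_i \left[ (m_i P_L - m_j P_R) C_{ij} + (m_j P_L - m_i P_R) C^*_{ij} \right] \begin{pmatrix} \nu \\ N \end{pmatrix}_j. \tag{2.8} \]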

The matrices $B$ and $C$ in the above are given by

\[ B_{ij} = \sum_{k=1}^{3} V^L_{ki} U^*_{kj}, \qquad C_{ij} = \sum_{k=1}^{3} U_{ki} U^*_{kj} = \sum_{k=1}^{3+n_R} C_{ik} C^*_{jk}, \tag{2.9} \]

where $V^L$ is the mixing matrix of the charged leptons. Without loss of generality, we assume that there is no mixing in the charged
leptons, i.e., $V^L$ is the unit matrix. This allows $B$ to be written as $B_{li}$ with $l = e, \mu, \tau$, and so

\[ C_{ij} = \sum_{l=1}^{3} B^*_{li} B_{lj}. \tag{2.10} \]

\[ \sum_{i=1}^{3+n_R} m_i B_{li} B_{l'i} = 0. \tag{2.11} \]

These are important when it comes to considering the viability of coupling scenarios which give
rise to CP violation, as is done in Section 5.

2.2. Heavy Dirac neutrino model

An alternative model in which the heavy neutrinos are Dirac particles can be constructed by
adding left-handed singlets $S_L^0$ to the theory in addition to the right-handed neutrino fields $\nu_R^0$.
These have no couplings to SM particles and only enter the theory through their mixings with the
other neutrinos. For simplicity, we shall assume that the same number of right-handed neutrinos
and left-handed singlets are included. This model can be obtained as the low-energy limit of
GUTs based on SO(10) [22] or $E_6$ [20,23] gauge groups.


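The Lagrangian for neutrino masses is then given by

\[ \mathcal{L}_{mass} = -\frac{1}{2} \begin{pmatrix} \bar\nu_L^0 & (\bar\nu_R^0)^c & \bar S_L^0 \end{pmatrix} \begin{pmatrix} 0 & m_D & 0 \\ m_D^T & 0 & M^T \\ 0 & M & 0 \end{pmatrix} \begin{pmatrix} (\nu_L^0)^c \\ \nu_R^0 \\ (S_L^0)^c \end{pmatrix} + \mathrm{h.c.} \tag{2.12} \]

As in the previous model, $m_D$ and $M$ are complex matrices. This mass matrix can be diagonalised
through the rotations

\[ \begin{pmatrix} \nu_L^0 \\ S_L^0 \end{pmatrix} = U_L^T \begin{pmatrix} \nu_L \\ S_L \end{pmatrix}, \qquad \nu_R = U_R \nu_R^0, \tag{2.13} \]

where $U_L$ is a $(3 + n_R) \times (3 + n_R)$ and $U_R$ an $n_R \times n_R$ unitary matrix. If these are chosen
appropriately, the Lagrangian given in (2.12) can then be expressed as

\[ \mathcal{L}_{mass} = -\frac{1}{2} \begin{pmatrix} \bar\nu_L & (\bar\nu_R)^c & \bar S_L \end{pmatrix} \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & M_N \\ 0 & M_N & 0 \end{pmatrix} \begin{pmatrix} (\nu_L)^c \\ \nu_R \\ (S_L)^c \end{pmatrix} + \mathrm{h.c.}, \tag{2.14} \]

with $M_N$ diagonal. With the identifications $S_L \equiv N_L$ and $\nu_R \equiv N_R$, this is then a mass term for
three massless neutrinos $\nu$ $^2$ and $n_R$ massive Dirac neutrinos $N$.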

The three weak-eigenstates $\nu_L^0$ in this theory are related to the mass-eigenstates through a
$(3 + n_R) \times (3 + n_R)$ unitary matrix, just as in the previous theory without singlets. Hence,
the Lagrangian for their interactions with $W^\pm$ and $G^\pm$ bosons is given by (2.4) and (2.5),
with $U_L$ replacing $U$ in the definition of $B_{lj}$. However, since the neutrinos are Dirac particles,
the Lagrangian for their couplings to the $Z$, $H$ and $G^0$ bosons differs from the corresponding
one for Majorana particles [cf. (2.6)–(2.8)]. It only contains terms proportional to $C_{ij}$, not $C^*_{ij}$,
and is given by

\[ \mathcal{L}_Z = -\frac{g}{2\cos\theta_w} \, Z_\mu \, (\bar\nu \;\; \bar N)_i \, \gamma^\mu P_L C_{ij} \begin{pmatrix} \nu \\ N \end{pmatrix}_j, \tag{2.15} \]

\[ \mathcal{L}_H = -\frac{g}{2 M_W} \, H \, (\bar\nu \;\; \bar N)_i \, (m_i P_L + m_j P_R) C_{ij} \begin{pmatrix} \nu \\ N \end{pmatrix}_j, \tag{2.16} \]

\[ \mathcal{L}_{G^0} = \frac{ig}{2 M_W} \, G^0 \, (\bar\nu \;\; \bar N)_i \, (m_i P_L - m_j P_R) C_{ij} \begin{pmatrix} \nu \\ N \end{pmatrix}_j, \tag{2.17} \]

where again $U_L$ replaces $U$ in the definition of $C_{ij}$.

Dirac neutrinos can be considered as the limit of two degenerate Majorana neutrinos, say $N_i$
and $N_j$, whose couplings are related through $B_{li} = iB_{lj}$. It is easy to see then that for these,
Eq. (2.11) is automatically satisfied and hence will not act as a constraint for Dirac neutrinos.
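The statement that a degenerate pair with $B_{li} = iB_{lj}$ drops out of the sum in (2.11) is a one-line cancellation, since $(iB_{lj})(iB_{l'j}) = -B_{lj}B_{l'j}$. The short check below illustrates this numerically; the coupling values, the common mass and the function name are arbitrary illustrative choices.

```python
def pair_contribution(mass, b_l, b_lprime):
    """Contribution m * (B_li * B_l'i + B_lj * B_l'j) of a degenerate pair
    (N_i, N_j) to the mass-weighted sum, where b_l and b_lprime are the
    couplings of N_j and the couplings of N_i are i times those of N_j."""
    b_l_i = 1j * b_l
    b_lprime_i = 1j * b_lprime
    return mass * (b_l_i * b_lprime_i + b_l * b_lprime)


# Hypothetical couplings of N_j to two lepton flavours l and l'.
total = pair_contribution(250.0, 0.03 + 0.01j, 0.02 - 0.04j)
```

The result vanishes up to floating-point rounding for any choice of the two couplings, which is why the condition places no restriction on the Dirac case.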

2 Although this has already been ruled out, the model can be made compatible with experiment by adding small
Majorana mass terms for the singlets, e.g., $\bar{S}_L^0 (S_L^0)^c$ [23]. After diagonalisation, these translate to small Majorana
masses for the light neutrinos. However, this will have no effect on any collider observables, since these masses are tiny
compared to the energy scales involved.


In both of the models described above, the weak-eigenstate neutrinos are related to the mass-eigenstates through

\[ \nu^0_{Ll} = \sum_{i=1}^{3+n_R} B_{li} \begin{pmatrix} \nu_L \\ N_L \end{pmatrix}_i, \tag{2.18} \]

where $l = e, \mu, \tau$. To make the notation clearer, $B$ is split up into two parts relating to the light
and heavy states

\[ \nu^0_{Ll} = \sum_{i=1}^{3} B_{l\nu_i} \nu_{Li} + \sum_{i=1}^{n_R} B_{lN_i} N_{Li}. \tag{2.19} \]

\[ \Omega_{ll'} \equiv \delta_{ll'} - \sum_{i=1}^{3} B_{l\nu_i} B^*_{l'\nu_i} = \sum_{i=1}^{n_R} B_{lN_i} B^*_{l'N_i}, \tag{2.20} \]

which is a generalisation of the Langacker–London parameters $(s_L^{\nu_{e,\mu,\tau}})^2$ [24], with the identification $\Omega_{ll} = (s_L^{\nu_l})^2$. The $3 \times 3$ matrix $B_{l\nu}$ is, to a good approximation, the Pontecorvo–Maki–Nakagawa–Sakata (PMNS) matrix [25], giving the mixing of the three left-handed neutrinos. Any
deviation from unitarity of the PMNS matrix is given by $\Omega_{ll'}$, and would constitute evidence for
new physics, such as heavy neutrinos.

Constraints on $\Omega_{ll'}$ come from LEP and low-energy electroweak data [24,26–29]. Tree-level
processes with light neutrinos in the final state can be used as a probe by looking for a reduction
of the couplings of the light neutrinos from their SM values. A global analysis of such processes
gives the upper limits [28]

\[ \Omega_{ee} \le 0.012, \qquad \Omega_{\mu\mu} \le 0.0096, \qquad \Omega_{\tau\tau} \le 0.016, \tag{2.21} \]

at 90% confidence level. These are mostly model independent and depend only weakly on the
heavy neutrino masses.
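As an illustration of how (2.20) and (2.21) are used, the sketch below evaluates the heavy-neutrino sum for a set of couplings $B_{lN_i}$ and compares the diagonal entries with the quoted limits. The coupling values and all names are hypothetical, chosen only to show the bookkeeping; they are not taken from the paper.

```python
# 90% CL upper limits on the diagonal entries, as quoted in Eq. (2.21).
DIAGONAL_LIMITS = {"e": 0.012, "mu": 0.0096, "tau": 0.016}


def omega(b_heavy, l1, l2):
    """Sum over heavy states of B_{l1 N_i} * conj(B_{l2 N_i})."""
    return sum(b1 * b2.conjugate()
               for b1, b2 in zip(b_heavy[l1], b_heavy[l2]))


def within_diagonal_limits(b_heavy):
    """True if every diagonal entry respects its limit in DIAGONAL_LIMITS."""
    return all(omega(b_heavy, l, l).real <= limit
               for l, limit in DIAGONAL_LIMITS.items())


# Hypothetical couplings of e, mu, tau to two heavy neutrinos N_1 and N_2.
couplings = {
    "e": [0.05 + 0j, 0.05j],
    "mu": [0.08 + 0j, 0j],
    "tau": [0j, 0.10 + 0j],
}

omega_ee = omega(couplings, "e", "e").real
omega_emu = omega(couplings, "e", "mu")
allowed = within_diagonal_limits(couplings)
```

This toy set passes the diagonal limits, but its off-diagonal entry of 0.004 between $e$ and $\mu$ is far larger than the bound in (2.22), which shows why the LFV limits are usually the more restrictive ones unless cancellations occur.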

LFV decays such as , e , , eee, e conversion in nuclei and Z l + l

also constrain the couplings. Heavy neutrinos contribute in loops, as such the limits obtained

from these depend on the heavy neutrino masses and Yukawa couplings [27]. For mN MW

and mD MW , the limits derived including recent analyses of BaBar data [30], are

\[ |\Omega_{e\mu}| \le 0.0001, \qquad |\Omega_{e\tau}| \le 0.02, \qquad |\Omega_{\mu\tau}| \le 0.02. \tag{2.22} \]

Since the modulus lies outside of the sums, the applicability of these limits to the individual

couplings is limited as there can be cancellations between the contributions from different heavy

neutrinos. Furthermore, lepton-flavour violation is a very general signature of physics beyond the SM. Contributions from SUSY particles, for example, could also create cancellations that reduce the sum.

Attempts to set limits from neutrinoless double beta decay experiments run into similar problems. Non-observation translates to a bound for Majorana neutrinos of [31]

$$\left| \sum_i \frac{B_{eN_i}^2}{m_{N_i}} \right| < 5 \times 10^{-8}\ \mathrm{GeV}^{-1} . \tag{2.23}$$


Fig. 1. Feynman diagrams for the parton-level subprocesses relevant to heavy neutrino production at the LHC.

Although this would appear to severely constrain heavy Majorana neutrino couplings to the electron, this is again a sum in which large cancellations can occur between contributions from different particles. In particular, if heavy neutrinos are pseudo-Dirac, this constraint can be avoided, as there is an extra suppression factor $(m_{N_2} - m_{N_1})/(m_{N_2} + m_{N_1})$.

3. LHC signals

At the LHC, the dominant production mechanism for heavy neutrinos with masses in the 100–500 GeV range will be $q\bar{q}' \to (W^\pm)^* \to l^\pm N$, as shown in Fig. 1 [3–6]. Due to the enhanced contribution from the valence quarks, $W^+$ bosons will be produced more copiously than their charge conjugates, hence the process with an intermediate $W^+$ will give the larger signal. Since the Feynman graphs in Fig. 1 have a $W$ boson in the s-channel, the signal cross section falls dramatically as the mass of the heavy neutrino is increased. Hence, even though the LHC centre-of-mass-system (cms) energy will be 14 TeV, there is only an observable signal for neutrinos below about 400–500 GeV at most.

By far the cleanest signals come from the heavy neutrinos decaying as $N \to l^\mp W^\pm \to l^\mp jj$, with $l = e, \mu$ and $j$ representing a hadronic jet. All the decay products can then be detected, allowing the reconstruction of the invariant mass and, more importantly, the observation of lepton-flavour and lepton-number violation. Given this decay chain, signals that conserve $L$ will be of the form $pp \to l^\pm l'^\mp W^\pm X$, where $X$ represents the beam remnants and the $W$ boson is assumed to decay hadronically. In addition to these, for Majorana neutrinos only, LNV processes of the form $pp \to l^\pm l'^\pm W^\mp X$ are also possible. If observed, these would reveal the Dirac or Majorana nature of the heavy neutrinos.

In order to suppress the SM background, signals for heavy neutrinos must be LFV (which includes any LNV processes). Since both lepton-flavour and lepton-number violation are forbidden

in the SM, the backgrounds to these processes require extra light neutrinos in the final state. The

main source of this type of background will come from three W bosons. If two of these decay

leptonically and the third hadronically, this can mimic the signal apart from the additional light

neutrinos. Recent analyses of this process have concluded that such a background can be made

negligible after cuts [5,6]. In particular, a missing pT cut is very effective, since this should have

no effect on the signal.

4. The resummed heavy neutrino propagator

CP violation may originate from self-energy, vertex or higher order quantum corrections. In general these are small effects, since electroweak loop corrections themselves are small. However, if two or more of the heavy neutrinos are nearly degenerate in mass, then CP violation from self-energy corrections (often termed $\varepsilon$-type CP violation [10]) can be resonantly enhanced [11]. In fact, in the limit of degenerate heavy neutrinos, finite-order perturbation theory breaks down. A well defined field-theoretic formalism is based on a resummation of the self-energy graphs [11,16]. This approach is manifestly gauge invariant within the Pinch Technique (PT) framework [32,33] and maintains other field-theoretic properties, such as unitarity and CPT invariance. Our formalism involves the absorptive part of the heavy neutrino self-energy, which is computed here at the one-loop level. An important point regarding this formalism is that both the diagonal and off-diagonal elements of the self-energy must be inserted into the heavy neutrino propagator matrix. This is crucial, since for small mass-splittings, the off-diagonal elements play a major role.

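Following this approach, the propagator for a system of two heavy neutrinos is given by3

$$S(\slashed{p}) = \begin{pmatrix} \slashed{p} - m_1 + i\,\mathrm{Im}\,\Sigma_{11}(\slashed{p}) & i\,\mathrm{Im}\,\Sigma_{12}(\slashed{p}) \\ i\,\mathrm{Im}\,\Sigma_{21}(\slashed{p}) & \slashed{p} - m_2 + i\,\mathrm{Im}\,\Sigma_{22}(\slashed{p}) \end{pmatrix}^{-1} , \tag{4.1}$$

where $\mathrm{Im}\,\Sigma_{ij}(\slashed{p})$ is the absorptive part of the heavy neutrino self-energy.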

4.1. Majorana neutrinos

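For Majorana neutrinos, $\mathrm{Im}\,\Sigma_{ij}(\slashed{p})$ is of the form

$$\mathrm{Im}\,\Sigma_{ij}(\slashed{p}) = A_{ij}(s)\,\slashed{p} P_L + A^*_{ij}(s)\,\slashed{p} P_R , \tag{4.2}$$

Writing the propagator as

$$S_M(\slashed{p}) = D_M(s)\,\slashed{p} P_L + E_M(s)\,\slashed{p} P_R + F_M(s) P_L + G_M(s) P_R , \tag{4.3}$$

the matrices $D_M$, $E_M$, $F_M$ and $G_M$ are given by

$$D_M = \frac{1}{Z_M} \begin{pmatrix} X_{22}\tilde{A}_{11} + s|A_{12}|^2 \tilde{A}_{22} & -i(sA_{21}Y + m_1 m_2 A_{12}) \\ -i(sA_{12}Y + m_1 m_2 A_{21}) & X_{11}\tilde{A}_{22} + s|A_{12}|^2 \tilde{A}_{11} \end{pmatrix} , \tag{4.4}$$

$$E_M = \frac{1}{Z_M} \begin{pmatrix} X_{22}\tilde{A}_{11} + s|A_{12}|^2 \tilde{A}_{22} & -i(sA_{12}Y + m_1 m_2 A_{21}) \\ -i(sA_{21}Y + m_1 m_2 A_{12}) & X_{11}\tilde{A}_{22} + s|A_{12}|^2 \tilde{A}_{11} \end{pmatrix} , \tag{4.5}$$

$$F_M = \frac{1}{Z_M} \begin{pmatrix} X_{22} m_1 - s m_2 A_{12}^2 & -is(m_1 A_{21}\tilde{A}_{22} + m_2 A_{12}\tilde{A}_{11}) \\ -is(m_1 A_{21}\tilde{A}_{22} + m_2 A_{12}\tilde{A}_{11}) & X_{11} m_2 - s m_1 A_{21}^2 \end{pmatrix} , \tag{4.6}$$

$$G_M = \frac{1}{Z_M} \begin{pmatrix} X_{22} m_1 - s m_2 A_{21}^2 & -is(m_1 A_{12}\tilde{A}_{22} + m_2 A_{21}\tilde{A}_{11}) \\ -is(m_1 A_{12}\tilde{A}_{22} + m_2 A_{21}\tilde{A}_{11}) & X_{11} m_2 - s m_1 A_{12}^2 \end{pmatrix} , \tag{4.7}$$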

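where

$$\begin{aligned} \tilde{A}_{ii} &= 1 + iA_{ii}, \qquad X_{ii} = s\tilde{A}_{ii}^2 - m_i^2, \qquad Y = |A_{12}|^2 + \tilde{A}_{11}\tilde{A}_{22}, \\ Z_M &= X_{11}X_{22} + s m_1 m_2 \left( A_{12}^2 + A_{21}^2 \right) + s^2 |A_{12}|^2 \left( 2\tilde{A}_{11}\tilde{A}_{22} + |A_{12}|^2 \right) . \end{aligned} \tag{4.8}$$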

$$E_M[A^*] = D_M[A], \qquad E_M = D_M^T, \qquad G_M[A^*] = F_M[A], \qquad F_M = F_M^T, \qquad G_M = G_M^T . \tag{4.9}$$

3 In all that follows the light neutrino masses have been neglected. Hence, to simplify the notation we use $m_i \equiv m_{N_i}$. Also, we shall use $B_{li} \equiv B_{lN_i}$; this should not create any confusion, since we are concerned hereafter only with the couplings of the heavy neutrinos.

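$$M = \begin{pmatrix} m_1 & 0 \\ 0 & m_2 \end{pmatrix} , \qquad \tilde{A} = \begin{pmatrix} 1 + iA_{11} & iA_{12} \\ iA_{21} & 1 + iA_{22} \end{pmatrix} , \tag{4.10}$$

$$S_M^{-1}(\slashed{p}) = \tilde{A}\,\slashed{p} P_L + \tilde{A}^T \slashed{p} P_R - M . \tag{4.11}$$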

Combining this with (4.3), the property $SS^{-1} = 1$ gives the further equalities

$$sE_M\tilde{A} - F_M M = 1, \qquad G_M\tilde{A} = D_M M, \qquad sD_M\tilde{A}^T - G_M M = 1, \qquad F_M\tilde{A}^T = E_M M . \tag{4.12}$$

These relations, which can be directly verified for the matrices given in (4.4)–(4.7), are important for checking that CPT invariance is preserved in the theory. More details will be given in Section 5.
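The relations (4.12) can also be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes a Hermitian $2\times2$ matrix $A$ (so that $A^* = A^T$ and $|A_{12}|^2 = A_{12}A_{21}$), and the function names and the sample values of $A$, $m_1$ and $m_2$ are purely illustrative.

```python
import numpy as np

def majorana_blocks(s, m1, m2, A):
    """Return (D, E, F, G) of Eqs. (4.4)-(4.7) for a Hermitian 2x2 matrix A."""
    A11, A12, A21, A22 = A[0, 0], A[0, 1], A[1, 0], A[1, 1]
    At11, At22 = 1 + 1j * A11, 1 + 1j * A22           # A-tilde_ii of Eq. (4.8)
    a12sq = abs(A12) ** 2
    X11 = s * At11**2 - m1**2
    X22 = s * At22**2 - m2**2
    Y = a12sq + At11 * At22
    Z = (X11 * X22 + s * m1 * m2 * (A12**2 + A21**2)
         + s**2 * a12sq * (2 * At11 * At22 + a12sq))
    E = np.array([
        [X22 * At11 + s * a12sq * At22, -1j * (s * A12 * Y + m1 * m2 * A21)],
        [-1j * (s * A21 * Y + m1 * m2 * A12), X11 * At22 + s * a12sq * At11],
    ]) / Z
    D = E.T                                            # Eq. (4.9)
    f_off = -1j * s * (m1 * A21 * At22 + m2 * A12 * At11)
    F = np.array([[X22 * m1 - s * m2 * A12**2, f_off],
                  [f_off, X11 * m2 - s * m1 * A21**2]]) / Z
    g_off = -1j * s * (m1 * A12 * At22 + m2 * A21 * At11)
    G = np.array([[X22 * m1 - s * m2 * A21**2, g_off],
                  [g_off, X11 * m2 - s * m1 * A12**2]]) / Z
    return D, E, F, G

def satisfies_4_12(s, m1, m2, A):
    """Check the four matrix identities of Eq. (4.12)."""
    D, E, F, G = majorana_blocks(s, m1, m2, A)
    M = np.diag([m1, m2]).astype(complex)
    At = np.eye(2) + 1j * A                            # A-tilde of Eq. (4.10)
    one = np.eye(2)
    return (np.allclose(s * E @ At - F @ M, one)
            and np.allclose(G @ At, D @ M)
            and np.allclose(s * D @ At.T - G @ M, one)
            and np.allclose(F @ At.T, E @ M))

# Illustrative (assumed) inputs: two nearly degenerate states, small Hermitian A.
A_demo = np.array([[1.0e-3, 3.0e-4 + 2.0e-4j],
                   [3.0e-4 - 2.0e-4j, 1.2e-3]])
m1_demo, m2_demo = 300.0, 300.5
s_demo = 0.5 * (m1_demo**2 + m2_demo**2)
print(satisfies_4_12(s_demo, m1_demo, m2_demo, A_demo))
```

The check is algebraic, so it should hold for any $s$ and not only near the resonance.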

4.2. Dirac neutrinos

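For Dirac neutrinos, the absorptive part of their self-energy has only the left-handed component, viz.

$$\mathrm{Im}\,\Sigma_{ij}(\slashed{p}) = A_{ij}(s)\,\slashed{p} P_L . \tag{4.13}$$

Writing the propagator as

$$S_D(\slashed{p}) = D_D(s)\,\slashed{p} P_L + E_D(s)\,\slashed{p} P_R + F_D(s) P_L + G_D(s) P_R , \tag{4.14}$$

the matrices $D_D$, $E_D$, $F_D$ and $G_D$ are given by the expressions

$$D_D = \frac{1}{Z_D} \begin{pmatrix} (s\tilde{A}_{22} - m_2^2)\tilde{A}_{11} + s|A_{12}|^2 & -iA_{12} m_1 m_2 \\ -iA_{21} m_1 m_2 & (s\tilde{A}_{11} - m_1^2)\tilde{A}_{22} + s|A_{12}|^2 \end{pmatrix} , \tag{4.15}$$

$$E_D = \frac{1}{Z_D} \begin{pmatrix} s\tilde{A}_{22} - m_2^2 & -isA_{12} \\ -isA_{21} & s\tilde{A}_{11} - m_1^2 \end{pmatrix} , \tag{4.16}$$

$$F_D = \frac{1}{Z_D} \begin{pmatrix} (s\tilde{A}_{22} - m_2^2) m_1 & -is m_2 A_{12} \\ -is m_1 A_{21} & (s\tilde{A}_{11} - m_1^2) m_2 \end{pmatrix} , \tag{4.17}$$

$$G_D = \frac{1}{Z_D} \begin{pmatrix} (s\tilde{A}_{22} - m_2^2) m_1 & -is m_1 A_{12} \\ -is m_2 A_{21} & (s\tilde{A}_{11} - m_1^2) m_2 \end{pmatrix} , \tag{4.18}$$

with

$$Z_D = \left( s\tilde{A}_{11} - m_1^2 \right)\left( s\tilde{A}_{22} - m_2^2 \right) + s^2 |A_{12}|^2 . \tag{4.19}$$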

The inverse propagator can be expressed as

$$S_D^{-1}(\slashed{p}) = \tilde{A}\,\slashed{p} P_L + \slashed{p} P_R - M , \tag{4.20}$$

Fig. 2. Feynman graphs contributing to the one-loop self-energy of heavy neutrinos. For Dirac neutrinos, only the LNC

graphs exist.

$$sE_D\tilde{A} - F_D M = 1, \qquad G_D\tilde{A} = D_D M, \qquad sD_D - G_D M = 1, \qquad F_D = E_D M . \tag{4.21}$$

Calculating the absorptive part of the self-energy at one-loop level in the Feynman–'t Hooft gauge4 (for which the Feynman graphs are given in Fig. 2), the matrix $A(s)$, which is the same for both Dirac and Majorana neutrinos, is given by

$$\begin{aligned} A_{ij}(s) = \frac{g^2 C_{ij}}{128\pi s^2 M_W^2} \Big[ & 4M_W^2 (s - M_W^2)^2\,\theta(s - M_W^2) + 2M_Z^2 (s - M_Z^2)^2\,\theta(s - M_Z^2) \\ & + m_i m_j \Big( 2(s - M_W^2)^2\,\theta(s - M_W^2) + (s - M_Z^2)^2\,\theta(s - M_Z^2) + (s - M_H^2)^2\,\theta(s - M_H^2) \Big) \Big] , \end{aligned} \tag{4.22}$$

where $\theta$ is the step function. The tree-level heavy neutrino widths can then be obtained through $\Gamma_{N_i} = m_i A_{ii}(m_i^2)$ for Dirac neutrinos and $\Gamma_{N_i} = 2 m_i A_{ii}(m_i^2)$ for Majorana neutrinos.
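As an illustration of how (4.22) translates into widths, the sketch below evaluates $A_{ij}(s)$ and $\Gamma_{N_i}$ numerically. It is only a sketch under stated assumptions: the values of $g^2$, $M_W$ and $M_Z$ are assumed round inputs not taken from the text, $M_H = 120$ GeV is the value used in Section 6, and the function names are hypothetical.

```python
import math

G_SQUARED = 0.4265               # assumed value of g^2 (SU(2) gauge coupling squared)
M_W, M_Z, M_H = 80.4, 91.19, 120.0   # GeV; M_W, M_Z assumed, M_H as in Section 6

def threshold_term(s, mass):
    """(s - M^2)^2 * theta(s - M^2): vanishes below the two-body threshold."""
    return (s - mass**2) ** 2 if s > mass**2 else 0.0

def absorptive_A(s, C_ij, m_i, m_j):
    """A_ij(s) of Eq. (4.22); C_ij is the coupling factor appearing there."""
    bracket = (4 * M_W**2 * threshold_term(s, M_W)
               + 2 * M_Z**2 * threshold_term(s, M_Z)
               + m_i * m_j * (2 * threshold_term(s, M_W)
                              + threshold_term(s, M_Z)
                              + threshold_term(s, M_H)))
    return G_SQUARED * C_ij * bracket / (128 * math.pi * s**2 * M_W**2)

def tree_level_width(m, C_ii, majorana=False):
    """Gamma = m A_ii(m^2) (Dirac) or 2 m A_ii(m^2) (Majorana), in GeV."""
    factor = 2.0 if majorana else 1.0
    return factor * m * absorptive_A(m**2, C_ii, m, m)

# Example call with an assumed coupling factor C_ii = 0.0075.
print(tree_level_width(300.0, 0.0075), tree_level_width(300.0, 0.0075, majorana=True))
```

For $m \gg M_W, M_Z, M_H$ the bracket is dominated by the $m_i m_j$ term, so the Dirac width in this sketch tends to $g^2 C_{ii} m^3 / (32\pi M_W^2)$.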

5. CP asymmetries in $l^\mp W^\pm \to l'^\mp W^\pm$ type processes

For the LHC signals described in Section 3, the heavy neutrino propagator is coupled to a charged lepton and $W$ boson at each end. The asymmetries between such processes and their CP-conjugates will thus be the same as for the $2 \to 2$ processes $l^\mp W^\pm \to l'^\mp W^\pm$. This can be understood since the fermion line containing the heavy neutrino is the same in the Feynman graphs for both the $2 \to 3$ signals and the corresponding $2 \to 2$ processes. The fact that one of the charged leptons changes from being a final state particle to an initial state particle will not much affect the size of the CP asymmetries.

As a result of CPT invariance, if all possible final states $X$ are summed over, then [16]

$$\sigma(l^+ W^- \to N \to X) = \sigma(l^- W^+ \to N \to X) . \tag{5.1}$$

However, it is possible for the asymmetry between the cross sections for producing any particular final state and its CP-conjugate to be large.

Considering LNC processes first, the CP-violating difference between $\sigma(q\bar{q}' \to l^+ l'^- W^+)$ and $\sigma(\bar{q} q' \to l^- l'^+ W^-)$ will thus be proportional to that between $\sigma(l^- W^+ \to l'^- W^+)$ and $\sigma(l^+ W^- \to l'^+ W^-)$. The only relevant parts of the cross sections are those that involve the couplings of the heavy neutrinos; any pair of CP-conjugate processes will otherwise be identical.

4 We note that the PT result for fermion self-energies coincides with that obtained in the Feynman–'t Hooft gauge [32,33].

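The CP-violating difference is then proportional to

$$\Delta^{\rm LNC}_{CP}\Big|_{\rm Dirac} = \left| B_{l'} E_D(s) B_l^\dagger \right|^2 - \left| B_l E_D(s) B_{l'}^\dagger \right|^2 , \tag{5.2}$$

$$\Delta^{\rm LNC}_{CP}\Big|_{\rm Majorana} = \left| B_{l'} E_M(s) B_l^\dagger \right|^2 - \left| B_l E_M(s) B_{l'}^\dagger \right|^2 , \tag{5.3}$$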

depending on whether the heavy neutrino is a Dirac or Majorana particle. In the above, $B_l = (B_{l1}, B_{l2})$, with $N_1$ and $N_2$ being the two nearly degenerate heavy neutrinos involved. As is a direct consequence of CPT invariance, these vanish if $l$ and $l'$, the two charged leptons that the heavy neutrinos couple to, are the same. Since $E[B^*] = E^T[B]$ for both Dirac and Majorana neutrinos, the two terms in $\Delta^{\rm LNC}_{CP}$ transform into each other under $B \to B^*$, hence confirming that they represent CP-conjugates. Using the expression for $E_D$ given in (4.16), $\Delta^{\rm LNC}_{CP}|_{\rm Dirac}$ is given by

$$\Delta^{\rm LNC}_{CP}\Big|_{\rm Dirac} = \frac{4s(s - m_2^2)}{|Z_D|^2} \Big\{ A_{11}\,\mathrm{Im}[B^*_{l1} B_{l'1} B_{l2} B^*_{l'2}] - |B_{l1}|^2\,\mathrm{Im}[A_{12} B_{l'1} B^*_{l'2}] + |B_{l'1}|^2\,\mathrm{Im}[A_{12} B_{l1} B^*_{l2}] \Big\} + (1 \leftrightarrow 2) . \tag{5.4}$$

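The expression for $\Delta^{\rm LNC}_{CP}|_{\rm Majorana}$ is rather more complicated, so it is pragmatic to work with an approximation when performing analytic calculations. The elements of $A$ are very small (of order the heavy neutrino widths divided by their masses), so terms above first order in them can be dropped to a very good approximation. Doing this, the matrix $E_M$ is approximated as

$$E_M \approx \frac{1}{Z_M} \begin{pmatrix} (s - m_2^2)\tilde{A}_{11} + 2isA_{22} & -i(sA_{12} + m_1 m_2 A_{21}) \\ -i(sA_{21} + m_1 m_2 A_{12}) & (s - m_1^2)\tilde{A}_{22} + 2isA_{11} \end{pmatrix} . \tag{5.5}$$

Using this approximation, $\Delta^{\rm LNC}_{CP}|_{\rm Majorana}$ is given by

$$\begin{aligned} \Delta^{\rm LNC}_{CP}\Big|_{\rm Majorana} = \frac{4(s - m_2^2)}{|Z_M|^2} \Big\{ & (s + m_1^2) A_{11}\,\mathrm{Im}[B^*_{l1} B_{l'1} B_{l2} B^*_{l'2}] \\ & - |B_{l1}|^2 \Big( s\,\mathrm{Im}[A_{12} B_{l'1} B^*_{l'2}] + m_1 m_2\,\mathrm{Im}[A_{21} B_{l'1} B^*_{l'2}] \Big) \\ & + |B_{l'1}|^2 \Big( s\,\mathrm{Im}[A_{12} B_{l1} B^*_{l2}] + m_1 m_2\,\mathrm{Im}[A_{21} B_{l1} B^*_{l2}] \Big) \Big\} + (1 \leftrightarrow 2) . \end{aligned} \tag{5.6}$$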

The propagator for Dirac neutrinos simply contains a subset of the terms in the corresponding one for Majorana neutrinos, so the same is also true of the CP-violating expressions (5.4) and (5.6). The extra terms that appear for Majorana neutrinos all have an $m^2$ mass dependence. These are due to interference between graphs without a chirality flip in them and graphs with a double chirality flip, the latter not appearing for Dirac neutrinos. There will also be higher order (in the elements of $A$) terms for Majorana neutrinos that have been neglected in our approximation for $E_M$.

An analogous expression can be derived for the LNV signals (assuming neutrinos are Majorana particles so such processes are allowed), which are of the form $\sigma(q\bar{q}' \to l^\pm l'^\pm W^\mp)$. The asymmetries between these signals are related to those between $\sigma(l^- W^+ \to l'^+ W^-)$ and $\sigma(l^+ W^- \to l'^- W^+)$, which are proportional to

$$\Delta^{\rm LNV}_{CP} = \left| B^*_{l'} G_M(s) B_l^\dagger \right|^2 - \left| B_{l'} F_M(s) B_l^T \right|^2 , \tag{5.7}$$

where only the contributions from the resonant s-channel diagrams are included. $F_M(s)$ and $G_M(s)$ are symmetric, so both terms in this expression are individually invariant under $l \leftrightarrow l'$. Also, since $G_M[B^*] = F_M[B]$, it can again be confirmed that these terms represent CP-conjugates, since they transform into each other under $B \to B^*$.

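In order to make the expressions manageable, we will continue to work with an approximation of the heavy Majorana neutrino propagator that drops higher order terms in the elements of the matrix $A$. With this approximation, the matrices $F_M$ and $G_M$ are given by

$$F_M \approx \frac{1}{Z_M} \begin{pmatrix} m_1 \left( s(1 + 2iA_{22}) - m_2^2 \right) & -is(m_1 A_{21} + m_2 A_{12}) \\ -is(m_1 A_{21} + m_2 A_{12}) & m_2 \left( s(1 + 2iA_{11}) - m_1^2 \right) \end{pmatrix} , \tag{5.8}$$

$$G_M \approx \frac{1}{Z_M} \begin{pmatrix} m_1 \left( s(1 + 2iA_{22}) - m_2^2 \right) & -is(m_1 A_{12} + m_2 A_{21}) \\ -is(m_1 A_{12} + m_2 A_{21}) & m_2 \left( s(1 + 2iA_{11}) - m_1^2 \right) \end{pmatrix} . \tag{5.9}$$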

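Using these expressions, the CP asymmetry for LNV processes of the type considered is

$$\begin{aligned} \Delta^{\rm LNV}_{CP} = \frac{4 s m_1 (s - m_2^2)}{|Z_M|^2} \Big\{ & 2 m_2 A_{11}\,\mathrm{Im}[B^*_{l1} B^*_{l'1} B_{l2} B_{l'2}] \\ & + |B_{l1}|^2 \Big( m_1\,\mathrm{Im}[A_{12} B_{l'1} B^*_{l'2}] + m_2\,\mathrm{Im}[A_{21} B_{l'1} B^*_{l'2}] \Big) \\ & + |B_{l'1}|^2 \Big( m_1\,\mathrm{Im}[A_{12} B_{l1} B^*_{l2}] + m_2\,\mathrm{Im}[A_{21} B_{l1} B^*_{l2}] \Big) \Big\} + (1 \leftrightarrow 2) . \end{aligned} \tag{5.10}$$

Since all first order graphs contributing to LNV processes have a single chirality flip in the propagator, all the terms in the above expression are proportional to $m^2$.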

5.1. Theoretical constraints (for Majorana neutrinos)

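For three heavy Majorana neutrinos, $B_{lN}$ is a $3 \times 3$ matrix. Ignoring the light neutrino masses, the constraints in (2.11) thus leave four of the heavy neutrino couplings as free parameters. For example, $B_{l1}$ ($l = e, \mu, \tau$) and $B_{e2}$ can be chosen; Eq. (2.11) is then satisfied by

$$B_{e3} = i \sqrt{\frac{m_1 B_{e1}^2 + m_2 B_{e2}^2}{m_3}} , \qquad B_{li} = \frac{B_{l1} B_{ei}}{B_{e1}} . \tag{5.11}$$

In order to see how this affects the expressions for the $\Delta_{CP}$'s, write the matrix $A$ as

$$A_{ij} = C_{ij} \left( a(s) + b(s) \frac{m_i m_j}{M_W^2} \right) , \tag{5.12}$$

where $a$ and $b$ are dimensionless real functions. Then, using (2.10),

$$\mathrm{Im}[A_{ij} B_{l1} B^*_{l2}] = \left( a + b \frac{m_i m_j}{M_W^2} \right) \sum_{l'} \mathrm{Im}[B^*_{l'i} B_{l'j} B_{l1} B^*_{l2}] , \tag{5.13}$$

which, after the application of (5.11), becomes

$$\left( a + b \frac{m_i m_j}{M_W^2} \right) \frac{|B_{l1}|^2 \sum_{l'} |B_{l'1}|^2}{|B_{e1}|^4}\,\mathrm{Im}[B^*_{ei} B_{ej} B_{e1} B^*_{e2}] . \tag{5.14}$$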

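With these relations, $\Delta^{\rm LNC}_{CP}|_{\rm Majorana}$ can be shown to vanish, while in (5.10), all terms proportional to $a$ cancel out and we are left with

$$\Delta^{\rm LNV}_{CP} = \frac{8 s\,b\,m_1 m_2 (m_1 - m_2)}{M_W^2 |Z_M|^2}\,|B_{l1}|^2 |B_{l'1}|^2 \sum_{l''} |B_{l''1}|^2 \left[ m_1 (s - m_2^2) + m_2 (s - m_1^2) \frac{|B_{e2}|^2}{|B_{e1}|^2} \right] \mathrm{Im}\left[ \frac{B_{e2}^2}{B_{e1}^2} \right] , \tag{5.15}$$

where

$$b = \frac{g^2 \left( 2(s - M_W^2)^2\,\theta(s - M_W^2) + (s - M_Z^2)^2\,\theta(s - M_Z^2) + (s - M_H^2)^2\,\theta(s - M_H^2) \right)}{128\pi s^2} . \tag{5.16}$$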

For neutrino masses accessible to colliders this is always very small, so no observable CP violation is possible for the model considered with three heavy Majorana neutrinos. This result

holds as long as only two of them are close enough in mass to have significant mixing in the

propagator. The case of all three neutrinos being nearly degenerate is more involved, so whether

large CP asymmetries can occur in this case may be studied elsewhere.

If at least four heavy neutrinos exist, Eq. (2.11) can be satisfied for any choice of the couplings of the two quasi-degenerate neutrinos. Scenarios which result in large CP asymmetries are thus possible, and it is in the context of such a theory that our numerical results for Majorana neutrinos are to be considered. As mentioned in Section 2.2, Eq. (2.11) is automatically satisfied for the model with Dirac neutrinos. As such, even with just two heavy neutrinos, large CP asymmetries can result here.

5.2. Scenarios with large CP asymmetries

For the purposes of our numerical calculations, we assume the couplings of the nearly degenerate neutrinos can be chosen independently for both Dirac and Majorana neutrinos. Although this implies at least four heavy neutrinos in the Majorana case, we also assume that only two of these are close in mass, such that we can use our $2 \times 2$ propagators given in Section 4. In order to determine scenarios which would result in large CP asymmetries, we consider just the kinematic point $s = \bar{s} = \frac{1}{2}(m_1^2 + m_2^2)$. To further simplify the expressions, we will also set the heavy neutrino couplings to all have a common magnitude $|B|$.

Introducing the notation $\Delta_{CP}(\bar{s}) \equiv \bar{\Delta}_{CP}$, $Z(\bar{s}) \equiv \bar{Z}$ and $A_{ij}(\bar{s}) \equiv \bar{A}_{ij}$, the CP asymmetry for Dirac neutrinos is given by

$$\begin{aligned} \bar{\Delta}^{\rm LNC}_{CP}\Big|_{\rm Dirac} = \frac{m_1^4 - m_2^4}{|\bar{Z}_D|^2} \Big\{ & (\bar{A}_{11} + \bar{A}_{22})\,\mathrm{Im}[B^*_{l1} B_{l'1} B_{l2} B^*_{l'2}] - \left( |B_{l1}|^2 + |B_{l2}|^2 \right) \mathrm{Im}[\bar{A}_{12} B_{l'1} B^*_{l'2}] \\ & + \left( |B_{l'1}|^2 + |B_{l'2}|^2 \right) \mathrm{Im}[\bar{A}_{12} B_{l1} B^*_{l2}] \Big\} . \end{aligned} \tag{5.17}$$

One way to obtain a large value for this expression is to maximise $\mathrm{Im}[B^*_{l1} B_{l'1} B_{l2} B^*_{l'2}]$, which can be done by having one of the four couplings imaginary and the rest real. The couplings to the third charged lepton (i.e., not $l$ or $l'$) can then be chosen such that $C_{12}$ is either real or imaginary, and so either $\mathrm{Im}[\bar{A}_{12} B_{l'1} B^*_{l'2}]$ or $\mathrm{Im}[\bar{A}_{12} B_{l1} B^*_{l2}]$ will be zero. If we now impose the condition that all the couplings have the same magnitude $|B|$, Eq. (5.17) becomes

$$\bar{\Delta}^{\rm LNC}_{CP}\Big|_{\rm Dirac} = \frac{|B|^4 \left( \bar{A}_{11} + \bar{A}_{22} - 2|\bar{A}_{12}| \right) \left( m_1^4 - m_2^4 \right)}{|\bar{Z}_D|^2} . \tag{5.18}$$

Although there is a partial cancellation between the elements of $\bar{A}$, $|\bar{A}_{12}|$ will only be about one third of the magnitude of the diagonal elements (which are approximately equal). This is because in $\bar{A}_{12}$, the contribution from the couplings to the third charged lepton cancels exactly with the contribution from either $l$ or $l'$. The equality $\bar{A}_{11} = \bar{A}_{22} = 3|\bar{A}_{12}|$ is not exact, however, since there are small differences due to the mass-splitting of the two heavy neutrinos.
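The factor of one third can be illustrated with the couplings of scenario (S3) from Section 6. The sketch below assumes the form $C_{ij} = \sum_l B^*_{li} B_{lj}$ implied by (5.13), and takes $B_{e2} = 0.05$, $B_{\mu 2} = 0.05i$, $B_{\tau 2} = -0.05$ as the reading of (S3); the array layout and function name are illustrative only.

```python
import numpy as np

# Rows: l = e, mu, tau; columns: N1, N2. Assumed reading of scenario (S3).
B_S3 = np.array([[0.05, 0.05],
                 [0.05, 0.05j],
                 [0.05, -0.05]])

def coupling_matrix(B):
    """C_ij = sum_l conj(B_li) * B_lj (assumed form of the coupling factor)."""
    return B.conj().T @ B

C = coupling_matrix(B_S3)
ratio = abs(C[0, 1]) / C[0, 0].real   # |C_12| / C_11
print(C)
print(ratio)
```

Here the electron and tau contributions to $C_{12}$ cancel, leaving only the imaginary muon term, so $|C_{12}| = C_{11}/3$ and $C_{12}$ is purely imaginary. Since $A_{ij} \propto C_{ij}$ up to the small $m_i m_j$ dependence, the same ratio carries over approximately to $\bar{A}$.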

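For Majorana neutrinos, the expression for $\bar{\Delta}^{\rm LNC}_{CP}|_{\rm Majorana}$ is

$$\begin{aligned} \bar{\Delta}^{\rm LNC}_{CP}\Big|_{\rm Majorana} = \frac{m_1^2 - m_2^2}{|\bar{Z}_M|^2} \Big\{ & \left[ (3m_1^2 + m_2^2)\bar{A}_{11} + (m_1^2 + 3m_2^2)\bar{A}_{22} \right] \mathrm{Im}[B^*_{l1} B_{l'1} B_{l2} B^*_{l'2}] \\ & - \left( |B_{l1}|^2 + |B_{l2}|^2 \right) \Big( (m_1^2 + m_2^2)\,\mathrm{Im}[\bar{A}_{12} B_{l'1} B^*_{l'2}] + 2 m_1 m_2\,\mathrm{Im}[\bar{A}_{21} B_{l'1} B^*_{l'2}] \Big) \\ & + \left( |B_{l'1}|^2 + |B_{l'2}|^2 \right) \Big( (m_1^2 + m_2^2)\,\mathrm{Im}[\bar{A}_{12} B_{l1} B^*_{l2}] + 2 m_1 m_2\,\mathrm{Im}[\bar{A}_{21} B_{l1} B^*_{l2}] \Big) \Big\} . \end{aligned} \tag{5.19}$$

If $\bar{A}_{12}$ is imaginary, then $\mathrm{Im}[\bar{A}_{12} B_{l1} B^*_{l2}] = -\mathrm{Im}[\bar{A}_{21} B_{l1} B^*_{l2}]$. Under the same assumption for the $B$-couplings, we find

$$\bar{\Delta}^{\rm LNC}_{CP}\Big|_{\rm Majorana} = \frac{|B|^4 (m_1^2 - m_2^2)}{|\bar{Z}_M|^2} \left[ (3m_1^2 + m_2^2)\bar{A}_{11} + (m_1^2 + 3m_2^2)\bar{A}_{22} \right] + O\left( (m_1 - m_2)^2 \right) . \tag{5.20}$$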

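For LNV processes, $\bar{\Delta}^{\rm LNV}_{CP}$ is given by

$$\begin{aligned} \bar{\Delta}^{\rm LNV}_{CP} = \frac{m_1^4 - m_2^4}{|\bar{Z}_M|^2} \Big\{ & 2 m_1 m_2 (\bar{A}_{11} + \bar{A}_{22})\,\mathrm{Im}[B^*_{l1} B^*_{l'1} B_{l2} B_{l'2}] \\ & + \left( m_1^2 |B_{l1}|^2 + m_2^2 |B_{l2}|^2 \right) \mathrm{Im}[\bar{A}_{12} B_{l'1} B^*_{l'2}] + m_1 m_2 \left( |B_{l1}|^2 + |B_{l2}|^2 \right) \mathrm{Im}[\bar{A}_{21} B_{l'1} B^*_{l'2}] \\ & + \left( m_1^2 |B_{l'1}|^2 + m_2^2 |B_{l'2}|^2 \right) \mathrm{Im}[\bar{A}_{12} B_{l1} B^*_{l2}] + m_1 m_2 \left( |B_{l'1}|^2 + |B_{l'2}|^2 \right) \mathrm{Im}[\bar{A}_{21} B_{l1} B^*_{l2}] \Big\} . \end{aligned} \tag{5.21}$$

$$\bar{\Delta}^{\rm LNV}_{CP} = \frac{2 m_1 m_2 |B|^4}{|\bar{Z}_M|^2} (\bar{A}_{11} + \bar{A}_{22}) \left( m_1^4 - m_2^4 \right) + O\left( (m_1 - m_2)^2 \right) . \tag{5.22}$$

If $l = l'$, then

$$\begin{aligned} \bar{\Delta}^{\rm LNV}_{CP} = \frac{m_1^4 - m_2^4}{|\bar{Z}_M|^2} \Big\{ & 2 m_1 m_2 (\bar{A}_{11} + \bar{A}_{22})\,\mathrm{Im}\left[ (B^*_{l1} B_{l2})^2 \right] + 2 \left( m_1^2 |B_{l1}|^2 + m_2^2 |B_{l2}|^2 \right) \mathrm{Im}[\bar{A}_{12} B_{l1} B^*_{l2}] \\ & + 2 m_1 m_2 \left( |B_{l1}|^2 + |B_{l2}|^2 \right) \mathrm{Im}[\bar{A}_{21} B_{l1} B^*_{l2}] \Big\} . \end{aligned} \tag{5.23}$$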

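The easiest way to get a large value for this is to have $B_{l1} B^*_{l2}$ imaginary. Then $\mathrm{Im}[(B^*_{l1} B_{l2})^2]$ will be zero, but both $\mathrm{Im}[\bar{A}_{12} B_{l1} B^*_{l2}]$ and $\mathrm{Im}[\bar{A}_{21} B_{l1} B^*_{l2}]$ will be large (and of the same sign), as long as $\mathrm{Re}[\bar{A}_{12}]$ is not small. For couplings of common magnitude $|B|$, this gives

$$\bar{\Delta}^{\rm LNV}_{CP} = \frac{2|B|^4}{|\bar{Z}_M|^2} \left( m_1^4 - m_2^4 \right) (m_1 + m_2)^2\,\mathrm{Re}[\bar{A}_{12}] . \tag{5.24}$$

Table 1
Properties of the cross sections and CP asymmetries considered in Section 6

| Observable(s) | Heavy neutrino type | CP violation in (S2) | CP violation in (S3) | K-dependence |
|---|---|---|---|---|
| σ(pp → e μ W X) | Dirac | No | Yes | No |
| σ(pp → e μ W X) | Majorana | Yes | Yes | No |
| σ(pp → e μ W X) | Dirac | No | Yes | No |
| σ(pp → e μ W X) | Majorana | Yes | Yes | No |
| σ(pp → e e W X) | Majorana | Yes | No | No |
| σ(pp → μ μ W X) | Majorana | No | No | No |
| σ(pp → e μ W X) | Majorana | Yes | Yes | No |
| ACP (LNC) | Dirac | No | Yes | No |
| ACP (LNC) | Majorana | Yes | Yes | No |
| RCP (LNC) | Dirac | No | Yes | No |
| RCP (LNC) | Majorana | Yes | Yes | No |
| ACP (LNV1) | Majorana | Yes | No | Yes |
| ACP (LNV2) | Majorana | Yes | Yes | Yes |
| RCP (LNV) | Majorana | Yes | Yes | No |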

In all the above coupling scenarios, if the heavy neutrino mass-splitting $\Delta m_N \equiv m_2 - m_1$ is of order their widths, then the CP asymmetries can be of order the cross sections involved. In the resonant region, these have terms proportional to either $\Delta m_N^2$, $\Gamma_N^2$ or $\Delta m_N$ (all over $|\bar{Z}_{D/M}|^2$). The CP asymmetries are themselves proportional to $\Delta m_N$ (over $|\bar{Z}_{D/M}|^2$), so for $\Delta m_N \sim \Gamma_N$, CP violation can be resonant. This still requires particular coupling scenarios such as we have described, but since we are interested in investigating the possibility of large CP asymmetries at the LHC, we will use these optimised conditions for our numerical studies.

6. Numerical results

In this section we give our numerical results for the cross sections of the various signals of the type discussed in Section 3; for clarity, we list these in Table 1. As mentioned in Section 3, in all signals considered, the $W$ bosons are assumed to decay hadronically. We do not consider $\tau$ leptons as possible final state particles, since these decay before reaching the detector and so would require a more involved analysis. Since we are primarily interested in CP asymmetries between the signal processes, we also plot these; they are defined later in this section and are also included in Table 1. For the results shown, we have used the CTEQ6M PDFs [14] with the factorisation scale $Q = m_N$. Setting $Q$ equal to the invariant mass of the $W^*$ instead, for example, increases the cross sections slightly for relatively low heavy neutrino masses.

The basic assumption required for our calculations to be valid is that the two quasi-degenerate heavy neutrinos are the only new physics particles that make an appreciable contribution to the processes considered. All others are assumed either not to couple to the relevant particles, or to have masses $\gtrsim 1$ TeV. Our calculations depend weakly on the mass of the Higgs boson, for which we have used $M_H = 120$ GeV, and we have globally applied the cuts $p_T > 15$ GeV and $|\eta| < 2.5$ for all final state particles. Note that these kinematical cuts are similar to those chosen in [5].

(S1): $B_{lN} = 0.05$ for $l = e, \mu, \tau$ and $N = 1, 2$.

(S2): As (S1), except $B_{e2} = 0.05i$.

(S3): As (S1), except $B_{\mu 2} = 0.05i$ and $B_{\tau 2} = -0.05$.

The first scenario, where all couplings are real, is the CP-conserving limit. This enables us to separate true CP asymmetries from the asymmetry due to the PDFs. The other two are chosen specifically as examples of resonant CP violation, as guided by our analysis in Section 5.2. Scenario (S2) gives rise to large asymmetries for Majorana neutrinos, but not for Dirac neutrinos, whereas in scenario (S3) large CP asymmetries occur for both Dirac and Majorana neutrinos, but only if the two final state charged leptons are of different flavour.

Including just the two heavy neutrinos that are directly involved in the signal processes, we have $\sum_i |B_{lN_i}|^2 = 0.005$ (for all $l$). This leaves room for couplings to other heavy neutrinos without invalidating the experimental limits in (2.21). This is especially important for Majorana neutrinos, since as noted in Section 5.1, it is necessary then to have at least four heavy neutrinos to avoid theoretical constraints on their couplings. In the same context, a degree of cancellation between the different loop contributions might be necessary for the scenarios (S1)–(S3) to satisfy the bounds in (2.22), especially those derived from the non-observation of $\mu \to e\gamma$.

We start our investigation of CP violation by constructing the following CP asymmetry:

$$A_{CP} = \frac{\sigma(pp \to (W^+)^* \to S_i) - K\,\sigma(pp \to (W^-)^* \to \bar{S}_i)}{\sigma(pp \to (W^+)^* \to S_i) + K\,\sigma(pp \to (W^-)^* \to \bar{S}_i)} , \tag{6.1}$$

where $S_i$ can stand for any of the signal final states considered and $\bar{S}_i$ its CP-conjugate. The function $K$ takes account of the different PDFs involved in producing $W^+$ and $W^-$ bosons, such that $A_{CP} = 0$ if CP is conserved. It has to be calculated theoretically and is defined as

$$K = \left. \frac{\sigma(pp \to (W^+)^* \to S_i)}{\sigma(pp \to (W^-)^* \to \bar{S}_i)} \right|_{\not{CP} = 0} , \tag{6.2}$$

where again $S_i$ can be any of the possible final states considered and $\bar{S}_i$ its CP-conjugate. An important point is that although $A_{CP}$ is in general different for different $S_i$, $K$ is, to a very good approximation, universal whichever signal (of the type considered) is used to define it. Therefore, we calculate the cross sections in the CP-conserving limit with all couplings real, as for example in scenario (S1). $K$ is independent of the magnitudes of the couplings and mass-splitting (for $\Delta m_N \ll m_N$), so any other CP-conserving scenario would give the same value. It is plotted as a function of $m_{N_1}$ (the mass of the lighter of the two heavy neutrinos) in Fig. 3 along with the signal cross sections in (S1). The value of $K$ can be obtained by taking the ratio of either of the two pairs of CP-conjugate signals shown. In fact, in this special scenario, where all heavy neutrino couplings are equal, all additional signals of the type considered have the same cross section as one of the four shown. However, there is an element of theoretical uncertainty, predominantly coming from the different choices of PDFs used. For instance, using the MRST2004 PDFs [15] instead, we find values for $K$ that differ by less than 1%. They are also insensitive to the factorisation scale $Q$ that is used.
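The definitions (6.1) and (6.2) can be written as two small helper functions. This is only an illustrative sketch: the function names and the sample cross-section values are made up for the example and are not results from the analysis.

```python
def k_factor(sigma_wplus_cpc, sigma_wminus_cpc):
    """K of Eq. (6.2): ratio of a signal to its CP-conjugate in the
    CP-conserving limit (e.g. scenario (S1) with all couplings real)."""
    return sigma_wplus_cpc / sigma_wminus_cpc

def a_cp(sigma_wplus, sigma_wminus_conj, K):
    """A_CP of Eq. (6.1) from the W+ mediated signal, its CP-conjugate
    W- mediated signal, and the PDF correction factor K."""
    num = sigma_wplus - K * sigma_wminus_conj
    den = sigma_wplus + K * sigma_wminus_conj
    return num / den

# Hypothetical cross sections in fb, for illustration only.
K_demo = k_factor(2.0, 1.0)
print(a_cp(2.0, 1.0, K_demo))   # CP-conserving inputs give zero
print(a_cp(1.5, 0.25, K_demo))  # a CP-violating example
```

By construction the asymmetry vanishes when the measured cross sections equal their CP-conserving values, and it reaches $\pm 1$ when one of the two signals is absent.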

Another set of CP-violating observables can be constructed by considering ratios of different processes, such that the asymmetries due to the different PDFs cancel out. Defining the ratios $R_+$ and $R_-$ as

$$R_+ = \frac{\sigma(pp \to (W^+)^* \to S_i)}{\sigma(pp \to (W^+)^* \to S_j)} , \qquad R_- = \frac{\sigma(pp \to (W^-)^* \to \bar{S}_i)}{\sigma(pp \to (W^-)^* \to \bar{S}_j)} , \tag{6.3}$$

the corresponding asymmetry is

$$R_{CP} = \frac{R_+ - R_-}{R_+ + R_-} . \tag{6.4}$$

This second method has the advantage that it does not depend on $K$. However, it is more complicated to construct and analyse since two different processes (plus their CP-conjugates) are required for each asymmetry.

Fig. 3. Left plot: Signal cross sections for scenario (S1). Right plot: The function K as defined in (6.2). All additional signals of the type considered have the same cross section as one of the four shown.

Fig. 4. Cross sections for LNC signals in scenario (S2). In this scenario, no CP violation is present for Dirac neutrinos, so plots are not shown for these. In this and all other plots, the vertical dotted line represents the value of the heavy neutrino widths.

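We consider four distinct LNC signals:

$$\begin{aligned} &(1)\ pp \to e^+ \mu^- W^+ X, \qquad &&(2)\ pp \to e^- \mu^+ W^- X, \\ &(3)\ pp \to e^+ \mu^- W^- X, \qquad &&(4)\ pp \to e^- \mu^+ W^+ X. \end{aligned} \tag{6.5}$$

Signals (1) and (2) are CP-conjugates of each other and likewise (3) and (4). Furthermore, the cross sections for (3) and (4) are related to those for (1) and (2) through

$$\sigma\left( pp \to e^\mp \mu^\pm W^+ X \right) = K\,\sigma\left( pp \to e^\mp \mu^\pm W^- X \right) , \tag{6.6}$$

which holds even when CP is not conserved. Using this relation, $A_{CP}$ as defined in (6.1) can be expressed without involving $K$, viz.

$$A_{CP}({\rm LNC}) = \frac{\sigma(pp \to e^+ \mu^- W^\pm X) - \sigma(pp \to e^- \mu^+ W^\pm X)}{\sigma(pp \to e^+ \mu^- W^\pm X) + \sigma(pp \to e^- \mu^+ W^\pm X)} . \tag{6.7}$$

This also has the advantage that it is not necessary to distinguish between the two $W$ boson charges experimentally. $R_{CP}$, given in (6.4), can also be re-expressed using (6.6), and can thus be written as

$$R_{CP}({\rm LNC}) = \frac{\dfrac{\sigma(pp \to e^+ \mu^- W^\pm X)}{\sigma(pp \to e^- \mu^+ W^\pm X)} - \dfrac{\sigma(pp \to e^- \mu^+ W^\pm X)}{\sigma(pp \to e^+ \mu^- W^\pm X)}}{\dfrac{\sigma(pp \to e^+ \mu^- W^\pm X)}{\sigma(pp \to e^- \mu^+ W^\pm X)} + \dfrac{\sigma(pp \to e^- \mu^+ W^\pm X)}{\sigma(pp \to e^+ \mu^- W^\pm X)}} , \tag{6.8}$$

where again, the two charges of the $W$ boson are summed over. However, the observable $R_{CP}({\rm LNC})$ turns out to be closely related to $A_{CP}({\rm LNC})$.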
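The close relation between the two LNC observables can be made explicit. Writing $x$ for the ratio of the two charge-summed cross sections in (6.7), one has $A_{CP} = (x-1)/(x+1)$ and, from (6.8), $R_{CP} = (x^2-1)/(x^2+1)$, which together imply $R_{CP} = 2A_{CP}/(1 + A_{CP}^2)$. The sketch below checks this identity numerically; the function names and input values are illustrative assumptions, not results from the analysis.

```python
def a_cp_lnc(sigma_a, sigma_b):
    """Eq. (6.7): sigma_a = sigma(pp -> e+ mu- W X), sigma_b = sigma(pp -> e- mu+ W X),
    both summed over the two W charges."""
    return (sigma_a - sigma_b) / (sigma_a + sigma_b)

def r_cp_lnc(sigma_a, sigma_b):
    """Eq. (6.8), built from the ratio sigma_a/sigma_b and its inverse."""
    r_plus = sigma_a / sigma_b
    r_minus = sigma_b / sigma_a
    return (r_plus - r_minus) / (r_plus + r_minus)

# Hypothetical cross sections (fb), for illustration only.
sigma_a, sigma_b = 3.0, 1.0
A = a_cp_lnc(sigma_a, sigma_b)
R = r_cp_lnc(sigma_a, sigma_b)
print(A, R, 2 * A / (1 + A**2))
```

Since $|R_{CP}| \ge |A_{CP}|$ follows from this identity for $|A_{CP}| \le 1$, the ratio-based observable is never smaller in magnitude for the LNC signals.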

The cross sections for the signal processes given in (6.5) are shown in Fig. 4 for scenario (S2) and Figs. 5 and 6 for scenario (S3). Since scenario (S2) only results in CP violation for Majorana neutrinos, the results for Dirac neutrinos are not shown. We have plotted the signals as functions of $m_{N_1}$ and $\Delta m_N$. In the latter plots, the point $\Delta m_N = \Gamma_N$ is marked by a vertical dotted line.5 The CP asymmetries are close to their maxima at this point, which we use for our plots against $m_{N_1}$. The CP-violating observables given in (6.7) and (6.8) are shown as functions of $\Delta m_N$ in Fig. 7. Again the point $\Delta m_N = \Gamma_N$ is marked by vertical dotted lines (one each for Dirac and Majorana neutrinos).

6.2. LNV processes

We now analyse the LNV processes:

$$\begin{aligned} &(1)\ pp \to e^+ e^+ W^- X, \qquad &&(2)\ pp \to e^- e^- W^+ X, \\ &(3)\ pp \to e^+ \mu^+ W^- X, \qquad &&(4)\ pp \to e^- \mu^- W^+ X. \end{aligned} \tag{6.9}$$

We have not included the di-muon channel processes here since for the two scenarios (S2) and (S3), there is no CP asymmetry between these. However, interchanging $B_{e2} \leftrightarrow B_{\mu 2}$ in these scenarios would give asymmetries in the di-muon channel but not the di-electron channel, which is just as likely.

5 In the scenarios considered, the widths of the two heavy neutrinos are almost equal, as all couplings have the same

magnitude.

Fig. 5. Cross sections for LNC signals in scenario (S3) with Majorana neutrinos.

Fig. 7. The CP-violating observables ACP (LNC) and RCP (LNC) as defined in (6.7) and (6.8).


Fig. 9. Cross sections for LNV signals in scenario (S3). CP asymmetries are only present for this scenario between signals

with different flavour final state charged leptons.

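$$A_{CP}({\rm LNV1}) = \frac{\sigma(pp \to e^+ e^+ W^- X) - K\,\sigma(pp \to e^- e^- W^+ X)}{\sigma(pp \to e^+ e^+ W^- X) + K\,\sigma(pp \to e^- e^- W^+ X)} , \tag{6.10}$$

$$A_{CP}({\rm LNV2}) = \frac{\sigma(pp \to e^+ \mu^+ W^- X) - K\,\sigma(pp \to e^- \mu^- W^+ X)}{\sigma(pp \to e^+ \mu^+ W^- X) + K\,\sigma(pp \to e^- \mu^- W^+ X)} , \tag{6.11}$$

$$R_{CP}({\rm LNV}) = \frac{\dfrac{\sigma(pp \to e^+ e^+ W^- X)}{\sigma(pp \to e^+ \mu^+ W^- X)} - \dfrac{\sigma(pp \to e^- e^- W^+ X)}{\sigma(pp \to e^- \mu^- W^+ X)}}{\dfrac{\sigma(pp \to e^+ e^+ W^- X)}{\sigma(pp \to e^+ \mu^+ W^- X)} + \dfrac{\sigma(pp \to e^- e^- W^+ X)}{\sigma(pp \to e^- \mu^- W^+ X)}} . \tag{6.12}$$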

The signal cross sections given in (6.9) are plotted in the same manner as the LNC signals. These are shown in Fig. 8 for scenario (S2) and Fig. 9 for scenario (S3). Similarly, the CP-violating observables given in (6.10)–(6.12) are shown in Fig. 10. The cross sections for both the LNC and LNV signals fall very rapidly with increasing heavy neutrino mass. For $|B_{lN}| = 0.05$ and an integrated luminosity of 100 fb$^{-1}$, producing at least 10 signal events in any single channel would typically require $m_N \lesssim 300$–$400$ GeV. This could be enough not just to discover heavy neutrinos, but also to observe resonant CP violation. As can be seen from the plots in Figs. 7 and 10, the CP asymmetries in scenarios (S2) and (S3) are very large for $\Delta m_N \sim \Gamma_N$. With this condition, and taking $m_{N_1} = 300$ GeV for example, scenario (S2) would result in about 20 signal events for $pp \to e^+ e^+ W^- X$ without a single $pp \to e^- e^- W^+ X$ event likely to be observed.

Fig. 10. The CP-violating observables ACP (LNV1), ACP (LNV2) and RCP (LNV) as defined in (6.10)–(6.12). In scenario (S3), $A_{CP}({\rm LNV1}) = 0$ and so $R_{CP}({\rm LNV}) = -A_{CP}({\rm LNV2})$.
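The event counts quoted above follow from the usual relation $N = \sigma \times \mathcal{L}$ between cross section and integrated luminosity. The short sketch below makes the arithmetic explicit; the function names are illustrative, and no detector efficiencies or branching fractions are included.

```python
def expected_events(sigma_fb, lumi_inv_fb):
    """Expected number of events: N = sigma [fb] * integrated luminosity [fb^-1]."""
    return sigma_fb * lumi_inv_fb

def cross_section_for_events(n_events, lumi_inv_fb):
    """Cross section in fb needed to yield n_events at the given luminosity."""
    return n_events / lumi_inv_fb

# At 100 fb^-1, 10 events correspond to 0.1 fb and 20 events to 0.2 fb.
print(cross_section_for_events(10, 100.0))
print(cross_section_for_events(20, 100.0))
```

In this simple counting, the threshold of 10 events at 100 fb$^{-1}$ corresponds to a signal cross section of 0.1 fb in the channel considered.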

In the limit $\Delta m_N = 0$, the asymmetries vanish. This is as expected from Section 5.2, since there it is shown that for couplings of equal magnitude, the asymmetries are proportional to the mass-splitting of the heavy neutrinos. For the two CP-violating scenarios considered, the CP asymmetries are larger for Majorana neutrinos than for Dirac ones. However, more examples would need to be calculated to determine if this is a general trend. Since they are constructed out of the asymmetries between two independent processes and their CP-conjugates, it should come as no surprise that the $R_{CP}$ asymmetries are larger than the $A_{CP}$ ones. The one exception to this is $R_{CP}({\rm LNV})$ in scenario (S3). This is because there is no asymmetry between one of the two pairs of CP-conjugate processes it is constructed from. For $\Delta m_N = \Gamma_N$, both $A_{CP}$ and $R_{CP}$ are independent of the heavy neutrino mass scale, although this is not obvious from our plots since $K$ does depend on $m_N$.

7. Conclusions

Based on the resummation formalism developed in [11,16], we have calculated the $2 \times 2$ heavy neutrino propagator matrix for both Dirac and Majorana neutrinos. Our results apply to scenarios for which the two heavy neutrinos that are nearly degenerate in mass dominate the production cross sections. The formalism involves the absorptive part of the heavy neutrino self-energy, which is given to one-loop, again for both Dirac and Majorana neutrinos.

Assuming a two heavy neutrino mixing system, we have given numerical estimates for the production cross sections of LFV and LNV heavy neutrino signal processes at the LHC. These are of the form $pp \to l\,l'\,W X$, with the final state particle charges (not including the beam remnants $X$) adding up to $\pm 1$. For Dirac neutrinos, the leptons have to have opposite charges and so, in order to avoid large backgrounds, $l$ and $l'$ should be different. For Majorana neutrinos, LNV signals of the form $pp \to l^\pm l'^\pm W^\mp X$ are also allowed, where $l$ and $l'$ can (but do not have to) be equal. The SM background to either of these types of signals should be negligible after cuts. For the magnitudes of couplings we used ($|B_{lN}| = 0.05$), heavy neutrinos will be observable at the LHC if they have masses between about 100 GeV and about 400 GeV. If the heavy neutrinos are much lighter than 100 GeV, the LEP data put severe limits on the $B$-couplings, leading to unobservable signal cross sections.

We have plotted the signal cross sections in three different possible scenarios for the couplings of the heavy neutrinos. In the first, all the couplings are real. This is the CP-conserving

limit which is needed to calculate the asymmetry due to the different PDFs for producing $W^+$ and $W^-$ bosons. The second and third scenarios are chosen as examples in which large CP asymmetries exist in a number of channels. CP-violating observables constructed from the signal cross

sections are also plotted. For a mass-splitting of the two neutrinos of order their widths, these

asymmetries can be very large or even maximal, giving rise to resonant CP violation. It should

be noted that these scenarios require additional flavour symmetries in the heavy neutrino sector

in order to be possible. Experimental constraints, in particular those from the lack of observation

of $\mu \to e\gamma$, require contributions from additional particles of new physics (either further heavy

neutrinos or something else) to be satisfied. These extra particles would need to cause quite large

cancellations in the rate of this process without making a significant contribution to the signals

of interest. Also, for Majorana neutrinos, there are theoretical constraints that require at least two

additional heavy neutrinos (four in total) to be present in the theory in order to be satisfied. Nevertheless, our results demonstrate that very large CP asymmetries are possible at the LHC. The

couplings used have not been motivated in any fashion, but neither are they unique. In particular

it should be kept in mind that an interchange of charged lepton labels gives scenarios which are

no more or less likely.

Among the different processes we have been studying here, the most realistic modes to

look for large CP asymmetries at the LHC are the di-muon or di-electron channels. These

processes are LNV and hence are only allowed for Majorana neutrinos. For these channels, it

is possible to have the heavy neutrino couplings to either the electron or muon heavily suppressed, hence satisfying the experimental constraints. It may then be possible to have resonant

CP violation in these channels without resorting to additional flavour symmetries or cancellations.

The analysis presented here could be extended to other colliders, in particular to the ILC.

Here the signals e+ e l W can be considered which are CP-conjugates of each other. The

ILC should be a cleaner experimental environment and depending on its cms energy, may well

be able to produce heavy neutrinos of considerably larger mass. However, the ILC would suffer

compared to the LHC in the search for heavy neutrinos in the sense that the SM background

cannot so easily be suppressed. Linked to this, evidence of L violation and hence whether or

not neutrinos are Majorana particles is far harder to obtain. Also, an observable signal requires a

significant coupling of the heavy neutrino to the electron, a requirement not shared by the LHC.

In summary, resonant CP violation due to electroweak-scale heavy neutrinos is an interesting

possibility that might well be observed for the first time at the LHC. If realised, this could, in

principle, give an explanation for the BAU through resonant leptogenesis. The observation of

electroweak-scale heavy neutrinos may not directly unravel the detailed structure of the light

neutrino mass matrix, but it will naturally point towards scenarios based on some approximate

lepton-number or flavour symmetry (see also our comment in Section 2.1). This approximate

lepton-number symmetry may be imprinted into the relative decay rates of the heavy neutrinos

into the different charged-lepton flavours $e$, $\mu$ and $\tau$, from which an estimate of the large elements of the Dirac mass matrix $m_D$ could, in principle, be obtained. Possible new signatures

or constraints from low-energy experiments, including the low-energy neutrino oscillation data,

may offer valuable information to restrict the form of mD and the light-to-heavy neutrino-mixing

matrix mD /mM . A detailed analysis of possible correlations between resonant CP violation due

to heavy neutrinos which we have been studying here and other observables may be the subject

of a future communication.

Note added

Shortly after communicating our paper, we became aware of a new analysis of the background

contributing to the heavy neutrino signals [34]. According to this analysis, previous studies have

grossly underestimated the background by a factor of 30, because they did not take into account high-order pile-up processes of jets which cannot be reduced by various kinematic cuts.

As a consequence, those authors find that the LHC will be capable of only probing relatively

light heavy neutrinos of masses up to 175 GeV, thus restricting the results of our study accordingly. Nonetheless, it should be noticed that the background of high multiplicity jet processes

stems predominantly from colour non-singlet states, unlike the 2 distinct jets in the signal which

originate from decays of the colourless heavy neutrinos or W bosons. Hence, the signal and

background processes are expected to show a different topology of hadronic activities in the central region, which could be used to eliminate the contribution of the high-order pile-up processes

of jets, with the hope to extend the reach of the LHC to heavier Majorana neutrinos. Such an

analysis lies beyond the scope of this paper.

Acknowledgements

We thank Mrinal Dasgupta and Jeff Forshaw for a discussion of hadronization and colour interference effects. The work of S.B. has been funded by the PPARC studentship PPA/S/S/2003/

03666. The work of J.S.L. was supported in part by the Korea Research Foundation (KRF) and

the Korean Federation of Science and Technology Societies Grant and in part by the KRF grant:

KRF-2005-084-C00001 funded by the Korea Government (MOEHRD, Basic Research Promotion Fund). The work of A.P. was supported in part by the PPARC grants: PP/D0000157/1 and

PP/C504286/1.

References

[1] Super-Kamiokande Collaboration, Y. Fukuda, et al., Phys. Rev. Lett. 81 (1998) 1562, hep-ex/9807003;

CHOOZ Collaboration, M. Apollonio, et al., Phys. Lett. B 466 (1999) 415, hep-ex/9907037;

SNO Collaboration, Q.R. Ahmad, et al., Phys. Rev. Lett. 89 (2002) 011301, nucl-ex/0204008;

K2K Collaboration, M.H. Ahn, et al., Phys. Rev. Lett. 90 (2003) 041801, hep-ex/0212007;

KamLAND Collaboration, K. Eguchi, et al., Phys. Rev. Lett. 90 (2003) 021802, hep-ex/0212021.

[2] L3 Collaboration, P. Achard, et al., Phys. Lett. B 517 (2001) 67, hep-ex/0107014.

[3] A. Pilaftsis, Z. Phys. C 55 (1992) 275, hep-ph/9901206.

[4] A. Datta, M. Guchait, A. Pilaftsis, Phys. Rev. D 50 (1994) 3195, hep-ph/9311257;

F.M.L. Almeida Jr., Y.A. Coutinho, J.A. Martins Simoes, M.A.B. do Vale, Phys. Rev. D 62 (2000) 075004, hep-ph/0002024;

O. Panella, M. Cannoni, C. Carimalo, Y.N. Srivastava, Phys. Rev. D 65 (2002) 035005, hep-ph/0107308.

[5] T. Han, B. Zhang, Phys. Rev. Lett. 97 (2006) 171804, hep-ph/0604064.

[6] F. del Aguila, J.A. Aguilar-Saavedra, R. Pittau, J. Phys. Conf. Ser. 53 (2006) 506, hep-ph/0606198.

[7] J. Gluza, M. Zralek, Phys. Rev. D 55 (1997) 7030, hep-ph/9612227;

G. Cvetic, C.S. Kim, C.W. Kim, Phys. Rev. Lett. 82 (1999) 4761, hep-ph/9812525;

J.F.M.L. Almeida, Y.A. Coutinho, J.A. Martins Simoes, M.A.B. do Vale, Phys. Rev. D 63 (2001) 075005, hep-ph/0008201.

[8] F. del Aguila, J.A. Aguilar-Saavedra, A. Martinez de la Ossa, D. Meloni, Phys. Lett. B 613 (2005) 170, hep-ph/0502189.

[9] J. Peressutti, O.A. Sampayo, J.I. Aranda, Phys. Rev. D 64 (2001) 073007, hep-ph/0105162;

S. Bray, J.S. Lee, A. Pilaftsis, Phys. Lett. B 628 (2005) 250, hep-ph/0508077.

[10] M. Flanz, E.A. Paschos, U. Sarkar, Phys. Lett. B 345 (1995) 248, hep-ph/9411366;

M. Flanz, E.A. Paschos, U. Sarkar, J. Weiss, Phys. Lett. B 389 (1996) 693, hep-ph/9607310;

L. Covi, E. Roulet, F. Vissani, Phys. Lett. B 384 (1996) 169, hep-ph/9605319.

[11] A. Pilaftsis, Phys. Rev. D 56 (1997) 5431, hep-ph/9707235.

[12] A. Pilaftsis, T.E.J. Underwood, Nucl. Phys. B 692 (2004) 303, hep-ph/0309342;

A. Pilaftsis, T.E.J. Underwood, Phys. Rev. D 72 (2005) 113001, hep-ph/0506107.

[13] B. Garbrecht, C. Pallis, A. Pilaftsis, JHEP 0612 (2006) 038, hep-ph/0605264;

G.C. Branco, A.J. Buras, S. Jager, S. Uhlig, A. Weiler, hep-ph/0609067.

[14] J. Pumplin, et al., JHEP 0207 (2002) 012, hep-ph/0201195.

[15] A.D. Martin, R.G. Roberts, W.J. Stirling, R.S. Thorne, Eur. Phys. J. C 28 (2003) 455, hep-ph/0211080.

[16] A. Pilaftsis, Nucl. Phys. B 504 (1997) 61, hep-ph/9702393.

[17] H. Fritzsch, P. Minkowski, Ann. Phys. 93 (1975) 193.

[18] P. Minkowski, Phys. Lett. B 67 (1977) 421;

M. Gell-Mann, P. Ramond, R. Slansky, in: P. van Nieuwenhuizen, D. Friedman (Eds.), Supergravity, North-Holland,

Amsterdam, 1979, p. 315;

T. Yanagida, in: O. Sawada, A. Sugamoto (Eds.), Proceedings of the Workshop on the Unified Theories and Baryon

Number of the Universe, KEK, Tsukuba, 1979;

R.N. Mohapatra, G. Senjanovic, Phys. Rev. Lett. 44 (1980) 912.

[19] J. Ellis, M.E. Gomez, S. Lola, hep-ph/0612292.

[20] E. Witten, Nucl. Phys. B 268 (1986) 79;

R.N. Mohapatra, J.W.F. Valle, Phys. Rev. D 34 (1986) 1642.

[21] J. Gluza, Acta Phys. Pol. B 33 (2002) 1735, hep-ph/0201002;

G. Altarelli, F. Feruglio, New J. Phys. 6 (2004) 106, hep-ph/0405048.

[22] D. Wyler, L. Wolfenstein, Nucl. Phys. B 218 (1983) 205.

[23] S. Nandi, U. Sarkar, Phys. Rev. Lett. 56 (1986) 564;

J.W.F. Valle, Prog. Part. Nucl. Phys. 26 (1991) 91.

[24] P. Langacker, D. London, Phys. Rev. D 38 (1988) 886.

[25] B. Pontecorvo, Sov. Phys. JETP 6 (1957) 429;

B. Pontecorvo, Sov. Phys. JETP 7 (1958) 172;

Z. Maki, M. Nakagawa, S. Sakata, Prog. Theor. Phys. 28 (1962) 870.

[26] T.P. Cheng, L.-F. Li, Phys. Rev. Lett. 45 (1980) 1908;

J.G. Korner, A. Pilaftsis, K. Schilcher, Phys. Lett. B 300 (1993) 381, hep-ph/9301290;

J. Bernabeu, J.G. Korner, A. Pilaftsis, K. Schilcher, Phys. Rev. Lett. 71 (1993) 2695, hep-ph/9307295;

C.P. Burgess, S. Godfrey, H. Konig, D. London, I. Maksymyk, Phys. Rev. D 49 (1994) 6115, hep-ph/9312291;

E. Nardi, E. Roulet, D. Tommasini, Phys. Lett. B 327 (1994) 319, hep-ph/9402224;

G. Bhattacharya, P. Kalyniak, I. Melo, Phys. Rev. D 51 (1995) 3569, hep-ph/9503248;

F. Deppisch, T.S. Kosmas, J.W.F. Valle, Nucl. Phys. B 752 (2006) 80, hep-ph/0512360.

[27] A. Ilakovac, A. Pilaftsis, Nucl. Phys. B 437 (1995) 491, hep-ph/9403398.

[28] S. Bergmann, A. Kagan, Nucl. Phys. B 538 (1999) 368, hep-ph/9803305.

[29] J.I. Illana, T. Riemann, Phys. Rev. D 63 (2001) 053004, hep-ph/0010193;

G. Cvetic, C. Dib, C.S. Kim, J.D. Kim, Phys. Rev. D 66 (2002) 034008, hep-ph/0202212.

[30] BaBar Collaboration, B. Aubert, et al., Phys. Rev. Lett. 95 (2005) 041802, hep-ex/0502032;

BaBar Collaboration, B. Aubert, et al., Phys. Rev. Lett. 96 (2006) 041801, hep-ex/0508012.

[31] G. Belanger, F. Boudjema, D. London, H. Nadeau, Phys. Rev. D 53 (1996) 6292, hep-ph/9508317.

[32] J.M. Cornwall, J. Papavassiliou, Phys. Rev. D 40 (1989) 3474;

J. Papavassiliou, Phys. Rev. D 41 (1990) 3179;

D. Binosi, J. Papavassiliou, Phys. Rev. D 66 (2002) 111901, hep-ph/0208189;

D. Binosi, J. Papavassiliou, J. Phys. G 30 (2004) 203, hep-ph/0301096.

[33] J. Papavassiliou, A. Pilaftsis, Phys. Rev. Lett. 75 (1995) 3060, hep-ph/9506417;

J. Papavassiliou, A. Pilaftsis, Phys. Rev. D 53 (1996) 2128, hep-ph/9507246;

J. Papavassiliou, A. Pilaftsis, Phys. Rev. D 54 (1996) 5315, hep-ph/9605385.

[34] F. del Aguila, J.A. Aguilar-Saavedra, R. Pittau, hep-ph/0703261.

Melanie Becker a , Li-Sheng Tseng b,c, , Shing-Tung Yau c

a George P. and Cynthia W. Mitchell Institute for Fundamental Physics, Texas A&M University,

b Jefferson Physical Laboratory, Harvard University, Cambridge, MA 02138, USA

c Department of Mathematics, Harvard University, Cambridge, MA 02138, USA

Received 26 January 2007; received in revised form 2 July 2007; accepted 9 July 2007

Available online 19 July 2007

Abstract

We characterize the geometric moduli of non-Kähler manifolds with torsion. Heterotic supersymmetric

flux compactifications require that the six-dimensional internal manifold be balanced, the gauge bundle be

Hermitian Yang-Mills, and also the anomaly cancellation be satisfied. We perform the linearized variation

of these constraints to derive the defining equations for the local moduli. We explicitly determine the metric

deformations of the smooth flux solution corresponding to a torus bundle over K3.

© 2007 Elsevier B.V. All rights reserved.

1. Introduction

Ever since the discovery of Calabi-Yau compactifications [1], string theorists have tried to

make the connection to the Minimal Supersymmetric Standard Model (MSSM) and grand unified

theories (GUT). This turned out to be a difficult problem, as many times exotic particles appear

along the way. These are particles that play no role in the current version of the MSSM.¹ Recently,

[2,3] have made a rather interesting proposal for three generation models without exotics in the

context of Calabi-Yau compactifications of the heterotic string.²

* Corresponding author at: Department of Mathematics, Harvard University, Cambridge, MA 02138, USA.

1 It is, of course, possible that additional particles not known at present might be discovered, leading to an extension

of the MSSM.

2 String duality implies that in principle one could get realistic models in the context of type II theories. A concrete

proposal has been made recently in terms of a D3-brane in the presence of a dP8 singularity [4]. Alternatively, one could

use intersecting D-brane models. For a review see [5].

0550-3213/$ - see front matter © 2007 Elsevier B.V. All rights reserved.

doi:10.1016/j.nuclphysb.2007.07.006

Even though these models have some rather interesting features, it is not possible to predict

with them the values of the coupling constants of the Standard Model, because compactifications

on conventional Calabi-Yau manifolds lead to unfixed moduli, and therefore additional

massless scalars. This issue can only be addressed in the context of flux compactifications, which

are known to lift the moduli [6,7].

If flux compactifications are considered in the context of the heterotic theory, the resulting

internal geometry is a non-Kähler manifold with torsion [8-10]. Simple examples of such compactifications were constructed in [11,12] in the orbifold limit and a smooth compactification was

constructed in [13,14] in terms of a $T^2$ bundle over K3. See [15-18] for some related works. It

would be extremely exciting to construct a torsional manifold with all the features of the MSSM.

At present, we are not yet at such a stage. Many properties of Calabi-Yau manifolds are not shared

by non-Kähler manifolds with torsion, so that well-known aspects of Calabi-Yau manifolds need

to be rederived for these manifolds.

One of the important open questions is to understand how to characterize the scalar massless fields, in other words, the moduli space of heterotic flux compactifications. We investigate

this question by analyzing the local moduli space emerging in such compactifications from a

spacetime approach. A massless scalar field in the effective four-dimensional theory emerges for

each independent modulus of the background geometry. Thus, the dimension of the moduli space

corresponds to the number of massless scalar fields in the theory. In our analysis, we restrict to

supersymmetric deformations, as we expect the analysis of the supersymmetry constraints to be

easier than the analysis of the equations of motion. While the latter equations are corrected by $R^2$

terms, the form of the supersymmetry transformations is not modified to $R^2$ order, as long as the

heterotic anomaly cancellation condition is imposed [19]. That a solution of both the supersymmetry constraints and the modified Bianchi identity is also a solution to the equations of motion

has been shown in [20,21].

Unlike the Calabi-Yau case, the supersymmetry constraint equations in general non-linearly

couple the various fields and thus the analysis even at the linearized variation level is non-trivial.

As an example of our general analysis, we shall give the description of the scalar metric moduli

for the smooth solution of a $T^2$ bundle over K3 presented in [13,14]. It is an interesting question

to understand whether the massless moduli found in our approach are lifted by higher order

terms in the low energy effective action. For conventional Calabi-Yau compactifications it is

known that moduli fields appearing in the leading order equations will remain massless even if

higher order corrections are taken into account [22,23]. In our case, such an analysis has not been

performed yet from the spacetime point of view, though the question can be answered from the

world-sheet approach recently developed in [18]. In this work, a gauged linear sigma model was

constructed which in the IR flows to an interacting conformal field theory. The analysis of the

linear model indicates that massless fields emerging at leading order in $\alpha'$ will remain massless,

even if corrections to the spacetime action are taken into account.

This paper is organized as follows. In Section 2, we perform the linear variation of the supersymmetry constraints. In Section 3, we analyze the variation of the $T^2$ bundle over K3 solution

and discuss its local moduli space. In Section 4, conclusions and future directions are presented.

In Appendix A, we clarify some of the mathematical notation that we use.

2. Determining equations for the moduli fields

The non-Kähler manifolds with torsion $M$ that we are interested in are complex manifolds described in terms of a Hermitian form which is related to the metric

$$J = i\,g_{a\bar b}\, dz^a \wedge d\bar z^{\bar b}, \tag{2.1}$$

and a nowhere-vanishing holomorphic three-form $\Omega$,

$$d\Omega = 0, \tag{2.2}$$

satisfying $J \wedge \Omega = 0$. The geometry can be deformed by either deforming the Hermitian form

or deforming the complex structure of M. We are interested in deformations that preserve the

supersymmetry constraints as well as the anomaly cancellation condition.

$N = 1$ supersymmetry for heterotic flux compactifications to four spacetime dimensions imposes three conditions: the internal geometry has to be conformally balanced, the gauge bundle satisfies the Hermitian Yang-Mills equation, and the $H$-flux satisfies the anomaly cancellation condition. Explicitly, they are [13,14]

$$d\big(\|\Omega\|_J\, J \wedge J\big) = 0, \tag{2.3}$$

$$F^{(2,0)} = F^{(0,2)} = 0, \qquad F_{mn} J^{mn} = 0, \tag{2.4}$$

$$2i\,\partial\bar\partial J = \frac{\alpha'}{4}\big[\mathrm{tr}(R \wedge R) - \mathrm{tr}(F \wedge F)\big]. \tag{2.5}$$

Above, we have replaced the two standard background fields, the three-form $H$ and the dilaton field $\phi$, with the required supersymmetric relations

$$H = i(\bar\partial - \partial)J, \tag{2.6}$$

$$\|\Omega\|_J = e^{-2(\phi + \phi_0)}. \tag{2.7}$$

Doing so allows us to consider the constraint equations solely in terms of the geometrical data $(J, \Omega)$ and the gauge bundle.

Deformations of the metric that are of pure type, i.e. (0, 2) or (2, 0), describe deformations of

the complex structure

(2.8)

while deformations of mixed type, i.e. of type (1, 1), describe deformations of the Hermitian form

$$i\,\delta g_{a\bar b}\, dz^a \wedge d\bar z^{\bar b}. \tag{2.9}$$

We analyze below the linear variation of the three constraint equations (2.3)-(2.5) with respect to a background solution. For simplicity, we shall keep the complex structure of the six-dimensional internal geometry fixed. For the moduli space of Calabi-Yau compactifications, it turns out that the Kähler and complex structure deformations decouple from one another [24]. It would be interesting to determine whether some decoupling still persists in the non-Kähler

case and more generally how the Hermitian and complex structure deformations are coupled. We

will leave this more general analysis for future work.

2.1. Conformally balanced condition

We consider the linear variation of the conformally balanced condition (2.3). We shall vary

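the metric or Hermitian form $J_{a\bar b} = i g_{a\bar b}$ while holding fixed the complex structure. Let

$$J'_{a\bar b} = J_{a\bar b} + \delta J_{a\bar b}. \tag{2.10}$$

To first order in the variation,

$$J' \wedge J' = J \wedge J + 2 J \wedge \delta J, \tag{2.11}$$

$$\|\Omega\|^2_{J'} = \frac{|g_{a\bar b}|}{|g'_{a\bar b}|}\,\|\Omega\|^2_J = \frac{|g_{a\bar b}|}{|g_{a\bar b}|\,(1 + g^{c\bar d}\delta g_{c\bar d})}\,\|\Omega\|^2_J = \big(1 - g^{c\bar d}\delta g_{c\bar d}\big)\|\Omega\|^2_J. \tag{2.12}$$

With (2.7), the corresponding variation of the dilaton is

$$\delta\phi = \frac{1}{4}\, g^{a\bar b}\delta g_{a\bar b} = \frac{1}{8}\, J^{mn}\delta J_{mn}. \tag{2.13}$$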
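The step from (2.12) to (2.13) can be spelled out as follows. This is only a sketch: it assumes the relation $\|\Omega\|_J = e^{-2(\phi+\phi_0)}$ of (2.7) with $\phi_0$ held fixed, and the symbols $\delta\phi$ and $\delta g_{c\bar d}$ for the dilaton and metric variations are notational assumptions rather than a verbatim quotation of the source.

```latex
\begin{align*}
\|\Omega\|_{J'} &= \sqrt{1 - g^{c\bar d}\,\delta g_{c\bar d}}\;\|\Omega\|_J
   \simeq \Big(1 - \tfrac12\, g^{c\bar d}\,\delta g_{c\bar d}\Big)\|\Omega\|_J , \\
\|\Omega\|_{J'} &= e^{-2(\phi+\delta\phi+\phi_0)}
   \simeq (1 - 2\,\delta\phi)\,\|\Omega\|_J , \\
\Rightarrow\quad \delta\phi &= \tfrac14\, g^{c\bar d}\,\delta g_{c\bar d} .
\end{align*}
```

The second equality in (2.13) then only requires the convention-dependent identity $g^{a\bar b}\delta g_{a\bar b} = \tfrac12 J^{mn}\delta J_{mn}$.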

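The linear variation of the conformally balanced condition can be written as

$$d\big(\|\Omega\|_{J'}\, J' \wedge J'\big) = d\big(\|\Omega\|_J\, J \wedge J + 2\,\delta\rho\big) = 0, \tag{2.14}$$

where $\delta\rho$ is a four-form given by

$$\delta\rho = \|\Omega\|_J \Big[ J \wedge \delta J - \frac{1}{8}\,(J \wedge J)\, J^{mn}\delta J_{mn} \Big]. \tag{2.15}$$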

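We can invert (2.15) and express $\delta J$ in terms of $\delta\rho$. To do this, we note that any (2, 2)-form, $\omega_4$, can be Lefschetz decomposed as follows

$$\omega_4 = L\Lambda\,\omega_4 - \frac{1}{4}\, L^2 \Lambda^2\, \omega_4, \tag{2.16}$$

where the Lefschetz operator $L$ and its adjoint $\Lambda$ have the following action on exterior forms

$$L:\ \omega \to J \wedge \omega, \qquad \Lambda:\ \omega \to J \,\lrcorner\, \omega. \tag{2.17}$$

Applying $\Lambda$ to (2.15) then gives

$$\delta J_{mn} = \frac{1}{2\|\Omega\|_J}\, \delta\rho_{mnrs}\, J^{rs}. \tag{2.18}$$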
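The inversion leading to (2.18) can be checked directly. The sketch below assumes complex dimension three, the normalisation $(\Lambda\omega)_{mn} = \tfrac12 J^{rs}\omega_{rsmn}$ (so that $\Lambda\,\delta J = \tfrac12 J^{mn}\delta J_{mn}$), and the standard pointwise commutator $[\Lambda, L] = (3 - k)$ on $k$-forms; the symbols $\delta\rho$ and $\|\Omega\|_J$ are those of (2.15), and none of these conventions is confirmed by the source beyond what (2.13)-(2.18) imply.

```latex
\begin{align*}
\Lambda J &= 3, \qquad
\Lambda (J\wedge\delta J) = (\Lambda\,\delta J)\,J + \delta J, \qquad
\Lambda (J\wedge J) = 4J, \\
\delta\rho &= \|\Omega\|_J\Big[J\wedge\delta J - \tfrac14(\Lambda\,\delta J)\,J\wedge J\Big], \\
\Lambda\,\delta\rho &= \|\Omega\|_J\Big[\delta J + (\Lambda\,\delta J)\,J - (\Lambda\,\delta J)\,J\Big]
 = \|\Omega\|_J\,\delta J .
\end{align*}
```

In components, $\Lambda\,\delta\rho = \|\Omega\|_J\,\delta J$ is the statement $\delta J_{mn} = \delta\rho_{mnrs}J^{rs}/(2\|\Omega\|_J)$.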

From the linear variation of Eq. (2.14), we observe that the allowed deformations (i.e. those which preserve the conformally balanced condition) satisfy $d\,\delta\rho = 0$. Eq. (2.18) implies that any variation of the Hermitian metric can be expressed in terms of a variation by a closed (2, 2)-form. Equivalently, we can also express the linear variation condition directly for the Hermitian metric as

$$d^\dagger\Big(\delta\hat J - \frac{1}{4}\, J\, J^{mn}\delta\hat J_{mn}\Big) = 0, \tag{2.19}$$

where $\delta\hat J = \|\Omega\|_J\, \delta J$.

Note that $\delta J$ variations that are equivalent to a coordinate transformation (i.e. a diffeomorphism) are physically unobservable and must therefore be quotiented out. Under an infinitesimal coordinate transformation

$$y'^m = y^m + v^m(y), \tag{2.20}$$

the variation of a $p$-form $\omega_p$ is given by the Lie derivative

$$\delta\omega_p = \mathcal{L}_v\,\omega_p = i_v(d\omega_p) + d(i_v\,\omega_p), \tag{2.21}$$

where $v = v^m\partial_m$ is a vector field and $i_v$ denotes the interior product. For the conformally balanced four-form, a coordinate transformation results in

$$\mathcal{L}_v\big(\|\Omega\|_J\, J \wedge J\big) = d\big(i_v(\|\Omega\|_J\, J \wedge J)\big). \tag{2.22}$$

We can thus identify, as physically not relevant, variations that are exterior derivatives of a non-primitive three-form

$$d\big(\|\Omega\|_J\, \beta \wedge J\big), \tag{2.23}$$

where $\beta_m = v^n J_{nm}$. Using (2.18), this corresponds to deformations of the Hermitian form

$$\delta J \sim \frac{1}{\|\Omega\|_J}\, \Lambda\, d\big(\|\Omega\|_J\, \beta \wedge J\big). \tag{2.24}$$
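The identification of the diffeomorphism variations with exact forms of the type (2.23) follows in a few lines. This is a sketch assuming the convention $(i_v\alpha)_{m\ldots} = v^n\alpha_{nm\ldots}$ for the interior product, the background condition (2.3), and the normalisation of the four-form variation in (2.14); the label $\delta\rho_{\rm diff}$ is introduced here for illustration only.

```latex
\begin{align*}
i_v (J\wedge J) &= 2\,(i_v J)\wedge J = 2\,\beta\wedge J,
   \qquad \beta_m = v^n J_{nm},\\
\mathcal{L}_v\big(\|\Omega\|_J\,J\wedge J\big)
 &= i_v\,d\big(\|\Omega\|_J\,J\wedge J\big) + d\,i_v\big(\|\Omega\|_J\,J\wedge J\big)
  = 2\,d\big(\|\Omega\|_J\,\beta\wedge J\big),\\
2\,\delta\rho_{\rm diff} &= 2\,d\big(\|\Omega\|_J\,\beta\wedge J\big)
 \;\Rightarrow\; \delta\rho_{\rm diff} = d\big(\|\Omega\|_J\,\beta\wedge J\big).
\end{align*}
```

The first term of the Lie derivative drops out because the background satisfies $d(\|\Omega\|_J J\wedge J) = 0$, and the three-form $\beta\wedge J$ is manifestly non-primitive.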

Let us now interpret the content of the above variation formulas. By the identification of (2.18), variations of the Hermitian metric that preserve the conformally balanced condition can be parametrized by closed (2, 2)-forms. Moreover, modding out by diffeomorphisms results in the cohomology³

$$\frac{\ker(d) \cap \Lambda^{2,2}}{d(\beta \wedge J)}. \tag{2.25}$$

Thus, the space of conformally balanced metrics is equivalent to the space of closed (2, 2)-forms modded out by those which are exterior derivatives of a non-primitive three-form. But notice that exact forms which are exterior derivatives of a primitive three-form are not quotiented out. Hence, if there exists such a primitive three-form, $\omega_3^0$, then the space of balanced metrics is infinite-dimensional. This is because $d(f\,\omega_3^0)$, where $f$ is any real function, would be closed but not modded out.

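The cohomology of (2.25) can also be expressed directly in terms of (1, 1)-forms. From (2.19), every co-closed (1, 1)-form defines a metric deformation preserving the conformally balanced condition. To see this explicitly, we note that any (1, 1)-form can be Lefschetz decomposed as follows

$$C_{mn} = (C_0)_{mn} + \frac{1}{6}\, J_{mn} J^{rs} C_{rs} \equiv (C_0)_{mn} + \frac{1}{3}\, J_{mn}\, \Lambda C, \tag{2.26}$$

where $C_0$ denotes the primitive part and $\Lambda C = \frac{1}{2} J^{rs} C_{rs}$ encodes the non-primitivity of $C_{mn}$. We can therefore re-express (2.19), with $\delta\hat J \equiv \|\Omega\|_J\,\delta J$, as

$$0 = d^\dagger\Big(\delta\hat J - \frac{1}{2}\, J\, \Lambda\delta\hat J\Big) = d^\dagger\Big(\delta\hat J_0 - \frac{1}{6}\, J\, \Lambda\delta\hat J\Big) = d^\dagger\, \delta C. \tag{2.27}$$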

3 Note that complex structures are also defined up to diffeomorphism. So any diffeomorphism generated by a real

vector field will keep the complex structure in the same equivalence class.

Furthermore, variations associated with diffeomorphisms can be written as

$$\delta\hat J \sim \Lambda\, d\big(\|\Omega\|_J\, \beta \wedge J\big), \tag{2.28}$$

so that we have

$$\delta\hat J - \frac{1}{2}\, J\, \Lambda\delta\hat J \sim d^\dagger(\beta' \wedge J), \tag{2.29}$$

where $\beta'_m = \|\Omega\|_J\, J_m{}^n \beta_n$. The corresponding cohomology is

$$\frac{\ker(d^\dagger) \cap \Lambda^{1,1}}{d^\dagger(\beta' \wedge J)}. \tag{2.30}$$

Therefore, the local moduli space can also be described as spanning all co-closed (1, 1)-forms modulo those which are $d^\dagger$ of non-primitive three-forms. This space is isomorphic to that of (2.25) and is in general infinite-dimensional. We have however yet to consider the two other supersymmetry constraints. Imposing them, especially the anomaly cancellation condition, will greatly reduce the number of allowed deformations and render the moduli space finite-dimensional. This can be seen clearly in the $T^2$ bundle over K3 example discussed in the next

section.

Finally, let us point out that if we had taken into consideration variations of the complex

structure, then a $\delta J$ variation will in general include also a (2, 0) and a (0, 2) part. Nevertheless,

$J + \delta J$ must still be a (1, 1)-form with respect to the deformed complex structure as is required

by supersymmetry.

2.2. Hermitian Yang-Mills condition

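Any variation of the Hermitian gauge connection with the complex structure held fixed will preserve the holomorphic condition $F^{(2,0)} = F^{(0,2)} = 0$. As for the primitivity condition $F_{mn} J^{mn} = 0$, we shall vary its equivalent form

$$0 = \delta(F \wedge J \wedge J) = \delta F \wedge J^2 + 2 F \wedge J \wedge \delta J. \tag{2.31}$$

The Hermitian field strength $F$ can be written as

$$F_{\bar a b} = \bar\partial_{\bar a} A_b = \bar\partial_{\bar a}\big(\bar h^{-1} \partial_b \bar h\big), \tag{2.32}$$

where $\bar h$ is the transpose of the Hermitian metric on the gauge bundle. Under the variation $\bar h' = \bar h + \delta\bar h$, the gauge field varies as

$$\begin{aligned}
\delta A = A' - A &= \delta(\bar h^{-1})\,(\partial \bar h) + \bar h^{-1}\, \partial\, \delta\bar h \\
&= -\bar h^{-1} \delta\bar h\, \bar h^{-1} \partial \bar h + \bar h^{-1}\, \partial\big(\bar h\, \bar h^{-1} \delta\bar h\big) \\
&= \partial\big(\bar h^{-1} \delta\bar h\big) + A\,\big(\bar h^{-1} \delta\bar h\big) - \big(\bar h^{-1} \delta\bar h\big)\, A \\
&\equiv D^A\big(\bar h^{-1} \delta\bar h\big).
\end{aligned} \tag{2.33}$$

This implies that the field strength varies as $\delta F = \bar\partial\big(D^A(\bar h^{-1}\delta\bar h)\big)$, and (2.31) becomes

$$0 = \bar\partial D^A\big(\bar h^{-1} \delta\bar h\big) \wedge J^2 + 2 F \wedge J \wedge \delta J. \tag{2.34}$$

This gives the constraint relation between the variations of the Hermitian form and the gauge field. The pair $(\delta J, \delta h)$ will be further constrained when inserted into the anomaly cancellation condition as we now show.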

2.3. Anomaly cancellation condition

We can write the variation of the anomaly cancellation equation as

$$2i\,\partial\bar\partial\, \delta J = \frac{\alpha'}{4}\, \delta\Big( \mathrm{tr}\big[R(g) \wedge R(g)\big] - \mathrm{tr}\big[F(h) \wedge F(h)\big] \Big). \tag{2.35}$$

The left-hand side is a $\partial\bar\partial$ of a (1, 1)-form, so we should write the variation of the right-hand side of the equation similarly. With the curvature defined using the Hermitian connection, we can write the variation using the Bott-Chern form [25,26]. For two Hermitian metrics $(g_1, g_0)$ that are smoothly connected by a path parameterized by a parameter $t \in [0, 1]$, the difference of the first Pontryagin classes is given by the Bott-Chern form

$$\mathrm{tr}[R_1 \wedge R_1] - \mathrm{tr}[R_0 \wedge R_0] = 2i\,\partial\bar\partial\, BC_2(g_1, g_0), \tag{2.36}$$

where

$$BC_2(g_1, g_0) = 2i \int_0^1 \mathrm{tr}\big[R_t\, \bar g_t^{-1}\, \dot{\bar g}_t\big]\, dt, \tag{2.37}$$

and $\bar g = g_{\bar a b}$ denotes the transpose of the Hermitian metric, the dot denotes the derivative with respect to $t$, and the tr in (2.37) traces over only the holomorphic indices.⁴ We now use the Bott-Chern formula to obtain the variation. Let

$$g_t = g + t\,(g' - g) = g + t\,\delta g, \tag{2.38}$$

then to first order in $\delta g$

$$\delta\, \mathrm{tr}[R \wedge R] = 2\, \mathrm{tr}[\delta R \wedge R] = -4\, \partial\bar\partial\, \mathrm{tr}\big[R\, \bar g^{-1} \delta\bar g\big], \tag{2.39}$$
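The coefficient in (2.39) follows from linearising the Bott-Chern formula. The sketch below assumes that (2.36)-(2.37) are applied to the straight path (2.38) between $g$ and $g + \delta g$, keeps only terms of first order in $\delta g$, and writes $g$ for the transposed metric inside the trace; these are notational assumptions.

```latex
\begin{align*}
BC_2(g+\delta g,\,g) &= 2i\int_0^1 \mathrm{tr}\big[R_t\,g_t^{-1}\,\delta g\big]\,dt
   = 2i\,\mathrm{tr}\big[R\,g^{-1}\delta g\big] + O(\delta g^2),\\
\delta\,\mathrm{tr}[R\wedge R] &= 2i\,\partial\bar\partial\,BC_2(g+\delta g,\,g)
   = (2i)^2\,\partial\bar\partial\,\mathrm{tr}\big[R\,g^{-1}\delta g\big]
   = -4\,\partial\bar\partial\,\mathrm{tr}\big[R\,g^{-1}\delta g\big].
\end{align*}
```

At this order the integrand can be evaluated at $t = 0$, since the $t$-dependence of $R_t$ and $g_t^{-1}$ only contributes at $O(\delta g^2)$.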

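where, in components,

$$\mathrm{tr}\big[R\, \bar g^{-1} \delta\bar g\big]_{a\bar b} = -i\, R_{a\bar b}{}^{c\bar d}\, \delta J_{c\bar d}. \tag{2.40}$$

With (2.39), the linear variation of the anomaly equation (2.35) becomes

$$2i\,\partial\bar\partial\, \delta J = -\alpha'\, \partial\bar\partial \Big( \mathrm{tr}\big[R\, \bar g^{-1} \delta\bar g\big] - \mathrm{tr}\big[F\, \bar h^{-1} \delta\bar h\big] \Big). \tag{2.41}$$

By factoring out the $\partial\bar\partial$ derivatives, the anomaly condition can be equivalently expressed as

$$\delta J - i\,\frac{\alpha'}{2} \Big( \mathrm{tr}\big[R\, \bar g^{-1} \delta\bar g\big] - \mathrm{tr}\big[F\, \bar h^{-1} \delta\bar h\big] \Big) = \gamma, \tag{2.42}$$

where $\gamma$ is a $\partial\bar\partial$-closed (1, 1)-form.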
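The passage from the linearised anomaly equation to the form (2.42) is a short rearrangement. This sketch assumes the $\alpha'/4$ normalisation of (2.5), the factor $-4$ of (2.39), and that the gauge-field term varies in the same way as the curvature term; the shorthand $X$ and the name $\gamma$ for the $\partial\bar\partial$-closed form are introduced here for illustration.

```latex
\begin{align*}
2i\,\partial\bar\partial\,\delta J &= \frac{\alpha'}{4}\,(-4)\,\partial\bar\partial\,X,
 \qquad X \equiv \mathrm{tr}\big[R\,g^{-1}\delta g\big] - \mathrm{tr}\big[F\,h^{-1}\delta h\big],\\
0 &= \partial\bar\partial\big(2i\,\delta J + \alpha' X\big)
   = 2i\,\partial\bar\partial\Big(\delta J - \tfrac{i\alpha'}{2}\,X\Big),\\
\Rightarrow\quad \delta J - \tfrac{i\alpha'}{2}\,X &= \gamma,
 \qquad \partial\bar\partial\,\gamma = 0 .
\end{align*}
```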

Note that for the special case where either the gauge bundle is trivial (i.e. $F = 0$) or $\delta h = 0$, there is a simple relationship between $\delta J$ and $\gamma$. The anomaly variation with (2.40) inserted reads

4 Note that the Bott-Chern form is defined only up to $\partial$ and $\bar\partial$ exact terms.

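$$\delta J_{a\bar b} - \frac{\alpha'}{2}\, R_{a\bar b}{}^{c\bar d}\, \delta J_{c\bar d} = \gamma_{a\bar b}. \tag{2.43}$$

Grouping the two Hermitian indices $(a\bar b)$ into a single index, we can invert the above equation and obtain

$$\delta J = (1 - M)^{-1}\, \gamma, \tag{2.44}$$

where $M_{a\bar b}{}^{c\bar d} = \frac{\alpha'}{2}\, R_{a\bar b}{}^{c\bar d}$. As long as $(1 - M)$ is invertible, we see that $\delta J$ is parametrized by the space of $\partial\bar\partial$-closed (1, 1)-forms $\gamma$. Modding out by diffeomorphism equivalence, we can obtain a cohomology associated with the anomaly equation of the form

$$\frac{\ker(\partial\bar\partial) \cap \Lambda^{1,1}}{\gamma_{\rm diff}}, \tag{2.45}$$

where

$$\gamma_{\rm diff} = (1 - M)\, \delta J_{\rm diff} = \frac{1}{\|\Omega\|_J}\, (1 - M)\, \Lambda\, d\big(\|\Omega\|_J\, \beta \wedge J\big), \tag{2.46}$$

and $\delta J_{\rm diff}$ is the variation of the Hermitian form corresponding to diffeomorphism given in (2.24).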

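To summarize, we list the three linear variation conditions with complex structure fixed.

$$d\Big( \|\Omega\|_J \Big[ 2 J \wedge \delta J - \frac{1}{4}\, (J \wedge J)\, J^{mn} \delta J_{mn} \Big] \Big) = 0, \tag{2.47}$$

$$\bar\partial D^A\big(\bar h^{-1} \delta\bar h\big) \wedge J^2 + 2 F \wedge J \wedge \delta J = 0, \tag{2.48}$$

$$\partial\bar\partial \Big( \delta J - i\,\frac{\alpha'}{2} \Big( \mathrm{tr}\big[R\, \bar g^{-1} \delta\bar g\big] - \mathrm{tr}\big[F\, \bar h^{-1} \delta\bar h\big] \Big) \Big) = 0. \tag{2.49}$$

In the next section, we will write down explicit deformations that satisfy the above equations for the $T^2$ bundle over K3 flux background.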

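3. $T^2$ bundle over K3 solution

The metric of the $T^2$ bundle over K3 solution [13,14] has the form

$$ds^2 = e^{2\phi}\, ds^2_{K3} + (dx + \alpha_1)^2 + (dy + \alpha_2)^2 = e^{2\phi}\, ds^2_{K3} + |dz_3 + \alpha|^2, \tag{3.1}$$

where $\theta = dz_3 + \alpha$ is a (1, 0)-form and $\alpha = \alpha_1 + i\alpha_2$. The twisting of the $T^2$ is encoded in the two-form defined on the base K3,

$$\omega = \omega_1 + i\,\omega_2 = d\theta = \omega_S^{(2,0)} + \omega_A^{(1,1)}, \tag{3.2}$$

which is primitive,

$$\omega \wedge J_{K3} = 0, \tag{3.3}$$

and obeys the quantization condition

$$\tilde\omega_i = \frac{\omega_i}{2\pi\sqrt{\alpha'}} \in H^2(K3, \mathbb{Z}). \tag{3.4}$$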

With this metric ansatz, the anomaly cancellation equation reduces to a highly non-linear second-order differential equation for the dilaton $\phi$. Importantly, a necessary condition for the existence

of a solution for $\phi$ is that the background satisfies the topological condition

1

2

2 JK3 JK3

S + A

tr F F = 24.

+

(3.5)

2!

16 2

K3

K3

If this condition is satisfied, then the analysis of Fu and Yau [13] guarantees the existence of a

smooth solution for $\phi$ that solves the differential equation of anomaly cancellation.

3.1. Equations for the moduli

For expressing the constraint equations of the allowed deformations, we first write down the Hermitian metric more explicitly. Note that the conventions we follow here are that $J_{a\bar b} = i g_{a\bar b}$ and

$$J = e^{2\phi} J_{K3} + \frac{i}{2}\, \theta \wedge \bar\theta, \tag{3.6}$$

and we write the corresponding metric as

$$g_{a\bar b} = \frac{1}{2} \begin{pmatrix} 2g + B B^\dagger & B \\ B^\dagger & 1 \end{pmatrix}, \tag{3.7}$$

where $g_{i\bar j} = e^{2\phi} g_{K3}$ is the base K3 metric with the $e^{2\phi}$ warp factor included, $B = (B_1, B_2)$ is a column vector with entries locally given by $\alpha = B_1\, dz_1 + B_2\, dz_2$, and $B^\dagger = \bar B^T$.

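An allowed deformation of the conformally balanced condition must satisfy the requirement that the four-form (2.15)

$$\begin{aligned}
\delta\rho &= \|\Omega\|_J \Big[ J \wedge \delta J - \frac{1}{8}\, (J \wedge J)\, J^{mn} \delta J_{mn} \Big] \\
&= J_{K3} \wedge \delta J + \frac{i}{2}\, e^{-2\phi}\, \theta \wedge \bar\theta \wedge \delta J - \frac{1}{8} \big( e^{2\phi} J_{K3} \wedge J_{K3} + i\, J_{K3} \wedge \theta \wedge \bar\theta \big)\, J^{mn} \delta J_{mn},
\end{aligned} \tag{3.8}$$

is d-closed.

As for the anomaly condition, we shall work with the constraint given in the form of (2.49) (with trivial gauge bundle)

$$\partial\bar\partial \Big( \delta J - i\,\frac{\alpha'}{2}\, \mathrm{tr}\big[R\, \bar g^{-1} \delta\bar g\big] \Big) = 0. \tag{3.9}$$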

The curvature term can be written out explicitly as

tr R g 1 g = i R j i Ji j + R j 3 J3j + R 3i Ji 3 + R 33 J33 ,

(3.10)

where

1

1

R j i = g 1 R g 1 B

,

B g

2

1

1 g 1 B

B g

B,

R j 3 = g 1 R B + g 1 B

2

(3.11)

(3.12)

1

1

B g

,

R 3i = B g 1 R B g 1 + B g 1 B

2

1

B g 1 B + B g 1 B

R 33 = B g 1 R B + B g 1 B

2

1

,

B g B B g 1 B

(3.13)

(3.14)

and $R = \bar\partial(g^{-1} \partial g)$ is the curvature tensor of K3 with respect to the $g$ metric. Note that the $R_b{}^a$ are two-forms with components only on the coordinates of K3.

Below, we shall analyze the infinitesimal deformations of the $T^2$ bundle over K3 model with trivial gauge bundle. For this type of model, the topological constraint (3.5) is satisfied purely by the curvature of the $T^2$ twist. (See Section 5.2 in [14] for explicit examples.) We shall discuss the variation of the three components of the metric (the dilaton conformal factor, the K3 base, and the $T^2$ bundle) separately below. We will show that the moduli given below satisfy both the conformally balanced and anomaly cancellation conditions. For the trivial bundle case, the

Hermitian Yang-Mills condition does not place any constraint on the deformations. Finally, we

will also discuss the variation of the complex structure in this model.

The dilaton is associated to the warp factor of the K3 base. Thus, varying the dilaton corresponds to varying the local scale of the K3. The deformation of the Hermitian form due to the variation of the dilaton is

$$\delta J = 2\, \delta\phi\, e^{2\phi} J_{K3}, \tag{3.15}$$

where $\delta\phi$ depends only on the K3 coordinates. This is consistent with the dilaton variation condition $\delta\phi = (1/8)\, J^{mn} \delta J_{mn}$ of Eq. (2.13). As for the conformally balanced condition, it in fact does not place any constraint on the dilaton. The metric variation (3.15) when inserted into (3.8) gives the four-form

$$\delta\rho = \delta\phi\, e^{2\phi}\, J_{K3} \wedge J_{K3}, \tag{3.16}$$

which is indeed d-closed for any real function $\delta\phi$ on the base K3. Since the space of real functions is infinite-dimensional, the dimensionality of the deformation space is also infinite if only the conformally balanced condition is considered.
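Both statements about the dilaton variation can be verified explicitly. The sketch below assumes $J = e^{2\phi}J_{K3} + \tfrac{i}{2}\theta\wedge\bar\theta$ as in (3.6), takes $\|\Omega\|_J = e^{-2\phi}$ (absorbing the constant $\phi_0$), and uses that contracting $J^{mn}$ with the base part of $J$ gives the real dimension of the base, namely 4; the symbols $\theta$ and $\delta\phi$ are notational assumptions.

```latex
\begin{align*}
J^{mn}\,\delta J_{mn} &= 2\,\delta\phi\; J^{mn}\big(e^{2\phi}J_{K3}\big)_{mn}
   = 2\,\delta\phi\cdot 4 = 8\,\delta\phi,\\
J\wedge J &= e^{4\phi}J_{K3}\wedge J_{K3} + i\,e^{2\phi}J_{K3}\wedge\theta\wedge\bar\theta,\\
\delta\rho &= e^{-2\phi}\Big[\big(e^{2\phi}J_{K3} + \tfrac i2\,\theta\wedge\bar\theta\big)\wedge 2\,\delta\phi\,e^{2\phi}J_{K3}
   - \delta\phi\,\big(e^{4\phi}J_{K3}\wedge J_{K3} + i\,e^{2\phi}J_{K3}\wedge\theta\wedge\bar\theta\big)\Big]\\
 &= \delta\phi\,e^{2\phi}\,J_{K3}\wedge J_{K3}.
\end{align*}
```

The result is a function on K3 times the volume form of K3, hence a top form on the base, which is why it is closed for arbitrary $\delta\phi$.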

Imposing the anomaly cancellation condition will, however, make the deformation space finite.

Anomaly cancellation (3.9) imposes the condition

2e2 JK3 i

2

1

+ 4

= 0,

tr B B gK3

e

2

(3.17)

where we have used (3.11). The analysis of Fu and Yau [13] guarantees only a one-parameter family of solutions parametrized by the normalization

$$A = \left( \int_{K3} e^{-8\phi}\, \frac{J_{K3} \wedge J_{K3}}{2!} \right)^{1/4}, \tag{3.18}$$

as long as the topological condition (3.5) is satisfied and also $A \ll 1$. (See [14] for a discussion of the physical implications of the $A \ll 1$ bound.) The variation of the dilaton can thus be parametrized by the value of $A$.⁵

3.3. Deformations of the K3 metric

The metric moduli of the K3 are associated with deformations of the Hermitian form $J_{K3}$ such that the curvatures of the $T^2$ bundle, $\omega_i$ for i = 1, 2, remain primitive (3.3). This implies that the allowed variation of $J_{K3}$ satisfies

$$ \delta\omega_i \wedge J_{K3} + \omega_i \wedge \delta J_{K3} = 0, \qquad i = 1, 2. \tag{3.19} $$

Hence, of the 20 possible $h^{1,1}$ Kähler deformations of K3, only the subset that satisfies (3.19) is allowed.

First, consider the case where $\delta\omega_i = 0$. We then have the condition

$$ \omega_i \wedge \delta J_{K3} = 0, \tag{3.20} $$

which must be satisfied locally at every point on K3. With the curvature form containing a (1, 1) part, (3.20) is a very strong condition that in general can only be satisfied by a variation proportional to the Hermitian form, $\delta J_{K3} \sim J_{K3}$. But this would then be the modulus identified above as associated with the dilaton (3.15).

More generally, we can have $\delta\omega_i = i\,\partial\bar\partial f_i$, where $f_i$ for i = 1, 2 are functions on the base K3.

This form of i is required so that the variation does not change the H 2 (K3) integral class

of i as required by the quantization of (3.14). Let JK3 = H 1,1 (K3) and not proportional

to JK3 , then the variation (3.19) corresponds to

i JK3

0 = i + i f

JK3 JK3

= fi fi

.

2

(3.21)

K3

Here, we have replaced i = fi JK3 J

noting that the exterior product of two (1, 1)-forms

2

on the base must be a function times the volume form of the K3. Now, the sufficient condition

that a solution for fi exists is that

JK3 JK3

fi

= i = 0.

(3.22)

2

K3

K3

But this is related to the requirement that the intersection numbers are zero. The intersection

numbers of K3 are defined to be

dI J = I J ,

(3.23)

K3

5 Rigorously, one should be able to show that there does not exist a dilaton variation that satisfies (3.17) and leaves the

normalization A unchanged. Regardless, the finite-dimensionality of the deformation space is ensured if one assumes the

elliptic condition required by Fu and Yau [13] to solve the anomaly cancellation equation for .


where I , I = 1, . . . , 22, denotes a basis of H 2 (K3, Z). The matrix dI J is the metric of the even

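self-dual lattice with Lorentzian signature (3, 19) given by

$$ \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \oplus \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \oplus \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \oplus (-E_8) \oplus (-E_8), \tag{3.24} $$

where

$$ E_8 = \begin{pmatrix}
 2 &  0 & -1 &  0 &  0 &  0 &  0 &  0 \\
 0 &  2 &  0 & -1 &  0 &  0 &  0 &  0 \\
-1 &  0 &  2 & -1 &  0 &  0 &  0 &  0 \\
 0 & -1 & -1 &  2 & -1 &  0 &  0 &  0 \\
 0 &  0 &  0 & -1 &  2 & -1 &  0 &  0 \\
 0 &  0 &  0 &  0 & -1 &  2 & -1 &  0 \\
 0 &  0 &  0 &  0 &  0 & -1 &  2 & -1 \\
 0 &  0 &  0 &  0 &  0 &  0 & -1 &  2
\end{pmatrix} \tag{3.25} $$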

is the Cartan matrix of E8 Lie algebra. Thus we see that a variation of JK3 = is allowed as

long as the intersection numbers of with i are zero. This implies at least that = 1 , 2 .
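As a quick sanity check on the lattice data above, the following sketch verifies that an $E_8$ Cartan matrix is symmetric, has even diagonal entries, and has determinant 1, which is what makes the K3 lattice even and unimodular. The matrix entries are written in the standard convention with $-1$ on the off-diagonal links (an assumption about the sign convention), and the helper name `determinant` is purely illustrative.

```python
from fractions import Fraction

# E8 Cartan matrix in the standard convention (assumed): 2 on the diagonal,
# -1 for each link of the Dynkin diagram.
E8 = [
    [ 2,  0, -1,  0,  0,  0,  0,  0],
    [ 0,  2,  0, -1,  0,  0,  0,  0],
    [-1,  0,  2, -1,  0,  0,  0,  0],
    [ 0, -1, -1,  2, -1,  0,  0,  0],
    [ 0,  0,  0, -1,  2, -1,  0,  0],
    [ 0,  0,  0,  0, -1,  2, -1,  0],
    [ 0,  0,  0,  0,  0, -1,  2, -1],
    [ 0,  0,  0,  0,  0,  0, -1,  2],
]

def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n, det = len(rows), Fraction(1)
    for col in range(n):
        # Find a row with a non-zero entry in this column to use as pivot.
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            det = -det  # a row swap flips the sign
        det *= rows[col][col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return det

is_symmetric = all(E8[i][j] == E8[j][i] for i in range(8) for j in range(8))
is_even = all(E8[i][i] % 2 == 0 for i in range(8))
print(is_symmetric, is_even, determinant(E8))
```

Since each hyperbolic block in (3.24) has determinant $-1$ and $\det(-E_8) = \det E_8$ for an $8 \times 8$ matrix, the full 22-dimensional lattice metric then has determinant $\pm 1$.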

The above variations of the Kähler form on the K3 require the metric variations

i

J = e2 + ( + ) + 2e2 JK3 ,

2

i

= + JK3 ( + ) + e2 JK3 JK3 ,

2

(3.26)

where = i(f1 + if2 ). One can check that the above is closed when (3.21) is satisfied. We

note that the additional variation of the dilaton in (3.26) is needed in order to satisfy the anomaly

condition. With it, the analysis of Fu and Yau [13] then guarantees the existence of a solution

for for each consistent pair (, ). Therefore, J variations in (3.26) satisfying (3.22) are

indeed moduli.

3.4. Deformation of the T 2 bundle

We now consider the variation of the size of the T 2 bundle. This is an allowed variation of the

conformally balanced condition since the metric variation

i

J = ,

2

results in the closed four-form

i

1

= e2 JK3 JK3 + JK3 ,

4

4

(3.27)

(3.28)

where is a constant infinitesimal parameter. But we must also check the anomaly condition.

The variation of the curvature term can be calculated using (3.11)(3.14) and we obtain

1

B g 1 .

tr R g 1 g = tr B

2

The anomaly condition (3.9) therefore becomes

(3.29)

1

0 = i J i tr R g g

2

1

= tr B

B g 1

2

2

2

J

1

B g 1 ,

= 2 K3 + tr B

2

2

2


(3.30)

but this cannot hold true. To see this, we can integrate the last line over the base K3. The first term

gives a positive contribution while the second term integrates to zero. Here, we have used the fact

B g 1 ] in the second term is well-defined and has dependence only

that the two-form tr[B

on the base K3 as was shown in [13] (see Lemma 10 on page 11). Thus, the size of the torus

cannot be continuously varied as it is fixed by the anomaly condition.

With the size of the torus fixed, it is evident that there cannot be any overall radial modulus $\delta J \propto J$ for this model, as has also been noted previously in [21,27,28]. Actually, it is true in general that anomaly cancellation forbids an overall constant radial modulus for any heterotic compactification with non-zero H-flux. The reason is simply that $\mathrm{tr}[R \wedge R]$ is invariant under constant scaling of the metric since the Riemann tensor, $R_{mn}{}^{p}{}_{q}$, is scale invariant. However, $dH = 2i\,\partial\bar\partial J$ depends on J and cannot be scale invariant. Hence, the overall scale is not a modulus.

To summarize, the $T^2$ bundle over K3 model has a dilaton modulus and also moduli associated with the Kähler moduli of the base K3. The number of moduli in particular depends on the curvature of the $T^2$ twist, $\omega$. The size of the $T^2$ is, however, fixed and hence there is no overall radial modulus in the model.

3.5. Fixing the complex structure

We have mostly taken the complex structure to be fixed in analyzing the moduli. But for the $T^2$ bundle over K3 solution, the complex structures are rather transparent and we can describe how they can be fixed. To begin, the complex structures are simply those on the K3 plus that on the $T^2$. For the $T^2$, its complex structure determines the integral first Chern class quantization condition (3.4) for $\omega_1$ and $\omega_2$. For an arbitrary torus complex structure $\tau = \tau_1 + i\tau_2$, the quantization conditions depend on $\tau$ and take the form

1

1

1

2 Z,

1 2 Z,

(3.31)

2

2

2 2

fixes $\tau$. And even if we were to allow $\tau$ to vary infinitesimally, the complex structure integrability condition that $\omega = \omega_1 + i\omega_2$ has only (2, 0) and (1, 1) components on K3 and the topological condition (3.5) must be imposed. All together, these strong conditions generically fix the $T^2$ complex structure moduli. Note also that the condition $H^{(1,1)}(K3, \mathbb{Z}) = H^{(1,1)}(K3) \cap H^{2}(K3, \mathbb{Z})$ also strongly constrains the complex structure of the K3 since the dimension of $H^{(1,1)}(K3, \mathbb{Z})$ does vary with the complex structure of K3 (see footnote 6).

The complex structures of K3 can also be fixed if the T 2 twist contains a (2, 0) selfdual part, (2,0) = kK3 , which up to a constant k must be proportional to the holomorphic

6 That the $T^2$ complex structures are fixed has also been noted from the gauged linear sigma model point of view in [18].


(2, 0)-form of K3. The above mentioned quantization condition for the (2, 0) part then takes the

form (for = i)

k

(3.32)

K3 Z,

2

which defines the periods of the holomorphic (2, 0)-form on the K3. These periods specify

the complex structures chosen on K3, and the quantization condition thus fixes the complex

structures on K3.

4. Conclusions and open questions

In this paper, we have derived the defining equations for the local moduli of supersymmetric heterotic flux compactifications. The defining equations were derived by performing a linear variation of the supersymmetry constraints obeyed by such compactifications. We further analyzed the corresponding geometric moduli spaces and discussed the particular example of a $T^2$ bundle over K3 in detail. This $T^2$ bundle over K3 solution is special in that it is dual to M- or F-theory on K3 × K3. Notice that under infinitesimal deformations, the manifold K3 × K3 remains K3 × K3. Thus, the corresponding heterotic $T^2$ bundle over K3 dual must also be locally unique; that is, it remains a $T^2$ bundle over K3 under infinitesimal variation.

In much of our analysis, we have set the gauge bundle to be trivial. For the $T^2$ bundle over the K3 case, the non-trivial, non-U(1) bundles are simply the stable bundles on K3 lifted to the six-dimensional space. The moduli space then corresponds to the space of stable bundles on K3. The dimension of this moduli space M is given by the Mukai formula [29]

$$ \dim M = 2r\, c_2(E) - (r - 1)\, c_1^2(E) - 2r^2 + 2, \tag{4.1} $$

where r is the rank of the bundle (i.e. the dimension of the fiber), and $(c_1(E), c_2(E))$ are the first and second Chern numbers of the gauge bundle E. It would be interesting to understand the moduli space of stable gauge bundles in general.
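To make the counting in the Mukai formula (4.1) concrete, here is a minimal sketch that evaluates it for given Chern numbers. The function name `mukai_dimension` and the sample input are illustrative assumptions, not values taken from the text.

```python
def mukai_dimension(rank, c1_squared, c2):
    """Dimension of the moduli space of stable bundles on K3, Eq. (4.1).

    rank        -- rank r of the bundle
    c1_squared  -- the Chern number c_1^2(E)
    c2          -- the Chern number c_2(E)
    """
    return 2 * rank * c2 - (rank - 1) * c1_squared - 2 * rank**2 + 2

# Hypothetical example: a rank-2 bundle with c_1 = 0 and c_2 = 4.
print(mukai_dimension(rank=2, c1_squared=0, c2=4))
```

For this sample input the four terms are 16, 0, 8 and 2, so the formula gives 16 - 0 - 8 + 2 = 10.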

There are a number of interesting open questions. First, in our analysis we have kept the complex structure fixed for simplicity. It is well known that for Calabi–Yau compactifications the moduli space is a direct product of complex structure and Kähler structure deformations. For non-Kähler manifolds with torsion, this is likely not the case, and it would be interesting to allow for a simultaneous variation of the complex structure and the Hermitian form.

It would be interesting to analyze the geometry of the moduli space and to determine if powerful tools such as the well-known special geometry of Calabi–Yau compactifications [30] can be derived in this case.

Furthermore, counting techniques for moduli fields need to be developed and we expect that

the number of moduli can be characterized in terms of an index or some topological invariants

of the manifold.

Finally, it would be interesting to analyze the moduli space from the world-sheet approach

using the recently constructed gauged linear sigma model [18]. Moduli fields will correspond to

the marginal deformations of the IR conformal field theory.

Acknowledgements

We would like to thank A. Adams, K. Becker, S. Giddings, J. Lapan, J. Sparks, E. Sharpe,

A. Subotic, V. Tosatti, D. Waldram, M.-T. Wang, P. Yi, and especially J.-X. Fu for helpful discussions. We thank the 2006 Simons Workshop at YITP Stony Brook for hospitality where part

of this work was done. M. Becker would like to thank members of the Harvard Physics Department for their warm hospitality during the final stages of this work. The work of M. Becker is

supported by NSF grants PHY-0505757, PHY-0555575 and the University of Texas A&M. The

work of L.-S. Tseng is supported in part by NSF grant DMS-0306600 and Harvard University.

The work of S.-T. Yau is supported in part by NSF grants DMS-0306600, DMS-0354737, and

DMS-0628341.

Appendix A

We summarize our notation and conventions.

Our index conventions are as follows: $m, n, p, q, \ldots$ denote real six-dimensional coordinates, $a, b, c, \ldots$ and $\bar a, \bar b, \bar c, \ldots$ denote six-dimensional complex coordinates, and $i, j, k, \ldots$ and $\bar i, \bar j, \bar k, \ldots$ denote four-dimensional complex coordinates on the base K3.

The gauge field $A_m$ and field strength $F_{mn}$ take values in the SO(32) or $E_8 \times E_8$ Lie algebra with the generators being anti-Hermitian.

The Riemann tensor is defined as follows

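$$ R_{mn}{}^{p}{}_{q} = \partial_m \Gamma_n{}^{p}{}_{q} - \partial_n \Gamma_m{}^{p}{}_{q} + \Gamma_m{}^{p}{}_{r}\, \Gamma_n{}^{r}{}_{q} - \Gamma_n{}^{p}{}_{r}\, \Gamma_m{}^{r}{}_{q}. $$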

With a Hermitian metric g with components ga b , we write the Hermitian curvature two1 ] where g is the transposed of g with components g .

g 1 g]

= [(g)g

form as R = [

ba

Explicitly, in components, we write

cd

dc

c

Rab

.

= a (b gd d )g

d = a g gdd

We follow the convention standard in the mathematics literature for the Hodge star operator.

For example, $(\star H)_{mnp} = \frac{1}{3!}\, H_{rst}\, \epsilon^{rst}{}_{mnp}$ with $\epsilon_{mnprst}$ being the Levi-Civita tensor.

We use the definition for 2J :

J3

.

3!

For a vector field, v = v m m , the interior product acting on a p-form with components

m1 m2 ...mp is just

= 2J

Given a Hermitian form J , the adjoint of the Lefschetz operator acting on a p-form with

components m1 m2 ...mp is

()m3 m4 ...mp =

1 m1 m2

J

m1 m2 m3 m4 ...mp .

2!

References

[1] P. Candelas, G.T. Horowitz, A. Strominger, E. Witten, Vacuum configurations for superstrings, Nucl. Phys. B 258

(1985) 46.

[2] V. Braun, Y.H. He, B.A. Ovrut, T. Pantev, A heterotic standard model, Phys. Lett. B 618 (2005) 252, hep-th/0501070.

[3] V. Bouchard, R. Donagi, An SU(5) heterotic standard model, Phys. Lett. B 633 (2006) 783, hep-th/0512149.

[4] H. Verlinde, M. Wijnholt, Building the standard model on a D3-brane, hep-th/0508089.


[5] R. Blumenhagen, M. Cvetic, P. Langacker, G. Shiu, Toward realistic intersecting D-brane models, Annu. Rev. Nucl.

Part. Sci. 55 (2005) 71, hep-th/0502005.

[6] M.R. Douglas, S. Kachru, Flux compactification, hep-th/0610102.

[7] M. Grana, Flux compactifications in string theory: A comprehensive review, Phys. Rep. 423 (2006) 91, hep-th/

0509003.

[8] C.M. Hull, Superstring compactifications with torsion and spacetime supersymmetry, in: R. D'Auria, D. Fre (Eds.),

1st Torino Meeting on Superunification and Extra Dimensions, September 1985, Torino, Italy, World Scientific,

Singapore, 1986, p. 347.

[9] A. Strominger, Superstrings with torsion, Nucl. Phys. B 274 (1986) 253.

[10] B. de Wit, D.J. Smit, N.D. Hari Dass, Residual supersymmetry of compactified d = 10 supergravity, Nucl. Phys.

B 283 (1987) 165.

[11] K. Dasgupta, G. Rajesh, S. Sethi, M theory, orientifolds and G-flux, JHEP 9908 (1999) 023, hep-th/9908088;

K. Becker, K. Dasgupta, Heterotic strings with torsion, JHEP 0211 (2002) 006, hep-th/0209077.

[12] K. Becker, M. Becker, K. Dasgupta, P.S. Green, Compactifications of heterotic theory on non-Kähler complex manifolds I, JHEP 0304 (2003) 007, hep-th/0301161;
K. Becker, M. Becker, P.S. Green, K. Dasgupta, E. Sharpe, Compactifications of heterotic strings on non-Kähler complex manifolds II, Nucl. Phys. B 678 (2004) 19, hep-th/0310058.

[13] J.-X. Fu, S.-T. Yau, The theory of superstring with flux on non-Kähler manifolds and the complex Monge–Ampère equation, hep-th/0604063.

[14] K. Becker, M. Becker, J.-X. Fu, L.-S. Tseng, S.-T. Yau, Anomaly cancellation and smooth non-Kähler solutions in heterotic string theory, Nucl. Phys. B 751 (2006) 108, hep-th/0604137.

[15] M. Cyrier, J.M. Lapan, Towards the massless spectrum of non-Kaehler heterotic compactifications, hep-th/0605131.

[16] T. Kimura, P. Yi, Comments on heterotic flux compactifications, JHEP 0607 (2006) 030, hep-th/0605247.

[17] S. Kim, P. Yi, A heterotic flux background and calibrated five-branes, JHEP 0611 (2006) 040, hep-th/0607091.

[18] A. Adams, M. Ernebjerg, J.M. Lapan, Linear models for flux vacua, hep-th/0611084.

[19] E.A. Bergshoeff, M. de Roo, The quartic effective action of the heterotic string and supersymmetry, Nucl. Phys.

B 328 (1989) 439.

[20] J.P. Gauntlett, D. Martelli, S. Pakis, D. Waldram, Commun. Math. Phys. 247 (2004) 421, hep-th/0205050.

[21] G. Lopes Cardoso, G. Curio, G. Dall'Agata, D. Lust, BPS action and superpotential for heterotic string compactifications with fluxes, JHEP 0310 (2003) 004, hep-th/0306088.

[22] D.J. Gross, E. Witten, Superstring modifications of Einstein's equations, Nucl. Phys. B 277 (1986) 1.

[23] D. Nemeschansky, A. Sen, Conformal invariance of supersymmetric sigma models on Calabi–Yau manifolds, Phys. Lett. B 178 (1986) 365.

[24] P. Candelas, Yukawa couplings between (2, 1) forms, Nucl. Phys. B 298 (1988) 458;
P. Candelas, X. de la Ossa, Moduli space of Calabi–Yau manifolds, Nucl. Phys. B 355 (1991) 455.

[25] R. Bott, S.S. Chern, Hermitian vector bundles and the equidistribution of the zeroes of their holomorphic sections,

Acta Math. 114 (1965) 71.

[26] C.M. Hull, Actions for (2, 1) sigma models and strings, Nucl. Phys. B 509 (1998) 252, hep-th/9702067.

[27] K. Becker, M. Becker, K. Dasgupta, S. Prokushkin, Properties of heterotic vacua from superpotentials, Nucl. Phys.

B 666 (2003) 144, hep-th/0304001.

[28] K. Becker, L.-S. Tseng, Heterotic flux compactifications and their moduli, Nucl. Phys. B 741 (2006) 162, hep-th/

0509131.

[29] S. Mukai, Moduli of vector bundles on K3 surfaces, and symplectic manifolds, Sugaku Expositions 1 (1988) 139.

[30] A. Strominger, Special geometry, Commun. Math. Phys. 133 (1990) 163.

with two spinors via a-maximization

Teruhiko Kawano, Futoshi Yagi

Department of Physics, University of Tokyo, Hongo, Tokyo 113-0033, Japan

Received 11 June 2007; accepted 6 July 2007

Available online 19 July 2007

Abstract

We give a detailed analysis of the superconformal fixed points of four-dimensional N = 1 supersymmetric Spin(10) gauge theory with two spinors and vectors by using the a-maximization procedure.

© 2007 Elsevier B.V. All rights reserved.

1. Introduction

In the previous paper [1], we studied four-dimensional N = 1 supersymmetric Spin(10) gauge theory with a single chiral superfield in the spinor representation and $N_Q$ chiral superfields $Q^i$ ($i = 1, \ldots, N_Q$) in the vector representation and with no superpotential at the superconformal infrared (IR) fixed point. This theory is believed to have a non-trivial IR fixed point for $7 \le N_Q \le 21$, where the dual description is available [2,3].

At the IR fixed points, since the conformal dimension $D(O)$ of a gauge invariant chiral primary operator O can be determined by the superconformal $U(1)_R$ charge $R(O)$ [4] as

$$ D(O) = \frac{3}{2} R(O), \tag{1.1} $$

the $U(1)_R$ symmetry in the superconformal algebra plays an important role.

Unitarity requires the conformal dimension of a gauge invariant Lorentz scalar to satisfy $D(O) \ge 1$, where the equality is satisfied if and only if O is free [5]. With (1.1), this implies

$$ R(O) \ge \frac{2}{3}. \tag{1.2} $$

* Corresponding author.

0550-3213/$ – see front matter © 2007 Elsevier B.V. All rights reserved.

doi:10.1016/j.nuclphysb.2007.07.007

However, a gauge invariant chiral primary operator sometimes appears to violate the inequality (1.2), when we naively assume that the global symmetry at the IR fixed point is the same as that in the ultraviolet (UV) region. It has been argued in [6–8] that the operator O decouples from the remaining interacting system to become free at the IR fixed point, where a new global U(1) symmetry which transforms only O is enhanced and the real $U(1)_R$ charge of O becomes 2/3.

The superconformal $U(1)_R$ symmetry can be expressed as a linear combination of anomaly-free U(1) symmetries as

$$ U(1)_R = U(1)_\lambda + \sum_i x_i\, U(1)_i, \tag{1.3} $$

where global U(1) symmetries under which the gaugino has no charge are denoted by $U(1)_i$ ($i = 1, 2, \ldots$) and an anomaly-free U(1) symmetry which transforms the gaugino with charge 1 by $U(1)_\lambda$. In order to determine the superconformal $U(1)_R$ symmetry, we have to determine the coefficients $x_i$ in (1.3). In fact, we may use a-maximization [9] for this purpose. Following this method, we regard $x_i$ in (1.3) as variables to be determined and construct the trial a-function (see footnote 1)

$$ a_0(x_1, x_2, \ldots) = 3\,\mathrm{Tr}\,R^3 - \mathrm{Tr}\,R. \tag{1.4} $$

Each term on the right-hand side of (1.4) represents the 't Hooft anomaly [10], where the charge R is the $U(1)_R$ charge given in terms of $x_i$ in (1.3), but they are not necessarily the superconformal $U(1)_R$ charges at the IR fixed point. If there are no accidental symmetries at the IR fixed point, the 't Hooft anomalies can be evaluated in the UV by using the 't Hooft anomaly matching condition for asymptotically-free theories. Then, a-maximization tells us that the local maximum of the function (1.4) gives $x_i$ for the superconformal $U(1)_R$ symmetry in (1.3).

However, as mentioned above, the function (1.4) does not make sense in the range of xi

where gauge invariant chiral primary operators seem to violate the unitarity bounds (1.2). It was

proposed in the paper [7] that, in the range where operators Oi seem to violate the unitarity

bounds (1.2), the trial a-function should be modified into

$$ a(x_1, x_2, \ldots) = a_0 + \sum_i \left[ -a_{O_i}\big(R(O_i)\big) + a_{O_i}(2/3) \right]. \tag{1.5} $$

The function aOi represents the contribution from the operator Oi to the trial a-function and can

be evaluated as

$$ a_{O_i}\big(R(O_i)\big) = d_{O_i} \left[ 3\big(R(O_i) - 1\big)^3 - \big(R(O_i) - 1\big) \right], \tag{1.6} $$

where dOi is the number of the components of the operator Oi , and R(Oi ) is the U (1)R charge

of Oi , as given in (1.3). The term aOi (2/3) is obtained by substituting the value R(Oi ) = 2/3

of free fields into (1.6) to give 2dOi /9. The prescription (1.5) can be interpreted as subtracting

the contribution which is evaluated under the assumption that the operator Oi is interacting and

adding the contribution of the operators as free fields. Thus, by dividing the range of xi according to which operators hit the unitarity bounds and by modifying the trial a-function as (1.5)

for each range, we obtain the trial a-function in the whole range of the variables xi [1,7]. The

superconformal U (1)R symmetry could be identified by the local maximum of this function.
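The prescription (1.5) with the single-operator contribution (1.6) can be sketched in a few lines. The function names `a_contribution` and `free_field_correction` are illustrative choices for this sketch; exact rational arithmetic is used so that the free-field value $2 d_O / 9$ quoted in the text can be checked directly.

```python
from fractions import Fraction

def a_contribution(r_charge, n_components=1):
    """Contribution d_O * [3 (R - 1)^3 - (R - 1)] of an operator, Eq. (1.6)."""
    y = Fraction(r_charge) - 1
    return n_components * (3 * y**3 - y)

def free_field_correction(r_charge, n_components=1):
    """Term added to a_0 in Eq. (1.5): -a_O(R(O)) + a_O(2/3).

    It removes the contribution computed as if O were interacting and
    adds back the contribution of O as a free field with R = 2/3.
    """
    return (-a_contribution(r_charge, n_components)
            + a_contribution(Fraction(2, 3), n_components))

# A free field (R = 2/3) with d_O components contributes 2 d_O / 9,
# and the correction vanishes exactly at the unitarity bound.
print(a_contribution(Fraction(2, 3), n_components=5))   # 10/9
print(free_field_correction(Fraction(2, 3)))            # 0
```

Because the correction vanishes at $R(O) = 2/3$, the trial a-function built this way is continuous across the boundaries between ranges of $x_i$.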

1 We omit the overall factor 3/32 of the trial a-function in this paper, which does not affect the calculation of the

U (1)R charges.


By using the method discussed above, we showed in the previous paper [1] that the meson operator $M^{ij} = Q^i Q^j$ hits the unitarity bound and becomes free for $N_Q = 7, 8, 9$. We also analyzed the IR fixed point by using the electric–magnetic duality and found that the decoupling of the meson operator can be seen more clearly in the magnetic theory. In the magnetic theory, since the meson operator is described by elementary fields, we do not need the prescription (1.5). We thus proved the validity of the prescription (1.5) in the theory.

The magnetic theory is $SU(N_Q - 5)$ gauge theory with $N_Q$ antifundamentals $\bar q_i$, a single fundamental $q$, a symmetric tensor $s$, and singlets $M^{ij}$ and $Y^i$, and its superpotential is given by

$$ W_{\rm mag} = M^{ij}\, \bar q_i\, s\, \bar q_j + Y^i\, q\, \bar q_i + \det s. \tag{1.7} $$

in the electric theory, respectively. When M ij hits the unitarity bound, it decouples from the

interacting system, and thus, the interaction M ij qi s qj in (1.7) becomes irrelevant at the IR fixed

point. This can be checked by evaluating the U (1)R charge of this term. Thus, we may identify

the remaining interacting system with the theory without the term M ij qi s qj in the superpotential

so that we can construct the trial a-function of this interacting system together with the free

meson without the prescription (1.5), but the resulting function is actually identical to (1.5).

We further discussed that, since the interaction M ij qi s qj in (1.7) vanishes at the IR fixed

point, we do not have the F -term condition for M ij , and new massless degrees of freedom

corresponding to qi s qj appear there. The dual of the magnetic theory without the interaction

term M ij qi s qj is given by the original electric theory but with the superpotential

$$ W = N_{ij}\, Q^i Q^j, \tag{1.8} $$

where Nij are additional singlets and correspond to qi s qj . We found that the IR fixed point of the

original theory is identical to that of this theory. This renormalization group flow can be seen in

the original electric theory by introducing auxiliary fields M ij and the Lagrange multipliers Nij

to give the superpotential

$$ W = N_{ij} \left( Q^i Q^j - M^{ij} \right). $$

We can see that it is the same theory as the original one by integrating out M ij and Nij . The

equations of motion give the constraints

$$ M^{ij} = Q^i Q^j, \qquad N_{ij} = 0. \tag{1.9} $$

When $M^{ij}$ hits the unitarity bound, the interaction $N_{ij} M^{ij}$ becomes irrelevant, giving rise to the superpotential (1.8) at the IR fixed point, where the constraints (1.9) do not exist. In this way, we find that a-maximization and the electric–magnetic duality reveal the rich dynamics at the IR fixed point.

In this paper, we extend the analysis to the theory with two spinors and NQ vectors and show

that the meson operator M ij = Qi Qj decouples from the interacting system to become free for

NQ = 6, 7.

This paper is organized as follows: In Section 2, we briefly review the electric–magnetic duality in the theory with two spinors, especially the matching of gauge invariant operators.

In Section 3, we study which operators become free by using a-maximization for both electric

and magnetic theory. Section 4 is devoted to summary and discussion. In the appendices, we

discuss the gauge invariant operators in both the electric and the magnetic theory.


We study four-dimensional N = 1 supersymmetric Spin(10) gauge theory with two chiral superfields $\Psi_I$ (I = 1, 2) in the spinor representation and $N_Q$ chiral superfields $Q^i$ ($i = 1, \ldots, N_Q$) in the vector representation and with no superpotential. From the 1-loop beta function, we find that it is asymptotically free for $N_Q \le 19$. It is believed that the theory has a non-trivial superconformal IR fixed point for $6 \le N_Q \le 19$, where the magnetic dual description exists [11].

This theory has the anomaly-free global symmetry $SU(N_Q) \times SU(2) \times U(1)_F \times U(1)_\lambda$, and the fields $Q^i$ and $\Psi_I$ have charges $(N_Q, 1, -4, 1)$ and $(1, 2, N_Q, -1)$, respectively, under the symmetry. Here, $U(1)_F$ is a global symmetry under which the gaugino has no charge, while $U(1)_\lambda$ transforms it with charge 1. If there are no accidental symmetries at the IR fixed point, the $U(1)_R$ symmetry in the superconformal algebra should be given as a linear combination of these U(1) symmetries as

$$ U(1)_R = x\, U(1)_F + U(1)_\lambda \tag{2.1} $$

with some real number x. Thus, the $U(1)_R$ charge of the matter fields can be expressed as

$$ R(Q) = -4x + 1, \qquad R(\Psi) = N_Q\, x - 1. \tag{2.2} $$
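The charge assignment (2.2) can be cross-checked against anomaly freedom and the quoted range of asymptotic freedom. The sketch below assumes the Dynkin-index normalization in which the Spin(10) vector has index 1, the spinor has index 2 and the adjoint has index 8; the function names are illustrative.

```python
from fractions import Fraction

# Dynkin indices of Spin(10) representations, normalized so that the
# vector has index 1 (an assumption of this sketch).
T_VECTOR, T_SPINOR, T_ADJOINT = 1, 2, 8

def r_charges(x, n_q):
    """R-charges of Q and Psi as functions of x, Eq. (2.2)."""
    return -4 * x + 1, n_q * x - 1

def r_gauge_anomaly(x, n_q):
    """U(1)_R-Spin(10)^2 anomaly: gaugino (charge 1) plus matter fermions (charge R - 1)."""
    r_q, r_psi = r_charges(x, n_q)
    return T_ADJOINT + n_q * T_VECTOR * (r_q - 1) + 2 * T_SPINOR * (r_psi - 1)

def one_loop_coefficient(n_q):
    """One-loop beta function coefficient; positive means asymptotically free."""
    return 3 * T_ADJOINT - n_q * T_VECTOR - 2 * T_SPINOR

# The anomaly vanishes for every x, so (2.2) is anomaly-free for any trial x.
print([r_gauge_anomaly(Fraction(k, 7), 6) for k in range(4)])
print([n for n in range(25) if one_loop_coefficient(n) > 0][-1])
```

With these indices the one-loop coefficient is $20 - N_Q$, which is positive exactly for $N_Q \le 19$, in line with the range stated above.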

As explained in the introduction, in order to construct the trial a-function in the whole range

of x, we need to know gauge invariant chiral primary operators at the IR fixed point. As discussed

in Appendix A, the gauge invariant generators of the classical chiral ring of this theory are given

by

M ij = Qai Qaj ,

YXi = IT C(2 X )I J a J Qai ,

C i1 i3 = IT C(2 )I J a1 a3 J Qa1 i1 Qa3 i3 ,

i i5

BX1

i i9

E2 X1

H i1 i4 = IT C(2 X )I J a1 a5 J KT C(2 X )KL a1 L Qa2 i2 Qa5 i5 ,

D0 i1 i6 = a1 a10 Qa1 i1 Qa6 i6 W a7 a8 W a9 a10 ,

D1 i1 i8 = a1 a10 Qa1 i1 Qa8 i8 W a9 a10 ,

D2 i1 i10 = a1 a10 Qa1 i1 Qa10 i10 ,

S = Tr W W .

(2.3)

Here, $a$ and $a_1, a_2, \ldots$ are the indices of the gauge group Spin(10), and the matrix C is its charge conjugation matrix. The matrices $\sigma_X$ (X = 1, 2, 3) are the Pauli matrices for the flavor of the spinors. Taking account of the number of the antisymmetrized indices of the $SU(N_Q)$ global symmetry, we see that whether each operator exists depends on $N_Q$. For example, the operator $D_2^{\,i_1 \cdots i_{10}}$ exists only for $N_Q \ge 10$.

Now, we turn to the magnetic theory, which is believed to be equivalent to the original electric theory in the IR region. The magnetic theory is given by $SU(N_Q - 3) \times Sp(1)$ gauge theory with


Table 1

The matter contents of the magnetic theory

SU(NQ 3)

a, b, . . .

Sp(1)

, , . . .

SU(NQ )

i, j, . . .

SU(2)

I, J, . . .

U (1)F

qai

2 NQ 3

q a I

Q

N 3

Q

a

qX

2NQ NQ 3

s ab

22

t I

M ij

i

YX

4NQ

NQ 3

1

1

1

2

1

1

1

22

2

2

1

3

2NQ

8

2NQ 4

U (1)

N 6

1

NQ 3

NQ 2

NQ 3

3NQ 10

NQ 3

N 23

Q

2N

N 4

Q

2

2

1

a

+ I J q a I s ab qb J

Wmag = M ij qai s ab qbj + YXi qai qX

a J

+ (X 2 )I J q a I qX

t .

(2.4)

This theory has the same anomaly-free global symmetry as the electric theory. Thus, the U (1)R

symmetry should also be expressed as (2.1) with the same value of x as in the electric theory.

There exist gauge invariant operators in this theory which correspond to those of the electric

theory. They are fundamental singlets M ij and YXi and the following composite operators:

(C)i1 iNQ 3

a1 aNQ 3

(B)Xi1 iNQ 5

(F )i1 iNQ 7

a1 aNQ 3

a1 aNQ 3

q I

aN

Q 6

aN

Q 4

(2 X )I J q J

aN

Q 3

(2 X )I J q J

aN

Q 5

q aNQ 4 (2 X )KL q L

aN

K

Q 3

aNQ 9

aNQ 8 aNQ 7

(sw )

aNQ 6 aNQ 5 NQ 4 NQ 3

qY

qZ

,

(sw )

G t I (2 )I J t J ,

(H )i1 iNQ 4 I J

a1 aNQ 3

aN

Q 3

t J ,

qX

qY

qZ

,

aNQ 8

a

aN 5 aN 4 aN 3

a

sw NQ 7 NQ 6 qX Q qY Q qZ Q ,

a

a

aN 5 aN 4 aN 3

a

a

a

(sw ) NQ 9 NQ 8 sw NQ 7 NQ 6 qX Q qY Q qZ Q ,

S Tr w w ,

S Tr w w .

(2.5)


Here, $w$ and $\tilde w$ are the field strengths of the $SU(N_Q - 3)$ and Sp(1) gauge groups, respectively (see footnote 2), and the operation $\ast$ represents the Hodge duality with respect to the flavor $SU(N_Q)$ indices. The magnetic theory has two kinds of glueball superfields corresponding to the two gauge group factors. We can check that every operator has the same charges as that of the electric theory.

Furthermore, it seems that more gauge invariant generators exist in the magnetic theory than

in the electric one. They are given by

U0 = det s,

U1XY = a1 aNQ 3 b1 bNQ 3 s a1 b1 s

qX

qY

,

s a1 b1 s

qX1

qX2

qY1

qY2

,

U3 = X1 X2 X3 Y1 Y2 Y3 a1 aNQ 3 b1 bNQ 3

aNQ 6 bNQ 6 aNQ 5 aNQ 4 aNQ 3 bNQ 5 bNQ 4 bNQ 3

q X1

qX2

qX3

qY1

qY2

qY3

,

a

a

N

4

N

3

a

(E0 )Xi1 iNQ 5 = XY Z a1 aNQ 3 (s qi1 )a1 (s qiNQ 5 ) NQ 5 qY Q qZ Q ,

a

(E1 )Xi1 iNQ 7 = a1 aNQ 3 XY Z (s qi1 )a1 (s qiNQ 7 ) NQ 7

s a1 b1 s

a

aN 4 aN 3

a

sw NQ 6 NQ 5 qY Q qZ Q ,

aNQ 4 aNQ 3

qX

,

aNQ 6

sw

aN

a

Q 5 NQ 4

aNQ 3

qX

aNQ 8

a

aN 3

a

a

a

(sw ) NQ 7 NQ 6 sw NQ 5 NQ 4 qX Q ,

a

a

a

(J1 )i1 iNQ 5 = a1 aNQ 3 (s qi1 )a1 (s qiNQ 5 ) NQ 5 sw NQ 4 NQ 3 ,

a1

a

a

a

a

(sw ) NQ 6 NQ 5 sw NQ 4 NQ 3 .

(2.6)

In spite of our best efforts, we have not succeeded in showing that these operators decompose or vanish in the classical chiral ring, as discussed in Appendix B.

The discrepancy makes it difficult for us to understand what happens at the IR fixed point.

Though these two theories might actually not be equivalent to each other at the IR fixed point, it

is not plausible that all the other non-trivial checks discussed in [11] are only accidental. Thus, in

this paper, we assume that the classical chiral ring is deformed by the quantum effects and that the

quantum chiral rings of both the theories are identical. However, it is still unclear what is indeed

happening quantum-mechanically at the IR fixed point. This issue affects the construction of the

trial a-function. Therefore, we will consider both the functions in the electric and the magnetic

theory and compare the results. In the next section, we will see that both the functions have the

identical local maximum.

2 The index of the field strength w and w

is that of Lorentz spinors, which would not cause any confusion.


3. a-Maximization

In this section, we study Spin(10) gauge theory with two spinors and NQ vectors at the superconformal IR fixed point both in the electric and the magnetic theory by using a-maximization.

We calculate the local maximum of the trial a-function defined in the whole range of the parameter x and determine which operators become free at the IR fixed point.

3.1. a-Maximization in the electric theory

We begin with the electric theory. As the result depends on $N_Q$, we first analyze the case $N_Q = 6$. Taking account of the number of the antisymmetrized indices of the global symmetry $SU(N_Q)$, we find that the gauge invariant operators in this case are M, Y, C, B, G, H, $D_0$, and S in (2.3). Since the $U(1)_R$ charge of the glueball superfield S is always 2 and never hits the unitarity bound, we can concentrate on the other seven operators. Using (2.2), the $U(1)_R$ charges $R(O)$ of the gauge invariant operators can be written in terms of x as

$$ \begin{aligned}
R(M) &= -8x + 2, & R(Y) &= 8x - 1, & R(C) &= 1, & R(B) &= -8x + 3, \\
R(G) &= 24x - 4, & R(H) &= 8x, & R(D_0) &= -24x + 8. &&
\end{aligned} \tag{3.1} $$

By solving $R(O) < 2/3$ for each operator, we find the ranges of x given in Fig. 1. Since the operator C does not hit the unitarity bound for any x, it does not appear in the figure.
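The ranges can be reproduced from the linear charges in (3.1) by solving $R(O) < 2/3$ operator by operator. The following sketch does this with exact fractions; the dictionary `R_CHARGES` (slope and intercept of each charge as a function of x) and the function name `violation_range` are illustrative.

```python
from fractions import Fraction

BOUND = Fraction(2, 3)

# R(O) = slope * x + intercept for N_Q = 6, read off from Eq. (3.1).
R_CHARGES = {
    "M": (-8, 2), "Y": (8, -1), "C": (0, 1), "B": (-8, 3),
    "G": (24, -4), "H": (8, 0), "D0": (-24, 8),
}

def violation_range(slope, intercept):
    """Return (side, threshold) describing where R(O) < 2/3, or None if nowhere."""
    if slope == 0:
        # A constant charge either always or never violates the bound.
        return ("all x", None) if intercept < BOUND else None
    threshold = (BOUND - intercept) / slope
    # Increasing charge violates the bound below the threshold, decreasing above it.
    return ("x <", threshold) if slope > 0 else ("x >", threshold)

for name, (slope, intercept) in R_CHARGES.items():
    print(name, violation_range(slope, intercept))
```

This gives $x > 1/6$ for M, $x < 5/24$ for Y, $x < 7/36$ for G, $x < 1/12$ for H, $x > 7/24$ for B and $x > 11/36$ for $D_0$, while C never violates the bound.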

Now, we construct the trial a-function in the whole range of the parameter x. The trial a-function in the region where no operators hit the unitarity bound is given by

$$ a_0(x) = 90 + 32\, F\big(R(\Psi)\big) + 10 N_Q\, F\big(R(Q)\big), $$

where $F(y) = 3(y - 1)^3 - (y - 1)$. The first term of this function is the contribution from the gaugino. The $U(1)_R$ charges $R(\Psi)$ and $R(Q)$ may be rewritten in terms of x as (2.2). We modify this function as (1.5) for each range according to which operators hit the unitarity bound. Writing each term in the summation of (1.5) as $f_O(x) = -a_O(R(O)) + a_O(2/3)$, the trial a-function for the whole range of x is given by

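\[
a(x) =
\begin{cases}
a_0(x) + f_Y(x) + f_G(x) + f_H(x) & (x \le \tfrac{1}{12}),\\
a_0(x) + f_Y(x) + f_G(x) & (\tfrac{1}{12} \le x \le \tfrac{1}{6}),\\
a_0(x) + f_M(x) + f_Y(x) + f_G(x) & (\tfrac{1}{6} \le x \le \tfrac{7}{36}),\\
a_0(x) + f_M(x) + f_Y(x) & (\tfrac{7}{36} \le x \le \tfrac{5}{24}),\\
a_0(x) + f_M(x) & (\tfrac{5}{24} \le x \le \tfrac{7}{24}),\\
a_0(x) + f_M(x) + f_B(x) & (\tfrac{7}{24} \le x \le \tfrac{11}{36}),\\
a_0(x) + f_M(x) + f_B(x) + f_{D_0}(x) & (\tfrac{11}{36} \le x).
\end{cases}
\tag{3.2}
\]

Fig. 1. The ranges of $x$ where each operator hits the unitarity bound for $N_Q = 6$.

More explicitly, the function $f_O$ is given by

\[
f_O(x) = d_O\left[-3\bigl(R(O) - 1\bigr)^3 + \bigl(R(O) - 1\bigr) + \frac{2}{9}\right],
\tag{3.3}
\]

where $d_O$ is the number of components of the operator $O$,

\[
d_M = \frac{N_Q(N_Q+1)}{2}, \quad
d_Y = 3N_Q, \quad
d_G = 1, \quad
d_H = \frac{N_Q!}{4!\,(N_Q-4)!}, \quad
d_B = \frac{3\,N_Q!}{5!\,(N_Q-5)!}, \quad
d_{D_0} = \frac{N_Q!}{6!\,(N_Q-6)!}.
\tag{3.4}
\]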

The $U(1)_R$ charge $R(O_i)$ for each operator $O_i$ is given in (3.1). We find that the function (3.2) has a unique local maximum at

\[
x = \frac{18N_Q + 6 - \sqrt{-4N_Q^3 + 143N_Q^2 - 928N_Q + 1824}}{6\,(N_Q^2 + 8N_Q - 12)},
\tag{3.5}
\]

or equivalently, substituting this into (2.2),

\[
\begin{aligned}
R(Q) &= \frac{3N_Q^2 - 12N_Q - 48 + 2\sqrt{-4N_Q^3 + 143N_Q^2 - 928N_Q + 1824}}{3\,(N_Q^2 + 8N_Q - 12)},\\
R(\Psi) &= \frac{12N_Q^2 - 42N_Q + 72 - N_Q\sqrt{-4N_Q^3 + 143N_Q^2 - 928N_Q + 1824}}{6\,(N_Q^2 + 8N_Q - 12)},
\end{aligned}
\tag{3.6}
\]

which is in the range where only the meson operator $M^{ij}$ hits the unitarity bound. Thus, we find that the meson operator $M^{ij}$ decouples from the interacting system and becomes free at the IR fixed point for $N_Q = 6$.
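As a cross-check of this statement, the maximization can be repeated numerically on the interval $5/24 \le x \le 7/24$, where only $M$ hits the bound. The sketch below is not taken from the paper: the parametrization $R(Q) = 1 - 4x$, $R(\Psi) = N_Q x - 1$ is an assumed stand-in for (2.2), inferred from the charges in (3.1), and the function names and the ternary search are illustrative choices. The closed form is (3.5) with the minus sign in front of the square root.

```python
import math

N_Q = 6


def F(y):
    """Contribution of a chiral field with R-charge y, as defined below a_0(x)."""
    return 3 * (y - 1) ** 3 - (y - 1)


def r_q(x):
    # Assumed form of (2.2): fixed by R(M) = 2 R(Q) = -8x + 2 in (3.1).
    return 1 - 4 * x


def r_psi(x, n_q=N_Q):
    # Assumed form of (2.2): fixed by R(Y) = 2 R(Psi) + R(Q) in (3.1) at N_Q = 6.
    return n_q * x - 1


def a0(x, n_q=N_Q):
    """Trial a-function when no operator hits the unitarity bound."""
    return 90 + 32 * F(r_psi(x, n_q)) + 10 * n_q * F(r_q(x))


def f_op(d, r):
    """Correction (3.3) for an operator with d components and R-charge r."""
    return d * (-3 * (r - 1) ** 3 + (r - 1) + 2 / 9)


def a_trial(x, n_q=N_Q):
    """Trial a-function on 5/24 <= x <= 7/24, where only M is corrected."""
    d_m = n_q * (n_q + 1) // 2
    return a0(x, n_q) + f_op(d_m, -8 * x + 2)


def x_closed_form(n_q=N_Q):
    """Closed form (3.5), minus sign in front of the square root."""
    disc = -4 * n_q**3 + 143 * n_q**2 - 928 * n_q + 1824
    return (18 * n_q + 6 - math.sqrt(disc)) / (6 * (n_q**2 + 8 * n_q - 12))


def argmax_unimodal(f, lo, hi, iters=200):
    """Ternary search for the maximum of a function with a single peak on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2


if __name__ == "__main__":
    print(argmax_unimodal(a_trial, 5 / 24, 7 / 24), x_closed_form())
```

Under these assumptions the numerical maximum and the closed form agree, and the maximum lies just above the lower edge $x = 5/24$ of the window.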

Also in the case of $N_Q = 7$, we find that $M^{ij}$ hits the unitarity bound and the $U(1)_R$ charges are given by (3.6) in the same way as for $N_Q = 6$, though the ranges of $x$ are different from those in Fig. 1.

We go on to the case $N_Q = 8$. The ranges of $x$ are divided as in Fig. 2. In this case, we encounter a subtlety: we do not understand how to deal with the situation where gauge invariant Lorentz spinors like $D_1$ hit the unitarity bound.³ The best we can do at this stage is to neglect them, assuming that such operators are massive in this case, as in the previous paper [1]. Even if they are actually massless, our analysis in the region where they do not hit the unitarity bound, which is $x \le 1/4$ for this case, is still valid. We find that the trial a-function has a unique local maximum at

\[
x = \frac{12N_Q - \sqrt{2900 - N_Q^2}}{6\,(N_Q^2 - 20)},
\tag{3.7}
\]

or equivalently,

\[
\begin{aligned}
R(Q) &= \frac{3N_Q^2 - 24N_Q - 60 + 2\sqrt{2900 - N_Q^2}}{3\,(N_Q^2 - 20)},\\
R(\Psi) &= \frac{6N_Q^2 + 120 - N_Q\sqrt{2900 - N_Q^2}}{6\,(N_Q^2 - 20)}.
\end{aligned}
\tag{3.8}
\]

³ The unitarity bound for gauge invariant Lorentz spinors is $R(O) \ge 1$ [5].

Fig. 2. The ranges of $x$ where each operator hits the unitarity bound for $N_Q = 8$.

This is in the range where no operators hit the unitarity bound. Though the ranges of $x$ depend on $N_Q$, we obtain a similar result for $9 \le N_Q \le 19$.

In summary, we find that for $N_Q = 6, 7$, the $U(1)_R$ charges are given by (3.6) and the meson operator $M^{ij}$ becomes free, while for $8 \le N_Q \le 19$, the $U(1)_R$ charges are given by (3.8) and no operators become free.
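For orientation, the location (3.7) can be recovered directly from $a_0(x)$. The short derivation below is a sketch that assumes the parametrization $R(Q) = 1 - 4x$, $R(\Psi) = N_Q x - 1$ for (2.2); this form is inferred from the charges in (3.1) and is not quoted from the text.

```latex
\begin{aligned}
a_0(x) &= 90 + 32\left[3(N_Q x - 2)^3 - (N_Q x - 2)\right]
            + 10 N_Q\left[3(-4x)^3 + 4x\right],\\
\frac{1}{8N_Q}\frac{da_0}{dx} &= 36\,(N_Q^2 - 20)\,x^2 - 144\,N_Q\,x + 145 = 0,\\
x &= \frac{12N_Q \pm \sqrt{2900 - N_Q^2}}{6\,(N_Q^2 - 20)} .
\end{aligned}
```

For $N_Q \ge 5$ the quadratic has a positive leading coefficient, so $da_0/dx$ changes sign from positive to negative at the smaller root. The local maximum is therefore the root with the minus sign, which is the form of (3.7).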

3.2. a-Maximization in the magnetic theory

We next study the magnetic theory. Though we expect the same results as in the electric theory, this is non-trivial because of the extra operators (2.6). The trial a-function of the magnetic theory differs from that of the electric theory in the regions where such operators hit the unitarity bound.

We begin with the case $N_Q = 6$ and compare with the result of the previous subsection. The gauge invariant operators are $U_0$, $U_1$, $U_2$, $E_0$, $I_0$, $I_1$, and $J_1$ in (2.6), which exist only in the magnetic theory, as well as $M$, $Y$, $C$, $B$, $G$, $H$, $D_0$, and $S$ in (2.5), which have counterparts in the electric theory. The charges of these operators can be written in terms of $x$ of (2.1) by using the charges of $U(1)_F$ and $U(1)$ for each field given in Table 1. They are given by (3.1) and also by

\[
\begin{aligned}
R(U_0) &= 24x - 2, & R(U_1) &= 4, & R(E_0) &= -8x + 5,\\
R(I_0) &= 8x + 2, & R(I_1) &= 3, & R(J_1) &= 16x.
\end{aligned}
\tag{3.9}
\]

We thus find that the ranges of $x$ where each operator hits the unitarity bound are given by Fig. 3. Since the operators $C$, $U_1$, and $I_1$ do not hit the unitarity bound for any value of $x$, they do not appear in Fig. 3. The bold arrows correspond to the operators which exist only in the magnetic theory. The dotted arrows correspond to the Lorentz spinor operators, which we ignore as in the previous subsection.

Fig. 3. The ranges of $x$ where each operator hits the unitarity bound for $N_Q = 6$ in the magnetic theory.

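As in Fig. 3, we find that the trial a-function is given by

\[
a(x) =
\begin{cases}
a_0(x) + f_Y(x) + f_G(x) + f_H(x) + f_{U_0}(x) + f_{I_0}(x) & (x \le -\tfrac{1}{6}),\\
a_0(x) + f_Y(x) + f_G(x) + f_H(x) + f_{U_0}(x) & (-\tfrac{1}{6} \le x \le \tfrac{1}{12}),\\
a_0(x) + f_Y(x) + f_G(x) + f_{U_0}(x) & (\tfrac{1}{12} \le x \le \tfrac{1}{9}),\\
a_0(x) + f_Y(x) + f_G(x) & (\tfrac{1}{9} \le x \le \tfrac{1}{6}),\\
a_0(x) + f_M(x) + f_Y(x) + f_G(x) & (\tfrac{1}{6} \le x \le \tfrac{7}{36}),\\
a_0(x) + f_M(x) + f_Y(x) & (\tfrac{7}{36} \le x \le \tfrac{5}{24}),\\
a_0(x) + f_M(x) & (\tfrac{5}{24} \le x \le \tfrac{7}{24}),\\
a_0(x) + f_M(x) + f_B(x) & (\tfrac{7}{24} \le x \le \tfrac{11}{36}),\\
a_0(x) + f_M(x) + f_B(x) + f_{D_0}(x) & (\tfrac{11}{36} \le x \le \tfrac{7}{18}),\\
a_0(x) + f_M(x) + f_B(x) + f_{D_0}(x) + f_{U_2}(x) & (\tfrac{7}{18} \le x \le \tfrac{13}{24}),\\
a_0(x) + f_M(x) + f_B(x) + f_{D_0}(x) + f_{U_2}(x) + f_{E_0}(x) & (\tfrac{13}{24} \le x),
\end{cases}
\tag{3.10}
\]

where $f_O(x)$ is given by (3.3). The numbers of components $d_O$ which appear in (3.3) are given by (3.4) and also by

\[
d_{U_0} = 1, \quad
d_{U_2} = 6, \quad
d_{E_0} = \frac{3\,N_Q!}{5!\,(N_Q-5)!}, \quad
d_{I_0} = \frac{3\,N_Q!}{4!\,(N_Q-4)!}.
\tag{3.11}
\]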

For the range $1/9 \le x \le 7/18$, where the operators which exist only in the magnetic theory do not hit the unitarity bound, the trial a-function (3.10) has the same shape as that of the electric theory. As the trial a-function (3.2) of the electric theory has a local maximum in this range, this function also has a local maximum at the same value of $x$. We can also check that there are no other local maxima throughout the whole range of $x$, though the function itself is different from that in the electric theory. Also in the case of $N_Q = 7$, we obtain the same result as in the electric theory.

In the case of $N_Q = 8$, the ranges of $x$ are given in Fig. 4. We find that the trial a-function has the same shape as that of the electric theory for $1/12 \le x \le 23/96$, which includes the local maximum given by (3.7). Since we can verify that there is no local maximum outside this range, we find that the trial a-function has a unique local maximum, and no operators become free. Though the ranges of $x$ depend on $N_Q$, we find the same result as in the electric theory also for $9 \le N_Q \le 19$.

Fig. 4. The ranges of $x$ where each operator hits the unitarity bound for $N_Q = 8$ in the magnetic theory.

Thus, we obtain the same results for the values of the $U(1)_R$ charges in spite of the discrepancy between the gauge invariant operators of the two theories.
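The $N_Q = 8$ statement can be checked with a few lines of arithmetic. The sketch below evaluates (3.7), taking the minus sign in front of the square root (the branch on which $a_0$ has a maximum), and compares the result with the window $1/12 \le x \le 23/96$ and with the threshold $x \le 1/4$ quoted for the Lorentz spinor operators. The function name is an illustrative choice.

```python
import math


def x_max_no_free_operators(n_q):
    """Location of the local maximum of a_0(x), Eq. (3.7), minus-sign branch."""
    return (12 * n_q - math.sqrt(2900 - n_q**2)) / (6 * (n_q**2 - 20))


x8 = x_max_no_free_operators(8)

# Window quoted in the text for N_Q = 8, where the electric and magnetic
# trial a-functions have the same shape.
in_common_window = 1 / 12 <= x8 <= 23 / 96

# Region in which the neglected Lorentz spinor operators stay above their bound.
below_spinor_threshold = x8 <= 1 / 4

if __name__ == "__main__":
    print(f"x = {x8:.4f}", in_common_window, below_spinor_threshold)
```

The maximum sits near $x \approx 0.162$, well inside both ranges.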

4. Summary and discussion

We have studied Spin(10) gauge theory with $N_Q$ vectors and two spinors. We found that the meson operator $M^{ij} = Q^i Q^j$ decouples from the interacting system and becomes free for $N_Q = 6, 7$.

We discussed the renormalization group flow for the single spinor case in the paper [1]. In particular, for $N_Q = 7, 8, 9$, we saw that the two electric theories flow into the same theory at the IR fixed point. In the present case, since the magnetic theory flows into the theory without the term $M^{ij} q_i s q_j$ in the IR, the electric theory flows into the same theory as that with $N_{ij} Q^i Q^j$ in the superpotential, as discussed for the single spinor case in the introduction (see [1] for more details).

Acknowledgements

We are indebted to Yutaka Ookouchi for collaboration at the early stages of this work. The

research of T.K. was supported in part by the Grants-in-Aid (#16740133) and (#19540268) from

the MEXT of Japan. The research of F.Y. was supported in part by JSPS Research Fellowships

for Young Scientists.

Appendix A. Gauge invariant operators of the electric theory

In this appendix, we explain how to obtain the gauge invariant generators (2.3) of the classical chiral ring of the electric theory. In order to deal with the operators including the Spin(10) spinors $\Psi_I$, let us recall that the product of the spinors can be decomposed into antisymmetric tensor representations as

\[
16 \otimes 16 = [1] + [3] + [5]_+ ,
\]

where $[n]$ represents the rank $n$ antisymmetric tensor, and the rank 5 tensor is self-dual. They can be explicitly expressed as $\Psi_I^T C \Gamma^{a_1 \cdots a_n} \Psi_J$. These are symmetric under the exchange of $I$ and $J$ for $n = 1, 5$ and antisymmetric for $n = 3$. All gauge invariant operators can be obtained by contracting the Spin(10) gauge indices $a_i$ $(= 1, \ldots, 10)$ of the antisymmetric tensors $\Psi_I^T C \Gamma^{a_1 \cdots a_n} \Psi_J$ $(n = 1, 3, 5)$, the vectors $Q^{a i}$, the field strength $W_\alpha^{a_1 a_2}$, and the antisymmetric invariant tensor $\epsilon^{a_1 \cdots a_{10}}$. However, many of the operators constructed in this way decompose into products of other gauge invariant operators or vanish up to $\bar{D}^2$-exact terms. In order to identify the independent gauge invariant operators, we discuss the constraints among the chiral fields $Q$, $\Psi$, $W_\alpha$, and the invariant tensor $\epsilon^{a_1 \cdots a_{10}}$.
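The decomposition of the spinor bilinears quoted above can be checked by counting components. The sketch below only uses binomial coefficients; the helper name `rank_dim` is an illustrative choice. A rank $n$ antisymmetric tensor in ten dimensions has $\binom{10}{n}$ components, and the self-duality condition halves the rank 5 count.

```python
from math import comb


def rank_dim(n, d=10):
    """Number of independent components of a rank-n antisymmetric tensor in d dimensions."""
    return comb(d, n)


dim_1 = rank_dim(1)
dim_3 = rank_dim(3)
dim_5_plus = rank_dim(5) // 2  # self-dual half of the rank 5 tensor

# n = 1, 5 are symmetric under the exchange of the two spinors, n = 3 is antisymmetric.
symmetric_part = dim_1 + dim_5_plus
antisymmetric_part = dim_3

if __name__ == "__main__":
    print(dim_1, dim_3, dim_5_plus, symmetric_part + antisymmetric_part)
```

The symmetric and antisymmetric parts match the dimensions $16 \cdot 17/2$ and $16 \cdot 15/2$ of the symmetric and antisymmetric products of two 16-component spinors, in line with the symmetry properties stated for $n = 1, 5$ and $n = 3$.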

Since the invariant tensor $\epsilon^{a_1 \cdots a_{10}}$ satisfies⁴

\[
\epsilon^{a_1 \cdots a_{10}}\, \epsilon_{b_1 \cdots b_{10}} = \delta^{a_1}_{[b_1} \cdots \delta^{a_{10}}_{b_{10}]},
\tag{A.1}
\]

we can see that a pair of the invariant tensors can annihilate. Therefore, all the gauge invariant operators can be reduced to those with at most one invariant tensor $\epsilon^{a_1 \cdots a_{10}}$.

It follows from (A.1) that

\[
\epsilon^{a_1 \cdots a_{10}}\, \Psi_I^T C \Gamma_{b_1 \cdots b_n} \Psi_J \sim \delta^{[a_1}_{b_1} \cdots \delta^{a_n}_{b_n}\, \Psi_I^T C \Gamma^{a_{n+1} \cdots a_{10}]} \Psi_J .
\tag{A.2}
\]

If we introduce the antisymmetric tensors of rank 7 and 9, as seen from (A.2), we do not need operators with both the invariant tensor $\epsilon^{a_1 \cdots a_{10}}$ and the antisymmetric tensors. Thus, we find that all the invariants are classified into operators containing no spinors with at most one of the invariant tensors $\epsilon^{a_1 \cdots a_{10}}$, and those with spinors and none of the invariant tensors $\epsilon^{a_1 \cdots a_{10}}$.

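We begin with the operators in the former class. A constraint between the field strength $W_\alpha$ and other fields $q$ is given by

\[
W_\alpha^A\, (T^A)^a{}_b\, q^b \sim \bar{D}^2 \left( e^{-V} D_\alpha e^{V} q \right)^a \sim 0,
\tag{A.3}
\]

where $q$ is a field in a representation of Spin(10) and $T^A$ is the generator in the representation. For example, when it is the field strength $W_\beta$, we obtain $\{W_\alpha, W_\beta\} \sim 0$. Thus, operators with more than two field strengths in this class vanish by their anticommutativity. Taking account of (A.3), we find that all the operators in this class are given by

\[
\begin{aligned}
M^{ij} &= Q^{a i} Q^{a j}, \qquad S = \mathrm{Tr}\, W^\alpha W_\alpha,\\
D_{1\,\alpha}^{\,i_1 \cdots i_8} &= \epsilon^{a_1 \cdots a_{10}}\, Q^{a_1 i_1} \cdots Q^{a_8 i_8}\, W_\alpha^{a_9 a_{10}},\\
D_2^{\,i_1 \cdots i_{10}} &= \epsilon^{a_1 \cdots a_{10}}\, Q^{a_1 i_1} \cdots Q^{a_{10} i_{10}} .
\end{aligned}
\tag{A.4}
\]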

We go on to the latter class. We first consider the operators without the field strength. Most of

the constraints on the spinors can be obtained from the Fierz identities. After repeated use of the

Fierz identities and lengthy calculations, we find that the product

IT C a1 ai c1 cn J KT C b1 bj c1 cn L

(A.5)

IT C(2 X )I J a J KT C(2 X )KL a L ,

IT C(2 X )I J a1 a4 b J KT C(2 X )KL b L ,

(A.6)

and those where two antisymmetric tensors are not at all contracted with each other. The sum of

the ranks of the two antisymmetric tensors in the third contribution is always less than that of the

original product (A.5). By using this fact, it turns out that the third contribution is decomposed

4 The brackets [ ] denote the antisymmetrization of the indices.


into other invariant operators. When we use the products of the antisymmetric tensors, they are

thus given by (A.6). Therefore, we can see that all the operators with no field strength in this

class contain at most two of the antisymmetric tensors. More explicitly, they are given by

YXi = IT C(2 X )I J a J Qai ,

C i1 i3 = IT C(2 )I J a1 a3 J Qa1 i1 Qa3 i3 ,

i i5

BX1

i i9

E2 X1

H i1 i4 = IT C(2 X )I J a1 a5 J KT C(2 X )KL a1 L Qa2 i2 Qa5 i5 .

(A.7)

We next consider operators with the spinors and the field strength. The field strength W in the

operators of this class only connect to another one W or the antisymmetric tensors due to (A.3)

and {W , W } 0. By using the identity

[a1 a2 am ]

[a1 a2 a3 am ]

a1 am bc = a1 am bc + [b

c] bc

0 Wbc IT C a1 am bc J + 2W [a1 b IT C a2 am ]b J 2W [a1 a2 IT C a3 am ] J .

By decomposing this equation into the symmetric and the antisymmetric part under the exchange

of the flavor indices I and J , we obtain the equations

Wbc IT C a1 am bc J 2W [a1 a2 IT C a3 am ] J ,

W [a1 b IT C a2 am ]b J 0.

(A.8)

We can see from the first equation of (A.8) that the rank of the antisymmetric tensor connected to

the field strength with two indices can be reduced by four. By using the second equation of (A.8),

we find that the operators including the field strength contracted with two antisymmetric tensors,

IT C a1 am1 b J W bc KT C a1 am1 c L ,

IT C a1 am1 b J W bc W cd KT C a1 an1 d L ,

can be reorganized into the operator where the two antisymmetric tensors are directly contracted.

Similarly to the previous discussion leading to (A.7), such products of the antisymmetric tensors

can be rewritten, and if not vanish, the field strength is in turn connected to the antisymmetric

tensor with the two indices or is decomposed with another field strength into the glueball S. Thus,

we find that operators with the spinors and the field strength finally vanish according to (A.3) or

are decomposed into the product of the glueball superfield S and operators with the spinors.

To summarize, the operators in (A.4) and (A.7) are the gauge invariant generators of the

classical chiral ring of the electric theory, as listed in (2.3).

Appendix B. Gauge invariant operators of the magnetic theory

In this appendix, we only outline how to obtain the gauge invariant generators

of the classical chiral ring of the magnetic theory. Similarly to the case of the electric theory, an


group is given by

\[
\epsilon^{a_1 \cdots a_{N_Q-3}}\, \epsilon_{b_1 \cdots b_{N_Q-3}} = \delta^{a_1}_{[b_1} \cdots \delta^{a_{N_Q-3}}_{b_{N_Q-3}]} .
\tag{B.1}
\]

Thus, all the gauge invariant operators can be classified into operators with none of the antisymmetric invariant tensors, those with the invariant tensors with the lower indices, and those with

the invariant tensors with the upper indices.

We first consider the operators without the antisymmetric invariant tensors. Eq. (A.3) is also

valid for the field strength w and w of $SU(N_Q - 3)$ and Sp(1), respectively. Taking (A.3) into

account together with the F -term conditions, we can verify that operators without the invariant

tensors are given by the gauge singlets M, Y , and the composites

G t I (2 )I J t J ,

S Tr w w ,

S Tr w w .

(B.2)

s ab wb c s cb wb a ,

(B.3)

s ab .

We next consider operators including the invariant tensors a1 aNQ 3 . It turns out that all the

operators in this class are given by the contraction of the invariant tensor a1 aNQ 3 with the four

operators

a1

,

(s q)

a1 i ,

qX

n a1 b1

b1 bNQ 3

sw

(sw )a1 a2 ,

(n = 0, 1, 2),

(B.4)

which are supposed so that the indices a1 , a2 of the third in (B.4) are contracted with those of

a1 aNQ 3 , while the indices b2 , . . . , bNQ 3 of the other invariant tensor b1 bNQ 3 in the fourth

are contracted with another of (B.4). Taking account of the index X = 1, 2, 3 of the field qX and

the index = 1, 2 of the field strength w , we notice that at most three of the first in (B.4) and

two of the third can be contracted with the same invariant tensor. Therefore, all the operators

with the single a1 aNQ 3 are given by

Dn ,

En ,

In ,

Jm

(n = 0, 1, 2, m = 1, 2),

(B.5)

aNQ 3

can be decomposed into the product of the operator C in (2.5) and U0 in (2.6).

We turn to the gauge invariant operators with more than one a1 aNQ 3 and find that all the

independent gauge invariant operators are given by

U0 = det s,

U1XY = a1 aNQ 3 b1 bNQ 3 s a1 b1 s

qX

qY

,

s a1 b1 s

qX1

qX2

qY1

qY2

,

U3 = X1 X2 X3 Y1 Y2 Y3 a1 aNQ 3 b1 bNQ 3

s a1 b1 s

q X1

qX2

qX3

qY1

qY2

qY3

.

(B.6)


Let us begin with one invariant tensor a1 aNQ 3 and all the symmetric tensor s ab contracted

with it,

a1 ak ak+1 aNQ 3 s a1 b1 s ak bk Tak+1 aNQ 3 b1 bk ,

in an operator of this class. The indices ak+1 , . . . , aNQ 3 are supposed to be contracted with

those of the first in (B.4) or those of the field strength w in the third. As the indices b1 , . . . , bk

in T are antisymmetric, by using (B.1), we can rewrite it as

Tak+1 aNQ 3 b1 bk Tak+1 aNQ 3 d1 dk d1 dk ek+1 eNQ 3

b1 bk ek+1 eNQ 3

(B.7)

On the other hand, since we are considering the operators with more than one invariant tensors,

the operators have another invariant tensor c1 cNQ 3 other than those included in (B.7). Then,

we apply (B.1) again to

b1 bk ek+1 eNQ 3

bk

Tak+1 aNQ 3 b1 bk c1 cNQ 3 Tak+1 aNQ 3 d1 dk d1 dk [ck+1 cNQ 3 cb11c

.

k]

(B.8)

After this procedure, other s ab besides those in (B.8) may connect to the original a1al al+1aNQ 3 ,

upon the use of (B.1). Then, we can use (B.1) for all the symmetric tensors s ab contracted with

the tensor a1 al al+1 aNQ 3 to annihilate the other d1 dk ck+1 cNQ 3 in (B.8) and the appearing

invariant tensor of the upper indices. If the resulting operator does not vanish, we obtain the

following form

a1 al al+1 aNQ 3 s a1 b1 s al bl q al+1 q

aNQ 3

b1 bl bl+1 bNQ 3 ,

(B.9)

where the remaining indices bk+1 bNQ 3 are contracted with those in (B.4). Again, we apply (B.1) to all symmetric tensors contracted with b1 bk bk+1 bNQ 3 in (B.9) to eliminate the

original invariant tensor a1 aNQ 3 and the newly appearing invariant tensor. We find that all the

a a

gauge invariant operators with more than one 1 NQ 3 except for (B.6) vanish or are decomposed into the gauge invariant operators.

a a

We next consider operators including the invariant tensors 1 NQ 3 with the upper indices.

It turns out that all the operators in this class are given by the contraction of the invariant tensor

a a

1 NQ 3 with the five operators5

qa1 i ,

I J q a1 I t J ,

q a1 (I q b1 ||J )

b1 bNQ 3

q a1 (I q b ||J q |b|KL) ,

,

q a1 (I q a2 ||J )

(B.10)

a

(X 2 )I J ,

q aI J qX

and thus, it is symmetric under exchange of the indices I and J . The indices a1 and a2 of the

a a

b b

fifth operator in (B.10) are contracted with those of 1 NQ 3 , while 1 NQ 3 in the fourth is

contracted with the operators in (B.10).

Taking account of the indices of the local Sp(1) and those of the global SU(2), we find that at

a a

most four q can be contracted with 1 NQ 3 . The numbers of the second, the third, the fourth,

a a

and the fifth operators in (B.10) contracted with the invariant tensor 1 NQ 3 are limited from

5 The parentheses ( ) denote the symmetrization of the indices, while [ ] does the antisymmetrization.


this fact. Further, if two q from the second, the third, and the fourth are contracted with the

invariant tensor, the symmetric part of the global SU(2) indices of them can be rewritten in terms

of the fifth and some other parts. In fact, when the indices of the global SU(2) of these two q

are symmetric, the local Sp(1) indices of those q should be antisymmetric. Then, by using the

relation for the invariant tensor of Sp(1),

1 2

1 2 1 2 = [

,

1 2 ]

1 a a a

= 1 2 NQ 3 q a1 1 (I q a2 |2 |J ) 1 2 1 2 ,

2

and it gives rise to the fifth. This is always possible when more than two q from the second, the

a a

third, and the fourth are contracted with the invariant tensor 1 NQ 3 of SU(NQ 3), because

the global SU(2) indices of two q of them must take the same value, thus symmetric. Thus,

we find that the total number of the second, the third, and the fourth contracted with the same

a a

invariant tensor 1 NQ 3 should be less than three.

When four of q are contracted with the invariant tensor, each two of them take the same

value of the SU(2) indices, respectively, and can be rewritten in terms of two copies of the fifth

and some other parts. Thus, when one of the fifth is contracted with the invariant tensor, the

total number of the second, the third, and the fourth contracted with the same invariant tensor

a a

1 NQ 3 should be less than two.

Wrapping up these facts, together with the F -term conditions, (A.3), and (B.1), we can verify

a a

that all the operators with the single 1 NQ 3 are given by

a1 a2 aNQ 3

a1

1 (I

(C)i1 iNQ 3

|2 |J )

a1 aNQ 3

(B)Xi1 iNQ 5

(F )i1 iNQ 7

a2

a1 aNQ 3

a1 aNQ 3

q aN

I

(2 X )I J q aN

Q 4

Q 3

(2 X )I J q aN

Q 6

(H )i1 iNQ 4 I J

a1 aNQ 3

Q 5

q aN

K

Q 4

(2 X )KL q aN

L

Q 3

I

Q 3

t J .

,

(B.11)

a a

We go on to the operators with more than two 1 NQ 3 and skip those with two here. The

latter will be explained later. We will see that all the operators in these classes do not give the

independent gauge invariant operators. Since the only fourth operators in (B.10) can connect with

a a

a a

two invariant tensors 1 NQ 3 , the operators with more than two 1 NQ 3 should include at

a1 aNQ 3

which are contracted with two of the fourth. Further, all the remaining indices

least one

a a

of the same 1 NQ 3 must be contracted with the first operator in (B.10), as

a a

b b

1 NQ 3 qa1 qaNQ 5 q aNQ 4 (I q b1 ||J ) 1 NQ 3

c c

q aNQ 3 (K q c1 ||L) 1 NQ 3 ,

(B.12)

as we can see from the previous discussion. Here, we apply the identity

a1 aNQ 3

b1 bNQ 3

= 0,

(B.13)

and

in (B.12). Ignoring the terms decomposed into the products of gauge

to

invariant operators, we find that the resulting operators are given by

NQ 5

1 a1 ak1 b1 ak+1 aNQ 3

2

k=1

a b b

c c

q aNQ 3 (K q c1 ||L) 1 NQ 3 k 2 NQ 3 qak .


(B.14)

If the resulting operator is not decomposed into gauge invariant operators, the last factor

a b b

c c

k 2 NQ 3 qak in (B.14) are connected with the invariant tensor 1 NQ 3 via other operac1 cNQ 3

in (B.14) is contracted with two

tors. This happens only when the invariant tensor

of the fourth in (B.10). This is the same situation we previously have seen for the invariant tena a

sor 1 NQ 3 in (B.12), and thus, we can repeat the same procedure to show that the resulting

operator is decomposed into gauge invariant operators.

a a

We now turn to the operators with two 1 NQ 3 . As discussed previously, the only fourth

operator in (B.10) can be used to connect the two invariant tensors. In particular, they are connected by at most two of the operator. By using the identity (B.13), we can see that the invariant

tensors connected by two of the fourth in (B.10) can be reduced to those by one. Thus, we only

have to consider the latter operators. If either of the invariant tensors does not have the fifth operator of (B.10), we can use the identity (B.13). If both of them have the fifth operators, a closer

examination is needed on the symmetry of the global SU(2) indices of q s. Taking account of this

point and the identity (B.13), we can verify that they are also decomposed into gauge invariant

operators.

To summarize, the singlets M, Y , and the operators listed in (B.2), (B.5), (B.6), and (B.11)

are the gauge invariant generators of the classical chiral ring of the magnetic theory.

As discussed in Section 2, to all the gauge invariant generators (2.3) of the classical chiral ring

in the electric theory, there exist the counterparts (2.5) in the magnetic theory. However, the extra

gauge invariant operators (2.6) seem to exist in the magnetic theory. If the electric-magnetic

duality is true for this model, this discrepancy should disappear at the quantum level.

References

[1] T. Kawano, Y. Ookouchi, Y. Tachikawa, F. Yagi, Pouliot type duality via a-maximization, Nucl. Phys. B 735

(2006) 1, hep-th/0509230.

[2] P. Pouliot, M.J. Strassler, Duality and dynamical supersymmetry breaking in Spin(10) with a spinor, Phys. Lett.

B 375 (1996) 175, hep-th/9602031.

[3] T. Kawano, Duality of N = 1 supersymmetric SO(10) gauge theory with matter in the spinorial representation,

Prog. Theor. Phys. 95 (1996) 963, hep-th/9602035.

[4] M. Flato, C. Fronsdal, Representations of conformal supersymmetry, Lett. Math. Phys. 8 (1984) 159.

[5] G. Mack, All unitary ray representations of the conformal group SU(2, 2) with positive energy, Commun. Math.

Phys. 55 (1977) 1.

[6] N. Seiberg, Electric-magnetic duality in supersymmetric non-Abelian gauge theories, Nucl. Phys. B 435 (1995) 129, hep-th/9411149.

[7] D. Kutasov, A. Parnachev, D.A. Sahakyan, Central charges and U(1)_R symmetries in N = 1 super Yang-Mills, JHEP 0311 (2003) 013, hep-th/0308071.

[8] D. Kutasov, A. Schwimmer, On duality in supersymmetric Yang-Mills theory, Phys. Lett. B 354 (1995) 315, hep-th/9505004.

[9] K. Intriligator, B. Wecht, The exact superconformal R-symmetry maximizes a, Nucl. Phys. B 667 (2003) 183, hep-th/0304128.

[10] G. 't Hooft, Naturalness, chiral symmetry, and spontaneous chiral symmetry breaking, in: G. 't Hooft, et al. (Eds.), Recent Developments in Gauge Theories, Plenum Press, New York, 1980, p. 135.

[11] M. Berkooz, P.L. Cho, P. Kraus, M.J. Strassler, Dual descriptions of SO(10) SUSY gauge theories with arbitrary

numbers of spinors and vectors, Phys. Rev. D 56 (1997) 7166, hep-th/9705003.

scattering at HERA

ZEUS Collaboration

S. Chekanov 1 , M. Derrick, S. Magill, B. Musgrave, D. Nicholass 2 ,

J. Repond, R. Yoshida

Argonne National Laboratory, Argonne, IL 60439-4815, USA 3

M.C.K. Mattingly

Andrews University, Berrien Springs, Michigan 49104-0380, USA

Institut für Physik der Humboldt-Universität zu Berlin, Berlin, Germany

D. Boscherini, A. Bruni, G. Bruni, L. Cifarelli, F. Cindolo, A. Contin,

M. Corradi 4 , S. De Pasquale, G. Iacobucci, A. Margotti, R. Nania,

A. Polini, G. Sartorelli, A. Zichichi

University and INFN Bologna, Bologna, Italy 5

M. Jüngst, O.M. Kind 7 , A.E. Nuncio-Quiroz, E. Paul 8 , R. Renner 6 ,

U. Samson, V. Schönberg, R. Shehzadi, M. Wlasenko

Physikalisches Institut der Universität Bonn, Bonn, Germany 9

H.H. Wills Physics Laboratory, University of Bristol, Bristol, United Kingdom 10

0550-3213/$ see front matter © 2007 Elsevier B.V. All rights reserved.

doi:10.1016/j.nuclphysb.2007.05.027

RAPID COMMUNICATION


E. Tassi

Calabria University, Physics Department and INFN, Cosenza, Italy 5

Chonnam National University, Kwangju, South Korea 13

Jabatan Fizik, Universiti Malaya, 50603 Kuala Lumpur, Malaysia 14

Nevis Laboratories, Columbia University, Irvington on Hudson, New York 10027 15

P. Stopa, L. Zawiejski

The Henryk Niewodniczanski Institute of Nuclear Physics, Polish Academy of Sciences, Cracow, Poland 16

M. Przybycien, L. Suszycki

Faculty of Physics and Applied Computer Science, AGH-University of Science and Technology, Cracow, Poland 17

A. Kotanski 18 , W. Słomiński 19

Department of Physics, Jagellonian University, Cracow, Poland

R. Ciesielski, N. Coppola, A. Dossanov, V. Drugakov, J. Fourletova,

A. Geiser, D. Gladkov, P. Göttlicher 21 , J. Grebenyuk, I. Gregor, T. Haas,

W. Hain, C. Horn 22 , A. Hüttmann, B. Kahle, I.I. Katkov, U. Klein 23 ,

U. Kötz, H. Kowalski, E. Lobodzinska, B. Löhr, R. Mankel,

I.-A. Melzer-Pellmann, S. Miglioranzi, A. Montanari, D. Notz,

L. Rinaldi, P. Roloff, I. Rubinsky, R. Santamarta, U. Schneekloth,

A. Spiridonov 24 , H. Stadie, D. Szuba 25 , J. Szuba 26 , T. Theedt, G. Wolf,

K. Wrona, C. Youngman, W. Zeuner

Deutsches Elektronen-Synchrotron DESY, Hamburg, Germany


W. Lohmann, S. Schlenstedt

Deutsches Elektronen-Synchrotron DESY, Zeuthen, Germany

University and INFN, Florence, Italy 5

Fakultät für Physik der Universität Freiburg i.Br., Freiburg i.Br., Germany 9

I.O. Skillicorn

Department of Physics and Astronomy, University of Glasgow, Glasgow, United Kingdom 10

I. Gialas 28 , K. Papageorgiu

Department of Engineering in Management and Finance, University of Aegean, Greece

T. Schörner-Sadenius, J. Sztuk, K. Wichmann, K. Wick

Hamburg University, Institute of Experimental Physics, Hamburg, Germany 9

Imperial College London, High Energy Nuclear Physics Group, London, United Kingdom 10

Y. Yamazaki

Institute of Particle and Nuclear Studies, KEK, Tsukuba, Japan 31

Institute of Physics and Technology of Ministry of Education and Science of Kazakhstan, Almaty, Kazakhstan

V. Aushev 1

Institute for Nuclear Research, National Academy of Sciences, Kiev and Kiev National University, Kiev, Ukraine

D. Son

Kyungpook National University, Center for High Energy Physics, Daegu, South Korea 13


J. de Favereau, K. Piotrzkowski

Institut de Physique Nucléaire, Université Catholique de Louvain, Louvain-la-Neuve, Belgium 32

M. Soares, J. Terrón, M. Zambrana

Departamento de Física Teórica, Universidad Autónoma de Madrid, Madrid, Spain 34

Department of Physics, McGill University, Montréal, Québec, Canada H3A 2T8 35

T. Tsurugai

Meiji Gakuin University, Faculty of General Education, Yokohama, Japan 31

Moscow Engineering Physics Institute, Moscow, Russia 36

I.A. Korzhavina, V.A. Kuzmin, B.B. Levchenko 37 , O.Yu. Lukina,

A.S. Proskuryakov, L.M. Shcheglova, D.S. Zotkin, S.A. Zotkin

Moscow State University, Institute of Nuclear Physics, Moscow, Russia 38

Max-Planck-Institut für Physik, München, Germany

H. Tiecke, M. Vázquez 29 , L. Wiggers

NIKHEF and University of Amsterdam, Amsterdam, The Netherlands 39

Physics Department, Ohio State University, Columbus, OH 43210, USA 3


R.C.E. Devenish, B. Foster, K. Korcsak-Gorzo, S. Patel, V. Roberfroid 40 ,

A. Robertson, P.B. Straub, C. Uribe-Estrada, R. Walczak

Department of Physics, University of Oxford, Oxford, United Kingdom 10

A. Garfagnini, S. Limentani, A. Longhin, L. Stanco, M. Turcato

Dipartimento di Fisica dell'Università and INFN, Padova, Italy 5

Department of Physics, Pennsylvania State University, University Park, PA 16802, USA 15

Y. Iga

Polytechnic University, Sagamihara, Japan 31

Dipartimento di Fisica, Università 'La Sapienza' and INFN, Rome, Italy

Rutherford Appleton Laboratory, Chilton, Didcot, Oxon, United Kingdom 10

Raymond and Beverly Sackler Faculty of Exact Sciences, School of Physics, Tel-Aviv University, Tel-Aviv, Israel 44

M. Kuze, J. Maeda

Department of Physics, Tokyo Institute of Technology, Tokyo, Japan 31

Department of Physics, University of Tokyo, Tokyo, Japan 31

Tokyo Metropolitan University, Department of Physics, Tokyo, Japan 31

Università di Torino and INFN, Torino, Italy 5


M. Arneodo, M. Ruspa

Università del Piemonte Orientale, Novara, and INFN, Torino, Italy 5

Department of Physics, University of Toronto, Toronto, Ontario, Canada M5S 1A7 35

J.H. Loizides, M.R. Sutton 48 , M. Wing

Physics and Astronomy Department, University College London, London, United Kingdom 10

J. Malka 50 , R.J. Nowak, J.M. Pawlak, T. Tymieniecka, A. Ukleja,

A.F. Zarnecki

Warsaw University, Institute of Experimental Physics, Warsaw, Poland

M. Adamus, P. Plucinski 51

Institute for Nuclear Studies, Warsaw, Poland

Department of Particle Physics, Weizmann Institute, Rehovot, Israel 52

A.A. Savin, W.H. Smith, H. Wolfe

Department of Physics, University of Wisconsin, Madison, WI 53706, USA 3

J. Standage, J. Whyte

Department of Physics, York University, Ontario, Canada M3J 1P3 35


Available online 8 June 2007

Abstract

Inclusive dijet and trijet production in deep inelastic ep scattering has been measured for $10 < Q^2 < 100$ GeV$^2$ and low Bjorken x, $10^{-4} < x_{\rm Bj} < 10^{-2}$. The data were taken at the HERA ep collider with centre-of-mass energy $\sqrt{s} = 318$ GeV using the ZEUS detector and correspond to an integrated luminosity of 82 pb$^{-1}$. Jets were identified in the hadronic centre-of-mass (HCM) frame using the $k_T$ cluster algorithm in the longitudinally invariant inclusive mode. Measurements of dijet and trijet differential cross sections are presented as functions of $Q^2$, $x_{\rm Bj}$, jet transverse energy, and jet pseudorapidity. As a further examination of low-$x_{\rm Bj}$ dynamics, multi-differential cross sections as functions of the jet correlations in transverse momenta, azimuthal angles, and pseudorapidity are also presented. Calculations at $O(\alpha_s^3)$ generally describe the trijet data well and improve the description of the dijet data compared to the calculation at $O(\alpha_s^2)$.

© 2007 Elsevier B.V. All rights reserved.

* Corresponding author.

1 Supported by DESY, Germany.

2 Also affiliated with University College London, UK.

3 Supported by the US Department of Energy.

4 Also at University of Hamburg, Germany, Alexander von Humboldt Fellow.

5 Supported by the Italian National Institute for Nuclear Physics (INFN).

6 Self-employed.

7 Now at Humboldt University, Berlin, Germany.

8 Retired.

9 Supported by the German Federal Ministry for Education and Research (BMBF), under contract numbers

HZ1GUA 2, HZ1GUB 0, HZ1PDA 5, HZ1VFA 5.

10 Supported by the Particle Physics and Astronomy Research Council, UK.

11 Supported by Chonnam National University in 2005.

12 Supported by a scholarship of the World Laboratory Björn Wiik Research Project.

13 Supported by the Korean Ministry of Education and Korea Science and Engineering Foundation.

14 Supported by the Malaysian Ministry of Science, Technology and Innovation/Akademi Sains Malaysia grant SAGA

66-02-03-0048.

15 Supported by the US National Science Foundation. Any opinion, findings and conclusions or recommendations

expressed in this material are those of the authors and do not necessarily reflect the views of the National Science

Foundation.

16 Supported by the Polish State Committee for Scientific Research, grant No. 620/E-77/SPB/DESY/P-03/DZ

117/2003-2005 and grant No. 1P03B07427/2004-2006.

17 Supported by the Polish Ministry of Science and Higher Education as a scientific project (2006–2008).

18 Supported by the research grant No. 1 P03B 04529 (2005–2008).

19 This work was supported in part by the Marie Curie Actions Transfer of Knowledge project COCOS (contract

MTKD-CT-2004-517186).

20 Now at Univ. Libre de Bruxelles, Belgium.

21 Now at DESY group FEB, Hamburg, Germany.

22 Now at Stanford Linear Accelerator Center, Stanford, USA.

23 Now at University of Liverpool, UK.

24 Also at Institute of Theoretical and Experimental Physics, Moscow, Russia.

25 Also at INP, Cracow, Poland.


1. Introduction

Multijet production in deep inelastic $ep$ scattering (DIS) at HERA has been used to test the predictions of perturbative QCD (pQCD) over a large range of negative four-momentum transfer squared, $Q^2$, and to determine the strong coupling constant $\alpha_s$ [1,2]. At leading order (LO) in $\alpha_s$, dijet production in neutral current DIS proceeds via the boson–gluon-fusion ($V^* g \to q\bar{q}$ with $V = \gamma, Z^0$) and QCD-Compton ($V^* q \to qg$) processes. Events with three jets can be seen as dijet processes with an additional gluon radiation or with a gluon splitting into a quark–antiquark pair and are directly sensitive to $O(\alpha_s^2)$ QCD effects. The higher sensitivity to $\alpha_s$ and the large number of degrees of freedom of the trijet final state provide a good testing ground for the pQCD predictions. In particular, multijet production in DIS is an ideal environment for investigating different approaches to parton dynamics at low Bjorken-$x$, $x_{\rm Bj}$ [3]. An understanding of this regime is of particular relevance in view of the startup of the LHC, where many of the Standard Model processes such as the production of electroweak gauge bosons or the Higgs particle involve the collision of partons with a low fraction of the proton momentum.

In the usual collinear QCD factorisation approach, the cross sections are obtained as the convolution of perturbative matrix elements and parton densities evolved according to the DGLAP

26 On leave of absence from FPACS, AGH-UST, Cracow, Poland.

27 Partly supported by Moscow State University, Russia.

28 Also affiliated with DESY.

29 Now at CERN, Geneva, Switzerland.

30 Also at University of Tokyo, Japan.

31 Supported by the Japanese Ministry of Education, Culture, Sports, Science and Technology (MEXT) and its grants

32 Supported by FNRS and its associated funds (IISN and FRIA) and by an Inter-University Attraction Poles Programme

subsidised by the Belgian Federal Science Policy Office.

33 Ramón y Cajal Fellow.

34 Supported by the Spanish Ministry of Education and Science through funds provided by CICYT.

35 Supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).

36 Partially supported by the German Federal Ministry for Education and Research (BMBF).

37 Partly supported by Russian Foundation for Basic Research grant No. 05-02-39028-NSFC-a.

38 Supported by RF Presidential grant No. 8122.2006.2 for the leading scientific schools and by the Russian Ministry

of Education and Science through its grant Research on High Energy Physics.

39 Supported by the Netherlands Foundation for Research on Matter (FOM).

40 EU Marie Curie Fellow.

41 Partially supported by Warsaw University, Poland.

42 This material was based on work supported by the National Science Foundation, while working at the Foundation.

43 Also at Max Planck Institute, Munich, Germany, Alexander von Humboldt Research Award.

44 Supported by the German–Israeli Foundation and the Israel Science Foundation.

45 Now at KEK, Tsukuba, Japan.

46 Now at Nagoya University, Japan.

47 Department of Radiological Science.

48 PPARC Advanced fellow.

49 Also at Łódź University, Poland.

50 Łódź University, Poland.

51 Supported by the Polish Ministry for Education and Science grant No. 1 P03B 14129.

52 Supported in part by the MINERVA Gesellschaft für Forschung GmbH, the Israel Science Foundation (grant No. 293/02-11.2) and the US–Israel Binational Science Foundation.

Deceased.


evolution equations [4]. These equations resum to all orders the terms proportional to $\alpha_s \ln Q^2$ and the double logarithms $\ln Q^2 \cdot \ln 1/x$, where $x$ is the fraction of the proton momentum carried by a parton, which is equal to $x_{\rm Bj}$ in the quark–parton model. In the DGLAP approach, the parton participating in the hard scattering is the result of a partonic cascade ordered in transverse momentum, $p_T$. The partonic cascade starts from a low-$p_T$ and high-$x$ parton from the incoming proton and ends up, after consecutive branching, in the high-$p_T$ and low-$x$ parton entering in the hard scattering. This approximation has been tested extensively at HERA and was found to describe well the inclusive cross sections [5,6] and jet production [1,2,7,8]. At low $x_{\rm Bj}$, where the phase space for parton emissions increases, terms proportional to $\alpha_s \ln 1/x$ may become large and spoil the accuracy of the DGLAP approach. In this region the transverse momenta and angular correlations between partons produced in the hard scatter may be sensitive to effects beyond DGLAP dynamics. The information about cross sections, transverse energy, $E_T$, and angular correlations between the two leading jets in multijet production therefore provides an important testing ground for studying the parton dynamics in the region of small $x_{\rm Bj}$.

In this analysis, correlations for both azimuthal and polar angles, and correlations in jet transverse energy and momenta for dijet and trijet production in the hadronic ($\gamma^* p$) centre-of-mass (HCM) frame are measured with high statistical precision in the kinematic region restricted to $10 < Q^2 < 100$ GeV$^2$ and $10^{-4} < x_{\rm Bj} < 10^{-2}$. The results are compared with pQCD calculations at next-to-leading order (NLO). A similar study of inclusive dijet production was performed by the H1 Collaboration [9].

2. Experimental set-up

The data used in this analysis were collected during the 1998–2000 running period, when HERA operated with protons of energy $E_p = 920$ GeV and electrons or positrons$^{53}$ of energy $E_e = 27.5$ GeV, and correspond to an integrated luminosity of $81.7 \pm 1.8$ pb$^{-1}$. A detailed description of the ZEUS detector can be found elsewhere [10,11]. A brief outline of the components that are most relevant for this analysis is given below.

Charged particles are measured in the central tracking detector (CTD) [12], which operates in a magnetic field of 1.43 T provided by a thin superconducting solenoid. The CTD consists of 72 cylindrical drift chamber layers, organised in nine superlayers covering the polar-angle$^{54}$ region $15^\circ < \theta < 164^\circ$. The transverse momentum resolution for full-length tracks can be parameterised as $\sigma(p_T)/p_T = 0.0058\,p_T \oplus 0.0065 \oplus 0.0014/p_T$, with $p_T$ in GeV. The tracking system was used to measure the interaction vertex with a typical resolution along (transverse to) the beam direction of 0.4 (0.1) cm and also to cross-check the energy scale of the calorimeter.
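The CTD resolution parameterisation above can be evaluated numerically. The following is a minimal sketch, assuming the three terms are combined in quadrature (the usual meaning of the $\oplus$ symbol); the function name is illustrative and not taken from any ZEUS software.

```python
import math


def ctd_relative_pt_resolution(pt_gev: float) -> float:
    """Relative resolution sigma(pT)/pT for full-length CTD tracks.

    Assumes the three terms of the parameterisation quoted in the text
    are added in quadrature; pT is given in GeV.
    """
    if pt_gev <= 0.0:
        raise ValueError("pT must be positive")
    terms = (0.0058 * pt_gev, 0.0065, 0.0014 / pt_gev)
    return math.sqrt(sum(t * t for t in terms))


# Print the resolution for a few illustrative transverse momenta.
for pt in (0.5, 1.0, 5.0, 20.0):
    print(f"pT = {pt:5.1f} GeV  sigma(pT)/pT = {ctd_relative_pt_resolution(pt):.4f}")
```

The first term grows linearly with $p_T$ and dominates at high momentum, while the $1/p_T$ term only matters for very soft tracks.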

The high-resolution uranium–scintillator calorimeter (CAL) [13] covers 99.7% of the total solid angle and consists of three parts: the forward (FCAL), the barrel (BCAL) and the rear (RCAL) calorimeters. Each part is subdivided transversely into towers and longitudinally into one electromagnetic section and either one (in RCAL) or two (in BCAL and FCAL) hadronic sections. The smallest subdivision of the calorimeter is called a cell. Under test-beam conditions, the CAL single-particle relative energy resolutions were $\sigma(E)/E = 0.18/\sqrt{E}$ for electrons and

53 In the following, the term electron denotes generically both the electron ($e^-$) and the positron ($e^+$).

54 The ZEUS coordinate system is a right-handed Cartesian system, with the Z axis pointing in the proton beam direction, referred to as the forward direction, and the X axis pointing left towards the centre of HERA. The coordinate origin is at the nominal interaction point.

The luminosity was measured from the rate of the bremsstrahlung process $ep \to e\gamma p$. The resulting small-angle energetic photons were measured by the luminosity monitor [14], a lead–scintillator calorimeter placed in the HERA tunnel at $Z = -107$ m.

3. Kinematics and event selection

A three-level trigger system was used to select events online [11,15]. Neutral current DIS events were selected by requiring that a scattered electron candidate with an energy more than 4 GeV was measured in the CAL. The variable $x_{\rm Bj}$, the inelasticity $y$, and $Q^2$ were reconstructed offline using the electron (subscript $e$) [16] and Jacquet–Blondel (JB) [17] methods. For each event, the reconstruction of the hadronic final state was performed using a combination of track and CAL information, excluding the cells and the track associated with the scattered electron. The selected tracks and CAL clusters were treated as massless energy flow objects (EFOs) [18].

The offline selection of DIS events was similar to that used in the previous ZEUS measurement [1] and was based on the following requirements:

• $E'_e > 10$ GeV, where $E'_e$ is the scattered electron energy after correction for energy loss from the inactive material in the detector;

• $y_e < 0.6$ and $y_{\rm JB} > 0.1$, to ensure a kinematic region with good reconstruction;

• $40 < \delta < 60$ GeV, where $\delta = \sum_i (E_i - P_{Z,i})$ and $E_i$ and $P_{Z,i}$ are the energy and $z$-momentum of each final-state object. The lower cut removed background from photoproduction and events with large initial-state QED radiation, while the upper cut removed cosmic-ray background;

• $|Z_{\rm vtx}| < 50$ cm, where $Z_{\rm vtx}$ is the Z position of the reconstructed primary vertex, to select events consistent with $ep$ collisions.
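The selection requirements listed above can be sketched as a simple event filter. This is an illustration under stated assumptions rather than the ZEUS reconstruction code: the `Event` container and its field names are hypothetical, and the electron-method and Jacquet–Blondel expressions are the standard textbook forms (beam masses neglected, polar angle measured from the proton direction), which are not spelled out in the text.

```python
import math
from dataclasses import dataclass

E_E_BEAM = 27.5   # GeV, electron beam energy quoted in the text
E_P_BEAM = 920.0  # GeV, proton beam energy quoted in the text
S = 4.0 * E_E_BEAM * E_P_BEAM  # GeV^2, beam masses neglected (sqrt(S) ~ 318 GeV)


@dataclass
class Event:
    """Hypothetical container; the field names are illustrative only."""
    e_prime: float         # corrected scattered-electron energy (GeV)
    theta_e: float         # electron polar angle w.r.t. the proton direction (rad)
    had_e_minus_pz: float  # sum of (E - P_Z) over the hadronic final state (GeV)
    z_vtx: float           # Z position of the primary vertex (cm)


def electron_method(ev: Event):
    """Return (y_e, Q2_e, x_e) reconstructed from the scattered electron alone."""
    y_e = 1.0 - ev.e_prime / (2.0 * E_E_BEAM) * (1.0 - math.cos(ev.theta_e))
    q2_e = 2.0 * E_E_BEAM * ev.e_prime * (1.0 + math.cos(ev.theta_e))
    return y_e, q2_e, q2_e / (S * y_e)


def passes_dis_selection(ev: Event) -> bool:
    """Apply the four offline requirements listed in the text."""
    y_e, _, _ = electron_method(ev)
    y_jb = ev.had_e_minus_pz / (2.0 * E_E_BEAM)
    # delta sums E - P_Z over all final-state objects, electron included.
    delta = ev.had_e_minus_pz + ev.e_prime * (1.0 - math.cos(ev.theta_e))
    return (ev.e_prime > 10.0 and y_e < 0.6 and y_jb > 0.1
            and 40.0 < delta < 60.0 and abs(ev.z_vtx) < 50.0)


# Illustrative event (numbers invented for the example, not data).
example = Event(e_prime=20.0, theta_e=math.radians(170.0),
                had_e_minus_pz=15.3, z_vtx=5.0)
y_e, q2_e, x_e = electron_method(example)
print(f"y_e = {y_e:.3f}, Q2 = {q2_e:.1f} GeV^2, x = {x_e:.2e}, "
      f"selected = {passes_dis_selection(example)}")
```

For a fully contained event, energy and momentum conservation give $\delta = 2E_e = 55$ GeV, which is why the accepted window is centred near that value.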

The kinematic range of the analysis is $10 < Q^2 < 100$ GeV$^2$ and $10^{-4} < x_{\rm Bj} < 10^{-2}$.

Jets were reconstructed using the $k_T$ cluster algorithm [19] in the longitudinally invariant inclusive mode [20]. The jet search was conducted in the HCM frame, which is equivalent to the Breit frame [21] apart from a longitudinal boost.

The jet phase space is defined by selection cuts on the jet pseudorapidity, $\eta^{\rm jet}_{\rm LAB}$, in the laboratory frame and on the jet transverse energy, $E^{\rm jet}_{T,{\rm HCM}}$, in the HCM frame, applied to $\eta^{\rm jet1,2(,3)}_{\rm LAB}$, $E^{\rm jet1}_{T,{\rm HCM}}$ and $E^{\rm jet2(,3)}_{T,{\rm HCM}}$, where jet 1, 2(, 3) refers to the two (three) jets with the highest transverse energy in the HCM frame for a given event. The dijet and trijet samples are inclusive in that they contain at least two or three jets passing the selection criteria, respectively.

4. Monte Carlo simulation

Monte Carlo (MC) simulations were used to correct the data for detector effects, inefficiencies of the event selection and the jet reconstruction, as well as for QED effects. Neutral current DIS events were generated using the ARIADNE 4.10 program [22] and the LEPTO 6.5 program [23] interfaced to HERACLES 4.5.2 [24] via DJANGO 6.2.4 [25]. The HERACLES program includes QED effects up to $O(\alpha_{\rm EM}^2)$. In the case of ARIADNE, events were generated using the colour-dipole model [26], whereas for LEPTO, the matrix-elements plus parton-shower model was used. The CTEQ5L parameterisations of the proton parton density functions (PDFs) [27] were used in the generation of DIS events for ARIADNE, and the CTEQ4D PDFs [27] were used for LEPTO. For hadronisation the Lund string model [28], as implemented in JETSET 7.4 [29,30], was used.

The ZEUS detector response was simulated with a program based on GEANT 3.13 [31]. The generated events were passed through the detector simulation, subjected to the same trigger requirements as the data, and processed by the same reconstruction and offline programs.

The measured distributions of the global kinematic variables are well described by both the ARIADNE and LEPTO MC models after reweighting in $Q^2$ [1]. The LEPTO simulation gives a better overall description of the jet variables, but ARIADNE provides a better description of dijets with small azimuthal separation. Therefore, for this analysis, the events generated with the ARIADNE program were used to determine the acceptance corrections. The events generated with LEPTO were used to estimate the uncertainty associated with the treatment of the parton shower.

5. NLO QCD calculations

The NLO calculations were carried out in the $\overline{\rm MS}$ scheme for five massless quark flavors with the program NLOJET [32]. The NLOJET program allows a computation of the dijet (trijet) production cross sections to next-to-leading order, i.e. including all terms up to $O(\alpha_s^2)$ ($O(\alpha_s^3)$). In certain regions of the jet phase space, where the two hardest jets are not balanced in transverse momentum, NLOJET can be used to calculate the cross sections for dijet production at $O(\alpha_s^3)$. It was checked that the LO and NLO calculations from NLOJET agree with those of DISENT [33] at the 1–2% level for the dijet cross sections [34,35].

For comparison with the data, the CTEQ6M [36] PDFs were used, and the renormalisation and factorisation scales were both chosen to be $(\bar{E}^2_{T,{\rm HCM}} + Q^2)/4$, where for dijets (trijets) $\bar{E}_{T,{\rm HCM}}$ is the average $E_{T,{\rm HCM}}$ of the two (three) highest-$E_{T,{\rm HCM}}$ jets in a given event. The choice of renormalisation scale matches that used in the previous ZEUS multijet analysis [1]. The strong coupling constant was set to the value used for the CTEQ6 PDFs, $\alpha_s(M_Z) = 0.118$, and evolved according to the two-loop solution of the renormalisation group equation.
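The scale choice above can be written out explicitly. The sketch below assumes that the quoted expression is the scale squared (it has dimensions of GeV$^2$) and that the factor-of-two variation used for the theoretical uncertainty acts on the scale itself, so that the squared scale changes by a factor of four. The function names are illustrative and are not part of NLOJET.

```python
def scale_squared(jet_et_hcm, q2, n_jets):
    """Return (mean_ET^2 + Q^2) / 4 in GeV^2.

    jet_et_hcm: transverse energies (GeV) of the jets in the HCM frame
    q2:         photon virtuality Q^2 (GeV^2)
    n_jets:     2 for the dijet sample, 3 for the trijet sample
    """
    leading = sorted(jet_et_hcm, reverse=True)[:n_jets]
    if len(leading) < n_jets:
        raise ValueError("not enough jets in the event")
    mean_et = sum(leading) / n_jets
    return (mean_et ** 2 + q2) / 4.0


def scale_variations(mu2):
    """Squared scales for mu/2, mu and 2*mu (assumed variation of mu itself)."""
    return mu2 / 4.0, mu2, 4.0 * mu2


# Illustrative event: three jets with ET = 9, 7 and 5 GeV at Q^2 = 36 GeV^2.
mu2_dijet = scale_squared([9.0, 7.0, 5.0], 36.0, n_jets=2)
mu2_trijet = scale_squared([9.0, 7.0, 5.0], 36.0, n_jets=3)
print(mu2_dijet, mu2_trijet, scale_variations(mu2_dijet))
```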

The NLO QCD predictions were corrected for hadronisation effects using a bin-by-bin procedure. Hadronisation correction factors were defined for each bin as the ratio of the hadron- to parton-level cross sections and were calculated using the LEPTO MC program, which, at the parton level, gives a better agreement with NLOJET than ARIADNE. The correction factors $C_{\rm had}$ were typically in the range 0.8–0.9 for most of the phase space.
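The bin-by-bin hadronisation correction can be illustrated with a few lines of code. This is a minimal sketch of the ratio defined above; the function names and all input numbers are invented for the example and are not LEPTO or NLOJET output.

```python
def hadronisation_factors(sigma_hadron, sigma_parton):
    """C_had per bin: hadron-level over parton-level MC cross section."""
    if len(sigma_hadron) != len(sigma_parton):
        raise ValueError("bin counts differ")
    return [h / p for h, p in zip(sigma_hadron, sigma_parton)]


def correct_to_hadron_level(nlo_parton, c_had):
    """Multiply each parton-level NLO bin by its hadronisation factor."""
    return [s * c for s, c in zip(nlo_parton, c_had)]


# Invented MC cross sections per bin (arbitrary units).
mc_hadron = [8.66, 4.35, 1.76]
mc_parton = [10.0, 5.0, 2.0]
c_had = hadronisation_factors(mc_hadron, mc_parton)

# Invented parton-level NLO prediction in the same bins.
nlo_hadron = correct_to_hadron_level([70.0, 45.0, 28.0], c_had)
print([round(c, 3) for c in c_had], [round(s, 2) for s in nlo_hadron])
```

Because the factor is a ratio taken within one MC model, it depends on how well that model matches the NLO calculation at the parton level, which is the reason given in the text for choosing LEPTO.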

The theoretical uncertainty was estimated by varying the renormalisation scale up and down by a factor of two. The uncertainties in the proton PDFs were estimated in the previous ZEUS multijet analysis [1] by repeating NLOJET calculations using 40 additional sets from CTEQ6M, which resulted in a 2.5% contribution to the theoretical uncertainty and was therefore neglected.

6. Acceptance corrections

The ARIADNE MC was used to correct the data for detector effects. The jet transverse energies were corrected for energy losses from inactive material in the detector. Typical jet energy correction factors were 1–1.2, depending on the transverse energy of the detector-level jet and the jet pseudorapidity.

The measured cross sections were corrected to the hadron level using a bin-by-bin procedure. These corrections account for trigger efficiency, acceptance, and migration. Typical efficiencies and purities were about 50% for the differential cross sections, with correction factors typically between 1 and 1.5. For the double-differential cross sections, the efficiencies and purities were typically 20–50%, with correction factors between 1 and 2.
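The relation between efficiency, purity and the bin-by-bin correction factor can be made concrete as follows. The definitions used here are the conventional ones and are an assumption, since the text does not state them: efficiency is the fraction of hadron-level events in a bin that are also reconstructed in that bin, purity is the fraction of reconstructed events in a bin that were generated in it, and the correction factor is the ratio of generated to reconstructed yields.

```python
def bin_quality(n_generated, n_reconstructed, n_both):
    """Return (efficiency, purity, correction) for one bin of an MC sample.

    n_generated:     events generated (hadron level) in the bin
    n_reconstructed: events reconstructed (detector level) in the bin
    n_both:          events generated and reconstructed in the same bin
    """
    efficiency = n_both / n_generated
    purity = n_both / n_reconstructed
    correction = n_generated / n_reconstructed  # equals purity / efficiency
    return efficiency, purity, correction


# Invented MC counts for one bin.
eff, pur, corr = bin_quality(n_generated=1000, n_reconstructed=800, n_both=450)
print(f"efficiency = {eff:.2f}, purity = {pur:.2f}, correction = {corr:.2f}")

# The corrected data yield is the measured yield times the correction factor.
corrected_yield = 640 * corr
```

With these definitions, efficiencies and purities of similar size give correction factors close to one, consistent with the ranges quoted above.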

The cross sections were corrected to the QED Born level by applying an additional correction obtained from a special sample of the LEPTO MC with the radiative QED effects turned off. The QED radiative effects were typically 2–4%.

7. Systematic uncertainties

A detailed study of the sources contributing to the systematic uncertainties of the measurements has been performed [37,38]. The main sources contributing to the systematic uncertainties are listed below:

• the data were corrected using LEPTO instead of ARIADNE;

• the jet energies in the data were scaled up and down by 3% for jets with transverse energy less than 10 GeV and 1% for jets with transverse energy above 10 GeV, according to the estimated jet energy scale uncertainty [39];

• the cut on $E^{\rm jet}_{T,{\rm HCM}}$ for each jet was raised and lowered by 1 GeV, corresponding to the $E_T$ resolution;

• the upper and lower cuts on $\eta^{\rm jet1,2(,3)}_{\rm LAB}$ were each changed by 0.1, corresponding to the $\eta$ resolution;

• the uncertainties due to the selection cuts were estimated by varying the cuts within the resolution of each variable.

The largest systematic uncertainties came from the uncertainty of the jet energy scale, which produced a systematic uncertainty of 5–10%. For the trijet sample, altering the cut on $E^{\rm jet3}_{T,{\rm HCM}}$ also produced a systematic uncertainty of 5–10%. The other significant systematic uncertainty arose from the choice of LEPTO instead of ARIADNE for correcting detector effects. This systematic uncertainty was also typically 5–10%. The other systematic uncertainties were smaller than or similar to the statistical uncertainties.

The systematic uncertainties not associated with the absolute energy scale of the jets were added in quadrature to the statistical uncertainties and are shown as error bars in the figures. The uncertainty due to the absolute energy scale of the jets is shown separately as a shaded band in each figure, due to the large bin-to-bin correlation. In addition, there is an overall normalisation uncertainty of 2.2% from the luminosity determination, which is not included in the figures.
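The treatment of the uncertainties described above can be sketched numerically. The example uses the first dijet bin of Table 1 ($10 < Q^2 < 15$ GeV$^2$) as input and assumes that the upper and lower systematic uncertainties are each combined in quadrature with the symmetric statistical uncertainty; the variable names are illustrative.

```python
import math


def outer_error_bar(stat, syst_up, syst_down):
    """Quadratic sum of statistical and (asymmetric) systematic uncertainties."""
    return math.hypot(stat, syst_up), math.hypot(stat, syst_down)


# First dijet bin of Table 1, all values in pb/GeV^2.
cross_section = 66.0
stat = 0.8
syst_up, syst_down = 3.7, 4.4

up, down = outer_error_bar(stat, syst_up, syst_down)
print(f"outer error bar: +{up:.2f} / -{down:.2f} pb/GeV^2")

# The jet energy scale band and the luminosity uncertainty are kept separate:
# the former is drawn as a shaded band, the latter is a common 2.2% normalisation.
lumi_uncertainty = 0.022 * cross_section
print(f"luminosity normalisation: +-{lumi_uncertainty:.2f} pb/GeV^2")
```

Keeping the energy-scale contribution out of the quadratic sum avoids hiding its strong bin-to-bin correlation inside uncorrelated-looking error bars.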

8. Results

8.1. Single-differential cross sections $d\sigma/dQ^2$, $d\sigma/dx_{\rm Bj}$ and trijet to dijet cross section ratios

The single-differential cross sections $d\sigma/dQ^2$ and $d\sigma/dx_{\rm Bj}$ for dijet and trijet production are presented in Figs. 1(a) and (c), and Tables 1–4. The ratio $\sigma_{\rm trijet}/\sigma_{\rm dijet}$ of the trijet cross section to the dijet cross section, as a function of $Q^2$ and of $x_{\rm Bj}$, is presented in Figs. 1(b) and (d), respectively. The ratio $\sigma_{\rm trijet}/\sigma_{\rm dijet}$ is almost $Q^2$ independent, as shown in Fig. 1(b), and falls

Fig. 1. Inclusive dijet and trijet cross sections as functions of (a) $Q^2$ and (c) $x_{\rm Bj}$. Figures (b) and (d) show the ratios of the trijet to dijet cross sections. The bin-averaged differential cross sections are plotted at the bin centers. The inner error bars represent the statistical uncertainties. The outer error bars represent the quadratic sum of statistical and systematic uncertainties not associated with the jet energy scale. The shaded band indicates the jet energy scale uncertainty. The predictions of perturbative QCD at NLO, corrected for hadronisation effects and using the CTEQ6 parameterisations of the proton PDFs, are compared to data. The lower parts of the plots show the relative difference between the data and the corresponding theoretical prediction. The hatched band represents the renormalisation-scale uncertainty of the QCD calculation.

steeply with increasing $x_{\rm Bj}$, as shown in Fig. 1(d). In the cross section ratios, the experimental and theoretical uncertainties partially cancel, providing a possibility to test the pQCD calculations more precisely than can be done with the individual cross sections. Both the cross sections and the cross section ratios are well described by the NLOJET calculations.

Table 1
The inclusive dijet cross sections as functions of $Q^2$. Included are the statistical, systematic, and jet energy scale uncertainties in columns 3, 4, and 5, respectively. Column 6 shows the correction factor from QED radiative effects applied to the measured cross sections, and column 7 shows the hadronisation correction applied to the NLOJET calculations shown in the figures

Q^2        dsigma/dQ^2   delta_stat   delta_syst     delta_ES       C_QED   C_had
(GeV^2)    (pb/GeV^2)    (pb/GeV^2)   (pb/GeV^2)     (pb/GeV^2)
10–15      66.0          0.8          +3.7/-4.4      +5.7/-5.9      0.984   0.866
15–20      41.4          0.6          +2.0/-2.4      +3.5/-3.6      0.968   0.870
20–30      26.2          0.3          +1.0/-0.8      +2.2/-2.0      0.965   0.876
30–50      14.0          0.1          +0.4/-0.3      +1.0/-1.1      0.955   0.884
50–100     5.82          0.06         +0.17/-0.16    +0.38/-0.38    0.952   0.887

Table 2
The inclusive dijet cross sections as functions of $x_{\rm Bj}$. Other details as in the caption to Table 1

x_Bj x 10^4   dsigma/dx_Bj   delta_stat    delta_syst     delta_ES       C_QED   C_had
              (pb x 10^4)    (pb x 10^4)   (pb x 10^4)    (pb x 10^4)
1.7–3.0       85.3           1.7           +5.6/-6.8      +7.0/-6.3      0.987   0.910
3.0–5.0       113.8          1.5           +5.9/-6.2      +8.8/-8.9      0.975   0.887
5.0–10.0      83.1           0.8           +3.3/-3.7      +6.9/-7.1      0.969   0.876
10.0–25.0     29.5           0.3           +0.8/-0.8      +2.2/-2.2      0.958   0.876
25.0–100.0    2.31           0.03          +0.08/-0.07    +0.17/-0.17    0.948   0.862

Table 3
The inclusive trijet cross sections as functions of $Q^2$. Other details as in the caption to Table 1

Q^2        dsigma/dQ^2   delta_stat   delta_syst       delta_ES         C_QED   C_had
(GeV^2)    (pb/GeV^2)    (pb/GeV^2)   (pb/GeV^2)       (pb/GeV^2)
10–15      7.9           0.2          +1.1/-1.3        +1.0/-1.0        0.991   0.759
15–20      4.40          0.17         +0.46/-0.66      +0.45/-0.52      0.946   0.776
20–30      3.19          0.11         +0.27/-0.37      +0.38/-0.38      0.969   0.786
30–50      1.68          0.06         +0.13/-0.11      +0.20/-0.19      0.949   0.794
50–100     0.719         0.024        +0.044/-0.027    +0.077/-0.070    0.956   0.795
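As a cross-check of the statement that the trijet to dijet ratio is almost independent of $Q^2$, the ratio can be recomputed from the central values of the dijet and trijet cross sections in Tables 1 and 3. This is a minimal sketch using only the tabulated central values; uncertainties are not propagated here because the two samples overlap and their correlations are not given in the tables.

```python
# Central values of dsigma/dQ^2 in pb/GeV^2, keyed by Q^2 bin (GeV^2).
dijet = {(10, 15): 66.0, (15, 20): 41.4, (20, 30): 26.2,
         (30, 50): 14.0, (50, 100): 5.82}
trijet = {(10, 15): 7.9, (15, 20): 4.40, (20, 30): 3.19,
          (30, 50): 1.68, (50, 100): 0.719}

# Bin-by-bin ratio of trijet to dijet cross sections.
ratios = {q2_bin: trijet[q2_bin] / dijet[q2_bin] for q2_bin in dijet}

for (lo, hi), r in ratios.items():
    print(f"{lo:3d} < Q^2 < {hi:3d} GeV^2: sigma_trijet/sigma_dijet = {r:.3f}")
```

The ratio stays close to 0.12 in every bin, in line with the flat behaviour described for Fig. 1(b).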

The single-differential cross sections $d\sigma/dE^{\rm jet}_{T,{\rm HCM}}$ for two (three) jet events are presented in Fig. 2. The measured cross sections are well described by the NLOJET calculations over the whole range in $E^{\rm jet}_{T,{\rm HCM}}$ considered.

The single-differential cross sections $d\sigma/d\eta^{\rm jet}_{\rm LAB}$ for dijet and trijet production are presented in Figs. 3(a) and (c). For this figure, the two (three) jets with highest $E^{\rm jet}_{T,{\rm HCM}}$ were ordered in

Table 4
The inclusive trijet cross sections as functions of $x_{\rm Bj}$. Other details as in the caption to Table 1

x_Bj x 10^4   dsigma/dx_Bj   delta_stat    delta_syst       delta_ES         C_QED   C_had
              (pb x 10^4)    (pb x 10^4)   (pb x 10^4)      (pb x 10^4)
1.7–3.0       14.7           0.7           +1.5/-3.3        +1.5/-1.9        1.00    0.811
3.0–5.0       15.9           0.5           +2.0/-2.3        +1.9/-1.8        0.968   0.796
5.0–10.0      9.6            0.3           +0.9/-0.9        +1.1/-1.1        0.961   0.780
10.0–25.0     3.35           0.10          +0.21/-0.19      +0.40/-0.37      0.954   0.785
25.0–100.0    0.192          0.013         +0.032/-0.020    +0.023/-0.022    0.95    0.739

Table 5
The bin edges used for the measurements of the jet correlations presented. For the trijet sample, the first two bins in $|\Delta\phi^{\rm jet1,2}_{\rm HCM}|$ are combined

Variable                                                               Bin 1       Bin 2        Bin 3         Bin 4
$\Delta E^{\rm jet1,2}_{T,{\rm HCM}}$                                  0–4 GeV     4–10 GeV     10–18 GeV     18–100 GeV
$|\Sigma \vec{p}^{\,\rm jet1,2}_{T,{\rm HCM}}|$                        0–4 GeV     4–10 GeV     10–16 GeV     16–100 GeV
$|\Delta \vec{p}^{\,\rm jet1,2}_{T,{\rm HCM}}|/(2E^{\rm jet1}_{T,{\rm HCM}})$   0–0.5       0.5–0.7      0.7–0.85      0.85–1
$|\Delta\phi^{\rm jet1,2}_{\rm HCM}|$                                  0–π/4       π/4–π/2      π/2–3π/4      3π/4–π

$\eta^{\rm jet}_{\rm LAB}$. Also shown are the measurements of the single-differential cross sections $d\sigma/d|\Delta\eta^{\rm jet1,2}_{\rm HCM}|$, where $|\Delta\eta^{\rm jet1,2}_{\rm HCM}|$ is the absolute difference in pseudorapidity of the two jets with highest $E^{\rm jet}_{T,{\rm HCM}}$ (see Figs. 3(b) and (d)). The NLOJET predictions describe the measurements well.

8.3. Jet transverse energy and momentum correlations

Correlations in transverse energy of the jets have been investigated by measuring the double-differential cross sections $d^2\sigma/dx_{\rm Bj}\,d\Delta E^{\rm jet1,2}_{T,{\rm HCM}}$, where $\Delta E^{\rm jet1,2}_{T,{\rm HCM}}$ is the difference in transverse energy between the two jets with the highest $E^{\rm jet}_{T,{\rm HCM}}$. The measurement was performed in $x_{\rm Bj}$ bins, which are defined in Table 2, for dijet and trijet production. Figs. 4 and 5 show the cross sections $d^2\sigma/dx_{\rm Bj}\,d\Delta E^{\rm jet1,2}_{T,{\rm HCM}}$ for all bins in $x_{\rm Bj}$ for the dijet and trijet samples, respectively.

Fig. 2. Inclusive dijet (a) and trijet (b) cross sections as functions of $E^{\rm jet}_{T,{\rm HCM}}$ with the jets ordered in $E^{\rm jet}_{T,{\rm HCM}}$. The cross sections of the second and third jet were scaled for readability. Other details as in the caption to Fig. 1.

The NLOJET calculations at $O(\alpha_s^2)$ do not describe the high-$\Delta E^{\rm jet1,2}_{T,{\rm HCM}}$ tail of the dijet sample at low $x_{\rm Bj}$, where the calculations fall below the data. Since these calculations give the lowest-order non-trivial contribution to the cross section in the region $\Delta E^{\rm jet1,2}_{T,{\rm HCM}} > 0$, they are affected by large uncertainties from the higher-order terms in $\alpha_s$. A higher-order calculation for the dijet sample is possible with NLOJET if the region $\Delta E^{\rm jet1,2}_{T,{\rm HCM}}$ near zero is avoided. NLOJET calculations at $O(\alpha_s^3)$ for the dijet sample have been obtained for the region $\Delta E^{\rm jet1,2}_{T,{\rm HCM}} > 4$ GeV and are compared to the data in Fig. 4. With the inclusion of the next term in the perturbative series in $\alpha_s$, the NLOJET calculations describe the data within the theoretical uncertainties. The NLOJET calculations at $O(\alpha_s^3)$ for trijet production are consistent with the measurements.

As a refinement to the studies of the correlations between the transverse energies of the jets, further correlations of the jet transverse momenta have been investigated. The correlations in jet transverse momenta were examined by measuring two sets of double-differential cross sections: $d^2\sigma/dx_{\rm Bj}\,d|\Sigma \vec{p}^{\,\rm jet1,2}_{T,{\rm HCM}}|$ and $d^2\sigma/dx_{\rm Bj}\,d(|\Delta \vec{p}^{\,\rm jet1,2}_{T,{\rm HCM}}|/(2E^{\rm jet1}_{T,{\rm HCM}}))$. The variable $|\Sigma \vec{p}^{\,\rm jet1,2}_{T,{\rm HCM}}|$ is the transverse component of the vector sum of the jet momenta of the two jets with the highest $E^{\rm jet}_{T,{\rm HCM}}$. For events with only two jets $|\Sigma \vec{p}^{\,\rm jet1,2}_{T,{\rm HCM}}| = 0$, and additional QCD radiation increases this value. The variable $|\Delta \vec{p}^{\,\rm jet1,2}_{T,{\rm HCM}}|/(2E^{\rm jet1}_{T,{\rm HCM}})$ is the magnitude of the vector difference of the transverse momenta of the two jets with the highest $E^{\rm jet}_{T,{\rm HCM}}$ scaled by twice the transverse energy of the hardest jet. For events with only two jets $|\Delta \vec{p}^{\,\rm jet1,2}_{T,{\rm HCM}}|/(2E^{\rm jet1}_{T,{\rm HCM}}) = 1$, and additional QCD radiation decreases this value. Figs. 6–9 show the cross sections $d^2\sigma/dx_{\rm Bj}\,d|\Sigma \vec{p}^{\,\rm jet1,2}_{T,{\rm HCM}}|$ and the cross sections $d^2\sigma/dx_{\rm Bj}\,d(|\Delta \vec{p}^{\,\rm jet1,2}_{T,{\rm HCM}}|/(2E^{\rm jet1}_{T,{\rm HCM}}))$ in bins of $x_{\rm Bj}$ for the dijet and trijet samples.
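The correlation variables defined above can be computed from the transverse momentum components of the two leading jets. The sketch below treats the jets as massless, so that the transverse energy equals the magnitude of the transverse momentum; this is an assumption made for the illustration. The function name and the input convention, pairs of $(p_x, p_y)$ in the HCM frame, are hypothetical.

```python
import math


def jet_correlations(jet1, jet2):
    """Correlation variables for the two leading jets.

    jet1, jet2: (px, py) in GeV in the HCM frame, jet1 being the hardest.
    Jets are treated as massless, so ET is taken equal to |pT|.
    """
    (px1, py1), (px2, py2) = jet1, jet2
    et1 = math.hypot(px1, py1)
    et2 = math.hypot(px2, py2)
    dphi = abs(math.atan2(py1, px1) - math.atan2(py2, px2))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi  # fold into [0, pi]
    return {
        "delta_et": et1 - et2,
        "sum_pt": math.hypot(px1 + px2, py1 + py2),
        "delta_pt_scaled": math.hypot(px1 - px2, py1 - py2) / (2.0 * et1),
        "delta_phi": dphi,
    }


# Two balanced, back-to-back jets: the pure dijet limit.
balanced = jet_correlations((10.0, 0.0), (-10.0, 0.0))
# Additional radiation carries away transverse momentum (invented numbers).
unbalanced = jet_correlations((10.0, 0.0), (-6.0, 3.0))
print(balanced)
print(unbalanced)
```

In the balanced case the vector sum vanishes, the scaled difference equals one and the azimuthal separation is $\pi$; the unbalanced case moves away from all three limits, which is the behaviour exploited in the measurements.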

At low $x_{\rm Bj}$, the NLOJET calculations at $O(\alpha_s^2)$ underestimate the dijet cross sections at high values of $|\Sigma \vec{p}^{\,\rm jet1,2}_{T,{\rm HCM}}|$ and low values of $|\Delta \vec{p}^{\,\rm jet1,2}_{T,{\rm HCM}}|/(2E^{\rm jet1}_{T,{\rm HCM}})$. The description of the data by

Fig. 3. The inclusive dijet (a) and trijet (c) cross sections as functions of $\eta^{\rm jet}_{\rm LAB}$ with the jets ordered in $\eta^{\rm jet}_{\rm LAB}$: $\eta^{\rm jet1}_{\rm LAB} > \eta^{\rm jet2}_{\rm LAB} > \eta^{\rm jet3}_{\rm LAB}$. The cross sections of the second and third jet were scaled for readability. Figures (b) and (d) show the dijet and trijet cross sections as functions of $|\Delta\eta^{\rm jet1,2}_{\rm HCM}|$ between the two jets with highest $E^{\rm jet}_{T,{\rm HCM}}$. Other details as in the caption to Fig. 1.

the NLOJET calculations at $O(\alpha_s^2)$ improves at higher values of $x_{\rm Bj}$. A higher-order calculation with NLOJET at $O(\alpha_s^3)$ for the dijet sample has been obtained for the region $|\Sigma \vec{p}^{\,\rm jet1,2}_{T,{\rm HCM}}| > 4$ GeV, which is compared to the data in Fig. 6; and for the region $|\Delta \vec{p}^{\,\rm jet1,2}_{T,{\rm HCM}}|/(2E^{\rm jet1}_{T,{\rm HCM}}) < 0.85$, which is compared to the data in Fig. 8. With the inclusion of the next term in the perturbative series in $\alpha_s$, the NLOJET calculations describe the data well. The NLOJET calculations at $O(\alpha_s^3)$ for trijet production are consistent with the measurements.

Fig. 4. Dijet cross sections as functions of $\Delta E^{\rm jet1,2}_{T,{\rm HCM}}$. The NLOJET calculations at $O(\alpha_s^2)$ ($O(\alpha_s^3)$) are shown as dashed (solid) lines. The lower parts of the plots show the relative difference between the data and the $O(\alpha_s^3)$ predictions. The boundaries for the bins in $\Delta E^{\rm jet1,2}_{T,{\rm HCM}}$ are given in Table 5. Other details as in the caption to Fig. 1.

Fig. 5. Trijet cross sections as functions of $\Delta E^{\rm jet1,2}_{T,{\rm HCM}}$. The measurements are compared to NLOJET calculations at $O(\alpha_s^3)$. The boundaries for the bins in $\Delta E^{\rm jet1,2}_{T,{\rm HCM}}$ are given in Table 5. Other details as in the caption to Fig. 1.

Fig. 6. Dijet cross sections as functions of $|\Sigma \vec{p}^{\,\rm jet1,2}_{T,{\rm HCM}}|$. The NLOJET calculations at $O(\alpha_s^2)$ ($O(\alpha_s^3)$) are shown as dashed (solid) lines. The lower parts of the plots show the relative difference between the data and the $O(\alpha_s^3)$ predictions. The boundaries for the bins in $|\Sigma \vec{p}^{\,\rm jet1,2}_{T,{\rm HCM}}|$ are given in Table 5. Other details as in the caption to Fig. 1.

Fig. 7. Trijet cross sections as functions of $|\Sigma \vec{p}^{\,\rm jet1,2}_{T,{\rm HCM}}|$. The measurements are compared to NLOJET calculations at $O(\alpha_s^3)$. The boundaries for the bins in $|\Sigma \vec{p}^{\,\rm jet1,2}_{T,{\rm HCM}}|$ are given in Table 5. Other details as in the caption to Fig. 1.

Fig. 8. Dijet cross sections as functions of $|\Delta \vec{p}^{\,\rm jet1,2}_{T,{\rm HCM}}|/(2E^{\rm jet1}_{T,{\rm HCM}})$. The NLOJET calculations at $O(\alpha_s^2)$ ($O(\alpha_s^3)$) are shown as dashed (solid) lines. The lower parts of the plots show the relative difference between the data and the $O(\alpha_s^3)$ predictions. The boundaries for the bins in $|\Delta \vec{p}^{\,\rm jet1,2}_{T,{\rm HCM}}|/(2E^{\rm jet1}_{T,{\rm HCM}})$ are given in Table 5. Other details as in the caption to Fig. 1.

Fig. 9. Trijet cross sections as functions of $|\Delta \vec{p}^{\,\rm jet1,2}_{T,{\rm HCM}}|/(2E^{\rm jet1}_{T,{\rm HCM}})$. The measurements are compared to NLOJET calculations at $O(\alpha_s^3)$. The boundaries for the bins in $|\Delta \vec{p}^{\,\rm jet1,2}_{T,{\rm HCM}}|/(2E^{\rm jet1}_{T,{\rm HCM}})$ are given in Table 5. Other details as in the caption to Fig. 1.

Fig. 10. Dijet cross sections as functions of $|\Delta\phi^{\rm jet1,2}_{\rm HCM}|$. The NLOJET calculations at $O(\alpha_s^2)$ ($O(\alpha_s^3)$) are shown as dashed (solid) lines. The lower parts of the plots show the relative difference between the data and the $O(\alpha_s^3)$ predictions. The boundaries for the bins in $|\Delta\phi^{\rm jet1,2}_{\rm HCM}|$ are given in Table 5. Other details as in the caption to Fig. 1.

Fig. 11. Trijet cross sections as functions of $|\Delta\phi^{\rm jet1,2}_{\rm HCM}|$. The measurements are compared to NLOJET calculations at $O(\alpha_s^3)$. The boundaries for the bins in $|\Delta\phi^{\rm jet1,2}_{\rm HCM}|$ are given in Table 5. Other details as in the caption to Fig. 1.

Fig. 12. The dijet and trijet cross sections for events with $|\Delta\phi^{\rm jet1,2}_{\rm HCM}| < 2\pi/3$ as functions of $x_{\rm Bj}$ in two different $Q^2$-bins. The NLOJET calculations at $O(\alpha_s^2)$ ($O(\alpha_s^3)$) are shown as dashed (solid) lines. The trijet measurements are compared to NLOJET calculations at $O(\alpha_s^3)$. The lower parts of the plots in (a) and (b) show the relative difference between the data and the $O(\alpha_s^3)$ predictions. Other details as in the caption to Fig. 1.

The double-differential cross sections $d^2\sigma/dx_{\rm Bj}\,d|\Delta\phi^{\rm jet1,2}_{\rm HCM}|$, where $|\Delta\phi^{\rm jet1,2}_{\rm HCM}|$ is the azimuthal separation of the two jets with the largest $E^{\rm jet}_{T,{\rm HCM}}$, for dijet and trijet production are shown in Figs. 10 and 11 for all bins in $x_{\rm Bj}$. For both dijet and trijet production the cross section falls with $|\Delta\phi^{\rm jet1,2}_{\rm HCM}|$. The NLOJET calculations at $O(\alpha_s^2)$ for dijet production decrease more rapidly with $|\Delta\phi^{\rm jet1,2}_{\rm HCM}|$ than the data and the calculations disagree with the data at low $|\Delta\phi^{\rm jet1,2}_{\rm HCM}|$. A higher-order NLOJET calculation at $O(\alpha_s^3)$ for the dijet sample has been obtained for the region $|\Delta\phi^{\rm jet1,2}_{\rm HCM}| < 3\pi/4$ and describes the data well. The measurements for trijet production are reasonably well described by the NLOJET calculations at $O(\alpha_s^3)$.

A further investigation has been performed by measuring the cross section $d^2\sigma/dQ^2\,dx_{\rm Bj}$ for dijet (trijet) events with $|\Delta\phi^{\rm jet1,2}_{\rm HCM}| < 2\pi/3$ as a function of $x_{\rm Bj}$. For the two-jet final states, the presence of two leading jets with $|\Delta\phi^{\rm jet1,2}_{\rm HCM}| < 2\pi/3$ can indicate another high-$E_T$ jet or set of high-$E_T$ jets outside the measured $\eta$ range. These cross sections are presented in Fig. 12. The NLOJET calculations at $O(\alpha_s^2)$ for dijet production underestimate the data, the difference increasing towards low $x_{\rm Bj}$. The NLOJET calculations at $O(\alpha_s^3)$ are up to about one order of magnitude larger than the $O(\alpha_s^2)$ calculations and are consistent with the data, demonstrating the importance of the higher-order terms in the description of the data especially at low $x_{\rm Bj}$. The NLOJET calculations at $O(\alpha_s^3)$ describe the trijet data within the renormalisation-scale uncertainties.

9. Summary

Dijet and trijet production in deep inelastic $ep$ scattering has been measured in the phase space region $10 < Q^2 < 100$ GeV$^2$ and $10^{-4} < x_{\rm Bj} < 10^{-2}$ using an integrated luminosity of 82 pb$^{-1}$ collected by the ZEUS experiment. The high statistics have made possible detailed studies of multijet production at low $x_{\rm Bj}$. The dependence of dijet and trijet production on the kinematic variables $Q^2$ and $x_{\rm Bj}$ and on the jet variables $E^{\rm jet}_{T,{\rm HCM}}$ and $\eta^{\rm jet}_{\rm LAB}$ is well described by perturbative QCD calculations which include NLO corrections. To investigate possible deviations with respect to the collinear factorisation approximation used in the standard pQCD approach, measurements of the correlations between the two jets with highest $E^{\rm jet}_{T,{\rm HCM}}$ have been made. At low $x_{\rm Bj}$, measurements of dijet production with low azimuthal separation are reproduced by the perturbative QCD calculations provided that higher-order terms ($O(\alpha_s^3)$) are accounted for. Such terms increase the predictions of pQCD calculations by up to one order of magnitude when the two jets with the highest $E^{\rm jet}_{T,{\rm HCM}}$ are not balanced in transverse momentum. This demonstrates the importance of higher-order corrections in the low-$x_{\rm Bj}$ region.

Acknowledgements

It is a pleasure to thank the DESY Directorate for their strong support and encouragement.

The remarkable achievements of the HERA machine group were essential for the successful

completion of this work and are greatly appreciated. The design, construction and installation of

the ZEUS detector has been made possible by the efforts of many people who are not listed as

authors. It is also a pleasure to thank Zoltan Nagy for useful discussions about NLO JET.



production in DIS at HERA

ZEUS Collaboration

S. Chekanov 1 , M. Derrick, S. Magill, B. Musgrave, D. Nicholass 2 ,

J. Repond, R. Yoshida

Argonne National Laboratory, Argonne, IL 60439-4815, USA 3

M.C.K. Mattingly

Andrews University, Berrien Springs, MI 49104-0380, USA

Institut für Physik der Humboldt-Universität zu Berlin, Berlin, Germany

D. Boscherini, A. Bruni, G. Bruni, L. Cifarelli, F. Cindolo, A. Contin,

M. Corradi, S. De Pasquale, G. Iacobucci, A. Margotti, R. Nania,

A. Polini, G. Sartorelli, A. Zichichi

University and INFN Bologna, Bologna, Italy 4

M. Jüngst, O.M. Kind 6 , A.E. Nuncio-Quiroz, E. Paul 7 , R. Renner 8 ,

U. Samson, V. Schönberg, R. Shehzadi, M. Wlasenko

Physikalisches Institut der Universität Bonn, Bonn, Germany 9

H.H. Wills Physics Laboratory, University of Bristol, Bristol, United Kingdom 10

E. Tassi

Calabria University, Physics Department and INFN, Cosenza, Italy 4

0550-3213/$ - see front matter © 2007 Elsevier B.V. All rights reserved.

doi:10.1016/j.nuclphysb.2007.06.022


Chonnam National University, Kwangju, South Korea 13

Jabatan Fizik, Universiti Malaya, 50603 Kuala Lumpur, Malaysia 14

Nevis Laboratories, Columbia University, Irvington on Hudson, NY 10027, USA 15

P. Stopa, L. Zawiejski

The Henryk Niewodniczanski Institute of Nuclear Physics, Polish Academy of Sciences, Cracow, Poland 16

M. Przybycien, L. Suszycki

Faculty of Physics and Applied Computer Science, AGH-University of Science and Technology, Cracow, Poland 17

A. Kotanski 18 , W. Sominski 19

Department of Physics, Jagellonian University, Cracow, Poland

R. Ciesielski, N. Coppola, A. Dossanov, V. Drugakov, J. Fourletova,

A. Geiser, D. Gladkov, P. Göttlicher 21 , J. Grebenyuk, I. Gregor, T. Haas,

W. Hain, C. Horn 22 , A. Hüttmann, B. Kahle, I.I. Katkov, U. Klein 23 ,

U. Kötz, H. Kowalski, E. Lobodzinska, B. Löhr, R. Mankel,

I.-A. Melzer-Pellmann, S. Miglioranzi, A. Montanari, D. Notz,

L. Rinaldi, P. Roloff, I. Rubinsky, R. Santamarta, U. Schneekloth,

A. Spiridonov 24 , H. Stadie, D. Szuba 25 , J. Szuba 26 , T. Theedt, G. Wolf,

K. Wrona, C. Youngman, W. Zeuner

Deutsches Elektronen-Synchrotron DESY, Hamburg, Germany

W. Lohmann, S. Schlenstedt

Deutsches Elektronen-Synchrotron DESY, Zeuthen, Germany

University and INFN, Florence, Italy 4


Fakultät für Physik der Universität Freiburg i.Br., Freiburg i.Br., Germany 9

I.O. Skillicorn

Department of Physics and Astronomy, University of Glasgow, Glasgow, United Kingdom 10

I. Gialas 28 , K. Papageorgiu

Department of Engineering in Management and Finance, University of Aegean, Greece

T. Schörner-Sadenius, J. Sztuk, K. Wichmann, K. Wick

Hamburg University, Institute of Experimental Physics, Hamburg, Germany 9

Imperial College London, High Energy Nuclear Physics Group, London, United Kingdom 10

Y. Yamazaki

Institute of Particle and Nuclear Studies, KEK, Tsukuba, Japan 31

Institute of Physics and Technology of Ministry of Education and Science of Kazakhstan, Almaty, Kazakhstan

V. Aushev 1

Institute for Nuclear Research, National Academy of Sciences, Kiev,

and Kiev National University, Kiev, Ukraine

D. Son

Kyungpook National University, Center for High Energy Physics, Daegu, South Korea 13

J. de Favereau, K. Piotrzkowski

Institut de Physique Nucléaire, Université Catholique de Louvain, Louvain-la-Neuve, Belgium 32

M. Soares, J. Terrón, M. Zambrana

Departamento de Física Teórica, Universidad Autónoma de Madrid, Madrid, Spain 34


Department of Physics, McGill University, Montréal, Québec, Canada H3A 2T8 35

T. Tsurugai

Meiji Gakuin University, Faculty of General Education, Yokohama, Japan 31

Moscow Engineering Physics Institute, Moscow, Russia 36

I.A. Korzhavina, V.A. Kuzmin, B.B. Levchenko 37 , O.Yu. Lukina,

A.S. Proskuryakov, L.M. Shcheglova, D.S. Zotkin, S.A. Zotkin

Moscow State University, Institute of Nuclear Physics, Moscow, Russia 38

Max-Planck-Institut für Physik, München, Germany

H. Tiecke, M. Vázquez 29 , L. Wiggers

NIKHEF and University of Amsterdam, Amsterdam, Netherlands 39

Physics Department, Ohio State University, Columbus, OH 43210, USA 3

R.C.E. Devenish, B. Foster, K. Korcsak-Gorzo, S. Patel, V. Roberfroid 40 ,

A. Robertson, P.B. Straub, C. Uribe-Estrada, R. Walczak

Department of Physics, University of Oxford, Oxford, United Kingdom 10

A. Garfagnini, S. Limentani, A. Longhin, L. Stanco, M. Turcato

Dipartimento di Fisica dell'Università and INFN, Padova, Italy 4

Department of Physics, Pennsylvania State University, University Park, PA 16802, USA 15

Y. Iga

Polytechnic University, Sagamihara, Japan 31


Dipartimento di Fisica, Università 'La Sapienza' and INFN, Rome, Italy 4

Rutherford Appleton Laboratory, Chilton, Didcot, Oxon, United Kingdom 10

Raymond and Beverly Sackler Faculty of Exact Sciences, School of Physics, Tel-Aviv University, Tel-Aviv, Israel 44

M. Kuze, J. Maeda

Department of Physics, Tokyo Institute of Technology, Tokyo, Japan 31

Department of Physics, University of Tokyo, Tokyo, Japan 31

Tokyo Metropolitan University, Department of Physics, Tokyo, Japan 31

Università di Torino and INFN, Torino, Italy 4

M. Arneodo, M. Ruspa

Università del Piemonte Orientale, Novara, and INFN, Torino, Italy 4

Department of Physics, University of Toronto, Toronto, Ontario, Canada M5S 1A7 35

J.H. Loizides, M.R. Sutton 48 , M. Wing

Physics and Astronomy Department, University College London, London, United Kingdom 10

J. Malka 50 , R.J. Nowak, J.M. Pawlak, T. Tymieniecka, A. Ukleja,

A.F. Zarnecki

Warsaw University, Institute of Experimental Physics, Warsaw, Poland


M. Adamus, P. Plucinski 51

Institute for Nuclear Studies, Warsaw, Poland

Department of Particle Physics, Weizmann Institute, Rehovot, Israel 52

A.A. Savin, W.H. Smith, H. Wolfe

Department of Physics, University of Wisconsin, Madison, WI 53706, USA 3

J. Standage, J. Whyte

Department of Physics, York University, Ontario, Canada M3J 1P3 35

Received 29 May 2007; received in revised form 7 June 2007; accepted 7 June 2007

Available online 3 July 2007

Abstract

The first observation of (anti)deuterons in deep inelastic scattering at HERA has been made with the ZEUS detector at a centre-of-mass energy of 300-318 GeV using an integrated luminosity of 120 pb$^{-1}$. The measurement was performed in the central rapidity region for transverse momentum per unit of mass in the range $0.3 < p_T/M < 0.7$. The particle rates have been extracted and interpreted in terms of the coalescence model. The (anti)deuteron production yield is smaller than the (anti)proton yield by approximately three orders of magnitude, consistent with the world measurements.

© 2007 Elsevier B.V. All rights reserved.

* Corresponding author.

1 Supported by DESY, Germany.

2 Also affiliated with University College London, UK.

3 Supported by the US Department of Energy.

4 Supported by the Italian National Institute for Nuclear Physics (INFN).

5 Now with TV Nord, Germany.

6 Now at Humboldt University, Berlin, Germany.

7 Retired.

8 Self-employed.

9 Supported by the German Federal Ministry for Education and Research (BMBF), under contract Nos. HZ1GUA 2,

HZ1GUB 0, HZ1PDA 5, HZ1VFA 5.

10 Supported by the Particle Physics and Astronomy Research Council, UK.

11 Supported by Chonnam National University in 2005.


13 Supported by the Korean Ministry of Education and Korea Science and Engineering Foundation.

14 Supported by the Malaysian Ministry of Science, Technology and Innovation/Akademi Sains Malaysia grant SAGA

66-02-03-0048.

15 Supported by the US National Science Foundation. Any opinion, findings and conclusions or recommendations

expressed in this material are those of the authors and do not necessarily reflect the views of the National Science

Foundation.

16 Supported by the Polish State Committee for Scientific Research, grant Nos. 620/E-77/SPB/DESY/P-03/DZ

117/2003-2005 and 1P03B07427/2004-2006.

17 Supported by the Polish Ministry of Science and Higher Education as a scientific project (2006-2008).

18 Supported by the research grant No. 1 P03B 04529 (2005-2008).

19 This work was supported in part by the Marie Curie Actions Transfer of Knowledge project COCOS (contract

MTKD-CT-2004-517186).

20 Now at University Libre de Bruxelles, Belgium.

21 Now at DESY group FEB, Hamburg, Germany.

22 Now at Stanford Linear Accelerator Center, Stanford, USA.

23 Now at University of Liverpool, UK.

24 Also at Institute of Theoretical and Experimental Physics, Moscow, Russia.

25 Also at INP, Cracow, Poland.

26 On leave of absence from FPACS, AGH-UST, Cracow, Poland.

27 Partly supported by Moscow State University, Russia.

28 Also affiliated with DESY.

29 Now at CERN, Geneva, Switzerland.

30 Also at University of Tokyo, Japan.

31 Supported by the Japanese Ministry of Education, Culture, Sports, Science and Technology (MEXT) and its grants

for Scientific Research.

32 Supported by FNRS and its associated funds (IISN and FRIA) and by an Inter-University Attraction Poles Programme

subsidised by the Belgian Federal Science Policy Office.

33 Ramón y Cajal Fellow.

34 Supported by the Spanish Ministry of Education and Science through funds provided by CICYT.

35 Supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).

36 Partially supported by the German Federal Ministry for Education and Research (BMBF).

37 Partly supported by Russian Foundation for Basic Research grant No. 05-02-39028-NSFC-a.

38 Supported by RF Presidential grant No. 8122.2006.2 for the leading scientific schools and by the Russian Ministry

of Education and Science through its grant Research on High Energy Physics.

39 Supported by The Netherlands Foundation for Research on Matter (FOM).

40 EU Marie Curie Fellow.

41 Partially supported by Warsaw University, Poland.

42 This material was based on work supported by the National Science Foundation, while working at the Foundation.

43 Also at Max Planck Institute, Munich, Germany, Alexander von Humboldt Research Award.

44 Supported by the German-Israeli Foundation and the Israel Science Foundation.

45 Now at KEK, Tsukuba, Japan.

46 Now at Nagoya University, Japan.

47 Department of Radiological Science.

48 PPARC Advanced fellow.

49 Also at Łódź University, Poland.

50 Łódź University, Poland.

51 Supported by the Polish Ministry for Education and Science grant No. 1 P03B 14129.

52 Supported in part by the MINERVA Gesellschaft für Forschung GmbH, the Israel Science Foundation (grant No. 293/02-11.2) and the US-Israel Binational Science Foundation.

Deceased.


1. Introduction

Light stable nuclei, such as deuterons (d) and tritons (t), are loosely bound states whose production mechanism in high-energy collisions is poorly understood. Most measurements of light stable nuclei have been performed for antideuterons ($\bar{d}$). A selection of d from primary interactions is more difficult as it requires separation of such states from particles produced by interactions of colliding beams with residual gas in the beam pipe and by secondary interactions in detector material. The first observation of $\bar{d}$ [1] was followed by a number of experiments on antideuteron production. The production rate of $\bar{d}$ in $e^+e^- \to q\bar{q}$ collisions [2-5] is significantly lower than that measured in $\Upsilon(1S)$ and $\Upsilon(2S)$ decays [2,5]. The $\bar{d}$ rate in $e^+e^- \to q\bar{q}$ is also lower than that in proton-nucleus (pA) [6,7], proton-proton (pp) [8] and photon-proton ($\gamma p$) collisions at HERA [9], but higher than that in nucleus-nucleus collisions [10,11]. For heavy-ion collisions, the coalescence model [12] was proposed to explain the production of $d(\bar{d})$.

This paper presents the results of the first measurement of d and $\bar{d}$ in the central rapidity region of deep inelastic ep scattering (DIS). The analysis was performed for exchanged photon virtuality, $Q^2$, above 1 GeV$^2$.

2. Coalescence model for (anti)deuteron formation

According to the coalescence model [12] developed for heavy-ion collisions, the production rate of d is determined by the overlap between the wave-function of a proton (p) and a neutron (n) with the wave-function of a d. In this case, the d cross section is the product of single-particle cross sections for protons and neutrons, with a coefficient of proportionality reflecting the spatial size of the fragmentation region emitting the particles. The same approach applies for $\bar{d}$ production. This model was also used to describe $d(\bar{d})$ production in pp [8], $\gamma p$ [9] and $e^+e^-$ [2,4] interactions.

Assuming that all baryons are uncorrelated and the invariant differential cross section for

neutrons is equal to that for protons, the invariant differential cross section for deuteron formation

can be parameterised as

\[
\frac{E_d}{\sigma_{\rm tot}} \frac{d^3\sigma_d}{dp_d^3}
= B_2 \left( \frac{E_p}{\sigma_{\rm tot}} \frac{d^3\sigma_p}{dp_p^3} \right)^2 ,
\]

where $E_{d(p)}$ and $\sigma_{d(p)}$ are the energy and the production cross section of the d(p), respectively, $p_d$ ($p_p$) is the momentum of the d(p) and $\sigma_{\rm tot}$ is the total ep cross section for the considered kinematic range. The coalescence parameter, $B_2$, is inversely proportional to the volume of the fragmentation region emitting the particles. The same relation holds for $\bar{d}$ and $\bar{p}$.

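If $B_2$ is the same for particles and antiparticles, then the production ratio $\bar{d}/d$ is equal to $(\bar{p}/p)^2$. The coalescence parameter can be written as

\[
B_2 = \left( \frac{E_d}{\sigma_{\rm tot}} \frac{d^3\sigma_d}{dp_d^3} \right)
      \left( \frac{E_p}{\sigma_{\rm tot}} \frac{d^3\sigma_p}{dp_p^3} \right)^{-2}
    = M_p^4 M_d^{-2} R^2(d/p)
      \left( \frac{\gamma_d}{\sigma_{\rm tot}} \frac{d^3\sigma_d}{d(p_d/M_d)^3} \right)^{-1} ,
\]

where $M_{d(p)}$ is the mass of the d(p), $\gamma_d = E_d/M_d$, and $R(d/p)$ is the ratio of the number of d to p expressed as a function of $p_T/M_{d(p)}$, with $p_T$ being the transverse momentum [9].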
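The step from the first to the second form of $B_2$ is a change of variables from momentum to momentum per unit mass. The sketch below assumes that $R(d/p)$ compares d and p at the same value of $p/M$, so that both species have the same Lorentz factor; the shorthand $u$ and $J$ is introduced here only for illustration and is not notation from the paper.

```latex
\begin{align*}
u &\equiv p/M, \qquad E = \gamma M, \qquad d^3p = M^3\, d^3u, \\[4pt]
\frac{E}{\sigma_{\rm tot}} \frac{d^3\sigma}{dp^3}
  &= M^{-2}\, \frac{\gamma}{\sigma_{\rm tot}} \frac{d^3\sigma}{du^3}
   \equiv M^{-2} J, \\[4pt]
B_2 &= \frac{M_d^{-2} J_d}{\left( M_p^{-2} J_p \right)^2}
     = M_p^4 M_d^{-2} \left( \frac{J_d}{J_p} \right)^2 J_d^{-1}
     = M_p^4 M_d^{-2} R^2(d/p)\, J_d^{-1},
\qquad R(d/p) = \frac{J_d}{J_p}.
\end{align*}
```

With $J_d = (\gamma_d/\sigma_{\rm tot})\, d^3\sigma_d/d(p_d/M_d)^3$ this reproduces the right-hand side quoted above.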


3. Experimental set-up

A detailed description of the ZEUS detector can be found elsewhere [13]. A brief outline of

the components that are most relevant for this analysis is given below.

Charged particles are tracked in the central tracking detector (CTD) [14], which operates in a magnetic field of 1.43 T provided by a thin superconducting solenoid. The CTD consists of 72 cylindrical drift chamber layers, organised in nine superlayers covering the polar-angle$^{53}$ region $15^\circ < \theta < 164^\circ$. The transverse-momentum resolution for full-length tracks is $\sigma(p_T)/p_T = 0.0058\,p_T \oplus 0.0065 \oplus 0.0014/p_T$, with $p_T$ in GeV. To estimate the ionisation energy loss per unit length, dE/dx, of particles in the CTD [15], the truncated mean of the anode-wire pulse heights was calculated, which removes the lowest 10% and at least the highest 30% depending on the number of saturated hits. The measured dE/dx values were corrected by normalising to the average dE/dx for tracks around the region of minimum ionisation for pions with momentum p satisfying $0.3 < p < 0.4$ GeV. Henceforth, dE/dx is quoted in units of minimum ionising particles (mips). The resolution of the dE/dx measurement for full-length tracks is about 9%.
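The truncated mean described above can be illustrated with a short sketch. It is a simplification and not the ZEUS reconstruction code: the function name is hypothetical, and the upper truncation is fixed at 30%, whereas the text says the fraction removed at the high end is at least 30% and depends on the number of saturated hits.

```python
def truncated_mean(pulse_heights, low_frac=0.10, high_frac=0.30):
    """Mean of the pulse heights after dropping the lowest and highest hits.

    low_frac and high_frac are the fractions of hits removed at each end.
    The fixed high_frac is an assumption; the paper raises it when many
    hits are saturated.
    """
    hits = sorted(pulse_heights)
    n = len(hits)
    # Small epsilon guards against float rounding just below an integer.
    n_low = int(n * low_frac + 1e-9)
    n_high = int(n * high_frac + 1e-9)
    kept = hits[n_low:n - n_high]
    if not kept:
        raise ValueError("no hits left after truncation")
    return sum(kept) / len(kept)


if __name__ == "__main__":
    # Ten hits: the lowest one and the highest three are dropped.
    print(truncated_mean([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Dropping the high tail suppresses the large Landau fluctuations of individual hits, which is why a truncated mean is preferred over a plain average for dE/dx.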

The high-resolution uranium-scintillator calorimeter (CAL) [16] consists of three parts: the forward (FCAL), the barrel (BCAL) and the rear (RCAL) calorimeters. Each part is subdivided transversely into towers and longitudinally into one electromagnetic section (EMC) and either one (in RCAL) or two (in BCAL and FCAL) hadronic sections (HAC). The smallest subdivision of the calorimeter is called a cell. The CAL energy resolutions, as measured under test-beam conditions, are $\sigma(E)/E = 0.18/\sqrt{E}$ for electrons and $\sigma(E)/E = 0.35/\sqrt{E}$ for hadrons, with E in GeV. A presampler [17] mounted in front of the calorimeter and a scintillator-strip detector (SRTD) [18] were used to correct the energy of the scattered electron.$^{54}$ The position of electrons scattered close to the electron beam direction is determined by the SRTD detector.
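The resolution formulas quoted in this section for the CTD and the CAL can be evaluated numerically. The sketch below assumes that the three terms of the tracking resolution combine in quadrature and that the calorimeter terms scale as $1/\sqrt{E}$; the function names are hypothetical.

```python
import math


def ctd_pt_resolution(pt):
    """Relative resolution sigma(pT)/pT for full-length CTD tracks, pT in GeV.

    Assumes the three quoted terms are added in quadrature.
    """
    return math.sqrt((0.0058 * pt) ** 2 + 0.0065 ** 2 + (0.0014 / pt) ** 2)


def cal_energy_resolution(energy, particle="electron"):
    """Relative resolution sigma(E)/E under test-beam conditions, E in GeV."""
    stochastic = {"electron": 0.18, "hadron": 0.35}[particle]
    return stochastic / math.sqrt(energy)


if __name__ == "__main__":
    for pt in (0.15, 0.5, 1.0, 5.0):
        print(f"pT = {pt:4.2f} GeV: sigma(pT)/pT = {ctd_pt_resolution(pt):.4f}")
    print(f"10 GeV electron: sigma(E)/E = {cal_energy_resolution(10.0):.3f}")
```

At the low transverse momenta relevant for this analysis the constant and $1/p_T$ terms dominate the tracking resolution.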

The inactive material between the interaction region and the CTD, relevant for this analysis, consists of the central beam pipe made of aluminum with 1.5 mm wall thickness and an inner diameter of 135 mm. The CTD inner wall with a diameter of 324 mm consists of two aluminum skins, each 0.7 mm thick, separated by an 8.6 mm gap filled with polyurethane foam with a nominal density of 0.05 g/cm$^3$.

The luminosity was measured using the bremsstrahlung process $ep \to e\gamma p$ with the luminosity monitor [19], a lead-scintillator calorimeter placed in the HERA tunnel at $Z = -107$ m.

4. Monte Carlo simulation

To study the detector response, the ARIADNE 4.12 Monte Carlo (MC) model [20] for the description of inclusive DIS events was used. The ARIADNE program uses the Lund string model [21] for hadronisation, as implemented in PYTHIA 6.2 [22-24]. In its original version, this MC does not include a mechanism for the production of d or other light stable nuclei. To determine reconstruction efficiencies, a second ARIADNE sample was generated in which d's were included at the generator level by combining p and n with similar momenta.

53 The ZEUS coordinate system is a right-handed Cartesian system, with the Z axis pointing in the proton beam direction, referred to as the forward direction, and the X axis pointing left towards the centre of HERA. The coordinate

origin is at the nominal interaction point.

54 Henceforth the term electron is used to refer both to electrons and positrons.


The ARIADNE events were passed through a full simulation of the detector using the GEANT 3.13 [25] program. The GEANT simulation uses the GHEISHA model [26] to simulate hadronic interactions in the material. The GEANT program cannot be used for $\bar{d}$ as this particle is not included in the particle table.

5. Event sample

5.1. DIS event selection

The data sample corresponds to an integrated luminosity of 120.3 pb$^{-1}$ taken between 1996 and 2000 with the ZEUS detector at HERA. This sample consists of 38.6 pb$^{-1}$ of $e^+p$ data taken at a centre-of-mass energy of 300 GeV, 65.0 pb$^{-1}$ taken at 318 GeV and 16.7 pb$^{-1}$ of $e^-p$ data taken at 318 GeV.

The search was performed using DIS events with exchanged-photon virtuality $Q^2 > 1$ GeV$^2$. The event selection was similar to that used in a previous ZEUS publication [27]. A three-level trigger [13] was used to select events online. At the third-level trigger, an electron with an energy greater than 4 GeV was required. Data below $Q^2 \approx 20$ GeV$^2$ were prescaled to reduce trigger rates.

The Bjorken scaling variable, $x_{\rm Bj}$, and $Q^2$ were reconstructed using the electron method (denoted by the subscript e), which uses measurements of the energy and angle of the scattered electron. The scattered-electron candidate was identified from the pattern of energy deposits in the CAL [28]. In addition, the inelasticity was reconstructed using the Jacquet-Blondel method [29], $y_{\rm JB}$, or the electron method, $y_e$.

For the final DIS sample, the following requirements were imposed:

• $Q^2_e > 1$ GeV$^2$;

• the impact point of the scattered electron on the RCAL outside the (X, Y) region ($\pm 12$, $\pm 6$) cm centred on the beamline;

• $E_e > 8.5$ GeV, where $E_e$ is the energy of the scattered electron measured in the CAL and corrected for energy losses;

• $35 < \delta < 65$ GeV, where $\delta = \sum_i E_i (1 - \cos\theta_i)$, $E_i$ is the energy of the ith calorimeter cell, $\theta_i$ is its polar angle and the sum runs over all cells;

• $y_e < 0.95$ and $y_{\rm JB} > 0.01$;

• at least three tracks fitted to the primary vertex to ensure a good reconstruction of the primary vertex and to reduce contributions from non-ep events;

• $|Z_{\rm vtx}| < 40$ cm and $\sqrt{X_{\rm vtx}^2 + Y_{\rm vtx}^2} < 1$ cm, where $Z_{\rm vtx}$, $X_{\rm vtx}$ and $Y_{\rm vtx}$ are the coordinates of the vertex position determined from the tracks.
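The selection listed above amounts to a conjunction of simple cuts. The following sketch expresses it as a single predicate; the `DisEvent` container and its field names are hypothetical and do not correspond to any ZEUS software, and the RCAL box cut is read as a veto on $|X| < 12$ cm together with $|Y| < 6$ cm.

```python
import math
from dataclasses import dataclass, replace


@dataclass
class DisEvent:
    q2_e: float           # Q^2 from the electron method, GeV^2
    electron_x: float     # electron impact point on the RCAL, cm
    electron_y: float
    e_electron: float     # corrected scattered-electron energy, GeV
    delta: float          # sum over cells of E_i * (1 - cos(theta_i)), GeV
    y_e: float            # inelasticity, electron method
    y_jb: float           # inelasticity, Jacquet-Blondel method
    n_vertex_tracks: int  # tracks fitted to the primary vertex
    x_vtx: float          # primary-vertex coordinates, cm
    y_vtx: float
    z_vtx: float


def passes_dis_selection(ev):
    """Return True if the event satisfies all DIS requirements."""
    outside_box = abs(ev.electron_x) > 12.0 or abs(ev.electron_y) > 6.0
    return (
        ev.q2_e > 1.0
        and outside_box
        and ev.e_electron > 8.5
        and 35.0 < ev.delta < 65.0
        and ev.y_e < 0.95
        and ev.y_jb > 0.01
        and ev.n_vertex_tracks >= 3
        and abs(ev.z_vtx) < 40.0
        and math.hypot(ev.x_vtx, ev.y_vtx) < 1.0
    )


if __name__ == "__main__":
    good = DisEvent(15.0, 20.0, 0.0, 12.0, 50.0, 0.3, 0.25, 5, 0.1, -0.2, 10.0)
    print(passes_dis_selection(good))
    print(passes_dis_selection(replace(good, delta=30.0)))
```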

5.2. Track selection and the dE/dx measurement

The present analysis is based on charged tracks measured in the CTD. The tracks were required to have:

• at least 40 CTD hits, with at least 8 of them for the dE/dx measurement;

• the transverse momentum $p_T \geq 0.15$ GeV.


These cuts selected a region where the CTD track acceptance, as well as the resolutions in momentum and the dE/dx, were high.

To identify particles originating from ep collisions, the following additional variables were reconstructed for each track:

• the distance, $\Delta Z$, of the Z-component of the track helix to $Z_{\rm vtx}$;

• the distance of closest approach (DCA) of the track to the beam-spot location in the transverse plane. The beam-spot position is determined from the average primary-vertex distributions in X and Y for each data-taking period. The DCA is assigned a positive (negative) value if the beam spot lies left (right) of the particle path.

Fig. 1 shows the dE/dx distribution as a function of the track momentum for positive and negative tracks. The events were selected by requiring at least one track with dE/dx > 2.5 mips. To reduce the fraction of tracks coming from non-ep collisions, the tracks were required to have $|\Delta Z| < 1$ cm and |DCA| < 0.5 cm. After such a selection, clear bands corresponding to charged kaons, protons and deuterons were observed. The requirement dE/dx > 2.5 mips enhances the fraction of events with at least one particle with a mass larger than the pion mass and leads to the discontinuity near dE/dx = 2.5 mips seen in Fig. 1. The lines show the most probable energy loss calculated from the Bethe-Bloch formula [30]. The dE/dx bands for $K^-$ and $\bar{p}$ are slightly shifted with respect to the Bethe-Bloch expectations due to the geometrical structure of the CTD drift cells, which leads to a different response to negative and positive tracks.

Fig. 2 shows the reconstructed masses, M, for different particle species. The masses were calculated from the measured track momentum and energy loss using the Bethe-Bloch formula. The mass distributions were fitted with asymmetric$^{55}$ Gaussian functions. The relative width obtained was 11% (7%) for the left (right) part of the function.
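An asymmetric Gaussian of the kind used for the mass fits can be written in a few lines. This is a minimal sketch with hypothetical parameter names; the usage example takes the proton mass and the relative widths of 11% and 7% quoted above purely as illustrative inputs, not as the fitted values of the analysis.

```python
import math


def asymmetric_gaussian(m, amplitude, mean, sigma_left, sigma_right):
    """Gaussian with different widths below and above the peak position."""
    sigma = sigma_left if m < mean else sigma_right
    return amplitude * math.exp(-0.5 * ((m - mean) / sigma) ** 2)


if __name__ == "__main__":
    peak = 0.938  # GeV, proton mass used as an illustrative peak position
    left, right = 0.11 * peak, 0.07 * peak
    for m in (0.7, 0.85, peak, 1.0, 1.2):
        print(f"M = {m:.3f} GeV: {asymmetric_gaussian(m, 1.0, peak, left, right):.4f}")
```

The function is continuous at the peak because both halves take the value `amplitude` there.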

The number of $p(\bar{p})$ candidates in the mass region $0.7(0.6) < M < 1.5$ GeV was $1.61 \times 10^5$ ($1.66 \times 10^5$). Due to a shift in the dE/dx for negative tracks, the lower mass cut for $\bar{p}$ was at 0.6 GeV. The numbers of d and $\bar{d}$ in the mass window $1.5 < M < 2.5$ GeV were 309 and 62, respectively. The number of p migrating to the d mass region was estimated to be less than 1% of the total number of d candidates. A similar estimate was obtained for antiparticles. A small number of triton candidates was observed in the mass window $2.5 < M < 3.5$ GeV. However, due to low statistics, it was difficult to establish a peak inside this mass window; therefore, no conclusive statement on the origin of the tracks in the region $2.5 < M < 3.5$ GeV was possible.

The observed $p(\bar{p})$ and $d(\bar{d})$ candidates were required to be in the central rapidity region, $|y| < 0.4$, and to have $0.3 < p_T/M < 0.7$. This determines the kinematic range used for the cross-section calculations.
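The kinematic range defined above can be checked per track once a mass hypothesis is assigned. The sketch below assumes the standard definition of rapidity, $y = \frac{1}{2}\ln[(E + p_Z)/(E - p_Z)]$, evaluated in the frame in which the momentum components are given; the function names and the deuteron mass value used in the example are illustrative.

```python
import math


def rapidity(px, py, pz, mass):
    """Rapidity y = 0.5 * ln((E + pz) / (E - pz)); momenta and mass in GeV."""
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return 0.5 * math.log((energy + pz) / (energy - pz))


def in_kinematic_range(px, py, pz, mass):
    """Apply |y| < 0.4 and 0.3 < pT/M < 0.7 for the given mass hypothesis."""
    pt_over_m = math.hypot(px, py) / mass
    return abs(rapidity(px, py, pz, mass)) < 0.4 and 0.3 < pt_over_m < 0.7


if __name__ == "__main__":
    m_d = 1.8756  # GeV, deuteron mass
    print(in_kinematic_range(0.9, 0.0, 0.0, m_d))  # central track
    print(in_kinematic_range(0.9, 0.0, 3.0, m_d))  # forward track
```

Using $p_T/M$ rather than $p_T$ lets protons and deuterons be compared at the same velocity, which is what the coalescence relation requires.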

5.3. Identification of particles produced in ep collisions

The observed $p(\bar{p})$ and $d(\bar{d})$ candidates selected after the dE/dx mass cuts can originate from secondary interactions in the inactive material between the interaction point and the central tracking detector.

In order to select $p(\bar{p})$ and $d(\bar{d})$ originating from ep collisions, both DCA and $\Delta Z$ cuts were removed and a statistical background subtraction based on the DCA distribution was performed.

55 An asymmetric Gaussian has different widths for the left and right parts of the function.


Fig. 1. The dE/dx distributions as a function of the track momentum for (a) positive and (b) negative tracks. The DIS events were accepted by requiring at least one track with dE/dx > 2.5 mips (denoted by the dashed lines), $|\Delta Z| < 1$ cm and |DCA| < 0.5 cm. The lines show the most-probable energy loss calculated using the Bethe-Bloch formula for different particle species.

The $\Delta Z$ distributions for $p(\bar{p})$ and $d(\bar{d})$ candidates are shown in Fig. 3; peaks at $\Delta Z = 0$ are observed. To optimise the signal-over-background ratio for the DCA distribution, all candidates were selected using the $|\Delta Z| < 2(1)$ cm restriction for $p, \bar{p}$ ($d, \bar{d}$).

Fig. 2. The mass spectra for (a) positive and (b) negative particles. Tracks are selected as for Fig. 1. The mass distribution was calculated from the track momenta and the dE/dx. The arrows indicate the cuts applied for the selection of candidates.

Fig. 4 shows the DCA distributions for $p(\bar{p})$ and $d(\bar{d})$ candidates. The distributions show peaks at zero due to tracks originating from the primary vertex. The number of particles originating from primary ep collisions was determined using the side-band background subtraction. A linear fit to the DCA distribution on either side of the peak region in the range $2 < |{\rm DCA}| < 4$ cm was performed. Then, the expected number of background events in the signal region of |DCA| < 1.5(0.5) cm for $p, \bar{p}$ ($d, \bar{d}$) candidates was subtracted.
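The side-band subtraction can be sketched on a binned DCA distribution. The example below is a simplified illustration and not the analysis code: it assumes an unweighted least-squares straight line fitted to all bins with $2 < |{\rm DCA}| < 4$ cm and evaluates that line at the bin centres inside the signal region; the function name and the toy histogram are hypothetical.

```python
def sideband_subtract(bin_centres, counts, signal_cut, side_lo=2.0, side_hi=4.0):
    """Signal yield in |DCA| < signal_cut after linear side-band subtraction.

    bin_centres and counts describe a DCA histogram (centres in cm).
    """
    side = [(x, n) for x, n in zip(bin_centres, counts)
            if side_lo < abs(x) < side_hi]
    if len(side) < 2:
        raise ValueError("need at least two side-band bins for a linear fit")

    # Unweighted least-squares fit of counts = intercept + slope * x.
    k = len(side)
    mean_x = sum(x for x, _ in side) / k
    mean_n = sum(n for _, n in side) / k
    sxx = sum((x - mean_x) ** 2 for x, _ in side)
    sxy = sum((x - mean_x) * (n - mean_n) for x, n in side)
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_n - slope * mean_x

    signal_bins = [(x, n) for x, n in zip(bin_centres, counts)
                   if abs(x) < signal_cut]
    observed = sum(n for _, n in signal_bins)
    background = sum(intercept + slope * x for x, _ in signal_bins)
    return observed - background


if __name__ == "__main__":
    # Toy histogram: flat background of 10 per bin plus a peak near zero.
    centres = [-3.875 + 0.25 * i for i in range(32)]
    counts = [10 + (100 if abs(x) < 0.5 else 0) for x in centres]
    print(sideband_subtract(centres, counts, signal_cut=0.5))
```

In the analysis the signal region is |DCA| < 1.5 cm for $p, \bar{p}$ and |DCA| < 0.5 cm for $d, \bar{d}$, which corresponds to the `signal_cut` argument here.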

The number of $p(\bar{p})$ obtained after the DCA side-band background subtraction was $1.52 \times 10^5$ ($1.62 \times 10^5$). The numbers of d and $\bar{d}$ particles were $177 \pm 17$ and $53 \pm 7$, respectively. The difference in the observed numbers of p and $\bar{p}$ can be explained by different dE/dx efficiencies and the mass cuts for positive and negative tracks. Such a difference in the efficiencies for particles and antiparticles cannot explain the difference in the observed numbers of d and $\bar{d}$.

Fig. 5 shows the distributions for several DIS kinematic variables: $Q^2_e$, $x_e$, $E_e$ and $\delta$. In addition, rapidity (y) distributions for the selected candidates are shown. The numbers of $p(\bar{p})$ and $d(\bar{d})$ candidates were calculated in each bin from the DCA distributions after the side-band background subtraction. The distributions for $\bar{d}$ are consistent with those for p and $\bar{p}$, while the d sample shows some deviations for the $E_e$ variable and, consequently, for the $\delta$ variable.

6. Studies of background processes

The following two background sources for heavy stable charged particles were considered:

• interactions of the proton (or electron) beam with residual gas in the beam pipe, termed beam-gas interactions;

• secondary interactions of particles in inactive material between the interaction point and the central tracking detector.

Fig. 3. The distributions of $\Delta Z$, the distance of the Z-component of the track helix to $Z_{\rm vtx}$ for: (a)-(b) particles and (c)-(d) antiparticles, as indicated in the figure. The $p$, $\bar{p}$, $d$ and $\bar{d}$ candidates were identified using the dE/dx mass cuts (see text). The arrows indicate the applied cuts.

6.1. Beam-gas interactions

The contribution from proton-gas interactions is significantly reduced after the ZEUS three-level trigger, which requires a scattered electron in the CAL. In addition, the requirement to accept

only events with more than three tracks fitted to the primary vertex significantly diminishes the

contribution from both electron-gas and proton-gas events. The remaining fraction of beam-gas

interactions can be assessed by studying the Zvtx distribution.

Fig. 6 shows the Zvtx distributions for events with at least one p(p̄) or d(d̄) candidate. The distributions were reconstructed in the signal region |ΔZ| < 2(1) cm and |DCA| < 1.5(0.5) cm for p, p̄ (d, d̄) candidates without the background subtraction. Fig. 6 shows that there is essentially no beam-gas background for d̄ events. A small background for d at positive Zvtx is expected from the DIS MC generated for inclusive DIS events in which deuterons are produced by secondary interactions in the material in front of the CTD. This background is expected to have a flat DCA and, therefore, is subtracted by the procedure described in Section 5.3.

The Zvtx distributions were fitted using a Gaussian function with a first-order polynomial for the background description. The extracted Gaussian widths are fully consistent with those obtained for inclusive DIS events without the d(d̄) preselection.


Fig. 4. The distributions of the distance of closest approach, DCA, for: (a)–(b) particles and (c)–(d) antiparticles. The DCA are shown after the cut |ΔZ| < 2(1) cm as discussed in the text. The arrows indicate the signal region for the side-band background subtraction. The dashed lines show the fitted background level.

To further study the Zvtx distribution, a special event selection was performed for non-colliding electron and proton bunches. Since the requirement to detect an electron with energy E_e′ ≥ 8.5 GeV significantly reduces the rate of such background events, this requirement was not applied. All other tracking cuts were the same as in the d(d̄) selection. The requirement to accept events with at least three tracks fitted to the primary vertex rejects most of the beam-gas events (about 95% of the total number of the triggered events). As expected, the remaining events show clear peaks at zero for the ΔZ and DCA distributions, but the reconstructed Zvtx distribution did not show a peak at zero.

The enhancement at large Zvtx for d, which was found to be consistent with that originating

from secondary interactions, could partially be due to electron-gas interactions. If one assumes

that the background seen in Fig. 6(b) is due to non-ep interactions, then the contribution from

beam-gas interactions does not exceed 17% of the total number of events with a deuteron.

6.2. Secondary interactions on inactive material

A pure sample of DIS events will still contain deuterons produced by secondary interactions

of particles in material. The aim of the side-band background subtraction discussed in Section 5.3

was to remove such a background contribution, assuming that the background processes do not

create a residual peak at ΔZ = 0 and DCA = 0. Several checks of this assumption are discussed

below.


Fig. 5. The distributions of the number of events with at least one d(d̄) and p(p̄) candidate normalised to unity as a function of: (a)–(d) DIS kinematic variables and (e) rapidity y. The points for d and d̄ are slightly shifted horizontally for clarity.

The DCA and ΔZ distributions were investigated using an MC simulation of inclusive DIS events without d(d̄) production at the generator level. Deuterons from secondary interactions were selected as for the data. The reconstructed DCA and ΔZ for d did not show a peak at zero. A more detailed study of the DCA and ΔZ distributions was possible for p not originating from an ep collision at the MC generator level, since in this case the available MC statistics is significantly higher than for the d case. After the track-quality cuts, no peak at zero was observed in the DCA and ΔZ distributions.

If a deuteron is produced by secondary interactions of the particles from the DIS event in the surrounding matter, the secondary d will not point precisely back to the interaction point, and both DCA and ΔZ distributions will be wider than in the case of d̄ and p̄. Therefore, the DCA and ΔZ distributions were fitted with double-Gaussian distributions to establish the width of the distributions. It was found that the observed deuteron DCA and ΔZ widths were consistent with the corresponding widths for p and p̄.

One possible source for d is the reaction N + N → d + π, where one of the nucleons N
originates from an ep collision, while the other one originates from the detector material in front
of the CTD. For low initial nucleon momenta, the DCA of the d track is in general large and
it does not form an important background; at high initial nucleon momenta, however, the DCA
can become small enough that misidentification could become important.56 Since the processes

56 Note that the cross section for the reaction N + N → d + π decreases rapidly with increasing energy.


Fig. 6. The Zvtx distributions for: (a)–(b) particles and (c)–(d) antiparticles, as indicated in the figure. The solid lines

show the fit using a Gaussian distribution with a first-order polynomial function for the background description. The

dashed line shows the fitted background. The arrows indicate the cuts applied for the final selection.

can be studied by comparing the average charged multiplicity of tracks for d and d events. In

addition, the distance of closest approach, DCA12, between the d track and other non-primary

tracks in the same event should have an enhancement at zero. The study indicated that the average

number of tracks for d events is smaller than that for d events. The rejection of events with

|DCA12| < 2 cm did not lead to a statistically significant reduction in the number of the observed

d events.

Secondary deuterons may also be produced in pickup (p + n → d) reactions by primary p(n) interacting in the surrounding material. These deuterons, peaking in the direction of the primary p(n), point approximately to the interaction point and are therefore a potentially dangerous source of background. Experimental data on the pickup reactions at the relevant energy are scarce and therefore only a rough estimate of the size of this background is possible. From the extrapolation of data on 154Sm [31] and C [32,33] targets using the K. Kikuchi theory [34] to allow for the change of material, the estimated d background from the pickup reaction was in the range 1–10% of the total number of observed d events, depending on the extrapolation input.

The angular distributions of d from pickup reactions have also been investigated in several

experiments [32,35,36] for various targets and for a range of p/M similar to the present analysis.

In all cases, the angular distribution of d observed in these experiments would lead to a much

wider DCA than that shown in Fig. 4(b).


7. Detector corrections

In this analysis, all measurements are based on event ratios; therefore, the detector corrections

due to DIS event selection and trigger efficiency were found to be small and thus are not discussed

here. The detector corrections for the tracking efficiency and the efficiency of the dE/dx cuts

are described below.

7.1. Tracking efficiency

The efficiency due to the track reconstruction, ε, was estimated separately for p(p̄) and d using the ARIADNE MC model (with d included at the generator level). The obtained efficiencies are about 0.95 for p and d and 0.90 for p̄.

The method cannot be applied to d̄, which are not treated in the GEANT simulation. Therefore, the tracking efficiency for d̄ was modelled as ε(d̄) = ε(d) ε(p̄)/ε(p). The hit reconstruction efficiency is described by the first term, ε(d), while the absorption losses (including annihilation) of d̄ and p̄ are assumed to be similar. This modelling assumes that the cross sections of annihilation in the detector material are the same for d̄ and p̄, since the inelastic nuclear cross section of p̄ is much larger than that of n̄ for the momentum region less than 0.4 GeV [37]. The use of the geometrical model discussed in [11,37] and the model in which the p̄ and n̄ inelastic absorption cross sections are added linearly [4,37] to obtain the inelastic nuclear cross section of d̄ reduces ε(d̄) by 1% and 5%, respectively.
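As a quick numerical illustration of the efficiency model above, the following sketch evaluates ε(d̄) = ε(d) ε(p̄)/ε(p) with the approximate efficiencies quoted in the text. The function name is a hypothetical choice made for this example.

```python
def antideuteron_tracking_efficiency(eff_d, eff_p, eff_pbar):
    """Model eps(dbar) as the deuteron hit-reconstruction efficiency
    scaled by the antiproton-to-proton efficiency ratio, which stands
    in for the absorption (including annihilation) losses."""
    return eff_d * eff_pbar / eff_p


# Approximate values quoted in the text: 0.95 for p and d, 0.90 for pbar.
eff_dbar = antideuteron_tracking_efficiency(eff_d=0.95, eff_p=0.95, eff_pbar=0.90)
print(f"eps(dbar) = {eff_dbar:.2f}")
```

Because ε(d) and ε(p) are quoted as equal, the modelled ε(d̄) coincides with ε(p̄) for these inputs.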

7.2. Efficiency of the dE/dx cuts

Another important contribution to the efficiency comes from the dE/dx threshold cuts and the mass cuts. The inefficiency due to the dE/dx requirements was estimated separately for positive and negative tracks using Λ → pπ⁻ (+ c.c.) decays. In this approach, protons were identified from the Λ peak and then the proton dE/dx selection efficiency was reconstructed as the ratio of the events without and with the dE/dx requirement. These efficiencies were determined as a function of p/M. The efficiency for each pT/M bin was corrected by reweighting the p/M distributions using ARIADNE. The average efficiency of the dE/dx cuts for d(d̄) is 0.7 for pT/M < 0.5. For larger momenta, the efficiency decreases due to the dE/dx > 2.5 mips cut.

The signal extraction is not possible for pT /M > 0.7 due to a very small efficiency. For the

low-momentum region pT /M < 0.5, the efficiencies for negative tracks tend to be larger than

is higher by 15% than that for d(d).

Alternatively, the overall tracking and dE/dx efficiency was calculated using the ARIADNE MC model; consistent results with the approach discussed above were found.

8. Systematic uncertainties

The systematic uncertainties were evaluated by changing the selection and the analysis procedure. Only the largest contribution of each cut variation for the final invariant cross section is

given below. The following sources of systematic uncertainties were studied:

• efficiency of the track reconstruction and selection. The systematic uncertainty on the tracking efficiency for p, p̄, d was 2%. This systematic uncertainty was found after variations of the track-quality cuts. For d̄, the systematic uncertainty, 5%, includes both the effect of track-quality-cut variations and the reduction in ε(d̄) when an alternative model of the d̄ absorption was used (see Section 7.1);

• efficiency due to the dE/dx selection. This systematic uncertainty was estimated by varying the cut dE/dx > 2.5 mips within the dE/dx resolution and by using the MC simulation. This systematic uncertainty was 5%. For the lowest pT/M bin, the uncertainty was 10%;

• variations in the particle yields associated with the signal extraction:
  – the number of d(d̄) was reconstructed using a Gaussian fit to the DCA distribution with a first-order polynomial for the background description;
  – the region used to determine the background for the side-band background subtraction was reduced to 1.5 < |DCA| < 3.5 cm;
  – the DCA cut for the side-band background subtraction was varied within its resolution of 0.1 cm;
  – for the side-band background subtraction, the background shape was taken from the MC (without d at the generator level);
  – the cut on ΔZ was varied by 0.2 cm.
  These variations lowered the production yields by 5.0% for p, 2.2% for p̄, 26.0% for d and 6.1% for d̄. The largest effect originates from the conservative treatment of the shape of the DCA background. The upper systematic error was below 1% for p, p̄ and d, and 11% for d̄;

• the background contribution under the Zvtx peak for d events was assumed to be due to beam-gas interactions and, therefore, it was subtracted (4% contribution for p, p̄, d̄ and 17% contribution for d);

• the correction for Λ decays applied for the p(p̄) sample was changed by 10% (see Section 9.1). The size of this uncertainty, which is similar to that in other publications [4,9], was determined by the uncertainty on the strangeness suppression factor in the ARIADNE model;

• variations of the DIS-selection cuts. The cut on the energy of the scattered electron was increased to 10 GeV, and the lower cut on the δ distribution was tightened to 40 GeV. The cut on Zvtx was varied by 5 cm. The cut on the number of primary tracks was increased from three to four. These variations led to changes of +3.3/−4.1% for p, +3.6/−4.4% for p̄, +3.7/−8.5% for d and +5.7/−13.3% for d̄. Variations of the cuts on the ye and yJB distributions showed a negligible effect.

The overall systematic uncertainty was determined by adding the above uncertainties in

quadrature. The largest experimental uncertainty was due to the uncertainties on the tracking

efficiency and the signal extraction.
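Adding asymmetric uncertainties in quadrature can be illustrated with the short sketch below. The input numbers are the d̄ percentages as read from the list above. Treating each source as independent, taking the 5% dE/dx value rather than the 10% of the lowest bin, and counting the beam-gas subtraction as a purely downward shift are assumptions of this example. The result is not meant to reproduce the published totals.

```python
import math


def combine_in_quadrature(sources):
    """Combine (up, down) relative uncertainties, given as positive
    percentages, by summing the squares of each side separately."""
    up = math.sqrt(sum(u * u for u, _ in sources))
    down = math.sqrt(sum(d * d for _, d in sources))
    return up, down


# Illustrative inputs for the antideuteron sample (percent).
dbar_sources = [
    (5.0, 5.0),    # tracking efficiency
    (5.0, 5.0),    # dE/dx selection
    (11.0, 6.1),   # signal extraction
    (0.0, 4.0),    # beam-gas subtraction, one-sided by assumption
    (5.7, 13.3),   # DIS-selection cuts
]
up, down = combine_in_quadrature(dbar_sources)
print(f"+{up:.1f}% / -{down:.1f}%")
```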

9. Results

9.1. Production cross sections and B2

For each particle type i, the invariant differential cross section can be calculated from the rapidity range Δy and the transverse momentum pT,i of a corresponding particle through

$$\frac{\gamma_i}{\sigma_{\mathrm{tot}}}\,\frac{d^3\sigma_i}{d(p_i/M_i)^3} = \frac{1}{N_{\mathrm{DIS}}}\,\frac{1}{2\pi\,(p_{T,i}/M_i)\,\Delta y}\,\frac{N_i}{\Delta(p_{T,i}/M_i)},$$

where the subscript i denotes a p(p̄) or a d(d̄), N_i is the particle yield in each pT,i/Mi bin after the correction for the tracking efficiencies and the particle selection and N_DIS = 2.59 × 10^7 is the number of DIS events used in the analysis. For the present measurement, Δy = 0.8 and the bin sizes are Δ(pT,i/Mi) = 0.1. For comparisons with other experiments, the p(p̄) rate was corrected for the decay products of Λ. A correction factor of 0.79 was estimated from the ARIADNE simulation, which gives an adequate description of K0S and Λ production [38].

Fig. 7. The invariant differential cross sections for p(p̄) and d(d̄). The inner error bars show the statistical uncertainties, the outer ones show statistical and systematic uncertainties added in quadrature. For clarity, the points for particles and antiparticles are slightly shifted horizontally with respect to the corresponding pT/M.
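The normalisation in the cross-section formula above can be written as a small helper. This sketch assumes the formula has the form N_i / (N_DIS · 2π · (pT/M) · Δy · Δ(pT/M)); the default arguments are the values quoted in the text, while the function name and the example yield are hypothetical.

```python
import math


def invariant_cross_section(n_i, pt_over_m, n_dis=2.59e7, delta_y=0.8, bin_width=0.1):
    """Normalised invariant differential cross section for one pT/M bin.

    n_i        corrected particle yield in the bin
    pt_over_m  bin centre in pT/M
    n_dis      number of DIS events
    delta_y    rapidity range
    bin_width  bin size in pT/M
    """
    return n_i / (n_dis * 2.0 * math.pi * pt_over_m * delta_y * bin_width)


# Hypothetical corrected yield of 50 particles in the bin 0.3 < pT/M < 0.4.
value = invariant_cross_section(n_i=50.0, pt_over_m=0.35)
print(f"{value:.3e}")
```

The 1/(pT/M) factor means that, for equal yields, bins at lower pT/M receive a larger weight.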

The invariant differential cross sections as a function of pT/M for p(p̄) and d(d̄) are shown in Fig. 7 and given in Tables 1 and 2. The d(d̄) invariant cross section is smaller by approximately three orders of magnitude than that of p(p̄). These cross sections were used to extract the coalescence parameter B2 as discussed in Section 2. The parameter B2 is shown in Fig. 8 and listed in Tables 3 and 4. For d, B2 tends to be higher than for d̄, especially at low pT/M. The value of B2 for d̄ is in agreement with the measurements in photoproduction [9], but larger than that observed in e+e− annihilation at the Z resonance [4]. The measured B2 is also significantly larger than that observed in heavy-ion collisions [11].

The events containing at least one p(p̄) or d(d̄) were analysed in the Breit frame [39]. The number of events with p(p̄) in the current region of the Breit frame was about 2.5% of the total number of observed events with p(p̄). In this region, neither d nor d̄ was found. Since the current region of the Breit frame is analogous to a single hemisphere of e+e−, the observation of d(d̄) reported in this paper is not in contradiction with the low d̄ rate observed in e+e− [2–4].

9.2. Production ratios

The detector-corrected d/p and d̄/p̄ ratios as a function of pT/M are shown in Fig. 9(a) and listed in Tables 3 and 4. For the antiparticle ratio, there is a good agreement with the H1 published data for photoproduction [9], as well as with pp data [8]. A similar d̄/p̄ ratio was also observed in hadronic Υ(1S) and Υ(2S) decays [2].


Table 1
The measured invariant cross sections for the production of p and d in DIS as a function of pT/M. The statistical and systematic uncertainties are also listed

pT/M       p                           d
0.3–0.4    1.33 ± 0.01 +0.19/−0.21     3.29 ± 0.43 +0.50/−1.24
0.4–0.5    1.34 ± 0.01 +0.16/−0.18     1.37 ± 0.26 +0.17/−0.51
0.5–0.6    0.88 ± 0.01 +0.10/−0.12     1.16 ± 0.28 +0.14/−0.42
0.6–0.7    0.38 ± 0.01 +0.04/−0.05     –

Table 2
The measured invariant cross sections for the production of p̄ and d̄ in DIS as a function of pT/M. The statistical and systematic uncertainties are also listed

pT/M       p̄                           d̄
0.3–0.4    1.59 ± 0.01 +0.16/−0.19     0.77 ± 0.15 +0.09/−0.14
0.4–0.5    1.21 ± 0.01 +0.07/−0.09     0.45 ± 0.11 +0.03/−0.07
0.5–0.6    0.86 ± 0.01 +0.05/−0.07     0.60 ± 0.19 +0.05/−0.09
0.6–0.7    0.35 ± 0.01 +0.02/−0.03     –

Fig. 8. The pT/M dependence of the parameter B2 for d and d̄ produced in DIS ep collisions and in photoproduction [9].

The inner error bars show the statistical uncertainties, the outer ones show statistical and systematic uncertainties added

in quadrature. For clarity, the points for particles and antiparticles are slightly shifted horizontally with respect to the

corresponding pT /M.

The d̄/d and p̄/p ratios as a function of pT/M are shown in Fig. 9(b) and listed in Table 5. The p̄/p ratio is consistent with unity, as expected from hadronisation of quark and gluon jets.


Table 3
The measured d-to-p production ratio and the parameter B2 for d as a function of pT/M. The last row of the table shows the data in the full measured phase space. The statistical and systematic uncertainties are also listed

pT/M       R(d/p) (10⁻³)               B2(d) (10⁻² GeV²)
0.3–0.4    2.48 ± 0.33 +0.55/−1.00     4.11 ± 0.54 +1.47/−1.97
0.4–0.5    1.02 ± 0.19 +0.19/−0.40     1.68 ± 0.32 +0.50/−0.74
0.5–0.6    1.32 ± 0.32 +0.24/−0.51     3.31 ± 0.80 +0.99/−1.45
0.6–0.7    –                           –
0.3–0.7    1.88 ± 0.20 +0.40/−0.75     3.32 ± 0.34 +1.13/−1.55

Table 4
The measured d̄-to-p̄ production ratio and the parameter B2 for d̄ as a function of pT/M. The last row of the table shows the data in the full measured phase space. The statistical and systematic uncertainties are also listed

pT/M       R(d̄/p̄) (10⁻³)               B2(d̄) (10⁻² GeV²)
0.3–0.4    0.48 ± 0.09 +0.08/−0.10     0.67 ± 0.13 +0.18/−0.19
0.4–0.5    0.37 ± 0.09 +0.04/−0.06     0.67 ± 0.17 +0.12/−0.13
0.5–0.6    0.70 ± 0.22 +0.08/−0.12     1.80 ± 0.57 +0.31/−0.36
0.6–0.7    –                           –
0.3–0.7    0.49 ± 0.07 +0.07/−0.09     0.89 ± 0.14 +0.19/−0.20

The dominant uncertainty on the ratio is due to systematic effects associated with the track selection and reconstruction.

The production rate of d is higher than that of d̄, especially at low pT. Under the assumption that secondary interactions do not produce an enhancement at DCA = 0 for the d case, the result would indicate that the relation between d̄/d and (p̄/p)² expected from the coalescence model does not hold.

For collisions involving incoming baryon beams, there are several models [40,41] that predict baryon–antibaryon production asymmetry in the central rapidity region. A p–p̄ asymmetry in proton-induced reactions is predicted to be as high as 7% [41]. Given the experimental uncertainty, this measurement is not sensitive to the expected small p–p̄ asymmetry.

In heavy-ion collisions, the d̄ to d production ratio is expected to be smaller than unity [42]. A recent measurement at RHIC [11] indicated a lower production rate of d̄ compared to that of d. The average value of the ratio d̄/d = 0.47 ± 0.03 was compatible with the square of the p̄/p = 0.73 ± 0.01 ratio. Assuming the same size of the production volume for baryons and antibaryons, this RHIC result is consistent with the coalescence model. A similar conclusion was obtained earlier in fixed-target pp [8] and pA [7] experiments. For e+e− collisions, the d̄ yield is compatible with that of d within the large uncertainties [4,5].
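The comparison between d̄/d and (p̄/p)² quoted for RHIC can be checked numerically. The sketch below assumes uncorrelated Gaussian uncertainties and first-order error propagation for the square, which is a simplification chosen for this example; the function name is hypothetical.

```python
import math


def squared_ratio_pull(r_d, err_d, r_p, err_p):
    """Compare an antideuteron-to-deuteron ratio with the square of an
    antiproton-to-proton ratio.

    Returns (expected, pull), where expected = r_p**2 and pull is the
    difference r_d - expected in units of the combined uncertainty.
    """
    expected = r_p ** 2
    err_expected = 2.0 * r_p * err_p      # first-order propagation
    pull = (r_d - expected) / math.hypot(err_d, err_expected)
    return expected, pull


# RHIC values quoted in the text: dbar/d = 0.47 +- 0.03, pbar/p = 0.73 +- 0.01.
expected, pull = squared_ratio_pull(0.47, 0.03, 0.73, 0.01)
print(f"(pbar/p)^2 = {expected:.3f}, pull = {pull:.2f}")
```

With these inputs the two quantities differ by slightly less than two combined standard deviations under the stated assumptions.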

10. Summary

The first observation of d(d̄) in ep collisions in the DIS regime at HERA is presented. The production rate of d(d̄) is smaller than that for p(p̄) by three orders of magnitude, which is in broad agreement with other experiments.


Fig. 9. (a) The d/p and d̄/p̄ and (b) the d̄/d and p̄/p production ratios as a function of pT/M. The inner error bars show the statistical uncertainties, the outer ones show statistical and systematic uncertainties added in quadrature. The points in (a) are slightly shifted horizontally for clarity.

Table 5
The measured p̄-to-p and d̄-to-d production ratios as a function of pT/M. The last row of the table shows the data in the full measured phase space. The statistical and systematic uncertainties are also listed

pT/M       R(p̄/p)                      R(d̄/d)
0.3–0.4    1.19 ± 0.01 +0.20/−0.19     0.23 ± 0.05 +0.09/−0.05
0.4–0.5    0.90 ± 0.01 +0.10/−0.09     0.33 ± 0.10 +0.12/−0.07
0.5–0.6    0.97 ± 0.01 +0.11/−0.10     0.52 ± 0.21 +0.19/−0.10
0.6–0.7    0.92 ± 0.03 +0.10/−0.09     –
0.3–0.7    1.05 ± 0.01 +0.15/−0.14     0.31 ± 0.05 +0.11/−0.06

The production of d(d̄) was studied in terms of the coalescence model. The coalescence parameter is in agreement with the measurements in photoproduction at HERA. However, it is larger than that measured in e+e− annihilation at the Z resonance.

The production rate of p is consistent with that of p̄ in the kinematic range 0.3 < pT/M < 0.7. Due to significant uncertainties, it is not possible to test models that predict a small baryon–antibaryon asymmetry in the central fragmentation region.

For the same kinematic region, the production rate of d is higher than that for d̄. If the observed d are solely attributed to deuterons produced in primary ep collisions, the results would indicate that the coalescence model with the same source volume for d and d̄ cannot fully explain the production of d(d̄) in DIS.

Acknowledgements

We thank the DESY Directorate for their strong support and encouragement. The remarkable

achievements of the HERA machine group were essential for the successful completion of this

work and are greatly appreciated. We are grateful for the support of the DESY computing and

network services. The design, construction and installation of the ZEUS detector have been made

possible owing to the ingenuity and effort of many people from DESY and home institutes who

are not listed as authors. We thank Prof. D. Heinz and Prof. T. Sloan for the useful discussion of

this topic.

References

[1] T. Massam, et al., Nuovo Cimento 39 (1965) 10.

[2] ARGUS Collaboration, H. Albrecht, et al., Phys. Lett. B 157 (1985) 326;

ARGUS Collaboration, H. Albrecht, et al., Phys. Lett. B 236 (1990) 102.

[3] OPAL Collaboration, R. Akers, et al., Z. Phys. C 67 (1995) 203.

[4] ALEPH Collaboration, S. Schael, et al., Phys. Lett. B 639 (2006) 16.

[5] CLEO Collaboration, D.M. Asner, et al., Phys. Rev. D 75 (2007) 012009.

[6] IHEP-CERN Collaboration, F. Binon, et al., Phys. Lett. B 30 (1969) 510;

Yu.M. Antipov, et al., Phys. Lett. B 34 (1971) 164.

[7] J.W. Cronin, et al., Phys. Rev. D 11 (1975) 3105.

[8] B. Alper, et al., Phys. Lett. B 46 (1973) 265;

British–Scandinavian Collaboration, W.M. Gibson, et al., Nuovo Cimento Lett. 21 (1978) 189;

V.V. Abramov, et al., Sov. J. Nucl. Phys. 45 (1987) 845.

[9] H1 Collaboration, A. Aktas, et al., Eur. Phys. J. C 36 (2004) 413.

[10] M. Aoki, et al., Phys. Rev. Lett. 69 (1992) 2345;

NA52 (NEWMASS) Collaboration, G. Appelquist, et al., Phys. Lett. B 376 (1996) 245;

STAR Collaboration, C. Adler, et al., Phys. Rev. Lett. 87 (2001) 262301;

E802 Collaboration, L. Ahle, et al., Phys. Rev. C 57 (1998) 1416;

NA44 Collaboration, I.G. Bearden, et al., Nucl. Phys. A 661 (1999) 387;

NA44 Collaboration, I.G. Bearden, et al., Eur. Phys. J. C 23 (2002) 237.

[11] PHENIX Collaboration, S.S. Adler, et al., Phys. Rev. Lett. 94 (2005) 122302.

[12] S.T. Butler, C.A. Pearson, Phys. Rev. 129 (1963) 836.

[13] ZEUS Collaboration, in: U. Holm (Ed.), The ZEUS Detector, Status Report (unpublished), DESY, 1993, available

on http://www-zeus.desy.de/bluebook/bluebook.html.

[14] N. Harnew, et al., Nucl. Instrum. Methods A 279 (1989) 290;

B. Foster, et al., Nucl. Phys. B (Proc. Suppl.) 32 (1993) 181;

B. Foster, et al., Nucl. Instrum. Methods A 338 (1994) 254.

[15] ZEUS Collaboration, J. Breitweg, et al., Phys. Lett. B 481 (2000) 213;

ZEUS Collaboration, J. Breitweg, et al., Eur. Phys. J. C 18 (2001) 625;

D. Bartsch, PhD thesis (unpublished), Universität Bonn, Bonn, Germany, 2007.

[16] M. Derrick, et al., Nucl. Instrum. Methods A 309 (1991) 77;

A. Andresen, et al., Nucl. Instrum. Methods A 309 (1991) 101;

A. Caldwell, et al., Nucl. Instrum. Methods A 321 (1992) 356;

A. Bernstein, et al., Nucl. Instrum. Methods A 336 (1993) 23.

[17] A. Bamberger, et al., Nucl. Instrum. Methods A 382 (1996) 419;

S. Magill, S. Chekanov, in: B. Aubert, et al. (Eds.), Proceedings of the IX International Conference on Calorimetry, Annecy, 9–14 October 2000, in: Frascati Physics Series, vol. 21, Annecy, France, 2001, p. 625.

[18] A. Bamberger, et al., Nucl. Instrum. Methods A 401 (1997) 63.

ZEUS Collaboration, M. Derrick, et al., Z. Phys. C 63 (1994) 391;
J. Andruszków, et al., Acta Phys. Pol. B 32 (2001) 2025.
[20] L. Lönnblad, Comput. Phys. Commun. 71 (1992) 15.
[21] B. Andersson, et al., Phys. Rep. 97 (1983) 31.
[22] M. Bengtsson, T. Sjöstrand, Comput. Phys. Commun. 46 (1987) 43.
[23] T. Sjöstrand, Comput. Phys. Commun. 82 (1994) 74.
[24] T. Sjöstrand, et al., Comput. Phys. Commun. 135 (2001) 238.

[25] R. Brun, et al., GEANT3, Technical Report CERN-DD/EE/84-1, CERN, 1987.

[26] H. Fesefeldt, The simulation of hadronic showers: Physics and applications (unpublished), PITHA-85-02.

[27] ZEUS Collaboration, S. Chekanov, et al., Phys. Lett. B 591 (2004) 7.

[28] H. Abramowicz, A. Caldwell, R. Sinkus, Nucl. Instrum. Methods A 365 (1995) 508.

[29] F. Jacquet, A. Blondel, in: U. Amaldi (Ed.), Proceedings of the Study for an ep Facility for Europe, Hamburg,

Germany, 1979, p. 391. Also in preprint DESY 79/48.

[30] Particle Data Group, W.-M. Yao, et al., J. Phys. G 33 (2006) 1.

[31] N. Blasi, et al., Nucl. Phys. A 624 (1997) 433.

[32] J. Franz, et al., Nucl. Phys. A 472 (1987) 733.

[33] G.R. Smith, et al., Phys. Rev. C 30 (1984) 593.

[34] K. Kikuchi, Prog. Theor. Phys. 18 (1957) 503.

[35] P.G. Roos, et al., Nucl. Phys. A 255 (1975) 187.

[36] B. Fagerstrom, et al., Phys. Scr. 13 (1976) 10.

[37] A.A. Moiseev, J.F. Ormes, Astropart. Phys. 6 (1997) 379.

[38] ZEUS Collaboration, S. Chekanov, et al., Eur. Phys. J. C 51 (2007) 1.

[39] R.P. Feynman, Photon–Hadron Interactions, Benjamin, New York, 1972;

K.H. Streng, T.F. Walsh, P.M. Zerwas, Z. Phys. C 2 (1979) 237.

[40] G.T. Garvey, B.Z. Kopeliovich, B. Povh, Comments Mod. Phys. A 2 (2001) 47;

S. Chekanov, Eur. Phys. J. C 44 (2005) 367;

F. Bopp, Yu.M. Shabelski, Phys. At. Nucl. 68 (2005) 2093;

F. Bopp, Yu.M. Shabelski, Eur. Phys. J. A 28 (2006) 237.

[41] B. Kopeliovich, B. Povh, Z. Phys. C 75 (1997) 693;

B. Kopeliovich, B. Povh, Phys. Lett. B 446 (1999) 321.

[42] S. Leupold, U.W. Heinz, Phys. Rev. C 50 (1994) 1110.

Atsuo Kuniba a,*, Reiho Sakamoto b, Yasuhiko Yamada c

a Institute of Physics, Graduate School of Arts and Sciences, University of Tokyo, Komaba, Tokyo 153-8902, Japan

b Department of Physics, Graduate School of Science, University of Tokyo, Hongo, Tokyo 113-0033, Japan

c Department of Mathematics, Faculty of Science, Kobe University, Hyogo 657-8501, Japan

Available online 21 June 2007

Abstract

We introduce ultradiscrete tau functions associated with rigged configurations for A_n^(1). They satisfy an ultradiscrete version of the Hirota bilinear equation and play a role analogous to a corner transfer matrix for the box–ball system. As an application, we establish a piecewise linear formula for the Kerov–Kirillov–Reshetikhin bijection in the combinatorial Bethe ansatz. They also lead to general N-soliton solutions of the box–ball system.

© 2007 Elsevier B.V. All rights reserved.

1. Introduction

The Bethe ansatz and the corner transfer matrix are methods of primary importance in

analysing solvable lattice models [1]. The Bethe ansatz produces eigenvectors of row transfer

matrices from solutions of the Bethe equation [2]. The corner transfer matrix method determines

the one-point function from the one-dimensional sums [1]. See [3–5] and [6,7] for some typical

applications. Interestingly, both of these approaches are known to admit combinatorial versions,

which have brought fruitful insights and applications into representation theory as well [8].

The combinatorial Bethe ansatz was initiated by Kerov, Kirillov and Reshetikhin (KKR) [9,10]. They invented the object called rigged configuration, which serves as a combinatorial substitute for the solutions of the Bethe equation. By the KKR bijection, they are in one-to-one correspondence with the Littlewood–Richardson tableaux, or equivalently, highest paths which are the combinatorial analogues of the Bethe eigenvectors. As for the corner transfer matrix method, decisive progress came with the advent of the crystal base theory [11,12], where the one-dimensional sums are formulated as generating functions of the energy of affine crystals over paths.

* Corresponding author.
yamaday@math.kobe-u.ac.jp (Y. Yamada).
0550-3213/$ – see front matter © 2007 Elsevier B.V. All rights reserved.
doi:10.1016/j.nuclphysb.2007.06.007

Guided by a number of relevant results [13–18], these streams have merged into the so-called X = M conjecture [19,20] for general affine Lie algebra. Here X is the one-dimensional sum in the corner transfer matrix method. For type A_n^(1), it coincides essentially with the Kostka–Foulkes polynomial [21] for the case treated in [9,10]. On the other hand, M is the fermionic formula (2.10) in the Bethe ansatz, which is a generating function of the charge function c(μ, r) (2.9). By now, the X = M conjecture has been studied extensively and solved in several cases [22–25].

During these developments, it was realized that not only the Bethe ansatz or the corner transfer matrix, but also the solvable lattice models themselves admit decent combinatorial versions. In fact, vertex models with the quantum group symmetry U_q(A_n^(1)) turned out to be the soliton cellular automata at q = 0 [26,27] that had been known as the box–ball systems [28,29]. Row transfer matrices in the former tend to commuting time evolutions in the latter. The finding has led to a systematic generalization of such automata [30–32], which possess fascinating features as ultradiscrete integrable systems [33]. (See the explanation under (5.10) for the ultradiscretization.) Thus it is a natural endeavor to study these automata by the combinatorial versions of the Bethe ansatz and the corner transfer matrix.
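The idea behind ultradiscretization can be illustrated with the standard limit ε log(e^{a/ε} + e^{b/ε}) → max(a, b) as ε → +0, which turns sums into maxima and products into sums. The sketch below is a generic numerical illustration of that limit, not the specific procedure of the paper; the function name `soft_max` is a hypothetical choice.

```python
import math


def soft_max(a, b, eps):
    """Evaluate eps * log(exp(a/eps) + exp(b/eps)) in a numerically
    stable way by factoring out the larger exponent."""
    m = max(a, b)
    return m + eps * math.log(math.exp((a - m) / eps) + math.exp((b - m) / eps))


# As eps decreases, the value approaches max(3, 1) = 3 from above.
for eps in (1.0, 0.1, 0.001):
    print(eps, soft_max(3.0, 1.0, eps))
```

In this limit a bilinear equation among positive quantities becomes a piecewise linear one built from `max` and `+`.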

As for the Bethe ansatz, this has been done in [34,35], which yielded the inverse scattering formalism of the box–ball systems. It turned out that rigged configurations are action-angle variables, which provide the conserved quantities or linearize the commuting time evolutions. The KKR bijection is the direct/inverse scattering (Gelfand–Levitan) map. In particular, the mysterious combinatorial algorithm in the bijection is identified with a crystal theoretical vertex operator.

Then what about the corner transfer matrix? This is the issue that we are going to address in this paper. From a naive point of view, one is tempted to regard the number of balls in a quadrant of the two-dimensional time evolution pattern of the box–ball system as its candidate. We introduce such a quantity ρ_i(p) (4.1) for a path p. On the other hand, the combinatorial analogue of the corner transfer matrix in the crystal base theory is the energy of affine crystals [12,17], which is denoted by E_i(p) in (4.12). Our Proposition 4.6 asserts ρ_i(p) = E_i(p) indeed. One of the main results in this paper is Theorem 6.12, which states τ_i(p) = ρ_i(p) = E_i(p). Here τ_i(p) is the piecewise linear function on the rigged configuration (μ^(0), (μ^(1), r^(1)), ..., (μ^(n), r^(n))) for p:

$$\tau_i(p) = \max_{\nu \subseteq \mu}\bigl\{-c(\nu, s) - |\nu^{(i)}|\bigr\},$$

$$c(\nu, s) = \frac{1}{2}\sum_{a,b} C_{a,b}\,\min\bigl(\nu^{(a)}, \nu^{(b)}\bigr) - \min\bigl(\mu^{(0)}, \nu^{(1)}\bigr) + \sum_{a} |s^{(a)}|,$$

where (C_ab)_{1≤a,b≤n} is the Cartan matrix of A_n. c(ν, s) is the charge function appearing in the fermionic formula, and the max extends over all the subsets (ν^(a), s^(a)) ⊆ (μ^(a), r^(a)) of the rigged configuration. See (2.19), (2.20), (2.24) and Section 2.1 for a precise account. In short, τ_i is an ultradiscretization of a single summand in the fermionic formula with respect to the subsets of the rigged configuration.

An origin of this curious quantity goes back to Sato's theory of soliton equations [36]. In fact, $\tau_i$ arises as an ultradiscretization of the well known tau function for the KP hierarchy [37] under a special choice of parameters adapted to the rigged configuration. Using this fact, we show that $\tau_i$ satisfies an ultradiscrete version of the Hirota bilinear equation, which actually serves as a characterization of $\tau_i$ up to a boundary condition. We call $\tau_i$ the ultradiscrete tau function. It serves as an analogue of a corner transfer matrix in the box-ball system and bilinearizes the dynamics. These features are summarized in Table 1.

Table 1

                                        Role in box-ball system    Description of dynamics
Bethe ansatz    Rigged configuration    Action-angle variable      Linear
                Tau function                                       Bilinear

As the main consequences of Theorem 6.12, we derive a piecewise linear formula for the KKR bijection (Theorem 2.1), the solution of the initial value problem (Theorem 7.6) and the general N-soliton solution (7.21), (7.37), (7.42) for the box-ball system. Note that the quantities $\rho_i = E_i$ arise from the corner transfer matrix and crystals, whereas $\tau_i$ is an explicit formula originating in the Bethe ansatz. Therefore our Theorem 6.12, i.e., $\rho_i = E_i = \tau_i$, provides another connection of the two methods analogous to the X = M conjecture.

The layout of the paper is as follows. In Section 2, $\tau_i$ is introduced in (2.18)-(2.20) as a piecewise linear function on rigged configurations. It is actually a member of the family $\tau_i^{(a)}$ (2.22) which obeys the recursion relation (2.23). It reflects the nested structure $sl_{n+1} \supset sl_n \supset \cdots \supset sl_2$, which will be utilized extensively. The piecewise linear formula for the KKR bijection is stated in Theorem 2.1.

In Section 3, we give the definition and the basic properties of the box-ball system.

In Section 4, we introduce $\rho_i$ and $E_i$. $\rho_i$ in (4.1) is the number of balls in the SW quadrant in the time evolution pattern of the box-ball system. $E_i$ is defined by (4.12) and (4.11), and is a sum of local energy functions in the affine crystal. They are analogues of the corner transfer matrix [1] in complementary viewpoints; $\rho_i$ originates in the box-ball system and $E_i$ in the crystal base theory. They are identified in Proposition 4.6.

The piecewise linear formula for the KKR bijection (Theorem 2.1) is a consequence of the further identification $\tau_i = \rho_i = E_i$ in Theorem 6.12. Sections 5 and 6 are devoted to a proof of this fact. In Section 5, $\tau_i$ is shown to emerge as an ultradiscretization of the tau functions of the KP hierarchy (Lemma 5.3) and to satisfy the Hirota type bilinear equation (Proposition 5.1). The key to these results is the special choice of the parameters (5.5)-(5.9). It assures the positivity, which is vital in the ultradiscretization (Lemma 5.2). The content of this section is a refinement of the earlier analysis [26].

In Section 6, $\tau_i = \rho_i$ for $A_n^{(1)}$ is proved on the asymptotic states by induction on the rank $n$ (Proposition 6.1 and its reduction in Proposition 6.4). From the assumption $\tau_i = \rho_i = E_i$ for $A_{n-1}^{(1)}$, the scattering data is expressed in terms of tau functions (Lemma 6.6). Then we take advantage of the vertex operator formulation of the KKR bijection [34,35] to make the induction proceed. Combined with the results in Section 5, the agreement on the asymptotic states is enough to establish the claim $\tau_i = \rho_i$ everywhere.

In Section 7, Theorem 2.1 and Theorem 6.12 are generalized to arbitrary (non-highest) states. As an application, we present the solution of the initial value problem of the box-ball system in Theorem 7.6. Our tau functions are parametrized by the conserved quantities that specify solitons. We rewrite them in several forms in (7.21), (7.37) and (7.42). They yield general N-soliton solutions of the box-ball system. Among others, our ultradiscrete tau functions are most elegantly presented in (7.42) in terms of affine crystals in the principal picture.

Appendix A summarizes the rudiments of the crystal base theory. Appendix B illustrates the graphical rule [17] for obtaining the combinatorial R, the winding and the non-winding numbers relevant to the energy function. Appendix C recalls the combinatorial algorithm for the KKR bijection. Appendix D is the crystal theoretical reformulation of the KKR map due to [34,35]. Appendix E is an exposition of the inverse scattering formalism of the box-ball system which supplements Section 3.

2. Ultradiscrete tau function

2.1. Preliminary

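We summarize the basic notation used throughout the paper. For a multiset $\lambda = (\lambda_1, \ldots, \lambda_k)$, and similarly for $\mu = (\mu_1, \ldots, \mu_m)$, we use the symbols

\[ |\lambda| = \lambda_1 + \cdots + \lambda_k, \qquad \ell(\lambda) = k, \tag{2.1} \]
\[ \lambda_{[N]} = (\lambda_1, \ldots, \lambda_N) \quad (0 \le N \le k), \tag{2.2} \]
\[ \min(\lambda, \mu) = \sum_{i=1}^{k}\sum_{j=1}^{m} \min(\lambda_i, \mu_j), \tag{2.3} \]
\[ \lambda \subseteq \mu \;\overset{\mathrm{def}}{\Longleftrightarrow}\; \{\lambda_1, \ldots, \lambda_k\} \subseteq \{\mu_1, \ldots, \mu_m\}, \tag{2.4} \]

where $\subseteq$ accounts for the multiplicity as well. For example, $\emptyset, (1,1), (1,3,1) \subseteq (1,2,1,3)$ but $(2,2) \not\subseteq (1,2,1,3)$.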
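The notation (2.1)-(2.4) can be sketched in a few lines of Python. The function names below (`size`, `truncate`, `min_pairing`, `is_submultiset`) are our own illustrative choices and do not appear in the paper; multisets are represented as plain tuples.

```python
from collections import Counter


def size(lam):
    """|lambda| in (2.1): the sum of all parts."""
    return sum(lam)


def truncate(lam, N):
    """lambda_[N] in (2.2): the first N parts."""
    return tuple(lam[:N])


def min_pairing(lam, mu):
    """min(lambda, mu) in (2.3): sum of min(lambda_i, mu_j) over all pairs."""
    return sum(min(a, b) for a in lam for b in mu)


def is_submultiset(lam, mu):
    """lambda contained in mu in the sense of (2.4), counting multiplicity."""
    need, have = Counter(lam), Counter(mu)
    return all(have[part] >= count for part, count in need.items())
```

The inclusion test reproduces the examples stated after (2.4): $(1,3,1)$ is contained in $(1,2,1,3)$, whereas $(2,2)$ is not.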

2.2. Rigged configurations

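Consider the data of the form

\[ \bigl(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)})\bigr), \tag{2.5} \]

where $\mu^{(a)} = (\mu^{(a)}_1, \ldots, \mu^{(a)}_{l_a}) \in (\mathbb{Z}_{\ge 1})^{l_a}$ and $r^{(a)} = (r^{(a)}_1, \ldots, r^{(a)}_{l_a}) \in (\mathbb{Z}_{\ge 0})^{l_a}$ for some $l_a \ge 0$. Apart from $\mu^{(0)}$, each $(\mu^{(a)}, r^{(a)})$ is to be understood as a multiset of the pairs $(\mu^{(a)}_1, r^{(a)}_1), \ldots, (\mu^{(a)}_{l_a}, r^{(a)}_{l_a})$ whose ordering does not matter. The data (2.5) is called a rigged configuration for $A_n^{(1)}$ if

\[ 0 \le r^{(a)}_i \le p^{(a)}_{\mu^{(a)}_i} \quad \text{for all } (\mu^{(a)}_i, r^{(a)}_i), \tag{2.6} \]
\[ p^{(a)}_j = E^{(a-1)}_j - 2E^{(a)}_j + E^{(a+1)}_j \quad (1 \le a \le n), \tag{2.7} \]
\[ E^{(a)}_j = \sum_{k=1}^{l_a} \min(j, \mu^{(a)}_k) \quad (0 \le a \le n), \qquad E^{(n+1)}_j = 0. \tag{2.8} \]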

The array $(\mu^{(0)}, \ldots, \mu^{(n)})$ is called a configuration and the nonnegative integers $r^{(a)}_i$ are called rigging. Note that $p^{(a)}_j$ and $E^{(a)}_j$ depend only on the configuration. In particular $E^{(a)}_\infty = |\mu^{(a)}|$. The data (2.5) is depicted as an n-tuple of Young diagrams $\mu^{(1)}, \ldots, \mu^{(n)}$ where the row of length $\mu^{(a)}_i$ is assigned with the rigging $r^{(a)}_i$ subject to the condition (2.6). In this convention, we identify all the diagrams obtained by reordering the rows of equal length with different rigging. In what follows we do not assume $\mu^{(a)}_1 \ge \cdots \ge \mu^{(a)}_{l_a}$ unless explicitly mentioned.
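As a quick check of the conditions (2.6)-(2.8), the vacancy numbers can be computed directly from a configuration. The sketch below is only illustrative: the names `E`, `vacancy` and `is_rigged_configuration` are ours, the configuration is the one appearing in Example 2.4, and the riggings used in the check are hypothetical values chosen to exercise (2.6), not the riggings of that example.

```python
def E(j, mu_a):
    """E_j^{(a)} in (2.8) for a single partition mu^{(a)}."""
    return sum(min(j, part) for part in mu_a)


def vacancy(j, a, config):
    """p_j^{(a)} in (2.7); config = [mu0, mu1, ..., mun] and 1 <= a <= n."""
    n = len(config) - 1
    upper = E(j, config[a + 1]) if a < n else 0   # E_j^{(n+1)} = 0
    return E(j, config[a - 1]) - 2 * E(j, config[a]) + upper


def is_rigged_configuration(config, riggings):
    """Check (2.6); riggings[a-1] lists r^{(a)} aligned with config[a]."""
    n = len(config) - 1
    return all(
        0 <= r <= vacancy(part, a, config)
        for a in range(1, n + 1)
        for part, r in zip(config[a], riggings[a - 1])
    )


# Configuration of Example 2.4: mu0 = (1^14), mu1 = (4,3,2), mu2 = (3,1), mu3 = (1).
config = [(1,) * 14, (4, 3, 2), (3, 1), (1,)]
```

For this configuration the row of length 4 in $\mu^{(1)}$ has vacancy number 0, so its rigging is forced to vanish, while the rows of length 3 and 2 admit riggings up to 2 and 5, respectively.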

For a multiset with positive components $\lambda$, let RC($\lambda$) denote the set of rigged configurations (2.5) with $\mu^{(0)} = \lambda$. Set

\[ c(\mu, r) = \frac{1}{2}\sum_{a,b} C_{ab} \min(\mu^{(a)}, \mu^{(b)}) - \min(\mu^{(0)}, \mu^{(1)}) + \sum_{a} |r^{(a)}|, \tag{2.9} \]

where $(C_{ab})_{1 \le a,b \le n}$ is the Cartan matrix of $A_n$. The fermionic formula [9,10] is obtained as the generating function:

\[ M(\lambda) = \sum q^{c(\mu,r)}, \tag{2.10} \]

where the sum extends over all the rigged configurations $(\lambda, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)})) \in$ RC($\lambda$) with prescribed values for $|\mu^{(1)}|, \ldots, |\mu^{(n)}|$. The sum (2.10) is arranged as $M(\lambda) = \sum_{\mu} q^{c(\mu,0)} \sum_{r} q^{\sum_{a,i} r^{(a)}_i}$, where the sum over the rigging $r$ under the condition (2.6) yields a product of q-binomial coefficients as is well known.

2.3. Crystals

(1)

We recapitulate basic facts on the $A_n^{(1)}$ crystal $B_l$. For a general background see Appendix A. The $B_l$ is the crystal base of the l-fold symmetric tensor representation. As the set it is given by

\[ B_l = \{ x = (x_1, \ldots, x_{n+1}) \in (\mathbb{Z}_{\ge 0})^{n+1} \mid x_1 + \cdots + x_{n+1} = l \}. \tag{2.11} \]

The Kashiwara operators act as $\tilde e_i(x) = x'$, $\tilde f_i(x) = x''$ with $x'_j = x_j + \delta_{i,j} - \delta_{i,j+1}$ and $x''_j = x_j - \delta_{i,j} + \delta_{i,j+1}$. Here indices are in $\mathbb{Z}_{n+1}$ and $x'$ and $x''$ are to be understood as 0 unless they belong to $(\mathbb{Z}_{\ge 0})^{n+1}$. The combinatorial R: Aff($B_l$) $\otimes$ Aff($B_m$) $\to$ Aff($B_m$) $\otimes$ Aff($B_l$) has the form $R: x[d] \otimes y[e] \mapsto y'[e - H(x \otimes y)] \otimes x'[d + H(x \otimes y)]$, where $x'$, $y'$ and $H$ are described by the piecewise linear formula [26,38]:

\[ x'_i = x_i + Q_i(x \otimes y) - Q_{i-1}(x \otimes y), \qquad y'_i = y_i + Q_{i-1}(x \otimes y) - Q_i(x \otimes y), \tag{2.12} \]
\[ Q_i(x \otimes y) = \min\Bigl\{ \sum_{j=1}^{k-1} x_{i+j} + \sum_{j=k+1}^{n+1} y_{i+j} \;\Big|\; 1 \le k \le n+1 \Bigr\}, \tag{2.13} \]
\[ H(x \otimes y) = \min(l, m) - Q_0(x \otimes y). \tag{2.14} \]

The energy function $H$ here is normalized so that $0 \le H \le \min(l, m)$ and coincides with the winding number [17]. In general $\min(l, m) - Q_i$ is the ith winding number that counts the lines crossing $x_i$ and $x_{i+1}$ (Appendix B).
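The piecewise linear formula (2.12)-(2.14) is straightforward to evaluate. The following is a minimal sketch, assuming that elements of $B_l$ are stored as tuples $(x_1, \ldots, x_{n+1})$ and that indices are read modulo $n+1$; the function names `Q` and `combinatorial_R` are ours.

```python
def Q(i, x, y):
    """Non-winding number Q_i(x (x) y) of (2.13); indices are taken mod n+1."""
    N = len(x)                                   # N = n + 1
    xs = lambda t: x[(t - 1) % N]                # x_t with a 1-based cyclic index
    ys = lambda t: y[(t - 1) % N]
    return min(
        sum(xs(i + j) for j in range(1, k)) + sum(ys(i + j) for j in range(k + 1, N + 1))
        for k in range(1, N + 1)
    )


def combinatorial_R(x, d, y, e):
    """Map x[d] (x) y[e] to y'[e - H] (x) x'[d + H] following (2.12) and (2.14)."""
    N = len(x)
    q = [Q(i, x, y) for i in range(N + 1)]       # q[N] equals q[0] by cyclicity
    H = min(sum(x), sum(y)) - q[0]
    x_new = tuple(x[i - 1] + q[i] - q[i - 1] for i in range(1, N + 1))
    y_new = tuple(y[i - 1] + q[i - 1] - q[i] for i in range(1, N + 1))
    return (y_new, e - H), (x_new, d + H)
```

Applied to $x = (1,2,0,1)$, $d = 5$, $y = (1,0,1,0)$, $e = 9$ with $n = 3$, the sketch gives $H = 1$ and returns $(0,1,0,1)[8] \otimes (2,1,1,0)[6]$.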

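The element $x = (x_1, \ldots, x_{n+1})$ is also denoted by a row shape semistandard tableau of length $l$ containing the letter $i$ $x_i$ times, and $x[d] \in$ Aff($B_l$) by the tableau with index $d$. For example in $A_3^{(1)}$, the following two lines express the same relation:

\[ (1,2,0,1)[5] \otimes (1,0,1,0)[9] \simeq (0,1,0,1)[8] \otimes (2,1,1,0)[6], \]
\[ \boxed{1224}_{\,5} \otimes \boxed{13}_{\,9} \simeq \boxed{24}_{\,8} \otimes \boxed{1123}_{\,6}. \tag{2.15} \]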

ul = 1l = 1 1 Bl .

a l = a a Bl ,

Setting

= (x1 , . . . , xn+1 ) Bl | x1 = = xa = 0

a+1

Bl

we have

1

Bl = Bl

2

Bl

n+1

Bl

(2.16)

(0 a n),

= (n + 1)l

(2.17)

(1)

An

but also for the nested family

(1)

(1)

(1)

(1)

A0 , A1 , . . . , An1 . In such a circumstance we realize the crystal Bl for Ana (0 a n) on

a+1

the set B

with the Kashiwara operators ei , fi (a i n). In this convention the highest

l

a+1

.

Let

\[ P_+(\lambda) = \{ p \in B_{\lambda_1} \otimes \cdots \otimes B_{\lambda_L} \mid \tilde e_i p = 0,\ 1 \le i \le n \} \]

be the set of highest elements (paths) with respect to $A_n$. The bijection [9,10] between RC($\lambda$) and the Littlewood-Richardson tableaux is translated to the one between RC($\lambda$) and $P_+(\lambda)$. We call the resulting map the KKR bijection. See Appendix C for an exposition of the algorithm and Appendix D for the recent reformulation as the crystal theoretical vertex operator [34,35]. In particular, there is a nested structure with respect to the rank in the sense that if $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ is a rigged configuration for $A_n^{(1)}$, so is $(\mu^{(a)}, (\mu^{(a+1)}, r^{(a+1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ for $A_{n-a}^{(1)}$. Moreover, the KKR bijection sends the latter to a highest path in $B^{\ge a+1}_{\mu^{(a)}_1} \otimes \cdots \otimes B^{\ge a+1}_{\mu^{(a)}_{l_a}}$.

We use the notation defined in Section 2.1. Given a rigged configuration $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$, we introduce the ultradiscrete tau functions $\tau_0(\lambda), \tau_1(\lambda), \ldots, \tau_{n+1}(\lambda)$ for $\lambda \subseteq \mu^{(0)}$ as follows:

\[ \tau_0(\lambda) = \tau_{n+1}(\lambda) - |\lambda|, \tag{2.18} \]
\[ \tau_d(\lambda) = \max_{\nu \subseteq \mu} \bigl\{ -c(\nu, s) - |\nu^{(d)}| \bigr\} \quad (1 \le d \le n+1,\ |\nu^{(n+1)}| = 0), \tag{2.19} \]
\[ -c(\nu, s) = \min(\lambda, \nu^{(1)}) + \min(\nu^{(1)}, \nu^{(2)}) + \cdots + \min(\nu^{(n-1)}, \nu^{(n)}) - \min(\nu^{(1)}, \nu^{(1)}) - \min(\nu^{(2)}, \nu^{(2)}) - \cdots - \min(\nu^{(n)}, \nu^{(n)}) - |s^{(1)}| - \cdots - |s^{(n)}|. \tag{2.20} \]

In (2.19), max is taken over $\nu = (\nu^{(1)}, \ldots, \nu^{(n)})$, where the components are independently chosen under the condition $\nu^{(1)} \subseteq \mu^{(1)}, \ldots, \nu^{(n)} \subseteq \mu^{(n)}$. The array $s = (s^{(1)}, \ldots, s^{(n)})$ denotes the set of the riggings $s^{(1)} \subseteq r^{(1)}, \ldots, s^{(n)} \subseteq r^{(n)}$ that are paired with the chosen $\nu^{(1)}, \ldots, \nu^{(n)}$ as $\{(\nu^{(a)}_i, s^{(a)}_i)\} \subseteq \{(\mu^{(a)}_i, r^{(a)}_i)\}$. The quantity $c(\nu, s)$ in (2.20) is obtained from $c(\mu, r)$ (2.9) by replacing $(\mu, r) = (\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ with $(\nu, s) = (\lambda, (\nu^{(1)}, s^{(1)}), \ldots, (\nu^{(n)}, s^{(n)}))$. Apart from $\nu^{(a)} \subseteq \mu^{(a)}$, there is no further constraint on $|\nu^{(1)}|, \ldots, |\nu^{(n)}|$ and it is not required that the data $(\lambda, (\nu^{(1)}, s^{(1)}), \ldots, (\nu^{(n)}, s^{(n)}))$ be a rigged configuration for $A_n^{(1)}$. Since the max (2.19) includes the trivial case $\nu^{(a)} = \emptyset$ for all $a$, the quantities $\tau_1(\lambda), \ldots, \tau_{n+1}(\lambda)$ are nonnegative integers. Note that $\tau_{n+1}(\mu^{(0)}) = \max\{-c(\nu, s)\}$ in (2.19) may be viewed as an ultradiscretization of the single summand $q^{c(\mu,r)}$ in the fermionic formula (2.10) with respect to the subsets $(\nu, s) \subseteq (\mu, r)$. See also (5.11).
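For small rigged configurations the maximum in (2.19) can be evaluated by brute force over all sub-collections of the pairs (row length, rigging). The sketch below is exponential in the number of rows and is meant only as an illustration of the definition; the names `tau`, `tau0` and `sub_collections` are ours, and the rank-one input used in the check, $\mu^{(1)} = (1)$ with rigging $0$, is a hypothetical minimal example rather than data taken from the paper.

```python
from itertools import combinations


def min_pairing(lam, mu):
    """min(lambda, mu) of (2.3)."""
    return sum(min(a, b) for a in lam for b in mu)


def sub_collections(pairs):
    """All sub-collections of a list of (part, rigging) pairs; repeats are harmless for a max."""
    for size in range(len(pairs) + 1):
        yield from combinations(pairs, size)


def tau(d, lam, rigged):
    """tau_d(lam) of (2.19)-(2.20); rigged = [(mu1, r1), ..., (mun, rn)], 1 <= d <= n+1."""
    n = len(rigged)

    def best_from(a, prev):
        # Maximal contribution of the colors a, a+1, ..., n given nu^{(a-1)} = prev.
        if a > n:
            return 0
        mu, r = rigged[a - 1]
        best = None
        for chosen in sub_collections(list(zip(mu, r))):
            nu = [part for part, _ in chosen]
            gain = min_pairing(prev, nu) - min_pairing(nu, nu) - sum(s for _, s in chosen)
            if a == d:
                gain -= sum(nu)                  # the extra term -|nu^{(d)}|
            value = gain + best_from(a + 1, nu)
            best = value if best is None else max(best, value)
        return best

    return best_from(1, list(lam))


def tau0(lam, rigged):
    """tau_0(lam) of (2.18)."""
    return tau(len(rigged) + 1, lam, rigged) - sum(lam)
```

With `rigged = [((1,), (0,))]` one finds $\tau_2((1,1)) = 1$ and $\tau_1((1,1)) = 0$, the maximum for $\tau_2$ being attained at the full choice $\nu^{(1)} = \mu^{(1)}$.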

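Theorem 2.1. Let the image of the rigged configuration $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ under the KKR bijection be the highest path $p_1 \otimes \cdots \otimes p_L \in P_+(\mu^{(0)})$. Then $p_k = (x_1, \ldots, x_{n+1}) \in B_{\mu^{(0)}_k}$ is expressed as

\[ x_d = \tau_{k,d} - \tau_{k-1,d} - \tau_{k,d-1} + \tau_{k-1,d-1}, \qquad \tau_{k,d} = \tau_d(\mu^{(0)}_{[k]}). \tag{2.21} \]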

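Due to the nested structure of the KKR bijection with respect to the rank [34], Theorem 2.1 is also stated as a family of relations corresponding to $sl_{n+1} \supset sl_n \supset \cdots \supset sl_2$. To do so, we introduce the family of ultradiscrete tau functions $\{\tau^{(a)}_d(\lambda) \mid 0 \le a \le n-1,\ a \le d \le n+1,\ \lambda \subseteq \mu^{(a)}\}$ by $\tau^{(a)}_a(\lambda) = \tau^{(a)}_{n+1}(\lambda) - |\lambda|$ and

\[ \tau^{(a)}_d(\lambda) = \max \bigl\{ \min(\lambda, \nu^{(a+1)}) + \min(\nu^{(a+1)}, \nu^{(a+2)}) + \cdots + \min(\nu^{(n-1)}, \nu^{(n)}) - \min(\nu^{(a+1)}, \nu^{(a+1)}) - \min(\nu^{(a+2)}, \nu^{(a+2)}) - \cdots - \min(\nu^{(n)}, \nu^{(n)}) - |s^{(a+1)}| - |s^{(a+2)}| - \cdots - |s^{(n)}| - |\nu^{(d)}| \bigr\} \quad (a+1 \le d \le n+1), \tag{2.22} \]

where $|\nu^{(n+1)}| = 0$ as before. The max is taken over the independent choices $\nu^{(a+1)} \subseteq \mu^{(a+1)}, \ldots, \nu^{(n)} \subseteq \mu^{(n)}$. The subsets of the riggings $s^{(a+1)} \subseteq r^{(a+1)}, \ldots, s^{(n)} \subseteq r^{(n)}$ are those paired with the chosen $\nu^{(a+1)}, \ldots, \nu^{(n)}$ as before. The previously introduced tau function $\tau_d(\lambda)$ (2.19) is equal to $\tau^{(0)}_d(\lambda)$. Now Theorem 2.1 is rephrased as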

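Theorem 2.2. Given a rigged configuration $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ and $0 \le a \le n-1$, let the image of $(\mu^{(a)}, (\mu^{(a+1)}, r^{(a+1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ under the KKR bijection be the $A_{n-a}$ highest path $p_1 \otimes \cdots \otimes p_{l_a} \in B^{\ge a+1}_{\mu^{(a)}_1} \otimes \cdots \otimes B^{\ge a+1}_{\mu^{(a)}_{l_a}}$. Then $p_k = (x_{a+1}, x_{a+2}, \ldots, x_{n+1})$ is expressed as

\[ x_d = \tau^{(a)}_{k,d} - \tau^{(a)}_{k-1,d} - \tau^{(a)}_{k,d-1} + \tau^{(a)}_{k-1,d-1}, \qquad \tau^{(a)}_{k,d} = \tau^{(a)}_d(\mu^{(a)}_{[k]}). \]

Again, $x_{a+1} + \cdots + x_{n+1} = \mu^{(a)}_k$ is evident by the construction. For a proof of Theorem 2.1, see Section 4.4.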

The tau functions (2.22) are the solution of the recursion relation with respect to the rank:

\[ \tau^{(a)}_d(\lambda) = \max_{\nu \subseteq \mu^{(a+1)}} \bigl\{ \min(\lambda, \nu) - \min(\nu, \nu) - |s| + \tau^{(a+1)}_d(\nu) \bigr\} \tag{2.23} \]

with the initial condition $\tau^{(n)}_{n+1}(\lambda) = 0$, $\tau^{(n)}_n(\lambda) = -|\lambda|$. The rigging $s$ is the subset of $r^{(a+1)}$ paired with the chosen $\nu$.

Lemma 2.3. $\tau^{(a)}_d(\emptyset) = 0$.

Proof. It suffices to prove the $a = 0$ case. When $\lambda = \emptyset$, (2.19) becomes $\tau_d(\emptyset) = -\min\{ c(\nu) + |s^{(1)}| + \cdots + |s^{(n)}| + |\nu^{(d)}| \}$, where

\[ c(\nu) = \frac{1}{2}\sum_{a,b} C_{a,b} \min(\nu^{(a)}, \nu^{(b)}) = \frac{1}{2}\sum_{a,b} C_{a,b} \sum_{i,j} \min(i,j)\, m^{(a)}_i m^{(b)}_j, \]

and $m^{(a)}_i$ is the number of $k$ such that $\nu^{(a)}_k = i$. This is a positive definite quadratic form whose minimum is 0 at $m^{(a)}_j = 0$. The other part $|s^{(1)}| + \cdots + |s^{(n)}| + |\nu^{(d)}|$ appearing in $\tau_d(\emptyset)$ also attains the minimum 0 simultaneously at this point. □

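Let the image of the rigged configuration $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ under the KKR bijection be the highest path $p_1 \otimes \cdots \otimes p_L \in P_+(\mu^{(0)}) \subset B_{\mu^{(0)}_1} \otimes \cdots \otimes B_{\mu^{(0)}_L}$. In what follows we will also write

\[ \tau_i(\lambda) = \tau_{k,i} = \tau_i(p_1 \otimes \cdots \otimes p_k) \quad \text{for } \lambda = \mu^{(0)}_{[k]} = (\mu^{(0)}_1, \ldots, \mu^{(0)}_k) \quad (1 \le k \le L). \tag{2.24} \]

A highest path $p_1 \otimes \cdots \otimes p_k$ can be extended to a longer one $p_1 \otimes \cdots \otimes p_k \otimes p_{k+1} \otimes \cdots \otimes p_L$ in which $p_{k+1} \otimes \cdots \otimes p_L$ is not unique. Suppose that $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ and $(\mu'^{(0)}, (\mu'^{(1)}, r'^{(1)}), \ldots, (\mu'^{(n)}, r'^{(n)}))$ are two rigged configurations corresponding to such extensions of $p_1 \otimes \cdots \otimes p_k$, and let $\tau_i(p_1 \otimes \cdots \otimes p_k)$ and $\tau'_i(p_1 \otimes \cdots \otimes p_k)$ be the associated tau functions in the sense of (2.24). Then $\tau_i(p_1 \otimes \cdots \otimes p_k) = \tau'_i(p_1 \otimes \cdots \otimes p_k)$ will be guaranteed by Theorem 4.9. Note however that they are different as the piecewise linear expressions as in (2.18)-(2.20). For this reason, we will always mention the rigged configurations relevant to $\tau_i(p_1 \otimes \cdots \otimes p_k)$.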

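Example 2.4. Consider the highest path $p = 11112221322433 \in B_1^{\otimes L}$ of length $L = 14$, where we have omitted the symbol $\otimes$. The corresponding rigged configuration is depicted in Example C.2. Thus we set $\mu^{(0)} = (1^{14})$, $\mu^{(1)} = (4,3,2)$, $\mu^{(2)} = (3,1)$, $\mu^{(3)} = (1)$, $r^{(3)} = (0)$, with the riggings $r^{(1)}$, $r^{(2)}$ as in Example C.2.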

k          1   2   3   4   5   6   7   8   9  10  11  12  13  14
τ_{k,1}    0   0   0   0   0   0   0   1   2   3   4   6   8  10
τ_{k,2}    0   0   0   0   1   2   3   4   5   7   9  11  13  15
τ_{k,3}    0   0   0   0   1   2   3   4   6   8  10  12  15  18
τ_{k,4}    0   0   0   0   1   2   3   4   6   8  10  13  16  19

The choices of the subsets $\nu = (\nu^{(1)}, \nu^{(2)}, \nu^{(3)})$ that attain these values for $\tau_{k,4} = \max\{\cdots\}$ in (2.19) are as follows:

k    1, 2, 3    4       5, 6, 7    8       9, 10    11      12, 13, 14
ν    A          A, B    B          B, C    C        C, D    D

A = $(\emptyset, \emptyset, \emptyset)$,
B = $((4), \emptyset, \emptyset)$, $((4), (1), \emptyset)$, $((4), (1), (1))$,
C = $((4,2), (3), \emptyset)$, $((4,2), (1), \emptyset)$, $((4,2), (3), (1))$, $((4,2), (1), (1))$, $((4,2), (3,1), (1))$,
D = $((4,3,2), (3,1), (1)) = (\mu^{(1)}, \mu^{(2)}, \mu^{(3)})$.

The case $k = 0$ enforces the choice $\nu^{(a)} = \emptyset$ in agreement with Lemma 2.3. In the other extreme case $k = L$, the full choice $\nu = \mu$ is the consequence of the general result in Remark 6.14. In general the maximum attaining $\nu$ for $\tau_i(\lambda) = \max\{\cdots\}$ gradually grows with $\lambda$. The above $p$ will be investigated further in Examples E.1 and E.4.

3. Boxball system

3.1. Conventional formulation

Consider the tensor product $B_{\lambda_1} \otimes B_{\lambda_2} \otimes \cdots \otimes B_{\lambda_L}$. Its elements are called states. We regard each component $(x_1, \ldots, x_{n+1}) \in B_l$ as a capacity $l$ box containing $x_i$ balls with color $i$ for $2 \le i \le n+1$. On the other hand the letter 1 is to be interpreted as a vacancy. Thus $x_1$ represents the empty space in the box. A state represents an array of boxes with capacity $\lambda_1, \ldots, \lambda_L$ containing balls of colors $2, 3, \ldots, n+1$.

We define the time evolution $T_l(p) = p'_1 \otimes \cdots \otimes p'_L$ of a state $p = p_1 \otimes \cdots \otimes p_L$ by

\[ u_l[0] \otimes p_1[0] \otimes \cdots \otimes p_L[0] \simeq p'_1[-d_1] \otimes \cdots \otimes p'_L[-d_L] \otimes v_l[d_1 + \cdots + d_L] \tag{3.1} \]

under the combinatorial R, which carries Aff($B_l$) from the left end to the right end. Here $v_l \in B_l$ and $d_i$ are uniquely determined by (2.12)-(2.14). We set

\[ E_l(p) = e_1 + \cdots + e_L, \qquad e_j = \min(\lambda_j, l) - d_j. \tag{3.2} \]

It is known [26,27,31] that $T_l$ is weight preserving, the commutativity $T_l T_k = T_k T_l$ is valid and $E_l(p)$ is a conserved quantity, i.e., $E_l(T_k(p)) = E_l(p)$ for any $k$ and $l$, provided that $p_j = u_{\lambda_j}$ for $L' \le j \le L$ with sufficiently large $L - L'$. The proof of these facts is based on the Yang-Baxter equation of the combinatorial R (Proposition A.1) and the property:

\[ v_l = u_l \quad \text{if } p_j = u_{\lambda_j} \text{ for } L' \le j \le L. \tag{3.3} \]

Since each $d_j$ is the winding number (2.14), $E_l(p)$ is the sum of the non-winding numbers $e_j$. In particular for $l = \infty$, $e_j$ is equal to the number of balls $x_2 + \cdots + x_{n+1}$ in the jth box $p_j = (x_1, x_2, \ldots, x_{n+1}) \in B_{\lambda_j}$. Therefore we find

\[ E_\infty(p) = \text{number of balls contained in } p. \tag{3.4} \]

In the terminology of solvable lattice models, $E_l$ is the energy associated with a row transfer matrix. It should not be confused with another energy $E_i$ (4.12) relevant to the corner transfer matrix. Their relation is given in Proposition 4.8. The conserved quantity $E_l$ will be evaluated explicitly for highest states in Proposition 6.15 and for general states in Proposition 7.7.

Example 3.1. The time evolution of the top row $p$ under $T_\infty$, i.e., $T_\infty(p)$, $T_\infty^2(p)$, $T_\infty^3(p)$ are listed downward. The frame of the semistandard tableaux and the symbol $\otimes$ are omitted.

11  122  2  1333  1  1  4  1  1  1  1  1  1  1  1  1  1
11  111  1  1222  3  3  3  4  1  1  1  1  1  1  1  1  1
11  111  1  1111  2  2  2  3  4  3  3  1  1  1  1  1  1
11  111  1  1111  1  1  1  2  3  2  2  4  3  3  1  1  1

The conserved quantities are given by $E_1(p) = 3$, $E_2(p) = 5$ and $E_l(p) = 7$ for $l \ge 3$.

The time evolution $T_\infty$ can be calculated by a simple prescription [26]. We introduce a map $L_i$ ($2 \le i \le n+1$) by

\[ L_i : \mathbb{Z}_{\ge 0} \times B_l \to B_l \times \mathbb{Z}_{\ge 0}, \qquad (m, y) \mapsto (y', m'), \tag{3.5} \]

where $m'$ and $y' = (y'_1, \ldots, y'_{n+1})$ are determined from $m$ and $y = (y_1, \ldots, y_{n+1})$ by

\[ m' = y_i + (m - y_1)_+, \qquad y'_j = \begin{cases} y_i + (y_1 - m)_+ & \text{if } j = 1, \\ \min(m, y_1) & \text{if } j = i, \\ y_j & \text{otherwise,} \end{cases} \tag{3.6} \]

where $(m)_+ = \max(m, 0)$. $L_i$ may be viewed as the interaction of the box $B_l$ with the carrier that contains $m$ balls of color $i$. The carrier drops as many balls as possible into the empty space $y_1$ and picks away all the color $i$ balls that were originally in the box. Using $L_i$, we introduce the operators $K_i$ ($2 \le i \le n+1$) that send a state to another as follows:

\[ K_i(p_1 \otimes p_2 \otimes \cdots) = p'_1 \otimes p'_2 \otimes \cdots, \qquad L_i(m_j, p_j) = (p'_j, m_{j+1}) \ \text{for } j \ge 0 \ (m_0 = 0). \]

The latter relation is applied successively for $j = 0, 1, 2, \ldots$, determining all the $p'_j$'s. In other words the operator $K_i$ attaches an empty carrier to the left of the state and sends it to the right, by which the color $i$ balls are moved to the right according to the local interaction rule $L_i$.

Proposition 3.2 (See [26]). The time evolution $T_\infty$ admits the factorization:

\[ T_\infty = K_2 K_3 \cdots K_{n+1}. \]
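The carrier prescription (3.5)-(3.6) and the factorization of Proposition 3.2 translate almost verbatim into code. The following is a minimal sketch, assuming that a state is a list of tuples $(x_1, \ldots, x_{n+1})$ padded with enough vacant boxes on the right for the carrier to end up empty; the names `local_rule`, `K`, `T_inf` and the string helper `parse` are ours.

```python
def local_rule(i, m, y):
    """Rule (3.5)-(3.6): a carrier holding m balls of color i meets the box y.

    y = (y_1, ..., y_{n+1}) with y_1 the empty space. Returns (y', m')."""
    y1, yi = y[0], y[i - 1]
    new = list(y)
    new[0] = yi + max(y1 - m, 0)       # picked-up balls leave vacancies behind
    new[i - 1] = min(m, y1)            # the carrier drops as many balls as fit
    return tuple(new), yi + max(m - y1, 0)


def K(i, state):
    """Send an empty carrier for color i through the state from left to right."""
    m, out = 0, []
    for box in state:
        box, m = local_rule(i, m, box)
        out.append(box)
    if m:
        raise ValueError("carrier not empty: append more vacant boxes on the right")
    return out


def T_inf(state):
    """T_infinity = K_2 K_3 ... K_{n+1}; the rightmost factor K_{n+1} acts first."""
    for i in range(len(state[0]), 1, -1):
        state = K(i, state)
    return state


def parse(row, colors=4):
    """Hypothetical helper: '11 122 2' -> list of boxes (x_1, ..., x_colors)."""
    return [tuple(word.count(str(c)) for c in range(1, colors + 1)) for word in row.split()]
```

Starting from the top row of Example 3.1, three applications of `T_inf` reproduce the three rows listed below it, and `K(4, ...)` alone moves the single ball of color 4 one box to the right.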

Example 3.3. For $p$ in Example 3.1, $K_4(p)$, $K_3 K_4(p)$ and $K_2 K_3 K_4(p) = T_\infty(p)$ are given.

11  122  2  1333  1  1  4  1  1  1  1  1  1  1  1  1  1
11  122  2  1333  1  1  1  4  1  1  1  1  1  1  1  1  1
11  122  2  1111  3  3  3  4  1  1  1  1  1  1  1  1  1
11  111  1  1222  3  3  3  4  1  1  1  1  1  1  1  1  1

tells that in the state T (p) = p1 pL , pj = uj is valid for 1 j k + 1.

3.2. Bethe ansatz

Highest states in $B_{\lambda_1} \otimes \cdots \otimes B_{\lambda_L}$ are in one to one correspondence with rigged configurations $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ with $\mu^{(0)} = (\lambda_1, \ldots, \lambda_L)$ by the KKR bijection. Suppose $L$ is sufficiently large. If a state $p = p_1 \otimes \cdots \otimes p_L$ is highest and $p_k = u_{\lambda_k}$ for $k \gg 1$, so is its time evolution $T_l(p)$. Thus the box-ball system induces the time evolution on the associated rigged configurations. For such states, $E^{(0)}_j$ (2.8) and the vacancy number $p^{(1)}_j$ are sufficiently large, and one can increase the color 1 rigging $r^{(1)}_i$ without violating the condition (2.6).

Proposition 3.5 (See [34], Proposition 2.6). Let $p = p_1 \otimes \cdots \otimes p_L \in P_+(\mu^{(0)})$ be the image of the rigged configuration $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ under the KKR bijection. Assume that $v_l = u_l$ in (3.1) and set $r'^{(1)}_i = r^{(1)}_i + \min(l, \mu^{(1)}_i)$. Then $(\mu^{(0)}, (\mu^{(1)}, r'^{(1)}), (\mu^{(2)}, r^{(2)}), \ldots, (\mu^{(n)}, r^{(n)}))$ is a rigged configuration and corresponds to the highest state $T_l(p) \in P_+(\mu^{(0)})$.

This is proved from the definition of the time evolution (3.1) and Lemma C.3. The time evolution $T_l$ in this paper corresponds to the $a = 1$ case of $T^{(a)}_l$ considered in [34]. In this sense the rigged configurations are the action-angle variables of the box-ball system which linearize the original nonlinear dynamics (3.1). Moreover it is clear that all the $T_l(p)$ are the same if $l \ge \max \mu^{(1)}$.

Example 3.6. The rigged configuration corresponding to $T_\infty^t(p)$ ($t = 0, 1, 2, 3$) in Example 3.1 (apart from $\mu^{(0)}$).

((1) , r (1) )

12

12

13

3+t

(a)

3t

3t

((2) , r (2) )

0

0

((3) , r (3) )

0

0

The length of each row is $\mu^{(a)}_i$ and the numbers on its right and left are the rigging $r^{(a)}_i$ and the vacancy number $p^{(a)}_{\mu^{(a)}_i}$, respectively. (Vacancy numbers are exhibited here for a check of (2.6).)

The Bethe ansatz produces transfer matrix eigenvectors from solutions to Bethe equations. The KKR bijection is its combinatorial version in the sense that the former is replaced by highest states and the latter by rigged configurations. Thus we see that the combinatorial Bethe ansatz provides a linearization scheme, or equivalently, an inverse scattering method of the box-ball system [34]. See Appendix E for a further exposition combined with the vertex operator formalism of the KKR bijection.

4.1. Number of balls in the SW quadrant

Let $p = p_1 \otimes \cdots \otimes p_L$ be a state and write its time evolution as $T_\infty^t(p_1 \otimes \cdots \otimes p_L) = p^t_1 \otimes \cdots \otimes p^t_L$, with $p^t_j = (x^t_{j,1}, x^t_{j,2}, \ldots, x^t_{j,n+1}) \in B_{\lambda_j}$. We do not assume that $p$ is highest. For $0 \le k \le L$ and $1 \le d \le n+1$, we define the function $\rho_{k,d}(p) \in \mathbb{Z}_{\ge 0}$ by ($\rho_{0,d}(p) = 0$)

\[ \rho_{k,d}(p) = \sum_{j=1}^{k} \bigl( x^0_{j,2} + \cdots + x^0_{j,d} \bigr) + \sum_{t \ge 1} \sum_{j=1}^{k} \bigl( x^t_{j,2} + \cdots + x^t_{j,n+1} \bigr). \tag{4.1} \]

Here the second term is finite due to Remark 3.4. In fact the double sum may well be replaced by $\sum_{t=1}^{k-1} \sum_{j=t+1}^{k}$ only, where the nonzero contributions are contained. This region is depicted as the SW quadrant of the time evolution pattern like Example 3.1.

The first term in (4.1) is the number of balls of color $2, 3, \ldots, d$ contained in the top row, which is the truncation $p_1 \otimes \cdots \otimes p_k$ of the state $p$. The second term counts the balls of all colors $2, \ldots, n+1$ within the hatched domain. By the definition, $\rho_{k,n+1}$ is the total number of balls within $p_1 \otimes \cdots \otimes p_k$ and the SW quadrant beneath it. Thus

\[ \rho_{k,1}(p) = \rho_{k,n+1}(T_\infty(p)) \tag{4.2} \]

holds. Note that $\rho_{k,d}(p)$ is independent of $p_{k+1}, p_{k+2}, \ldots, p_L$. In this regard, we will also use the notation

\[ \rho_d(p_1 \otimes \cdots \otimes p_k) = \rho_{k,d}(p). \tag{4.3} \]

Moreover,

\[ \rho_d(u_l \otimes p_1 \otimes \cdots \otimes p_k) = \rho_d(p_1 \otimes \cdots \otimes p_k) \tag{4.4} \]

for any $l$.

The above picture reminds us of Baxter's corner transfer matrix (CTM) in solvable lattice models [1]. In fact $\rho_{k,d}$ serves as its ultradiscrete analogue adapted to the box-ball system as we will see below.
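The quantity (4.1) can be evaluated by iterating the carrier rule (3.6) of Section 3.1 on the leftmost $k$ boxes alone. This relies on two observations: the carrier moves from left to right, so the first $k$ boxes of $T_\infty(p)$ depend only on the first $k$ boxes of $p$, and every ball is picked up at each step, so the window eventually empties. The sketch below is our own illustration (the names `evolve_window` and `rho` are not from the paper); the input `p` holds the first eight boxes of the state in Example 3.1.

```python
def evolve_window(window):
    """Apply T_infinity to the leftmost boxes only.

    Balls still in the carrier at the right edge are discarded, which does not
    affect the boxes inside the window."""
    colors = len(window[0])
    for i in range(colors, 1, -1):               # K_{n+1} first, K_2 last
        m, out = 0, []
        for y in window:
            y1, yi = y[0], y[i - 1]
            new = list(y)
            new[0] = yi + max(y1 - m, 0)
            new[i - 1] = min(m, y1)
            m = yi + max(m - y1, 0)
            out.append(tuple(new))
        window = out
    return window


def rho(k, d, state):
    """rho_{k,d}(p) of (4.1) for a state given as a list of tuples (x_1, ..., x_{n+1})."""
    window = [tuple(box) for box in state[:k]]
    total = sum(sum(box[1:d]) for box in window)      # colors 2..d in the top row
    while any(sum(box[1:]) for box in window):        # balls remain in the window
        window = evolve_window(window)
        total += sum(sum(box[1:]) for box in window)  # all colors, rows t >= 1
    return total


# First eight boxes of p in Example 3.1: 11, 122, 2, 1333, 1, 1, 4, 1.
p = [(2, 0, 0, 0), (1, 2, 0, 0), (0, 1, 0, 0), (1, 0, 3, 0),
     (1, 0, 0, 0), (1, 0, 0, 0), (0, 0, 0, 1), (1, 0, 0, 0)]
```

For instance `rho(8, 4, p)` counts the 7 balls of the top row plus 7, 4 and 1 balls found in the first eight boxes at $t = 1, 2, 3$.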

Example 4.1. For $p$ in Example 3.1, $\rho_{k,d}(p)$ takes the following values.

k          1   2   3   4   5   6   7   8
ρ_{k,1}    0   0   0   3   5   7   9  12
ρ_{k,2}    0   2   3   6   8  10  12  15
ρ_{k,3}    0   2   3   9  11  13  15  18
ρ_{k,4}    0   2   3   9  11  13  16  19

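By the definition, the kth component $p_k = (x_1, \ldots, x_{n+1}) \in B_{\lambda_k}$ in a state $p = p_1 \otimes \cdots \otimes p_L$ is expressed as

\[ x_d = \rho_{k,d} - \rho_{k-1,d} - \rho_{k,d-1} + \rho_{k-1,d-1} \quad (1 \le d \le n+1), \tag{4.5} \]

where $\rho_{k,d} = \rho_{k,d}(p)$ for $1 \le d \le n+1$ and the extra one $\rho_{k,0}(p)$ is specified by

\[ \rho_{k,0}(p) = \rho_{k,n+1}(p) - (\lambda_1 + \cdots + \lambda_k). \tag{4.6} \]

The relation (4.5) may be viewed, in a sense, as an ultradiscrete analogue of the Baxter formula (Eq. (13.1.12) in [1]): $\langle \sigma_1 \rangle = \mathrm{Tr}(SABCD)/\mathrm{Tr}(ABCD)$ for the one point function in terms of CTMs.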
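As a concrete check of (4.5) and (4.6), the local states can be recovered from a table of the values $\rho_{k,d}$ by taking second differences. The sketch below uses the values of Example 4.1 and the box capacities $(2,3,1,4,1,1,1,1)$ of the first eight boxes of Example 3.1; the function name `local_states` is ours.

```python
def local_states(rho_rows, capacities):
    """Recover p_1, ..., p_L from rho via (4.5)-(4.6).

    rho_rows[k-1] = (rho_{k,1}, ..., rho_{k,n+1}); capacities = (lambda_1, ..., lambda_L)."""
    N = len(rho_rows[0])                       # N = n + 1
    ext = [[0] * (N + 1)]                      # row k = 0: rho_{0,d} = 0 for d = 0..n+1
    filled = 0
    for row, cap in zip(rho_rows, capacities):
        filled += cap
        ext.append([row[-1] - filled] + list(row))   # prepend rho_{k,0} from (4.6)
    return [
        tuple(ext[k][d] - ext[k - 1][d] - ext[k][d - 1] + ext[k - 1][d - 1]
              for d in range(1, N + 1))
        for k in range(1, len(ext))
    ]


rho_rows = [(0, 0, 0, 0), (0, 2, 2, 2), (0, 3, 3, 3), (3, 6, 9, 9),
            (5, 8, 11, 11), (7, 10, 13, 13), (9, 12, 15, 16), (12, 15, 18, 19)]
capacities = (2, 3, 1, 4, 1, 1, 1, 1)
```

The fourth recovered component is $(1, 0, 3, 0)$, i.e. the tableau 1333 of Example 3.1.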

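We use the notation

\[ \bar\rho_{k,d} = \rho_{k,d}(T_\infty(p)). \tag{4.7} \]

Thus (4.2) reads

\[ \rho_{k,1} = \bar\rho_{k,n+1}. \tag{4.8} \]

Proposition 4.2.

\[ \bar\rho_{k,d-1} + \rho_{k-1,d} = \max\bigl( \bar\rho_{k,d} + \rho_{k-1,d-1},\ \bar\rho_{k-1,d-1} + \rho_{k,d} - \lambda_k \bigr). \tag{4.9} \]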

Proof. In the time evolution $T_\infty = K_2 K_3 \cdots K_{n+1}$ (Proposition 3.2), let us calculate the effect of the operator $K_d$ on the kth box $p_k = (x_1, \ldots, x_{n+1}) \in B_{\lambda_k}$ in $K_{d+1} \cdots K_{n+1}(p)$. In the following, the fact that color $d$ balls are touched only by $K_d$ is taken into account. Suppose that the carrier contains $m$ and $m'$ balls with color $d$ just before and after the interaction $L_d$ (3.5). In (3.6) we are to set

\[ m' = \sum_{j=1}^{k} \bigl[ (\rho_{j,d} - \rho_{j-1,d} - \rho_{j,d-1} + \rho_{j-1,d-1}) - (\rho \to \bar\rho) \bigr], \qquad m = m'|_{k \to k-1}, \]
\[ y_d = x_d = \rho_{k,d} - \rho_{k-1,d} - \rho_{k,d-1} + \rho_{k-1,d-1}, \]

where we have used (4.5). As for the empty space $y_1$ concerning $L_d$ in (3.5), we show that it is given by

\[ y_1 = \lambda_k + \bar\rho_{k,d} - \bar\rho_{k-1,d} - \rho_{k,d} + \rho_{k-1,d} \quad (2 \le d \le n+1) \tag{4.10} \]

by induction on $d$ in the decreasing order $d = n+1, n, \ldots, 2$. In so doing, the bilinear relation (4.9) will be established simultaneously.

For $d = n+1$, (4.10) coincides with $x_1$ in (4.5) by (4.6) and (4.8), hence it is correct. Then the relation $m' = y_d + (m - y_1)_+$ (3.6) leads to (4.9). The new empty space is determined from $y'_1 = y_d + (y_1 - m)_+$ in (3.6) as

\[ \lambda_k + \bar\rho_{k,d-1} - \bar\rho_{k-1,d-1} - \rho_{k,d-1} + \rho_{k-1,d-1}. \]

This coincides with (4.10) with $d$ replaced by $d - 1$, making the induction proceed. □

The relation (4.9) is an ultradiscrete analogue of the Hirota bilinear equation. In view of (4.8), it determines $\rho_{k-1,1}, \rho_{k-1,2}, \ldots, \rho_{k-1,n+1}$ successively from $\{\bar\rho_{k-1,d}, \bar\rho_{k,d}, \rho_{k,d} \mid 1 \le d \le n+1\}$. Thus all the $\rho_{k,d}(T_\infty^t(p))$ are fixed uniquely from the data at sufficiently large $t$ and $k$. Then the local states are specified by (4.5). In this sense the ultradiscrete CTM $\rho_d$ achieves a bilinearization of the dynamics of the box-ball system.

4.3. Relation to energy function

Let p B1 BL be any element which is not necessarily highest. For 1 k L, we

introduce the sum:

(j +1)

(1 i n + 1),

Qi pj pm

Ei (p1 pk ) =

(4.11)

1j <mk

where Qi is the ith non-winding number (2.13) with the convention Qn+1 = Q0 . The element

(j +1)

pm

is defined by sending pm to the left by applying the combinatorial R successively as

(m1)

pm1

pj pj +1 pm1 pm pj pj +1 pm

(j +2)

pj pj +1 pm

(j +1)

p j pm

pm1

pj +1 pm1

.

Ei (p1 pk ) = Ei (u p1 pk )

(1 i n + 1),

(4.12)

where u actually means ul with sufficiently large l. Ei does not depend on such l. In fact, from

the graphical rule in Appendix B, we find Qi (ul x) = x2 + x3 + + xi if x = (x1 , . . . , xn+1 )

(1)

and x1 + + xn+1 l is satisfied. Thus writing pj = (xj,1 , . . . , xj,n+1 ), (4.12) is split into the

boundary and the bulk parts as

Ei (p1 pk ) =

k

(xj,2 + + xj,i ) + Ei (p1 pk ).

(4.13)

j =1

up to an additive constant. In

that the quantity usually called energy [17,20] is En+1 or En+1

what follows, whenever the notation u is used, it should be understood as ul with sufficiently

large l and the relevant quantity is independent of such l.

To the relation x y y x with e = Qi (x y), we assign the diagram

x

e

y

(4.14)


Let ((x1 , x2 , . . . , xn+1 )) = (x2 , x3 , . . . , x1 ) be the Dynkin diagram automorphism acting on

Bl decreasing the tableau letters cyclically by one. We extend it naturally to the tensor product

by (p1 pk ) = (p1 ) (pk ). Since the combinatorial R commutes with , the

ith non-winding number has the properties similar to the i = 0 case. In particular, under the

YangBaxter relation

a

d

e

b

=

Aff(Bk ) Aff(Bl ) Aff(Bm ) Aff(Bm ) Aff(Bl ) Aff(Bk ) for some k, l and m. If i = 0 for

instance, the associated non-winding number Q0 is related to H via (2.14), therefore by setting

a = min(k, l) a, b = min(k, m) b and c = min(l, m) c, the left-hand side represents the

following relation under the combinatorial R:

x [1 + a]

z[3 ]

x[1 ] y[2 ] z[3 ] y [2 a]

x [1 + a + b]

z [3 b]

y [2 a]

y [2 a + c]

x [1 + a + b].

z [3 b c]

Similarly, by setting d = min(l, m) d, e = min(k, m) e and f = min(k, l) f , the same

element is transformed along the right-hand side as

y [2 + d]

x [1 + e]

y [2 + d]

z [3 d e]

y [2 + d f] x [1 + e + f].

z [3 d e]

Since the YangBaxter relation is valid among the affine crystals, we obtain not only x =

x , y = y and z = z but also b + c = d + e,

a c = f d and a + b = e + f, which are

equivalent to the two relations b + c = d + e and a + b = e + f . Note that a + b + c = e + f + d

in general.

Remark 4.3. The energy is invariant under any reordering of p1 pk by the combinatorial R. Namely, Ei (p1 pk ) = Ei (p1 pk ) and Ei (p1 pk ) = Ei (p1 pk )

hold if p1 pk p1 pk by the combinatorial R. For i = n + 1 this is essentially

Proposition 3.9 in [20] and the general i case follows from the symmetry under .

Let us consider a particular diagram involving p1 pk , which is illustrated for k =

2, 3, 4. The general case is similar.

p1 p2 p3 p4

p1

p2

p1 p2 p3

Incidentally, this kind of diagrams have been known as the half twist in the construction of link

invariants [39].


Lemma 4.4. The energy Ei (p1 pk ) is the sum of the non-winding numbers Qi (as e in

(4.14)) attached to all the vertices of the corresponding diagram for p1 pk as above.

Proof. For k = 2 it is obvious. We use the definition (4.12) and illustrate the induction step along

the one from k = 3 to k = 4.

p1

p3

p2

p4

p3

p2

p1

d3

p4

d2

e1

e2

d1

e3

(j +1)

By the induction assumption, the sum of three is equal to 1j <m3 Qi (pj pm ).

(j +1)

Thus we are to verify e1 + e2 + e3 = 1j <4 Qi (pj p4

). But the YangBaxter equation

(4)

(3)

(2)

Qi (p2 p4 ), d1 = Qi (p1 p4 ). 2

Lemma 4.5. $\rho_i(p_1 \otimes \cdots \otimes p_k) - \rho_i(T_\infty(p_1 \otimes \cdots \otimes p_k)) = e_1 + \cdots + e_k$, where the $e_j$'s are the ith non-winding numbers specified by the following diagram:

u

e1

p1

e2

p2

ek

pk

k

j =1

k

1

0

0

1

xj,2

+

xj,i+1 + + xj,n+1

.

+ + xj,i

j =1

By using the graphical rule [17] explained in Appendix B, it is easy to show that the non-winding

0 + + x 0 ) + (x 1

1

number Qi (2.13) is given by ej = (xj,2

j,i

j,i+1 + + xj,n+1 ). 2

The main result in this subsection is the following, which identifies the ultradiscrete CTM $\rho_i$ (4.3) with the energy $E_i$ that originates in the crystal theory.

Proposition 4.6. $\rho_i(p_1 \otimes \cdots \otimes p_k) = E_i(p_1 \otimes \cdots \otimes p_k)$ holds for any $k$ and $1 \le i \le n+1$.

t (p) with sufficiently large t, its leftmost k components become u u

Proof. For T

1

k

due to Remark 3.4. In this case the both i and Ei are obviously zero. Therefore it suffices to

show


we are to show Ei (p1 p2 p3 ) = Ei (p1 p2 p3 ) + e1 + e2 + e3 . Recall that p1 pk

is determined by carrying u by the combinatorial R through p1 pk to the right as

u p1 pk p1 pk (). Combining this with Lemma 4.4, one can depict the

two sides as follows:

p1

p3

p2

e1

e1

p2

e2

e2 p2

e3

p3

e1

p1

e2

a

p1

e3

p3

e3

b

Ei (p1 p2 p3 )

Ei (p1 p2 p3 ) + e1 + e2 + e3

1 2 3 without loss of generality. Then the above equality is a consequence of the separate

ones e1 = 0, a = a + e2 and b + c = b + c + e3 . To see them, note that u b um ()

for any b Bm under the combinatorial R. Moreover Qi (um uj ) = 0 for any m, j . Thus

e1 = Qi (u u1 ) = 0 indeed. The other relations can also be seen by appropriately deforming

the leftmost line from u in the right diagram with the aid of the YangBaxter equation:

u

p1

p2

p3

e1

e1

p1

p3

e2

e2

e3

e3

a

d

p2

b

c

c

Comparing the lines for p2 in the left diagram here and the previous one, we find e2 + e2 + a =

e2 + a + d. Similarly, the lines for p3 in the right diagram here and the previous one lead to

e3 + e3 + b + c = e3 + b + c + d . The proof is finished by noting d = d = 0 because of

Qi (um uj ) = 0 for any m, j . 2

As a corollary of Proposition 4.6 and (4.4), one has

Ei (ul p) = Ei (p),

which can also be verified by an argument similar to the above proof.

(4.15)


Remark 4.7. Although the both i and Ei admit decompositions into the boundary and the bulk

parts as in (4.1) and (4.13), these parts are not equal separately in general. Proposition 4.6 has

also been proved by Mark Shimozono by using the technique known as katabolism (private

communication).

The energy $E_{n+1}$ (4.12) and the row transfer matrix energy $E_l$ (3.2) are related by

Proposition 4.8.

\[ E_{n+1}(p) - E_{n+1}(T_l(p)) = E_l(p). \]

For $l = \infty$ this coincides with Lemma 4.5 with $i = n+1$.

Proof. We illustrate the proof for p = p1 pk with k = 3. Consider the diagrams:

u

p1

ul

p2

p3

ul

p1

p2

p3

e1

0

d1

e1

d1

e3

d2

e2

d2

e2

e3

d3

=

a

d3

b

b

c

Here the numbers above the vertices signify the $(n + 1)$th non-winding number as in (4.14) with $i = n + 1$, and we have applied the Yang-Baxter relation to the line from $u_l$. According to Lemma 4.4, $E_{n+1}(u_l \otimes p)$ is equal to the sum of all the numbers in the left diagram. Similarly, $E_{n+1}(T_l(p))$ is obtained from the right diagram as $E_{n+1}(T_l(p)) = d_1 + d_2 + d_3 + a + b + c$. The Yang-Baxter equation tells that $e_i + d_i = e'_i + d'_i$ for $i = 1, 2, 3$. Using these facts and (4.15), we obtain $E_{n+1}(p) - E_{n+1}(T_l(p)) = E_{n+1}(u_l \otimes p) - E_{n+1}(T_l(p)) = e_1 + e_2 + e_3$, which coincides with $E_l(p)$ in (3.2). □

4.4. Proof of Theorem 2.1

Theorem 2.1 is a simple corollary of (4.5) and

Theorem 4.9. For any rigged configuration $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ and the corresponding highest state $p = p_1 \otimes \cdots \otimes p_L$ under the KKR bijection, the associated ultradiscrete tau function (2.19) and the ultradiscrete CTM (4.1), (4.3) coincide. Namely,

$\tau_i(p_1 \otimes \cdots \otimes p_k) = \rho_i(p_1 \otimes \cdots \otimes p_k)$ $(1 \le i \le n + 1,\ 1 \le k \le L)$. (4.16)

rigged configuration is obtained from that of $p$ by just changing $\mu^{(0)}$ into $\mu^{(0)} \cup (1^{L'})$. It is easily seen that $\tau_{k,i}$ and $\rho_{k,i}$ for $p'$ are the same as those for $p$ as long as $1 \le k \le L$. Thus we understand them as associated with $p'$ rather than $p$.

Our proof is based on Propositions 5.1 and 6.1, which will be established in Sections 5 and 6, respectively. Proposition 5.1 states that $\tau_i$ satisfies the same bilinear equation (4.9) as $\rho_i$.

Combined with (4.8), it determines k1,1 , k1,2 , . . . , k1,n+1 successively in this order from

{k1,i , k,i , k,i | 1 i n + 1}. Namely, the tau function on the NW corner in

k

k1

k1

the tau functions and are associated

with p and T (p ), respectively (see the beginning of Section 5), and the above diagram can be

extended to a two-dimensional square lattice with the indicated coordinates. The square at (k, t)

t (p )))n+1 .

is associated with (k,i (T

i=1

Consider the rectangular region on the lattice 0 t t0 , 1 k L + L , where the tau

functions for p constitutes the top line t = 0 of it. They are uniquely determined from the right

t (p )))n+1 | 0 t t }, and the bottom boundary t = t ,

boundary k = L + L , i.e., {(L+L ,i (T

0

0

i=1

t0

n+1

i.e., {(k,i (T (p )))i=1 | 1 k L + L }. The coincidence of i and i on these boundaries will

be proved in Proposition 6.1 by taking t0 and L sufficiently large. 2

5. Bilinear relation for $\tau_i$

Let $\tau_{k,d}$ be the ultradiscrete tau function specified in Theorem 2.1 and (2.18)-(2.20). We define $\bar\tau_{k,d}$ to be $\tau_{k,d}$ with $|s^{(1)}|$ replaced by $|s^{(1)}| + |\nu^{(1)}|$ in (2.20). In view of Proposition 3.5, this corresponds to the rigged configuration that has undergone the time evolution $T_\infty$ once.

Proposition 5.1. The substitution $\rho_{k,d} = \tau_{k,d}$ and $\bar\rho_{k,d} = \bar\tau_{k,d}$ solves the bilinear equation (4.9).

This section is devoted to the proof of Proposition 5.1 by a refinement of the approach in [26]. We invoke the free fermion construction of tau functions associated with $gl(\infty)$ [37]. For $l \in \mathbb{Z}$, set

$\tilde\tau_l(x) = \langle l|\, e^{H(x)} g \,|l\rangle, \qquad g = \exp\Bigl(\sum_{(a,i)} c^{(a)}_i\, \psi(p^{(a)}_i)\, \psi^*(q^{(a)}_i)\Bigr)$, (5.1)

where the notation is the same as Eq. (2.3) in [37] except that $\tau_l$ there is denoted by $\tilde\tau_l$ here for distinction from (2.19). ($p^{(a)}_i$ here is not the vacancy number (2.7).) The operators $\psi(k) = \sum_{j \in \mathbb{Z}} \psi_j k^j$, $\psi^*(k) = \sum_{j \in \mathbb{Z}} \psi^*_j k^{-j}$ are the free fermions. They obey the anti-commutation relations $[\psi_i, \psi_j]_+ = [\psi^*_i, \psi^*_j]_+ = 0$ and $[\psi_i, \psi^*_j]_+ = \delta_{ij}$, hence $\psi(k)^2 = \psi^*(k)^2 = 0$. $|l\rangle$ is the charge $l$ vacuum of the Fock space. $H(x) = \sum_{i \ge 1} x_i \sum_{j \in \mathbb{Z}} \psi_j \psi^*_{j+i}$ is the Hamiltonian with infinitely many time variables $x = (x_1, x_2, \ldots)$. In (5.1), we associate each triple $(c^{(a)}_i, p^{(a)}_i, q^{(a)}_i)$ with the data $(\mu^{(a)}_i, r^{(a)}_i)$ in the rigged configuration $(\lambda, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$. The sum $\sum_{(a,i)}$ extends over all the colors $1 \le a \le n$ and the rows $1 \le i \le \ell(\mu^{(a)})$. The tau function (5.1) is an $N$-soliton solution of the KP hierarchy with $N = \ell(\mu^{(1)}) + \cdots + \ell(\mu^{(n)})$.
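The nilpotency of the fermion fields quoted in this paragraph follows in one line from the anti-commutation relations. The step is written out here as a worked check; it uses only the mode expansions of the fields and the symmetry of $k^{i+j}$ under $i \leftrightarrow j$:

```latex
\psi(k)^2 = \sum_{i,j \in \mathbb{Z}} \psi_i \psi_j \, k^{i+j}
          = \frac{1}{2} \sum_{i,j \in \mathbb{Z}} [\psi_i, \psi_j]_+ \, k^{i+j} = 0,
\qquad
\psi^*(k)^2 = \frac{1}{2} \sum_{i,j \in \mathbb{Z}} [\psi^*_i, \psi^*_j]_+ \, k^{-i-j} = 0 .
```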

The time evolution of the free fermions is given by $e^{H(x)} \psi(k) e^{-H(x)} = e^{\xi(x,k)} \psi(k)$ and $e^{H(x)} \psi^*(k) e^{-H(x)} = e^{-\xi(x,k)} \psi^*(k)$ with $\xi(x, k) = \sum_{i \ge 1} x_i k^i$. Consequently,

q

(p) (q)

p

expanded as

l (zk ) =

(5.2)

l (zk ) ,

=( (1) ,..., (n) )

l (zk ) =

=

(a) (a)

c i qi

(a,i)

(a)

(a,i)<(b,j ) (pi

k

(a) l

pi

(a)

qi

(b)

(a)

j qi

(a)

j =1 j pi

(b)

(a)

pj )(qj qi )

(a)

(a,i),(b,j ) (pi

(b)

qj )

(5.3)

(5.4)

(1)

(1)

where the sum

, (n) (n) independently.

In (5.3),

(5.2) extends over the subsets , . . . (a)

(a) . In (5.4),

the product (a,i) runs over the rows of the

selected

subset

(a,i)<(b,j ) runs

over the pairs of such indices, whereas (a,i),(b,j ) simply means the double product. is the

(a)

(a)

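using the formulas:

$\langle l|\psi(p)\psi^*(q)|l\rangle = \sum_{j \le l-1} p^j q^{-j} = \Bigl(\frac{p}{q}\Bigr)^{l} \frac{q}{p - q}$,

$\langle l|\psi(p_1) \cdots \psi(p_m)\, \psi^*(q_m) \cdots \psi^*(q_1)|l\rangle = \prod_{i=1}^{m} \Bigl(\frac{p_i}{q_i}\Bigr)^{l} q_i \; \frac{\prod_{i<j} (p_i - p_j)(q_j - q_i)}{\prod_{i,j=1}^{m} (p_i - q_j)}$.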
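The product form of the $2m$-point function can be cross-checked against the two-point function. The sketch below assumes Wick's theorem in determinant form for this operator ordering (an assumption, not stated in the text), so that the $2m$-point function equals the determinant of the matrix of two-point functions; the product formula is then the Cauchy determinant evaluation. The function names and the sample values of $p_i$, $q_j$ are illustrative only.

```python
from fractions import Fraction
from itertools import permutations


def two_point(p, q, l):
    """<l| psi(p) psi*(q) |l> = (p/q)^l * q / (p - q), as an exact fraction."""
    return Fraction(p, q) ** l * Fraction(q, p - q)


def permutation_sign(perm):
    """Sign of a permutation given as a tuple of indices."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def wick_determinant(ps, qs, l):
    """det[ two_point(p_i, q_j, l) ] by the Leibniz expansion (fine for small m)."""
    m = len(ps)
    total = Fraction(0)
    for perm in permutations(range(m)):
        term = Fraction(permutation_sign(perm))
        for i in range(m):
            term *= two_point(ps[i], qs[perm[i]], l)
        total += term
    return total


def product_formula(ps, qs, l):
    """Right-hand side of the 2m-point formula quoted in the text."""
    m = len(ps)
    value = Fraction(1)
    for p, q in zip(ps, qs):
        value *= Fraction(p, q) ** l * q
    for i in range(m):
        for j in range(i + 1, m):
            value *= (ps[i] - ps[j]) * (qs[j] - qs[i])
    for p in ps:
        for q in qs:
            value /= (p - q)
    return value


# Hypothetical sample parameters; any integers with p_i != q_j will do.
sample_p = [7, 5, 3]
sample_q = [2, 4, 6]
agree = all(
    wick_determinant(sample_p, sample_q, l) == product_formula(sample_p, sample_q, l)
    for l in (-1, 0, 1, 2)
)
```

Exact rational arithmetic is used so that the comparison is an identity check rather than a floating-point approximation.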

Now we make a special choice of the parameters that further reflects the rigged configuration $(\lambda, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$. Fixing $d \in \{2, \ldots, n + 1\}$ we set

(a)

(a)

i

i

(a)

(a)

(a+1)

=

+ i exp

,

qi =

,

(a)

(a) exp 2(a)

i +ri

if a {1, d},

(a) (a)

i

ci q i =

(a)

(a)

+r

(a)

otherwise,

i exp i i

j

j = (1) + j exp

,

(a)

pi

(a)

(a)

i exp

(a)

(5.5)

(5.6)

(5.7)

(hence distinct) parameters such that

(1) > > (d1) > (d) = 0 > (d+1) > > (n+1) ,

(5.8)

(a)

(a)

i

> 0,

(a)

i

> 0,

> 0.

(5.9)


Lemma 5.2. Set q = e1/ . In the limit q 0, the summand l (zk ) (5.3) of the tau function has

the following behavior:

(1)

(d)

(d1) || (d) |)

l (zk ) = q c(,s)+| |+| |l(|

+ O(q) , > 0,

(d)

(d1) || (d) |)

l zk + (1)1 = q c(,s)+| |l(|

(5.10)

+ O(q) , > 0,

where and are independent of . c(, s) is defined by (2.20) with = (1 , . . . , k ).

UD

that A = A0 q a + higher order terms in q for some leading coefficient A0 (= 0). (A0 still can

UD

depend on as long as A0 0 although it is not needed in our case.) We let the relation A B

mean lim+0 log A = lim+0 log B.

(a)

(a)

(a)

(a)

Proof. Let {(i , si )} be the subset of the rigged configuration {(i , ri )} corresponding to

= ( (1) , . . . , (n) ) as in (2.20). We investigate the leading power of the constituent factors in

(5.3). From (5.5)(5.7) we find

(i)

(a,i)

(ii)

(a,i)

(iii)

(iv)

(a) (a) UD

c i qi

(a) l

pi

(a)

qi

a=1

p (d) l

i

(d1)

qi

k

(a)

j qi

(a)

(a,i) j =1 j pi

n

(a) (a) (1) (d)

+ s ,

(a)

pi

UD

l (d1) (d) ,

k

(1)

i j =1 j pi

(b) (b)

qj

pj

(a)

qi

UD

min , (1) ,

n

(a)

pi

(a) 2

pj

a=1 i<j

(a,i)<(b,j )

UD

n

min (a) , (a) + (a) ,

a=1

(v)

(a)

pi

(b) 1

qj

(vi)

(a)

pi

(a1) 1 UD

qj

a=2 i,j

(a,i),(b,j )

(1) q (a)

i

(a)

(1)

pi

(a,i)

n

1

(1)

(1) pi

n

min (a) , (a1) ,

a=2

UD

(1) ,

where | (n+1) | = 0 and the notation (2.3) is used. The contributions (i)(v) sum up to c(, s)

| (1) | | (d) | + l(| (d1) | | (d) |). This verifies the leading power of l (zk ) in (5.10). Similarly,

the one for l (zk + ( (1)1 )) is derived by including the contribution from (vi).

The remaining task is to check the positivity and -independence of the leading coefficients

and . We first illustrate them along l (zk ) . In the right-hand side of (5.3), we show the

positivity individually for the constituent factors (i), (ii), (iii) and = (iv) (v) considered in

(a)

the above. The leading coefficient from (i) is (a,i) i by (5.6), which is positive due to (5.9).


(d1) (d+1)

i

(d)

(d1)

i

j j

1an

a=d1,d

(a+1)

(a)

( (a) )

,

where the products on i and j extend over the selected rows in (d1) and (d) , respectively. The

symbol ( (a) ) denotes the length of (a) as defined in (2.1). This is positive thanks to (5.8) and

(5.9). The leading coefficient from (iii) with a fixed j is equal to the one from

i

(1) (2)

(1)

j q j + i q i

(1)

(1) (a+1)

.

(1) (a)

(a,i)

a2

It is positive by (5.8) and (5.9). The leading coefficients from (iv) and (v) are respectively equal

to those in

n

(a)

(a)

(a)

(a)

j q j i q i

2

a=1 i<j

( (a) )( (b) )

(a) (b) (b+1) (a+1)

,

1a<bn

n

(a)

(a)

(a1)

i q i j

(a1)

j

a=2 i,j

1

(a) (b+1)

1a,bn

a=b+1

In view

of (5.8) and (5.9), the coefficients are both positive apart from the same sign factor

(a)

(b)

(1) 1a<bn ( )( ) . Thus the leading coefficient from the product (iv) (v) is positive.

For l (zk + ( (1)1 )), the leading positivity > 0 is proved similarly. The only necessary

modification is to include the contribution from (vi):

n

(1) (a+1) )( (a) )

a=1 (

,

(1) n

(1) (a) )( (a) )

a=2 (

i i

which is again positive due to (5.8) and (5.9). Finally, and are -independent as they are

rational functions of the parameters appearing in (5.8) and (5.9) only. 2

Lemma 5.3.

UD

0 (zk ) k,d ,

UD

0 zk + (1)1 k,d ,

UD

1 (zk ) k,d1 ,

UD

1 zk + (1)1 k,d1 .

lim log 0 zk +

+0

(1)1

c(,s)+| (d) |

q

= lim log

,

+0

(5.11)

where q = e1/ and (5.10) has been substituted. Lemma 5.2 furthermore tells that there is no

cancellation in the -sum here because of > 0. Therefore the limit tends to max {c(, s)

| (d) |} = k,d . See the definitions of k () (2.19) and k,d (2.21). The other limits are confirmed

similarly. □
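The mechanism behind this limit can be illustrated numerically. The sketch below is not the tau function itself: it takes a hypothetical finite sum $\sum_\nu \gamma_\nu e^{a_\nu/\varepsilon}$ with positive coefficients and shows that $\varepsilon \log$ of it approaches $\max_\nu a_\nu$ as $\varepsilon \to +0$. Positivity of the coefficients is what excludes cancellation of the leading terms. The function name and sample data are assumptions made for illustration.

```python
import math


def eps_log_sum(terms, eps):
    """Return eps * log( sum_i gamma_i * exp(a_i / eps) ) for gamma_i > 0.

    `terms` is a list of (gamma_i, a_i) pairs. The largest exponent is
    factored out first so that exp() never overflows for small eps.
    """
    if any(gamma <= 0 for gamma, _ in terms):
        raise ValueError("all leading coefficients must be positive")
    top = max(a for _, a in terms)
    scaled = sum(gamma * math.exp((a - top) / eps) for gamma, a in terms)
    return top + eps * math.log(scaled)


# Hypothetical exponents and positive coefficients.
sample_terms = [(2.0, 3), (0.5, 5), (7.0, -1)]
values = [eps_log_sum(sample_terms, eps) for eps in (1.0, 0.1, 0.01, 0.001)]
```

As `eps` decreases, the entries of `values` approach the largest exponent in `sample_terms`, with an error of order `eps`.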


Proof of Proposition 5.1. It is well known that l satisfies the bilinear equation:

1

1 0 z + 1 + 1 1 z + 1

+ 1 1 0 z + 1 + 1 1 z + 1

+ 1 1 0 z + 1 + 1 1 z + 1 = 0.

This is derived by setting x = z + ( 1 ) + ( 1 ) + ( 1 ), x = z and (l, l ) = (0, 1) in

Eq. (2.4)l,l in p. 956 of [37]. Setting

1

,

= k ,

= ,

x = zk1 = 11 + + k1

= (1) ,

we get

k 0 zk1 + (1)1 1 (zk )

= (1) 0 (zk )1 zk1 + (1)1 + k ek / 0 zk + (1)1 1 (zk1 ),

where k (1) has been evaluated by (5.7). In view of (5.8) and (5.9), the coefficients k , (1)

and k here are all positive and -independent. Moreover from Lemma 5.2, there is no cancellation of the leading terms coming from the two terms on the right-hand side. Therefore by taking

the UD limit lim+0 log() of the two sides and applying Lemma 5.3, we obtain

k1,d + k,d1 = max(k,d + k1,d1 , k,d + k1,d1 k ).

(5.12)

This coincides with (4.9) with $\rho$ replaced by $\tau$. Note that the ranges $2 \le d \le n + 1$ for both also match. This completes the proof of Proposition 5.1. □

Let us compare the results in this section with the similar ones in Section IV of [26]. In [26], the tau function is supposed to fulfill the periodicity $\tau_{k,d} = \tau_{k,d+n+1}$ in the present notation. This led to a reduction condition (Proposition 4.4 in [26]) on each pair of the parameters $(p^{(a)}_i, q^{(a)}_i)$ in (5.1), restricting the class of tau functions captured in the UD limit. In our approach, reduction conditions are bypassed by the special choice of the parameters (5.5)-(5.9) depending on the $d$ that enters the bilinear equation (5.12) to be proved. As will turn out in Section 7.3, the ultradiscrete tau functions derived here cover all the solutions of the box-ball system.

6. Asymptotic coincidence of $\tau_i$ and $\rho_i$

6.1. Statement and its reduction

In this section we prove

Proposition 6.1. Given a highest path $p$ with length $L$, set $p' = p \otimes 1^{\otimes L'}$ and $k_0 = L + L'$. Then the equalities ($1 \le i \le n + 1$)

$\tau_{k,i}(T_\infty^{t_0}(p')) = \rho_{k,i}(T_\infty^{t_0}(p'))$ $(1 \le k \le k_0)$, (6.1)

$\tau_{k_0,i}(T_\infty^{t}(p')) = \rho_{k_0,i}(T_\infty^{t}(p'))$ $(0 \le t \le t_0)$ (6.2)

hold if $t_0 \gg 1$ in (6.1), and if furthermore $k_0 \gg L t_0$ in (6.2).

Combined with Proposition 5.1, it establishes Theorem 4.9 and thereby completes the proof of Theorem 2.1. Let $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ be the rigged configuration corresponding to $T_\infty^{t_0}(p')$. Without loss of generality we assume $\mu^{(1)}_1 \le \mu^{(1)}_2 \le \cdots$. Moreover, from the condition $t_0 \gg 1$ and Proposition 3.5, we assume

$1 \ll r^{(1)}_1 \le r^{(1)}_2 \le r^{(1)}_3 \le \cdots, \qquad r^{(1)}_i \ll r^{(1)}_j$ if $\mu^{(1)}_i < \mu^{(1)}_j$ (6.3)

throughout this section. From Remark 3.4 and $k_0 \gg L t_0$, the state $T_\infty^{t_0}(p')$ takes the form:

t

a1

t0

T

(p ) = u1

b1

uL 1 1 ( ) 1 1,

(6.4)

path.

Lemma 6.2. Under the same condition as Proposition 6.1, the following relation holds:

$\tau_{k_0,i}(T_\infty^{t}(p')) - \tau_{k_0,i}(T_\infty^{t+1}(p')) = \rho_{k_0,i}(T_\infty^{t}(p')) - \rho_{k_0,i}(T_\infty^{t+1}(p'))$ $(0 \le t \le t_0 - 1)$.

Proof. Suppose ((0) , ((1) , r (1) ), . . . , ((n) , r (n) )) is the rigged configuration for T0 (p ). From

t (p ))

the definition (4.1) and the assumed situation (6.4), it is easily seen that k0 ,i (T

t+1

k0 ,i (T (p )) is the number of balls with colors 2, . . . , n + 1 contained in p, which is

t (p ))

t+1

equal to |(1) |. To calculate k0 ,i (T

k0 ,i (T (p )), we apply the formula (2.23).

t (p )) is obtained by replacing there with (1k0 L ) and r (1) with r (1) (t t)(1)

k0 ,i (T

0

i

i

i

by Proposition 3.5. Then the max contains k0 only via min( (1k0 L ), ), hence one can

t (p )) =

let it be achieved at = (1) by taking k0 sufficiently large. Consequently k0 ,i (T

(1)

min( (1k0 L ), (1) ) min((1) , (1) ) |r (1) | + (t0 t)|(1) | + i ((1) ) for any 0 t t0

t (p ))

t+1

(1)

as long as k0 Lt0 . Therefore k0 ,i (T

k0 ,i (T (p )) = | | in agreement with

t

t+1

k0 ,i (T (p )) k0 ,i (T (p )). 2

t

By Lemma 6.2, (6.2) reduces to its $t = t_0$ case, which is the $k = k_0$ case of (6.1). Thus the proof of Proposition 6.1 reduces to showing (6.1), on which we shall concentrate from now on.

Lemma 6.3. For 1 k L, k,i (T0 (p )) = k,i (T0 (p )) = 0. For L < k k0 , the following

relations hold:

t

0 (p ) =

k,i T

(6.5)

kL,i (p),

t

0

(6.6)

t

Proof. For k,i , the assertion is obvious from (6.4) and the definition (4.1). As for k,i ,

t

we use the expression (2.23) for T0 (p ) which corresponds to the rigged configuration

(0)

(1)

(1)

(n)

(n)

( , ( , r ), . . . , ( , r )).

(1)

i ( ) = max min(, ) min(, ) |s| + i () , (0) .

(6.7)

(1)


(1)

(1)

According to (6.4), we have (0) = (1L ). From Proposition 3.5, we know that ri = i t0 +

(1)

(1)

ri , where ri is the rigging for p . (This is also equal to the rigging for p in Proposition 6.1

although this fact is not used below.) Thus t0 enters (6.7) only via |s| = ||t0 + |s |, where |s |

is t0 -independent. Fixing = (1 , . . . , k ) with 1 k L and taking t0 sufficiently large, we

t

(1)

see that the maximum (6.7) forces the choice = . This yields k,i (T0 (p )) = i () = 0 for

1 k L, where the latter equality is due to Lemma 2.3.

The maximum can be different from 0 for L < k k0 , where we are allowed to take k so large

up to k0 depending on t0 . This corresponds to the situation (6.6), which will be considered in the

sequel. To compute the right-hand side of (6.6) by (6.7), we need to know the rigged configuration

t

for p.

In view of (6.4), it is obtained from the one ((0) , ((1) , r (1) ), . . . , ((n) , r (n) )) for T0 (p )

(1)

(1)

(1)

(1)

L

by replacing (0) with (1L ) and the rigging ri with ri = ri j =1 min(j , i ). See

Lemma C.3. This amounts to changing |s| in (6.7) to |s| min(, ) in the notation (2.3). Thus

by setting = (1kL ), we get

= max min 1kL , min(, ) |s| min(, ) + i(1) ()

kL,i (p)

(1)

(1)

= max min 1kL , min(, ) |s| + i () .

(1)

by (6.7). 2

(0)

(0)

k

To summarize so far, we have reduced Proposition 6.1 to (6.1) for p such that p B1 0 .

Resetting the meaning of p, p , p,

L, L and k0 , we restate it as

Proposition 6.4. Let $p \in B_1^{\otimes L}$ be a highest path and $((1^L), (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ be its rigged configuration with $\mu^{(1)}_1 \le \mu^{(1)}_2 \le \cdots$. If $L$ is sufficiently large and the condition (6.3) is satisfied, the equality

$\tau_{k,i}(p) = \rho_{k,i}(p)$ (6.8)

is valid for $1 \le k \le L$, $2 \le i \le n + 1$.

A highest path $p \in B_1^{\otimes L}$ satisfying the assumption of Proposition 6.4 will be called an asymptotic state. We have excluded the $i = 1$ case since it is contained as the $i = n + 1$ case of $T_\infty(p)$, which is also an asymptotic state. See (2.19), Proposition 3.5 and (4.2). The remainder of this section is devoted to the proof of Proposition 6.4. Our strategy is to express both sides of (6.8) in terms of the quantities associated with the smaller algebra $A^{(1)}_{n-1}$ and invoke induction with respect to $n$. Note that the induction allows us to use Theorem 2.2 with $1 \le a \le n - 1$ and Theorem 4.9 for $A^{(1)}_{n-1}$.

6.2. Precise description of asymptotic states

The KKR bijection from rigged configurations to highest paths is known to be equivalent with

the vertex operator construction [34,35]. Here we utilize the notions in the latter formalism such

as scattering data and normal ordering explained in Appendix D. In particular, we remark that a


2

2

normal ordered if and only if 1 N .

Lemma 6.5. For an asymptotic state p, denote any successive tensor product components of the

normal ordered scattering data by

d2

d1

(6.9)

2

2 a1 a2 alA n + 1,

2

2 b1 b2 blB n + 1.

= a1 a2 . . . alA BlA ,

= b1 b2 . . . blB BlB ,

d1 d2

11blB b2 b1 11 1 alA a2 a1 11 .

(6.10)

Proof. Since p is an asymptotic state, we have lB lA . We divide the proof into two cases.

Case 1. Assume lB < lA . From the definition of the modes of scattering data (D.3), we have

lB d1 d2 for asymptotic states. Therefore, the calculation of the vertex operator goes as (see

around (E.2) for the explanation of B )

d1 d2

B (11 1 alA a2 a1 )

d1 d2 lB

= blB b2 b1 11 1 TlB (alA a2 a1 )

d1 d2

= blB b2 b1 11 1 alA a2 a1 ,

where $T_l$ is the time evolution of the box-ball system with a carrier of capacity $l$ (3.1).

Case 2. Next, consider the case lB = lA = l. Let the energy function be H = H (B A). Applying

the definition of the mode (D.3) to (6.9), we have d2 = l +rB +h and d1 = l +rA +h+H (B A),

where rA , rB are the riggings for A, B and h denotes the last term in (D.3) for d2 here. Since the

asymptotic state satisfies the condition (6.3), we have rB rA , leading to H d1 d2 (=: ). If

l, the proof is the same as Case 1. Therefore assume H < l in the following. Calculating

the action of B , we arrive at the following situation:

1

1 1

al a2

a1

al a2

1

B

B

bl

bl1 bl+1

a1


where B = 1 b1 b2 bl . The diagram says that B al a ( ) under the combinatorial R. Let us show that a = bl . For the purpose, we first claim bl < al . In fact, suppose

bl al on the contrary. We construct the pairs for B A according to the graphical rule in

Appendix B to compute H = H (B A). We know that there are H winding pairs irrespective of

the ways of making pairs. Since bi is weakly increasing with respect to i, we see that more than

+ 1(> H ) is satisfy bi al . On the other hand, al is the largest letter in A, therefore all the

letters in B greater than al have to constitute winding pairs, and we have seen that the number of

these winding pairs is greater than H . This is a contradiction. Therefore we obtain bl < al .

We have seen that bl < al , and we know that bl is the largest number in B . When

we construct the pairs for B al , this fact means that bl and al form an unwinding pair.

Therefore, the action of the combinatorial R is given by B al a ( ) with a = bl . By

continuing the same argument, we arrive at (6.10). 2

In what follows we use the notation explained in Section 2.1.

(1)

Lemma 6.6. Suppose that Proposition 6.4 is true for An1 . For a rigged configuration

((1L ), ((1) , r (1) ), . . . , ((n) , r (n) )) with (1) = (1 , . . . , N ) and r (1) = (r1 , . . . , rN ), assume

2

1 N . Then the corresponding scattering data b1 [d1 ] bN [dN ] Aff(B1 )

2

Aff(BN ) is given by

(1)

bM = (x2 , . . . , xn+1 ),

(1)

(1)

(1)

(1)

dM = |[M] | + rM + M1,n+1

(6.11)

(1)

M,n+1 .

(6.12)

This lemma is shown without assuming that the scattering data b1 [d1 ] bN [dN ] is

normal ordered.

(1)

Proof. From the arguments in Section 6.1, the assumption makes Theorem 2.1 for An1 valid.

Then (6.11) is a corollary of Theorem 2.2 with a = 1. According to the definition (D.3), the mode

dM is given by

(j +1)

dM = M + rM +

(6.13)

.

H bj bM

1j <M

On the other hand, combining (2.14) and (4.13) with i = n + 1, we have En+1 (b1 bM ) =

(j +1)

)). (Since b1 bM is An1 -highest, the first

1j <mM (min(j , m ) H (bj bm

term in (4.13) vanishes.) We know En+1 (b1 bM ) = n+1 (b1 bM ) by Proposi(1)

tion 4.6. Moreover, since Proposition 6.4 for An1 is assumed, we are allowed to use Theorem 4.9

(1)

(1)

(j +1)

(1)

M,n+1 =

min(j , m ) H bj bm

.

(6.14)

1j <mM

Given a rigged configuration ((1L ), ((1) , r (1) ), . . . , ((n) , r (n) )) with (1) = (1 , . . . , N )

and r (1) = (r1 , . . . , rN ), we introduce the numbers

(1)

(1)

(6.15)

for $1 \le M \le N$, $1 \le i \le n + 1$.

Lemma 6.7. Suppose that Proposition 6.4 is true for $A^{(1)}_{n-1}$. Let $((1^L), (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ be a rigged configuration for an asymptotic state. Set $\mu^{(1)} = (\mu_1, \ldots, \mu_N)$ with $\mu_1 \le \cdots \le \mu_N$. Then the following relations are valid:

$k_{M,n+1} \le k_{M,n} \le \cdots \le k_{M,2} \le k_{M,1}$ $(1 \le M \le N)$, (6.16)

$k_{M,1} \le k_{M+1,n+1}$ $(1 \le M \le N - 1)$, (6.17)

$k_{M,1} - k_{M,n+1} = \mu_M$ $(1 \le M \le N)$. (6.18)

Proof. By the assumption we may use Lemma 6.6. The scattering data b1 [d1 ] bN [dN ]

considered there should be understood as a normal ordered one here because we deal with an

asymptotic state and assume 1 N . See the remark before Lemma 6.5. From the de(1)

(1)

(1)

(1)

M,i1

M1,i

+ M,i

for 2 i n + 1. This

finition (6.15), kM,i1 kM,i = M1,i1

is equal to xi in (6.11) hence nonnegative, proving (6.16). Summing this over 2 i n + 1

we get (6.18). Comparing (6.12) and (6.15), we have kM,n+1 = dM + |[M1] |. Therefore

kM+1,n+1 kM,1 = kM+1,n+1 kM,n+1 M = dM+1 dM . Since di s are the modes of normal

ordered scattering data, this is nonnegative, showing (6.17). 2

Now we are ready to determine the precise form of asymptotic states from the associated

rigged configurations.

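Lemma 6.8. Suppose that Proposition 6.4 is true for $A^{(1)}_{n-1}$. For an asymptotic state $p$, let $((1^L), (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ be its rigged configuration and $\mu^{(1)} = (\mu_1, \ldots, \mu_N)$ with $\mu_1 \le \cdots \le \mu_N$. Then $p = p_1 \otimes \cdots \otimes p_L \in B_1^{\otimes L}$ is given by

$p_k = \begin{cases} i & k_{M,i} < k \le k_{M,i-1} \quad (2 \le i \le n + 1,\ 1 \le M \le N), \\ 1 & k_{M,1} < k \le k_{M+1,n+1} \quad (0 \le M \le N), \end{cases}$ (6.19)

where $k_{0,1} = 0$, $k_{N+1,n+1} = L$. Namely $p$ has the form:

$11 \cdots 11 (b_1) 11 \cdots 11 \cdots (b_M) 11 \cdots 11 (b_{M+1}) 11 \cdots 11 \cdots (b_N) 11 \cdots 11$, (6.20)

where the segment $(b_M) \in B_1^{\otimes \mu_M}$ (soliton) looks as

$\underbrace{n{+}1, \ldots, n{+}1}_{k_{M,n+1} < k \le k_{M,n}},\ \underbrace{n, \ldots, n}_{k_{M,n} < k \le k_{M,n-1}},\ \ldots,\ \underbrace{i, \ldots, i}_{k_{M,i} < k \le k_{M,i-1}},\ \ldots,\ \underbrace{2, \ldots, 2}_{k_{M,2} < k \le k_{M,1}}$ (6.21)

Note that Lemma 6.7 guarantees that the regions of $k$ appearing in (6.19) form a disjoint union decomposition of $1 \le k \le L$.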
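The rule (6.19) is straightforward to turn into a procedure. The sketch below builds the letters $p_1, \ldots, p_L$ from a table of the numbers $k_{M,i}$, checking the ordering relations (6.16) and (6.17) on the way. The data layout (`k[M-1][i-1]` holding $k_{M,i}$) and the sample values are assumptions made for this illustration; the $k_{M,i}$ themselves would come from (6.15).

```python
def build_asymptotic_path(k, L, n):
    """Return [p_1, ..., p_L] according to (6.19).

    k[M-1][i-1] holds k_{M,i} for 1 <= M <= N and 1 <= i <= n + 1.
    Letter 1 is an empty box; letters 2, ..., n + 1 are ball colors.
    """
    path = [1] * L
    previous_right_end = 0  # plays the role of k_{0,1} = 0
    for row in k:
        if len(row) != n + 1:
            raise ValueError("each soliton needs n + 1 numbers k_{M,i}")
        # (6.16): k_{M,n+1} <= k_{M,n} <= ... <= k_{M,1}
        if any(row[i] > row[i - 1] for i in range(1, n + 1)):
            raise ValueError("relation (6.16) is violated")
        # (6.17): the previous soliton ends before this one starts
        if previous_right_end > row[n]:
            raise ValueError("relation (6.17) is violated")
        for letter in range(2, n + 2):
            # positions with k_{M,letter} < pos <= k_{M,letter-1}
            for pos in range(row[letter - 1] + 1, row[letter - 2] + 1):
                path[pos - 1] = letter
        previous_right_end = row[0]
    if previous_right_end > L:
        raise ValueError("the last soliton does not fit into length L")
    return path


# One soliton 44332 for n = 3: k_{M,1..4} = 7, 6, 4, 2, so k_{M,1} - k_{M,4} = 5.
single = build_asymptotic_path([[7, 6, 4, 2]], L=10, n=3)
# Two solitons of amplitudes 2 and 4 (amplitudes increase to the right).
double = build_asymptotic_path([[3, 2, 2, 1], [9, 8, 6, 5]], L=12, n=3)
```

In both samples the amplitude of each soliton equals $k_{M,1} - k_{M,n+1}$, in line with (6.18).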

Proof. By the assumption we may use Lemmas 6.6 and 6.7. In particular we use the notation xi

and dM in Lemma 6.6. Lemma 6.5 tells that p indeed has the form (6.20). The segment (bM )

xn+1

x2

has the left end at k = dM + |[M1] | + 1 and is arranged as n + 1 n + 1 2 2 with

xi specified by (6.11). From the proof of Lemma 6.7, we find that k = kM,n+1 + 1 and xi =

kM,i1 kM,i . Therefore it looks as (6.21). 2


(1)

(1)

Lemma 6.9. Suppose that Proposition 6.4 is true for An1 . If ((1L ), ((1) , r (1) ), . . . , ((n) , r (n) ))

is a rigged configuration for an asymptotic state with (1) = (1 , . . . , N ), r (1) = (r1 , . . . , rN ),

(1 N ), the associated tau function is given by

(1)

(6.22)

Proof. From (2.23) we know

(1)

k,i = max ()k min(, ) |s| + i () .

(6.23)

(1)

Since s is the rigging attached to and runs over the subset of r (1) that satisfies the asymptotic

condition (6.3), the choice of that attains the maximum must be of the form = [M] for some

(1)

0 M N . (We interpret [0] = .) In terms of the notation (2.24), we have i(1) ([M] ) = M,i

.

In (6.23), the quantity in {} at = [M1] and = [M] become equal if and only if

(1)

(6.24)

This yields k = kM,i (6.15). Comparing the k-dependence (M 1)k and Mk, we conclude that

= [M] gives a larger value than = [M1] if kM,i < k. Moreover we may use Lemma 6.7

by the assumption and therefore know that < kM,i < kM+1,i < . Thus we conclude that

the maximum in (6.23) is attained at = [M] for kM,i < k kM+1,i , where k,i is equal to the

left-hand side of (6.24). 2

Next we evaluate $\rho_{k,i}$ for asymptotic states.

Lemma 6.10. Under the same assumption as Lemma 6.9, $\rho_{k,i}$ for the asymptotic state is given by

$\rho_{k,i} = \tau_{k,i}$ $(1 \le k \le L,\ 2 \le i \le n + 1)$. (6.25)

Proof. By the assumption we may use Lemma 6.8, which specifies the concrete form of the asymptotic state as in (6.21). To evaluate $\rho_{k,i}(p)$ (4.1), we count only the balls of colors $2, 3, \ldots, i$ in $p$ itself and those of any color in $\{2, \ldots, n + 1\}$ in the subsequent states $T_\infty^{t}(p)$ with $t \ge 1$. From Proposition 3.5 and (6.15), the positions $k_{M,i}$ in (6.21) change as $k_{M,i} \to k_{M,i} + \mu_M$ under the time evolution. Due to $\mu_1 \le \cdots \le \mu_N$, there is no collision among the segments (solitons) $(b_M)$ in (6.20) under the time evolution. In view of these facts, the counting for $\rho_{k,i}$ within the region $k_{M,i} < k \le k_{M+1,i}$ is done as

$\rho_{k,i} = \sum_{M'=1}^{M} (k - k_{M',i})$. (6.26)

From (6.15) and Lemma 2.3, this coincides with the right-hand side of (6.22).

Example 6.11. The following figure helps to understand the counting (6.26). Consider an asymptotic state in which the $M$th soliton is $(b_M) = 44332$. Its time evolution takes the form:

[Figure: time evolution of the soliton 44332 drawn against the axes k and t, with the position k = k_{M,3} marked and a frame around the counted balls.]

Here we have omitted $\otimes$, letters 1 and the other solitons for simplicity. Then the contribution to $\rho_{k,3}$ from the $M$th soliton comes from the balls within the frame, and their number is certainly equal to $k - k_{M,3}$.
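The free motion of a single soliton used in this counting can be reproduced with a few lines of code. The sketch below assumes the usual ball-moving description of $T_\infty$, which is not restated in this section: colors are processed from $n + 1$ down to $2$, and each ball of the current color is moved once, from left to right, to the nearest empty box on its right. Under this assumed rule the soliton 44332 keeps its shape and advances by its amplitude 5 at every step.

```python
def evolve_infinite_carrier(state, n):
    """One time step under the assumed ball-moving rule for T_infinity.

    `state` is a list of letters: 1 is an empty box, 2..n+1 are ball colors.
    """
    boxes = list(state)
    for color in range(n + 1, 1, -1):  # colors n+1, n, ..., 2
        # Positions are recorded before moving, so each ball moves only once.
        positions = [i for i, letter in enumerate(boxes) if letter == color]
        for i in positions:
            target = i + 1
            while target < len(boxes) and boxes[target] != 1:
                target += 1
            if target == len(boxes):
                raise ValueError("pad the state with more empty boxes (letter 1)")
            boxes[i], boxes[target] = 1, color
    return boxes


soliton = [4, 4, 3, 3, 2]
start = soliton + [1] * 10
after_one_step = evolve_infinite_carrier(start, n=3)
after_two_steps = evolve_infinite_carrier(after_one_step, n=3)
```

Processing the largest color first is what preserves the weakly decreasing arrangement $n{+}1, \ldots, 2$ of (6.21) in this example.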

Proof of Proposition 6.4. Due to Lemma 6.10 and induction on n, it now suffices to show n = 1

case of Proposition 6.4 to complete its proof. It is Lemma 6.6 that we started relying on the n 1

case. But when n = 1, all the subsequent assertions are easily derived by only using Lemma 6.5

and the definitions of the scattering data and normal ordering in Appendix D. In particular, all

(1)

(1)

the formulas are valid by setting M,2 = 0 and M,1 = |[M] | in agreement with the definition

under (2.23). Thus (6.11) becomes bM = (x2 ) with x2 = M , and (6.12) reads dM = |[M] | + rM .

The definition (6.15) reads kM,2 = kM,1 M = min([M] , [M] ) min([M1] , [M1] ) + rM .

Using the fact that rM rM+1 for normal ordered scattering data, one can directly verify the

properties (6.16)(6.21). By using them Lemma 6.9 is shown for n = 1, and (6.22) reads k,2 =

k,1 + |[M] | = Mk min([M] , [M] ) |r[M] |. Finally (6.25) can be checked by substituting

the above kM,2 into (6.26) with i = 2. This proves n = 1 case of Proposition 6.4, therefore it is

established for any n. 2

Summary of proofs. We have finished proving Proposition 6.4. From the arguments in Section 6.1, it leads to Proposition 6.1. Combined with Proposition 5.1, Proposition 6.1 proves Theorem 4.9 as explained in Section 4.4. Combined with (4.5), Theorem 4.9 proves Theorem 2.1.

In the course of these proofs, we have identified three basic quantities by Proposition 4.6 and Theorem 4.9: the tau function $\tau_i$ (2.19), which is a piecewise linear function on the rigged configuration, the CTM for the box-ball system $\rho_i$ (4.1), and the energy $E_i$ (4.12). We rephrase it as

Theorem 6.12. For any rigged configuration $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ and the corresponding highest path $p_1 \otimes \cdots \otimes p_L \in P_+(\mu^{(0)})$, the equality

$\tau_i(p_1 \otimes \cdots \otimes p_k) = \rho_i(p_1 \otimes \cdots \otimes p_k) = E_i(p_1 \otimes \cdots \otimes p_k)$ (6.27)

is valid for $1 \le i \le n + 1$ and $1 \le k \le L$.

Note that the second equality (Proposition 4.6) has been shown even for non-highest states.

The generalization of the first equality to them will be done in Theorem 7.4. Before closing the

section we include a few immediate consequences.

Corollary 6.13. For $k = L$, Theorem 6.12 becomes

$\tau_i(p_1 \otimes \cdots \otimes p_L) = \rho_i(p_1 \otimes \cdots \otimes p_L) = E_i(p_1 \otimes \cdots \otimes p_L) = -c(\mu, r) - |\mu^{(i)}|$, (6.28)

where $c(\mu, r)$ is the value of (2.20) at the full choice $(\nu^{(a)}, s^{(a)}) = (\mu^{(a)}, r^{(a)})$, and we employ the convention $|\mu^{(n+1)}| = 0$ as in (2.19).

Proof. For $i = n + 1$, the equality $E_{n+1}(p_1 \otimes \cdots \otimes p_L) = -c(\mu, r)$ is a consequence of the known relation between the charge of rigged configurations and the energy of paths [10,22]. For general $i$, we find from (4.1) that $\rho_{n+1}(p_1 \otimes \cdots \otimes p_L) - \rho_i(p_1 \otimes \cdots \otimes p_L)$ is the number of balls with colors $i + 1, i + 2, \ldots, n + 1$ in $p_1 \otimes \cdots \otimes p_L$. By the definition of the KKR bijection, it is equal to $|\mu^{(i)}|$. □

Remark 6.14. Corollary 6.13 tells that if = (0) , the max (2.19) is attained at the full

(a)

((a) ) = min((a) , (a+1) )

choice ( (a) , s (a) ) = ((a) , r (a) ). In particular, (2.23) leads to n+1

(a+1)

Now we are able to evaluate the conserved quantity El (3.2) for highest states in terms of the

rigged configurations.

Proposition 6.15. Let $p \in P_+(\mu^{(0)})$ be the highest state corresponding to the rigged configuration $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$. Then its row transfer matrix energy $E_l(p)$ (3.2) is given by $E_l(p) = \sum_j \min(l, \mu^{(1)}_j)$, which is $E^{(1)}_l$ in (2.8).

Proof. Combining Proposition 4.8 and Theorem 6.12, we have

$E_l(p) = E_{n+1}(p) - E_{n+1}(T_l(p)) = \tau_{n+1}(\mu^{(0)}) - \tau'_{n+1}(\mu^{(0)})$.

Here, by Proposition 3.5, $\tau'_{n+1}(\mu^{(0)})$ is obtained from $\tau_{n+1}(\mu^{(0)})$ by replacing the rigging $r^{(1)}_i$ with $r'^{(1)}_i = r^{(1)}_i + \min(l, \mu^{(1)}_i)$. This amounts to changing $|s|$ in (2.23) (with $a = 0$, $d = n + 1$) into $|s| + \sum_j \min(l, \nu_j)$. On the other hand, from Remark 6.14, we know that the max in (2.23) for $\lambda = \mu^{(0)}$ is attained at $\nu = \mu^{(1)}$. Therefore the difference $\tau_{n+1}(\mu^{(0)}) - \tau'_{n+1}(\mu^{(0)})$ is equal to $\sum_j \min(l, \mu^{(1)}_j)$. □
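For the single-color case $n = 1$ the formula can be tested by direct simulation. The sketch below makes two assumptions that are not taken from this section: the time evolution $T_l$ is implemented by the standard carrier rule (load a ball while the carrier holds fewer than $l$ balls, unload one ball into each empty box otherwise), and $E_l$ is read as the number of balls the carrier loads during one sweep. With these conventions a state containing solitons of amplitudes 3 and 1 should give $E_l = \min(l, 3) + \min(l, 1)$ at every time step, including during the collision.

```python
import math

EMPTY, BALL = 1, 2  # letters of the n = 1 box-ball system


def sweep(state, capacity):
    """Move a carrier of the given capacity over the state from left to right.

    Returns (new_state, loads), where loads is the number of balls picked up.
    """
    carrier, loads, new_state = 0, 0, []
    for letter in state:
        if letter == BALL and carrier < capacity:
            carrier += 1
            loads += 1
            new_state.append(EMPTY)
        elif letter == EMPTY and carrier > 0:
            carrier -= 1
            new_state.append(BALL)
        else:
            new_state.append(letter)  # full carrier passes a ball, or nothing to unload
    if carrier:
        raise ValueError("pad the state with more empty boxes")
    return new_state, loads


def energy(state, l):
    """E_l under the stated assumption: balls loaded by the capacity-l carrier."""
    return sweep(state, l)[1]


def predicted_energy(amplitudes, l):
    """Right-hand side of Proposition 6.15 for soliton amplitudes mu^(1)."""
    return sum(min(l, m) for m in amplitudes)


# Solitons of amplitudes 3 and 1, padded with empty boxes.
state = [2, 2, 2, 1, 1, 1, 1, 2] + [1] * 8
history = [state]
for _step in range(2):
    history.append(sweep(history[-1], math.inf)[0])  # T_infinity
measured = [[energy(s, l) for l in (1, 2, 3, 4)] for s in history]
expected = [predicted_energy((3, 1), l) for l in (1, 2, 3, 4)]
```

Each row of `measured` can be compared with `expected`; agreement at all three time steps also illustrates that $E_l$ is conserved under $T_\infty$.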

Proposition 6.15 will be extended to non-highest states in Proposition 7.7.

7. N-soliton solutions of the box-ball system

As an application of Theorem 2.1, we present the solution of the initial value problem and N-soliton solutions of the box-ball system. To cope with arbitrary states, not necessarily highest, we first introduce in Section 7.1 an extension of the rigged configurations for such states, which we expect is equivalent to those studied in [23,40]. We naturally extend the domain of the tau function to them. Generalizations of Theorems 2.1, 4.9 and 6.12 to arbitrary (non-highest) states are presented in Section 7.2. Based on these results, we give the solution of the initial value problem in Section 7.3. In Section 7.4 we derive several formulas for our tau functions in terms of the parameters that specify solitons. Together with (7.13), they yield the N-soliton solution of the box-ball system. Our approach provides the general solution, which accommodates an arbitrary number and arbitrary kinds of solitons. A class of special solutions has been constructed earlier in [26].

7.1. $\tau_i$ for non-highest states

(0)

(0)

L

1

necessarily highest. Set

p = pvac p,

(7.1)

(7.2)

where (12 . . . n) for example means 1 n B1n . The rigged configuration for pvac is given

by

rcvac =

La =

1L0 , 1L1 , 0L1 , . . . , 1Ln , 0Ln ,

n

n

b min(a, b) Mb =

(b a)Mb

b=1

(7.3)

(0 a n).

(7.4)

b=a+1

(a)

configuration ((1L0 ), (1L1 ), . . . , (1Ln )) of rcvac is calculated as

a,1 L0

n

Ca,b Lb = Ma

(7.5)

b=1

for any j 1. In (7.1), one can always make the state p highest by taking M1 , . . . , Mn sufficiently

large. In fact, the choice

Ma > ma+1

(1 a n)

(7.6)

suffices, where ma denotes the total number of the letter a contained in the tableau representation

of p.

Let (,

r ) = ( (0) , ( (1) , r (1) ), . . . , ( (n) , r (n) )) be the rigged configuration for the highest

state p.

By the definition of the KKR bijection, it contains rcvac (7.3) for pvac . By this we


mean that (,

r ) can be depicted as follows (n = 3):

(0)

(1)

(2)

(3)

(0)

(1)

(2)

(3)

..

.

L2

..

.

L1

L0

Recall that (0) is not limited to a partition, therefore it is not necessarily a Young diagram.

(a) (a)

Neither (a) has been depicted so. As mentioned after (2.5), any reordering of {( i , ri )} for

each a should be understood as the same rigged configuration.

From the above rigged configuration ( (0) , ( (1) , r (1) ), . . . , ( (n) , r (n) )), we extract the data

(1)

( , r (1) ), . . . , ((n) , r (n) ) by

(a) la

1La ,

(a) = i i=1

(a) la

(a) = i i=1

,

(7.7)

la

(a)

r (a) = ri + Ma i=1

0La ,

(a) la

r (a) = ri i=1

(7.8)

(a)

for 1 a n, where la = ((a) ). The shift Ma in defining ri by (7.8) has been introduced on

account of (7.5) and the algorithm for the KKR bijection, especially Lemma C.3. As the result,

((1) , r (1) ), . . . , ((n) , r (n) ) become independent of M1 , . . . , Mn as they get large sufficiently.

Therefore the data (, r) = ((0) , ((1) , r (1) ), . . . , ((n) , r (n) )) is determined unambiguously

from p B(0) B(0) by the prescription (7.1)(7.8). We call (, r) the unrestricted

L

1

rigged configuration for p, which we expect is equivalent to the one studied in [23,40]. For

highest states, it coincides with the rigged configuration under the KKR bijection, but in gen(a)

eral ((0) , (1) , . . . , (n) ) is not necessarily a configuration. The vacancy number pj (2.7) can

become negative. The rigging r (a) Zla is no longer limited to the range (2.6) but obeys the re(a)

(a)

laxed condition ri p (a) with some non-positive lower bound. We associate the tau function

(a)

(0)

For = [k] and p = p1 pL , we will also use the notation i () = k,i = i (p1 pk )

as in (2.24).


Example 7.1. Take n = 3 and consider the non-highest state p and the highest state p as

p = 344 2 13 24 B3 B1 B2 B2 ,

p = pvac p,

pvac = 123123121 B19 ,

r ) for p is

where we have omitted in pvac . The rigged configuration (,

0

1

0

0

1

1

0

0

0

0

0

We have

(M1 , M2 , M3 ) = (1, 1, 2),

(L0 , L1 , L2 , L3 ) = (9, 5, 2, 0)

according to (7.4). Thus the definitions (7.7) and (7.8) yield the unrestricted rigged configuration

(, r) depicted as

1

0

(3)

7.2. $\tau_i = \rho_i$ for non-highest states

r ) and (, r) =

Lemma 7.2. For any element p B(0) B(0) , let pvac , La , (,

1

((0) , ((1) , r (1) ), . . . , ((n) , r (n) )) be as in (7.2)(7.8). For a fixed (0) , the tau function

associated with the rigged configuration (,

r ) is decomposed as

L

i 1 0 = i (pvac ) + L1 () + i (),

(7.9)

1

i (pvac ) = L0 L1

(7.10)

Ca,b La Lb Li

2

1a,bn

for sufficiently large M1 , M2 , . . . , Mn . Here i (pvac ) is the tau function for the rigged configuration rcvac (7.3). The last term in the right-hand side of (7.9) is the tau function (2.19) associated

with the unrestricted rigged configuration (, r).

Proof. Let us write down the left-hand side of (7.9) according to (2.19) and (2.20) as

i 1

L0

1

= max min 1L0 , (1)

Ca,b min (a) , (b)

2

a,b

s (a) (i) .


(7.11)

For M1 , M2 , . . . , Mn sufficiently large, one has L0 L1 Ln1 1. In such a circumstance, one can show that the max can be limited to those (a) (a) that contain (1La ) part

entirely. Accordingly, we set

(a) (a) ,

(a) = (a) 1La ,

|s (a) | = s (a) + Ma (a) ,

s (a) r (a) ,

taking (7.7) and (7.8) into account. Substituting these forms into (7.11) and using the formula (7.5) and min( (a) (1La ), (b) (1Lb )) = La Lb + La ( (b) ) + Lb ( (a) ) + min( (a) , (b) ),

we obtain (7.9). The expression (7.10) is derived by means of (6.28). □

A decomposition parallel to (7.9) takes place also for i .

Lemma 7.3. Under the same setting as Lemma 7.2, set p = p1 pL and take = (0)

[k]

in the notation (2.2), hence () = k. Then for M1 , M2 , . . . , Mn sufficiently large, the following

relation is valid:

i (pvac p1 pk ) = i (pvac ) + L1 k + i (p1 pk ).

L0

follows (n = 3).

(7.12)

On the top row, the length L0 part is pvac and the length k part is p1 pk . By the definition (4.1), i (pvac p1 pk ) is the number of balls with colors 2, . . . , i on the top row

and all the balls in the SW quadrant beneath it.

For M1 , . . . , Mn sufficiently large, one has L0 M1 1. Moreover from the time evolution rule in Proposition 3.2, the left segment within pvac with length L0 M1 undergoes just

a translation to the right by one lattice unit under T . Thus this segment and the hatched region containing the balls are entirely separated by the strip 11 11 of empty boxes with width

M1 1. Therefore i (pvac p1 pk ) is decomposed into the contributions from pvac


(trapezoid in the bottom left), p1 pk (hatched region) and the parallelogram in the bottom. By the definition, the first two are equal to i (pvac ) and i (p1 pk ), respectively.

The last one yields $L_1 k$ because there are $L_1$ balls in total in the left segment in $p_{\mathrm{vac}}$ with length $L_0 - M_1$. □

Now we give the generalization of Theorems 4.9 and 6.12 to arbitrary (non-highest) states.

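Theorem 7.4. For any state $p = p_1 \otimes \cdots \otimes p_L \in B_{\mu^{(0)}_1} \otimes \cdots \otimes B_{\mu^{(0)}_L}$, let $(\mu, r) = (\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ be the unrestricted rigged configuration, and let $\tau_i$ be the associated tau function. Then the equality (6.27), namely, $\tau_i(p_1 \otimes \cdots \otimes p_k) = \rho_i(p_1 \otimes \cdots \otimes p_k) = E_i(p_1 \otimes \cdots \otimes p_k)$ holds for $1 \le k \le L$.

Proof. The equality $\rho_i = E_i$ has already been shown in Proposition 4.6 for any state, and we only need to show $\tau_i = \rho_i$. Since $p_{\mathrm{vac}} \otimes p_1 \otimes \cdots \otimes p_k$ is a highest state associated with the rigged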

(0)

configuration ([k] (1L0 ), ( (1) , r (1) ), . . . , ( (n) , r (n) )), Theorem 4.9 tells that (7.12) is equal

(0)

Combining Theorem 7.4 with (4.5), we obtain a generalization of Theorem 2.1 to arbitrary

states.

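Corollary 7.5. For any element $p \in B_{\mu^{(0)}_1} \otimes \cdots \otimes B_{\mu^{(0)}_L}$, let $(\mu, r) = (\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ be the unrestricted rigged configuration. Then $p_k = (x_1, \ldots, x_{n+1}) \in B_{\mu^{(0)}_k}$ is expressed as

$$x_d = \tau_{k,d} - \tau_{k-1,d} - \tau_{k,d-1} + \tau_{k-1,d-1}$$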

7.3. N -soliton solution

To simplify the notation we write in place of (0) in this subsection. We shall exclusively

treat the states p = p1 pL B1 BL such that L is formally infinite and the

boundary condition pk = uk is satisfied for k 1. Under such a setting, the right-hand side of

the inequality (7.6) is still finite, therefore all the arguments in Sections 7.1 and 7.2 remain valid.

Our solution of the initial value problem of the box-ball system is formulated as follows.

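Theorem 7.6. For any initial state $p = p_1 \otimes p_2 \otimes \cdots \in B_{\lambda_1} \otimes B_{\lambda_2} \otimes \cdots$, let $(\mu, r) = (\lambda, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ be its unrestricted rigged configuration. Then the state after the time evolution $p'_1 \otimes p'_2 \otimes \cdots = T_{l_1} T_{l_2} \cdots T_{l_t}(p)$ is expressed as $p'_k = (x_1, \ldots, x_{n+1}) \in B_{\lambda_k}$ with

$$x_d = \tau_{k,d} - \tau_{k-1,d} - \tau_{k,d-1} + \tau_{k-1,d-1}. \tag{7.13}$$

Here $\tau_{k,d}$ is the tau function associated with $(\lambda, (\mu^{(1)}, r'^{(1)}), (\mu^{(2)}, r^{(2)}), \ldots, (\mu^{(n)}, r^{(n)}))$, where $r'^{(1)}_i = r^{(1)}_i + \sum_{j=1}^{t} \min(l_j, \mu^{(1)}_i)$.

Proof. This is a consequence of Corollary 7.5 and Proposition 3.5.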
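Theorem 7.6 linearizes the dynamics: under the time evolutions only the riggings attached to the first configuration change, each by a sum of terms $\min(l_j, \mu^{(1)}_i)$. A minimal Python sketch of that update follows. The function name, the list-based representation and the sample data are our assumptions, not notation from the text.

```python
def evolve_riggings(amplitudes, riggings, carriers):
    """Shift each rigging r_i by sum_j min(l_j, mu_i).

    amplitudes : rows mu_i of the first configuration
    riggings   : riggings r_i attached to those rows (same order)
    carriers   : indices l_1, ..., l_t of the applied time evolutions T_{l_j}
    """
    if len(amplitudes) != len(riggings):
        raise ValueError("one rigging per row is required")
    return [r + sum(min(l, mu) for l in carriers)
            for mu, r in zip(amplitudes, riggings)]


# Hypothetical data: two rows of lengths 3 and 1, evolved by T_2 and then T_5.
print(evolve_riggings([3, 1], [0, 2], [2, 5]))
```

Because the shift is a plain sum over the applied $T_{l_j}$, the result does not depend on the order in which they are applied, in line with their commutativity.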


Let us evaluate the conserved quantity El (3.2) in terms of the data (, ((1) , r (1) ), . . . ,

((n) , r (n) )).

Proposition 7.7. For any state $p = p_1 \otimes p_2 \otimes \cdots \in B_{\lambda_1} \otimes B_{\lambda_2} \otimes \cdots$, let $(\mu, r) = (\lambda, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ be its unrestricted rigged configuration. Then the row transfer matrix energy $E_l(p)$ (3.2) is given by $E_l(p) = \sum_j \min(l, \mu^{(1)}_j)$.

When $p$ is highest, this reduces to Proposition 6.15.

Proof. Let $\bar p$ be the highest state (7.1) and let $(\bar\mu, \bar r)$ be the corresponding rigged configuration. Proposition 6.15 tells that

$$E_l(\bar p) = \sum_j \min\bigl(l, \bar\mu^{(1)}_j\bigr) = \sum_j \min\bigl(l, (\mu^{(1)} \sqcup (1^{L_1}))_j\bigr) = L_1 + \sum_j \min\bigl(l, \mu^{(1)}_j\bigr),$$

where we have substituted (7.7) into $\bar\mu^{(1)}_j$. On the other hand, due to $M_1 \gg 1$ in (7.2) and the property (3.3), $E_l(\bar p)$ is decomposed as $E_l(\bar p) = E_l(p_{\mathrm{vac}}) + E_l(p)$. It is easy to check $E_l(p_{\mathrm{vac}}) = L_1$ by counting the non-winding number using the graphical rule in Appendix B. □
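Proposition 7.7 implies that the energies $E_1, E_2, \ldots$ determine the rows of $\mu^{(1)}$: the difference $E_l - E_{l-1}$ (with $E_0 = 0$) counts the rows of length at least $l$, so the differences are the column lengths of the Young diagram. The sketch below illustrates this. The function names and the sample list are ours.

```python
def row_transfer_energy(l, amplitudes):
    """E_l = sum_j min(l, mu_j) for the rows mu_j of the first configuration."""
    return sum(min(l, mu) for mu in amplitudes)


def amplitudes_from_energies(energies):
    """Recover the rows mu_j from energies = [E_1, ..., E_L].

    Assumes L is at least the largest row length, so that no column is missed.
    """
    # columns[l-1] = E_l - E_{l-1} = number of rows of length >= l
    columns = [e - prev for prev, e in zip([0] + energies[:-1], energies)]
    n_rows = columns[0] if columns else 0
    # row j has length equal to the number of columns of height >= j
    return [sum(1 for c in columns if c >= j) for j in range(1, n_rows + 1)]


mu = [3, 1, 1]                                       # hypothetical rows
energies = [row_transfer_energy(l, mu) for l in range(1, 5)]
print(energies, amplitudes_from_energies(energies))
```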

Following [26,27,31], we call those states $p$ of the box-ball system such that $E_l(p) = \sum_{j=1}^{l_1} \min(l, \mu^{(1)}_j)$ $l_1$-soliton states with amplitudes $\mu^{(1)}_1, \ldots, \mu^{(1)}_{l_1}$. Thus Proposition 7.7 tells that any state of the box-ball system is an $l_1$-soliton state for some $l_1$. Moreover, Theorem 7.6 asserts that in the unrestricted rigged configuration $(\lambda, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$, the $A^{(1)}_{n-1}$ part $(\mu^{(1)}, (\mu^{(2)}, r^{(2)}), \ldots, (\mu^{(n)}, r^{(n)}))$ is the conserved quantity among which $\mu^{(1)}$ provides the list of amplitudes of solitons. In the remainder of this section we set

$$l_1 = N, \qquad \mu^{(1)} = (\mu_1, \ldots, \mu_N), \qquad r^{(1)} = (r_1, \ldots, r_N),$$

and rewrite the tau function in terms of the parameters that specify solitons. These parameters are equivalent to the conserved quantity $(\mu, (\mu^{(2)}, r^{(2)}), \ldots, (\mu^{(n)}, r^{(n)}))$ as we will see shortly. The result yields the general $N$-soliton solution of the box-ball system, which supplements the special solution in [26].

From [27,30,31], it is known that $N$-soliton states in the $A^{(1)}_n$ box-ball system are labelled with the $A^{(1)}_{n-1}$ affine crystal $\mathrm{Aff}(B^{\ge 2}_{\mu_1}) \otimes \cdots \otimes \mathrm{Aff}(B^{\ge 2}_{\mu_N})$. The classical part $B^{\ge 2}_{\mu_1} \otimes \cdots \otimes B^{\ge 2}_{\mu_N}$ parametrizes the internal degrees of freedom of solitons. The affine part is incorporated in the integers $r_1, \ldots, r_N$, and specifies the positions of the solitons. Thus we start with any such data

$$b_1 \otimes \cdots \otimes b_N \in B^{\ge 2}_{\mu_1} \otimes \cdots \otimes B^{\ge 2}_{\mu_N}, \qquad (r_1, \ldots, r_N) \in \mathbb{Z}^N, \tag{7.14}$$

where we call each $b_i$ a soliton. Let $(\mu, (\mu^{(2)}, r^{(2)}), \ldots, (\mu^{(n)}, r^{(n)}))$ be the unrestricted rigged configuration for $b_1 \otimes \cdots \otimes b_N$. Without loss of generality we assume

1 N ,

ri rj

if i = j

(1)

and

i < j.

(1)

(7.15)

For any , let us express the An1 tau function i () associated with (, ((2) , r (2) ),

. . . , ((n) , r (n) )) in terms of b1 , . . . , bN . We parametrize (1 , . . . , N ) as = (j1 , . . . , jM )

in terms of the subset J = {j1 < < jM } {1, 2, . . . , N}. From the array of N solitons


2

2

b1 bN we extract an element in Bj1 BjM by sending the corresponding components to the left by the combinatorial R as follows:

2

2

2

2

(1)

(M)

B1 BN Bj1 BjM ( ),

b 1 bN

bj1 bjM ( ).

(7.16)

(2)

b1 b2 b3 b1 b3 ()

(1)

b2

(2)

b3

()

Obviously, the elements represented by the same symbol b3(2) in the two lines are not equal in

()

general. In this way, bj is uniquely determined only by further specifying J except = 1. In

what follows we will always take it for granted that J has been prescribed.

(1)

From Theorem 7.4 for An1 , we know

(1)

(1)

(M)

i () = Ei bj1 bjM

(2 i n + 1).

(1)

Applying the formula (4.13) for An1 to the right-hand side we get

(1)

(1)

(1)

(1)

(1)

(M)

i () =

bj,3 + bj,4 + + bj,i + Ei bj1 bjM

(2 i n + 1),

j J

(1)

(1)

(1)

where bj = (bj,2 , . . . , bj,n+1 ) is the representation in terms of the number of tableau letters

(1)

(1)

n+1 () || given just before (2.22). Substituting the above formula with i = n + 1 to this, we

find the result is unified into the single formula

(1)

(1)

(1)

i(1) () = ||

bj,i+1 + + bj,n+1 + bj,2

j J

+ Ei

(1)

(M)

bj1 bjM

(1 i n + 1),

(7.17)

(1)

(1)

(M)

(M)

bj1 bjM .

(1)

This is natural in view of the mod n structure of the indices in An1 . Similarly, the sum in (7.17)

(1)

(1)

(1)

Now we are ready to express the An tau function (2.23) associated with (, (, r), ((2) , r (2) ),

. . . , ((n) , r (n) )):

(1)

k,i = max min([k] , ) min(, ) |s| + i () (k 1, 1 i n + 1)

(7.18)

{j1 , . . . , jM } {1, . . . , N } as before, and introduce the functions:

(1)

(1)

(1)

k,i (j ) = min([k] , j ) rj bj,i+1 + + bj,n+1 + bj,2 (j J ),

(7.19)

i (J ) = 2

(1)

(M)

min(l , m ) Ei bj1 bjM ,


(7.20)

l,mJ

l<m

where min([k] , j ) = km=1 min(m , j ) according to (2.3). (To simplify the formula,

min(l , m ) has been

(7.17) into (7.18) and noting that

kept as it is despite (7.15).) Substituting

min(, ) || = 2 l,mJ,l<m min(l , m ) and |s| = j J rj , we find that k,i is expressed as

k,i = max

(7.21)

k,i (j ) i (J )

J {1,...,N }

j J

for k 1, 1 i n + 1. We introduce k,0 = k,n+1 |[k] | according to (2.18). Then by Theorem 7.6, the local states are specified by (7.13) and the time evolution Tl is given by changing rj

to rj + min(l, j ), i.e., k,i (j ) into k,i (j ) min(l, j ).
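The identity $\min(\nu, \nu) - |\nu| = 2 \sum_{l < m} \min(\mu_l, \mu_m)$ used in the substitution above follows by splitting the double sum into its diagonal terms, which add up to $|\nu|$, and its off-diagonal terms, which come in equal pairs. The sketch below checks it for every subset $J$. It assumes that (2.3) reads $\min(\lambda, \mu) = \sum_{i,j} \min(\lambda_i, \mu_j)$, which is our reading of the remark following (7.20). The function names are ours.

```python
from itertools import combinations


def pairing(lam, mu):
    """min(lam, mu) = sum over all pairs (i, j) of min(lam_i, mu_j)."""
    return sum(min(a, b) for a in lam for b in mu)


def identity_holds(mu):
    """Check min(nu, nu) - |nu| == 2 * sum_{l<m in J} min(mu_l, mu_m) for all subsets J."""
    for size in range(len(mu) + 1):
        for J in combinations(range(len(mu)), size):
            nu = [mu[j] for j in J]
            lhs = pairing(nu, nu) - sum(nu)
            rhs = 2 * sum(min(a, b) for a, b in combinations(nu, 2))
            if lhs != rhs:
                return False
    return True


print(identity_holds([4, 2, 2, 1]))   # hypothetical amplitudes
```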

Using the formula (7.21), it is easy to evaluate the local state (7.13) explicitly for $k \gg 1$ if $\lambda_k = 1$ in this region and the condition (6.3) (without the superscript (1) in the present notation) is satisfied. It yields the asymptotic state of the box-ball system well after the collisions of solitons. Omitting the derivation similar to Lemma 6.8, we give the final result:

w

11 11(b1 )11 11(bM ) 11 11(bM+1 )11 11(bN )11 11 ,

(7.22)

2

where 1 B1 and the symbol has been suppressed. For each bM = (x2 , . . . , xn+1 ) BM ,

xn+1

x2

xn

n + 1n + 1nn22.

independent of r1 , . . . , rN . Therefore if M < M+1 , we have w 1 due to rM rM+1 . In case

M = M+1 , we have

w = rM+1 rM + H (bM bM+1 ) H (bM bM+1 )

(7.23)

(1)

because of (7.15). Here H (bM bM+1 ) is the energy (2.14) for An1 crystals. It is known

(cf. [27,31]) that H (bM bM+1 ) is the minimum distance until which the solitons of the same

amplitude can get close. Therefore (7.23) is consistent with the fact that the tau function (7.21)

constructed from the data (7.14) covers all the N -soliton solutions.

Our formula (7.21) possesses a structure

analogous to the well known tau function of the KP

hierarchy [37]. For each J , the sum j J k,i (j ) is the superposition of individual solitons,

whereas the quantity i (J ) reflects a multi-body effect. A characteristic feature in k,i (j ) (7.19)

(1)

is that it contains bj in (7.16) rather than bj that appears in the asymptotic state (7.22). As for

i (J ), using the definition (4.11), it is factorized into the two-body function as

()

(+1)

i (J ) =

(7.24)

,

S i b j b j

1<M

Si (b c) = 2 min(l, m) Qi (b c)

(+1)

Here bj

()

2

b c Bl

2

.

Bm

(7.25)


()

(1)

()

(1)

()

(1)

()

(1)

bj1 bj bj bj1 bj bj

(+1)

bj1 bj bj

()

( ).

$S_i$ in (7.25) is equal to $\min(l, m)$ plus the $i$th winding number $\min(l, m) - Q_i(b \otimes c)$. For $i = n + 1$, it has been identified as the two-body phase shift of the solitons labelled with $b$ and $c$

[27,31]. Thus i (J ) can be regarded as a generalization of it to the multi-body phase shift for an

arbitrary color i.

7.4. Alternative forms of N -soliton solution

We retain the notation in the previous subsection. The N -soliton solution (7.21) has been

expressed in terms of the parameters in (7.14). Here we rewrite it further in terms of the scattering

data (Appendices D and E):

2

2

Aff b

,

b1 [d1 ] bN [dN ] Aff b

(7.26)

N

1

(k+1)

, b0 = 2l (l 1).

H bk b j

dj = rj +

(7.27)

0k<j

Our task is essentially to switch from the position (rigging) rj to the mode dj . See (2.16) for

the symbol 2l . The mode dj here is a natural generalization of the one defined by (D.3). In

fact, when b1 bN is a highest element with respect to An1 , one has bj(1) = 2j and

(1)

H (b0 b1 ) = j , hence (7.27) reduces to (D.3). The mode is transformed according to (A.1)

under the combinatorial R. The affinization of (7.16) reads

2

2

2

2

Aff B1 Aff BN Aff Bj1 Aff BjM ( ),

(1) (1)

(M) (M)

(7.28)

b1 [d1 ] bN [dN ] bj1 dj1 bjM djM ( ).

()

()

For the notation dj , the same caution as for bj is necessary as mentioned under (7.16). Ap(1)

(1)

(M)

(M)

plying the definition (7.27) to bj1 [dj1 ] bjM [djM ] in the above, we find

()

(+1)

()

H b j b j

,

dj = rj +

0<

(0)

where the notation is the same as (7.24) and we have employed the convention j0 = 0 and b0 =

2

(1)

(1)

b0 . The element b0 Bl in (7.27) is the An1 analogue of u appearing in (4.12) for An . By

(1)

using (4.11), (4.12) and (2.14) for An1 crystals, this can be rewritten as

(1)

(1)

()

()

(1)

dj = rj En+1 bj1 bj + En+1 bj1 bj1

+

min(j , j ),

0

where min(0 , j ) = j . Taking the sum over and using (4.13), we get

M

=1

()

dj =

j J

rj +

j J

(1)

(1)

(M)

bj1 bjM

bj,2 En+1

1M

min(j , j ),

(7.29)

(1)


(1)

where we have used bj,2 + + bj,n+1 = j . On the other hand, from Corollary 7.5 we deduce

M

()

()

(1)

(1)

bj ,2 + + bj ,i = i () 1 ()

=1

(1)

(1)

= i () n+1 () + ||

(1 i n + 1),

(1)

(1)

where = (j1 , . . . , jM ) as in the previous subsection. Since i = Ei for An1 by Theorem 7.4, the right-hand side here is evaluated by using (4.13), leading to

M

(1)

(1)

()

()

(1)

(M)

bj ,2 + + bj ,i =

bj,i+1 + + bj,n+1 + Ei bj1 bjM

j J

=1

(1)

En+1

bj1 bj(M)

+ ||.

M

(7.30)

k,i (j ) i (J ) = min([k] , ) +

j J

()

j

()

= d j

+ j +

M

()

()

()

j + bj ,2 + + bj ,i ,

=1

min(j , j ) = rj +

1<

(+1)

,

Sn+1 bj bj

0<

k,i =

max

J {1,...,N }

M

()

()

()

min([k] , j ) j + bj ,2 + + bj ,i

(1 i n + 1),

=1

(7.31)

where the max extends over all the subsets J = {j1 , . . . , jM } {1, . . . , N }. Compared with

(7.21), the expression (7.31) is formally free from the multi-body effect. It has been absorbed

()

into the quantity j , which is a shifted mode.

The formula (7.31) is most naturally presented in terms of the principal picture of affine

crystals rather than the conventional homogeneous one. To explain it, let us make a short

(1)

digression on the principal picture in this paragraph. Recall that an element in the affine An

crystal Aff(Bl ) is parametrized as (x1 , . . . , xn+1 )[d], where d Z and xi Z0 are to satisfy

x1 + + xn+1 = l. See (2.15). We naturally extend xi to i Z by xi+n+1 = xi . Instead of

(x1 , . . . , xn+1 )[d], the element is also parametrized as xi = i1 i and d = 0 in terms of an

infinite sequence = (i )iZ such that

i Z,

i1 i ,

i = i+n+1 + l

for all i Z.

(7.32)

xi for i 0 and i = d + x0 + x1 + + xi+1 for i < 0. We set Affp (Bl ) = { = (i )iZ | (7.32)}

and call the crystal structure induced on it the principal picture. Explicitly, it is given as follows:

(n+1)

(n+1)

ej ( ) = i i,j

,

fj ( ) = i + i,j

for = (i ),

(n+1)

where i,j = 1 if i j mod n + 1 and 0 otherwise. If the right-hand sides break the condition

i1 i in (7.32), they are to be understood as 0. The combinatorial R is especially simple in


R : Affp (Bl ) Affp (Bm ) Affp (Bm ) Affp (Bl ),

(i ) (i ) (i Si ) (i + Si ).

(7.33)

Here Si = Si+n+1 = Si (

(1)

for An with b c Bl Bm , where b and c are specified by b = (i1 i )n+1

i=1 and c =

(i1

i )n+1

.

From

(2.13),

S

reads

explicitly

as

i

i=1

Si ( ) = 2 min(l, m) i + i+n+1

min {i+k

i+k1 }

(7.34)

1kn+1

for Affp (Bl ) Affp (Bm ). Observe the compatibility between (7.33) and (2.12). Actually

for i = 0, the rule (7.33) on 0 , 0 disagrees with the changes of d, d in (A.1) under the above

mentioned identification 0 = d, 0 = d , which renders, however, no problem being merely the

discrepancy in the normalizations of the energy function. By Affp we mean the crystal structure

including the convention specified in (7.33). is a generalized phase variable of solitons.

Back to our N -soliton solution, we restart with the principal picture of the scattering data

(7.26):

2

2

1 N Affp b

(7.35)

Affp b

.

N

1

Accordingly, (7.28) reads

2

2

2

2

Affp B1 Affp BN Affp Bj1 Affp BjM ( ),

1 N

(1)

(M)

j1 jM ( ),

(7.36)

()

where, again, the notation j is unambiguous only combined with J = {j1 , . . . , jM } as cau()

()

()

()

()

2

()

()

()

2

()

()

()

()

k,i =

max

J {1,...,N }

M

()

min([k] , j ) j ,i

(1 i n + 1),

(7.37)

=1

where the max extends over all the subsets J = {j1 , . . . , jM } {1, . . . , N } as in (7.31). Note that

j()

= j()

+ j is consistent with the time evolution rule in Proposition 3.5 and k,1 (p) =

,1

,n+1

k,n+1 (T (p)) indicated by (4.2).

Finally we present an operator formalism that formally leads to (7.37) via the ultradiscretization. Let q be an indeterminate. Let A be the algebra over C[q, q 1 ] generated by the sym2

2

bols ( ), ( ) ( Affp (Bl )) that satisfy the commutation relations ( Affp (Bl ),

2

Affp (Bm )):

( ) ( ) = ( ) ( ).

Here

are related to

.

(7.38)

(7.39)


equip A with the time evolution Tl (l Z1 ):

Tl ( )Tl1 = Tl ( ) ,

Tl ( )Tl1 = Tl ( ) ,

2

Tl ( ) = i + min(l, m) for = (i ) Affp Bm

(7.40)

.

Tl is an automorphism of A since it commutes with the combinatorial R, i.e., Tl ( ) Tl ( )

Tl ( ) Tl ( ) holds under (7.39). Obviously, Tl Tm = Tm Tl is valid.

For i Z, let the bracket i : A C[q, q 1 ] be the linear form on A characterized by the

following properties:

X ( ) i = Xi ,

( )X i = q i Xi for = (i )iZ ,

1i = 1,

(7.41)

where X denotes an arbitrary element in A. We shall write Tlk XTlk i simply as Tlk Xi for

2

2

2

any k Z. As an example, let Affp (Ba ) Affp (Bb ) Affp (Bc ). Then one has

Tl ( ) + ( ) () + () () + () i

= Tl ( ) () () i + Tl ( ) () () i + Tl ( ) () () i

+ Tl ( ) () () i + Tl ( ) () () i + Tl ( ) () () i

+ Tl ( ) () () i + Tl ( ) () () i .

We need the following reordering of by the combinatorial R:

(1) (1) (2) (2) (1) .

See (7.36). As cautioned after (7.16), there are two elements (2) and (2) that are relevant to

under the choices J = {2, 3} and {1, 3}, respectively. In terms of these elements, the above

bracket is evaluated as

(1)

(1)

(2)

(1)

(2)

+i

+ q min(l,a)+min(l,b)+min(l,c)+i +i +i .

From the commutation relation (7.38), the characterization of the bracket (7.41) and the definition (7.36), it follows that the tau function (7.37) associated to the scattering data 1 N

(7.35) comes out as the ultradiscretization:

k

1

Tj (1 ) + (1 ) (N ) + (N )

+0

j =1

(1 i n + 1),

(7.42)

above example (N = 3). In each of them, the list of the positions of specifies the subset

J = {j1 , . . . , jM } {1, . . . , N} for the relevant contribution in (7.37). The timeevolution of the

tau function k,i (Tl (p)) is obtained from (7.42) by further inserting the product kj =1 T1

of the

j

automorphism (7.40).


Unlike the tau function (5.1) for the KP hierarchy, A is not the Clifford algebra and it is not

known to us whether the Laurent polynomial

k

1

Tj (1 ) + (1 ) (N ) + (N )

j =1

satisfies any sort of bilinear relations. However, the formula (7.42) is a most intrinsic way to

present our ultradiscrete tau function. It synthesizes the principal features in the theories of solitons and crystal basis, i.e., the free-fermion like structure and the combinatorial R.
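For readers less familiar with ultradiscretization, the generic mechanism is that only the dominant power of $q$ in a sum survives as $q \to +0$, so that sums turn into max and products into sums. The sketch below illustrates just this limit for a sum $\sum_j q^{-a_j}$. It does not reproduce the sign or normalization conventions of (7.42), and the function name is ours.

```python
import math


def ultradiscrete_limit(exponents, q):
    """Evaluate log(sum_j q^(-a_j)) / log(1/q) for 0 < q < 1 in a numerically stable way."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    scale = -math.log(q)                     # log(1/q) > 0
    top = max(exponents)
    # factor out the dominant term q^(-top); the remaining sum lies in [1, len(exponents)]
    tail = sum(math.exp((a - top) * scale) for a in exponents)
    return top + math.log(tail) / scale


for q in (0.5, 1e-3, 1e-30):
    print(q, ultradiscrete_limit([3, 5, -1], q))   # approaches max(3, 5, -1) as q -> +0
```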

8. Summary

In this paper we have introduced the ultradiscrete tau function and exploited several properties related to the KKR bijection and the box-ball systems.

In Section 2, $\tau_i$ is introduced in (2.18)-(2.20) as a piecewise linear function on rigged configurations. The piecewise linear formula for the KKR bijection is stated in Theorem 2.1. After a brief exposition on the box-ball system in Section 3, we have furthermore introduced $\rho_i$ and $E_i$ in Section 4. $\rho_i$ in (4.1) is the number of balls in the SW quadrant in the time evolution pattern of the box-ball system. $E_i$ defined by (4.12) and (4.11) is a sum of local energy functions in the affine crystal. The fact $\rho_i = E_i$ has been shown in Proposition 4.6. The two quantities provide analogues of the corner transfer matrix [1] from complementary viewpoints: $\rho_i$ from the box-ball system and $E_i$ from the crystal base theory. Theorem 2.1 is a consequence of the further identification $\tau_i = \rho_i = E_i$ in Theorem 6.12. Sections 5 and 6 are devoted to a proof of this fact. In Section 5, $\tau_i$ is shown to emerge as an ultradiscretization of the tau functions of the KP hierarchy (Lemma 5.3) and to satisfy the Hirota type bilinear equation (Proposition 5.1). In Section 6, $\tau_i = \rho_i$ is proved on the asymptotic states by induction on the rank (Proposition 6.1 and its reduction in Proposition 6.4). These properties are enough to establish the claim $\tau_i = \rho_i$ everywhere. Section 7 gives the generalization of Theorems 2.1 and 6.12 to arbitrary (non-highest) states. As an application, the solution of the initial value problem in the box-ball system is given in Theorem 7.6. We have also included the formulas (7.21), (7.37) and (7.42) for general $N$-soliton solutions. Curiously, they are most elegantly presented in terms of affine crystals in the principal picture introduced in Section 7.4.

Acknowledgements

The authors thank Masato Okado, Anne Schilling, Mark Shimozono and Taichiro Takagi for useful discussions. Y.Y. is supported by Grants-in-Aid for Scientific Research No. 17340047. R.S. is grateful to Miki Wadati for warm encouragement during the study. He is a research fellow of the Japan Society for the Promotion of Science.

Appendix A. Crystals and combinatorial R

The crystals Bl used in the main text are crystal bases of irreducible finite-dimensional representations of a quantum affine algebra Uq (g). Let us recall basic facts on them following [11,12].

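Let $P$ be the weight lattice, $\{\alpha_i\}_{0 \le i \le n}$ the simple roots, and $\{\Lambda_i\}_{0 \le i \le n}$ the fundamental weights of $\mathfrak{g}$. A crystal $B$ is a finite set with weight decomposition $B = \bigsqcup_{\lambda \in P} B_\lambda$. The Kashiwara operators $e_i, f_i$ ($i = 0, 1, \ldots, n$) act on $B$ as $e_i : B_\lambda \to B_{\lambda + \alpha_i} \sqcup \{0\}$, $f_i : B_\lambda \to B_{\lambda - \alpha_i} \sqcup \{0\}$. In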


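For any $b \in B$, set $\varepsilon_i(b) = \max\{m \ge 0 \mid e_i^m b \ne 0\}$ and $\varphi_i(b) = \max\{m \ge 0 \mid f_i^m b \ne 0\}$. Then we have the weight $\mathrm{wt}\, b$ of $b$ by $\mathrm{wt}\, b = \sum_{i=0}^{n} (\varphi_i(b) - \varepsilon_i(b)) \Lambda_i$.

For two crystals $B$ and $B'$, one can define the tensor product $B \otimes B' = \{b \otimes b' \mid b \in B,\ b' \in B'\}$. The operators $e_i, f_i$ act on $B \otimes B'$ by

$$e_i(b \otimes b') = \begin{cases} e_i b \otimes b' & \text{if } \varphi_i(b) \ge \varepsilon_i(b'), \\ b \otimes e_i b' & \text{if } \varphi_i(b) < \varepsilon_i(b'), \end{cases}$$

$$f_i(b \otimes b') = \begin{cases} f_i b \otimes b' & \text{if } \varphi_i(b) > \varepsilon_i(b'), \\ b \otimes f_i b' & \text{if } \varphi_i(b) \le \varepsilon_i(b'). \end{cases}$$

Here $0 \otimes b'$ and $b \otimes 0$ should be understood as 0. For crystals we are considering, there exists a unique isomorphism $B \otimes B' \simeq B' \otimes B$, i.e., a unique map which commutes with the action of Kashiwara operators. In particular, it preserves the weight.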
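The tensor product rule can be tried out on the crystals $B_l$ of the main text. The sketch below is a minimal illustration under assumptions that are not stated in this appendix: an element of $B_l$ is stored as the tuple $(x_1, \ldots, x_{n+1})$ of letter multiplicities, $f_i$ changes one letter $i$ into $i+1$ and $e_i$ does the reverse (indices mod $n+1$, so $f_0$ changes $n+1$ into 1), and `None` plays the role of 0. All function names are ours.

```python
def eps(i, x):
    """epsilon_i(x): number of letters i+1 (letter 1 when i = 0)."""
    return x[i % len(x)]


def phi(i, x):
    """phi_i(x): number of letters i (letter n+1 when i = 0)."""
    return x[(i - 1) % len(x)]


def f(i, x):
    """f_i on B_l: turn one letter i into i+1; None if there is no letter i."""
    if phi(i, x) == 0:
        return None
    y = list(x)
    y[(i - 1) % len(x)] -= 1
    y[i % len(x)] += 1
    return tuple(y)


def e(i, x):
    """e_i on B_l: turn one letter i+1 into i; None if there is no letter i+1."""
    if eps(i, x) == 0:
        return None
    y = list(x)
    y[i % len(x)] -= 1
    y[(i - 1) % len(x)] += 1
    return tuple(y)


def f_tensor(i, b, c):
    """f_i(b (x) c): act on b if phi_i(b) > eps_i(c), otherwise on c."""
    if phi(i, b) > eps(i, c):
        return (f(i, b), c)
    fc = f(i, c)
    return None if fc is None else (b, fc)


def e_tensor(i, b, c):
    """e_i(b (x) c): act on b if phi_i(b) >= eps_i(c), otherwise on c."""
    if phi(i, b) >= eps(i, c):
        eb = e(i, b)
        return None if eb is None else (eb, c)
    return (b, e(i, c))


# n = 1: the f_1-string through 11 (x) 1 in B_2 (x) B_1
state = ((2, 0), (1, 0))
while state is not None:
    print(state)
    state = f_tensor(1, *state)
```

For $n = 1$ this prints the string $11 \otimes 1 \to 12 \otimes 1 \to 22 \otimes 1 \to 22 \otimes 2$, in which $f_1$ acts on the left factor until $\varphi_1$ of the left factor no longer exceeds $\varepsilon_1$ of the right one.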

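For a crystal $B$ we define its affinization $\mathrm{Aff}(B) = \{b[d] \mid d \in \mathbb{Z},\ b \in B\}$ by $e_i(b[d]) = (e_i b)[d - \delta_{i0}]$ and $f_i(b[d]) = (f_i b)[d + \delta_{i0}]$. (b[d] here corresponds to T d af (b) in [12].) The isomorphism $B \otimes B' \simeq B' \otimes B$ lifts to a map $\mathrm{Aff}(B) \otimes \mathrm{Aff}(B') \to \mathrm{Aff}(B') \otimes \mathrm{Aff}(B)$ called the combinatorial $R$. It has the following form:

$$R : \mathrm{Aff}(B) \otimes \mathrm{Aff}(B') \to \mathrm{Aff}(B') \otimes \mathrm{Aff}(B), \qquad b[d] \otimes b'[d'] \mapsto \tilde b'[d' - H(b \otimes b')] \otimes \tilde b[d + H(b \otimes b')], \tag{A.1}$$

where $b \otimes b' \mapsto \tilde b' \otimes \tilde b$ under the isomorphism $B \otimes B' \simeq B' \otimes B$. $H(b \otimes b')$ is called the energy function and determined up to an additive constant by

$$H(e_i(b \otimes b')) = \begin{cases} H(b \otimes b') + 1 & \text{if } i = 0,\ \varphi_0(b) \ge \varepsilon_0(b'),\ \varphi_0(\tilde b') \ge \varepsilon_0(\tilde b), \\ H(b \otimes b') - 1 & \text{if } i = 0,\ \varphi_0(b) < \varepsilon_0(b'),\ \varphi_0(\tilde b') < \varepsilon_0(\tilde b), \\ H(b \otimes b') & \text{otherwise.} \end{cases}$$

Proposition A.1 (Yang-Baxter equation). The following equation holds on $\mathrm{Aff}(B) \otimes \mathrm{Aff}(B') \otimes \mathrm{Aff}(B'')$:

$$(R \otimes 1)(1 \otimes R)(R \otimes 1) = (1 \otimes R)(R \otimes 1)(1 \otimes R).$$

We often write the map $R$ simply by $\simeq$. The combinatorial $R$ is naturally restricted to $B \otimes B'$. In the main text we are concerned about the crystal $B_l$ corresponding to the $l$-fold symmetric tensor representation. We normalize the energy function so that

$$\max\{H(b \otimes c) \mid b \otimes c \in B_l \otimes B_m\} = \min(l, m).$$

Under this convention one has $\min\{H(b \otimes c) \mid b \otimes c \in B_l \otimes B_m\} = 0$. When $l = m$, the combinatorial $R$ becomes the identity map on $B_l \otimes B_l$ but still acts non-trivially as $R(x[d] \otimes y[e]) = x[e - H(x \otimes y)] \otimes y[d + H(x \otimes y)]$.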

Appendix B. Graphical rule for combinatorial R

Following [17], we introduce a graphical rule to calculate the combinatorial $R$ for $A^{(1)}_n$ and the energy function given by (2.12) and (2.14). Given the two elements

$$x = (x_1, x_2, \ldots, x_{n+1}) \in B_k, \qquad y = (y_1, y_2, \ldots, y_{n+1}) \in B_l,$$

we represent the tensor product $x \otimes y$ by the following diagram.

[Figure: two-column diagram for $x \otimes y$; the $i$th row from the top holds $x_i$ dots in the left column and $y_i$ dots in the right column, for $i = 1, \ldots, n+1$.]

The combinatorial $R$ and the energy function $H$ for $B_k \otimes B_l$ (with $k \ge l$) are calculated by the following rule.

(1) Pick any dot, say $\bullet_a$, in the right column and connect it with a dot $\bullet'_a$ in the left column by a line. The partner $\bullet'_a$ is chosen from the dots which are in the lowest row among all dots whose positions are higher than that of $\bullet_a$. If there is no such dot, we return to the bottom and the partner $\bullet'_a$ is chosen from the dots in the lowest row among all dots. In the latter case, we call such a pair or line winding.

(2) Repeat the procedure (1) for the remaining unconnected dots $(l - 1)$ times.

(3) The action of the combinatorial $R$ is obtained by moving all unpaired dots in the left column to the right horizontally. We do not touch the paired dots during this move.

(4) The energy function $H$ is given by the number of winding pairs.

It is known that the results for the combinatorial R and the energy functions are not affected

by the order of making pairs [17, Propositions 3.15 and 3.17]. For more properties, including that

the above definition indeed satisfies the axiom, see [17].

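Example B.1. The diagram for $1233 \otimes 124$ is

[Figure: pairing diagram for $1233 \otimes 124$.]

By moving the unpaired dot (letter 2) in the left column to the right, we obtain

$$1233 \otimes 124 \simeq 133 \otimes 1224,$$

and, since exactly one pair is winding, the energy function is

$$H(1233 \otimes 124) = 1.$$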

For $i \in \mathbb{Z}_{n+1}$, the number of connecting lines that cross the horizontal level of the border between $x_i$ and $x_{i+1}$ is called the $i$th winding number. The energy function $H$ is the $(n+1)$th winding number. The quantity $\min(l, k) - (i\text{th winding number})$ is called the $i$th non-winding number. It is known that $Q_i(x \otimes y)$ in (2.13) gives the $i$th non-winding number. By the definition, the winding numbers for $x \otimes y$ and $\tilde y \otimes \tilde x$ are the same if $x \otimes y \simeq \tilde y \otimes \tilde x$ by the combinatorial $R$.
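The graphical rule of this appendix translates directly into a short program. The sketch below is one possible implementation, with elements of $B_k$ stored as tuples of letter multiplicities $(x_1, \ldots, x_{n+1})$ and row 1 taken as the top row, so that "higher" means a strictly smaller row index. The function name and this data representation are our choices.

```python
def combinatorial_r(x, y):
    """Apply the graphical rule to x in B_k and y in B_l with k >= l.

    Returns (y_tilde, x_tilde, H), where x (x) y ~ y_tilde (x) x_tilde and
    H is the number of winding pairs.
    """
    if len(x) != len(y):
        raise ValueError("x and y need the same number of rows")
    if sum(x) < sum(y):
        raise ValueError("the rule is stated for k >= l")
    free = list(x)            # unpaired dots remaining in each row of the left column
    winding = 0
    for a, count in enumerate(y):          # row a of the right column (0-based)
        for _ in range(count):
            # lowest row strictly above row a that still has an unpaired left dot
            partner = next((i for i in range(a - 1, -1, -1) if free[i] > 0), None)
            if partner is None:
                # no such dot: return to the bottom and take the lowest unpaired dot
                partner = next(i for i in range(len(x) - 1, -1, -1) if free[i] > 0)
                winding += 1
            free[partner] -= 1
    y_tilde = tuple(xi - u for xi, u in zip(x, free))   # paired dots stay on the left
    x_tilde = tuple(yi + u for yi, u in zip(y, free))   # unpaired dots move to the right
    return y_tilde, x_tilde, winding


# Example B.1: 1233 (x) 124 with n = 3
print(combinatorial_r((1, 1, 2, 0), (1, 1, 0, 1)))
```

On the data of Example B.1 this returns the multiplicities of $133$ and $1224$ together with $H = 1$. The right-hand dots are processed row by row here, which is harmless because the result does not depend on the order of making pairs.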

Appendix C. KKR bijection

There are two different ways to define the Kerov-Kirillov-Reshetikhin (KKR) bijection. One is the original combinatorial algorithm [9,10] explained here, and the other is an algebraic version [34,35] which will be treated in Appendix D. Although both definitions are known to be equivalent, they work complementarily in some aspects. In fact, we use both definitions case by case in the main text.

C.1. Definition

The KKR bijection provides a one-to-one correspondence between the set of rigged configurations and that of highest paths. For a given $A^{(1)}_n$ rigged configuration

(n) (n)

(0) (1) (1)

,

RC = j , j , rj , . . . , j , rj

(C.1)

we define the KKR procedure RC
p B(0) B(0) B(0) , which gives a highest

N

path $p$. See Section 2.2 for definitions of rigged configurations, vacancy numbers $E_j^{(a)}$ and riggings. The data $\mu^{(0)}$ is called quantum space.

Definition C.1. For a given RC, the image (or path) p of the KKR bijection is obtained by the

following procedure.

Step 1. For each row of the quantum space (0) , we assign the numbers from 1 to N arbitrarily,

and reorder it as

(0)

(0)

(0) = (0)

(C.2)

N , . . . , 2 , 1 .

(0)

Take row 1 .

(0)

(0)

1 =

(0)

l1

2(0) 1(0) .

(C.3)

(0)

p1 =

(C.4)


(0)

(i)

Starting from the box 1 , we recursively choose 1 (i) by the following Rule 1:

(i1)

(i) whose lengths w satisfy

(i1)

w col 1

,

where the right-hand side means the number of columns in (i1) that are not located to the right

(i1)

.

of the box 1

def

(i)

Let gs ( g (i) ) be the set of all the singular rows ( rows whose corresponding vacancy

(i)

number and rigging are equal) in the set g (i) . If gs = , then choose one of the shortest rows of

(i)

(i)

(i)

(i)

(n)

gs , and denote its rightmost box by 1 . If gs = , then we take 1 = = 1 = .

(0)

(j 1)

(1)

(k)

mum k such that 1 = . After the removal, construct a new RC by

(a)

(a1)

(a)

(a+1)

2Ei + Ei

along the configuration

after the removal. For those rows shortened by the removal, assign their vacancy numbers equal

to the new riggings. For the other row, keep the original rigging before Step 3.

Put letter j1 into the leftmost empty box of p1 as

p1 =

j1

(C.5)

(0)

(0)

(0)

Step 4. Repeat Step 2 and Step 3 for the rest of the boxes 2 , 3 , . . . , l1 in this order. Put

letters jk into empty boxes of p1 from left to right.

(0)

(0)

(0)

Step 5. Repeat Step 1 to Step 4 for the rest of the rows 2 , 3 , . . . , N in this order. Then we

obtain pk from (0)

.

k , which we identify with the tableau representation of the element in B(0)

k

The image of the KKR bijection is given by p = pN p2 p1 .

The above procedure gives a map from rigged configurations to highest paths. Its inverse also

admits a similar description. See Theorem 2 of [9].

C.2. Example of the KKR bijection

Let us illustrate a typical example of the KKR bijection. For later convenience, we treat the single column type quantum space. The procedure for a general quantum space is quite similar.


Example C.2. We show that the following rigged configuration corresponds to a path p =

11112221322433.

(0)

(1)

(2)

0

2

(3)

1

1

0

In the above diagram, we have specified the boxes to be removed by Step 3 with the symbol .

Note that the boxes with are the rightmost boxes of the shortest possible singular rows, and

their column coordinates are increasing from the left to the right. We can remove three boxes at

a time, thus resulting part of a path is 3 . Similarly we can proceed as

(113 )

(112 )

(111 )

(110 )

0

4

4

0

8

4

0

4

3

0

3

1

3

0

0

0

0

0

3

1

0

0

4

1

0


(19 )

2

6

(18 )

(17 )

3

2

(14 )

4

1

p= 1 1 1 1 2 2 2 1 3 2

2 4 3 3.

The following lemma is useful.

Lemma C.3. Let $p \in P_+(\mu^{(0)})$ and $q \in P_+(\nu^{(0)})$ be the highest paths corresponding to the rigged configurations $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ and $(\nu^{(0)}, (\nu^{(1)}, s^{(1)}), \ldots, (\nu^{(n)}, s^{(n)}))$, respectively. Then the rigged configuration for the highest path $p \otimes q$ is given by

$$\bigl(\mu^{(0)} \cup \nu^{(0)},\ (\mu^{(1)}, r^{(1)}) \cup (\nu^{(1)}, s'^{(1)}),\ \ldots,\ (\mu^{(n)}, r^{(n)}) \cup (\nu^{(n)}, s'^{(n)})\bigr). \tag{C.6}$$

Here $(\nu^{(a)}, s'^{(a)}) = \{(\nu_i^{(a)}, s_i'^{(a)})\}$ and the rigging $s'^{(a)} = (s_i'^{(a)})$ is given by

$$s_i'^{(a)} = s_i^{(a)} + p^{(a)}_{\nu_i^{(a)}},$$

where $p_j^{(a)}$ is the vacancy number (2.7) for $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$.

Proof. Let $q_j^{(a)}$ be the vacancy number for $(\nu^{(0)}, (\nu^{(1)}, s^{(1)}), \ldots, (\nu^{(n)}, s^{(n)}))$. Then the vacancy number for (C.6) is $p_j^{(a)} + q_j^{(a)}$. Hence the co-rigging (= vacancy number − rigging) of the row $(\nu_i^{(a)}, s_i'^{(a)})$ in (C.6) is $p_j^{(a)} + q_j^{(a)} - s_i'^{(a)} = q_j^{(a)} - s_i^{(a)}$ with $j = \nu_i^{(a)}$, which is nothing but the co-rigging of the same row in $(\nu^{(0)}, (\nu^{(1)}, s^{(1)}), \ldots, (\nu^{(n)}, s^{(n)}))$. Recall that the KKR procedure (Definition C.1) consults co-riggings to decide which boxes are to be removed from a rigged configuration. Therefore the above coincidence of the co-riggings means that the KKR procedure on (C.6) gives the path $q$ when the part $\nu^{(0)}$ is first removed from $\mu^{(0)} \cup \nu^{(0)}$. Moreover at this stage, the remaining rigged configuration is exactly $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$. □

Appendix D. Vertex operator formalism of the KKR bijection

Here we give a crystal theoretic reformulation of the KKR bijection based on [34,35]. The

central notions are scattering data, normal ordering and the vertex operator. For illustrative examples, see Appendix E.

D.1. Scattering data and normal ordering

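We call elements of affine crystals $b_1[d_1] \otimes \cdots \otimes b_m[d_m] \in \mathrm{Aff}(B_{l_1}) \otimes \cdots \otimes \mathrm{Aff}(B_{l_m})$ scattering data. The number $d_i$ is called the $i$th mode. By using the combinatorial R, scattering data can be reordered and the modes are changed accordingly. Given a scattering data $s \in \mathrm{Aff}(B_{l_1}) \otimes \cdots \otimes \mathrm{Aff}(B_{l_m})$, define $S_m$ to be the set of such reorderings as

$$S_m = \Bigl\{ s' \in \bigsqcup_{\sigma \in \mathfrak{S}_m} \mathrm{Aff}(B_{l_{\sigma(1)}}) \otimes \cdots \otimes \mathrm{Aff}(B_{l_{\sigma(m)}}) \Bigm| s' \simeq s \Bigr\},$$

where $\bigsqcup$ means the disjoint union over all the distinct permutations of $(l_1, \ldots, l_m)$. For instance, if $s = \boxed{234}_{7} \otimes \boxed{223}_{2}$, we have

$$S_2 = \bigl\{ \boxed{234}_{7} \otimes \boxed{223}_{2},\ \boxed{234}_{0} \otimes \boxed{223}_{9} \bigr\}.$$

Note that in this case the union over $\sigma$ is trivial as $(l_1, l_2) = (l_2, l_1) = (3, 3)$, but $S_2$ contains two distinct elements since the combinatorial R is nontrivial, as remarked at the end of Appendix A. For $i = 2, \ldots, m$, let $S_{i-1}$ be the subset of $S_i$ having the maximal $i$th mode. Then we have

$$\emptyset \neq S_1 \subseteq S_2 \subseteq \cdots \subseteq S_m. \tag{D.1}$$

In the above example, we have $S_1 = \{ \boxed{234}_{0} \otimes \boxed{223}_{9} \}$. We call the elements of $S_1$ normal ordered forms of $s$. In general the normal ordered form $b_1[d_1] \otimes \cdots \otimes b_m[d_m]$ is not unique, but the mode sequence $d_1, \ldots, d_m$ is unique by the definition and satisfies $d_1 \le \cdots \le d_m$. Any element of $S_1$ is denoted by $:s:$.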
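The selection procedure (D.1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the full set $S_m$ of reorderings is taken as given (the combinatorial R that produces it is not implemented here), and both the function name `normal_ordered_forms` and the encoding of $b[d]$ as a `(tableau, mode)` pair are choices made for this sketch.

```python
def normal_ordered_forms(reorderings):
    """Filter S_m down to S_1: for i = m, ..., 2 keep the elements whose
    i-th mode is maximal, as in (D.1)."""
    current = list(reorderings)
    m = len(current[0])
    # Python index i (0-based) addresses the (i+1)-th mode; run from the
    # m-th mode down to the 2nd mode.
    for i in range(m - 1, 0, -1):
        best = max(s[i][1] for s in current)
        current = [s for s in current if s[i][1] == best]
    return current


# The example from the text: s = 234[7] (x) 223[2] and its reordering.
S2 = [
    (("234", 7), ("223", 2)),
    (("234", 0), ("223", 9)),
]
S1 = normal_ordered_forms(S2)
```

With the two-element set above, `S1` keeps only the element with second mode 9, matching the normal ordered form stated in the text.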

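D.2. Maps $C^{(1)}, \ldots, C^{(n)}$

Let $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ be an $A_n^{(1)}$ rigged configuration. Pick the color $a$ part $(\mu^{(a)}, r^{(a)})$. Here we simply write it as $(\mu, r)$. Namely $\mu = (\mu_1, \ldots, \mu_m)$ is an array of positive integers and $r = (r_i)$, where $r_i$ is the rigging attached to the $i$th row in $\mu$ of length $\mu_i$. For $1 \le a \le n$, let $B_l = B_l^{\ge a+1}$ be the $A_{n-a}^{(1)}$ crystal in the sense explained around (2.17). Define the map $C^{(a)}$ among the $A_{n-a}^{(1)}$ crystals by

$$C^{(a)}:\ B_{\mu_1} \otimes \cdots \otimes B_{\mu_m} \to\ :\!\mathrm{Aff}(B_{\mu_1}) \otimes \cdots \otimes \mathrm{Aff}(B_{\mu_m})\!: \qquad (1 \le a \le n),$$
$$b_1 \otimes \cdots \otimes b_m \mapsto\ :\!b_1[d_1] \otimes \cdots \otimes b_m[d_m]\!:, \tag{D.2}$$
$$d_i = r_i + \mu_i + \sum_{1 \le k < i} H\bigl(b_k \otimes b_i^{(k+1)}\bigr). \tag{D.3}$$

Here $b_i^{(j)}$ is defined by $(b_j \otimes \cdots \otimes b_{i-1}) \otimes b_i \simeq b_i^{(j)} \otimes (\cdots)$ under the isomorphism $(B_{\mu_j} \otimes \cdots \otimes B_{\mu_{i-1}}) \otimes B_{\mu_i} \simeq B_{\mu_i} \otimes (B_{\mu_j} \otimes \cdots \otimes B_{\mu_{i-1}})$. Note that the choice (D.3) is compatible with (A.1).

The map $C^{(n)}$ involves the $A_0^{(1)}$ crystal $B_l^{\ge n+1} = \{(n+1)^l\}$. See (2.16) for the notation $a^l$. The following suffices to define $C^{(n)}$:

$$(n+1)^l \otimes (n+1)^m \simeq (n+1)^m \otimes (n+1)^l, \qquad H\bigl((n+1)^l \otimes (n+1)^m\bigr) = \min(l, m).$$

Since the normal ordering in (D.2) is not unique, $C^{(a)}$ is actually multi-valued in general. Here we mean by $C^{(a)}(\cdot)$ to pick any one of the normal ordered forms. $C^{(a)}$ is an operator that transforms elements of classical $A_{n-a}^{(1)}$ crystals to normal ordered scattering data by assigning the modes.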

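D.3. Maps $\Phi^{(1)}, \ldots, \Phi^{(n)}$

Pick the color $a$ and $a-1$ parts of the configuration and denote them simply by $\mu^{(a)} = (\mu_1, \ldots, \mu_m)$ and $\mu^{(a-1)} = (\nu_1, \ldots, \nu_k)$. Set $B_l = B_l^{\ge a+1}$ and $B'_l = B_l^{\ge a}$. We define the map $\Phi^{(a)}$ from the normal ordered scattering data in $A_{n-a}^{(1)}$ affine crystals to classical $A_{n-a+1}^{(1)}$ crystals:

$$\Phi^{(a)}:\ :\!\mathrm{Aff}(B_{\mu_1}) \otimes \cdots \otimes \mathrm{Aff}(B_{\mu_m})\!:\ \to B'_{\nu_1} \otimes \cdots \otimes B'_{\nu_k} \qquad (1 \le a \le n),$$
$$b_1[d_1] \otimes \cdots \otimes b_m[d_m] \mapsto c_1 \otimes \cdots \otimes c_k. \tag{D.4}$$

From (D.3) and the fact that $b_1[d_1] \otimes \cdots \otimes b_m[d_m]$ is normal ordered, we have $0 \le d_1 \le \cdots \le d_m$. Then the image $c_1 \otimes \cdots \otimes c_k$ is determined by the following relation under the isomorphism of $A_{n-a+1}^{(1)}$ crystals. (We write $T_a^d = \boxed{a}^{\otimes d} \in (B'_1)^{\otimes d}$ for short.)

$$\bigl(T_a^{d_1} \otimes b_1 \otimes T_a^{d_2 - d_1} \otimes b_2 \otimes \cdots \otimes T_a^{d_m - d_{m-1}} \otimes b_m\bigr) \otimes \bigl(a^{\nu_1} \otimes a^{\nu_2} \otimes \cdots \otimes a^{\nu_k}\bigr) \simeq (c_1 \otimes \cdots \otimes c_k) \otimes \mathrm{tail}. \tag{D.5}$$

Here each $b_i$ on the left-hand side is regarded as an element of $B'_{\mu_i}$ through the inclusion $B_l^{\ge a+1} \subseteq B_l^{\ge a}$ (2.17) as sets. The tail part has the same structure as $(T_a^{d_1} \otimes b_1 \otimes T_a^{d_2 - d_1} \otimes \cdots \otimes b_m)$ on the left-hand side. In actual use, it turns out to be $(T_a^{d_1} \otimes a^{\mu_1} \otimes T_a^{d_2 - d_1} \otimes \cdots \otimes a^{\mu_m})$, containing the letter $a$ only. (This fact will not be used.)

To obtain $c_1 \otimes \cdots \otimes c_k$ using (D.5), one applies the $A_{n-a+1}^{(1)}$ combinatorial R many times to carry $(T_a^{d_1} \otimes b_1 \otimes \cdots \otimes T_a^{d_m - d_{m-1}} \otimes b_m)$ through $a^{\nu_1} \otimes \cdots \otimes a^{\nu_k}$ to the right. The procedure is depicted as

[Figure: vertex diagram in which $b_m$, $T_a^{d_m - d_{m-1}}$, ..., $b_1$, $T_a^{d_1}$ are carried in turn through $a^{\nu_1} \otimes a^{\nu_2} \otimes \cdots \otimes a^{\nu_k}$, producing $c_1 \otimes c_2 \otimes \cdots \otimes c_k$.]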

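Let $p^{(n)}$ be the $A_0^{(1)}$ highest path determined by $\mu^{(n)} = (\mu^{(n)}_1, \ldots, \mu^{(n)}_{l_n})$:

$$p^{(n)} = (n+1)^{\mu^{(n)}_1} \otimes \cdots \otimes (n+1)^{\mu^{(n)}_{l_n}}. \tag{D.6}$$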

Theorem D.1. The image $p$ of the rigged configuration $(\mu^{(0)}, (\mu^{(1)}, r^{(1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ under the KKR bijection is given by

$$p = \Phi^{(1)} C^{(1)} \Phi^{(2)} C^{(2)} \cdots \Phi^{(n)} C^{(n)} \bigl(p^{(n)}\bigr). \tag{D.7}$$

This is announced in [34] and proved in [35]. The theorem asserts that the right-hand side is independent of the choices of the possibly non-unique normal ordered forms when applying the maps $C^{(1)}, \ldots, C^{(n)}$.

Set

$$p^{(a)} = \Phi^{(a+1)} C^{(a+1)} \cdots \Phi^{(n)} C^{(n)} \bigl(p^{(n)}\bigr) \qquad (0 \le a \le n-1), \tag{D.8}$$

which is an element of the $A_{n-a}^{(1)}$ crystal $B^{\ge a+1}_{\mu^{(a)}_1} \otimes \cdots \otimes B^{\ge a+1}_{\mu^{(a)}_{l_a}}$.

Corollary D.2. For $0 \le a \le n-1$, $p^{(a)}$ coincides with the image of the truncated rigged configuration $(\mu^{(a)}, (\mu^{(a+1)}, r^{(a+1)}), \ldots, (\mu^{(n)}, r^{(n)}))$ under the KKR bijection.

By the construction, $C^{(a)}(p^{(a)})$ is a normal ordered scattering data which is an element of an $A_{n-a}^{(1)}$ affine crystal. Then the map $\Phi^{(a)}$ produces an $A_{n-a+1}^{(1)}$ highest path by injecting the $A_{n-a}^{(1)}$ scattering data into $a^{\mu^{(a-1)}_1} \otimes \cdots \otimes a^{\mu^{(a-1)}_{l_{a-1}}}$. We call $\Phi^{(a)}$ a vertex operator in this sense. The construction (D.8) involves the family of scattering data and vertex operators for crystals of $A_0^{(1)} \subset A_1^{(1)} \subset \cdots \subset A_n^{(1)}$. It can be regarded as a crystal theoretical formulation of the nested Bethe ansatz due to Schultz [41].


This appendix is an exposition of the inverse scattering formalism of the box-ball system mentioned in Section 3.2. We illustrate the calculations of scattering data, normal ordering and vertex operators explained in Appendix D through several examples.

E.1. Time evolution, scattering data and normal ordering

Example E.1. Consider the rigged configuration in Example C.2. We put many $\boxed{1} \in B_1$ on both sides of the corresponding path p = 11112221322433, and consider its time evolution under T of the box-ball system. See Section 3.1 for the definition of T.

t = 0: 1111222211111133211143111111111111111111111111111111
t = 1: 1111111122221111133211431111111111111111111111111111
t = 2: 1111111111112222111133214311111111111111111111111111
t = 3: 1111111111111111222211133243111111111111111111111111
t = 4: 1111111111111111111122221132433111111111111111111111
t = 5: 1111111111111111111111112221322433111111111111111111
t = 6: 1111111111111111111111111112211322433211111111111111
t = 7: 1111111111111111111111111111122111322143321111111111
t = 8: 1111111111111111111111111111111221111322114332111111
t = 9: 1111111111111111111111111111111112211111322111433211

Here the length of the paths is 52, and the t = 5 state contains the original path as $\boxed{1}^{\otimes 20} \otimes p \otimes \boxed{1}^{\otimes 18}$.
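The time evolution pattern above can be reproduced with a short Python sketch. It assumes that T acts as the usual box-ball rule with a carrier of unbounded capacity: for each color from the largest down to 2, every ball of that color is moved once, leftmost first, to the nearest empty box (letter 1) on its right. The definition in Section 3.1 is authoritative; the function name `evolve` and the encoding of a state as a string of digits are choices made for this illustration.

```python
def evolve(state):
    """One time step of the box-ball system on a string of letters 1..9,
    where the letter 1 denotes an empty box (assumed rule, see lead-in)."""
    cells = [int(ch) for ch in state]
    for color in range(max(cells), 1, -1):
        # Positions of the balls of this color before any of them is moved.
        positions = [i for i, c in enumerate(cells) if c == color]
        for i in positions:
            j = i + 1
            while j < len(cells) and cells[j] != 1:
                j += 1
            if j == len(cells):
                raise ValueError("not enough empty boxes on the right")
            cells[i], cells[j] = 1, color
    return "".join(str(c) for c in cells)


p = "11112221322433"
state_t5 = "1" * 20 + p + "1" * 18   # the t = 5 state of Example E.1
state_t6 = evolve(state_t5)          # candidate for the t = 6 state
```

Under these assumptions `evolve` maps the t = 5 state to the t = 6 state listed above, and it can be iterated to follow the three solitons through their collision.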

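The following rigged configurations correspond to the above paths at each time.

[Figure: rigged configuration with $\mu^{(0)} = (1^{52})$; the rows of $\mu^{(1)}$ of length 4, 3 and 2 carry vacancy numbers 38, 40 and 43 and riggings $0 + 4t$, $7 + 3t$ and $13 + 2t$, respectively; $\mu^{(2)}$ and $\mu^{(3)}$ are shown with their vacancy numbers and riggings.]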
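The vacancy numbers 38, 40 and 43 attached to $\mu^{(1)}$ can be cross-checked numerically. The sketch below assumes that (2.7) is the standard $A_n^{(1)}$ vacancy number $p_j^{(a)} = \sum_i \min(j, \mu_i^{(a-1)}) - 2 \sum_i \min(j, \mu_i^{(a)}) + \sum_i \min(j, \mu_i^{(a+1)})$, and that $\mu^{(2)} = (3, 1)$, which is read off from the shape of $p^{(2)}$ in Example E.4. The function name `vacancy` is hypothetical.

```python
def vacancy(j, lower, same, upper):
    """Vacancy number p_j^{(a)} from the neighbouring partitions
    mu^{(a-1)} (lower), mu^{(a)} (same) and mu^{(a+1)} (upper).
    Assumed form of (2.7), see the lead-in."""
    def q(mu):
        return sum(min(j, part) for part in mu)
    return q(lower) - 2 * q(same) + q(upper)


mu0 = [1] * 52      # quantum space (1^52)
mu1 = [4, 3, 2]     # rows of mu^(1)
mu2 = [3, 1]        # rows of mu^(2), inferred from Example E.4
vacancies = [vacancy(j, mu0, mu1, mu2) for j in mu1]
```

For the rows of length 4, 3 and 2 this gives the three values shown in the diagram, in the same order.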

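The linear dependence of the rigging on t is in agreement with Proposition 3.5. The following is the list of all the normal ordered scattering data corresponding to each time t of the above paths.

t = 0, 1, 2, 3, 4: (entry missing in the source text)
t = 5: $\boxed{2222}_{24} \otimes \boxed{233}_{26} \otimes \boxed{34}_{26}$, $\boxed{2222}_{24} \otimes \boxed{23}_{26} \otimes \boxed{334}_{26}$
t = 6: $\boxed{22}_{27} \otimes \boxed{2223}_{29} \otimes \boxed{334}_{29}$, $\boxed{22}_{27} \otimes \boxed{223}_{29} \otimes \boxed{2334}_{29}$
t = 7, 8, 9: (entry missing in the source text)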

Compare this list with the above time evolution pattern. Each tensor product component of the scattering data corresponds to a soliton in the path. When the modes of the scattering data are well separated, the normal ordering is unique, and the corresponding path consists of well separated solitons that contain the tableau letters in the scattering data (in the reverse order). $t \neq 5, 6$ are such cases. From the viewpoint of the scattering data, collisions of solitons happen when the modes get close and the normal ordering becomes non-unique. t = 5, 6 are such cases. See also Example 2.4 for the tau functions at t = 5, where $\tau_{k,i}$ there is relevant to $\tau_{k+20,i}$ here.

Let us illustrate the derivation of the normal ordered scattering data at t = 5. At t = 5, the riggings of $\mu^{(1)}$ attached to the rows of length 2, 3 and 4 are $r_1 = 23$, $r_2 = 22$ and $r_3 = 20$, respectively. By Theorem D.1 and (D.8), we know that $p = \Phi^{(1)} C^{(1)}(p^{(1)})$, where $C^{(1)}(p^{(1)})$ is the normal ordered scattering data. It is constructed from the $A_2$-highest path $p^{(1)}$ containing the letters 2, 3 and 4. According to Corollary D.2, $p^{(1)}$ is the image under the KKR bijection of the following part of the original rigged configuration:

[Figure: the part $(\mu^{(1)}, (\mu^{(2)}, r^{(2)}), (\mu^{(3)}, r^{(3)}))$ of the rigged configuration.]

Here $\mu^{(1)}$ plays the role of the quantum space, and the KKR bijection is of $A_2^{(1)}$ type with letters 2, 3 and 4. For example, if we can remove only a box from $\mu^{(1)}$, then we have the letter 2 as a part of the path, whereas if boxes are removed from $\mu^{(1)}$, $\mu^{(2)}$ and $\mu^{(3)}$, the letter is 4. Removing the rows of $\mu^{(1)}$ from the top, we obtain the $A_2$ highest path:

$$p^{(1)} = b_1 \otimes b_2 \otimes b_3 = \boxed{22} \otimes \boxed{223} \otimes \boxed{2334}. \tag{E.1}$$

Assigning the modes to this according to (D.2) and (D.3), we get

$$b_1[d_1] \otimes b_2[d_2] \otimes b_3[d_3] = \boxed{22}_{25} \otimes \boxed{223}_{26} \otimes \boxed{2334}_{25}.$$

To derive the mode $d_3 = 25$, for instance, we calculate $\sum_{1 \le k < 3} H(b_k \otimes b_3^{(k+1)})$ in (D.3) as

[Figure: combinatorial R diagram for $\boxed{22} \otimes \boxed{223} \otimes \boxed{2334}$, with the energy values 0 and 1 attached to the two vertices.]

where a vertex of $a \otimes b$ labelled by $H$ signifies the value of the energy function $H(a \otimes b) = H$. Since in (D.3) we have $r_3 = 20$ and $\mu_3 = 4$, the mode is $d_3 = 20 + 4 + 0 + 1 = 25$.
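The arithmetic of (D.3) for this example can be written out as a small Python sketch. The riggings $13 + 2t$, $7 + 3t$, $0 + 4t$ and the energy values 0 and 1 entering $d_3$ are taken from the text; the energy sum 1 entering $d_2$ is not stated explicitly and is inferred here from $d_2 = 26$. The function names `modes` and `riggings_at` are hypothetical.

```python
def modes(riggings, lengths, energy_sums):
    """d_i = r_i + mu_i + (sum of energy values), following (D.3)."""
    return [r + mu + h for r, mu, h in zip(riggings, lengths, energy_sums)]


def riggings_at(t):
    """Riggings of the rows of mu^(1) of length 2, 3, 4 at time t."""
    return [13 + 2 * t, 7 + 3 * t, 0 + 4 * t]


lengths = [2, 3, 4]          # mu_1, mu_2, mu_3
energy_sums = [0, 1, 0 + 1]  # empty sum; inferred value; values from the text
d_at_5 = modes(riggings_at(5), lengths, energy_sums)
```

The result at t = 5 is the mode sequence of $b_1[d_1] \otimes b_2[d_2] \otimes b_3[d_3]$ before normal ordering. Since only the riggings depend on t, the same call with another t gives the modes of this particular ordering at that time.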

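To find the normal ordered scattering data $C^{(1)}(p^{(1)})$, we follow the procedure (D.1) and list the following sets:

$$S_3 = \bigl\{ \boxed{22}_{25} \otimes \boxed{223}_{26} \otimes \boxed{2334}_{25},\ \boxed{22}_{25} \otimes \boxed{2223}_{25} \otimes \boxed{334}_{26},\ \boxed{222}_{25} \otimes \boxed{23}_{26} \otimes \boxed{2334}_{25},$$
$$\qquad \boxed{222}_{25} \otimes \boxed{2233}_{25} \otimes \boxed{34}_{26},\ \boxed{2222}_{24} \otimes \boxed{23}_{26} \otimes \boxed{334}_{26},\ \boxed{2222}_{24} \otimes \boxed{233}_{26} \otimes \boxed{34}_{26} \bigr\},$$

$$S_2 = \bigl\{ \boxed{22}_{25} \otimes \boxed{2223}_{25} \otimes \boxed{334}_{26},\ \boxed{222}_{25} \otimes \boxed{2233}_{25} \otimes \boxed{34}_{26},\ \boxed{2222}_{24} \otimes \boxed{23}_{26} \otimes \boxed{334}_{26},\ \boxed{2222}_{24} \otimes \boxed{233}_{26} \otimes \boxed{34}_{26} \bigr\},$$

$$S_1 = \bigl\{ \boxed{2222}_{24} \otimes \boxed{23}_{26} \otimes \boxed{334}_{26},\ \boxed{2222}_{24} \otimes \boxed{233}_{26} \otimes \boxed{34}_{26} \bigr\}.$$

Both elements of $S_1$ serve as the normal ordered scattering data, in agreement with the previous list at t = 5.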

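Example E.2.

[Figure: rigged configuration with $\mu^{(0)} = (1^{52})$; the rows of $\mu^{(1)}$ of length 4, 3 and 2 carry vacancy numbers 38, 40 and 43 and riggings $0 + 4t$, $5 + 3t$ and $10 + 2t$, respectively; $\mu^{(2)}$ and $\mu^{(3)}$ are shown with their vacancy numbers and riggings.]

t = 0, 1, 2, 3: (entry missing in the source text)
t = 4: $\boxed{2222}_{20} \otimes \boxed{233}_{21} \otimes \boxed{34}_{21}$, $\boxed{2222}_{20} \otimes \boxed{23}_{21} \otimes \boxed{334}_{21}$, $\boxed{222}_{20} \otimes \boxed{2233}_{21} \otimes \boxed{34}_{21}$, $\boxed{222}_{20} \otimes \boxed{23}_{21} \otimes \boxed{2334}_{21}$, $\boxed{22}_{20} \otimes \boxed{2223}_{21} \otimes \boxed{334}_{21}$, $\boxed{22}_{20} \otimes \boxed{223}_{21} \otimes \boxed{2334}_{21}$
t = 5, 6, 7, 8, 9: $\boxed{22}_{12+2t} \otimes \boxed{223}_{9+3t} \otimes \boxed{2334}_{5+4t}$

At t = 4, all 6 reorderings are simultaneously normal ordered. In a sense, the three solitons collide all together at t = 4. Compare this with the following time evolution pattern.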

t = 0: 1111222211113321143111111111111111111111111111111111
t = 1: 1111111122221113321431111111111111111111111111111111
t = 2: 1111111111112222113324311111111111111111111111111111
t = 3: 1111111111111111222213243311111111111111111111111111
t = 4: 1111111111111111111122132243321111111111111111111111
t = 5: 1111111111111111111111221132214332111111111111111111
t = 6: 1111111111111111111111112211132211433211111111111111
t = 7: 1111111111111111111111111122111132211143321111111111
t = 8: 1111111111111111111111111111221111132211114332111111
t = 9: 1111111111111111111111111111112211111132211111433211

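Here we illustrate the action of the vertex operators $\Phi^{(1)}, \ldots, \Phi^{(n)}$ introduced in Appendix D.3 (D.4). It is convenient to use the vertex type diagram to express the action of the combinatorial R. For example the following successive actions of the combinatorial R

$$a \otimes b \otimes c \simeq b' \otimes a' \otimes c \simeq b' \otimes c' \otimes a'',$$

are expressed as

[Figure: vertex diagram in which the line carrying $a$ crosses the lines carrying $b$ and $c$ in turn.]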

Given a path $p$ and an element $b \in B_l$, one can carry $b$ through $p$ to the right by successively applying the combinatorial R as

$$b \otimes p \simeq p' \otimes b', \tag{E.2}$$

under the isomorphism $B_l \otimes (B_{k_1} \otimes \cdots \otimes B_{k_N}) \simeq (B_{k_1} \otimes \cdots \otimes B_{k_N}) \otimes B_l$. As the result we get $b' \in B_l$ and another path $p'$. Actually, only the situation $b' = u_l$ (the highest element of $B_l$) will be encountered in our case, and the relation (E.2) will be denoted by $\Phi_b(p) = p'$. This $\Phi_b$ is an elementary vertex operator. The previous ones $\Phi^{(1)}, \ldots, \Phi^{(n)}$ defined by (D.5) are compositions of $\Phi_b$ with several $b$.

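For example, to calculate $\Phi_{\boxed{2334}}(\boxed{1}^{\otimes 5})$, the relevant diagram is

[Figure: vertex diagram in which the carrier $\boxed{2334}$ passes through five copies of $\boxed{1}$, becoming $\boxed{1233}$, $\boxed{1123}$, $\boxed{1112}$, $\boxed{1111}$ and $\boxed{1111}$ in turn.]

Therefore we obtain $\Phi_{\boxed{2334}}(\boxed{1}^{\otimes 5}) = 43321$. Note that $\Phi_b$ has created one soliton labeled by the letters in $b$.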

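In general, if $b_1[d_1] \otimes \cdots \otimes b_m[d_m]$ is a normal ordered scattering data, $\Phi^{(1)}$ defined by (D.5) is realized as the following composition of elementary vertex operators:

$$\Phi^{(1)} = T_1^{d_1} \Phi_{b_1} T_1^{d_2 - d_1} \Phi_{b_2} \cdots T_1^{d_m - d_{m-1}} \Phi_{b_m}, \tag{E.3}$$

where $f g(p) = f(g(p))$. The superscript (1) corresponds to that of $\mu^{(1)}$. Note that for $a = 1$, the effect of $T_a^d$ in (D.5) is described by $T_1^d = (\Phi_{\boxed{1}})^d$.

In what follows we illustrate Theorem D.1 and Corollary D.2.

Example E.3. Take the path p = 11112221322433, which we have already considered in Example C.2 and Example E.1. From the t = 5 case of Example E.1, both sides of

$$\boxed{2222}_{4} \otimes \boxed{23}_{6} \otimes \boxed{334}_{6} \simeq \boxed{2222}_{4} \otimes \boxed{233}_{6} \otimes \boxed{34}_{6} \tag{E.4}$$

serve as the normal ordered scattering data $C^{(1)}(p^{(1)})$. According to Theorem D.1 and (D.8), the original path $p$ is reconstructed as $p = \Phi^{(1)} C^{(1)}(p^{(1)})$. This $\Phi^{(1)}$ is realized, according to (E.3) and (E.4), as the following compositions of elementary vertex operators:

$$p = T_1^4\, \Phi_{\boxed{2222}}\, T_1^2\, \Phi_{\boxed{23}}\, \Phi_{\boxed{334}} \bigl(\boxed{1}^{\otimes 14}\bigr) = T_1^4\, \Phi_{\boxed{2222}}\, T_1^2\, \Phi_{\boxed{233}}\, \Phi_{\boxed{34}} \bigl(\boxed{1}^{\otimes 14}\bigr).$$

It is easy to check p = 11112221322433 from these formulas.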
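The check can also be carried out mechanically. The Python sketch below implements the elementary vertex operator $\Phi_b$ for paths made of single letters and composes it as in (E.3). The pairing rule used for the combinatorial R is an assumption of this sketch, chosen to reproduce the worked diagram for $\Phi_{\boxed{2334}}$ above (Appendix A is authoritative): an incoming letter $y$ is exchanged with the largest carrier letter smaller than $y$, or with the largest carrier letter if none is smaller. The names `phi` and `vertex_operator` are hypothetical.

```python
def phi(b, path):
    """Elementary vertex operator: carry the row b through a path of single
    letters (digits), returning the new path. Assumed pairing rule: see the
    lead-in."""
    carrier = [int(ch) for ch in b]
    out = []
    for ch in path:
        y = int(ch)
        smaller = [x for x in carrier if x < y]
        partner = max(smaller) if smaller else max(carrier)
        carrier.remove(partner)   # the partner leaves the carrier ...
        carrier.append(y)         # ... and the incoming letter is loaded
        out.append(str(partner))
    if any(x != 1 for x in carrier):
        raise ValueError("carrier did not end in the highest element u_l")
    return "".join(out)


def vertex_operator(scattering, vacuum):
    """Apply T^{d_1} Phi_{b_1} T^{d_2-d_1} ... T^{d_m-d_{m-1}} Phi_{b_m}
    to the vacuum, with the rightmost factor acting first, as in (E.3)."""
    path = vacuum
    modes = [0] + [d for _, d in scattering]
    for i in range(len(scattering), 0, -1):
        b, d = scattering[i - 1]
        path = phi(b, path)
        for _ in range(d - modes[i - 1]):
            path = phi("1", path)   # T_1 = Phi_1 shifts the path by one site
    return path


vacuum = "1" * 14
p_first = vertex_operator([("2222", 4), ("23", 6), ("334", 6)], vacuum)
p_second = vertex_operator([("2222", 4), ("233", 6), ("34", 6)], vacuum)
```

Both `p_first` and `p_second` come out as the path of Example E.3, which illustrates the independence of the choice of normal ordered form asserted by Theorem D.1.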

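Let us illustrate Corollary D.2, which reflects the nested structure of the KKR bijection. For $\Phi^{(a)}$ (D.5) with general $a$, the formula (E.3) is replaced by

$$\Phi^{(a)} = \bigl(\Phi_{\boxed{a}}\bigr)^{d_1} \Phi_{b_1} \bigl(\Phi_{\boxed{a}}\bigr)^{d_2 - d_1} \cdots \bigl(\Phi_{\boxed{a}}\bigr)^{d_m - d_{m-1}} \Phi_{b_m}. \tag{E.5}$$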

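Example E.4. We consider the same example as above. In the rigged configuration (see Example E.1), first look at the rightmost two diagrams, which form an $A_1^{(1)}$ rigged configuration:

[Figure: the part $(\mu^{(2)}, (\mu^{(3)}, r^{(3)}))$ of the rigged configuration.]

From $\mu^{(3)}$, we set $p^{(3)} = \boxed{4}$ according to (D.6) and obtain the scattering data $C^{(3)}(p^{(3)}) = \boxed{4}_{1}$, which is obviously normal ordered. From (E.5), the $A_1$ highest path $p^{(2)} = \Phi^{(3)} C^{(3)}(p^{(3)})$ with letters 3 and 4 is constructed as

$$p^{(2)} = \Phi_{\boxed{3}}\, \Phi_{\boxed{4}} \bigl(\boxed{3} \otimes \boxed{333}\bigr) = \boxed{3} \otimes \boxed{334}.$$

Taking the rigging attached to $\mu^{(2)}$ into account, we obtain the normal ordered scattering data

$$C^{(2)}(p^{(2)}) = \boxed{3}_{1} \otimes \boxed{334}_{4}.$$

Next we look at the following parts:

[Figure: the part $(\mu^{(1)}, (\mu^{(2)}, r^{(2)}), (\mu^{(3)}, r^{(3)}))$ of the rigged configuration.]

Then the $A_2$ highest path $p^{(1)} = \Phi^{(2)} C^{(2)}(p^{(2)})$ with letters 2, 3 and 4 is calculated along (E.5) as

$$p^{(1)} = \Phi_{\boxed{2}}\, \Phi_{\boxed{3}}\, \bigl(\Phi_{\boxed{2}}\bigr)^{3}\, \Phi_{\boxed{334}} \bigl(\boxed{22} \otimes \boxed{222} \otimes \boxed{2222}\bigr) = \boxed{22} \otimes \boxed{223} \otimes \boxed{2334}.$$

As a result, we have reproduced (E.1), which was the starting point of the previous Example E.3. Summarizing, the path p = 11112221322433 has been obtained as $p = \Phi^{(1)} C^{(1)} \Phi^{(2)} C^{(2)} \Phi^{(3)} C^{(3)}(p^{(3)})$.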

References

[1] R.J. Baxter, Exactly Solved Models in Statistical Mechanics, Academic Press, London, 1982.
[2] H.A. Bethe, Zur Theorie der Metalle, I. Eigenwerte und Eigenfunktionen der linearen Atomkette, Z. Phys. 71 (1931) 205-231.
[3] M. Gaudin, La fonction d'onde de Bethe, Masson, Paris, 1983.
[4] V.E. Korepin, N.M. Bogoliubov, A.G. Izergin, Quantum Inverse Scattering Method and Correlation Functions, Cambridge Univ. Press, 1997.
[5] M. Takahashi, Thermodynamics of One-Dimensional Solvable Models, Cambridge Univ. Press, 1999.
[6] G.E. Andrews, R.J. Baxter, P.J. Forrester, Eight vertex SOS model and generalized Rogers-Ramanujan-type identities, J. Stat. Phys. 35 (1984) 193-266.
[7] E. Date, M. Jimbo, A. Kuniba, T. Miwa, M. Okado, Exactly solvable SOS models: Local height probabilities and theta function identities, Nucl. Phys. B 290 (1987) 231-273;
E. Date, M. Jimbo, A. Kuniba, T. Miwa, M. Okado, Proof of the star-triangle relation and combinatorial identities, Adv. Stud. Pure Math. 16 (1988) 17-122.
[8] Combinatorial Aspect of Integrable Systems, A. Kuniba, M. Okado (Eds.), MSJ Memoirs 17 (2007).
[9] S.V. Kerov, A.N. Kirillov, N.Yu. Reshetikhin, Combinatorics, the Bethe ansatz and representations of the symmetric group, J. Sov. Math. 41 (1988) 916-924.
[10] A.N. Kirillov, N.Yu. Reshetikhin, The Bethe ansatz and the combinatorics of Young tableaux, J. Sov. Math. 41 (1988) 925-955.
[11] M. Kashiwara, On crystal bases of the q-analogue of universal enveloping algebras, Duke Math. J. 63 (1991) 465-516.
[12] S.-J. Kang, M. Kashiwara, K.C. Misra, T. Miwa, T. Nakashima, A. Nakayashiki, Affine crystals and vertex models, Int. J. Mod. Phys. A 7 (Suppl. 1A) (1992) 449-484.
[13] A. Berkovich, B.M. McCoy, A. Schilling, Rogers-Schur-Ramanujan type identities for the M(p, p') minimal models of conformal field theory, Commun. Math. Phys. 191 (1998) 325-395.
[14] S. Dasmahapatra, R. Kedem, T.R. Klassen, B.M. McCoy, E. Melzer, Quasi-particles, conformal field theory, and q-series, Int. J. Mod. Phys. B 7 (1993) 3617-3648.
[15] B.L. Feigin, A.V. Stoyanovsky, Quasi-particle models for the representations of Lie algebras and geometry of flag manifold, hep-th/9308079.
[16] O. Foda, T.A. Welsh, Melzer's identities revisited, Contemp. Math. 248 (1999) 207-234.
[17] A. Nakayashiki, Y. Yamada, Kostka polynomials and energy functions in solvable lattice models, Selecta Math. New Ser. 3 (1997) 547-599.
[18] S.O. Warnaar, Fermionic solution of the Andrews-Baxter-Forrester model I: unification of TBA and CTM methods, J. Stat. Phys. 82 (1996) 657-685.
[19] G. Hatayama, A. Kuniba, M. Okado, T. Takagi, Y. Yamada, Remarks on fermionic formula, Contemp. Math. 248 (1999) 243-291.
[20] G. Hatayama, A. Kuniba, M. Okado, T. Takagi, Z. Tsuboi, Paths, crystals and fermionic formulae, Prog. Math. Phys. 23 (2002) 205-272.
[21] I. Macdonald, Symmetric functions and Hall polynomials, second edition, Oxford Univ. Press, New York, 1995.
[22] A.N. Kirillov, A. Schilling, M. Shimozono, A bijection between Littlewood-Richardson tableaux and rigged configurations, Selecta Math. 8 (2002) 67-135.
[23] A. Schilling, X = M Theorem: Fermionic formulas and rigged configurations under review, Combinatorial Aspect in Integrable Systems, MSJ Memoirs 17 (2007) 75-104.
[24] A. Schilling, M. Shimozono, X = M for symmetric powers, J. Alg. 295 (2006) 562-610.
[25] M. Okado, A. Schilling, M. Shimozono, A crystal to rigged configuration bijection for nonexceptional affine algebras, in: N. Jing (Ed.), Algebraic Combinatorics and Quantum Groups, World Scientific, 2003, pp. 85-124.
[26] G. Hatayama, K. Hikami, R. Inoue, A. Kuniba, T. Takagi, T. Tokihiro, The $A_M^{(1)}$ automata related to crystals of symmetric tensors, J. Math. Phys. 42 (2001) 274-308.
[27] K. Fukuda, M. Okado, Y. Yamada, Energy functions in box-ball systems, Int. J. Mod. Phys. A 15 (2000) 1379-1392.
[28] D. Takahashi, On some soliton systems defined by using boxes and balls, in: Proceedings of the International Symposium on Nonlinear Theory and Its Applications NOLTA '93, 1993, pp. 555-558.
[29] D. Takahashi, J. Satsuma, A soliton cellular automaton, J. Phys. Soc. Jpn. 59 (1990) 3514-3519.
[30] G. Hatayama, A. Kuniba, T. Takagi, Soliton cellular automata associated with crystal bases, Nucl. Phys. B 577 (2000) 619-645.
[31] G. Hatayama, A. Kuniba, M. Okado, T. Takagi, Y. Yamada, Scattering rules in soliton cellular automata associated with crystal bases, Contemp. Math. 297 (2002) 151-182.
[32] A. Kuniba, M. Okado, Y. Yamada, Box-ball system with reflecting end, J. Nonlin. Math. Phys. 12 (2005) 475-507.
[33] T. Tokihiro, D. Takahashi, J. Matsukidaira, J. Satsuma, From soliton equations to integrable cellular automata through a limiting procedure, Phys. Rev. Lett. 76 (1996) 3247-3250.
[34] A. Kuniba, M. Okado, R. Sakamoto, T. Takagi, Y. Yamada, Crystal interpretation of Kerov-Kirillov-Reshetikhin bijection, Nucl. Phys. B 740 (2006) 299-327.
[35] R. Sakamoto, Crystal interpretation of Kerov-Kirillov-Reshetikhin bijection II. Proof for $sl_n$ case, math.QA/0601697, J. Alg. Comb., in press.
[36] M. Sato, Y. Sato, Soliton equations as dynamical systems on infinite dimensional Grassmann manifold, Nonlinear PDE in Applied Science, US-Japan Seminar, Tokyo, 1982, Lecture Notes Numer. Appl. Anal. 5 (1982) 259-271.
[37] M. Jimbo, T. Miwa, Solitons and infinite dimensional Lie algebras, Publ. RIMS. Kyoto Univ. 19 (1983) 943-1001.
[38] Y. Yamada, A birational representation of Weyl group, combinatorial R-matrix and discrete Toda equation, in: A.N. Kirillov, N. Liskova (Eds.), Physics and Combinatorics 2000, World Scientific, 2001, pp. 305-319.
[39] J.S. Birman, Braids, Links, and Mapping Class Groups, Princeton Univ. Press, 1974.
[40] L. Deka, A. Schilling, New fermionic formula for unrestricted Kostka polynomials, J. Comb. Theor. Ser. A 113 (2006) 1435-1461.
[41] C.L. Schultz, Eigenvectors of the multicomponent generalization of the six-vertex model, Physica A 122 (1983) 71-88.

A. Losev a, S. Shadrin b,*, I. Shneiberg c

a Institute for Theoretical and Experimental Physics, Bolshaya Cheremushkinskaya 25, Moscow 117218, Russia

b Department of Mathematics, University of Zurich, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland

c Department of Algebra, Faculty of Mechanics and Mathematics, Moscow State University, Leninskie Gory, GSP,

Received 17 April 2007; accepted 4 July 2007

Available online 19 July 2007

Abstract

We propose a Hodge field theory construction that captures algebraic properties of the reduction of Zwiebach invariants to Gromov-Witten invariants. It generalizes the Barannikov-Kontsevich construction to the case of higher genera correlators with gravitational descendants. We prove the main theorem stating that algebraically defined Hodge field theory correlators satisfy all tautological relations. From this perspective the statement that the Barannikov-Kontsevich construction provides a solution of the WDVV equation appears as the simplest particular case of our theorem. It also generalizes the particular cases of other low-genera tautological relations proven in our earlier works; we replace the old technical proofs by a novel conceptual proof.

© 2007 Elsevier B.V. All rights reserved.

1. Introduction

In this paper we present an attempt to formalize what may be called a string field theory (SFT)

for (closed) topological strings with Hodge property.

From the very first days of string theory it was considered a kind of generalization of the perturbative expansion of quantum field theory in the (functional) integral representation. The space of graphs with g loops with metrics on edges (Schwinger proper times) was generalized to the moduli space of Riemann surfaces. Indeed, the latter space really looks like a principal $U(1)^n$ bundle over the former space near the points of maximal degeneracy (i.e., where the maximal number of handles are pinched).

* Corresponding author.
0550-3213/$ see front matter © 2007 Elsevier B.V. All rights reserved.
doi:10.1016/j.nuclphysb.2007.07.003

A natural question is whether there are special string theories that degenerate exactly to quantum field theories (perhaps of a special kind). If so, such theories should enjoy both the finiteness of string theory and the (functional) integral description of quantum field theory.

One of the first attempts to construct a theory of this type was made by Zwiebach in [1]. He divided the moduli space into two regions: the internal piece and the boundary. He observed that surfaces representing the boundary region may be constructed from those representing the internal piece by gluing them with the help of cylinders (with flat metric). Therefore, he proposed to take integrals over the moduli spaces in two steps: first, to take an integral over the internal pieces, which produces vertices, and then to take an integral over the metrics on cylinders, which exactly reproduces the integral over the Schwinger parameters on graphs in the QFT prescription.

In this approach, he arrived at an infinite number of vertices of different internal genera and with different numbers of external legs. However, he observed that such vertices satisfy quadratic relations that are a quantum version of some infinity-structure. At that time the community of theoretical physicists seemed not to be impressed by a Lagrangian with an infinite number of (almost uncomputable¹) vertices.

The next attempt was made by Witten [2]. He assumed that in topological string theories there may be a limit in the space of two-dimensional theories such that the measure of integration goes to the vicinity of the points of maximal degeneration. In the type B theories such a limit seems to be the large volume limit of the target space; this motivated Witten's Chern-Simons-like representation for the topological string theory. This approach was further developed by Bershadsky et al. in [3]. We note that the tropical limit of Gromov-Witten theory [4] (type A topological strings) seems to realize the same QFT degeneration of string theory. Indeed, the tropical limit of a Riemann surface mapped to a toric variety is represented by a graph mapped to the moment map domain.

In the development of topological string theory it became clear that the proper object is not just a measure on the moduli space of complex structures of Riemann surfaces, but rather a differential form on this space. In the original formulation these differential forms were assigned to the tensor algebra of the cohomology of some complex; such objects are called Gromov-Witten invariants. We say that Gromov-Witten invariants are QFT-like if the differential forms of non-zero degree have support only in a vicinity of the points of maximal degeneration.

We generalized the definition of Gromov-Witten invariants in [5] by lifting it from the cohomology of a complex to the full complex. This generalization involved an enlargement of the moduli space from the Deligne-Mumford space to the Kimura-Stasheff-Voronov space [6], and we called the resulting objects Zwiebach invariants (in fact, some pieces of this construction appeared earlier in [1] and [7]). The complex of states involved in the definition of Zwiebach invariants is a bicomplex due to the action of the second differential. The second differential represents the substitution of a special vector field, corresponding to the constant rotation of the phase of the local coordinate at a marked point, into differential forms on the Kimura-Stasheff-Voronov space.

Once we have some Zwiebach invariants, it is possible to produce new Zwiebach invariants by contraction of an acyclic Hodge sub-bicomplex. In fact, it is one of the main properties of Zwiebach invariants. Consider a sub-bicomplex where these two differentials act freely. We call it a Hodge contractible bicomplex. The operation of contraction of a Hodge contractible bicomplex turns Zwiebach invariants into induced Zwiebach invariants on the coset with respect to the contractible bicomplex. Induced Zwiebach invariants are differential forms whose support is a union of the support of the initial Zwiebach invariants and small neighbourhoods of the points of maximal degeneration. This procedure is a generalization from intervals to cylinders of the procedure of induction of $L_\infty$-structures, see, e.g., [8,9].

¹ Note that the computation of an integral over a subspace with a boundary is harder than that of an integral over a compact space.

This way we can obtain QFT-like Gromov-Witten invariants. We just need to start with Zwiebach invariants that have (in some suitable sense) no support inside the Kimura-Stasheff-Voronov spaces. In fact, it is even enough to consider a weaker condition, motivated by applications. That is, usually people consider the integrals of Gromov-Witten invariants only over the tautological classes in the moduli space of curves. So, we call a set of Zwiebach invariants vertex-like if the integral over the Kimura-Stasheff-Voronov spaces of any of their non-zero components multiplied by the pullback (from the Deligne-Mumford space) of any tautological class vanishes.

Consider vertex-like Zwiebach invariants. Assume we contract a Hodge contractible bicomplex down to cohomology. We obtain differential forms on the Deligne-Mumford space, such that the integral of the product of any such form of non-zero degree with any tautological class vanishes in the interior of the moduli space. Integrals of such forms over the moduli spaces turn out to be sums over graphs (corresponding to degenerate Riemann surfaces). They resemble Feynman diagrams, and the generating function for the integrals over moduli spaces resembles the diagrammatic expansion of perturbative quantum field theory.

In this paper, we do not construct examples of vertex-like Zwiebach invariants (we are going

to do this explicitly in a future publication, as well as the corresponding theory for the spaces

introduced in [10,11]). Rather we conjecture that they exist and study the consequences of this

assumption. We call the emerging construction the Hodge field theory, and now we will explain

it in some detail.

First of all, the degree zero parts of vertex-like Zwiebach invariants induce the structure of a homotopy cyclic Hodge algebra on the target complex [5]. We recall that a cyclic Hodge algebra is just a Hodge dGBV-algebra with one additional axiom (the 1/12-axiom, see below).

In fact, this structure is interesting by itself, without any reference to Zwiebach invariants. It first appeared in the paper of Barannikov and Kontsevich [12]; it captures the properties of polyvector fields on Calabi-Yau manifolds. More examples of dGBV-algebras are studied in [13] and [14]. It is possible to understand the structure of a dGBV-algebra as a natural generalization of the algebraic structure studied in [15].

In the Hodge field theory construction we consider only a particular case, where we obtain the axioms of a cyclic Hodge algebra itself, not up to homotopy. We are aware that demanding the existence of vertex-like Zwiebach invariants simultaneously with the vanishing of the homotopy piece of the cyclic Hodge algebra conditions may be too restrictive, while considering only those relations that lead to the axioms of a cyclic Hodge algebra may be too weak. Nevertheless, we proceed.

In the Hodge field theory construction we define graph expressions for the analogues of Gromov-Witten invariants multiplied by tautological classes using only cyclic Hodge algebra data. We call them Hodge field theory correlators. The corresponding action of the Hodge field theory is written down explicitly in Section 6.2.

Our main result is the proof that the Hodge field theory correlators satisfy all universal equations that follow from relations among tautological classes in the cohomology.

The first result of this kind is due to Barannikov and Kontsevich. They noticed that there is a solution of the WDVV equation that is associated to a dGBV-algebra (this solution is the critical value of the BCOV action [3], see [12, Appendix] and [5, Appendix]). Later, we reproved this in [5]. Then, in [5,16-18] we proved some other low-genera universal equations. Here we generalize all these results and put all the calculations done before into a proper framework.

In particular, the main problem for us was to define a graph expression in tensors of a cyclic Hodge algebra that corresponds to the full Gromov-Witten potential with descendants. The first steps were made in [16,17], where we introduced the definition of descendants at one point in Hodge field theory (mostly for combinatorial reasons). But then we observed that it is a part of a natural definition of the potential with descendants in cyclic Hodge algebras, which appears as a special case of degeneration of vertex-like Zwiebach invariants multiplied by tautological classes.

In this paper, we present and study this construction. We prove in a completely algebraic way that Hodge field theory correlators satisfy the same equations as a Gromov-Witten potential: string, dilaton, and the whole system of PDEs coming from tautological relations in the cohomology of the moduli space of curves (see also [18] for some preliminary results). In what follows we not only present the proof but also do our best to relate algebraic definitions and statements on Hodge field theory to analogous constructions and theorems in the theory of Zwiebach invariants.

1.1. Organization of the paper

In Section 2 we recall all necessary facts about the axiomatic Gromov-Witten theory. In Section 3 we define Zwiebach invariants and explain the motivation to consider the sums over graphs in cyclic Hodge algebras. In Section 4 we define cyclic Hodge algebras and the corresponding descendant potential. In Section 5 we state the main properties of the descendant potential in cyclic Hodge algebras, and the rest of the paper is devoted to the proofs.

2. Gromov-Witten theory

In this section we recall what Gromov-Witten theory is and explain its basic properties that we are going to reproduce in the Hodge field theory construction.

2.1. Gromov-Witten invariants

Let us fix a finite-dimensional vector space $H_0$ over $\mathbb{C}$ together with a homogeneous basis $H_0 = \langle e_1, \dots, e_s \rangle$ and a non-degenerate scalar product $\eta_{ij} = (\cdot,\cdot)$ on it. Let $e_1$ be a distinguished even element of the basis.

Consider the moduli spaces of curves $\overline{M}_{g,n}$. On each $\overline{M}_{g,n}$ we take a differential form $\Omega_{g,n}$ of mixed degree with values in $H_0^{\otimes n}$. The whole system of forms $\{\Omega_{g,n}\}$ is called Gromov-Witten invariants if it satisfies the following axioms [19,20]:

1. There are two actions of the symmetric group $S_n$ on $\Omega_{g,n}$. First, we can relabel the marked points on curves in $\overline{M}_{g,n}$; second, we can interchange the factors in the tensor product $H_0^{\otimes n}$. We require that $\Omega_{g,n}$ is equivariant with respect to these two actions of $S_n$. In other words, one can think that each copy of $H_0$ in the tensor product is assigned to a specific marked point on curves in $\overline{M}_{g,n}$.

2. The forms must be closed, $d\Omega_{g,n} = 0$.

3. Consider the mapping $\pi\colon \overline{M}_{g,n+1} \to \overline{M}_{g,n}$ forgetting the last marked point. Then the correspondence between $\Omega_{g,n}$ and $\Omega_{g,n+1}$ is given by the formula
\[ \pi^*\Omega_{g,n} = (\Omega_{g,n+1}, e_1). \tag{1} \]
The meaning of the right-hand side is the following. We want to turn an $H_0^{\otimes(n+1)}$-valued form into an $H_0^{\otimes n}$-valued one. So, we take the copy of $H_0$ corresponding to the last marked point and contract it with the vector $e_1$ using the scalar product.

4. Consider an irreducible boundary divisor in $\overline{M}_{g,n}$ whose generic point is represented by a two-component curve. It is the image of a natural mapping $\sigma\colon \overline{M}_{g_1,n_1+1} \times \overline{M}_{g_2,n_2+1} \to \overline{M}_{g,n}$, where $g = g_1 + g_2$ and $n = n_1 + n_2$. We require that
\[ \sigma^*\Omega_{g,n} = (\Omega_{g_1,n_1+1} \wedge \Omega_{g_2,n_2+1}, \eta^{-1}). \tag{2} \]
Here on the right-hand side we contract with the scalar product the two copies of $H_0$ that correspond to the node.
In the same way, consider the divisor of genus $g-1$ curves with one self-intersection. It is the image of a natural mapping $\rho\colon \overline{M}_{g-1,n+2} \to \overline{M}_{g,n}$. In this case, we require that
\[ \rho^*\Omega_{g,n} = (\Omega_{g-1,n+2}, \eta^{-1}). \tag{3} \]
As before, we contract the two copies of $H_0$ corresponding to the node.

5. We also assume that $(\Omega_{0,3}, e_1 \otimes e_i \otimes e_j) = (e_i, e_j) = \eta_{ij}$.

2.2. Gromov-Witten potential

Let us associate to each $e_i$ the set of formal variables $T_{n,i}$, $n = 0, 1, 2, \dots$. By $F_g$ denote the formal power series in these variables defined as
\[ F_g := \sum_{n} \frac{1}{n!} \sum_{a_1,\dots,a_n \ge 0} \int_{\overline{M}_{g,n}} \Bigl( \Omega_{g,n} \prod_{i=1}^n \psi_i^{a_i},\ \bigotimes_{i=1}^n \sum_{j=1}^s e_j T_{a_i,j} \Bigr). \tag{4} \]
The first sum is taken over $n \ge 3$ for $g = 0$, $n \ge 1$ for $g = 1$, and $n \ge 0$ for $g \ge 2$. On the right-hand side, we contract each copy of $H_0$ with the factor of the tensor product associated to the same marked point.

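The formal power series $F := \exp\bigl(\sum_{g \ge 0} \hbar^{g-1} F_g\bigr)$ is called the Gromov-Witten potential associated to the system of Gromov-Witten invariants $\{\Omega_{g,n}\}$. The coefficients of $F_g$, $g \ge 0$, are called correlators and denoted by
\[ \langle \tau_{a_1,i_1} \cdots \tau_{a_n,i_n} \rangle_g := \int_{\overline{M}_{g,n}} \Bigl( \Omega_{g,n} \prod_{j=1}^n \psi_j^{a_j},\ \bigotimes_{j=1}^n e_{i_j} \Bigr). \tag{5} \]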

The main properties of GW potentials come from the geometry of the moduli space of curves. First, one can prove that the coefficients of $F$ satisfy the string and dilaton equations:
\[ \Bigl\langle \tau_{0,1} \prod_{j=1}^n \tau_{a_j,i_j} \Bigr\rangle_g = \sum_{j=1}^n \Bigl\langle \tau_{a_j-1,i_j} \prod_{k \ne j} \tau_{a_k,i_k} \Bigr\rangle_g, \tag{6} \]
\[ \Bigl\langle \tau_{1,1} \prod_{j=1}^n \tau_{a_j,i_j} \Bigr\rangle_g = (2g - 2 + n) \Bigl\langle \prod_{j=1}^n \tau_{a_j,i_j} \Bigr\rangle_g. \tag{7} \]
The string equation is a corollary of the fact that $\pi^*\psi_j = \psi_j - D_j$; here $\pi\colon \overline{M}_{g,n+1} \to \overline{M}_{g,n}$ is the projection forgetting the last marked point and $D_j$ is the divisor in $\overline{M}_{g,n+1}$ whose generic point is represented by a two-component curve with one node such that one component has genus 0 and contains exactly two marked points, the $j$th and the $(n+1)$th ones. It is assumed that $\sum_{j=1}^n a_j > 0$.

The dilaton equation is a corollary of the fact that, in the same notation, $\pi_*\psi_{n+1} = 2g - 2 + n$. Of course, we assume that $2g - 2 + n > 0$.
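As an illustration of how the string equation (6) determines correlators, here is a minimal Python sketch for the simplest case only: the theory of a point in genus 0, where the correlators are the intersection numbers of $\psi$-classes on $\overline{M}_{0,n}$. The function names `genus0` and `correlator0` are hypothetical, and the initial value $\langle \tau_0\tau_0\tau_0 \rangle_0 = 1$ is taken as given.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def genus0(exponents):
    """Return <tau_{a_1} ... tau_{a_n}>_0 for a sorted tuple of exponents."""
    n = len(exponents)
    # The integral over M_{0,n} vanishes unless the degrees add up to n - 3.
    if n < 3 or sum(exponents) != n - 3:
        return 0
    if n == 3:
        return 1  # initial condition <tau_0 tau_0 tau_0>_0 = 1
    # Since sum(a_j) = n - 3 < n, the smallest exponent is 0: use it as tau_0.
    rest = list(exponents[1:])
    total = 0
    for j, a in enumerate(rest):
        if a > 0:  # summands with a_j = 0 are skipped in the string equation
            lowered = rest[:j] + [a - 1] + rest[j + 1:]
            total += genus0(tuple(sorted(lowered)))
    return total


def correlator0(*exponents):
    """Convenience wrapper accepting the exponents in any order."""
    return genus0(tuple(sorted(exponents)))
```

For instance, `correlator0(0, 0, 0, 1, 1)` is obtained from two applications of the string equation, and the dilaton equation (7) can be checked on the same data: `correlator0(1, 0, 0, 0, 1)` equals $(2\cdot 0 - 2 + 4)$ times `correlator0(0, 0, 0, 1)`.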

Second, any relation in the cohomology of $\overline{M}_{g,n}$ among natural $\psi$-$\kappa$-strata gives a relation for the correlators. Let us explain this in more detail.

2.3. Tautological relations

2.3.1. Stable dual graphs

The moduli space of curves $\overline{M}_{g,n}$ [21] has a natural stratification by the topological type of stable curves. We can combine natural strata with $\psi$-classes at marked points and at nodes and $\kappa$-classes on the moduli spaces of irreducible components. These objects are called $\psi$-$\kappa$-strata.

A convenient way to describe a $\psi$-$\kappa$-stratum in $\overline{M}_{g,n}$ is the language of stable dual graphs. Take a generic curve in the stratum. To each irreducible component we associate a vertex marked by its genus. To each node we associate an edge connecting the corresponding vertices (or a loop, if it is a double point of an irreducible curve). If there is a marked point on a component, then we add a leaf at the corresponding vertex, and we label leaves in the same way as marked points. If we multiply a stratum by some $\psi$-classes, then we just mark the corresponding leaves or half-edges (in the case when we add $\psi$-classes at nodes) by the corresponding powers of $\psi$. Also we mark each vertex by the $\kappa$-class associated to it.

Let us remark that by $\kappa$-classes on $\overline{M}_{g,n}$ we mean the classes
\[ \kappa_{k_1,\dots,k_l} := \pi_*\Bigl( \prod_{j=1}^l \psi_{n+j}^{k_j+1} \Bigr), \tag{8} \]
where $\pi\colon \overline{M}_{g,n+l} \to \overline{M}_{g,n}$ is the projection forgetting the last $l$ marked points. It is just another additive basis in the ring generated by the ordinary $\kappa$-classes ($\kappa_k$, $k \ge 1$, in our notation). The basic properties of these classes are stated in [22].
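The combinatorial data of a stable dual graph described in this subsection can be recorded in a few lines of Python. This is only a sketch under stated assumptions: the class `DualGraph` and its field names are hypothetical, decorations by $\psi$- and $\kappa$-classes are omitted, and the graph is assumed to be connected.

```python
from dataclasses import dataclass


@dataclass
class DualGraph:
    genera: list   # genus g_v of each vertex (irreducible component)
    edges: list    # pairs (u, v) of vertex indices; u == v encodes a loop
    leaves: list   # leaves[i] is the vertex carrying marked point i + 1

    def valence(self, v):
        # A loop contributes two half-edges to its vertex.
        half_edges = sum((a == v) + (b == v) for a, b in self.edges)
        return half_edges + sum(1 for w in self.leaves if w == v)

    def is_stable(self):
        # Each component must satisfy 2 g_v - 2 + (number of special points) > 0.
        return all(2 * g - 2 + self.valence(v) > 0
                   for v, g in enumerate(self.genera))

    def genus(self):
        # Arithmetic genus of the curve: vertex genera plus the first
        # Betti number of the graph (connectedness is assumed).
        return sum(self.genera) + len(self.edges) - len(self.genera) + 1

    def codimension(self):
        # Complex codimension of the stratum equals the number of nodes.
        return len(self.edges)
```

For example, `DualGraph([0, 1], [(0, 1)], [0, 0, 1])` encodes a boundary divisor of the moduli space of genus 1 curves with 3 marked points, with two marked points on the genus 0 component.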

2.3.2. Integrals over $\psi$-$\kappa$-strata

Using the properties of GW invariants, one can express the integral of $\Omega_{g,n}$ over a $\psi$-$\kappa$-stratum $S$ in terms of correlators.

Consider a special case, when $S$ is represented by a two-vertex graph with no $\psi$- and $\kappa$-classes. Then, according to axiom 4, the integral of $\Omega_{g,n}$ is the product of integrals of $\Omega_{g_1,n_1+1}$ and $\Omega_{g_2,n_2+1}$ over the moduli spaces corresponding to the vertices, contracted by the scalar product:
\[ \int_S \Bigl( \Omega_{g,n}, \bigotimes_{j=1}^n e_{i_j} \Bigr) = \int_{\overline{M}_{g_1,n_1+1}} \Bigl( \Omega_{g_1,n_1+1}, \bigotimes_{j \in J_1} e_{i_j} \otimes e_{i'} \Bigr)\, \eta^{i'i''} \int_{\overline{M}_{g_2,n_2+1}} \Bigl( \Omega_{g_2,n_2+1}, \bigotimes_{j \in J_2} e_{i_j} \otimes e_{i''} \Bigr). \tag{9} \]
Here we assume that the genus of one component of a generic curve in $S$ is $g_1$ and $n_1$ marked points with labels $j \in J_1$, $|J_1| = n_1$, are on this component. The other component has genus $g_2$, and $n_2$ marked points with labels $j \in J_2$, $|J_2| = n_2$, lie on it. Of course, $g_1 + g_2 = g$, $n_1 + n_2 = n$.

Now consider a special case, when $S$ is represented by a one-vertex graph with $\psi$- and $\kappa$-classes. Let us assign a vector in the basis of $H_0$ to each leaf (to each marked point). Then, according to axiom 3, the integral
\[ \int_{\overline{M}_{g,n}} \Bigl( \Omega_{g,n} \prod_{j=1}^n \psi_j^{a_j}\, \kappa_{b_1,\dots,b_k},\ \bigotimes_{j=1}^n e_{i_j} \Bigr) \tag{10} \]
is equal to
\[ \int_{\overline{M}_{g,n+k}} \Bigl( \Omega_{g,n+k} \prod_{j=1}^n \psi_j^{a_j} \prod_{j=1}^k \psi_{n+j}^{b_j+1},\ \bigotimes_{j=1}^n e_{i_j} \otimes e_1^{\otimes k} \Bigr). \tag{11} \]
Combining these two special cases, one can obtain an expression in correlators that corresponds to an arbitrary $\psi$-$\kappa$-stratum.

2.3.3. Relations for correlators

Suppose that we have a linear combination $L$ of $\psi$-$\kappa$-strata that is equal to 0 in the cohomology of $\overline{M}_{g,n}$ (a tautological relation). Since $d\Omega_{g,n} = 0$, the integral of $(\Omega_{g,n}, \bigotimes_{j=1}^n e_{i_j})$ over $L$ is equal to zero for an arbitrary choice of primary fields. This gives an equation for correlators.

Usually, one also considers the pull-backs of $L$ to $\overline{M}_{g,n+n'}$, $n' \ge 0$, multiplied by arbitrary monomials of $\psi$-classes. Of course, they are also represented as vanishing linear combinations of $\psi$-$\kappa$-strata. This gives a system of PDEs for the formal power series $F_g$, $g \ge 0$. For a detailed description of the correspondence between tautological relations and universal PDEs for GW potentials see, e.g., [23] or [22].

There are 8 basic tautological relations known at the moment: WDVV, Getzler, Belorousski-Pandharipande, and topological recursion relations in $\overline{M}_{0,4}$, $\overline{M}_{1,1}$, $\overline{M}_{2,1}$, $\overline{M}_{2,2}$, $\overline{M}_{3,1}$ [23-26].

3. Zwiebach invariants

In Gromov-Witten theory (and also in topological string theory) the Gromov-Witten invariants are usually a structure on the cohomology of a target manifold (the space $H_0$) or on the cohomology of a complex of some other geometric origin. We introduced the notion of Zwiebach invariants in [5] in order to formalize in a convenient way what physicists mean by a topological conformal quantum field theory at the level of a complex rather than at the level of the cohomology.

The very general principles of homological algebra imply that algebraic structures on the

cohomology are often induced by some fundamental structures on a full complex (the standard

example is the induction of the infinity-structures from differential graded algebraic structures).

Such induction usually can be represented as a sum over trees with vertices corresponding to

fundamental operations and edges corresponding to the homotopy that contracts the complex to

its cohomology.

We would like to stress that Gromov-Witten invariants can also be considered as an induced structure on the cohomology of a complex. In this case, the fundamental structure on the whole complex is determined by Zwiebach invariants.

We are able to associate some structure on a bicomplex with a special compactification of the moduli space of curves (the Kimura-Stasheff-Voronov compactification). So, complexes are replaced by bicomplexes, where the second differential reflects the rotation of attached cylinders (or circles). This is an appearance of the string nature of the problem.

As an induced structure we indeed obtain a Gromov-Witten-type theory that, under some additional assumptions, can be presented in terms of a sum over graphs. Below we explain the whole construction, following [5] and with some additional details.

3.1. Kimura-Stasheff-Voronov spaces

We recall the construction of the Kimura-Stasheff-Voronov compactification $\overline{K}_{g,n}$ of the moduli space of curves of genus $g$ with $n$ marked points. It is a real blow-up of $\overline{M}_{g,n}$; we just remember the relative angles at double points. We can also choose an angle of the tangent vector at each marked point; this way we get a principal $U(1)^n$-bundle over $\overline{K}_{g,n}$. We denote the total space of this bundle by $S_{g,n}$.

There are also the standard mappings between different spaces $S_{g,n}$. First, one can consider the projection $\pi\colon S_{g,n+1} \to S_{g,n}$ forgetting the last marked point. Suppose that under the projection we have to contract a sphere that contains the points $x_i$, $x_{n+1}$, and a node. Denote the natural coordinates on the circles corresponding to $x_i$ and the node on a curve in $S_{g,n+1}$ by $\phi_i$ and $\theta$. Let $\tilde\phi_i$ be a coordinate on the circle corresponding to $x_i$ in $S_{g,n}$. Then $\tilde\phi_i = \phi_i + \theta$ under the projection $\pi$. In the same way, if we contract a sphere that contains two nodes and $x_{n+1}$, then $\tilde\theta = \theta_1 + \theta_2$, where $\theta_1$ and $\theta_2$ are the coordinates on the circles corresponding to the two nodes of a curve in $S_{g,n+1}$ and $\tilde\theta$ is a coordinate on the circle at the resulting node in $S_{g,n}$.

In the same way, when we consider the mappings $\sigma\colon S_{g_1,n_1+1} \times S_{g_2,n_2+1} \to S_{g,n}$ representing the natural boundary components of $S_{g,n}$, we have $\theta = \phi_{n_1+1} + \phi_{n_2+1}$, where $\phi_{n_1+1}$ and $\phi_{n_2+1}$ are the coordinates on the circles corresponding to the points that are glued by $\sigma$ into the node and $\theta$ is the coordinate on the circle at the node. For the mapping $\rho\colon S_{g-1,n+2} \to S_{g,n}$ we also have $\theta = \phi_{n+1} + \phi_{n+2}$ with the same notation.

3.2. Zwiebach invariants

Let us fix a Hodge bicomplex $H$ with two differentials denoted by $Q$ and $G_-$ and with an even scalar product $\eta = (\cdot,\cdot)$ invariant under the differentials:
\[ (Qv, w) = \pm(v, Qw), \qquad (G_-v, w) = \pm(v, G_-w), \tag{12} \]
\[ H = H_0 \oplus \bigoplus_\alpha \langle e_\alpha, Qe_\alpha, G_-e_\alpha, QG_-e_\alpha \rangle. \tag{13} \]
Here $H_4$ denotes the direct sum of the four-dimensional blocks in (13).

Below we consider the action of $Q$ and $G_-$ on $H^{\otimes n}$. We denote by $Q^{(k)}$ and $G_-^{(k)}$ the action of $Q$ and $G_-$, respectively, on the $k$th component of the tensor product.

On each $S_{g,n}$ we take a differential form $C_{g,n}$ of mixed degree with values in $H_0^{\otimes n}$. The whole system of forms $\{C_{g,n}\}$ is called Zwiebach invariants if it satisfies the axioms:

1. $C_{g,n}$ is $S_n$-equivariant.

3. $(G_-^{(k)} + \iota_k)C_{g,n} = 0$ for all $1 \le k \le n$ (we denote by $\iota_k$ the substitution of the vector field generating the action on $S_{g,n}$ of the $k$th copy of $U(1)$); $C_{g,n}$ is invariant under the action of $U(1)^n$.

4. $\pi^*C_{g,n} = (C_{g,n+1}, e_1)$, where $\pi\colon S_{g,n+1} \to S_{g,n}$ is the mapping forgetting the last marked point.

5. $\sigma^*C_{g,n} = (C_{g_1,n_1+1} \wedge C_{g_2,n_2+1}, \eta^{-1})$, where $\sigma\colon S_{g_1,n_1+1} \times S_{g_2,n_2+1} \to S_{g,n}$ represents the boundary component. In the same way, $\rho^*C_{g,n} = (C_{g-1,n+2}, \eta^{-1})$ for the mapping $\rho\colon S_{g-1,n+2} \to S_{g,n}$.

6. $(C_{0,3}, e_1 \otimes v \otimes w) = ((\mathrm{Id} + d\phi_2\, G_-)v, (\mathrm{Id} + d\phi_3\, G_-)w)$, where $\phi_2$ and $\phi_3$ are the coordinates on the circles at the corresponding points.

Zwiebach invariants on a bicomplex with zero differentials determine Gromov-Witten invariants. Indeed, in this case the factorization property implies that $\{C_{g,n}\}$ is lifted from the blowdown of the Kimura-Stasheff-Voronov spaces, i.e., it is determined by a set of forms on the Deligne-Mumford spaces. Then it is easy to check that this system of forms satisfies all axioms of Gromov-Witten invariants.

3.3. Induced Zwiebach invariants

Induced Zwiebach invariants are obtained by the contraction of $H_4$. We denote by $G_+$ the contraction operator. This means that $G_+H_0 = 0$, $\Pi_4 = \{Q, G_+\}$ is the projection to $H_4$ along $H_0$, $\{G_+, G_-\} = 0$, and $(G_+v, w) = \pm(v, G_+w)$.

We construct an induced Zwiebach form $C^{\mathrm{ind}}_{g,n}$ on a homotopy equivalent modification $\tilde S_{g,n}$ of the space $S_{g,n}$. At each boundary component $\gamma$ we glue the cylinder $\gamma \times [0, +\infty]$ such that $\gamma$ in $S_{g,n}$ is identified with $\gamma \times \{0\}$ in the cylinder.

So, we have the mappings $\sigma\colon S_{g_1,n_1+1} \times S_{g_2,n_2+1} \times [0, +\infty] \to \tilde S_{g,n}$ and $\rho\colon S_{g-1,n+2} \times [0, +\infty] \to \tilde S_{g,n}$ representing the boundary components with glued cylinders. We take a form $C_{g,n}$, restrict it to $H_0^{\otimes n}$, and extend it to the glued cylinder by the rule that
\[ \sigma^*C^{\mathrm{ind}}_{g,n} = \bigl( C^{\mathrm{ind}}_{g_1,n_1+1} \wedge C^{\mathrm{ind}}_{g_2,n_2+1},\ [e^{-t\Pi_4 - dt\,G_+}] \bigr) \tag{14} \]
in the first case and
\[ \rho^*C^{\mathrm{ind}}_{g,n} = \bigl( C^{\mathrm{ind}}_{g-1,n+2},\ [e^{-t\Pi_4 - dt\,G_+}] \bigr) \tag{15} \]
in the second case, where $[e^{-t\Pi_4 - dt\,G_+}]$ is the bivector obtained from the operator $e^{-t\Pi_4 - dt\,G_+}$ and $t$ is the coordinate on $[0, +\infty]$. This determines $C^{\mathrm{ind}}_{g,n}$ completely.

Now it is a straightforward calculation to check that the forms $C^{\mathrm{ind}}_{g,n}$ are $(d + Q)$-closed and satisfy the factorization property when restricted to the strata $\gamma \times \{+\infty\}$.

3.4. Induced Gromov-Witten theory

The induced Zwiebach invariants determine Gromov-Witten invariants. The correlators of the corresponding Gromov-Witten potential are given by the integrals over the fundamental cycles of $\overline{K}_{g,n}$ (we just forget the circles at marked points in $S_{g,n}$) of the forms $C^{\mathrm{ind}}_{g,n} \prod_{i=1}^n \psi_i^{a_i}$.

In fact, the fundamental class of $\overline{K}_{g,n}$ is represented as a sum over all irreducible boundary strata in $\overline{M}_{g,n}$. Indeed, a boundary stratum $\gamma$ in $\overline{M}_{g,n}$ has real codimension equal to twice the number of nodes of its generic curve. But then we add in $\overline{K}_{g,n}$ a real two-dimensional cylinder for each node. A simple explicit calculation allows us to express the integral over the component of the fundamental cycle of $\overline{K}_{g,n}$ corresponding to $\gamma$. It splits into the integrals of the initial Zwiebach invariants (multiplied by $\psi$-classes) over the moduli spaces corresponding to the irreducible components of curves in $\gamma$; they are contracted with the bivectors $[G_-G_+]$ (obtained from the operator $G_-G_+$ via the scalar product) corresponding to the nodes according to the topology of curves in $\gamma$.

So, we represent the correlators of the induced Gromov-Witten theory as sums over graphs. Then one can observe that $C_{0,3}$ determines a multiplication on $H$. The topology of the spaces $S_{0,4}$ and $S_{1,1}$ implies that the whole algebraic structure that we obtain on $H$ is the structure of a cyclic Hodge algebra up to $Q$-homotopy, see [5]. Let us assume that the initial system of Zwiebach invariants is simple enough, i.e., it induces the explicit structure of a cyclic Hodge algebra on $H$ and only the integrals of the zero-degree parts of the initial Zwiebach invariants (multiplied by $\psi$-classes) are non-vanishing on fundamental cycles. In this case, the induced Gromov-Witten potential can be described in very simple algebraic terms. This is the motivation for the definition of the Hodge field theory construction given in the next section.

4. Construction of correlators in Hodge field theory

In this section, we describe in a very formal algebraic way the sum over graphs obtained as an

expression for the Gromov-Witten potential induced from Zwiebach invariants in the previous

section.

4.1. Cyclic Hodge algebras

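In this section, we recall the definition of cyclic Hodge dGBV-algebras [5,16,17,20] (cyclic Hodge algebras, for short). A supercommutative associative $\mathbb{C}$-algebra $H$ with unit is called a cyclic Hodge algebra if there are two odd linear operators $Q, G_-\colon H \to H$ and an even linear function $\int\colon H \to \mathbb{C}$ called the integral. They must satisfy the following axioms:

1. $(H, Q, G_-)$ is a bicomplex:
\[ Q^2 = G_-^2 = QG_- + G_-Q = 0. \tag{16} \]

2. $H = H_0 \oplus H_4$, where $QH_0 = G_-H_0 = 0$ and $H_4$ is a direct sum of subspaces of dimension 4 generated by $e_\alpha, Qe_\alpha, G_-e_\alpha, QG_-e_\alpha$ for some vectors $e_\alpha \in H_4$, i.e.,
\[ H = H_0 \oplus \bigoplus_\alpha \langle e_\alpha, Qe_\alpha, G_-e_\alpha, QG_-e_\alpha \rangle \tag{17} \]
(Hodge decomposition).

3. $Q$ is an operator of the first order; it satisfies the Leibniz rule:
\[ Q(ab) = Q(a)b + (-1)^{\tilde a} aQ(b) \tag{18} \]
(here and below we denote by $\tilde a$ the parity of $a \in H$).

4. $G_-$ is an operator of the second order; it satisfies the 7-term relation:
\[ G_-(abc) = G_-(ab)c + (-1)^{\tilde b(\tilde a+1)} bG_-(ac) + (-1)^{\tilde a} aG_-(bc) - G_-(a)bc - (-1)^{\tilde a} aG_-(b)c - (-1)^{\tilde a+\tilde b} abG_-(c). \tag{19} \]

5. $G_-$ satisfies the 1/12-axiom:
\[ \mathrm{str}(G_- \circ a\cdot) = (1/12)\, \mathrm{str}(G_-(a)\cdot) \tag{20} \]
(here $a\cdot$ and $G_-(a)\cdot$ are the operators of multiplication by $a$ and $G_-(a)$, respectively; str means supertrace).

Define an operator $G_+\colon H \to H$ related to the particular choice of Hodge decomposition. We put $G_+H_0 = 0$, and on each subspace $\langle e_\alpha, Qe_\alpha, G_-e_\alpha, QG_-e_\alpha \rangle$ we define $G_+$ as
\[ G_+e_\alpha = G_+G_-e_\alpha = 0, \qquad G_+Qe_\alpha = e_\alpha, \qquad G_+QG_-e_\alpha = G_-e_\alpha. \tag{21} \]
Then $\Pi_4 = \{Q, G_+\}$ is the projection to $H_4$ along $H_0$, and $\Pi_0 = \mathrm{Id} - \Pi_4$ is the projection to $H_0$ along $H_4$.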

Consider the integral $\int\colon H \to \mathbb{C}$. We require that
\[ \int Q(a)b = (-1)^{\tilde a+1} \int aQ(b), \qquad \int G_-(a)b = (-1)^{\tilde a} \int aG_-(b), \qquad \int G_+(a)b = (-1)^{\tilde a} \int aG_+(b). \tag{22} \]
These properties imply that $\int G_-G_+(a)b = \int aG_-G_+(b)$, $\int \Pi_4(a)b = \int a\Pi_4(b)$, and $\int \Pi_0(a)b = \int a\Pi_0(b)$.

We can define a scalar product on $H$ as $(a, b) = \int ab$. We suppose that this scalar product is non-degenerate. Using the scalar product we may turn any operator $A\colon H \to H$ into the bivector that we denote by $[A]$.

4.2. Tensor expressions in terms of graphs

Here we explain a way to encode some tensor expressions over an arbitrary vector space in

terms of graphs.

Consider an arbitrary graph (we allow graphs to have leaves and we require vertices to be

at least of degree 3, the definition of graph that we use can be found in [20]). We associate a

symmetric n-form to each internal vertex of degree n, a symmetric bivector to each edge, and

a vector to each leaf. Then we can substitute the tensor product of all vectors in leaves and

bivectors in edges into the product of n-forms in vertices, distributing the components of tensors

in the same way as the corresponding edges and leaves are attached to vertices in the graph. This

way we get a number.

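[Graph (23): two vertices joined by two edges carrying the bivectors $v$ and $w$; the left vertex carries the leaves $a$, $b$, $c$ and the right vertex carries the leaf $d$.]

We assign a 5-form $x$ to the left vertex of this graph and a 3-form $y$ to the right vertex. Then the number that we get from this graph is $x(a, b, c, v, w) \cdot y(v, w, d)$.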

Note that vectors, bivectors and n-forms used in this construction can depend on some variables. Then what we get is not a number, but a function.
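The contraction encoded by a graph can be made concrete on a toy example. The following Python sketch is an illustration under assumptions that are not part of the text: the vector space is the two-dimensional algebra $\mathbb{C}[\epsilon]/(\epsilon^2)$ with all elements even (so no signs appear), the integral picks out the coefficient of $\epsilon$, every vertex carries the form $(a_1, \dots, a_n) \mapsto \int a_1 \cdots a_n$, and both edges carry the bivector $[\mathrm{Id}]$ unless stated otherwise. The function names are hypothetical.

```python
from itertools import product

# Elements p + q*eps of C[eps]/(eps^2) are stored as pairs (p, q).
ONE = (1, 0)
EPS = (0, 1)
BASIS = [ONE, EPS]
# Inverse Gram matrix of (a, b) = integral(a*b) in this basis;
# it represents the bivector [Id].
ETA_INV = [[0, 1], [1, 0]]


def mul(a, b):
    """Multiply two elements, using eps^2 = 0."""
    return (a[0] * b[0], a[0] * b[1] + a[1] * b[0])


def integral(a):
    """The assumed integral: the coefficient of eps."""
    return a[1]


def form(*vectors):
    """Symmetric n-form at a vertex: integrate the product of the inputs."""
    acc = ONE
    for v in vectors:
        acc = mul(acc, v)
    return integral(acc)


def two_vertex_graph(a, b, c, d, v=ETA_INV, w=ETA_INV):
    """Contract a 5-valent and a 3-valent vertex along two edges v and w."""
    total = 0
    for i, j, k, l in product(range(2), repeat=4):
        total += (form(a, b, c, BASIS[i], BASIS[k]) * v[i][j] * w[k][l]
                  * form(BASIS[j], BASIS[l], d))
    return total
```

With all four leaves equal to the unit, the result is the trace of the identity operator, that is, the dimension of the space.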

4.3. Usage of graphs in cyclic Hodge algebras

Consider a cyclic Hodge algebra H . There are some standard tensors over H , which we

associate to elements of graphs below. Here we introduce the notations for these tensors.

We always assign the form
\[ (a_1, \dots, a_n) \mapsto \int a_1 \cdots a_n \tag{24} \]
to a vertex of degree $n$.

There is a collection of bivectors that will be assigned below to edges: $[G_-G_+]$, $[\Pi_0]$, $[\mathrm{Id}]$, $[QG_+]$, $[G_+Q]$, $[G_+]$, and $[G_-]$. In pictures, edges with these bivectors will be denoted by

[Pictures (25): seven edge decorations; an edge with $[G_-G_+]$ is drawn with a thick black dot and an edge with $[\mathrm{Id}]$ is drawn empty.]

respectively. Note that an empty edge corresponding to the bivector $[\mathrm{Id}]$ can usually be contracted (if it is not a loop).

The vectors that we will put at leaves depend on some variables. Let $\{e_1, \dots, e_s\}$ be a homogeneous basis of $H_0$. In particular, we assume that $e_1$ is the unit of $H$. To each vector $e_i$ we associate formal variables $T_{n,i}$, $n \ge 0$, of the same parity as $e_i$. Then we will put at a leaf one of the vectors $E_n = \sum_{i=1}^s e_i T_{n,i}$, $n \ge 0$, and we will mark such a leaf by the number $n$. In our pictures, an empty leaf is the same as the leaf marked by 0.

4.3.1. Remark

There is a subtlety related to the fact that $H$ is a $\mathbb{Z}_2$-graded space. In order to give an honest definition we must do the following. Suppose we consider a graph of genus $g$. We can choose $g$ edges in such a way that the graph cut at these edges turns into a tree. To each of these edges we have already assigned a bivector $[A]$ for some operator $A\colon H \to H$. Now we have to put the bivector $[JA]$ instead of the bivector $[A]$, where $J$ is the operator defined by the formula $J\colon a \mapsto (-1)^{\tilde a} a$.

In particular, consider the following graph (this is also an example of the notation given above):

[Graph (26): a trivalent vertex with one empty loop and one empty leaf.]

An empty loop corresponds to the bivector $[\mathrm{Id}]$. An empty leaf corresponds to the vector $E_0$. A trivalent vertex corresponds to the 3-form given by the formula $(a, b, c) \mapsto \int abc$. If we ignore this remark, then what we get is just the trace of the operator $a \mapsto E_0 \cdot a$. But using this remark we get the supertrace of this operator.

In fact, this subtlety will play no role in this paper. It affects only some signs in calculations, and all these signs will be hidden in lemmas taken from [5,16]. So, one can just ignore this remark.

4.4. Correlators

We are going to define the potential using correlators. Let
\[ \langle \tau_{k_1}(V_1) \cdots \tau_{k_n}(V_n) \rangle_g \tag{27} \]
be the sum over graphs of genus $g$ with $n$ leaves marked by $\tau_{k_i}(V_i)$, $i = 1, \dots, n$, where $V_1, \dots, V_n$ are vectors in $H$ and $\tau_{k_i}$ are just formal symbols. The index of each internal vertex of these graphs is $\ge 3$; we associate to it the symmetric form (24). There are two possible types of edges: edges marked by $[G_-G_+]$ (thick black dots in pictures, heavy edges in the text) and edges marked by $[\mathrm{Id}]$ (empty edges). Since an empty edge connecting two different vertices can be contracted, we assume that all empty edges are loops.

Consider a vertex of such a graph. Let us describe all possible half-edges attached to this vertex. There are $2g$, $g \ge 0$, half-edges coming from $g$ empty loops; $m$ half-edges coming from heavy edges of the graph; and $l$ leaves marked $\tau_{k_{a_1}}(V_{a_1}), \dots, \tau_{k_{a_l}}(V_{a_l})$. Then we say that the type of this vertex is $(g, m; k_{a_1}, \dots, k_{a_l})$. We denote the type of a vertex $v$ by $(g(v), m(v); k_{a_1(v)}, \dots, k_{a_{l(v)}(v)})$.

Consider a graph $\gamma$ in the sum determining the correlator
\[ \langle \tau_{k_1}(V_1) \cdots \tau_{k_n}(V_n) \rangle_g. \tag{28} \]
We associate to $\gamma$ a number: we contract according to the graph structure all tensors corresponding to its vertices, edges, and leaves (for leaves, we take the vectors $V_1, \dots, V_n$). Let us denote this number by $T(\gamma)$.

Also we weight each graph by a coefficient which is the product of two combinatorial constants. The first factor is equal to
\[ V(\gamma) = \frac{\prod_{v \in \mathrm{Vert}(\gamma)} 2^{g(v)} g(v)!}{|\mathrm{aut}(\gamma)|}. \tag{29} \]
Here $|\mathrm{aut}(\gamma)|$ is the order of the automorphism group of the labeled graph $\gamma$, and $\mathrm{Vert}(\gamma)$ is the set of internal vertices of $\gamma$. In other words, we can label each vertex $v$ by $g(v)$, delete all empty loops, and then we get a graph with the order of the automorphism group equal to $1/V(\gamma)$.

The second factor is equal to
\[ P(\gamma) = \prod_{v \in \mathrm{Vert}(\gamma)} \int_{\overline{M}_{g(v),m(v)+l(v)}} \psi_1^{k_{a_1(v)}} \cdots \psi_{l(v)}^{k_{a_{l(v)}(v)}}. \tag{30} \]
The integrals used in this formula can be calculated with the help of the Witten-Kontsevich theorem [27-33].

So, the whole contribution of the graph $\gamma$ to the correlator is equal to $V(\gamma)P(\gamma)T(\gamma)$. One can check that a non-trivial contribution to the correlator $\langle \tau_{k_1}(V_1) \cdots \tau_{k_n}(V_n) \rangle_g$ is given only by graphs that have exactly $3g - 3 + n - \sum_{i=1}^n k_i$ heavy edges.

The geometric meaning here is very clear. The number $T(\gamma)$ comes from the integral of the induced Gromov-Witten invariants of degree zero, while the coefficient $V(\gamma)P(\gamma)$ is exactly the combinatorial interpretation of the intersection number of $\psi_1^{k_1} \cdots \psi_n^{k_n}$ with the stratum whose dual graph is obtained from $\gamma$ by the procedure described after the definition of $V(\gamma)$.
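The two purely combinatorial ingredients of this subsection, the number of heavy edges and the factor defined in (29), are easy to compute. The Python sketch below is a small illustration; the function names are hypothetical, and the order of the automorphism group is supplied by the caller rather than computed.

```python
from fractions import Fraction
from math import factorial


def heavy_edge_count(g, ks):
    """Number of heavy edges 3g - 3 + n - sum(k_i) required of a graph
    contributing to a genus g correlator with descendant indices ks.
    A negative value means that no graph contributes."""
    return 3 * g - 3 + len(ks) - sum(ks)


def vertex_weight(vertex_genera, aut_order):
    """The factor (29): product of 2^g(v) * g(v)! over the vertices,
    divided by the order of the automorphism group of the graph."""
    numerator = 1
    for gv in vertex_genera:
        numerator *= 2 ** gv * factorial(gv)
    return Fraction(numerator, aut_order)
```

For a graph with a single vertex, $g$ empty loops and labeled leaves, the automorphisms permute the loops and flip each of them, so the order of the group is $2^g g!$ and `vertex_weight([g], 2**g * factorial(g))` equals 1.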


4.5. Potential

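We fix a cyclic Hodge algebra and consider the formal power series $F = F(T_{n,i})$ defined as
\[ F = \exp\Bigl( \sum_{g=0}^{\infty} \hbar^{g-1} F_g \Bigr) = \exp\Bigl( \sum_{g=0}^{\infty} \hbar^{g-1} \sum_{n} \frac{1}{n!} \sum_{a_1,\dots,a_n \in \mathbb{Z}_{\ge 0}} \langle \tau_{a_1}(E_{a_1}) \cdots \tau_{a_n}(E_{a_n}) \rangle_g \Bigr). \tag{31} \]
Abusing notation, we allow the leaves to be marked by $\tau_a(E_a)$, $E_a$, or $a$; all these variants are possible and denote the same thing.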

4.6. Trivial example

For example, consider the trivial cyclic Hodge algebra: $H = H_0 = \langle e_1 \rangle$, $Q = G_- = 0$, $\int e_1 = 1$. Then $E_a = e_1 t_a$, and the correlator $\langle \tau_{a_1}(E_{a_1}) \cdots \tau_{a_n}(E_{a_n}) \rangle_g$ consists of just one graph with one vertex, $g$ empty loops, and $n$ leaves marked by $a_1, \dots, a_n$. The explicit value of the coefficient of this graph is, by definition,
\[ \langle \tau_{a_1} \cdots \tau_{a_n} \rangle_g := \int_{\overline{M}_{g,n}} \psi_1^{a_1} \cdots \psi_n^{a_n}. \tag{32} \]
So, in the case of the trivial cyclic Hodge algebra we obtain exactly the Gromov-Witten potential of the point (i.e., $(\Omega_{g,n}, e_1^{\otimes n}) \equiv 1$), which we denote below by $F^{pt}$.

4.6.1. Remark about notations

Abusing notation, we use the same symbol $\langle\ \rangle_g$ for the correlators in GW theory and in Hodge field theory. We hope that it does not lead to confusion. For instance, $\langle \tau_{a_1}(E_{a_1}) \cdots \tau_{a_n}(E_{a_n}) \rangle_g$ in the trivial example above is the correlator of the trivial Hodge field theory, while $\langle \tau_{a_1} \cdots \tau_{a_n} \rangle_g$ is the correlator of the trivial GW theory.

5. String, dilaton, and tautological relations

In this section, we prove that the potential (31) satisfies the same string and dilaton equations

as GW potentials.

5.1. String equation

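\[ \Bigl\langle \tau_0(e_1) \prod_{j=1}^n \tau_{a_j}(e_{i_j}) \Bigr\rangle_g = \sum_{j=1}^n \Bigl\langle \tau_{a_j-1}(e_{i_j}) \prod_{k \ne j} \tau_{a_k}(e_{i_k}) \Bigr\rangle_g. \tag{33} \]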

Proof. Consider a graph $\gamma$ contributing to the correlator on the left-hand side of the string equation. The special leaf that we are going to remove is marked by $\tau_0(e_1)$ and is attached to a vertex $v$ of genus $g_v$ (i.e., with $g_v$ attached light loops) with $l_v$ more attached leaves labeled by indices in $I_v$, $|I_v| = l_v$, and $m_v$ attached half-edges coming from heavy edges and loops.

Let us remove the leaf $\tau_0(e_1)$ and change the label of one of the leaves attached to the same vertex from $\tau_{a_j}(e_{i_j})$ to $\tau_{a_j-1}(e_{i_j})$. This way we obtain a graph $\gamma_j$ contributing to the $j$th summand of the right-hand side of (33). We take the sum of these graphs over $j \in I_v$. Of course, we skip the summands where $a_j = 0$.

Note that this sum is not empty (if $\gamma$ gives a non-zero contribution to the left-hand side of (33)). Indeed, if it is empty, this means that $a_j = 0$ for all $j \in I_v$. Therefore, since we expect that the contribution to $P(\gamma)$ of the vertex $v$ on the left-hand side is non-zero, it follows that $g_v = 0$ and $m_v + l_v = 2$. So, there are three possible local pictures:

[Pictures (34): the vertex $v$ with the leaf $\tau_0(e_1)$ and (i) two heavy half-edges, (ii) one heavy half-edge and one leaf, (iii) two leaves.]

The first picture can be replaced with the bivector $[G_-G_+G_-G_+]$, which is equal to zero. Therefore, $T(\gamma)$ is also equal to zero. In the second case, we also get 0 since $G_-G_+(e_1 \cdot e_j) = G_-G_+(E_0) = 0$. The third picture is possible only when it is the whole graph, and this is excluded by the assumptions of the theorem.

Note also that $T(\gamma) = T(\gamma_j)$ and $V(\gamma) = V(\gamma_j)$ for all $j$. Indeed, we have just removed the leaf with the unit of the algebra, so this cannot change anything in the contraction of tensors. Therefore, $T(\gamma) = T(\gamma_j)$. Also both the leaf $\tau_0(e_1)$ and the vertex $v$ are fixed points of any automorphism of $\gamma$. The same holds for the vertex corresponding to $v$ in $\gamma_j$. Therefore, the automorphism groups are isomorphic for both graphs. Since we make no changes to empty loops, it follows that $V(\gamma) = V(\gamma_j)$.

Let us prove that $P(\gamma) = \sum_{j \in I_v} P(\gamma_j)$. Indeed, the vertices of $\gamma$ and $\gamma_j$ are in a natural one-to-one correspondence. Moreover, the local pictures for all of them except for $v$ and its image in $\gamma_j$ are the same. Therefore, the corresponding intersection numbers contributing to $P(\gamma)$ and $P(\gamma_j)$ are the same. The only difference appears when we take the intersection numbers corresponding to $v$ and its images in $\gamma_j$, $j \in I_v$. But then we can apply the string equation (6) of the GW theory of the point (32), and we see that
\[ \int_{\overline{M}_{g_v,k_v+l_v+1}} \prod_{j \in I_v} \psi_{o(j)}^{a_j} = \sum_{j \in I_v} \int_{\overline{M}_{g_v,k_v+l_v}} \psi_{o(j)}^{a_j-1} \prod_{k \ne j} \psi_{o(k)}^{a_k} \tag{35} \]
(here $o\colon I_v \to \{1, \dots, l\}$ is an arbitrary one-to-one mapping). This implies that $P(\gamma) = \sum_{j \in I_v} P(\gamma_j)$.

Combining these facts, we get $V(\gamma)P(\gamma)T(\gamma) = \sum_{j \in I_v} V(\gamma_j)P(\gamma_j)T(\gamma_j)$. To complete the proof of (33), we should just notice that when we write down this expression for all graphs contributing to the left-hand side of (33), we use each graph contributing to the right-hand side of (33) exactly once. $\square$

5.2. Dilaton equation

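Theorem 2. If $2g - 2 + n > 0$, we have:
\[ \Bigl\langle \tau_1(e_1) \prod_{j=1}^n \tau_{a_j}(e_{i_j}) \Bigr\rangle_g = (2g - 2 + n) \Bigl\langle \prod_{j=1}^n \tau_{a_j}(e_{i_j}) \Bigr\rangle_g. \tag{36} \]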

Proof. Consider a graph $\gamma$ contributing to the correlator on the left-hand side of (36). The special leaf that we are going to remove is marked by $\tau_1(e_1)$ and is attached to a vertex $v$ of genus $g_v$ (i.e., with $g_v$ attached light loops) with $l_v$ more attached leaves labeled by indices in $I_v$, $|I_v| = l_v$, and $m_v$ attached half-edges coming from heavy edges and loops.

Let us remove the leaf $\tau_1(e_1)$. We obtain a graph $\gamma'$ contributing to the right-hand side of (36). Let us prove this. Indeed, if we remove a leaf and do not get a proper graph, it follows that we have a trivalent vertex. Since the contribution of this vertex to $P(\gamma)$ should be non-zero, it follows that the only possible local picture is

[Picture (37): a single trivalent vertex carrying the leaf $\tau_1(e_1)$.]

But this picture is the whole graph, and it is in contradiction with the condition $2g - 2 + n > 0$.

The same argument as in the proof of the string equation shows that $T(\gamma) = T(\gamma')$ and $V(\gamma) = V(\gamma')$. Also, the contribution to $P(\gamma)$ and $P(\gamma')$ of all vertices except for the changed one is the same. The change of the intersection number corresponding to the vertex $v$ is captured by the dilaton equation (7) of the trivial GW theory (32):
\[ \int_{\overline{M}_{g_v,k_v+l_v+1}} \psi_{l+1} \prod_{j \in I_v} \psi_{o(j)}^{a_j} = (2g_v - 2 + k_v + l_v) \int_{\overline{M}_{g_v,k_v+l_v}} \prod_{j \in I_v} \psi_{o(j)}^{a_j}. \tag{38} \]
Therefore, $P(\gamma) = (2g_v - 2 + k_v + l_v)P(\gamma')$, and, therefore,
\[ V(\gamma)P(\gamma)T(\gamma) = (2g_v - 2 + k_v + l_v)\, V(\gamma')P(\gamma')T(\gamma'). \tag{39} \]
Let us write down the last equation for all graphs contributing to the left-hand side of (36). Observe that any graph $\gamma'$ contributing to the right-hand side occurs $|\mathrm{Vert}(\gamma')|$ times, since the leaf $\tau_1(e_1)$ could be attached to any of its vertices. Therefore, any graph $\gamma'$ contributing to the right-hand side of (36) appears in these equations with the coefficient
\[ \sum_{v \in \mathrm{Vert}(\gamma')} (2g_v - 2 + k_v + l_v) = 2g - 2 + n. \tag{40} \]
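The counting identity (40) can be checked numerically on small graphs. In the Python sketch below each vertex is recorded by a triple $(g_v, k_v, l_v)$: the number of empty loops, the number of heavy half-edges, and the number of leaves. The function names are hypothetical, and the graph is assumed to be connected, so that its genus is the sum of the $g_v$ plus the first Betti number of the graph of heavy edges.

```python
def graph_genus(vertices):
    """Genus of a connected graph given as a list of (g_v, k_v, l_v)."""
    heavy_half_edges = sum(k for _, k, _ in vertices)
    if heavy_half_edges % 2:
        raise ValueError("heavy half-edges must pair up into edges")
    heavy_edges = heavy_half_edges // 2
    return sum(g for g, _, _ in vertices) + heavy_edges - len(vertices) + 1


def leaf_count(vertices):
    """Total number n of leaves of the graph."""
    return sum(l for _, _, l in vertices)


def dilaton_coefficient(vertices):
    """Left-hand side of (40): sum over vertices of 2 g_v - 2 + k_v + l_v."""
    return sum(2 * g - 2 + k + l for g, k, l in vertices)
```

For the graph `[(0, 1, 2), (1, 1, 1)]`, two vertices joined by one heavy edge, the genus is 1, there are 3 leaves, and the coefficient is $1 + 2 = 3 = 2\cdot 1 - 2 + 3$.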

As we have explained in Section 2.3.3, any linear relation $L$ among $\psi$-$\kappa$-strata in the cohomology of the moduli space of curves gives rise to a family of universal relations for the correlators of a Gromov-Witten theory.

Theorem 3 (Main Theorem). The system of universal relations coming from a tautological relation in the cohomology of the moduli space of curves holds for the correlators $\langle \prod_{j=1}^n \tau_{a_j}(e_{i_j}) \rangle_g$ of a cyclic Hodge algebra.

Note that some special cases of this theorem were proved in [5,16,17]. Our argument below is a natural generalization of the technique introduced in these papers. We are also able now to explain why we managed to perform all our calculations there, see Remark 8.6.1.

Let us give here a brief account of the proof of this theorem. First, the definition of correlators of the Hodge field theory can be extended to the intersection with an arbitrary tautological class of degree $K$ in the space $\overline{M}_{g,n}$, not only a monomial in $\psi$-classes. In that case, we do the following. Again, we consider the sum over all graphs $\gamma$ with $3g - 3 + n - K$ heavy edges, and the number $T(\gamma)$ is defined as above. Instead of the coefficient $V(\gamma)P(\gamma)$ we use the intersection number of this class with the stratum whose dual graph is obtained from $\gamma$ by the procedure described right after the definition of $V(\gamma)$. Namely, a vertex with $g$ loops is replaced by a vertex marked by $g$.

This definition is very natural from the point of view of Zwiebach invariants. However, we know from Gromov-Witten theory that this extension of the notion of correlator is unnecessary. Indeed, all integrals with arbitrary tautological classes can be expressed in terms of the integrals with only $\psi$-classes via some universal formulas.

The main question is whether these universal formulas also work in Hodge field theory. Actually, the main result that more or less immediately proves the theorem is the positive answer to this question.

5.3.1. Organization of the proof

The rest of the paper is devoted to the proof of the Main Theorem, and we give an overview of it here.

In Section 6, we study the structure of graphs that can appear in formulas for the correlators of Hodge field theory. We prove that if $T(\gamma) \neq 0$ and there is at least one heavy edge in $\gamma$, then all vertices have genus $\le 1$, i.e., there is at most one empty loop at any vertex. This basically means that in calculations we will have to deal only with genera 0 and 1. This also allows us to write down the action of a Hodge field theory.

In Section 7, we prove the main technical result (Main Lemma). Informally, it states that $Q = -\psi G_-$ when we apply these two operators to the correlators of a Hodge field theory. In order to prove it, we look at a small piece (consisting of just one heavy edge and one or two vertices attached to it) in one of the graphs of a correlator. Of course, in the correlator we can vary this small piece in an arbitrary way, such that the rest of the graph remains the same. So, when we consider the sum of all these small pieces, it is also a correlator of the Hodge field theory. Thus we reduce the proof to a special case of the whole statement. But since the genus of a vertex is $\le 1$, it now appears to be a low-genus statement that can be checked by a straightforward calculation.

In Section 8, we present the proof of the Main Theorem. Consider a $\psi$-$\kappa$-stratum whose stable dual graph has $k \ge 1$ edges. There is a universal expression for the integral over this stratum coming from Gromov-Witten theory. It includes k entries of the scalar product restricted to $H_0$. In terms of graphs, it means that we are to introduce new edges with the bivector $[\Pi_0]$ on them, and there are k such edges in our expression. A direct corollary of the Main Lemma is that we can always replace $[\Pi_0]$ by $[\mathrm{Id}] - \psi[G_-G_+] - [G_-G_+]\psi$.

In Sections 8.2-8.4, we show that when we replace $[\Pi_0]$ by $[\mathrm{Id}] - \psi[G_-G_+] - [G_-G_+]\psi$ at all edges corresponding to the scalar product restricted to $H_0$, we obtain a new expression for the integral over the stratum that again contains only heavy edges and empty loops, as any ordinary correlator. The main problem now is to understand the combinatorial coefficient of a graph obtained this way.

Since we have a sum over graphs with heavy edges and empty loops, it is natural to identify these graphs again with the corresponding strata in the moduli space of curves. Then we can calculate the intersection index of the stratum corresponding to a graph and the initial class. Roughly speaking, the main thing that we have to do is to decide, for each node of the initial stratum (represented initially by $[\Pi_0]$), whether this node is present in the stratum corresponding to the graph. If yes, then we have an excessive intersection (so, we must put $-\psi$ on one of the half-edges of the corresponding edge), and we keep this edge in the graph (so, we replace $[\Pi_0]$ with $-\psi[G_-G_+]$ or $-[G_-G_+]\psi$). If no, then we do not have this edge in the graph, so we contract $[\Pi_0]$, i.e., replace it with $[\mathrm{Id}]$.

So, the procedure that we used to get rid of the scalar product is the same as the procedure of intersecting the initial class with strata of the complementary dimension. This means (Section 8.5) that the universal formula coming from Gromov-Witten theory is equivalent to the natural formula for the correlator with this class coming from Zwiebach theory. The tautological relation is a sum of classes equal to zero. So, while the universal formula coming from Gromov-Witten theory gives (in the case of a vanishing class) a non-trivial expression in correlators, the natural formula coming from Zwiebach theory gives identically zero. This proves our theorem, see Section 8.6.

6. Vanishing of the BV structure

In this section, we recall several useful lemmas from [16,34]. In particular, these lemmas give some strong restrictions on graphs that can give a non-zero contribution to the correlators defined above.

6.1. Lemmas

Lemma 1. (See [16,34].) The following vectors and bivectors are equal to zero:

(41)

Let us also recall another lemma from [16] that is very useful in calculations.

Lemma 2. (See [16].) For any vectors $V_0, V_1, \dots, V_k$, $k \ge 2$,

+ +

(42)

+ +

(43)

Both lemmas are just simple corollaries of the axioms of cyclic Hodge algebra.


Consider a graph studied in Section 4.4. It can have leaves, empty and heavy loops, and heavy

edges. Consider a vertex of such graph. Let us assume that there are A empty loops, B heavy

loops, C heavy edges going to the other vertices of the graph, and D leaves attached to this

vertex:

(44)

Lemma 3. If $A \ge 2$ and $B + C \ge 1$, then (A, B, C, D) = 0.

In other words, if there are at least two empty loops at a vertex, then there should not be

any heavy loops or edges attached to this vertex. Otherwise the contribution of the whole graph

vanishes. This implies

Corollary 1. In the definition of correlators one should consider only graphs of one of the following two types:

(1) One-vertex graphs with no heavy edges (loops).

(2) Arbitrary graphs with at most one empty loop at each vertex.

The contribution of all other graphs vanishes.

This corollary dramatically simplifies all our calculations with graphs given below. Also we

can write down now the action of the Hodge field theory.

Let $F_g^0(v_0, v_1, v_2, \dots)$, $v_i \in H \otimes \mathbb{C}[[\{T_{n,i}\}]]$, be the dimension zero part of the potential of the Hodge field theory, namely,

$$F_g^0 := \sum_{n} \frac{1}{n!} \sum_{a_1 + \dots + a_n = 3g-3+n} \langle \tau_{a_1}(v_{a_1}) \cdots \tau_{a_n}(v_{a_n}) \rangle_g. \qquad (45)$$

The first sum is taken over $n \ge 0$ such that $2g - 2 + n > 0$. So, it is exactly the generating function for the vertices of our graph expressions. Then the action of the Hodge field theory is equal to

$$A(v) := F_0^0(E_0 + G_- v, E_1, E_2, \dots) + \hbar F_1^0(E_0 + G_- v, E_1, E_2, \dots) + \sum_{g \ge 2} \hbar^g F_g^0(E_0, E_1, E_2, \dots) - \frac{1}{2} \int Qv \cdot G_- v. \qquad (46)$$

If we put $T_{n,i} = 0$ for $n \ge 1$, then we immediately obtain the BCOV-type action discussed in [12, Appendix] and [5, Appendix]. Similar actions were also studied in [35] and [36].


We consider the form (A, B, C, D) and we assume that $A \ge 2$.

First, let us study the case when $C \ge 1$. In this case, our (C + D)-form can be represented as a contraction via the bivector [Id] of two forms, (A − 2, B, C − 1, D + 1) and (2, 0, 1, 1). Let

us prove that the last one is equal to zero. Indeed, this two-form can be represented as (,G

+ ),

where the two-form is represented by the picture

(47)

C, D) is also equal to zero.

Now consider the case when $B \ge 1$. In this case, our (C + D)-form can be represented as a contraction via the bivector [Id] of two forms, (A − 2, B − 1, C, D + 1) and (2, 1, 0, 1). Let us prove that the last one is equal to zero. Indeed,

(2, 1, 0, 1) =

1

2

1

2

(48)

Here the first equality is the definition of (2, 1, 0, 1), the second one is just an equivalent redrawing, the third equality is an application of Lemma 2, and the fourth one is again an equivalent redrawing.

The last picture contains the bivector (47) which is equal to zero according to Lemma 1. Therefore, the whole picture is equal to zero, and (2, 1, 0, 1) = 0. So, the whole form (A, B, C, D)

is equal to zero also in this case. This proves the lemma.

7. Main Lemma

7.1. Statement

The main technical tool that we use in the proof of Theorem 3 is the lemma that we prove in

this section.

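Lemma 4 (Main Lemma). For any $v_1, \dots, v_n \in H$, $a_1, \dots, a_n \ge 0$,

$$\sum_{i=1}^{n} \langle \tau_{a_1}(v_1) \cdots \tau_{a_{i-1}}(v_{i-1})\, \tau_{a_i}(Q(v_i))\, \tau_{a_{i+1}}(v_{i+1}) \cdots \tau_{a_n}(v_n) \rangle_g = - \sum_{i=1}^{n} \langle \tau_{a_1}(v_1) \cdots \tau_{a_{i-1}}(v_{i-1})\, \tau_{a_i+1}(G_-(v_i))\, \tau_{a_{i+1}}(v_{i+1}) \cdots \tau_{a_n}(v_n) \rangle_g. \qquad (49)$$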


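Lemma 5. For any $w \in H$, $v_1, \dots, v_n \in H_0$,

$$\langle \tau_{a_0}(Qw)\, \tau_{a_1}(v_1) \cdots \tau_{a_n}(v_n) \rangle_g + \langle \tau_{a_0+1}(G_- w)\, \tau_{a_1}(v_1) \cdots \tau_{a_n}(v_n) \rangle_g = 0. \qquad (50)$$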

7.2. Special cases

The proof of the lemma can be reduced to a small number of special cases. We consider

correlators whose graphs have only one heavy edge.

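Let $v_1, \dots, v_n \in H$, $\sum_{i=1}^{n} a_i = n - 4$. Then

$$\sum_{i=1}^{n} \langle \tau_{a_1}(v_1) \cdots \tau_{a_{i-1}}(v_{i-1})\, \tau_{a_i}(Q(v_i))\, \tau_{a_{i+1}}(v_{i+1}) \cdots \tau_{a_n}(v_n) \rangle_0 = - \sum_{i=1}^{n} \langle \tau_{a_1}(v_1) \cdots \tau_{a_{i-1}}(v_{i-1})\, \tau_{a_i+1}(G_-(v_i))\, \tau_{a_{i+1}}(v_{i+1}) \cdots \tau_{a_n}(v_n) \rangle_0. \qquad (51)$$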

First, we see that according to the definition of the correlator, the left-hand side of Eq. (51) is the sum over graphs with two vertices and with $[G_-G_+]$ on the unique edge that connects the vertices. For each $I \sqcup J = \{1, \dots, n\}$ we can consider the corresponding distribution of leaves between the vertices (to be precise, let us assume that $1 \in I$). Then the coefficient of such a graph is $\langle \tau_0 \prod_{i \in I} \tau_{a_i} \rangle_0 \langle \tau_0 \prod_{j \in J} \tau_{a_j} \rangle_0$, and we take the sum over all possible positions of Q at the leaves.

Using the Leibniz rule for Q and the property that $[Q, G_-G_+] = -G_-$, we see that this sum is equal to the sum over graphs with two vertices and with $[-G_-]$ on the unique edge that connects the vertices. For each $I \sqcup J = \{1, \dots, n\}$, $|I|, |J| \ge 2$, we consider the corresponding distribution of leaves between the vertices. Then the coefficient of such a graph is still $\langle \tau_0 \prod_{i \in I} \tau_{a_i} \rangle_0 \langle \tau_0 \prod_{j \in J} \tau_{a_j} \rangle_0$, and the underlying tensor expression can be written (after we multiply the whole sum by −1) as

$$\Big( v_1,\ \prod_{i \in I \setminus \{1\}} v_i \cdot G_-\Big( \prod_{j \in J} v_j \Big) \Big). \qquad (52)$$

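Note that

$$G_-\Big( \prod_{j \in J} v_j \Big) = \sum_{i,j \in J,\ i<j} G_-(v_i v_j) \prod_{k \in J \setminus \{i,j\}} v_k - (|J| - 2) \sum_{j \in J} G_-(v_j) \prod_{i \in J \setminus \{j\}} v_i. \qquad (53)$$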


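Hence the whole sum is equal to

$$\sum_{1<i<j} \Big( v_1,\ \prod_{k \ne 1,i,j} v_k \cdot G_-(v_i v_j) \Big) \sum_{I \sqcup J \sqcup \{i,j\} = \{2,\dots,n\}} \Big\langle \tau_{a_1} \tau_0 \prod_{k \in I} \tau_{a_k} \Big\rangle_0 \Big\langle \tau_{a_i} \tau_{a_j} \tau_0 \prod_{k \in J} \tau_{a_k} \Big\rangle_0 - \sum_{i \ne 1} \Big( v_1,\ \prod_{j \ne 1,i} v_j \cdot G_-(v_i) \Big) \sum_{I \sqcup J \sqcup \{i\} = \{2,\dots,n\}} (|J| - 1) \Big\langle \tau_{a_1} \tau_0 \prod_{j \in I} \tau_{a_j} \Big\rangle_0 \Big\langle \tau_{a_i} \tau_0 \prod_{j \in J} \tau_{a_j} \Big\rangle_0 . \qquad (54)$$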

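Using that

$$\sum_{I \sqcup J \sqcup \{i,j\} = \{2,\dots,n\}} \Big\langle \tau_{a_1} \tau_0 \prod_{k \in I} \tau_{a_k} \Big\rangle_0 \Big\langle \tau_{a_i} \tau_{a_j} \tau_0 \prod_{k \in J} \tau_{a_k} \Big\rangle_0 = \Big\langle \tau_{a_1+1} \prod_{k \ne 1} \tau_{a_k} \Big\rangle_0, \qquad (55)$$

and

$$(n - 3) \Big\langle \tau_{a_1+1} \prod_{j \ne 1} \tau_{a_j} \Big\rangle_0 - \sum_{I \sqcup J \sqcup \{i\} = \{2,\dots,n\}} (|J| - 1) \Big\langle \tau_{a_1} \tau_0 \prod_{j \in I} \tau_{a_j} \Big\rangle_0 \Big\langle \tau_{a_i} \tau_0 \prod_{j \in J} \tau_{a_j} \Big\rangle_0 = \Big\langle \tau_{a_i+1} \prod_{j \ne i} \tau_{a_j} \Big\rangle_0, \qquad (56)$$

we obtain

$$\sum_{i=1}^{n} \Big( G_-(v_i),\ \prod_{j \ne i} v_j \Big) \Big\langle \tau_{a_i+1} \prod_{j \ne i} \tau_{a_j} \Big\rangle_0 . \qquad (57)$$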

The last formula coincides by definition with the right-hand side of Eq. (51) multiplied by −1.

This proves the first special case.

7.2.2. The second case is in genus 1. Let

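$v_1, \dots, v_n \in H$, $\sum_{i=1}^{n} a_i = n - 1$. Then

$$\sum_{i=1}^{n} \langle \tau_{a_1}(v_1) \cdots \tau_{a_{i-1}}(v_{i-1})\, \tau_{a_i}(Q(v_i))\, \tau_{a_{i+1}}(v_{i+1}) \cdots \tau_{a_n}(v_n) \rangle_1 = - \sum_{i=1}^{n} \langle \tau_{a_1}(v_1) \cdots \tau_{a_{i-1}}(v_{i-1})\, \tau_{a_i+1}(G_-(v_i))\, \tau_{a_{i+1}}(v_{i+1}) \cdots \tau_{a_n}(v_n) \rangle_1. \qquad (58)$$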

According to the definition of the correlator, the left-hand side of Eq. (58) is the sum over graphs of two possible types. The first type includes graphs with two vertices and two edges. The first edge is heavy and connects the vertices; the second edge is an empty loop attached to the first vertex. For each $I \sqcup J = \{1, \dots, n\}$, $|J| \ge 2$, we can consider the corresponding distribution of leaves between the vertices (we assume that leaves with indices in I are at the first vertex). Then the coefficient of such a graph is $\langle \tau_0 \prod_{i \in I} \tau_{a_i} \rangle_1 \langle \tau_0 \prod_{j \in J} \tau_{a_j} \rangle_0$. The second type includes graphs with one vertex and one heavy loop. All leaves are attached to this vertex, and the coefficient of such a graph is $\langle \tau_0^2 \prod_{i=1}^{n} \tau_{a_i} \rangle_0$. For both types of graphs, we take the sum over all possible positions of Q at the leaves.

Using the Leibniz rule for Q and the property that $[Q, G_-G_+] = -G_-$, we get the same graphs as before, but there is no Q, and instead of $[G_-G_+]$ we have $[-G_-]$ on the corresponding edge. Using Lemma 2, we move $G_-$ in graphs of the first type to the leaves marked by indices in J. Using the 1/12-axiom and Lemma 2, we move $G_-$ in graphs of the second type to all leaves.

This way we get graphs of the same type in both cases. We get graphs with one vertex, one empty loop attached to it, all leaves also attached to this vertex, and $G_-$ on one of the leaves. One can easily check that the coefficient of the graphs with $G_-$ at the ith leaf is equal to

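$$\sum_{I \sqcup J \sqcup \{i\} = \{1,\dots,n\}} \Big\langle \tau_0 \prod_{k \in I} \tau_{a_k} \Big\rangle_1 \Big\langle \tau_0\, \tau_{a_i} \prod_{k \in J} \tau_{a_k} \Big\rangle_0 + \frac{1}{24} \Big\langle \tau_0^2 \prod_{k=1}^{n} \tau_{a_k} \Big\rangle_0 = \Big\langle \tau_{a_i+1} \prod_{k \ne i} \tau_{a_k} \Big\rangle_1 . \qquad (59)$$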

It is exactly the unique graph contributing to the ith summand of the right-hand side of Eq. (58),

and the coefficient is right. This proves that special case.

7.2.3. Consider g 2. Let

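$v_1, \dots, v_n \in H$, $\sum_{i=1}^{n} a_i = 3g - 4 + n$. Then

$$\sum_{i=1}^{n} \langle \tau_{a_1}(v_1) \cdots \tau_{a_{i-1}}(v_{i-1})\, \tau_{a_i}(Q(v_i))\, \tau_{a_{i+1}}(v_{i+1}) \cdots \tau_{a_n}(v_n) \rangle_g = - \sum_{i=1}^{n} \langle \tau_{a_1}(v_1) \cdots \tau_{a_{i-1}}(v_{i-1})\, \tau_{a_i+1}(G_-(v_i))\, \tau_{a_{i+1}}(v_{i+1}) \cdots \tau_{a_n}(v_n) \rangle_g, \qquad (60)$$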

7.3. Proof of Main Lemma

Consider the left-hand side of Eq. (49). As usual, using the Leibniz rule for Q and the property that $[Q, G_-G_+] = -G_-$, we can remove all Q, but then we must change one of the $[G_-G_+]$ on edges to $[-G_-]$. Let us cut out the pieces of graphs that include these edges with $[-G_-]$, all empty loops, leaves and halves of heavy edges attached to the ends of this special edge.

Since we consider the sum over all possible graphs contributing to correlators, these small pieces can be gathered into groups according to the type of the rest of the initial graph. Each group forms exactly one of the special cases studied above. So, we know that $G_-$ should jump either to one of the leaves or to one of the heavy edges attached to the ends of its edge. In the first case, we get exactly the graphs in the right-hand side of Eq. (49); in the second case, we get zero. One can easily check that we get the right coefficients for the graphs in the right-hand side of Eq. (49). This proves the lemma.
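The commutator identity invoked in this proof can be checked in a few lines. The sketch below assumes two properties that are not restated in this section, namely that $Q$ and $G_-$ anticommute and that $G_- \Pi_0 = 0$, together with the relation $\Pi_0 = \mathrm{Id} - QG_+ - G_+Q$ quoted in Section 8.2; it is meant as a plausibility check rather than as the authors' own derivation.

```latex
% Assumptions (not restated in this section):
%   Q G_- + G_- Q = 0,   G_- \Pi_0 = 0,
%   \Pi_0 = \mathrm{Id} - Q G_+ - G_+ Q   (relation quoted in Section 8.2).
% G_- G_+ is an even operator, so [Q, X] = QX - XQ.
\begin{align*}
[Q,\, G_- G_+] &= Q G_- G_+ - G_- G_+ Q \\
 &= -\,G_- Q G_+ - G_- G_+ Q && (Q G_- = -\,G_- Q) \\
 &= -\,G_- \,(Q G_+ + G_+ Q) \\
 &= -\,G_- \,(\mathrm{Id} - \Pi_0) \\
 &= -\,G_- + G_- \Pi_0 \;=\; -\,G_- .
\end{align*}
```

The overall minus sign is what produces the minus sign relating the $Q$-insertions to the $G_-$-insertions in Lemma 5.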


8. Proof of Theorem 3

8.1. Equivalence of expression in graphs

Consider the expression in correlators corresponding to a $\psi$-$\kappa$-stratum as it is described in Section 2.3.2. To each vertex of the corresponding stable dual graph we assign the sum of graphs that forms a correlator in the sense of Section 4.4. The leaves of these graphs corresponding to the edges of the stable dual graph (nodes) are connected in these pictures by edges with $[\Pi_0]$ (the restriction of the scalar product to $H_0$). We call the edges with $[\Pi_0]$ "white edges" and mark them in pictures by thick white points, see (25).

The axioms of cyclic Hodge algebra imply a system of linear equations for the graphs of this type. In particular, it turned out that by manipulating these linear equations we can always get rid of white edges in the sum of pictures corresponding to a stable dual graph, see [5,16,17]. However, previously it was just an experimental fact. Now we can show how it works in general.

Numerous examples of the correspondence between stable dual graphs and graph expressions in cyclic Hodge algebras, and also of the linear relations implied by the axioms of cyclic Hodge algebra, are given in [5,16,17].

Below, we explain how one can represent the expression in correlators corresponding to a $\psi$-$\kappa$-stratum in terms of graphs with only empty and heavy edges and with no white edges. The only tools that we need are Lemmas 4 and 5 proved above.

8.2. The simplest example

Consider a stable dual graph with two vertices and one edge connecting them:

(61)

.

The corresponding expression in correlators is

n1

l1

n2

l2

j1 j2

ai (e )

ui (e1 )

b0 (ej2 )

bi (e )

vi (e1 )

a0 (ej1 )

i=1

i=1

i=1

g1

i=1

(62)

g1

expression as

n1

l1

n2

l2

a0 (x1 )

(63)

ai (e )

ui (e1 )

[0 ]1 2 b0 (x2 )

bi (e )

vi (e1 ) ,

i=1

i=1

i=1

g1

i=1

g1

where $\{x_\alpha\}$ is the basis of the whole H. Using the fact that $\Pi_0 = \mathrm{Id} - QG_+ - G_+Q$ and applying Lemma 5, we obtain

n1

l1

n2

l2

1 2

ai (e )

ui (e1 )

[0 ]

b0 (x2 )

bi (e )

vi (e1 )

a0 (x1 )

i=1

= a0 (x1 )

i=1

n1

i=1

ai (e )

g1

l1

i=1

i=1

[Id]1 2 b0 (x2 )

ui (e1 )

g1

i=1

n2

i=1

bi (e )

g1

l2

i=1

vi (e1 )

g1

a0 +1 (x1 )

n1

ai (e )

i=1

ui (e1 )

i=1

[G G+ ]

l1

1 2

b0 (x2 )

n1

l1

g1

n2

bi (e )

i=1

a0 (x1 )

i=1

[G G+ ]

1 2

ai (e )

l2

vi (e1 )

i=1

g1

ui (e1 )

i=1


b0 +1 (x2 )

g1

n2

i=1

bi (e )

l2

i=1

vi (e1 )

(64)

g1

In all three summands of the right-hand side we still have two correlators, whose leaves corresponding to the nodes are connected by some special edges. But now the connecting edge is either marked by [Id] (an empty edge) or by $[G_-G_+]$ (an ordinary heavy edge). So, this way we get rid of the white edge in this case.

Informally, in terms of pictures, we can describe Eq. (64) as

=

(65)

.

When we put $\psi$, we mean that we add one more $\psi$-class at the node at the corresponding branch of the curve. Dashed circles denote correlators.

8.3. Example with two nodes

Now we consider an example of a stratum whose generic point is represented by a three-component curve. Again, we allow arbitrary $\psi$-classes at marked points and at the two branches of each node.

We perform the same calculation as above, but now we explain it in terms of informal pictures

from the very beginning. So, the first step is the same as above:

=

.

Then we apply Lemma 4 to each of the summands in the right-hand side:

=

(66)


(67)

=

+

+

+

(68)

(69)

and

=

+

+

+

We take the sum of these three expressions, and we see that all pictures where we have edges with $[G_-]$ and $[G_+]$ cancel. So, we get an expression for the sum of graphs representing the initial stratum in terms of graphs with only empty and heavy edges.

8.4. General case

The general argument is exactly the same as in the second example. In fact, this gives a procedure for writing an expression in graphs with only empty and heavy edges (and no white edges) starting from a stable dual graph. Let us describe this procedure.

Take a stable dual graph corresponding to a $\psi$-$\kappa$-stratum of dimension k in $\overline{\mathcal{M}}_{g,n}$. First, we are to decorate it a little bit. For each edge, we either leave it untouched, or substitute it with an arrow (in two possible ways). At the pointing end of the arrow, we increase the number of $\psi$-classes by 1. Each of these graphs we weight with the inverse order of its automorphism group (automorphisms must preserve all decorations) multiplied by $(-1)^{\mathrm{arr}}$, where arr is the number of arrows.

Consider a decorated dual graph. To each of its vertices we associate the corresponding correlator of cyclic Hodge algebra (we add new leaves in order to represent $\kappa$-classes). Then we connect the leaves corresponding to the nodes either by empty edges (if the corresponding edge of the decorated graph is untouched) or by heavy edges (if the corresponding edge of the dual graph is decorated by an arrow).

It is obvious that the number of heavy edges in the final graphs is equal to k.
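The decoration procedure can be summarized as a schematic expansion over the edge set E of the stable dual graph. In the sketch below, $\psi'_e$ and $\psi''_e$ are hypothetical names for one extra $\psi$-class on either half-edge of e, the symbols $\Pi_0$, $G_-$, $G_+$ follow our reading of the bivectors used in this section, and automorphism factors are suppressed; it is an illustration of the bookkeeping, not a formula taken from the text.

```latex
% Schematic bookkeeping only; \psi'_e, \psi''_e are hypothetical notation
% and automorphism factors are suppressed.
\begin{align*}
[\Pi_0]_e \;&\longmapsto\; [\mathrm{Id}]_e \;-\; \psi'_e\,[G_-G_+]_e \;-\; [G_-G_+]_e\,\psi''_e ,\\
\prod_{e \in E} [\Pi_0]_e \;&\longmapsto\;
 \sum_{A \subseteq E}\ \sum_{\text{orientations of } A} (-1)^{|A|}
 \prod_{e \notin A} [\mathrm{Id}]_e \prod_{e \in A} \big(\psi\,[G_-G_+]\big)_e ,\\
\#\{\text{decorated graphs}\} \;&=\; \sum_{A \subseteq E} 2^{|A|} \;=\; 3^{|E|} .
\end{align*}
```

For a single edge this gives the three summands of the simplest example, and for two edges it gives nine decorated graphs before symmetries are taken into account.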

8.5. Coefficients

We can simplify the resulting graphs obtained in the previous subsection. First, we can contract empty edges (as much as possible; it is forbidden to contract loops). Second, we can remove leaves added for the needs of $\kappa$-classes. Indeed, each such leaf is equipped with a unit of H, so it does not affect the contraction of tensors corresponding to a graph. Moreover, when we remove all leaves corresponding to $\kappa$-classes, we still have a graph with at least trivalent vertices. Otherwise, this graph is equal to zero, cf. arguments in the proofs of string and dilaton equations.

So, we obtain final graphs that have the same number of heavy edges as the dimension of the initial $\psi$-$\kappa$-stratum, the same number of leaves as the initial dual graph, and some number of empty loops, at most one at each vertex. The exceptional case is when k = 0; in this case we obtain only one graph, with one vertex, n leaves, and g empty loops.

In the first case, let us turn a graph like this into a stable dual graph. Just replace its vertices with no empty loops by vertices of genus zero and vertices with empty loops by vertices of genus one; heavy edges become edges, and leaves remain leaves. There are no $\psi$- or $\kappa$-classes. It is obvious that the codimension of the stratum corresponding to this dual graph is k. Indeed, in this case it is just the number of nodes.

strata of codimension k with no $\psi$- or $\kappa$-classes, whose curves have irreducible components of genus 0 and 1 only.

Proposition 1. We have $c_i = X \cdot Y_i$.

Proof. We prove it in two steps. First, consider a one-vertex stable dual graph with no edges (just a correlator). In this case, the intersection number $X \cdot Y_i$ is just by definition $c_i = V(\gamma_i)P(\gamma_i)$, where $\gamma_i$ is the cyclic Hodge algebra graph that turns into $Y_i$ via the procedure described above.

Then, consider a stable dual graph with one edge. It is the intersection of the one-vertex stable dual graph with an irreducible component of the boundary. For a given $Y_i$, this component of the boundary either intersects it transversally, or we have an excessive intersection. In the first case, the corresponding node is not represented in $Y_i$. This means that in $\gamma_i$ it should be an empty edge. In the second case, this node is one of the nodes of $Y_i$, so it should be a heavy edge of $\gamma_i$. Also, it is an excessive intersection, so we are to add the sum of $\psi$-classes with the negative sign at the marked points (half-edges) corresponding to the node, see [37, Appendix].

Exactly the same argument works for an arbitrary number of nodes; we just extend it by induction. □
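The negative sign in the excessive-intersection step can be traced to the normal bundle of a boundary divisor. The following is a sketch of that standard fact in our own notation: D is a boundary divisor, $\xi$ its gluing map, $L'$ and $L''$ the cotangent lines at the two branches of the node, and $\psi'$, $\psi''$ their first Chern classes; none of these symbols are taken from the text.

```latex
% Notation introduced for illustration: D a boundary divisor, \xi its gluing
% map, L', L'' the cotangent lines at the two branches of the node.
\begin{align*}
N_{\xi} \;&\cong\; (L')^{\vee} \otimes (L'')^{\vee},
 \qquad c_1(N_{\xi}) \;=\; -\,(\psi' + \psi''), \\
[D] \cdot [Y] \;&=\; c_1\big(N_{\xi}\big|_{Y}\big) \cap [Y]
 \;=\; -\,(\psi' + \psi'') \cap [Y]
 \qquad \text{for a stratum } Y \subset D .
\end{align*}
```

When Y is not contained in D and meets it transversally, no such correction appears, which corresponds to the empty-edge case of the proof.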

In the case of k = 0, we get just one final graph with coefficient c.

Proposition 2. If k = 0, the coefficient of the final graph is equal to the number of points in the initial $\psi$-$\kappa$-stratum.

Proof. If k = 0, this means that each of the vertices of the initial stable dual graph also has dimension 0, and the corresponding correlator of cyclic Hodge algebra is represented by a single one-vertex graph with no heavy edges. This also means that each edge of the initial stable dual graph is replaced in the algorithm above by an empty edge. So, we can think that we just work with the correlators of the Gromov-Witten theory of the point. In this case the proposition becomes obvious. □

Now we deduce Theorem 3 from these propositions.

8.6. Proof of Theorem 3

Consider the system of subalgebras

$$RH_1^*(\overline{\mathcal{M}}_{g,n}) \subset RH^*(\overline{\mathcal{M}}_{g,n}) \qquad (70)$$

of the cohomological tautological algebras of $\overline{\mathcal{M}}_{g,n}$ generated by strata with no $\psi$- or $\kappa$-classes and with irreducible curves of genus 0 and 1 only.

Let L be a linear combination of $\psi$-$\kappa$-strata of dimension k in $\overline{\mathcal{M}}_{g,n}$. Then the expression in correlators of cyclic Hodge algebras corresponding to L is equivalent to a sum of some graphs with coefficients equal to the intersection of L with classes in $RH_1^k(\overline{\mathcal{M}}_{g,n})$.

So, if the class of L is equal to zero, then the corresponding equation (and also the whole system of equations that we described in Section 2.3.3) for correlators of cyclic Hodge algebra is valid. The theorem is proved.

8.6.1. Remark

Evidently, $RH_1^*(\overline{\mathcal{M}}_{g,n})$ is a module over $RH^*(\overline{\mathcal{M}}_{g,n})$. Also it is obvious that $RH_1^*(\overline{\mathcal{M}}_{g,n})$ is closed under pull-backs and push-forwards via the forgetful morphisms. This explains why it was enough to make only one check in the simplest case in order to get the system of equations in [5,17] (cf. an argument in the last section in [5]).

8.7. An interpretation of Propositions 1 and 2

From the point of view of the theory of Zwiebach invariants, both propositions look very natural. Indeed, we try to give a graph expression for the integral of an induced Gromov-Witten form multiplied by a tautological class X. Since we know that we are able to integrate only degree zero parts of induced Gromov-Witten invariants, we should just take the sum over all graphs that correspond to the strata of complementary dimension in $RH_1^*(\overline{\mathcal{M}}_{g,n})$. The coefficients are to be the intersection numbers of these strata with X.

On the other hand, we know that in any Gromov-Witten theory it is enough to fix the integrals of Gromov-Witten invariants multiplied by $\psi$-classes. Then the integrals of Gromov-Witten invariants multiplied by arbitrary tautological classes are expressed by universal formulas. We can try to use these universal formulas also in Hodge field theory. They are exactly our expressions with white edges.

So, we have two different natural ways to express in terms of graphs the integrals of induced Gromov-Witten invariants multiplied by tautological classes. Propositions 1 and 2 state that these two different expressions coincide.


Acknowledgements

A.L. was supported by the Russian Federal Agency of Atomic Energy and by the grants INTAS-03-51-6346, NSh-8065.2006.2, NWO-RFBR-047.011.2004.026 (RFBR-05-02-89000-NWO-a), and RFBR-07-02-01161-a.

S.S. was supported by the grant SNSF-200021-115907/1. S.S. is grateful to the participants of

the Moduli Spaces program at the Mittag-Leffler Institute (Djursholm, Sweden) for the fruitful

discussions of the preliminary versions of the results of this paper. The remarks of C. Faber,

O. Tommasi, and D. Zvonkine were especially helpful.

I.S. was supported by the grant RFBR-06-01-00037.

References

[1] B. Zwiebach, Closed string field theory: Quantum action and the Batalin-Vilkovisky master equation, Nucl. Phys. B 390 (1) (1993) 33-152.

[2] E. Witten, Chern-Simons gauge theory as a string theory, in: The Floer Memorial Volume, in: Progress in Mathematics, vol. 133, Birkhäuser, Basel, 1995, pp. 637-678.

[3] M. Bershadsky, S. Cecotti, H. Ooguri, C. Vafa, Kodaira-Spencer theory of gravity and exact results for quantum string amplitudes, Commun. Math. Phys. 165 (2) (1994) 311-427.

[4] G. Mikhalkin, Enumerative tropical algebraic geometry in $\mathbb{R}^2$, J. Amer. Math. Soc. 18 (2) (2005) 313-377.

[5] A. Losev, S. Shadrin, From Zwiebach invariants to Getzler relation, Commun. Math. Phys. 271 (3) (2007) 649-679.

[6] T. Kimura, J. Stasheff, A. Voronov, On operad structures of moduli spaces and string theory, Commun. Math. Phys. 171 (1) (1995) 1-25.

[7] E. Getzler, Batalin-Vilkovisky algebras and two-dimensional topological field theories, Commun. Math. Phys. 159 (1994) 265-285.

[8] F. Schaetz, BVF-complex and higher homotopy structures, math.QA/0611912.

[9] P. Mnev, Notes on simplicial BF theory, hep-th/0610326.

[10] A. Losev, Y. Manin, New moduli spaces of pointed curves and pencils of flat connections, Michigan Math. J. 48 (2000) 443-472.

[11] A. Losev, Yu. Manin, Extended modular operad, in: Frobenius Manifolds, in: Aspects of Mathematics, vol. E36, Vieweg, Wiesbaden, 2004, pp. 181-211.

[12] S. Barannikov, M. Kontsevich, Frobenius manifolds and formality of Lie algebras of polyvector fields, Int. Math. Res. Notices 4 (1998) 201-215.

[13] S.A. Merkulov, Formality of canonical symplectic complexes and Frobenius manifolds, Int. Math. Res. Not. 14 (1998) 727-733.

[14] Yu.I. Manin, Three constructions of Frobenius manifolds: A comparative study, Asian J. Math. 3 (1) (1999) 179-220.

[15] A. Losev, Hodge strings and elements of K. Saito's theory of primitive form, in: Topological Field Theory, Primitive Forms and Related Topics, Kyoto, 1996, in: Progress in Mathematics, vol. 160, Birkhäuser Boston, Boston, MA, 1998, pp. 305-335.

[16] S. Shadrin, A definition of descendants at one point in graph calculus, math.QA/0507106.

[17] S. Shadrin, I. Shneiberg, Belorousski-Pandharipande relation in dGBV algebras, J. Geom. Phys. 57 (2) (2007) 597-615.

[18] I. Shneiberg, Topological recursion relations in $\overline{\mathcal{M}}_{2,2}$, Funct. Anal. Appl. (2007), in press.

[19] M. Kontsevich, Y. Manin, Gromov-Witten classes, quantum cohomology, and enumerative geometry, Commun. Math. Phys. 164 (3) (1994) 525-562.

[20] Yu. Manin, Frobenius Manifolds, Quantum Cohomology, and Moduli Spaces, American Mathematical Society Colloquium Publications, vol. 47, Amer. Math. Soc., Providence, RI, 1999.

[21] J. Harris, I. Morrison, Moduli of Curves, Graduate Texts in Mathematics, vol. 187, Springer-Verlag, New York, 1998.

[22] C. Faber, S. Shadrin, D. Zvonkine, Tautological relations and the r-spin Witten conjecture, math.AG/0612510.

[23] E. Getzler, Topological recursion relations in genus 2, in: Integrable Systems and Algebraic Geometry, Kobe/Kyoto, 1997, World Scientific, River Edge, NJ, 1998, pp. 73-106.

[24] E. Getzler, Intersection theory on $\overline{\mathcal{M}}_{1,4}$ and elliptic Gromov-Witten invariants, J. Am. Math. Soc. 10 (4) (1997) 973-998.

[25] P. Belorousski, R. Pandharipande, A descendent relation in genus 2, Ann. Sc. Norm. Super. Pisa Cl. Sci. (4) 29 (1) (2000) 171-191.

[26] T. Kimura, X. Liu, A genus-3 topological recursion relation, Commun. Math. Phys. 262 (3) (2006) 645-661.

[27] E. Witten, Two-dimensional Gravity and Intersection Theory on Moduli Space, Surveys in Differential Geometry, vol. 1, Lehigh Univ., Bethlehem, PA, 1991, pp. 243-310.

[28] M. Kontsevich, Intersection theory on the moduli space of curves and the matrix Airy function, Commun. Math. Phys. 147 (1) (1992) 1-23.

[29] A. Okounkov, R. Pandharipande, Gromov-Witten theory, Hurwitz numbers, and matrix models, I, math.AG/0101147.

[30] M. Mirzakhani, Weil-Petersson volumes and intersection theory on the moduli space of curves, J. Am. Math. Soc. 20 (1) (2007) 1-23.

[31] M. Kazarian, S. Lando, An algebro-geometric proof of Witten's conjecture, math.AG/0601760.

[32] Y.-S. Kim, K. Liu, A simple proof of Witten conjecture through localization, math.AG/0508384.

[33] L. Chen, Y. Li, K. Liu, Localization, Hurwitz numbers and the Witten conjecture, math.AG/0609263.

[34] U. Tillmann, Vanishing of the Batalin-Vilkovisky algebra structure for TCFTs, Commun. Math. Phys. 205 (2) (1999) 283-286.

[35] R. Dijkgraaf, Chiral deformations of conformal field theories, Nucl. Phys. B 493 (3) (1997) 588-612.

[36] A. Gerasimov, S. Shatashvili, Towards integrability of topological strings I: Three-forms on Calabi-Yau manifolds, JHEP 0411 (2004) 074.

[37] T. Graber, R. Pandharipande, Constructions of non-tautological classes on moduli spaces of curves, Michigan Math. J. 51 (1) (2003) 93-109.

systems at multiples of c = 26:

III. The spectra of c = 52 strings

M.B. Halpern

Department of Physics, University of California and Theoretical Physics Group,

Lawrence Berkeley National Laboratory, University of California, Berkeley, CA 94720, USA

Received 17 April 2007; accepted 8 August 2007

Available online 14 August 2007

Abstract

In the second paper of this series, I obtained the twisted BRST systems and extended physical-state conditions of all twisted open and closed c = 52 strings. In this paper, I supplement the extended physical-state

conditions with the explicit form of the extended (twisted) Virasoro generators of all c = 52 strings, which

allows us to discuss the physical spectra of these systems. Surprisingly, all the c = 52 spectra admit an

equivalent description in terms of generically-unconventional Virasoro generators at c = 26. This description strongly supports our prior conjecture that the c = 52 strings are free of negative-norm states, and

moreover shows that the spectra of some of the simpler cases are equivalent to those of ordinary untwisted

open and closed c = 26 strings.

2007 Elsevier B.V. All rights reserved.

1. Introduction

Opening another chapter in the orbifold program [1-15], this is the third in a series of papers which considers the critical orbifolds of permutation-type as candidates for new physical string systems at higher central charge. In the first paper [16] of this series, we found that the twisted sectors of these orbifolds are governed by new, extended (permutation-twisted) world-sheet gravities, which indicate that the free-bosonic orbifold-string systems of permutation-type can be free of negative-norm states at critical central charge c = 26K. Correspondingly-extended

world-sheet permutation supergravities are expected in the twisted sectors of the superstring

E-mail address: halpern@physics.berkeley.edu.

0550-3213/$ see front matter 2007 Elsevier B.V. All rights reserved.

doi:10.1016/j.nuclphysb.2007.08.001


superstring central charges.

In the second paper [17] of the series, we found the corresponding twisted BRST systems for all sectors of the free-bosonic orbifolds which couple to the simple case of $\mathbb{Z}_2$-twisted permutation gravity, i.e. for all the twisted strings with c = 52 matter. The new BRST systems also implied the following extended physical-state conditions for the physical states $\{|\chi\rangle\}$ of each of the c = 52 strings:

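$$\Big( \hat{L}_u\big( (m + \tfrac{u}{2}) \ge 0 \big) - \tfrac{17}{8}\, \delta_{m+\frac{u}{2},0} \Big) |\chi\rangle = 0, \qquad m \in \mathbb{Z},\quad \bar{u} = 0, 1, \qquad (1.1a)$$

$$\Big[ \hat{L}_u\big(m + \tfrac{u}{2}\big),\ \hat{L}_v\big(n + \tfrac{v}{2}\big) \Big] = \Big( m - n + \tfrac{u-v}{2} \Big) \hat{L}_{u+v}\Big( m + n + \tfrac{u+v}{2} \Big) + \tfrac{52}{12} \Big( m + \tfrac{u}{2} \Big) \Big( \big( m + \tfrac{u}{2} \big)^2 - 1 \Big) \delta_{m+n+\frac{u+v}{2},0}. \qquad (1.1b)$$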

The algebra in Eq. (1.1b) is called an order-two orbifold Virasoro algebra (or extended, twisted Virasoro algebra) and general orbifold Virasoro algebras [1,9,12,16-18] are known to govern all the twisted sectors of the orbifolds of permutation-type at higher multiples of c = 26.
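As a simple illustration of Eq. (1.1), one can restrict to the integer-moded generators. The sketch below sets u = v = 0 in our reading of (1.1a) and (1.1b); the hats on the generators are our assumed notation for the extended generators.

```latex
% Special case u = v = 0 of (1.1a), (1.1b); the hat notation is assumed.
\begin{align*}
\big[ \hat{L}_0(m),\, \hat{L}_0(n) \big]
 &= (m - n)\, \hat{L}_0(m + n) + \frac{52}{12}\, m\, (m^2 - 1)\, \delta_{m+n,0} ,\\
\Big( \hat{L}_0(m) - \tfrac{17}{8}\, \delta_{m,0} \Big) |\chi\rangle &= 0, \qquad m \ge 0 .
\end{align*}
```

So the integer-moded generators close on an ordinary Virasoro algebra with central charge 52, while the half-integer-moded generators with u = 1 extend it.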

The set of all c = 52 orbifold-strings is a very large class of fractional-moded free-bosonic

string systems, including e.g. the twisted open-string sectors of the orientation orbifolds, the

twisted closed-string sectors of the generalized $\mathbb{Z}_2$-permutation orbifolds and many others (see Refs. [16,17] and Section 2). Starting from the extended physical-state conditions (1.1) (and a right-mover copy of (1.1) on the same $\{|\chi\rangle\}$ for the twisted closed-string sectors) this paper

begins the concrete study of the physical spectrum of each c = 52 string.

As the prerequisite for this analysis, I first provide in Section 2 the explicit form (in terms of twisted matter fields) of the extended Virasoro generators $\{\hat L_u(m+\tfrac{u}{2}),\ u=0,1\}$ of all c = 52

strings. This construction allows us to begin the study of the general c = 52 string spectra in

Section 3. The same subject is further considered in Section 4, where I point out that all the

c = 52 spectra admit an equivalent description in terms of generically-unconventional Virasoro

generators at c = 26. This description allows us to see clearly a number of spectral regularities

which are only glimpsed in Section 3, including strong further evidence that the critical orbifolds

of permutation-type can be free of negative-norm states. Moreover, although the generic c = 52

spectrum is apparently new, we are able to show that some of the simpler spectra are equivalent

to those of ordinary untwisted open and closed critical strings at c = 26.

Based on these results, the discussion in Section 5 raises some interesting questions about

these theories at the interacting level, and speculates on the form of the extended physical-state

conditions for more general orbifold-strings of permutation-type. I will return to both of these

subjects in succeeding papers of the series.

2. The twisted Virasoro generators of c = 52 strings

As emphasized in Ref. [17], the universal form of the twisted BRST systems and the extended

physical-state conditions (1.1) are consequences of their origin in $\mathbb{Z}_2$-twisted permutation gravity,

which governs all twisted c = 52 matter.


There are however many distinct c = 52 strings, including the twisted open-string sectors of

the orientation orbifolds [12,13,15–17]

$$\frac{U(1)^{26}}{H_-},\qquad H_-=\mathbb{Z}_2(\text{w.s.})\times H, \tag{2.1}$$

and the twisted closed-string sectors of the generalized $\mathbb{Z}_2$-permutation orbifolds [15–17]

$$\frac{U(1)^{26}\times U(1)^{26}}{H_+},\qquad H_+=\mathbb{Z}_2(\text{perm})\times H', \tag{2.2}$$

as well as the generalized open-string $\mathbb{Z}_2$-permutation orbifolds and their T-duals [15–17].

For the orientation orbifolds in Eq. (2.1), I remind the reader that $H_-$ is any automorphism group of the untwisted closed string $U(1)^{26}$ which includes world-sheet orientation-reversing automorphisms. Indeed the twisted open-string orientation-orbifold sectors correspond to the orientation-reversing automorphisms, which have the form $\tau_-\times\omega$, $\omega\in H$, where the basic automorphism $\tau_-$ exchanges the left- and right-movers of the closed string and $\omega$ is an extra automorphism which acts uniformly on the left- and right-movers of the closed string. Similarly, the automorphism group $H_+$ of the generalized $\mathbb{Z}_2$-permutation orbifolds in (2.2) is generated by elements of the form $\tau_+\times\omega$, $\omega\in H'$, where the basic automorphism $\tau_+$ exchanges the two copies of the closed string and the extra automorphism $\omega$ again acts uniformly on the left- and right-movers of each closed string. In both cases, the extra automorphisms $\omega$ in $\tau_\mp\times\omega$ may or may not form a group (see the examples at the end of this section).

The spectra of different c = 52 strings are characterized by their extended (twisted) Virasoro

generators, all of which can in fact be written in the following unified form:

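$$\hat L_u\!\left(m+\tfrac{u}{2}\right)=\frac14\sum_{r,\mu,\nu}\mathcal{G}^{n(r)\mu;-n(r),\nu}(\sigma)\sum_{v=0}^{1}\sum_{p\in\mathbb{Z}}:\hat J_{n(r)\mu v}\!\left(p+\tfrac{n(r)}{\rho(\sigma)}+\tfrac{v}{2}\right)\hat J_{-n(r),\nu,u-v}\!\left(m-p-\tfrac{n(r)}{\rho(\sigma)}+\tfrac{u-v}{2}\right):_M+\delta_{m+\frac{u}{2},0}\,\hat\Delta_0(\sigma), \tag{2.3a}$$

$$\left[\hat J_{n(r)\mu u}\!\left(m+\tfrac{n(r)}{\rho(\sigma)}+\tfrac{u}{2}\right),\hat J_{n(s)\nu v}\!\left(n+\tfrac{n(s)}{\rho(\sigma)}+\tfrac{v}{2}\right)\right]=2\left(m+\tfrac{n(r)}{\rho(\sigma)}+\tfrac{u}{2}\right)\delta_{n(r)+n(s),0\ \mathrm{mod}\ \rho(\sigma)}\,\delta_{m+n+\frac{n(r)+n(s)}{\rho(\sigma)}+\frac{u+v}{2},0}\,\mathcal{G}_{n(r),\mu;-n(r),\nu}(\sigma), \tag{2.3b}$$

$$\left[\hat L_u\!\left(m+\tfrac{u}{2}\right),\hat J_{n(r)\mu v}\!\left(n+\tfrac{n(r)}{\rho(\sigma)}+\tfrac{v}{2}\right)\right]=-\left(n+\tfrac{n(r)}{\rho(\sigma)}+\tfrac{v}{2}\right)\hat J_{n(r),\mu,u+v}\!\left(m+n+\tfrac{n(r)}{\rho(\sigma)}+\tfrac{u+v}{2}\right), \tag{2.3c}$$

$$\hat\Delta_0(\sigma)=\frac{13}{8}+\frac12\sum_r\dim[\bar n(r)]\left(\frac{\bar n(r)}{\rho(\sigma)}-\frac12\right)\left(\theta\!\left(\frac{\bar n(r)}{\rho(\sigma)}\ge\frac12\right)-\frac{\bar n(r)}{\rho(\sigma)}\right), \tag{2.3d}$$

$$\sum_r\dim[\bar n(r)]=26. \tag{2.3e}$$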

Each set of extended Virasoro generators in Eq. (2.3a) satisfies the order-two orbifold Virasoro

algebra (1.1b) at c = 52, and the current algebras in Eq. (2.3b) are of the type called doubly-twisted in the orbifold program.

For those unfamiliar with the program, I first give a short summary of the standard notation in

the result (2.3), followed by the derivation of the result. As in the extended Virasoro generators themselves, the indices $u, v$ with fundamental range $\bar u,\bar v\in\{0,1\}$ describe the twist of the basic permutations $\tau_\mp$ in each $H_\mp$. For each extra automorphism $\omega(\sigma)$ in each twisted sector $\sigma$, the spectral indices $\{n(r)\}$ and the degeneracy indices $\{\mu(n(r))\}$ of each twisted sector are determined by the so-called $H$-eigenvalue problem [3,5,6] of $\omega(\sigma)$,

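$$\omega(\sigma)_a{}^{b}\,U^\dagger(\sigma)_b{}^{n(r)\mu}=U^\dagger(\sigma)_a{}^{n(r)\mu}\,e^{-2\pi i\frac{n(r)}{\rho(\sigma)}},\qquad\omega(\sigma)\in H\ \text{or}\ H', \tag{2.4a}$$

$$\omega(\sigma)_a{}^{c}\,\omega(\sigma)_b{}^{d}\,G_{cd}=G_{ab},\qquad G_{ab}=G^{ab}=\begin{pmatrix}-1&0\\0&\mathbb{1}\end{pmatrix},\qquad a,b=0,1,\ldots,25, \tag{2.4b}$$

$$\bar n(r)\in\{0,1,\ldots,\rho(\sigma)-1\}, \tag{2.4c}$$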

where $G$ is the untwisted target-space metric of $U(1)^{26}$. The quantity $\rho(\sigma)$ is the order of $\omega(\sigma)$ and all indices $\{n(r)\}$ are periodic modulo $\rho(\sigma)$, with $\{\bar n(r)\}$ their values in the fundamental region and $\dim[\bar n(r)]$ the degeneracy of each $\bar n(r)$. The index $r$ is summed once over the fundamental region in Eqs. (2.3a), (2.3d) and (2.3e). The twisted metric $\mathcal{G}_{\bullet}(\sigma)$ and its inverse $\mathcal{G}^{\bullet}(\sigma)$ are defined in terms of the unitary eigenvectors $U(\sigma)$ of the $H$-eigenvalue problem

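$$\mathcal{G}_{n(r)\mu;n(s)\nu}(\sigma)=\chi_{n(r)\mu}\,\chi_{n(s)\nu}\,U(\sigma)_{n(r)\mu}{}^{a}\,U(\sigma)_{n(s)\nu}{}^{b}\,G_{ab} \tag{2.5a}$$

$$=\delta_{n(r)+n(s),0\ \mathrm{mod}\ \rho(\sigma)}\,\mathcal{G}_{n(r),\mu;-n(r),\nu}(\sigma), \tag{2.5b}$$

$$\mathcal{G}^{n(r)\mu;n(s)\nu}(\sigma)=\chi^{-1}_{n(r)\mu}\,\chi^{-1}_{n(s)\nu}\,G^{ab}\,U^\dagger(\sigma)_a{}^{n(r)\mu}\,U^\dagger(\sigma)_b{}^{n(s)\nu} \tag{2.5c}$$

$$=\delta_{n(r)+n(s),0\ \mathrm{mod}\ \rho(\sigma)}\,\mathcal{G}^{n(r),\mu;-n(r),\nu}(\sigma), \tag{2.5d}$$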

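where $G$ is again the untwisted metric and the $\chi$'s are essentially-arbitrary normalization constants. Finally, the standard mode normal-ordering in Eq. (2.3a) is:

$$\begin{aligned}
:\hat J_{n(r)\mu u}\!\left(m+\tfrac{n(r)}{\rho(\sigma)}+\tfrac{u}{2}\right)\hat J_{n(s)\nu v}\!\left(n+\tfrac{n(s)}{\rho(\sigma)}+\tfrac{v}{2}\right):_M
&\equiv\theta\!\left(\left(m+\tfrac{n(r)}{\rho(\sigma)}+\tfrac{u}{2}\right)\ge0\right)\hat J_{n(s)\nu v}\!\left(n+\tfrac{n(s)}{\rho(\sigma)}+\tfrac{v}{2}\right)\hat J_{n(r)\mu u}\!\left(m+\tfrac{n(r)}{\rho(\sigma)}+\tfrac{u}{2}\right)\\
&\quad+\theta\!\left(\left(m+\tfrac{n(r)}{\rho(\sigma)}+\tfrac{u}{2}\right)<0\right)\hat J_{n(r)\mu u}\!\left(m+\tfrac{n(r)}{\rho(\sigma)}+\tfrac{u}{2}\right)\hat J_{n(s)\nu v}\!\left(n+\tfrac{n(s)}{\rho(\sigma)}+\tfrac{v}{2}\right).
\end{aligned} \tag{2.6}$$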

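It follows that the quantity $\hat\Delta_0(\sigma)$ in Eqs. (2.3a) and (2.3d)

$$\hat J_{n(r)\mu u}\!\left(\left(m+\tfrac{n(r)}{\rho(\sigma)}+\tfrac{u}{2}\right)\ge0\right)|0\rangle_\sigma=0, \tag{2.7a}$$

$$\hat L_u\!\left(\left(m+\tfrac{u}{2}\right)\ge0\right)|0\rangle_\sigma=\hat\Delta_0(\sigma)\,\delta_{m+\frac{u}{2},0}\,|0\rangle_\sigma \tag{2.7b}$$

is the conformal weight of the scalar twist-field state $|0\rangle_\sigma$ of sector $\sigma$.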

I comment briefly on the derivation of the unified form (2.3) of the c = 52 extended Virasoro generators. Essentially this result was given for the twisted open-string sectors of the non-Abelian orientation orbifolds in Sections 3.4, 3.5 of Ref. [12], and that result is easily reduced for our Abelian case $U(1)^{26}/H_-$ in Eq. (2.1). With a right-mover copy of the extended Virasoro generators (and $u\to\hat j=0,1$), the result also holds for the twisted closed-string sectors of the generalized $\mathbb{Z}_2$-permutation orbifolds $(U(1)^{26}\times U(1)^{26})/H_+$ in Eq. (2.2). This follows by the substitution

$$G\to\mathcal{G},\qquad\frac{u}{2}\to\frac{n(r)}{\rho(\sigma)}+\frac{u}{2} \tag{2.8}$$

into the known results for the ordinary $\mathbb{Z}_2$-permutation orbifolds with trivial $H'$ (see Ref. [9] and Section 4.2 of Ref. [16]). Finally, a single copy of the unified form (2.3) holds as well for each twisted sector of the generalized open-string $\mathbb{Z}_2$-permutation orbifolds $(U(1)^{26}\times U(1)^{26})_{\text{open}}/H_+$ and all possible T-dualizations of each of these sectors. This conclusion follows because the left-mover extended Virasoro generators of the closed-string orbifolds for each $H_+$ are the input data for the construction of the corresponding open-string orbifolds [14], and the twisted-current form of each set of extended Virasoro generators is independent of T-dualization [15]. The branes, quasi-canonical algebra and non-commutative geometry of the twisted open strings [13–17] depend of course on the particular T-dualization, but these will not be needed here.

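In what follows I will consider each twisted c = 52 string separately, but the reader may find it helpful to bear in mind the complete sector structure of these orbifold-string systems as labeled by the elements of the automorphism groups $H_\mp$. Given a particular extra automorphism $\omega_n\in H$ or $H'$ of order $n$, one may list the following low-order examples:

$$(1;\ \tau_\mp), \tag{2.9a}$$

$$(1;\ \tau_\mp\times\omega_2), \tag{2.9b}$$

$$(1,\ \omega_3,\ \omega_3^2;\ \tau_\mp,\ \tau_\mp\times\omega_3,\ \tau_\mp\times\omega_3^2), \tag{2.9c}$$

$$(1,\ \omega_4^2;\ \tau_\mp\times\omega_4,\ \tau_\mp\times\omega_4^3), \tag{2.9d}$$

$$(1,\ \omega_6^2,\ \omega_6^4;\ \tau_\mp\times\omega_6,\ \tau_\mp\times\omega_6^3,\ \tau_\mp\times\omega_6^5). \tag{2.9e}$$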

For the generalized $\mathbb{Z}_2$-permutation orbifolds ($\tau_+$) all of these sectors are twisted closed strings at c = 52, while all the sectors of the generalized open-string $\mathbb{Z}_2$-permutation orbifolds ($\tau_+$) and their T-dualizations are twisted open strings at c = 52. For the orientation orbifolds ($\tau_-$)

the sectors before the semicolon are twisted closed strings at c = 26 (which form an ordinary

space-time orbifold) while the sectors after the semicolon are twisted open strings at c = 52.

More generally, orientation orbifolds always contain an equal number of twisted open and closed

strings. In all cases, the twisting is of course trivial for sectors corresponding to the unit element.
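The sector lists in Eq. (2.9) can be reproduced with a short sketch. It assumes, as the listed examples suggest but the text does not state in this form, that the sectors are the powers of a single generator which pairs the basic permutation with an extra automorphism of order n. The function name `sectors` and the encoding of each element as a pair (parity of the basic permutation, exponent of the extra automorphism) are illustrative choices, not notation from the paper.

```python
def sectors(n):
    """Enumerate the powers of a generator (tau, omega_n) in Z_2 x Z_n.

    Each element is encoded as (a, b): a = 1 if the basic permutation tau
    is present, b = exponent of the extra automorphism omega_n.
    Returns two sorted lists of exponents b: those without tau (listed
    before the semicolon in Eq. (2.9)) and those with tau (after it).
    This is an illustrative assumption about how the lists are generated.
    """
    without_tau, with_tau = set(), set()
    a, b = 0, 0  # start at the unit element
    while True:
        (with_tau if a else without_tau).add(b)
        a, b = (a + 1) % 2, (b + 1) % n  # multiply by the generator
        if (a, b) == (0, 0):
            break
    return sorted(without_tau), sorted(with_tau)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 6):
        print(n, sectors(n))
```

Under this assumption the two lists always have equal length, in line with the remark that orientation orbifolds contain an equal number of twisted open and closed strings.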

3. First discussion of the c = 52 string spectra

To frame this discussion, I remind [1] the reader that the Virasoro primary states of our orbifold CFTs are defined by the integral Virasoro subalgebra (generated by $\{\hat L_0(m)\}$) of the extended Virasoro algebra. Then the extended physical-state conditions (1.1a) tell us that all the physical states $\{|\chi\rangle\}$ of each c = 52 orbifold-string are Virasoro primary

$$\hat L_0(m>0)|\chi\rangle=0 \tag{3.1}$$


but only a small subset of these primary states are selected by the rest of the physical-state

conditions:

$$\left(\hat L_0(0)-\tfrac{17}{8}\right)|\chi\rangle=\hat L_1\!\left(\left(m+\tfrac12\right)>0\right)|\chi\rangle=0. \tag{3.2}$$

In what follows, I will refer to the $\hat L_0(0)$ condition in Eq. (3.2) as the spectral condition, since it

will determine the allowed values of momentum-squared for each c = 52 string.

The space of physical states of each orbifold-string is then much smaller than the space of

states of the underlying orbifold conformal field theory. For the experts, I remark in particular

that the extended physical-state conditions generically disallow the characteristic sequence [19]

of Virasoro primary states known as the principle-primary states [1,9]. This follows first by the

spectral condition (which fixes the conformal weight), and second because the physical-state

condition $\{\hat L_u((m+\tfrac{u}{2})>0)\simeq0\}$ is stronger than the principle-primary state condition [1,9]

$$\hat L_u\!\left(m+\tfrac{u}{2}\right)|\text{p.p.s.}\rangle=0,\qquad u=0,1,\quad m>0, \tag{3.3}$$

which does not extend to m = 0.

I turn now to concretize the spectral condition of each twisted c = 52 string, using the explicit

form (2.3) of its extended Virasoro generators. For this, recall [12,15] first that these generators contain in general two kinds of commuting zero modes (dimensionless momenta), namely

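$\{\hat J_{0\mu0}(0)\}$ and $\{\hat J_{\rho(\sigma)/2,\mu,1}(0)\}$, where the latter is relevant only when the order $\rho(\sigma)$ of $\omega(\sigma)$ is even. In what follows, I often refer to these zero modes collectively as $\{\hat J(0)\}$. It is then natural to define the momentum-squared operator $\hat P^2$ as follows:

$$\hat L_0(0)=\frac14\left(-\hat P^2+\hat R(\sigma)\right)+\hat\Delta_0(\sigma), \tag{3.4a}$$

$$\hat P^2\equiv-\left\{\mathcal{G}^{0\mu;0\nu}(\sigma)\,\hat J_{0\mu0}(0)\hat J_{0\nu0}(0)+\mathcal{G}^{\frac{\rho(\sigma)}{2},\mu;-\frac{\rho(\sigma)}{2},\nu}(\sigma)\,\hat J_{\rho(\sigma)/2,\mu,1}(0)\hat J_{-\rho(\sigma)/2,\nu,-1}(0)\right\}, \tag{3.4b}$$

$$\hat R(\sigma)\equiv\sum_{r,\mu,\nu}\mathcal{G}^{n(r)\mu;-n(r),\nu}(\sigma)\sum_{u=0}^{1}{\sum_{p\in\mathbb{Z}}}':\hat J_{n(r)\mu u}\!\left(p+\tfrac{n(r)}{\rho(\sigma)}+\tfrac{u}{2}\right)\hat J_{-n(r),\nu,-u}\!\left(-p-\tfrac{n(r)}{\rho(\sigma)}-\tfrac{u}{2}\right):_M. \tag{3.4c}$$

Here the primed sum in the level-number operator $\hat R(\sigma)$ omits the zero modes $\{\hat J(0)\}$, which are collected in $\hat P^2$.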

With this decomposition, the spectral condition in Eq. (3.2) takes the simple form:

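$$\hat P^2|\chi\rangle=\left(\hat P^2_{(0)}+\hat R(\sigma)\right)|\chi\rangle, \tag{3.5a}$$

$$\hat P^2_{(0)}\equiv2\left(\hat\delta_0(\sigma)-1\right), \tag{3.5b}$$

$$\hat\delta_0(\sigma)=\sum_r\dim[\bar n(r)]\left(\frac{\bar n(r)}{\rho(\sigma)}-\frac12\right)\left(\theta\!\left(\frac{\bar n(r)}{\rho(\sigma)}\ge\frac12\right)-\frac{\bar n(r)}{\rho(\sigma)}\right)\ge0. \tag{3.5c}$$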


Although I will continue the discussion primarily in this form, in fact Eqs. (3.4a) and (3.5a) hold

only for the twisted open-string sectors of the orbifolds. For the twisted closed-string sectors,

we also have right-mover copies of the extended Virasoro generators (2.3), and a corresponding

right-mover copy of the extended physical-state conditions (1.1) on the same $\{|\chi\rangle\}$. For simplicity

I will limit the discussion of these sectors here to the case of decompactified zero modes, for

which it is appropriate to equate the left and right movers

$$\hat J^R(0)=\hat J^L(0)=\frac{1}{\sqrt2}\hat J(0),\qquad\hat R^R(\sigma)=\hat R^L(\sigma) \tag{3.6}$$

where the last equality is level-matching in each twisted sector. Keeping the same definition of the operator $\hat P^2$ in Eq. (3.4b), the correct closed-string c = 52 spectral condition is then obtained by the substitution

$$\hat P^2\to\frac12\hat P^2 \tag{3.7}$$

in both Eqs. (3.4a) and (3.5a). These identifications, and hence P(0)

(0)

point in the discussion below to obtain the corresponding closed-string results.

Returning to the open-string case, one simple solution of the extended physical-state conditions is the ground state $|0,\hat J(0)\rangle_\sigma$ of twisted sector $\sigma$:

$$\hat R(\sigma)|0,\hat J(0)\rangle_\sigma=\hat L_u\!\left(\left(m+\tfrac{u}{2}\right)>0\right)|0,\hat J(0)\rangle_\sigma=0, \tag{3.8a}$$

$$\hat P^2|0,\hat J(0)\rangle_\sigma=\hat P^2_{(0)}|0,\hat J(0)\rangle_\sigma,\qquad\hat P^2_{(0)}=-2+2\hat\delta_0(\sigma). \tag{3.8b}$$

This is the momentum-boosted twist-field state (see Eq. (2.7)) of that sector, with ground-state mass-squared $\hat P^2_{(0)}$. Moreover Eq. (3.4c) and the commutator (2.3c) give the increments

$$\Delta\hat P^2=\Delta\hat R(\sigma)=4\left|m+\frac{n(r)}{\rho(\sigma)}+\frac{u}{2}\right| \tag{3.9}$$

obtained by adding the negatively-moded current

$$\hat J_{n(r)\mu u}\!\left(\left(m+\frac{n(r)}{\rho(\sigma)}+\frac{u}{2}\right)<0\right)$$

to any previous state. The precise content of these excited levels must of course be determined from the remainder of the extended physical-state conditions.

I continue this discussion with some specific examples of c = 52 strings, beginning with the

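simplest twisted open-string orientation-orbifold sectors [12,13,15–17]:

$$\omega=1:\quad\rho=1,\ \ \bar n=0,\ \ U=1,\ \ \mathcal{G}=G,\ \ \hat J_{0au}\!\left(m+\tfrac{u}{2}\right), \tag{3.10a}$$

$$\Delta\hat P^2=4\left|m+\tfrac{u}{2}\right|\quad(u=0\ \text{is DD},\ u=1\ \text{is ND}), \tag{3.10b}$$

$$\omega=-1:\quad\rho=2,\ \ \bar n=1,\ \ U=1,\ \ \mathcal{G}=G,\ \ \hat J_{1au}\!\left(m+\tfrac{u+1}{2}\right), \tag{3.11a}$$

$$\Delta\hat P^2=4\left|m+\tfrac{u+1}{2}\right|\quad(u=0\ \text{is DN},\ u=1\ \text{is NN}). \tag{3.11b}$$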

In these cases, the extra automorphisms $\omega=\pm1$ act uniformly on the labels $a=0,\ldots,25$ and $G$ is the untwisted target space metric in Eq. (2.4b). Although both twisted strings have (26 + 26) = 52 matter degrees of freedom, note that each example has only one of the two types of zero modes $\{\hat J(0)\}$: 26 DD zero modes $\{\hat J_{0a0}(0)\}$ for $\omega=1$ and 26 NN zero modes $\{\hat J_{\rho/2,a,1}(0)\}$ for $\omega=-1$.

In both cases, the momentum-squared (3.4b) has the schematic form

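$$\hat P^2=\eta^{ab}\hat J_a(0)\hat J_b(0),\qquad\eta=\begin{pmatrix}1&0\\0&-\mathbb{1}\end{pmatrix}, \tag{3.12}$$

where $\eta=-G$ is the standard (west-coast) 26-dimensional target-space metric. Then we compute from Eqs. (3.5b) and (3.5c) that both strings share the same tachyonic ground-state mass-squared

$$\hat\delta_0(\sigma)=0,\qquad\hat\Delta_0(\sigma)=\frac{13}{8},\qquad\hat P^2_{(0)}=-2 \tag{3.13}$$

and the first excited state of each is massless:

$$\hat P^2\begin{Bmatrix}\hat J_{0a1}(-\tfrac12)\\[2pt]\hat J_{1a0}(-\tfrac12)\end{Bmatrix}|0,\hat J(0)\rangle_\sigma=0\quad\text{for}\ \ \omega=\begin{Bmatrix}1\\-1\end{Bmatrix}. \tag{3.14}$$

For this level, I have checked that the $\hat L_1(\tfrac12)\simeq0$ gauge eliminates the longitudinal parts of the 26-dimensional photons, and moreover the $\hat L_1(\tfrac12)$ and $\hat L_0(1)$ gauges together eliminate the negative-norm states at the next level:

$$\left\{\hat J\!\left(-\tfrac12\right)^2+\hat J(-1)\right\}|0,\hat J(0)\rangle_\sigma,\qquad\hat P^2=2. \tag{3.15}$$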

Since the increments ($\Delta\hat P^2$) in Eqs. (3.10b) and (3.11b) are even integers, we are led to suspect

that the spectra of these two twisted c = 52 strings are nothing but the spectrum of an ordinary

open c = 26 string in disguise.1 I will return to this question in the following section.

A larger subset of twisted c = 52 strings is the following. For a particular twisted sector $\sigma$, suppose that $\omega=\pm1$ acts uniformly on a set of $d$ labels $a=0,1,\ldots,d-1$, $d\ge4$, while a non-trivial element $\omega(\text{perm})$ of some permutation group acts non-trivially on the other $26-d$ spatial labels. Then Eqs. (2.4), (2.5) and standard results [3,5–7,9] in the orbifold program give the following explicit form of the extended Virasoro generators (2.3) in this sector:

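$$\begin{aligned}
\hat L_u\!\left(m+\tfrac{u}{2}\right)={}&\delta_{m+\frac{u}{2},0}\,\hat\Delta_0(\sigma)+\frac14G^{ab}_{(d)}\sum_{v=0}^{1}\sum_{p\in\mathbb{Z}}:\hat J_{\epsilon av}\!\left(p+\tfrac{v+\epsilon}{2}\right)\hat J_{-\epsilon,b,u-v}\!\left(m-p+\tfrac{u-v-\epsilon}{2}\right):_M\\
&+\frac14\sum_{v=0}^{1}\sum_j\frac{1}{f_j(\sigma)}\sum_{\hat j=0}^{f_j(\sigma)-1}\sum_{p\in\mathbb{Z}}:\hat J_{\hat jjv}\!\left(p+\tfrac{\hat j}{f_j(\sigma)}+\tfrac{v}{2}\right)\hat J_{-\hat j,j,u-v}\!\left(m-p-\tfrac{\hat j}{f_j(\sigma)}+\tfrac{u-v}{2}\right):_M,
\end{aligned} \tag{3.16a}$$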

1 The spectra of these two c = 52 strings look even more familiar in terms of the dimensionful momenta k

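$$\hat\delta_0(\sigma)=\sum_j\sum_{\hat j=0}^{f_j(\sigma)-1}\left(\frac{\hat j}{f_j(\sigma)}-\frac12\right)\left(\theta\!\left(\frac{\hat j}{f_j(\sigma)}\ge\frac12\right)-\frac{\hat j}{f_j(\sigma)}\right)\ge0, \tag{3.16b}$$

$$\sum_jf_j(\sigma)=26-d,\qquad4\le d\le26. \tag{3.16c}$$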

Here $\epsilon=0$ or 1 for $\omega=1$ or $-1$, $G_{(d)}$ is the restriction of the flat target-space metric (2.4b) to the first $d$ labels, $f_j(\sigma)$ is the size of the $j$th cycle in $\omega(\text{perm})$, and the previous cases with $\hat\delta_0(\sigma)=0$ are included when $d=26$. The half-integer moded currents in the second term of (3.16) satisfy the twisted current algebra (2.3b) with $\mathcal{G}\to G_{(d)}$. For the permutation-twisted currents in the last term of (3.16), I have used the standard relation $(n(r)/\rho(\sigma))=(\hat j/f_j(\sigma))$ and (the inverse of) the twisted metric [3,5–7,9]

$$\mathcal{G}_{\hat jj;\hat ll}(\sigma)=\delta_{jl}\,f_j(\sigma)\,\delta_{\hat j+\hat l,0\ \mathrm{mod}\ f_j(\sigma)} \tag{3.17}$$

which also determines the twisted current algebra (2.3b) for these currents. Using Eq. (3.16b), we see that the non-trivial element of $\mathbb{Z}_2$ on two labels also gives $\hat\delta_0(\sigma)=0$ and a $\hat P^2_{(0)}=-2$ ground state, but a non-trivial element of $\mathbb{Z}_3$ on three labels gives a slightly-raised ground state

$$\hat\delta_0(\sigma)=\frac19,\qquad\hat\Delta_0(\sigma)=\frac{121}{72},\qquad\hat P^2_{(0)}=-\frac{16}{9} \tag{3.18}$$

and no photons.
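The ground-state values quoted above can be checked with exact rational arithmetic. The sketch below assumes that the summand of Eq. (3.16b) is $(\hat j/f_j-\tfrac12)(\theta(\hat j/f_j\ge\tfrac12)-\hat j/f_j)$ for each cycle of length $f_j$, and that the conformal weight and ground-state mass-squared follow as $\tfrac{13}{8}+\tfrac12\hat\delta_0$ and $2(\hat\delta_0-1)$, which is consistent with the numbers in Eqs. (3.13) and (3.18). The function names are illustrative and do not come from the paper.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def delta0_hat(cycle_lengths):
    """Sum the assumed summand of Eq. (3.16b) over all cycles.

    cycle_lengths: iterable of positive integers f_j (cycle sizes of the
    extra permutation). Labels on which the permutation acts trivially
    can be passed as cycles of length 1; they contribute zero.
    """
    total = Fraction(0)
    for f in cycle_lengths:
        for jhat in range(f):
            x = Fraction(jhat, f)
            theta = 1 if x >= HALF else 0
            total += (x - HALF) * (theta - x)
    return total


def ground_state(cycle_lengths):
    """Return (delta0_hat, Delta0_hat, P0_squared) for an open-string sector."""
    d0 = delta0_hat(cycle_lengths)
    return d0, Fraction(13, 8) + d0 / 2, 2 * (d0 - 1)


if __name__ == "__main__":
    print("Z_2 on two labels:  ", ground_state([2]))
    print("Z_3 on three labels:", ground_state([3]))
```

With a single cycle of length 2 the sketch gives a vanishing shift and mass-squared of -2, and with a single cycle of length 3 it reproduces the three fractions of Eq. (3.18).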

Given the cycle-structure $\{f_j(\sigma)\}$ of any extra automorphism $\omega(\text{perm})$ (see e.g. Eq. (3.4) of Ref. [16]), it is straightforward to evaluate the sum in Eq. (3.16b). As an illustration, one finds the simple tachyonic ground-state mass-squares

$$\hat P^2_{(0)}=-\frac{1}{12}\left(d-2+\frac{1}{26-d}\right) \tag{3.19}$$

in twisted sectors which correspond to the action of any non-trivial element of the cyclic group $\mathbb{Z}_\lambda$ of prime order $\lambda$ on $3\le(\lambda=26-d)\le21$ spatial labels. The result (3.19) includes Eq. (3.18) when $d=23$, but does not extend to the cases $d=26,24$ with $\hat P^2_{(0)}=-2$ discussed above. I remind the reader that this result applies only to the open orbifold-strings, while twice these values of $\hat P^2_{(0)}$ are obtained for the closed-string versions.
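As a cross-check of Eq. (3.19), the direct cycle sum can be compared with the closed form for each odd prime order. The sketch assumes the closed form reads $-\tfrac{1}{12}(d-2+\tfrac{1}{26-d})$, that a non-trivial element of a cyclic group of prime order acts as a single cycle of that length, and that the summand of Eq. (3.16b) is as written in the code comment. All names are illustrative.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def p0_squared_direct(lam):
    """Ground-state mass-squared 2*(delta0_hat - 1) for one cycle of length lam.

    Uses the assumed summand (x - 1/2) * (theta(x >= 1/2) - x), x = jhat/lam.
    """
    d0 = Fraction(0)
    for jhat in range(lam):
        x = Fraction(jhat, lam)
        theta = 1 if x >= HALF else 0
        d0 += (x - HALF) * (theta - x)
    return 2 * (d0 - 1)


def p0_squared_closed_form(d):
    """Assumed closed form of Eq. (3.19) as a function of d = 26 - lam."""
    return -Fraction(1, 12) * (d - 2 + Fraction(1, 26 - d))


if __name__ == "__main__":
    for lam in (3, 5, 7, 11, 13, 17, 19):
        d = 26 - lam
        print(lam, d, p0_squared_direct(lam), p0_squared_closed_form(d))
```

The two expressions agree for the odd primes in the stated range, while for order 2 (d = 24) the direct sum gives -2 and the closed form does not, matching the remark that the result does not extend to that case.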

Further analysis of the c = 52 strings, including the larger subset of examples (3.16), is

found in the following section.

4. Equivalent c = 26 description of the c = 52 spectra

In fact, there exists an entirely equivalent description of all the c = 52 string spectra in terms

of generically-unconventional Virasoro generators at c = 26.

To obtain the c = 26 description, I first define the relabeled (unhatted) operators

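$$J_{n(r)\mu}\!\left(2m+u+\frac{2n(r)}{\rho(\sigma)}\right)\equiv\hat J_{n(r)\mu u}\!\left(m+\frac{n(r)}{\rho(\sigma)}+\frac{u}{2}\right),\qquad u=0,1, \tag{4.1a}$$

$$L(2m+u)\equiv2\hat L_u\!\left(m+\frac{u}{2}\right)-\frac{13}{4}\,\delta_{m+\frac{u}{2},0} \tag{4.1b}$$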

in terms of the hatted operators above. This 1–1 map is recognized as a modest generalization of (the inverse of) the order-two orbifold-induction procedure of Borisov, Halpern and Schweigert [1]. Since $M\equiv2m+u$, $u=0,1$ covers the integers once, we then find from (2.3)

the explicit form of the c = 26 generators:

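$$L(M)=\hat\delta_0(\sigma)\,\delta_{M,0}+\frac12\sum_{r,\mu,\nu}\mathcal{G}^{n(r)\mu;-n(r),\nu}(\sigma)\sum_{Q\in\mathbb{Z}}:J_{n(r)\mu}\!\left(Q+\frac{2n(r)}{\rho(\sigma)}\right)J_{-n(r),\nu}\!\left(M-Q-\frac{2n(r)}{\rho(\sigma)}\right):_M, \tag{4.2a}$$

$$\hat\delta_0(\sigma)=\sum_r\dim[\bar n(r)]\left(\frac{\bar n(r)}{\rho(\sigma)}-\frac12\right)\left(\theta\!\left(\frac{\bar n(r)}{\rho(\sigma)}\ge\frac12\right)-\frac{\bar n(r)}{\rho(\sigma)}\right), \tag{4.2b}$$

$$[L(M),L(N)]=(M-N)L(M+N)+\frac{26}{12}M(M^2-1)\,\delta_{M+N,0}, \tag{4.2c}$$

$$\left[L(M),J_{n(r)\mu}\!\left(N+\frac{2n(r)}{\rho(\sigma)}\right)\right]=-\left(N+\frac{2n(r)}{\rho(\sigma)}\right)J_{n(r)\mu}\!\left(M+N+\frac{2n(r)}{\rho(\sigma)}\right), \tag{4.2d}$$

$$\left[J_{n(r)\mu}\!\left(M+\frac{2n(r)}{\rho(\sigma)}\right),J_{n(s)\nu}\!\left(N+\frac{2n(s)}{\rho(\sigma)}\right)\right]=\left(M+\frac{2n(r)}{\rho(\sigma)}\right)\delta_{n(r)+n(s),0\ \mathrm{mod}\ \rho(\sigma)}\,\delta_{M+N+\frac{2(n(r)+n(s))}{\rho(\sigma)},0}\,\mathcal{G}_{n(r)\mu;-n(r),\nu}(\sigma). \tag{4.2e}$$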

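The expression (4.2b) for $\hat\delta_0(\sigma)$ is the same as above, and the mode-normal ordering in Eq. (4.2a)

$$\begin{aligned}
:J_{n(r)\mu}\!\left(M+\tfrac{2n(r)}{\rho(\sigma)}\right)J_{n(s)\nu}\!\left(N+\tfrac{2n(s)}{\rho(\sigma)}\right):_M
&\equiv\theta\!\left(\left(M+\tfrac{2n(r)}{\rho(\sigma)}\right)\ge0\right)J_{n(s)\nu}\!\left(N+\tfrac{2n(s)}{\rho(\sigma)}\right)J_{n(r)\mu}\!\left(M+\tfrac{2n(r)}{\rho(\sigma)}\right)\\
&\quad+\theta\!\left(\left(M+\tfrac{2n(r)}{\rho(\sigma)}\right)<0\right)J_{n(r)\mu}\!\left(M+\tfrac{2n(r)}{\rho(\sigma)}\right)J_{n(s)\nu}\!\left(N+\tfrac{2n(s)}{\rho(\sigma)}\right)
\end{aligned} \tag{4.3}$$

follows from the c = 52 ordering (2.6) because the map (4.1) preserves the sign of all arguments.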

I emphasize that the c = 26 Virasoro generators in Eq. (4.2) are generically-unconventional

because the twisted matter is now summed over the fractions $\{2n/\rho\}$ instead of the conventional orbifold-fractions $\{n/\rho\}$. This distortion of the extra twist is the price we must pay in order to unwind the basic twist associated to the basic permutations of $H_\mp$.

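The map (4.1) also tells us that the c = 52 momenta $\{\hat J(0)\}$ and the c = 26 momenta $\{J(0)\}$ are identical, and we may record

$$J(0)=\hat J(0):\qquad P^2=\hat P^2 \tag{4.4a}$$

$$=-\left\{\mathcal{G}^{0\mu;0\nu}(\sigma)\,J_{0\mu}(0)J_{0\nu}(0)+\mathcal{G}^{\frac{\rho(\sigma)}{2},\mu;-\frac{\rho(\sigma)}{2},\nu}(\sigma)\,J_{\rho(\sigma)/2,\mu}(0)J_{-\rho(\sigma)/2,\nu}(0)\right\}, \tag{4.4b}$$

where the c = 52 form of $\hat P^2$ was given in Eq. (3.4b). Similarly, the level-number operator $R(\sigma)$ in the decomposition of $L(0)$ is the same

$$L(0)=\frac12\left(-P^2+R(\sigma)\right)+\hat\delta_0(\sigma), \tag{4.5a}$$

$$R(\sigma)=\hat R(\sigma)=\sum_{r,\mu,\nu}\mathcal{G}^{n(r)\mu;-n(r),\nu}(\sigma){\sum_{Q\in\mathbb{Z}}}':J_{n(r)\mu}\!\left(Q+\frac{2n(r)}{\rho(\sigma)}\right)J_{-n(r),\nu}\!\left(-Q-\frac{2n(r)}{\rho(\sigma)}\right):_M \tag{4.5b}$$

where the c = 52 form of $\hat R(\sigma)$ was given in Eq. (3.4c).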

By itself, the inverse orbifold-induction procedure (4.1) is only a relabeling of the operators

of the permutation-orbifold CFTs. The central point of this discussion however is that for the

orbifold-string theories (restricted by the extended physical state conditions (1.1)) the map also gives us a completely equivalent c = 26 description of the physical spectrum of each c = 52 orbifold-string. Indeed, it is easily checked that both components u = 0, 1 of the c = 52 extended physical-state condition (1.1a) map directly onto the simpler and in fact conventional physical-state condition

$$L(M\ge0)|\chi\rangle=\delta_{M,0}|\chi\rangle \tag{4.6}$$

in the 26-dimensional description! A right-mover copy of Eq. (4.6) on the same physical states $\{|\chi\rangle\}$ is similarly obtained in the equivalent c = 26 description of the closed orbifold-strings.

I emphasize that the physical states $\{|\chi\rangle\}$ of the 26-dimensional description (4.6) are exactly the original physical states (1.1a) of the c = 52 string. Indeed, each physical state $|\chi\rangle$ can be regarded as invariant under the map, or each can now be rewritten in 26-dimensional form. In further detail, Eqs. (4.5) and (4.6) give the same spectral condition $P^2\simeq P^2_{(0)}+R(\sigma)$, the same physical ground state$^2$

$$|0,J(0)\rangle_\sigma=|0,\hat J(0)\rangle_\sigma,\qquad P^2_{(0)}=\hat P^2_{(0)}=-2+2\hat\delta_0(\sigma) \tag{4.7}$$

and each negatively-moded hatted current in any physical state can be replaced according to Eq. (4.1a) by the corresponding unhatted current mode. Note finally that the commutator (4.2d) and the decomposition (4.5a) give the 26-dimensional increment

$$\Delta P^2=\Delta R(\sigma)=2\left|M+\frac{2n(r)}{\rho(\sigma)}\right| \tag{4.8}$$

obtained by adding the negatively-moded current $J_{n(r)\mu}((M+\tfrac{2n(r)}{\rho(\sigma)})<0)$ to any previous state. With $M=2m+u$, these are recognized as the same increments (3.9) obtained in the c = 52 description.
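The bookkeeping behind the relabeling (4.1) is easy to verify numerically. The sketch below assumes the mode arguments are $m+n/\rho+u/2$ in the c = 52 description and $M+2n/\rho$ with $M=2m+u$ in the c = 26 description, that the increments are 4 and 2 times the absolute values of these arguments respectively, and that the intercept in the c = 52 spectral condition is 17/8. The helper names are illustrative only.

```python
from fractions import Fraction


def c52_mode(m, u, n, rho):
    """Mode argument m + n/rho + u/2 of a hatted current."""
    return m + Fraction(n, rho) + Fraction(u, 2)


def c26_mode(m, u, n, rho):
    """Mode argument M + 2n/rho of the relabeled current, with M = 2m + u."""
    return (2 * m + u) + Fraction(2 * n, rho)


def relabeled_integers(m_values):
    """All M = 2m + u for m in m_values and u in {0, 1}."""
    return [2 * m + u for m in m_values for u in (0, 1)]


# Intercept of the c = 26 condition implied by the shift in Eq. (4.1b)
# at m = u = 0, assuming the c = 52 intercept 17/8.
INTERCEPT_C26 = 2 * Fraction(17, 8) - Fraction(13, 4)

if __name__ == "__main__":
    print(sorted(relabeled_integers(range(-3, 3))))
    print(INTERCEPT_C26)
    print(4 * abs(c52_mode(-1, 1, 1, 3)), 2 * abs(c26_mode(-1, 1, 1, 3)))
```

Each integer M is hit exactly once, the intercept comes out as 1 as in the conventional condition (4.6), and the two forms of the increment coincide because the c = 26 argument is exactly twice the c = 52 argument.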

As simple examples, consider the larger subset (3.16) of c = 52 strings, whose equivalent

c = 26 physical state condition (4.6) now involves the following subset of the c = 26 Virasoro

generators (4.2):

2 Although it is not directly relevant in either description of the c = 52 strings, one notes that the conformal weight of

the scalar twist-field state $|0\rangle_\sigma$ of sector $\sigma$ has now shifted from $\hat\Delta_0(\sigma)$ to $\hat\delta_0(\sigma)$ in the c = 26 description.


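$$\begin{aligned}
L(M)={}&\delta_{M,0}\,\hat\delta_0(\sigma)+\frac12G^{ab}_{(d)}\sum_{Q\in\mathbb{Z}}:J_{\epsilon a}(Q+\epsilon)\,J_{-\epsilon,b}(M-Q-\epsilon):_M\\
&+\frac12\sum_j\frac{1}{f_j(\sigma)}\sum_{\hat j=0}^{f_j(\sigma)-1}\sum_{Q\in\mathbb{Z}}:J_{\hat jj}\!\left(Q+\frac{2\hat j}{f_j(\sigma)}\right)J_{-\hat j,j}\!\left(M-Q-\frac{2\hat j}{f_j(\sigma)}\right):_M,
\end{aligned} \tag{4.9a}$$

$$\hat\delta_0(\sigma)=\frac14\sum_j\sum_{\hat j=0}^{f_j(\sigma)-1}\left(\frac{2\hat j}{f_j(\sigma)}-1\right)\left(2\,\theta\!\left(\frac{2\hat j}{f_j(\sigma)}\ge1\right)-\frac{2\hat j}{f_j(\sigma)}\right), \tag{4.9b}$$

$$a,b=0,\ldots,d-1,\qquad\sum_jf_j(\sigma)=26-d,\qquad4\le d\le26. \tag{4.9c}$$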

Recall for the larger subset that $\epsilon=0,1$ corresponds in the symmetric theory to the action of the extra automorphism $\omega=\pm1$ on the first $d\ge4$ labels $\{a\}$, while $f_j(\sigma)$ is the length of the $j$th cycle of the extra permutation $\omega(\text{perm})$ which acts on the remaining $26-d$ spatial labels. Shifting the dummy integer $Q$ by the integer $\epsilon$, we note that the second term in Eq. (4.9a) is a set of ordinary Virasoro generators for $d$ untwisted bosons with the ordinary current algebra

$$[J_a(Q),J_b(P)]=G^{(d)}_{ab}\,Q\,\delta_{Q+P,0} \tag{4.10}$$

for both values of $\epsilon$. The currents in the third term satisfy the twisted current algebra (4.2e) with the permutation-twisted metric (3.17), and the value of $\hat\delta_0(\sigma)$ in Eq. (4.9b) is only a slightly-rewritten form of that given in Eq. (3.16b).

We are now in a position to confirm our suspicions in the previous section about the simplest

orbifold-strings, described earlier at c = 52 by the extended Virasoro generators:

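$$\hat L_u\!\left(m+\tfrac{u}{2}\right)=\frac14G^{ab}\sum_{v=0}^{1}\sum_{p\in\mathbb{Z}}:\hat J_{\epsilon av}\!\left(p+\tfrac{v+\epsilon}{2}\right)\hat J_{-\epsilon,b,u-v}\!\left(m-p+\tfrac{u-v-\epsilon}{2}\right):_M+\cdots$$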
