
Nuclear Physics B 720 (2005) 346
Gravitational quantum corrections in warped supersymmetric brane worlds
T. Gregoire a , R. Rattazzi a,1 , C.A. Scrucca b,a , A. Strumia c ,
E. Trincherini d,e
a Physics Department, Theory Division, CERN, CH-1211 Geneva 23, Switzerland
b Institut de Physique, Université de Neuchâtel, CH-2000 Neuchâtel, Switzerland
c Dipartimento di Fisica, Università di Pisa, I-56126 Pisa, Italy
d Dipartimento di Fisica, Università di Milano Bicocca, I-20126 Milano, Italy
e Institut für Theoretische Physik, Universität Heidelberg, D-69120 Heidelberg, Germany
Received 20 December 2004; accepted 2 May 2005

Available online 25 May 2005
Abstract
We study gravitational quantum corrections in supersymmetric theories with warped extra dimensions. To this end, we develop a superfield formalism for linearized gauged supergravity. We show that
the 1-loop effective Kähler potential is a simple functional of the KK spectrum in the presence of
generic localized kinetic terms at the two branes. We also present a simple understanding of our
results by showing that the leading matter effects are equivalent to suitable displacements of the
branes. We then apply this general result to compute the gravity-mediated universal soft mass $m_0^2$ in
models where the visible and the hidden sectors are sequestered at the two branes. We find that the
contributions coming from radion mediation and brane-to-brane mediation are both negative in the
minimal set-up, but the former can become positive if the gravitational kinetic term localized at the
hidden brane has a sizable coefficient. We then compare the features of the two extreme cases of flat
and very warped geometry, and give an outlook on the building of viable models.
© 2005 Elsevier B.V. All rights reserved.
E-mail address: thomas.gregoire@cern.ch (T. Gregoire).

1 On leave from INFN, Pisa, Italy.
doi:10.1016/j.nuclphysb.2005.05.001
1. Introduction
Low energy supersymmetry is arguably the best motivated extension of the Standard
Model (SM). It solves the gauge hierarchy problem, has a natural dark matter candidate,
and in the minimal scenario predicts gauge coupling unification. However, supersymmetry
needs to be broken in order to give weak scale masses to the superpartners. In order for all
the superpartners to be heavier than the SM particles, supersymmetry is typically broken
in a hidden sector and transmitted to the SM by gravitational [1] or gauge interactions [2]
(for a review see [3]). At low energy, the breaking of supersymmetry is encoded in soft
supersymmetry breaking terms. A crucial point in designing a supersymmetry breaking
scenario is to ensure that the soft scalar masses do not generate phenomenologically unacceptable Flavor Changing Neutral Currents (FCNC). The safest way of doing this is to
generate soft masses in the IR, where the only flavor spurions are the Yukawa matrices.
This ensures that there will be a super-GIM mechanism suppressing FCNC. Gauge mediation is of this type, while ordinary gravity mediation is not, because the soft masses are
affected by divergent gravity loops dominated in the UV. Another supersymmetry breaking
transmission mechanism that is safe with respect to flavor is anomaly mediation [4,5]. In
this scenario, supersymmetry breaking is transmitted via the super-Weyl anomaly, so that it
is also dominated in the IR. However, the anomaly mediated contribution is parametrically
smaller than the ordinary gravity mediated one. A way to suppress gravity mediation is
to invoke an extra dimension of space. This allows one to spatially separate the visible sector
from the hidden sector where supersymmetry is broken. By locality, contact interactions
between the two sectors are absent at tree level and are only generated by calculable gravity
loops. The contribution to the soft parameters from these effective operators and the one
from anomaly mediation have different radius dependences. For large enough radius, anomaly mediation dominates, leading to a sharp prediction for the soft terms. Unfortunately
this sharp prediction entails tachyonic sleptons. Thus pure anomaly mediation is not viable and extra contributions should be invoked. A natural and interesting next-to-minimal
scenario is obtained when the radius is small enough for the finite gravitational loops to
compete with anomaly mediation effects [6]. In principle this can cure the tachyonic sleptons while keeping an energy gap between the scale of mediation, the inverse radius 1/R,
and the five-dimensional (5D) quantum gravity scale M5 at which presumably extra flavor
breaking effects come into play. The presence of this energy gap allows one to control the size
of flavor violation in soft terms.
In the simplest situation of a flat geometry, the effective low-energy theory is described
in terms of the visible and hidden sector chiral superfields $\Phi_i$ with $i = 0, 1$, the radion
field $T$, whose scalar component vacuum expectation value (VEV) determines the radius
$\langle T \rangle = \pi R$, and the 5D Planck scale $M_5$. The tree-level kinetic function has the form²:

$$\Omega_{\rm tree} = -3 M_5^3 \left(T + T^\dagger\right) + \Phi_0^\dagger \Phi_0 + \Phi_1^\dagger \Phi_1 . \tag{1.1}$$

The radius dependence of the non-local operators induced by gravity loops is completely
fixed by simple power counting (notice that each graviton line brings in a factor of $1/M_5^3$).

² The Kähler potential is given by $K = -3 \ln(-\Omega/3)$, in units of the 4D Planck mass. Throughout this paper,
with an abuse of nomenclature, we will often refer to $\Omega$ as the Kähler potential.

The operators that are relevant for soft scalar masses are easily seen to have the form

$$\Delta\Omega_{\text{1-loop}} \sim a\,\frac{\Phi_0^\dagger \Phi_0}{M_5^3 (T + T^\dagger)^3} + a\,\frac{\Phi_1^\dagger \Phi_1}{M_5^3 (T + T^\dagger)^3} + b\,\frac{\Phi_0^\dagger \Phi_0\, \Phi_1^\dagger \Phi_1}{M_5^6 (T + T^\dagger)^4} . \tag{1.2}$$
When supersymmetry is broken, the radion and the hidden sector chiral multiplets get in
general non-zero $F$-terms, and the visible sector scalar fields get soft masses respectively
from one of the first two terms and from the third term in (1.2). The former gives the so-called radion-mediated contribution, and its coefficient $a$ was calculated for the first time in
Ref. [7]. The latter yields the brane-to-brane mediated contribution and its coefficient b was
recently calculated in Refs. [8,9]. Both contributions turn out to be negative in the simplest
situation. However in Ref. [8] the soft masses were computed also in a more general case,
where the supergravity fields have non-vanishing localized kinetic terms. It was shown, in
particular, that a kinetic term with a large coefficient at the hidden brane changes the sign
of the radion-mediated contribution but not its size, whereas it does not change the sign
of the brane-to-brane-mediated contribution but suppresses its size. It was then shown that
in this situation a viable model of supersymmetry breaking with competing effects from
gravity and anomaly mediation can be achieved.
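As a cross-check of the power counting behind Eq. (1.2), the following minimal Python sketch fixes the power of $(T + T^\dagger)$ purely from mass dimensions. It assumes natural units in which $M_5$ has dimension 1, $T$ has dimension $-1$, each pair $\Phi_i^\dagger \Phi_i$ has dimension 2 and the kinetic function has dimension 2, with one factor of $1/M_5^3$ per graviton line. The function name `radius_power` is illustrative and not taken from the paper.

```python
# Dimensional bookkeeping for the non-local operators of Eq. (1.2).
# Assumed mass dimensions (natural units): [M5] = 1, [T] = -1,
# [Phi^dagger Phi] = 2, and [Omega] = 2 for the kinetic function.
OMEGA_DIM = 2
MATTER_PAIR_DIM = 2
GRAVITON_LINE_DIM = -3  # each graviton line contributes 1/M5^3


def radius_power(n_graviton_lines, n_matter_pairs):
    """Return p such that the operator scales as 1/(T + T^dagger)^p.

    The matter pairs contribute +2 each, the graviton lines -3 each,
    and 1/(T + T^dagger)^p contributes +p because T has dimension -1.
    Requiring a total dimension of 2 fixes p.
    """
    fixed = n_matter_pairs * MATTER_PAIR_DIM + n_graviton_lines * GRAVITON_LINE_DIM
    return OMEGA_DIM - fixed


radion_mediated = radius_power(n_graviton_lines=1, n_matter_pairs=1)
brane_to_brane = radius_power(n_graviton_lines=2, n_matter_pairs=2)
print(radion_mediated, brane_to_brane)  # 3 4
```

The two outputs reproduce the powers $(T + T^\dagger)^{-3}$ and $(T + T^\dagger)^{-4}$ of the radion-mediated and brane-to-brane operators, under the stated dimension assignments.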
The main qualitative effect of a large kinetic term on the hidden brane is to shift the
spectrum of the Kaluza–Klein (KK) modes and localize their wave functions away from
the hidden brane. This is somewhat similar to what happens in a warped geometry like the
RS1 set-up of Ref. [10]. It is then conceivable that a warping of the geometry could lead to
an acceptable pattern of gravity-mediated soft terms, representing perhaps a more natural
and appealing substitute for the localized kinetic terms invoked in Ref. [8]. The aim of this
paper is to generalize the analysis of Refs. [8,9] to warped geometries by computing the
full effective Kähler potential in a supersymmetric version of the RS1 set-up. In order to
investigate the quantitative relation between the effects of warping and localized kinetic
terms, we shall moreover allow for arbitrary localized kinetic terms at the two branes in
the warped case as well.
In the regime where the warping is significant, the expected form of the corrections can
be deduced from the AdS/CFT correspondence [11]. According to this correspondence, all
the physics of a RS1 model is equivalent to that of a 4D conformal field theory in which
the conformal symmetry is non-linearly realized in the IR and explicitly broken in the
UV, and in which 4D gravity is gauged [12–17]. The coordinates that are most suitable for
studying the holographic interpretation of the RS model are the ones where the metric is
written as:

$$ds^2 = \frac{L^2}{z^2}\left(dx_\mu dx^\mu + dz^2\right), \tag{1.3}$$

where $L$ is the AdS radius length and where boundaries at $z = z_0$ and $z = z_1$ with $z_0 \ll z_1$
are assumed. From the CFT point of view, the $z$ coordinate corresponds to a renormalization scale. In this respect the boundaries at $z_0$ and $z_1$ are named respectively the UV brane
and the IR brane. The presence of the UV brane corresponds to cutting off the CFT in the
UV at the energy scale $1/z_0$ and to gauging 4D gravity. Indeed the graviton zero mode is
localized at the UV brane, and the effective 4D Planck mass is $M^2 = (M_5^3 L^3)/z_0^2$. Notice
that when $z_0 \to 0$ the Planck mass diverges and 4D gravity decouples. By a change of $x$,
$z$ coordinates we can however always work with $z_0 = L$, in which case we get the familiar RS parametrization $M^2 = M_5^3 L$. The presence of the IR brane at the position $z = z_1$
corresponds instead to a spontaneous breaking of conformal invariance in the IR at the
energy 1/z1 . The radion field of the 5D theory basically corresponds to fluctuations in the
position z1 of the IR brane and can be interpreted as the Goldstone boson of spontaneously
broken dilatation invariance. Matter fields on the UV brane are identified with elementary
fields coupled to the CFT through gravity and higher-dimensional operators suppressed by
powers of the UV cut-off 1/z0 . On the other hand, compatibly with the interpretation of z
as an RG scale, matter fields living at z1 are interpreted as bound states with compositeness scale of order 1/z1 . In the limit where the position z0 of the Planck brane is sent to
0, the theory becomes conformal, implying, among other things, that the couplings of the
radion are dictated by conformal invariance. It then follows that for $z_0 \to 0$ the Kähler potential, including quantum corrections, must have its tree-level form. Any correction with
a different dependence would explicitly break conformal invariance. Moving the Planck
brane in, however, makes the 4D graviton dynamical and breaks conformal symmetry explicitly. Due to this, non-trivial corrections to the Kähler potential are induced by graviton
loops attached to the CFT. These loops are cut off at the KK scale $1/z_1$, playing the role
of the scale of compositeness. The induced effects are therefore suppressed by powers of
$1/(z_1 M)^2$.
To write the low energy effective theory of the supersymmetric RS model it is convenient to parameterize the radion with a superfield $\mu$ whose scalar component VEV is
precisely the position of the IR brane: $\langle \mu \rangle = 1/z_1$. The effective kinetic function can then
be written as

$$\Omega_{\rm tree} = -3 M_5^3 L^3 \left(\frac{1}{L^2} - \mu \mu^\dagger\right) + \Phi_0^\dagger \Phi_0 + \Phi_1^\dagger \Phi_1\, \mu \mu^\dagger L^2 . \tag{1.4}$$
Determining the form of the leading terms in the 1-loop action is more difficult than in the
flat case, since there are now two length scales L and z1 instead of just one (T ). Simple
power counting must therefore be supplemented by additional considerations. The use of
the holographic picture provides a very direct power-counting insight. Apart from trivial
UV divergent effects of the same form as the tree-level action, the calculable effects must
be related to the explicit breakdown of conformal invariance due to the propagating 4D
graviton. These corrections can be represented diagrammatically as in Fig. 1. Keeping in
mind that the quantum corrections are saturated by the only physical IR scale, i.e., the
compositeness scale $\mu$, it is easy to power-count these effects to obtain

$$\Delta\Omega_{\text{1-loop}} \sim a_0\,\frac{(\mu \mu^\dagger)^2}{k^2 M^2}\,\Phi_0^\dagger \Phi_0 + a_1\,\frac{(\mu \mu^\dagger)^2}{k^2 M^2}\,\Phi_1^\dagger \Phi_1 + b\,\frac{(\mu \mu^\dagger)^2}{k^2 M^4}\,\Phi_0^\dagger \Phi_0\, \Phi_1^\dagger \Phi_1 , \tag{1.5}$$

where $k = 1/L$. For instance the first term is obtained by the second diagram in Fig. 1 as
the product of the following factors

$$N^2 \times \left(\frac{1}{M^2}\right)^2 \times \Phi_0^\dagger \Phi_0 \times (\mu \mu^\dagger)^2 , \tag{1.6}$$

where the first factor $N^2 \sim (M_5 L)^3$ counts the numerical coefficient in front of the CFT
action [12–17], the second and third factors are obvious, while the fourth factor is just
dictated by dimensional analysis once $\mu$ is recognized as the only IR scale.

Fig. 1. The two kinds of diagrams that are responsible for the effective operators that we are interested in from
the CFT point of view.

Eq. (1.5) represents the leading effect due to 4D graviton loops. However the UV brane introduces also an
explicit UV cut-off equal to $1/z_0 \sim 1/L$, so that we expect further corrections to Eq. (1.5)
suppressed by powers of $(\mu L)^2$. These terms are unimportant for $\mu L \ll 1$, i.e., for large
warping. However, when the warping gets small these higher powers become crucial to
reproduce the flat case result. Finally, upon supersymmetry breaking, there will again be
a radion-mediated and a brane-to-brane-mediated contribution to the soft masses, and it is
therefore important to compute the coefficients $a_{0,1}$ and $b$, and in particular their signs.
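The way the factors in Eq. (1.6) combine into the $1/(k^2 M^2)$ prefactor of Eq. (1.5) can be verified numerically. The sketch below assumes the RS parametrization $M^2 = M_5^3 L$ (that is, $z_0 = L$), takes $N^2 = (M_5 L)^3$ with all order-one factors dropped, and uses arbitrary illustrative values for $M_5$ and $L$; the function names are hypothetical.

```python
import math


def planck_mass_sq(m5, L, z0=None):
    """4D Planck mass squared, M^2 = M5^3 L^3 / z0^2 (z0 = L by default)."""
    z0 = L if z0 is None else z0
    return m5**3 * L**3 / z0**2


def radion_prefactor(m5, L):
    """N^2 * (1/M^2)^2 with N^2 taken as (M5 L)^3, O(1) factors dropped."""
    n_sq = (m5 * L) ** 3
    return n_sq / planck_mass_sq(m5, L) ** 2


m5, L = 2.0, 5.0  # arbitrary illustrative values, not from the paper
k = 1.0 / L
lhs = radion_prefactor(m5, L)
rhs = 1.0 / (k**2 * planck_mass_sq(m5, L))
print(math.isclose(lhs, rhs))  # True
```

The agreement only reflects the identity $(M_5 L)^3 = M^2/k^2$ that holds for $z_0 = L$; it says nothing about the values or signs of $a_{0,1}$ and $b$, which require the explicit computation.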
The results of the explicit computation we shall present in the next sections show that
in the absence of localized kinetic terms, the coefficients $a_{0,1}$ and $b$ are positive and the
corresponding soft masses squared are negative. This means that a significant warping of
the geometry, while qualitatively similar to having large localized kinetic terms in flat space,
is quantitatively different from it. In the presence of a single sizable localized kinetic
term on one of the two branes, which is then identified with the hidden sector, the situation
is found to change as follows. If the hidden sector is at the UV brane, the effect of the
localized kinetic term amounts basically to a rescaling of the 4D Planck mass, and the
coefficients $a_1$ and $b$ that are relevant for the radion-mediated and brane-to-brane-mediated
effects both get suppressed without changing sign. If instead the hidden sector is at the IR
brane, a non-vanishing localized kinetic term tends to change the sign of the coefficient
$a_0$ relevant for radion mediation, and to suppress the size of the coefficient $b$ relevant for
brane-to-brane mediation while preserving its sign. This last case is therefore potentially
as viable as the flat case. However, there is a limitation on the size of the coefficient of the
localized kinetic term on the IR brane: if it is too large then the radion becomes a ghost
[18]. In order to understand whether and to what extent a warping of the geometry can be
a helpful and appealing supplement to localized kinetic terms, it is therefore necessary to
perform a detailed analysis at finite warping and localized kinetic terms.
There are various techniques that can be used to perform the calculation of the gravitational loop effects discussed above in the general case of finite warping. One possible
approach is the off-shell component formalism of Zucker [19–21], which was used in the
flat case in Ref. [8]. This approach is somewhat inconvenient in the warped case for two
reasons. First, the formalism is plagued with singular products of distributions that are
hard to deal with in warped space. Second, the trick used by Ref. [8] of considering a
background with a non-zero VEV for the $F$-term of the radion multiplet and calculating
the induced potential instead of the correction to the Kähler potential, is also hard to generalize to the warped case. Another possible approach is the superfield formalism employed
in Ref. [9], where only half of the bulk supersymmetry is manifest. This turns out to be
easier to generalize to the warped case, and we shall therefore use it as the basic framework to set up the computation. In the end, however, it turns out that all the information
that is needed to derive the result is effectively contained in the spectrum of a single massless scalar with arbitrary localized kinetic terms. In fact, the calculation of the full 1-loop
correction to the Kähler potential is quite analogous to that of the effective potential for
this simple system.
The paper is organized as follows. In Section 2 we describe more precisely the context
of our computation, namely the warped supersymmetric brane worlds based on the RS1
geometry. In Section 3 we generalize the linearized superfield formalism of Ref. [9] to the
warped case, apply it to set up the supergraph computation of the gravitational loop effects
that we want to compute, and study the structure of the latter to show that it effectively
maps to a computation within a simple free theory of a real scalar field with localized kinetic terms. In Section 4 we concretely perform the computation of the loop effects and
discuss the results and their implications for the gravity-mediated soft mass terms. In Section 5 we present a simple argument that relates the form of the matter-dependent
effects to the matter-independent one. In Section 6 we discuss the application of our results
to model building. Finally, Appendix A contains some useful technical details concerning
the linearized superfield description of warped models. The readers who are not interested
in the details concerning a rigorous set up of the computation within the superfield approach can skip Section 3 and Appendix A and take Section 4 as a starting point for the
computation.
2. Warped supersymmetric brane worlds

In this section, we shall describe more precisely the context of our computation and
introduce the basic notation we will use throughout the paper. Our starting point is the
supersymmetric version of the 5D RS1 geometry [21–28]. The theory is constructed by
performing a gauging of 5D supergravity compactified on the orbifold $S^1/Z_2$. Denoting by
$y$ the internal coordinate and by $y_0 = 0$ and $y_1 = \pi R$ the two fixed-points, the Lagrangian
takes the form:

$$\mathcal{L} = \mathcal{L}_5 + \delta_0(y)\,\mathcal{L}_0 + \delta_1(y)\,\mathcal{L}_1 , \tag{2.1}$$

where $\delta_i(y) = \delta(y - y_i)$ and³

³ There are two formulations of the supergravity extension of the RS model (presented respectively in [22–24]
and [21,25]) that are equivalent but differ by a singular gauge transformation [26–28]. This singular transformation makes the second formulation hard to deal with, and we therefore use the first.

1
3i
g5 5 M53 R5 + i M MRN DR k(y) MN N

2
2

1 2
+ FMN + ,
2

1
Li = g4 i Mi2 R4 + i D + | i |2 + i i D i + .
2
L5 =
In this expression $M_5$ is the 5D fundamental scale, $R$ is the radius of the compact extra
dimension, $k$ is a curvature scale and $M_{0,1}$ are two scales parametrizing possible localized kinetic terms for the bulk fields, whereas $\Lambda_5$ and $\Lambda_{0,1}$ are a bulk and two boundary
cosmological constants that are tuned to the following values:

$$\Lambda_5 = -6 M_5^3 k^2 , \qquad \Lambda_0 = -\Lambda_1 = 6 M_5^3 k . \tag{2.2}$$
We assume as usual that the compactification scale 1/R and the curvature k are much
smaller than the fundamental scale M5 , so that we can reliably use the above 5D supergravity theory as an effective description of the physics of the model and neglect
higher-dimensional operators.
The above theory has a non-trivial supersymmetric warped solution defining a slice of
an AdS$_5$ space that is delimited by the two branes at $y = y_{0,1}$. Defining $\sigma(y) = k|y|$, the
background is given by

$$g_{MN} = e^{-2\sigma}\,\eta_{\mu\nu}\,\delta^\mu_M \delta^\nu_N + \delta^y_M \delta^y_N , \qquad \Psi_M = 0 , \qquad A_M = 0 . \tag{2.3}$$
Notice that the 4D geometry is flat at every point y of the internal dimension, but the
conformal scale of the metric varies exponentially with it. This gives rise to a dependence
of physical effective energy scales on y.
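The relation between this background and the conformal coordinates of Eq. (1.3) can be made explicit with a short change of variables. The sketch below assumes $k = 1/L$, restricts to $0 \le y \le y_1$ so that $\sigma(y) = ky$, and uses the map $z = L e^{ky}$, which is a choice made here for illustration rather than a formula quoted from the text:

```latex
\begin{align*}
z &= L\,e^{ky}
  \quad\Longrightarrow\quad
  dz = k\,z\,dy = \frac{z}{L}\,dy , \\
ds^2 &= \frac{L^2}{z^2}\left(dx_\mu dx^\mu + dz^2\right)
      = e^{-2ky}\,dx_\mu dx^\mu + dy^2 , \\
z_0 &= L \quad (y = y_0 = 0) , \qquad
z_1 = L\,e^{k y_1} .
\end{align*}
```

In these variables the ratio of the IR scale $1/z_1$ to the UV scale $1/z_0$ is $e^{-k y_1}$, which is the exponential dependence of effective energy scales on $y$ noted above.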
2.1. Effective theory
At energies much below the compactification scale, the fluctuations around the above
background solution are described by a 4D supergravity with vanishing cosmological constant. The fields of this effective theory are the massless zero modes of the 5D theory and
fill out a supergravity multiplet $G = (h_{\mu\nu}, \psi_\mu)$ and a radion chiral multiplet $T = (t, \psi_t)$.
They are defined by parametrizing the fluctuations around the background as follows:

Re t
Re t 2
( + h ),
gy = 0,
gyy =
,
g = exp 2
R
R
3
2
y =
A = 0,
Ay = Im t.
= ,
(2.4)
t ,
2
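Substituting Eqs. (2.4) into Eq. (2.1) and integrating over the internal dimension, one finds
the following effective Kähler potential [29]:

$$\Omega\left(T + T^\dagger, \Phi_i, \Phi_i^\dagger\right) = -3\,\frac{M_5^3}{k}\left(1 - e^{-k(T + T^\dagger)}\right) + \Omega_0\left(\Phi_0, \Phi_0^\dagger\right) + \Omega_1\left(\Phi_1, \Phi_1^\dagger\right) e^{-k(T + T^\dagger)} . \tag{2.5}$$

The first term is the matter-independent contribution from the bulk, whereas the last two
terms are the matter-dependent contributions from the branes:

$$\Omega_i\left(\Phi_i, \Phi_i^\dagger\right) = -3 M_i^2 + \Phi_i^\dagger \Phi_i + \cdots . \tag{2.6}$$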
The dots denote irrelevant operators involving higher powers of $\Phi_i^\dagger \Phi_i$. Note that there is
a limit on the possible size of $M_1$, because if $M_1^2 > M_5^3/k$, the radion, parametrized by
$\exp(-kT)$, becomes a ghost [18]. Superpotentials $W_i$ localized on the branes give rise to
the following effective superpotential at low energy:

$$W(T, \Phi_i) = W_0(\Phi_0) + W_1(\Phi_1)\, e^{-kT} . \tag{2.7}$$
Finally, the effective Planck scale can be read off from (2.5) after substituting $T$
with its VEV $\pi R$. One finds, assuming vanishing matter VEVs,

$$M^2 = \frac{M_5^3}{k}\left(1 - e^{-2k\pi R}\right) + M_0^2 + M_1^2\, e^{-2k\pi R} . \tag{2.8}$$

Assume now that the field $\Phi_0$ represents collectively the fields of the visible sector,
while $T$ and $\Phi_1$ represent the hidden sector. It is evident that the above tree level action
does not induce any soft terms even after supersymmetry is broken. Notice that the result
would be the same in the reversed situation where $\Phi_1$ represents the visible sector. This
is because the radion couples to $\Phi_1$ as a conformal compensator: tree level effects can be
easily seen to cancel through a field redefinition $\Phi_1' = \Phi_1 e^{-kT}$. Therefore the above tree
Lagrangian realizes the sequestering of the hidden sector. Soft terms can only be induced
by calculable quantum corrections.
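To get a feeling for Eq. (2.8), the following Python sketch evaluates the effective Planck mass and the ghost condition on $M_1$ quoted above. It assumes that the radion VEV is $\pi R$, so that the warp factor is $e^{-2k\pi R}$, and works in illustrative units with $M_5 = 1$; the function names and the numerical values are hypothetical and not taken from the paper.

```python
import math


def planck_mass_sq(m5, k, radius, m0_sq=0.0, m1_sq=0.0):
    """Eq. (2.8), assuming the radion VEV is pi * radius."""
    x = 2.0 * k * math.pi * radius
    # (M5^3 / k) * (1 - e^{-x}); expm1 keeps the small-k limit accurate.
    bulk = -math.expm1(-x) * m5**3 / k
    return bulk + m0_sq + m1_sq * math.exp(-x)


def radion_is_ghost(m5, k, m1_sq):
    """Ghost condition quoted in the text: M1^2 > M5^3 / k."""
    return m1_sq > m5**3 / k


# Illustrative numbers in units where M5 = 1 (not taken from the paper).
m5, radius = 1.0, 1.0
strongly_warped = planck_mass_sq(m5, k=3.0, radius=radius)
nearly_flat = planck_mass_sq(m5, k=1e-8, radius=radius)
print(strongly_warped)                      # close to M5^3 / k = 1/3
print(nearly_flat, 2.0 * math.pi * radius)  # both close to 2 pi R M5^3
print(radion_is_ghost(m5, k=3.0, m1_sq=0.5))
```

For strong warping the bulk contribution saturates at $M_5^3/k$, while for $k \to 0$ it tends to $2\pi R M_5^3$, which is the coefficient obtained from the flat-space kinetic function (1.1) under the same assumption on the radion VEV.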
2.2. Loop corrections

The corrections to the Kähler potential of the effective theory that are induced by loops
of bulk modes come in two different classes. The first class represents a trivial renormalization of the local operators corresponding to the classical expression (2.5). As we just
explained, terms of this form do not mediate soft masses: although UV divergent (or better, incalculable), this correction is uninteresting. A second class corresponds instead to
new effects that have a field dependence different from the one implied by locality and
general covariance in (2.5). By definition these effects are genuinely non-local and, therefore, finite and calculable. These corrections can be parametrized in the following general
form:

$$\Delta\Omega\left(T + T^\dagger, \Phi_i, \Phi_i^\dagger\right) = \sum_{n_0, n_1 = 0}^{\infty} C_{n_0, n_1}\left(T + T^\dagger\right) \left(\Phi_0^\dagger \Phi_0\right)^{n_0} \left(\Phi_1^\dagger \Phi_1\right)^{n_1} . \tag{2.9}$$

The functions $C_{n_0, n_1}$ control the leading effects allowing the transmission of supersymmetry breaking from one sector to the other, and can be computed along the same lines as for
the flat case, which was studied in Refs. [8,9].
Unfortunately the trick that was used in Ref. [8], namely computing the effective potential at $F_T \neq 0$ and deducing from it the form of Eq. (2.9), cannot be generalized in
a straightforward way to the warped case. In the flat case, a consistent tree-level solution with $F_T \neq 0$ and flat 4D (and 5D) geometry could be found by simply turning on
constant superpotentials $W_0$ and $W_1$ at the boundaries. Since the interesting terms in the
Kähler potential are associated with terms quadratic in $F_T$ in the effective potential, and since
$F_T \sim W_0 + W_1$, it was enough to work with infinitesimal $W_{0,1}$ and calculate the effective
potential at quadratic order in $W_{0,1}$. Notice, by the way, that boundary superpotentials are
the only local, zero-derivative deformation that is available in 5D Poincaré supergravity. The situation is drastically modified in 5D AdS supergravity. In this case, by turning
on boundary superpotentials we have that [30–32]: (1) the radius is stabilized, (2) the 4D
metric of the 4D slices (and of the low energy effective theory) becomes AdS$_4$, (3) there
is a (compact) degeneracy of vacua associated to the VEV of the graviphoton component
A5 . At all points the scale of supersymmetry breaking is subdominant to the scale of AdS4
curvature and at a special point supersymmetry is restored. The last property can be quantified by the deviation m3/2 of the gravitino mass from its supersymmetric value, which
is equal to the curvature 1/L4 in AdS4 . One obtains m3/2 L4 2 , where is the AdS5
warp factor. In principle one could go ahead and calculate corrections to the effective potential in this background and read back from it the effective Kähler potential. However,
the 4D curvature, as we said, cannot be treated as a subleading effect and this complicates
both the calculation and the indirect extraction of the Kähler potential.
Rather than trying to overcome this difficulty, we will generalize to the warped case
the linearized superfield approach that was used in [9] and in which the Kähler potential
is calculated directly. This generalization is interesting on its own, and we will present it
in detail in the next section. However, it turns out that the result that it produces for the
correction to the effective Kähler potential is a very intuitive and obvious generalization
of those derived in [8] and [9] for the flat case: one has just to replace the flat space propagators with the corresponding warped space ones. We shall prove this general result in
a rigorous way with superfield techniques in the next section. The actual computation is
postponed to a subsequent section.
3. Superspace description
A convenient way of performing loop calculations in supersymmetric theories is to use
supergraph techniques. By calculating loop diagrams directly in terms of superfields, the
number of graphs is greatly reduced, and various cancellations between graphs that are ensured by supersymmetry are automatic. Unfortunately, the use of this technique
in theories with more than four spacetime dimensions is not straightforward, because the
amount of supersymmetry is higher than in four dimensions. From a 4D perspective, there
are in this case several supercharges, and the simultaneous realization of the associated
symmetries requires a superspace with a more complex structure. However, it is still possible to manifestly realize one of the 4D supersymmetries on a standard superspace, at the
expense of losing manifest higher-dimensional Lorentz invariance [33–36]. In this way,
only a minimal N = 1 subgroup of the extended higher-dimensional supersymmetry will
be manifest, but this turns out to be enough for our purposes.
In fact, writing higher-dimensional supersymmetric theories in terms of $N = 1$ superfields not only simplifies loop computations, but also makes it easier to write down
supersymmetric couplings between bulk and brane fields. As explained in Ref. [37], the
way of doing this is to group the higher-dimensional supermultiplets into subsets that
transform under an N = 1 subgroup of the full higher-dimensional supersymmetry. Brane
couplings can then be written using the known 4D N = 1 supersymmetric couplings. Splitting higher-dimensional multiplets in different N = 1 superfields does just that, without
having to look explicitly at the higher-dimensional supersymmetry algebra. Also, the couplings in component form often involve ambiguous products of functions that arise when
auxiliary fields are integrated out. Using superfields this problem is avoided, because auxiliary fields are never integrated out. The drawback of the superfield approach is that, since
higher-dimensional Lorentz invariance and the full supersymmetry are not manifest, one
has to more or less guess the Lagrangian, and check that it reproduces the correct higher-dimensional Lagrangian in components.
This technique has been used for studying linearized 5D supergravity in flat space in
Ref. [36], and the resulting formalism has been successfully applied in Ref. [9] to compute
gravitational quantum corrections in orbifold models, with results that agree with those
derived in Ref. [8] by studying a particular component of the corresponding superspace
effective operators. In the following, we will briefly review the approach of Refs. [9,36]
for the case of a flat extra dimension, and then generalize it to the case of a warped extra
dimension.
3.1. Bulk Lagrangian for a flat space
The propagating fields of 5D supergravity consist of the graviton $h_{MN}$, the graviphoton $B_M$ and the gravitino $\Psi_M$, which can be decomposed into two two-component Weyl
spinors $\psi_M^+$ and $\psi_M^-$. These fields can be embedded into a real superfield $V_m$, a complex
general superfield $\Psi_\alpha$, and two chiral superfields $T$ and $\Sigma$, according to the following
schematic structure:
+
+ ,
Vm = n (hmn mn h) + 2 m
+ 2 y+ + ,
= m (Bm + ihmy ) + m m
= hyy + iBy + y
= s + .
+ ,
(3.1)
(3.2)
(3.3)
(3.4)
The dots denote higher-order terms involving additional fields, which are either genuine
auxiliary fields or fields that are a priori not but eventually turn out to be non-propagating
[36]. We also need to introduce a real superfield $P$ acting as a prepotential for the chiral
conformal compensator $\Sigma$: $\Sigma = -\frac{1}{4} \bar{D}^2 P$. This introduces yet more non-propagating
fields.
Using the above fields, it is possible to construct in an unambiguous way a linearized
theory that is invariant under infinitesimal transformations of all the local symmetries
characterizing a 5D supergravity theory on an interval. These linearized gauge transformations consist of the usual 4D superdiffeomorphisms, which are parametrized by a general
complex superfield $L_\alpha$, and the additional transformations completing these to 5D superdiffeomorphisms, which are parametrized by a chiral multiplet $\Omega$. The corresponding
linearized gauge transformations of the superfields introduced above are given by
1
1
Vm = m (D L D L ),
= y L D ,
2
4
P = D L + h.c,
T = y .
(3.5)
As usual, $L_\alpha$ also contains conformal transformations that extend the 4D super-Poincaré
group to the full 4D superconformal group, but these extra symmetries are fixed by gauging
away the compensator multiplet $\Sigma$.
The Lagrangian for linearized 5D supergravity in flat space can be constructed by writing the most general Lagrangian that is invariant under the above linearized gauge transformations. This fixes the Lagrangian up to one unknown constant that can be determined
by imposing that the component form be invariant under 5D Lorentz transformations. The
result is

1
1
2i
L = M53 d 4 V m Kmn V n +
m Vm
2
3
3

2

1
1
2
y V (D D ) + y P D + D
2
4

1
T + 2im V m + h.c. ,
(3.6)
2
where
1
1
Kmn = nm D D 2 D + m
n [D , D ][D , D ] + 2m n .
4
24
(3.7)
The first line of (3.6) has the same form as the usual 4D linearized supergravity Lagrangian.
To obtain the component Lagrangian, one chooses a suitable Wess–Zumino type of gauge,
and eliminates all the auxiliary fields. By doing so, one correctly reproduces the linearized
Lagrangian of 5D supergravity, with in addition some extra fields that do not propagate but
have the dimensionality of propagating fields [36]. The above construction can be generalized to the $S^1/Z_2$ orbifold in a straightforward way, by assigning a definite $Z_2$ parity to
each multiplet: $V_m$, $\Sigma$ and $T$ are even, whereas $\Psi_\alpha$ is odd.
The 5D Lagrangian (3.6) can be written in a physically more transparent form by using
a complete set of projectors that defines the different orthogonal components of the real
superfield $V_m$ with superspin 0, 1/2, 1 and 3/2. The projectors are given by [38–40]:
0mn = Lmn PC ,
1 1 m n
1
mn
[D , D ][D , D ] + Lmn PT + 0mn ,
1/2
=
48
3
1mn = Tmn PC ,
1 1 m n
2
mn
[D , D ][D , D ] + mn PT Lmn + 0mn ,
=
3/2
48
3
(3.8)
(3.9)
(3.10)
(3.11)
in terms of the transverse and chiral projectors on vector superfields, which are given by
1 D D 2 D
1 D 2 D 2 + D 2 D 2
,
PC =
,
8

16

and the transverse and longitudinal projectors acting on vector indices, given by
PT =
m n
m n
,
Lmn =
.

The kinetic operator (3.7) can then be written as

2 mn
mn
mn
K = 2 3/2 0 .
3
Tmn = mn
(3.12)
(3.13)
(3.14)
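The defining properties of the vector-index projectors in Eq. (3.13) can be checked numerically in momentum space, where $\partial_m \partial_n / \Box$ becomes $p_m p_n / p^2$. The sketch below assumes the standard forms $L_{mn} = \partial_m \partial_n / \Box$ and $T_{mn} = \eta_{mn} - L_{mn}$, a mostly-minus Minkowski metric (the signature does not affect the check) and an arbitrary off-shell test momentum; all function names are illustrative.

```python
# Diagonal of the Minkowski metric (assumed mostly-minus signature).
ETA = [1.0, -1.0, -1.0, -1.0]


def longitudinal(p):
    """Mixed-index projector L^m_n = p^m p_n / p^2 for off-shell p (p^2 != 0)."""
    p_lower = [ETA[n] * p[n] for n in range(4)]
    p_sq = sum(p[n] * p_lower[n] for n in range(4))
    return [[p[m] * p_lower[n] / p_sq for n in range(4)] for m in range(4)]


def transverse(p):
    """Mixed-index projector T^m_n = delta^m_n - L^m_n."""
    lon = longitudinal(p)
    return [[(1.0 if m == n else 0.0) - lon[m][n] for n in range(4)] for m in range(4)]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def close(a, b, tol=1e-12):
    """Entry-wise comparison of two 4x4 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


p = [2.0, 0.3, -0.5, 1.1]  # arbitrary test momentum with p^2 != 0
lon, tra = longitudinal(p), transverse(p)
zero = [[0.0] * 4 for _ in range(4)]
print(close(matmul(lon, lon), lon))   # idempotency of L
print(close(matmul(tra, tra), tra))   # idempotency of T
print(close(matmul(tra, lon), zero))  # orthogonality of T and L
```

Idempotency and mutual orthogonality of $T$ and $L$, together with the analogous properties of the superspace projectors $P_T$ and $P_C$, are what make the superspin decomposition of $V_m$ into orthogonal components possible.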
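Using the above complete set of superspin projectors, we can split the field $V_m$ into four
orthogonal parts $V_0$, $V_{1/2}$, $V_1$ and $V_{3/2}$. The first three transform non-trivially under local
super-diffeomorphisms, but not the last one, which is invariant. The Lagrangian (3.6) can
then be equivalently rewritten as
\[
\begin{aligned}
\mathcal{L} = M_5^3 \int d^4\theta\, \Big\{ & - V_{3/2}^m \big( \Box + \partial_y^2 \big) V_{3/2\, m}
 - \frac23 \Big[ \partial_m V_0^m - \frac{i}{2} (\Sigma - \bar\Sigma) \Big]^2 \\
& - \frac12 \Big[ \partial_y \big( V_0 + V_{1/2} + V_1 \big)_{\alpha\dot\alpha} - (\bar D_{\dot\alpha} \Psi_\alpha - D_\alpha \bar\Psi_{\dot\alpha}) \Big]^2 \\
& + \frac14 \Big[ \partial_y P_\Sigma - (D^\alpha \Psi_\alpha + \bar D_{\dot\alpha} \bar\Psi^{\dot\alpha}) \Big]^2
 + i\, (T - \bar T) \Big[ \partial_m V_0^m - \frac{i}{2} (\Sigma - \bar\Sigma) \Big] \Big\} .
\end{aligned}
\tag{3.15}
\]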
We can see very clearly in this language why the compensator is needed. The kinetic Lagrangian
for the gauge-invariant component $V_{3/2}^m$ is non-local, due to the singular form of
$\Pi_{3/2}^{mn}$. This non-local part is cancelled by a similar non-local part coming from the kinetic
Lagrangian of the gauge-variant component $V_0^m$. The non-trivial variation under gauge
transformations of this term is then compensated by that of $\Sigma$. Therefore the kinetic term
of linearized 4D supergravity, the first line of Eq. (3.15), decomposes as the sum of two
invariant terms respectively of maximal (3/2) and minimal (0) superspin. This is fully
analogous to the situation in ordinary Einstein gravity, where the linearized kinetic term
decomposes as the sum of spin 2 and spin 0 components. Notice also that, like in Einstein
gravity, $V_{3/2}^m$ being the only component of maximal superspin, it cannot mix with any other
component.
We can verify that the correct 4D N = 1 effective Lagrangian is obtained for the zero
modes, by taking the even fields to depend only on the four-dimensional coordinates and
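integrating over the extra dimension with radius $R$. To be precise, we use a hat to distinguish the 4D zero mode of each field from the corresponding 5D field itself. The result is:
\[
\mathcal{L}_{\rm eff} = 2\pi R\, M_5^3 \int d^4\theta\, \Big\{ - \hat V_{3/2}^m\, \Box\, \hat V_{3/2\, m}
 - \frac23 \Big[ \partial_m \hat V_0^m - \frac{i}{2} (\hat\Sigma - \bar{\hat\Sigma}) \Big]^2
 + i\, (\hat T - \bar{\hat T}) \Big[ \partial_m \hat V_0^m - \frac{i}{2} (\hat\Sigma - \bar{\hat\Sigma}) \Big] \Big\} .
\tag{3.16}
\]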
It can be verified (trivially for the $V_m$-independent terms) that this is indeed the quadratic
expansion of the 4D supergravity Lagrangian, which can be written in terms of the 4D
conformal compensator $\phi = \exp(\hat\Sigma/3)$ and the full 4D radion field $T = \pi R\, (1 + \hat T)$ as
\[ \mathcal{L}_{\rm eff} = - 3 M_5^3 \int d^4\theta\, \bar\phi \phi\, (T + \bar T) . \tag{3.17} \]
In this expression, the $d^4\theta$ integration is in fact an abbreviated notation for taking the D-term in a covariant manner. In particular, factors of the metric should be included. This
result agrees with what was found in [41].
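The quadratic expansion referred to above can be made explicit for the $V_m$-independent terms. The following sketch keeps only terms that survive the $d^4\theta$ integration (products of two chiral or of two antichiral superfields are dropped) and treats $\phi = \exp(\hat\Sigma/3)$ and $T = \pi R\,(1+\hat T)$ as the only inputs.

```latex
\begin{align*}
-3M_5^3\,\bar\phi\phi\,(T+\bar T)
 &= -3\pi R\, M_5^3\; e^{(\hat\Sigma+\bar{\hat\Sigma})/3}\,\big(2+\hat T+\bar{\hat T}\big) \\
 &\to -3\pi R\, M_5^3\,\Big[\tfrac29\,\bar{\hat\Sigma}\hat\Sigma
      +\tfrac13\,\big(\bar{\hat T}\hat\Sigma+\hat T\bar{\hat\Sigma}\big)\Big] \\
 &= 2\pi R\, M_5^3\,\Big[-\tfrac13\,\bar{\hat\Sigma}\hat\Sigma
      -\tfrac12\,\big(\bar{\hat T}\hat\Sigma+\text{h.c.}\big)\Big] .
\end{align*}
```

These are the compensator kinetic term and the radion-compensator mixing that one expects from the flat-space Lagrangian once it is integrated over a circle of length $2\pi R$.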
3.2. Bulk Lagrangian for warped space
We now turn our attention to the case of warped space. AdS5 is not a solution of the ordinary, ungauged supergravity Lagrangian, which does not admit a cosmological constant.
To have a cosmological constant term, a U (1)R subgroup of the SU (2)R symmetry must
be gauged by the graviphoton. However, because we will restrict ourselves to the quadratic
Lagrangian of supergravity, the gauging of the U (1)R will not be apparent in our formalism. Within this gauged theory, we then assume a fixed background defined by Eq. (2.3)
and look for the quadratic Lagrangian for the fluctuations around that background.
The first important thing that we want to show is that the fluctuation Lagrangian can be
written in terms of ordinary 4D superfields. To do so we start by considering the quadratic
Lagrangian for 5D supergravity. At the local level there are two supersymmetries. However, the boundary conditions on S1 /Z2 are such that globally there remains at most one
supersymmetry, regardless of there being 5D curvature. As has been discussed in several
papers, the locally supersymmetric RS1 model preserves one global supersymmetry $Q^G_\alpha$.
Technically this means that there exists one Killing spinor over the RS1 background. In
bispinor notation, and introducing a generic constant Weyl spinor $\epsilon_\alpha$, this Killing spinor
has the form [24]
\[ \zeta = \begin{pmatrix} e^{-\sigma/2}\, \epsilon_\alpha \\ 0 \end{pmatrix} . \tag{3.18} \]
The generator $Q^G_\alpha$ of the corresponding global supersymmetry is defined in terms of
the generators of the local 5D supersymmetries, denoted in bispinor notation by
$Q^L = (Q^L_1, Q^L_2)$, by the equation $\epsilon^\alpha Q^G_\alpha = \zeta \cdot Q^L = e^{-\sigma/2}\, \epsilon^\alpha Q^L_{1\alpha}$, which implies
\[ Q^G_\alpha = e^{-\sigma/2}\, Q^L_{1\alpha} . \tag{3.19} \]
We can now realize $Q^G_\alpha$ and the rest of the global 4D super-Poincaré group ($P_a = i \partial_a$ plus
Lorentz boosts) over ordinary flat 4D superspace:
\[ Q^G_\alpha = \frac{\partial}{\partial\theta^\alpha} - i\, \sigma^a_{\alpha\dot\alpha}\, \bar\theta^{\dot\alpha}\, \partial_a . \tag{3.20} \]
In this realization of our field space, the fifth coordinate $y$ is just a label upon which our
4D superfields $S(x, y, \theta)$ depend. Of course 5D covariance is never manifest in this formulation of the theory and the correct action is obtained via the explicit dependence of
the superspace Lagrangian on $y$ and $\partial_y$. By expanding the superfields as $S(x, y, \theta) = \sum_n S_n(x, y)\, \theta^n$,
we can identify $S_n(x, y)$ with the local 5D fields. However, our global
supersymmetry knows little about the local 5D geometry, so that in general the $S_n$'s are
not normalized in a way that makes 5D covariance manifest. This is obviously not a problem: the correct normalization (as well as the correct covariant derivative structure) can
always be obtained by local redefinitions of the $S_n$'s by powers of the warp factor [42,43].
We can figure out the right rescaling that defines the canonical fields by considering the
normalization of the supercharge. By Eq. (3.19), $Q^L_{1\alpha}$ is related to the global supercharge
by $Q^L_{1\alpha} = e^{\sigma/2}\, Q^G_\alpha$. Then by defining a new local superspace coordinate $\tilde\theta = e^{-\sigma/2}\, \theta$, and
substituting the flat vielbein $\delta_a^m$ by the curved one $e_a^m = e^{\sigma} \delta_a^m$, we can write
\[ Q^L_{1\alpha} = \frac{\partial}{\partial\tilde\theta^\alpha} - i\, \sigma^a_{\alpha\dot\alpha}\, \bar{\tilde\theta}^{\dot\alpha}\, e_a^m \partial_m . \tag{3.21} \]
Not surprisingly, the presence of the vielbein shows that $Q^L_{1\alpha}$ is covariant and realizes the
local supersymmetry subalgebra
\[ \big\{ Q^L_{1\alpha} , \bar Q^L_{1\dot\alpha} \big\} = 2i\, \sigma^a_{\alpha\dot\alpha}\, e_a^m \partial_m . \tag{3.22} \]
Therefore, if we parametrize our superfields in terms of the local $\tilde\theta$ instead of the global
$\theta$, the field coefficients should correspond to the local canonical fields. From the definition
$S = \sum_n S_n \theta^n = \sum_n \tilde S_n \tilde\theta^n$ we conclude that the canonical fields $\tilde S_n$ must be defined as^4
\[ \tilde S_n(x, y) = e^{n\sigma/2}\, S_n(x, y) . \tag{3.24} \]
At first sight it might seem better to work directly with the $\tilde\theta$ coordinates. However, $Q^L_{1\alpha}$
does not commute with $\partial_y$, since by Eq. (3.19) it depends explicitly on $y$, and therefore
$\partial_y S$ is not a superfield over the superspace defined by $\tilde\theta$. We could define a supercovariant
$D_y$ derivative, but we find it more convenient to work with global, flat superspace.
Now that we know that it is possible to write the desired Lagrangian in terms of standard
$N = 1$ superfields, we need to examine the gauge symmetry that this Lagrangian should
possess. We parametrize the fluctuations around the background as
\[ ds^2 = e^{-2\sigma} (\eta_{mn} + h_{mn})\, dx^m dx^n + 2 e^{-\sigma} h_{my}\, dx^m dy + (1 + h_{yy})\, dy^2 . \tag{3.25} \]
The linearized general coordinate transformations are then given by:
\[ \delta h_{mn} = \partial_m \xi_n + \partial_n \xi_m - 2 \sigma'\, \eta_{mn}\, \xi_y , \tag{3.26} \]
\[ \delta h_{my} = e^{-\sigma} \partial_y \xi_m + e^{\sigma} \partial_m \xi_y , \tag{3.27} \]
\[ \delta h_{yy} = \partial_y \xi_y . \tag{3.28} \]
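4 To be precise, one should remember that the superfield coefficients $S_n$ involve in some cases 4D derivatives $\partial_a$. However, this does not affect our conclusions, because the change $\tilde\theta = e^{-\sigma/2}\, \theta$ in coordinates
corresponds to the change $\tilde\partial_a = e_a^m \partial_m$ in derivatives. For instance, in the case of a chiral superfield we have
\[ \Phi = e^{i \theta \sigma^a \bar\theta\, \partial_a} \big( \phi + \theta \psi + F \theta^2 \big) = e^{i \tilde\theta \sigma^a \bar{\tilde\theta}\, e_a^m \partial_m} \big( \phi + e^{\sigma/2}\, \tilde\theta \psi + e^{\sigma} F\, \tilde\theta^2 \big) . \tag{3.23} \]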
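Comparing with the flat case, we see that the warping is responsible for a new term proportional to $\sigma'$ in the transformation law for $h_{mn}$.
The embedding of component fields into superfields can be done as in the flat case,
except that we need to introduce in this case a real prepotential $P_T$ for $T$ as well, in
such a way that $T = -\frac14 \bar D^2 P_T$. The transformation laws can then be written in terms of
superfields, after introducing a prepotential $P_\Omega$ also for $\Omega$, as:
\[ \delta V_m = - \frac12\, \bar\sigma_m^{\dot\alpha\alpha} \big( \bar D_{\dot\alpha} L_\alpha - D_\alpha \bar L_{\dot\alpha} \big) , \tag{3.29} \]
\[ \delta \Psi_\alpha = e^{-\sigma} \partial_y L_\alpha - \frac14\, e^{\sigma} D_\alpha \Omega , \tag{3.30} \]
\[ \delta P_\Sigma = D^\alpha L_\alpha + \bar D_{\dot\alpha} \bar L^{\dot\alpha} - 3 \sigma' P_\Omega , \tag{3.31} \]
\[ \delta P_T = \partial_y P_\Omega . \tag{3.32} \]
From the last two expressions, it follows that:
\[ \delta \Sigma = - \frac14\, \bar D^2 D^\alpha L_\alpha - 3 \sigma'\, \Omega , \tag{3.33} \]
\[ \delta T = \partial_y \Omega . \tag{3.34} \]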
Note that the new term proportional to $\sigma'$ in the transformation law for $h_{mn}$ is encoded
in superfield language in a new term in the transformation law for $\Sigma$. This is possible because, in addition to general coordinate invariance, the transformations parametrized by $L_\alpha$
also include Weyl and axial transformations. These extra conformal transformations can be
fixed by setting the lowest component $s$ of the compensator $\Sigma$ to 0. But the subgroup of
transformations that preserve this gauge choice involves scale transformations that are correlated with diffeomorphisms and induce the appropriate extra term in Eq. (3.26). To show
this more precisely, let us consider the transformation laws of $h_{mn}$ and $s$ under diffeomorphisms with real parameters $\xi_M$ and complexified Weyl plus axial transformations with
complex parameter $\lambda$, as implied by Eqs. (3.29) and (3.33):
\[ \delta h_{mn} = \partial_m \xi_n + \partial_n \xi_m - \frac23\, \eta_{mn}\, \partial_p \xi^p + \frac16\, \eta_{mn} (\lambda + \bar\lambda) , \tag{3.35} \]
\[ \delta s = 2\, \partial_m \xi^m - \lambda - 6 \sigma'\, \xi_y . \tag{3.36} \]
As anticipated we can now use the Weyl and axial symmetries associated to $\lambda$ to set $s$ to 0.
To preserve that gauge choice, however, diffeomorphisms must then be accompanied by a
suitable Weyl transformation with parameter
\[ \lambda + \bar\lambda = 4\, \partial_m \xi^m - 12 \sigma'\, \xi_y . \tag{3.37} \]
Plugging this expression back into (3.35), we find that the net transformation law of the
graviton under a diffeomorphism, after the conformal gauge-fixing, reproduces indeed
Eq. (3.26).
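The substitution described in the last sentence is short enough to spell out. The sketch below assumes that the Weyl parameter enters the graviton variation through the combination $\lambda + \bar\lambda$ with coefficient $1/6$, and that the trace term carries the coefficient $2/3$, as suggested by the fractions appearing in Eq. (3.35); the dummy index $p$ is chosen here to avoid a clash with the free indices.

```latex
\begin{align*}
\delta h_{mn}
 &= \partial_m\xi_n+\partial_n\xi_m-\tfrac23\,\eta_{mn}\,\partial_p\xi^p
    +\tfrac16\,\eta_{mn}\,(\lambda+\bar\lambda) \\
 &= \partial_m\xi_n+\partial_n\xi_m-\tfrac23\,\eta_{mn}\,\partial_p\xi^p
    +\tfrac16\,\eta_{mn}\,\big(4\,\partial_p\xi^p-12\,\sigma'\,\xi_y\big) \\
 &= \partial_m\xi_n+\partial_n\xi_m-2\,\sigma'\,\eta_{mn}\,\xi_y .
\end{align*}
```

The trace terms cancel and only the warping-induced piece proportional to $\sigma' \xi_y$ survives, which is the extra term of Eq. (3.26).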
It is now straightforward to construct a superfield Lagrangian that is invariant under
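the transformations (3.29)-(3.34) and reduces to (3.6) in the flat limit. It has the following
expression:
\[
\begin{aligned}
\mathcal{L} = M_5^3 \int d^4\theta\, e^{-2\sigma} \Big\{ & - V_{3/2}^m \big( \Box + e^{2\sigma} \partial_y e^{-4\sigma} \partial_y \big) V_{3/2\, m}
 - \frac23 \Big[ \partial_m V_0^m - \frac{i}{2} (\Sigma - \bar\Sigma) \Big]^2 \\
& - \frac12 \Big[ e^{-\sigma} \partial_y \big( V_0 + V_{1/2} + V_1 \big)_{\alpha\dot\alpha} - (\bar D_{\dot\alpha} \Psi_\alpha - D_\alpha \bar\Psi_{\dot\alpha}) \Big]^2 \\
& + \frac14 \Big[ e^{-\sigma} \big( \partial_y P_\Sigma + 3 \sigma' P_T \big) - (D^\alpha \Psi_\alpha + \bar D_{\dot\alpha} \bar\Psi^{\dot\alpha}) \Big]^2 \\
& + i\, (T - \bar T) \Big[ \partial_m V_0^m - \frac{i}{2} (\Sigma - \bar\Sigma) \Big] \Big\} .
\end{aligned}
\tag{3.38}
\]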
There is one important remark to make about this Lagrangian: it possesses an extra (accidental) local invariance in addition to those we employed to derive it. Its parameter is a
chiral superfield $W_\alpha$ and the transformation laws are given by
\[ \delta \Psi_\alpha = 3 \sigma'\, e^{-\sigma}\, W_\alpha , \tag{3.39} \]
\[ \delta P_T = D^\alpha W_\alpha + \bar D_{\dot\alpha} \bar W^{\dot\alpha} . \tag{3.40} \]
This new symmetry is associated to a redundancy in the parametrization of $T$ by a prepotential $P_T$. Indeed $P_T$ shifts by a linear multiplet so that using $W_\alpha$ we can gauge away
the newly introduced components of $P_T$. By this invariance of the quadratic action, some
combinations of fields have no kinetic term. In order for our expansion to make sense when
going to non-linear order, it is important to demand full invariance under this new transformation.
The component form of the Lagrangian (3.38) is reported in Appendix A. It reproduces
the correct component Lagrangian for supergravity on a slice of AdS5 at the linearized
level. The well known derivation of the low energy effective theory and KK decomposition
then follows. It is however instructive to derive these results in terms of superfields.
3.3. KK mode decomposition
Let us start by constructing the zero mode superfield action. First of all, $\Psi_\alpha$ is $Z_2$ odd and
therefore does not have zero modes. Second, notice that if we choose $V^m(x, y) = \hat V^m(x)$,
the mass terms involving $\partial_y$ cancel out. On the other hand, the mass term involving $P_\Sigma$ and
$P_T$ in the third line is non-vanishing for $y$-independent field configurations. One possible
parametrization of the zero modes for which this term vanishes altogether^5 is defined by
\[
\begin{aligned}
P_T(x, y) &= \hat P_T(x) , & P_\Sigma(x, y) &= \hat P_\Sigma(x) - 3 \sigma(y)\, \hat P_T(x) , \\
T(x, y) &= \hat T(x) , & \Sigma(x, y) &= \hat\Sigma(x) - 3 \sigma(y)\, \hat T(x) .
\end{aligned}
\tag{3.41}
\]
Notice that the conformal compensator depends on $y$ precisely like the conformal factor
of the metric does in the zero mode parametrization for the bosonic RS1 model [44]. In
order to write the effective action in compact form it is useful to rewrite the superspin 0
component as $\partial_m V_0^m = \chi + \bar\chi$ and form the two combinations
\[ \Sigma_A = \Sigma + 2i \chi , \qquad \Sigma_B = \Sigma - 2i \chi . \tag{3.42} \]
5 One can check that, by using the gauge freedom associated to $L_\alpha$ and $W_\alpha$, $\hat P_\Sigma(x)$ and $\hat P_T(x)$ can be chosen
to be purely chiral + antichiral (i.e., no linear superfield component) while keeping $\Psi_\alpha = 0$.
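It is important to notice that $\Sigma_A$ is gauge-invariant but $\Sigma_B$ is not. The zero mode action,
with the second and third lines in Eq. (3.38) vanishing, depends only on $\Sigma_A$. Defining the
zero modes of the latter in analogy with Eq. (3.41), i.e.,
\[ \Sigma_A(x, y) = \hat\Sigma(x) - 3 \sigma(y)\, \hat T(x) + 2i\, \hat\chi(x) = \hat\Sigma_A(x) - 3 \sigma(y)\, \hat T(x) , \tag{3.43} \]
the 5D action for the zero modes can be rewritten simply as:
\[
\mathcal{L} \simeq M_5^3 \int d^4\theta\; \partial_y \Big\{ e^{-2\sigma} \Big[ \frac1{2\sigma'}\, \hat V_{3/2}^m\, \Box\, \hat V_{3/2\, m}
 + \frac1{6\sigma'}\, \big( \hat\Sigma_A - 3 \sigma \hat T \big) \big( \bar{\hat\Sigma}_A - 3 \sigma \bar{\hat T} \big) \Big] \Big\}
\tag{3.44}
\]
showing that the geometry of the boundaries is what matters in the low energy effective
action. The quadratic action for the zero modes is then
\[
\mathcal{L}_{\rm eff} = \frac{M_5^3}{k} \int d^4\theta\, \Big\{ - \big( 1 - e^{-2k\pi R} \big)\, \hat V_{3/2}^m\, \Box\, \hat V_{3/2\, m}
 - \frac13 \Big[ \bar{\hat\Sigma}_A \hat\Sigma_A - e^{-2k\pi R} \big( \hat\Sigma_A - 3 k \pi R\, \hat T \big) \big( \bar{\hat\Sigma}_A - 3 k \pi R\, \bar{\hat T} \big) \Big] \Big\} .
\tag{3.45}
\]
By making explicit the dependence of $\hat\Sigma_A$ on $V_m$ it can be verified that this Lagrangian is local,
as it should be. Moreover, with the identification $\phi = \exp(\hat\Sigma/3)$ and $T = \pi R\, (1 + \hat T)$, it agrees
with the quadratic expansion of the full non-linear result, which was inferred by general
arguments in Refs. [29,45]:
\[ \mathcal{L}_{\rm eff} = - 3\, \frac{M_5^3}{k} \int d^4\theta\, \bar\phi \phi\, \Big( 1 - e^{-k (T + \bar T)} \Big) . \tag{3.46} \]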
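The agreement between the non-linear Lagrangian (3.46) and the compensator-radion part of the quadratic Lagrangian (3.45) can be checked in a few lines. The sketch below drops all $V_m$-dependent terms (so that $\hat\Sigma_A$ reduces to $\hat\Sigma$), keeps only terms surviving the $d^4\theta$ integration, and uses the shorthand $|f|^2 = \bar f f$; it assumes $\phi = \exp(\hat\Sigma/3)$ and $T = \pi R\,(1 + \hat T)$.

```latex
\begin{align*}
\bar\phi\phi\,\big(1-e^{-k(T+\bar T)}\big)
 &= \big|e^{\hat\Sigma/3}\big|^2
   - e^{-2k\pi R}\,\big|e^{(\hat\Sigma-3k\pi R\,\hat T)/3}\big|^2 \\
 &\to \tfrac19\,\Big[\bar{\hat\Sigma}\hat\Sigma
   - e^{-2k\pi R}\,\big(\hat\Sigma-3k\pi R\,\hat T\big)
     \big(\bar{\hat\Sigma}-3k\pi R\,\bar{\hat T}\big)\Big] , \\[4pt]
-\frac{3M_5^3}{k}\cdot\tfrac19\,\Big[\cdots\Big]
 &\;\xrightarrow{\;k\to0\;}\;
 2\pi R\, M_5^3\,\Big[-\tfrac13\,\bar{\hat\Sigma}\hat\Sigma
   -\tfrac12\,\big(\bar{\hat T}\hat\Sigma+\text{h.c.}\big)\Big] .
\end{align*}
```

The first two lines give the bracket multiplying $-1/3$ in the quadratic action, and the last line shows that the flat limit $k \to 0$ is smooth and reduces to the flat-space zero mode Lagrangian.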
In fact, even though Eq. (3.45) is only valid at quadratic level in the superfield $\hat T$, the
full non-linear dependence on the radion superfield $T$ of Eq. (3.46) can be deduced through
the following argument. At the linearized level, the real part of the scalar component $t$
of $T$ coincides in principle with $h_{yy} \simeq \sqrt{g_{yy}} - 1$ only up to higher order terms. However, one can argue that by a holomorphic field redefinition $T \to T' = f(T)$ it should
always be possible to choose ${\rm Re}\, t = \sqrt{g_{yy}} - 1$ [29]. Let us then choose $T$ such that
${\rm Re}\, t = \sqrt{g_{yy}} - 1$. Now, focussing on the constant mode of $g_{yy}$, we know that the low
energy Lagrangian can only depend on it via the covariant combination $\pi R \sqrt{g_{yy}}$ equalling
the physical length of the fifth dimension. This is not yet enough to fully fix the dependence
on $T$. We need to use the constraints on the dependence on ${\rm Im}\, t \sim A_y$, the graviphoton fifth
component. At tree level, the VEV of the low energy kinetic function corresponds to
the effective 4D Planck scale of the usual RS1 model, which does not depend on other
bulk fields than the radion. In particular it does not depend on gauge fields like the
graviphoton. This fixes completely the dependence on $T$ to be obtained by the simple substitution $2\pi R \to \pi R\, (2 + \hat T + \bar{\hat T})$, compatibly with our results. The dependence on $A_y$
is actually constrained even at the quantum level by the presence of an accidental (gauge)
symmetry $A_y \to A_y + {\rm const}$ of minimal 5D supergravity on $S^1/Z_2$. The point is that the
graviphoton appears in covariant derivatives via a $Z_2$-odd charge, $\partial_y \to \partial_y + i q\, \epsilon(y) A_y$:
basically the graviphoton is a $Z_2$ odd gauge field used to gauge an even symmetry. It is
then evident that a constant $A_y = a$ configuration can be gauged away by a gauge rotation
with parameter $\alpha(y) = a\, \sigma(y)/(kR)$. Since we will be using the quadratic supergravity Lagrangian for the computation of the quantum effective action, we will later need
this argument to turn the dependence of our result on $R$ into a dependence on the superfield $T$.
The KK spectrum can be studied in a similar way. For doing so, it is convenient
to work in the gauge $\Psi_\alpha = 0$, where it is evident that $V_{1/2}$ and $V_1$ do not propagate
while $V_{3/2}$ has the KK decomposition of the graviton in RS1. We then exploit the local invariances associated to the parameters $P_\Omega$, $W_\alpha$ and $L_\alpha$ to bring the fields $P_T$
and $P_\Sigma$ in a convenient form. In addition to a chiral part of superspin 0, these real
superfields contain a linear superfield mode of superspin 1/2 that does not propagate,
as we will now argue. First, by using $P_\Omega$ and $W_\alpha$, we can always bring $P_T$ in the form
$e^{2\sigma} \tilde P_T(x)$, where $\tilde P_T$ is $y$-independent and satisfies $\bar D^2 D_\alpha \tilde P_T = 0$. This choice eliminates all the modes in $P_T$ apart from a chiral radion zero mode. Next, by using the
residual freedom $L_\alpha = S_\alpha$ with $S_\alpha(x)$ chiral and constant over $y$ we can eliminate the
$y$-independent linear superfield mode in $P_\Sigma$. Note that $V_m$ and $\Psi_\alpha$ are unaffected by
these transformations. In this way, in the superspin 1/2 sector of $P_\Sigma$ and $P_T$, only the
non-trivial KK modes of $P_\Sigma$ are left. Like for $V_{1/2}$ and $V_1$, their kinetic Lagrangian
is a simple quadratic term proportional to $(\partial_y P_\Sigma)^2$ with a trivial mass shell condition
setting the modes themselves to zero. In what follows, we can then concentrate on the
components with superspin 0 and work directly with the chiral fields $T$ and $\Sigma$. Compatibly with the above discussion, the decompositions of the fields are chosen as follows:
\[ T(x, y) = e^{2\sigma(y)}\, \tilde T(x) , \tag{3.47} \]
\[ \Sigma(x, y) = - \frac32\, e^{2\sigma(y)}\, \tilde T(x) + \tilde\Sigma(x, y) . \tag{3.48} \]
The corresponding expressions for $\Sigma_{A,B}$ (cf. Eq. (3.42)) are then given by
\[ \Sigma_{A,B}(x, y) = - \frac32\, e^{2\sigma}\, \tilde T(x) + \tilde\Sigma_{A,B}(x, y) . \tag{3.49} \]
With this parametrization the Lagrangian for the chiral fields becomes
\[
\mathcal{L}_{T,\Sigma} = \int d^4\theta\, \Big\{ \frac34\, e^{2\sigma}\, \bar{\tilde T} \tilde T - \frac13\, e^{-2\sigma}\, \bar{\tilde\Sigma}_A \tilde\Sigma_A
 + \frac14 \Big[ e^{-4\sigma}\, \partial_y \tilde\Sigma_B\, \frac1\Box\, \partial_y \bar{\tilde\Sigma}_A + \text{h.c.} \Big] \Big\} .
\tag{3.50}
\]
It is useful, and straightforward, to decompose $\tilde\Sigma_A$ and $\tilde\Sigma_B$ in a complete set of orthogonal
KK modes satisfying:
\[ \partial_y e^{-4\sigma} \partial_y\, \tilde\Sigma_{A,B}^{(n)} = - m_n^2\, e^{-2\sigma}\, \tilde\Sigma_{A,B}^{(n)} . \tag{3.51} \]
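The absence of a kinetic mixing between the radion and the compensator in this parametrization can be checked directly. The sketch below is a minimal illustration under a stated assumption: that, under $\int d^4\theta$, the part of the warped bulk Lagrangian without $y$-derivatives reduces for the chiral fields to $e^{-2\sigma}\big[-\tfrac13 \bar\Sigma_A \Sigma_A - \tfrac12 (T \bar\Sigma_A + \text{h.c.})\big]$. Substituting $T = e^{2\sigma} \tilde T$ and $\Sigma_A = -\tfrac32 e^{2\sigma} \tilde T + \tilde\Sigma_A$ then gives:

```latex
\begin{align*}
-\tfrac13\,e^{-2\sigma}\,\bar\Sigma_A\Sigma_A
 &= -\tfrac34\,e^{2\sigma}\,\bar{\tilde T}\tilde T
    +\tfrac12\,\big(\tilde T\,\bar{\tilde\Sigma}_A+\text{h.c.}\big)
    -\tfrac13\,e^{-2\sigma}\,\bar{\tilde\Sigma}_A\tilde\Sigma_A , \\
-\tfrac12\,e^{-2\sigma}\,\big(T\,\bar\Sigma_A+\text{h.c.}\big)
 &= +\tfrac32\,e^{2\sigma}\,\bar{\tilde T}\tilde T
    -\tfrac12\,\big(\tilde T\,\bar{\tilde\Sigma}_A+\text{h.c.}\big) , \\
\text{sum}
 &= \tfrac34\,e^{2\sigma}\,\bar{\tilde T}\tilde T
    -\tfrac13\,e^{-2\sigma}\,\bar{\tilde\Sigma}_A\tilde\Sigma_A .
\end{align*}
```

The mixed terms cancel, the radion acquires a positive kinetic coefficient $\tfrac34 e^{2\sigma}$, and the compensator keeps its negative coefficient $-\tfrac13 e^{-2\sigma}$, as in the first two terms of the chiral-field Lagrangian.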
The KK modes Lagrangian can then be written (with a convenient normalization) as:

3 2R
4
e
1 T T
LT , (n) = d
4

(n) (n)
1
2R
2 (n) 1 (n)
1e
.
A A + mn B
+ h.c.
3
A
n
(3.52)
This parametrization makes it manifest that the non-zero KK modes of $\tilde\Sigma_B$ act as
Lagrange multipliers for the modes of $\tilde\Sigma_A$ [40]. The only physical modes that are
left are therefore $\tilde T$ along with $\tilde\Sigma_A^{(0)}(x)$: they represent an alternative parametrization of the zero modes, in which the radion does not mix kinetically with the 4D
graviton. This is the superfield analogue of the radion parametrization discussed in
Ref. [46].
3.4. Boundary Lagrangians
The only interactions that are needed for our calculation are those between brane and
bulk fields. At the fixed points, $\Psi_\alpha$ vanishes, $V_m$ and $\Sigma$ undergo the same transformations
of linearized 4D supergravity (remember that $\sigma'$ is odd and vanishes at the boundaries)
and finally $T$ is the only field transforming under $\Omega$. By this last property, $T$ cannot couple to the boundary, so that only $V_m$ and $\Sigma$ can couple and they must do so precisely
like they do in 4D supergravity (see [9]). What is left are the 4D superdiffeomorphisms
of the boundaries. The presence of warping shows up in the boundary Lagrangian via the
suitable powers of the warp factor. These are easy to evaluate according to our discussion in the previous section. Using locally inertial coordinates $\tilde x$ and $\tilde\theta$, no power of the
warp factor should appear in the invariant volume element $d^4\tilde x\, d^4\tilde\theta$ for the Kähler potential and $d^4\tilde x\, d^2\tilde\theta$ for the superpotential. From the ordinary RS1 model, we know that
$\tilde x = e^{-\sigma} x$, where $x$ are the global coordinates, while from the previous section we have
learned that $\tilde\theta = e^{-\sigma/2}\, \theta$. We conclude that the warp factors multiplying the Kähler and the
superpotential of the actions localized at $y = y_i$ must be equal respectively to $e^{-2\sigma(y_i)}$ and
$e^{-3\sigma(y_i)}$.
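The counting of warp factors can be written out explicitly. The only ingredient beyond the rescalings $\tilde x = e^{-\sigma} x$ and $\tilde\theta = e^{-\sigma/2}\theta$ is the standard property of Berezin integration that a Grassmann measure scales inversely to its coordinate, so that $\int d\tilde\theta\, \tilde\theta = 1$ is preserved.

```latex
\begin{align*}
d^4\tilde x &= e^{-4\sigma}\,d^4x , \qquad
d\tilde\theta = e^{+\sigma/2}\,d\theta
 \;\Rightarrow\;
 d^2\tilde\theta = e^{\sigma}\,d^2\theta , \quad
 d^4\tilde\theta = e^{2\sigma}\,d^4\theta , \\[4pt]
d^4\tilde x\,d^4\tilde\theta &= e^{-4\sigma+2\sigma}\,d^4x\,d^4\theta
 = e^{-2\sigma}\,d^4x\,d^4\theta
 \qquad\text{(K\"ahler potential)} , \\
d^4\tilde x\,d^2\tilde\theta &= e^{-4\sigma+\sigma}\,d^4x\,d^2\theta
 = e^{-3\sigma}\,d^4x\,d^2\theta
 \qquad\text{(superpotential)} .
\end{align*}
```

Evaluated at $y = y_i$, these are the two warp factors quoted in the text.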
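Let us consider a chiral superfield $\Phi_i$ localized at the brane at $y_i$ and with quadratic
Kähler potential $\Omega_i = \bar\Phi_i \Phi_i$. As we already argued, the couplings of $\Phi_i$ to $V_m$ and $\Sigma$ are
the same as in ordinary 4D supergravity. For our purposes, since we are only interested in
the 1-loop Kähler potential, it is sufficient to consider terms that are at most quadratic in
$V_m$ and $\Sigma$, with derivatives acting on at most one of $\Phi_i$ and $\bar\Phi_i$. The relevant part of the
4D boundary Lagrangian at $y_i$ is then given by
\[
\mathcal{L}_i = \int d^4\theta\, e^{-2\sigma(y_i)} \Big\{ \bar\Phi_i \Phi_i \Big( 1 + \frac13 \Sigma \Big) \Big( 1 + \frac13 \bar\Sigma \Big)
 + \frac23\, i\, \bar\Phi_i \overleftrightarrow{\partial}_{\! m} \Phi_i\, V^m
 - \frac16\, \bar\Phi_i \Phi_i\, V^m K_{mn} V^n \Big\} .
\tag{3.53}
\]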
To compute the effective Kähler potential, we split the matter field into classical background $\phi_i$ and quantum fluctuation $\pi_i$: $\Phi_i = \phi_i + \pi_i$. We then rewrite the Lagrangian in
terms of the different superspin components defined previously^6
\[
\mathcal{L}_i = \int d^4\theta\, e^{-2\sigma(y_i)} \Big\{ \bar\phi_i \phi_i + \bar\pi_i \pi_i + \frac13\, \bar\phi_i \pi_i\, \bar\Sigma_A + \frac13\, \bar\pi_i \phi_i\, \Sigma_A
 - \frac13\, \bar\phi_i \phi_i \Big[ - V_{3/2}^m\, \Box\, V_{3/2\, m} - \frac13\, \bar\Sigma_A \Sigma_A \Big] \Big\} .
\tag{3.54}
\]
In this expression, we have neglected terms involving derivatives of the background and
interactions among the quantum fluctuations, as they do not affect the Kähler potential at 1-loop.
Notice that the Lagrangian involves only the gauge invariant quantities $\Sigma_A$ and $V_{3/2}^m$.
This property will be very useful for the computation of the next subsection. The effect of
localized kinetic terms is easily obtained by generalizing the localized Kähler potential as
$\Omega_i \to \Omega_i - 3 M_i^2$. In the above Lagrangian, this amounts to change $\bar\phi_i \phi_i \to \bar\phi_i \phi_i - 3 M_i^2$
in those terms that are quadratic in the background, without affecting the terms that are
instead linear in the background and involve fluctuations.
Finally, by using our parametrization of the radion and compensator zero modes, we
can check that the boundary contribution to the low energy effective Lagrangian is just the
linearized version of the full non-linear result, i.e.:
\[ \mathcal{L}_{\rm eff} = \int d^4\theta\, \bar\phi \phi\, \Big[ \Omega_0 (\Phi_0, \bar\Phi_0) + \Omega_1 (\Phi_1, \bar\Phi_1)\, e^{-k (T + \bar T)} \Big] . \tag{3.55} \]
3.5. One-loop effective Kähler potential
We now have all the necessary ingredients to set up more concretely the calculation of
gravitational quantum corrections to the 4D effective Kähler potential. As we restrict our
attention to the effective Kähler potential, we neglect all derivatives on the external fields.
The supergraph calculation that needs to be done becomes then very similar to the calculation of the Coleman-Weinberg potential in a non-supersymmetric theory [47]. Similar
superfield computations have already been done for the gauge corrections to the Kähler
potential in 4D supersymmetric theories in [48].
Normally, the most convenient procedure for doing this kind of computation is to add
a suitable gauge-fixing term to the Lagrangian and work in generalized $R_\xi$ gauges. In
the case at hand, however, it turns out that there is a much simpler approach. The point
is that the fields $V_{3/2}$ and $\Sigma_A$ appearing in the boundary Lagrangian are already gauge-invariant combinations. Their propagator is gauge-independent and so we can work directly
in the unitary gauge (defined in Section 3.3), where the bulk Lagrangian takes its simplest form. Let us first examine the contribution from the $\Sigma_A$ field. It couples to matter
at the boundaries, and the quantity that is relevant for the computation is the propagator $\langle \Sigma_A(x, y_i)\, \bar\Sigma_A(x, y_j) \rangle$ connecting two branes at positions $y_i$ and $y_j$. The relevant
Lagrangian for computing the propagator has already been worked out and is given by
Eq. (3.52). For the non-zero KK modes, we can write the kinetic matrix in two by two
matrix notation as:
6 Note that these projectors are ill-defined when $p^2 = 0$. However, since we are interested in a loop computation with non-vanishing virtual momentum, this does not cause any problem.

K (n) , (n) =
A
13
2
1 mn
4
m2n

.
(3.56)
Inverting this matrix, we find that for $m_n \neq 0$ we have $\langle \tilde\Sigma_A^{(n)}(x)\, \bar{\tilde\Sigma}_A^{(n)}(x) \rangle = 0$, and correspondingly, these modes do not contribute to $\langle \Sigma_A(x, y_i)\, \bar\Sigma_A(x, y_j) \rangle$ (cf. Eq. (3.49)).
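The vanishing of this propagator is a generic property of any kinetic matrix in which one field enters only linearly, i.e. as a Lagrange multiplier. As a minimal illustration, take a symmetric two by two matrix with entries $a$ and $b \neq 0$ (placeholders introduced here, standing for the diagonal entry of $\tilde\Sigma_A^{(n)}$ and the off-diagonal mixing with $\tilde\Sigma_B^{(n)}$, whatever their precise normalization):

```latex
\begin{equation*}
K=\begin{pmatrix} a & b \\ b & 0 \end{pmatrix}
\quad\Longrightarrow\quad
K^{-1}=\begin{pmatrix} 0 & 1/b \\ 1/b & -a/b^{2} \end{pmatrix},
\qquad
K\,K^{-1}=\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.
\end{equation*}
```

The upper-left entry of $K^{-1}$, which controls the propagation of the first field into itself, is zero for any value of $a$, as long as the mixing $b$ (here proportional to $m_n^2$) does not vanish.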
The only modes that are left are thus the zero modes $\tilde T$ and $\tilde\Sigma_A^{(0)}(x)$. The kinetic operator
for these two fields is, again in two by two matrix notation:
\[
K^{(0)}_{\Sigma_A, T} = \begin{pmatrix} - \frac13 \big( 1 - e^{-2\pi k R} \big) & 0 \\ 0 & \frac34 \big( e^{2\pi k R} - 1 \big) \end{pmatrix} .
\tag{3.57}
\]
It is straightforward to verify that the resulting propagators for $\tilde\Sigma_A^{(0)}$ and $\tilde T$ give no contributions to $\langle \Sigma_A(x, y_i)\, \bar\Sigma_A(x, y_j) \rangle$. This is only possible thanks to the ghostly nature
of the $\tilde\Sigma_A^{(0)}$ field, which contains indeed the conformal mode of the 4D graviton. The
cancellation is spoiled if $y_i = y_j$, and in that case one gets a non-trivial contribution to
the propagator. However, the volume dependence is such that it contributes only to local
terms; we therefore conclude that the zero modes are also irrelevant for our calculation.
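The cancellation between the radion and the compensator zero modes can be made explicit. The sketch below assumes $\sigma(y) = k|y|$, takes the decomposition $\Sigma_A = -\tfrac32 e^{2\sigma} \tilde T + \tilde\Sigma_A$ and the diagonal kinetic coefficients $K_\Sigma = -\tfrac13 (1 - e^{-2\pi k R})$ and $K_T = \tfrac34 (e^{2\pi k R} - 1)$ as inputs, and omits the common superspace and momentum factor shared by the two chiral propagators.

```latex
\begin{align*}
\big\langle \Sigma_A(y_i)\,\bar\Sigma_A(y_j)\big\rangle_{\text{zero modes}}
 &\propto \frac94\,\frac{e^{2\sigma(y_i)+2\sigma(y_j)}}{K_T} + \frac1{K_\Sigma}
 = \frac{3\,e^{2\sigma(y_i)+2\sigma(y_j)}}{e^{2\pi kR}-1}
   - \frac{3\,e^{2\pi kR}}{e^{2\pi kR}-1} \\
 &= 3\;\frac{e^{2\sigma(y_i)+2\sigma(y_j)}-e^{2\pi kR}}{e^{2\pi kR}-1} .
\end{align*}
```

For $y_i = 0$ and $y_j = \pi R$ the numerator vanishes identically, while for $y_i = y_j$ one is left with a constant ($-3$ on the brane at $y = 0$ and $3\, e^{2\pi k R}$ on the brane at $y = \pi R$), in line with the statement that coincident points give only a local contribution.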
Summarizing, the above reasoning shows that the whole $\Sigma_A$ field is not relevant to our
computation. We therefore need to focus only on $V_{3/2}^m$, and the relevant Lagrangian has the
form
\[
\mathcal{L}_{V_{3/2}} = - M_5^3 \int d^4\theta\, V^{3/2}_m \Big[ e^{-2\sigma} \big( 1 - \rho_0\, \delta_0(y) - \rho_1\, \delta_1(y) \big) \Box + \partial_y e^{-4\sigma} \partial_y \Big] V_{3/2}^m .
\tag{3.58}
\]
The quantities $\rho_i$ represent the boundary corrections to kinetic terms,^7 including the effect
of matter field VEVs. For phenomenological application, we will be interested in the case
$\rho_i = (- 3 M_i^2 + \bar\phi_i \phi_i)/(3 M_5^3)$. Inverting the above kinetic term the superspin structure
factors out in an overall projector:
\[
\big\langle V^{3/2}_m (x_1, y_1, \theta_1)\, V^{3/2}_n (x_2, y_2, \theta_2) \big\rangle = \Pi^{3/2}_{mn}\, \delta^4(\theta_1 - \theta_2)\, \Delta(x_1, y_1; x_2, y_2) ,
\tag{3.59}
\]
where $\Delta$ is the propagator of a real scalar field in a slice of AdS$_5$, in presence of boundary
kinetic terms, defined by the equation
\[
\Big[ e^{-2\sigma} \Big( 1 - \sum_i \rho_i\, \delta_i(y) \Big) \Box + \partial_y e^{-4\sigma} \partial_y \Big] \Delta(x_1, y_1; x_2, y_2) = \delta^4(x_1 - x_2)\, \delta(y_1 - y_2) .
\tag{3.60}
\]
This is compatible with the results found in Ref. [8] for the flat case. By applying the
methods of Ref. [48], the 1-loop effective action can be written in a factorized form
as^8
\[
\Gamma = \frac{i}{2} \int d^4x\, d^4x'\, \delta^4(x - x') \int dy\, dy'\, \delta(y - y') \int d^4\theta\, d^4\theta'\, \delta^4(\theta - \theta')\,
 \eta_{mn} \Pi^{mn}_{3/2}\, \delta^4(\theta - \theta')\, \ln \Delta^{-1}(x, y, \theta; x', y', \theta') .
\tag{3.62}
\]
7 Note that we have defined these quantities with an overall negative sign, as in Ref. [8].
By using then the identity
\[
\int d^4\theta\, d^4\theta'\, \delta^4(\theta - \theta')\, \eta_{mn} \Pi^{mn}_{3/2}\, \delta^4(\theta - \theta') = \int d^4\theta\, \frac{-4}{\Box} ,
\tag{3.63}
\]
expanding the scalar propagator in KK modes, and continuing to Euclidean momentum,
the 1-loop correction to the Kähler potential is finally found to be
\[
\Omega_{\text{1-loop}} = \frac12 \int \frac{d^4 p}{(2\pi)^4}\, \frac{-4}{p^2}\, \sum_n \ln \big( p^2 + m_n^2 \big) .
\tag{3.64}
\]
Apart from the $-4/p^2$ factor, arising from the superspace trace, this formula is just the
Casimir energy of a scalar whose propagation is governed by Eq. (3.60). From now on, we
will therefore concentrate on this simple scalar Lagrangian.
It is worth noticing that the fact that the boundary interactions involve only the gauge
invariant superspin 3/2 and 0 components is actually a general result valid for any number
of extra dimensions. What is specific to 5D theories is that the scalar component does not
possess any propagating KK mode, and therefore cannot contribute to the genuinely calculable effects, like the Casimir energy or the radion-mediated and brane-to-brane-mediated
soft masses. In higher dimensions, this is no longer true, so that the scalar channel can
in principle contribute. However it turns out that, for a simple but remarkable property of
Eq. (3.54), this happens only for the Casimir energy but not for the terms that are quadratic
in the matter fields. The reason is that Eq. (3.54) inherits from the original quadratic matter Lagrangian a rescaling symmetry, thanks to which the compensator dependence can be
fully reabsorbed by a redefinition of the matter field. In Eq. (3.54), this rescaling amounts
to the shift $\pi_i \to \pi_i - \phi_i\, \Sigma_A/3$. Diagrammatically, the existence of this rescaling symmetry reveals itself in the mutual cancellations of the two diagrams shown in Fig. 2. Using
this property, we can, for diagrams involving matter, do our computation by first eliminating the compensator $\Sigma$, rather than $\Sigma_A$. After that we can work in a generalization
of Landau's gauge, where the $\langle V^m V^n \rangle$ propagator is proportional to the $\Pi^{mn}_{3/2}$ projector.
In this gauge, the diagrams involving cubic vertices cancel individually, in analogy with
what happens for the calculation of the effective potential in non-supersymmetric theories. In order to properly define the gauge that is needed to implement the above program,
we must add a suitable gauge-fixing to the Lagrangian.
Fig. 2. Cancellation of two diagrams contributing to the effective Kähler potential.
8 This relation can be obtained by first defining the trivial Gaussian functional integral for a non-propagating
field as
\[ \int \mathcal{D} V_m\, e^{\, i \int d^4x\, d^4\theta\, V_m V^m} = 1 . \tag{3.61} \]
The interesting case is then obtained by computing a similar functional integral where the trivial kinetic function
$\eta^{mn}$ is substituted with $\eta^{mn} + \Pi^{mn}_{3/2} (\Delta^{-1} - 1)$. The resummation in perturbation theory of all the insertions of
$\Pi^{mn}_{3/2}$ leads to Eq. (3.62).
Since the gauge transformation
$\delta V_{\alpha\dot\alpha} = \bar D_{\dot\alpha} L_\alpha - D_\alpha \bar L_{\dot\alpha}$ spans the subspace of components with superspin 0, 1/2 and 1,
the parameter $L_\alpha$ can be used to adjust the components $V_0^m$, $V_{1/2}^m$ and $V_1^m$, but not $V_{3/2}^m$.
Correspondingly, the most general acceptable gauge-fixing Lagrangian consists of a combination of quadratic terms for $V_0^m$, $V_{1/2}^m$ and $V_1^m$. The choice that allows to reach the gauge
where only $V_{3/2}^m$ propagates turns out to be given by the following expression:
\[
\mathcal{L}_{\rm gf} = - \int d^4\theta\, e^{-2\sigma} \Big[ \frac1\xi\, V_m \big( \eta^{mn} - \Pi^{mn}_{3/2} \big) \Box V_n + \frac23\, V_m \Pi_0^{mn}\, \Box V_n \Big] .
\tag{3.65}
\]
The part of the total Lagrangian that is quadratic in $V_m$ then becomes:
\[
\mathcal{L}_{\rm quad} = \int d^4\theta\, \Big\{ - V_m \Pi^{mn}_{3/2} \big( e^{-2\sigma} \Box + \partial_y e^{-4\sigma} \partial_y \big) V_n
 - \frac1\xi\, V_m \big( \eta^{mn} - \Pi^{mn}_{3/2} \big) e^{-2\sigma}\, \Box V_n \Big\} .
\tag{3.66}
\]
For $\xi \to 1$, and for a 4D theory where the extra superfields would be absent, this procedure
would define the analogue of the super-Lorentz gauge. Its form coincides with the one
that was used in Ref. [9], except for terms involving $\Psi$, $T$ and $\Sigma$. For $\xi \to 0$, instead, it
defines the analogue of the Landau gauge that we need. Indeed, it is clear that when $\xi$ is
sent to 0 only the $V_{3/2}^m$ component can propagate. Moreover, since $V_{3/2}^m$ does not couple to
$\Psi$, $\Sigma$ and $T$, the full $\langle V_m V_n \rangle$ is now exactly given by the left-hand side of Eq. (3.59).
4. Explicit computation and results

As demonstrated in the last section, the full 1-loop correction to the Kähler potential is
encoded in the spectrum of a single real 5D scalar field $\varphi$ with Lagrangian
\[
\mathcal{L} = \frac12\, e^{-2\sigma(y)} \Big[ - (\partial_\mu \varphi)^2 - e^{-2\sigma(y)} (\partial_y \varphi)^2 + \big( \rho_0\, \delta_0(y) + \rho_1\, \delta_1(y) \big) (\partial_\mu \varphi)^2 \Big] .
\tag{4.1}
\]
More precisely, see Eq. (3.64), the effective 1-loop Kähler potential is obtained by inserting
a factor $-4/p^2$ in the virtual momentum representation of the scalar Casimir energy. It is
also understood that the circumference $2\pi R$ should be promoted to the superfield $T + \bar T$
(see Section 3.3), and similarly the constants $\rho_i$ should be promoted to the superfields
$(- 3 M_i^2 + \bar\Phi_i \Phi_i)/(3 M_5^3)$. In the end, the superspace structure therefore only counts in a
suitable way the various degrees of freedom via Eq. (3.63). The factor 4 takes into account
the multiplicity of bosonic and fermionic degrees of freedom and the factor $1/p^2$ the fact
that the Kähler potential determines the component effective action only after taking its D
component.
Let us now come to the computation. Along the lines of Ref. [8], we find it convenient
to start from $\rho_i = 0$ and to construct the full result by resumming the Feynman diagrams with
all the insertions of $\rho_i$. The building blocks for the computation of the matter-dependent
terms are the boundary-to-boundary propagators connecting the points $y_i$ and $y_j$, with
$y_{i,j} = 0, \pi R$:
\[
\Delta_{ij}(p) = \sum_n e^{-\frac32 \sigma(y_i)}\, e^{-\frac32 \sigma(y_j)}\, \frac{\psi_n(y_i)\, \psi_n(y_j)}{p^2 + m_n^2}
 = e^{-\frac32 \sigma(y_i)}\, e^{-\frac32 \sigma(y_j)}\, \Delta(p, y_i, y_j) .
\tag{4.2}
\]
Here and in what follows, $\psi_n(y)$ and $m_n$ denote the wave functions and the masses of the
KK modes of the scalar field $\varphi$, in the limit $\rho_i = 0$. The quantity $\Delta$ appearing in the second
equality is then by definition the propagator of the scalar field $\varphi$ in the same limit and
in mixed momentum-position space: momentum space along the non-compact directions,
and configuration space along the fifth. The exponential factors have been introduced for
later convenience. The quantity that is relevant to compute the matter-independent term is
instead the following spectral function:
\[ Z(p) = \prod_n \big( p^2 + m_n^2 \big) . \tag{4.3} \]
The explicit expressions for the above quantities are most conveniently written in terms
of the functions $\hat I_{1,2}$ and $\hat K_{1,2}$, defined in terms of the standard Bessel functions $I_{1,2}$ and
$K_{1,2}$ as
\[
\hat I_{1,2}(x) = \sqrt{\frac{\pi}{2}}\, \sqrt{x}\, I_{1,2}(x) , \qquad \hat K_{1,2}(x) = \sqrt{\frac{2}{\pi}}\, \sqrt{x}\, K_{1,2}(x) .
\tag{4.4}
\]
These functions are elliptic generalizations of the standard trigonometric functions, and
satisfy the relation
\[ \hat I_1(x) \hat K_2(x) + \hat K_1(x) \hat I_2(x) = 1 . \tag{4.5} \]
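The normalization of the hatted functions is what makes this relation hold with unit right-hand side. The short derivation below uses only the standard Wronskian-type identity for modified Bessel functions, $I_\nu(x) K_{\nu+1}(x) + I_{\nu+1}(x) K_\nu(x) = 1/x$, evaluated at $\nu = 1$, together with the prefactors $\sqrt{\pi x/2}$ and $\sqrt{2x/\pi}$ of the definitions above.

```latex
\begin{align*}
\hat I_1(x)\,\hat K_2(x)+\hat K_1(x)\,\hat I_2(x)
 &= \sqrt{\frac{\pi x}{2}}\,\sqrt{\frac{2x}{\pi}}\;
    \big[\,I_1(x)\,K_2(x)+K_1(x)\,I_2(x)\,\big] \\
 &= x\cdot\frac1x \;=\; 1 .
\end{align*}
```

The same prefactors also fix the leading large-argument behavior, $\hat I_\nu(x) \simeq e^{x}/2$ and $\hat K_\nu(x) \simeq e^{-x}$, since $I_\nu(x) \simeq e^{x}/\sqrt{2\pi x}$ and $K_\nu(x) \simeq \sqrt{\pi/(2x)}\, e^{-x}$.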
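Their asymptotic behavior at large argument $x \gg 1$ is
\[
\begin{aligned}
\hat I_1(x) &\simeq \frac{e^{x}}{2} \Big[ 1 - \frac{3}{8x} - \frac{15}{128 x^2} + \cdots \Big] , &
\hat K_1(x) &\simeq e^{-x} \Big[ 1 + \frac{3}{8x} - \frac{15}{128 x^2} + \cdots \Big] , \\
\hat I_2(x) &\simeq \frac{e^{x}}{2} \Big[ 1 - \frac{15}{8x} + \frac{105}{128 x^2} + \cdots \Big] , &
\hat K_2(x) &\simeq e^{-x} \Big[ 1 + \frac{15}{8x} + \frac{105}{128 x^2} + \cdots \Big] ,
\end{aligned}
\]
and their asymptotic behavior at small argument $x \ll 1$ is instead
\[
\begin{aligned}
\hat I_1(x) &\simeq \sqrt{\frac{\pi}{2}}\, \frac12\, x^{3/2} + \cdots , &
\hat K_1(x) &\simeq \sqrt{\frac{2}{\pi}}\, x^{-1/2} + \cdots , \\
\hat I_2(x) &\simeq \sqrt{\frac{\pi}{2}}\, \frac18\, x^{5/2} + \cdots , &
\hat K_2(x) &\simeq \sqrt{\frac{2}{\pi}}\, 2\, x^{-3/2} + \cdots .
\end{aligned}
\]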
Consider first the computation of the quantities (4.2). Rather than computing them directly as infinite sums over KK mode masses, we derive them from particular cases of
the propagator $\Delta(p, y, y')$ for $\varphi$, which is given by the solution with Neumann boundary
conditions at $y$ equal to 0 and $\pi R$ of the following differential equation:
\[ \Big[ e^{-2ky} p^2 - \partial_y e^{-4ky} \partial_y \Big] \Delta(p, y, y') = \delta(y - y') . \tag{4.6} \]
The solution of this equation is most easily found by switching to the conformal variable
$z = e^{ky}/k$, in which the metric is given by Eq. (1.3) and the positions of the two branes by
$z_0 = 1/k$ and $z_1 = e^{k\pi R}/k$. Notice that in these coordinates, a rescaling $z \to \lambda z$ is equivalent to a shift $z_{0,1} \to \lambda z_{0,1}$ of both boundaries plus a Weyl rescaling $g_{\mu\nu} \to g_{\mu\nu}/\lambda^2$ of the
metric along the 4D slices. Therefore, both $1/z_0$ and $1/z_1$ have the properties of conformal compensators, and by locality it is then natural to identify them at the superfield level
with the superconformal compensators at the respective boundaries: $(k z_0)^{-2} \to \bar\phi \phi$ and
$(k z_1)^{-2} \to \bar\phi \phi\, e^{-k (T + \bar T)}$. Defining also $u = \min(z, z')$ and $v = \max(z, z')$, the propagator
is given by [49,50]:
\[
\Delta(p, u, v) = \frac{\big[ \hat I_1(p z_0) \hat K_2(p u) + \hat K_1(p z_0) \hat I_2(p u) \big] \big[ \hat I_1(p z_1) \hat K_2(p v) + \hat K_1(p z_1) \hat I_2(p v) \big]}{2 p\, (k u)^{-\frac32} (k v)^{-\frac32} \big[ \hat I_1(p z_1) \hat K_1(p z_0) - \hat K_1(p z_1) \hat I_1(p z_0) \big]} .
\tag{4.7}
\]
The brane restrictions of the general propagator (4.7) defining Eqs. (4.2) are then easily
computed. The factors $e^{-\frac32 k y_{i,j}}$ that have been introduced cancel the factors $(k z_{i,j})^{3/2}$ appearing
in (4.7). Moreover, one of the factors in the numerator is always equal to 1 thanks to
Eq. (4.5). The results finally read
\[ \Delta_{00}(p) = \frac{1}{2p}\, \frac{\hat I_1(p z_1) \hat K_2(p z_0) + \hat K_1(p z_1) \hat I_2(p z_0)}{\hat I_1(p z_1) \hat K_1(p z_0) - \hat K_1(p z_1) \hat I_1(p z_0)} , \tag{4.8} \]
\[ \Delta_{11}(p) = \frac{1}{2p}\, \frac{\hat I_1(p z_0) \hat K_2(p z_1) + \hat K_1(p z_0) \hat I_2(p z_1)}{\hat I_1(p z_1) \hat K_1(p z_0) - \hat K_1(p z_1) \hat I_1(p z_0)} , \tag{4.9} \]
\[ \Delta_{01,10}(p) = \frac{1}{2p}\, \frac{1}{\hat I_1(p z_1) \hat K_1(p z_0) - \hat K_1(p z_1) \hat I_1(p z_0)} . \tag{4.10} \]
Next, consider the formal determinant (4.3). Although this is not precisely a propagator, it is still a function of the spectrum and can be functionally related to the propagator in Eq. (4.7). Indeed, the masses $m_n$ are defined by the positions of the poles $p = im_n$ of (4.7). These are determined by the vanishing of the denominator, that is by the equation

$$
F(im_n) = \hat I_1(im_nz_1)\hat K_1(im_nz_0) - \hat K_1(im_nz_1)\hat I_1(im_nz_0) = 0. \tag{4.11}
$$
The infinite product in Eq. (4.3) is divergent. More precisely, it has the form of a constant divergent prefactor times a finite function of the momentum. In order to compute the latter, we consider the quantity $\partial_p \ln Z(p) = \sum_n 2p/(p^2 + m_n^2)$. The infinite sum over the eigenvalues, which are defined by the transcendental equation (4.11), is now convergent and can be computed with standard techniques, exploiting the so-called Sommerfeld-Watson transform. The result is simply given by $\partial_p \ln F(p)$. This implies that $Z(p) = F(p)$, up to the already mentioned irrelevant infinite overall constant. Omitting the latter, we have therefore

$$
Z(p) = \hat I_1(pz_1)\hat K_1(pz_0) - \hat K_1(pz_1)\hat I_1(pz_0). \tag{4.12}
$$

Fig. 3. We sum diagrams with an arbitrary number of $\rho_0$ insertions using a dressed propagator that includes $\rho_1$ insertions.
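As a quick consistency check of Eq. (4.12), one can take the flat limit $kR \to 0$ at fixed $p$ and $R$. Both arguments $pz_0$ and $pz_1$ then become large while their difference stays finite, so only the leading large-argument behavior is needed. The sketch below assumes the normalization $\hat I_1(x) \simeq e^{x}/2$, $\hat K_1(x) \simeq e^{-x}$ for the rescaled Bessel functions:

```latex
\begin{aligned}
Z(p) &= \hat I_1(pz_1)\hat K_1(pz_0) - \hat K_1(pz_1)\hat I_1(pz_0) \\
     &\simeq \tfrac{1}{2}e^{p(z_1 - z_0)} - \tfrac{1}{2}e^{-p(z_1 - z_0)}
      = \sinh\big[p(z_1 - z_0)\big], \\
z_1 - z_0 &= \frac{e^{k\pi R} - 1}{k} \;\longrightarrow\; \pi R \qquad (kR \to 0).
\end{aligned}
```

In this limit $Z(p) \to \sinh(\pi pR)$, whose zeros at $p = in/R$ reproduce the flat Kaluza-Klein masses $m_n = n/R$.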
Now that we have the expression for the propagators in the absence of localized kinetic terms ($\rho_i = 0$), we can compute the full effective Kähler potential at 1-loop, including the $\rho_i$'s. To do so, we continue to Euclidean momentum and follow Ref. [8]. In summary, we first calculate the sum of diagrams with an arbitrary number of insertions of one of the two localized kinetic terms, say $\rho_0$, and the other one turned off. The result depends on the brane-to-brane propagator $\Delta_{00}(p)$. We then replace this propagator with a propagator dressed with an arbitrary number of insertions of the other kinetic term, that is $\rho_1$ (see Fig. 3). Finally, we also add the diagram with no insertion of localized kinetic terms at all, which is proportional to $\ln Z(p)$. Since we have already included a factor $(kz_i)^{-3/2}(kz_j)^{-3/2}$ in the definition of $\Delta_{ij}$ compared to the standard propagator defined with a pure $\delta$-function source and no induced metric factor, and the interaction localized at $z_i$ in (4.1) involves a factor $(kz_i)^{-2}$, each factor $\rho_i$ will come along with a factor $kz_i$. The result can be written in the following very simple form as a two by two determinant:

$$
\Omega_{\text{1-loop}} = \int\frac{d^4p}{(2\pi)^4}\,\frac{-2}{p^2}\,\ln\left\{Z(p)\,\det\left[1 - kp^2\begin{pmatrix}\rho_0z_0\Delta_{00}(p) & \rho_0z_0\Delta_{01}(p)\\ \rho_1z_1\Delta_{10}(p) & \rho_1z_1\Delta_{11}(p)\end{pmatrix}\right]\right\}. \tag{4.13}
$$
The momentum integral is divergent, and as expected the divergence corresponds to a renormalization of local operators. In order to disentangle the finite corrections associated with the non-local quantities that we want to compute from the divergent contributions corresponding to local terms, we must first classify the latter and then properly subtract them. The expected general form of UV divergences can be deduced by inspecting Eqs. (3.46) and (3.55), which define the general form of the tree-level effective action and therefore of the allowed local terms. Defining for convenience $t_{0,1} = 1/z_{0,1}^2$, which as mentioned above should be thought of in terms of the corresponding superfield compensators, this is given by

$$
\Omega_{\rm UV} = F_0(\Phi_0)\,t_0 + F_1(\Phi_1)\,t_1. \tag{4.14}
$$

These divergent terms can come from two different sources: the renormalization of the 5D Planck mass and the renormalization of the kinetic functions at the boundaries. On the other hand, covariance under the Weyl shift $t_{0,1} \to t_{0,1}/\lambda^2$ discussed above constrains the whole Kähler function to have the general form

$$
\Omega = t_0\,\omega(t_1/t_0). \tag{4.15}
$$

We have not displayed the dependence on the boundary matter fields, as that is not constrained by Weyl symmetry. By the structure of the UV divergences in Eq. (4.14), it follows that the derivative quantity

$$
t_0\,\partial_{t_0}\partial_{t_1}\Omega = -\frac{t_1}{t_0}\,\omega''(t_1/t_0) \tag{4.16}
$$

must be finite, since the operator $\partial_{t_0}\partial_{t_1}$ annihilates Eq. (4.14). So one way to proceed is to first calculate $\omega''$ and then reconstruct the full $\omega$ by solving an ordinary second order differential equation. This solution is determined up to two integration constants associated with the general solution of the homogeneous equation $\omega'' = 0$: $\omega = F_0 + F_1t_1/t_0$. These constants precisely parametrize, as they should, the UV divergences in Eq. (4.14).
In what follows, however, we will not directly apply the above derivative method. We shall instead subtract from the loop integral which defines $\Omega_{\text{1-loop}}$ a suitable divergent integral with the properties that: (1) its functional dependence on $t_{0,1}$ is such that it is manifestly annihilated by the operator $\partial_{t_0}\partial_{t_1}$, (2) its subtraction makes the integral converge. The finite result obtained in this way is then guaranteed to contain all the finite calculable terms we are interested in. As we will explain in more detail below, the functions defining the appropriate subtractions are obtained by simply replacing each propagator $\Delta_{ij}(p)$ with its asymptotic behavior $\tilde\Delta_{ij}(p)$ for $p \to \infty$. Up to exponentially suppressed terms of order $e^{-2p(z_1 - z_0)}$, which are clearly irrelevant, we find:

$$
\tilde\Delta_{00}(p) = \frac{1}{2p}\,\frac{\hat K_2(pz_0)}{\hat K_1(pz_0)} \simeq \frac{1}{2p}\Big[1 + \frac{3}{2pz_0} + \frac{3}{8(pz_0)^2} + \cdots\Big], \tag{4.17}
$$
$$
\tilde\Delta_{11}(p) = \frac{1}{2p}\,\frac{\hat I_2(pz_1)}{\hat I_1(pz_1)} \simeq \frac{1}{2p}\Big[1 - \frac{3}{2pz_1} + \frac{3}{8(pz_1)^2} + \cdots\Big], \tag{4.18}
$$
$$
\tilde\Delta_{01,10}(p) = 0. \tag{4.19}
$$
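The coefficients $\pm 3/2$ and $3/8$ in Eqs. (4.17) and (4.18) follow from dividing the large-argument series of the rescaled Bessel functions. A minimal sketch of that division in exact arithmetic is given below; it is an added illustration, the helper `ratio_series` is hypothetical, and the input lists are the bracket coefficients of the large-argument expansions quoted earlier in this section.

```python
from fractions import Fraction as Fr

def ratio_series(num, den, order=2):
    """Coefficients of num(1/x) / den(1/x) as a power series in 1/x."""
    out = []
    for n in range(order + 1):
        s = num[n] - sum(out[m] * den[n - m] for m in range(n))
        out.append(s / den[0])
    return out

# Bracket coefficients [1, c1, c2] in 1 + c1/x + c2/x^2 + ...
K1 = [Fr(1), Fr(3, 8), Fr(-15, 128)]
K2 = [Fr(1), Fr(15, 8), Fr(105, 128)]
I1 = [Fr(1), Fr(-3, 8), Fr(-15, 128)]
I2 = [Fr(1), Fr(-15, 8), Fr(105, 128)]

uv_ratio = ratio_series(K2, K1)   # enters the z_0 brane subtraction
ir_ratio = ratio_series(I2, I1)   # enters the z_1 brane subtraction
print(uv_ratio, ir_ratio)
```

With $x = pz_0$ or $x = pz_1$, the two lists are the brackets multiplying $1/(2p)$ in the subtracted propagators.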
Similarly, for the quantity $Z(p)$, the appropriate subtraction that has to be done to isolate the finite contribution associated with non-local quantities is defined by the asymptotic behavior $\tilde Z(p)$ of this quantity for $p \to \infty$. In this limit, the second term in (4.12) is of order $e^{-2p(z_1 - z_0)}$ with respect to the first, and can be neglected. Therefore, the quantity that controls the UV divergences in the Casimir energy is

$$
\tilde Z(p) = \hat I_1(pz_1)\hat K_1(pz_0) \simeq \frac{1}{2}e^{p(z_1 - z_0)}\Big[1 - \frac{3}{8}\Big(\frac{1}{pz_1} - \frac{1}{pz_0}\Big) - \frac{15}{128}\Big(\frac{1}{(pz_1)^2} + \frac{1}{(pz_0)^2}\Big) + \cdots\Big]. \tag{4.20}
$$

Notice that the contribution to the effective action related to $\tilde Z$ is proportional to an integral of $\ln\tilde Z$. The divergent subtraction defined by (4.20) therefore splits as expected into the sum of two terms depending only on $z_0$ and $z_1$.
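The non-local corrections to the Kähler potential can now be computed by subtracting from Eq. (4.13) the same expression but with $Z$ and $\Delta_{ij}$ replaced by their asymptotic behaviors $\tilde Z$ and $\tilde\Delta_{ij}$, that is the quantity

$$
\Omega_{\rm div} = \int\frac{d^4p}{(2\pi)^4}\,\frac{-2}{p^2}\,\Big[\ln\tilde Z(p) + \sum_i\ln\big(1 - kp^2\rho_iz_i\tilde\Delta_{ii}(p)\big)\Big]. \tag{4.21}
$$

This formal expression, being the sum of terms that depend either on $z_0$ or on $z_1$ but not on both, vanishes under the action of $\partial_{t_0}\partial_{t_1}$, and is therefore an acceptable subtraction. Our final result is then given by

$$
\Delta\Omega = \int\frac{d^4p}{(2\pi)^4}\,\frac{-2}{p^2}\,\ln\left\{\frac{Z(p)}{\tilde Z(p)}\;\frac{\prod_i\big(1 - kz_i\rho_ip^2\Delta_{ii}(p)\big) - k^2z_0z_1\rho_0\rho_1p^4\,\Delta_{01}(p)\Delta_{10}(p)}{\prod_i\big(1 - kz_i\rho_ip^2\tilde\Delta_{ii}(p)\big)}\right\}, \tag{4.22}
$$

where, as already said:

$$
\rho_i = \frac{\Phi_i^\dagger\Phi_i}{3M_5^3} - \frac{M_i^2}{M_5^3}. \tag{4.23}
$$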
This formula (4.22) is the main result of this paper. It generalizes the flat case result (6.32) of Ref. [8] to the warped case and correctly reduces to the latter in the limit $k \ll 1/R$.$^{9}$ Notice also that it is finite for any $k$, since untilded and tilded quantities differ by exponentially suppressed terms at large momentum.

9 Indeed, in this limit the propagators $\Delta_{ij}$ and the spectral function $Z$ correctly reproduce the flat expressions:
$$
\lim_{kR\to0}\Delta_{00,11} = \frac{1}{2p}\coth(\pi pR), \qquad
\lim_{kR\to0}\Delta_{01,10} = \frac{1}{2p}\,{\rm csch}(\pi pR), \qquad
\lim_{kR\to0}Z = \sinh(\pi pR). \tag{4.24}
$$
Similarly, the tilded quantities involved in the subtraction correctly reduce to the large volume limit of the above untilded expressions.

4.1. Structure of divergences
It is worth spending a few words on the structure of the UV divergences we have subtracted. Since (4.21) satisfies $\partial_{t_0}\partial_{t_1}\Omega_{\rm div} = 0$, it must have the same form as (4.14), if properly (i.e., covariantly) regulated. It is instructive to see how this dependence comes about when working with a hard momentum cut-off. Let us first concentrate on the term proportional to $\ln\tilde Z$. Since $\tilde Z(p)$ is the product of a function of $pz_0$ times a function of $pz_1$, one can split the subtraction in two pieces depending only on $z_0$ and $z_1$, and change integration variables respectively to $v_0 = pz_0$ and $v_1 = pz_1$ in the two distinct contributions:
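$$
\Omega^{\rm sugra}_{\rm div} = \frac{1}{z_0^2}\int\frac{d^4v_0}{(2\pi)^4}\,\frac{-2}{v_0^2}\,\ln\hat K_1(v_0) + \frac{1}{z_1^2}\int\frac{d^4v_1}{(2\pi)^4}\,\frac{-2}{v_1^2}\,\ln\hat I_1(v_1). \tag{4.25}
$$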
These quantities have indeed the expected structure provided the cut-offs $\Lambda_0L$ and $\Lambda_1L$ of the $v_0$ and $v_1$ integrals do not depend on $z_0$ and $z_1$. This has an obvious interpretation, related to the fact that the above two distinct contributions must be associated (after an integral over the fifth dimension) with divergences at the two distinct boundaries. The point is that the original momentum integration variable $p$ is not the physical coordinate-invariant momentum. At each point in the bulk, the physical momentum is rather given by $p_{\rm phys} = \sqrt{p_\mu p_\nu g^{\mu\nu}} = pz/L$, so that the variables $v_0$ and $v_1$ do indeed parametrize the appropriate physical virtual momentum at each boundary. Since a covariant cut-off procedure should bound the physical rather than the comoving momentum, it is indeed the cut-off for $v_{0,1}$ rather than for $p$ that must be fixed and universal. It is easy to check that this is indeed what happens when the above integrals are regulated through the introduction of 5D Pauli-Villars fields of mass $\Lambda$: one finds $\Lambda_0 = \Lambda_1 = \Lambda$. By using the asymptotic expansion of $\hat K_1$ and $\hat I_1$ at large argument, we finally find that the divergent part has the form
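$$
\Omega^{\rm sugra}_{\rm div} \sim \Big(\frac{1}{z_0^2} - \frac{1}{z_1^2}\Big)\Big[(\Lambda L)^3 + \Lambda L\Big] + \Big(\frac{1}{z_0^2} + \frac{1}{z_1^2}\Big)\ln\Lambda, \tag{4.26}
$$
where the numerical coefficient of each term is not displayed. This expression is manifestly invariant, as it should be, under the exchange of the two boundaries: $z_0 \leftrightarrow z_1$, $L \to -L$. Comparing with Eq. (2.5) we see that the first part corresponds to a renormalization $\delta M_5^3 \sim \Lambda^3 + k^2\Lambda$ of the 5D Planck mass, and the second part to a renormalization $\delta M_i^2 \sim k^2\ln\Lambda$ of the boundary kinetic terms.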
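Consider next the two terms in the sum of the second term of Eq. (4.21). Since $p\,\tilde\Delta_{ii}(p)$ is actually a function of $pz_i$ only, and not of the position of the other brane, one can as before factorize all the dependence on $z_i$ from each integral by changing the momentum variables to $v_i = pz_i$:

$$
\Omega^{\rm mat}_{\rm div} = \frac{1}{z_0^2}\int\frac{d^4v_0}{(2\pi)^4}\,\frac{-2}{v_0^2}\,\ln\Big[1 - \frac{k\rho_0v_0}{2}\,\frac{\hat K_2(v_0)}{\hat K_1(v_0)}\Big] + \frac{1}{z_1^2}\int\frac{d^4v_1}{(2\pi)^4}\,\frac{-2}{v_1^2}\,\ln\Big[1 - \frac{k\rho_1v_1}{2}\,\frac{\hat I_2(v_1)}{\hat I_1(v_1)}\Big]. \tag{4.27}
$$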
The same arguments as before concerning covariance and the appropriate cut-offs to be used apply. The result is again that the integral can be evaluated by using the asymptotic expression for large argument of the functions appearing in its integrand and the dimensionless cut-off $\Lambda L$ for both variables $v_i$. Treating $\rho_i$ as small quantities, we can also expand the logarithms. Proceeding in this way we finally find:

$$
\Omega^{\rm mat}_{\rm div} = \sum_{n=1}\beta_n\,\frac{\rho_0^n}{z_0^2}\,\Lambda^{n+2}L^2 + \sum_{n=1}\beta_n\,\frac{\rho_1^n}{z_1^2}\,\Lambda^{n+2}L^2 + \cdots. \tag{4.28}
$$

Comparing with Eq. (2.5) we see that the $n$-th term corresponds to a correction to the 4D potentials $\Omega_i$ of the form $\delta\Omega_i = \beta_n\Lambda^{n+2}(3M_5^3)^{-n}(-3M_i^2 + \Phi_i^\dagger\Phi_i)^n + \cdots$, the dots representing less singular terms. This correction represents a renormalization of the quantities $M_i^2$, the wave function multiplying the kinetic term $\Phi_i^\dagger\Phi_i$ of the matter fields and the coefficients of those higher-dimensional local interactions involving up to $n$ powers of $\Phi_i^\dagger\Phi_i$.
4.2. General results
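Let us now come to a more explicit discussion of the results. Comparing Eq. (4.22) with the general expression (2.9), it is possible to extract all the coefficients $C_{n_0,n_1}$. We will consider in more detail the first coefficients with $n_{0,1} = 0, 1$, which control the vacuum energy and the scalar soft masses that are induced by supersymmetry breaking, as functions of the quantities

$$
\alpha_i = \frac{M_i^2}{M_5^3}
$$

which define the localized kinetic terms for the bulk fields. Taking suitable derivatives of (4.22) with respect to $\rho_i$ and setting these to $-\alpha_i$, we find:

$$
C_{0,0} = -2\int\frac{d^4p}{(2\pi)^4}\,p^{-2}\,\big[\ln\mathbf{Z}(p) - \ln\tilde{\mathbf{Z}}(p)\big], \tag{4.29}
$$
$$
C_{1,0} = \frac{2}{3M_5^3}\,(kz_0)\int\frac{d^4p}{(2\pi)^4}\,\big[\boldsymbol{\Delta}_{00}(p) - \tilde{\boldsymbol{\Delta}}_{00}(p)\big], \tag{4.30}
$$
$$
C_{0,1} = \frac{2}{3M_5^3}\,(kz_1)\int\frac{d^4p}{(2\pi)^4}\,\big[\boldsymbol{\Delta}_{11}(p) - \tilde{\boldsymbol{\Delta}}_{11}(p)\big], \tag{4.31}
$$
$$
C_{1,1} = \frac{2}{9M_5^6}\,(kz_0)(kz_1)\int\frac{d^4p}{(2\pi)^4}\,p^2\,\boldsymbol{\Delta}_{01}(p)\boldsymbol{\Delta}_{10}(p), \tag{4.32}
$$

where $\boldsymbol{\Delta}_{ij}$ are the brane-to-brane propagators in presence of localized kinetic terms $\alpha_i$, and $\mathbf{Z}$ is the $\alpha_i$-dressed analog of $Z$:

$$
\mathbf{Z} = Z\big[(1 + kz_0\alpha_0p^2\Delta_{00})(1 + kz_1\alpha_1p^2\Delta_{11}) - kz_0\alpha_0\,kz_1\alpha_1\,p^4\Delta_{01}\Delta_{10}\big], \tag{4.33}
$$
$$
\boldsymbol{\Delta}_{00} = \frac{\Delta_{00}(1 + kz_1\alpha_1p^2\Delta_{11}) - kz_1\alpha_1p^2\Delta_{01}\Delta_{10}}{(1 + kz_0\alpha_0p^2\Delta_{00})(1 + kz_1\alpha_1p^2\Delta_{11}) - kz_0\alpha_0\,kz_1\alpha_1\,p^4\Delta_{01}\Delta_{10}}, \tag{4.34}
$$
$$
\boldsymbol{\Delta}_{11} = \frac{\Delta_{11}(1 + kz_0\alpha_0p^2\Delta_{00}) - kz_0\alpha_0p^2\Delta_{01}\Delta_{10}}{(1 + kz_0\alpha_0p^2\Delta_{00})(1 + kz_1\alpha_1p^2\Delta_{11}) - kz_0\alpha_0\,kz_1\alpha_1\,p^4\Delta_{01}\Delta_{10}}, \tag{4.35}
$$
$$
\boldsymbol{\Delta}_{01,10} = \frac{\Delta_{01,10}}{(1 + kz_0\alpha_0p^2\Delta_{00})(1 + kz_1\alpha_1p^2\Delta_{11}) - kz_0\alpha_0\,kz_1\alpha_1\,p^4\Delta_{01}\Delta_{10}}. \tag{4.36}
$$

The subtractions $\tilde{\mathbf{Z}}$ and $\tilde{\boldsymbol{\Delta}}_{ij}$ are similarly defined out of $\tilde Z$ and $\tilde\Delta_{ij}$.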
4.3. Results in the absence of localized kinetic terms

In the simplest case of vanishing localized kinetic terms, that is $\alpha_i = 0$, the first four relevant terms in the correction to the effective Kähler potential are given by Eqs. (4.29)-(4.32) with $\mathbf{Z} \to Z$ and $\boldsymbol{\Delta}_{ij} \to \Delta_{ij}$, and similarly for the tilded quantities defining the subtractions.
In the limit of flat geometry, one recovers the known results for the coefficients of the four leading operators:

$$
C_{0,0} = \frac{c_f}{4\pi^2}\,\frac{1}{(T+T^\dagger)^2}, \qquad
C_{1,0} = \frac{c_f}{6\pi^2M_5^3}\,\frac{1}{(T+T^\dagger)^3},
$$
$$
C_{0,1} = \frac{c_f}{6\pi^2M_5^3}\,\frac{1}{(T+T^\dagger)^3}, \qquad
C_{1,1} = \frac{c_f}{6\pi^2M_5^6}\,\frac{1}{(T+T^\dagger)^4}, \tag{4.37}
$$
where $c_f = \zeta(3) \simeq 1.202$. Let us mention that these results differ from the previous results
of Refs. [8,9] by a factor of 2 mismatch in the definition of the 5D kinetic coefficient M52 .
Concerning Ref. [8], we found that the source of the discrepancy is a wrongful normalization of the graviton propagator employed in that paper. Moreover, the argument we shall give in the next section confirms in a simple and unambiguous way that the correct result is the one presented just above.
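The flat-limit coefficients can be checked numerically from the momentum integrals of Section 4.2. The sketch below is an added illustration under stated assumptions: vanishing localized kinetic terms, the flat propagators $\Delta_{00} = \coth(\pi pR)/(2p)$, $\Delta_{01} = {\rm csch}(\pi pR)/(2p)$, $Z = \sinh(\pi pR)$ with their large-volume limits as subtractions, overall factors $-2$, $2/(3M_5^3)$ and $2/(9M_5^6)$ in $C_{0,0}$, $C_{1,0}$ and $C_{1,1}$, and units $M_5 = R = 1$. The helper name `radial_integral` is hypothetical.

```python
import math

ZETA3 = 1.2020569031595943  # Riemann zeta(3)

def radial_integral(f, pmax, n=100000):
    """Midpoint rule for int d^4p/(2 pi)^4 f(p) = (1/(8 pi^2)) int_0^pmax p^3 f(p) dp."""
    h = pmax / n
    total = 0.0
    for j in range(n):
        p = (j + 0.5) * h
        total += p ** 3 * f(p)
    return total * h / (8.0 * math.pi ** 2)

R = 1.0                 # radius, arbitrary units (M_5 = 1)
ell = math.pi * R       # distance between the two branes
t = 2.0 * ell           # T + T^dagger = 2 pi R
pmax = 40.0 / ell       # integrands are exponentially small beyond this point

# ln Z - ln Z~ = ln(1 - e^{-2 pi p R}),  Delta_00 - Delta~_00 = (coth - 1)/(2p)
c00 = -2.0 * radial_integral(
    lambda p: math.log1p(-math.exp(-2 * ell * p)) / p ** 2, pmax)
c10 = (2.0 / 3.0) * radial_integral(
    lambda p: (1.0 / math.tanh(ell * p) - 1.0) / (2 * p), pmax)
c11 = (2.0 / 9.0) * radial_integral(
    lambda p: p ** 2 / (2 * p * math.sinh(ell * p)) ** 2, pmax)

# Ratios to zeta(3)/(4 pi^2 t^2), zeta(3)/(6 pi^2 t^3), zeta(3)/(6 pi^2 t^4)
print(c00 * 4 * math.pi ** 2 * t ** 2 / ZETA3)
print(c10 * 6 * math.pi ** 2 * t ** 3 / ZETA3)
print(c11 * 6 * math.pi ** 2 * t ** 4 / ZETA3)
```

Each printed ratio should be close to 1 if the normalization with $c_f = \zeta(3)$ quoted above is correct.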
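In the limit of very warped geometry the coefficients of the first four operators are instead:

$$
C_{0,0} = \frac{c_wk^2}{4\pi^2}\,e^{-2k(T+T^\dagger)}, \qquad
C_{1,0} = \frac{c_wk^3}{12\pi^2M_5^3}\,e^{-2k(T+T^\dagger)},
$$
$$
C_{0,1} = \frac{c_wk^3}{6\pi^2M_5^3}\,e^{-2k(T+T^\dagger)}, \qquad
C_{1,1} = \frac{c_wk^4}{18\pi^2M_5^6}\,e^{-2k(T+T^\dagger)}, \tag{4.38}
$$

where

$$
c_w = \frac{1}{2}\int_0^\infty dx\,x^3\,\frac{K_1(x)}{I_1(x)} = \frac{1}{8}\int_0^\infty dx\,\frac{x^3}{I_1(x)^2} \simeq 1.165. \tag{4.39}
$$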
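The value of $c_w$ can be reproduced with a few lines of standard-library Python. The sketch below is an added illustration; it uses the form $c_w = \frac{1}{8}\int_0^\infty dx\,x^3/I_1(x)^2$, which is related to $\frac{1}{2}\int_0^\infty dx\,x^3K_1(x)/I_1(x)$ by an integration by parts and the Wronskian $I_1K_1' - I_1'K_1 = -1/x$. The function names, the cut-off `xmax` and the number of series terms are choices made for this example.

```python
def bessel_i1(x, terms=60):
    """Modified Bessel function I_1(x) from its power series (adequate for x up to about 30)."""
    half = x / 2.0
    term = half                 # k = 0 term: (x/2)^1 / (0! 1!)
    total = term
    for k in range(1, terms):
        term *= half * half / (k * (k + 1))
        total += term
    return total

def c_w(xmax=30.0, n=20000):
    """Midpoint estimate of (1/8) * int_0^xmax x^3 / I_1(x)^2 dx."""
    h = xmax / n
    s = 0.0
    for j in range(n):
        x = (j + 0.5) * h
        s += x ** 3 / bessel_i1(x) ** 2
    return s * h / 8.0

print(c_w())
```

The integrand decays like $x^4e^{-2x}$ at large $x$, so truncating at `xmax = 30` does not affect the quoted digits.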
4.4. Results in the presence of localized kinetic terms for the case of small and large warping
In the presence of localized kinetic terms, that is $\alpha_i \neq 0$, it is convenient to consider directly the full Kähler potential as a function of $\rho_i = -\alpha_i + \Phi_i^\dagger\Phi_i/(3M_5^3)$.
In the flat case, the result simplifies to

$$
\Delta\Omega = -\frac{1}{4\pi^2}\,\frac{1}{(T+T^\dagger)^2}\,f_f\Big(\frac{\rho_0}{T+T^\dagger}, \frac{\rho_1}{T+T^\dagger}\Big), \tag{4.40}
$$

where

$$
f_f(a_0, a_1) = \int_0^\infty dx\,x\,\ln\Big[1 - \frac{1 + a_0x/2}{1 - a_0x/2}\,\frac{1 + a_1x/2}{1 - a_1x/2}\,e^{-x}\Big]. \tag{4.41}
$$
Expanding this expression at leading order in the matter fields, one finds$^{10}$

$$
C_{n_0,n_1} = -\frac{1}{4\cdot3^{n_0+n_1}\pi^2M_5^{3n_0+3n_1}}\,\frac{1}{(T+T^\dagger)^{2+n_0+n_1}}\,f_f^{(n_0,n_1)}\Big(-\frac{\alpha_0}{T+T^\dagger}, -\frac{\alpha_1}{T+T^\dagger}\Big). \tag{4.42}
$$

As already discussed in Ref. [8], $C_{0,0}$ and $C_{0,1}$ become negative for large $\alpha_0$ and small $\alpha_1$. Similarly $C_{0,0}$ and $C_{1,0}$ become negative for large $\alpha_1$ and small $\alpha_0$.
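The sign change of $C_{0,0}$ can be illustrated numerically. The sketch below is an added example, not part of the original analysis. It assumes the flat-case function $f_f(a_0, a_1) = \int_0^\infty dx\,x\ln[1 - g(a_0)g(a_1)e^{-x}]$ with $g(a) = (1 + ax/2)/(1 - ax/2)$, evaluated at $a_i = -\alpha_i/(T+T^\dagger)$, and $C_{0,0} \propto -f_f$ with a positive proportionality factor. The name `f_flat` and the numerical parameters are hypothetical.

```python
import math

ZETA3 = 1.2020569031595943  # Riemann zeta(3)

def f_flat(a0, a1, xmax=60.0, n=100000):
    """Midpoint estimate of int_0^xmax x ln[1 - g(a0) g(a1) e^-x] dx,
    with g(a) = (1 + a x/2) / (1 - a x/2); intended for a0, a1 not positive."""
    h = xmax / n
    s = 0.0
    for j in range(n):
        x = (j + 0.5) * h
        g0 = (1 + a0 * x / 2) / (1 - a0 * x / 2)
        g1 = (1 + a1 * x / 2) / (1 - a1 * x / 2)
        s += x * math.log1p(-g0 * g1 * math.exp(-x))
    return s * h

no_kinetic = f_flat(0.0, 0.0)        # expected: -zeta(3)
large_alpha0 = f_flat(-1.0e6, 0.0)   # expected: close to +(3/4) zeta(3)
print(no_kinetic / ZETA3, large_alpha0 / ZETA3)
```

For a very large kinetic term on one brane, $g \to -1$ and the integral becomes $\int_0^\infty dx\,x\ln(1 + e^{-x}) = \frac{3}{4}\zeta(3)$, so $f_f$ changes sign and $C_{0,0}$ with it.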
In the limit of large warping, one finds:

$$
\Delta\Omega = \frac{k^2}{4\pi^2}\,e^{-2k(T+T^\dagger)}\,f_w(k\rho_0, k\rho_1), \tag{4.43}
$$

where now

$$
f_w(a_0, a_1) = \frac{1}{2}\int_0^\infty dx\,x^3\,\frac{K_1(x)}{I_1(x)}\,\frac{1}{1 - a_0}\,\frac{1 + a_1\frac{x}{2}K_2(x)/K_1(x)}{1 - a_1\frac{x}{2}I_2(x)/I_1(x)}. \tag{4.44}
$$

From this we deduce that

$$
C_{n_0,n_1} = \frac{k^{2+n_0+n_1}}{4\cdot3^{n_0+n_1}\pi^2M_5^{3n_0+3n_1}}\,e^{-2k(T+T^\dagger)}\,f_w^{(n_0,n_1)}(-\alpha_0k, -\alpha_1k). \tag{4.45}
$$

None of the coefficients becomes negative for large $\alpha_0$ and small $\alpha_1$, whereas $C_{0,0}$ and $C_{1,0}$ become negative for $\alpha_1 \gtrsim 0.6/k$ and small $\alpha_0$. We again stress that there is an upper bound for $\alpha_1$, as for $\alpha_1 = 1/k$ the radion becomes a ghost. But luckily, $C_{1,0}$ changes sign before $\alpha_1$ reaches this critical value.
Notice that the dimensionless parameters that control the effect of the localized kinetic terms depend on the warping. In the flat limit, they are given by $\alpha_i/(2\pi R)$, and their impact therefore significantly depends on the radius dynamics, whereas in the large warping limit they are given by $\alpha_ik$, and their impact is therefore not directly dependent on the radius.
5. A simplified computation
We present here an alternative technique which allows one to rederive in an immediate way some of the results of the present paper and of Refs. [7-9]. The basic remark is that the effects we are computing are dominated by relatively soft modes with 5D momentum of the order of the compactification scale. The softness of these modes does not allow them to distinguish between an infinitesimal shift in the position of a boundary, i.e., a small change of the radius, and the addition of infinitesimal boundary kinetic terms. The equivalence between these two effects$^{11}$ allows one to relate in a straightforward way the matter-independent term (Casimir energy) to the matter-dependent ones (radion and brane-to-brane mediation terms).

10 We use the standard notation $f^{(n_0,n_1)}(a_0, a_1) = (\partial/\partial a_0)^{n_0}(\partial/\partial a_1)^{n_1}f(a_0, a_1)$.
11 This equivalence is related to the analysis of Ref. [51], which shows that a displacement of the branes can be accounted for by a renormalization group flow of the local operators that they support.
Let us see how this works in more detail. As it will become clear from the discussion, warping and spin play no role. Therefore we can focus first on the case of the scalar Lagrangian in Eq. (4.1) in the flat limit $k \to 0$. Notice that $\rho_{0,1}$ have the dimension of a length, so that the case of infinitesimal kinetic terms should correspond to $\rho_{0,1} \ll R$. Consider the equations of motion and boundary conditions in the vicinity of one of the two boundaries, say $y = 0$:

$$
\varphi''(y) = p^2\varphi(y), \qquad \varphi'(0) = -\frac{\rho_0}{2}\,p^2\varphi(0). \tag{5.1}
$$

Now, for $p\rho_0 \ll 1$ the solution does not vary appreciably over the distance $\rho_0$. Therefore we can safely write the value of $\varphi'$ at the point $y = \rho_0/2$ by means of a Taylor expansion. By using both equations in (5.1) we then obtain

$$
\varphi'\Big(\frac{\rho_0}{2}\Big) = \varphi'(0) + \varphi''(0)\,\frac{\rho_0}{2} + \varphi'''(0)\,\frac{\rho_0^2}{8} + \cdots
= 0 + O\big(\rho_0^2p^2\varphi'(0)\big) + O\big(\rho_0^3p^4\varphi(0)\big) + \cdots. \tag{5.2}
$$
This equation implies that up to terms of order $\rho_0^2p^2$, the solutions are the same as those of an equivalent system with Neumann boundary condition $\varphi' = 0$ at a shifted boundary $y_0 = \rho_0/2$. Consequently, at the same order in $\rho_0$ all the properties of the two systems, KK masses included, will be the same. The same remark applies, in a totally independent way, to the other boundary. Now, quantities like the Casimir energy are dominated by modes of momentum $p \sim 1/R$, so that the above equivalence works up to terms of order $\rho_0^2/R^2$, $\rho_1^2/R^2$.
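The equivalence between a small boundary kinetic term and a shifted boundary can be tested directly on the flat KK spectrum. The following sketch is an added illustration under stated assumptions: the boundary condition at $y = 0$ is taken as $\varphi'(0) = -(\rho_0/2)p^2\varphi(0)$ continued to $p^2 = -m^2$, the other boundary at $y = \pi R$ is pure Neumann, and the function names are hypothetical. With $\varphi \propto \cos[m(\pi R - y)]$ the quantization condition becomes $\tan(\pi Rm) = \rho_0m/2$.

```python
import math

def kk_mass_kinetic(n, R, rho0, iters=100):
    """n-th root of tan(pi R m) = rho0 m / 2, found by bisection near m = n / R."""
    def f(m):
        return math.tan(math.pi * R * m - n * math.pi) - rho0 * m / 2.0
    lo, hi = (n - 0.1) / R, (n + 0.1) / R   # brackets the root for small rho0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def kk_mass_shifted(n, R, rho0):
    """n-th mass for pure Neumann conditions on the interval [rho0/2, pi R]."""
    return n * math.pi / (math.pi * R - rho0 / 2.0)

R, rho0 = 1.0, 0.01
for n in (1, 2, 3):
    exact = kk_mass_kinetic(n, R, rho0)
    shifted = kk_mass_shifted(n, R, rho0)
    # shifts relative to the unperturbed mass n / R
    print(n, exact - n / R, shifted - n / R)
```

The two shifts agree at first order in $\rho_0$, and their difference is of relative order $(\rho_0m)^2$, in line with the estimate above.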
This means that radion mediation $O(\rho_0)$ and brane-to-brane mediation $O(\rho_0\rho_1)$ effects can be read off by simply shifting the radius as $2\pi R \to 2\pi R - \rho_0 - \rho_1$ in the Casimir energy (or the Kähler function). In fact the shifted quantity $M^2 = M_5^3(2\pi R - \rho_0 - \rho_1)$ is nothing but the effective low energy Planck mass in the presence of boundary kinetic terms. Being a low energy quantity, $M^2$ manifestly realizes the equivalence between boundary kinetic terms and shift in $R$.
The argument we used above is very robust and quickly generalizes to other cases of interest. In particular, as the argument is based on a local expansion close to the boundary where the small kinetic term has been added, it still holds true whatever boundary condition is given at the other boundary. For instance, in the presence of a sizable kinetic term $\rho_1$, turning on an infinitesimal $\rho_0$ is still equivalent to the shift $2\pi R \to 2\pi R - \rho_0$ in the full Casimir energy $V(R, \rho_1)$. Moreover the presence of curvature is clearly not affecting our basic argument as long as $\rho_{0,1}$ are smaller than the typical curvature length of the metric. Basically, curvature introduces an extra, but fixed, momentum scale in addition to the 4D momentum $p$ of our previous argument. For the RS metric, we would have a double expansion in both $\rho_0p$ and $\rho_0k$. Without doing an explicit computation it is easy to deduce the equivalence relation between boundary kinetic terms and brane shift in the RS model. The point is that the effective 4D Planck mass

$$
M^2 = \frac{M_5^3}{k^3}\Big[\frac{1 - k\rho_0}{z_0^2} - \frac{1 + k\rho_1}{z_1^2}\Big] \tag{5.3}
$$

must be invariant. We conclude that, at linear order, switching on infinitesimal $\rho_i$'s is equivalent to shifting the positions $z_i$ of the branes to

$$
z_0' = z_0\,e^{+k\rho_0/2} \simeq z_0 + k\rho_0z_0/2, \qquad
z_1' = z_1\,e^{-k\rho_1/2} \simeq z_1 - k\rho_1z_1/2. \tag{5.4}
$$
5.1. Flat case

Let us consider now in more detail the implications of the above reasoning. In the flat case, without localized kinetic terms, the matter-independent effective Kähler potential is very simple to deduce (see also Ref. [52]). The modes have Neumann boundary conditions at both branes and masses $m_n = n/R$. One then finds $\Delta\Omega = c_f/(4\pi^2)\,(T+T^\dagger)^{-2}$, where $c_f = {\rm Li}_3(1) = \zeta(3)$. Applying to this result the shift $T+T^\dagger \to T+T^\dagger - (\Phi_0^\dagger\Phi_0 + \Phi_1^\dagger\Phi_1)/(3M_5^3)$ we find:

$$
\Delta\Omega = \frac{\zeta(3)}{4\pi^2}\Big[T+T^\dagger - \frac{\Phi_0^\dagger\Phi_0}{3M_5^3} - \frac{\Phi_1^\dagger\Phi_1}{3M_5^3}\Big]^{-2}. \tag{5.5}
$$

Expanding this expression to linear order in each brane term, we reproduce the coefficients (4.37) for the four leading operators.$^{12}$
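The expansion of the shifted flat result can be done in exact arithmetic. The sketch below is an added illustration; it assumes the shifted form $\frac{c_f}{4\pi^2}[T+T^\dagger - s_0 - s_1]^{-2}$ with $s_i = \Phi_i^\dagger\Phi_i/(3M_5^3)$, and the function names are hypothetical. It returns the numerical prefactor of each operator in units of $c_f/\big(\pi^2M_5^{3(n_0+n_1)}(T+T^\dagger)^{2+n_0+n_1}\big)$.

```python
from fractions import Fraction
from math import comb

def shifted_power_coeff(power, n0, n1):
    """Coefficient of s0^n0 s1^n1 in (1 - s0 - s1)^(-power)."""
    m = n0 + n1
    # (1 - s)^(-power) = sum_m C(power + m - 1, m) s^m, with s^m = (s0 + s1)^m
    return comb(power + m - 1, m) * comb(m, n0)

def flat_prefactor(n0, n1):
    """Prefactor of (Phi0^dag Phi0)^n0 (Phi1^dag Phi1)^n1 after the shift."""
    return Fraction(shifted_power_coeff(2, n0, n1), 4 * 3 ** (n0 + n1))

for n0, n1 in ((0, 0), (1, 0), (0, 1), (1, 1)):
    print((n0, n1), flat_prefactor(n0, n1))
```

The prefactors $1/4$, $1/6$, $1/6$, $1/6$ are those of the four flat coefficients $C_{0,0}$, $C_{1,0}$, $C_{0,1}$, $C_{1,1}$.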
It is also illuminating to consider the limit in which one of the kinetic terms, say $\rho_1$, becomes large with the visible sector located at $y = 0$. The condition at $y = \pi R$ reduces to a Dirichlet boundary condition, forcing all the wave functions of the non-zero modes to vanish there. The massive spectrum becomes $m_n = (n + 1/2)/R$, but with the zero mode mass obviously unaltered. The matter-independent term in the Kähler potential is then given by $\Delta\Omega = c_f/(4\pi^2)\,(T+T^\dagger)^{-2}$, where now $c_f = {\rm Re\,Li}_3(e^{i\pi}) = -\frac{3}{4}\zeta(3)$. Knowing this result, and applying again the shift, which now implies $T+T^\dagger \to T+T^\dagger - \Phi_0^\dagger\Phi_0/(3M_5^3)$, we can deduce that:

$$
\Delta\Omega = -\frac{3}{4}\,\frac{\zeta(3)}{4\pi^2}\Big[T+T^\dagger - \frac{\Phi_0^\dagger\Phi_0}{3M_5^3}\Big]^{-2}. \tag{5.6}
$$

When expanded, this again agrees with our result for the two leading operators. As remarked in Ref. [8], the effect of the large localized kinetic term is to flip the sign of the coefficient of the operator controlling radion mediation, and to send the coefficient of the operator controlling the brane-to-brane mediation to zero.
Notice finally that Ref. [8] also considered the case in which the radion $T$ is stabilized precisely by the 1-loop Casimir energy in the presence of localized kinetic terms. By placing the visible sector on a brane where no other kinetic term contribution was present, it was then found, surprisingly, that the soft masses vanished exactly at the minimum of the radion potential. Our argument about the shift symmetry makes this result obvious. The scalar mass term is obtained by shifting $T+T^\dagger \to T+T^\dagger - \Phi_0^\dagger\Phi_0/(3M_5^3)$ in the radion Casimir energy $V(T+T^\dagger)$. The scalar mass is therefore proportional to $V'$, and vanishes at the minimum of the potential.
12 We thank Adam Falkowski for pointing out this fact to us thus stimulating the discussion of this section.
5.2. Strongly warped case

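To conclude we can consider the strongly warped case. In the absence of localized kinetic terms, the matter-independent term in the effective Kähler potential is known from Refs. [53,54] (see also [55-57]) and has the form $\Delta\Omega_{\rm eff} = c_w/(4\pi^2)\,z_0^2z_1^{-4}$, where $c_w \simeq 1.165$. Applying the shift, we find

$$
\Delta\Omega_{\rm eff} = \frac{c_wk^2}{4\pi^2}\,\exp\Big\{-2k\Big[T+T^\dagger - \frac{1}{6}\frac{\Phi_0^\dagger\Phi_0}{M_5^3} - \frac{1}{3}\frac{\Phi_1^\dagger\Phi_1}{M_5^3}\Big]\Big\}, \tag{5.7}
$$

which, when expanded at linear order in $\Phi_0^\dagger\Phi_0$ and in $\Phi_1^\dagger\Phi_1$, reproduces our results (4.38) for $C_{1,0}$, $C_{0,1}$ and $C_{1,1}$.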
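The same bookkeeping can be done for the warped case. The sketch below is an added illustration; it assumes that the shifted result is $\frac{c_wk^2}{4\pi^2}\exp\{-2k[T+T^\dagger - \frac{1}{6}\Phi_0^\dagger\Phi_0/M_5^3 - \frac{1}{3}\Phi_1^\dagger\Phi_1/M_5^3]\}$, and the function name is hypothetical. It returns the prefactor of each operator in units of $c_wk^{2+n_0+n_1}e^{-2k(T+T^\dagger)}/\big(\pi^2M_5^{3(n_0+n_1)}\big)$.

```python
from fractions import Fraction
from math import factorial

def warped_prefactor(n0, n1):
    """Prefactor of (Phi0^dag Phi0)^n0 (Phi1^dag Phi1)^n1 in the expanded exponential.
    The exponent contributes k/3 per power of Phi0^dag Phi0 and 2k/3 per power
    of Phi1^dag Phi1 (in units of 1/M5^3)."""
    return (Fraction(1, 4) * Fraction(1, 3) ** n0 * Fraction(2, 3) ** n1
            / (factorial(n0) * factorial(n1)))

for n0, n1 in ((0, 0), (1, 0), (0, 1), (1, 1)):
    print((n0, n1), warped_prefactor(n0, n1))
```

The prefactors $1/4$, $1/12$, $1/6$, $1/18$ are those of $C_{0,0}$, $C_{1,0}$, $C_{0,1}$ and $C_{1,1}$ in the strongly warped limit. The asymmetry between the two branes comes from the different powers of $z_0$ and $z_1$ in $z_0^2z_1^{-4}$.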
In this case, no illuminating argument is available for the effect of a localized kinetic
term. More precisely, in the not very interesting situation in which a kinetic term is located
at the UV brane, its only effect is basically to increase the effective 4D Planck mass, and
therefore all the radiative effects get simply suppressed when it becomes large, without any
sign flip. In the more interesting situation in which a kinetic term is switched on at the IR
brane, the impact on the radiative effects can instead be more interesting and significant,
as we saw, but due to the fact that the latter is limited to stay below a finite critical value,
there is no useful limit in which the problem simplifies. One has then to rely on the exact
computation to understand its physical consequences.
6. Applications
We would now like to investigate the extent to which our result for the radion-mediated
contribution to the scalar soft masses squared can help to cure the problem of tachyonic
sleptons of anomaly mediation. In order to do that, we need to find a model in which the
radius is stabilized, and supersymmetry is broken in such a way that radion mediation dominates over anomaly and brane-to-brane mediation. In this section, we study the effective
description of such a model.
By effective description, we mean that we do not fully specify the sector that breaks
supersymmetry, but parametrize it through a Goldstone supermultiplet X with a linear superpotential. Similarly, the radion is stabilized by some unspecified 5D dynamics (see [29,
35,41,44,58,59] for specific examples) that we shall parametrize through an effective superpotential depending on the radion multiplet. Finally, in order to cancel the cosmological
constant, we also need to add a constant superpotential and tune its coefficient. All these
superpotentials admit microscopic realizations, for instance in terms of gaugino condensations, but we will not discuss them here in any detail. What is instead important for
us is that in such a general situation, there are three important sources of supersymmetry
breaking effects for the visible sector matter and gauge fields, coming from the F terms
of the compensator, the radion and the Goldstone chiral multiplets, respectively $\phi$, $T$ and $X$. These will induce contributions to the soft masses corresponding to anomaly, radion, and brane-to-brane mediation effects. The absence of tachyons requires, necessarily, that the radion contribution be positive and comparable or bigger than the other two. We shall discuss some examples in this section. On the other hand, the extent to which a model
helps solving the supersymmetric flavor problem depends on the separation between the
compactification scale and whatever other fundamental UV scale there is in the theory, for
instance M5 . The bigger this separation is, the more suppressed are higher order effects and
in principle flavor-breaking corrections to soft masses. We will not give a full discussion of this problem in this paper as it would require a careful model by model analysis. We plan to come back to this issue in a forthcoming paper [60].
6.1. Flat case
In Ref. [8], a model of the type discussed above was considered, along the lines of [41]. The radion was stabilized by gaugino condensation in the bulk and on the hidden brane, giving rise to the following low energy Kähler potential and superpotential:

$$
\Omega = -3M_5^3(T+T^\dagger) - 3M_1^2 + X^\dagger X\,Z\big(XX^\dagger/\Lambda_c^2\big), \tag{6.1}
$$
$$
W = \Lambda_a^3\,e^{-n\Lambda_aT} + \Lambda_b^3 + \Lambda_c^2X. \tag{6.2}
$$
We are assuming for simplicity that the wave function $Z$ stabilizes $X$ at the origin $X = 0$ (see for example Ref. [61]). We are also assuming that the hidden sector and the brane kinetic term $M_1^2$ are localized on the same brane. Minimizing the potential, it is found that the radius is stabilized, and we can tune $\Lambda_c$ to cancel the cosmological constant. The orders of magnitude of the different $F$-terms are found to be:

$$
\frac{F_X}{M} \sim (n\Lambda_aT)\,\frac{F_T}{T}. \tag{6.3}
$$
The point here is that the radion-mediated contribution, the only one that can give positive mass squared, is suppressed compared to the two other contributions by a factor of $(n\Lambda_aT)^{-1}$, which is precisely the loop factor $\alpha_G \sim g_5^2/(16\pi^2T)$ for the 5D gauge interactions at the compactification scale. Because of this extra suppression the equality between radion-mediated and anomaly-mediated contributions is obtained for

$$
\alpha_G^2\,\alpha_5 \simeq \Big(\frac{g^2}{16\pi^2}\Big)^2, \tag{6.4}
$$
where $\alpha_5 = 1/(M_5T)^3$ measures the loop expansion parameter for 5D gravity. This relation implies that the radius should be fairly small. Moreover, a second, necessary, requirement is that the radion contribution be positive and dominant with respect to the brane-to-brane one. Adding a localized kinetic term on the hidden brane helps in two ways. It makes the radion-mediated contribution positive while suppressing brane-to-brane effects. Taking into account the extra $O(\alpha_G^2)$ suppression of the radion contribution one finds by inspection of the explicit formulae that a positive mass is obtained for a pretty large brane kinetic term: $M_1^2/(M_5^3T) \sim 1/\alpha_G^2$. In the regime where the bulk gauge theory is perturbative at the compactification scale, the size of the 4D effective Planck scale is fully determined by the boundary, with the bulk playing just the role of a small perturbation. Although peculiar, this is not the real problem of this model. The real problem is represented by Eq. (6.4), which implies that the gauge and gravity expansion parameters are not so small. These parameters control higher order effects. For instance, it is natural to expect flavor violating four-derivative boundary gravitational interactions, suppressed by two extra powers of $M_5$. These would give subleading $O(1/(M_5T)^2) \sim \alpha_5^{2/3}$ flavor-breaking corrections to the radion-mediated effect [8]. Similarly, one expects extra interactions between the boundary matter fields and the bulk gauge fields. At the visible brane, we expect flavor violating couplings of the type $Q_i^\dagger Q_j\,W^\alpha W_\alpha$, and similarly at the hidden brane we expect a coupling of the form $X^\dagger X\,W^\alpha W_\alpha$, suppressed by three powers of some high energy scale. In presence of these
interactions, one-loop exchange of the bulk gauge fields would add to the Kähler potential a flavor-violating brane-to-brane term. If the relevant scale suppressing the couplings were the five-dimensional quantum gravity scale, then these effects would even be larger than the universal radion-mediated term. The presence of localized kinetic terms for the gauge fields would help making this contribution smaller than the universal one, but only at the price of making the bulk gauge theory more strongly coupled. This fact makes this pathway to model building not very appealing, though it may be worth a more detailed analysis. Notice however that the difficulties arise from the extra suppression of the radion $F$-term in the specific model we are considering. This suppression follows directly from the no-scale structure of the Kähler potential (i.e., linearity in $T$) but depends also crucially on the specific mechanism of radion stabilization (the superpotential). Indeed, assuming a general $W(T)$ and the same Kähler potential, the stationarity conditions are easily seen to imply

$$
\frac{F_T}{T} = -F_\phi\,\frac{\partial_TW}{T\,\partial_T^2W}. \tag{6.5}
$$
Now, for a generic superpotential depending on powers of $T$, we would expect the right-hand side to be of order $F_\phi$. Indeed, this is what one finds for instance for $W = a + bT^n$. However, for the purely exponential dependence of Eq. (6.2) there comes an extra $1/T$ suppression in $F_T/T$ with respect to the naively expected result. Notice that for $F_T/T \sim F_\phi$, $\alpha_G$ would drop out from Eq. (6.4) and the expected flavor violation from gravity loops would scale like $\alpha_5^{2/3} \sim (g^2/16\pi^2)^{4/3}$, which is in the interesting range. It would be interesting to look for alternative models of radius stabilization where $F_T/T \sim F_\phi$ and we leave it for future work. In the meantime, we would like to study some basic features of the warped case, which, as we shall see, opens a perhaps more promising direction of investigation.
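The difference between power-law and exponential stabilization can be made concrete with a small numerical comparison. The sketch below is an added illustration; it assumes only that the stationarity condition has the form $|F_T/T| \propto |F_\phi|\,|\partial_TW|/(T\,|\partial_T^2W|)$, up to a sign and an order-one factor, and all parameter values and names are hypothetical.

```python
import math

def stabilization_ratio(dW, d2W, T):
    """|dW/dT| / (T |d^2W/dT^2|): the size of (F_T/T)/F_phi under the assumed relation."""
    return abs(dW(T)) / (T * abs(d2W(T)))

T = 10.0

# Power-law superpotential W = a + b T^n (b and n are illustrative values).
b, n = 1.0, 3
power = stabilization_ratio(lambda t: n * b * t ** (n - 1),
                            lambda t: n * (n - 1) * b * t ** (n - 2), T)

# Exponential superpotential W = A exp(-c T) + const (A and c are illustrative values).
A, c = 1.0, 2.0
expo = stabilization_ratio(lambda t: -c * A * math.exp(-c * t),
                           lambda t: c * c * A * math.exp(-c * t), T)

print(power, expo)
```

The power-law case gives $1/(n-1)$, which is of order one, while the exponential case gives $1/(cT)$, which reproduces the extra suppression by the inverse radius discussed above.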
6.2. Strongly warped case
In the warped case, the situation is similar, but the two possible choices for the locations of the visible and hidden sectors are no longer equivalent, and must be studied separately. Notice also that the radion, as parametrized by $\mu = k\,e^{-kT}$, looks like an ordinary matter field with a canonical Kähler potential. This is not at all surprising in view of the holographic interpretation. Moreover, unlike in a flat geometry, a constant superpotential localized on the IR brane leads to a radion-dependent effective 4D superpotential. We will consider the case where the visible sector is on the UV brane at $z_0$ and the hidden sector on the IR brane at $z_1$, as in this case, localized kinetic terms can make the soft masses squared positive. If the visible sector is put on the IR brane, it is not possible to obtain positive corrections to the soft masses squared from the gravitational sector; therefore we do not study this case further.
Since the matter and gauge fields live on the UV brane, they have a canonically normalized kinetic term and they couple to the ordinary conformal compensator $\phi$. The soft masses are then given by

$$
m_{1/2} = a\,\frac{g^2}{(4\pi)^2}\,|F_\phi|, \tag{6.6}
$$
$$
m_0^2 = b\,\frac{g^4}{(4\pi)^4}\,|F_\phi|^2 - \frac{|\mu|^2}{3\pi^2M_5^3k}\,f_w^{(1,0)}(-\alpha_ik)\,|F_\mu|^2 - \frac{k^2|\mu|^2}{36\pi^2M_5^6}\,f_w^{(1,1)}(-\alpha_ik)\,|F_X|^2. \tag{6.7}
$$

Notice that we canonically normalized the Goldstone multiplet $X$, see Eq. (6.8). The numerical coefficients $a$ and $b$ controlling the anomaly-mediated contributions depend on the quantum numbers of the corresponding particles. The function $f_w$ was calculated in Section 4 (cf. Eq. (4.45)) in the limit of large warping and is a function of the localized kinetic terms. As in the flat case, $f_w^{(1,0)}$ and $f_w^{(1,1)}$ are both positive for $\alpha_0 = 0$ and $\alpha_1 = 0$, but for $\alpha_0 = 0$ and $\alpha_1k > 0.6$, the first becomes negative, whereas the second remains positive. We therefore have the same potentially interesting case as in the flat case for $\alpha_0 \simeq 0$ and $\alpha_1k \sim 1$, although the brane-to-brane-mediated contribution, which stays negative, is a priori not suppressed with respect to the radion-mediated contribution, which can become positive.
We parametrize the effective Kähler potential and superpotential as follows:

$$
\Omega = -3\Big(\frac{M_5^3}{k} + M_0^2\Big)\phi\phi^\dagger + 3\Big(\frac{M_5^3}{k^3} - \frac{M_1^2}{k^2}\Big)\mu\mu^\dagger + X^\dagger X, \tag{6.8}
$$
$$
W = \Lambda^{3-n}\phi^{3-n}\mu^n + \Lambda_0^3\phi^3 + a\mu^3 + bX\mu^2. \tag{6.9}
$$
The superpotential terms can arise from both gaugino condensation [29] and tree-level dynamics, as in the supersymmetric version of the Goldberger–Wise mechanism [44,58].
Compatibly with the AdS/CFT interpretation, the first term can be interpreted as coming
from a deformation by a chiral operator of conformal dimension $n$. Unitarity then requires
$n > 1$. In fact, for the case in which this term arises from gaugino condensation [29] we
have $n \simeq 4\pi^2/(k g_5^2) \gg 1$. Low-energy quantities depend on the localized kinetic terms
through the combinations $M_+^2 = M_5^3/k + M_0^2$ and $M_-^2 = (M_5^3/k - M_1^2)$. The Planck mass
is given by $M^2 = M_+^2 - M_-^2|\omega/k|^2$, or just $M^2 \simeq M_+^2$ for small $\omega$. The extreme case where
$\Lambda_0 = 0$ with $a \neq 0$, that is with a constant superpotential only in the hidden sector, was
analyzed in Ref. [29]. Taking an appropriate choice for the parameters $\Lambda$, $a$ and $b$, there
exists a solution with vanishing cosmological constant and small $\omega/k$. However, we get
in this case that $F_\phi \sim F_\omega/\omega \sim F_X/M$. This implies that the contribution from radion mediation is parametrically smaller than the one from brane-to-brane mediation by a factor
$(\omega/k)^2$. Therefore this case cannot work in the limit of large warping that we want to consider. The opposite extreme case, where $\Lambda_0 \neq 0$ and $a = 0$, that is to say just a constant
superpotential on the Planck brane, can be analyzed similarly. In this case, a solution with
vanishing cosmological constant and small $\omega/k$ exists but only for $n < 1$. Thus we shall
not consider this case further. A more interesting situation can be obtained by considering
the more general regime in which $a$ is non-zero and $\Lambda_0$ is small enough to be negligible in
the radius stabilization dynamics (which is then entirely controlled by the positive matter
terms associated to $\omega$ and $X$). This assumption is consistent with the possibility to tune
$\Lambda_0$ to cancel the cosmological constant: this is the same remark usually made in models
where supersymmetry is dynamically broken at low energy. In such a situation, we can
find a solution with vanishing cosmological constant and small $\omega/k$ by tuning $\Lambda_0^3$ and by
choosing $(\Lambda/k)^{3-n} \gg a$ for $n > 3$ or $(\Lambda/k)^{3-n} \ll a$ for $n < 3$. Defining for convenience
the parameter
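\[ \delta = \frac{b M_-}{k a} , \tag{6.10} \]
we find that the cancellation of the cosmological constant requires
\[ \left(1 - f(\delta)\right)^2 + \frac{\delta^2}{3} = \left(\frac{\Lambda_0}{k}\right)^{6} \frac{M_-^2}{M^2 a^2} \left(\frac{\omega}{k}\right)^{-4} , \tag{6.11} \]
where
\[ f(\delta) = 1 + \frac{n-3}{2(n-1)}\left[\sqrt{1 - \frac{8(n-1)\,\delta^2}{3(n-3)^2}} - 1\right] . \tag{6.12} \]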
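Minimization of the potential yields the following results for $\omega$ and the different $F$-terms:
\[ \omega \simeq \Lambda\left(-\frac{3\,a\,f(\delta)}{n}\right)^{\frac{1}{n-3}} , \tag{6.13} \]
\[ |F_\phi| \simeq \frac{|\Lambda_0|^3}{M^2} , \tag{6.14} \]
\[ |F_\omega| \simeq \frac{M}{M_-}\,\frac{|1 - f(\delta)|}{\sqrt{(1 - f(\delta))^2 + \delta^2/3}}\,\frac{|\Lambda_0|^3}{M^2}\,k , \tag{6.15} \]
\[ |F_X| \simeq \frac{\delta}{\sqrt{(1 - f(\delta))^2 + \delta^2/3}}\,\frac{|\Lambda_0|^3}{M} . \tag{6.16} \]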
The value of the parameter $\delta$ can be arbitrarily chosen between the minimal value $0$ and
the maximal value $\sqrt{3/8\,(n-3)^2/(n-1)}$. Taking $\delta$ and $f(\delta)$ of order one, we get
$F_\phi \sim F_\omega/k \sim F_X/M$. This implies that the contributions to $m_0^2$ in Eq. (6.7) from radion
and brane-to-brane mediation have the same magnitude and can compete with the contribution from anomaly mediation if $\omega/M \sim g^2/(4\pi)^2$. Note that the localized kinetic
term helps in making the radion-mediated contribution dominant compared to the brane-to-brane-mediated one, as we need to take $M_- < M$ in order to make it positive. More
precisely, in order to significantly suppress the latter with respect to the former, one has to
go to the regime in which the localized kinetic term gets close to its critical value. This is
the analogue of a large localized kinetic term in the flat case. However, it is important to
emphasize that in the flat case the suppression comes from the coefficients of the radiatively
induced operators, whereas in the strongly warped case it comes from the scaling of the
$F$-terms.
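The dependence of the $F$-terms on the parameter of Eq. (6.10) can be explored numerically. The following Python sketch is only an illustration under stated assumptions: it takes Eq. (6.12) on the branch with $f(0) = 1$, i.e. $f(\delta) = 1 + \frac{n-3}{2(n-1)}\big[\sqrt{1 - 8(n-1)\delta^2/(3(n-3)^2)} - 1\big]$, and it takes the $\delta$-dependent factors of Eqs. (6.15) and (6.16) to be $|1-f|/\sqrt{(1-f)^2+\delta^2/3}$ and $\delta/\sqrt{(1-f)^2+\delta^2/3}$. The function names are hypothetical labels introduced here.

```python
import math


def delta_max(n: float) -> float:
    """Largest delta for which the square root in f(delta) is real (requires n > 1, n != 3)."""
    return math.sqrt(3.0 / 8.0) * abs(n - 3.0) / math.sqrt(n - 1.0)


def f(delta: float, n: float) -> float:
    """Assumed form of Eq. (6.12), on the branch with f(0) = 1."""
    arg = 1.0 - 8.0 * (n - 1.0) * delta**2 / (3.0 * (n - 3.0) ** 2)
    root = math.sqrt(max(arg, 0.0))  # clamp tiny negative rounding errors at delta_max
    return 1.0 + (n - 3.0) / (2.0 * (n - 1.0)) * (root - 1.0)


def f_term_factors(delta: float, n: float) -> tuple[float, float]:
    """delta-dependent factors assumed in Eqs. (6.15) and (6.16); requires delta > 0."""
    one_minus_f = 1.0 - f(delta, n)
    norm = math.sqrt(one_minus_f**2 + delta**2 / 3.0)
    return abs(one_minus_f) / norm, delta / norm


if __name__ == "__main__":
    n = 5.0  # illustrative conformal dimension, not a value taken from the text
    for delta in (0.1, 0.3, delta_max(n)):
        print(delta, f(delta, n), f_term_factors(delta, n))
```

With these assumptions the two factors satisfy $w_\omega^2 + w_X^2/3 = 1$, so for $\delta$ of order one neither $F_\omega$ nor $F_X$ is parametrically suppressed, in line with the estimate $F_\phi \sim F_\omega/k \sim F_X/M$.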
A more detailed study is needed to determine whether or not the scenario outlined here
can be embedded in a fully viable model. In order to do so we would have to choose a specific mechanism to generate the bulk superpotential (for instance by adding Goldberger–Wise
hypermultiplets in the bulk) and then check that the new sources of soft terms that
are generated in the specific model do not spoil the solution to the supersymmetric flavor
problem. What we have just proven is the absence of obstructions to obtain positive masses
from gravitational loops plus anomaly mediation while having soft terms that scale like
\[ m_0 \sim m_{1/2} \sim \frac{g^2}{16\pi^2}\, m_{3/2} , \tag{6.17} \]
\[ m_{\rm radion} \sim (\omega/k)^{-1}\, m_{3/2} . \tag{6.18} \]
Overall, this scenario seems more promising than the one presented in the flat case, as we do
not have the extra suppression of the radion-mediated contribution.
7. Summary and conclusions

Anomaly mediation is very attractive as it is very model-independent and predictive,
and solves the supersymmetric flavor problem. However, it predicts tachyonic sleptons,
and therefore other contributions to the slepton soft masses are needed. In the context of
sequestered models where the supersymmetry breaking sector and the SM sector are spatially separated in an extra-dimension, gravity loops can potentially provide such additional
contributions. The transmission of supersymmetry breaking through these gravity loops is
finite and calculable, because of its non-local nature. Moreover, and for the same reason, it
does not suffer from the flavor problem of the usual 4D gravity mediation of supersymmetry breaking. Recently, the contribution of gravitational loops to brane and radion-mediated
soft masses was computed in the case of a flat extra-dimension [8,9]. The result was partly
disappointing as the contributions to the scalar soft masses squared were found to be negative in most cases. Fortunately, in Ref. [8] it was found that the presence of a sizable
localized gravitational kinetic term at the hidden brane induces a positive contribution to
the soft masses squared.
Inspired by this result, in this paper we have calculated the effective Kähler potential
in warped sequestered models. We used a superfield technique to perform our calculation.
More precisely, we wrote the linearized bulk supergravity Lagrangian in terms of $N = 1$ superfields. Couplings between brane and bulk fields are then easily written down using the
known $N = 1$ supergravity couplings. We then performed a one-loop supergraph calculation to find the effective Kähler potential valid in the presence of arbitrary localized kinetic
terms for the supergravity fields. This calculation has the same flavor as the calculation of
the Coleman–Weinberg potential and the result is given by

1
d 4 p 4 2
ln p + m
2n ,
1-loop =
(7.1)
4
2
2
(2) n p
where m
n are the masses of the KK gravitons in RS1, with arbitrary kinetic terms on both
branes.
In general, we find that in the absence of localized kinetic terms for the supergravity
multiplet, the gravitational contributions to the soft masses squared are negative, thus worsening the tachyonic slepton problem of anomaly mediation. In the limit of large warping,
they go to zero, consistently with the fact that in this limit the theory becomes conformal.
However, we find that adding a localized kinetic term on the IR brane can make the radion
contribution to the soft masses positive.
We emphasize that our formula is of a general nature. In particular, its generalization to
higher co-dimensions should be easy to explore. This would only require the knowledge of
the KK spectrum of the J = 2 modes as a function of the moduli of the higher-dimensional
space in question, which is a relatively easy task. Also, the embedding of these moduli into
superfields should be known, which should again be easy, if the low energy description of
the model is known. We leave the exploration of this generalization for future work.
We also presented a simple way of understanding the form of the matter-dependent
terms in the effective Kähler potential (radion and brane-to-brane contributions) as a function of the matter-independent one. The relevant observation is that for the low-lying
modes, which dominate our calculation, a shift in the position of the brane is equivalent
to adding a localized kinetic term on the brane. Therefore, the radion-to-brane and brane-to-brane operators can be obtained from the matter-independent effective Kähler potential by
taking derivatives of the latter with respect to the radion.
Finally, we presented the effective description of a model where the radius is stabilized, and supersymmetry is broken in such a way that, parametrically, the radion-mediated
contribution to the soft masses squared can compete with the anomaly-mediated and brane-to-brane contributions, in a region of parameter space where the effective field theory
description is valid. In such a model, phenomenologically viable supersymmetry breaking soft masses could arise from flavor-universal and purely gravitational quantum effects.
Acknowledgements
We thank J.-P. Derendinger, A. Falkowski, M. Luty and F. Zwirner for useful discussions. This research was supported in part by the European Commission through a Marie
Curie fellowship and the RTN contract MRTN-CT-2004-503369.
Appendix A. Component Lagrangian

In this appendix, we describe the component form of the superfield Lagrangian (3.38).
Apart from the terms involving the prepotential $P_T$ for the radion, the calculation is very
similar to the one done in Ref. [36], and we therefore refer the reader to that paper for
more details. The first step is to choose a convenient Wess–Zumino gauge as in Ref. [36], in
which the bosonic components of the superfields are given by the following expressions:
Vn = m h mn + 2 2 dn ,

i
1
= u + v + 2 w + n n v
2
4

i
+ 2 y + n
n u ,
2
P = m m + 2 2 D ,

i
= 2 D m m ,
2

1
PT = T + 2 t + 2 t m mT + 2 2 DT T ,
4

i
1
T = t + 2 DT m mT + i m m t + 2 2 t.
2
4
In components this gives:

1 n 2 1 n 2 1
1
mn
hnm + hmn (p h mn )2 + h
L = e2
m n h
2
2
2
3

1
2 1 mnpq p h mn 2 + 4 dn2 + 2 mnpq p h mn dq
+ (m h)
6
6
3
3
2
1 n
+ w + w + Re n v e y h n
4
1
i
i
+ Im vn Im v n + 2e Im vn y d n + Im v w Im v
w
2
2
2

1
2
2
+ w + n vn 3e t + Re tm n hmn + 2|yn |
2

2
1
i n

T
u u e
y m + 3 m
2 Re yn +
4
4

3
1
e T 2p Im yp e y D + 3 DT T
2
4

1
1
1
2
2
.
m m
DT D m mT m m D
(A.1)
4
3
12
There are two independent sectors: one containing h, t, v, w and d, and one containing
D , DT , m , mT , yn , u . In the former, h, t and v are physical fields, whereas w and d
are auxiliary and must be integrated out. In the latter, none of the fields propagates. After
eliminating all the auxiliary and non-propagating fields, one finally finds the following
Lagrangian:

2
1
1
L = M53 e2 n hnm + hn m hnm + (m h)2 (p hmn )2
2
2

1
1 2
(y h)2 (y hmn )2 (n hmy m hny )2
+ e
2
2
+ 2e m hny y hmn 2e n hny y h + hyy m n hmn hyy h
+ 6e2 2 h2yy 6e hyy n hny + 3e2 hyy y h

1
1
e2 (n Bm m Bn )2 (m By y Bm )2 ,
4
2
(A.2)
where we have defined:

1
1
1
hmn = (h mn + h nm ) mn h,
(A.3)
hyy = Re t;
hmy = Re vm ,
2
3
2

3
3
Bm = e
(A.4)
By =
Im vm ,
Im t.
2
2
This Lagrangian matches the expansion of the full RS1 Lagrangian to quadratic order in
the fluctuations, provided these are parametrized in the following way:
\[ ds^2 = e^{-2\sigma}(\eta_{mn} + h_{mn})\,dx^m dx^n + 2e^{-\sigma} h_{my}\,dx^m dy + (1 + h_{55})\,dy^2 . \tag{A.5} \]
In fact, a better parametrization of the fluctuations, that makes the zero mode manifest, is
defined by:
\[ ds^2 = e^{-2\sigma(1+T)}(\eta_{mn} + h_{mn})\,dx^m dx^n + 2e^{-\sigma} h_{my}\,dx^m dy + (1 + T)^2\,dy^2 . \tag{A.6} \]
It can be checked that T = T (x) and hmn = h mn (x) are zero modes. This fact can be seen
in our quadratic Lagrangian by replacing hmn by h mn (x) h yy (x), hyy by h yy (x), By
by B y (x), setting the odd field hmy and Bm to zero, and integrating over y. We get in this
way:

2kR
2
3 1e
m n h mn
n h nm + h
L = M5
k

1
1
1
2
2
2
+ (m h) (p hmn ) (m By )
2
2
2

+ Re2kR n m h mn h h yy + 3kR h yy h yy .
(A.7)
References
[1] A.H. Chamseddine, R. Arnowitt, P. Nath, Phys. Rev. Lett. 49 (1982) 970;
R. Barbieri, S. Ferrara, C.A. Savoy, Phys. Lett. B 119 (1982) 343;
L.J. Hall, J. Lykken, S. Weinberg, Phys. Rev. D 27 (1983) 2359;
H.P. Nilles, M. Srednicki, D. Wyler, Phys. Lett. B 120 (1983) 346;
L.E. Ibanez, Phys. Lett. B 118 (1982) 73;
N. Ohta, Prog. Theor. Phys. 70 (1983) 542.
[2] M. Dine, A.E. Nelson, Y. Shirman, Phys. Rev. D 51 (1995) 1362, hep-ph/9408384.
[3] G.F. Giudice, R. Rattazzi, Phys. Rep. 322 (1999) 419.
[4] L. Randall, R. Sundrum, Nucl. Phys. B 557 (1999) 79, hep-th/9810155.
[5] G.F. Giudice, M.A. Luty, H. Murayama, R. Rattazzi, JHEP 9812 (1998) 027, hep-ph/9810442.
[6] Z. Chacko, M.A. Luty, I. Maksymyk, E. Ponton, JHEP 0004 (2000) 001, hep-ph/9905390.
[7] T. Gherghetta, A. Riotto, Nucl. Phys. B 623 (2002) 97, hep-th/0110022.
[8] R. Rattazzi, C.A. Scrucca, A. Strumia, Nucl. Phys. B 674 (2003) 171, hep-th/0305184.
[9] I.L. Buchbinder, et al., Phys. Rev. D 70 (2004) 025008, hep-th/0305169.
[10] L. Randall, R. Sundrum, Phys. Rev. Lett. 83 (1999) 3370, hep-ph/9905221.
[11] J.M. Maldacena, Adv. Theor. Math. Phys. 2 (1998) 231, hep-th/9711200.
[12] J. Maldacena, unpublished;
E. Witten, Talk at the ITP conference New Dimensions in Field Theory and String Theory, Santa Barbara,
http://www.itp.ucsb.edu/online/susy_c99/discussion.
[13] S.S. Gubser, Phys. Rev. D 63 (2001) 084017, hep-th/9912001.
[14] H. Verlinde, Nucl. Phys. B 580 (2000) 264, hep-th/9906182.
[15] N. Arkani-Hamed, M. Porrati, L. Randall, JHEP 0108 (2001) 017, hep-th/0012148.
[16] R. Rattazzi, A. Zaffaroni, JHEP 0104 (2001) 021, hep-th/0012248.
[17] M. Perez-Victoria, JHEP 0105 (2001) 064, hep-th/0105048.
[18] M.A. Luty, M. Porrati, R. Rattazzi, JHEP 0309 (2003) 029, hep-th/0303116.
[19] M. Zucker, Nucl. Phys. B 570 (2000) 267, hep-th/9907082.
[20] M. Zucker, JHEP 0008 (2000) 016, hep-th/9909144.
[21] M. Zucker, Phys. Rev. D 64 (2001) 024024, hep-th/0009083.
[22] T. Gherghetta, A. Pomarol, Nucl. Phys. B 586 (2000) 141, hep-ph/0003129.
[23] A. Falkowski, Z. Lalak, S. Pokorski, Phys. Lett. B 491 (2000) 172, hep-th/0004093.
[24] E. Bergshoeff, R. Kallosh, A. Van Proeyen, JHEP 0010 (2000) 033, hep-th/0007044.
[25] R. Altendorfer, J. Bagger, D. Nemeschansky, Phys. Rev. D 63 (2001) 125025, hep-th/0003117.
[26] J. Bagger, D.V. Belyaev, Phys. Rev. D 67 (2003) 025004, hep-th/0206024.
[27] Z. Lalak, R. Matyszkiewicz, Phys. Lett. B 562 (2003) 347, hep-th/0303227.
[28] M. Zucker, Fortschr. Phys. 51 (2003) 899.
[29] M.A. Luty, R. Sundrum, Phys. Rev. D 64 (2001) 065012, hep-th/0012158.
[30] J. Bagger, M. Redi, JHEP 0404 (2004) 031, hep-th/0312220.
[31] J. Bagger, M. Redi, Phys. Lett. B 582 (2004) 117, hep-th/0310086.
[32] J.-P. Derendinger, C. Kounnas, F. Zwirner, Nucl. Phys. B 691 (2004) 233, hep-th/0403043.
[33] N. Marcus, A. Sagnotti, W. Siegel, Nucl. Phys. B 224 (1983) 159.
[34] N. Arkani-Hamed, T. Gregoire, J. Wacker, JHEP 0203 (2002) 055, hep-th/0101233.
[35] N. Arkani-Hamed, L.J. Hall, D.R. Smith, N. Weiner, Phys. Rev. D 63 (2001) 056003, hep-ph/9911421.
[36] W.D. Linch, M.A. Luty, J. Phillips, Phys. Rev. D 68 (2003) 025008, hep-th/0209060.
[37] E.A. Mirabelli, M.E. Peskin, Phys. Rev. D 58 (1998) 065002, hep-th/9712214.
[38] S.J.J. Gates, S.M. Kuzenko, J. Phillips, Phys. Lett. B 576 (2003) 97, hep-th/0306288.
[39] S.J. Gates, M.T. Grisaru, M. Rocek, W. Siegel, Frontiers Phys. 58 (1983) 1, hep-th/0108200.
[40] T. Gregoire, M.D. Schwartz, Y. Shadmi, JHEP 0407 (2004) 029, hep-th/0403224.
[41] M.A. Luty, R. Sundrum, Phys. Rev. D 62 (2000) 035008, hep-th/9910202.
[42] D. Marti, A. Pomarol, Phys. Rev. D 64 (2001) 105025, hep-th/0106256.
[43] T. Hirayama, K. Yoshioka, JHEP 0401 (2004) 032, hep-th/0311233.
[44] W.D. Goldberger, M.B. Wise, Phys. Lett. B 475 (2000) 275, hep-ph/9911457.
[45] J. Bagger, D. Nemeschansky, R.-J. Zhang, JHEP 0108 (2001) 057, hep-th/0012163.
[46] C. Charmousis, R. Gregory, V.A. Rubakov, Phys. Rev. D 62 (2000) 067505, hep-th/9912160.
[47] S.R. Coleman, E. Weinberg, Phys. Rev. D 7 (1973) 1888.
[48] M.T. Grisaru, M. Rocek, R. von Unge, Phys. Lett. B 383 (1996) 415, hep-th/9605149.
[49] T. Gherghetta, A. Pomarol, Nucl. Phys. B 602 (2001) 3, hep-ph/0012378.
[50] L. Randall, M.D. Schwartz, JHEP 0111 (2001) 003, hep-th/0108114.
[51] A. Lewandowski, M.J. May, R. Sundrum, Phys. Rev. D 67 (2003) 024036, hep-th/0209050.
[52] E. Ponton, E. Poppitz, JHEP 0106 (2001) 019, hep-ph/0105021.
[53] J. Garriga, O. Pujolas, T. Tanaka, Nucl. Phys. B 605 (2001) 192, hep-th/0004109.
[54] J. Garriga, A. Pomarol, Phys. Lett. B 560 (2003) 91, hep-th/0212227.
[55] A.A. Saharian, M.R. Setare, Phys. Lett. B 552 (2003) 119, hep-th/0207138.
[56] W.D. Goldberger, I.Z. Rothstein, Phys. Lett. B 491 (2000) 339, hep-th/0007065.
[57] I. Brevik, K.A. Milton, S. Nojiri, S.D. Odintsov, Nucl. Phys. B 599 (2001) 305, hep-th/0010205.
[58] H.-S. Goh, M.A. Luty, S.-P. Ng, hep-th/0309103.
[59] N. Maru, N. Okada, Phys. Rev. D 70 (2004) 025002, hep-th/0312148.
[60] T. Gregoire, R. Rattazzi, C. Scrucca, in preparation.
[61] K.I. Izawa, Y. Nomura, K. Tobe, T. Yanagida, Phys. Rev. D 56 (1997) 2886, hep-ph/9705228.
Split supersymmetry from anomalous U (1)

K.S. Babu a , Ts. Enkhbat a , Biswarup Mukhopadhyaya b
a Department of Physics, Oklahoma State University, Stillwater, OK 74078, USA
b Harish-Chandra Research Institute, Chhatnag Road, Jhusi, Allahabad 211 019, India
Received 21 February 2005; received in revised form 22 April 2005; accepted 19 May 2005
Available online 2 June 2005
Abstract
We present a scenario wherein the anomalous U(1) D-term of string origin triggers supersymmetry breaking and naturally generates a split supersymmetry spectrum. When the gaugino and the
higgsino masses (which are of the same order of magnitude) are set at the TeV scale, we find the
scalar masses to be in the range $(10^6$–$10^8)$ GeV. The U(1) D-term provides a small expansion parameter which we use to explain the mass and mixing hierarchies of quarks and leptons. Explicit
models utilizing exact results of $N = 1$ supersymmetric gauge theories consistent with anomaly constraints, fermion mass hierarchy, and supersymmetry breaking are presented.
E-mail addresses: babu@okstate.edu (K.S. Babu), enkhbat@okstate.edu (Ts. Enkhbat),
biswarup@mri.ernet.in (B. Mukhopadhyaya).
doi:10.1016/j.nuclphysb.2005.05.006
K.S. Babu et al. / Nuclear Physics B 720 (2005) 47–63
1. Introduction
It is widely believed that supersymmetry may be relevant to Nature. There are four major observations which may justify this belief: (i) Supersymmetry (SUSY) can stabilize
scales associated with spontaneous symmetry breaking. (ii) Unification of gauge couplings
works well in the minimal SUSY extension of the Standard Model (SM). (iii) SUSY provides a natural candidate for cold dark matter. (iv) Supersymmetry is a necessary ingredient
of superstring theory, which may eventually lead to a consistent quantum theory of gravity.
Among these, reasoning (i), when applied to stabilize the electroweak scale, would suggest
that all superpartners of the SM particles must have masses below or around a TeV. This is
indeed what was assumed in almost all applications of supersymmetry to particle physics
in the past twenty-five years. The second and third observations above would only require
that a subset of superpartners be lighter than a TeV, while the last one allows SUSY to be
broken anywhere below the Planck scale, $M_{Pl} = 2.4 \times 10^{18}$ GeV. This is because, among
the superpartners, if the split members of a unifying group (SU(5), SO(10), etc.), namely
the gauginos and the higgsinos, are lighter than a TeV, while the complete multiplets (the
scalar partners of SM fermions) are much heavier, unification of gauge couplings would
work just as well. The lightest of these SUSY particles would still be a natural candidate
for cold dark matter.
A scenario dubbed as split supersymmetry, in which the spin 1/2 superparticles,
namely, the gauginos and the higgsinos, have masses of order TeV while the spin zero
superparticles (squarks and sleptons) are much heavier, has recently been advocated [1].
This scenario gives up the conventionally employed naturalness criterion, since the light
SM Higgs boson is realized only by fine-tuning. Such a finely tuned scenario, it is argued,
may not be as improbable as originally thought [1]. This is because in any theory with
broken SUSY one has to cope with another, even more severe, fine-tuning, in the value
of the cosmological constant. A cosmic selection rule, an anthropic principle [2], may be
active in this case. If so, a similar argument may also explain why the SM Higgs boson
is light [3]. Supersymmetry plays no role in solving the hierarchy problem here. Recent
realization of a string landscape [4], which suggests the existence of a multitude of string
vacua, may justify this approach. Probabilistically, the chances of finding a vacuum with a
light SM Higgs (along with a small cosmological constant) may not be infinitesimal, given
the existence of a large number of string vacua [5].
Split supersymmetry has a manifest advantage over TeV scale supersymmetry: unacceptably large flavor changing neutral current (FCNC) processes [6], fermion electric
dipole moments, and d = 5 proton decay rate, which generically plague TeV scale SUSY
are automatically absent in split supersymmetry. Various aspects of this scenario have been
analyzed by a number of authors [7,8].
In this paper we take the split supersymmetry scenario from a theoretical point of view.
Perhaps the most important question in this context is a natural realization of the split spectrum. Although it may be argued that R-symmetries would protect masses of the spin 1/2
SUSY fermions and not of the squarks and sleptons, in any specific scenario for SUSY
breaking there is very little freedom in choosing the relative magnitudes of the two masses.
We will focus on SUSY breaking triggered by the anomalous U (1) D-term of string origin coupled to a SUSY QCD sector [9]. Each sector treated separately would preserve
supersymmetry, but their cross coupling breaks it. We make extensive use of exact results
known for N = 1 SUSY QCD [10]. In this scenario, the squarks and sleptons receive SUSY
breaking masses at the leading order from the anomalous U (1) D-term, while the gauginos
acquire masses only at higher order. The higgsino mass also arises at higher order and is
similar in magnitude to the gaugino mass. Thus, a naturally split spectrum is realized. The
anomalous U(1) D-term also provides a small expansion parameter which we use to explain the mass and mixing hierarchies of quarks and leptons. We present complete models
which are consistent with anomaly cancellation, and which lead to a naturally split SUSY
spectrum.¹ We note that with flavor-dependent charges, the anomalous U(1) D-term contributions to the squark and slepton masses generically lead to large FCNC processes with
sub-TeV scalars [12]; this problem is absent in the split supersymmetry scenario.
2. Supersymmetry breaking by anomalous U (1) and gaugino condensation

In this section we review supersymmetry breaking induced by the D-term of the anomalous
$U(1)$ symmetry [9,13] coupled to the strong dynamics of $N = 1$ SUSY gauge theory [10].
Each sector separately preserves supersymmetry, so an expansion parameter (the cross
coupling) is available. Exact results of supersymmetric gauge theories can then be applied.
Here we focus on the global supersymmetric limit; in Section 2.1 we extend the analysis
to supergravity. In addition to the SM fields, these models contain an $SU(N_c)$ gauge sector
with $N_f$ flavors. The quark ($Q$) and antiquark ($\bar Q$) fields of the $SU(N_c)$ sector are also
charged under the $U(1)_A$. $U(1)_A$ is broken by a SM singlet field $S$ carrying $U(1)_A$ charge
of $-1$. The Standard Model fields carry flavor-dependent $U(1)_A$ charges so that the hierarchy in fermion masses and mixings is naturally explained. A small expansion parameter
$\epsilon \sim 0.2$ is provided by the ratio $\epsilon = \langle S\rangle/M_{Pl}$ by the induced Fayet–Iliopoulos D-term
for the $U(1)_A$. To see this, we recall that the apparent anomalies in $U(1)_A$ are canceled by
the Green–Schwarz (GS) mechanism [14]. Heterotic superstring theory when compactified
to four dimensions contains the Lagrangian terms $L \sim \varphi(x)\sum_i k_i F_i^2 + i\,\eta(x)\sum_i k_i F_i\tilde F_i$,
where $k_i$ are the Kac–Moody levels, $\varphi(x)$ is the dilaton field and $\eta(x)$ is its axionic partner. The GS mechanism makes use of the transformation $\eta(x) \to \eta(x) - \theta(x)\,\delta_{GS}$, and the
gauge variation for the $U(1)_A$ gauge field, $V_\mu \to V_\mu + \partial_\mu\theta(x)$. The anomalies are canceled
if the following conditions are satisfied:
\[ \frac{A_i}{k_i} = \frac{A_N}{k_N} = \frac{A_A}{3k_A} = \frac{A_{\rm gravity}}{24} = \delta_{GS} , \tag{1} \]
where $A_i$ ($i = 1, 2, 3$), $A_N$, $A_A$ and $A_{\rm gravity}$ are the anomaly coefficients for SM$^2 \times U(1)_A$,
$SU(N_c)^2 \times U(1)_A$, $U(1)_A^3$ and gravity$^2 \times U(1)_A$. Here $A_{\rm gravity}$ is the gravitational anomaly,
given by the sum of the anomalous charges of all fields in the theory. All other anomalies
must vanish. These conditions put severe restrictions on the choice of $U(1)_A$ charges.
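The conditions of Eq. (1) amount to requiring that a handful of ratios coincide, which is easy to test for a candidate charge assignment. The following Python sketch assumes the reading $A_i/k_i = A_N/k_N = A_A/(3k_A) = A_{\rm gravity}/24 = \delta_{GS}$ and uses exact rational arithmetic. The function name and the sample numbers are hypothetical and are not taken from any model in the text.

```python
from fractions import Fraction


def green_schwarz_delta(a_sm, k_sm, a_n, k_n, a_a, k_a, a_gravity):
    """Return delta_GS if all ratios in Eq. (1) agree, otherwise None.

    a_sm, k_sm: anomaly coefficients and Kac-Moody levels of the three SM factors.
    All inputs must be ints or Fractions so that the comparison is exact.
    """
    ratios = [Fraction(a, k) for a, k in zip(a_sm, k_sm)]
    ratios.append(Fraction(a_n, k_n))          # SU(N_c)^2 x U(1)_A
    ratios.append(Fraction(a_a, 3 * k_a))      # U(1)_A^3
    ratios.append(Fraction(a_gravity, 24))     # gravity^2 x U(1)_A
    first = ratios[0]
    return first if all(r == first for r in ratios) else None


if __name__ == "__main__":
    # Purely illustrative numbers chosen so that every ratio equals 2.
    print(green_schwarz_delta([2, 2, 2], [1, 1, 1], 2, 1, 6, 1, 48))
    # Changing a single coefficient breaks the cancellation.
    print(green_schwarz_delta([2, 2, 3], [1, 1, 1], 2, 1, 6, 1, 48))
```

Exact fractions are used rather than floating-point numbers because the cancellation is an equality between rational combinations of charges.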
String loop effects induce a nonzero Fayet–Iliopoulos D-term for the $U(1)_A$ given by
[15,16]
\[ \xi = \frac{g_{st}^2 M_{Pl}^2}{192\pi^2}\, A_{\rm gravity} , \tag{2} \]
where $g_{st}$ is the string coupling at the unification scale $M_{Pl}$, related to the SM gauge couplings at that scale as
\[ k_i g_i^2 = 2 g_{st}^2 . \tag{3} \]
1 A somewhat similar analysis has recently been carried out in Ref. [11], our approach is different in that we
present complete models without assuming a hidden sector and address the fermion masses and mixing hierarchy
problems. Our spectrum is also quite different, especially as regards the gravitino mass.
The scalar potential receives a contribution from the D-term given by
\[ V_D = \frac{1}{2} D_A^2 = \frac{g_A^2}{2}\Big( -|S|^2 + q_Q |Q_i|^2 + q_{\bar Q} |\bar Q^i|^2 + \sum_i q_i |\phi_i|^2 + \xi \Big)^2 . \tag{4} \]
Here $S$ is the flavon field with charge $-1$, $Q_i$ and $\bar Q^i$ are the quark and antiquark
fields belonging to the fundamental and antifundamental representations of an $SU(N_c)$
gauge group with $U(1)_A$ charges $q_Q$ and $q_{\bar Q}$. The $\phi_i$ in Eq. (4) stand for all the other fields, and
include the SM sector.
In our models, all fields except $S$ will have positive $U(1)_A$ charges, so $\xi$ will turn out
to be positive. The potential of Eq. (4) will minimize to preserve supersymmetry by giving
the negatively charged $S$ field a vacuum expectation value (VEV), which would break the
$U(1)_A$ symmetry. To zeroth order in SUSY breaking parameters, $\langle S\rangle = S_0$, where
\[ S_0 = \sqrt{\frac{g_{st}^2 A_{\rm gravity}}{192\pi^2}}\; M_{Pl} \equiv \epsilon\, M_{Pl} . \tag{5} \]
Here $\epsilon \sim 0.2$ provides a small expansion parameter to explain the hierarchy of quark
and lepton masses and mixings.
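To get a feeling for the numbers involved in Eqs. (2) and (5), the following Python sketch evaluates the expansion parameter $\langle S\rangle/M_{Pl}$ for a given string coupling and gravitational anomaly. It assumes that Eq. (5) reads $S_0 = \sqrt{g_{st}^2 A_{\rm gravity}/(192\pi^2)}\,M_{Pl}$; the function names and the sample input are hypothetical.

```python
import math


def expansion_parameter(g_st_sq: float, a_gravity: float) -> float:
    """S_0 / M_Pl from the assumed form of Eq. (5)."""
    return math.sqrt(g_st_sq * a_gravity / (192.0 * math.pi**2))


def a_gravity_for(target: float, g_st_sq: float) -> float:
    """Gravitational anomaly needed to reach a target value of S_0 / M_Pl."""
    return target**2 * 192.0 * math.pi**2 / g_st_sq


if __name__ == "__main__":
    g_st_sq = 0.5  # illustrative value of g_st^2, not quoted in the text
    needed = a_gravity_for(0.2, g_st_sq)
    print(f"A_gravity needed for S_0/M_Pl = 0.2: {needed:.1f}")
    print(f"check: {expansion_parameter(g_st_sq, needed):.3f}")
```

The factor $192\pi^2 \approx 1.9 \times 10^3$ in the denominator means that a fairly large sum of charges is needed to reach a ratio of order $0.2$.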
As for the N = 1 SUSY QCD sector, we consider the gauge group SU(Nc ) with Nf
flavors of quarks and antiquarks, and apply the well-known exact results of Ref. [10]. For
concreteness we choose Nf < Nc . These results have been applied to TeV scale SUSY
breaking by Binetruy and Dudas in Ref. [9] in the presence of anomalous U (1) symmetry. These models actually lead to a split supersymmetry spectrum, as we will show. We
also generalize the results of Ref. [9] to include supergravity corrections (in Section 2.1).
In Section 3, we apply these results to explicit and complete models.
The effective superpotential we consider has two pieces
\[ W_{\rm eff} = W_{\rm tree} + W_{\rm dynamical} , \tag{6} \]
where $W_{\rm tree}$ is the tree-level superpotential, while $W_{\rm dynamical}$ is induced dynamically by
nonperturbative effects. Since the $Q$ and the $\bar Q$ fields are charged under $U(1)_A$, a bare
mass term connecting them is not allowed. A mass term will arise through the coupling
\[ W_{\rm tree} = \frac{\mathrm{Tr}(\lambda Q \bar Q)\, S^n}{M^{n-1}} \tag{7} \]
when $\langle S\rangle = S_0$ is inserted. Here the trace is taken over the $N_f$ flavor indices of the $Q_i$ and
$\bar Q^i$ fields. $M$ is a mass scale at which this term is induced. The most natural value of $M$ is
$M_{Pl}$, which is what we will use for our numerical analysis, but we allow $M$ to be different
from $M_{Pl}$ for generality. We have used the definition
\[ n = q_Q + q_{\bar Q} \tag{8} \]
for the sum of the $U(1)_A$ charges of $Q$ and $\bar Q$. As we will see later, the choice $n = 1$,
which would correspond to a renormalizable superpotential, will be phenomenologically
unacceptable. From the results of Ref. [10], the dynamically generated superpotential is
known to be (for $N_f < N_c$)
\[ W_{\rm dynamical} = (N_c - N_f)\left[\frac{\Lambda^{3N_c - N_f}}{\det(Q\bar Q)}\right]^{1/(N_c - N_f)} . \tag{9} \]
Here $\Lambda$ is the dynamically induced scale below which the $SU(N_c)$ sector becomes strongly
interacting:
\[ \Lambda \sim M_{Pl}\, e^{-\frac{2\pi}{\alpha_{N_c}(3N_c - N_f)}} , \tag{10} \]
where $\alpha_{N_c}$ is the $SU(N_c)$ gauge coupling constant at $M_{Pl}$. For $N_f = N_c - 1$, the gauge
symmetry is completely broken, and Eq. (9) is induced by instantons. For $N_f < N_c - 1$, the
gauge symmetry is reduced to $SU(N_c - N_f)$ and the gaugino condensate of this symmetry
induces Eq. (9).
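The exponential sensitivity of the dynamical scale to the gauge coupling can be seen with a short numerical sketch. The Python code below assumes that Eq. (10) has the standard one-loop form $\Lambda = M_{Pl}\exp[-2\pi/(\alpha_{N_c}(3N_c - N_f))]$, with $M_{Pl} = 2.4 \times 10^{18}$ GeV as quoted in the Introduction. The function name and the sample couplings are hypothetical.

```python
import math

M_PL_GEV = 2.4e18  # reduced Planck mass quoted in the Introduction


def dynamical_scale(alpha: float, n_c: int, n_f: int, m_pl: float = M_PL_GEV) -> float:
    """Assumed one-loop form of Eq. (10): Lambda = M_Pl * exp(-2*pi / (alpha * (3*N_c - N_f)))."""
    b = 3 * n_c - n_f  # one-loop beta-function coefficient of SUSY QCD
    if b <= 0:
        raise ValueError("the sector is not asymptotically free for 3*N_c <= N_f")
    return m_pl * math.exp(-2.0 * math.pi / (alpha * b))


if __name__ == "__main__":
    # Illustrative inputs only: SU(5) with 2 flavors and a few couplings at M_Pl.
    for alpha in (1 / 30, 1 / 25, 1 / 20):
        print(f"alpha = {alpha:.4f}  ->  Lambda = {dynamical_scale(alpha, 5, 2):.3e} GeV")
```

Modest changes in $\alpha_{N_c}$ move $\Lambda$ by orders of magnitude, which is what allows a large hierarchy between $\Lambda$ and $M_{Pl}$ without small input parameters.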
Below the scale $\Lambda$ the effective theory can be described in terms of $N_f \times N_f$ mesons
$Z^i_j$:
\[ Z^i_j = Q_j \bar Q^i \qquad (i, j = 1, \ldots, N_f). \tag{11} \]
Neglecting small supersymmetry breaking effects, we can describe the theory below $\Lambda$
along the D-flat directions $Q_i = \bar Q^i$ in terms of the $Z$ fields. We can make the following
replacements in the D-term and the superpotential: $q_Q|Q_i|^2 + q_{\bar Q}|\bar Q^i|^2 \to n\,\mathrm{Tr}(Z^\dagger Z)^{1/2}$
and $Q_j \bar Q^i \to Z^i_j$. We use the notation
\[ m = \lambda\,\frac{S_0^n}{M^{n-1}} \tag{12} \]
with $m$ identified as the mass matrix of the $Z$ field (up to small supersymmetry breaking
effects). Then the $F$-term for the $Z$ fields, defined as $(F_Z)^i_j = 2[(Z^\dagger Z)^{1/2}]^i_k\, \partial W/\partial Z^k_j$, is
found to be
\[ (F_Z)^i_j = 2\left[ (Z^\dagger Z)^{1/2} \left( m - \left(\frac{\Lambda^{3N_c - N_f}}{\det(Z)}\right)^{1/(N_c - N_f)} \frac{1}{Z} \right)^{T} \right]^i_{\ j} . \tag{13} \]
This theory preserves supersymmetry, as $F_Z = 0$ can be realized with $\langle Z\rangle \neq 0$ given by
\[ (Z_0)^i_j = \left[ \det(m)\, \Lambda^{3N_c - N_f} \right]^{1/N_c} \left(\frac{1}{m}\right)^i_{\ j} . \tag{14} \]
Note that this result holds only in the presence of a nonvanishing VEV $\langle S\rangle$, so that $m$ is
nonzero.
So far we treated the U (1)A D-term and the ensuing superpotential for the Z fields
separately. The two sectors are however coupled through Wtree of Eq. (7). Owing to this
coupling, supersymmetry is actually broken. This is evident by examining the F -term of
the S field,
\[ F_S = n\,\frac{\mathrm{Tr}(m Z_0)}{S_0} \neq 0 . \tag{15} \]
Similarly FZ is also nonzero. The VEVs of S and Z fields will shift from the supersymmetry preserving values of Eqs. (5) and (14) when the full potential is minimized jointly.
To find the soft SUSY breaking parameters we need to calculate these corrections.
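The scalar potential of the model in the global limit is given by
\[ V = |F_S|^2 + \frac{1}{2}\,\mathrm{Tr}\left[ F_Z^\dagger (Z^\dagger Z)^{-1/2} F_Z \right] + \frac{1}{2} D_A^2 . \tag{16} \]
We expand the fields around the SUSY preserving minima:
\[ S = S_0 + \delta S, \qquad Z^i_j = (Z_0 + \delta Z)^i_j \tag{17} \]
with $\delta S/S_0 \ll 1$, $\delta Z/Z_0 \ll 1$. For simplicity we assume the coupling matrix $\lambda$ to be an
identity matrix, $\lambda^i_j = \delta^i_j$, in which case $Z^j_i = Z\,\delta^j_i$ can be chosen. The VEV $\langle Z\rangle = Z_0$
arising from Eq. (14) in this case becomes
\[ Z_0 = \frac{\Lambda^3}{m}\left(\frac{m}{\Lambda}\right)^{N_f/N_c} . \tag{18} \]
We make an expansion in the supersymmetry breaking parameter $\tilde\epsilon$ defined as
\[ \tilde\epsilon \equiv Z_0/S_0^2 = \frac{\Lambda^3}{m S_0^2}\left(\frac{m}{\Lambda}\right)^{N_f/N_c} \ll 1 . \tag{19} \]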
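The step from Eq. (14) to Eq. (18) uses only $\det(m) = m^{N_f}$ for a mass matrix proportional to the identity. The Python sketch below checks this reduction numerically and evaluates the ratio $Z_0/S_0^2$ of Eq. (19). It assumes the readings $(Z_0)^i_j = [\det(m)\Lambda^{3N_c-N_f}]^{1/N_c}(1/m)^i_j$ and $Z_0 = (\Lambda^3/m)(m/\Lambda)^{N_f/N_c}$; the function names and the sample inputs are hypothetical.

```python
import math


def meson_vev_general(m: float, scale: float, n_c: int, n_f: int) -> float:
    """Eq. (14) for a mass matrix m times the N_f x N_f identity, so det(m) = m**N_f."""
    det_m = m**n_f
    return (det_m * scale ** (3 * n_c - n_f)) ** (1.0 / n_c) / m


def meson_vev_closed_form(m: float, scale: float, n_c: int, n_f: int) -> float:
    """Assumed closed form of Eq. (18): Z_0 = (Lambda^3 / m) * (m / Lambda)^(N_f / N_c)."""
    return scale**3 / m * (m / scale) ** (n_f / n_c)


def susy_breaking_ratio(m: float, scale: float, s0: float, n_c: int, n_f: int) -> float:
    """The ratio Z_0 / S_0^2 of Eq. (19), which must be small for the expansion to hold."""
    return meson_vev_closed_form(m, scale, n_c, n_f) / s0**2


if __name__ == "__main__":
    # Illustrative inputs in arbitrary units with S_0 = 1.
    m, scale, s0, n_c, n_f = 1e-3, 1e-4, 1.0, 5, 2
    print(meson_vev_general(m, scale, n_c, n_f), meson_vev_closed_form(m, scale, n_c, n_f))
    print(susy_breaking_ratio(m, scale, s0, n_c, n_f))
```

Both expressions for $Z_0$ agree, and the ratio $Z_0/S_0^2$ is tiny whenever $\Lambda \ll S_0$, which is the regime in which the expansion is used.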
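From the minimization of the scalar potential with respect to these shifted fields, we
find
\[ \langle S^\dagger S\rangle = S_0^2\left\{ 1 + \tilde\epsilon\,(nN_f) - \tilde\epsilon^2\,\frac{n^2 N_f^2}{2N_c^2 g_A^2}\left[ g_A^2\, n (N_c - N_f)(2N_c - N_f) - 2N_c(N_c - nN_f)\frac{m^2}{S_0^2} \right] + O(\tilde\epsilon^3) \right\} , \]
\[ Z = Z_0\left[ 1 - \tilde\epsilon\,\frac{n^2 N_f (N_c - N_f)(2N_c - N_f)}{2N_c^2} + O(\tilde\epsilon^2) \right] . \tag{20} \]
This agrees with the results of Ref. [9], except that there are two apparent typos in
Eq. (2.22) of that paper.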
Now the F - and the D-terms are given by

nNf (Nc Nf )(2Nc Nf )
nNf
,
n1+
FS = mS0 (nNf ) 1 +
2
Nc2

Nf

FZ = mZ0 n2 Nf
1 ,
Nc

2 2
2 nNf
1 /gA .
DA = m (nNf )
(21)
Nc
Consequently, the scalar soft masses induced from the D-term of anomalous U (1) are
m2f = qfi m20 ,
i
(22)
where
m20
= m (nNf )
2

nNf
1 .
Nc
53
(23)
There is a simple interpretation of these results in terms of the gaugino condensate (for
Nf < Nc 1), which is given by [17]
Nf /Nc

m
= e2ik/(Nc Nf ) 3
,
k = 1 (Nc Nf ).
(24)
The soft scalar masses are simply proportional to the gaugino condensate. We will make
use of these results in Section 3. Note that had we chosen n = 1 these results would have
led to negative squared masses for scalars. Note also that the D-term contributions are
proportional to the U (1)A charges, so they are zero for particles with zero charge.
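The sign statement above can be checked with a few lines of arithmetic. The sketch below assumes that the bracketed factor in Eq. (23) reads $nN_f/N_c - 1$ (the minus sign is not visible in the extracted text) and uses the $(n, N_f, N_c)$ combinations listed in Table 2; the function name is illustrative only.

```python
from fractions import Fraction


def soft_mass_sign_factor(n, n_f, n_c):
    """Return n*N_f/N_c - 1, the factor assumed to fix the sign of m_0^2."""
    return Fraction(n * n_f, n_c) - 1


# (n, N_f, N_c) combinations taken from Table 2 of the text
cases = [(5, 5, 6), (5, 5, 7), (6, 4, 6), (6, 4, 7), (7, 3, 6), (7, 3, 7), (6, 3, 6)]
factors = {case: soft_mass_sign_factor(*case) for case in cases}

for case, factor in factors.items():
    sign = "positive" if factor > 0 else "negative"
    print(case, factor, sign)

# Illustrative n = 1 choice with N_f < N_c: the factor changes sign
n1_factor = soft_mass_sign_factor(1, 5, 6)
print("n = 1, N_f = 5, N_c = 6:", n1_factor)
```

Under this assumption every case of Table 2 gives a positive factor, while n = 1 with $N_f < N_c$ gives a negative one, in line with the remark about negative squared masses.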
2.1. Gravity corrections to the soft parameters
In this section we work out the supergravity corrections to the soft parameters found
in the global SUSY limit in the previous section. Our reasons for this extension are twofold. First, we wish to show explicitly that supergravity corrections do not destabilize the
minimum of the potential that we found in the global limit. Second, the main contribution
to the masses of scalars with zero U (1) charge will arise from supergravity corrections. In
our explicit models, we do have particles with zero charge.
It is conventional in supergravity to add a constant term to the superpotential in order to
fine-tune the cosmological constant to zero:
W = Wglobal + .
(25)
We separate the constant into two parts, = 0 + 1 , such that 0 cancels the leading
part of the superpotential in which case W = 1 . The F -term contribution to the scalar
potential in supergravity is given by

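\[ V_F = M_{Pl}^4\, e^{G}\left[G_i\,(G^{-1})^i_j\,G^j - 3\right], \tag{26} \]
where
\[ G_i \equiv \partial G/\partial \phi^i, \qquad G^i \equiv \partial G/\partial \phi^*_i, \qquad G^i_j \equiv \partial^2 G/\partial \phi^*_i\, \partial \phi^j. \tag{27} \]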
We will assume for illustration the minimal form of the Kähler potential. In our model it is given by
\[ G = \frac{|S|^2}{M_{Pl}^2} + 2\,\frac{\mathrm{Tr}(Z^\dagger Z)^{1/2}}{M_{Pl}^2} + \sum_i \frac{|\phi_i|^2}{M_{Pl}^2} + \ln\frac{|W|^2}{M_{Pl}^6}. \tag{28} \]
Then the scalar potential is given by
V = VF + VD ,
(29)
with

2

W

VF = e
F S + S M 2
Pl

1
W 1/2
W
+ Tr FZ + Z 2 Z Z
FZ + Z
2
2
MPl
MPl

2

2
F + W 3 |W | ,
+
i
i
2
2
MPl
MPl
i

2
|S|2 +2 Tr(Z Z)1/2 + i |i |2 /MPl
(30)
and
\[ V_D = \frac{g^2}{2}\left[G_i\,(T_a)^i_j\,\phi^j\right]^2 M_{Pl}^4. \tag{31} \]
In our case $G_i = \phi^*_i/M_{Pl}^2 + (\partial W/\partial \phi^i)/W$, so $M_{Pl}^2\, G_i (T_a)^i_j \phi^j = \phi^*_i (T_a)^i_j \phi^j$, which is identical to the D-term of global supersymmetry (note that the term $(\partial W/\partial \phi^i)(T_a)^i_j \phi^j$ vanishes due to the gauge invariance of $W$).
Including these supergravity corrections, by minimizing the potential we find

n2 Nf2

m2
2
ngA
(Nc Nf )2 + 2Nc (nNf + Nc ) 2
S S = S S global + 22 S02 2 2
4gA Nc2
S0

1 nNf 2
2
gA (Nc Nf )2 2Nc + n(Nc Nf )
2
4gA Nc

m2
+ 2Nc (nNf 4Nc ) 2 ,
S0

N
c Nf 2
Z = Zglobal Z0 2
n Nf (Nc Nf ) + 1 2Nc + n(Nc Nf ) ,
2
2Nc
(32)
where the subscript global denotes the contributions found in global SUSY case in
Eq. (20). Here we have introduced a dimensionless parameter 1 defined through the relation

1 = 1 mS02 .
(33)
From the condition that the vacuum energy is zero at the minimum for the vanishing of
the cosmological constant, 1 is found to be

nNf

2
1 + 2 .
1

(34)
3
3 3
Eq. (33) ensures that the cosmological constant remains zero to the scale of strong dynamics. With these corrections the soft scalar masses from the D-term are now given by

m2f = m2f global + qf m20

2
Nc + nNf + 1 1 4Nc /(nNf ) .
nNf Nc
(35)
Note that the shifts in the masses are small, suppressed by a factor of
0.2.
The gravitino mass is determined to be
m3/2
m

nNf 3 m Nf /Nc
.
3S0 MPl
1 S02
2
MPl
(36)
In addition to the D-term corrections, all scalar fields receive a contribution to their soft
masses from the term

\[ \left|\phi_i\,\frac{W}{M_{Pl}^2}\right|^2 = m_{3/2}^2\, |\phi_i|^2. \tag{37} \]
For particles neutral under the anomalous $U(1)_A$ these are the leading source for soft masses. With the assumed minimal Kähler potential, note that these soft masses are equal to the gravitino mass.
So far we assumed the minimal form of the Kähler potential for illustration. There is no justification for this assumption. In fact, within split supersymmetry, since there are no excessive FCNC processes, an arbitrary form for the Kähler potential is permissible phenomenologically. The effects of such a nonminimal G can be understood in terms of
higher-dimensional operators suppressed by the Planck scale. Scalar fields can acquire soft
SUSY breaking masses through the terms

i |S|2 4
L
i
(38)
d .
2
MPl
The resulting masses are $m^2_{\tilde f_i} = c_i\, m_{3/2}^2$, with $c_i$ being order-one (flavor-dependent) coefficients. We will allow for such corrections.
3. Explicit models
In this section we consider a class of models based on flavor-dependent anomalous U (1)
symmetry and apply the results of the previous section. These models were developed to
address the pattern of fermion masses and mixings [18,19]. As noted earlier, the anomalous
$U(1)$ D-term provides a small expansion parameter $\epsilon = \langle S \rangle/M_{Pl} \sim 0.2$, which can be used to explain the mass hierarchy. We assign charge $q_i$ to fermion $f_i$ and charge $q^c_j$ to fermion $f^c_j$, such that the mass term $f_i f^c_j H$ will arise through a higher-dimensional operator with the factor $(S/M_{Pl})^{q_i + q^c_j}$ and thus be suppressed by a factor $\epsilon^{q_i + q^c_j}$. By choosing the charges appropriately the observed mass and mixing hierarchy can be explained, even with all Yukawa coefficients being of order one.
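As a rough numerical illustration of this suppression mechanism, the sketch below tabulates the factors obtained when each entry of a mass matrix is suppressed by the expansion parameter raised to the sum of the two charges. The value 0.2 is the one quoted in the text; the charge lists (4, 2, 0) are a hypothetical choice made only for this example, and the function name is not from the paper.

```python
EPSILON = 0.2  # expansion parameter <S>/M_Pl quoted in the text


def suppression_matrix(q, qc, eps=EPSILON):
    """Entry (i, j) is eps**(q[i] + qc[j]), the suppression of the f_i f_j^c H term."""
    return [[eps ** (qi + qj) for qj in qc] for qi in q]


# Hypothetical charges for three generations (illustration only)
q = [4, 2, 0]
qc = [4, 2, 0]

exponents = [[qi + qj for qj in qc] for qi in q]
texture = suppression_matrix(q, qc)

for exp_row, tex_row in zip(exponents, texture):
    print(exp_row, ["%.1e" % entry for entry in tex_row])
```

With order-one Yukawa coefficients the hierarchy then comes entirely from the exponents: the (3,3) entry is unsuppressed, while the (1,1) entry carries eight powers of the small parameter.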
With sub-TeV supersymmetry this approach to fermion mass and mixing hierarchy cannot be combined with supersymmetry breaking triggered by anomalous U (1), since the
D-terms will split the masses of scalars leading to unacceptable FCNC. Within split supersymmetry, however, these two approaches can be combined, which is what we analyze
now.
The superpotential of the class of models under discussion has the following form:

c
MR ij c c S nij

W=
+
+ Hu Hd
2 i j MPl
f
3Nc Nf 1/(Nc Nf )
Tr(Z)S n
+
+ (Nc Nf )
+ WA (S, Xk ).
n1
det(Z)
MPl
f
yij fi Hfjc
S
MPl
nf
ij
(39)
Here Xk are the SM singlet fields necessary for the cancellation of gravitational anomaly.
We will focus on the sub-class of such models studied in Ref. [19]. The mass matrices for
the various sectors in Ref. [19] are given (in an obvious notation) by
82

6 4
6
4
2
,
Mu Hu

4
2
1

5
4 4

p
3
2
2
,
Md Hd

1
1

5
2

3
p
p
4
2
MD Hu
Me Hd
1 ,

1 1 ,
4 2 1
1 1
2
2

Hu 2 2p
light

M
M c M R 1 1
(40)
1 1 .
MR
1 1
1 1
Although not unique, these mass matrices would lead to small quark mixings and large
neutrino mixings. Note that the neutrino masses are hierarchical in this scheme.
The charge assignment which leads to these mass matrices is given in Table 1. Here
we use SU(5) notation for the fields in the first column for simplicity, although we do
not explicitly assume SU(5) unification. There are two parameters, p and , which can
take a set of discrete values. The parameter p takes values p = 2 (1, 0) corresponding to
low (medium, high) value of $\tan\beta$ (the ratio of the two Higgs VEVs). Actually, in split
supersymmetry, since tan 1 is also permitted, p = 3 is also allowed. appears in the
mass of the up-quark, both = 0 and = 1 give reasonable spectrum. We also consider
the case where the charge of 5 1 is p (rather than 1 + p) in Table 1. This case would have
mass matrices which are very similar to those in Eq. (40). The main difference in this case
Table 1
The flavor U (1)A charge assignment for the MSSM fields, the SU(Nc )
and the flavon field S in the normalization where qS = 1
fields Q and Q
Field
Anomalous flavor charges
101 , 102 , 103

5 1 , 5 2 , 5 3
4 , 2, 0
1c , 2c , 3c
Hu , Hd , S, Q, Q
1 + p, p, p
1, 0, 0
0, 0, 1, n/2
light
is that all elements of M

will be of the same order, which would lead to larger Ue3 .
This scenario has been widely studied [20], sometimes under the name of neutrino mass
anarchy [21]. The charge assignment of Table 1, as well as its above-mentioned variant,
explain naturally the mass and mixing hierarchy of quarks and leptons, including small
quark mixings and large neutrino mixings.
The Green–Schwarz anomaly cancellation conditions for these models are given by
nNf
ANc
19 3 + 3p
A1 Ai
=
=
=
=
k1
ki
kN
2kN
2ki
or
18 3 + 3p
2ki
(41)
with $A_i$ being the $(\mathrm{SM})^2 \times U(1)_A$ anomalies for $i = 2, 3$. Their equality is automatically satisfied, due to the SU(5) compatibility of charges, provided that the Kac–Moody levels $k_i$ for the SM gauge groups $U(1)_Y$, $SU(2)_L$ and $SU(3)_c$ are chosen to be, for example, 5/3, 1 and 1, respectively. For $A_{\rm gravity}$, one needs to introduce extra heavy matter $X_k$ (with charge +1) which decouples at or near the Planck scale (see Ref. [19] for a detailed discussion). In Eq. (41) the first p-dependent factor applies to the charge assignment of Table 1, while the second one corresponds to the variant with $\bar{5}_1$ carrying charge p. For every choice of
charge we can compute the expansion parameter from = gst2 Agravity /(192 2 ). We
find for = 0 and for the charges of Table 1, = 0.174 (0.187, 0.199) for p = 0 (1, 2).
The results are very similar for other choices.
Eq. (41) allows for only a finite set of choices for n, $N_c$ and $N_f$. First of all, all these must be integers. Secondly, the mass parameter m of the meson fields of $SU(N_c)$ must be of order $\Lambda$ or smaller, otherwise these mesons will decouple from the low energy theory, affecting its dynamics. Thirdly, the dynamical scale $\Lambda$ is determined for any choice of charges, due to the string unification condition, Eq. (3). (We will confine ourselves to Kac–Moody level 1 for the $SU(N_c)$ as well as the SM sectors.) This should lead to an acceptable SUSY
breaking spectrum. Consistent with these demands, we find four promising cases. (i) n =
5, Nf = 5, p = 2, = 0; (ii) n = 6, Nf = 4, p = 2, = 0; (iii) n = 7, Nf = 3, p = 1, =
1; and (iv) n = 6, Nf = 3, p = 1, = 1. Here (i) has 5 1 charge equal to p + 1, while the
other three cases have it equal to p. We will see that the choices $N_c = 6$ or 7 yield a reasonable spectrum.
3.1. The spectrum of the model
Now we turn to the spectrum of the model. We set the gaugino masses at the TeV scale. (The higgsinos will turn out to have masses of the same order.) We then seek possible values of the scale $\Lambda$ and the mass parameter $m_0$ (the scalar mass) that would induce the TeV scale gaugino masses. The spectrum will turn out to be that of split supersymmetry. The main reason for this is that the leading SUSY breaking term, the $U(1)_A$ D-term, generates squark and slepton masses, but not gaugino and higgsino masses.
Supersymmetry breaking trilinear A terms are induced in the model by the same superpotential W (Eq. (39)) that generates quark and lepton masses, once the S field acquires a nonzero F component:
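\[ \mathcal{L} \supset \int d^2\theta\; y^f_{ij}\, f_i f^c_j H \left(\frac{S}{M_{Pl}}\right)^{n^f_{ij}} = Y^f_{ij}\left(q^f_i + q^{f^c}_j\right) f_i f^c_j H\, \frac{F_S}{S}. \tag{42} \]
Here $Y^f_{ij} \equiv y^f_{ij}\,\epsilon^{n_{ij}}$ are the effective MSSM Yukawa couplings, with $n_{ij} = q_{f_i} + q_{f^c_j}$, the sum of the anomalous charges of the SM fermions $f_i$ and $f^c_j$. Substituting results from the previous section, Eqs. (20) and (21), we find
\[ A^f_{ij} = Y^f_{ij}\left(q^f_i + q^{f^c}_j\right) n N_f\, \frac{\Lambda^3 (m/\Lambda)^{N_f/N_c}}{S_0^2}. \tag{43} \]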
These A-terms are induced at the scale $\Lambda$. The messengers of supersymmetry breaking are the meson fields of the $SU(N_c)$ sector, which have masses of order $\Lambda$. In the momentum range $m_0 \leq \mu \leq \Lambda$, the spectrum is that of the MSSM and there is renormalization group running of all SUSY breaking parameters as per the MSSM beta functions. This implies that once the A-terms are induced, they will generate nonzero gaugino masses through two-loop MSSM interactions. These are estimated from the two-loop MSSM beta functions to be$^2$
Mgi (m0 )

gi2 b 2
m0
Ci Yb + Ci Y2
ln 2 /m20 ,
2
2
(16 )
nNf /Nc 1
(44)
where $C_i^b = (14/5, 6, 4)$ and $C_i^\tau = (18/5, 6, 0)$ for $i = 1$–$3$. $Y_b$ and $Y_\tau$ are the MSSM Yukawa couplings of the b-quark and the $\tau$-lepton. From the requirement that $M_{\tilde g_i} \sim 1$ TeV we can estimate $\Lambda$ and $m_0$, which will enable us to obtain the full spectrum of the model.
Assuming that m , for the bino mass we obtain (for p = 2, or tan 5):
\[ M_{\tilde B}(m_0) \sim 10^{-5}\, m_0. \]
(45)
The mass of the wino is somewhat larger than this, and that of the gluino is somewhat smaller (compare the coefficients $C_i^b$ and $C_i^\tau$), all at the scale $m_0$. There is significant running of these masses below $m_0$ down to the TeV scale. This running is the largest for the gluino [7], which increases its mass, while it is the smallest for the bino, which decreases its mass. Consequently, at the TeV scale, we have the normal mass hierarchy $M_{\rm Bino} < M_{\rm Wino} < M_{\rm gluino}$.
In addition to the SM gauge interactions, the gauginos receive masses from the anomaly mediated contributions [22]. These contributions may be suppressed in specific setups such as in 5-dimensional supergravity [1]. We will allow for both suppressed and unsuppressed anomaly mediated contributions to gaugino masses. These contributions are given by
\[ M_{\rm gaugino} = \frac{\beta(g)}{g}\, F_\phi, \tag{46} \]
where $F_\phi$ is the F-component of the compensator superfield. With our setup as described in the previous section, $F_\phi$ is equal to the gravitino mass, so the wino mass, for example, will be about $3 \times 10^{-3}$ of the gravitino mass, or about $10^{-3}\, m_0$. If we set the wino mass at 1 TeV, $m_0$ will be of order $10^6$ GeV in such a scenario.
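The size of the anomaly mediated wino mass can be estimated with a short calculation. The sketch below assumes a one-loop beta function $\beta(g) = b\,g^3/(16\pi^2)$ with the MSSM value $b_2 = 1$ for $SU(2)_L$ and an approximate weak-scale coupling $g_2 \approx 0.65$; both inputs, and the function name, are assumptions made for this illustration rather than values taken from the paper.

```python
import math


def anomaly_mediated_ratio(b, g):
    """One-loop estimate of M_gaugino / F_phi = beta(g)/g with beta(g) = b g^3 / (16 pi^2)."""
    return b * g ** 2 / (16 * math.pi ** 2)


B2_MSSM = 1.0  # assumed one-loop SU(2)_L beta-function coefficient in the MSSM
G2 = 0.65      # assumed approximate SU(2)_L gauge coupling near the weak scale

wino_ratio = anomaly_mediated_ratio(B2_MSSM, G2)
print(f"M_wino / F_phi ~ {wino_ratio:.1e}")

# Gravitino mass implied by a 1 TeV wino if F_phi equals the gravitino mass
wino_mass_gev = 1000.0
gravitino_mass_gev = wino_mass_gev / wino_ratio
print(f"m_3/2 ~ {gravitino_mass_gev:.1e} GeV")
```

Under these assumptions the ratio comes out at a few times $10^{-3}$, the same order as the figure quoted in the text.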
2 The one-loop finite corrections arising from diagrams involving the top-quark and the stop-squark are negligible since At = 0 and TeV mt .
Table 2
The spectrum of the model for different choices of p, , n, Nf and Nc . In computing , we use Eq. (10) with
Nc = 1/28 at the Planck scale. The bino mass estimate is very rough, and includes only the two-loop MSSM
induced contributions
(p, , n, Nf , Nc )
(GeV)
m (GeV)/
m0
MB (m0 ) (GeV)
(GeV)/
(2, 0, 5, 5, 6)
3 1012
8 1014
(2, 0, 5, 5, 7)
4 1013
8 1014
6 105 5/6
9 107 5/7
5 5/6
600 5/7
600/ 1/6
9 104 / 2/7
(2, 0, 6, 4, 6)
8 1012
1 1014
(2, 0, 6, 4, 7)
8 1012
1 1014
7 105 2/3
1 108 4/7
5 2/3
640 4/7
700/ 1/3
105 / 3/7
(1, 0, 7, 3, 6)
2 1013
3 1013
(1, 0, 7, 3, 7)
1 1014
3 1013
1 106 1/2
2 108 3/7
100 1/2
104 3/7
1600/ 1/2
2 105 / 4/7
(1, 1, 6, 3, 6)
2 1013
1 1014
2 106 1/2
200 1/2
3000/ 1/2
As we stated in the previous section, only a limited choice of n and Nf are allowed from
the mixed anomaly cancellation conditions. We have considered four cases with nNf =
25, 24, 21, or 18. Our results for the spectrum are listed in Table 2. In each case we studied
different values of $N_c > N_f$. $N_c = 6, 7$ give the correct dynamical scale $\Lambda$, which leads to TeV scale gauginos. The scalar masses are found to be of order $10^6$ GeV in the case of unsuppressed anomaly mediated contribution (cases 1 and 3), and of order $10^8$ GeV for the suppressed case (all the other cases). Clearly this is a split supersymmetry spectrum. In the computation of Table 2 we assumed $g^2_{N_c}/(4\pi) = 1/28$ at $M_{Pl} = 2.4 \times 10^{18}$ GeV. The mass
m for the meson fields is computed in terms of an effective coupling (MPl /M )n1 .
We expect to be of order one from naturalness, if M is the same as MPl . We list the mass
m in terms of in the third column in Table 2. Note that the scalar masses from anomalous
$U(1)$ D-term are proportional to the $U(1)$ charges, and therefore vanish for $H_u$, $H_d$ and $10_3$ fields. These fields will however acquire masses from supergravity corrections.
The $U(1)_A$ symmetry does not forbid a bare $\mu$ term in the superpotential. However, it can be banished by a discrete $Z_4$ R-symmetry [23]. Under this $Z_4$, all the SM fermion superfields (scalar components) have charge +1, the gauginos have charge +1, the Z field has charge +2 and the SM Higgses and the S fields have charge zero. This symmetry has no anomaly, as a consequence of discrete Green–Schwarz anomaly cancellation. The
G2 Z4 anomaly coefficients are A3 = 3, A2 = 21 = 1 and ANc = Nc . The GS condition
for discrete $Z_4$ anomaly cancellation is that the differences $A_i - A_j$ should be an integral multiple of 2, which is automatic when $N_c$ is odd.
One can write the following effective Lagrangian for the $\mu$ term that is consistent with the $Z_4$ R symmetry

Tr( Z)S n
ZS n
=
N
H H .
L d 2 Hu Hd
(47)
f
n+1
n+1 u d
MPl
MPl
This leads to
Nf /Nc
m
3
= N f
.
mMPl
n
(48)
The numerical results for the $\mu$-term are given in the last column of Table 2 using this relation.
The SUSY breaking bilinear Higgs coupling, the B term, arises from the Lagrangian

B1 |S|2 + B2 Tr(Z Z)1/2
L = d 4 Hu Hd
2
MPl
=
B1 |FS0 |2 + B2 Nf |FZ |2 /|Z0 |

2
MPl
Hu Hd ,
(49)
leading to

B = m20
Nc
nNf Nc
2
Nf /Nc
m
3
B1 + B2 n Nf
.
2
mMPl
2
(50)
The second term in Eq. (50) is small compared to the first. From this we see that the $2 \times 2$ Higgs boson mass matrix has its off-diagonal entry of the same order as its diagonal entries. Recall that the diagonal entries are of order $m_{3/2}^2$, since the $U(1)_A$ charges of $H_u$ and $H_d$ are zero. Fine-tuning can then be done consistently so that one of the Higgs doublets remains light, with mass of order $10^2$ GeV.
Even when the $Z_4$ R symmetry is not respected by gravitational corrections, the induced $\mu$ term and gaugino masses are of order TeV. There can be a new contribution to the $\mu$ term in this case, arising from

(ZS n )
L d 4 Hu Hd n+2 .
(51)
MPl
This term is however smaller than that from Eq. (49). Similarly, gaugino masses can
arise from

ZS n
L d 4 W W n+2
(52)
MPl
which is also smaller than the SM induced corrections.
For the scalars neutral under $U(1)_A$ ($H_u$, $H_d$ and $10_3$), the D-term contribution to the soft masses vanishes. We should then take account of the subleading supergravity corrections.
Since these corrections are suppressed by a factor of 2 in the mass-squared, we should
worry about potentially large negative corrections proportional to the other soft masses
arising from SM interactions through the RGE in the momentum range m0 . We
have examined this in detail and found consistency of the models.
For the masses of zero charge fields we write

m2 = ci m23/2 + m2
(53)
i
(m2 )
i
with
denoting the MSSM RGE corrections. The most prominent one-loop radiative
corrections are
2
2
2 1-loop
nNf
p
2 m0
,
(Yb )
ln
mf
2
3
16 (nNf /Nc 1) Nc
m2
f
2
2
2 1-loop

nNf
p
2
2 m0
,
mHd
3(Yb ) + (Y )
ln
16 2 (nNf /Nc 1) Nc
m2
(54)
where f3 = (Q 3 , e3c ). Similar corrections for Hu and uc3 scalar components are small. Since
p = 2, we have low $\tan\beta \sim 5$, so these corrections are not large, although not negligible.
For example, for the down-type Higgs bosons we have
1-loop

2 103 m20 .
m2Hd
(55)
If the supergravity corrections to the mass-squared of Hd is larger than 3 102 m0 , it will
remain positive down to the scale m0 .
There is an important two-loop correction to the scalar masses arising from the gauge
sector:
2
2-loop
m20
g 4
(nN
)K
ln
,
m2
(56)
f
2
2
(16 )
m2
f
uc , ec and Hu ). This correction is

where K = (63/15, 16/5, 6/5 and 9/5) for = (Q,
estimated to be
2 2-loop
102 m20 .
mf
(57)
We see that these corrections are, although close to the gravitino contribution, at a safe
level. We conclude that split supersymmetry is realized consistently in these models.
4. Conclusion
In this paper we have proposed concrete models for supersymmetry breaking making
use of the anomalous U (1) D-term of string origin. The anomalous U (1) sector is coupled
to the strong dynamics of an N = 1 SUSY gauge theory where exact results are known.
The complete models we have presented also address the mass and mixing hierarchy of
quarks and leptons. We have generalized the analysis of Ref. [9] to include supergravity
corrections, which turns out to be important for certain fields in these models which carry
zero U (1) charge. Table 2 summarizes our results on the spectrum of these models. This
spectrum is that of split supersymmetry. The gaugino and the higgsino masses are of the same order; when these are set at the TeV scale, the squarks and sleptons have masses in the range $(10^6$–$10^8)$ GeV. This provides an explicit realization of part of the parameter space of split supersymmetry [1].
The experimental and cosmological implications of split supersymmetry have been
widely studied [6–8,11]. We conclude by summarizing the salient features that apply to our framework. (i) Gauge coupling unification works well, in fact somewhat better than in the MSSM. When embedded into SU(5) symmetry, proton decay via dimension six operators will result, with an estimated lifetime for $p \to e^+ \pi^0$ of order $(10^{35}$–$10^{36})$ yr. There
is no observable d = 5 proton decay in these models. (ii) The lightest neutralino, which is
charge and color neutral, is a natural and consistent dark matter candidate. (iii) The gluino
lifetime is estimated to be of order 107 seconds or shorter in these models. There is no
cosmological difficulty with such a mass. (iv) The gravitino mass is of order $10^7$ GeV, thus there is no cosmological gravitino abundance problem. (v) The low energy theory is the SM plus the neutralinos and the charginos of supersymmetry. All other particles acquire masses either near the Planck scale or through strong dynamics at a scale $\Lambda \sim 10^{14}$ GeV.
Acknowledgements
K.B. and T.E. are supported in part by the US Department of Energy under grants
#DE-FG02-04ER46140 and # DE-FG02-04ER41306. B.M. acknowledges hospitality of
the Theory Group at Oklahoma State University during a visit when this work started.
References
[1] N. Arkani-Hamed, S. Dimopoulos, hep-th/0405159.
[2] S. Weinberg, Phys. Rev. Lett. 59 (1987) 2607.
[3] V. Agrawal, S.M. Barr, J.F. Donoghue, D. Seckel, Phys. Rev. Lett. 80 (1998) 1822.
[4] R. Bousso, J. Polchinski, JHEP 0006 (2000) 006;
S. Kachru, R. Kallosh, A. Linde, S.P. Trivedi, Phys. Rev. D 68 (2003) 046005;
A. Maloney, E. Silverstein, A. Strominger, hep-th/0205316;
M.R. Douglas, JHEP 0305 (2003) 046;
F. Denef, M.R. Douglas, JHEP 0405 (2004) 072;
L. Susskind, hep-th/0302219.
[5] For attempts to quantify the probabilistic interpretation see: T. Banks, M. Dine, E. Gorbatov, JHEP 0408 (2004) 058;
C. Kokorelis, hep-th/0406258;
I. Antoniadis, S. Dimopoulos, hep-th/0411032;
K.R. Dienes, E. Dudas, T. Gherghetta, hep-th/0412185.
[6] For a suggestion of PeV scale scalars based on practical needs, see: J.D. Wells, hep-ph/0411041.
[7] G.F. Giudice, A. Romanino, Nucl. Phys. B 699 (2004) 65.
[8] A. Arvanitaki, C. Davis, P.W. Graham, J.G. Wacker, Phys. Rev. D 70 (2004) 117703;
A. Pierce, Phys. Rev. D 70 (2004) 075006;
S.H. Zhu, Phys. Lett. B 604 (2004) 207;
B. Mukhopadhyaya, S. SenGupta, hep-th/0407225;
E. Schmidt, hep-ph/0408088;
R. Mahbubani, hep-ph/0408096;
M. Binger, hep-ph/0408240;
K. Cheung, W.Y. Keung, Phys. Rev. D 71 (2005) 015015;
L. Anchordoqui, H. Goldberg, C. Nunez, Phys. Rev. D 71 (2005) 065014;
N. Arkani-Hamed, S. Dimopoulos, G.F. Giudice, A. Romanino, hep-ph/0409232;
M.A. Diaz, P.F. Perez, J. Phys. G 31 (2005) 1.
[9] P. Binetruy, E. Dudas, Phys. Lett. B 389 (1996) 503.
[10] N. Seiberg, Phys. Rev. D 49 (1994) 6857.
[11] B. Kors, P. Nath, hep-th/0411201.
[12] Y. Kawamura, H. Murayama, M. Yamaguchi, Phys. Rev. D 51 (1995) 1337;
T. Kobayashi, H. Nakano, H. Terao, K. Yoshioka, hep-ph/0211347;
K.S. Babu, T. Enkhbat, I. Gogoladze, Nucl. Phys. B 678 (2004) 233.
[13] G.R. Dvali, A. Pomarol, Phys. Rev. Lett. 77 (1996) 3728;
R.N. Mohapatra, A. Riotto, Phys. Rev. D 55 (1997) 4262.
[14] M.B. Green, J.H. Schwarz, Phys. Lett. B 149 (1984) 117;
M.B. Green, J.H. Schwarz, Nucl. Phys. B 255 (1985) 93;
M.B. Green, J.H. Schwarz, P.C. West, Nucl. Phys. B 254 (1985) 327.
[15] M. Dine, N. Seiberg, E. Witten, Nucl. Phys. B 289 (1987) 589.
[16] J.J. Atick, L.J. Dixon, A. Sen, Nucl. Phys. B 292 (1987) 109.
[17] K.A. Intriligator, N. Seiberg, Nucl. Phys. B (Proc. Suppl.) 45BC (1996) 1, hep-th/9509066.
[18] L.E. Ibanez, G.G. Ross, Phys. Lett. B 332 (1994) 100;
P. Binetruy, P. Ramond, Phys. Lett. B 350 (1995) 49;
P. Binetruy, S. Lavignac, P. Ramond, Nucl. Phys. B 477 (1996) 353;
K.S. Babu, I. Gogoladze, K. Wang, Nucl. Phys. B 660 (2003) 322;
H.K. Dreiner, H. Murayama, M. Thormeier, hep-ph/0312012.
[19] K.S. Babu, T. Enkhbat, I. Gogoladze, Nucl. Phys. B 678 (2004) 233;
K.S. Babu, T. Enkhbat, hep-ph/0406003, Nucl. Phys. B, in press.
[20] K.S. Babu, S.M. Barr, Phys. Lett. B 381 (1996) 202.
[21] L.J. Hall, H. Murayama, N. Weiner, Phys. Rev. Lett. 84 (2000) 2572.
[22] L. Randall, R. Sundrum, Nucl. Phys. B 557 (1999) 79;
G.F. Giudice, M.A. Luty, H. Murayama, R. Rattazzi, JHEP 9812 (1998) 027.
[23] K.S. Babu, I. Gogoladze, K. Wang, Nucl. Phys. B 660 (2003) 322, hep-ph/0212245.
Tri-bimaximal neutrino mixing from discrete symmetry in extra dimensions
Guido Altarelli a,b, Ferruccio Feruglio c
a CERN, Department of Physics, Theory Division, CH-1211 Geneva 23, Switzerland
b Dipartimento di Fisica "E. Amaldi", Università di Roma Tre, INFN, Sezione di Roma Tre, I-00146 Rome, Italy
c Dipartimento di Fisica "G. Galilei", Università di Padova, INFN, Sezione di Padova, Via Marzolo 8, I-35131 Padua, Italy

Received 4 May 2005; accepted 19 May 2005
Abstract
We discuss a particularly symmetric model of neutrino mixings where, with good accuracy, the atmospheric mixing angle $\theta_{23}$ is maximal, $\theta_{13} = 0$ and the solar angle satisfies $\sin^2\theta_{12} = 1/3$ (Harrison–Perkins–Scott (HPS) matrix). The discrete symmetry $A_4$ is a suitable symmetry group for the realization of this type of model. We construct a model where the HPS matrix is exactly obtained in a first approximation without imposing ad hoc relations among parameters. The crucial issue of the required VEV alignment in the scalar sector is discussed and we present a natural solution of this problem based on a formulation with extra dimensions. We study the corrections from higher dimensionality operators allowed by the symmetries of the model and discuss the conditions on the cut-off scales and the VEVs in order for these corrections to be completely under control. Finally, the observed hierarchy of charged lepton masses is obtained by assuming a larger flavour symmetry. We also show that, under general conditions, a maximal $\theta_{23}$ can never arise from an exact flavour symmetry.
E-mail addresses: guido.altarelli@cern.ch (G. Altarelli), feruglio@pd.infn.it (F. Feruglio).

doi:10.1016/j.nuclphysb.2005.05.005
G. Altarelli, F. Feruglio / Nuclear Physics B 720 (2005) 64–88
1. Introduction
By now there is convincing evidence for solar and atmospheric neutrino oscillations. The $\Delta m^2$ values and mixing angles are known with fair accuracy [1]. For $\Delta m^2$ we have: $\Delta m^2_{\rm atm} \sim 2.5 \times 10^{-3}$ eV$^2$ and $\Delta m^2_{\rm sol} \sim 8 \times 10^{-5}$ eV$^2$. As for the mixing angles, two are large and one is small. The atmospheric angle $\theta_{23}$ is large, actually compatible with maximal but not necessarily so: at $3\sigma$: $0.31 \leq \sin^2\theta_{23} \leq 0.72$ with central value around 0.5. The solar angle $\theta_{12}$ is large, $\sin^2\theta_{12} \sim 0.3$, but certainly not maximal (by about 5–6$\sigma$ now [2]). The third angle $\theta_{13}$ is strongly limited, mainly by the CHOOZ experiment, and has at present a $3\sigma$ upper limit given by about $\sin^2\theta_{13} \leq 0.08$.
In spite of this experimental progress there are still many alternative routes in constructing models of neutrino masses. This variety is mostly due to the considerable ambiguities that remain. First of all, it is essential to know whether the LSND signal [3], which has not been confirmed by KARMEN [4] and is currently being double-checked by MiniBooNE [5], will be confirmed or will be excluded. If LSND is right we probably need at least four light neutrinos; if not we can do with only the three known ones, as we assume in the following. As neutrino oscillations only determine mass squared differences, a crucial missing input is the absolute scale of neutrino masses (within the existing limits from terrestrial experiments and cosmology [6,7]). Even for three neutrinos the pattern of the neutrino mass spectrum is still undetermined: it can be approximately degenerate, or of the inverse hierarchy type or normally hierarchical. Taking for granted that neutrinos are Majorana particles, their masses can still arise either from the see-saw mechanism or from generic dimension-five non-renormalizable operators.
At a more direct level, we do not know how small the mixing angle $\theta_{13}$ is and how close to maximal is $\theta_{23}$. One can make a distinction between "normal" and "special" models. For normal models $\theta_{23}$ is not too close to maximal and $\theta_{13}$ is not too small, typically a small power of the self-suggesting order parameter $\sqrt{r}$, with $r = \Delta m^2_{\rm sol}/\Delta m^2_{\rm atm} \sim 1/35$. Special models are those where some symmetry or dynamical feature assures in a natural way the near vanishing of $\theta_{13}$ and/or of $\theta_{23} - \pi/4$. Normal models are conceptually more economical and much simpler to construct. We expect that experiment will eventually find that $\theta_{13}$ is not too small and that $\theta_{23}$ is sizably not maximal. But if, on the contrary, either $\theta_{13}$ very small or $\theta_{23}$ very close to maximal will emerge from experiment, then theory will need to cope with this fact. Thus it is interesting to conceive and explore dynamical structures that could lead to special models in a natural way.
We want to discuss here some particularly special models where both $\theta_{13}$ and $\theta_{23} - \pi/4$ exactly vanish.$^1$ Then the neutrino mixing matrix $U_{fi}$ ($f = e, \mu, \tau$, $i = 1, 2, 3$), in the basis of diagonal charged leptons, is given by, apart from sign convention redefinitions:
\[ U_{fi} = \begin{pmatrix} c_{12} & s_{12} & 0 \\ -s_{12}/\sqrt{2} & c_{12}/\sqrt{2} & -1/\sqrt{2} \\ -s_{12}/\sqrt{2} & c_{12}/\sqrt{2} & 1/\sqrt{2} \end{pmatrix}, \tag{1} \]
where $c_{12}$ and $s_{12}$ stand for $\cos\theta_{12}$ and $\sin\theta_{12}$, respectively. It is much simpler to write natural models of this type with $s_{12}$ small and thus many such attempts are present in the early literature. More recently, given the experimental value of $\theta_{12}$, the more complicated case of $s_{12}$ large was also attacked, using non-Abelian symmetries, either continuous or discrete [8–14]. In many examples the invoked symmetries are particularly ad hoc and/or no sufficient attention is devoted to corrections from higher-dimensional operators that can spoil the pattern arranged at tree level and to the highly non-trivial vacuum alignment problems that arise if naturalness is required also at the level of vacuum expectation values (VEVs).
1 More precisely, they vanish in a suitable limit, with correction terms that can be made negligibly small.
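An interesting special case of Eq. (1) is obtained for $s_{12} = 1/\sqrt{3}$, i.e. the so-called tri-bimaximal or Harrison–Perkins–Scott mixing pattern (HPS) [13], with the entries in the second column all equal to $1/\sqrt{3}$ in absolute value:
\[ U_{\rm HPS} = \begin{pmatrix} \sqrt{2/3} & 1/\sqrt{3} & 0 \\ -1/\sqrt{6} & 1/\sqrt{3} & -1/\sqrt{2} \\ -1/\sqrt{6} & 1/\sqrt{3} & 1/\sqrt{2} \end{pmatrix}. \tag{2} \]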
This matrix is a good approximation to present data.$^2$ It would be interesting to find a natural and appealing scheme that leads to this matrix with good accuracy. In fact this is a most special model where not only $\theta_{13}$ and $\theta_{23} - \pi/4$ vanish but also $\theta_{12}$ assumes a particular value. Clearly, in a natural realization of this model, a very constraining and predictive dynamics must be underlying. We think it is interesting to explore particular structures giving rise to this very special set of models in a natural way. In this case we have a maximum of order implying special values for all mixing angles: at the other extreme, anarchical models have been proposed [15], where no structure at all is assumed in the lepton sector, so that, for example, $\theta_{13}$ and $\theta_{23}$ are predicted to be in no way special, except that there must be a smallest angle (probably near to the present bound) and a largest angle (expected sizably different from maximal).
Interesting ideas on how to obtain the HPS mixing matrix have been discussed in Ref. [13]. The most attractive models are based on the discrete symmetry $A_4$, which appears as particularly suitable for the purpose, and were presented in Refs. [10,11]. In the present paper we start by discussing some general features of HPS models. We then present a new version of an $A_4$ model, with (moderate) normal hierarchy, and discuss in detail all aspects of naturalness in this model, also considering effects beyond tree level and the problem of vacuum alignment. There are a number of substantial improvements in our version with respect to Ma in Ref. [11]. First, the HPS matrix is exactly obtained in a first approximation when higher-dimensional operators are neglected, without imposing ad hoc relations among parameters (in Ref. [11] the equality of b and c is not guaranteed by the symmetry). The observed hierarchy of charged lepton masses is obtained by assuming a larger flavour symmetry. The crucial issue of the required VEV alignment in the scalar sector is considered with special attention and we present a natural solution of this problem. We also keep the flavour scalar fields distinct from the normal Higgs bosons (a proliferation of Higgs doublets is disfavoured by coupling unification) and singlets under the Standard Model gauge group. Last but not least, we study the corrections from higher dimensionality operators allowed by the symmetries of the model and discuss the conditions on the cut-off scales and the VEVs in order for these corrections to be completely under control.
2 In the HPS scheme $\tan^2\theta_{12} = 0.5$, to be compared with the latest experimental determination [2]: $\tan^2\theta_{12} = 0.45^{+0.09}_{-0.08}$.
2. General considerations
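The HPS mixing matrix implies that, in a basis where charged lepton masses are diagonal,
the effective neutrino mass matrix is given by $m_\nu = U_{HPS}\,\mathrm{diag}(m_1, m_2, m_3)\,U_{HPS}^T$:
\[
m_\nu = \frac{m_3}{2}\begin{pmatrix} 0 & 0 & 0\\ 0 & 1 & -1\\ 0 & -1 & 1 \end{pmatrix}
+ \frac{m_2}{3}\begin{pmatrix} 1 & 1 & 1\\ 1 & 1 & 1\\ 1 & 1 & 1 \end{pmatrix}
+ \frac{m_1}{6}\begin{pmatrix} 4 & -2 & -2\\ -2 & 1 & 1\\ -2 & 1 & 1 \end{pmatrix}.
\tag{3}
\]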
The eigenvalues of $m_\nu$ are $m_1$, $m_2$, $m_3$ with eigenvectors $(-2, 1, 1)/\sqrt{6}$, $(1, 1, 1)/\sqrt{3}$ and
$(0, 1, -1)/\sqrt{2}$, respectively. In general, apart from phases, there are six parameters in a real
symmetric matrix like $m_\nu$: here only three are left after the values of the three mixing angles
have been fixed à la HPS. For a hierarchical spectrum $m_3 \gg m_2 \gg m_1$, $m_3^2 \sim \Delta m^2_{atm}$,
$m_2^2/m_3^2 \sim \Delta m^2_{sol}/\Delta m^2_{atm}$ and $m_1$ could be negligible. But also degenerate masses and
inverse hierarchy can be reproduced: for example, by taking $m_3 = -m_2 = m_1$ we have a
degenerate model, while for $m_1 = -m_2$ and $m_3 = 0$ we have an inverse hierarchy case (stability
under renormalization group running strongly prefers opposite signs for the first and
the second eigenvalue, which are related to solar oscillations and have the smallest mass
squared splitting). From the general expression of the eigenvectors one immediately sees
that this mass matrix, independent of the values of $m_i$, leads to the HPS mixing matrix. It
is a curiosity that the eigenvectors are the same as in the case of the Fritzsch-Xing (FX)
matrix [16] but with the roles of the first and the third ones interchanged (so that for HPS
$\theta_{23}$ is maximal while $\sin^2 2\theta_{12} = 8/9$, while for FX the two mixing angles keep the same
values but are interchanged).
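The statement that Eq. (3) leads to the HPS mixing for any $m_i$ can be checked numerically. The sketch below is only an illustration: it rebuilds $m_\nu$ as a sum of projectors on the three eigenvectors quoted above and tests the eigenvalue equations. The numerical masses and the helper names are arbitrary choices made for this example.

```python
import math

# HPS eigenvectors as quoted in the text (the columns of U_HPS).
V1 = [-2 / math.sqrt(6), 1 / math.sqrt(6), 1 / math.sqrt(6)]
V2 = [1 / math.sqrt(3)] * 3
V3 = [0.0, 1 / math.sqrt(2), -1 / math.sqrt(2)]

def mass_matrix(m1, m2, m3):
    """m_nu = sum_i m_i v_i v_i^T, i.e. U diag(m1, m2, m3) U^T as in Eq. (3)."""
    return [[m1 * V1[i] * V1[j] + m2 * V2[i] * V2[j] + m3 * V3[i] * V3[j]
             for j in range(3)] for i in range(3)]

def apply(m, v):
    """Matrix-vector product for 3x3 lists."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def is_eigenvector(m, v, eigenvalue, tol=1e-12):
    """True if m v = eigenvalue * v within the tolerance."""
    return all(abs(x - eigenvalue * y) < tol for x, y in zip(apply(m, v), v))

# Arbitrary illustrative masses (assumed values, not taken from the paper).
m = mass_matrix(0.3, -1.7, 2.9)
checks = [is_eigenvector(m, V1, 0.3),
          is_eigenvector(m, V2, -1.7),
          is_eigenvector(m, V3, 2.9)]
print(checks)
```

Changing the three masses changes the eigenvalues but not the eigenvectors, which is the content of the remark above.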
If the atmospheric mixing angle is really maximal as in the HPS ansatz or close to
maximal, it seems quite natural to interpret this as the effect of a flavour symmetry. It
would be tempting to think of an approximate flavour symmetry such that $\theta_{23} = \pi/4$ arises
in the limit of exact symmetry, that is by neglecting all symmetry breaking effects. Here
we will show that this is not the case and that, under quite general conditions, we can never
obtain $\theta_{23} = \pi/4$ as a result of an exact flavour symmetry.3 We assume that this symmetry
is a meaningful symmetry, that is it is only broken by small effects, in the real world. In
other words here we exclude symmetries that need breaking terms of order one to describe
the observed fermion masses and mixing angles. Apart from that the symmetry can be
of whatever type, global or local, continuous or discrete. Being interested in the limit of
exact symmetry, we can neglect the sector giving rise to flavour symmetry breaking. We
assume that the fields on which such symmetry acts are the fields of the standard model,
plus possibly the right-handed neutrinos, so that our results will also cover the see-saw
case. Last, we assume canonical kinetic terms, so that the symmetry acts on the fields of
the standard model through unitary transformations.
3 For related observations see Ref. [17].
Since the flavour symmetry is broken only by small effects, the mass matrices for
charged leptons and neutrinos can be written as:
\[
m_e = m_e^0 + \cdots, \qquad m_\nu = m_\nu^0 + \cdots,
\tag{4}
\]
where dots denote symmetry breaking effects and $m_e^0$ has rank less than or equal to one.
Rank greater than one, as for instance when both the tau and the muon have non-vanishing
masses in the symmetry limit, is clearly an unacceptable starting point, since the difference
between the two non-vanishing masses can only be explained by large breaking effects,
which we have excluded, or by a fine-tuning, which we wish to avoid. If the rank of $m_e^0$
vanishes, then all mixing angles in the charged lepton sector are undetermined in the
symmetry limit and $\theta_{23}$ is also completely undetermined. Therefore we can focus on the case
when $m_e^0$ has rank one. If $m_e^0$ has rank one, then by a unitary transformation we can always
go to a field basis where
\[
m_e^0 = \begin{pmatrix} 0 & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & m_\tau^0 \end{pmatrix}.
\tag{5}
\]
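As in the original basis, the action of the flavour symmetry on the new field basis is
perfectly defined. If $U_\nu$ and $U_e$ are the unitary matrices that diagonalize $m_\nu^0$ and $m_e^{0\dagger} m_e^0$, it
will be possible to adopt the parametrization [18]
\[
U_\nu = K\, R_{23}(\theta^\nu_{23})\, P^\dagger\, R_{13}(\theta^\nu_{13})\, P\, R_{12}(\theta^\nu_{12}),
\tag{6}
\]
where $R_{ij}$ is the orthogonal matrix representing a rotation in the $ij$ sector, $P =
\mathrm{diag}(1, 1, \exp(i\delta))$ and $K = \mathrm{diag}(\exp(i\alpha_1), \exp(i\alpha_2), \exp(i\alpha_3))$. Moreover:
\[
U_e = R_{12}(\theta^e_{12}),
\tag{7}
\]
where the angle $\theta^e_{12}$ is completely undetermined. The physical mixing matrix is
$U_{PMNS} = U_e^\dagger U_\nu$ and we find:
\[
|\tan\theta_{23}| = \left| \cos\theta^e_{12}\, \tan\theta^\nu_{23}\, e^{i\alpha_2}
+ \sin\theta^e_{12}\, \frac{\tan\theta^\nu_{13}}{\cos\theta^\nu_{23}}\, e^{i(\delta+\alpha_1)} \right|.
\tag{8}
\]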
Therefore, in general, the atmospheric mixing angle is always undetermined at the leading
order. When small symmetry breaking terms are added to $m_e^0$ and $m_\nu^0$, it is possible to
obtain $\theta_{23} = \pi/4$, provided these breaking terms have suitable orientations in the flavour
space. If the breaking terms are produced by a spontaneous symmetry breaking through
the minimization of the potential energy of the theory, in general two independent scalar
sectors are needed. One of them communicates the breaking to charged fermions and the
other one feeds the breaking to neutrinos. In such a framework a maximal atmospheric
mixing angle is always the result of a special vacuum alignment.
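In the literature there are symmetries predicting $\theta_{23}$ large, not necessarily maximal, in
the limit of exact symmetry [19]. For instance, this is produced by U(1) flavour symmetries,
when the U(1) charges of left-handed leptons and right-handed charged leptons are
$(q_1, 0, 0)$ and $(p_1, p_2, 0)$, respectively, with $q_1$ and $p_{1,2}$ all non-vanishing and different. In
the symmetry limit, such an assignment implies ($m_e \sim \bar{R} L$, $m_\nu \sim L^T L$):
\[
m_e^0 = \begin{pmatrix} 0 & 0 & 0\\ 0 & 0 & 0\\ 0 & \alpha & \beta \end{pmatrix}, \qquad
m_e^{0\dagger} m_e^0 = \begin{pmatrix} 0 & 0 & 0\\ 0 & |\alpha|^2 & \bar\alpha\beta\\ 0 & \alpha\bar\beta & |\beta|^2 \end{pmatrix},
\tag{9}
\]
and:
\[
m_\nu^0 = \begin{pmatrix} 0 & 0 & 0\\ 0 & \gamma & \delta\\ 0 & \delta & \epsilon \end{pmatrix}, \qquad
m_\nu^{0\dagger} m_\nu^0 = \begin{pmatrix} 0 & 0 & 0\\ 0 & |\gamma|^2 + |\delta|^2 & \bar\gamma\delta + \bar\delta\epsilon\\ 0 & \gamma\bar\delta + \delta\bar\epsilon & |\delta|^2 + |\epsilon|^2 \end{pmatrix},
\tag{10}
\]
with $\alpha$, $\beta$, $\gamma$, $\delta$ and $\epsilon$ independent parameters of the same order of magnitude. If there
is no conspiracy among these parameters, the resulting $\theta_{23}$ mixing is generically large.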
In conclusion, a large lepton mixing in the 23 sector is possible as the result of an exact flavour symmetry. But if we want to reproduce $\theta_{23} = \pi/4$ in some limit of our theory,
necessarily this limit cannot correspond to an exact symmetry in flavour space. A maximal atmospheric mixing angle can only originate from breaking effects as a solution of a
vacuum alignment problem.
3. Basic structure of the model

Our model is based on the discrete group A4 following Refs. [10,11], where its structure
and representations are described in detail. Here we simply recall that A4 is the discrete
symmetry group of the rotations that leave a tetrahedron invariant, or the group of the even
permutations of 4 objects. It has 12 elements and 4 inequivalent irreducible representations
denoted 1, 1', 1'' and 3 in terms of their respective dimensions. Introducing $\omega$, the
cubic root of unity, $\omega = \exp(i 2\pi/3)$, so that $1 + \omega + \omega^2 = 0$, the three one-dimensional
representations are obtained by dividing the 12 elements of A4 in three classes, which are
determined by the multiplication rule, and assigning to (class 1, class 2, class 3) a factor
(1, 1, 1) for 1, or $(1, \omega, \omega^2)$ for 1' or $(1, \omega^2, \omega)$ for 1''. The product of two 3 gives
$3 \times 3 = 1 + 1' + 1'' + 3 + 3$. Also $1' \times 1' = 1''$, $1' \times 1'' = 1$, $1'' \times 1'' = 1'$, etc. For
$3 \sim (a_1, a_2, a_3)$, $3 \sim (b_1, b_2, b_3)$ the irreducible representations obtained from their product are:
\[ 1 = a_1 b_1 + a_2 b_2 + a_3 b_3, \tag{11} \]
\[ 1' = a_1 b_1 + \omega^2 a_2 b_2 + \omega\, a_3 b_3, \tag{12} \]
\[ 1'' = a_1 b_1 + \omega\, a_2 b_2 + \omega^2 a_3 b_3, \tag{13} \]
\[ 3 \sim (a_2 b_3,\ a_3 b_1,\ a_1 b_2), \tag{14} \]
\[ 3 \sim (a_3 b_2,\ a_1 b_3,\ a_2 b_1). \tag{15} \]
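The product rules (11)-(15) can be tested against an explicit three-dimensional representation. The following sketch assumes, as an illustration not stated in the text, that the triplet is generated by a sign flip S = diag(1, -1, -1) and by the cyclic permutation T of the three components; the function names are hypothetical.

```python
import cmath

W = cmath.exp(2j * cmath.pi / 3)  # omega, the cubic root of unity

def s_gen(a):
    """Assumed generator S = diag(1, -1, -1) acting on a triplet."""
    return [a[0], -a[1], -a[2]]

def t_gen(a):
    """Assumed generator T: cyclic permutation (a1, a2, a3) -> (a2, a3, a1)."""
    return [a[1], a[2], a[0]]

def singlets(a, b):
    """The combinations 1, 1' and 1'' of Eqs. (11)-(13)."""
    one = a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
    one_p = a[0] * b[0] + W**2 * a[1] * b[1] + W * a[2] * b[2]
    one_pp = a[0] * b[0] + W * a[1] * b[1] + W**2 * a[2] * b[2]
    return one, one_p, one_pp

def triplets(a, b):
    """The two triplets of Eqs. (14)-(15)."""
    return ([a[1] * b[2], a[2] * b[0], a[0] * b[1]],
            [a[2] * b[1], a[0] * b[2], a[1] * b[0]])

# Arbitrary sample triplets (assumed values).
a = [0.3, -1.1, 2.0]
b = [1.5, 0.7, -0.4]
before = singlets(a, b)
after_t = singlets(t_gen(a), t_gen(b))
after_s = singlets(s_gen(a), s_gen(b))
# Under T the three singlets pick up the phases 1, omega, omega^2; under S they are invariant.
print([after_t[k] / before[k] for k in range(3)])
```

With these assumed generators the two triplet combinations transform exactly like the original triplet, which is what Eqs. (14) and (15) require.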
Following Ref. [11] we assign leptons to the four inequivalent representations of A4: left-handed
lepton doublets $l$ transform as a triplet 3, while the right-handed charged leptons
$e^c$, $\mu^c$ and $\tau^c$ transform as 1, 1' and 1'', respectively. The flavour symmetry is broken by
two real triplets $\varphi$ and $\varphi'$ and by a real singlet $\xi$. At variance with the choice made by [11],
these fields are gauge singlets. Hence we only need two Higgs doublets $h_{u,d}$ (not three
generations of them as in Ref. [11]), which we take invariant under A4. We assume that
some mechanism produces and maintains the hierarchy $\langle h_{u,d} \rangle = v_{u,d} \ll \Lambda$, where $\Lambda$ is the
cut-off scale of the theory.4 The Yukawa interactions in the lepton sector read:
\[
L_Y = y_e\, e^c (\varphi l) + y_\mu\, \mu^c (\varphi l)'' + y_\tau\, \tau^c (\varphi l)' + x_a\, \xi (ll) + x_d\, (\varphi' ll) + h.c. + \cdots
\tag{16}
\]
In our notation, (33) transforms as 1, (33)' transforms as 1' and (33)'' transforms as 1''.
Also, to keep our notation compact, we use a two-component notation for the fermion
fields and we set to 1 the Higgs fields $h_{u,d}$ and the cut-off scale $\Lambda$. For instance, $y_e e^c (\varphi l)$
stands for $y_e e^c (\varphi l) h_d/\Lambda$, $x_a \xi (ll)$ stands for $x_a \xi (l h_u l h_u)/\Lambda^2$ and so on. The Lagrangian
$L_Y$ contains the lowest order operators in an expansion in powers of $1/\Lambda$. Dots stand for
higher-dimensional operators that will be discussed in Section 6. Some terms allowed by
the flavour symmetry, such as the terms obtained by the exchange $\varphi' \leftrightarrow \varphi$, or the term (ll),
are missing in $L_Y$. Their absence is crucial and will be motivated later on.
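As we will demonstrate in Section 5, the fields $\varphi'$, $\varphi$ and $\xi$ develop a VEV along the
directions:
\[
\langle \varphi' \rangle = (v', 0, 0), \qquad \langle \varphi \rangle = (v, v, v), \qquad \langle \xi \rangle = u.
\tag{17}
\]
Therefore, at the leading order of the $1/\Lambda$ expansion, the mass matrices $m_l$ and $m_\nu$ for
charged leptons and neutrinos are given by:
\[
m_l = v_d \frac{v}{\Lambda} \begin{pmatrix} y_e & y_e & y_e\\ y_\mu & y_\mu \omega & y_\mu \omega^2\\ y_\tau & y_\tau \omega^2 & y_\tau \omega \end{pmatrix},
\tag{18}
\]
\[
m_\nu = \frac{v_u^2}{\Lambda} \begin{pmatrix} a & 0 & 0\\ 0 & a & d\\ 0 & d & a \end{pmatrix},
\tag{19}
\]
where
\[
a \equiv x_a \frac{u}{\Lambda}, \qquad d \equiv x_d \frac{v'}{\Lambda}.
\tag{20}
\]
Charged leptons are diagonalized by
\[
l \to \frac{1}{\sqrt{3}} \begin{pmatrix} 1 & 1 & 1\\ 1 & \omega^2 & \omega\\ 1 & \omega & \omega^2 \end{pmatrix} l,
\tag{21}
\]
and charged fermion masses are given by:
\[
m_e = \sqrt{3}\, y_e v_d \frac{v}{\Lambda}, \qquad
m_\mu = \sqrt{3}\, y_\mu v_d \frac{v}{\Lambda}, \qquad
m_\tau = \sqrt{3}\, y_\tau v_d \frac{v}{\Lambda}.
\tag{22}
\]
4 This is the well-known hierarchy problem that can be solved, for instance, by realizing a supersymmetric
version of this model.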
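As a cross-check of Eqs. (18), (21) and (22), the sketch below multiplies the flavour structure of $m_l$ (with the Yukawa couplings and the overall factor $v_d v/\Lambda$ stripped off) by the rotation of Eq. (21). The variable names are illustrative choices; the product should be $\sqrt{3}$ times the identity, which is the origin of the factor $\sqrt{3}$ in Eq. (22).

```python
import cmath
import math

W = cmath.exp(2j * cmath.pi / 3)  # omega

# Flavour structure of Eq. (18), rows ordered as (e, mu, tau).
M = [[1, 1, 1],
     [1, W, W**2],
     [1, W**2, W]]

# Rotation of the lepton doublets, Eq. (21).
V = [[x / math.sqrt(3) for x in row] for row in
     [[1, 1, 1],
      [1, W**2, W],
      [1, W, W**2]]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

product = matmul(M, V)  # expected: sqrt(3) times the identity matrix
for row in product:
    print([round(abs(x), 12) for x in row])
```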
We can easily obtain a natural hierarchy among $m_e$, $m_\mu$ and $m_\tau$ by introducing an additional
U(1)_F flavour symmetry under which only the right-handed lepton sector is charged.
We assign F-charges 0, 2 and $3 \div 4$ to $\tau^c$, $\mu^c$ and $e^c$, respectively. By assuming that a flavon
$\theta$, carrying a negative unit of F, acquires a VEV $\langle \theta \rangle/\Lambda \equiv \lambda < 1$, the Yukawa couplings
become field dependent quantities $y_{e,\mu,\tau} = y_{e,\mu,\tau}(\theta)$ and we have
\[
y_\tau \approx O(1), \qquad y_\mu \approx O(\lambda^2), \qquad y_e \approx O(\lambda^{3 \div 4}).
\tag{23}
\]
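In the flavour basis the neutrino mass matrix reads5:
\[
m_\nu = \frac{v_u^2}{\Lambda} \begin{pmatrix} a + 2d/3 & -d/3 & -d/3\\ -d/3 & 2d/3 & a - d/3\\ -d/3 & a - d/3 & 2d/3 \end{pmatrix},
\tag{24}
\]
and is diagonalized by the transformation:
\[
U^T m_\nu U = \frac{v_u^2}{\Lambda}\, \mathrm{diag}(a + d,\ a,\ -a + d),
\tag{25}
\]
with
\[
U = \begin{pmatrix} \sqrt{2/3} & 1/\sqrt{3} & 0\\ -1/\sqrt{6} & 1/\sqrt{3} & -1/\sqrt{2}\\ -1/\sqrt{6} & 1/\sqrt{3} & +1/\sqrt{2} \end{pmatrix}.
\tag{26}
\]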
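Equations (24)-(26) can be verified directly. The sketch below, with arbitrarily chosen complex sample values for $a$ and $d$ (an assumption made only for this check), evaluates $U^T m_\nu U$ in units of $v_u^2/\Lambda$ and reads off the mixing angles from the entries of $U$.

```python
import math

def flavour_mass_matrix(a, d):
    """Eq. (24) in units of v_u^2 / Lambda."""
    return [[a + 2 * d / 3, -d / 3, -d / 3],
            [-d / 3, 2 * d / 3, a - d / 3],
            [-d / 3, a - d / 3, 2 * d / 3]]

S2, S3, S6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)
# Eq. (26); note that sqrt(2/3) = 2/sqrt(6).
U = [[2 / S6, 1 / S3, 0.0],
     [-1 / S6, 1 / S3, -1 / S2],
     [-1 / S6, 1 / S3, 1 / S2]]

def ut_m_u(m):
    """U^T m U for the real matrix U above."""
    return [[sum(U[k][i] * m[k][l] * U[l][j] for k in range(3) for l in range(3))
             for j in range(3)] for i in range(3)]

# Arbitrary complex sample values (assumed, not from the paper).
a, d = 0.4 + 0.1j, -0.9 + 0.5j
diag = ut_m_u(flavour_mass_matrix(a, d))

tan2_theta12 = U[0][1] ** 2 / U[0][0] ** 2   # |U_e2|^2 / |U_e1|^2
tan2_theta23 = U[1][2] ** 2 / U[2][2] ** 2   # |U_mu3|^2 / |U_tau3|^2
print(diag[0][0], diag[1][1], diag[2][2], tan2_theta12, tan2_theta23, U[0][2])
```

The diagonal entries reproduce $a + d$, $a$ and $-a + d$, and the off-diagonal entries vanish up to rounding.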
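The leading order predictions are $\tan^2\theta_{23} = 1$, $\tan^2\theta_{12} = 0.5$ and $\theta_{13} = 0$. The neutrino
masses are $m_1 = a + d$, $m_2 = a$ and $m_3 = -a + d$, in units of $v_u^2/\Lambda$. We can express $|a|$, $|d|$
in terms of $r \equiv \Delta m^2_{sol}/\Delta m^2_{atm} \equiv (|m_2|^2 - |m_1|^2)/(|m_3|^2 - |m_1|^2)$, $\Delta m^2_{atm} \equiv |m_3|^2 - |m_1|^2$
and $\cos\Delta$, $\Delta$ being the phase difference between the complex numbers $a$ and $d$:
\[
\sqrt{2}\,|a|\,\frac{v_u^2}{\Lambda} = \frac{-\sqrt{\Delta m^2_{atm}}}{2 \cos\Delta \sqrt{1 - 2r}}, \qquad
\sqrt{2}\,|d|\,\frac{v_u^2}{\Lambda} = \sqrt{1 - 2r}\, \sqrt{\Delta m^2_{atm}}.
\tag{27}
\]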
To satisfy these relations a moderate tuning is needed in our model. Due to the absence of
(ll) in Eq. (16), which we will motivate in the next section, $a$ and $d$ are of the same order
in $1/\Lambda$, see Eq. (20). Therefore we expect that $|a|$ and $|d|$ are close to each other and, to
satisfy Eq. (27), $\cos\Delta$ should be negative and of order one. We obtain:
\[
\begin{aligned}
|m_1|^2 &= \left[ -r + \frac{1}{8 \cos^2\Delta\, (1 - 2r)} \right] \Delta m^2_{atm},\\
|m_2|^2 &= \frac{1}{8 \cos^2\Delta\, (1 - 2r)}\, \Delta m^2_{atm},\\
|m_3|^2 &= \left[ 1 - r + \frac{1}{8 \cos^2\Delta\, (1 - 2r)} \right] \Delta m^2_{atm}.
\end{aligned}
\tag{28}
\]
5 Notice that a unitary change of basis like the one in Eq. (21) will in general change the relative phases of the
eigenvalues of $m_\nu$.
If $\cos\Delta = -1$, we have a neutrino spectrum close to hierarchical:
\[
|m_3| \approx 0.053\ \mathrm{eV}, \qquad |m_1| \approx |m_2| \approx 0.017\ \mathrm{eV}.
\tag{29}
\]
In this case the sum of neutrino masses is about 0.087 eV. If $\cos\Delta$ is accidentally small, the
neutrino spectrum becomes degenerate. The value of $|m_{ee}|$, the parameter characterizing
the violation of total lepton number in neutrinoless double beta decay, is given by:
\[
|m_{ee}|^2 = \left[ -\frac{1 + 4r}{9} + \frac{1}{8 \cos^2\Delta\, (1 - 2r)} \right] \Delta m^2_{atm}.
\tag{30}
\]
For $\cos\Delta = -1$ we get $|m_{ee}| \approx 0.005$ eV, at the upper edge of the range allowed for normal
hierarchy, but unfortunately too small to be detected in the near future. Independently of
the value of the unknown phase $\Delta$ we get the relation:
\[
|m_3|^2 = |m_{ee}|^2 + \frac{10}{9}\, \Delta m^2_{atm} \left( 1 - \frac{r}{2} \right),
\tag{31}
\]
which is a prediction of our model.
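The numbers quoted in Eq. (29) and below Eq. (30) follow from Eqs. (28) and (30) once the two mass splittings are fixed. The sketch below assumes $\Delta m^2_{atm} = 2.5 \times 10^{-3}$ eV$^2$ and $\Delta m^2_{sol} = 8 \times 10^{-5}$ eV$^2$; these inputs are illustrative assumptions, since the text does not state the values it used.

```python
import math

def spectrum(dm2_atm, dm2_sol, cos_delta):
    """Return (|m1|, |m2|, |m3|, |m_ee|) in eV from Eqs. (28) and (30)."""
    r = dm2_sol / dm2_atm
    k = 1.0 / (8.0 * cos_delta**2 * (1.0 - 2.0 * r))
    m1 = math.sqrt((-r + k) * dm2_atm)
    m2 = math.sqrt(k * dm2_atm)
    m3 = math.sqrt((1.0 - r + k) * dm2_atm)
    mee = math.sqrt((-(1.0 + 4.0 * r) / 9.0 + k) * dm2_atm)
    return m1, m2, m3, mee

# Assumed inputs in eV^2 (not quoted in the text).
DM2_ATM, DM2_SOL = 2.5e-3, 8.0e-5
m1, m2, m3, mee = spectrum(DM2_ATM, DM2_SOL, cos_delta=-1.0)
print(m1, m2, m3, m1 + m2 + m3, mee)

# Relation (31), which does not depend on the phase.
r = DM2_SOL / DM2_ATM
lhs = m3**2
rhs = mee**2 + (10.0 / 9.0) * DM2_ATM * (1.0 - r / 2.0)
```

With these inputs $|m_3|$ comes out close to 0.052 eV, $|m_1|$ and $|m_2|$ close to 0.016 and 0.018 eV, and $|m_{ee}|$ close to 0.0045 eV, in line with the rounded values in the text.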
It is also important to get some constraint on the mass scales involved in our construction.
From Eqs. (27) and (20), by assuming $x_d \approx 1$, $v_u \approx 250$ GeV, we have
\[
\Lambda \approx 1.8 \times 10^{15}\ \frac{v'}{\Lambda}\ \mathrm{GeV}.
\tag{32}
\]
Since, to have a meaningful expansion, we expect $v' \le \Lambda$, we have the upper bound
\[
\Lambda < 1.8 \times 10^{15}\ \mathrm{GeV}.
\tag{33}
\]
Beyond this energy scale, new physics should come into play. The smaller the ratio $v'/\Lambda$,
the smaller becomes the cut-off scale. For instance, when $v'/\Lambda = 0.03$, $\Lambda$ should be
close to $10^{14}$ GeV. Complementary information comes from the charged lepton sector,
Eq. (22). A lower bound on $v/\Lambda$ can be derived from the requirement that the Yukawa
coupling $y_\tau$ remains in a perturbative regime. By asking $y_\tau v_d < 250$ GeV, we get
\[
\frac{v}{\Lambda} > 0.004.
\tag{34}
\]
Finally, by assuming that all the VEVs fall in approximately the same range, which will be
shown in Section 5, we obtain the range
\[
0.004 < \frac{v'}{\Lambda} \approx \frac{v}{\Lambda} \approx \frac{u}{\Lambda} < 1,
\tag{35}
\]
that will be useful to estimate the effects of higher-dimensional operators in Section 6.
Correspondingly the cut-off scale will range between about $10^{13}$ and $1.8 \times 10^{15}$ GeV.
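The estimates (32)-(34) amount to simple arithmetic, sketched below. The function names are illustrative, and the inputs $\Delta m^2_{atm} = 2.5 \times 10^{-3}$ eV$^2$, $r = 0.032$ and $m_\tau = 1.777$ GeV are assumptions made for this example rather than values quoted in the text.

```python
import math

EV_TO_GEV = 1e-9

def cutoff_scale(ratio, dm2_atm=2.5e-3, r=0.032, x_d=1.0, v_u=250.0):
    """Lambda in GeV from Eqs. (27) and (20):
    sqrt(2) |d| v_u^2 / Lambda = sqrt(1 - 2r) sqrt(dm2_atm), with |d| = x_d * ratio
    and ratio = v'/Lambda."""
    m = math.sqrt((1.0 - 2.0 * r) * dm2_atm) * EV_TO_GEV  # neutrino mass scale in GeV
    return math.sqrt(2.0) * x_d * ratio * v_u**2 / m

def min_ratio(m_tau=1.777, y_tau_v_d=250.0):
    """Lower bound on v/Lambda from m_tau = sqrt(3) y_tau v_d v/Lambda, Eq. (22)."""
    return m_tau / (math.sqrt(3.0) * y_tau_v_d)

print(cutoff_scale(1.0))    # upper bound of Eq. (33)
print(cutoff_scale(0.03))   # the example v'/Lambda = 0.03
print(min_ratio())          # lower bound of Eq. (34)
```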
4. Vacuum alignment
In this section we investigate the problem of achieving the vacuum alignment of
Eq. (17). At the same time we should prevent, at least at some level, the interchange between
the fields $\varphi$ and $\varphi'$ to produce the desired mass matrices in the neutrino and charged
lepton sectors. As we will see, there are several difficulties in naturally accomplishing these
requirements. By minimizing the scalar potential of the theory with respect to $\varphi$ and $\varphi'$
we get six equations that we would like to satisfy in terms of the two unknowns $v$ and $v'$.
One might expect that, owing to the A4 symmetry, the six minimum conditions are not
all independent, but this expectation turns out to be wrong in the specific case,
unless some additional relation is enforced on the parameters of the scalar potential. These
additional relations are in general not natural. For instance, even by imposing them at the
tree level, they are expected to be violated at the one-loop order. Therefore, as we will now
illustrate, the minimum conditions cannot be all satisfied by our vacuum configuration.
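As an example here we analyze the most general renormalizable scalar potential invariant
under A4 and depending upon the triplets $\varphi$ and $\varphi'$ of the Lagrangian $L_Y$ in Eq. (16).
The term (ll) in $L_Y$ can be forbidden by an additional symmetry, commuting with A4. One
possibility is just the total lepton number L or a discrete subgroup of it. Here we consider
a Z4 symmetry under which $f^c$ transform into $-i f^c$ ($f = e, \mu, \tau$), $l$ into $i l$, $\varphi$ is invariant
and $\varphi'$ changes sign. This symmetry also explains why $\varphi$ and $\varphi'$ cannot be interchanged.
The scalar potential V contains bilinears $B_i$, trilinears $T_i$ and quartic terms $Q_i$, invariant
under the group A4 × Z4. A choice of independent invariants is:
\[
\begin{aligned}
B_1 &= \varphi_1^2 + \varphi_2^2 + \varphi_3^2,\\
B_2 &= \varphi_1'^2 + \varphi_2'^2 + \varphi_3'^2,\\
T_1 &= \varphi_1 \varphi_2 \varphi_3,\\
T_2 &= \varphi_1 \varphi_2' \varphi_3' + \varphi_2 \varphi_3' \varphi_1' + \varphi_3 \varphi_1' \varphi_2',\\
Q_1 &= \varphi_1^2 \varphi_2^2 + \varphi_2^2 \varphi_3^2 + \varphi_3^2 \varphi_1^2,\\
Q_2 &= \left| \varphi_1^2 + \omega^2 \varphi_2^2 + \omega\, \varphi_3^2 \right|^2,\\
Q_3 &= \varphi_1'^2 \varphi_2'^2 + \varphi_2'^2 \varphi_3'^2 + \varphi_3'^2 \varphi_1'^2,\\
Q_4 &= \left| \varphi_1'^2 + \omega^2 \varphi_2'^2 + \omega\, \varphi_3'^2 \right|^2,\\
Q_5 &= \varphi_1 \varphi_2 \varphi_1' \varphi_2' + \varphi_2 \varphi_3 \varphi_2' \varphi_3' + \varphi_3 \varphi_1 \varphi_3' \varphi_1',\\
Q_6 &= \left( \varphi_1^2 + \varphi_2^2 + \varphi_3^2 \right) \left( \varphi_1'^2 + \varphi_2'^2 + \varphi_3'^2 \right),\\
Q_7 &= \left( \varphi_1^2 + \omega^2 \varphi_2^2 + \omega\, \varphi_3^2 \right) \left( \varphi_1'^2 + \omega\, \varphi_2'^2 + \omega^2 \varphi_3'^2 \right).
\end{aligned}
\tag{36}
\]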
The scalar potential reads:
\[
V = \frac{M_1^2}{2} B_1 + \frac{M_2^2}{2} B_2 + \mu_1 T_1 + \mu_2 T_2 + c_1 Q_1 + c_2 Q_2 + c_3 Q_3 + c_4 Q_4
+ c_5 Q_5 + c_6 Q_6 + (c_7 Q_7 + c.c.).
\tag{37}
\]
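We start by analyzing the field configuration:
\[
\langle \varphi \rangle = (v, v, v), \qquad \langle \varphi' \rangle = (v', 0, 0).
\tag{38}
\]
The minimum conditions are:
\[
\begin{aligned}
\frac{\partial V}{\partial \varphi_1} &= M_1^2 v + \mu_1 v^2 + 4 c_1 v^3 + 2 c_6 v v'^2 + 2 (c_7 + \bar{c}_7) v v'^2 = 0,\\
\frac{\partial V}{\partial \varphi_2} &= M_1^2 v + \mu_1 v^2 + 4 c_1 v^3 + 2 c_6 v v'^2 + 2 (\omega^2 c_7 + \omega\, \bar{c}_7) v v'^2 = 0,\\
\frac{\partial V}{\partial \varphi_3} &= M_1^2 v + \mu_1 v^2 + 4 c_1 v^3 + 2 c_6 v v'^2 + 2 (\omega\, c_7 + \omega^2 \bar{c}_7) v v'^2 = 0,\\
\frac{\partial V}{\partial \varphi_1'} &= M_2^2 v' + 4 c_4 v'^3 + 6 c_6 v^2 v' = 0,\\
\frac{\partial V}{\partial \varphi_2'} &= \mu_2 v v' + c_5 v^2 v' = 0,\\
\frac{\partial V}{\partial \varphi_3'} &= \mu_2 v v' + c_5 v^2 v' = 0.
\end{aligned}
\tag{39}
\]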
The equations $\partial V/\partial \varphi_i = 0$ are clearly incompatible unless $c_7 = 0$. Even by forcing $c_7$
to vanish, we are left with three independent equations for the two unknowns $v$ and $v'$,
which, for generic values of the coefficients, admit only the trivial solution $v = v' = 0$.
This negative result cannot be modified by adding to V the terms depending on the singlet
$\xi$. Also by investigating the problem in a slightly more general framework, with $\varphi$ real and
$(\varphi', \xi)$ complex, we reach the same conclusion. Although we do not have a no-go theorem,
these examples show the difficulty of obtaining the desired alignment.
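The incompatibility of the first three minimum conditions can be made concrete with a numerical gradient. The sketch below codes the potential of Eq. (37) with the invariants of Eq. (36), evaluates its gradient by central finite differences at $\langle\varphi\rangle = (v, v, v)$, $\langle\varphi'\rangle = (v', 0, 0)$, and shows that the three $\varphi$-derivatives differ when $c_7 \neq 0$. All parameter values, the trilinear-coupling names and the step size are arbitrary assumptions made for illustration.

```python
import cmath

W = cmath.exp(2j * cmath.pi / 3)  # omega

def potential(p, q, par):
    """V of Eq. (37); p = phi and q = phi' are real triplets, c7 may be complex."""
    zp = p[0]**2 + W**2 * p[1]**2 + W * p[2]**2
    zq = q[0]**2 + W**2 * q[1]**2 + W * q[2]**2
    B1 = sum(x * x for x in p)
    B2 = sum(x * x for x in q)
    T1 = p[0] * p[1] * p[2]
    T2 = p[0] * q[1] * q[2] + p[1] * q[2] * q[0] + p[2] * q[0] * q[1]
    Q = [p[0]**2 * p[1]**2 + p[1]**2 * p[2]**2 + p[2]**2 * p[0]**2,   # Q1
         abs(zp)**2,                                                   # Q2
         q[0]**2 * q[1]**2 + q[1]**2 * q[2]**2 + q[2]**2 * q[0]**2,   # Q3
         abs(zq)**2,                                                   # Q4
         p[0]*p[1]*q[0]*q[1] + p[1]*p[2]*q[1]*q[2] + p[2]*p[0]*q[2]*q[0],  # Q5
         B1 * B2]                                                      # Q6
    Q7 = zp * zq.conjugate()  # second factor is the conjugate of zq for real q
    return (0.5 * par["M1sq"] * B1 + 0.5 * par["M2sq"] * B2
            + par["mu1"] * T1 + par["mu2"] * T2
            + sum(c * x for c, x in zip(par["c"], Q))
            + 2.0 * (par["c7"] * Q7).real)

def gradient(p, q, par, h=1e-5):
    """Central finite differences with respect to (phi_1..3, phi'_1..3)."""
    x = list(p) + list(q)
    grad = []
    for i in range(6):
        up, dn = x[:], x[:]
        up[i] += h
        dn[i] -= h
        grad.append((potential(up[:3], up[3:], par)
                     - potential(dn[:3], dn[3:], par)) / (2.0 * h))
    return grad

# Arbitrary illustrative parameters; "c" holds c1..c6.
par = {"M1sq": -1.0, "M2sq": -0.8, "mu1": 0.1, "mu2": 0.2,
       "c": [0.3, 0.4, 0.5, 0.6, 0.7, 0.2], "c7": 0.1 + 0.25j}
v, vp = 0.9, 1.3
g = gradient([v, v, v], [vp, 0.0, 0.0], par)
print(g[:3])  # three different values: the phi-equations cannot all vanish
```

Setting `par["c7"] = 0` makes the first three entries equal, leaving the condition $\mu_2 v v' + c_5 v^2 v' = 0$ as the third independent equation discussed above.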
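The difficulty illustrated above is not common to all vacua. For instance, the other possible alignment:
\[
\langle \varphi \rangle = (v, v, v), \qquad \langle \varphi' \rangle = (v', v', v')
\tag{40}
\]
leads to the minimum conditions:
\[
\begin{aligned}
\frac{\partial V}{\partial \varphi_i} &= M_1^2 v + \mu_1 v^2 + \mu_2 v'^2 + 4 c_1 v^3 + (2 c_5 + 6 c_6) v v'^2 = 0,\\
\frac{\partial V}{\partial \varphi_i'} &= M_2^2 v' + 2 \mu_2 v v' + 4 c_3 v'^3 + (2 c_5 + 6 c_6) v^2 v' = 0.
\end{aligned}
\tag{41}
\]
In a non-vanishing portion of the parameter space, these equations have non-trivial solutions
with non-vanishing $v$ and $v'$.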
It is possible to show that, by sufficiently restricting the form of the most general scalar
potential invariant under A4, the desired alignment can be obtained. Restrictions that are
unnatural in a generic model become technically natural in a supersymmetric (SUSY)
model. The well-known non-renormalization properties of the superpotential allow us to accept,
at least from a technical viewpoint, a restricted number of terms, compared to what
the A4 symmetry would permit. Undesired terms of the superpotential that are set to zero
at the tree level are not generated at any order in perturbation theory. Indeed we have
produced a SUSY example of this type, where the alignment problem is solved, and this
example is discussed in detail in Appendix A. However, our real aim is to build a fully natural
model, where all the terms allowed by the symmetries are present and where the only
deviations from the symmetry limit are provided by higher-dimensional operators, rather
than by small violations of ad hoc imposed relations. As we will now see, there exists a
simple and economical solution in the context of theories with one extra spatial dimension.
5. A4 model in an extra dimension

One of the problems we should overcome in the search for the correct alignment is that
of keeping neutrino and charged lepton sectors separate, including the respective symmetry
breaking sectors. Here we show that such a separation can be achieved by means of an extra
spatial dimension. The spacetime is assumed to be five-dimensional, the product of the
four-dimensional Minkowski spacetime times an interval going from y = 0 to y = L. At
y = 0 and y = L the spacetime has two four-dimensional boundaries, which we will call
branes. Our idea is that matter SU(2) singlets such as $e^c$, $\mu^c$, $\tau^c$ are localized at y = 0,
while SU(2) doublets, such as $l$, are localized at y = L (see Fig. 1). Neutrino masses arise
from local operators at y = L. Charged lepton masses are produced by non-local effects
involving both branes. Later on we will see how such non-local effects can arise in this
theory. The simplest possibility is to introduce a bulk fermion, depending on all spacetime
coordinates, that interacts with $e^c$, $\mu^c$, $\tau^c$ at y = 0 and with $l$ at y = L. The exchange of
such a fermion can provide the desired non-local coupling between right- and left-handed
ordinary fermions. Finally, assuming that $\varphi$ and $(\varphi', \xi)$ are localized respectively at y = 0
and y = L, we obtain a natural separation between the two sectors.
5.1. Alignment in an extra dimension
Such a separation also greatly simplifies the vacuum alignment problem. We can determine
the minima of two scalar potentials $V_0$ and $V_L$, depending only, respectively, on
$\varphi$ and $(\varphi', \xi)$. Indeed, as we shall see, there are whole regions of the parameter space
where $V_0(\varphi)$ and $V_L(\varphi', \xi)$ have the minima given in Eq. (17). Notice that in the present
setup dealing with a discrete symmetry such as A4 provides a great advantage as far as
the alignment problem is concerned. A continuous flavour symmetry such as, for instance,
SO(3) would need some extra structure to achieve the desired alignment. Indeed the potential
energy $\int d^4x\, [V_0(\varphi) + V_L(\varphi', \xi)]$ would be invariant under a much bigger symmetry,
SO(3)_0 × SO(3)_L, with the SO(3)_0 acting on $\varphi$ and leaving $(\varphi', \xi)$ invariant and vice versa
for SO(3)_L. This symmetry would remove any alignment between the VEVs of $\varphi$ and those
of $(\varphi', \xi)$. If, for instance, (17) is a minimum of the potential energy, then any other configuration
obtained by acting on (17) with SO(3)_0 × SO(3)_L would also be a minimum and
the relative orientation between the two sets of VEVs would be completely undetermined.
A discrete symmetry such as A4 does not have this problem, as we will show now.
Fig. 1. Fifth dimension and localization of scalar and fermion fields. The symmetry breaking sector includes the
A4 triplets $\varphi$ and $\varphi'$, localized at the opposite ends of the interval. Their VEVs are dynamically aligned along the
directions shown at the top of the figure.
Consider first the scalar potential $V_0(\varphi)$:
\[
V_0(\varphi) = \frac{M_1^2}{2} B_1 + \mu_1 T_1 + c_1 Q_1 + c_2 Q_2,
\tag{42}
\]
where $B_1$, $T_1$, $Q_{1,2}$ are defined in Eq. (36). The minimum conditions at $\langle \varphi \rangle = (v, v, v)$ are:
\[
\frac{\partial V_0}{\partial \varphi_i} = v \left[ M_1^2 + \mu_1 v + 4 c_1 v^2 \right] = 0 \qquad (i = 1, 2, 3),
\tag{43}
\]
while the minimum condition at $\langle \varphi \rangle = (v, 0, 0)$ is:
\[
\frac{\partial V_0}{\partial \varphi_1} = v \left[ M_1^2 + 4 c_2 v^2 \right] = 0,
\tag{44}
\]
since in this case $(\partial V_0/\partial \varphi_{2,3}) = 0$ are automatically satisfied. Both $\langle \varphi \rangle = (v, v, v)$ and $\langle \varphi \rangle =
(v, 0, 0)$ can be local minima of $V_0$, depending on the parameters. The constants $c_{1,2}$ should
be positive, to have $V_0$ bounded from below. We can look at the region where $|\mu_1| \ll |M_1|$.
When $c_1 \gg c_2$ and $M_1^2 < 0$, the minimum at $\langle \varphi \rangle = (v, 0, 0)$ is the absolute one, while for
$c_2 \gg c_1$ and $M_1^2 < 0$, $V_0$ is minimized by $\langle \varphi \rangle = (v, v, v)$. Therefore we have a large portion
of the parameter space where the minimum is of the desired form: $\langle \varphi \rangle = (v, v, v)$. To be
precise, in this region, there are four degenerate minima: $\langle \varphi \rangle = (v, v, v)$, $(v, -v, -v)$,
$(-v, v, -v)$ and $(-v, -v, v)$, related by A4 transformations.
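The competition between the two candidate vacua of $V_0$ can be illustrated in the limit $\mu_1 = 0$, where the stationary points of Eqs. (43) and (44) are known in closed form. The sketch below is only a toy check under that assumption; the function name and the parameter values are illustrative.

```python
def v0_minimum_values(M1sq, c1, c2):
    """Values of V0 at its two candidate vacua for mu1 = 0 and M1sq < 0.

    At phi = (v, v, v): B1 = 3 v^2, Q1 = 3 v^4, Q2 = 0 and v^2 = -M1sq / (4 c1).
    At phi = (v, 0, 0): B1 = v^2,   Q1 = 0,     Q2 = v^4 and v^2 = -M1sq / (4 c2).
    """
    v_sq_sym = -M1sq / (4.0 * c1)
    v_sq_ax = -M1sq / (4.0 * c2)
    V_sym = 1.5 * M1sq * v_sq_sym + 3.0 * c1 * v_sq_sym**2
    V_ax = 0.5 * M1sq * v_sq_ax + c2 * v_sq_ax**2
    return V_sym, V_ax

print(v0_minimum_values(-1.0, 0.1, 1.0))  # c2 >> c1: (v, v, v) lies lower
print(v0_minimum_values(-1.0, 1.0, 0.1))  # c1 >> c2: (v, 0, 0) lies lower
```

In this limit the two values are $-3 M_1^4/(16 c_1)$ and $-M_1^4/(16 c_2)$, so the symmetric vacuum is the deeper one whenever $c_1 < 3 c_2$, consistent with the regions described above.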
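Now we turn to $V_L(\varphi', \xi)$. As we did in Section 4, we assume both $\varphi'$ and $\xi$ real and
odd under the action of a discrete Z4 symmetry. The most general renormalizable invariant
potential is a combination of $B_2$, $Q_{3,4}$ in Eq. (36) and the following invariants:
\[
B_3 = \xi^2, \qquad Q_8 = \xi^4, \qquad Q_9 = \xi\, \varphi_1' \varphi_2' \varphi_3', \qquad
Q_{10} = \xi^2 \left( \varphi_1'^2 + \varphi_2'^2 + \varphi_3'^2 \right).
\tag{45}
\]
We have:
\[
V_L(\varphi', \xi) = \frac{M_2^2}{2} B_2 + \frac{M_3^2}{2} B_3 + c_3 Q_3 + c_4 Q_4 + c_8 Q_8 + c_9 Q_9 + c_{10} Q_{10}.
\tag{46}
\]
We search for a minimum at $\langle \varphi' \rangle = (v', 0, 0)$ and $\langle \xi \rangle = u$:
\[
\begin{aligned}
\frac{\partial V_L}{\partial \varphi_1'} &= v' \left[ M_2^2 + 4 c_4 v'^2 + 2 c_{10} u^2 \right] = 0,\\
\frac{\partial V_L}{\partial \xi} &= u \left[ M_3^2 + 4 c_8 u^2 + 2 c_{10} v'^2 \right] = 0,
\end{aligned}
\tag{47}
\]
while $(\partial V_L/\partial \varphi'_{2,3}) = 0$ are always satisfied. There is a region of the parameter space
where the absolute minimum is of this type. Taking into account the A4 symmetry, in
all this region we have six degenerate minima: $\langle \varphi' \rangle = (\pm v', 0, 0)$, $\langle \varphi' \rangle = (0, \pm v', 0)$ and
$\langle \varphi' \rangle = (0, 0, \pm v')$. Putting together the minima of $V_0(\varphi)$ and $V_L(\varphi', \xi)$ we have 24 degenerate
minima of the potential energy, differing in signs or ordering. It can be shown that
these 24 minima produce exactly the same mass pattern discussed in Section 3, up to field
and parameter redefinitions. Therefore, it is not restrictive to choose one of them, for instance
$\langle \varphi \rangle = (v, v, v)$ and $\langle \varphi' \rangle = (v', 0, 0)$, to analyze the properties of this model.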
The observed hierarchy among lepton masses can be efficiently described by an additional
U(1)_F flavour symmetry, under which only right-handed charged leptons are
charged: $F(e^c, \mu^c, \tau^c) = (4, 2, 0)$. To spontaneously break this symmetry and to produce
the desired hierarchy, we need a scalar field $\theta$, carrying a negative unit of F and developing
a VEV $\langle \theta \rangle/\Lambda \approx 0.22$. In our framework $\theta$ is localized on the brane at y = 0 and the scalar
potential $V_0$ of Eq. (42) is modified into:
\[
V_0 \to V_0 + M_4^2 B_4 + c_{11} Q_{11} + c_{12} Q_{12},
\tag{48}
\]
where
\[
B_4 = |\theta|^2, \qquad Q_{11} = |\theta|^4, \qquad Q_{12} = |\theta|^2 \left( \varphi_1^2 + \varphi_2^2 + \varphi_3^2 \right).
\tag{49}
\]
The minimum conditions at $\langle \varphi \rangle = (v, v, v)$ and $|\langle \theta \rangle| = t$ read:
\[
\begin{aligned}
\frac{\partial V_0}{\partial \varphi_i} &= v \left[ M_1^2 + \mu_1 v + 4 c_1 v^2 + 2 c_{12} t^2 \right] = 0 \qquad (i = 1, 2, 3),\\
\frac{\partial V_0}{\partial |\theta|} &= 2 t \left[ M_4^2 + 2 c_{11} t^2 + 3 c_{12} v^2 \right] = 0.
\end{aligned}
\tag{50}
\]
These conditions are satisfied by non-vanishing (t, v) in a finite portion of the parameter
space. Therefore the inclusion of an Abelian flavour symmetry is fully compatible with the
mechanism for vacuum alignment discussed above.
5.2. Lepton masses and mixing angles
We now show how it is possible to take advantage of the above results to obtain the desired
lepton masses. To this purpose we introduce a bulk fermion field $F(x, y) = (F_1, \bar{F}_2)$, singlet
under SU(2) with hypercharge Y = -1 and transforming as a triplet of A4. We also
impose the discrete Z4 symmetry introduced in Section 4 under which $(f^c, l, F, \varphi, \varphi', \xi)$
transform into $(-i f^c, i l, i F, \varphi, -\varphi', -\xi)$. The action is
\[
\begin{aligned}
S = \int d^4x\, dy\, \Big\{ & \Big[ i F_1 \sigma^\mu \partial_\mu \bar{F}_1 + i F_2 \sigma^\mu \partial_\mu \bar{F}_2 + \tfrac{1}{2} \left( F_2 \partial_y F_1 - \partial_y F_2 F_1 + h.c. \right) \Big]\\
& - M (F_1 F_2 + \bar{F}_1 \bar{F}_2)\\
& + V_0(\varphi)\, \delta(y) + V_L(\varphi', \xi)\, \delta(y - L)\\
& + \Big[ Y_e\, e^c (\varphi F_1) + Y_\mu\, \mu^c (\varphi F_1)'' + Y_\tau\, \tau^c (\varphi F_1)' + h.c. \Big] \delta(y)\\
& + \Big[ \frac{x_a}{\Lambda^2}\, \xi (ll) h_u h_u + \frac{x_d}{\Lambda^2}\, (\varphi' ll) h_u h_u + Y_L (F_2 l) h_d + h.c. \Big] \delta(y - L) + \cdots \Big\},
\end{aligned}
\tag{51}
\]
where the constants Y have mass dimension -1/2. The first two lines represent the
five-dimensional kinetic and mass terms of the bulk field F. The third line is the scalar potential
and the remaining terms are the lowest order invariant operators localized at the two branes.
Dots stand for the kinetic terms of $f^c$, $l$, $\varphi$, $\varphi'$, $\xi$ and for higher-dimensional operators,
which will be classified in Section 6.
The potential energy is given, at lowest order, by:
\[
U = \int d^4x\, \left[ V_0(\varphi) + V_L(\varphi', \xi) \right],
\tag{52}
\]
and, under the conditions discussed above, is minimized by Eq. (17). It is clear that $\varphi$ and
$(\varphi', \xi)$ are strictly separated only at lowest order. Indeed higher-dimensional brane interactions
like, for instance, $(\varphi\varphi F_1 F_2)/\Lambda^2$, $(\varphi'\varphi' F_1 F_2)/\Lambda^2$ are allowed. At the one-loop level,
the exchange of the bulk fermion F will give rise to the structures $Q_{5,6,7}$ of Eq. (36) and
this will necessarily deform the vacuum (17). Here we will assume that such a deformation
is sufficiently small. Indeed, as we shall see in Section 6, the operators of the type $(\varphi\varphi\varphi'\varphi')$
arising from one-loop F-exchange are suppressed by $1/(\Lambda^4 L^4)$.
We now discuss the effects of the tree-level exchange of F. To this purpose we consider
the equations of motion for $(F_1, F_2)$:
\[
i \sigma^\mu \partial_\mu \bar{F}_2 + \partial_y F_1 - M F_1 = 0, \qquad
i \sigma^\mu \partial_\mu \bar{F}_1 - \partial_y F_2 - M F_2 = 0.
\tag{53}
\]
If M is large and positive, we can prove that all the modes contained in $(F_1, F_2)$ become
heavy, at a scale greater than or comparable to 1/L, which we assume to be much higher
than the electroweak scale. If we are only interested in energies much lower than 1/L,
we can solve the equations of motion in the static approximation, by neglecting the
four-dimensional kinetic term:
\[
F_1(y) = F_1(L)\, e^{M(y - L)}, \qquad F_2(y) = F_2(0)\, e^{-M y}.
\tag{54}
\]
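These equations must be supplemented with appropriate boundary conditions, which we
can identify by varying the action S with respect to the fields $(F_1, F_2)$. The boundary terms
read
\[
\begin{aligned}
(\delta S)_{boundary} = \int d^4x\, \Big\{ & \tfrac{1}{2}\, \delta F_1(L)\, F_2(L) + \delta F_2(L) \Big[ -\tfrac{1}{2} F_1(L) + Y_L\, l h_d \Big]\\
& + \delta F_1(0) \Big[ -\tfrac{1}{2} F_2(0) + Y_e\, e^c \varphi + Y_\mu\, \mu^c \varphi'' + Y_\tau\, \tau^c \varphi' \Big]
+ \tfrac{1}{2}\, \delta F_2(0)\, F_1(0) \Big\},
\end{aligned}
\tag{55}
\]
where here $\varphi'' \equiv (\varphi_1, \omega \varphi_2, \omega^2 \varphi_3)$ and $\varphi' \equiv (\varphi_1, \omega^2 \varphi_2, \omega \varphi_3)$ denote rephased copies of $\varphi$. We can choose as boundary conditions:
\[
F_1(L) = 2 Y_L\, l h_d, \qquad
F_2(0) = 2 \left[ Y_e\, e^c \varphi + Y_\mu\, \mu^c \varphi'' + Y_\tau\, \tau^c \varphi' \right].
\tag{56}
\]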
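Since $\delta F_1(L) = \delta F_2(0) = 0$, we have $(\delta S)_{boundary} = 0$, as desired. By substituting back
Eqs. (54) and (56) into the action S we get
\[
S = U + \int d^4x\, \Big[ \frac{y_e}{\Lambda}\, e^c (\varphi l) h_d + \frac{y_\mu}{\Lambda}\, \mu^c (\varphi l)'' h_d + \frac{y_\tau}{\Lambda}\, \tau^c (\varphi l)' h_d
+ \frac{x_a}{\Lambda^2}\, \xi (ll) h_u h_u + \frac{x_d}{\Lambda^2}\, (\varphi' ll) h_u h_u + h.c. + \cdots \Big],
\tag{57}
\]
with
\[
\frac{y_f}{\Lambda} = 4 Y_L Y_f\, e^{-ML} \qquad (f = e, \mu, \tau).
\tag{58}
\]
Therefore, in lowest order approximation we have reproduced the Lagrangian $L_Y$ of
Eq. (16) and the discussion of Section 3 applies.
We also recall that, to account for the observed hierarchy of the charged lepton masses,
we have included an additional U(1) flavour symmetry. Therefore, in the present picture,
the quantities $Y_{e,\mu,\tau}$ stand for:
\[
Y_e = \tilde{Y}_e \left( \frac{\theta}{\Lambda} \right)^4, \qquad
Y_\mu = \tilde{Y}_\mu \left( \frac{\theta}{\Lambda} \right)^2, \qquad
Y_\tau = \tilde{Y}_\tau,
\tag{59}
\]
where $\tilde{Y}_{e,\mu,\tau}$ are field-independent constants having similar values. After spontaneous
breaking of U(1), the Yukawa couplings $y_f$ possess the desired hierarchy.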
6. Higher-order corrections
The results of the previous section hold to first approximation. Higher-dimensional operators, suppressed by additional powers of the cut-off $\Lambda$, can be added to the leading terms
in Eqs. (42), (46), (52), (57), (58). Here we will classify these terms and analyze their physical effects. In particular we will show that these corrections are completely under control
in our model and that they can be made negligibly small without any fine-tuning. We can
order higher-order operators into three groups.
6.1. Local corrections to m
There are higher-order operators that are local in the five-dimensional theory and do
not depend upon the heavy fermion sector $(F_1, F_2)$. As we have seen, at leading order, the
neutrino mass matrix $m_\nu$ arises entirely from operators of this type that are localized at
y = L. On this brane we only have scalar fields $(\varphi', \xi)$, odd under Z4. Therefore,
higher-dimensional operators modifying $m_\nu$ and localized at y = L are down by two powers of
the cut-off, compared to the leading ones. After A4 breaking, the only two operators that
cannot be absorbed by a redefinition of the parameters $x_{a,d}$ are:
\[
\frac{x_b}{\Lambda^4}\, \xi\, (\varphi'\varphi')' (ll)'' h_u h_u, \qquad
\frac{x_c}{\Lambda^4}\, \xi\, (\varphi'\varphi')'' (ll)' h_u h_u.
\tag{60}
\]
After adding these operators localized at y = L to the five-dimensional action of Eq. (51),
we get a neutrino mass matrix
\[
m_\nu = \frac{v_u^2}{\Lambda} \begin{pmatrix} a + b + c & 0 & 0\\ 0 & a + \omega b + \omega^2 c & d\\ 0 & d & a + \omega^2 b + \omega c \end{pmatrix},
\tag{61}
\]
where
\[
b \equiv x_b \frac{u v'^2}{\Lambda^3}, \qquad c \equiv x_c \frac{u v'^2}{\Lambda^3},
\tag{62}
\]
to be compared with $a$ and $d$ of Eq. (20).
6.2. Corrections from tree-level F -exchange

Another set of higher-dimensional operators arise from the exchange of the heavy
fermion (F1 , F2 ) in the static limit and in the tree-level approximation. To classify them,
we should list all operators localized at the two branes that are linear in the bulk fermion
(F1 , F2 ). At y = 0 such operators have the generic structure
(1)
Yf
(2)
Yf
(63)
f F1 ,
f c 3 F1 , . . . (f = e, , ).
2
After spontaneous A4 breaking, the effect of these operators can be absorbed by redefining the coupling constants Yf , (f = e, , ), at least up to order 3 . Thus the leading
interactions between f c and F1
c

Ye e (F1 ) + Y c (F1 ) + Y c (F1 ) + h.c. (y)
(64)
c
Yf f F1 ,
c 2
are unchanged up to relative order 1/2 . We are left with the couplings of F2 at the brane
y = L. Neglecting all operators that, after A4 breaking, only lead to a renormalization of
the parameter YL , we find four new terms:
Z1
( ) (F2 l) hd ,
2
Z2
( ) (F2 l) hd ,
2

Z3
1 (F2 )2 l3 + 2 (F2 )3 l1 + 3 (F2 )1 l2 hd ,
2

Z4
1 (F2 )3 l2 + 2 (F2 )1 l3 + 3 (F2 )2 l1 hd .
2
(65)
After the breaking of A4 , the leading order interaction of F2 at y = L is modified by the

operators (65) to

d + h.c. (y L),
YL (F2 l)h
(66)
where

l1
1 + z1 + z2
0
l2 =
0
l3
z1,2
Z1,2 v 2
,
YL 2
0
1 + z1 + 2 z2
z3
z3,4
0
z4
1 + 2 z1 + z2
l1
l2 ,
l3
Z3,4 uv
.
YL 2
(67)
(68)
After integrating out the heavy modes in (F1 , F2 ) in the limit of vanishing external momenta for the light modes, we obtain the effective four-dimensional Lagrangian
y c
y
ye c
hd ,
e ( l)hd +
( l) hd + c ( l)
yf
= 4YL Yf eML (f = e, , ).
The mass matrix for the charged leptons becomes

ye (1 + z + z )
ye (1 + z1 + 2 z2 + z3 )
1
2
v
y (1 + z1 + z2 ) y (1 + z1 + 2 z2 + z3 )
ml = vd
2
2
2
y (1 + z1 + z2 )
y (1 + z1 + z2 + z3 )
(69)
(70)
ye (1 + 2 z1 + z2 + z4 )
y 2 (1 + 2 z1 + z2 + 2 z4 )
y (1 + 2 z1 + z2 + z4 )
(71)
6.3. Effects on masses and mixing angles
To first order in the small parameters b, c and zi , the neutrino masses are modified into:
2

v
1
m1 = a + d (b + c) u ,
2
v2
m2 = (a + b + c) u ,
2

v
1
m3 = a + d + (b + c) u ,
2
and the charged lepton masses are changed into

z3 z4
v
+
vd ,
me = 3ye 1 +
3
3

v
z3
2 z4
m = 3y 1 + +
vd ,
3
3

z4
v
2 z3
+
vd .
m = 3y 1 +
3
3
(72)
(73)
To the same order, but neglecting terms like zi ye, /y , we get:

(b c)(d
a) + (b c)(d + a)
1
i
i

+ z1 + z 2 + z3 z4 ,
|Ue3 | =
3
3
2 2(a d + ad)
2

2
+ (b c)d
1
tan 23 = 1 + (b c)d
+ 2 z2 + z 2 + (z3 + z 3 + z4 + z 4 ) ,
3
(a d + ad)

2 1
tan 12 = 1 + 3 z1 z 1 z2 z 2 + z3 + z 3 + z4 + z 4 .
(74)
2
2
3
3
These relations explicitly show that the corrections induced by the higher-dimensional operators
are of order $u v'/\Lambda^2$ or $v'^2/\Lambda^2$. From our estimate in Eq. (35) we see that these
parameters can be as small as $2 \times 10^{-5}$. If the cut-off is one order of magnitude larger
than the VEVs of the model, the resulting corrections are at the level of one percent, already
beyond any planned experimental test. If, on the contrary, the VEVs are anomalously
close to the cut-off $\Lambda$, then Eq. (74) shows that deviations roughly of the same size are
expected in $U_{e3}$, $\tan^2\theta_{23}$ and $\tan^2\theta_{12}$. How close to $\Lambda$ can the VEVs be? We expect
that the subleading corrections do not spoil the leading order form of the neutrino mass
spectrum, Eq. (28). This implies that $v'^2/\Lambda^2 \lesssim r$, so that $r$ sets the natural upper bound to
the expected deviations from the leading order results.
6.4. Corrections from one-loop F -exchange
Further corrections to lepton mass matrices and to the scalar potential can arise from
one-loop exchange of (F1 , F2 ) in the static limit. Consider, for instance, the following
operators localized at y = 0 and at y = L:
$$\frac{1}{\Lambda}\, F_1 F_2\, \delta(y), \qquad \frac{1}{\Lambda^6}\, l l F_1 F_2 h_u h_u\, \delta(y - L). \tag{75}$$
By integrating out, at one-loop order, the heavy modes contained in (F1 , F2 ) we get:

$$\frac{1}{\Lambda^7}\, l l h_u h_u \int d^4k\, F(k, 0, L)\, F(k, L, 0), \tag{76}$$
where $k$ is the four-momentum running in the loop and $F(k, y, y')$ is the dimensionless
propagator of $(F_1, F_2)$ in a mixed momentum-space representation. Since the loop integral
is convergent, we get
$$\frac{1}{\Lambda^3}\, \frac{f(ML)}{\Lambda^4 L^4}\, l l h_u h_u, \tag{77}$$
where $f(ML)$ is a function of the dimensionless combination $ML$. Thus the resulting local
operator is suppressed by four additional powers of the cut-off scale. This behavior is quite
generic and similar suppressions are found for other operators originating from one-loop
exchange of $(F_1, F_2)$.
The corrections that modify the scalar potential discussed in the previous section are of
this type. As an example, consider the localized interactions:
1
F1 F2 (y),
2
1
F1 F2 (y L).
2
(78)
Also in this case, after integrating over (F1 , F2 ) in the limit of vanishing external momenta,
we get:
f (ML)
.
(79)
4 L4
Due to their large suppression, these corrections are negligible compared to those discussed
above.
7. Conclusion
There are by now several theoretical mechanisms that can qualitatively explain the
observed large lepton mixing angles [19]. They are sufficiently flexible to quantitatively accommodate the measured parameters. They are also compatible with our ideas on quark
masses and mixing angles so that they can be nicely embedded into a unified picture of
fermion properties, such as, for instance, a grand unified theory. Many of these mechanisms predict a generically large atmospheric mixing angle and a generically small $\theta_{13}$
angle, without favouring any specific value for these parameters. The best values of global
fits are currently very close to $\theta_{23} = \pi/4$ and $\theta_{13} = 0$, but the experimental errors still allow for large deviations from these remarkable values. Indeed, according to many of the
above mentioned mechanisms, deviations from $\theta_{23} = \pi/4$ and $\theta_{13} = 0$ are expected at the
observable level. It may take a long time before such deviations can be actually observed.
A sensitivity to $\theta_{13}$ of around 0.05 is foreseen in about ten years from now, with the full
exploitation of high-intensity neutrino beams. A reduction by a factor of two of the present
error on $\theta_{23}$ will also require special neutrino beams and a similar time scale.
It might happen that after all this experimental effort, $(\theta_{23} - \pi/4)$ and $\theta_{13}$ still remain
close to zero, within errors. At this point it would be legitimate to suspect that such special
values are produced by a highly symmetric flavour dynamics. Given the already good experimental precision on $\theta_{12}$, the so-called Harrison–Perkins–Scott (HPS) mixing scheme, where
$\theta_{23} = \pi/4$, $\theta_{13} = 0$ and $\sin^2\theta_{12} = 1/3$, would fit the data very well. In this paper we
have proposed a model that reproduces accurately the HPS mixing pattern. We started
by discussing whether such a pattern can be obtained from an exact flavour symmetry.
We showed that, under general conditions, an exactly maximal atmospheric mixing angle
cannot arise from an exact flavour symmetry. The flavour symmetry should be necessarily
broken and a maximal 23 is the result of a special alignment between the breaking effects
in the neutrino sector and those occurring in the charged lepton sector. If the flavour symmetry is spontaneously broken, this corresponds to a non-trivial vacuum alignment. Our
model gives rise to the HPS mixing scheme in the context of a spontaneously broken A4
flavour symmetry, A4 being the discrete subgroup of SO(3) leaving a tetrahedron invariant.
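The HPS pattern quoted above can be checked numerically. The following is a minimal sketch, assuming one common sign convention for the tri-bimaximal matrix (the convention and the helper name `mixing_angles` are illustrative choices, not taken from the text); only absolute values of the entries are used, so the result does not depend on that convention.

```python
import math

s = math.sqrt

# Tri-bimaximal (HPS) mixing matrix; the sign convention is an assumption.
U_HPS = [
    [s(2 / 3), s(1 / 3), 0.0],
    [-s(1 / 6), s(1 / 3), -s(1 / 2)],
    [-s(1 / 6), s(1 / 3), s(1 / 2)],
]

def mixing_angles(U):
    """Return (sin^2 theta12, sin^2 theta23, sin^2 theta13) from |U| entries."""
    s13_sq = abs(U[0][2]) ** 2
    s12_sq = abs(U[0][1]) ** 2 / (1.0 - s13_sq)
    s23_sq = abs(U[1][2]) ** 2 / (1.0 - s13_sq)
    return s12_sq, s23_sq, s13_sq

s12_sq, s23_sq, s13_sq = mixing_angles(U_HPS)
theta23 = math.asin(s(s23_sq))  # atmospheric angle in radians

print(f"sin^2(theta12) = {s12_sq:.6f}")
print(f"theta23 / pi   = {theta23 / math.pi:.6f}")
print(f"sin^2(theta13) = {s13_sq:.6f}")
```

Any deviation of measured angles from these values can then be expressed as a difference from the three numbers returned by `mixing_angles`.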
At leading order, that is by neglecting symmetric operators of higher dimension, neutrino masses only depend on two complex Yukawa coupling constants. Due to the unknown phase difference between these two constants, we cannot determine the absolute
scale of neutrino masses. We expect that the neutrino spectrum is of the normal hierarchy type but not too far from degenerate. At leading order the model predicts $|m_3|^2 = |m_{ee}|^2 + \frac{10}{9}\Delta m^2_{atm}(1 - r/2)$. A remarkable feature of our model is that at the leading
order the lepton mixing angles are completely independent from these two parameters, so
that the HPS mixing pattern is always obtained. The lepton mixing depends entirely on the
relative alignment between the VEVs giving masses to the neutrino sector and those giving
masses to the lepton sector. We discuss in detail the problem of vacuum alignment. To avoid
the proliferation of Higgs doublets, the scalar fields breaking A4 are gauge singlets in our
model. We propose an unconventional solution to the vacuum alignment problem, where an
extra dimension described by a spatial interval plays an important role. Two scalar sectors
live at the opposite ends of the interval and their respective scalar potentials are minimized
by the desired field configurations, for natural values of the implied parameters. Such a
mechanism only works in the case of discrete symmetries, since in the continuous case the
large symmetry of the total potential energy would make the relative orientations of the
two scalar sectors undetermined. We have also extensively discussed how this lowest order
picture is modified by the introduction of higher-dimensional operators. The induced corrections are parametrically small, of second order in the expansion parameter VEV/$\Lambda$,
$\Lambda$ being the cut-off of the theory, and they can be made numerically negligible. Last but not
least, the hierarchy of the charged lepton masses can be reproduced by the usual Froggatt–Nielsen
mechanism within the context of an Abelian flavour symmetry, which turns out to
be fully compatible with the present scheme.
We believe that, from a purely technical point of view, we have fulfilled our goal to
realize a completely natural construction of the HPS mixing scheme. But to construct our
model we had to introduce a number of special dynamical tricks (like a peculiar set of
discrete symmetries in extra dimensions). Apparently this is the price to pay for a special
model where all mixing angles are fixed to particular values. Perhaps this exercise can be
taken as a hint that it is more plausible to expect that, in the end, experiment will select a
normal model with $\theta_{13}$ not too small and $\theta_{23}$ not too close to maximal.
Acknowledgement
We thank Zurab Berezhiani, Isabella Masina and Luigi Pilo for useful discussions. F.F.
thanks the CERN Theory Division for hospitality in summer 2004, when this project
started. This project is partially supported by the European Program MRTN-CT-2004-503369.
Appendix A
Here we discuss a SUSY solution to the vacuum alignment problem. In a supersymmetric context, the right-hand side of Eq. (16) should be interpreted as the superpotential $w_l$
of the theory, in the lepton sector. A key observation is that this superpotential is invariant not only with respect to the gauge symmetry $SU(2) \times U(1)$ and the flavour symmetry
$U(1)_F \times A_4$, but also under a discrete $Z_3$ symmetry and a continuous $U(1)_R$ symmetry
under which the fields transform as shown in Table 1.
We see that the $Z_3$ symmetry explains the absence of the term $(ll)$ in $w_l$: such a term
transforms as $\omega^2$ under $Z_3$ and needs to be compensated by the field $\xi$ in our construction.
Table 1
Field
A4
Z3
U(1)R
l
3
ec
hu,d
1
2
1
1
1
0
2
1
2
1
1
1
0
3
1
0
3
1
2
At the same time $Z_3$ does not allow the interchange between $\varphi$ and $\varphi'$, which transform
differently under $Z_3$. Charged leptons and neutrinos acquire masses from two independent
sets of fields. If the two sets of fields develop VEVs according to the alignment described
in Eq. (17), then the desired mass matrices follow.
Finally, there is a continuous $U(1)_R$ symmetry that contains the usual R-parity as a
subgroup. Suitably extended to the quark sector, this symmetry forbids the unwanted dimension two and three terms in the superpotential that violate baryon and lepton number
at the renormalizable level. The $U(1)_R$ symmetry allows us to classify fields into three
sectors. There are matter fields such as the leptons $l$, $e^c$, $\mu^c$ and $\tau^c$, which occur in the
superpotential through bilinear combinations. There is a symmetry breaking sector including the Higgs doublets $h_{u,d}$ and the flavons $\varphi$, $\varphi'$ and $\xi$. As we will see, these fields
acquire non-vanishing vacuum expectation values (VEVs) and break the symmetries of the
model. Finally, there are driving fields such as $\varphi_0$, $\varphi'_0$ and $\xi_0$ that allow us to build a nontrivial scalar potential in the symmetry breaking sector. Since driving fields have R-charge
equal to two, the superpotential is linear in these fields.
The full superpotential of the model is
w = wl + wd ,
(A.1)
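where, at leading order in a $1/\Lambda$ expansion, $w_l$ is given by the right-hand side of Eq. (16)
and the driving term $w_d$ reads:
$$w_d = M(\varphi_0\varphi) + g(\varphi_0\varphi\varphi) + g_1(\varphi'_0\varphi'\varphi') + g_2\,\xi(\varphi'_0\varphi') + g_3\,\xi_0(\varphi'\varphi') + g_4\,\xi_0\xi^2. \tag{A.2}$$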
We notice that at the leading order there are no terms involving the Higgs fields hu,d . We
assume that the electroweak symmetry is broken by some mechanism, such as radiative
effects when supersymmetry (SUSY) is broken. It is interesting that at the leading order
the electroweak scale does not mix with the potentially large scales $u$, $v$ and $v'$. The scalar
potential is given by:
$$V = \sum_i \left|\frac{\partial w}{\partial \phi_i}\right|^2 + m_i^2\,|\phi_i|^2 + \cdots, \tag{A.3}$$
where $\phi_i$ denote collectively all the scalar fields of the theory, $m_i^2$ are soft masses and dots
stand for D-terms for the fields charged under the gauge group and possible additional soft
breaking terms. Since $m_i$ are expected to be much smaller than the mass scales involved
in $w_d$, it makes sense to minimize $V$ in the supersymmetric limit and to account for soft
breaking effects subsequently. From the driving sector we have:
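$$\begin{aligned}
\frac{\partial w}{\partial \varphi_{01}} &= M\varphi_1 + g\,\varphi_2\varphi_3 = 0,\\
\frac{\partial w}{\partial \varphi_{02}} &= M\varphi_2 + g\,\varphi_3\varphi_1 = 0,\\
\frac{\partial w}{\partial \varphi_{03}} &= M\varphi_3 + g\,\varphi_1\varphi_2 = 0,\\
\frac{\partial w}{\partial \varphi'_{01}} &= g_1\,\varphi'_2\varphi'_3 + g_2\,\xi\varphi'_1 = 0,\\
\frac{\partial w}{\partial \varphi'_{02}} &= g_1\,\varphi'_3\varphi'_1 + g_2\,\xi\varphi'_2 = 0,\\
\frac{\partial w}{\partial \varphi'_{03}} &= g_1\,\varphi'_1\varphi'_2 + g_2\,\xi\varphi'_3 = 0,\\
\frac{\partial w}{\partial \xi_0} &= g_3\,(\varphi'\varphi') + g_4\,\xi^2 = 0.
\end{aligned} \tag{A.4}$$
The first three equations are solved by (up to irrelevant sign ambiguities):
$$\varphi = (v, v, v), \qquad v = -\frac{M}{g}. \tag{A.5}$$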
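As a quick consistency check of this solution, one can substitute the symmetric ansatz into the first of the driving equations. The sketch below assumes the triplet flavon is written as $\varphi = (\varphi_1, \varphi_2, \varphi_3)$ and that the first three equations of (A.4) have the form $M\varphi_i + g\,\varphi_j\varphi_k = 0$ with $(i, j, k)$ cyclic; this labelling is an assumption made for illustration.

```latex
\begin{align*}
\left. M\varphi_1 + g\,\varphi_2\varphi_3 \right|_{\varphi_1=\varphi_2=\varphi_3=v}
  &= M v + g v^2 = v\,(M + g v) = 0 \\
  &\Longrightarrow\quad v = 0 \quad\text{or}\quad v = -\frac{M}{g}.
\end{align*}
```

The other two equations are cyclic permutations of the first and give the same condition, so the non-trivial solution fixes the common VEV in terms of $M$ and $g$ alone.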
The remaining equations are solved, in general, by:
$$\varphi' = (0, 0, 0), \qquad \xi = 0, \tag{A.6}$$
unless some further relation is imposed on the coefficients $g_1, \ldots, g_4$. If $g_2 = 0$, then, up
to an irrelevant reordering, we have
$$\varphi' = (v', 0, 0), \qquad \xi = u = \sqrt{-\frac{g_3}{g_4}\,(\varphi'\varphi')} \tag{A.7}$$
with $v'$ and $u$ undetermined. In this case we find that, for $m^2_{\varphi_0}, m^2_{\varphi'_0}, m^2_{\xi_0} > 0$, the driving
fields $\varphi_0$, $\varphi'_0$ and $\xi_0$ vanish at the minimum. Moreover, if $m^2_{\varphi'}, m^2_{\xi} < 0$, then $u$ and $v'$ slide to
large scales, eventually stabilized by one-loop radiative corrections. The supersymmetric
case is better than the non-supersymmetric case in two respects. First of all, at least from
a technical viewpoint, the absence of a term in the superpotential is radiatively stable.
Moreover, as we have seen, once $g_2$ has been set to zero, the equations selecting (17) as
the correct minimum are consistent.
References
[1] A. Strumia, F. Vissani, hep-ph/0503246;
G.L. Fogli, E. Lisi, A. Marrone, A. Melchiorri, A. Palazzo, P. Serra, J. Silk, Phys. Rev. D 70 (2004) 113003,
hep-ph/0408045;
J.N. Bahcall, M.C. Gonzalez-Garcia, C. Pena-Garay, JHEP 0408 (2004) 016, hep-ph/0406294;
M. Maltoni, T. Schwetz, M.A. Tortola, J.W.F. Valle, New J. Phys. 6 (2004) 122, hep-ph/0405172.
[2] B. Aharmim, et al., SNO Collaboration, nucl-ex/0502021.
[3] LSND Collaboration, hep-ex/0104049.
[4] KARMEN Collaboration, Nucl. Phys. B (Proc. Suppl.) 91 (2000) 191.
[5] P. Spentzouris, Nucl. Phys. B (Proc. Suppl.) 100 (2001) 163.
[6] C. Kraus, et al., hep-ex/0412056;

V.M. Lobashev, Phys. At. Nucl. 63 (2000) 962, Yad. Fiz. 63 (2000) 1037 (in Russian).
[7] C.L. Bennett, et al., Astrophys. J. Suppl. 148 (2003) 1;
D.N. Spergel, et al., Astrophys. J. Suppl. 148 (2003) 175.
[8] W. Grimus, L. Lavoura, JHEP 0107 (2001) 045;
W. Grimus, L. Lavoura, Acta Phys. Pol. B 32 (2001) 3719;
W. Grimus, L. Lavoura, Eur. Phys. J. C 28 (2003) 123;
W. Grimus, L. Lavoura, Phys. Lett. B 572 (2003) 189;
W. Grimus, L. Lavoura, hep-ph/0305309;
W. Grimus, L. Lavoura, Acta Phys. Pol. B 34 (2003) 5393;
W. Grimus, A.S. Joshipura, S. Kaneko, L. Lavoura, M. Tanimoto, JHEP 0407 (2004) 078;
W. Grimus, A.S. Joshipura, S. Kaneko, L. Lavoura, H. Sawanaka, M. Tanimoto, Nucl. Phys. B 713 (2005)
151;
S. Morisi, M. Picariello, hep-ph/0505113;
F. Caravaglios, M. Morisi, hep-ph/0503234.
[9] C. Wetterich, Phys. Lett. B 451 (1999) 397;
R. Barbieri, L.J. Hall, G.L. Kane, G.G. Ross, hep-ph/9901228;
O. Vives, hep-ph/0504079.
[10] E. Ma, G. Rajasekaran, Phys. Rev. D 64 (2001) 113012, hep-ph/0106291;
K.S. Babu, E. Ma, J.W.F. Valle, Phys. Lett. B 552 (2003) 207;
M. Hirsch, J.C. Romao, S. Skadhauge, J.W.F. Valle, A. Villanova del Moral, hep-ph/0312244;
M. Hirsch, J.C. Romao, S. Skadhauge, J.W.F. Valle, A. Villanova del Moral, hep-ph/0312265;
E. Ma, hep-ph/0404199.
[11] E. Ma, Phys. Rev. D 70 (2004) 031901;
E. Ma, hep-ph/0409075;
E. Ma, New J. Phys. 6 (2004) 104.
[12] S.L. Chen, M. Frigerio, E. Ma, hep-ph/0404084.
[13] P.F. Harrison, D.H. Perkins, W.G. Scott, Phys. Lett. B 530 (2002) 167, hep-ph/0202074;
P.F. Harrison, W.G. Scott, Phys. Lett. B 535 (2002) 163, hep-ph/0203209;
P.F. Harrison, W.G. Scott, Phys. Lett. B 547 (2002) 219;
P.F. Harrison, W.G. Scott, Phys. Lett. B 557 (2003) 76;
P.F. Harrison, W.G. Scott, hep-ph/0402006;
P.F. Harrison, W.G. Scott, hep-ph/0403278.
[14] J. Kubo, A. Mondragon, M. Mondragon, E. Rodriguez-Jauregui, Prog. Theor. Phys. 109 (2003) 795;
T. Ohlsson, G. Seidl, Phys. Lett. B 537 (2002) 95;
T. Ohlsson, G. Seidl, Nucl. Phys. B 643 (2002) 247;
S.F. King, G.G. Ross, Phys. Lett. B 520 (2001) 243.
[15] L.J. Hall, H. Murayama, N. Weiner, Phys. Rev. Lett. 84 (2000) 2572;
N. Haba, H. Murayama, Phys. Rev. D 63 (2001) 053010;
M.S. Berger, K. Siyeon, Phys. Rev. D 63 (2001) 057302;
M. Hirsch, hep-ph/0102102;
M. Hirsch, S.F. King, Phys. Lett. B 516 (2001) 103, hep-ph/0102103;
F. Vissani, Phys. Lett. B 508 (2001) 79;
G. Altarelli, F. Feruglio, I. Masina, JHEP 0301 (2003) 035;
A. de Gouvea, H. Murayama, hep-ph/0301050;
J.R. Espinosa, hep-ph/0306019.
[16] H. Fritzsch, Z.Z. Xing, Phys. Lett. B 372 (1996) 265;
H. Fritzsch, Z.Z. Xing, Phys. Lett. B 440 (1998) 313;
H. Fritzsch, Z.Z. Xing, Prog. Part. Nucl. Phys. 45 (2000) 1;
M. Fukugita, M. Tanimoto, T. Yanagida, Phys. Rev. D 57 (1998) 4429;
M. Fukugita, M. Tanimoto, T. Yanagida, Phys. Rev. D 59 (1999) 113016;
M. Tanimoto, T. Watari, T. Yanagida, Phys. Lett. B 461 (1999) 345;
S.K. Kang, C.S. Kim, Phys. Rev. D 59 (1999) 091302;
M. Tanimoto, Phys. Lett. B 483 (2000) 417;
N. Haba, Y. Matsui, N. Okamura, T. Suzuki, Phys. Lett. B 489 (2000) 184;

Y. Koide, A. Ghosal, Phys. Lett. B 488 (2000) 344;
E.K. Akhmedov, G.C. Branco, F.R. Joaquim, J.I. Silva-Marcos, Phys. Lett. B 498 (2001) 237.
[17] R. Gatto, G. Morchio, G. Sartori, F. Strocchi, Nucl. Phys. B 163 (1980) 221;
J.I. Silva-Marcos, JHEP 0307 (2003) 012;
C.I. Low, R.R. Volkas, Phys. Rev. D 68 (2003) 033007;
C.I. Low, hep-ph/0404017;
Y. Koide, Phys. Rev. D 71 (2005) 016010;
F. Feruglio, Nucl. Phys. B (Proc. Suppl.) 143 (2005) 184.
[18] See, for instance S.F. King, JHEP 0209 (2002) 011.
[19] For a review, see G. Altarelli, F. Feruglio, New J. Phys. 6 (2004) 106.
Brane world unification of quark and lepton masses and its implication for the masses of the neutrinos
P.Q. Hung
Department of Physics, University of Virginia, 382 McCormick Road,
PO Box 400714, Charlottesville, VA 22904-4714, USA
Received 24 January 2005; received in revised form 5 April 2005; accepted 30 May 2005
Abstract
A TeV-scale scenario is constructed in an attempt to understand the relationship between quark and
lepton masses. This scenario combines a model of early (TeV) unification of quarks and leptons with
the physics of large extra dimensions. It demonstrates a relationship between quark and lepton mass
scales at rather low (TeV) energies which will be dubbed "early quark–lepton mass unification".
It also predicts that the masses of the neutrinos are naturally light and Dirac. There is an interesting
correlation between neutrino masses and those of the unconventionally charged fermions which are
present in the early unification model. If these unconventional fermions were to lie between 200 and
300 GeV, the Dirac neutrino mass scale is predicted to be between 0.07 eV and 1 eV.
1. Introduction
Are quark and lepton masses related? This question was addressed almost thirty
years ago in a famous paper [1], soon after the concept of Grand Unification (GUT)
[2] had been put forward. From this pioneering paper and subsequent works, one learns that
quark–lepton unification at the GUT scale $M_{GUT} \sim 10^{15}$–$10^{16}$ GeV gives rise to, for the
particular case of SU(5) considered in [1], the equality of the $\tau$ lepton and bottom quark
masses at $M_{GUT}$. After renormalization-group (RG) evolution down to low energies, a remarkable prediction for the b quark mass was made, although the complete story was
significantly more complicated. Despite the enormous popularity of GUT, questions started
to arise as to whether or not there are actually structures, instead of simply a "desert", between the electroweak scale and $M_{GUT}$. If so, how would quark and lepton masses be
related if they were to have "early" unification?
E-mail address: pqh@virginia.edu (P.Q. Hung).
doi:10.1016/j.nuclphysb.2005.05.023
P.Q. Hung / Nuclear Physics B 720 (2005) 89–115
The hope that new physics is lurking somewhere in the TeV region has given rise in
the past decade or two to a flurry of activities which resulted in a rich diversity of topics
with a variety of motivations. A common thread in all of these activities is the prediction
of new particles of one kind or another. It goes without saying that discoveries of these
new particles will vindicate all the efforts put into it. The present paper will rely on two of
such scenarios with a special emphasis put into the relationship between quark and lepton
masses, including the issues of neutrino mass: is it Dirac or Majorana? Why is it so small?
Two TeV scenarios which form the focus of this paper are the following: (1) early
petite unification of quarks and leptons [3–6]; (2) the possibility of the existence of extra
spatial dimensions, the mechanism of wave function overlap along an extra compact spatial
dimension [7] and its use in the attempt to explain the smallness of neutrino masses [8,9]
and the hierarchy of quark masses [10–12].
In [3], the Standard Model (SM) with three independent couplings is merged into a
group $G_S \otimes G_W$ with two independent couplings at some scale which is supposed to be in
the TeV region. The Pati–Salam $SU(4)_{PS}$ [13] was chosen for $G_S$. This scenario
allowed us to compute $\sin^2\theta_W(M_Z^2)$ and to use it to constrain the choices of $G_W$. The preferred choice of [3] was the gauge group $SU(4)_{PS} \otimes SU(2)^4$ with an early unification scale
of several hundred TeV. Recent precise measurements of $\sin^2\theta_W(M_Z^2)$, coupled with
a surge of interest in TeV scale physics, have prompted [4] to reexamine the petite unification idea. There it was shown that the petite unification scale is lowered considerably,
to less than 10 TeV, due to the increase of $\sin^2\theta_W(M_Z^2)$ as compared with its value of
twenty-three years ago. This has the effect of practically ruling out $SU(4)_{PS} \otimes SU(2)^4$
due to severe problems with the decay rate for $K_L \to \mu e$, among other things. Two favorite models emerged: $SU(4)_{PS} \otimes SU(2)^3$ and $SU(4)_{PS} \otimes SU(3)^2$, both of which nicely
and naturally avoid the $K_L \to \mu e$ problem due principally to the existence of new types
of fermions. A detailed analysis of $SU(4)_{PS} \otimes SU(2)^3$ was performed by [5], including
a two-loop renormalization group (RG) analysis and a discussion of the physics of the
new unconventional fermions. Early unification in this model takes place at a mass scale
$M = O(1\text{–}2\ \mathrm{TeV})$.
On another front, Ref. [9] has constructed a model which made use of the mechanism
of wave function overlap along an extra compact dimension to explain the smallness of
Dirac neutrino masses. An $SU(2)_R$ symmetry was assumed and was subsequently spontaneously broken, giving rise to a phenomenon in which one member of the right-handed
doublet has a narrow wave function, while the other member acquires a broad wave function (both localized at the same point along the extra dimension). The overlap of the wave
function of the left-handed doublet with the wave functions of the right-handed fields gives
rise to the splitting between the effective four-dimensional Yukawa couplings (and eventually between the masses) of neutral and charged leptons or of the up and down quarks.
This splitting can be large or small depending on the separation $d$ (along the extra dimension) between the wave functions for left-handed and right-handed fields as demonstrated
in [9]. It was further noticed that there is a deep connection between the separation $d_l$
for the lepton sector and the separation $d_q$ for the quark sector, giving rise to a relationship between quark and lepton masses, a common feature in grand unified theories
(GUT).
At this point, one might ask about the distinction between the present scenario and a
possible attempt to incorporate a GUT scenario for the masses in the context of large extra
dimensions. First, it is fair to say that, in order to achieve grand unification above the compactification scale, some rather strong assumption has to be made about the behaviour of
the running couplings, namely a power-law running. Because of this dynamical assumption, the running masses used in extrapolating the values at the GUT scale to low energies
will also suffer from large uncertainties. This is very unlike the logarithmic behaviour used
in [1]. In our case, quark–lepton unification is achieved at a scale comparable to the compactification scale and the predictions made there can be extrapolated down to the Z-mass
using familiar renormalization group techniques. In fact, since the quark–lepton unification scale is an order of magnitude or so larger than the Z-mass, there will not be much
running.
The plan of the paper will be as follows. First, we present a brief review of the essential
elements that go into the wave function overlap scenario in extra dimensions. We then
briefly review the ideas of early quark–lepton unification with a special emphasis on the
group structure and fermion representations. We then show how one can connect these two
ideas to relate the overall mass scales in the mass matrices of the quark sector to those of
the lepton sectors. We finish with a numerical illustration of those results along with their
physical implications, including neutrino masses. We will present predictions for Dirac
neutrino masses. Whether or not Majorana neutrino masses are needed is a question which
depends on the predicted values for Dirac neutrino masses. We will show a correlation
between the masses of the neutrinos and those of the unconventional fermions which are
present in the early unification model. If the latter fermions are required to have a mass
between the electroweak scale and approximately 1 TeV, it is shown that the Dirac neutrino
masses are too small for the see-saw mechanism [14] to provide the bulk of neutrino masses
if, as it is natural to assume, the Majorana scale is of the order of the early unification scale.
It is also shown that if the mass of the unconventional fermions is taken to lie between 200
and 300 GeV, the range of the Dirac neutrino mass is found to be between 0.07 eV and
1 eV. In fact, there is a recent interest in the possibility that the neutrino mass might be
either mostly or purely Dirac and there are questions about the popular see-saw mechanism
itself [15].
2. Extra dimension, early quark–lepton unification and mass relationship
Two TeV-scale scenarios are briefly summarized below with the purpose of exposing
their common threads and ultimately combining them in order to obtain an understanding
of the possible relationship between quark and lepton masses and the smallness of the
neutrino masses.
2.1. Effective Yukawa couplings in models with extra dimensions

In its simplest version, an effective Yukawa coupling (which would be proportional to
the mass of the fermion) is defined, in four dimensions, as proportional to the size of the
wave function overlap between left-handed and right-handed fermions along a compact
fifth (spatial) dimension [7]. Among the many applications of this idea, one can cite for
example the attempts to give an explanation for the smallness of the neutrino masses [8,9].
One can either arbitrarily choose the locations, along the extra dimension, of the localized wave functions for the left-handed and right-handed neutrinos in such a way that the
overlap is tiny, or one can try to build a model in which the tiny overlap comes out more
or less naturally as Ref. [9] had done. In [9], the size of the neutrino overlap came out
small as compared with the size of the charged lepton overlap. A brief review of how this
happens as described in Ref. [9] will be given below. The main point of these works is
that the four-dimensional effective Yukawa couplings can be small even if the fundamental
(four-dimensional) Yukawa coupling is of order unity.
Let us start with one extra spatial dimension $y$ compactified on an orbifold $S^1/Z_2$ and
having a length $L$. Let us, as an example, take a lepton $SU(2)_L$ doublet, $L^{\{L\}}(x, y)$, and
another lepton $SU(2)_R$ doublet, $L^{\{R\}}(x, y)$, where the superscripts refer to the groups,
respectively. Since a fermion in five dimensions is a Dirac fermion, it will have both
chiralities (left- and right-handed) in four dimensions, i.e., $\psi = (\psi_L + \psi_R)$, where
$\psi_{L,R} = P_{L,R}\,\psi$, with $P_{L,R} = (1 \mp \gamma_5)/2$ being the usual four-dimensional chiral projection operators. The notations that were used in [9] and here will be as follows. For the
$SU(2)_L$ doublet, we use $L^{\{L\}}(x, y) = (l_L^{\{L\}} + l_R^{\{L\}})$, while for the $SU(2)_R$ doublet, we use
$L^{\{R\}}(x, y) = (l_L^{\{R\}} + l_R^{\{R\}})$. One can choose the $Z_2$ parity for these fields such that the only
zero modes are $l_L^{0,\{L\}}(x, y) = l_L(x)\,\xi_L(y)$ and $l_R^{0,\{R\}}(x, y) = l_R(x)\,\xi_R(y)$. With the introduction of the appropriate background scalar fields, these zero modes can be localized at
some points along $y$. The effective Yukawa coupling (which would determine the mass of
the fermion) is then proportional to the overlap between $\xi_L(y)$ and $\xi_R(y)$. (Let us recall
that $\xi_L(y)$ and $\xi_R(y)$ are doublets.) The main focus of [9] was the construction of a model
for the $SU(2)_R$ doublet $\xi_R(y)$. This construction will be repeated below but a few words
might be illuminating here. The wave functions for the up and down members of $\xi_R(y)$,
although localized at the same point along $y$, have very different shapes: one which is wide
and the other which is narrow. It is this disparity in the shapes of the right-handed wave
functions that, when overlapping with the common left-handed wave function, gives rise
to the hierarchy in mass among up and down members of the doublet.
For the sake of clarity, a review of the model of [9] is warranted here. Since the main
object is the construction of $\xi_R(y)$, one will concentrate on $L^{\{R\}}$. A summary of the main
results of [9] can now be given. First, the localization of $\xi_R(y)$ can be achieved by a coupling of $L^{\{R\}}$ to a background scalar field which develops a kink solution along $y$. Two
background scalar fields are needed in this scenario: a singlet field $S$ whose kink solution
localizes the wave functions of both members of $\xi_R(y)$ at the same location while keeping
their shapes identical, and a triplet $T = \vec{T} \cdot \vec{\tau}/2$ whose kink solution is responsible for drastically changing the shapes of the two wave functions while keeping the localization points
the same. (As mentioned in [9], these background scalars are chosen to be odd under $Z_2$
so that they do not have zero modes.) Below is how it works.
The minimum energy solutions used in [9] are as follows:
$$\langle T \rangle = \langle T_3 \rangle\, \tau_3/2 = \begin{pmatrix} h_T(y) & 0 \\ 0 & -h_T(y) \end{pmatrix}, \tag{1}$$
and
$$\langle S \rangle = h_S(y), \tag{2}$$
where generically $h(y) = v \tanh(\mu y)$, with $\mu = \sqrt{\lambda/2}\, v$ setting the inverse thickness
of the domain wall. Coupled with the Yukawa coupling $\mathcal{L}_{Y2} = f_T^{(l)}\, \bar{L}^{\{R\}} T L^{\{R\}} + f_S^{(l)}\, \bar{L}^{\{R\}} S L^{\{R\}}$, one obtains the following equations for $\xi_R(y)$:
$$\partial_y \xi_R^{\nu}(y) + \left( f_S^{(l)} h_S(y) + f_T^{(l)} h_T(y) \right) \xi_R^{\nu}(y) = 0, \tag{3}$$
$$\partial_y \xi_R^{e}(y) + \left( f_S^{(l)} h_S(y) - f_T^{(l)} h_T(y) \right) \xi_R^{e}(y) = 0. \tag{4}$$
The solutions for the up and down members of $\xi_R(y)$ (which will have the superscripts
$\nu$ and $e$ respectively) are given as
$$\xi_R^{\nu,e}(y) = k_{\nu,e} \exp\left\{ -\int_0^y dy' \left( f_S^{(l)} h_S(y') \pm f_T^{(l)} h_T(y') \right) \right\}, \tag{5}$$
where $k_{\nu,e}$ are normalization factors. The immediate implication of Eq. (5) can be seen as
follows. Using $h(y) = v \tanh(\mu y)$ in Eq. (5), one obtains
$$\xi_R^{\nu,e}(y) = k_{\nu,e}\, e^{-\left( C_S \ln(\cosh(\mu_S y)) \pm C_T \ln(\cosh(\mu_T y)) \right)}, \tag{6}$$
where $C_{S,T} = f_{S,T}/(\lambda_{S,T}/2)^{1/2}$. If the parameters of the two scalar potentials are such
that $C_S \ln(\cosh(\mu_S y)) \approx C_T \ln(\cosh(\mu_T y))$, one can immediately see that $\xi_R^{\nu}(y)$ is narrow while $\xi_R^{e}(y)$ is broad. When these wave functions overlap with the left-handed wave
function (common for both $\nu$ and $e$), one can observe a large disparity between the two
effective Yukawa couplings. A crucial quantity which enters this hierarchy in Yukawa couplings is the separation between the left-handed wave function and the two right-handed
wave functions (localized at the same point along $y$), which was denoted by $\Delta y^{(l)}$
in [9].
The model described above has been espoused in Ref. [9] as a mechanism for naturally small Dirac neutrino masses. Furthermore, using the same wave function profiles for
the right-handed quarks, an interesting connection between quark and lepton mass hierarchies was noticed in [9]. Basically, it was a connection between $\Delta y^{(l)}$ and $\Delta y^{(q)}$. Possible
symmetry reasons for this connection were left open in [9]. It is the purpose of this paper to elucidate the relationship between quark and lepton mass hierarchies by considering
explicitly a model of TeV-scale quark–lepton unification [3–5]. To set the stage for that
discussion, a brief summary of the early unification model is presented below.
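The narrow-versus-broad mechanism of Eq. (6) can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the right-handed profiles are taken as $\cosh(\mu_S y)^{-C_S}\cosh(\mu_T y)^{\mp C_T}$, the left-handed profile is modelled as a Gaussian displaced from the origin, and all parameter values (`c_s`, `c_t`, `mu_s`, `mu_t`, `center`, `width`) are hypothetical choices, not values from [9].

```python
import math

def log_cosh(x):
    # Numerically stable log(cosh(x)), valid for large |x|.
    ax = abs(x)
    return ax + math.log1p(math.exp(-2.0 * ax)) - math.log(2.0)

def xi_right(y, sign, c_s=10.0, c_t=9.5, mu_s=1.0, mu_t=1.0):
    # Un-normalized right-handed profile: sign=+1 (neutrino), sign=-1 (charged lepton).
    # Parameter values are hypothetical, chosen so that C_S and C_T nearly cancel.
    return math.exp(-(c_s * log_cosh(mu_s * y) + sign * c_t * log_cosh(mu_t * y)))

def xi_left(y, center=3.0, width=0.3):
    # Hypothetical Gaussian left-handed profile, displaced by `center` along y.
    return math.exp(-((y - center) ** 2) / (2.0 * width ** 2))

def integrate(f, a=-40.0, b=40.0, n=8000):
    # Composite trapezoidal rule on [a, b] with n intervals.
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def overlap(sign):
    # Overlap integral of the unit-normalized left- and right-handed profiles.
    norm_l = math.sqrt(integrate(lambda y: xi_left(y) ** 2))
    norm_r = math.sqrt(integrate(lambda y: xi_right(y, sign) ** 2))
    return integrate(lambda y: xi_left(y) * xi_right(y, sign)) / (norm_l * norm_r)

overlap_nu = overlap(+1)  # narrow right-handed profile
overlap_e = overlap(-1)   # broad right-handed profile
print(f"overlap(nu) = {overlap_nu:.3e}, overlap(e) = {overlap_e:.3e}")
```

With these illustrative numbers the neutrino overlap comes out many orders of magnitude below the charged-lepton overlap, which is the qualitative point of the mechanism; changing `center` mimics a change in the separation between left- and right-handed wave functions.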
2.2. Early TeV-scale quark–lepton unification

The model that was presented in [4] and discussed in detail in [5] is based on the gauge
group
$$G_{PUT} = SU(4)_{PS} \otimes SU(2)_L \otimes SU(2)_R \otimes SU(2)_H. \tag{7}$$
This group is characterized by two independent gauge couplings: $g_S$ for $SU(4)_{PS}$ and $g_W$
for $SU(2)_L \otimes SU(2)_R \otimes SU(2)_H$, where a permutation symmetry is assumed among the
three $SU(2)$'s. $G_{PUT}$ is assumed to be broken down to the Standard Model in two steps,
namely
$$G_{PUT} \xrightarrow{M} G_1 \xrightarrow{\tilde{M}} G_2 \xrightarrow{M_Z} SU(3)_c \otimes U(1)_{EM}, \tag{8}$$
where
$$G_1 = SU(3)_c \otimes U(1)_S \otimes SU(2)_L \otimes SU(2)_R \otimes SU(2)_H, \tag{9}$$
and
$$G_2 = SU(3)_c \otimes SU(2)_L \otimes U(1)_Y. \tag{10}$$
In this scheme, quarks and leptons, which are generic terms for color triplets and color
singlets respectively, are grouped into quartets of $SU(4)_{PS}$. The scale of such a quark–lepton
unification is denoted by $M$ as seen above. In contrast with GUT where such a
unification occurs close to the Planck scale, it has been shown in [4], and particularly in
[5], that $M \lesssim 2$ TeV. That such a low scale of unification can be achieved is a distinctive
feature of this model. A sketch of the arguments is presented below.
At this point, it is worth noticing that, if the scale(s) of extra dimensions is comparable
with the petite unification scale, physics related to the breaking of petite unification can be extrapolated to low energies with little, if any, uncertainty coming from
physics beyond the compactification scale. This is in contrast with a typical GUT scenario
embedded in large extra dimensions, since its scale would normally lie above the
compactification scale. As a consequence, there are large uncertainties associated with the
extrapolation of GUT physics down to the Z-mass, for example.
The main idea of petite unification has to do with the assumption that the SM, with
three independent couplings, $g_3$, $g_2$ and $g_1$, is merged into the PUT group $G_S \otimes G_W$,
which is characterized by two independent couplings: $g_S$ and $g_W$. As a result, one can
compute $\sin^2\theta_W(M_Z^2)$ as a function of the PUT unification scale $M$ as shown in [3,4].
The highly precise value of $\sin^2\theta_W(M_Z^2) = 0.23113(15)$, along with the requirement that
$M \lesssim 10$ TeV, allows us to severely restrict the choices of $G_W$, resulting in the preferred
model mentioned at the beginning of this section. (Two other models were also found:
$SU(4)_{PS} \otimes SU(2)^4$ and $SU(4)_{PS} \otimes SU(3)^2$, with the former being, in some sense, ruled
out due to severe problems with the process $K_L \to \mu e$ unless some exotic mechanisms
are invoked, for example an embedding of the model into five dimensions with the gauge
symmetry breaking accomplished by orbifold boundary conditions [16].)
Two crucial elements in the computation of $\sin^2\theta_W(M_Z^2)$ are the group theoretical factor $\sin^2\theta_W^0$ (= 1/3 for $G_W = SU(2)^3$) and the factor $C_S$ which appears in the expression $Q = T_{3L} + T_Y = Q_W + C_S T_{15}$, where $Q_W$ is the weak charge corresponding to the group $G_W$, and $T_{15}$ is the unbroken diagonal generator of $SU(4)_{PS}$. The value of $C_S$ depends on how quarks and leptons transform under $SU(4)_{PS} \otimes SU(2)_L \otimes SU(2)_R \otimes SU(2)_H$. For instance, $C_S = \sqrt{2/3}$ if fermions transform as (4, 2, 1, 1), while $C_S = \sqrt{8/3}$ if they transform as (4, 2, 1, 2) or (4, 1, 2, 2). This is shown in detail in [3,4]. Since $\sin^2\theta_W(M_Z^2) = (1/3)(1 - 0.067\,C_S^2 - \text{log terms})$ (see [4]), and given the requirement that $M \lesssim 10$ TeV (which makes for little running between $M$ and the electroweak scale, and hence small log terms), it was found [4] that the only acceptable fermion representations are the ones for which $C_S = \sqrt{8/3}$. Using this value for $C_S$ [4,5], a detailed computation of $\sin^2\theta_W(M_Z^2)$, up to two loops [5], determines the petite unification scale to be less than 2 TeV. This fermion content is the one that will be used in this paper. For the sake of clarity, an explicit description of the fermions of the model is presented below.
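As a rough numerical illustration of why the larger value of $C_S$ is singled out, the sketch below evaluates the leading expression for the weak mixing angle with the log terms dropped. It assumes that the expression has the form $(1/3)(1 - 0.067\,C_S^2)$ and that $C_S^2$ takes the values 2/3 and 8/3 for the two classes of representations; the function and variable names are ours and purely illustrative.

```python
# Leading-order estimate of sin^2(theta_W) at M_Z, with the log terms dropped.
# Assumption: the expression reads (1/3) * (1 - 0.067 * C_S^2 - log terms).
SIN2_THETA_W0 = 1.0 / 3.0   # group factor quoted in the text for G_W = SU(2)^3
COEFFICIENT = 0.067         # numerical coefficient of C_S^2 quoted in the text
MEASURED = 0.23113          # measured value quoted in the text


def sin2_theta_w_leading(cs_squared):
    """Return (1/3) * (1 - 0.067 * C_S^2), i.e. the estimate without log terms."""
    return SIN2_THETA_W0 * (1.0 - COEFFICIENT * cs_squared)


estimates = {
    "(4,2,1,1), C_S^2 = 2/3": sin2_theta_w_leading(2.0 / 3.0),
    "(4,2,1,2) or (4,1,2,2), C_S^2 = 8/3": sin2_theta_w_leading(8.0 / 3.0),
}
for label, value in estimates.items():
    print(f"{label}: {value:.4f} (measured: {MEASURED})")
```

Under these assumptions the second case lies closer to the measured value before any running is included, which is in line with the statement that only small log terms are available when the unification scale is low.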
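Under $SU(4)_{PS} \otimes SU(2)_L \otimes SU(2)_R \otimes SU(2)_H$, the fermions transform as
\[ \Psi_L = (4, 2, 1, 2)_L = \begin{pmatrix} (d^c(1/3),\, \tilde U(4/3)) & (\tilde l_u(-1),\, \nu(0)) \\ (u^c(-2/3),\, \tilde D(1/3)) & (\tilde l_d(-2),\, l(-1)) \end{pmatrix}_L , \tag{11} \]
\[ \Psi_R = (4, 1, 2, 2)_R = \begin{pmatrix} (d^c(1/3),\, \tilde U(4/3)) & (\tilde l_u(-1),\, \nu(0)) \\ (u^c(-2/3),\, \tilde D(1/3)) & (\tilde l_d(-2),\, l(-1)) \end{pmatrix}_R . \tag{12} \]
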
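As one can see, this model contains, besides conventionally charged fermions, unconventional fermions with charges up to 4/3 for the quarks and down to $-2$ for the leptons.
To understand the notation in Eqs. (11), (12), note the following conventions.
$SU(2)_{L,R}$ doublets:
\[ \begin{pmatrix} d^c(1/3) \\ u^c(-2/3) \end{pmatrix}_{L,R} , \quad \begin{pmatrix} \tilde U(4/3) \\ \tilde D(1/3) \end{pmatrix}_{L,R} , \tag{13} \]
\[ \begin{pmatrix} \nu(0) \\ l(-1) \end{pmatrix}_{L,R} , \quad \begin{pmatrix} \tilde l_u(-1) \\ \tilde l_d(-2) \end{pmatrix}_{L,R} . \tag{14} \]
$SU(2)_H$ doublets:
\[ \begin{pmatrix} \tilde U(4/3) \\ d^c(1/3) \end{pmatrix}_{L,R} , \quad \begin{pmatrix} \tilde D(1/3) \\ u^c(-2/3) \end{pmatrix}_{L,R} , \tag{15} \]
\[ \begin{pmatrix} \nu(0) \\ \tilde l_u(-1) \end{pmatrix}_{L,R} , \quad \begin{pmatrix} l(-1) \\ \tilde l_d(-2) \end{pmatrix}_{L,R} . \tag{16} \]
$SU(4)_{PS}$ quartets:
\[ \begin{pmatrix} d^c(1/3) \\ \tilde l_u(-1) \end{pmatrix}_{L,R} , \quad \begin{pmatrix} \tilde U(4/3) \\ \nu(0) \end{pmatrix}_{L,R} , \tag{17} \]
\[ \begin{pmatrix} \tilde D(1/3) \\ l(-1) \end{pmatrix}_{L,R} , \quad \begin{pmatrix} u^c(-2/3) \\ \tilde l_d(-2) \end{pmatrix}_{L,R} . \tag{18} \]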
Note that, due to the particular nature of the fermion representation in this model, it is
\[ \begin{pmatrix} d^c(1/3) \\ u^c(-2/3) \end{pmatrix}_{L,R} = i\tau_2 \begin{pmatrix} u(2/3) \\ d(-1/3) \end{pmatrix}^{*}_{L,R} \tag{19} \]
which appears instead of the more familiar-looking $(u(2/3), d(-1/3))$.

From the fermion content listed above, one can see that the $SU(4)/(SU(3) \otimes U(1)_S)$ gauge bosons with electric charges $\pm 4/3$ link the normal quarks to the higher charged leptons $\tilde l_{L,R}$, and the normal leptons to the higher charged quarks $\tilde Q_{L,R}$, while the $SU(2)_H$ gauge bosons with charges $\pm 1$ link the normal quarks to $\tilde Q_{L,R}$ and the normal leptons to $\tilde l_{L,R}$. What this implies is that, at tree level, there is NO transition between normal quarks and normal leptons due to $SU(4)/(SU(3) \otimes U(1)_S)$ and $SU(2)_H$ gauge bosons. However, such a transition can occur at the one-loop level through a box diagram with, e.g., two Pati-Salam $SU(4)/(SU(3) \otimes U(1)_S)$ gauge boson exchanges ($M_{PS} = O(M)$) and new heavy quarks ($\tilde Q$) and new heavy leptons ($\tilde l$) that have masses $O(M_F \sim 250$ GeV$)$. $\tilde Q$ and $\tilde l$ appear in three generations, and the mixing between these generations is given by $3 \times 3$ matrices denoted by $U$ and $V$, respectively. In the case of degenerate masses of $\tilde Q_i$ and $\tilde l_i$, the GIM mechanism is at work and the decay $K_L \to \mu e$ is absent. However, the GIM mechanism remains powerful even when the masses are non-degenerate but all in the range 200-300 GeV. In this case it provides a suppression factor of $O(10^{-4})$ at the level of the branching ratio. With the typical loop factor $(16\pi^2)^{-2} \approx 4 \times 10^{-5}$, the upper bound on the relevant mixing factors $|V_{id} V_{is}|^2 |U_{jd} U_{js}|^2$ coming from $K_L \to \mu e$ amounts then roughly to $O(10^{-4})$ and can be easily satisfied. The $SU(4)/(SU(3) \otimes U(1)_S)$ and $SU(2)_H$ gauge bosons can have masses as low as a TeV or so without violating any bound on rare decays. The previous remarks have been made in [4,5].
Since the natural scales of the scenarios described in Sections 2.1 and 2.2 are both in
the TeV range, it is worthwhile to see if a marriage of some sorts can be made between
these two scenarios and if, as a result, some light can be shed concerning the relationship
between quark and lepton masses and the smallness of the neutrino masses.
2.3. Connection between the scales of quark and lepton masses
If quarks and leptons (in the generic sense as discussed above) can be unified at
the TeV scale, there is a good possibility that whatever gives rise to their masses will also
determine the relationship between their mass scales. We will show below that such a
possibility does exist within the framework of the two scenarios described above.
The basic model used in this paper is $SU(4)_{PS} \otimes SU(2)_L \otimes SU(2)_R \otimes SU(2)_H$. As stated above, this group spontaneously breaks down to $SU(2)_L \otimes U(1)_Y$ and then to $U(1)_{em}$ at the scales $M \sim$ a few TeV and $v \sim 250$ GeV, respectively. Upon embedding this model in five dimensions, with the fifth dimension $y$ compactified on an $S_1/Z_2$ orbifold, it is shown below that the following features occur: (1) the breaking of $SU(4)_{PS}$ splits the positions, along $y$, of the wave functions of the zero modes of quarks and leptons; (2) the breaking of $SU(2)_R$ gives rise to two vastly different profiles for the wave functions of the right-handed zero modes; (3) since an $SU(2)_H$ doublet groups together a conventional quark (or lepton) with an unconventional one, the breaking of $SU(2)_H$ splits the locations, along $y$, of the wave functions of the conventional fermions relative to those of the unconventional ones; (4) and finally, the breaking of $SU(2)_L \otimes U(1)_Y$ provides a mass scale for all the fermions. As we have explained in the previous section, a crucial quantity that appears in the hierarchy of masses among the up and down members of an $SU(2)_L$ doublet is the separation along $y$ between the wave function of the left-handed doublet and that of the right-handed fields, namely $\Delta y^{(l,q)}$. We will show below that points 1, 2 and 3 help establish a relationship between $\Delta y^{(l)}$ and $\Delta y^{(q)}$.
In the construction of the model, one important point we would like to stress is the following. The model $SU(4)_{PS} \otimes SU(2)_L \otimes SU(2)_R \otimes SU(2)_H$ contains unconventional quarks and leptons which are assumed to be heavy enough to escape detection. The fate of these fermions was described in detail in Ref. [4]. For the purpose of this paper, we will simply require that all unconventional fermions be heavy. This constraint will be used below.
2.3.1. $SU(4)_{PS} \otimes SU(2)_L \otimes SU(2)_R \otimes SU(2)_H$ in five dimensions
The first step is to embed $SU(4)_{PS} \otimes SU(2)_L \otimes SU(2)_R \otimes SU(2)_H$ in five dimensions. Let us first denote, in five dimensions, the fermions presented in Section 2.2 by
\[ \Psi^{\{L\}}(x, y) = (4, 2, 1, 2), \tag{20} \]
\[ \Psi^{\{R\}}(x, y) = (4, 1, 2, 2). \tag{21} \]
Let us recall that, in five dimensions, these are four-component Dirac fields, i.e., they have both left- and right-handed components. The superscripts {L} and {R} are used for two reasons: (a) to denote the transformation under $SU(2)_L$ or $SU(2)_R$; and (b) to show that the surviving zero modes are related to these fields. By choosing the appropriate $Z_2$ parity for these fields, the zero modes of $\Psi^{\{L\}}(x, y)$ and $\Psi^{\{R\}}(x, y)$ are
\[ \psi_L(x, y) = \psi_L(x)\, \xi_L(y), \tag{22a} \]
\[ \psi_R(x, y) = \psi_R(x)\, \xi_R(y), \tag{22b} \]
respectively.
We wish to localize $\xi_L(y)$ and $\xi_R(y)$ along $y$. This is accomplished by coupling these fields to some background scalar fields. To see the group representations of these scalar fields, we consider the following bilinears:
\[ \bar\Psi^{\{L\}}(x, y)\, \Psi^{\{L\}}(x, y) = (1 + 15,\ 1 + 3,\ 1,\ 1 + 3), \tag{23} \]
\[ \bar\Psi^{\{R\}}(x, y)\, \Psi^{\{R\}}(x, y) = (1 + 15,\ 1,\ 1 + 3,\ 1 + 3). \tag{24} \]
From Eqs. (23) and (24), one can see that some possible scalar fields which can couple to these fermions would transform like (1, 1, 1, 1), (15, 1, 1, 1), (15, 1, 1, 3), (15, 1, 3, 1), etc. We will show step by step below how these scalar fields help establish the link between $\Delta y^{(l)}$ and $\Delta y^{(q)}$. We will successively invoke one scalar at a time and show how it modifies the behaviour of the fermion zero modes.
As we shall see, the scalar fields which are needed for our scenario are the following:
\[ \Phi_{S1},\ \Phi_{S2} = (1, 1, 1, 1), \tag{25a} \]
\[ \Sigma = (15, 1, 1, 1), \tag{25b} \]
\[ \Phi_R = (1, 1, 3, 1), \tag{25c} \]
\[ \Phi_H = (15, 1, 1, 3). \tag{25d} \]
2.3.2. The role of the singlet scalar fields $\Phi_{S1}, \Phi_{S2} = (1, 1, 1, 1)$

The Yukawa coupling of $\Phi_{S1}$ with the fermions takes the form
\[ L_{Y1} = f_S \left( \bar\Psi^{\{L\}} \Psi^{\{L\}} + \bar\Psi^{\{R\}} \Psi^{\{R\}} \right) \Phi_{S1}, \tag{26} \]
with $f_S > 0$. We assume a kink solution for $\Phi_{S1}$,
\[ \langle \Phi_{S1} \rangle = h_S(y), \tag{27} \]
which localizes all fermions at the same point $y = 0$ along $y$.

In order to be more general, we will allow the fermions to be localized, still at this stage, at some common arbitrary point which might be different from the origin. The most economical scenario is one in which the left-handed fermions are localized at that other point. This can be accomplished by the following coupling:
\[ L_{Y1p} = -f_S\, \Phi_{S2}\, \bar\Psi^{\{L\}} \Psi^{\{L\}}, \tag{28} \]
where $f_S > 0$ and the negative sign in front of it is an arbitrary choice. Assuming
\[ \langle \Phi_{S2} \rangle = \delta, \tag{29} \]
with $\delta$ a constant, we obtain the following equations for the zero modes:
\[ \partial_y \xi_L(y) + \left( f_S h_S(y) - f_S \delta \right) \xi_L(y) = 0, \tag{30a} \]
\[ \partial_y \xi_R(y) + f_S h_S(y)\, \xi_R(y) = 0. \tag{30b} \]
We will present below two scenarios, distinguished by the value of the constant $\delta$ that appears in Eq. (29).

Scenario I:
\[ \delta = 0. \tag{31} \]
Scenario II:
\[ \delta \neq 0. \tag{32} \]
At this stage, scenario I implies that all (left- and right-handed) fermions are localized at the origin. Scenario II implies that the right-handed ones are localized at the origin while the left-handed ones are localized at a common point away from the origin. As we shall see in the last section, it will be scenario II with $\delta \neq 0$ that is favored phenomenologically.
It is clear that this is not the end of the story because the effective Yukawa couplings
to an SU(2)L -doublet Higgs field, which depend on the overlap of the left and right wave
functions, would be universal for all fermions, a clearly undesirable feature. We therefore
have to split the various wave functions along y. To do this, one has to invoke scalars
which transform non-trivially under SU(4)PS SU(2)L SU(2)R SU(2)H . This is what
we will proceed to do next.
2.3.3. The roles of $\Sigma = (15, 1, 1, 1)$ and $\Phi_H = (15, 1, 1, 3)$

From the group representations of $\Sigma = (15, 1, 1, 1)$ and $\Phi_H = (15, 1, 1, 3)$, one can see that, in principle, both $\Psi^{\{L\}}$ and $\Psi^{\{R\}}$ can couple to $\Sigma$ and $\Phi_H$. However, for reasons of economy, we shall see that it is sufficient to couple only $\Psi^{\{L\}}$ to them.
We now concentrate on $\xi_L(y)$. As mentioned above, one has to differentiate the unconventional fermions from the conventional ones as well as the quarks from the leptons. Let us remind ourselves that the conventional and unconventional fermions are grouped into $SU(2)_H$ doublets as shown in Eq. (16). To differentiate the aforementioned fermions, we need to break both $SU(4)_{PS}$ and $SU(2)_H$. This is accomplished by the use of $\Phi_H = (15, 1, 1, 3)$ and of $\Sigma = (15, 1, 1, 1)$. The Yukawa interaction between $\Psi^{\{L\}}$, $\Phi_H$ and $\Sigma$ is given by
\[ L_{Y2} = \bar\Psi^{\{L\}} \left( f_H \Phi_H + f_\Sigma \Sigma \right) \Psi^{\{L\}}, \tag{33} \]
with $f_H, f_\Sigma > 0$.
We will assume vacuum expectation values for $\Sigma$ and $\Phi_H$ as follows:
\[ \langle \Sigma \rangle = \sigma \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -3 \end{pmatrix}, \tag{34a} \]
\[ \langle \Phi_H \rangle = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -3 \end{pmatrix} \otimes \begin{pmatrix} v_H & 0 \\ 0 & -v_H \end{pmatrix}, \tag{34b} \]
where the first matrix on the right-hand side of Eqs. (34) refers to the direction $T_{15}$ of $SU(4)_{PS}$, whereas the second matrix in the second equation refers to the direction $\tau_3$ in $SU(2)_H$, as appropriate for (15, 1, 1, 3). Here $\sigma$ and $v_H$ are constants.
When Eq. (33) is combined with Eqs. (26) and (28), the equation for the left-handed zero modes is now given by
\[ \partial_y \xi_L(y) + \left( f_S h_S(y) - f_S \delta + f_H \langle \Phi_H \rangle + f_\Sigma \langle \Sigma \rangle \right) \xi_L(y) = 0. \tag{35} \]
We now make an important assumption:
\[ f_H v_H = f_\Sigma \sigma. \tag{36} \]
This assumption has a far-reaching consequence: all unconventional quarks and leptons will have large wave function overlaps, resulting in large mass scales for those sectors, as we shall see below.
Making explicit use of Eqs. (34) and the assumption (36), one can now rewrite (35) in terms of the various $SU(2)_L$ doublets as follows:
\[ \partial_y \xi_L^{Q}(y) + \left( f_S h_S(y) + 2 f_H v_H - f_S \delta \right) \xi_L^{Q}(y) = 0, \tag{37a} \]
\[ \partial_y \xi_L^{\tilde Q}(y) + \left( f_S h_S(y) - f_S \delta \right) \xi_L^{\tilde Q}(y) = 0, \tag{37b} \]
\[ \partial_y \xi_L^{L}(y) + \left( f_S h_S(y) - 6 f_H v_H - f_S \delta \right) \xi_L^{L}(y) = 0, \tag{37c} \]
\[ \partial_y \xi_L^{\tilde L}(y) + \left( f_S h_S(y) - f_S \delta \right) \xi_L^{\tilde L}(y) = 0, \tag{37d} \]
where the superscripts $Q$, $\tilde Q$, $L$, $\tilde L$ refer to the normal quark, unconventional quark, normal lepton, and unconventional lepton $SU(2)_L$ doublets as shown in Eq. (14). The above equations show the splitting between conventional and unconventional fermions as well as between quarks and leptons. As we shall show below, this splitting is crucial to the success of this model.
From Eqs. (37), it is also easy to see that
\[ \xi_L^{\tilde Q}(y) = \xi_L^{\tilde L}(y). \tag{38} \]
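The constant terms that distinguish the four left-handed equations can be checked with a few lines of arithmetic. The sketch below assumes that the two vacuum expectation values point along diag(1, 1, 1, -3) in $SU(4)_{PS}$ and along diag(1, -1) in $SU(2)_H$, and that the two coupling-times-VEV products are equal, as in the assumption made above; all names are hypothetical and chosen only for this illustration.

```python
# Diagonal entries of the two breaking directions. Overall normalizations are
# assumed to be absorbed into the couplings.
T15 = {"quark": 1, "lepton": -3}    # T_15 direction of SU(4)_PS
TAU3 = {"upper": 1, "lower": -1}    # tau_3 direction of SU(2)_H


def yukawa_shift(t15, tau3, fh_vh=1.0, fsigma_sigma=1.0):
    """Constant added to f_S h_S(y) for one component of the left-handed field.

    The (15,1,1,3) scalar contributes fh_vh * t15 * tau3 and the (15,1,1,1)
    scalar contributes fsigma_sigma * t15. Both products default to 1, which
    encodes the assumption that they are equal.
    """
    return fh_vh * t15 * tau3 + fsigma_sigma * t15


shifts = {
    (colour, component): yukawa_shift(T15[colour], TAU3[component])
    for colour in T15
    for component in TAU3
}
for key, value in shifts.items():
    print(key, value)   # in units of f_H v_H
```

In units of $f_H v_H$, the four entries come out as 2, 0, -6 and 0, which are the constants appearing next to $f_S h_S(y)$ in the four equations above: two components are shifted in opposite directions and the other two are not shifted at all.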
2.3.4. The role of $\Phi_R = (1, 1, 3, 1)$

We have encountered in Section 2.1 the $SU(2)_R$ triplet scalar field whose kink solution, when combined with a singlet kink, gives rise to very different profiles for the wave functions of the up and down members of an $SU(2)_R$ fermion doublet. In the present context, the triplet scalar field is now $\Phi_R = (1, 1, 3, 1)$. Its coupling to $\Psi^{\{R\}}$ can be written as
\[ L_{Y3} = f_R\, \bar\Psi^{\{R\}} \Phi_R \Psi^{\{R\}}, \tag{39} \]
with $f_R > 0$. Notice that here we use $f_R$ instead of the notation $f_T$ used in Section 2.1 in order to be consistent with the notation used for $\Phi_R$.
The minimum energy solution for $\Phi_R$ can be written as
\[ \langle \Phi_R \rangle = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \otimes \begin{pmatrix} h_T(y) & 0 \\ 0 & -h_T(y) \end{pmatrix}, \tag{40} \]
where we have assumed that there is a kink solution for $\Phi_R$.
When one combines Eq. (26) with Eqs. (39), (40), the equation for the right-handed zero modes reads
\[ \partial_y \xi_R(y) + \left( f_S h_S(y) + f_R \langle \Phi_R \rangle \right) \xi_R(y) = 0. \tag{41} \]
Let us define the following effective kinks:
\[ h_{sym}(y) = f_S h_S(y) + f_R h_T(y), \tag{42a} \]
\[ h_{asym}(y) = f_S h_S(y) - f_R h_T(y). \tag{42b} \]
Eq. (41) then takes the following forms:
\[ \partial_y \xi_R^{Q,up}(y) + h_{sym}(y)\, \xi_R^{Q,up}(y) = 0, \tag{43a} \]
\[ \partial_y \xi_R^{Q,down}(y) + h_{asym}(y)\, \xi_R^{Q,down}(y) = 0, \tag{43b} \]
\[ \partial_y \xi_R^{L,up}(y) + h_{sym}(y)\, \xi_R^{L,up}(y) = 0, \tag{43c} \]
\[ \partial_y \xi_R^{L,down}(y) + h_{asym}(y)\, \xi_R^{L,down}(y) = 0. \tag{43d} \]
In Eqs. (43), and according to the particle content given in Eqs. (14), "Q, up" refers to $d^c(1/3)$ and $\tilde U(4/3)$. Likewise, "Q, down" refers to $u^c(-2/3)$ and $\tilde D(1/3)$. Similarly, "L, up" refers to $\nu(0)$ and $\tilde l_u(-1)$, while "L, down" refers to $l(-1)$ and $\tilde l_d(-2)$. It is also clear, from Eqs. (43), that
\[ \xi_R^{Q,up}(y) = \xi_R^{L,up}(y) \equiv \xi_R^{up}, \tag{44a} \]
\[ \xi_R^{Q,down}(y) = \xi_R^{L,down}(y) \equiv \xi_R^{down}. \tag{44b} \]
A quick look at Eqs. (42) reveals that $u_R(2/3)$ and $\tilde D_R(1/3)$ as well as $l_R(-1)$ and $\tilde l_{d,R}(-2)$ have broad wave functions, while $d_R(-1/3)$ and $\tilde U_R(4/3)$ as well as $\nu_R(0)$ and $\tilde l_{u,R}(-1)$ have narrow wave functions. These features will be shown explicitly below.
From Eqs. (43), one can easily see that all right-handed wave functions are localized
at the origin. They have, however, different profiles, a situation which is very similar to
the scenario which is summarized in Section 2.1. It is this difference in profiles, when
combined with the different locations of left-handed wave functions, which gives rise to
the disparity in mass scales.
We next turn our attention to the separations along $y$ between left-handed and right-handed fermions, which are crucial, along with the different profiles, in determining the mass scale of each sector.
2.3.5. Wave function localizations in a linear approximation
To see heuristically how Eqs. (37), (43) help split the locations of the quarks and leptons along the extra dimension, let us make a linear approximation to the kink solutions $h_S(y)$ and $h_T(y)$, namely
\[ h_S(y) \approx 2\mu_S^2\, y, \tag{45a} \]
\[ h_T(y) \approx 2\mu_T^2\, y. \tag{45b} \]
Let us recall that, with the linear approximation, a wave function behaves like a Gaussian, $\xi(y) \propto \exp(-\mu^2 y^2)$. It then follows that the overlap between two wave functions separated by a distance $\Delta y$ along $y$ goes like $\exp(-\mu^2 (\Delta y)^2)$ (where, for heuristic purposes, the $\mu^2$ are taken to be the same for both wave functions, which in general is not the case). The effective Yukawa couplings in four dimensions are proportional to the overlaps between right-handed and left-handed fermions, and it can be seen that they can be large or small depending on whether $(\Delta y)^2 \ll \mu^{-2}$ or $(\Delta y)^2 \gg \mu^{-2}$. What we will set out to derive in our model is the relationship between $\Delta y^{(l)}$ and $\Delta y^{(q)}$.
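The Gaussian suppression of the overlap with the separation can be made concrete with a small numerical sketch. It assumes two profiles of equal width, each normalized so that the integral of its square is one, and integrates over the whole real line rather than over the orbifold interval; for that particular normalization the closed form is $\exp(-\mu^2 (\Delta y)^2 / 2)$, which has the same qualitative behaviour as the heuristic estimate quoted above. The function names are ours.

```python
import math


def gaussian(y, mu, center=0.0):
    """Unnormalized Gaussian profile exp(-mu^2 (y - center)^2)."""
    return math.exp(-(mu * (y - center)) ** 2)


def overlap(mu, delta_y, half_width=12.0, steps=24000):
    """Normalized overlap int(f g) / sqrt(int(f^2) int(g^2)) by the trapezoid rule.

    f is centred at the origin and g at delta_y; both have the same width.
    The grid spacing cancels in the ratio, so it is not applied.
    """
    h = 2.0 * half_width / steps
    fg = ff = gg = 0.0
    for i in range(steps + 1):
        y = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        f = gaussian(y, mu)
        g = gaussian(y, mu, delta_y)
        fg += weight * f * g
        ff += weight * f * f
        gg += weight * g * g
    return fg / math.sqrt(ff * gg)


for delta_y in (0.0, 0.5, 1.0, 2.0, 3.0):
    numeric = overlap(1.0, delta_y)
    closed_form = math.exp(-0.5 * delta_y ** 2)
    print(f"delta_y = {delta_y:3.1f}  numeric = {numeric:.6f}  closed form = {closed_form:.6f}")
```

A separation of a few widths already suppresses the overlap by orders of magnitude, which is the mechanism used in the rest of this section to generate small effective Yukawa couplings.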
Applying the approximation (45) to Eqs. (42) results in the following definitions:
\[ \mu_{asym}^2 = f_S \mu_S^2 - f_R \mu_T^2, \tag{46a} \]
\[ \mu_{sym}^2 = f_S \mu_S^2 + f_R \mu_T^2. \tag{46b} \]
From Eqs. (43), the locations of the up-members and the down-members of an $SU(2)_R$ doublet, for both conventional and unconventional quarks, are found to be at the origin. We shall denote that common point by
\[ y_R = 0. \tag{47} \]
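For the left-handed zero modes, their locations will depend on $f_S \delta$ and $f_H v_H$. For convenience, let us define the following quantity:
\[ r \equiv \frac{f_S \delta}{2 f_H v_H}. \tag{48} \]
We will assume that $r < 1$. From Eqs. (37), the locations of the $SU(2)_L$ doublets for conventional and unconventional quarks are
\[ y_{Q_L} = -\frac{f_H v_H}{f_S \mu_S^2}\, (1 - r), \tag{49a} \]
\[ y_{\tilde Q_L} = \frac{f_S \delta}{2 f_S \mu_S^2} = \frac{f_H v_H}{f_S \mu_S^2}\, r, \tag{49b} \]
while the locations of the lepton doublets are
\[ y_{l_L} = 3\, \frac{f_H v_H}{f_S \mu_S^2} \left( 1 + \frac{r}{3} \right), \tag{50a} \]
\[ y_{\tilde l_L} = \frac{f_S \delta}{2 f_S \mu_S^2} = \frac{f_H v_H}{f_S \mu_S^2}\, r. \tag{50b} \]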
One important comment is in order at this point. From Eqs. (47), (49), (50) as well as (46), it is clear that there are five independent parameters: $f_S \mu_S^2$, $f_S \delta$, $f_R \mu_T^2$, $f_\Sigma \sigma$, and $f_H v_H$, although Eq. (36) reduces these to four independent parameters. What this implies is that, out of eight locations, there are three (or four) predictions. In principle, we would then obtain three (or four) predictions for mass scales once the other four (or three) are fixed by the choices of the aforementioned parameters. We shall come back to this point below.
In order to make sense of the above locations, a few remarks are in order here. The effective Yukawa coupling, in four dimensions, which governs the fermion mass scale depends on the overlap between left-handed and right-handed fermions. This overlap depends on the separation between the two fermions as well as on the shapes of the fermion wave functions. As mentioned above, the spread of the wave functions is crucial in our scenario. This spread is roughly proportional to $1/\mu$. From Eqs. (46), one can deduce that, for the right-handed wave functions, $l_R(-1)$, $\tilde l_{d,R}(-2)$, $u_R(2/3)$ and $\tilde D_R(1/3)$ have broad wave functions since $\mu_{asym}^2 = f_S \mu_S^2 - f_R \mu_T^2$. On the other hand, $d_R(-1/3)$, $\tilde U_R(4/3)$, $\nu_R(0)$ and $\tilde l_{u,R}(-1)$ have narrow wave functions since $\mu_{sym}^2 = f_S \mu_S^2 + f_R \mu_T^2$. All the left-handed wave functions, on the other hand, have a spread of the order of $1/\sqrt{f_S \mu_S^2}$. How do these facts translate into the disparities in mass scales? To answer this question, one has to look at the separations between the left-handed and right-handed wave functions.
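From Eqs. (47)-(50), one can readily derive the following left-right separations.
For the quarks:
\[ |\Delta y_U| \equiv |y_R - y_{Q_L}| = \frac{f_H v_H}{f_S \mu_S^2}\, (1 - r), \tag{51a} \]
\[ |\Delta y_D| \equiv |y_R - y_{Q_L}| = \frac{f_H v_H}{f_S \mu_S^2}\, (1 - r), \tag{51b} \]
\[ |\Delta y_{\tilde U}| \equiv |y_R - y_{\tilde Q_L}| = \frac{f_S \delta}{2 f_S \mu_S^2} = \frac{f_H v_H}{f_S \mu_S^2}\, r, \tag{51c} \]
\[ |\Delta y_{\tilde D}| \equiv |y_R - y_{\tilde Q_L}| = \frac{f_S \delta}{2 f_S \mu_S^2} = \frac{f_H v_H}{f_S \mu_S^2}\, r. \tag{51d} \]
For the leptons:
\[ |\Delta y_{\nu}| \equiv |y_R - y_{l_L}| = 3\, \frac{f_H v_H}{f_S \mu_S^2} \left( 1 + \frac{r}{3} \right), \tag{52a} \]
\[ |\Delta y_{l(-1)}| \equiv |y_R - y_{l_L}| = 3\, \frac{f_H v_H}{f_S \mu_S^2} \left( 1 + \frac{r}{3} \right), \tag{52b} \]
\[ |\Delta y_{\tilde l_u(-1)}| \equiv |y_R - y_{\tilde l_L}| = \frac{f_S \delta}{2 f_S \mu_S^2} = \frac{f_H v_H}{f_S \mu_S^2}\, r, \tag{52c} \]
\[ |\Delta y_{\tilde l_d(-2)}| \equiv |y_R - y_{\tilde l_L}| = \frac{f_S \delta}{2 f_S \mu_S^2} = \frac{f_H v_H}{f_S \mu_S^2}\, r. \tag{52d} \]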
Comparing Eqs. (52) with Eqs. (51), we arrive at the following important relationship between conventional quarks and leptons:
\[ |\Delta y_{Lepton}| = 3\, \frac{1 + r/3}{1 - r}\, |\Delta y_{Quark}|, \tag{53} \]
where $r < 1$. For scenario I, one would have $r = 0$ and one would simply have $|\Delta y_{Lepton}| = 3 |\Delta y_{Quark}|$. For scenario II, where $r \neq 0$, one obtains the above relationship. What (53) implies is the following important fact: $|\Delta y_{Lepton}| \geq 3 |\Delta y_{Quark}|$ (for $0 \leq r < 1$). This means that the scales of the lepton sector are generally a bit smaller than those of the quark counterpart.
It is also useful to derive a relationship between the separations of conventional and unconventional fermions. From Eqs. (49)-(51), it is straightforward to derive the following relationship between the common left-right separation of the unconventional quarks and leptons and that of the conventional quarks:
\[ |\Delta y_{Unconventional}| = \frac{r}{1 - r}\, |\Delta y_{Quark}|. \tag{54} \]
Another useful form for (54) can be obtained by using Eq. (53), namely
\[ |\Delta y_{Unconventional}| = \frac{|\Delta y_{Lepton}| - 3 |\Delta y_{Quark}|}{4 |\Delta y_{Quark}|}\, |\Delta y_{Quark}|. \tag{55} \]
From (53) and (54), (55), one notices that, in scenario I where $r = 0$, one obtains the simple relations: $|\Delta y_{Lepton}| = 3 |\Delta y_{Quark}|$ and $|\Delta y_{Unconventional}| = 0$.
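The relations among the three left-right separations are simple enough to check directly. The sketch below assumes that, in a common unit, the conventional quark, conventional lepton and unconventional separations are $(1 - r)$, $3(1 + r/3)$ and $r$, respectively, with $0 \leq r < 1$; the function name and the sample values of $r$ are hypothetical.

```python
def separations(r, unit=1.0):
    """Left-right separations for 0 <= r < 1, in a common (arbitrary) unit.

    Returns the separations of the conventional quarks, the conventional
    leptons and the unconventional fermions, in that order.
    """
    if not 0.0 <= r < 1.0:
        raise ValueError("this sketch assumes 0 <= r < 1")
    quark = unit * (1.0 - r)
    lepton = 3.0 * unit * (1.0 + r / 3.0)
    unconventional = unit * r
    return quark, lepton, unconventional


for r in (0.0, 0.1, 0.3, 0.6):
    quark, lepton, unconventional = separations(r)
    print(
        f"r = {r:.1f}: lepton/quark = {lepton / quark:.3f}, "
        f"unconventional/quark = {unconventional / quark:.3f}"
    )
```

For $r = 0$ the lepton-to-quark ratio is exactly three and the unconventional separation vanishes; increasing $r$ pushes the leptons further away from the right-handed fields relative to the quarks.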
The above relationship is important for the following reasons. First, it implies that the left-handed wave function for the leptons is situated much further (by a factor of three) away from the right-handed ones than is the case for the quarks. It then means that the wave function overlaps which determine the effective four-dimensional Yukawa couplings would, in principle, be much smaller for the lepton sector than for its quark counterpart, resulting in a large disparity in mass scales between the two sectors. This is actually what happens in reality. The details of that disparity, in our scenario, will also depend on the difference in the wave function profiles. This will be discussed in the next section.
Since $|\Delta y| = f_S \delta / (2 f_S \mu_S^2)$ for both unconventional quarks and leptons, this implies that unconventional quarks and leptons have comparable mass scales which can be large. How large this might be is the subject of the next section.
To summarize, the model contains four independent parameters: $f_S \mu_S^2$, $f_R \mu_T^2$, $f_H v_H$, and $r$. From these, we have two independent wave function profiles for the right-handed zero modes, one independent separation $|\Delta y_{Quark}|$, and one parameter $r$. Once $r$ and $|\Delta y_{Quark}|$ are specified, all other separations can be computed.
We now present some numerical illustrations of the above results. Our strategy will be as follows. First, we write down the coupling between the left-handed fermions, right-handed fermions and a Higgs field whose VEV gives rise to fermion masses. Next, we (arbitrarily) fix the two right-handed wave function profiles. We then choose $|\Delta y_{Quark}|$ so that the mass scales of the conventional Up and Down quark sectors come out correctly. With the same wave function profiles, we next choose $r$ so that, upon the use of Eq. (53), the mass scale of the charged lepton sector comes out correctly. We will show below that, as a result, we obtain predictions for the mass scale of the Dirac neutrino sector as well as those of the unconventional quarks and leptons. An alternative way to fix the parameters is to choose one of the right-handed wave functions, for example the one that belongs to the charged leptons and the Up quark sector, fix $r$ so that the mass scales come out correctly, and then fix the second right-handed wave function so that the Down quark sector comes out correctly. Once this is done, the above predictions will come out the same.
3. Computation of mass scales and implications

One can now use the results of Ref. [9] and Section 2.1 to estimate mass scales of the
normal quark and lepton sectors as well as those of the unconventional fermions. Since the
scope of this paper is the construction of a model showing a relationship between mass
scales of quarks and lepton sectors, we shall ignore issues such as fermion mixings
in the mass matrices. Higher-dimensional models have been built to tackle quark mass
hierarchies, mixing angles and CP phase (see, e.g., [11,12]). We will therefore concentrate
on the overall mass scales that appear in various mass matrices.
3.1. SM fermion-Higgs coupling
By SM fermion-Higgs coupling, we mean that the Higgs field that couples with left-handed and right-handed fermions transforms non-trivially under $SU(2)_L$.
Since $\Psi^{\{L\}} = (4, 2, 1, 2)$, $\Psi^{\{R\}} = (4, 1, 2, 2)$ and $\bar\Psi^{\{L\}} \Psi^{\{R\}} = (1 + 15, 2, 2, 1 + 3)$, an appropriate Higgs field (the simplest choice) could be the following field:
\[ H = (1, 2, 2, 1). \tag{56} \]
The Yukawa coupling with this field can be written as
\[ L_{Y4} = k_1\, \bar\Psi^{\{L\}} H \Psi^{\{R\}} + k_2\, \bar\Psi^{\{L\}} \tilde H \Psi^{\{R\}} + \text{H.c.}, \tag{57} \]
where $\tilde H = \tau_2 H^{*} \tau_2$ and where, in principle, $k_2$ can be different from $k_1$. Assuming the extra dimension to be compactified on an orbifold $S_1/Z_2$ and an even $Z_2$ parity for $H$, it follows that $H$ can have a zero mode. This zero mode can be written as $H^0(x, y) = K \phi(x)$, where $\phi(x)$ is a 4-dimensional Higgs field with dimension $M$ (mass) and $K$, a constant, has dimension $M^{1/2}$, since $H$ has dimension $M^{3/2}$ in five dimensions. Notice that $k_{1,2}$ have dimension $M^{-1/2}$. We define the following dimensionless couplings:
\[ g_{Y1,2} = k_{1,2}\, K. \tag{58} \]
The VEV of $\phi$ is assumed to be
\[ \langle \phi \rangle = \begin{pmatrix} v_1/\sqrt{2} & 0 \\ 0 & v_2/\sqrt{2} \end{pmatrix}. \tag{59} \]
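Eqs. (57) and (59) provide mass scales which appear in mass matrices as follows:
\[ \mathcal{M}_i = \Lambda_i\, \hat M_i, \qquad i = U, D, \nu, l, \tilde U, \tilde D, \tilde l_u, \tilde l_d, \tag{60} \]
where the $\Lambda_i$ are the mass scales in question and the $\hat M_i$ are matrices whose elements will depend on models of fermion masses (e.g., [11]). The subscripts are self-explanatory.
Using the fermion representations (14) and Eqs. (57), (59), (38), (44), the mass scales that appear in Eq. (60) now take the following forms:
\[ \Lambda_U = g_{U1}\, v_2/\sqrt{2} + g_{U2}\, v_1/\sqrt{2}, \tag{61a} \]
\[ \Lambda_D = g_{D1}\, v_1/\sqrt{2} + g_{D2}\, v_2/\sqrt{2}, \tag{61b} \]
\[ \Lambda_\nu = g_{\nu 1}\, v_1/\sqrt{2} + g_{\nu 2}\, v_2/\sqrt{2}, \tag{61c} \]
\[ \Lambda_l = g_{l1}\, v_2/\sqrt{2} + g_{l2}\, v_1/\sqrt{2}, \tag{61d} \]
\[ \Lambda_{\tilde U} = \Lambda_{\tilde l_u} = g_{\tilde U 1}\, v_1/\sqrt{2} + g_{\tilde U 2}\, v_2/\sqrt{2}, \tag{61e} \]
\[ \Lambda_{\tilde D} = \Lambda_{\tilde l_d} = g_{\tilde D 1}\, v_2/\sqrt{2} + g_{\tilde D 2}\, v_1/\sqrt{2}, \tag{61f} \]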
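where
\[ g_{U1,2} = g_{Y1,2} \int_0^L dy\, \xi_L^{Q}(y)\, \xi_R^{down}(y), \tag{62a} \]
\[ g_{D1,2} = g_{Y1,2} \int_0^L dy\, \xi_L^{Q}(y)\, \xi_R^{up}(y), \tag{62b} \]
\[ g_{\nu 1,2} = g_{Y1,2} \int_0^L dy\, \xi_L^{L}(y)\, \xi_R^{up}(y), \tag{62c} \]
\[ g_{l1,2} = g_{Y1,2} \int_0^L dy\, \xi_L^{L}(y)\, \xi_R^{down}(y), \tag{62d} \]
\[ g_{\tilde U 1,2} = g_{\tilde l_u 1,2} = g_{Y1,2} \int_0^L dy\, \xi_L^{\tilde Q}(y)\, \xi_R^{up}(y), \tag{62e} \]
\[ g_{\tilde D 1,2} = g_{\tilde l_d 1,2} = g_{Y1,2} \int_0^L dy\, \xi_L^{\tilde Q}(y)\, \xi_R^{down}(y). \tag{62f} \]
The way $v_1$ and $v_2$ appear in Eqs. (61) has to do with the fermion content as shown in Eqs. (14), and that is the reason why the first two equations appear in the forms shown above.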
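There are two possibilities concerning Eqs. (61).
$g_{Y1} = g_{Y2}$:
This is a rather economical option, in terms of reducing the number of parameters. From Eqs. (61) and (62), it is easy to see that, if $g_{Y1} = g_{Y2}$, the ratios of scales will simply be ratios of wave function overlaps, regardless of the values of $v_1$ and $v_2$ as well as of the value of the Yukawa coupling, since they cancel out in the ratios. In fact, we can form six ratios (from six independent scales), which can be taken as $\Lambda_D/\Lambda_U$, $\Lambda_\nu/\Lambda_l$, $\Lambda_\nu/\Lambda_D$, $\Lambda_l/\Lambda_U$, $\Lambda_{\tilde U}/\Lambda_D$, $\Lambda_{\tilde D}/\Lambda_U$. Explicitly, one has:
\[ \frac{\Lambda_D}{\Lambda_U} = \frac{\int_0^L dy\, \xi_L^{Q}(y)\, \xi_R^{up}(y)}{\int_0^L dy\, \xi_L^{Q}(y)\, \xi_R^{down}(y)}, \tag{63a} \]
\[ \frac{\Lambda_\nu}{\Lambda_l} = \frac{\int_0^L dy\, \xi_L^{L}(y)\, \xi_R^{up}(y)}{\int_0^L dy\, \xi_L^{L}(y)\, \xi_R^{down}(y)}, \tag{63b} \]
\[ \frac{\Lambda_\nu}{\Lambda_D} = \frac{\int_0^L dy\, \xi_L^{L}(y)\, \xi_R^{up}(y)}{\int_0^L dy\, \xi_L^{Q}(y)\, \xi_R^{up}(y)}, \tag{63c} \]
\[ \frac{\Lambda_l}{\Lambda_U} = \frac{\int_0^L dy\, \xi_L^{L}(y)\, \xi_R^{down}(y)}{\int_0^L dy\, \xi_L^{Q}(y)\, \xi_R^{down}(y)}, \tag{63d} \]
\[ \frac{\Lambda_{\tilde U}}{\Lambda_D} = \frac{\int_0^L dy\, \xi_L^{\tilde Q}(y)\, \xi_R^{up}(y)}{\int_0^L dy\, \xi_L^{Q}(y)\, \xi_R^{up}(y)}, \tag{63e} \]
\[ \frac{\Lambda_{\tilde D}}{\Lambda_U} = \frac{\int_0^L dy\, \xi_L^{\tilde Q}(y)\, \xi_R^{down}(y)}{\int_0^L dy\, \xi_L^{Q}(y)\, \xi_R^{down}(y)}. \tag{63f} \]
Once the parameters of the wave functions and their separations are fixed, these ratios (or any other combinations) can be computed unambiguously.
In the following, we will choose $\Lambda_U$ and $\Lambda_D$ as two independent inputs. From them, we can extract $|\Delta y_{Quark}|$. Once the parameter $r$ is chosen, all other mass scales can be predicted.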
$g_{Y1} \neq g_{Y2}$:
For the case when $g_{Y1} \neq g_{Y2}$, one can still obtain the following ratios which depend only on ratios of wave function overlaps:
\[ \frac{\Lambda_l}{\Lambda_U} = \frac{\int_0^L dy\, \xi_L^{L}(y)\, \xi_R^{down}(y)}{\int_0^L dy\, \xi_L^{Q}(y)\, \xi_R^{down}(y)}, \tag{64a} \]
\[ \frac{\Lambda_\nu}{\Lambda_D} = \frac{\int_0^L dy\, \xi_L^{L}(y)\, \xi_R^{up}(y)}{\int_0^L dy\, \xi_L^{Q}(y)\, \xi_R^{up}(y)}, \tag{64b} \]
\[ \frac{\Lambda_{\tilde U}}{\Lambda_D} = \frac{\int_0^L dy\, \xi_L^{\tilde Q}(y)\, \xi_R^{up}(y)}{\int_0^L dy\, \xi_L^{Q}(y)\, \xi_R^{up}(y)}, \tag{64c} \]
\[ \frac{\Lambda_{\tilde D}}{\Lambda_U} = \frac{\int_0^L dy\, \xi_L^{\tilde Q}(y)\, \xi_R^{down}(y)}{\int_0^L dy\, \xi_L^{Q}(y)\, \xi_R^{down}(y)}. \tag{64d} \]
What are the implications of the above two cases? First, one chooses the two parameters $f_S \mu_S^2$ and $f_R \mu_T^2$ so that $\xi_R^{up}(y)$ and $\xi_R^{down}(y)$ are fixed. Next, we choose $\Lambda_U$ and $\Lambda_D$ as two independent mass scales. In the first case, where $g_{Y1} = g_{Y2}$, these two scales are used to extract $|\Delta y_{Quark}|$. One can then choose the parameter $r$ so that the scale $\Lambda_l$ is fixed. Once this is done, all other scales (neutrinos, unconventional fermions) can be predicted. For the second case, where $g_{Y1} \neq g_{Y2}$, one has to choose both $|\Delta y_{Quark}|$ and $r$ in order to fix the first ratio in (64). All other ratios in (64) can then be predicted.
3.2. Some numerical examples
In this section, we will present some numerical illustrations of the above ideas. A more comprehensive numerical analysis will be presented elsewhere. We find a surprising correlation between the Dirac neutrino masses and those of the unconventional fermions. As we shall see below, by requiring the unconventional fermions to be heavier than the top quark but at the same time NOT too much heavier than the electroweak scale, e.g., < 650 GeV, it is found that the largest Dirac neutrino mass is bounded from below by a value roughly of the order of 0.1 eV and from above by a value of the order of 30 eV. From this result alone, it is hard to see how one can incorporate Majorana neutrinos in our scenario since the Dirac neutrinos alone are naturally light. Actually, the detailed numerical analysis of [4] shows that, in order to keep the early unification scale below 2 TeV, the masses of the unconventional fermions are constrained to be less than 300 GeV which, as we shall see below, sets the following bound for the heaviest Dirac neutrino: $0.1 < m_\nu^{heaviest} < 1$ eV. Or one can turn things around by using some of our knowledge, however uncertain, about neutrino masses to set bounds on the masses of the unconventional fermions. For example, if we set the largest neutrino mass to lie between 0.1 and 1 eV, assuming it is of the Dirac type, the unconventional fermions are constrained to have a mass between the top quark mass and 300 GeV.
Our numerical strategy is as follows. (1) For a given pair of $f_S h_S(y)$ and $f_R h_T(y)$, we use the ratio $\Lambda_D/\Lambda_U$ to find $|\Delta y_{Quark}|$. Actually, it is the difference between $f_S h_S(y)$ and $f_R h_T(y)$ that is important. (2) We then choose $r$ so that $|\Delta y_{Lepton}|$, which is given in terms of $|\Delta y_{Quark}|$ via Eq. (53), gives the correct ratio $\Lambda_l/\Lambda_U$. (3) After steps 1 and 2 have been carried out, one can make predictions for the mass scales of the Dirac neutrino sector as well as those for the unconventional fermions, using Eqs. (63). After this is done, one can then decide whether or not a Majorana neutrino mass term is needed, depending on the value of the Dirac mass.
To be general, we start out with $r \neq 0$.
For the zero mode right-handed wave functions, we use expressions similar to those found in Eq. (6), namely
\[ \xi_R^{up,down}(y) = k_{up,down}\, e^{-\left( C_S \ln(\cosh(\mu_S y)) \pm C_R \ln(\cosh(\mu_T y)) \right)}, \tag{65} \]
where the upper (lower) sign refers to "up" ("down"), $k_{up,down}$ are normalization factors and $C_{S,R} = f_{S,R}/(\lambda_{S,T}/2)^{1/2}$ are factors which involve the Yukawa couplings as well as the self-couplings
in the scalar potentials. (Let
us recall generically that h(y) = v tanh(y), with = /2.) The wave functions for the
left-handed zero modes are given by
\[ \xi_L^{i}(y) = k_L\, e^{-C_S \ln(\cosh(\mu_S (y - y_i)))}, \tag{66} \]
where $k_L$ is a normalization factor, $i = Q, L, \tilde Q$, and the $y_i$ are given by (49) and (50).
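As a rough sketch of how profiles of this type turn a target ratio of mass scales into a left-right separation, the code below builds the "up" and "down" right-handed profiles and a displaced left-handed profile, computes their overlaps, and solves for the separation by bisection. Several choices are ours and not taken from the text: the profiles are normalized so that the integral of their square is one, the integrals run over a wide symmetric window instead of the orbifold interval, the width parameter of the triplet kink is set to the hypothetical value 0.8 (with the other constants set to one), and the overlap ratio is assumed to decrease with the separation inside the search bracket.

```python
import math


def ln_cosh(x):
    """Numerically stable ln(cosh(x))."""
    ax = abs(x)
    return ax + math.log1p(math.exp(-2.0 * ax)) - math.log(2.0)


def xi_right(y, sign, c_s=1.0, c_r=1.0, mu_s=1.0, mu_t=0.8):
    """Unnormalized right-handed profile; sign = +1 for 'up', -1 for 'down'."""
    return math.exp(-(c_s * ln_cosh(mu_s * y) + sign * c_r * ln_cosh(mu_t * y)))


def xi_left(y, y_i, c_s=1.0, mu_s=1.0):
    """Unnormalized left-handed profile centred at y_i."""
    return math.exp(-c_s * ln_cosh(mu_s * (y - y_i)))


def normalized_overlap(f, g, half_width=40.0, steps=4000):
    """int(f g) / sqrt(int(f^2) int(g^2)) on [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    fg = ff = gg = 0.0
    for i in range(steps + 1):
        y = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        a, b = f(y), g(y)
        fg += weight * a * b
        ff += weight * a * a
        gg += weight * b * b
    return fg / math.sqrt(ff * gg)


def ratio_up_over_down(delta_y):
    """Overlap of the left profile with 'up' divided by its overlap with 'down'."""
    left = lambda y: xi_left(y, delta_y)
    up = normalized_overlap(left, lambda y: xi_right(y, +1))
    down = normalized_overlap(left, lambda y: xi_right(y, -1))
    return up / down


def solve_separation(target, lo=0.0, hi=20.0, iterations=40):
    """Bisection for the separation at which ratio_up_over_down equals target."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if ratio_up_over_down(mid) > target:
            lo = mid   # ratio still too large: move to larger separations
        else:
            hi = mid
    return 0.5 * (lo + hi)


delta_y_quark = solve_separation(0.0166)
print(delta_y_quark, ratio_up_over_down(delta_y_quark))
```

The same two functions can be reused for the lepton sector by evaluating the overlaps at the lepton separation; only the quark step is shown here to keep the sketch short.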

Using (65) and (66), we can now illustrate the results presented in the previous section
with a numerical example. To translate ratios of mass scales into ratios of actual mass
eigenvalues, an assumption has to be made concerning the mass matrices themselves.
Explicitly, the relationship between the mass scales and the mass matrices M can be
written as
M = M,
(67)
is a dimensionless matrix whose form depends on a particular model. An exwhere M

haustive general analysis of different types of mass matrices is beyond the scope of this
paper. For the purpose of illustration in this paper, we will make the following reasonable
assumptions concerning the relationship between the mass scale that appears as a common
factor in the mass matrix and the largest eigenvalue, namely
mt
U mt ,
(68a)
3
mb
D mb ,
(68b)
3
m
l m ,
(68c)
3
where mt , mb , and m are the largest eigenvalues of the up-quark, down-quark, and
charged lepton mass matrices respectively. The lower bounds in Eqs. (68) refer to a pure
democratic mass matrix [17] where, apart from the common scale factor, all elements are
unity and the largest eigenvalue is three times the scale factor. Such a pure democratic mass
matrix is unrealistic but it is included here for completeness. The upper bound refers to a
class of hierarchical mass matrices where the largest eigenvalue is approximately the scale
factor itself [18]. In between these two bounds, there exist models, e.g., [11,12], which are
almost but not quite of the pure phase mass matrix type [19]. We will assume below that
the up-quark, down-quark and charged lepton sectors have mass matrices with similar
behaviour, only in the sense that the relationship between the scale factors and the largest
mass eigenvalues is assumed to be the same.
For the purpose of illustration, we will use, for the largest eigenvalues, $m_t$, $m_b$ and $m_\tau$
evaluated at $M_Z$, and neglect any running between $M_Z$ and the early unification scale
taken to be comparable to the scale of compactification. We take
\[ m_t(M_Z) = 181\ \mathrm{GeV}, \quad m_b(M_Z) = 3\ \mathrm{GeV}, \quad m_\tau(M_Z) = 1.747\ \mathrm{GeV}. \tag{69} \]
With the above numbers and with the remarks made above concerning the relationship
between the scale factors and the largest mass eigenvalues, we can write
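\[ \frac{\Lambda_D}{\Lambda_U} \approx \frac{m_b(M_Z)}{m_t(M_Z)} \approx 0.0166, \tag{70a} \]
\[ \frac{\Lambda_l}{\Lambda_U} \approx \frac{m_\tau(M_Z)}{m_t(M_Z)} \approx 0.00965. \tag{70b} \]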
The ratios (70a), (70b) are now used to predict the mass scales for the neutrino sector as
well as for the unconventional fermion sectors.
As we have mentioned earlier, the mass scales of the neutrinos are correlated with those
of the unconventional fermions. This will be shown in the six examples below for the more
general case of $r \neq 0$. For comparison, we will also show a result for $r = 0$.
First we would like to remind ourselves that it is the difference between $C_S \ln(\cosh(\mu_S y))$
and $C_R \ln(\cosh(\mu_T y))$ in the wave functions that is important. In consequence, we will set $C_S = C_R = 1$, choose $\mu_S = 1$ (in some unit), and vary $\mu_T$.
In order to present the results in a transparent way, we prefer to write expressions such
as $\exp(-\ln(\cosh(y)))$ instead of the equivalent expression $1/\cosh(y)$.
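\[ \xi_R^{\mathrm{up}}(y) = \frac{1}{\sqrt{1.553}} \exp\{-\ln(\cosh(y)) - \ln(\cosh(0.7y))\}; \quad \xi_R^{\mathrm{down}}(y) = \frac{1}{\sqrt{3.718}} \exp\{-\ln(\cosh(y)) + \ln(\cosh(0.7y))\}; \]
\[ \xi_L^{Q}(y) = \frac{1}{\sqrt{2}} \exp\{-\ln(\cosh(y + 7.815))\}; \quad \xi_L^{L}(y) = \frac{1}{\sqrt{2}} \exp\{-\ln(\cosh(y - 23.285))\}; \quad \xi_L^{\tilde{Q}}(y) = \frac{1}{\sqrt{2}} \exp\{-\ln(\cosh(y + 0.04))\}. \]
Here $y_Q = -7.815$ is chosen for a given pair $\xi_R^{\mathrm{up,down}}(y)$ so that the ratio (70a) is satisfied.
Similarly, $y_L = 23.285$ is chosen so that the ratio (70b) is satisfied. The location $y_{\tilde{Q}} = -0.04$
for the unconventional fermions was calculated using Eq. (55). The predictions for
the neutrino and unconventional fermion mass scales are found to be
\[ \frac{\Lambda_\nu}{\Lambda_l} = 0.278 \times 10^{-6}, \tag{71a} \]
\[ \frac{\Lambda_{\tilde{U}}}{\Lambda_U} = \frac{\Lambda_{\tilde{l}_u}}{\Lambda_U} = 7.3, \tag{71b} \]
\[ \frac{\Lambda_{\tilde{D}}}{\Lambda_U} = \frac{\Lambda_{\tilde{l}_d}}{\Lambda_U} = 7.93. \tag{71c} \]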
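The normalization factors quoted in the first example can be checked numerically. The sketch below is not part of the original analysis: it assumes that each wave function is normalized so that the integral of its square over the whole $y$ axis equals one, and it uses a plain composite Simpson rule on a finite interval (the helper names `log_cosh` and `norm_integral` are introduced here for illustration only). Under these assumptions the three integrals should come out close to the numbers 1.553, 3.718 and 2 that appear under the square roots above.

```python
import math


def log_cosh(x):
    """Numerically stable ln(cosh(x)), valid for large |x| as well."""
    ax = abs(x)
    return ax + math.log1p(math.exp(-2.0 * ax)) - math.log(2.0)


def norm_integral(sign, mu_t=0.7, y_max=60.0, n=120_000):
    """Simpson estimate of the integral over [-y_max, y_max] of
    exp{-2 ln cosh(y) + 2 * sign * ln cosh(mu_t * y)}.

    sign = -1 gives the squared "up" profile, sign = +1 the squared "down"
    profile, and sign = 0 the squared 1/cosh(y) profile of the left-handed modes.
    n must be even.
    """
    h = 2.0 * y_max / n
    total = 0.0
    for i in range(n + 1):
        y = -y_max + i * h
        f = math.exp(-2.0 * log_cosh(y) + 2.0 * sign * log_cosh(mu_t * y))
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * f
    return total * h / 3.0


up = norm_integral(-1)    # compare with 1.553
down = norm_integral(+1)  # compare with 3.718
left = norm_integral(0)   # compare with 2
print(up, down, left)
```

The slowly decaying "down" profile (it falls off like $e^{-0.6|y|}$ after squaring) is the reason for the generous integration range.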

up
R (y) =
1
1.53

The predictions are:
1
2
Q
L (y)
Q
L (y) = 1
2
1
2
exp{ ln(cosh(y
1
4.057
+ 7.53))}; LL (y)
up
1
1.514
(72a)
(72b)
(72c)

1
2
Q
L (y)
Q
L (y) = 1
2
1
2
exp{ ln(cosh(y
1
4.332
+ 7.53))}; LL (y)
up
1
1.483
(73b)
(73c)
Q
L (y)
exp{ ln(cosh(y 29.17))}; L (y) =

1
2
1
2
exp{ ln(cosh(y
1
5.049
+ 7.07))}; LL (y)
up
1
1.476
(74b)
(74c)
(74a)

1
2
exp{ ln(cosh(y 1.99))}.
= 0.381 109 ,
l
l
U
= u = 2.834,
U
U
l
D
= d = 1.858.
U
U
R (y) =
(73a)
1
2
= 0.13 107 ,
l
l
U
= u = 4.43,
U
U
l
D
= d = 4.37.
U
U
R (y) =
= 0.5 107 ,
l
l
U
= u = 5.46,
U
U
l
D
= d = 5.82.
U
U
R (y) =
Q
L (y)
Q
L (y) = 1
2
1
2
exp{ ln(cosh(y + 7))};
1
5.276
LL (y)
111
= 0.13 109 ,
l
l
U
= u = 2.521,
U
U
l
D
= d = 1.39.
U
U
up
R (y) =
1
1.468
(75a)
(75b)
(75c)
Q
L (y)
exp{ ln(cosh(y 31.36))}; L (y) =

1
2
1
2
1
2
exp{ ln(cosh(y
1
5.527
+ 6.94))}; LL (y)
= 0.371 1010 ,
l
l
U
= u = 2.246,
U
U
l
D
= d = 0.998.
U
U
(76a)
(76b)
(76c)
The implications of the above results are given for the upper and lower bounds in (68)
in Tables 1 and 2. We use the values in Eq. (69) for the estimates given in these tables. The
predictions coming from (71)-(76) are listed in the second, third, and fourth columns.
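The entries of Tables 1 and 2 follow from simple arithmetic: each ratio in (71)-(76) is multiplied by the relevant scale from Eq. (69), taken either at the upper bound of (68) or at one third of it. The following sketch reproduces that bookkeeping; the function and variable names are illustrative choices made here, and the ratios are copied from the equations above.

```python
M_TOP_GEV = 181.0   # m_t(M_Z), Eq. (69)
M_TAU_GEV = 1.747   # m_tau(M_Z), Eq. (69)

# Equation number -> (neutrino/lepton ratio, U-tilde/U ratio, D-tilde/U ratio)
RATIOS = {
    71: (0.278e-6, 7.3, 7.93),
    72: (0.5e-7, 5.46, 5.82),
    73: (0.13e-7, 4.43, 4.37),
    74: (0.381e-9, 2.834, 1.858),
    75: (0.13e-9, 2.521, 1.39),
    76: (0.371e-10, 2.246, 0.998),
}


def predictions(scale_fraction):
    """Return {eq: (neutrino scale in eV, U-tilde scale in GeV, D-tilde scale in GeV)}.

    scale_fraction = 1 corresponds to the upper bounds in (68),
    scale_fraction = 1/3 to the lower bounds.
    """
    lam_l = scale_fraction * M_TAU_GEV
    lam_u = scale_fraction * M_TOP_GEV
    rows = {}
    for eq, (r_nu, r_u, r_d) in RATIOS.items():
        rows[eq] = (r_nu * lam_l * 1.0e9, r_u * lam_u, r_d * lam_u)
    return rows


table1 = predictions(1.0)
table2 = predictions(1.0 / 3.0)
for eq in sorted(table1):
    print(eq, table1[eq], table2[eq])
```

Small differences in the last digit with respect to the tables are expected, since the ratios are quoted to three or four significant figures only.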
Table 1
Predictions for $\Lambda_\nu$, $\Lambda_{\tilde{U}} = \Lambda_{\tilde{l}_u}$, and $\Lambda_{\tilde{D}} = \Lambda_{\tilde{l}_d}$ for the upper bounds (68): $\Lambda_U \approx m_t(M_Z)$, $\Lambda_D \approx m_b(M_Z)$, and $\Lambda_l \approx m_\tau(M_Z)$

            $\Lambda_\nu$     $\Lambda_{\tilde{U}} = \Lambda_{\tilde{l}_u}$     $\Lambda_{\tilde{D}} = \Lambda_{\tilde{l}_d}$
Eq. (71)    486 eV            1321 GeV          1435 GeV
Eq. (72)    87 eV             988 GeV           1053 GeV
Eq. (73)    23 eV             802 GeV           791 GeV
Eq. (74)    0.67 eV           513 GeV           336 GeV
Eq. (75)    0.23 eV           456 GeV           252 GeV
Eq. (76)    0.065 eV          406 GeV           181 GeV

Table 2
Predictions for $\Lambda_\nu$, $\Lambda_{\tilde{U}} = \Lambda_{\tilde{l}_u}$, and $\Lambda_{\tilde{D}} = \Lambda_{\tilde{l}_d}$ for the lower bounds (68): $\Lambda_U \approx m_t(M_Z)/3$, $\Lambda_D \approx m_b(M_Z)/3$, and $\Lambda_l \approx m_\tau(M_Z)/3$

            $\Lambda_\nu$     $\Lambda_{\tilde{U}} = \Lambda_{\tilde{l}_u}$     $\Lambda_{\tilde{D}} = \Lambda_{\tilde{l}_d}$
Eq. (71)    162 eV            440 GeV           478 GeV
Eq. (72)    29 eV             329 GeV           351 GeV
Eq. (73)    7.7 eV            267 GeV           264 GeV
Eq. (74)    0.22 eV           171 GeV           112 GeV
Eq. (75)    0.077 eV          152 GeV           84 GeV
Eq. (76)    0.022 eV          135 GeV           60 GeV

We end this section by briefly showing that the case $r = 0$, which gives the interesting
relations $|y_{\mathrm{Lepton}}| = 3|y_{\mathrm{Quark}}|$ and $|y_{\mathrm{Unconventional}}| = 0$ and which means that one can
also predict the charged lepton mass scale in terms of the one for the quarks, is, unfortunately, not good. For example, taking the quark wave functions used in (75) and using the
above relations, one obtains a prediction for the mass scale of the charged lepton sector to
be approximately 11 GeV. This, by itself, rules out the case $r = 0$. Incidentally, the neutrino mass scale comes out to be 2.5 keV and those of the unconventional fermions come
out to be 590-700 GeV, although these numbers are irrelevant since the prediction for
the charged lepton sector is already wrong.
We now discuss the implications of Tables 1 and 2 for the more general case $r \neq 0$.
3.3. Implications of Tables 1 and 2
To obtain a better understanding of the numerical results presented in Tables 1 and 2,
we will assume that the mass matrices for the unconventional fermions are such that, for
each sector, the fermions are approximately degenerate. That is because, if it were not the
case, a mass splitting similar to that of the normal fermions (quarks and charged leptons) would
render at least one fermion for each sector lighter than, say, the top quark. It goes
without saying that none has been seen so far. We will therefore assume that the masses
of the unconventional fermions for each sector are approximately equal to the mass scales
$\Lambda$.
Table 1. The numerical results given in this table are for the case where the mass
matrices of the normal quarks and leptons are highly hierarchical as we had mentioned
earlier.
One obvious remark that one can make by looking at Table 1 is the following. There is
a clear relationship between the masses of the unconventional fermions and those of the
neutrinos: as the unconventional masses increase so do neutrino masses. However, if we
restrict the masses of the unconventional fermions to be less than one TeV, one notices that
the Dirac mass of the neutrinos cannot exceed a few hundred eV. In scenarios such as this
one, one might expect that Majorana masses, if they exist, would typically also be of the
order of a TeV. The see-saw mass for the light neutrino would then be roughly at most of the
order $(\text{few hundred eV})^2/(1\ \mathrm{TeV}) \sim 10^{-8}$ eV. As a result, the bulk of the neutrino mass,
in this scenario, is Dirac. Furthermore, if the unconventional fermions are not too heavy,
say lighter than 500 GeV, nor too light, i.e., heavier than the top quark, the neutrino mass
scales vary between 1 eV and approximately 0.07 eV. From Table 1, one can tentatively
conclude that if the unconventional fermions are heavy, i.e., with masses ranging from 300
to 500 GeV, the neutrino mass scales will range between 0.2 and 0.7 eV. This would imply
that one might have a situation in which neutrinos are nearly degenerate in order to satisfy
the oscillation data. If, on the other hand, the unconventional fermions were to be lighter,
i.e., with masses ranging from 181 to 400 GeV, one could have a scenario in which the
neutrino mass matrix is hierarchical.
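The order-of-magnitude see-saw estimate quoted above can be made explicit. The sketch below uses the standard see-saw relation $m_{\mathrm{light}} \approx m_D^2/M_R$; the specific Dirac masses of 100 eV and 300 eV are illustrative choices made here to represent "a few hundred eV", and the function name is hypothetical.

```python
TEV_IN_EV = 1.0e12  # 1 TeV expressed in eV


def seesaw_mass_ev(dirac_mass_ev, majorana_mass_ev):
    """Light-neutrino mass from the see-saw relation m_D^2 / M_R (all in eV)."""
    return dirac_mass_ev ** 2 / majorana_mass_ev


for m_dirac in (100.0, 300.0):
    m_light = seesaw_mass_ev(m_dirac, TEV_IN_EV)
    print(f"m_D = {m_dirac:.0f} eV -> see-saw mass = {m_light:.1e} eV")
```

Either choice gives a see-saw contribution many orders of magnitude below the Dirac mass itself, which is the point of the argument: the bulk of the neutrino mass is Dirac.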
Table 2. This is the extreme case of democratic mass matrices for the normal fermions.
If the masses of the unconventional fermions were to lie between 181 and 400 GeV, the
neutrino mass scales would be of the order of a few eV or more. In light of cosmological constraints as well as of oscillation data, this particular case might even be ruled out.
It is amusing to note that this scenario of extreme democratic mass matrices for the normal fermions does not work by itself, regardless of the neutrino sector, because it cannot
reproduce the correct mass spectrum and the CKM matrix.
Intermediate cases. In between the above two bounds, there are models, e.g., [11,
12], which deviate somewhat from a pure democratic mass matrix model but which can fit
the mass spectrum as well as the CKM matrix fairly well. In these models, the mass scales
are roughly half the value of the largest mass eigenvalues. In rescaling Table 1 by a factor
of 1/2, one notices that, in order for the lightest unconventional fermion to be heavier than
the top quark, the smallest neutrino mass scale is around 0.2-0.4 eV. This implies that
neutrinos are nearly degenerate.
Although a more extensive investigation of the above questions for various scenarios of
mass matrices is warranted (a subject for a future paper), a preliminary conclusion can be
drawn from the above results. From the consideration of the lower bound on the lightest
unconventional fermion, it appears that neutrinos in our scenario are more likely to be
near-degenerate with mass lying around a few tenths of an eV. This would imply that
mixing angles as deduced from various oscillation data mainly come from the charged
lepton sector.
4. Summary
We have presented, in this paper, a model of quark-lepton mass unification which marries two TeV-scale scenarios: early unification of quarks and leptons [3-5] and large extra
dimensions [7-9] (as applied to neutrino masses). Explicitly, the early
unification model $G_{\mathrm{PUT}} = SU(4)_{\mathrm{PS}} \otimes SU(2)_L \otimes SU(2)_R \otimes SU(2)_H$ is embedded in 4 + 1
dimensions with the extra spatial dimension, $y$, being compactified on an orbifold $S_1/Z_2$.
Chiral zero modes are localized along the extra spatial dimension by kinks that come from
two background scalar fields, one of which transforms non-trivially under $G_{\mathrm{PUT}}$. Additional scalars are used to break $G_{\mathrm{PUT}}$ down to the SM. The model contains the following
features.
• The breaking of $SU(4)_{\mathrm{PS}}$ splits the positions, along $y$, of wave functions of the zero
modes of quarks and leptons.
• The breaking of $SU(2)_R$ gives rise to two vastly different profiles for the wave functions of the right-handed zero modes.
• Since an $SU(2)_H$ doublet groups together a conventional quark (or lepton) with an
unconventional one, the breaking of $SU(2)_H$ splits the locations, along $y$, of the wave
functions of the conventional fermions relative to those of the unconventional ones.
• The breaking of $SU(2)_L \otimes U(1)_Y$ provides a mass scale for all the fermions.
The size of the effective Yukawa couplings, which depends on the overlap between
right- and left-handed wave functions, is characterized by the separation along the extra
dimension $y$ between these two wave functions, $\Delta y$. In this model, we were able to show
that there is a relationship between the quark separation, $\Delta y^{(q)}$, and the lepton separation,
$\Delta y^{(l)}$, and also that of the unconventional fermions. This is due to the early unification
scenario discussed in [4,5] and again in this paper. It translates into relationships between
mass scales that appear in fermion mass matrices and are valid in the TeV range. This is
what is referred to as early quark-lepton mass unification in our scenario. A summary of
its ramifications is listed below.
A feature of the $SU(4)_{\mathrm{PS}} \otimes SU(2)_L \otimes SU(2)_R \otimes SU(2)_H$ model is the existence of
quarks and leptons with unconventional electric charges. As described in the model,
these unconventional fermions acquire masses from the same sources as conventional
fermions. Because of the existence of relationships between mass scales of different sectors, a consequence of early quark-lepton unification, we have found a strong correlation
between the Dirac masses of the neutrinos and those of the unconventional fermions: the
neutrino Dirac masses increase the heavier the unconventional fermions become. For example, by requiring that the masses of these unconventional fermions lie between the top
mass and 1 TeV (the lower bound is experimentally obvious while the upper bound refers
more to the wish of not having a strong Yukawa coupling regime), we found that the mass
scales of the neutrino sector range from approximately 0.07 eV to roughly 80 eV, as can
be seen in Tables 1 and 2. The Dirac masses of the neutrinos are naturally small in this scenario. Any Majorana contribution to the total mass through the see-saw mechanism would
have to be negligible.
From the above arguments, our scenario accommodates naturally light Dirac neutrinos
without having to use the see-saw mechanism. Since there is a correlation between neutrino
masses and those of the unconventionally charged fermions in the $SU(4)_{\mathrm{PS}} \otimes SU(2)_L \otimes
SU(2)_R \otimes SU(2)_H$ model, a light neutrino of mass less than 1 eV (as a concordance of
data seems to indicate [20]) will imply that the unconventionally charged fermions are
not too heavy (see Tables 1 and 2) and there might be a chance to observe them, if they
exist, at future colliders [21].
Acknowledgements
I would like to thank Lia Pancheri and Gino Isidori for the warm hospitality in the
Theory Group at Frascati. This work is supported in part by the US Department of Energy
under grant No. DE-A505-89ER40518.
References
[1] A.J. Buras, J.R. Ellis, M.K. Gaillard, D.V. Nanopoulos, Nucl. Phys. B 135 (1978) 66.
[2] H. Georgi, S.L. Glashow, Phys. Rev. Lett. 32 (1974) 438;
H. Georgi, H.R. Quinn, S. Weinberg, Phys. Rev. Lett. 33 (1974) 451.
[3] P.Q. Hung, A.J. Buras, J.D. Bjorken, Phys. Rev. D 25 (1982) 805.
[4] A.J. Buras, P.Q. Hung, Phys. Rev. D 68 (2003) 035015.
[5] A.J. Buras, P.Q. Hung, N.-K. Tran, A. Poschenrieder, E. Wyszomirski, Nucl. Phys. B 699 (2004) 253.
[6] P.Q. Hung, in: Proceedings of Coral Gables Conference on Launching of Belle Epoque in High-Energy
Physics and Cosmology (CG 2003), Ft. Lauderdale, FL, 17-21 December 2003.
[7] N. Arkani-Hamed, S. Dimopoulos, G.R. Dvali, Phys. Lett. B 429 (1998) 263, hep-ph/9803315;
I. Antoniadis, N. Arkani-Hamed, S. Dimopoulos, G.R. Dvali, Phys. Lett. B 436 (1998) 257, hep-ph/9804398.
[8] N. Arkani-Hamed, S. Dimopoulos, G.R. Dvali, J. March-Russell, Phys. Rev. D 65 (2002) 024032;
K. Dienes, E. Dudas, T. Gherghetta, Nucl. Phys. B 557 (1999) 25;
J.M. Frere, G. Moreau, E. Nezri, Phys. Rev. D 69 (2003) 033003.
[9] P.Q. Hung, Phys. Rev. D 67 (2003) 095011.
[10] See, e.g., Y. Grossman, R. Harnik, G. Perez, M.D. Schwartz, Z. Surujon, hep-ph/0407260, and references
therein.
[11] P.Q. Hung, M. Seco, Nucl. Phys. B 653 (2003) 123.
[12] P.Q. Hung, M. Seco, A. Soddu, Nucl. Phys. B 692 (2004) 83.
[13] J.C. Pati, A. Salam, Phys. Rev. D 10 (1974) 275.
[14] M. Gell-Mann, P. Ramond, R. Slansky in: D. Freedman, et al. (Eds.), Supergravity, 1979;
T. Yanagida, KEK lectures (1979), unpublished;
R.N. Mohapatra, G. Senjanović, Phys. Rev. Lett. 44 (1980) 912.
[15] A.Y. Smirnov, Talk given at SEESAW25: International Conference on the Seesaw Mechanism and the Neutrino Mass, Paris, France, 10-11 June 2004, hep-ph/0411194.
[16] Z. Chacko, L.J. Hall, M. Perelstein, JHEP 0301 (2003) 001.
[17] See, e.g., P. Kaus, S. Meshkov, Phys. Rev. D 42 (1990) 1863, and references therein.
[18] There is a long history on this subject. Below is an incomplete set of references:
H. Fritzsch, Phys. Lett. B 73 (1978) 317;
P. Ramond, R.G. Roberts, G.G. Ross, Nucl. Phys. B 406 (1993) 19, hep-ph/9303320;
D.S. Du, Z.Z. Xing, Phys. Rev. D 48 (1993) 2349;
L.J. Hall, A. Rasin, Phys. Lett. B 315 (1993) 164, hep-ph/9303303;
H. Fritzsch, D. Holtmannspotter, Phys. Lett. B 338 (1994) 290, hep-ph/9406241;
P.S. Gill, M. Gupta, J. Phys. G 21 (1995) 1;
P.S. Gill, M. Gupta, Phys. Rev. D 56 (1997) 3143, hep-ph/9707445;
H. Lehmann, C. Newton, T.T. Wu, Phys. Lett. B 384 (1996) 249;
Z.Z. Xing, J. Phys. G 23 (1997) 1563, hep-ph/9609204;
K. Kang, S.K. Kang, Phys. Rev. D 56 (1997) 1511, hep-ph/9704253;
T. Kobayashi, Z.Z. Xing, Mod. Phys. Lett. A 12 (1997) 561, hep-ph/9609486;
T. Kobayashi, Z.Z. Xing, Int. J. Mod. Phys. A 13 (1998) 2201, hep-ph/9712432;
J.L. Chkareuli, C.D. Froggatt, Phys. Lett. B 450 (1999) 158, hep-ph/9812499;
J.L. Chkareuli, C.D. Froggatt, H.B. Nielsen, Nucl. Phys. B 626 (2002) 307, hep-ph/0109156;
A. Mondragon, E. Rodriguez-Jauregui, Phys. Rev. D 59 (1999) 093009, hep-ph/9902240;
H. Nishiura, K. Matsuda, T. Fukuyama, Phys. Rev. D 60 (1999) 013006, hep-ph/9902385;
G.C. Branco, D. Emmanuel-Costa, R. Gonzalez Felipe, Phys. Lett. B 477 (2000) 147, hep-ph/9911418;
R. Rosenfeld, J.L. Rosner, Phys. Lett. B 516 (2001) 408, hep-ph/0106335;
H. Fritzsch, Z.Z. Xing, Phys. Lett. B 555 (2003) 63, hep-ph/0212195.
[19] G.C. Branco, J.I. Silva-Marcos, M.N. Rebelo, Phys. Lett. B 237 (1990) 446;
G.C. Branco, J.I. Silva-Marcos, Phys. Lett. B 359 (1995) 166;
G.C. Branco, D. Emmanuel-Costa, J.I. Silva-Marcos, Phys. Rev. D 56 (1997) 107.
[20] For the latest, see, e.g., S. Hannestad, hep-ph/0412181;
J.A. Aguilar-Saavedra, G.C. Branco, F.R. Joaquim, Phys. Rev. D 69 (2004) 073004, hep-ph/0310305;
V. Barger, D. Marfatia, K. Whisnant, Int. J. Mod. Phys. E 12 (2003) 569, hep-ph/0308123.
[21] See, e.g., P. Frampton, P.Q. Hung, M. Sher, Phys. Rep. 330 (2000) 263, hep-ph/9903387.
Gauge singularities in the SU(2) vacuum on the lattice
F. Gutbrod
Deutsches Elektronen-Synchrotron DESY, Notkestr. 85, D-22603 Hamburg, Germany
Received 20 December 2004; received in revised form 24 March 2005; accepted 24 May 2005
Abstract
I summarize and extend the qualitative results, obtained previously by inspection of SU(2) lattice
gauge field configurations. These configurations were generated by the Wilson action, then transformed to a Landau gauge and smoothed by Fourier filtering. This leads to sharp peaks in field
strengths and related quantities, the characteristics of which are very well separated from a background. These spikes are caused by gauge singularities. Their density is determined as $1.5/\mathrm{fm}^4$,
with very good scaling properties as a function of the bare coupling constant. The net number of
spikes within a configuration vanishes when approaching the deconfinement region. Furthermore,
the Landau-gauging procedure becomes unique, if the probability to find a spike is much smaller
than unity. The relation of the spikes to the instantons obtained by cooling is investigated. Finally,
a correlation between the presence of spikes and the infrared behaviour of the gluon propagator is
demonstrated.
PACS: 11.15.Ha
Keywords: Lattice; SU(2); Landau gauge; Singularities; Instantons
E-mail address: gutbrod@mail.desy.de (F. Gutbrod).

doi:10.1016/j.nuclphysb.2005.05.014
F. Gutbrod / Nuclear Physics B 720 (2005) 116-136
1. Introduction
If one transforms a single configuration of SU(2) lattice gauge fields to a Landau gauge
and then removes the high momentum Fourier components of the gauge fields by some
exponential cut-off ("Fourier filtering", F.f., or "smoothing"), one observes the following
phenomenon [1]: For sufficiently large lattices and/or for sufficiently large bare coupling
constants, at a couple of lattice positions the gauge fields show a rapid variation in all
space-time directions and for all colour components. These variations lead to narrow
spikes in the Wilson action density $S(x)$ and in the topological charge density$^1$ $q(x)$. In
group space, the spikes for $q(x)$ are either parallel or anti-parallel to those for the action,
and I associate a corresponding charge to them.
As an example, in Fig. 1 the charge density $q(x)$ is shown for a plane of a large lattice.
The cones, which are oriented according to the components $q^a(x)$ in group space (see
Eq. (14)), have been generated by the VRML-language [2] and visualized by the browser
GLVIEW [3]. The positions of almost all of the spikes do not vary in the different gauges
which are due to different Gribov copies [4,5]. For sufficiently modest filtering, the spatial
extension of these spikes amounts to one or two lattice units, which roughly corresponds
to 0.05 fm.
Due to Fourier smoothing, the quantities $S(x)$ and $q(x)$ are gauge dependent, and their
physical interpretation is not straightforward. The observation that, within the spikes, one
observes self-duality, i.e. $q(x) \approx S(x)$ within a 10% accuracy, might suggest that we
observe narrow instantons. Two facts contradict this interpretation. First, one observes that
the distribution in terms of $S(x)$ (summed over a few lattice sites around the spikes)
is strongly peaked for large action values. Those correspond to small-sized instanton-like
objects. For details see Section 3. On the contrary, several lattice studies in SU(2) predict
a size distribution with a maximum for sizes in the range from 0.2 fm [6] to 0.4 fm [7] and
around 0.3 fm or higher in SU(3) [8]. Secondly, if the configuration is smoothed by the
standard cooling technique [8,9], then one indeed observes instantons at the position of the
spikes, but their size is much larger than that of the spikes.
Now, if one transforms to a Landau gauge prior to cooling, one can observe directly
that the gauge fields are singular (regulated by the lattice) at the positions of the spikes. All
observations point towards the interpretation that the Landau gauging leads to singularities
in the gauge fields. Gauge singularities (GS) are known to appear in the continuum in
non-Abelian gauge theories, as no smooth gauging is possible [10]. The conditions under
which these singularities show up on the lattice (i.e. for which lattice sizes and coupling
constants) are the main topic of this paper.
The mechanism under which spikes arise under F.f. at the positions of GS can be studied
by imposing a singular gauge transformation$^2$
\[ g(x) = (x_0 + i\,\vec{x} \cdot \vec{\sigma})/|x| \tag{1} \]
on a relatively smooth gauge field configuration, e.g. on a large-size instanton in the regular gauge.$^3$ Then, close to the origin, there occurs a cancellation between the derivative
terms in the action and the cubic and higher gauge field terms. This cancellation will be
mutilated by filtering, which acts differently on the two contributions. Accordingly, under
F.f. a singular gauge field distribution will develop a sharp peak in the action and in $q(x)$,
even if both quantities are smooth in $x$ without filtering.$^4$ Thus, the plaquette action and
$q(x)$, after smoothing, no longer possess a definite physical meaning. They have to be
understood as a measure for unexpected things that happen while gauging.
1 The normalization of these two quantities will be given in Eqs. (13) and (14).
2 The origin $x = 0$ has to be chosen apart from a lattice site. $\vec{\sigma}$ are the Pauli matrices.
Fig. 1. The topological charge density on a lattice plane running through a typical spike. The direction of the
cones indicates the orientation of the charge in three-dimensional colour space. The lattice size is $48^3 \times 64$, and
$\beta = 2.85$. The cut-off parameter for the Fourier transformed gauge fields, squared, is 0.5 in lattice units.
Accepting the occurrence of spikes as a signal for GS, it is natural to ask for the correlation of these singularities with other nonperturbative phenomena on the lattice. Such
investigations are not new. E.g. the connection of nonpointlike GS with center vortices and
monopoles in the Laplacian gauge has been studied in [11], and similarly for instantons
with pointlike singularities in [12].
What can be expected for the correlation of GS with instantons in the Landau gauge?
First, a single instanton on a lattice of very large size will lead to gauge fields which are
singular (regulated by the lattice cut-off) at the center of the instanton. Configurations with
the singularity somewhere else still satisfy the differential Landau condition but not the
integral one.5 Thus, there is a one-to-one correlation between the two phenomena. Next,
for an ensemble of instantons with identical charge, the explicit construction [13] proceeds
via singularities at the various centers, and again one expects to see one spike per instanton.
At this point, however, the existence of Gribov copies [4] can modify the expectations, and
in numerical simulations it turns out that one may observe more spikes than (artificially
generated) instantons. This means that the Gribov phenomenon will, in general, prevent
us from establishing a one-to-one correlation between GS and instantons. A good guess
seems to be that the number of spikes in a configuration with only one kind of charge, is
equal to or larger than the number of (anti-)instantons.
Third, the situation for an ensemble of $i_+$ instantons and $i_-$ anti-instantons seems to
lead to additional complications, as the interpretation of close-by instanton-anti-instanton
pairs causes ambiguities in the Laplacian gauge [12]. Furthermore, under the standard cooling technique, which has been widely used to measure the topological susceptibility, the
annihilation of such pairs is a frequent phenomenon [7]. Thus, the difference of topological objects with charge $\pm 1$, i.e. $i_+ - i_-$, can be determined significantly better than the
total number of objects, i.e. $i_+ + i_-$. Since the number of spikes, denoted by $n_\pm$, can
be determined without a significant systematic error, we will expect to find more spikes
than instantons. This holds at least for prolonged cooling and if the value of the bare coupling $g_0$ is chosen small enough. In SU(2), a limit for a safe identification of spikes is
$\beta = 4/g_0^2 \geq 2.70$.
Finally, one has to consider the results for topological charge fluctuations, obtained via
the overlap Dirac operator, which cast doubts on the existence of individual localized pieces
of the charge [14]. Whether this leads to a significant uncertainty for the determination of
$i_+ + i_-$ remains to be seen.
3 The generalization to several singularities at different positions is obvious.
4 This phenomenon has been checked by analytic calculations in the continuum by H. Joos.
5 This statement depends on definitions of what is meant by the Landau gauge on the lattice. They will be
given in the next section.
In this paper, I will concentrate on the phenomenology of the spikes, i.e. on their density
and on their correlation with other nonperturbative phenomena. The following topics will
be treated:
(1) For large $\beta$, the spikes are very well defined objects, if the action (or $q(x)$), summed
over a few lattice units around the action maximum, is taken as a probe. This is
demonstrated by the histogram in Fig. 2, which is based on O(30) configurations on
large lattices and large $\beta$. The histogram displays a well defined valley between a peak
containing the spikes (all satisfying self-duality $q(x) \approx S(x)$), and a background
which contains distinctly different parts of the lattice configuration. This allows one to
determine the density of spikes for $\beta = 2.85$ as
\[ \rho_{\mathrm{spikes}} \equiv (n_+ + n_-)/V = (1.5 \pm 0.20)/\mathrm{fm}^4. \tag{2} \]
Since this number has no significant systematic error, the density of spikes in physical
units can be used for a scaling study as a function of $\beta$. Very good scaling with the
string tension results [15] is observed in the interval $2.70 \leq \beta \leq 2.85$. For details see
Section 3. The relation of the density of spikes on various lattice sizes and of the
deconfinement transition will also be discussed in that section, as well as the robustness
of the GS-finder.
(2) The density of spikes is correlated with the average number of Gribov copies which
are found during the Landau gauging process. This number can be determined if the
lattice is chosen so small that the probability for finding one or more spikes in the
configuration is less than unity. Then, empirically, the gauging procedure is practically unique. This means that the Landau gauging algorithm ends up (with high
probability) preferentially with the same value for the gauge functional $F(U)$ (see
Eq. (5)), if only a finite number of gauging trials is performed (a trial is defined by applying a random gauge prior to the Landau gauging). The observation is that if copies
are found, the probability to find spikes is enhanced too. This positive correlation between the appearance of spikes and copies will be demonstrated in Section 4.
A suggestive interpretation of this phenomenon can be given by the fact that for a
single instanton the gauge functional has a local minimum if the GS (with a spike) is
located at the center of the instanton (see also Section 2). It is natural to assume that
this minimum also shows up in a general environment of gauge fields, thus leading to
the appearance of an additional Gribov copy. The probability for this effect to occur
clearly increases with the current number of spikes.
(3) It has been stated previously that one expects the density of spikes to be somewhat
greater than the density of instantons. For the determination of the latter, I will rely on
the cooling technique [8,9]. In Section 5, it is shown that there is a clear connection
between the outcome of cooling and of F.f. If one starts from a configuration in the
Landau gauge, finds the position of its spikes and then cools the unfiltered lattice configuration, one observes the following: the absolute values of the gauge fields, $|\vec{u}_\mu(x)|$
(see Eq. (3)), decrease under cooling almost everywhere, except at the positions of the
spikes. There, the gauge singularities show up, both by the strong increase in magnitude as well as by the typical sign flips. The local action maxima, which accompany the
singularities, are broad in most cases, showing self duality, and they can be interpreted as
instantons. The probability to find such instantons at the positions of spikes is about
80%. With this limitation, the search for spikes becomes a kind of an instanton finder.
(4) The fact that the spikes manifest themselves as zeros in the gauge fields (in all directions and for all colours) suggests that the gluon propagator and other correlators
of the gauge fields behave differently if spikes are present in a configuration or if not.
This is the case, as will be shown in Section 6.
2. Landau gauge and filtering

Here a couple of topics relevant for gauging and filtering will be discussed. The Landau gauging algorithm on the lattice maximizes the sum of the "large" SU(2)-components
$u_{0,\mu}(x)$ in the link representation
\[ U_\mu(x) = u_{0,\mu}(x) + i\,\vec{\sigma} \cdot \vec{u}_\mu(x) \tag{3} \]
with
\[ u_{0,\mu}^2(x) + \vec{u}_\mu^{\,2}(x) = 1. \tag{4} \]
Thus, one searches numerically for minima of
\[ F(U) = \sum_{x,\mu} \bigl(1 - u_{0,\mu}(x)\bigr) \tag{5} \]
under local gauge transformations $U$. This I call the integral Landau gauge condition. Its
aim is to bring the "small" SU(2) components to the perturbative regime as closely as
possible.
If the minimum condition is fulfilled, one has
\[ \sum_\mu \bigl(\vec{u}_\mu(x) - \vec{u}_\mu(x - e_\mu)\bigr) = 0. \tag{6} \]
This is the differential Landau gauge condition, which corresponds to
\[ \partial_\mu \vec{A}_\mu = 0 \tag{7} \]
in the continuum.
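The role of the gauge functional can be illustrated on a toy lattice. The sketch below is an assumption-laden illustration, not the algorithm used in the paper: it stores each link as the four numbers $(u_0, \vec{u})$ of Eq. (3), uses a small two-dimensional periodic lattice instead of a four-dimensional one, and applies the usual lattice gauge transformation $U_\mu(x) \to g(x)\,U_\mu(x)\,g^\dagger(x + e_\mu)$. All function names are hypothetical. It shows that the trivial configuration has $F = 0$, that a random local gauge transformation of it raises $F$, and that a global (site-independent) transformation leaves $F$ unchanged.

```python
import math
import random


def qmul(a, b):
    """Product of two SU(2) elements stored as (u0, u1, u2, u3) = u0 + i sigma.u."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + b0 * a1 - (a2 * b3 - a3 * b2),
        a0 * b2 + b0 * a2 - (a3 * b1 - a1 * b3),
        a0 * b3 + b0 * a3 - (a1 * b2 - a2 * b1),
    )


def qdag(a):
    """Hermitian conjugate: u0 - i sigma.u."""
    return (a[0], -a[1], -a[2], -a[3])


def random_su2(rng):
    """Random SU(2) element (uniform on the unit 3-sphere)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


def gauge_functional(links):
    """F(U) = sum over sites and directions of (1 - u0), as in Eq. (5)."""
    return sum(1.0 - u[0] for u in links.values())


def gauge_transform(links, g, size):
    """U_mu(x) -> g(x) U_mu(x) g^dagger(x + e_mu) on a periodic 2D lattice."""
    out = {}
    for (x, y, mu), u in links.items():
        nxt = ((x + 1) % size, y) if mu == 0 else (x, (y + 1) % size)
        out[(x, y, mu)] = qmul(qmul(g[(x, y)], u), qdag(g[nxt]))
    return out


SIZE = 4
rng = random.Random(0)
sites = [(x, y) for x in range(SIZE) for y in range(SIZE)]
unit_links = {(x, y, mu): (1.0, 0.0, 0.0, 0.0) for (x, y) in sites for mu in (0, 1)}

f_trivial = gauge_functional(unit_links)

local_g = {s: random_su2(rng) for s in sites}
rough_links = gauge_transform(unit_links, local_g, SIZE)
f_rough = gauge_functional(rough_links)

g0 = random_su2(rng)
global_g = {s: g0 for s in sites}
f_global = gauge_functional(gauge_transform(rough_links, global_g, SIZE))

print(f_trivial, f_rough, f_global)
```

A Landau gauging algorithm would iterate over local transformations to lower `f_rough` again; the existence of several local minima of $F$ along one gauge orbit is what is referred to as Gribov copies.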
In the introduction it has been stated that the gauge fields of a single instanton in the
Landau gauge come in the singular gauge, thus leading to a spike. This assertion is based on
the condition that the lattice volume $V$ is sufficiently large. In that case$^6$ the regular gauge
cannot satisfy the integral gauge condition since the gauge functional $F(U)$ diverges in the
limit $V \to \infty$. For the singular gauge, $F(U)$ remains finite in this limit. The GS will be
located at the center of the instanton. This can be verified by starting from a configuration
where the GS is located a few lattice sites apart from the center. This configuration satisfies
the differential gauge condition, but not the integral one. After a few attempts to minimize
$F(U)$ numerically, the situation becomes unstable and the GS moves towards the center.
It also has been checked that the gauge functional is quadratic in the distance from the
center, at least up to $5a$. Whether it can be shown rigorously that the singular gauge is
the absolute minimum is not known.
6 For lattice sizes smaller than the instanton radius the singular gauge gives a contribution larger than the
regular one. Furthermore, the impossibility of constructing a single instanton on the torus might then spoil the
argument.
Since in the approach to the continuum the Landau gauging drives the configuration to
the region of small vector components $\vec{u}_\mu(x)$, variables with $u_{0,\mu}(x) < 0$ become very rare.
For presently available coupling constants, $\beta \leq 2.85$, this goal has almost been achieved.
This allows for a linearization and subsequent Fourier transformation of the gauge field
variables. Besides the trivial linearization,
\[ \vec{u}_\mu(x) \equiv \vec{A}_\mu(x), \tag{8} \]
I use a stereographic transformation in order to minimize errors due to residual negative
$u_{0,\mu}(x)$. There I define gauge fields by
\[ \vec{u}_\mu \sqrt{2/(1 + u_{0,\mu})} = \vec{w}_\mu(x) \equiv \vec{A}_\mu. \tag{9} \]
This transformation brings the south-pole $u_0 = -1$ not to infinity, but to $\vec{w}^{\,2} = 4$. The
Fourier filtering is applied to the gauge fields $\vec{A}_\mu(x)$ (see Eq. (16)), and after suppression
of high momentum Fourier amplitudes, $\vec{A}_\mu \to \tilde{\vec{A}}_\mu$, the gauge fields are transformed back
to SU(2)-variables. The back-transformation is
\[ \tilde{\vec{w}}_\mu = \tilde{\vec{A}}_\mu, \tag{10} \]
\[ \tilde{\vec{u}}_\mu = \sqrt{1 - \tilde{\vec{w}}_\mu^{\,2}/4}\;\tilde{\vec{w}}_\mu, \tag{11} \]
\[ \tilde{u}_{0,\mu} = \pm\sqrt{1 - \tilde{\vec{u}}_\mu^{\,2}}, \quad - \text{ sign for } \tilde{\vec{w}}_\mu^{\,2} > 2. \tag{12} \]
The differences between the two methods of linearization amount to 5% for the field
strengths, with no impact on the existence and properties of the spikes.
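The stereographic map and its inverse can be checked numerically. The sketch below assumes the map $\vec{w} = \vec{u}\,\sqrt{2/(1+u_0)}$, which is the form consistent with the south pole $u_0 = -1$ being sent to $\vec{w}^2 = 4$ and with the back-transformation of Eqs. (10)-(12); the function names are illustrative.

```python
import numpy as np

def to_stereographic(u0, uvec):
    """Map SU(2) components (u0, uvec) to w = uvec * sqrt(2 / (1 + u0)).

    The map is singular exactly at the south pole u0 = -1.
    """
    u0 = np.asarray(u0, dtype=float)
    uvec = np.asarray(uvec, dtype=float)
    return uvec * np.sqrt(2.0 / (1.0 + u0))[..., None]

def from_stereographic(w):
    """Invert the map: uvec = sqrt(1 - w^2/4) * w, u0 = +/- sqrt(1 - uvec^2)."""
    w = np.asarray(w, dtype=float)
    w2 = np.sum(w * w, axis=-1)
    # Clip guards against w^2 slightly above 4 after filtering or rounding.
    uvec = np.sqrt(np.clip(1.0 - w2 / 4.0, 0.0, None))[..., None] * w
    u2 = np.sum(uvec * uvec, axis=-1)
    sign = np.where(w2 > 2.0, -1.0, 1.0)       # minus sign for w^2 > 2
    u0 = sign * np.sqrt(np.clip(1.0 - u2, 0.0, None))
    return u0, uvec
```

With this form one has $\vec{w}^2 = 2(1 - u_0)$, so the whole group manifold is mapped into the ball $\vec{w}^2 \le 4$, and links with negative $u_0$ correspond to $\vec{w}^2 > 2$.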
With these variables, the action and the topological charge density will be defined in
the following way: first of all, the field strength tensor $F^{(a)}_{\mu\nu}(x)$ is calculated by averaging
the link products around the 4 plaquettes $P_{\mu'\nu'}$, with $\mu', \nu' = \pm\mu, \pm\nu$, which are connected to $x$.
Given $F^{(a)}_{\mu\nu}(x)$, electric fields $\vec{E}^{(a)}(x)$ and magnetic fields $\vec{B}^{(a)}(x)$ are defined, and the
action is
$$S(x) = \frac{1}{2} \sum_a \bigl[\vec{E}^{(a)}(x)\cdot\vec{E}^{(a)}(x) + \vec{B}^{(a)}(x)\cdot\vec{B}^{(a)}(x)\bigr]. \tag{13}$$
For convenience, I define
$$\tilde{q}(x) = \sum_a \tilde{q}^{a}(x) = \sum_a \vec{E}^{(a)}(x)\cdot\vec{B}^{(a)}(x) \tag{14}$$
and call it topological charge density, in spite of the nonstandard normalization. For selfdual objects, one has $S(x) = \tilde{q}(x)$.
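As an illustration of Eqs. (13) and (14), the following sketch evaluates both densities from given electric and magnetic fields. The array layout (last two axes: colour and spatial component) and the function names are assumptions made for this example.

```python
import numpy as np

def action_density(E, B):
    """S(x) = 1/2 * sum_a (E^a . E^a + B^a . B^a), cf. Eq. (13).

    E and B have shape (..., 3, 3): site indices, colour a, spatial component.
    """
    return 0.5 * (np.sum(E * E, axis=(-2, -1)) + np.sum(B * B, axis=(-2, -1)))

def charge_density(E, B):
    """q(x) = sum_a E^a . B^a, cf. Eq. (14), in the nonstandard normalization."""
    return np.sum(E * B, axis=(-2, -1))
```

With this normalization a self-dual field (B = E) gives S = q, an anti-self-dual one (B = -E) gives S = -q, and in general S is bounded below by |q|.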
Due to the slight nonlocality of the operator $\tilde{q}(x)$, it should have a negative correlator [16]
$$\langle \tilde{q}(x)\,\tilde{q}(0) \rangle \le 0 \tag{15}$$
only for $x^2 > 4a^2$. This property has been successfully checked for 10 configurations on
large lattices. The negativity breaks down for filtered and for cooled configurations.7 This
can be read off, for instance, from Fig. 16 of Ref. [7], and it also has been verified for the
cooled configurations used in Section 5.
3. Density of spikes
Here it will be shown that the occurrence of spikes is a well defined phenomenon in the
sense that the spikes are separated from a background of action maxima by a deep valley,
such that their density can be determined without significant systematic error. This is true,
especially, for large values of $\beta$. For smaller values, the depth of the valley diminishes
somewhat, without a severe loss of significance, as long as $\beta \ge 2.70$. Thus, it can be tested
quite well whether the density of spikes scales under a variation of $\beta$ in accordance with
the string tension, in spite of the limited number of spikes identified so far.8
Another important question is how the average number of spikes varies with the lattice
volume, if $\beta$ is kept fixed. For this purpose, in addition to the large lattice of size $48^3 \times 64$,
two smaller ones have been studied, namely $24^3 \times 32$ and $16^3 \times 128$, both at $\beta = 2.85$.
The evidence, presented here, is that the density is roughly independent of the volume.
This holds even if the volume is smaller than that critical one, which leads to the deconfinement phase transition (for those lattice parameters see [17]). At the critical volume,
one obtains, on the average, approximately one spike in the configurations. Then, spikes
occur predominantly in close-by pairs of opposite charge.
3.1. Definition of spikes and the scaling of their density
I first describe the technique used for identifying spikes, starting with the situation on
large lattices. The following quantitative analysis is based on
(A) 34 configurations on a large lattice with size $48^3 \times 64$ at $\beta = 2.85$, with twisted boundary conditions [18], and
(B) 25 configurations on a smaller lattice with size $32^3 \times 48$ at $\beta = 2.70$, also with twisted
boundary conditions.
7 Obviously, the negativity excludes the dominance of four-dimensional, sign coherent structures in the true
vacuum, as has been emphasized by [14].
8 This is true because the change of scale enters the density with the 4th power.
Each of these configurations has been gauged to a Landau gauge and then Fourier filtered, with the following substitution for the momentum space amplitudes
$$\vec{A}_\mu(q) \to \tilde{\vec{A}}_\mu(q) = \vec{A}_\mu(q)\, \exp(-\lambda^2 q^2). \tag{16}$$
For identifying spikes, I use $\lambda^2 = 0.5a^2$. This modest filtering already leads to a strong
suppression of the perturbative noise of the $\vec{A}_\mu$. The filtered gauge fields were transformed
back to x-space and subsequently reunitarized (see Eq. (10)).
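The exponential filter of Eq. (16) can be sketched with a fast Fourier transform. The example below treats one real component of the gauge field on a periodic lattice and uses the plain momenta $q_\mu = 2\pi n_\mu / L_\mu$ in units of $1/a$; whether the original analysis uses these or the lattice momenta $2\sin(q_\mu/2)$ is not stated here, so this choice is an assumption, as are the names `fourier_filter` and `lam2`.

```python
import numpy as np

def fourier_filter(field, lam2):
    """Damp the Fourier amplitudes of a periodic lattice field by exp(-lam2 * q^2).

    field : real array over the lattice sites (one component of the gauge field).
    lam2  : cut-off parameter in units of a^2 (0.5 is the value quoted for spikes).
    """
    # Momenta q_mu = 2*pi*n_mu/L_mu in units of 1/a (an assumed convention).
    q2 = np.zeros(field.shape)
    for axis, size in enumerate(field.shape):
        q = 2.0 * np.pi * np.fft.fftfreq(size)
        shape = [1] * field.ndim
        shape[axis] = size
        q2 = q2 + q.reshape(shape) ** 2
    amplitudes = np.fft.fftn(field) * np.exp(-lam2 * q2)
    return np.fft.ifftn(amplitudes).real
```

A constant field (zero momentum) passes through unchanged, while a single plane wave of momentum q is reduced by exactly the factor exp(-lam2 * q^2).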
For each of these filtered configurations, I have identified O(500) local maxima of the
action density S(x). Then, S(x) has been summed within a radius of R = 2a (a is the
lattice constant) around the positions $x_m$ of these maxima,
$$\Sigma_m(\lambda^2, R) = \sum_{(x - x_m)^2 \le R^2} S(x). \tag{17}$$
For case (A), the distribution of the values of $\Sigma_m(\lambda^2, R)$ is shown as the solid line
histogram of Fig. 2. The distribution is normalized by the physical volume $V(\beta)$ of the
lattice (see below). It is remarkable how clearly the peak at large $\Sigma_m(\lambda^2, R)$ is separated
from the background at smaller values, with a dip around $\Sigma_m(\lambda^2, R) = 8$.
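The selection of maxima and the summation of Eq. (17) could be organized as follows. This is only a sketch: a local maximum is taken to be a site exceeding its nearest neighbours along each axis, and the lattice is treated as periodic; both are assumptions, and the function names are illustrative.

```python
import itertools
import numpy as np

def ball_offsets(ndim, radius):
    """All integer offsets d with d.d <= radius^2."""
    r = int(np.floor(radius))
    return [d for d in itertools.product(range(-r, r + 1), repeat=ndim)
            if sum(c * c for c in d) <= radius * radius]

def summed_action(S, radius):
    """For every site x_m, sum S(x) over (x - x_m)^2 <= R^2, cf. Eq. (17).

    Assumes a periodic lattice whose extents exceed 2*radius in every direction.
    """
    total = np.zeros(S.shape, dtype=float)
    axes = tuple(range(S.ndim))
    for d in ball_offsets(S.ndim, radius):
        total += np.roll(S, shift=d, axis=axes)
    return total

def local_maxima(S):
    """Boolean mask of sites exceeding both nearest neighbours along every axis."""
    mask = np.ones(S.shape, dtype=bool)
    for axis in range(S.ndim):
        mask &= (S > np.roll(S, 1, axis=axis)) & (S > np.roll(S, -1, axis=axis))
    return mask
```

In four dimensions the ball with R = 2a contains 89 lattice sites, so the summed quantity averages the action over a small but not tiny neighbourhood of each maximum. Candidates would then be obtained as `summed_action(S, 2.0)[local_maxima(S)]` and histogrammed.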
Furthermore, the data for case (B) are shown in Fig. 2 as a broken line. In order to
compare these data with those of case (A), one has to fix the ratios of the physical volumes.
For this, I use
$$\frac{V(\beta = 2.7)}{V(\beta = 2.85)} = \frac{a(\beta = 2.70)^4 \cdot 32^3 \times 48}{a(\beta = 2.85)^4 \cdot 48^3 \times 64} \tag{18}$$
with [15] $a(\beta = 2.70)/a(\beta = 2.85) = 1.6$.
Fig. 2. Histograms for summed Wilson action $\Sigma(\lambda^2, R)$ for $\beta = 2.85$, lattice size $= 48^3 \times 64$ (full line), and
for $\beta = 2.70$, lattice size $= 32^3 \times 48$ (broken line). The summed Wilson action $\Sigma(\lambda^2, R)$ and R are defined in
Eq. (17). The histograms are normalized to $V(\beta)$, the physical volume of the lattices.
Now, it is natural to define the average density of spikes per configuration by the number
of objects with $\Sigma_m(\lambda^2, R) \ge 8$. In case (A), this gives
$$N_{\rm spikes}(\beta = 2.85) = 6.6 \pm 0.5 \text{ spikes per configuration}, \tag{19}$$
and, using the string tension scale [15] $a(\beta = 2.85) = 0.028$ fm,
$$\rho_{\rm spikes}(\beta = 2.85) = 1.52 \pm 0.2 \text{ spikes/fm}^4. \tag{20}$$
The corresponding results for case (B) are:
$$N_{\rm spikes}(\beta = 2.70) = 9.6 \pm 0.6 \text{ spikes per configuration}, \tag{21}$$
and
$$\rho_{\rm spikes}(\beta = 2.70) = 1.51 \pm 0.2 \text{ spikes/fm}^4. \tag{22}$$
Thus, Eqs. (20) and (22) show with good accuracy, that the density of spikes obeys scaling
according to the nonasymptotic variation of the string tension, and that the density itself
is a rather dilute one. The spikes themselves are better defined for large values of $\beta$ than
for small ones, and this implies, together with the scaling of the density, that they are not
a remnant of the strong coupling regime, which would become irrelevant in the continuum
limit.
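The arithmetic behind the volume ratio of Eq. (18) and the densities of Eqs. (20) and (22) can be retraced from the numbers quoted above. The sketch below uses only those inputs (lattice sizes, a = 0.028 fm at the larger coupling, and the ratio 1.6 of lattice spacings); the helper name `physical_volume` is illustrative.

```python
def physical_volume(shape, a_fm):
    """Physical volume in fm^4 of a lattice with the given extents and spacing a."""
    sites = 1
    for extent in shape:
        sites *= extent
    return sites * a_fm ** 4

a_285 = 0.028            # fm, string tension scale at beta = 2.85 [15]
a_270 = 1.6 * a_285      # fm, from the ratio a(2.70)/a(2.85) = 1.6 [15]

V_A = physical_volume((48, 48, 48, 64), a_285)   # case (A)
V_B = physical_volume((32, 32, 32, 48), a_270)   # case (B)

volume_ratio = V_B / V_A   # cf. Eq. (18)
rho_A = 6.6 / V_A          # spikes per fm^4, cf. Eq. (20)
rho_B = 9.6 / V_B          # spikes per fm^4, cf. Eq. (22)
```

The volume ratio comes out close to 1.46, nearly the same as the ratio 9.6/6.6 of the spike counts, which is the scaling statement in numerical form. Both densities agree with the quoted values to within the rounding of the inputs.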
Under certain assumptions of diluteness, the density of spikes may be identified with
the topological charge density. In conventional units, this amounts to
$$\chi^{1/4} = (218 \pm 5) \text{ MeV}. \tag{23}$$
This value is in very good agreement with those obtained by the cooling method [6] and
by the fermionic method [19]. The assumptions of diluteness are:
- The spikes are associated with topological objects with charge $\pm 1$.
- It is possible to subdivide the large lattice into smaller ones such that one finds at most
  one spike in any of these sublattices. The spikes in different sublattices are statistically
  independent.
The last assumption does not hold in the deconfinement region (see next subsection), and
the average size of instantons [9] seems to be at variance with diluteness.
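The conversion to conventional units in Eq. (23) amounts to taking the fourth root of the density and multiplying by $\hbar c$. A short sketch, with $\hbar c = 197.327$ MeV fm as the only input not quoted in the text:

```python
HBARC_MEV_FM = 197.327   # hbar*c in MeV fm

def fourth_root_in_mev(density_per_fm4):
    """Convert a density in fm^-4 to its fourth root in MeV."""
    return density_per_fm4 ** 0.25 * HBARC_MEV_FM

chi_quarter = fourth_root_in_mev(1.52)   # density from Eq. (20)
```

This gives roughly 219 MeV, inside the range quoted in Eq. (23). Because of the fourth root, the relative uncertainty of the density enters reduced by a factor of four.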
3.2. Densities of spikes on smaller lattices
It is well known that the topological charge susceptibility becomes extremely small if
the lattice size is chosen smaller than the deconfining volume [20]. This can be accomplished either by a sharp decrease of both $n_+$ and $n_-$ when crossing the transition, or by
a decrease of $|n_+ - n_-|$ only, with little variation for $n_+ + n_-$. It turns out that the latter
scenario is realized.
In order to show this, I have made long runs for lattice sizes $24^3 \times 32$ (case (C)) and
$16^3 \times 128$ (case (D)), both at $\beta = 2.85$. Marginally, the lattice of case (C) is in the confining
region, if the spatial size of this lattice is identified with the temporal size $N_\tau$ of the lattices
of Ref. [17]. In that paper, it is shown that the critical value $N_{\tau,c}$ of the temporal size is
related to the string tension by
$$(a N_{\tau,c} \sqrt{\sigma})^{-1} = 0.69 \pm 0.02, \tag{24}$$
and that this holds independently of $\beta$. Using the string tension $a\sqrt{\sigma} = 0.063 \pm 0.003$ at
$\beta = 2.85$ from [15], this gives
$$N_{\tau,c} = (23.0 \pm 1.5)a. \tag{25}$$
Thus, the lattice of case (C) is rather close to the critical one.9 The result of the long runs
is
$$N_{\rm spikes} = 0.37 \pm 0.03 \text{ spikes per configuration} \tag{26}$$
which is well compatible with the value $N_{\rm spikes} = 0.41 \pm 0.03$ obtained from Eq. (19) under
the assumption of a volume-independence of the density. Furthermore one realizes that the
number of spikes per configuration drops below unity around the deconfinement transition.
For case (D) with a lattice of size $16^3 \times 128$, we can be sure to be deep in the deconfinement region. Thus, it is convincing that the average number of spikes turns out as
$$N_{\rm spikes} = 0.67 \pm 0.06 \text{ spikes per configuration} \tag{27}$$
which is slightly larger than the number $N_{\rm spikes} = 0.49 \pm 0.04$ expected from Eq. (19).
Thus, in the deconfinement region, the expectation value for spikes is definitely nonzero,
and it is realized by close-by pairs of spikes of positive orientation (i.e. with $\tilde{q}(x)$ parallel
to the action) and negative oriented ones. The average charge is compatible with zero.
A precise upper limit is difficult to obtain, due to the rather long autocorrelation times.
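The numbers used in this subsection follow from two short calculations: the critical extent of Eq. (25) from the inputs 0.69 and 0.063, and the spike counts expected on the smaller lattices if the density is independent of the volume. A minimal sketch, with illustrative function names:

```python
import math

def critical_extent(tc_over_sqrt_sigma, a_sqrt_sigma):
    """Critical extent in lattice units from (a N_c sqrt(sigma))^-1, cf. Eq. (24)."""
    return 1.0 / (tc_over_sqrt_sigma * a_sqrt_sigma)

def expected_spikes(n_ref, shape_ref, shape):
    """Scale a spike count by the ratio of lattice volumes at fixed coupling."""
    return n_ref * math.prod(shape) / math.prod(shape_ref)

n_crit = critical_extent(0.69, 0.063)                              # cf. Eq. (25)
n_C = expected_spikes(6.6, (48, 48, 48, 64), (24, 24, 24, 32))     # case (C)
n_D = expected_spikes(6.6, (48, 48, 48, 64), (16, 16, 16, 128))    # case (D)
```

The results are close to 23.0 for the critical extent and to 0.41 and 0.49 for the expected counts, as quoted above. The lattice of case (C) has 1/16 of the volume of case (A), which is why its expected count is simply 6.6/16.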
3.3. How robust is the definition of spikes?
The fact mentioned above, that the dip is less pronounced at $\beta = 2.7$ than at the higher
value of $\beta$, is, most likely, not due to a statistical fluctuation. Data at $\beta = 2.6$ show that the
peak for high values of $\Sigma_m(\lambda^2, R)$ changes in shape towards a plateau. I expect the reason
to be that at low $\beta$ the spikes are not as stable as at larger $\beta$, and a snapshot of a single
configuration will find both well developed spikes and decaying/growing ones. One has to
conclude that for a precise determination of the density a lower limit of $\beta \ge 2.7$ has to be
accepted.
The details of the histogram shown in Fig. 2 depend on the radius R over which the
action is summed up, and on the value of the Fourier filtering parameter $\lambda$. Both parameters
can be varied in a reasonably wide range. First of all, instead of summing the action over
a small volume, one can select and order the spikes simply according to the local maximal
values of S(x) or $|\tilde{q}(x)|$, i.e. setting R = 0. An upper limit for R can be obtained from the
distribution of S(x) as function of the distance from the center of the spike, as shown in
Fig. 3(A). There, the full squares indicate the results obtained from a spike at $\beta = 2.85$,
and the empty squares are derived from an artificial instanton in singular gauge with a
radius $\rho = 7a$, both filtered with $\lambda^2 = 0.5a^2$. It is evident that beyond r = 3a the action
distributions differ appreciably, and the values for the spike can no longer be separated
from a background. In Fig. 3(B), the topological charge density is plotted, again for a
thermalized configuration and for an artificial instanton. Also for this variable it is evident
that nothing can be gained for r > 3a. Finally, comparing the cases (A) and (B), one notices
that self-duality is fulfilled on the level of 10%.
9 Actually, for this geometry it is more appropriate to use the ratio of certain glueball masses [21,22], which
gives $N_{\rm step} = 27.7a$.
Fig. 3. Action (A) and topological charge (B). Open squares: action and $|\tilde{q}(x)|$ at distance r from the center of an
artificial instanton in singular gauge, Fourier smoothed with $\lambda^2 = 0.5a^2$. Full squares: action and $|\tilde{q}(x)|$ at distance
r from center of a spike, Fourier smoothed with $\lambda^2 = 0.5a^2$. The lattice size is $48^3 \times 64$, and $\beta = 2.85$.
It should be noted that the radius of the instanton which has been used in Fig. 3, is of
little influence as long as $\rho > 3a$. The radial dependences of S(r) and $|\tilde{q}(r)|$ are generic
for a gauge singularity imposed on a smooth background, and they are modified only if
this background varies strongly on the scale of the lattice constant a. For the same reason,
it is impossible to decide whether the background is due to other topological objects like
merons (for recent literature see [23]).
Next, turning to the dependence on the cut-off parameter $\lambda^2$, a lower limit for this is
given by the signal-to-noise ratio determined by the perturbative background. This requires
$\lambda^2 \ge 0.15a^2$, as can be deduced from Fig. 6 of Ref. [1]. For rather large values of $\lambda^2$, one
may encounter the difficulty that the spikes become so broad that they start to overlap.
Among 6 configurations at $\beta = 2.85$ with 57 spikes, this did not happen for $\lambda^2 \le 2.0a^2$.
I conclude that for large $\beta$, there is a sufficiently large window for the parameters of Fourier
filtering within which there is no significant uncertainty for identifying spikes.
4. Gribov copies and spikes

The main result of the investigation to be presented next is that the number of spikes
and the number of different Gribov copies in a configuration are strongly and positively
correlated. A simple interpretation for this result will be given at the end of this section.
The existence of Gribov copies is made evident by applying random gauge transformations prior to the Landau gauging. On large lattices, the final value of the gauge function
F (U ), Eq. (5), will differ for each set of random numbers, and similarly the positions of
eventual spikes will differ. This situation is difficult to analyze. For smaller lattices where
the density of spikes is somewhat less than unity per configuration, both phenomena can
be studied with sufficient statistics. The lattice case (C) considered in Section 3 with size
$24^3 \times 32$ and with $\beta = 2.85$ meets this requirement, as the number of spikes per configuration is about 0.4.
The observation is based on two long sequences with about 200 000 updates for each
sequence. The approximate number of Gribov copies has been determined after every 1000
sweeps. For this purpose, 12 random Landau gauges were performed, the 12 values of
the gauge function were compared and the number of different gauge functions, found in
this way, was recorded. Simultaneously, the number of spikes was found by filtering the
configuration with a cut-off $\lambda^2 = 0.5a^2$ and by searching for local action maxima, which
exceeded the background by a factor three or more. The results for copies and spikes are
the following:
- Configurations without Gribov copies follow each other in long sequences, i.e. for
  O(10 000) lattice updates. When spikes occur, they preferentially show up in all random gauges, and very often more than one spike is found per gauge. Thus, spikes and
  Gribov copies occur, during the process of updating, in lumps.
- The appearance of Gribov copies is not exactly correlated with the number of spikes,
  i.e. not on a 1 : 1 basis for each configuration. There are configurations without
  copies,10 but with spikes, and vice versa.
- If both the number of copies and the number of spikes per configuration are smeared
  over a few adjacent measured configurations (just to improve the presentation), a striking correlation shows up. This is demonstrated in Fig. 4. On the left-hand side, the
  first 100 000 sweeps are shown, with the slim points giving the number of copies, and
  the fat points giving the number of spikes (averaged over the 12 random gauges). The
  results from the next 100 000 sweeps are plotted on the right-hand side.
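The counting of copies described before the list, where the final values of the gauge function from 12 random Landau gauges are compared, can be sketched as follows. The tolerance below which two values are regarded as equal is not given in the text; `tol` is a hypothetical parameter, as is the function name.

```python
def count_gribov_copies(functional_values, tol=1e-6):
    """Number of distinct values among the final gauge functionals of several runs.

    Values closer than tol (a hypothetical tolerance) to the previously
    accepted value are counted as the same copy.
    """
    distinct = []
    for value in sorted(functional_values):
        if not distinct or value - distinct[-1] > tol:
            distinct.append(value)
    return len(distinct)
```

In this convention a configuration "without Gribov copies" returns 1, since all random gauges end at the same minimum. As footnote 10 points out, a count based on a finite number of random gauges is only a lower bound.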
A clue for the interpretation of this correlation is given by the observation that the
positions of spikes in different Gribov copies tend to agree with each other, within a shift
of 1 or 2 lattice units. This has already been observed in Ref. [1], and it implies that the GS
are coupled to objects the positions of which are gauge invariant. It is natural to consider
the positions of instantons as candidates for these objects. Now, in Section 2 it has been
remarked that the gauge functional is quadratic in the distance of the GS from the center
of a single instanton. I assume that this also is true, at least with a high probability,
for instantons in a physical environment. Thus, the spike is associated with an instanton,
the latter with a local minimum and the minimum with an additional Gribov copy. It has
to be repeated, however, that the correlation between spikes and copies holds only in a
probabilistic sense, and it will therefore be difficult to arrive at a rigorous statement.
10 Of course, it is possible that some copies might be found if more random gauges had been tested.
Fig. 4. Averaged number of Gribov copies per configuration (slim points) and averaged number of spikes per
configuration (fat points) as function of number of sweeps in long updating sequence. The lattice size is $24^3 \times 32$,
and $\beta = 2.85$. The Fourier cut-off is $\lambda^2 = 0.5a^2$.
5. Cooling a gauged configuration

The expectation, which has been formulated in the introduction, that the number and
the positions of GS and of instantons are closely coupled, can be verified by direct inspection of lattice gauge configurations. I define candidates for instantons by simple cooling of
thermalized configurations and by comparing gauge fields, actions and $\tilde{q}(x)$ of the cooled
and of the filtered configuration. First of all, it is reported in Section 5.1 that at the positions
of spikes the cooled configuration directly shows singularities in the gauge fields, and that
the action and $\tilde{q}(x)$ have extrema at the position of the spikes for about 80% of the
spikes.
The next topic is what happens in the filtered configuration around the positions of those
extrema (of the cooled configuration) which are not associated with spikes. In general, one
also will encounter extrema but they fail to satisfy simple instanton criteria. Thus, one may
conclude that instantons without associated spikes do not occur. This will be described in
Section 5.2.
In detail, I have studied correlations between spikes and cooled objects in two ways.
First of all, around the location of spikes in a given filtered configuration (denoted by F ),
the cooled configuration is studied visually. Also, action maxima are studied, which are not
associated with spikes. Secondly, the inverse correlation will be investigated, i.e. maxima
in the cooled configuration, called C, are selected and the filtered configuration F is visualized around the locations of the cooled extrema. In the final subsection, the probability for
obtaining spikes by gauging cooled configurations is investigated. The number of spikes
and, to a large degree, their positions seem to be independent of the amount of cooling.
This excludes the possibility that the spikes are generated by dislocations.
5.1. Spikes → cooled configuration
The cooling sequence starts from the same Landau gauged configurations as the filtered
ones. Seven configurations have been cooled both with 30 and 100 steps, using strong
cooling. There, each step rotates each link variable to the local action minimum.
A total of 71 spikes has been found. For these spikes, the resulting correlation between
the filtered configuration and the cooled one is quite simple: at the location of the 71 spikes
one finds, first of all, always gauge singularities in C, and, secondly, one observes that
the spikes are, apart from 12 cases,11 associated with self-dual action maxima, i.e. instantons.
This is consistent with the expectation, stated in the introduction, that more GS than
instantons will be found.
The first observation means that the cooled gauge fields $A^a_\mu(x)$ show clean peaks around
the position of the spikes, including sign flips in all colours and directions, with a diameter
of 3 or 4 lattice units. Outside of the peaks, the gauge fields are noisy and not particularly
small. This is due to the fact that cooling does not minimize the gauge fields, and Landau
gauging still preserves their perturbative fluctuations. The positions of the gauge singularity
and the nature of the gauge field peaks are independent of the number of cooling steps, as
has been checked by comparing the fields for 30 and 100 cooling steps.
The second observation means that at the position of a spike, pronounced maxima of the
action and of the topological charge density show up in C. The topological charge density
has the opposite sign in the two cases cooling and filtering.12 After cooling, the heights of
the action-peaks vary strongly from peak to peak. The explanation is that for instantons,
the height of the action-maxima is a strongly decreasing function of the instanton size,
such that large instantons do not show up spectacularly under visualization techniques.
Obviously, selecting positions by spikes does not select instantons of a particular radius.
For action-maxima observed after F.f. which are not spikes, the situation at the same
position in C is not very clear-cut. At some maxima one may find an instanton-like object,
at others there is one close by, and in many cases one cannot observe any activity in a
neighborhood of reasonable size.
11 An inspection of these exceptions reveals that either the cooled maximum of $\tilde{q}(x)$ is quite low, or that the
spike sits between several maxima.
12 This is in agreement with the interpretation of spikes as a mismatch of the quadratic terms and the higher
order terms, caused by suppression of high frequency terms via filtering.
5.2. Maxima of cooled configuration → filtered configuration
The inverse correlation, i.e. between cooled action (or $|\tilde{q}(x)|$) maxima selected in C on
the one hand, and between structures in F (obtained by filtering) on the other hand, is more
complicated to investigate than the previous one. This holds because there is much freedom
in the selection of maxima. It is not the purpose of this paper to follow the elaborate filtering techniques of Refs. [7,9] for extracting the best instanton candidates among the many
maxima which show up during the cooling process. I simply start from the 7 configurations
which have been strongly cooled by 100 sweeps, then select for each configuration 32
positions associated with the 32 highest maxima of $|\tilde{q}(x)|$, and investigate the properties
of F around the positions of these maxima. The filtering is done both with parameters
$\lambda^2 = 4a^2$ and $\lambda^2 = 0.5a^2$.
The first observation is that one recovers 80% of the spikes among the first O(30)
maxima. This is not trivial. The sizes of the instantons which are associated with spikes,
vary considerably, such that a selection via the $|\tilde{q}(x)|$-peak height might lead to a failure in
identifying these maxima within the first few dozens of peaks. I conclude that the spikes are
tightly correlated with instantons which are produced by cooling and selected among the
spatially less extended ones. Obviously, the spikes have a significance beyond the gauge
dependent filtering technique.
Next, it is highly interesting to investigate those locations in the filtered configuration,
F , which on the one hand are associated with peaks in C but, on the other hand, have no
GS in the gauge fields at this location. In a fraction of about 2/3 of those maxima, one
encounters also a peak in F both in the action and in $|\tilde{q}(x)|$, with identical signs of $\tilde{q}(x)$ in
both configurations, i.e. (anti-)instantons in nonsingular gauge in C may be associated with
candidates for (anti-)instantons in F. In the other 1/3 cases, there is no significant activity
in F .
In order to understand the significance of these extrema, it is necessary to quantify the
impact of F.f. on instantons. Obviously, this is gauge dependent. For standard instantons
in the regular gauge, the effect of filtering is quite modest: the peak action of an artificial
instanton of radius 7a (this corresponds to a radius $\rho = 0.20$ fm at $\beta = 2.85$) will be
reduced by F.f. not too strongly, if the filtering is done with a strong cut-off, i.e. $\lambda^2 = 4.0a^2$.
The reduction amounts to a factor
$$S(x = 0, \lambda^2 = 4.0a^2)/S(x = 0, \lambda^2 = 0.50a^2) \approx 0.49. \tag{28}$$
Thus, F.f. will not seriously affect those instantons, which are the most numerous ones in
SU(2), according to Ref. [6]. Smaller ones will feel a stronger reduction in peak height,
such that they essentially look like broader instantons. Furthermore, F.f. will not create
quasi-instantons nor improve the self-dual properties of maxima, in obvious contrast to the
cooling technique.
Now, the crucial question is whether the candidates without spikes have the correct
properties to be unambiguously identified with instantons. This has to be doubted for the
following reasons:
- In F, the orientation of the components in colour space is not along the diagonal. This
  fact is best recognized by visual inspection, and it will not be specified in detail here.
- The electric and magnetic fields in F are not perfectly self-dual. In order to be quantitative, I consider the measure of self-dual quality, $Q^{(i)}(y)$, defined by
  $$Q^{(i)}(y) = \frac{2\,\vec{E}^{(i)}(y)\cdot\vec{B}^{(i)}(y)}{\vec{E}^{(i)2}(y) + \vec{B}^{(i)2}(y)}, \quad i = 1, 2, 3, \tag{29}$$
  where y is taken at the lattice positions around the maxima position x in C, with
  $|x - y| < 4a$. This distance amounts to half the radius of a typical SU(2)-instanton.
  A histogram of this quantity, taken between $-1 \le Q^{(i)}(y) \le 1$, is peaked beautifully
  at the limits $\pm 1$ for the cooled configuration. For the filtered configuration, however,
  $Q^{(i)}(y)$ shows a broad histogram for all i, extending down to $Q^{(i)}(y) \approx 0$. Thus, self-duality is not realized well in the case of F.f.
- The spatial sizes of the peaks in F are much smaller than in C. Since the shape of the
  topological objects seems to be rather irregular, their radii are difficult to determine
  directly. A simpler way is to observe the reduction of the peak height as a function
  of the filtering parameter, when it is increased from $\lambda^2 = 0.5a^2$ to $\lambda^2 = 4a^2$. This
  reduction amounts approximately to a factor 0.03, in sharp contrast to the reduction by
  a factor 0.49 for an artificial lattice instanton (see Eq. (28)).
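The self-dual quality of Eq. (29) is easy to evaluate once the electric and magnetic fields are available. The sketch below assumes arrays whose last two axes are the colour index i and the spatial component; this layout and the function name are illustrative.

```python
import numpy as np

def self_dual_quality(E, B):
    """Q^(i) = 2 E^(i).B^(i) / (E^(i)^2 + B^(i)^2) for each colour i, cf. Eq. (29).

    E, B have shape (..., 3, 3): site indices, colour i, spatial component.
    Sites where both fields vanish would need separate treatment.
    """
    num = 2.0 * np.sum(E * B, axis=-1)
    den = np.sum(E * E, axis=-1) + np.sum(B * B, axis=-1)
    return num / den
```

By construction Q lies between -1 and 1, reaching +1 only for E = B and -1 only for E = -B, so a histogram peaked at the limits signals (anti-)self-dual fields, while fields with no fixed relation between E and B fill the interior of the interval.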
It has to be concluded that a close correspondence between cooled configurations and
filtered ones exists mainly at the position of the spikes. In the cooled configuration, the
gauge singularity, the origin of the spike, is preserved, and an instanton-like maximum
of the action etc. has developed. Action-maxima in C, which are not associated with spikes,
do not reveal the typical properties of instantons. Of course, by a more refined search
among the many maxima in C one may eventually find better instanton candidates.
5.3. Gauge and cooling (in-)dependence of spike positions
In the previous subsection, it has been stated that about 80% of all spikes in the filtered
configurations are found at a position close to maxima in the cooled configurations. The
relevant maxima are those with the largest values of the action density. This is rather important, since the positions of these cooled maxima are gauge independent. Thus, also the
position of spikes has a gauge invariant meaning, at least in a probabilistic sense.
Furthermore, in the process of cooling, most small scale fluctuations are eliminated
which, in principle, could induce the gauge singularities. The presence of such dislocations which can contribute to the topological charge [7,24] is a drawback of using the
Wilson action both for updating configurations and for cooling.
In the following, the effect of cooling will be studied once more in a different way. In
determining the correlation between the cooled maxima and the spikes, the latter were defined by Landau gauging a configuration which had all short range fluctuations undamped.
Here, the order of cooling and gauging will be reversed, i.e. a nongauged configuration will
be cooled with up to 10 strong cooling steps. This eliminates all plaquette values smaller
than 0.9. If one then finds a Landau gauge and applies filtering to this cooled configuration, one finds approximately the same number of spikes as compared to the case without
cooling. Most of the spikes show up at the same positions in the two cases.
In detail, on a lattice of size $24^3 \times 32$ at $\beta = 2.65$, 20 configurations have been cooled
with 3 steps (case (a)), and 20 configurations with 10 steps (case (b)). After a random gauge
had been applied, first the Landau gauging and then filtering were performed. In case (a)
one finds 93 spikes, whereas for the corresponding uncooled configurations one finds 87
spikes. In case (b), the corresponding numbers are 99 and 91. When the spike positions
are compared between the cooled and the uncooled configurations, it turns out that in 15
configurations out of the 20 cooled ones (in case (b)), more than 50% of the spikes show
up at the same positions as in the uncooled case (within 1 or 2 lattice units).
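A comparison of spike positions "within 1 or 2 lattice units" could be automated along the following lines. The sketch assumes Euclidean distances with periodic wrap-around and a cut of 2 lattice units; the text does not specify the metric, so these choices and the function name are assumptions.

```python
import numpy as np

def matched_fraction(spikes_a, spikes_b, lattice_shape, max_dist=2.0):
    """Fraction of spikes in spikes_a with a partner in spikes_b within max_dist.

    Distances are Euclidean, in lattice units, with periodic wrap-around
    (both choices are assumptions of this sketch).
    """
    a = np.asarray(spikes_a, dtype=float)
    b = np.asarray(spikes_b, dtype=float)
    if len(a) == 0 or len(b) == 0:
        return 0.0
    extent = np.asarray(lattice_shape, dtype=float)
    diff = np.abs(a[:, None, :] - b[None, :, :])
    diff = np.minimum(diff, extent - diff)      # shortest periodic separation
    dist = np.sqrt(np.sum(diff ** 2, axis=-1))
    return float(np.mean(dist.min(axis=1) <= max_dist))
```

Applied to the spike lists of a cooled and of the corresponding uncooled configuration, a value above 0.5 corresponds to the criterion "more than 50% of the spikes show up at the same positions" used above.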
Because of the strong suppression of plaquettes with trace values < 0.9, this correlation
of spike positions makes it rather unlikely that the spikes are induced by dislocations, in so
far as these are characterized by large negative plaquette values.
6. The gauge field propagator and spikes

It is evident that close to a spike, the gauge fields show a rapid flip of sign, for all colours,
for all spacetime orientations and along all directions. Thus, it is natural to expect that the
gauge field correlators behave differently for configurations with spikes as compared to
configurations without spikes. A convenient tool to study this effect is a measurement of
the gluon propagator, evaluated separately for the two specimens of configurations. It is
to be expected that the propagator in x-space falls off more rapidly for increasing spatial
separation, when spikes are present than if none are around. In momentum space, this effect
is reproduced if the zero momentum propagator is reduced and the small, but nonzero,
momentum propagator is enhanced when spikes are present, as compared to the spike-free
configurations.
This can be tested on lattices which are so small that, simply by geometrical considerations, the probability to find a spike is considerably smaller than one. For our standard
value of $\beta = 2.85$, a lattice of size $24^3 \times 32$ has a density of 0.4 spikes per configuration
(see Eq. (26)). Since often there is more than one spike per configuration, the probability
to find at least one spike is around 0.2, i.e. considerably smaller than 0.4. Thus, for a total
of 600 configurations which have been measured,13 one finds 480 configurations without
spikes and 120 configurations with one or more spikes. The gauge field propagator has
been measured separately for the two classes of configurations.
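The counts quoted in this paragraph can be put side by side with what independent spikes would give. The comparison with a Poisson distribution is an illustration added here, not part of the original analysis, and it uses the rounded mean of 0.4 spikes per configuration.

```python
import math

n_configs = 600
n_with_spikes = 120
mean_spikes = 0.4   # spikes per configuration, rounded value quoted in the text

# Observed fraction of configurations with at least one spike.
p_observed = n_with_spikes / n_configs
# Same probability if spikes occurred independently (Poisson with the same mean).
p_poisson = 1.0 - math.exp(-mean_spikes)
# Mean number of spikes in those configurations that contain any.
spikes_per_occupied = mean_spikes / p_observed
```

The observed fraction is 0.2, whereas independent spikes would give about 0.33. Configurations with spikes therefore carry two spikes on average, in line with the statement that spikes occur predominantly in close-by pairs on small lattices.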
The difference between the two values of the propagator (spikes present or not)
depends on the momentum $q^2$. For $q^2 = 0$, the propagator with spikes is slightly smaller
than without spikes:
$$G(q^2 = 0)\big|_{\rm no\ spikes} \approx (1.19 \pm 0.05)\, G(q^2 = 0)\big|_{\rm spikes}. \tag{30}$$
This difference is just significant. For $q^2 > 0$, the error is significantly smaller, and we have
$$G(q^2 \approx 0.1 a^{-2})\big|_{\rm no\ spikes} \approx (0.78 \pm 0.03)\, G(q^2 \approx 0.1 a^{-2})\big|_{\rm spikes}. \tag{31}$$
13 Out of 300 000 sweeps, after every 500 updates Landau gauging and searching for spikes was performed.
For $q^2 > 1/a^2$, no significant difference between the propagators of the two classes
could be observed. It is straightforward to transform the different behaviour of the propagators to gauge field correlators in x-space. If one sums the gauge fields over 3 spatial
dimensions and considers the correlator of this average along the t-direction,
$$C(T) = \frac{\bigl\langle \sum_{\vec{x}} \vec{A}(t, \vec{x}) \cdot \sum_{\vec{x}} \vec{A}(t + T, \vec{x}) \bigr\rangle}{\bigl\langle \bigl(\sum_{\vec{x}} \vec{A}(t, \vec{x})\bigr)^2 \bigr\rangle}, \tag{32}$$
then the inequalities (30) and (31) imply that the correlator with spikes decreases faster
with T than the correlator without spikes. The effect is small but significant.
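A correlator of the spatially summed gauge field, normalized to its value at T = 0 as in Eq. (32), can be sketched for a single configuration as follows. The array layout, the use of the average over t in place of the ensemble average, and the periodic wrap in t are assumptions of this example.

```python
import numpy as np

def time_correlator(A):
    """C(T) for one configuration, cf. Eq. (32).

    A has shape (Lt, Lx, Ly, Lz, ncomp): time, three spatial axes, and the
    field components. The average over t stands in for the ensemble average.
    """
    summed = A.sum(axis=(1, 2, 3))                 # sum over the spatial volume
    norm = np.mean(np.sum(summed * summed, axis=-1))
    lt = A.shape[0]
    return np.array([
        np.mean(np.sum(summed * np.roll(summed, -T, axis=0), axis=-1)) / norm
        for T in range(lt)
    ])
```

By construction C(0) = 1. In an actual measurement the numerator and denominator would be averaged separately over all configurations of a class (with or without spikes) before taking the ratio.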

Configurations of SU(2) lattice theory, when transformed to a Landau gauge, reveal almost pointlike structures with a scale invariant density14 of 1.5 points per fm$^4$, where the
gauge fields show a singularity, regulated by the lattice. This singularity can be displayed
beautifully either by cooling (with O(50) cooling steps) the gauged configuration, or by
suppressing the high momentum Fourier components by some exponential cut-off. Under
certain assumptions of diluteness, the density of spikes can be identified with the topological charge susceptibility, leading to good agreement with the results of cooling and of the
fermionic method. In detail, the phenomena are:
(1) After cooling a gauged configuration, the gauge fields agree with those of a singular
gauge (see Eq. (1)), i.e. they shoot up and change sign at the special points. The gauge
invariant quantities action and topological charge density signal the appearance of instantons of various sizes around those singularities, with the gauge fields in the singular
gauge. This match occurs at about 80% of the singularities. Other (nonsingular)
instanton-like objects show up visually with a density which is higher than that of
the singular ones. This density drops quickly under prolonged cooling. The instantons
associated with singularities are stable under cooling.
(2) After removing the high momentum amplitudes by Fourier filtering, the singularities
show up as spikes in the Wilson action and in the topological charge density. This is
so because the removal of short range Fourier amplitudes destroys the cancellation
between linear and quadratic terms in the field strengths. The gauge fields show zeroes, as a function of lattice positions, in all colours and directions, but the peaks are
smoothed out relative to the cooling procedure. When more and more Fourier amplitudes are removed, the spikes get, of course, broader. However, their positions do not
vary essentially. Furthermore, the criteria for selecting spikes are independent of the
strength of the filtering cut-off.
(3) The positions of the spikes are not completely independent of the special Landau
gauge, but almost so. This means that for different Gribov copies of a given configuration, almost all spikes appear at the same position, with a mismatch of the order of
10% (see [1]).
(4) For fixed $\beta$, the density of spikes, $(n_+ + n_-)/V$, does not vary significantly for different lattice sizes. This holds even in the deconfinement region, where $|n_+ - n_-|$
vanishes rapidly for decreasing lattice size.
(5) The density of spikes is correlated with the appearance of Gribov copies. This holds in
the sense that on physically small lattices, where Landau gauging is almost unique and
where the probability for the occurrence of spikes is smaller than unity, Gribov copies
preferentially show up in the same configurations as spikes do. A simple interpretation
has been given in Section 4.
(6) The gauge field propagator is different for configurations with spikes as compared to
the case without spikes. This shows up as a faster temporal decrease of zero momentum
gauge field correlators.
14 This holds if the scale as a function of $\beta$ is taken from a measurement of the string tension.
In summary, it is evident that the presence of spikes is strongly correlated with other
nonperturbative phenomena on the lattice. In particular, the correlation with the gluon propagator implies that the presence of spikes is connected with the decrease of the propagator
in x-space. The sign flip of gauge fields, which are associated with gauge singularities, intuitively provides a mechanism for a fast decorrelation of the fields. According to Ref. [25],
such a decorrelation can be responsible for confinement. A study of the correlation of the
string tension with the spike density on larger lattices seems to be a topic for future investigations.
Acknowledgements
The author is indebted to H. Joos, I. Montvay, G. Schierholz, R.L. Stuller and H. Wittig
for useful discussions and for encouragement. The preparation of the SU(2)-configurations
on the large lattice has been accomplished on 108 nodes of the T3E parallel computer at
the NIC at the Forschungszentrum Jülich. The author is grateful for granted computer time
and support, especially to H. Attig. The development of the visualization tools has been
carried out on a dedicated SNI Celsius workstation. The author is indebted to the DESY
Directorium for providing access to this facility.
References
[1] F. Gutbrod, Eur. Phys. J. C 8 (2001) 1, hep-lat/0011046.
[2] A.L. Ames, D.R. Nadeau, J.L. Moreland, The VRML Source Book, second ed., Wiley, New York, 1997.
[3] H. Grahn, Download of GLVIEW 4.3 from http://www.snafu.de/~hg.
[4] S. Gribov, Nucl. Phys. B 139 (1978) 1.
[5] L. Giusti, M.L. Paciello, C. Parinello, C. Petrarca, B. Taglienti, Int. J. Mod. Phys. A 16 (2001) 3487.
[6] Th. DeGrand, A. Hasenfratz, T. Kovács, Nucl. Phys. B 520 (1998) 301.
[7] Ph. de Forcrand, M. García Pérez, I.O. Stamatescu, Nucl. Phys. B 499 (1997) 409.
[8] A. Hasenfratz, C. Nieter, Phys. Lett. B 439 (1998) 366.
[9] UKQCD Collaboration, D.A. Smith, M. Teper, Phys. Rev. D 58 (1998) 104505, and references quoted therein.
[10] I.M. Singer, Commun. Math. Phys. 60 (1978) 7.
[11] M. Pepe, Ph. de Forcrand, Nucl. Phys. B 598 (2001) 557.
[12] Ph. de Forcrand, M. Pepe, Nucl. Phys. B (Proc. Suppl.) 94 (2001) 498.
[13] R. Jackiw, C. Nohl, C. Rebbi, Phys. Rev. D 15 (1977) 1642.
[14] I. Horváth, S.J. Dong, T. Draper, F.X. Lee, K.F. Liu, H.B. Thaker, J.B. Zhang, Phys. Rev. D 67 (2003) 011501, hep-lat/0203027;
See also I. Horváth, et al., hep-lat/0501025.
[15] S.P. Booth, A. Hulsebos, A.C. Irving, A. McKerrel, C. Michael, P.S. Spencer, P.W. Stephenson, Nucl. Phys. B 394 (1993) 509.
[16] E. Seiler, I.O. Stamatescu, MPI-PAE/PTh 10/87;
E. Seiler, Phys. Lett. B 525 (2002) 355.
[17] J. Fingberg, U. Heller, F. Karsch, Nucl. Phys. B 392 (1993) 493.
[18] D. Daniel, A. González-Arroyo, C.P. Korthals Altes, Phys. Lett. B 251 (1990) 559, and references quoted therein.
[19] R.G. Edwards, U.M. Heller, R. Narayanan, Phys. Rev. D 60 (1999) 034502.
[20] B. Allés, M. D'Elia, A. Di Giacomo, Phys. Lett. B 412 (1997) 119.
[21] C. Michael, G.A. Tickle, M.J. Teper, Phys. Lett. B 207 (1988) 313.
[22] P. van Baal, A.S. Kronfeld, Nucl. Phys. B (Proc. Suppl.) 9 (1989) 227.
[23] J.W. Negele, F. Lenz, M. Thies, Nucl. Phys. B (Proc. Suppl.) 140 (2005) 629.
[24] M. Göckeler, A.S. Kronfeld, M.L. Laursen, G. Schierholz, U.-J. Wiese, Phys. Lett. B 233 (1989) 192.
[25] H.G. Dosch, Yu.A. Simonov, Phys. Lett. B 205 (1988) 339.
Effects of SO(10)-inspired scalar non-universality on the MSSM parameter space at large $\tan\beta$
M.R. Ramage
Rudolph Peierls Centre for Theoretical Physics, Department of Physics, University of Oxford,
1 Keble Road, Oxford OX1 3NP, UK
Astroparticle and High Energy Physics Group, Instituto de Física Corpuscular,
Edificio Institutos de Investigación, Universidad de Valencia, Valencia E-46071, Spain
Received 1 April 2005; accepted 29 April 2005
Abstract
We analyze the parameter space of the ($\mu > 0$, $A_0 = 0$) CMSSM at large $\tan\beta$ with a small degree of non-universality originating from D-terms and Higgs-sfermion splitting inspired by SO(10)
GUT models. The effects of such non-universalities on the sparticle spectrum and observables such
as $(g-2)_\mu$, $B(b \to X_s\gamma)$, the SUSY threshold corrections to the bottom mass and $\Omega_{\rm CDM}h^2$ are examined in detail and the consequences for the allowed parameter space of the model are investigated.
We find that even small deviations from universality can result in large qualitative differences compared
to the universal case; for certain values of the parameters, we find, even at low $m_{1/2}$ and $m_{16}$, that
radiative electroweak symmetry breaking fails as a consequence of either $|\mu|^2 < 0$ or $m^2_{A^0} < 0$. We
find particularly large departures from the mSugra case for the neutralino relic density, which is sensitive to significant changes in the position and shape of the $A^0$ resonance and a substantial increase
in the Higgsino component of the LSP. However, we find that the corrections to the bottom mass are
not sufficient to allow for Yukawa unification.
PACS: 12.10.Dm; 12.60.-i; 12.60.Jv
Keywords: SUSY phenomenology; MSSM; GUT; Non-universality; Dark matter
E-mail address: ramage@thphys.ox.ac.uk (M.R. Ramage).

doi:10.1016/j.nuclphysb.2005.04.039
M.R. Ramage / Nuclear Physics B 720 (2005) 137–181
1. Introduction
The simplest supersymmetric extension of the Standard Model, the MSSM, contains
well over a hundred free parameters after supersymmetry breaking is taken into account.
In the most general case, a large number of mixing angles and complex phases are present
and this tends to result in predictions of flavour changing neutral current (FCNC) and
CP-violating processes in excess of the current strict experimental bounds. Moreover, to
analyze a parameter space of such magnitude in any detail would require an enormous
amount of computing power and the results would not be particularly illuminating. To
avoid these problems, simplifying theoretical constraints are usually imposed on the model
in order to restrict the form of the soft supersymmetry breaking mass and coupling matrices; i.e. they are taken to be proportional to the identity matrix in flavour space. Many
analyses of recent years, for example [1–5] to cite but a few, have been based on the Constrained MSSM (CMSSM) or mSugra scenario in which a universal form for the mass
and coupling matrices is assumed. In this model, gravity is presumed to be responsible
for mediating the breaking of supersymmetry from a hidden sector of the theory, sharing
none of the Standard Model gauge interactions, to the visible sector [6–18]. With certain simplifying assumptions it is possible to construct models of supergravity that lead to
this preferred form for the soft breaking parameters. However, supergravity theories by no
means inevitably predict this universality and, in any case, universality is readily violated
below the supergravity scale by corrections deriving from, for example, renormalization
group running between the supergravity/Planck scale and the GUT scale [19,20] or from
the breaking of GUT [20–29] and/or family symmetries [30–38].
In a previous paper [38] we explored the low energy constraints on an SO(10) SUSY-GUT model with an additional SU(3)$_F$ family symmetry [33,34,37] spontaneously broken
in such a way that a phenomenologically acceptable set of Yukawa matrices can be obtained, accounting for the observed fermion masses and mixings including the neutrino
sector. When both symmetries remain unbroken the universal form for the mass and coupling matrices for the gauginos and sfermions is ensured regardless of the supersymmetry
breaking mechanism, whether it be gravity, gauge, or anomaly mediation. However, this
sfermion universality is spoilt by D-terms arising from the breaking of the SU(3)$_F$ family
symmetry. In this instance the mass squared of the third family of sfermions is split from
the first two by around 20% with the sign of the splitting undetermined. We found that, in
the case of decreased third family sfermion masses, the boundary of correct electroweak
symmetry breaking (EWSB), where $\mu$ vanishes, occurs at a substantially lower value of
$m_0$ than for the universal case. Correspondingly, a new area of parameter space allowed by
the various constraints appears for large $\tan\beta$. A general conclusion we reached was that the
allowed parameter space is very sensitive to the universality assumption, at least for larger
values of $\tan\beta$. Here we pursue this idea and consider the case where the dominant additional contributions to the sfermion mass matrices originate from the breaking of SO(10)
and its subgroups rather than from the breaking of the family symmetry. To isolate the effects of SO(10) breaking we shall ignore the effects of the family symmetry breaking. The
impact of scalar soft mass non-universality, either deriving from SO(10) breaking or otherwise, on low energy phenomenology has been studied before [39–59], but our analysis
differs somewhat in perspective and should complement previous studies. In what follows,
we will assume that SO(10) is broken directly to the Standard Model or, in the case that
there exists a secondary stage of breaking with an intermediate gauge group, for example
the Pati–Salam group [60]:
$$SO(10) \to SU(4)_{PS} \otimes SU(2)_L \otimes SU(2)_R \to SU(3)_C \otimes SU(2)_L \otimes U(1)_Y,$$
or SU(5) [61]:
$$SO(10) \to SU(5) \otimes U(1)_X \to SU(3)_C \otimes SU(2)_L \otimes U(1)_Y,$$
for simplicity we will assume that the secondary breaking scale is sufficiently close to the
scale of SO(10) breaking that we can neglect the effects of RG running between those two
scales.
Additional D-term contributions to the soft mass matrices arise from the reduction of
rank of the gauge group, associated with a broken U(1) generator proportional to $2I_R + 3(B-L)/2$, where $I_R$ is the 3rd component of weak isospin pertaining to the SU(2)$_R$
subgroup of SO(10) and $(B-L)$ is the difference between baryon and lepton number [22,23,26,27,62]. The sfermion and Higgs soft masses squared each receive a contribution
proportional to a quantity $D^2$, which can be positive or negative, that parametrizes the D-term contribution from the breaking of the GUT group. Besides the D-terms, there is no
reason deriving from standard GUT scenarios why the EWSB Higgs soft masses should
be related to the sfermion soft masses since they belong to different representations of
the gauge group. Therefore we assume that the Higgs soft masses are independent of the
sfermion masses in general, with $m^2_{H_1}$ and $m^2_{H_2}$ $^{1}$ taking the common value $m^2_{10}$ at the GUT
scale (since they are both contained in the 10 representation of SO(10)). Rather than allow
$m_{10}$ and $D^2$ to vary arbitrarily, we take the ratio of GUT scale Higgs masses to sfermion
masses, $m_{10}/m_{16}$, and similarly $D \equiv \mathrm{sign}(D^2)\sqrt{|D^2|}/m_{16}$ as the independent variables.
Therefore the soft SUSY breaking masses and couplings are defined by the following free
parameters:
$$m_{16}, \quad m_{10}/m_{16}, \quad m_{1/2}, \quad D, \quad A_0, \quad \tan\beta, \quad \mathrm{sign}(\mu).$$
Throughout this paper we will always take sign($\mu$) > 0 in accordance with the present
experimental deviation of $(g-2)_\mu$ from its Standard Model value and the value favoured
by the branching ratio $B(b \to X_s\gamma)$, $\tan\beta = 50$ because for much lower $\tan\beta$ no new
allowed regions of parameter space were found that had not already been discovered in the
CMSSM case, and $A_0 = 0$ since it does not have a particularly large effect on the allowed
regions (except when it takes on large values) and also to keep the analysis simple so we
can focus on the additional parameters introduced by the SO(10) breaking effects.$^{2}$
The structure of the paper is as follows: in Section 2 we review the various sources of
non-universality present in a typical SO(10) SUSY-GUT model, focusing on the origin of
the D-term contributions to the soft SUSY breaking masses. In Section 3 we comment in
detail on the effects on the sparticle spectrum of introducing the D-terms and splitting the
Higgs from the sfermion soft masses. In particular this can have significant implications
for the mass of the pseudo-scalar Higgs boson $A^0$ and for electroweak symmetry breaking, which in turn has consequences for the gaugino masses and mixing angles, all this in
addition to the mass splittings and shifts directly or indirectly induced in the sfermion spectrum. These changes will feed through to affect the results of calculations of the various
observables we use to constrain the theory. We discuss the nature of these effects on $\tan\beta$-enhanced amplitudes, e.g. $(g-2)_\mu$, in Section 4. We go on in Section 5 to give details of
the numerical calculation of the sparticle spectrum, mixing angles and corrections to the
couplings, and to describe the constraints on the parameter space such as the branching
ratio of the inclusive b-decay $B(b \to X_s\gamma)$, the WMAP bound on the cold dark matter
density $\Omega_{\rm CDM}h^2$, and the discrepancy between the theoretical prediction of the anomalous
magnetic moment of the muon and its measured value. In Section 6 we present the results
of the constrained fit and discuss the main features. Finally, in Section 7 we summarize
and conclude.
1 In our notation, as usual, $m^2_{H_1}$ and $m^2_{H_2}$ are the soft mass parameters for the Higgs fields that give mass to
the down-type quarks and up-type quarks, respectively.
2 Additional motivation for $A_0 = 0$ comes from the family symmetry. If the matter superfields $Q$, $u^c_R$, etc. are
triplets of SU(3)$_F$, as is the case in [37], then, at least in the limit of unbroken family symmetry, $A_0 = 0$.
2. SO(10) breaking and scalar masses

As we touched on briefly in the introduction, even assuming universality at the supergravity/Planck scale, $M_P \sim O(10^{19}\,\mathrm{GeV})$, non-universality at the GUT scale, $M_G \sim O(10^{16}\,\mathrm{GeV})$, can arise from renormalization group running between the GUT scale
and the Planck scale, from GUT threshold effects, or from D-terms arising from the breaking of SO(10) or the SU(3)$_F$ family symmetry. In this analysis we can ignore the possibility
of RGE effects since the full SO(10) $\otimes$ SU(3)$_F$ group remains unbroken down to the GUT
scale and hence sfermion universality is maintained. We also neglect GUT threshold effects originating from the discrepancy between $M_G$ and the masses of the heavy particles,
$M_H$, that obtain their masses when the GUT group is broken. There are also potentially
important corrections due to the splitting between the masses of the fermions, $M_F$, and
bosons, $M_B$, in the heavy superfields (see [20] for more details). This leads to corrections
proportional to the logarithms $\ln(M_H^2/M_G^2)$ and $\ln(M_B^2/M_F^2)$, and finite terms. These corrections depend on the details of the GUT scale particle spectrum. Since we do not pin ourselves
down to a particular SO(10) model, we will ignore them for simplicity, although they may be significant. Therefore, we choose only to include the D-term contributions to the
sfermion masses that are expected to appear when this symmetry group is broken.
Here we will outline the origin of the D-terms. For a detailed derivation, see [29,63].
Consider a U(1) subgroup of SO(10), U(1)$_X$, which is broken when the rank of the gauge
group is reduced from 5 to 4. We assume that there exist two heavy scalars $S_{(\pm)}$ with
opposite U(1)$_X$ charges, for example $\pm 1$, where in this case the charge $X$ is given by
$2I_R + 3(B-L)/2$. The D-term contribution to the scalar potential corresponding to the
U(1)$_X$ generator is given by
$$\frac{1}{2} g_{10}^2 D^2 = \frac{1}{2} g_{10}^2 \Big( |S_{(+)}|^2 - |S_{(-)}|^2 + \sum_i x_i |\phi_i|^2 \Big)^2,$$
where $g_{10}$ is the unified coupling and $x_i$ are the U(1)$_X$ charges of the MSSM fields $\phi_i$.
When the fields $S_{(\pm)}$ obtain VEVs, this leads to corrections to the mass terms for the
MSSM scalar fields:
$$\Delta m_i^2 |\phi_i|^2 = x_i g_{10}^2 \big( \langle S_{(+)} \rangle^2 - \langle S_{(-)} \rangle^2 \big) |\phi_i|^2.$$
However, the magnitudes of the D-terms are not of the order of the large $S_{(\pm)}$ VEVs as
would be naively expected. A more detailed analysis shows that to leading order
$$D^2 \equiv \langle S_{(+)} \rangle^2 - \langle S_{(-)} \rangle^2 \approx \frac{1}{2 g_{10}^2} \big( m^2_{S_{(-)}} - m^2_{S_{(+)}} \big),$$
where $m^2_{S_{(\pm)}}$ are the soft SUSY breaking masses squared for the $S$ scalars. Therefore the
new D-term contributions are of order the SUSY breaking scale. Note that in the case of
universal soft SUSY breaking masses these D-term contributions vanish. Since the $S_{(\pm)}$
scalars are expected to belong to different representations of SO(10) (for example, potential $16_H$ and $\overline{16}_H$ heavy Higgs representations) and indeed any family symmetry that may
also exist at that scale, they are not expected a priori to have equal masses at the GUT scale.
Even if they are equal at $M_P$ this equality can easily be wiped out by differences in the RG
running down to $M_G$. Drawing an analogy with $m^2_{H_1}$ and $m^2_{H_2}$, a significant difference in
Yukawa couplings could drive $m^2_{S_{(\pm)}}$ apart. Since we are otherwise ignorant of the size of
$D^2$ we will take it to be a free parameter in our subsequent analysis, although we will take
it to be relatively small. For SO(10), including the D-term contributions to the sfermion
and Higgs masses, we obtain the well-known GUT scale inputs for the sfermion and Higgs
soft SUSY breaking masses [22,23,26,27,62]:
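$$\begin{aligned}
m^2_{\tilde Q} &= m_{16}^2 + g_{10}^2 D^2, \\
m^2_{\tilde u_R} &= m_{16}^2 + g_{10}^2 D^2, \\
m^2_{\tilde e_R} &= m_{16}^2 + g_{10}^2 D^2, \\
m^2_{\tilde L} &= m_{16}^2 - 3 g_{10}^2 D^2, \\
m^2_{\tilde d_R} &= m_{16}^2 - 3 g_{10}^2 D^2, \\
m^2_{H_1} &= m_{10}^2 + 2 g_{10}^2 D^2, \\
m^2_{H_2} &= m_{10}^2 - 2 g_{10}^2 D^2.
\end{aligned} \tag{1}$$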
We use these GUT scale boundary conditions in our analysis, varying $m_{10}/m_{16}$ from 0.75
to 1.25 and $D$ from $-0.4$ to 0.4. These represent reasonably small perturbations about universality, but as we shall see, the effects on the allowed parameter space can be significant.
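The boundary conditions of Eq. (1) are simple enough to tabulate directly. The sketch below is an illustration only: the function name is hypothetical, and the value `g10 = 0.7` for the unified coupling is an assumed round number, not one quoted in the text. The dimensionless input `d_hat` stands for the parameter $D = \mathrm{sign}(D^2)\sqrt{|D^2|}/m_{16}$ defined in the introduction.

```python
import math

# Coefficients of g10^2 D^2 in Eq. (1), i.e. the U(1)_X charges x_i.
X_CHARGES = {"Q": 1, "uR": 1, "eR": 1, "L": -3, "dR": -3, "H1": 2, "H2": -2}


def gut_scale_soft_masses_sq(m16, m10_over_m16, d_hat, g10=0.7):
    """Return the GUT-scale soft masses squared of Eq. (1) in GeV^2.

    d_hat is the dimensionless D parameter; g10 = 0.7 is an assumed,
    purely illustrative value of the unified gauge coupling.
    """
    # Recover the signed quantity D^2 (in GeV^2) from the dimensionless input.
    d_sq = math.copysign(d_hat ** 2, d_hat) * m16 ** 2
    m10 = m10_over_m16 * m16
    masses = {}
    for name, charge in X_CHARGES.items():
        base = m10 ** 2 if name.startswith("H") else m16 ** 2
        masses[name] = base + charge * g10 ** 2 * d_sq
    return masses


example = gut_scale_soft_masses_sq(m16=500.0, m10_over_m16=1.0, d_hat=0.4)
splitting = example["H1"] - example["H2"]  # equals 4 g10^2 D^2 by Eq. (1)
universal = gut_scale_soft_masses_sq(m16=500.0, m10_over_m16=1.0, d_hat=0.0)
negative = gut_scale_soft_masses_sq(m16=500.0, m10_over_m16=1.0, d_hat=-0.4)
```

With `d_hat = 0` and `m10_over_m16 = 1` every entry reduces to $m_{16}^2$, the universal case. A negative `d_hat` raises the left-handed slepton and right-handed down-squark masses and lowers the others, as the signs in Eq. (1) require.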
In SO(10), a right-handed neutrino superfield completes the 16 representation. The
presence of any neutrino Yukawa couplings and associated soft SUSY breaking parameters would, in principle, feed into the renormalization group equations for the remaining
parameters above the scale of the right-handed neutrinos. However, we choose to ignore
such effects in light of our ignorance of the details of the neutrino sector and our desire to
keep the analysis simple.
3. The sparticle spectrum

The impact of the D-terms and the Higgs-sfermion mass splitting on the low scale
phenomenology can be understood in terms of how the soft mass differences at the GUT
scale, either directly or through their influence on the RGEs, affect EWSB and the sparticle
and Higgs masses. Here we will discuss aspects of the sparticle spectrum and mixings and
EWSB. This section and the next are meant to provide the reader with an intuitive guide to
how the D-terms and Higgs-sfermion splitting feed through the different soft parameters to
affect the masses, mixings and observables. We will explain these effects through various
approximations valid in certain circumstances before proceeding to discuss the more exact
numerical results. Readers only interested in the results may wish to skip to Section 5. We
start by considering how the changes to the universal boundary conditions feed through the
RGEs to create differences at the EWSB scale.
3.1. Renormalization group evolution
To begin with, we will implicitly assume that $m_{10}/m_{16} = 1$ to isolate the results of
varying $D^2$. It turns out that the effect of the D-terms on the phenomenology is almost
entirely due to their tree-level contribution to the boundary conditions at $M_G$ since they
almost completely cancel out in the RGEs [29]. We will not write down all of the soft mass
RGEs here, as they are well known and can be found in [64] to 2-loop order. However, we
will reproduce the RGE for $m^2_{H_2}$ in the third family approximation as an example of this
cancellation and because it will be useful to refer to it later.$^{3}$ The $m^2_{H_2}$ RGE is given by
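$$16\pi^2 \frac{dm^2_{H_2}}{dt} = 6|y_t|^2 \big( m^2_{\tilde t_L} + m^2_{\tilde t_R} + m^2_{H_2} \big) + 6|a_t|^2 - 6 g_2^2 |M_2|^2 - \frac{6}{5} g_1^2 |M_1|^2 + \frac{3}{5} g_1^2 S, \tag{2}$$
where $t = \ln(Q/M_G)$ and
$$S = \mathrm{Tr}\big[ Y m^2 \big] = m^2_{H_2} - m^2_{H_1} + \sum_{\text{generations}} \big( m^2_{\tilde Q} - 2 m^2_{\tilde u_R} + m^2_{\tilde d_R} - m^2_{\tilde L} + m^2_{\tilde e_R} \big). \tag{3}$$
Here, $Y$ is the weak hypercharge generator. So at $M_G$,
$$S = -4 g_{10}^2 D^2,$$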
but the D-terms cancel out of the soft masses in the term proportional to $|y_t|^2$. This cancellation also occurs for the analogous terms in the RGEs for the other scalar soft masses
at the one-loop level since, aside from the $S$ term, the sfermion soft masses only enter into
the RGEs in the following combinations (see for example [63]):
$$\begin{aligned}
X_t &= 2|y_t|^2 \big( m^2_{\tilde t_L} + m^2_{\tilde t_R} + m^2_{H_2} \big) + 2|a_t|^2, \\
X_b &= 2|y_b|^2 \big( m^2_{\tilde b_L} + m^2_{\tilde b_R} + m^2_{H_1} \big) + 2|a_b|^2, \\
X_\tau &= 2|y_\tau|^2 \big( m^2_{\tilde \tau_L} + m^2_{\tilde \tau_R} + m^2_{H_1} \big) + 2|a_\tau|^2,
\end{aligned} \tag{4}$$
and the only remaining trace of the D-terms in the RGEs lies within the terms proportional
to $S$. Therefore, in the running from $M_G$ to the EWSB scale, the D-terms only enter the
RGEs multiplied by the (GUT normalized) weak hypercharge coupling $g_1^2$ and since we
are not considering large deviations from universality, we find this to be a very small effect.
In the rest of this analysis we neglect such RG effects due to D-terms although of course
we retain them in our numerical calculations. For more details, see [24,29]. Our main
point here is that the splitting induced at the GUT scale by the D-terms does not change
appreciably in magnitude as we evolve the RGEs down to the electroweak scale.
3 The third family 1-loop approximation is sufficient to obtain a qualitative understanding of the results which
is our intention here. The numerical work, on the other hand, was carried out with the full two-loop RGEs.
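The cancellation of the D-terms in the RGEs can be checked by hand from the boundary conditions of Eq. (1). The short worked check below uses only the coefficients of $g_{10}^2 D^2$ listed there, evaluated at $M_G$; the arrows denote "the D-term part of this combination is".

```latex
\begin{align*}
X_t:\quad & m^2_{\tilde t_L} + m^2_{\tilde t_R} + m^2_{H_2}
  \;\to\; (1 + 1 - 2)\, g_{10}^2 D^2 = 0, \\
X_b:\quad & m^2_{\tilde b_L} + m^2_{\tilde b_R} + m^2_{H_1}
  \;\to\; (1 - 3 + 2)\, g_{10}^2 D^2 = 0, \\
X_\tau:\quad & m^2_{\tilde \tau_L} + m^2_{\tilde \tau_R} + m^2_{H_1}
  \;\to\; (-3 + 1 + 2)\, g_{10}^2 D^2 = 0, \\
S:\quad & (-2 - 2)\, g_{10}^2 D^2
  + 3\,(1 - 2 - 3 + 3 + 1)\, g_{10}^2 D^2 = -4\, g_{10}^2 D^2 .
\end{align*}
```

In the last line the first bracket comes from $m^2_{H_2} - m^2_{H_1}$ and the second from the sum over three generations, which vanishes, so only the Higgs splitting survives in $S$.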
Switching the D-terms off, we now allow $m_{10}/m_{16}$ to vary. This time we need not worry
about $S$ since $m_{10}$ and $m_{16}$ cancel out. However the splitting directly affects the $X_i$ terms
($i = t, b, \tau$). If we take $m_{10}/m_{16} > 1$ the $X_i$ are initially larger. This persists throughout
the RG evolution and leads to an overall suppression of the 3rd family sfermion masses at
low scales compared to the mSugra case since the effect of the $X_i$ is to reduce the masses
as we evolve from high to low scales. In the case of the Higgs soft masses, on the other
hand, the tendency to be driven to lower values by the larger $X_i$ is countered by the greater
effect of the tree-level increase and so if we increase $m_{10}/m_{16}$ at $M_G$ we increase $m^2_{H_1}$ and
$m^2_{H_2}$ at the EWSB scale.
3.2. The heavy Higgs masses
The mass of the pseudo-scalar Higgs, $A^0$, deriving from the radiatively corrected Higgs
potential in the tadpole formalism [65–67], is given by
$$m^2_{A^0} = \frac{1}{\cos 2\beta} \big( \bar m^2_{H_2} - \bar m^2_{H_1} \big) - M_Z^2 - \mathrm{Re}\, \Pi^T_{ZZ}(M_Z^2) - \mathrm{Re}\, \Pi_{AA}(m^2_{A^0}) + \sin^2\beta\, \frac{t_1}{v_1} + \cos^2\beta\, \frac{t_2}{v_2}. \tag{5}$$
Here, $\bar m^2_{H_i} = m^2_{H_i} - t_i/v_i$ where $t_i/v_i$ are the corresponding tadpole contributions from
loop diagrams, $\Pi^T_{ZZ}$ is the transverse part of the $Z$ self-energy and $\Pi_{AA}$ is the $A^0$ self-energy. At large $\tan\beta$, $\cos 2\beta \approx -1$ and one can see that $m^2_{A^0}$ depends overwhelmingly on
the difference in the soft Higgs mass parameters, assuming that $|m^2_{H_i}| \gg M_Z^2$:
$$m^2_{A^0} \approx m^2_{H_1} - m^2_{H_2}.$$
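The quality of this large-$\tan\beta$ approximation can be gauged with a small sketch that keeps only the tree-level part of Eq. (5), i.e. with all tadpole and self-energy terms set to zero. The function name and the weak-scale soft masses used below are hypothetical illustrative inputs, not values from the analysis.

```python
M_Z = 91.19  # Z boson mass in GeV


def m_a_squared_tree(m_h1_sq, m_h2_sq, tan_beta):
    """Tree-level part of Eq. (5): tadpoles and self-energies are dropped."""
    cos_2beta = (1.0 - tan_beta ** 2) / (1.0 + tan_beta ** 2)
    return (m_h2_sq - m_h1_sq) / cos_2beta - M_Z ** 2


# Hypothetical weak-scale soft masses squared (GeV^2), for illustration only.
m_h1_sq, m_h2_sq = 1.0e5, -3.0e5
exact = m_a_squared_tree(m_h1_sq, m_h2_sq, tan_beta=50.0)
approx = m_h1_sq - m_h2_sq
relative_error = abs(exact - approx) / approx
```

For these inputs the two expressions differ at the level of a few percent, with the difference coming almost entirely from the $M_Z^2$ term rather than from the deviation of $\cos 2\beta$ from $-1$.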
Of importance is the RGE for $m^2_{H_1} - m^2_{H_2}$, which is given by
$$16\pi^2 \frac{d(m^2_{H_1} - m^2_{H_2})}{dt} = 3X_b + X_\tau - 3X_t. \tag{6}$$
In general, $3X_t > 3X_b + X_\tau$ due to the large top Yukawa coupling, causing $m^2_{H_2}$ to run to
low values faster than $m^2_{H_1}$ and increasing the difference between the Higgs soft masses.
Increasing $m_{10}/m_{16}$ will increase $X_i$ by an amount $2(m_{10}^2 - m_{16}^2)|y_i|^2$ at $M_G$. Since $3|y_t|^2 > 3|y_b|^2 + |y_\tau|^2$, one would think at first sight that the additional contribution would tend to
increase the mass difference as we run the parameters down to the EWSB scale compared
to the CMSSM. However we have neglected an additional effect. In our parameter space the
above inequality regarding the Yukawa couplings always holds because $\mu > 0$ suppresses
$y_b$ through the SUSY threshold corrections involving $\tilde b$–$\tilde g$ and $\tilde t$–$\tilde\chi^\pm$ loops, which contribute
with opposite signs with the gluino loop dominating. The main contributions come from
terms enhanced by $\tan\beta$ which arise from helicity-flipping mass insertions in the sparticle
propagators. In this case both the gluino and chargino loops are proportional to $\mu$. The
decrease in $\mu$ caused by an increase in $m_{10}/m_{16}$ means that the net size of the correction
to $y_b$ is reduced, leaving us with a larger $y_b$. It turns out that this more than counters the
increase in $X_t$ and for this reason we can expect the soft Higgs mass squared difference
and thus $m_{A^0}$ to be lighter in the case of increased $m_{10}/m_{16}$. As an aside we should note
here that this is a small effect and, unfortunately from the point of view of SO(10), we do
not find Yukawa unification to better than about 20% in any part of the parameter space
probed in this paper because $y_b$ is always too small for any reasonable range of $m_{10}/m_{16}$
that yields allowed points in the parameter space.
To illustrate the above result, Fig. 1 shows the running of the Higgs soft masses from
$M_G$ to $M_Z$ for the point in the parameter space $m_{1/2} = m_{16} = 500$ GeV, for different
values of $m_{10}/m_{16}$. We shall use this typical point in the $(m_{1/2}, m_{16})$-plane to compare
the results of varying $m_{10}/m_{16}$ and $D$. It satisfies most of, if not all, the experimental
constraints (depending on the exact values of $D$ and $m_{10}/m_{16}$) and avoids the particularly
sensitive regions close to where EWSB fails. Also, all the approximations used in the
analysis in this section are more or less valid in this region. Note that there is nothing
special about setting $m_{16} = m_{1/2}$ since all the soft masses are renormalized so differently.
The precise values of the sparticle and Higgs masses can be gleaned from Table 1, points A,
B and C, and the values of the soft mass parameters are approximately given by the square
of the light family sfermion masses. The sparticle and Higgs masses are also displayed
diagrammatically in Fig. 1. These numerical results were produced using the method of
Section 5.
The D-terms have a much larger effect on the heavy Higgs masses than $m_{10}/m_{16}$ since
they split the Higgs soft masses at tree level rather than through subtleties involving one-loop corrections. At $M_G$, the splitting is
$$m^2_{H_1} - m^2_{H_2} = 4 g_{10}^2 D^2.$$
This has important implications for the breaking of electroweak symmetry; for large negative values of $D^2$, $m^2_{A^0}$ can be forced negative, indicating that the electroweak symmetry
has not been broken correctly; in other words, a solution to the potential minimization
conditions consistent with the measured value of $M_Z$ cannot be obtained without also
having $m^2_{A^0} < 0$. Of course, negative $D^2$ does not automatically mean that this will occur since $m^2_{H_2}$ is renormalized differently from $m^2_{H_1}$ in a way that increases the difference
$m^2_{H_1} - m^2_{H_2}$. However, it does mean that $m_{A^0}$ can be significantly smaller in this case than in
the CMSSM. This is interesting because the region of parameter space where $2 m_{\tilde\chi^0_1} \approx m_{A^0}$,
permitting resonant annihilation of LSP neutralinos via an s-channel $A^0$, can occur in different parts of the parameter space as we shall see. Fig. 2 shows the running of the soft
Higgs masses and part of the spectrum when the D-terms are non-zero. Again, the full
spectrum can be found in Table 1 points D and E.

Fig. 1. This plot shows the running of $m^2_{H_1}$ and $m^2_{H_2}$ and a selection of sparticle/Higgs masses for $m_{1/2} = m_{16} = 500$ GeV, $A_0 = 0$, sign($\mu$) > 0, $\tan\beta = 50$ and $D = 0$ for (a) $m_{10}/m_{16} = 1$ (the CMSSM case),
(b) $m_{10}/m_{16} = 0.75$ and (c) $m_{10}/m_{16} = 1.25$. They correspond to points A, B and C respectively in Table 1.

Table 1
Here we show the mass spectra (in GeV) for the standard points, $m_{1/2} = m_{16} = 500$ GeV. The first column, A,
is the CMSSM case, B–E show the effects of changing one of $D^2$ or $m_{10}/m_{16}$, and F and G are the combination
of $D^2$ and $m_{10}/m_{16}$ that contrast the most

Standard points                      A        B        C        D        E        F        G
Input parameters
$m_{1/2}$                            500      500      500      500      500      500      500
$m_{16}$                             500      500      500      500      500      500      500
$D$                                  0        0        0        -0.4     0.4      -0.4     0.4
$m_{10}/m_{16}$                      1.0      0.75     1.25     1.0      1.0      1.25     0.75
$\mu(M_Z)$                           592.9    647.7    514.4    561.5    622.9    478.0    675.3
Sparticle masses
$m_{h^0}$                            116.1    116.1    116.1    116.1    116.1    116.1    116.1
$m_{H^0}$, $m_{A^0}$                 500.0    502.9    488.9    403.5    578.8    388.4    581.3
$m_{H^\pm}$                          506.1    509.9    496.3    412.4    585.0    397.8    587.3
$m_{\tilde u_L}$                     1153.4   1153.2   1153.7   1144.2   1162.5   1144.6   1162.3
$m_{\tilde u_R}$                     1121.1   1120.9   1121.5   1113.2   1128.9   1113.6   1128.7
$m_{\tilde d_L}$                     1156.0   1155.8   1156.4   1146.9   1165.1   1147.2   1164.9
$m_{\tilde d_R}$                     1111.7   1111.4   1112.0   1138.1   1084.5   1138.5   1084.3
$m_{\tilde e_L}$                     599.5    599.5    599.4    649.8    544.5    649.9    544.5
$m_{\tilde e_R}$                     533.9    533.9    534.0    510.5    556.5    510.5    556.4
$m_{\tilde\nu_e}$                    594.5    594.6    594.5    645.3    539.0    645.3    539.1
$m_{\tilde t_1}$                     855.9    870.3    837.0    844.0    867.7    824.8    881.8
$m_{\tilde t_2}$                     1025.5   1036.7   1010.2   1013.7   1037.0   997.9    1048.0
$m_{\tilde b_1}$                     950.1    959.0    937.9    951.8    938.0    938.6    946.8
$m_{\tilde b_2}$                     1010.7   1023.2   992.9    1022.2   1008.5   1004.4   1020.7
$m_{\tilde\tau_1}$                   350.6    369.5    324.3    328.0    357.8    297.3    373.0
$m_{\tilde\tau_2}$                   555.1    567.0    539.8    603.5    511.8    591.2    527.5
$m_{\tilde\nu_\tau}$                 531.9    540.3    521.0    588.7    468.0    579.1    477.7
$m_{\tilde\chi^\pm_1}$               393.0    395.7    386.1    391.0    394.3    380.7    396.4
$m_{\tilde\chi^\pm_2}$               632.7    685.9    559.1    602.7    661.8    527.3    713.1
$m_{\tilde\chi^0_1}$                 207.8    207.9    206.9    207.5    207.5    206.7    207.8
$m_{\tilde\chi^0_2}$                 393.1    395.7    386.3    391.1    394.4    381.0    396.4
$m_{\tilde\chi^0_3}$                 619.7    675.6    539.4    587.5    650.4    502.1    703.8
$m_{\tilde\chi^0_4}$                 632.3    685.4    559.0    602.3    661.3    527.3    712.5
$m_{\tilde g}$                       1167.4   1167.8   1166.9   1167.5   1167.2   1167.0   1167.6
$a_\mu^{\rm SUSY} \times 10^{9}$     2.18     2.08     2.34     2.04     2.34     2.20     2.24
$B(b \to X_s\gamma) \times 10^{4}$   2.73     2.83     2.57     2.74     2.75     2.58     2.84
$\Omega_{\rm CDM} h^2$               0.151    0.208    0.078    0.074    0.632    0.013    0.774

Fig. 2. Same as Fig. 1(a), but in (a) $D = -0.4$ and in (b) $D = 0.4$. They correspond to points D and E in Table 1.
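The resonance condition $2 m_{\tilde\chi^0_1} \approx m_{A^0}$ mentioned in the text can be compared against the standard points. The sketch below simply tabulates the fractional distance from the pole using the $m_{A^0}$ and lightest-neutralino masses read from Table 1; the function name and the choice of measure are illustrative assumptions, not part of the original analysis.

```python
# (m_A0, m_chi0_1) in GeV for the standard points A-G, read from Table 1.
TABLE_1 = {
    "A": (500.0, 207.8), "B": (502.9, 207.9), "C": (488.9, 206.9),
    "D": (403.5, 207.5), "E": (578.8, 207.5), "F": (388.4, 206.7),
    "G": (581.3, 207.8),
}


def resonance_offset(m_a, m_lsp):
    """Fractional distance from the pole, (m_A0 - 2 m_LSP) / (2 m_LSP)."""
    return (m_a - 2.0 * m_lsp) / (2.0 * m_lsp)


offsets = {point: resonance_offset(*masses) for point, masses in TABLE_1.items()}
closest = min(offsets, key=lambda point: abs(offsets[point]))
```

On this crude measure the points with the lightest $A^0$ sit within a few percent of $m_{A^0} = 2 m_{\tilde\chi^0_1}$, while the remaining points lie roughly 20% to 40% above it.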
Finally, in order to show the full contrast of all the effects of combining the D-terms
with $m_{10}/m_{16}$, we show the plots in Fig. 3 (corresponding to Table 1 points F and G).

Fig. 3. Similar to Fig. 1(a), but in (a) $D = -0.4$ and $m_{10}/m_{16} = 1.25$, while in (b) $D = 0.4$ and $m_{10}/m_{16} = 0.75$.
They correspond to points F and G in Table 1.

3.3. Electroweak symmetry breaking and $\mu$
The value of $\mu$ is highly dependent on the Higgs soft masses. The minimization of the
Higgs potential gives the following equation [67]:
$$|\mu|^2 = \frac{\bar m^2_{H_1} - \bar m^2_{H_2} \tan^2\beta}{\tan^2\beta - 1} - \frac{1}{2} M_Z^2 - \frac{1}{2} \mathrm{Re}\, \Pi^T_{ZZ} \tag{7}$$
in the same notation as previously. In the large $\tan\beta$ limit, the term involving $m^2_{H_1}$ becomes
irrelevant and $m^2_{H_2}$ must be of $O(M_Z^2)$ or negative at the electroweak breaking scale for an
acceptable, i.e. $|\mu|^2 > 0$, solution to be found. In general, large radiative corrections due
to the top Yukawa coupling achieve this. However, in the high $m_0$, low $m_{1/2}$ region of the
CMSSM, sometimes referred to as the focus-point region [68–71],$^{4}$ the potency of these
radiative corrections is reduced. This is because at small $m_{1/2}$ the relative size of the terms
chiefly responsible for driving $m^2_{H_2}$ negative, i.e. those involving $|y_t|^2$ multiplied by the
soft squark masses $m^2_{\tilde Q}$ and $m^2_{\tilde u_R}$ in the RGE for $m^2_{H_2}$, are smaller. One can understand this
by realizing that a small value for $m_{1/2}$ means a small value for the gaugino mass $M_3$ and
it is terms proportional to $g_3^2 |M_3|^2$ that are the main factors involved in making the squark
masses (and therefore $X_t$) large as they are RG evolved to low scales. Moreover, a large
$|a_t|^2$, the square of the trilinear stop-stop-Higgs coupling, also helps to drive down $m^2_{H_2}$.
4 Focus-point refers to the fact that the RGEs exhibit an approximate focusing effect on m2 near the weak
H2
scale as m0 is varied, keeping other parameters fixed. In other words the value of m2H is insensitive to m0 and
2
2
tends to be naturally of O(MZ ). Choosing m1/2 to be very small and m0 large, the soft masses and are naturally
of order the EW scale in Eq. (7) in order to produce the correct result for MZ , resulting in very small overall fine
tuning even though the sfermion masses can be O (several TeV)
149
Starting from A0 = 0 at MG , at is driven negative below this scale by a term proportional

to M3 . The larger M3 is, the larger |at | becomes and the stronger the effect on m2H2 as the
parameters are run towards MZ . If m1/2 is small enough the radiative corrections may not
be large enough to drive m2H2 sufficiently low to break the electroweak symmetry. For a
given value of m1/2 , it is generally possible to choose a large initial value of m2H2 , i.e. m20 ,
for which 0 at low scales. The boundary along which vanishes marks the border of
correct EWSB, although before this line is reached, the LEP limits on the chargino mass
are violated since as drops below M2 , the lightest chargino becomes GeV. As is
well known [20,39,40,72,73], by admitting non-universality in the Higgs sector, one can
find successful EWSB in regions excluded in the case of universal scalar masses.
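The role of $m^2_{H_2}$ in Eq. (7) can be illustrated numerically. The following is a minimal sketch, assuming that the tadpole-corrected soft masses are supplied directly as inputs and that the Z self-energy term is set to zero unless given; the function names and the sample values used with them are hypothetical and are not taken from the tables.

```python
M_Z = 91.1876  # GeV, Z boson mass


def mu_squared_tree(m_h1_sq, m_h2_sq, tan_beta, m_z=M_Z, re_pi_zz=0.0):
    """|mu|^2 from Eq. (7); all squared masses in GeV^2.

    re_pi_zz stands for Re Pi^T_ZZ and defaults to zero (an assumption
    made only to keep the illustration simple).
    """
    t2 = tan_beta ** 2
    return ((m_h1_sq - m_h2_sq * t2) / (t2 - 1.0)
            - 0.5 * m_z ** 2 - 0.5 * re_pi_zz)


def mu_squared_large_tan_beta(m_h2_sq, m_z=M_Z):
    """Large tan(beta) limit of Eq. (7): the m_H1^2 term drops out."""
    return -m_h2_sq - 0.5 * m_z ** 2


if __name__ == "__main__":
    # Hypothetical weak-scale inputs: m_H2^2 driven negative by the RGEs.
    print(mu_squared_tree(500.0 ** 2, -(600.0 ** 2), 50.0))
    print(mu_squared_large_tan_beta(-(600.0 ** 2)))
    # A positive m_H2^2 well above M_Z^2 gives |mu|^2 < 0: no EWSB.
    print(mu_squared_tree(500.0 ** 2, 100.0 ** 2, 50.0))
```

In this sketch the boundary of correct EWSB corresponds to the inputs for which the returned value crosses zero.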
D-terms, therefore, at the level of the input scale, can either help or hinder the EWSB process by increasing or decreasing $m^2_{H_2}$ at $M_G$, since $m^2_{H_2} = m^2_{10} - 2 g_{10}^2 D^2$. Thus we may expect the $\mu = 0$ boundary to move to lower values of $m_{16}$ for $D^2 < 0$ and to higher values for $D^2 > 0$ for a given $m_{1/2}$. In practice, however, by varying the D-terms alone (with $m_{10}/m_{16}$ set to 1), the point $m^2_{A^0} = 0$ is often reached before $\mu = 0$, depending on the exact location in the parameter space. On the other hand, setting the D-terms to 0 and increasing $m_{10}/m_{16}$, we push up the value of $m^2_{H_2}$ and therefore push $|\mu|^2$ towards zero at the EWSB scale without decreasing the value of $m^2_{A^0}$ enormously. Therefore, at much lower values of $m_{16}$ than in the CMSSM, we find the boundary $\mu = 0$ where EWSB fails.
We now look at a set of rather more interesting points in parameter space that illustrate some of the observations of this section: points close to where the breakdown of the radiative EWSB mechanism occurs. We also draw attention to another important fact, namely that, even relatively far from where EWSB fails, at low $m_{1/2}$ and high $m_{16}$, $\mu$ becomes increasingly sensitive to changes in the D-terms. This can be traced back to the fact that the D-term contribution essentially provides an additive constant to $m^2_{H_2}$ imposed at $M_G$, and, in our parameterization, it is proportional to $m_{16}$. In the small $m_{1/2}$, large $m_{16}$ region of parameter space, in the CMSSM case, $\mu$ is quite small and a large positive value for $D^2$, for example, will result in a relatively large correction to $m^2_{H_2}$ and therefore a relatively large increase in $\mu$.
To begin with, in Fig. 4(a) and (b) and Table 2 points U and V we compare a point in the CMSSM parameter space, $m_{1/2} = 500$ GeV, $m_{16} = 990$ GeV, where $\mu$ is still fairly sizeable, and the corresponding point with $D = 0$ and $m_{10}/m_{16} = 1.25$ where $\mu$ is very small. Note that this occurs for a relatively low value of $m_{16}$, much lower than the focus point region in the CMSSM, and in this region $\mu$ is much more sensitive to the change in $m_{10}/m_{16}$ than for our points in Figs. 1–3 and Table 1. This will be important later on when we come to discuss observables such as $(g-2)_\mu$ and $B(b \to X_s\gamma)$. Of particular importance is the lightness of the charginos and neutralinos, the lightest of which have important consequences for the cold dark matter relic density.
In Fig. 5(a) and (b) and Table 2 we have points W and X. W is the CMSSM point at $m_{1/2} = 813$ GeV and $m_{16} = 1200$ GeV. X shows the same point, but with $D = -0.4$ and $m_{10}/m_{16} = 1.25$. For X, due to the large effect of the negative D-terms on $m_{A^0}$, the heavy Higgs masses are tiny. Note that, even though $m_{10}/m_{16} = 1.25$, $\mu$ is still far from zero, around 300 GeV.
Fig. 4. Similar to Fig. 1 but in (a) m1/2 = 500 GeV, m16 = 990 GeV, m10 /m16 = 1 and D = 0, in (b)
m1/2 = 500 GeV, m16 = 990 GeV, m10 /m16 = 1.25 and D = 0. These represent points U and V in Table 2.
Note how light the chargino and neutralino spectra are in (b) compared to (a).
The final pair of points we will consider are at $m_{1/2} = 200$ GeV, $m_{16} = 1300$ GeV, firstly for zero D-terms and $m_{10}/m_{16} = 1$ and secondly for $D = 0.4$ and $m_{10}/m_{16} = 1$. The results are shown in Fig. 6(a) and (b) and Table 2, points Y and Z, demonstrating the point made earlier: that $\mu$ is more sensitive to changes in $D^2$ (and $m_{10}/m_{16}$) at small $m_{1/2}$ and large $m_{16}$ than for regions where $m_{1/2} \sim m_{16}$, and is especially so close to the no EWSB boundary. Again, this fact will be of importance in our discussion of $\tan\beta$-enhanced amplitudes.
3.4. The sfermions
We now consider the effects of the D-terms and $m_{10}/m_{16}$ on the sfermion spectrum. For a graphical representation of what follows the reader is referred back to Figs. 1–6. Due to the fact that they almost completely cancel out of the RGEs, the D-term contributions to the masses squared of the sfermions evaluated at $M_{SUSY}$ are numerically equivalent to their contributions to the soft mass squared parameters at the GUT scale, Eqs. (1), to a good approximation. As we have discussed already, increasing $m_{10}/m_{16}$ will in general lead to a decrease in the third family sparticle masses as a result of larger values of the $X_i$ in the RGEs. However, as we shall see, by far the most significant effect from the point of view of phenomenology is the effect this has on $\mu$. We now turn to examine the sparticle masses in detail. In all of what follows, we will treat the first two families of squarks and sleptons as being degenerate between the generations, i.e. $m_{\tilde d_R} = m_{\tilde s_R}$, $m_{\tilde u_L} = m_{\tilde c_L}$, $m_{\tilde\nu_e} = m_{\tilde\nu_\mu}$, etc., which is accurate to a very good approximation.

Table 2
This table shows the mass spectra (in GeV) for three pairs of contrasting points in the parameter space. Points U and V compare an unremarkable point in the CMSSM space (U) with the same point but with $m_{10}/m_{16} = 1.25$ (V), demonstrating a huge change in $\mu$. Points W and X compare another unremarkable point in the CMSSM parameter space (W) with the same point but with $m_{10}/m_{16} = 1.25$ and $D = -0.4$ (X), where the Higgs bosons are extremely light. Finally, points Y and Z compare two points at large $m_{16}$ and small $m_{1/2}$. Point Y is the CMSSM point, while point Z has $D = 0.4$. $\mu$ is roughly as sensitive to the D-terms as $m_{A^0}$ in this region of parameter space

Points of interest                     U        V        W        X        Y        Z

Input parameters
$m_{1/2}$                            500      500      813      813      200      200
$m_{16}$                             990      990     1200     1200     1300     1300
$D$                                    0        0        0     -0.4        0      0.4
$m_{10}/m_{16}$                      1.0     1.25      1.0     1.25      1.0      1.0
$\mu(M_Z)$                         581.6    103.3    882.2    299.2    293.5    562.7

Sparticle masses
$m_{h^0}$                          116.6    116.8    119.3     87.2    113.9    114.2
$m_{A^0}$                          614.2    436.3    846.0     88.8    521.2    986.7
$m_{H^0}$                          614.3    436.3    846.0    120.2    521.1    986.7
$m_{H^\pm}$                        620.1    445.0    850.2    127.5    528.0    990.2
$m_{\tilde u_L}$                  1419.4   1422.0   2007.2   1979.9   1346.7   1399.1
$m_{\tilde u_R}$                  1395.6   1398.1   1960.5   1937.4   1347.7   1391.2
$m_{\tilde d_L}$                  1421.5   1424.1   2008.6   1981.4   1349.0   1401.3
$m_{\tilde d_R}$                  1388.6   1391.1   1947.2   2033.5   1347.7   1185.5
$m_{\tilde e_L}$                  1040.0   1040.2   1308.5   1439.1   1300.8   1123.8
$m_{\tilde e_R}$                  1006.1   1006.1   1235.3   1177.6   1300.2   1363.1
$m_{\tilde\nu_e}$                 1037.2   1037.4   1306.2   1431.1   1298.5   1121.2
$m_{\tilde t_1}$                   992.1    922.7   1462.3   1357.2    801.9    882.3
$m_{\tilde t_2}$                  1169.2   1096.0   1681.2   1569.0    949.2   1053.4
$m_{\tilde b_1}$                  1123.0   1057.2   1634.8   1538.6    929.1    893.4
$m_{\tilde b_2}$                  1192.1   1087.4   1697.3   1682.3   1036.9   1036.1
$m_{\tilde\tau_1}$                 721.0    674.5    883.0    740.2    950.0    929.8
$m_{\tilde\tau_2}$                 922.7    897.9   1166.7   1287.2   1141.9   1045.1
$m_{\tilde\nu_\tau}$               914.5    894.4   1158.9   1284.7   1138.5    933.2
$m_{\tilde\chi^\pm_1}$             396.8    105.2    657.8    312.8    152.1    162.5
$m_{\tilde\chi^\pm_2}$             625.0    422.7    938.9    676.3    333.9    598.1
$m_{\tilde\chi^0_1}$               209.2     92.6    347.5    292.7     80.8     82.6
$m_{\tilde\chi^0_2}$               396.9    118.7    657.9    323.1    152.4    162.5
$m_{\tilde\chi^0_3}$               611.3    218.9    927.9    365.5    316.6    592.1
$m_{\tilde\chi^0_4}$               624.6    422.8    938.8    676.3    332.0    596.1
$m_{\tilde g}$                    1200.8   1198.6   1857.2   1855.0    563.1    563.0

$\delta a_\mu^{SUSY} \times 10^9$   1.17     1.40     0.61     0.72     1.17     1.11
$B(b \to X_s\gamma) \times 10^4$    2.90     2.77     3.37     5.09     2.41     2.89
$\Omega_{CDM} h^2$                 0.857    0.009    0.979    0.016    1.364   29.414

Fig. 5. In (b) $m_{1/2} = 813$ GeV, $m_{16} = 1200$ GeV, $m_{10}/m_{16} = 1.25$ and $D = -0.4$. These represent points W and X in Table 2. Notice how light the heavy Higgses are in (b).
3.4.1. The squarks
Fig. 6. In (b) $m_{1/2} = 200$ GeV, $m_{16} = 1300$ GeV, $m_{10}/m_{16} = 1$ and $D = 0.4$. These represent points Y and Z in Table 2. The D-terms affect $\mu$ about as much as they do $m_{A^0}$ in this region of parameter space.

In the case of the squarks the D-term effects are simple for the two light generations and the stops; we can expect the right-handed down-type squarks to be lighter/heavier than in the CMSSM as follows:

$$m^{(D)}_{\tilde d_R} \approx m_{\tilde d_R}\left(1 - \frac{3}{2}\, g_{10}^2\, \frac{D^2}{m^2_{\tilde d_R}}\right),$$

where the soft masses on the RHS are the mSugra soft masses at $M_Z$ and $g_{10}$ is the unified gauge coupling evaluated at $M_G$. For the up-type squarks and the left-handed down squarks:

$$m^{(D)}_{\tilde q} \approx m_{\tilde q}\left(1 + \frac{1}{2}\, g_{10}^2\, \frac{D^2}{m^2_{\tilde q}}\right),$$

where $m^2_{\tilde q}$ is one of $m^2_{\tilde u_R}$ or $m^2_{\tilde Q}$. Since for the two light families there are strong radiative corrections coming from the gluino ($M_3$) term in the RGEs untempered by large Yukawa terms, the squark masses are always significantly larger than $m_{16}$ and so the corrections coming from the D-terms are relatively small. From our points in Figs. 1–6 one can see that this is a barely perceptible effect, in fact a few percent at most (in the case of the right-handed down squarks).
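The size of these shifts is easy to check numerically. The sketch below assumes that the linear formulae above are the first-order expansion of $m^2 + c\, g_{10}^2 D^2$, with $c = -3$ for $\tilde d_R$ and $c = +1$ for $\tilde u_R$ and $\tilde Q$, as read off from the coefficients $-3/2$ and $+1/2$; the function names and the sample values of $g_{10}$, $D^2$ and the soft mass are hypothetical.

```python
import math

# Coefficients c in m^2 -> m^2 + c * g10^2 * D^2, read off from the
# expansions in the text (-3/2 = c/2 for d_R, +1/2 = c/2 for u_R and Q).
D_TERM_COEFF = {"d_R": -3, "u_R": 1, "Q": 1}


def shifted_mass(m_soft, coeff, g10, d_sq):
    """Mass after adding the D-term to the mass squared (GeV)."""
    m_sq = m_soft ** 2 + coeff * g10 ** 2 * d_sq
    if m_sq < 0.0:
        raise ValueError("tachyonic mass squared")
    return math.sqrt(m_sq)


def shifted_mass_linear(m_soft, coeff, g10, d_sq):
    """First-order expansion, the form quoted in the text."""
    return m_soft * (1.0 + 0.5 * coeff * g10 ** 2 * d_sq / m_soft ** 2)


if __name__ == "__main__":
    # Hypothetical inputs: 1400 GeV squark, g10 = 0.7, D^2 = (200 GeV)^2.
    for name, c in D_TERM_COEFF.items():
        full = shifted_mass(1400.0, c, 0.7, 200.0 ** 2)
        lin = shifted_mass_linear(1400.0, c, 0.7, 200.0 ** 2)
        print(name, round(full, 1), round(lin, 1))
```

For squark masses well above $g_{10}|D|$ the two forms agree closely, which is the regime described above for the two light generations.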
The stops receive a similar correction to the up-type squarks above:

$$m^{(D)}_{\tilde t_i} \approx m_{\tilde t_i}\left(1 + \frac{1}{2}\, g_{10}^2\, \frac{D^2}{m^2_{\tilde t_i}}\right),$$

where $i = 1, 2$, though it is a relatively larger effect, e.g. 2–3% instead of 1% for the first two family squarks (except for $\tilde d_R$) in Table 1, on account of the fact that the stop soft masses are substantially suppressed by the factors $X_i$ in the relevant RGEs while the D-terms are unsuppressed. For $m_{1/2} \ll m_{16}$ these percentages are much larger since, in this regime, $M_3$ is small and the sfermion soft masses are no longer strongly enhanced at the weak scale by the corresponding terms in the RGEs; meanwhile, the D-terms are relatively large, having been chosen proportional to $m_{16}$. As a result, the D-term correction to the squark masses is relatively large. The D-terms do not significantly affect the stop mass mixing since they cancel out of the mixing term in the quadratic formula derived from the diagonalization of the mass matrix; the mixing remains almost entirely dependent on the trilinear coupling $A_t$, which is insensitive to the sfermion soft masses.

Changing $m_{10}/m_{16}$, which mainly affects $\mu$, will not have a strong tree-level effect on the stop masses since we are in the large $\tan\beta$ regime where the $\mu$-dependent terms in the off-diagonal elements of the stop mass matrix are heavily suppressed. However, there will be a small decrease in the overall masses as noted above (around 1–2% for the points in Table 1), due to large values of the $X_i$ in the RGEs. Again, the mixing angle is almost completely unaffected, being largely dependent on $A_t$, which remains relatively constant.
For the sbottoms things are considerably more complicated. First of all, we consider the average sbottom mass squared as compared to the CMSSM. From diagonalizing the tree-level sbottom mass matrix one obtains (ignoring EWSB D-terms and the bottom mass except where it is enhanced by $\tan\beta$)

$$\bar m^{2(D)}_{\tilde b} \approx \frac{1}{2}\left(m^2_{\tilde b_L} + m^2_{\tilde b_R} - 2 g_{10}^2 D^2\right). \tag{8}$$

Here we have used a notation where $m^2_{\tilde b_L} \equiv m^{2(\mathrm{CMSSM})}_{\tilde Q_{33}}(M_Z)$, etc. to avoid notational clutter. Therefore the average sbottom mass is lowered in the case of positive D and raised in the event of negative D. Due to the distribution of the D-terms in the sbottom mass matrix, the sbottom mass eigenstate that is mainly $\tilde b_R$ will vary a lot more than the one that is mainly $\tilde b_L$ if the LR mixing is not too large. As far as the sbottom mass splitting is concerned one finds

$$\Delta m^{2(D)}_{\tilde b} \approx \sqrt{\left(m^2_{\tilde b_L} - m^2_{\tilde b_R} + 4 g_{10}^2 D^2\right)^2 + 4|\mu|^2 m_b^2 \tan^2\beta}. \tag{9}$$

First off, we cannot say in general which of the two terms under the square root is the larger and cannot simply approximate the square root by the binomial expansion. Any effect on the splitting induced by the D-term depends on whether $m^2_{\tilde b_L} > m^2_{\tilde b_R}$ or vice versa, and also on the magnitude of $m^2_{\tilde b_L} - m^2_{\tilde b_R}$ relative to the magnitude of the D-term contribution. We identify two contrasting regimes:
• Large $m_{1/2}$ and small $m_{16}$: we find $m^2_{\tilde b_L}$ to be larger than $m^2_{\tilde b_R}$ due to the effect of a large value of the gaugino mass ($M_2$) term in the RGE for $m^2_{\tilde b_L}$, which is absent from the RGE for the electroweak singlet mass squared $m^2_{\tilde b_R}$. The fact that the difference between the left- and right-handed masses is due to, and grows with, $m_{1/2}$ in this regime, and that we assume that the D-term grows with $m_{16}$, means that the D-term is relatively small compared to the mass splitting, even for $|D| = 0.4$, and has a much reduced effect. Nevertheless, in this region of parameter space, a positive $D^2$ will increase the splitting between the masses and a negative $D^2$ will reduce the splitting.
• Small $m_{1/2}$ and large $m_{16}$: we find the opposite is true: $m^2_{\tilde b_L}$ is less than $m^2_{\tilde b_R}$ due to the large $X_t$ term in the $m^2_{\tilde b_L}$ RGE, a term which is again absent from the corresponding equation for $m^2_{\tilde b_R}$. Here, the D-term can be relatively large compared to the difference between the soft masses and can easily be the dominant part of the first term under the square root. In this case, at least for large D, both signs of $D^2$ increase the mass splitting, though negative $D^2$ will clearly have a stronger effect.
In certain regions of parameter space $D^2$ is such that it exactly cancels out the difference between the soft masses squared, leaving the minimum possible splitting

$$\Delta m^{2(D)}_{\tilde b} \approx 2|\mu|\, m_b \tan\beta.$$

In intermediate cases things depend on the exact numbers. As it turns out, around the point $m_{1/2} \approx m_{16} \approx 500$ GeV, $m^2_{\tilde b_L} \approx m^2_{\tilde b_R}$, and in this vicinity we cannot make a general connection between the sign of D and the magnitude of the first term in the square root. There is an additional effect that comes into play. As we mentioned earlier, decreasing D results in a decrease in the size of $\mu$, tending to decrease the contribution of the $\mu$-dependent term to $\Delta m^{2(D)}_{\tilde b}$, though this is a sub-dominant effect throughout most of the parameter space. Although the D-terms cause changes in the weak scale value of $\mu$, a stronger effect can be achieved by varying $m_{10}/m_{16}$. Indeed we expect significantly smaller splitting of the sbottom masses for large values of this parameter, where $\mu$ is suppressed. One can see these effects in Figs. 1–6 and the corresponding tables.

The sbottom mixing angle is likewise difficult to predict in the general case, also being sensitive to the difference between the two soft sbottom masses and to the off-diagonal $\mu$-dependent term.
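The interplay between the D-term and the $\mu$-dependent term in Eqs. (8) and (9) can be explored with a few lines of code. This is a minimal sketch under the same approximations as those equations (EWSB D-terms neglected, bottom mass kept only where $\tan\beta$-enhanced); the function names and all numerical inputs are hypothetical.

```python
import math


def sbottom_average_sq(m_bl_sq, m_br_sq, g10, d_sq):
    """Average sbottom mass squared, Eq. (8), in GeV^2."""
    return 0.5 * (m_bl_sq + m_br_sq - 2.0 * g10 ** 2 * d_sq)


def sbottom_splitting_sq(m_bl_sq, m_br_sq, g10, d_sq, mu, m_b, tan_beta):
    """Splitting of the two mass-squared eigenvalues, Eq. (9)."""
    diagonal = m_bl_sq - m_br_sq + 4.0 * g10 ** 2 * d_sq
    off_diagonal = 2.0 * abs(mu) * m_b * tan_beta
    return math.hypot(diagonal, off_diagonal)


def d_sq_for_minimum_splitting(m_bl_sq, m_br_sq, g10):
    """D^2 that cancels the soft mass difference in Eq. (9)."""
    return -(m_bl_sq - m_br_sq) / (4.0 * g10 ** 2)


def sbottom_masses(m_bl_sq, m_br_sq, g10, d_sq, mu, m_b, tan_beta):
    """Light and heavy sbottom masses from average and splitting."""
    avg = sbottom_average_sq(m_bl_sq, m_br_sq, g10, d_sq)
    split = sbottom_splitting_sq(m_bl_sq, m_br_sq, g10, d_sq,
                                 mu, m_b, tan_beta)
    return math.sqrt(avg - 0.5 * split), math.sqrt(avg + 0.5 * split)


if __name__ == "__main__":
    # Hypothetical soft masses (GeV^2) and parameters.
    args = dict(m_bl_sq=1100.0 ** 2, m_br_sq=1050.0 ** 2, g10=0.7)
    rest = dict(mu=600.0, m_b=3.0, tan_beta=50.0)
    for d_sq in (-(200.0 ** 2), 0.0, 200.0 ** 2):
        print(d_sq, sbottom_masses(d_sq=d_sq, **args, **rest))
```

Scanning `d_sq` through the value returned by `d_sq_for_minimum_splitting` shows the splitting passing through its minimum, $2|\mu| m_b \tan\beta$.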
3.4.2. The sleptons
We begin by considering the sneutrinos and sleptons of the first two families. The left-handed slepton masses are given approximately by

$$m^{(D)}_{\tilde\nu_i} \approx m^{(D)}_{\tilde e_{Li}} \approx m_{\tilde L_i}\left(1 - \frac{3}{2}\, g_{10}^2\, \frac{D^2}{m^2_{\tilde L_i}}\right), \tag{10}$$

whereas the right-handed slepton masses are given by

$$m^{(D)}_{\tilde e_{Ri}} \approx m_{\tilde e_{Ri}}\left(1 + \frac{1}{2}\, g_{10}^2\, \frac{D^2}{m^2_{\tilde e_{Ri}}}\right),$$

where $i = 1, 2$ and again the soft masses on the RHS are the mSugra masses at the EWSB scale and $g_{10}$ is the unified gauge coupling evaluated at $M_G$. If $m_{1/2}$ is very small then $m_{\tilde L_i} \approx m_{\tilde e_{Ri}}$ is of roughly equal magnitude to $m_{16}$ at the EWSB scale, at least for moderate to large values of $m_{16}$. In this case the right- and left-handed sleptons are nearly degenerate. Otherwise $m_{\tilde L_i}$ is substantially larger than both $m_{\tilde e_{Ri}}$ and $m_{16}$, radiatively driven by the wino mass $M_2$. The right-handed masses do not have a term proportional to $M_2$ in the RGE and are always smaller than their left-handed counterparts in the CMSSM, although they may still be significantly bigger than $m_{16}$ if $m_{1/2}$ is large. Since the slepton masses are smaller than the squark masses as a result of not having an $M_3$ term in their RGEs, the relative effect of the D-terms is much larger and can be as much as 12% for $D = 0.4$ for the left-handed sleptons and 4% for the right-handed sleptons in Table 1. Consequently, the right-handed charged sleptons can be heavier than the left-handed charged sleptons even at moderate values of $m_{1/2}$. Again, these effects can be seen graphically in Figs. 1–6 and numerically in the tables.
Again, the third family is more complicated, but far more clear-cut than for the sbottoms. The $\tau$ Yukawa coupling plays an important role by balancing the $M_2$ term in the RGE for $m^2_{\tilde\tau_L}$ and making sure that $m_{\tilde\tau_R}$ is always pushed lower than $m_{16}$ at $M_Z$; for low values of $m_{16}$ it is usually found that the stau is the LSP or is even tachyonic. The $\tau$ sneutrino mass is given by Eq. (10) with $m_{\tilde L_i}$ replaced by $m_{\tilde\tau_L}$. The average stau mass is given by Eq. (8) with the squark soft masses replaced by the slepton soft masses. Again, increasing the value of $D^2$ will lower the average mass, with the predominantly $\tilde\tau_L$ mass eigenvalue (assuming such a distinction exists) varying more than the mainly right-handed one. For the mass splitting, the situation is clearer:

$$\Delta m^{2(D)}_{\tilde\tau} \approx \sqrt{\left(m^2_{\tilde\tau_L} - m^2_{\tilde\tau_R} - 4 g_{10}^2 D^2\right)^2 + 4|\mu|^2 m_\tau^2 \tan^2\beta} \tag{11}$$

in the same notation as before. Note the change of sign of the D-term relative to Eq. (9). Now, however, $m^2_{\tilde\tau_L}$ is always bigger than $m^2_{\tilde\tau_R}$ and we find that, throughout the parameter space we explore, a positive value of $D^2$ always reduces the mass splitting (although it may swap which of the mass eigenstates is dominantly right- or left-handed) and a negative value will always increase the difference.

For the 3rd family sleptons, although the $\mu$-dependent term in Eq. (11) grows as one decreases $m_{10}/m_{16}$, it is still very small relative to the difference in the soft masses squared due to the $\tau$ mass squared suppression (even taking into account its $\tan^2\beta$ enhancement). In fact, since $m^2_{\tilde\tau_R}$ is twice as dependent on the factor $X_\tau$ through the RGEs as $m^2_{\tilde\tau_L}$, and $X_\tau$ is substantially smaller for $m_{10}/m_{16} = 0.75$ than in the CMSSM case, the difference $m^2_{\tilde\tau_L} - m^2_{\tilde\tau_R}$ decreases. As a result the stau mass difference decreases as $m_{10}/m_{16}$ decreases. The average masses, on the other hand, grow with decreasing $m_{10}/m_{16}$, also due to the smaller $X_\tau$ in the RGEs. Again the reader is referred to Figs. 1–6 and Tables 1 and 2.
3.5. The charginos and neutralinos

D-terms have a relatively small influence on the charginos and neutralinos. The soft gaugino masses $M_i$ are almost entirely unaffected, but there is some effect on $\mu$, which governs the masses of the Higgsino-like charginos and neutralinos. We note that we can make a distinction between Higgsino-like and gaugino-like charginos and neutralinos when we have the hierarchy $\mu^2 - M_2^2 \gg M_W^2$, where $M_W$ is the W boson mass. This occurs in most of the parameter space that we examine and the mixing between gauginos and Higgsinos is relatively small. Very roughly, in the situation where this hierarchy arises, we have

$$m_{\tilde\chi^0_1} \approx M_1, \qquad m_{\tilde\chi^0_2} \approx m_{\tilde\chi^\pm_1} \approx M_2, \qquad m_{\tilde\chi^0_3} \approx m_{\tilde\chi^0_4} \approx m_{\tilde\chi^\pm_2} \approx |\mu|. \tag{12}$$

An increase in $D^2$ will cause a corresponding increase in $\mu$ relative to $M_2$ via the mechanism detailed earlier in Section 3.3. As we mentioned before, this is a relatively small effect. Varying $m_{10}/m_{16}$, on the other hand, creates substantial changes in $\mu$ and, as a result, the Higgsino-like neutralinos and charginos can be significantly lighter or heavier than in the mSugra case. This is of vital importance for the calculation of the neutralino cosmological relic density where, as $\mu$ approaches $M_1$, the lightest neutralino may have a significant Higgsino component, enhancing the highly efficient Higgs-exchange annihilation channels. It is also important for any process with chargino and/or neutralino loops such as $(g-2)_\mu$, $B(b \to X_s\gamma)$ and the SUSY correction to the bottom mass, $\delta m_b$, which we have already touched on briefly. We will discuss these constraints in a little more detail in later sections.
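Another matter that needs to be taken into account in order to discuss low energy observables is the dependence of the mixing angles in the gaugino sector on $\mu$ and the $M_i$. In the chargino sector, in the basis $(\tilde W^\pm, \tilde H^\pm)$, at tree level, the chargino mass matrix

$$M_{\tilde\chi^\pm} = \begin{pmatrix} M_2 & \sqrt{2} M_W \sin\beta \\ \sqrt{2} M_W \cos\beta & \mu \end{pmatrix} \tag{13}$$

is diagonalized by a bi-unitary transformation with diagonalization matrices $U$ and $V$, i.e. $M^{\mathrm{diag}}_{\tilde\chi^\pm} = U^* M_{\tilde\chi^\pm} V^{-1}$. In the case where all the parameters are real, which is an assumption we make here, $U$ and $V$ are both orthogonal and parameterized by angles $\theta_R$ and $\theta_L$:

$$M^{\mathrm{diag}}_{\tilde\chi^\pm} = \begin{pmatrix} \cos\theta_L & -\sin\theta_L \\ \sin\theta_L & \cos\theta_L \end{pmatrix} \begin{pmatrix} M_2 & \sqrt{2} M_W \sin\beta \\ \sqrt{2} M_W \cos\beta & \mu \end{pmatrix} \begin{pmatrix} \cos\theta_R & -\sin\theta_R \\ \sin\theta_R & \cos\theta_R \end{pmatrix}. \tag{14}$$

To order $M_W^2$ divided by powers of soft masses (assuming that the difference $(\mu^2 - M_2^2) \gg M_W^2$, which is the case in most of the parameter space we analyze), and in the large $\tan\beta$ regime where $\sin\beta \approx 1$ and $\cos\beta \approx 0$, it can easily be shown that

$$\cos\theta_L \approx 1 - \frac{\mu^2 M_W^2}{(\mu^2 - M_2^2)^2}, \qquad \sin\theta_L \approx \frac{\sqrt{2} M_W \mu}{\mu^2 - M_2^2},$$
$$\cos\theta_R \approx 1 - \frac{M_2^2 M_W^2}{(\mu^2 - M_2^2)^2}, \qquad \sin\theta_R \approx -\frac{\sqrt{2} M_2 M_W}{\mu^2 - M_2^2}. \tag{15}$$

From a quick glance, we can deduce that in this regime the chargino mixing angles will grow as $\mu$ decreases as a result of increasing $m_{10}/m_{16}$ or decreasing $D^2$. When $\mu$ approaches $M_2$, as in the region close to where $\mu$ vanishes at the boundary of EWSB for large $m_{10}/m_{16}$, these approximations break down and this simple power series expansion is no longer applicable for the mixing angles.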
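How well the expansion for the mixing angles holds can be checked by diagonalizing the tree-level 2 × 2 chargino matrix exactly. The sketch below is one way to do this for real parameters; it assumes a particular sign convention for the two rotations (left rotation with rows $(\cos\theta_L, -\sin\theta_L)$, $(\sin\theta_L, \cos\theta_L)$ and the same form for the right rotation), and the value $\tan\beta = 50$ and the function names are hypothetical. The inputs $M_2 = 382.8$ GeV and $\mu = 592.9$ GeV are the point A values quoted in Table 3.

```python
import math

M_W = 80.4  # GeV, approximate W boson mass


def chargino_matrix(m2, mu, tan_beta, m_w=M_W):
    """Tree-level chargino mass matrix in the (wino, Higgsino) basis."""
    beta = math.atan(tan_beta)
    off = math.sqrt(2.0) * m_w
    return [[m2, off * math.sin(beta)], [off * math.cos(beta), mu]]


def mixing_angles(m):
    """Exact rotation angles: theta_L from M M^T, theta_R from M^T M."""
    (a, b), (c, d) = m
    mmt_11, mmt_22, mmt_12 = a * a + b * b, c * c + d * d, a * c + b * d
    mtm_11, mtm_22, mtm_12 = a * a + c * c, b * b + d * d, a * b + c * d
    theta_l = 0.5 * math.atan2(2.0 * mmt_12, mmt_22 - mmt_11)
    theta_r = -0.5 * math.atan2(2.0 * mtm_12, mtm_22 - mtm_11)
    return theta_l, theta_r


def rotate(m, theta_l, theta_r):
    """Return O_L * M * O_R for the assumed rotation convention."""
    cl, sl = math.cos(theta_l), math.sin(theta_l)
    cr, sr = math.cos(theta_r), math.sin(theta_r)
    o_l = [[cl, -sl], [sl, cl]]
    o_r = [[cr, -sr], [sr, cr]]
    tmp = [[sum(o_l[i][k] * m[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
    return [[sum(tmp[i][k] * o_r[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def approximate_sines(m2, mu, m_w=M_W):
    """Leading-order sin(theta_L), sin(theta_R) at large tan(beta)."""
    delta = mu ** 2 - m2 ** 2
    return (math.sqrt(2.0) * m_w * mu / delta,
            -math.sqrt(2.0) * m_w * m2 / delta)


if __name__ == "__main__":
    m = chargino_matrix(382.8, 592.9, 50.0)
    th_l, th_r = mixing_angles(m)
    print(rotate(m, th_l, th_r))                 # diagonal up to rounding
    print(math.sin(th_l), math.sin(th_r))        # exact sines
    print(approximate_sines(382.8, 592.9))       # leading-order estimate
```

In this convention the two sines come out with opposite signs, and the leading-order estimates overshoot the exact values by roughly ten percent for these inputs, which gives a feel for the accuracy of the expansion away from $\mu \approx M_2$.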
In the neutralino sector things are much more complicated since the mass matrix is $4 \times 4$. In the basis $(\tilde B^0, \tilde W^0, \tilde H^0_1, \tilde H^0_2)$, at tree level,

$$M_{\tilde\chi^0} = \begin{pmatrix} M_1 & 0 & -M_Z \cos\beta \sin\theta_W & M_Z \sin\beta \sin\theta_W \\ 0 & M_2 & M_Z \cos\beta \cos\theta_W & -M_Z \sin\beta \cos\theta_W \\ -M_Z \cos\beta \sin\theta_W & M_Z \cos\beta \cos\theta_W & 0 & -\mu \\ M_Z \sin\beta \sin\theta_W & -M_Z \sin\beta \cos\theta_W & -\mu & 0 \end{pmatrix},$$

where $\tan\theta_W = \sqrt{3/5}\; g_1/g_2$ is the tangent of the Weinberg angle. In the approximation where all of the soft breaking parameters are real, this matrix can be diagonalized by an orthogonal matrix $O$, such that

$$M^{\mathrm{diag}}_{\tilde\chi^0} = O^T M_{\tilde\chi^0} O.$$

In what follows, we will be interested only in the lightest neutralino, $\tilde\chi^0_1$. In the limit of large $\tan\beta$, with $\mu^2 > M_1^2 \gg M_Z^2$ and $\mu^2 - M_1^2 \gg M_Z^2$ (these relations again hold throughout the majority of the parameter space probed), an approximate solution can be found for the lightest mass eigenvalue and we can roughly express the components of the lightest neutralino, with $m_{\tilde\chi^0_1} \approx M_1$, in the basis $(\tilde B^0, \tilde W^0, \tilde H^0_1, \tilde H^0_2)$ as

$$\tilde\chi^0_1 \approx \left(1,\; 0,\; \frac{\mu M_Z \sin\theta_W}{\mu^2 - M_1^2},\; -\frac{M_1 M_Z \sin\theta_W}{\mu^2 - M_1^2}\right). \tag{16}$$

Thus, by decreasing $\mu$ relative to $M_1$, from an increase in $m_{10}/m_{16}$ or a decrease in $D^2$, we obtain an increase in the Higgsino fraction of the lightest neutralino. As we shall see, this can have an enormous effect on the neutralino relic density.
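The growth of the Higgsino fraction as $\mu$ approaches $M_1$ can be made concrete with the approximate eigenvector of Eq. (16). The following sketch assumes the standard sign conventions for the tree-level neutralino matrix in the basis $(\tilde B^0, \tilde W^0, \tilde H^0_1, \tilde H^0_2)$ and checks, via the Rayleigh quotient and the residual, that the approximate vector is close to a true eigenvector with eigenvalue near $M_1$. The values of $M_1$, $\tan\beta$ and $\sin^2\theta_W$ and the function names are hypothetical.

```python
import math

M_Z = 91.1876      # GeV
SIN2_THETA_W = 0.231  # approximate weak mixing angle (assumed value)


def neutralino_matrix(m1, m2, mu, tan_beta):
    """Tree-level neutralino mass matrix (standard sign convention)."""
    beta = math.atan(tan_beta)
    sb, cb = math.sin(beta), math.cos(beta)
    sw = math.sqrt(SIN2_THETA_W)
    cw = math.sqrt(1.0 - SIN2_THETA_W)
    return [
        [m1, 0.0, -M_Z * cb * sw, M_Z * sb * sw],
        [0.0, m2, M_Z * cb * cw, -M_Z * sb * cw],
        [-M_Z * cb * sw, M_Z * cb * cw, 0.0, -mu],
        [M_Z * sb * sw, -M_Z * sb * cw, -mu, 0.0],
    ]


def approx_lightest(m1, mu):
    """Unnormalized components of the lightest neutralino, Eq. (16)."""
    sw = math.sqrt(SIN2_THETA_W)
    delta = mu ** 2 - m1 ** 2
    return [1.0, 0.0, mu * M_Z * sw / delta, -m1 * M_Z * sw / delta]


def higgsino_fraction(v):
    """Fraction of the squared norm carried by the Higgsino components."""
    return (v[2] ** 2 + v[3] ** 2) / sum(x * x for x in v)


def rayleigh_and_residual(matrix, v):
    """Rayleigh quotient and norm of (M v - lambda v) for unit-norm v."""
    norm = math.sqrt(sum(x * x for x in v))
    u = [x / norm for x in v]
    mu_vec = [sum(matrix[i][j] * u[j] for j in range(4)) for i in range(4)]
    lam = sum(u[i] * mu_vec[i] for i in range(4))
    res = math.sqrt(sum((mu_vec[i] - lam * u[i]) ** 2 for i in range(4)))
    return lam, res


if __name__ == "__main__":
    for mu in (600.0, 300.0):
        v = approx_lightest(210.0, mu)
        lam, res = rayleigh_and_residual(
            neutralino_matrix(210.0, 382.8, mu, 50.0), v)
        print(mu, lam, res, higgsino_fraction(v))
```

Lowering $\mu$ from 600 GeV to 300 GeV at fixed $M_1$ raises the Higgsino fraction by more than an order of magnitude in this sketch, while the residual also grows, signalling that the expansion becomes less reliable as $\mu \to M_1$.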
The effects on the charginos and neutralinos of varying $D^2$ and $m_{10}/m_{16}$ can be seen in Figs. 1–6 and in the tables.
4. $\tan\beta$-enhanced amplitudes

We now try to draw some general conclusions about how the above changes in the masses and mixing angles should affect important observables such as the muon anomalous magnetic moment and the branching ratio $B(b \to X_s\gamma)$.
It is well known that the Feynman diagrams for the quantities $(g-2)_\mu$ [74–76], $B(b \to X_s\gamma)$ [77–80], and the SUSY threshold correction to the bottom mass [81–83] all contain, and in this case are dominated by, $\tan\beta$-enhanced contributions. This comes about when the required chirality flip for these processes arises from mass insertions in the gaugino or sfermion line as opposed to a mass insertion on the external fermion line.
4.1. The muon anomalous magnetic moment
We will consider in detail $\delta a_\mu \equiv \delta(g-2)_\mu/2$. Throughout our parameter space, the dominant diagram is a chargino–sneutrino loop. In the absence of right-handed sneutrinos, the mass insertions can only occur in the gaugino line, with a $\mu_R \tilde\nu_L \tilde H$ vertex at one end and a $\mu_L \tilde\nu_L \tilde W$ vertex at the other (see Fig. 7), and the diagram is proportional to $\mu M_2 \tan\beta$.

Fig. 7. The $\tan\beta$-enhanced mass insertion diagram for the chargino contribution to $(g-2)_\mu$. The various factors show the couplings at each vertex/insertion. From the factor $y_\mu v \sin\beta$ we deduce the diagram is proportional to $m_\mu \tan\beta$. Contributions for which the helicity flip occurs on the external muon line carry a relative suppression of $\tan\beta$.

Switching now to the mass eigenstate basis for the exact 1-loop calculation of the chargino part of the amplitude, the $\tan\beta$-enhanced diagrams come from two loops, one containing $\tilde\chi^\pm_1$ and one containing $\tilde\chi^\pm_2$, in each case connecting left- to right-handed muons so that the helicity flip occurs in the chargino line. The part of the amplitude containing the $\tilde\chi^\pm_1$ loop is proportional to a factor $\cos\theta_R \sin\theta_L$ and the part containing a $\tilde\chi^\pm_2$ loop is proportional to a factor $\sin\theta_R \cos\theta_L$. Throughout our parameter space we find these factors to be opposite in sign and so there is some cancellation between the loop diagrams involving the different chargino mass eigenstates. The charginos contribute factors to the amplitude that can be written in the form (compare with the formulae in, for example, [84]):

$$\delta a_\mu^{\tilde\chi^\pm_1} = C\, \frac{m_{\tilde\chi^\pm_1}}{m^2_{\tilde\nu_\mu}}\, \sqrt{2} \cos\theta_R \sin\theta_L\; F\!\left(\frac{m^2_{\tilde\chi^\pm_1}}{m^2_{\tilde\nu_\mu}}\right), \tag{17}$$

$$\delta a_\mu^{\tilde\chi^\pm_2} = C\, \frac{m_{\tilde\chi^\pm_2}}{m^2_{\tilde\nu_\mu}}\, \sqrt{2} \sin\theta_R \cos\theta_L\; F\!\left(\frac{m^2_{\tilde\chi^\pm_2}}{m^2_{\tilde\nu_\mu}}\right). \tag{18}$$

In these equations, $C$ is a positive constant with respect to varying D-terms or $m_{10}/m_{16}$ (at least up to negligible corrections), and involves $m_\mu$, $\beta$, the Higgs VEV $v$ and the gauge coupling $g_2$. The function $F$ is a phase space integral dependent on the relevant mass-squared ratio:

$$F(x) = \frac{-3 + 4x - x^2 - 2\ln x}{(1 - x)^3}. \tag{19}$$
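We can project out the $\tan\beta$-enhanced part of the amplitude by setting $\sin\beta \to 1$ and $\cos\beta \to 0$. The leading-order terms in the soft SUSY-breaking mass parameters are given by:

$$\delta a_\mu^{\tilde\chi^\pm_1} \approx C \sqrt{2}\, \frac{M_2}{m^2_{\tilde\nu_L}}\, \frac{\sqrt{2} M_W \mu}{\mu^2 - M_2^2}\; F\!\left(\frac{M_2^2}{m^2_{\tilde\nu_L}}\right),$$

$$\delta a_\mu^{\tilde\chi^\pm_2} \approx -C \sqrt{2}\, \frac{\mu}{m^2_{\tilde\nu_L}}\, \frac{\sqrt{2} M_2 M_W}{\mu^2 - M_2^2}\; F\!\left(\frac{\mu^2}{m^2_{\tilde\nu_L}}\right), \tag{20}$$

where $m^2_{\tilde\nu_L} \approx m^2_{\tilde L_{22}}$. And so

$$\delta a_\mu^{\tilde\chi^\pm_1} + \delta a_\mu^{\tilde\chi^\pm_2} = C'\, \frac{\mu M_2}{m^2_{\tilde\nu_L}(\mu^2 - M_2^2)} \left[ F\!\left(\frac{M_2^2}{m^2_{\tilde\nu_L}}\right) - F\!\left(\frac{\mu^2}{m^2_{\tilde\nu_L}}\right) \right]. \tag{21}$$

Here, $C'$ is another positive constant:

$$C' \approx \frac{g_2^2\, m_\mu^2 \tan\beta}{16\pi^2} \approx 1.44 \times 10^{-3}.$$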
Unfortunately, from the point of view of getting an intuitive idea of how the D-terms and sfermion-Higgs splitting will affect $\delta a_\mu$, it does not make much sense to series expand the function $F$ in terms of the mass ratios since they are frequently close to the critical value 1. However, $F(x)$ is a monotonically decreasing function of $x$, and is shown in Fig. 8. Now we consider what happens when we increase the value of $D$ from 0 to 0.4 for the point $m_{16} = m_{1/2} = 500$ GeV. Two main effects occur. The first is that the value of $m_{\tilde\nu_L}$ decreases (e.g. by 55 GeV (around 10%) for point E in Table 1 as compared with point A). The second is that $\mu$ increases. The change is less significant for $\mu$ than for $m_{\tilde\nu_L}$ since $\mu$ is only affected at loop order through RGE effects (e.g. by around 30 GeV (around 5%) for point E compared with A). $M_2$ remains basically unchanged. Consequently, the factor outside the large parentheses of Eq. (21), $\mu M_2/(m^2_{\tilde\nu_L}(\mu^2 - M_2^2))$, grows as $D$ increases. However, this is countered somewhat by what happens inside the parentheses. By increasing $D^2$ the mass ratios increase and one moves along the x-axis of Fig. 8 to the right, causing a general decrease in the magnitude of the function in Eq. (19). One must be careful though because $\mu$ will increase as a result of increasing $D^2$, resulting in an increase in $\mu^2/m^2_{\tilde\nu_L}$ relative to $M_2^2/m^2_{\tilde\nu_L}$, therefore tending to create a larger difference between the two $F(x)$ terms and canceling out the effect of the smaller overall magnitude. Comparing the points E and A, the overall decrease narrowly wins and the difference between the loop functions $F(x)$ is larger in A than in E. The external factor $\mu M_2/(m^2_{\tilde\nu_L}(\mu^2 - M_2^2))$, on the other hand, is larger in E than in A and overall the contribution to $\delta a_\mu$ is greater for $D = 0.4$ than for $D = 0$. However, when one looks at point D, for which $D = -0.4$, one can see that the cancellation between the loop function terms dominates the larger overall magnitude of each contribution and the difference between the $F(x)$ for point D is also smaller than for point A (note, though, that our approximation is somewhat worse for point D than for points A and E due to the fact that $\mu$ is much closer in magnitude to $M_2$). Table 3 shows the corresponding values for our approximation as compared to the calculated figures.

Fig. 8. The loop function $F(x)$, Eq. (19).

We note here that in different parts of parameter space things change somewhat. For example, for $m_{16} \gg m_{1/2}$, at least for $m_{10}/m_{16} = 1$, increasing $D^2$ has a stronger effect on $\mu$ than it does on $m^2_{\tilde\nu_L}$ and the above situation is reversed; although this time the loop function for $D = 0.4$ is much larger than that for $D = 0$, the external factor $\mu M_2/(m^2_{\tilde\nu_L}(\mu^2 - M_2^2))$ is much smaller and the amplitude for $D = 0.4$ is smaller than that for $D = 0$. Note that in this region of parameter space, $D = -0.4$ is excluded by the EWSB requirements and in any case, the above approximation would break down.
To summarize: in general, it is somewhat complicated to understand intuitively how the D-terms affect $\delta a_\mu$, even though it is a relatively simple calculation as compared with, for example, $B(b \to X_s\gamma)$. It all depends on what part of the parameter space you are in, where on the curve in Fig. 8 you are and also on the relative effects of the D-terms on $\mu$ with respect to $m^2_{\tilde\nu_L}$.
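Table 3
This table compares the approximation to the calculated value of $\delta a_\mu$ for points A, D and E in Table 1

Point                                                          A        D        E

Input parameters
$m_{1/2}$                                                    500      500      500
$m_{16}$                                                     500      500      500
$D$                                                            0     -0.4      0.4
$m_{10}/m_{16}$                                              1.0      1.0      1.0

$\mu(M_Z)$                                                 592.9    561.5    622.9
$m_{\tilde\nu_L}$                                          601.6    651.8    546.7
$M_2$                                                      382.8    382.8    382.7
$\mu M_2/(m^2_{\tilde\nu_L}(\mu^2 - M_2^2)) \times 10^6$    3.06     3.00     3.30
$F(M_2^2/m^2_{\tilde\nu_L})$                                1.25     1.39     1.11
$F(\mu^2/m^2_{\tilde\nu_L})$                                0.68     0.83     0.55
$F(M_2^2/m^2_{\tilde\nu_L}) - F(\mu^2/m^2_{\tilde\nu_L})$   0.57     0.56     0.56
$\delta a_\mu \times 10^9$ (approximation)                  2.54     2.42     2.68
$\delta a_\mu^{\tilde\chi^\pm} \times 10^9$ (actual value)  2.35     2.24     2.48
$\delta a_\mu^{SUSY} \times 10^9$ (actual value)            2.18     2.04     2.34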
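The entries of Table 3 can be reproduced from Eqs. (19) and (21) with a short script. This is a minimal sketch: the function names are hypothetical, the inputs are the point A, D and E values of $\mu$, $M_2$ and $m_{\tilde\nu_L}$ quoted in Table 3 (in GeV), and $C' = 1.44 \times 10^{-3}$ is taken as quoted in the text. A short series expansion is used near $x = 1$, where the closed form of $F$ is numerically delicate.

```python
import math

C_PRIME = 1.44e-3  # constant C' as quoted in the text


def loop_f(x):
    """Loop function F(x) of Eq. (19); F(1) = 2/3 by continuity."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    eps = x - 1.0
    if abs(eps) < 1e-3:
        # Taylor expansion around x = 1 to avoid 0/0 cancellation.
        return 2.0 / 3.0 - eps / 2.0 + 2.0 * eps ** 2 / 5.0
    return (-3.0 + 4.0 * x - x * x - 2.0 * math.log(x)) / (1.0 - x) ** 3


def external_factor(mu, m2, m_snu):
    """Factor mu*M2 / (m_snu^2 (mu^2 - M2^2)) outside the brackets."""
    return mu * m2 / (m_snu ** 2 * (mu ** 2 - m2 ** 2))


def delta_a_approx(mu, m2, m_snu, c_prime=C_PRIME):
    """Approximate chargino contribution to delta a_mu, Eq. (21)."""
    diff = loop_f(m2 ** 2 / m_snu ** 2) - loop_f(mu ** 2 / m_snu ** 2)
    return c_prime * external_factor(mu, m2, m_snu) * diff


POINTS = {
    "A": dict(mu=592.9, m2=382.8, m_snu=601.6),
    "D": dict(mu=561.5, m2=382.8, m_snu=651.8),
    "E": dict(mu=622.9, m2=382.7, m_snu=546.7),
}

if __name__ == "__main__":
    for name, p in POINTS.items():
        print(name,
              round(external_factor(**p) * 1e6, 2),
              round(loop_f(p["m2"] ** 2 / p["m_snu"] ** 2), 2),
              round(loop_f(p["mu"] ** 2 / p["m_snu"] ** 2), 2),
              delta_a_approx(**p) * 1e9)
```

Small differences from the tabulated approximation are to be expected, since the rounded value of $C'$ is used here.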
Increasing $m_{10}/m_{16}$ has a more straightforward effect, although it is still not immediately obvious what will happen from our approximation Eq. (21). Since an increase in this variable will tend to decrease $\mu$ while leaving the sneutrino mass and $M_2$ relatively unchanged, from the above discussion it is clear that the difference between the loop functions will become smaller whereas the factor $\mu M_2/(m^2_{\tilde\nu_L}(\mu^2 - M_2^2))$ will grow. We find that throughout the parameter space, the second effect is greater and that increasing $m_{10}/m_{16}$ causes $\delta a_\mu$ to grow.
4.2. $B(b \to X_s\gamma)$
A similar discussion on $B(b \to X_s\gamma)$ to that above would be rather more involved and not any more illuminating. We will limit ourselves to making a few general remarks.
In the Standard Model, the calculated branching ratio $B(b \to X_s\gamma)$ accounts very well for the experimentally observed value. In the MSSM one must also include contributions from the charged Higgs diagram, which always has the same sign as the Standard Model $W^\pm$ loop, and the chargino diagram. Either these additional contributions cannot be too large, or must cancel. In the positive $\mu$ regime, the chargino amplitude has the opposite sign to the charged Higgs, allowing this cancellation and helping to keep the amplitude within the experimental bounds. However, in the CMSSM often the charged Higgs mass is large compared to the chargino masses at low values of $m_0$ and $m_{1/2}$ and the charged Higgs contribution is not sufficient to compete with the chargino contribution. As a consequence the calculated value for $B(b \to X_s\gamma)$ undershoots the experimental result and excludes a significant region of the parameter space.
The $\tan\beta$-enhanced part of the chargino contribution to $B(b \to X_s\gamma)$ is a more complicated expression than the one for the $(g-2)_\mu$ chargino contribution for two reasons. Firstly, the approach to the calculation of $B(b \to X_s\gamma)$ is different: the strength of the QCD interaction necessitates operator product methods instead of straightforward one-loop renormalized perturbation theory, and one has to calculate the SUSY and charged Higgs contributions to two different Wilson coefficients. Secondly, one also has to consider the fact that there are two different stop mass eigenstates propagating in the loop as opposed to the solitary sneutrino state, both of which contribute to $\tan\beta$-enhanced diagrams.
When comparing the D-term- and $m_{10}/m_{16}$-corrected case to the standard mSugra scenario, an important factor in the discussion is the comparative sizes of the charged Higgs and Higgsino-like chargino masses, i.e. $m_{H^\pm}$ and $\mu$, respectively. It turns out that in the $b \to X_s\gamma$ case, unlike in the case of $(g-2)_\mu$, the chargino amplitude always grows as $\mu$ becomes smaller, whether due to D-terms or $m_{10}/m_{16}$. Likewise, the charged Higgs contribution always increases with decreasing $m_{H^\pm}$ (or equivalently $m_{A^0}$). However, the question is not only which of $m_{A^0}$ or $\mu$ changes most. Another factor is the overall size of the amplitudes. Even if the charged Higgs contribution becomes smaller relative to the chargino contribution and is less effective in canceling it, it could be that at the same time the chargino contribution becomes too small to ruin the Standard Model prediction. In any case, it is the sum of the two amplitudes compared to the W loop that is important rather than their relative changes in magnitude.
If, for example, we decrease $D$ we cause a decrease in $\mu$, but also a decrease in $m_{H^\pm}$. This effect is relatively larger for $m_{H^\pm}$ than for $\mu$ over much of the parameter space (except for $m_{1/2} \ll m_{16}$) and so one may expect the Higgs term to cancel the excessively large chargino contribution more successfully. However, as mentioned above, the relative size of the changes is not the only important factor: the absolute size of the amplitudes matters too. It turns out that in the region $m_{1/2} \approx m_{16} \approx 500$ GeV, although the Higgs contribution grows relative to the chargino contribution with decreasing $D^2$, they both increase by a roughly equal amount in absolute magnitude and their sum remains about the same. One can see that the $B(b \to X_s\gamma)$ values quoted for points A, D and E in Table 1 are almost equal.
For the region $m_{1/2} \ll m_{16}$, increasing $D$ from 0 to 0.4 ($D = -0.4$ is ruled out by EWSB restrictions) affects $\mu$ just about as much as $m_{H^\pm}$, and by quite a substantial amount. Here, the overall size of the SUSY contribution decreases; in addition, the charged Higgs contribution, although it decreases, cancels out the chargino contribution more successfully. As we shall see, this results in the excluded region becoming smaller as $D^2$ increases.
A particularly interesting case is when $D^2$ is large and negative, close to the boundary where $m_{A^0}^2$ vanishes. Here, $m_{H^\pm}$ is also very small although $\mu$ can remain several hundred GeV. In this case, one finds a narrow sliver of acceptable parameter space where the charged Higgs contribution is large enough to counter the chargino amplitude even though the chargino amplitude is very large. We also found regions where the Higgs contribution is so large that the upper bound on the branching fraction is violated.
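4.3. Bottom mass corrections: $\Delta m_b$

Sparticle loop diagrams result in a finite threshold correction to the bottom mass. The most important contributions come from $\tan\beta$-enhanced terms which alter the relation between the (running $\overline{\rm DR}$) bottom mass and the bottom Yukawa coupling. The corrected expression can be approximated by (see, for example, [81])
$$m_b \approx y_b v \cos\beta \left(1 + \Delta m_b^{\tilde b\tilde g} + \Delta m_b^{\tilde t\tilde\chi}\right), \qquad (22)$$
where
$$\Delta m_b^{\tilde b\tilde g} \approx \frac{2\alpha_s}{3\pi}\,\mu\, m_{\tilde g}\tan\beta\; I\left(m_{\tilde b_1}^2, m_{\tilde b_2}^2, m_{\tilde g}^2\right), \qquad (23)$$
and
$$\Delta m_b^{\tilde t\tilde\chi} \approx \frac{y_t^2}{16\pi^2}\,\mu A_t \tan\beta\; I\left(m_{\tilde t_1}^2, m_{\tilde t_2}^2, \mu^2\right), \qquad (24)$$
with
$$I(x, y, z) = -\frac{xy\ln\frac{x}{y} + yz\ln\frac{y}{z} + zx\ln\frac{z}{x}}{(x-y)(y-z)(z-x)}. \qquad (25)$$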
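As a numerical illustration of Eqs. (23)-(25), the following minimal sketch evaluates the loop function and the two $\tan\beta$-enhanced corrections. It assumes the standard form of $I(x, y, z)$ with an overall minus sign (which makes it positive, as stated in the text) and pairwise distinct arguments. The function names and argument names are hypothetical and are not taken from the paper or from any of the codes it uses.

```python
import math


def loop_I(x: float, y: float, z: float) -> float:
    """Loop function I(x, y, z) of Eq. (25); arguments are squared masses.

    Assumes x, y, z are positive and pairwise distinct. The degenerate
    limits are finite but would need a separate series expansion.
    """
    num = (x * y * math.log(x / y)
           + y * z * math.log(y / z)
           + z * x * math.log(z / x))
    den = (x - y) * (y - z) * (z - x)
    return -num / den


def delta_mb_gluino(alpha_s, mu, m_gluino, tan_beta, m_sbottom1, m_sbottom2):
    """Sbottom-gluino correction of Eq. (23); masses in GeV."""
    prefactor = 2.0 * alpha_s / (3.0 * math.pi)
    return prefactor * mu * m_gluino * tan_beta * loop_I(
        m_sbottom1 ** 2, m_sbottom2 ** 2, m_gluino ** 2)


def delta_mb_chargino(y_t, mu, a_t, tan_beta, m_stop1, m_stop2):
    """Stop-chargino correction of Eq. (24); masses in GeV."""
    prefactor = y_t ** 2 / (16.0 * math.pi ** 2)
    return prefactor * mu * a_t * tan_beta * loop_I(
        m_stop1 ** 2, m_stop2 ** 2, mu ** 2)
```

With positive $\mu$ and negative $A_t$ the first function returns a positive number and the second a negative one, which is the partial cancellation described in the text.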
In magnitude the gluino loop tends to dominate the chargino loop due to the largeness of $\alpha_s m_{\tilde g}$. Since $A_t$ is always negative in the cases we consider, the chargino loop does partly cancel the gluino loop, but overall the sparticle loops contribute with a positive sign in the case of positive $\mu$ ($I(x, y, z)$ is positive). This means that, in order to account for the experimentally observed value of the bottom mass, the bottom Yukawa coupling $y_b$ must decrease. In the CMSSM case, this decrease in $y_b$ ruins the prospect of bottom-tau Yukawa unification. With variations in the D-terms and/or $m_{10}/m_{16}$ the squark masses change somewhat (as detailed in Section 3.4), but $\mu$ changes more. The gluino mass $m_{\tilde g}$ is almost invariant. Both of these contributions grow with $\mu$ times a loop function, and as $\mu$ increases the overall contribution will tend to increase. Additionally, in most of the CMSSM parameter space, there exists a hierarchy where the squark masses are heavier than $\mu$. In this case the function $I(m_{\tilde b_1}^2, m_{\tilde b_2}^2, \mu^2)$ decreases as $\mu$ increases, helping to suppress the chargino contribution relative to the gluino contribution and thus enhancing the correction to $m_b$ and resulting in an even smaller $y_b$. The best chance of obtaining bottom-tau Yukawa unification is therefore with $m_{10}/m_{16} = 1.25$ and $D = -0.4$, but even in this case, throughout the $(m_{1/2}, m_{16})$-plane we are still far from unification, off by around 25% at best. Other authors [48,52,58] have found that, even allowing the D-terms and $m_{10}/m_{16}$ to vary much more than we have, it is impossible to reconcile Yukawa unification with the positive $\mu$ preferred by $(g-2)_\mu$ and $B(b \to X_s\gamma)$. They have found, however, that if one splits the Higgs masses at the GUT scale while leaving the D-terms zero, such unification is possible. One group [58] find that even then there are problems fulfilling the neutralino relic density constraint due to the large masses involved in the favoured region of parameter space. In the same paper they discuss possible solutions to this problem. However, it may be due to the limitations of the bottom-up procedure used in their (and indeed our) analysis, which often fails to converge close to the region of incorrect EWSB. Another group [48,49], using a top-down algorithm, find no such problems and have identified regions of this parameter space consistent with both Yukawa unification and an acceptable neutralino relic density.
5. Calculation and constraints

We use SOFTSUSY v.1.8.7 [85], one of several publicly available codes, to calculate the sparticle spectrum and mixings. The code has been augmented to include our SO(10)-inspired boundary conditions and a routine for the calculation of the SUSY contribution to the muon anomalous magnetic moment using the formulae in [84]. SOFTSUSY uses a bottom-up routine in which various low energy observables such as $M_Z$, fermion masses and gauge couplings are input as constraints in addition to the GUT scale boundary conditions. An iterative algorithm proceeds from an initial guess to find a set of sparticle masses and mixings consistent with the high and low scale constraints. We use full 2-loop renormalization group equations for the gauge and Yukawa couplings and the $\mu$ parameter. For the soft masses we use the full 1-loop RGEs and include the 2-loop contributions in the 3rd family approximation. Full details can be found in [85]. A comparison between SOFTSUSY and similar programs, for example [86-88], was made in [89] and one can directly compare the codes online at [90].
For the calculation of the neutralino relic density and $B(b \to X_s\gamma)$ we use micrOMEGAs v.1.3.1 [91], linked to SOFTSUSY via an interface conforming with the Les Houches Accord [92] standard that contains all the relevant parameters from SOFTSUSY necessary for the relic density calculation. For details of these calculations, see [91] and the papers on which they were based [93-102].
In our analysis we impose the following constraints.
Direct searches:
The following lower limits from LEP provide the strongest constraints on sparticle masses from direct searches [103]:
$$m_{\tilde\chi_1^\pm} \geq 103\ {\rm GeV}, \qquad m_{\tilde e_R} \geq 99\ {\rm GeV}.$$
We include these lower bounds in our plots.

Muon anomalous magnetic moment:
We include the $2\sigma$ bounds on the discrepancy between experiment and Standard Model theory assuming the latest results of the calculation based on $e^+e^-$ data for the hadronic contribution [104] and the most recent data from the BNL E821 experiment incorporating the results from negative muons [105]. We use the values from [106] which include the recently recalculated $\alpha^4$ QED correction [107] and the most recent hadronic light-by-light contribution [108]. Similar values were obtained by an independent calculation [109]. However, this second paper does not take the new theoretical results [107,108] into account. From [106],
$$a_\mu^{\rm exp} - a_\mu^{\rm SM} = (24.5 \pm 9.0) \times 10^{-10},$$
where $a_\mu \equiv (g-2)_\mu/2$. We use the $2\sigma$ bound,
$$6.5 \times 10^{-10} < a_\mu < 42.5 \times 10^{-10},$$
as the allowed range of the SUSY contribution. Due to the inconsistency between these results and those obtained by using $\tau$ decay data, and taking into account the susceptibility to change of the measurement of the $e^+e^-$ cross section [104], the $(g-2)_\mu$ constraint should perhaps be viewed more provisionally than the others. This is unfortunate since it is one of the most important, being the only one that unambiguously determines the sign of $\mu$.
Branching ratio $B(b \to X_s\gamma)$:
The most recent world average for the branching ratio is [110]
$$B(b \to X_s\gamma)_{\rm exp} = (3.34 \pm 0.38) \times 10^{-4},$$
while the current Standard Model theory value is [111] (see footnote 5)
$$B(b \to X_s\gamma)_{\rm SM} = (3.70 \pm 0.30) \times 10^{-4}.$$
We will use this Standard Model estimate of the theoretical error as representative of the error to be expected in our calculation, which includes both Standard Model and SUSY contributions. We do this by combining the experimental and theoretical errors in quadrature to obtain the following upper and lower bounds on the branching ratio at $2\sigma$:
$$2.40 \times 10^{-4} < B(b \to X_s\gamma) < 4.28 \times 10^{-4}.$$
Neutralino dark matter:
The analysis of the data from WMAP gives a best fit value for the matter density of the universe of $\Omega_m h^2 = 0.135^{+0.008}_{-0.009}$ and for the baryon density, $\Omega_b h^2 = 0.0224 \pm 0.0009$ [112]. This implies that the CDM density is
$$\Omega_{\rm CDM} h^2 = 0.1126^{+0.0161}_{-0.0181}$$
at the $2\sigma$ level. This can be an extremely stringent bound on the MSSM parameter space, especially in the case of small $\tan\beta$. However, for large $\tan\beta$ it is less restrictive due to the presence of the $A^0$ Higgs resonance, and much less so if we allow for a source of cold dark matter other than neutralinos such as axions, or some relic density enhancement mechanism such as non-thermal production of neutralinos (see [113] and references therein for more examples). In these instances, the lower bound on $\Omega_m h^2$ can be neglected. We plot values for which
$$0.0945 < \Omega_{\rm CDM} h^2 < 0.1287,$$
and indicate the allowed regions if we choose to discard the lower bound. We also plot the locus of points for which $m_{A^0} = 2m_{\tilde\chi_1^0}$, marking the position of the $A^0$ resonance.
5 This takes into account only those results that include the improved ratio $m_c^{\overline{\rm MS}}(m_b/2)/m_b^{\rm pole}$ as opposed to $m_c^{\rm pole}/m_b^{\rm pole}$ in the $\langle X_s\gamma|(\bar s c)_{V-A}(\bar c b)_{V-A}|b\rangle$ matrix element. For details see [97].
Lightest Higgs mass $m_{h^0}$:

Where appropriate, we also display the contour
$$m_{h^0} = 114.1\ {\rm GeV},$$
corresponding to the LEP bound on the lightest SM Higgs boson [103] in the regions of parameter space where the lightest MSSM Higgs boson is Standard Model like, i.e. $\sin(\beta - \alpha)$ is almost exactly equal to 1, where $\alpha$ is the mixing angle relating the mass eigenstates to the gauge eigenstates in the CP-even neutral Higgs sector. As we shall see in the following analysis, there are regions of parameter space where the mass of the heavy CP-even Higgs boson $m_{H^0}$ becomes comparable to $m_{h^0}$ and $\sin(\beta - \alpha)$ is not close to unity. In these regions the LEP bound on the lightest Higgs does not apply and we omit the contour in our plots.
Correct EWSB/tachyons:
The boundary on which $|\mu|^2$ vanishes, marking the border of correct radiative electroweak symmetry breaking, has been plotted. In the region where $|\mu|^2 < 0$ a global minimum of the two-loop effective Higgs potential cannot be found. Similarly, any regions in which $m_{A^0}^2 < 0$, also signaling that the electroweak symmetry has not been broken correctly, have been excluded. Regions with tachyonic sfermions are likewise omitted.
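The $2\sigma$ windows quoted above for the muon anomalous magnetic moment and for the CDM density can be reproduced with a few lines of arithmetic. The sketch below assumes that the CDM density is the difference of the matter and baryon densities and that their errors are combined in quadrature, an assumption that happens to reproduce the quoted numbers. The helper names are hypothetical.

```python
import math


def quadrature(*errors: float) -> float:
    """Combine independent 1-sigma errors in quadrature."""
    return math.sqrt(sum(e * e for e in errors))


def window(central, sigma_up, sigma_down=None, n_sigma=2.0):
    """Return (low, high) for central +sigma_up/-sigma_down at n_sigma."""
    if sigma_down is None:
        sigma_down = sigma_up
    return central - n_sigma * sigma_down, central + n_sigma * sigma_up


# Muon anomalous magnetic moment discrepancy, in units of 1e-10.
amu_low, amu_high = window(24.5, 9.0)

# CDM density: matter density minus baryon density (WMAP values from the text).
omega_cdm = 0.135 - 0.0224
err_up = quadrature(0.008, 0.0009)    # upper matter error with baryon error
err_down = quadrature(0.009, 0.0009)  # lower matter error with baryon error
cdm_low, cdm_high = window(omega_cdm, err_up, err_down)

print(f"a_mu window:        {amu_low:.1f} to {amu_high:.1f} (x 1e-10)")
print(f"Omega_CDM h^2 window: {cdm_low:.4f} to {cdm_high:.4f}")
```

The same `window` helper could be applied to the branching ratio, although the exact error treatment behind the quoted $B(b \to X_s\gamma)$ range is not spelled out in enough detail here to reproduce it digit for digit.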
6. The constrained parameter space

In all of the following plots we take ${\rm sign}(\mu) > 0$, necessary to obtain the observed sign of $(g-2)_\mu$, and we take $A_0 = 0$ for simplicity, choosing to concentrate on the effects of varying $D^2$ and $m_{10}/m_{16}$. Note that we do not demand Yukawa unification as an additional constraint. For $\mu > 0$ we were unable to find any regions of parameter space consistent with Yukawa unification. Although in principle $m_{10}/m_{16}$ and $D^2$ can take on relatively large values, we choose to explore the effects of fairly small deviations from universality. As a result there remains a close connection between the $(m_{1/2}, m_0)$-plane of the CMSSM and the $(m_{1/2}, m_{16})$-plane in the following analysis, i.e. for the most part, the sparticle spectrum will be similar in each case for corresponding points in the two planes, with possible important exceptions in the Higgs and chargino/neutralino sectors. Therefore we can still make a meaningful comparison between our SO(10) scenario and the CMSSM. Our initial scan of the parameter space revealed that the most important effects only appear for large values of $\tan\beta$ and so we set $\tan\beta = 50$ throughout. For this case, we will show that even small deviations from the CMSSM can lead to large changes in the topography of the allowed regions in the $(m_{1/2}, m_{16})$-plane.
Fig. 9 shows the $(m_{1/2}, m_{16})$-plane in the standard CMSSM case. D-terms are set to zero and $m_{10} = m_{16}$. One can see the usual features: the region where $m_{1/2} \lesssim 150$ GeV is ruled out for any value of $m_{16}$ due to the LEP mass bound on the lightest chargino; a large triangular area where $m_{16} \lesssim 0.5\,m_{1/2}$ is ruled out because the LSP is a stau; the quarter-egg-shaped region at small $m_{1/2}$ is excluded by the lower bound on $B(b \to X_s\gamma)$; the LEP lower bound on the lightest Higgs boson mass, valid for a Standard Model-like Higgs; the arcs representing the $1\sigma$ and $2\sigma$ favoured regions for the muon anomalous magnetic moment at small to moderate $m_{1/2}$ and $m_{16}$; the $A^0$ resonance around the region where $2m_{\tilde\chi_1^0} = m_{A^0}$ and rapid annihilation can occur via an s-channel $A^0$ (or, sub-dominantly, $H^0$), satisfying the upper bound on $\Omega_{\rm CDM} h^2$; finally, there is the co-annihilation tail along the boundary demarcating $m_{\tilde\chi_1^0} = m_{\tilde\tau_1}$. As usual, the region allowed by all the constraints, if we take them all seriously, remains a narrow strip at fairly low $m_{1/2}$ and $m_{16}$, mainly dictated by the highly stringent WMAP bounds on $\Omega_{\rm CDM} h^2$.

Fig. 9. This plot shows contours in the $(m_{1/2}, m_{16})$-plane for a variety of constraints. In this and subsequent plots we have $A_0 = 0$, $\mu > 0$ and $\tan\beta = 50$. In this figure we show the CMSSM parameter space: $m_{10}/m_{16} = 1$ and $D = 0$. The blue (dark grey) strip and the region to the left of the black line at low $m_{1/2}$ are excluded by the LEP bounds on $m_{\tilde\chi_1^\pm}$ and $m_{h^0}$, respectively; in the grey triangular region at small $m_{16}$ the LSP is the $\tilde\tau$; the purple (medium grey) region extending out to $m_{1/2} \approx 500$ GeV is ruled out by $B(b \to X_s\gamma)$; the orange (light grey) and yellow (very light grey) bands are the $(g-2)_\mu$ $1\sigma$ and $2\sigma$ favoured regions; the narrow crimson (darkish grey) curve satisfies the WMAP bounds; the dark red (dark grey) labeled line shows the exact position of the $A^0$ resonance; the light pink (very, very light grey) region is allowed by WMAP if there exists another source of CDM; finally, any white regions are ruled out by the WMAP upper bound.
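The way the individual constraints of Section 5 combine into an allowed region can be made concrete with a small classification routine. This is only a sketch under stated assumptions: the class and field names are hypothetical, the numerical bounds are the ones quoted in the text, and a real scan would take these observables from the spectrum and relic density codes rather than from hand-entered values.

```python
from dataclasses import dataclass


@dataclass
class ParameterPoint:
    """Observables at one (m_1/2, m_16) point; field names are hypothetical."""
    m_chargino1: float    # lightest chargino mass [GeV]
    m_selectron_r: float  # right-handed selectron mass [GeV]
    a_mu: float           # SUSY contribution to a_mu [units of 1e-10]
    br_bsgamma: float     # B(b -> X_s gamma) [units of 1e-4]
    omega_cdm_h2: float   # neutralino relic density
    neutralino_lsp: bool  # False if, for example, the stau is the LSP


def failed_constraints(p: ParameterPoint, use_relic_lower_bound: bool = True):
    """Return the names of all constraints violated at this point."""
    failed = []
    if p.m_chargino1 < 103.0:
        failed.append("LEP chargino mass")
    if p.m_selectron_r < 99.0:
        failed.append("LEP selectron mass")
    if not p.neutralino_lsp:
        failed.append("charged LSP")
    if not 6.5 < p.a_mu < 42.5:
        failed.append("(g-2)_mu 2-sigma window")
    if not 2.40 < p.br_bsgamma < 4.28:
        failed.append("b -> X_s gamma 2-sigma window")
    if p.omega_cdm_h2 >= 0.1287:
        failed.append("WMAP upper bound")
    if use_relic_lower_bound and p.omega_cdm_h2 <= 0.0945:
        failed.append("WMAP lower bound")
    return failed
```

Calling `failed_constraints(point, use_relic_lower_bound=False)` corresponds to the case in which another source of cold dark matter is allowed, so that only the upper WMAP bound is imposed.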
Next, in Fig. 10 we look at the same plot, this time varying $m_{10}/m_{16}$ while keeping $D = 0$. Fig. 10(a) shows $m_{10}/m_{16} = 0.75$. First of all, due to the resulting increase in $\mu$, the $B(b \to X_s\gamma)$ exclusion region has shrunk, especially for $m_{16} \gg m_{1/2}$, as would be expected from our earlier analysis in Section 4.2. This is because the chargino amplitude no longer over-cancels the Standard Model amplitude. The $(g-2)_\mu$ preferred regions follow a similar pattern (see Section 4.1 for details). Since $M_2$ is more or less unaltered and in this case dominates the mass of the lightest chargino, the region excluded by the LEP bound on the $\tilde\chi_1^\pm$ is almost unchanged. The stau LSP triangle is a little less restrictive since the $X_i$ factors in the soft mass RGEs (see Section 3) are smaller and the lighter right-handed stau is slightly heavier as a result. Due to overall heavier $A^0$ masses, the contours of equal $m_{A^0}$ move to the left and, as a result of where they overlap with the contours of equal $m_{\tilde\chi_1^0}$ along the line $m_{A^0} = 2m_{\tilde\chi_1^0}$, the $A^0$ resonance rapid annihilation funnel is at a flatter angle as it emerges from the stau LSP boundary. The end result is a larger relic density and a smaller allowed region.
Looking now at Fig. 10(b) where $m_{10}/m_{16} = 1.25$, we see a huge change. First of all, a large region appears in the upper-left of the parameter space which is excluded by the constraint $|\mu|^2 > 0$ at the EWSB scale. Since $\mu$ is very small in the parameter space bordering this region, the lightest chargino becomes Higgsino-like and a thin strip appears along which the $m_{\tilde\chi_1^\pm} > 103$ GeV bound is violated. The $b \to X_s\gamma$ excluded region along with the zones of preferred $(g-2)_\mu$ are enhanced, noticeably so at higher $m_{16}$ towards the edge of the EWSB limit where $\mu$ is rapidly decreasing. Most interestingly, the enhancement of the $A^0$ resonance annihilation of neutralinos has opened up the whole of the parameter space preferred by $(g-2)_\mu$ and allowed by the other constraints, at least when one ignores the lower bound on $\Omega_{\rm CDM} h^2$. There are two main reasons for this. The first is that the $A^0$ resonance, marked by the contour $m_{A^0} = 2m_{\tilde\chi_1^0}$, is steeper and exhibits a characteristic kink, thus bringing it to smaller values of $m_{1/2}$. The kink originates from a complex interplay in the $(m_{1/2}, m_{16})$-plane near the EWSB boundary, between the difference $(m_{H_1}^2 - m_{H_2}^2)$ (dependent on a particular combination of the RGE factors $X_t$ and $X_b$, discussed in Section 3.1, and which controls the evolution of $m_{A^0}$) and the absolute value of $m_{H_2}^2$ (dependent on $X_t$ alone and which determines $\mu$ or equivalently the Higgsino component and mass of the lightest neutralino). With the $A^0$ resonance at smaller $m_{1/2}$, it is more effective at reducing $\Omega_{\rm CDM} h^2$, which is proportional to $m_{\tilde\chi_1^0}$ and therefore, for most of the parameter space (away from the EWSB boundary), roughly proportional to $m_{1/2}$. The second reason is that neutralino LSPs in the region of parameter space close to the boundary where $|\mu| \approx 0$ have large Higgsino components (as can be inferred from Eq. (16)). Since this region is at much lower $m_{16}$ and much closer to the $A^0$ resonance than in the CMSSM case the effects are very much greater, namely that the neutralino LSP's coupling to the $A^0$ and also its coupling to gauge bosons are greatly enhanced. Next to the $m_{\tilde\chi_1^\pm} = 103$ GeV boundary there is significant co-annihilation with charginos. All this conspires to produce a greatly diminished relic density over much of the parameter space, despite the fact that the GUT scale boundary conditions do not deviate massively from the universal case.

Fig. 10. Same as Fig. 9, but showing the effects of varying $m_{10}/m_{16}$ only; $D = 0$. In (a) $m_{10}/m_{16} = 0.75$ and in (b) $m_{10}/m_{16} = 1.25$. The new grey triangular region at small $m_{1/2}$, large $m_{16}$ in (b) is ruled out by $|\mu|^2 < 0$, indicating that for this region there is no solution to EWSB.
In order to compare with Figs. 9 and 10, Figs. 11-14 show the $(m_{1/2}, m_{16})$-plane for (a) $m_{10}/m_{16} = 1$, (b) $m_{10}/m_{16} = 0.75$ and (c) $m_{10}/m_{16} = 1.25$, this time in order of increasing $D$.
Fig. 11 shows the case of $D = -0.4$. All three graphs are fairly similar although very different from the CMSSM plot. Here, several curious things have happened. The area in which EWSB does not occur is due to $m_{A^0}^2 < 0$ this time. $(g-2)_\mu$ is slightly smaller than in the $D = 0$ scenario, as expected from Section 4.1. Also $B(b \to X_s\gamma)$ is substantially less restrictive at low $m_{16}$ and the excluded region drops off sharply next to the $m_{A^0}^2 = 0$ boundary where the charged Higgs diagram successfully cancels out the large chargino diagram. At larger values of $m_{16}$, still along the edge of the $m_{A^0}^2 = 0$ boundary, there is a narrow strip excluded by the observed value for $B(b \to X_s\gamma)$, this time because the Higgs component of the amplitude is overcompensating for the chargino component, interfering constructively with the Standard Model contribution and giving too large a value. For $D = -0.4$, the stau LSP excluded region is at its largest due to the lighter right-handed stau receiving a negative contribution from the negative D-term. The slope of this region is greatest for $m_{10}/m_{16} = 1.25$ where $X_\tau$ is at its largest, decreasing the stau mass even more. The $A^0$ resonance in this case has a completely different shape due to the fact that the Higgses are very light and beyond $m_{1/2} \approx 800$ GeV and $m_{16} \approx 450$ GeV, $m_{A^0} < 2m_{\tilde\chi_1^0}$ everywhere. The particulars of this situation produce a semi-circular curve in the bottom-left of the $(m_{1/2}, m_{16})$-plane and the corresponding relic density curves as shown. An enhancement of the Higgsino component of the LSP results in a slightly different shape for Fig. 11(c). Note, however, that in each of (a), (b) and (c), there is no boundary where $\mu$ vanishes ($m_{A^0}$ always gets there first) and there are no regions where co-annihilations involving charginos play a significant role.
Fig. 12 shows similar plots, but with $D = -0.2$. This time, the situation is more like the situation with $D = 0$ in that the no-EWSB region is due to $|\mu|^2 < 0$ rather than $m_{A^0}$ vanishing (although $m_{A^0}$ is very small close to the boundary compared to the CMSSM). For $m_{10}/m_{16} = 1$, Fig. 12(a), unlike in the CMSSM case, the D-terms help to keep $m_{H_2}^2$ above zero at the top-left-hand side of the plot and there is a slim region where $|\mu|^2 < 0$. Next to it, one can see at $m_{1/2} \approx 300$ GeV, $m_{16} \approx 2000$ GeV a tiny sliver of parameter space ruled out by the $B(b \to X_s\gamma)$ upper bound due to the existence of a light charged Higgs. The main $B(b \to X_s\gamma)$ exclusion and the $(g-2)_\mu$ preferred regions are basically indistinguishable from the CMSSM. Close to the no-EWSB boundary at moderate to large $m_{16}$ there is again significant neutralino annihilation and there exists a region of acceptable relic density due to the large LSP Higgsino component. As one moves away from this region, increasing $m_{1/2}$, the proportion of Higgsino drops off and the relic density increases above the WMAP upper bound. Increasing $m_{1/2}$ further, one eventually reaches the $A^0$ resonance where again the WMAP bounds are satisfied. This results in a large, slanted U-shaped band where $\Omega_{\rm CDM} h^2$ is perfectly adjusted to account for the WMAP observations. Fig. 12(b) is very similar to the case with zero D-terms, except that it has a larger $A^0$ resonance rapid-annihilation funnel. In Fig. 12(c) the $m_{A^0} = 2m_{\tilde\chi_1^0}$ kink is more pronounced, tending towards the semi-circular arc of the Fig. 11 plots.

Fig. 11. Same as Fig. 9, but with $D = -0.4$. In (a), $m_{10}/m_{16} = 1$, in (b) $m_{10}/m_{16} = 0.75$ and in (c) $m_{10}/m_{16} = 1.25$. The grey triangular region at small $m_{1/2}$, large $m_{16}$ in all three graphs is ruled out this time not because $|\mu|^2 < 0$, but because $m_{A^0}^2 < 0$, again indicating a failure of EWSB. N.B. close to this region the Higgs masses are all very small.

Fig. 12. Same as Fig. 9, but with $D = -0.2$. In (a), $m_{10}/m_{16} = 1$, in (b) $m_{10}/m_{16} = 0.75$ and in (c) $m_{10}/m_{16} = 1.25$. The grey triangular region at large $m_{16}$ in graphs (a) and (c) is ruled out because $|\mu|^2 < 0$ as for the case $D = 0$.
We note that close to the boundary where radiative EWSB fails in Figs. 9, 10 and 12 there is a sudden dip in the $(g-2)_\mu$ preferred region and the $b \to X_s\gamma$ excluded region where the Higgsino mass drops rapidly with $\mu$, becomes nearly degenerate with the wino, and large cancellations occur between the partial amplitudes corresponding to the two mass eigenstates. For the magnetic moment of the muon, although our approximate formulae no longer apply, one can infer this behaviour directly from the exact result, Eqs. (17) and (18), and inspection of the chargino mass matrix diagonalization equation, Eq. (14). When
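$\mu^2 \approx M_2^2 \gg M_W^2$, then $\theta_L \approx \theta_R \approx \pi/4$ and $m_{\tilde\chi_1^\pm} \approx m_{\tilde\chi_2^\pm} \approx M_2$ and the contributions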
cancel. Of course, this cancellation is not exact, and, in any case, we have neglected the
smuon-neutralino contributions.
Increasing $D$ to 0.2 and then to 0.4, we obtain the plots of Figs. 13 and 14. Here, and referring back to Figs. 9-12, we will remark on some general trends:
There is only a small change in $B(b \to X_s\gamma)$ which is most apparent at large $m_{16}$ and small $m_{1/2}$, due to the increased sensitivity of $\mu$ and $m_{A^0}$ to the input parameters in this region. For $m_{10}/m_{16} = 1$ or 0.75, it is at a maximum for $D \approx 0$. This happens because, for large negative $D^2$, a large contribution from a propagating $H^\pm$ results in a partial cancellation of the chargino contribution, thus preventing its over-destruction of the Standard Model amplitude, while at large positive $D^2$, although the charged Higgs diagram is comparatively negligible, the chargino loop too is much smaller due to the increasing masses. Both of these cases lead to a decrease in the excluded region. On the other hand, for $m_{10}/m_{16} = 1.25$, as one increases $D^2$ the $B(b \to X_s\gamma)$ exclusion zone continues to increase as the $|\mu|^2 = 0$ boundary recedes to smaller $m_{1/2}$ and larger $m_{16}$ and the effect of the charged Higgs decreases.
For $(g-2)_\mu$, there are three distinct situations regarding its behaviour when $D^2$ is varied, corresponding to the different values for $m_{10}/m_{16}$. For $m_{10}/m_{16} = 1$, as $D^2$ grows, at high $m_{16}$ and small $m_{1/2}$, $(g-2)_\mu$ decreases. This is due to the greater effect that $D^2$ has on $\mu$ in this region as noted at the end of Section 4.1. On the other hand, at points in the parameter space for which $m_{1/2} \approx m_{16}$, such as the $m_{1/2} = m_{16} = 500$ GeV point we analyzed in detail in Section 4.1, $(g-2)_\mu$ increases with $D^2$. The effect of increasing $D$ at $m_{10}/m_{16} = 1$, therefore, is to make the muon anomalous magnetic moment preferred region a little lower and slightly broader. For $m_{10}/m_{16} = 0.75$, there is no region close to the boundary of the $(g-2)_\mu$ favoured zone where $\mu$ is affected to such a large extent by the D-terms since we are significantly further away from where EWSB fails than in the previous case. As a result, the $(g-2)_\mu$ preferred region increases in all parts of the nearby parameter space as one increases $D^2$. Finally, for $m_{10}/m_{16} = 1.25$, increasing $D^2$ causes the $|\mu|^2 = 0$ boundary to be pushed back, opening a larger area of parameter space. This also results in a decrease in $m_{\tilde\nu}^2$. Consequently, the amplitude is increased over the whole region. However, note the dip in $(g-2)_\mu$ as $\mu$ becomes small, as commented on earlier.

Fig. 13. Same as Fig. 9, but with $D = 0.2$. In (a), $m_{10}/m_{16} = 1$, in (b) $m_{10}/m_{16} = 0.75$ and in (c) $m_{10}/m_{16} = 1.25$. The grey triangular region at large $m_{16}$ in (c) is ruled out because $|\mu|^2 < 0$ as for the case $D = 0$.

Fig. 14. Same as Fig. 9, but with $D = 0.4$. In (a), $m_{10}/m_{16} = 1$, in (b) $m_{10}/m_{16} = 0.75$ and in (c) $m_{10}/m_{16} = 1.25$. The grey triangular region at large $m_{16}$ in (c) is ruled out because $|\mu|^2 < 0$ as for the case $D = 0$.
For each value of $m_{10}/m_{16}$, increasing $D^2$ results in a smaller $\tilde\tau$ LSP region. This is because the lighter, right-handed stau mass steadily increases relative to the lightest neutralino mass since $m_{\tilde e_R}^2 = m_{16}^2 + g_{10}^2 D^2$.
As one increases $D^2$, $m_{A^0}$ increases and the $A^0$ resonance makes a more acute angle with the $m_{1/2}$ axis, emerging from the stau LSP excluded region at a larger value of $m_{1/2}$. Consequently, the neutralino dark matter density is much increased and the region allowed by WMAP is substantially reduced in this part of parameter space. For $D = 0.4$, in Fig. 14, the characteristic $A^0$ resonance peak has almost disappeared.
To make a final note on these plots, the relic density curves in the plot shown in Fig. 13(c) resemble those of Fig. 12(a). The difference is that the $|\mu|^2 < 0$ region is due to an increased $m_{10}/m_{16}$ in Fig. 13(c) rather than a negative $D^2$ in Fig. 12(a).
6.1. A cautionary note
It should be strongly emphasized here that the exact location and form of the relic density curves, the location of the EWSB excluded region and also of the $A^0$ resonance (on both of which the relic density strongly depends) are not only dependent on the variation of the mSugra parameters due to D-terms or Higgs-sfermion splitting or any other changes one may make, but also on the precise value of the top mass, which governs the highly important parameter $X_t$ and thus the Higgs soft masses. The value used in this paper is $m_t^{\rm POLE} = 174.3$ GeV, the value quoted by the PDG [103], which is now at odds with the recently released results from the D0 Collaboration [114] (and see also [115]), which puts the top mass at $m_t^{\rm POLE} = 178.0 \pm 4.3$ GeV. Even this 2% change would be enough to significantly change the form of our plots. However, even had we used this updated value for $m_t$, there is still the question of the 4.3 GeV error. Varying $m_t$ within this would again lead to very different results. Acknowledging this, we also point out that the position and shape of the WMAP preferred region, the $A^0$ resonance, and the $|\mu|^2 = 0$ region are also highly dependent on the exact value of $\tan\beta$. To keep things under control, we used a constant value of $\tan\beta = 50$ throughout this paper. However, by tweaking this value and the values we set for $D^2$ and $m_{10}/m_{16}$ to some small degree we would be able to account for any difference in the top mass to a good approximation. As such, we believe that while qualitatively correct, our results for the relic density, for the exact position of the boundary where EWSB fails and for the exact value of the $A^0$ mass, due to their strong dependence on $\tan\beta$ and $m_t$, and also for the values of $B(b \to X_s\gamma)$ and $(g-2)_\mu$ in the region of $\mu \approx 0$ or $m_{A^0} \approx 0$ (although they are reasonably stable elsewhere), should not be taken as quantitatively correct to any degree of accuracy. This is not a statement of the inaccuracy of the (fairly well-tested and state-of-the-art) computer programs we used, merely a statement about the sensitivity of certain observables to the input parameters, especially in the region of parameter space close to where the radiative EWSB mechanism fails. For this reason we have chosen to focus on the qualitative aspects of our results and their origins.
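As a quick check of the sizes involved, the shift between the two central top mass values and the quoted experimental error can be expressed as fractions of the mass itself. This is simple arithmetic on the numbers given above, not an estimate of the resulting change in any observable:

```latex
\frac{178.0 - 174.3}{174.3} = \frac{3.7}{174.3} \approx 0.021,
\qquad
\frac{4.3}{178.0} \approx 0.024 .
```

The shift in the central value and the remaining uncertainty are therefore of comparable size, each a little over 2%.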

In this paper we have discussed the effects that small deviations from universality, in the form of SO(10)-inspired U(1)$_X$ D-terms and Higgs-sfermion soft mass splitting, have on the parameter space of the CMSSM at large $\tan\beta$, with $\mu > 0$ and $A_0 = 0$.
In the first part of the paper we reviewed the origin of such terms before going on to follow, by use of various approximations, how the additional parameters $D \equiv {\rm sign}(D^2)\sqrt{|D^2|}/m_{16}$ and $m_{10}/m_{16}$ feed through the renormalization group equations, electroweak symmetry breaking conditions and sparticle mass matrices to effect changes in the sparticle mass spectrum and mixings.
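The signed, dimensionless D-term parameter can be written as a small helper, which makes explicit that a negative $D$ stands for a negative $D^2$. This is a minimal sketch assuming the definition $D = {\rm sign}(D^2)\sqrt{|D^2|}/m_{16}$ with $D^2$ in GeV$^2$ and $m_{16}$ in GeV; the function names are hypothetical.

```python
import math


def d_parameter(d_squared: float, m16: float) -> float:
    """Signed dimensionless D = sign(D^2) * sqrt(|D^2|) / m16."""
    return math.copysign(math.sqrt(abs(d_squared)), d_squared) / m16


def d_squared_from_parameter(d: float, m16: float) -> float:
    """Inverse map: recover the signed D^2 [GeV^2] from D and m16 [GeV]."""
    return math.copysign((d * m16) ** 2, d)


# Example: D^2 = -(200 GeV)^2 at m16 = 500 GeV corresponds to D = -0.4.
print(d_parameter(-200.0 ** 2, 500.0))
```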
It was noted that the D-terms are closely connected to the CP-odd $A^0$ Higgs mass and that $m_{10}/m_{16}$ has a large effect on the value of $\mu$ at the electroweak scale. At large negative values of $D^2$ or large values of $m_{10}/m_{16}$ one can expect the radiative electroweak symmetry breaking mechanism to break down due to $m_{A^0}^2 < 0$ in the former case and $|\mu|^2 < 0$ in the latter, and for this to be especially important for small $m_{1/2}$, large $m_{16}$. The D-terms affect the scalars at tree-level, especially the heavy Higgs $A^0$, $H^0$ and $H^\pm$, and the right-handed down-type squarks and left-handed sleptons, but largely cancel out of the RGEs. On the other hand, $m_{10}/m_{16}$ affects the Higgs scalar soft masses at tree-level and feeds through the renormalization group equations through large third family Yukawa couplings, noticeably affecting the third generation of squarks and sleptons, though most strongly influencing $\mu$ and the Higgsino-like charginos and neutralinos. We explained how, contrary to first impressions, increasing $m_{10}/m_{16}$ decreases $m_{A^0}$ as the indirect result of
bottom mass threshold corrections. We also gave approximate results for the mixing angles in the chargino sector and the components of the lightest neutralino, the Higgsino
components of which can be seen to grow with .
We analyzed the computed spectrum away from such regions at m1/2 = m16 = 500 GeV
for $D \in [-0.4, 0.4]$ and $m_{10}/m_{16} \in [0.75, 1.25]$ and found deviations in the sparticle
masses compared to the universal case resulting from additive effects at the GUT scale
coming from the D-terms, and from loop effects feeding through the RGEs as a result of
the Higgs-sfermion splitting. These changes were found to be very small for those masses
predominantly dependent on the gluino mass M3 , for example the gluino and the squarks
of the first two families (except the right-handed down-type squarks which receive larger
D-term corrections), but could be sizeable for other sparticles such as the third family
sleptons and the Higgsino-like charginos and neutralinos. It was noted that close to the
boundary where EWSB fails, and otherwise at regions with small m1/2 and large m16 , the
effects of D-terms and m10 /m16 can be much more important. We analyzed a set of points
in these more sensitive regions of parameter space and observed much larger deviations in
the mass spectrum from the universal case, especially in the heavy Higgs masses, and the
Higgsino-like charginos and neutralinos.
In the next section we analyzed how the D-terms and $m_{10}/m_{16}$ affect the dominant
$\tan\beta$-enhanced amplitudes of quantities such as $(g-2)_\mu$, $B(b \to X_s\gamma)$ and the SUSY
threshold corrections to the bottom mass. In general the effects were found to be relatively
small in much of the parameter space, but subtle, and in general larger in the small m1/2 ,
large m16 region.
For $(g-2)_\mu$ we gave a detailed quantitative approximation in Section 4.1, assuming
that $\mu^2, M_2^2 \gg M_W^2$ and taking the large $\tan\beta$ limit. It was observed that there is a trade-off
between the external factor and the loop functions present in the approximate formula,
and in regions where the approximation is valid it can be quite finely balanced especially in
the case of varying the D-terms, resulting in only small deviations from the CMSSM case.
We found that the effect of increasing $D^2$ away from the sensitive regions close to $|\mu|^2 = 0$
and $m_{A^0}^2 = 0$ and to where $m_{1/2} \ll m_{16}$ was to increase the amplitude due to the decreasing muon sneutrino mass despite the contrary effect of the increasing Higgsino mass. For
the particular point $m_{1/2} = m_{16} = 500$ GeV we gave numerical examples, verifying our
approximation. In other circumstances, however, often the effect of increasing $D^2$ was the
reverse, leading to a decrease in $(g-2)_\mu$. Increasing $m_{10}/m_{16}$ was always found to increase $(g-2)_\mu$ as a result of the decrease in the Higgsino mass, though a drop-off was
observed close to where radiative EWSB fails as a result of cancellation between the partial
amplitudes arising from the two chargino mass eigenstates. All such effects were, for the
most part, found to be relatively small.
In the case of $B(b \to X_s\gamma)$ it was discovered that for both $m_{10}/m_{16} = 1$ and 0.75, the
size of the excluded region of $B(b \to X_s\gamma)$ peaks for $D \approx 0$. This is because for positive $D^2$ the Higgsino mass increases and therefore the chargino contribution to the Wilson
coefficients decreases and there is no longer an over-cancellation of the Standard Model
contribution, while for negative $D^2$ the charged Higgs contribution grows rapidly and partially cancels the chargino amplitude again saving the successful Standard Model result
over a larger region of parameter space. Increasing m10 /m16 to 1.25 in general increases
the chargino contribution resulting in a larger excluded zone. Increasing D 2 in this case
pushes back the no EWSB boundary, decreases the charged Higgs contribution and results
in a larger forbidden region.
The SUSY threshold corrections to the bottom mass were found to decrease with increasing m10 /m16 and decreasing D 2 , a result of increasing the chargino contribution with
respect to the dominant gluino contribution which has the opposite sign. This trend is what
is needed for Yukawa unification favoured by GUT models, but the corrections were found
to be far too small to achieve this aim.
Increasing m10 /m16 and decreasing D 2 were found to increase the area excluded by the
existence of a stau LSP due to an associated decrease in the mass of the lighter right-handed
stau. However, the change is quite small.
The main effect of varying m10 /m16 and D 2 was to cause substantial alterations to the
position and shape of the curves of neutralino relic density required to satisfy the WMAP
bounds. This was found to be due to both large changes in the position and shape of the $A^0$
resonance region where $m_{A^0} \approx 2m_{\tilde\chi_1^0}$ resulting from large corrections to the $A^0$ Higgs and
the lightest neutralino mass (in the region where it changes from mainly bino to mainly
Higgsino), and a massively increased Higgsino component of the LSP close to the boundary where $\mu = 0$, strongly affecting its couplings to the $A^0$ Higgs and gauge bosons.
A cautionary note was made regarding the sensitivity of these results to the changing
value of $m_t$ and also to $\tan\beta$.
In conclusion we note that although corrections to universality like D-terms and Higgs-sfermion splitting are straightforward to implement at the input scale, by the time they
filter through to the sparticle masses and observables their effects can be complex and
often subtle. We have attempted to clarify exactly what these effects are and how they
originate. The various observables used to constrain low energy supersymmetric models
are susceptible to change significantly even under small corrections to the sfermion/Higgs
universality assumed in the CMSSM, especially observables sensitive to $\mu$ and $m_{A^0}$ such as
the neutralino relic density, and particularly in areas of parameter space close to where the
EWSB mechanism breaks down. As a result a much larger region of parameter space than
in the CMSSM becomes viable. However, from the point of view of a successful SO(10)
unified model, there is still the persistent problem of achieving Yukawa unification in the
$\mu > 0$ case, as has been previously noted by other authors [48,52]. We hope to address this
point in our future work.
Acknowledgements
The author would like to thank B.C. Allanach, R. Dermisek, M. Hirsch, C. Hugonie,
S. Profumo, W. Porod and especially G.G. Ross for useful discussions, and the E.C. for a
Marie Curie Training Site grant as part of the European Network Project Physics Beyond
the Standard Model (HPMT-CT-2000-00124).
References
[1] G.L. Kane, C.F. Kolda, L. Roszkowski, J.D. Wells, Phys. Rev. D 49 (1994) 6173, hep-ph/9312272.
[2] H. Baer, C. Balazs, J. Cosmol. Astropart. Phys. 0305 (2003) 006, hep-ph/0303114.
[3] J.R. Ellis, K.A. Olive, Y. Santoso, V.C. Spanos, Phys. Lett. B 565 (2003) 176, hep-ph/0303043.
[4] L. Roszkowski, R. Ruiz de Austri, T. Nihei, JHEP 0108 (2001) 024, hep-ph/0106334.
[5] A. Djouadi, M. Drees, J.L. Kneur, JHEP 0108 (2001) 055, hep-ph/0107316.
[6] L.E. Ibanez, Phys. Lett. B 118 (1982) 73.
[7] K. Inoue, A. Kakuto, H. Komatsu, S. Takeshita, Prog. Theor. Phys. 68 (1982) 927.
[8] A.H. Chamseddine, R. Arnowitt, P. Nath, Phys. Rev. Lett. 49 (1982) 970.
[9] R. Barbieri, S. Ferrara, C.A. Savoy, Phys. Lett. B 119 (1982) 343.
[10] H.P. Nilles, Phys. Lett. B 115 (1982) 193.
[11] N. Ohta, Prog. Theor. Phys. 70 (1983) 542.
[12] L.J. Hall, J. Lykken, S. Weinberg, Phys. Rev. D 27 (1983) 2359.
[13] S.K. Soni, H.A. Weldon, Phys. Lett. B 126 (1983) 215.
[14] H.P. Nilles, Nucl. Phys. B 217 (1983) 366.
[15] J.R. Ellis, D.V. Nanopoulos, K. Tamvakis, Phys. Lett. B 121 (1983) 123.
[16] L. Alvarez-Gaume, J. Polchinski, M.B. Wise, Nucl. Phys. B 221 (1983) 495.
[17] H.P. Nilles, Phys. Rep. 110 (1984) 1.
[18] H.E. Haber, G.L. Kane, Phys. Rep. 117 (1985) 75.
[19] N. Polonsky, A. Pomarol, Phys. Rev. Lett. 73 (1994) 2292, hep-ph/9406224.
[20] N. Polonsky, A. Pomarol, Phys. Rev. D 51 (1995) 6532, hep-ph/9410231.
[21] M. Drees, Phys. Lett. B 181 (1986) 279.
[22] J.S. Hagelin, S. Kelley, Nucl. Phys. B 342 (1990) 95.
[23] A.E. Faraggi, J.S. Hagelin, S. Kelley, D.V. Nanopoulos, Phys. Rev. D 45 (1992) 3272.
[24] A. Lleyda, C. Munoz, Phys. Lett. B 317 (1993) 82, hep-ph/9308208.
[25] Y. Kawamura, M. Tanaka, Prog. Theor. Phys. 91 (1994) 949.
[26] Y. Kawamura, H. Murayama, M. Yamaguchi, Phys. Lett. B 324 (1994) 52, hep-ph/9402254.
[27] Y. Kawamura, H. Murayama, M. Yamaguchi, Phys. Rev. D 51 (1995) 1337, hep-ph/9406245.
[28] R. Rattazzi, U. Sarid, L.J. Hall, hep-ph/9405313.
[29] C.F. Kolda, S.P. Martin, Phys. Rev. D 53 (1996) 3871, hep-ph/9503445.
[30] A. Pomarol, D. Tommasini, Nucl. Phys. B 466 (1996) 3, hep-ph/9507462.
[31] K.S. Babu, R.N. Mohapatra, Phys. Rev. Lett. 83 (1999) 2522, hep-ph/9906271.
[32] B. Murakami, K. Tobe, J.D. Wells, Phys. Lett. B 526 (2002) 157, hep-ph/0111003.
[33] S.F. King, G.G. Ross, Phys. Lett. B 520 (2001) 243, hep-ph/0108112.
[34] G.G. Ross, L. Velasco-Sevilla, Nucl. Phys. B 653 (2003) 3, hep-ph/0208218.
[35] T. Kobayashi, H. Nakano, H. Terao, K. Yoshioka, Prog. Theor. Phys. 110 (2003) 247, hep-ph/0211347.
[36] N. Maekawa, Phys. Lett. B 561 (2003) 273, hep-ph/0212141.
[37] S.F. King, G.G. Ross, Phys. Lett. B 574 (2003) 239, hep-ph/0307190.
[38] M.R. Ramage, G.G. Ross, hep-ph/0307389.
[39] D. Matalliotakis, H.P. Nilles, Nucl. Phys. B 435 (1995) 115, hep-ph/9407251.
[40] M. Olechowski, S. Pokorski, Phys. Lett. B 344 (1995) 201, hep-ph/9407404.
[41] H.C. Cheng, L.J. Hall, Phys. Rev. D 51 (1995) 5289, hep-ph/9411276.
[42] V. Berezinsky, et al., Astropart. Phys. 5 (1996) 1, hep-ph/9508249.
[43] P. Nath, R. Arnowitt, Phys. Rev. D 56 (1997) 2820, hep-ph/9701301.
[44] Y. Kawamura, T. Kobayashi, H. Shimabukuro, Phys. Lett. B 436 (1998) 108, hep-ph/9805336.
[45] H. Baer, M.A. Diaz, J. Ferrandis, X. Tata, Phys. Rev. D 61 (2000) 111701, hep-ph/9907211.
[46] H. Baer, et al., Phys. Rev. D 63 (2001) 015007, hep-ph/0005027.
[47] H. Baer, et al., Phys. Rev. D 64 (2001) 015002, hep-ph/0102156.
[48] T. Blazek, R. Dermisek, S. Raby, Phys. Rev. D 65 (2002) 115004, hep-ph/0201081.
[49] R. Dermisek, S. Raby, L. Roszkowski, R. Ruiz De Austri, JHEP 0304 (2003) 037, hep-ph/0304101.
[50] J.R. Ellis, T. Falk, K.A. Olive, Y. Santoso, Nucl. Phys. B 652 (2003) 259, hep-ph/0210205.
[51] J.R. Ellis, K.A. Olive, Y. Santoso, Phys. Lett. B 539 (2002) 107, hep-ph/0204192.
[52] D. Auto, et al., JHEP 0306 (2003) 023, hep-ph/0302155.
[53] C. Pallis, Nucl. Phys. B 678 (2004) 398, hep-ph/0304047.
[54] S. Profumo, Phys. Rev. D 68 (2003) 015006, hep-ph/0304071.
[55] B. Ananthanarayan, P.N. Pandita, Mod. Phys. Lett. A 19 (2004) 467, hep-ph/0312361.
[56] D.G. Cerdeno, C. Munoz, hep-ph/0405057.
[57] H. Baer, A. Belyaev, T. Krupovnickas, A. Mustafayev, hep-ph/0403214.
[58] D. Auto, H. Baer, A. Belyaev, T. Krupovnickas, hep-ph/0407165.
[59] H. Baer, A. Mustafayev, S. Profumo, A. Belyaev, X. Tata, hep-ph/0412059.
[60] J.C. Pati, A. Salam, Phys. Rev. D 10 (1974) 275.
[61] H. Georgi, S.L. Glashow, Phys. Rev. Lett. 32 (1974) 438.
[62] S.F. King, M. Oliveira, Phys. Rev. D 63 (2001) 015010, hep-ph/0008183.
[63] S.P. Martin, hep-ph/9709356.
[64] S.P. Martin, M.T. Vaughn, Phys. Rev. D 50 (1994) 2282, hep-ph/9311340.
[65] V.D. Barger, M.S. Berger, P. Ohmann, Phys. Rev. D 49 (1994) 4908, hep-ph/9311269.
[66] P. Chankowski, S. Pokorski, J. Rosiek, Nucl. Phys. B 423 (1994) 437, hep-ph/9303309.
[67] D.M. Pierce, J.A. Bagger, K.T. Matchev, R.-j. Zhang, Nucl. Phys. B 491 (1997) 3, hep-ph/9606211.
[68] J.L. Feng, K.T. Matchev, T. Moroi, Phys. Rev. Lett. 84 (2000) 2322, hep-ph/9908309.
[69] J.L. Feng, K.T. Matchev, T. Moroi, Phys. Rev. D 61 (2000) 075005, hep-ph/9909334.
[70] J.L. Feng, K.T. Matchev, F. Wilczek, Phys. Lett. B 482 (2000) 388, hep-ph/0004043.
[71] J.L. Feng, K.T. Matchev, Phys. Rev. D 63 (2001) 095003, hep-ph/0011356.
[72] H. Murayama, M. Olechowski, S. Pokorski, Phys. Lett. B 371 (1996) 57, hep-ph/9510327.
[73] R. Rattazzi, U. Sarid, Phys. Rev. D 53 (1996) 1553, hep-ph/9505428.
[74] U. Chattopadhyay, P. Nath, Phys. Rev. D 53 (1996) 1648, hep-ph/9507386.
[75] T. Moroi, Phys. Rev. D 53 (1996) 6565, hep-ph/9512396.
[76] J.L. Lopez, D.V. Nanopoulos, X. Wang, Phys. Rev. D 49 (1994) 366, hep-ph/9308336.
[77] M.A. Diaz, hep-ph/9309296.
[78] J.L. Lopez, D.V. Nanopoulos, G.T. Park, A. Zichichi, Phys. Rev. D 49 (1994) 355, hep-ph/9308266.
[79] F.M. Borzumati, Z. Phys. C 63 (1994) 291, hep-ph/9310212.
[80] N. Oshimo, Nucl. Phys. B 404 (1993) 20.
[81] L.J. Hall, R. Rattazzi, U. Sarid, Phys. Rev. D 50 (1994) 7048, hep-ph/9306309.
[82] R. Hempfling, Phys. Rev. D 49 (1994) 6168.
[83] M. Carena, M. Olechowski, S. Pokorski, C.E.M. Wagner, Nucl. Phys. B 426 (1994) 269, hep-ph/9402253.
[84] J. Hisano, T. Moroi, K. Tobe, M. Yamaguchi, Phys. Rev. D 53 (1996) 2442, hep-ph/9510309.
[85] B.C. Allanach, Comput. Phys. Commun. 143 (2002) 305, hep-ph/0104145.
[86] W. Porod, Comput. Phys. Commun. 153 (2003) 275, hep-ph/0301101.
[87] A. Djouadi, J.-L. Kneur, G. Moultaka, hep-ph/0211331.
[88] F.E. Paige, S.D. Protopescu, H. Baer, X. Tata, hep-ph/0312045.
[89] B.C. Allanach, S. Kraml, W. Porod, JHEP 0303 (2003) 016, hep-ph/0302102.
[90] S. Kraml, http://kraml.home.cern.ch/kraml/comparison/compare.html.
[91] G. Belanger, F. Boudjema, A. Pukhov, A. Semenov, hep-ph/0405253.
[92] P. Skands, et al., hep-ph/0311123.
[93] P. Gondolo, G. Gelmini, Nucl. Phys. B 360 (1991) 145.
[94] J. Edsjo, P. Gondolo, Phys. Rev. D 56 (1997) 1879, hep-ph/9704361.
[95] S. Bertolini, F. Borzumati, A. Masiero, G. Ridolfi, Nucl. Phys. B 353 (1991) 591.
[96] A.L. Kagan, M. Neubert, Eur. Phys. J. C 7 (1999) 5, hep-ph/9805303.
[97] P. Gambino, M. Misiak, Nucl. Phys. B 611 (2001) 338, hep-ph/0104034.
[98] G. Degrassi, P. Gambino, G.F. Giudice, JHEP 0012 (2000) 009, hep-ph/0009337.
[99] K.G. Chetyrkin, M. Misiak, M. Munz, Phys. Lett. B 400 (1997) 206, hep-ph/9612313.
[100] M. Ciuchini, G. Degrassi, P. Gambino, G.F. Giudice, Nucl. Phys. B 527 (1998) 21, hep-ph/9710335.
[101] M. Ciuchini, G. Degrassi, P. Gambino, G.F. Giudice, Nucl. Phys. B 534 (1998) 3, hep-ph/9806308.
[102] M. Carena, D. Garcia, U. Nierste, C.E.M. Wagner, Nucl. Phys. B 577 (2000) 88, hep-ph/9912516.
[103] Particle Data Group, S. Eidelman, et al., Phys. Lett. B 592 (2004) 1.
[104] CMD-2, R.R. Akhmetshin, et al., Phys. Lett. B 578 (2004) 285, hep-ex/0308008.
[105] Muon g-2, G.W. Bennett, et al., Phys. Rev. Lett. 92 (2004) 161802, hep-ex/0401008.
[106] K. Hagiwara, A.D. Martin, D. Nomura, T. Teubner, Phys. Rev. D 69 (2004) 093003, hep-ph/0312250.
[107] T. Kinoshita, M. Nio, hep-ph/0402206.
[108] K. Melnikov, A. Vainshtein, hep-ph/0312226.
[109] M. Davier, S. Eidelman, A. Hocker, Z. Zhang, Eur. Phys. J. C 31 (2003) 503, hep-ph/0308213.
[110] C. Jessop, SLAC-PUB-9610.
[111] K. Bieri, C. Greub, hep-ph/0310214.
[112] D.N. Spergel, et al., Astrophys. J. Suppl. 148 (2003) 175, astro-ph/0302209.
[113] S. Profumo, C.E. Yaguna, hep-ph/0407036.
[114] D0, V.M. Abazov, et al., Nature 429 (2004) 638, hep-ex/0406031.
[115] D0, V.M. Abazov, et al., hep-ex/0407005.
An approximation for NLO single Higgs boson inclusive transverse momentum distributions in hadron-hadron collisions
J. Smith a,1 , W.L. van Neerven b
a C.N. Yang Institute for Theoretical Physics, State University of New York at Stony Brook, NY 11794-3840, USA
b Instituut-Lorentz, University of Leiden, PO Box 9506, 2300 RA Leiden, The Netherlands
Received 20 January 2005; received in revised form 29 March 2005; accepted 13 May 2005
Abstract
In the framework of the gluon-gluon fusion process for Higgs boson production there are two different prescriptions. They are given by the exact process where the gluons couple via top-quark loops
to the Higgs boson and by the approximation where the top-quark mass $m_t$ is taken to infinity. In
the latter case the coupling of the gluons to the Higgs boson is described by an effective Lagrangian.
Both prescriptions have been used for the $2 \to 2$ body reactions to make predictions for Higgs boson
production at hadron colliders. In next-to-leading order only the effective Lagrangian approach has
been used to compute the single particle inclusive distributions. The exact computation of the latter
has not been done yet because the $n$-dimensional extensions of $2 \to 3$ processes are not calculated
and the two-loop virtual corrections are still missing. To remedy this we replace wherever possible
the Born cross sections in the asymptotic top-quark mass limit by their exact analogues. These cross
sections appear in the soft and virtual gluon contributions to the next-to-leading order distributions.
This approximation is inspired by the fact that soft-plus-virtual gluons constitute the bulk of the
higher order correction. Deviations from the asymptotic top-quark mass limit are discussed.
PACS: 12.38.-t; 13.85.-t; 14.80.Bn
E-mail address: smith@insti.physics.sunysb.edu (J. Smith).

1 Partially supported by the National Science Foundation grant PHY-0354776.
doi:10.1016/j.nuclphysb.2005.05.008
J. Smith, W.L. van Neerven / Nuclear Physics B 720 (2005) 182–202
1. Introduction
In the past few years many articles have appeared on searches for the Higgs boson and the reactions in which it is produced. One of them is the gluon-gluon fusion process. According to the standard model gluons do not interact directly with the Higgs boson but the coupling is mediated by a fermion loop. Since the coupling of the Higgs boson to fermions is proportional to the mass of the fermion the reaction proceeds mainly via a top-quark loop [1]. The lowest order loop is a triangle graph and the Higgs boson decay rates into two gluons or two photons were already calculated at the end of the seventies [2]. The first calculation in the gluon-gluon fusion model for the production process was done at the end of the eighties in [3–5] (see [6,7] for later references). Reactions like $g + g \to g + H$, $q + \bar{q} \to g + H$ and $q + g \to q + H$ were calculated. In particular the first reaction involves a box diagram leading to complicated dilogarithms already on the Born level. In the early nineties people succeeded in calculating the next-to-leading order (NLO) corrections to the total cross section which involved the computation of the two-loop triangular graph with an external Higgs boson [8]. The calculation could be greatly simplified by taking the infinite top-quark mass limit. In this limit the gluons couple directly to the Higgs boson and the Feynman rules are given by an effective Lagrangian. It turned out that the latter method gives a good description of the exact calculation [9] provided the Higgs boson mass $m_H$ and the transverse momentum $p_t$ are smaller than the top-quark mass $m_t$ [4,5,7]. In particular the total cross section receives its main contribution from small $p_t$. If the Higgs mass is not too large ($m_H < 2m_t$) the effective Lagrangian gives a good description of the total cross section, so that recently the next-to-next-to-leading order (NNLO) computation has also been completed [10–14]. However at Higgs masses and transverse momenta equal to or larger than the top-quark mass the differential cross sections calculated with the effective Lagrangian method start to deviate from the exact cross sections. This has been checked on the Born level in [4,5,7]. The investigation should now be done in NLO but the exact cross sections are not available. Differential distributions in NLO using the effective Lagrangian (or the $m_t \to \infty$ approach) have been calculated in [15–18]. In the same approach the resummation of the logarithmically enhanced contributions to $d\sigma/dp_t$ at small $p_t$ has been carried out in [19–21]. The first landmark calculation to get the full NLO differential distribution has been achieved in [22]. In the latter all matrix elements of the $2 \to 3$ processes were calculated exactly. These reactions even contain one-loop five-point functions. However the calculation of the graphs uses the helicity method in four dimensions. To compute the single particle inclusive process we need the matrix elements in $n$ dimensions. Moreover the two-loop virtual corrections, which are needed to cancel the infrared and collinear divergences, have not been calculated yet. Therefore we propose to make an approximation by replacing all Born contributions in the infinite top-quark mass limit by their exact analogues in the virtual-plus-soft corrections. However this is not sufficient. We also have to demonstrate that the soft-plus-virtual gluon approximation gives a good description of the differential cross section. Using a certain prescription we can show that this is really the case.
Our paper is organized as follows. In Section 2 we present the formulas for the exact cross sections and their analogues in the infinite top-quark mass limit. Then we make approximations for the partonic soft-plus-virtual and the soft-gluon cross sections. Finally we adopt a prescription for implementing these formulae in the hadronic $p_t$ distributions. In Section 3 we make comparisons between our approximate differential distributions and those which are derived in the limit $m_t \to \infty$.
2. Approximation to the exact differential cross section for Higgs production

The differential process we study is the semi-inclusive reaction with one Higgs boson $H$ in the final state
$$H_1(P_1) + H_2(P_2) \to H(q) + X. \tag{2.1}$$
Here $H_1$ and $H_2$ denote the incoming hadrons and $X$ represents an inclusive hadronic state. In our study we limit ourselves to $2 \to 2$ and $2 \to 3$ partonic subprocesses. The kinematics of the $2 \to 2$ reaction is
$$a(p_1) + b(p_2) \to c(p_3) + H(q), \qquad s = (p_1 + p_2)^2, \quad t = (p_1 - q)^2, \quad u = (p_2 - q)^2. \tag{2.2}$$
The exact calculations of the $2 \to 2$ processes are given in [3–7]. They consist of the following parton subprocesses
$$g + g \to g + H, \qquad q + \bar{q} \to g + H, \qquad q(\bar{q}) + g \to q(\bar{q}) + H. \tag{2.3}$$
Fig. 1. The exact process $g + g \to g + H$.
Fig. 2. The exact processes $q + \bar{q} \to g + H$ and $q(\bar{q}) + g \to q(\bar{q}) + H$.
The Born cross section for the $g + g \to g + H$ subprocess in Fig. 1 is equal to
$$s^2 \frac{d^2\sigma^{(1),\mathrm{exact}}_{gg \to gH}}{dt\,du} = \frac{\alpha_w \alpha_s^3}{16\pi}\, \frac{m_H^8}{s\,t\,u\,M_W^2}\, \frac{N}{N^2-1} \Big[ |A_2(s,t,u)|^2 + |A_2(u,s,t)|^2 + |A_2(t,u,s)|^2 + |A_4(s,t,u)|^2 \Big]\, \delta(s + t + u - m_H^2), \tag{2.4}$$
with
$$\alpha_w = \frac{e^2}{4\pi \sin^2\theta_W} = \frac{\sqrt{2}\, M_W^2 G_F}{\pi}, \tag{2.5}$$
denote the mass of the W and the Fermi constant, respectively. Further we want to mention
that N = 3 for QCD. The dimensionless functions A2 (s, t, u) and A4 (s, t, u) are given in
the appendix of [4]. The Born cross section for the q + q g + H subprocess in Fig. 2
equals
2
(1),exact
s2
d 2 q qgH
dt du
2
w s3 u2 + t 2 m4H N 2 1
A5 (s, t, u)
2
2
2
128 s(u + t) MW N

s + t + u m2H ,
(2.6)
where the function A5 (s, t, u) is given in the appendix of [4]. Finally the Born cross section
for the q(q)
+ g q(q)
+ H reaction becomes (see Fig. 2)
s2
(1),exact
d 2 qgqH
dt du
2

w s3 u2 + s 2 m4H 1
A5 (t, s, u) s + t + u m2H .
2
2
128 t (u + s) MW N
(2.7)
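In the limit of infinite top mass $m_t$ the functions above simplify enormously. Actually they can be derived from the effective Lagrangian
$$\mathcal{L}_{\mathrm{eff}} = G\,\Phi(x)\,O(x) \quad \text{with} \quad O(x) = -\frac{1}{4}\, G^a_{\mu\nu}(x)\, G^{a,\mu\nu}(x), \tag{2.8}$$
where $\Phi(x)$ represents the Higgs field and $G$ is an effective coupling constant given by
$$G^2 = \frac{\alpha_w\, \alpha_s^2(\mu_r^2)}{9\pi M_W^2}\; C^2\!\left(\alpha_s(\mu_r^2), \frac{\mu_r^2}{m_t^2}\right). \tag{2.9}$$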
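The quantity $C$ is the coefficient function which describes all QCD corrections to the top-quark loops in the limit $m_t \to \infty$. For external gluons, which are on-shell, the latter quantity has been computed up to order $\alpha_s$ in [8,9,23] and up to $\alpha_s^2$ in [24,25]. Up to second order it reads
$$C\!\left(\alpha_s(\mu_r^2), \frac{\mu_r^2}{m_t^2}\right) = 1 + \frac{\alpha_s^{(5)}(\mu_r^2)}{4\pi}\,(11) + \left(\frac{\alpha_s^{(5)}(\mu_r^2)}{4\pi}\right)^2 \left[ \frac{2777}{18} + 19 \ln\frac{\mu_r^2}{m_t^2} + n_f \left( -\frac{67}{6} + \frac{16}{3} \ln\frac{\mu_r^2}{m_t^2} \right) \right]. \tag{2.10}$$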
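As a numerical illustration of Eq. (2.10), the following minimal sketch evaluates the coefficient function through second order. The function name, argument names and the sample value $\alpha_s = 0.1$ are hypothetical choices made only for this example; the numerical coefficients are those of Eq. (2.10).

```python
import math


def coefficient_function(alpha_s, mu_r2, m_t2, n_f=5, order=2):
    """Evaluate C(alpha_s(mu_r^2), mu_r^2/m_t^2) of Eq. (2.10).

    alpha_s : five-flavour strong coupling at the scale mu_r (illustrative input)
    mu_r2   : renormalization scale squared, in GeV^2
    m_t2    : top-quark mass squared, in GeV^2
    n_f     : number of light flavours
    order   : highest power of alpha_s/(4 pi) that is kept (0, 1 or 2)
    """
    a = alpha_s / (4.0 * math.pi)
    log = math.log(mu_r2 / m_t2)
    c = 1.0
    if order >= 1:
        c += 11.0 * a
    if order >= 2:
        c += a * a * (2777.0 / 18.0 + 19.0 * log
                      + n_f * (-67.0 / 6.0 + 16.0 / 3.0 * log))
    return c


if __name__ == "__main__":
    m_t = 174.3  # GeV, the top-quark mass used in Section 3
    print(coefficient_function(0.1, m_t**2, m_t**2))
```

Truncating at first order and squaring gives $C^2 = 1 + 22\,\alpha_s/(4\pi)$ up to terms of order $\alpha_s^2$, which is the factor multiplying the LO cross section in Section 3.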
Fig. 3. The approximate process $g + g \to g + H$.
Fig. 4. The approximate processes $q + \bar{q} \to g + H$ and $q(\bar{q}) + g \to q(\bar{q}) + H$.
Here $\mu_r$ represents the renormalization scale and $n_f$ denotes the number of light flavours. Moreover $\alpha_s^{(5)}$ is presented in a five-flavour-number scheme. In the infinite top-quark mass limit the Feynman rules can be derived from Eq. (2.8). In that limit the Born cross sections become
$$s^2 \frac{d^2\sigma^{(1),m_t\to\infty}_{gg \to gH}}{dt\,du} = \frac{\alpha_w \alpha_s^3}{144\pi}\, \frac{N}{N^2-1}\, \frac{1}{s\,t\,u\,M_W^2} \left( s^4 + t^4 + u^4 + m_H^8 \right) \delta(s + t + u - m_H^2), \tag{2.11}$$
$$s^2 \frac{d^2\sigma^{(1),m_t\to\infty}_{q\bar{q} \to gH}}{dt\,du} = \frac{\alpha_w \alpha_s^3}{288\pi}\, \frac{N^2-1}{N^2}\, \frac{t^2 + u^2}{s\,M_W^2}\, \delta(s + t + u - m_H^2), \tag{2.12}$$
$$s^2 \frac{d^2\sigma^{(1),m_t\to\infty}_{qg \to qH}}{dt\,du} = \frac{\alpha_w \alpha_s^3}{288\pi}\, \frac{1}{N}\, \frac{u^2 + s^2}{-t\,M_W^2}\, \delta(s + t + u - m_H^2), \tag{2.13}$$
where the graphs are shown in Figs. 3 and 4. The next order gluonic corrections to the $2 \to 2$ reactions in Figs. 1 and 2 have not been calculated yet. Some of the graphs are shown in Fig. 5, which shows that the calculation will be very tedious. However we can make an approximation. In the infinite top-quark mass limit the soft-plus-virtual ($S + V$) cross sections can be written as [14]
$$s^2 \frac{d^2\sigma^{(2),S+V}_{ab \to cH}}{dt\,du} = \frac{\alpha_s}{4\pi}\, N(s,t,u,\Delta,\mu^2)\; s^2 \frac{d^2\sigma^{(1)}_{ab \to cH}}{dt\,du} + \delta(s + t + u - m_H^2)\, \frac{\alpha_s}{4\pi}\, K\, \big|MB^{(1)}_{ab \to cH}\big|^2. \tag{2.14}$$
Fig. 5. Samples of two-loop graphs contributing to $g + g \to g + H$.
Fig. 6. Samples of graphs contributing to $g + g \to g + g + H$.
Here $d^2\sigma^{(1)}$ denote the Born cross sections in Eqs. (2.11)–(2.13) and $MB^{(1)}_{ab \to cH}$ is a left over piece which is numerically very small. The term $N(s,t,u,\Delta,\mu^2)$ is a universal function which depends on the parameter $\Delta$ which serves as a momentum cut off for the infrared divergence. Finally $K$ denotes a combination of colour factors which vanishes in the supersymmetric limit $C_A = C_F = n_f = N$. Here $C_A$, $C_F$ are the standard colour factors in SU($N$). For more details see Eqs. (5.24)–(5.26) in [14]. Since $N(s,t,u,\Delta,\mu^2)$ is universal we replace the Born cross sections in the first term of Eq. (2.14) by the exact ones in Eqs. (2.4), (2.6), (2.7). In this way we get a better soft-plus-virtual gluon approximation for the Higgs boson cross section which is also valid for Higgs masses and transverse momenta $p_t$ larger than the top-quark mass $m_t$. The $2 \to 3$ reactions are denoted by
$$a(p_1) + b(p_2) \to c(p_3) + d(p_4) + H(q), \qquad s = (p_1 + p_2)^2, \quad t = (p_1 - q)^2, \quad u = (p_2 - q)^2, \quad s_4 = (p_3 + p_4)^2, \quad s_4 = s + t + u - m_H^2. \tag{2.15}$$
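The Born cross sections of Eqs. (2.11)–(2.13), which multiply the universal function in the first term of Eq. (2.14), are simple enough to sketch in code. The functions below return only the coefficient of $\delta(s + t + u - m_H^2)$. The function names, the overall normalization $\alpha_w\alpha_s^3/(144\pi)$ and $\alpha_w\alpha_s^3/(288\pi)$, and the sample inputs ($\alpha_w$, $\alpha_s$, $M_W$) are assumptions made for this illustration, not values taken from the text.

```python
import math


def born_gg_to_gH(s, t, u, mH2, alpha_w, alpha_s, MW, N=3):
    """Coefficient of delta(s+t+u-mH2) in s^2 d^2sigma/dt du for g g -> g H
    in the m_t -> infinity limit, following Eq. (2.11)."""
    pref = alpha_w * alpha_s**3 / (144.0 * math.pi)
    colour = N / (N * N - 1.0)
    return pref * colour * (s**4 + t**4 + u**4 + mH2**4) / (s * t * u * MW**2)


def born_qqbar_to_gH(s, t, u, mH2, alpha_w, alpha_s, MW, N=3):
    """Same for q qbar -> g H, following Eq. (2.12); mH2 is unused but kept
    so that the three functions share one signature."""
    pref = alpha_w * alpha_s**3 / (288.0 * math.pi)
    colour = (N * N - 1.0) / (N * N)
    return pref * colour * (t**2 + u**2) / (s * MW**2)


def born_qg_to_qH(s, t, u, mH2, alpha_w, alpha_s, MW, N=3):
    """Same for q g -> q H, following Eq. (2.13); t < 0, so -t is positive."""
    pref = alpha_w * alpha_s**3 / (288.0 * math.pi)
    return pref * (1.0 / N) * (u**2 + s**2) / (-t * MW**2)


if __name__ == "__main__":
    # Illustrative on-shell point: s + t + u = mH2 with t, u < 0 (GeV^2).
    mH2 = 120.0**2
    s, t = 500.0**2, -1.0e5
    u = mH2 - s - t
    args = (s, t, u, mH2, 0.0339, 0.12, 80.4)  # assumed alpha_w, alpha_s, M_W
    print(born_gg_to_gH(*args), born_qqbar_to_gH(*args), born_qg_to_qH(*args))
```

The exact cross sections of Eqs. (2.4), (2.6), (2.7) would replace these functions in the improved approximation; they require the functions $A_2$, $A_4$ and $A_5$ from the appendix of [4], which are not reproduced here.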
The matrix elements for the $2 \to 3$ processes have been exactly calculated in [22] although in four dimensions. Some of the graphs are shown in Fig. 6. However we need them in $n$ dimensions to regularize the infrared and collinear divergences (for $n = 4$ there is a problem, see [26]). Furthermore we also need the exact virtual corrections to cancel the infrared divergences. Since the latter are not calculated yet we can only make an approximation for the soft parts ($s_4 \to 0$) of the $2 \to 3$ processes. In the $m_t \to \infty$ limit these parts are
$$s^2 \frac{d^2\sigma^{(2),\mathrm{SOFT}}_{ab \to cdH}}{dt\,du} = \frac{\alpha_s}{4\pi}\, \frac{1}{s_4}\, I(s,t,u,s_4,\mu^2)\; s^2 \frac{d^2\sigma^{(1)}_{ab \to cH}}{dt\,du}, \tag{2.16}$$
where $d^2\sigma^{(1)}$ are the Born cross sections in the limit $m_t \to \infty$ given in Eqs. (2.11)–(2.13). The term $I(s,t,u,s_4,\mu^2)$ is a universal factor and contains simple functions which are proportional to $\ln(s_4/\mu^2)$. For more details see Eqs. (5.16)–(5.20) in [14]. We get a better approximation to the exact cross sections if we replace in Eq. (2.16) the cross sections in the $m_t \to \infty$ limit by the exact ones in Eqs. (2.4), (2.6), (2.7). The best next-to-leading order (NLO) cross section that one can achieve uses the exact lowest order cross sections in Eqs. (2.4), (2.6), (2.7) and, in next order, substitutes them in Eqs. (2.14) and (2.16). This relies upon the fact that the soft-plus-virtual gluon approximation is a very good substitute for the exact cross section. We know from our experience with the cross section in the infinite top-quark mass limit that this is really the case. This is revealed by a study of the transverse momentum $p_t$ and the rapidity $y$ distributions in Figs. 13–15 of [14].
Above pt = 100 GeV/c and mH 100 GeV/c2 the soft-plus-virtual gluon approximation
accounts for 80% of the cross section. However we can do even better. This becomes clear
if we look at the transverse momentum distribution
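$$\frac{d\sigma^{H_1H_2}}{dp_t}\left(S, p_t^2, m_H^2\right) = \sum_{a,b=q,g} \int_{x_{\min}}^{x_{\max}} dx\; \Phi^{H_1H_2}_{ab}(x, \mu^2)\, \frac{d\sigma_{ab}}{dp_t}\left(xS, p_t^2, m_H^2, \mu^2\right), \tag{2.17}$$
with
$$x_{\min} = \frac{m_H^2 + 2p_t^2 + 2\sqrt{p_t^2 (p_t^2 + m_H^2)}}{S}, \qquad x_{\max} = 1, \tag{2.18}$$
and $\Phi^{H_1H_2}_{ab}$ denotes the momentum fraction luminosity defined by
$$\Phi^{H_1H_2}_{ab}(x, \mu^2) = \int_x^1 dx_1 \int_{x/x_1}^1 dx_2\; \delta(x - x_1 x_2)\, f_a^{H_1}(x_1, \mu^2)\, f_b^{H_2}(x_2, \mu^2). \tag{2.19}$$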
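To make the luminosity of Eq. (2.19) concrete: integrating the delta function over $x_2$ leaves a single integral $\int_x^1 (dx_1/x_1)\, f_a(x_1)\, f_b(x/x_1)$. The sketch below evaluates it with a midpoint rule in $\ln x_1$. The function name and the toy parton densities in the example are hypothetical; they are not the parametrizations of [28,29].

```python
import math


def momentum_fraction_luminosity(x, f_a, f_b, n=2000):
    """Midpoint-rule estimate of Eq. (2.19) after the delta function has
    removed the x2 integral:  int_x^1 dx1/x1 f_a(x1) f_b(x/x1).

    f_a, f_b are callables returning parton densities at a fixed scale.
    The substitution v = ln(x1) turns dx1/x1 into dv on [ln x, 0].
    """
    lo = math.log(x)
    h = -lo / n
    total = 0.0
    for i in range(n):
        x1 = math.exp(lo + (i + 0.5) * h)
        total += f_a(x1) * f_b(x / x1)
    return total * h


if __name__ == "__main__":
    # Toy densities chosen only because the integral is known in closed form.
    flat = lambda z: 1.0
    print(momentum_fraction_luminosity(0.1, flat, flat), math.log(10.0))
```

With flat toy densities the result reduces to $-\ln x$, which gives a quick sanity check of the integration routine before realistic densities are plugged in.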
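However Eq. (2.17) can also be cast in the form (see [27])
$$\frac{d\sigma^{H_1H_2}}{dp_t}\left(S, p_t^2, m_H^2\right) = \sum_{a,b=q,g} \int_{x_{\min}}^{x_{\max}} dx\; \frac{x}{x_{\min}}\, \tilde{\Phi}^{H_1H_2}_{ab}(x, \mu^2)\; x_{\min}\, \frac{d\sigma_{ab}}{dp_t}\left(xS, p_t^2, m_H^2, \mu^2\right), \tag{2.20}$$
where $\tilde{\Phi}_{ab}$ is the parton luminosity given by
$$\tilde{\Phi}^{H_1H_2}_{ab}(x, \mu^2) = x^{-1}\, \Phi^{H_1H_2}_{ab}(x, \mu^2). \tag{2.21}$$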
If we consider the whole cross section it makes no difference which definition we are using.
However if we limit ourselves to the soft-plus-virtual gluon approximation and moreover
we set x/xmin = 1 in Eq. (2.20) we get a difference. In fact we enhance the small x region
which leads to an improvement of the approximation. This is mainly due to the fact that
the small $x$ gluons dominate the differential distributions as they already did in the total
cross section (see [11–14]). This will be shown in the next section.
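The lower integration limit of Eq. (2.18) is the partonic threshold for producing a Higgs boson with transverse momentum $p_t$: it can be rewritten as $x_{\min} S = \big(\sqrt{p_t^2 + m_H^2} + p_t\big)^2$. The following sketch evaluates it; the function name and the sample kinematic point are illustrative assumptions.

```python
import math


def x_min(pt, mH, sqrtS):
    """Lower limit of the x integration in Eq. (2.18).

    pt    : Higgs transverse momentum (GeV)
    mH    : Higgs mass (GeV)
    sqrtS : hadronic centre-of-mass energy (GeV)
    """
    S = sqrtS**2
    return (mH**2 + 2.0 * pt**2
            + 2.0 * math.sqrt(pt**2 * (pt**2 + mH**2))) / S


if __name__ == "__main__":
    # LHC energy of the text, with an illustrative (pt, mH) point.
    pt, mH, sqrtS = 100.0, 120.0, 14000.0
    xm = x_min(pt, mH, sqrtS)
    # Equivalent closed form: (transverse mass + pt)^2 / S.
    threshold = (math.sqrt(pt**2 + mH**2) + pt)**2 / sqrtS**2
    print(xm, threshold)
```

Because $x \ge x_{\min}$ over the whole integration range, the factor $x/x_{\min}$ in Eq. (2.20) is never smaller than one, so setting it to one changes the relative weight of the small and large $x$ regions.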
3. Differential distributions for the LHC and the Tevatron

In this section the hadronic differential distributions are presented for arbitrary Higgs mass $m_H$ and top-quark mass $m_t$. We compare the results for the NLO differential cross sections in the infinite $m_t$ limit and in the approximation derived in the previous section, which is valid for arbitrary $m_t$. Since the latter is only defined for the transverse momentum we will limit ourselves to the $p_t$-distributions. In this paper we will study Higgs boson production in proton-proton collisions at the LHC ($\sqrt{S} = 14.0$ TeV) and proton-antiproton collisions at the Tevatron ($\sqrt{S} = 2.0$ TeV). The hadronic cross section is obtained from the partonic cross section as follows
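$$S^2 \frac{d^2\sigma^{H_1H_2}}{dT\,dU}\left(S, T, U, m_H^2\right) = \sum_{a,b=q,g} \int_{x_{1,\min}}^{1} \frac{dx_1}{x_1} \int_{x_{2,\min}}^{1} \frac{dx_2}{x_2}\; f_a^{H_1}(x_1, \mu^2)\, f_b^{H_2}(x_2, \mu^2)\; s^2 \frac{d^2\sigma_{ab}}{dt\,du}\left(s, t, u, m_H^2, \mu^2\right). \tag{3.1}$$
In analogy to Eq. (2.2) the hadronic kinematical variables are defined by
$$S = (P_1 + P_2)^2, \qquad T = (P_1 - q)^2, \qquad U = (P_2 - q)^2, \tag{3.2}$$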
where $P_1$ and $P_2$ denote the momenta of hadrons $H_1$ and $H_2$ respectively (see Eq. (2.1)). In the case parton $p_1$ emerges from hadron $H_1(P_1)$ and parton $p_2$ emerges from hadron $H_2(P_2)$ we can establish the following relations
$$p_1 = x_1 P_1, \quad p_2 = x_2 P_2, \quad s = x_1 x_2 S, \quad t = x_1 (T - m_H^2) + m_H^2, \quad u = x_2 (U - m_H^2) + m_H^2,$$
$$x_{1,\min} = \frac{-U}{S + T - m_H^2}, \qquad x_{2,\min} = \frac{-x_1 (T - m_H^2) - m_H^2}{x_1 S + U - m_H^2}. \tag{3.3}$$
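The relations of Eq. (3.3) can be checked numerically: at $x_2 = x_{2,\min}$ the partonic invariants sit exactly on the Born boundary $s + t + u = m_H^2$, and at $x_1 = x_{1,\min}$ the lower limit $x_{2,\min}$ reaches one. The sketch below implements the mapping; the function names and the sample kinematic point are assumptions made for this illustration.

```python
def partonic_invariants(x1, x2, S, T, U, mH2):
    """Map hadronic (S, T, U) to partonic (s, t, u) following Eq. (3.3)."""
    s = x1 * x2 * S
    t = x1 * (T - mH2) + mH2
    u = x2 * (U - mH2) + mH2
    return s, t, u


def x1_min(S, T, U, mH2):
    """Smallest x1 for which some x2 <= 1 satisfies s + t + u >= mH2."""
    return -U / (S + T - mH2)


def x2_min(x1, S, T, U, mH2):
    """Smallest x2 at fixed x1, i.e. the solution of s + t + u = mH2."""
    return (-x1 * (T - mH2) - mH2) / (x1 * S + U - mH2)


if __name__ == "__main__":
    # Illustrative point: sqrt(S) = 14 TeV, mH = 120 GeV, pt = 100 GeV, y = 0.
    S, mH2 = 14000.0**2, 120.0**2
    root = (S * (100.0**2 + mH2)) ** 0.5
    T = U = mH2 - root
    x1 = 0.05
    x2 = x2_min(x1, S, T, U, mH2)
    s, t, u = partonic_invariants(x1, x2, S, T, U, mH2)
    print(x1_min(S, T, U, mH2), x2, s + t + u - mH2)
```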
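From Eq. (3.1) one can obtain the $p_t$ and $y$ distributions. Neglecting the masses of the incoming hadrons we have the following relations
$$T = m_H^2 - \sqrt{S}\sqrt{p_t^2 + m_H^2}\, \cosh y + \sqrt{S}\sqrt{p_t^2 + m_H^2}\, \sinh y,$$
$$U = m_H^2 - \sqrt{S}\sqrt{p_t^2 + m_H^2}\, \cosh y - \sqrt{S}\sqrt{p_t^2 + m_H^2}\, \sinh y, \tag{3.4}$$
so that the cross section becomes
$$S\, \frac{d^2\sigma^{H_1H_2}}{dp_t^2\, dy}\left(S, p_t^2, y, m_H^2\right) = S^2 \frac{d^2\sigma^{H_1H_2}}{dT\,dU}\left(S, T, U, m_H^2\right). \tag{3.5}$$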
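The step from Eq. (3.4) to Eq. (3.5) rests on the Jacobian $|\partial(T,U)/\partial(p_t^2, y)| = S$. A minimal numerical check with central finite differences is sketched below; the function names, step size and sample point are hypothetical choices for this example.

```python
import math


def hadronic_TU(S, pt2, y, mH2):
    """Hadronic invariants T and U of Eq. (3.4) for massless incoming hadrons."""
    root = math.sqrt(S) * math.sqrt(pt2 + mH2)
    T = mH2 - root * math.cosh(y) + root * math.sinh(y)
    U = mH2 - root * math.cosh(y) - root * math.sinh(y)
    return T, U


def jacobian_TU(S, pt2, y, mH2, eps=1e-6):
    """|d(T,U)/d(pt^2,y)| estimated with central finite differences."""
    hp = eps * (pt2 + mH2)          # step in pt^2, scaled to the problem
    Tp, Up = hadronic_TU(S, pt2 + hp, y, mH2)
    Tm, Um = hadronic_TU(S, pt2 - hp, y, mH2)
    Typ, Uyp = hadronic_TU(S, pt2, y + eps, mH2)
    Tym, Uym = hadronic_TU(S, pt2, y - eps, mH2)
    dT_dp, dU_dp = (Tp - Tm) / (2.0 * hp), (Up - Um) / (2.0 * hp)
    dT_dy, dU_dy = (Typ - Tym) / (2.0 * eps), (Uyp - Uym) / (2.0 * eps)
    return abs(dT_dp * dU_dy - dT_dy * dU_dp)


if __name__ == "__main__":
    S, pt2, y, mH2 = 14000.0**2, 100.0**2, 0.7, 120.0**2
    print(jacobian_TU(S, pt2, y, mH2) / S)  # expected to be close to 1
```

The same relations give $(T - m_H^2)(U - m_H^2) = S\,(p_t^2 + m_H^2)$, which is the origin of the upper limit on $U$ in the kinematical boundaries.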
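The kinematical boundaries are
$$m_H^2 - S \le T \le 0, \qquad -S - T + m_H^2 \le U \le \frac{S\, m_H^2}{T - m_H^2} + m_H^2, \tag{3.6}$$
from which one can derive
$$0 \le p_t^2 \le p_{t,\max}^2, \qquad -\frac{1}{2} \ln\frac{S}{m_H^2} \le y \le \frac{1}{2} \ln\frac{S}{m_H^2}, \qquad \text{with} \quad p_{t,\max}^2 = \frac{(S + m_H^2)^2}{4S \cosh^2 y} - m_H^2, \tag{3.7}$$
or
$$0 \le p_t^2 \le \frac{(S - m_H^2)^2}{4S} \equiv p_{T,\max}^2, \qquad -y_{\max} \le y \le y_{\max},$$
$$\text{with} \quad y_{\max} = \frac{1}{2} \ln\frac{1 + \sqrt{1 - sq}}{1 - \sqrt{1 - sq}}, \qquad sq = \frac{4S\,(p_t^2 + m_H^2)}{(S + m_H^2)^2}. \tag{3.8}$$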
We can perform the integral over the rapidity and obtain the transverse momentum distribution
$$\frac{d\sigma^{H_1H_2}}{dp_t}\left(S, p_t^2, m_H^2\right) = \int_{-y_{\max}}^{y_{\max}} dy\; \frac{d^2\sigma^{H_1H_2}}{dp_t\, dy}\left(S, p_t^2, y, m_H^2\right), \tag{3.9}$$
with $y_{\max}$ given in Eq. (3.8). An alternative way to obtain the distribution above is given in Eq. (2.17). We checked that both procedures lead to the same numerical result.
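The two equivalent forms of the phase-space boundary, Eqs. (3.7) and (3.8), can be cross-checked numerically: inserting $y_{\max}(p_t^2)$ into $p_{t,\max}^2(y)$ must return the original $p_t^2$. The sketch below does this; the function names and the sample point are illustrative assumptions.

```python
import math


def y_max(S, pt2, mH2):
    """Rapidity limit of Eq. (3.8) at fixed pt^2."""
    sq = 4.0 * S * (pt2 + mH2) / (S + mH2) ** 2
    r = math.sqrt(1.0 - sq)
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


def pt2_max(S, y, mH2):
    """Transverse-momentum limit of Eq. (3.7) at fixed rapidity y."""
    return (S + mH2) ** 2 / (4.0 * S * math.cosh(y) ** 2) - mH2


if __name__ == "__main__":
    S, pt2, mH2 = 14000.0**2, 100.0**2, 120.0**2
    ym = y_max(S, pt2, mH2)
    print(ym, pt2_max(S, ym, mH2))   # second number should reproduce pt2
    print(pt2_max(S, 0.0, mH2), (S - mH2) ** 2 / (4.0 * S))
```

Equivalently, $\cosh y_{\max} = 1/\sqrt{sq}$, so the logarithm in Eq. (3.8) is just the inverse hyperbolic cosine of $1/\sqrt{sq}$.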
We define what we mean by leading order (LO) and next-to-leading order (NLO). In the infinite $m_t$ limit and in the exact computation the differential cross section in LO is defined by
$$\frac{d\sigma^{\mathrm{LO}}}{dp_t}\left(S, p_t^2, m_H^2\right) = \frac{d\sigma^{(1)}}{dp_t}\left(S, p_t^2, m_H^2\right), \tag{3.10}$$
where we shall denote the LO cross section in the infinite $m_t$ limit by $d\sigma^{\mathrm{LO},m_t\to\infty}/dp_t$. The partonic cross sections in the latter quantity are given in Eqs. (2.11)–(2.13). The exact LO cross section is represented by $d\sigma^{\mathrm{LO,exact}}/dp_t$ with the partonic cross sections in Eqs. (2.4), (2.6) and (2.7). The gluon-gluon-Higgs coupling is given by $G$ in Eq. (2.9) with $C = 1$. The top-quark mass is given by $m_t = 174.3$ GeV/$c^2$ and the Fermi constant $G_F = 1.16639 \times 10^{-5}$ GeV$^{-2}$ $= 4541.68$ pb in Eq. (2.5). We also adopt the leading logarithmic representation for the running coupling and the parton densities. For the latter we choose the parametrization according to [28] (namely set lo2002.dat) with $\Lambda_5^{\mathrm{LO}} = 167$ MeV and $n_f = 5$.
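The two quoted values of the Fermi constant are related by the conversion constant $(\hbar c)^2$. The short sketch below reproduces the number in picobarns; the constant name and the value $(\hbar c)^2 = 0.3893793721$ GeV$^2$ mb are inputs assumed here, not taken from the text.

```python
# (hbar c)^2 in GeV^2 pb; 0.3893793721 GeV^2 mb with 1 mb = 1e9 pb (assumed input).
HBARC2_GEV2_PB = 0.3893793721e9


def gev_minus2_to_pb(value_in_gev_minus2):
    """Convert a quantity in natural units of GeV^-2 to picobarns."""
    return value_in_gev_minus2 * HBARC2_GEV2_PB


if __name__ == "__main__":
    G_F = 1.16639e-5  # GeV^-2, as quoted in the text
    print(gev_minus2_to_pb(G_F))  # close to the quoted 4541.68 pb
```

Expressing $G_F$ in pb means that cross sections built from Eq. (2.5) with all masses and invariants in GeV come out directly in picobarns.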
The NLO corrected differential cross section in the asymptotic top-quark mass limit is
given by

\[
\frac{d\sigma^{{\rm NLO},m_t\to\infty}}{dp_t}\bigl(S, p_t^2, m_H^2\bigr)
= \left[1 + 22\,\frac{\alpha_s^{(5)}(\mu^2)}{4\pi}\right]
\frac{d\sigma^{{\rm LO},m_t\to\infty}}{dp_t}\bigl(S, p_t^2, m_H^2\bigr)
+ \frac{d\sigma^{(2),m_t\to\infty}}{dp_t}\bigl(S, p_t^2, m_H^2\bigr).
\tag{3.11}
\]

In $d\sigma^{(2),m_t\to\infty}$ all partonic cross sections use the asymptotic top-quark mass limit results
in [15–18]. Further we have multiplied the LO cross section by $C^2 = 1 + 22\alpha_s/4\pi$ in
Eq. (2.10). Finally we have the approximation for arbitrary masses $m_H$ and $m_t$

\[
\frac{d\sigma^{\rm NLO,approx}}{dp_t}\bigl(S, p_t^2, m_H^2\bigr)
= \frac{d\sigma^{\rm LO,exact}}{dp_t}\bigl(S, p_t^2, m_H^2\bigr)
+ \frac{d\sigma^{S+V,{\rm approx}}}{dp_t}\bigl(S, p_t^2, m_H^2\bigr),
\tag{3.12}
\]

where the partonic cross sections are given in Eqs. (2.4), (2.6) and (2.7) and the soft-plus-virtual
gluon approximation is given in Eqs. (2.14) and (2.16). The running coupling and
parton densities are also represented in next-to-leading order for which we have chosen
the $\overline{\rm MS}$-scheme and $n_f = 5$. For our plots we have adopted the parametrization obtained
from the set MRST [29] (namely set alf119.dat) with $\Lambda_5^{\rm NLO} = 239$ MeV. For simplicity
the factorization scale $\mu$ is set equal to the renormalization scale $\mu_r$. For our plots we take
$\mu^2 = \mu_0^2 = p_t^2 + m_H^2$ unless mentioned otherwise.
Our first study concerns the validity of the soft-plus-virtual gluon approximation. This
is done in the asymptotic top-quark limit where we know the complete NLO correction.
For that purpose we plot

\[
R = \frac{d\sigma^{S+V,m_t\to\infty}/dp_t}{d\sigma^{{\rm NLO},m_t\to\infty}/dp_t}
\tag{3.13}
\]

in the range $40 < p_t < 200$ GeV/$c$ and $m_H = 120, 160, 200$ GeV/$c^2$. The plots are given
for the LHC ($\sqrt{S} = 14$ TeV) in Fig. 7(a). The figure reveals that at $m_H = 120$ GeV/$c^2$
and $p_t = 40$ GeV/$c$ the ratio is 1.06 and it decreases to about 0.9 at $p_t = 200$ GeV/$c$.
For larger Higgs masses the ratio becomes closer to unity at $p_t > 100$ GeV/$c$. This feature
can be understood because at larger Higgs masses the kinematics are closer to the
boundary of phase space. The conclusion is that in the range $100 < p_t < 200$ GeV/$c$ we
have $0.9 < R < 1.0$, which indicates that the soft-plus-virtual gluon approximation with
the prescription in [27] works rather well. This is mainly due to the dominance of the
gg-channel and the steeply rising gluon flux, which is even enhanced by the definition of the
parton luminosity in Eqs. (2.20), (2.21). Note that if one redefines the parton luminosity in
Eq. (2.21) with a factor of $x^{-2}$, with corresponding changes in Eq. (2.20), then the $R$ ratios
in Eq. (3.13) decrease below unity. We show them in Fig. 7(b). For $m_H = 120$ GeV/$c^2$
and $p_t = 40$ GeV/$c$ the ratio is 0.97 and it decreases to about 0.85 at $p_t = 200$ GeV/$c$.
Therefore while the introduction of one inverse power improves the S + V approximation,
two inverse powers do not. Hence from now on we use one inverse power.
Fig. 7. (a) The quality of the soft-plus-virtual gluon approximation represented by the ratio $R$ in Eq. (3.13) for
$40 < p_t < 200$ GeV/$c$ at the LHC ($\sqrt{S} = 14$ TeV) for $m_H = 120$ GeV/$c^2$ (solid line), $m_H = 160$ GeV/$c^2$
(dashed line), $m_H = 200$ GeV/$c^2$ (dotted line). (b) The quality of the soft-plus-virtual gluon approximation represented by the ratio $R$
in Eq. (3.13) when we include two inverse powers in Eq. (2.21) for
$40 < p_t < 200$ GeV/$c$ at the LHC ($\sqrt{S} = 14$ TeV) for $m_H = 120$ GeV/$c^2$ (solid line), $m_H = 160$ GeV/$c^2$
(dashed line), $m_H = 200$ GeV/$c^2$ (dotted line).

Fig. 8. The quality of the soft-plus-virtual gluon approximation represented by the ratio $R$ in Eq. (3.13) for
$40 < p_t < 200$ GeV/$c$ at the Tevatron ($\sqrt{S} = 2$ TeV) for $m_H = 120$ GeV/$c^2$ (solid line), $m_H = 130$ GeV/$c^2$
(dashed line), $m_H = 140$ GeV/$c^2$ (dotted line).

In the case of the Tevatron ($\sqrt{S} = 2$ TeV) the soft-plus-virtual gluon approximation
works even better (see Fig. 8). In the whole range $40 < p_t < 200$ GeV/$c$ we have
$0.95 < R < 1.07$. However the mass range is more limited, i.e., $m_H = 120, 130, 140$ GeV/$c^2$,
because at larger masses the cross section becomes unobservably small. This is understandable
because at lower energies we are closer to the boundary of phase space where
the soft-plus-virtual gluon approximation approaches the exact cross section. The transverse
momentum distributions $d\sigma/dp_t$ are plotted in the case of the LHC in Figs. 9–11 for
$m_H = 120, 160, 200$ GeV/$c^2$, respectively. The figures reveal the differences between the
cross sections in the asymptotic $m_t \to \infty$ limits and the exact (approximate) cross sections. They
become more clear if we plot the ratios

\[
H^{\rm LO} = \frac{d\sigma^{\rm LO,exact}/dp_t}{d\sigma^{{\rm LO},m_t\to\infty}/dp_t}, \qquad
H^{\rm NLO} = \frac{d\sigma^{\rm NLO,approx}/dp_t}{d\sigma^{{\rm NLO},m_t\to\infty}/dp_t},
\tag{3.14}
\]

which are shown in Figs. 12–14. For the Born cross section they vary at $p_t = 40$ GeV/$c$
from 0.93 to 1.03 for $m_H = 120$ to 200 GeV/$c^2$, respectively. At $p_t = 200$ GeV/$c$ they all
become about 0.8 irrespective of the Higgs mass. For the NLO cases these values are 0.96
to 1.28 for small $p_t$ and 0.68 to 0.75 at large $p_t$ as $m_H$ increases from $m_H = 120$ GeV/$c^2$
to $m_H = 200$ GeV/$c^2$. The exact (approximate) cross sections are always below those in
the asymptotic $m_t \to \infty$ limit except for $m_H = 160$ GeV/$c^2$ and $m_H = 200$ GeV/$c^2$ at small $p_t$.
There are cross over points at $p_t = 55$ GeV/$c$ and $p_t = 75$ GeV/$c$ for the NLO cross
sections. The NLO corrections are very large. This becomes clear if we look at the K-factors defined by

\[
K = \frac{d\sigma^{\rm NLO,approx}/dp_t}{d\sigma^{\rm LO,exact}/dp_t},
\tag{3.15}
\]

which are shown in Fig. 15(a). At pt = 40 GeV/c the K-factors vary from 1.66 to 2.02
as mH increases from mH = 120 GeV/c2 to mH = 200 GeV/c2 , respectively. At larger
pt values the K-factors decrease and at pt = 120 GeV/c they stabilize around the values
1.5, 1.55, 1.65 for mH = 120, 160, 200 GeV/c2 , respectively. For mH = 120 GeV/c2 the
difference between the asymptotic mt limit and the soft-plus-virtual gluon approximation
in NLO is of the same order as the K-factors, namely 1.5.
In our previous paper [16] we only used the infinite $m_t$ limit and defined

\[
K(m_t \to \infty) = \frac{d\sigma^{{\rm NLO},m_t\to\infty}/dp_t}{d\sigma^{{\rm LO},m_t\to\infty}/dp_t},
\tag{3.16}
\]

which we plot in Fig. 15(b) for the same scales, masses and parton densities as above.
We see that these K-factors increase as $p_t$ increases. Hence an alternative approach of
generating the NLO $p_t$ distribution by simply multiplying $d\sigma^{\rm LO,exact}/dp_t$ by $K(m_t \to \infty)$
yields a different result. However the numerical difference between the latter approach and
the S + V approximation investigated here could be small depending on the precise mass
of the Higgs.

Fig. 9. Differential cross sections at the LHC ($\sqrt{S} = 14$ TeV) with $m_H = 120$ GeV/$c^2$. The Born cross sections
$d\sigma^{\rm LO,exact}/dp_t$ (dotted line) and $d\sigma^{{\rm LO},m_t\to\infty}/dp_t$ (dot-dashed line). Also shown are the NLO contributions
$d\sigma^{\rm NLO,approx}/dp_t$ (solid line) and $d\sigma^{{\rm NLO},m_t\to\infty}/dp_t$ (dashed line).

Fig. 10. Same as in Fig. 9 for $m_H = 160$ GeV/$c^2$.

Fig. 12. The factors $H^{\rm LO}$ (dashed line) and $H^{\rm NLO}$ (solid line) in Eq. (3.14) at the LHC ($\sqrt{S} = 14$ TeV) with
$m_H = 120$ GeV/$c^2$.

Fig. 15. (a) The K factor (Eq. (3.15)) for the LHC ($\sqrt{S} = 14$ TeV) with $m_H = 120$ GeV/$c^2$ (solid line),
$m_H = 160$ GeV/$c^2$ (dashed line) and $m_H = 200$ GeV/$c^2$ (dotted line). (b) The $K(m_t \to \infty)$ factor (Eq. (3.16))
for the LHC ($\sqrt{S} = 14$ TeV) with $m_H = 120$ GeV/$c^2$ (solid line), $m_H = 160$ GeV/$c^2$ (dashed line) and
$m_H = 200$ GeV/$c^2$ (dotted line).

Fig. 16. Same as in Fig. 9 but for the Tevatron ($\sqrt{S} = 2$ TeV) and $m_H = 120$ GeV/$c^2$.
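The definitions in Eqs. (3.14)–(3.16) imply the identity $K(m_t \to \infty)/K = H^{\rm LO}/H^{\rm NLO}$, which quantifies how far the rescaled distribution $K(m_t \to \infty)\,d\sigma^{\rm LO,exact}/dp_t$ sits from $d\sigma^{\rm NLO,approx}/dp_t$. The sketch below evaluates it with the LHC values quoted in the text for $m_H = 120$ GeV/$c^2$ at $p_t = 40$ GeV/$c$ ($K = 1.66$, $H^{\rm LO} = 0.93$, $H^{\rm NLO} = 0.96$). Those inputs are read from plots, so the outputs are indicative only, and the function names are hypothetical.

```python
def implied_k_mt(k_factor, h_lo, h_nlo):
    """K(m_t -> infinity) implied by Eqs. (3.14)-(3.16): K * H_LO / H_NLO."""
    return k_factor * h_lo / h_nlo

def rescaling_mismatch(h_lo, h_nlo):
    """Ratio [K(m_t -> infinity) * dsigma^{LO,exact}] / dsigma^{NLO,approx} = H_LO / H_NLO."""
    return h_lo / h_nlo

# Values quoted in the text for the LHC, m_H = 120 GeV/c^2, p_t = 40 GeV/c.
k_mt = implied_k_mt(1.66, 0.93, 0.96)
mismatch = rescaling_mismatch(0.93, 0.96)
```

The two procedures agree exactly only where $H^{\rm LO} = H^{\rm NLO}$, which is why the size of the difference depends on the Higgs mass and on $p_t$.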
In Fig. 16 the transverse momentum distributions are shown for the Tevatron at $m_H = 120$ GeV/$c^2$
and in Fig. 17 the ratios in Eq. (3.14) are plotted. Here the discrepancies in
NLO are even larger. There is very little difference between small $p_t$ and large $p_t$ and
the approximate cross section is about 0.5 to 0.8 times the one in the asymptotic
$m_t \to \infty$ limit. The Born approximations vary from 0.8 to 1.25 when $p_t$ ranges from
$p_t = 40$ GeV/$c$ to $p_t = 200$ GeV/$c$. Notice that for $p_t > 135$ GeV/$c$ the approximate
cross section becomes even a little bit larger than the one in the case of the asymptotic $m_t \to \infty$
limit. Note that the peaks in Figs. 16 and 17 reflect the thresholds in the partonic channels
described in [3–7].

The K-factors (see Fig. 18(a)) are a little smaller than in the case of the LHC. At $p_t = 40$ GeV/$c$
they vary between 1.5 and 1.8 and at larger $p_t$ (say $p_t > 120$ GeV/$c$) they are
in the range $1.3 < K < 1.4$. Here the discrepancy between the asymptotic $m_t \to \infty$ limit and the
approximate cross section in NLO is even larger than the corresponding K-factor.

Note that if we use the $K(m_t \to \infty)$-factor in (3.16) then it is indeed larger for the
Tevatron than for the LHC (see Fig. 18(b)). This is consistent with results for the NLO
inclusive Higgs cross section in the infinite $m_t$ limit, see the tables in [14], which show
that this $K(m_t \to \infty)$-factor is larger for the Tevatron than for the LHC.

Fig. 17. Same as in Fig. 12 but for the Tevatron ($\sqrt{S} = 2$ TeV) and $m_H = 120$ GeV/$c^2$.
The dependence of the exact Born and the soft-plus-virtual gluon approximation
cross sections on the factorization scale is studied for $m_H = 120$ GeV/$c^2$ at
$p_t = 100, 150, 200$ GeV/$c$. The dependence can be expressed by the following quantity

\[
N\!\left(p_t, \frac{\mu}{\mu_0}\right)
= \frac{d\sigma^{\rm approx}(p_t, \mu)/dp_t}{d\sigma^{\rm approx}(p_t, \mu_0)/dp_t},
\tag{3.17}
\]

with $\mu_0^2 = p_t^2 + m_H^2$. This quantity is plotted in the range $0.1\mu_0 < \mu < 10\mu_0$ for LO and
NLO in Fig. 19 for the LHC and in Fig. 20 for the Tevatron, both at $m_H = 120$ GeV/$c^2$.
The LO cross sections have the larger values for small $\mu/\mu_0$. What is very striking is
the improvement in scale variation while going from LO to NLO. In LO there is steep
behaviour at small $\mu/\mu_0$ which is flattened out in NLO. At large $\mu/\mu_0$ the difference
between LO and NLO is not so big, but still the NLO curves are flatter than the LO ones.
Basically the same curves are also found at larger Higgs masses so that there is no need to
show them. Finally there is a small dependence of $N(p_t, \mu/\mu_0)$ on the transverse momenta
in both LO and in NLO.
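The steep LO behaviour at small $\mu/\mu_0$ can be made plausible with a toy version of Eq. (3.17). This is only a sketch under strong assumptions made here: the LO Higgs-plus-jet cross section is taken to scale as $\alpha_s^3(\mu^2)$, the scale dependence of the parton densities is ignored entirely, and the one-loop coupling with $\Lambda = 167$ MeV, $n_f = 5$ is used. It is not the quantity plotted in Figs. 19 and 20.

```python
import math

def alpha_s_lo(mu2, lam=0.167, nf=5):
    """One-loop running coupling; mu2 in GeV^2, lam (Lambda) in GeV."""
    beta0 = 11.0 - 2.0 * nf / 3.0
    return 4.0 * math.pi / (beta0 * math.log(mu2 / lam ** 2))

def toy_scale_ratio(x, pt=100.0, mH=120.0, power=3):
    """Toy N(p_t, mu/mu0) at x = mu/mu0, keeping only the alpha_s^power prefactor."""
    mu0_sq = pt ** 2 + mH ** 2
    return (alpha_s_lo(x ** 2 * mu0_sq) / alpha_s_lo(mu0_sq)) ** power

# Scan over the plotted range 0.1 <= mu/mu0 <= 10.
ratios = {x: toy_scale_ratio(x) for x in (0.1, 0.5, 1.0, 2.0, 10.0)}
```

Even this crude model rises quickly towards small $\mu/\mu_0$ and falls slowly at large $\mu/\mu_0$, in line with the qualitative LO behaviour described above; the NLO flattening requires the full calculation.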
Concluding our findings we observe that the soft-plus-virtual gluon approximation gives
a good description of the exact NLO cross section (within 90%), when tested with $m_t \to \infty$
cross sections. The difference between the asymptotic $m_t \to \infty$ limit and the soft-plus-virtual
gluon approximation is larger than the K-factor in the case of the Tevatron but smaller
than the K-factor in the case of the LHC. Also the validity of the asymptotic $m_t \to \infty$ limit depends
more on the value of the transverse momentum than on the magnitude of the Higgs mass.
Finally our approximation has a significantly smaller scale dependence for both colliders,
in particular at small factorization scale.

Fig. 18. (a) The K factor (Eq. (3.15)) for the Tevatron ($\sqrt{S} = 2$ TeV) with $m_H = 120$ GeV/$c^2$ (solid line),
$m_H = 130$ GeV/$c^2$ (dashed line) and $m_H = 140$ GeV/$c^2$ (dotted line). (b) The $K(m_t \to \infty)$ factor (Eq. (3.16))
for the Tevatron ($\sqrt{S} = 2$ TeV) with $m_H = 120$ GeV/$c^2$ (solid line), $m_H = 130$ GeV/$c^2$ (dashed line) and
$m_H = 140$ GeV/$c^2$ (dotted line).

Fig. 19. The scale dependence represented by $N(p_t, \mu/\mu_0)$ in Eq. (3.17) for the LHC ($\sqrt{S} = 14$ TeV)
and $m_H = 120$ GeV/$c^2$. The results are plotted in the range $0.1 < \mu/\mu_0 < 10$ with $\mu_0^2 = m_H^2 + p_t^2$ for
$p_t = 100$ GeV/$c$ (solid line), $p_t = 150$ GeV/$c$ (dashed line), $p_t = 200$ GeV/$c$ (dotted line). The upper three
curves are for $d\sigma^{\rm LO,exact}/dp_t$ whereas the lower three curves are for $d\sigma^{\rm NLO,approx}/dp_t$.

Fig. 20. Same as in Fig. 19 but for the Tevatron ($\sqrt{S} = 2$ TeV).
References
[1] J.F. Gunion, H.E. Haber, G.L. Kane, S. Dawson, The Higgs Hunter's Guide, Addison–Wesley, Reading,
MA, 1990, hep-ph/9302272.
[2] F. Wilczek, Phys. Rev. Lett. 39 (1977) 1304;
H. Georgi, S. Glashow, M. Machacek, D. Nanopoulos, Phys. Rev. Lett. 40 (1978) 692;
J. Ellis, M. Gaillard, D. Nanopoulos, C. Sachrajda, Phys. Lett. B 83 (1979) 339;
T. Rizzo, Phys. Rev. D 22 (1980) 178.
[3] I. Hinchliffe, S.F. Novaes, Phys. Rev. D 38 (1988) 3475.
[4] R.K. Ellis, I. Hinchliffe, M. Soldate, J.J. van der Bij, Nucl. Phys. B 297 (1988) 221.
[5] U. Baur, E. Glover, Nucl. Phys. B 339 (1990) 38.
[6] R.P. Kauffman, Phys. Rev. D 44 (1991) 1415;
R.P. Kauffman, Phys. Rev. D 45 (1992) 1512.
[7] B. Field, S. Dawson, J. Smith, Phys. Rev. D 69 (2004) 074013, hep-ph/0311199.
[8] D. Graudenz, M. Spira, P. Zerwas, Phys. Rev. Lett. 70 (1993) 1372;
M. Spira, A. Djouadi, D. Graudenz, P. Zerwas, Nucl. Phys. B 453 (1995) 17, hep-ph/9504378.
[9] S. Dawson, Nucl. Phys. B 359 (1991) 283;
A. Djouadi, M. Spira, P. Zerwas, Phys. Lett. B 264 (1991) 440.
[10] R.V. Harlander, Phys. Lett. B 492 (2000) 74, hep-ph/0007289;
V. Ravindran, J. Smith, W.L. van Neerven, Nucl. Phys. B 704 (2005) 332, hep-ph/0408315;
V. Ravindran, J. Smith, W.L. van Neerven, Nucl. Phys. B (Proc. Suppl.) 135 (2004) 35, hep-ph/0405263.
[11] S. Catani, D. de Florian, M. Grazzini, JHEP 0105 (2001) 025, hep-ph/0102227;
R.V. Harlander, W.B. Kilgore, Phys. Rev. D 64 (2001) 013015, hep-ph/0102241.
[12] R.V. Harlander, W.B. Kilgore, Phys. Rev. Lett. 88 (2002) 201801, hep-ph/0201206;
R.V. Harlander, W.B. Kilgore, JHEP 0210 (2002) 017, hep-ph/0208096.
[13] C. Anastasiou, K. Melnikov, Nucl. Phys. B 646 (2002) 220, hep-ph/0207004;
C. Anastasiou, K. Melnikov, Phys. Rev. D 67 (2003) 037501, hep-ph/0208115.
[14] V. Ravindran, J. Smith, W.L. van Neerven, Nucl. Phys. B 665 (2003) 325, hep-ph/0302135;
V. Ravindran, J. Smith, W.L. van Neerven, Pramana 62 (2004) 683, hep-ph/0304005.
[15] D. de Florian, M. Grazzini, Z. Kunszt, Phys. Rev. Lett. 82 (1999) 5209, hep-ph/9902483.
[16] V. Ravindran, J. Smith, W.L. van Neerven, Nucl. Phys. B 634 (2002) 247, hep-ph/0201114.
[17] C.J. Glosser, C.J. Schmidt, JHEP 0212 (2002) 016, hep-ph/0209248.
[18] B. Field, J. Smith, M.E. Tejeda-Yeomans, W.L. van Neerven, Phys. Lett. B 551 (2003) 137, hep-ph/0210369.
[19] G. Bozzi, S. Catani, D. de Florian, M. Grazzini, Phys. Lett. B 564 (2003) 65, hep-ph/0302104.
[20] A. Kulesza, G. Sterman, W. Vogelsang, Phys. Rev. D 69 (2004) 014012, hep-ph/0309264.
[21] B. Field, Phys. Rev. D 70 (2004) 054008, hep-ph/0405219.
[22] V. Del Duca, W. Kilgore, C. Oleari, C. Schmidt, D. Zeppenfeld, Phys. Rev. Lett. 87 (2001) 122001, hep-ph/0105129;
V. Del Duca, W. Kilgore, C. Oleari, C. Schmidt, D. Zeppenfeld, Nucl. Phys. B 616 (2001) 367, hep-ph/0108030.
[23] S. Dawson, R.P. Kauffman, Phys. Rev. Lett. 68 (1992) 2273.
[24] K.G. Chetyrkin, B.A. Kniehl, M. Steinhauser, Phys. Rev. Lett. 79 (1997) 353, hep-ph/9705240.
[25] M. Krämer, E. Laenen, M. Spira, Nucl. Phys. B 511 (1998) 523, hep-ph/9611272.
[26] J. Smith, W.L. van Neerven, Eur. Phys. J. C 40 (2005) 199, hep-ph/0411357.
[27] S. Catani, D. de Florian, M. Grazzini, JHEP 0201 (2002) 015, hep-ph/0111164.
[28] A.D. Martin, R.G. Roberts, W.J. Stirling, R.S. Thorne, Phys. Lett. B 531 (2002) 216, hep-ph/0201127.
[29] A.D. Martin, R.G. Roberts, W.J. Stirling, R.S. Thorne, Eur. Phys. J. C 23 (2002) 73, hep-ph/0110215.
Topological constraints on stabilized flux vacua

Natalia Saulina
Jefferson Physical Laboratory, Harvard University, Cambridge, MA 02138, USA
Received 1 April 2005; received in revised form 12 May 2005; accepted 24 May 2005
Abstract
We study the influence of four-form fluxes on the stabilization of the Kähler moduli in M-theory
compactified on a Calabi–Yau four-fold. We find that, under a certain nondegeneracy condition on
the flux, M5-instantons of a new topological type generate a superpotential. The existence of such
an instanton restricts possible four-folds for which the stabilization by this mechanism is expected.
These topological constraints on the background are different from the previously known constraints,
derived from the flux-free analysis of the nonperturbative effects.
1. Introduction
Whether or not we wish to accept the anthropic philosophy [1,2], a necessary condition
for a plausible phenomenologically realistic background is the stabilization of all of
its moduli. In the context of the orientifold type IIB models [3–7] it is now clear that the
complex structure moduli and the axion–dilaton modulus are fixed by a perturbative
superpotential proportional to the fluxes [8]. On the other hand, the stabilization of the Kähler
moduli relies on the generation of the nonperturbative superpotential. The nonperturbative
effects originate from gaugino condensation on coincident D7-branes [9,10] present in the
background and from the D3-brane instantons [10–13].
The KKLT paper [6] qualitatively discussed the nonperturbative superpotential deriving
its intuition from the flux-free compactifications. The subsequent successful search for the
realistic backgrounds [10,14,15] was also based on the flux-free analysis [11,12] of the
nonperturbative effects.

E-mail address: saulina@feynman.harvard.edu (N. Saulina).
doi:10.1016/j.nuclphysb.2005.05.011
N. Saulina / Nuclear Physics B 720 (2005) 203–210

However, recently it was realized [13,16,17] that the presence of background fluxes may
actually modify the conditions for the generation of an instanton-induced superpotential.
In this note we study the effect of the background flux on the generation of a
nonperturbative superpotential for the Kähler moduli. We find that, under certain restrictions on the
background flux, instantons of a new topological type generate a superpotential.
We investigate these new instantons in M-theory compactified on a Calabi–Yau
four-fold $CY_4$ with four-form fluxes [18]. The effective 3D theory has four supercharges.
Moreover, if the four-fold is elliptically fibered and the area of the elliptic fiber is sent
to zero, a new fourth dimension appears and the background is described as a flux
compactification of type IIB string theory on a Calabi–Yau orientifold [4]. In the framework of
M-theory the D7-branes are described as singular fibers of the elliptic fibration, while the
D3-brane instantons become the M5-brane instantons wrapped on the vertical divisors$^1$
of $CY_4$.
An M5-brane wrapped on a divisor in the four-fold generates a superpotential required
for the stabilization of Kähler moduli if there are exactly two fermionic zero modes on
its world-volume. The relevant analysis of the generalized Dirac equation [16] has not yet
been done in the presence of fluxes. The purpose of this note is to fill in this gap.
We find exactly two fermionic zero modes by restricting the choice of fluxes and global
properties of the divisors. We consider divisors with Hodge numbers

\[
h^{(0,1)} = 0, \qquad h^{(0,3)} = 0, \qquad h^{(0,0)} = h^{(0,2)} = 1,
\tag{1.1}
\]

where $h^{(0,p)}$ stands for the number of linearly independent harmonic $(0, p)$ forms on the
divisor. Our choice of the flux is characterized, in addition to general supersymmetry
constraints (2.3), (2.4), by a nondegeneracy condition (4.6).
Note that in the absence of fluxes there would be four fermion zero modes for divisors
with these Hodge numbers and M5-branes wrapped on such a divisor would not generate
a superpotential. Our result demonstrates that the appropriate choice of the flux lifts extra
fermion zero modes so that instantons of the previously ignored topological type contribute
to the stabilization of the Kähler moduli.
We would like to emphasize that it is not trivial to reduce the number of the fermionic
zero modes of the instanton to two. For example, the authors of [17] have counted the number of the
fermion zero modes in the context of a type IIB compactification on the orientifold $T^6/\mathbb{Z}_2$
in the presence of fluxes and found four zero modes. In their case, no instanton-induced
superpotential is generated.
The existence of a divisor with Hodge numbers (1.1) restricts possible four-folds for
which the stabilization of the Kähler moduli due to M5-instantons of the new topological
type is expected. These topological constraints on the background are different from the
previously known constraints derived from the flux-free analysis of the nonperturbative
effects.

1 A vertical divisor is one that projects to a divisor in the base $B$ of the elliptic fibration $\pi: CY_4 \to B$.

The note is organized as follows. In Section 2 we briefly review some basic facts about
the flux compactification of M-theory on a Calabi–Yau four-fold $CY_4$ and recall the
geometric properties of the fermions living on the M5-brane instanton. In Section 3 we review
the Dirac-like equation for the fermions living on the M5-brane in the presence of
background fluxes. For an M5-brane wrapped on a divisor $D$ in $CY_4$ we recast this equation as
a set of equations for differential forms on $D$. In Section 4 we demonstrate that for generic
fluxes and divisors with global properties (1.1) there are exactly two fermionic zero modes.
Section 5 summarizes our results.
2. Flux compactification of M-theory on $CY_4$ and an M5-instanton

In this section we review basic facts about flux compactification of M-theory on a
Calabi–Yau 4-fold $CY_4$ [18] and recall the geometric properties of the fermions living on an
M5-brane instanton.
The 11D metric is a warped product

\[
ds^2 = e^{2A(y)}\,\eta_{\mu\nu}\,dx^\mu dx^\nu + e^{2B(y)}\,g_{MN}\,dy^M dy^N,
\tag{2.1}
\]

where $\eta_{\mu\nu}$ is the metric on the three-dimensional Minkowski space and the internal metric
has the form

\[
g_{MN} = t^2 g^{(0)}_{MN} + g^{(1)}_{MN} + \cdots.
\tag{2.2}
\]

Here $g^{(0)}_{MN}$ is the Ricci-flat metric on $CY_4$ and $t$ is the size of the 4-fold.

In the leading approximation in the limit of large $t$ the warp factors are trivial,
$A^{(0)} = B^{(0)} = 0$, and the 4-form flux has only components along the 4-fold. Moreover,
compactification gives a 3D theory with four supercharges when the background flux is a
primitive form of (2, 2) type

\[
J \wedge F_{(2,2)} = 0
\tag{2.3}
\]

and the tadpole cancellation condition is satisfied

\[
\int_{CY_4} F \wedge F + \frac{\chi}{12} = 0.
\tag{2.4}
\]

In Eqs. (2.3), (2.4) $J$ is the Kähler form on $CY_4$ and $\chi$ is the Euler characteristic of $CY_4$.

Now let us consider a divisor $D$ in $CY_4$ and wrap an M5-brane on it. This M5-instanton
will generate a nonperturbative superpotential $W_{np} = G(Z)\,e^{-T}$ if the fermions living on the
M5-brane world-volume have exactly two zero modes. Here $T = V + iC$, $V$ is the volume
of the divisor $D$ and the axion $C$ is the 6-form potential integrated over the divisor. The
prefactor $G(Z)$ is a 1-loop determinant which is a holomorphic function of the complex
structure moduli.
Our goal in Section 3 will be to recast the equations of motion for fermions living on the
M5-instanton as a set of equations on differential forms on the divisor $D$. We will further
use this in Section 4 to find the case with exactly two fermion zero modes.
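To make the role of $T = V + iC$ concrete, the instanton term can be evaluated numerically. This is a minimal illustration that assumes the standard suppression factor $e^{-T}$ and treats the one-loop prefactor $G(Z)$ as a plain number; the function name and the sample values are hypothetical.

```python
import cmath
import math

def w_np(prefactor, volume, axion):
    """Instanton term G(Z) * exp(-T) with T = V + i C (prefactor stands in for G(Z))."""
    T = complex(volume, axion)
    return prefactor * cmath.exp(-T)

# Sample values (assumptions): unit prefactor, divisor volume V = 10, axion C = pi.
w = w_np(1.0, 10.0, math.pi)
```

The modulus of the result is $|G|\,e^{-V}$, so the contribution is exponentially small at large divisor volume, while the axion $C$ only rotates its phase.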
For this purpose we recall below how world-volume fermions transform under the
rotations of the normal and tangent directions. The normal bundle to the M5-brane has a
product form $R^3 \times N$, where $R^3$ stands for external space$^2$ and $N$ is the line bundle
describing one complex normal direction inside $CY_4$.
The fermions $\theta = \theta^{A\pm}_{\alpha}$ living on the M5-brane transform in the representation
$\mathbf{4} \otimes \mathbf{2} \otimes \mathbf{2}$ under $Spin(6) \times SO(3) \times SO(2)$. Here $A = 1, 2$ is a spinor index under the external $SO(3)$, the $+$ $(-)$
stands for a chiral (antichiral) spinor of $SO(2)$ and $\alpha = 1, \ldots, 4$ is a chiral spinor index of $Spin(6)$.
3. Recasting equations for fermion modes of the M5-instanton in terms of differential forms

Kallosh and Sorokin [16] have derived the Dirac-like equation for the fermions living
on the world-volume of an M5-brane in the presence of background fluxes. We now apply
their equation to the case of the M5-brane wrapped on a divisor $D$ of $CY_4$. The goal of this
section is to rewrite the resulting equations in terms of differential forms on the divisor.
This will simplify our search for fermion zero modes in Section 4. The reader may skip the
details and find the resulting system of equations in (3.6)–(3.9).
In the limit of the large size $t$ of the four-fold, all components of the four-form except
those with all indices along the internal four-fold may be neglected. The equation then
reads
1
1
i i + i i Tw i j k Fijk w Tw ij k Fij k w = 0.
8
8
(3.1)
We have introduced complex coordinates $z^i$, $i = 1, 2, 3$ along the divisor and the complex
coordinate $w$ normal to the divisor inside $CY_4$.
Note that $F_{\bar i \bar j k w}$ and $F_{i j \bar k \bar w}$ are the only internal flux components which appear in
Eq. (3.1). The same components are turned on when a dual, four-dimensional type IIB
orientifold description of the three-dimensional theory becomes applicable. In the case
of general flux compactifications of M-theory on a Calabi–Yau four-fold there could also
be other internal flux components,$^3$ as long as they satisfy the supersymmetry constraints
(2.3), (2.4). Remarkably, they do not affect the equation for the
instanton fermionic zero modes.
In (3.1) $T^w$, $T^{\bar w}$ are $SO(2)$ Dirac matrices

\[
T^w T^{\bar w} + T^{\bar w} T^w = 2 g^{w \bar w}.
\tag{3.2}
\]

The six-dimensional chiral (antichiral) gamma matrices i , j (i , j ) have the

properties
j i + i j = 2gi j ,
(3.3)
2 For computation of instanton generated superpotential we work in Euclidean signature in external 3D space.
3 $F_{i \bar j k \bar n}$ and $F_{i \bar j w \bar w}$.
where $g_{i \bar j}$ is the Kähler metric on the divisor $D$. Note that nothing in Eq. (3.1) acts on the
index $A = 1, 2$ of a spinor in $R^3$. In what follows we will not write this index explicitly but
we will keep it in mind in the future counting of the number of zero modes.
The covariant derivatives $\nabla_j$, $\nabla_{\bar j}$ include the connection on the bundle of chiral $Spin(6)$
spinors as well as the connection on the spin bundle derived from the normal bundle $N$.
Now we use the known fact (see for example [11]) that the bundle $S^+$ of chiral spinors
on a Kähler manifold of complex dimension three is isomorphic to the bundle

\[
\bigl(\Omega^{(0,0)} \otimes K^{1/2}\bigr) \oplus \bigl(\Omega^{(0,2)} \otimes K^{1/2}\bigr).
\]

Here $\Omega^{(0,p)}$ stands for the bundle of $(0, p)$ forms. We will further use that the normal
bundle on the divisor in $CY_4$ is isomorphic to the canonical bundle $K$. Recalling that the
world-volume fermion $\theta$ is a section of the bundle$^4$
$\bigl(S^+ \otimes K^{1/2}\bigr) \oplus \bigl(S^+ \otimes K^{-1/2}\bigr)$, we find the following degrees of freedom:
a $(0, 2)$ form $a^w_{(2)}$ taking values in the canonical bundle $K$, a section $a^w_{(0)}$ of $K$, as well as a
$(0, 2)$ form $b_{(2)}$ and a scalar $b_{(0)}$.
Locally we write $\theta$ in terms of these degrees of freedom as follows
w

= a(0)
+ aiwj i j Tw
+ b(0) + bij i j
,
(3.4)
where the chiral spinor

satisfies
i
= 0,
i = 1, 2, 3,
Tw
= 0.
(3.5)
Plugging (3.4) into (3.1) we find the following set of equations5 :

[i bjk]
= 0,
(3.6)
4 j bjk + k b(0) = 0,
(3.7)
w
D[m 1 am
2m
3 ] = 0,
(3.8)
4D j ajwk
(3.9)
w
+ Dk a(0)
= F i j k w bij .
In Eqs. (3.8), (3.9) the covariant differentials include connection on the canonical bundle.
In writing Eqs. (3.6)(3.9) we used the primitivity condition (2.3).
4 Here we are ignoring that the fermion $\theta$ is a spinor in $R^3$.
5 $X_{[i_1 \ldots i_p]} = \frac{1}{p!}\bigl(X_{i_1 \ldots i_p} \pm \text{permutations}\bigr)$.

4. The M5-instanton with two fermion zero modes

Here we study the set of Eqs. (3.6)–(3.9) for the fermionic degrees of freedom on an
M5-brane wrapped on a divisor $D$ in a Calabi–Yau 4-fold. We consider divisors with Hodge
numbers

\[
h^{(0,1)} = h^{(0,3)} = 0, \qquad h^{(0,0)} = h^{(0,2)} = 1,
\tag{4.1}
\]

where $h^{(0,p)}$ stands for the number of harmonic $(0, p)$ forms on $D$. The goal of this section
is to show that for generic background fluxes the M5-branes wrapped on the divisors of this
topological type generate a superpotential.
In the absence of fluxes there would be four fermion zero modes for the divisors with
these Hodge numbers. Two zero modes$^6$ would come from the harmonic $(0, 2)$ form and
the other two from the $(0, 0)$ form. It is natural to expect that by choosing the flux appropriately one
can lift the zero modes associated with the $(0, 2)$ form. Below we realize this expectation.
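The counting described here can be written as a small bookkeeping sketch. It assumes, as the text states for the $(0,0)$ and $(0,2)$ forms, that every harmonic $(0, p)$ form contributes two zero modes because of the hidden $R^3$ spinor index; the general flux-free formula and the function names are assumptions made here. The arithmetic genus is included because the flux-free criterion usually quoted from [11] requires it to equal 1.

```python
def flux_free_zero_modes(h00, h01, h02, h03):
    """Zero modes without flux: two per harmonic (0,p) form (assumed counting rule)."""
    return 2 * (h00 + h01 + h02 + h03)

def arithmetic_genus(h00, h01, h02, h03):
    """Alternating sum of the h^(0,p) of the divisor."""
    return h00 - h01 + h02 - h03

def zero_modes_with_generic_flux(h00, h01, h02, h03):
    """Count for the topology (4.1) only: the flux lifts the modes from the (0,2) form."""
    if (h01, h02, h03) != (0, 1, 0):
        raise ValueError("sketch only covers divisors of the type in Eq. (4.1)")
    return 2 * h00

hodge = (1, 0, 1, 0)  # h^(0,0), h^(0,1), h^(0,2), h^(0,3) from Eq. (4.1)
```

For these Hodge numbers the flux-free count is four and the arithmetic genus is 2, so such a divisor would not contribute in the flux-free analysis, while the count with generic flux is two.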
Using Hodge decomposition Eqs. (3.6) and (3.7) imply that b(2) and b(0) are harmonic
forms. From h(0,2) = 1 follows that we may write bij = ij where ij is a fixed harmonic
(0, 2) form and is a complex number.
Now let us consider Eq. (3.9). Both sides of this equation take values in (0,1) (K),
the space of (0, 1)-forms with values in the canonical bundle K. From our assumption
h(0,2) = 1 follows h(0,1) (K) = 1. This implies that there is unique (up to multiplication by
complex number) harmonic (0, 1)-form taking values in K. Let us call it ckw . Now we take
inner product of both sides of (3.9) with ckw . The left side gives zero. So the consistency of
(3.9) requires
ggww (ckw ) g k p F l m p w blm = 0.

(4.2)
D
Recall also that ckw can be constructed from7 the fixed harmonic (2, 0) form ij =
(ij ) as follows:
pij w
ckw = gkp
ij ,

(4.3)
where pij w is SU(4) invariant antisymmetric tensor.

Eq. (4.2) becomes

g ij P i j ,k m k m = 0,
(4.4)
where

P i j ,k m = gww p i j w F k m p w .
So we conclude that for generic fluxes such that

gij P i j ,k m k m = 0
(4.5)
(4.6)
the only solution of (4.4) is = 0 and therefore b(2) = 0. We would like to emphasize that
this removal of the harmonic (0, 2)-form b(2) is eventually responsible for lifting of the
two extra fermion zero modes.
6 Recalling that all fields carry spinor index in R 3 .
7 This is explicit realization of the statement h(0,1) (K) = h(0,2) .
Now Eqs. (3.8) and (3.9) require $a^w_{(0)}$ and $a^w_{(2)}$ to be harmonic forms with values in $K$.
From our assumption about the topology of the divisor $D$, $h^{(0,3)} = h^{(0,1)} = 0$, we find
$h^{(0,0)}(K) = h^{(0,2)}(K) = 0$ and therefore $a^w_{(0)} = 0$ and $a^w_{(2)} = 0$.
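The step from the Hodge numbers of $D$ to those of $K$-valued forms is not spelled out in the text. One way to see it, offered here as a sketch, is Serre duality on the compact complex three-dimensional divisor $D$:

```latex
\[
H^{p}(D, K) \;\cong\; H^{3-p}(D, \mathcal{O})^{*}
\quad\Longrightarrow\quad
h^{(0,p)}(K) = h^{(0,3-p)}, \qquad p = 0, 1, 2, 3 .
\]
\[
h^{(0,0)}(K) = h^{(0,3)} = 0, \qquad
h^{(0,1)}(K) = h^{(0,2)} = 1, \qquad
h^{(0,2)}(K) = h^{(0,1)} = 0 .
\]
```

The middle relation is the statement of footnote 7, and the other two give the vanishing of $a^w_{(0)}$ and $a^w_{(2)}$ used above.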
We conclude that Eqs. (3.6)–(3.9) have a single solution:

\[
b_{(0)} = {\rm const}, \qquad a^w_{(0)} = 0, \qquad a^w_{(2)} = 0, \qquad b_{(2)} = 0.
\]

Recalling that all the fields carry the hidden index $A = 1, 2$ of a spinor in $R^3$ (see the discussion
below (3.3)), we conclude that we found exactly two fermion zero modes. Therefore,
the M5-instanton of the topology (4.1) in the presence of generic fluxes (4.6) generates
a nonperturbative superpotential for the Kähler moduli.
5. Conclusion
In this note we studied how the conditions for the stabilization of the Kähler moduli are
modified by background fluxes. We considered M-theory compactified on a Calabi–Yau
four-fold and found that, for a generic choice of background fluxes, M5-instantons of a new
topological type generate a nonperturbative superpotential required for the stabilization.
The new instanton is an M5-brane wrapped on a divisor with Hodge numbers

\[
h^{(0,0)} = h^{(0,2)} = 1, \qquad h^{(0,1)} = h^{(0,3)} = 0.
\]

Meanwhile, the background fluxes, in addition to general constraints on supersymmetric
compactification (2.3), (2.4), are characterized by a non-degeneracy condition (4.6).
Divisors with these Hodge numbers appeared before$^8$ in the discussion of the gaugino
condensation on coincident D7-branes [13]. We found that in the presence of the special
background fluxes such divisors are relevant for the generation of a nonperturbative
superpotential induced by the M5-instantons.
The condition for the existence of such a divisor restricts possible four-folds for which
the stabilization of the Kähler moduli by this mechanism is expected. These topological
constraints on the background are different from the previously known constraints derived
from the flux-free analysis of the nonperturbative effects.
It would be interesting to find other choices of fluxes which can make the M5-branes
wrapped on more general divisors contribute to the nonperturbative superpotential.
Another interesting question is to find an explicit example of a Calabi–Yau four-fold, other
than $K3 \times K3$, that contains a divisor with the desired properties (4.1) and admits the
appropriate nondegenerate flux (4.6).
Acknowledgements
I would like to thank L. Motl and J. Distler for valuable discussions. My research was
supported in part by NSF grants PHY-0244821 and DMS-0244464. I am grateful to the
referee of Nuclear Physics B who pointed out a mistake in the original version of this
paper.

8 The example in [13] is $K3 \times P^1$ in $K3 \times K3$.
References
[1] L. Susskind, The anthropic landscape of string theory, hep-th/0302219.
[2] L. Susskind, Supersymmetry breaking in the anthropic landscape, hep-th/0405189.
[3] K. Dasgupta, G. Rajesh, S. Sethi, M-theory, orientifolds and G-flux, JHEP 9908 (1999) 023, hep-th/9908088.
[4] S. Giddings, S. Kachru, J. Polchinski, Hierarchies from fluxes in string compactifications, Phys. Rev. D 66
(2002) 106006, hep-th/0105097.
[5] K. Becker, K. Dasgupta, Heterotic strings with torsion, JHEP 0211 (2002) 006, hep-th/0209077.
[6] S. Kachru, R. Kallosh, A. Linde, S. Trivedi, de Sitter vacua in string theory, Phys. Rev. D 68 (2003) 046005,
hep-th/0301240.
[7] C. Burgess, R. Kallosh, F. Quevedo, de Sitter string vacua from supersymmetric D-terms, JHEP 0310 (2003)
056, hep-th/0309187.
[8] S. Gukov, C. Vafa, E. Witten, CFTs from Calabi–Yau four-folds, Nucl. Phys. B 584 (2000) 69–108;
S. Gukov, C. Vafa, E. Witten, Nucl. Phys. B 608 (2001) 477–478, Erratum.
[9] P. Tripathy, S. Trivedi, Compactification with flux on K3 and tori, hep-th/0301139.
[10] F. Denef, M. Douglas, B. Florea, Building a better racetrack, JHEP 0406 (2004) 034, hep-th/0404257.
[11] E. Witten, Nonperturbative superpotentials in string theory, Nucl. Phys. B 474 (1996) 343, hep-th/9604030.
[12] A. Grassi, Divisors on elliptic Calabi–Yau 4-folds and the superpotential in F-theory, alg-geom/9704008.
[13] L. Görlich, S. Kachru, P. Tripathy, S. Trivedi, Gaugino condensation and nonperturbative superpotentials in
flux compactifications, hep-th/0407130.
[14] V. Balasubramanian, P. Berglund, J.P. Conlon, F. Quevedo, Systematics of moduli stabilisation in
Calabi–Yau flux compactifications, hep-th/0502058.
[15] F. Denef, M.R. Douglas, B. Florea, A. Grassi, S. Kachru, Fixing all moduli in a simple F-theory compactification, hep-th/0503124.
[16] R. Kallosh, D. Sorokin, Dirac action on M5 and M2 branes with bulk fluxes, hep-th/0501081.
[17] P. Tripathy, S. Trivedi, D3-brane action and fermion zero modes in presence of background flux, hep-th/0503072.
[18] K. Becker, M. Becker, M-theory on eight-manifolds, Nucl. Phys. B 477 (1996) 155–167, hep-th/9605053.
Radius stabilization by two-loop Casimir energy

G. von Gersdorff a , A. Hebecker b
a Department of Physics and Astronomy, Johns Hopkins University, 3400 N Charles Street,
Baltimore, MD 21218, USA

b Institut für Theoretische Physik, Universität Heidelberg, Philosophenweg 16 und 19,
D-69120 Heidelberg, Germany

Received 18 April 2005; accepted 2 June 2005
Abstract
It is well known that the Casimir energy of bulk fields induces a non-trivial potential for the
compactification radius of higher-dimensional field theories. On dimensional grounds, the 1-loop
potential is $\sim 1/R^4$. Since the 5d gauge coupling constant $g^2$ has the dimension of length, the two-loop correction is $\sim g^2/R^5$. The interplay of these two terms leads, under very general circumstances
(including other interacting theories and more compact dimensions), to a stabilization at finite radius.
Perturbative control or, equivalently, a parametrically large compact radius is ensured if the 1-loop
coefficient is small because of an approximate fermion–boson cancellation. This is similar to the
perturbativity argument underlying the Banks–Zaks fixed point proposal. Our analysis includes a
scalar toy model, 5d Yang–Mills theory with charged matter, the examination of $S^1$ and $S^1/Z_2$
geometries, as well as a brief discussion of the supersymmetric case with Scherk–Schwarz SUSY
breaking. 2-loop calculability in the $S^1/Z_2$ case relies on the log-enhancement of boundary kinetic
terms at the 1-loop level.
E-mail addresses: gero@pha.jhu.edu (G. von Gersdorff), hebecker@thphys.uni-heidelberg.de (A. Hebecker).
doi:10.1016/j.nuclphysb.2005.06.001
G. von Gersdorff, A. Hebecker / Nuclear Physics B 720 (2005) 211–227
1. Introduction
Higher-dimensional field theories arise in the low energy limit of string- or M-theory,
which is our best candidate for a theory of quantum gravity. Independently, compactified higher-dimensional models provide many interesting possibilities for the unification
of known fields and interactions. Familiar examples are the appearance of 4-dimensional
gauge theories as a manifestation of higher-dimensional diffeomorphism invariance or the
unification of known bosons and fermions in higher-dimensional multiplets of supersymmetry (SUSY). Thus, we consider higher-dimensional compactified models a promising
ingredient in possible physics beyond the standard model, which makes the further investigation of stabilization mechanisms for the compactification radius an interesting and
potentially important subject.
One very generic ingredient in the dynamics of the compactification radius is the
Casimir energy of massless bulk fields [1]. As a simple example, consider 5d general
relativity with a vanishing cosmological constant and a set of massless 5d fermions and
bosons. Compactifying to 4d on an $S^1$ with physical volume $2\pi R$, one finds a flat 4d
effective potential for the radion $R$. This flatness is lifted by the 1-loop Casimir energy
which, on dimensional grounds, is $\sim 1/R^4$ and does not lead to a stable finite-$R$ solution.¹
Clearly, to overcome the $1/R^4$ behaviour, which is too simple for stabilization, one has to
introduce a mass scale into the potential. This can be achieved, for example, by considering warped compactifications [2], massive bulk matter or brane-localized kinetic terms for
bulk fields [3].
In this paper, we point out that the required mass scale is, in fact, generically present
in the simplest interacting higher-dimensional field theories, such as $\lambda\phi^4$ theories, gauge
theories with coupling constant $g$, or models with Yukawa interactions. In 5d the above
coupling constants are dimensionful, leading to a 2-loop Casimir energy contribution
$\sim \lambda/R^5$ or $\sim g^2/R^5$. Thus, radius stabilization will generically arise at the 2-loop level by
a balancing of the $1/R^4$ and the $1/R^5$ contributions without the need to invoke any extra
effects or operators.² Generically, the compactification scale is set by the lowest of the
strong interaction scales of various 5d field theories present in a given model.
We note that a different 2-loop stabilization mechanism was previously considered
in the context of 6d $\lambda\phi^3$ theory, where the coupling is dimensionless and a logarithmic
$R$-dependence arises at the 2-loop level [5]. Furthermore, the possibility of 2-loop stabilization based on the vanishing of the 1-loop contribution $\sim 1/R^4$ and a balancing of the
$1/R^5$ and the $\ln(R)/R^5$ terms has been pointed out in [6]. Two-loop corrections to the 4d
Casimir effect have been considered by many authors (see, e.g., [7]).
The paper is organized as follows. In Section 2, the above idea is illustrated using the
simple example of 5d $\lambda\phi^4$ theory. We emphasize in particular that, by a judicious choice of
the field content, stabilization at moderately large radii, $R \gg \lambda$, can be achieved, such that
higher-loop corrections are negligible. This is similar to the way in which a perturbatively
controlled non-trivial fixed point arises in the proposal of Banks and Zaks [8].
Section 3 extends the analysis to a 5d Yang–Mills theory with charged bosons and fermions. Amusingly, all 2-loop integrals reduce straightforwardly to the simple scalar case.
Controlled 2-loop stabilization at large radius can be achieved, e.g., in the large-$N$ limit
of SU($N$) gauge theory with appropriate matter content. Furthermore, it is shown that our
2-loop stabilization mechanism extends straightforwardly to SUSY models with Scherk–Schwarz SUSY breaking.
1 Introducing a non-zero 5d cosmological constant $\Lambda_5$, a stable solution can be found by balancing the resulting $2\pi R\Lambda_5$ contribution against the Casimir energy. However, a positive 4d cosmological constant results whose
scale is set by $R$ and which is therefore generically too large.
2 Note that this differs qualitatively from the results of the early discussion of higher-loop Casimir stabilization
in [4] (see below for more details).
In Section 4, we provide a qualitative discussion of the phenomenologically more interesting cases of $S^1/Z_2$ and of the $S^1$ with 3-branes. Since the finite and calculable
2-loop bulk contribution mixes with the 1-loop effect induced by brane operators with
unknown coefficients, a complete predictivity just on the basis of the field content cannot be achieved. In the generic case, the 2-loop effect considered here represents an O(1)
correction to the previously discussed stabilization by brane-kinetic terms. We point out
the interesting and natural limit of logarithmically enhanced brane-localized gauge-kinetic
terms, which allows one to neglect the bulk 2-loop effect and to achieve full predictivity
just on the basis of the particle spectrum.
A summary of our results as well as a discussion of possible further research directions,
in particular the applicability to SUSY models on $S^1/Z_2$ and to the case of more than 5
dimensions, can be found in Section 5.
The evaluation of the relevant loop integrals and non-trivial SUSY-based checks of the
2-loop gauge theory calculation are described in two appendices.
2. A $\lambda\phi^4$ example
Consider the 5d theory of Einstein gravity and a massless real scalar with classical action
\[ S = \int d^5x\,\sqrt{g}\left[\frac{1}{2}M_{P,5}^3\,\mathcal{R}_5 + \frac{1}{2}(\partial\phi)^2 - \frac{\lambda}{4!}\phi^4\right] \tag{1} \]
compactified on an $S^1$ with radius $R$. Clearly, a variation of the metric background field
$g_{55}$ is equivalently described by a variation of the volume $2\pi R$. In the following, we will
always use a 5d Minkowski metric treating $R$ as our volume or radion degree of freedom.
It will not be necessary to perform a Weyl rescaling of the metric to manifestly separate
graviton and radion degrees of freedom in the 4d effective theory.
Assuming that $M_{P,5} \gg 1/\lambda$, we can consistently neglect gravitational interactions. On
dimensional grounds, the effective potential for $R$ then reads
\[ V(R) = \frac{1}{R^4}\left[c^{(1)} + c^{(2)}\frac{\lambda}{R} + c^{(3)}\frac{\lambda^2}{R^2} + \cdots\right], \tag{2} \]
where the $c^{(n)}$ are $n$-loop coefficients. Note that, even in the limit of large 5d Planck mass,
$c^{(1)}$ has to include the 5d graviton contribution.
Radius stabilization can be achieved already at the 2-loop level. Indeed, if $c^{(1)}$ is negative and $c^{(2)}$ positive, the 2-loop potential is minimized at
\[ R = -\frac{5}{4}\,\frac{\lambda\,c^{(2)}}{c^{(1)}}. \tag{3} \]
However, the 4d cosmological constant at the minimum is negative. This can be remedied
either by adding a 3-brane with appropriately tuned positive tension or a 5d bulk cosmological constant. In the second case, an extra $R$-dependent contribution to the 4d effective
potential results,
\[ V_{cc}(R) = 2\pi\Lambda_5 R. \tag{4} \]
Requiring both $V(R)$ and $V'(R)$ to vanish at the same point (with $V$ now including $V_{cc}$) determines the precise value
of $\Lambda_5$ and gives rise to a slightly shifted minimum at³
\[ R = -\frac{6}{5}\,\frac{\lambda\,c^{(2)}}{c^{(1)}}. \tag{5} \]
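The two stationary points quoted in Eqs. (3) and (5) can be checked numerically. The following is a minimal sketch, not part of the original analysis: the values of `c1`, `c2` and `lam` are illustrative assumptions (only their signs matter), and the helper names are hypothetical. It locates the zero of $V'$ by bisection and then verifies that, once $\Lambda_5$ is tuned so that the full potential vanishes at $R = -6\lambda c^{(2)}/(5c^{(1)})$, the derivative vanishes there too.

```python
import math

def V(R, c1, c2, lam):
    """Two-loop truncation of Eq. (2): V = c1/R^4 + lam*c2/R^5."""
    return c1 / R**4 + lam * c2 / R**5

def dV(R, c1, c2, lam):
    """Derivative of V with respect to R."""
    return -4.0 * c1 / R**5 - 5.0 * lam * c2 / R**6

def bisect(f, lo, hi, steps=200):
    """Root of f on [lo, hi] by bisection; f must change sign on the interval."""
    f_lo = f(lo)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative (assumed) inputs: c1 < 0, c2 > 0, coupling lam in arbitrary units.
c1, c2, lam = -1.0, 2.0, 1.0

# Untuned case: numerical stationary point versus the closed form of Eq. (3).
R_numeric = bisect(lambda R: dV(R, c1, c2, lam), 0.1, 100.0)
R_eq3 = -5.0 * lam * c2 / (4.0 * c1)

# Tuned case: choose Lambda5 so that V + 2*pi*Lambda5*R vanishes at R_eq5,
# then check that the derivative of the full potential vanishes there as well.
R_eq5 = -6.0 * lam * c2 / (5.0 * c1)
Lambda5 = -V(R_eq5, c1, c2, lam) / (2.0 * math.pi * R_eq5)
slope_at_R_eq5 = dV(R_eq5, c1, c2, lam) + 2.0 * math.pi * Lambda5

print(R_numeric, R_eq3, R_eq5, Lambda5, slope_at_R_eq5)
```

With these signs the tuned $\Lambda_5$ comes out positive, in line with the statement that the untuned minimum has negative vacuum energy.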
Unfortunately, assuming that $c^{(n)} = O(1)$, it is immediately clear that higher-loop terms
cannot be neglected in the vicinity of the above 2-loop minimum.⁴ This situation may, in
fact, be generic. In this case we cannot do more than express the justified hope that, in many
models, higher-loop effects will significantly change but not destroy the minimum found
at the 2-loop level. Our first conclusion is therefore that radius stabilization by higher-loop Casimir energy is presumably generic (in the sense of occurring in a large fraction
of models) and the resulting compactification scale is of the order of the strong interaction
scale of the most strongly coupled of the bulk field theories.
However, in specific examples a quantitatively controlled minimum based on the 2-loop
approximation can occur. Indeed, in models where $|c^{(1)}| \ll |c^{(2)}|$, the 2-loop minimum is
at $R \gg \lambda$ and higher-loop effects are suppressed (assuming that no undue enhancement
of the coefficients $c^{(n)}$ with $n \ge 3$ occurs). As a concrete realization, consider the O($N_s$)
symmetric generalization of the above scalar $\lambda\phi^4$ Lagrangian,
\[ \frac{\lambda}{4!}\phi^4 \;\to\; \frac{\lambda}{4!}\left(\sum_{i=1}^{N_s}\phi_i^2\right)^2, \tag{6} \]
together with $N_f$ fermions which are not (or only very weakly) coupled to the scalars. It is
clear that, in this case,
\[ c^{(1)} \sim 4N_f - N_s - 5 \tag{7} \]
(which is the difference of fermionic and bosonic degrees of freedom, including the 5
graviton polarizations) while
\[ c^{(n)} \sim N_s^n \quad \text{for } n \ge 2. \tag{8} \]
Thus, taking $N_s$ large while keeping $4N_f - N_s - 5$ O(1) and negative, one has $R \sim \lambda N_s^2$
at the 2-loop minimum and all higher-loop terms are suppressed.
3 Strictly speaking, since we are not working in the Einstein frame, the equation of motion for $R$ is $V' + \pi M_{P,5}^3\mathcal{R}_4 = 0$, where $\mathcal{R}_4$ is the 4d curvature. Demanding $V = 0$ at the minimum also ensures that minimization
of $V$ is equivalent to solving the equation of motion.
4 We could, of course, improve our estimates by extracting appropriate loop suppression factors from the coefficients $c^{(n)}$ using naive dimensional analysis (see, e.g., [9]). However, since this would not affect our argument
qualitatively, we do not enter such a more detailed discussion. Alternatively, one can imagine that these factors
have already been absorbed in a redefined coupling.
Finally, we fill in the explicit numbers for the first two loop coefficients
used above. As already mentioned, $c^{(1)}$ is proportional to the difference of on-shell fermionic and bosonic degrees of freedom of the 5d theory,
\[ c^{(1)} = (N_{\rm fermions} - N_{\rm bosons})\,c_0^{(1)}, \tag{9} \]
where [1,3] (see also Appendix A)
\[ c_0^{(1)} = \frac{3\zeta(5)}{(2\pi)^6}. \tag{10} \]
The 2-loop coefficient for a single scalar is due to the figure-8 diagram, which has previously been derived using the winding mode expansion [6]. In our context, it is crucial that
the tree-level masslessness in 5d is maintained by an appropriate 1-loop counterterm.⁵ The
result reads (for explicit calculations see Appendix A)
\[ c^{(2)} = \frac{\zeta(3)^2}{8(2\pi)^9}. \tag{11} \]
Going from a single scalar to the O($N_s$)-symmetric model, the coefficient $c^{(2)}$ of Eq. (11)
has to be multiplied by $(N_s^2 + 2N_s)/3$. For our simple scalar example to work, it is important that $c^{(2)} > 0$.
Thus, a quantitatively controlled 2-loop minimum arises from the potential
\[ V(R) = \frac{1}{(2\pi)^6R^4}\left[3\zeta(5)\,(4N_f - N_s - 5) + \frac{\zeta(3)^2}{24(2\pi)^3}\left(N_s^2 + 2N_s\right)\frac{\lambda}{R}\right], \tag{12} \]
in the specific large-$N_s$ limit described above.
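To get a feeling for the numbers, the coefficients $c_0^{(1)} = 3\zeta(5)/(2\pi)^6$ and $c^{(2)} = \zeta(3)^2/(8(2\pi)^9)$ can be evaluated and inserted into the 2-loop minimum $R = -\frac{5}{4}\lambda c^{(2)}/c^{(1)}$. The sketch below is illustrative only: the function names are hypothetical, the zeta values come from a simple partial sum with an Euler–Maclaurin tail estimate, and the sample field contents ($N_s$, $N_f$) are assumptions chosen so that $4N_f - N_s - 5 = -1$.

```python
import math

def zeta(s, terms=10000):
    """Riemann zeta for integer s > 1: partial sum plus a leading tail estimate."""
    partial = sum(k ** -s for k in range(1, terms + 1))
    tail = terms ** (1 - s) / (s - 1) - 0.5 * terms ** -s
    return partial + tail

TWO_PI = 2.0 * math.pi

c0_1 = 3.0 * zeta(5) / TWO_PI**6               # 1-loop coefficient per degree of freedom
c2_single = zeta(3) ** 2 / (8.0 * TWO_PI**9)   # 2-loop coefficient for a single scalar

def radius_over_lambda(n_s, n_f):
    """R/lambda at the 2-loop minimum of the O(N_s) model with N_f free fermions."""
    c1 = (4 * n_f - n_s - 5) * c0_1
    c2 = c2_single * (n_s**2 + 2 * n_s) / 3.0
    if c1 >= 0:
        raise ValueError("the 1-loop coefficient must be negative for a minimum")
    return -5.0 * c2 / (4.0 * c1)

# Assumed sample field contents with 4*N_f - N_s - 5 = -1.
for n_s, n_f in [(100, 26), (1000, 251)]:
    print(n_s, n_f, radius_over_lambda(n_s, n_f))
```

At fixed $4N_f - N_s - 5$ the result grows like $N_s^2 + 2N_s$, which is the $R \sim \lambda N_s^2$ scaling of the text; the explicit loop factors $(2\pi)^{-3}$ mean that $N_s$ has to be fairly large before $R/\lambda$ itself is large.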
While this simple analysis shows that it is quite easy to achieve stabilization in a scalar
$\lambda\phi^4$ theory, a more interesting and realistic case is that of a 5d gauge theory. In particular,
the masslessness assumption introduced above, which is unnatural for an interacting scalar,
will be natural for gauge bosons and charged fermions.
3. Gauge theory
We now turn to the case of a 5d gauge theory compactified on $S^1$ with the action
\[ S = \int d^5x\,\sqrt{g}\left[\frac{1}{2}M_{P,5}^3\,\mathcal{R}_5 - \frac{1}{2g^2}\,\mathrm{tr}_f\,F_{MN}F^{MN} + (D_M\phi)^\dagger D^M\phi + i\bar\psi\,D\!\!\!\!/\,\psi\right]. \tag{13} \]
Here the bosonic and fermionic matter fields transform in some representation $r$ of the
gauge group $G$, the field strength is $F_{MN} = -i[D_M, D_N]$ with $D_M = \partial_M + iA_M$, and the
trace is in the fundamental representation with generators normalized by $2\,\mathrm{tr}_f(T_aT_b) = \delta_{ab}$.
5 There is however a finite (non-local) positive correction to the scalar mass squared of the 4d zero mode.
Fig. 1. Two-loop diagrams from the gauge sector.
Fig. 2. Two-loop diagrams from the matter sector.
The 2-loop vacuum energy contributions come from the diagrams depicted in Figs. 1
and 2. It turns out that, by simple algebraic manipulations, they can all be reduced to the
2-loop integral encountered in the scalar model above. Therefore, the complete result can
be expressed in terms of the constant
\[ c_0^{(2)} = \frac{\zeta(3)^2}{(2\pi)^9}\,\dim(G), \tag{14} \]
where $\dim(G)$ is the dimension of the gauge group (the 1/8 of Eq. (11) is a symmetry
factor characteristic of the $\lambda\phi^4$ model).
Including the symmetry factors and numerator algebra of the various diagrams and accounting for the trace normalization through the constant $T(r)$, defined by $\mathrm{tr}_r(T_aT_b) = T(r)\,\delta_{ab}$ (which is also known as the Dynkin index of the representation $r$), one finds
\[ \begin{aligned}
\frac{c^{(2)}_{\rm vector}}{c_0^{(2)}} &= \frac{1}{4}\,d(d-1)\,T(a) - \frac{3}{4}\,(d-1)\,T(a) + \frac{1}{4}\,T(a) = +\frac{9}{4}\,T(a),\\
\frac{c^{(2)}_{\rm scalar}}{c_0^{(2)}} &= d\,T(r) - \frac{7}{2}\,T(r) = +\frac{3}{2}\,T(r),\\
\frac{c^{(2)}_{\rm fermion}}{c_0^{(2)}} &= (2-d)\,T(r) = -3\,T(r).
\end{aligned} \tag{15} \]
The sum of these coefficients defines $c^{(2)}$ of Eq. (2) (with $\lambda$ replaced by $g^2$) and thereby
the 2-loop contribution to the Casimir energy arising from gauge fields and gauged matter.
To facilitate comparison with other calculations, we have made explicit the $d$-dependence
before setting $d = 5$ and specified the contributions arising from the separate Feynman
gauge diagrams. Specifically, the three contributions to $c^{(2)}_{\rm vector}$ come from the figure 8, the
setting sun diagram, and the setting sun diagram with ghosts (in this order, cf. Fig. 1).
Analogously, the two contributions to $c^{(2)}_{\rm scalar}$ arise from the figure 8 and the setting sun
diagram (cf. Fig. 2). A non-trivial check of these results based on the vanishing of the
Casimir energy in models with unbroken SUSY can be found in Appendix B.
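The arithmetic behind the 2-loop gauge-theory coefficients of Eq. (15), namely $+\frac{9}{4}T(a)$ for the vector, $+\frac{3}{2}T(r)$ for a scalar and $-3T(r)$ for a fermion at $d = 5$, can be reproduced with exact rational numbers. The sketch below uses hypothetical function names and returns each coefficient in units of $c_0^{(2)}$ times the Dynkin index. The two SUSY combinations at the end are an assumption about the multiplet content (a 5d vector multiplet taken as a vector, one real adjoint scalar counted as half a complex one, and one Dirac fermion; a hypermultiplet taken as two complex scalars and one Dirac fermion), intended only to illustrate the kind of cancellation referred to in the text.

```python
from fractions import Fraction

def c2_vector(d):
    """Figure 8, setting sun and ghost setting sun, in units of c0^(2) * T(a)."""
    return Fraction(d * (d - 1), 4) - Fraction(3 * (d - 1), 4) + Fraction(1, 4)

def c2_scalar(d):
    """Figure 8 and setting sun for a complex scalar, in units of c0^(2) * T(r)."""
    return Fraction(d) - Fraction(7, 2)

def c2_fermion(d):
    """Fermion loop contribution, in units of c0^(2) * T(r)."""
    return Fraction(2 - d)

d = 5
print(c2_vector(d), c2_scalar(d), c2_fermion(d))

# Assumed multiplet contents, all fields in the adjoint so that T(r) = T(a):
vector_multiplet = c2_vector(d) + Fraction(1, 2) * c2_scalar(d) + c2_fermion(d)
hypermultiplet = 2 * c2_scalar(d) + c2_fermion(d)
print(vector_multiplet, hypermultiplet)
```

The three vector terms combine to $(d-2)^2/4$ for any $d$, which gives a quick cross-check of the individual fractions.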
One can see that in a pure gauge theory, one again encounters precisely the situation
outlined above: $c^{(1)}$ is negative while $c^{(2)}$ is positive so that 2-loop stabilization is automatic. In order to maintain perturbativity, we need to reduce the one-loop coefficient
without affecting the other $c^{(n)}$. This is most easily achieved by considering large groups
(e.g., SU($N$) with $N$ large, where $c^{(n)} \sim N^{1+n}$) and adding fermions uncharged under the
gauge group to reduce $c^{(1)}$ to O(1). This will lead to $R \sim g^2N^3$ at the 2-loop minimum
and result in a relative suppression of the $n$-loop ($n > 2$) contribution near this minimum
by $1/N^{2n-4}$. In fact, this situation may arise fairly naturally since higher-loop coefficients
are dominated by the most strongly coupled gauge group factor, so that it is sufficient
to require the relevant fermions to be neutral only under this part of the group. We have
thus seen that it is easy to stabilize the $S^1$ radius $R$ in a controlled fashion. Of course, to
build a realistic model one would have to include branes or to consider more sophisticated
geometries allowing for chiral fermions.
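The large-$N$ counting used here can be spelled out in a few lines. The following is a sketch that assumes only the scalings stated in the text, $c^{(1)} = O(1)$ and $c^{(n)} \sim N^{1+n}$ for $n \ge 2$, together with the general form of Eq. (2) with the scalar coupling replaced by $g^2$; order-one factors are dropped throughout.

```latex
\begin{align*}
V(R) &\sim \frac{1}{R^4}\Big[\,c^{(1)} + \sum_{n\ge 2} c^{(n)}\Big(\frac{g^2}{R}\Big)^{n-1}\Big],
  \qquad c^{(n)} \sim N^{1+n}\quad (n \ge 2),\\[4pt]
R_{\min} &\sim \frac{g^2\,c^{(2)}}{|c^{(1)}|} \sim g^2 N^3,\\[4pt]
\frac{c^{(n)}\,(g^2/R_{\min})^{n-1}}{c^{(2)}\,(g^2/R_{\min})}
  &\sim \frac{N^{1+n}\,N^{-3(n-1)}}{N^{3}\,N^{-3}} = N^{4-2n} = \frac{1}{N^{2n-4}} .
\end{align*}
```

At the minimum the 2-loop term is of the same order as the O(1) 1-loop term, so the last line is also the suppression of the $n$-loop term relative to the terms that determine the minimum.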
We close this section by commenting on a possible SUSY version of this scenario.
Consider a model containing a 5d supergravity multiplet, $N_V = \dim(G)$ vector-multiplets
defining a super-Yang–Mills theory with gauge group $G$, and $N_H$ hypermultiplets in a
representation $r$ of $G$. Let us break SUSY from N = 2 to N = 0 by introducing Scherk–Schwarz boundary conditions on the $S^1$ [10]. The effect of this breaking is a mass shift for
gauginos, gravitinos and hyperscalars. Their Kaluza–Klein (KK) masses become $m_nR = n + \omega$, where the real parameter $\omega$ is known as the Scherk–Schwarz parameter. Already
at this point it is clear that our stabilization mechanism is qualitatively unchanged: The
Scherk–Schwarz parameter is dimensionless and our basic formula, Eq. (2), remains valid.
Of course, the coefficients $c^{(i)}$ are now functions of $\omega$ (which vanish for $\omega = 0$).
The 1-loop contribution to the Casimir energy is specified by [4,11,12]
\[ c^{(1)}_{SS} = \frac{12}{(2\pi)^6}\,(N_H - N_V - 2)\left[\zeta(5) - \zeta_\omega(5)\right], \tag{16} \]
where we have defined the function $\zeta_\omega(n)$ by
\[ \zeta_\omega(n) = \sum_{k=1}^{\infty}\frac{\cos(2\pi k\omega)}{k^n}. \tag{17} \]
Since $\zeta_\omega(n) \le \zeta_0(n) = \zeta(n)$, this contribution is negative as long as $N_V + 2 > N_H$. We
can now balance matter and gauge multiplets in exactly the same way as in the non-SUSY
case to ensure $c^{(1)} = O(1)$. For the 2-loop contribution of the vector-multiplet, we use the
results of Appendix B but evaluate the corresponding gaugino integrals with the shifted
masses to obtain
\[ c^{(2)}_{SS\,\rm vector} = \frac{4}{(2\pi)^9}\,N_V\,T(a)\left[\zeta(3) - \zeta_\omega(3)\right]^2. \tag{18} \]
Likewise, the hypermultiplet contribution reads⁶
\[ c^{(2)}_{SS\,\rm hyper} = -\frac{4}{(2\pi)^9}\,N_V\,T(r)\left[\zeta(3) - \zeta_\omega(3)\right]^2. \tag{19} \]
Since, as in the non-SUSY case, the vector-multiplet contribution is positive, our 2-loop
stabilization mechanism remains effective in this simple SUSY model as long as there is
not too much matter charged under the most strongly coupled gauge group factor.
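The Scherk–Schwarz coefficients can be explored numerically as a function of the twist. The sketch below is a hedged illustration rather than the authors' calculation: function names are hypothetical, the twisted zeta function $\sum_{k\ge1}\cos(2\pi k\omega)/k^n$ is evaluated by a truncated sum, the hypermultiplet term is taken with a relative minus sign (as implied by the cancellation against the vector term for adjoint hypermultiplets noted in the footnote), and the SU($N$) inputs $N_V = N^2 - 1$, $T(a) = N$ are assumptions tied to the normalization $2\,\mathrm{tr}_f(T_aT_b) = \delta_{ab}$.

```python
import math

TWO_PI = 2.0 * math.pi

def zeta_omega(n, omega, terms=20000):
    """Truncated sum of cos(2*pi*k*omega) / k^n over k = 1..terms."""
    return sum(math.cos(TWO_PI * k * omega) / k**n for k in range(1, terms + 1))

def c1_ss(n_h, n_v, omega):
    """1-loop coefficient with Scherk-Schwarz twist omega."""
    return 12.0 / TWO_PI**6 * (n_h - n_v - 2) * (zeta_omega(5, 0.0) - zeta_omega(5, omega))

def c2_ss_vector(n_v, t_adj, omega):
    """2-loop vector-multiplet coefficient."""
    return 4.0 / TWO_PI**9 * n_v * t_adj * (zeta_omega(3, 0.0) - zeta_omega(3, omega)) ** 2

def c2_ss_hyper(n_v, t_rep, omega):
    """2-loop hypermultiplet coefficient (relative minus sign assumed, see lead-in)."""
    return -4.0 / TWO_PI**9 * n_v * t_rep * (zeta_omega(3, 0.0) - zeta_omega(3, omega)) ** 2

# Assumed example: SU(4) with a handful of hypermultiplets, maximal twist omega = 1/2.
N = 4
n_v, t_adj = N * N - 1, float(N)
for omega in (0.0, 0.25, 0.5):
    print(omega, c1_ss(6, n_v, omega), c2_ss_vector(n_v, t_adj, omega))
```

For $\omega = 0$ every coefficient vanishes, as expected for unbroken SUSY, and the 1-loop coefficient is negative whenever $N_V + 2 > N_H$ and $\omega \neq 0$.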
4. Compactifications with branes and fixed-points

In this section, we focus on the 5d gauge theory case discussed in Section 3 since it
is presumably more likely to be part of phenomenologically interesting theories than the
scalar toy model of Section 2. However, most of our discussion applies, qualitatively, also
to the scalar case and presumably to many other 5d models with dimensionful couplings
and corresponding 2-loop Casimir stabilization.
As already mentioned above, the pure $S^1$ case discussed up to now cannot give rise to
realistic models since the 4d particle spectrum is necessarily vector-like. A simple way of
embedding this type of $S^1$ stabilization in a realistic construction is to add a 3-brane on
which matter fields can live.⁷ To apply our stabilization mechanism without any modification, one could simply assume that the brane fields are not charged under the bulk gauge
theory responsible for the stabilizing 2-loop Casimir energy.
If one does not make this assumption, the charged brane fields generate, at the 1-loop
level, brane localized gauge-kinetic terms for the bulk gauge theory. Such brane-kinetic
terms contribute to the 1-loop Casimir energy induced by bulk fields. We can view this as
a 2-loop effect including both brane and bulk fields. It is easy to convince oneself that this
contribution is not parametrically suppressed relative to the bulk 2-loop effect calculated
in Section 3. Thus, we are precisely in the situation of 1-loop Casimir stabilization with
brane-kinetic terms considered in [3]. Our 2-loop calculation is then simply a finite O(1)
correction to this stabilization mechanism. However, since the coefficient of the relevant
brane-kinetic term is UV-sensitive, its value after renormalization is in essence just a new
parameter of the model. The precise result of the 2-loop calculation is then only meaningful
if one has some first-principles knowledge about the coefficient of the brane-kinetic term.
This will, in general, require a UV completion as it is provided, e.g., by a string orbifold
model.
A very similar situation arises in an $S^1/Z_2$ geometry. Here, not only brane-localized
charged fields but also the bulk gauge and matter fields themselves induce brane-localized
gauge-kinetic terms. This is, in fact, a familiar and well-studied effect in the context
of higher-dimensional grand unified theories (GUTs) with symmetry breaking by brane-localized Higgs fields and in orbifold GUTs [14,15]. It corresponds to the statement that a
(modified) logarithmic running of the gauge couplings continues above the compactification scale, which can be understood as the running of the coefficient of a brane-localized
$F^2$ term [15].
6 The relation $c^{(i)}_{SS\,\rm vector} + c^{(i)}_{SS\,\rm hyper} = 0$ for an adjoint hypermultiplet representation reflects the fact that these
N = 2 multiplets combine into N = 4 vector-multiplets. The Scherk–Schwarz mechanism leaves N = 2 SUSY
unbroken and the Casimir energy remains zero.
7 As already mentioned earlier, a brane-localized cosmological constant can be used to tune the effective 4d
cosmological constant to zero. Its interplay with a possible bulk cosmological constant can also affect stabilization
(for more options along these lines see, e.g., [13]). Here, we require the bulk cosmological constant to be small
because of some symmetry principle (e.g., bulk SUSY) and treat the sum of brane cosmological constants as an
unknown parameter to be fixed by the requirement of 4d flatness.
In fact, it is easy to check that, even in the simple scalar model, the 2-loop vacuum
energy in an $S^1/Z_2$ compactification has (in contrast to the pure $S^1$ case) a logarithmic
divergence that can be absorbed into a brane-localized $\phi\,\partial_5^2\,\phi$ operator. This is analogous to
the boundary term $F_{\mu\nu}F^{\mu\nu}$ (with $\mu, \nu \in \{0, 1, 2, 3\}$) induced in gauge theories with even
boundary conditions for $A_\mu$. However, while in the gauge theory case only the 4d part of
the gauge-kinetic term is corrected ($F_{\mu 5}$ being zero at the boundary), in the scalar case
the correction only affects the $\partial_5$ part (the 1-loop self energy diagram being momentum-independent).
The apparent loss of predictivity associated with the UV-sensitive coefficients of brane-localized kinetic terms is not as severe as one might naively think. The reason is the
logarithmic enhancement of such terms associated with their logarithmic divergence. Indeed, the bulk gauge theory has a strong interaction scale associated with the 5d gauge
coupling, $M \sim 24\pi^3/g^2$ [9]. It is natural to assume that, at this scale, the brane-localized
$F^2$ term has an O(1) coefficient. Running down to the compactification scale $M_c = 1/R$,
one obtains a log-enhanced coefficient $\sim \ln(M/M_c) = \ln(24\pi^3R/g^2)$. This is dominant
with respect to the unknown O(1) initial value. The corresponding log-enhanced 1-loop
Casimir effect contribution of the brane operator is also dominant with respect to the true
2-loop effect calculated in Section 3. Thus, the leading piece of the $g^2/R^5$ contribution is
calculable on the basis of the low-energy field content of the model. The parametric behaviour is, in fact, not a pure power of $R$ but includes the logarithmic $R$ dependence of the
coefficient. The 4d vacuum energy of a 5d gauge theory compactified on $S^1/Z_2$ thus has
the form⁸
\[ V_4 = \frac{1}{R^4}\left[c^{(1)} + c^{(2)}\,\frac{g^2\ln(MR)}{R}\right], \tag{20} \]
and can, as explained earlier, give rise to quantitatively controlled radius stabilization at
$R \sim g^2N^3$ in the case of an SU($N$) gauge theory.
8 As already mentioned in the Introduction, higher-loop radius stabilization was also discussed in [4], but in
a very different approach. We understand that Ref. [4] treats the 4d coupling as fundamental and $R$-independent,
resulting in loop corrections $\sim \ln(R)/R^4$. By contrast, we consider the 5d coupling as fundamental, which implies
a 4d coupling $\sim 1/R$ and hence a vacuum energy loop correction $\sim \ln(R)/R^5$.
Furthermore, a potential similar to Eq. (20) was derived in [6] for a scalar model with Yukawa couplings to
brane fields. Assuming an exact cancellation of the 1-loop contributions from scalars and fermions, stabilization
was achieved by a balancing of the $1/R^5$ and $\ln(R)/R^5$ terms of the 2-loop result.
To be more specific, focus on 5d SU($N$) gauge theory on $S^1/Z_2$ (where the gauge group
is not broken by the orbifolding) with $N_f$ uncharged bulk fermions and with charged brane
fermions in a representation $r$ at the $x^5 = 0$ boundary. The logarithmic running above the
compactification scale induces an effective brane-localized $F^2$ term
\[ \mathcal{L} \supset -\frac{1}{96\pi^2}\,\mathrm{tr}\left(F_{\mu\nu}F^{\mu\nu}\right)\ln(MR)\left\{\delta(x^5)\left[8T(r) - \frac{23}{4}T(a)\right] + \delta(x^5 - \pi R)\left[-\frac{23}{4}T(a)\right]\right\} \tag{21} \]
at scale $M_c$. The calculations leading to this formula are explained in detail in Section 3 of
[16] and can also be extracted from the earlier references [14,15]. Here the strong interaction scale is $M \sim 24\pi^3/(g^2N)$, accounting for the large-$N$ enhancement of higher loops
in SU($N$) gauge theory. As far as the 1-loop Casimir energy calculation using a summation
of KK modes is concerned, the effect of this contribution can be summarized by an extra
contribution to the momentum-space effective action,
\[ \frac{\pi R}{2g^2}\left(k^2\delta_{\mu\nu} - k_\mu k_\nu\right) \;\to\; \left(\frac{\pi R}{2g^2} + 2b\right)\left(k^2\delta_{\mu\nu} - k_\mu k_\nu\right), \tag{22} \]
to be used for even modes (i.e., the $A_\mu$ modes) only. Here $b$ is defined by minus the sum
of the two coefficients of the brane-localized $F^2$ terms as given in Eq. (21) and $k$ is the
Euclidean 4-momentum. The extra factor of 2 arises since the even higher KK modes are
twice as sensitive to brane operators as the zero mode.
Based on this, the 1-loop contribution to the 4d potential induced by the brane-kinetic
terms reads
\[ \begin{aligned}
V^{\rm brane}(R) &= \frac{3\dim(G)}{2}\sum_{n=1}^{+\infty}\int\frac{d^{d-1}k}{(2\pi)^{d-1}}\left\{\ln\left[(1+\bar b)\,k^2 + (nM_c)^2\right] - \ln\left[k^2 + (nM_c)^2\right]\right\}\\
&\simeq \frac{3\dim(G)}{2}\sum_{n=1}^{+\infty}\int\frac{d^{d-1}k}{(2\pi)^{d-1}}\,\frac{\bar b\,k^2}{k^2 + (nM_c)^2}.
\end{aligned} \tag{23} \]
Here $\bar b = 2b/(\pi R/2g^2)$ and the summation is only over even non-zero (cosine) modes
of the KK mode expansion on the $S^1$ compact space. The prefactor of 3 accounts for the
3 on-shell degrees of freedom of a massive vector. Note that we have simply interpreted
the boundary operator as a correction to the energy of each separate leading-order KK
mode. (This shifted KK mode spectrum has been derived in a different way in Appendix B of [17].) To be more precise, one would have to re-diagonalize the quadratic-order
Hamiltonian after inclusion of the boundary terms and then work with the modified KK
mode expansion [3]. However, as long as $\ln(MR) \ll MR$, we are entitled to linearize in
the boundary term and our simplified treatment is sufficient.
Using the calculational techniques of Appendix A, the above formula is evaluated to
give the brane-induced correction (equivalent to the dominant part of the 2-loop Casimir
energy)
\[ V^{\rm brane}(R) = \frac{36\,\zeta(5)\,g^2\dim(G)\,b}{\pi(2\pi)^6R^5}, \tag{24} \]
leading to the final result⁹
\[ V(R) = \frac{3\zeta(5)}{(2\pi)^6R^4}\left\{\frac{4N_f - 3(N^2-1) - 5}{2} + \frac{N^2-1}{(2\pi)^3}\left[8T(r) - \frac{23}{2}T(a)\right]\frac{g^2\ln(MR)}{R}\right\}. \tag{25} \]
It is now easy to arrange for the coefficient of the $1/R^4$ piece to be negative and O(1)
while keeping the $1/R^5$ term positive and O($g^2N^3$) such that, as before, controlled 2-loop
stabilization at a moderately large radius is achieved.¹⁰ Compared with the $S^1$ case, where
$MR \sim N^2$, we now have $MR/\ln(MR) \sim N^2$. Using asymptotic properties of the Lambert
W function (see, e.g., [18]) the solution for $N$ large can be written as
\[ MR \sim N^2\rho\left[1 + O(\ln\rho/\rho)\right], \qquad \rho = \ln N^2, \tag{26} \]
thus giving an additional enhancement factor $\sim \ln N^2$.
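The logarithmic enhancement can be made concrete by solving the transcendental relation $MR/\ln(MR) = a$ with $a \sim N^2$ numerically and comparing with the leading asymptotic form $a\,\ln a\,[1 + \ln(\ln a)/\ln a]$. The sketch below is illustrative only: the function names are hypothetical, order-one factors in $a$ are set to one by assumption, and a plain fixed-point iteration is used instead of a library implementation of the Lambert W function.

```python
import math

def solve_x_over_log(a, iterations=200):
    """Solve x / ln(x) = a on the large-x branch (a > e) via x <- a * ln(x)."""
    x = a * math.log(a)          # starting guess: the leading asymptotic term
    for _ in range(iterations):
        x = a * math.log(x)      # contraction with rate 1/ln(x) < 1 near the root
    return x

def asymptotic(a):
    """Leading large-a form: a * rho * (1 + ln(rho)/rho) with rho = ln(a)."""
    rho = math.log(a)
    return a * rho * (1.0 + math.log(rho) / rho)

# Assumed example: N = 100, so a = N^2 (order-one factors dropped).
N = 100
a = float(N * N)
x_exact = solve_x_over_log(a)
x_asym = asymptotic(a)
print(x_exact, x_asym, x_exact / a)
```

The ratio `x_exact / a` is the enhancement of $MR$ over the $S^1$ estimate $MR \sim N^2$; it grows only logarithmically with $N$, and the asymptotic form tracks the numerical solution up to corrections that are themselves suppressed by further powers of $1/\ln a$.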
5. Conclusions
We have presented a mechanism stabilizing the size $R$ of an extra dimension compactified on $S^1$ or on the orbifold $S^1/Z_2$ which is based on the presence of dimensionful
couplings, a generic feature of field-theoretic models with $d > 4$. We have shown that, by
balancing the 1- and 2-loop contributions to the Casimir energy, a perturbatively controlled
minimum at moderately large values of $R$ can be realized.
In the case of scalar massless $\lambda\phi^4$ theory on $S^1$, the Casimir energy is calculable and
finite as long as the masslessness is enforced as a renormalization condition. The 1-loop
contribution scales like $-R^{-4}$, while the 2-loop effect of the scalar self-interaction gives
$+\lambda R^{-5}$, thus producing a non-trivial minimum. In order to ensure that the result is
perturbatively controlled, we have added weakly coupled fermions. Their effect is to reduce
the numerical factor of the 1-loop term while only very mildly affecting the higher-loop
contributions. This shifts the minimum to larger values of $R$.
The situation is basically the same in the case of a gauge theory. As before, the purely
bosonic theory already produces a 2-loop minimum in the radion potential, while the inclusion of fermions uncharged under the most strongly coupled gauge group factor ensures the
perturbativity of the result. We have also shown that in a SUSY extension of this scenario
(with SUSY-breaking à la Scherk–Schwarz), the mechanism works without modification.
The above calculable two-loop stabilization is modified but not destroyed if, instead of
the circle, the orbifold $S^1/Z_2$ is considered. The crucial new feature of this situation is the presence of
brane-kinetic terms for the gauge fields, which are generated in the 1-loop effective action.
Without the knowledge of the underlying UV physics one cannot predict the corresponding
counterterm at the cutoff scale $M$. However, in calculating the potential of $R$ for large
values of the radius, $R \gg M^{-1}$, one effectively integrates out the physics down to that
scale, and the logarithmic running of the brane-kinetic term dominates over its unknown
initial value. The dominant 2-loop effect can then be obtained by evaluating the 1-loop
integral in the presence of this log-enhanced brane-kinetic term. The new contribution
scales like $g^2\ln(RM)/R^5$ and is calculable on the basis of the bulk and brane field
content of the model. We emphasize that this log-divergent part of the 2-loop calculation
dominates over the remaining (finite) 2-loop effects (scaling as $\sim 1/R^5$) as well as the
one-loop effect of the unknown brane-kinetic term (whose leading effect for large $R$ is
also $\sim 1/R^5$).
9 The factor of 1/2 in the one-loop contribution is due to the orbifolding, since half of the modes are projected
away (cf. also the summation in Eq. (23)).
10 To turn over the sign in the second term, we need to introduce a certain amount of brane fermions. Note also
that assigning negative parities to some gauge fields (in other words, breaking the gauge group by orbifolding)
would reduce the contribution of the gauge fields to the brane kinetic terms and can even flip its sign.
Although we have focused exclusively on 5d models, we note that our mechanism is
equally suitable for other higher-dimensional manifolds with a single unstabilized modulus. This is because such theories look very similar from a 4d point of view, the main
difference being a modified KK-mass spectrum which depends on the value of the modulus. Various possibilities for stabilizing such a single modulus have recently been discussed
in the context of the KKLT proposal [19], and we believe that Casimir energies will be relevant in this context. This has, in fact, very recently been discussed in the context of the
1-loop effect of massive vector fields in [20], and one can expect that the higher-loop contributions discussed here will also play an important role in the further study of moduli
stabilization in more complete, higher-dimensional scenarios.
As a simple and specific example, we briefly consider $S^n$ compactifications with the
volume undetermined by the Einstein equations. The Casimir energy coming from a self-interacting scalar field will take the same general form as in Eq. (2), with the powers of
$R$ modified according to the dimensionality of the coupling and the $c^{(i)}$ to be calculated
from the appropriate KK spectrum. The coefficient $c^{(1)}$ has been calculated for the case of
a sphere (without additional infinite dimensions) in Ref. [21] for $n = 1, 2, 3, 4$ and found
to be negative. Moreover, it is clear that $c^{(2)}$ is always positive since the integral appears
squared. Thus, if the inequality $c^{(1)} < 0$ survives the transition from $S^n$ to $M_4 \times S^n$ for
$n > 1$ (as it does for $n = 1$), our stabilization mechanism extends straightforwardly to these
geometries. In any case, we can be confident that many examples of 2-loop (or higher-loop)
Casimir stabilization exist within the rich class of models with $n > 1$ compact dimensions
where the total volume is a modulus at tree level.
Our final point concerns SUSY theories on $S^1/Z_2$. The prototype SUSY breaking mechanism on $S^1$, the Scherk-Schwarz mechanism [10], extends to this case. It can equivalently
be described by giving a vacuum expectation value to the F component of the radion superfield (the chiral superfield whose lowest component contains R) [22,23]. Indeed, the
resulting tree level action corresponds to no-scale supergravity and the potential for R
is completely flat. Stabilization at 1-loop has previously been studied by use of bulk mass
terms for hypermultiplets [3,12,24], brane kinetic terms [3,25] or massive vector-multiplets
[20]. The mechanism we described in Section 4 seems to be quite suitable for SUSY $S^1/Z_2$
models without 5d masses. The 1-loop contribution is given by 1/2 of the $S^1$ result obtained
in this paper. We expect the dominant 2-loop effect to be the log-enhanced brane-kinetic
terms, the same as in the non-SUSY case. However, a detailed analysis of this scenario,
of its interplay with the other stabilization mechanisms mentioned above, or even the construction of a realistic SUSY model go beyond the scope of the present paper and are left
to future research.
Acknowledgements
We would like to thank Roberto Contino, Alex Pomarol, Mariano Quirós, Riccardo
Rattazzi and Enrico Trincherini for useful comments and discussions. G.G. is supported
by the Leon Madansky Fellowship and by NSF Grant P420-D36-2051.
Appendix A. Basic 1- and 2-loop integrals

We begin by rederiving the known 1-loop vacuum energy for a real scalar field on
$R^4 \times S^1$ in dimensional regularization. The 1-loop effective potential of the 5d Euclidean theory is given by the $d \to 5$ limit of

$$V_5^{(1)}(R) = \frac{1}{2R} \sum_{n=-\infty}^{+\infty} \int \frac{d^{d-1}k}{(2\pi)^d} \, \ln\left[k^2 + (nM_c)^2\right],$$  (A.1)

where $M_c = 1/R$. Using the fact that the d-dimensional integrals of any power of $k^2$ and
of $\ln(k^2)$ vanish in dimensional regularization, we have

$$V_5^{(1)}(R) = \frac{M_c}{2} \int_0^{M_c^2} dM^2 \sum_{n=-\infty}^{+\infty} \int \frac{d^{d-1}k}{(2\pi)^d} \, \frac{n^2}{k^2 + (nM)^2}$$  (A.2)

$$= -\frac{M_c}{2} \int_0^{M_c^2} dM^2 \int \frac{d^{d-1}k}{(2\pi)^d} \, \frac{k^2}{M^4} \, \frac{\pi \coth(\pi|k|/M)}{|k|/M}.$$  (A.3)

Appealing again to the fact that the d-dimensional integral of a pure power vanishes,
$\coth(\pi|k|/M)$ can be replaced by $\coth(\pi|k|/M) - 1$, making the $d^{d-1}k$ integral finite
and allowing one to take the limit $d \to 5$. The resulting 4d effective potential is

$$V_4^{(1)}(R) = 2\pi R \, V_5^{(1)}(R) = -\frac{3\zeta(5)}{(2\pi)^6 R^4} \quad \text{with } \zeta(5) = 1.0369\ldots,$$  (A.4)

in agreement with Eqs. (9) and (10).
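The closed form in (A.4) can be cross-checked numerically. The following is a minimal sketch of ours, not part of the original derivation. It assumes the normalization $\frac{1}{2R}\sum_n \int d^{d-1}k/(2\pi)^d$ for the 5d potential and the standard identity $\sum_n 1/(k^2 + n^2M^2) = \pi\coth(\pi|k|/M)/(M|k|)$, evaluates the subtracted radial integral at d = 5 with a simple Simpson rule, and compares with $-3\zeta(5)/((2\pi)^6 R^4)$. All function names are hypothetical.

```python
import math

def zeta(s, terms=10000):
    """Riemann zeta for s > 1 by direct summation (ample for s = 5)."""
    return sum(1.0 / n**s for n in range(1, terms + 1))

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def v4_one_loop(R):
    """4d one-loop potential 2*pi*R*V5, from the subtracted k-integral at d = 5."""
    Mc = 1.0 / R
    # Radial integral in units x = |k|/M, using coth(pi x) - 1 = 2/(exp(2 pi x) - 1).
    def integrand(x):
        return 0.0 if x == 0.0 else 2.0 * x**4 / math.expm1(2.0 * math.pi * x)
    radial = simpson(integrand, 0.0, 12.0)   # tail beyond x = 12 is negligible
    angular = 2.0 * math.pi**2               # surface of the unit 3-sphere
    # Remaining M-integral: int_0^{Mc^2} dM^2 M^(-3) * M^5 = Mc^4 / 2.
    m_integral = Mc**4 / 2.0
    v5 = -(Mc / 2.0) * math.pi * angular * radial * m_integral / (2.0 * math.pi)**5
    return 2.0 * math.pi * R * v5

def v4_closed_form(R):
    """Closed form -3 zeta(5) / ((2 pi)^6 R^4)."""
    return -3.0 * zeta(5) / ((2.0 * math.pi)**6 * R**4)
```

Calling `v4_one_loop(1.0)` and `v4_closed_form(1.0)` should give the same negative number up to quadrature error, and both scale as $R^{-4}$.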

Higher-loop corrections to the effective action (in particular to $V_5^{(1)}$) are given by
the sum of all one-particle-irreducible vacuum diagrams of the relevant field theory. As
already pointed out in the main text, the 2-loop correction is essentially just the figure 8
diagram. To be more precise, one has to add to the Lagrangian the mass counterterm of the
uncompactified 5d theory ensuring that the scalar remains massless at the 1-loop level. This
counterterm becomes part of a tadpole-like 1-loop vacuum diagram, which is part of the
2-loop correction. However, in dimensional regularization the mass counterterm vanishes
and the 2-loop correction (for a single scalar field) simply reads

$$V_5^{(2)} = \frac{\lambda}{8} I^2, \quad \text{where } I = \frac{1}{R} \sum_{n=-\infty}^{+\infty} \int \frac{d^{d-1}k}{(2\pi)^d} \, \frac{1}{k^2 + (nM_c)^2}.$$  (A.5)

Here the prefactor contains a symmetry factor (1/8) and the tadpole integral I is evaluated
as the $d \to 5$ limit of

$$I = \int \frac{d^{d-1}k}{(2\pi)^d} \, \frac{\pi \coth(\pi|k|/M_c)}{|k|} = \frac{\zeta(3)}{(2\pi)^5 R^3} \quad \text{with } \zeta(3) = 1.2021\ldots$$  (A.6)

This completes the derivation of Eq. (11) and thereby of Eq. (12).
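The tadpole value in (A.6) can be checked in the same spirit. The sketch below is ours and rests on stated assumptions: the KK sum is done with $\sum_n 1/(k^2 + n^2 M_c^2) = \pi\coth(\pi|k|/M_c)/(M_c|k|)$, the pure power is dropped by replacing coth with coth minus 1, and `lam` is a hypothetical name for the quartic coupling multiplying the symmetry factor 1/8.

```python
import math

def zeta(s, terms=200000):
    """Riemann zeta for s > 1 by direct summation."""
    return sum(1.0 / n**s for n in range(1, terms + 1))

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def tadpole(R):
    """Subtracted tadpole integral I at d = 5 for compactification radius R."""
    # In units x = |k| R: int d^4k (pi/|k|) (coth(pi |k| R) - 1)
    #   = pi * 2 pi^2 * R^(-3) * int_0^inf dx x^2 * 2/(exp(2 pi x) - 1)
    def integrand(x):
        return 0.0 if x == 0.0 else 2.0 * x**2 / math.expm1(2.0 * math.pi * x)
    radial = simpson(integrand, 0.0, 12.0)
    angular = 2.0 * math.pi**2
    return math.pi * angular * radial / ((2.0 * math.pi)**5 * R**3)

def tadpole_closed_form(R):
    """Closed form zeta(3) / ((2 pi)^5 R^3)."""
    return zeta(3) / ((2.0 * math.pi)**5 * R**3)

def v5_two_loop(lam, R):
    """Figure-8 contribution (lam / 8) * I^2; lam is an assumed coupling name."""
    return lam / 8.0 * tadpole(R)**2
```

Because I enters squared, `v5_two_loop` is positive for any positive coupling, which is the sign property used in the main text.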
Appendix B. SUSY checks of gauge theory results

To verify the results of our 2-loop gauge theory calculation, we have performed checks
based on the vanishing of the Casimir energy in theories with unbroken SUSY in 5 and 4
dimensions.
Specifically, we can consider a 5d super-Yang-Mills (SYM) theory which, in addition
to the gauge fields, includes a real adjoint scalar and a Dirac gaugino. To calculate the
2-loop effect in this theory, coupled to a hypermultiplet (a Dirac fermion and two complex
scalars) in a representation r of the gauge group, we need the contributions coming from
the additional Yukawa- and scalar self-interaction diagrams shown in Fig. 3. For the explicit
Lagrangian see, e.g., [26].
The three setting sun diagrams coming from the Yukawa couplings yield

$$c^{(2)}_{\text{gaugino-fermion-scalar}} = -8 c_0^{(2)} T(r), \quad c^{(2)}_{\text{adjoint-fermion}} = -c_0^{(2)} T(r), \quad c^{(2)}_{\text{adjoint-gaugino}} = -c_0^{(2)} T(a).$$  (B.1)

Furthermore, in the scalar sector there are two new figure 8 diagrams coming from the
coupling of the adjoint scalar to the hypermultiplet scalar and of the hypermultiplet scalar to itself, giving two new contributions

$$c^{(2)}_{\text{adjoint-scalar}} = 2 c_0^{(2)} T(r), \quad c^{(2)}_{\text{scalar-scalar}} = 3 c_0^{(2)} T(r).$$  (B.2)

Fig. 3. Two-loop diagrams from Yukawa couplings and scalar self-interactions.

Finally, the gaugino and the adjoint scalar as well as the hypermultiplet fermion and scalar are charged
under the gauge group. The corresponding 2-loop corrections can be directly read off from
Eq. (15). The two bosonic contributions are

$$c^{(2)}_{\text{vector-adjoint}} = \frac{7}{4} T(a) c_0^{(2)}, \quad c^{(2)}_{\text{vector-scalar}} = 7 T(r) c_0^{(2)},$$  (B.3)

where we have included factors 1/2 and 2 to account for the reality of the adjoint scalar and for the presence
of two hypermultiplet scalars, respectively. The two fermionic contributions are precisely
as in Eq. (15), just with T(r) replaced by T(a) in the case of the gaugino.
The contributions of the charged hypermultiplet thus sum up to

$$c^{(2)}_{\text{vector-scalar}} + c^{(2)}_{\text{adjoint-scalar}} + c^{(2)}_{\text{scalar-scalar}} + c^{(2)}_{\text{vector-fermion}} + c^{(2)}_{\text{adjoint-fermion}} + c^{(2)}_{\text{gaugino-fermion-scalar}} = (7 + 2 + 3 - 3 - 1 - 8) \, T(r) c_0^{(2)} = 0.$$  (B.4)

Similarly, the self-interactions of the vector-multiplet give rise to

$$c^{(2)}_{\text{vector}} + c^{(2)}_{\text{vector-adjoint}} + c^{(2)}_{\text{vector-gaugino}} + c^{(2)}_{\text{adjoint-gaugino}} = \left(\frac{9}{4} + \frac{7}{4} - 3 - 1\right) T(a) c_0^{(2)} = 0.$$  (B.5)
Next, since we have kept the d dependence in Eq. (15), we can immediately extend our
analysis to a 4d N = 1 SYM theory. The 2-loop Casimir energy contribution is proportional to

$$c^{(2)}_{\text{4d vector}} + c^{(2)}_{\text{4d vector-gaugino}} = (1 - 1) \, T(a) c_0^{(2)} = 0,$$  (B.6)

where the fermionic contribution includes an extra factor 1/2 relative to Eq. (15) to account
for the chirality of the 4d gaugino.
Finally, consider the contribution of charged 4d matter. The 2-loop effects based on
gauge interactions can be inferred from Eq. (15) (with an appropriate factor 1/2 for fermion
chirality in the loop). The gaugino-fermion-scalar contribution is as in Eq. (B.1), but with
an extra factor 1/4 for the fermion chirality and half the number of scalars. The scalar figure 8 diagrams are induced by the D term potential and give rise to 1/6 of the contribution
displayed in Eq. (B.2) because of the missing trace over $SU(2)_R$ degrees of freedom,
$\mathrm{tr}\,\sigma^i \sigma^i = 6$ (see footnote 11). Overall, one finds

$$c^{(2)}_{\text{4d vector-scalar}} + c^{(2)}_{\text{4d scalar-scalar}} + c^{(2)}_{\text{4d vector-fermion}} + c^{(2)}_{\text{4d gaugino-fermion-scalar}} = \left(\frac{5}{2} + \frac{1}{2} - 1 - 2\right) T(r) c_0^{(2)} = 0,$$  (B.7)

concluding our SUSY based checks.
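The cancellations in (B.4) to (B.7) are plain arithmetic and can be verified with exact rationals. The sketch below is ours: the dictionary keys are hypothetical labels, the coefficients are in units of $T(r) c_0^{(2)}$ or $T(a) c_0^{(2)}$, and the minus signs of the fermionic entries are an assumption, chosen as the only assignment for which the quoted sums vanish. It also checks the $SU(2)_R$ trace $\mathrm{tr}\,\sigma^i\sigma^i = 6$ and the two rescalings (1/4 and 1/6) quoted for the 4d matter case.

```python
from fractions import Fraction as F

# 5d hypermultiplet, coefficients of T(r) c0  (B.4)
hyper_5d = {
    "vector-scalar": F(7), "adjoint-scalar": F(2), "scalar-scalar": F(3),
    "vector-fermion": F(-3), "adjoint-fermion": F(-1),
    "gaugino-fermion-scalar": F(-8),
}
# 5d vector multiplet, coefficients of T(a) c0  (B.5)
vector_5d = {
    "vector": F(9, 4), "vector-adjoint": F(7, 4),
    "vector-gaugino": F(-3), "adjoint-gaugino": F(-1),
}
# 4d N = 1 SYM, coefficients of T(a) c0  (B.6)
vector_4d = {"vector": F(1), "vector-gaugino": F(-1)}
# 4d charged matter, coefficients of T(r) c0  (B.7)
matter_4d = {
    "vector-scalar": F(5, 2), "scalar-scalar": F(1, 2),
    "vector-fermion": F(-1), "gaugino-fermion-scalar": F(-2),
}

totals = {name: sum(table.values()) for name, table in [
    ("hyper_5d", hyper_5d), ("vector_5d", vector_5d),
    ("vector_4d", vector_4d), ("matter_4d", matter_4d),
]}

# Rescalings quoted in the text: 1/4 of the 5d Yukawa term, 1/6 of the 5d quartic term.
yukawa_4d = F(1, 4) * hyper_5d["gaugino-fermion-scalar"]
quartic_4d = F(1, 6) * hyper_5d["scalar-scalar"]

# Pauli matrices and the trace of sigma^i sigma^i summed over i.
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def trace_of_square(m):
    """tr(m m) for a 2x2 matrix given as nested lists."""
    return sum(m[i][j] * m[j][i] for i in range(2) for j in range(2))

pauli_trace = sum(trace_of_square(s) for s in PAULI)
```

Every entry of `totals` is exactly zero, and the rescaled 4d values reproduce the last and second entries of `matter_4d`.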

11 The scalar figure-8 diagram also contains a piece proportional to $(\mathrm{tr}\, T_a)^2$ which is zero for non-Abelian as
well as anomaly-free Abelian gauge theories. The reason why the Casimir energy does not vanish in the presence
of an anomalous U(1) is the associated occurrence of a one-loop Fayet-Iliopoulos term which spontaneously
breaks SUSY.
References
[1] T. Appelquist, A. Chodos, The quantum dynamics of Kaluza-Klein theories, Phys. Rev. D 28 (1983) 772.
[2] J. Garriga, O. Pujolas, T. Tanaka, Radion effective potential in the brane-world, Nucl. Phys. B 605 (2001)
192, hep-th/0004109.
[3] E. Ponton, E. Poppitz, Casimir energy and radius stabilization in five- and six-dimensional orbifolds,
JHEP 0106 (2001) 019, hep-ph/0105021.
[4] I. Antoniadis, S. Dimopoulos, A. Pomarol, M. Quiros, Soft masses in theories with supersymmetry breaking
by TeV-compactification, Nucl. Phys. B 544 (1999) 503, hep-ph/9810410.
[5] A. Albrecht, C.P. Burgess, F. Ravndal, C. Skordis, Exponentially large extra dimensions, Phys. Rev. D 65
(2002) 123506, hep-th/0105261.
[6] L. Da Rold, Radiative corrections in 5D and 6D expanding in winding modes, Phys. Rev. D 69 (2004)
105015, hep-th/0311063.
[7] B.S. Kay, The Casimir effect without magic, Phys. Rev. D 20 (1979) 3052;
L.H. Ford, Casimir effect for a selfinteracting scalar field, Proc. R. Soc. London A 368 (1979) 305;
M. Bordag, D. Robaschik, E. Wieczorek, Quantum field theoretic treatment of the Casimir effect. Quantization procedure and perturbation theory in covariant gauge, Ann. Phys. 165 (1985) 192;
M. Bordag, U. Mohideen, V.M. Mostepanenko, New developments in the Casimir effect, Phys. Rep. 353
(2001) 1, quant-ph/0106045;
F.A. Barone, R.M. Cavalcanti, C. Farina, Radiative corrections to the Casimir effect for the massive scalar
field, Nucl. Phys. B (Proc. Suppl.) 127 (2004) 118, hep-th/0306011.
[8] T. Banks, A. Zaks, On the phase structure of vector-like gauge theories with massless fermions, Nucl. Phys.
B 196 (1982) 189.
[9] Z. Chacko, M.A. Luty, E. Ponton, Massive higher-dimensional gauge fields as messengers of supersymmetry
breaking, JHEP 0007 (2000) 036, hep-ph/9909248.
[10] J. Scherk, J.H. Schwarz, Spontaneous breaking of supersymmetry through dimensional reduction, Phys.
Lett. B 82 (1979) 60;
J. Scherk, J.H. Schwarz, How to get masses from extra dimensions, Nucl. Phys. B 153 (1979) 61.
[11] H. Itoyama, T.R. Taylor, Supersymmetry restoration in the compactified O(16) × O(16) heterotic string
theory, Phys. Lett. B 186 (1987) 129;
I. Antoniadis, A possible new dimension at a few TeV, Phys. Lett. B 246 (1990) 377;
A. Delgado, A. Pomarol, M. Quiros, Supersymmetry and electroweak breaking from extra dimensions at the
TeV-scale, Phys. Rev. D 60 (1999) 095008, hep-ph/9812489;
G. von Gersdorff, M. Quiros, A. Riotto, Radiative Scherk-Schwarz supersymmetry breaking, Nucl. Phys.
B 634 (2002) 90, hep-th/0204041;
G. von Gersdorff, L. Pilo, M. Quiros, D.A.J. Rayner, A. Riotto, Supersymmetry breaking with quasilocalized fields in orbifold field theories, Phys. Lett. B 580 (2004) 93, hep-ph/0305218.
[12] G. von Gersdorff, M. Quiros, A. Riotto, Scherk-Schwarz supersymmetry breaking with radion stabilization,
Nucl. Phys. B 689 (2004) 76, hep-th/0310190.
[13] R. Hofmann, P. Kanti, M. Pospelov, (De-)stabilization of an extra dimension due to a Casimir force, Phys.
Rev. D 63 (2001) 124020, hep-ph/0012213.
[14] Y. Nomura, D.R. Smith, N. Weiner, GUT breaking on the brane, Nucl. Phys. B 613 (2001) 147, hep-ph/
0104041;
L. Hall, Y. Nomura, Gauge unification in higher dimensions, Phys. Rev. D 64 (2001) 055003, hep-ph/
0103125.
[15] A. Hebecker, J. March-Russell, A minimal S^1/(Z_2 × Z_2') orbifold GUT, Nucl. Phys. B 613 (2001) 3, hep-ph/
0106166;
R. Contino, L. Pilo, R. Rattazzi, E. Trincherini, Running and matching from 5 to 4 dimensions, Nucl. Phys.
B 622 (2002) 227, hep-ph/0108102.
[16] A. Hebecker, A. Westphal, Power-like threshold corrections to gauge unification in extra dimensions, Ann.
Phys. 305 (2003) 119, hep-ph/0212175.
[17] H.C. Cheng, K.T. Matchev, M. Schmaltz, Radiative corrections to Kaluza-Klein masses, Phys. Rev. D 66
(2002) 036005, hep-ph/0204342.
[18] R.M. Corless, et al., On the Lambert W function, Adv. Comput. Math. 5 (1996) 329.
[19] S. Kachru, R. Kallosh, A. Linde, S.P. Trivedi, De Sitter vacua in string theory, Phys. Rev. D 68 (2003)
046005, hep-th/0301240.
[20] E. Dudas, M. Quiros, Five-dimensional massive vector fields and radion stabilization, hep-th/0503157.
[21] E. Elizalde, The vacuum energy density for spherical and cylindrical universes, J. Math. Phys. 35 (1994)
3308, hep-th/9308048.
[22] D. Marti, A. Pomarol, Supersymmetric theories with compact extra dimensions in N = 1 superfields, Phys.
Rev. D 64 (2001) 105025, hep-th/0106256.
[23] G. von Gersdorff, M. Quiros, Supersymmetry breaking on orbifolds from Wilson lines, Phys. Rev. D 65
(2002) 064016, hep-th/0110132.
[24] M.A. Luty, N. Okada, Almost no-scale supergravity, JHEP 0304 (2003) 050, hep-th/0209178.
[25] R. Rattazzi, C.A. Scrucca, A. Strumia, Brane to brane gravity mediation of supersymmetry breaking, Nucl.
Phys. B 674 (2003) 171, hep-th/0305184.
[26] P. Fayet, Supersymmetric grand unification in a six-dimensional spacetime, Phys. Lett. B 159 (1985) 121;
E.A. Mirabelli, M.E. Peskin, Transmission of supersymmetry breaking from a 4-dimensional boundary,
Phys. Rev. D 58 (1998) 065002, hep-th/9712214;
A. Hebecker, 5D super-Yang-Mills theory in 4-D superspace, superfield brane operators, and applications
to orbifold GUTs, Nucl. Phys. B 632 (2002) 101, hep-ph/0112230.
Erratum
Erratum to: Inclusive production of single hadrons

with finite transverse momenta in deep-inelastic
scattering at next-to-leading order
[Nucl. Phys. B 711 (2005) 345]
B.A. Kniehl, G. Kramer, M. Maniatis
II. Institut für Theoretische Physik, Universität Hamburg,
Luruper Chaussee 149, 22761 Hamburg, Germany
Received 17 May 2005
Available online 31 May 2005
The four-momentum transfer q introduced below Eq. (1) should be defined with the
opposite sign, as $q = k - k'$.
The dotted histograms in Figs. 9(a) and (b) are erroneous due to a programming mistake. The Furry terms only contribute to the extent in which the fragmentation functions
of quarks and antiquarks differ, which they do not in the cases under consideration. The
related discussions in Sections 3 and 4 should be modified accordingly.
In contrast to what is stated in the first sentence of Appendix B, the expressions for
$h^{F,ab}_{T,L}$ listed in Eqs. (B.3) and (B.4) only refer to partonic subprocess (12) with ab = qq
and the ones listed in Eqs. (B.5) and (B.6) refer to partonic subprocess (14) with ab = qq.
By appropriate permutations of partons b, c, and d, the corresponding expressions for
partonic subprocess (13) with ab = q q are obtained from Eqs. (B.3) and (B.4), and those
for partonic subprocess (15) with ab = qq or ab = q q are obtained from Eqs. (B.5)
and (B.6).
Acknowledgements
We thank M. Fontannaz for a useful communication.
DOI of original article: 10.1016/j.nuclphysb.2005.01.031.
E-mail address: bernd.kniehl@desy.de (B.A. Kniehl).
doi:10.1016/j.nuclphysb.2005.05.017
Nuclear Physics B 720 [FS] (2005) 235-288
Vacuum orbit and spontaneous symmetry breaking

in hyperbolic sigma-models
A. Duncan, M. Niedermaier, E. Seiler
Department of Physics, 100 Allen Hall, University of Pittsburgh, Pittsburgh, PA 15260, USA
Laboratoire de Mathématiques et Physique Théorique, CNRS/UMR 6083, Université de Tours,
Parc de Grandmont, 37200 Tours, France
Max-Planck-Institut für Physik, Föhringer Ring 6, 80805 München, Germany
Received 7 December 2004; accepted 29 April 2004
Abstract
We present a detailed study of quantized noncompact, nonlinear SO(1, N) sigma-models in arbitrary spacetime dimensions $D \geq 2$, with the focus on issues of spontaneous symmetry breaking
of boost and rotation elements of the symmetry group. The models are defined on a lattice both in
terms of a transfer matrix and by an appropriately gauge-fixed Euclidean functional integral. The
main results in all dimensions $\geq 2$ are: (i) on a finite lattice the systems have infinitely many non-normalizable ground states transforming irreducibly under a nontrivial representation of SO(1, N);
(ii) the SO(1, N) symmetry is spontaneously broken. For D = 2 this shows that the systems evade the
Mermin-Wagner theorem. In this case in addition: (iii) Ward identities for the Noether currents are
derived to verify numerically the absence of explicit symmetry breaking; (iv) numerical results are
presented for the two-point functions of the spin field and the Noether current as well as a new order
parameter; (v) in a large N saddle-point analysis the dynamically generated squared mass is found
to be negative and of order 1/(V ln V ) in the volume, the 0-component of the spin field diverges as
ln V , while SO(1, N ) invariant quantities remain finite.

PACS: 11.10.Lm; 71.55.Jv
E-mail address: ehs@mppmu.mpg.de (E. Seiler).

doi:10.1016/j.nuclphysb.2005.04.038
A. Duncan et al. / Nuclear Physics B 720 [FS] (2005) 235-288
1. Introduction
Noncompact nonlinear sigma-models occur in a variety of contexts. They are ubiquitous
in the dimensional reduction of (super-)gravity theories, which provided the main incentive
for the study of their quantum properties [1-7]. Motivated by structural similarities they
were also used as a test-bed for renormalization and symmetry aspects of quantum gravity
[9-11]. The two-dimensional versions are in addition relevant for the theory of disordered
systems and localization, see, e.g., [12-16].
The most intriguing aspect of noncompact sigma-models is the apparent clash between
symmetry and unitarity: the Lagrangian is invariant under a finite-dimensional (hence
nonunitary) representation of the group, while the physical Hilbert space (or at least a
sizeable subspace of it) is expected to carry a unitary and hence infinite-dimensional representation of the group, apparently not accounted for by the field content of the system.
This is particularly puzzling in the vacuum sector, where in the 2-dimensional versions
Coleman's theorem [17] seems to preclude spontaneous symmetry breaking even for a noncompact group. Indeed both perturbation theory and large N techniques typically expand
around an invariant Fock vacuum in an indefinite metric state space [1,5,8]. Its positive
metric subspace, however, then carries no remnant of the original noncompact symmetry
and looks more like that of a compact model.
A recent detailed study of the 1-dimensional hyperbolic spin chain [18] showed how
in that system the clash is avoided: there are infinitely many non-normalizable ground
states transforming under an irreducible representation of the group. On the one hand
this entails that the symmetry is spontaneously broken at the level of (certain) correlation functions. On the other hand, by a change of scalar product to the one induced by the
Osterwalder-Schrader reconstruction, the above representation rotating the ground states
into each other can be made unitary. The price to pay is that the reconstructed Hilbert
space is nonseparable and that the unitarity of the representation only extends to a large
but proper subspace of it. One of the goals of the present paper is to investigate the extent to which this picture of the ground state orbit generalizes to the field theoretical
case.
More generally our focus is on issues of spontaneous symmetry breaking of noncompact (boost) and compact (rotation) symmetries. The starting point is a lattice construction of the models, using both the transfer matrix formalism and the Euclidean
functional integral. In either case the infinite volume of the symmetry group requires
modifications compared to the setting for a compact symmetry group. The transfer operator is no longer trace class even in finite volume and the functional integral needs to
be gauge fixed. Two specific gauge-fixing schemes (a translationally invariant scheme
in which the zero-momentum mode of the transverse spin fields is set to zero, and a
fixed-spin gauge) are used, the first of which is convenient for numerical simulations
while the second one allows one to relate the transfer matrix to the functional integral. Once properly defined (Section 2) the systems are studied by a combination of
group theoretical techniques (Section 3), numerical simulations (Section 4), and a large
N saddle-point analysis (Section 5). Our main results in generic dimensions $D \geq 2$
are:
• On a finite spatial lattice the noncompact models are shown to have infinitely many
  non-normalizable ground states transforming irreducibly under SO(1, N), in sharp
  contrast to the unique ground state of the SO(1 + N) models.
• Spontaneous symmetry breaking occurs in all dimensions $D \geq 2$.
As described, the symmetry breaking is surprising in dimension D = 2; a case which we
therefore investigated in more detail with the following results:
• A new Tanh order parameter is used to probe the spontaneous breaking of the boost
  symmetries, bypassing problems with the usual hysteresis criterion.
• Quadratic Ward identities for the Noether currents are derived (including finite volume
  corrections) and used to verify numerically the disappearance of explicit breaking of
  the boost and rotation symmetries with increasing volume.
• Numerical results are presented for the two-point functions of the spin fields and of the
  Noether current, as well as for the Tanh order parameter, which show spontaneous
  symmetry breaking.
• In a large N saddle-point analysis (starting from the gauge fixed functional integral) the
  dynamically generated squared mass is found to be negative and of order $1/(V \ln V)$
  in the volume V, the 0-component of the spin field diverges as $\ln V$ while SO(1, N)
  invariant combinations remain finite.
In addition we point out certain subtleties, related once more to the noncompactness of the
symmetry group, without attempting definite answers here. One of them concerns the inapplicability of standard theorems in D = 2 (Mermin-Wagner, and refinements thereof)
to argue that the maximal compact SO(N) subgroup singled out by the gauge fixing
is not spontaneously broken; see Section 2.2 for a discussion. For any $D \geq 2$ another
subtle point is the reconstruction of a Hilbert space, a transfer operator, a normalizable
ground state and a representation of the symmetry group commuting with it from the
infinite volume correlation functions via an Osterwalder-Schrader reconstruction; see Section 3.5.
The paper is organized as follows: in the next section we introduce the ingredients of
a lattice construction of the systems (transfer matrix and functional integral) and pose the
questions we wish to address. In Section 3 we derive the structural characterization of the
ground states of the finite lattice systems in $D \geq 2$ and prove that whenever a thermodynamic limit exists it shows spontaneous symmetry breaking. Sections 4 and 5 are devoted to
the D = 2 model, and contain the Monte Carlo study of the SO(1, 2) models and the large
N saddle-point analysis, respectively. Some technical material on the harmonic analysis of
functions on the target space and the finite volume corrections to the Ward identities are
relegated to Appendices A and B, respectively.
2. Lattice construction
We consider the hyperbolic SO(1, N) nonlinear sigma-models with $N \geq 2$ defined on a
D-dimensional Euclidean lattice, $\Lambda \subset Z^D$, with $D = d + 1 \geq 2$. The systems are defined
on finite lattices with the thermodynamic limit $\Lambda \to Z^D$ to be taken later on. We divide
the lattice into time slices $\Lambda_t = \{(x_\mu, t) \mid 1 \leq x_\mu \leq L_s, \mu = 1, \ldots, d\} \cong \{1, \ldots, L_s\}^d$ of
$|\Lambda_t| = L_s^d$ lattice points. The Euclidean time t ranges from 0 to $L_t - 1$, so that $|\Lambda| = L_t L_s^d$
is the total lattice volume. The dynamical variables (spins) are denoted by $n_x$, $x \in \Lambda$;
those in a given time slice are written alternatively as $n_x$, $x \in \Lambda_t$ or as $n_{x,t}$, $x \in \{1, \ldots, L\}^d$.
The spins take values in $H_N = \{n \in R^{1,N} \mid n \cdot n = +1, n^0 > 0\}$. The bilinear form (dot
product) is $a \cdot b = a^0 b^0 - a^1 b^1 - \cdots - a^N b^N =: a^0 b^0 - \vec{a} \cdot \vec{b}$, with $\vec{a} = (a^1, \ldots, a^N)$. We
take as our basic lattice action

$$S_0[n] = \beta \sum_{x,\mu} (n_x \cdot n_{x+\hat\mu} - 1),$$  (2.1)

where $\hat\mu$ is the unit vector in the positive $\mu$-direction (with $\mu = 1, \ldots, D$, and the boundary
conditions specified later). Since $n \cdot n' \geq 1$ for all $n, n' \in H_N$ the action is normalized such
that $S_0[n] \geq 0$.
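The positivity of the action follows from the reversed Cauchy-Schwarz inequality on the hyperboloid. As an illustration only, here is a minimal sketch of ours for a small periodic lattice in D = 2. The names `beta`, `random_spin` and `action` are our own, and the overall coupling `beta` multiplying the sum is an assumption about the normalization of (2.1).

```python
import math
import random

def dot(a, b):
    """Indefinite bilinear form a.b = a^0 b^0 - sum_i a^i b^i."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))

def random_spin(N, rng, spread=1.5):
    """Random point on H_N: draw the spatial part, then solve n.n = 1 for n^0 > 0."""
    vec = [rng.gauss(0.0, spread) for _ in range(N)]
    return [math.sqrt(1.0 + sum(v * v for v in vec))] + vec

def action(spins, beta):
    """S_0 = beta * sum over sites and positive directions of (n_x . n_{x+mu} - 1)
    on an L x L lattice with periodic boundary conditions."""
    L = len(spins)
    total = 0.0
    for i in range(L):
        for j in range(L):
            n = spins[i][j]
            total += dot(n, spins[(i + 1) % L][j]) - 1.0
            total += dot(n, spins[i][(j + 1) % L]) - 1.0
    return beta * total

rng = random.Random(0)
N, L, beta = 2, 4, 1.0
config = [[random_spin(N, rng) for _ in range(L)] for _ in range(L)]
uniform = [[[1.0] + [0.0] * N for _ in range(L)] for _ in range(L)]
```

The uniform configuration saturates the bound with zero action, while a generic configuration gives a strictly positive value.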
We use the connected component of the identity $SO_0(1, N)$ (which preserves both
sheets of the cone $a \cdot a = 0$) throughout. Slightly simplifying (and abusing) the notation we shall always write $SO(1, N) := SO_0(1, N)$ for it. The hyperboloid $H_N$ can then
also be viewed as a globally symmetric space SO(1, N)/SO(N) for any one of the maximal compact SO(N) subgroups. We shall use the stabilizer group $SO^\uparrow(N)$ of the vector
$n^\uparrow = (1, 0, \ldots, 0)$ throughout. Concretely this amounts to a parametrization of the spins
as $n = (\xi, \sqrt{\xi^2 - 1}\,\vec{s})$, where $\xi \geq 1$ is a noncompact variable and $\vec{s} \in S^{N-1}$ is a conventional compact spin. Note that this provides a global parametrization of $H_N$. The invariant
products entering the action then read

$$n_x \cdot n_{x+\hat\mu} = \xi_x \xi_{x+\hat\mu} - (\xi_x^2 - 1)^{1/2} (\xi_{x+\hat\mu}^2 - 1)^{1/2} \, \vec{s}_x \cdot \vec{s}_{x+\hat\mu}.$$  (2.2)
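The parametrization and the product formula (2.2) are easy to test numerically. In the sketch below, which is ours, `xi` is an assumed name for the noncompact coordinate $n^0 \geq 1$ and `s` is a unit vector on $S^{N-1}$; the helper names are hypothetical.

```python
import math
import random

def dot(a, b):
    """Indefinite bilinear form a.b = a^0 b^0 - sum_i a^i b^i."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))

def to_spin(xi, s):
    """Map (xi, s) with xi >= 1 and |s| = 1 to n = (xi, sqrt(xi^2 - 1) s)."""
    r = math.sqrt(xi * xi - 1.0)
    return [xi] + [r * c for c in s]

def from_spin(n):
    """Inverse map, valid away from the tip xi = 1 where s is undefined."""
    xi = n[0]
    r = math.sqrt(xi * xi - 1.0)
    return xi, [c / r for c in n[1:]]

def product_formula(xi1, s1, xi2, s2):
    """Right-hand side of (2.2) for two spins."""
    overlap = sum(a * b for a, b in zip(s1, s2))
    return xi1 * xi2 - math.sqrt((xi1**2 - 1.0) * (xi2**2 - 1.0)) * overlap

def unit_vector(N, rng):
    """Random unit vector in R^N (Gaussian direction, then normalized)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(N)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

rng = random.Random(1)
N = 3
xi1, xi2 = 1.7, 4.2
s1, s2 = unit_vector(N, rng), unit_vector(N, rng)
n1, n2 = to_spin(xi1, s1), to_spin(xi2, s2)
```

Here `dot(n1, n2)` and `product_formula(xi1, s1, xi2, s2)` agree to rounding error, and `from_spin(to_spin(xi, s))` returns the inputs.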
We write $S_0[\xi, \vec{s}]$ for the action in this parametrization. It can be viewed as that of a spherical $S^{N-1}$ sigma-model coupled in a nonpolynomial way to the additional noncompact field
$\xi_x$. The invariant measure $d\Omega(n) := 2 d^{N+1}n \, \delta(n \cdot n - 1) \, \theta(n^0)$ factorizes according to

$$\int d\Omega(n) = \int_1^\infty d\xi \, (\xi^2 - 1)^{N/2 - 1} \int_{S^{N-1}} dS(\vec{s}).$$  (2.3)
2.1. Definition of the transfer matrix and functional integral

The dynamics of the lattice system is defined in terms of the transfer operator T which
transports lattice configurations from one time slice to the next. The square integrable wave
functions (n) = (nx , x t ) depending on the spins in some time slice t form the
Ld
Hilbert space L2 (HNs ) with respect to the product of the canonical invariant measure.
Since we usually keep Ls fixed we simply write L2 for this Hilbert space once the number
of spatial dimensions d is clear from the context. The transfer operator T acts on L2 as an
integral operator via

d(nx ) T (n, n ; 1)(n ),
(T)(n) =
(2.4)
xt

1
1
Ls
nx nx + nx nx+1 + nx n 2 .
T (n, n ; 1) = D,N
exp
x+1
2
2
xt
The normalization constant D,N is introduced for later use; it sets the overall scale in that

Ls
0 < T (n, n ; 1) D,N
for all configurations and d(n) T (n, n ; 1) 1. The variables
in (T)(n) can then naturally be associated with the time slice t+1 . Indeed upon iteration
of (2.4) one obtains

t
T (n) =
(2.5)
d(nx ) T (n, n ; t)(n ),
x0
where we conventionally regard Tt as a map from time slice 0 to time slice t . In this
interpretation the iterated kernel reads
T (n t , n 0 ; t)

Ls t
nx nx+
= D,N
exp
=D x0 xt

d(nx ) exp
x1 ,...,t1
(nx nx+ 1) .
(2.6)
,x0 ,...,t1
For t = LD and periodic bc in the xD -direction the last expression clearly resembles the
partition function for the action (2.1). The last integration over the variables nx,0 = nx,LD ,
however, would be divergent as the infinite volume of HN gets overcounted.
To see this more clearly note that the wave functions (n) carry the following diagonal
action of SO(1, N ),

(A)(nx , x t ) = A1 nx , x t , A SO(1, N ).
(2.7)
In contrast to a lattice system with a compact symmetry group invariant wave functions,
i.e. those satisfying (A)(n) = (n), for all A SO(1, N ), do not lie in the Hilbert
space. This is because in the inner product on L2 one of the integrations factorizes, the
infinite volume of HN gets overcounted and the L2 norm of the wave function diverges.
On an integral operator K with kernel (n, n ) the group acts as K (A)1 K(A) and
thus as (n, n ) (An, An ) on the kernels. In particular operators K whose kernels
only depend on the invariants nx ny are invariant. Importantly this holds for the iterated
transfer operator, i.e.
Tt = Tt ,
t N.
(2.8)
Since the semigroup $T^t$, $t \in N$, describes the evolution of the system in Euclidean time,
Eq. (2.8) means that the dynamics is SO(1, N) invariant, as required. On the other hand it
also implies that, although T is bounded, in contrast to the transfer operator of most other
lattice systems in finite volume (as we shall see later) it is not trace class (see [18] for
the 1-dim case). As a consequence correlation functions cannot be defined in terms of the
usual expressions involving traces. The remedy is to (gauge) fix the residual symmetry
by a variant of the familiar Faddeev-Popov procedure. To the best of our knowledge this
gauge-fixing does not seem to have been taken into account in earlier studies, rendering the
results somewhat formal. We now first describe this procedure and then outline the relation
to the transfer operator.
We used the following two gauge-fixing choices, both of which leave the stability group
$SO^\uparrow(N)$ of the vector $n^\uparrow = (1, 0, \ldots, 0)$ intact:
(1) The noncompact global gauge freedom is eliminated with a translationally invariant
gauge choice which sets the zero momentum mode of the transverse sigma fields $\vec{n} := \sqrt{\xi^2 - 1}\,\vec{s}$ to zero:

$$\sum_x \vec{n}_x = 0.$$  (2.9)

In this gauge there is a nontrivial Faddeev-Popov determinant which comes out to
be $(\sum_x n^0_x)^N$, see [19] in the compact case. The expectations of a general multilocal
observable O({n}) then assume the form

$$\langle O \rangle_{\Lambda,\beta,1} = \frac{1}{Z_1(\Lambda, \beta)} \int \prod_x d\Omega(n_x) \, O(\{n\}) \, \delta\Big(\sum_x \vec{n}_x\Big) \exp\Big\{-S_0[n] + N \ln \sum_x n^0_x\Big\},$$  (2.10)

where $Z_1(\Lambda, \beta)$ is the partition function normalizing the averages, $\langle 1 \rangle_{\Lambda,\beta,1} = 1$.

(2) Alternatively, the noncompact global gauge-freedom may be eliminated by freezing
a single spin at an arbitrary site $x_0$ to a conventional fixed value, typically $n_{x_0} = n^\uparrow$,
where $n^\uparrow = (1, 0, \ldots, 0)$ and $x_0 \in \Lambda_0$. In this case there is no Faddeev-Popov factor
and the expectation value of a general observable O is simply

$$\langle O \rangle_{\Lambda,\beta,2} = \frac{1}{Z_2(\Lambda, \beta)} \int \prod_{x \neq x_0} d\Omega(n_x) \, O(\{n\}) \, e^{-S_0[n]},$$  (2.11)

if we assume that the support of the observable does not include $x_0$ (otherwise an explicit $\delta(n_{x_0}, n^\uparrow)$ factor has to be included). Although the choice of a specific site $x_0$
would seem to destroy the translational invariance of the theory, the fact that this choice
corresponds to a global gauge transformation implies that SO(1, N) invariant observables are unaffected, and translational invariance still holds for such observables,
provided of course this invariance is not explicitly broken by boundary conditions. We
consider again periodic boundary conditions (bc) and denote the resulting expectations
by $\langle O \rangle_{\Lambda,\beta,2}$. Invariant observables then should have the same expectations as with the
translationally invariant gauge fixing, i.e. $\langle O \rangle_{\Lambda,\beta,1} = \langle O \rangle_{\Lambda,\beta,2}$, for O SO(1, N) invariant.
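The translationally invariant gauge of choice (1) can always be reached by a single global boost: the sum of the spins is a future-directed timelike vector, and boosting to its rest frame removes the total transverse component. The following sketch is ours and only illustrates this statement; the matrix construction is the standard Lorentz boost for the metric (+, -, ..., -), and all function names are hypothetical.

```python
import math
import random

def dot(a, b):
    """Indefinite bilinear form a.b = a^0 b^0 - sum_i a^i b^i."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))

def random_spin(N, rng, spread=1.5):
    """Random point on H_N with n.n = 1 and n^0 > 0."""
    vec = [rng.gauss(0.0, spread) for _ in range(N)]
    return [math.sqrt(1.0 + sum(v * v for v in vec))] + vec

def boost_to_rest(P):
    """Boost matrix A in SO_0(1, N) with A P = (sqrt(P.P), 0, ..., 0),
    for a future-directed timelike vector P."""
    N = len(P) - 1
    m = math.sqrt(dot(P, P))
    gamma = P[0] / m
    u = [p / m for p in P[1:]]            # u = gamma * velocity
    A = [[0.0] * (N + 1) for _ in range(N + 1)]
    A[0][0] = gamma
    for i in range(N):
        A[0][i + 1] = -u[i]
        A[i + 1][0] = -u[i]
        for j in range(N):
            A[i + 1][j + 1] = (1.0 if i == j else 0.0) + u[i] * u[j] / (gamma + 1.0)
    return A

def apply(A, n):
    """Matrix-vector product A n."""
    return [sum(row[k] * n[k] for k in range(len(n))) for row in A]

rng = random.Random(2)
N, sites = 2, 16
spins = [random_spin(N, rng) for _ in range(sites)]
total = [sum(n[k] for n in spins) for k in range(N + 1)]
A = boost_to_rest(total)
gauged = [apply(A, n) for n in spins]
gauged_total = [sum(n[k] for n in gauged) for k in range(N + 1)]
```

After the boost the spatial entries of `gauged_total` vanish to rounding error, while all invariants `dot(n_x, n_y)` are unchanged, which is why SO(1, N) invariant observables do not depend on the gauge choice.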
We state without proof that the finite volume partition functions $Z_1$ and $Z_2$ are well-defined, i.e., the gauge fixing is sufficient to render the integrals finite. Another interesting
choice of bc, in the case of the fixed spin gauge, would be periodic bc in the spatial and free
bc in the temporal direction. In analogy to the 1-dimensional model the thermodynamic
limit of these expectations should be expected to be different from each other, thereby
revealing a peculiar kind of long-range order. However, to study the issue through numerical simulations would presumably require much larger lattices and a cluster algorithm.
One can view $\langle\,\cdot\,\rangle_{\Lambda,\beta,i}$ as linear functionals over the algebra of bounded observables
$C_b$, that is, continuous bounded functions O of finitely many spins with pointwise addition and multiplication and equipped with the supremum norm $\|O\|$. As such they
qualify as states in the statistical mechanics sense: $|\langle O \rangle| \leq \|O\|$ and for nonnegative O the
expectation value is nonnegative.
The fixed spin gauge with periodic bc also allows one to make contact with the transfer
matrix (2.6). For example

d(nx ) nx0 , n T (n, n; Lt ) = Z2 (, ),
(2.12)
x0
gives the partition function. Here x0 0 and (n, n ) is the invariant delta-distribution
concentrated at n = n with respect to the measure d(n). Similarly for the expectation of a generic (noninvariant) observable O(nx , ny ) located at x = (x1 , . . . , xD ), y =
(y1 , . . . , yD ), one has

O(nx , ny ) ,,2

1
=
d(nz ) nz=x0 , n
d(nx )
d(ny )
Z2 (, )
z0
xxD
yyD
T (nz , nx ; xD )O(nx , ny )T (nx , ny ; yD xD )T (ny , nz ; Lt yD ).
(2.13)
Because of the gauge fixing for finite Lt this is in general not invariant under time translations (nor, for that matter, under space translations). An important exception are SO(1, N )
x , ny ), for all A SO(1, N ). In
satisfying O(An
invariant observables O,
x , Any ) = O(n
this case Eq. (2.13) simplifies to

x , ny )
O(n
,,2

1
=
d(nz ) nz=x0 , n
d(nx )
Z2 (, )
z0
xxD
z , nx )T (nz , nx ; yD xD )T (nx , nz ; Lt + xD yD ),
O(n
(2.14)
using the invariance of the integration measure and the convolution property for the kernels
(2.6). Here translation invariance in the time direction (and trivially in the spatial direction)
is manifest. Eqs. (2.13) and (2.14) generalize straightforwardly to observables O depending on more than two spins.
Note also that any single $SO(1,N)$ transformation on an observable $O$ can always be compensated by a change in the gauge fixing condition
$$\langle \rho(A)O\rangle_{\Lambda,\beta,i} = \langle O\rangle_{\Lambda,\beta,i}\Big|_{n^\uparrow \to A^{-1}n^\uparrow},\quad i=1,2. \tag{2.15}$$
If bc other than periodic ones were adopted the bc would likewise be counter-rotated in (2.15). The issue of spontaneous symmetry breaking we wish to address in the following, of course, asks for the invariance or noninvariance of the expectations under all of $SO(1,N)$ or a continuous subgroup thereof, with the gauge fixing and bc held fixed and $\Lambda\to\mathbb{Z}^D$.
2.2. Ward identities: absence of explicit symmetry breaking

In this context it is important to make sure that no explicit breaking of the symmetry
is induced by the gauge fixing. In the thermodynamic limit one expects that the effect of
a single fixed mode fades out, but experience with the 1D model [18] shows that such
expectations can be misleading. In addition, as simulations are done on a finite lattice a
quantitative assessment would be useful. This can be done by deriving Ward identities
expressing the SO(1, N ) invariance of all but the gauge fixing terms in the functional measure, with the latter giving rise to finite volume corrections of the naive Ward identities.
In this section we describe the principle of the derivation as well as the naive form of the
Ward identities which is dimension independent. In contrast the form of the finite volume
corrections is dimension dependent and their determination is rather technical. For the case
D = 2 (where the symmetry breaking is surprising) this is done in Appendix B. Throughout this section the translation invariant gauge fixing 1 with periodic bc will be used, i.e.,
Eq. (2.10).
In lattice models with a compact symmetry group the invariance of the functional measure (including the Boltzmann factor) gives rise to Ward identities in a well-known way:
implement a local symmetry transformation and expand the functional integral in powers of the gauge parameter(s). Since the total response must vanish the coefficient of each
power must vanish, which gives rise to identities relating correlators of the Noether current
to other correlators. In the case at hand the gauge fixing and the associated Faddeev-Popov determinant lead to a noninvariant overall measure. Nevertheless the impact of
a local symmetry transformation can be computed and leads to modified Ward identities.
We begin by fixing our conventions for the Noether current. It takes values in the Lie algebra $so(1,N)$, and we normalize the components with respect to the previously used basis $t^{ab}$, $0\le a<b\le N$, according to
$$J^{ab}_\mu(x) = -\beta\, t^{ab}n_x\cdot n_{x+\hat\mu} = \beta\, n_x\cdot t^{ab}n_{x+\hat\mu} = \beta\,\big(n^a_x n^b_{x+\hat\mu} - n^b_x n^a_{x+\hat\mu}\big). \tag{2.16}$$
Their two-point correlators can be decomposed into a transversal, a longitudinal, and a harmonic piece. This is conveniently done in Fourier space
$$\langle J^{ab}_\mu(x)\,J^{ab}_\nu(y)\rangle_{\Lambda,\beta,i} =: \frac{1}{|\Lambda|}\sum_p e^{ip\cdot(x-y)}\,J^{ab}_{\mu\nu}(p), \tag{2.17}$$
where $p$ runs over the dual lattice, $p_\mu = \frac{2\pi}{L_\mu}n_\mu$, $n_\mu = 0,1,\dots,L_\mu-1$, $\mu=1,\dots,D$. In order not to clutter the notation we suppress the specifications $(\Lambda,\beta,i)$ on the right-hand side (remember that $i=1,2$ refers to the gauge fixing adopted). The irreducible components $J^{ab}_T(p)$ (transversal), $J^{ab}_L(p)$ (longitudinal), and $J^{ab}_H(p)$ (harmonic) are picked out by acting with the corresponding projectors on $J^{ab}_{\mu\nu}(p)$, see, e.g., [20] for details.
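As mentioned earlier, Ward identities now arise from studying the response of a given expectation value under a local symmetry variation $n_x\to\exp(\alpha_x t^{ab})n_x$, $x\in\Lambda$, performed on all spins. To get the response of the action we prepare ($\epsilon^a := \eta^{aa}$ equals $1$ for $a=0$ and $-1$ for $a\ne0$):
$$e^{\alpha_x t^{ab}}n_x\cdot e^{\alpha_{x+\hat\mu}t^{ab}}n_{x+\hat\mu} = n_x\cdot n_{x+\hat\mu} + \frac{1}{\beta}\,\Delta_\mu\alpha_x\,J^{ab}_\mu(x) - \frac12(\Delta_\mu\alpha_x)^2\big(\epsilon^b n^a_x n^a_{x+\hat\mu} + \epsilon^a n^b_x n^b_{x+\hat\mu}\big) + O(\alpha^3), \tag{2.18}$$
where $\Delta_\mu\alpha_x = \alpha_{x+\hat\mu}-\alpha_x$ and $[(t^{ab})^2 n_x]_c = -\epsilon^b n^a_x\delta_{ac} - \epsilon^a n^b_x\delta_{bc}$ was used. For the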

change in the Boltzmann factor this gives
$$\exp\Big\{-\beta\sum_{x,\mu}n_x\cdot n_{x+\hat\mu}\Big\} \to \exp\Big\{-\beta\sum_{x,\mu}n_x\cdot n_{x+\hat\mu}\Big\}\Big[1 + \sum_{x,\mu}\alpha_x\,\partial_\mu J^{ab}_\mu(x) + \frac12\sum_{x,\mu;\,y,\nu}\alpha_x\alpha_y\,\partial_\mu J^{ab}_\mu(x)\,\partial_\nu J^{ab}_\nu(y) + \frac{\beta}{2}\sum_{x,\mu}(\Delta_\mu\alpha_x)^2\big(\epsilon^b n^a_x n^a_{x+\hat\mu} + \epsilon^a n^b_x n^b_{x+\hat\mu}\big) + O(\alpha^3)\Big], \tag{2.19}$$
where $\partial_\mu J^{ab}_\mu(x) = J^{ab}_\mu(x) - J^{ab}_\mu(x-\hat\mu)$.
Using (2.19) the expansion of the Boltzmann factor to $O(\alpha^2)$ is trivial. Since the product of the invariant measures $\prod_x dn_x\,\delta(n_x^2-1)$ is invariant even under local $SO(1,N)$ rotations, the only noninvariant terms in the functional measure come from the gauge fixing. Focusing on the invariant terms the total response under a local symmetry transformation must vanish. In principle the vanishing of the coefficients of each power in $\alpha$ gives rise to a new identity.
For example, the $O(\alpha)$ terms in the response of $\langle n^c_y\rangle_{\Lambda,\beta,i}$ give rise to the following first order Ward identity
$$\sum_\mu\langle \partial_\mu J^{ab}_\mu(x)\,n^c_y\rangle_{\Lambda,\beta,i} + \delta_{x,y}\,\langle (t^{ab}n_y)^c\rangle_{\Lambda,\beta,i} + \text{terms from gauge fixing} = 0. \tag{2.20}$$
Replacing $n^c_y$ with a generic (noninvariant) observable $O(n_{x_1},\dots,n_{x_\ell})$ a similar identity arises where the correlator with $\partial_\mu J^{ab}_\mu$ produces a sum of contact terms. We shall not pursue these first order Ward identities further: in Section 5 we shall verify in a large $N$ analysis that $\langle n^a_x\rangle_{\Lambda,\beta,i}$ diverges as $|\Lambda|\to\infty$. One expects this to hold also at fixed $N$, in which case already the example (2.20) shows that these first order Ward identities do not necessarily have an interesting thermodynamic limit. This very fact however is worth mentioning, because it shows how the conflict with Coleman's theorem [17] is avoided: the currents simply do not exist in the thermodynamic limit.
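The second-order expansion (2.18) that underlies these Ward identities can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes generators acting on column vectors as $(t^{ab}n)^c = \eta^{ca}n^b - \eta^{cb}n^a$, so that $n\cdot t^{ab}m = n^am^b - n^bm^a$, and the names `generator`, `expm_taylor` and `expansion_error` are hypothetical helpers introduced for illustration.

```python
import numpy as np

def eta(N):
    """Metric diag(1, -1, ..., -1) on R^{1,N}."""
    return np.diag([1.0] + [-1.0] * N)

def generator(a, b, N):
    """Assumed convention: (M n)^c = eta^{ca} n^b - eta^{cb} n^a (column vectors)."""
    e = eta(N)
    M = np.zeros((N + 1, N + 1))
    M[:, b] += e[:, a]
    M[:, a] -= e[:, b]
    return M

def expm_taylor(A, terms=40):
    """Matrix exponential by Taylor series; adequate for the small matrices used here."""
    out = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out

def dot(n, m, N):
    """Indefinite bilinear form n.m = n^0 m^0 - n^1 m^1 - ... - n^N m^N."""
    return n @ eta(N) @ m

def random_spin(rng, N):
    """Random point on the hyperboloid n.n = 1, n^0 > 0."""
    v = rng.normal(size=N)
    return np.concatenate(([np.sqrt(1.0 + v @ v)], v))

def expansion_error(a, b, N, alpha_x, alpha_y, seed=0):
    """|exact - second-order expansion| for one link with parameters alpha_x, alpha_y."""
    rng = np.random.default_rng(seed)
    n, m = random_spin(rng, N), random_spin(rng, N)
    M = generator(a, b, N)
    eps = np.diag(eta(N))
    exact = dot(expm_taylor(alpha_x * M) @ n, expm_taylor(alpha_y * M) @ m, N)
    d = alpha_y - alpha_x
    approx = (dot(n, m, N)
              + d * (n[a] * m[b] - n[b] * m[a])
              - 0.5 * d**2 * (eps[b] * n[a] * m[a] + eps[a] * n[b] * m[b]))
    return abs(exact - approx)
```

For a small difference of the two parameters the residual should scale like the cube of that difference, both for a boost (`a, b = 0, 1`) and for a rotation (`a, b = 1, 2`).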
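More interesting is the second order Ward identity from the response of the partition function itself [20]. The vanishing of the $O(\alpha^2)$ terms requires
$$\frac12\sum_{x,\mu,y,\nu}\Delta_\mu\alpha_x\,\Delta_\nu\alpha_y\,\langle J^{ab}_\mu(x)\,J^{ab}_\nu(y)\rangle_{\Lambda,\beta,i} + \frac{\beta}{2}\sum_{x,\mu}(\Delta_\mu\alpha_x)^2\big(\epsilon^b E^a + \epsilon^a E^b\big) + \text{terms from gauge fixing} = 0, \tag{2.21}$$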
where $E^a := E^a_{\Lambda,\beta,i} := \langle n^a_x n^a_{x+\hat\mu}\rangle_{\Lambda,\beta,i}$ is the action link variable. The terms induced by the gauge fixing can in principle be computed exactly. They are expected to die out as $\Lambda\to\mathbb{Z}^D$, but the precise form of the correction terms is cumbersome to compute. As the case $D=2$ is of particular interest the derivation of the finite volume corrections is detailed in Appendix B. The extra terms induced by the gauge fixing then turn out to be of order $O(\ln|\Lambda|/|\Lambda|)$ in the limit of large volumes $|\Lambda|$. Converting (2.21) into Fourier space the longitudinal part $J^{ab}_L(p)$ of the current two-point function appears. The resulting Ward identity generalizes that in the compact models [20] and reads
$$J^{ab}_L(p) = -\beta\big(\epsilon^b E^a + \epsilon^a E^b\big) + O\Big(\frac{\ln|\Lambda|}{|\Lambda|}\Big),\quad p\ne0,\ a<b. \tag{2.22}$$
On account of the invariance of the vacua under the maximal compact subgroup singled out by the gauge fixing (see Section 2.3) one expects that $E^0 \ne E^1 = \dots = E^N \ne 0$, so that only two distinct cases arise:
$$J^{12}_L(p) = +2\beta E^1 + O\Big(\frac{\ln|\Lambda|}{|\Lambda|}\Big),\quad\text{rotations},$$
$$J^{01}_L(p) = \beta\big(E^0 - E^1\big) + O\Big(\frac{\ln|\Lambda|}{|\Lambda|}\Big),\quad\text{boosts}. \tag{2.23}$$
All quantities in (2.22), (2.23) of course depend on the specifications $(\Lambda,\beta,i)$. The inequality $E^0\ge E^a$, $a\ne0$, follows from $n^0\ge|\vec n|$. Combined with the trivial inequality $n_x\cdot n_{x+\hat\mu}\ge1$ one gets the stronger bound $E^0 - NE^1\ge1$.
The individual $E^a$ cannot be expected to have a finite thermodynamic limit. In Section 5.1 we verify that in the large $N$ expansion both $E^0$ and $E^1$ diverge logarithmically with the volume, according to $E^0\sim\frac{\lambda}{4\pi}\ln|\Lambda|$ and $E^1\sim\frac1N\frac{\lambda}{4\pi}\ln|\Lambda|$, where $\lambda = N/\beta$. In contrast the invariant combination $E^0 - NE^1$ approaches the finite constant $1+\lambda/4$. For the Ward identities (2.22) therefore only the invariant combination is assured to have a finite thermodynamic limit. Nevertheless the Ward identities for the individual components
are useful to test quantitatively the degree to which the boost/rotation symmetry is restored
on a finite lattice (as far as the dynamics is concerned). Since the current correlator (2.17)
and the action link variables E a are independently measurable quantities in a Monte Carlo
simulation, validity of the identities (2.23) also provides a good test on the simulations
for given lattice size and boundary conditions. We report the results of such a test in Section 4.1.
2.3. Spontaneous symmetry breaking and Tanh order parameter
Even the very notion of spontaneous symmetry breaking in the noncompact models
requires a little thought. The conventional analysis of spontaneous symmetry breaking asks
if there is a local observable having a noninvariant expectation value if we either
(a) fix symmetry breaking boundary conditions and then take the thermodynamic limit, or
(b) add a symmetry breaking term like a magnetic field h to the action, take the thermodynamic limit and then turn the symmetry breaking term off.
In the second picture spontaneous symmetry breaking amounts to a hysteresis effect. In a model with a compact symmetry group then the one-sided derivatives $h\to0^+$ and $h\to0^-$ exist, but are different. This way of looking at spontaneous symmetry breaking, however, does not readily generalize to the boost symmetries in the noncompact sigma-models because the field $h$ has to serve double duty, as a regulator and as a probe for symmetry breaking. For invariant observables it is clear that a nonzero field is needed in order to (potentially) produce a normalizable measure even in finite volume. For noninvariant observables coupling to a magnetic field may or may not render the finite volume expectations finite. Indeed, a typical coupling would add a term of the form $h\sum_x n^0_x$ to the action (2.1). However, for $h<0$ then already the finite volume averages fail to exist. The $h\to0^+$ derivative is expected to be convergent for $D\ge3$ and divergent for $D=2$.
Indeed, since the first version of this paper was posted, an interesting result by Spencer and Zirnbauer appeared [21], in which it was shown that in $D\ge3$ the expectations $\langle n^0\rangle_{\Lambda,\beta,h}$ defined without gauge fixing and with a positive magnetic field $h$ can be bounded by a constant (independent of $h$ and $|\Lambda|$) for all $\beta\ge3/2$ and $|\Lambda|h\ge1$. Thus a one-sided hysteresis criterion here signals spontaneous symmetry breaking in the thermodynamic limit.
The case $D=2$ will be discussed in more detail in Section 2.4; in this case even the one-sided hysteresis criterion is expected to fail. In $D=2$ many authors found that $\langle n^0\rangle$ diverges in the thermodynamic limit, based on an (un-gauge fixed) large $N$ expansion. This amounts to some vestige of the large fluctuations that are responsible for the symmetry restoration in compact and Abelian models. However, since $\langle (An)^0\rangle$, $A\in SO(1,N)$, diverges likewise one can only conclude that the symmetry breaking cannot be seen on this particular observable.
The approach adopted here is somewhat different. The gauge fixed functional integrals
(2.10) and (2.11) provide a complete definition of the systems in finite volume, both for
invariant and for noninvariant observables. The regulator (gauge fixing) is decoupled from
whatever probe is used for the symmetry breaking. Spontaneous symmetry breaking can
then be discussed without appeal to a one-sided hysteresis criterion and for all $D\ge1$. The criterion we propose is simply that there exist noninvariant observables $O$ for which the thermodynamic limit exists and for which
$$\lim_{\Lambda\to\mathbb{Z}^D}\langle O(An)\rangle_{\Lambda,\beta,i} \ne \lim_{\Lambda\to\mathbb{Z}^D}\langle O(n)\rangle_{\Lambda,\beta,i},\quad\text{for some } A\in SO(1,N). \tag{2.24}$$
We should remark that (2.24) for a boost $A$ signals also breaking of the compact subgroup obtained by conjugating $SO^\uparrow(N)$ with $A$. Spontaneous symmetry breaking then basically follows from the nonamenability of the group $SO(1,N)$. For convenience we recall the definition here.
A Lie group $G$ is called amenable if there exists a left invariant positive linear functional (a mean) on $C_b(G)$, the space (and commutative $C^*$-algebra with unit) of bounded continuous functions on $G$ equipped with the sup-norm. All Abelian and all compact Lie groups are amenable. Conversely, $G$ is called nonamenable if no such mean exists. All noncompact semisimple non-Abelian Lie groups are known to be nonamenable.
In the present context the nonamenability of $SO(1,N)$ implies that there have to be bounded continuous functions of one spin, say at the origin, whose infinite volume expectation values are not invariant under the group. The precise form of this result is described in Theorem 3 of Section 3.
In [18] we identified for the 1D model a useful example of such a function, the so-called Tanh order parameter. As explained in Section 2.2 the one-sided hysteresis criterion to describe spontaneous symmetry breaking cannot readily be used for $D\le2$. The Tanh order parameter, on the other hand, does not require the introduction of an external field; the gauge fixing or the boundary conditions single out the direction of symmetry breaking and the maximally compact subgroup $SO^\uparrow(N)$ that remains unbroken is the stability group of $n^\uparrow$. This construction readily generalizes to all $D\ge1$.
For a spacelike unit vector $e$ we define
$$T_e(n) := \tanh(e\cdot n),\quad e\cdot e = -1,$$
$$T_q(\xi) := \int_{SO^\uparrow(N)} d\mu(A)\,T_e(An) = \int_{SO^\uparrow(N)} d\mu(A)\,\tanh\Big(\xi\sqrt{q^2-1} - q\sqrt{\xi^2-1}\;e_0\cdot As\Big), \tag{2.25}$$
where $d\mu(A)$ is the normalized Haar measure on $SO^\uparrow(N)$. After the group averaging the observable only depends on $\xi := n^\uparrow\cdot n$ and $q := \sqrt{(n^\uparrow\cdot e)^2+1}$. Here we parameterized $n$ and $e$ as $n = (\xi, \sqrt{\xi^2-1}\,s)$, $s^2=1$ and $e = (\sqrt{q^2-1},\, q\,e_0)$, $e_0^2=1$. This observable of course remains finite for $\Lambda\to\mathbb{Z}^D$ even if $\langle n^0\rangle_{\Lambda,\beta,i}$ diverges. More importantly it is designed to be a good indicator for spontaneous symmetry breaking already in finite volume. The criterion (2.24) for spontaneous symmetry breaking becomes for all $D\ge1$: $\langle T_e(n)\rangle_{\Lambda,\beta,i} \ne \langle T_e(An)\rangle_{\Lambda,\beta,i}$ for some $A\in SO(1,N)$. Since by Section 2.3 the finite volume average in itself effects the $SO^\uparrow(N)$ average this is equivalent to $T(q) := \langle T_q(n^0)\rangle_{\Lambda,\beta,i}$ having a nontrivial dependence on $q$. Clearly $|T(q)|\le1$ and $T(1)=0$, by the $SO^\uparrow(N)$ invariance.
Typically a nonzero value for $\langle T_e(n)\rangle_{\Lambda,\beta,i}$ at some $q>1$ is numerically easy to detect. In order to view this as a signal for spontaneous symmetry breaking one has to exclude that this value decays to zero as $\Lambda\to\mathbb{Z}^D$. Since by a convexity argument one expects
0

Te n ,,i Tq n0 ,,i Tq sup n0 ,,i ,
(2.26)
every nontrivial $\langle n^0\rangle_{\Lambda,\beta,i}$ will thus provide a lower bound on the measured $\langle T_e(n)\rangle_{\Lambda,\beta,i} > 0$, which therefore cannot decay to zero as $\Lambda\to\mathbb{Z}^D$. For $D=2$ we shall find later in the large $N$ limit that $T(q)$ is in fact a strictly increasing function of $q$ approaching 1 for
q . Specifically one has Tq ( ) tanh(n q 2 1 ), with n given by Eq. (5.6) below.
For N = 2 the same monotone increasing behavior is found in numerical simulations, see
Section 4.3.
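As an illustration of the group-averaged observable in (2.25), the sketch below evaluates $T_q(\xi)$ for $N=2$, where the Haar average over the stability group reduces to an average over one angle. This is a hedged example under stated assumptions: the parameterization $e\cdot n = \xi\sqrt{q^2-1} - q\sqrt{\xi^2-1}\cos\varphi$ is taken from the definitions above, and the function name `tanh_order_parameter` and the uniform angular grid are choices made here for illustration.

```python
import numpy as np

def tanh_order_parameter(q, xi, num_angles=4096):
    """SO(2)-averaged Tanh observable T_q(xi) for N = 2.

    Uses n = (xi, sqrt(xi^2-1) s) and e = (sqrt(q^2-1), q e0), so that
    e.n = xi*sqrt(q^2-1) - q*sqrt(xi^2-1)*cos(phi), with phi the angle
    between e0 and the rotated unit vector A s.
    """
    phi = 2.0 * np.pi * np.arange(num_angles) / num_angles
    arg = xi * np.sqrt(q**2 - 1.0) - q * np.sqrt(xi**2 - 1.0) * np.cos(phi)
    # The normalized Haar measure on SO(2) is d(phi) / (2 pi).
    return np.tanh(arg).mean()
```

At $q=1$ the average vanishes by symmetry, while for $q>1$ and fixed $\xi$ the value lies strictly between 0 and 1, in line with the properties $T(1)=0$ and $|T(q)|\le1$ quoted in the text.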
2.4. Unbroken $SO^\uparrow(N)$ invariance in D = 2?
In two dimensions an additional subtlety arises from the Mermin-Wagner theorem [22] and its refinements [23,24]. Whether in the fixed spin gauge or in the translation invariant
gauge, the system has a residual $SO^\uparrow(N)$ invariance and can be viewed as an $O(N)$ vector
model with fluctuating length of the spin vectors. In D = 2, at first sight it may seem
obvious that this compact symmetry cannot be spontaneously broken, due to the mentioned
theorems.
On closer inspection, however, the situation is not quite as simple. The above mentioned
theorems on the absence of spontaneous symmetry breaking cannot really be applied, because some technical conditions for their applicability are not fulfilled. The first one is the
condition that the second derivatives of the interaction with respect to the group parameters
have to be uniformly bounded over the configuration space of the spins, which fails as a
consequence of the noncompact nature of the latter; cf. (2.2). The second condition is that
in the thermodynamic limit we have to have a Gibbs measure on the configuration space,
whereas in fact our infinite volume state is not a measure, but only a more general mean
(for more detailed discussion of this see [18]).
The symmetry breaking bc in option (a) of Section 2.3 must break the noncompact symmetries and hence amount to something similar to the fixed spin gauge. Also here some potential pitfalls arise, which we illustrate now for the $SO(1,2)$ model. If one looks at individual configurations of the $SO(1,2)$ model in the fixed spin gauge at weak coupling (specifically, $\beta=10$ and $\vec n_{x_0=0}=0$, say), one finds that contrary to the naive expectation expressed above, all spin vectors $\vec n_x$ seem to point roughly along the same direction: the system appears to have acquired a large spontaneous magnetization! Writing momentarily $\langle\,\cdot\,\rangle_L$ for $\langle\,\cdot\,\rangle_{\Lambda,\beta,1}$ with $L_s = L_t =: L$, we find that the average magnetization defined as
$$M = \frac{\Big[L^{-4}\big\langle\big(\sum_x\vec n_x\big)^2\big\rangle_L\Big]^{1/2}}{\Big[L^{-2}\big\langle\sum_x\vec n_x^{\,2}\big\rangle_L\Big]^{1/2}} \tag{2.27}$$
does not vanish, rather seems to increase with $L$: on a $32^2$ lattice it is 0.7197, on $64^2$ it is 0.7277, and on $128^2$, 0.7361. The reason for this behavior becomes clearer if one considers the 0-component $n^0$ of the spin: while it is fixed to unity at the origin, it grows logarithmically with the distance. But large zero components necessarily imply large spatial components (as $n\cdot n=1$), and consequently a large ferromagnetic coupling between neighboring spins.
To understand this phenomenon in more detail, it is useful to look at the zero-curvature limit of the $SO(1,2)$ model, which is just the Gaussian model of a two-component massless free field, defined by the lattice action
$$S = \frac{\beta}{2}\sum_{x,\mu}\big(\vec n_x - \vec n_{x+\hat\mu}\big)^2. \tag{2.28}$$
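In this model the fixed spin condition means that the spin at the origin $\vec n_0$ is fixed to 0. Furthermore, we can compute the counterpart of (2.27) analytically. First one has
$$\langle n^i_x n^k_y\rangle_L = \frac{\delta_{ik}}{\beta}\big[-D(x-y) + D(x) + D(y)\big], \tag{2.29}$$
where the well-known function $D$ is given by
$$D(x) = \frac{1}{L^2}\sum_{l_1,l_2}\frac{1-\exp\big(2\pi i\,l\cdot x/L\big)}{\sum_\mu\big(2-2\cos\frac{2\pi l_\mu}{L}\big)} \tag{2.30}$$
with the sum running over $l_1,l_2 = 0,1,\dots,L-1$ but $l_1=l_2=0$ omitted. Using this, we obtain for the square of the numerator of (2.27)
$$\frac{1}{L^4}\Big\langle\Big(\sum_x\vec n_x\Big)^2\Big\rangle_L = \frac{2}{\beta L^2}\sum_x D(x), \tag{2.31}$$
and for the square of the denominator
$$\frac{1}{L^2}\Big\langle\sum_x\vec n_x^{\,2}\Big\rangle_L = \frac{4}{\beta L^2}\sum_x D(x). \tag{2.32}$$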
It is apparent that in this model one gets $M = 1/\sqrt2 = 0.7071$, independent of the lattice size $L$. The closeness of this number to the numbers quoted above is striking. The small difference between the numbers is due to the curvature of the target space of the $SO(1,2)$ model, which becomes relevant as the spins fluctuate further from the fixed spin at the origin. This explains why the difference grows with growing $L$. On the other hand, by increasing $\beta$ the curvature should become less important; we checked this by measuring $M$ at $\beta=40$ on an $8^2$ lattice and found $M=0.707$ in agreement with these expectations.
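The value $M=1/\sqrt2$ for the Gaussian model can be reproduced by brute force on small lattices. The sketch below is an illustration under assumptions rather than the authors' procedure: it takes the pinned two-point function per field component to be $[-D(x-y)+D(x)+D(y)]/\beta$ with $D\ge0$ built from $1-\cos$, sums it over all pairs of sites, and forms the ratio that defines $M$. The function names are hypothetical.

```python
import numpy as np

def lattice_D(L):
    """D(x) = L^-2 sum_{l != 0} (1 - exp(2 pi i l.x/L)) / sum_mu (2 - 2 cos(2 pi l_mu/L))."""
    k = 2.0 * np.pi * np.arange(L) / L
    denom = (2 - 2 * np.cos(k))[:, None] + (2 - 2 * np.cos(k))[None, :]
    denom[0, 0] = 1.0  # placeholder only; the zero mode has vanishing numerator
    D = np.zeros((L, L))
    for x1 in range(L):
        for x2 in range(L):
            # The imaginary parts cancel under l -> -l, so the real part suffices.
            num = 1.0 - np.cos(k[:, None] * x1 + k[None, :] * x2)
            num[0, 0] = 0.0
            D[x1, x2] = (num / denom).sum() / L**2
    return D

def gaussian_magnetization(L, beta=1.0):
    """Magnetization ratio M for the two-component free field pinned at the origin."""
    D = lattice_D(L)
    idx = np.arange(L)
    X1, X2 = np.meshgrid(idx, idx, indexing="ij")
    x1, x2 = X1.ravel(), X2.ravel()
    Dx = D[x1, x2]
    Dxy = D[(x1[:, None] - x1[None, :]) % L, (x2[:, None] - x2[None, :]) % L]
    G = (-Dxy + Dx[:, None] + Dx[None, :]) / beta   # per-component two-point function
    numerator_sq = 2.0 * G.sum() / L**4              # two field components
    denominator_sq = 2.0 * np.trace(G) / L**2
    return np.sqrt(numerator_sq / denominator_sq)
```

The result does not depend on `beta` or on `L`, which mirrors the statement that the Gaussian value of $M$ is independent of the lattice size.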
We conclude that the apparent magnetic ordering of the lattice is due to the fact that
the spins fluctuate very far from the origin where $\vec n_0 = 0$, and these excursions necessarily
take place in a certain direction. If we think of the spins of the noncompact model as the
on-mass-shell momentum vectors of a unit mass particle, these vectors are constrained
by the action to be such that neighboring particles on the lattice are roughly collinear at
weak coupling. In the fixed spin gauge, the fixed vector at the origin then corresponds to a
particle at rest, surrounded by nonrelativistic neighbors. Far from the origin, the particles
become highly relativistic, collinear, and with a local center of momentum frame highly
boosted relative to the rest frame of the spin at the origin. This global drift of the center
of momentum frame can be prevented by a choice of gauge: indeed, this is precisely the
role of the gauge-fixing condition in the translationally invariant gauge, where the spins
are described in a frame in which the total spatial momentum vanishes. The fixing of only
a single spin is insufficient to arrest the gradual drift of the global center of mass frame
in the large volume limit: this phenomenon happens likewise in our model and in the two-component massless free field. Nevertheless one would not ascribe spontaneous symmetry
breaking to a Gaussian model.
In fact, this ordering of the lattice is really a gauge artifact: obviously, in the translation invariant gauge $\sum_x\vec n_x$ vanishes and so does the magnetization (2.27). We have checked that by boosting the configurations from the fixed spin gauge to the translation invariant gauge, the magnetization disappears: instead of one dominant direction of the spins we find domains of different spin orientations.
The upshot is that in the discussion of spontaneous symmetry breaking viewpoint (a)
should be adopted. The symmetry breaking bc must break the noncompact symmetries and
hence amount to something similar to a gauge fixing of a single spin leaving a maximal
compact subgroup, here $SO^\uparrow(N)$, intact. In this gauge we could not find any local observable that has a thermodynamic limit and shows breaking of the $SO^\uparrow(N)$ symmetry, and by
analogy with the Gaussian model discussed above, we do not think that such an observable
exists. The translation invariant gauge fixing is sometimes more convenient but should lead
to the same conclusion. In summary, although the usual theorems do not apply, we expect that for local observables in the two-dimensional model
$$\langle\rho(A)O\rangle_{\infty,\beta,i} = \langle O\rangle_{\infty,\beta,i},\quad i=1,2,\quad\forall A\in SO^\uparrow(N). \tag{2.33}$$
Here $\infty$ refers to a two-sided thermodynamic limit where $\Lambda\to\mathbb{Z}^2$. We shall offer some further comments on (2.33) in the conclusions.
3. The ground state sector

Spontaneous symmetry breaking, as described above on the level of correlation functions, only precludes the existence of an invariant ground state, without saying much about
the set of possible ground states and the possible action of the symmetry group on them.
In this section we present a Hamiltonian analysis of the ground state sector of the lattice
systems in a finite spatial volume, both for discrete and for continuous Euclidean time, that
results in a very concrete description of the ground state orbit mentioned in the introduction. The discussion, though limited to systems of finite spatial extent, is valid for SO(1, N )
sigma models in arbitrary (spatial) dimensions. An outlook on the thermodynamic limit via
the Osterwalder-Schrader reconstruction is given in Section 3.5.
3.1. Transfer operator in the Schrödinger representation
To address structural issues the transfer operator in the Schrödinger representation is useful. We begin its construction by describing the infinitesimal form of the $SO(1,N)$ representation $\rho$ in (2.7). Recall that we write $SO(1,N)$ for $SO_0(1,N)$. Then $A\in SO(1,N)$ if and only if $A$ preserves the bilinear form $a\cdot b = a^0b^0 - a^1b^1 - \dots - a^Nb^N$ on $\mathbb{R}^{1,N}$ and both sheets of the cone $a\cdot a = 0$, and has unit determinant. In matrix components the first condition becomes $A_a{}^c\,\eta_{cd}\,A_b{}^d = \eta_{ab}$, where $\eta = \mathrm{diag}(1,-1,\dots,-1)$, while the second condition amounts to $A_0{}^0 > 0$. For elements $t$ of the Lie algebra $so(1,N)$ the defining relation reads $ta\cdot b + a\cdot tb = 0$, or $t_a{}^c\eta_{cb} + \eta_{ad}t_b{}^d = 0$. An explicit basis is $(t^{ab})_c{}^d = \delta^a_c\eta^{bd} - \delta^b_c\eta^{ad}$, $0\le a<b\le N$. Consider the 1-parameter subgroups $\mathbb{R}\ni s\mapsto\exp\{st^{ab}\}$, generated by these basis elements and set

1 ab
1 d
ab
x t (n),
t ab (n) = d est nx , x t s=0 = d
Ls ds
Ls
xt

c
with x t ab = t ab nx
,
ncx
x t .
(3.1)
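The differential operators $\rho_x(t^{ab})$, $0\le a<b\le N$, generate commuting copies of the Lie algebra $so(1,N)$ at each site:
$$\big[\rho_x(t^{ab}),\rho_y(t^{cd})\big] = \delta_{xy}\Big(\eta^{ac}\rho_x(t^{bd}) - \eta^{ad}\rho_x(t^{bc}) + \eta^{bd}\rho_x(t^{ac}) - \eta^{bc}\rho_x(t^{ad})\Big). \tag{3.2}$$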
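The single-site commutation relations can be checked with matrices, since linear vector fields $X_A = (An)^c\,\partial/\partial n^c$ obey $[X_A,X_B] = X_{[B,A]}$. The sketch below assumes generators acting on column vectors as $(M^{ab}n)^c = \eta^{ca}n^b - \eta^{cb}n^a$; this storage convention and the helper names are assumptions made for the example, not taken from the text.

```python
import numpy as np
from itertools import combinations

def eta(N):
    """Metric diag(1, -1, ..., -1)."""
    return np.diag([1.0] + [-1.0] * N)

def M(a, b, N):
    """Assumed column-acting generator: (M n)^c = eta^{ca} n^b - eta^{cb} n^a."""
    e = eta(N)
    out = np.zeros((N + 1, N + 1))
    out[:, b] += e[:, a]
    out[:, a] -= e[:, b]
    return out

def form_defect(N):
    """Largest entry of M^T eta + eta M, which vanishes if M preserves the bilinear form."""
    e = eta(N)
    return max(np.abs(M(a, b, N).T @ e + e @ M(a, b, N)).max()
               for a, b in combinations(range(N + 1), 2))

def algebra_defect(N):
    """Largest violation of the so(1,N) relations in the vector-field ordering."""
    e = eta(N)
    worst = 0.0
    pairs = list(combinations(range(N + 1), 2))
    for a, b in pairs:
        for c, d in pairs:
            # [X_A, X_B] = X_{[B, A]} for linear vector fields, hence the reversed order.
            lhs = M(c, d, N) @ M(a, b, N) - M(a, b, N) @ M(c, d, N)
            rhs = (e[a, c] * M(b, d, N) - e[a, d] * M(b, c, N)
                   + e[b, d] * M(a, c, N) - e[b, c] * M(a, d, N))
            worst = max(worst, np.abs(lhs - rhs).max())
    return worst
```

Both defects should vanish to rounding accuracy for any `N`; the check exercises only the structure constants, not the differential operators themselves.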
The normalization of $\rho(t^{ab})$ (generating identical infinitesimal transformations in all variables) is adjusted such that they satisfy the same algebra as the $t^{ab}$. The quadratic Casimir of the local algebras coincides at each site with minus the Laplace-Beltrami operator on $H_N$ and is given by
$$C_x = -\Delta^{H_N}_x = \sum_{a<b,\,a',b'}\rho_x(t^{ab})\,\eta_{aa'}\eta_{bb'}\,\rho_x(t^{a'b'}). \tag{3.3}$$
We shall mainly need the following property of $-\Delta^{H_N}$ (omitting the site index momentarily): the spectrum of $-\Delta^{H_N}$ is purely continuous and is given by the interval $\frac14(N-1)^2+\omega^2$, $\omega>0$. Several complete orthonormal systems of improper eigenfunctions are known, see Appendix A and e.g. [25-27]. In representation theoretical terms this corresponds to a decomposition of the quasi-regular representation on $L^2(H_N)$ into a direct integral of unitary irreducible representations of the principal series (where unitarity of the representation refers to an induced inner product; see e.g. [25], vol. 2, Section 10.1.4). For the representation spaces one has
$$L^2(H_N) = \int^{\oplus} d\omega\,\mu_N(\omega)\,L_N(\omega) \tag{3.4}$$
with an absolutely continuous spectral measure $\mu_N(\omega)\,d\omega$ given in (A.3). Note that the (unitary) singlet representation is not (weakly) contained in the decomposition (3.4).
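Consider now the integral operator $T$ with kernel
$$t_\beta(n\cdot n';1) = \frac{1}{D_{\beta,N}}\exp\{\beta(1-n\cdot n')\},\qquad D_{\beta,N} = 2\Big(\frac{2\pi}{\beta}\Big)^{\frac{N-1}{2}}e^{\beta}\,K_{\frac{N-1}{2}}(\beta), \tag{3.5}$$
where $K_\nu(z)$ is a modified Bessel function. The kernel of the iterated operator $T^x$, $x\in\mathbb{N}$, is denoted by $t_\beta(n\cdot n';x)$. The normalization is such that
$$\int d\Omega(n)\,t_\beta(n\cdot n';x) = 1,\quad\forall n'\in H_N,\ x\in\mathbb{N}. \tag{3.6}$$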
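The normalization constant in (3.5) can be cross-checked against a direct integration over the hyperboloid. The sketch below is illustrative and rests on an assumption about the measure: it takes $d\Omega$ to be the standard invariant volume on $H_N$, which in geodesic polar coordinates reads $\mathrm{area}(S^{N-1})\,\sinh^{N-1}\theta\,d\theta$ with $n\cdot n'=\cosh\theta$. The Bessel function is evaluated from the integral representation $K_\nu(z)=\int_0^\infty e^{-z\cosh t}\cosh(\nu t)\,dt$; all function names are hypothetical.

```python
import math
import numpy as np

def trapezoid(values, step):
    """Composite trapezoidal rule on a uniform grid."""
    return step * (values.sum() - 0.5 * (values[0] + values[-1]))

def bessel_k(nu, z, t_max=12.0, num=120001):
    """K_nu(z) for real nu and z > 0 via its integral representation."""
    t = np.linspace(0.0, t_max, num)
    return trapezoid(np.exp(-z * np.cosh(t)) * np.cosh(nu * t), t[1] - t[0])

def D_closed_form(beta, N):
    """D = 2 (2 pi / beta)^((N-1)/2) e^beta K_{(N-1)/2}(beta)."""
    nu = 0.5 * (N - 1)
    return 2.0 * (2.0 * math.pi / beta) ** nu * math.exp(beta) * bessel_k(nu, beta)

def D_direct(beta, N, theta_max=12.0, num=120001):
    """Integral of exp(beta (1 - cosh theta)) over H_N (assumed standard volume)."""
    theta = np.linspace(0.0, theta_max, num)
    sphere_area = 2.0 * math.pi ** (N / 2.0) / math.gamma(N / 2.0)  # area of S^{N-1}
    f = np.sinh(theta) ** (N - 1) * np.exp(beta * (1.0 - np.cosh(theta)))
    return sphere_area * trapezoid(f, theta[1] - theta[0])
```

For $N=2$ the closed form reduces to $2\pi/\beta$, which gives an independent check of the Bessel quadrature.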
$T$ can be shown to be a bounded selfadjoint operator on $L^2(H_N)$ (see [18] for $N=2$). The spectrum of $T$ is absolutely continuous and can be computed exactly (see Appendix A). The spectral values come out as
$$\lambda_{\beta,N}(\omega) = \frac{K_{i\omega}(\beta)}{K_{\frac{N-1}{2}}(\beta)},\quad\omega\ge0. \tag{3.7}$$
They are smooth even functions of $\omega$ with a unique maximum at $\omega=0$. Although real and strictly bounded above by $K_0(\beta)/K_{(N-1)/2}(\beta) < 1$ they are positive only for $0\le\omega<\omega_+(\beta)$, where $\omega_+(\beta)$ increases with $\beta$ like $\omega_+(\beta)\sim\beta+\mathrm{const}\,\beta^{1/3}$. For $\omega>\omega_+(\beta)$ the behavior is oscillatory with exponentially decaying amplitude. Positivity however is restored in the (naive) continuum limit for the Euclidean time: introducing momentarily the lattice spacing $a$, continuum times $\tau = xa$, as well as a coupling $g^2 = 1/(\beta a)$, one finds
$$\lim_{a\to0}\Big[\lambda_{\frac{1}{g^2a},N}(\omega)\Big]^{\tau/a} = \exp\Big\{-\tau\,\frac{g^2}{2}\Big[\frac{(N-1)^2}{4}+\omega^2\Big]\Big\}. \tag{3.8}$$
This allows one to make contact with the heat kernel, i.e. with the integral kernel $\exp\big(\tau\frac{g^2}{2}\Delta^{H_N}\big)(n,n')$ of the exponentiated Laplace-Beltrami operator. From (3.8) one expects
$$\exp\Big(\tau\,\frac{g^2}{2}\Delta^{H_N}\Big)(n,n') = \lim_{a\to0}\,t_{\frac{1}{ag^2}}\Big(n\cdot n';\Big[\frac{\tau}{a}\Big]\Big), \tag{3.9}$$
where $[x]$ denotes the integer part of $x\in\mathbb{R}$. On the one hand this can be shown to lead to the correct path integral (see Eq. (II.30) and Appendix D of [26]). On the other hand one can insert the spectral resolution for $T$ to obtain that of the heat kernel. Then the spectral values $\lambda_{\beta,N}$ of $T$ are simply replaced with their continuum counterparts (3.8), see (A.13), (A.15). Massaging this integral further one can show the equivalence to the usual expression for the heat kernel on $H_N$, quoted in Appendix A. For our purposes a crucial property is the strict positivity
$$\exp\Big(\tau\,\frac{g^2}{2}\Delta^{H_N}\Big)(n,n') > 0,\quad\forall n,n'\in H_N. \tag{3.10}$$
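The properties of the spectral values (3.7) and the continuum limit (3.8) can be explored numerically. The following sketch is an illustration, not the computation of Appendix A: it evaluates $e^{\beta}K_\mu(\beta)$ from the integral representation $K_\mu(z)=\int_0^\infty e^{-z\cosh t}\cosh(\mu t)\,dt$, which for $\mu=i\omega$ turns the hyperbolic cosine into $\cos(\omega t)$. The common factor $e^{\beta}$ cancels in the ratio and is kept only for numerical stability at large $\beta$. Function names and grid sizes are choices made here.

```python
import math
import numpy as np

def _trapezoid(values, step):
    return step * (values.sum() - 0.5 * (values[0] + values[-1]))

def scaled_k_real(nu, beta, t_max=12.0, num=240001):
    """e^beta * K_nu(beta) for real order nu."""
    t = np.linspace(0.0, t_max, num)
    return _trapezoid(np.exp(-beta * (np.cosh(t) - 1.0)) * np.cosh(nu * t), t[1] - t[0])

def scaled_k_imag(omega, beta, t_max=12.0, num=240001):
    """e^beta * K_{i omega}(beta) for purely imaginary order."""
    t = np.linspace(0.0, t_max, num)
    return _trapezoid(np.exp(-beta * (np.cosh(t) - 1.0)) * np.cos(omega * t), t[1] - t[0])

def spectral_value(omega, beta, N):
    """lambda_{beta,N}(omega) = K_{i omega}(beta) / K_{(N-1)/2}(beta)."""
    return scaled_k_imag(omega, beta) / scaled_k_real(0.5 * (N - 1), beta)

def continuum_value(omega, N, g2, tau):
    """Right-hand side of the continuum limit: exp(-tau g^2/2 [(N-1)^2/4 + omega^2])."""
    return math.exp(-tau * 0.5 * g2 * (0.25 * (N - 1) ** 2 + omega ** 2))

def lattice_value(omega, N, g2, tau, a):
    """[lambda_{1/(g^2 a), N}(omega)]^(tau/a) at finite lattice spacing a."""
    return spectral_value(omega, 1.0 / (g2 * a), N) ** (tau / a)
```

At $\beta=1$, $N=2$ the maximum sits at $\omega=0$ and lies below 1, and the value becomes negative for sufficiently large $\omega$, which illustrates why $T$ itself is not a positive operator. For small lattice spacing `lattice_value` approaches `continuum_value`, with a deviation that shrinks as `a` is reduced.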
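For finite lattice spacing we can define $T$ in the Schrödinger representation through its spectral resolution. That is,
$$T = \hat\lambda_{\beta,N}\big(-\Delta^{H_N}\big),\quad\text{with}\quad\hat\lambda_{\beta,N}(s) := \lambda_{\beta,N}\Big(\sqrt{s-(N-1)^2/4}\Big),\quad s\ge\tfrac14(N-1)^2. \tag{3.11}$$
These results for $T$ directly carry over to the kinetic part of the transfer operator $T_0$ which we define through its integral kernel
$$D_{\beta,N}^{-L_s^d}\exp\Big\{\frac{\beta}{2}\sum_{x\in\Lambda_t}\big(n_x-n_{x+\hat D}\big)^2\Big\} = \prod_{x\in\Lambda_t}t_\beta\big(n_x\cdot n_{x+\hat D};1\big) \tag{3.12}$$
with the same normalization constant as in (3.5). In the Schrödinger representation this gives
$$T_0 = \prod_{x\in\Lambda_t}\hat\lambda_{\beta,N}\big(-\Delta^{H_N}_x\big). \tag{3.13}$$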
In particular $T_0$ has absolutely continuous spectrum given by
$$\sigma(T_0) = \sigma_{a.c.}(T_0) = \overline{\Big\{\prod_{x\in\Lambda_0}\lambda_{\beta,N}(\omega_x)\ \Big|\ \omega_x>0\Big\}}, \tag{3.14}$$
where the overbar refers to the closure in $\mathbb{R}$. Its supremum, i.e. the spectral radius $\varrho(T_0)$ of $T_0$ equals $\varrho(T_0) = \lambda_{\beta,N}(0)^{L_s^d}$. On the other hand $T_0$ is clearly symmetric with respect to the $L^2$ inner product and it is also bounded in the $L^2$ norm, since $T$ is. It follows that $T_0$ extends to a unique selfadjoint operator on $L^2$. As such its $L^2$ norm coincides with its spectral radius: $\|T_0\| = \varrho(T_0) = \lambda_{\beta,N}(0)^{L_s^d}$. The improper eigenfunctions of $T_0$ are of course just direct products of those of $T$ (see Appendix A) and thus are manifestly nonnormalizable with respect to the $L^2$ norm.
To proceed we introduce (unbounded) multiplication operators $\hat n_x$, $x\in\{1,\dots,L_s\}^d$, such that for any function $V(n)$ on $H_N^{L_s^d}$ (in the Schwartz space of rapidly decreasing smooth functions, say) $V(\hat n)\,\psi(n_x, x\in\Lambda_t) = V(n_x, x\in\Lambda_t)\,\psi(n_x, x\in\Lambda_t)$. Defining the potential operator
$$V(\hat n) := \sum_x\sum_{\mu\ne D}\big(\hat n_x\cdot\hat n_{x+\hat\mu} - 1\big), \tag{3.15}$$
one readily verifies from (3.12) and (2.4) the following expression for the transfer operator in the Schrödinger representation:
$$T = \exp\Big\{-\frac{\beta}{2}V(\hat n)\Big\}\,T_0\,\exp\Big\{-\frac{\beta}{2}V(\hat n)\Big\}. \tag{3.16}$$
It is again bounded and symmetric with respect to the $L^2$ inner product and hence defines a unique selfadjoint operator. The norm satisfies
$$\|T\| \le \|T_0\| = \lambda_{\beta,N}(0)^{L_s^d} = \Big(\frac{K_0(\beta)}{K_{\frac{N-1}{2}}(\beta)}\Big)^{L_s^d} < 1. \tag{3.17}$$
This means $T$ is a contraction satisfying $(\psi, T^t\psi) \le \|T\|^t(\psi,\psi)$, for all $\psi\in L^2$, $t\in\mathbb{N}$. In particular $\lim_{t\to\infty}(\psi, T^t\psi) = 0$. The norm $\|T\|$ is of particular interest because for a bounded selfadjoint operator like $T$ it coincides with the spectral radius and hence can be thought of as the exponential of minus the ground state energy. In Sections 3.2 and 3.3 we shall try to narrow in on the associated (improper) eigenspace.
Of course in contrast to T0 neither the spectrum of T nor its eigenfunctions are analytically accessible (except, perhaps, for very small Ls ). However, there are some generic
properties which the transfer operator of most lattice systems in finite volume has, but
which one should not expect T to have, taking the known properties of T0 as a guideline.
Since one is very much tempted to tacitly assume them, we list the expected nonproperties here:
• The eigenfunctions of $T$ should not be expected to be normalizable with respect to the $L^2$ norm.
• $T$ should not be expected to be a positive operator; for discrete Euclidean times then no lattice Hamiltonian, $-\ln T$, exists.
• The eigenspaces of $T$ (considered for instance as spaces of smooth bounded functions) carry representations of $SO(1,N)$ (the restrictions of $\rho$) which by the first point cannot be expected to be unitary with respect to the $L^2$ norm.
To the first point one can add that if there was a normalizable ground state it would necessarily have to be noninvariant, by the remark made after Eq. (2.7). The second point is purely technical and could be avoided by working with $T^2$ instead of $T$. Alternatively one
could start with a different lattice action: the heat kernel action would be an obvious choice.
For our purposes the most natural way to ensure the existence of a Hamiltonian is to take
the continuum limit in Euclidean time at the outset. This limit exists as we shall argue now.
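To this end we assign a coupling $\beta_2$ to the kinetic part in $T$ and a coupling $\beta_1$ to the potential part and write $T_{\beta_1,\beta_2}$ for the result. The $k$th power of this operator is given by
$$T^k_{\beta_1,\beta_2} = e^{-\frac12\beta_1V}\Big(T_{0,\beta_2}\,e^{-\beta_1V}\Big)^k e^{\frac12\beta_1V} \tag{3.18}$$
in accordance with the integral kernel (2.6). We introduce the lattice spacing $a$ in the Euclidean time direction, writing $\tau = ka$ for the continuum time, and set $\beta_2 = \frac{1}{g^2a}$, $\beta_1 = \frac{a}{g^2}$. For large $\beta_2$ one has $T\approx\exp\big(\frac{1}{2\beta_2}\Delta^{H_N}\big)$, so if we heuristically replace $T$ by this heat kernel, we are led to consider instead of (3.18) the sequence
$$e^{-\frac{\tau}{2kg^2}V}\Big(e^{\frac{\tau g^2}{2k}\sum_x\Delta^{H_N}_x}\,e^{-\frac{\tau}{kg^2}V}\Big)^k e^{\frac{\tau}{2kg^2}V}, \tag{3.19}$$
which is recognized as Trotter approximant (for $k\to\infty$) of
$$\exp\Big\{-\tau\Big(-\frac{g^2}{2}\sum_x\Delta^{H_N}_x + \frac{1}{g^2}V\Big)\Big\}, \tag{3.20}$$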
where the operator in the exponent is interpreted as the self-adjoint operator given by the form sum of $-\frac{g^2}{2}\sum_x\Delta^{H_N}_x$ and $\frac{1}{g^2}V$. Since both operators are unbounded but positive, actually Kato's strong Trotter product formula [28] and its refinements [29] could be applied to show that (3.19) converges strongly and even in trace norm to (3.20). But since we used the heat kernel approximation to $T_0$ above in an informal way, this does not lead to a rigorous proof of the Hamiltonian limit. Probably with more work this could be done, but we do not really need this here; we simply take the semigroup $\mathbb{R}_+\ni\tau\mapsto T_\tau$, where
$$T_\tau = e^{-\tau H},\qquad H = -\frac{g^2}{2}\sum_{x\in\Lambda_t}\Delta^{H_N}_x + \frac{1}{g^2}V, \tag{3.21}$$
as the definition of the continuum dynamics. The essential self-adjointness of $H$ on the space of smooth functions of compact support can also be seen directly, using the results of [30,31]. $T_\tau$ is a strongly continuous contraction semigroup. Both the semigroup $T_\tau$, $\tau>0$, and its generator $H$ commute with $\rho$. From (3.21) and (3.17) one infers the bound $\|T_\tau\| \le \exp\big(-\tau L_s^d\,g^2(N-1)^2/8\big)$, so that
$$\inf\sigma(H) \ge L_s^d\,\frac{g^2}{8}(N-1)^2. \tag{3.22}$$
Since $H$ is an unbounded but manifestly positive operator one could now search for a ground state in the usual way. Technically it is more convenient to work with the bounded positive operator $T^2$ or the semigroup $T_\tau$, $\tau>0$.
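The structure of the Trotter approximant (3.19) can be illustrated in a finite-dimensional toy model. This sketch is only an analogy under explicit assumptions: the unbounded operators are replaced by small real symmetric matrices (`K` positive semidefinite, `V` diagonal and nonnegative), so none of the functional-analytic subtleties of the text arise. The helper names are hypothetical.

```python
import numpy as np

def expm_sym(A):
    """exp(A) for a real symmetric matrix via its eigen-decomposition."""
    w, U = np.linalg.eigh(A)
    return (U * np.exp(w)) @ U.T

def trotter_approximant(K, V, tau, k):
    """e^{-tau V/2k} (e^{-tau K/k} e^{-tau V/k})^k e^{+tau V/2k}, mirroring the sequence in the text."""
    half = expm_sym(-0.5 * tau * V / k)
    half_inverse = expm_sym(0.5 * tau * V / k)
    step = expm_sym(-tau * K / k) @ expm_sym(-tau * V / k)
    return half @ np.linalg.matrix_power(step, k) @ half_inverse

def trotter_error(k, tau=1.0, dim=5, seed=0):
    """Max-norm distance between the approximant and exp(-tau (K + V))."""
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(dim, dim))
    K = A.T @ A / dim                               # toy "kinetic" part, positive semidefinite
    V = np.diag(rng.uniform(0.0, 1.0, size=dim))    # toy multiplication operator, nonnegative
    exact = expm_sym(-tau * (K + V))
    return np.abs(trotter_approximant(K, V, tau, k) - exact).max()
```

Because the outer factors turn the product into a symmetric splitting, the error in this toy setting decreases roughly like $1/k^2$ as `k` grows.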
3.2. Existence of positive ground state wave functions

We begin by showing that $T^2$ and $T_\tau$ have no normalizable ground states. Here the concept of a positivity preserving or positivity improving operator $T$ is useful [32]. For convenience we recall the definitions. A nonzero function $\psi\in L^2$ is called positive if $\psi(n)\ge0$ almost everywhere (a.e.) (that is, outside a set of measure zero in $H_N^{L_s^d}$ with respect to the product of the invariant measure on $H_N$) and strictly positive if $\psi(n)>0$ a.e. Then $T$ is called positivity preserving if $(T\psi)(n)\ge0$ a.e. for every positive $\psi$, and positivity improving if $(T\psi)(n)>0$ a.e. for every positive $\psi$. The latter is equivalent to $(\psi_1, T\psi_2)>0$ for all positive $\psi_1,\psi_2$.
The classic use of the positivity improving property is to establish the uniqueness of
a ground state once it is known to be normalizable, see [33] or [34] for an application to
gauge theories. Here the argument works somewhat differently: the positivity improving
property entails that a ground state cannot be normalizable with respect to the L2 norm.
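For contrast with the noncompact case, the classic mechanism can be illustrated in finite dimensions, where normalizability is automatic. The following is only an analogue under stated assumptions: the 8-site Gaussian kernel `T` and the helper `ground_state` are hypothetical and are not objects of the sigma-model. By the Perron-Frobenius theorem a symmetric matrix with strictly positive entries has a nondegenerate largest eigenvalue whose eigenvector can be chosen strictly positive.

```python
import numpy as np

def ground_state(T):
    """Return all eigenvalues (ascending) and the eigenvector of the largest one."""
    evals, evecs = np.linalg.eigh(T)        # T is assumed real symmetric
    top = evecs[:, -1]
    top = top if top.sum() > 0 else -top    # fix the arbitrary overall sign
    return evals, top

# Hypothetical strictly positive symmetric kernel on 8 sites (illustration only).
x = np.arange(8)
T = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)

evals, psi0 = ground_state(T)
print("gap below the top eigenvalue:", evals[-1] - evals[-2])
print("top eigenvector strictly positive:", bool(np.all(psi0 > 0)))
```

In this finite setting the positive eigenvector is trivially normalizable; the point of the text is that for the noncompact models the analogous object exists only in a generalized sense.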
We first show that both $\mathbb{T}^2$ and the semigroup $T_\tau$ are indeed positivity improving. For $\mathbb{T}^t$, $t \in \mathbb{N}$, this is obvious because of the strict positivity of the kernels $T(n, n'; t)$ in (2.4) and (2.6), rendering all matrix elements with strictly positive functions strictly positive.
To show that $T_\tau$ is positivity improving we use the path integral representation of this semigroup, which is possible because of the strict positivity of the heat kernel (3.10). From [26,35,36] one infers

$$T_\tau(n, n') = \int \exp\Big(-\int_0^\tau V(\omega(t))\, dt\Big)\, d\mu_0(\omega), \tag{3.23}$$

where $d\mu_0(\omega)$ is the measure describing Brownian motion on $H_N^{L_s^d}$ with starting configuration $n$ and end configuration $n'$, and $V$ is (up to a trivial normalization) the potential defined in (3.15). The paths $\omega$ are continuous a.e. with respect to the measure $d\mu_0(\omega)$, and for this reason the integrand is strictly positive a.e., so strict positivity of the kernel $T_\tau(n, n')$ follows, i.e., $T_\tau$ is positivity improving. Clearly the $T_\tau$, $\tau > 0$, are also positive operators, while for the transfer operator it is convenient to work with the manifestly positive square $\mathbb{T}^2$.
To proceed we recall from [32], Section XIII.12, the following general result: if $\|T\|$ is an eigenvalue with a (proper) eigenvector $\psi$, the eigenvalue is nondegenerate and the eigenvector can be chosen strictly positive. The latter statement corresponds to the folklore that a (normalizable) ground state wave function does not have nodes. Applied to the case at hand, an immediate consequence of this result is that neither $\mathbb{T}^2$ nor $T_\tau$ has a normalizable ground state. This is because such a ground state would have to be unique, and thus (since $\mathbb{T}^2$ and $T_\tau$ commute with $\rho$) would have to be a singlet under the SO(1, N) action $\rho$. However we already know that invariant wave functions are never normalizable because of the overcounting of the group volume, and thus arrive at a contradiction. In other words, for the noncompact sigma-models $\|\mathbb{T}^2\| \notin \sigma_{pp}(\mathbb{T}^2)$, $\|T_\tau\| \notin \sigma_{pp}(T_\tau)$, where $\sigma_{pp}$ denotes the pure point spectrum. We are thus faced with the unusual situation that $\|\mathbb{T}^2\|$ and $\|T_\tau\|$ must lie in the continuous and hence in the essential spectrum of the corresponding operators. In fact [37], $\mathbb{T}^2$ and $T_\tau$ have only essential spectrum:

$$\sigma(\mathbb{T}^2) = \sigma_{ess}(\mathbb{T}^2), \qquad \sigma(T_\tau) = \sigma_{ess}(T_\tau). \tag{3.24}$$
Recall ([38], Section VII.3) that for a bounded selfadjoint operator $T$ the spectrum $\sigma(T)$ decomposes into two disjoint sets, the discrete spectrum $\sigma_{disc}(T)$ and the essential spectrum $\sigma_{ess}(T)$, where $\sigma_{ess}(T)$ is a closed subset of $\mathbb{R}$. In terms of the spectral projectors $P_I$ this amounts to the distinction: $\lambda \in \sigma_{disc}(T)$ iff the range $\mathrm{Ran}\,P_I$ of $P_I$ is finite-dimensional for some open interval $I$ containing $\lambda$, and $\lambda \in \sigma_{ess}(T)$ otherwise. Weyl's criterion states that $\lambda \in \sigma(T)$ if and only if there is a family of normalized vectors $(\psi_n)_{n \in \mathbb{N}}$ such that $\lim_{n\to\infty} \|(T - \lambda)\psi_n\| = 0$. Further $\lambda \in \sigma_{ess}(T)$ if and only if the vectors $\psi_n$ can be chosen orthogonal, so that their weak limit vanishes, i.e. $\lim_{n\to\infty} (\phi, \psi_n) = 0$ for all $\phi \in L^2$.
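A textbook example may help to make Weyl's criterion concrete. The sketch below does not use the sigma-model transfer operator; it assumes the nearest-neighbour averaging operator $(T\psi)_k = (\psi_{k-1} + \psi_{k+1})/2$ on $\ell^2(\mathbb{Z})$, whose norm 1 is a spectral value but not an eigenvalue. The function name `weyl_residual` is hypothetical. Normalized indicator functions of growing intervals serve as approximate eigenvectors: the residual equals $1/\sqrt{2n+1}$ and tends to zero, while the vectors themselves converge weakly to zero.

```python
import numpy as np

def weyl_residual(n):
    """Norm of (T - 1) psi_n, with psi_n the normalized indicator of 2n+1 sites."""
    size = 2 * n + 5                       # window that holds the support of T psi_n
    psi = np.zeros(size)
    psi[2:2 * n + 3] = 1.0                 # 2n+1 consecutive sites
    psi /= np.linalg.norm(psi)
    t_psi = np.zeros(size)
    # (T psi)_k = (psi_{k-1} + psi_{k+1}) / 2; the two edge sites of the window stay zero.
    t_psi[1:-1] = 0.5 * (psi[:-2] + psi[2:])
    return np.linalg.norm(t_psi - psi)

for n in (10, 100, 1000):
    print(n, weyl_residual(n), 1.0 / np.sqrt(2 * n + 1))
```

The two printed columns agree, which is the statement that $1 \in \sigma(T)$ although no square-summable eigenvector exists.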
Note that $\mathbb{T}^2$ also does not have ground states in the weak sense, i.e., vectors $\Psi \in L^2$ satisfying $(\psi, (\mathbb{T}^2 - \|\mathbb{T}^2\|)\Psi) = 0$ for all $\psi \in L^2$. This is because such a $\Psi$ would also be a normalizable ground state in the ordinary sense. All this of course applies to $T_\tau$ as well.
Definitions. In the following we denote by $\mathbb{T}$ a transfer operator without strong or weak ground states. By a transfer operator we shall mean a bounded selfadjoint operator that is positive as well as positivity improving (and possibly subject to some subsidiary technical conditions). In this situation one will naturally search for weak ground states of $\mathbb{T}$, i.e., solutions of $(\psi, (\mathbb{T} - \|\mathbb{T}\|)\Psi) = 0$ for all $\psi$ in a suitable function space, where $\Psi$ is a vector in the dual space. Specifically we take $\psi \in L^1$ and call $\Psi \in L^\infty$ a generalized ground state of $\mathbb{T}$ if $(\psi, (\mathbb{T} - \|\mathbb{T}\|)\Psi) = 0$ for all $\psi \in L^1$. The set of generalized ground states forms a linear subspace of $L^\infty$ which we call the ground state sector $G(\mathbb{T})$ of $\mathbb{T}$. The existence of generalized ground states which are moreover strictly positive $L^\infty$ functions is guaranteed by a general result [37], a special case of which we describe here.
Let $M$ be a locally compact space and $\mu$ a regular $\sigma$-finite Borel measure on it; let $M \times M \ni (m, m') \mapsto T(m, m') \in \mathbb{R}_+$ be a function that is symmetric, continuous and strictly positive, i.e., $T(m, m') > 0$ a.e. We also assume

$$\sup_m \int d\mu(m')\, T(m, m') < \infty. \tag{3.25}$$

The last condition is sufficient (but by no means necessary) to ensure that $T$ defines a bounded operator $\mathbb{T}$ from $L^p$ to $L^p$ for $1 \le p \le \infty$; see [39] p. 173 and following pages. The operator norm $\|\mathbb{T}\|_{L^p \to L^p} = \sup_{\|\psi\|_p = 1} \|\mathbb{T}\psi\|_p$ is bounded by the supremum in (3.25) and coincides with it for $p = 1, \infty$. Positivity of the kernel entails that $\mathbb{T}$ is positivity improving. Positivity of the operator (that is, of its spectrum) is not automatic. However, if it is not satisfied we can switch to $\mathbb{T}^2$ and the associated integral kernel, where positivity is manifest. Without much loss of generality we therefore assume the kernel to be such that $\mathbb{T}$ is positive. As a bounded symmetric operator on $L^2$ the integral operator defined by $T(m, m')$ has a unique selfadjoint extension which we denote by the same symbol $\mathbb{T}$. The kernel of $\mathbb{T}^L$ will be denoted by $T(m, m'; L)$ for $L \in \mathbb{N}$. In this situation $\mathbb{T}$ and all its powers are transfer operators in the sense of the previous definition.
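The role of condition (3.25) can be checked numerically in a discretized setting. The sketch below assumes a hypothetical 50 x 50 symmetric matrix with strictly positive entries standing in for the kernel, with counting measure in place of $\mu$. The largest row sum is the discrete counterpart of the supremum in (3.25); it coincides with the operator norm for $p = 1$ and $p = \infty$ and bounds the norm for $p = 2$.

```python
import numpy as np

# Hypothetical discretized kernel: symmetric with strictly positive entries.
rng = np.random.default_rng(0)
A = rng.random((50, 50)) + 0.01
T = 0.5 * (A + A.T)

row_sum_bound = T.sum(axis=1).max()        # discrete analogue of the sup in (3.25)
norm_1 = np.linalg.norm(T, 1)              # operator norm on l^1 (max column sum)
norm_inf = np.linalg.norm(T, np.inf)       # operator norm on l^infinity (max row sum)
norm_2 = np.linalg.norm(T, 2)              # operator norm on l^2 (largest singular value)

print("bound from (3.25):", row_sum_bound)
print("p = 1, inf, 2 norms:", norm_1, norm_inf, norm_2)
```

For intermediate $p$ the same bound follows by interpolation, which is why the single condition (3.25) covers all $1 \le p \le \infty$.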
To prove the existence of generalized ground states as defined above, we need a technical assumption, namely that there exist an $m_* \in M$ (an extremizing configuration) and a subsequence $(L_j)_{j \in \mathbb{N}} \subset \mathbb{N}$ such that

$$T(m, m; L_j) \le T(m_*, m_*; L_j), \qquad \forall\, m \in M \text{ and } j \in \mathbb{N}. \tag{3.26}$$

The existence of generalized ground states is then guaranteed by
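Theorem 1. Let $\mathbb{T}$ be a positive integral operator with kernel $T(m, m')$ satisfying the conditions listed above. Let $m_* \in M$ be as in (3.26) and set

$$\Phi_j(v_{m_*}) := \frac{\mathbb{T}^{L_j} v_{m_*}}{(v_{m_*}, \mathbb{T}^{L_j} v_{m_*})}, \qquad v_{m_*}(m) := T(m, m_*), \tag{3.27}$$

where $(\,\cdot\,, \,\cdot\,)$ is the inner product on $L^2$. Using the assumption (3.26) it is shown in [37] that the $L^\infty$ norms of $\Phi_j(v_{m_*})$ are bounded uniformly in $j$. Therefore, there exists a subsequence $(j_k)_{k \in \mathbb{N}}$ such that the weak limit

$$\Psi_{m_*} := \text{w-}\lim_{k \to \infty} \Phi_{j_k}(v_{m_*}) \tag{3.28}$$

exists; because $(v_{m_*}, \Phi_j(v_{m_*})) = 1$ this limit does not vanish. It is a strictly positive function in $L^\infty$ and a generalized ground state for $\mathbb{T}$, i.e.,

$$\big(\psi, (\mathbb{T} - \|\mathbb{T}\|)\Psi_{m_*}\big) = 0 \quad \text{for all } \psi \in L^1. \tag{3.29}$$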
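The normalized iterates in (3.27) have a simple finite-dimensional counterpart, sketched below under explicit assumptions: the 8-site Gaussian kernel, the reference point `m = 3` and the helper `normalized_iterate` are hypothetical. In finite dimensions the sequence converges in norm to the ordinary ground state, whereas the theorem above only asserts a weak limit in $L^\infty$ along a subsequence.

```python
import numpy as np

def normalized_iterate(T, v, L):
    """Return T^L v / (v, T^L v), a finite-dimensional analogue of (3.27)."""
    w = v.copy()
    for _ in range(L):
        w = T @ w
        w /= np.linalg.norm(w)             # rescaling only; it cancels in the ratio below
    return w / (v @ w)

# Hypothetical strictly positive symmetric kernel (illustration only).
x = np.arange(8)
T = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)
v = T[:, 3]                                # v_m = T(., m) for the reference point m = 3
phi = normalized_iterate(T, v, 200)

top = np.linalg.eigvalsh(T)[-1]            # largest eigenvalue, equal to the norm of T
print("residual of (T - ||T||) phi:", np.linalg.norm(T @ phi - top * phi))
print("(v, phi) =", v @ phi)
```

The normalization $(v, \Phi_j) = 1$ is what prevents the limit from vanishing, exactly as in the statement of the theorem.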
Though our present proof requires (3.26) we expect that the conclusion of the theorem remains valid for a much larger class of transfer operators, which are not necessarily integral operators, and where in particular the condition (3.26) can be dropped.
In the case at hand, all but property (3.26) are manifest for our transfer operator $\mathbb{T}^2$. We conjecture that $\mathbb{T}^2$ satisfies (3.26), in fact with an extremizing configuration $n_* \in H_N^{L_s^d}$ where all spins equal $n^\uparrow$. In Section 3.3 we present in Theorem 2 a stronger result for $\mathbb{T}^2$ which in particular entails the existence of generalized ground states in the above sense.
Either way, the transfer operator $\mathbb{T}^2$ of the SO(1, N) nonlinear sigma model possesses strictly positive generalized ground states. Based on Theorem 1 (and the conjecture) it comes parameterized by a preferred configuration $n_* \in H_N^{L_s^d}$, i.e., $\Psi_{n_*} \in L^\infty(H_N^{L_s^d})$. This then gives rise to an entire orbit $\{\Psi_{A n_*},\ A \in SO(1, N)\}$ of strictly positive generalized ground states. In compact models the counterpart of this orbit would be trivial, i.e., would consist simply of the one-dimensional unitary representation $\Psi_{A n_*} = \Psi_{n_*}$, for all $A \in SO(1+N)$. It is a remarkable fact, ultimately rooted in the nonamenability of SO(1, N), that this does not happen here.
3.3. The structure of the ground state sector
In fact the ground state sector of the SO(1, N ) nonlinear sigma-models can be described
very explicitly. We first summarize the result informally and then give a precise version in
the form of a theorem.
SO(1, N ) nonlinear sigma-models defined on a finite d-dimensional spatial lattice have
infinitely many generalized ground states transforming irreducibly according to the limit
of the spherical principal series. Every generalized ground state of the system lies in the
linear hull of a single group orbit consisting of strictly positive functions.
The spherical or type 1 unitary principal series is the one where the inducing representation of SO(N − 1) is the singlet; see e.g. [59]. Note the sharp contrast to the ground state structure of the compact models:
SO(N + 1) nonlinear sigma-models defined on a finite d-dimensional spatial lattice have a unique ground state (which is an SO(N + 1) singlet and which is strictly positive up to a phase).
The precise form of the above statement is:
Theorem 2. Let $\mathbb{T}^2$ be the transfer operator (2.5), (2.6) of the SO(1, N) nonlinear sigma-model. Then $\mathbb{T}^2$ is an operator on $L^2 = L^2(H_N^{L_s^d})$ with purely essential spectrum. There exists a unique function $\psi_0(n)$ with the following properties: it is strictly positive and $\rho$-invariant, i.e., $\rho(A)\psi_0(n) = \psi_0(n)$, for all $A \in SO(1, N)$. Further $\psi_0(n)$ is independent of one variable, $n_{x_0}$ say, and square integrable in the other variables, $\int \prod_{x \ne x_0} d\Omega(n_x)\, \psi_0(n)^2 < \infty$. In terms of this function and $P(n) := H_{0,0,0}(n)$ as in (A.10) the ground state sector $G(\mathbb{T}^2)$ of $\mathbb{T}^2$ is given by

$$G(\mathbb{T}^2) \subset \mathrm{Span}\{\psi_0(n) P(A n_{x_0}),\ A \in SO(1, N)\}. \tag{3.30}$$

In particular all generalized ground states $\Psi \in G(\mathbb{T}^2) \subset L^\infty$ of $\mathbb{T}^2$ transform according to the limit $\pi_0$ of the spherical principal series and are contained in the linear hull of a single group orbit. Explicitly, the former means they transform equivariantly according to

$$\Psi(A^{-1} n) = (\pi_0(A)\Psi)(n), \qquad A \in SO(1, N). \tag{3.31}$$
The theorem in particular guarantees the existence of generalized ground states in the sense defined in Section 3.2. The crucial existence statement is that for the function $\psi_0(n)$. The proof of Theorem 2 is deferred to [37], where it appears as a special case of more general results. The application to the transfer operator $\mathbb{T}^2$ of the nonlinear sigma-model rests on a technical lemma which we present here:
Lemma. Let us denote by $f : H_N^{L_s^d} \to H_N^{L_s^d}$ a function satisfying $f_x(n) \cdot f_y(n) = n_x \cdot n_y$ for all $x, y \in \{1, \ldots, L_s\}^d$, $x, y \ne x_0$, and $f_{x_0}(n) = n^\uparrow$ for one $x_0$. Then

$$\int \prod_x d\Omega(n_x)\, T(n, f(n); 2) < \infty. \tag{3.32}$$
To see this we express the two-step transfer matrix in terms of the one-step kernel $T(n, n'; 1)$ defined in Eq. (2.4). This gives

d(nx ) T n, f (n); 2
x

d(nx ) d(nx )

nx nx + fx (n) + nx nx+ + nx nx+ 4 . (3.33)
exp
x
=D
We now estimate $\sum_{x \ne x_0} n'_x \cdot f_x(n) \ge L_s^d - 1$ and view the result of the $n_x$ integrations as a function $F(n')$. It is $\rho$-invariant so that one of the spins can be frozen to a fixed value, say $n'_{x_0} = n^\uparrow$. The $n_{x_0}$ integration then can be done and one obtains

d(nx ) T n, f (n); 2
x
D,N e(Ls 1)
d

d(nx )
d(nx )
x
=x0

exp
[nx nx + nx nx+ + nx nx+ 3]
=D

x nx nx
nx =n
(3.34)
nx0
1 the integrals factorize into products of gauge-fixed
Using
one-dimensional partition functions and hence is manifestly finite.
The property (3.32) guarantees that to our transfer operator $\mathbb{T}^2$ (with only essential spectrum) one can associate a compact transfer operator $(\mathbb{T}^2)_0$ (with only discrete spectrum) whose unique ground state wave function is the $\psi_0(n)$ featuring in Theorem 2; see [37] for details.
As stated in the theorem, the evolution operators of the nonlinear sigma-models given by $\mathbb{T}^2$ (discrete Euclidean time) or $T_\tau$ (continuous Euclidean time) both have purely essential spectrum. The essential spectrum arises from the fact that $\mathbb{T}^2$ or $T_\tau$ cannot have generalized eigenstates transforming according to a finite-dimensional irreducible representation of SO(1, N). The spectrum of these operators is a closed bounded subset of $[0, \infty)$. Although there can be no normalizable eigenfunctions for the spectral values $\|\mathbb{T}^2\|$ or $\|T_\tau\|$, it is not excluded that there exist infinite multiplets of normalizable eigenfunctions corresponding to other spectral values (eigenvalues). This situation would be interesting in that it might pave the way to a (quasi-)particle interpretation of the spectrum. In contrast to the universal structure of the ground state sector the existence or nonexistence of such normalizable multiplets is a specific dynamical feature. In the case of the transfer matrix (3.16) the spectrum remains purely essential for any invariant potential; for certain potentials infinite multiplets of normalizable eigenfunctions might exist. In the absence of a potential, however, this is excluded. Indeed, from (3.14) we know that the kinetic part $\mathbb{T}_0$ and $\exp(-\tau H_0)$, $H_0 := -\frac{g^2}{2}\sum_x \Delta_x^{H_N}$, have absolutely continuous spectrum.
The example of the kinetic part $\mathbb{T}_0$ of the transfer operator (to which all results of course apply in particular) can be used to get a feeling for how the generalized ground states manage to be linear combinations of real and strictly positive functions: a complete system of real eigenfunctions is given by the tensor products of the functions (A.10). Projecting out the $SO^\uparrow(N)$ singlet yields

$$\mathbb{T}_0 H_{\omega,L}(n) = \prod_{x \in \Lambda_t} \lambda_{\beta,N}(\omega_x)\, H_{\omega,L}(n), \quad \text{with } H_{\omega,L}(n) := \int_{SO^\uparrow(N)} d\mu(A) \prod_{x \in \Lambda_t} H_{\omega_x, l_x, m_x}(A n_x), \tag{3.35}$$

where the multi-index $L$ refers to the set $l_x, m_x$, $x \in \Lambda_t$, and $\omega = (\omega_1, \ldots, \omega_{L_s^d}) \in \mathbb{R}_+^{L_s^d}$. On the other hand we know from (3.7), (3.14) that the supremum of the spectral values of $\mathbb{T}_0$ is assumed if $\omega_x \to 0$, for all $x \in \Lambda_t$. The limiting eigenfunctions $H_{0,L}$ are real and strictly positive almost everywhere (a.e.). On the other hand an intertwiner
Q(|)
from LN (1 ) LN (Ls ) to LN () can be seen to contain ( x x ) as
a factor. Assuming that Q(|) has a well-behaved limit, the irreducible component
(n) = (Q(|)H,L )(n), will likewise be real and positive a.e. as 0, in accordance with the above result.
Theorem 2 is a special case of far more general results proven in [37]. Roughly speaking the above structure of the ground state sector turns out to arise mainly from the interplay between group theory and the general properties of a transfer operator. It thus admits a generalization largely independent of the details of the dynamics, which we outline here. Let $\mathbb{T}$ be any transfer operator in the sense defined in Section 3.2. Then [37]:
• $\mathbb{T}$ has purely essential spectrum, $\sigma(\mathbb{T}) = \sigma_{ess}(\mathbb{T})$; in particular it is not compact or trace class.
• Once the existence of a single strictly positive $L^\infty$ ground state is guaranteed the ground state sector $G(\mathbb{T})$ of $\mathbb{T}$ assumes a certain universal form, independent of the details of the dynamics!
• There exists a transfer operator $\mathbb{T}_0$, uniquely associated with $\mathbb{T}$ and with the same spectral radius, such that the ground state sector $G(\mathbb{T})$ is related to $G(\mathbb{T}_0)$ by

$$G(\mathbb{T}) \subset \mathrm{Span}\{\psi_0(n) P(A n_{x_0}),\ A \in SO(1, N),\ \psi_0 \in G(\mathbb{T}_0)\}. \tag{3.36}$$

Typically $\mathbb{T}_0$ is a compact operator so that $G(\mathbb{T}_0)$ is one-dimensional and by (3.36) all generalized ground states of $\mathbb{T}$ transform according to the limit $\pi_0$ of the spherical principal series: $\Psi(A^{-1} n) = (\pi_0(A)\Psi)(n)$, $A \in SO(1, N)$.
3.4. Thermodynamic limit, SSB, and time-slice bc
In a Hamiltonian formulation the thermodynamic limit is hard to control because the Hilbert space changes. The way to proceed is to take the thermodynamic limit on the level of the correlation functions and then reconstruct a Hilbert space formulation via an Osterwalder-Schrader reconstruction. We return to the second aspect in Section 3.5. Of course even on the level of expectation values the thermodynamic limit is difficult to control. Interestingly there is an elegant argument saying that the limit of the functional measures (whenever it exists as a mean) cannot be invariant for all $D \ge 1$.
Theorem 3. Expectations $\langle\,\cdot\,\rangle_{\Lambda,\beta,bc}$ of the SO(1, N) nonlinear sigma-model defined on a finite D-dimensional lattice $\Lambda$ with $SO^\uparrow(N)$ invariant bc and gauge fixing cannot have an SO(1, N) invariant thermodynamic limit $\langle\,\cdot\,\rangle_{\infty,\beta,bc} := \lim_{\Lambda \to \mathbb{Z}^D} \langle\,\cdot\,\rangle_{\Lambda,\beta,bc}$. Specifically, there exist bounded continuous functions $O(n)$ of one spin such that

$$\langle O(An) \rangle_{\infty,\beta,bc} \ne \langle O(n) \rangle_{\infty,\beta,bc} \quad \text{for some } A \in SO(1, N). \tag{3.37}$$
As noted in Section 2.3, for $D \ge 3$ a similar conclusion was reached in [21] by very different means and based on a different criterion. In the present setting the symmetry breaking is essentially a consequence of the fact that a nonamenable group does not have an invariant mean over nice function spaces [40]. Here we consider the space of bounded continuous functions $C_b(SO(1, N))$ on the group manifold. Equipped with the sup-norm it forms a commutative $C^*$-algebra with unit, so that the usual concept of a state applies.
The proof of Theorem 3 is a straightforward generalization of the argument originally presented for D = 1 in [18]. The expectation of a single-spin observable at $x = (x_1, \ldots, x_D)$ can be written as

$$\langle O(n_x) \rangle_{\Lambda,\beta,bc} = \int d\mu_{\Lambda,bc}(n; x)\, O(n_x). \tag{3.38}$$
By construction, for all finite lattices the one-spin measure $d\mu_{\Lambda,bc}(n; x)$, $x \in \Lambda_{x_D}$, is a normalized probability measure depending parametrically on $L_t$. One can thus view (3.38) as bounded, positive, and normalized linear functionals (states) on the functions in $C_b(SO(1, N))$ which happen to be independent of the variables in the $SO^\uparrow(N)$ subgroup. By the theorem of Banach-Alaoglu [38] there is therefore a subsequence of lattices on which the states $\langle\,\cdot\,\rangle_{\Lambda,\beta,bc}$ converge to a limiting state $\langle\,\cdot\,\rangle_{\infty,\beta,bc}$ on $C_b(SO(1, N))$. Because SO(1, N) is nonamenable this limiting state cannot be invariant. There must exist functions $Q \in C_b(SO(1, N))$ such that their average in the state $\langle\,\cdot\,\rangle_{\infty,\beta,bc}$ is noninvariant. For all finite lattices $\langle Q(B) \rangle_{\Lambda,\beta,bc} = \langle O(n) \rangle_{\Lambda,\beta,bc}$ holds, where $O(n)$, $n \in H_N = SO(1, N)/SO^\uparrow(N)$, is the $SO^\uparrow(N)$ average of the function $Q(B)$ and $n = B\,SO^\uparrow(N)$. Since both the observable and the sequence of states are $SO^\uparrow(N)$ invariant, the limit will also be invariant, $\langle Q(B) \rangle_{\infty,\beta,bc} = \langle O(n) \rangle_{\infty,\beta,bc}$. On the other hand by definition of $Q$ one has $\langle O(An) \rangle_{\infty,\beta,bc} = \langle Q(AB) \rangle_{\infty,\beta,bc} \ne \langle Q(B) \rangle_{\infty,\beta,bc} = \langle O(n) \rangle_{\infty,\beta,bc}$, for some $A \in SO(1, N)$, as claimed.
We remark that when the continuity requirement on the symmetry breaking observable is dropped, the key step in the argument follows more directly from the known characterizations of an amenable symmetric space. A symmetric space $G/H$ ($G$ a locally compact group and $H$ a maximal subgroup) is called amenable if there exists a $G$-invariant mean on $L^\infty(G/H)$ (see [41]). A unitary representation $\pi$ of a locally compact group $G$ on a Hilbert space $\mathcal{H}$ is called amenable if there exists a positive linear functional $\omega$ over $B(\mathcal{H})$ (the $C^*$-algebra of bounded linear operators on $\mathcal{H}$) such that $\omega(\pi(g) T \pi(g)^{-1}) = \omega(T)$ for all $g \in G$ and all $T \in B(\mathcal{H})$. Then the following three statements are equivalent (see [42] and [43]): (i) $G/H$ is an amenable symmetric space. (ii) The quasiregular representation $\rho_1$ of $G$ on $L^2(G/H)$ is amenable. (iii) The quasiregular representation $\rho_1$ almost has invariant vectors in the sense that for all compact $K \subset G$ and all $\epsilon > 0$ there exists a unit vector $\psi \in L^2(G/H)$ such that $\sup_{g \in K} \|\rho_1(g)\psi - \psi\| < \epsilon$. Applied to the case at hand we know that the quasiregular representation $\rho_1$ of SO(1, N) on $H_N$ does not almost have invariant vectors, e.g., from the explicit decomposition (3.4). Thus $H_N$ is a nonamenable symmetric space and by the above argument there must be symmetry breaking observables $O \in L^\infty(H_N)$, i.e., essentially bounded and measurable functions $O$ of one spin such that (3.37) holds.
In the rest of this section we present a nonrigorous argument that these symmetry breaking single spin observables are the rule rather than the exception. To this end we introduce expectations with a third type of boundary conditions which take advantage of the extremal configurations $n_* \in H_N^{L_s^d}$. As stated after Theorem 1 the existence of these extremal configurations, although unproven at present, is highly plausible for the transfer operator $\mathbb{T}^2$ of the noncompact sigma-models. The definition of the expectations $\langle\,\cdot\,\rangle_{\Lambda,\beta,3}$ based on these configurations is as follows. We set

$$\langle O \rangle_{\Lambda,\beta,3} := \int_{SO^\uparrow(N)} d\mu(A)\, \langle O \rangle^{A n_*}_{\Lambda,\beta,3}. \tag{3.39}$$

The expectations referring to $n_*$ are defined for a one-spin observable by

$$\langle O(n_x) \rangle^{n_*}_{\Lambda,\beta,3} = \int \prod_{x \in \Lambda_{x_D}} d\Omega(n_x)\, O(n_x)\, \frac{T(n_*, n_x; L_t/2 + x_D)\, T(n_*, n_x; L_t/2 - x_D)}{T(n_*, n_*; L_t)}, \tag{3.40}$$

for a two-spin observable by

$$\langle O(n_x, n_y) \rangle^{n_*}_{\Lambda,\beta,3} = \frac{1}{T(n_*, n_*; L_t)} \int \prod_{x \in \Lambda_{x_D}} d\Omega(n_x) \prod_{y \in \Lambda_{y_D}} d\Omega(n_y)\, T(n_*, n_x; L_t/2 + x_D)\, O(n_x, n_y)\, T(n_x, n_y; y_D - x_D)\, T(n_y, n_*; L_t/2 - y_D), \tag{3.41}$$

and so on. The notation in (3.40), (3.41) is the same as in (2.13); compared to the second type of bc the lattice now ranges over time slices $\Lambda_{x_D}$, $x_D = -L_t/2, \ldots, 0, \ldots, L_t/2$, with $L_t$ even; all spins in the time slices $\pm L_t/2$ are frozen to the special configuration $n_*$. Although we expect $n_*$ to be $SO^\uparrow(N)$ invariant (in that all spins can be chosen to be equal to $n^\uparrow$) the bc also work for arbitrary $n_* \in H_N^{L_s^d}$. The invariance under $SO^\uparrow(N)$ (which was a feature of the other two types of bc) then has to be restored by group averaging.
The advantage of these boundary conditions is that by Theorems 1 and 2 the $L_t \to \infty$ limit can be analyzed in a similar way as in the 1-dimensional model [18]. For example for SO(1, N) invariant observables one has

x , ny ) n
lim O(n
,,3
Lt

1
= TxD yD 2 N/2 (N/2)0 (n )

, nx )T (n , nx ; yD xD )0 (nx )P (n )x nx .
d(nx ) O(n
0
0
xxD
(3.42)
Here the limit is taken on a subsequence $(L_{t,j})_{j \in \mathbb{N}} \subset \mathbb{N}$ as in Theorem 1, and $\psi_0(n)$ is the invariant positive function in Theorem 2. The normalization is such that the limit functional obeys $\lim_{L_t \to \infty} \langle 1 \rangle^{n_*}_{\Lambda,\beta,3} = 1$. This line of argument clearly generalizes to the expectations of observables depending on any finite number of spins. For such observables also the subsequent $L_s \to \infty$ limit exists on subsequences, by the theorem of Banach-Alaoglu. Clearly this argument does not depend on the number of dimensions. We conclude:
The expectations $\langle O \rangle^{n_*}_{\Lambda,\beta,3}$ of all local SO(1, N) invariant observables, defined with $n = n_*$ boundary conditions at $x_D = \pm L_t/2$, have a pointwise finite and explicitly computable thermodynamic limit, $\lim_{L_s \to \infty} \lim_{L_t \to \infty} \langle O \rangle^{n_*}_{\Lambda,\beta,3}$.
For noninvariant observables the evaluation of the thermodynamic limit is more difficult. An exception are observables depending on a single spin only. We shall now argue that basically every nontrivial bounded function of a single spin will signal spontaneous symmetry breaking in the sense of (3.37). This can be seen when using type 3 bc combined with a slightly heuristic use of Theorems 1 and 2. To this end consider the family of measures $d\mu_{\Lambda,3}(n; x)$ in (3.40) with type 3 bc. Let us write $T(n_*, n_*; L_t) = O(d(L_t))$ for the leading asymptotics of the denominator. The density of the measures $d\mu_{\Lambda,3}(n; x)$, $x \in \Lambda_{x_D}$, then behaves as

$$d(L_t)\, \psi_0(n)^2\, P\big((n_*)_{x_0} \cdot n_{x_0}\big)^2 \quad \text{for } L_t \to \infty. \tag{3.43}$$

On general grounds $d(L_t) \to 0$ as $L_t \to \infty$ [37]. The density (3.43) thus vanishes pointwise in the limit. By Theorem 2, $\psi_0(n)$ is normalizable on $\mathcal{N} := H_N^{L_s^d - 1}$ with norm $\|\psi_0\|_{\mathcal{N}}$. Based on the factorization in (3.30) and (3.40), (3.43) one obtains

$$\lim_{L_t \to \infty} \langle O(n_x) \rangle_{\Lambda,\beta,3} = \|\psi_0\|^2_{\mathcal{N}} \lim_{L_t \to \infty} \int d\mu_{L_t}(n_{x_0})\, O(n_{x_0}). \tag{3.44}$$

Here $d\mu_{L_t}(n_{x_0})$ is a one-spin measure whose density scales like $d(L_t) P((n_*)_{x_0} \cdot n_{x_0})^2$ for $L_t \to \infty$. The point here is that the second factor on the right-hand side of (3.44) is independent of $L_s$ and, whenever the limit $L_t \to \infty$ exists, it is not SO(1, N) invariant. Assuming that this is the case one can take the $L_s \to \infty$ limit of Eq. (3.44). This affects only the $\|\psi_0\|^2_{\mathcal{N}}$ term, which is invariant for any finite $L_s$ and hence also in the limit. The second factor however is noninvariant and gives rise to spontaneous symmetry breaking for generic nontrivial bounded one-spin observables, as asserted.
3.5. Osterwalder-Schrader reconstruction
The purpose of the Osterwalder-Schrader reconstruction is to reconstruct a Hilbert space and transfer operator as well as a translation invariant state (vacuum) from correlation functions (or more generally expectation values) in the thermodynamic limit. Since the infinite volume limit for the transfer matrix cannot be taken (the Hilbert space changes), this is the only way in which a physical interpretation of the model in infinite volume can be achieved. The general procedure has been described in many places, see for instance [44-46]. The following discussion applies to SO(1, N) sigma-models in any dimension $D \ge 1$. For the 1-dimensional version the Osterwalder-Schrader reconstruction is discussed in detail in [18].
The crucial properties required in a lattice system are
(RP) Reflection positivity, and
(TI) Time translation invariance.
The property of RP in our model can be stated as follows: Denote by $C_+$ a suitable linear space of continuous functions of finitely many spins $n_x$ at positive times. If $O \in C_+$, let $\theta O$ be the complex conjugate of the same function of the time reflected spins. Then

$$\langle \theta O\, O \rangle_{\Lambda,\beta,bc} \ge 0. \tag{3.45}$$
It is satisfied by our model as long as the volume is finite, the temporal size $L_t$ is even and we use boundary conditions that are time-symmetric; this follows from the representation of the system in terms of the transfer matrix $\mathbb{T}$, see Section 2.1. For example it holds in the fixed spin gauge (bc = 2) with periodic bc in time direction if the fixed spin is chosen to have time coordinate $L_t/2$ (identified with $-L_t/2$). It also holds for the fixed time-slice gauge (bc = 3) considered in (3.41), (3.42). In the latter case the existence of a thermodynamic limit is guaranteed at least for SO(1, N) invariant observables. Translation invariance is manifest already on a finite lattice and thus holds trivially also in the limit. Since for SO(1, N) invariant functions the different gauge fixings are presumed equivalent this should entail the existence of a thermodynamic limit also for i = 1, 2 bc, where the limit is then given by the same formulas as for the type 3 bc. For the fixed spin gauge (i = 2) spatially periodic bc are used; translation invariance is not manifest on a finite lattice but should be restored in the limit. We remark that for SO(1, N) noninvariant observables, such as our Tanh order parameter, the properties (RP) and (TI) are not obvious, since on a finite lattice (RP) is violated in the translation invariant gauge, while (TI) is violated in the fixed spin gauge and in the fixed time slice gauge. Nevertheless it is reasonable to assume that both properties are restored in the thermodynamic limit. For noninvariant observables in $C_+$ we therefore assume here the existence of a translation invariant thermodynamic limit. We write $\langle\,\cdot\,\rangle_{\infty,\beta,i}$, i = 1, 2, 3, for the limiting functional with the above specifications. By construction it then also has the property RP.
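How reflection positivity follows from a transfer matrix representation can be seen in a toy setting. The sketch below is an illustration under stated assumptions, not the sigma-model itself: a periodic chain of even length `L` with six hypothetical spin values, a real symmetric one-step kernel `T`, and an observable $O = f(s_t)$ located at time $t > 0$. With $\theta O = \overline{f}(s_{-t})$ one has $\langle \theta O\, O \rangle = \mathrm{Tr}(D_f^* T^{2t} D_f T^{L-2t}) / \mathrm{Tr}\, T^L$, and the numerator equals the squared Frobenius norm of $T^t D_f T^{L/2 - t}$, hence is nonnegative. The function name `rp_expectation` is hypothetical.

```python
import numpy as np

def rp_expectation(T, f, t, L):
    """<theta(O) O> for O = f(s_t) on a periodic chain of even length L, 0 < 2t <= L."""
    D = np.diag(f)                                   # multiplication by the observable
    P = np.linalg.matrix_power
    numerator = np.trace(D.conj() @ P(T, 2 * t) @ D @ P(T, L - 2 * t))
    return numerator / np.trace(P(T, L))

s = np.linspace(-1.0, 1.0, 6)                        # hypothetical spin values
T = np.exp(-2.0 * (s[:, None] - s[None, :]) ** 2)    # real symmetric one-step kernel
rng = np.random.default_rng(1)
f = rng.normal(size=6) + 1j * rng.normal(size=6)     # arbitrary complex observable

value = rp_expectation(T, f, t=2, L=8)
print("real part nonnegative:", value.real >= 0)
print("imaginary part negligible:", abs(value.imag) < 1e-12)
```

The factorization into a square uses that both $2t$ and $L - 2t$ are even, which mirrors the requirement in the text that the temporal size be even and the boundary conditions time-symmetric.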
The property (RP) allows one to construct a Hilbert space both on a finite lattice and in the thermodynamic limit, which we denote by $\mathcal{H}_\Lambda$ and $\mathcal{H}_{OS}$, respectively. The definitions $(O', O)_{\Lambda,\beta,bc} := \langle \theta O'\, O \rangle_{\Lambda,\beta,bc}$ and $(O', O)_{OS} := \langle \theta O'\, O \rangle_{\infty,\beta,bc}$ define a positive semidefinite scalar product on $C_+$. By dividing out the subspace of elements with vanishing norm $(O, O)_{\Lambda,\beta,bc} = 0$ and $(O, O)_{OS} = 0$, respectively, and completion we obtain the Hilbert spaces $\mathcal{H}_\Lambda$ and $\mathcal{H}_{OS}$, as described. Importantly $\mathcal{H}_{OS}$ will in general not be separable. This was found explicitly in the 1-dimensional model, and it is unlikely that separability will be restored by adding spatial dimensions. For example the OS reconstruction of the solvable noncompact model of a massless free field in two dimensions likewise leads to a nonseparable state space [47]. In contrast the spaces $\mathcal{H}_\Lambda$ are, for any finite $|\Lambda| = L_s^d L_t$, isometric to $L^2$. Denoting the isometry by $V_\Lambda : \mathcal{H}_\Lambda \to L^2$, the unitary representation $\rho$ on $L^2$ induces one on $\mathcal{H}_\Lambda$, namely $\rho_\Lambda := V_\Lambda^{-1} \rho V_\Lambda$. Our second assumption is that the thermodynamic limit of $\rho_\Lambda(A)$ exists weakly, i.e., the limit $\lim_{|\Lambda| \to \infty} (\rho_\Lambda(A) O, O')_{\Lambda,\beta,bc}$ exists for all $O, O' \in C_+$, and defines a measurable function of $A \in SO(1, N)$. This defines a measurable action $\rho_{OS}$ of SO(1, N) on $\mathcal{H}_{OS}$. Guided by the properties of the 1-dimensional case, we do not expect or require this action to be continuous. Further $\rho_{OS}$ is expected to be unitary only on a closed subspace $\mathcal{H}^u_{OS}$ of $\mathcal{H}_{OS}$. Since $\mathcal{H}_{OS}$ is in general not separable an alternative described by Segal and Kunze applies; see [48] and [18] in the present context. The upshot is that $\mathcal{H}^u_{OS}$ decomposes into a direct sum $\mathcal{H}^u_{OS} = \mathcal{H}^c_{OS} \oplus \mathcal{H}^s_{OS}$, where the restriction of $\rho_{OS}$ to $\mathcal{H}^c_{OS}$ and $\mathcal{H}^s_{OS}$ is continuous and singular, respectively. Here singular means that $(\psi_s, \rho_{OS}(A)\psi_s)_{OS} = 0$ for almost all $A \in SO(1, N)$ and all $\psi_s \in \mathcal{H}^s_{OS}$. If $\mathcal{H}^u_{OS}$ is separable, $\mathcal{H}^s_{OS}$ is absent. In one dimension such a representation $\rho_{OS}$ could be constructed explicitly, and $\mathcal{H}^s_{OS}$ turned out to be nontrivial. The explicit form of $\rho_{OS}$ also entailed that the vector $\Omega$ induced by $1 \in C_+$ is actually an element of a ground state orbit $\{\rho_{OS}(A)\Omega,\ A \in SO(1, N)\}$. The infinite-dimensional closed subspace of $\mathcal{H}_{OS}$ spanned by this orbit was contained in $\mathcal{H}^c_{OS}$, that is, the action was continuous and unitary.
The nonamenability of SO(1, N) now has no direct bearing on the existence of almost invariant vectors for $\rho_{OS}$, since even a nonamenable group can have amenable representations. However, unitary amenable representations $\pi$ (in the sense defined in the remark following Theorem 3) are characterized by the fact that $\pi \otimes \bar{\pi}$ almost has invariant vectors, where $\bar{\pi}$ is the conjugate representation ([42], Theorem 5.1). We expect that this can be used to rule out that $\rho_{OS}$ restricted to $\mathcal{H}^u_{OS}$ is amenable.
Next we address the reconstruction of a transfer operator. Translation by one lattice unit in positive time direction maps $C_+$ into itself. Since by assumption the limiting expectations are translation invariant it can be shown by standard arguments (see [18,44-46]) that this map lifts to a well-defined bounded symmetric operator $\mathbb{T}_{OS}$ on $\mathcal{H}_{OS}$, which is the desired reconstructed transfer operator. By construction, there is a normalizable state $\Omega$ induced by the constant function $1 \in C_+$, which is a proper eigenstate with eigenvalue unity of $\mathbb{T}_{OS}$, in sharp contrast to $\mathbb{T}$, which had no proper eigenstates in $L^2 \cong \mathcal{H}_\Lambda$. We denote by $G(\mathbb{T}_{OS})$ the set and closed linear subspace of all (normalizable) ground states of $\mathbb{T}_{OS}$ in $\mathcal{H}_{OS}$.
Concerning the interplay of $\mathbb{T}_{OS}$ with $\rho_{OS}$ there are two main cases to consider: first, the weak limit defining $\rho_{OS}$ does not exist even when restricted to $G(\mathbb{T}_{OS})$. Second, $\rho_{OS}$ does exist at least when restricted to $G(\mathbb{T}_{OS})$, in which case the action of $\rho_{OS}$ can be unitary or nonunitary. In the 1-dimensional model the second possibility was realized with a unitary action. Indeed $\mathbb{T}_{OS}$ commuted with $\rho_{OS}$ on all of $\mathcal{H}_{OS}$. Further $G(\mathbb{T}_{OS})$ was a subspace of $\mathcal{H}^c_{OS}$ and $\rho_{OS}$ restricted to $G(\mathbb{T}_{OS})$ was equivalent to $\pi_0$, the limit of the principal series. The first possibility should be taken into account based on experience with spontaneous breaking of compact symmetries in dimensions $D \ge 3$. In this case the group no longer acts on the (unique) ground state because of the infinitely many degrees of freedom that would have to be transformed. In the case of a compact symmetry one has the option of averaging the expectation values over the symmetry group, thereby introducing a large (but still separable) Hilbert space as a direct integral over the pure phases. In this Hilbert space there is then degeneracy of the vacuum and the symmetry group acts nontrivially on the vacuum space. The original vacua are recovered by an ergodic decomposition of the symmetric state. Because of the nonamenability of SO(1, N) one does not have this option here. In summary, we can envisage the following scenarios for the interplay between $\rho_{OS}$ and $\mathbb{T}_{OS}$:
(1) OS does not exist even when restricted to G(TOS ), or on the restriction TOS and OS
do not commute.
(2) OS does exist at least when restricted to G(TOS ), and on this subspace TOS and OS
commute. G(TOS ) then decomposes into an orthogonal sum of subspaces, G u (TOS )
on which OS acts unitarily and G nu (TOS ) on which OS acts nonunitarily. Further
OS restricted to G u (TOS ) is expected to be nonamenable and G u (TOS ) decomposes
in to a direct sum G c (TOS ) G s (TOS ), where the restriction of OS is continuous on
G c (TOS ) and singular on G s (TOS ). One or both of these subspaces could be trivial.
(2a) If G c (TOS ) is nontrivial it carries a unitary continuous representation of SO(1, N ),
which one can assume to be irreducible. Based on the results of Section 3 a plausible
candidate is again the limit of the principal series 0 . If G s (TOS ) is nontrivial the
group acts on it discontinuously as a permutation group. Such an exotic situation
was found in the 1D case for a certain non-vacuum subspace of HOS .
At present we do not have enough information to determine which of the above scenarios
holds. All however represent refinements of the fact that the symmetry is spontaneously
broken.
4. D = 2: Numerical simulations
Although well suited to address structural issues, the Hamiltonian formalism used in
Sections 2 and 3 is not ideal to obtain quantitative results. While in the compact models
this is still feasible [49,50] the intricate group theory required in the noncompact case
seems to render such an approach unattractive for models with a noncompact symmetry. In
Sections 4 and 5 we therefore study the dynamics in terms of correlation functions, first
by numerical simulation and then via the large N expansion.
We have performed simulations of the SO(1, 2) sigma-model on square lattices of linear dimension $L = L_s = L_t$ ranging from $L = 20$ to $L = 128$. The simulations were performed at different values of the coupling $\beta$, and with the two choices for the gauge-fixing described in Section 2.2. We now briefly describe the Monte Carlo algorithms employed in the simulation of the averages $\langle O\rangle_{\Lambda,\beta,i}$, $i = 1, 2$:
(1) For the average $\langle O\rangle_{\Lambda,\beta,1}$ (translationally invariant gauge and periodic bc) a Monte Carlo sweep through an $L \times L$ lattice is defined as $L^2$ Metropolis updates of randomly chosen pairs of spins $n_{x_1}$, $n_{x_2}$ according to
$$\vec n_{x_1} \to \vec n_{x_1} + \vec r, \qquad \vec n_{x_2} \to \vec n_{x_2} - \vec r, \tag{4.1}$$
where the two-dimensional vector $\vec r = r(\cos\phi, \sin\phi)$ has $r$, $\phi$ chosen randomly in the ranges $(0, r_{max})$, $(0, 2\pi)$, respectively, with $r_{max}$ adjusted to yield acceptance rates close to 50%. The symmetric update of pairs of spins ensures that the gauge constraint $\sum_x \vec n_x = 0$ is preserved. The initial configuration (we typically take cold starts, with $\vec n_x = 0$) is of course chosen also to satisfy this constraint. The proposed update (4.1) is then accepted or rejected on the basis of the change in the effective action
$$S_1 = \beta \sum_{x,\mu} n_x \cdot n_{x+\hat\mu} - 2 \ln \sum_x n^0_x + \sum_x \ln n^0_x. \tag{4.2}$$
After an initial run of 100000 sweeps to equilibrate the system, configurations are stored subsequently at intervals of 2000 sweeps, which is more than adequate to decorrelate the configurations (for the observables measured, typical autocorrelation times are at most a few hundred sweeps). The results presented in this paper derive from averages over ensembles of 5000 independent configurations, unless otherwise stated.
(2) For the averages $\langle O\rangle_{\Lambda,\beta,2}$, a sweep is defined as $L^2$ updates of a randomly chosen single spin $n_{x_1}$, not including the fixed spin at $x_0$,
$$\vec n_{x_1} \to \vec n_{x_1} + \vec r, \qquad x_1 \neq x_0, \tag{4.3}$$
with the random vector $\vec r$ chosen as above, and the effective action in this case
$$S_2 = \beta \sum_{x,\mu} n_x \cdot n_{x+\hat\mu} + \sum_x \ln n^0_x. \tag{4.4}$$
Again, we typically followed an initial equilibration with 100000 sweeps by 5000 measurements spaced at 2000 sweep intervals.

Fig. 1. Typical configuration of the heights $n^0_x$, $x \in \Lambda$, at strong coupling $\beta = 0.1$ for $L = 64$. Blue, green, orange correspond to low, medium, high values of $n^0_x$, respectively. The mean value is $\langle n^0\rangle = 5.11$. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
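The pair update (4.1) with the Metropolis test on the action (4.2) can be sketched as follows. This is a minimal illustration, not the authors' production code: the function names, the small lattice size, and the choice to recompute the full action at every proposal (simple but slow) are assumptions made here, and the sign convention of the $\beta$ term follows the reading $e^{-S_1}$ with $n\cdot n = (n^0)^2 - \vec n^{\,2}$.

```python
import numpy as np

def time_component(nvec):
    """n^0 = sqrt(1 + |vec n|^2) on the hyperboloid n.n = 1, n^0 > 0."""
    return np.sqrt(1.0 + np.sum(nvec**2, axis=-1))

def action_s1(nvec, beta):
    """Effective action of Eq. (4.2); recomputed in full for clarity, not speed."""
    n0 = time_component(nvec)
    bond_sum = 0.0
    for axis in (0, 1):  # forward neighbours in the two lattice directions
        n0_nb = np.roll(n0, -1, axis=axis)
        nvec_nb = np.roll(nvec, -1, axis=axis)
        bond_sum += np.sum(n0 * n0_nb - np.sum(nvec * nvec_nb, axis=-1))
    return beta * bond_sum - 2.0 * np.log(n0.sum()) + np.log(n0).sum()

def pair_sweep(nvec, beta, r_max, rng):
    """One sweep = L^2 proposed pair updates (4.1). Returns (config, n_accepted)."""
    L = nvec.shape[0]
    s_old = action_s1(nvec, beta)
    accepted = 0
    for _ in range(L * L):
        x1 = tuple(rng.integers(0, L, size=2))
        x2 = tuple(rng.integers(0, L, size=2))
        if x1 == x2:
            continue  # a pair needs two distinct sites
        r = rng.uniform(0.0, r_max)
        phi = rng.uniform(0.0, 2.0 * np.pi)
        shift = r * np.array([np.cos(phi), np.sin(phi)])
        trial = nvec.copy()
        trial[x1] += shift   # vec n_{x1} -> vec n_{x1} + r
        trial[x2] -= shift   # vec n_{x2} -> vec n_{x2} - r
        s_new = action_s1(trial, beta)
        if rng.random() < np.exp(min(0.0, s_old - s_new)):
            nvec, s_old = trial, s_new
            accepted += 1
    return nvec, accepted

rng = np.random.default_rng(0)
config = np.zeros((8, 8, 2))  # cold start: vec n_x = 0 satisfies the constraint
for _ in range(5):
    config, n_acc = pair_sweep(config, beta=10.0, r_max=0.3, rng=rng)
# The gauge constraint sum_x vec n_x = 0 should hold up to rounding.
print(np.abs(config.sum(axis=(0, 1))).max())
```

Because each accepted move adds $\vec r$ at one site and subtracts it at another, the sum $\sum_x \vec n_x$ never leaves zero; in practice one would tune `r_max` until roughly half of the proposals are accepted.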
We have found that long simulations of this model lead to unreliable results unless a high quality random number generator is employed. Specifically, violations of translation invariance of two-point functions at the 3 to 4 standard deviation level were found when the random() function packaged with GNU gcc (specifically, gcc-2.96) was used. The ranlux generator developed by Lüscher [51], in double precision and with the luxury level set to 2, was used in all simulations reported in this paper. With this generator, we have found no violations outside of statistics in expected symmetries of measured observables.
It is instructive to look at some typical configurations in the parametrization $n_x = (\xi_x, \sqrt{\xi_x^2 - 1}\,\vec s_x)$, $x \in \Lambda$. As discussed in Section 2.2 with the translation invariant gauge fixing one expects the SO(2) subgroup to be unbroken. The compact spins $\vec s_x$ will then be distributed similarly to those in the massless phase of the familiar O(2) model. The novel feature is the noncompact components $\xi_x = n^0_x$, for which we show some typical configurations at strong and weak coupling in Figs. 1, 2, respectively. One sees that at strong coupling the mean value $\langle n^0\rangle_{\Lambda,\beta,1}$ is large, with relatively large localized fluctuations rendering nearby spins almost uncorrelated. For weak coupling on the other hand most of the spins are frozen close to the bottom of the hyperboloid, $\langle n^0\rangle_{\Lambda,\beta,1} \approx 1$, and nearby spins are correlated, both in height $n^0$ and in direction $\vec n$.
Fig. 2. Typical configuration of the heights $n^0_x$, $x \in \Lambda$, at weak coupling $\beta = 10$ for $L = 64$. Blue, green, orange correspond to low, medium, high values of $n^0_x$, respectively. The mean value is $\langle n^0\rangle = 1.067$. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
4.1. Spin two-point function and energy correlator

The spin two-point function $\langle n_x \cdot n_y\rangle_{\Lambda,\beta,i}$ is the simplest SO(1, 2) invariant bilocal object constructible in the model. The thermodynamic limit of this quantity can be studied numerically by simulating lattices of various sizes at fixed $\beta$. The results at $\beta = 10$ for square lattices of linear size $L = \sqrt{|\Lambda|} = 32, 64, 128$ and $i = 1$ (periodic boundary conditions, translationally invariant gauge) are shown in Fig. 3. They suggest the existence of a finite thermodynamic limit, consistent with the analytical arguments in Section 5. They also illustrate that the spin two-point function increases with increasing separation $|x - y|$. This somewhat peculiar behavior has been observed before [2] and can be understood analytically both in the 1D model [18] and in a large N analysis, see Section 5.
Another natural invariant observable is the energy or action density $E_x = 2(1 - n_x \cdot n_{x+\hat\mu})$. Its expectation value appears in the invariant combination of the Ward identities (2.22). Here we study the connected part of its two-point function and probe for nontriviality and clustering. The subtractions involved in extracting the connected part involve large cancellations, and we have had to perform very long runs (collecting ensembles of 40000 configurations) on somewhat smaller ($20 \times 20$) lattices to find a meaningful signal. The connected part was also very small in the weak coupling regime, so we needed to go to strong coupling; the results of Fig. 4 correspond to $\beta = 0.1$. At least for separations $r \leq 2$ lattice spacings there is a nonvanishing signal. The fact that the signal disappears so rapidly makes it impossible to draw any firm conclusions from the numerical data on the nature of the asymptotic falloff: for example, to distinguish between the $r^{-4}$ power behavior suggested by naive dimensional reasoning and an exponential falloff. In summary, we find a nontrivial energy correlator rapidly decreasing at nonzero separations.
Fig. 3. Two-point function $\langle n_x \cdot n_y\rangle_{\Lambda,\beta,1}$, for $\beta = 10$ and varying $L$.
Fig. 4. Connected correlator $\langle E_x E_y\rangle - \langle E_x\rangle\langle E_y\rangle$, $20 \times 20$ lattice, $\beta = 0.1$.
4.2. Two-point function of the Noether current
The Ward identity (2.27) for the longitudinal momentum space current correlators provides a stringent test that the simulation scheme fully respects the symmetries of the model. In Fig. 5 we show the comparison of the left- and right-hand sides of (2.27) on lattices of size $32 \times 32$ and $64 \times 64$ (periodic boundary conditions, translationally invariant gauge), for the boost ($(ab) = (01)$) and rotation ($(ab) = (12)$) Noether currents. The agreement is within statistical errors, except for the lowest momentum modes. In fact, we show in Appendix B that at fixed nonzero momentum the delta function gauge constraint induces a finite volume correction of order $O(\ln V/V)$ to (2.27). These finite volume corrections are largest at the edge of the Brillouin zone, i.e., for the momentum modes of order $p \sim 1/L$ (see Appendix B, Fig. 12). The transverse Noether correlators are nontrivial, and are shown in coordinate space in Fig. 6. The falloff is roughly $1/r^2$ as expected on dimensional grounds (fits also indicate a logarithmic component $\ln(r)/r^2$).
Fig. 5. Ward identity for longitudinal Noether currents: $L = 64$, $\beta = 0.1$ (top) and $\beta = 10$ (bottom).
4.3. Tanh observable: spontaneous symmetry breaking
Finally, we have used our generated ensembles to measure the order parameter $T(q)$ introduced in Section 2.4 as a signal of spontaneous symmetry breaking of the SO(2, 1) group. For $N = 2$ the average $T_q(\xi)$ in Eq. (2.25) is a strictly decreasing positive function for $\xi \in [1, \infty)$, with the limiting values $T_q(1) = \tanh(\sqrt{q^{-2} - 1})$ and $T_q(\infty)$ as below. As described in Section 2.3 by a convexity argument one expects
$$\langle T_e(n)\rangle_{\Lambda,\beta,1} \geq T_q\big(\langle n^0\rangle_{\Lambda,\beta,1}\big) \geq T_q(\infty) = 1 - \frac{2}{\pi}\arccos\sqrt{1 - q^2}, \tag{4.5}$$
where the limit is taken from [18]. In Figs. 7, 8, the results for the three functions in (4.5) are shown for weak and strong couplings, respectively. One sees that for weak coupling the spins are almost frozen and $\langle T_e(n)\rangle_{\Lambda,\beta,1}$ practically coincides with the average $T_q(\langle n^0\rangle_{\Lambda,\beta,1})$. For strong coupling, on the other hand, genuine dynamics sets in and the two quantities differ. Importantly, in either situation the symmetry breaking is manifest in that $T(q) := \langle T_e(n)\rangle_{\Lambda,\beta,1}$ is a nontrivial function of $q$. $T_q(1)$ vanishes on account of the unbroken SO(2) symmetry; the lower bound $T_q(\infty)$ guarantees that the curve cannot flatten out and vanish identically in the thermodynamic limit.
Fig. 6. Transverse Noether current correlator, $\beta = 10$, $L = 32, 64$.
Fig. 7. Order parameter $T(q)$ for weak coupling.
Fig. 8. Order parameter $T(q)$ for strong coupling.
As discussed in Section 2.4 the divergence of $\langle n^0\rangle_{\Lambda,\beta,1}$ as $|\Lambda| \to \infty$ is not really a test of spontaneous symmetry breaking. Moreover, because of the expected soft (logarithmic) divergence a very large range of lattice sizes would be needed in order to pin down a suspected divergence of $\langle n^0\rangle_{\Lambda,\beta,1}$. For example at $\beta = 10$ we obtained $\langle n^0\rangle_{\Lambda,\beta,1} = 1.0585, 1.0695, 1.080$, for $L = \sqrt{|\Lambda|} = 32, 64, 128$, respectively. In itself this would hardly constitute convincing evidence for a divergence in the $L \to \infty$ limit.
5. D = 2: Large N analysis
The noncompact SO(1, N) sigma-models may be analyzed in the usual large N limit, i.e., $N \to \infty$, $\lambda := N/\beta = g^2 N$ fixed, by saddle-point techniques analogous to those used in the compact case. However, several important differences arise which alter qualitatively the results in the noncompact case. The large N analysis is especially useful for examining qualitative features like the behavior of correlation functions in the thermodynamic limit, thereby providing a guideline for the correct extrapolation of numerical results to the continuum limit. We adopt the setting of Section 2.1 and perform a large N analysis of the lattice-regularized model with periodic boundary conditions, using the translationally invariant gauge-fixing 1 in Section 5.1 and the fixed-spin gauge 2 in Section 5.2. We consider square lattices only and write $\langle\;\rangle_{L,N/\lambda,i}$ for $\langle\;\rangle_{\Lambda,\beta,i}$, $i = 1, 2$, with $L = \sqrt{|\Lambda|}$ the linear size of the lattice.
5.1. Large N analysis in a translationally invariant gauge

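By (2.10) the partition function $Z_1 = Z_1(\Lambda, \lambda = N/\beta)$ has the form
$$Z_1 = \int \prod_x dn_x\, \delta(n_x^2 - 1)\, \delta\Big(\sum_x \vec n_x\Big) \exp\Big\{ N \Big[ \frac{1}{2\lambda}\sum_{x,\mu} (n^0_{x+\hat\mu} - n^0_x)^2 + \ln \sum_x n^0_x - \frac{1}{2\lambda}\sum_{x,\mu} (\vec n_{x+\hat\mu} - \vec n_x)^2 \Big] \Big\}. \tag{5.1}$$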
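Implementing the nonlinear constraint as usual with an auxiliary field $\alpha_x$, (5.1) becomes
$$Z_1 = \int \prod_x dn^0_x\, d\alpha_x \exp\Big\{ N \Big[ \frac{1}{2\lambda}\sum_{x,\mu} (n^0_{x+\hat\mu} - n^0_x)^2 + \ln \sum_x n^0_x - i \sum_x \alpha_x \big(1 - (n^0_x)^2\big) \Big] \Big\} \times \int \prod_x d\vec n_x\, \delta\Big(\sum_x \vec n_x\Big) \exp\Big\{ -N \Big[ \frac{1}{2\lambda}\sum_{x,y} \vec n_x \cdot (-\Delta)_{xy}\, \vec n_y + i \sum_x \alpha_x \vec n_x^{\,2} \Big] \Big\} \tag{5.2}$$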
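with $\Delta_{xy}$ the discrete lattice Laplacian. On integrating out the $\vec n$-field one finds, up to irrelevant multiplicative factors
$$Z_1 \propto \int \prod_x dn^0_x\, d\alpha_x \exp\big\{ -N S_1[n^0, \alpha] \big\} \tag{5.3}$$
with the effective large N action
$$S_1 = -\frac{1}{2\lambda}\sum_{x,y} n^0_x (-\Delta)_{xy}\, n^0_y - \ln \sum_x n^0_x - i \sum_x \alpha_x \big((n^0_x)^2 - 1\big) + \frac{1}{2}\,{\rm Tr}' \ln[-\Delta + 2i\lambda\alpha]. \tag{5.4}$$
The prime on the trace in (5.4) denotes omission of the zero mode of the Laplacian, in keeping with the $\delta(\sum_x \vec n_x)$ constraint in (4.1).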
Note that in contradistinction to the compact case, the $n^0$ field is not integrated out in defining a large N effective action, and we are led to the problem of determining a joint saddle-point in $(n^0, \alpha)$ field space. We now show that there always exists at least one translationally invariant joint saddle-point of (5.3). Let
$$n^0_x = \bar n + i\phi_x, \qquad \alpha_x = -i\bar\alpha + \omega_x, \tag{5.5}$$
where $\bar n$, $\bar\alpha$ are real, but the phase of the fluctuation variables $\phi_x$, $\omega_x$ (despite the suggestive notation) remains to be determined later by a detailed analysis of the local structure of the saddle-point. The auxiliary field integrations in (5.3) run initially along the real $\alpha_x$ axes but (as is frequently the case in Hubbard-Stratonovich type saddle-points [52,53]) require a deformation through a purely imaginary saddle-point. In contrast, the remnant field integrations over $n^0_x$ run along the semiaxis $[1, \infty)$, and we shall find real saddle-point(s) with $\bar n > 1$ and $\bar\alpha$ real.
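The saddle-point conditions $\partial S_1/\partial n^0_x = \partial S_1/\partial \alpha_x = 0$ yield
$$\bar n^2 = 1 + \lambda\,\big[(-\Delta + 2\lambda\bar\alpha)^{-1}\big]'_{xx}, \qquad \bar\alpha = -\frac{1}{2V\bar n^2}, \tag{5.6}$$
where $V := |\Lambda| = L^2$ is the lattice volume. Note that $\bar\alpha < 0$, corresponding to a negative dynamically generated squared mass in the gap equation (5.6). Explicitly this becomes
$$f(z) = 1 - \frac{V z}{\lambda} \quad \text{with} \quad f(z) := z \sum_{p \neq 0} \frac{1}{2\sum_\mu (1 - \cos p_\mu) - z}, \qquad z := \frac{\lambda}{V \bar n^2}. \tag{5.7}$$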
The discrete lattice momenta are $p_\mu = \frac{2\pi}{L} m_\mu$, $m_\mu = 0, 1, \ldots, L - 1$. Due to the infrared divergence of the sum in (5.7) the expectation value of the $n^0$ field diverges logarithmically for $V \to \infty$, specifically as $\langle (n^0)^2\rangle_{L,N/\lambda,1} \sim \frac{\lambda}{4\pi}\log V$. Thus the dynamically generated negative squared mass in the gap equation is actually of order $\frac{1}{V \log V}$ in the thermodynamic limit. Solutions of (5.7) with $f(z) > 0$ correspond to $\bar n > 1$. For any $V$, there is always a root with $z < 4\sin(\pi/L)^2$ and $f(z) > 0$. In the weak coupling regime defined by the inequality $\lambda < 4L^2 \sin(\pi/L)^2$ ($\approx 40$ for large $L$) it is easy to see that this is the only root yielding $\bar n > 1$. Henceforth we shall assume weak coupling, in the sense of the above stated inequality, and dominance of the single saddle-point with $\bar n > 1$.
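The root of the gap equation discussed above can be located numerically with a few lines of code. The sketch below assumes the form $f(z) = 1 - Vz/\lambda$ with $f(z) = z\sum_{p\neq 0}[2\sum_\mu(1-\cos p_\mu) - z]^{-1}$ and $z = \lambda/(V\bar n^2)$; the function names and the bisection scheme are choices made here for illustration, not taken from the paper.

```python
import numpy as np

def lattice_p2(L):
    """hat p^2 = 2 * sum_mu (1 - cos p_mu) on an L x L lattice, p_mu = 2 pi m_mu / L."""
    c = 2.0 * (1.0 - np.cos(2.0 * np.pi * np.arange(L) / L))
    return c[:, None] + c[None, :]

def solve_gap(L, lam, iterations=200):
    """Bisection for the root of f(z) = 1 - V z / lam with 0 < z < min(lam/V, lowest pole)."""
    V = L * L
    p2 = lattice_p2(L).ravel()[1:]        # drop the zero mode p = 0

    def h(z):
        # h(z) = f(z) - (1 - V z / lam); increasing in z on the search interval
        return z * np.sum(1.0 / (p2 - z)) - (1.0 - V * z / lam)

    lo, hi = 0.0, min(lam / V, p2.min())  # h(lo) = -1 < 0, h > 0 near hi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if h(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    z = 0.5 * (lo + hi)
    n_bar = np.sqrt(lam / (V * z))
    return z, n_bar

z, n_bar = solve_gap(L=20, lam=20.0)
print(z, n_bar)
```

For $L = \lambda = 20$ the printed $\bar n$ can be compared directly with the saddle-point entry of Table 1; agreement there is a useful consistency check on the assumed form of the gap equation.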
We have performed explicit numerical simulations of the SO(1, N) model at values of N ranging from $N = 20$ up to $N = 640$ on a $20 \times 20$ lattice to check the saddle-point result (5.7). The coupling was chosen in the weak coupling regime in the sense indicated above, specifically $\lambda = 20$. The convergence to the large N limit is shown in Table 1. A fit of the data to a functional dependence of the form $A + B/N + C/N^2$ gives $A = 3.402$. The numerical evidence suggests that the large N functional integral is indeed dominated by the saddle-point located in (5.6).

Table 1
Comparison of $\langle n^0\rangle_{L,N/\lambda,1}$ for $L = \lambda = 20$ from simulations with the large N result

N                    $\langle n^0\rangle$
20                   3.980 ± 0.007
40                   3.808 ± 0.004
80                   3.651 ± 0.002
160                  3.533 ± 0.002
320                  3.458 ± 0.001
640                  3.437 ± 0.0005
∞ (saddle-point)     3.4136

One may also study field correlators in the large N limit (assuming again the dominance of the saddle-point exhibited above). For example, the SO(1, N) invariant two-point function to leading order in large N takes the simple form
$$\langle n_x \cdot n_y\rangle_{L,N/\lambda,1} \approx \bar n^2 - \lambda D_{xy} \tag{5.8}$$
with $D := [-\Delta + 2\lambda\bar\alpha]^{-1}$ (zero mode omitted). Of course $\langle n_x \cdot n_x\rangle_{L,N/\lambda,1} = 1$ as a consequence of the gap equation (5.7). The first term in (5.8) arises from the $n^0$ correlator, while the second term represents the cumulative effect of N spatial $\vec n$ correlators, each of order $1/N$, with a negative sign from the indefinite metric. As the $\vec n$ correlators fall off with distance, the SO(1, N) correlator evidently is an increasing function of separation. In Fig. 9 we compare the saddle-point result (5.8) for this correlator with simulation results obtained at finite N on a $20 \times 20$ lattice.
Fig. 9. Large N convergence of invariant 2-point function, $L = \lambda = 20$.
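The leading-order correlator can be tabulated from the saddle point with a Fourier transform. The sketch below assumes the reading $\langle n_x\cdot n_y\rangle \approx \bar n^2 - \lambda D_{xy}$ with $D(p) = 1/(\hat p^2 - z)$ for $p \neq 0$ and $z = \lambda/(V\bar n^2)$; helper names are hypothetical.

```python
import numpy as np

def saddle_point(L, lam):
    """Bisection solution of the gap equation; returns (z, nbar^2, hat p^2 grid)."""
    c = 2.0 * (1.0 - np.cos(2.0 * np.pi * np.arange(L) / L))
    p2 = c[:, None] + c[None, :]
    nonzero = p2.ravel()[1:]              # all modes except p = 0
    V = L * L
    lo, hi = 0.0, min(lam / V, nonzero.min())
    for _ in range(200):
        z = 0.5 * (lo + hi)
        if z * np.sum(1.0 / (nonzero - z)) < 1.0 - V * z / lam:
            lo = z
        else:
            hi = z
    z = 0.5 * (lo + hi)
    return z, lam / (V * z), p2

def invariant_correlator(L, lam):
    """G[x1, x2] = nbar^2 - lam * D_{x0}, with the zero mode left out of D."""
    z, nbar2, p2 = saddle_point(L, lam)
    inv_m = np.zeros_like(p2)
    mask = p2 > 0.0
    inv_m[mask] = 1.0 / (p2[mask] - z)
    d_x = np.fft.ifft2(inv_m).real        # (1/V) sum_p e^{ipx} / M(p)
    return nbar2 - lam * d_x

G = invariant_correlator(20, 20.0)
print(G[0, 0], G[1, 0], G[5, 0], G[10, 0])
```

Since every $M(p)$ with $p \neq 0$ is positive at this saddle point, $D_{x0} < D_{00}$ for $x \neq 0$, so the tabulated correlator equals 1 at zero separation and is larger everywhere else.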
Although we shall not compute $1/N$ corrections to the leading large N results here, it is of interest to study the local structure of the saddle-point in this theory, which is a prerequisite to evaluating the Gaussian fluctuation corrections. Indeed, the routing of the field integrations through a joint saddle-point of the type studied here is somewhat more intricate than usual: we shall find that relative to the implied contour deformations in (5.5), further contour rotations are required in nonlocally defined field components. Expanding the large N action $S(\bar n + i\phi_x, -i\bar\alpha + \omega_x)$ to second order in the fluctuation fields $\phi_x$, $\omega_x$, one finds the quadratic form
$$S^{(2)} = \frac{1}{2\lambda}\sum_{x,y} \phi_x M_{xy} \phi_y - \frac{1}{2\bar n^2 V^2}\Big(\sum_x \phi_x\Big)^2 + 2\bar n \sum_x \phi_x \omega_x + \lambda^2 \sum_{x,y} M^{-1}_{xy}\,\omega_y\, M^{-1}_{yx}\,\omega_x \tag{5.9}$$
with $M := -\Delta + 2\lambda\bar\alpha$.
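The appropriate routing of the field integrations through the saddle (5.5) is best analyzed by going over to momentum space: we replace $\prod_x d\phi_x\, d\omega_x$ by $\prod_p d\phi(p)\, d\omega(p)$ where $\phi_x := \frac{1}{V}\sum_p \phi(p) e^{ipx}$, etc. The quadratic form (5.9) now becomes
$$S^{(2)} = \frac{1}{V}\sum_p \Big[ \frac{1}{2\lambda} M(p)\,|\phi(p)|^2 + 2\bar n\, \phi(p)\,\omega(-p) \Big] - \frac{1}{2\bar n^2 V^2}\,\phi(0)^2 + \frac{\lambda^2}{V^2}\sum_{p \neq 0,\, q \neq 0} \frac{|\omega(p - q)|^2}{M(p) M(q)} \tag{5.10}$$
with $M(p) := 4\sum_\mu \sin^2(p_\mu/2) + 2\lambda\bar\alpha$.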
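We shall henceforth neglect the term involving $\phi(0)^2$ in (5.10), as it is of order $1/V$ relative to the rest. Defining the one-loop polarization function
$$\Pi(p) := \frac{1}{V}\sum_{q \neq 0,\, r \neq 0} \delta_{p,\,q-r}\, \frac{1}{M(q) M(r)}, \tag{5.11}$$
the quadratic action can be written as
$$S^{(2)} = \frac{1}{V}\sum_p \Big[ \frac{1}{2\lambda} M(p)\,|\tilde\phi(p)|^2 - \Big( \frac{2\lambda}{M(p)}\,\bar n^2 - \lambda^2 \Pi(p) \Big) |\omega(p)|^2 \Big]. \tag{5.12}$$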
In (5.12) we change field variables from $(\phi(p), \omega(p))$ to $(\tilde\phi(p) := \phi(p) + \frac{2\lambda\bar n}{M(p)}\,\omega(p),\ \omega(p))$. This functional change of variable has unit Jacobian but is of course highly nonlocal in coordinate space. The quadratic action can now be expressed in terms of the variables $\tilde\phi_R(p) := (\tilde\phi(p) + \tilde\phi(-p))/2$, $\tilde\phi_I(p) := (\tilde\phi(p) - \tilde\phi(-p))/2i$, and similarly for $\omega$. The integrals over these variables run initially along the real axis, but must be rotated in passing through the saddle-point depending on the sign of $M(p)$ and $\frac{2\lambda}{M(p)}\bar n^2 - \lambda^2\Pi(p)$ in (5.12). For $p = 0$ one has $M(0) < 0$, $\Pi(0) > 0$. Accordingly the zero mode of the $\phi$ field must be rotated by $\pi/2$ in passing through the saddle-point, while the zero mode of the $\omega$ field retains the initial contour orientation. The situation is reversed for nonzero momenta: namely, we find
$$M(p) > 0, \qquad \frac{2\lambda}{M(p)}\,\bar n^2 > \lambda^2 \Pi(p), \qquad \text{for } p \neq 0, \tag{5.13}$$
which implies that for nonzero momentum modes the $\omega$ contours should be rotated by $\pi/2$, while the $\phi$ routing is unchanged. Calculations to next to leading order in $1/N$ necessarily require careful attention to the phases induced by these contour rotations. We again emphasize that the above statements hold for the case of a single dominant saddle point, when $\lambda < 4L^2 \sin(\pi/L)^2$.
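The sign pattern that dictates the contour rotations can be inspected numerically. The following sketch assumes the forms $M(p) = \hat p^2 - z$ (with $2\lambda\bar\alpha = -z$) and $\Pi(p) = \frac{1}{V}\sum_{q,r\neq 0}\delta_{p,q-r}/(M(q)M(r))$, evaluates the latter as a circular autocorrelation via FFT, and uses the illustrative parameters $L = \lambda = 20$; the variable names are hypothetical.

```python
import numpy as np

L, lam = 20, 20.0
V = L * L
c = 2.0 * (1.0 - np.cos(2.0 * np.pi * np.arange(L) / L))
p2 = c[:, None] + c[None, :]
nonzero = p2 > 0.0

# Root of the gap equation f(z) = 1 - V z / lam by bisection.
lo, hi = 0.0, min(lam / V, p2[nonzero].min())
for _ in range(200):
    z = 0.5 * (lo + hi)
    if z * np.sum(1.0 / (p2[nonzero] - z)) < 1.0 - V * z / lam:
        lo = z
    else:
        hi = z
nbar2 = lam / (V * z)

M = p2 - z                                # M(0) = -z is the only negative entry
inv_m = np.zeros_like(M)
inv_m[nonzero] = 1.0 / M[nonzero]         # zero mode excluded from the loop sum
# Pi(p) = (1/V) sum_q inv_m(q) inv_m(q - p), computed as a circular autocorrelation.
Pi = np.fft.ifft2(np.abs(np.fft.fft2(inv_m)) ** 2).real / V

# Minus this quantity multiplies |omega(p)|^2 in the quadratic action.
omega_term = 2.0 * lam * nbar2 / M - lam**2 * Pi

print("M(0) < 0:", M[0, 0] < 0.0, " Pi(0) > 0:", Pi[0, 0] > 0.0)
print("M(p) > 0 for all p != 0:", bool(np.all(M[nonzero] > 0.0)))
print("inequality (5.13) at all p != 0:", bool(np.all(omega_term[nonzero] > 0.0)))
```

The last printed line reports whether the second inequality holds at every nonzero momentum for these parameters; it is left as a diagnostic rather than asserted, since it depends on the assumed normalizations.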
5.2. Large N analysis in a fixed-spin gauge
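By (2.11) the partition function $Z_2 = Z_2(\Lambda, \lambda = N/\beta)$ reads
$$Z_2 = \int \prod_x dn_x\, \delta(n_x^2 - 1)\, \delta(n_{x_0}, n^{\uparrow}) \exp\Big\{ N \Big[ \frac{1}{2\lambda}\sum_{x,\mu} (n^0_{x+\hat\mu} - n^0_x)^2 - \frac{1}{2\lambda}\sum_{x,\mu} (\vec n_{x+\hat\mu} - \vec n_x)^2 \Big] \Big\}. \tag{5.14}$$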
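Introducing the auxiliary field $\alpha_x$ as usual, the partition function becomes in this case
$$Z_2 = \int \prod_{x \neq x_0} dn^0_x\, d\alpha_x \exp\big\{ -N S_2[n^0, \alpha] \big\} \tag{5.15}$$
with the effective large N action
$$S_2 = -\frac{1}{2\lambda}\sum_{x,y} n^0_x (-\Delta)_{xy}\, n^0_y - i \sum_x \alpha_x \big((n^0_x)^2 - 1\big) + \frac{1}{2}\,{\rm Tr}' \ln[-\Delta + 2i\lambda\alpha], \tag{5.16}$$
where in (5.16) the prime on the trace now implies a projection corresponding to omission of the integral over the field variable $\vec n_{x_0}$. The presence of a fixed spin at a definite point on the lattice now means that the saddle-point will involve $(n^0, \alpha)$ fields with a nontrivial spatial dependence. Writing in analogy to (5.5)
$$n^0_x = \bar n_x + i\phi_x, \qquad \alpha_x = -i\bar\alpha_x + \omega_x, \tag{5.17}$$
the saddle-point conditions $\partial S_2/\partial n^0_x = \partial S_2/\partial \alpha_x = 0$ now lead to
$$\bar n_x^2 = 1 + \lambda \tilde D_{xx}, \qquad \bar\alpha_x = -\frac{1}{2\lambda}\,\frac{1}{\bar n_x}\,(-\Delta \bar n)_x \tag{5.18}$$
with the projected propagator $\tilde D$ given by
$$\tilde D_{xy} = D_{xy} - \frac{D_{x_0 x} D_{x_0 y}}{D_{x_0 x_0}}, \qquad D_{xy} = [-\Delta + 2\lambda\bar\alpha]^{-1}_{xy}. \tag{5.19}$$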
The explicit dependence on a special point $x_0$ in Eqs. (5.18), (5.19) results in a nontrivial spatial dependence for the solution fields $\bar n_x$, $\bar\alpha_x$. For $x$ far from the fixed spin at $x_0$, we expect the field $\bar\alpha_x$ to approach the constant value $\bar\alpha = -\frac{1}{2V\bar n^2}$ corresponding to the negative dynamically generated squared-mass in the translationally invariant gauge. As $\bar n_{x_0}$ is pinned at 1, and $\bar n_x > 1$ in general, $\Delta\bar n_{x_0} > 0$ and (from (5.18)) $\bar\alpha_{x_0} > 0$. So the saddle-point solution in this case involves a spatially dependent dynamical mass, and translational invariance is obviously lost in the propagators $D$, $\tilde D$ above. Nevertheless, if we choose periodic boundary conditions at the edge of the lattice, the fixing of a single spin still only amounts to a SO(1, N) gauge fixing, and SO(1, N) invariant two-point correlators such as
$$\langle n_x \cdot n_y\rangle_{L,N/\lambda,2} \approx \bar n_x \bar n_y - \lambda \tilde D_{xy}, \tag{5.20}$$
must still be translationally invariant, and indeed equal to those found in the translationally invariant gauge, namely (5.8). Eqs. (5.18), (5.19) cannot be solved analytically, but they are numerically solvable on a given lattice by iteration: one takes a reasonable approximate starting ansatz for $\bar n_x$, $\bar\alpha_x$ and then solves (5.18) for $\bar n$ and $\bar\alpha$ in alternation until convergence is reached. Single precision convergence (6 to 7 digits) is typically reached with fewer than 100 iterations.
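The projection in (5.19) has a simple linear-algebra meaning: subtracting the rank-one term removes the fixed site $x_0$, so the projected propagator coincides with the inverse of the lattice operator restricted to the remaining sites. The sketch below checks this on a small periodic lattice. It is only an illustration of the projection, not of the full iteration: a constant positive mass term (`mass2`, a hypothetical stand-in) replaces the spatially varying $2\lambda\bar\alpha_x$ so that the operator is safely invertible.

```python
import numpy as np

def lattice_operator(L, mass2):
    """Dense matrix of (-Laplacian + mass2) on an L x L periodic lattice."""
    V = L * L
    A = np.zeros((V, V))
    for x in range(L):
        for y in range(L):
            i = x * L + y
            A[i, i] += 4.0 + mass2
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                j = ((x + dx) % L) * L + (y + dy) % L
                A[i, j] -= 1.0
    return A

L, x0 = 6, 0
A = lattice_operator(L, mass2=0.1)     # assumed toy mass term, not the paper's value
D = np.linalg.inv(A)

# Projected propagator: D~_xy = D_xy - D_{x0 x} D_{x0 y} / D_{x0 x0}
D_proj = D - np.outer(D[:, x0], D[x0, :]) / D[x0, x0]

# Inverse of the operator with row and column x0 deleted.
keep = np.arange(L * L) != x0
D_reduced = np.linalg.inv(A[np.ix_(keep, keep)])

print(np.abs(D_proj[x0]).max())
print(np.abs(D_proj[np.ix_(keep, keep)] - D_reduced).max())
```

Both printed numbers should be at rounding level: the projected propagator vanishes whenever one argument sits at $x_0$, and elsewhere it agrees with the propagator of the lattice with the pinned site removed.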
A cross-section of the $\bar n$ field on a $20 \times 20$ lattice (with the fixed spin at the center point (10, 10)) is displayed in Fig. 10. The figure exhibits the qualitative features discussed above. The $\bar\alpha$ field solution in this case (which corresponds to the parameters of Table 1) has a positive spike at the fixed spin with $\bar\alpha_{x_0} = +0.5$, and $\bar\alpha_x$ tending rapidly (within 3 or 4 lattice spacings from the fixed spin in all directions) to the negative constant value $-\frac{1}{2V\bar n^2}$ (recall (5.6)) found in the translationally invariant gauge. Note that the typical values of the noninvariant quantity $\bar n = \langle n^0\rangle_{L,N/\lambda,i}$ are very different in the two gauges $i = 1, 2$. Nevertheless, the invariant 2-point functions computed from (5.8) and (5.20) agree, as shown in Fig. 11.
Fig. 10. Cross-section of $\bar n_x$ for $L = \lambda = 20$.
Fig. 11. Comparison of large N invariant 2-point correlator in fixed spin and translationally invariant gauges, $L = \lambda = 20$.
6. Conclusions
We have analyzed nonlinear sigma models with noncompact target space and symmetry group SO(1, N) in dimensions $D \geq 2$ combining analytic and numerical methods. The lattice formulation was used with the dynamics defined both in terms of a transfer operator and a functional integral; in the latter case a gauge fixing was essential. Perhaps the most remarkable feature emerging from the analysis is the intricate vacuum structure as witnessed by Theorem 2. Analyzing the system on a finite spatial lattice via the transfer matrix (where one would usually expect a unique ground state) we could identify a nontrivial ground state orbit, i.e., infinitely many non-normalizable ground states transforming irreducibly under SO(1, N). In the thermodynamic limit spontaneous symmetry breaking was found to occur in all dimensions $D \geq 1$ (where the case $D = 1$ was already treated in [18]). For dimensions less than three this highlights that the Mermin-Wagner theorem does not hold for these systems. To (numerically) see this spontaneous breakdown on the level of correlation functions, the introduction of a suitable new order parameter (Tanh) was instrumental. The mathematical reason for these unusual features was understood to be the nonamenability of the symmetry group.
Since in two dimensions the symmetry breaking is surprising we examined this case
in more detail. Since the gauge fixing by necessity breaks the symmetry explicitly, it was
important to study quantitatively the effect of this explicit breaking via Ward identities.
Our numerical simulations show clearly that this explicit violation disappears in the thermodynamic limit, whereas the symmetry breaking shown by the Tanh order parameter
remains. The new order parameter thus provides for noncompact models a numerically
effective way to probe for spontaneous symmetry breaking in finite volume.
In addition we performed a large N saddle point analysis in the two-dimensional models. The qualitative features we deduced for the model at finite N were confirmed explicitly
in the solution of the $N = \infty$ model.
A variety of open questions remain: what is the significance of the spontaneous symmetry breaking for the localization phenomena the two-dimensional sigma-models provide an
effective description of? Further are these systems integrable and amenable to a bootstrap
construction, e.g., based on an R-matrix with the symmetry of the principal unitary series
representations like in [54]? For numerical simulations a (hybrid-)cluster algorithm would
be desirable; in particular to probe whether or not the peculiar long range order (sensitivity
to boundary conditions) found in the 1-dimensional model extends to the field theories.
An important question is of course whether there exists a nontrivial continuum limit. Conventional wisdom would say no, because the models are (perturbatively) asymptotically
free in the infrared but not in the ultraviolet [10,55]. To gain some feeling whether this
is true beyond perturbation theory, it might be worthwhile to study simplified hierarchical
versions of the Renormalization Group. Finally the vacuum structure in the reconstructed
Hilbert space should be investigated further, as well as its relevance for dimensionally reduced
gravity theories, where a continuum limit is expected to exist.
Acknowledgements
M.N. would like to thank P. Forgacs and P. Baseilhac for discussions, the latter also for verifying aspects of the large N analysis in the literature. The research was supported by the EU under contract number HPRN-CT-2002-00325. The research of A.D. is supported in part by NSF grant PHY-0244599; A.D. is also grateful for the hospitality of the Max Planck Institute (Heisenberg Institut für Physik) where part of this work was done.
Appendix A. Spectral decompositions and heat kernel on $\mathbb{H}_N$

Let $-\Delta_{\mathbb{H}_N}$ be minus the Laplace-Beltrami operator on the hyperboloid $\mathbb{H}_N$, $N \geq 2$. Recall that its spectrum is absolutely continuous and is given by the interval $\frac{1}{4}(N-1)^2 + \omega^2$, $\omega > 0$. There are several complete orthogonal systems of improper eigenfunctions. From a group theoretical viewpoint the most convenient system are the principal plane waves $E_{\omega,k}(n)$ (see [25,27,56] and the references therein) labeled by $\omega > 0$ and a momentum vector $\vec k \in S^{N-1}$. Parameterizing $n = (\xi, \sqrt{\xi^2 - 1}\,\vec s)$, they read
$$E_{\omega,k}(n) = \big[\xi - \sqrt{\xi^2 - 1}\,\vec s \cdot \vec k\big]^{-\frac{1}{2}(N-1) - i\omega}. \tag{A.1}$$

1
k ),
d(n) E,k (n) E ,k (n) =
( )(k,
N ()

d N ()
dS(k) E,k (n) E,k (n ) = (n, n ),
0
(A.2)
S N1
k ) are the normalized delta distributions with respect to the inwhere (n, n ) and (k,
variant measures d(n) and dS(k) on HN and S N 1 , respectively. The spectral weight
is

1 ( N 21 + i) 2
N () =
(A.3)
.
(2)N
(i)
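The main virtue of these functions is their simple transformation law under SO(1, N). For $A \in$ SO(N) one has trivially $E_{\omega,k}(A^{-1}n) = E_{\omega,Ak}(n)$. To describe the action of the boosts let $\vec a \in S^{N-1}$ and decompose $\vec n = \sqrt{\xi^2 - 1}\,\vec s$ into its components parallel, $n_\parallel = \vec n \cdot \vec a$, and orthogonal, $\vec n_\perp$, to $\vec a$. A boost in the direction $\vec a$ will then leave $\vec n_\perp$ invariant. Denoting the boost parameter by $\theta \in \mathbb{R}$ the corresponding element $A = A(\theta, \vec a) = A(-\theta, \vec a)^{-1}$ acts by
$$A(\theta, \vec a)^{-1} \begin{pmatrix} \xi \\ \vec n \end{pmatrix} = \begin{pmatrix} {\rm ch}\,\theta\; \xi - {\rm sh}\,\theta\; \vec n \cdot \vec a \\ ({\rm ch}\,\theta\; n_\parallel - \xi\, {\rm sh}\,\theta)\, \vec a + \vec n_\perp \end{pmatrix}. \tag{A.4}$$
Using this in (A.1) one verifies
$$E_{\omega,k}(A^{-1} n) = [{\rm ch}\,\theta + \vec a \cdot \vec k\; {\rm sh}\,\theta]^{-\frac{1}{2}(N-1) - i\omega}\, E_{\omega, r_A(k)}(n), \tag{A.5}$$
where $r_A(k) \in S^{N-1}$ is a rotated momentum vector whose components parallel, $r_A(k)_\parallel$, and orthogonal, $r_A(k)_\perp$, to $\vec a$ are given by
$$r_A(k)_\parallel = \frac{\vec a \cdot \vec k\; {\rm ch}\,\theta + {\rm sh}\,\theta}{{\rm ch}\,\theta + \vec k \cdot \vec a\; {\rm sh}\,\theta}\; \vec a, \qquad r_A(k)_\perp = \frac{\vec k - (\vec a \cdot \vec k)\,\vec a}{{\rm ch}\,\theta + \vec k \cdot \vec a\; {\rm sh}\,\theta}. \tag{A.6}$$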
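The boost transformation law can be checked numerically at a random point. The sketch below assumes the plane wave $E_{\omega,k}(n) = [\xi - \vec n\cdot\vec k]^{-(N-1)/2 - i\omega}$, the boost action (A.4), and the rotated momentum (A.6) in the forms written here; function names and the random test data are hypothetical.

```python
import numpy as np

def plane_wave(n, k, omega):
    """E_{omega,k}(n) = (n^0 - vec n . k)^(-(N-1)/2 - i omega); the base is positive."""
    N = len(k)
    base = n[0] - np.dot(n[1:], k)
    return base ** complex(-0.5 * (N - 1), -omega)

def boost_inverse(n, a, theta):
    """Action of A(theta, a)^{-1} on n = (xi, vec n), following (A.4)."""
    xi, vec = n[0], n[1:]
    par = np.dot(vec, a)
    perp = vec - par * a
    new_xi = np.cosh(theta) * xi - np.sinh(theta) * par
    new_vec = (np.cosh(theta) * par - np.sinh(theta) * xi) * a + perp
    return np.concatenate(([new_xi], new_vec))

def rotated_momentum(k, a, theta):
    """r_A(k) of (A.6): parallel plus orthogonal part, both over the same denominator."""
    c = np.dot(a, k)
    den = np.cosh(theta) + c * np.sinh(theta)
    return ((c * np.cosh(theta) + np.sinh(theta)) * a + (k - c * a)) / den

rng = np.random.default_rng(1)
N, theta, omega = 3, 0.7, 1.3
vec = rng.normal(size=N)
n = np.concatenate(([np.sqrt(1.0 + vec @ vec)], vec))   # point on the hyperboloid
a = rng.normal(size=N); a /= np.linalg.norm(a)
k = rng.normal(size=N); k /= np.linalg.norm(k)

lhs = plane_wave(boost_inverse(n, a, theta), k, omega)
factor = (np.cosh(theta) + np.dot(a, k) * np.sinh(theta)) ** complex(-0.5 * (N - 1), -omega)
rhs = factor * plane_wave(n, rotated_momentum(k, a, theta), omega)
print(abs(lhs - rhs), np.linalg.norm(rotated_momentum(k, a, theta)))
```

The first printed number should be at rounding level and the second equal to 1, confirming that $r_A(k)$ stays on the unit sphere.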
The transformation law (A.5) characterizes the spherical principal unitary series $\pi_\omega$, $\omega > 0$, of SO(1, N), where $\pi_\omega$ and its conjugate are unitarily equivalent (see e.g. [25], vol. 2, Sections 9.2.1 and 9.2.7). The orthogonality and completeness relations (A.2) amount to the decomposition (3.4) of the quasi-regular representation $\rho$ on $L^2(\mathbb{H}_N)$. Further one verifies
$$dS(k) = [{\rm ch}\,\theta + \vec a \cdot \vec k\; {\rm sh}\,\theta]^{N-1}\, dS(r_A(k)). \tag{A.7}$$
This implies that $dS(k)$ integrals over products of the form $E_{\omega,k}(n)^* E_{\omega,k}(n')$ are invariant under the SO(1, N) action (A.5). In particular one can define spectral projectors $P_I$ commuting with $\rho$ in terms of their kernels $P_I(n \cdot n')$, $I \subset \mathbb{R}_+$:
$$P_I(n \cdot n') := \int_I d\omega\, \mu_N(\omega) \int_{S^{N-1}} dS(k)\, E_{\omega,k}(n)^* E_{\omega,k}(n'), \qquad \int d\Omega(n'')\, P_I(n \cdot n'')\, P_J(n'' \cdot n') = P_{I \cap J}(n \cdot n'). \tag{A.8}$$
Combined with the completeness relation in (A.2) this shows that the spectra of $-\Delta_{\mathbb{H}_N}$ and of $T$ in (3.5) are absolutely continuous.
A complete orthogonal set of real eigenfunctions of $-\Delta_{\mathbb{H}_N}$ is obtained by taking the $dS(k)$ average of the product of $E_{\omega,k}(n)$ with some spherical harmonics on the $k$-sphere. This amounts to a decomposition in terms of SO(N) irreps where the radial parts of the resulting eigenfunctions are given by Legendre functions. Using the normalization and the integral representation from ([57], p. 1000) one has in particular
$$\int_{S^{N-1}} dS(k)\, E_{\omega,k}(n) = (2\pi)^{N/2}\, (\xi^2 - 1)^{\frac{1}{4}(2-N)}\, P^{1-N/2}_{-1/2+i\omega}(\xi). \tag{A.9}$$
As a check on the normalizations one can take the $\xi \to 1^+$ limit in (A.9). The limit on the rhs is regular and gives $2\pi^{N/2}/\Gamma(N/2)$, which equals the area of $S^{N-1}$ as required by the limit of the lhs. Denoting the set of real scalar spherical harmonics by $Y_{l,m}(k)$, $l \in \mathbb{N}_0$, $m = 0, \ldots, d(l) - 1$, with $d(l) = (2l + N - 2)(l + N - 3)!/(l!\,(N-2)!)$ we set
$$H_{\omega,l,m}(n) := \int dS(k)\, Y_{l,m}(k)\, E_{\omega,k}(n) \tag{A.10a}$$
$$= k_l(\omega)\, Y_{l,m}(\vec s)\, (\xi^2 - 1)^{\frac{1}{4}(2-N)}\, P^{1-N/2-l}_{-1/2+i\omega}(\xi), \tag{A.10b}$$
with $k_0(\omega) = (2\pi)^{N/2}$, $k_l(\omega) = (2\pi)^{N/2} \prod_{j=0}^{l-1} \Big[\omega^2 + \Big(\frac{N-1}{2} + j\Big)^2\Big]^{1/2}$, $l \geq 1$.
The expression (A.10b) is manifestly real; the equivalence to (A.10a) can be seen as follows: from (A.2), (2.3), and the orthogonality and completeness of the spherical harmonics one readily verifies that both (A.10a) and (A.10b) satisfy
$$\int d\Omega(n)\, H_{\omega,l,m}(n)^* H_{\omega',l',m'}(n) = \frac{1}{\mu_N(\omega)}\,\delta(\omega - \omega')\,\delta_{l,l'}\,\delta_{m,m'}, \qquad \int_0^\infty d\omega\, \mu_N(\omega) \sum_{l,m} H_{\omega,l,m}(n)^* H_{\omega,l,m}(n') = \delta(n, n'). \tag{A.11}$$
Further both (A.10a) and (A.10b) transform irreducibly with respect to the real $d(l)$ dimensional matrix representation of SO(N) carried by the spherical harmonics. Hence they must coincide. A drawback of the functions (A.10) is that the $k$ integration spoils the simple transformation law (A.5) under SO(1, N). The transformation law can now be inferred from the addition theorem
$$\sum_{l,m} H_{\omega,l,m}(n)^* H_{\omega,l,m}(n') = (2\pi)^{N/2}\, \big((n \cdot n')^2 - 1\big)^{\frac{1}{4}(2-N)}\, P^{1-N/2}_{-1/2+i\omega}(n \cdot n'). \tag{A.12}$$
For example for $n' = An$ this describes the transformation of the SO(N) singlet $H_{\omega,0,0}(n)$ under $A \in$ SO(1, N).
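Having laid out the relevant representation theory let us consider the spectral decomposition of the various operators under consideration. For the kernel (3.5) of $T$ we use an ansatz of the form
$$t_\beta(n \cdot n'; 1) = \int_0^\infty d\omega\, \mu_N(\omega)\, \lambda_{\beta,N}(\omega) \int_{S^{N-1}} dS(k)\, E_{\omega,k}(n)^* E_{\omega,k}(n'), \tag{A.13}$$
chosen such that $T E_{\omega,k} = \lambda_{\beta,N}(\omega) E_{\omega,k}$. To determine the eigenvalues we set $n' = n^{\uparrow}$ and integrate over $k$. Using (2.3), (A.9), and the integral ([57], p. 804) one finds
$$\lambda_{\beta,N}(\omega) = (2\pi)^{N/2}\, \frac{1}{D_{\beta,N}} \int_1^\infty d\xi\, (\xi^2 - 1)^{\frac{1}{4}(N-2)}\, e^{-\beta\xi}\, P^{1-N/2}_{-1/2+i\omega}(\xi) = \frac{K_{i\omega}(\beta)}{K_{\frac{N-1}{2}}(\beta)}, \tag{A.14}$$
as asserted in (3.7). The spectral representation of the iterated kernel $t_\beta(n \cdot n'; x)$, $x \in \mathbb{N}$, equals (A.13) just with $\lambda_{\beta,N}(\omega)$ replaced by $[\lambda_{\beta,N}(\omega)]^x$.
In view of (3.9), (3.8) this directly yields an integral representation for the heat kernel which (after rescaling $g^2/2 \to \tau$) reads:
$$\exp(\tau\Delta_{\mathbb{H}_N})(n, n') = \int_0^\infty d\omega\, \mu_N(\omega)\, e^{-\tau[(\frac{N-1}{2})^2 + \omega^2]} \int_{S^{N-1}} dS(k)\, E_{\omega,k}(n)^* E_{\omega,k}(n'). \tag{A.15}$$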
Let us briefly recap the main properties of the heat kernel on $\mathbb{H}_N$, see, e.g., [58]:
(i) $\exp(\tau\Delta_{\mathbb{H}_N})(n, n')$ is symmetric in $n$, $n'$ and is a bi-solution of the heat equation $\partial_\tau u = \Delta_{\mathbb{H}_N} u$.
(ii) for each $n \in \mathbb{H}_N$, $d\Omega(n') \exp(\tau\Delta_{\mathbb{H}_N})(n, n')$ is a probability measure which converges to the Dirac measure $\delta(n, n')$ as $\tau \to 0^+$.
(iii) it is invariant, $\exp(\tau\Delta_{\mathbb{H}_N})(An, An') = \exp(\tau\Delta_{\mathbb{H}_N})(n, n')$, $A \in$ SO(1, N), and hence a function of $r = {\rm arccosh}(n \cdot n')$ only, for which we write $h_\tau(r)$.
(iv) $h_\tau(r)$ is smooth and strictly positive for all $r \geq 0$ and $\tau > 0$; in particular the coincidence limit $r \to 0^+$ is finite.
Most of these properties are readily verified from the spectral representation (A.15). Properties (i) and (iii) are manifest. The limit $\lim_{\tau \to 0} \exp(\tau\Delta_{\mathbb{H}_N})(n, n') = \delta(n, n')$ follows from (A.2). The fact that $\int d\Omega(n') \exp(\tau\Delta_{\mathbb{H}_N})(n, n') = 1$ for all $n \in \mathbb{H}_N$ and $\tau > 0$, is a consequence of (3.9) and (3.6). This gives (ii). The finiteness of the coincidence limit is clear from (A.9) and the remark after it. However the positivity in (iv) is masked by the oscillating nature of the Legendre functions. It can be shown, for example, from the alternative expressions (A.16), (A.17) below, where also the smoothness is manifest.
Using the behavior of the Legendre P functions under a sign flip of the spectral parameter one can rewrite (A.15) in a form where a simplified spectral weight appears, which equals that of the N = 1 case for all odd N and that of the N = 2 case for all even N. With some further processing one can show [26] the equivalence to the usual expressions for the heat kernel. For N = 2 see, e.g., [60], vol. 1, Eqs. (3.32), (3.33). For N > 2 see [26]. The final result we quote from [58]:

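\[
h_\tau(r) = (2\pi)^{-\frac{N+1}{2}} \Big(\frac{\pi}{\tau}\Big)^{1/2} e^{-\frac{(N-1)^2}{4}\tau} \Big(-\frac{1}{\operatorname{sh} r}\frac{\partial}{\partial r}\Big)^{\frac{N-1}{2}} e^{-\frac{r^2}{4\tau}}, \qquad N \text{ odd}, \tag{A.16}
\]
\[
h_\tau(r) = (2\pi)^{-\frac{N+1}{2}} \tau^{-1/2} e^{-\frac{(N-1)^2}{4}\tau} \int_r^\infty \frac{ds\,\operatorname{sh} s}{\sqrt{\operatorname{ch} s - \operatorname{ch} r}} \Big(-\frac{1}{\operatorname{sh} s}\frac{\partial}{\partial s}\Big)^{\frac{N}{2}} e^{-\frac{s^2}{4\tau}}, \qquad N \text{ even}. \tag{A.17}
\]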
From here one readily verifies the positivity property in (iv).
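As a numerical sanity check of properties (ii) and (iv), the following sketch integrates the N = 3 case of (A.16) over the whole space. It assumes unit curvature radius, writes tau for the heat-kernel time, and takes the invariant measure to be the standard Riemannian volume on $H^3$ with radial element $4\pi\,\mathrm{sh}^2 r\,dr$; the function names are illustrative and not taken from the paper.

```python
import math

def heat_kernel_h3(r, tau):
    """N = 3 case of (A.16): (4*pi*tau)^(-3/2) * r/sinh(r) * exp(-tau - r^2/(4*tau))."""
    ratio = 1.0 if r == 0.0 else r / math.sinh(r)  # r/sinh(r) -> 1 as r -> 0
    return (4.0 * math.pi * tau) ** -1.5 * ratio * math.exp(-tau - r * r / (4.0 * tau))

def total_mass(tau, r_max=40.0, steps=40000):
    """Integrate the kernel over H^3 with a composite Simpson rule (steps must be even).

    Assumption: the radial volume element is 4*pi*sinh(r)^2 dr.
    """
    h = r_max / steps

    def integrand(r):
        return 4.0 * math.pi * math.sinh(r) ** 2 * heat_kernel_h3(r, tau)

    s = integrand(0.0) + integrand(r_max)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * integrand(i * h)
    return s * h / 3.0

if __name__ == "__main__":
    for tau in (0.5, 2.0):
        print(tau, total_mass(tau))
```

Under these assumptions the total mass should come out equal to one for every tau, and the closed form is visibly positive for all r.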
Appendix B. Finite volume corrections to Ward identities

Here we derive for D = 2 the finite volume corrections to the Ward identity (2.22),
(2.23). We use the translation invariant gauge fixing 1 of Section 2.1 where the SO(1, N )
invariance is violated by delta function constraint in Eq. (2.10). One may expect such corrections to vanish in the thermodynamic limit. In fact, it is possible to calculate the explicit
form of these corrections and thereby study directly their volume dependence. The most
convenient approach starts with a derivation of the exact Ward identity in a nonsingular
gauge analogous to the - (or -)gauges of quantized non-Abelian gauge theories, and
then recovers the delta-function gauge by taking the limit . Thus, we begin with the
functional integral
2

Z =
(B.1)
d nx exp S0 [n] +
nx
ln n0x + 2 ln
n0x
2
x
x
x
x
with $S_0[n]$ regarded as a functional of the spatial components $\vec n_x$ of the n-field only, eliminating $n^0_x$ via $n^0_x = \sqrt{1 + \vec n_x^2}$. Note that, as in the case of gauge field theory, the form of the Faddeev–Popov term is identical in the delta-function and the smooth gauges.
We shall indicate the procedure for the case of the rotation Ward identity in (2.23) only; the derivation of the correction terms for the boost Ward identity is analogous but more tedious. We shall comment on it at the end of this appendix. For the rotation Ward identity it is enough to consider a rotation in the $n^1$, $n^2$ plane, and we may without loss of generality consider the SO(1, 2) model throughout. Performing a local rotation, with a site-dependent angle, on the $n_x$ field then gives
nx ||
nx+ | cos (x x+ + x x+ ),
nx nx+ n0x n0x+ |
2

2
2
nx
|
nx | cos (x + x ) +
|
nx | sin (x + x ) ,
x
nx | cos x ,
n1x := |
nx2 := |
nx | sin x .
(B.2a)
(B.2b)
(B.2c)
Introducing these transformations into (B.1) and expanding to second order in the rotation angles, one generates three sorts of terms, which we shall refer to henceforth as the A-, B- and C-type contributions to the rotation Ward identity. The A-terms arise from the variation of the nearest-neighbor coupling term in the action of (B.1), the B-terms from the gauge-fixing term, and the C-terms are those quadratic terms arising as cross terms of the (linear) variation in the pure action and gauge-fixing term. Note that the Faddeev–Popov and nonlinear field measure terms play no role as they are invariant under rotations. The invariance of the functional integral under the change of variables (B.2) then implies A + B + C = 0 where, after some calculation, we find:

1
( x )( y ) Jx Jy
( x )2
nx nx+ ,
A=
(B.3a)
2 x,y
2 x

2 2
x2 + y2 1 1
nx ny + x y nx ny
B =
2
2
xy

2 2 1 1
x y nx ny nz nw ,

(B.3b)
xyzw
C =

( x )y Jx n2y n1z ,
(B.3c)
xyz
where as in Section 2

12
= n1x n2x+ n2x n1x+ .
Jx := Jx
At this point it is convenient to go over to momentum space by
introducing discrete Fourier
transforms appropriate for the lattice in question: thus, x = V1 p eipx (p), etc. One then
finds for the A type contributions

1
(2 2 cos p ) 2E 1 JL (p) (p)(p)
A=
(B.4)
2V p
with E a = nax nax+ , a = 1, 2, equal constants by translation and rotation invariance. Similarly, the B type terms may be rewritten

2 2
1

D(p) + D(0) +
B=
(B.5)
(p)(p)
(p) ,
V p
2
2
where we have defined two- and four-point functions D and , respectively, as

1 ip(xy)
e
D(p),
nax nby := ab Dxy , Dxy =
V p

1 ip(xy)
n2x n2y n 1 (0)n 1 (0) :=
e
(p),
V p

where the tilde notation indicates Fourier transform on the n fields n a (p) := x eipx nax .
Finally, introducing the three-point function as follows ( denotes a left lattice derivative)
2 1 1 ip(xy)

0
ny n (0) =
xy := Jx
e
(p),
V p
we find that the C type term amounts to
C =
1
(p)(p) (p).
V p
(B.6)
To summarize, the exact rotation Ward identity in a -gauge takes the form

(2 2 cos p ) 2E 1 JL (p)

= 2 (p) D(p) D(0) + 2 2 (p).
(B.7)
The terms arising from the gauge-fixing part of the action are isolated on the right-hand
side of (B.7): the left-hand side corresponds precisely to the naive Ward-identity (2.30). To
obtain the form of the Ward identity appropriate for the delta-function gauge used for the
simulations, we must next examine the limit of the right-hand side when . In order
1 (0)2 + n 2 (0)2 ].
to do this, we note that the action in (B.1) can be written S = S0 +
2 [n
Furthermore, we have the distributional limit
1
1
(x) + 2 2 (x) + , as .
(B.8)
2
8

We shall momentarily use the notation O0 := Z10 d n ( x nx )OeS0 to denote the
expectation of a Greens function G(
n) in the delta-gauge, whereas will denote the
-gauge expectation, as previously. The B term (B.3b) in -gauge can be rewritten

2 1 2 1
B =
(B.9)
x n (0)
x y n2x n2y 1 n 1 (0)2 .
2V x
2
xy
e
2
2 x
(x) +
Using (B.8), it is easy to obtain for the infinite limit of the first term on the right-hand
side of (B.9)
1 2
2 1 2
1
x n (0)
x = 2
(p)(p).
2V x
2V x
2V p
(B.10)
Again, using (B.8), one easily verifies, for any O independent of n1 , in the infinite limit:

3
S0 2
2 S0
1 n 1 (0)2 O n2 O n2
(B.11)
.
2
n 1 (0)
n 1 (0)2 0
The first derivative term in (B.11) can be written in terms of a new field x , as follows:
S0
1
=
x ,
n 1 (0) V x

0

1
1
2
nx+ + n0x + 0 0 n1x
with x := 0
(B.12)
nx
nx
x nx
while the second derivative term is easily seen to be suppressed by a factor of volume V
relative to the first, and will be ignored henceforth. In momentum space, the B type terms
are then found to yield

1
3 2
1
2
2
+
(B.13)
n (p)n (p) (0) .
V p
2V
4V 3
The large limit of the C type cross term can likewise be evaluated

0 2 1
x y Jx
ny n (0)
xy
1

(2 2 cos p )(p)(p)D(p) 2
(p)(p) (p)
2
V p
V p
with the modified three-point function

0 2
(p) :=
eip(xy) Jx
ny (0) .
(B.14)
Combining these results, we find that the rotational Ward identity, given by (B.7) in the
smooth gauges, becomes in delta-function gauge

(2 2 cos p ) 2E 1 JL (p)
2
1
2
(2 2 cos p )D(p) (p)

V
V
V

3 2
n (p)n 2 (p) (0)2 .
3
2V
(B.15)
For $p$ fixed, the first two terms are manifestly of order $V^{-1}$ for large $V$. The last two terms involve 3- and 4-point functions, respectively, with zero-momentum insertions, for which the volume dependence is not a priori clear. However, they may be computed easily from the numerical simulations of Section 4. We find that the best fits to the volume dependence of the last two terms suggest a behavior $V^{-1} \ln V$. These fits were performed using the results of measurements on $32^2$, $64^2$ and $128^2$ lattices.
Fig. 12. Volume dependence of correction terms to rotation Ward identity.
Finally,
we show in Fig. 12 the right-hand side of (B.15), divided by the trivial kinematic
factor (2 2 cos p ), for = 10 and L = 32, 64, and 128. In all cases, they represent
a small numerical correction to the left-hand side, as expected from the agreement found
in Section 4.
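The kinematic factor $(2 - 2\cos p)$ that is divided out above is the eigenvalue of minus the nearest-neighbor lattice Laplacian on a plane wave. The sketch below checks this in one lattice direction. It assumes periodic boundary conditions and momenta $p = 2\pi k/L$; the helper names are hypothetical and are not part of the simulation code of Section 4.

```python
import cmath
import math

def lattice_momenta(L):
    """Allowed momenta on a periodic chain of L sites (assumed convention p = 2*pi*k/L)."""
    return [2.0 * math.pi * k / L for k in range(L)]

def minus_laplacian_on_plane_wave(L, p, x):
    """Apply minus the 1D nearest-neighbor Laplacian to f(y) = exp(i*p*y) at site x."""
    def f(y):
        return cmath.exp(1j * p * (y % L))  # periodic wrap-around
    return 2.0 * f(x) - f(x + 1) - f(x - 1)

def kinematic_factor(p):
    """The factor (2 - 2 cos p) appearing in the Ward identity."""
    return 2.0 - 2.0 * math.cos(p)

if __name__ == "__main__":
    L = 8
    for p in lattice_momenta(L):
        print(round(p, 4), round(kinematic_factor(p), 4))
```

The factor vanishes at $p = 0$, which is why the comparison of the two sides of the Ward identity is made at non-zero momentum.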
Finite volume corrections to the boost Ward identity in (2.23) can be computed in a manner precisely analogous to the procedure leading to (B.15). Apart from a trivial $V^{-1}$ term, there are in this case nine structures appearing on the right-hand side of the Ward identity. As the formulas are somewhat lengthy we refrain from spelling them out here. However, we have studied the finite volume dependence of these terms on $32^2$, $64^2$ and $128^2$ lattices, and again find that the dominant asymptotic behavior is $V^{-1} \ln V$, as for the rotation Ward identity.
References
[1] D. Amit, A. Davies, Symmetry breaking in the non-compact sigma model, Nucl. Phys. B 225 (1983) 221.
[2] Y. Cohen, E. Rabinovici, A study of the non-compact non-linear sigma-model: A search for dynamical
realizations of non-compact symmetries, Phys. Lett. B 124 (1983) 371.
[3] M. Gomes, Y.K. Ha, Noncompact sigma model and dynamical mass generation, Phys. Lett. B 145 (1984)
235.
[4] Y.K. Ha, Noncompact symmetries in field theories with indefinite metric, Nucl. Phys. B 256 (1985) 687.
[5] T. Morozumi, S. Nojiri, An analysis of noncompact nonlinear sigma models, Prog. Theor. Phys. 75 (1986)
677.
[6] M. Gomes, Y.K. Ha, Dynamical gauge boson in SU(N, 1)-type models, Phys. Rev. Lett. 58 (1987) 2390.
[7] J.W. van Holten, Quantum noncompact sigma models, J. Math. Phys. 28 (1987) 1420.
[8] S.A. Brunini, M. Gomes, A.J. da Silva, Remarks on noncompact sigma models, Phys. Rev. D 38 (1988) 706.
[9] B. DeWitt, Nonlinear sigma-models in four dimensions as a toy model for quantum gravity, in: Proceedings,
Geometrical and Algebraic Aspects of Nonlinear Field Theory, Amalfi 1988.
[10] J. de Lyra, B. DeWitt, S. Foong, T. Gallivan, R. Harrington, A. Kapulkin, E. Myers, J. Polchinski, The quantized O(1, 2)/O(2) Z2 sigma model has no continuum limit in four dimensions: 1. Theoretical framework,
Phys. Rev. D 46 (1992) 2527, hep-lat/9205014;
J. de Lyra, B. DeWitt, S. Foong, T. Gallivan, R. Harrington, A. Kapulkin, E. Myers, J. Polchinski, The quantized O(1, 2)/O(2) Z2 sigma model has no continuum limit in four dimensions: 2. Numerical simulations, Phys. Rev. D 46 (1992) 2538, hep-lat/9205017.
[11] M. Niedermaier, Dimensionally reduced gravity theories are asymptotically safe, Nucl. Phys. B 673 (2003) 131;
M. Niedermaier, H. Samtleben, An algebraic bootstrap for dimensionally reduced gravity, Nucl. Phys. B 579 (2000).
[12] F. Wegner, The mobility edge problem: continuous symmetry and a conjecture, Z. Phys. B 35 (1979) 207.
[13] A. Houghton, A. Jevicki, R. Kenway, A. Pruisken, Noncompact sigma-models and the existence of a mobility edge in disordered electronic systems near two dimensions, Phys. Rev. Lett. 45 (1980) 394.
[14] S. Hikami, Anderson localization in a nonlinear sigma-model representation, Phys. Rev. B 24 (1981) 2671.
[15] K.B. Efetov, Supersymmetry and theory of disordered metals, Adv. Phys. 32 (1983) 53.
[16] K.B. Efetov, Supersymmetry in Disorder and Chaos, Cambridge Univ. Press, Cambridge, 1997.
[17] S. Coleman, There are no Goldstone bosons in two dimensions, Commun. Math. Phys. 31 (1973) 259.
[18] M. Niedermaier, E. Seiler, Nonamenability and spontaneous symmetry breaking: the hyperbolic spin chain, hep-th/0312293.
[19] P. Hasenfratz, Perturbation theory and zero modes in O(N) lattice sigma-models, Phys. Lett. B 141 (1984) 385.
[20] A. Patrascioiu, E. Seiler, Continuum limit of 2D spin models with continuous symmetry and conformal field theory, Phys. Rev. E 57 (1998) 111;
A. Patrascioiu, E. Seiler, Does conformal quantum field theory describe the continuum limits of 2D spin models with continuous symmetry? Phys. Lett. B 417 (1998) 123.
[21] T. Spencer, M.R. Zirnbauer, Spontaneous symmetry breaking of a hyperbolic sigma model in three dimensions, math-ph/0410032.
[22] D. Mermin, H. Wagner, Absence of ferromagnetism or antiferromagnetism in one or two-dimensional isotropic Heisenberg models, Phys. Rev. Lett. 17 (1966) 1133.
[23] R.L. Dobrushin, S.B. Shlosman, Absence of breakdown of continuous symmetry in two-dimensional models of statistical physics, Commun. Math. Phys. 42 (1975) 31.
[24] C.-E. Pfister, On the symmetry of the Gibbs states in two-dimensional lattice systems, Commun. Math. Phys. 79 (1981) 181.
[25] N. Vilenkin, A. Klimyk, Representations of Lie Groups and Special Functions, Kluwer, Dordrecht, 1993.
[26] C. Grosche, F. Steiner, The path integral on the pseudosphere, Ann. Phys. 182 (1988) 120.
[27] M. Alonso, G. Pogosyan, K. Wolf, Wigner functions for curved spaces I: on hyperboloids, quant-ph/0205041.
[28] T. Kato, Trotter's product formula for an arbitrary pair of self-adjoint contraction semigroups, in: I. Gohberg, M. Kac (Eds.), Topics in Functional Analysis, Academic Press, New York, 1978, pp. 185–195.
[29] H. Neidhardt, V.A. Zagrebnov, Trotter–Kato product formula and symmetrically normed ideals, J. Funct. Anal. 167 (1999) 113;
H. Neidhardt, V.A. Zagrebnov, Trotter–Kato product formula and operator-norm convergence, Commun. Math. Phys. 205 (1999) 129.
[30] M. Braverman, O. Milatovic, M. Shubin, Essential self-adjointness of Schrödinger type operators on manifolds, Russian Math. Surveys 57 (2002) 641.
[31] O. Milatovic, The form sum and the Friedrichs extension of Schrödinger-type operators on Riemannian manifolds, Proc. Amer. Math. Soc. 132 (2004) 147.
[32] M. Reed, B. Simon, Methods of Modern Mathematical Physics, vol. 4, Academic Press, New York, 1978.
[33] B. Simon, L. Yaffe, Rigorous perimeter law upper bound on Wilson loops, Phys. Lett. B 115 (1982) 115.
[34] M. Lüscher, Absence of spontaneous symmetry breaking in Hamiltonian lattice gauge theories, preprint 1979, unpublished.
[35] B. Simon, Functional Integration and Quantum Physics, Academic Press, New York, 1979.
[36] J. Schaefer, Covariant path integral on hyperbolic surfaces, J. Math. Phys. 38 (1997) 11.
[37] M. Niedermaier, E. Seiler, in preparation.
[38] M. Reed, B. Simon, Methods of Modern Mathematical Physics, vol. 1, Academic Press, New York, London, 1972.
[39] P. Lax, Functional Analysis, Wiley, 2003.
[40] A. Paterson, Amenability, American Mathematical Society, Providence, 1988.
[41] P. Eymard, Moyennes invariantes et représentations unitaires, Lecture Notes in Mathematics, vol. 300, Springer-Verlag, Berlin, 1972.
[42] M.E.B. Bekka, Amenable unitary representations of locally compact groups, Invent. Math. 100 (1990) 383.
[43] V. Pestov, On some questions of Eymard and Bekka concerning amenability of homogeneous spaces and
induced representations, C. R. Math. Acad. Sci. Soc. R. Canada 25 (2003) 76, math.OA/0212380.
[44] K. Osterwalder, R. Schrader, Axioms for Euclidean Green's functions, Commun. Math. Phys. 31 (1973) 83;
K. Osterwalder, R. Schrader, Axioms for Euclidean Green's functions 2, Commun. Math. Phys. 42 (1975) 281.
[45] J. Glimm, A. Jaffe, Quantum Physics, Springer, Berlin, 1987.
[46] E. Seiler, Gauge theories as a problem of constructive quantum field theory and statistical mechanics, in:
Lecture Notes in Physics, vol. 159, Springer, Berlin, 1982.
[47] B. Schroer, J.A. Swieca, Conformal transformations for quantized fields, Phys. Rev. D 10 (1974) 480.
[48] I. Segal, R. Kunze, Integrals and Operators, Springer, 1978.
[49] A. Duncan, R. Roskies, Variational estimates for spectra in lattice Hamiltonian theories, Phys. Rev. D 31
(1985) 364.
[50] A. Duncan, R. Roskies, Asymptotic scaling in Hamiltonian calculations of the O(3) sigma-model, Phys.
Rev. D 32 (1985) 3277.
[51] M. Lüscher, A portable high quality random number generator for lattice field theory simulations, Comput. Phys. Commun. 79 (1994) 100.
[52] J. Hubbard, Calculation of partition functions, Phys. Rev. Lett. 3 (1959) 77.
[53] R.L. Stratonovich, On a method of calculating quantum distribution functions, Sov. Phys. Dokl. 2 (1958)
416.
[54] M. Kirch, A. Manashov, Noncompact SL(2, R) spin chain, hep-th/0405030.
[55] D. Friedan, Nonlinear models in 2 + ε dimensions, Phys. Rev. Lett. 45 (1980) 1057;
D. Friedan, Ann. Phys. (N.Y.) 163 (1985) 318.
[56] J. Bros, U. Moschella, Two-point functions and quantum fields in de Sitter universe, Rev. Math. Phys. 8
(1996) 327, gr-qc/9511019.
[57] I. Gradshteyn, I. Ryzhik, Table of Integrals and Products, Academic Press, 1980.
[58] J.P. Anker, P. Ostellari, The heat kernel on noncompact symmetric spaces, survey for IHP Meeting on Heat
Kernels, Random Walks, and Analysis on Manifolds and Graphs, Paris, 2002.
[59] G. Warner, Harmonic Analysis on Semisimple Lie Groups I, II, Springer, Berlin, 1972.
[60] A. Terras, Harmonic Analysis on Symmetric Spaces and Applications I, Springer, Berlin, 1985.
Quantum supersymmetric Toda–mKdV hierarchies

Petr P. Kulish, Anton M. Zeitlin
St. Petersburg Department of Steklov Mathematical Institute, Fontanka 27, St. Petersburg 191023, Russia
Received 17 March 2005; accepted 2 June 2005
Abstract
In this paper we generalize the quantization procedure of Toda–mKdV hierarchies to the case of arbitrary affine (super)algebras. The quantum analogue of the monodromy matrix, related to the universal R-matrix with the lower Borel subalgebra represented by the corresponding vertex operators, is introduced. The auxiliary L-operators satisfying the RTT-relation are constructed and the quantum integrability condition is obtained. The general approach is illustrated by means of two important examples.
PACS: 11.25.Hf; 11.30.Pb; 02.20.Uw; 02.20.Tw
Keywords: Superconformal field theory; Super-KdV; Quantum superalgebras; Toda field theory
1. Introduction
For more than a quarter of a century both classical and quantum affine Toda field theories and the related generalized (m)KdV ((modified) Korteweg–de Vries) hierarchies have been extensively studied (see, e.g., [1–6]). These theories are integrable and have the L–A pair (or zero curvature) formulation. The most famous are the sine-Gordon and $A_2^{(2)}$ models (see, e.g., [7,8]) and the associated KdV and reduced Boussinesq hierarchies. These theories allow supersymmetric and fermionic generalizations [9–11], both called "super" because their underlying algebraic structures are affine Lie superalgebras.
* Corresponding author.
E-mail addresses: kulish@pdmi.ras.ru (P.P. Kulish), zam@math.ipme.ru (A.M. Zeitlin).

URL: http://www.ipme.ru/zam.html.
doi:10.1016/j.nuclphysb.2005.06.002
P.P. Kulish, A.M. Zeitlin / Nuclear Physics B 720 [FS] (2005) 289–306
The super-Toda field theory appears to be supersymmetric if and only if the associated superalgebra possesses a purely fermionic system of simple roots. One should note, however, that superalgebras usually allow several simple root systems [12]; these correspond to different Toda theories, and only the purely fermionic system corresponds to the supersymmetric one.
The supersymmetric version of the Drinfeld–Sokolov reduction applied to the matrix L-operator of the Toda–mKdV theories gives the generators of the related super W-algebra with the commutation relations provided by the associated Hamiltonian structure (see, e.g., [13,14]).
In this paper we consider these theories from the point of view of the quantum inverse scattering method (QISM) (see [15,16]). There are two approaches to applying QISM to Toda–mKdV type models. The first one is more traditional and is based on the quantization of the corresponding lattice systems (see, e.g., [17,18]); the second one is based on the quantization in terms of continuous free field theory and was introduced in [3,4].
Here we use the second approach, generalizing our results obtained in [20–22] for the affine superalgebras of rank 2 to the case of a general affine superalgebra.
We build the quantum generalization of the monodromy matrix and prove that the related auxiliary L-operators (which are equal to the monodromy matrix multiplied by the exponential of elements of the Cartan subalgebra) satisfy the RTT-relation [15,16], while the quantum counterpart of the monodromy matrix itself is shown to satisfy a specialization of the reflection equation (see, e.g., [19]). This provides the quantum integrability relation for the supertraces of the monodromy matrix taken in different representations (transfer matrices). Moreover, it is proven that the auxiliary L-operators are related to the universal R-matrix associated with the underlying quantum affine superalgebra, with the lower Borel subalgebra represented by the vertex operators from the corresponding Toda field theory. Using this relation in the case when the simple root system is purely fermionic, it is demonstrated that the transfer matrix is invariant under the supersymmetry transformation, as was shown for particular classical examples of Toda field theories [10,11].
In the last two sections the above constructions are illustrated by means of two important examples of integrable hierarchies: quantum super-KdV [20] and SUSY N = 1 KdV [21]. These hierarchies generate two integrable structures of the superconformal field theory; the second one is invariant under the SUSY transformation while the first one is not.
2. Bosonic Toda–mKdV hierarchies

Each mKdV hierarchy and the related Toda field theory associated with an affine Lie algebra are generated by the following L-operator [1]:
\[
L = \partial_u - \partial_u \phi^i(u) H^i - \sum_{i=0}^{r} e_{\alpha_i}, \tag{1}
\]
where $u$ lies on a cylinder of circumference $2\pi$, and $\phi^i$ are the scalar fields with the Poisson brackets:

u i (u), v j (v) = ij (u v)
(2)
with quasiperiodic boundary condition:
\[
\phi^i(u + 2\pi) = \phi^i(u) + 2\pi i p^i. \tag{3}
\]
$e_{\alpha_i}$ are the Chevalley generators of the underlying affine Lie algebra and $H^i$ ($i = 1, \ldots, r$) form a basis in the Cartan subalgebra of the corresponding simple Lie algebra:
\[
[H^i, e_{\alpha_k}] = \alpha_k^i e_{\alpha_k}, \qquad [e_{\alpha_k}, e_{-\alpha_l}] = \delta_{kl} h_{\alpha_k}, \qquad \mathrm{ad}_{e_{\alpha_k}}^{1-a_{kj}} e_{\alpha_j} = 0, \tag{4}
\]
where $a_{kj}$ is a Cartan matrix and $h_{\alpha_k} \equiv (\alpha_k, H) = \alpha_k^i H^i$. In our case this algebra is considered in evaluation representations, when $e_{\alpha_0} = \lambda e_{-\theta}$ (this corresponds to the case of an untwisted affine Lie algebra; the twisted case is more complicated), where $\theta$ is the highest root of the related simple Lie algebra. The classical monodromy matrix for the linear problem associated with the L-operator (1) can be expressed in the following way [3]:

\[
\pi_s(\lambda)(M) \equiv M_s(\lambda) = e^{2\pi i p^k H^k}\, \mathrm{Pexp} \int_0^{2\pi} du \sum_{i=0}^{r} e^{-(\alpha_i, \phi(u))} e_{\alpha_i}, \tag{5}
\]
where $\pi_s$ is some evaluation representation of the corresponding affine Lie algebra. Defining the auxiliary L-matrix:
\[
L(\lambda) = e^{-i\pi p^k H^k} M(\lambda), \tag{6}
\]
one can find that the quadratic Poisson bracket relation is valid:
\[
\{L(\lambda) \otimes_, L(\mu)\} = [r(\lambda\mu^{-1}), L(\lambda) \otimes L(\mu)], \tag{7}
\]
where $r(\lambda)$ is the trigonometric r-matrix [16,23] related to the corresponding simple Lie algebra. The traces of the monodromy matrices in different evaluation representations $\pi_s$:
\[
t_s(\lambda) = \mathrm{tr}\, \pi_s(M(\lambda)) \tag{8}
\]
are in involution under the Poisson brackets:
\[
\{t_s(\lambda), t_{s'}(\mu)\} = 0. \tag{9}
\]
The quantization means that we move from the quadratic Poisson bracket relation (7) to the RTT-relation with the underlying affine Lie algebra deformed to the quantum affine algebra (see, e.g., [15,16] and below). In this section we give the generalization of the constructions that appeared in [4,5].
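First, let us quantize the scalar fields $\phi^i$:
\[
\phi^k(u) = iQ^k + iP^k u + \sum_{n \neq 0} \frac{a^k_{-n}}{n} e^{inu}, \qquad [Q^k, P^j] = \frac{i}{2}\beta^2 \delta^{kj}, \qquad [a^k_n, a^j_m] = \frac{\beta^2}{2}\, n\, \delta^{kj} \delta_{n+m,0}, \tag{10}
\]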
and define the vertex operators

Vk (u) := :e
(k ,(u))
: = exp

(k , an )
n=1
exp

(k , an )
n=1

e
inu

exp i (k , Q) + (k , P )u

e
inu
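Then one can define the quantum generalizations of auxiliary L-operators [3]:
\[
L^{(q)}(\lambda) = e^{i\pi P^k H^k}\, \mathrm{Pexp} \int_0^{2\pi} du \sum_{i=0}^{r} V_{\alpha_i}(u)\, e_{\alpha_i}, \tag{11}
\]
where $V_{\alpha_i}(u)$ are the vertex operators defined above and now $e_{\alpha_i}$, $H^k$ are the generators of the corresponding quantum affine algebra: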

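\[
[H^i, e_{\alpha_k}] = \alpha_k^i e_{\alpha_k}, \qquad [e_{\alpha_k}, e_{-\alpha_l}] = \delta_{kl} [h_{\alpha_k}]_q, \qquad \big(\mathrm{ad}^{(q)}_{e_{\alpha_k}}\big)^{1-a_{kj}} e_{\alpha_j} = 0, \tag{12}
\]
where
\[
q = e^{i\pi\beta^2/2}, \qquad [a]_q = \frac{q^a - q^{-a}}{q - q^{-1}}, \qquad \mathrm{ad}^{(q)}_{e_\alpha} e_\beta = e_\alpha e_\beta - q^{(\alpha,\beta)} e_\beta e_\alpha.
\]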
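To make the deformation concrete, the following sketch evaluates the q-number $[a]_q = (q^a - q^{-a})/(q - q^{-1})$ with $q = e^{i\pi\beta^2/2}$ and checks that it reduces to the ordinary number $a$ as $\beta^2 \to 0$, which is the sense in which the relations (12) go back to the undeformed relations (4). The function names are illustrative only.

```python
import cmath
import math

def q_from_beta2(beta2):
    """Deformation parameter q = exp(i*pi*beta^2/2)."""
    return cmath.exp(1j * math.pi * beta2 / 2.0)

def q_number(a, q):
    """q-number [a]_q = (q^a - q^(-a)) / (q - q^(-1)); undefined at q = +1 or q = -1."""
    return (q ** a - q ** (-a)) / (q - 1.0 / q)

if __name__ == "__main__":
    for beta2 in (0.5, 0.1, 1e-3):
        print(beta2, q_number(3, q_from_beta2(beta2)))
```

For $q$ on the unit circle, $q = e^{i\vartheta}$, the q-number is real and equals $\sin(a\vartheta)/\sin\vartheta$.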
The object (11) is defined in the interval $0 < \beta^2 < 2b^{-1}$, $b = \max(|b_{ij}|)$ ($i \neq j$), where $b_{ij}$ is a symmetrized Cartan matrix, but can be analytically continued to a wider region. The quantum monodromy matrix is defined using the relation (6):
\[
M^{(q)}(\lambda) = e^{i\pi P^k H^k} L^{(q)}(\lambda). \tag{13}
\]
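It can be shown that the $L^{(q)}$ operator satisfies the mentioned RTT-relation in two ways [4,5]. The first way is to consider the following product:
\[
\big(L^{(q)}(\lambda) \otimes I\big)\big(I \otimes L^{(q)}(\mu)\big). \tag{14}
\]
Then, moving all the Cartan multipliers to the left we find the following:
\[
e^{i\pi P^i \Delta(H^i)}\, \mathrm{Pexp} \int_0^{2\pi} du\, K_1(u)\; \mathrm{Pexp} \int_0^{2\pi} du\, K_2(u),
\]
\[
K_1(u) = \sum_{j=0}^{r} V_{\alpha_j}(u)\, e_{\alpha_j} \otimes q^{h_{\alpha_j}}, \qquad K_2(u) = \sum_{j=0}^{r} 1 \otimes V_{\alpha_j}(u)\, e_{\alpha_j}, \tag{15}
\]
where $\Delta(H^i) = H^i \otimes I + I \otimes H^i$. The commutation relations between vertex operators on a circle:
\[
V_{\alpha_k}(u) V_{\alpha_j}(u') = q^{b_{kj}} V_{\alpha_j}(u') V_{\alpha_k}(u), \qquad u > u', \tag{16}
\]
lead to
\[
[K_1(u), K_2(u')] = 0, \qquad u < u'. \tag{17}
\]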
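Due to this property one can unify the two P-exponents into a single one, which is equal to
\[
\big(\pi_s(\lambda) \otimes \pi_{s'}(\mu)\big) \Delta\big(L^{(q)}\big), \tag{18}
\]
where $\pi_s$ and $\pi_{s'}$ are some evaluation representations, and the coproduct $\Delta$ of the quantum affine (super)algebra is defined by:
\[
\Delta(H^i) = H^i \otimes I + I \otimes H^i, \qquad \Delta(e_{\alpha_j}) = e_{\alpha_j} \otimes q^{h_{\alpha_j}} + 1 \otimes e_{\alpha_j}, \qquad \Delta(e_{-\alpha_j}) = e_{-\alpha_j} \otimes 1 + q^{-h_{\alpha_j}} \otimes e_{-\alpha_j}. \tag{19}
\]
Next, considering the opposite product of the L-operators, one finds that it coincides with the opposite coproduct of the L-operators:
\[
\big(I \otimes L^{(q)}(\mu)\big)\big(L^{(q)}(\lambda) \otimes I\big) = \big(\pi_s(\lambda) \otimes \pi_{s'}(\mu)\big) \Delta^{op}\big(L^{(q)}\big), \tag{20}
\]
where $\Delta^{op} = \sigma\Delta$ and the map $\sigma$ is defined as follows: $\sigma(a \otimes b) = b \otimes a$. Using the property of the universal R-matrix, namely, $R\Delta = \Delta^{op}R$ [28], we arrive at the RTT-relation:
\[
R(\lambda\mu^{-1}) \big(L^{(q)}(\lambda) \otimes I\big)\big(I \otimes L^{(q)}(\mu)\big) = \big(I \otimes L^{(q)}(\mu)\big)\big(L^{(q)}(\lambda) \otimes I\big) R(\lambda\mu^{-1}). \tag{21}
\]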
Remembering the expression for the monodromy matrix (13) one obtains that the RTTrelation is no longer valid for the monodromy matrices, however, it is easy to see that
k k
multiplying both RHS and LHS of (21) by ei H P one obtains:

(q)

()M(q) () = M
(q) ()M(q) ()R12 1 ,
R12 1 M
(22)
1
2
2
1
() the monodromy matrix with e 1 replaced by e
where we have denoted M
i
i
1
(q)
hi
q
and M2 () the monodromy matrix with 1 ei replaced by q hi ei . Taking
(q) cancel and we obtain the quantum
the trace, the above additional Cartan factors in M
integrability condition for their traces (transfer-matrices):

\[
\big[t_s^{(q)}(\lambda), t_{s'}^{(q)}(\mu)\big] = 0. \tag{23}
\]
reduced universal R-matrix (see below) and the P-exponential form of the auxiliary L(q)
operator. That is, let us consider integrals of vertex operators:
1
Vk (u2 , u1 ) =
q q 1
u2
du Vk (u).
(24)
u1
Via the contour technique [2] one can show that these objects satisfy the quantum Serre relations of the lower Borel subalgebra of the associated quantum affine algebra with simple roots $\alpha_k$. Using the structure of the reduced universal R-matrix (see, e.g., [27] and Appendix A) we can write $\bar R = K^{-1}R = \bar R(\bar e_{\alpha_i}, \bar e_{-\alpha_i})$, where
\[
\bar e_{\alpha_i} = e_{\alpha_i} \otimes 1, \qquad \bar e_{-\alpha_i} = 1 \otimes e_{-\alpha_i}, \tag{25}
\]
$R$ is a universal R-matrix and $K$ depends on the elements from the Cartan subalgebra, because $\bar R$ is represented as a power series of these elements. Then, following [5] and using the fundamental feature of the universal R-matrix:
\[
(I \otimes \Delta) R = R^{13} R^{12}, \tag{26}
\]
one can show that the reduced R-matrix has the following property:

= R ei , e
R ei , e
,
+ e
R ei , e
i
i
i
i
(27)
where

e
+ e
= (I )(1 ei ),
i
i

= 1 q hi ei ,
e
i

e
= 1 ei 1,
i
ei = ei 1 1.
(28)

e
e = q bij e
e .
i j
j i
(29)
Their commutation relations are

e
e = ej e
,
i j
i

e
e = ej e
,
i j
i
Now, denoting by $\bar L^{(q)}(u_2, u_1)$ the reduced R-matrix with $e_{-\alpha_i}$ represented by $V_{\alpha_i}(u_2, u_1)$, and using the above property of $\bar R$ with the $e_{-\alpha_i}$ replaced by the appropriate vertex operators, we find:
\[
\bar L^{(q)}(u_3, u_1) = \bar L^{(q)}(u_3, u_2)\, \bar L^{(q)}(u_2, u_1), \qquad u_3 \ge u_2 \ge u_1. \tag{30}
\]
Hence, $\bar L^{(q)}$ has the property of a P-exponent. When the interval $\Delta = [u_2, u_1]$ is small enough one can show that
\[
\bar L^{(q)}(u_2, u_1) = 1 + \int_{u_1}^{u_2} du \sum_{i=0}^{r} V_{\alpha_i}(u)\, e_{\alpha_i} + O(\Delta^2). \tag{31}
\]
That is, we obtain that
\[
\bar L^{(q)}(u_2, u_1) = \mathrm{Pexp} \int_{u_1}^{u_2} du \sum_{i=0}^{r} V_{\alpha_i}(u)\, e_{\alpha_i} \tag{32}
\]
and $L^{(q)} = e^{i\pi H^i P^i} \bar L^{(q)}(2\pi, 0)$ satisfies the RTT relation by construction.
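The multiplication property (30) and the short-interval expansion (31) together characterise a path-ordered exponential. The sketch below illustrates this with ordinary 2x2 matrices: the ordered exponential is approximated as a product of short-interval factors $1 + \Delta\, A(u)$, later points standing to the left, and the result over $[0, 3]$ is compared with the product of the results over $[1, 3]$ and $[0, 1]$. The matrix function `connection` is a hypothetical stand-in for the operator-valued integrand and is not taken from the paper.

```python
import math

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def connection(u):
    """Hypothetical matrix-valued integrand A(u); A(u) and A(u') do not commute."""
    return [[0.0, math.cos(u)], [math.sin(u), 0.0]]

def ordered_exponential(u_start, u_end, steps):
    """Approximate Pexp of the integral of A(u) from u_start to u_end.

    Each short interval contributes a factor (1 + du * A(u_mid)), as in (31);
    factors for later u are multiplied from the left, as required by (30).
    """
    du = (u_end - u_start) / steps
    result = IDENTITY
    for m in range(steps):
        a = connection(u_start + (m + 0.5) * du)
        factor = [[IDENTITY[i][j] + du * a[i][j] for j in range(2)] for i in range(2)]
        result = matmul(factor, result)
    return result

if __name__ == "__main__":
    whole = ordered_exponential(0.0, 3.0, 3000)
    split = matmul(ordered_exponential(1.0, 3.0, 2000), ordered_exponential(0.0, 1.0, 1000))
    print(whole)
    print(split)
```

In the quantum case the factors are operator valued and the short-interval expansion needs the convergence condition on $\beta^2$ quoted above, so this is only an illustration of the ordering logic.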
3. Quantum P-exponential and Toda–mKdV hierarchies based on superalgebras

Now let us generalize the above results to the case when the underlying algebraic structures and integrable hierarchies are related to an affine Lie superalgebra. In the previous part we moved from the classical theory to the quantum one; here we will go in the opposite direction, moving from the quantum version of the monodromy matrix and the related auxiliary L-operators, satisfying the RTT-relation, to their classical counterparts.
First, let us introduce two types of vertex operators, bosonic and fermionic ones:

i
WFi (u) d :e(i ,(u,)) : = i , (u) :e(i ,(u)) :,
(33)
2

WBi (u)
d :e(i ,(u,)) : = :e(i ,(u)) :,
(34)
where k are the superfields:

i
k (u, ) = k (u) k (u),
2
and $\theta$ is a Grassmann variable. Their commutation relations on a circle are:
\[
W_s^{\alpha_i}(u)\, W_{s'}^{\alpha_k}(u') = (-1)^{p(s)p(s')} q^{b_{ik}}\, W_{s'}^{\alpha_k}(u')\, W_s^{\alpha_i}(u), \qquad u > u', \tag{35}
\]
where $b_{kj}$ is the symmetrized Cartan matrix for the corresponding affine Lie superalgebra, $s$, $s'$ are $B$, $F$ and $p(F) = 1$, $p(B) = 0$.
The mode expansion for the bosonic fields is the same as in (10) and for fermionic fields
k (u) is the following:

k l
l (u) = i 1/2
(36)
nl einu ,
n , m = 2 kl n+m,0 .
n
These fermion fields may satisfy two boundary conditions, periodic and antiperiodic, $\xi^i(u + 2\pi) = \pm\xi^i(u)$, corresponding to the two sectors of the (S)CFT: Ramond (R) and Neveu–Schwarz (NS) (the supersymmetry operator appears only when all fermions are in the R sector).
It can be shown that the integrals of the introduced vertex operators, as in the bosonic case, satisfy the Serre and non-Serre relations (see, e.g., [25–27]) for the lower Borel subalgebra:
1a

(q) kj
[er , es ]q , [er , ep ]q q = 0,
adek ej = 0,
(37)
where the $\mathrm{ad}^{(q)}$ operator is defined in the following way: $\mathrm{ad}^{(q)}_{e_\alpha} e_\beta = e_\alpha e_\beta - (-1)^{p(\alpha)p(\beta)} q^{(\alpha,\beta)} e_\beta e_\alpha$, and $\alpha_r$ is a so-called grey root (for more details see Appendix A) of the quantum affine superalgebra with the corresponding bosonic and fermionic roots $\alpha_i$.
Substituting them, with the appropriate multiplier $(q - q^{-1})^{-1}$, in the reduced R-matrix one can find (easily generalizing the results of Section 2 to the case of a superalgebra) that it satisfies the P-exponential multiplication property:
\[
\bar L^{(q)}(u_2, u_1) = \mathrm{Pexp}^{(q)} \int_{u_1}^{u_2} du \Big( \sum_f W_F^{\alpha_f}(u)\, e_{\alpha_f} + \sum_b W_B^{\alpha_b}(u)\, e_{\alpha_b} \Big),
\]
\[
\bar L^{(q)}(u_3, u_1) = \bar L^{(q)}(u_3, u_2)\, \bar L^{(q)}(u_2, u_1), \qquad u_3 \ge u_2 \ge u_1, \tag{38}
\]
where the indices $f$ and $b$ imply that we are summing over fermionic and bosonic simple roots. The letter $q$ over the Pexp means that the object introduced above in some cases (more precisely, when the number of fermionic roots is more than one) cannot be written as a P-exponential for any value of the deformation parameter, due to the singularities in the operator products generated by the fermion fields $\xi^i$. Thus we call this object the quantum P-exponential.
Defining then $L^{(q)} \equiv e^{i\pi p^i H^i} \bar L^{(q)}(2\pi, 0)$ we find (similarly to the purely bosonic case) that it satisfies the RTT relation (21), and defining the monodromy matrix $M^{(q)} \equiv e^{i\pi p^i H^i} L^{(q)}$ we again arrive at the property (22) and obtain again the quantum integrability condition (23) for $t^{(q)} = \mathrm{str}\, M^{(q)}$. We mention here that the relation (22) can be rewritten in a more universal way, as a specialization of the reflection equation [19]:

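\[
\tilde R_{12}(\lambda\mu^{-1})\, M_1^{(q)}(\lambda)\, F_{12}^{-1}\, M_2^{(q)}(\mu) = M_2^{(q)}(\mu)\, F_{12}^{-1}\, M_1^{(q)}(\lambda)\, R_{12}(\lambda\mu^{-1}), \tag{39}
\]
where $F = K^{-1}$ is the Cartan factor from the universal R-matrix (see Appendix A), and $\tilde R_{12}(\lambda\mu^{-1}) = F_{12}^{-1} R_{12}(\lambda\mu^{-1}) F_{12}$.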
Now let us analyse the classical limit of the defined objects. We will use the P-exponential property of $\bar L^{(q)}(2\pi, 0)$. Let us decompose $\bar L^{(q)}(2\pi, 0)$ in the following way:
\[
\bar L^{(q)}(2\pi, 0) = \lim_{N \to \infty} \prod_{m=1}^{N} \bar L^{(q)}(x_m, x_{m-1}), \tag{40}
\]
where the factors with larger $m$ stand to the left and we divided the interval $[0, 2\pi]$ into infinitesimal intervals $[x_m, x_{m+1}]$ with $x_{m+1} - x_m = \Delta = 2\pi/N$. Let us find the terms that can give a contribution of the first order in $\Delta$ in $\bar L^{(q)}(x_m, x_{m-1})$. In this analysis one needs the operator product expansion of fermion fields and vertex operators:
k (u) l (u ) =

i 2 kl
cpkl (u)(iu iu )p ,
+

(iu iu )
p=0
:e
(k ,(u))
::e
(l ,(u ))
= (iu iu )
(k ,l ) 2
2
:
:e(k +l ,(u)) : +

dpkl (u)(iu iu )p ,
(41)
p=1
where cpkl (u) and dpkl (u) are operator-valued functions of u. Now one can see that only two
(q) (xm1 , xm ) when q 1. The
types of terms can give the contribution of the order in L
first type consists of operators of the first order in Wi and the second type is formed by
2
the operators, quadratic in Wi , which give contribution of the order 1 by virtue of
operator product expansion. Let us look on the terms of the second type in detail.
The terms of the second type appear from the quadratic products of vertex operators
arising from:
(i) the composite roots (more precisely q-commutators of two fermionic roots),
(ii) the quadratic terms of the q-exponentials which are present in the universal R-matrix.
At first we consider terms emerging from composite roots, which have the following
form (see Appendix A):
xm
xm
1
[e , ei ]q 1
du1 Wi (u1 i0)
du2 Wj (u2 + i0)
a(i + j )(q q 1 ) j
xm1
2
+ q bij
2
du2 Wj (u2 i0)
du1 Wi (u1 + i0) .

0
xm1
(42)
Using the fact that $q = e^{i\pi\beta^2/2}$ and that in the limit $\beta^2 \to 0$, $a(\alpha_i + \alpha_j) \to b_{ij}$, one can rewrite this as follows (leaving only terms that can contribute at first order in $\Delta$):
1
(2i)
xm
[ej , ei ]
xm
du1
xm1
du2
xm1

1
1
:e(i +j ,(u2 )) :.
u2 u1 + i0 u2 u1 i0
(43)
Now using the well-known formula
\[
\frac{1}{x + i0} - \frac{1}{x - i0} = -2\pi i\, \delta(x),
\]
we obtain that (42) in the classical limit gives
xm
[ej , ei ]
(44)
du :e(i +j ,(u)) :.
(45)
xm1
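The distributional identity $\frac{1}{x+i0} - \frac{1}{x-i0} = -2\pi i\,\delta(x)$ used in this step can be checked numerically by keeping a small but finite regulator $\epsilon$ in place of $i0$ and integrating against a smooth test function. The sketch below does this with a Gaussian test function, which is an arbitrary choice made for illustration; the substitution $x = \epsilon \tan\vartheta$ is used only to make the numerical integral well behaved.

```python
import math

def delta_weight(test_function, eps, steps=100000):
    """Integral over the real line of f(x) * [1/(x + i*eps) - 1/(x - i*eps)].

    The bracket equals -2i*eps/(x^2 + eps^2). With x = eps*tan(theta) the
    measure eps*dx/(x^2 + eps^2) becomes d(theta) on (-pi/2, pi/2), which is
    integrated with the midpoint rule.
    """
    dtheta = math.pi / steps
    total = 0.0
    for m in range(steps):
        theta = -0.5 * math.pi + (m + 0.5) * dtheta
        total += test_function(eps * math.tan(theta)) * dtheta
    return -2j * total

if __name__ == "__main__":
    gauss = lambda x: math.exp(-x * x)  # test function with f(0) = 1
    for eps in (1e-1, 1e-2, 1e-3):
        print(eps, delta_weight(gauss, eps))
```

As $\epsilon$ decreases the result approaches $-2\pi i\, f(0)$, which is the statement of the formula.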
Next let us consider the quadratic products arising from quadratic parts of q-exponentials
of fermionic roots. They look as follows:
1
(2)q1
i
xm
xm
du1 Wi (u1 i0)
xm1
du2 Wi (u2 + i0)e2 i .
(46)
xm1
One can rewrite this product via the ordered integrals:

q bii 1
(2)q1
i
xm
u1
du2 Wi (u2 )e2 i .
du1 Wi (u1 )
xm1
(47)
xm1
0 we obtain (forgetting about the terms that could give contribution of

In the limit
the order 2 ):
2
ibii 2
xm
u1
du2 (iu1 iu2 )
du1
xm1
bii 2
2 1
e2(i ,(u2 )) e2 i .
(48)
xm1
Therefore, the final contribution is:

xm
du e2(i ,(u)) e2 i .
(49)
xm1
Collecting now all the terms of order we find:

(q) (xm , xm1 ) = 1 +
L
xm
xm1
f1 f2

du
WFf (u)ef +
WBb (u)eb
f
[ef1 , ef2 ]WBf
+f2
(u) + O 2 .
(50)
(q) (xm , xm1 ) it is easy to see that in the q 1 limit L

(q) (xm , xm1 ) is
Gathering the L
equal to:
(cl)
2
(2, 0) = Pexp
du
WFf (u)ef +

f1 f2
WBb (u)eb
[ef1 , ef2 ]WBf +f (u)

1
2
(51)
(cl) (2, 0) we find that it satisfies the quadratic Poisson

Defining then L(cl) eip H L
i i
bracket relation (7) and defining the monodromy matrix M(cl) eip H L(cl) we again
obtain the classical integrability condition (9).
Now let us find the L-operator which corresponds to the monodromy matrix defined
above. Let us consider the following one:

i
i
LF = Du, Du, (u, )H
(52)
ef +
eb ,
i
are the classical superfields with the

where Du, = + u is a superderivative and
following Poisson brackets:

Du, i (u, ), Du , j (u , ) = ij Du, (u u )( ) .
(53)
i
Making a gauge transformation of the above L-operator one can arrive to the fields, satisfying classical version of super W-algebras with the commutation relations provided by
the Poisson brackets [13,14].
The associated fermionic linear problem can be reduced to the bosonic one. The
linear problem
LF (u, ) = (Du, + N1 + N0 )( + ),
(54)
where
(u, ) = + ,

i
ef ,
N1 = i (u)H i
2
f
N0 = u i (u)H i
eb ,
can be reduced to the linear problem on :

LB (u) = u + N12 + N0 (u).
That is:

i
LB = u u (u)H + i (u)H i
ef
2
f
i
(55)
2
eb .
(56)
One can easily see that the monodromy matrix for the corresponding linear problem is that
described above.
4. Integrals of motion and supersymmetry invariance

It is well known that (both classical and quantum) integrability conditions lead to the
involutive family of (both local and nonlocal) integrals of motion (IM). For super-versions
of these systems it is also known that sometimes it is possible to include supersymmetry
generator
r

1/2
l
l
l
2i
du (u) (u) =
2 nl an
2
G0 =
(57)
l=0 nZ
in these series [24]. Here we will show that the transfer matrix t(q) () = str M(q) () commute with G0 if the simple root system is purely fermionic, that is:

(q)
() = str () e
2iP k H k
2
Pexp

du
r

WFf (u)ef
(58)
f =0
where denote some representation of the corresponding superalgebra in which the supertrace is taken.
We note the crucial property:

G0 , WF (u) = u WB (u).
(59)
Integrating over u and multiplying by the appropriate coefficient one obtains:
[G0 , ei ] =
WFi (0) WFi (2)

q q 1
(60)
where ei is represented by the vertex operator (see previous section). Next, we will use
the important proposition (Proposition 1, Section 3.1 of [5]): for the objects Ai , Bi , I ,
satisfying the commutation relations
[I, ei ] =
Ai Bi
,
q q 1
Ai ej = q bij ej Ai ,
the following relation holds:

= R
[1 I, R]
ei Ai
ei Bi R.
i
Bi ej = q bij ej Bi , (61)
(62)
Applying this relations to our case we find (identifying Ai with WFi (0), Bi with WFi (2)
and I with G0 ):

(q) (2, 0) = L
(q) (2, 0)W B (0) W B (2)L
(q) (2, 0),
G0 , L
(63)
where
W (u) =
B
r

i=0
WBi (u)ei .
(64)
Now using the property (35), periodicity properties of vertex operators:

Ws (u + 2) = q (,) e2i(,P ) Wsi (u + 2)
(65)
(here s = B, F ) and cyclic property of the supertrace one obtains, multiplying both sides
k k
of (63) by e2iP H and taking the supertrace:

G0 , t(q) = 0.
(66)
We note here that if there were bosonic simple roots in the construction of the transfer matrix, the above reasoning would no longer be valid, because the commutators of the corresponding bosonic vertex operators with G0 give the fermionic vertex operators associated with the same root vector (the same happens when one constructs the superstring vertex operators [29]), and not a total derivative as in (59). It has already been shown explicitly in concrete examples that the hierarchies based on partly bosonic simple root systems are not invariant under the supersymmetry transformation (see, e.g., [10,11,24]).
The affine superalgebras which allow such root systems are of the following type [12]:
$A(m, m)^{(1)} = sl(m + 1, m + 1)^{(1)}$,
$A(2m, 2m)^{(4)} = sl(2m + 1, 2m + 1)^{(4)}$,
$A(2m + 1, 2m + 1)^{(2)} = sl(2m + 2, 2m + 2)^{(2)}$,
$A(2m + 1, 2m)^{(2)} = sl(2m + 2, 2m + 1)^{(2)}$,
$D(m + 1, m)^{(1)} = osp(2m + 2, 2m)^{(1)}$,
$B(m, m)^{(1)} = osp(2m + 1, 2m)^{(1)}$,
$D(m, m)^{(2)} = osp(2m, 2m)^{(2)}$,
$D(2, 1; \alpha)^{(1)}$.
The involutive family of the (both classical and quantum) IM in the Toda field theories
have the property, that the commutators of IM with the corresponding vertex operators
reduce to the total derivatives [2]:

(l)
Il , Wk (u) = u :O(l)k (u)Wk (u): = u k (u),
(67)
(k)
where O
(u) is the polynomial of u i (u), i (u) and their derivatives. In [5] it was shown
in the bosonic case (the proof is similar to the arguments above) that Il commute with the
transfer matrix, the generalization of these arguments to the super-case is straightforward.
In the next two sections we will give two examples of the KdV hierarchies related to the affine superalgebra $B(0, 1)^{(1)} \cong osp(1|2)^{(1)}$ (super-KdV) and the twisted affine superalgebra $D(1, 1)^{(2)} \cong C(2)^{(2)} \cong sl(2|1)^{(2)} \cong osp(2|2)^{(2)}$ (SUSY N = 1 KdV).
5. Example 1: super-KdV hierarchy

The super-KdV model [30,31] is based on the following L-operator:
Du, + Du, (u, )h0 (e1 + e0 ),
(68)
where h0 , e1 , e0 are the Chevalley generators of the upper Borel subalgebra of the
osp(1|2)(1) which are taken in the evaluation representation that is h0 = h, e1 = iv+ ,
e0 = X , where X , v and h are the generators of osp(1|2) superalgebra with the
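following commutation relations:
\[ [h, X_\pm] = \pm 2X_\pm, \quad [h, v_\pm] = \pm v_\pm, \quad [v_\pm, v_\pm] = \pm 2X_\pm, \quad [v_+, v_-] = -h, \]
\[ [X_\pm, v_\mp] = v_\pm, \quad [X_+, X_-] = h, \quad [X_\pm, v_\pm] = 0. \tag{69} \]
Here $p(v_\pm) = 1$, $p(X_\pm) = 0$, $p(h) = 0$. The classical monodromy matrix is: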

M() = e
2iph0
2
Pexp
0

i
du (u)e(u) e1 e2(u) e2 1 + e2(u) e0 .
2
(70)
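The osp(1|2) relations (69) can be cross-checked in an explicit matrix representation. The sketch below assumes the sign conventions $[h, X_\pm] = \pm 2X_\pm$, $[h, v_\pm] = \pm v_\pm$, $[v_\pm, v_\pm] = \pm 2X_\pm$, $[v_+, v_-] = -h$, $[X_\pm, v_\mp] = v_\pm$, $[X_+, X_-] = h$, $[X_\pm, v_\pm] = 0$, with the bracket of two odd generators understood as an anticommutator. The particular 3x3 matrices are one possible choice made for this illustration, not a representation taken from the text.

```python
def unit(i, j):
    """3x3 matrix unit E_ij (1-based indices)."""
    return [[1 if (r, c) == (i - 1, j - 1) else 0 for c in range(3)] for r in range(3)]


def add(a, b, sign=1):
    """Return a + sign*b for 3x3 matrices."""
    return [[a[r][c] + sign * b[r][c] for c in range(3)] for r in range(3)]


def mul(a, b):
    """Matrix product of two 3x3 matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)] for r in range(3)]


def scale(s, a):
    return [[s * a[r][c] for c in range(3)] for r in range(3)]


def comm(a, b):
    """Commutator ab - ba (used when at least one argument is even)."""
    return add(mul(a, b), mul(b, a), sign=-1)


def anti(a, b):
    """Anticommutator ab + ba (used for two odd generators)."""
    return add(mul(a, b), mul(b, a))


# Even generators act on the two even basis vectors, odd ones mix in the third.
H = add(unit(1, 1), unit(2, 2), sign=-1)
XP, XM = unit(1, 2), unit(2, 1)
VP = add(unit(1, 3), unit(3, 2))
VM = add(unit(2, 3), unit(3, 1), sign=-1)
ZERO = scale(0, H)


def relations_hold():
    """Check every relation of the assumed form of Eq. (69)."""
    return all([
        comm(H, XP) == scale(2, XP), comm(H, XM) == scale(-2, XM),
        comm(H, VP) == VP, comm(H, VM) == scale(-1, VM),
        anti(VP, VP) == scale(2, XP), anti(VM, VM) == scale(-2, XM),
        anti(VP, VM) == scale(-1, H),
        comm(XP, VM) == VP, comm(XM, VP) == VM,
        comm(XP, XM) == H,
        comm(XP, VP) == ZERO, comm(XM, VM) == ZERO,
    ])


if __name__ == "__main__":
    print("relations (69) satisfied:", relations_hold())
```

Working with integer matrices keeps all comparisons exact, so no numerical tolerance is needed.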
The involutive family of the integrals of motion which can be extracted from this monodromy matrix (more precisely they arise as a coefficients in the expansion in 1 series
of the trace of the logarithm of M-matrix) can be expressed via the following fields:
1
U (u) = (u) 2 (u) (u) (u),
(71)
(u) = (u) + (u) (u)
2
generating the classical limit of the superconformal algebra under the Poisson brackets:

U (u), U (v) = (u v) + 2U (u)(u v) + 4U (u) (u v),

U (u), (v) = 3(u) (u v) + (u)(u v),

(u), (v) = 2 (u v) + 2U (u)(u v).
(72)
The integrals of motion are:

(cl)
I1 = U (u) du,

2
U (u)
(cl)
I3 =
+ (u) (u) du,
2

2
U (u) 2U 3 (u) + 8 (u) (u) + 12 (u)(u)U (u) du,
I5(cl) =
..
.
(73)
The second one
(cl)
I3
gives the super-KdV equation:
Ut = Uuuu 6U Uu 6uu ,
(74)
t = 4uuu 6U u 3Uu .
2
However, the supersymmetry operator G0 = 0 du (u) cannot be included in the pair(cl)
wise commuting IM that can be easily seen from the second equation of (74) (i.e., I3 does
not commute with G0 ). Moving to the quantum case we find that the quantum analogue of
the monodromy matrix is:
(q)
() = e
2iP h0
2
Pexp
0

i
(u)
2(u)
du (u):e
:e1 + :e
:e0 ,
2
(75)
where h0 , e0 , e1 are now the Chevalley generators of the ospq (1|2)(1) . We did not put
the letter q over the P-exponential because this is the case when for values of 2 from
the interval (0, 2) one can write the above object as a real P-exponential (represented
via ordered integrals). Due to the presence of bosonic root e0 the trace of the quantum
monodromy matrix is not invariant under the supersymmetry transformation as it was in
the classical case.
6. Example 2: SUSY N = 1 KdV hierarchy
The L-operator corresponding to the SUSY N = 1 KdV model [32,33] is the following
one:
LF = Du, Du, (u, )h1 (e0 + e1 ),
(76)
where h1 , e0 , e1 are the Chevalley generators of the twisted affine Lie superalgebra
C(2)(2) with such set of commutation relations:
[h1 , h0 ] = 0,
[h0 , e1 ] = e1 ,
[hi , ei ] = ei ,
ad3e e1 = 0,
0
[h1 , e0 ] = e0 ,
[ei , ej ] = i,j hi
(i, j = 0, 1),
ad3e e0 = 0.
1
(77)
Here p(h0,1 ) = 0, p(e0,1 ) = 1, i.e., both simple roots are fermionic. The classical monodromy matrix is:

i
i
Pexp du (u)e(u) e1 (u)e(u) e0
M=e
2
2
0

e2 1 e2(u) e2 0 e2(u) [e1 , e0 ] .
2
2iph1
The series of the integrals of motion starts with the following ones:

1
(cl)
I1 =
U (u) du,
2

1
(u) (u)
(cl)
U 2 (u) +
du,
I3 =
2
2

1
(U )2 (u) (u) (u)
(cl)
U 3 (u)
I5 =
(u)(u)U (u) du,

2
2
4
..
.
(78)
(79)
The fields U and are defined in terms of the free fields as in previous section, but now
one can unify them into one superfield:
i(u)
3
(u, ) = U (u) .
U(u, ) Du, (u, )u (u, ) Du,
2
(80)
The second IM from (79) generates the first nontrivial evolution equation, the SUSY N = 1
KdV:
Ut = Uuuu + 3(UDu, U)u ,
(81)
or in components:
3
Ut = Uuuu 6U Uu uu ,
2
t = 4uuu 3(U )u .
(82)
2
Now one can see that unlike the previous example the supersymmetry generator 0 du (u)
commutes with I3(cl) and can be included in the involutive family of IM [24]. Moreover, the
results of Section 2 yield that the quantum monodromy matrix have the following form:
2
M=e
2iP h1
(q)
Pexp

du W (u)e1 + W+ (u)e0 ,
(83)
where W = d :e(u,) :. Due to the fact that both roots are fermionic the supersymmetry generator commutes with the transfer matrix.
This SUSY N = 1 KdV model was studied from the point of view of the Quantum Inverse Scattering Method in [22]. There, analogues of Baxter's Q-operator were constructed, satisfying the following functional relations with the transfer matrices:

t1/4 ()Q () = Q q 1/2 Q q 1/2 ,

t1/2 q 1/4 Q () = t1/4 q 1/2 Q q 1/2 + Q (q),
(84)
and the fusion relations between transfer-matrices in different representations:

tj q 1/4 tj q 1/4 = tj + 1 ()tj 1 () + (1)4j ,
4
(85)
which for the case when q is a root of unity can be transformed into the Thermodynamic
Bethe Ansatz equations of D2N type [35].
It should be noted also that the associated Toda field theory is a well-known N = 1
SUSY sinh-Gordon model [33,34] with the action:

1
2
(86)
d2 u d2 Du, D u,
+ m cosh() .
2
Acknowledgements
We are grateful to Professors F.A. Smirnov and M.A. Semenov-Tian-Shansky for useful
discussions. The work was supported by the CRDF (Grant No. RUMI-2622-ST-04) and the
Dynasty Foundation.
Appendix A
The affine Lie superalgebra in has the following commutation relations between its
Chevalley generators (H i forms a basis in the Cartan subalgebra of the underlying simple
Lie superalgebra and ek are the generators associated with positive and negative simple
roots of the whole affine algebra) [27]:

i
1a
[ek , el ] = kl [hk ]q ,
adekkj ej = 0,
H , ek = ki ek ,

[er , es ]q , [er , ep ]q q = 0,
(A.1)
if (r , r ) = (s , p ) = (r , s + p ) = 0, in this case it is usually said that r is a grey
root, which is between two roots s , p on the Dynkin diagram [25,26]. The definition of
the super-q-commutator is:
(q)
ade e = e e (1)p()p() q (,) e e ,
(A.2)
where p() is equal to 1 when is a fermionic root, and to 0 if is a bosonic root. The
universal R-matrix for the contragradient Lie superalgebra of finite growth (affine algebra
as a particular case) has the following structure:

R ,
R = K R = K
(A.3)
+
where R is a reduced R-matrix and R are defined by the formulae:

1
R = expq1 (1)p() q q 1 a() (e e )
(A.4)
for real roots and

mult

(j
)
(i)
cij (n)en en
Rn = exp (1)p(n) q q 1
(A.5)
i,j
for pure imaginary roots. Here + is the reduced positive root system (the bosonic roots
which are two times fermionic roots are excluded). The generators corresponding to the
composite roots are defined according to the construction of the CartanWeyl basis given
in [27]. For example, the generators of the type ef1 f2 are constructed by means of the
following q-commutators:
ef1 +f2 = [ef2 , ef1 ]q 1 ,
ef1 f2 = [ef1 , ef2 ]q .
(A.6)
The a() coefficients are defined as follows:

[e , e ] = a( )
k k1
q q 1
(A.7)
We will need the values of a( ) when is equal to f1 + f2 , where f1 and f2 are

fermionic simple roots:
a(f1 + f2 ) =
q bf1 f2 q bf1 f2
.
q q 1
(A.8)
The q-exponentials in (A.3) are defined in the usual way:

expq (x) = 1 + x +
xn
xn
x2
+ +
+ =
,
(2)q !
(n)q !
(n)q !
n0
qa 1
,
(a)q
q 1
q (1)p() q (,) .
(A.9)
References
[1] A.E. Arinshtein, V.A. Fateev, A.B. Zamolodchikov, Phys. Lett. B 87 (1979) 389;
A.V. Mikhailov, M.A. Olshanetsky, A.M. Perelomov, Commun. Math. Phys. 79 (1981) 473;
V.G. Drinfeld, V.V. Sokolov, J. Sov. Math. 30 (1985) 1975;
E. Corrigan, hep-th/9412213.
[2] B. Feigin, E. Frenkel, in: Lecture Notes in Mathematics, vol. 1620, Springer, Berlin, 1996, p. 349.
[3] V.A. Fateev, S.L. Lukyanov, Int. J. Mod. Phys. A 7 (1992) 853;
V.A. Fateev, S.L. Lukyanov, Int. J. Mod. Phys. A 7 (1992) 1325.
[4] V.V. Bazhanov, S.L. Lukyanov, A.B. Zamolodchikov, Commun. Math. Phys. 177 (1996) 381;
V.V. Bazhanov, S.L. Lukyanov, A.B. Zamolodchikov, Commun. Math. Phys. 190 (1997) 247;
V.V. Bazhanov, S.L. Lukyanov, A.B. Zamolodchikov, Commun. Math. Phys. 200 (1999) 297.
[5] V.V. Bazhanov, A.N. Hibberd, S.M. Khoroshkin, Nucl. Phys. B 622 (2002) 475.
[6] E. Frenkel, math.QA/0305216.
[7] L.D. Faddeev, L.A. Takhtajan, Hamiltonian Methods in the Theory of Solitons, Springer, Berlin, 1987.
[8] V.E. Korepin, N.M. Bogoliubov, A.G. Izergin, Quantum Inverse Scattering Method and Correlation Functions, Cambridge Univ. Press, Cambridge, 1993.
[9] M.A. Olshanetsky, Commun. Math. Phys. 88 (1983) 63.
[10] A. Gualzetti, S. Penati, D. Zanon, Nucl. Phys. B 398 (1993) 622.
[11] J.M. Evans, J.O. Madsen, Nucl. Phys. B 503 (1997) 715.
[12] V.G. Kac, Adv. Math. 26 (1977) 8.
[13] L.A. Ferreira, J.F. Gomes, R.M. Ricotta, A.H. Zimerman, Int. J. Mod. Phys. A 7 (1992) 7713.
[14] F. Delduc, L. Gallot, solv-int/9802013.
[15] L.D. Faddeev, in: A. Connes, et al. (Eds.), Quantum Symmetries/Symmetries Quantiques, North-Holland,
Amsterdam, 1998, p. 149, hep-th/9605187.
[16] P.P. Kulish, E.K. Sklyanin, in: Lecture Notes in Physics, vol. 151, Springer, New York, 1982, p. 61.
[17] L.D. Faddeev, A.Yu. Volkov, Phys. Lett. B 315 (1993) 311.
[18] A.Yu. Volkov, Lett. Math. Phys. 39 (1997) 313.
[19] P.P. Kulish, E.K. Sklyanin, J. Phys. A 25 (1992) 5963;
P.P. Kulish, R. Sasaki, Prog. Theor. Phys. 89 (1993) 741.
[20] P.P. Kulish, A.M. Zeitlin, Phys. Lett. B 581 (2004) 125, hep-th/0312159;
P.P. Kulish, A.M. Zeitlin, Theor. Math. Phys. 142 (2005) 211, hep-th/0501018.
[21] P.P. Kulish, A.M. Zeitlin, Phys. Lett. B 597 (2004) 229, hep-th/0407154.
[22] P.P. Kulish, A.M. Zeitlin, Nucl. Phys. B 709 (2005) 578, hep-th/0501019;
A.M. Zeitlin, in: String Theory: From Gauge Interactions to Cosmology, NATO Advanced Study Institute,
Proceedings of Cargese Summer School, 2004, NATO Science Series C, 2005, p. 578, hep-th/0501150.
[23] A.A. Belavin, V.G. Drinfeld, Funct. Anal. Appl. 16 (1982) 1.
[24] A. Bilal, J.-L. Gervais, Phys. Lett. B 211 (1988) 95.
[25] M. Scheunert, J. Math. Phys. 28 (1987) 1180.
[26] R. Floreanini, D.A. Leites, L. Vinet, Lett. Math. Phys. 23 (1991) 127.
[27] S.M. Khoroshkin, V.N. Tolstoy, hep-th/9404036.
[28] V.G. Drinfeld, in: Proceedings of the International Congress of Mathematics, Berkeley, 1986, Academic
Press, San Diego, CA, 1987, p. 798.
[29] M.B. Green, J.H. Schwarz, E. Witten, Superstring Theory, vol. 1: Introduction, Cambridge Univ. Press,
Cambridge, 1987.
[30] B.A. Kupershmidt, Phys. Lett. A 102 (1984) 213.
[31] P.P. Kulish, Zap. Nauchn. Sem. LOMI 155 (1986) 142.
[32] P. Mathieu, Phys. Lett. B 203 (1988) 287.
[33] T. Inami, H. Kanno, Commun. Math. Phys. 136 (1991) 519.
[34] P.P. Kulish, S.A. Tsyplyaev, Theor. Math. Phys. 46 (1981) 114.
[35] Al.B. Zamolodchikov, Phys. Lett. B 253 (1991) 391.
Quantum field theories and critical phenomena on defects
Davide Fichera, Mihail Mintchev, Ettore Vicari
Dipartimento di Fisica dell'Università di Pisa, Pisa, Italy
Istituto Nazionale di Fisica Nucleare, Sezione di Pisa, Largo Pontecorvo 3, 56127 Pisa, Italy
Received 14 February 2005; received in revised form 29 April 2005; accepted 13 May 2005
Abstract
We construct and investigate quantum fields induced on a d-dimensional dissipationless defect by
bulk fields propagating in a (d + 1)-dimensional space. All interactions are localized on the defect.
We derive a unitary noncanonical quantum field theory on the defect, which is analyzed both in the
continuum and on the lattice. The universal critical behavior of the underlying system is determined.
It turns out that the O(N)-symmetric $\varphi^4$ theory, induced on the defect by massless bulk fields, belongs to the universality class of particular d-dimensional spin models with long-range interactions. On the other hand, in the presence of a bulk mass the critical behavior crosses over to that of d-dimensional spin models with short-range interactions.
PACS: 11.10.Kk; 05.10.Cc; 05.70.Jk
Keywords: Quantum field theories; Defects; Extra dimensions; Critical phenomena
1. Introduction
Canonical Lagrangian quantum field theory (QFT) dominates our present understanding of elementary particle physics. It is well known however, that in principle one can
E-mail addresses: fichera@sns.it (D. Fichera), mintchev@df.unipi.it (M. Mintchev), vicari@df.unipi.it
(E. Vicari).
doi:10.1016/j.nuclphysb.2005.05.018
D. Fichera et al. / Nuclear Physics B 720 [FS] (2005) 307324
formulate consistent QFTs without necessarily referring to a Lagrangian and the related
canonical formalism. Although perhaps less attractive in the context of elementary particles, this possibility is relevant in the study of critical phenomena. The enormous progress
in conformal field theory (CFT) in the past two decades has shown in fact (see, e.g., [1])
that only a small number of such models originate from a canonical Lagrangian.
Inspired by the constant advance in using extra dimensions in QFT, we propose a new
class of noncanonical QFTs in d spacetime dimensions, induced by canonical fields in
d + 1 dimensions. The main steps of our construction can be summarized as follows.
One starts with a conventional Lagrangian QFT with a local action $A[\varphi_i]$ for the fields $\{\varphi_i\}$ defined on a (d + 1)-dimensional manifold (bulk space) M. Then one considers a d-dimensional submanifold $D \subset M$, which can be interpreted physically as a defect (impurity) in the bulk. We denote by (x, y) the coordinates of a generic point of M and assume that the defect D is recovered for $y \to 0$. The bulk fields $\{\varphi_i(x, y)\}$ evolve according to the Euler-Lagrange equations following from the bulk action $A_M[\varphi_i]$, supplemented by initial conditions in M and boundary conditions on D. The quantum field $\phi_i(x)$ induced by $\varphi_i(x, y)$ on D is obtained by performing, in an appropriate way, the limit $y \to 0$ in $\varphi_i(x, y)$. In spite of the fact that the $\{\phi_i(x)\}$ are noncanonical, their correlation functions define a meaningful local QFT on the defect D. This theory, which is unitary provided that the defect does not dissipate, is the subject of our investigation in the present paper. The strategy summarized above consists of two steps. One works first in the bulk, applying standard techniques to the canonical action $A_M[\varphi_i]$. Afterwards, one projects the theory on the defect, deriving an effective, noncanonical action $A_D[\phi_i]$ on D. An essential aspect of this framework is the interactions, which are assumed to be localized on the defect D. This idea has already been explored [2] in the context of two-dimensional conformal field theory, where a specific exponential interaction localized at a point has been shown to explain the edge-state tunneling in the fractional quantum Hall effect.
The paper is organized as follows. In Section 2 we describe how free quantum fields
are induced on defects. We discuss the basic properties of the induced fields and derive
the effective action on the defect. In Section 3 we investigate the effects of interactions
localized on the defect. In particular, we consider the (d + 1)-dimensional theory for an N-component scalar field with O(N)-invariant $\varphi^4$ interaction localized on a d-dimensional defect. We show that the theory can also be formulated in terms of a d-dimensional O(N)
vector model defined on the defect and interacting with a (d + 1)-dimensional bulk free
field. Using general renormalization-group arguments, we discuss the main features of the
universal critical behavior described by the unitary quantum field theory induced on the
defect. In Section 4 we solve the theory in the large-N limit, both in the continuum and
on the lattice. We determine the critical behavior on the defect, and check the scenario put
forward in Section 3. Finally, Section 5 contains our conclusions.
2. Free scalar field induced on a $\delta$-defect

In order to fix the ideas, we consider the simplest example of a bulk scalar field $\varphi$ propagating in a (d + 1)-dimensional Minkowski space M, whose diagonal flat metric has signature $(+, -, \ldots, -)$. We adopt the coordinates $(x^0, \ldots, x^{d-1}, y) \in M$ and study a $\delta$-type impurity D localized on the d-dimensional Minkowski space determined by y = 0.

The dynamics is defined by the bulk action
\[ A_M[\varphi] = -\frac{1}{2} \int d^d x\, dy\, \varphi(x, y) \left[ \Box_x - \partial_y^2 + M^2 + 2\eta\,\delta(y) \right] \varphi(x, y), \tag{1} \]
where $\eta$ characterizes the interaction of $\varphi$ with the defect. The variation of (1) gives the equation of motion
\[ \left( \Box_x - \partial_y^2 + M^2 \right) \varphi(x, y) = 0, \qquad y \neq 0, \tag{2} \]
and the defect boundary conditions
\[ \varphi(x, +0) = \varphi(x, -0) \equiv \varphi(x, 0), \tag{3} \]
\[ \partial_y \varphi(x, +0) - \partial_y \varphi(x, -0) = 2\eta\, \varphi(x, 0). \tag{4} \]
The quantum field $\varphi$, satisfying Eqs. (2)-(4) and the conventional equal-time commutation relations, is unique and is fully determined by its two-point function. In performing the quantization, one must take into account that signals propagating in the bulk are both reflected and transmitted [3,4] by the impurity. The corresponding reflection and transmission coefficients in momentum space read
\[ R(p) = \frac{-i\eta}{p + i\eta}, \qquad T(p) = \frac{p}{p + i\eta}, \tag{5} \]
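p being the momentum conjugate to y. R and T satisfy the unitarity conditions
\[ \bar{T}(p) T(p) + \bar{R}(p) R(p) = 1, \tag{6} \]
\[ \bar{T}(p) R(p) + \bar{R}(p) T(p) = 0, \tag{7} \]
where the bar denotes complex conjugation. These conditions imply the absence of dissipation on the $\delta$-defect.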
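The unitarity conditions (6), (7) are easy to confirm numerically. The sketch below assumes the coefficients in the form $R(p) = -i\eta/(p + i\eta)$ and $T(p) = p/(p + i\eta)$, which is what the boundary conditions (3), (4) give for a plane wave scattering off the defect; the function names are hypothetical.

```python
def reflection(p, eta):
    """Reflection coefficient R(p) = -i*eta / (p + i*eta) of the delta-defect."""
    return -1j * eta / (p + 1j * eta)


def transmission(p, eta):
    """Transmission coefficient T(p) = p / (p + i*eta) of the delta-defect."""
    return p / (p + 1j * eta)


def unitarity_violations(p, eta):
    """Return the absolute violations of Eqs. (6), (7) and of continuity T = 1 + R."""
    r, t = reflection(p, eta), transmission(p, eta)
    diagonal = t.conjugate() * t + r.conjugate() * r - 1.0
    off_diagonal = t.conjugate() * r + r.conjugate() * t
    continuity = t - (1.0 + r)
    return abs(diagonal), abs(off_diagonal), abs(continuity)


if __name__ == "__main__":
    for p, eta in [(0.3, 1.0), (-2.0, 0.5), (5.0, -0.7)]:
        print(p, eta, unitarity_violations(p, eta))
```

The third quantity checks $T = 1 + R$, which is the continuity condition (3) applied to the scattering solution.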

The coefficients (5) deform the algebra of operators $\{a^{*}_{+}(k, p), a_{+}(k, p)\}$ and $\{a^{*}_{-}(k, p), a_{-}(k, p)\}$, which create and annihilate particles in the half-spaces y > 0 and y < 0, respectively. This deformation [5] is the main tool for quantizing Eqs. (2)-(4). It implements the defect boundary conditions given by Eqs. (3), (4), preserving the initial conditions defined by the conventional equal-time commutation relations for $\varphi$. Referring to [6] for the details, we report the two-point vacuum expectation value of $\varphi$. In the range $\eta \geq 0$ one gets
\[ \langle \varphi(x_1, y_1) \varphi(x_2, y_2) \rangle = \int_{-\infty}^{\infty} \frac{dp}{2\pi}\, E(p; y_1, y_2; \eta)\, W_{M^2 + p^2}(x_{12}). \tag{8} \]
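Here $x_{12} \equiv x_1 - x_2$,
\[ E(p; y_1, y_2; \eta) = \theta(y_1)\theta(-y_2)\, T(p)\, e^{ip y_{12}} + \theta(-y_1)\theta(y_2)\, T(p)\, e^{-ip y_{12}} + \theta(y_1)\theta(y_2) \left[ e^{ip y_{12}} + R(p)\, e^{ip \tilde{y}_{12}} \right] + \theta(-y_1)\theta(-y_2) \left[ e^{ip y_{12}} + R(p)\, e^{-ip \tilde{y}_{12}} \right] \tag{9} \]
with $y_{12} \equiv y_1 - y_2$, $\tilde{y}_{12} = y_1 + y_2$, and
\[ W_{m^2}(x) = \int \frac{d^d k}{(2\pi)^d}\, e^{-ikx}\, \theta(k^0)\, 2\pi\, \delta(k^2 - m^2) \tag{10} \]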
is the standard two-point vacuum expectation value of a free scalar field of mass m in d
spacetime dimensions.
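The quantum field $\phi(x)$, induced on the $\delta$-defect, is defined by the weak limit
\[ \phi(x) = \sqrt{2} \lim_{y \to 0} \varphi(x, y) \tag{11} \]
in the Hilbert space of $\varphi(x, y)$. The factor $\sqrt{2}$ is introduced for further convenience. The limit (11) exists [6] and after a straightforward change of variables in (8) one gets
\[ \langle \phi(x_1) \phi(x_2) \rangle = \int_{M^2}^{\infty} d\lambda^2\, \frac{\sqrt{\lambda^2 - M^2}}{\pi (\lambda^2 - M^2 + \eta^2)}\, W_{\lambda^2}(x_{12}). \tag{12} \]
Eq. (12) is a Källén-Lehmann spectral representation with density
\[ \varrho(\lambda^2) = \theta(\lambda^2 - M^2)\, \frac{\sqrt{\lambda^2 - M^2}}{\pi (\lambda^2 - M^2 + \eta^2)}, \tag{13} \]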
which gives rise to a well-defined generalized free field [7] on D. In the context of QFT with extra dimensions the induced field $\phi$ is a superposition of Kaluza-Klein (KK) modes with mass $\lambda$. The positive function $\varrho$ defines the KK measure. Since this measure is polynomially bounded at infinity, $\phi$ is a local quantum field on the defect.1 Obviously it does not satisfy equal-time canonical commutation relations and is therefore noncanonical. Moreover, $\phi$ has a continuum mass spectrum with mass gap M.
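From (12) one obtains the propagator
\[ \Delta(x_{12}) \equiv \langle T \phi(x_1) \phi(x_2) \rangle = \int_0^{\infty} d\lambda^2\, \varrho(\lambda^2)\, \Delta_{\lambda^2}(x_{12}), \tag{14} \]
where
\[ \Delta_{m^2}(x) = \frac{1}{i} \int \frac{d^d k}{(2\pi)^d}\, \frac{e^{-ikx}}{m^2 - k^2 - i\epsilon} \tag{15} \]
is the familiar propagator in d spacetime dimensions. The integral in (14) is easily computed and gives for the Fourier transform of the propagator the result
\[ \tilde{\Delta}(k) = \frac{1}{i}\, \frac{1}{\sqrt{M^2 - k^2 - i\epsilon} + \eta}. \tag{16} \]
1 The locality properties of induced quantum fields are investigated in [8].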
It is worth stressing that the poles, which are present for the single KK modes in the complex $k^0$-plane, give rise to a cut after the resummation.
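The resummation of the KK modes can be checked numerically in its Euclidean form. The sketch below assumes the spectral density $\varrho(\lambda^2) = \theta(\lambda^2 - M^2)\sqrt{\lambda^2 - M^2}/[\pi(\lambda^2 - M^2 + \eta^2)]$ with $\eta \geq 0$, and verifies that integrating $\varrho(\lambda^2)/(k^2 + \lambda^2)$ over the KK masses gives $1/(\sqrt{M^2 + k^2} + \eta)$ for Euclidean $k^2$. The change of variables and the function name are choices made for this illustration.

```python
import math


def resummed_kk_modes(k2, mass, eta, n=20000):
    """Integrate rho(lambda^2) / (k^2 + lambda^2) over lambda^2 >= M^2.

    With lambda^2 = M^2 + s^2 and s = tan(theta) the integral becomes
    (2/pi) * int_0^{pi/2} sin^2 / ((sin^2 + eta^2 cos^2)(sin^2 + m2 cos^2)) dtheta,
    where m2 = M^2 + k^2 > 0. It is evaluated with the midpoint rule.
    """
    m2 = mass * mass + k2
    step = 0.5 * math.pi / n
    total = 0.0
    for j in range(n):
        theta = (j + 0.5) * step
        s2, c2 = math.sin(theta) ** 2, math.cos(theta) ** 2
        total += s2 / ((s2 + eta * eta * c2) * (s2 + m2 * c2))
    return (2.0 / math.pi) * total * step


def closed_form(k2, mass, eta):
    """Expected result 1 / (sqrt(M^2 + k^2) + eta) for eta >= 0."""
    return 1.0 / (math.sqrt(mass * mass + k2) + eta)


if __name__ == "__main__":
    for k2, mass, eta in [(2.0, 1.0, 0.5), (0.1, 0.0, 1.0), (10.0, 2.0, 0.0)]:
        print(k2, mass, eta, resummed_kk_modes(k2, mass, eta), closed_form(k2, mass, eta))
```

Each KK mode contributes a simple pole $1/(k^2 + \lambda^2)$, while the sum over the continuum of modes produces the square-root branch cut discussed above.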
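For Euclidean momenta $(k_1, \ldots, k_d)$ one gets from (16) the Schwinger function
\[ \tilde{\Delta}_s(k_1, \ldots, k_d) = i \tilde{\Delta}(k^0 = i k_d, k_1, \ldots, k_{d-1}) = \frac{1}{\sqrt{M^2 + k^2} + \eta}, \tag{17} \]
which leads to the following Euclidean effective action
\[ A_D[\phi] = \frac{1}{2} \int d^d x\, \phi(x) \left( \sqrt{-\partial^2 + M^2} + \eta \right) \phi(x), \tag{18} \]
localized on the defect. Eq. (18) reads in momentum space
\[ A_D[\phi] = \frac{1}{2} \int \frac{d^d k}{(2\pi)^d}\, \tilde{\phi}(-k)\, J(k; M, \eta)\, \tilde{\phi}(k), \qquad J(k; M, \eta) \equiv \sqrt{k^2 + M^2} + \eta, \tag{19} \]
where
\[ \int \frac{d^d k}{(2\pi)^d}\, e^{ikx} J(k; 0, 0) \sim \frac{1}{r^{d+1}}, \qquad r \equiv |x|. \tag{20} \]
In what follows Eq. (20) allows us to make contact with some previous work [9-11] in the context of spin models in statistical mechanics.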
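As an illustration of the power law in Eq. (20), the case d = 1 can be worked out in closed form. The exponential regulator $a > 0$ below is an assumption of this sketch, introduced only to make the integral convergent; it is not part of the construction described in the text.

```latex
\int_{-\infty}^{\infty} \frac{dk}{2\pi}\, |k|\, e^{ikx - a|k|}
  = \frac{1}{\pi}\, \mathrm{Re} \int_0^{\infty} dk\, k\, e^{-(a - ix)k}
  = \frac{1}{\pi}\, \mathrm{Re}\, \frac{1}{(a - ix)^2}
  = \frac{1}{\pi}\, \frac{a^2 - x^2}{(a^2 + x^2)^2}
  \;\xrightarrow[a \to 0]{}\; -\frac{1}{\pi x^2}, \qquad x \neq 0 .
```

For d = 1 this reproduces the $1/r^{d+1} = 1/r^2$ decay, with a negative coefficient at nonzero separation.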
The case $-M \leq \eta < 0$ can be considered along the same lines, keeping in mind that the interaction of $\varphi$ with the defect produces [6] a defect bound state. It contributes to the Schwinger function, which instead of (17) now takes the form [12]
\[ \tilde{\Delta}_s(k) = \frac{1}{\sqrt{M^2 + k^2} + \eta} + \theta(-\eta)\, \frac{2|\eta|}{k^2 + M^2 - \eta^2}. \tag{21} \]
In the range $\eta < -M$ the defect bound state gives an imaginary energy contribution to $\varphi$, which leads to an instability. Therefore the free theory (1) is stable and well defined only when $\eta > -M$.
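Let us mention in conclusion that the $\delta$-impurity is actually an element of a whole three-parameter family [13] of dissipationless defects, defined by the boundary conditions
\[ \begin{pmatrix} \varphi(x, +0) \\ \partial_y \varphi(x, +0) \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} \varphi(x, -0) \\ \partial_y \varphi(x, -0) \end{pmatrix}, \tag{22} \]
where $\{a_{ij} \in \mathbb{R} : a_{11} a_{22} - a_{12} a_{21} = 1\}$. The above considerations have a straightforward extension [6] to all of them. For the Källén-Lehmann spectral density one gets in the general case
\[ \varrho(\lambda^2) = \theta(\lambda^2 - M^2)\, \frac{2 \left[ a_{12}^2 (\lambda^2 - M^2) + a_{11}^2 + 1 \right] \sqrt{\lambda^2 - M^2}}{\pi \left[ a_{12}^2 (\lambda^2 - M^2)^2 + (a_{11}^2 + a_{22}^2 + 2)(\lambda^2 - M^2) + a_{21}^2 \right]}, \tag{23} \]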
which behaves for = 0 and = like the density (13). For this reason it is enough
to concentrate in what follows on the $\delta$-impurity. Most of our results hold in fact in the general case, because they reflect the low-momentum behavior of the field $\phi$.
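A quick consistency check is that the general spectral density reduces to the one of the $\delta$-impurity, which corresponds to $a_{11} = a_{22} = 1$, $a_{12} = 0$, $a_{21} = 2\eta$ in the boundary conditions (22). The sketch below assumes the densities in the forms $\varrho_\delta = s/[\pi(s^2 + \eta^2)]$ and $\varrho = 2[a_{12}^2 s^2 + a_{11}^2 + 1]\,s/\{\pi[a_{12}^2 s^4 + (a_{11}^2 + a_{22}^2 + 2)s^2 + a_{21}^2]\}$ with $s^2 = \lambda^2 - M^2$; the function names are hypothetical.

```python
import math


def rho_delta(lam2, mass, eta):
    """Spectral density of the delta-impurity, assumed form of Eq. (13)."""
    s2 = lam2 - mass * mass
    if s2 <= 0.0:
        return 0.0
    return math.sqrt(s2) / (math.pi * (s2 + eta * eta))


def rho_general(lam2, mass, a11, a12, a21, a22):
    """Spectral density for a general dissipationless defect, assumed form of Eq. (23)."""
    if abs(a11 * a22 - a12 * a21 - 1.0) > 1e-12:
        raise ValueError("the defect parameters must satisfy a11*a22 - a12*a21 = 1")
    s2 = lam2 - mass * mass
    if s2 <= 0.0:
        return 0.0
    numerator = 2.0 * (a12 ** 2 * s2 + a11 ** 2 + 1.0) * math.sqrt(s2)
    denominator = math.pi * (a12 ** 2 * s2 ** 2 + (a11 ** 2 + a22 ** 2 + 2.0) * s2 + a21 ** 2)
    return numerator / denominator


if __name__ == "__main__":
    mass, eta = 1.0, 0.7
    for lam2 in (0.5, 1.5, 4.0, 100.0):
        # delta-impurity: a11 = a22 = 1, a12 = 0, a21 = 2*eta
        print(lam2, rho_delta(lam2, mass, eta), rho_general(lam2, mass, 1.0, 0.0, 2.0 * eta, 1.0))
```

Both densities vanish below the mass gap, and for the $\delta$-impurity parameters they coincide above it.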
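3. $\varphi^4$ interactions on $\delta$-defects
In this section we extend the study to the case in which $\varphi^4$-type interactions are present on the defect and the dynamics is determined by the action
\[ A_M[\varphi] = -\frac{1}{2} \int d^d x\, dy\, \varphi(x, y) \left[ \Box_x - \partial_y^2 + M^2 + 2\eta\,\delta(y) \right] \varphi(x, y) - \frac{g_0}{4!} \int d^d x\, dy\, \delta(y)\, \varphi^4(x, y). \tag{24} \]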
We should mention that surface (boundary) critical phenomena have been widely investigated in the case where the $\varphi^4$-type potential also determines the critical properties in the bulk; see for example the reviews [14,15]. Here, the defect does not only represent the impurity of the bulk system, but it is also the place where the $\varphi^4$ interaction is localized. As already mentioned in the introduction, the idea of localizing the interaction on the defect was already considered [2] in the context of two-dimensional conformal field theory and the quantum Hall effect. In the following we discuss the phase diagram and the critical properties of the field theory induced by (24) on the d-dimensional defect D with $d \geq 1$.
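The correlation functions corresponding to (24) are given in perturbation theory by
\[ \langle T \varphi(x_1, y_1) \cdots \varphi(x_n, y_n) \rangle = \langle T \varphi_0(x_1, y_1) \cdots \varphi_0(x_n, y_n)\, e^{i :A_D^{\rm int}:} \rangle, \tag{25} \]
where $\varphi_0$ is the free bulk field described in the previous section,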
g0
AD =
4!
int
g0
x 04 (x, 0) =
4!

d d x 04 (x)
(26)
and $:\,\cdots\,:$ denotes the normal product. Since the interaction is localized on the defect, one can perform in (25) the limits $y_i \to 0$ with the result
\[ \langle T \phi(x_1) \cdots \phi(x_n) \rangle = \langle T \phi_0(x_1) \cdots \phi_0(x_n)\, e^{i :A_D^{\rm int}:} \rangle, \tag{27} \]
where the $\phi_0$-propagator is given by (16). In view of Eq. (18), the correlation functions (27) are generated by the following Euclidean action on D
\[ A_D[\phi] = \int d^d x \left[ \frac{1}{2} \phi \left( \sqrt{-\partial^2 + M^2} + \eta \right) \phi + \frac{g_0}{4!} \phi^4 \right]. \tag{28} \]
This action defines the quantum field theory induced on the defect.
The theory (24) can be further generalized by considering in Euclidean space the action
\[ A[\phi, \varphi] = \int d^d x \left[ \frac{\kappa}{2} (\partial\phi)^2(x) + \eta\, \phi^2(x) + \frac{g_0}{4!} \phi^4(x) + \frac{h}{2} \left( \phi(x) - \varphi(x, 0) \right)^2 \right] + \int d^d x\, dy \left[ \frac{1}{2} (\partial\varphi)^2(x, y) + \frac{M^2}{2} \varphi^2(x, y) \right], \tag{29} \]
where $\kappa \geq 0$ and h > 0 are real parameters. In the limit $\kappa \to 0$ and $h \to \infty$ the theory (24) is recovered, apart from a trivial rescaling of the fields. In this model the field $\phi$, constrained on a d-dimensional defect, interacts with a free field $\varphi$ propagating in d + 1 dimensions. Moreover, as in the case of the standard O(N)-symmetric $\varphi^4$ theories, one may take in (29) also N-component fields $\{\phi_i, \varphi_i\}$ and a particular limit of the parameters $\eta$ and $g_0$, so that the resulting defect field is constrained to have norm one, i.e., one has
\[ A[\phi, \varphi] = \int d^d x \left[ \frac{\kappa}{2} (\partial\phi)^2(x) + \frac{h}{2} \left( \phi(x) - \varphi(x, 0) \right)^2 \right] + \int d^d x\, dy \left[ \frac{1}{2} (\partial\varphi)^2(x, y) + \frac{M^2}{2} \varphi^2(x, y) \right], \tag{30} \]
with
\[ \sum_{i=1}^{N} \phi_i \phi_i = 1. \tag{31} \]
Note that in the limit $h \to \infty$, this model can be seen as a free field constrained to have norm one on the defect. The universal critical behavior is expected to remain invariant under the above changes of the Lagrangian parameters. This can be verified in the large-N limit, see Section 4. In particular, the parameters $\kappa$ and h are expected to be irrelevant from the point of view of the renormalization-group theory, since they do not change the universal behavior at its critical points. They may be set to the values $\kappa = 0$ and $h = \infty$.
The above models can be regularized on a (d + 1)-dimensional lattice by a straightforward discretization of their Lagrangians. An example of such a discretization of the action (30) is
\[ A_L = -\kappa \sum_{x, \mu} \phi_x \cdot \phi_{x+\mu} + \frac{h}{2} \sum_x (\phi_x - \varphi_x)^2 + \frac{1}{2} \sum_{z, \nu} (\varphi_z - \varphi_{z+\nu})^2 + \frac{M^2}{2} \sum_z \varphi_z^2, \tag{32} \]
where we set the lattice spacing a = 1, x and z indicate respectively the coordinates on the defect and in the bulk, and $\mu = 1, \ldots, d$, $\nu = 1, \ldots, d + 1$. The corresponding partition function is defined as
\[ Z_L = \sum_{\{\phi\}, \{\varphi\}} \prod_x \delta(\phi_x^2 - 1)\, e^{-\beta A_L}, \tag{33} \]
where $\beta = 1/T$, and the coupling T plays the role of temperature. Nontrivial continuum limits are realized at the critical points, where continuous transitions occur and a length scale $\xi$ diverges in units of the lattice spacing.
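To make the lattice regularization concrete, the following sketch evaluates a lattice action of the type (32) for the smallest nontrivial geometry: a one-dimensional defect (d = 1) embedded at y = 0 in a two-dimensional periodic bulk. The assumed form of the action, $A_L = -\kappa \sum \phi_x \cdot \phi_{x+\mu} + (h/2) \sum (\phi_x - \varphi_x)^2 + (1/2) \sum (\varphi_z - \varphi_{z+\nu})^2 + (M^2/2) \sum \varphi_z^2$, the periodic boundary conditions, the data layout and the function names are all assumptions of this illustration.

```python
def dot(a, b):
    """Scalar product of two N-component field values."""
    return sum(x * y for x, y in zip(a, b))


def sq_dist(a, b):
    """Squared distance between two N-component field values."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lattice_action(defect, bulk, kappa, h, mass):
    """Lattice action for a d = 1 defect sitting at y = 0 of a periodic 2D bulk.

    defect[x]  : unit-norm N-vector on the defect site x
    bulk[x][y] : N-vector of the bulk field at the site (x, y)
    """
    size_x, size_y = len(defect), len(bulk[0])
    action = 0.0
    for x in range(size_x):
        # nearest-neighbour coupling along the defect
        action -= kappa * dot(defect[x], defect[(x + 1) % size_x])
        # coupling between the defect field and the bulk field at y = 0
        action += 0.5 * h * sq_dist(defect[x], bulk[x][0])
        for y in range(size_y):
            site = bulk[x][y]
            # free bulk field: lattice gradient in both directions plus mass term
            action += 0.5 * sq_dist(site, bulk[(x + 1) % size_x][y])
            action += 0.5 * sq_dist(site, bulk[x][(y + 1) % size_y])
            action += 0.5 * mass * mass * dot(site, site)
    return action


if __name__ == "__main__":
    up = (1.0, 0.0)
    defect = [up] * 4
    bulk = [[up] * 3 for _ in range(4)]
    # Fully aligned configuration with a massless bulk field
    print(lattice_action(defect, bulk, kappa=1.0, h=2.0, mass=0.0))
```

For a fully aligned configuration with a massless bulk field only the nearest-neighbour term on the defect contributes, so the action equals $-\kappa$ times the number of defect sites.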
The main features of the phase diagram can be discussed using general renormalization-group arguments. In the case h = 0, i.e., when the fields $\phi$ and $\varphi$ do not interact, the critical properties of the field $\phi$ on the defect are those of the d-dimensional N-component nonlinear sigma model. If d = 1 the correlation length diverges only when $T \to 0$ for any N. If d = 2, the system undergoes a finite-temperature Ising transition for N = 1, a finite-T Kosterlitz-Thouless transition [16] for N = 2, while in the case $N \geq 3$ the system becomes critical only in the limit $T \to 0$, with a length scale that increases exponentially, i.e., $\xi \sim e^{b/T}$, typical of asymptotically free models. For d = 3 there is a finite-temperature transition for any N with universal properties characterized by nontrivial power laws.2 We have the same scenario for d = 4 but with mean-field critical behaviors apart from logarithms. For d > 4 the behavior is just mean field. See, e.g., Refs. [21,22] for reviews. These critical behaviors remain unchanged for h > 0 when M > 0, i.e., when M is strictly positive and kept fixed. In this case the large-distance correlations in the bulk decay exponentially, i.e., $\sim e^{-Mr}$, inducing only short-ranged interactions on the defect. They do not change the universal critical behavior of the low modes when the correlation length $\xi$ on the defect is sufficiently large, i.e., when $\xi \gg 1/M$. On the other hand, when the bulk field $\varphi$ becomes massless, the large-distance bulk correlations induce long-range interactions on the defect, cf. Eq. (20), which can change the critical properties on the defect. These long-range interactions can be inferred from the analysis of Section 2. When M = 0, integrating out the bulk field $\varphi$ in Eq. (33) (see Section 4.2) one obtains a defect action of the form
\[ A_D = \int \frac{d^d k}{(2\pi)^d}\, J(k)\, \tilde{\phi}(k) \cdot \tilde{\phi}(-k), \qquad J(k) \approx j_1 (k^2)^{1/2} + j_2 k^2 + \cdots. \tag{34} \]
The critical properties of statistical systems with long-range interactions, such as
\[ J(k) = j_s (k^2)^{s/2} + j_2 k^2 + \cdots \tag{35} \]
were studied in [9,10] within an expansion in powers of 2s d and in [11] in powers of 1/N . The results of [911] show that statistical systems with Hamiltonians of the
2 The universal properties of three-dimensional O(N) vector models have been determined by various theoretical methods and experiments. The most precise theoretical estimates of the standard critical exponents have been obtained by lattice techniques. We mention $\nu = 0.63012(16)$, $\eta = 0.03639(15)$ [17] and $\nu = 0.63020(12)$, $\eta = 0.0368(2)$ [18] for the three-dimensional Ising model corresponding to $N = 1$, $\nu = 0.67155(27)$ and $\eta = 0.0380(4)$ [19] for the XY universality class ($N = 2$), and $\nu = 0.7112(5)$ and $\eta = 0.0375(5)$ [20] for the Heisenberg universality class ($N = 3$). In the large-$N$ limit one finds $\nu = 1$ and $\eta = 0$ [21].
type (34) undergo a continuous transition at finite temperature for any $N$. The cases $d > 2$, corresponding to $\epsilon < 0$, are in the classical regime [9], where, for any $N$, the critical behavior of the magnetic susceptibility $\chi$ and correlation length $\xi$ is given by [9]
$$\chi \sim \xi \sim t^{-1}, \tag{36}$$
where $t \equiv (T - T_c)/T_c$ is the reduced temperature; therefore the critical exponents are $\gamma = \nu = \eta = 1$. The case $d = 2$ is on the borderline $\epsilon = 0$, and multiplicative logarithms correct the classical behavior [9]:
$$\chi \sim \xi \sim t^{-1}\left(\ln t^{-1}\right)^{\frac{N+2}{N+8}}. \tag{37}$$
We also mention that for $d < 2$, corresponding to $\epsilon > 0$, the critical exponents are nontrivial; indeed
$$\gamma = 1 + \frac{N+2}{N+8}\,\epsilon + O(\epsilon^2), \qquad \eta = 1 + O(\epsilon^2). \tag{38}$$
The $O(\epsilon^2)$ terms are reported in Refs. [9,10].

The above-mentioned critical properties suggest that the critical point for $M = 0$ is actually a multicritical point in the $T$–$M$ plane. Indeed, besides the two standard relevant scaling fields associated with the temperature $T$ and external field $H$, there is another relevant parameter given by the bulk mass $M$. Switching $M$ on, the critical behavior crosses over to the more stable one for $M > 0$. We will return to this point in Section 4.
We finally note that the critical properties discussed in this section apply also to the more
general dissipationless defects defined by Eq. (22), since the low-momentum behavior of
the corresponding effective Lagrangian remains substantially unchanged.
4. The large-N limit

In this section we show how the scenario put forward in the preceding section is actually realized in the large-N limit, which can be analytically investigated by solving the
corresponding saddle-point equations. We first discuss the large-N limit in an appropriate
continuum formulation. Then we consider a more rigorous nonperturbative treatment of
the bulk theory based on a lattice regularization of the path integral. In other words, we
consider a lattice model whose critical behavior is described by the unitary quantum field
theory induced on the defect. We present the solution of its large-N limit for d = 1, 2. As
we shall see, this large-N investigation fully supports the scenario outlined in Section 3.
4.1. Continuum theory
We want to study the effects of the $\varphi^4$ interaction localized on the defect, and in particular the universal properties near the critical point at which the bulk correlation length diverges, i.e., for $M \to 0$. For this purpose we focus on the model (28) with an $N$-component field $\varphi_i$, thus considering

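$$A_D[\varphi] = \int d^dx \left[\frac{1}{2}\sum_{i=1}^{N} \varphi_i\left(\sqrt{M^2 - \partial^2} + \mu\right)\varphi_i + \frac{g_0}{4!}\left(\sum_{i=1}^{N}\varphi_i\varphi_i\right)^2\right]. \tag{39}$$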
Note that the above theory is renormalizable in two dimensions, unlike the standard $\varphi^4$ model which is renormalizable in four dimensions.
In the following we first determine the critical behavior of the theory (39) at $M = 0$, computing in particular the standard critical exponents and the equation of state in the limit $N \to \infty$ with fixed $g_0 N \equiv g$. We introduce the auxiliary field $\lambda$ and a source $H$ for $\varphi_N$, writing the partition function in the form
$$Z(H) = \int [d\varphi][d\lambda]\, e^{-A_D[\varphi,\lambda;H]}, \tag{40}$$
where
$$A_D[\varphi,\lambda;H] = \int d^dx \left[\frac{1}{2}\sum_{i=1}^{N}\varphi_i\left(\sqrt{-\partial^2} + \lambda\right)\varphi_i + \frac{3\mu}{g_0}\lambda - \frac{3}{2g_0}\lambda^2 - H\varphi_N\right]. \tag{41}$$
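Integrating over $\varphi_1, \ldots, \varphi_{N-1}$ and setting $\sigma = \varphi_N$, one gets
$$Z(H) = \int [d\sigma][d\lambda]\, e^{-A_D[\sigma,\lambda;H]}, \tag{42}$$
where
$$A_D[\sigma,\lambda;H] = \int d^dx \left[\frac{1}{2}\sigma\left(\sqrt{-\partial^2} + \lambda\right)\sigma + \frac{3\mu}{g_0}\lambda - \frac{3}{2g_0}\lambda^2 - H\sigma\right] + \frac{N-1}{2}\,\mathrm{Tr}\log\left(\sqrt{-\partial^2} + \lambda\right). \tag{43}$$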
The large-$N$ behavior is governed by the uniform ($\sigma = \mathrm{const}$, $\lambda = \mathrm{const}$) saddle-point approximation. Varying (43) and rescaling $\sigma$ and $H$ according to $\sigma \to \sigma\sqrt{N}$, $H \to H\sqrt{N}$, one gets
$$\lambda\sigma = H, \tag{44}$$
$$\sigma^2 + \frac{6}{g}(\mu - \lambda) + \int d^dk\,\frac{1}{|k| + \lambda} = 0. \tag{45}$$
The integral in (45) is regularized by means of a UV cutoff $\Lambda$. Moreover, we take $d > 1$ in order to avoid IR instabilities. As expected, Eqs. (44), (45) strongly resemble their counterparts [21] in the conventional $\varphi^4$ theory. The only difference is the term $|k|$ in the denominator of the integrand, which replaces $k^2$ from the standard $\varphi^4$ model. The consequences of this modification are easily analyzed.
Let us consider first the case $H = 0$. In the broken phase $\sigma \neq 0$, one has $\lambda = 0$. Eq. (45) has a solution for $\sigma$ only if
$$\mu < \mu_c \equiv -\frac{g}{6}\int d^dk\,\frac{1}{|k|}. \tag{46}$$
Setting
$$t \equiv \frac{6}{g}(\mu - \mu_c), \tag{47}$$
one gets from (45)
$$\sigma^2 = -t \sim (-t)^{2\beta}, \tag{48}$$
which implies $\beta = \frac{1}{2}$. In the symmetric phase $\sigma = 0$ and $\lambda = m \neq 0$, where $m$ is the inverse of the correlation length $\xi$, which determines the large-distance exponential decay of the two-point correlation function of the field, i.e., $G(r) \sim e^{-r/\xi}$. Now Eq. (45) takes the form
$$\frac{6}{g} + \int d^dk\,\frac{1}{|k|(|k| + m)} = \frac{t}{m}. \tag{49}$$
From (49) one deduces
$$m \sim t^{\nu}, \qquad \nu = \begin{cases} 1, & d > 2, \\ \dfrac{1}{d-1}, & 1 < d < 2. \end{cases} \tag{50}$$
From the critical behavior of the two-point correlation function one finds $\eta = 1$. These results agree with the formulas (38) obtained within an $\epsilon \equiv 2 - d$ expansion. The case $d = 2$ is special because the integral in (45) generates the logarithm $\ln(\Lambda/m)$; one has mean-field behavior with logarithmic corrections. The limiting cases $d = 1$ and $d = 2$ will be further discussed in the next subsection on the lattice.
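We turn now to the case $H \neq 0$. Combining (44) and (45) one gets
$$\sigma^2 + t = \frac{6}{g}\,m + m\int d^dk\,\frac{1}{|k|(|k| + m)}. \tag{51}$$
In the domain $1 < d < 2$ it implies
$$H \propto \sigma^{\delta} f\!\left(t\,\sigma^{-1/\beta}\right), \tag{52}$$
$$f(x) = (1 + x)^{\gamma}, \qquad \gamma = \frac{1}{d-1}, \qquad \delta = \frac{d+1}{d-1}, \tag{53}$$
where $f(x)$ is a universal function apart from trivial normalizations (usually one sets $f(0) = 1$ and $f(-1) = 0$ [21]). Note that the critical exponents satisfy the scaling and hyperscaling relations
$$\beta = \frac{\nu}{2}(d - 2 + \eta), \qquad \gamma = \nu(2 - \eta), \qquad \delta = \frac{d + 2 - \eta}{d - 2 + \eta}, \tag{54}$$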
in the range $1 < d < 2$. From Eq. (52) one may also derive the effective potential (Helmholtz free energy) $V(z)$, i.e., the generator of one-particle irreducible correlation functions of $\varphi$ at zero momentum, see, e.g., Refs. [22,24]. We obtain
$$V(z) = \frac{3}{d}\left[\left(1 + \frac{d-1}{6}\,z^2\right)^{d/(d-1)} - 1\right], \tag{55}$$
where $z \propto \sigma\, t^{-\beta}$, which represents the renormalized expectation value of the field $\varphi$. In Eq. (55) $V(z)$ has been normalized according to $V(z) = \frac{1}{2}z^2 + \frac{1}{24}z^4 + O(z^6)$. For $d = 2$ the effective potential reduces to the simple form
$$V(z) = \frac{1}{2}z^2 + \frac{1}{24}z^4, \tag{56}$$
as in the case of the standard $\varphi^4$ model in four dimensions [24].
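The exponent relations and the normalization of the effective potential quoted above can be checked mechanically. The following is a minimal sketch, assuming the large-N values $\beta = 1/2$, $\gamma = \nu = 1/(d-1)$, $\delta = (d+1)/(d-1)$, $\eta = 1$ for $1 < d < 2$; the function names `exponents`, `scaling_relations_hold` and `V` are illustrative and not taken from the paper.

```python
from fractions import Fraction


def exponents(d):
    """Large-N exponents for 1 < d < 2, as quoted in Eqs. (50) and (53)."""
    d = Fraction(d)
    return {
        "beta": Fraction(1, 2),
        "gamma": 1 / (d - 1),
        "delta": (d + 1) / (d - 1),
        "nu": 1 / (d - 1),
        "eta": Fraction(1),
    }


def scaling_relations_hold(d):
    """Exact check of the three relations of Eq. (54)."""
    e = exponents(d)
    d = Fraction(d)
    return (
        e["beta"] == e["nu"] / 2 * (d - 2 + e["eta"])
        and e["gamma"] == e["nu"] * (2 - e["eta"])
        and e["delta"] == (d + 2 - e["eta"]) / (d - 2 + e["eta"])
    )


def V(z, d):
    """Effective potential of Eq. (55), evaluated in floating point."""
    return 3.0 / d * ((1.0 + (d - 1.0) * z * z / 6.0) ** (d / (d - 1.0)) - 1.0)
```

For $d = 2$ the function `V` reduces to the quartic polynomial of Eq. (56), and for any $1 < d < 2$ its small-$z$ expansion starts with $z^2/2 + z^4/24$, which is the normalization stated in the text.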

It is worth noting that the theory (39) induced on a $d$-dimensional defect with $1 < d < 2$ presents analogies with the standard $\varphi^4$ model for $2 < d < 4$, see, e.g., Ref. [21]. In both cases we have critical behaviors characterized by nontrivial power laws. Moreover, replacing $d$ with $d/2$ in Eqs. (52), (53) and (55), one recovers the corresponding expressions for the large-$N$ limit of the standard O(N) vector models.
As already discussed in Section 3, the critical behavior found for $M = 0$ is unstable against the parameter $M$, since in the presence of a bulk mass the $M = 0$ critical behavior is no longer observed. In this case, the models are expected to show the critical behavior of the well-known $d$-dimensional O(N) vector universality class with short-range interactions, i.e., when $J(k) \sim k^2$ in Eq. (34). Therefore, the critical point for $M = 0$, $T = T_c$, is a multicritical point in the $T$–$M$ plane. Scaling arguments applied to multicritical points, see, e.g., the review [23], suggest the following generalized scaling hypothesis (for $H = 0$)
$$\chi \sim t^{-\gamma} f_{\chi}\!\left(M t^{-\phi}\right), \qquad F_{\rm sing} \sim t^{d\nu} f_{F}\!\left(M t^{-\phi}\right), \tag{57}$$
where $F_{\rm sing}$ is the singular part of the free energy, $\chi$ is the defect magnetic susceptibility; $t \equiv (T - T_c)/T_c$, and $\gamma$, $\nu$, $\phi$ are the multicritical exponents. In particular $\phi$ is the crossover exponent associated with the instability direction parametrized by $M$. The scaling behavior (57) is expected to hold in the critical crossover limit $t, M \to 0$ keeping $M t^{-\phi}$ fixed.
This issue can be investigated in the large-$N$ limit, starting from the action (39), as we did in this section, but keeping $M \neq 0$ through the various steps to arrive at the saddle-point equation. Setting $H = 0$, we obtain
$$\frac{6}{g}(M - m + \mu) + \int d^dk\,\frac{1}{m - M + \sqrt{M^2 + k^2}} = 0, \tag{58}$$
where now $m$ is related to the zero-momentum correlation function (usually called magnetic susceptibility) by $\chi = 1/m$. Defining $t$ by Eq. (47), one finds
$$t = \frac{6}{g}(-M + m) + \int d^dk\,\frac{m - M + \sqrt{M^2 + k^2} - \sqrt{k^2}}{\sqrt{k^2}\left(m - M + \sqrt{M^2 + k^2}\right)}. \tag{59}$$
Keeping only the leading terms in the critical crossover limit for $1 < d < 2$, we find
$$t \approx m^{d-1} S(M/m), \qquad S(r) = \int d^d\kappa\,\frac{1 - r + \sqrt{r^2 + \kappa^2} - \kappa}{\kappa\left(1 - r + \sqrt{r^2 + \kappa^2}\right)}. \tag{60}$$
This result is in agreement with the expected scaling (57), and implies $\phi = 1/(d - 1)$. We also mention that in the classical regime $d > 2$, one finds $\phi = 1$, while for $d = 2$ the classical crossover relations must be corrected due to multiplicative logarithms, as in Eq. (37). It is worth mentioning that the relation between $\xi$ and $\chi$ in the crossover regime changes: one has $\xi \sim \chi$ in the limiting case $M/m = 0$, while in the opposite limit $M/m = \infty$, $\xi^2 \sim \chi$.
4.2. Theory on the lattice
In this section we study the large-$N$ limit of the lattice model defined by the action (32). We consider lattices of size $L^d \times E$ with periodic boundary conditions along all $d + 1$ directions. All dimensional parameters are expressed below in units of the lattice spacing $a$, which is set to 1 for convenience. To begin with, we integrate out the bulk field in the partition function (33), obtaining a $d$-dimensional nonlinear sigma model defined on the defect. Performing a Fourier transformation along the defect coordinate $x$ and computing the Gaussian integrals over the bulk field, one finds

ZL =
(61)
x2 1 eN Aeff ,
{} x
where Aeff is given by

Aeff =

1 2
k B(k; M, h, E) k k ,
2
(62)

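with $\hat k^2 = 2\sum_\mu (1 - \cos k_\mu)$, $k_\mu = 2\pi n_\mu/L$ where $n_\mu = 1, \ldots, L$. The function $B$, following from the integration over the bulk field, is formally given by
$$B(k; M, h, E) = h^2 \left(Q^{-1}\right)_{1,1}, \tag{63}$$
where $Q$ is the $E \times E$ matrix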

2 1
1 . . .
2

..
Q = M + k 2 I + 0
.
..
.
1 0
0
..
.
+h 0
..
.
1
1 2
..
.
The basic features of B are investigated in Appendix A.
0 0
.
0
(64)
In the large-N limit and keeping fixed, the solution is given by the saddle-point equation, which, in the limit L , takes the form

=
dd k
1
.
2
m + k + [B(0; M, h, E) B(k; M, h, E)]
(65)
Here $m$ is related to the magnetic susceptibility by $m = (\beta\chi)^{-1}$. In the large-$N$ limit, $\chi \sim \xi$ for $M = 0$, while $\chi \sim \xi^2$ for $M > 0$. According to Eq. (63), in order to get the function $B$, we must evaluate the 1,1 matrix element of the inverse of the matrix $Q$ given in Eq. (64). In the limit $E \to \infty$ one finds
$$B(k; M, h, \infty) = \frac{h^2}{h + \left[4M^2 + 4\hat k^2 + (M^2 + \hat k^2)^2\right]^{1/2}}. \tag{66}$$
This result is derived in Appendix A.
In the following we simplify the calculation by fixing = 0 and h = , but we have
also checked that the universal critical behavior remains the same when choosing generic
values , h > 0. In terms of
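$$R(k; M) \equiv B(0; M, \infty, \infty) - B(k; M, \infty, \infty) = \left[4M^2 + 4\hat k^2 + (M^2 + \hat k^2)^2\right]^{1/2} - \left[4M^2 + M^4\right]^{1/2}, \tag{67}$$
the large-$N$ saddle-point equation becomes
$$\beta = \int d^dk\,\frac{1}{m + R(k; M)}. \tag{68}$$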
A continuous transition occurs when the defect length scale diverges, i.e., $m \to 0$.
Let us now consider the case $d = 2$. One can easily check that if $M > 0$ the integral (68) diverges when $m \to 0$, as expected since when $M > 0$ the effective action represents a particular regularization of the 2d nonlinear sigma model, which does not have finite-temperature transitions according to the Mermin–Wagner theorem [25]. In this case, the magnetic susceptibility and the correlation length diverge in the limit $\beta \to \infty$, as
$$\chi \sim \xi^2 \sim e^{4\pi C(M)\beta}, \qquad C(M) = \frac{2 + M^2}{M\sqrt{4 + M^2}}, \tag{69}$$
which is the same behavior as that of the standard 2d O(N) spin model with $N > 2$ and nearest-neighbor interactions, apart from a trivial normalization of the inverse temperature $\beta$. On the other hand, when $M = 0$ a transition occurs at finite temperature, since the integral (68) is finite for $m = 0$. Indeed, one finds
$$\beta_c = \int d^2k\,\frac{1}{R(k; 0)} = 0.252731 \tag{70}$$
and
$$\chi \sim \xi \sim t^{-1}\ln t^{-1}, \tag{71}$$
where $t \equiv (\beta_c - \beta)/\beta_c$. Note that in this case the Mermin–Wagner theorem [25] does not apply since long-range spin interactions are present.
Fig. 1. The inverse temperature $\beta$ versus $m$ for several values of $M$.
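The critical value in Eq. (70) can be estimated numerically as a sanity check. The sketch below is only illustrative: it assumes that the measure $d^2k$ carries the usual factor $1/(2\pi)^2$ and that the integral runs over the Brillouin zone $[-\pi, \pi]^2$ with $\hat k^2 = 2\sum_\mu(1 - \cos k_\mu)$; the names `R0` and `beta_c` are hypothetical. A plain midpoint rule is used, so the integrable $1/(2|k|)$ singularity at the origin limits the accuracy to roughly three digits at moderate grid sizes.

```python
import math


def R0(kx, ky):
    """R(k; 0) from Eq. (67): sqrt(4 khat^2 + khat^4)."""
    k2 = 2.0 * (2.0 - math.cos(kx) - math.cos(ky))
    return math.sqrt(4.0 * k2 + k2 * k2)


def beta_c(n=200):
    """Midpoint-rule estimate of Eq. (70).

    Assumption (not stated explicitly in the text): the measure d^2k
    includes 1/(2 pi)^2. With n even, no grid point sits at k = 0.
    """
    h = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        kx = -math.pi + (i + 0.5) * h
        for j in range(n):
            ky = -math.pi + (j + 0.5) * h
            total += 1.0 / R0(kx, ky)
    return total * h * h / (2.0 * math.pi) ** 2
```

Under these assumptions the estimate approaches the quoted value from below as the grid is refined, with an error that shrinks linearly in the grid spacing.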
As already discussed, the critical point $T_c$ for $M = 0$ is actually a multicritical point. Switching $M$ on, the critical behavior crosses over to the more stable critical behavior for $M > 0$ given by Eq. (69). This is shown in Fig. 1, where the function $\beta$ is plotted against $m$ for various values of $M$. The curve for $M = 0$ is approached by those for $M > 0$ when $M \to 0$, except in a small region close to $m = 0$, where they suddenly depart from the
M = 0 curve and diverge for m 0. We note that scales as M when M 0. This
is consistent with the expectation that the crossover exponent is given by = 1, apart
from logarithmic corrections. We finally mention that the relation among the quantities $\chi = 1/(\beta m)$, $t$ and $M$ can be derived from the equation
$$t \equiv \frac{\beta_c - \beta}{\beta_c} = 1 - \frac{1}{\beta_c}\int d^2k\,\frac{1}{m + R(k; M)}. \tag{72}$$
Let us briefly comment in conclusion on the case $d = 1$. One again finds a crossover between different critical behaviors. Unlike the case $d = 2$, the critical behavior is observed when $\beta \to \infty$ in both the $M = 0$ and $M > 0$ cases. When $M = 0$, Eq. (68) shows that the correlation length increases exponentially, i.e., $\xi = 1/m \sim e^{c\beta}$. On the other hand, when $M > 0$ one has $\xi = 1/\sqrt{m} \sim \beta$, which is the behavior of a one-dimensional model with short-range interaction.
5. Conclusions
In conclusion, we have shown how noncanonical quantum fields can be generated in $d$ dimensions using the interaction of canonical fields with a $\delta$-type (or more general dissipationless) defect in $d + 1$ dimensions. The induced fields, which propagate on the defect, define a unitary quantum field theory there. In order to clarify its basic features, we analyzed the case in which an $N$-component scalar field in $d + 1$ dimensions is subject to a quartic ($\varphi^4$-type) interaction localized on a $d$-dimensional defect. This theory can also be formulated in terms of a $d$-dimensional O(N) vector model, defined on the defect and interacting with a $(d + 1)$-dimensional bulk free field.
General renormalization-group arguments and calculations in the large-$N$ limit give a rather complete picture of the critical behavior of the field theory induced on the defect. The large-$N$ limit of the theory was studied within its continuum formulation and also for statistical systems representing a class of lattice regularizations. The main results can be summarized as follows. When the bulk fields are massless, the induced $\varphi^4$ theory belongs to the universality class of a specific statistical spin model with long-range interactions, behaving as $1/r^{d+1}$. The former therefore provides a field-theoretical description of the universal critical properties of the latter. The presence of a nonzero bulk mass $M$ causes an interesting crossover to the critical behavior of the standard O(N) spin model with short-range interactions. We argue that the critical point $T_c$ at $M = 0$ is actually a multicritical point in the $T$–$M$ plane. Following renormalization-group theory applied to multicritical points, one can then consider a critical crossover limit, defined when $t \equiv T - T_c \to 0$ and $M \to 0$. This presents a universal scaling behavior, which can be studied within the continuum theory with nonvanishing bulk mass $M$.
The generalization of this work to models involving gauge and/or fermion fields opens interesting new possibilities and deserves further investigation.
Appendix A. Some useful formulas

In this appendix we provide a few details on the derivation of Eq. (66). In order to determine the function $B$ from Eq. (63), one needs to evaluate the matrix element 1,1 of the inverse of an $n \times n$ matrix of the type
$$A_n = \begin{pmatrix} a & -1 & 0 & \cdots & -1 \\ -1 & b & -1 & \ddots & \vdots \\ 0 & \ddots & \ddots & \ddots & 0 \\ \vdots & \ddots & -1 & b & -1 \\ -1 & \cdots & 0 & -1 & b \end{pmatrix}. \tag{A.1}$$
We are interested in the case $a > b \ge 2$. This can be done by using the formula
$$\left(A_n^{-1}\right)_{1,1} = \frac{\det A_n^{1,1}}{\det A_n}, \tag{A.2}$$
where $A_n^{1,1}$ indicates the minor matrix corresponding to the 1,1 matrix element. Let us introduce the $n \times n$ matrix
$$B_n = \begin{pmatrix} b & -1 & & 0 \\ -1 & b & \ddots & \\ & \ddots & \ddots & -1 \\ 0 & & -1 & b \end{pmatrix}. \tag{A.3}$$
Then the following relation holds
$$\det A_n^{1,1} = \det B_{n-1}, \qquad \det A_n = a\det B_{n-1} - 2\det B_{n-2} - 2. \tag{A.4}$$
In order to compute the determinant of the matrix $B_n$, we note that
$$\det B_1 = b, \qquad \det B_2 = b^2 - 1,$$
and the recursive formula
$$\det B_n = b\det B_{n-1} - \det B_{n-2}.$$
In the large-$n$ limit and for $b > 2$, $\det B_n$ diverges and
$$\lim_{n\to\infty}\frac{\det B_{n-1}}{\det B_n} = \frac{b - \sqrt{b^2 - 4}}{2} < 1. \tag{A.5}$$
This allows us to derive the formula
$$\lim_{n\to\infty}\left(A_n^{-1}\right)_{1,1} = \lim_{n\to\infty}\left[a - 2\,\frac{\det B_{n-2}}{\det B_{n-1}} - \frac{2}{\det B_{n-1}}\right]^{-1} = \frac{1}{a - b + \sqrt{b^2 - 4}}, \tag{A.6}$$
which was used to obtain Eq. (66).
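The chain (A.2), (A.4), (A.6) can be cross-checked numerically. The following sketch, with hypothetical helper names, compares the determinant formula against a direct linear solve for a finite cyclic matrix $A_n$, and then against the closed-form limit. It uses the convention $\det B_0 = 1$, which is consistent with $\det B_2 = b\det B_1 - \det B_0 = b^2 - 1$.

```python
import math


def inv11_formula(a, b, n):
    """(A_n^{-1})_{1,1} from Eqs. (A.2) and (A.4), for n >= 3."""
    dets = [1.0, b]  # det B_0 (convention), det B_1
    for _ in range(2, n):
        dets.append(b * dets[-1] - dets[-2])  # recursion for det B_k
    return dets[n - 1] / (a * dets[n - 1] - 2.0 * dets[n - 2] - 2.0)


def inv11_direct(a, b, n):
    """(A_n^{-1})_{1,1} obtained by solving A_n x = e_1 (Gaussian elimination)."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = b
        A[i][(i + 1) % n] = -1.0  # cyclic nearest neighbours, as in (A.1)
        A[i][(i - 1) % n] = -1.0
    A[0][0] = a
    rhs = [1.0] + [0.0] * (n - 1)
    for c in range(n):  # forward elimination with partial pivoting
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        rhs[c], rhs[p] = rhs[p], rhs[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= f * A[c][k]
            rhs[r] -= f * rhs[c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = rhs[r] - sum(A[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / A[r][r]
    return x[0]


def inv11_limit(a, b):
    """Large-n limit, Eq. (A.6)."""
    return 1.0 / (a - b + math.sqrt(b * b - 4.0))
```

Since the corrections to the limit decay like $\left[(b - \sqrt{b^2 - 4})/2\right]^n$, a few tens of sites already reproduce (A.6) to machine precision for $b$ well above 2.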
References
[1] P. Di Francesco, P. Mathieu, D. Senechal, Conformal Field Theory, Springer-Verlag, New York, 1997.
[2] H. Saleur, Lectures on non-perturbative field theory and quantum impurity problems, cond-mat/9812110.
[3] I. Cherednik, Int. J. Mod. Phys. A 7 (1992) 109.
[4] G. Delfino, G. Mussardo, P. Simonetti, Nucl. Phys. B 432 (1994) 518, hep-th/9409076.
[5] M. Mintchev, E. Ragoucy, P. Sorba, J. Phys. A 36 (2003) 10407, hep-th/0303187.
[6] M. Mintchev, P. Sorba, JSTAT 0407 (2004) P001, hep-th/0405264.
[7] R. Jost, The General Theory of Quantized Fields, American Mathematical Society, Providence, RI, 1965.
[8] M. Mintchev, Phys. Lett. B 524 (2002) 363, hep-th/0111019.
[9] M. Fisher, S.-k. Ma, B.G. Nickel, Phys. Rev. Lett. 29 (1972) 917.
[10] J. Sak, Phys. Rev. B 8 (1973) 281.
[11] M. Suzuki, Prog. Theor. Phys. 49 (1973) 442;
M. Suzuki, Prog. Theor. Phys. 49 (1973) 1106;
M. Suzuki, Prog. Theor. Phys. 49 (1973) 1440.
[12] M. Mintchev, L. Pilo, Nucl. Phys. B 592 (2001) 219, hep-th/0007002.
[13] S. Albeverio, L. Dabrowski, P. Kurasov, Lett. Math. Phys. 45 (1998) 33.
[14] K. Binder, in: C. Domb, J. Lebowitz (Eds.), Phase Transitions and Critical Phenomena, vol. 8, Academic Press, London, 1983.
[15] H.W. Diehl, in: C. Domb, J. Lebowitz (Eds.), Phase Transitions and Critical Phenomena, vol. 10, Academic Press, London, 1986.
[16] J.M. Kosterlitz, D.J. Thouless, J. Phys. C 6 (1973) 1181.
[17] M. Campostrini, A. Pelissetto, P. Rossi, E. Vicari, Phys. Rev. E 65 (2002) 066127, cond-mat/0201180.
[18] Y. Deng, H.W.J. Blöte, Phys. Rev. E 68 (2003) 036125.
[19] M. Campostrini, M. Hasenbusch, A. Pelissetto, P. Rossi, E. Vicari, Phys. Rev. B 63 (2001) 214503, cond-mat/0010360.
[20] M. Campostrini, M. Hasenbusch, A. Pelissetto, P. Rossi, E. Vicari, Phys. Rev. B 65 (2002) 144520, cond-mat/0110336.
[21] J. Zinn-Justin, Quantum Field Theory and Critical Phenomena, fourth ed., Clarendon, Oxford, 2001.
[22] A. Pelissetto, E. Vicari, Phys. Rep. 368 (2002) 549, cond-mat/0012164.
[23] I.D. Lawrie, S. Sarbach, in: C. Domb, J.L. Lebowitz (Eds.), Phase Transitions and Critical Phenomena, vol. 9, Academic Press, London, 1984.
[24] A. Pelissetto, E. Vicari, Nucl. Phys. B 522 (1998) 605, cond-mat/9801098.
[25] N.D. Mermin, H. Wagner, Phys. Rev. Lett. 17 (1966) 1133.
A new (in)finite-dimensional algebra for quantum integrable models
Pascal Baseilhac, Kozo Koizumi
Laboratoire de Mathématiques et Physique Théorique CNRS/UMR 6083, Université de Tours,
Parc de Grandmont, 37200 Tours, France
Received 22 April 2005; accepted 27 May 2005
Abstract
A new (in)finite-dimensional algebra which is a fundamental dynamical symmetry of a large class of (continuum or lattice) quantum integrable models is introduced and studied in detail. Finite-dimensional representations are constructed and mutually commuting quantities, which ensure the integrability of the system, are written in terms of the fundamental generators of the new algebra. The relation with the deformed Dolan–Grady integrable structure recently discovered by one of the authors and with Terwilliger's tridiagonal algebras is described. Remarkably, this (in)finite-dimensional algebra is a q-deformed analogue of the original Onsager's algebra arising in the planar Ising model. Consequently, it provides a new and alternative algebraic framework for studying massive, as well as conformal, quantum integrable models.
PACS: 02.20.Uw; 11.30.-j; 11.25.Hf; 11.10.Kk
Keywords: Onsager's algebra; Tridiagonal algebra; Dolan–Grady relations; Quadratic algebras; Integrable models
E-mail addresses: baseilha@phys.univ-tours.fr (P. Baseilhac), kozo.koizumi@lmpt.univ-tours.fr (K. Koizumi).
doi:10.1016/j.nuclphysb.2005.05.021
P. Baseilhac, K. Koizumi / Nuclear Physics B 720 [FS] (2005) 325–347
1. Introduction
Two-dimensional completely integrable models (continuum or lattice) are characterized by the existence of an (in)finite set of mutually commuting conserved quantities. This property allows one to solve the model exactly without resorting to approximation schemes. Since the exact solution of the planar Ising model by Onsager in 1944 [1], several methods have been proposed to analyze integrable models in a nonperturbative way. For instance, the factorized scattering theory (based on Yang–Baxter/star–triangle relations), quantum group symmetry, Bethe ansatz techniques and the conformal field theory framework are constantly applied to derive exact results (such as the exact S-matrix, VEVs, form factors, ...) in quantum integrable models. For systems with an (in)finite number of degrees of freedom, integrability takes its roots in the existence of an (in)finite-dimensional symmetry. For instance, in the context of (massless) conformal field theory, the infinite-dimensional Virasoro algebra with fundamental generators $L_n$ satisfying
$$[L_n, L_m] = (n - m)L_{n+m} + \frac{c}{12}\left(n^3 - n\right)\delta_{n+m,0}, \tag{1}$$
where $c$ denotes the central charge, actually gives a powerful algebraic approach to critical statistical systems and corresponding field theories [2]. Indeed, exact results like correlation functions (which were previously not accessible using standard techniques like renormalization group methods) can be derived in a systematic manner using the properties of the Virasoro algebra. Later on, other types of infinite-dimensional conformal symmetries including for instance supersymmetry [3], parafermionic symmetry [4], W-algebras [5] and current algebras [6] have been considered, extending (1). Similarly, they found several applications in a large class of critical statistical systems or conformal field theories with enlarged symmetries.
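As a small worked check of relation (1), one can verify that the central term $\frac{c}{12}(n^3 - n)\delta_{n+m,0}$ is compatible with the Jacobi identity. The sketch below is illustrative only (the function names are not from the paper): it evaluates the central part of $[[L_n, L_m], L_p] + \text{cyclic}$, which can be nonzero only when $n + m + p = 0$, in exact rational arithmetic.

```python
from fractions import Fraction


def central(n, c=Fraction(1)):
    """Coefficient (c/12)(n^3 - n) multiplying delta_{n+m,0} in Eq. (1)."""
    return c * (n**3 - n) / 12


def jacobi_central(n, m, p, c=Fraction(1)):
    """Central part of [[L_n, L_m], L_p] + cyclic permutations.

    [[L_n, L_m], L_p] contributes (n - m) * central(n + m) when n + m + p = 0.
    """
    if n + m + p != 0:
        return Fraction(0)
    return (
        (n - m) * central(n + m, c)
        + (m - p) * central(m + p, c)
        + (p - n) * central(p + n, c)
    )
```

The sum vanishes identically on $n + m + p = 0$, which is the cocycle condition that fixes the form of the central extension up to normalization.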
In the context of integrable massive quantum field theory or lattice systems, hidden symmetries associated with quantum groups [7,8] or deformed Virasoro algebra [9] (in the vicinity of critical points) have been introduced in order to derive exact results in these models. However, an (in)finite-dimensional algebra characterizing the dynamical symmetry of a large class of quantum integrable massive models has not yet been found. The only known example in this direction is the Onsager's algebra with generators $A_k$, $G_l$ satisfying
$$[A_k, A_l] = 4G_{k-l}, \qquad [G_l, A_k] = 2A_{k+l} - 2A_{k-l}, \qquad [G_k, G_l] = 0 \tag{2}$$
for any integers $k, l$. This (in)finite-dimensional Lie algebra was originally introduced in [1] in order to solve the planar Ising model in zero magnetic field. Although it played a crucial role in the original solution of the Ising model, this algebra only arises in a few other quantum integrable models (XY, superintegrable chiral Potts [10] and generalizations [11]). In these models, all conserved quantities $I_{2k+1}$ can be simply expressed in terms of the fundamental generators $A_k$ as
$$I_{2k+1} = \frac{\kappa}{2}\left(A_k + A_{-k}\right) + \frac{\kappa^*}{2}\left(A_{k+1} + A_{-k+1}\right), \tag{3}$$
for $k \ge 0$ where $\kappa$, $\kappa^*$ are arbitrary parameters. Also, note that in the 80s the Onsager's algebra was shown to be closely related with the integrable structure discovered by Dolan and Grady in [12]. Despite its nice properties, the Onsager's algebra remained an interesting curiosity in the last sixty years.
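The relations (2) can be illustrated with a concrete realization. The sketch below uses a standard evaluation-type representation in $2 \times 2$ matrices, $A_k = 2(t^k E + t^{-k} F)$ and $G_k = (t^k - t^{-k}) H$ with $E, F, H$ the usual $sl_2$ matrices and $t$ a fixed number; this realization is an assumption introduced here for illustration and is not taken from the paper.

```python
def mat_mul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def comm(X, Y):
    """Commutator [X, Y]."""
    XY, YX = mat_mul(X, Y), mat_mul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(2)] for i in range(2)]


def A(k, t):
    """A_k = 2 (t^k E + t^-k F) in the illustrative 2x2 realization."""
    return [[0.0, 2.0 * t**k], [2.0 * t**(-k), 0.0]]


def G(k, t):
    """G_k = (t^k - t^-k) H in the illustrative 2x2 realization."""
    g = t**k - t**(-k)
    return [[g, 0.0], [0.0, -g]]


def close(X, Y, tol=1e-9):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(2) for j in range(2))


def onsager_relations_hold(k, l, t):
    """Check the three relations of Eq. (2) for given integers k, l."""
    r1 = close(comm(A(k, t), A(l, t)), [[4 * x for x in row] for row in G(k - l, t)])
    rhs = [[2 * A(k + l, t)[i][j] - 2 * A(k - l, t)[i][j] for j in range(2)]
           for i in range(2)]
    r2 = close(comm(G(l, t), A(k, t)), rhs)
    r3 = close(comm(G(k, t), G(l, t)), [[0.0, 0.0], [0.0, 0.0]])
    return r1 and r2 and r3
```

In this realization $G_0 = 0$ and $G_{-k} = -G_k$, as required by the antisymmetry of the first relation in (2).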
Clearly, identifying the underlying (in)finite-dimensional symmetry in quantum integrable systems is a fundamental and important problem that we wish to address in this paper. Indeed, we construct explicitly and study in detail an (in)finite-dimensional algebraic structure with fundamental generators $W_{-k}$, $W_{k+1}$, $G_{k+1}$, $\tilde G_{k+1}$ satisfying
$$\begin{aligned}
[W_0, W_{k+1}] &= [W_{-k}, W_1] = \frac{1}{q^{1/2} + q^{-1/2}}\left(\tilde G_{k+1} - G_{k+1}\right),\\
[W_0, G_{k+1}]_q &= [\tilde G_{k+1}, W_0]_q = \rho W_{-k-1} - \rho W_{k+1},\\
[G_{k+1}, W_1]_q &= [W_1, \tilde G_{k+1}]_q = \rho W_{k+2} - \rho W_{-k},\\
[W_0, W_{-k}] &= 0, \qquad [W_1, W_{k+1}] = 0,
\end{aligned} \tag{4}$$
and
$$[G_{k+1}, G_{l+1}] = 0, \qquad [\tilde G_{k+1}, \tilde G_{l+1}] = 0, \qquad [\tilde G_{k+1}, G_{l+1}] + [G_{k+1}, \tilde G_{l+1}] = 0,$$
with fixed scalar $\rho$ and $k, l \in \mathbb{N}$. Finite-dimensional representations are obtained, and examples of quantum integrable systems (XXZ spin chain, sine-Gordon and Liouville field theories) which enjoy this symmetry are given. More generally, our framework opens the possibility of analyzing a large class of quantum integrable models from a new point of view, in the spirit of Onsager's approach [1].
The paper is organized as follows. In Section 2, the fundamental relations (4) are derived using their relation with a class of quadratic algebras, namely the reflection equation. Indeed, finite-dimensional representations of the fundamental generators are shown to be generated from general solutions of the reflection equation. In particular, the closure of the algebra is ensured by the existence of a set of linear relations among the generators. Explicit expressions of mutually commuting quantities in terms of $W_{-k}$, $W_{k+1}$, $G_{k+1}$, $\tilde G_{k+1}$, generalizing (3), are obtained. Also, we argue that, similarly to the undeformed case, the integrable structure generated from our q-deformed Onsager's algebra coincides with the deformed Dolan–Grady integrable structure recently discovered in [13,14]. In particular, we exhibit the correspondence in the simplest cases. In the last section, we give some examples of quantum integrable systems which enjoy this (in)finite-dimensional symmetry.
2. Structure of the algebra

In the last thirty years, one of the most important advances in the approach to quantum integrable systems has been based on the star-triangle relations which originated in [1,15] and led to the Yang–Baxter equations, the theory of quantum groups as well as the quantum inverse scattering method. Although the star-triangle relations and the Onsager's algebra (2) first appeared in the work of Onsager, to our knowledge, a direct and explicit link between both structures has never been found or even noticed. Based on the recent results in [13,14], in this section we will exhibit such a link, relating a class of quadratic algebras (the reflection equation, sometimes called the boundary Yang–Baxter equation) and a new finite-dimensional algebra with deformation parameter $q$ generalizing the Onsager's one.¹
This link will allow us to derive the defining relations (4) for the new algebra, as well as mutually commuting operators written in the basis of its fundamental generators.
2.1. Fundamental generators and recursion relations
Following [13,14], let us consider the quadratic algebra (reflection equation) which was first introduced by Cherednik in [16]:
$$R(u/v)\,\bigl(K(u) \otimes I\bigr)\,R(uv)\,\bigl(I \otimes K(v)\bigr) = \bigl(I \otimes K(v)\bigr)\,R(uv)\,\bigl(K(u) \otimes I\bigr)\,R(u/v). \tag{5}$$
This equation arises, for instance, in the context of quantum integrable systems with boundaries [17]. We refer the reader to the literature on the subject for more details. For our purpose, we restrict our attention to the trigonometric R-matrix $R(u)$ which solves the Yang–Baxter equation. In the spin-$\frac{1}{2}$ representation of $U_{q^{1/2}}(\widehat{sl_2})$, it reads
$$R(u) = \sum_{i,j \in \{0,3,\pm\}} \omega_{ij}(u)\,\sigma_i \otimes \sigma_j, \tag{6}$$
where
$$\begin{aligned}
\omega_{00}(u) &= \tfrac{1}{2}\left(q^{1/2} + 1\right)\left(u - q^{-1/2}u^{-1}\right),\\
\omega_{33}(u) &= \tfrac{1}{2}\left(q^{1/2} - 1\right)\left(u + q^{-1/2}u^{-1}\right),\\
\omega_{+-}(u) &= \omega_{-+}(u) = q^{1/2} - q^{-1/2},
\end{aligned}$$
and $\sigma_j$ are Pauli matrices, $\sigma_\pm = (\sigma_1 \pm i\sigma_2)/2$. Suppose that one knows an initial two-dimensional matrix solution $K^{(0)}(u)$ of (5); then a family of solutions to (5) can be easily obtained using the so-called dressing procedure [17]. Indeed, consider the fundamental solution (called L-operator) $L(u)$ of the quantum Yang–Baxter algebra
$$R(u/v)\,\bigl(L(u) \otimes L(v)\bigr) = \bigl(L(v) \otimes L(u)\bigr)\,R(u/v). \tag{7}$$
In the basis $\{S_\pm, s_3\}$ of the quantum algebra $U_{q^{1/2}}(sl_2)$ with defining relations $[s_3, S_\pm] = \pm S_\pm$ and $[S_+, S_-] = (q^{s_3} - q^{-s_3})/(q^{1/2} - q^{-1/2})$, the L-operator takes the simple form:
$$L(u) = \begin{pmatrix} uq^{1/4}q^{s_3/2} - u^{-1}q^{-1/4}q^{-s_3/2} & (q^{1/2} - q^{-1/2})\,S_- \\ (q^{1/2} - q^{-1/2})\,S_+ & uq^{1/4}q^{-s_3/2} - u^{-1}q^{-1/4}q^{s_3/2} \end{pmatrix}. \tag{8}$$
From the results of [17], it follows, for any parameter $v \in \mathbb{C}$, that
$$K^{(N)}(u) \equiv L_N(uv)\cdots L_1(uv)\,K^{(0)}(u)\,L_1(uv^{-1})\cdots L_N(uv^{-1}) \tag{9}$$
acting on the quantum space $V^{(N)} = \bigotimes_{j=1}^{N} V_j$ also solves (5). Due to the choice (8), it is clear that this general dressed solution is a two-dimensional matrix in the auxiliary space with operator entries. Then, in full generality we decide to write it as
$$K^{(N)}(u) = \sum_{j \in \{0,3,\pm\}} \sigma_j \otimes \Omega_j^{(N)}(u), \tag{10}$$
1 As we will see later on, the defining relations of the Onsager's algebra (2) are recovered by setting $q = 1$.
where $\Omega_j^{(N)}(u)$ are rather complicated operators acting on the quantum space $V^{(N)}$. Our main objective is now to write these operators in a more convenient form. In the following, we choose the trivial solution of (5) to be $K^{(0)}(u) \equiv (\sigma_+/c_0 + \sigma_-)/(q^{1/2} - q^{-1/2})$. Note that, according to (8), (9), the operators $\Omega_j^{(N)}(u)$ are combinations of Laurent polynomials of degree $-2N \le d \le 2N$ in the spectral parameter $u$ and operators acting solely on $\bigotimes_{j=1}^{N} U_{q^{1/2}}(sl_2)$.
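The statement that the constant off-diagonal matrix $K^{(0)}$ solves the reflection equation (5) can be checked numerically. The sketch below is a minimal illustration under stated assumptions: the R-matrix is written in the basis $|00\rangle, |01\rangle, |10\rangle, |11\rangle$ with entries $a = uq^{1/2} - u^{-1}q^{-1/2}$, $b = u - u^{-1}$, $c = q^{1/2} - q^{-1/2}$ (which is what (6) gives in that basis), and the numerical values of $q$, $c_0$, $u$, $v$ are arbitrary. The helper names are hypothetical.

```python
import math


def kron(X, Y):
    """Kronecker product of two square matrices stored as nested lists."""
    n, m = len(X), len(Y)
    return [[X[i // m][j // m] * Y[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]


def mul(*Ms):
    """Ordered product of several square matrices of equal size."""
    out = Ms[0]
    for M in Ms[1:]:
        size = len(M)
        out = [[sum(out[i][k] * M[k][j] for k in range(size)) for j in range(size)]
               for i in range(size)]
    return out


def R(u, q):
    """Trigonometric R-matrix of Eq. (6) as a 4x4 matrix."""
    s = math.sqrt(q)
    a = u * s - 1.0 / (u * s)
    b = u - 1.0 / u
    c = s - 1.0 / s
    return [[a, 0, 0, 0], [0, b, c, 0], [0, c, b, 0], [0, 0, 0, a]]


def K0(q, c0):
    """K^(0) = (sigma_+/c0 + sigma_-)/(q^{1/2} - q^{-1/2}); it does not depend on u."""
    d = math.sqrt(q) - 1.0 / math.sqrt(q)
    return [[0.0, 1.0 / (c0 * d)], [1.0 / d, 0.0]]


def reflection_residual(u, v, q, c0):
    """Largest entry of LHS - RHS of the reflection equation (5)."""
    I2 = [[1.0, 0.0], [0.0, 1.0]]
    K1 = kron(K0(q, c0), I2)   # K(u) tensor I
    K2 = kron(I2, K0(q, c0))   # I tensor K(v)
    lhs = mul(R(u / v, q), K1, R(u * v, q), K2)
    rhs = mul(K2, R(u * v, q), K1, R(u / v, q))
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(4) for j in range(4))
```

The same `reflection_residual` pattern could be reused for dressed solutions, at the price of larger matrices for the quantum space.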
2.1.1. Case N = 1
For simplicity, let us start by considering $N = 1$ in (9). Note that this special case was first studied in detail in [18], having some interesting applications in the context of quasi-exactly solvable systems (the Azbel–Hofstadter one, in particular). This simplest dressed solution takes the form (10) for $N = 1$ where the operators $\Omega_j^{(1)}(u)$ are easily written as [13,14]:
$$\begin{aligned}
\Omega_0^{(1)}(u) + \Omega_3^{(1)}(u) &= uq^{1/2}\,W_0^{(1)} - u^{-1}q^{-1/2}\,W_1^{(1)},\\
\Omega_0^{(1)}(u) - \Omega_3^{(1)}(u) &= uq^{1/2}\,W_1^{(1)} - u^{-1}q^{-1/2}\,W_0^{(1)},\\
\Omega_+^{(1)}(u) &= \frac{q^{1/2}u^2 + q^{-1/2}u^{-2}}{c_0\,(q^{1/2} - q^{-1/2})} + \frac{1}{q^{1/2} + q^{-1/2}}\,G_1^{(1)} + \omega_0^{(1)},\\
\Omega_-^{(1)}(u) &= \frac{q^{1/2}u^2 + q^{-1/2}u^{-2}}{q^{1/2} - q^{-1/2}} + \frac{c_0}{q^{1/2} + q^{-1/2}}\,\tilde G_1^{(1)} + c_0\,\omega_0^{(1)},
\end{aligned} \tag{11}$$
where the generators $W_0^{(1)}$, $W_1^{(1)}$, $G_1^{(1)}$, $\tilde G_1^{(1)}$ have been introduced. As shown in [13], the generators $G_1^{(1)}$, $\tilde G_1^{(1)}$ are given by $G_1^{(1)} = [W_1^{(1)}, W_0^{(1)}]_q$ and $\tilde G_1^{(1)} = [W_0^{(1)}, W_1^{(1)}]_q$ where the q-commutator
$$[X, Y]_q = q^{1/2}XY - q^{-1/2}YX$$
has been introduced. In the basis of $U_{q^{1/2}}(sl_2)$, they admit the following representations:
$$\begin{aligned}
W_0^{(1)} &= \frac{1}{c_0}\,vq^{1/4}\,S_+q^{s_3/2} + v^{-1}q^{-1/4}\,S_-q^{s_3/2},\\
W_1^{(1)} &= \frac{1}{c_0}\,v^{-1}q^{-1/4}\,S_+q^{-s_3/2} + vq^{1/4}\,S_-q^{-s_3/2},\\
G_1^{(1)} &= \frac{q^{1/2} + q^{-1/2}}{c_0\,(q^{1/2} - q^{-1/2})}\left(\frac{v^2 + v^{-2}}{q^{1/2} + q^{-1/2}}\,w_0^{(j)} - \left(v^2q^{s_3} + v^{-2}q^{-s_3}\right)\right) + \left(q - q^{-1}\right)S_-^2,\\
\tilde G_1^{(1)} &= \frac{q^{1/2} + q^{-1/2}}{c_0\,(q^{1/2} - q^{-1/2})}\left(\frac{v^2 + v^{-2}}{q^{1/2} + q^{-1/2}}\,w_0^{(j)} - \left(v^{-2}q^{s_3} + v^2q^{-s_3}\right)\right) + \frac{q - q^{-1}}{c_0^2}\,S_+^2.
\end{aligned} \tag{12}$$
The remaining constant term $\omega_0^{(1)}$ in (11) can be either obtained directly from (9), or follows by plugging (10) with the structure (11) and (12) in (5). This last requirement imposes strong constraints on the generators, leading to the so-called Askey–Wilson relations [19]. We refer the reader to [13] for details. In any case, one finds
$$\omega_0^{(1)} = -\frac{v^2 + v^{-2}}{c_0\,(q - q^{-1})}\,w_0^{(j)}, \tag{13}$$
where the Casimir operator eigenvalue of $U_{q^{1/2}}(sl_2)$ reads $w_0^{(j)} = q^{j+1/2} + q^{-j-1/2}$.
For N = 2, the calculations are more involved but the procedure being straightforward,
we do not need to report the detailed analysis. Part of the results below can be found in [14],
but written in a slightly different form. In this case, the solution of the reflection equation
(5) is given by (10) for N = 2 with (see also [14])
(2)
(2)
(2)
(2)
(2)
(2)
0 (u) + 3 (u) = uq 1/2 P0 (u)W0 + P1 (u)W1
(2)
(2)
(2)
(2)
u1 q 1/2 P0 (u)W1 + P1 (u)W2 ,
(2)
(2)
(2)
(2)
(2)
(2)
0 (u) 3 (u) = uq 1/2 P0 (u)W1 + P1 (u)W2
(2)
(2)
(2)
(2)
u1 q 1/2 P0 (u)W0 + P1 (u)W1 ,
(2)
+ (u) =
(2)
(u) =
q 1/2 u2 + q 1/2 u2 (2)

P (u)
c0 (q 1/2 q 1/2 ) 0
(2)
1
(2)
(2)
(2)
(2)
P0 (u)G1 + P1 (u)G2 + 0 ,
+ 1/2
1/2
q +q
q 1/2 u2 + q 1/2 u2 (2)
P0 (u)
q 1/2 q 1/2
(2)

c0
(2) + P (2) (u)G
(2) + c0 (2) ,
+ 1/2
P0 (u)G
1
1
2
0
1/2
q +q
(14)
where the generators $W_{-k}^{(2)}$, $W_{k+1}^{(2)}$, $G_{k+1}^{(2)}$, $\tilde{G}_{k+1}^{(2)}$ for $k \in \{0, 1\}$ have been introduced for further convenience. Their finite-dimensional representations in the tensor product basis of $U_{q^{1/2}}(sl_2) \otimes U_{q^{1/2}}(sl_2)$ can be obtained as before, and coincide exactly with the ones corresponding to the general case N. So, we refer the reader to the next section for explicit expressions. Let us however mention that the exact expressions for the Laurent polynomials $P_{-k}^{(2)}(u)$ with $k \in \{0, 1\}$ are given by

(1)
(2)
P0 (u) = q 1/2 u2 + q 1/2 u2 + c0 q 1/2 q 1/2 0
P1 (u) = q 1/2 + q 1/2 .
(2)
v 2 + v 2
(j )
w0 ,
1/2
1/2
q +q
(15)
Finally, the constant term in (14) reads

(2)
0 =
v 2 + v 2
(j ) (1)
w0 0 .
1/2
1/2
q +q
(16)
2.1.3. General case N

For general values of N , we want to find an explicit expression for K (N ) (u) such that the
(N )
dependence on the spectral parameter u in the operators j (u) is disentangled, similarly
to (14). Based on previous results for N = 1 and N = 2, we propose the following ansatz
in (10) for K (N ) (u):
(N )
N
1
(N )
(N )
N
1
(N )
q 1/2 u2
(N )
0 (u) + 3 (u) = uq 1/2
Pk (u)Wk u1 q 1/2
(N )
(N )
k=0
0 (u) 3 (u) = uq 1/2
N
1
+ (u) =
(N )
(u) =
c0
(q 1/2
q 1/2 u2
q 1/2 )
+ q 1/2 u2
q 1/2
q 1/2
(N )
(N )
(N )
k=0
Pk (u)Wk+1 u1 q 1/2
(N )
(N )
N
1
k=0
+ q 1/2 u2
(N )
Pk (u)Wk+1 ,
Pk (u)Wk ,
k=0
(N )
P0
(N )
P0
(u) +
(u) +
q 1/2
q 1/2
1
+ q 1/2
c0
+ q 1/2
N
1
(N )
(N )
(N )
Pk (u)Gk+1 + 0 ,
k=0
N
1
Pk (u)G
k+1
(N )
(N )
k=0
(N )
+ c0 0 ,
(17)
where $P_{-k}^{(N)}(u)$ are Laurent polynomials to be determined. As explained above, according to the analysis of [17], one knows that
$$K^{(N+1)}(u) \equiv L_{N+1}(uv)\,K^{(N)}(u)\,L_{N+1}(uv^{-1}) \tag{18}$$
is also a solution of (5) provided L(u) obeys (7). For the ansatz (17) to be correct for all N, we thus need to show that $K^{(N+1)}(u)$ keeps the form (17) with $N \to N + 1$. To do that, we proceed as follows. First, let us assume that (17) holds for fixed N. Then, we have to find an explicit relation between the tensor product of the old basis $W_{-k}^{(N)}$, $W_{k+1}^{(N)}$, $G_{k+1}^{(N)}$, $\tilde{G}_{k+1}^{(N)}$ with $U_{q^{1/2}}(sl_2)$ and the new one $W_{-k}^{(N+1)}$, $W_{k+1}^{(N+1)}$, $G_{k+1}^{(N+1)}$, $\tilde{G}_{k+1}^{(N+1)}$. Such a relation can be obtained; being rather complicated, however, we refer the reader to Appendix A for the explicit recursion relations (A.1) in the (N + 1)-tensor product basis of $U_{q^{1/2}}(sl_2)$. At the same time, the following recursion relations for the Laurent polynomials:

v 2 + v 2
(j )
(N +1)
(N )
1/2 2
1/2 2
P0 (u)
P0
(u) = q u + q
u 1/2
w
q + q 1/2 0

(N )
+ c0 q 1/2 q 1/2 0 ,
(N )

v 2 + v 2
(j )
(N )
(u) = q 1/2 + q 1/2 Pk+1 (u) Pk (u) 1/2
w
q + q 1/2 0
for k {1, . . . , N 1},
(N )

(N +1)
PN (u) = q 1/2 + q 1/2 PN +1 (u)
(N +1)
Pk
(19)
appear naturally. In addition, the constant term transforms as

(N +1)
v 2 + v 2
(j ) (N )
w .
+ q 1/2 0 0
q 1/2
(20)

(N +1)
Remarkably, in the basis (A.1) the operators j
(u) can be drastically simplified. In(N )
deed, using the ansatz (17) and the recursion relations for the polynomials Pk (u) as
written above, these operators reduce to
(N )
(N +1)
(N +1)
(N )
(u) + 3
(u) = 0 (u) + 3 (u)
N N +1
0
(N +1)
(N +1)
+ uq 1/2 q s3 1
u1 q 1/2 q s3 2

(N +1)
(N +1)
(N )
(N )
(u) 3
(u) = 0 (u) 3 (u)
N N +1
0
+ uq 1/2 q s3 2
u1 q 1/2 q s3 1

(N +1)
(N )
(N +1)
(u) = + (u)
N N +1 + q 1/2 q 1/2 +
,
+

(N +1)
(N )
(N +1)
(u) = (u)
N N +1 + q 1/2 q 1/2
,
(N +1)
(N +1)
(21)
where we denote
(N +1)
(N +1)
(N +1)
= vq 1/4 S q s3 /2 1
+ v 1 q 1/4 S q s3 /2 2
1
(N +1)
+
I 3
,
q q 1
+1)
+1)
(N +1) = v 1 q 1/4 S+ q s3 /2 (N
+ vq 1/4 S+ q s3 /2 (N
1
2
c0
(N +1)
+
I 4
.
q q 1
(N +1)
It is clear that the extra (unwanted) operators j

with j {1, . . . , 4} (written below)
must be vanishing in order for the ansatz (17) to apply in the case N + 1. To show that, let
(N +1)
us focus, for instance, on 1
. According to previous analysis, one has

+1)
)
(N )
(N )
(N
= c0 q 1/2 q 1/2 0(N ) W(N
1
0 PN +1 (u)WN
+
N
1

1/2 2
(N )
(N )

(N )
q u + q 1/2 u2 Pk (u) q 1/2 + q 1/2 Pk+1 (u) Wk .
k=1
(22)
In other words, the fundamental generators $W_{-k}^{(N)}$, $W_{k+1}^{(N)}$, $G_{k+1}^{(N)}$, $\tilde{G}_{k+1}^{(N)}$ must satisfy nontrivial linear relations. Furthermore, for consistency reasons the relations (22) must be independent of the spectral parameter u. It is then useful to notice that the Laurent polynomials (19) can be written as
(N )
Pk+1 (u) =

N
1 1/2 2

q u + q 1/2 u2 nk+1 (N )
1
Cn
q 1/2 + q 1/2
q 1/2 + q 1/2
(23)
n=k1
(1)
(N )
for 1 k N , together with the initial condition P0 (u) 1. Here the coefficient Ck
are found to satisfy the recursion relations
(N )
C0

(N 1)
= q q 1 c0 0
v 2 + v 2
(j ) (N 1)
w C
,
+ q 1/2 0 0
q 1/2
(N 1)

(N )
Ck = q 1/2 + q 1/2 Ck+1
(N 1)

(N )
CN +1 = q 1/2 + q 1/2 CN +2
v 2 + v 2
(j ) (N 1)
w C
q 1/2 + q 1/2 0 k
for 1 k N 2,
(24)
with C0 = (q 1/2 + q 1/2 ). In particular, using (23) we immediately deduce for k

{1, . . . , N 1}
1/2 2
(N )
(N )

(N )
q u + q 1/2 u2 Pk
(25)
(u) q 1/2 + q 1/2 Pk+1
(u) = Ck+1
.
(1)
(N +1)
Note that relations analogous to (22) also hold for j

with j {2, 3, 4} substituting
(N )
(N )
(N )
(N )
W
by W , G
or G , respectively. Then, as explained above, K (N +1) (u) will have
k
k+1
k+1
k+1
(N +1)
= 0 for j {1, . . . , 4} are satisfied for all values of N .

the structure (17) provided j
After replacing (25) in (22), we finally get the following (spectral parameter independent)
linear relations among the fundamental generators
N

(N ) (N )
(N )
(N )
c0 q 1/2 q 1/2 0 W0
Ck+1 Wk = 0,
k=1
N

(N ) (N )
(N )
(N )
c0 q 1/2 q 1/2 0 W1
Ck+1 Wk+1 = 0,
k=1
N

(N ) (N )
(N )
(N )
c0 q 1/2 q 1/2 0 G1
Ck+1 Gk+1 = 0,
k=1
N

(N ) (N )
(N ) (N )
c0 q 1/2 q 1/2 0 G
Ck+1 G
1
k+1 = 0
(26)
k=1
with (20), (13) and

k

(N )
Ck+1 = q 1/2 + q 1/2 (1)N k+1
v 2 + v 2
(j )
w
1/2
q + q 1/2 0
N k
N!
(k)!(N k)!
for k {1, . . . , N}. Using the representation (A.1), we have checked explicitly that these
linear relations are satisfied for all N . For simplicity, details are reported in Appendix B.
+1)
vanishing in (21), it follows that K (N +1) (u) is given by (10), with (17)
The terms (N
j
using the substitution $N \to N + 1$. This being true for N = 1 and N = 2 as shown in previous sections, we conclude that the general solutions of the reflection equations can be written as (10) with (17) for all values of N, where the algebraic structure is now encoded in the fundamental generators $W_{-k}^{(N)}$, $W_{k+1}^{(N)}$, $G_{k+1}^{(N)}$, $\tilde{G}_{k+1}^{(N)}$.
2.2. Integrable structure and generating function
Applied to quantum integrable systems on the lattice, the generalized quantum inverse scattering approach provides a powerful method to derive, in a systematic way, a family of independent mutually commuting quantities. For instance, as shown in [17] one can introduce the functional
$$t^{(N)}(u) = \mathrm{tr}_0\big\{K_+(u)\,K^{(N)}(u)\big\} \tag{27}$$
with (10) and $K_+(u)$ which solves the dual reflection equation.² Here $\mathrm{tr}_0$ denotes the trace over the two-dimensional auxiliary space. Then, it is proven in [17] that (27) satisfies
$$\big[t^{(N)}(u), t^{(N)}(v)\big] = 0 \quad \text{for all } u, v \in \mathbb{C}, \tag{28}$$
i.e., $t^{(N)}(u)$ constitutes the generating function for the mutually commuting operators. For
instance, let us plug the c-number solution of the dual reflection equation [20,21]

(q 1/2 + q 1/2 )(qu2 q 1 u2 )/c0
uq 1/2 + u1 q 1/2
K+ (u) =
,
+ (q 1/2 + q 1/2 )(qu2 q 1 u2 )
uq 1/2 + u1 q 1/2
(29)
1 ,
with
we derive
t (N ) (u) =
arbitrary complex parameters and (10) with (17) in (27). Immediately,

N
1
2
(N )
(N )
qu q 1 u2 Pk (u) I2k+1 + F(u) I,
(30)
k=0
with (23) and

F(u) =
(q 1/2 + q 1/2 )(qu2 q 1 u2 )

c0

1/2 2
q u + q 1/2 u2 (N )
(N )
(+ + ).
P0 (u) + c0 0
q 1/2 q 1/2
Note that the function F(u) is obviously not important from an algebraic point of view.
(N )
Here, we have introduced the operators I2k+1 which can be written in terms of the fundamental generators as
I2k+1 = Wk + Wk+1 + + Gk+1 + G

k+1 ,
(N )
(N )
(N )
(N )
(N )
for $k \in \{0, 1, \ldots, N-1\}$. (31)
In particular, the property (28) leads to
$$\big[I_{2k+1}^{(N)}, I_{2l+1}^{(N)}\big] = 0 \quad \text{for all } k, l \in \{0, \ldots, N-1\}. \tag{32}$$
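These latter commutation relations impose strong constraints on the fundamental generators. Indeed, plugging (31) in (32) one gets the commutation relations
$$\begin{aligned}
&\big[W_{-k}^{(N)}, W_{-l}^{(N)}\big] = 0, \qquad \big[W_{k+1}^{(N)}, W_{l+1}^{(N)}\big] = 0, \qquad \big[W_{-k}^{(N)}, W_{l+1}^{(N)}\big] + \big[W_{k+1}^{(N)}, W_{-l}^{(N)}\big] = 0, \\
&\big[G_{k+1}^{(N)}, G_{l+1}^{(N)}\big] = 0, \qquad \big[\tilde{G}_{k+1}^{(N)}, \tilde{G}_{l+1}^{(N)}\big] = 0, \qquad \big[\tilde{G}_{k+1}^{(N)}, G_{l+1}^{(N)}\big] + \big[G_{k+1}^{(N)}, \tilde{G}_{l+1}^{(N)}\big] = 0, \\
&\big[W_{k+1}^{(N)}, G_{l+1}^{(N)}\big] + \big[G_{k+1}^{(N)}, W_{l+1}^{(N)}\big] = 0, \qquad \big[W_{k+1}^{(N)}, \tilde{G}_{l+1}^{(N)}\big] + \big[\tilde{G}_{k+1}^{(N)}, W_{l+1}^{(N)}\big] = 0, \\
&\big[W_{-k}^{(N)}, G_{l+1}^{(N)}\big] + \big[G_{k+1}^{(N)}, W_{-l}^{(N)}\big] = 0, \qquad \big[W_{-k}^{(N)}, \tilde{G}_{l+1}^{(N)}\big] + \big[\tilde{G}_{k+1}^{(N)}, W_{-l}^{(N)}\big] = 0.
\end{aligned} \tag{33}$$
² The dual reflection equation follows from (5) by changing $u \to q^{-1/2} u^{-1}$, $v \to q^{-1/2} v^{-1}$ and K(u) in its transpose.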
A few remarks can now be made. First, although from the beginning we have considered tensor product representations of $U_{q^{1/2}}(sl_2)$, the form of (31) essentially relies on the structure (10) with (17). Classifying all possible finite, (in)finite-dimensional or cyclic tensor product representations different from (A.1) is therefore an interesting open question. Secondly, it should be stressed that given N, the relations (26) and their generalizations (54)-(57) (see Appendix B) are responsible for the truncation of the integrable hierarchy, i.e., any quantity $I_{2k+1}^{(N)}$ for $k \ge N$ can be written in terms of all $I_{2k+1}^{(N)}$ with $k \le N - 1$. It follows that given N, there are only N independent mutually commuting fundamental quantities. To conclude, let us mention that for the special case N = 1 (and k = 0) it is easy to check that (31) coincides exactly with the result of [18].
2.3. Fundamental q-deformed commutation relations
We are now interested in the algebraic structure associated with the fundamental generators $W_{-k}^{(N)}$, $W_{k+1}^{(N)}$, $G_{k+1}^{(N)}$, $\tilde{G}_{k+1}^{(N)}$. The form of (17) and the representations (A.1) being determined by the quadratic algebra (5), this equation fixes all fundamental relations among the generators. To extract these relations, we proceed as follows. First, from (8) and (6), one has

(1/2) L(u) = R(u),
(34)
where $\pi^{(1/2)}$ denotes the spin-1/2 representation of $U_{q^{1/2}}(sl_2)$. Then, plugging $K^{(N)}(u)$ in (5), this latter equation can be written as

(1/2)
id(N ) LN +1 (uv 1 )K (N ) (u)LN +1 (uv) I K (N ) (v)

= I K (N ) (v) (1/2) id(N ) LN +1 (uv)K (N ) (u)LN +1 uv 1 .
(35)
Since this equation is satisfied for any value of the spectral parameter u, following [13] we can consider its asymptotic expansion for $u \to \infty$. Replacing (10) with (17) in (35), one finds that the leading equation is trivially satisfied. However, the (next) two subleading ones read
(1/2)
(N +1)
(N +1) (N )
K (v),
id(N ) W0
K (N ) (v) = (1/2) id(N ) W0
vv 1
(1/2)

(N
+1)
(N
+1)
K (N ) (v).
id(N ) W1
K (N ) (v) = (1/2) id(N ) W1
vv 1
(36)
Using the recursion relations (A.1) for the finite-dimensional representations of the fundamental generators and $\pi^{(1/2)}[S_\pm] = \sigma_\pm$ and $\pi^{(1/2)}[s_3] = \sigma_3/2$, one has

1/2 (N )
(N +1)
q W0
v/c0
(1/2) W0
=
(N ) ,
v 1
q 1/2 W0

1/2 (N )
(N +1)
W1
v 1 /c0
q
(1/2) W1
=
(N ) .
v
q 1/2 W1
It is now easy to simplify the intertwining relations (36). After some calculations, the constraints (26) appear naturally with the use of (25). These latter relations being satisfied (see Appendix B for details), omitting the index N we end up with the defining relations (4) for all $k \in \{0, \ldots, N-1\}$ provided one identifies
$$\rho = \frac{(q^{1/2} + q^{-1/2})^2}{c_0}. \tag{37}$$
Note that some of the relations (4) already appear in (33). Actually, it is easy to give an
alternative derivation of the q-deformed commutation relations (4) using the representation
(A.1). Indeed, for any N the constraints (33) are satisfied. Let us consider $N \to N + 1$ in these constraints, and replace the generators $W_{-k}^{(N+1)}$, $W_{k+1}^{(N+1)}$, $G_{k+1}^{(N+1)}$, $\tilde{G}_{k+1}^{(N+1)}$, $k \in \{0, \ldots, N\}$ by their finite-dimensional representations (A.1). After some straightforward calculations, the q-deformed relations (4) arise explicitly. Consequently, this shows perfect consistency between the approach (35) associated with the symmetries underlying (5), and the properties of the integrable structure (31), i.e., (33). It follows that all elements $W_{-k}^{(N)}$, $W_{k+1}^{(N)}$, $G_{k+1}^{(N)}$, $\tilde{G}_{k+1}^{(N)}$ for $k \in \{0, \ldots, N-1\}$ are generated from $W_0^{(N)}$, $W_1^{(N)}$ using the recursion relations (4).
2.4. Relation with tridiagonal algebras and deformed Dolan-Grady hierarchy
The integrable structure (2) is known to be identical [22] with the Dolan-Grady construction introduced and studied in [12], but written in a different notation. This latter structure was found to apply to the class of Hamiltonians of the form
H = A0 + A1 .
(38)
As shown in [12], the integrability condition of the related models (Ising, XY, . . .) relies on the existence of two (necessary and sufficient) conditions, the Dolan-Grady relations, defined by
$$[A_0, [A_0, [A_0, A_1]]] = 16[A_0, A_1] \quad \text{and} \quad [A_1, [A_1, [A_1, A_0]]] = 16[A_1, A_0]. \tag{39}$$
All higher mutually commuting quantities beyond (38) can be written solely in terms of the fundamental operators $A_0$, $A_1$. Note that provided $A_0$, $A_1$ satisfy (39), the whole Onsager algebra (2) is generated. Also, $A_0$, $A_1$ as well as the other generators of the Onsager algebra can be expressed in the basis of the loop algebra of $sl_2$ [23-25]. We refer the reader to these works for details.
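As a concrete illustration of (39), the relations can be verified numerically on a small lattice. The sketch below assumes the familiar Ising-type realization $A_0 = \sum_j \sigma^x_j$, $A_1 = \sum_j \sigma^z_j \sigma^z_{j+1}$ on a periodic chain; this choice of operators and normalization is an assumption made for the example (the paper only names the Ising and XY models), and the helper names are hypothetical.

```python
import numpy as np
from functools import reduce

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def site_op(ops, n_sites):
    """Tensor product with ops[s] on site s and the identity on all other sites."""
    return reduce(np.kron, [ops.get(s, I2) for s in range(n_sites)])

def ising_generators(n_sites):
    """Assumed realization: A0 = sum_j sx_j, A1 = sum_j sz_j sz_{j+1} (periodic chain)."""
    a0 = sum(site_op({j: SX}, n_sites) for j in range(n_sites))
    a1 = sum(site_op({j: SZ, (j + 1) % n_sites: SZ}, n_sites) for j in range(n_sites))
    return a0, a1

def comm(x, y):
    """Ordinary commutator [x, y]."""
    return x @ y - y @ x

def dolan_grady_holds(a0, a1):
    """Check both relations in (39) up to floating-point tolerance."""
    lhs0 = comm(a0, comm(a0, comm(a0, a1)))
    lhs1 = comm(a1, comm(a1, comm(a1, a0)))
    return bool(np.allclose(lhs0, 16 * comm(a0, a1)) and np.allclose(lhs1, 16 * comm(a1, a0)))
```

For example, `dolan_grady_holds(*ising_generators(3))` exercises both relations on a three-site periodic chain, where each site has two distinct neighbours.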
Surprisingly, a q-deformed analogue of the Dolan-Grady relations (39) recently appeared in the context of P- and Q-polynomial association schemes [26-28]:
$$\big[A, [A, [A, A^*]_q]_{q^{-1}}\big] = \rho\,[A, A^*] \quad \text{and} \quad \big[A^*, [A^*, [A^*, A]_q]_{q^{-1}}\big] = \rho\,[A^*, A]. \tag{40}$$
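The nested q-commutators in (40) can be unfolded into a cubic expression in A, which is the combination appearing on the left-hand side of the Askey-Wilson relations. The following sketch checks this numerically for random matrices. It assumes the convention $[X, Y]_q = q^{1/2} XY - q^{-1/2} YX$, which is defined elsewhere in the paper and is taken here as an assumption; the function names are hypothetical.

```python
import numpy as np

def q_comm(x, y, q):
    """q-commutator in the assumed convention: q^{1/2} x y - q^{-1/2} y x."""
    return np.sqrt(q) * x @ y - y @ x / np.sqrt(q)

def nested_lhs(a, a_star, q):
    """[A, [A, [A, A*]_q]_{1/q}], the left-hand side of the first relation in (40)."""
    inner = q_comm(a, q_comm(a, a_star, q), 1.0 / q)
    return a @ inner - inner @ a

def expanded_lhs(a, a_star, q):
    """[A, A^2 A* - (q + 1/q) A A* A + A* A^2], the same quantity written out."""
    cubic = a @ a @ a_star - (q + 1.0 / q) * a @ a_star @ a + a_star @ a @ a
    return a @ cubic - cubic @ a
```

At q = 1 the q-commutators reduce to ordinary commutators, so the left-hand side of (40) becomes the triple commutator of (39).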
By [28, Definition 3.9], the tridiagonal algebra T is the associative algebra with unity generated by two symbols $A$, $A^*$ subject to the relations (40). We call $A$, $A^*$ the standard generators. Here q is a deformation parameter (usually assumed not to be a root of unity) and $\rho$ is a fixed scalar. Let V denote a finite-dimensional irreducible module for T. Then the pair of linear transformations $A : V \to V$ and $A^* : V \to V$ is said to be a tridiagonal (TD) pair [27, Definition 1.1], whose complete classification remains an open problem. The TD pairs such that $A$, $A^*$ have eigenspaces of dimension one are called Leonard pairs, classified in [29]. In particular, Leonard pairs satisfy (for details, see [30]) the so-called Askey-Wilson (AW) relations first introduced by Zhedanov in [19]. Other examples of TD pairs can be found in [28,31]: for $\rho = 0$, in which case (40) reduce to the q-Serre relations; for q = 1 and $\rho = 16$, which leads to the Dolan-Grady relations (39). The more general situation $\rho \neq 0$, $q \neq 1$ was recently considered in detail in [13,14]. There, it was found that TD pairs $A$, $A^*$ admit a realization in terms of the quantum affine Kac-Moody algebra $U_{q^{1/2}}(\widehat{sl_2})$. This algebra is generated by $Q_\pm$, $\bar{Q}_\pm$ and H subject to
q 1/2 Q Q = 0,
q 1/2 Q Q
2H 1
q 1/2 Q Q = q
q 1/2 Q Q
,
q 1/2 q 1/2
= q
Q
q
H
q
H Q = q
Q q
H ,
q
H Q
(41)
together with the q-Serre relations

Q3 Q 1 + q + q 1 Q2 Q Q + 1 + q + q 1 Q Q Q2 Q Q3 = 0,
2

3 Q 1 + q + q 1 Q
Q Q + 1 + q + q 1 Q Q Q
Q
2 Q
3 = 0 .
Q
2 ) is ensured by the coproduct :
Also, the Hopf algebraic structure of Uq 1/2 (sl

Uq 1/2 (sl2 ) Uq 1/2 (sl2 ) Uq 1/2 (sl2 ) associated with (41) acting on the fundamental generators as
(Q ) = Q I + q H Q ,
I + q H Q
,
(Q ) = Q
H
H
H
q =q q .
(42)
2 ) Uq 1/2 (sl
2 )
More generally, one defines the N -coproduct (N ) : Uq 1/2 (sl

Uq 1/2 (sl2 ) as
(N ) (id id ) (N 1)
for N 3 with (2) , (1) id. Note that the opposite N -coproduct
(N ) is similarly defined with
, where the permutation map (x y) = y x for all
2 ) is used. As noticed in [14], it is not difficult to show that some linear
x, y Uq 1/2 (sl
2 ) generators satisfy (40). More generally, using the homomorcombinations of Uq 1/2 (sl
phism property of (42), it is straightforward to check that

1
(N ) 1
H
(N )
H
and A
Q + Q+ +
q
A
Q+ + Q +
+ q
c0
c0
(43)
defines a family of TD pairs (i.e., (40) is satisfied) for arbitrary parameters
and the
identification (37). Note that although the special case
= 0 is considered in most of
this paper, as pointed out in [14] the algebraic structure remains the same for
= 0. The
only difference arises in the exact expressions for the c-number coefficients in the linear
relations (26).
A q-deformed analogue of the Dolan-Grady integrable structure [12] was proposed in [13,14], also based on the properties of the quadratic algebra (5). In the basis of $A$, $A^*$, the first two charges simply read
I1 = A + A ,

c0
I3 =
A, A q , A q + A
(q 1/2 + q 1/2 )2

c0
+
A , A q, A q + A ,
(q 1/2 + q 1/2 )2
(44)
which are mutually commuting by virtue of (40). Comparison between the results of the previous sections and those of [14] can be done easily, and allows us to find the explicit expression of the fundamental generators $W_{-k}^{(N)}$, $W_{k+1}^{(N)}$, $G_{k+1}^{(N)}$, $\tilde{G}_{k+1}^{(N)}$ in terms of $A$, $A^*$ satisfying (40) in the simplest cases N = 1 and N = 2. Furthermore, it provides an alternative check of the fundamental q-deformed relations (4), the linear relations (26) as well as their generalizations (54)-(57).
2.4.1. Case N = 1
(1)
Assuming that the entries j (u) are combinations of Laurent polynomials of degree
2 d 2 in u and operators, it was shown in [13] that any solution K (1) (u) takes the
form (10) for N = 1 and the identification

$$W_0^{(1)} = A, \qquad W_1^{(1)} = A^*, \qquad G_1^{(1)} = [A^*, A]_q, \qquad \tilde{G}_1^{(1)} = [A, A^*]_q. \tag{45}$$
It should be stressed that this does not require the relation (18) to be satisfied, and only
(1)
relies on the degree of the Laurent polynomials in j (u). Nevertheless, the fact that
K (1) (u) satisfies (5) imposes strong constraints on the generators A, A : plugging (10) for
N = 1 with (11) and using (45), one obtains the so-called AskeyWilson relations [19]
(q 1/2 + q 1/2 )2 v 2 + v 2 (j )
A
w0 A,
c0
c0

(q 1/2 + q 1/2 )2
v 2 + v 2 (j )
A 2 A + AA 2 q + q 1 A AA =
A
w0 A .
c0
c0
A2 A + A A2 q + q 1 AA A =
(46)
In particular, it is easy to check these relations for the representation (12). Having in mind
the q-deformed relations (4) and the corresponding hierarchy (31) for = 0, it is natural
to propose from (44) the following identification:

$$W_{-1}^{(1)} \equiv \frac{c_0}{(q^{1/2} + q^{-1/2})^2}\,\big[A, [A^*, A]_q\big]_q + A^*, \qquad W_2^{(1)} \equiv \frac{c_0}{(q^{1/2} + q^{-1/2})^2}\,\big[[A^*, A]_q, A^*\big]_q + A. \tag{47}$$
Introducing this notation in (46), together with (45), leads to a set of very simple linear relations among the fundamental generators. These relations are actually responsible for the truncation of the hierarchy (44) for N = 1, i.e., one finds that $I_3$ is proportional to $I_1$. More generally, any higher charge $I_{2k+1}$ for $k \ge 1$ can be expressed in terms of the first one [14]. Based on this truncation which occurs at N = 1 and the defining relations (4), one finds a slightly generalized version of the first two equations in (26) for the special case N = 1:

(1) (1)
c0 q 1/2 q 1/2 0(1) W(1)
l C0 Wl1 = 0,

(1) (1)
(1) (1)
c0 q 1/2 q 1/2 0 Wl+1 C0 Wl+2 = 0
(48)
for any $l \ge 0$. In particular, the AW relations correspond to l = 0.

2.4.2. Case N = 2
Based on the results of [14], we proceed similarly. Assuming now that the entries
(1)
j (u) are combinations of Laurent polynomials of degree 4 d 4 in u and operators, the solution K (2) (u) takes the form (10) for N = 2 with the identification
(2)

c0
A, A , A q q + A ,
1/2
2
+q
)

c0
(2)
W2 = 1/2
A , A q , A q + A.
1/2
2
(q + q
)
(2)
W0 = A,
W1 =
W1 = A ,
(2)
G1 = A , A q ,
(2)
(q 1/2
2
2
G2 = 1 A 2 , A2 q 2 + 2 A , A q + 3 A , A
(2)

q 1/2 q 1/2 2
2
+ 0 ,
A
+
A
q + q 1

(2) = A, A ,
G
1
q
2 2

2

2
(2)
= 1 A , A
G
+ 2 A, A q + 3 A, A
2
q2
+

q 1/2 q 1/2 2
A + A 2 + 0 ,
q + q 1
(49)
where
0 =

(j ) 2
w0
2
v 4 + v 4
,
1
1
c0 (q 1/2 q 1/2 ) (q 1/2 + q 1/2 )2

q + q 1
1 =
c0 (q 1/2 q 1/2 )
,
q 2 q 2
3 =
c0 (q 1/2 + q 1/2 )
.
q 2 q 2
2 =
c0 (q + q 1 )
,
(q q 1 )(q 1/2 + q 1/2 )
It is important to notice that the expressions of the fundamental generators $W_0^{(2)}$, $W_1^{(2)}$, $G_1^{(2)}$, $\tilde{G}_1^{(2)}$ in terms of $A$, $A^*$ remain unchanged compared to the case N = 1. In addition, the proposal (47) is confirmed. Plugging $K^{(2)}(u)$ given by (10) with (14) in (5) and using (49), we obtain the N = 2 generalization of the AW relations (46):

2(v 2 + v 2 )
(j )
w A, A , A q q
(q 1/2 + q 1/2 )2 0

2(v 2 + v 2 ) (j )
+ A , A , A q q 1 +
w0 A
c0

1/2
(q + q 1/2 )2
(v 2 + v 2 )2
(j ) 2
+
w
A,
c0
c0 (q 1/2 + q 1/2 )2 0
(2)

2(v 2 + v 2 )
(j )
G2 , A q = 1/2
w0 A , A, A q q
1/2
2
(q + q
)

2(v 2 + v 2 ) (j )
+ A, A, A q q 1 +
w0 A
c0

1/2
(q + q 1/2 )2
(v 2 + v 2 )2
(j ) 2
+
w
A .
c0
c0 (q 1/2 + q 1/2 )2 0
(2)
q
A, G2
(50)
Notice that $[\tilde{G}_2^{(2)}, A]_q = [A, G_2^{(2)}]_q$ and $[G_2^{(2)}, A^*]_q = [A^*, \tilde{G}_2^{(2)}]_q$, so that no other relations than (50) are obtained. Based on the algebraic structure (4), as before it is then natural to propose
$$W_{-2}^{(2)} \equiv \frac{c_0}{(q^{1/2} + q^{-1/2})^2}\,\big[A, G_2^{(2)}\big]_q + W_2^{(2)}, \qquad W_3^{(2)} \equiv \frac{c_0}{(q^{1/2} + q^{-1/2})^2}\,\big[G_2^{(2)}, A^*\big]_q + W_{-1}^{(2)}, \tag{51}$$
where the explicit expressions of $W_2^{(2)}$, $W_{-1}^{(2)}$, $G_2^{(2)}$ in terms of $A$, $A^*$ are used in the r.h.s. of (51). Identifying (51) in the constraint (50), one immediately finds the linear relations (26) for N = 2. Using the same argument as before about the truncation of the hierarchy, it follows that only $I_1$ and $I_3$ are independent for N = 2. All higher charges being linear combinations of them, we deduce
(2) (2)
(2) (2)
(2) (2)
c0 q 1/2 q 1/2 0 Wl C0 Wl1 C1 Wl2 = 0,

(2) (2)
(2) (2)
(2) (2)
c0 q 1/2 q 1/2 0 Wl+1 C0 Wl+2 C1 Wl+3 = 0,
(52)
(2)
(2)
(2)
for any $l \ge 0$. For l = 0, the relations (50) are recovered.

2.4.3. General case N
For more general values of N, although technically difficult, it is expected that the fundamental generators can be written solely in terms of $A$, $A^*$. Indeed, for q = 1 the deformed Dolan-Grady integrable structure must reduce to the undeformed one, so that operators in the two hierarchies are in one-to-one correspondence. This goes beyond the scope of this paper, so we do not pursue the analysis for N > 2.
We now want to focus our attention on the construction of linear relations generalizing (26), similarly to (48) and (52). First, it is an exercise to show using (48) and (4) that the relation

(1) (1)
(1) (1)
c0 q 1/2 q 1/2 0 Gl+1 C0 Gl+2 = 0
(53)
(1) . Indeed, this is in agreement with the argument based on

is satisfied, and similarly for G
l+1
(N )
the truncation of the hierarchy (31) for the more general case = 0. Any charge I2k+1
(N )
for k N being expressed as a linear combination of all charges I2k+1 for k N 1,

and the parameters , , being independent, we propose for general values of N the
following relations generalizing (26):
N

(N ) (N )
(N )
(N )
c0 q 1/2 q 1/2 0 Wl
Ck+1 Wkl = 0,
(54)
k=1
N

(N ) (N )
(N )
(N )
c0 q 1/2 q 1/2 0 Wl+1
Ck+1 Wk+l+1 = 0,
(55)
k=1
N

(N ) (N )
(N )
(N )
c0 q 1/2 q 1/2 0 Gl+1
Ck+1 Gk+l+1 = 0,
(56)
k=1
N

1/2

(N ) (N )
1/2 (N ) (N )
c0 q q
0 Gl+1
Ck+1 G
k+l+1 = 0,
(57)
k=1
for any $l \ge 0$. For N = 1 and N = 2, these relations were obtained above. We have checked explicitly that these relations also hold for general values of N, using the explicit finite-dimensional representations of the generators. We refer the reader to Appendix B for details. As a consistency check, let us mention that the relations (56), (57) actually follow from (54) and (55), using the q-deformed commutation relations (4).
3. Concluding remarks
The Onsager algebra is known to be generated from two elements $A_0$, $A_1$ satisfying the Dolan-Grady relations (39). Defining $4G_1 = [A_1, A_0]$, all higher elements $A_k$, $G_l$ in (2) are generated from the recursion relations [23-25]
$$A_{k+1} - A_{k-1} = \frac{1}{2}[G_1, A_k], \qquad G_l = \frac{1}{4}[A_l, A_0]. \tag{58}$$
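The recursion (58) can be tried out on explicit matrices. The sketch below generates $A_2$ and $A_{-1}$ from a pair $A_0$, $A_1$ and checks two consequences that follow directly from (39), namely $[A_2, A_1] = 4G_1$ and $[A_0, A_{-1}] = 4G_1$. It assumes the Ising-type realization $A_0 = \sum_j \sigma^x_j$, $A_1 = \sum_j \sigma^z_j \sigma^z_{j+1}$ on a periodic chain and the standard form $[A_k, A_l] = 4G_{k-l}$ of the Onsager algebra; both are assumptions for this example, and the helper names are hypothetical.

```python
import numpy as np
from functools import reduce

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def site_op(ops, n_sites):
    """Tensor product with ops[s] on site s and the identity elsewhere."""
    return reduce(np.kron, [ops.get(s, I2) for s in range(n_sites)])

def ising_generators(n_sites):
    """Assumed realization of A0, A1 on a periodic chain."""
    a0 = sum(site_op({j: SX}, n_sites) for j in range(n_sites))
    a1 = sum(site_op({j: SZ, (j + 1) % n_sites: SZ}, n_sites) for j in range(n_sites))
    return a0, a1

def comm(x, y):
    """Ordinary commutator [x, y]."""
    return x @ y - y @ x

def onsager_elements(a0, a1):
    """Build G1, A2 and A_{-1} from (58), using 4 G1 = [A1, A0]."""
    g1 = comm(a1, a0) / 4
    a2 = a0 + comm(g1, a1) / 2        # (58) with k = 1
    a_minus1 = a1 - comm(g1, a0) / 2  # (58) with k = 0
    return g1, a2, a_minus1
```

Both checks only use (39) and the definition of $G_1$, so they hold for any pair satisfying the Dolan-Grady relations, not just for this realization.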
The relations (39) are actually sufficient to reconstruct the Onsager algebra (2). Furthermore, for finite-dimensional representations the spectral properties of (38) (as well as
arbitrary combinations of Ak , Gl ) are known to be encoded in the closure of the algebra
[23,24] which reads (for some coefficients k )

(59)
k Akl = 0,
k Gkl = 0.
k
In this paper, based on the link between the quadratic algebra (5) and the deformed Dolan-Grady integrable structure recently discovered in [13,14], we have found that the algebra (2) introduced by Onsager in [1] admits a q-deformed (in)finite-dimensional analogue (4) with fundamental generators $W_{-k}$, $W_{k+1}$, $G_{k+1}$, $\tilde{G}_{k+1}$ with $k \in \mathbb{N}$. Similarly to the Onsager algebra, the integrable structure follows from the q-deformed Dolan-Grady relations (i.e., tridiagonal algebra) (40) with $A \equiv W_0$, $A^* \equiv W_1$ and the q-deformed recursion relations in (4). It should be stressed that this new algebra possesses either finite- or infinite-dimensional representations: finite-dimensional representations (A.1) have been obtained, in which case the generators satisfy a set of linear relations (54)-(57) generalizing the Askey-Wilson ones. On the other hand, in the limit $N \to \infty$ the algebra³ (4) becomes infinite-dimensional, in which case vertex operator representations can be used [13].
This new symmetry ensures the existence of an (in)finite number (associated with the
(in)finite parameter N ) of mutually commuting quantities given by (31). For the special
(undeformed) case q = 1, the defining relations (40), (4) coincide exactly with the ones
considered in [12,23]. Indeed, simple comparison between (3), (4) and (31) gives the exact
relation between our generators and the ones in [1]:
Wk
q=1 (Ak + Ak )/2,
Wk+1
q=1 (Ak+1 + Ak+1 )/2,
k+1
Gk+1
q=1 = G
4Gk+1 .
(60)
q=1
Also, the linear relations (26) as well as their generalizations (54)-(57) reduce to the ones proposed in [23].
The most interesting problem now is to analyze quantum integrable models with this
new mathematical framework, in order to extract any nonperturbative information. In this
direction, identifying the models with such underlying symmetry is obviously the first thing
to be done. In particular, it should be stressed that (4) is closely related with $U_{q^{1/2}}(sl_2)$ with generators $Q_\pm$, $\bar{Q}_\pm$ [13,14]. Then, it is expected that the characteristics of the model are encoded in the parameters N, q, $c_0$, v whereas the fundamental generators should
correspond to some observables. Below, we give various examples of quantum integrable
models (lattice, massive, boundary or conformal) which enjoy the symmetry (4).
XXZ open spin chain with general boundary conditions. The fundamental generators $W_0^{(N)}$, $W_1^{(N)}$ are related⁴ with the nonlocal conserved charges obtained in [32] using the method proposed in [33,34], and N corresponds to the number of sites. The deformation parameter q characterizes the anisotropy $\Delta = (q^{1/2} + q^{-1/2})/2$ of the model, whereas
, (c0 ) are non-diagonal left (right) boundary conditions, respectively. Also, v = 1.
Note that for more general right boundary conditions associated with extra parameters
,
the algebraic structure remains essentially unchanged. It follows that the underlying fundamental symmetry of the XXZ open spin chain with general boundary conditions is the q-deformed Onsager algebra (4) with representations (A.1). This symmetry, sometimes called boundary quantum group algebra, is in one-to-one correspondence with the tridiagonal algebra [26-28] as shown in [14]. Based on the previous analysis, it follows that the transfer matrix can be simply written as (30). Details will be reported elsewhere.
Sine-Gordon quantum field theory. In the bulk, the sine-Gordon model is known to possess nonlocal conserved charges [7], usually denoted $Q_\pm$, $\bar{Q}_\pm$, generating a $U_{q^{1/2}}(sl_2)$ symmetry. Actually, $W_0^{(N)}$, $W_1^{(N)}$ for $N \to \infty$ admit a vertex operator representation in one-to-one correspondence with linear combinations of these charges [13] and parametrized by $c_0$ (arbitrary). The deformation parameter q and the parameter v are easily related with the coupling constant $\beta^2$ and the rapidity of the fundamental particles (soliton/antisoliton), respectively. In the case of a non-dynamical [21] or a dynamical boundary [35] the model still remains integrable, but the symmetry is restricted. The corresponding nonlocal conserved charges have been constructed in [33,34], and generate an example of tridiagonal algebra [13,14]. For these boundary integrable models, one has the identification $N \to \infty$ and $c_0 = 1$.
³ Of course, the linear relations (54)-(57) need to be clarified in this case.
⁴ Note that the nonlocal charges derived in [32] correspond to certain integrable boundary conditions, not the most general ones.
Liouville quantum field theory. In this conformal limit of the sine-Gordon model, it is easy to check that either $Q_+$, $\bar{Q}_-$ or $Q_-$, $\bar{Q}_+$ are conserved (commuting with the stress-energy tensor). Then, an infinite number of conserved quantities are obtained from (31) for
N and some vanishing parameters , , . It follows that the Liouville field theory
(as well as its boundary counterpart) enjoys the symmetry generated by a subalgebra of (4).
Quasi-exactly solvable systems, Bethe ansatz and q-difference equations. For general values of $N \neq 1$, the spectral problem associated with (31) leads to a system of partial q-difference equations that clearly needs further investigation. For the special case N = 1, one obtains a second-order q-difference equation which has been considered in detail in [18,36], and leads to Bethe equations. Interestingly, in the limit $q \to 1$ it becomes the Heun (or similarly the Pöschl-Teller) equation. Furthermore, at this special value of the deformation parameter the Onsager algebra (2) exhibited and studied in the context of quasi-exactly
solvable systems and nonlinear holomorphic supersymmetry [37] is recovered. Then, we
expect our construction will provide a new (algebraic) approach to the surprising relation
between conformal field theory and differential equations pointed out in [38], as well as its
massive counterpart.
Related problems will be considered elsewhere.
Acknowledgements
We thank P. Terwilliger for comments. K. Koizumi is supported by CNRS and French
Ministry of Education and Research. Part of this work is supported by the TMR Network
EUCLID "Integrable models and applications: from strings to condensed matter", contract number HPRN-CT-2002-00325.
Appendix A. Tensor product representations of the fundamental generators
1 1/4
(N )
vq S+ q s3 /2 I + v 1 q 1/4 S q s3 /2 I + q s3 W0 ,
c0
1
(N +1)
(N )
W1
= v 1 q 1/4 S+ q s3 /2 I + vq 1/4 S q s3 /2 I + q s3 W1 ,
c0
(N +1)
W0

(N +1)
G1

q 1/2 + q 1/2 2 s3
(N )
2 s3
v
I + I G1
q
+
v
q
c0 (q 1/2 q 1/2 )

(N )
(N )
+ q q 1 vq 1/4 S q s3 /2 W0 + v 1 q 1/4 S q s3 /2 W1
2

= q q 1 S
I
(v 2 + v 2 )w0
I,
+
c0 (q 1/2 q 1/2 )
(j )
(N +1)
G
1

q q 1 2
q 1/2 + q 1/2 2 s3
(N )
v q
S
+ v 2 q s3 I + I G
+
1
2
1/2
1/2
c0 (q q
)
c0
q q 1 1 1/4
(N )
(N )
v q S+ q s3 /2 W0 + vq 1/4 S+ q s3 /2 W1
c0
(v 2 + v 2 )w0
I,
+
c0 (q 1/2 q 1/2 )
(j )
(N +1)
Wk1
w (q 1/2 + q 1/2 )q s3
v 2 + v 2
(N )
)
= 0
I W(N
k
k+1
q 1/2 + q 1/2
q 1/2 + q 1/2
(j )
(v 2 + v 2 )w0
q 1/2 q 1/2
(N +1)
Wk
+ 1/2
1/2
1/2
2
(q + q
)
(q + q 1/2 )2

1/4
(N )
(N )
vq S+ q s3 /2 Gk+1 + c0 v 1 q 1/4 S q s3 /2 G
k+1
(j )
(N )
+ q s3 Wk1 ,
(N +1)
Wk+2
w0 (q 1/2 + q 1/2 )q s3
v 2 + v 2
(N )
(N )
Wk 1/2
I Wk+1
1/2
1/2
q +q
q + q 1/2
(j )
(v 2 + v 2 )w0
q 1/2 q 1/2
(N +1)
W
+
(q 1/2 + q 1/2 )2 k+1
(q 1/2 + q 1/2 )2

1 1/4
(N )
(N )
v q
S+ q s3 /2 Gk+1 + c0 vq 1/4 S q s3 /2 G
k+1
(j )
+ q s3 Wk+2 ,
(N )
(N +1)
Gk+2
2 s

c0 (q 1/2 q 1/2 )2 2
1
(N )
(N )
S G
v q 3 + v 2 q s3 Gk+1
k+1
q 1/2 + q 1/2
q 1/2 + q 1/2
1/4

)
)
(N )
1
vq
+ I G(N
+
q
q
S q s3 /2 W(N
k+2
k1 Wk+1

(N )
(N )
+ v 1 q 1/4 S q s3 /2 Wk+2 Wk
(v 2 + v 2 )w0
(N +1)
G
,
(q 1/2 + q 1/2 )2 k+1
(j )

(N +1)
G
k+2
2 s

(q 1/2 q 1/2 )2 2
1
(N )
3 + v 2 q s3 G
(N )
S
v
q
+
k+1
k+1
c0 (q 1/2 + q 1/2 )
q 1/2 + q 1/2
(N )
+IG
k+2
(N )
q q 1 1 1/4
(N )
v q S+ q s3 /2 Wk1 Wk+1
+
c0
(v 2 + v 2 )w (j )
(N )
(N )
0 (N +1)
G
(A.1)
+ vq 1/4 S+ q s3 /2 Wk+2 Wk + 1/2
(q + q 1/2 )2 k+1
for $k \in \{0, 1, \ldots, N-1\}$.

Appendix B. Generalized linear relations
The purpose of this appendix is to show that the linear relations (26), as well as their generalizations (54)-(57), are satisfied for all values of N. For N fixed, let us first assume that $W_{-k}^{(N)}$, $W_{k+1}^{(N)}$, $G_{k+1}^{(N)}$ and $\tilde{G}_{k+1}^{(N)}$ with $k \in \{0, \ldots, N\}$ satisfy (54)-(57). Then, a straightforward calculation based on the finite-dimensional representations (A.1) shows that
N
+1

(N +1) (N +1)
(N +1) (N +1)
Ck+1 Wkl c0 q 1/2 q 1/2 0
Wl
k=1
N +1
(N +1) (N )
w0 (q 1/2 + q 1/2 )q s3
k+1 Wk+l
1/2
1/2
q +q
(j )
k=1
v2
+ v 2
q 1/2 + q 1/2
I

N
+1
(N +1)
(N )
k+1 Wk+1l
k=1
N +1
(N +1) (N )
q 1/2 q 1/2
k+1 Gk+l
vq 1/4 S+ q s3 /2
1/2
1/2
q +q
k=1

N
+1

(N +1) (N )
1 1/4
s3 /2
+ c0 v q
S q
G
+
k+1
k+l
k=1
+ q s3
N
+1
(N +1)
(v 2 + v 2 )w0 (N +1) (N +1)
Wl
(q 1/2 + q 1/2 )2 0
j
(N )
k+1 Wkl +
k=1

(N +1) (N +1)
c0 q 1/2 q 1/2 0
Wl
,
where l 0 and
(N +1)
k+1 =
N
+1

m=k
(v 2 + v 2 )w0
(q 1/2 + q 1/2 )2
(j )
Using (24), it is easy to notice that

(N )

(N +1)
(k1) = q 1/2 + q 1/2 Ck+2
mk
(N +1)
Ck+1 .
for 2 k N + 1
(B.1)

(N +1)
and 0

(N )
= c0 q q 1 0 .
Replacing these expressions and (20) in (B.1), all terms vanish in virtue of (54)(57).
Consequently,
N
+1

(N +1) (N +1)
(N +1) (N +1)
Ck+1 Wkl c0 q 1/2 q 1/2 0
Wl
= 0,
(B.2)
k=1
provided (54)–(57) hold for $N$ fixed. Similar relations also hold for $W_{k+1}^{(N+1)}$, $G_{k+1}^{(N+1)}$ and
$\tilde G_{k+1}^{(N+1)}$ with $k \in \{0, \dots, N+1\}$. Now, we proceed by recursion:
General case $N$ and $l = 0$: the linear relations (26) are satisfied for $N = 1$ and $N = 2$,
corresponding either to the Askey–Wilson relations (for $N = 1$) or to their $N = 2$ generalization (50). According to (B.2), it follows that (26) is satisfied for all values of $N$.
General case $N$ and arbitrary $l \geq 0$: due to (26), (10) admits the representation (17)
and the mutually commuting operators take the form (31). For $N = 1$ and $N = 2$, we
obtained (48) for arbitrary $l \geq 0$. Due to (B.2), (54) is indeed satisfied for all values
of $N$. Clearly, a similar analysis can be done (see (53)) for $W_{k+1}^{(N)}$, $G_{k+1}^{(N)}$ and $\tilde G_{k+1}^{(N)}$ with
$k \in \{0, \dots, N\}$. Then, we conclude that (54)–(57) are satisfied for all values of $N$.
The vertex-face correspondence and correlation functions of the fusion eight-vertex model
I: The general formalism
Takeo Kojima a , Hitoshi Konno b , Robert Weston c
a Department of Mathematics, College of Science and Technology, Nihon University,
Chiyoda-ku, Tokyo 101-0062, Japan

b Department of Mathematics, Faculty of Integrated Arts & Sciences, Hiroshima University,
Higashi-Hiroshima 739-8521, Japan

c Department of Mathematics, Heriot-Watt University, Edinburgh EH14 4AS, UK
Abstract
By making use of the vertex-face correspondence, we give an algebraic analysis formulation of
correlation functions of the $k \times k$ fusion eight-vertex model in terms of the corresponding fusion
SOS model. Here $k \in \mathbb{Z}_{>0}$. A general formula for correlation functions is derived as a trace over
the space of states of lattice operators such as the corner-transfer matrices, the half-transfer matrices
(vertex operators) and the tail operator. We give a realization of these lattice operators as well as the
space of states as objects in the level $k$ representation theory of the elliptic algebra $U_{q,p}(\widehat{\mathfrak{sl}}_2)$.
1. Introduction
The eight-vertex model was solved by Baxter in the series of seminal papers [1–4].
One of the key insights in these papers was the realization that by a suitable change of
basis it was possible to map the model to an Ising-like model [4]. This model, which we
now refer to as an SOS model, possessed the property of charge conservation through a
vertex, and its transfer matrix could be diagonalized using a conventional Bethe ansatz.
The height restricted versions of the SOS model, now called RSOS or ABF models, later
achieved independent fame, largely due to their connection with conformal field theory
models [5,6].
E-mail addresses: kojima@math.cst.nihon-u.ac.jp (T. Kojima), konno@mis.hiroshima-u.ac.jp (H. Konno),
r.a.weston@ma.hw.ac.uk (R. Weston).
doi:10.1016/j.nuclphysb.2005.05.012
T. Kojima et al. / Nuclear Physics B 720 [FS] (2005) 348–398
The method of fusion, leading to higher-spin analogues of the eight-vertex model, was
developed in [7] and [8]. Baxter's vertex-face correspondence between the eight-vertex
and SOS models was then extended to these fusion models in [9]. As a result, fusion, or
higher-spin, SOS models were defined and studied in great detail [10,11].
In the early 1990s a new approach to solvable lattice models was developed by Jimbo,
Miwa and their collaborators [12,13]. The approach was applied originally and most fully
to the six-vertex model. The central idea was to exploit to the fullest possible extent an underlying
$U_q(\widehat{\mathfrak{sl}}_2)$ symmetry of the infinite lattice model. The transfer matrix and its associated
vector space, the eigenstates, and the local operators of the model were all constructed in
terms of this algebra and its associated vertex operators. This enabled this group to express
all correlation functions of the six-vertex model in terms of traces of algebraic objects over
highest-weight representations of $U_q(\widehat{\mathfrak{sl}}_2)$. One final ingredient, a free-field realization of
the algebra, was then used to compute these traces, thus yielding multiple integral expressions for correlation functions [13].
The success of this method, which we shall call the algebraic analysis approach, led to a
great deal of subsequent work in which a whole variety of models were considered from a
similar point of view. In particular, a parallel discussion of the eight-vertex model in terms
of an elliptic algebra, $A_{q,p}(\widehat{\mathfrak{sl}}_2)$ [14–16], was presented in [17,18]. However, a free-field
realization of $A_{q,p}(\widehat{\mathfrak{sl}}_2)$ was and still is lacking. The difficulty is again essentially related
to the lack of charge conservation for the eight-vertex model. Thus, while expressions for
correlation functions as traces over highest-weight modules exist, it has not proved possible
in general to evaluate these traces.
RSOS models were also considered using the algebraic analysis approach [19,20]. In
this case, a free-field realization of the vertex operators appearing in the trace formula
was constructed and the trace was computed [20]. What is more, this was originally done,
perhaps surprisingly, in the absence of a full understanding of the underlying symmetry
algebra.
The correlation functions of the eight-vertex model were finally computed in a beautiful piece of work by Lashkevich and Pugai [21,22]. Their approach was to use and extend
Baxter's vertex-face correspondence in order to map the trace expression for correlation
functions of the eight-vertex model into one for SOS models. They then computed the
latter using the free-field realization of [20]. An interesting aspect of this work was that
correlation functions of the eight-vertex model were found to correspond to correlation
functions of the SOS model incorporating a certain dislocation or tail operator. Furthermore, this tail operator had a surprisingly simple realization in terms of the SOS free-field
realization.
The elliptic algebra $U_{q,p}(\widehat{\mathfrak{sl}}_2)$ associated with fusion SOS models was first defined in
terms of elliptic Drinfeld currents in [23]. This algebra, or more precisely the closely related
algebra $B_{q,\lambda}(\widehat{\mathfrak{sl}}_2)$ [16], was later interpreted as a quasi-Hopf twisting of $U_q(\widehat{\mathfrak{sl}}_2)$ in
[24]. A free-field realization of $U_{q,p}(\widehat{\mathfrak{sl}}_2)$ was constructed in [23,24] and the level one case
was shown to be equivalent to the phenomenologically derived realization of [20].
In this paper, we revisit the approach of Lashkevich and Pugai [21] in light of these
recent developments in the understanding of the underlying symmetry algebra $U_{q,p}(\widehat{\mathfrak{sl}}_2)$.
Our twin motivations to carry out this work were: to generalize the approach of [21] to the
higher-spin fusion vertex models, and obtain as many explicit results as possible; and to
construct the objects appearing in the correlation function traces, such as vertex and tail
operators, directly in terms of the algebra $U_{q,p}(\widehat{\mathfrak{sl}}_2)$.
In Section 2, we construct the higher-spin vertex and SOS weights, and the higher-spin
intertwiners that relate them. In Section 3, we review the relevant aspects of the algebraic
analysis approach for vertex and SOS models. In Section 4, we generalize the graphical
arguments of [21] in order to connect correlation functions of the higher-spin vertex models to those of SOS models with an associated tail operator. In Section 5, we review the
construction of $U_{q,p}(\widehat{\mathfrak{sl}}_2)$ and give a direct algebraic construction of the vertex operators
and the tail operators occurring in the SOS trace expression for vertex model correlation
functions. One of our key results is contained in Conjecture 5.8, which gives a remarkably
simple algebraic picture of the tail operator as an integer power of one of the half-currents
occurring in the definition of $U_{q,p}(\widehat{\mathfrak{sl}}_2)$. In a subsequent paper [25], we shall consider the
spin-1 generalization of the eight-vertex model in detail. This corresponds to the level
k = 2 case of the general formalism given in the present paper.
2. The fusion vertex and SOS models

In this section, we define and relate the fusion vertex and SOS models that are of interest
to us in this paper. They are statistical-mechanical models whose Boltzmann weights are
expressed in terms of elliptic theta functions.
2.1. Notation
K
First, let us fix our notation for theta functions. Let p = e K , q = e 2K and =
2i
u
K
e 2K . We introduce x, and r by x = q, p = e = x 2r , i.e., = 2iK
K and r = .
Throughout this paper Im > 0.
We use the theta functions defined in terms of p = e2i by
1 (u| ) = 2p 1/8 (p;
p)
sin u

0 (u| ) = ie

2 (u| ) = 1 u +
i(u+/4)

1 2p n cos 2u + p 2n ,
n=1

u + ,
2
1

1
,
2

+ 1
i(u+/4)
.
3 (u| ) = e
1 u +
2
We define the symbols [u](s) , [u] and [u] , by

2u
u2
u
(s)
u
s
,
[u] = x
= C1
x 2s x
r
[u] = [u](r) ,
C = x 4 e 4 1/2 ,
[u] = [u](rk)
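with
$$\Theta_p(z) = (z; p)_\infty\, (p/z; p)_\infty\, (p; p)_\infty, \qquad (z; p_1, p_2, \dots, p_m)_\infty = \prod_{n_1, n_2, \dots, n_m = 0}^{\infty} \bigl(1 - z\, p_1^{n_1} p_2^{n_2} \cdots p_m^{n_m}\bigr).$$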
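The infinite products above can be explored numerically. The following is a minimal sketch, assuming $|p| < 1$ and truncating each product after a fixed number of factors; the function names `qpochhammer` and `theta` and the truncation parameter `terms` are illustrative choices, not notation from the paper. It checks the elementary quasi-periodicity $\Theta_p(pz) = -\Theta_p(z)/z$, which follows directly from the product form.

```python
def qpochhammer(z, p, terms=200):
    """Truncated (z; p)_infinity = prod_{n >= 0} (1 - z p^n), assuming |p| < 1."""
    result = 1.0
    for n in range(terms):
        result *= 1.0 - z * p**n
    return result


def theta(z, p, terms=200):
    """Theta_p(z) = (z; p)_inf (p/z; p)_inf (p; p)_inf, truncated numerically."""
    return (
        qpochhammer(z, p, terms)
        * qpochhammer(p / z, p, terms)
        * qpochhammer(p, p, terms)
    )


# Sample values chosen only for illustration.
p_sample = 0.3
z_sample = 0.7 + 0.2j

# Quasi-periodicity: Theta_p(p z) = -Theta_p(z) / z.
defect = abs(theta(p_sample * z_sample, p_sample) + theta(z_sample, p_sample) / z_sample)
print("quasi-periodicity defect:", defect)
```

The truncation at 200 factors is far more than needed for $p = 0.3$; for $p$ close to 1 the number of factors would have to grow accordingly.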
Furthermore, we also make use of functions defined in terms of $[u]$ by
$$[A]_M = [A][A-1] \cdots [A-M+1],$$
$$[A, B] = [A][A+1] \cdots [B] \quad (A < B), \qquad [A, A-1] = 1,$$
$$\begin{bmatrix} A \\ B \end{bmatrix} = \frac{[A][A-1] \cdots [A-B+1]}{[B][B-1] \cdots [1]},$$
1 a+bM a+b+M

[ 2 , 2 ]
M
(a, b)M = (b, a)M = ab+M
.
[a][b]
2
In order to distinguish the above notation from that of the q-integer, we use the following
notation for the latter:
$$[\![ n ]\!]_x = \frac{x^n - x^{-n}}{x - x^{-1}}.$$
Finally, the relation of our , , r, u to , , L, u in Date et al. [11] is as follows:
DJKMO = 1 , DJKMO = i, LDJKMO = r and uDJKMO = u.
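As a small illustration of the q-integer notation, the sketch below evaluates $(x^n - x^{-n})/(x - x^{-1})$ for real $x \neq \pm 1$. The function name `q_integer` is a hypothetical helper introduced here for illustration only.

```python
def q_integer(n, x):
    """q-integer (x^n - x^(-n)) / (x - x^(-1)), for x != +1, -1 and x != 0."""
    return (x**n - x ** (-n)) / (x - 1.0 / x)


# For n = 2 the q-integer reduces to x + 1/x, and for n = 3 to x^2 + 1 + x^(-2).
x_sample = 1.5
print(q_integer(2, x_sample), x_sample + 1.0 / x_sample)

# As x approaches 1 the q-integer approaches the ordinary integer n.
print(q_integer(5, 1.000001))
```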
2.2. The fusion vertex models

2.2.1. The eight-vertex model
The eight-vertex model is a two-dimensional square lattice model. The dynamical variable $\varepsilon_j$ takes the values $+$ or $-$. For each vertex, we associate a variable $\varepsilon_j$ with each
edge $j$. We allow only the eight configurations for each vertex as depicted in Fig. 1(b).
We assign the following Boltzmann weight (the R-matrix) $R(u)^{\varepsilon_1' \varepsilon_2'}_{\varepsilon_1 \varepsilon_2}$ to each configuration:
$$R(u) = R_0(u) \begin{pmatrix} a(u) & & & d(u) \\ & b(u) & c(u) & \\ & c(u) & b(u) & \\ d(u) & & & a(u) \end{pmatrix}, \qquad (2.1)$$
where
R0 (u) = z
r1
2r
(pq 2 z; q 4 , p) (q 2 z; q 4 , p) (p/z; q 4 , p) (q 4 /z; q 4 , p)

,
(pq 2 /z; q 4 , p) (q 2 /z; q 4 , p) (pz; q 4 , p) (q 4 z; q 4 , p)
(2.2)
Fig. 1. The eight-vertex model: (a) the R-matrix; (b) the eight possible configurations.
a(u) =
c(u) =
u
2 ( 2r1 | 2 )2 ( 2r
| 2)
2 (0 | 2 )2 ( 1+u
2r | 2 )
u
1 ( 2r1 | 2 )2 ( 2r
| 2)
2 (0 | 2 )1 ( 1+u
2r | 2 )
u
2 ( 2r1 | 2 )1 ( 2r
| 2)
b(u) =
d(u) =
2 (0 | 2 )1 ( 1+u
2r | 2 )
u
1 ( 2r1 | 2 )1 ( 2r
| 2)
2 (0 | 2 )2 ( 1+u
2r | 2 )
(2.3)
,
(2.4)
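with $z = \zeta^2 = x^{2u}$. The extra parameter $u$ is called the spectral parameter. Let $V$ denote a
two-dimensional vector space spanned by $v_+$, $v_-$. We regard $R(u)$ as an operator acting
on $V \otimes V$,
$$R(u)\, v_{\varepsilon_1} \otimes v_{\varepsilon_2} = \sum_{\varepsilon_1', \varepsilon_2' = +, -} R(u)^{\varepsilon_1' \varepsilon_2'}_{\varepsilon_1 \varepsilon_2}\, v_{\varepsilon_1'} \otimes v_{\varepsilon_2'}.$$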
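The R-matrix satisfies the Yang–Baxter equation, unitarity and crossing symmetry relations given as follows:
$$R_{12}(u_1 - u_2)\, R_{13}(u_1 - u_3)\, R_{23}(u_2 - u_3) = R_{23}(u_2 - u_3)\, R_{13}(u_1 - u_3)\, R_{12}(u_1 - u_2), \qquad (2.5)$$
$$R(u)\, P\, R(-u)\, P = I, \qquad (2.6)$$
$$\bigl(P R(u) P\bigr)^{t_1} = (\sigma^y \otimes 1)\, R(-u-1)\, (\sigma^y \otimes 1), \qquad (2.7)$$
where $P(v_{\varepsilon_1} \otimes v_{\varepsilon_2}) = v_{\varepsilon_2} \otimes v_{\varepsilon_1}$, and $t_1$ denotes transposition with respect to the first vector
space in the tensor product. Note that $R_0(u)$ satisfies the following inversion relations: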
R0 (u)R0 (u) = 1,
R0 (u)R0 (u + 1) =
[u + 1]
.
[u]
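The structure of the Yang–Baxter equation on $V \otimes V \otimes V$ can be checked numerically. The sketch below is not the elliptic eight-vertex R-matrix of this section: as a simpler stand-in it uses the trigonometric six-vertex R-matrix with weights $\sinh(u+\eta)$, $\sinh u$, $\sinh\eta$ (so $d = 0$), which is well known to satisfy the same additive form of the equation. All names (`six_vertex_r`, `ybe_defect`, the parameter `eta`) are illustrative assumptions.

```python
import math


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)] for i in range(n)]


def kron(a, b):
    """Kronecker (tensor) product of two matrices stored as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
# Permutation operator P on V (x) V in the basis (++, +-, -+, --).
SWAP = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
P23 = kron(IDENTITY, SWAP)


def six_vertex_r(u, eta):
    """Stand-in R-matrix (six-vertex, d = 0); not the eight-vertex weights of the text."""
    a, b, c = math.sinh(u + eta), math.sinh(u), math.sinh(eta)
    return [[a, 0.0, 0.0, 0.0], [0.0, b, c, 0.0], [0.0, c, b, 0.0], [0.0, 0.0, 0.0, a]]


def r12(u, eta):
    return kron(six_vertex_r(u, eta), IDENTITY)


def r23(u, eta):
    return kron(IDENTITY, six_vertex_r(u, eta))


def r13(u, eta):
    # Conjugating R12 by the swap of spaces 2 and 3 gives the action on spaces 1 and 3.
    return matmul(P23, matmul(r12(u, eta), P23))


def ybe_defect(u1, u2, u3, eta):
    """Largest entry of |R12 R13 R23 - R23 R13 R12| with difference-type arguments."""
    lhs = matmul(r12(u1 - u2, eta), matmul(r13(u1 - u3, eta), r23(u2 - u3, eta)))
    rhs = matmul(r23(u2 - u3, eta), matmul(r13(u1 - u3, eta), r12(u1 - u2, eta)))
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(8) for j in range(8))


print("Yang-Baxter defect:", ybe_defect(0.7, 0.3, -0.2, 0.4))
```

Replacing `six_vertex_r` by a numerical implementation of the weights $a(u)$, $b(u)$, $c(u)$, $d(u)$ would test (2.5) for the eight-vertex model itself; the embedding of $R$ into the triple tensor product is unchanged.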
2.2.2. Fusion of the eight-vertex model

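Define the operator $\Pi_{1 \cdots k}$ by
$$\Pi_{1 \cdots k} = \frac{1}{k!} (P_{1k} + \cdots + P_{k-1\,k} + I) \cdots (P_{13} + P_{23} + I)(P_{12} + I).$$
This yields the projection operator onto the space $V^{(k)}$ of symmetric tensors in $V^{\otimes k}$.
A basis for $V^{(k)}$ is given by $\{v^{(k)}_\varepsilon\}_{\varepsilon = -k, -k+2, \dots, k}$, where $\varepsilon = \varepsilon_1 + \varepsilon_2 + \cdots + \varepsilon_k$ (with
$\varepsilon_i \in \{+, -\}$), and
$$v^{(k)}_\varepsilon = \Pi_{1 \cdots k}\, v_{\varepsilon_1} \otimes v_{\varepsilon_2} \otimes \cdots \otimes v_{\varepsilon_k} = \frac{1}{k!} \sum_{\sigma \in S_k} v_{\varepsilon_{\sigma(1)}} \otimes v_{\varepsilon_{\sigma(2)}} \otimes \cdots \otimes v_{\varepsilon_{\sigma(k)}}, \qquad (2.8)$$
where $S_k$ is the symmetric group.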
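The symmetrizer just defined can be built explicitly for small $k$. The sketch below is an illustrative construction, assuming the basis of $V^{\otimes k}$ is labeled by tuples of signs $\pm 1$; it averages the permutation operators over $S_k$ with exact rational arithmetic and checks that the result is a projector whose trace, and hence rank, is $k + 1$, the dimension of the space of symmetric tensors. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial


def symmetrizer(k):
    """Return (basis, matrix) for the average over S_k of the permutation operators."""
    basis = list(product((+1, -1), repeat=k))
    index = {eps: i for i, eps in enumerate(basis)}
    dim = len(basis)
    proj = [[Fraction(0)] * dim for _ in range(dim)]
    weight = Fraction(1, factorial(k))
    for sigma in permutations(range(k)):
        for j, eps in enumerate(basis):
            # A permutation sends the basis vector labeled eps to the permuted label.
            permuted = tuple(eps[sigma[m]] for m in range(k))
            proj[index[permuted]][j] += weight
    return basis, proj


def matmul_exact(a, b):
    """Exact matrix product for matrices of Fractions stored as nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)] for i in range(n)]


basis3, proj3 = symmetrizer(3)
trace3 = sum(proj3[i][i] for i in range(len(basis3)))
print("idempotent:", matmul_exact(proj3, proj3) == proj3, " trace:", trace3)
```

Columns of the matrix with the same total sign $\varepsilon_1 + \cdots + \varepsilon_k$ coincide, which reflects the fact that the symmetrized vector depends only on that sum.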

We then define an operator

R1k,j (u) = 1k R1j (u + k 1) Rk1j (u + 1)Rk j (u) End V (k) Vj .
This satisfies
R1k,j (u)1k = R1k,j (u).
The k k fusion of the R-matrix is then given by
R (k,k) (u) = 1
k R1k,k (u)R1k,k1 (u 1) R1k,1 (u k + 1).
(2.9)
This is an operator in End(V (k) V (k) ). It satisfies

R (k,k) (u) = R (k,k) (u)1k = R (k,k) (u)1
k .
(2.10)
Using the YBE (2.5) repeatedly and (2.10), we verify that R (k,k) (u) satisfies the YBE on
V (k) V (k) V (k) . R (k,k) (u) also satisfies the unitarity condition which is the simple
higher k version of (2.6).
The situation with crossing symmetry is somewhat more complicated. For general $k$ we
have the following relation [13]:
$$\bigl(P^{(k)} R^{(k,k)}(u) P^{(k)}\bigr)^{t_1} = (Q \otimes \mathrm{id})\, R^{(k,k)}(-u-1)\, (Q^{-1} \otimes \mathrm{id}), \qquad (2.11)$$
where $P^{(k)}$ is the permutation operator acting on $V^{(k)} \otimes V^{(k)}$, and $Q$ is a $(k+1)$-dimensional matrix whose entries are independent of $u$. Clearly, for $k = 1$, it follows from
(2.7) that $Q = \sigma^y$. For $k = 2$, we find by explicit calculation that [26]

2
0 1 y2
1 1+y
Q=
0
x2
0
2 1 y2 0 1 + y2
where
x2 =
1 0 (0 | )3 ( 1 | )
,
2 0 ( 1 | )3 (0 | )
y2 =
2 (0 | )3 ( 1 | )
2 ( 1 | )3 (0 | )
The matrix elements of $R^{(k,k)}(u)$ then define the $k \times k$ fusion eight-vertex model, whose
dynamical variables take values in $\{-k, -k+2, \dots, k\}$.
2.2.3. The ground states

We consider the principal regime specified by
0 < k < 1,
0 < < K ,
1 < u < 0,
where k denotes the elliptic modulus. Hence r > 1. In this regime, we find that for the k = 1
eight-vertex model we have c > a, |b|, d, with a > 0, b < 0, d > 0. Maximal Boltzmann
weight configurations, which we refer to as ground states, will therefore only involve the
weight c. More generally, for arbitrary k > 0, we find that a maximal-weight configuration
of edge variables is labeled by
{0, 1, . . . , k}, and is a periodic repetition of the pattern
in Fig. 2, where
= k 2
.
2.3. Fusion SOS models
2.3.1. The eight-vertex SOS model
The eight-vertex SOS model, usually referred to as simply the SOS model, is also a two-dimensional square lattice model [5]. The dynamical variables $a_j$ are called local heights.
They take values in $\mathbb{Z}$. For each face, we associate a local height $a_j$ with each vertex $j$.
We allow only the configurations satisfying the so-called admissibility condition $|a_j - a_k| = 1$
for any two adjacent local heights $a_j$ and $a_k$. Then we have only the six possible
configurations for each face depicted in Fig. 3(b).
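The count of six admissible face configurations can be verified by direct enumeration. The sketch below assumes the four corner heights of a face are listed cyclically as $(a, b, c, d)$, so that the adjacent pairs are $(a,b)$, $(b,c)$, $(c,d)$ and $(d,a)$; this labeling and the function name are illustrative conventions, not the paper's.

```python
from itertools import product


def admissible_faces(n):
    """All faces (a, b, c, d) with a = n and adjacent heights differing by exactly 1."""
    faces = []
    for step_b, step_d in product((+1, -1), repeat=2):
        b, d = n + step_b, n + step_d
        for c in (b + 1, b - 1):
            # c is already adjacent to b; it must also be adjacent to d.
            if abs(c - d) == 1:
                faces.append((n, b, c, d))
    return faces


faces = admissible_faces(0)
for face in faces:
    print(face)
print("number of admissible faces:", len(faces))
```

The count does not depend on the reference height `n`, since the admissibility condition involves only height differences.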
Fig. 2. The ground state configuration of the fusion vertex model labeled by $\ell$.
Fig. 3. The SOS model: (a) the face weight; (b) the six possible configurations.
We assign the
figuration:

n
W
n1

n
W
n1

n
W
n1
following Boltzmann weight (or face weight) W
a1 a2

a4 a3 u to each con-

n 1
u = R0 (u),
n2

[n u][1]
n 1
u = R0 (u)
,
n
[n][1 + u]

[n 1][u]
n 1
u = R0 (u)
.
n
[n][1 + u]
(2.12)
The face weights satisfy the following face-type Yang–Baxter equation, unitarity, and
crossing symmetry relations:

a b f g b c
u v
v W
u W
W
g d
e d
f g
g

a g
g c
a b

u ,
v W
=
(2.13)
uv W
W
e d
g c
f e
g

a b a e

(2.14)
u = 1,
u W
W
d c
e c
e

a+dbc [b]
a b
d a
2
u
=
()
W
(2.15)
1
.
W
d c
c b
[a]
2.3.2. Fusion of the SOS model
The k k fusion of face weights is obtained as follows: define

a
a a1
(k,1) a b
u =
u+k1 W 1
W
W
d c
d d1
d1
d1 ,...,dk1

a
b
u .
W k1
dk1 c
a2
d2

u + k 2

Then the RHS is independent of the choice of $a_1, \dots, a_{k-1}$ provided $|a - a_1| = |a_1 - a_2| = \cdots = |a_{k-1} - b| = 1$ [11]. We then define

a b
u
W (k,k)
d c

b
(k,1) a
(k,1) a1 b1
u
k
+
1
W
u
k
+
2
=
W
a1 b1
a2 b2
a1 ,...,ak1

a
bk1
W (k,1) k1
(2.16)
u .
d
c
Fig. 4. A ground state configuration of the fusion SOS model.
Here the RHS is independent of the choice of $b_1, \dots, b_{k-1}$ provided $|b - b_1| = |b_1 - b_2| = \cdots = |b_{k-1} - c| = 1$. In $W^{(k,k)}$, the admissibility condition for the dynamical variables is
extended to $a_j - a_k \in \{-k, -k+2, \dots, k\}$ for any two adjacent local heights $a_j$, $a_k$. The
fused face weight $W^{(k,k)}$ satisfies the face-type YBE, unitarity and crossing symmetry
relations. The latter two relations are given by

a s
(k,k) a b
u
W
u = b,d ,
W (k,k)

d c
s c
s
W (k,k)
d
a

(k)
Ga,d (k,k) a
c
u
=
W
(k)
b
b
Gb,c
(2.17)

d
1
u
,
c
(2.18)
(k)
ga
where ga = sa [a], sa = 1, sa sa+1 = ()a and Ga,b = gb (a,b)
. In addition, we have a
k
symmetry for k Z>0 [11]

(a, b)k (d, a)k (k,k) d c
(k,k) d a
u =
u .
W
(2.19)
W
c b
a b
(d, c)k (c, b)k
2.3.3. The ground states
We discuss regime III specified by the region
0 < p < 1,
1 < u < 0.
In this regime, the ground states of the level k fusion SOS weights are of the form shown
in Fig. 4, where m Z,
{0, 1, . . . , k}, and we define
= k
.
The ground state indicated in Fig. 4 with height variable m +
on a specified reference
site is labeled by the pair (m,
).
2.4. The vertex-face correspondence
In order to solve the eight-vertex model by the Bethe ansatz, Baxter discovered the
celebrated identity referred to as the vertex-face correspondence [3]. This correspondence
was later generalized to higher fusion level k in [9].
2.4.1. The simple k = 1 case
Let us consider the following vector (Fig. 5(a))
(u)ab = + (u)ab v+ + (u)ab v ,

(a b)u + a
a
+ (u)b = 0
2 ,
2r

(u)ab = 3

(a b)u + a
2
2r
(2.20)
(2.21)
Fig. 5. (a) The intertwining vector; (b) the dual intertwining vector.
Fig. 6. The vertex-face correspondence: (a) via the intertwining vector; (b) via the dual intertwining vector.
with $|a - b| = 1$. Baxter showed the following identity (Fig. 6(a)).

a
R(u v)11 22 1 (u)ab 2 (v)bc =
2 (v)ab 1 (u)bc W
b

b Z
1 ,2

b
u
v
.
c
(2.22)
We hence call (u)ab the intertwining vector. This identity is a key formula throughout
this paper. One should note that Baxter's original intertwining vector intertwines the eight-vertex
model in the disordered regime with the SOS model in regime III [3,11]. In order to
consider the eight-vertex model in the principal regime, we have derived the eight-vertex
R-matrix (2.2), the SOS face weight (2.12) and the intertwining vector (2.21) from those
of Baxter [3] (which are the same as those used in [11]) by using Jacobi's imaginary transformation.
In addition to the intertwining vector, it is necessary to introduce its dual counterpart and
a second intertwining vector. The dual intertwining vector (u)ab (Fig. 5(b)) is defined by
ab 2
C (u 1)ab ,
2[b][u]
whereas the second intertwining vector (u)ab (b = a a) is given by

[u][a]
(u)ab =
(u)ab v , (u)ab =
(u 2)ab
[u
1][b]
=
(u)ab v = (u)ab ,
(u)ab =
(2.23)
with |a b| = 1 in both cases. Then by direct calculation, one can verify the following
inversion relations (Fig. 7)

(u)ab (u)bc = a,c ,
(2.24)
=
(u)ab (u)ba = , ,
(2.25)
a=b1
(u)ab (u)ca = b,c ,
(2.26)
Fig. 7. The inversion relations between the intertwining vector and its dual.
(u)ab (u)ba = , .
(2.27)
b=a1
These inversion properties are the reason that we call (u)ab the dual intertwining vector.
It then follows from the crossing symmetry properties of R and W that the following
vertex-face correspondence holds:

c b
1 2
a
b
a
b
u
v
.
R(u v) (u)b (v)c =
2 (v)b 1 (u)c W
b a
1
2
1 2

sZ
1 ,2
(2.28)
This relation is represented by Fig. 6(b).

2.4.2. The general k case
Let us now discuss the fusion of the vertex-face relationship (2.22). We define fused
intertwining vectors by
c
(k) (u)ab = 1k (u + k 1)ac1 (u + k 2)cc12 (u)bk1 .
(2.29)
The RHS is independent of the choice of c1 , . . . , ck1 provided |a c1 | = |c1 c2 | = =

|ck1 b| = 1. The components of (k) (u)ab are given by the following formula:

(k) (u)ab =
(k) (u)ab v(k) ,
{k,k+2,...,k}
(k) (u)ab =
1 (u + k 1)ac1 2 (u + k 2)cc12 k (u)bk1 .
1 ,...,k
1 +2 ++k =
From (2.9), (2.16), and (2.22), it follows that (k) (u)ab satisfies the k k fusion vertexface correspondence relations with respect to R (k,k) and W (k,k) . That is, we have

(k)
(k)
R (k,k) (u v)11 22 (u)ab (v)bc
1 ,2

b Z
(k)
(v)ab (k)
(u)bc W (k,k)
2
1
a
b

b
u
v
.
c
(2.30)
Similarly, we fuse the second intertwining vector (u)ab as follows:

(k) (u)ab = 1k (u + k 1)ac1 (u + k 2)cc12 (u)bk1 .
c
Then we find that the components of (k) (u)ab are given by

(k) (u)ab v(k) ,
(k) (u)ab =
{k,k+2,...,k}
[u + k 1][a] (k)
(u 2)ab .
[u 1][b]
In addition, we define the fusion of the dual intertwining vector in the following way:

(k) (u)ba =
(u + k 1)ca1 (u + k 2)cc21 (u)bck1
(k) (u)ab =
c1 ,...,ck1
(2.31)
with the property

1k (k) (u)ba = (k) (u)ba 1k .
Written out in component form, the last relation indicates that the RHS of

(k) (u)ba =
1 (u + k 1)ca1 2 (u + k 2)cc21 k (u)bck1
c1 ,...,ck1
is independent of the choice of 1 , . . . , k provided = 1 + + k .

As above, it follows immediately from (2.28), (2.9) and (2.16), that we have

(k)
(k)
R (k,k) (u v)11 22 (u)ab (v)bc
1 ,2

b Z

(k)
(v)ab (k)
(u)bc W (k,k)
2
1
c
b

b
uv .
a
(2.32)
It is worth noting that this formula is also obtained from (2.30) by using the crossing
symmetry properties (2.11) and (2.18). In fact, the fused dual intertwining vector is related
to the intertwining vector as follows:
(k)
(k)
(k) (u)ab = C (k) (u)Ga,b
(2.33)
Q (u 1)ab .
Here $C^{(k)}(u)$ is a certain normalization function. The $k = 1$ case is given in (2.23), whereas
the $k = 2$ case is given in [26].
Finally, using (2.24)–(2.27), it is easy to verify the following inversion relations:

(k) (u)ab (k) (u)bc = a,c ,
(2.34)
{k,k+2,...,k}
(k)
(u)ab (k) (u)ba = , ,
(2.35)
ab+{k,k+2,...,k}
(k) (u)ab (k) (u)ca = b,c ,
(2.36)
{k,k+2,...,k}
ba+{k,k+2,...,k}
(k)
(u)ab (k) (u)ba = , .
(2.37)
Fig. 8. The graphical representation of the L-matrix.
Fig. 9. A maximal weight L-matrix configuration.
2.4.3. The L-matrix

In the next section, we will make use of the L-matrix, defined in terms of the intertwiner and dual intertwiner by

(k) a b
u = (k) (u)dc (k) (u)ab .
L
(2.38)
c d
The graphical representation is given in Fig. 8.
It is also useful to define the matrix

(k) a b
(k) a
u
=
L
L
c d
c
{k,k+2,...,k}
b
d

u .

(2.39)
k
If we restrict 1 < u + k1
2 < 0, m 1 + 2 , and choose
{0, 1, . . . , k}, we find that
the L-matrix with a, b, c, d specified as follows:

(k) m +
m +

u ,
L
m +
m +

has a maximum absolute value for the choice =

= k 2
. Thus, maximal weight
L-matrix configurations are of the form shown in Fig. 9.
3. The corner-transfer matrix and half-transfer matrices

In this and the next section, we consider correlation functions of the fusion eight-vertex
models introduced above. We express and manipulate these correlation functions using two
ideas: the expression for correlation functions in terms of the corner-transfer matrix (CTM)
and half-transfer matrices (HTMs) [13,27]; and the vertex-face correspondence [21]. We
here review the first part, i.e., the algebraic analysis approach to both the fusion eight-vertex
and the SOS models.
Fig. 10. The restriction of edge variables associated with our correlation function.
3.1. Fusion eight-vertex models

Correlation functions of the fusion vertex models correspond to the probabilities of the
edge variables taking certain values on some specified set of edges of a lattice. More
concretely, consider the following $(2L + N) \times 2L$ lattice on which the edge
variables at the indicated sides have the values $\varepsilon_1, \dots, \varepsilon_N$ (where $N$ in Fig. 10 as shown is
actually 3, and for simplicity we assume that $L$ is even).
The correlation function we consider specifies the probability of such a configuration.
It is the ratio of the weighted sum over such restricted configurations to the weighted sum
over all configurations (the latter sum being the partition function). The total weight of any
configuration is the product of the local vertex Boltzmann weights.
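The definition of a correlation function as a ratio of weighted sums, and its rewriting as a ratio of traces, can be illustrated on a toy model. The sketch below is not the fusion eight-vertex lattice: it assumes a one-dimensional periodic chain of two-state variables with an arbitrary, made-up table of nearest-neighbour weights (`WEIGHT`), and compares the brute-force ratio of sums with the corresponding ratio of transfer-matrix traces.

```python
from itertools import product

# Hypothetical nearest-neighbour Boltzmann weights, chosen only for illustration.
WEIGHT = {(+1, +1): 2.0, (+1, -1): 0.5, (-1, +1): 0.5, (-1, -1): 1.5}
STATES = (+1, -1)


def config_weight(config):
    """Total weight of a periodic configuration: the product of the local weights."""
    n = len(config)
    w = 1.0
    for i in range(n):
        w *= WEIGHT[(config[i], config[(i + 1) % n])]
    return w


def probability_brute_force(n_sites, value):
    """Restricted weighted sum (site 0 fixed to `value`) divided by the partition function."""
    total = restricted = 0.0
    for config in product(STATES, repeat=n_sites):
        w = config_weight(config)
        total += w
        if config[0] == value:
            restricted += w
    return restricted / total


def probability_trace(n_sites, value):
    """Same probability as a ratio of traces: (T^n)_{value,value} / Tr T^n."""
    t = [[WEIGHT[(s, s2)] for s2 in STATES] for s in STATES]
    power = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n_sites):
        power = [[sum(power[i][m] * t[m][j] for m in range(2)) for j in range(2)] for i in range(2)]
    k = STATES.index(value)
    return power[k][k] / (power[0][0] + power[1][1])


print(probability_brute_force(6, +1), probability_trace(6, +1))
```

In the two-dimensional setting of this section the role of the powers of the transfer matrix is played by the corner-transfer and half-transfer matrices, and the sum over configurations becomes a trace over the space of states.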
The algebraic analysis approach of [13,27] gives a way of computing such sums in the
infinite L limit. To be more specific, it allows the computation of this correlation function
for the infinite-volume lattice in which sums are taken over edge variable configurations
which are fixed to one of the ground state configurations of Fig. 2 beyond a finite, but arbitrarily large, distance from the centre of the lattice. We denote this correlation function by
$P^{(\ell)}(\varepsilon_1, \varepsilon_2, \dots, \varepsilon_N)$, where $\ell \in \{0, 1, \dots, k\}$ labels the chosen ground state configuration.
3.1.1. The space of states
The starting point is to replace the weighted sum by a trace over a vector space H(
)
representing a line running from the centre to the boundary of the infinite lattice. H(
) is
(k)
called the space of states and defined in terms of basis vectors v ( = k, k + 2, . . . , k)
by

(k)
(k)
(i) {k, k + 2, . . . , k},
H(
) = SpanC v(1)
v(0)

(i) = (
) (i) for i 0 ,

2
k for i = 0 mod 2,
(
) (i) =
k 2
for i = 1 mod 2.
The correlation function P (
) (1 , 2 , . . . , N ) is then represented in terms of a ratio of
traces over H(
) of CTMs and HTMs.
3.1.2. CTMs and the partition function

These operators are best defined graphically. The NorthWest corner-transfer matrix
A(u) : H(
) H(
) is represented in Fig. 11, where represent the infinite directions.
This and other graphical representations should be read is as follows: the matrix ele2 ,1
ment A(u)...,
is obtained by computing the weighted sum associated with the lattice in
...,2 ,1
Fig. 11, with all internal edge variables summed over, and with the West external horizontal
edge variables and North vertical edge variables fixed to the values
2 1
and
..
.
2 ,
1
respectively. Clearly, one can then define South–West, South–East and North–East corner-transfer matrices in an analogous manner. A simple consideration of the boundary conditions establishes that these operators act as
ASW (u) : H(
) H(
) ,
where
= k
.
ASE (u) : H(
) H(
) ,
ANE (u) : H(
) H(
)
It is an easy exercise to show that the crossing symmetry relation (2.11) implies that these
new operators are related to A(u) by
ASW (u) = QA(1 u),
ASE (u) = QA(u)Q1 ,
ANE (u) = A(1 u)Q1 ,

where Q :
H(
)
H(
) is the operator
(3.1)
Q = Q Q Q.
Baxters key observation about the corner-transfer matrix is that in the infinite volume
(
)
limit we have A(u) x 2uH , where 2H (
) , the corner Hamiltonian, has discrete and
equidistant eigenvalues bounded from below, and means equal up to a scalar. Thus,
from (3.1), we have
(
)
ANE (u)ASE (u)ASW (u)A(u) x 4H .
Fig. 11. The NorthWest vertex model corner-transfer matrix A(u).
(3.2)
In terms of CTMs, the partition function Z (

) is expressed by

(
)
Z (
) = TrH(
) ANE (u)ASE (u)ASW (u)A(u) TrH(
) x 4H .
(3.3)
It is remarkable that this partition function is known to coincide with the principally specialized character of the level $k$ irreducible highest-weight $\widehat{\mathfrak{sl}}_2$-module $V(\lambda_\ell)$ with highest
weight $\lambda_\ell = (k - \ell)\Lambda_0 + \ell \Lambda_1$ ($\ell = 0, 1, \dots, k$) [28]. Here $\Lambda_i$ ($i = 0, 1$) denotes the fundamental weight of
$\widehat{\mathfrak{sl}}_2$. Namely we have
Z (
) TrH(
) x 4H
(
)
x 1/2
(k)
( ) =
(k)
=
( ),
(x 2 ; x 2 ) (x 2 ; x 4 )
(3.4)
[
+ 1](k+2) .
(3.5)
Here we set e2i = x 4 . Later we will use another expression for

( ) in terms of the
string function [29]:
(k)
(k) ( ) =
2k1

M 2
M
2k(n+ 2k
)
cM
( ) x 4k(n+ 2k )
(3.6)
nZ M0 mod 2k
where cM
( ) denotes the string function defined by
n

(
+2) M 2 k
( ) = x 4 4(k+2) 4k 8(k+2)
dim V (
)M n x 4
cM
(3.7)
n0
and cM
( ) = 0 for M =
mod 2. Here V (
) denotes the weight space of V (
);
denotes the null root of
sl2 satisfying (, ) = 0 = (, ), (, d) = 1 with a standard
symmetric bilinear form ( , ) : P P 12 Z, P = Z0 Z1 Z. Note that the string
function satisfies the following relations:
( ) = cM
( ) = ckM
( ) = cM+2k
( ).
cM
3.1.3. HTMs
To express the correlation functions in a similar way to (3.3), we need to introduce
(
,
)
HTMs, that is, North and South half-transfer matrices, denoted by (u) : H(
) H(
)
(
,
)
and S: (u) : H(
) H(
) respectively, and defined graphically by Fig. 12.
These operators are viewed as acting in an anti-clockwise direction about the finite end,
i.e., the end whose edge variable is fixed to the value . Again, crossing symmetry implies
that we have the relation
(
,
)
(u) = Q(
,
) (u)Q1
S;
(3.8)
where we define the dual operator (

,
) (u) by
(
,
)
Q (u 1).
(
,
) (u) =

) superscripts on these various operators.
We will often suppress the (
,
Fig. 12. (a) The North half-transfer matrix (u). (b) The South half-transfer matrix S; (u).
The heuristic graphical arguments of [27] then lead to the following relations for half-transfer and corner-transfer matrices:
)
(
,
(u2 )(
,
)
(u1 )
2
1

=

1 ,2 {k,k+2,...,k}
)
R (k,k) (u1 u2 )11 22 (
,
(u1 )(
,
)
(u2 ),

1
(
,
) (u)(
,
) (u) = id,
(3.9)
(3.10)
{k,k+2,...,k}
A(u) (v) = (v u)A(u).
(3.11)
Furthermore (3.2) and (3.11) yield
(
,
) (u)x 4H
(
)
(
)
= x 4H (
,
) (u 2).
(3.12)
3.1.4. Correlation functions

Now let us divide the lattice depicted in Fig. 10 into the pieces corresponding to CTMs
and HTMs. We obtain Fig. 13.
According to this picture, we can express the correlation function P (
) (1 , 2 , . . . , N )
as follows:
P (
) (1 , 2 , . . . , N ) =
1 (
)
F (1 , 2 , . . . , N )
Z (
)
(3.13)
where
F (
) (1 , 2 , . . . , N )

= TrH N (
) ANE (u)ASE (u)SN (u) S1 (u)ASW (u)A(u)1 (u) N (u)
(3.14)
with (
) = k
. One can then use the relations (3.1), (3.8) and (3.11) to write all operators in terms of (u) and A(u) and to re-order them. We thus obtain the following
simplified expression for the correlation function:
Fig. 13. The vertex model correlation function trace.
P (
) (1 , 2 , . . . , N )

1
N (
)
TrH N (
) x 4H
N (u) 1 (u)1 (u) N (u) .
= (k)
( )
(3.15)
3.2. Fusion SOS models

We next recall the analogous technology developed in [27] to write the infinite-volume
limit of SOS correlation functions as traces of CTMs and HTMs.
(
)
The first step is to define the space of states Hm,a on which our various SOS operators
(
)
act. We define Hm,a (
{0, 1, . . . , k}, m Z, a 2Z + m +
) by

(
)
Hm,a
= SpanC vs(2) vs(1) vs(0) s(i) Z,
(
)
(i) =
sm
(
)
s(i + 1) s(i) {k, k + 2, . . . , k},

(
)
s(i) = sm
(i) for i 0, s(0) = a ,
m +
i = 0 mod 2,
m +
i = 1 mod 2.
Hm,a is the vector space associated with the height variables along a line running from
the centre of a lattice to the boundary, for which the central height is fixed to a, and far
from the centre the boundary heights are fixed to the ground state configuration . . . , m +
,
m +
, m +
, m +
, m +
, . . . .
3.2.2. CTMs and the partition function
The infinite volume NorthWest corner-transfer matrix $A_a(u)$ is now defined graphically, see Fig. 14, where we suppress the appearance of the spectral parameter $u$ associated with each SOS face weight. Note that the centre height is fixed to $a$ and all internal height variables are summed over. $A_a(u)$ preserves the boundary conditions, and can be viewed as an operator that acts in an anti-clockwise direction about the centre of the lattice as $A_a(u) : \mathcal{H}^{(\ell)}_{m,a} \to \mathcal{H}^{(\ell)}_{m,a}$. One can define the SouthWest, SouthEast and NorthEast corner-transfer matrices with fixed central heights in an analogous manner. Using the crossing symmetry relations (2.18) allows us to write each of these in terms of $A_a(u)$. As for vertex models, it is a simple but illuminating exercise to show that we have
ASW;a (u) = ga Aa (1 u),
ASE;a (u) = Aa (u) 1 ,
ANE;a (u) = ga Aa (1 u) 1 ,
(
)
(3.16)
(
)
where : Hm,a Hm,a is defined by

vs(2) vs(1) vs(0)

s(2), s(1) k s(1), s(0) k ( vs(2) vs(1) vs(0) )
and ga and (a, b)k are as previously defined.
Again, in parallel to the vertex case, it is known that in the infinite volume limit we
(
)
(
)
have Aa (u) x 2uHm,a , where the corner Hamiltonian Hm,a has discrete and equidistant
eigenvalues bounded from below. From (3.16), we therefore have
(
)
ANE;a (u)ASE;a (u)ASW;a (u)Aa (u) [a]x 4Hm,a .

(
)
Then the partition function Zm is expressed by CTMs as follows:

(
)
=
TrH(
) ANE;aN (u)ASE;aN (u)ASW;a (u)Aa (u)
Zm
m,a
am+
+2Z
(
)
[a] TrH(
) x 4Hm,a .
m,a
am+
+2Z
Fig. 14. The NorthWest corner-transfer matrix Aa (u).
(3.17)
It is known that the trace part is given by the string function [28,30]
(
)
TrH(
) x 4Hm,a = cM
( )x
(mrar )2
krr
m,a
(3.18)
where M a m mod 2. Then by the calculation given in Appendix A, we obtain the

following expression of the partition function.
Theorem 3.1.
(
)
Zm
(
)
[a] TrH(
) x 4Hm,a = [m]
(k) ( ),
m,a
(3.19)
am+
+2Z
where
(k) ( ) is the principally specialized character given by (3.6).
The case k = 1 was obtained by Lashkevich and Pugai [21]. Note that this represents the
vertex-face correspondence between the spaces of states.
It is worth noting the resemblance of (3.19) to the branching formula for the product of
characters of irreducible integrable representations of
sl2 :

(k)
(rk2)
(r2)
(
)
bm,a
( )a1 ( ),
( )m1 ( ) =
(3.20)
1ar1
where the principally specialized character $\chi^{(s)}_a(\tau)$ is given in (3.5). The branching function $b^{(\ell)}_{m,a}(\tau)$ is known to be the character of the irreducible Virasoro module $Vir_{m,a}$ associated with the coset $(\widehat{\mathfrak{sl}}_2)_k \oplus (\widehat{\mathfrak{sl}}_2)_{r-k-2}/(\widehat{\mathfrak{sl}}_2)_{r-2}$, and with the highest weight $h_{m,a} = \frac{\ell(k-\ell)}{2k(k+2)} + \frac{(mr - ar^*)^2 - k^2}{4krr^*}$ and central charge $c_{Vir} = \frac{3k}{k+2}\left(1 - \frac{2(k+2)}{rr^*}\right)$. The main differences between the two formulae are:
(1) (3.20) corresponds to the direct sum decomposition of the tensor product representation

(r2)
(rk2)
(k)
(
)
m,a
V a1 .
=
V
V m1
(3.21)
1ar1
(s)
Here V (a ) denotes the level s irreducible integrable representation with the highest
(s)
(
)
weight a = (s a)0 + a1 being dominant integral. m,a denotes the corresponding irreducible coset Virasoro module Virm,a . For generic r, the complete reducibility
of the tensor product representation is unknown, and one cannot expect a formula
like (3.21). However one can define the coset Virasoro algebra even in this case by the
Goddard-Kent-Olive construction associated with the tensor product representation.
Its irreducible representation will be realized in terms of the representation theory of
(
)
sl2 ) and identified with Hm,a in Section 5.2.1.
the elliptic algebra Ux,p (
(2) In terms of lattice models, (3.20) appears in a consideration of the fusion RSOS model
with r > k + 2 Z and restricted heights 1 a r 1 and 1 m r k 1 [31].
On the other hand, (3.19) is associated with the fusion SOS model with r (> k + 2)
generic and with no restriction on the local heights.
Fig. 15. (a), (b), (c) and (d): the North, West, South and East half-transfer matrices.
3.2.3. HTMs
(
)
(
)
The next object to consider is the North half-transfer matrix b,a (u) : Hm,a Hm,b ,
which we define graphically by Fig. 15(a).
Again the centre weights are fixed, and the boundary conditions change as indicated.
We view the operator as acting anti-clockwise about the centre of the lattice. There are
3 other half-transfer matrices: W ;b,a (u), S;b,a (u), and E;b,a (u), associated with our
SOS model that we might consider, and these are defined graphically by Fig. 15(b), (c)
and (d). These West, South and East half-transfer matrices are again related to the North
half-transfer matrix by crossing symmetry. We find
W ;b,a (u) = (a, b)k
ga
b,a (1 u),
gb
S;b,a (u) = (b, a)k b,a (u) 1 ,

ga
E;b,a (u) = b,a (1 u) 1 .
gb
As for vertex models, the graphical arguments of [27], that rely on the YangBaxter
equation and unitarity, lead to the following relations:

c d
u
c,b (u2 )b,a (u1 ) =
(3.22)
W (k,k)
u
2 c,d (u1 )d,a (u2 ),
b a 1
d

b,a
(u)a,b (u) = id,
(3.23)
a
Ab (u)b,a (v) = b,a (v u)Aa (u),
(3.24)
where we have introduced the dual HTM defined by
b,a
(u) = W ;b,a (u) =
1
(k)
Gb,a
b,a (u 1).
From (3.17) and (3.24), we also have

(
)
(
)
b,a (u) x 4Hm,a = x 4Hm,b b,a (u 2).
(3.25)
3.2.4. Correlation functions

SOS(
)
(a, a1 , . . . , aN ) of the fusion SOS model
The (N + 1)-point correlation function Pm
is the probability of the local height variables taking certain values a, a1 , . . . , aN on spec-
SOS(
)
Fig. 16. The (N + 1)-point function Fm
(a, a1 , . . . , aN ).
ified set of vertices 0, 1, . . . , N of a lattice. Fig. 16 represents the (N + 1)-point function

SOS(
)
Fm
(a, a1 , . . . , aN ) divided into the parts corresponding to CTMs and HTMs.
According to this picture we have
PmSOS(
) (a, a1 , . . . , aN ) =
1
(
)
Zm
FmSOS(
) (a, a1 , . . . , aN ),

FmSOS(
) (a, a1 , . . . , aN ) = TrH(
) a,a1 (u) aN1 ,aN (u)ANE;aN (u)ASE;aN (u)
m,a

S;aN ,aN1 (u) S;a1 ,a (u)ASW;a (u)Aa (u) .
Using the relations (3.17), (3.24) and (3.25), we obtain the simplified expression:
PmSOS(
) (a, a1 , . . . , aN )
( N (
))

[aN ]
=
Tr ( N (
)) x 4Hm,aN aN ,aN1 (u) a1 ,a (u)
(k)
[m]
( ) Hm,aN

a,a1 (u) aN1 ,aN (u) .
(3.26)
4. The vertex-face correspondence

While the algebraic analysis approach works well for many models, including fusion
six-vertex and fusion SOS models [19,23,3234], it runs into a technical obstacle for the
case of the fusion eight-vertex model. The difficulty is the lack of a suitable free-field
realization to evaluate the trace occurring in (3.15). In order to overcome this problem, we
shall follow Baxter [3] and Lashkevich and Pugai [21] and relate the fusion eight-vertex
model to the fusion SOS model. This latter model is more tractable from the point of view
of the algebraic analysis approach [20,23,24].
4.1. Dressed vertex models
The starting point in establishing the connection of the expression (3.15) with SOS
models is to dress the boundary of the vertex model defined on the finite $(2L + N) \times 2L$
lattice shown in Fig. 10 with the intertwining and dual intertwining vectors expressed by 3-vertices in Fig. 5(a) and (b). The procedure is shown in Fig. 17. In this section, we identify
the vertex model spectral parameter $u$ as $u = u_0 - v$, where $u_0$ and $v$ are the spectral
parameters associated with vertical and horizontal lines, respectively.
As well as the fixed interior edge variables 1 , 2 , . . . , N (N = 3 is shown in Fig. 17),
we fix the boundary edge variables, also marked by bullets, to take the values shown in
Fig. 17, where m 1 + k1
2 and
{0, 1, . . . , k}. The total Boltzmann weight associated
with a configuration of edge variables is given by the product of R-matrix values around
4-vertices and intertwining and dual intertwining vector values around 3-vertices. The rationale for fixing the . . . , m +
,
m +
, m +
,
. . . boundary condition is that for suitably large lattices it
m +
, m +
,
imposes the vertex model boundary condition corresponding to the ground state shown in
Fig. 2. This follows from the observations concerning Fig. 9.
Fig. 17. The dressed vertex model.
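Let us denote the weighted sum over all edge variable configurations of Fig. 17, with $\varepsilon_1, \dots, \varepsilon_N$ fixed, by $F^{(\ell)}_{L;m}(\varepsilon_1, \varepsilon_2, \dots, \varepsilon_N)$. The correlation function we are interested in is the ratio
$$P^{(\ell)}_{L;m}(\varepsilon_1, \varepsilon_2, \dots, \varepsilon_N) = \frac{1}{Z^{(\ell)}_{L;m}}\, F^{(\ell)}_{L;m}(\varepsilon_1, \varepsilon_2, \dots, \varepsilon_N),$$
where the partition function $Z^{(\ell)}_{L;m}$ is the corresponding unrestricted sum, i.e.,
$$Z^{(\ell)}_{L;m} = \sum_{\varepsilon_1, \varepsilon_2, \dots, \varepsilon_N} F^{(\ell)}_{L;m}(\varepsilon_1, \varepsilon_2, \dots, \varepsilon_N).$$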
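The ratio defining the finite-lattice correlation function is an ordinary normalisation of restricted sums by the unrestricted one. A minimal Python sketch of that bookkeeping follows; the weights in `toy_f` are made-up numbers and the names `correlation`, `toy_f`, `toy_p` are hypothetical, chosen only for illustration (they are not Boltzmann weights of the fusion eight-vertex model).

```python
from fractions import Fraction

def correlation(restricted_sums):
    """Normalise restricted weighted sums F(eps) by their total Z.

    `restricted_sums` maps a tuple of fixed edge variables to the weighted
    sum over all remaining edge variables (hypothetical data).
    """
    z = sum(restricted_sums.values())  # unrestricted sum Z
    return {eps: f / z for eps, f in restricted_sums.items()}

# Toy weights for N = 2 fixed edges taking values in {-1, +1}; illustrative only.
toy_f = {
    (-1, -1): Fraction(1),
    (-1, +1): Fraction(3),
    (+1, -1): Fraction(3),
    (+1, +1): Fraction(1),
}
toy_p = correlation(toy_f)
```

By construction the values of `toy_p` sum to one, which is the only property of the ratio used at this stage.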
(
)
Ultimately, we will consider the infinite L limit of PL;m (1 , 2 , . . . , N ). The conjecture

is that in this limit the m dependence associated with the boundary will disappear, and that
we can identify the vertex model correlation function as
(
)
P (
) (1 , 2 , . . . , N ) = lim PL;m (1 , 2 , . . . , N ).
(4.1)
4.2. The relationship with fusion SOS correlation functions

We will next show how to relate the dressed function $F^{(\ell)}_{L;m}(\varepsilon_1, \varepsilon_2, \dots, \varepsilon_N)$ corresponding to Fig. 17 to one associated with SOS models. The argument proceeds via a number of
diagrammatic equivalences. The first step is to successively use the vertex-face correspondence relations depicted in Fig. 6(a) and (b) to turn vertex weights into SOS weights. We
use Fig. 6(a) starting from the NE corner and Fig. 6(b) starting from the SW corner of Fig. 17.
A little thought and diagram drawing will convince the reader that one can carry this
procedure through until one ends up with a dislocation that extends from the NW to the SE
corner of the diagram passing through the N fixed edges. There are many ways to draw
this dislocation; one is shown in Fig. 18.
The next step is to use the relation of Fig. 7(a) to remove the dislocation step by step
starting from the NW. We can do this until we reach the leftmost central fixed edge variable 1 . Hence the sum in Fig. 18 is equal to that of Fig. 19 after taking the infinite volume
limit and dividing into the parts corresponding to SOS model CTMs and HTMs.
This picture is similar to Fig. 16, which expresses the correlation function of the fusion SOS
model, but here we have two new ingredients. One is the L-matrix defined in (2.38). The
other is the tail operator $\Lambda_{b,a}(u_0) : \mathcal{H}^{(\ell)}_{m,a} \to \mathcal{H}^{(\ell)}_{m,b}$ graphically defined in Fig. 20, where
the edge variables on the vertical lines are summed over.
4.3. The tail operator
The tail operator is characterized by the following commutation relation

(k) c d
v E;c,d (u) d,b (u0 ) = c,a (u0 )E;a,b (u),
L
a b
d
Fig. 18. The correlation function after converting to SOS weights.
(
)
Fig. 19. The correlation function Fm (1 , 2 , . . . , N ).
Fig. 20. The graphical definition of the tail operator b,a (u0 ).
Fig. 21. Commutation relations of the tail operator and East half-transfer matrix.
where u = u0 v and the L(k) -matrix is given in (2.39). This is due to the simple graphical argument given by Fig. 21. The two steps make successive use of the fundamental
intertwining property shown in Fig. 6(b).
Expressed in terms of the North half-transfer matrix, this becomes after replacing u0
u0 1, v v

c d
u
u
(4.2)
L(k)
0 c,d (u)d,b (u0 ) = c,a (u0 )a,b (u),
a b
d
where we define a,b (u0 ) ga 1 a,b (u0 1) gb1 .

The other key property of the tail operator is
a,a (u0 ) = a,a (u0 1) = id.
(4.3)
This follows as a consequence of relation (2.24).

4.4. General formula for correlation functions
Now we return to the problem of computing the vertex model correlation function
P (
) (1 , 2 , . . . , N ) introduced in Section 3.2. In the infinite volume limit, we make the
identification (4.1), and hence compute P (
) (1 , 2 , . . . , N ) via the equation
P (
) (1 , 2 , . . . , N ) =
1
(
)
Zm
Fm(
) (1 , 2 , . . . , N ),
(4.4)

(
)
where Fm (1 , 2 , . . . , N ) is expressed as a trace over the composition of SOS model

corner- and half-transfer matrices and the tail operator shown in Fig. 19. We hence find
Fm(
) (1 , 2 , . . . , N )

(k) a a1
(k) a1
u L
L1
=
a a1 0 1 a1

a,aj ,aj

a2
(k) aN 1
u LN

a2 0
aN
1

aN
u0
aN

TrH(
) a,a1 (u) aN1 ,aN (u)ANE;aN (u)
m,a

(u) S;a1 ,a (u)ASW;a (u)Aa (u) .
aN ,aN (u0 )ASE;aN (u)S;aN ,aN1
Here, the sum extends over a 2Z + m +

and aj , aj 2Z + m + j (
). Rewriting
this expression purely in terms of the North half-transfer matrix a,b (u) and NorthWest
corner-transfer matrix Aa (u), and making use of the cyclicity of the trace and relation
(3.24) gives
(
)
Fm,a
(1 , 2 , . . . , N )

a a1
(k) a1
u
L
=
L(k)
1
a a1 0 1 a1

a,aj ,aj

Tr
aN
N
H( (
))
m,aN
a

N ,aN1

a2
(k) aN 1
u
L
N a
a2 0
N 1

aN
u0
aN
4H ( N (
))
Xm,a
N

(1) a ,a (1)a,a1 (1) aN1 ,aN (1)aN ,aN (u0 1) .
1
(
)
The factor Zm in the denominator of (4.4) is the partition function expressed by Fig. 19
with N = 0. This is the same as the SOS partition function given in (3.19) due to (4.3).
(
)
Using the infinite volume limit Aa (u) x 2uHm,a of the SOS corner-transfer matrix, we
finally have
P (
) (1 , 2 , . . . , N )

1
(k) a a1
u
L
=
1
(k)
a a1 0
[m]
( ) a,a ,a
j j

(k) a1 a2
(k) aN 1 aN
u LN
L1

u0
a1 a2 0
aN
a
N
1

4H ( N (
))
aN Tr ( N (
)) xm,a
a ,a (1) a ,a (1)
H

m,aN
N1

a,a1 (1) aN1 ,aN (1)aN ,aN (u0 1) .
(4.5)
The vertical spectral parameter u0 introduced in the vertex model still appears in this expression. However, we expect that after taking the trace and all the summations that there
will in fact be no u0 dependence [21]. A more detailed discussion of this point will be
given in [25].
4.5. The vertex-face correspondence for lattice operators

From Fig. 19, we can also extract the following vertex-face correspondence for lattice
operators such as the half-transfer matrices, the corner-transfer matrix and the space of
states.
4.5.1. Half-transfer matrices
(
,
)
(
,
)
Let us define (u; u0 ) and
(u; u0 ) by

(
,
) (u; u0 ) = f (k) (u u0 )
(
,
) (u; u0 ) =

(k) (u u0 )aa a ,a (u),
(4.6)
am+
+2Z
a a+{k,k+2,...,k}
(
,
)
Q
(u 1; u0 )

= f (k) (u u0 )1
(k) (u u0 )aa a ,a (u).
am+
+2Z
a a+{k,k+2,...,k}
(4.7)
Here the function f (k) (u) is chosen such that

f (k) (u)f (k) (u 1) = C (k) (u),
f (k) (u 2) =
(4.8)
[u + k 1] (k)
f (u),
[u 1]
(4.9)
where C (k) (u) is a function appearing in (2.33). If we consider u0 as a fixed constant

and suppress it from the notation for these lattice operators, then we have the following
theorem.
Theorem 4.1.
(
,
)
(i) The lattice operator (u) satisfies the commutation relation (3.9).
(
,
)
(
,
)
(ii) (u) and
(u) satisfy the inversion relation (3.10).
Proof. The commutation relation (3.9) follows from (2.30) and (3.22), whereas the inversion relation (3.10) follows from (2.34) and (3.23). □
4.5.2. The corner-transfer matrix
Let us define
(
) =

a
am+l+2Z
(
)
a ,a (u0 ) [a]x 4Hm,a .
(
,
)
Theorem 4.2. (
) and
(
)
(u; u0 ) satisfy
(
,
) (u 2; u0 ) = (
,
) (u; u0 ) (
) .
(4.10)
This should be compared with (3.12).

(k)
Proof. Multiply (4.2) by f (k) (u u0 2) (u u0 )ab and take the summation in a.

Then from (2.31), (2.37) and (4.9), we obtain

c,a (u0 )[a]f (k) (u u0 2)(k) (u u0 2)ab a,b (u)
a
f (k) (u u0 )(k) (u u0 )ca c,a (u)a,b (u0 )[b].
a
Now act with $x^{4H^{(\ell)}_{m,a}}$ on the right of this expression and use (3.25). Then taking the direct sum $\bigoplus_{b,c}$ of the result, we have

(
)
c,a (u0 )[a]x 4Hm,a
f (k) (u u0 2)(k) (u u0 2)ab a,b (u 2)
a

a
(k)
(u u0 )(k) (u u0 )ca c,a (u)
We hence obtain (4.10).
(
)
a,b (u0 )[b]x 4Hm,a .

Fig. 19 with N = 0 gives the partition function for the dressed vertex model. From
(
)
(4.3), this coincides with the SOS partition function Zm as mentioned above. Then from
Theorem 3.1, we have
(
)
= [m] Z (
) .
Zm
The RHS is the partition function of the k k fusion eight-vertex model multiplied by
[m] . The multiplicity by the factor [m] can be regarded as the effect of dressing the
boundary of the vertex model. Understanding this multiplicity and considering (4.10), we
can roughly make the following vertex-face correspondences for the spaces of states and
corner-transfer matrices.

(
)
(
)
Hm,a
,
x 4H (
) .
H(
)
a
+m+2Z
2)
5. The realization of fusion SOS models in terms of the elliptic algebra Ux,p (sl
(
)
(
)
In this section, we show how the space of the states Hm,a , the corner Hamiltonian Hm,a ,
the half-transfer matrix b,a (u) and the tail operator b,a (u) appearing in (4.5) can all be
constructed in terms of the representation theory of the elliptic algebra Ux,p (
sl2 ). One
of the key results is the remarkably simple algebraic form of the tail operator given by
Conjecture 5.8.
5.1. The elliptic algebra Ux,p (

sl2 )
We first present a brief review of the elliptic algebra Ux,p (
sl2 ) and its associated vertex
operators [23,24].1 The elliptic algebra Ux,p (
sl2 ) provides the Drinfeld realization of the
sl2 ) [16] tensored by a Heisenberg algebra.
face-type elliptic quantum group Bx, (
5.1.1. The definition and a realization of Ux,p (
sl2 )
Definition 5.1 (The elliptic algebra Ux,p (
sl2 ) [23,24]). The elliptic algebra Ux,p (
sl2 ) (p =
x 2r , r C) is the associative algebra of the currents E(u), F (u), K(u) and the grading
operator d satisfying the following relations:
K(u1 )K(u2 ) = (u1 u2 )K(u2 )K(u1 ),
K(u1 )E(u2 ) =
K(u1 )F (u2 ) =
E(u1 )E(u2 ) =
1r
2 ]
E(u2 )K(u1 ),
1+r
[u1 u2 2 ]
[u1 u2 1+r
2 ]
F (u2 )K(u1 ),
1r
[u1 u2 + 2 ]
[u1 u2 + 1]
E(u2 )E(u1 ),
[u1 u2 1]
[u1 u2 +
[u1 u2 1]
F (u2 )F (u1 ),
F (u1 )F (u2 ) =
[u1 u2 + 1]

F (u) = z + 1 F (u),
E(u) = z + 1 E(u),
d,
d,
z r
z r

E(u1 ), F (u2 )

k
k
+

k
k
1
x z1 /z2 H u2 +
x z1 /z2 H u2
,
=
4
4
x x 1

k
1
r
k
1
r
H (u) = K u
+
K u
.
2 4
2
2 4
2

Here $r^* = r - k$, $z = x^{2u}$, $z_i = x^{2u_i}$ ($i = 1, 2$) and $\delta(z) = \sum_{n \in \mathbb{Z}} z^n$. The constant is given
by
=
(x 2 ; p , x)
(x 2 ; p, x)
with (z; p, x) =
(x 2 z; p, x 4 ) (px 2 z; p, x 4 )
,
(x 4 z; p, x 4 ) (pz; p, x 4 )
and the scalar function (v) is given by

(v) =
+ (v)
+ (v)
with + (v) = z 2r x 2
1
(px 2 z; p, x 4 )2 (z1 ; p, x 4 ) (x 4 z1 ; p, x 4 )
.
(pz; p, x 4 ) (px 4 z; p, x 4 ) (x 2 z1 ; p, x 4 )2
1 U
sl2 ) is conventionally referred to as Uq,p (
sl2 )the difference is merely a change of notation.
x,p (
(5.1)
The symbol $*$ always indicates the replacement $r \to r^*$. For example, $p^* = x^{2r^*}$, $[u]^* = x^{\frac{u^2}{r^*} - u}\,\Theta_{x^{2r^*}}(x^{2u})$.
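The elliptic bracket is easy to probe numerically. The sketch below assumes the convention $[u] = x^{u^2/r - u}\,\Theta_{x^{2r}}(x^{2u})$ with $\Theta_q(z) = (z;q)_\infty (q/z;q)_\infty (q;q)_\infty$, the starred bracket being the same function with $r$ replaced by $r^* = r - k$. The theta-function convention, the truncation length and the function names are assumptions made for illustration, not taken from this section.

```python
def qpochhammer(z, q, terms=200):
    """Truncated infinite product (z; q)_inf = prod_{n>=0} (1 - z q^n)."""
    prod, qn = 1.0, 1.0
    for _ in range(terms):
        prod *= 1.0 - z * qn
        qn *= q
    return prod

def theta(z, q):
    """Theta_q(z) = (z; q)_inf (q/z; q)_inf (q; q)_inf (assumed convention)."""
    return qpochhammer(z, q) * qpochhammer(q / z, q) * qpochhammer(q, q)

def bracket(u, x, r):
    """[u] = x^(u^2/r - u) * Theta_{x^(2r)}(x^(2u)); pass r - k to get [u]^*."""
    return x ** (u * u / r - u) * theta(x ** (2 * u), x ** (2 * r))

# Sample values (hypothetical): 0 < x < 1, r > k + 2.
x, r, k = 0.5, 5.0, 2
u = 0.7
```

With this convention the bracket is odd, `bracket(-u, x, r) == -bracket(u, x, r)`, and quasi-periodic, `bracket(u + r, x, r) == -bracket(u, x, r)`, up to rounding; the same holds for the starred bracket with period `r - k`.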
The algebra Ux,p (

sl2 ) is realized by tensoring Ux (
sl2 ) and a Heisenberg algebra [24].
sl2 ).
For this realization, it is convenient to introduce the Drinfeld realization of Ux (
Definition 5.2 (The Drinfeld realization of $U_x(\widehat{\mathfrak{sl}}_2)$). The quantum affine algebra $U_x(\widehat{\mathfrak{sl}}_2)$ is
the associative algebra generated by $h$, $a_m$, $x^{\pm}_n$ ($m \in \mathbb{Z}_{\neq 0}$, $n \in \mathbb{Z}$), $d$ and the central element
$k$ satisfying the relations

d, xn = nxn ,
[h, d] = 0,
[d, an ] = nan ,

[h, an ] = 0,
h, x (z) = 2x (z),
J2nKx JknKx k|n|
x
n+m,0 ,
n
J2nKx k|n| n +

x
z x (z),
an , x + (z) =
n

J2nKx n
z x (z),
an , x (z) =
n

z x 2 w x (z)x (w) = x 2 z w x (w)x (w),

k
+
1
x z/w x k/2 w x k z/w x k/2 w ,
x (z), x (w) =
1
x x
[an , am ] =
where x (z), (z) and (z) denote the Drinfeld currents defined by

x (z) =
xn zn ,
nZ

x k/2 z = x h exp x x 1
an zn ,
n>0

x k/2 z = x h exp x x 1
an zn .
n>0
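The Heisenberg relation above is written in terms of symmetric q-integers. As a small numerical sketch, assume $[\![n]\!]_x = (x^n - x^{-n})/(x - x^{-1})$ and that the coefficient of $\delta_{n+m,0}$ in $[a_n, a_m]$ is $[\![2n]\!]_x [\![kn]\!]_x\, x^{-k|n|}/n$. The sign of the exponent is an assumption, since minus signs are not reliably visible in the relation as printed, and the function names are hypothetical.

```python
def q_integer(n, x):
    """Symmetric q-integer [[n]]_x = (x^n - x^-n) / (x - x^-1)."""
    return (x ** n - x ** (-n)) / (x - 1.0 / x)

def boson_norm(n, k, x):
    """Assumed coefficient of delta_{n+m,0} in [a_n, a_m]:
    [[2n]]_x [[kn]]_x x^(-k|n|) / n, for nonzero integer n."""
    return q_integer(2 * n, x) * q_integer(k * n, x) * x ** (-k * abs(n)) / n
```

In the limit $x \to 1$ the coefficient tends to $2kn$, and it is odd in $n$, as a commutator coefficient must be.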
Let us denote by C{H} the Heisenberg algebra generated by the pair $P$, $Q$ with
$[Q, P] = 1$. Then we have the following realization of $U_{x,p}(\widehat{\mathfrak{sl}}_2)$.
Theorem 5.1 [24]. The elliptic algebra Ux,p (
sl2 ) is realized by tensoring Ux (
sl2 ) and the
Heisenberg algebra C{H}. The generators E(u), F (u), K(u) and d are given by
P
K(u) = k(z)eQ z( r r ) 2 + 2r + 4 ( r r ) ,
1
E(u) = u+ (z, p)x + (z)e2Q z r (P +1) ,

1
F (u) = x (z)u (z, p)z r (P +h1) ,

1
1
d = d (P 1)(P + 1) + (P + h 1)(P + h + 1),
4r
4r
1
where k(z) and u (z, p) are given by

k n
JnKx
JnKx
n
exp
,
a
z
a
z
x
k(z) = exp
n
n
J2nKx Jr nKx
J2nKx JrnKx
n>0
n>0

r n
1
u+ (z, p) = exp
x
a
z
,
n
Jr nKx
n>0

1
an (x r z)n .
u (z, p) = exp
JrnKx
n>0
The following commutation relations are important.

Proposition 5.2.
$$[K(u), P] = K(u), \qquad [E(u), P] = 2E(u), \qquad [F(u), P] = 0, \tag{5.2}$$
$$[K(u), P + h] = K(u), \qquad [E(u), P + h] = 0, \qquad [F(u), P + h] = 2F(u). \tag{5.3}$$
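As a quick consistency check (a worked sketch, not part of the original derivation), the commutators of Proposition 5.2, read with the signs as printed, determine how the currents shift the eigenvalues of $P$ and $P + h$. Suppose $v$ is a vector with $Pv = m\,v$ and $(P + h)v = a\,v$; the labels $m$ and $a$ are chosen here only to suggest the SOS heights.

```latex
\begin{align*}
[E(u), P] = 2E(u) &\;\Longrightarrow\; P\,E(u)v = E(u)Pv - 2E(u)v = (m-2)\,E(u)v, \\
[E(u), P+h] = 0 &\;\Longrightarrow\; (P+h)\,E(u)v = a\,E(u)v, \\
[F(u), P] = 0 &\;\Longrightarrow\; P\,F(u)v = m\,F(u)v, \\
[F(u), P+h] = 2F(u) &\;\Longrightarrow\; (P+h)\,F(u)v = (a-2)\,F(u)v, \\
[K(u), P] = [K(u), P+h] = K(u) &\;\Longrightarrow\; P\,K(u)v = (m-1)\,K(u)v, \quad (P+h)\,K(u)v = (a-1)\,K(u)v.
\end{align*}
```

So $E(u)$ moves only the $P$ eigenvalue, $F(u)$ moves only the $P + h$ eigenvalue, and $K(u)$ lowers both by one.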
Next we define the half currents and the L-operator.
Definition 5.3 (Half currents [24]). We define the half currents E + (u), F + (u) and K + (u)
by

r +1
[1]
+
,
E + (u) = E + (u)
,
K (u) = K u +
2
[P 1]
[1]
,
F + (u) = F + (u)
[P + h 1]
where

[u v + k2 P + 1]
dw
+
E (u) = a
E(v)
,
2iw
[u v + k2 ]
C

[u v + P + h 1]
dw
+
F (v)
.
F (u) = a
2iw
[u v]
C
The contours
and C are defined as follows

k
C: |pz| < |w| < |z|,
C : p x z < |w| < x k z,
and the constant a, a are chosen to satisfy
a a[1]
= 1.
x x 1
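Definition 5.4 (L-operator [24]). We define the L-operator $\widehat{L}^+(u) \in \mathrm{End}(\mathbb{C}^2) \otimes U_{x,p}(\widehat{\mathfrak{sl}}_2)$
by
$$\widehat{L}^+(u) = \begin{pmatrix} 1 & F^+(u) \\ 0 & 1 \end{pmatrix} \begin{pmatrix} K^+(u-1) & 0 \\ 0 & K^+(u)^{-1} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ E^+(u) & 1 \end{pmatrix}.$$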
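Multiplying out the three factors makes the matrix entries of the L-operator explicit. The following is a worked sketch, assuming Definition 5.4 is read as the product of an upper-triangular factor containing $F^+(u)$, a diagonal factor with entries $K^+(u-1)$ and $K^+(u)^{-1}$, and a lower-triangular factor containing $E^+(u)$. Operator ordering is kept exactly as it arises, since the half currents need not commute.

```latex
\begin{align*}
\begin{pmatrix} 1 & F^+(u) \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} K^+(u-1) & 0 \\ 0 & K^+(u)^{-1} \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ E^+(u) & 1 \end{pmatrix}
=
\begin{pmatrix}
K^+(u-1) + F^+(u)\,K^+(u)^{-1}E^+(u) & F^+(u)\,K^+(u)^{-1} \\
K^+(u)^{-1}E^+(u) & K^+(u)^{-1}
\end{pmatrix}.
\end{align*}
```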
5.1.2. The vertex operators of Ux,p (

sl2 )
Let V (
) be the level k irreducible Ux (
sl2 )-module with highest weight
. Let us
denote by (n,z , Vn,z ) (n = 0, 1, . . . , k) the (n + 1)-dimensional evaluation representation

(n)
sl2 ): Vn,z = V (n) C[z, z1 ], V (n) = nm=0 Cvm . In physical applications, we
of Ux (
only have to consider the case n = k. The co-algebra structure of Bx, (
sl2 ) allows us to
define the type I intertwining operator (u, P ) : V (
) V (k
) Vk,z of Bx, (
sl2 )modules. Note that Bx, (
sl2 )
sl2 ).
= Ux (
Now let us consider the Ux,p (
sl2 )-modules. According to the realization of Ux,p (
sl2 )
sl2 )-module V (
) by
given by Theorem 5.1, we define the level k Ux,p (

V (
) =
V (
) emQ .
mZ
sl2 ) is simply defined to be the same as (u, P ):

The type I vertex operator (u)
of Ux,p (
(u)
= (u, P ) : V (
) V (k
) Vk,z .
Throughout this paper we consider only the type I vertex operator.
From the intertwining relation for (u, P ), we obtain the following relation which
characterizes the vertex operator uniquely up to a normalization factor:
2 )L + (u1 ) = R +(13) (u1 u2 , P + h)L + (u1 )(u
2 ).
(u
1,k
(5.4)
+
+
(u, s) is an image of the L-operator: R1k
(u v, s) = (id k,w )L+ (u, s) with
Here R1k
2u
2v
z = x , w = x . The finite-dimensional representations of the elliptic currents as well as
+
the expression for the matrix R1k
(u, s) can be found in Appendix C of [24]. Inputting the
+
realization of L (u) given by Definition 5.4 into (5.4), we can solve (5.4) for (u).
Let us
define the components of the vertex operator (u)

as follows:

k
1
=
k,m (u) vm .
u
2
m=0
Then we obtain the following realization of the vertex operators [24].

Theorem 5.3. The highest component k,k (u) is given by

k
h
k
Jr nKx n n
:e 2 (z) 2 z 2r (P +h) ,
z
k,k (u) = :exp
J2nKx JrnKx
n=0
where
n =
an
for n > 0,
JrnKx k|n|
x an
Jr nKx
for n < 0,
such that [m , n ] = m+n,0
J2mKx JkmKx JrmKx

m
Jr mKx .
For the remaining components m = 0, 1, . . . , k, we have the formula

k,m (u)

km km

k
[1][P + h + 2k m 1 + 2j ]
+
= k,k (u)F u + r
2
[P + h + k 1 + 2j ][P + h + k + 2j ]
j =1

dw1
dwkm
=
k,k (u)F (v1 ) F (vkm )

2iw1
2iwkm
C1
Ckm
km
[u vj + P + h +
k
2
[u vj +
j =1
1 2(k m j )][1][P + h + 2k m 1 + 2j ]
k
2 ][P
+ h + k 1 + 2j ][P + h + k + 2j ]
(5.5)
z = x 2u , w
and the integral contours Cj (1 j k m) are given by

j
k
1 k k
C1 : x z < |w1 | < p x z, x z,

Cj : x k z < |wj | < p 1 x k z, x k z, x 2 wj 1 (2 j k m).
where
= x 2vj
The expression of k,k (u) is equivalent but slightly different from the one given in [24] in
the zero-mode part. Note the following properties.
Proposition 5.4.

P , k,k (u) = 0,
F (v)k,k (u) =

h, k,k (u) = kk,k (u),
[u v k2 ]
k,k (u)F (v),

[u v + k2 ]
E(v)k,k (u) = k,k (u)E(v).
(5.6)
sl2 ).
Let us next consider the commutation relations of the vertex operators of Ux,p (
In [16], they are expected to be commutation relations with exchange coefficients being
exactly the fused face weights W (k,k) (2.16). In order to derive such relations, we consider
the following gauge transformation of the vertex operators k,m (u) (u) with =
2m k.
(u) = k,m (u)
km

j =1
[P + h + k 1 + 2j ][P + h + k + 2j ]
[1][P + h + 2k m 1 + 2j ]
( = k, k + 2, . . . , k).
From (5.5), we find

k
2
k
(u) = k,k (u)F + u + r
.
2
(5.7)
By using the realization obtained in Theorem 5.3 and the commutation relation (5.6), we
have checked the following commutation relations for levels k = 1, 2, 3.
Conjecture 5.5 (Commutation relations). The vertex operators (u) ( = k, k + 2,

. . . , k), satisfy the commutation relations
2 (u2 ) 1 (u1 )

=

W
(k,k)
1 {k,k+2,...,k}
1 +2 =1 +2
P + h 1 2
P + h 1

P + h 2
u u2
P +h 1
1 (u1 ) 2 (u2 ),
where the coefficients
W (k,k)
(5.8)
are given by (2.16).
5.2. The realization of fusion SOS models and the tail operator
Now we formulate the fusion SOS model in terms of the representation theory of
(
)
(
)
Ux,p (
sl2 ) and give a realization of the space of states Hm,a
, the corner Hamiltonian Hm,a
,
the half-transfer matrix b,a (u) and the tail operator b,a (u).
5.2.1. The space of states and the CTM Hamiltonian
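We first show that the level $k$ $U_{x,p}(\widehat{\mathfrak{sl}}_2)$-modules have a natural decomposition into the
Virasoro highest-weight modules associated with the coset $(\widehat{\mathfrak{sl}}_2)_k \oplus (\widehat{\mathfrak{sl}}_2)_{r-k-2}/(\widehat{\mathfrak{sl}}_2)_{r-2}$.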
sl2 )k (
(
)
Irreducible Virasoro modules are identified with the spaces of states Hm,a of the k k
fusion SOS model.
In order to see such a decomposition, it is convenient to realize the level $k$ $U_x(\widehat{\mathfrak{sl}}_2)$-module $V(\lambda_\ell)$ in terms of a q-deformed $\mathbb{Z}_k$-parafermion module $\mathcal{H}^{PF}_{\ell,M}$ and the Fock
module $\mathcal{F}^a$ of the Drinfeld bosons $a_n$ [23,35] (see also [36] for the CFT case).
The q-deformed Zk -parafermion algebra is conveniently introduced through the q-deformed Z-algebra associated with the level k Drinfeld currents of Ux (
sl2 ). The algebraic structure of the q-deformed Z-algebra is quite parallel to the classical case [37].
The q-deformed case was considered in [38]. The
deformed Z-algebra is generated by
Z,n (n Z) whose generating functions Z (z) = nZ Z,n zn are defined by

1
1
n
+
n
,
Z+ (z) = exp
an z x (z) exp
an z
JknKx
JknKx
n>0
n>0
kn
kn

x
x
Z (z) = exp
an zn x (z) exp
an zn .
JknKx
JknKx
n>0
n>0
The Z-algebra commutes with the Drinfeld bosons $a_n$, $n \neq 0$. Then the level k highest-weight Uq (
sl2 )-module V (
) with highest weight
has the structure
V (
) = F a
,
where F a = C[an (n > 0)]. The space
is called the vacuum space defined by

= v V (
) an v = 0 (n > 0) .
(5.9)
The space
is spanned by the vectors v
(1 , . . . , s ; n1 , . . . , ns ) (s 0, j {}, ns 0,
ns1 + ns 0, . . . , n1 + + ns 0) given by

(x 2 x k+
i +j
2
+
k+ i 2 j
i j
; x 2k )
Z1 (z1 ) Zs (zs ) 1 e 2
k 2k
(x 2 x
; x )

v
(1 , . . . , s ; n1 , . . . , ns )z1n1 zsns .
1i<j s
n1 ,...,ns Z
Here the action of Z,n is defined as follows:

n 0,
Z,n f e
2 ,
Z,n f e
2 =
[Z,n , f ] e
2 , n 1,
for f C[Z+,n , Z,n (n 0)]. The weight of v
(1 , . . . , s ; n1 , . . . , ns ) is
+
s
j =1 j
(
+2)
+ n1 + + ns .
and its degree is 4(k+2)
Now let us consider the q-deformed Zk -parafermion. Define the basic Zk -parafermion
currents (z) and (z) through the following relations:
h
Z+ (z) = (z) e z k ,
h
Z (z) = (z) e z k ,

(z), = (z), h = (z), = (z), h = 0.
To make this expression well defined, (z) and (z) should have their mode expansions
depending on the weight of vectors on which they act. Namely, on the vector with weight
such that (h, ) = m, we have
(z) + (z) =
+, mk n z k +n1 ,
m
nZ
(z) (z) =
, mk n z k +n1 .
nZ
The q-deformed Zk -parafermion algebra is generated by +, mk n , , mk n (n Z). Its

relations can be expressed as follows:

2
(x 2 w/z; x 2k )
(z) (w)
(x 2+2k w/z; x 2k )
2
w k (x 2 z/w; x 2k )
=
(w) (z),
z
(x 2+2k z/w; x 2k )
z
w
2
(x 2+k w/z; x 2k )
(z) (w)
(x 2+k w/z; x 2k )
2
k (x 2+k z/w; x 2k )
w
(w) (z)
z
(x 2+k z/w; x 2k )

1
kw
k w
x
.
=
z
z
x x 1
z
w
By construction, the following statement is obvious.

Theorem 5.6. The following currents x (z) and operator d with h give a level k represensl2 ).
tation of Uq (

1
1
an zn :e z k h ,
x + (z) = (z):exp
(5.10)
JknKx
n=0

k|n|
1
x
n
:e z k h ,
x (z) = (z):exp
(5.11)
an z
JknKx
n=0
d =d
PF
+d ,
a
(5.12)
m2 x km
h2
am am
J2mKx JkmKx
4k
(5.13)
where
da =

m>0
and d PF is an operator such that
(k
)
1 e 2 ,
2k(k + 2)
PF
PF

d , (z) = z (z).
d , (z) = z (z),
z
z
d PF 1 e 2 =
We define the Z2k charge of , mk n and 1 e 2 to be 2 and

mod 2k, respectively.
For example, the Z2k charge of the vector
1,
+2(2 ++s )
n1
k
2,
+2(3 ++s )
n2
k
s ,
ns e 2
(5.14)

PF the irreducible parafermion module of the Z
is
+ 2 sj =1 j . Let us denote by H
,M
2k
charge M defined by the relation
M
+2Z
PF
H
,
e 2 =
M
2k1

PF
H
,M
e
M+2kn
(5.15)
nZ M=0 mod 2k
PF = {0} for M
mod 2 and HPF = HPF
Here H
,M
,M
,M+2k . We also assume the symmetry

[36]
PF
PF
PF
= Hk
,M+k
= H
,M
.
H
,M
PF as the following linear operators:

The basic parafermion currents act on the space H
,M
PF
PF
(z) : H
,M
H
,M+2
,
PF
PF
(z) : H
,M
H
,M2
.
PF is known to be [36]
The character of the q-Zk -parafermion space H
,M
4 cPF
4d PF
24 Tr
x
= ( )cM
( )
HPF x
(5.16)
,M
where cPF = 2(k1)

k+2 . cM ( ) and ( ) are the string function and Dedekinds -function,
given by (3.7) and

1
( ) = x 4 24 x 4 ; x 4 .
From (5.9) and (5.15), the level k irreducible highest-weight module V (

) of Ux (
sl2 )
with highest weight
is realized as follows:
V (
) = F a
2k1

PF
H
,M
e(M+2kn) 2 .
(5.17)
nZ M=0 mod 2k
In particular, the highest-weight vector is given by
1 1 e
2 .
(5.18)
From (5.17), the normalized character of V (

) is evaluated as follows:

c
h
(k) x 4 , y = x 4 24 TrV (
) x 4d y 2
=
2k1

M 2
cM
( )x 4k(n+ 2k ) y k(n+ 2k ) .
(5.19)
nZ M=0 mod 2k
character (3.6).
By setting y = x 2 , we reproduce the level k principally specialized

sl2 )-modules V (
) = mZ V (
) emQ . From
Now let us consider the Ux,p (
(5.17), we have
V (
) =

2k1

FM;m,
,n
(5.20)
mZ nZ M=0 mod 2k
with
PF
e(M+2kn) 2 emQ .
FM;m,
,n = F a H
,M
(5.21)
Let r be generic and note that

P |FM;m,
,n = m,
P + h|FM;m,
,n = M + m + 2kn.
From (5.21) and (5.16), the character of the space FM;m,

,n is evaluated as follows:
4 cVir
(mr(M+m+2kn))r )2
4d
24 Tr
krr
= cM
( )x
,
x
FM;m,
,n x
(5.22)
where $c_{Vir} = \frac{3k}{k+2}\left(1 - \frac{2(k+2)}{rr^*}\right)$. This coincides with the one point function (3.18) of the
fusion SOS model for a = M + m + 2kn. We hence make the following identification:
(
)
the SOS space of states: Hm,a
FM;m,
,n , a = M + m + 2kn, M
mod 2,
1
(
)
d cVir .
the corner Hamiltonian: Hm,a
24
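The identification $a = M + m + 2kn$ can be inverted: given a height $a$ and the boundary label $m$, the $\mathbb{Z}_{2k}$ charge $M$ and the integer $n$ are fixed once $M$ is taken in the range $0 \le M < 2k$. A minimal sketch of this bookkeeping follows; the function name and the choice of representative range for $M$ are assumptions made for illustration.

```python
def decompose_height(a, m, k):
    """Split a - m as M + 2*k*n with 0 <= M < 2k.

    Returns (M, n): M plays the role of the Z_2k charge, n is an integer.
    """
    big_m = (a - m) % (2 * k)          # Python's % is non-negative for a positive modulus
    n = (a - m - big_m) // (2 * k)     # exact division by construction
    return big_m, n
```

Because $2k$ is even, $M \equiv a - m \pmod 2$, so heights in $m + \ell + 2\mathbb{Z}$ automatically give $M \equiv \ell \pmod 2$.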
Furthermore let us set
Fm,
(n)
2k1

FM;m,
,n .
M=0 mod 2k
When $r$ is generic, the character of $\mathcal{F}_{m,\ell}(n)$ coincides with the one of the irreducible Virasoro module $Vir_{m,a}$ ($a \equiv \ell + m$ mod 2) associated with the coset $(\widehat{\mathfrak{sl}}_2)_k \oplus (\widehat{\mathfrak{sl}}_2)_{r-k-2}/(\widehat{\mathfrak{sl}}_2)_{r-2}$. In addition, in Appendix B, we consider the case when $r$ is an integer
$> k + 2$. In this case, $\mathcal{F}_{m,\ell}(n)$ is reducible. We observe that the BRST resolution of the
complex formed by $\mathcal{F}_{m,\ell}(n)$ yields the irreducible coset Virasoro minimal module $Vir_{m,a}$
($a \equiv m + \ell$ mod 2). These considerations lead us to the following conjecture:
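Conjecture 5.7. The space $\mathcal{F}_{m,\ell}(n)$ is isomorphic to the irreducible coset Virasoro module $Vir_{m,a}$, $a \equiv m + \ell \bmod 2$, with the central charge $c_{Vir} = \frac{3k}{k+2}\left(1 - \frac{2(k+2)}{r r^*}\right)$ and the highest weight

$$h_{m,a} = \frac{\ell(k-\ell)}{2k(k+2)} + \frac{(mr - a r^*)^2 - k^2}{4 k r r^*}.$$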
5.2.2. The vertex operators

The vertex operator (u) ( = k, k + 2, . . . , k) of the elliptic algebra Ux,p (
sl2 ) in
(5.7) acts on the space FM;m,
,n as
(u) : FM;m,
,n FM+;m,k
,n
and satisfies the commutation relation (5.8). The relation (5.8) is similar to that of the lattice
vertex operators (3.22) but not precisely the same. Noting the symmetry (2.19), it turns out
that the following gauge transformation (u)

(u) resolves this discrepancy:
1
(P + h, P + h + )k

k
2
k
1
+
= k,k (u)F u + r
.
2
(P + h, P + h + )k
(u) = (u)
(5.23)
In fact (u) satisfies the commutation relation

2 (u2 )1 (u1 )

(k,k)
=
W
1 +2 =1 +2
P +h
P + h 2
P + h 1
P + h 1 2

u1 u2

1 (u1 )2 (u2 ).
This is exactly the same commutation relation as (3.22) if we make the identification
a+,a (u) = (u)
(5.24)
(
)
on FM;m,
,n = Hm,a (a = M + m + 2kn, M
mod 2). This is the realization of the
sl2 ).
half-transfer matrix in terms of the vertex operator of Ux,p (
As discussed in [21], there is a second realization of the vertex operators. This is due to
(
)
(
)
the symmetries of the space of states Hm,a
= Hm,a
and the Boltzmann weights

a b
a b
u = W (k,k)
u .
W (k,k)

c d
c d
In fact from (3.22), we have
c,a (u2 )a,b (u1 ) =

W
(k,k)
c
a
d
b

u1 u2 c,d (u1 )d,b (u2 ).

(5.25)
On the other hand, we have an operator

(
)
(k
)
a ,a (u) : Hm,a Hm,a ,
(5.26)
which can be shown, using the same argument that leads to (3.22), to satisfy the commutation relation
c,a (u2 )a,b (u1 )

c d
=
W (k,k)
a b

u1 u2 c,d (u1 )d,b (u2 ).

(5.27)
Comparing this with (5.25), we can simply make the following identification.
a,b (u) = a,b (u)
(5.28)
(
)
. From (5.24), we have
on Hm,b
(u) = (u)|H(
)
m,a

k+
2

k
1
+

= k,k (u)F u + r
.
(
)
2
(P + h, P + h )k Hm,a
(5.29)
5.2.3. The tail operator

We next consider the realization of the tail operator introduced in Section 4.3. The tail
(
)
(
)
operator a ,a (u) : Hm,a Hm,a is characterized by the commutation relation (4.2). In
addition, it follows from formula (4.5) that we only have to consider the case a a 2Z.
In a similar way to the case of vertex operators, we seek to realize the tail operator in the
following form
a+,a (u) = (u)
( 2Z)
(
)
(5.30)
on the space FM;m,

,n = Hm,a (a = M + m + 2kn, M
mod 2). Note that from (5.22),
the tail operator should satisfy

P + h, (u) = .
P , (u) = 0,
Substituting (5.24) and (5.30) into (4.2), we obtain the following commutation relation:
1 (u1 )2 (u2 )

=

L
(k)
2 {k,k+2,...,k}
1 +2 =1 +2
P +h
P + h 1

P + h 2
u u1
P + h 1 2 2
2 (u2 )1 (u1 ).
(5.31)
Here L(k) are the k-fusion L-matrices, explicit formulae for which are given in Appendix C.
Let us first consider the case < 0 in (u). Setting 1 = 2k, 2 = k, 1 = 2s,
2 = k + 2s (s N) in (5.31), we have
2k (u1 )k (u2 )

k

P +h
(k)
=
L
P + h + 2k
s=0
P + h + k 2s
P +h+k

u2 u1 k+2s (u2 )2s (u1 ),

(5.32)
is given by (C.5) with m = P + h, n = P

has
simple poles at u = 0, 1, . . . , k + 1. We take the residue at u1 = u2 + k 1. Assuming
that 2k (u1 )k (u2 ) on the left-hand side of (5.32) does not have a pole at u1 = u2 +k 1,
we have the necessary condition
where the coefficient L(k)
0=
+ h + 2k. This L(k)
k

[P + h + k]s [P + h + 2k s 1]ks [k + s 1]s [k]ks
s=0
k+2s (u2 )2s (u2 + k 1).
(5.33)
Substituting in formula (5.23), we obtain the recursion relation for the tail operator
k

0=
[P + h + 2k]s [P + h + 3k s 1]ks [k + s 1]s [k]ks
s=0
ks

k
1
F+ u + r
2s (u2 + k 1)
.
2
(P + h 2s, P + h k)k
To solve this for (u + k 1), we make the following ansatz:

k
2s (u + k 1) = F + (u + r)s s (P + h),
2
(5.34)
where s (P + h) is a function to be determined. Then the necessary condition (5.34) reduces to the relation
k

[P + h]s [P + h + k s 1]ks [k + s 1]s [k]ks
s=0
= 0.
s (P + h)
(P + h 2s, P + h k)k
This equation is satisfied if we choose s (P + h) = gP1+h gP +h2s (where ga is defined

below (2.18)). We hence obtain the identification

s
k
+
2s (u + k 1) = gP +h F u + r gP1+h .
2
To check that this is also a sufficient condition, we substitute (5.34) back into the full
commutation relation (5.32). We find that (5.32) then reduces to the same theta function
identity that occurs in the commutation relation (5.8) for vertex operators, which we have
checked for the cases k = 1, 2, 3. See Appendix D for details.
Let us next study (u) for the case > 0. For this purpose, we use the second realization of the vertex operators (5.29). In addition we have the symmetry
(k)
a
(k) (u)a
b = (u)b ,
Hence
(k)
(k) (u)a
(u)ab .
b =
b
d

u .

Therefore from (4.2), we have

(k) c
c,a (u1 )a,b (u2 ) =
L
a
d
b

(k)
a
c
b
d

u = L(k) a

c

u2 u1 c,d (u2 )d,b (u1 ).

(5.35)
On the other hand, consider the operator

(
)
(
)
a ,a (u) : Hm,a
Hm,a
.
From the same argument that leads to (4.2), we have the commutation relation

c d
u
c,a (u1 )a,b (u2 ) =
L(k)
u
1 c,d (u2 )d,b (u1 ).
a b 2
d
(5.36)
Comparing (5.35) and (5.36) and using the second realization of the vertex operator (5.29),
we identify
a,b (u) = a,b (u)
(
)
(
)
under the identification of Hm,a with Hm,a . Therefore, we obtain the following realization
a+2s,a (u + k 1) = a2s,a (u + k 1)

s

k
= gP +h F + u + r gP1+h
.
(
)
2
Hm,a
We are thus led to the following simple conjecture.
Conjecture 5.8 (The realization of the tail operator). The tail operator a2s,a (u) (s N)
is realized by the following power of the half-current:
a2s,a (u)

s

k
1
+
= gP +h F u r + 1 gP +h
(
)
2
Hm,a

dw1
dws
=
F (v1 ) F (vs )
2iw1
2iws
J1
(1)s
Js

s
[P + h 2s] [u vj + P + h k2 2s + 2j ]
(
) .
[P + h]
[u vj k2 + 1]
Hm,a
j =1
The integration contours $J_j$ ($1 \le j \le s$) are given by

Jj : x k z1 , x 2 wj +1 < |wj | < p 1 x k z1 .
6. Summary
In this paper, we have first generalized the approach of [21] in order to obtain the trace formula (4.5) for N-point correlation functions of the level k fusion analogue of the eight-vertex model. The objects that appear in this trace are the space of states $\mathcal{H}^{(\ell)}_{m,a}$, the corner Hamiltonian $H^{(\ell)}_{m,a}$, the half-transfer matrices $\Phi_{b,a}(u)$ and $\Phi^{*}_{b,a}(u)$, and the tail operator $\Lambda_{b,a}(u)$. We have constructed each of these objects in terms of the algebra $U_{x,p}(\widehat{sl}_2)$ in Section 5. A multiple integral formula for (4.5) then follows rather simply.
In a following paper [25], we shall examine the k = 2 case in detail. We shall make use of the rather simpler 1-boson/1-fermion free field realization that exists in this case in order to produce and analyze explicit expressions for certain correlation functions.
Acknowledgements
The authors would like to thank M. Jimbo for stimulating discussions. They also
thank A. Kuniba, M. Lashkevich, T. Nakanishi, A. Nakayashiki, M. Okado, Y. Pugai,
Y.H. Quano, M. Rossi, J. Shiraishi, and T. Takebe for useful conversations. T.K. and H.K.
are grateful to colleagues at Heriot-Watt University for their kind hospitality during the period when this work was started. H.K. and R.W. respectively thank P. Goddard, in DAMTP,
Cambridge University and M. Jimbo at Tokyo University for their warm hospitality.
T.K. is supported by the Grant-in-Aid for Young Scientists (B) (14740107) from the
JSPS. H.K. thanks the JSPS and Royal Society for an exchange fellowship, and acknowledges support from the Grant-in-Aid for Scientific Research (C) 15540033, JSPS. R.A.W.
acknowledges partial support given by the EUCLID research training network funded by
the European Commission under contract HPRN-CT-2002-00325.
Appendix A. The proof of formula (3.19)

It is convenient to use the parametrization a =
+ m + 2(kn + t) (n Z, 0 t
k 1 mod 2). From (3.18), we have

(
)
[a] TrH(
) x 4Hm,a
m,a
am+
+2Z
k1

[
+ m + 2(kn + t)]cM
( )x
(mr(
+m+2(kn+t))r )2
krr
nZ t=0 mod k
2k1

[M + m + 2kn]cM
( )x
(mr(M+m+2kn))r )2
krr
nZ M=0 mod 2k
with M a m mod 2. Note that cM

( ) = 0 for M
mod 2. Using the formula
2

u
2 s(2ur)
u
s
rs
[u] = x r
x
, this can be rewritten as follows:
sZ () x
m2
x r m
()s x r s(s+1) x 2ms I (s),

sZ
where we define
I (s) =
2k1

k2
cM
( )x k (k(2ns)+M+ 2 ) x 4 .
k 2
nZ M=0 mod 2k
We show that I (s) (s Z) is independent of s. In fact, for the case s = 2u + 1 (u Z), we

can eliminate u by shifting n n + u. Then we obtain
I (2u + 1) =
2k1

M 2
cM
( )x 4k(n+ 2k ) x 2k(n+ 2k )
nZ M=0 mod 2k
(k)
=
( ).
Let us next set s = 2u. We have

I (2u) =
2k1

M 2
cM
( )x 4k(n+ 2k ) x 2k(n+ 2k ) .
nZ M=0 mod 2k
( ) = cM
( ), we find that
By changing n n, M M and using the symmetry cM
I (2u) coincides with I (2u + 1). Therefore the LHS of (3.19) is
m2
LHS = x r m
(k)
()s x r s(s+1) x 2ms
( )
sZ
(k)
= [m]
( ).
This coincides with the RHS.
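
As a numerical aside, the rewriting of [u] as a sum over integers s used in this proof is an instance of the Jacobi triple product identity. The following Python sketch checks that identity and the oddness of [u]. It assumes the convention $[u] = x^{u^2/r - u}\,\Theta_{x^{2r}}(x^{2u})$ with $\Theta_p(z) = (z;p)_\infty (p/z;p)_\infty (p;p)_\infty$; the function names and the sample values of x, r and u are illustrative choices and are not taken from the text.

```python
def theta_product(z, p, terms=200):
    """Theta_p(z) = (z;p)_inf (p/z;p)_inf (p;p)_inf as a truncated product."""
    prod = 1.0
    for n in range(terms):
        prod *= (1 - z * p**n) * (1 - p**(n + 1) / z) * (1 - p**(n + 1))
    return prod


def theta_series(z, p, cutoff=40):
    """Jacobi triple product series: sum over s of (-1)^s z^s p^{s(s-1)/2}."""
    total = 0.0
    for s in range(-cutoff, cutoff + 1):
        # s*(s-1) is always even, so the exponent is an exact integer
        total += (-1) ** s * z**s * p ** (s * (s - 1) // 2)
    return total


def bracket(u, x, r):
    """[u] = x^{u^2/r - u} Theta_{x^{2r}}(x^{2u}) (assumed convention)."""
    return x ** (u * u / r - u) * theta_product(x ** (2 * u), x ** (2 * r))


if __name__ == "__main__":
    p, z = 0.3, 1.7
    print(theta_product(z, p), theta_series(z, p))   # the two values agree
    print(bracket(0.7, 0.5, 3.5), bracket(-0.7, 0.5, 3.5))  # opposite signs
```

With |p| < 1 both the product and the series converge quickly, so modest truncations are enough for a sanity check of this kind.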
Appendix B. The BRST resolution of Fm, (n)

Let $r\ (> k + 2) \in \mathbb{Z}$ and fix $m, \ell \in \mathbb{Z}$ with $0 \le \ell \le k$. Note that regarding the half currents $E^+(u)$ and $F^+(u)$ as the screening currents [23], we can define the q-Virasoro algebra associated with the coset $(\widehat{sl}_2)_k \oplus (\widehat{sl}_2)_{r-k-2} / (\widehat{sl}_2)_{r-2}$ as their commutant [39]. As such a q-Virasoro module, $\mathcal{F}_{m,\ell}(n)$ is reducible. Consider the BRST operator given by

$$Q^+_s = E^+(u)^s : \mathcal{F}_{m,\ell}(n) \to \mathcal{F}_{m-2s,\ell}(n).$$

Proposition B.1 [23]. The BRST operator $Q^+_s$ is independent of u and is nilpotent in the following sense:

$$Q^+_s Q^+_{r-s} = Q^+_{r-s} Q^+_s = 0.$$
Setting Q2j = Q+
m , Q2j +1 = Qr m (j Z), we then have the following complex Cm,
Q2
Q1
Q0
Q1
Fm+2r ,
(n) Fm,
(n) Fm,
(n) .
We conjecture the following statement about the cohomology group $H^j(C_{m,\ell})$ of this complex [23]:

$$H^j(C_{m,\ell}) = 0 \quad (j \neq 0).$$

Then by the Euler–Poincaré principle, we can evaluate the character of the 0th cohomology group as follows:
cVir
TrH 0 (Cm,
) x 4(d+ 24 )

cVir
cVir
TrFm2r j,
(n) x 4(d+ 24 ) TrFm2r j,
(n) x 4(d+ 24 ) ,
=
(B.1)
j Z
where from (5.21), we have
TrFm2r j,
(n) x 4(d+
cVir
24 )
2k1

cM
( )x
(mr(Mm2r j +2kn))r 2rr j )2

krr
M=0 mod 2k
(B.1) coincides with the branching function $b^{(\ell)}_{m,a}(\tau)$, i.e., the character of the irreducible coset Virasoro minimal module $Vir_{m,a}$ ($a \equiv m + \ell \bmod 2$). This is also known to be equal to the one point function of the k-fusion RSOS model with height restriction $1 \le a \le r - 1$ and $1 \le m \le r - k - 1$ [10,31]. Hence in this case we can make the identification
RSOS(
)
H 0 (Cm,
) Hm,a
.
Appendix C. Fusion of the L-matrix

In this appendix, we give explicit formulae for the fused L-matrix. The L-matrix L(1)
is defined by

a b
u =
(1) (u)dc (1) (u)ab .
L(1)

c d
From (2.21) and (2.23), we obtain the formula

n+m
[u nm
m m 1
2 ][ 2 ]
u
=
L(1)
,
n n1
[u][n]

nm

[u n+m
(1) m m 1
2 ][ 2 ]
u
=
L
.
n n1
[u][n]
The k-fused L-matrix L(k) is given by

(k) m0 mk
0
u =
L
(k) (u)nnk0 (k) (u)m
mk .

n0 nk

According to the fusion formulae for (k) (u) and (k) (u) (2.29) and (2.31), the L(k)
satisfies the following fusion formula:

m0 mk
u
L(k)
n0 nk

m0 m1
(1) m1 m2
u
+
k
1
L
u
=
L(1)

n0 n1
n1 n2
n1 ,n2 ,...,nk1

(1) mk1 mk
L
(C.1)
u ,
nk1 nk
where the right-hand side is independent of the dynamical variables m1 , m2 , . . . , mk1 . By
induction making use of (C.1), we obtain the following compact expressions:

(k) m m k + 2i
u
L
n n k + 2j

Min(i,kj
)

m
mk+i
u + i
=
L(ki)
n n k + 2j i + 2l
l=Max(0,ij )

mk+i
m k + 2i
(i)
u
L
n k + 2j i + 2l n k + 2j

Min(i,j
)
m + i
(i) m
=
u
+
k
i
L
n n i + 2l
l=Max(0,i+j k)

m+i
m k + 2i
u .
L(ki)
n i + 2l n k + 2j
Here 0 i, j k and we have

m + k
(k) m
L
u
n n + k 2j
1 (n+m)+k1j 1 (nm) u+ 1 (n+m) u+ 1 (mn)
2
kj
n+k12j n+kj u
kj
kj
(C.2)

m
m k
u
n n + k 2j
1 (nm)+k1j 1 (n+m) u+ 1 (nm) u 1 (m+n)

L(k)
kj
j
j
n+k12j n+kj u
j
kj
k
kj
(C.3)
In particular, we use the following formulae in Section 5.2.3:

m m + k 2j
L(k)
u
n
n+k
=
nm
[ n+m
2 + k 1 + j ]kj [ 2 1 + j ]j
[n + k 1]k

L(k)
=
[u +
m
n
j ]kj [u
[u]k

m + k 2j
u
nk
mn
2
nm
[ n+m
2 ]j [ 2 ]kj [u +
m+n
2
+ j k]j
j ]kj [u +
[n]k [u]k
m+n
2
nm
2
(C.4)
+ j k]j
(C.5)
Recently one of the authors obtained an explicit expression for $L^{(k)}$ in terms of the very-well-poised elliptic hypergeometric series [40].
Appendix D. Commutation relations of the tail and vertex operators

In this appendix, we check the commutation relations between the tail operator (u)
and the vertex operator (u). Let us consider an integral of the form

$$\oint \frac{dw_1}{2\pi i w_1} \oint \frac{dw_2}{2\pi i w_2}\, F(v_1) F(v_2)\, f(v_1, v_2),$$

where the integration contours for $w_1$ and $w_2$ are the same. The commutation relation

$$F(v_1) F(v_2) = \frac{[v_1 - v_2 - 1]}{[v_1 - v_2 + 1]}\, F(v_2) F(v_1)$$

implies that this integral is equal to

$$\oint \frac{dw_1}{2\pi i w_1} \oint \frac{dw_2}{2\pi i w_2}\, F(v_1) F(v_2)\, f(v_2, v_1)\, \frac{[v_2 - v_1 - 1]}{[v_2 - v_1 + 1]}.$$

Observing this, we define the notion of weak equality. The functions $f(v_1, v_2)$ and $g(v_1, v_2)$ are equal in the weak sense if

$$f(v_1, v_2) + \frac{[v_2 - v_1 - 1]}{[v_2 - v_1 + 1]}\, f(v_2, v_1) = g(v_1, v_2) + \frac{[v_2 - v_1 - 1]}{[v_2 - v_1 + 1]}\, g(v_2, v_1).$$

We write $f(v_1, v_2) \sim g(v_1, v_2)$ to denote weak equality.
Let us consider the commutation relation

k

P +h
L(k)
2k (u2 )k (u1 ) =
P + h + 2k

P + h + k 2s
u
u
2
P +h+k 1
s=0
k+2s (u1 )2s (u2 ).
(D.1)
This reduces to the following weak equality

I (v1 , v2 , . . . , vk ) 0,
where
I (v1 , v2 , . . . , vk )
!
k

[u2 vj + n k2 + 2j 1]
1
[n]
k
= (1)
[n + 2k] (n + k, n + 2k)k
[u2 vj k2 ]
j =1
k

(k)
s=0
(1)s
ks

n + k 2s
u + u2 + 1
n+k 1
n
n + 2k
1
[n + k 2s]
[n + k] (n, n + k 2s)k
[u1 vj + n
k
2
[u1 vj +
j =1
k
+ 2j 1]
k
2]
[u2 vj + n
3k
2
+ 2j 1]
[u2 vj k2 ]
j =ks+1
Let us consider
k (u2 ) k (u1 ) =
k

W
(k,k)
s=0
P +h
P +h+k
P + h + k 2s
P +h

u1 u2

k+2s (u1 ) k2s (u2 ).
(D.2)
This in turn reduces to the weak equality

I (v1 , v2 , . . . , vk ) 0,
where
I (v1 , v2 , . . . , vk )
=
k
k

[u1 vj k2 ]
[u2 vj + n
j =1
[u1 vj +
k

s=0
ks

j =1
W k,k
k
2 ] j =1
+ 2j 1]
[u2 vj + k2 ]

ks
[u2 vj k ]
n + k
2
u
u
2
k
n 1
[u
v
+
2
j
2]
j =1
n
n + k 2s
[u1 vj + n
k
2
k
2
+ 2j 1]
[u1 vj +
k
2]
k

j =ks+1
[u2 vj + n
3k
2
+ 2j 1]
[u2 vj + k2 ]
We have checked $I(v_1, v_2, \ldots, v_k) \sim 0$ for the cases k = 1, 2, 3.

Proposition C.2. The L(k) -matrix and the Boltzmann weight Wk,k are related by

n
n + k
u
W k,k
n + k 2s
n
!

[n k][n + 2k] (n + 2k, n + k)k (k)
n
n + k 2s
=
u
+
1
.
L
n + 2k
n+k
[n + k][n] (n, n + k 2s)k
By using this proposition, we have
I (v1 , v2 , . . . , vk |u1 , u2 )
!
[n + 2k]
= (1)k
(n + k, n + 2k)k I (v1 , v2 , . . . , vk |u1 , u2 )
[n]
k

[u2 vj k2 ]
j =1
[u2 vj + k2 ]
Therefore (D.1) and (D.2) reduce to the same identity, which we have checked for k =
1, 2, 3. This supports our conjecture (5.34) for the explicit form of the tail operator.
Let us check a commutation relation of the tail and vertex operators which does not
reduce to one for just vertex operators in the above way. Let us consider the commutation
relation for s c,
2s (u2 )k (u1 )

k

P +h
(k)
=
L
P + h + 2s
t=0

P + h + k 2t
u u2 k+2t (u1 )2k2s2t (u2 ).
P + h k + 2s 1
Taking the residue at u1 = u2 k, a necessary condition becomes the following theta

identity
k

[n + k s]t [s]kt [2k + n 1 s t]kt [s + t 1]t
t=0
(1)s+tk
1
[n 2s 2t + 2k]
= 0.
[n]
(n + k 2s, n + 2k 2s 2t)k
References
[1] R.J. Baxter, Partition function of the eight-vertex lattice model, Ann. Phys. 70 (1972) 193–228.
[2] R.J. Baxter, Eight-vertex model in lattice statistics and one-dimensional anisotropic Heisenberg chain. 1. Some fundamental eigenvectors, Ann. Phys. 76 (1973) 1–24.
[3] R.J. Baxter, Eight-vertex model in lattice statistics and one-dimensional anisotropic Heisenberg chain. 2. Equivalence to a generalized ice-type model, Ann. Phys. 76 (1973) 25–47.
[4] R.J. Baxter, Eight-vertex model in lattice statistics and one-dimensional anisotropic Heisenberg chain. 3. Eigenvectors of the transfer matrix and Hamiltonian, Ann. Phys. 76 (1973) 48–71.
[5] G.E. Andrews, R.J. Baxter, P.J. Forrester, Eight-vertex SOS model and generalized Rogers–Ramanujan-type identities, J. Stat. Phys. 35 (1984) 193–266.
[6] D.A. Huse, Exact exponents for infinitely many new multicritical points, Phys. Rev. B 30 (1984) 3908–3915.
[7] P.P. Kulish, N.Yu. Reshetikhin, E.K. Sklyanin, Yang–Baxter equation and representation theory: I, Lett. Math. Phys. 5 (1981) 393–403.
[8] P.P. Kulish, E.K. Sklyanin, Solutions of the Yang–Baxter equation, J. Sov. Math. 19 (1982) 1596–1620.
[9] E. Date, M. Jimbo, T. Miwa, M. Okado, Fusion of the eight vertex SOS model, Lett. Math. Phys. 12 (1986) 209–215.
[10] E. Date, M. Jimbo, A. Kuniba, T. Miwa, M. Okado, Exactly solvable SOS models, Nucl. Phys. B 290 (1987) 231–273.
[11] E. Date, M. Jimbo, A. Kuniba, T. Miwa, M. Okado, Exactly solvable SOS models II, Adv. Stud. Pure Math. 16 (1988) 17–122.
[12] B. Davies, O. Foda, M. Jimbo, T. Miwa, A. Nakayashiki, Diagonalization of the XXZ Hamiltonian by vertex operators, Commun. Math. Phys. 151 (1993) 89–153.
[13] M. Jimbo, T. Miwa, Algebraic Analysis of Solvable Lattice Models, CBMS Regional Conference Series in Mathematics, vol. 85, American Mathematical Society, Providence, 1994.
[14] O. Foda, K. Iohara, M. Jimbo, R. Kedem, T. Miwa, H. Yan, An elliptic quantum algebra for $\widehat{sl}_2$, Lett. Math. Phys. 32 (1994) 259–268.
[15] O. Foda, K. Iohara, M. Jimbo, R. Kedem, T. Miwa, H. Yan, Notes on highest weight modules of the elliptic algebra $A_{q,p}(\widehat{sl}_2)$, Prog. Theor. Phys. Suppl. 118 (1995) 1–34.
[16] M. Jimbo, H. Konno, S. Odake, J. Shiraishi, Quasi-Hopf twistors for elliptic quantum groups, Transform. Groups 4 (1999) 303–327.
[17] M. Jimbo, T. Miwa, A. Nakayashiki, Difference equations for the correlation functions of the eight-vertex model, J. Phys. A 26 (1993) 2199–2209.
[18] M. Jimbo, R. Kedem, H. Konno, T. Miwa, R.A. Weston, Difference equations in spin chains with a boundary, Nucl. Phys. B 448 (1995) 429–456.
[19] M. Jimbo, T. Miwa, Y. Ohta, Structure of the space of states in RSOS models, Int. J. Mod. Phys. A 8 (1993) 1457–1477.
[20] S. Lukyanov, Y. Pugai, Multi-point local height probabilities in the integrable RSOS models, Nucl. Phys. B 473 (1996) 631–658.
[21] M. Lashkevich, Y. Pugai, Free field construction for correlation functions of the eight-vertex model, Nucl. Phys. B 516 (1998) 623–651.
[22] M. Lashkevich, Free field construction for the eight-vertex model: Representation for form factors, Nucl. Phys. B 621 (2002) 587–621.
[23] H. Konno, An elliptic algebra $U_{q,p}(\widehat{sl}_2)$ and the fusion RSOS models, Commun. Math. Phys. 195 (1998) 373–403.
[24] M. Jimbo, H. Konno, S. Odake, J. Shiraishi, Elliptic algebra $U_{q,p}(\widehat{sl}_2)$: Drinfeld currents and vertex operators, Commun. Math. Phys. 199 (1999) 605–647.
[25] T. Kojima, H. Konno, R. Weston, The vertex-face correspondence and correlation functions of the fusion eight-vertex models II: The 21-vertex model, in preparation.
[26] H. Konno, Fusion of Baxter's elliptic R-matrix and the vertex-face correspondence, math.QA/0503726, RIMS Koukyuroku, 2005, in press.
[27] O. Foda, M. Jimbo, T. Miwa, K. Miki, A. Nakayashiki, Vertex operators of solvable lattice models, J. Math. Phys. 35 (1994) 13–46.
[28] E. Date, M. Jimbo, A. Kuniba, T. Miwa, M. Okado, One-dimensional configuration sums in vertex models and affine Lie algebra characters, Lett. Math. Phys. 17 (1989) 69–77.
[29] V.G. Kac, D.H. Peterson, Infinite dimensional Lie algebras, theta-functions and modular forms, Adv. Math. 53 (1984) 125–264.
[30] E. Date, M. Jimbo, A. Kuniba, T. Miwa, M. Okado, Paths, Maya diagrams and representations of $\widehat{sl}(r, \mathbf{C})$, Adv. Stud. Pure Math. 19 (1989) 149–191.
[31] E. Date, M. Jimbo, T. Miwa, M. Okado, Automorphic properties of local height probabilities for integrable solid-on-solid models, Phys. Rev. B 35 (1986) 2105–2107.
[32] M. Idzumi, T. Tokihiro, K. Iohara, M. Jimbo, T. Miwa, T. Nakashima, Quantum affine symmetry in vertex models, Int. J. Mod. Phys. A 8 (1993) 1479–1511.
[33] M. Idzumi, Level 2 irreducible representations of $U_q(\widehat{sl}_2)$, vertex operators, and their correlation functions, Int. J. Mod. Phys. A 9 (1994) 4449–4484.
[34] A.H. Bougourzi, R.A. Weston, N-point correlation functions of the spin-1 XXZ model, Nucl. Phys. B 417 (1994) 439–462.
[35] A. Matsuo, A q-deformation of Wakimoto modules, primary fields and screening operators, Commun. Math. Phys. 160 (1994) 33–48.
[36] D. Gepner, Z. Qiu, Modular invariant partition functions for parafermionic field theories, Nucl. Phys. B 285 (1987) 423.
[37] J. Lepowsky, M. Primc, Structure of the standard modules for the affine Lie algebra $A_1^{(1)}$, Contemp. Math. 46 (1985).
[38] N.H. Jing, Higher level representations of the quantum affine algebra $U_q(\widehat{sl}_2)$, J. Algebra 182 (1996) 448–468.
[39] E.V. Frenkel, N.Yu. Reshetikhin, Deformations of W-algebras associated to simple Lie algebras, Commun. Math. Phys. 197 (1998) 1–32.
[40] H. Konno, The vertex-face correspondence and the elliptic 6j-symbols, math.QA/0503725, Lett. Math. Phys., in press.
Noncommutative geometry and non-Abelian Berry phase in the wave-packet dynamics of Bloch electrons
Ryuichi Shindou a , Ken-Ichiro Imura b
a Department of Physics, University of Tokyo, Hongo 7-3-1, Bunkyo, Tokyo 113-0033, Japan
b Condensed Matter Theory Laboratory, RIKEN (Wako), Hirosawa 2-1, Wako 351-0198, Japan
Received 27 November 2004; accepted 27 May 2005

Abstract
Motivated by a recent proposal on the possibility of observing a monopole in the band structure,
and by an increasing interest in the role of Berry phase in spintronics, we studied the adiabatic motion
of a wave packet of Bloch functions, under a perturbation varying slowly and incommensurately to
the lattice structure. We show, using only the fundamental principles of quantum mechanics, that the
effective wave-packet dynamics is conveniently described by a set of equations of motion (EOM)
for a semiclassical particle coupled to a non-Abelian gauge field associated with a geometric Berry
phase.
Our EOM can be viewed as a generalization of the standard Ehrenfest's theorem, and their derivation was asymptotically exact in the framework of linear response theory. Our analysis is entirely
based on the concept of local Bloch bands, a good starting point for describing the adiabatic motion of a wave packet. One of the advantages of our approach is that the various types of gauge
fields were classified into two categories by their different physical origin: (i) projection onto specific bands, (ii) time-dependent local Bloch basis. Using those gauge fields, we write our EOM in
a covariant form, whereas the gauge-invariant field strength stems from the noncommutativity of
covariant derivatives along different axes of the reciprocal parameter space. On the other hand, the
degeneracy of Bloch bands makes the gauge fields non-Abelian.
For the purpose of applying our wave-packet dynamics to the analyses on transport phenomena
in the context of Berry phase engineering, we focused on the Hall-type and polarization currents.
E-mail address: imura@riken.jp (K.-I. Imura).

doi:10.1016/j.nuclphysb.2005.05.019
R. Shindou, K.-I. Imura / Nuclear Physics B 720 [FS] (2005) 399–435
Our formulation turned out to be useful for investigating and classifying various types of topological
current on the same footing. We highlighted their symmetries, in particular, their behavior under
time reversal (T ) and space inversion (I ). The result of these analyses was summarized as a set of
cancellation rules. We also introduced the concept of parity polarization current, which may embody
the physics of orbital current. Together with charge/spin Hall/polarization currents, this type of orbital
current is expected to be a potential probe for detecting and controlling Berry phase.
1. Introduction
The search for a quantized magnetic monopole has a long history [1,2]. Recently a group of condensed-matter physicists [3,4] embodied the idea of detecting a monopole in the band structure [5]. In crystal momentum space, monopoles appear as a source or a sink of the reciprocal magnetic field [6,7] associated with the geometric phase of Bloch electrons. The geometric phase of a Bloch electron, i.e., its Berry phase, has also attracted much attention on the technological side, in particular in the context of spintronics. The spin Hall effect has been of much theoretical concern [8–12], since it may provide an efficient way to induce spin current in a semiconductor sample on which spintronic devices [13] will be constructed.
The subject studied in this paper stands at the interface between the forefront of the search for a monopole and the latest technology of spintronics. We study the wave-packet dynamics of a Bloch electron under perturbations slowly varying in space and in time. We derive and analyze a set of equations of motion (EOM) which describes the center-of-mass motion of such a wave packet together with its internal motion associated with its (pseudo)spin. A reciprocal gauge field of geometric origin (Berry connection) appears naturally in such EOM [7]. Then we combine our formalism with the Boltzmann transport theory to describe such phenomena as spin and orbital transport. Its relevance to quantum charge/spin pumping [14,15] will also be briefly discussed.
Before plunging into the detailed description of our project, let us briefly recall what the Berry phase is, and how it has come to be widely recognized in the community. In his landmark paper [16], Berry introduced it as a quantal phase acquired by a wave function whose Hamiltonian is subject to an adiabatic perturbation. The Berry connection, i.e., a gauge field, appears as a phase of the overlap of two wave functions infinitesimally separated in the adiabatic parameter space. Before being formulated in such a systematic manner, the Berry phase had, however, already been recognized and discussed, albeit for somewhat restricted cases, in several independent contexts. The molecular Aharonov–Bohm effect discussed in Ref. [17] is nothing but a manifestation of the Berry phase. Its relevance to band structure had also been recognized in limited situations, such as the anomalous Hall effect (AHE) [18,20] as well as in the study of quantized Hall conductance [21]. The role of the Berry phase in piezo- and ferro-electrics has also been of much theoretical interest [22]. Recently, the Berry phase in AHE has attracted renewed attention, revealing its rich topological structures [3,4,6,7,23–26]. The Berry phase has also been generalized to a non-Abelian case [27].
The equations of motion (EOM) for a wave packet of Bloch functions$^{1}$ are instrumental in all the analyses done in the paper. In order to illustrate our program, we begin with some details of the description of such EOM. A wave packet of Bloch functions is localized in the phase space around $(\vec{k}, \vec{x})$ (where $\vec{k}$ is a crystal momentum characterizing the Bloch function). The wave packet is also composed of a specific Bloch band n, whose energy dispersion relation is given by $\epsilon^{(0)}_n(\vec{k})$. The center of mass coordinates $(\vec{k}, \vec{x})$ obey a set of classical EOM, as Ehrenfest's theorem says. In the presence of an electromagnetic field $(\vec{E}, \vec{B})$, the motion is subject to electric and Lorentz forces,

$$\frac{d\vec{k}}{dt} = -e\left[\vec{E}(\vec{x}) + \frac{d\vec{x}}{dt} \times \vec{B}(\vec{x})\right], \tag{1}$$

$$\frac{d\vec{x}}{dt} = \frac{\partial \epsilon^{(0)}_n(\vec{k})}{\partial \vec{k}}. \tag{2}$$
These EOM, together with the Boltzmann transport theory, describe the electromagnetic response of the system. To see this point, let us express the charge current in terms of the momentum distribution function $f(\vec{k})$ as

$$\vec{J}_C = -e \int \frac{d^D k}{(2\pi)^D}\, f(\vec{k})\, \frac{d\vec{x}}{dt}, \tag{3}$$

where D is the dimension of coordinate space. The net current vanishes in thermal equilibrium. A finite net current appears when either
(1) $f(\vec{k})$ deviates from its equilibrium value, or
(2) $d\vec{x}/dt$ acquires an anomalous term, i.e., an anomalous velocity.
Case (1) corresponds obviously to the usual ohmic transport, in which the current is induced by a small deformation of a Fermi sphere from its thermally equilibrated distribution.
In this case the current is, therefore, carried only by the electrons in the vicinity of the Fermi
surface.
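
To make Case (1) concrete, the following Python sketch evaluates Eq. (3) for a one-dimensional toy band. It is only an illustration under stated assumptions: the tight-binding dispersion $-2t\cos k$, the Fermi function with an arbitrary temperature, the rigid shift of the distribution that mimics an applied field, and all names and parameter values are hypothetical choices that do not come from the text.

```python
import math


def fermi(energy, mu=0.0, temperature=0.1):
    """Fermi-Dirac occupation (units with k_B = 1)."""
    return 1.0 / (math.exp((energy - mu) / temperature) + 1.0)


def charge_current(shift=0.0, hopping=1.0, charge=1.0, n_k=2000):
    """J_C = -e * integral dk/(2 pi) f(k) dx/dt for the band -2 t cos k.

    `shift` rigidly displaces the occupied states in k, a crude stand-in
    for the deformation of the Fermi sea in ohmic transport.
    """
    total = 0.0
    for i in range(n_k):
        k = -math.pi + 2.0 * math.pi * i / n_k
        occupation = fermi(-2.0 * hopping * math.cos(k - shift))
        velocity = 2.0 * hopping * math.sin(k)   # d(epsilon)/dk, Eq. (2)
        total += occupation * velocity
    return -charge * total / n_k   # uniform grid: the mean equals dk/(2 pi)


if __name__ == "__main__":
    print("equilibrium current:", charge_current(0.0))
    print("shifted distribution:", charge_current(0.2))
```

In equilibrium the integrand is odd in k and the current cancels to rounding error, whereas the shifted distribution leaves a finite current.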
The Berry phase contribution to Eq. (3) corresponds to Case (2), and involves, in contrast to Case (1), all the electrons below the Fermi surface. This type of geometric current might also be dissipationless [10,28,29]. When the Berry connection is taken into account, the classical EOM, in particular Eq. (2), is subject to a modification. In terms of a reciprocal magnetic field $\vec{\mathcal{B}}$, the EOM for $\vec{x}$ now reads [7],

$$\frac{d\vec{x}}{dt} = \frac{\partial \epsilon_{\mathrm{eff}}(\vec{k}, \vec{x}, t)}{\partial \vec{k}} + \frac{d\vec{k}}{dt} \times \vec{\mathcal{B}}(\vec{k}), \tag{4}$$

where $\epsilon_{\mathrm{eff}}(\vec{k}, \vec{x}, t)$ is an effective energy, which will be defined in more precise terms in Eq. (37). The nature of the reciprocal magnetic field will be clarified in Section 2. One can observe in Eq. (4) that $\vec{\mathcal{B}}(\vec{k})$ acts quite similarly to the Lorentz force in real space. $\vec{\mathcal{B}}(\vec{k})$ encodes information on the topological nature of the band structure, in particular, that of band crossings [7]. Indeed, a degeneracy point corresponds to a monopole of $\vec{\mathcal{B}}(\vec{k})$ [5], which has played a crucial role in the understanding of the anomalous Hall effect (AHE) [6,23–25].

1 A Bloch function is an eigenstate of a periodic Hamiltonian such as Eq. (9), whose energy spectrum forms a band structure $\epsilon^{(0)}_n(\vec{k})$ defined as in Eq. (10).
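
The anomalous velocity of Case (2) can be illustrated by integrating the EOM numerically. The Python sketch below takes Eq. (4) in two dimensions with a uniform electric field along x, no real-space magnetic field, and a reciprocal field pointing along z. Everything specific in it is an assumption made for illustration: the force law $d\vec{k}/dt = -e\vec{E}$, the parabolic band, the massive-Dirac-like form of the reciprocal field, the forward-Euler stepping, and all names and parameter values.

```python
def berry_curvature(kx, ky, gap):
    """z component of a model reciprocal field (assumed, not from the text)."""
    return 0.5 * gap / (kx * kx + ky * ky + gap * gap) ** 1.5


def integrate_wave_packet(e_field=0.1, charge=1.0, mass=1.0, gap=0.5,
                          steps=1000, dt=0.01, with_curvature=True):
    """Forward-Euler integration of dk/dt = -e E and of Eq. (4) for dx/dt."""
    kx = ky = x = y = 0.0
    for _ in range(steps):
        kdot_x = -charge * e_field          # field along x only
        kdot_y = 0.0
        omega = berry_curvature(kx, ky, gap) if with_curvature else 0.0
        # group velocity of a parabolic band plus (dk/dt) x B with B = omega z
        vx = kx / mass + kdot_y * omega
        vy = ky / mass - kdot_x * omega
        x += vx * dt
        y += vy * dt
        kx += kdot_x * dt
        ky += kdot_y * dt
    return x, y


if __name__ == "__main__":
    print("with reciprocal field:", integrate_wave_packet())
    print("without:", integrate_wave_packet(with_curvature=False))
```

With the reciprocal field switched on, the wave packet drifts transversely to the applied field, which is the Hall-type response discussed in the text; without it the motion stays along the field direction.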
In this paper we study the wave-packet dynamics of Bloch electrons subject to a perturbation $\beta(\vec{x}, t)$ varying slowly in space and time. Even though our treatment of $\beta(\vec{x}, t)$ is in completely general terms, we can give some concrete examples of $\beta(\vec{x}, t)$ as in Ref. [7],

$$H\big(\vec{p}, \vec{x}; \beta(\vec{x}, t)\big) = H_0\big(\vec{p} + \vec{\beta}_1(\vec{x}, t),\, \vec{x} + \vec{\beta}_2(\vec{x}, t)\big) + \beta_3(\vec{x}, t). \tag{5}$$

$H(\vec{p}, \vec{x}; \beta = 0)$ is an unperturbed Hamiltonian. The first two categories, $\vec{\beta}_1(\vec{x}, t)$ and $\vec{\beta}_2(\vec{x}, t)$, are in a vectorial form, whereas $\beta_3(\vec{x}, t)$ is a scalar. In the case of electromagnetic perturbations, $\vec{\beta}_2(\vec{x}, t) = 0$. A finite $\vec{\beta}_2(\vec{x}, t)$ could be relevant, e.g., for the study of deformational perturbations in a crystal [7].
Following the quantum mechanical motion of a wave packet localized around $(\vec{k}, \vec{x})$, we study its EOM focusing on the topological nature of the band structure, and interpret them in terms of the reciprocal vector potential $\mathcal{A}_q$ defined in the (2D + 1)-dimensional parameter space $\{q\} = (\vec{k}, \vec{x}, t)$. This set of parameters {q} plays in our case the role of adiabatic parameters in the original formulation of the Berry phase [16]. Our approach is entirely based on the fundamental relations of Schrödinger quantum mechanics, and makes no reference to (i) the time-dependent variational principle [7], or (ii) the path-integral method using a Wannier basis [30]. Although our approach is conceptually much simpler than those mentioned above, this type of analysis can be found, to our knowledge, only in the classical literature [19,20]. We have in mind a linear response theory with the help of the Boltzmann equation. We therefore restrict our analysis to the first order in the external perturbation $\beta(\vec{x}, t)$. We emphasize here that all our analyses are asymptotically exact in the framework of linear response theory.
This paper is organized as follows: in Section 2, we first discuss the nature of the non-Abelian gauge field appearing in our EOM, which will be derived later in Section 4. In Section 3, we state and formulate our problem unambiguously, and list all the assumptions we will make. The EOM is derived in Section 4, and its possible application to Berry phase engineering is discussed in Section 5, before coming to the conclusions in Section 6. Some technical details are left for Appendices A–D.
2. Origin of the gauge field

The nature of the reciprocal magnetic field $\vec{\mathcal{B}}(\vec{k})$ that appeared in Eq. (4) lies, as will be further discussed in Section 4, in the noncommutativity of the center of mass coordinates $(\vec{k}, \vec{x})$ [10]. In more mathematical terms, $\vec{\mathcal{B}}(\vec{k})$ is a curvature associated with a geometric Berry connection, i.e., a gauge field. The relation between such noncommutative coordinates as seen in Eqs. (24), (27) and the magnetic monopole (MM) in momentum space has been of much theoretical interest [3,4,10,31]. From a more general point of view, physics in noncommutative spacetime coordinates has been of great theoretical interest, mainly in the high-energy physics community, in particular in the context of string and M theories [32,33]. In the following we consider, instead, the physical origins from which our gauge fields stem, and the mechanism by which they are generated, focusing on the case of Bloch electrons under a slowly varying perturbation $\beta(\vec{x}, t)$.
2.1. Non-Abelian gauge field, or Berry phase, encoding information on the band structure
Let us consider the motion of a wave packet composed of a limited number, say N, of bands that are degenerate over the whole Brillouin zone. When neither the time reversal symmetry nor the spatial inversion symmetry is broken, there always appears a two-fold degeneracy at every k-point (Kramers doublet). If there is no further degeneracy in the system, then N = 2 in our language. In the following sections, we will derive, using only the most fundamental relations of Schrödinger quantum mechanics, effective equations of motion (EOM) for this wave packet. These EOM are most conveniently interpreted in terms of non-Abelian gauge fields in the reciprocal space. In deriving these effective equations of motion, we restrict our available Hilbert space to these degenerate bands. In the course of this projection onto the N bands, all the relevant information about the bands integrated away is encoded in the form of a gauge field, and appears in the EOM for the wave packet as a Berry phase.
In order to illustrate this point, let us investigate how those gauge fields are expressed explicitly in terms of Bloch functions. We will see in later sections that the concept of Bloch bands can be extended to perturbations varying incommensurately with the lattice structure. As a result, Bloch electrons become subject to a (non-Abelian) gauge field in a (2D + 1)-dimensional parameter space $(\bar{k}, \bar{x}, t)$, which we will call below the reciprocal space, from the viewpoint that it is a generalization of the space spanned by $\bar{k}$, the mean crystal momentum of the wave packet. The reciprocal vector potential takes the form of an N × N matrix, whose elements are given by
$$ (A_q)_{mn} = i \left\langle u_m(k, \bar{x}, t) \,\middle|\, \frac{\partial u_n(k, \bar{x}, t)}{\partial q} \right\rangle , \tag{6} $$
where q should be understood as a general coordinate $q = k_\mu, \bar{x}_\mu, t$ and $\mu, \nu = 1, \ldots, D$. $\bar{k}$ and $\bar{x}$ are center-of-mass coordinates defined in more precise terms in Eqs. (16) and (15), respectively. $|u_n(k, \bar{x}, t)\rangle = \exp(-ikx)\,|\psi_n(k, \bar{x}, t)\rangle$ is the periodic part of a local Bloch state (see footnote 2), $\langle x + a|u_n(k, \bar{x})\rangle = \langle x|u_n(k, \bar{x})\rangle$. Inner products involving the periodic part $|u_n(k, \bar{x}, t)\rangle$ mean an integration over the unit cell, with the normalization $\langle u_n(k, \bar{x}, t)|u_n(k, \bar{x}, t)\rangle = 1$. In the Abelian case N = 1, this vector potential is indeed related to the reciprocal magnetic field $B(k)$ introduced in Eq. (4) as
$$ B(k) = \nabla_k \times A_k . $$
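In the non-Abelian case, the gauge invariant reciprocal field strength should be defined as
$$ F_{q_1 q_2} = \partial_{q_1} A_{q_2} - \partial_{q_2} A_{q_1} - i[A_{q_1}, A_{q_2}], \tag{7} $$
where $q_1, q_2 = k_\mu, \bar{x}_\mu, t$. Using the trivial relation $\langle \partial u_m/\partial q \,|\, u_n\rangle + \langle u_m \,|\, \partial u_n/\partial q\rangle = 0$, the last term of Eq. (7) can be rewritten as
$$ i \sum_{l=1}^{N} \left( \left\langle \frac{\partial u_m}{\partial q_2} \middle| u_l \right\rangle \left\langle u_l \middle| \frac{\partial u_n}{\partial q_1} \right\rangle - \left\langle \frac{\partial u_m}{\partial q_1} \middle| u_l \right\rangle \left\langle u_l \middle| \frac{\partial u_n}{\partial q_2} \right\rangle \right), $$
whereas
$$ (\partial_{q_1} A_{q_2} - \partial_{q_2} A_{q_1})_{mn} = i \left( \left\langle \frac{\partial u_m}{\partial q_1} \middle| \frac{\partial u_n}{\partial q_2} \right\rangle - \left\langle \frac{\partial u_m}{\partial q_2} \middle| \frac{\partial u_n}{\partial q_1} \right\rangle \right). \tag{8} $$
Comparing those two equations, one can immediately see that if $\sum_{l=1}^{N} |u_l\rangle\langle u_l|$ were 1, i.e., if $\{|u_l\rangle;\ l = 1, \ldots, N\}$ spanned a complete basis, then $F_{q_1 q_2}$ would vanish identically. This indicates that the nature of our gauge field lies indeed in the projection of the available Hilbert space onto the relevant N bands. If the $|u_l\rangle$ spanned a complete basis, and no band were projected away, there would be no information to be encoded in the gauge fields. Note also that Eq. (8) takes the familiar form of the Berry curvature in the study of magnetic Bloch bands [6,21].
2 The concept of local Bloch function will be briefly introduced in Section 2.2 before being formulated in more precise terms in Section 3.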
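The role of the projection can also be checked numerically. The following is a minimal sketch and not part of the original derivation: it assumes a hypothetical two-band lattice model (the function `hamiltonian`, its mass parameter `m`, and the sample k-point are illustrative choices), and it evaluates the band-traced k-k field strength in the gauge-invariant projector form $\mathrm{Tr}\,F_{k_x k_y} = i\,\mathrm{Tr}(P[\partial_{k_x} P, \partial_{k_y} P])$ with $P = \sum_l |u_l\rangle\langle u_l|$, using central finite differences.

```python
import numpy as np

# Pauli matrices for a hypothetical two-band model (an illustrative assumption).
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)


def hamiltonian(kx, ky, m=1.0):
    """Toy Bloch Hamiltonian h(k) = d(k) . sigma on a square lattice."""
    return (np.sin(kx) * SX + np.sin(ky) * SY
            + (m + np.cos(kx) + np.cos(ky)) * SZ)


def projector(kx, ky, bands, m=1.0):
    """Projector P = sum_l |u_l><u_l| onto the selected bands."""
    _, vecs = np.linalg.eigh(hamiltonian(kx, ky, m))
    u = vecs[:, bands]
    return u @ u.conj().T


def traced_curvature(kx, ky, bands, h=1e-4, m=1.0):
    """Band-traced curvature Tr F_{kx ky} = i Tr(P [dP/dkx, dP/dky])."""
    p = projector(kx, ky, bands, m)
    dpx = (projector(kx + h, ky, bands, m)
           - projector(kx - h, ky, bands, m)) / (2 * h)
    dpy = (projector(kx, ky + h, bands, m)
           - projector(kx, ky - h, bands, m)) / (2 * h)
    return float((1j * np.trace(p @ (dpx @ dpy - dpy @ dpx))).real)


if __name__ == "__main__":
    k = (0.3, 0.7)
    print("lower band only :", traced_curvature(*k, bands=[0]))
    print("upper band only :", traced_curvature(*k, bands=[1]))
    print("both bands (P=1):", traced_curvature(*k, bands=[0, 1]))
```

Projecting onto a single band gives a finite curvature, the two single-band values cancel each other, and keeping both bands (a complete basis in this toy model) gives zero up to rounding, in line with the argument above. The projector form is used because it does not depend on the arbitrary phases returned by the eigensolver.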
2.2. Gauge field of two different origins
The gauge field introduced in Eq. (6) has two different physical origins:
(1) Projection onto a subspace spanned by N Bloch bands;
(2) Bloch basis moving in time.
The first point has been already discussed in Section 2.1, whereas the second point may need some explanation. In the following sections, we will study the wave-packet dynamics in the phase space in the presence of a space- and time-dependent external perturbation, which varies incommensurately with the lattice structure. In order to define a crystal momentum in such a situation, we replace the spatial coordinate x in the perturbation $\beta(x, t)$, introduced as in Eq. (5), by the center-of-mass coordinate $\bar{x}$ of the wave packet under consideration. This recovers the original lattice periodicity of the Hamiltonian, leading us to the concept of local Hamiltonian (Eq. (11)) and its local Bloch eigenstates (Eq. (12)). The above procedure is justified whenever the external perturbation varies sufficiently smoothly compared with the width of the wave packet. We then expand the wave packet in terms of the local Bloch eigenstates, $|\psi_n(k, \bar{x}(t), t)\rangle$, which evolve as a function of time, both explicitly (through t) and implicitly (through $\bar{x}(t)$). This is why our local Bloch function, or rather its periodic part, which appeared in Eq. (6), depends not only on k but also on $\bar{x}$ and t. Because of this nature of our local Bloch basis, the nontrivial gauge field structure introduced in the previous section emerges. To be precise, we should distinguish between two different types of gauge field (strength) appearing in Eqs. (6), (7): (i) $F_{k_\mu k_\nu}$, (ii) $F_{k_\mu \bar{x}_\nu}$ and $F_{k_\mu t}$. Although the reciprocal field strength introduced in Eq. (7) has various components, i.e., not only (a) $F_{k_\mu k_\nu}$, $F_{k_\mu \bar{x}_\nu}$ and $F_{k_\mu t}$ ((a) = (i) + (ii)), but also (b) $F_{\bar{x}_\mu \bar{x}_\nu}$ and $F_{\bar{x}_\mu t}$, the latter components (b) do not appear in our EOM for the wave packet, showing a clear contrast with Ref. [7]. However, we will be working in the framework of a linear response theory, and within that framework our EOM turn out to be consistent, when N = 1, with those of Ref. [7]. This point will be further clarified in Section 4.4 by performing a simple power counting analysis.
We will see in detail in Section 4 that the two types of gauge field
(1) $F_{k_\mu k_\nu}$,
(2) $F_{k_\mu \bar{x}_\nu}$ and $F_{k_\mu t}$,
have slightly different origins, as well as different physical consequences, which we will discuss in Section 5. The former, $F_{k_\mu k_\nu}$, is indeed related to the projection of the available Hilbert space onto the relevant N degenerate bands. It yields a finite anomalous velocity and plays a central role in the understanding of the AHE. It appears in the presence of magnetic Bloch bands and ferromagnetic backgrounds [6,23-25]. On the other hand, the latter, $F_{k_\mu \bar{x}_\nu}$ and $F_{k_\mu t}$, appear only in the presence of the time-dependent Bloch basis mentioned above.
3. Statement of the problem

Before discussing the EOM in the following section, let us define and formulate our
problem here as well as listing all the assumptions we will make. We stress here that all the
approximations which we will make are stated here, and that the derivation of the EOM in
the following section is indeed exact under the assumptions made in this section.
Let us consider the motion of a wave packet of Bloch functions under perturbations slowly varying in space and time. This perturbation can be, e.g., an external electromagnetic field, as was the case in the study of magnetic Bloch bands [6,21]. The external perturbation $\beta(x, t)$, varying incommensurately with the crystal structure, breaks the translational symmetry of the unperturbed Hamiltonian (see footnote 3),
$$ H_0(p, x) = \frac{(p + eA_{\mathrm{uni}})^2}{2m_e} + U(x), \qquad U(x + a) = U(x), \tag{9} $$
where $A_{\mathrm{uni}}$ represents the vector potential of a homogeneous magnetic field in case it exists. The full vector potential A is thus divided into two parts as $A = A_{\mathrm{uni}} + \delta A$, where $\delta A$ is absorbed in $\beta_1(x, t)$. Eigenstates of the above Hamiltonian (9), i.e., (magnetic) Bloch bands (specified by band indices n), are characterized by crystal momenta k,
$$ H_0\, |\psi_n^{(0)}(k)\rangle = \epsilon_n^{(0)}(k)\, |\psi_n^{(0)}(k)\rangle . \tag{10} $$
3 The innocent-looking Eq. (9), more specifically the periodic potential U(x) in it, encodes all the information on the band structure and, consequently, the secrets of its nontrivial topological nature. It should be emphasized that U(x) is written symbolically, in the sense that (i) it can also be a function of momentum p due to spin-orbit interaction, (ii) it is in general also spin-dependent, i.e., it takes a 2 × 2 matrix form in spin space on top of a ferromagnetic background, (iii) the smallest unit of translational symmetry a is replaced by magnetic translation vectors in the case of magnetic Bloch bands [6]. Thus Eq. (9) should be interpreted according to the situation.
Once the perturbation $\beta(x, t)$ is switched on, this crystal momentum k is no longer a good quantum number of the system. However, the typical wavelength of the external perturbation is longer by several orders of magnitude than the lattice constants, in the physically relevant parameter regime of our interest. In that case, intermediate length scales do exist, to which our wave packet will belong, on which the external perturbation $\beta(x, t)$ can be regarded as spatially constant at the zeroth order of approximation. We are thus entitled to consider a wave packet, well localized on this length scale of the external perturbation, which also has a peak sharp enough in the space of crystal momentum, moving under perturbations slowly varying in space and time.
Let us now consider a wave packet, $|\Psi(t)\rangle$, localized in the phase space spanned by the real space coordinate x and the crystal momentum k, in the vicinity of $(\bar{k}, \bar{x})$. For simplicity, and without losing generality, we can assume that the wave packet has a symmetric and smooth shape such that it has a well-distinguished peak at $(\bar{k}(t), \bar{x}(t))$ in the phase space, where $\bar{k}(t)$ and $\bar{x}(t)$ should coincide with the expectation values of k and x at a given time t. Our present goal is to study, as accurately as possible, the quantum mechanical motion of this wave packet, and to derive the effective equations of motion for $\bar{x}$ and $\bar{k}$. As will soon become clearer, an interpretation in terms of the reciprocal gauge field (strength) uncovers the nature of various physical phenomena, such as the anomalous Hall effect (AHE) [6,23-25], the spin Hall effect [10] and quantum charge/spin pumping [14,15].
3.1. Assumption of slowly varying perturbation $\beta(x, t)$: concept of the local Hamiltonian and its local Bloch bands
We consider from now on a perturbation $\beta(x, t)$ introduced in Eq. (5). As far as the intermediate length scales discussed at the beginning of this section are concerned, $\beta(x, t)$ can be regarded as almost spatially constant over the spread of our wave packet. We therefore choose, as the starting point of our analysis, a Hamiltonian, dubbed a local Hamiltonian in Ref. [7], in which the x-dependence of $\beta(x, t)$ is replaced by $\bar{x}$, a constant at a given time:
$$ H_{\mathrm{loc}} = H\big(p, x; \beta(\bar{x}, t)\big). \tag{11} $$
This $H_{\mathrm{loc}}$ has a very remarkable property: at a given time t it has the same translational symmetry as the unperturbed Hamiltonian $H_0 = H(p, x; \beta = 0)$. In other words, $H_{\mathrm{loc}}$ can be diagonalized by a set of local Bloch eigenstates $|\psi_n(k, \bar{x}, t)\rangle$ forming a local band $\epsilon_n(k, \bar{x}, t)$, which now depends on $\bar{x}(t)$ and t:
$$ H_{\mathrm{loc}}\, |\psi_n(k, \bar{x}, t)\rangle = \epsilon_n(k, \bar{x}, t)\, |\psi_n(k, \bar{x}, t)\rangle . \tag{12} $$
We are actually considering a degenerate case where $\epsilon_n(k, \bar{x}, t)$ (n = 1, . . . , N) takes the same value, which we define to be $\epsilon_{\mathrm{loc}}(k, \bar{x}, t)$, i.e.,
$$ \epsilon_{\mathrm{loc}}(k, \bar{x}, t) \equiv \epsilon_1(k, \bar{x}, t) = \cdots = \epsilon_N(k, \bar{x}, t). \tag{13} $$
We will see below that the concept of the local Hamiltonian and its associated conduction bands plays a central role in the derivation of EOM.
3.2. Construction of a wave packet

Superposing the local Bloch functions introduced above, we now construct our wave packet. In the spirit of Boltzmann's transport theory, an exchange of energy between the electron and the environment occurs only through scattering events. In the following, we will investigate the adiabatic motion of this wave packet. This picture should be valid over the typical length scale of an adiabatic flight between two scattering events, i.e., over the mean free path of an electron. Let us now proceed step by step, making each logical step as clear as possible.
(1) Let us first focus on the real space, in which the electron wave packet is localized around $\bar{x}$. Then we can compose a wave packet (see footnote 4) out of local Bloch functions associated with the local Hamiltonian at $\bar{x}$:
$$ |\Psi(t)\rangle = \sum_{n=1}^{N} \int dk\, a_n(k, t)\, |\psi_n(k, \bar{x}, t)\rangle . \tag{14} $$
$a_n(k, t)$ should be normalized properly. The $\bar{x}$-dependence of $|\Psi(t)\rangle$ is implicit on the left-hand side of Eq. (14); it is due to the time-dependent Bloch basis $|\psi_n(k, \bar{x}, t)\rangle$.
(2) For self-consistency, we require that our wave packet (14) gives the correct expectation value of x, i.e., $\bar{x}(t) = (\bar{x}_1(t), \ldots, \bar{x}_D(t))$:
$$ \bar{x}_\mu(t) = \langle \Psi(t)|\, x_\mu\, |\Psi(t)\rangle . \tag{15} $$
This guarantees that our wave packet indeed yields the center-of-mass position preassigned in Eq. (11), and that our program makes a self-consistent closed loop.
Our wave packet (14) can also be regarded as a functional of $a_n(k, t)$, i.e., $|\Psi(t)\rangle = |\Psi(\{a_n(k, t)\})\rangle$, in which the coefficients $a_n(k, t)$ are chosen so that the self-consistency condition (15) is satisfied. Eq. (15) is, however, nothing but a weak constraint compared with the huge number of degrees of freedom allowed for $a_n(k, t)$. In order to specify the coefficients $a_n(k, t)$ with further precision, we now turn our eyes to k-space. As discussed at the beginning of this section, we can consider, on the length scale of our interest, a wave packet which is localized both in x and in k. We therefore require, in addition to Eq. (15), that our wave packet should also give the correct expectation value of $k_\mu$:
$$ \bar{k}_\mu(t) = \langle \Psi(t)|\, k_\mu\, |\Psi(t)\rangle . \tag{16} $$
The $\bar{k}$ dependence of $|\Psi(t)\rangle$ is thus encoded in $a_n(k, t)$ (see footnote 5).
4 As has been discussed at the beginning of this section, the expansion (14) is justified as long as the spread of the wave packet in real space is sufficiently small compared with the typical length scale over which the external perturbation can be regarded as almost constant.
5 So far, our treatment of x and k has not been symmetric. This is entirely due to the fact that our perturbation $\beta(x, t)$ does not depend on k. We could consider, in principle and without much difficulty, a perturbation that depends on k, and perform a symmetric treatment of x and k. However, in this paper we restrict ourselves, for clarity, to the former case.
In the following section, we will derive the EOM for $\bar{x}(t)$ and $\bar{k}(t)$, following the quantum mechanical motion of the wave packet we have just prepared. In order to make the set of EOM self-contained, however, we also need to take care of the motion of the internal pseudospin degrees of freedom spanned by the N bands. For that purpose it will turn out to be convenient to separate $a_n(k, t)$ into its phase or pseudospin part, $z^t(k, t) = (z_1(k, t), \ldots, z_N(k, t))$, and its amplitude part, $\rho(k, t)$, by introducing
$$ a_n(k, t) = \sqrt{\rho(k, t)}\; z_n(k, t), \qquad \rho(k, t) = \sum_{n=1}^{N} |a_n(k, t)|^2 . $$
$|z_n|^2$ clearly represents the probability that the electron wave packet sits on the nth band among the N-fold degenerate bands. It thereby corresponds to the internal degrees of freedom associated with the wave packet, such as spin and/or orbital, while $\rho(k, t)$ is the momentum distribution function of the wave packet. Since we assumed that the wave packet is well localized not only in real space but also in reciprocal space, we can assume without any loss of generality the following reduction formula,
$$ \int dk\, f(k, t)\, \rho(k, t) = f\big(\bar{k}(t), t\big), \tag{17} $$
for any sufficiently smooth function f(k, t). This prescription will be used frequently at the final stage of the derivation of EOM.
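The accuracy of the reduction formula (17) for a distribution of finite width can be illustrated with a short numerical check. This is a sketch under stated assumptions rather than part of the original argument: the momentum distribution is taken to be a one-dimensional Gaussian of width `sigma`, the smooth test function is f(k) = cos k, and the function name `reduction_error` and the value of `k_bar` are hypothetical.

```python
import numpy as np


def reduction_error(sigma, k_bar=0.4, n=20001):
    """Return |integral dk f(k) rho(k) - f(k_bar)| for a Gaussian rho.

    rho is a normalized Gaussian of width sigma centred at k_bar, and
    f(k) = cos(k) plays the role of a smooth function in Eq. (17).
    """
    k = np.linspace(k_bar - 10.0 * sigma, k_bar + 10.0 * sigma, n)
    dk = k[1] - k[0]
    rho = np.exp(-(k - k_bar) ** 2 / (2.0 * sigma ** 2))
    rho /= np.sum(rho) * dk              # normalize: integral of rho = 1
    integral = np.sum(np.cos(k) * rho) * dk
    return abs(integral - np.cos(k_bar))


if __name__ == "__main__":
    for sigma in (0.1, 0.01, 0.001):
        print(f"sigma = {sigma:6.3f}   error = {reduction_error(sigma):.3e}")
```

For this choice the integral is known in closed form, $\cos(\bar{k})\, e^{-\sigma^2/2}$, so the deviation from $f(\bar{k})$ shrinks quadratically with the width of the wave packet in k-space, which is the sense in which Eq. (17) holds for a sharply peaked distribution.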
3.3. First order perturbation theory with respect to $\beta(x, t)$: a linear response theory
The wave packet introduced above should obey the Schrödinger equation
$$ i\, \frac{\partial}{\partial t}\, |\Psi(t)\rangle = H\, |\Psi(t)\rangle . \tag{18} $$
As we have briefly seen in the introduction, our eventual objective is to apply the EOM to the framework of the Boltzmann transport theory, using formulas such as Eq. (3), in order to describe phenomena including the anomalous Hall effect, the spin Hall effect and quantum pumping. For that purpose it is enough to consider a linear response of the system, keeping only the terms up to first order in $\partial\beta(x, t)/\partial x$ and $\partial\beta(x, t)/\partial t$.
In the case of electromagnetic fields, the perturbation $\beta(x, t)$ is embodied by a vector potential A(x, t) and a scalar potential $A_0(x)$; the full Hamiltonian reads $H_0(x, p + eA(x, t)) - eA_0(x)$. Thereby, a linear response to the applied electromagnetic fields, $E = -\partial A_0/\partial x - \partial A/\partial t$, $B = \nabla \times A$, corresponds to first-order perturbation theory w.r.t. $\beta(x, t)$.
We expand the Hamiltonian in powers of $x - \bar{x}$ as
$$ H = H_{\mathrm{loc}} + \frac{1}{2} \sum_{\mu=1}^{D} \left[ (x_\mu - \bar{x}_\mu)\, \frac{\partial H_{\mathrm{loc}}}{\partial \bar{x}_\mu} + \frac{\partial H_{\mathrm{loc}}}{\partial \bar{x}_\mu}\, (x_\mu - \bar{x}_\mu) \right]. \tag{19} $$
The first-order term on the r.h.s. is written in a symmetrized way in order to keep the Hamiltonian Hermitian. In the following, based on Eq. (19), we develop a systematic perturbation theory w.r.t. $\beta(x, t)$. In this paper, we focus on the linear response of the system, keeping only the terms up to first order in the expansion. Our treatment is, therefore, self-consistent in the framework of linear response theory.
4. Equations of motion
In this section, we sketch the derivation of the EOM, paying particular attention to how the two different types of reciprocal field strength introduced in Section 2 appear in the EOM. Before going into the details of the derivation, let us recall that there are two possible sources of Berry curvature in the reciprocal parameter space:
(1) projection of the available Hilbert space onto the N degenerate Bloch bands;
(2) local Bloch basis changing gradually in the course of time.
The time dependence of the local Bloch basis stems not only from the explicit t dependence of the local Hamiltonian $H_{\mathrm{loc}}$, but also from our self-consistent treatment of the problem, where the local Hamiltonian depends on the center-of-mass position of the electron wave packet through the external perturbation $\beta(\bar{x}(t), t)$.
In many respects, our point of view is reminiscent of the standard Ehrenfest's theorem of quantum mechanics: the expectation value of an operator, such as x or p, obeys a classical EOM. We actually follow the same type of procedure as in the derivation of Ehrenfest's theorem, and in this sense our EOM can be regarded as a generalized Ehrenfest's theorem for Bloch electrons under perturbations varying slowly in space and time. We will come back to this point later.
4.1. Preliminaries
We investigate, in this section, the time evolution of the wave packet constructed in Eq. (14). We are interested not only in its motion in the phase space, but also in the motion of its internal spin/orbital degrees of freedom. In Section 5 we will develop further analyses, from the viewpoint of transport phenomena, of the dynamics associated with such internal degrees of freedom. Having in mind applications of our formalism to those fields, we formulate our equations as generally as possible. More concretely, we consider the time evolution of an arbitrary observable O, or rather of its expectation value,
$$ \bar{O}(t) = \langle \Psi(t)|\, O\, |\Psi(t)\rangle . $$
Since we have adopted, for the sake of simplicity, the Schrödinger picture, as seen in Eq. (18), the wave functions evolve in time, while observables are time-independent. We develop later more detailed analyses of the EOM focusing on the case where $O = x_\mu$ or $k_\mu$, but we consider a general observable O as far as possible in formulating our equations. This will make it easier to apply our formalism to further studies of the dynamics associated with the internal degrees of freedom of Bloch electrons.
Having those in mind, let us consider the expectation value of an arbitrary observable O,
$$ \bar{O}(t) = \sum_{m,n=1}^{N} \int dk' \int dk\; a_m^*(k', t)\, O_{mn}(k', k)\, a_n(k, t), \tag{20} $$
where we have introduced an abbreviated notation for the matrix elements of an operator O evaluated in the restricted subspace spanned by the N Bloch bands, i.e.,
$$ O_{mn}(k', k) = \langle \psi_m(k', \bar{x}, t)|\, O\, |\psi_n(k, \bar{x}, t)\rangle . \tag{21} $$
Note that in this restricted Hilbert space not only $k_\mu$, $x_\mu$ or H but also $\partial/\partial t$ are considered to be operators O. $O_{mn}(k', k)$ is generally an N × N matrix for given (k', k), whereas for a given (m, n) it has, in general, off-diagonal matrix elements, and can also be regarded as a matrix in k-space. The presence of finite off-diagonal matrix elements of $O_{mn}(k', k)$, either in k-space or in the pseudospin space, prevents some observables from commuting with each other, and thereby induces Berry curvature in our final EOM.
Let us first consider two concrete examples:
(1) Case of $O = x_\mu$: the matrix elements of the observable $x_\mu$ are
$$ (x_\mu)_{mn}(k', k) = i\,\delta(k' - k)\,\delta_{mn}\, \frac{\partial}{\partial k_\mu} + \delta(k' - k)\,(A_{k_\mu})_{mn} . \tag{22} $$
The first term is off-diagonal in k-space, when k is discrete, due to the k-derivative, but is diagonal w.r.t. the band index. In Eq. (22), we kept both the k' and k indices in order to emphasize the fact that this first term is off-diagonal. In the following, we will quite frequently omit the k'-index, pretending that $(x_\mu)_{mn}(k', k)$ is diagonal in k-space after $\delta(k' - k)$ is integrated away. On the contrary, the second term is diagonal in k-space, but the reciprocal vector potential $A_{k_\mu}$, defined similarly to Eq. (6) as
$$ (A_{k_\mu})_{mn} = i \left\langle u_m(k, \bar{x}, t) \,\middle|\, \frac{\partial u_n(k, \bar{x}, t)}{\partial k_\mu} \right\rangle , $$
has off-diagonal matrix elements between different bands. In the above equations we did not write down the explicit t-dependence of $\bar{x}(t)$ in the brackets.
(2) Case of $O = k_\mu$: this case is even simpler. The crystal momentum $k_\mu$ is diagonal both in k and in pseudospin indices,
$$ (k_\mu)_{mn}(k', k) = \delta(k' - k)\,\delta_{mn}\, k_\mu . \tag{23} $$
Let us further investigate the off-diagonal components of $(x_\mu)_{mn}(k', k)$. We focus here on its commutation relation in k-space. Since the first term on the r.h.s. of Eq. (22) is off-diagonal and the second term is not proportional to an identity matrix, these two terms do not commute with each other. One can indeed verify
$$ [x_\mu, x_\nu]_{mn}(k) = i\,(F_{k_\mu k_\nu})_{mn}, \tag{24} $$
where [A, B] = AB − BA is the standard commutator of two N × N matrices A, B. Thus the noncommutativity of $(x_\mu)_{mn}(k)$ turns out to be the origin of the emergence of the Berry curvature $F_{k_\mu k_\nu}$. Another important remark on Eq. (22) is that it leads us naturally to introduce the concept of a covariant derivative in momentum space [10], defined as
$$ (\nabla_{k_\mu})_{mn} = \delta_{mn}\, \frac{\partial}{\partial k_\mu} - i\,(A_{k_\mu})_{mn} . \tag{25} $$
In terms of the covariant derivative $(\nabla_{k_\mu})_{mn}$ thus introduced, the matrix elements $(x_\mu)_{mn}(k)$ can be rewritten simply as
$$ (x_\mu)_{mn}(k) = i\,(\nabla_{k_\mu})_{mn} . \tag{26} $$
The commutator between two covariant derivatives along different axes is directly related to a non-Abelian Berry curvature,
$$ [\nabla_{k_\mu}, \nabla_{k_\nu}]_{mn} = -i\,(F_{k_\mu k_\nu})_{mn} . \tag{27} $$
In geometric terms, Eq. (27) can be interpreted in such a way that two parallel transports along different axes on a curved surface generally do not commute with each other.
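The step from the covariant derivative to the two commutators can be made explicit. The following is a short worked check, assuming that the covariant derivative reads $\nabla_{k_\mu} = \partial_{k_\mu} - iA_{k_\mu}$ as a matrix in the band indices and that both sides act on an arbitrary N-component test function (the test function is an auxiliary device introduced here, not part of the original text):

```latex
\begin{align*}
[\nabla_{k_\mu}, \nabla_{k_\nu}]
 &= [\partial_{k_\mu} - iA_{k_\mu},\; \partial_{k_\nu} - iA_{k_\nu}] \\
 &= -i\,[\partial_{k_\mu}, A_{k_\nu}] - i\,[A_{k_\mu}, \partial_{k_\nu}] - [A_{k_\mu}, A_{k_\nu}] \\
 &= -i\left( \partial_{k_\mu} A_{k_\nu} - \partial_{k_\nu} A_{k_\mu}
      - i\,[A_{k_\mu}, A_{k_\nu}] \right), \\[4pt]
[x_\mu, x_\nu]
 &= [\,i\nabla_{k_\mu},\; i\nabla_{k_\nu}]
  = -[\nabla_{k_\mu}, \nabla_{k_\nu}] \\
 &= i\left( \partial_{k_\mu} A_{k_\nu} - \partial_{k_\nu} A_{k_\mu}
      - i\,[A_{k_\mu}, A_{k_\nu}] \right).
\end{align*}
```

The same bracket, the k-k component of the reciprocal field strength, appears in both lines, so the noncommutativity of the coordinates and the curvature of the covariant derivative carry identical information. In the Abelian case N = 1 the commutator term drops out and only the curl-like part survives.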
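4.2. Derivation of the EOM
Let us now consider the time derivative of the expectation value $\bar{O}(t)$. Expanding $|\Psi(t)\rangle$ in terms of the local Bloch functions as in Eq. (14), one can classify the time derivative of $\bar{O}(t)$ into three parts:
$$ \frac{d\bar{O}(t)}{dt} = \sum_{\alpha,\alpha'} \int dk\, dk'\; \frac{\partial a_\alpha^*(k, t)}{\partial t}\, O_{\alpha\alpha'}(k, k')\, a_{\alpha'}(k', t) + \sum_{\alpha,\alpha'} \int dk'\, dk''\; a_\alpha^*(k', t)\, \frac{\partial O_{\alpha\alpha'}(k', k'')}{\partial t}\, a_{\alpha'}(k'', t) + \sum_{\alpha,\alpha'} \int dk''\, dk\; a_\alpha^*(k'', t)\, O_{\alpha\alpha'}(k'', k)\, \frac{\partial a_{\alpha'}(k, t)}{\partial t} . \tag{28} $$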
We have in mind that the operator O is either $x_\mu$, $k_\mu$ or some other observable. In the case of the standard Ehrenfest's theorem,
$$ \frac{d}{dt} \langle \Psi|\, O\, |\Psi\rangle = i\, \langle \Psi|\, [H, O]\, |\Psi\rangle , $$
the second term of Eq. (28) does not exist, since the matrix element $O_{mn}(k', k)$ is time-dependent only when the local Bloch basis evolves in time. The first and the third terms, i.e., the change of the expansion coefficients $a_\alpha(k, t)$, yield a commutator [H, O]. They contain, however, also a Berry connection contribution which, together with the second term, produces a new type of contribution, which we will call $O^{(2)}$ in Eq. (32). The first term of Eq. (32), $O^{(1)}$, is a generalization of the standard Ehrenfest commutator [H, O], which induces, when $O = x_\mu$, $F_{k_\mu k_\nu}$ in the EOM. On the other hand, the second term, $x_\mu^{(2)}$, can be rewritten in terms of $F_{k_\mu \bar{x}_\nu}$ and $F_{k_\mu t}$.
In order to rewrite Eq. (28) in terms of $O^{(1)}$ and $O^{(2)}$, let us first look into the following relation,
$$ \frac{\partial a_n^*(k, t)}{\partial t} = \sum_{m=1}^{N} a_m^*(k, t) \left[ \left( \frac{\partial}{\partial t} \right)_{mn} + i\, H_{mn}(k) \right]. \tag{29} $$
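The first term is a Berry connection contribution. As is clear when it is written more precisely as
$$ \left( \frac{\partial}{\partial t} \right)_{mn} = \left\langle u_m \,\middle|\, \frac{\partial u_n}{\partial t} \right\rangle = \left\langle u_m\big(k, \bar{x}(t), t\big) \right| \frac{\partial}{\partial t} \left| u_n\big(k, \bar{x}(t), t\big) \right\rangle , \tag{30} $$
it emerges as a result of the time evolution of the local Bloch basis. On the other hand, the second term of Eq. (29) yields the commutator $[H, O]_{mn}$ in Eq. (32). Note also that the derivative $\partial/\partial t$ in Eq. (29) picks up both the explicit and the implicit t-dependence. Correspondingly, one can also rewrite the first term of Eq. (29) using the two types of gauge field introduced in Section 2, i.e.,
$$ \left( \frac{\partial}{\partial t} \right)_{mn} = -i \left[ (A_{\bar{x}_\mu})_{mn}\, \frac{d\bar{x}_\mu}{dt} + (A_t)_{mn} \right]. \tag{31} $$
Our next objective is to calculate the time derivative of an operator such as $\bar{k}$, $\bar{x}$ and to express it in such a way that its interpretation in terms of the reciprocal field strength becomes as easy as possible. For that purpose, we rearrange the terms in Eq. (28) into two parts, $O^{(1)}$ and $O^{(2)}$, as (see footnote 6)
$$ \frac{d}{dt} \bar{O}(t) = O^{(1)} + O^{(2)}, $$
$$ O^{(1)} = i \sum_{m,n=1}^{N} \int dk_1\, dk_2\; a_m^*(k_1)\, [H, O]_{mn}(k_1, k_2)\, a_n(k_2), $$
$$ O^{(2)} = \sum_{m,n=1}^{N} \int dk_1\, dk_2\; a_m^*(k_1) \left\{ \left[ \frac{\partial}{\partial t}, O \right]_{mn}(k_1, k_2) + \frac{\partial O_{mn}(k_1, k_2)}{\partial t} \right\} a_n(k_2). \tag{32} $$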
In Eq. (32) we did not write down explicitly, for the sake of simplicity, the dependence on $\bar{x}$ and t in $|\psi_n(k)\rangle = |\psi_n(k, \bar{x}(t), t)\rangle$.
As announced in advance, $O^{(1)}$ is a generalization (or, rather, a restricted version) of the standard Ehrenfest commutator, coming exclusively from the first and third terms of Eq. (28), whereas $O^{(2)}$ is a new type of contribution, which is a collection of Berry curvature terms from all three parts of Eq. (28). Not only do they have different origins, but they also admit different physical interpretations in terms of the reciprocal field strength. We will see in Section 4.3 that in the particular case of $O = x_\mu$, the two contributions $x_\mu^{(1)}$ and $x_\mu^{(2)}$ are actually related to the two different parts of the gauge field introduced in Section 2, i.e., (i) $F_{k_\mu k_\nu}$, and (ii) $F_{k_\mu \bar{x}_\nu}$, $F_{k_\mu t}$.
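The rearrangement of Eq. (28) into Eq. (32) can be traced in a few lines. The sketch below suppresses the k-integrations and treats $a$ as a column vector and $O$, $H$ and $(\partial/\partial t)$ as matrices in the combined band and k indices; this compact notation is an assumption made here for brevity. The relation for $\partial a/\partial t$ follows from projecting the Schrödinger equation (18) onto the local Bloch basis, and its Hermitian conjugate uses the fact that $(\partial/\partial t)_{mn} = \langle u_m | \partial u_n/\partial t \rangle$ is anti-Hermitian while $H$ is Hermitian.

```latex
\begin{align*}
\frac{\partial a}{\partial t}
 &= -\left[ \left(\frac{\partial}{\partial t}\right) + iH \right] a ,
 \qquad
 \frac{\partial a^{\dagger}}{\partial t}
  = a^{\dagger} \left[ \left(\frac{\partial}{\partial t}\right) + iH \right], \\[4pt]
\frac{d\bar{O}}{dt}
 &= \frac{\partial a^{\dagger}}{\partial t}\, O\, a
  + a^{\dagger}\, \frac{\partial O}{\partial t}\, a
  + a^{\dagger}\, O\, \frac{\partial a}{\partial t} \\
 &= a^{\dagger} \left\{ i\,[H, O]
  + \left[ \left(\frac{\partial}{\partial t}\right), O \right]
  + \frac{\partial O}{\partial t} \right\} a .
\end{align*}
```

The first term in braces is $O^{(1)}$, and the remaining two terms together make up $O^{(2)}$; the latter vanishes identically when the Bloch basis does not depend on time.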
6 In order to obtain Eq. (32), one has only to substitute Eq. (29) literally into the expression for $d\bar{O}/dt$ in Eq. (28), and rename the dummy variables in the following way: $k \to k_1$, $k' \to k_2$, $k'' \to k_3$, $\alpha \to m$, $\alpha' \to n$, $\alpha'' \to l$.
Having in mind what has been stated above, we can now derive the EOM for $\bar{x}_\mu(t)$ and $\bar{k}_\mu(t)$. Let us first consider the case of $O = k_\mu$. As seen in Eq. (23), the momentum operator $k_\mu$ is not only diagonal in k coordinates and band indices, but its matrix element is also time-independent. Thus only the first term $k_\mu^{(1)}$ contributes to its EOM. Furthermore, among the various matrix elements of the Hamiltonian given in (35), (36), only those terms which contain off-diagonal matrix elements w.r.t. k indices contribute to its commutator with $k_\mu$. As a result, its EOM simplifies to
$$ \frac{d\bar{k}_\mu(t)}{dt} = -\int dk\, \rho(k, t)\, \frac{\partial \epsilon_{\mathrm{loc}}(k, \bar{x}, t)}{\partial \bar{x}_\mu} = -\frac{\partial \epsilon_{\mathrm{loc}}(\bar{k}, \bar{x}, t)}{\partial \bar{x}_\mu} . \tag{33} $$
In the second equality, we replaced k in the integrand by its mean value, following the prescription given in Eq. (17). This is nothing but the standard EOM for the momentum of the electron wave packet shown in Eq. (1).
As for the position operator $x_\mu$, Eq. (22) contains both the k-derivative and time-dependent matrix elements between different band indices. As a result, the EOM for the real space coordinate is subject to a drastic change in comparison with Eq. (2). In Section 2, we classified the reciprocal fields into two categories, i.e., (i) $F_{k_\mu k_\nu}$, and (ii) $F_{k_\mu t}$ and $F_{k_\mu \bar{x}_\nu}\, d\bar{x}_\nu/dt$. We will see in the next section that the decomposition (32) clearly demonstrates why we classified them in that way. We have seen on a very general basis in this section that the two components, $O^{(1)}$ and $O^{(2)}$, are structurally well distinguishable and of completely different nature. We will see more specifically in the next section that $x_\mu^{(1)}$ and $x_\mu^{(2)}$ are related respectively to the reciprocal fields (i) and (ii). Thus the different origins of the two types of reciprocal fields will be uncovered. The fact that the classification of reciprocal fields discussed in Section 2 can be done explicitly and unambiguously as the decomposition (32) is actually one of the main advantages of our approach. Let us now turn to a close inspection of the nature of $x_\mu^{(1)}$ and $x_\mu^{(2)}$.
4.3. Nature of $x_\mu^{(1)}$ and $x_\mu^{(2)}$
In this section let us further analyze the nature of the decomposition (32), focusing on the case of $O = x_\mu$. $\bar{x}_\mu(t)$ is given by Eq. (15) together with Eq. (22). The first term of Eq. (32), $x_\mu^{(1)}$ in the present case, has a particularly familiar form, which often appears in the context of Ehrenfest's theorem,
$$ \frac{d}{dt} \langle \Psi|\, x_\mu\, |\Psi\rangle = i\, \langle \Psi|\, [H, x_\mu]\, |\Psi\rangle = \frac{\langle \Psi|\, p_\mu\, |\Psi\rangle}{m} . \tag{34} $$
The similarity between $x_\mu^{(1)}$ and Eq. (34) becomes clearer when one expands the wave packet $|\Psi\rangle$ in terms of a complete set of basis states $|\alpha\rangle$ as $|\Psi\rangle = \sum_\alpha a_\alpha |\alpha\rangle$. The difference is that the basis set used in the expansion is complete in Eq. (34), whereas it is restricted to N Bloch bands in $x_\mu^{(1)}$. This constraint is the origin of the nonvanishing field strengths.
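Let us now proceed to rewrite $x_\mu^{(1)}$ in terms of the reciprocal field strength $F_{k_\mu k_\nu}$ defined in (7). The matrix elements of the Hamiltonian, i.e., Eq. (19), in the restricted subspace spanned by N Bloch bands are calculated to be
$$ H_{mn}(k', k) = \langle \psi_m(k', \bar{x}, t)|\, H\, |\psi_n(k, \bar{x}, t)\rangle = \delta(k' - k)\, H_{mn}(k), \tag{35} $$
$$ H_{mn}(k) = \epsilon^{\mathrm{eff}}_{mn} - \bar{x}_\mu\, \frac{\partial \epsilon_{\mathrm{loc}}}{\partial \bar{x}_\mu}\, \delta_{mn} + \frac{i}{2} \left[ \frac{\partial \epsilon_{\mathrm{loc}}}{\partial \bar{x}_\mu}\, (\nabla_{k_\mu})_{mn} + (\nabla_{k_\mu})_{mn}\, \frac{\partial \epsilon_{\mathrm{loc}}}{\partial \bar{x}_\mu} \right], \tag{36} $$
where $\epsilon_{\mathrm{loc}} = \epsilon_{\mathrm{loc}}(k, \bar{x}, t)$ is a degenerate eigenvalue of the local Hamiltonian (see footnote 7). We also introduced a renormalized energy, $\epsilon^{\mathrm{eff}}(k, \bar{x}, t)$, which takes the form of an N × N matrix whose (m, n)-components are given by
$$ \epsilon^{\mathrm{eff}}_{mn}(k, \bar{x}, t) = \epsilon_{\mathrm{loc}}(k, \bar{x}, t)\, \delta_{mn} + \Delta\epsilon_{mn}(k, \bar{x}, t). \tag{37} $$
Its off-diagonal matrix elements are due to the correction terms,
$$ \Delta\epsilon_{mn}(k, \bar{x}, t) = \frac{i}{2} \left\langle \frac{\partial u_m(k, \bar{x}, t)}{\partial k_\mu} \right| \big[ H_{\mathrm{loc}} - \epsilon_{\mathrm{loc}}(k, \bar{x}, t) \big] \left| \frac{\partial u_n(k, \bar{x}, t)}{\partial \bar{x}_\mu} \right\rangle - \frac{i}{2} \left\langle \frac{\partial u_m(k, \bar{x}, t)}{\partial \bar{x}_\mu} \right| \big[ H_{\mathrm{loc}} - \epsilon_{\mathrm{loc}}(k, \bar{x}, t) \big] \left| \frac{\partial u_n(k, \bar{x}, t)}{\partial k_\mu} \right\rangle , $$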
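where the summation over $\mu = 1, \ldots, D$ is implicit. Using the matrix elements given in Eqs. (35), (36), one can rewrite $x_\mu^{(1)}$ in the following way (see footnote 8),
$$ x_\mu^{(1)} = \int dk\, \rho(k, t) \sum_{m,n=1}^{N} z_m^*(k, t) \left\{ \big[ \nabla_{k_\mu}, \epsilon^{\mathrm{eff}}(k, \bar{x}, t) \big]_{mn} + \sum_{\nu=1}^{D} (F_{k_\mu k_\nu})_{mn}\, \frac{\partial \epsilon_{\mathrm{loc}}(k, \bar{x}, t)}{\partial \bar{x}_\nu} \right\} z_n(k, t) + \delta x_\mu^{(1)} . \tag{38} $$
Apart from the energy correction $\Delta\epsilon$, the first term of $x_\mu^{(1)}$ is nothing but a standard velocity term, i.e., the first term on the r.h.s. of Eq. (2). It is worth remarking here that the covariant derivative in the commutator plays a central role in ensuring the SU(N) gauge invariance of the final results, as we will see later. $\delta x_\mu^{(1)}$ is an irrelevant term (see footnote 9) which vanishes with the help of the prescription introduced in Eq. (17). For later convenience, let us introduce the following abbreviated vector notation for $z_n(\bar{k}(t), t)$:
$$ \bar{z}(t) \equiv \big( z_1(\bar{k}(t), t), \ldots, z_N(\bar{k}(t), t) \big), \tag{39} $$
where we always have in mind the prescription (17). Using this notation, one can further rewrite Eq. (38) as
$$ x_\mu^{(1)} = \sum_{m,n=1}^{N} \bar{z}_m^*(t) \left\{ \big[ \nabla_{k_\mu}, \epsilon^{\mathrm{eff}}(\bar{k}) \big]_{mn} + \sum_{\nu=1}^{D} (F_{k_\mu k_\nu})_{mn}\, \frac{\partial \epsilon_{\mathrm{loc}}(\bar{k})}{\partial \bar{x}_\nu} \right\} \bar{z}_n(t), \tag{40} $$
where the $\bar{x}$-dependence of $\epsilon_{\mathrm{loc}}(\bar{k})$ is not written explicitly.
7 See Eqs. (11), (13). Recall also that $(H_{\mathrm{loc}})_{mn}(k', k) = \langle \psi_m(k', \bar{x}, t)|\, H_{\mathrm{loc}}\, |\psi_n(k, \bar{x}, t)\rangle = \delta(k' - k)\, \delta_{mn}\, \epsilon_{\mathrm{loc}}(k, \bar{x}, t)$ is proportional to an identity in the pseudospin space.
8 Details are given in Appendix A.
9 Its explicit form is given in Eq. (A.4) in Appendix A.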
In contrast to $x_\mu^{(1)}$, the second term of Eq. (32), $x_\mu^{(2)}$ in the present case, would not have existed unless the local Bloch basis had evolved in time. However, in a general situation described by a time-dependent Bloch basis, there is no reason to believe that $x_\mu^{(2)}$ should vanish. Indeed, we will give in Section 5 some concrete examples where a finite contribution from $x_\mu^{(2)}$ plays a crucial role in determining the physical properties of the system. Using Eq. (29), one can easily verify that $x_\mu^{(2)}$ is related to the second category of reciprocal fields, i.e., $F_{k_\mu t}$ and $F_{k_\mu \bar{x}_\nu}\, d\bar{x}_\nu/dt$. $x_\mu^{(2)}$ can be rewritten as
$$ x_\mu^{(2)} = -\int dk\, \rho(k, t) \sum_{m,n=1}^{N} z_m^*(k, t) \left[ (F_{k_\mu \bar{x}_\nu})_{mn}\, \frac{d\bar{x}_\nu}{dt} + (F_{k_\mu t})_{mn} \right] z_n(k, t) = -\bar{z}^\dagger(t) \left[ \sum_{\nu=1}^{D} F_{k_\mu \bar{x}_\nu}\, \frac{d\bar{x}_\nu}{dt} + F_{k_\mu t} \right] \bar{z}(t), \tag{41} $$
where the summation over $\nu = 1, \ldots, D$ was omitted in the first line. The decomposition (32) together with Eqs. (40), (41) gives a complete physical justification of the classification of $F_{q_1 q_2}$ done in Section 2. In other approaches [7,30] the two types of reciprocal fields appear in an indistinguishable manner, and the two different origins of the reciprocal gauge field studied in this paper remain hidden.
4.4. The complete set of EOM and its SU(N ) gauge invariance
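We have successfully related the two contributions to $\frac{d}{dt}\bar{x}_\mu(t)$ in the decomposition (32), i.e., $\dot{x}^{(1)}_\mu$ and $\dot{x}^{(2)}_\mu$, respectively, to two types of gauge invariant reciprocal fields, (i) $F_{k_\mu k_\nu}$, and (ii) $F_{k_\mu t}$ and $F_{k_\mu x_\nu}\frac{d\bar{x}_\nu}{dt}$. Together with Eq. (33), this allows us to rewrite our EOM for $\bar{x}(t)$ and $\bar{k}(t)$ as
$$\frac{d\bar{x}_\mu}{dt} = z^{\dagger}\left\{\big[\nabla_{k_\mu},\varepsilon_{\rm eff}\big] - F_{k_\mu k_\nu}\frac{d\bar{k}_\nu}{dt} - F_{k_\mu x_\nu}\frac{d\bar{x}_\nu}{dt} - F_{k_\mu t}\right\} z, \tag{42}$$
$$\frac{d\bar{k}_\mu}{dt} = -\frac{\partial\varepsilon_{\rm loc}(\bar{k},\bar{x},t)}{\partial\bar{x}_\mu}. \tag{43}$$
Repeated indices should be summed over $\nu = 1, \ldots, D$. The effective energy $\varepsilon_{\rm eff}$ is related to the local $\varepsilon_{\rm loc}$ as in Eq. (37). In Eqs. (42), (43), $\varepsilon_{\rm eff}$ and $\varepsilon_{\rm loc}$ are functions of $\bar{k}$, $\bar{x}$, $t$, i.e., $\varepsilon_{\rm eff} = \varepsilon_{\rm eff}(\bar{k},\bar{x},t)$, $\varepsilon_{\rm loc} = \varepsilon_{\rm loc}(\bar{k},\bar{x},t)$. In order to obtain a complete set of EOM, we still need to know an EOM for $z(t)$ defined in (39). The details of its derivation are given in Appendix B, and the result is,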

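$$i\frac{dz}{dt} = \left\{\varepsilon_{\rm eff} - \bar{x}_\mu\frac{\partial\varepsilon_{\rm loc}}{\partial\bar{x}_\mu}\mathbf{1} - A_{k_\mu}\frac{d\bar{k}_\mu}{dt} - A_{x_\mu}\frac{d\bar{x}_\mu}{dt} - A_t\right\} z, \tag{44}$$
where repeated indices should be summed over $\mu = 1, \ldots, D$, and again, $\varepsilon_{\rm eff} = \varepsilon_{\rm eff}(\bar{k},\bar{x},t)$, $\varepsilon_{\rm loc} = \varepsilon_{\rm loc}(\bar{k},\bar{x},t)$.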
Here let us make a few comments on the physical meaning of Eq. (44). On the r.h.s., the diagonal part of $\varepsilon_{\rm eff}$, i.e., $\varepsilon_{\rm loc}(\bar{k})\mathbf{1}$, together with the second term simply gives rise to a usual U(1) phase factor associated with an effective energy, $\varepsilon_{\rm loc}(\bar{k},\bar{x},t) - \sum_{\mu=1}^{D}\bar{x}_\mu\,\partial\varepsilon_{\rm loc}(\bar{k},\bar{x},t)/\partial\bar{x}_\mu$. On the other hand, the remainder, $\varepsilon_{\rm eff}(\bar{k}) - \varepsilon_{\rm loc}(\bar{k})\mathbf{1}$, generally has off-diagonal matrix elements between different bands and thereby yields a nontrivial SU(N) phase factor, which corresponds to the precession of the spin and/or orbital associated with the wave packet. The remaining terms of Eq. (44) represent a Berry-Wilczek-Zee phase [27] originating from the adiabatic motion of the wave packet. The first two of these terms are due to its motion in $(\bar{k},\bar{x})$-space. In the Abelian case (N = 1), the EOM for $\bar{x}(t)$ and $\bar{k}(t)$, i.e., Eqs. (42), (43), are independent of the motion of the phase degree of freedom, $z(t)$, whereas $z(t)$ acquires a quantal phase due to the evolution of $\bar{x}$ and $\bar{k}$: $\exp\big[i\int dt\,\big(\frac{d\bar{k}_\mu}{dt}A_{k_\mu} + \frac{d\bar{x}_\mu}{dt}A_{x_\mu} + A_t\big)\big]$, where the summation over $\mu$ was omitted. This is analogous to the Berry quantal phase [16].

Finally let us briefly sketch how one can verify the SU(N) gauge invariance of our EOM. Namely, as the N-fold Bloch states are energetically degenerate over the whole Brillouin zone, these EOM should be independent of the choice of the N Bloch bases and be invariant under the following gauge transformation:
$$|u'_n(\bar{k},\bar{x},t)\rangle = \sum_{m=1}^{N}|u_m(\bar{k},\bar{x},t)\rangle\, g_{mn}(\bar{k},\bar{x},t), \qquad z'(\bar{k},t) = g^{-1}(\bar{k},\bar{x},t)\,z(\bar{k},t). \tag{45}$$
Here $|u_n(\bar{k},\bar{x},t)\rangle$ and $z(\bar{k},t)$ are transformed inversely to each other, making the l.h.s. of Eq. (14) invariant. The gauge field and the field strength associated with it are transformed in the following way,
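$$A'_q = g^{-1}A_q\,g + i\,g^{-1}\frac{\partial g}{\partial q}, \qquad F'_{q_1 q_2} = g^{-1}F_{q_1 q_2}\,g,$$
where $q = \bar{k}, \bar{x}, t$. Using this, one can easily see that the covariant derivative, defined generally for this $q$ as
$$\nabla_q = \frac{\partial}{\partial q} - iA_q, \tag{46}$$
behaves as if it were a linear transformation in the vector space spanned by the N-fold degenerate Bloch states:
$$\nabla'_q = g^{-1}\,\nabla_q\, g. \tag{47}$$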
This is why it is dubbed a covariant derivative. Furthermore, one can also check that the $N \times N$ matrix $\varepsilon_{\rm eff}(\bar{k})$ defined in Eq. (37) is transformed as
$$\varepsilon'_{\rm eff}(\bar{k}) = g^{-1}\,\varepsilon_{\rm eff}(\bar{k})\,g.$$
Using the above transformation rules, one can indeed verify that our EOM (42), (43) in
combination with (44) are invariant under SU(N ) gauge transformation (45).
Eqs. (42)-(44), together with their applications, constitute the central result of this paper. The applications are discussed further in Section 5.
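The covariance of the field strength under the gauge transformation of this section can be checked numerically. The following is a minimal sketch, not part of the original derivation. It assumes N = 2, a two-dimensional parameter space q = (q1, q2), the convention $\nabla_q = \partial_q - iA_q$ with $F_{q_1 q_2} = \partial_{q_1}A_{q_2} - \partial_{q_2}A_{q_1} - i[A_{q_1},A_{q_2}]$, and the rule $A' = g^{-1}Ag + i g^{-1}\partial g$. The test fields `gauge_field` and `gauge_rotation` are arbitrary smooth functions chosen only for illustration, and derivatives are taken by central finite differences.

```python
import numpy as np

# Pauli matrices, used to build 2x2 Hermitian test fields (N = 2).
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def gauge_field(q):
    """Hypothetical smooth Hermitian gauge field (A_1, A_2) at q = (q1, q2)."""
    q1, q2 = q
    a1 = np.sin(q2) * SIGMA[0] + q1 * q2 * SIGMA[2]
    a2 = np.cos(q1) * SIGMA[1] + 0.5 * q1 * SIGMA[0]
    return [a1, a2]


def gauge_rotation(q):
    """Hypothetical SU(2) matrix g(q) = exp(i theta(q) . sigma)."""
    q1, q2 = q
    theta = np.array([0.3 * q1, 0.7 * q2, 0.2 + q1 * q2])
    norm = np.linalg.norm(theta)
    n_sigma = sum(t * s for t, s in zip(theta, SIGMA)) / norm
    return np.cos(norm) * np.eye(2) + 1j * np.sin(norm) * n_sigma


def partial(func, q, axis, step=1e-4):
    """Central finite-difference derivative of a matrix-valued function."""
    shift = np.zeros(2)
    shift[axis] = step
    return (func(q + shift) - func(q - shift)) / (2 * step)


def transformed_gauge_field(q):
    """A'_i = g^{-1} A_i g + i g^{-1} d_i g (assumed sign convention)."""
    g = gauge_rotation(q)
    g_inv = g.conj().T  # g is unitary
    return [
        g_inv @ a @ g + 1j * g_inv @ partial(gauge_rotation, q, i)
        for i, a in enumerate(gauge_field(q))
    ]


def field_strength(field, q):
    """F_12 = d_1 A_2 - d_2 A_1 - i [A_1, A_2]."""
    a1, a2 = field(q)
    d1_a2 = partial(lambda p: field(p)[1], q, 0)
    d2_a1 = partial(lambda p: field(p)[0], q, 1)
    return d1_a2 - d2_a1 - 1j * (a1 @ a2 - a2 @ a1)


q0 = np.array([0.4, -0.9])
g0 = gauge_rotation(q0)
f_old = field_strength(gauge_field, q0)
f_new = field_strength(transformed_gauge_field, q0)
covariance_error = np.max(np.abs(f_new - g0.conj().T @ f_old @ g0))
```

The quantity `covariance_error` should be at the level of the finite-difference error. The trace of the field strength, which enters the charge currents of Section 5, is unchanged by the transformation.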
4.5. Abelian case: comparison with other approaches
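In the Abelian case, N = 1, the above equations of motion (42), (43) reduce to
$$\frac{d\bar{x}_\mu}{dt} = \frac{\partial\varepsilon_{\rm eff}(\bar{k},\bar{x},t)}{\partial\bar{k}_\mu} - F_{k_\mu k_\nu}\frac{d\bar{k}_\nu}{dt} - F_{k_\mu x_\nu}\frac{d\bar{x}_\nu}{dt} - F_{k_\mu t}, \tag{48}$$
$$\frac{d\bar{k}_\mu}{dt} = -\frac{\partial\varepsilon_{\rm loc}(\bar{k},\bar{x},t)}{\partial\bar{x}_\mu}. \tag{49}$$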
EOM similar to Eqs. (48), (49) have been derived, to our knowledge, twice, using either
(1) the time-dependent variational principle [7], or
(2) the path-integral method using the Wannier basis.10
If we compare Eq. (2.19) of Ref. [7] with our Eqs. (48), (49), it can be observed that three terms,
$$F_{x_\mu x_\nu}\frac{d\bar{x}_\nu}{dt} + F_{x_\mu k_\nu}\frac{d\bar{k}_\nu}{dt} + F_{x_\mu t}, \tag{50}$$
are lacking on the right-hand side of (49). However, one can easily check by a simple
power counting that these terms appear only at orders higher than 2 in the perturbation
series w.r.t. or x x . Let us briefly illustrate this point. Since a subscript x implies a
derivative w.r.t. x,
which is always accompanied by x x , it increases the power by one. It
is also the case for the subscript t. Therefore, the first and the last terms of (49) turns out
x , t)/ x is also of the
immediately to be at least of the second order of . Since loc (k,
first order w.r.t. in (49), one can verify that the second term of (50) is also at least of the
second order w.r.t. . One can thus conclude that all the lacking terms (50) should appear
only at the second order w.r.t. or x x .
Another difference between Eqs. (48), (49) and Eq. (2.19) of Ref. [7] is that in our EOM for $d\bar{k}_\mu/dt$, the derivative $\partial/\partial\bar{x}_\mu$ applies to $\varepsilon_{\rm loc}(\bar{k},\bar{x},t)$ and not to $\varepsilon_{\rm eff}(\bar{k},\bar{x},t)$ as in Ref. [7]. In our formalism, as is clearly shown in Eq. (33), there is no room for the correction $\varepsilon_{\rm eff}(\bar{k},\bar{x},t) - \varepsilon_{\rm loc}(\bar{k},\bar{x},t)$ to enter the expression (49). Nevertheless, repeating the same type of argument, i.e., the power counting for this correction, one can confirm that its contribution is not physically relevant at the first order of the perturbation expansion.
We have not only developed a systematic perturbation theory, but also made no approximation apart from the assumptions stated in Section 3. Our calculation must, therefore, be exact at the first order of perturbation theory. Since possible discrepancies start only at the second order in the perturbation series, our result, Eqs. (48), (49), is not inconsistent11 with that of Ref. [7].
10 We had some difficulty justifying the use of the Wannier basis of Ref. [30] as a complete basis, which is necessary in the path integral formalism. This is closely related to the arbitrariness of the Wannier function discussed extensively in Ref. [34].
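To make the role of the anomalous velocity in the Abelian EOM concrete, here is a minimal numerical sketch. It is not taken from the paper and rests on illustrative assumptions: D = 2, a free-electron dispersion $\varepsilon = k^2/2m$, a constant reciprocal field $F_{k_x k_y} = \Omega$, a uniform static electric field (so that $F_{k_\mu x_\nu} = F_{k_\mu t} = 0$ and $d\bar{k}_\mu/dt = -eE_\mu$), and units with $\hbar = 1$. The function name and parameters are hypothetical.

```python
import numpy as np


def integrate_abelian_eom(omega, e_field, charge_e=1.0, mass=1.0,
                          t_end=1.0, steps=1000):
    """Integrate dx/dt = k/m - F_{mu nu} dk_nu/dt, dk/dt = -e E with RK4.

    Assumes a constant antisymmetric reciprocal field F_{xy} = -F_{yx} = omega
    and a wave packet starting at rest at the origin.
    """
    field_strength = np.array([[0.0, omega], [-omega, 0.0]])
    e_field = np.asarray(e_field, dtype=float)

    def rhs(state):
        k = state[2:]
        k_dot = -charge_e * e_field            # Eq. (49) for a uniform field
        x_dot = k / mass - field_strength @ k_dot  # Eq. (48), F_kx = F_kt = 0
        return np.concatenate([x_dot, k_dot])

    state = np.zeros(4)  # (x, y, k_x, k_y)
    dt = t_end / steps
    for _ in range(steps):
        s1 = rhs(state)
        s2 = rhs(state + 0.5 * dt * s1)
        s3 = rhs(state + 0.5 * dt * s2)
        s4 = rhs(state + dt * s3)
        state = state + dt * (s1 + 2 * s2 + 2 * s3 + s4) / 6
    return state[:2], state[2:]


position, momentum = integrate_abelian_eom(omega=0.3, e_field=(2.0, 0.0))
```

With the field along x, the wave packet accelerates along x as usual, while the term $-F_{k_y k_x}\,d\bar{k}_x/dt$ produces a drift along y that is linear in time and proportional to $\Omega$. This transverse drift is the single-particle origin of the Hall-type currents discussed in Section 5.2.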
5. Discussion: Berry phase engineering

The gauge invariant EOM (42), (43) have been derived in the previous section. The decomposition (32) uncovered the origin of the two different types of reciprocal fields introduced in Section 2. In this section we discuss some physical consequences of Section 4 in the context of Berry phase engineering.
In the introduction, we argued that a finite net charge current could be induced by the U(1) Berry phase correction to the semiclassical EOM (1), (2). This finite charge current is actually carried by all the electrons below the Fermi surface, i.e., by the electrons in the ground state. Generalizing this U(1) argument to the non-Abelian case, we will discuss in this section how the various types of non-Abelian field strength appearing in our EOM are related to concrete physical realizations, mainly focusing on the SU(2) case. This opens a new possibility of manipulating the ground state electronic wave function by controlling the Berry phase, which is sometimes called Berry phase engineering.
After introducing some terminology and fixing notations, we will focus on two topics.
In Section 5.2, we will see that $F_{k_\mu k_\nu}$ is related to the physics of Hall type currents. We first observe that the charge Hall current can be described by a trace of the non-Abelian field strength $F_{k_\mu k_\nu}$, while this current vanishes whenever the system is time-reversal (T-)invariant. The charge Hall current carried by a $(k,\uparrow)$ Bloch electron and that of the $(-k,\downarrow)$ electron precisely cancel each other. This observation leads us to investigate two physical situations which avoid such a cancellation of the charge Hall current and allow the reciprocal magnetic field to manifest itself experimentally as a Hall-type current. The two situations are:
(1) anomalous (charge) Hall current observed in systems with broken time-reversal symmetry, i.e., in ferromagnets [6,23-25],
(2) spin Hall current in time-reversal symmetric systems [8-12].
In Section 5.3, we will argue that $F_{k_\mu t}$ is directly related to various types of polarization currents, i.e., currents induced in insulators under time-dependent perturbations. Similarly to the case of the Hall type charge current, which will be discussed in Section 5.2, we first observe that in systems that are symmetric under spatial inversion, the polarization electric/spin current actually vanishes due to a cancellation associated with the inversion symmetry of the system. Then, in order to overcome this difficulty we propose, in parallel with Section 5.2, two physical systems in which the problem of cancellation is resolved, and the gauge invariant reciprocal field strength appears explicitly in a macroscopic physical quantity, i.e., as a polarization current:
(1) If the inversion symmetry is broken externally or spontaneously, the Abelian $F_{k_\mu t}$ gives rise to a relevant contribution to the polarization electric/spin current [15,22].
(2) Even in systems symmetric under spatial inversion, the non-Abelian $F_{k_\mu t}$ may have a chance to manifest itself as a polarization orbital current, if that orbital degree of freedom changes its sign under the spatial inversion.
The analogy and correspondence between the Hall type and polarization currents are summarized in Tables 1 and 2.
11 The calculation done in Ref. [7] is also first order; in particular, their Hamiltonian, Eqs. (2.1), (2.14) and (2.15), is first order. However, they kept all the possible Berry phase contributions without further consideration of power counting, whereas we systematically omitted higher order terms.

Table 1
Transformation properties under time reversal T and spatial inversion I

Invariance                              T    I
Charge: 1                               +    +
Spin: S(k)                              -    +
Parity operator                         +    -
Hall-type current: F_{k k}(k)           -    +
Polarization current: F_{k t}(k)        +    -

A negative (positive) sign indicates that the matrix element in question at -k reverses (does not reverse) its sign in comparison with that at k when the system is invariant under a certain symmetry operation, such as T or I. For example, the negative sign for S(k) in the T-invariant case means that the spin matrix $S(k)$ is identical to $-(S(-k))^t$ up to a certain SU(N) gauge transformation, as given in Eq. (67).

Table 2
Cancellation rules for charge/spin/parity Hall-type/polarization currents

Type of current      Hall-type      Polarization
Invariance           T      I       T      I
Charge               -      +       +      -
Spin                 +      +       -      -
Parity               -      -       +      +

A negative (positive) sign indicates that contributions to the total charge/spin/parity (vertical axis) Hall-type/polarization (horizontal axis) current from k and -k electrons cancel (do not cancel) each other when the system is invariant under either T or I. This table can be deduced from Table 1. For example, in the T-invariant case, a negative sign for the spin in Table 1 gives, together with another negative sign for the Hall-type current, a positive sign for the spin Hall current in this table, corresponding, respectively, to Eqs. (67), (65) and (78).
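The statement that Table 2 follows from Table 1 amounts to multiplying two signs. The sketch below spells this out under the assumption that the signs of Table 1 are the ones implied by the relations quoted in the text (spin odd under T and even under I, the parity operator even under T and odd under I, $F_{k_\mu k_\nu}$ odd under T and even under I, $F_{k_\mu t}$ even under T and odd under I). The dictionary keys and function names are hypothetical labels introduced only for this illustration.

```python
# Signs acquired when going from k to -k under each symmetry (Table 1).
TABLE_1 = {
    "charge": {"T": +1, "I": +1},
    "spin": {"T": -1, "I": +1},
    "parity": {"T": +1, "I": -1},
    "hall": {"T": -1, "I": +1},          # field strength F_{k k}
    "polarization": {"T": +1, "I": -1},  # field strength F_{k t}
}


def survives(degree_of_freedom, current_type, symmetry):
    """True if the k and -k contributions add up instead of cancelling."""
    sign = TABLE_1[degree_of_freedom][symmetry] * TABLE_1[current_type][symmetry]
    return sign > 0


def cancellation_table():
    """Rebuild Table 2 as a nested dictionary of '+' / '-' entries."""
    table = {}
    for degree in ("charge", "spin", "parity"):
        for current in ("hall", "polarization"):
            for symmetry in ("T", "I"):
                entry = "+" if survives(degree, current, symmetry) else "-"
                table[(degree, current, symmetry)] = entry
    return table


TABLE_2 = cancellation_table()
```

For instance, `TABLE_2[("spin", "hall", "T")]` is `"+"`, which is the robustness of the spin Hall current against time-reversal symmetry, while `TABLE_2[("charge", "polarization", "I")]` is `"-"`, the vanishing of the charge polarization current in inversion-symmetric systems.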
5.1. Preliminaries
Before further discussing Berry phase transport, we first introduce some terminology and give an unambiguous definition of spin/orbital currents.
In order to illustrate our point, let us first consider a spin current. We naturally have in mind that there are different points of view [10-12] on the definition of the spin current operator, $J^{S}$. The difficulty of defining a spin current stems simply from the fact that the bare spin $S$ is generally not a conserved quantity due to the spin-orbit interaction, i.e., the continuity equation, $\partial S/\partial t + \sum_{\mu=1}^{D}\partial J^{S}_\mu/\partial x_\mu = 0$, is not satisfied. Thereby Noether's theorem does not apply. If one focuses on the time derivative of a local spin $S(x,t)$ in the general case of nonconserved spin, one observes that there are two contributions to it of physically different nature, i.e., contributions from (i) a local spin current and (ii) a local precession of spin. The former is the one of our interest, and the latter is related to the nonconservation of spin. Unfortunately, there is no systematic prescription for distinguishing between those two contributions.
We can still define on quite general grounds a current operator $J^{I}_\mu$ associated with an internal degree of freedom $I$ as the time derivative of a spatial polarization of $I$,
$$J^{I}_\mu = \frac{dP^{I}_\mu}{dt}, \tag{51}$$
$$P^{I}_\mu = \frac{1}{L^{D}}\sum_{j=1}^{M}\frac{1}{2}\big\{I_j\,x_{j\mu} + x_{j\mu}\,I_j\big\}. \tag{52}$$
L and M denote the system size and the total number of electrons, respectively. The subscripts j attributed to I and x specify each electron. In the case of a spin current, the operator I in Eqs. (51), (52) is simply the usual spin operator, I = S. The spin/orbital current defined in this way is indeed directly observable.12 For example, an increasing $P^{S}_\mu = P_\mu(S)$ indicates that extra up (down)-spin electrons with S = +1/2 (-1/2) accumulate at one (the other) end of a system with $x_\mu$ = L/2 (-L/2), which could be experimentally detected by some optical probes.
We will also discuss orbital currents. A Bloch electron has, in addition to the spin degree
of freedom, orbital degrees of freedom in multiband systems. These orbital degrees of
freedom describe the charge distribution in the unit cell, and hence,13
(x, p) = (x + a, p).
(54)
12 Otherwise, we could have defined it also as

M
1 1
dx dx
Ij
+
Ij .
J I = D
2
dt
dt
L
j =1
This type of definition is convenient for the application of Kubo formula [11,12,29].
13 The periodicity of , i.e., Eq. (54) excludes its off-diagonal matrix elements in k-space:
mn (k , k) = (k k)mn (k),

(2 )D
mn (k) = m (k, t) n (k, t) =
Vcell

unit cell

dx um (k, t)x (x, i + k) xun (k, t) .
(53)
In Section 5.3 we focus on an orbital operator which behaves quite contrastingly to the
spin under time reversal and spatial inversion, i.e.,
(x, p) = (x, p),
(55)
(x, p) = (x, p).
(56)
In contrast to the spin operator S, the orbital operator reverses its sign under spatial
inversion, while invariant under time reversal. Accordingly, we dub this orbital operator
as a parity operator.14 Various transformation properties of the operators I = S , are
summarized in Table 1.
The spin/orbital current associated with an operator I was introduced in Eqs. (51), (52).
We calculate in Sections 5.2 and 5.3 those spin/parity currents carried by the ground state.
In order to make later discussions clearer and physically more appealing, we keep only
their matrix elements in the restricted subspace N spanned by N -fold degenerate bands; if
m N and l
/ N , then

m (k)S l (k) = 0,
(57)

m (k)(x, p)l (k) = 0.
(58)
As will be seen in Sections 5.2 and 5.3, these approximations make the following discussions considerably simpler, i.e., not only a charge current but also spin/parity currents
become related simply to the non-Abelian field strength, Fk k and Fk t .
5.2. Fk k induces Hall type currents: AHE and spin Hall effect
The field strength Fk k describes various kinds of spontaneous Hall currents carried by
the ground state. In order to demonstrate it we expose our system under a uniform electric
field E; we consider below the following Hamiltonian,
D

H p, x, (x, t) = H0 (p, x) + e
E x .
(59)
=1
We see below that the Hall-type current can be expressed essentially as a trace of $I F_{k_\mu k_\nu}$ in the N-fold degenerate pseudospin space. The cancellation or the survival of such a Hall-type topological current is determined by its transformation properties under time reversal (T) of the system. In systems with broken T symmetry a finite charge/mass Hall current generally appears. In two spatial dimensions, D = 2, this situation is often described in terms of a Chern-Simons gauge field, accounting for the quantized Hall conductance [35], as well as fractional charge and statistics [36]. Chern-Simons terms also appear in electrically neutral systems [37]. Quantization of the spin Hall conductance in unconventional superfluids has also been studied in this context [38].
14 This orbital operator is different from the angular momentum operator, = x p, which we might also call
an orbital operator. reverses its sign under time reversal, while remains invariant under spatial inversion.
5.2.1. Charge Hall current

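Let us first see that the trace of the non-Abelian field strength $F_{k_\mu k_\nu}$ describes a spontaneous charge Hall conductivity, i.e., an anomalous Hall conductivity [6,23-25]. In terms of Eqs. (51), (52), we are considering the case of I = 1. Applying the EOM, Eqs. (42), (43), to the present case, we consider a Bloch electron in the j-th pseudospin state15 and with a crystal momentum $\bar{k}$. The SU(N) gauge invariant EOM for this electron under a uniform electric field E now read,
$$\frac{d\bar{x}^{(j)}_\mu}{dt} = \frac{\partial\varepsilon_{\rm loc}(\bar{k})}{\partial\bar{k}_\mu} - \sum_{\nu=1}^{D} z^{(j)\dagger}F_{k_\mu k_\nu}(\bar{k})\,z^{(j)}\,\frac{d\bar{k}_\nu}{dt}, \tag{60}$$
$$\frac{d\bar{k}_\mu}{dt} = -eE_\mu. \tag{61}$$
When the crystal momentum $\bar{k}$ is located below the Fermi surface, all those pseudospin states are completely occupied and contribute to the charge current. In order to calculate the charge Hall current, we need
$$-e\sum_{j=1}^{N}\frac{d\bar{x}^{(j)}_\mu}{dt} = -Ne\frac{\partial\varepsilon_{\rm loc}(\bar{k})}{\partial\bar{k}_\mu} - e^{2}\sum_{j=1}^{N}\sum_{\nu=1}^{D} z^{(j)\dagger}F_{k_\mu k_\nu}(\bar{k})\,z^{(j)}E_\nu = -Ne\frac{\partial\varepsilon_{\rm loc}(\bar{k})}{\partial\bar{k}_\mu} - e^{2}\sum_{\nu=1}^{D}{\rm Tr}\big[F_{k_\mu k_\nu}(\bar{k})\big]E_\nu. \tag{62}$$
Then, by integrating these contributions over filled $\bar{k}$ points, we obtain the total current carried by all the electrons below the Fermi energy $\varepsilon_F$,
$$J^{\rm Hall}_{C\mu} = -e^{2}\sum_{\nu=1}^{D}\int_{\varepsilon_{\rm loc}(\bar{k})<\varepsilon_F}\frac{d\bar{k}}{(2\pi)^{D}}\,{\rm Tr}\big[F_{k_\mu k_\nu}(\bar{k})\big]E_\nu. \tag{63}$$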
Here we assumed for the sake of simplicity that the unperturbed Hamiltonian $H_0(p,x)$ is also invariant under spatial inversion (I). As far as the Hall type current is concerned, however, this I symmetry plays only a minor role. Eq. (63) indeed takes the form of a Hall type current, reflecting the antisymmetry of the field strength: $F_{k_\mu k_\nu} = -F_{k_\nu k_\mu}$. The first term of Eq. (62) does not contribute to Eq. (63) due to a cancellation associated with the I symmetry. In contrast, Eq. (63) remains finite irrespective of the I-invariance of $H_0$.
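The step from the sum over pseudospin states to the trace in Eq. (62) uses only the orthonormality of the N eigenvectors $z^{(j)}$ stated in footnote 15. A short worked version, under the assumption that these N vectors form a complete basis of the N-dimensional pseudospin space, reads:

```latex
% Completeness of the orthonormal set {z^{(j)}}, j = 1, ..., N (assumed):
\sum_{j=1}^{N} z^{(j)} z^{(j)\dagger} = \mathbf{1}_{N}
% Hence, for any N x N matrix F, in particular F = F_{k_\mu k_\nu}(\bar{k}):
\sum_{j=1}^{N} z^{(j)\dagger} F\, z^{(j)}
  = \sum_{j=1}^{N} \mathrm{Tr}\!\left[ F\, z^{(j)} z^{(j)\dagger} \right]
  = \mathrm{Tr}\!\left[ F \sum_{j=1}^{N} z^{(j)} z^{(j)\dagger} \right]
  = \mathrm{Tr}\, F .
```

Because the trace is invariant under $F \to g^{-1}Fg$, the resulting charge current does not depend on the choice of the Bloch basis.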
Unfortunately, the charge Hall current obtained in Eq. (63) vanishes whenever the system is T-invariant. Let us see this point more explicitly. We consider an unperturbed Hamiltonian $H_0(x,p)$ which is invariant under T. The T-invariance relates its N-fold degenerate Bloch functions at $k$ to those at $-k$ up to a certain SU(N) gauge transformation $g(k)$,
$$\sum_{b=\uparrow,\downarrow}[i\sigma_y]_{ab}\,\langle x,b|u_i(-k)\rangle^{*} = \sum_{j=1}^{N}\langle x,a|u_j(k)\rangle\,g_{ji}(k). \tag{64}$$
15 The j-th eigenstate is a linear combination of the different pseudospin states m = 1, ..., N, with coefficients $z^{(j)}_m$. These eigenstates are chosen to be orthogonal to each other: $z^{(i)\dagger}z^{(j)} = \delta_{ij}$ for i, j = 1, ..., N.
In the language of field strength, this reduces to,
$$F_{k_\mu k_\nu}(-k)^{t} = -g^{-1}F_{k_\mu k_\nu}(k)\,g, \tag{65}$$
where the superscript t represents a transposed matrix, i.e., $(F^{t})_{mn} = (F)_{nm}$. One can verify this using Eqs. (6), (7), (64). Consequently, the charge current carried by a Bloch electron at $k$ cancels with that of the $-k$ electron, i.e.,
$$\sum_{\nu=1}^{D}{\rm Tr}\big[F_{k_\mu k_\nu}(-k)\big]E_\nu = -\sum_{\nu=1}^{D}{\rm Tr}\big[F_{k_\mu k_\nu}(k)\big]E_\nu. \tag{66}$$
Eqs. (63), (66) indicate the absence of spontaneous Hall current in T-invariant systems.
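The cancellation between $k$ and $-k$ can be illustrated numerically in the simplest Abelian setting. The sketch below is an illustration under stated assumptions, not the model of the paper. It uses a hypothetical spinless two-band Hamiltonian $H(k) = d(k)\cdot\sigma$ in D = 2 that satisfies $H(-k) = H(k)^{*}$ (time-reversal invariance without spin), and evaluates the Berry curvature of the lower band from the phase of the overlaps around a small plaquette. The overall sign convention of the curvature is an assumption and does not affect the oddness being tested.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)


def hamiltonian(kx, ky):
    """Hypothetical T-invariant two-band model: d_x, d_z even and d_y odd in k."""
    dx = 1.0 + np.cos(kx) + 0.5 * np.cos(ky)
    dy = np.sin(kx) + 0.3 * np.sin(ky)
    dz = 0.8 + 0.4 * np.sin(kx) * np.sin(ky)
    return dx * SX + dy * SY + dz * SZ


def lower_state(kx, ky):
    """Eigenvector of the lower band (eigh sorts eigenvalues in ascending order)."""
    _, vectors = np.linalg.eigh(hamiltonian(kx, ky))
    return vectors[:, 0]


def berry_curvature(kx, ky, size=1e-3):
    """Gauge-invariant plaquette estimate of the lower-band Berry curvature."""
    half = size / 2
    corners = [
        (kx - half, ky - half),
        (kx + half, ky - half),
        (kx + half, ky + half),
        (kx - half, ky + half),
    ]
    states = [lower_state(*corner) for corner in corners]
    loop = 1.0 + 0.0j
    for start, end in zip(states, states[1:] + states[:1]):
        loop *= np.vdot(start, end)
    return -np.angle(loop) / size**2


curvature_plus = berry_curvature(0.7, 0.4)
curvature_minus = berry_curvature(-0.7, -0.4)
```

In this model the curvature is nonzero at a generic point but odd in $k$, so `curvature_plus + curvature_minus` vanishes up to numerical error. This is the N = 1 counterpart of Eq. (66).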
5.2.2. Spin Hall current
We are thus led to investigate the spin current, expecting that the spin Hall current remains finite even in T-invariant systems [8-12]. The underlying idea is simply that the sign change seen in Eq. (66) may be compensated by that of the spin operator under time reversal. The spin current has been defined in Eq. (51), in a more general context, for an arbitrary internal degree of freedom I. Let us first observe that the T-invariance of $H_0(x,p)$ is instrumental for this compensation. The total spin carried by the Bloch electrons at $k$ has the same absolute value as, and the opposite sign to, that of the Bloch electrons at $-k$,16
$${\rm Tr}\big[S(-k)\big] = -{\rm Tr}\big[S(k)\big], \tag{68}$$
where S (k) is a N N matrix, whose (m, n)-components are given by

S mn (k) = m (k)S n (k)

1

(2)D
um (k)x, a ( )ab x, bun (k) .
dx
=
Vcell
2
unit cell
(69)
a,b=,
Let us now consider the spin Hall current defined as (51), (52) with I being I = S , the
usual spin operator. In order to evaluate a spin Hall current, the EOM (60), (61) used for
the calculation of charge Hall current are no longer sufficient. We need instead to derive
16 To be more specific, T-invariance relates the matrix of S at $k$ and at $-k$ through an SU(N) gauge transformation g,
$$S(-k)^{t} = -g^{-1}S(k)\,g. \tag{67}$$
Eq. (67) can be verified explicitly using Eq. (64).
EOM for an observable,
$$O^{S}_\mu = O_\mu(S) = \frac{1}{2}\big(S\,x_\mu + x_\mu S\big). \tag{70}$$
Following the same type of procedure as in the derivation of Eqs. (60), (61), we perform, in particular, the decomposition (32). As was also the case in Eqs. (60), (61), the second contribution to (32), i.e., $\dot{O}^{(2)}_\mu$, vanishes in the present case.17 The EOM reads,
N
S

d O

S
=i
(k, t) H , O
(k, k )an (k , t)
dk dk am
mn
dt
m,n=1

N
D

S
=i
(71)
E x , O
an .
dk am H0 + e
m,n=1
=1
mn
The assumption (57) allows us to rewrite Eq. (71) as

N
S

d O
loc (k)
=
(k, t) S mn (k)
dk (k)zm
dt
k
m,n=1

D

e
S (k)Fk k (k) mn E + h.c. zn (k, t).
+
2
(72)
=1
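The details of the derivation of Eq. (72) are given in Appendix C. We then apply the prescription (17), replacing $k$ in the integrand by its mean value $\bar{k}$. The contribution to the spin Hall current by an electron occupying the j-th pseudospin state at $\bar{k}$ is, therefore,
$$\frac{dO^{S(j)}_\mu}{dt} = z^{(j)\dagger}S(\bar{k})\,z^{(j)}\,\frac{\partial\varepsilon_{\rm loc}(\bar{k})}{\partial\bar{k}_\mu} + \frac{e}{2}\sum_{\nu=1}^{D}\Big\{z^{(j)\dagger}S(\bar{k})F_{k_\mu k_\nu}(\bar{k})\,z^{(j)} + {\rm c.c.}\Big\}E_\nu. \tag{73}$$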
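Finally we take the summation over the N pseudospin states and over filled $\bar{k}$ points to find,
$$J^{\rm Hall}_{S\mu} = \int_{\varepsilon_{\rm loc}(\bar{k})<\varepsilon_F}\frac{d\bar{k}}{(2\pi)^{D}}\sum_{j=1}^{N}\frac{dO^{S(j)}_\mu}{dt} = e\sum_{\nu=1}^{D}\int_{\varepsilon_{\rm loc}(\bar{k})<\varepsilon_F}\frac{d\bar{k}}{(2\pi)^{D}}\,{\rm Tr}\big[S(\bar{k})F_{k_\mu k_\nu}(\bar{k})\big]E_\nu. \tag{74}$$
As was the case in Eq. (63), we also assumed in Eq. (74) that $H_0$ is invariant under I, so that the first term of Eq. (73) does not appear in Eq. (74), i.e.,
$$\int_{\varepsilon_{\rm loc}(\bar{k})<\varepsilon_F}\frac{d\bar{k}}{(2\pi)^{D}}\,{\rm Tr}\big[S(\bar{k})\big]\frac{\partial\varepsilon_{\rm loc}(\bar{k})}{\partial\bar{k}_\mu} = 0. \tag{75}$$
17 As long as the electric field E in Eq. (59) is uniform, the local Bloch function defined in Eq. (11) has no dependence on $\bar{x}$ and $t$. As a result, $\dot{O}^{(2)}_\mu$ vanishes.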
Contrary to this normal part, the spin Hall current associated with the anomalous velocity, given by Eq. (74), can be finite irrespective of the I and T symmetries. When the system is invariant under either T or I,18 the spin current carried by a Bloch electron at $k$ and that of the $-k$ electron give the same contribution:
$${\rm Tr}\big[S(-k)F_{k_\mu k_\nu}(-k)\big] = {\rm Tr}\big[S(k)F_{k_\mu k_\nu}(k)\big]. \tag{78}$$
Under T symmetry Eq. (78) is a consequence of Eqs. (65), (67). Eqs. (74), (78) confirm our expectation that the spin Hall current is robust against T symmetry. A set of cancellation rules for the Hall type currents is established in Table 2.
5.3. Fk t induces a polarization current: parity polarization current and quantum spin
pump
We have seen in the previous section that Fk k is related to Hall type currents associated with the internal degrees of freedom such as charge and spin, by applying a uniform
electric field to the system. Here we argue that Fk t describes various kinds of polarization current. More specifically, we consider a situation where a band insulator is subject
to a time-dependent perturbation which does not break the periodicity of the underlying
crystal, i.e., we consider a Hamiltonian,

H x, p; (t) = H x + a, p; (t) .
(79)
Since the perturbation (t) does not depend on x, the local Hamiltonian defined in Eq. (11)
reduces simply to

Hloc (x, p, t) = H x, p; (t) .
Depending on the perturbation (t), the ground state wave function of the local Hamiltonian, Hloc (x, p, t) also evolves temporally. Since an electronic wave function for the
ground state describes spatial distributions of charge, spin and orbital, its evolution in general induces various kinds of currents in the system. When the system is isolated from the
external circuit, an induced current accumulates an extra charge (or spin, orbital) on one
side of the system, which results in a spatial polarization of charge, spin and orbital [39].
Accordingly, this type of current associated with such internal degrees of freedom is often
called a polarization current. In the following, we describe the physics of polarization current using the language of non-Abelian gauge field, in particular, that of Fk t . One of the
advantages of taking such a viewpoint is that the role of symmetry becomes transparent,
which we summarized as a set of cancellation rules in Table 2.
18 In the I-invariant case, instead of Eq. (64), Eq. (84) holds, i.e., the I-invariance of $H_0$ relates $\langle -x,a|u_i(-k)\rangle$ to $\langle x,a|u_j(k)\rangle$ up to an SU(N) gauge degree of freedom h. In terms of the spin and the field strength, this reduces to
$$S(-k) = h^{-1}S(k)\,h, \tag{76}$$
$$F_{k_\mu k_\nu}(-k) = h^{-1}F_{k_\mu k_\nu}(k)\,h. \tag{77}$$
Eq. (76) justifies (75), whereas multiplying Eqs. (76) and (77), one finds immediately Eq. (78).
5.3.1. Charge polarization current

Let us first consider a charge polarization current. In the case of the time-dependent perturbation (79), the EOM analogous to Eqs. (60), (61) are found to be
$$\frac{d\bar{x}^{(j)}_\mu}{dt} = \frac{\partial\varepsilon_{\rm loc}(\bar{k})}{\partial\bar{k}_\mu} - z^{(j)\dagger}F_{k_\mu t}(\bar{k},t)\,z^{(j)}, \tag{80}$$
$$\frac{d\bar{k}_\mu}{dt} = 0, \tag{81}$$
where j = 1, ..., N. Collecting contributions from all N pseudospin states and from all filled $\bar{k}$ points, one can calculate the charge current carried by the ground state as
$$J^{\rm pol}_{C\mu} = -e\int_{\rm BZ}\frac{d\bar{k}}{(2\pi)^{D}}\sum_{j=1}^{N}\frac{d\bar{x}^{(j)}_\mu}{dt} = e\int_{\rm BZ}\frac{d\bar{k}}{(2\pi)^{D}}\,{\rm Tr}\big[F_{k_\mu t}(\bar{k},t)\big], \tag{82}$$
where the $\bar{k}$-integral was performed over the whole Brillouin zone (BZ). Eq. (82) is analogous to Eq. (63), which we found for the charge Hall current. We can see that the traces of different types of reciprocal field strength, i.e., $F_{k_\mu t}$ and $F_{k_\mu k_\nu}$, are related to different types of physical currents, i.e., polarization and Hall type currents.
We have seen in the previous section that no charge Hall current flows whenever the system is invariant under time reversal T. In contrast, we are going to see below that the charge polarization current vanishes whenever the system is invariant under spatial inversion I,19
$$H_{\rm loc}(x,p,t) = H_{\rm loc}(-x,-p,t). \tag{83}$$
In this case, its N-fold degenerate Bloch functions at $-k$ are related to those at $k$ up to a certain SU(N) gauge transformation $h(k,t)$,
$$\langle -x,a|u_i(-k,t)\rangle = \sum_{j=1}^{N}\langle x,a|u_j(k,t)\rangle\,h_{ji}(k,t). \tag{84}$$
Since the field strength $F_{k_\mu t}$ is related through Eqs. (6), (7) to those wave functions, Eq. (84) reduces to the following identity,
$$F_{k_\mu t}(-k,t) = -h^{-1}F_{k_\mu t}(k,t)\,h. \tag{85}$$
Consequently,
$${\rm Tr}\big[F_{k_\mu t}(-k,t)\big] = -{\rm Tr}\big[F_{k_\mu t}(k,t)\big]. \tag{86}$$
Eqs. (82) and (86) indicate that the charge polarization current always vanishes in I-invariant systems.
19 I.e., the underlying crystal structure has centro-symmetric points.
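A standard illustration of a charge polarization current in the Abelian case (N = 1, D = 1) is an adiabatic charge pump. The sketch below is not the authors' model. It assumes a Rice-Mele-type two-band chain, a hypothetical choice made here for illustration, whose dimerization and staggered potential are driven around a closed cycle that breaks inversion symmetry for almost all times. Integrating a current of the form of Eq. (82) over one cycle then gives, in units of e, the integral of $F_{kt}$ over the (k, t) torus divided by $2\pi$. This is evaluated here with a gauge-invariant lattice (plaquette) discretization, and the sign of the result depends on conventions.

```python
import numpy as np


def rice_mele(k, phase, dimerization=0.5, staggering=0.5):
    """Hypothetical two-band Bloch Hamiltonian driven by a cyclic parameter."""
    delta = dimerization * np.cos(phase)   # bond alternation
    onsite = staggering * np.sin(phase)    # staggered on-site potential
    hopping = (1.0 + delta) + (1.0 - delta) * np.exp(-1j * k)
    return np.array([[onsite, hopping], [np.conj(hopping), -onsite]])


def lower_band(k, phase):
    """Eigenvector of the filled (lower) band."""
    _, vectors = np.linalg.eigh(rice_mele(k, phase))
    return vectors[:, 0]


def unit_overlap(state_a, state_b):
    """Normalized overlap <a|b>/|<a|b>| used as a lattice link variable."""
    overlap = np.vdot(state_a, state_b)
    return overlap / abs(overlap)


def pumped_charge(grid=40):
    """Sum of plaquette fluxes of F_{kt} over the (k, t) torus, divided by 2 pi."""
    values = 2 * np.pi * np.arange(grid) / grid
    states = [[lower_band(k, phase) for phase in values] for k in values]
    total_flux = 0.0
    for i in range(grid):
        for j in range(grid):
            i_next, j_next = (i + 1) % grid, (j + 1) % grid
            loop = (
                unit_overlap(states[i][j], states[i_next][j])
                * unit_overlap(states[i_next][j], states[i_next][j_next])
                * unit_overlap(states[i_next][j_next], states[i][j_next])
                * unit_overlap(states[i][j_next], states[i][j])
            )
            total_flux += np.angle(loop)
    return total_flux / (2 * np.pi)


charge_per_cycle = pumped_charge()
```

For this cycle, which encircles the gap-closing point of the model, `charge_per_cycle` is an integer of magnitude one. If the staggered potential were absent at all times, the chain would be inversion symmetric and, by Eq. (86), the contributions from $k$ and $-k$ would cancel at each instant.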
5.3.2. Parity polarization current

We have already encountered a similar situation in the previous section. Under the time
reversal T , Fk k (k) is transformed to (Fk k (k))t up to a SU(N ) gauge degree of
freedom, as seen in Eq. (65). As a result, the charge Hall current vanished in T -invariant
systems. On the other hand, a spin Hall current was robust against T -invariance. The reason
was that not only Fk k but also the spin operator are odd under the time-reversal, as given
respectively in Eqs. (65) and (67).
Following the same type of logic, we can expect that an orbital polarization current
may remain finite irrespective of the I -invariance of Hloc , as far as the associated orbital
operator (x, p) changes its sign under the spatial inversion I .20 Accordingly we dub this
type of orbital current a parity polarization current.
Expecting that the above analogy is indeed a sensible one, let us further analyze the
parity polarization current carried by the ground state. Since we have defined this orbital
current as Eqs. (51), (52), we have to consider an EOM for

1
O = O (x, p) = (x + x ).
2
(87)
We derive their EOM in terms of the decomposition (32). Our local Hamiltonian is time(2)
dependent, and so is its local Bloch function. Therefore, O appearing in Eq. (32)
remains finite in general:

d O
dt
(1)
(2)
= O + O
N

=i

(k, t) Hloc it , O mn (k, k )an (k , t).

dk dk am
(88)
m,n=1
This equation is analogous to Eq. (71). The covariant derivative k = x in Eq. (71)
was replaced in Eq. (88) by another covariant derivative w.r.t. time, i.e., t :
(it )mn =
mn i(At )mn .
t
(89)
Let us further develop the analogy between the two cases, i.e., we rewrite Eq. (88) in the
following way, precisely as we rewrote (71) as (72). The details of the derivation is given
in Appendix D, which is in parallel with Appendix C, and the result is
d O
dt
N

m,n=1
(k, t)mn (k)

dk (k)zm
loc (k)
zn (k, t)
k

1
(k, t) (k)Fk t (k) + h.c. mn zn (k, t).
zm
2
20 See Eq. (55).
(90)
Following the prescription (17), we see that the parity polarization current carried by k
Bloch electron occupying the j th pseudospin state is given by,
(j )
dO
dt

(j ) loc (k) z(j ) (k)F
k t (k)z
(j ) + c.c. .
= z(j ) (k)z
(91)
After taking its summation over N pseudospin states and over filled k points, we finally
obtain a parity polarization current carried by the ground state,
pol
JO =
BZ

N
(j )

d k d O
d k
k t (k)
.
=
Tr (k)F
D
D
(2)
dt
(2)
j =1
(92)
BZ
The first term of Eq. (91) did not contribute to Eq. (92) due to a cancellation associated
with the T -invariance,21

d k
loc (k) = 0.
Tr (k)
(95)
D
(2)
k
BZ
On the contrary, the Berry phase contribution, i.e., Eq. (92) turns to be quite robust against
both I and T symmetries. In particular, in the I -invariant case, (k) is identical to
(k) up to the SU(N ) gauge transformation h,
(k) = h1 (k)h.
(96)
This can be shown explicitly using Eqs. (55), (56), (84), (53). Eqs. (85), (96) indicate

Tr (k)Fk t (k) = Tr (k)Fk t (k) .
(97)
Eq. (97) holds also true in the T invariant case, as is clear from Eqs. (93), (94). Eqs. (92),
(97) confirm our hypotheses that the parity polarization current is indeed robust against I
symmetry.
5.3.3. Quantum spin pump
Another possible direction to be explored is to study how to induce a spin polarization current by breaking both T-invariance and I-invariance. This scenario can be implemented [15] in a certain kind of quantum spin chains such as Cu-benzoate and Yb4As3. The ground state of these quantum magnets is known to be a quantum critical point (QCP), which is interpreted as a Dirac monopole, i.e., a source of the U(1) field strength $F_{kt}$. When this quantum system is driven around this QCP by applying an electric field E and/or magnetic
21 T -invariance relates N -fold degenerate Bloch functions at k with the ones at k up to a SU(N ) gauge
degree of freedom g as Eq. (64). This implies,

t
(k) = g 1 (k)g,
t

Fk t (k) = g 1 Fk t (k)g.
Eq. (93) justifies (95), whereas multiplying Eq. (93) with Eq. (94) one finds immediately (97).
(93)
(94)
field B, a spin polarization current can be induced. The electromagnetic fields break both the I-invariance and the T-invariance. They also induce a spin gap, so that the quantum critical point is located at the origin of the E-B plane. When the system goes adiabatically around this origin, a quantized number of spins is transported from one edge to the other through the system. This quantized value is a physical manifestation of the first Chern number associated with the QCP.
5.4. Fk x associated with the spatial inhomogeneity
Contrary to $F_{k_\mu k_\nu}$ and $F_{k_\mu t}$, the reciprocal field strength $F_{k_\mu x_\nu}$ does not seem to be related directly to a physical observable such as Hall type currents and polarization currents. However, when the system contains spatial inhomogeneity such as lattice defects, $F_{k_\mu x_\nu}$ appears and plays an important role in the dynamics of Bloch electrons around the defects [7]. Another possible application of $F_{k_\mu x_\nu}$ is to the electron transport properties around a magnetic domain wall, where the spatial modulation of ferromagnetic moments induces $F_{k_\mu x_\nu}$, which naturally influences the EOM for the electron wave packet through this Berry curvature.
6. Conclusions
We have derived and analyzed the semiclassical EOM for a wave packet of Bloch electrons, under perturbations slowly varying in space and in time. Their interpretation in terms
of non-Abelian gauge field in the reciprocal parameter space was the central issue of the
paper. The same type of EOM has been previously derived for the Abelian, i.e., U (1) case,
by using either (i) time-dependent variational principle [7] or (ii) path-integral method using Wannier basis [30]. We have generalized such EOM to a non-Abelian case by using
only the most fundamental principles of quantum mechanics.
The advantage of our formalism was that
(1) it was asymptotically exact in the framework of linear response theory, as a result
of systematic expansion w.r.t. the perturbation or x x ,
(2) it revealed that there are different types of gauge field of different physical origin,
(3) it was useful for developing symmetry analyses on various types of Berry phase
transport.
The first point refers to Eq. (19) and all the related analyses developed in Sections 3 and 4.
The relevance of our results in relation to other approaches was further discussed in Section 4.4. As for the second point, two different sources of gauge field have been revealed,
i.e., (i) projection onto a subspace spanned by N Bloch bands; (ii) Bloch basis moving in the course of time. The former is the origin of Fk k which is directly related to
spontaneous Hall currents of various degrees of freedom. The latter brings about Fk x
and Fk t , which play an important role in spatially and temporally inhomogeneous
systems.
Finally, concerning the last point in the above list, we have applied our formalism to the
analyses on the spin and orbital transport phenomena with the help of Boltzmann transport theory. The role of time reversal and space inversion symmetries in the appearance of
finite Hall/polarization currents has been extensively studied. The cancellation rules are
summarized in Tables 1 and 2. The concept of parity polarization current has also been
introduced, which may concretize Berry phase engineering in the context of orbital transport.
We leave further investigation of their application to domain-wall physics and to quantum
pumping for a future study. In conclusion, we believe that our analyses of the
non-Abelian gauge field will find applications in the near future in the context of
Berry phase engineering.
Note added
After completion of this work, we were informed of a related effort by D. Culcer, Y. Yao
and Q. Niu [40].
Acknowledgements
We would like to thank Shuichi Murakami and Naoto Nagaosa for introducing us to
this flourishing area of physics, as well as giving us a motivation to work on this problem.
We are also grateful to Dimitrie Culcer and Qian Niu for their comments and suggestions
on our paper. R.S. is a JSPS Postdoctoral Fellow. K.I. is supported by RIKEN as a Special
Postdoctoral Researcher.
Appendix A. Matrix element H mn (k) and x(1)
Matrix elements of the linearized Hamiltonian, i.e., Eqs. (35), (36), have been extensively used in Section 4.3. Let us recall those equations together with the matrix elements
of noncommutative coordinates, i.e., Eqs. (22), (26), (24), (27). Our purpose here is to
substitute the expression (36) into
x(1)
=i
N

(k) H , x mn (k)an (k),

dk am
(A.1)
m,n=1
(1)
an expression analogous to the second line of Eq. (32), and to rewrite x in terms of the
field strength [Fk k ]mn .
The first term of Eq. (36), i.e., eff (k, x , t) gives a standard velocity term when inserted
into Eq. (A.1). Since [Fk k ]mn is related to the commutator, [k , k ]mn or equivalently,
[x , x ]mn (k), one can easily imagine that the last two terms in Eq. (36) give, when
inserted into Eq. (A.1), a contribution related to [Fk k ]mn . One can indeed verify

loc
1 loc
, ik
k + k
2 x
x
mn

N

loc
i loc
=
[k , k ]mn +
, k
[k ]ln
2 x
x
ml
l=1

N

loc
loc
+
[k ]ml
, k
+ [k , k ]mn
x
x
ln
l=1

2
loc
2 loc
i loc
=
.
Fk k +
[k ]mn + [k ]mn
x
2 k x
k x
The second term of Eq. (36) gives, when inserted into the commutator,

loc (k, x , t)
2 loc
x , k
=
x .
x
k x
mn
(A.2)
(A.3)
Collecting the contribution (A.3) and the last two terms of Eq. (A.2), i.e., terms not related
(1)
to Fk k , one defines x introduced in Eq. (37):
x(1)

,m,n
dk

2 loc
(k, t) x (t) izm

(k, t)[k ]zn (k, t) .
k x
(A.4)
This term vanishes after k-integration with the help of the prescription given in Eq. (17).
Appendix B. Derivation of the EOM for z (t)
We demonstrate here the derivation of EOM for z (t), i.e., EOM describing the motion
of the internal pseudospin degree of freedom.
For that purpose we once again have to go back to
Eq. (29). After multiplying it with a weight (k, t), we integrate it over all the k-points,
to find,

z
1
(k, t) z(k, t) +
dk
2 t
t

d x
loc loc
Ax + At z
Ak +
= i dk (k, t) eff + x
x
x
dt

1
loc z
+
(B.1)
,
dk (k, t)
2
x k
where eff = loc 1 + as given in Eq. (37). In Eq. (B.1), repeated -indices should be
summed over = 1, . . . , D. In order to obtain Eq. (B.1), we also used,

1
loc
1
loc z
loc (k)
=
dk
(B.2)
( z) + z
.
dk
x k
2 k x
2
x k
We now substitute,

D

loc (k, x , t)
(k, t)
,
(k, t) =
t
k
x
=1
into Eq. (B.1), and then integrate by parts w.r.t. k . The result is,

z
loc z
+
dk (k, t)
x k t

d x
loc loc
= i dk eff + x
Ax + At z.
Ak +
x
x
dt
(B.3)
Finally, in order to rewrite Eq. (B.3) in the form of Eq. (43) and complete its derivation,
we adopt the prescription (17).
Appendix C. EOM for $O^{S}$ and spin Hall current
Our purpose here is to rewrite an EOM for Eq. (71) into its final form, i.e., Eq. (72), so
that we can express the spin Hall current as the following trace in the pseudospin space,
(k)]E
Tr[S (k)F
.
k k
Let us first recall the assumption we made in Eq. (57). This assumption allows us to
factorize $O^{S}(k, k')$ in Eq. (71) into a product of S and x ;
mn
S x + x S mn =
N

S ml x ln + x ml S ln .
l=1
Correspondingly, the commutator appearing in Eq. (71) can be decoupled into the following two types of commutators,

S 1

1
H , O
(C.1)
= H , S x + S H , x h.c.
2
2
The first term, together with its Hermitian conjugate, forms a commutator between H
and S , which constitutes the EOM for the spin, dS/dt. First, we show that this commutator vanishes. Since S is diagonal in k-space, it clearly commutes with H0 ;

H0 , S mn (k , k) = (k k) loc (k), S (k) mn = 0.
(C.2)
Therefore, the commutator between H and S becomes proportional to the covariant
derivative of S (k) w.r.t. k , i.e.,
D

H , S mn (k , k) = e
E x , S mn (k , k)
=1
= ie(k k)
D

=1

E k , S (k) mn .
(C.3)
Because the spin operator itself does not depend on the crystal momentum, the partial
derivative of S (k) w.r.t. k reduces to the commutator between iAk and S (k), i.e.,

un (k)

um (k)

S
S
u
S mn (k) =
(k)
+
u
(k)
n
m
k
k
k

N
um

ul ul |S |un + um |S |ul ul un .
=
(C.4)
k
k
l=1
In the second line k-dependence is not written explicitly. We used Eq. (57) between the
two lines. Then the covariant derivative of S (k) w.r.t. k appearing in Eq. (C.3) also
vanishes,

S mn (k) iAk , S (k) mn = 0.
k , S (k) mn =
k
(C.5)
Consequently, the first term and its Hermitian conjugate in Eq. (C.1) are indeed zero. On
the other hand, the second term in Eq. (C.1) contains the commutator between H and
x , which describes the EOM for x now. This term gives rise to a field strength Fk k
through the commutator between covariant derivatives w.r.t. different components of the
crystal momentum, i.e., [k , k ]. Namely,

D

E x , x (k , k)
H , x = H0 + e
=1
= (k k) loc (k)1 + e

E ik , ik
=1

D

loc
= i(k k)
1e
E Fk k (k) .
k

(C.6)
=1
Finally, substituting Eq. (C.1) together with Eqs. (C.3), (C.5) and (C.6) into Eq. (71), one
finds Eq. (72).
Appendix D. EOM for O and parity polarization current
In parallel with Appendix C, we rewrite below Eq. (88) into its final form, i.e., Eq. (90).
Let us first recall the assumption (58), which says that the parity operator has no matrix
element outside the N -fold degenerate band. This implies that O can be factorized into
a product of two N by N matrices, or x = x . Thanks to this factorization, the
commutator in Eq. (88) can be decomposed into two types of commutators as

Hloc it , O

1

1
Hloc it , x + h.c. + Hloc it , x + h.c. .
=
2
2
(D.1)
On the r.h.s., the first term is a commutator between Hloc it and , which constitutes the EOM of the parity under time-dependent perturbations: d/dt. Since the parity
operator itself is independent of time, we can prove that this commutator indeed vanishes,
in the same way as we did in Eq. (C.4),

Hloc it , = 0.
(D.2)
On the other hand, the second line of Eq. (D.1) is a commutator between H0 it
and x , which describes the EOM for x . This commutator gives rise to Fk t through a
commutation relation between two covariant derivatives, one w.r.t. time and the other w.r.t.
the momentum, [ik , it ] = iFk t . Thus the second line of Eq. (D.1) may be rewritten as

Hloc it , x mn (k , k) = i(k k) loc (k) it , ik mn

loc (k)

= i(k k)
mn (Fk t )mn .
(D.3)
k
Substituting Eq. (D.1) together with Eqs. (D.2), (D.3) into Eq. (88), one finds Eq. (90).
References
[1] P.A.M. Dirac, Proc. R. Soc. London 133 (1931) 60.
[2] G. 't Hooft, Nucl. Phys. B 79 (1974) 276;
A.M. Polyakov, JETP Lett. 20 (1974) 194.
[3] M. Onoda, N. Nagaosa, J. Phys. Soc. Jpn. 71 (2002) 19.
[4] Z. Fang, et al., Science 302 (2003) 92.
[5] See also, G.E. Volovik, JETP Lett. 46 (1987) 98, and references therein.
[6] M.C. Chang, Q. Niu, Phys. Rev. Lett. 75 (1995) 1348;
M.C. Chang, Q. Niu, Phys. Rev. B 53 (1996) 7010.
[7] G. Sundaram, Q. Niu, Phys. Rev. B 59 (1999) 14915.
[8] J.E. Hirsch, Phys. Rev. Lett. 83 (1999) 1834.
[9] S. Zhang, Phys. Rev. Lett. 85 (2000) 393.
[10] S. Murakami, N. Nagaosa, S.C. Zhang, Science 301 (2003) 1348;
S. Murakami, N. Nagaosa, S.C. Zhang, Phys. Rev. B 69 (2004) 235206.
[11] J. Sinova, et al., Phys. Rev. Lett. 92 (2004) 126603.
[12] D. Culcer, et al., Phys. Rev. Lett. 93 (2004) 046602.
[13] S. Datta, B. Das, Appl. Phys. Lett. 56 (1990) 665.
[14] D.J. Thouless, Phys. Rev. B 27 (1983) 6083;
Q. Niu, Phys. Rev. Lett. 64 (1990) 1812.
[15] R. Shindou, J. Phys. Soc. Jpn. 74 (2005) 1214.
[16] M.V. Berry, Proc. R. Soc. London, Ser. A 392 (1984) 45.
[17] C.A. Mead, Chem. Phys. 49 (1980) 23.
[18] R. Karplus, J.M. Luttinger, Phys. Rev. 95 (1954) 1154.
[19] J. Zak, Phys. Rev. B 15 (1977) 771;
J. Zak, Phys. Rev. B 16 (1977) 4154.
[20] E.N. Adams, E.I. Blount, Phys. Chem. Solids 10 (1959) 286;
E.I. Blount, Solid State Phys. 13 (1962) 305.
[21] D.J. Thouless, et al., Phys. Rev. Lett. 49 (1982) 405.
[22] R.D. King-Smith, D. Vanderbilt, Phys. Rev. B 47 (1993) 1651;
R. Resta, Europhys. Lett. 22 (1993) 133;
See also, R. Resta, Rev. Mod. Phys. 66 (1994) 899, and references therein.
[23] K. Ohgushi, S. Murakami, N. Nagaosa, Phys. Rev. B 62 (2000) 6065.
[24] R. Shindou, N. Nagaosa, Phys. Rev. Lett. 87 (2001) 116801.
[25] T. Jungwirth, Q. Niu, A.H. MacDonald, Phys. Rev. Lett. 88 (2002) 207208.
[26] Y. Yao, et al., Phys. Rev. Lett. 92 (2004) 037204.
[27] F. Wilczek, A. Zee, Phys. Rev. Lett. 52 (1984) 2111.
[28] J. Inoue, G.E.W. Bauer, L.W. Molenkamp, Phys. Rev. B 70 (2004) R041303.
[29] J. Schliemann, D. Loss, Phys. Rev. B 69 (2004) 165315.
[30] H. Koizumi, Y. Takada, Phys. Rev. B 65 (2002) 153104.
[31] A. Berard, H. Mohrbach, Phys. Rev. D 69 (2004) 127701.
[32] N. Seiberg, E. Witten, JHEP 9909 (1999) 32.
[33] A. Connes, M.R. Douglas, A. Schwarz, JHEP 9802 (1998) 3.
[34] N. Marzari, D. Vanderbilt, Phys. Rev. B 56 (1997) 12847.
[35] K. Ishikawa, T. Matsuyama, Z. Phys. C 33 (1986) 41;
K. Ishikawa, T. Matsuyama, Nucl. Phys. B 280 (1987) 523.
[36] S.C. Zhang, T.H. Hansson, S. Kivelson, Phys. Rev. Lett. 62 (1989) 82;
See also, S.C. Zhang, Int. J. Mod. Phys. 6 (1992) 25.
[37] G.E. Volovik, JETP 67 (1988) 1804;
J. Goryo, K. Ishikawa, Phys. Lett. A 260 (1999) 294;
A. Furusaki, M. Matsumoto, M. Sigrist, Phys. Rev. B 64 (2001) 054514.
[38] G.E. Volovik, V.M. Yakovenko, J. Phys.: Condens. Matter 1 (1989) 5263;
T. Senthil, J.B. Marston, M.P.A. Fisher, Phys. Rev. B 60 (1999) 4245.
[39] N. Sai, K.M. Rabe, D. Vanderbilt, Phys. Rev. B 66 (2002) 104108.
[40] D. Culcer, Y. Yao, Q. Niu, cond-mat/0411285, Phys. Rev. B, in press.
Comment on: "The cluster expansion for the self-gravitating gas and the thermodynamic limit"
[Nucl. Phys. B 711 (2005) 604]
Victor Laliena
Departamento de Física Teórica, Universidad de Zaragoza, Pedro Cerbuna 12, E-50009 Zaragoza, Spain
In a series of papers, de Vega and Sánchez claimed that the thermodynamic limit of a
self-gravitating system can be taken by letting the number of particles, $N$, and the volume,
$V$, tend to infinity keeping the ratio $N/V^{1/3}$ constant [1]. This limit, which I call diluted
following the terminology of the first paper of [1], is different from the usual thermodynamic limit, where the density $N/V$ is kept constant. The relevant variable for the diluted
limit, which can be found by naive dimensional analysis, is $\eta = G m^2 N/(V^{1/3} T)$.
Recently, I proved rigorously that the diluted limit does not give a well-defined thermodynamic limit [2]: the relevant thermodynamic potentials are not extensive and the
thermodynamic quantities suffer from the same problems as in the usual thermodynamic
limit. For instance, the free energy scales with $N^{5/3}$. In spite of this, in "The cluster expansion for the self-gravitating gas and the thermodynamic limit" [3], de Vega and Sánchez
continue arguing that the diluted limit gives extensive thermodynamic potentials and well
behaved thermodynamic quantities at sufficiently high temperature, i.e. for > c , where
c depends on the thermodynamic ensemble as well as on the geometry of the system
boundary. To defend their point, these authors try to show that "the statements made in
Ref. [7] [Ref. [2] of the present paper] have crucial failures which invalidate the conclusions given in Ref. [7]". However, the argument they subsequently give is based on a
misunderstanding of the proof of nonexistence of the diluted limit given in [2], and is easily
refuted, as will be seen in the following.
DOI of original article: 10.1016/j.nuclphysb.2004.12.022.

E-mail address: laliena@unizar.es (V. Laliena).
0550-3213/$ see front matter 2005 Published by Elsevier B.V.
doi:10.1016/j.nuclphysb.2005.05.022
V. Laliena / Nuclear Physics B 720 [FS] (2005) 439–442
Let us briefly recall the proof of nonexistence of the diluted limit given in [2].
Consider a system of $N$ classical particles, enclosed in a region of linear size $R$ (so that
$V = R^3$), interacting via a gravitational potential conveniently regularized at short distances. In the diluted limit we have $N \to \infty$, $R \to \infty$, with $N/R$ constant. The variable $\eta$
of Ref. [3] is, by definition,

$$\eta = \frac{G m^2 N}{R T}, \tag{1}$$

since $V = R^3$ is the volume of the region available for the system. Now, let us consider a
region of linear size $R_0$, with $N \sim R_0^3$, enclosed in the available space of the system. Note
that $R_0 \ll R$. Using a simple sequence of inequalities, it is proved in [2] that

$$Z_C \ge \frac{R_0^{3N}}{N!} \exp\left[\beta\alpha N(N-1)/R_0\right], \tag{2}$$

where $Z_C$ is the canonical partition function, $\beta = 1/T$ is the inverse of the temperature,
and $\alpha > 0$ is $G m^2$ times a geometrical number independent of $R_0$ if $R_0$ is large. The above
inequality shows that the free energy grows with $N$ at least as $N^{5/3}$, and therefore is not
extensive.
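As a quick numerical illustration of this scaling, the sketch below evaluates the logarithm of the right-hand side of inequality (2), $\ln(R_0^{3N}/N!) + \beta\alpha N(N-1)/R_0$, for $R_0 = c\,N^{1/3}$ and divides it by $N^{5/3}$. The function name `log_lower_bound`, the proportionality constant `c` and the default unit choice $\beta\alpha = 1$ are assumptions made only for this example; they are not taken from [2].

```python
import math


def log_lower_bound(n, beta_alpha=1.0, c=1.0):
    """Logarithm of the right-hand side of inequality (2).

    Assumes R0 = c * n**(1/3); `beta_alpha` stands for the product beta*alpha.
    Both `c` and `beta_alpha` are illustrative parameters.
    """
    r0 = c * n ** (1.0 / 3.0)
    # ln(R0^{3N} / N!), using lgamma(n + 1) = ln(n!)
    entropy_term = 3.0 * n * math.log(r0) - math.lgamma(n + 1.0)
    # beta * alpha * N (N - 1) / R0
    energy_term = beta_alpha * n * (n - 1.0) / r0
    return entropy_term + energy_term


for n in (10**3, 10**6, 10**9):
    # The ratio tends to beta_alpha / c, i.e. ln(bound) grows like N^(5/3).
    print(n, log_lower_bound(n) / n ** (5.0 / 3.0))
```

The first term is only of order $N$, so the ratio tends to $\beta\alpha/c$ as $N$ grows; this is the $N^{5/3}$ growth of the bound on $\ln Z_C$ referred to above.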
The authors of [3] argue that since to derive inequality (2) I have introduced another
length $R_0$, the relevant variable is $\eta = G m^2 N/(T R_0)$, and, since $R_0 \sim N^{1/3}$, we have
$\eta \gg \eta_c$, so that the system is deep in the collapse phase, where the results of [1,3] do not apply.
Obviously, this is wrong. $R_0$ is not a characteristic length of the system. It is an auxiliary
mathematical length, introduced just to prove that the diluted limit is ill-behaved. It has no
physical meaning, and hence it is left unspecified. It can have any value, the only restriction
being that it must scale with the number of particles as $R_0 \sim N^{1/3}$. By definition, the length
entering in equation (1) of Ref. [3] is the linear size of the spatial region available to the
system. Hence, $\eta = G m^2 N/(T R)$. Only in such a case can the integrals over the coordinates that
appear in equations (2.9), (2.15), etc., of [3] be taken between 0 and 1. Inequality (2)
can be written as

$$Z_C \ge \frac{R_0^{3N}}{N!} \exp\left[\alpha\eta (N-1) R/R_0\right], \tag{3}$$

where $\alpha > 0$ is now a dimensionless purely geometrical number independent of the size
of the system. Since $R/R_0 \sim N^{2/3}$, we see that the free energy scales as $N^{5/3}$ if $\eta$ is kept
constant, and therefore is non-extensive.
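The step from inequality (2) to inequality (3) is a change of variables, and it can be checked numerically. The sketch below assumes that the $\alpha$ of (2) is $G m^2$ times a geometrical number `geom`, that $\eta = G m^2 N/(R T)$ as in (1), and that the same `geom` plays the role of the dimensionless $\alpha$ in (3). All function and parameter names are hypothetical and chosen only for this illustration.

```python
import math


def exponent_eq2(n, r0, temperature, g_m2, geom):
    """Exponent of inequality (2): beta * alpha * N (N - 1) / R0."""
    beta = 1.0 / temperature
    alpha = g_m2 * geom  # assumed form: G m^2 times a geometrical number
    return beta * alpha * n * (n - 1.0) / r0


def exponent_eq3(n, r, r0, temperature, g_m2, geom):
    """Exponent of inequality (3): alpha * eta * (N - 1) * R / R0."""
    eta = g_m2 * n / (r * temperature)  # definition (1)
    return geom * eta * (n - 1.0) * r / r0


# Diluted-limit sizes: N / R fixed (here 0.5) and R0 = N^(1/3).
n = 1.0e6
r = n / 0.5
r0 = n ** (1.0 / 3.0)
e2 = exponent_eq2(n, r0, temperature=2.0, g_m2=3.0, geom=0.7)
e3 = exponent_eq3(n, r, r0, temperature=2.0, g_m2=3.0, geom=0.7)
print(math.isclose(e2, e3, rel_tol=1e-12))
print((r / r0) / n ** (2.0 / 3.0))  # stays constant as N grows at fixed N / R
```

With $N/R$ held fixed, $\eta$ does not change with $N$, while the factor $(N-1)R/R_0$ grows as $N^{5/3}$.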
The authors correctly said in section six of [3] that "It is clearly true that the partition
function sums over all configurations including collapsed situations." What I showed in
[2], by considering configurations for which all particles are contained in a region of size
$R_0$, is that the collapsed configurations indeed dominate the partition function in the thermodynamic limit, even in the diluted case. Therefore, the next sentence in section six of
[3], "However, this collapsed configurations have a negligible weight for $\eta < \eta_0 \sim N/R_0$",
is obviously false. It is just the opposite: collapsed configurations dominate the partition
function for any $\eta$ as $N \to \infty$ with $N/V^{1/3}$ fixed. Hence, self-gravitating objects which
are apparently stable in nature cannot be mathematically described by the diluted thermodynamic limit of the partition function, but rather by metastable states which are not yet
collapsed [5].
It is clear that the grand canonical partition function cannot scale as exp[Ng(, )] if
the canonical partition function grows with N faster than exp[Nf ()]. Hence, the grand
canonical ensemble cannot give an extensive thermodynamic potential, either.
Inequalities (2) or (3) are valid whatever the value of the temperature (i.e., of $\eta$). Hence,
they prove that the gas phase obtained in Refs. [1,3] does not exist in the diluted limit. For
a finite system, we expect a gas phase at high temperature (small $\eta$) and a collapse phase at
low temperature (large $\eta$), with the two phases separated by a phase transition or crossover
at some $\eta_c$. As the system size grows, $\eta_c$ will decrease toward zero, the gas phase will
shrink and the collapse phase will eventually cover the whole phase diagram.
In [3] it is shown that the diluted limit exists order by order in the cluster expansion
(a similar proof was given in [2] for the high temperature expansion). This contradicts
the proof of nonexistence of the diluted limit. Since inequalities (2) or (3) are derived
rigorously, with no assumption, the high temperature and cluster expansions cannot be
valid. The reasons why this kind of expansion fails have been analyzed in [2]. Basically,
there are two possibilities:
(i) The series in $\eta/N$ may not converge in the $N \to \infty$ limit, due to the contribution
of high order diagrams that are naively suppressed by powers of $1/N$, but which actually
may give a significant contribution due to the short distance divergences, since the dimensionless cut-off behaves as $a = A/R$, where $A$ is a fixed length [cf. Eq. (2.8) of [3]].
Hence, the infrared divergences which occur in the $R \to \infty$ limit become ultraviolet divergences, as $a \to 0$, in the dimensionless integrals. Concerning this point, the statement
that appears at the end of the introduction of the paper [3], "one can take the limit $N \to \infty$
and then $a \to 0$", probably does not hold due to the singularities of the integrals that give
the coefficients of the cluster expansion. The rigorous limit is, obviously, $N \to \infty$, $a \to 0$,
with $N a$ fixed. The modification of the procedure to take the $N \to \infty$ limit may be at the
core of the failure of the cluster expansion developed in [3].
(ii) Even if the cluster expansion were convergent, the series could not represent the
thermodynamic potential, since it is in contradiction with the rigorous result of [2]. In
deriving the cluster expansion there is at least one mathematically unjustified exchange of
limits that may invalidate the equality between the thermodynamic potential and the cluster
series. The cluster expansion (I use the notation of [3]),

fij +
fij fkl + ,
QN () = 1 +
(4)
is rigorous, since the number of terms in the sums is finite for finite $N$. To proceed further,
the authors of [3] expand $f_{ij}$ in powers of $\eta/N$, and exchange the sum and integral. Mathematically, it can be very difficult to analyze the conditions under which this exchange of
limits is allowed, but we can get some insight from physical intuition. Inequality (3) implies that the system is collapsed for large $N$ if $\eta$ is of order 1 with respect to $1/N$. This means
that the cluster expansion is dominated by the higher order terms (many particles within a
cluster). Hence, the canonical (Gibbs) integration measure on configuration space is very
concentrated on collapsed configurations. This measure is therefore very different from the
flat measure $\prod_i d^3 r_i$ that corresponds to a gas. Keeping only the dominant term of the expansion
of $f_{ij}$ in powers of $\eta/N$ means that one is effectively using the flat measure. To recover
something similar to the concentrated true measure, one has to sum an infinite number of
terms in $\eta/N$. In other words, the expansion in $\eta/N$ is valid only in the gas phase. Inequality (3) suggests that the gas phase can only take place for $\eta$ of the order of $N^{-2/3}$. Hence,
the radius of convergence of the expansion in $\eta$ shrinks to zero as $N \to \infty$.
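The $N^{-2/3}$ estimate can be illustrated with a rough criterion, which is an assumption of this sketch and not a statement taken from [2] or [3]: the collapsed configurations stop dominating roughly when the exponent of inequality (3), $\alpha\eta(N-1)R/R_0$, is no larger than order $N$. Solving for $\eta$ with $R = N/\lambda$ and $R_0 = N^{1/3}$ gives the scaling. The names `eta_crossover`, `lam` and `geom` are hypothetical.

```python
def eta_crossover(n, lam=1.0, geom=1.0):
    """Value of eta at which the exponent of inequality (3) equals N.

    Assumes the diluted-limit sizes R = n / lam and R0 = n**(1/3), where
    `lam` is the fixed ratio N/R and `geom` is the geometrical number alpha.
    Both parameters are illustrative.
    """
    r = n / lam
    r0 = n ** (1.0 / 3.0)
    # Solve geom * eta * (n - 1) * r / r0 = n for eta.
    return n * r0 / (geom * (n - 1.0) * r)


for n in (1.0e3, 1.0e6, 1.0e9):
    # eta_crossover * N^(2/3) tends to lam / geom, so eta scales like N^(-2/3).
    print(n, eta_crossover(n) * n ** (2.0 / 3.0))
```

Under this criterion the window of $\eta$ in which a gas-like description could hold closes as $N^{-2/3}$, in line with the shrinking radius of convergence described above.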
Similar statements claiming the existence of the diluted thermodynamic limit have been
made in [4] without even mentioning the results of Ref. [2].
References
[1] H.J. de Vega, N. Sánchez, Phys. Lett. B 490 (2000) 180;
H.J. de Vega, N. Sánchez, Nucl. Phys. B 625 (2002) 409;
H.J. de Vega, N. Sánchez, Phys. Lett. B 625 (2002) 460.
[2] V. Laliena, Nucl. Phys. B 668 (2003) 403, astro-ph/0303301.
[3] H.J. de Vega, N.G. Sánchez, Nucl. Phys. B 711 (2005) 604, astro-ph/0307318.
[4] H.J. de Vega, J.A. Siebert, Nucl. Phys. B 707 (2005) 529, astro-ph/0305322.
[5] P.H. Chavanis, M. Rieutord, Astron. Astrophys. 412 (2003) 1, astro-ph/0302594;
P.H. Chavanis, Astron. Astrophys. 432 (2005) 117, astro-ph/0404251.
CUMULATIVE AUTHOR INDEX B711–B720
Abramowicz, H.    B713 (2005) 3
Abramowicz, H.    B718 (2005) 3
Abt, I.    B713 (2005) 3
Abt, I.    B718 (2005) 3
Adamczyk, L.    B713 (2005) 3
Adamczyk, L.    B718 (2005) 3
Adamus, M.    B713 (2005) 3
Adamus, M.    B718 (2005) 3
Adler, V.    B713 (2005) 3
Adler, V.    B718 (2005) 3
Aganagic, M.    B715 (2005) 304
Agashe, K.    B719 (2005) 165
Aghuzumtsyan, G.    B713 (2005) 3
Aghuzumtsyan, G.    B718 (2005) 3
Aguilar-Saavedra, J.A.    B717 (2005) 119
Ahn, C.    B714 (2005) 307
Akemann, G.    B712 (2005) 287
Allfrey, P.D.    B713 (2005) 3
Allfrey, P.D.    B718 (2005) 3
ALPHA Collaboration    B713 (2005) 378
Altarelli, G.    B720 (2005) 64
Andreotti, M.    B717 (2005) 34
Antoniadis, I.    B715 (2005) 120
Antoniadis, I.    B716 (2005) 3
Antonioli, P.    B713 (2005) 3
Antonioli, P.    B718 (2005) 3
Antonov, A.    B713 (2005) 3
Antonov, A.    B718 (2005) 3
Aoki, S.    B713 (2005) 407
Aoki, S.    B718 (2005) 35
Aquila, V.    B719 (2005) 77
Arneodo, M.    B713 (2005) 3
Arneodo, M.    B718 (2005) 3
Artamonov, A.    B718 (2005) 35
Ashoorioon, A.    B716 (2005) 261
Aulakh, C.S.    B711 (2005) 275
0550-3213/2005 Published by Elsevier B.V.

doi:10.1016/S0550-3213(05)00523-7
Babu, K.S.    B720 (2005) 47
Bagnasco, S.    B717 (2005) 34
Bailey, D.S.    B713 (2005) 3
Bailey, D.S.    B718 (2005) 3
Bajnok, Z.    B714 (2005) 307
Bajnok, Z.    B716 (2005) 519
Bak, D.    B712 (2005) 115
Baldini, W.    B717 (2005) 34
Balog, J.    B714 (2005) 256
Bamberger, A.    B713 (2005) 3
Bamberger, A.    B718 (2005) 3
Barakbaev, A.N.    B713 (2005) 3
Barakbaev, A.N.    B718 (2005) 3
Barbagli, G.    B713 (2005) 3
Barbagli, G.    B718 (2005) 3
Barbi, M.    B713 (2005) 3
Barbieri, A.    B719 (2005) 53
Barbuto, E.    B718 (2005) 35
Bardakci, K.    B715 (2005) 141
Bari, G.    B713 (2005) 3
Bari, G.    B718 (2005) 3
Barnes, E.    B716 (2005) 33
Barreiro, F.    B713 (2005) 3
Barreiro, F.    B718 (2005) 3
Bartsch, D.    B713 (2005) 3
Bartsch, D.    B718 (2005) 3
Baseilhac, P.    B720 (2005) 325
Basile, M.    B713 (2005) 3
Basile, M.    B718 (2005) 3
Basu, A.    B713 (2005) 136
Bazzocchi, F.    B715 (2005) 372
Becker, K.    B715 (2005) 349
Becker, M.    B715 (2005) 349
Bedford, J.    B712 (2005) 59
Behrens, U.    B713 (2005) 3
Behrens, U.    B718 (2005) 3
Beisert, N.    B715 (2005) 190
Beisert, N.    B717 (2005) 137
Bell, M.A.    B713 (2005) 3
Bell, M.A.    B718 (2005) 3
Bellagamba, L.    B713 (2005) 3
Bellagamba, L.    B718 (2005) 3
Bellan, P.    B713 (2005) 3
Bellan, P.    B718 (2005) 3
Beneke, M.    B714 (2005) 67
Benen, A.    B713 (2005) 3
Benen, A.    B718 (2005) 3
Berche, B.    B719 (2005) 275
Berche, P.-E.    B719 (2005) 275
Bernabéu, J.    B716 (2005) 352
Bernreuther, W.    B712 (2005) 229
Bertolin, A.    B713 (2005) 3
Bertolin, A.    B718 (2005) 3
Bettoni, D.    B717 (2005) 34
Bhadra, S.    B713 (2005) 3
Bhadra, S.    B718 (2005) 3
Binosi, D.    B716 (2005) 352
Bloch, I.    B713 (2005) 3
Bloch, I.    B718 (2005) 3
Blumenhagen, R.    B713 (2005) 83
Blümlein, J.    B716 (2005) 128
Bobeth, C.    B713 (2005) 522
Boels, R.    B715 (2005) 234
Bołd, T.    B713 (2005) 3
Bołd, T.    B718 (2005) 3
Bolognesi, S.    B718 (2005) 134
Bolognesi, S.    B719 (2005) 67
Bonciani, R.    B712 (2005) 229
Bonciani, R.    B716 (2005) 280
Bonora, L.    B715 (2005) 413
Boos, E.E.    B717 (2005) 19
Boos, E.G.    B713 (2005) 3
Boos, E.G.    B718 (2005) 3
Boos, H.E.    B712 (2005) 573
Borras, K.    B713 (2005) 3
Borras, K.    B718 (2005) 3
Borreani, G.    B717 (2005) 34
Boscherini, D.    B713 (2005) 3
Boscherini, D.    B718 (2005) 3
Boughezal, R.    B713 (2005) 278
Bozza, C.    B718 (2005) 35
Brandhuber, A.    B712 (2005) 59
Britto, R.    B715 (2005) 499
Brock, I.    B713 (2005) 3
Brock, I.    B718 (2005) 3
Brook, N.H.    B713 (2005) 3
Brook, N.H.    B718 (2005) 3
Brownson, E.    B718 (2005) 3
Brugnera, R.    B713 (2005) 3
Brugnera, R.    B718 (2005) 3
Brümmer, N.    B713 (2005) 3
Brümmer, N.    B718 (2005) 3
Bruni, A.    B713 (2005) 3
Bruni, A.    B718 (2005) 3
Bruni, G.    B713 (2005) 3
Bruni, G.    B718 (2005) 3
Bruski, N.    B718 (2005) 35
Buchbinder, E.I.    B711 (2005) 314
Buchbinder, I.L.    B711 (2005) 367
Buchmüller, W.    B712 (2005) 139
Buontempo, S.    B718 (2005) 35
Buras, A.J.    B713 (2005) 522
Buras, A.J.    B714 (2005) 103
Buras, A.J.    B716 (2005) 173
Bussey, P.J.    B713 (2005) 3
Bussey, P.J.    B718 (2005) 3
Butterworth, J.M.    B713 (2005) 3
Butterworth, J.M.    B718 (2005) 3
Büttner, C.    B713 (2005) 3
Büttner, C.    B718 (2005) 3
Buzzo, A.    B717 (2005) 34
Bylsma, B.    B713 (2005) 3
Bylsma, B.    B718 (2005) 3
Cachazo, F.    B715 (2005) 499
Calabrese, R.    B717 (2005) 34
Caldwell, A.    B713 (2005) 3
Caldwell, A.    B718 (2005) 3
Capua, M.    B713 (2005) 3
Capua, M.    B718 (2005) 3
Cara Romeo, G.    B713 (2005) 3
Cara Romeo, G.    B718 (2005) 3
Carena, M.    B716 (2005) 319
Carli, T.    B713 (2005) 3
Carli, T.    B718 (2005) 3
Carlin, R.    B713 (2005) 3
Carlin, R.    B718 (2005) 3
Cassel, D.G.    B713 (2005) 3
Catanesi, M.G.    B718 (2005) 35
Catterall, C.D.    B713 (2005) 3
Catterall, C.D.    B718 (2005) 3
Cester, R.    B717 (2005) 34
Chang, J.-F.    B712 (2005) 347
Chankowski, P.H.    B713 (2005) 555
Chankowski, P.H.    B717 (2005) 190
Chatelain, C.    B719 (2005) 275
Chaudhuri, S.    B719 (2005) 188
Chekanov, S.    B713 (2005) 3
Chekanov, S.    B718 (2005) 3
Chikawa, M.    B718 (2005) 35
Choi, K.    B718 (2005) 113
Choi, S.Y.    B711 (2005) 83
Chong, Z.-W.    B717 (2005) 246
CHORUS Collaboration    B718 (2005) 35
Chwastowski, J.    B713 (2005) 3
Chwastowski, J.    B718 (2005) 3
Cibinetto, G.    B717 (2005) 34
Ciborowski, J.    B713 (2005) 3
Ciborowski, J.    B718 (2005) 3
Ciesielski, R.    B713 (2005) 3
Ciesielski, R.    B718 (2005) 3
Cifarelli, L.    B713 (2005) 3
Cifarelli, L.    B718 (2005) 3
Cindolo, F.    B713 (2005) 3
Cindolo, F.    B718 (2005) 3
Cirelli, M.    B719 (2005) 219
Cocco, A.G.    B718 (2005) 35
Cole, J.E.    B713 (2005) 3
Cole, J.E.    B718 (2005) 3
Collins-Tooth, C.    B713 (2005) 3
Collins-Tooth, C.    B718 (2005) 3
Contin, A.    B713 (2005) 3
Contin, A.    B718 (2005) 3
Contino, R.    B719 (2005) 165
Cooper-Sarkar, A.M.    B713 (2005) 3
Cooper-Sarkar, A.M.    B718 (2005) 3
Coppola, N.    B713 (2005) 3
Coppola, N.    B718 (2005) 3
Corcella, G.    B713 (2005) 609
Corradi, M.    B713 (2005) 3
Corradi, M.    B718 (2005) 3
Corriveau, F.    B713 (2005) 3
Corriveau, F.    B718 (2005) 3
Costa, M.    B713 (2005) 3
Costa, M.    B718 (2005) 3
Cotaescu, I.I.    B719 (2005) 140
Cottrell, A.    B713 (2005) 3
Cottrell, A.    B718 (2005) 3
Cui, Y.    B713 (2005) 3
Cui, Y.    B718 (2005) 3
Cvetic, M.    B717 (2005) 246
D'Agostini, G.    B713 (2005) 3
D'Agostini, G.    B718 (2005) 3
Dal Corso, F.    B713 (2005) 3
Dal Corso, F.    B718 (2005) 3
Dall'Agata, G.    B717 (2005) 223
Dalpiaz, P.    B717 (2005) 34
D'Ambrosio, N.    B718 (2005) 35
Dams, C.    B716 (2005) 421
Danielson, T.    B718 (2005) 3
Danilov, P.    B713 (2005) 3
D'Appollonio, G.    B712 (2005) 433
de Boer, J.    B715 (2005) 234
de Favereau, J.    B713 (2005) 3
de Favereau, J.    B718 (2005) 3
de Jong, M.    B718 (2005) 35
Delbar, T.    B718 (2005) 35
De Lellis, G.    B718 (2005) 35
Della Morte, M.    B713 (2005) 378
del Peso, J.    B713 (2005) 3
del Peso, J.    B718 (2005) 3
Dementiev, R.K.    B713 (2005) 3
Dementiev, R.K.    B718 (2005) 3
Denner, A.    B717 (2005) 48
De Pasquale, S.    B713 (2005) 3
De Pasquale, S.    B718 (2005) 3
Derendinger, J.-P.    B715 (2005) 211
De Rosa, G.    B718 (2005) 35
Derrick, M.    B713 (2005) 3
Derrick, M.    B718 (2005) 3
de Vega, H.J.    B711 (2005) 604
Devenish, R.C.E.    B713 (2005) 3
Devenish, R.C.E.    B718 (2005) 3
de Wit, B.    B716 (2005) 215
de Wolf, E.    B713 (2005) 3
Dhawan, S.    B713 (2005) 3
Dhawan, S.    B718 (2005) 3
D'Hoker, E.    B715 (2005) 3
D'Hoker, E.    B715 (2005) 91
Di Capua, E.    B718 (2005) 35
Di Capua, F.    B718 (2005) 35
Dimopoulos, S.    B715 (2005) 120
Dinsdale, M.J.    B713 (2005) 465
Dobashi, S.    B711 (2005) 3
Dobashi, S.    B711 (2005) 54
Dobur, D.    B713 (2005) 3
Dobur, D.    B718 (2005) 3
Dolan, L.    B717 (2005) 361
Dolgoshein, B.A.    B713 (2005) 3
Dolgoshein, B.A.    B718 (2005) 3
Dore, U.    B718 (2005) 35
Doyle, A.T.    B713 (2005) 3
Doyle, A.T.    B718 (2005) 3
Drews, G.    B713 (2005) 3
Drews, G.    B718 (2005) 3
Dudas, E.    B716 (2005) 65
Duncan, A.    B714 (2005) 256
Duncan, A.    B720 (2005) 235
Dunne, W.    B718 (2005) 3
Durkin, L.S.    B713 (2005) 3
Durkin, L.S.    B718 (2005) 3
Dusini, S.    B713 (2005) 3
Dusini, S.    B718 (2005) 3
Eden, B.    B712 (2005) 157
Eisenberg, Y.    B713 (2005) 3
Eisenberg, Y.    B718 (2005) 3
Ellis, J.    B718 (2005) 247
Enkhbat, Ts.    B720 (2005) 47
Ermolov, P.F.    B713 (2005) 3
Ermolov, P.F.    B718 (2005) 3
Eskreys, A.    B713 (2005) 3
Eskreys, A.    B718 (2005) 3
Essler, F.H.L.    B712 (2005) 513
Evans, J.M.    B717 (2005) 327
Everett, A.    B713 (2005) 3
Everett, A.    B718 (2005) 3
Ewerth, T.    B713 (2005) 522
Ewerth, T.    B714 (2005) 103
Fabbrichesi, M.    B715 (2005) 372
Falkowski, A.    B718 (2005) 113
Favart, D.    B718 (2005) 35
Fazio, S.    B718 (2005) 3
Fehér, L.    B715 (2005) 713
Feng, B.    B715 (2005) 499
Fermilab E835 Collaboration    B717 (2005) 34
Ferrando, J.    B713 (2005) 3
Ferrando, J.    B718 (2005) 3
Ferrara, S.    B717 (2005) 223
Ferrero, M.I.    B713 (2005) 3
Ferrero, M.I.    B718 (2005) 3
Ferretti, G.    B717 (2005) 137
Ferroglia, A.    B716 (2005) 280
Feruglio, F.    B720 (2005) 64
Fichera, D.    B720 (2005) 307
Figiel, J.    B713 (2005) 3
Figiel, J.    B718 (2005) 3
Fiorillo, G.    B718 (2005) 35
Foster, B.    B713 (2005) 3
Foster, B.    B718 (2005) 3
Foudas, C.    B713 (2005) 3
Foudas, C.    B718 (2005) 3
Fourletov, S.    B713 (2005) 3
Fourletov, S.    B718 (2005) 3
Fourletova, J.    B713 (2005) 3
Fourletova, J.    B718 (2005) 3
Frahm, H.    B712 (2005) 513
Frekers, D.    B718 (2005) 35
Frezzotti, R.    B713 (2005) 378
Fry, C.    B713 (2005) 3
Fry, C.    B718 (2005) 3
Fuchs, J.    B715 (2005) 539
Fülöp, T.    B715 (2005) 713
Gabareen, A.    B713 (2005) 3
Gabareen, A.    B718 (2005) 3
Gaillard, M.K.    B713 (2005) 607
Galas, A.    B713 (2005) 3
Galas, A.    B718 (2005) 3
Gallo, E.    B713 (2005) 3
Gallo, E.    B718 (2005) 3
Gambino, P.    B719 (2005) 77
Garfagnini, A.    B713 (2005) 3
Garfagnini, A.    B718 (2005) 3
Garzoglio, G.    B717 (2005) 34
Gattnar, J.    B716 (2005) 105
Gattringer, C.    B716 (2005) 105
Gehrmann, T.    B712 (2005) 229
Geiser, A.    B713 (2005) 3
Geiser, A.    B718 (2005) 3
Genta, C.    B713 (2005) 3
Genta, C.    B718 (2005) 3
Ghodsi, A.    B714 (2005) 30
Gialas, I.    B713 (2005) 3
Gialas, I.    B718 (2005) 3
Giedt, J.    B713 (2005) 607
Giombi, S.    B719 (2005) 234
Girdhar, A.    B711 (2005) 275
Giuliano, D.    B711 (2005) 480
Giusti, P.    B713 (2005) 3
Giusti, P.    B718 (2005) 3
Giveon, A.    B719 (2005) 3
Gladilin, L.K.    B713 (2005) 3
Gladilin, L.K.    B718 (2005) 3
Gladkov, D.    B713 (2005) 3
Gladkov, D.    B718 (2005) 3
Glasman, C.    B713 (2005) 3
Glasman, C.    B718 (2005) 3
Gliozzi, F.    B714 (2005) 91
Gliozzi, F.    B719 (2005) 255
Gmeiner, F.    B713 (2005) 83
Göckeler, M.    B717 (2005) 304
Goebel, F.    B713 (2005) 3
Goers, S.    B713 (2005) 3
Goers, S.    B718 (2005) 3
Goldberg, J.    B718 (2005) 35
Gollwitzer, K.E.    B717 (2005) 34
Gomes, J.F.    B714 (2005) 179
Gómez-Reino, M.    B713 (2005) 263
Gonçalo, R.    B713 (2005) 3
Gonçalo, R.    B718 (2005) 3
González, O.    B713 (2005) 3
Gonzalez-Garcia, M.C.    B719 (2005) 219
Gorbahn, M.    B713 (2005) 291
Gorbunov, P.    B718 (2005) 35
Gorkavenko, V.M.    B714 (2005) 217
Gorsky, A.    B718 (2005) 293
Gosau, T.    B713 (2005) 3
Gosau, T.    B718 (2005) 3
Göttlicher, P.    B713 (2005) 3
Göttlicher, P.    B718 (2005) 3
Grabowska-Bołd, I.    B713 (2005) 3
Grabowska-Bołd, I.    B718 (2005) 3
Graciani Diaz, R.    B713 (2005) 3
Grady, M.    B713 (2005) 204
Graham, M.    B717 (2005) 34
Grégoire, G.    B718 (2005) 35
Gregoire, T.    B720 (2005) 3
Grella, G.    B718 (2005) 35
Grigorescu, G.    B713 (2005) 3
Grigorescu, G.    B718 (2005) 3
Grijpink, S.    B713 (2005) 3
Grimm, T.W.    B718 (2005) 153
Grimus, W.    B713 (2005) 151
Grinza, P.    B714 (2005) 357
Grinza, P.    B718 (2005) 394
Groys, M.    B713 (2005) 3
Grzelak, G.    B713 (2005) 3
Grzelak, G.    B718 (2005) 3
Guadagnini, E.    B719 (2005) 53
Güler, M.    B718 (2005) 35
Gutbrod, F.    B720 (2005) 116
Gutsche, O.    B713 (2005) 3
Gutsche, O.    B718 (2005) 3
Gwenlan, C.    B713 (2005) 3
Gwenlan, C.    B718 (2005) 3
Haas, T.
Haas, T.
Häfliger, P.
Hain, W.
Hain, W.
Haisch, U.
Hall-Wilton, R.
Hall-Wilton, R.
Hamaguchi, K.
Hamatsu, R.
Hamatsu, R.
Hamilton, J.
Hamilton, J.
Hanlon, S.
Hara, T.
Hart, J.C.
Hart, J.C.
Hartmann, H.
Hartmann, H.
Hartner, G.
Hartner, G.
Harvey, J.A.
Hatanaka, T.
Heaphy, E.A.
Heaphy, E.A.
Heath, G.P.
Heath, G.P.
Hebecker, A.
Hebecker, A.
Heinesch, R.
Heise, R.
Heitger, J.
Helbich, M.
Helbich, M.
Hilger, E.
B713 (2005) 3
B718 (2005) 3
B719 (2005) 35
B713 (2005) 3
B718 (2005) 3
B713 (2005) 291
B713 (2005) 3
B718 (2005) 3
B712 (2005) 139
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 35
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 136
B716 (2005) 88
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 173
B720 (2005) 211
B712 (2005) 229
B717 (2005) 137
B713 (2005) 378
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
Hilger, E.
Hochman, D.
Hochman, D.
Holm, U.
Holm, U.
Honecker, G.
Honkonen, J.
Horn, C.
Horn, C.
Horsley, R.
Horsley, R.
Horváth, I.
Horváthy, P.A.
Hoshino, K.
Hristova, I.R.
Hu, M.
Huang, Q.-G.
Hung, P.Q.
Hung, P.Q.
Hyakutake, Y.
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 83
B714 (2005) 292
B713 (2005) 3
B718 (2005) 3
B713 (2005) 601
B717 (2005) 304
B714 (2005) 175
B714 (2005) 269
B718 (2005) 35
B718 (2005) 35
B717 (2005) 34
B713 (2005) 219
B712 (2005) 325
B720 (2005) 89
B712 (2005) 115
Iacobucci, G.
Iacobucci, G.
Ibarra, A.
Iga, Y.
Iga, Y.
Imura, K.-I.
Intriligator, K.
Irges, N.
Irrgang, P.
Irrgang, P.
Isidori, G.
B713 (2005) 3
B718 (2005) 3
B715 (2005) 523
B713 (2005) 3
B718 (2005) 3
B720 (2005) 399
B716 (2005) 33
B719 (2005) 121
B713 (2005) 3
B718 (2005) 3
B718 (2005) 319
Jacobsen, J.L.
Jäger, S.
Jakob, H.-P.
Jakob, H.-P.
Janke, W.
Janke, W.
Janssen, B.
Janssen, B.
Jarczak, C.
Jimenez, M.
Jimenez, M.
Jockers, H.
Joffe, D.
Jones, T.W.
Jones, T.W.
Joshipura, A.S.
Jung, E.
B716 (2005) 439

B714 (2005) 103
B713 (2005) 3
B718 (2005) 3
B719 (2005) 275
B719 (2005) 312
B711 (2005) 392
B712 (2005) 371
B712 (2005) 157
B713 (2005) 3
B718 (2005) 3
B718 (2005) 203
B717 (2005) 34
B713 (2005) 3
B718 (2005) 3
B713 (2005) 151
B717 (2005) 272
Kagawa, S.
Kagawa, S.
Kahle, B.
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
Kahle, B.
Kaji, H.
Kaji, H.
Kalinin, S.
Kalinowski, J.
Kalmykov, M.Yu.
Kananov, S.
Kananov, S.
Kaneko, S.
Karshon, U.
Karshon, U.
Karstens, F.
Karstens, F.
Kasemann, M.
Kasper, J.
Kataoka, M.
Kataoka, M.
Katkov, I.I.
Katkov, I.I.
Kawada, J.
Kawai, H.
Kawamura, T.
Kayis-Topaksu, A.
Kçira, D.
Kçira, D.
Keramidas, A.
Keramidas, A.
Ketov, S.V.
Khein, L.A.
Khein, L.A.
Khovansky, V.
Kim, J.Y.
Kim, J.Y.
Kim, S.
Kimura, T.
Kind, O.
Kind, O.M.
Kiritsis, E.
Kisielewska, D.
Kisielewska, D.
Kitamura, S.
Kitamura, S.
Kitanine, N.
Kitazawa, Y.
Kiyo, Y.
Klasen, M.
Kleiss, R.
Knechtli, F.
Kniehl, B.A.
Kniehl, B.A.
Kniehl, B.A.
Kobayashi, Y.
Koch, F.
Kodama, K.
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B718 (2005) 35
B713 (2005) 555
B718 (2005) 276
B713 (2005) 3
B718 (2005) 3
B713 (2005) 151
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B717 (2005) 34
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B718 (2005) 35
B711 (2005) 253
B718 (2005) 35
B718 (2005) 35
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B716 (2005) 88
B713 (2005) 3
B718 (2005) 3
B718 (2005) 35
B713 (2005) 3
B718 (2005) 3
B712 (2005) 115
B711 (2005) 163
B713 (2005) 3
B718 (2005) 3
B712 (2005) 433
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B712 (2005) 600
B715 (2005) 665
B714 (2005) 67
B713 (2005) 487
B716 (2005) 421
B719 (2005) 121
B711 (2005) 345
B713 (2005) 487
B720 (2005) 231
B716 (2005) 88
B717 (2005) 387
B718 (2005) 35
Koffeman, E.
Koffeman, E.
Kohno, T.
Kohno, T.
Koizumi, K.
Kojima, T.
Kolev, D.
Koma, M.
Koma, Y.
Komarova, M.V.
Komatsu, M.
Konishi, K.
Konno, H.
Konstandin, T.
Kooijman, P.
Kooijman, P.
Koop, T.
Körs, B.
Korzhavina, I.A.
Korzhavina, I.A.
Köse, U.
Kotanski, A.
Kotanski, A.
Kötz, U.
Kötz, U.
Kounnas, C.
Kowal, A.M.
Kowal, A.M.
Kowalski, H.
Kowalski, H.
Kramberger, G.
Kramberger, G.
Kramer, G.
Kramer, G.
Krause, A.
Kreisel, A.
Kreisel, A.
Kriz, I.
Krumnack, N.
Krumnack, N.
Krykhtin, V.A.
Kulaxizi, M.
Kulinski, P.
Kulinski, P.
Kulish, P.P.
Kuramashi, Y.
Kuroki, T.
Kutasov, D.
Kuze, M.
Kuze, M.
Kuzmin, V.A.
Kuzmin, V.A.
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B720 (2005) 325
B720 (2005) 348
B718 (2005) 35
B713 (2005) 575
B713 (2005) 575
B714 (2005) 292
B718 (2005) 35
B718 (2005) 134
B720 (2005) 348
B716 (2005) 373
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B711 (2005) 112
B713 (2005) 3
B718 (2005) 3
B718 (2005) 35
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B715 (2005) 211
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B711 (2005) 345
B720 (2005) 231
B715 (2005) 349
B713 (2005) 3
B718 (2005) 3
B715 (2005) 639
B713 (2005) 3
B718 (2005) 3
B711 (2005) 367
B719 (2005) 234
B713 (2005) 3
B718 (2005) 3
B720 (2005) 289
B713 (2005) 407
B711 (2005) 253
B719 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
Labarga, L.
B713 (2005) 3
Labarga, L.
Laliena, V.
Lammers, S.
Lammers, S.
Langfeld, K.
Lasio, G.
Lavoura, L.
Lebedev, O.
Lebedev, O.
Lee, J.S.
Leineweber, T.
Lelas, D.
Lelas, D.
Levchenko, B.B.
Levchenko, B.B.
Levy, A.
Levy, A.
Li, L.
Li, L.
Li, M.
Liao, Y.
Lightwood, M.S.
Lightwood, M.S.
Lim, H.
Lim, H.
Limentani, S.
Limentani, S.
Ling, T.Y.
Ling, T.Y.
Liu, C.
Liu, C.
Liu, X.
Liu, X.
Löhr, B.
Löhr, B.
Lohrmann, E.
Lohrmann, E.
Loizides, J.H.
Loizides, J.H.
Long, K.R.
Long, K.R.
Longhin, A.
Longhin, A.
Lottini, S.
Louis, J.
Louis, J.
Loverre, P.F.
Lo Vetere, M.
Lozano, Y.
Lozano, Y.
Lü, H.
Lübeck, S.
Lucini, B.
Ludovici, L.
B718 (2005) 3
B720 (2005) 439
B713 (2005) 3
B718 (2005) 3
B716 (2005) 105
B717 (2005) 34
B713 (2005) 151
B712 (2005) 139
B717 (2005) 190
B718 (2005) 247
B712 (2005) 229
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 219
B713 (2005) 235
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B719 (2005) 255
B718 (2005) 153
B718 (2005) 203
B718 (2005) 35
B717 (2005) 34
B711 (2005) 392
B712 (2005) 371
B717 (2005) 246
B718 (2005) 341
B715 (2005) 461
B718 (2005) 35
Łukasik, J.
Łukasik, J.
Lukina, O.Yu.
Lukina, O.Yu.
Lukyanov, S.L.
Luppi, E.
Lüst, D.
Łużniak, P.
Łużniak, P.
Lysov, V.
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B719 (2005) 103
B717 (2005) 34
B713 (2005) 83
B713 (2005) 3
B718 (2005) 3
B718 (2005) 293
Ma, K.J.
Ma, K.J.
Maccaferri, C.
Macrì, M.
Maddox, E.
Maddox, E.
Maeda, T.
Magill, S.
Magill, S.
Maillard, T.
Maillet, J.M.
Makhlioueva, I.
Malka, J.
Malka, J.
Mandelkern, M.
Maniatis, M.
Maniatis, M.
Mankel, R.
Mankel, R.
Mann, R.B.
Manvelyan, R.
Marandella, G.
Marchesano, F.
Marchetto, F.
Margotti, A.
Margotti, A.
Marinelli, M.
Marini, G.
Marini, G.
Marmorini, G.
Marotta, A.
Martin, J.F.
Martin, J.F.
Martinez, M.
Martins, M.J.
Masiero, A.
Massó, E.
Mastroberardino, A.
Mastroberardino, A.
Mastrolia, P.
Mastrolia, P.
Mathews, P.
Matsuo, Y.
B713 (2005) 3
B718 (2005) 3
B715 (2005) 413
B717 (2005) 34
B713 (2005) 3
B718 (2005) 3
B715 (2005) 275
B713 (2005) 3
B718 (2005) 3
B716 (2005) 3
B712 (2005) 600
B718 (2005) 35
B713 (2005) 3
B718 (2005) 3
B717 (2005) 34
B711 (2005) 345
B720 (2005) 231
B713 (2005) 3
B718 (2005) 3
B716 (2005) 261
B717 (2005) 3
B715 (2005) 173
B712 (2005) 20
B717 (2005) 34
B713 (2005) 3
B718 (2005) 3
B717 (2005) 34
B713 (2005) 3
B718 (2005) 3
B718 (2005) 134
B718 (2005) 35
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B711 (2005) 565
B712 (2005) 86
B715 (2005) 523
B713 (2005) 3
B718 (2005) 3
B712 (2005) 229
B716 (2005) 280
B713 (2005) 333
B711 (2005) 253
Matsuzawa, K.
Matsuzawa, K.
Mattingly, M.C.K.
Mattingly, M.C.K.
Maxwell, C.J.
McInnes, B.
Megevand, A.
Meinhard, H.
Melo, C.S.
Melzer-Pellmann, I.-A.
Melzer-Pellmann, I.-A.
Menary, S.
Menary, S.
Menichetti, E.
Mescia, F.
Messina, M.
Metlica, F.
Metlica, F.
Metreveli, Z.
Meyer, U.
Meyer, U.
Miglioranzi, S.
Miglioranzi, S.
Migliozzi, P.
Mihaila, L.N.
Mikhailov, Yu.S.
Milite, M.
Milite, M.
Miller, D.J.
Mintchev, M.
Mints, A.L.
Mirea, A.
Miyanishi, M.
Mizoguchi, S.
Moghimi-Araghi, S.
Monaco, V.
Monaco, V.
Montanari, A.
Montanari, A.
Moore, J.E.
Mosaffa, A.E.
Muciaccia, M.T.
Mueller, A.H.
Mukhopadhyaya, B.
Musgrave, B.
Musgrave, B.
Mussa, R.
Musso, F.
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 465
B718 (2005) 55
B716 (2005) 319
B718 (2005) 35
B711 (2005) 565
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B717 (2005) 34
B718 (2005) 319
B718 (2005) 35
B713 (2005) 3
B718 (2005) 3
B717 (2005) 34
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B718 (2005) 35
B713 (2005) 487
B717 (2005) 19
B713 (2005) 3
B718 (2005) 3
B711 (2005) 83
B720 (2005) 307
B713 (2005) 607
B713 (2005) 3
B718 (2005) 35
B716 (2005) 462
B718 (2005) 362
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B716 (2005) 487
B714 (2005) 30
B718 (2005) 35
B715 (2005) 440
B720 (2005) 47
B713 (2005) 3
B718 (2005) 3
B717 (2005) 34
B716 (2005) 543
Naculich, S.G.
Nagano, K.
Nagano, K.
Nakamura, M.
Nakano, T.
B713 (2005) 263

B713 (2005) 3
B718 (2005) 3
B718 (2005) 35
B718 (2005) 35
Nakatsu, T.
Nalimov, M.Yu.
Namsoo, T.
Namsoo, T.
Nania, R.
Nania, R.
Nappi, C.R.
Narita, K.
Nath, P.
Nechaev, S.
Negrini, M.
Nepomechie, R.I.
Nguyen, C.N.
Nguyen, C.N.
Niedermaier, M.
Niedermayer, F.
Nigro, A.
Nigro, A.
Nilles, H.P.
Ning, Y.
Ning, Y.
Nirschl, M.
Nitta, M.
Niu, K.
Niwa, K.
Nonaka, N.
Noor, U.
Noor, U.
Notz, D.
Notz, D.
Nowak, R.J.
Nowak, R.J.
Nuncio-Quiroz, A.E.
Nuncio-Quiroz, A.E.
B715 (2005) 275

B714 (2005) 292
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B717 (2005) 361
B718 (2005) 35
B711 (2005) 112
B714 (2005) 336
B717 (2005) 34
B714 (2005) 307
B713 (2005) 3
B718 (2005) 3
B720 (2005) 235
B714 (2005) 256
B713 (2005) 3
B718 (2005) 3
B718 (2005) 113
B713 (2005) 3
B718 (2005) 3
B711 (2005) 409
B711 (2005) 133
B718 (2005) 35
B718 (2005) 35
B718 (2005) 35
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
Obertino, M.M.
Ogawa, S.
Oh, B.Y.
Oh, B.Y.
Ohta, N.
Okusawa, T.
Oldeman, R.G.C.
Olechowski, M.
Olkiewicz, K.
Olkiewicz, K.
Önengüt, G.
Ooguri, H.
Osborn, H.
Osborn, J.C.
Ota, O.
Ota, O.
B717 (2005) 34
B718 (2005) 35
B713 (2005) 3
B718 (2005) 3
B712 (2005) 115
B718 (2005) 35
B718 (2005) 35
B718 (2005) 113
B713 (2005) 3
B718 (2005) 3
B718 (2005) 35
B715 (2005) 304
B711 (2005) 409
B712 (2005) 287
B713 (2005) 3
B718 (2005) 3
Padhi, S.
Palla, L.
B713 (2005) 3
B714 (2005) 307
Palla, L.
Pallavicini, M.
Palmonari, F.
Palmonari, F.
Palomares-Ruiz, S.
Panero, M.
Panman, J.
Papavassiliou, J.
Paredes, A.
Park, D.K.
Pashnev, A.
Pastrone, N.
Patel, S.
Patel, S.
Patrignani, C.
Paul, E.
Paul, E.
Pavel, N.
Pavel, N.
Pawlak, J.M.
Pawlak, J.M.
Pelfer, P.G.
Pelfer, P.G.
Pellegrino, A.
Peña-Garay, C.
Penin, A.A.
Perlt, H.
Perlt, H.
Peschanski, R.
Pesci, A.
Pesci, A.
Petcov, S.T.
Petrera, M.
Petropoulos, P.M.
Phong, D.H.
Phong, D.H.
Pilaftsis, A.
Pilo, L.
Pinnow, H.A.
Pinzul, A.
Piotrzkowski, K.
Piotrzkowski, K.
Plamondon, M.
Plamondon, M.
Plucinski, P.
Plucinski, P.
Plyushchay, M.S.
Pokorski, S.
Pokrovskiy, N.S.
Pokrovskiy, N.S.
Polini, A.
Polini, A.
Polychronakos, A.P.
Pomarol, A.
B716 (2005) 519

B717 (2005) 34
B713 (2005) 3
B718 (2005) 3
B712 (2005) 392
B719 (2005) 255
B718 (2005) 35
B716 (2005) 352
B713 (2005) 438
B717 (2005) 272
B711 (2005) 367
B717 (2005) 34
B713 (2005) 3
B718 (2005) 3
B717 (2005) 34
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B719 (2005) 219
B716 (2005) 303
B713 (2005) 601
B717 (2005) 304
B716 (2005) 401
B713 (2005) 3
B718 (2005) 3
B712 (2005) 392
B716 (2005) 543
B715 (2005) 211
B715 (2005) 3
B715 (2005) 91
B718 (2005) 247
B712 (2005) 3
B711 (2005) 530
B718 (2005) 371
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B714 (2005) 269
B717 (2005) 190
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B711 (2005) 505
B719 (2005) 165
Ponsot, B.
Ponsot, B.
Pope, C.N.
Pordes, S.
Poschenrieder, A.
Pozzorini, S.
Profumo, S.
Prokopec, T.
Proskuryakov, A.S.
Proskuryakov, A.S.
Przybycien, M.
Przybycien, M.
B714 (2005) 357

B718 (2005) 394
B717 (2005) 246
B717 (2005) 34
B716 (2005) 173
B717 (2005) 48
B712 (2005) 86
B716 (2005) 373
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
QCDSF Collaboration
Quirós, M.
Quirós, M.
B713 (2005) 601

B712 (2005) 3
B716 (2005) 319
Rabinovici, E.
Ragnisco, O.
Rago, A.
Rago, A.
Rajabpour, M.A.
Rakow, P.E.L.
Rakow, P.E.L.
Ramage, M.R.
Rattazzi, R.
Ratz, M.
Rautenberg, J.
Rautenberg, J.
Raval, A.
Raval, A.
Ravindran, V.
Ravindran, V.
Redondo, J.
Reeder, D.D.
Reeder, D.D.
Reinhardt, H.
Remiddi, E.
Remiddi, E.
Ren, Z.
Ren, Z.
Renner, R.
Renner, R.
Repond, J.
Repond, J.
Ri, Y.D.
Ri, Y.D.
Ribeiro, G.A.P.
Ricci, R.
Riccioni, F.
Ridolfi, G.
Rinaldi, L.
Rinaldi, L.
Riotto, A.
B719 (2005) 3
B716 (2005) 543
B714 (2005) 91
B719 (2005) 255
B718 (2005) 362
B713 (2005) 601
B717 (2005) 304
B720 (2005) 137
B720 (2005) 3
B712 (2005) 139
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 333
B716 (2005) 128
B715 (2005) 523
B713 (2005) 3
B718 (2005) 3
B716 (2005) 105
B712 (2005) 229
B716 (2005) 280
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B711 (2005) 565
B719 (2005) 234
B711 (2005) 231
B719 (2005) 77
B713 (2005) 3
B718 (2005) 3
B712 (2005) 3
Robichaud-Veronneau, A.
Robins, S.
Robins, S.
Robles-Llana, D.
Robutti, E.
Rodríguez-Gómez, D.
Rodríguez-Gómez, D.
Roethel, W.
Rolf, J.
Romano, G.
Rosa, G.
Rosen, J.
Rosiek, J.
Rosin, M.
Rosin, M.
Rouhani, S.
Royon, C.
Rozanov, A.
Rubinsky, I.
Rühl, W.
Rumerio, P.
Runkel, I.
Rusack, R.W.
Ruspa, M.
Ruspa, M.
Ryan, P.
Ryan, P.
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B719 (2005) 234
B717 (2005) 34
B711 (2005) 392
B712 (2005) 371
B717 (2005) 34
B713 (2005) 378
B718 (2005) 35
B718 (2005) 35
B717 (2005) 34
B714 (2005) 103
B713 (2005) 3
B718 (2005) 3
B718 (2005) 362
B716 (2005) 401
B718 (2005) 35
B718 (2005) 3
B717 (2005) 3
B717 (2005) 34
B715 (2005) 539
B717 (2005) 34
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
Sacchi, R.
Sacchi, R.
Saharian, A.A.
Saitta, B.
Sakaguchi, M.
Salehi, H.
Salehi, H.
Saleur, H.
Saleur, H.
Samtleben, H.
Sánchez, N.G.
Santacesaria, R.
Santamarta, R.
Santamarta, R.
Santroni, A.
Sanz, V.
Sartorelli, G.
Sartorelli, G.
Sasaki, S.
Sati, H.
Sato, O.
Sato, Y.
Satta, A.
Satta, G.
Saulina, N.
Saulina, N.
B713 (2005) 3
B718 (2005) 3
B712 (2005) 196
B718 (2005) 35
B714 (2005) 51
B713 (2005) 3
B718 (2005) 3
B712 (2005) 513
B716 (2005) 439
B716 (2005) 215
B711 (2005) 604
B718 (2005) 35
B713 (2005) 3
B718 (2005) 3
B717 (2005) 34
B712 (2005) 3
B713 (2005) 3
B718 (2005) 3
B716 (2005) 88
B715 (2005) 639
B718 (2005) 35
B718 (2005) 35
B718 (2005) 35
B716 (2005) 543
B715 (2005) 304
B720 (2005) 203
Savin, A.A.
Savin, A.A.
Sawanaka, H.
Saxon, D.H.
Saxon, D.H.
Schäfer, A.
Schäfer, A.
Schäfer-Nameki, S.
Schagen, S.
Schappacher, C.
Scherer Santos, R.J.
Schierholz, G.
Schierholz, G.
Schiller, A.
Schiller, A.
Schioppa, M.
Schioppa, M.
Schlenstedt, S.
Schlenstedt, S.
Schleper, P.
Schleper, P.
Schmidke, W.B.
Schmidke, W.B.
Schmidt, M.G.
Schneekloth, U.
Schneekloth, U.
Schnitzer, H.J.
Schoeffel, L.
Schörner-Sadenius, T.
Schörner-Sadenius, T.
Schuller, K.
Schultz, J.
Schweigert, C.
Sciulli, F.
Sciulli, F.
Scotto Lavina, L.
Scrucca, C.A.
Seiler, E.
Seo, S.H.
Seth, K.K.
Sever, A.
Shamanov, V.
Shcheglova, L.M.
Shcheglova, L.M.
Shen, Y.-G.
Shibuya, H.
Shindou, R.
Shiroishi, M.
Shiu, G.
Shore, G.M.
Shore, G.M.
Shoshi, A.I.
Siopsis, G.
Sirignano, C.
B713 (2005) 3
B718 (2005) 3
B713 (2005) 151
B713 (2005) 3
B718 (2005) 3
B716 (2005) 105
B717 (2005) 304
B714 (2005) 3
B713 (2005) 3
B715 (2005) 173
B715 (2005) 413
B713 (2005) 601
B717 (2005) 304
B713 (2005) 601
B717 (2005) 304
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B716 (2005) 373
B713 (2005) 3
B718 (2005) 3
B713 (2005) 263
B716 (2005) 401
B713 (2005) 3
B718 (2005) 3
B714 (2005) 67
B717 (2005) 34
B715 (2005) 539
B713 (2005) 3
B718 (2005) 3
B718 (2005) 35
B720 (2005) 3
B720 (2005) 235
B717 (2005) 34
B717 (2005) 34
B719 (2005) 3
B718 (2005) 35
B713 (2005) 3
B718 (2005) 3
B712 (2005) 347
B718 (2005) 35
B720 (2005) 399
B712 (2005) 573
B712 (2005) 20
B712 (2005) 411
B717 (2005) 86
B715 (2005) 440
B715 (2005) 483
B718 (2005) 35
Sitenko, Yu.A.
Skillicorn, I.O.
Skillicorn, I.O.
Slavnov, N.A.
Słomiński, W.
Słomiński, W.
Smirnov, V.A.
Smith, C.
Smith, J.
Smith, W.H.
Smith, W.H.
Smolyakov, M.N.
Soares, M.
Soares, M.
Sodano, P.
Soddu, A.
Sokatchev, E.
Solano, A.
Solano, A.
Solbrig, S.
Sommer, R.
Sommovigo, L.
Son, D.
Son, D.
Song, J.S.
Sorrentino, S.
Sosnovtsev, V.
Sosnovtsev, V.
Sotkov, G.M.
Spada, F.R.
Spence, B.
Spira, M.
Splittorff, K.
Spradlin, M.
Sridhar, K.
Stadie, H.
Stairs, D.G.
Stancari, G.
Stancari, M.
Stanco, L.
Stanco, L.
Standage, J.
Standage, J.
Stefanski Jr., B.
Steinhauser, M.
Steinhauser, M.
Stern, A.
Stifutkin, A.
Stifutkin, A.
Stonjek, S.
Stonjek, S.
Stopa, P.
Stopa, P.
Stösslein, U.
B714 (2005) 217

B713 (2005) 3
B718 (2005) 3
B712 (2005) 600
B713 (2005) 3
B718 (2005) 3
B716 (2005) 303
B718 (2005) 319
B720 (2005) 182
B713 (2005) 3
B718 (2005) 3
B717 (2005) 19
B713 (2005) 3
B718 (2005) 3
B711 (2005) 480
B712 (2005) 325
B712 (2005) 157
B713 (2005) 3
B718 (2005) 3
B716 (2005) 105
B713 (2005) 378
B716 (2005) 248
B713 (2005) 3
B718 (2005) 3
B718 (2005) 35
B718 (2005) 35
B713 (2005) 3
B718 (2005) 3
B714 (2005) 179
B718 (2005) 35
B712 (2005) 59
B719 (2005) 35
B712 (2005) 287
B711 (2005) 199
B713 (2005) 333
B718 (2005) 3
B713 (2005) 3
B717 (2005) 34
B717 (2005) 34
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B718 (2005) 83
B713 (2005) 487
B716 (2005) 303
B718 (2005) 371
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
Stösslein, U.
Straub, P.B.
Straub, P.B.
Strolin, P.
Strumia, A.
Strumia, A.
Suchkov, S.
Suchkov, S.
Susinno, G.
Susinno, G.
Suszycki, L.
Suszycki, L.
Sutiak, J.
Sutiak, J.
Sutton, M.R.
Sutton, M.R.
Sztuk, J.
Sztuk, J.
Szuba, D.
Szuba, D.
Szuba, J.
Szuba, J.
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B718 (2005) 35
B715 (2005) 173
B720 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
Takács, G.
Takács, G.
Takahashi, M.
Takasaki, K.
Takayama, Y.
Talavera, P.
Tamakoshi, T.
Tanimoto, M.
Tapper, A.D.
Tapper, A.D.
Targett-Adams, C.
Targett-Adams, C.
Tassi, E.
Tassi, E.
Tausk, J.B.
Tawara, T.
Tawara, T.
Teper, M.
Terras, V.
Terrón, J.
Terrón, J.
Tezuka, I.
Tiecke, H.
Tiecke, H.
Timirgaziu, C.
Tioukov, V.
Tok, T.
Tokushuku, K.
Tokushuku, K.
Tolla, D.D.
Tolun, P.
B714 (2005) 307

B716 (2005) 519
B712 (2005) 573
B715 (2005) 275
B715 (2005) 665
B713 (2005) 438
B715 (2005) 275
B713 (2005) 151
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 278
B713 (2005) 3
B718 (2005) 3
B715 (2005) 461
B712 (2005) 600
B713 (2005) 3
B718 (2005) 3
B718 (2005) 35
B713 (2005) 3
B718 (2005) 3
B716 (2005) 65
B718 (2005) 35
B716 (2005) 105
B713 (2005) 3
B718 (2005) 3
B715 (2005) 413
B718 (2005) 35
Tomaradze, A.
Tomino, D.
Toshito, T.
Tran, N.-K.
Trancanelli, D.
Trapletti, M.
Travaglini, G.
Trigiante, M.
Trincherini, E.
Trugenberger, C.A.
Tsenov, R.
Tseytlin, A.A.
Tseytlin, A.A.
Tsouchnika, E.
Tsukerman, I.
Tsurugai, T.
Tsurugai, T.
Tsutsui, I.
Tsvelik, A.M.
Turcato, M.
Turcato, M.
Tymieniecka, T.
Tymieniecka, T.
Tyszkiewicz, A.
Tyszkiewicz, A.
B717 (2005) 34
B715 (2005) 665
B718 (2005) 35
B712 (2005) 325
B719 (2005) 234
B713 (2005) 173
B712 (2005) 59
B716 (2005) 215
B720 (2005) 3
B716 (2005) 509
B718 (2005) 35
B715 (2005) 190
B718 (2005) 83
B717 (2005) 387
B718 (2005) 35
B713 (2005) 3
B718 (2005) 3
B715 (2005) 713
B719 (2005) 103
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
Uhlig, S.
Uiterwijk, J.W.E.
Ukleja, A.
Ukleja, A.
Ukleja, J.
Ukleja, J.
Ullio, P.
Uman, I.
Uraltsev, N.
Ushida, N.
B716 (2005) 173

B718 (2005) 35
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B712 (2005) 86
B717 (2005) 34
B719 (2005) 77
B718 (2005) 35
Vafa, C.
van Dantzig, R.
van der Bij, J.J.
van der Bij, J.J.
Van de Vyver, B.
van Neerven, W.L.
van Neerven, W.L.
Vassilevich, D.V.
Vázquez, M.
Vázquez, M.
Verbaarschot, J.J.M.
Vicari, E.
Vidnovic, T.
Vilain, P.
Vlasov, N.N.
Vlasov, N.N.
Voituriez, R.
B715 (2005) 304

B718 (2005) 35
B713 (2005) 278
B716 (2005) 280
B718 (2005) 35
B713 (2005) 333
B720 (2005) 182
B715 (2005) 695
B713 (2005) 3
B718 (2005) 3
B712 (2005) 287
B720 (2005) 307
B717 (2005) 34
B718 (2005) 35
B713 (2005) 3
B718 (2005) 3
B714 (2005) 336
Volobuev, I.P.
Volovich, A.
von Gersdorff, G.
von Gersdorff, G.
Voss, K.C.
Voss, K.C.
B717 (2005) 19
B711 (2005) 199
B712 (2005) 3
B720 (2005) 211
B713 (2005) 3
B718 (2005) 3
Wagner, C.E.M.
Walczak, R.
Walczak, R.
Walsh, R.
Walsh, R.
Wang, L.-T.
Wang, M.
Wang, M.
Wang, W.
Was, Z.
Wecht, B.
Weigand, T.
Weigel, M.
Weisz, P.
Wenger, U.
Werkema, S.
Weston, R.
Whitmore, J.J.
Whitmore, J.J.
Whyte, J.
Whyte, J.
Wichmann, K.
Wichmann, K.
Wick, K.
Wick, K.
Wiese, K.J.
Wiggers, L.
Wiggers, L.
Willey, R.
Willmann, R.D.
Wilquet, G.
Wing, M.
Wing, M.
Winter, K.
Wlasenko, M.
Wlasenko, M.
Wolf, G.
Wolf, G.
Wolff, U.
Wong, S.M.H.
Worek, M.
Wright, J.
B716 (2005) 319

B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B712 (2005) 20
B713 (2005) 3
B718 (2005) 3
B716 (2005) 199
B713 (2005) 555
B716 (2005) 33
B713 (2005) 83
B719 (2005) 312
B714 (2005) 256
B715 (2005) 461
B717 (2005) 34
B720 (2005) 348
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B711 (2005) 530
B713 (2005) 3
B718 (2005) 3
B714 (2005) 256
B718 (2005) 341
B718 (2005) 35
B713 (2005) 3
B718 (2005) 3
B718 (2005) 35
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 378
B715 (2005) 440
B713 (2005) 555
B716 (2005) 33
Xu, C.
B716 (2005) 487
Yagües Molina, A.G.
Yagües Molina, A.G.
B713 (2005) 3
B718 (2005) 3
Yamada, N.
Yamada, S.
Yamada, S.
Yamazaki, Y.
Yamazaki, Y.
Yin, X.
Yoneya, T.
Yoneya, T.
Yoon, C.S.
Yoshida, K.
Yoshida, R.
Yoshida, R.
Young, C.A.S.
Youngman, C.
Youngman, C.
Yue, C.-X.
B713 (2005) 407

B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B714 (2005) 137
B711 (2005) 3
B711 (2005) 54
B718 (2005) 35
B714 (2005) 51
B713 (2005) 3
B718 (2005) 3
B717 (2005) 327
B713 (2005) 3
B718 (2005) 3
B716 (2005) 199
Zambrana, M.
Zambrana, M.
Zamolodchikov, A.B.
Zarembo, K.
Zarembo, K.
Zarnecki, A.F.
Zarnecki, A.F.
Zawiejski, L.
Zawiejski, L.
B713 (2005) 3
B718 (2005) 3
B719 (2005) 103
B715 (2005) 190
B717 (2005) 137
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
Zeitlin, A.M.
Zerwas, P.M.
Zeuner, W.
Zeuner, W.
ZEUS Collaboration
ZEUS Collaboration
Zhang, F.
Zhautykov, B.O.
Zhautykov, B.O.
Zhou, C.
Zhou, C.
Zichichi, A.
Zichichi, A.
Ziegler, A.
Ziegler, A.
Ziegler, Ar.
Ziegler, Ar.
Zimerman, A.H.
Zotkin, D.S.
Zotkin, D.S.
Zotkin, S.A.
Zotkin, S.A.
Zoubos, K.
Zucchelli, P.
Zweber, P.
Zwirner, F.
B720 (2005) 289

B711 (2005) 83
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B716 (2005) 199
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B714 (2005) 179
B713 (2005) 3
B718 (2005) 3
B713 (2005) 3
B718 (2005) 3
B719 (2005) 234
B718 (2005) 35
B717 (2005) 34
B715 (2005) 211
